I am looking at the registrations for the remaining 4 “system”
code pages: 932, 936, 949 & 950. This seems complicated since IE uses
other names for them. For example, for 936, IE recognizes Chinese, CN-GB,
csGB2312, csGB231280, csISO58GB231280, GB2312, GB2312-80, GB231280, GBK,
GB_2312_80, iso-ir-58, and, of course, it's known to the system as 936. Our APIs report this code page as “gb2312”.

There is an existing registration for GBK, with aliases CP936,
MS936 and windows-936, but not the gb2312 name. The existing
registration points to broken links at Microsoft and IBM. This should
probably be updated to point to:

http://www.microsoft.com/globaldev/reference/dbcs/936.mspx
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP936.TXT
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit936.txt
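
To illustrate how the same names can be interpreted differently on another system, here is a minimal sketch using Python's standard codecs module. This is only an assumption-laden illustration, not how Windows or IE resolve names: the helper names resolve and can_encode are made up for this example, and the results reflect CPython's own alias table, which may differ between versions. In the CPython versions I am aware of, CP936 resolves to the gbk codec while gb2312 resolves to a separate, smaller codec.

```python
import codecs

# Names taken from the discussion above; whether each is known depends
# on the Python build, so unknown names are reported rather than assumed.
NAMES = ["GBK", "CP936", "MS936", "windows-936",
         "gb2312", "GB2312-80", "iso-ir-58", "csGB2312"]


def resolve(name):
    """Return the canonical codec name Python maps `name` to, or None."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


def can_encode(text, charset):
    """Return True if `text` can be encoded in `charset` without error."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for name in NAMES:
        print(f"{name:12} -> {resolve(name)}")

    # U+9555 is a character commonly cited as present in GBK but
    # absent from GB 2312-80, so the two codecs should disagree on it.
    sample = "\u9555"
    for charset in ("gbk", "gb2312"):
        print(f"U+9555 encodable in {charset}: {can_encode(sample, charset)}")
```

If gb2312 were listed as an alias of a windows-936 registration, a system behaving like this sketch would still treat the two names as different character sets.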

I am a bit uncertain that GBK == 936, although this is what
the existing registration implies.

The alternative solution would seem to be to register a new
charset as “windows-936” with the same additional aliases as the
GBK registration and point to the above tables. This would then also lead
to the question of whether GBK and gb2312 should be listed as aliases for any
such windows-936 code page, although the interpretation of those aliases could
differ for other systems.

My goal is to clarify the Microsoft system code page
mappings, such as those for 932, 936, 949 & 950, and I’d appreciate any suggestions
about how best to do that :-)

Thanks,
- Shawn

Shawn Steele
SDE
Windows International
Microsoft