- To: ietf-charsets@iana.org
- Subject: Registration of new charset GBK
- From: Anthony Fok <anthony@thizlinux.com>
- Date: Fri, 15 Mar 2002 04:45:03 +0800
- Cc: 陈壮 <chenzh@cesi.ac.cn>, Cheng XU <xucheng@cn.ibm.com>, haible@ilog.fr, suzhe@gnuchina.org, shwang@sonata.iscas.ac.cn, 吴健 <jwu@sonata.iscas.ac.cn>, leon@xteamlinux.com.cn, ygh@dlut.edu.cn, roger.so@sw-linux.com, pablo@mandrakesoft.com, zw@debian.org, Dirk Meyer <dmeyer@adobe.com>, markus.scherer@jtcsv.com, Ken Lunde <lunde@adobe.com>, li18nux2000@li18nux.org, bsd-locale@haun.org, wuzg@cesi.ac.cn, Yoshihiko Enomoto <YENOMOTO@jp.ibm.com>, Jack Kang <Jack.Kang@sun.com>
- Sender: Anthony Fok <anthony@thizlinux.com>
- User-Agent: Mutt/1.3.27i
Dear all,
Attached is the registration of the GBK charset for IANA.
Thanks to Bruno Haible for notifying me of the subtle differences among
GBK implementations, and for his affirmation that we should treat
GBK == CP936. :-) All comments and suggestions are welcome.
Cheers,
Anthony
Application of IANA Charset Registration for GBK
------------------------------------------------
Charset name:
GBK
Charset aliases:
CP936, MS936, windows-936
Suitability for use in MIME text:
Yes
Published specification(s):
The GBK (国标扩展 Guobiao Kuozhan) specification was created by the
Chinese IT Standardization Technical Committee (中华人民共和国
全国信息技术标准化技术委员会) in December 1995:
Chinese Internal Code Specification
(汉字内码扩展规范 Hanzi Neima Kuozhan Guifan)
["Specifications defining the extensions of internal codes for
Chinese ideograms."]
ISO 10646 equivalency table:
Code Page 936 (CP936) is the most popular implementation of GBK.
A mapping to Unicode is provided by Microsoft:
http://www.microsoft.com/typography/unicode/936.txt
Mapping data in CharMapML (XML) format is also available:
http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/windows-936-2000.xml
Additional information:
The People's Republic of China has already expressed her
fundamental consent to support the combined efforts of the ISO/IEC
and the Unicode Consortium through publishing a Chinese National
Standard that was code- and character-compatible with ISO 10646-1 /
Unicode 2.1. This standard was named GB 13000.1-93.
Since the legacy GB 2312-1980 standard was still widely used, it
was important to provide a smooth migration path towards
GB 13000.1-93. GBK (1995) was the first step in this direction.
It defines a two-byte encoding scheme which extends GB2312 to
include the entire character repertoire of the base CJK Unihan area
(U+4E00 to U+9FA5) and other additions.
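
As a rough illustration of the two-byte scheme described above, the
following sketch uses the `gbk` and `gb2312` codecs that ship with Python.
This is an assumption made for illustration only: the registration does not
name any implementation, and Python's tables may differ in small ways from
other GBK implementations. The helper name `in_gb2312` and the sample
character are hypothetical choices.

```python
def in_gb2312(ch: str) -> bool:
    """Return True if the character can be encoded with Python's gb2312 codec."""
    try:
        ch.encode("gb2312")
    except UnicodeEncodeError:
        return False
    return True


# U+9555 lies in the U+4E00..U+9FA5 range but is absent from GB 2312-1980.
sample = "\u9555"
encoded = sample.encode("gbk")

# Every code point in the base CJK range should encode to exactly two bytes.
all_two_bytes = all(
    len(chr(cp).encode("gbk")) == 2 for cp in range(0x4E00, 0x9FA5 + 1)
)

print(in_gb2312(sample), len(encoded), all_two_bytes)
```

A character that GB 2312 already covers keeps the same two-byte code in GBK,
which is what makes GBK an extension rather than a new encoding.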
The most popular implementation of GBK is Code Page 936 (CP936)
on Microsoft Windows systems. Therefore, some existing software
also recognizes the names CP936, MS936 and windows-936.
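
Whether a given piece of software accepts these aliases can be checked
directly. The sketch below does so for Python's codec registry, which is
only one example of "existing software" and is not mentioned in this
registration. The helper name `resolve_charset` is hypothetical, and a name
the registry does not know is reported as `None` rather than treated as an
error.

```python
import codecs


def resolve_charset(name: str):
    """Return the canonical codec name Python uses for `name`, or None."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


# The charset name and the aliases proposed in this registration.
names = ["GBK", "CP936", "MS936", "windows-936"]
resolved = {name: resolve_charset(name) for name in names}

for name, canonical in resolved.items():
    print(name, "->", canonical)
```

Not every alias listed here is necessarily known to every implementation,
so callers may want a fallback to the primary name GBK.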
As the GBK code space is limited, it cannot support the full code
space of ISO 10646. To remedy this shortcoming, the GBK
specification has since been "replaced" by the mandatory
GB 18030-2000 standard (GB18030).
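
The limitation can be seen with a character outside the Basic Multilingual
Plane. The sketch below assumes Python's built-in `gbk` and `gb18030`
codecs as stand-ins for the two specifications; the helper name `can_encode`
and the sample character U+20000 are hypothetical choices for illustration.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if `text` can be encoded in `charset` without error."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# U+20000 is the first code point of CJK Unified Ideographs Extension B.
supplementary = "\U00020000"

print(can_encode(supplementary, "gbk"))
print(can_encode(supplementary, "gb18030"))
print(len(supplementary.encode("gb18030")))
```

GB18030 reaches such characters through four-byte sequences while keeping
the existing two-byte codes, so text that is valid GBK generally remains
readable under GB18030.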
Person & email address to contact for further information:
CHEN Zhuang (陈壮)
chenzh@cesi.ac.cn
Chinese IT Standardization Technical Committee
Chinese Electronics Standardization Institute
Intended usage:
COMMON (Still commonly used)
OBSOLETE (Superseded by GB18030)
Compiled by Anthony Fok <anthony@thizlinux.com> (霍东灵), March 15, 2002.