Registration of new charset CP51932
Charset name: CP51932
Charset aliases: (none)
Suitability for use in MIME text:
Yes, CP51932 is suitable for use with subtypes of the "text"
Content-Type. Note that CP51932 is a multi-octet charset.
Care should be taken to choose an appropriate Content-Transfer-Encoding.
Published specification(s):
Uses ISO 2022 rules to select:
code set 0: US-ASCII (a single 7-bit byte set)
* 0x5C is U+005C : REVERSE SOLIDUS (YEN SIGN)
* 0x7E is U+007E : TILDE
code set 1: Microsoft Standard Character Set (a double 8-bit byte set)
* JIS X 0208-1983
* NEC special characters (Row 13)
* NEC selection of IBM extensions (Rows 89 to 92)
code set 2: Halfwidth Katakana (a single 7-bit byte set)
JIS X 0201-1976
requiring SS2 as the character prefix
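
As a rough illustration of the code set selection above, the following Python
sketch splits a byte string into its code sets. The byte ranges used here
(0x00-0x7F for code set 0, two bytes of 0xA1-0xFE for code set 1, and SS2 =
0x8E followed by 0xA1-0xDF for code set 2) are assumptions taken from the usual
8-bit packed EUC form, not from this registration, and the function name
split_code_sets is hypothetical.

```python
SS2 = 0x8E  # single-shift 2: prefix byte for code set 2 (assumed packed EUC form)


def split_code_sets(data: bytes):
    """Split a byte string into (code_set, raw_bytes) pairs.

    The ranges below are assumptions based on the usual packed EUC layout.
    """
    out = []
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b <= 0x7F:
            # code set 0: one US-ASCII byte
            out.append((0, data[i:i + 1]))
            i += 1
        elif b == SS2:
            # code set 2: SS2 followed by one halfwidth katakana byte
            if i + 1 >= n or not 0xA1 <= data[i + 1] <= 0xDF:
                raise ValueError(f"bad code set 2 sequence at offset {i}")
            out.append((2, data[i:i + 2]))
            i += 2
        elif 0xA1 <= b <= 0xFE:
            # code set 1: two bytes, each in 0xA1-0xFE
            if i + 1 >= n or not 0xA1 <= data[i + 1] <= 0xFE:
                raise ValueError(f"bad code set 1 sequence at offset {i}")
            out.append((1, data[i:i + 2]))
            i += 2
        else:
            raise ValueError(f"unexpected byte 0x{b:02X} at offset {i}")
    return out


print(split_code_sets(b"A\xa4\xa2\x8e\xb1"))
# [(0, b'A'), (1, b'\xa4\xa2'), (2, b'\x8e\xb1')]
```

The sketch rejects any other lead byte, since only code sets 0 to 2 are listed
above.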
For the meaning and the Unicode mapping of each character, refer to
Windows Codepage 932.
http://msdn.microsoft.com/en-us/goglobal/cc305152.aspx
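
To make the relationship to Codepage 932 concrete, here is a minimal Python
sketch that maps one code set 1 byte pair to Unicode. It converts the pair to
the corresponding Codepage 932 double-byte code with the well-known JIS to
Shift_JIS arithmetic and then decodes it with Python's built-in "cp932" codec.
It assumes that codec matches the Microsoft table closely enough for
illustration; the equivalency table referenced in this registration remains the
authority. The function name code_set1_to_unicode is hypothetical.

```python
def code_set1_to_unicode(b1: int, b2: int) -> str:
    """Map one code set 1 byte pair to Unicode via Python's cp932 codec.

    Assumption: the built-in cp932 codec is an adequate stand-in for the
    Windows Codepage 932 table.
    """
    if not (0xA1 <= b1 <= 0xFE and 0xA1 <= b2 <= 0xFE):
        raise ValueError("not a code set 1 byte pair")
    # Strip the high bits to get the 7-bit JIS row/cell bytes.
    j1, j2 = b1 - 0x80, b2 - 0x80
    # Standard JIS -> Shift_JIS trail byte: odd and even rows share a lead byte.
    if j1 & 1:
        s2 = j2 + 0x1F
        if s2 >= 0x7F:
            s2 += 1  # skip 0x7F, which is never a trail byte
    else:
        s2 = j2 + 0x7E
    # Standard JIS -> Shift_JIS lead byte, skipping the 0xA0-0xDF gap.
    s1 = ((j1 - 0x21) >> 1) + 0x81
    if s1 > 0x9F:
        s1 += 0x40
    return bytes([s1, s2]).decode("cp932")


print(code_set1_to_unicode(0xA4, 0xA2))  # HIRAGANA LETTER A (U+3042)
print(code_set1_to_unicode(0xAD, 0xA1))  # CIRCLED DIGIT ONE (U+2460), Row 13
```

Byte pairs with no Codepage 932 assignment raise UnicodeDecodeError in this
sketch.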
ISO 10646 equivalency table:
http://cpansearch.perl.org/src/NARUSE/Encode-EUCJPMS-0.07/ucm/cp51932.ucm
Additional information:
This is a request for a new registration of this charset.
CP51932 is the real-world implementation of EUC-JP, used mostly by Web browsers.
Internet Explorer provides a reference implementation.
Firefox, Safari, Opera, and Google Chrome also support it.
They refer to this charset by the name "EUC-JP".
http://coq.no/character-tables/mime/euc/en
The name "CP51932" is in use in the following applications:
* Citrus iconv (NetBSD and DragonFly use this)
* patched GNU libiconv in FreeBSD ports
* Mojikan http://www.mirai-ii.co.jp/moji/mojikan/
* nkf 2.0.5
* PHP 5.2.1
* Ruby 1.9.1
* Encode-EUCJPMS-0.06
Moreover, applications which use MLang.DLL or the .NET Framework for
converting "EUC-JP" implicitly use this charset.
So this charset is widely used, but doesn't have its own name.
The intended use of this name is to override the implementation of EUC-JP
or of charset conversion.
http://wiki.whatwg.org/wiki/Web_Encodings
http://www.w3.org/Bugs/Public/show_bug.cgi?id=7444
The name is not "Windows-51932" because some of the applications which accept
the name "CP51932" don't support the name "Windows-51932".
CP51932 is intended for importing legacy data.
UTF-8 is preferred to CP51932 for new systems.
Related references are:
"Remarks" of "GetEncodings Method" of "System.Text"
http://msdn.microsoft.com/en-us/library/system.text.encoding.getencodings.aspx
"UnicodeによるJIS X0213実装入門―情報システムの新たな日本語処理環境"
日経BPソフトプレス, ISBN 978-4891006082, 2008, p. 17-18, 20, 120-158
CP51932 - Legacy Encoding Project
http://legacy-encoding.sourceforge.jp/wiki/index.php?cp51932
This charset is also known as Windows Codepage 51932.
Person & email address to contact for further information:
NARUSE, Yui
Email: naruse@airemix.jp
Intended usage: LIMITED USE
--
NARUSE, Yui <naruse@airemix.jp>