
Re: shift_jis / windows-31J




> I'm being asked to document things like a character set selector attribute that's a byte.  It has entries like:
> 	0x80 Specifies the JIS character set. (IANA name shift_jis)
> 
> We all know that "Microsoft's" shift_jis is really Windows-31J, but the on-the-surface reasonable
> request to replace this shift_jis with Windows-31J would mean that we'd
> be specifying an identifier that our software didn't recognize.  That
> doesn't help solve the problem.  Even when we do recognize windows-31J,
> we'd tell you that the name was shift_jis (round tripping.)

I do not think that is a problem in this particular case, since the
data carries the byte 0x80 rather than the name "windows-31J" or
"shift_jis".
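To illustrate the point, here is a minimal sketch in Python.  The table
and function names are hypothetical; only the 0x80 entry and its
documented name come from the quoted text.  It assumes Python's "cp932"
codec as a stand-in for Windows-31J.

```python
# Hypothetical selector table.  Only the 0x80 entry is taken from the
# quoted documentation; the field names are illustrative assumptions.
SELECTORS = {
    0x80: {
        "documented_name": "shift_jis",  # the name the documentation prints
        "python_codec": "cp932",         # assumed stand-in for Windows-31J
    },
}


def decode_with_selector(selector: int, data: bytes) -> str:
    """Decode data using the codec chosen by the selector byte.

    The stored identifier is the byte itself, so the documented name
    never has to be parsed or round-tripped by the software.
    """
    entry = SELECTORS[selector]
    return data.decode(entry["python_codec"])
```

Because the lookup key is the byte, changing "documented_name" would
alter only the documentation, not the stored data.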

> This kind of documentation shows up "everywhere", so it'd be nice if people got to shift_jis in the 
> registry and saw "gee, Microsoft uses a variation".
> 
> At this point it's rather a mess, and the behavior's pretty stuck.  If it is desirable for the registry
> to point people in the right direction, then doing something like what
> HTML did, at the registry level, would be most helpful.

I agree with this idea, except for the addition of a new alias.
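As a rough sketch of that approach, assuming "what HTML did" means
treating a shift_jis label as Windows-31J when decoding, an override
table could look like the following.  The names LABEL_OVERRIDES and
decode_by_label are hypothetical, and "cp932" is Python's codec name
for Windows-31J.

```python
# Assumed override table in the spirit of the HTML approach discussed
# above: text labeled shift_jis is decoded as Windows-31J (cp932).
LABEL_OVERRIDES = {"shift_jis": "cp932"}


def decode_by_label(label: str, data: bytes) -> str:
    """Decode data by its declared label, applying the override table."""
    normalized = label.strip().lower()
    codec = LABEL_OVERRIDES.get(normalized, normalized)
    return data.decode(codec)
```

With this table, text declared as "Shift_JIS" that contains NEC special
characters still decodes, which a strict shift_jis decoder would reject.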

I reformulated your proposal using the latest registration template. 
Here goes.

--------------------------------------------------------------------------------


Charset name: Windows-31J
Charset aliases: csWindows31J
MIBenum: 2024

Suitability for use in MIME text:

This charset can be used for the top-level media type "text".

Published specification(s):

http://msdn.microsoft.com/en-us/goglobal/cc305152.aspx

ISO 10646 equivalency table:

http://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT

Additional information:

Windows Japanese.  A variant of Shift_JIS that adds NEC special
characters (Row 13), the NEC selection of IBM extensions (Rows 89 to
92), and IBM extensions (Rows 115 to 119).  The CCSs are JIS
X0201:1997, JIS X0208:1997, and these extensions.  Windows-31J text is
commonly declared as shift_jis, the name of the parent charset.

Person & email address to contact for further information: ??

Intended usage: LIMITED USE

--------------------------------------------------------------------------------

Charset name: Shift_JIS

MIBenum: 17

Charset aliases: MS_Kanji and csShiftJIS

Suitability for use in MIME text:
This charset can be used for the top-level media type "text".

Published specification(s): Appendix 1 of JIS X0208:1997.

ISO 10646 equivalency table:

There are no authoritative definitions and several variations 
exist.  An obsolete variation is available at:

http://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/SHIFTJIS.TXT

Additional information:

This charset extends csHalfWidthKatakana by adding the graphic
characters of JIS X0208.  The CCSs are JIS X0201:1997 and JIS
X0208:1997.

Text in vendor-specific charsets derived from shift_jis is often
labeled with the shift_jis name instead of a more specific vendor
charset name.

Person & email address to contact for further information: ?

Intended usage: LIMITED USE
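
For readers who want to check the row numbers quoted in the Windows-31J
template, here is a minimal Python sketch.  It uses the standard
Shift_JIS byte arithmetic to recover the row and cell of a two-byte
code, and compares Python's "shift_jis" and "cp932" codecs as stand-ins
for the two charsets.  The function names are hypothetical, and the
behavior of these codecs is an assumption about Python, not part of
either registration.

```python
# Rows named in the Windows-31J template as extensions to JIS X0208.
EXTENSION_ROWS = {13} | set(range(89, 93)) | set(range(115, 120))


def sjis_row_cell(pair: bytes) -> tuple[int, int]:
    """Return (row, cell) for a two-byte Shift_JIS-style code."""
    lead, trail = pair  # raises ValueError unless exactly two bytes
    if not (0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xFC):
        raise ValueError("not a valid lead byte")
    if not (0x40 <= trail <= 0xFC) or trail == 0x7F:
        raise ValueError("not a valid trail byte")
    # Each lead byte covers two rows; the trail byte picks which one.
    base = lead - (0x81 if lead <= 0x9F else 0xC1)
    if trail >= 0x9F:
        return 2 * base + 2, trail - 0x9E
    return 2 * base + 1, trail - (0x3F if trail < 0x7F else 0x40)


def accepted_by(codec: str, pair: bytes) -> bool:
    """Report whether the given Python codec can decode the byte pair."""
    try:
        pair.decode(codec)
    except UnicodeDecodeError:
        return False
    return True
```

For example, the bytes 0x87 0x40 fall in Row 13 (NEC special
characters): Python's "cp932" decodes them, while its stricter
"shift_jis" codec rejects them.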


-- 
MURATA Makoto <murata@hokkaido.email.ne.jp>