RE: Registration of new charset CP50220



I'm happy if you're happy :)

I think that CP50220 could point the states for each escape sequence at appropriate tables; however, some of the code may not completely match the tables (I'd have to take a close look).  It is also unlikely that the various standards referred to by these escape sequences are exactly what our behavior is; I've heard complaints.  Readers would presume that the escape sequences behave exactly as the JIS standards specify, but I think that's not true :(

So I guess I'm a bit unclear as to your goal.  If you intend to accurately document the behavior, then it seems that more detail is required.  If you are just trying to register a place for CP50220 and note that it's not ISO-2022-JP, then this might work.  If the higher level of detail about Microsoft's exact behavior is required, then this might be something I need to follow up on, which might take a little while :(

There are two ways to read the sentence below, and I read it differently than you intended: it can be read as saying that CP50220, Windows-31J, and Shift_JIS are all variants of ISO-2022-JP.  Sorry, I didn't think of reading it the other way.
"CP50220 is a variant of ISO-2022-JP (like Windows-31J and Shift_JIS)."

Perhaps this would have helped me:
"CP50220 is a variant of ISO-2022-JP.  (Similar to the way that Windows-31J is a variant of Shift_JIS)"

-Shawn

 
http://blogs.msdn.com/shawnste


________________________________________
From: Masatoshi Kimura [VYV03354@nifty.ne.jp]
Sent: Friday, September 03, 2010 6:54 PM
To: Shawn Steele
Cc: 'NARUSE, Yui'; 'ietf-charsets'
Subject: Re: Registration of new charset CP50220

(2010/09/04 4:50), Shawn Steele wrote:
> I don't think the registration needs an alias, unless someone's
> actually using that name already?

The "csCP50220" alias was added for MIB requirements. See
<http://mail.apps.ietf.org/ietf/charsets/msg01880.html>.

> If I understand the intent correctly, the intent here is to match
> Microsoft's 50220 behavior?  If so, I think that defining the
> "character sets" in terms of the JIS standards is a little bit odd.
> Instead I might consider pointing the character set mappings, like at
> http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/ ? (Not
> sure if that completely works)

There are no official character set mappings for CP50220 (if they
existed, the problem would be much less complicated).
Moreover, it would be insufficient to point to a mapping table because
CP50220 is a stateful encoding. A BNF or a decoding algorithm (like the
one in the HTML5 spec) would be required. Defining it in terms of the
JIS standards is the easiest workaround.
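
To illustrate why a mapping table alone cannot describe a stateful
encoding, here is a minimal sketch in Python. It only tracks which
character set each byte belongs to; it does not map anything to Unicode.
The escape sequences are the four listed in RFC 1468 for ISO-2022-JP and
are an assumption for illustration: the set that CP50220 actually
accepts may differ. The names ESCAPES and split_by_state are
hypothetical.

```python
ESC = 0x1B

# Assumption: the four escape sequences of RFC 1468 (ISO-2022-JP).
# CP50220's real behavior may include others or differ in detail.
# Each entry maps to (character set name, bytes per character).
ESCAPES = {
    b"\x1b(B": ("ASCII", 1),
    b"\x1b(J": ("JIS X 0201 Roman", 1),
    b"\x1b$@": ("JIS X 0208-1978", 2),
    b"\x1b$B": ("JIS X 0208-1983", 2),
}


def split_by_state(data: bytes):
    """Return (charset, raw bytes) pairs, one per encoded character."""
    state = ("ASCII", 1)  # the stream starts in ASCII
    out = []
    i = 0
    while i < len(data):
        if data[i] == ESC:
            seq = data[i:i + 3]
            if seq not in ESCAPES:
                raise ValueError(f"unknown escape sequence at offset {i}")
            state = ESCAPES[seq]  # switch state; nothing is emitted
            i += 3
            continue
        name, width = state
        unit = data[i:i + width]
        if len(unit) < width:
            raise ValueError(f"truncated character at offset {i}")
        out.append((name, unit))
        i += width
    return out


if __name__ == "__main__":
    # The bytes 0x24 0x33 appear twice with different meanings.
    for name, unit in split_by_state(b"$3\x1b$B$3\x1b(B"):
        print(name, unit)
```

The same two bytes are two ASCII characters before the escape sequence
and one two-byte character after it, so a per-state table has to be
paired with a description of the state transitions.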

> I'm not sure the comparison to shift_jis makes much sense.  They both
> encode Japanese, but 50220/iso-2022-jp are both stateful escape
> sequence based encodings, whereas shift_jis is "just" a double-byte
> code page.

The comparison is not CP50220/ISO-2022-JP vs. Shift_JIS but
Shift_JIS vs. Windows-31J. It is needed to express "characters
extended by Windows Codepage 932" because only the Shift_JIS variant
(Windows-31J, Windows Codepage 932, or whatever) has "official"
mappings provided by Microsoft. Again, it would not be required if
Microsoft provided official mappings for CP50220.

--
VYV03354@nifty.ne.jp