
RE: Registration of new charset CP50220



Ok

-Shawn

 
http://blogs.msdn.com/shawnste


________________________________________
From: NARUSE, Yui [naruse@airemix.jp]
Sent: Thursday, September 16, 2010 8:03 AM
To: Shawn Steele
Cc: Masatoshi Kimura; 'ietf-charsets'; Ryan Cavalcante; Peter Constable
Subject: Re: Registration of new charset CP50220

(2010/09/06 14:54), Shawn Steele wrote:
> I'm happy if you're happy :)
>
> I think that CP50220 could point the states for each escape sequence
> at appropriate tables, however some of the code might not completely
> match the tables (I'd have to take a close look).  However, it is
> unlikely that the various standards referred to by these escape
> sequences are exactly what our behavior is; I've heard complaints.
> Readers would presume that the escape sequences were exactly as in the
> JIS standards, however I think that's not true :(
>
> So I guess I'm a bit unclear as to your goal.  If you intend to
> accurately document the behavior, then it seems that more detail is
> required.  If you are just trying to register a place for CP50220 and
> note that it's not ISO-2022-JP, then this might work.  If the higher
> level detail of Microsoft exact behavior is required, then this might
> be something I need to follow up on, which might take a little while
> :(
>
> There are 2 ways to read this sentence, I read it differently:  (It
> can be read that CP50220, Windows-31J & Shift_JIS are all variants of
> ISO-2022-JP.)  Sorry, I didn't think of reading it the other way.
> "CP50220 is a variant of ISO-2022-JP (like Windows-31J and
> Shift_JIS)."
>
> Perhaps this would have helped me: "CP50220 is a variant of
> ISO-2022-JP.  (Similar to the way that Windows-31J is a variant of
> Shift_JIS)"

A registration is a standard for information exchange.
So what we need is a reasonable subset of the real implementation of
Windows Codepage 50220.

For example, CP932 has some strange mappings like:
<U0080> \x80 |0
<UF8F0> \xA0 |0
<UF8F1> \xFD |0
<UF8F2> \xFE |0
<UF8F3> \xFF |0
Moreover, it has User Defined Characters.

I don't think they are needed for the IANA Charset registry.
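
As a rough illustration only (a sketch, not part of the proposed registration), the mapping lines above can be parsed and classified by Unicode range to see why they would fall outside a "reasonable subset". The helper names (`parse`, `classify`, `reasonable_subset`) and the rule "drop C1 controls and private-use code points" are assumptions made for this example, not anything defined by Microsoft or IANA.

```python
import re

# The five mapping lines quoted above, in "<Uxxxx> \xNN |flag" form.
CP932_ODDITIES = r"""
<U0080> \x80 |0
<UF8F0> \xA0 |0
<UF8F1> \xFD |0
<UF8F2> \xFE |0
<UF8F3> \xFF |0
"""

LINE = re.compile(r"<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)")


def parse(text):
    """Yield (code_point, byte_sequence) for each mapping line in text."""
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip blank or unrelated lines
        code_point = int(m.group(1), 16)
        hex_bytes = re.findall(r"\\x([0-9A-Fa-f]{2})", m.group(2))
        yield code_point, bytes(int(h, 16) for h in hex_bytes)


def classify(code_point):
    """Label a code point by the Unicode range it falls in."""
    if 0x80 <= code_point <= 0x9F:
        return "C1 control"
    if 0xE000 <= code_point <= 0xF8FF:
        return "private use"
    return "ordinary"


def reasonable_subset(entries):
    """Keep only mappings to ordinary code points (assumed rule for this sketch)."""
    return [(cp, raw) for cp, raw in entries if classify(cp) == "ordinary"]


if __name__ == "__main__":
    for cp, raw in parse(CP932_ODDITIES):
        print(f"U+{cp:04X} <- {raw!r}: {classify(cp)}")
```

Under this assumed rule, none of the five mappings listed above would be kept.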

I believe that what I proposed for CP50220 is a reasonable one.

--
NARUSE, Yui  <naruse@airemix.jp>