Re: Registration of new charset CP50220



(2010/09/02 20:43), Masatoshi Kimura wrote:
> (2010/08/31 8:20), Shawn Steele wrote:
>> Windows, .Net & MLang aren't going to change the behavior of these
>> code pages, it would break people. Instead we'd encourage customers
>> to use UTF-8, particularly if they're having problems.
>
> Totally agree. I do not want to change the existing legacy codes only
> to comply with the registration. Note that sending ESC(I is illegal
> regardless of "CP50220". Mozilla ISO-2022-JP encoder should have been
> changed anyway.

This refers to the following:
Mozilla has a decoder for ESC(I in ISO-2022-JP, but when sending
JIS X 0201-Katakana, Mozilla encodes the characters as character references.
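
For illustration only, the two behaviors can be approximated in Python.
This is not Mozilla's code: Python's iso2022_jp_ext codec stands in for a
decoder that accepts ESC(I, and the xmlcharrefreplace error handler stands
in for the character-reference fallback on sending.

```python
# ESC ( I designates JIS X 0201 Katakana; in that set, byte 0x36 is
# HALFWIDTH KATAKANA LETTER KA (U+FF76). ESC ( B returns to ASCII.
received = b"\x1b(I6\x1b(B"

# Stand-in for a lenient decoder (assumption: Python's iso2022_jp_ext codec,
# used here only because it accepts ESC ( I).
decoded = received.decode("iso2022_jp_ext")

# Python's plain iso2022_jp codec has no JIS X 0201 Katakana set, so the
# character goes to the error handler and comes out as a character reference.
sent = decoded.encode("iso2022_jp", errors="xmlcharrefreplace")

print(repr(decoded), sent)
```

The point of the sketch is only that ESC(I is accepted on input but never
emitted on output.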

>> I was sort of assuming that since you're using the Windows
>> nomenclature, you're attempting to pin down the behavior for some
>> sort of interoperability when you see the Windows names. It is,
>> perhaps, odd for the "7 bit" form to do something when it sees 8
>> bit data, but I was just letting you know that's what it does :)
>> I'm sure there are also other subtle discrepancies between the
>> 5022x behavior and the official standards, but we're pretty much
>> stuck with the existing behavior.
>
> Windows does not use the name "CP50220" at all. It treats the name
> "ISO-2022-JP" as Windows Codepage 50220. Windows should behave as if
> it uses genuine ISO-2022-JP because it doesn't use its own name.
> Therefore it should preclude "Content-Transfer-Encoding: 8bit" on
> sending. MSIMN didn't, so many users complained about this rudeness.
> Eventually, Outlook Express fixed this bug. The name "CP50220" is
> used by non-MS implementations to differentiate it from genuine
> ISO-2022-JP. Those implementations do not share exactly the same
> behavior. Even WideCharToMultiByte and MLang differ from each
> other. So there is no authoritative definition of "CP50220".
>
> Consequently, I think the registration should include only the
> greatest common denominator. The rest should be left undefined.

All I wanted is a decoder, so I can change the description to something like:

   * On sending, JIS X 0201-Katakana MUST be converted to the
     corresponding characters of JIS X 0208, or to escaped characters,
     depending on the context.
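
A rough sketch of the first option (conversion to JIS X 0208) in Python
follows. The function names are hypothetical, and using Unicode NFKC
normalization restricted to the halfwidth Katakana range is an assumption
about how the mapping could be done, not something the registration text
prescribes.

```python
import re
import unicodedata

# Unicode range for halfwidth (JIS X 0201) Katakana, including the
# halfwidth punctuation and the voiced/semi-voiced sound marks.
_HALFWIDTH_KATAKANA = re.compile(r"[\uFF61-\uFF9F]+")


def to_jisx0208_katakana(text):
    """Replace runs of halfwidth Katakana with fullwidth equivalents.

    NFKC is applied only to the matched runs so that other characters
    in the text are left untouched. A base letter followed by a
    halfwidth sound mark is composed into a single fullwidth letter.
    """
    return _HALFWIDTH_KATAKANA.sub(
        lambda m: unicodedata.normalize("NFKC", m.group()), text
    )


def encode_for_sending(text):
    """Encode as ISO-2022-JP after removing halfwidth Katakana."""
    return to_jisx0208_katakana(text).encode("iso2022_jp")


if __name__ == "__main__":
    print(encode_for_sending("\uff83\uff7d\uff84"))
```

A sound mark with no preceding base letter becomes a combining character
under NFKC and may not be encodable, so a real encoder would need to
handle that case separately (for example with the escaping option).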

-- 
NARUSE, Yui  <naruse@airemix.jp>