RE: Registration of new charset CP50220
Windows, .Net & MLang aren't going to change the behavior of these code pages; it would break people. Instead we'd encourage customers to use UTF-8, particularly if they're having problems.
I was sort of assuming that since you're using the Windows nomenclature, you're attempting to pin down the behavior for some sort of interoperability when you see the Windows names. It is, perhaps, odd for the "7 bit" form to do something when it sees 8 bit data, but I was just letting you know that's what it does :) I'm sure there are also other subtle discrepancies between the 5022x behavior and the official standards, but we're pretty much stuck with the existing behavior.
If Mozilla were to target the Windows CP50220 behavior specifically (as opposed to the more general iso-2022-jp), then how exactly they wanted to follow that behavior would be up to them. If they thought that just mapping it to iso-2022-jp was acceptable and more convenient, then that would be their choice, same way we map iso-2022-jp to 50220 even though it isn't a perfect match.
-Shawn
-----Original Message-----
From: Masatoshi Kimura [mailto:VYV03354@nifty.ne.jp]
Sent: Monday, August 30, 2010 4:07 PM
To: Shawn Steele
Cc: NARUSE, Yui; ietf-charsets
Subject: Re: Registration of new charset CP50220
The purpose of this registration is to "standardize" how to handle errors when Web browsers encounter illegal ISO-2022-JP sequences.
The Mozilla encoder has changed its halfwidth katakana handling to match the CP50220 behavior.
https://bugzilla.mozilla.org/show_bug.cgi?id=563283
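As a rough illustration of the encoder side, the sketch below folds halfwidth katakana to fullwidth before encoding. It is only an approximation under stated assumptions: the function names are hypothetical, NFKC normalization of the U+FF61..U+FF9F range stands in for whatever mapping table CP50220 or Mozilla actually uses, and Python's `iso2022_jp` codec stands in for the target charset.

```python
import re
import unicodedata

# Halfwidth katakana and related punctuation occupy U+FF61..U+FF9F.
_HALFWIDTH_RUN = re.compile("[\uFF61-\uFF9F]+")


def fold_halfwidth_katakana(text):
    """Replace runs of halfwidth katakana with fullwidth equivalents.

    Assumption: NFKC is used as a stand-in mapping; it also combines a
    base character with a following halfwidth (semi-)voiced sound mark.
    Only the halfwidth range is normalized, the rest is left untouched.
    """
    return _HALFWIDTH_RUN.sub(
        lambda m: unicodedata.normalize("NFKC", m.group()), text
    )


def encode_folded(text):
    """Hypothetical helper: fold, then encode as ISO-2022-JP."""
    return fold_halfwidth_katakana(text).encode("iso2022_jp")
```

For example, `encode_folded("\uFF76\uFF9E")` (halfwidth KA plus voiced mark) yields the same bytes as encoding the single fullwidth character U+30AC directly.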
> Decoding is identical (which might be most interesting for users of tagged content).
The first version of the registration included all the decoding methods that CP50220 supports (i.e. ESC ( J, SI, and 8-bit). However, the latter two methods were removed from the registration for two reasons.
1. Some implementations (e.g. Mozilla's) don't support them.
Should the Mozilla decoder be changed to match the behavior?
2. The charset is supposed to be 7-bit. It's strange to include 8-bit character handling.
Changing the registration to 8-bit is not a solution, because it would require the Content-Transfer-Encoding MIME header field. That is not compatible with ISO-2022-JP. Old Microsoft Internet Mail/News had this bug.
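To make the 7-bit concern concrete, here is a minimal sketch that reports whether a byte stream relies on either of the two methods dropped from the registration. The function name and the labels it returns are hypothetical, and treating both SO (0x0E) and SI (0x0F) as the "SI" method is my assumption; it does not decode anything.

```python
SO, SI = 0x0E, 0x0F  # Shift Out / Shift In control bytes


def dropped_methods_used(data):
    """Return which of the dropped decoding methods appear in `data`.

    "shift": an SO or SI control byte is present.
    "8bit":  at least one byte has the high bit set, so the stream
             is no longer 7-bit clean.
    """
    found = set()
    if SO in data or SI in data:
        found.add("shift")
    if any(b >= 0x80 for b in data):
        found.add("8bit")
    return found
```

A stream that uses only escape sequences, such as `b'\x1b$B%"\x1b(B'`, returns an empty set, while `b'\xb1'` returns `{"8bit"}`.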