RE: Update of charset windows-1252, draft 2
>> By the way, the windows-1255 charset has changed recently. See the
>> mapping for 0xCA in:
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1255.TXT
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1255.txt
Actually, the behavior of the Windows code page has not changed; the previous mapping for 0xCA is identical to the current mapping. I'm guessing that, since it wasn't a real code point, it was filtered out by whoever created CP1255.TXT (I wasn't here, so I'm not sure how it came to be :)).
>> So we may want to update the out-of-date one (CP1255.TXT)
> I think that is for Microsoft to do. But it does make their stated
> policy of not updating any of their codepages less credible.
I'm not about to touch that data file. I'm not sure how it was created, but obviously best-fit mappings and unassigned Unicode code points were filtered out. (A few code pages also map to the PUA, but those mappings aren't in the older Unicode tables.)
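As a rough illustration of the filtering described above, here is a minimal Python sketch that lists the bytes mapped in a fuller table but absent from a filtered one. It assumes the simple row layout of the CP*.TXT files (byte value, Unicode value, then a `#` comment); the bestfit* files may be laid out differently and would need their own parser. The function names and the sample rows are made up for illustration and are not copied from either published file.

```python
def parse_mapping(text):
    """Parse rows of the form '0xNN <ws> 0xNNNN <ws> #comment' into {byte: code point}."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # byte listed without a Unicode value, i.e. left undefined
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def missing_bytes(filtered, full):
    """Return the bytes that 'full' maps but 'filtered' does not."""
    return sorted(b for b in full if b not in filtered)


# Illustrative sample data only (hypothetical, not the real file contents).
FILTERED = (
    "0xC9\t0x05B9\t#HEBREW POINT HOLAM\n"
    "0xCA\t\t#UNDEFINED\n"
    "0xCB\t0x05BB\t#HEBREW POINT QUBUTS\n"
)
FULL = (
    "0xC9\t0x05B9\n"
    "0xCA\t0x05BA\n"
    "0xCB\t0x05BB\n"
)

for b in missing_bytes(parse_mapping(FILTERED), parse_mapping(FULL)):
    print(f"0x{b:02X} is mapped only in the fuller table")
```

A byte reported here was dropped from the filtered table rather than remapped, which is the situation described for 0xCA.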
> While doing that, I would suggest they fix the character name
> comments (both in the cp* files and in the bestfit* files) to
> align with Unicode 5.0. It is so much less confusing that way.
These are effectively our raw source files, and they were provided without any manipulation in order to avoid the risk of introducing a technical error. It'd be nice if the comments were pretty, but as it is we can easily prove that the data is the same as the Windows tables.
- Shawn
Shawn Steele
Windows International
Microsoft