Re: windows-1252
Frank Ellermann wrote:
> Erik van der Poel wrote:
>>RFC 2978 does not require a Unicode mapping. It says that
>>there "SHOULD" be a 10646 mapping, but it does not use the
>>word "MUST".
>
> You need a good excuse to ignore a SHOULD, a typical example
> are old implementations (= here old charset registrations).
I agree that it is really a good idea to provide the 10646 mapping.
> It also says "MUST be stable", that's why we got tons of new
> registered charsets doing something for the "Euro", like 858
> instead of 850.
It is true that the RFC says "stable", but it does not say what "stable"
means in the context of charsets. Does it mean that assigned codepoints
must not change? Of course. Does it mean that unassigned codepoints must
not change? That is debatable. (And remember that UTF-8 is specifically
permitted to have unassigned codepoints that might change later.)
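As an aside, the 850/858 pair Frank mentions can be compared with a short
Python sketch. It assumes that Python's bundled "cp850" and "cp858" codecs
reflect the registered charsets; the helper name differing_octets is made
up for this example and is not part of any registration or library.

```python
def differing_octets(codec_a, codec_b):
    """Return (octet, char_a, char_b) for each octet the two codecs decode differently."""
    diffs = []
    for octet in range(256):
        raw = bytes([octet])
        # errors="replace" yields U+FFFD for unassigned octets instead of raising
        char_a = raw.decode(codec_a, errors="replace")
        char_b = raw.decode(codec_b, errors="replace")
        if char_a != char_b:
            diffs.append((octet, char_a, char_b))
    return diffs


for octet, old, new in differing_octets("cp850", "cp858"):
    print(f"0x{octet:02X}: U+{ord(old):04X} -> U+{ord(new):04X}")
```

With those codecs, the only octet I would expect this to report is 0xD5,
where the dotless i of 850 is replaced by the Euro sign in 858. That is a
change to an assigned codepoint, which is why a new registration was needed.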
> In the case of 1252 all it takes is to explain what the five
> interesting octets are supposed to be: Maybe "cp-1252" and
> windows-1252 are two different charsets, the former with one
> to one mappings, the latter with five unassigned code points.
I can't find "cp-1252" in the IANA charset registry:
http://www.iana.org/assignments/character-sets
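For reference, the five octets in question can be listed with a short
Python sketch. It assumes that Python's built-in "cp1252" codec follows the
variant of the mapping that leaves those positions unassigned; the helper
name unassigned_octets is hypothetical.

```python
def unassigned_octets(codec):
    """Return the octet values that the given single-byte codec refuses to decode."""
    missing = []
    for octet in range(256):
        try:
            bytes([octet]).decode(codec)
        except UnicodeDecodeError:
            # The codec has no mapping for this octet.
            missing.append(octet)
    return missing


print([f"0x{octet:02X}" for octet in unassigned_octets("cp1252")])
```

Under that assumption the list is 0x81, 0x8D, 0x8F, 0x90 and 0x9D. A
"one to one" variant would instead have to map each of them to something,
presumably the C1 control with the same value, and so would report nothing.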
I have been wondering, however, about this "re-registration" of
windows-1252. Why is it being registered again? Is it because the
contact person/email address is being changed? If so, then that should
be stated explicitly. I'm not very happy about the two URLs supplied
with it either:
http://www.microsoft.com/globaldev/getwr/steps/wrg_unicode.mspx
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_19mb.asp
Those documents are neither 10646 equivalency tables nor even
published specs for 1252.
Mike, I have a suggestion. How about dealing with windows-1252
separately? If we can agree on a change for the windows-1252
registration (e.g. contact person/email), then you can apply the same
fix to the other re-registrations. It might even be a good idea to have
some non-person email address at Microsoft be the contact. E.g.
iana-charsets@microsoft.com. Then it doesn't matter whether Chris Wendt
or Mike Ksar leaves Microsoft.
Then, or in parallel, discuss windows-874 separately. If we can agree on
a pattern for that charset, you can apply the same pattern to the other
new windows-NNN charsets.
Erik