
Re: windows-1252



Frank Ellermann wrote:
> Erik van der Poel wrote:
>>RFC 2978 does not require a Unicode mapping. It says that
>>there "SHOULD" be a 10646 mapping, but it does not use the
>>word "MUST".
> 
> You need a good excuse to ignore a SHOULD, a typical example
> are old implementations (= here old charset registrations).

I agree that it is really a good idea to provide the 10646 mapping.

> It also says "MUST be stable", that's why we got tons of new
> registered charsets doing something for the "Euro", like 858
> instead of 850.

It is true that the RFC says "stable", but it does not say what "stable" 
means in the context of charsets. Does it mean that assigned codepoints 
must not change? Of course. Does it mean that unassigned codepoints must 
not change? That is debatable. (And remember that UTF-8 is specifically 
permitted to have unassigned codepoints that might change later.)
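
To make the 850/858 case concrete, here is a small sketch that compares the two code pages octet by octet. It assumes Python's bundled "cp850" and "cp858" codecs are faithful to the registered charsets, which is an assumption about one implementation and not something the registry guarantees. The helper name differing_octets is made up for this example.

```python
# Sketch: list the octets on which two single-byte charsets disagree.
# Assumes Python's "cp850" and "cp858" codecs match the registrations.

def differing_octets(codec_a, codec_b):
    """Return (octet, char_a, char_b) for every octet decoded differently."""
    differences = []
    for value in range(256):
        octet = bytes([value])
        char_a = octet.decode(codec_a)
        char_b = octet.decode(codec_b)
        if char_a != char_b:
            differences.append((value, char_a, char_b))
    return differences


if __name__ == "__main__":
    for value, old, new in differing_octets("cp850", "cp858"):
        print(f"0x{value:02X}: U+{ord(old):04X} -> U+{ord(new):04X}")
```

With those codecs the only difference is at 0xD5, which is an assigned codepoint in 850. That is the uncontroversial kind of stability: the assignment in 850 was left alone and the Euro went into a new charset.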

> In the case of 1252 all it takes is to explain what the five
> interesting octets are supposed to be:  Maybe "cp-1252" and
> windows-1252 are two different charsets, the former with one
> to one mappings, the latter with five unassigned code points.

I can't find "cp-1252" in the IANA charset registry:

http://www.iana.org/assignments/character-sets
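
As for the five octets themselves, a quick way to list them is to ask a decoder which octets it refuses to map. The sketch below uses Python's "cp1252" codec, which happens to take the "five unassigned code points" reading; other implementations map the same octets one to one onto C1 control characters, so treat the result as one implementation's view, not as the definition of the charset. The function name undefined_octets is made up for this example.

```python
# Sketch: find the octets a single-byte codec leaves unassigned.
# Assumes Python's "cp1252" codec, which is only one interpretation
# of windows-1252.

def undefined_octets(codec="cp1252"):
    """Return the octet values that the codec cannot decode."""
    missing = []
    for value in range(256):
        try:
            bytes([value]).decode(codec)
        except UnicodeDecodeError:
            missing.append(value)
    return missing


if __name__ == "__main__":
    print(" ".join(f"0x{value:02X}" for value in undefined_octets()))
```

With that codec the list is 0x81, 0x8D, 0x8F, 0x90 and 0x9D.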

I have been wondering, however, about this "re-registration" of 
windows-1252. Why is it being registered again? Is it because the 
contact person/email address is being changed? If so, then that should 
be stated explicitly. I'm not very happy about these 2 URLs supplied 
with it either:

http://www.microsoft.com/globaldev/getwr/steps/wrg_unicode.mspx
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_19mb.asp

Those documents are neither 10646 equivalency tables nor even 
published specs for 1252.

Mike, I have a suggestion. How about dealing with windows-1252 
separately? If we can agree on a change for the windows-1252 
registration (e.g. contact person/email), then you can apply the same 
fix to the other re-registrations. It might even be a good idea to make 
a role address at Microsoft the contact, e.g. 
iana-charsets@microsoft.com. Then it doesn't matter whether Chris Wendt 
or Mike Ksar leaves Microsoft.

Then, or in parallel, discuss windows-874 separately. If we can agree on 
a pattern for that charset, you can apply the same pattern to the other 
new windows-NNN charsets.

Erik