Re: windows-1252
Erik van der Poel wrote:
> It is true that the RFC says "stable", but it does not say
> what "stable" means in the context of charsets. Does it mean
> that assigned codepoints must not change? Of course.
Yes, otherwise silently deleting UNICODE-1-1 would be an idea.
> Does it mean that unassigned codepoints must not change?
> That is debatable.
New Unicode code points are assigned all the time. Nothing's
wrong with that if you know it can happen and only do things
where it doesn't matter (e.g. no canonicalization).
> I can't find "cp-1252" in the IANA charset registry:
> http://www.iana.org/assignments/character-sets
It's on the quoted MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
page at unicode.org submitted by cpxlate@microsoft.com
The old (RfC 2278) = proposed new (RfC 2978) table at...
<http://www.microsoft.com/globaldev/reference/sbcs/1252.htm>
...is most probably simply the same charset. In that case it
would be nice to register "cp-1252" as an alias with a pointer
to CP1252.TXT. The old RfC 2278 registration template is
<http://www.iana.org/assignments/charset-reg/windows-1252>
The proposed RfC 2978 update offers two additional links and
as you said a new contact address.
> Then, or in parallel, discuss windows-874 separately.
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT
http://www.microsoft.com/globaldev/reference/sbcs/874.htm
JFTR. We should check if there's a potential conflict for an
alias cp-874. ICU says http://purl.net/net/cp/874 => TIS-620.
For http://purl.net/net/cp/1252 it says "windows-1252" mapping
the five interesting octets 0x81 to u+0081 etc. I've no idea
what ibm1252-P100-2000 is supposed to be, my IBM OS/2 199mumble
has a Euro at 0x80, not u+0080. OTOH it also says that this
is "1004" and not 1252, so probably that's beside the point.
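
To make the "five interesting octets" concrete, here is a minimal
Python sketch. It assumes that Python's built-in cp1252 codec follows
CP1252.TXT (it treats the unassigned octets as errors); the helper
names decode_octet and undefined_octets are made up for illustration,
and the fallback to C1 controls only mimics the 0x81 => u+0081 style
of mapping mentioned above, not any registered definition.

```python
def decode_octet(octet, fallback_to_c1=False):
    """Decode a single octet as cp1252.

    Returns None for an octet the codec leaves undefined, unless
    fallback_to_c1 is set, in which case the octet is mapped to the
    code point with the same value (e.g. 0x81 -> u+0081).
    """
    try:
        return bytes([octet]).decode("cp1252")
    except UnicodeDecodeError:
        if fallback_to_c1:
            return chr(octet)  # hypothetical "pass through as C1" policy
        return None


def undefined_octets():
    """List every octet 0x00..0xFF that the cp1252 codec cannot decode."""
    return [o for o in range(256) if decode_octet(o) is None]


if __name__ == "__main__":
    print([hex(o) for o in undefined_octets()])
    print(hex(ord(decode_octet(0x80))))                       # Euro sign
    print(hex(ord(decode_octet(0x81, fallback_to_c1=True))))  # C1 control
```

With the strict behaviour a converter has to reject or replace those
octets; with the fallback it round-trips them, which is the part where
the two tables can disagree without any assigned code point changing.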
Bye, Frank