
Re: windows-1252



Erik van der Poel wrote:

> It is true that the RFC says "stable", but it does not say
> what "stable" means in the context of charsets. Does it mean
> that assigned codepoints must not change? Of course.

Yes, otherwise silently deleting UNICODE-1-1 would be an idea.

> Does it mean that unassigned codepoints must not change?
> That is debatable.

New Unicode points are assigned all the time.  Nothing's wrong
with that as long as you stick to operations where it doesn't
matter (e.g. no canonicalization).

> I can't find "cp-1252" in the IANA charset registry:
> http://www.iana.org/assignments/character-sets

It's on the quoted MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
page at unicode.org, submitted by cpxlate@microsoft.com

The old (RfC 2278) = proposed new (RfC 2978) table at...
<http://www.microsoft.com/globaldev/reference/sbcs/1252.htm>
...is most probably simply the same charset.  In that case it
would be nice to register "cp-1252" as an alias with a pointer
to CP1252.TXT.  The old RfC 2278 registration template is
<http://www.iana.org/assignments/charset-reg/windows-1252>

The proposed RfC 2978 update offers two additional links and
as you said a new contact address.

> Then, or in parallel, discuss windows-874 separately.

http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT
http://www.microsoft.com/globaldev/reference/sbcs/874.htm

JFTR.  We should check if there's a potential conflict for an
alias cp-874.  ICU says http://purl.net/net/cp/874 => TIS-620.

For http://purl.net/net/cp/1252 it says "windows-1252", mapping
the five interesting octets 0x81 to u+0081 etc.  I've no idea
what ibm1252-P100-2000 is supposed to be, my IBM OS/2 199mumble
has a Euro at 0x80, not u+0080.  OTOH it also says that this
is "1004" and not 1252, so probably that's beside the point.
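
To make the "five interesting octets" concrete, here is a rough
Python sketch.  It assumes that Python's built-in "cp1252" codec
treats those octets as unassigned, the way CP1252.TXT does; the
helper names are made up for illustration and the pass-through
function only imitates the 0x81 => u+0081 behaviour described
above, it is not taken from ICU.

```python
def unassigned_c1_octets(codec="cp1252"):
    """List the octets in 0x80..0x9F that the codec refuses to decode."""
    missing = []
    for octet in range(0x80, 0xA0):
        try:
            bytes([octet]).decode(codec)
        except UnicodeDecodeError:
            # No mapping for this octet in the codec's table.
            missing.append(octet)
    return missing


def passthrough_decode(data, codec="cp1252"):
    """Decode octet by octet; unassigned octets become the code point
    with the same number (0x81 => u+0081 and so on)."""
    out = []
    for octet in data:
        try:
            out.append(bytes([octet]).decode(codec))
        except UnicodeDecodeError:
            out.append(chr(octet))
    return "".join(out)


if __name__ == "__main__":
    print([hex(o) for o in unassigned_c1_octets()])
    print(repr(passthrough_decode(b"\x80\x81")))
```

The difference between the two tables is then just whether the
fallback branch exists; the assigned octets (the Euro at 0x80
included) come out the same either way.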

                        Bye, Frank