Re: Registering a charset alias
> So if I understand this data correctly IE does not treat ISO-8859-1 and
> Windows-1252 the same?
FYI, there are quite a few differences between iso-8859-1 and windows-1252. In
summary, windows-1252 uses the C1 region (0x80-0x9F) for printable characters,
none of which appear in iso-8859-1, where that range is left to control codes.
The Euro sign at 0x80 is arguably the most significant difference in practice.
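
For anyone who wants to see the difference concretely, here is a minimal
sketch that compares the two charsets byte by byte. It assumes Python's
built-in codec tables are a fair stand-in for the two charsets; the helper
name differing_bytes is made up for this illustration, and nothing here says
anything about what IE itself does.

```python
def differing_bytes():
    """Map each byte that decodes differently to (iso-8859-1 char, windows-1252 char or None)."""
    diffs = {}
    for b in range(256):
        raw = bytes([b])
        # In Python's iso-8859-1 codec every byte maps to U+0000..U+00FF.
        latin1 = raw.decode("iso-8859-1")
        try:
            cp1252 = raw.decode("windows-1252")
        except UnicodeDecodeError:
            # Byte is unassigned in Python's windows-1252 table.
            cp1252 = None
        if latin1 != cp1252:
            diffs[b] = (latin1, cp1252)
    return diffs


diffs = differing_bytes()
print(f"differences span 0x{min(diffs):02X}-0x{max(diffs):02X}")
print(f"0x80 as iso-8859-1:   U+{ord(diffs[0x80][0]):04X}")
print(f"0x80 as windows-1252: U+{ord(diffs[0x80][1]):04X}")
```

With these codec tables every differing byte falls in the C1 range, and 0x80
comes out as a control code in one charset and the Euro sign in the other.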
> That is not my experience, but maybe I do not understand
> the code pages concept well enough.
It may be that IE treats ISO-8859-1 the same as windows-1252 because the
printable part of ISO-8859-1 is in some sense a subset of windows-1252. But
you'd be well advised not to count on that behavior.
> > I think most of our encodings don't lend themselves to the superset
> > concept. There're probably variations for individual code points even
> > in closely related code pages. GB18030 might be an exception there.
I'd have to check to be sure, but I believe the Microsoft variant of GBK
contains some characters that aren't in GB18030. (Microsoft additions leaking
into the subset charsets is a very common problem, especially in the CJK sets.)
Ned