Re: Registering a charset alias



> So if I understand this data correctly IE does not treat ISO-8859-1 and
> Windows-1252 the same?

FYI, there are quite a few differences between iso-8859-1 and windows-1252. In
summary, the windows variant elected to use the C1 region (0x80-0x9F) for
printable characters (the Euro symbol, curly quotes, dashes and so on), none of
which appears in iso-8859-1. The Euro symbol at 0x80 is arguably the most
significant difference in practice.
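
To make the difference concrete, here is a minimal sketch using Python's
built-in codecs (an assumption of this illustration: Python's "iso-8859-1" and
"windows-1252" tables stand in for the two charsets; the helper name
c1_differences is made up for the example). It decodes every byte in the C1
region both ways and collects the bytes where the results disagree.

```python
def c1_differences():
    """Map each byte in 0x80-0x9F to (iso-8859-1 result, windows-1252 result)
    wherever the two decodings disagree."""
    diffs = {}
    for byte in range(0x80, 0xA0):
        raw = bytes([byte])
        latin1 = raw.decode("iso-8859-1")  # always a C1 control character
        try:
            cp1252 = raw.decode("windows-1252")
        except UnicodeDecodeError:
            # Python's windows-1252 table leaves a few bytes unassigned.
            cp1252 = None
        if latin1 != cp1252:
            diffs[byte] = (latin1, cp1252)
    return diffs


if __name__ == "__main__":
    for byte, (latin1, cp1252) in sorted(c1_differences().items()):
        print(f"0x{byte:02X}  iso-8859-1={latin1!r}  windows-1252={cp1252!r}")
```

With these codecs, every byte in the C1 region comes out different, while all
bytes outside it decode identically in both charsets.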

> That is not my experience, but maybe I do not understand
> the code pages concept good enough.

It may be that IE treats ISO-8859-1 the same as windows-1252 because ISO-8859-1
is in some sense a subset. But you'd be well advised not to count on that
behavior.
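
As a hedged illustration of why relying on that behavior is risky, the sketch
below (again using Python's codecs as stand-ins; the sample text and variable
names are invented for the example) takes bytes produced as windows-1252 and
decodes them once the way a lenient receiver might and once strictly according
to an ISO-8859-1 label.

```python
# Bytes produced by a sender that really used windows-1252.
payload = "price: 5\u20ac".encode("windows-1252")  # the Euro sign becomes 0x80

# A receiver that quietly treats the ISO-8859-1 label as windows-1252.
lenient = payload.decode("windows-1252")

# A receiver that takes the ISO-8859-1 label literally.
strict = payload.decode("iso-8859-1")

# The lenient receiver recovers the Euro sign; the strict one ends up with
# the C1 control character U+0080 in its place.
print(repr(lenient))
print(repr(strict))
```

The same bytes therefore yield different text depending on how a given
receiver interprets the label, which is why a label of ISO-8859-1 on content
that uses the 0x80-0x9F range is not portable.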

> > I think most of our encodings don't lend themselves to the superset
> > concept.  There're probably variations for individual code points even
> > in closely related code pages.  GB18030 might be an exception there.

I'd have to check to be sure, but I believe the Microsoft variant of GBK
contains some stuff that isn't in GB18030. (Microsoft additions leaking into
the subset charsets is a very common problem, especially in the CJK sets.)

				Ned