Re: Registering a charset alias



On Sat, Aug 15, 2009 at 12:24 AM, Anne van Kesteren <annevk@opera.com> wrote:
> What would help I think is clear documentation on how browsers
> (Chrome, Safari, Firefox, Internet Explorer, Opera) for a given label from
> set A arrive at the final label from set B. Set A is near-infinite and set B
> should be finite and essentially consist of the list of supported
> encodings. If we have that we should be able to propose a better
> algorithm than Unicode currently defines that HTML5 can then use.

I mostly agree. I believe we will bump into various differences
between the browsers, but the deeper we dive, the less interesting
those differences will become. That is why I suggested that we
prioritize.

For example, I believe that the GB2312/GBK/GB18030 family of encodings
is quite important. However, as far as I know, the major browsers do
not use GB18030 as the "superset", even though it was clearly defined
to be the superset.
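
To make the label-to-encoding mapping and the superset idea concrete,
here is a rough Python sketch. The alias table, the SUPERSET table and
the resolve_label() helper are all hypothetical names made up for
illustration; they are not taken from any browser, and the alias list
is deliberately tiny. It only assumes that Python's built-in "gb2312"
and "gb18030" codecs are available.

```python
# Hypothetical alias table: incoming labels (set A) -> supported
# encodings (set B). Illustrative only, not any browser's real list.
ALIASES = {
    "gb2312": "gb2312",
    "gb_2312-80": "gb2312",
    "csgb2312": "gb2312",
    "gbk": "gbk",
    "cp936": "gbk",
    "gb18030": "gb18030",
}

# Hypothetical promotion table: decode members of the family with the
# encoding that was defined as their superset.
SUPERSET = {
    "gb2312": "gb18030",
    "gbk": "gb18030",
}


def resolve_label(label, promote=True):
    """Map a charset label to a supported encoding name, or None."""
    key = label.strip().lower()
    canonical = ALIASES.get(key)
    if canonical is None:
        return None  # unknown label: caller decides on a fallback
    if promote:
        canonical = SUPERSET.get(canonical, canonical)
    return canonical


# Bytes produced as GB2312 still decode correctly via the superset.
data = "中文".encode("gb2312")
text = data.decode(resolve_label(" GB2312 "))
```

With promotion turned off, resolve_label("cp936", promote=False) gives
"gbk"; with it on, every label in the family ends up at "gb18030".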

> Opera currently uses the Unicode Charset Alias Matching rules, but
> they are not perfect and we want to change away from them again. I'll
> look into getting a definitive list of the encodings we support.

That would be great. The Firefox, Safari and Chrome lists should all
be available, since they're open source. Shawn, would Microsoft be
willing to publish the latest Windows/IE list of
charsets/aliases/supersets?

Erik