
Re: Registering a charset alias



On Sat, 15 Aug 2009 17:22:10 +0200, Erik van der Poel <erikv@google.com> wrote:
> I mostly agree. I believe we will bump into various differences
> between the browsers, but the deeper we dive, the less interesting
> those differences will become. That is why I suggested that we
> prioritize.

I'm personally not in a hurry, but I don't mind tackling some bits before others.


> For example, I believe that the GB2312/GBK/GB18030 family of encodings
> is quite important. However, as far as I know, the major browsers do
> not use GB18030 as the "superset", even though it was clearly defined
> to be the superset.

I would personally be fine with pushing for this, mostly because it simplifies matters. The fewer encodings that need to be supported, the better, in my opinion. They are mostly a historical artifact after all, and what matters most is that non-UTF-8 documents can still be read.
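
To make the "superset" idea concrete, here is a rough sketch in Python of what it could look like: every label in the GB family selects the GB18030 decoder. The table contents and the decode_gb name are made up for illustration, not taken from any browser.

```python
# Hypothetical label table: every GB-family label selects the GB18030
# decoder. The entries are an assumption for illustration only.
GB_FAMILY = {
    "gb2312": "gb18030",
    "gbk": "gb18030",
    "gb18030": "gb18030",
}


def decode_gb(data: bytes, label: str) -> str:
    """Decode GB-family bytes with the superset decoder."""
    codec = GB_FAMILY[label.strip().lower()]
    return data.decode(codec)


# Bytes produced by the GB2312 encoder, decoded through the superset.
sample = "中文".encode("gb2312")
text = decode_gb(sample, "GB2312")
```

Whether the few code points that differ between these encodings matter for real content would need testing; the sketch only shows the label handling.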


>> Opera currently uses the Unicode Charset Alias Matching rules, but
>> they are not perfect and we want to change away from them again. I'll
>> look into getting a definitive list of the encodings we support.
>
> That would be great. The Firefox, Safari and Chrome lists should all
> be available, since they're open source. Shawn, would Microsoft be
> willing to publish the latest Windows/IE list of
> charsets/aliases/supersets?

I've done the work for Opera:

  http://wiki.whatwg.org/wiki/Web_Encodings

As a starting point for aliases it is not very useful, I think, since we use the UTS22 rules. The same goes for Chromium. I have not checked Safari. From the limited testing of character encoding label matching I have done so far, I think we want to base the matching algorithm on a combination of what Internet Explorer and Firefox are doing. They are stricter and highly likely to be compatible with the largest body of Web content.
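
For anyone not familiar with the difference, here is a small Python sketch of loose versus strict matching. uts22_key follows the UTS22 alias matching rule as I read it (drop everything except ASCII letters and digits, lowercase, then drop each 0 not preceded by a digit). strict_key is only an assumption about what "strict" could mean, namely a case-insensitive exact match after trimming whitespace; it is not a description of what Internet Explorer or Firefox actually do.

```python
import re


def uts22_key(label: str) -> str:
    """Loose matching key, per my reading of the UTS22 alias rule."""
    # Keep only ASCII letters and digits, lowercased.
    s = re.sub(r"[^A-Za-z0-9]", "", label).lower()
    out = []
    prev_digit = False
    for ch in s:
        # Delete each 0 that is not preceded by a (kept) digit.
        if ch == "0" and not prev_digit:
            continue
        out.append(ch)
        prev_digit = ch.isdigit()
    return "".join(out)


def strict_key(label: str) -> str:
    """Hypothetical strict key: trim whitespace and lowercase, nothing else."""
    return label.strip().lower()


# Loose matching treats these two labels as the same encoding.
loose_same = uts22_key("u.t.f-008") == uts22_key("UTF-8")
# Strict matching does not.
strict_same = strict_key("u.t.f-008") == strict_key("UTF-8")
```

The point is that the loose rule accepts many labels that almost never appear in real content, which is why a table of Opera aliases alone does not tell us much.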

It would be great if people could help out and create similar tables for the other browsers.


-- 
Anne van Kesteren
http://annevankesteren.nl/