Re: Registering a charset alias

To: Erik van der Poel <erikv@google.com>, Markus Scherer <markus.icu@gmail.com>
Subject: Re: Registering a charset alias
From: Anne van Kesteren <annevk@opera.com>
Date: Sat, 15 Aug 2009 09:24:40 +0200
Cc: Shawn Steele <Shawn.Steele@microsoft.com>,Ira McDonald <blueroofmusic@gmail.com>,ietf-charsetsianaorg <ietf-charsets@iana.org>
In-reply-to: <c07a32650908141617x607895e3yaac4f86be795a1b9@mail.gmail.com>
List-Id: <ietf-charsets.mail.apps.ietf.org>
List-Owner: <mailto:ietf-charsets-owner@mail.apps.ietf.org>
List-Subscribe: <mailto:mailserv@mail.apps.ietf.org?subject=subscribe%20ietf-charsets>
List-Unsubscribe: <mailto:mailserv@mail.apps.ietf.org?subject=unsubscribe%20ietf-charsets>
Organization: Opera Software ASA
Original-recipient: rfc822;ned+ietf-charsets@mrochek.com
References: <op.uyl5bcjb64w2qv@annevk-t60><c07a32650908131154l21c8583du956c5aa5a4e7605b@mail.gmail.com><op.uyl9tzep64w2qv@annevk-t60><e395be80908131614p2e6ccb69u6bac9de57bc0f3d@mail.gmail.com><c07a32650908131856k44cbb0dcg129c64ffd57336e5@mail.gmail.com><CAD7705D4A93814F97D3EF00790AF0B316030FE6@tk5ex14mbxc105.redmond.corp.microsoft.com><c07a32650908141405lafcb236n98aec273dc45ff49@mail.gmail.com><CAD7705D4A93814F97D3EF00790AF0B31603105A@tk5ex14mbxc105.redmond.corp.microsoft.com><c07a32650908141549v103ae000qfd9e013ccb164ea8@mail.gmail.com><6bb028490908141603s5805ae6et6d486e7f3df5ca6@mail.gmail.com><c07a32650908141617x607895e3yaac4f86be795a1b9@mail.gmail.com>
Spam-test: False ; 0.8 / 4.5 ; RDNS_NONE,SPF_SOFTFAIL
User-Agent: Opera Mail/10.00 (Linux)

On Sat, 15 Aug 2009 01:17:30 +0200, Erik van der Poel <erikv@google.com> wrote:
> No, I don't think we should recommend behavior that is more lenient
> than what the major browsers currently do. (I believe the major
> browsers don't strip "x-"?)

Indeed, as far as I know the major browsers do not strip "x-". I agree we should keep leniency to a minimum.

> So I don't think the following spec from HTML 5, section 2.7 is very
> good either:
>
> "When comparing a string specifying a character encoding with the name
> or alias of a character encoding to determine if they are equal, user
> agents must use the Charset Alias Matching rules defined in Unicode
> Technical Standard #22. [UTS22]
>
> For instance, "GB_2312-80" and "g.b.2312(80)" are considered equivalent
> names."

Indeed. We experimented with this and it caused some compatibility issues.
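
For reference, the Charset Alias Matching rules quoted above can be sketched in a few lines of Python. This is a minimal sketch, assuming the three steps usually cited from UTS #22: delete everything except ASCII letters and digits, lowercase the result, then delete each "0" that is not preceded by a digit. The function name uts22_key is hypothetical and not taken from any browser or library.

```python
import re


def uts22_key(label: str) -> str:
    """Reduce a charset label to a comparison key (hypothetical helper).

    Assumed steps, following UTS #22 Charset Alias Matching:
      1. delete all characters except a-z, A-Z and 0-9
      2. map A-Z to a-z
      3. from left to right, delete each 0 not preceded by a digit
    """
    stripped = re.sub(r"[^A-Za-z0-9]", "", label).lower()
    kept = []
    for ch in stripped:
        # "Preceded by" is judged against what has been kept so far,
        # so a run of leading zeros in a number collapses entirely.
        prev_is_digit = bool(kept) and kept[-1] in "0123456789"
        if ch == "0" and not prev_is_digit:
            continue
        kept.append(ch)
    return "".join(kept)


def uts22_equal(a: str, b: str) -> bool:
    """Two labels match when their keys are identical."""
    return uts22_key(a) == uts22_key(b)
```

Under these rules uts22_equal("GB_2312-80", "g.b.2312(80)") is true, which is the leniency being questioned here. Note that the rules do not strip a leading "x-": the "x" simply becomes part of the key.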

What would help, I think, is clear documentation of how browsers (Chrome, Safari, Firefox, Internet Explorer, Opera) map a given label from set A to the final label in set B. Set A is near-infinite; set B should be finite and consist essentially of the list of supported encodings. With that documentation we should be able to propose a better algorithm than the one Unicode currently defines, which HTML5 can then use.
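
One possible shape for such a mapping is a stricter lookup against a finite table. The following is a sketch only: the table contents, the names LABELS and resolve_label, and the choice of normalization (trim ASCII whitespace, ASCII-lowercase, nothing else) are all assumptions made for illustration, not documented behavior of any browser.

```python
from typing import Optional

# Hypothetical, deliberately tiny table: label (set A, after
# normalization) -> supported encoding name (set B).
LABELS = {
    "utf-8": "utf-8",
    "utf8": "utf-8",
    "iso-8859-1": "iso-8859-1",
    "iso_8859-1": "iso-8859-1",
    "latin1": "iso-8859-1",
}

ASCII_WHITESPACE = " \t\n\f\r"


def ascii_lower(s: str) -> str:
    """Lowercase A-Z only, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def resolve_label(label: str) -> Optional[str]:
    """Return the supported encoding for a label, or None if unknown.

    Assumed normalization: trim ASCII whitespace and ASCII-lowercase.
    Punctuation is significant, so unknown spellings are rejected
    rather than guessed at.
    """
    return LABELS.get(ascii_lower(label.strip(ASCII_WHITESPACE)))
```

With this sketch resolve_label(" UTF8 ") yields "utf-8", while resolve_label("g.b.2312(80)") yields None because no such label is listed. Every accepted spelling has to be listed explicitly, which keeps set B finite and the accepted part of set A documented.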

Opera currently uses the Unicode Charset Alias Matching rules, but they are not perfect and we want to move away from them again. I'll look into getting a definitive list of the encodings we support.

> The general approach should be: As lenient as the major browsers, but
> not more lenient. Lenience leads to a proliferation of garbage.

Agreed, unless a little more lenience would greatly increase simplicity at no other cost.

-- 
Anne van Kesteren
http://annevankesteren.nl/

Follow-Ups:
- Re: Registering a charset alias
  - From: Erik van der Poel <erikv@google.com>

References:
- Registering a charset alias
  - From: Anne van Kesteren <annevk@opera.com>
- Re: Registering a charset alias
  - From: Erik van der Poel <erikv@google.com>
- Re: Registering a charset alias
  - From: Anne van Kesteren <annevk@opera.com>
- Re: Registering a charset alias
  - From: Ira McDonald <blueroofmusic@gmail.com>
- Re: Registering a charset alias
  - From: Erik van der Poel <erikv@google.com>
- RE: Registering a charset alias
  - From: Shawn Steele <Shawn.Steele@microsoft.com>
- Re: Registering a charset alias
  - From: Erik van der Poel <erikv@google.com>
- RE: Registering a charset alias
  - From: Shawn Steele <Shawn.Steele@microsoft.com>
- Re: Registering a charset alias
  - From: Erik van der Poel <erikv@google.com>
- Re: Registering a charset alias
  - From: Markus Scherer <markus.icu@gmail.com>
- Re: Registering a charset alias
  - From: Erik van der Poel <erikv@google.com>
