
Re: Registering a charset alias

On 2009/08/15 8:17, Erik van der Poel wrote:
> No, I don't think we should recommend behavior that is more lenient
> than what the major browsers currently do. (I believe the major
> browsers don't strip "x-"?)

Very important indeed. There is a big difference between tolerating 
existing crap and generating, or encouraging the generation, of new crap.

Regards,    Martin.


> So I don't think the following spec from HTML 5, section 2.7 is very
> good either:
>
> "When comparing a string specifying a character encoding with the name
> or alias of a character encoding to determine if they are equal, user
> agents must use the Charset Alias Matching rules defined in Unicode
> Technical Standard #22. [UTS22]
>
> For instance, "GB_2312-80" and "g.b.2312(80)" are considered equivalent names."
>
> The general approach should be: As lenient as the major browsers, but
> not more lenient. Lenience leads to a proliferation of garbage.
>
> I chose the name x-x-big5 for an internal X-Windows-only encoding for
> Big5 that only had 2-byte characters, no ASCII characters. That name was
> intended to be used only internally, within Netscape, but our X
> resource file was written in plain ASCII, and Microsoft picked that
> up, assuming that x-x-big5 was ordinary big5. I shouldn't have exposed
> that name in a plain text file, nor should I have put that name in the
> same namespace as ordinary charsets.
>
> Erik
>
> On Fri, Aug 14, 2009 at 4:03 PM, Markus Scherer <markus.icu@gmail.com> wrote:
>> How about a general rule (maybe in HTML 5) that if x-abc is not recognized
>> then the implementation should strip the x- and try abc instead. Apparently,
>> this may need to be done multiple times, to deal with x-x-big5. (Whoever
>> came up with *that*?)
>>
>> markus
>>
>
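
For readers of the archive, here is a minimal Python sketch of the two
kinds of lenience discussed in the quoted text: the UTS #22 Charset Alias
Matching rules cited from HTML 5, and the proposed "strip x- and retry"
fallback. The function names and the KNOWN set are hypothetical and exist
only for illustration. The handling of consecutive zeros is one reading
of the UTS #22 rule, and real implementations may differ. This shows
what the rules would accept; it is not a recommendation to implement
them.

```python
import re

# Hypothetical set of recognized charset labels, for illustration only.
KNOWN = {"big5", "utf-8"}


def uts22_key(name: str) -> str:
    """Reduce a charset name to a comparison key per UTS #22 alias matching."""
    # Steps 1 and 2: drop everything except ASCII letters and digits,
    # then map uppercase to lowercase.
    stripped = re.sub(r"[^A-Za-z0-9]", "", name).lower()
    # Step 3: left to right, delete each 0 that is not preceded by a digit.
    # Assumption: "preceded" refers to the string as it stands after
    # earlier deletions.
    out = []
    for ch in stripped:
        if ch == "0" and not (out and out[-1].isdigit()):
            continue
        out.append(ch)
    return "".join(out)


def resolve_with_x_stripping(label: str, known=KNOWN):
    """Proposed fallback: strip a leading "x-" repeatedly until recognized."""
    candidate = label.lower()
    while candidate not in known and candidate.startswith("x-"):
        candidate = candidate[2:]
    return candidate if candidate in known else None


# The two names from the HTML 5 example reduce to the same key.
print(uts22_key("GB_2312-80"), uts22_key("g.b.2312(80)"))

# The fallback maps x-x-big5 to plain big5, which is exactly the
# confusion described above for the internal Netscape encoding.
print(resolve_with_x_stripping("x-x-big5"))
```

With the fallback in place, a label that was meant to name a different,
internal-only encoding is silently treated as ordinary big5, which is
the concern raised in the thread.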

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp