Re: ignore dashes etc. (was Registration of new charset GB18030 (fwd))
At 09:50 02/07/18 -0700, Markus Scherer wrote:
>I second the various proposals to make the IANA charset matching rules
>more lenient.
I did not propose to make the matching rules more lenient.
The only thing I suggested was that we check that we have no
conflicting registrations under such potential matching rules.
And I already somewhat regret it.
I don't think it's a good idea to change the matching rules.
We would slide further and further down a slippery slope of bugwards
compatibility (Gresham's Law), and I don't see why we need to
go there when specifications such as XML go very clearly in the
other direction, based on very bad experience with bugwards
compatibility.
Regards, Martin.
>To make a complete proposal:
>
>I propose that charset names should be recommended to be matched ignoring
>the following:
>- letter case differences (A=a, B=b, ... for A-Z and a-z)
>- dashes '-'
>- underscores '_'
>- spaces ' '
>
>For example, the following all match "gb18030":
> "GB 18030" "gB-18030" "Gb_18030" "_ -g b-1_8 0-3_0 -_"
>
>I can live without the spaces in this recommendation, although I think it
>could be useful and does no harm.
>Spaces are not allowed in IANA charset names, so they can only occur in
>user-supplied names.
>
>markus
>
>Lars Marius Garshol wrote:
>
>>* Martin Duerst
>>| It may be possible to add a rule to the IANA registry that there
>>| should be no registrations that only differ in hyphens or
>>| underscores.
>>I think that would be a good idea. ...
>
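
For readers following the thread, here is a minimal Python sketch of the two ideas discussed above: the lenient comparison Markus proposes (ignore ASCII letter case, dashes, underscores, and spaces) and the registry check Martin suggests (no two registrations should collide under such a rule). The function names and the sample registry entries are hypothetical and chosen only for illustration; nothing here is part of any IANA procedure.

```python
import string

# Fold A-Z to a-z and delete '-', '_' and ' ' in one pass.
# Only ASCII letters are folded, as in the proposal (A=a ... Z=z).
_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase, "-_ ")


def normalize_charset_name(name: str) -> str:
    """Return the comparison key for a charset name (hypothetical helper)."""
    return name.translate(_FOLD)


def charset_names_match(a: str, b: str) -> bool:
    """True if two names are equal under the proposed lenient rule."""
    return normalize_charset_name(a) == normalize_charset_name(b)


def find_conflicts(names):
    """Group names that become identical after normalization.

    Every input name is treated as a separate registration, so aliases
    of one charset must be filtered out by the caller beforehand.
    """
    groups = {}
    for name in names:
        groups.setdefault(normalize_charset_name(name), []).append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}


if __name__ == "__main__":
    for candidate in ["GB 18030", "gB-18030", "Gb_18030", "_ -g b-1_8 0-3_0 -_"]:
        print(repr(candidate), charset_names_match(candidate, "gb18030"))
    # Made-up registry entries, used only to exercise the conflict check.
    print(find_conflicts(["x-foo-1", "x_foo_1", "x-bar"]))
```

Whether such a normalization should be applied when matching names is exactly what is disputed in this thread; the conflict check can be run against a list of names without changing the matching rules at all.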