Re: Indicating charset variants (was: RE: windows 936)
Martin Duerst wrote:
> XML spoils things.
That's bad. There are quite a lot of registrations with names
of the form IBM00858 or IBM01140. I'd guess that nobody
will use these names with leading zeros, and that everybody
will stick to the whatever+euro alias, e.g. pc-multilingual-850+euro.
On the platforms where this charset is used, its local name
is "codepage 850", so the preferred MIME name should include
850, not the obscure 00858 or 858.
>| EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
Ugh, that's really bad.
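
A minimal sketch of why the quoted production is a problem for the
+euro aliases, assuming the regular expression below is a faithful
transcription of EncName. The helper name is_xml_encname is
hypothetical, and the "--" variant is only an illustration of the
separator idea, not a registered name:

```python
import re

# EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
# Transcribed from the production quoted above; '+' is not in the set.
ENC_NAME = re.compile(r"[A-Za-z][A-Za-z0-9._-]*")


def is_xml_encname(name: str) -> bool:
    """Return True if the whole string matches the EncName production."""
    return ENC_NAME.fullmatch(name) is not None


candidates = (
    "IBM00858",                   # registered name with leading zeros
    "pc-multilingual-850+euro",   # existing alias using '+'
    "pc-multilingual-850--euro",  # hypothetical alias using '--' instead
)

for candidate in candidates:
    print(f"{candidate}: {is_xml_encname(candidate)}")
```

Only the '+' alias is rejected, so it cannot appear in an XML
encoding declaration as the production stands.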
> Now there would be three ways ahead:
> - Ignore XML. I don't think we want to go there.
True.
> - Try to change XML. A few years ago, that would have been
> easy with an erratum, but I don't think this will be met
> with cheers these days.
Interoperability is more important. They have already fixed
xml:lang to allow empty values, they should fix SystemLiteral
to be a valid URI, and while they're at it, maybe updating
EncName would be a better option than...
> - Choose a separator different from '+'. After quite a bit of
> thinking, I have reached the conclusion that the obvious
> thing to do would be to use something like '--'.
...registering new aliases as preferred MIME names for the
various existing whatever+euro entries.
> What does everybody think?
Either fix XML or the registry for cases like ...850+euro.
Frank