Re: Are charset names supposed to be case sensitive?
"Martin J. Dürst", Sat, 17 Dec 2011 22:09:31 +0900:
> On 2011/12/17 17:01, Leif Halvard Silli wrote:
>> Magic vs semantics: I don't attach magic to the casing. But I do
>> recognize that there nevertheless is semantics attached to the casing -
>> by others than myself.
...
> If not,
We have an 'if not' situation.
> I would strongly suggest not to use case differences to refer
> to different usages of the same label, because this may cause a lot
> of confusion. (It already had Shawn confused, and me, too.)
Looking at my registration letter for 'unicode', I think it isn't the
casing itself, but the language I use about the casing, that is possibly
confusing:
''' NB! Alias: At the time of this registration, the spec upon which
the registration of the 'unicode' and the 'unicodeFFFE' charset is
based, defines 'utf-16' (lowercase) as alias for 'unicode'.[2] '''
If I remove the '(lowercase)', then the above should be clear enough,
no? Also: In the same letter I say that 'utf-16' lowercase *cannot* be
registered as alias for 'unicode' due to the fact that 'UTF-16'
uppercase is already registered as a charset name. So there is material
enough to at least avoid jumping to conclusions ...
>> Thus I've tried to be consistent with the casing
>> found in the IANA registry (UPPERCASE) and in the Microsoft listing
>> (lowercase and mixedCASE).
>
> The fact that the IANA registry lists the charset labels with
> uppercase characters isn't more than a random convention, and this
> may also be so for the Microsoft listings.
The registry says: ''However, no distinction is made between use of
upper and lower case letters.'' I read this to mean that products are
not supposed to make distinctions based on the casing. Hence I assumed
this mailing list would read *me* the same way ... Am I wrong w.r.t
the registry? Does the casing in the registry matter, except for
registry conventions or sorts?
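
To illustrate the reading above, here is a minimal Python sketch of label comparison that makes no distinction between upper and lower case. The names normalize_label and same_label are hypothetical, chosen only for this example; they come neither from the registry nor from any product, and folding only ASCII letters is an assumption based on charset labels being ASCII.

```python
import string

# Assumption: charset labels are ASCII, so only A-Z needs folding.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def normalize_label(label: str) -> str:
    """Fold ASCII uppercase letters to lowercase; leave everything else alone."""
    return label.translate(_ASCII_LOWER)

def same_label(a: str, b: str) -> bool:
    """Return True when two labels differ at most in ASCII letter case."""
    return normalize_label(a) == normalize_label(b)

print(same_label("UTF-16", "utf-16"))            # True
print(same_label("unicodeFFFE", "UNICODEFFFE"))  # True
print(same_label("unicode", "utf-16"))           # False
```

Under such a comparison 'UTF-16' and 'utf-16' are one label, while 'unicode' and 'utf-16' stay distinct unless something else declares one an alias of the other.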
> Please use a single case
> version unless case is really significant in the sense that one and
> the same product, in one and the same protocol slot, reacts different
> to different case forms.
When I quote Microsoft or the IANA registry, I must of course use the
casing used in those documents. But otherwise, OK: until I discover a
case where the casing matters, I will try to use a single casing.
>> This consistency have the following benefits:
>>
>> * It makes it easier to separate the semantics of the utf-16 alias
>> in the Microsoft listing from the semantics of the UTF-16 name in
>> the IANA registry.
>
> For all intents and purposes, these are one and the same charset
> label. If you want to distinguish them, please do so with additional
> words, not with case.
I did that: I said 'utf-16 *alias*' versus 'UTF-16 *name*' ... But the
casing apparently nullified the effect ...
>> * It separates 'unicode' from the trademarked/registered 'UNICODE'.
>> * The casing 'unicodeFFFE' is more readable than 'unicodefffe' or
>> 'UNICODEFFFE'.
>
> For these two, you have only used a single casing, so there's no
> confusion. So these should be fine.
Good.
--
Leif H Silli