
Re: Are charset names supposed to be case sensitive?



"Martin J. Dürst", Sat, 17 Dec 2011 22:09:31 +0900:
> On 2011/12/17 17:01, Leif Halvard Silli wrote:

>> Magic vs semantics: I don't attach magic to the casing. But I do
>> recognize that there nevertheless is semantics attached to the casing -
>> by others than myself.
  ... 
> If not, 

We have an 'if not' situation.

> I would strongly suggest not to use case differences to refer 
> to different usages of the same label, because this may cause a lot 
> of confusion. (It already had Shawn confused, and me, too.)

Looking at my registration letter for 'unicode', I think it isn't the 
casing itself, but the language I use about the casing, that is possibly 
confusing:

''' NB! Alias: At the time of this registration, the spec upon which 
  the registration of the 'unicode' and the 'unicodeFFFE' charset is
  based, defines 'utf-16' (lowercase) as alias for 'unicode'.[2] '''

If I remove the '(lowercase)', then the above should be clear enough, 
no? Also: In the same letter I say that 'utf-16' lowercase *cannot* be 
registered as alias for 'unicode' due to the fact that 'UTF-16' 
uppercase is already registered as a charset name. So there is material 
enough to at least avoid jumping to conclusions ...

>> Thus I've tried to be consistent with the casing
>> found in the IANA registry (UPPERCASE) and in the Microsoft listing
>> (lowercase and mixedCASE).
> 
> The fact that the IANA registry lists the charset labels with 
> uppercase characters isn't more than a random convention, and this 
> may also be so for the Microsoft listings.

The registry says: ''However, no distinction is made between use of 
upper and lower case letters.'' I read this to mean that products are 
not supposed to make distinctions based on the casing. Hence I assumed 
this mailing list would read *me* the same way ...  Am I wrong w.r.t 
the registry? Does the casing in the registry matter, except for 
registry conventions or sorts? 
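
To illustrate how I read that sentence, here is a minimal sketch in 
Python of a comparison that ignores ASCII case. The function names 
(normalize_label, same_label) are hypothetical and only for 
illustration; they are not taken from the registry or from any product:

```python
import string

# Translation table that folds ASCII uppercase letters to lowercase.
# Only A-Z are touched, since charset labels are ASCII strings.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def normalize_label(label: str) -> str:
    """Return the label with surrounding whitespace removed and ASCII case folded."""
    return label.strip().translate(_ASCII_LOWER)


def same_label(a: str, b: str) -> bool:
    """True if two charset labels differ at most in ASCII letter case."""
    return normalize_label(a) == normalize_label(b)


if __name__ == "__main__":
    print(same_label("UTF-16", "utf-16"))            # differ only in case
    print(same_label("unicodeFFFE", "UNICODEFFFE"))  # differ only in case
    print(same_label("unicode", "utf-16"))           # different labels
```

Under this reading, 'UTF-16' and 'utf-16' are one and the same label, 
while 'unicode' and 'utf-16' remain distinct labels unless some alias 
table says otherwise.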

> Please use a single case 
> version unless case is really significant in the sense that one and 
> the same product, in one and the same protocol slot, reacts different 
> to different case forms.

When I quote Microsoft or the IANA registry, I must of course use the 
casing used in those documents. But OK, otherwise: until I discover a 
case where the casing matters, I will try to use a single casing.

>> This consistency have the following benefits:
>> 
>> * It makes it easier to separate the semantics of the utf-16 alias
>>    in the Microsoft listing from the semantics of the UTF-16 name in
>>    the IANA registry.
> 
> For all intents and purposes, these are one and the same charset 
> label. If you want to distinguish them, please do so with additional 
> words, not with case.

I did that: I said 'utf-16 *alias*' versus 'UTF-16 *name*' ...  But the 
casing apparently nullified the effect ...

>> * It separates 'unicode' from the trademarked/registered 'UNICODE'.
>> * The casing 'unicodeFFFE' is more readable than 'unicodefffe' or
>>    'UNICODEFFFE'.
> 
> For these two, you have only used a single casing, so there's no 
> confusion. So these should be fine.

Good.
-- 
Leif H Silli