
Re: ignore dashes etc. (was Registration of new charset GB18030 (fwd))



You are quite right. I was inferring from Martin's message, and should have looked at the source documents:

http://www.iana.org/assignments/character-sets says quite clearly:

The character set names may be up to 40 characters taken from the
printable characters of US-ASCII. However, no distinction is made
between use of upper and lower case letters.

http://www.ietf.org/rfc/rfc2978.txt also mentions it briefly (and less than clearly):

...A combined ABNF
definition for such names is as follows:

    mime-charset       = 1*mime-charset-chars
    mime-charset-chars = ALPHA / DIGIT /
                         "!" / "#" / "$" / "%" / "&" /
                         "'" / "+" / "-" / "^" / "_" /
                         "`" / "{" / "}" / "~"
    ALPHA              = "A".."Z"    ; Case insensitive ASCII Letter
    DIGIT              = "0".."9"    ; Numeric digit
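
As a rough illustration of the grammar quoted above, here is a minimal Python sketch that checks a name against the mime-charset rule and the 40-character limit from the IANA page. The function and constant names are made up for this example and do not come from either document.

```python
import re

# Characters allowed by the mime-charset rule quoted above (RFC 2978).
# ALPHA is case-insensitive, so both A-Z and a-z are accepted.
_MIME_CHARSET_RE = re.compile(r"[A-Za-z0-9!#$%&'+\-^_`{}~]+")

# Length limit stated on the IANA character-sets page.
MAX_NAME_LENGTH = 40


def is_valid_mime_charset(name: str) -> bool:
    """Return True if name fits the quoted ABNF and the 40-character limit."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    # fullmatch with "+" enforces the 1* (one or more) repetition.
    return _MIME_CHARSET_RE.fullmatch(name) is not None
```

For instance, is_valid_mime_charset("GB18030") is accepted, while a name containing a space or "@" is rejected.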


And case-insensitivity is a good thing; also good would be hyphen and underscore insensitivity.
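
To make the suggestion concrete, here is a small Python sketch of matching that ignores case, hyphens and underscores. This is only the loose matching proposed in this message, not something the registry or RFC 2978 specifies, and the helper names are hypothetical.

```python
def loose_key(name: str) -> str:
    """Fold a charset name to a comparison key.

    Lowercases the name and drops "-" and "_". Charset names are
    printable US-ASCII, so str.lower() is sufficient here.
    """
    return name.lower().replace("-", "").replace("_", "")


def same_charset(a: str, b: str) -> bool:
    """Compare two charset names, ignoring case, hyphens and underscores."""
    return loose_key(a) == loose_key(b)
```

With this, "ISO-8859-1", "iso_8859_1" and "Iso8859-1" all compare equal. One caveat: two distinct registered names could in principle fold to the same key, so a real implementation would want to check the registry for such collisions first.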

Mark
___
mark.davis@us.ibm.com
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

From: ned.freed@mrochek.com
Date: 2002.07.20 22:47
To: Martin Duerst <duerst@w3.org>
cc: Mark Davis/Cupertino/IBM@IBMUS, charsets <ietf-charsets@iana.org>, Markus Scherer <markus.scherer@jtcsv.com>
Subject: Re: ignore dashes etc. (was Registration of new charset GB18030 (fwd))


> At 20:41 02/07/18 -0700, Mark Davis wrote:

> >And what harm does it do, to make the name matching case-insensitive --
> >especially since a great many implementations do that anyway?

> Case-insensitive matching doesn't harm, as 'charset' matching was
> always case sensitive in the specs and in all implementations.

I don't know where you got this idea, but it simply isn't true. RFC 2046
section 4.1.2 is quite clear on the matter:

Unlike some other parameter values, the values of the charset parameter are NOT
case sensitive.

I also can assure you that various cases of US-ASCII, Iso-8859-1, and
numerous other charsets are routinely used in practice.

Now, it is true that RFC 2278 doesn't come out and say that all charset values
are case-insensitive. And this should probably be clarified. But it is a heck
of a stretch to infer that they are case sensitive given that the subset
intended for use in MIME most definitely is not. (This last point is actually
reiterated in the ABNF in RFC 2978 section 2.3.)

Ned

