
Re: Are charset names supposed to be case sensitive?



I guess I would like to see some sort of table breaking down the various 
flavors of UTF-16 and/or UCS-2 that would need to be tagged separately:

* big-endian or little-endian by default
* accepts BOM
* requires BOM
* supports all 17 planes or just BMP
* etc.

That way I would have a clearer sense of what can be currently tagged, 
what cannot be tagged and needs to be, and what is just an application 
quirk or bug.
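
As a rough illustration of how those distinctions show up in practice, 
here is a minimal sketch using Python's built-in "utf-16", "utf-16-be" 
and "utf-16-le" codecs.  The helper names bom_kind and is_bmp_only are 
hypothetical, made up for this example, and the sketch only shows how 
one implementation behaves; it is not a statement of what any registry 
entry requires.

```python
import codecs

def bom_kind(data: bytes) -> str:
    """Report which UTF-16 byte order mark, if any, starts the data."""
    if data.startswith(codecs.BOM_UTF16_BE):   # bytes FE FF
        return "BE"
    if data.startswith(codecs.BOM_UTF16_LE):   # bytes FF FE
        return "LE"
    return "none"

def is_bmp_only(text: str) -> bool:
    """True if every code point fits in the BMP (U+0000..U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)

# One BMP character plus one supplementary-plane character (U+1F600).
sample = "A\U0001F600"

# Explicit byte order: the encoder writes no BOM.
be = sample.encode("utf-16-be")
le = sample.encode("utf-16-le")

# Generic label: the encoder writes a BOM in the platform's byte order.
tagged = sample.encode("utf-16")

# Decoding with the generic label consumes a leading BOM and obeys it.
from_bom = (codecs.BOM_UTF16_BE + be).decode("utf-16")

# Decoding with an explicit byte order keeps the BOM as U+FEFF.
kept_bom = (codecs.BOM_UTF16_BE + be).decode("utf-16-be")

print(bom_kind(be), bom_kind(le), bom_kind(tagged))
print(from_bom == sample, kept_bom == "\ufeff" + sample)
print(is_bmp_only("A"), is_bmp_only(sample))
```

The "BMP only" question is separate from byte order: a decoder that 
rejects or mangles the surrogate pair for U+1F600 would differ from the 
above only in the last check.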

It seems Leif might be trying to tag the incomplete or erroneous 
behavior of individual applications, even where it doesn't correspond to 
documented behavior, or to tag mis-documented behavior that may not 
actually be implemented (like "unicode" meaning "BMP only").  I'm not 
sure that's a goal of registering charsets.  It also seemed to me (though 
I assume I'm wrong here) that he was trying to call particular attention 
to errors in Microsoft implementations, but I'm sure Shawn and others 
can speak to that.

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell