
Re: Are charset names supposed to be case sensitive?



* Doug Ewell wrote:
>I guess I would like to see some sort of table breaking down the various 
>flavors of UTF-16 and/or UCS-2 that would need to be tagged separately:
>
>* big-endian or little-endian by default
>* accepts BOM
>* requires BOM
>* supports all 17 planes or just BMP
>* etc.

I think it would be helpful to start by separating what the encodings
are from what the particular behavior of "HTML implementations" is. The
registry is not really meant to cover the encoding detection rules for
"HTML when served over HTTP", with handling of <meta> elements and such;
it is more for "you have a label and you have bytes, this is how you get
characters", where the definition of the label, and not the data format,
tells you how you get the characters.
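
To illustrate the "label plus bytes" model, here is a rough sketch using
Python's standard codecs module. It is only one implementation's reading
of the labels, not a statement of what the registry requires, and the
helper name decode_by_label is made up for this example. It shows the
same bytes giving different characters depending on the label alone.

```python
import codecs

def decode_by_label(label: str, data: bytes) -> str:
    """Decode bytes using only the label; no sniffing of the data format.

    decode_by_label is a hypothetical helper for illustration. Python's
    codecs.lookup() matches labels case-insensitively and resolves aliases.
    """
    info = codecs.lookup(label)
    return data.decode(info.name)

# "hi" in little-endian UTF-16, preceded by a byte order mark (FF FE).
data = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")

# In Python, the "UTF-16" codec consumes the BOM and uses it to pick
# the byte order.
as_utf16 = decode_by_label("UTF-16", data)

# In Python, the "UTF-16LE" codec has a fixed byte order, so the leading
# FF FE is decoded as the character U+FEFF rather than being stripped.
as_utf16le = decode_by_label("utf-16le", data)

print(repr(as_utf16))    # 'hi'
print(repr(as_utf16le))  # '\ufeffhi'
```

Whether a given label accepts, requires, or ignores a BOM is exactly the
kind of thing the label's definition would have to pin down.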
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/