
Re: Charset reviewer appointed



At 08:51 25.07.98 +0900, Martin J. Duerst wrote:
>However, please note that XML already decided to make
>the BOM mandatory for UTF-16. I told them that that was
>not something they should define, but they didn't listen.
>
>There would be a "way out" by saying that in that case,
>the BOM is part of an "intermediate layer" (no, it's
>of course not part of XML, because it's not present
>in UTF-8 or other encodings), and not part of UTF-16
>as defined above. But such a "way out" is really clumsy.

The BOM is part of the charset that UTF-16 represents.
Any application can say anything it wants *further restricting*
what characters can appear where; what we couldn't tolerate
would be XML insisting upon strings that are *illegal* in the registered
UTF-16 while still calling the charset "UTF-16".

Ken Whistler wrote:

>With regards to Harald Alvestrand's summary of the open
>issues with respect to the UTF-16 registration, the only
>way I see forward, given the nature of the "charset"
>definition, is to split this request into two registrations:
>
>UTF-16   big-endian UTF-16
>UTF-16BS little-endian (byte-swapped) UTF-16

I see this as a reasonable thing to do.
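
To make the byte-order distinction concrete, here is a minimal Python sketch of how the same character serializes under each byte order and how a leading BOM can tell the two apart. The helper name `sniff_byte_order` is hypothetical, and the codec labels "utf-16-be" and "utf-16-le" are Python's own names, not the registration names proposed in this thread.

```python
import codecs

# The same character, serialized in each byte order (no BOM added).
text = "A"  # U+0041
big_endian = text.encode("utf-16-be")     # high-order byte first
little_endian = text.encode("utf-16-le")  # low-order byte first (byte-swapped)


def sniff_byte_order(data: bytes) -> str:
    """Hypothetical helper: report byte order from a leading BOM, if any.

    Returns "big", "little", or "unknown" when no BOM is present.
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little"
    return "unknown"
```

Without a BOM the helper cannot decide, which is the case the two separate registrations would settle by label alone.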

                          Harald

-- 
Harald Tveit Alvestrand, Maxware, Norway
Harald.Alvestrand@maxware.no