Re: Revised proposal for UTF-16



Dan Kegel wrote:

> At 03:29 PM 5/18/98 +0900, MURATA Makoto wrote:
> >UTF-16 should be sent in network byte order (big-endian).  However,
> >recipients should be able to handle both big-endian and little-endian.
>
> I think it might be good to add the lines:
>
> If UTF-16 is sent in little-endian byte order, it MUST be prefixed with
> a BOM to allow recipients to determine the byte order.
> UTF-16 sent in network byte order MAY be prefixed with a BOM.

Just a couple of minor points:

We should *prohibit* sending little-endian data when the charset label says
"utf-16". The little-endian folks are welcome to register their own charset
name if they wish to do so.

I don't have a copy of ISO 10646, but if I'm not mistaken, the BOM has a
different official name, something like "zero width no-break space".
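
For anyone who wants to check the name without the standard at hand, here is
a small sketch in Python. It assumes the BOM is code point U+FEFF and relies on
the Unicode character database that ships with Python; the variable names are
illustrative only.

```python
import unicodedata

# Assumption: the byte order mark is the code point U+FEFF.
BOM = "\ufeff"

# Look up the character's formal name in Python's Unicode database.
bom_name = unicodedata.name(BOM)
print(bom_name)

# The same code point serialized in each byte order.
bom_big_endian = BOM.encode("utf-16-be")      # bytes FE FF
bom_little_endian = BOM.encode("utf-16-le")   # bytes FF FE
```

The two byte sequences are mirror images of each other, which is what lets a
recipient use this character to infer the byte order.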

I agree with Dan that the BOM should not be mandatory for big-endian. We
should probably use the usual IETF requirement keywords ("MAY", "SHOULD",
"MUST") as defined in RFC 2119.
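
To make the recipient side concrete, here is a minimal sketch in Python of a
lenient decoder: it honours a leading BOM in either byte order and otherwise
assumes network byte order. The function name decode_utf16 is hypothetical,
and this is only one possible reading of the rules under discussion, not
proposed text.

```python
import codecs

def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 bytes, using a leading BOM if present (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF: big-endian
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE: little-endian
        return data[2:].decode("utf-16-le")
    # No BOM: assume network byte order (big-endian).
    return data.decode("utf-16-be")
```

A stricter recipient that followed the "prohibit" suggestion would reject the
FF FE case instead of decoding it.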

Erik