Re: Fwd: I-D ACTION:draft-hoffman-utf16-03.txt
At 16:47 99/05/06 +0900, MURATA Makoto wrote:
> >4.3 Interpreting text labelled as UTF-16
> >
> >Text labelled with the "UTF-16" charset might be serialized in either
> >big-endian or little-endian order. If the first two octets of the text
> >is 0xFE followed by 0xFF, then the text can be interpreted as being
> >big-endian. If the first two octets of the text is 0xFF followed by
> >0xFE, then the text can be interpreted as being little-endian. ...
>
> I think that leading 0xFE 0xFF or 0xFF 0xFE in this case (charset = "utf-16") is
> always a byte order mark and is not a zero-width non-break space. I would like
> to make this explicit, since "the character 0xFEFF in the first
> position of a stream MAY be interpreted as a zero-width non-breaking
> space, and is not always a byte-order mark." (in 3.2).
I think it would be nice if we could make it that way, but I'm not
at all sure that we can. We can't just change definitions that
were around previously.
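
For reference, the byte-order rule quoted from section 4.3 can be sketched
roughly as follows. This is only an illustration, not text from the draft:
the function name detect_utf16_byte_order and the strip_bom flag are
hypothetical. The flag stands for the open question above: whether a leading
0xFEFF is always consumed as a byte order mark, or kept because it may be a
zero-width non-breaking space. What to do when neither signature is present
is not covered by the quoted text, so the sketch just reports None.

```python
def detect_utf16_byte_order(data: bytes, strip_bom: bool = True):
    """Guess the byte order of text labelled "UTF-16" from its first two octets.

    Returns (order, payload), where order is "big", "little", or None.
    strip_bom is a hypothetical switch: True treats a leading 0xFEFF as a
    byte order mark and drops it; False leaves it in the payload, where it
    would decode as U+FEFF (zero-width non-breaking space).
    """
    if data[:2] == b"\xfe\xff":
        order = "big"
    elif data[:2] == b"\xff\xfe":
        order = "little"
    else:
        # No signature: the quoted text does not say what to assume here.
        return None, data
    return order, (data[2:] if strip_bom else data)


if __name__ == "__main__":
    order, payload = detect_utf16_byte_order(b"\xfe\xff\x00A")
    print(order, payload.decode("utf-16-be"))
```

With strip_bom=False the same input keeps its first two octets, so the
decoded text starts with U+FEFF.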
Regards, Martin.
#-#-# Martin J. Du"rst, World Wide Web Consortium
#-#-# mailto:duerst@w3.org http://www.w3.org