Re: UTF-16
MURATA Makoto wrote:
>
> UTF-16 generators MUST send in big-endian byte order and must
> begin with the zero width non breaking space (also called Byte
> Order Mark or BOM) (0xFEFF).
The second "must" is in lower case. Should it be upper case ("MUST")?
> Thus, an UTF-16 parser encountering the code 0xFFFE as the
Typo: "an UTF-16" should be "a UTF-16".
> If the BOM
> is absent, there is no way to 100% reliably detect little-endian
> data that does not use the BOM.
Redundant: "If the BOM is absent" and "data that does not use the BOM" say the same thing twice.
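To make the detection point concrete, here is a minimal Python sketch of a BOM check. It is only an illustration, not text from the draft; the function name detect_utf16_byte_order is hypothetical, and the sketch assumes the parser looks at nothing but the first two bytes.

```python
import codecs


def detect_utf16_byte_order(data: bytes) -> str:
    """Guess the UTF-16 byte order from a leading BOM only.

    Returns "big", "little", or "unknown". Hypothetical helper for illustration.
    """
    if data.startswith(codecs.BOM_UTF16_BE):
        # Bytes FE FF: the BOM (0xFEFF) serialized in big-endian order.
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):
        # Bytes FF FE: a big-endian reader would see the code 0xFFFE here,
        # which signals that the data is byte-swapped (little-endian).
        return "little"
    # No BOM: the byte order cannot be determined reliably from these bytes.
    return "unknown"
```

For example, detect_utf16_byte_order(b"\xff\xfeA\x00") returns "little", while the same text without a BOM (b"A\x00") returns "unknown", which is the case the quoted paragraph warns about.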
> The Coded Character Set that UTF-16 refers to is the same version
> of ISO/IEC 10646-1 and Unicode that the charset "UTF-8" refers to.
We need a reference to RFC 2279 (UTF-8) at the end of the document.
Erik