Re: UTF-16



MURATA Makoto wrote:
> 
> UTF-16 generators MUST send in big-endian byte order and must
> begin with the zero width non breaking space (also called Byte
> Order Mark or BOM) (0xFEFF).

The second "must" is in lower case. Should it be upper case ("MUST")?

> Thus, an UTF-16 parser encountering the code 0xFFFE as the

an UTF-16 -> a UTF-16

> If the BOM
> is absent, there is no way to 100% reliably detect little-endian
> data that does not use the BOM.

Redundant: "the BOM is absent" and "data that does not use the BOM" say
the same thing twice (2x).
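
For what it's worth, the detection being described in the quoted text
might be sketched roughly as follows. This is only an illustration in
Python, not text proposed for the draft; the function name
detect_utf16_byte_order and the "unknown" return value are made up for
the example, and what a parser should do when no BOM is found is left
to the caller here.

```python
import codecs

def detect_utf16_byte_order(data: bytes) -> str:
    """Classify the byte order of UTF-16 data from its first two bytes.

    Hypothetical helper for illustration only. Returns "big",
    "little", or "unknown" (no BOM present, so the byte order
    cannot be determined from the BOM alone).
    """
    prefix = data[:2]
    if prefix == codecs.BOM_UTF16_BE:   # bytes FE FF: 0xFEFF read big-endian
        return "big"
    if prefix == codecs.BOM_UTF16_LE:   # bytes FF FE: a big-endian reader sees 0xFFFE
        return "little"
    return "unknown"
```

A reader that gets "little" back has, in the draft's terms, encountered
0xFFFE as the first code and knows the data is byte-swapped.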

> The Coded Character Set that UTF-16 refers to is the same version
> of ISO/IEC 10646-1 and Unicode that the charset "UTF-8" refers to.

We need a reference to the UTF-8 RFC (RFC 2279) at the end of the document.

Erik