UTF-16





If I am not mistaken, we have reached a consensus on the 
charset "utf-16".  What will happen next?

-----------------------------------------------------------
We propose to register UTF-16 as a charset with IANA.

UTF-16 generators MUST send data in big-endian byte order, and the
data MUST begin with the zero width no-break space (also called the
Byte Order Mark or BOM) (0xFEFF).
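
For illustration only (not part of the registration text), a
conforming generator might be sketched as follows in Python. The
function name encode_utf16_be_with_bom is hypothetical; the
"utf-16-be" codec and codecs.BOM_UTF16_BE come from the Python
standard library.

```python
import codecs


def encode_utf16_be_with_bom(text: str) -> bytes:
    """Encode text as big-endian UTF-16, preceded by the BOM (0xFEFF).

    Hypothetical helper for illustration. The "utf-16-be" codec does not
    emit a BOM on its own, so the two BOM octets FE FF are prepended.
    """
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


data = encode_utf16_be_with_bom("Hi")
print(data.hex())  # feff00480069
```

The first two octets are FE FF, followed by each 16-bit code unit
with its high-order octet first.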

NOTE: Some implementations that do not conform to this
specification have occasionally sent data in little-endian byte
order. When they do this, they commonly precede the data with the
BOM.  Thus, a UTF-16 parser encountering the code 0xFFFE as the
first character of a purported UTF-16 stream may safely assume
that it has encountered a nonconformant data source.  If the BOM
is absent, there is no fully reliable way to detect little-endian
data.
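
As a rough sketch of the check described in this note (an
illustration, not part of the registration text), a parser could
inspect the first two octets of the stream. The function name
sniff_utf16_byte_order and its return strings are hypothetical; the
BOM constants are from the Python standard library.

```python
import codecs


def sniff_utf16_byte_order(data: bytes) -> str:
    """Classify a purported UTF-16 stream by its first two octets.

    Hypothetical helper for illustration only.
    """
    prefix = data[:2]
    if prefix == codecs.BOM_UTF16_BE:
        # Octets FE FF: the stream starts with 0xFEFF in big-endian order.
        return "big-endian"
    if prefix == codecs.BOM_UTF16_LE:
        # Octets FF FE: read as big-endian this is 0xFFFE, so the source
        # is a nonconformant little-endian sender.
        return "little-endian (nonconformant)"
    # No BOM: the byte order cannot be determined reliably.
    return "unknown"


print(sniff_utf16_byte_order(b"\xfe\xff\x00H"))  # big-endian
print(sniff_utf16_byte_order(b"\xff\xfeH\x00"))  # little-endian (nonconformant)
print(sniff_utf16_byte_order(b"\x00H\x00i"))     # unknown
```

The "unknown" case is deliberate: without the BOM the sketch makes
no guess, in line with the last sentence of the note.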

This character set is not permitted for use with MIME text/* media
types.  However, the MIME-like mechanism of HTTP may use this
character set for text/*, since this mechanism is exempt from the
restrictions on the text top-level type (see section 19.4.1 of
HTTP 1.1 [RFC-2068]).

   [RFC-2068] R. Fielding, J. Gettys, J. Mogul, H. Frystyk,
   T. Berners-Lee. "Hypertext Transfer Protocol -- HTTP/1.1" 
   UC Irvine, DEC, MIT/LCS. RFC 2068. January, 1997.

Charset name(s): UTF-16

Published specification(s): 

UTF-16 as a Character Encoding Scheme is defined in Appendix C.3
of [UNICODE] and Amendment 1 of [ISO-10646].

The Coded Character Set that UTF-16 refers to is the same version
of ISO/IEC 10646-1 and Unicode that the charset "UTF-8" refers to.

  [ISO-10646] ISO/IEC, Information Technology - Universal
  Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture
  and Basic Multilingual Plane, May 1993.

  [UNICODE] The Unicode Consortium, "The Unicode Standard -- Version 2.0", 
  Addison-Wesley, 1996.


Person & email address to contact for further information:

Tatsuo L. Kobayashi
Digital Culture Research Center, JUSTSYSTEM Corp.
Email: Tatsuo_Kobayashi@justsystem.co.jp

Murata Makoto (Family Given)
Fuji Xerox Information Systems,
KSP 9A7, 2-1 Sakado 3-chome,
Takatsu-ku, Kawasaki-shi,
213 Japan
Email: murata@fxis.fujixerox.co.jp 



Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp