
Re: ISO-10646-UCS-x aliases



Francois Yergeau wrote:
> Respecting UTF-16 vs ISO-10646-UCS-2 however, there is a real difference,
> the latter being restricted to U+FFFF.

Yes, there is a real difference. However, more often than not, "UCS-2" just means "byte 
serialization of the internal 16-bit Unicode/ISO 10646 form", and once the generating software is 
upgraded to handle surrogate pairs, the text really is UTF-16. Also, most receiving software will 
deserialize a "UCS-2" byte stream into 16-bit Unicode and, if it handles surrogate pairs, 
interpret it as UTF-16 anyway. In other words, in practice, the difference between UCS-2 and UTF-16 
lies in how the text is processed, not in how it is encoded or converted.
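
To illustrate the point, here is a minimal Python sketch, not taken from any particular implementation. The function names (deserialize_16bit, as_ucs2, as_utf16) are made up for this example, and big-endian byte order is assumed by default. The deserialization step is the same for both labels; only the later interpretation of the 16-bit units differs.

```python
import struct


def deserialize_16bit(data, big_endian=True):
    """Turn a byte stream into 16-bit code units.

    This step is identical whether the stream is labeled UCS-2 or UTF-16.
    """
    if len(data) % 2:
        raise ValueError("byte stream has an odd length")
    fmt = (">" if big_endian else "<") + "H" * (len(data) // 2)
    return list(struct.unpack(fmt, data))


def as_ucs2(units):
    """UCS-2 view: every 16-bit unit is treated as one character."""
    return list(units)


def as_utf16(units):
    """UTF-16 view: a high surrogate followed by a low surrogate is one code point."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_pair = (
            0xD800 <= u <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP character, or an unpaired surrogate passed through unchanged
            out.append(u)
            i += 1
    return out


# Example input: "A" followed by U+10400, which lies outside the BMP.
data = "A\U00010400".encode("utf-16-be")
units = deserialize_16bit(data)
ucs2_view = as_ucs2(units)
utf16_view = as_utf16(units)
```

Here the bytes and the 16-bit units are the same under either label. The UCS-2 view sees three 16-bit values, while the UTF-16 view sees two code points, so the difference only shows up at the processing stage.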

markus

-- 
Opinions expressed here may not reflect my company's positions unless otherwise noted.