Re: internationalization/ISO10646 question - UTF-16



On Thu, Dec 19, 2002 at 02:03:12PM -0800, Markus Scherer wrote:
> 
> Remember that UTF-8 was designed to shoehorn Unicode/UCS into Unix file 
> systems, nothing more. Where ASCII byte-stream compatibility is not an 
> issue, there are Unicode charsets that are more efficient than UTF-8, 
> different ones for different uses.

Well, it is true that UTF-FSS, the previous name for UTF-8, was designed
for UNIX file systems (FSS stands for File System Safe). But when SC2/WG2
renamed it to UTF-8, it also replaced the UTF-1 encoding, which was
intended for network use. So the designers of ISO 10646 purposely
meant UTF-8 for network interchange.
Furthermore, the IETF/IESG has stated the policy that UTF-8 is the preferred
encoding for all Internet protocols: all existing protocols need
to support it, and new protocols should use only UTF-8.
So nowadays UTF-8 is for much more than just Unix file systems.
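
To illustrate what "file system safe" buys you, here is a minimal sketch,
assuming Python 3. The helper name reserved_bytes and the sample string are
made up for illustration. It checks whether encoded text contains the two
byte values a Unix path component cannot hold, NUL (0x00) and '/' (0x2F).

```python
def reserved_bytes(text, encoding):
    """Return the set of NUL (0x00) and '/' (0x2F) byte values in the encoded text."""
    data = text.encode(encoding)
    return {b for b in data if b in (0x00, 0x2F)}

# Hypothetical sample: U+012F and U+4E2F, two non-ASCII characters whose
# UTF-16 code units happen to end in the byte 0x2F.
sample = "\u012f\u4e2f"

utf8_hits = reserved_bytes(sample, "utf-8")       # UTF-8 uses only bytes >= 0x80 here
utf16_hits = reserved_bytes(sample, "utf-16-be")  # UTF-16 exposes a raw 0x2F byte
ascii_in_utf16 = reserved_bytes("A", "utf-16-be") # ASCII text picks up 0x00 bytes

# ASCII text is byte-for-byte unchanged in UTF-8.
ascii_unchanged = "abc".encode("utf-8") == b"abc"
```

In UTF-8 every byte of a non-ASCII character is 0x80 or above, so such text
can pass through byte-oriented code (path parsing, C strings, line-based
protocols) untouched. UTF-16 gives no such guarantee.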

One wonders why W3C made UTF-16 the encoding of choice for XML.

Kind regards
Keld