
Re: RFC 2279 (UTF-8) to Full Standard




>
>And as you can see by my just cited quotation from 10646 itself, such
>argumentation was always a kind of shell game by detractors of UTF-16
>and Unicode. The people making such arguments were not plugged in to
>the process in ISO and were apparently unaware that WG2 itself was
>keenly aware of the interoperability problems and eager to ensure that
>all UTF's for 10646 were *equally* applicable to all characters encoded
>in the standard.
>
>And the repeated concerns about the "eventual allocation" of characters
>in the 32-bit codespace that UTF-16 could not handle have reached
>the status of urban legends -- endlessly repeated among those in the
>Linux community who use repetition to define accuracy, without bothering
>to check with the source.

I am sure UTF-16 could be extended with another surrogate space to
handle all of the original UCS (all 31 bits). In general I think it is wrong
to restrict the available 31 bits of UCS to the UTF-16 space just
because Unicode made the wrong choice from the beginning by using
only 16 bits. UTF-8 can encode much more than the UTF-16 code space,
though UTF-16 programs will not be able to handle all of it.
That is no different from my using an 8-bit code space and having to encode
or discard all characters outside code values 0-255.
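
To make the difference in range concrete, here is a minimal Python sketch.
The function names are my own and purely illustrative. The UTF-8 side follows
the six-octet scheme of RFC 2279 (values up to 0x7FFFFFFF); the UTF-16 side
uses a single surrogate pair, which stops at 0x10FFFF. Neither function
rejects lone surrogate values, an assumption made to keep the sketch short.

```python
# Upper bound of the UCS-4 values that fit in 1..6 UTF-8 octets (RFC 2279).
UTF8_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]


def encode_utf8_rfc2279(cp: int) -> bytes:
    """Encode one UCS-4 value (0..0x7FFFFFFF) as RFC 2279 style UTF-8."""
    if cp < 0 or cp > 0x7FFFFFFF:
        raise ValueError("outside the 31-bit UCS-4 range")
    # Pick the shortest form whose limit covers the value.
    n = next(i for i, limit in enumerate(UTF8_LIMITS, start=1) if cp <= limit)
    if n == 1:
        return bytes([cp])
    # Peel off six payload bits per continuation octet, lowest bits first.
    tail = []
    for _ in range(n - 1):
        tail.append(0x80 | (cp & 0x3F))
        cp >>= 6
    # Lead octet: n one-bits, a zero bit, then the remaining payload bits.
    lead = ((0xFF << (8 - n)) & 0xFF) | cp
    return bytes([lead] + tail[::-1])


def encode_utf16_units(cp: int) -> list[int]:
    """Encode one value as UTF-16 code units; fails above 0x10FFFF."""
    if cp < 0:
        raise ValueError("negative value")
    if cp < 0x10000:
        return [cp]
    if cp > 0x10FFFF:
        raise ValueError("not reachable with one surrogate pair")
    v = cp - 0x10000  # 20 bits, split 10/10 across the pair
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]
```

For example, encode_utf8_rfc2279(0x110000) yields four octets, while
encode_utf16_units(0x110000) raises ValueError.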

   Dan