
Re: Registration of new charset SCSU



Harald Tveit Alvestrand wrote:
> - the "charset" SCSU is, as far as I can see, the combination of the CCS
>    UNICODE/ISO 10646 with the CES of UTR #6.
>    This should be expressed clearly in the "Published specification" section;
>    as currently written, it sounds like you're registering the CES only,
>    which is a no-no.

This is a good way to put it. I will update my proposal.

> - I would like to add under "Additional information":
>    SCSU is completely useless for applications that require a canonical
>    representation of text. This is an intentional part of its design.

Well, I will try to find a somewhat nicer way to say this... :-)

You are right. SCSU is not intended as an encoding for internal processing; it is intended as an encoding of Unicode that is more compact than UTF-8 or UTF-16 and that is useful in files (beware of searching the encoded bytes, though) and especially in protocols.
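
To give a rough feel for the compactness claim, here is a minimal sketch in Python. Python has no SCSU codec in its standard library, so the SCSU size is only estimated: the sketch assumes that for text limited to ASCII plus the Latin-1 Supplement, an SCSU encoder starting in its initial state can emit one byte per character, the same bytes as ISO 8859-1. The sample string and variable names are made up for illustration.

```python
# Rough size comparison for a short Latin-1 sample string.
sample = "Grüße aus Zürich"  # hypothetical sample text, 16 characters

utf8_len = len(sample.encode("utf-8"))       # non-ASCII letters take 2 bytes each
utf16_len = len(sample.encode("utf-16-be"))  # 2 bytes per character, no BOM

# Assumption: for printable ASCII plus U+00A0..U+00FF, SCSU in its initial
# single-byte mode needs one byte per character, like ISO 8859-1.
# This is an estimate, not the output of a real SCSU encoder.
scsu_len_estimate = len(sample.encode("latin-1"))

print(utf8_len, utf16_len, scsu_len_estimate)
```

The estimate only holds for this kind of text; other scripts need window-switching or Unicode-mode tag bytes, and different encoders may produce different byte sequences for the same text.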

For example, the XML parser that I am familiar with always converts everything from the document charset into UTF-16 before it does anything else, so it does not care about such issues.

> I don't like it much as a general purpose tool, but it may find a market
> niche somewhere.

That's fair.

What is the next step? Do I need to update and resend the proposal? Should I wait for a few days to give more people time to respond?

markus