Equality



> > It is very useful, if the uniqueness is not achievable, to have
> > some short notation of regular expressions to represent all the
> > equivalent characters.
> 
> This merely moves the burden up to the user, to type that regular
> expression,

That's why a *SHORT* notation is very useful.
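
To illustrate what I mean by a short notation, here is a minimal
sketch in Python. The {=x} syntax, the expand() function and the
contents of the EQUIVALENTS table are all made up for this example;
no existing regular expression library defines them.

```python
import re

# Hypothetical equivalence classes; the entries are illustrative only.
EQUIVALENTS = {
    "A": "A\uFF21",  # ASCII 'A' and full-width 'A'
    "a": "a\uFF41",  # ASCII 'a' and full-width 'a'
}

def expand(pattern):
    """Expand the made-up short notation {=x} into a character class."""
    def repl(match):
        ch = match.group(1)
        # Characters without a table entry stand only for themselves.
        return "[" + re.escape(EQUIVALENTS.get(ch, ch)) + "]"
    return re.sub(r"\{=(.)\}", repl, pattern)

# The user types {=A}BC once; it matches either representation of 'A'.
regex = re.compile(expand("{=A}BC"))
```

The user types four characters instead of enumerating every
equivalent form by hand.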

> I found it very illuminating that the June 1993 version of ECMA-35, to
> be proposed to ISO as a new edition of 2022, requires the lowest
> numbered of G0/G1/G2/G3 to be used when a character is present in
> multiple sets, *even if a higher numbered set is already invoked
> and the lowest numbered set is not* (clause 7.5). This amounts to a
> version of uniqueness.

I don't think that is useful at all.

As the code points of a character vary with no regularity across
different character sets, it means you must have a table. And, if
you have such a table, it is not at all difficult for the receiving
side to use the table to disambiguate a character with multiple
representations.
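
A minimal sketch of such a receiver-side table, in Python. The
names CANONICAL, canonical() and same_character() are hypothetical,
and which pairs are listed as equivalent is an assumption made for
illustration, not something any standard dictates.

```python
# Hypothetical table: (character set, code) -> canonical identity.
# Which entries count as "the same character" is a policy decision.
CANONICAL = {
    ("ASCII", 0x41): "LATIN CAPITAL LETTER A",
    ("JIS X 0208", 0x2341): "LATIN CAPITAL LETTER A",
    ("ISO 8859-7", 0xE1): "GREEK SMALL LETTER ALPHA",
    ("JIS X 0208", 0x2641): "GREEK SMALL LETTER ALPHA",
}

def canonical(charset, code):
    """Return the canonical identity, or the pair itself if unlisted."""
    return CANONICAL.get((charset, code), (charset, code))

def same_character(a, b):
    """Compare two (charset, code) pairs after table lookup."""
    return canonical(*a) == canonical(*b)
```

The same table a sender would need in order to pick the lowest
numbered set lets the receiver treat the representations as equal,
so nothing is gained by restricting the sender.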

And, do you think 'A' in JIS X0208 is identical to 'A' in ASCII?
Do you think the Han characters of GB, CNS, JIS, and KCS unified in
ISO 10646 are the same characters?

					Masataka Ohta