RE: Indicating charset variants (was: RE: windows 936)



At 03:54 07/09/25, Shawn Steele wrote:
>>> I agree that a separator like "--" might be best.
>
>I'm concerned about the addition of another layer of names that's incompatible with the current set of names.  The concept of adding a variation would require changes to all client parsers, and that just isn't a realistic expectation.  So the net result would be that a few applications emitting variations would create data unrecognizable by the majority of current software.

It is very clear that having every page suddenly come with a label
with some --variant information attached isn't an option.

Everybody would still be allowed to use a label without a variant.
Most people would choose that, because that's what's currently
supported. But some applications, and some data, where it really
matters, could be more precise.
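
As a rough sketch of what a variant-aware client might do (assuming
the "--" separator discussed above; the function names and the variant
name "ms2000" are made up for illustration and not registered anywhere),
it could split the label and fall back to the plain name when it does
not know the variant:

```python
def split_charset_label(label, separator="--"):
    """Split a label into (base, variant); variant is None when absent."""
    base, sep, variant = label.strip().partition(separator)
    # Charset names are compared case-insensitively, so normalize to lowercase.
    return base.lower(), (variant.lower() if sep and variant else None)


def choose_charset(label, known_variants):
    """Return the most precise name this client supports for the label.

    known_variants is a hypothetical set of (base, variant) pairs.
    """
    base, variant = split_charset_label(label)
    if variant is not None and (base, variant) in known_variants:
        return base + "--" + variant
    # Unknown or missing variant: fall back to the unqualified label.
    return base


known = {("windows-936", "ms2000")}  # hypothetical variant name
print(choose_charset("windows-936", known))
print(choose_charset("windows-936--ms2000", known))
print(choose_charset("windows-936--other", known))
```

This only covers clients that have been updated. A parser that has
not been changed would see the whole string as an unknown name, which
is exactly the concern raised above.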


>If there was to be such a huge breaking change in the naming of code pages, the effort in updating the software and legacy data sets would be much better used to migrate to Unicode than to migrate to a different form of the existing code pages.

That's an extremely valid point. However, the two issues are quite
interrelated. For people and applications where exact conversion
is important, correctly and precisely labeling an encoding variant
can be the first step to migrating to Unicode.


>My thinking is that for the registrations for code pages like 936, it would probably be worth stating that variations of the official code page exist and cause different interpretations on some systems.

That would definitely be very valuable information.

Regards,    Martin.




#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp