
Re: IANA Character Set Registration Submittal



I would like to point out that the published specifications that are
referenced here are incomplete. In particular:
1. The specifications show only roundtrip mappings, which means each
   of them omits hundreds of fallback (one-way) mappings that
   Windows actually performs.
2. The substitution characters are not specified.
3. For MBCS charsets, the specifications do not say which byte sequences
   are valid, which are illegal, and which are unassigned.
   Lead bytes are specified, but trail byte ranges are not,
   and illegal vs. unassigned bytes (e.g., windows-932 0x80) are not distinguished.
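
To make the three gaps concrete, here is a minimal Python sketch of the
information a complete specification would have to carry. Everything in it
is an assumption for illustration: the byte ranges follow the common
Shift-JIS layout rather than the registration documents, the mapping tables
hold only a few toy entries, and the names (classify, encode_char,
SUBSTITUTION, and so on) are hypothetical. It does not reproduce the actual
Windows conversion behavior.

```python
# Assumed structure of a double-byte charset (common Shift-JIS layout).
# These ranges are illustrative, not taken from the registrations.
SINGLE_RANGES = [(0x00, 0x7F), (0xA1, 0xDF)]
LEAD_RANGES = [(0x81, 0x9F), (0xE0, 0xFC)]
TRAIL_RANGES = [(0x40, 0x7E), (0x80, 0xFC)]

# Toy roundtrip table: each entry converts in both directions.
ROUNDTRIP = {
    b"\x41": "A",
    b"\x5c": "\\",
    b"\x82\xa0": "\u3042",  # HIRAGANA LETTER A
}

# Toy fallback table: Unicode -> bytes only (one-way).
# Illustrative entry: YEN SIGN falls back to byte 0x5C.
FALLBACK = {"\u00a5": b"\x5c"}

# Assumed substitution character for unmappable input.
SUBSTITUTION = b"?"

FROM_UNICODE = {ch: seq for seq, ch in ROUNDTRIP.items()}


def in_ranges(byte, ranges):
    """Return True if byte falls inside any (low, high) range."""
    return any(low <= byte <= high for low, high in ranges)


def classify(seq):
    """Classify a 1- or 2-byte sequence as valid, unassigned, or illegal."""
    if len(seq) == 1 and in_ranges(seq[0], SINGLE_RANGES):
        well_formed = True
    elif (len(seq) == 2 and in_ranges(seq[0], LEAD_RANGES)
          and in_ranges(seq[1], TRAIL_RANGES)):
        well_formed = True
    else:
        well_formed = False
    if not well_formed:
        return "illegal"      # structurally impossible in this charset
    if seq in ROUNDTRIP:
        return "valid"        # well-formed and mapped
    return "unassigned"       # well-formed but no mapping in the table


def encode_char(ch, allow_fallback=True):
    """Encode one character: roundtrip first, then fallback, then substitute."""
    if ch in FROM_UNICODE:
        return FROM_UNICODE[ch]
    if allow_fallback and ch in FALLBACK:
        return FALLBACK[ch]
    return SUBSTITUTION


if __name__ == "__main__":
    for seq in (b"\x82\xa0", b"\x82\xa1", b"\x82\x7f", b"\x80"):
        print(seq.hex(), classify(seq))
    print(encode_char("\u00a5"), encode_char("\u00a5", allow_fallback=False))
```

Note that the sketch reports 0x80 as illegal only because of the ranges
assumed above. Whether that byte is illegal or merely unassigned is
precisely what the specification itself would need to state.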

See also
http://www.unicode.org/reports/tr22/
http://icu.sourceforge.net/charts/charset/
The latter link points to data files showing actual Windows conversion
API behavior.
(Due to a recent web site move, some URLs may not work correctly; in
this case, try http://dev.icu-project.org/cgi-bin/viewcvs.cgi/charset/data/
for the data files.)

Best regards,
markus

On Tue, 15 Mar 2005 13:43:19 -0800, Mike Ksar <mikeksar@microsoft.com> wrote:
> 
> Attached are 5 new charset registration applications and 9 previously
> registered charsets which needed updating.

-- 
Opinions expressed here may not reflect my company's positions unless
otherwise noted.