
RE: IANA Character Set Registration Submittal



Markus,
 
The items you list as missing are not required by the IANA registration procedures in RFC 2978.  The references that I included show how MS defines these character sets, and our products implement them based on those specifications.  In particular:
 
1.  Fallback mappings are not part of the registration procedures in RFC 2978.
2.  Substitution characters are not part of the registration procedures either.
3.  The same applies to MBCS charsets.
 
Your concerns are implementation-specific rather than registration requirements.  Nine of the submittals update references for character sets that are already registered at IANA.
 
Mike Ksar

________________________________

From: Markus Scherer [mailto:markus.icu@gmail.com]
Sent: Fri 3/25/2005 2:57 PM
To: ietf-charsets@iana.org
Subject: Re: IANA Character Set Registration Submittal



I would like to point out that the published specifications that are
referenced here are incomplete. In particular:
1. The specifications only show roundtrip mappings, which means they
   are each omitting hundreds of fallback (one-way) mappings that
   Windows actually performs.
2. The substitution characters are not specified.
3. For MBCS charsets, the specification is incomplete as to which byte sequences
   are valid vs. illegal vs. unassigned.
   While lead bytes are specified, trail byte ranges are not,
   and illegal vs. unassigned (e.g., windows-932 0x80) are not specified.
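
To make the three points above concrete, here is a minimal sketch of what
a converter needs beyond a roundtrip table. Everything in it is an
assumption made for illustration: the tables are toy data, not taken from
any Windows code page specification, and the names (ROUNDTRIP, FALLBACK,
SUBSTITUTION, encode_char, classify) are hypothetical.

```python
# Toy tables, illustrative only; not actual windows-125x or windows-932 data.
ROUNDTRIP = {0x41: "A", 0x80: "\u20ac"}  # byte <-> Unicode, valid in both directions
FALLBACK = {"\uff21": 0x41}              # Unicode -> byte only (one-way mapping)
SUBSTITUTION = 0x3F                      # byte emitted when nothing else applies


def encode_char(ch, use_fallback=True):
    """Return (byte, kind); kind is 'roundtrip', 'fallback' or 'substitution'."""
    for byte, uni in ROUNDTRIP.items():
        if uni == ch:
            return byte, "roundtrip"
    if use_fallback and ch in FALLBACK:
        return FALLBACK[ch], "fallback"
    return SUBSTITUTION, "substitution"


# Toy double-byte layout: lead and trail byte ranges plus one assigned pair.
LEAD_RANGES = [(0x81, 0x9F), (0xE0, 0xFC)]
TRAIL_RANGES = [(0x40, 0x7E), (0x80, 0xFC)]
ASSIGNED = {(0x81, 0x40): "\u3000"}


def in_ranges(b, ranges):
    """True if byte value b falls inside any (low, high) range."""
    return any(lo <= b <= hi for lo, hi in ranges)


def classify(seq):
    """Classify a 1- or 2-byte sequence as 'valid', 'unassigned' or 'illegal'."""
    if len(seq) == 1:
        # Toy decision: any single byte >= 0x80 is illegal. A real
        # specification has to state this choice explicitly.
        return "valid" if seq[0] < 0x80 else "illegal"
    if len(seq) != 2:
        raise ValueError("expected a 1- or 2-byte sequence")
    lead, trail = seq
    if not in_ranges(lead, LEAD_RANGES) or not in_ranges(trail, TRAIL_RANGES):
        return "illegal"       # structurally malformed sequence
    if (lead, trail) in ASSIGNED:
        return "valid"         # well-formed and mapped
    return "unassigned"        # well-formed but no mapping defined


if __name__ == "__main__":
    for ch in ("A", "\uff21", "\u4e2d"):
        print(repr(ch), encode_char(ch))
    for seq in (b"\x81\x40", b"\x81\x41", b"\x81\x7f", b"\x80"):
        print(seq.hex(), classify(seq))
```

With only the ROUNDTRIP table published, two converters can agree on
every roundtrip mapping and still produce different output for the same
input, because the fallback table, the substitution byte, and the
illegal vs. unassigned boundaries are all left to the implementer.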

See also
http://www.unicode.org/reports/tr22/
http://icu.sourceforge.net/charts/charset/
The latter link points to data files showing actual Windows conversion
API behavior.
(Due to a recent web site move, some URLs may not work correctly; in
this case, try http://dev.icu-project.org/cgi-bin/viewcvs.cgi/charset/data/
for the data files.)

Best regards,
markus

On Tue, 15 Mar 2005 13:43:19 -0800, Mike Ksar <mikeksar@microsoft.com> wrote:
>
> Attached are 5 new charset registration applications and 9 previously
> registered charsets which needed updating.

--
Opinions expressed here may not reflect my company's positions unless
otherwise noted.