Re: Registration of new charset: UTF-32

To: Keld Jørn Simonsen <keld@dkuug.dk>
Subject: Re: Registration of new charset: UTF-32
From: Asmus Freytag <asmusf@ix.netcom.com>
Date: Sat, 19 May 2001 18:28:35 -0700
Cc: Misha Wolf <Misha.Wolf@reuters.com>, Mark Davis <mark@macchiato.com>, ietf-charsets@iana.org, w3c-i18n-ig@w3.org
In-reply-to: <20010514184801.A16154@rap.rap.dk>
References: <4.2.0.58.20010513113314.01c39950@popd.ix.netcom.com><B0015962675@euvig1.dtc.lon.ime.reuters.com><4.2.0.58.20010513113314.01c39950@popd.ix.netcom.com>

At 06:48 PM 5/14/01 +0200, Keld Jørn Simonsen wrote:
>On Sun, May 13, 2001 at 11:34:24AM -0700, Asmus Freytag wrote:
> > At 08:05 PM 5/11/01 +0100, Misha Wolf wrote:
> >
> >
> > >Has anyone looked to see how this ties in with:
> > >   Extensible Markup Language (XML) 1.0 (Second Edition)
> > >   Autodetection of Character Encodings (Non-Normative)
> > >   http://www.w3.org/TR/REC-xml#sec-guessing
> > >
> > >Misha
> > >
> >
> > I took a quick look. The section already talks about 4-byte codes.
> > Replace UCS-4 by UTF-32 in that section and it would seem to cover it.
>
>I think that the w3c specs should rather refer to the 10646 specs,
>and thus keep the reference to UCS-4.

We deliberately introduced the term UTF-32 because it regularizes
the notation for everyone. Little is gained by mixing UTF-8, UTF-16
and UCS-4 in that section.

If you would like to fix the 10646 spec, you could propose that the
term UTF-32 be formally added there as well. As it stands, 10646
has an unfortunate asymmetry in notation that is cumbersome
for the non-specialist.

A./
