
Re: Registration of new charset: UTF-32

To: Asmus Freytag <[email protected]>
Subject: Re: Registration of new charset: UTF-32
From: Keld Jørn Simonsen <[email protected]>
Date: Sun, 20 May 2001 19:15:03 +0200
Cc: Keld Jørn Simonsen <[email protected]>, Misha Wolf <[email protected]>, Mark Davis <[email protected]>, [email protected], [email protected]
In-reply-to: <[email protected]><"from asmusf"@ix.netcom.com>
References: <[email protected]><[email protected]><[email protected]><[email protected]><[email protected]>
Sender: [email protected]
User-Agent: Mutt/1.2.5i

On Sat, May 19, 2001 at 06:28:35PM -0700, Asmus Freytag wrote:
> At 06:48 PM 5/14/01 +0200, Keld Jørn Simonsen wrote:
> >On Sun, May 13, 2001 at 11:34:24AM -0700, Asmus Freytag wrote:
> > > At 08:05 PM 5/11/01 +0100, Misha Wolf wrote:
> > >
> > >
> > > >Has anyone looked to see how this ties in with:
> > > >   Extensible Markup Language (XML) 1.0 (Second Edition)
> > > >   Autodetection of Character Encodings (Non-Normative)
> > > >   http://www.w3.org/TR/REC-xml#sec-guessing
> > > >
> > > >Misha
> > > >
> > >
> > > I took a quick look. The section already talks about 4-byte codes.
> > > Replace UCS-4 by UTF-32 in that section and it would seem to cover it.
> >
> >I think that the w3c specs should rather refer to the 10646 specs,
> >and thus keep the reference to UCS-4.
> 
> We deliberately introduced the term UTF-32 since this regularizes
> the notation for everyone. Little is to be gained by using a mixed
> notation in that section using UTF-8, UTF-16 and UCS-4 together.
> 
> If you would like to fix the 10646 spec, you could propose that the
> term UTF-32 is formally added there as well. As it stands, 10646
> has an unfortunate asymmetry in notation that is cumbersome to use
> for the non-specialist.

You really should not do this. UCS-4 is the canonical representation of
10646. The name UTF-32 would be misleading, as UCS-4 is not a transformation
format but the "real thing".

Kind regards
Keld
