Re: Best fit

To: ietf-charsets@mail.apps.ietf.org
Subject: Re: Best fit
From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Sun, 22 Oct 2006 03:40:18 +0200
Organization: <URL:http://purl.net/xyzzy>
References: <c07a32650610202144l2f4a16f1we112c5beadf6dc39@mail.gmail.com><4539EB47.3756@xyzzy.claranet.de><c07a32650610210830m51f1f2d1o817687e47ecb1118@mail.gmail.com>

Erik van der Poel wrote:

> I don't know who created the tables, but they were submitted by an
> individual from Microsoft.

For "surprising" mappings it would be interesting to know how they can
be reproduced or verified, or whether they are only an observation made
with API xyz version m.n by an "unknown" individual.

> ICU may have chosen 0x1A, but that was their own decision. There is
> no interoperability problem here

A u2w.icu( x ) != u2w.bestfit( x ) effect could be ugly.  For some
code pages, such as <http://purl.net/net/cp/858>, ICU tries hard to
list an "official" substitution character, in that case 0x7F, not 0x1A.
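
To make the u2w.icu( x ) != u2w.bestfit( x ) concern concrete, here is a
minimal Python sketch.  Everything in it is an assumption for illustration:
the function names are hypothetical, the two BEST_FIT entries stand in for
the real table, and 0x1A is used as the substitution character only because
it is the value mentioned above.  Python's own cp1252 codec supplies the
exact (non best-fit) mappings.

```python
# Illustrative only: a tiny stand-in for a best-fit table (Unicode -> byte).
BEST_FIT = {0x0100: 0x41, 0x0101: 0x61}  # assumed entries, not the full table
SUB_CHAR = 0x1A                          # assumed substitution character


def u2w_strict(ch, sub=SUB_CHAR):
    """Exact mapping via Python's cp1252 codec, else the substitution byte."""
    try:
        return ch.encode("cp1252")[0]
    except UnicodeEncodeError:
        return sub


def u2w_bestfit(ch, sub=SUB_CHAR):
    """Exact mapping first, then the best-fit table, then the substitution byte."""
    try:
        return ch.encode("cp1252")[0]
    except UnicodeEncodeError:
        return BEST_FIT.get(ord(ch), sub)


def disagreements(chars):
    """Characters for which the two converters return different bytes."""
    return [c for c in chars if u2w_strict(c) != u2w_bestfit(c)]
```

With these assumptions, the two converters agree on every character that
has an exact mapping or no mapping at all, and differ only where the
best-fit table steps in.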

> The 698 WCTABLE mappings are from Microsoft's implementation.
[...]
> I have confirmed that their implementation does return these.

Thanks for the info; "did anybody check this" was part of my question.

> The mappings are sorted in a strange way. Maybe they will fix that,
> but it shouldn't prevent this charset from being updated at IANA.

Sure, that's why I've changed the subject.  I wanted to know how the
new "best fit" tables were created.  This "best fit" is unrelated to
IANA considerations.

> Should we strip the best fit mappings from the table and post it
> somewhere?

They're fine, but could be improved by adding a hint about how they
were determined and who could fix them if needed.

Frank

Follow-Ups:
- RE: Best fit
  - From: Kent Karlsson <kent.karlsson14@comhem.se>
- Re: Best fit
  - From: Erik van der Poel <erikv@google.com>

References:
- Update of charset windows-1252
  - From: Erik van der Poel <erikv@google.com>
- Best fit (was: Update of charset windows-1252)
  - From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Re: Best fit (was: Update of charset windows-1252)
  - From: Erik van der Poel <erikv@google.com>
