
Re: Best fit



Erik van der Poel wrote:
 
> I don't know who created the tables, but they were submitted by an
> individual from Microsoft.

For "surprising" mappings it would be useful to know how they can be
reproduced or verified, or whether they are merely an observation
made with API xyz version m.n by an "unknown" individual.

> ICU may have chosen 0x1A, but that was their own decision. There is
> no interoperability problem here

A u2w.icu( x ) != u2w.bestfit( x ) effect could be ugly.  For some
code pages like <http://purl.net/net/cp/858> ICU tries hard to list
an "official" substitution character, in that case 0x7F, not 0x1A.
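
To illustrate what I mean by such an effect, here is a minimal sketch
in Python.  Everything in it is an assumption for illustration only:
the two toy tables, the helper names make_converter and disagreements,
and the pairing of 0x7F and 0x1A with "converter A" and "converter B"
are hypothetical and not taken from ICU or from the submitted tables.
It only shows how two converters that differ in substitution byte and
in best fit entries can be compared input by input.

```python
# Hypothetical toy converters; neither table comes from ICU or Microsoft.
SUB_A = 0x7F  # substitution byte assumed for converter A
SUB_B = 0x1A  # substitution byte assumed for converter B


def make_converter(table, sub):
    """Return a function mapping one character to a byte value,
    falling back to `sub` when the code point is not in `table`."""
    def convert(ch):
        return table.get(ord(ch), sub)
    return convert


# Two illustrative roundtrip entries shared by both converters.
strict_table = {0x41: 0x41, 0xE9: 0x82}

# Converter B additionally carries one made-up best fit entry:
# FULLWIDTH LATIN CAPITAL LETTER A folded to plain 'A'.
bestfit_table = dict(strict_table)
bestfit_table[0xFF21] = 0x41

u2w_a = make_converter(strict_table, SUB_A)
u2w_b = make_converter(bestfit_table, SUB_B)


def disagreements(chars):
    """List (char, byte_a, byte_b) for inputs where the converters differ."""
    return [(ch, u2w_a(ch), u2w_b(ch))
            for ch in chars
            if u2w_a(ch) != u2w_b(ch)]
```

With these toy tables the converters agree on "A" and U+00E9, but
differ on U+FF21 (substitution byte versus best fit) and on any
character unknown to both (0x7F versus 0x1A).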

> The 698 WCTABLE mappings are from Microsoft's implementation.
[...]
> I have confirmed that their implementation does return these.

Thanks for the info; "did anybody check this" was part of my question.
 
> The mappings are sorted in a strange way. Maybe they will fix that,
> but it shouldn't prevent this charset from being updated at IANA.

Sure, that's why I've changed the subject.  I wanted to know how the
new "best fit" tables were created.  This "best fit" is unrelated to
IANA considerations.
 
> Should we strip the best fit mappings from the table and post it
> somewhere?

They're fine, but could be improved by adding a note on how they
were determined and who could fix them if needed.

Frank