
Re: Indicating charset variants (was: RE: windows 936)



On 9/23/07, Martin Duerst <duerst@it.aoyama.ac.jp> wrote:
> At 22:38 07/09/22, Erik van der Poel wrote:
> >One way to estimate that percentage might be to compare the http or
> >meta charset with an encoding detector's result.
>
> Yes. Do you have any way to do that (of course that would be done
> just on a careful sample)?

I'd like to try this at some point, but it may not be very soon...
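
A rough sketch of how such a comparison might look, assuming a sample of
(declared charset, body bytes) pairs has already been collected. The names
`canonical`, `mismatch_rate` and the `detect` callable are hypothetical;
any encoding detector that maps bytes to a charset name could be plugged in.
Charset labels are normalized through Python's codec registry so that
aliases such as "utf8" and "UTF-8" compare equal.

```python
import codecs


def canonical(name):
    """Return Python's canonical codec name for a label, or None if unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


def mismatch_rate(sample, detect):
    """Fraction of documents whose declared charset disagrees with the detector.

    sample: iterable of (declared_charset, body_bytes) pairs (assumed input).
    detect: hypothetical callable taking bytes and returning a charset name.
    Note: two unknown labels both normalize to None and count as a match here.
    """
    total = 0
    mismatches = 0
    for declared, body in sample:
        total += 1
        if canonical(declared) != canonical(detect(body)):
            mismatches += 1
    return mismatches / total if total else 0.0
```

This only measures disagreement; it does not say whether the label or the
detector is the one that is wrong, so a careful sample would still need
manual checking.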

> A very typical case would be XML Signature. According to the spec,
> you can sign e.g. an XML document in Shift_JIS, but it's done
> by conversion to UTF-8. If the conversion isn't the same when
> the signature is checked, the signature won't match anymore even
> if it's actually correct.
>
> Cases like these are quite different from the usual browsing case,
> where an odd wrong character may just be overlooked.

Interesting. I guess the charset name is just one part of the problem.
Implementations would have to agree on their Unicode mapping tables
too.
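
A small illustration of the mapping-table issue, using the two
Shift_JIS-family codecs that ship with CPython. The bytes 0x81 0x60 decode
to different Unicode characters depending on which table is used, so the
UTF-8 form (and any digest over it) differs too. The SHA-256 digest here is
only a stand-in for the signature step; this is not an XML Signature
implementation and does no canonicalization.

```python
import hashlib

# One double-byte character as it might appear in a document labeled Shift_JIS.
raw = b"\x81\x60"

# Same bytes, two mapping tables.
as_sjis = raw.decode("shift_jis")   # U+301C WAVE DASH
as_cp932 = raw.decode("cp932")      # U+FF5E FULLWIDTH TILDE

# Digest over the UTF-8 conversion, standing in for the signed form.
digest_sjis = hashlib.sha256(as_sjis.encode("utf-8")).hexdigest()
digest_cp932 = hashlib.sha256(as_cp932.encode("utf-8")).hexdigest()

print(f"shift_jis -> U+{ord(as_sjis):04X}")
print(f"cp932     -> U+{ord(as_cp932):04X}")
print("digests match:", digest_sjis == digest_cp932)
```

If the signer converts with one table and the verifier with the other, the
digests differ even though both started from identical bytes.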

Erik