
Re: shift_jis / windows-31J



(2010/11/17 8:00), Bjoern Hoehrmann wrote:
> * Anne van Kesteren wrote:
>> It would only break if the content they consumed contained code points
>> that mapped to invalid characters and they relied on that, though. And the
>> content was not relying on being mapped to the superset mapping of
>> Windows-31J instead, which seems far more likely given the dominance of
>> the Web and Windows.
>
> The prime example problem with shift_jis is the ambiguity of the octet
> 0x5C which maps to a backslash for some and to the yen sign for others.
> As far as I am aware, 0x5C is not invalid and this particular problem
> is not a matter of supersets and subsets, you get 0x5C and you do not
> know whether you should interpret it as yen sign or backslash. And it's
> not going to change, systems built around one interpretation will use
> that interpretation, systems built around the other interpretation will
> stick with their interpretation as well. If you have two web services
> that exchange data they may be running on Windows and on the Web, but
> they may not be using the Windows/Browser/whatever interpretation.

In practice, 0x5C in Shift-JIS is decoded as U+005C but displayed with a yen sign glyph.
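
As a rough illustration of the decoding half of that statement, here is a minimal Python sketch. It assumes CPython's built-in `shift_jis` and `cp932` codecs as a stand-in for "in practice"; browsers and other platforms are not tested here, and the helper name `decode_5c` is made up for this example. The glyph that ends up on screen is decided by the font, so it cannot be shown in code.

```python
# Decode the single octet 0x5C with two of CPython's Japanese codecs.
# Assumption: CPython's codec tables, not any browser's decoder.
RAW = bytes([0x5C])

def decode_5c(codec_name):
    """Return the character that the octet 0x5C decodes to (hypothetical helper)."""
    return RAW.decode(codec_name)

for name in ("shift_jis", "cp932"):
    ch = decode_5c(name)
    # Print the resulting code point; whether it is drawn as a backslash
    # or as a yen sign depends on the font used to render it.
    print(f"{name}: U+{ord(ch):04X}")
```

Under these codecs both names yield the same code point, so the remaining difference is purely one of rendering.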

See also https://bugs.webkit.org/show_bug.cgi?id=24906

-- 
NARUSE, Yui  <naruse@airemix.jp>