
Re: Registration of some code pages



On Tue, 07 Sep 2010 07:02:55 +0200, Shawn Steele  
<Shawn.Steele@microsoft.com> wrote:
>> If I have portable software, it should work on Unix the same as
>> it does on Windows.
>> So the expectation that "shift_jis" on Windows means "Windows-31J"  
>> seems wrong.
>
> That's the fundamental problem.  If you have portable software and run  
> it on Unix and on Windows, and save your file using "shift_jis" you're  
> going to have some odd discrepancies.  Obviously that's not good, but  
> it's pretty entrenched.  Clearly we cannot expect Unix boxes to pretend  
> shift_jis is Windows-31J (but some apps do), however it's also a tad  
> unreasonable to expect Windows boxes to suddenly be very strict when  
> they encounter "shift_jis" as that would break a very large number of  
> documents that currently "work."

I think we can expect all browsers to at least start "pretending" that  
shift_jis is Windows-31J. And similarly for all other encodings. And maybe  
changes to browsers find their way back upstream, but that is outside my  
interest area.


> My feeling is that this is a fairly annoying pain, and I could probably  
> invent a number of transition schemes that might get some sort of  
> reasonable parity and migrate documents over a decade or two.  However,  
> I think that would still be a painful process, and that everyone's  
> energy would be better spent encouraging use of a more consistent  
> encoding, such as UTF-8, that avoids most of the problems with code  
> pages evolving in different directions.

Sure, we encourage people to use UTF-8 all over the place. However, there  
is a segment of the web that is not being updated as much (or at all) and  
still wants to be rendered. All browsers should render that segment the  
same way. And new browsers should not have to reverse engineer existing  
browsers to figure out how to render that segment correctly. (Where  
correctly means e.g. interpreting shift_jis as Windows-31J.)
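
To illustrate what "interpreting shift_jis as Windows-31J" could look like in practice, here is a minimal Python sketch. The override table and the `decode_legacy` helper are hypothetical names made up for this example, not anything a browser or registry defines; the sketch also assumes Python's built-in `cp932` codec is an acceptable stand-in for Windows-31J.

```python
# Hypothetical override table: declared label -> codec actually used.
# "cp932" is Python's name for Microsoft code page 932 (Windows-31J).
LEGACY_OVERRIDES = {
    "shift_jis": "cp932",
}


def decode_legacy(data: bytes, declared_label: str) -> str:
    """Decode bytes using the declared label, applying any override."""
    label = declared_label.strip().lower()
    codec = LEGACY_OVERRIDES.get(label, label)
    return data.decode(codec)


# 0x87 0x40 is CIRCLED DIGIT ONE (U+2460) in Windows-31J; it lies in a
# vendor extension area, so a strict Shift_JIS decoder may reject it.
sample = b"\x87\x40"
text = decode_legacy(sample, "Shift_JIS")
print(repr(text))
```

Labels without an entry in the table are passed to the decoder unchanged, so only the known misrepresentations are overridden.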


>> Such automatic overrides are needed for documents that declare their
>> encoding in the document or related metadata.
>
> My suggestion is not to make such a replacement "automatic", but rather  
> noting "somewhere" (like the standards or registry) that this name  
> misrepresentation sometimes happens.  Then the app developer can figure  
> out what to do for their app and user base.

If you develop a new browser you do not really want to have to figure such  
things out. It is also nigh-on impossible given the amount of content out  
there, global market, etc. So we should document what we think is best so  
others do not have to figure it out all over again.


-- 
Anne van Kesteren
http://annevankesteren.nl/