
RE: Registration of some code pages



That's sort of the message I hear from different directions (end users, customers, and developers).

I don't know enough about the standard versions of shift_jis, iso-2022-jp, etc. :) (I know some of you are ROFL now :) I'll pause for a moment and let you collect yourselves ;0)

But I gather there may be some conflicting behavior as well, and that the Windows version isn't just additions?
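
One rough way to probe that question is to decode the same bytes with two codecs and compare. The sketch below assumes that CPython's bundled "shift_jis" and "cp932" codecs are reasonable stand-ins for the standard and Windows-31J tables; that is an assumption of this illustration, not a claim about any vendor's implementation. The helper name decode_or_none and the two sample byte sequences are chosen here purely for illustration.

```python
# Compare Python's "shift_jis" and "cp932" (Windows-31J) codecs.
# Assumption: CPython's bundled tables stand in for the two variants.

def decode_or_none(data: bytes, encoding: str):
    """Return the decoded string, or None if the codec rejects the bytes."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# Hypothetical sample inputs picked for this sketch.
ADDITION = b"\x87\x40"  # a position only the Windows variant assigns
CONFLICT = b"\x81\x60"  # a position both assign, to different characters

results = {
    (name, enc): decode_or_none(data, enc)
    for name, data in (("addition", ADDITION), ("conflict", CONFLICT))
    for enc in ("shift_jis", "cp932")
}

for (name, enc), text in results.items():
    shown = "rejected" if text is None else f"U+{ord(text):04X}"
    print(f"{name:9} {enc:10} {shown}")
```

With these codecs, the first sequence is a pure addition (rejected by "shift_jis", decoded as CIRCLED DIGIT ONE by "cp932"), while the second decodes under both but to different code points (WAVE DASH versus FULLWIDTH TILDE), which is a difference in Unicode mapping rather than a disagreement about the glyph.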

My understanding (hearsay) is also that "ours" isn't the only variation of these code pages, though our version certainly gets a lot of attention.

-Shawn

-----Original Message-----
From: Ned Freed [mailto:ned.freed@mrochek.com] 
Sent: Tuesday, September 07, 2010 4:01 PM
To: Anne van Kesteren
Cc: NARUSE, Yui; Shawn Steele; 'ietf-charsets@iana.org'
Subject: Re: Registration of some code pages

> On Tue, 07 Sep 2010 07:02:55 +0200, Shawn Steele 
> <Shawn.Steele@microsoft.com> wrote:
> >> If I have portable software, it should work on Unix the same 
> >> as it does on Windows.
> >> So the expectation that "shift_jis" on Windows means "Windows-31J"
> >> seems wrong.
> >
> > That's the fundamental problem.  If you have portable software and 
> > run it on Unix and on Windows, and save your file using "shift_jis" 
> > you're going to have some odd discrepancies.  Obviously that's not 
> > good, but it's pretty entrenched.  Clearly we cannot expect Unix 
> > boxes to pretend shift_jis is Windows-31J (but some apps do), 
> > however it's also a tad unreasonable to expect Windows boxes to 
> > suddenly be very strict when they encounter "shift_jis" as that 
> > would break a very large number of documents that currently "work."

Adding another data point: We're under considerable pressure from our Japanese customers to just add the 31J stuff to our shift_jis and iso-2022-jp tables and be done with it. They will accept nothing less than the ability to use the additional characters and send them out labelled as iso-2022-jp, or less often, shift_jis.

> I think we can expect all browsers to at least start "pretending" that 
> shift_jis is Windows-31J. And similarly for all other encodings. And 
> maybe changes to browsers find their way back upstream, but that is 
> outside my interest area.

I'll probably get chided for saying this, but it sure seems to me that this battle is already lost and we should suck it up and move on. It's always been permissible to add characters to a charset, even though there are always going to be implementations that are slow to support, or may never be upgraded to support, the new characters.

So, unless there are cases where a code point has been used in conflicting ways, why don't we just add the additional characters to shift_jis and iso-2022-jp? (Perhaps a revision to RFC 1468 is in order.)

				Ned