
RE: ietf-charsets@mail.apps.ietf.org windows 1250 - another update to review :)



I was apparently blindly relying on erroneous data, sorry.

How's this correction?

> >   The graphic (non-control) characters of Windows-1250 have a supported
> >   set of characters similar to the ISO-8859-2 charset. There are
> >   differences in the range from 80 to FF (hex).

I'll double check the others as well.

Thanks,
Shawn


-----Original Message-----
From: Erik van der Poel [mailto:erikv@google.com] 
Sent: Wednesday 13 June 2007 9:35
To: Shawn Steele
Cc: ietf-charsets@mail.apps.ietf.org
Subject: Re: ietf-charsets@mail.apps.ietf.org windows 1250 - another update to review :)

Perhaps this is a misunderstanding due to the word "superset" (my
mistake). What I meant was an upward compatible character encoding, if
you disregard the control characters. Clearly, iso-8859-2 and
windows-1250 are different in the range 0xA0 to 0xFF:

--- icu-iso-8859-2 2006-10-06 22:01:39.000000000 -0700
+++ cp1250 2007-06-05 15:47:24.000000000 -0700
@@ -126,69 +126,69 @@
 A0 0000A0
-A1 000104
+A1 0002C7
 A2 0002D8
 A3 000141
 A4 0000A4
-A5 00013D
-A6 00015A
+A5 000104
+A6 0000A6
 A7 0000A7
 A8 0000A8
-A9 000160
+A9 0000A9
 AA 00015E
-AB 000164
-AC 000179
+AB 0000AB
+AC 0000AC
 AD 0000AD
-AE 00017D
+AE 0000AE
 AF 00017B
 B0 0000B0
-B1 000105
+B1 0000B1
 B2 0002DB
 B3 000142
 B4 0000B4
-B5 00013E
-B6 00015B
-B7 0002C7
+B5 0000B5
+B6 0000B6
+B7 0000B7
 B8 0000B8
-B9 000161
+B9 000105
 BA 00015F
-BB 000165
-BC 00017A
+BB 0000BB
+BC 00013D
 BD 0002DD
-BE 00017E
+BE 00013E
 BF 00017C
 C0 000154
 C1 0000C1

So we should not only check all of your updates, but also choose a
better word than "superset".
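
As a rough cross-check of the diff above, the same comparison can be
sketched with Python's built-in codecs. This assumes that Python's
"iso8859_2" and "cp1250" codec tables agree with the ICU and Microsoft
tables used for the diff; diff_range is a hypothetical helper name, not
part of any library.

```python
# Sketch: list the bytes that decode to different Unicode code points
# in two single-byte charsets. Assumes Python's bundled codec tables
# match the tables compared in the diff; diff_range is a made-up name.

def diff_range(enc_a="iso8859_2", enc_b="cp1250", start=0xA0, end=0xFF):
    """Return (byte, codepoint_a, codepoint_b) for each differing byte."""
    diffs = []
    for b in range(start, end + 1):
        raw = bytes([b])
        # errors="replace" turns undefined bytes into U+FFFD instead of raising.
        a = raw.decode(enc_a, errors="replace")
        c = raw.decode(enc_b, errors="replace")
        if a != c:
            diffs.append((b, ord(a), ord(c)))
    return diffs


if __name__ == "__main__":
    # Same column layout as the mapping files: byte, then code point.
    for b, a, c in diff_range():
        print(f"{b:02X} {a:06X} -> {c:06X}")
```

Calling diff_range(start=0x80) instead would cover the whole 80 to FF
range mentioned in the registration text, including the bytes that
windows-1250 leaves undefined.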

Erik

On 6/12/07, Erik van der Poel <erikv@google.com> wrote:
> Are the graphic character mappings of windows-1250 really a superset
> of iso-8859-2? Do the bytes in the range 0xA0 to 0xFF map to the same
> Unicodes?
>
> I already knew of the windows-1252 and -1254 supersets, but not -1250.
> Maybe the differences are "minor"?
>
> Erik
>
> On 6/12/07, Shawn Steele <Shawn.Steele@microsoft.com> wrote:
> > Please review updates to windows 1250.  I used the feedback we had last year for 1252 to guide this request.
> >
> > Thanks,
> > Shawn
> >
> > -------------------------------------------------------------------------
> > Charset name: windows-1250
> >
> > Charset aliases: (None)
> >
> > Suitability for use in MIME text:
> >
> >   Yes, windows-1250 is suitable for use with subtypes of the "text"
> >   Content-Type. Note that windows-1250 is an 8-bit charset. Care should
> >   be taken to choose an appropriate Content-Transfer-Encoding.
> >
> > Published specification(s):
> >
> >   1) http://www.microsoft.com/globaldev/reference/sbcs/1250.htm
> >
> > ISO 10646 equivalency table:
> >
> >   http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1250.TXT
> >
> > Additional information:
> >
> >   UTF-8 is preferred to windows-1250 when permissible.
> >
> >   Although not authoritative, the following references may also be of
> >   interest:
> >
> >   Printed mapping table:
> >   Dr. International "Developing International Software, Second Edition",
> >   Microsoft Press, ISBN 0-7356-1583-7, 2003, p. 729-737
> >
> >   Microsoft windows extended "best fit" behavior:
> >   http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1250.txt
> >
> >   This is an update of an existing registration of this charset. This
> >   charset name is in use.
> >
> >   This charset is also known as Windows Code Page 1250 or cp1250 for
> >   short; these are NOT aliases.
> >
> >   The graphic (non-control) characters of Windows-1250 have a supported
> >   set of characters similar to the ISO-8859-2 charset. There are
> >   differences in the range from 80 to FF (hex).
> >
> > Person & email address to contact for further information:
> >
> >   Shawn Steele
> >   Email: Shawn.Steele@microsoft.com
> >
> >   Microsoft Corporation
> >   One Microsoft Way,
> >   Redmond, WA 98052
> >   U.S.A.
> >
> > Intended usage: COMMON
> >
> >
>