
RE: Big5 / CP950



Moved the note and removed big5+. If anyone knows other examples, I'd include those.

> You have "COMMON" here while your Shift_JIS registration has "LIMITED".
> Is that by accident, or is there some rationale behind it?

Um, by accident.  I copied the original shift-jis registration, and used the windows-1252 one as a template for this.  I have no clue what the distinction is :)  Changed to LIMITED USE (reasoning that the variations cause instability between implementations, so I'd much rather have people picking something like UTF-8).  Is there a definition of these terms?  All of them should be OBSOLETE in favor of UTF-* ;-)  I'd use that if I could get away with it.

-Shawn

 
http://blogs.msdn.com/shawnste

________________________________________
From: "Martin J. Dürst" [duerst@it.aoyama.ac.jp]
Sent: Wednesday, September 21, 2011 1:00 AM
To: Shawn Steele
Cc: 'ietf-charsets@mail.apps.ietf.org'; Makoto Murata (eb2m-mrt@asahi-net.or.jp)
Subject: Re: Big5 / CP950

Hello Shawn,

On 2011/09/21 2:44, Shawn Steele wrote:
> Here's some proposed text for a more complete registration.

Many thanks for doing this work. Some comments below, mostly nits.

> Comments welcome.  AFAICT this code page is quite a bit less stable than others, and there are a plethora of mappings.  I've included two ISO10646 equivalency tables for that reason.
>
> Thanks,
> Shawn
>
>
-----------------------------------

Charset name: big5

Charset aliases: (None)

MIBenum: 2026

Suitability for use in MIME text:

Yes, big5 is suitable for use with subtypes of the "text"
Content-Type. Note that big5 is an 8-bit charset. Care should
be taken to choose an appropriate Content-Transfer-Encoding.

Two example ISO 10646 equivalency tables:
http://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT
http://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP950.TXT

Note that Big5 has many variants; the two tables listed here are
examples of common mappings.

Additional information:

Several vendor-specific charsets that derive from Big5 often use
the Big5 name instead of a more specific vendor charset name.
Big5-HKSCS is one example; Microsoft Code Page 950 and
several font-specific variations are others.

Although not authoritative, the following references may also be of
interest:

Printed mapping table:
Dr. International "Developing International Software, Second Edition",
Microsoft Press, ISBN 0-7356-1583-7, 2003, p. 778 and appendixes on CD.

Microsoft Windows extended "best fit" behavior:
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit950.txt

Additional information about the many variants of Big5:
http://en.wikipedia.org/wiki/Big-5

The wide variety of existing variations of Big5 may make it
unsuitable for many modern applications.  Developers should
consider whether UTF-8 or UTF-16 would be more appropriate for
new applications.

This is an update of an existing registration of this charset. This
charset name is in use.

This charset is also known as Windows Code Page 950 or cp950 for
short; these are NOT aliases.

Person & email address to contact for further information:

Shawn Steele
Email: Shawn.Steele&microsoft.com

Microsoft Corporation
One Microsoft Way
Redmond, WA 98052
U.S.A.

Intended usage: LIMITED USE
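
As a rough illustration of the variant problem described in the registration, the sketch below compares how two decoders that both derive from Big5 treat the same two-byte sequences. It assumes Python's built-in "big5" and "cp950" codecs, which are Python's own implementations and are not guaranteed to match the two mapping tables cited above byte for byte. The helper names decode_or_none and mapping_differences are made up for this example.

```python
def decode_or_none(raw: bytes, codec: str):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None


def mapping_differences(codec_a: str = "big5", codec_b: str = "cp950"):
    """Collect (bytes, result_a, result_b) for two-byte sequences that differ.

    Scans the usual Big5 double-byte layout: lead bytes 0x81-0xFE and
    trail bytes 0x40-0x7E or 0xA1-0xFE. A result of None means the codec
    has no mapping for that sequence.
    """
    trail_bytes = list(range(0x40, 0x7F)) + list(range(0xA1, 0xFF))
    diffs = []
    for lead in range(0x81, 0xFF):
        for trail in trail_bytes:
            raw = bytes([lead, trail])
            a = decode_or_none(raw, codec_a)
            b = decode_or_none(raw, codec_b)
            if a != b:
                diffs.append((raw, a, b))
    return diffs


if __name__ == "__main__":
    diffs = mapping_differences()
    print(len(diffs), "two-byte sequences decode differently")
    for raw, a, b in diffs[:10]:
        print(raw.hex(), repr(a), repr(b))
```

Text that stays within the shared core round-trips the same way under both codecs, which is why the mismatch tends to go unnoticed until one of the divergent sequences shows up in real data labeled simply "big5".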