
Re: Big5 / CP950



Hello Shawn,

On 2011/09/21 2:44, Shawn Steele wrote:
> Here's some proposed text for a more complete registration.

Many thanks for doing this work. Some comments below, mostly nits.

> Comments welcome.  AFAICT this code page is quite a bit less stable than others, and there are a plethora of mappings.  I've included two ISO10646 equivalency tables for that reason.
>
> Thanks,
> Shawn
>
>
> -----------------------------------
>
> Charset name: big5
>
> Charset aliases: (None)
>
> MIBenum: 2026
>
> Suitability for use in MIME text:
>
> Yes, big5 is suitable for use with subtypes of the "text"
> Content-Type. Note that big5 is an 8-bit charset. Care should
> be taken to choose an appropriate Content-Transfer-Encoding.
>
> Two example ISO 10646 equivalency tables:  Note that Big5 has
> many variants, so these exemplars provide two common mappings:
> http://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT
> http://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP950.TXT

(I'd put the "Note that Big5 has many variants..." sentence after the URIs.)
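
To make the "many variants" point concrete, here is a minimal sketch that lists the two-byte sequences two Big5-family decoders disagree on. It assumes Python 3 and uses that library's built-in "big5" and "cp950" codecs as stand-ins; those tables are Python's own and need not match the two mapping files above byte for byte. The function name differing_sequences is made up for illustration.

```python
def differing_sequences(codec_a="big5", codec_b="cp950"):
    """Return (raw_bytes, text_a, text_b) for every two-byte sequence
    that the two codecs decode differently. None means 'not decodable'."""
    # Trail-byte ranges commonly used by Big5-family encodings.
    trail_bytes = list(range(0x40, 0x7F)) + list(range(0xA1, 0xFF))
    diffs = []
    for lead in range(0x81, 0xFF):
        for trail in trail_bytes:
            raw = bytes([lead, trail])
            decoded = []
            for codec in (codec_a, codec_b):
                try:
                    decoded.append(raw.decode(codec))
                except UnicodeDecodeError:
                    decoded.append(None)
            if decoded[0] != decoded[1]:
                diffs.append((raw, decoded[0], decoded[1]))
    return diffs


if __name__ == "__main__":
    diffs = differing_sequences()
    print(len(diffs), "two-byte sequences decode differently")
    for raw, a, b in diffs[:10]:
        print(raw.hex(), repr(a), repr(b))
```

Passing "big5hkscs" as one of the codec names gives the same comparison for the HKSCS variant. The count will depend on the Python version, so I would treat the output as indicative only.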

> Additional information:
>
> Several vendor specific charsets that derive from Big5 often use
> the Big5 name instead of a more specific vendor charset name.
> Big5-HKSCS is one example, Microsoft Code Page 950, Big5+ and
> several font specific variations are other examples.

From what I have read in the Wikipedia article, Big5+ seems to be quite
far away from the "average" Big5 variant. I'm not sure I'd list it here.

> Although not authoritative, the following references may also be of
> interest:
>
> Printed mapping table:
> Dr. International "Developing International Software, Second Edition",
> Microsoft Press, ISBN 0-7356-1583-7, 2003, p. 778 and appendixes on CD.
>
> Microsoft windows extended "best fit" behavior:
> http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit950.txt
>
> Again not authoritative, but the Wikipedia article currently touches
> on the many variations of Big5 and may be of interest to implementers:
> http://en.wikipedia.org/wiki/Big-5

I'd personally shorten the text here, e.g. to something like:
"Additional information about the many variants of Big5:"


> The wide variety of existing variations of Big5 may make it
> unsuitable for many modern applications.  Developers should
> consider whether UTF-8 or UTF-16 would be more appropriate for
> new applications.
>
> This is an update of an existing registration of this charset. This
> charset name is in use.
>
> This charset is also known as Windows Code Page 950 or cp950 for
> short; these are NOT aliases.
>
> Person & email address to contact for further information:
>
> Shawn Steele
> Email: Shawn.Steele&microsoft.com
>
> Microsoft Corporation
> One Microsoft Way
> Redmond, WA 98052
> U.S.A.
>
> Intended usage: COMMON

You have "COMMON" here while your Shift_JIS registration has "LIMITED". 
Is that by accident, or is there some rationale behind it?

Regards,    Martin.