
Re: q about gb 2312/gbk



At 10:30 01/08/22 -0700, Markus Scherer wrote:
>Hello, I have two questions about GB* simplified-Chinese charsets:
>
>1. There are two entries for GB 2312:
>1.a)
>Name: GB_2312-80                                        [RFC1345,KXS2]
>MIBenum: 57
>Source: ECMA registry
>Alias: iso-ir-58
>Alias: chinese
>Alias: csISO58GB231280
>
>1.b)
>Name: GB2312  (preferred MIME name)
>MIBenum: 2025
>Source: Chinese for People's Republic of China (PRC) mixed one byte,
>         two byte set:
>           20-7E = one byte ASCII
>           A1-FE = two byte PRC Kanji
>         See GB 2312-80
>         PCL Symbol Set Id: 18C
>Alias: csGB2312
>
>How are they different?
>The second one is clearly the commonly used MBCS charset.
>Is the first one the DBCS-only part for ISO 2022, or is it also MBCS? 
>Please clarify.

The first was very clearly intended as the DBCS-only part.
I'm not sure it is used at all, but then transmitting
DBCS-only data is rather rare anyway.
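
To illustrate the difference, here is a minimal Python sketch. It assumes
that Python's "gb2312" codec implements the mixed one-byte/two-byte form
(1.b), and that the DBCS-only codes of 1.a can be obtained by clearing the
high bit of each byte of a two-byte character. The helper name
to_dbcs_only is made up for this example and is not part of any library.

```python
# Sketch: relate the DBCS-only 94x94 set (GB_2312-80 / iso-ir-58) to the
# mixed one-/two-byte charset labelled GB2312.

def to_dbcs_only(text: str) -> bytes:
    """Encode text as bare two-byte GB 2312-80 codes (bytes 0x21-0x7E).

    Hypothetical helper: it derives the 7-bit codes by clearing the high
    bit of the codec's two-byte output. ASCII has no representation here.
    """
    out = bytearray()
    for ch in text:
        euc = ch.encode("gb2312")  # raises UnicodeEncodeError if unmapped
        if len(euc) != 2:
            raise ValueError(f"{ch!r} is not in the two-byte set")
        out += bytes(b & 0x7F for b in euc)
    return bytes(out)


# Mixed form (1.b): ASCII stays one byte, the Chinese character takes two
# bytes in the A1-FE range.
mixed = "GB 中".encode("gb2312")
print(mixed)        # b'GB \xd6\xd0'

# DBCS-only form (1.a): the same character as two 7-bit bytes.
dbcs = to_dbcs_only("中")
print(dbcs.hex())   # 5650
```

Such bare two-byte data needs some outside agreement (an ISO 2022
designation, for instance) before a receiver can tell it apart from
ASCII, which is one reason it is rarely sent on its own.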

Most registrations that have an iso-ir-xxx alias are in
a very similar situation (of course some of them are
single-byte, but each is still only half of an actual 8-bit 'charset').

In the case of Korean, the situation is unfortunately a bit
more complicated in practice; please see the thread starting at
http://lists.w3.org/Archives/Public/ietf-charsets/2001AprJun/0025.html
for details.



>2. I cannot find a registration for GBK (Microsoft 936).
>Am I just missing it?

If you think something is missing that you need, please submit
a registration.

Regards,   Martin.