Re: q about gb 2312/gbk
At 10:30 01/08/22 -0700, Markus Scherer wrote:
>Hello, I have two questions about GB* simplified-Chinese charsets:
>
>1. There are two entries for GB 2312:
>1.a)
>Name: GB_2312-80 [RFC1345,KXS2]
>MIBenum: 57
>Source: ECMA registry
>Alias: iso-ir-58
>Alias: chinese
>Alias: csISO58GB231280
>
>1.b)
>Name: GB2312 (preferred MIME name)
>MIBenum: 2025
>Source: Chinese for People's Republic of China (PRC) mixed one byte,
> two byte set:
> 20-7E = one byte ASCII
> A1-FE = two byte PRC Kanji
> See GB 2312-80
> PCL Symbol Set Id: 18C
>Alias: csGB2312
>
>How are they different?
>The second one is clearly the commonly used MBCS charset.
>Is the first one the DBCS-only part for ISO 2022, or is it also MBCS?
>Please clarify.
The first was very clearly intended as the DBCS-only part.
I'm not sure it's used at all, but then transmitting DBCS-
only data is rather rare anyway.
Most registrations that have an iso-ir-xxx alias are in
a very similar situation (some of them are single-byte,
of course, but they still cover only half of an actual
8-bit 'charset').
In the case of Korean, the situation is unfortunately a bit
more complicated in practice; please see the thread starting at
http://lists.w3.org/Archives/Public/ietf-charsets/2001AprJun/0025.html
for details.
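
To illustrate the DBCS-only versus MBCS distinction for GB 2312 discussed above, here is a minimal sketch. It assumes that Python's "gb2312" codec implements the MBCS form (ASCII plus two-byte codes in A1-FE), and that the DBCS-only form is the same 94x94 set with the high bit cleared on both bytes (21-7E). The function names are hypothetical and are not part of any registration or library.

```python
# Sketch only: helper names are made up for illustration.
# Assumption: the MBCS form uses byte pairs in A1-FE, and the DBCS-only
# form is the same pair with the high bit cleared (21-7E).

def mbcs_to_dbcs_only(data: bytes) -> bytes:
    """Convert two-byte MBCS GB2312 codes to the 7-bit DBCS-only form."""
    out = bytearray()
    i = 0
    while i < len(data):
        lead = data[i]
        if not 0xA1 <= lead <= 0xFE:
            # One-byte ASCII has no representation in a DBCS-only stream.
            raise ValueError("single-byte data at offset %d" % i)
        if i + 1 >= len(data) or not 0xA1 <= data[i + 1] <= 0xFE:
            raise ValueError("truncated or invalid pair at offset %d" % i)
        out.append(lead & 0x7F)
        out.append(data[i + 1] & 0x7F)
        i += 2
    return bytes(out)


def dbcs_only_to_mbcs(data: bytes) -> bytes:
    """Convert 7-bit DBCS-only codes back to two-byte MBCS GB2312 codes."""
    if len(data) % 2 != 0:
        raise ValueError("DBCS-only data must have an even length")
    for i, b in enumerate(data):
        if not 0x21 <= b <= 0x7E:
            raise ValueError("byte out of 21-7E range at offset %d" % i)
    return bytes(b | 0x80 for b in data)


text = "\u4e2d\u6587"                # two Chinese characters
mbcs = text.encode("gb2312")         # D6 D0 CE C4
dbcs = mbcs_to_dbcs_only(mbcs)       # 56 50 4E 44
assert dbcs_only_to_mbcs(dbcs).decode("gb2312") == text
```

Note that the DBCS-only bytes are indistinguishable from ASCII on their own, which is why that form needs ISO 2022 designation (or some other out-of-band label) to be usable, while the MBCS form can mix with ASCII directly.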
>2. I cannot find a registration for GBK (Microsoft 936).
>Am I just missing it?
If you think something is missing that you need, please submit
a registration.
Regards, Martin.