
Re: For the record



Martin,

As chairman of ISO SC2 and the JIS X 0208 committee, I would like to give
you some information.

At 7:46 PM 97.8.30, Martin J. Dürst wrote:
> Hello everybody,
>
> In the charset policy BOF at the recent IETF meeting in Munich,
> chaired by Harald Alvestrand, he showed a slide with variants
> of Han characters (Kanji) that are unified in Unicode/ISO 10646,
> but which may be problematic. He also showed this list in his
> plenary talk presenting the planned IETF charset policy.
> This list has been published on page 885 (explanatory page 7),
> bottom, of JIS X 0221-1995, the Japanese translation of ISO
> 10646 (explanatory material not contained in the original),
> and probably elsewhere.
>
The JIS description of the unification rules found in JIS X 0221 was based
on a CJK-JRG standing document. CJK-JRG was the committee that did the Han
unification.

The ISO version of the rules, which is an updated version of the JIS
description, is available as Amendment 8 of ISO/IEC 10646-1.

> In the BOF, I commented on this. I said that these were indeed
> mostly character components that turned up in many characters,
> and that a high percentage of them was explicitly unified by
> the new version of the base Japanese Kanji standard,
> JIS X 0208:1997. I mentioned a figure of something like 90
> or 95%, which turns out to be too high if one counts cases,
> but probably correct if one counts the characters affected
> (see below).
>
Since we do not have sufficient information to identify each Kanji from
China, Taiwan, and Korea, it is very difficult to compare the 10646
unification rules with the JIS X 0208:1997 unification rules and to evaluate
compatibility between the two sets of rules. At the time CJK-JRG did the
unification, Japan also could not provide sufficient identification
information on each Kanji.

I do not know about the availability of some of the GB standards. For
example, Dr. Yasuoka of Kyoto University unveiled a mystery behind the GB
standards.

As far as I understand, the CJK-JRG work only used 24-dot fonts, which are
not sufficient for real consideration of unification. Real consideration of
unification rules requires identification information and very high quality
Kanji shape information.

It is obvious that a complex Kanji shape cannot be represented in 24 dots.

During the course of the JIS X 0208 revision, we sometimes used 300-dot
scanned images. For example, the list of variant implementation shapes found
in JIS X 0208:1997, from 401 to 490, is based on 300-dot scanned Kanji
shapes.

> To this, Masataka Ohta strongly protested, saying something
> to the effect that he had been on the committee developing
> that standard. I have now had time to look at JIS X 208:1997
> again. On page 399 (explanatory page 25), it lists the members
> of the two committees involved. On the following page, it gives
> additional acknowledgements. Whatever that may mean, I have
> not been able to find the name Masataka Ohta on these pages.
> [my name turns up at the end of the text on page 400, as one
> of the contributors to the public review done by the committee,
> in the form Duerst, Martin J.]
>
> In the case that I have missed Masataka Ohta's name somewhere
> in JIS X 208:1997, I would like him to give us the exact page,
> and if necessary line number, to verify. In the case he has
> indeed participated, but has for some reason been forgotten,
> I ask the chair of both committees listed on page 399, Prof.
> Shibano, to tell us how Masataka Ohta has been involved.
>
Masataka Ohta is indeed a member of the JIS X 0208 committee and is recorded
as a member of ***WG2***, found in the middle of page 399 of
JIS X 0208:1997, 6 lines below my name.

However, he does not officially represent the JIS committee, and most of his
opinions and interpretations contradict committee positions.

>
>
> Now for the list that Harald has shown. This list has 8 lines,
> with four groups that each contain 2 or three variants.
> For these, I give the item number of Section 6.6.3.2 of JIS
> X 208:1997 (p. 12,...) which gives examples of unification,
> and comments if necessary.
>
The list is not a set of examples but normative rules of unification.

ISO/IEC 10646-1 AMD 8 only lists examples. AMD 8 does not cover the complete
list.

> Note that JIS 208 also contains and lists exceptions, but
> that these are carried over to Unicode/ISO 10646 as being
> separated by the source separation rule.
>
>
> Line 1
> 	case 1 (3 variants)	128 (2 variants, third is
> 					handwriting and not
> 					covered by JIS 208)

The third variant found in JIS X 0221 does not belong to the same font
family. Thus we omitted it. The JIS Kanji Dictionary, which will be published
in November, includes the shape.

> 	case 2 (3 variants)	161 (2 variants, third is
> 					the single-character
> 					shape which is not listed
> 					in JIS 208 section 6.6.3.2)

Basically, this is an error in the first edition of JIS X 0208. The rule
exists mainly for compatibility purposes.

> 	case 3 (3 variants)	153 (JIS 208 lists one more variant)

This rule comes from a well-known Kanji shape design error in the Kangxi
dictionary.

> 	case 4 (3 variants)	155 (2 variants, middle is
> 					the single-character
> 					shape which is not listed
> 					in JIS 208 section 6.6.3.2)

The separation of 61-27 from 16-91 is an error in the first edition of
JIS X 0208.

.
.
.

>
> With all the comments, it's difficult to exactly say what percentage
> this would amount to. But counting each case as one item, it's around
> 66%. If one counts characters affected, and not cases as such, however,
> the percentage is much higher, because the cases with the most characters
> (line 1: case 1, 2, 4; line 8: case 4) all are included in JIS 208.
>
As far as I understand, CJK-JRG did a good job even without sufficient
information on each Kanji and its shape. Even though they based their work on
the explanatory pages of JIS X 0208:1990, ISO/IEC 10646-1 has a better
specification of unification than JIS X 0208:1990.

Regards,

Kohji Shibano

+--Kohji SHIBANO, Professor of Systems Programming---+
| Tokyo International University, shibano@tiu.ac.jp  |
|    Office Tel:+81-492-32-1111, -1119 (fax),        |
+-kshibano@mix.or.jp, Home Tel & Fax:+81-44-954-7337-+