
Unicode support in X11

To: ietf-charsets <ietf-charsets@INNOSOFT.COM>
Subject: Unicode support in X11
From: Borka Jerman-Blazic <jerman-blazic@ijs.si>
Date: Thu, 11 Nov 1993 09:32:27 +0100
Conversion: Prohibited
Resent-message-id: <01H566TGXE4Y8WVYKU@INNOSOFT.COM>
X400-Content-type: P2-1984 (2)
X400-MTS-identifier: [/PRMD=ac/ADMD=mail/C=si/;931111103227]
X400-Originator: jerman-blazic@ijs.si
X400-Received: by mta kanin.arnes.si in /PRMD=ac/ADMD=mail/C=si/; Relayed; Thu,11 Nov 1993 10:33:34 +0100
X400-Received: by /PRMD=ac/ADMD=mail/C=si/; Relayed; Thu,11 Nov 1993 09:32:27 +0100
X400-Recipients: ietf-charsets@innosoft.com


Maybe someone on this list can help these people; the message below is
from the Unicode list.

Regards,
========================================================================
     
        I am trying to do something with X-Windows and Unicode. I have the
     O'Reilly books and I have some idea about how the 16-bit font routines
     and the wc font routines from R5 work. Working from the R5 update
     book and Ken Lunde's Japanese Information Processing Book, I think I
     have to do something like the project at Waseda, which does
     multi-lingual text by using the wc/mbs routines and an all-inclusive
     "locale" file.

We (AIX) have done exactly this in an internal prototype.
We have built a locale and an "Xlocale" that incorporate all of the
ISO8859 family, Japanese, Korean and Chinese charsets.  The locale
uses FSS-UTF as the file code set (i.e. the value returned from
nl_langinfo(CODESET)) and the wide character is a 16-bit wchar_t.
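
As a rough illustration of what "FSS-UTF file code, 16-bit wide character" means in practice, here is a minimal sketch of the multibyte-to-wide step. It is not the AIX code. It assumes FSS-UTF uses the byte layout now standardized as UTF-8, and it rejects anything that does not fit in 16 bits. The names utf_to_wc16 and utf_to_wcs16 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one character from s (at most len bytes) into a 16-bit value.
 * Returns the number of bytes consumed, or 0 if the input is truncated,
 * malformed, overlong, or would need more than 16 bits.
 * Assumption: FSS-UTF byte layout == UTF-8 byte layout. */
size_t utf_to_wc16(const unsigned char *s, size_t len, uint16_t *out)
{
    assert(s != NULL && out != NULL);
    if (len == 0)
        return 0;

    if (s[0] < 0x80) {                       /* 0xxxxxxx: 7 bits */
        *out = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {             /* 110xxxxx 10xxxxxx: 11 bits */
        if (len < 2 || (s[1] & 0xC0) != 0x80)
            return 0;
        uint16_t wc = (uint16_t)(((s[0] & 0x1F) << 6) | (s[1] & 0x3F));
        if (wc < 0x80)                       /* overlong form */
            return 0;
        *out = wc;
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {             /* 1110xxxx + 2 trail bytes: 16 bits */
        if (len < 3 || (s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80)
            return 0;
        uint16_t wc = (uint16_t)(((s[0] & 0x0F) << 12) |
                                 ((s[1] & 0x3F) << 6) |
                                 (s[2] & 0x3F));
        if (wc < 0x800)                      /* overlong form */
            return 0;
        *out = wc;
        return 3;
    }
    return 0;                                /* longer sequences do not fit */
}

/* Decode a whole buffer. Returns the number of 16-bit characters written,
 * or (size_t)-1 on malformed input or if dst (capacity cap) is too small. */
size_t utf_to_wcs16(const unsigned char *s, size_t len,
                    uint16_t *dst, size_t cap)
{
    size_t i = 0, n = 0;
    while (i < len) {
        uint16_t wc;
        size_t used = utf_to_wc16(s + i, len - i, &wc);
        if (used == 0 || n == cap)
            return (size_t)-1;
        dst[n++] = wc;
        i += used;
    }
    return n;
}
```

In a real locale this work would sit behind mbtowc()/mbstowcs(); the sketch only shows the bit manipulation involved.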
     
        Some things I don't understand:
     
     1. How does one write a locale file? For X-windows, not for C. 

Within the X I18N community this is referred to as the "Xlocale" database.
Unfortunately, the sample implementation in X11 release R5
came with two different types of Xlocale.  (We are working to
unify this for R6.)  Depending on which implementation you are using,
you need to look at different formats.  All Xlocale databases, however,
are defined in /usr/lib/X11/nls/*.   If you have the MIT source tape,
you can look at

.../mit/lib/nls/Xsi	- Xlocale database definitions
.../mit/lib/X/Xsi	- source code (methods)
or
.../mit/lib/nls/Ximp	- Xlocale database definitions
.../mit/lib/X/Ximp	- source code (methods)

Each has samples of various formats.  We (AIX) do not use either of these
strictly (we based ours on Xsi but added some other features).
     
     2. How do the wc/mbs routines know which font to use from a font
     set? I'm guessing that it takes a character and checks the fonts
     specified by the font set in order, until it finds one that has
     a glyph for that character. Where is this information?

The basic premise is that each locale is split into multiple "charsets".
Each charset is associated with a specific font encoding.  E.g. for 
ja_JP.eucJP, the locale is split into 3 charsets:

generic name    standard JIS charset    Font charset (glyph encoding)
------------    --------------------    -----------------------------
Romaji          JISX 0201 (GL)          JISX0201.1976-0
Katakana        JISX 0201 (GR)          JISX0201.1976-0
Kanji           JISX 0208               JISX0208.1983-0

So the Xlocale database defines the number of charsets and the
respective font charset to be used.  At XCreateFontSet() time, the
Xlocale database is consulted to match the locale's knowledge of
charsets with what the Xlocale database defines for font encoding.
Xlib then queries the X server for all fonts and runs a matching
algorithm to match the fonts to the Xlocale database definition.
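
To make the charset split concrete, here is a minimal sketch of how an EUC-JP string could be cut into runs, one run per charset, so that each run can be drawn with the font matched to that charset. It is not the Xsi or Ximp code. The names euc_classify, euc_split and struct euc_segment are hypothetical, and the sketch covers only the three charsets discussed here, using the usual EUC-JP byte ranges (single bytes below 0x80, 0x8E plus one byte for Katakana, two bytes in 0xA1..0xFE for Kanji).

```c
#include <assert.h>
#include <stddef.h>

enum euc_charset { CS_ROMAJI, CS_KATAKANA, CS_KANJI, CS_INVALID };

struct euc_segment {
    enum euc_charset cs;   /* charset of this run */
    size_t start;          /* byte offset of the run in the input */
    size_t length;         /* length of the run in bytes */
};

/* Classify the character starting at s; store its byte length in *nbytes. */
enum euc_charset euc_classify(const unsigned char *s, size_t len,
                              size_t *nbytes)
{
    assert(s != NULL && nbytes != NULL);
    if (len == 0) {
        *nbytes = 0;
        return CS_INVALID;
    }
    if (s[0] < 0x80) {                          /* JIS X 0201 GL */
        *nbytes = 1;
        return CS_ROMAJI;
    }
    if (s[0] == 0x8E && len >= 2 &&
        s[1] >= 0xA1 && s[1] <= 0xDF) {         /* SS2 + JIS X 0201 GR */
        *nbytes = 2;
        return CS_KATAKANA;
    }
    if (s[0] >= 0xA1 && s[0] <= 0xFE && len >= 2 &&
        s[1] >= 0xA1 && s[1] <= 0xFE) {         /* JIS X 0208 */
        *nbytes = 2;
        return CS_KANJI;
    }
    *nbytes = 1;                                /* skip one unknown byte */
    return CS_INVALID;
}

/* Split s into runs of a single charset. Returns the number of segments
 * written; stops early if more than max_segs runs would be needed. */
size_t euc_split(const unsigned char *s, size_t len,
                 struct euc_segment *segs, size_t max_segs)
{
    size_t i = 0, n = 0;
    while (i < len) {
        size_t nb;
        enum euc_charset cs = euc_classify(s + i, len - i, &nb);
        if (n > 0 && segs[n - 1].cs == cs) {
            segs[n - 1].length += nb;           /* extend the current run */
        } else {
            if (n == max_segs)
                return n;
            segs[n].cs = cs;
            segs[n].start = i;
            segs[n].length = nb;
            n++;
        }
        i += nb;
    }
    return n;
}
```

A drawing routine would then walk the segments and pick, for each one, the font whose charset the Xlocale database associates with that run.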

The bottom line is that the Xlocale database comes with
a set of methods (refer to .../mit/lib/X/[Ximp|Xsi] to see the code).

I am not sure either of the R5 implementations can handle UTF/Unicode
without some modifications of the code, mainly because of the conversion
needed from the UTF/Unicode encoding to the MIT X font encodings.  So you
need conversions to complete the job.
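
To give an idea of the kind of conversion meant here, the following is a minimal sketch that maps a 16-bit Unicode value to a font charset and a glyph index within that font. It is only an illustration, not R5 code. The name ucs_to_font_index and the enum are hypothetical, and only two ranges that can be converted by simple arithmetic are handled: Latin-1 (where the Unicode value equals the ISO8859-1 code) and the halfwidth Katakana block U+FF61..U+FF9F (which corresponds to JIS X 0201 0xA1..0xDF). Everything else, Kanji in particular, would need lookup tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Font encodings this sketch knows about (hypothetical enum). */
enum font_charset { FONT_ISO8859_1, FONT_JISX0201 };

/* Map a 16-bit Unicode value to (font charset, glyph index).
 * Returns 1 on success, 0 if the character needs a conversion table
 * that this sketch does not provide. */
int ucs_to_font_index(uint16_t ucs, enum font_charset *cs, uint16_t *index)
{
    assert(cs != NULL && index != NULL);

    /* Printable ASCII and the Latin-1 supplement: identity mapping. */
    if ((ucs >= 0x0020 && ucs <= 0x007E) ||
        (ucs >= 0x00A0 && ucs <= 0x00FF)) {
        *cs = FONT_ISO8859_1;
        *index = ucs;
        return 1;
    }

    /* Halfwidth Katakana: fixed offset into the GR half of JIS X 0201. */
    if (ucs >= 0xFF61 && ucs <= 0xFF9F) {
        *cs = FONT_JISX0201;
        *index = (uint16_t)(ucs - 0xFF61 + 0xA1);
        return 1;
    }

    return 0;   /* table-driven conversion would go here */
}
```

The returned index is what would be placed in the string handed to the core drawing routines for the selected font.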
     
     3. The O'Reilly book seems to say that X can handle a context
     dependent language such as Arabic (one where there is more than one
     glyph per character and which character is used depends upon the
     context of the surrounding characters). I can't figure out the details
     of how or where this done, or how one would write one's own context
     routine.  Where is this information?

Don't try.  There is only one function, and all it does is state whether
the XFontSet does "context dependent" drawing.  No one has implemented
such drawing yet.  And even if the flag is set, it is not enough
information to know what is going on.  There is some activity for R6, but
it is not guaranteed to make it.    Read about the
XContextDependent??() function in R5.

We have added both bi-directional and contextual support, but outside of
Xlib.
     
     4. Special bonus question: where can I get foreign-language fonts
     for X-windows? I can't seem to find anything except ISO-8859-1,
     ISO-8859-whatever-Hebrew-is, and Japanese fonts. I would like to find
     other ISO-8859-n fonts, because I think it would be relatively easy to
     rearrange them in a Unicode order.

Not from MIT :-(.   There are several places on the internet (e.g. for
Thai, Korean and Chinese fonts) but I've lost track of where they are.
There is a group within the X Consortium that discusses these questions,
so I suggest you contact them for advice (sorry, I can't give out the
email address).
 
     Tom Fruchterman
     RAF Technology
     tom@raf.com
 
 
 Frank Rojas                              

 AIX Internationalization Architecture     VNET:    AUSTIN(FXROJAS)
 Advanced Workstation and System Division  Tie-line 678-8183  
 IBM, Mail 9652                            Phone:   (512) 838-8183
 Austin, TX 78758                          FAX:     (512) 838-8374
                                     AWD Net: fxrojas@nlsarch.austin.ibm.com
