
plain ASCII version of TSCII charset registration request



Dear Ned and Martin:

As per your suggestion, I am posting below a plain ASCII version
of the TSCII charset registration request. While preparing this document
we have taken into account your comments. Please let us know if there
are still points to be sorted out.

K. Kalyanasundaram
----

Request for the Registration of the Charset TSCII

To:
ietf-charsets@iana.org

Subject:
Registration of new charset

Character set name:
TSCII
(TAMIL SCRIPT CODE FOR INFORMATION INTERCHANGE)

Character set aliases:
None

Suitability for use in MIME text:
YES
usable as 8bit or with base64 or quoted-printable encoding
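
The three transfer options named above can be illustrated with a minimal
Python sketch. It is not part of the registration; the sample bytes are
arbitrary (two upper-half slots that the code chart lists as "ka" and
"ma"), and only the standard-library base64 and quopri modules are used.
With 8bit transport the bytes travel unchanged.

```python
import base64
import quopri

# Arbitrary sample: ASCII text followed by two upper-half TSCII slots
# (0xB8 and 0xC1, listed in the code chart as "ka" and "ma").
raw = b"Tamil " + bytes([0xB8, 0xC1])

# base64: every byte is re-coded into 7-bit-safe characters.
b64 = base64.b64encode(raw)

# quoted-printable: ASCII stays readable, high bytes become =XX escapes.
qp = quopri.encodestring(raw)

# Both encodings are reversible, so the TSCII bytes survive transport.
assert base64.b64decode(b64) == raw
assert quopri.decodestring(qp) == raw
```

Quoted-printable keeps the Roman half legible in transit, which suits
mixed Roman and Tamil text; base64 is more compact when most of the
message is in the upper half.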

Published Specifications:
   http://www.tscii.org/tsciispec.html

ISO 10646 Equivalency Table: 
 Available as a technical note at the Unicode Consortium website
   http://www.unicode.org/notes/tn15/

As a glyph-based encoding, the TSCII code chart includes vowels,
consonants and abugida (compound vowel-consonant) characters. Unicode,
as a character encoding, encodes only vowels and consonants. Hence not
all code points of TSCII can be converted one-to-one to ISO 10646.
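
A small Python sketch of what "not one-to-one" means in practice. The
table below is an illustrative four-entry subset assembled from the
character names in the code chart and the standard Unicode Tamil block;
the name TSCII_TO_UNICODE and the function are hypothetical, and the
authoritative mapping remains the Unicode technical note cited above.

```python
# Illustrative subset only; not the complete or authoritative mapping.
TSCII_TO_UNICODE = {
    0xAB: "\u0B85",        # letter a  -> TAMIL LETTER A
    0xB8: "\u0B95",        # letter ka -> TAMIL LETTER KA
    0xCC: "\u0B95\u0BC1",  # letter ku -> KA + TAMIL VOWEL SIGN U
    0xEC: "\u0B95\u0BCD",  # letter k  -> KA + TAMIL SIGN VIRAMA (pulli)
}


def decode_tscii_subset(data: bytes) -> str:
    """Decode bytes using the subset table; ASCII passes through as-is."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))
        else:
            # Raises KeyError for any slot missing from this small subset.
            out.append(TSCII_TO_UNICODE[b])
    return "".join(out)


# One TSCII byte can expand to two Unicode code points.
assert len(decode_tscii_subset(bytes([0xCC]))) == 2
```

Because one byte may become a sequence, a converter has to map slots to
strings rather than to single code points.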

Intended usage:
COMMON

Additional Information:

Tamil is one of the main Indian languages (Dravidian in origin),
currently spoken by over 70 million people worldwide. TSCII (Tamil
Script Code for Information Interchange) is a bilingual (Roman and
Tamil) 8-bit glyph-based encoding scheme. The TSCII scheme was worked
out collectively through Net-based discussions in 1998. TSCII is
modeled on the ISO-8859-XX family of charsets, with the standard plain
ASCII set filling the 7-bit part (00-7F) and a set of Tamil character
glyphs filling the 8-bit part (80-FF).
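
The two-part layout can be sketched as a simple byte classifier in
Python. The function name and the three labels are hypothetical; the
unassigned slots (A0 and FF) are taken from the code chart in this
document.

```python
# Slots the code chart marks as <NOT ASSIGNED>.
UNASSIGNED = {0xA0, 0xFF}


def classify_slot(byte: int) -> str:
    """Return which part of the TSCII layout a byte value falls in."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("TSCII is an 8-bit charset")
    if byte < 0x80:
        return "ascii"       # lower half: identical to plain ASCII
    if byte in UNASSIGNED:
        return "unassigned"  # no character defined at this slot
    return "upper"           # upper half: Tamil glyphs and a few symbols


assert classify_slot(ord("A")) == "ascii"
assert classify_slot(0xB8) == "upper"
assert classify_slot(0xA0) == "unassigned"
```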

Full technical details on the TSCII charset are available at the TSCII
official website:    http://www.tscii.org/tsciispec.html

Person(s) & email address to contact for further information:

TSCII USER GROUP represented by

Kalyanasundaram, Kuppuswamy (Switzerland)
   kalyan.geo@yahoo.com
Manivannan, Mani (USA)
   mmanivannan@gmail.com
Nedumaran, Muthu (Malaysia)
   muthu@murasu.com

TAMIL SCRIPT CODE FOR INFORMATION INTERCHANGE (TSCII)
Glyph/Character Listing and ISO 10646 Mapping Table

Column #1 is the TSCII code position (in hex),
Column #2 is the TSCII character name

The ISO 10646 mapping table can be obtained as a technical note from the
Unicode Consortium website:
     http://www.unicode.org/notes/tn15

A Unicode-based PDF file that includes the actual glyph forms of all
characters included in the TSCII charset is available at the TSCII website:
  http://www.tscii.org/tsciispec.html

HEX Character Name
00  NULL
01  START OF HEADING
02  START OF TEXT
03  END OF TEXT
04  END OF TRANSMISSION
05  ENQUIRY
06  ACKNOWLEDGE
07  BELL
08  BACKSPACE
09  HORIZONTAL TABULATION
0A  LINE FEED
0B  VERTICAL TABULATION
0C  FORM FEED
0D  CARRIAGE RETURN
0E  SHIFT OUT
0F  SHIFT IN
 
10  DATA LINK ESCAPE
11  DEVICE CONTROL ONE
12  DEVICE CONTROL TWO
13  DEVICE CONTROL THREE
14  DEVICE CONTROL FOUR
15  NEGATIVE ACKNOWLEDGE
16  SYNCHRONOUS IDLE
17  END OF TRANSMISSION BLOCK
18  CANCEL
19  END OF MEDIUM
1A  SUBSTITUTE
1B  ESCAPE
1C  FILE SEPARATOR
1D  GROUP SEPARATOR
1E  RECORD SEPARATOR
1F  UNIT SEPARATOR
 
20  SPACE
21  EXCLAMATION MARK
22  QUOTATION MARK
23  NUMBER SIGN
24  DOLLAR SIGN
25  PERCENT SIGN
26  AMPERSAND
27  APOSTROPHE
28  LEFT PARENTHESIS
29  RIGHT PARENTHESIS
2A  ASTERISK
2B  PLUS SIGN
2C  COMMA
2D  HYPHEN-MINUS
2E  FULL STOP
2F  SOLIDUS

30  DIGIT ZERO
31  DIGIT ONE
32  DIGIT TWO
33  DIGIT THREE
34  DIGIT FOUR
35  DIGIT FIVE
36  DIGIT SIX
37  DIGIT SEVEN
38  DIGIT EIGHT
39  DIGIT NINE
3A  COLON
3B  SEMICOLON
3C  LESS-THAN SIGN
3D  EQUALS SIGN
3E  GREATER-THAN SIGN
3F  QUESTION MARK
 
40  COMMERCIAL AT
41  LATIN CAPITAL LETTER A
42  LATIN CAPITAL LETTER B
43  LATIN CAPITAL LETTER C
44  LATIN CAPITAL LETTER D
45  LATIN CAPITAL LETTER E
46  LATIN CAPITAL LETTER F
47  LATIN CAPITAL LETTER G
48  LATIN CAPITAL LETTER H
49  LATIN CAPITAL LETTER I
4A  LATIN CAPITAL LETTER J
4B  LATIN CAPITAL LETTER K
4C  LATIN CAPITAL LETTER L
4D  LATIN CAPITAL LETTER M
4E  LATIN CAPITAL LETTER N
4F  LATIN CAPITAL LETTER O

50  LATIN CAPITAL LETTER P
51  LATIN CAPITAL LETTER Q
52  LATIN CAPITAL LETTER R
53  LATIN CAPITAL LETTER S
54  LATIN CAPITAL LETTER T
55  LATIN CAPITAL LETTER U
56  LATIN CAPITAL LETTER V
57  LATIN CAPITAL LETTER W
58  LATIN CAPITAL LETTER X
59  LATIN CAPITAL LETTER Y
5A  LATIN CAPITAL LETTER Z
5B  LEFT SQUARE BRACKET
5C  REVERSE SOLIDUS
5D  RIGHT SQUARE BRACKET
5E  CIRCUMFLEX ACCENT
5F  LOW LINE

60  GRAVE ACCENT
61  LATIN SMALL LETTER A
62  LATIN SMALL LETTER B
63  LATIN SMALL LETTER C
64  LATIN SMALL LETTER D
65  LATIN SMALL LETTER E
66  LATIN SMALL LETTER F
67  LATIN SMALL LETTER G
68  LATIN SMALL LETTER H
69  LATIN SMALL LETTER I
6A  LATIN SMALL LETTER J
6B  LATIN SMALL LETTER K
6C  LATIN SMALL LETTER L
6D  LATIN SMALL LETTER M
6E  LATIN SMALL LETTER N
6F  LATIN SMALL LETTER O

70  LATIN SMALL LETTER P
71  LATIN SMALL LETTER Q
72  LATIN SMALL LETTER R
73  LATIN SMALL LETTER S
74  LATIN SMALL LETTER T
75  LATIN SMALL LETTER U
76  LATIN SMALL LETTER V
77  LATIN SMALL LETTER W
78  LATIN SMALL LETTER X
79  LATIN SMALL LETTER Y
7A  LATIN SMALL LETTER Z
7B  LEFT CURLY BRACKET
7C  VERTICAL LINE
7D  RIGHT CURLY BRACKET
7E  TILDE
7F  DELETE

80  TAMIL DIGIT CUZHI = Tamil digit zero
81  TAMIL DIGIT ONRRU = Tamil digit one
82  TAMIL GRANTHA LETTER SRI = Tamil letter sri
83  TAMIL GRANTHA LETTER JA = Tamil letter ja
84  TAMIL GRANTHA LETTER SSA = Tamil letter ssa
85  TAMIL GRANTHA LETTER SA = Tamil letter sa
86  TAMIL GRANTHA LETTER HA = Tamil letter ha
87  TAMIL GRANTHA LETTER KSHA = Tamil letter ksha
88  TAMIL GRANTHA LETTER J = Tamil letter j
89  TAMIL GRANTHA LETTER SS = Tamil letter ss
8A  TAMIL GRANTHA LETTER S = Tamil letter s
8B  TAMIL GRANTHA LETTER H = Tamil letter h
8C  TAMIL GRANTHA LETTER KSH = Tamil letter ksh
8D  TAMIL DIGIT IRANNNTU = Tamil digit two
8E  TAMIL DIGIT MUUNNRRU = Tamil digit three
8F  TAMIL DIGIT NAANNKU = Tamil digit four

90  TAMIL DIGIT AINTHU = Tamil digit five
91  LEFT SINGLE QUOTATION MARK
92  RIGHT SINGLE QUOTATION MARK
93  LEFT DOUBLE QUOTATION MARK
94  RIGHT DOUBLE QUOTATION MARK
95  TAMIL DIGIT AARRU = Tamil digit six
96  TAMIL DIGIT EEZHU = Tamil digit seven
97  TAMIL DIGIT ETTU = Tamil digit eight
98  TAMIL DIGIT ONPATHU = Tamil digit nine
99  TAMIL LETTER NGAKARA UKARAM = Tamil letter ngu
9A  TAMIL LETTER NJAKARA UKARAM = Tamil letter nju
9B  TAMIL LETTER NGAKARA UUKAARAM = Tamil letter nguu
9C  TAMIL LETTER NJAKARA UUKAARAM = Tamil letter njuu
9D  TAMIL NUMBER PATHTHU = Tamil number ten
9E  TAMIL NUMBER NUURRU = Tamil number one hundred
9F  TAMIL NUMBER AAYIRAM = Tamil number one thousand

A0  <NOT ASSIGNED>
A1  TAMIL VOWEL SIGN KAAL = Tamil vowel sign aa
A2  TAMIL VOWEL SIGN KOKKI = Tamil vowel sign i
A3  TAMIL VOWEL SIGN CUZHI-K-KOKKI = Tamil vowel sign ii
A4  TAMIL VOWEL SIGN KONNNTAI = Tamil vowel sign u
A5  TAMIL VOWEL SIGN CUZHIK KONNNTAI = Tamil vowel sign uu
A6  TAMIL VOWEL SIGN KOMPU = Tamil vowel sign e
A7  TAMIL VOWEL SIGN IRATTAI-K-KOMPU = Tamil vowel sign ee
A8  TAMIL VOWEL SIGN IRATTAI-C-CUZHI = Tamil vowel sign ai
A9  COPYRIGHT SIGN
AA  TAMIL VOWEL SIGN CIRRAKU = Tamil au length mark
AB  TAMIL LETTER AKARAM = Tamil letter a
AC  TAMIL LETTER AAKAARAM = Tamil letter aa
AD  TAMIL VOWEL IKARAM (USAGE IN SLOT DEPRECATED) = Tamil letter i
AE  TAMIL LETTER IIKAARAM = Tamil letter ii
AF  TAMIL LETTER UKARAM = Tamil letter u
 
B0  TAMIL LETTER UUKAARAM = Tamil letter uu
B1  TAMIL LETTER EKARAM = Tamil letter e
B2  TAMIL LETTER EEKAARAM = Tamil letter ee
B3  TAMIL LETTER AIKAARAM = Tamil letter ai
B4  TAMIL LETTER OKARAM = Tamil letter o
B5  TAMIL LETTER OOKAARAM = Tamil letter oo
B6  TAMIL LETTER AUKAARAM = Tamil letter au
B7  TAMIL AAYTHAM LETTER AKHENAM or AKHAAN = Tamil letter aaytham
B8  TAMIL LETTER KAKARA AKARAM = Tamil letter ka
B9  TAMIL LETTER NGAKARA AKARAM = Tamil letter nga
BA  TAMIL LETTER CAKARA AKARAM = Tamil letter ca
BB  TAMIL LETTER NJAKARA AKARAM = Tamil letter nja
BC  TAMIL LETTER TAKARA AKARAM = Tamil letter tta
BD  TAMIL LETTER NNNAKARA AKARAM = Tamil letter nnna
BE  TAMIL LETTER THAKARA AKARAM = Tamil letter ta
BF  TAMIL LETTER NAKARA AKARAM = Tamil letter na

C0  TAMIL LETTER PAKARA AKARAM = Tamil letter pa
C1  TAMIL LETTER MAKARA AKARAM = Tamil letter ma
C2  TAMIL LETTER YAKARA AKARAM = Tamil letter ya
C3  TAMIL LETTER RAKARA AKARAM = Tamil letter ra
C4  TAMIL LETTER LAKARA AKARAM = Tamil letter la
C5  TAMIL LETTER VAKARA AKARAM = Tamil letter va
C6  TAMIL LETTER ZHAKARA AKARAM = Tamil letter llla
C7  TAMIL LETTER LLAKARA AKARAM = Tamil letter lla
C8  TAMIL LETTER RRAKARA AKARAM = Tamil letter rra
C9  TAMIL LETTER NNAKARA AKARAM = Tamil letter nna
CA  TAMIL LETTER TAKARA IKARAM = Tamil letter tti
CB  TAMIL LETTER TAKARA IIKAARAM = Tamil letter ttii
CC  TAMIL LETTER KAKARA UKARAM = Tamil letter ku
CD  TAMIL LETTER CAKARA UKARAM = Tamil letter cu
CE  TAMIL LETTER TAKARA UKARAM = Tamil letter ttu
CF  TAMIL LETTER NNNAKARA UKARAM = Tamil letter nnnu

D0  TAMIL LETTER THAKARA UKARAM = Tamil letter tu
D1  TAMIL LETTER NAKARA UKARAM = Tamil letter nu
D2  TAMIL LETTER PAKARA UKARAM = Tamil letter pu
D3  TAMIL LETTER MAKARA UKARAM = Tamil letter mu
D4  TAMIL LETTER YAKARA UKARAM = Tamil letter yu
D5  TAMIL LETTER RAKARA UKARAM = Tamil letter ru
D6  TAMIL LETTER LAKARA UKARAM = Tamil letter lu
D7  TAMIL LETTER VAKARA UKARAM = Tamil letter vu
D8  TAMIL LETTER ZHAKARA UKARAM = Tamil letter lllu
D9  TAMIL LETTER LLAKARA UKARAM = Tamil letter llu
DA  TAMIL LETTER RRAKARA UKARAM = Tamil letter rru
DB  TAMIL LETTER NNAKARA UKARAM = Tamil letter nnu
DC  TAMIL LETTER KAKARA UUKAARAM = Tamil letter kuu
DD  TAMIL LETTER CAKARA UUKAARAM = Tamil letter cuu
DE  TAMIL LETTER TAKARA UUKAARAM = Tamil letter ttuu
DF  TAMIL LETTER NNNAKARA UUKAARAM = Tamil letter nnnuu

E0  TAMIL LETTER THAKARA UUKAARAM = Tamil letter tuu
E1  TAMIL LETTER NAKARA UUKAARAM = Tamil letter nuu
E2  TAMIL LETTER PAKARA UUKAARAM = Tamil letter puu
E3  TAMIL LETTER MAKARA UUKAARAM = Tamil letter muu
E4  TAMIL LETTER YAKARA UUKAARAM = Tamil letter yuu
E5  TAMIL LETTER RAKARA UUKAARAM = Tamil letter ruu
E6  TAMIL LETTER LAKARA UUKAARAM = Tamil letter luu
E7  TAMIL LETTER VAKARA UUKAARAM = Tamil letter vuu
E8  TAMIL LETTER ZHAKARA UUKAARAM = Tamil letter llluu
E9  TAMIL LETTER LLAKARA UUKAARAM = Tamil letter lluu
EA  TAMIL LETTER RRAKARA UUKAARAM = Tamil letter rruu
EB  TAMIL LETTER NNAKARA UUKAARAM = Tamil letter nnuu
EC  TAMIL LETTER KAKARAM = Tamil letter k
ED  TAMIL LETTER NGAKARAM = Tamil letter ng
EE  TAMIL LETTER CAKARAM = Tamil letter c
EF  TAMIL LETTER NJAKARAM = Tamil letter nj

F0  TAMIL LETTER TAKARAM = Tamil letter tt
F1  TAMIL LETTER NNNAKARAM = Tamil letter nnn
F2  TAMIL LETTER THAKARAM = Tamil letter t
F3  TAMIL LETTER NAKARAM = Tamil letter n
F4  TAMIL LETTER PAKARAM = Tamil letter p
F5  TAMIL LETTER MAKARAM = Tamil letter m
F6  TAMIL LETTER YAKARAM = Tamil letter y
F7  TAMIL LETTER RAKARAM = Tamil letter r
F8  TAMIL LETTER LAKARAM = Tamil letter l
F9  TAMIL LETTER VAKARAM = Tamil letter v
FA  TAMIL LETTER ZHAKARAM = Tamil letter lll
FB  TAMIL LETTER LLAKARAM = Tamil letter ll
FC  TAMIL LETTER RRAKARAM = Tamil letter rr
FD  TAMIL LETTER NNAKARAM = Tamil letter nn
FE  TAMIL LETTER IKARAM = Tamil letter i
FF  <NOT ASSIGNED>

NOTES:

i) The third vowel "i" is included in slots AD and FE, but usage of "i"
at slot AD is deprecated. The glyph is kept at slot AD for rendering
legacy data and to enable conversion to other encodings. Text
converters to other encodings should attempt to determine which slot is
used in the text for ikaram before converting.
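
A minimal Python sketch of the pre-conversion step described in this
note, assuming the slots given in the code chart (AD deprecated, FE
current). The function and constant names are hypothetical.

```python
DEPRECATED_IKARAM = 0xAD  # legacy slot for the vowel "i"
CURRENT_IKARAM = 0xFE     # slot to use for the vowel "i"


def uses_deprecated_ikaram(data: bytes) -> bool:
    """Report whether the text contains ikaram at the deprecated slot."""
    return DEPRECATED_IKARAM in data


def normalize_ikaram(data: bytes) -> bytes:
    """Rewrite every ikaram at slot AD to slot FE; other bytes unchanged."""
    return data.replace(bytes([DEPRECATED_IKARAM]), bytes([CURRENT_IKARAM]))


legacy = bytes([0xAD, 0xB8])
assert uses_deprecated_ikaram(legacy)
assert normalize_ikaram(legacy) == bytes([0xFE, 0xB8])
```

Normalizing first means a converter needs only one table entry for
ikaram.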

ii) Though the ukara and uukara modifiers (at slots A4 and A5) are
indicated as "TAMIL VOWEL SIGN U" and "TAMIL VOWEL SIGN UU"
respectively, their usage is permitted only with the grantha letters.
The entire ukara and uukara uyirmey series are encoded directly in TSCII
and they alone are to be used to render these uyirmeys.
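
The restriction in this note could be checked with a sketch like the
following Python function. It assumes the vowel signs sit at slots A4
and A5 as in the code chart, that a sign follows its base letter in the
byte stream, and that the permitted bases are the grantha letters at
slots 83-87 (ja, ssa, sa, ha, ksha). These assumptions and all names
are illustrative, not taken from the specification.

```python
UKARA_SIGNS = {0xA4, 0xA5}              # vowel signs u and uu
GRANTHA_BASES = set(range(0x83, 0x88))  # slots 83..87: ja, ssa, sa, ha, ksha


def misplaced_ukara_signs(data: bytes) -> list[int]:
    """Return offsets of u/uu signs not directly preceded by a grantha base."""
    bad = []
    for i, b in enumerate(data):
        if b in UKARA_SIGNS and (i == 0 or data[i - 1] not in GRANTHA_BASES):
            bad.append(i)
    return bad


# Grantha "ja" + sign u is accepted; "ka" + sign u is flagged,
# because "ku" has its own precomposed slot (CC).
assert misplaced_ukara_signs(bytes([0x83, 0xA4])) == []
assert misplaced_ukara_signs(bytes([0xB8, 0xA4])) == [1]
```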

iii) Tamil numerals 0-9 are indicated as "TAMIL DIGITS" while Tamil
numerals 10, 100 and 1000 are indicated as "TAMIL NUMBERS". This is to
recognize the fact that Tamil numerals are used in two different
systems (a decimal system, as in Arabic, using digits 0-9, and an
additive-positional system that also uses the numerals 10, 100 and 1000).
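
The two systems can be contrasted with a short Python sketch that
writes the same value both ways as TSCII bytes. The slot values come
from the code chart; the rule that a multiplier of one is left
unwritten before 10, 100 and 1000 is an assumption about the
traditional notation, and the function names are hypothetical.

```python
# Digit slots 0-9 and the number signs for 10, 100, 1000 (from the chart).
DIGITS = {0: 0x80, 1: 0x81, 2: 0x8D, 3: 0x8E, 4: 0x8F,
          5: 0x90, 6: 0x95, 7: 0x96, 8: 0x97, 9: 0x98}
TEN, HUNDRED, THOUSAND = 0x9D, 0x9E, 0x9F


def decimal_form(n: int) -> bytes:
    """Positional decimal notation: one digit slot per decimal digit."""
    if n < 0:
        raise ValueError("non-negative integers only")
    return bytes(DIGITS[int(d)] for d in str(n))


def traditional_form(n: int) -> bytes:
    """Additive notation using the 10/100/1000 signs, for 1..9999."""
    if not 1 <= n <= 9999:
        raise ValueError("sketch covers 1..9999 only")
    out = bytearray()
    for unit, sign in ((1000, THOUSAND), (100, HUNDRED), (10, TEN)):
        count, n = divmod(n, unit)
        if count:
            if count > 1:  # assumption: a multiplier of one is not written
                out.append(DIGITS[count])
            out.append(sign)
    if n:
        out.append(DIGITS[n])
    return bytes(out)


# 25 as "2 5" versus "2 ten 5".
assert decimal_form(25) == bytes([0x8D, 0x90])
assert traditional_form(25) == bytes([0x8D, 0x9D, 0x90])
```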

iv) TAMIL LETTER AAYTHAM at slot B7 is a dependent letter, though in
Unicode 4.1 it is listed differently, as TAMIL SIGN VISARGA. (In Tamil
grammar this aaytham letter is called "caarpu ezuttu".)

Acknowledgment:
The TSCII user group would like to acknowledge the help of the following
persons in the preparation of this TSCII specification document: Mr.
S. Kaviarasan (USA), Mr. Ravindran Paul (Malaysia), Mr. Doddannan
Sivaraj (India), Dr. RM. Krishnan (India), Dr. Kumar Mallikarjunan
(USA) and Mr. Sinnathurai Srivas (UK).