BOF minutes
- To: ietf-charsets <ietf-charsets@INNOSOFT.COM>
- Subject: BOF minutes
- From: Borka Jerman-Blazic <jerman-blazic@ijs.si>
- Date: Tue, 10 Aug 1993 22:13:46 +0200
- Conversion: Prohibited
- Resent-message-id: <01H1KY6FYF6Q984UZO@INNOSOFT.COM>
- X400-Content-type: P2-1984 (2)
- X400-MTS-identifier: [/PRMD=ac/ADMD=mail/C=si/;930810221346]
- X400-Originator: jerman-blazic@ijs.si
- X400-Received: by mta kanin.arnes.si in /PRMD=ac/ADMD=mail/C=si/; Relayed; Tue,10 Aug 1993 22:14:13 +0200
- X400-Received: by /PRMD=ac/ADMD=mail/C=si/; Relayed; Tue,10 Aug 1993 22:13:46 +0200
- X400-Recipients: ietf-charsets@innosoft.com
The BOF minutes were mailed to Erik and to the RARE WG-CHAR list. I have been
away from my office for more than three weeks and read some of my mail only
today. I noticed some arguing about the BOF summary, which is why I am
resending the minutes to this list.
Regards,
Borka
==================
Delivery-date: Saturday, July 17, 1993 at 13:11 GMT+0200
From:Borka Jerman-Blazic <S=jerman-blazic;O=ijs;P=ac;A=mail;C=si>
To: <S=huizer;G=erik;O=surfnet;P=surf;A=400net;C=nl>
Message-ID:inbox:22
Subject:BOF Minutes
Please find enclosed the BOF Minutes,
Cheers,
Borka
17-Jul-93
Minutes of the UCS BOF
The BOF took place at the RAI, Amsterdam, during the 27th IETF, on July 14,
16:00 to 18:00.
The BOF was chaired by Borka Jerman-Blazic. The list of attendees will be
included.
A brief introductory tutorial was given by Borka Jerman-Blazic. She
pointed out some of the problems that appear on the network due to
the lack of support for the national character sets used for input, output,
processing and display of text written in languages used all over the
world. She stressed the need for proper maintenance of
character integrity over the network. The requirement for processing and
interchanging different character sets correctly is especially relevant
for some Internet services dealing with names of persons or
organizations.
Peter Svanberg gave a short overview of the level of support for
non-ASCII character sets in different Internet protocols. Some of the
protocols were identified as hostile to 8-bit characters, among them
DNS, SMTP, FTP, NNTP, WAIS, MIME Text/Enhanced, NFS, AFS, Whois, URN,
Gopher etc. More recently developed protocols such as MIME part 1
and part 2, as well as some ongoing projects such as Whois++
(as mentioned by Simon Spero), support 16-bit coding and the
repertoires provided by such coding. He also mentioned that several
IETF groups developing new protocols/services are taking into account
the importance of proper support for character sets.
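As an illustration of what "hostile to 8-bit characters" can mean in practice, the sketch below simulates one possible failure mode: a transport that clears the eighth bit of every octet. This is an assumption made for illustration only; the minutes do not say how each listed protocol fails, and the function name is hypothetical.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Simulate a 7-bit channel that clears the top bit of every octet."""
    return bytes(b & 0x7F for b in data)


# "cafe" with e-acute (U+00E9), encoded in ISO 8859-1 (Latin-1).
original = "caf\u00e9".encode("latin-1")   # last octet is 0xE9
damaged = strip_high_bit(original)         # 0xE9 & 0x7F == 0x69, i.e. "i"

print(original, "->", damaged)
```

The accented letter silently turns into a different ASCII letter, so the receiver cannot even detect that the character was damaged.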
The next speaker was Mr. Masataka Ohta. He presented his view on
the idea that an international universal coding system be recommended
for use over the Internet. He identified 5 properties that the
recommended coding system is required to have. These are:
- Identity for encoding and decoding, which he understands as a unique
mapping between a particular graphic character and its code (bit
combination),
- Causality, understood as independence of a processed coded character
from the other incoming characters in the data stream,
- Finite State Recognition, state dependence of the code required for
presentation/display of multi-octet coded data,
- Finite resynchronizability, which means that the state of the automaton
can be determined uniquely by reading a fixed finite number of octets,
- Equality, the requirement that a character coded with different coding
systems can always be recognized as the same character.
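The finite resynchronizability property can be illustrated at the octet level. The sketch below is not from the BOF: it assumes present-day UTF-8 (the descendant of the UTF-2 encoding mentioned in these minutes) as the example encoding, and the function name is hypothetical. It shows only that a reader dropped into the middle of such a stream can find a character boundary after looking at a bounded number of octets; it says nothing about Mr. Ohta's analysis of ISO 10646 itself.

```python
def resync(data: bytes, pos: int) -> int:
    """Return the index of the first octet of the character containing data[pos].

    Assumes well-formed UTF-8, where continuation octets have the bit
    pattern 10xxxxxx, so at most three backward steps are needed.
    """
    start = pos
    while start > 0 and (data[start] & 0xC0) == 0x80:
        start -= 1
    return start


# "a", c-with-caron (2 octets), "b", euro sign (3 octets)
stream = "a\u010db\u20ac".encode("utf-8")

print(resync(stream, 2))  # inside the 2-octet character
print(resync(stream, 6))  # last octet of the 3-octet character
```

A stateful scheme in which an earlier escape sequence changes the meaning of all later octets has no such bound, because the reader may have to scan arbitrarily far back to recover the state.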
Mr. Ohta looked for the required properties in ISO 10646 and found
that Causality and Finite resynchronizability are not satisfied.
Equality is not yet worked out. He proposed an extension to the
existing UCS code system consisting of 5 additional bits, which would
enable the deficiencies of the UCS coding system to be overcome. The
discussion showed that the proposed solution is not in the general
stream of the development of standard character set codes and
their applications in computing systems. One of the possible
solutions to the problems identified by Mr. Ohta could be the use of
the whole model of UCS, i.e. the 4 envisaged octets which define, besides
the cell and row position for a character in the Basic Multilingual
Plane of ISO 10646, additional planes and groups. There was a proposal
that the required 5 additional bits be coded as a private plane in the
UCS scheme. John Klensin noted that such an approach could clash with
the reassignment of such a plane in the further standardization process
of ISO JTC1/SC2. In the discussion the problem of handling
bidirectional text was also identified.
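The four-octet model referred to above can be sketched as a simple decomposition of a code position into group, plane, row and cell octets. This is a minimal illustration assuming the 31-bit UCS-4 form; the function name is hypothetical and the sketch does not represent the 5-bit extension proposed at the BOF.

```python
from typing import Tuple


def ucs4_octets(code_position: int) -> Tuple[int, int, int, int]:
    """Split a UCS code position into (group, plane, row, cell) octets."""
    if not 0 <= code_position <= 0x7FFFFFFF:
        raise ValueError("outside the 31-bit UCS-4 code space")
    group = (code_position >> 24) & 0x7F
    plane = (code_position >> 16) & 0xFF
    row = (code_position >> 8) & 0xFF
    cell = code_position & 0xFF
    return group, plane, row, cell


print(ucs4_octets(0x0041))   # LATIN CAPITAL LETTER A
print(ucs4_octets(0x010D))   # LATIN SMALL LETTER C WITH CARON
```

Characters of the first plane have group and plane octets of zero, so the two-octet form is just the row and cell octets of this four-octet form.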
Harald Alvestrand pointed out that what is happening now is a sort of
transition period between 8-bit coding and the 16-bit coding provided by
UCS. Another parallel stream for support of different national character
sets is "character switching", which is enabled by use of the
code extension technique of ISO 2022. It was obvious that this scheme
is not of practical use for the Internet except for special cases, e.g. the
Japanese e-mail solution.
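Character switching of this kind can be observed with the ISO-2022-JP codec in Python's standard library, used here as a stand-in for the Japanese e-mail solution mentioned above (that mapping is an assumption of this sketch, not a statement from the minutes). Escape sequences switch the meaning of the following 7-bit octets between ASCII and a Japanese character set.

```python
# "abc " followed by three kanji (U+65E5 U+672C U+8A9E, "nihongo").
text = "abc \u65e5\u672c\u8a9e"
encoded = text.encode("iso2022_jp")

# ESC $ B switches to the two-octet Japanese set, ESC ( B back to ASCII.
print(encoded)

# Every octet stays below 0x80, so the data survives 7-bit transports.
seven_bit_clean = all(octet < 0x80 for octet in encoded)
print(seven_bit_clean)
```

The price of staying 7-bit clean is statefulness: the octets between the two escape sequences cannot be interpreted without knowing which set is currently selected.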
The BOF then discussed the possible work items if the IESG approves the
formation of a working group. The chair identified several papers
which are Internet drafts dealing with character set problems,
such as: RFC 1345, "X400 use of the international character sets",
"Character Sets and Languages". Other items were discussed and
proposed by the BOF attendees. They are summarized below. John
Klensin pointed out that special precautions have to be taken in the
recommendation of UTF-2 as a data interchange method over the Internet
in connection with the possible assignment of additional coding
planes by JTC1/SC2. He also recommended the use of mailing lists
already working within the IETF. They are: <ietf-charsets@innosoft.com>
and two others working on mailing issues (822ext and 821).
In summary, the BOF decided to propose that the IESG consider the
possibility of setting up a working group to work on the following
work items:
- a document defining how UCS can be used in a uniform way in
Internet protocols, especially taking into consideration the UTF-2
encoding of UCS. The document will provide guidance to other
protocols which have to deal with these items over the Internet,
- a document identifying the languages and the characters required for
coding text written in a particular natural language (a sort of
guidelines for services dealing with multilinguality, such as NIR
services based on the use of plain text),
- a document defining a tool for coded character set conversion to be
provided within some services such as an e-mail user agent, including
fall-back representation of incoming characters that are outside the
supported character repertoire of the receiver,
- a proposal for extending the mandatory issues which have to be
covered in the RFC standardization process to include character set
considerations/support.
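The fall-back representation mentioned in the third work item could look roughly like the sketch below. It is only one possible approach and is not taken from the BOF: characters outside the receiver's repertoire are replaced by their base letter where a decomposition exists, and by a placeholder otherwise. The function name, the default repertoire and the placeholder are all hypothetical.

```python
import unicodedata


def to_repertoire(text: str, charset: str = "ascii", placeholder: str = "?") -> str:
    """Map text onto the repertoire of `charset`, using fall-backs where needed."""
    out = []
    for ch in text:
        try:
            ch.encode(charset)          # character is in the repertoire
            out.append(ch)
            continue
        except UnicodeEncodeError:
            pass
        # Fall-back: decompose and drop combining marks (c-caron -> c).
        base = "".join(
            c for c in unicodedata.normalize("NFKD", ch)
            if not unicodedata.combining(c)
        )
        try:
            base.encode(charset)
            out.append(base or placeholder)
        except UnicodeEncodeError:
            out.append(placeholder)     # no usable fall-back
    return "".join(out)


print(to_repertoire("\u010ce\u0161tina"))   # Czech word with two carons
print(to_repertoire("\u65e5"))              # a kanji has no ASCII fall-back
```

A real conversion tool would need per-language fall-back tables, since dropping diacritics is not an acceptable substitution in every language.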