[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

ISO-8859-10; registration of new charset values; error in MIME draft



(Replies to this messages should be directed to
ietf-types@uninett.no only.)

Legend:
> quote from RFC 1521
: quote from draft-ietf-822ext-mime-reg-03.txt
% quote from ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets
/ quote from ISO/IEC 8859-10:1992(E)

RFC 1521 asks for IANA registration of values of the "charset"
parameter:

> 7.1.1.     The charset parameter
 
>    An initial list of predefined character set names can be found at the
>    end of this section.  Additional character sets may be registered
>    with IANA, although the standardization of their use requires the
>    usual IESG [RFC-1340] review and approval.  Note that if the

> Appendix E -- IANA Registration Procedures
 
>    MIME has been carefully designed to have extensible mechanisms, and
>    it is expected that the set of content-type/subtype pairs and their
>    associated parameters will grow significantly with time.  Several
>    other MIME fields, notably character set names, access-type
>    parameters for the message/external-body type, and possibly even
>    Content-Transfer-Encoding values, are likely to have new values
>    defined over time.  In order to ensure that the set of such values is

No registration procedure for character sets is specified in
RFC 1521, though.

The current Internet Draft draft-ietf-822ext-mime-reg-03.txt,
"Multipurpose Internet Mail Extensions (MIME) Part Four:
Registration Procedures", says:

: Registration of character sets for use in MIME is covered
: elsewhere and is no longer addressed by this document.

Which document cover the registration of character sets, then?

The IANA register file at
< ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets >
states:

% These are the official names for character sets that may be used in
% the Internet and may be referred to in Internet documentation.

Many of the registered names have been taken from the
informational RFC 1345. Is it necessary to write an RFC to get a
new character set registered? Is it sufficient to do that?

I'm asking these questions, since I've noticed that part 10 of
ISO 8859 is not included in the IANA registry. That standard was
published in 1992:

   ISO/IEC 8859-10:1992(E)
   Information technology -- 8-bit single-byte coded graphic
      character sets -- Part 10: Latin alphabet No. 6
   International Organization for Standardization, 1992-12-15

<From the Scope:

/ This [coded character] set is suited for multiple-language
/ applications involving Danish, English, Estonian, Finnish,
/ German, Greenlandic, Icelandic, Sami (Lappish), Latvian,
/ Lithuanian, Norwegian, Faroese, and Swedish.

RFC 1521 _can_ be read as allowing the use of the "ISO-8859-10"
value, even without it being included into the IANA registry:

> 7.1.1.     The charset parameter

>    An initial list of predefined character set names can be found at the
>    end of this section.  Additional character sets may be registered

>    The defined charset values are:
>  
>    US-ASCII -- as defined in [US-ASCII].
>  
>         ISO-8859-X -- where "X" is to be replaced, as necessary, for the
>              parts of ISO-8859 [ISO-8859].  Note that the ISO 646
>              character sets have deliberately been omitted in favor of
>              their 8859 replacements, which are the designated character
>              sets for Internet mail.  As of the publication of this
>              document, the legitimate values for "X" are the digits 1
>              through 9.
 
>    No other character set name may be used in Internet mail without the
>    publication of a formal specification and its registration with IANA,

The statement about legitimate values for "X" at the time of the
publication of RFC 1521 is false. It was published in September
1993, when the ISO 8859-10 stadnard was 9 months old.
Unfortunately, this statement hasn't yet been changed in the
latest Internet Draft draft-ietf-822ext-mime-imt-04.txt. (I'm
sure this is due to oversight, not to ignorance.)

Let me also point out that four new parts of ISO 8859 are in the
ISO pipeline:

ISO 8859-11: Latin/Thai alphabet
ISO 8859-12: Latin/Devanagari alphabet
ISO 8859-13: Latin alphabet No. 7 (Baltic Rim)
ISO 8859-14: Latin alphabet No. 8 (Celtic)

It's possible that Latin/Devanagari and Latin (Celtic) will
appear interchanged in the final part numbering.

I would appreciate any clarification of these questions about
IETF character set registration.

-- 
Olle Jarnefors, Royal Institute of Technology (KTH) <ojarnef@admin.kth.se>