[ietf-charsets] Recent charset additions and issues
Several recent additions to the charset registry illustrate a number
of issues. The specific entries I refer to are:
Name: Amiga-1251
MIBenum: 2104
Source: See (http://www.amiga.ultranet.ru/Amiga-1251.html)
Alias: Ami1251
Alias: Amiga1251
Alias: Ami-1251
(Aliases are provided for historical reasons and should not be used)
Name: KOI7-switched
MIBenum: 2105
Source: See <http://www.iana.org/assignments/charset-reg/KOI7-switched>
Aliases: None
Name: OSD_EBCDIC_DF04_15
MIBenum: 115
Source: Fujitsu-Siemens standard mainframe EBCDIC encoding
Please see: <http://www.iana.org/assignments/charset-reg/OSD-EBCDIC-DF04-15>
Alias: None
Name: OSD_EBCDIC_DF03_IRV
MIBenum: 116
Source: Fujitsu-Siemens standard mainframe EBCDIC encoding
Please see: <http://www.iana.org/assignments/charset-reg/OSD-EBCDIC-DF03-IRV>
Alias: None
Name: OSD_EBCDIC_DF04_1
MIBenum: 117
Source: Fujitsu-Siemens standard mainframe EBCDIC encoding
Please see: <http://www.iana.org/assignments/charset-reg/OSD-EBCDIC-DF04-1>
Alias: None
Also relevant is the following excerpt from the registry:
The value space for MIBenum values has been divided into three
regions. The first region (3-999) consists of coded character sets
that have been standardized by some standard setting organization.
This region is intended for standards that do not have subset
implementations. The second region (1000-1999) is for the Unicode and
ISO/IEC 10646 coded character sets together with a specification of a
(set of) sub-repertoires that may occur. The third region (>1999) is
intended for vendor specific coded character sets.
Assigned MIB enum Numbers
-------------------------
0-2 Reserved
3-999 Set By Standards Organizations
1000-1999 Unicode / 10646
2000-2999 Vendor
One issue is that the MIBenum values assigned to these charsets do not
seem to be consistent with the description above and with the reference
information at the indicated URIs. It appears that the last three are in
fact vendor charsets and therefore should have MIBenum values in the 2000
to 2999 range. Conversely, it is not clear why KOI7-switched has been
assigned a Vendor MIBenum value, nor which vendor might be responsible.
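To make the mismatch concrete, here is a minimal sketch that maps each MIBenum value listed above onto the regions in the registry excerpt. The function and table names are my own, for illustration only. The upper bound of 2999 for the Vendor region is taken from the "Assigned MIB enum Numbers" table; the prose of the excerpt says only ">1999".

```python
# Regions copied from the "Assigned MIB enum Numbers" table in the registry.
# The names REGIONS, ENTRIES and mibenum_region are illustrative only.
REGIONS = [
    (0, 2, "Reserved"),
    (3, 999, "Set By Standards Organizations"),
    (1000, 1999, "Unicode / 10646"),
    (2000, 2999, "Vendor"),
]

def mibenum_region(value):
    """Return the label of the region that contains the MIBenum value."""
    for low, high, label in REGIONS:
        if low <= value <= high:
            return label
    return "Outside the tabulated regions"

# MIBenum values exactly as they appear in the entries quoted above.
ENTRIES = {
    "Amiga-1251": 2104,
    "KOI7-switched": 2105,
    "OSD_EBCDIC_DF04_15": 115,
    "OSD_EBCDIC_DF03_IRV": 116,
    "OSD_EBCDIC_DF04_1": 117,
}

for name, mibenum in ENTRIES.items():
    print(f"{name}: {mibenum} -> {mibenum_region(mibenum)}")
```

The sketch only reports which region each assigned value falls into. Whether a charset really is a vendor charset still has to be judged from its source documents.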
Another issue is that the three OSD_EBCDIC_DF* charsets give no indication
in the source documents as to whether or not the charsets are suitable for
use with MIME text. Such an indication is supposed to be part of the
registration (RFC 2978 section 5). A related issue is the fact that the
registry itself provides no such indication for any charsets, which
is at best highly inconvenient for implementors.
None of the charsets above have been provided with an alias beginning with
"cs" for use with the printer MIB as discussed in section 2.3 of RFC 2978.
If that were consistently done, there would be no charset with a confusing
Alias: None
line in the registry.
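A check of this kind is easy to automate. The following is a rough sketch, assuming the aliases of each entry are available as a list of strings; the helper name is hypothetical, and the "csExample" alias in the last line is invented purely to show a passing case.

```python
def has_cs_alias(aliases):
    """Return True if at least one alias begins with "cs"."""
    return any(alias.startswith("cs") for alias in aliases)

# Aliases exactly as listed in the entries above ("None" becomes an empty list).
ALIASES = {
    "Amiga-1251": ["Ami1251", "Amiga1251", "Ami-1251"],
    "KOI7-switched": [],
    "OSD_EBCDIC_DF04_15": [],
    "OSD_EBCDIC_DF03_IRV": [],
    "OSD_EBCDIC_DF04_1": [],
}

missing = [name for name, aliases in ALIASES.items() if not has_cs_alias(aliases)]
print("Entries lacking a cs alias:", ", ".join(missing))

# Hypothetical alias, not taken from the registry, to show the positive case.
print(has_cs_alias(["csExample"]))
```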
How can we minimize these issues in the future? I believe that use of RFC 2978
(or a successor) as a checklist during the review process would help. I believe
that the addition to the registration template of a brief history of the
charset origin (originator and affiliation) would help in determining
whether a particular charset is a Vendor charset or Set By [a] Standards
Organization[s]. Finally, inclusion of a "MIME-text" field in the registry
with a yes/no value would not only be a boon to implementors of applications
which use charsets in a MIME context, but would prompt IANA to obtain a
statement of MIME text compatibility if it is lacking in the registration
application.