RE: General policy
> When talking about labelling, I was thinking of mechanisms outside
> the bytestream itself, unlike ISO 2022.
>
> That means that when you start the bytestream, you expect to see one
> character set/encoding method, and when you go to the middle of the bytestream
> you still expect to see the same charset/method.
>
> The MIME "charset=" construct is the paramount example of such a labelling.
> It could probably be easily added as part of TELNET negotiation, and so on
> for other protocols.
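
To make the idea of labelling outside the bytestream concrete, here is a minimal sketch in Python using the standard `email` package. The sample message and the helper names `declared_charset` and `decode_body` are made up for illustration; they are not taken from the MIME document. The point is only that the label travels in the header, so one charset applies to the whole body and no escape sequences are needed inside it.

```python
from email import message_from_string

# A made-up MIME message; the charset label sits in the header,
# outside the body bytes it describes.
RAW = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "placeholder body\n"
)


def declared_charset(raw_message, default="us-ascii"):
    """Return the charset named in Content-Type, lowercased, or the default."""
    msg = message_from_string(raw_message)
    return msg.get_content_charset(failobj=default)


def decode_body(body_bytes, raw_message):
    """Decode the whole body with the one charset announced up front."""
    return body_bytes.decode(declared_charset(raw_message))


if __name__ == "__main__":
    print(declared_charset(RAW))
    print(decode_body(b"caf\xe9", RAW))
```

A receiver that sees no `charset=` parameter falls back to US-ASCII in this sketch, which is the default MIME assumes for text.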
Note the following text from RFC 1341 (MIME) [annotation is mine]:

    NOTE: Beyond US-ASCII, an enormous proliferation of character sets
    is possible. It is the opinion of the IETF [822 extension] working
    group that a large number of character sets is NOT a good thing. We
    would prefer to specify a single character set that can be used
    universally for representing all of the world's languages in
    electronic mail [or any text processing protocol]. Unfortunately,
    existing practice in several communities seems to point to the
    continued use of multiple character sets in the near future. For
    this reason, we define names for a small number of character sets
    for which a strong constituent base exists. It is our hope that ISO
    10646 or some other effort will eventually define a single world
    character set which can then be specified for use in Internet mail
    [and other protocols], but in the advance of that definition we
    cannot specify the use of ISO 10646, Unicode, or any other character
    set whose definition is, as of this writing, incomplete.

This holds for any batched or simple request/response protocol, where
negotiation is impossible. MIME specifies a rule for when to use which
character set, and that rule is elaborate to implement. How would you
introduce character set labeling into DNS, for example?
--
Luc Rooijakkers Internet: lwj@cs.kun.nl
SPC Company, the Netherlands UUCP: uunet!cs.kun.nl!lwj