[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: Thoughts about characters transmission



        [please excuse this cross-post;  I am following a thread]

>> The _most_important_point_ is that a single common representation code
>> be defined _for_the_line_ (suiting the purpose, namely to cover all national
>> languages in one single way) and that people be instructed that every bit
>> of text should travel in that code on the wire, whatever_the_protocol_is.
>
>I agree to most of what Andre'' is saying and I have an additional
>point here: that the single common representation code should be something
>that can be handled by existing software and hardware,   ...

        I agree with most of what André said,  and agree with you on
this one point.   But ...

>will take a long time before the conversion software is installed
>on all machines, or even a large share of the installed base.
>Also I would like to emphasis the need for world-wide solutions.
>This would mean that ISO 8859-1 would not be a good candidate,
>we need something ASCII based (or even with a smaller repertoire
>than ASCII to cover the problems with EBCDIC and national ISO 646
>variants).

        I don't understand the warrant here,  Keld.   You're right that
we need world-wide solutions and you're right that we should have some-
thing ASCII based.   How does these make ISO 8859-1 a bad choice?

        I've spent a significant part of *my* life working with others
toward a true solution to the  ASCII <---> EBCDIC  problem.   Some form
of concensus was reached a long time ago and folks have successfully
"beat IBM over the head"  with it,  and IBM has finally acknowledged a
"de facto network EBCDIC"  [my term]  which they call CodePage 1047.
CP 1047 maps one-for-one with ISO 8859-1.   The mapping of 1047/8859-1
is the most palatable mapping to the most sites on the InterNet.

        I see the common code André mentions.   I see ISO 8859-1
"on the wire".   I see some  greater-than-8-bit  code in the future
that is a superset of  8859-1.   (and whether TCP has been super-
ceeded or wether we "tag" things,  I am NOT addressing here)
What's the problem?

        [I think it was Nathaniel who said,  "memory is cheap and
bandwidth is cheaper".   In agreement,  I say we scrap the 16-bit
stop-gap solution and go directly to 32-bit and then start looking
toward bit-unconstrained (bit-free?) representations.   Just my opinion]

>Keld

--
Rick Troth <troth@rice.edu>,  Rice University,  Information Systems