Re: Revised proposal for UTF-16



After I sent my reply, I thought you might have intended the BOM to be
the indication that a "new" protocol was being used. I.e. the client
would go ahead and send the BOM followed by UTF-16 request headers, and
then check the server response to see whether it accepted the UTF-16 or
simply returned an error (in single-octet ASCII). If the server returned
an ASCII error, the client would realize that it was an "old" server,
and then re-try the request using ASCII instead of UTF-16. (I'm not too
excited about this idea, though. (To put it politely.))

Also, even if the server sent a BOM at the beginning of the stream, that
does not mean that the whole stream (on this particular connection) is
UTF-16, because, with "Keep-Alive" (or whatever it's called), the
connection is kept alive for several transactions, some of which may not
be text (they might be images).

Anyway, I guess this thread is somewhat off topic. I'd be happy to
continue off-line, if you want.

Getting back to the issue at hand, Makoto's prose seems OK to me.

Erik

Dan Kegel wrote:
> 
> Right, I was thinking of a hypothetical future protocol
> like http or smtp but based on UTF-16, not the current http or smtp.
> 
> At 08:33 AM 5/31/98 -0700, Erik van der Poel wrote:
> >Dan Kegel wrote:
> >>
> >> In the case of HTTP headers, we can probably consider the
> >> entire HTTP header stream as a single message, and only require
> >> the BOM at the beginning of the stream, e.g. the client and server
> >> would each send the BOM as the first two bytes after opening the
> >> socket.
> >
> >No, HTTP headers are always encoded with one octet per character, even
> >if the body is UCS-2 or UCS-4 (or UTF-16). You would have
> >interoperability problems if you tried to send the headers themselves in
> >UTF-16. A client could only send UTF-16 headers if it knew beforehand
> >that the server could deal with it.
> >
> >For example, if the link that the user clicked on had an HREF with
> >"whttp://...", where "whttp" is some new protocol that accepts UTF-16,
> >then the client could safely send UTF-16 headers.
> >
> >(Note: I'm not proposing to create a new protocol called whttp. I'm just
> >saying that the current HTTP cannot deal with UTF-16 headers.)