
Re: HTTP Range and BOM



Erik van der Poel wrote:

> I don't know if this needs to be mentioned, but there is a Range request
> header in HTTP that allows you to ask for a particular part of an
> object. I believe that the BOM should always be only at the beginning of
> the object, and should not be inserted for Range requests.
> 
> For example, even if the client asks for bytes 500 to 1000, the server
> should not prepend a BOM in front of byte 500.
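
To make the quoted example concrete, here is a minimal sketch of what "do not prepend a BOM for a Range request" might look like on the server side. The function name serve_range and the sample body are hypothetical, chosen only for illustration; the slice follows HTTP's inclusive first-byte/last-byte convention.

```python
import codecs

def serve_range(body: bytes, first: int, last: int) -> bytes:
    """Return bytes first..last (inclusive) of the stored object.

    Hypothetical helper: the stored object may itself start with a BOM,
    but nothing is ever added in front of a partial response.
    """
    return body[first:last + 1]

# Sample object: a big-endian BOM followed by UTF-16BE text (assumed data).
body = codecs.BOM_UTF16_BE + "hello world".encode("utf-16-be")

whole_start = serve_range(body, 0, 1)   # the BOM, because it is part of the object
part = serve_range(body, 4, 9)          # a slice from the middle, with no BOM added
```

Under these assumptions the BOM appears only when the requested range covers the start of the object.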

Are you suggesting that this be mandated somehow in the IANA registration?  It
seems very odd to me for a charset description to impose this kind of
restriction on any particular protocol.  How can we anticipate all uses of a
charset?  Isn't it for the protocol designers to make such decisions?

In general, I see a lot of problems with mandating the insertion of the BOM.
What should happen if the document already begins with a BOM?  How does the
insertion interact with Content-Length?  With document authentication?  What
should transcoding gateways do with the BOM?  Maybe we can think of things to
say about each of these issues, but should the charset definition really
attempt to cover all of this?
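
A rough sketch of the first three questions, purely as illustration: ensure_bom is a hypothetical helper, the "leave an existing BOM alone" policy is only one possible answer, and SHA-256 merely stands in for whatever authentication scheme is in use. It assumes that a payload without a BOM is big-endian.

```python
import codecs
import hashlib

BOMS = (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)

def ensure_bom(payload: bytes) -> bytes:
    """Prepend a big-endian BOM unless the payload already starts with one.

    Hypothetical policy; assumes BOM-less payloads are big-endian.
    """
    if payload.startswith(BOMS):
        return payload
    return codecs.BOM_UTF16_BE + payload

original = "abc".encode("utf-16-be")      # 6 bytes, no BOM
sent = ensure_bom(original)               # 8 bytes on the wire

# Content-Length has to describe the bytes actually sent, not the stored ones.
length_before = len(original)
length_after = len(sent)

# A digest computed over the stored bytes no longer matches what is sent.
digest_before = hashlib.sha256(original).hexdigest()
digest_after = hashlib.sha256(sent).hexdigest()
```

Applying ensure_bom twice gives the same result as applying it once, which is one way to answer the "already begins with a BOM" case, but the length and digest mismatches remain for whoever does the insertion.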

Rather than mandating the BOM, would the following work?  Make the usual
conservative-send/liberal-accept distinction: compliant UTF-16 producers
may send only in big-endian order, while compliant UTF-16 consumers must be
prepared to accept little-endian data, which must then begin with a BOM.
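
As a sketch of what that distinction could mean in practice (the names encode_utf16 and decode_utf16 are hypothetical, and treating BOM-less input as big-endian is my reading of the proposal, not settled text):

```python
import codecs

def encode_utf16(text: str) -> bytes:
    """Conservative producer: always big-endian, no BOM required."""
    return text.encode("utf-16-be")

def decode_utf16(data: bytes) -> str:
    """Liberal consumer: honor a BOM if present, else assume big-endian."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    return data.decode("utf-16-be")

# Little-endian content is accepted only because it announces itself with a BOM.
little = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
from_little = decode_utf16(little)
from_big = decode_utf16(encode_utf16("hi"))
```

Note that little-endian data arriving without a BOM would be silently misread by this consumer, which is exactly why such content would have to begin with one.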

Hmm, I'm not sure that really deals with my devil's-advocate questions above,
though.
Sigh.

- John Burger
  The MITRE Corporation