
RE: Comments on draft-yergeau-rfc2279bis-00.txt

To: ietf-charsets@iana.org
Subject: RE: Comments on draft-yergeau-rfc2279bis-00.txt
From: Francois Yergeau <FYergeau@alis.com>
Date: Thu, 03 Oct 2002 13:31:53 -0400

McDonald, Ira wrote:
> This IETF standard should NOT encourage the use of leading BOM in
> streams of UTF-8 text.

The current text neither encourages nor discourages BOM usage; it only
points out that the convention exists and gives some caveats (such as the
uncertainty when stripping a BOM and the possible breakage of digital
signatures and the like).
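
To illustrate the digital-signature caveat, here is a minimal sketch in
Python. It assumes a signature is computed over a SHA-256 digest of the raw
bytes; the helper name `digest` is made up for this example and is not taken
from the draft.

```python
import codecs
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw bytes (hypothetical helper)."""
    return hashlib.sha256(data).hexdigest()


# The same text, once with a leading UTF-8 BOM and once without.
payload = "signed text".encode("utf-8")
with_bom = codecs.BOM_UTF8 + payload

# Stripping (or adding) the BOM changes the byte stream, so any digest
# computed over the original bytes no longer matches.
print(digest(with_bom) == digest(payload))
```

Any process that silently strips or adds the three BOM bytes between signing
and verification would hit this mismatch.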

> The optional use of leading BOM in UTF-8 (as
> I know Martin said) destroys the crucial property that US-ASCII
> is a perfect subset of UTF-8 and that US-ASCII can pass _without
> harm_ through UTF-8 handling software libraries.

This totally clashes with my understanding.  Can you please explain how the
existence of the BOM convention in UTF-8 changes anything in the
interpretation of US-ASCII strings, which by definition never contain a BOM?
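
A small Python sketch of the point, under the assumption that "BOM-aware"
handling means something like Python's `utf-8-sig` codec. The function name
`decode_three_ways` is invented for the illustration.

```python
import codecs


def decode_three_ways(data: bytes) -> tuple:
    """Decode the same bytes as ASCII, plain UTF-8 and BOM-aware UTF-8."""
    return (
        data.decode("ascii"),
        data.decode("utf-8"),
        data.decode("utf-8-sig"),  # strips a leading BOM only if one is present
    )


ascii_bytes = "plain US-ASCII text".encode("ascii")

# The UTF-8 BOM is EF BB BF; every byte is >= 0x80, so none of them can
# occur in a US-ASCII string.
assert all(b >= 0x80 for b in codecs.BOM_UTF8)
assert not ascii_bytes.startswith(codecs.BOM_UTF8)

# All three decoders agree on pure US-ASCII input.
as_ascii, as_utf8, as_utf8_sig = decode_three_ways(ascii_bytes)
assert as_ascii == as_utf8 == as_utf8_sig
```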

> UTF-8 never needs a 'byte-order' signature.

This is unfortunately not true, except in the limited realm of properly
internationalized protocols with proper implementations and no reliance on
humans to correctly label things.  Which leaves out quite a few things,
prominent among them file systems: my disks are full of text files in either
Latin-1, UTF-8 or UTF-16, and the BOM is the only thing that distinguishes
them.  Many of those files result from a "Save as" where the original was
properly labelled in some protocol, but the metadata simply gets lost.
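
For what it is worth, the kind of sniffing I mean can be sketched in a few
lines of Python. This is only an illustration: the function names are made
up, the Latin-1 fallback for BOM-less files is an assumption that happens to
match my disks, and UTF-32 (whose little-endian BOM begins with the same two
bytes as the UTF-16 one) is deliberately ignored.

```python
import codecs


def sniff_encoding(data: bytes, fallback: str = "latin-1") -> str:
    """Guess a codec name from a leading BOM; otherwise return the fallback."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # decodes UTF-8 and drops the leading BOM
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"  # the codec reads the BOM to pick the byte order
    return fallback  # assumption: unlabelled, BOM-less files are Latin-1


def decode_text(data: bytes) -> str:
    """Decode file contents using the sniffed encoding."""
    return data.decode(sniff_encoding(data))


# The same one-character text saved three different ways.
utf8_file = "é".encode("utf-8-sig")
utf16_file = codecs.BOM_UTF16_BE + "é".encode("utf-16-be")
latin1_file = "é".encode("latin-1")

for raw in (utf8_file, utf16_file, latin1_file):
    print(sniff_encoding(raw), repr(decode_text(raw)))
```

Note that a UTF-8 file saved without a BOM falls through to the fallback
here and is misread as Latin-1, which is exactly the ambiguity the BOM
removes.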

-- 
François

Follow-Ups:
- RE: Comments on draft-yergeau-rfc2279bis-00.txt
  - From: Martin Duerst <duerst@w3.org>
