Re: Comments on draft-yergeau-rfc2279bis-00.txt



Patrik Fältström wrote:

> What I hear on this list is that the consensus is that BOM SHOULD NOT be 
> used. I would like it to be MUST NOT be used in Internet protocols, 
> which leads to tagged UTF-8 text be illegal if the BOM exists in the text.


That would violate the Unicode standard. If UTF-8 is clearly indicated by a charset label, then an initial sequence of EF BB BF must be interpreted as the character U+FEFF ZERO WIDTH NO-BREAK SPACE (ZWNBSP). Since that is not a very useful character at the beginning of a text, it can usually be ignored.
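
As a rough illustration of that behaviour, here is a minimal sketch in Python. It is not taken from the draft or from the Unicode standard; the function name decode_labeled_utf8 and its ignore_initial_zwnbsp parameter are hypothetical, chosen only for this example. It decodes the bytes as ordinary UTF-8, so EF BB BF becomes U+FEFF, and then optionally drops that one leading character instead of rejecting the text.

```python
def decode_labeled_utf8(data: bytes, ignore_initial_zwnbsp: bool = True) -> str:
    """Decode bytes that a charset label has already declared to be UTF-8.

    Hypothetical helper for illustration only.
    """
    # A plain UTF-8 decode turns the bytes EF BB BF into the character U+FEFF.
    text = data.decode("utf-8")

    # U+FEFF at the very start carries little meaning for the receiver,
    # so it is dropped here rather than treated as an error.
    # Only a single leading U+FEFF is removed.
    if ignore_initial_zwnbsp and text.startswith("\ufeff"):
        return text[1:]
    return text


if __name__ == "__main__":
    sample = b"\xef\xbb\xbfhello"
    print(repr(decode_labeled_utf8(sample)))
    print(repr(decode_labeled_utf8(sample, ignore_initial_zwnbsp=False)))
```

Whether a receiver should drop the character or pass it through is a policy choice for the protocol in question; the sketch only shows that either is straightforward once the bytes are decoded as regular UTF-8.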

Personally, I find François' text very clear. It acknowledges existing, reasonable and useful practice.

Best regards,
markus


-- 
Opinions expressed here may not reflect my company's positions unless otherwise noted.