RE: Thoughts about characters transmission
Masataka Ohta writes:
> But I am disappointed that NET-TEXT does not solve the unfairness, the
> currently recognized issue of UTF2, at all.
There are some errors in the NET-TEXT message with regard to UTF-2
sequences of more than 3 bytes, but the basic premise was to remain
compatible with UTF-2. This may or may not be a worthwhile goal,
as Ohta-san rightly pointed out, but I believe NET-TEXT is pretty much
the minimal extension you can make while still remaining compatible.
Anyone want to comment on this?
> With additional 2 single octet encoding and 60 two octet encoding at most,
> you can't encode non-European characters as efficient as the European
> ones.
Note that any non-ASCII character requires at least 2 bytes
in any applicable encoding. If I understand you right, you would like
2-octet representations for all of GB, JIS and KSC, right? While this is
theoretically possible (these are all 94^2 charsets and hence require
3 * 94^2 = 26508 combinations), I don't see any solution, since there are
at most 7 bits per octet available (octets < 128 should occur only when
representing the corresponding ASCII character), which gives at most
2^14 = 16384 two-octet combinations. So, would you be willing to accept
3-byte encodings for these?
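
To make the counting concrete, here is a rough sketch in Python. It is
only an upper bound under my own simplifying assumption that every octet
>= 128 contributes 7 free bits; a real UTF-2 compatible scheme has fewer
combinations, because lead and trailing octets must stay distinguishable.
The names (NEEDED, capacity, min_octets) are made up for illustration and
come from neither NET-TEXT nor UTF-2.

```python
# Upper-bound counting sketch (assumption: 7 free bits in every octet).

CHARSET_SIZE = 94 ** 2          # one 94x94 charset such as GB, JIS or KSC
NEEDED = 3 * CHARSET_SIZE       # all three charsets side by side
FREE_BITS_PER_OCTET = 7         # octets < 128 are reserved for ASCII


def capacity(octets: int) -> int:
    """Most code points that `octets` high-bit octets could distinguish."""
    return 2 ** (FREE_BITS_PER_OCTET * octets)


def min_octets(needed: int) -> int:
    """Smallest octet count whose capacity covers `needed` code points."""
    octets = 1
    while capacity(octets) < needed:
        octets += 1
    return octets


if __name__ == "__main__":
    print("needed:", NEEDED)
    print("two-octet capacity:", capacity(2))
    print("octets required:", min_octets(NEEDED))
```

Even under this generous bound, two octets fall short of the three
charsets combined, while three octets leave plenty of room.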
I'd like some comments from other people on this as well.
> The article also contains incomplete and incorrect summary of the bof.
Incomplete, yes, but could you please explain to me what was incorrect?
--
Luc Rooijakkers Internet: lwj@cs.kun.nl
SPC Company, the Netherlands UUCP: uunet!cs.kun.nl!lwj