Re: Register EBCDIC Character Set "OSD_EBCDIC_DF04_1"
jean-frederic clere wrote:
> For OSD_EBCDIC_DF04_1 and OSD_EBCDIC_DF04_15 that is a 8 bits roundtrip
> mapping but for 2 bytes mapping, undefined characters are mapped as '?'
> 0x6F.
Ok, so the substitution character for these tables is 0x6F.
On the question of roundtrips, I think we are not communicating properly due to a mismatch in
terminology.
I believe what you are saying is that these _tables_ as a whole perform a roundtrip of their
repertoire between a BS2000-EBCDIC codepage and the Unicode portion corresponding to the equivalent
ISO 8859 codepage. In other words, the tables map between exactly N codes on the EBCDIC side and N
Unicode code points (N being the same on both sides: N=256 for SBCS and N=128 for IRV). Is this
correct?
Then, for every Unicode character _outside_ of this repertoire, there is no mapping, and the default
behavior is to use 0x6F as the substitution character.
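To make sure we mean the same thing by substitution, here is a minimal sketch in Python of the behavior as I understand it. The two-entry table is only an illustrative excerpt (the real tables have far more entries), and the names SUBSTITUTE, FROM_UNICODE and encode are hypothetical, not taken from any BS2000 converter.

```python
# Substitution byte named in this thread ('?' in these tables).
SUBSTITUTE = 0x6F

# Unicode code point -> charset byte.
# Illustrative two-entry excerpt only, not the full registered table.
FROM_UNICODE = {
    0x00DC: 0xFC,  # LATIN CAPITAL LETTER U WITH DIAERESIS (mapping line quoted in this message)
    0x003F: 0x6F,  # QUESTION MARK
}

def encode(text):
    """Convert a Unicode string to charset bytes.

    Any character outside the table's repertoire has no mapping and is
    replaced by the substitution byte.
    """
    return bytes(FROM_UNICODE.get(ord(ch), SUBSTITUTE) for ch in text)
```

With this excerpt, U+00DC converts to 0xFC, while a character outside the repertoire such as U+20AC converts to 0x6F.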
What I was trying to ask was whether the individual _mappings_ in the tables (each line in the text
table listing) were roundtrip mappings. That is, when you write something like
0xFC 0x00DC #LATIN CAPITAL LETTER U WITH DIAERESIS
that means that you map Unicode U+00DC to 0xFC while converting from Unicode to this charset, and
you map 0xFC to Unicode U+00DC while converting from the charset to Unicode. Fallback mappings only
go one way. Since many conversion implementations have fallback mappings in addition to roundtrip
mappings, they should be published, and should be marked properly. See Unicode TR 22 for details.
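The distinction can be sketched as follows. This is only an illustration under my own assumptions: the fallback entry (U+FF1F to 0x6F) is hypothetical and is not claimed to exist in any BS2000 table, and the names ROUNDTRIP, FALLBACK and survives_roundtrip are made up for the example.

```python
# Roundtrip mappings as (charset byte, Unicode code point); valid in both directions.
ROUNDTRIP = [
    (0xFC, 0x00DC),  # LATIN CAPITAL LETTER U WITH DIAERESIS
    (0x6F, 0x003F),  # QUESTION MARK
]

# Fallback mappings as (Unicode code point, charset byte); Unicode -> charset only.
# Hypothetical entry, for illustration: FULLWIDTH QUESTION MARK -> 0x6F.
FALLBACK = [
    (0xFF1F, 0x6F),
]

# Charset -> Unicode uses roundtrip mappings only.
to_unicode = {b: u for b, u in ROUNDTRIP}

# Unicode -> charset uses roundtrip mappings plus the one-way fallbacks.
from_unicode = {u: b for b, u in ROUNDTRIP}
from_unicode.update(FALLBACK)

def survives_roundtrip(code_point):
    """True if Unicode -> charset -> Unicode returns the original code point."""
    b = from_unicode.get(code_point)
    return b is not None and to_unicode.get(b) == code_point
```

Here U+00DC survives the roundtrip, but U+FF1F does not: it converts to 0x6F, which converts back to U+003F. That loss is why fallback entries need to be marked as such when a table is published.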
If the tables in your registration requests are pure remappings as described above, then of course
each mapping is a roundtrip mapping.
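One way to check this property mechanically, sketched here with a hypothetical helper name (is_pure_remapping) and a list of (charset byte, Unicode code point) pairs standing in for the lines of the table listing:

```python
def is_pure_remapping(pairs):
    """Return True if no charset byte and no Unicode code point appears twice.

    `pairs` is a list of (charset byte, Unicode code point) tuples, one per
    line of the table listing. If both sides are free of duplicates, the
    table is a one-to-one remapping and every line is a roundtrip mapping.
    """
    codes = [code for code, _ in pairs]
    points = [point for _, point in pairs]
    return len(set(codes)) == len(codes) and len(set(points)) == len(points)
```

A table that lists the same byte against two different Unicode code points would fail this check, which would indicate that at least one of those lines is a one-way mapping.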
Is this how the converter implementation works on BS2000? Is it true that BS2000 converters do not
perform any fallback (one-way) mappings?
Best regards,
markus