The “bigger” problem isn’t finding a name that we recognize, although that’s big too, but rather what happens if I do:

    using System;
    using System.Text;

    class Example
    {
        static void Main()
        {
            // Prints "shift_jis" for code page 932.
            Console.WriteLine(Encoding.GetEncoding(932).WebName);
            // This name is recognized.
            Encoding.GetEncoding("csWindows-31J");
            // This name is not recognized, so this call throws.
            Encoding.GetEncoding("Windows-31J");
        }
    }

Then I’ll get “shift_jis” as the encoding name. (WebName is effectively as close as .NET gets to the IANA charset names.) That cannot change without breaking tons of stuff. C# does happen to recognize csWindows-31J, but the next line will throw an exception. I’d have to dig more to see if MLang recognizes csWindows-31J, but that wouldn’t really solve the problem.

So, IANA could decide that Microsoft’s variant should have some name (say xxxx, or maybe Windows-31J, or the csWindows-31J that we almost know about (not all products do)). However, we’d still return “shift_jis” when you asked for the name. We pretty much can’t change that because if you tag your .NET-generated document with Encoding.WebName (like maybe an ASP.NET server), and you upgrade, then I won’t be able to read it if I haven’t upgraded. Certainly that’d be a huge migration pain, and we’d much, much, much rather people migrate to UTF-8 or UTF-16 than spend any more time in old encodings.

Our partners and competitors would like to interoperate with our encodings, but the shift_jis name is a bit misleading since ours is a variant. “Everyone” knows that (or quickly discovers it), but it would be nice if that was a bit better documented in the registry.

-Shawn

From: Markus Scherer [mailto:markus.icu@gmail.com]
2010/11/11 Shawn Steele <Shawn.Steele@microsoft.com>

> Moreover XML doesn't allow "+" for EncName.

I picked the syntax based on a previous thread a couple years ago; I didn't realize this was a problem.

The ICU converter alias list has the following aliases tagged with "WINDOWS": Shift_JIS, MS_Kanji, csShiftJIS, csWindows31J, cp932, windows-932. I don't know which of these names Windows actually recognizes. If there is at least one name that Windows recognizes (MS_Kanji??) and that does not collide with an IANA standard-Shift-JIS alias, you could use that.

markus