Re: Registration of some code pages



On Fri, 03 Sep 2010 18:19:49 +0200, Shawn Steele  
<Shawn.Steele@microsoft.com> wrote:
> I don't think it can be restricted to the web.  For various reasons this  
> difference could happen in other places too (MIME).

More generic is fine with me, but from the other people commenting I get  
the feeling that changing the details of how things in the registry are  
today is controversial. And I'd really like to just get it done.


> And "Big5" doesn't always mean "windows 950".  It often does on a  
> Windows box, but it might actually mean "Big5" on other machines.  (I  
> was going to provide examples, but there's a lot of variety across  
> vendors, Wikipedia lists a lot of variations).

Sure, that is a problem that needs to be fixed though.


> HTML5 seems to be providing "clear guidance" to web developers, so if we  
> had this kind of annotation, then HTML5 could point to that.  However,  
> even in HTML (especially in HTML?), big5 doesn't necessarily always mean  
> Windows 950, it could actually be a different variant from whatever  
> authoring system originated it.

Sure, it could mean something else, but that seems hardly relevant. Or are  
you proposing that we have some additional variable for deciding how to  
decode something labeled as "big5"? I don't think that is going to happen.


> Seems to me that, in practice, these labels narrow down the behavior,  
> but there are variations.  Sometimes only a couple codepoints, sometimes  
> bigger, but a "wise" application would allow for imperfect code page  
> identification.  (Like the user drop-down to change code pages).

That seems way more complicated than what we have now, and since this is  
all legacy (I consider non-UTF-8 legacy) anyway, I am not sure we should be  
concerned with that. We should pick what works best, and most often  
that is "what IE does", as IE dominated the market at the time when the  
legacy encodings dominated.
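
A minimal sketch of what "pick one behavior per label" could look like in  
an implementation, in Python. The table and function names here  
(LEGACY_OVERRIDES, resolve_label, decode_labeled) are hypothetical, and the  
"big5" -> "cp950" entry is only an assumption based on the Windows behavior  
mentioned in this thread, not something the registry says today.

```python
import codecs

# Hypothetical override table: declared label -> codec actually used.
# The single entry is an assumption for illustration ("what IE does"),
# not a statement about the current registry contents.
LEGACY_OVERRIDES = {
    "big5": "cp950",
}


def resolve_label(label: str) -> str:
    """Map a declared charset label to one fixed Python codec name."""
    key = label.strip().lower()
    name = LEGACY_OVERRIDES.get(key, key)
    # codecs.lookup raises LookupError if the label is unknown.
    return codecs.lookup(name).name


def decode_labeled(data: bytes, label: str) -> str:
    """Decode bytes using the codec chosen for the label, with no
    extra variable or per-document guessing involved."""
    return data.decode(resolve_label(label), errors="replace")


if __name__ == "__main__":
    print(resolve_label("Big5"))
    print(decode_labeled(b"\xa4\xa4\xa4\xe5", "big5"))
```

The point of the table is that the label alone decides the decoder, so two  
implementations with the same table decode the same bytes the same way.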


> Ironically, IE9 betas are getting stricter about the page declarations,  
> and I've started seeing the opposite (Arabic sites tagged as Arabic but  
> they're actually UTF-8, etc.), which is, I guess, why people have been  
> doing autodetection.

Are you talking about the platform releases or internal betas that have  
the IE-specific code activated?


> Anyway, I'd like to annotate a couple records as "this is sometimes  
> called xxx" without making it an alias, especially when the "alias" is  
> already defined as something else.

All I'm saying is that that is way too vague for implementations to use. I  
suppose it is an incremental step toward the registry getting closer to  
reality, but I would prefer something more drastic.


-- 
Anne van Kesteren
http://annevankesteren.nl/