[whatwg] Character-encoding-related threads
Leif Halvard Silli
xn--mlform-iua at xn--mlform-iua.no
Mon Feb 13 12:48:10 PST 2012
Anne van Kesteren, Mon Feb 13 12:02:53 PST 2012:
> On Mon, 13 Feb 2012 20:46:57 +0100, Anne van Kesteren wrote:
>> The list starts with <a> and the moment you do not use UTF-8 (or UTF-16,
>> but you really shouldn't) you can run into problems. I wonder how
>> controversial it is to just require UTF-8 and not accept anything else.
Hear, hear!
> I guess one could argue that <a> is already captured by the requirements
> around URL validation. That would leave <form> and potentially some
> script-related features. It still seems sensible to me to flag everything
> that is not labeled as UTF-8,
Indeed. Such a step would make it a must for HTML5-compliant authoring
tools to default to UTF-8. It would also positively affect validators:
they would have to give "mild" advice about the simplest way to switch
to UTF-8. (E.g. if a page is US-ASCII, or US-ASCII with entities, a
simple move suffices: just add an encoding declaration.) It is likely
to have many, many positive side effects.
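
To make the "mild advice" idea concrete, here is a rough sketch of what
such a validator check might look like. Everything in it is an
assumption for illustration: the names declared_encoding and
encoding_advice are hypothetical, the advice strings are made up, and
the regex is only a crude approximation of a <meta> prescan, not the
HTML5 encoding sniffing algorithm. It also ignores HTTP headers and BOMs.

```python
import re

# Crude, assumed approximation of a <meta> charset lookup; a real
# validator would follow the HTML5 encoding sniffing algorithm instead.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def declared_encoding(page: bytes):
    """Return the lowercased charset label found in a <meta> within
    the first 1024 bytes, or None if there is no such label."""
    match = META_CHARSET.search(page[:1024])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


def encoding_advice(page: bytes) -> str:
    """Return a hypothetical advice code for a page's raw bytes."""
    if declared_encoding(page) in ("utf-8", "utf8"):
        return "ok"
    if page.isascii():
        # US-ASCII (possibly with entities) is already valid UTF-8,
        # so declaring UTF-8 is all that is needed.
        return "declare-utf-8"
    # Non-ASCII bytes without a UTF-8 label: the page would have to
    # be re-encoded, not merely relabeled.
    return "convert-to-utf-8"
```

With this sketch, a page consisting of "<p>caf&eacute;" and no
declaration would get "declare-utf-8", while a Latin-1 page containing
a raw "é" byte would get "convert-to-utf-8".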
> but if we want something intermediate we
> could start by flagging non-UTF-8 pages that use <form> and maybe obsolete
> <form accept-charset> or obsolete any other value than utf-8 (I filed a
> bug on that feature already to at least restrict it to a single value).
Going the full way, i.e. flagging all non-UTF-8 pages regardless of
<form>, seems the simplest and best.
--
Leif H Silli