[whatwg] Make quoted attributes a conformance criterion

Sun Jul 26 19:24:27 PDT 2009

On Jul 26, 2009, at 6:53 PM, Jonas Sicking wrote:

> On Sun, Jul 26, 2009 at 9:09 AM, Mike Shaver<mike.shaver at gmail.com>  
> wrote:
>> On Sun, Jul 26, 2009 at 5:15 AM, Keryx Web<webmaster at keryx.se> wrote:
>>> My analogy was simply this: Just like it makes sense for a  
>>> JavaScript lint
>>> tool to enforce semi-colons, it makes sense for an HTML  
>>> conformance checker
>>> to enforce quotation marks.
>>
>> A lint tool is not a conformance checker.  Your proposal here is
>> analogous to removing ASI from ECMAScript, such that a program which
>> relied on it would not be conformant.
>>
>> I recommend that you find an HTML guru of the same stature as
>> Crockford in the JS community, and convince her to write a lint tool
>> which forbids unquoted attribute values.  Once you have that, you can
>> (attempt to) popularize that style via evangelism for the lint tool,
>> rather than trying to foist your stylistic preferences -- which, as  
>> it
>> happens, I share -- onto the world via spec requirements.
>
> The more I think about it, the more I'm intrigued by Rob Sayre's idea
> of completely removing the definition of what is "conforming". Let the
> spec define UA (or HTML consumer) behavior, and let lint tools fight
> out best practices for authoring.

I was intrigued by this idea as well, but Henri Sivonen raised an  
important point that, to a significant extent, changed my mind. A Web  
content development toolchain will often include markup generators, as  
well as validation as part of QA. With a centrally defined notion of  
markup conformance, markup generators can seek to produce content that  
meets the conformance rules, while validators can make sure to check  
the conformance rules as a baseline. This makes it more practical to  
swap out parts of the toolchain. Otherwise, switching either  
validators or markup generators would be likely to produce a flood of  
errors, which would make the switching costs fairly high. Thus, there  
is an interoperability benefit to defining at least a baseline core of  
conformance rules. It's not for interoperability between content and  
user agents, but for interoperability between content generators and  
markup checkers.

That being said, validators can and should compete on the basis of  
providing additional useful warnings. Building that kind of ecosystem  
doesn't require the removal of markup conformance. JavaScript, C and  
C++ are examples of languages where conforming syntax is strictly  
defined, yet tools are available that do additional static analysis  
for both style and correctness. For example, GCC and MSVC have very  
different sets of C++ warnings, but the fact that syntax errors and  
certain mandatory warnings are defined by the C++ spec makes it easier  
to move code from one to the other, while leaving them room to compete  
on quality and usefulness of optional warnings, among other things.
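
As a rough illustration of that split, here is a minimal Python sketch of a
checker with a shared baseline and an opt-in style warning. It is not taken
from any real validator. The names Checker, check and style_warnings are
hypothetical, duplicate attributes merely stand in for a baseline
conformance error, and the unquoted-value detection is a simple heuristic.

```python
import re
from html.parser import HTMLParser

# Assumption: duplicate attributes stand in for a baseline conformance
# error; unquoted values are an optional, tool-specific style warning.
QUOTED_VALUE = re.compile(r'"[^"]*"|\'[^\']*\'')
UNQUOTED_VALUE = re.compile(r'=\s*[^\s">]')


class Checker(HTMLParser):
    """Hypothetical checker: baseline errors plus opt-in style warnings."""

    def __init__(self, style_warnings=False):
        super().__init__()
        self.style_warnings = style_warnings
        self.errors = []    # baseline: every checker would report these
        self.warnings = []  # optional: where tools are free to differ

    def handle_starttag(self, tag, attrs):
        names = [name for name, _ in attrs]
        for name in sorted(set(names)):
            if names.count(name) > 1:
                self.errors.append(f"<{tag}>: duplicate attribute '{name}'")
        if self.style_warnings:
            # Blank out quoted values, then look for a value with no quote.
            raw = QUOTED_VALUE.sub('""', self.get_starttag_text())
            if UNQUOTED_VALUE.search(raw):
                self.warnings.append(f"<{tag}>: unquoted attribute value")


def check(markup, style_warnings=False):
    checker = Checker(style_warnings)
    checker.feed(markup)
    checker.close()
    return checker.errors, checker.warnings
```

Two tools built this way would agree on the errors list for the same input
and differ only in which warnings they choose to offer, so swapping one
for the other would not change which documents pass the baseline.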

So, in conclusion, having a baseline for correct syntax may actually  
make it easier to develop an ecosystem of style-checking tools.  
However, this makes it important to keep the core set of syntax errors  
relatively minimal. I'm not sure HTML5 as currently drafted entirely  
strikes that balance, but making optional tags mandatory or requiring  
double quotes on attributes would be a move in the wrong direction.

Regards,
Maciej