[whatwg] Make quoted attributes a conformance criteria

Sat Jul 25 03:08:23 PDT 2009

On Fri, Jul 24, 2009 at 9:52 PM, Keryx Web<webmaster at keryx.se> wrote:
> On 2009-07-23 20:32, Eduard Pascual wrote:
>>
>> While I don't consider a hard requirement would be appropriate, there
>> is an audience sector this discussion seems to be ignoring: Authoring
>> Tools' developers. IMO, it would be highly desirable to have some
>> guidelines for these tools to determine when they *should* quote
>> attribute values.
>
>
> There is one further rub. Code that was initially generated by authoring
> tools has a tendency to wind up in some front-end developer's lap, to be
> amended and/or fixed manually at a later stage. That is even more of a
> reason for a strong recommendation about quotes.
>
> Furthermore, I doubt that most people on this list read the blog post I
> linked to when starting this discussion.[1]
I can't speak for others, but I did read your post. And still I am
convinced that a hard requirement to quote all values is not the best
solution. There are some values that MUST be quoted, some that SHOULD
be quoted, and even some that SHOULD NOT be quoted. Those that must be
quoted are already covered by the spec, and validators will yield the
relevant error message when encountering such values unquoted. For
those values that *should* be quoted (those that become more readable
when quoted, or those that could lead to errors if they are later
changed while still unquoted), a warning from the validator should be
enough. Finally, there are some values that are better unquoted, such
as those of attributes that can only take a number (there is no risk
of errors, and the quotes would normally hurt readability more than
they help it). Even in the case of @type for <input>, quotes seem like
overkill: AFAIK, there is no valid value for this attribute that
strictly needs them, so there is no risk of the author changing the
value into something that requires quotes and forgetting to add them
(unless, of course, s/he changes it to something invalid, which will
already bring problems of its own). Since <input> elements tend to be
relatively short, and are often written on a single line of source,
adding boilerplate to them for no purpose doesn't seem to be a good
idea.
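
To make the three categories concrete, here is a minimal sketch in
Python of how a checker might sort unquoted values. The "error" rule
follows the HTML syntax for unquoted attribute values (not empty, and
no whitespace, quotes, "=", "<", ">" or backtick). The "ok" rule
(digits only, or an attribute flagged as taking a closed set of
keywords) is only my assumption about a reasonable policy, and the
function name and the closed_set flag are hypothetical.

```python
import re

# Characters that may not appear in an unquoted attribute value.
FORBIDDEN_UNQUOTED = re.compile(r"""[\s"'=<>`]""")


def classify_unquoted(value: str, closed_set: bool = False) -> str:
    """Classify an attribute value that the author left unquoted.

    closed_set is a hypothetical flag meaning "this attribute only
    accepts a fixed list of keywords" (for example type on <input>).
    """
    # MUST be quoted: leaving it unquoted is a syntax error.
    if value == "" or FORBIDDEN_UNQUOTED.search(value):
        return "error"
    # Better unquoted (assumed policy): plain numbers and closed keyword sets.
    if re.fullmatch(r"[0-9]+", value) or closed_set:
        return "ok"
    # SHOULD be quoted: valid today, fragile if the value changes later.
    return "warning"
```

For instance, classify_unquoted("20") and classify_unquoted("text",
closed_set=True) fall in the "ok" group, classify_unquoted("intro")
yields "warning", and classify_unquoted("intro highlighted") yields
"error".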

> In that post I talked about a common scenario. One developer works on the
> business logic. It puts out attribute values. Another developer works on the
> presentation logic. He makes templates. Dev 2 omits the quotes and for a
> long time it might work, since the business logic in question only produces
> single word values. Then there might come a change, because dev 1 - or the
> users of the CMS - suddenly starts to produce longer values. Suddenly things
> break, and since nobody touched the presentation logic code, it might not be
> the first place where the developers look for an error.
>
> And believe me, lots of back end devs are absolutely clueless about front
> end issues! Yes, they might skip validation completely, but at least such a
> rule of thumb can be implemented more easily into their work flow.
Again, once the pages go through a validator, warnings are just as
good as errors as hints for locating the source of the problem.
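
For readers who want to see the breakage from the quoted scenario, the
following sketch feeds an unquoted template to Python's standard
html.parser module. The template string and the helper names are
hypothetical; only HTMLParser itself comes from the standard library.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Records the attribute list of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append(attrs)


def attrs_of(markup):
    """Return the attributes of the first start tag in markup."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen[0]


# Hypothetical template written by "dev 2", with no quotes around the value.
TEMPLATE = "<p class={value}>"

print(attrs_of(TEMPLATE.format(value="intro")))
print(attrs_of(TEMPLATE.format(value="intro highlighted")))
print(attrs_of('<p class="intro highlighted">'))
```

With a single-word value the template behaves as intended. With the
two-word value the second word is parsed as a separate, valueless
attribute named "highlighted", so the class silently loses half of its
content. The quoted form keeps both words in one value.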

> I also note that no one who has spoken against my suggestion claims to have
> any teaching experience.
Although I didn't mention it, because I didn't think it was relevant,
I have some teaching experience (I won't claim to have worked my whole
life as a teacher, but I have worked as a private tutor for a few
years). Do you really think this:
"Error: Attribute values must always be quoted"
would be more educative than this
"Warning: Values for attribute X should be quoted, or errors might
arise if the value is later changed"
? And these are just examples of messages.
Of course, if you just tell your students "all the code you provide
must validate", warnings may go unnoticed. However, you may try
something like this: "all the code you provide must validate, and
warnings must be addressed or properly justified". IMO, this kind of
detail marks the difference between training code-typing zombies and
training developers capable of solving problems.

In summary: I have considered your arguments from the teaching
perspective, but I believe that the difference between errors and
warnings has more didactic value than a totalitarian validator that
just rejects safe code based on a seemingly arbitrary rule.

>
> I see 4 effects that my suggestions might have:
>
> 1. Dismiss completely.
Unlikely. In the worst case, it is at least being discussed.

> 2. No new wording, but change the code examples.
Better consistency would be appropriate for some of the examples.
However, there are many values there that are better unquoted
(especially numbers).

> 3. Add some words about best practice, but do not enforce quotes as a
> conformance criterion.
>
> 4. Go all the way and do just that.
Again, there is a middle ground between these: making validators issue
warnings for potentially unsafe attributes is, IMO, the sanest
approach here.
Adding a note that, when in doubt, it's safer to quote the value would
also be an improvement.
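
As a rough illustration of what such a warning pass could look like,
here is a sketch built on Python's html.parser. It is not how any
existing validator works. The regular expression, the SAFE_UNQUOTED
allowlist, the digits-only exemption and the message format are all
assumptions of mine.

```python
import re
from html.parser import HTMLParser

# name=value where the value is double-quoted, single-quoted or unquoted.
# Group 2 is only set when the value is unquoted.
ATTR = re.compile(r"""([^\s"'<>/=]+)\s*=\s*(?:"[^"]*"|'[^']*'|([^\s"'=<>`]+))""")

# Hypothetical allowlist of (tag, attribute) pairs that are fine unquoted.
SAFE_UNQUOTED = {("input", "type")}


class UnquotedLint(HTMLParser):
    """Collects warnings (not errors) for risky unquoted attribute values."""

    def __init__(self):
        super().__init__()
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        line, _ = self.getpos()
        # Inspect the raw tag text, since attrs no longer records quoting.
        for match in ATTR.finditer(self.get_starttag_text()):
            name, value = match.group(1), match.group(2)
            if value is None:  # the value was quoted
                continue
            if value.isdigit() or (tag, name.lower()) in SAFE_UNQUOTED:
                continue
            self.warnings.append(
                f"line {line}: warning: {name}={value} on <{tag}> should be quoted"
            )


def lint(source):
    """Return the list of warnings for an HTML source string."""
    checker = UnquotedLint()
    checker.feed(source)
    checker.close()
    return checker.warnings
```

Given a page containing <input type=text size=20 value="a b"> and
<a class=nav href="/home">, this sketch stays silent about the input
and reports only class=nav, which is the kind of behaviour I have in
mind.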

> The scientific evidence in favor of my suggestion might be quite easy to
> pick up. Just ask any standards aware teacher how common it is that not
> using quotes messes up students' code!
Warnings would be better in that sense (as long as the students are
required to either address or justify each of them). This forces the
students to do something that will be extremely valuable both during
their learning and in their later careers: thinking and reasoning.

>
> Stopping before (4) above will force people like me to keep requiring false
> XHTML from my students. My main concern is that in HTML 5 we get lots of new
> boolean attributes, like "required" on inputs or "maxlength" on textareas,
> and having to write things like 'required="required"' will make the code
> longer and messier, since a normal input element might span 2 or 3 lines.
Again, warnings are better than a hard requirement.

> Of course this can be settled if we get a tool like JSLint, that can enforce
> a voluntary stricter check (Crockford's "good parts"), but please note that
> ES 5 introduces a concept of "strict" rules.
>
> This means that ES 5 will be in a similar position to HTML 5, having a lax
> rule set about what browsers must be able to do, and a strict "conformance
> criteria" like rule set that authors are encouraged to follow.
>
> Perhaps this could be solved by simply adding an option to the validator:
> "Do not allow unquoted non-boolean attribute values".
>
> Henri Sivonen, are you reading this?
>
> --
> Keryx Web (Lars Gunther)
> http://keryx.se/
> http://twitter.com/itpastorn/
> http://itpastorn.blogspot.com/
>
> 1. http://itpastorn.blogspot.com/2009/07/value-of-false-xhtml.html
>

On Sat, Jul 25, 2009 at 5:55 AM, Bil Corry<bil at corry.biz> wrote:
> Aryeh Gregor wrote on 7/24/2009 5:44 PM:
>> On Fri, Jul 24, 2009 at 6:26 PM, Bil Corry<bil at corry.biz> wrote:
>>> That's a classic XSS vulnerability.  The backend developer must know if there are quotes or not in the template, then encode/sanitize the value accordingly.
>>
>> It's not XSS if the values are statically provided by the first
>> developer and aren't generated from user input.
>
> Sure, but I was basing my reply on the provided example: "Then there might come a change, because dev 1 - or the users of the CMS - suddenly starts to produce longer values."
>
> Even in the case where the developer is providing the values via a trusted source (say a database), it's still a best practice to encode/sanitize the value.

Then, regardless of conformance criteria, it would be advisable to at
least mention such best practices somewhere in the spec.
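
As an example of the kind of practice I mean, here is a small Python
sketch that always emits the value double-quoted and escaped. The
helper name render_attr is hypothetical; html.escape is the standard
library function, and quote=True makes it escape quote characters as
well as "&", "<" and ">".

```python
import html


def render_attr(name: str, value: str) -> str:
    """Render name="value", escaping the value for a double-quoted attribute."""
    return f'{name}="{html.escape(value, quote=True)}"'


# A value that tries to break out of the attribute and add an event handler.
hostile = 'x" onmouseover="alert(1)'
print(render_attr("title", hostile))
```

Because the double quotes inside the value are turned into &quot;, the
whole string stays inside the title attribute instead of introducing a
new onmouseover attribute. This only covers attribute values; URLs,
script and style contexts need their own treatment.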

Regards,
Eduard Pascual