[whatwg] main element parsing behaviour

Simon Pieters simonp at opera.com
Wed Nov 7 04:13:30 PST 2012

On Wed, 07 Nov 2012 12:55:46 +0100, Jirka Kosek <jirka at kosek.cz> wrote:

> Changing parser each time new element is added is really evil idea and
> sign of a bad design.
> Parsing algorithm should be either not touched at all, or it should be
> promptly changed to treat all unknown elements in other way if the
> current treatment of unknown elements is not suitable for some reason.

There are three ways we probably want to be able to parse a new element:

"inline" - like <span>, current behavior for unknown elements.
"block" - like <address>, currently a finite list of elements.
"void" - like <img>, currently a finite list of elements.
(Possibly also "block void", like <hr>, although no such elements have
been added since parsing was specified.)
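
To make the categories concrete, here is a rough Python sketch of the
current approach, where the parser carries finite lists of known names.
The sets are abbreviated for illustration and are not the full lists from
the spec, and parse_category is a hypothetical helper, not anything a
real parser exposes.

```python
# Illustrative only: abbreviated name sets, not the spec's complete lists.
BLOCK_ELEMENTS = frozenset({"address", "div", "section"})
VOID_ELEMENTS = frozenset({"img", "br", "input"})
BLOCK_VOID_ELEMENTS = frozenset({"hr"})


def parse_category(tag_name: str) -> str:
    """Return the parsing category for a tag name (hypothetical helper)."""
    name = tag_name.lower()
    if name in BLOCK_VOID_ELEMENTS:
        return "block void"
    if name in VOID_ELEMENTS:
        return "void"
    if name in BLOCK_ELEMENTS:
        return "block"
    # Any name the parser does not know gets <span>-like handling.
    return "inline"
```

With these lists, a new name such as "main" is parsed as "inline" until
someone adds it to BLOCK_ELEMENTS, which is the parser change under
discussion.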

If we were to design a system where we can make up new elements that go in  
one of those categories without changing the parser, I think we  
effectively have to put a magic string in the tag name, e.g. any element  
that starts with "block" is treated like <address>, but that has drawbacks:

* Looking at a substring of the tag name complicates the parser and  
probably ruins some optimizations.
* It means new non-inline elements will have long, ugly two-word names,
which is inconsistent with the rest of the language.
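
For comparison, a magic-string design might look roughly like the Python
sketch below. Only the "block" prefix comes from the example above; the
"void" and "blockvoid" prefixes and the function name are assumptions
made up for illustration.

```python
def parse_category_by_prefix(tag_name: str) -> str:
    """Classify a tag by a magic prefix in its name (hypothetical design)."""
    name = tag_name.lower()
    # The longest prefix is checked first so "blockvoid..." is not
    # mistaken for a plain "block..." name.
    if name.startswith("blockvoid"):
        return "block void"
    if name.startswith("block"):
        return "block"
    if name.startswith("void"):
        return "void"
    # Everything else keeps the <span>-like default.
    return "inline"
```

Every start tag now needs prefix comparisons rather than a single
exact-name lookup, and names end up like "blockmain". An existing name
such as "blockquote" also happens to match the "block" prefix.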

I can imagine other designs as well but they don't seem any better.

In conclusion, I think changing the parser when we introduce a new "block"  
or "void" element is a better approach.

Simon Pieters
Opera Software
