[whatwg] Thesis draft about HTML5 conformance checking

Ian Hickson ian at hixie.ch
Mon Mar 12 00:21:17 PDT 2007


On Mon, 12 Mar 2007, olivier Thereaux wrote:
> 
> Did you have a chance to look at engines in authoring tools? What type of
> parser do NVU

Gecko, same as Firefox.


> Amaya,

Amaya's editor uses the same rendering engine as Amaya's browser, which I 
presume was ignored due to its negligible market share.


> golive etc work on?

GoLive uses Opera's rendering engine.


> How about parsing engines for search engine robots? These are probably 
> as important as, if not more important than, some of the browser engines 
> in defining the "generic" engine for the web today.

Search engine companies are notoriously secretive about what their 
indexing pipelines support, since any insight into how they work can be 
abused by people attempting to game their ranking algorithms. The WHATWG 
specification (in particular the parsing part, but other parts as well) 
has, however, been influenced by information that search engine 
implementors have confidentially shared with me, and by suggestions 
they have anonymously or subtly sent to the list over the years. (This is 
why a careful study of the specification's acknowledgements will reveal 
employees from several search engine implementors.) In any case, reverse 
engineering search engine indexing pipelines is extremely difficult and 
tedious, orders of magnitude more so than even browsers.

Why do you think search engine behaviour is more important than browser 
engine behaviour? For what it's worth, search engine engineers I have 
spoken to have told me that what browsers do is far more important than 
what a particular version of a search engine does in terms of what the 
specification should say, because their results are better when their 
algorithms match the browsers' behaviours.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


