[whatwg] Tag Soup: Blocks-in-inlines
Billy Wong
billyswong at gmail.com
Wed Jan 25 04:21:22 PST 2006
On 1/25/06, Lachlan Hunt <lachlan.hunt at lachy.id.au> wrote:
> I'm not saying it won't break anything, but every single change we make
> to the parsing could possibly break any number of the billions of pages
> on the web in any number of browsers.
But using your method (swapping inline node and block node) would
break presently valid and correct webpages. If breaking things is
unavoidable, I prefer breaking things which are written incorrectly.
My idea is very extreme but simple and efficient:
Parse the page regardless of what is between "</" and ">". Treat what is
written inside the close tag as merely a visual clue.
Example: <span><div>X</span>Y</div>
+ span
  + div
    + #text: X
  + #text: Y
For correctly written webpages, this should pose no problems. As for
incorrect webpages, they deserve it from the point they ask the UA to
use "standards mode".
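
A minimal sketch of that idea in Python, for illustration only. The names
parse_ignoring_close_names and dump are made up here, and the tokenizer is
a deliberately naive regular expression that ignores attributes' meaning,
comments, void elements and entities. The only point it shows is that
every close tag pops the innermost open element, whatever name it carries.

```python
import re

# Matches an open tag, a close tag, or a run of text. This is a toy
# tokenizer (an assumption of this sketch), not a real HTML tokenizer.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>|([^<]+)")


def parse_ignoring_close_names(markup):
    """Build a tree where every close tag closes the innermost open element."""
    root = {"name": "#root", "children": []}
    stack = [root]
    for slash, name, text in TOKEN.findall(markup):
        if text:
            stack[-1]["children"].append({"name": "#text", "text": text})
        elif slash:
            # The name inside "</...>" is treated as a visual clue only.
            if len(stack) > 1:
                stack.pop()
        else:
            node = {"name": name.lower(), "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
    return root


def dump(node, depth=0):
    """Return the tree as indented "+ name" lines, one per node."""
    lines = []
    for child in node["children"]:
        if child["name"] == "#text":
            lines.append("  " * depth + "+ #text: " + child["text"])
        else:
            lines.append("  " * depth + "+ " + child["name"])
            lines.extend(dump(child, depth + 1))
    return lines


print("\n".join(dump(parse_ignoring_close_names("<span><div>X</span>Y</div>"))))
```

With the example markup above, "</span>" closes the div and "</div>"
closes the span, so Y ends up as a child of span next to div. Properly
nested markup comes out the same as it would with name checking.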
On 1/25/06, Lachlan Hunt <lachlan.hunt at lachy.id.au> wrote:
> Anne van Kesteren wrote:
> > Quoting Lachlan Hunt <lachlan.hunt at lachy.id.au>:
> >> 1.
> >> <em><p>X</em>Y</p>
> >>
> >> BODY
> >> + P
> >> + EM
> >> + #text: X
> >> + #text: Y
> >>
> >> The theory is that any inline elements
> >
> > This gives problems for new elements I assume... We already have a
> > problem with
> > <header><h1>test</h1></header>...
>
> I don't see how this affects new elements, it should only affect known
> inline elements.
>
> >> 2.
> >> <em><p>XY</p></em>
> >>
> >> BODY
> >> + P
> >> + EM
> >> + #text: X
> >> + #text: Y
> >
> > And this likely breaks existing content. Perhaps not for EM, but
> > certainly for
> > other inline elements, like <span>.
>
> I'm not saying it won't break anything, but every single change we make
> to the parsing could possibly break any number of the billions of pages
> on the web in any number of browsers. However, the chances are that
> such pages are already broken in several browsers (probably
> built for IE only, whose quirks we are definitely not keeping), so I
> don't see this as a huge problem.
>
> There's nothing wrong with saner parsing at the expense of a few broken
> pages which I'm sure will still remain readable (even if they don't look
> perfect) and/or be easily fixed by their authors. Trying to remain 100%
> compatible with 100% of the web is physically impossible.
>
> However, span does show some interesting behaviour which should be made
> more consistent with other inline elements.
>
> <!DOCTYPE html><span><p>X</span>Y</p>
>
> Firefox:
> HTML
> + HEAD
> + BODY
> + SPAN
> + P
> + #text: X
> + #text: Y
>
> Opera 9/Win:
> HTML
> + BODY
> + SPAN
> + P
> + #text: X
> + #text: Y
>
> IE6:
> HTML
> + HEAD
> + TITLE
> + BODY
> + SPAN
> + P
> + #text: X
> + #text: Y
> + #text: Y (Highlighted in red in the DOM view)
>
> --
> Lachlan Hunt
> http://lachy.id.au/
>
>