[whatwg] Tag Soup: Blocks-in-inlines
Billy Wong
billyswong at gmail.com
Wed Jan 25 07:17:51 PST 2006
On 1/25/06, Lachlan Hunt <lachlan.hunt at lachy.id.au> wrote:
> Billy Wong wrote:
> > On 1/25/06, Lachlan Hunt <lachlan.hunt at lachy.id.au> wrote:
> >> I'm not saying it won't break anything, but every single change we make
> >> to the parsing could possibly break any number of the billions of pages
> >> on the web in any number of browsers.
> >
> > But using your method (swapping inline node and block node) would
> > break presently valid and correct webpages.
>
> Such pages are invalid because inline-level elements are not allowed to
> contain block-level elements. HTML pages containing the following:
>
> <span>
> <div>...</div>
> </span>
>
> could be considered well-formed (if you apply the concept of
> well-formedness to HTML, even though it's not formally defined for it),
> but it's certainly not valid according to any official DTD.
>
Sorry, I did not notice that this is invalid. I am new here. What
makes it impossible for an inline-level element to contain block-level
elements? I am confused.
> > If breaking things is unavoidable, I prefer breaking things which are written incorrectly.
>
> No-one is intending to break anything that is written correctly.
I should change my line to "break things that are not well-formed
instead of those that are well-formed".
>
> > My idea is very extreme but simple and efficient:
> > Parse the page regardless of what is between "</" and ">". Treat what is
> > written inside the close tag as merely a visual clue.
> >
> > Example: <span><div>X</span>Y</div>
> > + span
> >   + div
> >     + #text: X
> >   + #text: Y
>
> I'm kind of confused by what you're trying to do there. You seem to be
> implicitly closing the div immediately before the span. But then the Y
> doesn't seem to be a child of the span at all in the markup, it looks
> like it should be a child of the div, yet in your DOM, it's not a child
> of the div, but is of the span.
>
> The DOM looks equivalent to this markup:
>
> <span><div>X</div>Y</span>
>
It is my fault for not explaining it more clearly. Here, what is
written inside a close tag doesn't matter to the parser, so your
observation is correct. I don't read the tag name and guess what to do
when </span> is given instead of </div>; I treat any </xyz> after <div>
as </div>. If somebody writes a webpage that is not well-formed, the
error will be displayed in such a disturbing way that no one can ignore
it. If the error is a mistake (which I presume to be the only reason
for a page not being well-formed), web developers can find the source
of the problem more easily: the error will be observable *from* the
starting point of the error *to* its ending point. If this seems too
insane to everyone, then as I have said before, this idea is "very
extreme". I do not suggest that this is the best choice.
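
To make the rule above concrete, here is a minimal sketch in Python of a
parser that ignores the name inside every close tag and simply closes the
innermost open element. The names Node, parse_ignoring_close_names and dump
are made up for this illustration, and the tokenizer is deliberately naive
(no attributes with ">" in them, no comments, no void elements), so please
read it as a sketch of the idea only, not as a proposal for how a UA should
implement it.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical, minimal element node: a name plus ordered children."""
    name: str
    children: list = field(default_factory=list)


# Close tag, open tag, or a run of text. Deliberately naive.
TOKEN = re.compile(r"</[^>]*>|<[^/>][^>]*>|[^<]+")


def parse_ignoring_close_names(markup):
    """Build a tree where any close tag closes the innermost open element."""
    root = Node("#root")
    stack = [root]
    for token in TOKEN.findall(markup):
        if token.startswith("</"):
            # The name written inside the close tag is never inspected.
            if len(stack) > 1:
                stack.pop()
        elif token.startswith("<"):
            parts = token[1:-1].split()
            node = Node(parts[0].lower() if parts else "")
            stack[-1].children.append(node)
            stack.append(node)
        else:
            stack[-1].children.append("#text: " + token)
    return root


def dump(node, depth=0):
    """Return the tree as indented '+ name' lines, one per node."""
    lines = []
    for child in node.children:
        if isinstance(child, Node):
            lines.append("  " * depth + "+ " + child.name)
            lines.extend(dump(child, depth + 1))
        else:
            lines.append("  " * depth + "+ " + child)
    return lines


print("\n".join(dump(parse_ignoring_close_names("<span><div>X</span>Y</div>"))))
```

With the example markup, </span> closes the div and </div> closes the span,
so Y ends up as a child of the span, exactly as you observed. For markup
whose tags are properly nested the result is the same tree as usual, since
the innermost open element is always the one being closed anyway.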
> which is insane. It would make a little more sense if it were like this:
>
> + span
>   + div
>     + #text: X
> + #text: Y
>
> In other words, it would be equivalent to this markup:
>
> <span><div>X</div></span>Y
>
> That is actually quite sane and is what OpenSP does with invalid HTML,
> regardless of which elements are used (presumably according to some SGML
> rules), but it would not be compatible with the current state of the web
> at all, and so is not a real option.
>
> > To correctly written webpages, this should pose no problems. To
> > incorrect webpages, they deserve it since the point they ask the UA to
> > use "standard mode".
>
> In theory, that sounds nice, but you have to remember:
>
> "to a rough approximation, all the content on the Web is erroneous,
> invalid, or non-conformant." -- Hixie
>
> So, to say "they deserve it" to 100% of the web (roughly speaking) isn't
> really an option, unfortunately. It's ok to say it to the most
> pathological of cases that depend on one particular browser's insane and
> undefined error recovery techniques, yet already breaks in everything
> else, but not to the whole web.
>
First, my idea would not, and should not, break the whole web. If it
were really deployed, it would only break webpages that are not
well-formed in this particular way.
Second, this discussion started out being about error handling in
HTML5. I believe in the motto "Make the wrong look wrong". Since the
introduction of CSS and its ability to do "div span { blahblahblah;
}", we can't go back to IE's insectual approach. If the error-handling
mechanism makes people feel that mixing up open and close tags is
"okay", and then the mechanism occasionally doesn't live up to their
expectations, they will blame the browser and never notice their own
fault. Unless we can find a perfect mechanism that will never "break"
their expectations, the problem will go on. And I suppose the mechanism
we are discussing here should be used only from HTML5 onward, which the
web as a whole is not using these days.
Of course, if someone can suggest a mechanism which does not "break"
things, I will love it.
> --
> Lachlan Hunt
> http://lachy.id.au/
>
>