[whatwg] [editing] HTML Editing APIs specification ready for implementer feedback

Wed Aug 3 13:13:41 PDT 2011

Mike Smith has been kind enough to add a component to the W3C Bugzilla
for the editing spec, in the WebApps WG product.  From now on I'll use
it for tracking feedback so I can be organized about things that I
can't fix right away.  The link to file a bug is:

http://www.w3.org/Bugs/Public/enter_bug.cgi?product=WebAppsWG&component=HTML+Editing+APIs

I'll still respond to all e-mail feedback.

On Tue, Aug 2, 2011 at 8:31 PM, Ryosuke Niwa <rniwa at webkit.org> wrote:
> Feedback on selections 5 through 7:
>
> The definition of collapsed line break isn't clear.  Maybe it's br
> immediately before the end of a block?

Unfortunately, that's not good enough if we want it to be correct in
all cases.  For instance, the <br> in <p><span><br><!-- foo
--></span></p> behaves like a collapsed line break for CSS purposes.
Maybe we could just let the algorithm be wrong in those cases, if we
can't come up with a better definition.  The problem is that a real
definition would depend very heavily on CSS.  This is why there's a
big red box in the spec . . .

> Isn't this essentially the collapsed line break and a br inside a block
> where br is the sole visible node?

Sorry, I don't understand what you're saying here.

> The definition of visible should definitely take display: none and
> visibility: hidden into account.  Also, you should take collapsed whitespace
> into account.  e.g. " <br> " is invisible even though there are Text nodes
> before and after br.  CSS2.1 spec section 16.6.1 has some elaboration on how
> whitespace is collapsed.

visibility: hidden shouldn't be taken into account, I don't think.
Such nodes still take up space, they just don't get painted, so they
don't behave like real invisible nodes.  I definitely need to take
collapsed whitespace into account, and display: none.  I filed a bug:

http://www.w3.org/Bugs/Public/show_bug.cgi?id=13631

> Step 3 in "remove extraneous line breaks before" seems redundant because we
> traverse the tree in the reversed tree order in step 4.

I'm not sure what you mean.  Step 3 is "While ref has children, set
ref to its lastChild", and step 4 is "While ref is invisible but not
an extraneous line break, and ref does not equal node's parent, set
ref to the node before it in tree order".  Suppose you have

<span>foo<br></span><p>bar</p>

and the algorithm is invoked with ref equal to the . First we set
ref to the in step 1. Then in step 3 we set it to the .
Then step 4 does nothing, because the is an extraneous line
break. Then in step 5 we remove the . Without step 3, ref would
still be equal to the .

> What are "sibling criteria" and "new parent instructions" in 6.2?

I changed

"""
To wrap a list node list of consecutive sibling nodes, given sibling
criteria and new parent instructions, run the following algorithm.  If
not provided, sibling criteria match nothing and new parent
instructions return null.
"""

to

"""
To wrap a list node list of consecutive sibling nodes, run the
following algorithm. In addition to node list, the algorithm accepts
two inputs: an algorithm sibling criteria that accepts a node as input
and outputs a boolean, and an algorithm new parent instructions that
accepts nothing as input and outputs a node or null. If not provided,
sibling criteria returns false and new parent instructions returns
null.
"""

I also changed "sibling criteria" and "new parent instructions" to use
<var> instead of <dfn>/, to match other variables. Does that
make it clear?
http://aryeh.name/gitweb.cgi?p=editing;a=commitdiff;h=2992f823

> Also where would new parent be inserted if new parent's parent was not null?
>  Or will it stay where it was?

I clarified in a comment:

http://aryeh.name/gitweb.cgi?p=editing;a=commitdiff;h=5771e7c0

Does it make sense now?

> I'm not sure why we'd need to add br's so aggressively in this algorithm

Basically every time I add a , it's because I found a case in
tests where there was some bug otherwise. As far as I know, every
time a is added, it's needed to stop two lines from running
together -- I avoid adding unnecessary s and in fact remove them
in a lot of places. If you look closely at the conditions, these
 's will only be added when you're wrapping in block elements.
Things like if you have

  <blockquote>foo</blockquote>[bar]

and the author runs "indent", it has to become

  <blockquote>foo<br>[bar]</blockquote>

not

  <blockquote>foo[bar]</blockquote>

None of these should normally do anything if you're wrapping in
something like a . Are there any specific cases that you think are
unnecessary?

> Can 6.3 be tied with "phrasing content" concept used in the rest of HTML5
> spec?

See the comment at the beginning of the "allowed child" algorithm,
beginning with "TODO: This list doesn't currently match HTML's
validity requirements for a few reasons:". The biggest problem with
phrasing content is that it's only defined for valid elements, but we
need to also care about things like for compatibility.

But as I say in the comments next to the definition for "name of an
element with inline contents", "TODO: The definitions of prohibited
paragraph children and elements with inline contents should be in the
HTML spec (possibly under a different name) so they don't fall out of
sync. They'll do for now."  So yes, I do think it would be good if
these were moved into HTML somehow, but it's not high priority.

> 7.2: Firefox appears to differentiate backColor and hiliteColor; namely
> backColor is always the first non-transparent background color of the block
> ancestors.

Yes.  See the comment at the beginning of backColor, "We have three
behaviors to choose from for this one: . . .".  There's no interop on
what backColor does, and I wound up defining it identically to
hiliteColor.  This basically matches the behavior of IE and WebKit,
but Gecko and Opera behave differently.

> 7.2: The last time I checked, only WebKit respected vertical-align for "sub"
> and "sup" so I'm not sure we should keep that.  It appears that all other
> browser ignores vertical-align.

Yes, see the comment next to "If node's "vertical-align" property has
resolved value "sub", set affected by subscript to true".  I honor
vertical-align because WebKit will currently create such markup in
styleWithCSS = true mode, although the spec says not to, so I wanted
the algorithm to handle the existing markup correctly.  If WebKit
wants to change, I'd be happy to remove this from the spec, since it
complicates things.

> 7.6: In WebKit, we have the concept of typing style, which is a collection
> of CSS styles that will be applied when user types characters (uses b, i,
> etc... when StyleWithCSS is false).  Maybe we can introduce this concept in
> the spec, and step 2 in 7.6 can update that?

This seems like it's the same basic idea as "state override" and
"value override".  I clarified what those are meant to do, since it
probably didn't make sense on a first reading:

http://aryeh.name/gitweb.cgi?p=editing;a=commitdiff;h=22697d3d

Do you think anything else needs to be changed about how these work?
In WebKit it seems not to be just a matter of CSS styles: it will also
honor createLink with a collapsed selection (other browsers don't, but
the spec does).

> 7.7: Should we assume two colors are same if both of them had alpha=0 since
> they are transparent anyway?

I think it's simpler not to.  The worst that happens is we replace one
transparent color by another, which doesn't matter, so there's no
point in making it more complicated.

> 7.8: WebKit (and Firefox 5 as far as I checked) regards 700, 800, & 900 as
> "bold".

See the comment next to "inline command activated values" for bold.
On my test systems with default settings (both Ubuntu 10.04 and
Windows 7 IIRC), 600 and up translates to bold in IE, Firefox, and
Opera, and 700 and up translates to bold in Chrome. But even in
Chrome, running the "bold" command on [foo] will strip the span tag, not
replace it with or anything. See here:

http://aryeh.name/spec/editing/autoimplementation.html#bold

If you run the tests (takes a while), you can scroll down to the test
'foo[bar]baz' and look for
yourself -- Chrome matches the current spec here. It would be nice if
CSS standardized this better, though.

> The algorithm to compute the legacy font size in 7.11 doesn't really match
> the one WebKit and Firefox implement. Maybe it's better to say it's the
> numbers between 1 and 7 such that it would have produced the closest font
> size to the resolved value of "font-size" in pixels when used in font
> element's size attribute.

I think that's basically what the algorithm does, doesn't it?  (BTW,
no one but WebKit seems to use any kind of algorithm like this at all
in my testing.  See the comment next to "Value".)

> 7.15: WebKit uses blacklist.  And IE doesn't modify any inline style
> declaration so I'd vote for black-listing.  I did an extensive research
> about this when I recently re-implemented WebKit's
> RemoveFormat: https://bugs.webkit.org/show_bug.cgi?id=43017

Thanks, I'll look into that.  Bug filed:

http://www.w3.org/Bugs/Public/show_bug.cgi?id=13632

> I've read a part of sections 7 and 8 but I've kind of lost here.  The spec
> is very detailed and I can't really get a high-level view of what will
> happen.  It might be helpful to have some high-level summary of what it
> tries to do for each algorithm something like one at the beginning of 7.6.

I'll do that.  Bug filed:

http://www.w3.org/Bugs/Public/show_bug.cgi?id=13633

> I'm mainly concerned that there doesn't seem to be a good way for me to
> check whether the current implementation is consistent with your spec
> because the spec is defined in terms of algorithms.  Indeed, it's a NP-hard
> problem :(

Implementations will have to be rewritten to match the spec, yeah.
They're not going to give the same results if they weren't written
from scratch to behave like the spec.  A good test suite should be
able to check if implementations are matching the spec well enough.

> Also, I'm not certain if there's a much value in each browser matching the
> spec exactly.  I feel like we need to have some kind of tolerance level as
> done in browserscope's RTE2 test suite (+roland since he worked on this
> stuff).

The RTE2 test suite needs to have some tolerance because otherwise it
would just fail almost every browser for almost every test.  Once we
have an exact spec, I don't see why browsers shouldn't match it
exactly.  If there are specific things that you think should be left
undefined, by all means please suggest them, but undefined behavior is
usually a very bad thing on the web platform.

> Test suites like Roland's might also help us determining whether
> your spec is consistent with browsers' or not.  In fact, is there a way we
> can run browserscope tests under your reference implementation?

I wrote my own tests, which formed the basis for a lot of the spec:

http://aryeh.name/spec/editing/autoimplementation.html

They're not very well documented right now, I should fix that up.
They do have some built-in tolerance right now, like RTE2, because
otherwise they'd be useless.  But I think we should eventually have a
zero-tolerance test suite, once browsers start really implementing the
spec.

I filed a bug on documenting my tests:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=13634

Thanks a lot for all your feedback!