[whatwg] Spec comments, sections 1-2

Wed Jul 15 15:25:55 PDT 2009

In 2.4.4.1:

"If position is not past the end of input, return to the top of the
step labeled loop in the overall algorithm (that's the step within
which these substeps find themselves)."

Why not just "go to step 9"?  In any event this is inconsistent with
2.4.4.2, which says

"If position is not past the end of input, return to the top of step 9
in the overall algorithm (that's the step within which these substeps
find themselves)."

Either both should say "the top of step 9" or both should say "the top
of the step labeled loop".  I don't see the value in the whole "in the
overall algorithm . . ." part, since in context there's no ambiguity
with just giving the number.

"If sign is "positive", return value, otherwise return 0-value."

I initially read "0-value" as a single hyphenated word, like
"p-value".  Perhaps it should have spaces to make it more immediately
obvious that it's subtraction ("0 - value").
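
To show where that step sits, here is a minimal sketch of a signed
integer parser, loosely modeled on the quoted step.  It is not the
spec's exact algorithm: the function name, the whitespace set, and the
use of None for errors are my own assumptions for illustration.

```python
def parse_signed_integer(text):
    """Hypothetical helper: parse an optionally negative decimal integer."""
    position = 0
    sign = "positive"

    # Skip leading whitespace (assumed set: space, tab, LF, FF, CR).
    while position < len(text) and text[position] in " \t\n\f\r":
        position += 1

    # A leading "-" flips the sign and is consumed.
    if position < len(text) and text[position] == "-":
        sign = "negative"
        position += 1

    # At least one ASCII digit is required; otherwise report an error.
    if position >= len(text) or not ("0" <= text[position] <= "9"):
        return None

    # Accumulate digits; anything after them is ignored.
    value = 0
    while position < len(text) and "0" <= text[position] <= "9":
        value = value * 10 + int(text[position])
        position += 1

    # The step under discussion: a subtraction, not the word "0-value".
    if sign == "positive":
        return value
    return 0 - value
```

With this sketch, parse_signed_integer("  -42px") gives -42 and
parse_signed_integer("abc") gives None.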

In 2.6.2:

The specification says that user agents may treat HTTPS content as
though it were unencrypted.  For instance, an example states: "If a
user connects to a server with a self-signed certificate, the user
agent could allow the connection but just act as if there had been no
encryption."  If this is done, however, man-in-the-middle attacks
become trivial, unless the user is expected to notice the lack of
encryption (unlikely).

For instance, suppose a user navigates to PayPal and bookmarks it.
PayPal is configured so if you try using HTTP (e.g., typing
"paypal.com" in the URL bar), it will redirect to HTTPS.  Therefore
the user will bookmark a URL such as https://www.paypal.com/.  Now
suppose the user later attempts to access this site from the bookmark
with a MITM present (e.g., a free wireless router placed in a public
place by a malicious person).

The router can intercept the HTTPS request, make its own identical
HTTPS request to the real server, and return the response to the user,
but signed with its own key instead of the server's.  If the user
agent behaves as described in the example, the only way for the user
to notice this is to notice that the URL bar looks different, or
whatever other visual cue the browser uses to indicate encryption.  If
the user agent raises a prominent scary warning or even makes it
difficult for the user to continue, on the other hand, there's no way
for the attacker to prevent this, AFAIK.
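
The difference between the two policies can be modeled in a few lines.
This is only a toy sketch of the argument, not any browser's actual
logic: the Cert enum, the function names, and the result strings are
all hypothetical.

```python
from enum import Enum


class Cert(Enum):
    """Hypothetical certificate states; real browsers track many more."""
    TRUSTED = "trusted"
    SELF_SIGNED = "self-signed"


def treatment(cert, warn_on_self_signed):
    """Return how a (hypothetical) user agent presents the connection."""
    if cert is Cert.TRUSTED:
        return "show as encrypted"
    if warn_on_self_signed:
        return "block with warning"
    # The behavior the spec's example permits: load the page silently.
    return "show as unencrypted"


def what_user_sees(mitm_present, warn_on_self_signed):
    """Model the bookmark scenario for an https:// URL."""
    # Assumption: the attacker cannot obtain a trusted certificate for
    # the real site, so it re-signs the traffic with its own key.
    cert = Cert.SELF_SIGNED if mitm_present else Cert.TRUSTED
    return treatment(cert, warn_on_self_signed)
```

Under the permissive policy, what_user_sees(True, False) returns "show
as unencrypted", so the page loads and only the missing indicator gives
the attack away.  With warnings enabled, what_user_sees(True, True)
returns "block with warning", which the attacker cannot suppress.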

The section should prohibit user agents from displaying pages served
with self-signed certificates without at least giving a warning.  Or,
at a minimum, it should strongly discourage it.  Currently it seems to
indicate that this behavior is acceptable.  As far as I know, existing
browsers all present scary warnings for self-signed certificates
(probably so scary as to be misleading, in fact, but that's a separate
issue).

In 2.7:

"User agents must at a minimum support the UTF-8 and Windows-1252
encodings, but may support more.

"It is not unusual for Web browsers to support dozens if not upwards
of a hundred distinct character encodings."

Why aren't the most important ones listed as requirements?  This seems
to be contrary to the usual HTML 5 philosophy of mandating (or at
least precisely specifying) existing behavior that's required for
compatibility.
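
To make the compatibility concern concrete, here is a rough sketch of a
decoder that supports only the two mandated encodings.  The fallback to
Windows-1252 for unknown labels is my own assumption for illustration,
not something the spec text quoted above requires, and the function
name is hypothetical.

```python
MINIMUM_ENCODINGS = ("utf-8", "windows-1252")


def decode_with_minimum_support(data, declared_label):
    """Hypothetical decoder for a user agent with only the minimum set."""
    label = declared_label.lower()
    if label in MINIMUM_ENCODINGS:
        return data.decode(label)
    # Assumed fallback: treat anything else as Windows-1252, replacing
    # bytes that have no mapping in that encoding.
    return data.decode("windows-1252", errors="replace")


# "konnichiwa" in hiragana, written with escapes to keep this ASCII.
greeting = "\u3053\u3093\u306b\u3061\u306f"
page_bytes = greeting.encode("shift_jis")
garbled = decode_with_minimum_support(page_bytes, "Shift_JIS")
```

Here garbled does not equal greeting: a conforming but minimal user
agent would render existing Shift_JIS content as mojibake, which is the
kind of compatibility gap a list of required encodings would close.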