[whatwg] [wf2] More late comments and questions on Web Forms 2.0
Ian Hickson
ian at hixie.ch
Tue Aug 15 00:30:12 PDT 2006
On Sun, 12 Mar 2006, Henri Sivonen wrote:
>
> 3.6.1
> Item 10. There's a comma missing after '"[")' and before "a modifier".
Fixed.
> 3.6.1
> Example in item 11. Double quote missing in '"[n· string'.
Fixed.
> 5.
> Step 5. When XML submission is used, characters that are not XMLChars as per
> XML 1.0 need to be dealt with. I suggest dropping them.
I prefer converting them to U+FFFD. Dropping characters can be the source
of very hard-to-debug security problems. Done.
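As an illustration only, the replace-with-U+FFFD approach could be sketched as follows. The function name is hypothetical and not taken from WF2; the character class is the complement of the XML 1.0 "Char" production.

```python
import re

# Matches any character that is NOT allowed by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NON_XML_CHAR = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def replace_non_xml_chars(value: str) -> str:
    """Return value with every non-XML character replaced by U+FFFD.

    Hypothetical helper; replacing instead of dropping keeps the string
    length visible, so two different inputs are less likely to collapse
    into the same submitted value.
    """
    return _NON_XML_CHAR.sub("\uFFFD", value)
```

With dropping, "ad\x00min" would silently become "admin"; with replacement it stays visibly different.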
> Also, when XML submission is used, CRLF line breaks on the data level
> are weird, because the CR would have to be escaped in order to preserve it
> in XML. I suggest using LF line breaks in XML submission. LF line breaks
> in XML may be serialized as literal (unescaped) LF, CR or CRLF.
Done.
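A minimal sketch of that normalization, assuming it is applied to each value before it is serialized as XML (the function name is hypothetical):

```python
def to_lf(value: str) -> str:
    """Convert CRLF and lone CR line breaks to LF.

    CRLF is handled first so that a pair does not turn into two LFs.
    """
    return value.replace("\r\n", "\n").replace("\r", "\n")
```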
> 5.
> Step 5. I think NFC normalization should be applied before using legacy
> encodings as well. E.g. Windows-1252 can encode many precomposed European
> characters but cannot encode the decomposed versions without precomposing
> first. However, in some special cases like Windows-1258 (Vietnamese) it is
> necessary to separate some diacritics from the base characters after the NFC
> step. (But I imagine Windows-1258 encoders do that themselves.)
I don't see that this is a WF2 problem. It's up to the encoding
specifications to specify how to encode Unicode characters.
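To make the Windows-1252 case concrete, here is a small sketch using Python's standard codecs. It only illustrates the point raised above; it is not something WF2 specifies, and the function name is an assumption.

```python
import unicodedata


def encode_after_nfc(value: str, encoding: str = "windows-1252") -> bytes:
    """Normalize to NFC, then encode with a legacy encoding.

    Hypothetical helper: precomposing first lets the encoder find
    characters such as U+00E9 that the legacy code page does contain.
    """
    composed = unicodedata.normalize("NFC", value)
    return composed.encode(encoding)


# "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed form of U+00E9).
decomposed = "e\u0301"
```

Encoding `decomposed` directly as Windows-1252 raises `UnicodeEncodeError`, because U+0301 has no mapping in that code page, while `encode_after_nfc(decomposed)` succeeds.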
> 5.
> Step 8. What happens if a 204 response changes the character encoding
> metadata? Or Content-Type in general for that matter?
This is the realm of the HTTP specification.
> 5.2.
> "Note that a string containing the codepoint's value itself (for example, the
> six-character string "U+263A" or the seven-character string "&#9786;") is not
> considered to be human readable and must not be used as a transliteration."
>
> I agree with the sentiment, but changing that behavior is not
> backwards-compatible.
Backwards compatible with what? IE's behaviour is broken (there's no way
to submit a literal "&#9786;" followed by a U+263A character). It could
even be a security risk in certain instances.
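The ambiguity can be shown with a short sketch. It assumes the criticized behaviour amounts to replacing unencodable characters with decimal numeric character references, which is what Python's `xmlcharrefreplace` error handler does; the function name is hypothetical.

```python
def ncr_fallback_encode(value: str, encoding: str = "windows-1252") -> bytes:
    """Encode value, turning unencodable characters into &#NNNN; references.

    Assumption: this approximates the fallback discussed above.
    """
    return value.encode(encoding, errors="xmlcharrefreplace")


typed_literally = "&#9786;"   # seven ASCII characters typed by the user
smiley = "\u263A"             # the single character U+263A

# Both inputs produce the same bytes on the wire.
same_bytes = ncr_fallback_encode(typed_literally) == ncr_fallback_encode(smiley)
```

Because both inputs submit identical bytes, the server has no way to tell which one the user entered.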
> 5.3. & 5.5.
> "The submission character encoding is selected from the form's accept-charset
> attribute. UAs must use the encoding that most completely covers the
> characters found in the form data set of the encodings specified. If the
> attribute is not specified, then the client should use either the page's
> character encoding, or, if that cannot encode all the characters in the form
> data set, UTF-8."
>
> I think sending UTF-8 to unsuspecting form handlers is worse than losing some
> unencodable characters. Sending UTF-8 to programs that don't expect it amounts
> to garbage in, which increases the global amount of garbage out.
If they haven't specified an encoding, then using the page's encoding is
as much a guess as using UTF-8. The server hasn't said what it expects; it
should use the encoding metadata in the submission to deal with this.
The sooner we switch to a full-UTF-8/16 solution the better.
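For reference, the selection rule quoted above could be sketched like this. The function names are hypothetical, and "most completely covers" is interpreted here as "encodes the largest number of characters in the data set", which is an assumption rather than a definition from the spec.

```python
from typing import Iterable, Sequence


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is encodable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def pick_submission_encoding(
    values: Iterable[str],
    page_encoding: str,
    accept_charset: Sequence[str] = (),
) -> str:
    """Pick an encoding for the form data set (illustrative sketch)."""
    text = "".join(values)
    if accept_charset:
        # Count how many characters each listed encoding can represent;
        # on a tie, max() keeps the encoding listed first.
        return max(
            accept_charset,
            key=lambda enc: sum(can_encode(ch, enc) for ch in text),
        )
    if can_encode(text, page_encoding):
        return page_encoding
    # No accept-charset and the page encoding is insufficient: use UTF-8.
    return "utf-8"
```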
> 5.4.
> Can the presence of the accept-charset attribute be considered non-conforming
> when the XML submission type is specified?
Seems like a fine thing to warn about. I don't know if it should be an
error; what if the page changes the enctype around?
> 5.6. and elsewhere
> Minor typographical nit: Em dash used with spaces on both sides as opposed to
> either em dash without spaces or en dash with spaces.
Em-dash without spaces is ugly, and en-dash is too short. IMHO. :-)
> 5.6.
> "The value of the enctype attribute must be dispatched using a case-
> insensitive literal comparison."
>
> "case-insensitive" marked up as code. Still worried about considering
> Turkish i conforming.
Yeah... I think HTML5 might switch pure-ASCII attributes' case-folding to
ASCII-only. Not sure yet.
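A sketch of what ASCII-only case folding would look like for the enctype comparison. The helper names are hypothetical; the point is that only A-Z are folded, so Turkish dotted and dotless i never match a plain ASCII "i".

```python
# Fold only the 26 ASCII capital letters; every other character is untouched.
_ASCII_LOWER = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "abcdefghijklmnopqrstuvwxyz",
)


def ascii_lower(value: str) -> str:
    """Lowercase A-Z only, independent of locale and Unicode case rules."""
    return value.translate(_ASCII_LOWER)


def enctype_matches(value: str, expected: str) -> bool:
    """ASCII case-insensitive literal comparison (illustrative sketch)."""
    return ascii_lower(value) == ascii_lower(expected)
```

Under this rule "MULTIPART/FORM-DATA" matches "multipart/form-data", while a value spelled with U+0130 or U+0131 in place of "i" does not.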
> 6.1.
> "(Even if importing into a text/html document, the newly imported nodes will
> still be namespaced.)"
>
> But will tagName return in upper case?
->HTML5. (Yes. But that isn't specced yet.)
> General DOM
> Will localName return the name in lower case in HTML DOM?
->HTML5. (Depends on whether the Document is an "HTML" or "XML" Document.)
> 6.1.
> "The following script has only one possible valid outcome:"
>
> "Valid" used loosely. :-)
Fixed.
> 7.10.
> Does "mirror" mean "reflect"?
Changed to reflect, but note that neither term is well-defined in WF2.
> B.
> Is the presence of inapplicable attributes in the input element non-
> conforming? (I think it would be useful to make inapplicable attributes
> non-conforming.)
It should warn, for sure, but I don't know that making it non-conforming
is useful. As mentioned previously, I don't like making things
non-conforming unless they are very clearly wrong.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'