[whatwg] [wf2] More late comments and questions on Web Forms 2.0

Henri Sivonen hsivonen at iki.fi
Sun Mar 12 06:31:11 PST 2006

These are based on the 2006-01-10 draft.

Item 10. There's a comma missing after '"[")' and before "a modifier".

Example in item 11. Double quote missing in '"[n· string'.

Step 5. When XML submission is used, characters that are not XMLChars  
as per XML 1.0 need to be dealt with. I suggest dropping them.
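
A minimal sketch of the "drop them" approach, in Python. The function
name drop_non_xml_chars is made up for illustration and appears nowhere
in the draft. The ranges are those of the Char production in XML 1.0.

```python
def drop_non_xml_chars(text: str) -> str:
    """Remove code points that the XML 1.0 Char production does not allow."""

    def is_xml_char(cp: int) -> bool:
        # Char ::= #x9 | #xA | #xD | [#x20-#xD7FF]
        #        | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
        return (
            cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF
        )

    return "".join(ch for ch in text if is_xml_char(ord(ch)))
```

Dropping is only one option; replacing with U+FFFD would be another,
but it changes the submitted data more visibly.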

Also, when XML submission is used, CRLF line breaks on the data level
are weird, because the CR would have to be escaped in order to preserve
it in XML. I suggest using LF line breaks in XML submission. LF line
breaks in XML may be serialized as literal (unescaped) LF, CR or CRLF.
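
A rough sketch of the LF suggestion, in Python. The name to_lf is
hypothetical, and treating a lone CR as a line break too is an
assumption on top of what is said above.

```python
def to_lf(value: str) -> str:
    """Normalize line breaks in a form value to LF before XML serialization."""
    # Replace CRLF first so that each pair collapses to a single LF,
    # then convert any remaining lone CR.
    return value.replace("\r\n", "\n").replace("\r", "\n")
```

With the data in this form, no character reference for CR is needed in
the serialized XML.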

Step 5. I think NFC normalization should be applied before using  
legacy encodings as well. E.g. Windows-1252 can encode many  
precomposed European characters but cannot encode the decomposed  
versions without precomposing first. However, in some special cases  
like Windows-1258 (Vietnamese) it is necessary to separate some  
diacritics from the base characters after the NFC step. (But I  
imagine Windows-1258 encoders do that themselves.)
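
To illustrate the Windows-1252 case, here is a small Python sketch. The
helper name encode_legacy is invented for this example. It composes the
value with NFC before handing it to the legacy encoder.

```python
import unicodedata


def encode_legacy(value: str, encoding: str = "windows-1252") -> bytes:
    """Apply NFC, then encode with a legacy encoding."""
    # "e" followed by U+0301 COMBINING ACUTE ACCENT becomes the single
    # precomposed character U+00E9, which Windows-1252 can represent.
    composed = unicodedata.normalize("NFC", value)
    return composed.encode(encoding)
```

Without the NFC step, encoding "Cafe\u0301" as Windows-1252 fails in
Python with UnicodeEncodeError, because U+0301 has no mapping in that
encoding.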

Step 8. What happens if a 204 response changes the character encoding  
metadata? Or Content-Type in general for that matter?

"Note that a string containing the codepoint's value itself (for  
example, the six-character string "U+263A" or the seven-character  
string "&#9786;") is not considered to be human readable and must not
be used as a transliteration."

I agree with the sentiment, but changing that behavior is not  

5.3. & 5.5.
"The submission character encoding is selected from the form's accept- 
charset attribute. UAs must use the encoding that most completely  
covers the characters found in the form data set of the encodings  
specified. If the attribute is not specified, then the client should  
use either the page's character encoding, or, if that cannot encode  
all the characters in the form data set, UTF-8."

I think sending UTF-8 to unsuspecting form handlers is worse than
losing some unencodable characters. Sending UTF-8 to programs that
don't expect it amounts to garbage in, which increases the global
amount of garbage out.
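
The two behaviors can be contrasted with a short Python sketch. Both
function names are hypothetical, and only the case where accept-charset
is absent is covered. The first follows the quoted text; the second
keeps the page's encoding and drops what it cannot encode.

```python
def encode_with_utf8_fallback(value: str, page_encoding: str) -> tuple[str, bytes]:
    """Quoted behavior: switch to UTF-8 if the page encoding is not enough."""
    try:
        return page_encoding, value.encode(page_encoding)
    except UnicodeEncodeError:
        return "utf-8", value.encode("utf-8")


def encode_lossy(value: str, page_encoding: str) -> tuple[str, bytes]:
    """Alternative: always use the page encoding, dropping unencodable characters."""
    # errors="ignore" silently omits characters with no mapping.
    return page_encoding, value.encode(page_encoding, errors="ignore")
```

For the value "na\u00efve \u263a" on an ISO-8859-1 page, the first
function switches to UTF-8 and sends two bytes for the i with diaeresis
and three for the smiley. A handler that expects ISO-8859-1 would read
those as five unrelated characters. The second function sends the i
with diaeresis as a single byte and omits the smiley.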

Can the presence of the accept-charset attribute be considered
non-conforming when the XML submission type is specified?

5.6. and elsewhere
Minor typographical nit: Em dash used with spaces on both sides as  
opposed to either em dash without spaces or en dash with spaces.

"The value of the enctype attribute must be dispatched using a case- 
insensitive literal comparison."

"case-insensitive" marked up as code. Still worried about considering  
Turkish i conforming.
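
A sketch of the Turkish i concern, in Python. The helper names are made
up. The assumption here is that an ASCII-only comparison is what is
wanted, which the draft may or may not intend.

```python
def ascii_lower(s: str) -> str:
    """Lower-case only A-Z; leave every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def enctype_matches(value: str, expected: str) -> bool:
    """Compare two enctype values case-insensitively over ASCII only."""
    return ascii_lower(value) == ascii_lower(expected)
```

With this comparison, "TEXT/PLAIN" matches "text/plain", but a value
containing U+0131 (dotless i) does not. A Unicode-aware comparison via
upper-casing would accept it, since Python's str.upper() maps U+0131 to
"I".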

"(Even if importing into a text/html document, the newly imported  
nodes will still be namespaced.)"

But will tagName return in upper case?

General DOM
Will localName return the name in lower case in HTML DOM?

"The following script has only one possible valid outcome:"

"Valid" used loosely. :-)

Does "mirror" mean "reflect"?

Is the presence of inapplicable attributes in the input element
non-conforming? (I think it would be useful to make inapplicable
attributes non-conforming.)

Henri Sivonen
hsivonen at iki.fi
