[whatwg] [WF2] form submission protocols and methods

Fri Dec 9 17:15:01 PST 2005

On Dec 9, 2005, at 3:42 PM, Ian Hickson wrote:

> On Fri, 9 Dec 2005, Maciej Stachowiak wrote:
>>
>> I think a lot of section 5.6 should be removed from the spec.
>
> Most of section 5.6 consists of defining behaviour to ensure
> interoperability between implementations, since if the spec doesn't list
> what happens then implementations either have to reverse-engineer each
> other or end up not doing the same thing.

Sure, but I think in many cases the spec could just specify current  
de facto behavior, instead of inventing new semantics.

> Note that user agents may implement whichever URI schemes are required
> for their particular application. The WF2 specification does not specify
> a required core set of protocols that must be implemented. For those that
> are implemented, UAs must use the algorithms given in section 5.6 when
> submitting data using those protocols, but for those that aren't, the
> presence of the protocol in section 5.6 doesn't imply the UA is
> non-conformant or anything.

I think the spec could state this more clearly.

>> Also, you might want more controls in the form that aren't themselves
>> submitted but specify options for how to do the upload.
>
> The new definition allows that. Let me know if that is better:
>
>    http://whatwg.org/specs/web-forms/current-work/#missing-enctype
>
>
>> It would be better if file upload semantics could be selected
>> explicitly.
>
> True. Maybe a new enctype value instead?

I like the proposal of using a new enctype value; that makes it
totally clear when this behavior applies. Under the current scheme,
the upload format varies based on whether the user has selected
anything in the file upload control. This seems poor. Submission
should either fail in this case, or there should be some defined
behavior.

>> http: - "put" and "delete" are little-used methods on the web.
>
> Well, yeah, since there's basically no way to use them. This is partly
> intended to address this.

In theory you could use them through XMLHttpRequest. If that were
specified as allowed, I think it would be more useful than allowing
them as form submission methods. They seem like the sort of thing you
are more likely to want as one of many side effects of a complex
action, rather than as a sole action in themselves.

>> ftp: - I do not believe any methods but "get" should have specified
>> behavior.
>
> What should happen if the author specifies something else then?

As suggested below, treat as unknown method, i.e. like "get".

>> The spec itself says "ftp:" is not recommended as a submission method,
>> so why extend it?
>
> It's not a matter of extending it, it's a matter of defining it so that
> all UAs can converge on the same behaviour.

Do any UAs currently support non-get methods for ftp:? If so,  
standardizing this behavior would be reasonable. But I do not believe  
they do, in which case this constitutes an extension. Current UA  
behavior (i.e. get only) is a fine behavior to converge on, in my  
opinion.

>> data: - all the methods except "get" seem weird and of questionable
>> usefulness. The things you can do through trivial text substitution are
>> extremely limited and are better handled by script IMO.
>
> So what should happen instead?

Once again, treat as unknown method, i.e. like "get". I think there
is a mistaken desire to fill every box in the 3-D
protocol/format/method grid, but it seems more reasonable to me to
treat method as specific to protocol, for now applicable only to http,
and to essentially ignore method for the other protocols.
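
To illustrate the fallback I am suggesting, here is a minimal Python
sketch. The function name effective_method and the METHODS_BY_SCHEME
table are hypothetical, made up for this example and not taken from the
WF2 draft; listing "https" alongside "http" is likewise my assumption.

```python
# Hypothetical table: schemes for which a method has semantics of its own.
# Only http is named in this discussion; "https" is an added assumption.
METHODS_BY_SCHEME = {
    "http": {"get", "post", "put", "delete"},
    "https": {"get", "post", "put", "delete"},
}


def effective_method(scheme, method):
    """Return the method a UA would actually use for this scheme.

    Any method that has no defined meaning for the scheme is treated
    as an unknown method, which falls back to "get".
    """
    scheme = scheme.lower().rstrip(":")
    method = method.lower()
    if method in METHODS_BY_SCHEME.get(scheme, ()):
        return method
    return "get"
```

Under this sketch, a form with method="post" and a mailto: action
resolves to "get", and so does an unrecognized method such as "foobar"
on http.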

>> file: - "post", "put" and "delete" are severe security risks even in
>> documents that themselves come from file: URLs, since this would make
>> downloaded HTML documents considerably more dangerous. The spec says
>> "For security reasons, untrusted content should never be allowed to
>> submit or fetch files specified by file URIs" but it is unclear what
>> this means.
>
> What is unclear? The term "untrusted content" or the phrase "allowed to
> submit or fetch"? (Or something else?)

"Untrusted content" is unclear. It implies the existence of something  
that isn't "untrusted content", i.e. "trusted content". Where is that  
defined? I do not believe it is defined anywhere, in which case  
specifying its behavior seems non-useful.

>> If this is meant to apply to normal "file:" URL documents, then I
>> strongly oppose these extensions. If it is not, then it is specifying
>> behavior for some kind of special "trusted" mode which is not itself
>> defined by this or any other spec, which seems outside the scope of the
>> spec.
>
> As the spec says: "The semantics described in this subsection are
> recommended, but UAs may implement alternative semantics if desired, as
> consistent behaviour for submission to file: URIs is not required for
> interoperability on the World Wide Web."

If it's just a suggestion, it should go in a non-normative part of  
the document.

>> mailto: - "post"/"put"/"delete" behavior seems useless, since the form
>> can control the body but not the headers (or at least the headers can't
>> come from form elements in any obvious way). It seems like in most cases
>> you'd want the body text to be composed text in any case - popping up a
>> message window full of form submission data does not seem useful. I
>> recommend just removing everything but "get" for now, since the feature
>> freeze means it is too late to redesign this.
>
> Assuming there is no feature freeze, how would you do it? I assume that
> when I specified this I tried to make it match implemented behaviour
> while defining the parts that weren't interoperably implemented. In
> particular, note that with mailto: URIs, the enctype is usually set to
> text/plain which is a lot more readable than most submission types.

Once again, I would omit this and treat it like "get". I think being
able to specify the message body is a handy feature for "mailto:"
submission. But this is already offered via the "get" method; see
http://ftp.ics.uci.edu/pub/ietf/uri/rfc2368.txt and search for
"body". So "post" is not adding any new capabilities, and therefore I
think the spec should omit this extension and treat it as an unknown
method, i.e. the same as "get".
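
As a rough illustration of the "get" route, the sketch below builds an
RFC 2368 style mailto: URI whose query part carries the subject and
body. The helper name mailto_get_uri and the address are hypothetical,
and percent-encoding each value with urllib.parse.quote (so a space
becomes %20) is an assumption of the sketch, not a description of what
any UA does today.

```python
from urllib.parse import quote


def mailto_get_uri(address, fields):
    """Build a mailto: URI with header fields in the query part.

    `fields` maps names such as "subject" or "body" to their values,
    in the way a "get" submission would append form data to the action.
    """
    query = "&".join(
        f"{quote(name, safe='')}={quote(value, safe='')}"
        for name, value in fields.items()
    )
    if not query:
        return f"mailto:{address}"
    return f"mailto:{address}?{query}"


# Hypothetical form data: two controls named "subject" and "body".
uri = mailto_get_uri(
    "user@example.com",
    {"subject": "Hi", "body": "Hello world"},
)
```

Here uri carries the message body in the URI itself, which is the
capability "post" was supposed to add.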

>> smsto: / sms: - It seems overly aggressive to specify form submission
>> behavior for URI schemes that are not themselves formally specified in
>> any way. Indeed, the spec itself says "Behavior is undefined, pending
>> the release of an smsto: or sms: specification."
>
> I've commented these out for now.

Sounds good.

>> I do not think it is right for the spec to call for undefined behavior.
>> The right way to leave behavior undefined is to not specify it.
>
> I think it's much better to explicitly state when behaviour is undefined
> rather than leave the reader in the dark, but that's another story.

Well, submission behavior is also unspecified for "gopher:", "sip:",  
"nfs:", and so forth. I do not think it is the spec's job to list  
every URI scheme.

>> javascript: - This is redundant with onsubmit event handlers. Recommend
>> removal.
>
> So what should happen when you submit to a javascript: URI?

By my cursory testing, current UA behavior is to do nothing,  
effectively disallowing javascript: as a submit action. Perhaps this  
could be specified. Note that javascript: URIs are not allowed in all  
contexts. For instance they are not allowed as the source for a  
script, style or img tag by any current browser as far as I know.  
They are allowed only as link hrefs or frame/iframe src (as well as  
entered directly via the UI).

>
>
>> As a general comment, making "put" and "delete" have the behavior of
>> "post", when they have no obvious semantic of their own, seems
>> questionable.
>
> What would you have them do instead?

I said in the very next sentence what I would have them do!

>> They should be treated however unknown methods would be, since for those
>> protocols they effectively are unknown.
>
> If it's not known, it's treated as GET. We could do that instead, it just
> seemed that PUT's semantics were closer to POST than GET.

Sure, but if there is no actual PUT operation, then it seems like the  
right treatment is as unknown action. Why should POST be handled  
differently for mailto: than, say, HEAD? Or FOOBAR? All three are  
equally meaningless in this context.

Regards,
Maciej