[whatwg] A mechanism to improve form autofill

Ian Hickson ian at hixie.ch
Thu Aug 2 11:42:12 PDT 2012


On Mon, 23 Jul 2012, Ian Hickson wrote:
> 
> So we could define the autocomplete="" field's value as follows: [...]

I've now specced this, with some minor changes.


On Wed, 25 Jul 2012, David Holloway wrote:
>
> A "contact" address might be helpful for sites that are non-commercial in
> nature.  Airlines and hotels often ask for contact information such as here:
> 
> https://src.chromium.org/viewvc/chrome/trunk/src/chrome/test/data/autofill/heuristics/input/10_register_hotels.com.html?revision=89396&view=markup
> 
> Or, optionally:
> 
>   subsection = up to one of: "shipping" or "billing"
> 
> Where omitting the subsection covers the general case.

I went with making shipping/billing optional.


> > Anything other than "work", "home", and "fax"? Should it be "work-fax" 
> > and "home-fax"?
> 
> "mobile", "pager"?

Added those.


On Wed, 25 Jul 2012, Maciej Stachowiak wrote:
> 
> For some of these fields, autocomplete="" as a hint to autocompletion 
> seems sufficient. However, I think some may logically be a distinct 
> input type as well. Some of the information represented in the proposal 
> below is also redundant with existing type values (so it needs to be 
> specified either twice or in a conflicting way).

I've added a section that details the difference between type="", 
inputmode="", and autocomplete="". Let me know if that doesn't answer your 
questions on this front.


> I think cc-number is worthy of a distinctive type value. Credit card 
> numbers have a distinctive syntax. At the very least, they are numeric 
> and should trigger a numeric keyboard on touch devices and restriction 
> to digits. But they cannot be <input type=number> because it would be 
> wrong to format and localize the number (with comma or dot separators 
> for instance), and a spinner button is an obviously inappropriate 
> treatment. A similar consideration applies to cc-csc. These should 
> either be assigned distinctive types, or else we need to introduce a new 
> input type for a string of digits that is not to be formatted as a 
> number or treated as a spinner button (<input type=digits> or <input 
> type=numeric>). I think it is essential to do that before widely 
> deploying these autocomplete values, or else browsers will start using 
> the autocomplete value to drive behavior of the control itself, which 
> defeats the purpose of having a separate autocomplete attribute.

As far as I can tell, this is just <input type=text inputmode=numeric>.


> cc-exp subtypes could be distinguished by input type for cases where 
> they are not selects. Or alternately, it would be nice if there was a 
> way to use <input type=month> in browsers that have support for it, and 
> the traditional two selects or two text fields.

Without script, that's hard. With script it's possible today.


> >                           language, bday, bday-day, bday-month, 
> >                           bday-year,
> 
> It's unfortunate that we don't have distinct input types for just a day, 
> just a month, or just a year.

Why? (What's wrong with type=number, <select>, and type=number 
respectively?)


> <input type=url> exists, doesn't seem necessary to also have an 
> autocomplete value.

As with the others, type=url just means "the data type is URL", it doesn't 
mean "the value is my home page". Let me know if you still disagree after 
having read the section I added to the spec and I'll reconsider. :-)


> Also, should this not be a contact field?

Do people have different home pages based on whether they're at home or at 
work or on their cellphone?


> 
> >   contact-type  = "home", "work", "cell", or "fax"
> >   contact-field = one of: email, tel, tel-country-code, tel-national,
> >                           tel-area-code, tel-local, tel-local-prefix, 
> >                           tel-local-suffix, tel-extension, impp
> 
> I would suggest dropping the contact field values "email" and "tel" and 
> instead infer them from type.

Please let me know if you still support this after reading the 
aforementioned section in the spec. (In particular, the spec talks 
explicitly about the "tel" case.)


> So instead of <input type=tel autocomplete="work tel"> you would just 
> say <input type=tel autocomplete=work> (and would not be able to say 
> <input type=text autocomplete="work tel">, which would be an inferior 
> user experience when tel is given special behavior, or <input type=email 
> autocomplete="work tel">, which would be inconsistent).

I'm a little wary about adding more magic here; these attributes are 
already pretty complicated. See the autocomplete section's algorithms and 
let me know if you still think we should do something along those lines. 
If it's something people are willing to implement, I wouldn't want to 
stand in the way; I agree that it has some good side-effects (like making 
it impossible to have certain combinations).

I could also introduce some conformance requirements to make the bogus 
combinations non-conforming; currently I haven't made type=tel 
autocomplete=email non-conforming for instance.


On Wed, 25 Jul 2012, Anne van Kesteren wrote:
> 
> This is also true for the inputmode attribute. In particular its 
> Telephone, E-mail, and URL states.

I've de-emphasised those in the spec. They rarely have valid use cases. 
(About the only one I could come up with was a textarea where the user 
agent dynamically detects the user is about to enter a phone number and 
dynamically changes the inputmode accordingly, or some such.)


> If we add this, we should also add guidance on how 
> type/autocomplete/inputmode work together.

Done. Let me know if you can think of more to say here.


On Wed, 25 Jul 2012, Maciej Stachowiak wrote:
> 
> Similarly, I'm confused about the need to have both <input type=number> 
> and <input inputmode=numeric>. They are not exactly the same, but it is 
> mysterious that one is a type and the other is the inputmode. Also, 
> neither is appropriate for pure digit strings such as credit card 
> numbers or CVVs, where the thousands separator and negative indicator 
> should never be added, either explicitly by the user or as part of 
> formatting by the UA.

inputmode=numeric explicitly says it's intended (in part) for credit card 
numbers, though I agree that some of the things it describes are not 
necessary for those. I don't think that's necessarily a problem. It didn't 
seem like any platform had a credit card input mode, anyway. If they did, 
I'd be happy to add that as an explicit input mode.


On Thu, 26 Jul 2012, Aryeh Gregor wrote:
> 
> Government-issued ID numbers might be worth adding.  In America, social 
> security numbers are sometimes used for this purpose, but are treated as 
> semi-secret, so you usually don't enter them on web forms. (My American 
> college did use my social security number as an ID number, but not in 
> web forms as far as I remember.)  But in Israel, and I assume some other 
> countries, there are national ID numbers that are considered public 
> info.  E.g., my Israeli id number (mispar zehut) is 332752187.  It's 
> printed on my checks and things like that, so it's no secret, and since 
> it's guaranteed to exist and be unique, various institutions use it for 
> login instead of or in addition to a username -- my bank, health 
> insurance provider, etc.

I haven't added this yet.

I also haven't added:
 - payment instrument type
 - payment instrument start date
 - payment instrument issue number (for Maestro)

I also haven't removed, as some people suggested, the three cc-name 
subfields.

I'm open to making all these changes, but figured I would get some more 
input on them first, in particular from Ilya who did the research to come 
up with the original set of fields.


> I would also like to point out that this feature seems to overlap with 
> not only type="" (as has been pointed out), but inputmode="" as well, 
> and for that matter pattern="".  I think it would be quite unfortunate 
> if authors found themselves writing things like
> 
>   <input inputmode="numeric" pattern="\d{16}" autocompletetype="cc-num">
> 
> because that's logically pretty redundant.  But maybe it's the only way 
> to preserve our sanity, because it allows authors to figure out what 
> combination of features they need for their inputs instead of us trying 
> to figure out in advance what the possibilities are.

Yeah. The n-dimensional matrix of all the possible user experiences is 
sparse, certainly, at least in terms of what makes sense, but the full 
list of what makes sense is a lot bigger than anything I'd feel 
comfortable putting in the spec explicitly, I think.


On Thu, 26 Jul 2012, Smylers wrote:
> 
> Perhaps specifying certain autocomplete types could set defaults for 
> pattern and inputmode? So for this example autocomplete=cc-num would, if 
> pattern isn't specified, imply pattern=\d{16}, and equivalently for 
> inputmode?

I'd much rather we stick with only type="" implying magical values 
elsewhere, rather than having the magic go in all directions.


On Thu, 26 Jul 2012, Aryeh Gregor wrote:
> 
> That would be surprising, because autocomplete is just a hint, while 
> pattern doesn't allow form submission if it's not met.  Also, I couldn't 
> swear to you that all credit card numbers are actually 16 digits, or 
> that they will forever be 16 digits, so I'm hesitant to make that 
> connection canonical.

I can swear to you that they are not, in fact. :-)

See, e.g.: http://en.wikipedia.org/wiki/Bank_card_number
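
As a rough illustration of why a fixed-length pattern is risky, the sketch 
below approximates pattern=\d{16} with a whole-string regular expression 
match. The helper name is hypothetical and the sample values are commonly 
published test numbers, used here only as assumptions for illustration; 
none of this comes from the spec.

```python
import re

# Approximation of pattern="\d{16}": the whole value must match, and
# re.ASCII limits \d to the digits 0-9.
SIXTEEN_DIGITS = re.compile(r"\d{16}", re.ASCII)


def matches_sixteen_digit_pattern(value: str) -> bool:
    """Return True only if value is exactly sixteen ASCII digits."""
    return SIXTEEN_DIGITS.fullmatch(value) is not None


# A 16-digit number is accepted, but a 15-digit number is rejected
# outright, so a form using this pattern could never be submitted with it.
sixteen_digit_sample = "4111111111111111"
fifteen_digit_sample = "378282246310005"
```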

My biggest concern would be with introducing an input type that restricted 
input to valid credit card number types, and then finding that there was a 
new type that was incompatible with this, resulting in browsers that 
didn't let users spend their money. We always hate it when other 
industries (or even we ourselves) back us into a corner where 
backwards-compatibility limits where we can go, so I would really rather 
not do this to other industries.


On Thu, 26 Jul 2012, Smylers wrote:
> 
> I'd rather trust Hixie to find out what the rules are and bake them into 
> the spec than for every separate webmaster to try to get this right, 
> because some inevitably won't, especially if there are rules which 
> apparently work for many common cases but actually exclude a minority.

Simplest is just not to bother with a pattern for credit card numbers.


On Thu, 26 Jul 2012, Smylers wrote:
> 
> So I'm wondering if there could be a 'membership' or 'ID number' 
> field-type, followed by an identifier for which organization this is, such 
> as:
> 
>   membership-uk-library
>   membership-israel-id
>   membership-flypoints
> 
> or:
> 
>   idnum-uk-library
>   idnum-israel
>   idnum-flypoints
> 
> This would be different from the other autocomplete field types Hixie 
> has proposed, because the organization suffix is open-ended, rather than 
> from a fixed set. I think that's inevitable: the HTML standard can 
> hardly spec every organization that somebody could be a member of.

I haven't added this, as it seems rather more complicated than anything we 
have so far, but it's not necessarily a bad idea, and if browsers are 
interested in doing this then I'd be happy to add it.


On Thu, 26 Jul 2012, Smylers wrote:
> Ian Hickson writes:
> > > Also, I do not understand why we have credit card types. Is anyone 
> > > willing to have his credit card information saved locally?
> > 
> > Sure, why not?
> 
> I am too, but I can understand why people who share their computer (and 
> user accounts) with others wouldn't want their card numbers saving.

That's a UA configuration issue, presumably. (Similar to saving 
passwords.)


> The relevant part of the spec currently says that for autocomplete=off:
> 
>   the user agent should not remember the control's value, and should not
>   offer past values to the user.
> 
> Could we turn those "should not"s into "may choose not to" or similar, 
> to indicate that there's nothing wrong with browsers offering users such 
> a feature? Or possibly to "must not ... unless the user has specifically 
> configured the user agent to enable remembering sensitive data"?

This text got rewritten; please let me know if you think further changes 
are needed.


> If there is to be an autocomplete type for payment card numbers then I 
> think that the restrictions on saving autocomplete=off values should 
> also apply to them. I suspect sites currently using autocomplete=off for 
> card numbers would be unwilling to switch to autocomplete=cc-number if 
> it meant all users' card numbers would suddenly start being saved.

I haven't seen many sites use autocomplete=off for credit card numbers, 
but I guess I don't fill in those forms very often (I try to limit the 
number of merchants who have that information -- their systems have proved 
far less secure than my own). Anyway, I think this is a UA issue.


> Thinking specifically about payment card input, but more generally than 
> just autocomplete, these features would be useful as a user:
> 
> * When entering a new number, if I type or paste in spaces or hyphens
>   they are stripped from the number submitted to the site.

This seems easy enough for sites to do.
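
For instance, a site could normalise the submitted value on the server 
before validating it. This is only a minimal sketch with a hypothetical 
helper name; a real site might also want to handle other separators.

```python
def strip_card_separators(raw: str) -> str:
    """Remove the spaces and hyphens a user may have typed or pasted."""
    return raw.replace(" ", "").replace("-", "")
```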


> * If the number doesn't pass the Luhn check digit algorithm, treat the
>   field as invalid and refuse to submit the form until I've fixed it.

That would be unfortunate for, e.g., China UnionPay customers...
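
For reference, the check being proposed is the Luhn algorithm, sketched 
below with a hypothetical function name. A user agent that enforced it 
would refuse any number whose issuer does not guarantee a valid check 
digit, which is the concern here.

```python
def passes_luhn(number: str) -> bool:
    """Return True if a string of ASCII digits satisfies the Luhn check."""
    if not number or not (number.isascii() and number.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit and
    # subtracting 9 whenever the doubled value has two digits.
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```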


> * For my browser to have multiple sets of card details stored, which I
>   can pick from.

That's a UA UI issue. The spec doesn't preclude it.


> * For the browser only to fill in stored card details of types that are
>   accepted. For example:
> 
>   » I prefer to pay with my credit card, but some sites only accept
>     debit cards. So I'd like to have my credit card details stored and
>     used on most sites, but the debit card details also stored and used
>     on those that don't accept credit cards.
> 
>   » Some people prefer, say, Amex, but have a Mastercard they use on
>     sites that don't accept Amex.
> 
>   » The 2012 Olympics box office only accepts Visa. It's pointless the
>     browser filling in the details of any other brand of card.

These problems are rather complicated, so I haven't gone there yet. If 
these are things that browsers are interested in solving, let's talk once 
the existing stuff is implemented. :-)


> * If a card's CSC is stored, for the browser to fill it in when making
>   a repeat transaction on a site that stores my card number but prompts
>   for the CSC each time. I think the East Coast trains site in the UK
>   does that, and Amazon if shipping to a new address.
> 
>   For this to work for a user who has multiple card numbers stored in
>   her browser, the site needs to indicate not just that the text box is
>   for a CSC, but which card it is for. This is typically displayed to
>   the user as a card number with most of the digits replaced by Xs, and
>   sometimes with the card type as well; a way of specifying that in
>   mark-up would enable a browser to pick the appropriate card.

Is this common enough that it's worth worrying about? (It somewhat defeats 
the point of asking for the CSC in those cases... are we sure sites would 
bother to help the user agent out here?)


> * To fill in 3D Secure password characters, for the payment card being
>   used.
> 
>   Unlike the other card payment fields, a stored 3D Secure password
>   (Verified by Visa and similar) only needs to be sent back to one site,
>   that of the card issuer, not to every site taking payment. However, if
>   multiple cards are stored by the browser (say a debit and credit card
>   from the same issuer) then the correct password needs to be picked --
>   the one that goes with the card number submitted a page or two back --
>   which requires the browser knowing this is a 3D Secure password field,
>   (not just a normal site-specific password field it can remember with
>   its usual password manager).
> 
>   At least some variants of 3D Secure only ask for certain characters of
>   the password each time. For this to work with a password manager would
>   require the fields to be labelled with which character is being
>   requested.

Is this not just a regular password challenge/response situation?


> * To work when a part of a card form is served from a different iframe.
> 
>   To be PCI compliant, many retailers don't want card numbers and CSCs
>   to be submitted to their site, but to a third party whose systems are
>   certified as meeting certain standards.
>   
>   Sometimes all payment details go to the third party, but there are
>   third-party providers who serve an iframe for embedding in the
>   retailer's form. The iframe has <input> elements for the card number
>   and CSC; all other fields, including the cardholder's name and the
>   card's expiry date, are directly in the retailer's page and submitted
>   to them. It's presented as a single form to users. Presumably
>   JavaScript is required to submit both forms simultaneously.
> 
>   An example of such a service:
>   http://www.hostedpci.com/solutions/checkout_express
> 
>   You can see this for yourself by going to the Modnique.com site, as
>   seen in the screenshot on the above page, and pretending to buy
>   something. When you get to the card details page, view the source or
>   right-click on the card number part to see that it's an iframe.
> 
>   Since it isn't apparent to the customer that there are two separate
>   forms here, instructing the browser to fill in my card details should
>   also complete the fields in the iframe.
> 
>   However ... if I've instructed the browser to fill in card details on
>   a site, I wouldn't want it also passing them to a random 'advert' that
>   also happens to be on the page, served from an iframe. A malicious
>   advert could include form fields (possibly obscured by something
>   else), hoping the user will fill them in on the parent page.

I've left this well alone for now, given the risk. Once we're confident 
that autocomplete is solidly implemented, let's reexamine this problem. :-)


> >  - credit card details (and subfields such as "name", "exp" etc)
> 
> The pedant in me dislikes the term 'credit card' (and hence the 
> abbreviation 'cc') to refer to something that includes debit cards. It's 
> a particularly unfortunate term for a site that only accepts debit cards 
> and not credit cards.

The spec defines these as "payment instrument" name, etc. The fields are 
still cc-* for brevity.


> Or optionally just set the control to the value, without offering, like 
> password managers typically do? Or is that too risky, because a site 
> could have JavaScript which automatically submits a form after a short 
> delay, to see if the browser has filled in any details for the user?

I haven't discussed this in the spec (not even in a "security" section, 
which I'd like to add -- if anyone has any specific suggestions on things 
that I should call out in such a section, let me know), but yes, that's 
both a UA choice regarding the UI to use, and a security risk (which has 
had real-world implications in deployed autofill solutions in the past).

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

