[whatwg] Supporting more address levels in autocomplete

Ian Hickson ian at hixie.ch
Fri Feb 28 17:47:06 PST 2014


On Mon, 24 Feb 2014, Jukka K. Korpela wrote:
> 2014-02-22 3:03, Ian Hickson wrote:
> > 
> > (Note that a lot of people in the UK have no idea how to write their 
> > address according to current standards. For example, people often 
> > include the county, give the "real" town rather than the "post town", 
> > put things out of order, indent each line of the address, etc.)
> 
> The phenomenon is probably not limited to the UK. Few people even know 
> the current standards (national and international).

Well sure, but since we're writing a standard, if our assumption is that 
people don't know standards, we're not going to reach a useful conclusion.


> Some fine-grained control for naming different components of an address 
> is undoubtedly useful at times. It would be even more useful to have a 
> common, "standard" name for just an address. That is, whatever someone 
> wants the sender to put in an envelope. Its internal structure does not 
> matter, as long as it works, and as usual, it is up to the recipient to 
> specify the address in a manner that works.
> 
> Forms that require the user to split his address to small pieces may 
> have their reasons. But if you just want to have an address to send 
> stuff to, then all you need is a working postal address. A textarea 
> with, say, name="postal", if used on different pages, would then let the 
> user enter his entire address very simply, after just once typing it.
> 
> Probably "postal" should be specified so that it relates to a postal 
> address that is complete for delivery except the recipient name. The 
> reason is that the name is so often asked separately.

On Mon, 24 Feb 2014, Evan Stade wrote:
>
> I agree with this, and plan to propose it separately from the proposal 
> currently under discussion. It might be hard to parse a working address 
> out of a free-form input, but the other direction is doable enough: 
> creating a block of text suitable for printing on an envelope given 
> tokenized values. This tackles the problem of how to format an 
> autocompleted address for a particular country and UI language (i.e. the 
> user agent has to know how to do it, but the website doesn't).

We can definitely add something like this. We already have a simpler 
version of this for street addresses.
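
As a rough illustration of the "tokens to envelope text" direction, here is a 
minimal Python sketch. The token names, the TEMPLATES table and the 
format_address() helper are all hypothetical; a real user agent would need 
far more complete per-country data than this.

```python
# Hypothetical per-country line templates (illustrative only, not from any spec).
TEMPLATES = {
    "US": ["{street}", "{locality}, {region}", "{country}"],
    "GB": ["{street}", "{locality}", "{country}"],
}


def format_address(tokens, country_code):
    """Build an envelope-style text block from tokenized address values."""
    lines = []
    for template in TEMPLATES[country_code]:
        try:
            lines.append(template.format(**tokens))
        except KeyError:
            # A token needed by this line is missing, so skip the whole line.
            continue
    return "\n".join(lines)


tokens = {
    "street": "1600 Amphitheatre Parkway",
    "locality": "Mountain View",
    "region": "CA",
    "country": "United States",
}
envelope = format_address(tokens, "US")
```

The point of the sketch is only that the formatting knowledge lives in one 
place (here, the table), which is what lets the website stay ignorant of it.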


On Mon, 24 Feb 2014, Charles McCathie Nevile wrote:
> 
> That depends on whether you want to force your customers to think like 
> the Post Office, or whether you prefer to be responsive to your 
> customers. Speaking without data, I suspect that nervousness at not 
> being able to put *what someone thinks* is their address translates 
> fairly readily into a certain amount of failure to proceed with a 
> transaction.

I'd love to see real data on this. I can imagine scenarios in which this 
goes either way.


On Mon, 24 Feb 2014, Dan Brickley wrote:
>
> Who is using the data? Just post offices? Or taxi drivers, pizza 
> delivery bikers, pedestrians?

The latter three are unlikely to really need much more depth at the 
locality level.


On Mon, 24 Feb 2014, Evan Stade wrote:
>
> Regarding UK addresses, libaddressinput[1], which is used by Google for 
> various products, currently accepts two levels of administrative region 
> for GB: city and optional county.

You need two levels, but those aren't it. :-) Counties haven't officially 
been used in UK addresses since the mid-1990s.


> > This would be the first open-ended field name. Do we really want to 
> > make this open-ended? What happens if a form has n=1..3, and another 
> > has n=2..4? What if one has n=1, n=2, and n=4, but not n=3?
> 
> I don't know why a web author would do this

Web authors do all kinds of crazy stuff. We have to be ready for it, so 
that we never end up forced to introduce weird heuristics.


> but n=m doesn't require n=m-1 or n=m+1 to be present. n=2..4 would just 
> mean the site didn't get the n=1 value.

My concern is that authors do something like this:

   <input ... autocomplete="address-line-1">
   <input ... autocomplete="address-level-2">
   <input ... autocomplete="address-level-3">

...and then the user enters their address:

   1600 Amphitheatre Parkway
   Mountain View
   CA

...and then the user goes to another site:

   <input ... autocomplete="address-line-1">
   <input ... autocomplete="address-line-2">
   <input ... autocomplete="address-level-1">
   <input ... autocomplete="address-level-2">
   <input ... autocomplete="address-level-3">

...and the browser autofills:

   1600 Amphitheatre Parkway
   (empty)
   Mountain View
   Mountain View
   CA

...or some such.
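
To make the failure mode concrete, here is a minimal Python sketch of one way 
a user agent could end up producing that result. The naive_autofill() helper 
and its "borrow the nearest stored level" fallback are assumptions made up 
for illustration; no browser is claimed to work this way.

```python
LEVEL_PREFIX = "address-level-"


def naive_autofill(stored, wanted):
    """Fill each wanted field from stored values.

    For an address level that was never stored, borrow the value of the
    numerically nearest stored level (a deliberately bad heuristic).
    """
    stored_levels = {
        int(name[len(LEVEL_PREFIX):]): value
        for name, value in stored.items()
        if name.startswith(LEVEL_PREFIX)
    }
    filled = []
    for name in wanted:
        if name in stored:
            filled.append(stored[name])
        elif name.startswith(LEVEL_PREFIX) and stored_levels:
            n = int(name[len(LEVEL_PREFIX):])
            nearest = min(stored_levels, key=lambda k: abs(k - n))
            filled.append(stored_levels[nearest])
        else:
            filled.append("")  # nothing known for this field
    return filled


# Values captured from the first form (levels 2 and 3 only).
stored = {
    "address-line-1": "1600 Amphitheatre Parkway",
    "address-level-2": "Mountain View",
    "address-level-3": "CA",
}
# Fields requested by the second form (levels 1 to 3).
wanted = [
    "address-line-1",
    "address-line-2",
    "address-level-1",
    "address-level-2",
    "address-level-3",
]
result = naive_autofill(stored, wanted)
```

With these inputs, "Mountain View" is filled in twice, because level 1 was 
never stored and the fallback guesses from level 2.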


> > How does a site know how many levels to offer?
> 
> It offers as many as it knows what to do with. It probably wouldn't know 
> what to do with n=5, or n=100, and it's highly unlikely a user agent 
> would return a value for those levels anyway, so practically speaking, 
> n=1 to n=3 should be sufficient for now (although n=4 seems possible in 
> the near future). But I don't see the purpose in setting a limit in the 
> spec.

This makes me extremely uncomfortable.

We're saying, "we don't know how to do this, I hope you do". Why would we 
be less able to answer this than Web authors? It's not like Web authors 
are experts in postal addresses.

I think we should pick the number that is actually needed, and be firm 
that that is the number.


> > What should a Chinese user interacting with a US company put in as 
> > their address, if they want something shipped to China?
> 
> They would put in the same address regardless of the nationality of the 
> company, assuming the company is able to properly handle their address. 

Shouldn't we want everyone to be able to handle everyone's address?


> Which inputs are visible to the user should depend on which country 
> they're entering. This means that if a user changes the country, the 
> inputs shuffle around and hide or show.

Are we really expecting many sites to do this? I've only seen the most 
advanced sites do this.


> > So they would be synonyms? Or separate fields?
> 
> They are pseudo-synonyms.

I don't know what that means.


> In the US, "region" aligns with "address-level-1", and either one would 
> return the same value. In the UAE, where there are cities but no higher 
> level administrative region, "locality" aligns with "address-level-1". 
> In China, "address-level-1" is a province or a province-level city such as 
> Beijing. Beijing is also "region", confusingly, and a district of the 
> city is a "locality".

If we're going to do this, we need to have a mapping, defined in the spec, 
for every locality. This seems like a losing proposition.

Why not make them straight synonyms?
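
The difference between the two options can be sketched in Python as follows. 
The per-country table contains only the alignments stated in the quoted text; 
the country codes, the function names, and the assumption that "straight 
synonyms" would mean region = address-level-1 and locality = address-level-2 
everywhere are illustrative guesses, not anything the spec says.

```python
# Pseudo-synonyms: the alignment depends on the country, so a table entry
# is needed for every country (only the examples from the thread are here).
PSEUDO_SYNONYMS = {
    "US": {"region": "address-level-1"},
    "AE": {"locality": "address-level-1"},
    "CN": {"region": "address-level-1"},
}

# Straight synonyms (assumed pairing): one fixed alignment, no country table.
STRAIGHT_SYNONYMS = {
    "region": "address-level-1",
    "locality": "address-level-2",
}


def resolve_pseudo(name, country):
    """Return the aligned level for this country, or None if unmapped."""
    return PSEUDO_SYNONYMS.get(country, {}).get(name)


def resolve_straight(name, country):
    """Return the aligned level; the country is deliberately ignored."""
    return STRAIGHT_SYNONYMS.get(name)
```

Under the first scheme, any country missing from the table simply has no 
answer, which is the maintenance burden in question.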


> So generally speaking, if I ship to both China and the US, I would 
> create a form with "address-level-[1..4]" and if the user starts to 
> enter a US address, only show the first 2 levels. If the user starts to 
> enter a Chinese address, show more levels. If using requestAutocomplete, 
> all the inputs are hidden all the time anyway.

Are we going to have a list in the spec giving how many levels should be 
given for each country?

Note that the "country" field is often near the end of the form. How do 
you know which country the user is entering an address for when all the 
user's entered is three lines of text?

(Most Web developers don't have access to a reverse geocoder that can 
guess the answer from the first line.)


On Tue, 25 Feb 2014, Jürg Lehni wrote:
>
> I think it is dangerous to make any kind of assumption about valid 
> postal addresses.
> 
> Here's a great list of all kinds of exceptions to rules that programmers 
> tend to believe to be true:
> 
> (Don't we love rules?)
> 
> http://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/

I didn't see any there that were contradicted by the assumptions in the 
HTML spec; did you have any particular ones in mind?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

