[whatwg] Supporting more address levels in autocomplete

Ian Hickson ian at hixie.ch
Mon Mar 3 14:18:38 PST 2014

On Mon, 3 Mar 2014, Evan Stade wrote:
> I'm still confused. The site author has entered bad markup. Is your 
> concern that site authors will be unable to write good markup?

Some will write good markup, I'm sure.

Our job as language designers is to maximise the number of authors doing a 
good job, and minimise the number of authors who make unintentional 
mistakes.

> > There's no point us allowing address-level881. It will never be 
> > useful.
> Is there a point in disallowing it?

Yeah. It simplifies the language, means there's less to test so it 
simplifies testing, it simplifies authoring, it reduces tutorial 
complexity, it makes questions like "how many should I include" easy to 
answer, and so on.

> Ultimately it doesn't matter too much, but I would think it's a goal to 
> avoid spec churn.

Adding features isn't such a big deal, especially when they're in response 
to changing political conditions.

> If we're going to set some limit, let's say 4.


> > Well if for some reason you want to exclude non-US customers, sure. 
> > But suppose you do want to include all customers, but you're a 
> > mom-and-pop store who is just going to put what you put in the form 
> > onto the envelope, and who doesn't know the intricacies of each 
> > country's postal standards.
> >
> > How many fields should you list?
> In this case, address-level-n doesn't help you. In order to be able to 
> write an address onto an envelope, you want an address blob, not 
> tokenized bits. This address blob was proposed further up the thread, 
> and I think it's a good idea, but distinct from the current topic, which 
> is how to get tokenized bits for places like China.
> Of course, tokenized bits can be used to create an address blob, but it 
> requires some sophistication to do so.

If you take the fields from the spec today and those proposed in this 
thread, and concatenate them one-to-a-line in the following order:


...the mail is going to get where you want it to get, right?

So for the mom-and-pop store, this seems like it would be sufficient.

Even if they render it as:

   "address-level4", "address-level3" "postal-code"
   "address-level2" "address-level1"

...so that it's optimised for the US, it would still work everywhere, 
you'd just have some slightly annoyed postal staff in some countries.

So I don't think it's right to say that address-level* doesn't help you 
in the mom-and-pop store case. It does.
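
To make the concatenation concrete, here is a minimal sketch in Python. 
The order in FIELD_ORDER is an assumption made for illustration (street 
lines first, then the address levels from 4 down to 1, then postal code 
and country); the function name and the sample values are hypothetical.

```python
# Assumed order, for illustration only: street lines first, then the
# address levels from most specific (4) to least specific (1).
FIELD_ORDER = [
    "address-line1", "address-line2", "address-line3",
    "address-level4", "address-level3", "address-level2", "address-level1",
    "postal-code", "country-name",
]


def address_blob(fields):
    """Join the non-empty fields, one per line, in FIELD_ORDER."""
    lines = []
    for name in FIELD_ORDER:
        value = (fields.get(name) or "").strip()
        if value:
            lines.append(value)
    return "\n".join(lines)


# Made-up sample data; fields the customer left blank are simply skipped.
example = {
    "address-line1": "1 Example Street",
    "address-level2": "Springfield",
    "address-level1": "Oregon",
    "postal-code": "00000",
    "country-name": "United States",
}
print(address_blob(example))
```

Skipping empty fields means the same code copes with a two-level US 
address and a three-level or four-level address elsewhere.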

> I don't think you can just write a stack of inputs that accepts input 
> for any country. The country determines:
> a) what fields make sense
> b) what fields are required
> c) the order of fields
> You could ignore (a) and settle for a crappy UI that shows all fields 
> that make sense anywhere in the world, but you'd still be left with 
> solving (b) and (c).

(b) is an easy-to-solve problem: you don't make any of them required, and 
if the customer entered insufficient fields, they're not getting their 
package, and will have to be contacted out-of-band.

Can you elaborate on (c)?

If this is something that's required to make use of these autofill 
fields, then we should explain to authors what they need to do.

> > Alternatively, if "region" is always the last address-level* value, then
> > we could just do the mapping backwards:
> >
> >    address-line1
> >    address-line2
> >    address-line3
> >    address-levelN
> >    ...
> >    address-level3
> >    address-level2 = locality
> >    address-level1 = region
> This isn't backwards, this is what we're proposing.

Then why would UAE be missing address-level1? I'm confused.

The reason I say this is backwards is that it is the reverse of the 
"address-line*" fields. This could be confusing.

One question is whether the current "locality", which is defined as 
"City, town, village, post town, or other locality within which the 
relevant street address is found", should map to 4 or 2. If it maps to 2, 
we'll probably have to change the way we define this to be more generic.
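
The mapping in question can be written down as a small lookup. This is 
only a sketch: it assumes "locality" maps to address-level2 and "region" 
to address-level1, as in the quoted list, which is exactly the point 
still being discussed. The function names are made up for illustration.

```python
# Assumed aliases, taken from the list quoted above. Whether "locality"
# should really be level 2 is an open question in this thread.
LEGACY_TO_LEVEL = {
    "region": "address-level1",
    "locality": "address-level2",
}

PREFIX = "address-level"


def normalise_token(token):
    """Return the address-level* name for a legacy token, else the token."""
    return LEGACY_TO_LEVEL.get(token, token)


def levels_outermost_first(tokens):
    """Keep only address-level* tokens, sorted so level 1 comes first."""
    levels = [normalise_token(t) for t in tokens]
    levels = [t for t in levels if t.startswith(PREFIX)]
    return sorted(levels, key=lambda t: int(t[len(PREFIX):]))


print(levels_outermost_first(["address-level3", "locality", "region"]))
```

If "locality" ended up mapping to a different level, only the lookup 
table would change.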

> > But maybe we can do better, and just have dedicated names. What 
> > countries need more than two, today? How many do they each need? What 
> > are they? If we had hard data here it might be easier to design a 
> > better solution; do you happen to have that data?
> At least Korea, China, and Thailand need the third level. I think China
> will need a 4th soon. Here's a rundown for Chinese administrative levels:
> http://en.wikipedia.org/wiki/Administrative_divisions_of_China
> The three that make it onto the envelope currently are:
> "Provincial level"
> "Prefectural level"
> "County level"
> You can click through on the wikipedia link for explanations of the 
> various forms these levels take.
> I don't think dedicated names are advisable given the wide variety of 
> names for each address level (even within a single country, much less 
> across all countries). For example, "region" is already super generic 
> and unhelpful.

Being generic is kind of the point, since as you point out, different 
countries have different levels.

> Is there a name for these fields that you think would be less confusing 
> to the authors?

It sounds like we could have country-name, region, locality, province, but 
I agree that at the end of the day it's just confusing to have four words 
that are so vague that you can't tell what order they go in.

Still, having 1,2,3,4,3,2,1 is kinda weird.

Here are some dumb ideas. We could extend "address-line", as follows:

   "address-line1" |
   "address-line2" |- "street-address"
   "address-line3" |
   "address-line5" / "locality"
   "address-line6" / "region"
   "address-line7" / "country-name"

This leaves one unused number in the middle (4), in case we need to add to 
the street address side or the locality side.

Or we could do:

   "address-line1" |
   "address-line2" |- "street-address"
   "address-line3" |

...or, similar, but extending region instead of locality:

   "address-line1" |
   "address-line2" |- "street-address"
   "address-line3" |

We could make "region" into a multi-line field like "street-address":

   "address-line1" |
   "address-line2" |- "street-address"
   "address-line3" |
   "region-line1" |
   "region-line2" |- "region"
   "region-line3" |

Or alternatively:

   "address-line1" |
   "address-line2" |- "street-address"
   "address-line3" |
   "region-level3" / "locality"
   "region-level2" / "region"
   "region-level1" / "country-name"

Compared to those, the main proposal here doesn't seem that much better:

   "address-line1" |
   "address-line2" |- "street-address"
   "address-line3" |
   "address-level2" / "locality"
   "address-level1" / "region"

I dunno. Anyone else want to try to pick a colour for this bikeshed?

> > > > Are we going to have a list in the spec giving how many levels 
> > > > should be given for each country?
> > >
> > > No. That is up to the site's ability to handle the data. For 
> > > example, if I'm soliciting *just* US addresses, I wouldn't know what 
> > > to do with address-level3, hence I won't ask for it.
> >
> > Ok. What do you do if you're soliciting addresses from any country?
> I put all the fields my database or payments backend or w/e can handle. 
> If there's no column for address-level-4 in my database, I don't put a 
> field for address-level-4 in my webpage.
> Then I hide them all and invoke requestAutocomplete. Or I write 
> complicated JS to manipulate my markup to show the user what they expect 
> to see based on which country they're entering info for (hide the fields 
> that don't make sense, mark "required" for the ones that are necessary, 
> etc.)

requestAutocomplete() is a proprietary Chrome thing right now, so we 
shouldn't be recommending that people use it. (I'd love for other browsers 
to pick it up, since I agree that it makes things like this WAY better. 
But that's academic until they do.)

Similarly, I think requiring "complicated JS" is too high a barrier for 
many authors, at least if we don't give explicit advice as to what this JS 
should do.

Hence the question, what should authors do if they're soliciting addresses 
from any country, if we don't tell them what this "complicated JS" is to do?
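
For what it's worth, the simplest version of that logic is a table 
lookup. The sketch below is in Python for brevity (a page would do the 
equivalent in script) and assumes a hypothetical per-country table of how 
many address levels to show. The numbers only restate claims made in this 
thread (two levels for the US, three for China, Korea, and Thailand); 
they are not vetted data, and showing all four levels for an unknown 
country is an arbitrary choice.

```python
# Hypothetical table: number of address levels to show per country code.
# Values restate claims from this thread and are not authoritative.
LEVELS_BY_COUNTRY = {"US": 2, "CN": 3, "KR": 3, "TH": 3}

MAX_LEVELS = 4  # assumed upper bound, as floated in the thread
ALL_LEVELS = ["address-level%d" % n for n in range(1, MAX_LEVELS + 1)]


def visible_level_fields(country_code):
    """Return the address-level* tokens to show, level 1 first."""
    count = LEVELS_BY_COUNTRY.get(country_code.upper(), MAX_LEVELS)
    return ALL_LEVELS[:count]


def hidden_level_fields(country_code):
    """Return the address-level* tokens to hide for this country."""
    shown = set(visible_level_fields(country_code))
    return [name for name in ALL_LEVELS if name not in shown]


print(visible_level_fields("US"))
print(hidden_level_fields("CN"))
```

This covers only which fields to show. Which of them are required, and 
in what order they should appear, would need data this sketch does not 
have.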

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
