[whatwg] Comments on the definition of a valid e-mail address

Aryeh Gregor Simetrical+w3c at gmail.com
Sun Aug 23 13:25:06 PDT 2009


On Sun, Aug 23, 2009 at 4:00 PM, Tab Atkins Jr.<jackalmage at gmail.com> wrote:
> Unless you avoid validating *entirely*, there's virtually always going
> to be some subset of theoretically valid addresses that you'll flag as
> invalid, though.

There shouldn't be, IMO, if the browser is forbidden to submit them.

> Unlike type=tel, emails have a relatively simple format which *very
> nearly everyone* uses.  I agree that if an email works but is one of
> those crazy formats it's probably not a good idea to bar them from
> using it, but in practice that's exactly what happens right now with
> email validation scripts.  If type=email doesn't validate at all
> people will still just continue to use their broken homebrew
> validators both on client-side and server-side.

They'll probably do that anyway.  HTML 5 doesn't have to mandate it.

> Would you mind sharing these 200 or so that don't validate?  Obviously
> there are privacy concerns, but I think it would be sufficient to just
> replace every alpha character with 'x' and every numeric with '0', or
> some similar information-removing transformation.  None of them fail
> validation because of the letters or numbers used, so that would still
> give us the information we need without revealing stuff we don't.

I doubt it would be useful.  I summarized all the interesting points,
and remember that these are only 0.007% of the total.  Also, note that
3255 of them didn't validate; 202 is the number that still didn't
validate after the regex was adjusted to allow whitespace everywhere
(which should be equivalent to stripping 0x9, 0xA, 0x20 from the email
input).
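
To make the whitespace adjustment concrete, here is a minimal sketch of
"strip 0x9, 0xA, 0x20, then validate".  The names strip_whitespace,
validates_after_stripping and SIMPLE_EMAIL are hypothetical, and the
pattern is a deliberately simple stand-in, not the regex from the spec
draft or the one used to test the address list.

```python
import re

# Characters removed before validation: tab (0x9), line feed (0xA), space (0x20).
STRIP_TABLE = {0x09: None, 0x0A: None, 0x20: None}

# Assumption: a simplified illustrative pattern, NOT the regex discussed in the thread.
SIMPLE_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*")


def strip_whitespace(value: str) -> str:
    """Delete every tab, line feed and space, wherever it appears in the input."""
    return value.translate(STRIP_TABLE)


def validates_after_stripping(value: str) -> bool:
    """Return True if the whole stripped input matches the simplified pattern."""
    return SIMPLE_EMAIL.fullmatch(strip_whitespace(value)) is not None
```

With this sketch, an input such as "user @example.com" followed by a
tab would validate once the whitespace is removed, while something
like "user@@example.com" would still fail for reasons unrelated to
whitespace.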


