[whatwg] Can the maximum allowed value length be changed to restrict the number of characters?
Jukka K. Korpela
jkorpela at cs.tut.fi
Tue Aug 20 06:49:16 PDT 2013
2013-08-20 2:40, Ryosuke Niwa wrote:
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#maximum-allowed-value-length
>>
>> Why is the maxlength attribute of the input element specified to
>> restrict the length of the value by the code-unit length?
Apparently because in the DOM, "character" effectively means "code
unit". In particular, the .value.length property gives the length in
code units.
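
To make the distinction concrete, here is a minimal sketch in plain JavaScript. It uses a bare string in place of a real control's .value, and the variable names are only illustrative:

```javascript
// U+10400 (𐐀) lies outside the BMP, so UTF-16 stores it as a surrogate pair.
const deseret = "\u{10400}";

// .length counts UTF-16 code units, which is what the DOM exposes.
const codeUnits = deseret.length;

// The string iterator walks by code point, so spreading counts code points.
const codePoints = [...deseret].length;

console.log(codeUnits, codePoints); // 2 1
```
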
>> This is counter intuitive for users and authors who typically
>> intend to restrict the length by the number of composed character
>> sequences.
That is true. We should not expect end users to know whether a character
they enter occupies one code unit or two, i.e. whether it is a BMP
character or not. Then again, I don't expect most users to enter non-BMP
characters, though this might be changing as e.g. emoticons become more
popular.
>> In fact, this is the current shipping behavior of
>> Safari and Chrome.
And IE, but not Firefox. Here's a simple test:
<input maxlength=2 value="𐐀">
On Firefox, you cannot add a character to the value, since the length is
already 2. On Chrome and IE, you can add even a second non-BMP
character, even though the length then becomes 4. I don't see this as
particularly logical, though I'm looking at this from the programming
point of view, not the end user's view.
>> Can the specification be changed to use the number of composed
>> character sequences instead of the code-unit length?
In contexts where you want to set maxlength in the first place, your
reasons might well be related to limitations that apply to the code unit
length. It's a different thing if the intent is to limit the number of
visible characters.
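
The three ways of counting can give three different answers. The sketch below is only an illustration under two assumptions: that "composed character sequences" can be approximated by grapheme clusters, and that Intl.Segmenter is available (it is a later addition to the ECMAScript Internationalization API, not something browsers offered when this was written). The variable names are hypothetical.

```javascript
// "e" followed by U+0301 COMBINING ACUTE ACCENT is displayed as a single "é".
const composed = "e\u0301";

const unitCount = composed.length;       // UTF-16 code units
const pointCount = [...composed].length; // code points

// Assumption: grapheme clusters stand in for "composed character sequences".
const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
const clusterCount = [...segmenter.segment(composed)].length;

console.log(unitCount, pointCount, clusterCount); // 2 2 1
```

A limit meant to protect a fixed-size storage field would care about the first number (or a byte count), while a limit on what the user sees would care about the last.
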
Interestingly, an attempt like
<input pattern=.{0,42}>
to limit the number of *characters* to at most 42 seems to fail.
(Browsers won't prevent the user from typing more, but the control
starts matching the :invalid selector if you enter characters that
correspond to more than 42 code units.) The reason is apparently that
"." means "any character" in the sense "any code unit", counting a
non-BMP character as two.
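
The same counting can be reproduced in script. This is a sketch only: it tests a bare string rather than a form control, adds the ^ and $ anchors by hand, and the second test relies on the regular expression u flag, which was added to JavaScript later (ECMAScript 2015).

```javascript
// One non-BMP character, stored as two UTF-16 code units.
const sample = "\u{10400}";

// Without the u flag, "." matches a single code unit, so the surrogate
// pair needs two matches and a limit of one is exceeded.
const perCodeUnit = /^.{0,1}$/.test(sample);

// With the u flag, "." matches a whole code point, so one match suffices.
const perCodePoint = /^.{0,1}$/u.test(sample);

console.log(perCodeUnit, perCodePoint); // false true
```
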
> Also,
> http://www.whatwg.org/specs/web-apps/current-work/multipage/common-input-element-attributes.html#the-maxlength-attribute
> says "if the input element has a maximum allowed value length, then
> the code-unit length of the value of the element's value attribute
> must be equal to or less than the element's maximum allowed value
> length."
>
> This doesn't seem to match the behaviors of existing Web browsers or
> http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#maximum-allowed-value-length
> unless I'm misreading something. Namely, the value attribute set in
> the markup or by script isn't automatically truncated at the
> element's maximum allowed value length.
There seems to be a conflict here indeed. It is different from the
character vs. code unit issue, however.
Definitions in 4.10.21.1 clearly imply that the length of the value of a
control may exceed the limit set by maxlength. The "Constraints" part
deals with the question of what happens then (in form submission).
Yucca