[whatwg] Can the maximum allowed value length be changed to restrict the number of characters?
Jukka K. Korpela
jkorpela at cs.tut.fi
Tue Aug 20 07:21:35 PDT 2013
2013-08-20 17:09, Anne van Kesteren wrote:
> On Tue, Aug 20, 2013 at 12:30 AM, Ryosuke Niwa <rniwa at apple.com> wrote:
>> Can the specification be changed to use the number of composed character sequences instead of the code-unit length?
>
> In a way I guess that's nice, but it also seems confusing that given
>
> data:text/html,<input type=text maxlength=1>
>
> pasting in U+0041 U+030A would give a string that's longer than 1 from
> JavaScript's perspective.
Oh, right, this is an issue different from the non-BMP issue I discussed
in my reply. This is even clearer in my opinion, since U+0041 U+030A is
clearly two Unicode characters, not one, even though it is expected to
be rendered as “Å” and even though U+00C5 is canonically equivalent to
U+0041 U+030A.
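
As a rough illustration of the counts involved, here is a minimal JavaScript sketch. The variable names are made up for this example, and it only shows how the two spellings of “Å” measure in code units and code points; it does not claim to show how any browser implements maxlength.

```javascript
// U+0041 LATIN CAPITAL LETTER A followed by U+030A COMBINING RING ABOVE
const decomposed = "\u0041\u030A";
// U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE
const precomposed = "\u00C5";

// String.prototype.length counts UTF-16 code units.
const decomposedLength = decomposed.length;
const precomposedLength = precomposed.length;

// Spreading a string iterates by code point, not by code unit.
const decomposedCodePoints = [...decomposed].length;

// The two strings differ as code unit sequences, but NFC normalization
// maps the decomposed form to the precomposed one.
const rawEqual = decomposed === precomposed;
const equalAfterNFC = decomposed.normalize("NFC") === precomposed;

console.log(decomposedLength, precomposedLength, decomposedCodePoints);
console.log(rawEqual, equalAfterNFC);
```

So the decomposed form has length 2 both by code units and by code points, and it matches the single-character form only after normalization.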
> I don't think there's any place in the
> platform where we measure string length other than by number of code
> units at the moment.
Besides, if “character” means something other than a Unicode character
(a Unicode code point assigned to a character) or, as a different concept,
a Unicode code unit, then the question arises of what it does mean. For
example, would a letter followed by 42 combining marks still be one
character? (Such monstrosities are actually used, in an attempt to
create “funny” effects.)
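
To make the question concrete, here is a small JavaScript sketch of such a string. The names are hypothetical, and the regular expression is only an approximation of "base character plus combining marks" that I assume for illustration; it is not the full grapheme cluster algorithm of Unicode Standard Annex #29.

```javascript
// A base letter followed by 42 copies of U+030A COMBINING RING ABOVE.
const base = "a";
const marks = "\u030A".repeat(42);
const stacked = base + marks;

// Length in UTF-16 code units, which is what JavaScript reports.
const codeUnits = stacked.length;

// Length in Unicode code points (the same here, since all are in the BMP).
const codePoints = [...stacked].length;

// Approximation (an assumption of this sketch): one non-mark character
// followed by any number of characters in General Category M.
const approxSequences = stacked.match(/\P{M}\p{M}*/gu) || [];
const sequenceCount = approxSequences.length;

console.log(codeUnits, codePoints, sequenceCount);
```

Under this approximation the whole string counts as one sequence, while it is 43 code units and 43 code points, so the three possible definitions of “length” give quite different limits.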
Yucca