[whatwg] textarea newline format - raw value vs. transformed value and setSelectionRange
Michael A. Puls II
shadow2531 at gmail.com
Sun Oct 10 20:44:02 PDT 2010
Consider the following:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title></title>
<script>
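window.addEventListener("DOMContentLoaded", function() {
var ta = document.getElementsByTagName("textarea")[0];
// Show the value with every CR and LF percent-encoded (%0D, %0A).
alert(ta.value.replace(/\r|\n/g, encodeURIComponent));
ta.focus();
// Collapse the selection to a caret at offset 8.
ta.setSelectionRange(8, 8);
}, false);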
</script>
</head>
<body>
<textarea rows="3">Line 1
Line 2
Line 3</textarea>
</body>
</html>
The behavior differs between the latest Firefox 4 trunk and the latest
Opera 10.70 snapshot because they use different newline formats for the
textarea value. Firefox uses "\n" while Opera uses "\r\n", so offset 8
places the cursor at a different position in each browser.
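To make the difference concrete, here is a minimal sketch that runs outside the browser. It assumes the two values are exactly the test case's text with LF and CRLF newlines respectively; the helper name caretContext is made up for illustration and is not part of any API.

```javascript
// Hypothetical helper: split a string at a caret offset so the position is visible.
function caretContext(value, offset) {
  return { before: value.slice(0, offset), after: value.slice(offset) };
}

const lfValue = "Line 1\nLine 2\nLine 3";       // newlines as LF (what Firefox reports)
const crlfValue = "Line 1\r\nLine 2\r\nLine 3"; // newlines as CRLF (what Opera reports)

// With LF, offset 8 falls after the "L" of "Line 2".
console.log(JSON.stringify(caretContext(lfValue, 8)));
// {"before":"Line 1\nL","after":"ine 2\nLine 3"}

// With CRLF, offset 8 falls at the very start of "Line 2".
console.log(JSON.stringify(caretContext(crlfValue, 8)));
// {"before":"Line 1\r\n","after":"Line 2\r\nLine 3"}
```

So the same setSelectionRange(8, 8) call ends up one character apart, depending on which newline format the offsets are counted against.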
See step 1 at
<http://www.whatwg.org/specs/web-apps/current-work/multipage/the-button-element.html#attr-textarea-wrap-hard-state>.
That step says that the 'value' getter returns the raw value with newlines
normalized to "\r\n". I always assumed this meant that the raw value (what
was parsed into the DOM) already contained newlines normalized to "\r\n"
for textareas, and that a browser with an HTML5 parser, like Firefox, would
expose "\r\n" newlines without needing any internal conversion in the
'value' getter. Now I'm not so sure.
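As I read step 1, the transformation would look roughly like the sketch below. This is my interpretation, not the spec's wording: the function name toCRLF is hypothetical, and I'm assuming that lone CRs and lone LFs each become a CRLF pair while existing CRLF pairs are left alone.

```javascript
// Hypothetical sketch of the getter-side normalization: every newline
// (CRLF, lone CR, or lone LF) comes out as a single CRLF pair.
function toCRLF(rawValue) {
  // Match "\r\n" first so an existing pair is not doubled.
  return rawValue.replace(/\r\n|\r|\n/g, "\r\n");
}

const raw = "Line 1\nLine 2\nLine 3";
const transformed = toCRLF(raw);

console.log(raw.length);         // 20
console.log(transformed.length); // 22
```

The two extra characters are why offsets into the raw value and offsets into the transformed value stop agreeing after the first newline.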
I also can't tell whether step 1 applies to the 'value' setter. It looks
like it doesn't, but I can't tell for sure.
Also, does everyone agree with step 1?
Also, I'm not sure if setSelectionRange() should operate on the raw value,
or the transformed value in step 1.
Opera's not using an HTML5 parser yet, so I can't check what it might do,
but could this be clarified?
In
<http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#preprocessing-the-input-stream>
it says:
"U+000D CARRIAGE RETURN (CR) characters and U+000A LINE FEED (LF)
characters are treated specially. Any CR characters that are followed by
LF characters must be removed, and any CR characters not followed by LF
characters must be converted to LF characters. Thus, newlines in HTML DOMs
are represented by LF characters, and there are never any CR characters in
the input to the tokenization stage."
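Taken literally, the quoted rule amounts to something like the following sketch. The function name preprocessNewlines is made up here, and it models only the CR/LF handling, not the rest of the input-stream preprocessing.

```javascript
// Hypothetical sketch of the quoted rule: a CR followed by LF is dropped
// (leaving the LF), and a CR not followed by LF is converted to LF.
function preprocessNewlines(input) {
  // /\r\n?/ matches either a CRLF pair or a lone CR; both become one LF.
  return input.replace(/\r\n?/g, "\n");
}

const source = "Line 1\r\nLine 2\rLine 3";
console.log(JSON.stringify(preprocessNewlines(source)));
// "Line 1\nLine 2\nLine 3"
```

Under this reading, no CR characters survive into the parsed textarea contents, whatever newline format the source file used.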
Does that mean that the raw value of the parsed textarea should only ever
have '\n' for newlines (unless the 'value' setter is used in JS to
introduce '\r' characters)?
If so, does that mean that setSelectionRange() should operate on the raw,
internal value (that just has '\n' for newlines in it normally), but the
'value' getter still returns the transformed value with newlines
normalized to "\r\n"?
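If that is the intended model, scripts would have to translate offsets between the two representations. Below is a minimal sketch of such a translation, assuming the raw value contains only LF newlines and the getter turns each LF into CRLF. Both function names are hypothetical, and snapping an offset that falls between a CR and its LF to just after the newline is my own choice, not something the spec says.

```javascript
// Hypothetical: map an offset in the raw (LF) value to the CRLF-transformed value.
// Each LF before the offset adds one extra character (the inserted CR).
function rawToTransformedOffset(raw, rawOffset) {
  const newlinesBefore = (raw.slice(0, rawOffset).match(/\n/g) || []).length;
  return rawOffset + newlinesBefore;
}

// Hypothetical: map an offset in the CRLF-transformed value back to the raw (LF) value.
// An offset between a CR and its LF snaps to just after the newline.
function transformedToRawOffset(raw, transformedOffset) {
  let t = 0; // position reached so far in the transformed value
  for (let i = 0; i < raw.length; i++) {
    if (t >= transformedOffset) return i;
    t += raw[i] === "\n" ? 2 : 1; // an LF occupies two characters once transformed
  }
  return raw.length;
}

const raw = "Line 1\nLine 2\nLine 3";
console.log(rawToTransformedOffset(raw, 8)); // 9
console.log(transformedToRawOffset(raw, 8)); // 7
```

In other words, offset 8 in the value returned by the getter would correspond to offset 7 in the raw value, which is exactly the one-character discrepancy seen in the test case.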
I see
<http://www.whatwg.org/specs/web-apps/current-work/multipage/editing.html#dom-textarea/input-setselectionrange>,
but it doesn't mention this.
--
Michael