[whatwg] textarea newline format - raw value vs. transformed value and setSelectionRange
Ian Hickson
ian at hixie.ch
Tue Jan 4 16:38:17 PST 2011
On Sun, 10 Oct 2010, Michael A. Puls II wrote:
>
> Consider the following [simplified]:
>
> <!DOCTYPE html>
> <title></title>
> <script>
> window.addEventListener("DOMContentLoaded", function() {
> var ta = document.getElementsByTagName("textarea")[0];
> ta.value = ta.value.replace(/\r|\n/g, encodeURIComponent);
> }, false);
> </script>
> <textarea rows="3">Line 1
> Line 2
> Line 3</textarea>
>
> The behavior between Firefox 4 latest trunk and Opera 10.70 latest
> snapshot is different because they're using different newline formats.
The correct behaviour is that the element's value becomes the following,
since the raw value contains only U+000A newlines after parsing:
"Line 1%0ALine 2%0ALine 3"
> See step 1 at
> <http://www.whatwg.org/specs/web-apps/current-work/multipage/the-button-element.html#attr-textarea-wrap-hard-state>.
>
> That says that the 'value' getter returns the raw value + newlines normalized
> to "\r\n".
No, it says that the submission value has that transformation applied. The
'.value' getter returns the _raw_ value, which doesn't have U+000Ds added
by the user agent (they can only be there if the script added them).
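As a rough illustration of the difference, here is a minimal sketch in plain JavaScript. The helper name toSubmissionNewlines is hypothetical (it is not a DOM API), and it models only the newline normalisation, not the line breaks that wrap=hard adds:

```javascript
// Hypothetical helper (not a DOM API): models the newline normalisation
// applied to the value used for form submission, where every newline
// becomes CRLF. Line breaks added by wrap=hard are not modelled.
function toSubmissionNewlines(rawValue) {
  // Match CRLF first so that an existing pair is not doubled.
  return rawValue.replace(/\r\n|\r|\n/g, "\r\n");
}

// Stand-in for what the .value getter returns: the raw value, LF only.
const rawExample = "Line 1\nLine 2";
const submittedExample = toSubmissionNewlines(rawExample);

console.log(JSON.stringify(rawExample));       // "Line 1\nLine 2"
console.log(JSON.stringify(submittedExample)); // "Line 1\r\nLine 2"
```

The raw value itself is left untouched; only the copy prepared for submission gains the U+000D characters.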
> I always thought that meant that the raw value (what was parsed into the
> DOM)
The "raw value" is what the user edits.
> contained newlines normalized to "\r\n" too for textareas and that a
> browser with an HTML5 parser like Firefox would automatically show
> newlines normalized to "\r\n" without even having a conversion done
> (internally) for the 'value' getter.
No, the HTML parser strips U+000D characters ("\r").
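That is what determines the result of the test case at the top. A small sketch in plain JavaScript, where the string literals are stand-ins for ta.value (assumptions for illustration, not output captured from any browser):

```javascript
// Stand-in for ta.value after parsing: newlines are LF only (assumption).
const parsedRawValue = "Line 1\nLine 2\nLine 3";

// Same replacement as in the test case. encodeURIComponent receives
// each matched character as its first argument and ignores the rest.
const encodedFromLf = parsedRawValue.replace(/\r|\n/g, encodeURIComponent);

// For contrast: what the same replacement produces if the getter
// exposed newlines normalised to CRLF.
const encodedFromCrLf = "Line 1\r\nLine 2\r\nLine 3"
  .replace(/\r|\n/g, encodeURIComponent);

console.log(encodedFromLf);   // Line 1%0ALine 2%0ALine 3
console.log(encodedFromCrLf); // Line 1%0D%0ALine 2%0D%0ALine 3
```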
> I'm also not sure step 1 applies to the 'value' setter. I can't tell for
> sure. It looks like not, but not sure.
It doesn't apply to .value at all, only to the 'value' concept, which is a
concept used in form submission and constraint validation.
> Also, I'm not sure if setSelectionRange() should operate on the raw
> value, or the transformed value in step 1.
Raw value, because <textarea> is defined as an element that "represents a
multiline plain text edit control for the element's raw value".
> Opera's not using an HTML5 parser yet, so I can't check what it might
> do, but could this be clarified?
It's not clear to me what isn't clear. :-) Could you elaborate on what the
spec says that led you to your interpretation?
> In
> <http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#preprocessing-the-input-stream>
> it says:
>
> "U+000D CARRIAGE RETURN (CR) characters and U+000A LINE FEED (LF)
> characters are treated specially. Any CR characters that are followed by
> LF characters must be removed, and any CR characters not followed by LF
> characters must be converted to LF characters. Thus, newlines in HTML
> DOMs are represented by LF characters, and there are never any CR
> characters in the input to the tokenization stage."
>
> Does that mean that the raw value of the parsed textarea should only
> ever have '\n' for newlines (unless the 'value' setter is used in JS to
> introduce '\r' characters)?
Yes.
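The quoted preprocessing rule can be sketched as a single replacement. The function name preprocessNewlines is hypothetical, and this models only the CR/LF handling, not the rest of the input stream preprocessing:

```javascript
// Hypothetical helper modelling the quoted rule: a CR followed by LF is
// removed (leaving the LF), and any other CR is converted to LF.
function preprocessNewlines(input) {
  // "\r\n?" matches a CR together with an optional LF right after it,
  // so both cases collapse to a single LF.
  return input.replace(/\r\n?/g, "\n");
}

const sourceText = "Line 1\r\nLine 2\rLine 3\nLine 4";
const tokenizerInput = preprocessNewlines(sourceText);

console.log(JSON.stringify(tokenizerInput)); // "Line 1\nLine 2\nLine 3\nLine 4"
console.log(tokenizerInput.includes("\r"));  // false
```

After this step no CR is left, so a CR can only reach the raw value through script.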
> If so, does that mean that setSelectionRange() should operate on the
> raw, internal value (that just has '\n' for newlines in it normally),
> but the 'value' getter still returns the transformed value with newlines
> normalized to "\r\n"?
The value getter doesn't return the transformed value. See the definition
of the value getter for details.
> I see
> <http://www.whatwg.org/specs/web-apps/current-work/multipage/editing.html#dom-textarea/input-setselectionrange>,
> but it doesn't mention this.
I've clarified the spec to indicate that setSelectionRange() and company
operate on the raw value.
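To show why the distinction matters for offsets, here is a minimal sketch in plain JavaScript. The strings are stand-ins for the raw and CRLF-transformed values, and the setSelectionRange() call is shown only in a comment because it needs a real textarea:

```javascript
// Stand-ins for the two forms of the same text (assumptions for
// illustration): the raw value with LF, and the CRLF-transformed form.
const rawText = "Line 1\nLine 2\nLine 3";
const crlfText = rawText.replace(/\n/g, "\r\n");

// Offset of "Line 2" in each form. Every preceding newline shifts the
// CRLF offset by one extra character.
const rawStart = rawText.indexOf("Line 2");
const crlfStart = crlfText.indexOf("Line 2");

console.log(rawStart);  // 7
console.log(crlfStart); // 8

// In a page, the raw offsets are the ones to pass on, for example:
//   ta.setSelectionRange(rawStart, rawStart + "Line 2".length);
```

A script that computes offsets against a CRLF-normalised copy of the text would select one character too far for each newline before the selection.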
Cheers,
--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.