[html5] r6757 - [e] (0) Since I'm going to be editing this algorithm some more, let's bite the b [...]
whatwg at whatwg.org
Tue Oct 25 15:59:21 PDT 2011
Author: ianh
Date: 2011-10-25 15:59:20 -0700 (Tue, 25 Oct 2011)
New Revision: 6757
Modified:
complete.html
index
source
Log:
[e] (0) Since I'm going to be editing this algorithm some more, let's bite the bullet and do what foolip and anne wanted, which is to normalise newlines early for sanity.
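For illustration only, the early newline normalisation described in this log message could be sketched roughly as follows. This is not part of the commit; the function name normalize_input is hypothetical, and the sketch assumes the input has already been decoded to a Unicode string.

```python
def normalize_input(text: str) -> str:
    """Apply the three transformations in the order the revised first step lists them.

    Hypothetical helper for illustration; it assumes `text` is already Unicode.
    """
    # Replace U+0000 NULL with U+FFFD REPLACEMENT CHARACTER.
    text = text.replace("\u0000", "\ufffd")
    # Replace each CR LF pair with a single LF. This must run before the
    # lone-CR replacement, otherwise each pair would become two line feeds.
    text = text.replace("\r\n", "\n")
    # Replace every remaining lone CR with LF.
    text = text.replace("\r", "\n")
    return text


if __name__ == "__main__":
    sample = "WEBVTT\r\n\r\ncue one\rcue two\n"
    print(repr(normalize_input(sample)))
```

Once the input has been normalised this way, the later steps only ever need to test for U+000A LINE FEED, which is why the diff below drops the CARRIAGE RETURN checks.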
Modified: complete.html
===================================================================
--- complete.html 2011-10-25 22:44:32 UTC (rev 6756)
+++ complete.html 2011-10-25 22:59:20 UTC (rev 6757)
@@ -33068,11 +33068,26 @@
<p>The <dfn id=webvtt-parser-algorithm>WebVTT parser algorithm</dfn> is as follows:</p>
- <ol><li><p>Let <var title="">input</var> be the string being parsed,
- after conversion to Unicode.</li>
+ <ol><li>
- <li><p>Replace all U+0000 NULL characters in <var title="">input</var> by U+FFFD REPLACEMENT CHARACTERs.</li>
+ <p>Let <var title="">input</var> be the string being parsed, after
+ conversion to Unicode, and with the following transformations
+ applied:</p>
+ <ul><li><p>Replace all U+0000 NULL characters by U+FFFD REPLACEMENT
+ CHARACTERs.</li>
+
+ <li><p>Replace each U+000D CARRIAGE RETURN U+000A LINE FEED
+ (CRLF) character pair by a single U+000A LINE FEED (LF)
+ character.</li>
+
+ <li><p>Replace all remaining U+000D CARRIAGE RETURN characters by
+ U+000A LINE FEED (LF) characters.</li>
+
+ </ul></li>
+
+ <li>
+
<li><p>Let <var title="">position</var> be a pointer into <var title="">input</var>, initially pointing at the start of the
string. In an <a href=#incremental-webvtt-parser>incremental WebVTT parser</a>, when this
algorithm (or further algorithms that it uses) moves the <var title="">position</var> pointer, the user agent must wait until
@@ -33088,9 +33103,7 @@
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is less than six characters
long, then abort these steps. The file is not a <a href=#webvtt-file>WebVTT
@@ -33111,30 +33124,18 @@
<i>end</i>.</li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
- <li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
- <i>end</i>.</li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<li><p><i title="">Header</i>: <a href=#collect-a-sequence-of-characters>Collect a sequence of
- characters</a> that are <em>not</em> U+000D CARRIAGE RETURN (CR)
- or U+000A LINE FEED (LF) characters. Let <var title="">line</var>
- be those characters, if any.</li>
+ characters</a> that are <em>not</em> U+000A LINE FEED (LF)
+ characters. Let <var title="">line</var> be those characters, if
+ any.</li>
<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
- <li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
- <i>end</i>.</li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<!-- In v2, this is where we can put header metadata processing -->
@@ -33144,13 +33145,11 @@
<li><p><i>Cue loop</i>: <a href=#collect-a-sequence-of-characters>Collect a sequence of
- characters</a> that are either U+000D CARRIAGE RETURN (CR) or
- U+000A LINE FEED (LF) characters.</li>
+ characters</a> that are U+000A LINE FEED (LF)
+ characters.</li>
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>end</i>. (In such a case, <var title="">position</var> is also forcibly past the end of <var title="">input</var><!-- since we've just collected newlines, so we
@@ -33200,9 +33199,6 @@
<li><p>Let <var title="">cue</var>'s <a href=#text-track-cue-identifier>text track cue
identifier</a> be <var title="">line</var>.<p></li>
- <li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then discard <var title="">cue</var> and jump
to the step labeled <i>end</i>.</li>
@@ -33210,9 +33206,7 @@
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is the empty string, then
discard <var title="">cue</var> and jump to the step labeled <i>cue
@@ -33230,18 +33224,10 @@
labeled <i>cue text processing</i>.</li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
- <li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled <i>cue text
- processing</i>.</li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue text processing</i>.</li>
@@ -33275,18 +33261,10 @@
labeled <i>end</i>.</li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
- <li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
- <i>end</i>.</li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue loop</i>.</li>
Modified: index
===================================================================
--- index 2011-10-25 22:44:32 UTC (rev 6756)
+++ index 2011-10-25 22:59:20 UTC (rev 6757)
@@ -33068,11 +33068,26 @@
<p>The <dfn id=webvtt-parser-algorithm>WebVTT parser algorithm</dfn> is as follows:</p>
- <ol><li><p>Let <var title="">input</var> be the string being parsed,
- after conversion to Unicode.</li>
+ <ol><li>
- <li><p>Replace all U+0000 NULL characters in <var title="">input</var> by U+FFFD REPLACEMENT CHARACTERs.</li>
+ <p>Let <var title="">input</var> be the string being parsed, after
+ conversion to Unicode, and with the following transformations
+ applied:</p>
+ <ul><li><p>Replace all U+0000 NULL characters by U+FFFD REPLACEMENT
+ CHARACTERs.</li>
+
+ <li><p>Replace each U+000D CARRIAGE RETURN U+000A LINE FEED
+ (CRLF) character pair by a single U+000A LINE FEED (LF)
+ character.</li>
+
+ <li><p>Replace all remaining U+000D CARRIAGE RETURN characters by
+ U+000A LINE FEED (LF) characters.</li>
+
+ </ul></li>
+
+ <li>
+
<li><p>Let <var title="">position</var> be a pointer into <var title="">input</var>, initially pointing at the start of the
string. In an <a href=#incremental-webvtt-parser>incremental WebVTT parser</a>, when this
algorithm (or further algorithms that it uses) moves the <var title="">position</var> pointer, the user agent must wait until
@@ -33088,9 +33103,7 @@
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is less than six characters
long, then abort these steps. The file is not a <a href=#webvtt-file>WebVTT
@@ -33111,30 +33124,18 @@
<i>end</i>.</li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
- <li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
- <i>end</i>.</li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<li><p><i title="">Header</i>: <a href=#collect-a-sequence-of-characters>Collect a sequence of
- characters</a> that are <em>not</em> U+000D CARRIAGE RETURN (CR)
- or U+000A LINE FEED (LF) characters. Let <var title="">line</var>
- be those characters, if any.</li>
+ characters</a> that are <em>not</em> U+000A LINE FEED (LF)
+ characters. Let <var title="">line</var> be those characters, if
+ any.</li>
<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
- <li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
- <i>end</i>.</li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<!-- In v2, this is where we can put header metadata processing -->
@@ -33144,13 +33145,11 @@
<li><p><i>Cue loop</i>: <a href=#collect-a-sequence-of-characters>Collect a sequence of
- characters</a> that are either U+000D CARRIAGE RETURN (CR) or
- U+000A LINE FEED (LF) characters.</li>
+ characters</a> that are U+000A LINE FEED (LF)
+ characters.</li>
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>end</i>. (In such a case, <var title="">position</var> is also forcibly past the end of <var title="">input</var><!-- since we've just collected newlines, so we
@@ -33200,9 +33199,6 @@
<li><p>Let <var title="">cue</var>'s <a href=#text-track-cue-identifier>text track cue
identifier</a> be <var title="">line</var>.<p></li>
- <li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then discard <var title="">cue</var> and jump
to the step labeled <i>end</i>.</li>
@@ -33210,9 +33206,7 @@
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is the empty string, then
discard <var title="">cue</var> and jump to the step labeled <i>cue
@@ -33230,18 +33224,10 @@
labeled <i>cue text processing</i>.</li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
- <li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled <i>cue text
- processing</i>.</li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue text processing</i>.</li>
@@ -33275,18 +33261,10 @@
labeled <i>end</i>.</li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
-
- <li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
- <i>end</i>.</li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>
<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue loop</i>.</li>
Modified: source
===================================================================
--- source 2011-10-25 22:44:32 UTC (rev 6756)
+++ source 2011-10-25 22:59:20 UTC (rev 6757)
@@ -36188,12 +36188,30 @@
<ol>
- <li><p>Let <var title="">input</var> be the string being parsed,
- after conversion to Unicode.</p></li>
+ <li>
- <li><p>Replace all U+0000 NULL characters in <var
- title="">input</var> by U+FFFD REPLACEMENT CHARACTERs.</p></li>
+ <p>Let <var title="">input</var> be the string being parsed, after
+ conversion to Unicode, and with the following transformations
+ applied:</p>
+ <ul>
+
+ <li><p>Replace all U+0000 NULL characters by U+FFFD REPLACEMENT
+ CHARACTERs.</p></li>
+
+ <li><p>Replace each U+000D CARRIAGE RETURN U+000A LINE FEED
+ (CRLF) character pair by a single U+000A LINE FEED (LF)
+ character.</p></li>
+
+ <li><p>Replace all remaining U+000D CARRIAGE RETURN characters by
+ U+000A LINE FEED (LF) characters.</p></li>
+
+ </ul>
+
+ </li>
+
+ <li>
+
<li><p>Let <var title="">position</var> be a pointer into <var
title="">input</var>, initially pointing at the start of the
string. In an <span>incremental WebVTT parser</span>, when this
@@ -36215,9 +36233,8 @@
<li><p><span>Collect a sequence of characters</span> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</p></li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var
+ title="">line</var> be those characters, if any.</p></li>
<li><p>If <var title="">line</var> is less than six characters
long, then abort these steps. The file is not a <span>WebVTT
@@ -36240,39 +36257,21 @@
<i>end</i>.</p></li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var
- title="">position</var> to the next character in <var
- title="">input</var>.</p></li>
-
- <li><p>If <var title="">position</var> is past the end of <var
- title="">input</var>, then jump to the step labeled
- <i>end</i>.</p></li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var
title="">position</var> to the next character in <var
title="">input</var>.</p></li>
<li><p><i title="">Header</i>: <span>Collect a sequence of
- characters</span> that are <em>not</em> U+000D CARRIAGE RETURN (CR)
- or U+000A LINE FEED (LF) characters. Let <var title="">line</var>
- be those characters, if any.</p></li>
+ characters</span> that are <em>not</em> U+000A LINE FEED (LF)
+ characters. Let <var title="">line</var> be those characters, if
+ any.</p></li>
<li><p>If <var title="">position</var> is past the end of <var
title="">input</var>, then jump to the step labeled
<i>end</i>.</p></li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var
- title="">position</var> to the next character in <var
- title="">input</var>.</p></li>
-
- <li><p>If <var title="">position</var> is past the end of <var
- title="">input</var>, then jump to the step labeled
- <i>end</i>.</p></li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var
title="">position</var> to the next character in <var
title="">input</var>.</p></li>
@@ -36284,13 +36283,12 @@
<li><p><i>Cue loop</i>: <span>Collect a sequence of
- characters</span> that are either U+000D CARRIAGE RETURN (CR) or
- U+000A LINE FEED (LF) characters.</p></li>
+ characters</span> that are U+000A LINE FEED (LF)
+ characters.</p></li>
<li><p><span>Collect a sequence of characters</span> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</p></li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var
+ title="">line</var> be those characters, if any.</p></li>
<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>end</i>. (In such a case, <var
@@ -36342,11 +36340,6 @@
<li><p>Let <var title="">cue</var>'s <span>text track cue
identifier</span> be <var title="">line</var>.<p></li>
- <li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var
- title="">position</var> to the next character in <var
- title="">input</var>.</p></li>
-
<li><p>If <var title="">position</var> is past the end of <var
title="">input</var>, then discard <var title="">cue</var> and jump
to the step labeled <i>end</i>.</p></li>
@@ -36357,9 +36350,8 @@
title="">input</var>.</p></li>
<li><p><span>Collect a sequence of characters</span> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</p></li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var
+ title="">line</var> be those characters, if any.</p></li>
<li><p>If <var title="">line</var> is the empty string, then
discard <var title="">cue</var> and jump to the step labeled <i>cue
@@ -36378,23 +36370,13 @@
labeled <i>cue text processing</i>.</p></li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var
- title="">position</var> to the next character in <var
- title="">input</var>.</p></li>
-
- <li><p>If <var title="">position</var> is past the end of <var
- title="">input</var>, then jump to the step labeled <i>cue text
- processing</i>.</p></li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var
title="">position</var> to the next character in <var
title="">input</var>.</p></li>
<li><p><span>Collect a sequence of characters</span> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</p></li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var
+ title="">line</var> be those characters, if any.</p></li>
<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue text processing</i>.</p></li>
@@ -36428,23 +36410,13 @@
labeled <i>end</i>.</p></li>
<li><p>If the character indicated by <var title="">position</var>
- is a U+000D CARRIAGE RETURN (CR) character, advance <var
- title="">position</var> to the next character in <var
- title="">input</var>.</p></li>
-
- <li><p>If <var title="">position</var> is past the end of <var
- title="">input</var>, then jump to the step labeled
- <i>end</i>.</p></li>
-
- <li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var
title="">position</var> to the next character in <var
title="">input</var>.</p></li>
<li><p><span>Collect a sequence of characters</span> that are
- <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
- characters. Let <var title="">line</var> be those characters, if
- any.</p></li>
+ <em>not</em> U+000A LINE FEED (LF) characters. Let <var
+ title="">line</var> be those characters, if any.</p></li>
<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue loop</i>.</p></li>
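
As a rough sketch of what the simplified steps amount to (not taken from the spec; the helper names collect_line and skip_one_lf are hypothetical), collecting a line and stepping over its terminator need only look for U+000A LINE FEED, assuming the input was normalised beforehand:

```python
from typing import Tuple


def collect_line(text: str, position: int) -> Tuple[str, int]:
    """Collect characters that are not LF, starting at `position`.

    Returns the collected line (possibly empty) and the new position,
    which points at the LF that ended the line or is past the end of `text`.
    """
    end = text.find("\n", position)
    if end == -1:
        end = len(text)
    return text[position:end], end


def skip_one_lf(text: str, position: int) -> int:
    """Advance past a single LF if `position` indicates one."""
    if position < len(text) and text[position] == "\n":
        return position + 1
    return position


if __name__ == "__main__":
    data = "WEBVTT\n\ncue text"  # assumed to be already normalised
    line, pos = collect_line(data, 0)
    pos = skip_one_lf(data, pos)
    print(repr(line), pos)
```

Before this revision each such step also had to handle U+000D CARRIAGE RETURN, with an extra end-of-input check between the CR and LF tests.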