[html5] r5080 - [giow] (0) Captions - Stage 9.1: More parser rules for WebSRT.

Wed May 5 14:17:56 PDT 2010

Author: ianh
Date: 2010-05-05 14:17:52 -0700 (Wed, 05 May 2010)
New Revision: 5080

Modified:
   complete.html
   index
   source
Log:
[giow] (0) Captions - Stage 9.1: More parser rules for WebSRT.

Modified: complete.html
===================================================================

--- complete.html	2010-05-05 20:23:21 UTC (rev 5079)
+++ complete.html	2010-05-05 21:17:52 UTC (rev 5080)
@@ -26175,6 +26175,12 @@
 
   <p class=XXX>...
 
+  <!-- XXX
+   Make sure that .cues and .activeCues don't change while script is
+   running, except for addCue/removeCue and the removal of all cues in
+   the face of a dynamic track.src change.
+  -->
+
   </div>
 
 
@@ -26254,22 +26260,41 @@
 
   <h6 id=parsing-0><span class=secno>4.8.10.11.2 </span>Parsing</h6>
 
-  <p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream, must
-  convert the bytes into Unicode characters by interpreting them as
-  UTF-8. Bytes or sequences of bytes that are not valid UTF-8
-  sequences must be interpreted as a U+FFFD REPLACEMENT CHARACTER. All
-  U+0000 NULL characters must be replaced by U+FFFD REPLACEMENT
-  CHARACTERs.</p>
+  <p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream and a
+  <a href=#timed-track-list-of-cues>timed track list of cues</a> <var title="">output</var>,
+  must convert the bytes into a string of Unicode characters by
+  interpreting them as UTF-8, and then must parse the resulting string
+  according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. A
+  <a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and parsing
+  steps, is typically run asynchronously, with the input byte stream
+  being updated incrementally as the resource is downloaded.</p>
 
-  <p>The Unicode characters from a string that must be parsed
-  according to the following algorithm:</p>
+  <p>When converting the bytes into Unicode characters, bytes or
+  sequences of bytes that are not valid UTF-8 sequences must be
+  interpreted as a U+FFFD REPLACEMENT CHARACTER, and all U+0000 NULL
+  characters must be replaced by U+FFFD REPLACEMENT CHARACTERs.</p>
 
+  <p>The <dfn id=websrt-parser-algorithm>WebSRT parser algorithm</dfn> is as follows:</p>
+
   <ol><li><p>Let <var title="">input</var> be the string being
    parsed.</li>
 
    <li><p>Let <var title="">position</var> be a pointer into <var title="">input</var>, initially pointing at the start of the
    string.</li>
 
+   <li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
+   either U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
+   characters.</li>
+
+   <li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
+   <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
+   characters. Let <var title="">line</var> be those
+   characters, if any.</li>
+
+   <li><p>If <var title="">line</var> is the empty string, then the
+   file has ended. Abort these steps. The <a href=#websrt-parser>WebSRT parser</a>
+   has finished.</li>
+
    <li><p class=XXX>...</li>
 
   </ol></div>

Modified: index
===================================================================
--- index	2010-05-05 20:23:21 UTC (rev 5079)
+++ index	2010-05-05 21:17:52 UTC (rev 5080)
@@ -26076,6 +26076,12 @@
 
   <p class=XXX>...
 
+  <!-- XXX
+   Make sure that .cues and .activeCues don't change while script is
+   running, except for addCue/removeCue and the removal of all cues in
+   the face of a dynamic track.src change.
+  -->
+
   </div>
 
 
@@ -26155,22 +26161,41 @@
 
   <h6 id=parsing-0><span class=secno>4.8.10.11.2 </span>Parsing</h6>
 
-  <p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream, must
-  convert the bytes into Unicode characters by interpreting them as
-  UTF-8. Bytes or sequences of bytes that are not valid UTF-8
-  sequences must be interpreted as a U+FFFD REPLACEMENT CHARACTER. All
-  U+0000 NULL characters must be replaced by U+FFFD REPLACEMENT
-  CHARACTERs.</p>
+  <p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream and a
+  <a href=#timed-track-list-of-cues>timed track list of cues</a> <var title="">output</var>,
+  must convert the bytes into a string of Unicode characters by
+  interpreting them as UTF-8, and then must parse the resulting string
+  according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. A
+  <a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and parsing
+  steps, is typically run asynchronously, with the input byte stream
+  being updated incrementally as the resource is downloaded.</p>
 
-  <p>The Unicode characters from a string that must be parsed
-  according to the following algorithm:</p>
+  <p>When converting the bytes into Unicode characters, bytes or
+  sequences of bytes that are not valid UTF-8 sequences must be
+  interpreted as a U+FFFD REPLACEMENT CHARACTER, and all U+0000 NULL
+  characters must be replaced by U+FFFD REPLACEMENT CHARACTERs.</p>
 
+  <p>The <dfn id=websrt-parser-algorithm>WebSRT parser algorithm</dfn> is as follows:</p>
+
   <ol><li><p>Let <var title="">input</var> be the string being
    parsed.</li>
 
    <li><p>Let <var title="">position</var> be a pointer into <var title="">input</var>, initially pointing at the start of the
    string.</li>
 
+   <li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
+   either U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
+   characters.</li>
+
+   <li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
+   <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
+   characters. Let <var title="">line</var> be those
+   characters, if any.</li>
+
+   <li><p>If <var title="">line</var> is the empty string, then the
+   file has ended. Abort these steps. The <a href=#websrt-parser>WebSRT parser</a>
+   has finished.</li>
+
    <li><p class=XXX>...</li>
 
   </ol></div>

Modified: source
===================================================================
--- source	2010-05-05 20:23:21 UTC (rev 5079)
+++ source	2010-05-05 21:17:52 UTC (rev 5080)
@@ -28285,6 +28285,12 @@
 
   <p class="XXX">...
 
+  <!-- XXX
+   Make sure that .cues and .activeCues don't change while script is
+   running, except for addCue/removeCue and the removal of all cues in
+   the face of a dynamic track.src change.
+  -->
+
   </div>
 
 
@@ -28379,16 +28385,22 @@
 
   <h6>Parsing</h6>
 
-  <p>A <dfn>WebSRT parser</dfn>, given an input byte stream, must
-  convert the bytes into Unicode characters by interpreting them as
-  UTF-8. Bytes or sequences of bytes that are not valid UTF-8
-  sequences must be interpreted as a U+FFFD REPLACEMENT CHARACTER. All
-  U+0000 NULL characters must be replaced by U+FFFD REPLACEMENT
-  CHARACTERs.</p>
+  <p>A <dfn>WebSRT parser</dfn>, given an input byte stream and a
+  <span>timed track list of cues</span> <var title="">output</var>,
+  must convert the bytes into a string of Unicode characters by
+  interpreting them as UTF-8, and then must parse the resulting string
+  according to the <span>WebSRT parser algorithm</span> below. A
+  <span>WebSRT parser</span>, specifically its conversion and parsing
+  steps, is typically run asynchronously, with the input byte stream
+  being updated incrementally as the resource is downloaded.</p>
 
-  <p>The Unicode characters from a string that must be parsed
-  according to the following algorithm:</p>
+  <p>When converting the bytes into Unicode characters, bytes or
+  sequences of bytes that are not valid UTF-8 sequences must be
+  interpreted as a U+FFFD REPLACEMENT CHARACTER, and all U+0000 NULL
+  characters must be replaced by U+FFFD REPLACEMENT CHARACTERs.</p>
 
+  <p>The <dfn>WebSRT parser algorithm</dfn> is as follows:</p>
+
   <ol>
 
    <li><p>Let <var title="">input</var> be the string being
@@ -28398,6 +28410,19 @@
    title="">input</var>, initially pointing at the start of the
    string.</p></li>
 
+   <li><p><span>Collect a sequence of characters</span> that are
+   either U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
+   characters.</p></li>
+
+   <li><p><span>Collect a sequence of characters</span> that are
+   <em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
+   characters. Let <var title="">line</var> be those
+   characters, if any.</p></li>
+
+   <li><p>If <var title="">line</var> is the empty string, then the
+   file has ended. Abort these steps. The <span>WebSRT parser</span>
+   has finished.</p></li>
+
    <li><p class="XXX">...</p></li>
 
   </ol>
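
Illustrative note (not part of the commit): the conversion step and the parser steps added in this revision can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the spec's reference implementation. The function names (decode_websrt_bytes, collect, first_line) are hypothetical; the sketch operates on a complete in-memory buffer rather than an incrementally updated stream; and Python's errors="replace" handler is assumed to be an acceptable stand-in for the U+FFFD substitution, although the exact number of replacement characters it emits for a given invalid sequence may differ from what a conforming parser would produce. The remaining steps are still marked XXX in the spec and are not sketched.

```python
def decode_websrt_bytes(data: bytes) -> str:
    """Interpret bytes as UTF-8, mapping invalid sequences and NULs to U+FFFD."""
    # errors="replace" substitutes U+FFFD for byte sequences that are not valid UTF-8.
    text = data.decode("utf-8", errors="replace")
    # U+0000 NULL characters are replaced by U+FFFD as well.
    return text.replace("\x00", "\ufffd")


def collect(input_str: str, position: int, predicate):
    """Collect a run of characters matching predicate, starting at position.

    Returns the collected characters and the new position.
    """
    start = position
    while position < len(input_str) and predicate(input_str[position]):
        position += 1
    return input_str[start:position], position


def first_line(input_str: str):
    """Run the steps given so far: skip CR/LF characters, then read one line.

    Returns (line, position); line is None when the file has ended.
    """
    position = 0  # pointer into input, initially at the start of the string
    # Collect (and discard) characters that are CR or LF.
    _, position = collect(input_str, position, lambda c: c in "\r\n")
    # Collect characters that are not CR or LF; these form the line.
    line, position = collect(input_str, position, lambda c: c not in "\r\n")
    if line == "":
        # An empty line here means the file has ended; the parser has finished.
        return None, position
    return line, position
```

For example, first_line(decode_websrt_bytes(raw)) would yield the first non-empty line of a downloaded resource held in a hypothetical bytes object raw, or None if the resource contains only line terminators.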