[html5] r8428 - [e] (0) Add some best practices notes regarding how to use metadata cues. Fixing [...]

Tue Jan 28 12:17:39 PST 2014

Author: ianh
Date: 2014-01-28 12:17:38 -0800 (Tue, 28 Jan 2014)
New Revision: 8428

Modified:
   complete.html
   index
   source
Log:
[e] (0) Add some best practices notes regarding how to use metadata cues.
Fixing https://www.w3.org/Bugs/Public/show_bug.cgi?id=24382
Affected topics: HTML, Video Text Tracks

Modified: complete.html
===================================================================

--- complete.html	2014-01-28 18:49:55 UTC (rev 8427)
+++ complete.html	2014-01-28 20:17:38 UTC (rev 8428)
@@ -640,7 +640,8 @@
          <li><a href=#guidelines-for-exposing-cues-in-various-formats-as-text-track-cues><span class=secno>4.7.10.12.4 </span>Guidelines for exposing cues in various formats as text track cues</a></li>
          <li><a href=#text-track-api><span class=secno>4.7.10.12.5 </span>Text track API</a></li>
          <li><a href=#text-tracks-describing-chapters><span class=secno>4.7.10.12.6 </span>Text tracks describing chapters</a></li>
-         <li><a href=#cue-events><span class=secno>4.7.10.12.7 </span>Event handlers for objects of the text track APIs</a></ol></li>
+         <li><a href=#cue-events><span class=secno>4.7.10.12.7 </span>Event handlers for objects of the text track APIs</a></li>
+         <li><a href=#best-practices-for-metadata-text-tracks><span class=secno>4.7.10.12.8 </span>Best practices for metadata text tracks</a></ol></li>
        <li><a href=#user-interface><span class=secno>4.7.10.13 </span>User interface</a></li>
        <li><a href=#time-ranges><span class=secno>4.7.10.14 </span>Time ranges</a></li>
        <li><a href=#the-trackevent-interface><span class=secno>4.7.10.15 </span>The <code>TrackEvent</code> interface</a></li>
@@ -30631,6 +30632,78 @@
   </table></div>
 
 
+
+  <h6 id=best-practices-for-metadata-text-tracks><span class=secno>4.7.10.12.8 </span>Best practices for metadata text tracks</h6>
+
+  <p><i>This section is non-normative.</i></p>
+
+  <p>Text tracks can be used for storing data relating to the media data, for interactive or
+  augmented views.</p>
+
+  <p>For example, a page showing a sports broadcast could include information about the current
+  score. Suppose a robotics competition was being streamed live. The image could be overlaid with
+  the scores, as follows:</p>
+
+  <p><iframe src='data:text/html;charset=utf-8,<!DOCTYPE%20html>%0A<style>%0A%20body%2C%20html%20%7B%20margin%3A%200%3B%20padding%3A%200%3B%20overflow%3A%20hidden%3B%20%7D%0A%20div%20%7B%20width%3A%20600px%3B%20height%3A%20400px%3B%20position%3A%20relative%3B%20%7D%0A%20p%20%7B%20position%3A%20absolute%3B%20top%3A%200%3B%20margin%3A%200.25em%3B%20font%3A%20small-caps%20900%202em%20sans-serif%3B%20text-shadow%3A%20white%200%200%204px%3B%20%7D%0A%20span%20%7B%20display%3A%20block%3B%20%7D%0A%20.left%20%7B%20color%3A%20red%3B%20left%3A%200%3B%20text-align%3A%20left%3B%20%7D%0A%20.right%20%7B%20color%3A%20blue%3B%20right%3A%200%3B%20text-align%3A%20right%3B%20%7D%0A%20.middle%20%7B%20color%3A%20white%3B%20top%3A%20auto%3B%20bottom%3A%200%3B%20left%3A%200%3B%20right%3A%200%3B%20text-align%3A%20center%3B%20text-shadow%3A%20black%200%200%204px%3B%20%7D%0A%20.middle%20span%20%7B%20display%3A%20inline-block%3B%20margin%3A%200%201em%3B%20font-size%3A%200.75em%3B%20text-transform%3A%20
 uppercase%3B%20%7D%0A<%2Fstyle>%0A<div>%0A%20<img%20src%3D"http%3A%2F%2Fwww.whatwg.org%2Fspecs%2Fweb-apps%2Fcurrent-work%2Fimages%2Frobots.jpeg">%0A%20<p%20class%3D"score%20left"><span>Red%20Alliance<%2Fspan>%20<span>78<%2Fspan><%2Fp>%0A%20<p%20class%3D"score%20right"><span>Blue%20Alliance<%2Fspan>%20<span>66<%2Fspan><%2Fp>%0A%20<p%20class%3D"score%20middle"><span>Qual%20Match%2037<%2Fspan>%20<span>Friday%2014%3A21<%2Fspan>%0A<%2Fdiv>' width=600 height=400></iframe>
+
+  <p>In order to make the score display render correctly whenever the user seeks to an arbitrary
+  point in the video, the metadata text track cues need to be as long as is appropriate for the
+  score. For example, in the frame above, there might be one cue that lasts the length of the
+  match and gives the match number, one cue that lasts until the blue alliance's score changes, and
+  one cue that lasts until the red alliance's score changes. If the video is just a stream of the
+  live event, the time in the bottom right would presumably be automatically derived from the
+  current video time, rather than based on a cue. However, if the video were just the highlights,
+  then the time might be given in cues also.</p>
+
+  <p>The following shows what fragments of this could look like in a WebVTT file:</p>
+
+  <pre>WEBVTT
+
+...
+
+05:10:00.000 --> 05:12:15.000
+matchtype:qual
+matchnumber:37
+
+...
+
+05:11:02.251 --> 05:11:17.198
+red:78
+
+05:11:03.672 --> 05:11:54.198
+blue:66
+
+05:11:17.198 --> 05:11:25.912
+red:80
+
+05:11:25.912 --> 05:11:26.522
+red:83
+
+05:11:26.522 --> 05:11:26.982
+red:86
+
+05:11:26.982 --> 05:11:27.499
+red:89
+
+...</pre>
+
+  <p>The key here is to notice that the information is given in cues that span the length of time to
+  which the relevant event applies. If, instead, the scores were given as zero-length (or very
+  brief, nearly zero-length) cues when the score changes, for example saying "red+2" at
+  05:11:17.198, "red+3" at 05:11:25.912, etc., problems would arise: primarily, seeking is much harder to
+  implement, as the script has to walk the entire list of cues to make sure that no notifications
+  have been missed; but also, if the cues are short it's possible the script will never see that
+  they are active unless it listens to them specifically.</p>
+
+  <p>When using cues in this manner, authors are encouraged to use the <code title=event-media-cuechange><a href=#event-media-cuechange>cuechange</a></code> event to update the current annotations. (In
+  particular, using the <code title=event-media-timeupdate><a href=#event-media-timeupdate>timeupdate</a></code> event would be less
+  appropriate as it would require doing work even when the cues haven't changed, and, more
+  importantly, would introduce a higher latency between when the metadata cues become active and
+  when the display is updated, since <code title=event-media-timeupdate><a href=#event-media-timeupdate>timeupdate</a></code> events
+  are rate-limited.)</p>
+
+
+
   <h5 id=user-interface><span class=secno>4.7.10.13 </span>User interface</h5>
 
   <p>The <dfn id=attr-media-controls title=attr-media-controls><code>controls</code></dfn> attribute is a <a href=#boolean-attribute>boolean
@@ -103064,6 +103137,13 @@
    (<a itemprop=license href=http://creativecommons.org/licenses/by-sa/3.0/>CC BY-SA 3.0</a>)</p>
   </div>
 
+  <div itemscope="" itemtype=http://n.whatwg.org/work>
+   <p>The photograph of robot 148 climbing the tower at the FIRST Robotics Competition 2013 Silicon Valley Regional is based on
+   <a itemprop=work href=http://www.flickr.com/photos/lenore-m/8631391979/>a work</a> by
+   <a itemprop=http://creativecommons.org/ns#attributionURL href=http://www.flickr.com/photos/lenore-m/>Lenore Edman</a>.
+   (<a itemprop=license href=http://creativecommons.org/licenses/by/2.0/>CC BY 2.0</a>)</p>
+  </div>
+
   <p>Thanks also to the Microsoft blogging community for some ideas, to the attendees of the W3C
   Workshop on Web Applications and Compound Documents for inspiration, to the #mrt crew, the #mrt.no
   crew, and the #whatwg crew, and to Pillar and Hedral for their ideas and support.</p>

Modified: index
===================================================================
--- index	2014-01-28 18:49:55 UTC (rev 8427)
+++ index	2014-01-28 20:17:38 UTC (rev 8428)
@@ -640,7 +640,8 @@
          <li><a href=#guidelines-for-exposing-cues-in-various-formats-as-text-track-cues><span class=secno>4.7.10.12.4 </span>Guidelines for exposing cues in various formats as text track cues</a></li>
          <li><a href=#text-track-api><span class=secno>4.7.10.12.5 </span>Text track API</a></li>
          <li><a href=#text-tracks-describing-chapters><span class=secno>4.7.10.12.6 </span>Text tracks describing chapters</a></li>
-         <li><a href=#cue-events><span class=secno>4.7.10.12.7 </span>Event handlers for objects of the text track APIs</a></ol></li>
+         <li><a href=#cue-events><span class=secno>4.7.10.12.7 </span>Event handlers for objects of the text track APIs</a></li>
+         <li><a href=#best-practices-for-metadata-text-tracks><span class=secno>4.7.10.12.8 </span>Best practices for metadata text tracks</a></ol></li>
        <li><a href=#user-interface><span class=secno>4.7.10.13 </span>User interface</a></li>
        <li><a href=#time-ranges><span class=secno>4.7.10.14 </span>Time ranges</a></li>
        <li><a href=#the-trackevent-interface><span class=secno>4.7.10.15 </span>The <code>TrackEvent</code> interface</a></li>
@@ -30631,6 +30632,78 @@
   </table></div>
 
 
+
+  <h6 id=best-practices-for-metadata-text-tracks><span class=secno>4.7.10.12.8 </span>Best practices for metadata text tracks</h6>
+
+  <p><i>This section is non-normative.</i></p>
+
+  <p>Text tracks can be used for storing data relating to the media data, for interactive or
+  augmented views.</p>
+
+  <p>For example, a page showing a sports broadcast could include information about the current
+  score. Suppose a robotics competition was being streamed live. The image could be overlaid with
+  the scores, as follows:</p>
+
+  <p><iframe src='data:text/html;charset=utf-8,<!DOCTYPE%20html>%0A<style>%0A%20body%2C%20html%20%7B%20margin%3A%200%3B%20padding%3A%200%3B%20overflow%3A%20hidden%3B%20%7D%0A%20div%20%7B%20width%3A%20600px%3B%20height%3A%20400px%3B%20position%3A%20relative%3B%20%7D%0A%20p%20%7B%20position%3A%20absolute%3B%20top%3A%200%3B%20margin%3A%200.25em%3B%20font%3A%20small-caps%20900%202em%20sans-serif%3B%20text-shadow%3A%20white%200%200%204px%3B%20%7D%0A%20span%20%7B%20display%3A%20block%3B%20%7D%0A%20.left%20%7B%20color%3A%20red%3B%20left%3A%200%3B%20text-align%3A%20left%3B%20%7D%0A%20.right%20%7B%20color%3A%20blue%3B%20right%3A%200%3B%20text-align%3A%20right%3B%20%7D%0A%20.middle%20%7B%20color%3A%20white%3B%20top%3A%20auto%3B%20bottom%3A%200%3B%20left%3A%200%3B%20right%3A%200%3B%20text-align%3A%20center%3B%20text-shadow%3A%20black%200%200%204px%3B%20%7D%0A%20.middle%20span%20%7B%20display%3A%20inline-block%3B%20margin%3A%200%201em%3B%20font-size%3A%200.75em%3B%20text-transform%3A%20
 uppercase%3B%20%7D%0A<%2Fstyle>%0A<div>%0A%20<img%20src%3D"http%3A%2F%2Fwww.whatwg.org%2Fspecs%2Fweb-apps%2Fcurrent-work%2Fimages%2Frobots.jpeg">%0A%20<p%20class%3D"score%20left"><span>Red%20Alliance<%2Fspan>%20<span>78<%2Fspan><%2Fp>%0A%20<p%20class%3D"score%20right"><span>Blue%20Alliance<%2Fspan>%20<span>66<%2Fspan><%2Fp>%0A%20<p%20class%3D"score%20middle"><span>Qual%20Match%2037<%2Fspan>%20<span>Friday%2014%3A21<%2Fspan>%0A<%2Fdiv>' width=600 height=400></iframe>
+
+  <p>In order to make the score display render correctly whenever the user seeks to an arbitrary
+  point in the video, the metadata text track cues need to be as long as is appropriate for the
+  score. For example, in the frame above, there might be one cue that lasts the length of the
+  match and gives the match number, one cue that lasts until the blue alliance's score changes, and
+  one cue that lasts until the red alliance's score changes. If the video is just a stream of the
+  live event, the time in the bottom right would presumably be automatically derived from the
+  current video time, rather than based on a cue. However, if the video were just the highlights,
+  then the time might be given in cues also.</p>
+
+  <p>The following shows what fragments of this could look like in a WebVTT file:</p>
+
+  <pre>WEBVTT
+
+...
+
+05:10:00.000 --> 05:12:15.000
+matchtype:qual
+matchnumber:37
+
+...
+
+05:11:02.251 --> 05:11:17.198
+red:78
+
+05:11:03.672 --> 05:11:54.198
+blue:66
+
+05:11:17.198 --> 05:11:25.912
+red:80
+
+05:11:25.912 --> 05:11:26.522
+red:83
+
+05:11:26.522 --> 05:11:26.982
+red:86
+
+05:11:26.982 --> 05:11:27.499
+red:89
+
+...</pre>
+
+  <p>The key here is to notice that the information is given in cues that span the length of time to
+  which the relevant event applies. If, instead, the scores were given as zero-length (or very
+  brief, nearly zero-length) cues when the score changes, for example saying "red+2" at
+  05:11:17.198, "red+3" at 05:11:25.912, etc., problems would arise: primarily, seeking is much harder to
+  implement, as the script has to walk the entire list of cues to make sure that no notifications
+  have been missed; but also, if the cues are short it's possible the script will never see that
+  they are active unless it listens to them specifically.</p>
+
+  <p>When using cues in this manner, authors are encouraged to use the <code title=event-media-cuechange><a href=#event-media-cuechange>cuechange</a></code> event to update the current annotations. (In
+  particular, using the <code title=event-media-timeupdate><a href=#event-media-timeupdate>timeupdate</a></code> event would be less
+  appropriate as it would require doing work even when the cues haven't changed, and, more
+  importantly, would introduce a higher latency between when the metadata cues become active and
+  when the display is updated, since <code title=event-media-timeupdate><a href=#event-media-timeupdate>timeupdate</a></code> events
+  are rate-limited.)</p>
+
+
+
   <h5 id=user-interface><span class=secno>4.7.10.13 </span>User interface</h5>
 
   <p>The <dfn id=attr-media-controls title=attr-media-controls><code>controls</code></dfn> attribute is a <a href=#boolean-attribute>boolean
@@ -103064,6 +103137,13 @@
    (<a itemprop=license href=http://creativecommons.org/licenses/by-sa/3.0/>CC BY-SA 3.0</a>)</p>
   </div>
 
+  <div itemscope="" itemtype=http://n.whatwg.org/work>
+   <p>The photograph of robot 148 climbing the tower at the FIRST Robotics Competition 2013 Silicon Valley Regional is based on
+   <a itemprop=work href=http://www.flickr.com/photos/lenore-m/8631391979/>a work</a> by
+   <a itemprop=http://creativecommons.org/ns#attributionURL href=http://www.flickr.com/photos/lenore-m/>Lenore Edman</a>.
+   (<a itemprop=license href=http://creativecommons.org/licenses/by/2.0/>CC BY 2.0</a>)</p>
+  </div>
+
   <p>Thanks also to the Microsoft blogging community for some ideas, to the attendees of the W3C
   Workshop on Web Applications and Compound Documents for inspiration, to the #mrt crew, the #mrt.no
   crew, and the #whatwg crew, and to Pillar and Hedral for their ideas and support.</p>

Modified: source
===================================================================
--- source	2014-01-28 18:49:55 UTC (rev 8427)
+++ source	2014-01-28 20:17:38 UTC (rev 8428)
@@ -33013,6 +33013,79 @@
   </div>
 
 
+
+  <h6>Best practices for metadata text tracks</h6>
+
+  <!--END dev-html--><p><i>This section is non-normative.</i></p><!--START dev-html-->
+
+  <p>Text tracks can be used for storing data relating to the media data, for interactive or
+  augmented views.</p>
+
+  <p>For example, a page showing a sports broadcast could include information about the current
+  score. Suppose a robotics competition was being streamed live. The image could be overlaid with
+  the scores, as follows:</p>
+
+  <p><iframe src='data:text/html;charset=utf-8,<!DOCTYPE%20html>%0A<style>%0A%20body%2C%20html%20%7B%20margin%3A%200%3B%20padding%3A%200%3B%20overflow%3A%20hidden%3B%20%7D%0A%20div%20%7B%20width%3A%20600px%3B%20height%3A%20400px%3B%20position%3A%20relative%3B%20%7D%0A%20p%20%7B%20position%3A%20absolute%3B%20top%3A%200%3B%20margin%3A%200.25em%3B%20font%3A%20small-caps%20900%202em%20sans-serif%3B%20text-shadow%3A%20white%200%200%204px%3B%20%7D%0A%20span%20%7B%20display%3A%20block%3B%20%7D%0A%20.left%20%7B%20color%3A%20red%3B%20left%3A%200%3B%20text-align%3A%20left%3B%20%7D%0A%20.right%20%7B%20color%3A%20blue%3B%20right%3A%200%3B%20text-align%3A%20right%3B%20%7D%0A%20.middle%20%7B%20color%3A%20white%3B%20top%3A%20auto%3B%20bottom%3A%200%3B%20left%3A%200%3B%20right%3A%200%3B%20text-align%3A%20center%3B%20text-shadow%3A%20black%200%200%204px%3B%20%7D%0A%20.middle%20span%20%7B%20display%3A%20inline-block%3B%20margin%3A%200%201em%3B%20font-size%3A%200.75em%3B%20text-transform%3A%20
 uppercase%3B%20%7D%0A<%2Fstyle>%0A<div>%0A%20<img%20src%3D"http%3A%2F%2Fwww.whatwg.org%2Fspecs%2Fweb-apps%2Fcurrent-work%2Fimages%2Frobots.jpeg">%0A%20<p%20class%3D"score%20left"><span>Red%20Alliance<%2Fspan>%20<span>78<%2Fspan><%2Fp>%0A%20<p%20class%3D"score%20right"><span>Blue%20Alliance<%2Fspan>%20<span>66<%2Fspan><%2Fp>%0A%20<p%20class%3D"score%20middle"><span>Qual%20Match%2037<%2Fspan>%20<span>Friday%2014%3A21<%2Fspan>%0A<%2Fdiv>' width="600" height="400"></iframe>
+
+  <p>In order to make the score display render correctly whenever the user seeks to an arbitrary
+  point in the video, the metadata text track cues need to be as long as is appropriate for the
+  score. For example, in the frame above, there might be one cue that lasts the length of the
+  match and gives the match number, one cue that lasts until the blue alliance's score changes, and
+  one cue that lasts until the red alliance's score changes. If the video is just a stream of the
+  live event, the time in the bottom right would presumably be automatically derived from the
+  current video time, rather than based on a cue. However, if the video were just the highlights,
+  then the time might be given in cues also.</p>
+
+  <p>The following shows what fragments of this could look like in a WebVTT file:</p>
+
+  <pre>WEBVTT
+
+...
+
+05:10:00.000 --> 05:12:15.000
+matchtype:qual
+matchnumber:37
+
+...
+
+05:11:02.251 --> 05:11:17.198
+red:78
+
+05:11:03.672 --> 05:11:54.198
+blue:66
+
+05:11:17.198 --> 05:11:25.912
+red:80
+
+05:11:25.912 --> 05:11:26.522
+red:83
+
+05:11:26.522 --> 05:11:26.982
+red:86
+
+05:11:26.982 --> 05:11:27.499
+red:89
+
+...</pre>
+
+  <p>The key here is to notice that the information is given in cues that span the length of time to
+  which the relevant event applies. If, instead, the scores were given as zero-length (or very
+  brief, nearly zero-length) cues when the score changes, for example saying "red+2" at
+  05:11:17.198, "red+3" at 05:11:25.912, etc., problems would arise: primarily, seeking is much harder to
+  implement, as the script has to walk the entire list of cues to make sure that no notifications
+  have been missed; but also, if the cues are short it's possible the script will never see that
+  they are active unless it listens to them specifically.</p>
+
+  <p>When using cues in this manner, authors are encouraged to use the <code
+  data-x="event-media-cuechange">cuechange</code> event to update the current annotations. (In
+  particular, using the <code data-x="event-media-timeupdate">timeupdate</code> event would be less
+  appropriate as it would require doing work even when the cues haven't changed, and, more
+  importantly, would introduce a higher latency between when the metadata cues become active and
+  when the display is updated, since <code data-x="event-media-timeupdate">timeupdate</code> events
+  are rate-limited.)</p>
+
+
+
   <h5>User interface</h5>
 
   <p>The <dfn data-x="attr-media-controls"><code>controls</code></dfn> attribute is a <span>boolean
@@ -115060,6 +115133,13 @@
    (<a itemprop="license" href="http://creativecommons.org/licenses/by-sa/3.0/">CC BY-SA 3.0</a>)</p>
   </div>
 
+  <div itemscope itemtype="http://n.whatwg.org/work">
+   <p>The photograph of robot 148 climbing the tower at the FIRST Robotics Competition 2013 Silicon Valley Regional is based on
+   <a itemprop="work" href="http://www.flickr.com/photos/lenore-m/8631391979/">a work</a> by
+   <a itemprop="http://creativecommons.org/ns#attributionURL" href="http://www.flickr.com/photos/lenore-m/">Lenore Edman</a>.
+   (<a itemprop="license" href="http://creativecommons.org/licenses/by/2.0/">CC BY 2.0</a>)</p>
+  </div>
+
   <p>Thanks also to the Microsoft blogging community for some ideas, to the attendees of the W3C
   Workshop on Web Applications and Compound Documents for inspiration, to the #mrt crew, the #mrt.no
   crew, and the #whatwg crew, and to Pillar and Hedral for their ideas and support.</p>
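
Illustration (not part of r8428): the new best-practices text recommends updating the display from the cuechange event using cues that span the whole interval they describe. A minimal sketch of how a page script might do that is given below. The function names (parseCueText, stateFromActiveCues, watchScoreboard) and the render callback are hypothetical and do not appear in the spec; the "key:value" cue payload is taken from the WebVTT fragment in the diff, and its parsing here is an assumption about how an author might read it.

```javascript
// Hypothetical helper: parse one metadata cue's text ("key:value" per line,
// as in the WebVTT fragment in this commit) into a plain object.
function parseCueText(text) {
  const fields = {};
  for (const line of text.split(/\r?\n/)) {
    const colon = line.indexOf(':');
    if (colon <= 0) continue; // skip blank or malformed lines
    fields[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return fields;
}

// Merge every currently active cue into a single state object.
// `activeCues` is anything array-like whose items have a `text` property,
// for example a TextTrack's activeCues list.
function stateFromActiveCues(activeCues) {
  const state = {};
  for (let i = 0; i < activeCues.length; i++) {
    Object.assign(state, parseCueText(activeCues[i].text));
  }
  return state;
}

// Browser wiring. Assumes `track` is a TextTrack of kind "metadata" and
// `render` is a page-supplied callback; both are illustrative.
function watchScoreboard(track, render) {
  track.mode = 'hidden'; // keep the track active without rendering its cues
  const update = () => render(stateFromActiveCues(track.activeCues));
  track.addEventListener('cuechange', update);
  update(); // paint whatever is already active
  return update;
}
```

Because each cue covers the full period during which its value holds, the set of active cues after a seek already describes the complete scoreboard, so this sketch never needs to walk earlier cues. With "red+2"-style delta cues, the same script would have to replay the cue list from the start.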