[html5] r5414 - [giow] (0) Allow authors to override WebSRT's encoding using <track charset>.
whatwg at whatwg.org
Fri Sep 3 17:14:03 PDT 2010
Author: ianh
Date: 2010-09-03 17:14:02 -0700 (Fri, 03 Sep 2010)
New Revision: 5414
Modified:
complete.html
index
source
Log:
[giow] (0) Allow authors to override WebSRT's encoding using <track charset>.
Modified: complete.html
===================================================================
--- complete.html 2010-09-03 22:27:05 UTC (rev 5413)
+++ complete.html 2010-09-04 00:14:02 UTC (rev 5414)
@@ -24060,16 +24060,18 @@
<dt>Content attributes:</dt>
<dd><a href=#global-attributes>Global attributes</a></dd>
<dd><code title=attr-track-kind><a href=#attr-track-kind>kind</a></code></dd>
- <dd><code title=attr-track-label><a href=#attr-track-label>label</a></code></dd>
<dd><code title=attr-track-src><a href=#attr-track-src>src</a></code></dd>
+ <dd><code title=attr-track-charset><a href=#attr-track-charset>charset</a></code></dd>
<dd><code title=attr-track-srclang><a href=#attr-track-srclang>srclang</a></code></dd>
+ <dd><code title=attr-track-label><a href=#attr-track-label>label</a></code></dd>
<dt>DOM interface:</dt>
<dd>
<pre class=idl>interface <dfn id=htmltrackelement>HTMLTrackElement</dfn> : <a href=#htmlelement>HTMLElement</a> {
attribute DOMString <a href=#dom-track-kind title=dom-track-kind>kind</a>;
- attribute DOMString <a href=#dom-track-label title=dom-track-label>label</a>;
attribute DOMString <a href=#dom-track-src title=dom-track-src>src</a>;
+ attribute DOMString <a href=#dom-track-charset title=dom-track-charset>charset</a>;
attribute DOMString <a href=#dom-track-srclang title=dom-track-srclang>srclang</a>;
+ attribute DOMString <a href=#dom-track-label title=dom-track-label>label</a>;
readonly attribute <a href=#timedtrack>TimedTrack</a> <a href=#dom-track-track title=dom-track-track>track</a>;
};</pre>
@@ -24126,6 +24128,14 @@
<a href=#websrt>WebSRT</a> file must be a <a href=#websrt-file-using-cue-text>WebSRT file using cue
text</a>.</p>
+ <p>If the element's <a href=#track-url>track URL</a> identifies a
+ <a href=#websrt>WebSRT</a> resource, then the <dfn id=attr-track-charset title=attr-track-charset><code>charset</code></dfn> attribute may
+ be specified. If the attribute is set, its value must be a valid
+ character encoding name, must be an <a href=#ascii-case-insensitive>ASCII
+ case-insensitive</a> match for the <a href=#preferred-mime-name>preferred MIME
+ name</a> for that encoding, and must match the character encoding
+ of the <a href=#websrt>WebSRT</a> file. <a href=#refsIANACHARSET>[IANACHARSET]</a></p>
+
<p>The <dfn id=attr-track-srclang title=attr-track-srclang><code>srclang</code></dfn>
attribute gives the language of the timed track data. The value must
be a valid BCP 47 language tag. This attribute must be present if
@@ -24180,11 +24190,11 @@
<a href=#timed-track>timed track</a>'s corresponding <code><a href=#timedtrack>TimedTrack</a></code>
object.</p>
- <p>The <dfn id=dom-track-label title=dom-track-label><code>label</code></dfn>, <dfn id=dom-track-src title=dom-track-src><code>src</code></dfn>, and <dfn id=dom-track-srclang title=dom-track-srclang><code>srclang</code></dfn> IDL attributes
- must <a href=#reflect>reflect</a> the respective content attributes of the
- same name. The <dfn id=dom-track-kind title=dom-track-kind><code>kind</code></dfn>
- IDL attributemust <a href=#reflect>reflect</a> the content attribute of the
- same name, <a href=#limited-to-only-known-values>limited to only known values</a>.</p>
+ <p>The <dfn id=dom-track-src title=dom-track-src><code>src</code></dfn>, <dfn id=dom-track-charset title=dom-track-charset><code>charset</code></dfn>, <dfn id=dom-track-srclang title=dom-track-srclang><code>srclang</code></dfn>, and <dfn id=dom-track-label title=dom-track-label><code>label</code></dfn> IDL attributes must
+ <a href=#reflect>reflect</a> the respective content attributes of the same
+ name. The <dfn id=dom-track-kind title=dom-track-kind><code>kind</code></dfn> IDL
+ attribute must <a href=#reflect>reflect</a> the content attribute of the same
+ name, <a href=#limited-to-only-known-values>limited to only known values</a>.</p>
</div>
@@ -27173,8 +27183,13 @@
unsupported (this causes the load to fail, as described below). If
a type is obtained, and represents a supported timed track format,
then the resource's data must be passed to the appropriate parser
- as it is received, with the <a href=#timed-track-list-of-cues>timed track list of cues</a>
- being used for that parser's output.</p>
+ (e.g. the <a href=#websrt-parser>WebSRT parser</a> if the <a href=#content-type title=Content-Type>Content Type metadata</a> is
+ <code><a href=#text/srt>text/srt</a></code>) as it is received, with the <a href=#timed-track-list-of-cues>timed
+ track list of cues</a> being used for that parser's output. If
+ the <code><a href=#the-track-element>track</a></code> element has a <code title=attr-track-charset><a href=#attr-track-charset>charset</a></code> attribute that specifies
+ a supported character encoding, then that encoding must be given
+ to the parser as a character encoding override. Otherwise the
+ parser must use its default character encoding behavior.</p>
<p>If the <a href=#fetch title=fetch>fetching algorithm</a> fails for
any reason (network error, the server returns an error code, a
@@ -28321,13 +28336,14 @@
<h6 id=parsing-0><span class=secno>4.8.10.11.2 </span>Parsing</h6>
- <p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream and a
+ <p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream, a
<a href=#timed-track-list-of-cues>timed track list of cues</a> <var title="">output</var>,
- must convert the bytes into a string of Unicode characters by
- interpreting them as UTF-8, and then must parse the resulting string
- according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. This
- results in <a href=#timed-track-cue title="timed track cue">timed track cues</a>
- being added to <var title="">output</var>. <a href=#refsRFC3629>[RFC3629]</a></p>
+ and optionally a character encoding override <var title="">encoding</var>, must convert the bytes into a string of
+ Unicode characters by interpreting them as the given <var title="">encoding</var>, or UTF-8 if <var title="">encoding</var> is
+ not provided, and then must parse the resulting string according to
+ the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. This results in
+ <a href=#timed-track-cue title="timed track cue">timed track cues</a> being added to
+ <var title="">output</var>. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>A <a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and
parsing steps, is typically run asynchronously, with the input byte
@@ -28339,10 +28355,11 @@
<ul class=brief><li><code><a href=#text/srt>text/srt</a></code></li>
</ul><!--<p class="note">Not all of these MIME types are valid registered
- types.</p>--><p>When converting the bytes into Unicode characters, bytes or
- sequences of bytes that are not valid UTF-8 sequences must be
- interpreted as a U+FFFD REPLACEMENT CHARACTER, and all U+0000 NULL
- characters must be replaced by U+FFFD REPLACEMENT CHARACTERs.</p>
+ types.</p>--><p>When converting the bytes into Unicode characters, if the
+ encoding used is UTF-8, bytes or sequences of bytes that are not
+ valid UTF-8 sequences must be interpreted as a U+FFFD REPLACEMENT
+ CHARACTER, and all U+0000 NULL characters must be replaced by U+FFFD
+ REPLACEMENT CHARACTERs.</p>
<p>The <dfn id=websrt-parser-algorithm>WebSRT parser algorithm</dfn> is as follows:</p>
@@ -89207,7 +89224,7 @@
<dt>Optional parameters:</dt>
<dd>No parameters</dd>
<dt>Encoding considerations:</dt>
- <dd>Always UTF-8.</dd>
+ <dd>Must always be UTF-8.</dd>
<dt>Security considerations:</dt>
<dd>
<p>Timed track files themselves pose no immediate risk unless
@@ -89220,8 +89237,9 @@
</dd>
<dt>Interoperability considerations:</dt>
<dd>
- Rules for processing both conforming and non-conforming content
- are defined in this specification.
+ <p>Rules for processing both conforming and non-conforming content
+ are defined in this specification.</p>
+ <p>Some legacy files violate the requirement to use UTF-8.</p>
</dd>
<dt>Published specification:</dt>
<dd>
Modified: index
===================================================================
--- index 2010-09-03 22:27:05 UTC (rev 5413)
+++ index 2010-09-04 00:14:02 UTC (rev 5414)
@@ -24040,16 +24040,18 @@
<dt>Content attributes:</dt>
<dd><a href=#global-attributes>Global attributes</a></dd>
<dd><code title=attr-track-kind><a href=#attr-track-kind>kind</a></code></dd>
- <dd><code title=attr-track-label><a href=#attr-track-label>label</a></code></dd>
<dd><code title=attr-track-src><a href=#attr-track-src>src</a></code></dd>
+ <dd><code title=attr-track-charset><a href=#attr-track-charset>charset</a></code></dd>
<dd><code title=attr-track-srclang><a href=#attr-track-srclang>srclang</a></code></dd>
+ <dd><code title=attr-track-label><a href=#attr-track-label>label</a></code></dd>
<dt>DOM interface:</dt>
<dd>
<pre class=idl>interface <dfn id=htmltrackelement>HTMLTrackElement</dfn> : <a href=#htmlelement>HTMLElement</a> {
attribute DOMString <a href=#dom-track-kind title=dom-track-kind>kind</a>;
- attribute DOMString <a href=#dom-track-label title=dom-track-label>label</a>;
attribute DOMString <a href=#dom-track-src title=dom-track-src>src</a>;
+ attribute DOMString <a href=#dom-track-charset title=dom-track-charset>charset</a>;
attribute DOMString <a href=#dom-track-srclang title=dom-track-srclang>srclang</a>;
+ attribute DOMString <a href=#dom-track-label title=dom-track-label>label</a>;
readonly attribute <a href=#timedtrack>TimedTrack</a> <a href=#dom-track-track title=dom-track-track>track</a>;
};</pre>
@@ -24106,6 +24108,14 @@
<a href=#websrt>WebSRT</a> file must be a <a href=#websrt-file-using-cue-text>WebSRT file using cue
text</a>.</p>
+ <p>If the element's <a href=#track-url>track URL</a> identifies a
+ <a href=#websrt>WebSRT</a> resource, then the <dfn id=attr-track-charset title=attr-track-charset><code>charset</code></dfn> attribute may
+ be specified. If the attribute is set, its value must be a valid
+ character encoding name, must be an <a href=#ascii-case-insensitive>ASCII
+ case-insensitive</a> match for the <a href=#preferred-mime-name>preferred MIME
+ name</a> for that encoding, and must match the character encoding
+ of the <a href=#websrt>WebSRT</a> file. <a href=#refsIANACHARSET>[IANACHARSET]</a></p>
+
<p>The <dfn id=attr-track-srclang title=attr-track-srclang><code>srclang</code></dfn>
attribute gives the language of the timed track data. The value must
be a valid BCP 47 language tag. This attribute must be present if
@@ -24160,11 +24170,11 @@
<a href=#timed-track>timed track</a>'s corresponding <code><a href=#timedtrack>TimedTrack</a></code>
object.</p>
- <p>The <dfn id=dom-track-label title=dom-track-label><code>label</code></dfn>, <dfn id=dom-track-src title=dom-track-src><code>src</code></dfn>, and <dfn id=dom-track-srclang title=dom-track-srclang><code>srclang</code></dfn> IDL attributes
- must <a href=#reflect>reflect</a> the respective content attributes of the
- same name. The <dfn id=dom-track-kind title=dom-track-kind><code>kind</code></dfn>
- IDL attributemust <a href=#reflect>reflect</a> the content attribute of the
- same name, <a href=#limited-to-only-known-values>limited to only known values</a>.</p>
+ <p>The <dfn id=dom-track-src title=dom-track-src><code>src</code></dfn>, <dfn id=dom-track-charset title=dom-track-charset><code>charset</code></dfn>, <dfn id=dom-track-srclang title=dom-track-srclang><code>srclang</code></dfn>, and <dfn id=dom-track-label title=dom-track-label><code>label</code></dfn> IDL attributes must
+ <a href=#reflect>reflect</a> the respective content attributes of the same
+ name. The <dfn id=dom-track-kind title=dom-track-kind><code>kind</code></dfn> IDL
+ attribute must <a href=#reflect>reflect</a> the content attribute of the same
+ name, <a href=#limited-to-only-known-values>limited to only known values</a>.</p>
</div>
@@ -27153,8 +27163,13 @@
unsupported (this causes the load to fail, as described below). If
a type is obtained, and represents a supported timed track format,
then the resource's data must be passed to the appropriate parser
- as it is received, with the <a href=#timed-track-list-of-cues>timed track list of cues</a>
- being used for that parser's output.</p>
+ (e.g. the <a href=#websrt-parser>WebSRT parser</a> if the <a href=#content-type title=Content-Type>Content Type metadata</a> is
+ <code><a href=#text/srt>text/srt</a></code>) as it is received, with the <a href=#timed-track-list-of-cues>timed
+ track list of cues</a> being used for that parser's output. If
+ the <code><a href=#the-track-element>track</a></code> element has a <code title=attr-track-charset><a href=#attr-track-charset>charset</a></code> attribute that specifies
+ a supported character encoding, then that encoding must be given
+ to the parser as a character encoding override. Otherwise the
+ parser must use its default character encoding behavior.</p>
<p>If the <a href=#fetch title=fetch>fetching algorithm</a> fails for
any reason (network error, the server returns an error code, a
@@ -28301,13 +28316,14 @@
<h6 id=parsing-0><span class=secno>4.8.10.11.2 </span>Parsing</h6>
- <p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream and a
+ <p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream, a
<a href=#timed-track-list-of-cues>timed track list of cues</a> <var title="">output</var>,
- must convert the bytes into a string of Unicode characters by
- interpreting them as UTF-8, and then must parse the resulting string
- according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. This
- results in <a href=#timed-track-cue title="timed track cue">timed track cues</a>
- being added to <var title="">output</var>. <a href=#refsRFC3629>[RFC3629]</a></p>
+ and optionally a character encoding override <var title="">encoding</var>, must convert the bytes into a string of
+ Unicode characters by interpreting them as the given <var title="">encoding</var>, or UTF-8 if <var title="">encoding</var> is
+ not provided, and then must parse the resulting string according to
+ the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. This results in
+ <a href=#timed-track-cue title="timed track cue">timed track cues</a> being added to
+ <var title="">output</var>. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>A <a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and
parsing steps, is typically run asynchronously, with the input byte
@@ -28319,10 +28335,11 @@
<ul class=brief><li><code><a href=#text/srt>text/srt</a></code></li>
</ul><!--<p class="note">Not all of these MIME types are valid registered
- types.</p>--><p>When converting the bytes into Unicode characters, bytes or
- sequences of bytes that are not valid UTF-8 sequences must be
- interpreted as a U+FFFD REPLACEMENT CHARACTER, and all U+0000 NULL
- characters must be replaced by U+FFFD REPLACEMENT CHARACTERs.</p>
+ types.</p>--><p>When converting the bytes into Unicode characters, if the
+ encoding used is UTF-8, bytes or sequences of bytes that are not
+ valid UTF-8 sequences must be interpreted as a U+FFFD REPLACEMENT
+ CHARACTER, and all U+0000 NULL characters must be replaced by U+FFFD
+ REPLACEMENT CHARACTERs.</p>
<p>The <dfn id=websrt-parser-algorithm>WebSRT parser algorithm</dfn> is as follows:</p>
@@ -85122,7 +85139,7 @@
<dt>Optional parameters:</dt>
<dd>No parameters</dd>
<dt>Encoding considerations:</dt>
- <dd>Always UTF-8.</dd>
+ <dd>Must always be UTF-8.</dd>
<dt>Security considerations:</dt>
<dd>
<p>Timed track files themselves pose no immediate risk unless
@@ -85135,8 +85152,9 @@
</dd>
<dt>Interoperability considerations:</dt>
<dd>
- Rules for processing both conforming and non-conforming content
- are defined in this specification.
+ <p>Rules for processing both conforming and non-conforming content
+ are defined in this specification.</p>
+ <p>Some legacy files violate the requirement to use UTF-8.</p>
</dd>
<dt>Published specification:</dt>
<dd>
Modified: source
===================================================================
--- source 2010-09-03 22:27:05 UTC (rev 5413)
+++ source 2010-09-04 00:14:02 UTC (rev 5414)
@@ -25825,16 +25825,18 @@
<dt>Content attributes:</dt>
<dd><span>Global attributes</span></dd>
<dd><code title="attr-track-kind">kind</code></dd>
- <dd><code title="attr-track-label">label</code></dd>
<dd><code title="attr-track-src">src</code></dd>
+ <dd><code title="attr-track-charset">charset</code></dd>
<dd><code title="attr-track-srclang">srclang</code></dd>
+ <dd><code title="attr-track-label">label</code></dd>
<dt>DOM interface:</dt>
<dd>
<pre class="idl">interface <dfn>HTMLTrackElement</dfn> : <span>HTMLElement</span> {
attribute DOMString <span title="dom-track-kind">kind</span>;
- attribute DOMString <span title="dom-track-label">label</span>;
attribute DOMString <span title="dom-track-src">src</span>;
+ attribute DOMString <span title="dom-track-charset">charset</span>;
attribute DOMString <span title="dom-track-srclang">srclang</span>;
+ attribute DOMString <span title="dom-track-label">label</span>;
readonly attribute <span>TimedTrack</span> <span title="dom-track-track">track</span>;
};</pre>
@@ -25908,6 +25910,16 @@
<span>WebSRT</span> file must be a <span>WebSRT file using cue
text</span>.</p>
+ <p>If the element's <span>track URL</span> identifies a
+ <span>WebSRT</span> resource, then the <dfn
+ title="attr-track-charset"><code>charset</code></dfn> attribute may
+ be specified. If the attribute is set, its value must be a valid
+ character encoding name, must be an <span>ASCII
+ case-insensitive</span> match for the <span>preferred MIME
+ name</span> for that encoding, and must match the character encoding
+ of the <span>WebSRT</span> file. <a
+ href="#refsIANACHARSET">[IANACHARSET]</a></p>
+
<p>The <dfn title="attr-track-srclang"><code>srclang</code></dfn>
attribute gives the language of the timed track data. The value must
be a valid BCP 47 language tag. This attribute must be present if
@@ -25968,13 +25980,14 @@
<span>timed track</span>'s corresponding <code>TimedTrack</code>
object.</p>
- <p>The <dfn title="dom-track-label"><code>label</code></dfn>, <dfn
- title="dom-track-src"><code>src</code></dfn>, and <dfn
- title="dom-track-srclang"><code>srclang</code></dfn> IDL attributes
- must <span>reflect</span> the respective content attributes of the
- same name. The <dfn title="dom-track-kind"><code>kind</code></dfn>
- IDL attributemust <span>reflect</span> the content attribute of the
- same name, <span>limited to only known values</span>.</p>
+ <p>The <dfn title="dom-track-src"><code>src</code></dfn>, <dfn
+ title="dom-track-charset"><code>charset</code></dfn>, <dfn
+ title="dom-track-srclang"><code>srclang</code></dfn>, and <dfn
+ title="dom-track-label"><code>label</code></dfn> IDL attributes must
+ <span>reflect</span> the respective content attributes of the same
+ name. The <dfn title="dom-track-kind"><code>kind</code></dfn> IDL
+ attribute must <span>reflect</span> the content attribute of the same
+ name, <span>limited to only known values</span>.</p>
</div>
@@ -29499,8 +29512,15 @@
unsupported (this causes the load to fail, as described below). If
a type is obtained, and represents a supported timed track format,
then the resource's data must be passed to the appropriate parser
- as it is received, with the <span>timed track list of cues</span>
- being used for that parser's output.</p>
+ (e.g. the <span>WebSRT parser</span> if the <span
+ title="Content-Type">Content Type metadata</span> is
+ <code>text/srt</code>) as it is received, with the <span>timed
+ track list of cues</span> being used for that parser's output. If
+ the <code>track</code> element has a <code
+ title="attr-track-charset">charset</code> attribute that specifies
+ a supported character encoding, then that encoding must be given
+ to the parser as a character encoding override. Otherwise the
+ parser must use its default character encoding behavior.</p>
<p>If the <span title="fetch">fetching algorithm</span> fails for
any reason (network error, the server returns an error code, a
@@ -30858,13 +30878,16 @@
<h6>Parsing</h6>
- <p>A <dfn>WebSRT parser</dfn>, given an input byte stream and a
+ <p>A <dfn>WebSRT parser</dfn>, given an input byte stream, a
<span>timed track list of cues</span> <var title="">output</var>,
- must convert the bytes into a string of Unicode characters by
- interpreting them as UTF-8, and then must parse the resulting string
- according to the <span>WebSRT parser algorithm</span> below. This
- results in <span title="timed track cue">timed track cues</span>
- being added to <var title="">output</var>. <a href="#refsRFC3629">[RFC3629]</a></p>
+ and optionally a character encoding override <var
+ title="">encoding</var>, must convert the bytes into a string of
+ Unicode characters by interpreting them as the given <var
+ title="">encoding</var>, or UTF-8 if <var title="">encoding</var> is
+ not provided, and then must parse the resulting string according to
+ the <span>WebSRT parser algorithm</span> below. This results in
+ <span title="timed track cue">timed track cues</span> being added to
+ <var title="">output</var>. <a href="#refsRFC3629">[RFC3629]</a></p>
<p>A <span>WebSRT parser</span>, specifically its conversion and
parsing steps, is typically run asynchronously, with the input byte
@@ -30881,10 +30904,11 @@
<!--<p class="note">Not all of these MIME types are valid registered
types.</p>-->
- <p>When converting the bytes into Unicode characters, bytes or
- sequences of bytes that are not valid UTF-8 sequences must be
- interpreted as a U+FFFD REPLACEMENT CHARACTER, and all U+0000 NULL
- characters must be replaced by U+FFFD REPLACEMENT CHARACTERs.</p>
+ <p>When converting the bytes into Unicode characters, if the
+ encoding used is UTF-8, bytes or sequences of bytes that are not
+ valid UTF-8 sequences must be interpreted as a U+FFFD REPLACEMENT
+ CHARACTER, and all U+0000 NULL characters must be replaced by U+FFFD
+ REPLACEMENT CHARACTERs.</p>
<p>The <dfn>WebSRT parser algorithm</dfn> is as follows:</p>
@@ -101923,7 +101947,7 @@
<dt>Optional parameters:</dt>
<dd>No parameters</dd>
<dt>Encoding considerations:</dt>
- <dd>Always UTF-8.</dd>
+ <dd>Must always be UTF-8.</dd>
<dt>Security considerations:</dt>
<dd>
<p>Timed track files themselves pose no immediate risk unless
@@ -101936,8 +101960,9 @@
</dd>
<dt>Interoperability considerations:</dt>
<dd>
- Rules for processing both conforming and non-conforming content
- are defined in this specification.
+ <p>Rules for processing both conforming and non-conforming content
+ are defined in this specification.</p>
+ <p>Some legacy files violate the requirement to use UTF-8.</p>
</dd>
<dt>Published specification:</dt>
<dd>