[html5] r5258 - [e] (0) Some more references to UTF-8.
whatwg at whatwg.org
Mon Aug 9 18:16:12 PDT 2010
Author: ianh
Date: 2010-08-09 18:16:10 -0700 (Mon, 09 Aug 2010)
New Revision: 5258
Modified:
complete.html
index
source
Log:
[e] (0) Some more references to UTF-8.
Modified: complete.html
===================================================================
--- complete.html 2010-08-10 00:58:32 UTC (rev 5257)
+++ complete.html 2010-08-10 01:16:10 UTC (rev 5258)
@@ -13408,12 +13408,12 @@
<a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>.</p>
<p>Authors are encouraged to use UTF-8. Conformance checkers may
- advise authors against using legacy encodings.</p>
+ advise authors against using legacy encodings. <a href=#refsRFC3629>[RFC3629]</a></p>
<div class=impl>
<p>Authoring tools should default to using UTF-8 for newly-created
- documents.</p>
+ documents. <a href=#refsRFC3629>[RFC3629]</a></p>
</div>
@@ -27759,7 +27759,7 @@
<p>A <dfn id=websrt-file>WebSRT file</dfn> must consist of a <a href=#websrt-file-body>WebSRT file
body</a> encoded as UTF-8 and labeled with the <a href=#mime-type>MIME
- type</a> <code><a href=#text/srt>text/srt</a></code>.</p>
+ type</a> <code><a href=#text/srt>text/srt</a></code>. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>A <dfn id=websrt-file-body>WebSRT file body</dfn> consists of zero or more <a href=#websrt-line-terminator title="WebSRT line terminator">WebSRT line terminators</a>,
followed by zero or more <a href=#websrt-cue title="WebSRT cue">WebSRT cues</a>
@@ -28027,7 +28027,7 @@
interpreting them as UTF-8, and then must parse the resulting string
according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. This
results in <a href=#timed-track-cue title="timed track cue">timed track cues</a>
- being added to <var title="">output</var>.</p>
+ being added to <var title="">output</var>. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>A <a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and
parsing steps, is typically run asynchronously, with the input byte
@@ -61630,7 +61630,7 @@
encoded using UTF-8. Data in application cache manifests is
line-based. Newlines must be represented by U+000A LINE FEED (LF)
characters, U+000D CARRIAGE RETURN (CR) characters, or U+000D
- CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs.</p>
+ CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs. <a href=#refsRFC3629>[RFC3629]</a></p>
<p class=note>This is a <a href=#willful-violation>willful violation</a> of two
aspects of RFC 2046, which requires all <code title="">text/*</code>
@@ -61790,7 +61790,7 @@
a U+FFFD REPLACEMENT CHARACTER. <!--All U+0000 NULL characters must
be replaced by U+FFFD REPLACEMENT CHARACTERs. (this isn't black-box
testable since neither U+0000 nor U+FFFD are valid anywhere in the
- syntax and thus both will be treated the same anyway)--></li>
+ syntax and thus both will be treated the same anyway)--> <a href=#refsRFC3629>[RFC3629]</a></li>
<li><p>Let <var title="">base URL</var> be the <a href=#absolute-url>absolute
URL</a> representing the manifest.</li>
@@ -70765,7 +70765,7 @@
steps.</p>
<p>If the attempt succeeds, then convert the script resource to
- Unicode by assuming it was encoded as UTF-8, to obtain its <var title="">source</var>.</p>
+ Unicode by assuming it was encoded as UTF-8, to obtain its <var title="">source</var>. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>Let <var title="">language</var> be JavaScript.</p>
@@ -71510,7 +71510,7 @@
steps.</p>
<p>If the attempt succeeds, then convert the script resource to
- Unicode by assuming it was encoded as UTF-8, to obtain its <var title="">source</var>.</p>
+ Unicode by assuming it was encoded as UTF-8, to obtain its <var title="">source</var>. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>Let <var title="">language</var> be JavaScript.</p>
@@ -72091,7 +72091,7 @@
; a Unicode character other than U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR)</pre>
<p>Event streams in this format must always be encoded as
- UTF-8.</p>
+ UTF-8. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>Lines must be separated by either a U+000D CARRIAGE RETURN U+000A
LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF)
@@ -72107,8 +72107,9 @@
<h4 id=event-stream-interpretation><span class=secno>10.2.5 </span>Interpreting an event stream</h4>
- <p>Bytes or sequences of bytes that are not valid UTF-8 sequences
- must be interpreted as the U+FFFD REPLACEMENT CHARACTER.</p>
+ <p>Streams must be decoded as UTF-8 text. Bytes or sequences of
+ bytes that are not valid UTF-8 sequences must be interpreted as the
+ U+FFFD REPLACEMENT CHARACTER. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>One leading U+FEFF BYTE ORDER MARK character must be ignored if
any are present.</p>
@@ -78703,7 +78704,7 @@
<h5 id=character-encodings-0><span class=secno>12.2.2.2 </span>Character encodings</h5>
<p>User agents must at a minimum support the UTF-8 and Windows-1252
- encodings, but may support more.</p>
+ encodings, but may support more. <a href=#refsRFC3629>[RFC3629]</a> <a href=#refsWIN1252>[WIN1252]</a></p>
<p class=note>It is not unusual for Web browsers to support dozens
if not upwards of a hundred distinct character encodings.</p>
Modified: index
===================================================================
--- index 2010-08-10 00:58:32 UTC (rev 5257)
+++ index 2010-08-10 01:16:10 UTC (rev 5258)
@@ -13332,12 +13332,12 @@
<a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>.</p>
<p>Authors are encouraged to use UTF-8. Conformance checkers may
- advise authors against using legacy encodings.</p>
+ advise authors against using legacy encodings. <a href=#refsRFC3629>[RFC3629]</a></p>
<div class=impl>
<p>Authoring tools should default to using UTF-8 for newly-created
- documents.</p>
+ documents. <a href=#refsRFC3629>[RFC3629]</a></p>
</div>
@@ -27686,7 +27686,7 @@
<p>A <dfn id=websrt-file>WebSRT file</dfn> must consist of a <a href=#websrt-file-body>WebSRT file
body</a> encoded as UTF-8 and labeled with the <a href=#mime-type>MIME
- type</a> <code><a href=#text/srt>text/srt</a></code>.</p>
+ type</a> <code><a href=#text/srt>text/srt</a></code>. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>A <dfn id=websrt-file-body>WebSRT file body</dfn> consists of zero or more <a href=#websrt-line-terminator title="WebSRT line terminator">WebSRT line terminators</a>,
followed by zero or more <a href=#websrt-cue title="WebSRT cue">WebSRT cues</a>
@@ -27954,7 +27954,7 @@
interpreting them as UTF-8, and then must parse the resulting string
according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. This
results in <a href=#timed-track-cue title="timed track cue">timed track cues</a>
- being added to <var title="">output</var>.</p>
+ being added to <var title="">output</var>. <a href=#refsRFC3629>[RFC3629]</a></p>
<p>A <a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and
parsing steps, is typically run asynchronously, with the input byte
@@ -61566,7 +61566,7 @@
encoded using UTF-8. Data in application cache manifests is
line-based. Newlines must be represented by U+000A LINE FEED (LF)
characters, U+000D CARRIAGE RETURN (CR) characters, or U+000D
- CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs.</p>
+ CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs. <a href=#refsRFC3629>[RFC3629]</a></p>
<p class=note>This is a <a href=#willful-violation>willful violation</a> of two
aspects of RFC 2046, which requires all <code title="">text/*</code>
@@ -61726,7 +61726,7 @@
a U+FFFD REPLACEMENT CHARACTER. <!--All U+0000 NULL characters must
be replaced by U+FFFD REPLACEMENT CHARACTERs. (this isn't black-box
testable since neither U+0000 nor U+FFFD are valid anywhere in the
- syntax and thus both will be treated the same anyway)--></li>
+ syntax and thus both will be treated the same anyway)--> <a href=#refsRFC3629>[RFC3629]</a></li>
<li><p>Let <var title="">base URL</var> be the <a href=#absolute-url>absolute
URL</a> representing the manifest.</li>
@@ -71814,7 +71814,7 @@
<h5 id=character-encodings-0><span class=secno>10.2.2.2 </span>Character encodings</h5>
<p>User agents must at a minimum support the UTF-8 and Windows-1252
- encodings, but may support more.</p>
+ encodings, but may support more. <a href=#refsRFC3629>[RFC3629]</a> <a href=#refsWIN1252>[WIN1252]</a></p>
<p class=note>It is not unusual for Web browsers to support dozens
if not upwards of a hundred distinct character encodings.</p>
Modified: source
===================================================================
--- source 2010-08-10 00:58:32 UTC (rev 5257)
+++ source 2010-08-10 01:16:10 UTC (rev 5258)
@@ -14071,12 +14071,13 @@
<span>ASCII-compatible character encoding</span>.</p>
<p>Authors are encouraged to use UTF-8. Conformance checkers may
- advise authors against using legacy encodings.</p>
+ advise authors against using legacy encodings. <a
+ href="#refsRFC3629">[RFC3629]</a></p>
<div class="impl">
<p>Authoring tools should default to using UTF-8 for newly-created
- documents.</p>
+ documents. <a href="#refsRFC3629">[RFC3629]</a></p>
</div>
@@ -30126,7 +30127,7 @@
<p>A <dfn>WebSRT file</dfn> must consist of a <span>WebSRT file
body</span> encoded as UTF-8 and labeled with the <span>MIME
- type</span> <code>text/srt</code>.</p>
+ type</span> <code>text/srt</code>. <a href="#refsRFC3629">[RFC3629]</a></p>
<p>A <dfn>WebSRT file body</dfn> consists of zero or more <span
title="WebSRT line terminator">WebSRT line terminators</span>,
@@ -30474,7 +30475,7 @@
interpreting them as UTF-8, and then must parse the resulting string
according to the <span>WebSRT parser algorithm</span> below. This
results in <span title="timed track cue">timed track cues</span>
- being added to <var title="">output</var>.</p>
+ being added to <var title="">output</var>. <a href="#refsRFC3629">[RFC3629]</a></p>
<p>A <span>WebSRT parser</span>, specifically its conversion and
parsing steps, is typically run asynchronously, with the input byte
@@ -69633,7 +69634,7 @@
encoded using UTF-8. Data in application cache manifests is
line-based. Newlines must be represented by U+000A LINE FEED (LF)
characters, U+000D CARRIAGE RETURN (CR) characters, or U+000D
- CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs.</p>
+ CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs. <a href="#refsRFC3629">[RFC3629]</a></p>
<p class="note">This is a <span>willful violation</span> of two
aspects of RFC 2046, which requires all <code title="">text/*</code>
@@ -69816,7 +69817,7 @@
a U+FFFD REPLACEMENT CHARACTER. <!--All U+0000 NULL characters must
be replaced by U+FFFD REPLACEMENT CHARACTERs. (this isn't black-box
testable since neither U+0000 nor U+FFFD are valid anywhere in the
- syntax and thus both will be treated the same anyway)--></p></li>
+ syntax and thus both will be treated the same anyway)--> <a href="#refsRFC3629">[RFC3629]</a></p></li>
<li><p>Let <var title="">base URL</var> be the <span>absolute
URL</span> representing the manifest.</p></li>
@@ -79552,7 +79553,7 @@
<p>If the attempt succeeds, then convert the script resource to
Unicode by assuming it was encoded as UTF-8, to obtain its <var
- title="">source</var>.</p>
+ title="">source</var>. <a href="#refsRFC3629">[RFC3629]</a></p>
<p>Let <var title="">language</var> be JavaScript.</p>
@@ -80425,7 +80426,7 @@
<p>If the attempt succeeds, then convert the script resource to
Unicode by assuming it was encoded as UTF-8, to obtain its <var
- title="">source</var>.</p>
+ title="">source</var>. <a href="#refsRFC3629">[RFC3629]</a></p>
<p>Let <var title="">language</var> be JavaScript.</p>
@@ -81105,7 +81106,7 @@
; a Unicode character other than U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR)</pre>
<p>Event streams in this format must always be encoded as
- UTF-8.</p>
+ UTF-8. <a href="#refsRFC3629">[RFC3629]</a></p>
<p>Lines must be separated by either a U+000D CARRIAGE RETURN U+000A
LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF)
@@ -81121,8 +81122,9 @@
<h4 id="event-stream-interpretation">Interpreting an event stream</h4>
- <p>Bytes or sequences of bytes that are not valid UTF-8 sequences
- must be interpreted as the U+FFFD REPLACEMENT CHARACTER.</p>
+ <p>Streams must be decoded as UTF-8 text. Bytes or sequences of
+ bytes that are not valid UTF-8 sequences must be interpreted as the
+ U+FFFD REPLACEMENT CHARACTER. <a href="#refsRFC3629">[RFC3629]</a></p>
<p>One leading U+FEFF BYTE ORDER MARK character must be ignored if
any are present.</p>
@@ -89841,7 +89843,9 @@
<h5>Character encodings</h5>
<p>User agents must at a minimum support the UTF-8 and Windows-1252
- encodings, but may support more.</p>
+ encodings, but may support more. <a
+ href="#refsRFC3629">[RFC3629]</a> <a
+ href="#refsWIN1252">[WIN1252]</a></p>
<p class="note">It is not unusual for Web browsers to support dozens
if not upwards of a hundred distinct character encodings.</p>