[html5] r1275 - /
whatwg at whatwg.org
Thu Feb 28 17:22:52 PST 2008
Author: ianh
Date: 2008-02-28 17:22:51 -0800 (Thu, 28 Feb 2008)
New Revision: 1275
Modified:
index
source
Log:
[ac] (1) 'character encoding declaration' is now a cross-reference term; made the content='' attribute of <meta> case-insensitive for charset decls. switched utf-8 and win1252 defaults around. other minor editorial jiggling.
Modified: index
===================================================================
--- index 2008-02-28 23:26:32 UTC (rev 1274)
+++ index 2008-02-29 01:22:51 UTC (rev 1275)
@@ -24,7 +24,7 @@
<h1 id=html-5>HTML 5</h1>
- <h2 class="no-num no-toc" id=working>Working Draft — 28 February
+ <h2 class="no-num no-toc" id=working>Working Draft — 29 February
2008</h2>
<p>You can take part in this work. <a
@@ -7394,10 +7394,10 @@
document-level metadata with the <code title=attr-meta-name><a
href="#name">name</a></code> attribute, pragma directives with the <code
title=attr-meta-http-equiv><a href="#http-equiv0">http-equiv</a></code>
- attribute, and the file's character encoding declaration when an HTML
- document is serialised to string form (e.g. for transmission over the
- network or for disk storage) with the <code title=attr-meta-charset><a
- href="#charset0">charset</a></code> attribute.
+ attribute, and the file's <a href="#character1">character encoding
+ declaration</a> when an HTML document is serialised to string form (e.g.
+ for transmission over the network or for disk storage) with the <code
+ title=attr-meta-charset><a href="#charset0">charset</a></code> attribute.
<p>Exactly one of the <code title=attr-meta-name><a
href="#name">name</a></code>, <code title=attr-meta-http-equiv><a
@@ -7411,6 +7411,10 @@
title=attr-meta-content><a href="#content0">content</a></code> attribute
must also be specified. Otherwise, it must be omitted.
+ <p>The <dfn id=charset0 title=attr-meta-charset><code>charset</code></dfn>
+ attribute specifies the character encoding used by the document. This is
+ called a <a href="#character1">character encoding declaration</a>.
+
<p>The <code title=attr-meta-charset><a href="#charset0">charset</a></code>
attribute may be specified in <a href="#html5" title=HTML5>HTML
documents</a> only, it must not be used in <a href="#xhtml5"
@@ -7685,18 +7689,20 @@
title=attr-meta-http-equiv-content-type>Encoding declaration state's</a>
user agent requirements are all handled by the parsing section of the
specification. The state is just an alternative form of setting the
- <code title=meta-charset>charset</code> attribute: it is <a
- href="#charset">a character encoding declaration</a>.</p>
+ <code title=meta-charset>charset</code> attribute: it is a <a
+ href="#character1">character encoding declaration</a>.</p>
<p>For <code><a href="#meta0">meta</a></code> elements in the <a
href="#encoding" title=attr-meta-http-equiv-content-type>Encoding
declaraton state</a>, the <code title=attr-meta-content><a
- href="#content0">content</a></code> attribute must have a value
- consisting of the literal string "<code title="">text/html;</code>",
- optionally followed by a single U+0020 SPACE character, followed by the
- literal string "<code title="">charset=</code>", followed by the
- character encoding name of <a href="#charset">the character encoding
- declaration</a>.</p>
+ href="#content0">content</a></code> attribute must have a value that is
+ a case-insensitive<!-- ASCII
+ XXX--> match of a string that consists
+ of the literal string "<code title="">text/html;</code>", optionally
+ followed by any number of <a href="#space" title="space character">space
+ characters</a>, followed by the literal string "<code
+ title="">charset=</code>", followed by the character encoding name of <a
+ href="#charset">the character encoding declaration</a>.</p>
<p>If the document contains a <code><a href="#meta0">meta</a></code>
element in the <a href="#encoding"
@@ -7884,16 +7890,14 @@
<h5 id=charset><span class=secno>3.7.5.4. </span>Specifying the document's
character encoding</h5>
-
- <p>The <code><a href="#meta0">meta</a></code> element may also be used to
- provide UAs with character encoding information for <a href="#html5"
- title=HTML5>HTML</a> files, by setting the <dfn id=charset0
- title=attr-meta-charset><code>charset</code></dfn> attribute to the name
- of a character encoding. This is called a character encoding declaration.</p>
<!-- XXX maybe the rest should move to "writing html" section,
though if we do then we have to duplicate the requirements in the
parsing section for conformance checkers -->
+ <p>A <dfn id=character1>character encoding declaration</dfn> is a mechanism
+ by which the character encoding used to store or transmit a document is
+ specified.
+
<p>The following restrictions apply to character encoding declarations:
<ul>
@@ -7905,8 +7909,8 @@
href="#refsIANACHARSET">[IANACHARSET]</a> <!-- XXX
http://www.iana.org/assignments/character-sets -->
- <li>The attribute value must be serialised without the use of character
- entity references of any kind.
+ <li>The encoding name must be serialised without the use of character
+ entity references or character escapes of any kind.
</ul>
<p>If the document does not start with a BOM, and if its encoding is not
@@ -37158,9 +37162,9 @@
<p>The various types of content mentioned above are described in the next
few sections.
- <p>In addition, there are some restrictions on how <a
- href="#charset">character encoding declarations</a> are to be serialised,
- as discussed in the section on that topic.
+ <p>In addition, there are some restrictions on how <span>character encoding
+ declarations</span> are to be serialised, as discussed in the section on
+ that topic.
<p>The U+0000 NULL character, control characters other than the <a
href="#space" title="space character">space characters</a>, and characters
@@ -37302,14 +37306,14 @@
described below.
<p>RCDATA elements can have <a href="#text1" title=syntax-text>text</a> and
- <a href="#character1" title=syntax-entities>character entity
+ <a href="#character2" title=syntax-entities>character entity
references</a>, but the text must not contain an <a href="#ambiguous"
title=syntax-ambiguous-ampersand>ambiguous ampersand</a>. There are also
<a href="#cdata-rcdata-restrictions">further restrictions</a> described
below.
<p>Normal elements can have <a href="#text1" title=syntax-text>text</a>, <a
- href="#character1" title=syntax-entities>character entity references</a>,
+ href="#character2" title=syntax-entities>character entity references</a>,
other <a href="#elements2" title=syntax-elements>elements</a>, and <a
href="#comments0" title=syntax-comments>comments</a>, but the text must
not contain the character U+003C LESS-THAN SIGN (<code>&lt;</code>) or an
@@ -37405,7 +37409,7 @@
<p><dfn id=attribute0 title=syntax-attribute-value>Attribute values</dfn>
are a mixture of <a href="#text1" title=syntax-text>text</a> and <a
- href="#character1" title=syntax-entities>character entity references</a>,
+ href="#character2" title=syntax-entities>character entity references</a>,
except with the additional restriction that the text cannot contain an <a
href="#ambiguous" title=syntax-ambiguous-ampersand>ambiguous
ampersand</a>.
@@ -37732,7 +37736,7 @@
<p>An <dfn id=escaping title=syntax-escape>escaping text span</dfn> is a
span of <a href="#text1" title=syntax-text>text</a> (in CDATA and RCDATA
- elements) and <a href="#character1" title=syntax-entities>character entity
+ elements) and <a href="#character2" title=syntax-entities>character entity
references</a> (in RCDATA elements) that starts with an <a
href="#escaping0" title=syntax-escape-start>escaping text span start</a>
that is not itself in an <a href="#escaping" title=syntax-escape>escaping
@@ -37780,7 +37784,7 @@
<h4 id=character><span class=secno>8.1.4 </span>Character entity references</h4>
<p>In certain cases described in other sections, <a href="#text1"
- title=syntax-text>text</a> may be mixed with <dfn id=character1
+ title=syntax-text>text</a> may be mixed with <dfn id=character2
title=syntax-entities>character entity references</dfn>. These can be used
to escape characters that couldn't otherwise legally be included in <a
href="#text1" title=syntax-text>text</a>.
@@ -38369,13 +38373,13 @@
<li>
<p>Otherwise, return an implementation-defined or user-specified default
character encoding, with the <a href="#confidence"
- title=concept-encoding-confidence>confidence</a> <i>tentative</i>. Due
- to its use in legacy content, <code title="">windows-1252</code> is
- recommended as a default in predominantly Western demographics. In
+ title=concept-encoding-confidence>confidence</a> <i>tentative</i>. In
non-legacy environments, the more comprehensive <code
- title="">UTF-8</code> encoding is recommended instead. Since these
- encodings can in many cases be distinguished by inspection, a user agent
- may heuristically decide which to use as a default.
+ title="">UTF-8</code> encoding is recommended. Due to its use in legacy
+ content, <code title="">windows-1252</code> is recommended as a default
+ in predominantly Western demographics instead. Since these encodings can
+ in many cases be distinguished by inspection, a user agent may
+ heuristically decide which to use as a default.
</ul>
<h5 id=character0><span class=secno>8.2.2.2. </span>Character encoding
Modified: source
===================================================================
--- source 2008-02-28 23:26:32 UTC (rev 1274)
+++ source 2008-02-29 01:22:51 UTC (rev 1275)
@@ -5862,9 +5862,9 @@
metadata with the <code title="attr-meta-name">name</code>
attribute, pragma directives with the <code
title="attr-meta-http-equiv">http-equiv</code> attribute, and the
- file's character encoding declaration when an HTML document is
- serialised to string form (e.g. for transmission over the network or
- for disk storage) with the <code
+ file's <span>character encoding declaration</span> when an HTML
+ document is serialised to string form (e.g. for transmission over
+ the network or for disk storage) with the <code
title="attr-meta-charset">charset</code> attribute.</p>
<p>Exactly one of the <code title="attr-meta-name">name</code>,
@@ -5877,6 +5877,11 @@
the <code title="attr-meta-content">content</code> attribute must
also be specified. Otherwise, it must be omitted.</p>
+ <p>The <dfn title="attr-meta-charset"><code>charset</code></dfn>
+ attribute specifies the character encoding used by the
+ document. This is called a <span>character encoding
+ declaration</span>.</p>
+
<p>The <code title="attr-meta-charset">charset</code> attribute may
be specified in <span title="HTML5">HTML documents</span> only, it
must not be used in <span title="XHTML">XML documents</span>. If the
@@ -6158,18 +6163,19 @@
declaration state's</span> user agent requirements are all handled
by the parsing section of the specification. The state is just an
alternative form of setting the <code
- title="meta-charset">charset</code> attribute: it is <a
- href="#charset">a character encoding declaration</a>.</p>
+ title="meta-charset">charset</code> attribute: it is a
+ <span>character encoding declaration</span>.</p>
<p>For <code>meta</code> elements in the <span
title="attr-meta-http-equiv-content-type">Encoding declaraton
state</span>, the <code title="attr-meta-content">content</code>
- attribute must have a value consisting of the literal string
- "<code title="">text/html;</code>", optionally followed by a
- single U+0020 SPACE character, followed by the literal string
- "<code title="">charset=</code>", followed by the character
- encoding name of <a href="#charset">the character encoding
- declaration</a>.</p>
+ attribute must have a value that is a case-insensitive<!-- ASCII
+ XXX--> match of a string that consists of the literal string
+ "<code title="">text/html;</code>", optionally followed by any
+ number of <span title="space character">space characters</span>,
+ followed by the literal string "<code title="">charset=</code>",
+ followed by the character encoding name of <a href="#charset">the
+ character encoding declaration</a>.</p>
<p>If the document contains a <code>meta</code> element in the
<span title="attr-meta-http-equiv-content-type">Encoding
@@ -6357,17 +6363,14 @@
<h5 id="charset">Specifying the document's character encoding</h5>
- <p>The <code>meta</code> element may also be used to provide UAs
- with character encoding information for <span
- title="HTML5">HTML</span> files, by setting the <dfn
- title="attr-meta-charset"><code>charset</code></dfn> attribute to
- the name of a character encoding. This is called a character
- encoding declaration.</p>
-
<!-- XXX maybe the rest should move to "writing html" section,
though if we do then we have to duplicate the requirements in the
parsing section for conformance checkers -->
+ <p>A <dfn>character encoding declaration</dfn> is a mechanism by
+ which the character encoding used to store or transmit a document is
+ specified.</p>
+
<p>The following restrictions apply to character encoding
declarations:</p>
@@ -6381,8 +6384,8 @@
href="#refsIANACHARSET">[IANACHARSET]</a> <!-- XXX
http://www.iana.org/assignments/character-sets --></li>
- <li>The attribute value must be serialised without the use of
- character entity references of any kind.</li>
+ <li>The encoding name must be serialised without the use of
+ character entity references or character escapes of any kind.</li>
</ul>
@@ -34680,9 +34683,9 @@
<p>The various types of content mentioned above are described in the
next few sections.</p>
- <p>In addition, there are some restrictions on how <a
- href="#charset">character encoding declarations</a> are to be
- serialised, as discussed in the section on that topic.</p>
+ <p>In addition, there are some restrictions on how <span>character
+ encoding declarations</span> are to be serialised, as discussed in
+ the section on that topic.</p>
<p>The U+0000 NULL character, control characters other than the
<span title="space character">space characters</span>, and
@@ -35925,13 +35928,13 @@
<li><p>Otherwise, return an implementation-defined or
user-specified default character encoding, with the <span
title="concept-encoding-confidence">confidence</span>
- <i>tentative</i>. Due to its use in legacy content, <code
+ <i>tentative</i>. In non-legacy environments, the more
+ comprehensive <code title="">UTF-8</code> encoding is
+ recommended. Due to its use in legacy content, <code
title="">windows-1252</code> is recommended as a default in
- predominantly Western demographics. In non-legacy environments, the
- more comprehensive <code title="">UTF-8</code> encoding is
- recommended instead. Since these encodings can in many cases be
- distinguished by inspection, a user agent may heuristically decide
- which to use as a default.</p></li>
+ predominantly Western demographics instead. Since these encodings
+ can in many cases be distinguished by inspection, a user agent may
+ heuristically decide which to use as a default.</p></li>
</ol>
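
Illustrative sketch (non-normative): the revised requirement on the content attribute of a meta element in the Encoding declaration state can be read as a pattern match. The Python below is one possible reading, not the specification's algorithm. The function name content_matches and its parameters are hypothetical. The set of "space characters" (space, tab, LF, FF, CR) and the ASCII-only case folding are assumptions, since the diff leaves the ASCII question open in an XXX comment.

```python
import re

# Assumption: "space characters" are U+0020, U+0009, U+000A, U+000C, U+000D.
SPACE_CLASS = r"[ \t\n\f\r]*"


def content_matches(content: str, encoding_name: str) -> bool:
    """Return True if `content` is a case-insensitive match for
    "text/html;" + optional space characters + "charset=" + encoding_name.

    `encoding_name` is the name given by the character encoding
    declaration; it is escaped so it is matched literally.
    """
    pattern = "text/html;" + SPACE_CLASS + "charset=" + re.escape(encoding_name)
    # re.ASCII limits case-insensitive folding to ASCII letters (assumption).
    flags = re.IGNORECASE | re.ASCII
    return re.fullmatch(pattern, content, flags=flags) is not None
```

Under this reading, "TEXT/HTML;charset=utf-8" and "text/html;   charset=UTF-8" would both be accepted for the encoding name "utf-8", whereas the rev 1274 wording allowed at most a single U+0020 SPACE and required the literal strings exactly.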