[html5] r1275 - /

whatwg at whatwg.org
Thu Feb 28 17:22:52 PST 2008


Author: ianh
Date: 2008-02-28 17:22:51 -0800 (Thu, 28 Feb 2008)
New Revision: 1275

Modified:
   index
   source
Log:
[ac] (1) 'character encoding declaration' is now a cross-reference term; made the content='' attribute of <meta> case-insensitive for charset decls. switched utf-8 and win1252 defaults around. other minor editorial jiggling.

Modified: index
===================================================================
--- index	2008-02-28 23:26:32 UTC (rev 1274)
+++ index	2008-02-29 01:22:51 UTC (rev 1275)
@@ -24,7 +24,7 @@
 
    <h1 id=html-5>HTML 5</h1>
 
-   <h2 class="no-num no-toc" id=working>Working Draft — 28 February
+   <h2 class="no-num no-toc" id=working>Working Draft — 29 February
     2008</h2>
 
    <p>You can take part in this work. <a
@@ -7394,10 +7394,10 @@
    document-level metadata with the <code title=attr-meta-name><a
    href="#name">name</a></code> attribute, pragma directives with the <code
    title=attr-meta-http-equiv><a href="#http-equiv0">http-equiv</a></code>
-   attribute, and the file's character encoding declaration when an HTML
-   document is serialised to string form (e.g. for transmission over the
-   network or for disk storage) with the <code title=attr-meta-charset><a
-   href="#charset0">charset</a></code> attribute.
+   attribute, and the file's <a href="#character1">character encoding
+   declaration</a> when an HTML document is serialised to string form (e.g.
+   for transmission over the network or for disk storage) with the <code
+   title=attr-meta-charset><a href="#charset0">charset</a></code> attribute.
 
   <p>Exactly one of the <code title=attr-meta-name><a
    href="#name">name</a></code>, <code title=attr-meta-http-equiv><a
@@ -7411,6 +7411,10 @@
    title=attr-meta-content><a href="#content0">content</a></code> attribute
    must also be specified. Otherwise, it must be omitted.
 
+  <p>The <dfn id=charset0 title=attr-meta-charset><code>charset</code></dfn>
+   attribute specifies the character encoding used by the document. This is
+   called a <a href="#character1">character encoding declaration</a>.
+
   <p>The <code title=attr-meta-charset><a href="#charset0">charset</a></code>
    attribute may be specified in <a href="#html5" title=HTML5>HTML
    documents</a> only, it must not be used in <a href="#xhtml5"
@@ -7685,18 +7689,20 @@
      title=attr-meta-http-equiv-content-type>Encoding declaration state's</a>
      user agent requirements are all handled by the parsing section of the
      specification. The state is just an alternative form of setting the
-     <code title=meta-charset>charset</code> attribute: it is <a
-     href="#charset">a character encoding declaration</a>.</p>
+     <code title=meta-charset>charset</code> attribute: it is a <a
+     href="#character1">character encoding declaration</a>.</p>
 
     <p>For <code><a href="#meta0">meta</a></code> elements in the <a
      href="#encoding" title=attr-meta-http-equiv-content-type>Encoding
      declaraton state</a>, the <code title=attr-meta-content><a
-     href="#content0">content</a></code> attribute must have a value
-     consisting of the literal string "<code title="">text/html;</code>",
-     optionally followed by a single U+0020 SPACE character, followed by the
-     literal string "<code title="">charset=</code>", followed by the
-     character encoding name of <a href="#charset">the character encoding
-     declaration</a>.</p>
+     href="#content0">content</a></code> attribute must have a value that is
+     a case-insensitive<!-- ASCII
+    XXX--> match of a string that consists
+     of the literal string "<code title="">text/html;</code>", optionally
+     followed by any number of <a href="#space" title="space character">space
+     characters</a>, followed by the literal string "<code
+     title="">charset=</code>", followed by the character encoding name of <a
+     href="#charset">the character encoding declaration</a>.</p>
 
     <p>If the document contains a <code><a href="#meta0">meta</a></code>
      element in the <a href="#encoding"
@@ -7884,16 +7890,14 @@
 
   <h5 id=charset><span class=secno>3.7.5.4. </span>Specifying the document's
    character encoding</h5>
-
-  <p>The <code><a href="#meta0">meta</a></code> element may also be used to
-   provide UAs with character encoding information for <a href="#html5"
-   title=HTML5>HTML</a> files, by setting the <dfn id=charset0
-   title=attr-meta-charset><code>charset</code></dfn> attribute to the name
-   of a character encoding. This is called a character encoding declaration.</p>
   <!-- XXX maybe the rest should move to "writing html" section,
   though if we do then we have to duplicate the requirements in the
   parsing section for conformance checkers -->
 
+  <p>A <dfn id=character1>character encoding declaration</dfn> is a mechanism
+   by which the character encoding used to store or transmit a document is
+   specified.
+
   <p>The following restrictions apply to character encoding declarations:
 
   <ul>
@@ -7905,8 +7909,8 @@
     href="#refsIANACHARSET">[IANACHARSET]</a> <!-- XXX
    http://www.iana.org/assignments/character-sets -->
 
-   <li>The attribute value must be serialised without the use of character
-    entity references of any kind.
+   <li>The encoding name must be serialised without the use of character
+    entity references or character escapes of any kind.
   </ul>
 
   <p>If the document does not start with a BOM, and if its encoding is not
@@ -37158,9 +37162,9 @@
   <p>The various types of content mentioned above are described in the next
    few sections.
 
-  <p>In addition, there are some restrictions on how <a
-   href="#charset">character encoding declarations</a> are to be serialised,
-   as discussed in the section on that topic.
+  <p>In addition, there are some restrictions on how <span>character encoding
+   declarations</span> are to be serialised, as discussed in the section on
+   that topic.
 
   <p>The U+0000 NULL character, control characters other than the <a
    href="#space" title="space character">space characters</a>, and characters
@@ -37302,14 +37306,14 @@
    described below.
 
   <p>RCDATA elements can have <a href="#text1" title=syntax-text>text</a> and
-   <a href="#character1" title=syntax-entities>character entity
+   <a href="#character2" title=syntax-entities>character entity
    references</a>, but the text must not contain an <a href="#ambiguous"
    title=syntax-ambiguous-ampersand>ambiguous ampersand</a>. There are also
    <a href="#cdata-rcdata-restrictions">further restrictions</a> described
    below.
 
   <p>Normal elements can have <a href="#text1" title=syntax-text>text</a>, <a
-   href="#character1" title=syntax-entities>character entity references</a>,
+   href="#character2" title=syntax-entities>character entity references</a>,
    other <a href="#elements2" title=syntax-elements>elements</a>, and <a
    href="#comments0" title=syntax-comments>comments</a>, but the text must
    not contain the character U+003C LESS-THAN SIGN (<code>&lt;</code>) or an
@@ -37405,7 +37409,7 @@
 
   <p><dfn id=attribute0 title=syntax-attribute-value>Attribute values</dfn>
    are a mixture of <a href="#text1" title=syntax-text>text</a> and <a
-   href="#character1" title=syntax-entities>character entity references</a>,
+   href="#character2" title=syntax-entities>character entity references</a>,
    except with the additional restriction that the text cannot contain an <a
    href="#ambiguous" title=syntax-ambiguous-ampersand>ambiguous
    ampersand</a>.
@@ -37732,7 +37736,7 @@
 
   <p>An <dfn id=escaping title=syntax-escape>escaping text span</dfn> is a
    span of <a href="#text1" title=syntax-text>text</a> (in CDATA and RCDATA
-   elements) and <a href="#character1" title=syntax-entities>character entity
+   elements) and <a href="#character2" title=syntax-entities>character entity
    references</a> (in RCDATA elements) that starts with an <a
    href="#escaping0" title=syntax-escape-start>escaping text span start</a>
    that is not itself in an <a href="#escaping" title=syntax-escape>escaping
@@ -37780,7 +37784,7 @@
   <h4 id=character><span class=secno>8.1.4 </span>Character entity references</h4>
 
   <p>In certain cases described in other sections, <a href="#text1"
-   title=syntax-text>text</a> may be mixed with <dfn id=character1
+   title=syntax-text>text</a> may be mixed with <dfn id=character2
    title=syntax-entities>character entity references</dfn>. These can be used
    to escape characters that couldn't otherwise legally be included in <a
    href="#text1" title=syntax-text>text</a>.
@@ -38369,13 +38373,13 @@
    <li>
     <p>Otherwise, return an implementation-defined or user-specified default
      character encoding, with the <a href="#confidence"
-     title=concept-encoding-confidence>confidence</a> <i>tentative</i>. Due
-     to its use in legacy content, <code title="">windows-1252</code> is
-     recommended as a default in predominantly Western demographics. In
+     title=concept-encoding-confidence>confidence</a> <i>tentative</i>. In
      non-legacy environments, the more comprehensive <code
-     title="">UTF-8</code> encoding is recommended instead. Since these
-     encodings can in many cases be distinguished by inspection, a user agent
-     may heuristically decide which to use as a default.
+     title="">UTF-8</code> encoding is recommended. Due to its use in legacy
+     content, <code title="">windows-1252</code> is recommended as a default
+     in predominantly Western demographics instead. Since these encodings can
+     in many cases be distinguished by inspection, a user agent may
+     heuristically decide which to use as a default.
   </ul>
 
   <h5 id=character0><span class=secno>8.2.2.2. </span>Character encoding

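Illustrative note on the "Encoding declaration state" hunk above: the revised
wording for the content attribute can be sketched as a pattern match. This is
only a sketch, not text from the specification. The helper name
encoding_from_content is hypothetical, the set of "space characters" is
assumed to be tab, LF, FF, CR and U+0020 SPACE, and ASCII-only case folding is
assumed because the diff itself leaves that question open in its
"ASCII XXX" comment.

```python
import re

# Assumptions (not from the spec text): the space characters are tab, LF, FF,
# CR and U+0020 SPACE, and "case-insensitive" means ASCII case-insensitive.
_CONTENT_RE = re.compile(
    r"text/html;[\t\n\f\r ]*charset=(.+)",
    re.IGNORECASE | re.ASCII,
)


def encoding_from_content(value):
    """Hypothetical helper: return the encoding name, or None if no match."""
    match = _CONTENT_RE.fullmatch(value)
    return match.group(1) if match else None
```

Under these assumptions, "TEXT/HTML;charset=utf-8" and
"text/html;   charset=UTF-8" both match, whereas the rev 1274 wording allowed
only the exact-case literal strings with at most one U+0020 SPACE between them.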
Modified: source
===================================================================
--- source	2008-02-28 23:26:32 UTC (rev 1274)
+++ source	2008-02-29 01:22:51 UTC (rev 1275)
@@ -5862,9 +5862,9 @@
   metadata with the <code title="attr-meta-name">name</code>
   attribute, pragma directives with the <code
   title="attr-meta-http-equiv">http-equiv</code> attribute, and the
-  file's character encoding declaration when an HTML document is
-  serialised to string form (e.g. for transmission over the network or
-  for disk storage) with the <code
+  file's <span>character encoding declaration</span> when an HTML
+  document is serialised to string form (e.g. for transmission over
+  the network or for disk storage) with the <code
   title="attr-meta-charset">charset</code> attribute.</p>
 
   <p>Exactly one of the <code title="attr-meta-name">name</code>,
@@ -5877,6 +5877,11 @@
   the <code title="attr-meta-content">content</code> attribute must
   also be specified. Otherwise, it must be omitted.</p>
 
+  <p>The <dfn title="attr-meta-charset"><code>charset</code></dfn>
+  attribute specifies the character encoding used by the
+  document. This is called a <span>character encoding
+  declaration</span>.</p>
+
   <p>The <code title="attr-meta-charset">charset</code> attribute may
   be specified in <span title="HTML5">HTML documents</span> only, it
   must not be used in <span title="XHTML">XML documents</span>. If the
@@ -6158,18 +6163,19 @@
     declaration state's</span> user agent requirements are all handled
     by the parsing section of the specification. The state is just an
     alternative form of setting the <code
-    title="meta-charset">charset</code> attribute: it is <a
-    href="#charset">a character encoding declaration</a>.</p>
+    title="meta-charset">charset</code> attribute: it is a
+    <span>character encoding declaration</span>.</p>
 
     <p>For <code>meta</code> elements in the <span
     title="attr-meta-http-equiv-content-type">Encoding declaraton
     state</span>, the <code title="attr-meta-content">content</code>
-    attribute must have a value consisting of the literal string
-    "<code title="">text/html;</code>", optionally followed by a
-    single U+0020 SPACE character, followed by the literal string
-    "<code title="">charset=</code>", followed by the character
-    encoding name of <a href="#charset">the character encoding
-    declaration</a>.</p>
+    attribute must have a value that is a case-insensitive<!-- ASCII
+    XXX--> match of a string that consists of the literal string
+    "<code title="">text/html;</code>", optionally followed by any
+    number of <span title="space character">space characters</span>,
+    followed by the literal string "<code title="">charset=</code>",
+    followed by the character encoding name of <a href="#charset">the
+    character encoding declaration</a>.</p>
 
     <p>If the document contains a <code>meta</code> element in the
     <span title="attr-meta-http-equiv-content-type">Encoding
@@ -6357,17 +6363,14 @@
 
   <h5 id="charset">Specifying the document's character encoding</h5>
 
-  <p>The <code>meta</code> element may also be used to provide UAs
-  with character encoding information for <span
-  title="HTML5">HTML</span> files, by setting the <dfn
-  title="attr-meta-charset"><code>charset</code></dfn> attribute to
-  the name of a character encoding. This is called a character
-  encoding declaration.</p>
-
   <!-- XXX maybe the rest should move to "writing html" section,
   though if we do then we have to duplicate the requirements in the
   parsing section for conformance checkers -->
 
+  <p>A <dfn>character encoding declaration</dfn> is a mechanism by
+  which the character encoding used to store or transmit a document is
+  specified.</p>
+
   <p>The following restrictions apply to character encoding
   declarations:</p>
 
@@ -6381,8 +6384,8 @@
    href="#refsIANACHARSET">[IANACHARSET]</a> <!-- XXX
    http://www.iana.org/assignments/character-sets --></li>
 
-   <li>The attribute value must be serialised without the use of
-   character entity references of any kind.</li>
+   <li>The encoding name must be serialised without the use of
+   character entity references or character escapes of any kind.</li>
 
   </ul>
 
@@ -34680,9 +34683,9 @@
   <p>The various types of content mentioned above are described in the
   next few sections.</p>
 
-  <p>In addition, there are some restrictions on how <a
-  href="#charset">character encoding declarations</a> are to be
-  serialised, as discussed in the section on that topic.</p>
+  <p>In addition, there are some restrictions on how <span>character
+  encoding declarations</span> are to be serialised, as discussed in
+  the section on that topic.</p>
 
   <p>The U+0000 NULL character, control characters other than the
   <span title="space character">space characters</span>, and
@@ -35925,13 +35928,13 @@
    <li><p>Otherwise, return an implementation-defined or
    user-specified default character encoding, with the <span
    title="concept-encoding-confidence">confidence</span>
-   <i>tentative</i>. Due to its use in legacy content, <code
+   <i>tentative</i>. In non-legacy environments, the more
+   comprehensive <code title="">UTF-8</code> encoding is
+   recommended. Due to its use in legacy content, <code
    title="">windows-1252</code> is recommended as a default in
-   predominantly Western demographics. In non-legacy environments, the
-   more comprehensive <code title="">UTF-8</code> encoding is
-   recommended instead. Since these encodings can in many cases be
-   distinguished by inspection, a user agent may heuristically decide
-   which to use as a default.</p></li>
+   predominantly Western demographics instead. Since these encodings
+   can in many cases be distinguished by inspection, a user agent may
+   heuristically decide which to use as a default.</p></li>
 
   </ol>
 
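Illustrative note on the last hunk above: the text lets a user agent
heuristically choose between UTF-8 and windows-1252 as its default, since the
two "can in many cases be distinguished by inspection". One possible
heuristic, given purely as a sketch, is to test whether the bytes are
well-formed UTF-8. The function name guess_default_encoding is hypothetical,
and the specification does not mandate this or any other particular heuristic.

```python
def guess_default_encoding(data):
    """Hypothetical heuristic: choose a tentative default for undeclared bytes.

    Assumes the caller has already ruled out a BOM, transport-level
    information and any in-document character encoding declaration.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return ("windows-1252", "tentative")
    return ("UTF-8", "tentative")
```

Pure ASCII input is valid UTF-8, so this sketch reports UTF-8 for it. A user
agent that inspects only a prefix of the stream would also need to tolerate a
multi-byte sequence cut off at the end of that prefix, which this sketch does
not attempt.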