[whatwg] Minor clarification of <meta charset> sniffing
Ian Hickson
ian at hixie.ch
Wed Jun 20 01:10:42 PDT 2007
On Wed, 23 May 2007, Michael Day wrote:
>
> A minor point relating to comment skipping in the charset sniffing
> algorithm described in section 8.2.2 of HTML5. The existing text says:
>
> "Advance the position pointer so that it points at the first 0x3E byte
> which is preceded by two 0x2D bytes (i.e. at the end of an ASCII '-->'
> sequence) and comes after the second 0x2D byte that was found. (The two
> 0x2D bytes cannot be the same as those in the '<!--' sequence.) If
> no such byte is found before the nth byte, abort this "two step"
> algorithm."
>
> This clearly says that '<!-->' is not a complete comment, as the second
> pair of hyphens cannot be the same as the first. However, it doesn't
> clearly say whether '<!--->' is a complete comment or not.
>
> One option would be to say that the second two 0x2D bytes come after the
> second 0x2D byte that was found, not just the 0x3E byte coming after the
> second 0x2D byte that was found.
I changed it the other way, by allowing overlapped hyphens, so that both
'<!-->' and '<!--->' count as complete comments. This is consistent with
what we've done with comments in the tokeniser.
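
To make the difference between the two readings concrete, here is a rough
Python sketch of the comment-skipping step. It is illustrative only and is
not the spec's text: the function name skip_comment, the allow_overlap flag,
and the limit parameter (standing in for "the nth byte") are all made up
for this example. With allow_overlap=False it follows the disjoint reading
suggested in the quoted message; with allow_overlap=True the hyphens of
'<!--' may also serve as the hyphens of '-->'.

```python
from typing import Optional


def skip_comment(data: bytes, pos: int, allow_overlap: bool = True,
                 limit: Optional[int] = None) -> Optional[int]:
    """Return the index of the 0x3E byte that ends the comment, or None.

    `pos` must point at the 0x3C byte of an ASCII '<!--' sequence.
    `limit` is a hypothetical stand-in for "the nth byte": the whole
    '-->' sequence must lie before it.
    """
    if data[pos:pos + 4] != b"<!--":
        raise ValueError("pos does not point at '<!--'")
    if limit is None:
        limit = len(data)

    # Overlap allowed: the closing hyphens may be the opening ones,
    # so the search starts at the first 0x2D byte (pos + 2).
    # Overlap disallowed (disjoint reading): the closing hyphens must
    # both come after the opening pair, so the search starts at pos + 4.
    start = pos + 2 if allow_overlap else pos + 4

    i = data.find(b"-->", start, limit)
    if i == -1:
        return None  # abort the sniffing algorithm
    return i + 2     # index of the 0x3E byte


if __name__ == "__main__":
    for sample in (b"<!-->", b"<!--->", b"<!---->", b"<!-- x -->"):
        print(sample,
              skip_comment(sample, 0, allow_overlap=True),
              skip_comment(sample, 0, allow_overlap=False))
```

Under the overlapped reading, '<!-->' and '<!--->' both terminate at their
final byte; under the disjoint reading neither does, and the shortest
complete comment is '<!---->'.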
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'