[whatwg] Application deployment

Christoph Päper christoph.paeper at crissov.de
Sun Aug 3 05:35:03 PDT 2008


Robert O'Callahan:
>> http://www.example.com/site.jar#/path/inside/foo.html#heading1
>
> URL parsing doesn't support multiple fragment identifiers

I'm surprised that RFC 3986 (like RFC 2396) does not allow '#' inside
fragment identifiers (the only other reserved characters excluded
there are '[' and ']'). After all, the fragment ID is terminated only
by the end of the URI. The one reason for disallowing '#' I can think
of is tokenization starting from the end of the string, but as far as
I know that approach may already fail for other parts of a URI.

   fragment    = *( pchar / "/" / "?" )
   pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
   unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
   pct-encoded = "%" HEXDIG HEXDIG
   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
               / "*" / "+" / "," / ";" / "="

<http://www.example.com/site.jar#/path/inside/foo.html%23heading1>  
should work fine, though.
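
As a rough illustration of the point above, here is a minimal Python
sketch. It transcribes the quoted fragment rule into a regular
expression and percent-encodes the inner '#' so that the nested
reference fits into a single fragment. The helper names
(is_valid_fragment, nest_fragment, split_nested) are made up for this
example, and nothing in RFC 3986 defines such a nesting convention; a
consumer of the archive would have to agree to decode it this way.

```python
import re
from urllib.parse import quote, unquote, urlsplit

# fragment = *( pchar / "/" / "?" ), transcribed from the ABNF above.
FRAGMENT_RE = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*"
)

# Characters a fragment may carry unescaped, besides ALPHA / DIGIT / "-._~".
FRAGMENT_SAFE = "/?:@!$&'()*+,;="


def is_valid_fragment(fragment: str) -> bool:
    """Check a fragment (without its leading '#') against the rule."""
    return FRAGMENT_RE.fullmatch(fragment) is not None


def nest_fragment(archive_url: str, inner_path: str, inner_fragment: str) -> str:
    """Hypothetical helper: pack 'path#fragment' into one outer fragment."""
    inner = inner_path + "#" + inner_fragment
    # quote() escapes everything outside FRAGMENT_SAFE, so '#' becomes %23.
    return archive_url + "#" + quote(inner, safe=FRAGMENT_SAFE)


def split_nested(url: str) -> tuple[str, str, str]:
    """Hypothetical helper: undo nest_fragment()."""
    parts = urlsplit(url)
    outer = parts._replace(fragment="").geturl()
    inner_path, _, inner_fragment = unquote(parts.fragment).partition("#")
    return outer, inner_path, inner_fragment
```

With these helpers, the raw form "/path/inside/foo.html#heading1" is
rejected as a fragment, while the %23 form is accepted and decodes
back to the inner path and heading.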

-----8<--------8<--------8<--------8<--------8<--------8<--------8<-----

I'm also surprised that RFC 3986 (unlike RFC 2396) lacks a section on
the US-ASCII characters that are deliberately excluded, i.e. <C0> and
'"<>{}|\`^ ', previously also '[]'. I think

   reserved    = gen-delims / sub-delims
   gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
   ...

should be something like

   reserved    = delims / enclosing / unwise / controls
   delims      = gen-delims / sub-delims
   enclosing   = DQUOTE / "<" / ">" / SP
   unwise      = "{" / "}" / "|" / "\" / "`" / "^"
   controls    = %x00-1F / %x7F
   ...
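
A quick sanity check of the proposed rules, as a Python sketch: take
all 128 US-ASCII characters, subtract everything RFC 3986 can place in
a URI (unreserved, gen-delims, sub-delims and the '%' of pct-encoded),
and compare the remainder with enclosing / unwise / controls. The
variable names are only for this example.

```python
import string

# Character classes from RFC 3986.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
ALLOWED = UNRESERVED | GEN_DELIMS | SUB_DELIMS | {"%"}

# Classes from the proposed grammar above.
ENCLOSING = set('"<> ')          # DQUOTE / "<" / ">" / SP
UNWISE = set("{}|\\`^")
CONTROLS = {chr(c) for c in range(0x00, 0x20)} | {"\x7f"}

# Everything in US-ASCII that RFC 3986 leaves without a rule.
excluded = {chr(c) for c in range(128)} - ALLOWED

# The proposed classes cover exactly the leftover characters.
assert excluded == ENCLOSING | UNWISE | CONTROLS
```

The assertion holds: the 43 leftover characters are exactly the 4
enclosing, 6 unwise and 33 control characters listed above.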


