[whatwg] URL parsing and same-document references [was: Re: Citing multiple <blockquote> elements in HTML5]
Calogero Alex Baldacchino
alex.baldacchino at email.it
Sat Dec 13 10:09:17 PST 2008
Nils Dagsson Moskopp wrote:
> On Friday, 12.12.2008, at 20:36 +0100, Calogero Alex Baldacchino
> wrote:
>
>> The above (except for the 'double check' I was suggesting) describes
>> how Firefox (2.x and 3.0.4) behaves: both href="#foo%20bar" and, in a
>> different page, href="./example.html#foo%20bar" match id="foo bar".
>> IE7 and Opera 9.x instead perform an exact comparison and show, in the
>> address bar, a URL that may contain blank spaces. They thus apply the
>> relaxation allowed by the URL parsing rules, but the result, taken as a
>> complete URI string, does not conform to RFC 3986.
>>
> Whenever I copy and paste a URI from the address bar into any other
> program, I am severely annoyed by this, especially when spaces
> (delimiters!) are part of the fake URI. A chat or office program, for
> example, can no longer highlight the fake URI (how could it?), and
> pasting it into source code can create all kinds of validation errors.
> And whenever I get a bastardized URI via chat or mail, only part of it
> is clickable.
>
> Can someone from the web browser faction please state whether there is
> any data to support breaking RFC compatibility? As I see it, it's
> something that makes the URI look nicer but breaks whenever URIs are
> to be transferred or communicated.
>
To be honest, I don't belong to any faction. I think the rationale may be
that "people write strange things, both in address bars and in HTML
code", so relaxing the rules when parsing a URL makes sense; but I think
the strictest rules should be applied when resolving and recomposing a
whole URI.
> Getting to the problem mentioned here, the robustness principle says
> that id="foo bar" should be accepted but is nevertheless invalid,
> because a fragment with a space can never be part of a URI.
Indeed, it is not part of a URI but a dereferenced component: when
splitting a URI into its components, there is no need to keep %-encoded
characters. RFC 3986 says separated components can be decoded, so, AIUI,
both href="#foo bar" and id="foo bar" comply with the conformance rules.
But when "#foo bar" is resolved into a complete, absolute URI, the
result should always look like
"http://example.org/something.html#foo%20bar" to be conforming.
Regards,
Alex
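
P.S. To make the distinction concrete, here is a minimal Python sketch of
the two steps discussed above: lenient matching of a decoded fragment
against an id, and strict re-encoding when recomposing the absolute URI.
It only illustrates the idea and is not how any browser actually
implements it. The function names resolve_reference and
fragment_matches_id are made up for this example, and the set of
characters left unescaped is my reading of the RFC 3986 fragment rule.

```python
from urllib.parse import quote, unquote, urljoin, urlsplit, urlunsplit

# Characters a fragment may carry unescaped besides the unreserved set
# (assumption: sub-delims plus ":", "@", "/", "?", per RFC 3986).
FRAGMENT_SAFE = "!$&'()*+,;=:@/?"


def resolve_reference(base: str, ref: str) -> str:
    """Resolve ref against base and percent-encode the fragment."""
    parts = urlsplit(urljoin(base, ref))
    # Decode first so an already-encoded fragment is not encoded twice.
    fragment = quote(unquote(parts.fragment), safe=FRAGMENT_SAFE)
    return urlunsplit(parts._replace(fragment=fragment))


def fragment_matches_id(href: str, element_id: str) -> bool:
    """Lenient match: compare the decoded fragment with the id."""
    return unquote(urlsplit(href).fragment) == element_id


base = "http://example.org/something.html"
print(resolve_reference(base, "#foo bar"))
print(resolve_reference(base, "#foo%20bar"))
print(fragment_matches_id("./example.html#foo%20bar", "foo bar"))
```

Both spellings of the reference recompose to the same encoded URI, while
the id comparison is done on the decoded component.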