[whatwg] IRIs vs. URIs
peter at opera.com
Wed Mar 14 07:20:44 PDT 2007
L. David Baron on 2007-03-13:
> I tend to think it would be good that new uses of URIs/IRIs document that
> they are really IRIs and therefore this reverse-encoding behavior should
> not be used, but instead encoding should be done as UTF-8.
You cannot have UTF-8 encoding just for the URIs/IRIs, and another encoding
for the rest of the source text. To properly parse a URI/IRI in the source
document, you must first convert the bytes in the resource into a stream of
> (In Mozilla's codebase such distinctions are easy to implement since
> we have to pass along the encoding of the document every time we
> create a URI in order to get this backwards-compatible behavior.
Of course, you will need to take special care to handle query data that is
stored as plain non-ASCII bytes in the source document, so you would
still need to pass around that document encoding.
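To illustrate the point, here is a minimal sketch in Python, not a description of what any browser actually does. It assumes the reference is first decoded from the document's bytes into characters, that the path part is then escaped as UTF-8, and that the query part is escaped in the document encoding. The function name escape_reference and the path/query split are hypothetical, chosen only for illustration.

```python
from urllib.parse import quote


def escape_reference(raw: bytes, document_encoding: str) -> str:
    """Hypothetical helper: percent-escape a reference taken from a document.

    Assumption for illustration: the path is escaped as UTF-8, while the
    query is escaped using the encoding of the containing document.
    """
    # Decode the source bytes into characters before looking at the reference.
    text = raw.decode(document_encoding)

    # Split at the first "?"; sep is "" when there is no query part.
    path, sep, query = text.partition("?")

    # "%" is kept safe so that already-escaped sequences are not escaped twice.
    escaped_path = quote(path, safe="/%", encoding="utf-8")

    # The query is escaped with the document encoding, which is why that
    # encoding still has to be passed along. Characters the encoding cannot
    # represent raise UnicodeEncodeError in this sketch.
    escaped_query = quote(query, safe="=&%", encoding=document_encoding)

    return escaped_path + sep + escaped_query


latin1_result = escape_reference("/café?q=é".encode("iso-8859-1"), "iso-8859-1")
utf8_result = escape_reference("/café?q=é".encode("utf-8"), "utf-8")
```

With an ISO-8859-1 document the same character ends up as %C3%A9 in the path and %E9 in the query; with a UTF-8 document both parts use %C3%A9.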
> It would probably be good if the spec documented how the encoding
> issues in URIs are actually handled.
Indeed. Considering the number of partly contradictory bug reports we have
here at Opera on the issue, it would be nice to have it clearly spelled out,
so that everyone is doing the same thing, and that we are doing what the
Peter, software engineer, Opera Software
The opinions expressed are my own, and not those of my employer.
Please reply only by follow-ups on the mailing list.