[whatwg] Is EBCDIC support needed for not breaking the Web?
Henri Sivonen
hsivonen at iki.fi
Sun Jun 1 06:45:26 PDT 2008
The HTML5 draft says that authors should not use EBCDIC-based
encodings. This is more lax than saying that authors must not use and
user agents must not support CESU-8, UTF-7, BOCU-1 and SCSU.
In general, now that UTF-8 exists and is ubiquitously supported,
proliferation of encodings is costly and doesn't expand the
expressiveness of HTML, which is parsed into a Unicode DOM anyway.
Moreover, encodings that are not ASCII supersets are potential
security risks, since the string "<script>" may be represented by
different bytes than in ASCII, leading to potential privilege
escalation if a server-side gatekeeper and a user agent give different
meanings to the same bytes.
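
The mismatch can be sketched in a few lines of Python. This is only an
illustration under assumptions: cp037 is picked as one example of an
EBCDIC code page, and naive_gatekeeper_allows and user_agent_sees are
hypothetical stand-ins for a server-side filter and a browser, not
code from any real product.

```python
# Illustration only: cp037 is one EBCDIC code page among several.
TEXT = "<script>"

ascii_bytes = TEXT.encode("ascii")
ebcdic_bytes = TEXT.encode("cp037")


def naive_gatekeeper_allows(payload: bytes) -> bool:
    """Hypothetical filter that assumes an ASCII-compatible payload."""
    return b"<script" not in payload


def user_agent_sees(payload: bytes, encoding: str) -> str:
    """Hypothetical user agent that decodes with the declared encoding."""
    return payload.decode(encoding)


# The same string is represented by different bytes in the two encodings.
print("ASCII :", ascii_bytes.hex(" "))
print("EBCDIC:", ebcdic_bytes.hex(" "))

# The byte-level filter lets the EBCDIC payload through, yet a user
# agent that honors the EBCDIC label still ends up with a script tag.
print("gatekeeper allows:", naive_gatekeeper_allows(ebcdic_bytes))
print("user agent sees  :", user_agent_sees(ebcdic_bytes, "cp037"))
```

A gatekeeper that scans for ASCII byte patterns has nothing to match
here, which is the case where the two sides disagree about what the
bytes mean.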
For these reasons, if EBCDIC-based encodings don't need to be
supported in order to Support Existing Content, it would be beneficial
never to add support for them and, thus, ban them like CESU-8, UTF-7,
BOCU-1 and SCSU.
I asked Hixie for examples of sites or browsers that require/support
EBCDIC-based encodings. He had none. I examined the encoding menus of
Firefox 3b5, Safari 3.1 and Opera 9.5 beta (on Leopard) and IE8 beta 1
(on English XP SP3). None of them expose EBCDIC-based encodings in the
UI. (All the IBM encodings Firefox exposes turn out to be ASCII-based.)
This makes me wonder: Do the top browsers support any EBCDIC-based
encodings but just without exposing them in the UI? If not, can there
be any notable EBCDIC-based Web content?
I suspect that EBCDIC isn't actually Web-relevant.
--
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/