[whatwg] base64 entities

Kornel Lesiński kornel at geekhood.net
Wed Aug 25 14:52:42 PDT 2010

> == Workarounds ==
> Currently, authors must carefully escape all untrusted content to
> prevent an attacker from injecting HTML.  Unfortunately, authors often
> apply the incorrect escaping or forget to escape entirely, resulting
> in security vulnerabilities.  Escaping content in HTML is tricky
> because authors need to use different escaping rules for different
> contexts.  For example, PHP's htmlspecialchars isn't sufficient in the
> following contexts:
> <img alt=<?php echo htmlspecialchars($name) ?> src="...">
> <script>
> elmt.innerHTML = 'Hi there <?php echo htmlspecialchars($name) ?>.';
> </script>

These cases can be secured without any new browser features, by escaping whitespace (along with the usual HTML-special characters) as numeric character references:

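function htmlescape($str) {
	// Turn whitespace and HTML-special characters into numeric character references (&#NN;).
	return preg_replace_callback('/[\s<>"\'&]/', function ($m) { return '&#'.ord($m[0]).';'; }, $str);
}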
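
As a rough illustration of the unquoted-attribute case from the quoted message, the sketch below applies the same substitution to a hostile value. The name `htmlescape_sketch` and the payload in `$name` are made up for this example, and the callback form is an assumption chosen so that it runs on current PHP versions.

```php
<?php
// Sketch only: same idea as htmlescape(), under a hypothetical name.
function htmlescape_sketch($str) {
	return preg_replace_callback('/[\s<>"\'&]/', function ($m) {
		return '&#' . ord($m[0]) . ';';
	}, $str);
}

// Made-up hostile value aimed at an unquoted attribute.
$name = 'x onerror=alert(1)';

// htmlspecialchars() leaves the space alone, so a second attribute is injected.
echo '<img alt=' . htmlspecialchars($name) . ' src="...">', "\n";
// <img alt=x onerror=alert(1) src="...">

// With the space turned into &#32; the input stays a single attribute value.
echo '<img alt=' . htmlescape_sketch($name) . ' src="...">', "\n";
// <img alt=x&#32;onerror=alert(1) src="...">
```

The browser decodes &#32; back to a space only after it has already decided where the attribute value ends, so the space can no longer act as a separator.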

I don't think that another escaping method would substantially improve PHP's situation. In my experience there are far more common problems that it won't solve:

• Authors don't realize that echoed data may be dangerous, e.g. they expect to get a number, and it never occurs to them that a field intended for numbers isn't guaranteed to contain only numbers. Some mistakenly believe that XSS is harmless (that it affects only the attacker's own browser). They wouldn't use the new escaping method.

• PHP uses a fundamentally flawed approach that requires authors to remember to escape all values all the time, and inevitably authors sometimes forget. A better method, when forgotten, won't help at all.

• Novice authors don't understand escaping and end up using the wrong approach (e.g. strip_tags() and filtering of input rather than output). An escaping method that makes escaped text opaque is going to be very confusing to authors who don't understand the concept of escaping.
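
To make the first and third points concrete, here is a small sketch with a made-up value for a field the author expects to be numeric. The variable name and payload are hypothetical. It shows only that the value is not a number and that strip_tags() passes it through untouched.

```php
<?php
// Made-up request value for a field the author expects to hold a number.
$age = '1 onmouseover=alert(1)';

// The field is not guaranteed to contain only digits.
var_dump(ctype_digit($age)); // bool(false)

// strip_tags() only removes tags; this value contains none, so nothing changes.
var_dump(strip_tags($age) === $age); // bool(true)

// Echoed into an unquoted attribute, the "filtered" value still adds a handler.
echo '<input value=' . strip_tags($age) . '>', "\n";
// <input value=1 onmouseover=alert(1)>
```

Filtering the input did nothing here, because the danger comes from the context the value is written into, which is why escaping has to happen at output.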

regards, Kornel
