[Imps] Reasonable limits on buffered values

Thu Dec 28 09:25:44 PST 2006

My primary strategy against denial of service attacks that target the  
conformance checking service is to limit the number of bytes accepted  
as input. This indirectly throttles everything that is proportional  
to the size of input, which is OK for most stuff that has linear  
growth behavior. (It doesn't address things like the billion laughs  
attack, though.)
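
For illustration, the input cap could be sketched roughly like this. It is a minimal sketch, not the checker's actual code: the names BoundedReader and InputTooLargeError are hypothetical, and the wrapped stream is assumed to be any binary file-like object.

```python
import io


class InputTooLargeError(Exception):
    """Raised when the input exceeds the configured byte limit (hypothetical name)."""


class BoundedReader:
    """Wraps a binary stream and refuses to hand out more than max_bytes in total."""

    def __init__(self, stream, max_bytes):
        self._stream = stream
        self._remaining = max_bytes

    def read(self, size=-1):
        # Never ask for more than one byte past the remaining allowance;
        # the extra byte is what lets us detect that the limit was exceeded.
        if size < 0 or size > self._remaining + 1:
            size = self._remaining + 1
        chunk = self._stream.read(size)
        if len(chunk) > self._remaining:
            raise InputTooLargeError("input exceeds the byte limit")
        self._remaining -= len(chunk)
        return chunk


# Usage: a 2 MB cap on a stand-in input stream.
reader = BoundedReader(io.BytesIO(b"<p>hello</p>"), 2 * 1024 * 1024)
document_bytes = reader.read()
```

Because everything downstream only ever sees bytes that passed through the wrapper, anything that grows linearly with input size is capped along with it.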

I have additionally placed arbitrary hard limits on the size of  
particular buffers. So far, I have learned that the size limit I  
placed on the length of attribute values (2048 UTF-16 code units) is  
too small.
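
To make the unit concrete: a limit in UTF-16 code units is not the same as a limit in characters, because characters outside the Basic Multilingual Plane take two code units each. A small sketch follows, with hypothetical function names and the limit passed in as a parameter, since the right value is exactly what is in question.

```python
def utf16_code_units(value):
    """Number of UTF-16 code units needed to represent the string."""
    # Each UTF-16 code unit is two bytes; "surrogatepass" keeps lone
    # surrogates from raising during the encode.
    return len(value.encode("utf-16-le", "surrogatepass")) // 2


def attribute_value_fits(value, limit):
    """True if the attribute value is within the given code unit limit."""
    return utf16_code_units(value) <= limit
```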

Also, my previous limit on the sum of bytes in HTTP resources loaded  
in order to serve one validation request was too low. I have  
increased the limit to 2 MB.

I'm wondering if there's a best practice here. Is there data on how  
long legitimate (non-malicious) attribute values get on the Web?

At least there can be only one attribute buffer being filled at a  
time. Buffering of the textContent of <progress> and friends is  
potentially worse than an attribute buffer, because you could use the  
leading 1 MB of bytes to establish <progress> start tags (each  
creating a buffer for content) and then use the trailing 1 MB to fill  
those buffers simultaneously. Perhaps I should worry about those  
buffers instead. What might be a reasonable strategy for securing  
those (short of writing the associated algorithms as automata that  
don't need buffers)?
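
One option that might work, short of rewriting the algorithms, is to charge every open buffer against a single shared budget, so that the total buffered size is bounded no matter how many buffers are open at once. The following is only a sketch of that idea, not something the checker is known to do; SharedBudget, BudgetedBuffer and BufferBudgetExceeded are hypothetical names.

```python
class BufferBudgetExceeded(Exception):
    """Raised when all open buffers together would exceed the shared limit."""


class SharedBudget:
    """One allowance that every open buffer draws from."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def charge(self, amount):
        if self.used + amount > self.limit:
            raise BufferBudgetExceeded("total buffered content exceeds the limit")
        self.used += amount

    def release(self, amount):
        self.used -= amount


class BudgetedBuffer:
    """Text buffer for one element; every append is charged to the shared budget."""

    def __init__(self, budget):
        self._budget = budget
        self._parts = []
        self._size = 0

    def append(self, text):
        # len() counts Python code points here, not UTF-16 code units.
        self._budget.charge(len(text))
        self._parts.append(text)
        self._size += len(text)

    def close(self):
        """Return the buffered text and give its share back to the budget."""
        value = "".join(self._parts)
        self._budget.release(self._size)
        self._parts = []
        self._size = 0
        return value
```

With this shape, the attack described above hits the shared limit after the same total amount of buffered text whether it opens one <progress> element or thousands, and closing an element frees its share for later ones.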

Is there data on how large legitimate HTML documents get on the  
Web? The current limit of 2 MB is based on rounding the size of the  
Web Apps spec up.

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/