[whatwg] Video with MIME type application/octet-stream
bzbarsky at MIT.EDU
Tue Sep 7 06:27:51 PDT 2010
On 9/7/10 9:16 AM, Philip Jägenstedt wrote:
> UTF-8, Big5 and GBK are all (as far as I know) ASCII supersets. Do
> real-world text documents include \0 bytes?
Yes. Real-world text documents include all sorts of gunk. Just rarely.
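To illustrate the ASCII-superset point, here is a minimal sketch. The sample string and the `contains_nul` helper are made up for this example and are not from any browser's sniffer. It encodes the same text in UTF-8, Big5 and GBK and checks for 0x00 bytes, with UTF-16 as a contrast. It only shows that cleanly encoded text in these encodings has no NUL bytes. It says nothing about the gunk found in real documents.

```python
# Hypothetical helper: report whether a byte string contains a NUL (0x00) byte.
def contains_nul(data: bytes) -> bool:
    return b"\x00" in data

# Sample text chosen for illustration; both CJK characters exist in Big5 and GBK.
sample = "plain text 中文"

# ASCII-superset encodings: none of these emits 0x00 for non-NUL characters.
nul_by_encoding = {
    name: contains_nul(sample.encode(name))
    for name in ("utf-8", "big5", "gbk")
}

# UTF-16 is not an ASCII superset: ASCII characters are padded with 0x00.
nul_in_utf16 = contains_nul(sample.encode("utf-16-le"))
```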
>> As long as "indicates an encoding" doesn't include UTF-8 or ISO-8859-1
>> (thanks, Apache!), that should be reasonable, I think.
> Are you saying that Apache has, at various times, set the default
> character encoding to UTF-8 or ISO-8859-1?
Yes, precisely. Though the UTF-8 stuff was Linux distros, I think, not
Apache itself: Apache just sends whatever value is passed to
AddDefaultCharset, and the distros changed that value from ISO-8859-1 to
UTF-8 in their packages. Here's the relevant comment from the
Gecko source where we do our text-or-binary sniffing for toplevel contexts:
    Make sure to do a case-sensitive exact match comparison here. Apache
    1.x just sends text/plain for "unknown", while Apache 2.x sends
    text/plain with a ISO-8859-1 charset. Debian's Apache version, just to
    be different, sends text/plain with iso-8859-1 charset. For extra fun,
    FC7, RHEL4, and Ubuntu Feisty send charset=UTF-8. Don't do general
    case-insensitive comparison, since we really want to apply this crap as
    rarely as we can.
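The exact-match rule in that comment can be sketched roughly as follows. This is not the Gecko code. The function name and the precise header spellings (including the "; charset=" spacing) are assumptions made for illustration. Only the four cases themselves come from the comment.

```python
# Assumed spellings of the Content-Type values named in the comment.
# The exact punctuation and spacing are guesses, not taken from Gecko.
SNIFFABLE_CONTENT_TYPES = frozenset({
    "text/plain",                      # Apache 1.x
    "text/plain; charset=ISO-8859-1",  # Apache 2.x
    "text/plain; charset=iso-8859-1",  # Debian's Apache
    "text/plain; charset=UTF-8",       # FC7, RHEL4, Ubuntu Feisty
})

def should_sniff_text_or_binary(content_type_header):
    """Return True only on a case-sensitive, exact match.

    Anything else (a different charset, different casing, extra
    parameters) is treated as deliberately set by the server and is
    left alone, so sniffing applies as rarely as possible.
    """
    if content_type_header is None:
        return False
    return content_type_header in SNIFFABLE_CONTENT_TYPES
```

With these assumed values, "text/plain; charset=UTF-8" would be sniffed but "text/plain; charset=utf-8" would not, which is the point of avoiding a general case-insensitive comparison.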
> I was hoping that no encoding parameter at all would be sent :/
Heh. I've long since given up all hope of reason on this stuff; I just
try to keep it as sane and predictable and simple as possible. :(