[whatwg] Please always use utf-8 for Web Workers
Ian Hickson
ian at hixie.ch
Wed Oct 14 03:55:28 PDT 2009
On Fri, 25 Sep 2009, Simon Pieters wrote:
>
> Workers are new and seem very likely to be incompatible with existing
> scripts, so they are not subject to legacy content with legacy encodings.
> Therefore, we should be able to always use utf-8 for workers. Always
> using utf-8 is simpler to implement and test and encourages people to
> switch to utf-8 elsewhere.
On Fri, 25 Sep 2009, Jonathan Cook wrote:
>
> The importScripts portion of the Web Workers API is compatible with
> existing scripts, but I'm all for more UTF-8 :) If the restriction is
> added to the spec, I'd want to know that a very clear error was going to
> be thrown explaining the problem.
On Fri, 25 Sep 2009, Simon Pieters wrote:
>
> I'm not sure that throwing an error is a good idea. Would you throw an
> error when there's no declared encoding? That seems to be annoying for
> the common case of just using ASCII characters. Throwing an error when
> there is a declared encoding that is not utf-8 might work, but are there
> many scripts that have a declared encoding and are not utf-8?
>
> I think it is better to just ignore any declared encoding and assume utf-8. If
> people are using non-ascii in another encoding, then they would notice
> by seeing that their text looks like garbage. Browsers could also log
> messages to their error consoles about encoding declarations declaring
> non-utf-8 and/or sequences of bytes that are not valid utf-8.
On Fri, 25 Sep 2009, Drew Wilson wrote:
>
> Are you saying that if I load a script via a <script> tag in a web page,
> then load it via importScripts() in a worker, that the result of loading
> that script in those two cases should/could be different because of
> different decoding mechanisms?
>
> If that's what's being proposed, that seems bad.
On Fri, 25 Sep 2009, Anne van Kesteren wrote:
>
> That could happen already if the script loaded via <script> did not have
> an encoding set and got it from <script charset>.
On Fri, 25 Sep 2009, Drew Wilson wrote:
>
> Certainly. If I explicitly override the charset, then that seems like
> reasonable behavior. Having the default decoding vary between
> importScripts() and <script> seems bad, especially since you can't
> override charsets with importScripts().
On Fri, 25 Sep 2009, Anne van Kesteren wrote:
>
> It does not need to be overridden per se. If the document character
> encoding is different from UTF-8 then a script loaded through <script>
> will be decoded differently from a script loaded through importScripts()
> as well.
On Mon, 28 Sep 2009, Michael Nordman wrote:
>
> Leaving legacy encodings behind would be a good thing if we can get away
> with it... jmho.
Ok, I've made workers assume UTF-8 always.
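
As a rough illustration of what "assume UTF-8 always" means for a script's bytes, here is a minimal sketch in Python. It is not browser code and not spec text: the function name decode_worker_script, its parameters, and the warning strings are all hypothetical, and the console-style warnings only mirror the logging idea Simon floated above.

```python
from typing import List, Optional, Tuple


def decode_worker_script(
    raw: bytes, declared_charset: Optional[str] = None
) -> Tuple[str, List[str]]:
    """Decode script bytes as UTF-8, ignoring any declared charset.

    Hypothetical helper for illustration only. Returns the decoded text
    plus a list of warnings a browser might choose to log.
    """
    warnings: List[str] = []

    # A declared encoding never changes the outcome; it can only be reported.
    if declared_charset is not None:
        normalized = declared_charset.strip().lower()
        if normalized not in ("utf-8", "utf8"):
            warnings.append(
                "declared charset %r ignored; decoding as UTF-8" % declared_charset
            )

    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Invalid sequences become U+FFFD, so the author sees garbage
        # instead of getting a hard failure.
        text = raw.decode("utf-8", errors="replace")
        warnings.append("script contains byte sequences that are not valid UTF-8")

    return text, warnings


# ASCII-only scripts are unaffected, whatever was declared.
ascii_text, ascii_warnings = decode_worker_script(b"postMessage('hi');")

# A Latin-1 encoded script with a non-ASCII character decodes to garbage.
latin1_bytes = "var s = 'caf\u00e9';".encode("latin-1")
latin1_text, latin1_warnings = decode_worker_script(latin1_bytes, "iso-8859-1")
```

In this sketch the ASCII-only script decodes unchanged with no warnings, while the Latin-1 script comes back with a U+FFFD replacement character and two warnings, which is the "text looks like garbage" outcome discussed in the thread.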
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'