[whatwg] Byte-wise tokenization algorithm

Ian Hickson ian at hixie.ch
Sun Dec 21 18:41:05 PST 2008


On Sun, 21 Dec 2008, Edward Z. Yang wrote:
> 
> I suppose the big pivot point is "as if". A byte-wise implementation 
> would replace character globally with byte, and any U+xxxx designation 
> with the UTF-8 encoded byte version. HTML 5 dictates end behavior, not 
> the actual algorithm implementation, no?

Right; conformance requirements phrased as algorithms or specific steps 
may be implemented in any manner, so long as the end result is equivalent.
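
As a rough illustration of the "as if" equivalence (a toy sketch, not the 
HTML 5 tokenizer): the hypothetical helpers split_chars and split_bytes 
below split input at U+003C LESS-THAN SIGN, once over decoded characters 
and once over raw UTF-8 bytes. The byte-wise version relies on the UTF-8 
property that bytes below 0x80 never occur inside a multi-byte sequence, 
so scanning for 0x3C cannot match in the middle of another character. All 
names and the sample string are assumptions made up for this example.

```python
# Toy splitter, assumed for illustration only: it breaks the input before
# each U+003C so the two scanning strategies can be compared directly.

def split_chars(text):
    """Character-wise: scan decoded code points for U+003C."""
    pieces, start = [], 0
    for i, ch in enumerate(text):
        if ch == "\u003C" and i > start:
            pieces.append(text[start:i])
            start = i
    if start < len(text):
        pieces.append(text[start:])
    return pieces


def split_bytes(data):
    """Byte-wise: scan UTF-8 bytes for 0x3C, the encoding of U+003C."""
    pieces, start = [], 0
    for i, b in enumerate(data):
        if b == 0x3C and i > start:
            pieces.append(data[start:i])
            start = i
    if start < len(data):
        pieces.append(data[start:])
    return pieces


# Sample with non-ASCII characters (U+263A, U+00E9) around the markup.
sample = "a\u263A<b>caf\u00E9</b>"
by_char = split_chars(sample)
by_byte = [p.decode("utf-8") for p in split_bytes(sample.encode("utf-8"))]

# Both strategies yield the same pieces, so they are interchangeable
# as far as the observable result goes.
assert by_char == by_byte
```

The two functions differ in what they iterate over, yet an observer who 
only sees the resulting pieces cannot tell them apart, which is the sense 
of "equivalent end result" meant here.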

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


