[whatwg] Mathematics on HTML5
Øistein E. Andersen
html5 at xn--istein-9xa.com
Wed Jun 7 17:58:40 PDT 2006
First of all, I must say that I applaud the initiative to include tags for mathematics in HTML5 and that I really hope that this will be part of the final specification.
Several attempts have been made to create a suitable format for mathematics on the web, none of which has gained widespread acceptance. Let me highlight a few requirements that probably must be satisfied in order to avoid a new failure.
1) Do not encode every tiny semantic detail explicitly
As Henri Sivonen put it: «[I]t is futile to insist on semantics that you can't pull out of LaTeX as it is normally authored.» I would like to use a slightly different wording: It is futile to insist on encoding anything that does not change the appearance of a formula as it is written on a blackboard or printed in a book.
This point can be illustrated by the two similar-looking formulae a) $\sin^2 (2p+1)x$ and b) $f^2(2p+1)x$. Mathematicians whom I know would evaluate a) as {sin[(2p+1)*x]}^2 and b) as f[f(2p+1)]*x.
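Written out in full, the two readings look as follows. This is only a sketch in ordinary LaTeX notation; the explicit brackets and the \cdot are added for illustration and are not how anyone would normally write these formulae.

```latex
% a) the superscript squares the value of the function
\[ \sin^2 (2p+1)x = \bigl\{ \sin[(2p+1)x] \bigr\}^2 \]
% b) the superscript denotes applying f twice
\[ f^2(2p+1)x = f[f(2p+1)] \cdot x \]
```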
Rules to map between meaning and form cannot be made to work reliably in all cases. Brevity is queen, and conventions may differ between different fields of mathematics.
The point already made about $dt^2$, which means (dt)^2 but is encoded to mean d(t^2) in MathML, is just another example of what happens when there is a lack of agreement between what the author wants (i.e. to make the formula look nice on the web so that other people can read it) and what the format tries to enforce (i.e. to ensure that the encoding of a formula is sufficiently semantic for a computer to be able, in theory, to evaluate it).
No semantics is clearly better than wrong semantics, and correct semantics combined with nice presentation is probably not feasible without encoding everything twice. After all, most authors would clearly want to use the same encoding for the superscripts in e.g. $x^2 = x\times x$, $f^2(x) = f[f(x)]$, $f^{(4)}(x) = \frac{d^4}{dx^4}f(x)$, and $x_i^{(j)} = x_{i,j}$. Anything else will undoubtedly lead to erratic encoding, and what is a poor author supposed to do when he wants to use e.g. a superscript in a way for which no encoding exists?
Finally, the encoding of semantics in mathematical formulae probably does not feel more necessary to most people who write and read them than the encoding of the particular meaning of an ambiguous word like `can', even though a distinction between the modal verb meaning `be able to', the noun denoting a container, and the transitive verb meaning `put into a can' could potentially be helpful for applications like grammar checking and text retrieval.
2) Fight verbosity
The reasons for TeX's undeniable success are many, but one of them might be its concise syntax. People who appreciate the aesthetics of mathematics and are used to making the distinction between $f$ and $F$ or between $\phi$, $\varphi$ and $\Phi$ certainly have no difficulty distinguishing \big from \Big. More importantly, the amount of mark-up needed to encode a line of mathematics is enormous compared to what is necessary for a line of running text. Consequently, each mark-up element must be kept as short as possible.
It may be true that there will always be more <p>'s than <formula>'e out there; however, those using mathematics are likely to use it quite heavily, which makes <m> (for mathematics, or <f> if the <m> tag cannot be reassigned), <frac>2<den>3</frac> and <root>3<of>125</root> clearly better suited than <formula>, <fraction>2<denominator>3</fraction> and <radical>3<radicand>125</radical>.
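To put a rough number on the difference, the tag characters in the examples above can simply be counted. The following is a minimal sketch in Python; the helper name markup_chars is made up for this illustration, and the element names are only the ones suggested above, not part of any draft.

```python
import re

# Anything between "<" and ">" (inclusive) counts as mark-up.
TAG = re.compile(r"<[^>]*>")


def markup_chars(snippet: str) -> int:
    """Return the number of characters spent on tags in the snippet."""
    return sum(len(tag) for tag in TAG.findall(snippet))


# (short form, verbose form) pairs taken from the paragraph above.
pairs = [
    ("<frac>2<den>3</frac>", "<fraction>2<denominator>3</fraction>"),
    ("<root>3<of>125</root>", "<radical>3<radicand>125</radical>"),
]

for short, verbose in pairs:
    print(markup_chars(short), "vs", markup_chars(verbose), "tag characters")
```

For these two examples the count is 18 against 34 and 17 against 29 tag characters, for two and four characters of actual content respectively.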
3) Ensure compatibility with a reasonable subset of TeX
As already stated by others, many (most?) potential users of HTML5 Mathematics already know TeX, and if common TeX constructs cannot be reliably encoded, they will quickly move on to (or wait for) something else. If Wikipedia's subset of TeX [1], which seems to cover the most commonly used syntactic constructs, can straightforwardly be transcoded, then Wikipedia would in all probability offer HTML5 Mathematics as a rendering option, which would potentially give the new format a flying start. I would be most keen to learn to what extent the existing drafts would be able to represent the mathematical constructs covered by this particular subset of TeX adequately.
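As an illustration of what `straightforwardly transcoded' could mean, here is a minimal sketch in Python that rewrites two common constructs, \frac{..}{..} and \sqrt[..]{..}, into the short element names suggested under point 2. Everything about the target is an assumption: the function name transcode is invented, the <frac>/<den> and <root>/<of> elements are my own suggestions rather than anything in the drafts, and I assume that <root>3<of>125</root> denotes the third root of 125. Only brace-delimited arguments are handled; this is nowhere near a real TeX parser.

```python
import re

# Arguments without nested braces; nesting is handled by repeating the
# substitution until nothing changes, so the innermost constructs go first.
FRAC = re.compile(r"\\frac\{([^{}]*)\}\{([^{}]*)\}")
ROOT = re.compile(r"\\sqrt\[([^\[\]{}]*)\]\{([^{}]*)\}")


def transcode(tex: str) -> str:
    """Rewrite \\frac and \\sqrt[n] into the hypothetical short mark-up."""
    previous = None
    while previous != tex:
        previous = tex
        tex = FRAC.sub(r"<frac>\1<den>\2</frac>", tex)
        tex = ROOT.sub(r"<root>\1<of>\2</root>", tex)
    return tex


print(transcode(r"\frac{2}{3}"))
print(transcode(r"\sqrt[3]{125}"))
print(transcode(r"\frac{1}{\frac{2}{3}}"))
```

The interesting question is of course how far such a mapping can be pushed before the constructs in [1] stop having an obvious counterpart.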
[1] http://en.wikipedia.org/wiki/Help:Formula
4) Make font selection simple and natural
This point does not seem to have attracted quite the attention it deserves yet.
TeX seems to have got things right on this point by making italic the default for letters and roman the default for numbers. Is this approach completely unfeasible within the HTML/CSS framework?
If either numbers or letters (variables) have to be marked up explicitly, the choice should be obvious. Nevertheless, writing something like <m><var>a</var><var>x</var><sup>2</sup> + <var>b</var><var>x</var> + <var>c</var> = log <var>d</var></m> is somewhat cumbersome, which will inevitably lead to <m><var>ax</var><sup>2</sup> + <var>bx</var> + <var>c</var> = log <var>d</var></m> or even the complete omission of <var> tags.
On the other hand, TeX's approach will render the last part of the formula $ax^2 + bx + c = log d$ as the product l*o*g*d, a fact which will probably make the author realise that something is wrong and make him change log into \log.
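TeX's default can be imitated mechanically, which suggests that explicit <var> tags need not be typed by hand. The following is a minimal sketch in Python under loud assumptions: the function name mark_variables is invented, <var> is used only because it appears in the examples above, and the rule is reduced to `every bare letter is a variable, every backslash command is an upright name'. Superscripts, Greek letters and everything else are ignored.

```python
import re

# Either a backslash command such as \log, or a single bare letter.
TOKEN = re.compile(r"\\([A-Za-z]+)|([A-Za-z])")


def mark_variables(formula: str) -> str:
    """Wrap bare letters in <var>; keep command names upright; leave digits alone."""

    def replace(match: re.Match) -> str:
        command, letter = match.group(1), match.group(2)
        if command is not None:
            return command  # \log becomes an upright "log"
        return f"<var>{letter}</var>"

    return TOKEN.sub(replace, formula)


print(mark_variables(r"ax + b = \log d"))
print(mark_variables("log d"))  # forgotten backslash: three separate variables
```

With the backslash forgotten, each letter of log comes out as its own variable, so the mistake is as visible as it is in TeX.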
It could be argued, of course, that authors' laziness should be disregarded, which might be right to a certain extent, but we do not want to make their work too troublesome either.
How are non-italic variables supposed to be handled? Using attributes, like <var class="italic">, <var class="bold">, <var class="blackletter">, <var class="roman">, etc. may be part of the solution, even though it would be quite verbose. At the very least, a (minimal) set of font-styles should be clearly defined.
5) Avoid unnecessary complexity [less important section]
Quite a few years ago, my mother was employed by a newspaper to type text written by journalists. In order to allow correct hyphenation of words, she had to encode every possible hyphenation point in each word. One could argue that the same should be done on the web, as the reader's browser's hyphenation rules (if such a thing even existed) cannot be expected to handle text on the Internet correctly because the vocabulary is unknown (as are the author's hyphenation preferences, but I digress). Still, nobody does this, and hardly anyone complains about it.
Correct splitting of continuous fractions seems like a similar non-problem. If a fraction is really overly long, then the author should probably split it manually.
6) Build on existing standards
When making something from scratch, the risk of making a mistake is clearly higher, and the new language will be unfamiliar to everyone.
It might be useful to discuss the problems inherent in existing languages in some detail, including (possibly simplified) MathML (semantics? not strictly left-to-right and top-to-bottom?), a subset of TeX (no angle brackets? impossible to add CSS support?), and ISO-12083 (inconvenient tagnames?).
--
Andersen
PS: I would very much welcome a default rendering of the existing and yet-to-be-written code examples in the drafts.