[whatwg] Mathematics in HTML5

juanrgonzaleza at canonicalscience.com juanrgonzaleza at canonicalscience.com
Fri Jun 9 03:00:08 PDT 2006


Dan Brickley wrote:
>
> I absolutely agree. It would also be both considerate and sensible  (if
> anyone does want to undertake such a task) to talk to the
> MathML folks first.

Maybe they could be invited to collaborate; after five or six
consecutive failures, maybe they could do better now. But the latest news I
have on this point is not very promising, unless people here want to
wait another 10 years and three or four new mathematical languages from
the W3C before obtaining a language that minimally works. I cannot wait
that long...

If the MathML folks are finally invited to participate in this project, I
would require it to be specified in the “contract” ;-) that decisions
taken in the design of HTML5 be based solely on technical reasons
rather than political interests. From an important MathML guy:

<blockquote>
Juan,

[...]

However, as I have observed again and again during the decade I've
devoted myself to the issues of electronic mathematical communication,
the principal challenges are not technical, but political. MathML is not
the way it is exclusively because of language design considerations --
it is the way it is because it was the politically feasible compromise
between the many conflicting interests of the consortium members that
had a stake in standardizing a markup for math notation.
</blockquote>

Øistein E. Andersen wrote:
>
> First of all, I must say that I applaud the initiative to include tags
> for mathematics in HTML5 and that I really hope that this will be part
> of the final specification.
>
> Several attempts have been made to create a suitable format for
> mathematics on the web, none of which has gained widespread acceptance.

One should know why.

For example, the initial 1994 W3C HTML 3.0 working draft was full of
technical mistakes. The Nov 1995 proposal for Math in HTML was too
“Mathematica oriented”. The Apr 1998 MathML W3C recommendation was not a
solid one. The Jul 1999 MathML 1.01 revision still contained many
difficulties.

The Feb 2001 MathML 2.0 W3C recommendation is still CSS (XSL-FO), DOM, and
XML unfriendly, and contains many weaknesses in both content and
presentational markup, even though it is a major revision of 1.x and has
changed so much that it is completely different from the early 1994 draft.

Now some MathML people have asked for CSS compatibility; maybe a new MathML
3.0 would again be a major revision of the previous versions.

> Let me highlight a few requirements that probably must be satisfied in
> order to avoid a new failure.
>
> 1) Do not encode every tiny semantic detail explicitly

I agree, see my proposal.

> The point already made about $dt^2$ meaning (dt)^2 being encoded to mean
> d(t^2) in MathML is just another example of what happens when there is
> lack of agreement between what the author wants (i.e. to make the
> formula look nice on the web so that other people can read it) and what
> the format tries to enforce (i.e. to assure that encoding of a formula
> be sufficiently semantical for a computer theoretically to be able to
> evaluate it).

A few days ago, Distler changed the MathML code (see my previous
messages), and now the code he is serving to the Internet from his most
advanced blog is structurally incorrect, semantically wrong, not
accessible, and rendered incorrectly. For example, ds is rendered
joined together and in roman, whereas (d x) is rendered in roman.

Using simple HTML 4 code one could render ds perfectly and make it
accessible via the alt attribute.

> 2) Fight verbosity
>
> More importantly, the amount of mark-up
> needed to encode a line of mathematics is enormous compared to what is
> necessary for a line of running text. Consequently, each mark-up element
> must be kept as short as possible.

Agreed, with extensions permitted for advanced users.

> It may be true that there will always be more <p>'s than <formula>'e out
> there; however, those using mathematics are likely to use it quite
> heavily, which makes <m> (for mathematics, or <f> if the <m> tag cannot
> be reassigned), <frac>2<den>3</frac> and <root>3<of>125</root> clearly
> better suited than <formula>, <fraction>2<denominator>3</fraction> and
> <radical>3<radicand>125</radical>.

However, <frac>2<den>3</frac> is a shorthand for the full markup, because
structures of the kind {2 \over 3} are to be avoided even in TeX. If I
remember correctly, even Knuth recognized that the \over command was one
of his early mistakes when he designed TeX.

<root>3<of>125</root> was already proposed in HTML Math in 1994 and
rejected because of technical issues. It was also rejected in ISO 12083
math in 1995.


> 3) Assure compatibility with a reasonable subset of TeX

No problem with this. But for a full encoding one may need to go beyond
TeX. For example, as recognized by Carlisle, the absence of a model for
prescripts is one of the most important flaws in TeX; therefore, do not
expect that TeX input can be magically transformed into HTML 5. However,
constructs such as sub- and superscripts, stackrel, fractions, matrices,
vectors, dots, hats, and others can be easily converted from TeX input.
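
As a rough illustration of how simple the easy cases are, here is a
minimal JavaScript sketch that turns TeX-style sub- and superscripts into
HTML <sub> and <sup>. The function name texScriptsToHtml is hypothetical
(it is not part of any proposal), and the sketch assumes scripts are either
a single character or a braced group without nested braces. It does not
escape HTML and does not touch fractions, matrices, or prescripts.

```javascript
// Hypothetical helper: converts ^x, ^{...}, _x and _{...} into
// <sup>/<sub> markup. Assumption: braced groups contain no nested braces.
function texScriptsToHtml(tex) {
  return tex
    // Superscripts: braced group first, otherwise a single character.
    .replace(/\^\{([^{}]*)\}|\^([^\s{}^_])/g,
      (match, group, single) => `<sup>${group ?? single}</sup>`)
    // Subscripts: same two forms.
    .replace(/_\{([^{}]*)\}|_([^\s{}^_])/g,
      (match, group, single) => `<sub>${group ?? single}</sub>`);
}

console.log(texScriptsToHtml("(ds)^2"));
console.log(texScriptsToHtml("x^2 + a_{ij}"));
```

Anything beyond these two constructs would need a real parser rather than
regular expressions.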

> then Wikipedia would in all probability offer HTML5 Mathematics as a
> rendering option, which would potentially give the new format a flying
> start.

This would be a good step in spreading the language over the Internet.

> 4) Make font selection simple and natural
>
> This point does not seem to have attracted quite the attention it
> deserves yet.
>
> TeX seems to have got things right on this point by making italic the
> default for letters and roman the default for numbers. Is this approach
> completely unfeasible within the HTML/CSS framework?

There are many options. In HTML the roman font is the default and one just
marks variables, as when one uses <i> for an italic font. In the Elsevier
DTD for math, italic was the default and roman was marked via a tag.

I do not think that automatic mixing of roman and italic on the browser
side would be a good idea if one is looking for a rapid, cheap
implementation fully compatible with current standards.

However, this would not be a problem for authors, because one could
implement a small js in a week that authors could use on their own
computers to assist them in authoring math.
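
A minimal sketch of what such an authoring aid might look like, under
stated assumptions: the function name markVariables is hypothetical, only
ASCII letters are treated as variables, and each letter is wrapped in <var>
while digits and operators are left in the default roman font. The
characters &, < and > are escaped in the same pass so the output is safe
to paste into a page.

```javascript
// Hypothetical authoring helper: wraps every ASCII letter in <var> and
// leaves digits, spaces and operators untouched (roman by default).
const HTML_ENTITIES = new Map([
  ["&", "&amp;"],
  ["<", "&lt;"],
  [">", "&gt;"],
]);

function markVariables(formula) {
  // One pass over letters and HTML-special characters, so that the
  // letters inside generated entities such as &lt; are never wrapped.
  return formula.replace(/[A-Za-z]|[&<>]/g, (ch) =>
    HTML_ENTITIES.has(ch) ? HTML_ENTITIES.get(ch) : `<var>${ch}</var>`
  );
}

console.log(markVariables("2x + y"));
console.log(markVariables("a < 3"));
```

Like TeX's math mode, this sketch italicizes each letter separately, so
multi-letter names such as sin would need explicit handling by the author.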

> How are non-italic variables supposed to be handled? Using attributes,
> like <var class="italic">, <var class="bold">, <var
> class="blackletter">, <var class="roman">, etc. may be part of the
> solution, even though it would be quite verbose. At the very least, a
> (minimal) set of font-styles should be clearly defined.

HTML is more verbose than TeX but is less erratic. In the same way that
people have learned to use

<em>this is an important</em> point

in HTML instead of

{\em this is an important} point

I think that people can perfectly use

<var class="vector">F</var>

instead of

\mathbf{F}

If you dislike the class attribute, then try something like

<var><b>F</b></var>


"White Lynx" wrote:
>
> Dan Brickley wrote:
>> It would also be both considerate and sensible
>> (if anyone does want to undertake such a task) to talk to the
>> MathML folks first.
> So far we are addressing problems that were invented and deployed by
> MathML folks.

This reply is fantastic! If the MathML folks were only 10% as good at
solving problems as at generating them, computers today would be able to
prove theorems by automatically searching the whole Internet and combining
mathematical fragments from different websites.

Unfortunately, after 10 years of effort, even today “big thinkers” such as
J. Distler are unable to encode something as simple as (ds)^2 in MathML. I
do not need 10 years, three or four W3C attempts, many drafts and
specifications, half a dozen tools, plugins, special MIME types, special
fonts, and special browsers with a native implementation of an XML
“application” that is not fully compatible with CSS, XML, and DOM, only to
be unable, in the long run, to correctly encode something as simple as
(ds)^2. However, I can use the following old HTML code

<SPAN>ds</SPAN><SUP>2</SUP>

in an old page on an old server and everyone will see the math (of course,
the HTML code can be improved).

Distler's approach in some way reminds me of the joke about the guy who
used the most advanced supercomputer available at the time to compute 50%
of 100. But at least that guy was able to obtain the correct result...

Incorrect encodings are not exclusive to Distler (technologically most
advanced blog in the world). They appear in academic journals of physics
and mathematics and also in other bodies using MathML.

I provided some examples in canonical science today of how wrongly MathML
is being used on the Internet, and when I have more time I will submit more
documents analyzing dozens of other MathML codes and fragments I have
found on the Internet, including those of official bodies.


Juan R.

Center for CANONICAL |SCIENCE)





