[whatwg] Mathematics in HTML5

juanrgonzaleza at canonicalscience.com
Sat Jun 10 05:57:36 PDT 2006

Øistein E. Andersen wrote:

>><root>3<of>125</root> was already proposed in HTML Math of 1994 and
>> rejected because of technical issues. Also rejected in ISO12083 math of
>> 1995.
>
> What i meant was to use <root>3<of>125</root> as a shorthand notation
> for something like <root><order>3</order><of>125</of></root>, in which
> case only the actual element names differ from the current proposal.

Now I understand you. Yes, it could be as you say or as George says. In
the original HTML 3 Math, <of> was the full markup, not a shorthand, and
that introduced many difficulties. I would recommend leaving the
discussion about names for later. For now it is more important to decide
on elements, names and content, and parsing modes.

Afterwards we could discuss whether we prefer <root> or <frac> or
<fraction>, etc., and whether we should reuse names from TeX or
ISO-12083, etc.

>>>3) Assure compatibility with a reasonable subset of TeX
>
>>The absence of a model for prescripts is one of the most important
>> flaws in TeX, so do not expect that TeX input can be magically
>> transformed into HTML 5.
>
> Obviously, it will not be possible to transform any TeX code into HTML
> 5.
>
> Something like ${}^aB$ could be transformed into an HTML 5 prescript
> given the correct rules, but then something like ${}^{342}_4X$ would of
> course look different in TeX (probably incorrect) and HTML 5.

Yes, the problem with mathematicians is that many think TeX is perfect
because it offers good typesetting on a piece of paper, but TeX fails on
electronic documents and the web.

Here you would either build some intelligence into the converter, as you
suggest [*], or teach mathematicians that the correct markup is either the
explicit HTML5 tag for prescripts or a modified TeX such as the one
proposed by George, which already includes prescripts.

Look at this interesting idea George had:

http://my.opera.com/White%20Lynx/blog/show.dml/256124

Presubscript:    \inf{sub}Base

The TeX code for prescripts is tricky and has been harshly criticized by
MathML, TeX and ISO12083 people.

Similarly for presuperscripts.

[*] However, this may be difficult to achieve in practice, because
converters reading TeX sources are unable to provide correct MathML markup
for prescripts.

Many TeX/LaTeX/IteX converters transform {}_{sub}Base to

<msub><mrow/><mi>sub</mi></msub><mi>Base</mi>

which is a complete aberration.

1) MathML has special markup for prescripts: <mprescripts>

2) The MathML code above is structurally invalid.

3) The MathML code above is not accessible. It would be spoken incorrectly
by an aural renderer.

4) Sometimes the visual rendering can be incorrect, because the size and
position of the subscript are computed with the empty element <mrow/> as
base, and the real base “Base” is ignored.
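
For comparison, here is a minimal sketch of how the same presubscript
could be written with the dedicated MathML 2 markup named in point 1. The
wrapping <math> element and its namespace declaration are only
illustrative; <none/> fills the unused presuperscript slot.

```html
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mmultiscripts>
    <mi>Base</mi>      <!-- the real base comes first -->
    <mprescripts/>     <!-- scripts after this marker are prescripts -->
    <mi>sub</mi>       <!-- presubscript -->
    <none/>            <!-- empty presuperscript slot -->
  </mmultiscripts>
</math>
```

With this structure the script is attached to "Base" itself, so a
renderer, visual or aural, has no empty <mrow/> standing in as the base.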

>>HTML is more verbose than TeX but is less erratic.
>
> That is a fair point.
>
>>I think that people can perfectly use
>><var class="vector">F</var>
>>\mathbf{F}
>>if you dislike the class attribute, then try something like
>><var><b>F</b></var>
>
> A few issues still remain to be solved, though:
>
> Boldface does not necessarily mean vector, and vectors are not always
> printed in bold type.

I agree!

> Presumably, you mean that classes like 'vector'
> need not be defined in the specification, that the choice is up to the
> author, and that a custom CSS style-sheet can be used to define the
> font. (This would require CSS font-families for Fraktur and
> double-struck/blackboard bold.)

Yes, because I think that the design of a good semantic markup is
something that has still not been achieved, even after a decade of effort
from the OpenMath community. If I added semantic content such as "vector",
I would have to add other classes as well: is the vector a generic
elementary linear vector in the usual mathematical sense? Is it covariant?
Is it defined in three-dimensional space, or is it a vector in a Hilbert
space or in a Liouville space?

Some authors would call it "vector" or "Hilbert-vector"; others would call
it "ket". It is much the same as defining my own classes in HTML and then
using CSS rules for presentation. I would leave that to authors until a
generic semantic markup has been achieved, proved to be consistent and
powerful, and then adopted by authors.

Font styles would be specified via CSS. In fact I suspect that the current
MathML 2-specific attributes will be deprecated in the future and moved
into the planned CSS MathML module, but we can begin now with the correct
usage of CSS.
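
A minimal sketch of that approach, assuming an author who picks the class
names "vector" and "ket" (both hypothetical, defined by no specification)
and who wants bold upright type for them:

```html
<style>
  /* Author-defined classes: the names are the author's own choice;
     only the CSS properties themselves are standard. */
  var.vector { font-style: normal; font-weight: bold; }
  var.ket    { font-style: normal; font-weight: bold; }
</style>

<p>The force <var class="vector">F</var> acts on the state
<var class="ket">ψ</var>.</p>
```

Another author could map the same classes to italic or to a different
font family without touching the markup.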

> This approach would entail introducing semantic or quasi-semantic
> mark-up to encode an important part of a formula's visual appearance.
> Obviously, LaTeX commands like \mathcal and \mathbb indicate no
> semantics, so the only sensible solution would be to transform this into
> something like <var class="cal"> and <var class="bb">. If this is going
> to happen, the classes should probably be defined in the specification.

I do not think so. Just as <span class="bold"> is not defined in HTML
text, the bold font property is defined in CSS and you can invoke it via a
CSS rule applied to a class that you define yourself. For example, Spanish
authors could prefer <span class="negrita">, but the CSS rule would be
exactly the same.
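
As a sketch of that point, with both class names chosen freely by the
author and a single shared declaration:

```html
<style>
  /* "bold" and "negrita" are arbitrary author-chosen class names;
     the declaration they share is plain CSS. */
  span.bold, span.negrita { font-weight: bold; }
</style>

<p><span class="bold">bold text</span> and
<span class="negrita">texto en negrita</span> render identically.</p>
```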

"White Lynx" wrote:
>
> Oistein E. Andersen wrote:
>> As an aside, traditional French typographical conventions for
> mathematics require lowercase variables in italic, but uppercase ones
>> in roman.
>
> Interesting detail. Do we need extra values like
> text-transform:french-italic; and french-bold-italic; that would
> transform lowercase Latin and Greek characters to appropriate slanted
> mathematical alphanumerical characters and uppercase ones to normal
> mathematical alphanumerical characters?

Also, "tag" is not usual in French texts. Maybe the language of the
mathematics could be encoded on the [itex] element, much as we use the
xml:lang="fr" attribute today to assist browsers (e.g. for correct aural
rendering) when quotes are typed in other languages.

I would also note here that content MathML was officially designed to
encode the meaning of mathematics, not the presentation. Whereas this is
true in the case of "tag" vs "tg", it is not true in other cases. For
example, the decimal separator is defined to be a point in content MathML.
<cn>3.1416</cn> assumes a default presentation; in Spanish typographical
conventions it may be typed as <cn>3,1416</cn>.

> We can revise naming conventions later. One more problem with roots,
> mentioned by Michael Day in offlist discussion, is that XHTML markup is
> too awkward. If in HTML "radix" and "radicand" are optional so in square
> roots one can omit both
> but in XHTML markup we prefer to avoid any special parsing rules, so in
> XHTML
> The only thing that  we can do within XML+CSS framework is to introduce
> extra element for  square roots like:

Good point.

Juan R.

Center for CANONICAL |SCIENCE)