<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

Kristof Zelechovski wrote:

<blockquote cite="mid:EEA991D2BA954DDA81AC06EC827BAE79@POCZTOWIEC"

 type="cite">

  <pre wrap="">AFAIK, WebKit is not going to validate XML, they say it makes page load too

slow.  </pre>

</blockquote>

Yes, I can see that validation would be a problem, and I see little use for

it beyond local file testing. But I'm only talking about using the

DTD to access entities, not to do validation. While this does involve

another HTTP request (as do external stylesheets, scripts, etc.),

browsers could cache the DTD just as they cache those files.<br>

<blockquote cite="mid:EEA991D2BA954DDA81AC06EC827BAE79@POCZTOWIEC"

 type="cite">

  <pre wrap="">Besides, entities introduce a security risk because they can contain

incomplete syntax fragments and they can open a path to XML injection into,

say, &lt;![DANGER[&lt;span title="&amp;malicious-entity;" &gt;sweet kittens&lt;/span &gt;]]&gt;.

So XML processors often refuse to load cross-domain DTD or ENTITIES.

  </pre>

</blockquote>

Then cross-domain entities could be restricted... I just think there

should be at least <i>some</i> way to use them, even

if the file has to be saved on the same domain.<br>
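
As a rough sketch of what I mean (the file names "entities.dtd" and

"page.xhtml" and the entity "greeting" are made up for illustration), a

DTD on the same server could contain nothing but entity declarations:<br>

<pre>&lt;!-- entities.dtd (hypothetical): entity declarations only, no element declarations --&gt;
&lt;!ENTITY nbsp     "&amp;#160;"&gt;
&lt;!ENTITY mdash    "&amp;#8212;"&gt;
&lt;!ENTITY greeting "Hello from a shared file"&gt;
</pre>

and a document on that server could refer to it by a relative URL:<br>

<pre>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!-- page.xhtml (hypothetical): same directory as entities.dtd --&gt;
&lt;!DOCTYPE html SYSTEM "entities.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head&gt;&lt;title&gt;Entity test&lt;/title&gt;&lt;/head&gt;
  &lt;body&gt;&lt;p&gt;&amp;greeting;&amp;nbsp;&amp;mdash; no validation needed.&lt;/p&gt;&lt;/body&gt;
&lt;/html&gt;
</pre>

The document is well-formed but would not be valid against such a DTD,

which is fine for a non-validating processor that reads the external

subset. Whether a given browser actually fetches the file is, of course,

the very thing in question.<br>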

<blockquote cite="mid:EEA991D2BA954DDA81AC06EC827BAE79@POCZTOWIEC"

 type="cite">

  <pre wrap="">There are several XHTML entities that are indispensable for authors, namely

those that disambiguate characters that are invisible or are indistinguishable

from others in a monospaced typeface.  These include spacing, dashes, quotes

and maybe text direction (deprecated).  Converting them to their

corresponding characters deteriorates the editing experience in an ordinary

text editor.  As far as codes for letters are concerned, text in a non-Latin

script would consist mainly of entities, which would make it extremely hard

to read, so this approach is not practical.  An editor limited to the ASCII

character set would be better off using a transliteration scheme and a

converter.

  </pre>

</blockquote>

Yes, if the whole document were written in that fashion and it was not

a localization. But those who already use programs that support

non-Latin scripts could still take advantage of entities in such

documents (or even have the entities themselves be in a non-Latin

script). For example, my "Chinese XHTML" (

<a class="moz-txt-link-freetext" href="http://bahai-library.com/zamir/chineseXHTML.xml">http://bahai-library.com/zamir/chineseXHTML.xml</a> and

<a class="moz-txt-link-freetext" href="http://bahai-library.com/zamir/chineseXHTML.xsl">http://bahai-library.com/zamir/chineseXHTML.xsl</a> ; viewable in Safari,

Firefox, or Opera) allows tags, attributes, and CSS to be written in Chinese

characters roughly equivalent to the XHTML tag names, as long as the

stylesheet is attached. It could also use Chinese entities in the XML

document if such external doctypes were supported (the example uses one

in the internal subset). This suggests another use for entities: a simple

introduction to preparing XHTML documents, regardless of

one's native language. (I used entities in the XSL file too,

which shows yet another use of entities: the ability to

automatically and transparently share <i>all</i> of my code, including

localization code, with anyone who wishes to make their own localization

or borrow mine for other uses.)<br>
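
To illustrate the internal-subset case (this is not copied from my actual

files; the entity name, which means "copyright", is made up for the

example), an entity with a Chinese name can be declared and used like

this, since XML names may contain Chinese characters:<br>

<pre>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html [
  &lt;!-- hypothetical entity name, for illustration only --&gt;
  &lt;!ENTITY 版权 "&amp;#169;"&gt;
]&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head&gt;&lt;title&gt;Entity test&lt;/title&gt;&lt;/head&gt;
  &lt;body&gt;&lt;p&gt;&amp;版权;&lt;/p&gt;&lt;/body&gt;
&lt;/html&gt;
</pre>

With external doctypes supported, the declaration could move into a

shared file instead of being repeated in every document.<br>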

<blockquote cite="mid:EEA991D2BA954DDA81AC06EC827BAE79@POCZTOWIEC"

 type="cite">

  <pre wrap="">

However, as some of the entities are indispensable, a DOCTYPE is required.

The browsers may support built-in entities but XML processors used to

process XHTML documents need not.  Providing a set of the entities needed

in-line is easy; </pre>

</blockquote>

If you mean providing them in each document, that is easy and

already supported, but it uses a lot of bandwidth, not to mention

being quite a pain to copy into each document and maintain...<br>

<blockquote cite="mid:EEA991D2BA954DDA81AC06EC827BAE79@POCZTOWIEC"

 type="cite">

  <pre wrap="">however, the problem is that some validating processors

like MSXML require that the DTD, if provided, should fully describe the

document; providing entities only is not supported by default and the

processor refuses to load the document.  That means a DOCTYPE for XHTML is

necessary and should be provided by WHATWG (or by an independent party).

This DTD should be external in order to use parameter entities and, of

course, to make the document smaller.  It cannot, of course, define all

nuances of XHTML, but an upper approximation would be sufficient.  

The problem, of course, is maintenance, since XHTML is in flux.  XHTML is

currently described formally by a RELAX NG grammar and maintaining a

separate DTD would double the work to do so it would be best to be able to

generate the DTD automatically.  However, the converter I was advised to use

was unable to produce a DTD from the grammar because it is too complex for

the DTD formalism (of course).

  </pre>

</blockquote>

Hmm... Good point. Still, it is surmountable...<br>

<br>

<blockquote cite="mid:EEA991D2BA954DDA81AC06EC827BAE79@POCZTOWIEC"

 type="cite">

  <pre wrap="">Best regards,

Chris

Aside: Note that you cannot use DocBook with MSIE directly; a bug in the

default XSLT processor causes an error in initialization code.  This kills

all transformations, whatever your document is.  (I do not know about TEI.)

  </pre>

</blockquote>

I've used XSL successfully in IE before, but haven't used it for some

time... My "Chinese XHTML", which really did work for me in

Explorer when I output it in GB2312, is not working right now, though

that may be due to the fonts currently on my system.<br>

<br>

Brett<br>

</body>

</html>