[whatwg] Hyphenation

Håkon Wium Lie howcome at opera.com
Thu Jan 11 05:49:17 PST 2007

Also sprach Øistein E. Andersen:

 > > Prince6 (www.princexml.com) supports these properties:
 > > 
 > >   hyphenate: none | auto
 > >   hyphenate-dictionary: none | url(...)
 > >   hyphenate-before: <int>
 > >   hyphenate-after: <int>
 > >   hyphenate-lines: none | <int>
 > > From http://www.princexml.com/howcome/2006/p6/p6demo2.html:
 > > Prince can read the hyphenation format pioneered by TeX and reused by many
 > > other applications. OpenOffice hosts a number of hyphenation dictionaries that
 > > are reusable in Prince6.

 > This is, however, only one part of TeX's hyphenation system. The next level is a
 > hyphenation exception dictionary, a list of fully hyphenated words that would not
 > otherwise be hyphenated correctly. 

Prince doesn't support exception dictionaries. Is it not possible to
encode exceptions in the hyphenation dictionary?
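
For readers unfamiliar with the TeX pattern format, here is a minimal
sketch of pattern-based hyphenation with an optional exception list. It
is an illustration only, not Prince's or TeX's actual code. The names
parse_pattern, hyphenate and PATTERNS are made up for this example, and
the handful of patterns is illustrative; a real dictionary holds
thousands. The last line of the sketch suggests one way an exception
might be encoded in the pattern file itself: a whole-word pattern with
boundary dots and high weights.

```python
import re

def parse_pattern(pat):
    """Split a TeX-style pattern such as 'hy3ph' into letters and slot weights."""
    letters = re.sub(r"\d", "", pat)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pat:
        if ch.isdigit():
            weights[pos] = int(ch)   # weight of the slot before letters[pos]
        else:
            pos += 1
    return letters, weights

def hyphenate(word, patterns, exceptions=None, left_min=2, right_min=2):
    """Return the word split into pieces at the allowed break points."""
    key = word.lower()
    if exceptions and key in exceptions:
        return list(exceptions[key])           # exception list wins outright
    padded = "." + key + "."                   # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)
    for letters, weights in patterns:
        start = padded.find(letters)
        while start != -1:
            for offset, w in enumerate(weights):
                scores[start + offset] = max(scores[start + offset], w)
            start = padded.find(letters, start + 1)
    pieces, last = [], 0
    for j in range(left_min, len(word) - right_min + 1):
        if scores[j + 1] % 2 == 1:             # odd weight: break allowed
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return pieces

# Illustrative patterns in TeX notation (assumption: not a real dictionary).
PATTERNS = [parse_pattern(p) for p in
            ("hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io")]

print(hyphenate("hyphenation", PATTERNS))
print(hyphenate("record", PATTERNS, exceptions={"record": ["rec", "ord"]}))
# Exception expressed as a whole-word pattern instead of a separate list:
print(hyphenate("record", [parse_pattern(".r8e8c9o8r8d.")]))
```

In this sketch, even weights forbid a break and odd weights permit one,
with the highest weight at each position winning, so a whole-word
pattern using 8s and 9s overrides the ordinary patterns for that word.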

DSSSL has a 'hyphenation-exceptions' property which takes a list of
strings. I'm unsure if it has been implemented, though.


 > In addition to this, hyphenation can be indicated locally. This is needed in order to
 > hyphenate words like rec-ord/re-cord and is the only level that deals with
 > spelling changes.

This can be done by supplying your own dictionary through the
'hyphenate-dictionary' property.

 > There are a few additional caveats. For instance, it is not entirely obvious what
 > should be considered to be a `word' or which characters should be allowed in a
 > `word' (given that only `words' can be hyphenated using this kind of algorithm).
 > TeX uses `category codes' to define letters, and Unicode's character classes
 > give a good approximation, but they cannot be redefined to deal with specific
 > issues. In Italian, for instance, dell'opera should be hyphenated dell'o-
 > pera, but opera should not be hyphenated o-pera. (The particular example may
 > be wrong, but the principle is correct.) Unless the apostrophe is
 > considered to be a `letter' (a constituent of a `word'), correct patterns do not
 > help, as `dell'opera' will not be considered as one unit during hyphenation-point
 > look-up.
 > Another example worth mentioning is that Polish and a few other languages
 > apparently require a hyphenated word like xxx-yyy to be hyphenated xxx-
 > -yyy (with an extra hyphen carried over). A truly flexible system would allow
 > one to specify, e.g., which non-letters to treat as part of words and which to give
 > special treatment. (As we all know, TeX hyphenates xxx-yyy as xxx-
 > yyy; in addition, the hyphen prohibits xxx and yyy from being hyphenated,
 > which may or may not be suitable depending on, e.g., column width.)
 > How does Prince deal with these issues?

Prince6 doesn't try to go beyond TeX.
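
To make the two quoted caveats concrete, here is a rough sketch of what
a more flexible system might expose. It does not describe Prince or
TeX. The function names and the extra_letters and repeat_hyphen
parameters are hypothetical, invented for this illustration.

```python
def split_words(text, extra_letters=""):
    """Return maximal runs of letters; extra_letters also count as letters."""
    words, current = [], []
    for ch in text:
        if ch.isalpha() or ch in extra_letters:
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words

def break_at_hyphen(word, repeat_hyphen=False):
    """Split at the first explicit hyphen into (line end, next line start)."""
    head, sep, tail = word.partition("-")
    if not sep:
        return (word, "")                      # no explicit hyphen to break at
    carried = "-" if repeat_hyphen else ""     # some languages repeat the hyphen
    return (head + "-", carried + tail)

# Default: the apostrophe splits the unit, so patterns never see "dell'opera".
print(split_words("dell'opera"))
# With the apostrophe declared a letter, the whole unit reaches pattern look-up.
print(split_words("dell'opera", extra_letters="'"))
# TeX-like behaviour versus the carried-over hyphen described for Polish.
print(break_at_hyphen("xxx-yyy"))
print(break_at_hyphen("xxx-yyy", repeat_hyphen=True))
```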

              Håkon Wium Lie                          CTO °þe®ª
howcome at opera.com                  http://people.opera.com/howcome
