[whatwg] Entity parsing [trema/diaeresis vs umlaut]

Křištof Želechovski giecrilj at stegny.2a.pl
Thu Jun 28 04:51:13 PDT 2007


I had a look at the reference page you directed me to: it actually
states that the ISO-8859-1 character set can be used for English.  Although
my hypothesis that the word œuvre is not English remains valid (see also the
citations in the appendix), I admit that the fact that the ligature œ is not
included in the character set (and, consequently, that ISO-8859-1 cannot be
used for encoding French text, which I find rather stunning given the
popularity of the French language) provides a much simpler explanation of
the observed phenomenon.  My fault, I should have checked that first.
Best regards
Chris

APPENDIX

Other Wikipedia entries also disagree, e.g.
<http://en.wikipedia.org/wiki/%C5%92>
Borrowings into English from Latin words featuring œ are often spelled with
the letter e, especially in American English. For example, fœderal became
federal in English, while fœtus became fetus only in American English. Other
œs in English spell out as 2 separate letters oe.
<http://en.wikipedia.org/wiki/List_of_words_that_may_be_spelled_with_a_ligature>
The use of the œ and æ is obsolescent in modern English, and has been used
predominantly in British English. It is usually used to evoke archaism, or
in literal quotations of historic sources.
<http://en.wikipedia.org/wiki/American_and_British_English_spelling_differences#Simplification_of_ae_.28.C3.A6.29_and_oe_.28.C5.93.29>
In English, which has imported words from all three languages, it is now
usual to replace Æ/æ with Ae/ae and Œ/œ with Oe/oe.

Microsoft Word does not accept hors d'œuvre but it has no problem with hors
d'oeuvre.  The American English International keyboard does not provide a
way to type the ligature œ.  The Microsoft Encarta dictionary does not
recognize such a spelling, nor does Reference.com.
The word coeur is not mentioned in any English dictionary I know.
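
The point conceded in the main message, that ISO-8859-1 lacks the ligature
œ, can be checked mechanically.  The following is only a minimal sketch in
Python, assuming the standard codec name "iso-8859-1"; the helper name
encodable is hypothetical and exists purely for illustration.

```python
def encodable(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        # At least one character has no code point in this charset.
        return False


samples = [
    "hors d'oeuvre",  # plain ASCII spelling
    "hors d'œuvre",   # œ is U+0153, outside the ISO-8859-1 range
    "cœur",           # same ligature, so the same problem
    "æ",              # æ is U+00E6, which ISO-8859-1 does include
]

for s in samples:
    print(s, encodable(s, "iso-8859-1"))
```

The æ sample is there only for contrast: of the two ligatures discussed in
the citations above, ISO-8859-1 carries æ but not œ.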

-----Original Message-----
From: Oistein E. Andersen [mailto:html5 at xn--istein-9xa.com] 
Sent: Wednesday, June 27, 2007 11:44 PM
To: giecrilj at stegny.2a.pl; whatwg at whatwg.org
Subject: Re: [whatwg] Entity parsing [trema/diaeresis vs umlaut]

You might want to have a look at
http://pl.wikipedia.org/wiki/ISO_8859-1 .

Afterwards, consider the following:
1) Latin-1 does not contain all the characters that are required
for typesetting of English.




