[whatwg] Entity parsing

Sander html5 at zoid.nl
Sat Jun 23 05:58:52 PDT 2007


I hadn't thought of that one ;-)  (in Dutch there are no native words 
with umlauts, only some of German or Scandinavian descent).
My question was about char-sets that contain both a trema version and a 
(separate) umlaut version of the same character. Are there any?
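
As one data point on the Unicode side, here is a minimal Python sketch (my
own illustration, not anything from the spec) that resolves the existing
&iuml; and &uuml; entities and prints their Unicode names and
decompositions. The helper name describe is made up for this example.

```python
import html
import unicodedata


def describe(entity):
    """Resolve an HTML entity and report how Unicode names and decomposes it."""
    ch = html.unescape(entity)  # e.g. "&iuml;" becomes the single character U+00EF
    return ch, unicodedata.name(ch), unicodedata.decomposition(ch)


for entity in ("&iuml;", "&uuml;"):
    ch, name, decomposition = describe(entity)
    # The decomposition is "<base letter> 0308" for both characters.
    print(entity, ch, name, decomposition)

# The shared combining mark itself:
print(unicodedata.name("\u0308"))
```

Both characters decompose to their base letter plus the same combining
mark, U+0308 COMBINING DIAERESIS, so Unicode at least does not encode a
separate umlaut and trema version of them.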

cheers,
Sander


Kristof Zelechovski wrote:
> Only the vowel U can have either but I have not seen a valid example of
> &utrema;.  The orthography "ambigüe" has recently been changed to "ambiguë"
> for consistency.  Polish "nauka" (science) and German "beurteilen" would
> make good candidates but the national rules of orthography do not allow this
> distinction because Slavic languages do not have diphthongs except in
> borrowed words and it would cause ambiguity in German (cf. "geübt").
> (Incidentally, this leads to bad pronunciation often encountered even in
> Polish media.)
> Cheers
> Chris
>
> -----Original Message-----
> From: Sander [mailto:html5 at zoid.nl] 
> Sent: Friday, June 22, 2007 9:26 PM
> To: Kristof Zelechovski
> Subject: Re: [whatwg] Entity parsing
>
>
> Kristof Zelechovski wrote:
>   
>> A dieresis is not an umlaut so I have to bite my tongue each time I write or
>> read nonsense like ï.  It feels like lying.  Umlaut means "mixed", a
>> dieresis means "standalone".  Those are very different things, and "I" can
>> never get mixed so there is no ambiguïty.  Since "umlaut" is borrowed from
>> German, I can see no problem in borrowing "tréma" from French.  I personally
>> prefer "&itrema;" to "&idier;" because of readability, but I would not
>> insist on that.
>
> "In professional typography, umlaut dots are usually a bit closer to the 
> letter's body than the dots of the trema. In handwriting, however, no 
> distinction is visible between the two. This is also true for most 
> computer fonts and encodings."
> [http://en.wikipedia.org/wiki/Umlaut_(diacritic)]
>
> Are there any char-sets that have both umlaut and trema variations of 
> characters? If so, both entities could exist.
>
> cheers,
> Sander
>
>
> PS: I'd go for "&itrema;" instead of "&idier;" as well as the term 
> "trema" is also the one that's used in Dutch.
>
>
>   