Esperanto Orthography

Esperanto is written in an alphabet of twenty-eight letters. Twenty-two of these are identical in form to letters of the English alphabet (q, w, x, and y being omitted). The remaining six are accented letters: ĉ, ĝ, ĥ, ĵ, ŝ (c, g, h, j, and s with circumflex), and ŭ (u with breve). The full alphabet is:
a b c ĉ d e f g ĝ h ĥ i j ĵ k l m n o p r s ŝ t u ŭ v z
With the exception of c (= ts) and the accented letters, the values of the letters are approximately those of the IPA (see Esperanto pronunciation). The alphabet is nearly phonemic. The only significant exceptions are the sequence kz, which is frequently pronounced gz, as in ekzemple, and partially Esperantized words that use ŭ for w, which is normally an allophone of /v/. The six Esperanto accented characters are included in the international form of Morse code.

In handwritten Esperanto, the accented letters cause no problems. However, since none of them appear on standard alphanumeric keyboards, various methods have been devised for representing them in printed and typed text using more standard characters. The original method is now referred to as the "h-system," but it has largely been superseded by the so-called "x-system." With the advent of Unicode, the need for such systems is lessening.

The h-system

The original method of representing accented letters is due to the initiator of Esperanto, L. L. Zamenhof, who recommended using u in place of ŭ, and putting an h after a letter to indicate that the letter should have a circumflex. For example, the consonant ŝ is represented as sh, as in the words shi (ŝi, meaning she) and shanco (ŝanco, meaning chance). Unfortunately this method suffers from two problems:
  1. h is already a consonant in the language, so using it for a second purpose makes the pronunciation, and sometimes the meaning, of words ambiguous.
  2. Simplistic ASCII-based rules for sorting English words fail badly for sorting Esperanto ones, because lexicographically words starting with ĉ should follow words starting with c and precede words starting with d. For example ĉu should be sorted after ci lexicographically, but written in the h-system, chu would be incorrectly sorted before ci.
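The sorting problem in point 2 can be illustrated with a short Python sketch. The helper name esperanto_key is hypothetical, and placing unknown characters after all letters is an arbitrary choice made for this example, not an established collation rule.

```python
# Esperanto alphabet in its conventional order, as listed in the article.
ALPHABET = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
RANK = {letter: i for i, letter in enumerate(ALPHABET)}


def esperanto_key(word):
    """Sort key: the position of each letter in the Esperanto alphabet.

    Characters outside the alphabet sort after all letters (an assumption
    made for this sketch).
    """
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]


# Plain code-point sorting of h-system spellings puts "chu" before "ci".
print(sorted(["ci", "chu", "dika"]))  # ['chu', 'ci', 'dika']

# Sorting the accented spellings with the alphabet-aware key gives the
# intended order: c-words, then ĉ-words, then d-words.
print(sorted(["ĉu", "dika", "ci"], key=esperanto_key))  # ['ci', 'ĉu', 'dika']
```

The key only compares letter positions, so it needs the real accented spellings as input; it cannot repair text that is already written in the h-system.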

The x-system

The most common system for typing in Esperanto today is the "x-system," which uses x after a letter to indicate that the letter should have an accent. For example, the consonant ŝ is represented as sx, as in the words sxi (ŝi) and sxanco (ŝanco). This method solves both of the problems inherent in the h-system:
  1. x is not a consonant in the language, so its use introduces no ambiguity into the pronunciation or meaning.
  2. Words starting with cx now correctly follow words starting with c. Similarly, other accented letters are sorted after their unaccented counterparts. The sorting only fails when a word with cz or similar is encountered, but such words are relatively uncommon.
One problem with the x-system arises when it is used alongside French text, because many French words end in ux. For example, aux (aŭ in Esperanto) is a word in both languages. This is most serious when one wants to automatically convert an x-system text file that also contains French text to Unicode: any automatic replacement will alter the French text as well. A few English words, such as "luxury," can also suffer from such search-and-replace routines. A few people have proposed using "vx" instead of "ux" for ŭ, but this variant of the system is rarely used.
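A minimal sketch of such an automatic conversion, in Python, shows both the intended result and the collision with French and English words. The names X_MAP and x_to_unicode are hypothetical, and the rule that the case of the base letter decides the case of the output is an assumption of this example.

```python
import re

# x-system digraph -> accented letter (lowercase; capitals are handled below).
X_MAP = {"cx": "ĉ", "gx": "ĝ", "hx": "ĥ", "jx": "ĵ", "sx": "ŝ", "ux": "ŭ"}
X_PATTERN = re.compile(r"[cghjsu]x", re.IGNORECASE)


def x_to_unicode(text):
    """Replace every x-system digraph with its accented letter."""
    def repl(match):
        digraph = match.group(0)
        letter = X_MAP[digraph.lower()]
        # Keep the case of the base letter: "Sx" and "SX" both give "Ŝ".
        return letter.upper() if digraph[0].isupper() else letter
    return X_PATTERN.sub(repl, text)


print(x_to_unicode("sxi havas sxancon"))  # ŝi havas ŝancon
print(x_to_unicode("Cxu vi venos?"))      # Ĉu vi venos?

# The blind replacement also rewrites non-Esperanto text.
print(x_to_unicode("aux chevaux"))        # aŭ chevaŭ
print(x_to_unicode("luxury"))             # lŭury
```

A converter used on mixed-language files would need some way to mark the passages that must be left alone; this sketch deliberately has none, to show the failure.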

Use of the caret

Another, less popular, system uses the caret character (^) to represent the accents, placed either before or after the letter to be accented. For example, ŝanco becomes ^sanco or s^anco. This shares the x-system's advantage of being unambiguous, and the character itself resembles a circumflex accent, so people unfamiliar with the system are likely to grasp what is meant. However, the system has not caught on in many places.

Many new Esperantists perceive the accented letters as a problem and propose "new" methods of transliterating Esperanto, sometimes with substantial modifications. Most of these proposals are ignored or shunned by the community, as they often come from people who do not know the language well. The transliteration of Esperanto into ASCII is a topic known to cause flame wars and little constructive discussion, and reducing such behaviour is sometimes cited as one of the main reasons to use Unicode and the proper accented letters.

Unicode

The entire Esperanto alphabet is part of the Latin-3 and Unicode character sets, so the above systems are no longer necessary on web pages. Nonetheless, the x-system remains common on Usenet and in e-mail where encoding support is rare and the limited availability of keyboard configurations makes it difficult for many to type the special characters. The HTML entities for the special Esperanto characters in Unicode are:
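  • C-circumflex: Ĉ (&#264;)
  • c-circumflex: ĉ (&#265;)
  • G-circumflex: Ĝ (&#284;)
  • g-circumflex: ĝ (&#285;)
  • H-circumflex: Ĥ (&#292;)
  • h-circumflex: ĥ (&#293;)
  • J-circumflex: Ĵ (&#308;)
  • j-circumflex: ĵ (&#309;)
  • S-circumflex: Ŝ (&#348;)
  • s-circumflex: ŝ (&#349;)
  • U-breve: Ŭ (&#364;)
  • u-breve: ŭ (&#365;)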
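When a page has to stay pure ASCII, numeric character references for these letters can be generated rather than typed by hand. The following Python sketch uses the standard "xmlcharrefreplace" error handler; the function name to_entities is hypothetical. Note that it escapes every non-ASCII character, not only the Esperanto ones.

```python
import html


def to_entities(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


escaped = to_entities("ĉu ŝi?")
print(escaped)                 # &#265;u &#349;i?

# html.unescape reverses the escaping.
print(html.unescape(escaped))  # ĉu ŝi?
```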

Practical Unicode for Esperanto

Adjusting a keyboard to type Unicode is relatively easy. All members of the Microsoft Windows NT family, such as Windows 2000 and XP, support Unicode; Windows 9x does not support it natively.

Microsoft Windows: One option is the tool Keyman (free for personal use), used in conjunction with a special (free) "keyword file." It can be configured to run automatically at startup. The advantage of Keyman is that it is easy to deactivate, so that "abbreviations" such as "cx," which are otherwise converted to the corresponding Esperanto letter as you type, are not converted by accident. You can also use a keyboard layout manager to define special keys; the most elementary approach is to associate AltGr+g with ĝ, and so on for the other letters. The program has a simple and intuitive interface, but it may be necessary to define a new keyboard to avoid interference from Windows' system-file protection, which may not permit modification of important system files such as keyboard drivers.

Many popular e-mail clients support Unicode, so the tools described above can be used to write e-mail in the Esperanto alphabet. If you want a text editor that is Esperanto-compatible, make sure it supports Unicode, as Editplus does (UTF-8).

Linux: Unicode must first be activated by setting the environment variable LC_CTYPE=en_US.UTF-8. Unicode locales other than "en_US" also exist and work in the same way. There is even a special eo_XX.UTF-8 locale available at Bertil Wennergren's home page, along with a thorough explanation of how to set up Unicode and the keyboard in Linux.

Mac OS X: Esperanto characters can be entered by activating the "U.S. Extended" keyboard layout in the "Input Menu" pane of the "International" system preferences. When the U.S. Extended layout is active, Esperanto characters can be entered as follows:
  • C-circumflex: Ĉ = option+6 shift-c
  • c-circumflex: ĉ = option+6 c
  • G-circumflex: Ĝ = option+6 shift-g
  • g-circumflex: ĝ = option+6 g
  • H-circumflex: Ĥ = option+6 shift-h
  • h-circumflex: ĥ = option+6 h
  • J-circumflex: Ĵ = option+6 shift-j
  • j-circumflex: ĵ = option+6 j
  • S-circumflex: Ŝ = option+6 shift-s
  • s-circumflex: ŝ = option+6 s
  • U-breve: Ŭ = option+b shift-u
  • u-breve: ŭ = option+b u
The option combinations can be remembered by mnemonics: the 6 key carries the caret character, so option+6 places a circumflex over the following character, and the b in option+b stands for breve.

Locale

An Esperanto locale would use "." as the thousands separator and "," as the decimal separator. Time and date formats among Esperantists are not as standardized as the number format, but 24-hour time with a colon between hours and minutes, and dates written as either yyyy-mm-dd or dd-mm-yyyy, would be international and unambiguous.
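As a rough illustration of these conventions, the following Python sketch formats a number and a timestamp without relying on any installed locale. The function name format_number and the sample values are made up for this example, and yyyy-mm-dd is only one of the two date orders mentioned above.

```python
from datetime import datetime


def format_number(value, decimals=2):
    """Format a number with "." for thousands and "," for decimals."""
    # Format with English-style separators first, then swap the two symbols.
    english = f"{value:,.{decimals}f}"  # e.g. 1,234,567.89
    return english.translate(str.maketrans(",.", ".,"))


print(format_number(1234567.891))  # 1.234.567,89

# 24-hour time with a colon, and a yyyy-mm-dd date.
moment = datetime(2005, 3, 9, 17, 5)
print(moment.strftime("%Y-%m-%d %H:%M"))  # 2005-03-09 17:05
```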

See also

Orthography
   

External links

*eoconv, a tool to convert text between various Esperanto orthographies and character encodings

 
