A Latin or Roman alphabet is any of the many alphabets that use the letters of the Latin script. These letters derive from the original Roman Latin alphabet. Some letters of the Latin script were altered slightly for use in particular languages, although the main letters are largely the same. Several general types of alteration were made to extend the alphabet to new languages: diacritics could be added to existing letters; two letters could be fused together into ligatures; additional letters could be inserted; or pairs or triplets of letters could be treated as units (digraphs and trigraphs).
Any additional letters were often given a place in the alphabet by defining an alphabetical order or collation sequence, which can vary between languages. Some of the additions, especially letters that merely have diacritics added to them, were not considered distinct letters for this purpose. For example, the French é and the German ö are not listed separately in their respective alphabet sequences. In some languages, digraphs are included in the collation sequence (e.g. Hungarian CS, Welsh RH).
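As a rough illustration of how a digraph can take its own place in a collation sequence, the following Python sketch sorts words using a simplified, hypothetical alphabet in which cs is a single letter placed directly after c, as in Hungarian. The names ALPHABET and collation_key are invented for this example, and the alphabet leaves out the other Hungarian letters, so it is not a complete Hungarian collation.

```python
# Simplified, hypothetical alphabet: the basic 26 letters plus the digraph
# "cs" placed right after "c". Real Hungarian has more letters than this.
ALPHABET = ["a", "b", "c", "cs", "d", "e", "f", "g", "h", "i", "j", "k", "l",
            "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
RANK = {letter: position for position, letter in enumerate(ALPHABET)}
DIGRAPHS = [letter for letter in ALPHABET if len(letter) > 1]


def collation_key(word):
    """Turn a word into a tuple of letter ranks, reading digraphs as units."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        for digraph in DIGRAPHS:
            if word.startswith(digraph, i):
                key.append(RANK[digraph])
                i += len(digraph)
                break
        else:
            # Single letter; characters outside the alphabet sort last.
            key.append(RANK.get(word[i], len(ALPHABET)))
            i += 1
    return tuple(key)


words = ["csak", "dal", "cukor"]
print(sorted(words))                     # ['csak', 'cukor', 'dal']
print(sorted(words, key=collation_key))  # ['cukor', 'csak', 'dal']
```

A plain code-point sort places csak before cukor because s precedes u, whereas the digraph-aware key places every word beginning with c before any word beginning with cs.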
The tables below summarize and compare some of the alphabets. In this article, the scope of the word "alphabet" is broadened to include letters with tone marks, and other diacritics used to represent a wide range of orthographic traditions, without regard to whether or how they are sequenced in their alphabet or the table.
Usage of the ISO basic Latin alphabet
The Afrikaans, Basque, Breton, Catalan, Czech, Danish, Dutch, English, Estonian, Filipino, Finnish, French, German, Greenlandic, Hungarian, Ido, Interlingua, Karakalpak, Kurdish, Modern Latin, Malay, Norwegian, Pan-European, Portuguese, Romanian, Slovak, Spanish, Swedish, Võro, Walloon, Xhosa, and Zulu alphabets include all 26 letters, at least in their largest version.
ISO 646 statistics
The chart above lists a variety of alphabets that do not officially contain all 26 letters of the ISO basic Latin alphabet. In this list, four letters are used by all of them: A, E, I, and N. For each of the 26 basic ISO Latin alphabet letters, the number of alphabets in the list above using it is as follows:
Note: Turkic languages use two distinct versions of the letter I, dotless (I ı) and dotted (İ i). They are considered different letters, and case conversion must take care to preserve the distinction. Irish traditionally does not write the dot, or tittle, over the small letter i, but the language makes no distinction if a dot is displayed, so it needs no specific encoding or special case-conversion rule of the kind required for Turkic alphabets.
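The need for special case conversion can be seen in a short Python sketch. Python's built-in str.upper and str.lower are not locale-aware, so they pair i with I; the two helper functions below (turkic_upper and turkic_lower, names made up for this illustration) remap the dotted and dotless letters first. This is a minimal sketch that handles only the four I letters, not a full locale-aware implementation.

```python
def turkic_upper(text):
    """Uppercase text while keeping dotted and dotless I distinct."""
    # Map i -> İ and ı -> I before the generic conversion handles the rest.
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


def turkic_lower(text):
    """Lowercase text while keeping dotted and dotless I distinct."""
    # Map İ -> i and I -> ı before the generic conversion handles the rest.
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


print("izmir".upper())        # IZMIR  (default rule, the dot is lost)
print(turkic_upper("izmir"))  # İZMİR
print(turkic_upper("ısı"))    # ISI
print(turkic_lower("ISI"))    # ısı
print(turkic_lower("İZMİR"))  # izmir
```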
Additional letters used in Latin alphabets
Some languages have extended the Latin alphabet with ligatures, modified letters, or digraphs. These symbols are listed below. The characters in the following tables may not all render, depending on which operating system and browser version are used, and the presence or absence of Unicode fonts.
Additional letters by type
Independent letters and ligatures
|Additional base letters||Æ||Ɑ||Ð||Ǝ||Ə||Ɛ||Ɣ||I||Ĳ||Ɩ||Ŋ||Œ||Ɔ||Ʊ||(K‘)||(S)||ẞ||Þ||Ʋ||Ƿ||Ȝ||ʔ|
|Alphabet of Cameroon||Æ||Ɑ||Ə||Ɛ||Ŋ||Œ||Ɔ|
|Alphabet of Benin||Ǝ||Ɛ||Ɣ||Ŋ||Ɔ||Ʋ|
|Alphabet of Burkina Faso||Ǝ||Ɛ||Ɩ||Ŋ||Ɔ||Ʋ|
|Alphabet of Chad||Ə||Ɛ||Ŋ||Ɔ|
|Alphabet of Côte d'Ivoire||Ɛ||Ɩ||Ŋ||Ɔ||Ʊ||ʔ|
|Alphabet of Mali||Ǝ||Ɛ||Ɣ||Ŋ||Ɔ||ʔ|
|Alphabet of Niger||Ǝ||Ɣ||Ŋ|
Letter–diacritic combinations: connected or overlaid
|Alphabet of Cameroon||A̧||Ɓ||Ɗ||Ȩ||Ə̧||Ɛ̧||I̧||Ɨ||Ɨ̧||O̧||Ø||Ɔ̧||U̧||Ƴ|
|Alphabet of Benin||Ɖ||Ƒ|
|Alphabet of Burkina Faso||Ɓ||Ç||Ɗ||Ƴ|
|Alphabet of Chad||Ɓ||Ɗ||Ɦ||Ɨ||Ƴ|
|Alphabet of Mali||Ɓ||Ɗ||Ɲ||Ƴ|
|Alphabet of Niger||Ɓ||Ɗ||Ƙ||Ɲ||Ɍ||Ƴ|
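Some of the combinations in the rows above exist in Unicode as a single code point, while others can only be written as a base letter followed by a combining mark. The Python sketch below, which uses only the standard unicodedata module, normalizes three of them to NFC and lists the resulting code points. The helper name code_points is made up for this illustration.

```python
import unicodedata


def code_points(text):
    """Return the code points of text after NFC normalization."""
    composed = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X}" for ch in composed]


# B with hook: a single code point with no decomposition.
print(code_points("\u0181"))   # ['U+0181']
# E + combining cedilla: NFC composes it into the precomposed letter.
print(code_points("E\u0327"))  # ['U+0228']
# A + combining cedilla: no precomposed form, so two code points remain.
print(code_points("A\u0327"))  # ['U+0041', 'U+0327']
```

Letters of the last kind depend on the font and rendering software positioning the combining mark correctly, which is one reason some of these characters may not display well.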
Other letters in collation order
For the order in which the characters are sorted in each alphabet, see Collating sequence.
Letters derived from A–G
Letters derived from H–Q
Letters derived from R–Z
- In classical Latin, the digraphs CH, PH, RH, TH were used in loanwords from Greek, but they were not included in the alphabet. The ligatures Æ, Œ and W, as well as lowercase letters, were added to the alphabet only in the Middle Ages. The letters J and U were used as typographical variants of I and V, respectively, roughly until the Enlightenment.
- Albanian officially has the digraphs dh, gj, ll, nj, rr, sh, th, xh, zh, which are sufficient to represent the Tosk dialect. The Gheg dialect supplements the official alphabet with six nasal vowels, namely â, ê, î, ô, û, ŷ.
- Arbëresh apparently requires the digraphs dh, gj, hj, ll, nj, rr, sh, th, xh, zh. The distinctive hj is considered a letter in its own right.
- Basque has several digraphs: dd, ll, rr, ts, tt, tx, tz. The ü, which is pronounced as /ø/, is required for various words in its Zuberoan dialect.
- Belarusian also has several digraphs: ch, dz, dź, dž.
- Breton also has the digraphs ch, c'h, zh.
- Catalan also has a large number of digraphs: dj, gu, ig, ix, ll, l·l, ny, qu, rr, ss, ts, tx, tz.
- Corsican has the trigraphs: chj, ghj.
- Croatian also has the digraphs: dž, lj, nj. It can also be written with four tone markers on top of the vowels. Note that Croatian Latin is the same as Serbian Latin and they both map 1:1 to Serbian Cyrillic, where the three digraphs map to Cyrillic letters џ, љ and њ, respectively. Rarely and non-standardly, the digraph dj is used instead of đ (Cyrillic ђ).
- Czech also has the digraph ch, which is considered a separate letter and is sorted between h and i. While á, ď, é, ě, í, ň, ó, ť, ú, ů, and ý are considered separate letters, in collation they are treated merely as letters with diacritics. However, č, ř, š, and ž are actually sorted as separate letters.
- The Norwegian alphabet is currently identical to the Danish alphabet. C is part of both alphabets and is used in native Danish words, but not in native Norwegian words. Norwegian and Danish use é in "én" and some other words, although é is considered an e with a diacritic mark, whereas å, æ and ø are letters in their own right. Q, w, x, z are not used except for names and some foreign words.
- The status of ij as a letter, ligature or digraph in Dutch is disputed.
- English generally now uses extended Latin letters only in loan words, such as fiancé, fiancée, and résumé. A few publication guides may still use the dieresis on words, such as "coöperate", rather than the now-more-common "co-operate" (UK) or "cooperate" (US). For a fuller discussion, see articles branching from Lists of English words of international origin, which was used to determine the diacritics needed to write English less ambiguously. However, an é or è is sometimes used in poetry to show that a normally silent vowel is to be pronounced, as in "blessèd".
- Filipino, also known as Tagalog, also uses the digraph ng, originally written with a large tilde that spanned both n and g (as in n͠g) when a vowel follows the digraph. (The use of the tilde over the two letters is now rare.)
- Uppercase diacritics in French are often thought to be dispensable, although they are obligatory. Many pairs or triplets are read as digraphs or trigraphs depending on context, but are not treated as such lexicographically: consonants ph, (ng), th, gu/gü, qu, ce, ch/(sh/sch), rh; oral vowels (ee), ai/ay, ei/ey, eu, au/eau, ou; nasal vowels ain/aim, in/im/ein, un/um/eun, an/am, en/em, om/on; the half-consonant -(i)ll-; half-consonant and vowel pairs oi, oin/ouin, ien, ion. When the rules that govern French orthography are not observed, they are read as separate letters, or using an approximating phonology of a foreign language for loan words, and there are many exceptions. In addition, most final consonants are mute (including those consonants that are part of feminine, plural, and conjugation endings).
- Galician. The standard of 1982 also set the digraphs gu, qu (both always before e and i), ch, ll, nh and rr. In addition, the standard of 2003 added the grapheme ao as an alternative writing of ó. Although it is not marked (or was forgotten) in the list of digraphs, both spellings represent the same sound, so the sequence ao should be considered a digraph. Note also that nh represents a velar nasal (not a palatal as in Portuguese) and is restricted to three feminine words, each either a demonstrative or a pronoun: unha ('a' and 'one'), algunha ('some') and ningunha ('not one'). The Galician reintegracionismo movement uses it as in Portuguese.
- German also retains most original letters in French loan words. Swiss German does not use ß any more. The long s (ſ) was in use until the mid-20th century. Sch is usually not treated as a true trigraph, nor are ch and qu treated as digraphs. Q only appears in the sequence qu, while y is found only (and x almost only) in loan words. The capital ß (ẞ) is almost never used; ß is replaced with SS when writing in all caps.
- Guaraní also uses a tilde over e and g (the latter is not available precomposed in Unicode), as well as the digraphs ch, mb, nd, ng, nt, rr and the glottal stop ' .
- Hausa has the digraphs: sh, ts.
- Hungarian also has the digraphs: cs, dz, gy, ly, ny, sz, ty, zs; and the trigraph: dzs. Letters á, é, í, ó, ő, ú, and ű are considered separate letters, but are collated as variants of a, e, i, o, ö, u, and ü.
- Irish formerly used the dot diacritic in ḃ, ċ, ḋ, ḟ, ġ, ṁ, ṗ, ṡ, ṫ. These have been replaced by the digraphs bh, ch, dh, fh, gh, mh, ph, sh, th, except in formal contexts.
- Italian also has the digraphs: ch, gh, gn, gl, sc. J, K, W, X, Y are used in foreign words. X is also used for native words derived from Latin and Greek; J is also used for just a few native words, mainly names of persons (as in Jacopo) or of places (as in Jesolo and Jesi), in which it is always pronounced like the letter I.
- Karakalpak also has the digraphs: ch, sh. A', G', I', N', O', U' are considered letters. C, F, H, V, X are used in foreign words.
- Latvian also has the digraphs: dz, dž, ie. Dz and dž are occasionally considered separate letters of the alphabet in more archaic examples, which have been published as recently as the 1950s; however, modern alphabets and teachings discourage this due to an ongoing effort to set decisive rules for Latvian and eliminate barbarisms accumulated during the Soviet occupation. The digraph "ie" is never considered a separate letter.
- Lithuanian also has the digraphs: ch, dz, dž, ie, uo. However, these are not considered separate letters of the alphabet.
- Maltese also has the digraphs: ie, għ.
- Māori uses g only in the digraph ng. Wh is also a digraph.
- Some Mohawk speakers use orthographic i in place of the consonant y. The glottal stop is indicated with an apostrophe ’ and long vowels are written with a colon :.
- Piedmontese also uses the letter n- to indicate a velar nasal N-sound (pronounced like the ng at the end of English going), which usually precedes a vowel, as in lun-a [moon].
- Pinyin has four tone markers that can go on top of any of the six vowels (a, e, i, o, u, ü); e.g.: macron (ā, ē, ī, ō, ū, ǖ), acute accent (á, é, í, ó, ú, ǘ), caron (ǎ, ě, ǐ, ǒ, ǔ, ǚ), grave accent (à, è, ì, ò, ù, ǜ). It also uses the digraphs: ch, sh, zh.
- Polish also has the digraphs: ch, cz, dz, dż, dź, sz, rz.
- Portuguese also uses the digraphs ch, lh, nh, rr, ss. The trema on ü was used in Brazilian Portuguese before 2009. Neither the digraphs nor accented letters are considered part of the alphabet.
- Romanian normally uses a comma diacritic below the letters s and t (ș, ț), but it is frequently replaced with an attached cedilla below these letters (ş, ţ) due to past lack of standardization.
- Romani has the digraphs: čh, dž, kh, ph, th.
- Slovak also has the digraphs dz, dž, and ch, which are considered separate letters. While á, ä, ď, é, í, ĺ, ň, ó, ô, ŕ, ť, ú, and ý are considered separate letters, in collation they are treated merely as letters with diacritics. However, č, ľ, š, and ž, as well as the digraphs, are actually sorted as separate letters.
- Spanish uses several digraphs to represent single sounds: ch, gu (preceding e or i), ll, qu, rr; of these, the digraphs ch and ll were traditionally considered individual letters with their own name (che, elle) and place in the alphabet (after c and l, respectively), but in order to facilitate international compatibility the Royal Spanish Academy decided to cease this practice in 1994 and all digraphs are now collated as combinations of two separate characters. The c-cedilla ç used earlier has been replaced completely by z.
- Swedish uses é in well-integrated loan words like idé and armé, although é is considered a modified e, while å, ä, ö are letters. The letters á and à are rarely used. W and z are used in some integrated words like webb and zon. Q, ü, è are used for names only, but exist in Swedish names. For foreign names ó, ë, ñ and more are sometimes used, but usually not. Swedish has many digraphs and some trigraphs: ch, dj, lj, rl, rn, rs, sj, sk, si, ti, sch, skj, stj and others are usually pronounced as one sound.
- Uzbek also has the digraphs ch, ng, sh, which are considered letters. C is used only in digraphs. G', O' and the apostrophe (') are considered letters. These letters have preferred typographical variants: Gʻ, Oʻ and ʼ respectively.
- Vietnamese has seven additional base letters: ă â đ ê ô ơ ư. It uses five tone markers that can go on top (or below) any of the 12 vowels (a, ă, â, e, ê, i, o, ô, ơ, u, ư, y); e.g.: grave accent (à, ằ, ầ, è, ề, ì, ò, ồ, ờ, ù, ừ, ỳ), hook above (ả, ẳ, ẩ, ẻ, ể, ỉ, ỏ, ổ, ở, ủ, ử, ỷ), tilde (ã, ẵ, ẫ, ẽ, ễ, ĩ, õ, ỗ, ỡ, ũ, ữ, ỹ), acute accent (á, ắ, ấ, é, ế, í, ó, ố, ớ, ú, ứ, ý), and dot below (ạ, ặ, ậ, ẹ, ệ, ị, ọ, ộ, ợ, ụ, ự, ỵ). It also uses several digraphs and trigraphs – ch, gh, gi, kh, ng, ngh, nh, ph, th, tr – but they are no longer considered letters.
- Walloon has the digraphs and trigraphs: ae, ch, dj, ea, jh, oe, oen, oi, sch, sh, tch, xh; the letter x is used almost only in the digraph xh, and the letter j almost only in the digraphs dj and jh.
- Welsh has the digraphs ch, dd, ff, ng, ll, ph, rh, th. It also occasionally uses circumflexes, diaereses, acute accents and grave accents on its seven vowels (a, e, i, o, u, w, y), but accented characters are not regarded as separate letters of the alphabet.
- Xhosa uses a large number of digraphs, trigraphs, and even one tetragraph to represent various phonemes: bh, ch, dl, dy, dz, gc, gq, gr, gx, hh, hl, kh, kr, lh, mb, mf, mh, nc, ndl, ndz, ng, ng', ngc, ngh, ngq, ngx, nh, nkc, nkq, nkx, nq, nx, ntl, ny, nyh, ph, qh, rh, sh, th, ths, thsh, ts, tsh, ty, tyh, wh, xh, yh, zh. It also occasionally uses acute accents, grave accents, circumflexes, and diaereses on its five vowels (a, e, i, o, u), but accented characters are not regarded as separate letters of the alphabet.
- Africa Alphabet
- African reference alphabet
- Dinka alphabet
- Hawaiian alphabet
- International Phonetic Alphabet
- Łatynka for Ukrainian
- Leet (1337 alphabet)
- Montenegrin Latin alphabet
- Romanization schemes
- Romany alphabet for most Romany languages
- Sámi Latin alphabet
- Standard Alphabet by Lepsius
- Tatar alphabet, similar to the Turkish alphabet, and Jaꞑalif, part of the Uniform Turkic Alphabet
- Uralic Phonetic Alphabet
- Cherokee syllabary
- Digraph (orthography)
- Gaj's Latin alphabet, the only script of the Croatian and Bosnian standard languages in current use, and one of the two scripts of the Serbian standard language
- Initial Teaching Alphabet
- Latin alphabet ligatures
- Latin characters in Unicode
- List of Latin letters
- List of precomposed Latin characters in Unicode
- Phonetic transcription symbols
- Specific letter-diacritic combinations
- Trigraph (orthography)
- Typographical ligature
- Uncommon Latin letters
- Writing systems of Africa
- Michael Everson's Alphabets of Europe
- Typo.cz Information on Central European typography and typesetting
- Letter database of the Institute of Estonian Language
- Unicode language coverage tables
- Diacritics Project – All you need to design a font with correct accents