Latin script

For the Latin script as used for the Latin language, see Latin alphabet. For Roman typeface, see Roman type.

Latin Roman

Type	Alphabet
Languages	Most western European languages, including Latin; many Turkic, Finno-Ugric and Eskimo–Aleut languages; the Basque language; some African, Austronesian and Austroasiatic languages
Time period	~700 BC–present
Parent systems	Egyptian hieroglyphs → Proto-Sinaitic → Phoenician alphabet → Greek alphabet → Etruscan alphabet → Latin (Roman)
Child systems	indirectly, the Cherokee syllabary and Yugtun script
Sister systems	Cyrillic, Armenian, Georgian, Coptic, Runic/Futhark
Direction	Left-to-right
ISO 15924	`Latn, 215`
Unicode alias	Latin
Unicode range	See Latin characters in Unicode

Latin script, or Roman script, is a set of graphic signs (script) based on the letters of the classical Latin alphabet, derived from a form of the Cumaean Greek version of the Greek alphabet through the Etruscan. Latin script is used as the standard method of writing in most Western and Central European languages, as well as many languages from other parts of the world.

Latin script is the basis for the largest number of alphabets of any writing system^[1] and is the most widely adopted writing system in the world (commonly used by about 70% of the world's population). Latin script is also the basis of the International Phonetic Alphabet. The 26 most widespread letters are the letters contained in the ISO basic Latin alphabet.

Name

The script is called either Roman script or Latin script, in reference to its origin in ancient Rome. In the context of transliteration, the term "romanization" or "romanisation" is often found.^[2]^[3] Unicode uses the term "Latin",^[4] as does the International Organization for Standardization (ISO).^[5]

The numeral system is called the Roman numeral system, and its individual symbols are called Roman numerals. The digits 1, 2, 3, ... are the Latin-script (Roman-script) forms of the numerals of the Hindu–Arabic numeral system.

Spread

For earlier history, see Latin alphabet.

The Latin alphabet spread, along with Latin, from the Italian Peninsula to the lands surrounding the Mediterranean Sea with the expansion of the Roman Empire. The eastern half of the Empire, including Greece, Turkey, the Levant, and Egypt, continued to use Greek as a lingua franca, but Latin was widely spoken in the western half, and as the western Romance languages evolved out of Latin, they continued to use and adapt the Latin alphabet.

Middle Ages

With the spread of Western Christianity during the Middle Ages, the Latin alphabet was gradually adopted by the peoples of Northern Europe who spoke Celtic languages (displacing the Ogham alphabet) or Germanic languages (displacing earlier Runic alphabets) or Baltic languages, as well as by the speakers of several Uralic languages, most notably Hungarian, Finnish and Estonian.

The Latin script also came into use for writing the West Slavic languages and several South Slavic languages, as the people who spoke them adopted Roman Catholicism. The speakers of East Slavic languages generally adopted Cyrillic along with Orthodox Christianity. The Serbian language uses both scripts, with Cyrillic predominating in official communication and Latin elsewhere, as determined by the Law on Official Use of the Language and Alphabet.^[6]

Since the 16th century

The distribution of the Latin script. The dark green areas show the countries where the Latin script is the sole main script. Light green shows countries where Latin co-exists with other scripts. Latin-script alphabets are sometimes extensively used in areas coloured grey due to the use of unofficial second languages, such as French in Algeria and English in Egypt, and to Latin transliteration of the official script, such as pinyin in China.

As late as 1500, the Latin script was limited primarily to the languages spoken in Western, Northern, and Central Europe. The Orthodox Christian Slavs of Eastern and Southeastern Europe mostly used Cyrillic, and the Greek alphabet was in use by Greek-speakers around the eastern Mediterranean. The Arabic script was widespread within Islam, both among Arabs and non-Arab nations like the Iranians, Indonesians, Malays, and Turkic peoples. Most of the rest of Asia used a variety of Brahmic alphabets or the Chinese script.

Over the past 500 years, the Latin script has spread around the world, to the Americas, Oceania, and parts of Asia, Africa, and the Pacific with European colonization, along with the Spanish, Portuguese, English, French, Swedish and Dutch languages. It is used for many Austronesian languages, including the languages of the Philippines and the Malaysian and Indonesian languages, replacing earlier Arabic and indigenous Brahmic alphabets. Latin letters served as the basis for the forms of the Cherokee syllabary developed by Sequoyah; however, the sound values are completely different.

Since the 19th century

A map showing the expansion of the use of the Latin alphabet in areas of former Yugoslavia. Cyrillic text starts at the border of the Muslim and Eastern Orthodox nations. This cultural boundary has existed since the dichotomy of the Greek East and Latin West.

In the late 19th century, the Romanians returned to the Latin alphabet, which they had used until the Council of Florence in 1439,^[7] primarily because Romanian is a Romance language. The Romanians were (and still are) predominantly Orthodox Christians, and their Church, increasingly influenced by Russia after the fall of Byzantine Greek Constantinople in 1453 and capture of the Greek Orthodox Patriarch, had begun promoting the Slavic Cyrillic.

Under French rule and Portuguese missionary influence, a Latin alphabet was devised for the Vietnamese language, which had previously used Chinese characters.

In 1928, as part of Mustafa Kemal Atatürk's reforms, the new Republic of Turkey adopted a Latin alphabet for the Turkish language, replacing a modified Arabic alphabet. Most of the Turkic-speaking peoples of the former USSR, including the Tatars, Bashkirs, Azeris, Kazakhs, Kyrgyz and others, used the Latin-based Uniform Turkic alphabet in the 1930s, but in the 1940s all of these alphabets were replaced by Cyrillic. After the collapse of the Soviet Union in 1991, several of the newly independent Turkic-speaking republics, namely Azerbaijan, Uzbekistan, and Turkmenistan, as well as Romanian-speaking Moldova, officially adopted Latin alphabets for their languages.

Kazakhstan, Kyrgyzstan, Iranian-speaking Tajikistan, and the breakaway region of Transnistria kept the Cyrillic alphabet, chiefly due to their close ties with Russia. In the 1930s and 1940s, the majority of Kurds replaced the Arabic script with two Latin alphabets. Although the only official Kurdish government uses an Arabic alphabet for public documents, the Latin Kurdish alphabet remains widely used throughout the region by the majority of Kurdish-speakers.

In 2015, the Kazakh government announced that the Latin alphabet would replace Cyrillic as the writing system for the Kazakh language by 2025.^[8]

As used by various languages

Main article: Latin alphabets

In the course of its use, the Latin alphabet was adapted for use in new languages, sometimes representing phonemes not found in languages that were already written with the Roman characters. To represent these new sounds, extensions were therefore created, be it by adding diacritics to existing letters, by joining multiple letters together to make ligatures, by creating completely new forms, or by assigning a special function to pairs or triplets of letters. These new forms are given a place in the alphabet by defining an alphabetical order or collation sequence, which can vary with the particular language.

Multigraphs

Main article: Latin-script multigraph

A digraph is a pair of letters used to write one sound or a combination of sounds that does not correspond to the written letters in sequence. Examples are ⟨ch⟩, ⟨ng⟩, ⟨rh⟩, ⟨sh⟩ in English, or the Dutch ⟨ij⟩. The Dutch digraph is capitalized as ⟨IJ⟩ or as the ligature ⟨Ĳ⟩, but never as ⟨Ij⟩; it is sometimes written as the single letter ⟨Y⟩, although that is a different letter. In handwriting it often takes the appearance of a ligature ⟨ĳ⟩ very similar to the letter ⟨ÿ⟩.

A trigraph is made up of three letters, like the German ⟨sch⟩, the Breton ⟨c’h⟩ or the Milanese ⟨oeu⟩. In the orthographies of some languages, digraphs and trigraphs are regarded as independent letters of the alphabet in their own right. The capitalization of digraphs and trigraphs is language-dependent, as only the first letter may be capitalized, or all component letters simultaneously (even for words written in titlecase, where letters after the digraph or trigraph are left in lowercase).
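The language-dependent capitalization rule can be illustrated with a minimal Python sketch. The function name `titlecase_word` and its `full_caps_multigraphs` parameter are hypothetical, introduced here only for illustration; the only rule taken from the text is that Dutch ⟨ij⟩ is capitalized as ⟨IJ⟩ and never as ⟨Ij⟩.

```python
def titlecase_word(word: str, full_caps_multigraphs: tuple[str, ...] = ()) -> str:
    """Titlecase a single word.

    Multigraphs listed in `full_caps_multigraphs` (a hypothetical,
    per-language setting) are capitalized as a whole when they start
    the word; otherwise only the first letter is capitalized.
    """
    lower = word.lower()
    for multigraph in full_caps_multigraphs:
        if lower.startswith(multigraph):
            # Capitalize every component letter of the multigraph.
            return multigraph.upper() + lower[len(multigraph):]
    # Default rule: capitalize the first letter only.
    return lower[:1].upper() + lower[1:]


# Dutch treats "ij" as a unit, so both letters are capitalized.
print(titlecase_word("ijsselmeer", ("ij",)))  # IJsselmeer
# With no special multigraphs, only the first letter changes.
print(titlecase_word("chemie"))               # Chemie
```

A real implementation would need a per-language table of such multigraphs; the tuple passed above is only a stand-in for that table.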

Ligatures

Main article: Ligature (typography)

A ligature is a fusion of two or more ordinary letters into a new glyph or character. Examples are ⟨Æ/æ⟩ (from ⟨AE⟩, called "ash"), ⟨Œ/œ⟩ (from ⟨OE⟩, sometimes called "oethel"), the abbreviation ⟨&⟩ (from Latin et "and"), and the German symbol ⟨ß⟩ ("sharp S" or "Eszett", from ⟨ſz⟩ or ⟨ſs⟩, that is, ⟨ſ⟩, the archaic medial form of ⟨s⟩, followed by a ⟨z⟩ or ⟨s⟩).
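In Unicode, some ligatures are encoded as compatibility characters that expand back into their component letters, while others are encoded as letters in their own right. The following sketch, using Python's standard `unicodedata` module, shows the difference; the helper name `compat_expand` is an assumption made for this example.

```python
import unicodedata


def compat_expand(ch: str) -> str:
    """Return the NFKD (compatibility) decomposition of a character."""
    return unicodedata.normalize("NFKD", ch)


# U+0133 (ij ligature) and U+FB01 (fi ligature) expand to two letters;
# U+00E6 (ae), U+0153 (oe) and U+00DF (sharp s) have no decomposition.
for ch in ["\u0133", "\ufb01", "\u00e6", "\u0153", "\u00df"]:
    expanded = compat_expand(ch)
    if expanded != ch:
        status = "expands to " + repr(expanded)
    else:
        status = "stays a single letter"
    print(f"U+{ord(ch):04X} {ch!r}: {status}")

# Case folding, by contrast, does map the sharp s to two letters.
print("\u00df".casefold())  # ss
```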

Wholly new letters

Main article: List of Latin letters

Some examples of letters added to the standard Latin alphabet are the Runic letters wynn ⟨Ƿ/ƿ⟩ and thorn ⟨Þ/þ⟩, and the letter eth ⟨Ð/ð⟩, which were added to the alphabet of Old English. Another Irish letter, the insular g, developed into yogh ⟨Ȝ/ȝ⟩, used in Middle English. Wynn was later replaced with the new letter ⟨w⟩, eth and thorn with ⟨th⟩, and yogh with ⟨gh⟩. Although the four are no longer part of the English or Irish alphabets, eth and thorn are still used in the modern Icelandic and Faroese alphabets.

Some West, Central and Southern African languages use a few additional letters that have a similar sound value to their equivalents in the IPA. For example, Adangme uses the letters ⟨Ɛ/ɛ⟩ and ⟨Ɔ/ɔ⟩, and Ga uses ⟨Ɛ/ɛ⟩, ⟨Ŋ/ŋ⟩ and ⟨Ɔ/ɔ⟩. Hausa uses ⟨Ɓ/ɓ⟩ and ⟨Ɗ/ɗ⟩ for implosives, and ⟨Ƙ/ƙ⟩ for an ejective. Africanists have standardized these into the African reference alphabet.

The Azerbaijani language also has the letter ⟨Ə/ə⟩, which represents the near-open front unrounded vowel.

Diacritics

The letter ⟨a⟩ with an acute diacritic.

Main article: Diacritic

A diacritic, in some cases also called an accent, is a small symbol that can appear above or below a letter, or in some other position, such as the umlaut sign used in the German characters ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ or the marks used in the Romanian characters ⟨ă⟩, ⟨â⟩, ⟨î⟩, ⟨ș⟩, ⟨ț⟩. Its main function is to change the phonetic value of the letter to which it is added, but it may also modify the pronunciation of a whole syllable or word, or distinguish between homographs. As with letters, the value of diacritics is language-dependent.
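The relationship between a base letter and its diacritic can be made visible with Unicode canonical decomposition (NFD), which splits a precomposed character into the base letter followed by combining marks. The sketch below uses Python's standard `unicodedata` module; the helper name `split_diacritics` is an assumption made for this example. Note that Unicode names the umlaut mark "diaeresis".

```python
import unicodedata


def split_diacritics(ch: str) -> tuple[str, list[str]]:
    """Split a character into its base letter and the names of its marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    # Combining marks have a non-zero canonical combining class.
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    marks = [unicodedata.name(c) for c in decomposed if unicodedata.combining(c)]
    return base, marks


# German umlaut letters followed by the Romanian letters from the text.
for ch in "\u00e4\u00f6\u00fc\u0103\u00e2\u00ee\u0219\u021b":
    base, marks = split_diacritics(ch)
    print(f"{ch} = {base} + {', '.join(marks)}")
```

The decomposition only describes the written form; as the text notes, the phonetic value of a given mark still depends on the language.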

Collation

Main article: Collating sequence

Some modified letters, such as the symbols ⟨å⟩, ⟨ä⟩, and ⟨ö⟩, may be regarded as new individual letters in themselves, and assigned a specific place in the alphabet for collation purposes, separate from that of the letter on which they are based, as is done in Swedish. In other cases, such as with ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ in German, this is not done, and letter-diacritic combinations are identified with their base letter. The same applies to digraphs and trigraphs. Different diacritics may be treated differently in collation within a single language. For example, in Spanish, the character ⟨ñ⟩ is considered a letter, and sorted between ⟨n⟩ and ⟨o⟩ in dictionaries, but the accented vowels ⟨á⟩, ⟨é⟩, ⟨í⟩, ⟨ó⟩, ⟨ú⟩ are not separated from the unaccented vowels ⟨a⟩, ⟨e⟩, ⟨i⟩, ⟨o⟩, ⟨u⟩.
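The two collation conventions can be contrasted with a simplified Python sketch. The key functions `swedish_key` and `german_key` and the word list are assumptions made for this example; they implement only the two rules described above (separate letters at the end of the alphabet versus identification with the base letter) and are not a substitute for a full collation library.

```python
import unicodedata

# Swedish places å, ä, ö as separate letters after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"


def swedish_key(word: str) -> list[int]:
    """Sort key that ranks each letter by its position in the Swedish alphabet.

    Raises ValueError for characters outside SWEDISH_ALPHABET.
    """
    composed = unicodedata.normalize("NFC", word.lower())
    return [SWEDISH_ALPHABET.index(c) for c in composed]


def german_key(word: str) -> str:
    """Sort key that identifies each letter-diacritic pair with its base letter."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# Illustrative word list; the same words are sorted under both conventions.
words = ["zebra", "\u00e4pple", "apa", "\u00f6l", "orm"]

print(sorted(words, key=swedish_key))  # ['apa', 'orm', 'zebra', 'äpple', 'öl']
print(sorted(words, key=german_key))   # ['apa', 'äpple', 'öl', 'orm', 'zebra']
```

Under the Swedish-style key the words beginning with ⟨ä⟩ and ⟨ö⟩ sort after ⟨z⟩, while under the German-style key they sort alongside ⟨a⟩ and ⟨o⟩.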

Romanization

Main article: Romanization

Words from languages natively written with other scripts, such as Arabic or Chinese, are usually transliterated or transcribed when embedded in Latin text or in multilingual international communication, a process termed Romanization.

Whilst the Romanization of such languages is used mostly at unofficial levels, it has been especially prominent in computer messaging where only the limited 7-bit ASCII code is available on older systems. However, with the introduction of Unicode, Romanization is now becoming less necessary. Note that keyboards used to enter such text may still restrict users to Romanized text, as only ASCII or Latin-alphabet characters may be available.
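The 7-bit ASCII restriction mentioned above can be checked directly. The following Python sketch, whose function name `fits_in_ascii` is an assumption made for this example, tests whether a piece of text could be sent over an ASCII-only channel; the sample strings are illustrative.

```python
def fits_in_ascii(text: str) -> bool:
    """Return True if every character of `text` is encodable in 7-bit ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


# A Romanized place name without tone marks survives an ASCII-only system.
print(fits_in_ascii("Beijing"))                           # True
# The same name in pinyin with tone marks does not.
print(fits_in_ascii("B\u011bij\u012bng"))                 # False
# Neither does the name in Chinese characters.
print(fits_in_ascii("\u5317\u4eac"))                      # False
```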

Latin alphabet and international standards

Main article: ISO basic Latin alphabet

By the 1960s, it became apparent to the computer and telecommunications industries in the First World that a non-proprietary method of encoding characters was needed. The International Organization for Standardization (ISO) encapsulated the Latin alphabet in its ISO/IEC 646 standard. To achieve widespread acceptance, this encapsulation was based on popular usage.

As the United States held a preeminent position in both industries during the 1960s, the standard was based on the already published American Standard Code for Information Interchange, better known as ASCII, which included in the character set the 26 × 2 (uppercase and lowercase) letters of the English alphabet. Later standards issued by the ISO, for example ISO/IEC 10646 (Unicode Latin), have continued to define the 26 × 2 letters of the English alphabet as the basic Latin alphabet with extensions to handle other letters in other languages.
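The 26 × 2 letters can be enumerated with Python's standard `string` module, as in the short sketch below. The constant name `BASIC_LATIN` is an assumption made for this example.

```python
import string

# The 26 uppercase and 26 lowercase letters of the basic Latin alphabet.
BASIC_LATIN = string.ascii_uppercase + string.ascii_lowercase

print(len(BASIC_LATIN))  # 52

# Every one of these letters has a code point below 128, so it fits in
# 7-bit ASCII and keeps the same code point in Unicode.
print(all(ord(c) < 128 for c in BASIC_LATIN))  # True

# First and last code points of each case.
print([hex(ord(c)) for c in "AZaz"])  # ['0x41', '0x5a', '0x61', '0x7a']
```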

The ISO basic Latin alphabet

Notes

[1] Haarmann 2004, p. 96.
[2] "Search results | BSI Group". Bsigroup.com. Retrieved 2014-05-12.
[3] "Romanisation_systems". Pcgn.org.uk. Retrieved 2014-05-12.
[4] "ISO 15924 – Code List in English". Unicode.org. Retrieved 2013-07-22.
[5] "Search – ISO". Iso.org. Retrieved 2014-05-12.
[6] "ZAKON O SLUŽBENOJ UPOTREBI JEZIKA I PISAMA" (PDF). Ombudsman.rs. 17 May 2010. Retrieved 2014-07-05.
[7] "Descriptio_Moldaviae". La.wikisource.org. 1714. Retrieved 2014-09-14.
[8] "Kazakh language to be converted to Latin alphabet – MCS RK". Inform.kz. 30 January 2015. Retrieved 2015-09-28.

References

Haarmann, Harald (2004), Geschichte der Schrift [History of Writing] (in German) (2nd ed.), München: C. H. Beck, ISBN 3-406-47998-7

External links

Diacritics Project — All you need to design a font with correct accents

This article is issued from Wikipedia - version of the 11/24/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.