Latin alphabet

Latin alphabet
Type: Alphabet
Languages: Latin and Romance languages; most languages of Europe; romanizations exist for practically all known languages
Time period: c. 700 BC to the present
Parent systems: Proto-Canaanite alphabet
  • Phoenician alphabet
Child systems: Numerous; see Alphabets derived from the Latin
Sister systems: Cyrillic
ISO 15924: Latn, 215
Direction: Left-to-right
Unicode alias: Latin
Unicode range: See Latin characters in Unicode
Note: This page may contain IPA phonetic symbols.

The Latin alphabet, also called the Roman alphabet, is the most widely used alphabetic writing system in the world today. It evolved from the western variety of the Greek alphabet, called the Cumaean alphabet, and was initially developed by the ancient Romans in Classical Antiquity to write the Latin language.

During the Middle Ages, it was adapted to the Romance languages, the direct descendants of Latin, as well as to the Celtic, Germanic, Baltic, and some Slavic languages, and finally to most of the languages of Europe.

With the age of colonialism and Christian proselytism, the Latin alphabet was spread overseas, and applied to Amerindian, Indigenous Australian, Austronesian, East Asian, and African languages. More recently, western linguists have also tended to prefer the Latin alphabet or the International Phonetic Alphabet (itself largely based on the Latin alphabet) when transcribing or devising written standards for non-European languages, such as the African reference alphabet.

In modern usage, the term "Latin alphabet" is used for any straightforward derivation of the alphabet first used to write Latin. These variants may discard letters from the classical Roman script (like the Rotokas alphabet) or add letters to it (like the Danish and Norwegian alphabet). Letter shapes have changed over the centuries, including the creation of entirely new lower case forms.



It is generally held that the Latins adopted the Cumae alphabet, a variant of the Greek alphabet, in the 7th century BC from Cumae, a Greek colony in southern Italy. Roman legend credited the introduction to one Evander, son of the Sibyl, supposedly 60 years before the Trojan war, but there is no historically sound basis to this tale. From the Cumae alphabet, the Etruscan alphabet was derived and the Latins eventually adopted 21 of the original 26 Etruscan letters.

Original Latin alphabet of the 7th c. BC

The letter C was the western form of the Greek gamma, but it was used for the sounds /g/ and /k/ alike, possibly under the influence of Etruscan, which lacked any voiced plosives. Later, probably during the 3rd century BC, the letter Z — unneeded to write Latin proper — was replaced with the new letter G, a C modified with a small horizontal stroke, which took its place in the alphabet. From then on, G represented the voiced plosive /g/, while C was generally reserved for the voiceless plosive /k/. The letter K was used only rarely, in a small number of loanwords such as Kalendae, often interchangeably with C.

After the Roman conquest of Greece in the first century BC, Latin adopted the Greek letters Y and Z (or rather readopted, in the latter case) to write Greek loanwords, placing them at the end of the alphabet. An attempt by the emperor Claudius to introduce three additional letters did not last. Thus it was that during the classical Latin period the Latin alphabet contained 23 letters:

Letter               A     B      C      D      E     F     G      H
Name                 ā     bē     cē     dē     ē     ef    gē     hā
Pronunciation (IPA)  /aː/  /beː/  /keː/  /deː/  /eː/  /ef/  /geː/  /haː/

Letter               I     K      L      M      N     O     P      Q
Name                 ī     kā     el     em     en    ō     pē     qū
Pronunciation (IPA)  /iː/  /kaː/  /el/   /em/   /en/  /oː/  /peː/  /kʷuː/

Letter               R     S      T      V      X      Y             Z
Name                 er    es     tē     ū      ex     ī Graeca      zēta
Pronunciation (IPA)  /er/  /es/   /teː/  /uː/   /eks/  /iː ˈgraika/  /ˈzeːta/

The Duenos inscription, dated to the 6th century BC, shows the earliest known forms of the Old Latin alphabet.

The Latin names of some of these letters are disputed. In general, however, the Romans did not use the traditional (Semitic-derived) names as in Greek: the names of the plosives were formed by adding /eː/ to their sound (except for K and Q, which needed different vowels to be distinguished from C) and the names of the continuants consisted either of the bare sound, or the sound preceded by /e/. The letter Y when introduced was probably called hy /hyː/ as in Greek, the name upsilon not being in use yet, but this was changed to i Graeca (Greek i) as Latin speakers had difficulty distinguishing its foreign sound /y/ from /i/. Z was given its Greek name, zeta. For the Latin sounds represented by the various letters see Latin spelling and pronunciation; for the names of the letters in English see English alphabet.

Old Roman cursive script, also called majuscule cursive and capitalis cursive, was the everyday form of handwriting used for writing letters, by merchants writing business accounts, by schoolchildren learning the Latin alphabet, and even by emperors issuing commands. A more formal style of writing was based on Roman square capitals, but cursive was used for quicker, informal writing. It was most commonly used from about the 1st century BC to the 3rd century, but it probably existed earlier than that. It led to Uncial, a majuscule script commonly used from the 3rd to 8th centuries AD by Latin and Greek scribes.

New Roman cursive script, also known as minuscule cursive, was in use from the 3rd century to the 7th century, and used letter forms that are more recognizable to modern eyes; a, b, d, and e had taken a more familiar shape, and the other letters were proportionate to each other. This script evolved into the medieval scripts known as Merovingian and Carolingian minuscule.

Medieval and later developments

It was not until the Middle Ages that the letter W (originally a ligature of V and V) was added to the Latin alphabet, to represent sounds from the Germanic languages which did not exist in medieval Latin, and only after the Renaissance did the convention of treating I and U as vowels, and J and V as consonants, become established. Prior to that, I and J had been merely glyph variants of a single letter, as had U and V.

With the fragmentation of political power, the style of writing changed and varied greatly throughout the Middle Ages, and even after the invention of the printing press. Early deviations from the classical forms were the uncial script, a development of the Old Roman cursive, and various so-called minuscule scripts that developed from New Roman cursive, of which the Carolingian minuscule was the most influential, introducing the lower case forms of the letters, as well as other writing conventions that have since become standard.

The languages that use the Latin alphabet today generally use capital letters to begin paragraphs, sentences, and proper nouns. The rules for capitalization have changed over time, and different languages have varied in their rules for capitalization. Old English, for example, was rarely written with even proper nouns capitalised, whereas Modern English of the 18th century frequently had all nouns capitalised, in the same way that Modern German does today, e.g. "All the Sisters of the old Town had seen the Birds".

Spread of the Latin alphabet

The Latin alphabet spread, along with the Latin language, from the Italian Peninsula to the lands surrounding the Mediterranean Sea with the expansion of the Roman Empire. The eastern half of the Empire, including Greece, Asia Minor, the Levant, and Egypt, continued to use Greek as a lingua franca, but Latin was widely spoken in the western half, and as the western Romance languages evolved out of Latin, they continued to use and adapt the Latin alphabet.

With the spread of Western Christianity during the Middle Ages, the alphabet was gradually adopted by the peoples of northern Europe who spoke Celtic languages (displacing the Ogham alphabet) or Germanic languages (displacing their earlier Runic alphabets), Baltic languages, as well as by the speakers of several Finno-Ugric languages, most notably Hungarian, Finnish and Estonian. The alphabet also came into use for writing the West Slavic languages and several South Slavic languages, as the people who spoke them adopted Roman Catholicism. The speakers of East Slavic languages generally adopted the Cyrillic alphabet along with Orthodox Christianity. The Serbian language uses both alphabets.

As late as 1492, the Latin alphabet was limited primarily to the languages spoken in western, northern and central Europe. The Orthodox Christian Slavs of eastern and southeastern Europe mostly used the Cyrillic alphabet, and the Greek alphabet was still in use by Greek-speakers around the eastern Mediterranean. The Arabic alphabet was widespread within Islam, both among Arabs and non-Arab nations like the Iranians, Indonesians, Malays, and Turkic peoples. Most of the rest of Asia used a variety of Brahmic alphabets or the Chinese script.

Latin alphabet world distribution. The dark green areas show the countries where this alphabet is the sole main script. The light green areas show the countries where the alphabet co-exists with other scripts.

Over the past 500 years, the alphabet has spread around the world, to the Americas, Oceania, and parts of Asia, Africa, and the Pacific with European colonization, along with the Spanish, Portuguese, English, French, and Dutch languages. The Latin alphabet is also used for many Austronesian languages, including Tagalog and the other languages of the Philippines, and the official Malaysian and Indonesian languages, replacing earlier Arabic and indigenous Brahmic alphabets. Some glyph forms from the Latin alphabet served as the basis for the forms of the symbols in the Cherokee syllabary developed by Sequoyah; however, the sounds of the final syllabary were completely different. L. L. Zamenhof used the Latin alphabet as the basis for the alphabet of Esperanto.

In the late eighteenth century, the Romanians adopted the Latin alphabet, primarily because Romanian is a Romance language. The Romanians were predominantly Orthodox Christians, and their Church had promoted the Cyrillic alphabet prior to that. Under French rule and Portuguese missionary influence, the Latin alphabet was adapted for writing the Vietnamese language, which had previously used Chinese-like characters. In 1928, as part of Kemal Atatürk's reforms, Turkey adopted the Latin alphabet for the Turkish language, replacing the Arabic alphabet. Most of the Turkic-speaking peoples of the former USSR, including the Tatars, Bashkirs, Azeris, Kazakhs, Kyrgyz and others, used the Latin-based Uniform Turkic alphabet in the 1930s, but in the 1940s all those alphabets were replaced by Cyrillic. After the collapse of the Soviet Union in 1991, several of the newly independent Turkic-speaking republics, namely Azerbaijan, Uzbekistan, and Turkmenistan, as well as Romanian-speaking Moldova, officially adopted the Latin alphabet for Azeri, Uzbek, Turkmen, and Romanian, respectively. Kazakhstan, Kyrgyzstan, Tajikistan, and the breakaway region of Transnistria kept the Cyrillic alphabet, chiefly due to their close ties with Russia.


In the course of its use, the Latin alphabet was adapted for use in new languages, sometimes representing phonemes not found in languages that were already written with the Roman characters. To represent these new sounds, extensions were created by adding diacritics to existing letters, by joining multiple letters together to make ligatures, by creating completely new forms, or by assigning a special function to pairs or triplets of letters. These new forms are given a place in the alphabet by defining an alphabetical order or collation sequence, which can vary with the particular language.


A ligature is a fusion of two or more ordinary letters into a new glyph or character. Examples are ash, Æ/æ (from AE), Œ/œ (from OE), the abbreviation & (from Latin et "and"), and the German Eszett, ß (from ſz, the archaic medial form of s followed by a z).

Wholly new letters

Examples are the Runic letters wynn (Ƿ/ƿ) and thorn (Þ/þ), and the Irish letter eth (Ð/ð), which were added to the alphabet of Old English. Another Irish letter, the insular g, developed into yogh (Ȝ/ȝ), used in Middle English. Wynn was later replaced with the new letter w, eth and thorn with th, and yogh with gh. Although the four are no longer part of the English alphabet, eth and thorn are still used in the modern Icelandic alphabet.

The Azerbaijani alphabet has adopted the letter schwa Ə/ə from the International Phonetic Alphabet, using it to represent the sound [æ]. Some West, Central and Southern African languages use a few additional letters which have a similar sound value to their equivalents in the IPA. For example, Adangme uses the letters Ɛ/ɛ and Ɔ/ɔ, and Ga uses Ɛ/ɛ, Ŋ/ŋ and Ɔ/ɔ. Hausa uses Ɓ/ɓ and Ɗ/ɗ for implosives, and Ƙ/ƙ for an ejective. Africanists have standardized these into the African reference alphabet.

Digraphs and trigraphs

A digraph is a pair of letters used to write one sound or a combination of sounds that does not correspond to the written letters in sequence. Examples are CH, RH, SH in English, or the Dutch IJ (note that ij is capitalised as IJ, never Ij, and that it often takes the appearance of a ligature in handwriting). A trigraph is made up of three letters, like the German SCH. In the orthographies of some languages, digraphs and trigraphs are regarded as independent letters of the alphabet in their own right.
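The capitalisation rule for the Dutch IJ can be illustrated with a short Python sketch. The function name `capitalise_dutch` is hypothetical, and the rule is deliberately simplified: it handles only a word-initial ij and ignores other details of Dutch orthography.

```python
def capitalise_dutch(word: str) -> str:
    """Capitalise the first letter of a Dutch word, treating a leading 'ij' as one unit."""
    if word[:2].lower() == "ij":
        # Both letters of the digraph are capitalised together: IJ, never Ij.
        return "IJ" + word[2:]
    return word[:1].upper() + word[1:]


print(capitalise_dutch("ijsselmeer"))  # IJsselmeer
print(capitalise_dutch("amsterdam"))   # Amsterdam
```

By comparison, Python's built-in `str.capitalize` treats i and j as separate letters and would produce "Ijsselmeer".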


A diacritic, in some cases also called an accent, is a small symbol which can appear above or below a letter, or in some other position, such as the umlaut sign used in the German characters Ä, Ö, Ü. Its main function is to change the phonetic value of the letter to which it is added, but it may also modify the pronunciation of a whole syllable or word, or distinguish between homographs. As with letters, the value of diacritics is language-dependent.
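In Unicode text, the relationship between a base letter and its diacritic can be made explicit. The sketch below uses Python's standard `unicodedata` module to decompose the German letters mentioned above into a base letter followed by a combining mark. The helper name `split_diacritics` is hypothetical and chosen only for this illustration.

```python
import unicodedata


def split_diacritics(char: str) -> list[str]:
    """Return the Unicode names of the code points in the canonical decomposition of char."""
    decomposed = unicodedata.normalize("NFD", char)
    return [unicodedata.name(cp) for cp in decomposed]


# U+00C4, U+00D6, U+00DC are the precomposed letters Ä, Ö, Ü.
for letter in ("\u00c4", "\u00d6", "\u00dc"):
    print(letter, split_diacritics(letter))
```

Each letter decomposes into its base letter (A, O, or U) plus the same combining mark, which Unicode names COMBINING DIAERESIS regardless of whether a language calls it an umlaut or a diaeresis.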


Modified letters such as the symbols Å, Ä, and Ö may be regarded as new individual letters in themselves, and assigned a specific place in the alphabet for collation purposes, separate from that of the letter on which they are based, as is done in Swedish. In other cases, such as with Ä, Ö, Ü in German, this is not done, letter-diacritic combinations being identified with their base letter. The same applies to digraphs and trigraphs. Different diacritics may be treated differently in collation within a single language. For example, in Spanish the character Ñ is considered a letter in its own right, and sorted between N and O in dictionaries, but the accented vowels Á, É, Í, Ó, Ú are not separated from the unaccented vowels A, E, I, O, U.
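The language-dependent collation rules described above can be sketched as sort keys. The following Python example is a simplified illustration, not a full collation algorithm: the function names are hypothetical, only the letters mentioned in the text are handled, digraphs are ignored, and real software would normally rely on locale-aware collation instead.

```python
import unicodedata

# Swedish: Å, Ä, Ö are separate letters placed after Z.
SWEDISH_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ"
# Spanish: Ñ is a separate letter between N and O.
SPANISH_ALPHABET = "ABCDEFGHIJKLMNÑOPQRSTUVWXYZ"


def strip_marks(text: str, keep: str = "") -> str:
    """Remove combining marks, except on the precomposed letters listed in keep."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)
        else:
            base = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in base if not unicodedata.combining(c)))
    return "".join(out)


def alphabet_key(word: str, alphabet: str) -> list[int]:
    """Sort key: the position of each letter in the given alphabet."""
    letters = strip_marks(word.upper(), keep=alphabet)
    return [alphabet.index(ch) for ch in letters if ch in alphabet]


def base_letter_key(word: str) -> str:
    """Sort key that identifies letter-diacritic combinations with their base letter."""
    return strip_marks(word.upper())


# Swedish: Ä and Ö sort after Z.
print(sorted(["öl", "zebra", "apa", "ärlig"],
             key=lambda w: alphabet_key(w, SWEDISH_ALPHABET)))
# ['apa', 'zebra', 'ärlig', 'öl']

# German (as described above): Ä and Ö sort with A and O.
print(sorted(["Zug", "Ärger", "Apfel", "Ofen", "Öl"], key=base_letter_key))
# ['Apfel', 'Ärger', 'Ofen', 'Öl', 'Zug']

# Spanish: Ñ sorts between N and O, accented vowels with their base vowels.
print(sorted(["ñu", "oso", "nube", "árbol", "avión"],
             key=lambda w: alphabet_key(w, SPANISH_ALPHABET)))
# ['árbol', 'avión', 'nube', 'ñu', 'oso']
```

The same word list can therefore come out in a different order depending on which language's alphabet is used as the key.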


Words from languages natively written with other scripts, such as Arabic or Chinese, are usually transliterated or transcribed when embedded in Latin text or in multilingual international communication, a process termed romanization. In the 1950s, the People's Republic of China developed an official transliteration of Mandarin Chinese into the Latin alphabet called Pinyin, although its use has been very rare outside educational and international purposes.

Whilst the romanization of such languages is used mostly at unofficial levels, it has been especially prominent in computer messaging where only the limited 7-bit ASCII code is available on older systems. However, with the introduction of Unicode, romanization is now becoming less necessary.

The English alphabet

As used in modern English, the Latin alphabet consists of the following characters:

Majuscule forms (also called uppercase or capital letters)
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Minuscule forms (also called lowercase or small letters)
a b c d e f g h i j k l m n o p q r s t u v w x y z

In addition, the ligatures Æ of A with E (e.g. "encyclopædia"), and Œ of O with E (e.g. "cœlom") may be used, optionally, in words derived from Latin or Greek, and the diaeresis mark is sometimes placed for example on the letter o (e.g. "coöperate") to indicate the pronunciation of oo as two distinct vowels, rather than a long one. Outside of professional papers on specific subjects that traditionally use ligatures in loanwords, however, ligatures and diaereses are seldom used in modern English.

Latin alphabet and international standards

By the 1960s it had become apparent to the computer and telecommunications industries in the First World that a non-proprietary method of encoding characters was needed. The International Organization for Standardization (ISO) encapsulated the Latin alphabet in its ISO/IEC 646 standard. To achieve widespread acceptance, this encapsulation was based on popular usage. As the United States held a preeminent position in both industries during the 1960s, the standard was based on the already published American Standard Code for Information Interchange, better known as ASCII, which included in its character set the 26 × 2 letters of the English alphabet. Later standards issued by the ISO, for example ISO/IEC 10646 (Unicode Latin), have continued to define the 26 × 2 letters of the English alphabet as the basic Latin alphabet, with extensions to handle other letters in other languages.
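The 26 × 2 letters occupy two contiguous ranges of code points that ASCII and Unicode share. The Python sketch below lists them and checks, as an illustration, that a letter outside the basic set cannot be encoded in ASCII. The variable names and the helper `fits_in_ascii` are illustrative only.

```python
import string

# In ASCII (and therefore in Unicode), A-Z occupy code points 65-90 and a-z occupy 97-122.
uppercase = [chr(cp) for cp in range(0x41, 0x5B)]
lowercase = [chr(cp) for cp in range(0x61, 0x7B)]

assert "".join(uppercase) == string.ascii_uppercase
assert "".join(lowercase) == string.ascii_lowercase
print(len(uppercase) + len(lowercase))  # 52

# Corresponding upper- and lower-case letters differ by a fixed offset of 32.
print(ord("a") - ord("A"))  # 32


def fits_in_ascii(text: str) -> bool:
    """Return True if every character of text is in the 7-bit ASCII range."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_ascii("Latin"))       # True
print(fits_in_ascii("\u00deorn"))   # False: U+00DE (thorn) is outside ASCII
```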

The ISO basic Latin alphabet
Aa Bb Cc Dd Ee Ff Gg Hh Ii Jj Kk Ll Mm Nn Oo Pp Qq Rr Ss Tt Uu Vv Ww Xx Yy Zz