Of the 40 languages listed below, no fewer than 18 are spoken in India (including Pakistan and Bangladesh) or China. Of the remaining 22 languages, 9 are European in origin, 3 were in the ancient cultural sphere of influence of China (Japanese, Korean, & Vietnamese), 7 are in the cultural sphere of influence of Islâm (Arabic, Persian, Malay, Javanese, Turkish, Swahili, & Hausa -- not to mention Urdu, already counted in India), 2 were in the ancient cultural sphere of India (Burmese and Thai-Lao -- as was Javanese before the advent of Islâm), and the remaining one, Tagalog, was culturally isolated, in the Philippines, until the arrival of the Spanish. The white spaces on the map, mainly in Africa, simply mean that the local languages, like Tagalog, are not classified in the cultural spheres of India, China, Europe, or Islâm.
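The tally in the paragraph above can be checked with a few lines of Python. This is only a sketch: the dictionary keys are informal labels chosen for illustration, and the counts are the ones stated in the text.

```python
# Counts of languages per cultural sphere, as stated in the paragraph above.
# The key names are informal labels, not terms taken from any source.
sphere_counts = {
    "India or China": 18,
    "Europe": 9,
    "ancient sphere of China (Japanese, Korean, Vietnamese)": 3,
    "Islam": 7,
    "ancient sphere of India (Burmese, Thai-Lao)": 2,
    "isolated (Tagalog)": 1,
}

# All spheres together should account for every language in the table.
total = sum(sphere_counts.values())

# Languages left over once India and China are set aside.
remaining = total - sphere_counts["India or China"]

print(total, remaining)
```

The two printed values correspond to the "40 languages" and the "remaining 22" mentioned in the text.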
The "cultural spheres of influence" of India, China, Europe, and Islâm are founded on the World Civilizations of their central or foundational regions, which may be defined by religion or culture but most precisely by the possession of an ancient Classical language attended by a large literature in that language. In India this language is Sanskrit,
which is first of all the sacred language of Hinduism but otherwise contains extensive secular literature and occurs as a principal language of Buddhism also. In China, Classical Chinese not only possesses literature back to the Spring and Autumn Period, but it was extensively used until even the modern period by educated writers in Japan, Korea, and Vietnam -- people who otherwise did not even speak Chinese.
In Europe, there is only one Classical language common to the whole area, and that is Greek. In a large and dominant subdivision of Europe, we also find Latin as the Classical language.
| Language | Speakers, 1993 (all) | Speakers, 2005 (first language) | Language Family | Location |
|---|---|---|---|---|
| MANDARIN CHINESE | 952 M | 873 M | Sino-Tibetan | CHINA |
| English | 470 M | 309 M | Indo-European | Europe/ America/etc. |
| HINDI | 418 M | 180 M | Indo-European | INDIA |
| Spanish | 381 M | 322 M | Indo-European | Europe/America |
| Russian | 288 M | 145 M | Indo-European | Europe/ Central Asia |
| Arabic | 219 M | 206 M | Afro-Asiatic, Semitic | Middle East/ N Africa |
| BENGALI | 196 M | 171 M | Indo-European | INDIA |
| Portuguese | 182 M | 177 M | Indo-European | Europe/America |
| Malay -- Bahasa Malaysia/ Indonesia | 155 M | 46 M | Austronesian, Malayo-Polynesian | Malaya/ Indonesia |
| Japanese | 126 M | 122 M | Altaic (?) | Japan |
| French | 124 M | 64 M | Indo-European | Europe/America/ Africa/Oceania |
| German | 121 M | 103 M | Indo-European | Europe |
| URDU | 100 M | 60 M | Indo-European | INDIA |
| PUNJABI | 94 M | 87 M | Indo-European | INDIA |
| Korean | 75 M | 67 M | Altaic (?) | Korea |
| TELUGU | 73 M | 69 M | Dravidian | INDIA |
| MARATHI | 70 M | 68 M | Indo-European | INDIA |
| TAMIL | 69 M | 66 M | Dravidian | INDIA |
| CANTONESE CHINESE | 66 M | 54 M | Sino-Tibetan | CHINA |
| SHANGHAI CHINESE | 65 M | 77 M | Sino-Tibetan | CHINA |
| Javanese | 64 M | 75 M | Austronesian, Malayo-Polynesian | Indonesia |
| Vietnamese | 64 M | 67 M | Austro-Asiatic, Mon-Khmer (?) | Indo-China |
| Italian | 63 M | 61 M | Indo-European | Europe |
| Turkish (Azeri, Turkmen) | 59 M (18 M) | 50 M (37 M) | Altaic | West & Central Asia |
| Tagalog | 53 M | 15 M | Austronesian, Malayo-Polynesian | Philippines |
| MIN CHINESE | 50 M | 67 M | Sino-Tibetan | CHINA |
| Thai & Lao | 50 M | 49 M | Tai-Kadai | S.E. Asia |
| Swahili | 48 M | 5-10 M | Niger-Kordofanian | East Africa |
| HUNAN CHINESE (Xiang) | 48 M | 36 M | Sino-Tibetan | CHINA |
| Ukrainian | 47 M | 39 M | Indo-European | Europe |
| KANARESE (Kannada) | 44 M | 35 M | Dravidian | INDIA |
| Polish | 44 M | 42 M | Indo-European | Europe |
| BIHARI (Bhojpuri) | 42 M | 26 M | Indo-European | INDIA |
| GUJARATI | 41 M | 46 M | Indo-European | INDIA |
| Hausa | 38 M | 24 M | Afro-Asiatic, Chadic | West Africa |
| MALAYALAM | 35 M | 35 M | Dravidian | INDIA |
| Persian & Tajiki | 34 M | 35 M | Indo-European | Iran/Central Asia |
| HAKKA CHINESE | 34 M | 29 M | Sino-Tibetan | CHINA |
| ORIYA | 32 M | 31 M | Indo-European | INDIA |
| Burmese | 31 M | 32 M | Sino-Tibetan | Burma |
The youngest civilization and cultural area would be that of Islâm, whose language, Classical Arabic, represents a large body of secular and religious literature from the Middle Ages down to the present.
With all Classical languages, other languages within their sphere of influence tend to borrow vocabulary, and sometimes even grammar, extensively from the defining language of the civilization. Along with that come references to particular items of literature, history, and religion. Thus, Arabic words frequently occur in Persian, Turkish, Hindi-Urdu, Malay, Swahili, etc., even as Greek and Latin words are regularly and easily found in English, or Chinese words in Japanese, Korean, and Vietnamese. Educated Europeans can be expected to know about Thermopylae, while educated Chinese, Japanese, Koreans, and Vietnamese would be expected to know about the Three Kingdoms, and Muslims about the Bloody Shirt of 'Uthmân.
These numbers are from The World Almanac and Book of Facts 1995 [Funk & Wagnalls, 1994, pp. 598-599] and The World Almanac and Book of Facts 2008 [World Almanac Education Group, Readers Digest, 2008, pp. 728-729]. The 1995 edition reports data from 1993, and the 2008 edition data from 2005.
The treatment of the languages is awkwardly different in the two editions. In 1995, the languages were listed alphabetically and all speakers were given for each language. In 2008, however, the languages are listed by country and only numbers for those who speak them as first languages are given. This results in some dramatic changes in the numbers. Languages widely spoken as second languages, such as Mandarin Chinese, English, Hindi, Russian, Arabic, Malay, French, and Swahili, thus seem to have lost millions of speakers by 2005.
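The size of this effect can be read straight off the table. The following Python sketch uses the 1993 and 2005 figures given in the table above (in millions); the variable names are illustrative only, and Swahili is left out because its 2005 figure is given only as a range.

```python
# (1993 all speakers, 2005 first-language speakers), in millions,
# copied from the table above. Swahili is omitted: its 2005 figure is a range.
speakers = {
    "Mandarin Chinese": (952, 873),
    "English": (470, 309),
    "Hindi": (418, 180),
    "Russian": (288, 145),
    "Arabic": (219, 206),
    "Malay": (155, 46),
    "French": (124, 64),
}

# Apparent "loss" caused by counting only first-language speakers in 2005.
drops = {name: all_1993 - first_2005
         for name, (all_1993, first_2005) in speakers.items()}

# List the languages from the largest apparent drop to the smallest.
for name, drop in sorted(drops.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {drop} M fewer")
```

The differences are not real losses of speakers; they mostly reflect the change from counting all speakers to counting first-language speakers only.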
Indeed, the 2008 edition does not list Swahili at all -- a very grave and strange oversight, especially when the list claims to include all languages with at least 2 million speakers. Swahili, which has a large Arabic component, may have ten million or fewer speakers as a first language; but it is a national language in Tanzania, Kenya, Uganda, and the Congo and is even used by the United Nations. It has the reputation among some of being the common language of all of Africa, but it is actually not spoken in the West or South.
Arabic also receives odd treatment in the 2008 edition, since it is broken up by dialect (16 of them) for various Arabic-speaking states. In general, these are not separate languages, although North African Arabic, Maghribî, is rather different from the Middle Eastern dialects. Nevertheless, this overlooks the written language (the dialects are explicitly identified as "spoken"), which is the much more unified language of literary Arabic. Since Arabic is the language of Islâm, Moslems around the world, as far afield as Indonesia (which is over 90% Moslem), learn it for religious reasons as a second language (which is not reflected in the 2005 data).
The treatment of Arabic in the 2008 Almanac means that, while it was given on a short list as one of the "principal languages of the world" in 1995, Arabic disappears from the corresponding 2008 short list of "languages spoken by the most people." Certainly, speakers of any dialect of Arabic would find this development annoying, misrepresentative, ahistorical, and perhaps insulting.
By some estimates, up to a billion people could have some competence in English. But even the figure for Mandarin shrinks when we leave out other Chinese (perhaps a hundred million) who have learned Mandarin as a second language. Some languages, like Swahili and Malay, started out as trade languages which soon were essentially second languages. They continue to have a far smaller number of speakers as first languages than as second. Malay is the first language of less than 50 million people. But as a trade language which has become a national language of Malaysia, Indonesia, and Singapore (called Bahasa Malaysia, Bahasa Indonesia, and Bahasa Melayu respectively), Malay, a Malayo-Polynesian language, is one of the major languages of the world. One would not know this from the 2005 data.
Apart from Malay and Swahili, some languages on the list drop below 30 million in the 2005 data for reasons that are less obvious. Thus, Tagalog, Bihari, Hausa, and Hakka all lose millions of speakers from 1993 to 2005. With Tagalog this may reflect its use as a national language of the Philippines and so as a second language for many speakers of other Philippine languages. Hakka, as a language of traders, and with a geographical distribution that is very scattered, also may have a significant population who use it as a trade language. With Bihari the problem may have been the unreliability of census data (perhaps a problem with other Indian and Chinese languages). Other languages on the list probably have lost numbers because of an actual shrinkage in the number of speakers, as with Japanese, German, Polish, and Ukrainian, where populations have been aging without a replacement level of births. It does surprise me some that no new languages have grown to have more than 30 million speakers between 1993 and 2005.
| Language | Speakers, 1993 (all) | Speakers, 2005 (first language) | Speakers, 2006 (Katzner) | Language Family | Location |
|---|---|---|---|---|---|
| Sundanese | 26 M | 27 M | 30 M | Austronesian, Malayo-Polynesian | Indonesia |
Mandarin Chinese has been expanding against the other Chinese languages because of its political, cultural, and demographic dominance and the peculiar relationship of these languages to each other (they are written with the same Classical Chinese characters). In India no language has a status comparable to Mandarin in China. Indian states have their own official languages, recognized by the Constitution, but giving official national status to Hindi, which is common in the North, actually set off riots. Various Indian languages are certain to continue and thrive, while English continues for purposes of neutral national communication. The list of languages in the 1995 Almanac overlooked Bihari in India and Hunan in China, so I had to use numbers from other sources. The 2008 Almanac, on the other hand, has a rather full list of Chinese and Indian languages.
Hindi and Urdu are really the same language (Hindi-Urdu or Hindustani), with Hindi spoken by Hindus and Urdu spoken by Moslems. On the literary level these languages now diverge in vocabulary, with Hindi borrowing from Sanskrit [Sãskr.ta] and Urdu borrowing from Arabic and Persian. Hindi-Urdu, however, because it grew up under the Moslem Moghul Emperors, had a Persian and Arabic component from the beginning, which survives even in Hindi. "Hindi" [Hyndi] itself is from Arabic Hindî, though that is ultimately from Sanskrit sindhu, meaning a river, the Indus River, or the Sindh region of India. "Urdu" [Wrdu] is from Persian ordu, meaning a camp, or Turkish ordu, meaning an army [note]. Both are derived from Mongolian orda (which had both meanings), as is the English word "horde," which came through the Polish rendering, horda. The name "Urdu" commemorates the circumstance that the language developed in the army camps of the Moghul Emperors, where the originally Turkish and Afghani forces of the Moghuls interacted with the locals. Both Hindi and Urdu have borrowed from English and other modern languages.
I have given Turkish, meaning the Osmanli language of Turkey, with other languages, Azeri and Turkmen, which are so closely related as to sometimes be considered one language (Oghuz Turkish, in the family of Altaic languages). However, both Almanacs, and most other sources, list them all separately, mostly for political, nationalistic reasons. Similarly, I have given Persian and Tajiki together because the latter really is a dialect of Persian -- though I notice some sources confuse it with the nearby Turkic languages.
Only two Sub-Saharan African languages -- Hausa and Swahili -- appear on the list. This reflects the circumstance that a large number of languages are spoken in Africa, and many areas are not densely populated. The most populous country in Africa, Nigeria, with over 100,000,000 people, contains many languages. Of its principal languages, Hausa, Ibo (or Igbo), Yoruba, and Fulani (or Fula), only Hausa makes the list. As of 1993, Ibo had 17 million speakers, Yoruba 20, and Fulani 13 (many of them outside Nigeria). The 2008 Almanac gives only 24 million speakers for Hausa, 18 for Ibo, 19 for Yoruba, and skips, in the peculiar way of its treatment, Fulani altogether -- 22 million is given by Kenneth Katzner, in the book cited below. Hausa evidently is widely used as a second language, which may account for the drop of over 10 million in speakers from one list to the other. Katzner gives an estimate of a total of 55 million Hausa speakers.
The languages with the largest number of speakers in South Africa, Zulu and Xhosa, have about 9 million and 8 million speakers, respectively -- 9 and 7 in the 2005 data. Both Hausa and Swahili are identified as part of the culture area of Islam, because Hausa is predominately spoken by Muslims and because Swahili, although an African language spoken by many non-Muslims, grew up as a trade language under Islamic influence. Thus, the name Swahili itself is Arabic, Sawâh.ilî, from sâh.il, "coast," and sawâh.il, "coasts" (in Arabic a "broken" or irregular plural). The Swahili word for "book," kitabu, is Arabic (kitâb); but since many nouns in Swahili begin with ki- and form their plurals by changing that to vi-, "books" is vitabu, which is not at all like Arabic, where the plural is kutub.
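The ki-/vi- alternation described above is regular enough to state as a one-line rule. The sketch below is illustrative only: the function name is made up for this example, and it covers just the pattern described here, not the other Swahili noun classes or nouns that merely happen to begin with ki-.

```python
def ki_vi_plural(noun: str) -> str:
    """Form the plural of a Swahili noun of the ki-/vi- type.

    Hypothetical helper for illustration: it only swaps the prefix
    ki- for vi- and knows nothing about other noun classes.
    """
    if not noun.startswith("ki"):
        raise ValueError(f"{noun!r} does not begin with ki-")
    return "vi" + noun[2:]


# The borrowed Arabic word for "book" is treated like any other ki- noun.
print(ki_vi_plural("kitabu"))
```

Applied to kitabu, the rule yields vitabu, the form cited in the text, whereas Arabic forms its own plural by changing the vowels inside the word.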
Lost in the vast extent of the World Civilizations is a culture with a claim to be a civilization in its own right. That is Ethiopia.
As a Christian nation, Ethiopia shares in a sub-Roman civilization, but it is otherwise related to the language, alphabet, and culture of South Arabia. South Arabia itself, of course, became part of the Oecumene of Islâm, which spread around Ethiopia and cut it off from most contact with the outer world -- even while its surviving connection, through the Coptic Patriarchate in Egypt (which appointed the Primate of Ethiopia until 1945), was compromised by the difficulties of travel, the alienation of the Coptic Church from Greek and Latin Orthodoxy, and, of course, the Arab Conquest and occupation of Egypt. This left Ethiopia as its own kind of Island Universe in world history. It even possesses its own Classical Language, Ethiopic or Ge'ez. But the major modern descendant of Ethiopic, Amharic, is only spoken by 17 million people -- so it did not make the cut for the table above. A legend arose in Europe in the Middle Ages that there was a lost Christian kingdom, ruled by the saintly "Prester John," somewhere in Africa or Asia. Although it is hard to know if there was any factual basis for this legend at the time (there may have been rumors of Nestorian rulers of Black Cathay), when the Portuguese arrived in the Indian Ocean, they soon discovered that there was indeed just such a Christian kingdom in Africa. Even now, it is hard to know just how to classify the place. The Mediterranean world of Rome, to which Ethiopia was connected, is long gone, but it doesn't sound even remotely correct to then include Ethiopia in the European civilization that is Rome's successor. So Ethiopia remains an anomaly, economically one of the poorest countries in the world, but historically and culturally ancient, unique, and extraordinary in its mountain fastness.
The following map adjusts the size of the areas of the earth to their population. We see why so many of the languages of India and China belong to the 40 languages with over 30 million speakers. I have adapted this from the Fontana Pocket Atlas [Fontana Books, William Collins Sons & Co. Ltd., 1969, pp. 114-115]. Since I bought the book in 1970 (in Beirut, of all places), the proportions may not be entirely up to date -- it looks like the population of Africa has doubled in the meantime. The book completely overlooked the Philippines, whose population now is about four times that of Taiwan (Luzon has about twice the population of Taiwan). So I have tried to produce a likely estimate. The languages in the table above are identified on the map either by language (Swahili) or by country (Nigeria for Hausa). Some places are identified for interest or clarity (Cyprus, Bali).
General information about world languages may be found in The Languages of the World, by Kenneth Katzner [Routledge & Kegan Paul, revised 1986, Third Edition, 1995, 2002, 2006] and The World's Major Languages, edited by Bernard Comrie [Oxford University Press, 1987]. There is a lot of uncertainty about the populations for Chinese "dialects." The separate discussion for Chinese dialects should be consulted. Thorough treatments of Chinese may be found in The Chinese Language, Fact and Fantasy, by John DeFrancis (University of Hawaii Press, 1984) and The Languages of China, by S. Robert Ramsey (Princeton University Press, 1987).
The transcription of Hindi today tends to follow the conventions of writing Sanskrit, with long and short vowels, e.g. "a" and "â." However, the modern vowels actually contrast quality rather than quantity in pronunciation, like "long" and "short" vowels in English. In Persian, which is written using the long and short vowels of Arabic, but also has modern contrasts of quality rather than quantity, the differences in quality are easily rendered. The Old English ligature of "a" and "e," "æ," can be used for the "short a," a sound that is like the "a" in Modern English "bad." As luck would have it, both Sanskrit and Arabic use the basic vowels "a," "i," and "u."
Languages with more than 30,000,000 Speakers as of 2005, Note
| Sanskrit & Arabic (short) | Sanskrit & Arabic (long) | Persian (short) | Persian (long) | Urdu (short) | Urdu (long) |
|---|---|---|---|---|---|
| a | â | æ | a | ə | a |
| i | î | e | i | y | i |
| u | û | o | u | w | u |
A system similar to that of Persian was used in the original edition of Teach Yourself Urdu, by T. Grahame Bailey, J.R. Firth, and A.H. Harley [David McKay Company, New York, English Universities Press, 1950, 1956, 1967], and in the 1972 edition of Teach Yourself Punjabi, by C. Shackle [Hodder and Stoughton, Ltd., David McKay Company, 1972, 1980]. There, the short "a" is pronounced and written with the "shwa," "ə," an indefinite reduced vowel, like the "a" at the end of English "sofa." The short "i," pronounced as in English, is written "y." And the short "u," like English again, is written "w." "Y" can be a vowel in English, but not "w"; but there will be no confusion for Urdu, Hindi, or Punjabi if we don't get combinations like "yy" for "yi" or "ww" for "wu." Apparently we don't.
I have not seen this elegant system used for Urdu or Punjabi beyond these particular books. Instead, the sources tend in Urdu to follow the transcription conventions for Arabic, as Hindi and Punjabi do for Sanskrit. Indeed, this is what is done in the new edition of Teach Yourself Urdu [David Matthew and Mohamed Kasim Dalvi, Hodder Education, McGraw-Hill, 1999, 2003, 2007]. I have not yet examined more recent editions of Teach Yourself Punjabi. Such an approach is also common with Persian, despite its unsuitability. For Urdu, Punjabi, and Hindi this is a shame, since the student will be startled to find how some familiar names, like Tâj Mahal, are actually pronounced as they would be written in the old Teach Yourself Urdu: Taj Məhəl. I have used some forms above from these older Teach Yourself books but have followed the other, Arabic/Sanskrit conventions elsewhere at this website.
Another convention at this website is to use the standard form of written Arabic, the naskh or naskhî, for all uses of the Arabic alphabet. This contrasts with the style commonly used for writing Persian and Urdu, the nasta'lîq, short for naskh ta'lîq, "hanging naskh." The nasta'lîq is an oblique, sloping, ornate version of the alphabet, which I find difficult to read.
A friend of mine was beginning to take Arabic at the American University of Beirut when we were there in 1969. The class began with learning the alphabet and how to write it. There were some Iranians in the class, who of course already knew the alphabet. When called to the blackboard, they wrote the letters in their accustomed fashion. Professor Ghoul (yes, the word "ghoul" in English, and the star Algol) turned to the class and announced that this was the "Persian Hand" and that it would not be tolerated in his class. The Iranians would need to learn the proper way to write Arabic. The naskhî does seem more natural to me for the printed page.
I found that one of the most permanent cultural traits in the written cultures of Eurasia (including the Northern part of Africa and Ethiopia) is precisely the phenomenon I am calling hieroglossia. By this I mean the sum of relations that develop between a language perceived as a central or founding element in a given culture area (this language being the hierogloss) and the language or languages that are perceived as being dependent, not historically or linguistically, but ontologically or theologically, on that hierogloss. Within a hieroglossic relationship, the language perceived as dependent, often called the "vulgar tongue" or "vernacular" (or, as I will call it, "laogloss"), is clearly considered not to be self-sufficient.
Jean-Noël Robert, "Hieroglossia," Nanzan Institute for Religion & Culture, Bulletin 30 (2006), p. 26
Languages die when others take their place -- we don't need Latin or any dead language, because we've got languages of our own.
John H. McWhorter, The Power of Babel [Perennial, 2003, p.255]
Caught in that sensual music all neglect
Monuments of unageing intellect.
William Butler Yeats (1865-1939), "Sailing to Byzantium"
John McWhorter is a fine scholar in linguistics and an engaging and attractive teacher of the subject. However, for someone who agonizes over the loss of even one of the 6000 living spoken languages in the world, despite many languages with very few speakers and no literature, his attitude seems curiously inconsistent when it comes to "dead" languages like Latin, as we see in the quote above. Presumably, if the language of the Seneca Indians died out, then it would be of no interest to him and he would express a similar level of contempt. Somehow I doubt that would be the case. Do we need the Tonkawa language, which used to be spoken in central Texas? For certain purposes, yes, scientific, aesthetic, and historical. But, apart from linguistics, there is not much else to do with Tonkawa. With Latin there is much to do, because there is much to read. We learn of people, events, and ideas over a span of many centuries.
And that is the point. It used to be the case that education meant learning the Classical language of one's civilization. This was not just an exercise in memorization to show off. There was stuff to read. Early on, there was originally nothing else to read, because the Classical language might be the only written language in the culture, and its literature the only literature. You read that or nothing else.
This meant that even a Classical language that was the first language of no person, and was learned by no one in the cradle, nevertheless was a "living" language in most senses that we could possibly attribute to it. People did speak it. People read it. And people wrote in it. Again, it might be the only language that they wrote. Or, even if it wasn't, even if there were spoken, vernacular languages that were written and read in daily usage, the Classical language was the doorway, if not to the only learning of the civilization, then to the fundamental, formative, and defining learning and knowledge of the civilization [note].
The neglect of the Classical languages of Europe, Greek and Latin, today is the consequence of the vernacular languages, not only becoming written languages themselves, not only containing their own extensive literature, but actually replacing the culture and literature of the Classical languages as the proper representatives of their civilization, rendering the Classics "dead" through a sense of irrelevance. I think there are two main reasons for this. One is nationalism. National pride may not be able to countenance the implication that an ancient language, a literature, and a civilization transcend and outweigh that of any particular modern nation. The other reason is the precedent of science. It does not matter that Isaac Newton wrote the Philosophiae Naturalis Principia Mathematica in Latin.
Love the Lord your God in all your heart and in all your soul and in all your mind.
Progress means that the civilization, and its language and literature, are always new. Whether science should be taken as a paradigm in other areas of life, however, is open to serious question. History, philosophy, and ethics are not necessarily understood better by more recent historians, philosophers, and moralists than by ancient ones, although, to be sure, this is open to debate. But it is not as though translations of Thucydides are being read in English or Italian as diligently as University students were once expected to read him in Greek. Greek and Roman literature now tend to be neglected nearly as much as their original languages. So the debate doesn't even start.
Instead, the conceit is that Classical learning, into which the Classical languages were windows, is as obsolete and superseded as Classical physics. This is a pretence of deep folly. The Olympian level of wisdom that produced the United States Constitution, for instance, is almost wholly lacking from recent American politics. American politicians literally do not have the same education; and the result is that they do not understand, and certainly do not believe in, the Constitution they all take an oath to "preserve, protect, and defend."
That Classical civilization should be despised by modernity contains its own bitter irony, when American education now is nearly as deficient in mathematics and science as it is in the Classics. The modern intelligentsia affect a nihilism whose contempt and ignorance fall equally on Greek, Latin, science, and even progress itself. The result is evidently supposed to be some kind of liberation from the shackles of the past and of arbitrary authority. The effect, however, is merely the autism and stupefaction of a dumb and self-referential twilight existence in an isolated present. The particular, the subjective, and the irrational become the inspirations for conflicts in which even common humanity is dismissed.
The only Classical language that all European civilization has in common is Greek. Next comes Latin, which was current within Francia (through which once ran the writ of the Popes, until the Reformation). Behind Greek and Latin, however, there is Hebrew, which also counts as a Classical language for Christendom as well as for Judaism because an essential item of religious literature, the Hebrew Bible, is in that language (and some Aramaic).
| Civilization | Region | City | Classical Language |
|---|---|---|---|
| Europa | 1. Romania | 2. Constantinople | 1. Greek |
| | 2. Francia | 1. Rome | 2. Latin |
| | 3. Russia | 3. Moscow | 3. Old Church Slavonic |

are at once national languages, religious languages, and foundational languages for particular traditions of ancient culture and religion, sometimes with larger implications --
such as the Syriac translations of Greek philosophy that mediated its ultimate translation into Arabic. But usually these languages did not rise to larger significance and were not learned outside their nations except by specialist scholars.
Outside of Europe, there is greater simplicity to the foundational Classical languages of the other great centers of world civilization. In Islam, Arabic is the supreme and defining Classical language, with an unmatched religious preeminence but also with many centuries of secular literature, much of which made its way into Europe in the Middle Ages, although its place in the modern world is less cosmopolitan.
Nevertheless, spoken dialects of Arabic still compete with an elevated, literary Arabic, approaching the Classical language, as the dominant written language in Arab countries. Historically, Persian has often competed with Arabic as a literary and secular language, and other modern languages of Islam, from Swahili to Urdu or Malay, now have their own literature; but no language of an Islamic culture will ever escape the shadow of Arabic. And all serious Muslims will learn Arabic to some degree in order to read the Qur'ân.
In India, Sanskrit is foundational for the autochthonous civilization and for all the religions -- Hinduism, Jainism, and Buddhism in particular -- with an Indian origin. A vast literature in Sanskrit begins with the Vedas and continues nearly to the present. The language is still actively taught and used, although I am not aware of much original literature now being produced in it. Nevertheless, its influence continues on the modern languages, like Hindi, that place themselves deliberately in the tradition of Sanskrit civilization and consequently use it as a source of borrowings and neologisms, as European languages do with Greek and Latin. The sacred character of Sanskrit is more marked than with the likes of Greek or Latin, and it is hard to imagine Hinduism giving up the Sanskrit formulae of the Vedas for vernacular translations as Christian churches have generally done in Europe. As with Arabic for Islam, or Hebrew for Judaism, the language itself is essential to the religion, its meaning, and its power -- the Qur'ân, as with the Vedas, is believed to exist eternally in its original language.
Chinese civilization (and those it influences) has a unique relationship to its Classical language. The modern spoken languages, although quite different, nevertheless use most of the ancient characters, the 漢字 (kanji in Japanese), which means that reading knowledge of Mandarin or Cantonese gives one very nearly all that is needed to begin reading the Classical language. Consequently, Chinese language departments often present Classical Chinese as a subsidiary study to something like Mandarin. Some people are left under the impression that the Chinese of Confucius is an artificial language that has been abbreviated from something that was already much more like Mandarin. The interesting case is then, historically, when Koreans, Japanese, or Vietnamese read and wrote Classical Chinese without ever learning or speaking the contemporary spoken versions of Chinese. Some Chinese scholars find this incomprehensible, improper, or offensive. Yet that is the history, and it also means that much of East Asian civilization, in all the areas around China, has been expressed through, and influenced by, Classical Chinese literature. By abandoning Chinese characters, Korean and Vietnamese have lost their connection to the ancient language; but it is still a living presence in Chinese, in all its separate modern spoken languages, and Japanese. As with Arabic and Sanskrit, Classical Chinese is simply not a "dead" language.
Greek and Latin are not really dead languages either, but a great deal of effort is being put into making them so. Having inspired this attitude, science itself gets tossed away equally in the general shambles and militant ignorance into which Western "education" is being steadily reduced. So impressive is what civilization has done in the present, the false lesson is that just anything, and especially any self-indulgence, will be just as good as whatever was in the past -- the fruit of the long discredited but nevertheless continuing "self-esteem" movement. This is an approach whose payoff is self-destructive and suicidal. One wonders if it is also behind the self-hatred that is found in much educated opinion in the West. The enlightened no longer have anything rational or substantive to believe in and actually develop a rage, like an abandoned child, for the disappointing and absent parent. The orphans of Western civilization are neither wise nor happy people.
Since I posted the essay above, my wife has drawn my attention to the article "Hieroglossia," by Jean-Noël Robert [Nanzan Institute for Religion & Culture, Bulletin 30, 2006], which echoes in some detail many of the points and sentiments presented here, complete with a critique of attitudes (like McWhorter's) in modern linguistics, using examples as disparate as Syriac, Armenian, Persian, and the use of Chinese in Japanese literature. I have added an epigraph from Robert's article at the top of this section. Otherwise, I warmly recommend the full original, which is available online. While I would prefer not to call Classical languages in which literature is actively generated "dead" languages, as Robert does, he does supply a useful term for classical languages that are used only for liturgy, like Old Church Slavonic or Coptic, and so are approaching truly "dead" status: "passive" hieroglossic languages. To Robert's excellent discussion of the Arabic element in Modern Persian, I would like to add my discussion of the Arabic and Persian elements in Ottoman Turkish.
This is the sort of bloody nonsense up with which I will not put. Winston Churchill, variously quoted, provenance uncertain.
A similar and related issue also arises over prescriptive grammar, i.e. teaching people that certain usages are wrong (e.g. "between you and I") because of a rule that was generalized for an earlier, written, and more prestigious stage of the language. John McWhorter doesn't have much use for this kind of thing either. He values the living, changing, spoken language, where usage steadily changes and grammar and vocabulary evolve over time. Dealing with this is simply "descriptive grammar," not "prescriptive." Of course, it turns out that some examples of prescriptive grammar are things that never were features of usage (like the evils of the "split infinitive" or ending a sentence with a preposition, about which we see Winston Churchill's famous response above) but that were made up out of whole cloth by grammarians, who had their own aesthetic preferences and then believed that others should agree with them. There was also the problem of getting right the grammar that actually applies to a language like English, rather than to Latin.
A good example would be correcting people who answer "That's me" rather than "It is I." The former uses a pronoun in the accusative case as a predicate nominative, where it should be in the nominative. The latter corrects this "error." Unfortunately, no one says "It is I" in genuine, colloquial speech. Also, one wonders if the same criticism would be applied to Louis XIV for saying L'état c'est moi, which by the same grammatical principle should be *L'état c'est je. The latter, however, truly is bad French; but moi seems to be neither nominative (je) nor accusative (me). Modern English, which is strongly influenced by French, uses "me" for both the accusative case and for this sort of "topical" use of moi. "That's me" or "It's me" are perfectly grammatical, just not obviously in terms of Latin grammatical cases.
Another issue would be the inherent ambiguity of certain grammatical rules. The Bible says, "For the wages of sin is death" [Romans 6:23]. There is something odd and archaic about that sentence, probably because the plural number of the subject ("wages") does not agree with the singular number of the verb ("is"). The verb actually is agreeing with the number of the predicate nominative ("death"). There is in truth a dilemma here that is not easily resolved. Where the number of the subject and the predicate nominative do not agree, there is going to be a sense of inconsistency whichever number the verb is in. Where today we may expect the verb to agree with the subject, come what may, the translators working for King James apparently saw the matter otherwise. Whichever way we go, there is clearly an arbitrary element, which is something that grammatical martinets seem reluctant to allow.
Apart from the silly idiosyncrasies of grammarians, confusions about getting the grammar right, and inherent logical problems in grammar, the issue is still a serious one in another respect. As language changes, new languages emerge, which are as different and foreign from the parent language as many unrelated languages. This means you can no longer read the literature. Old English (Anglo-Saxon) is as foreign to Modern English as German, while Middle English (Chaucer) is barely more intelligible than Dutch. Jane Austen (1775-1817) is recognizably Modern English, with some curiosities.
The stream of Time, irresistible, ever moving, carries off and bears away all things that come to birth and plunges them into utter darkness, both deeds of no account and deeds which are mighty and worthy of commemoration; as the playwright [Sophocles] says, it "brings to light that which was unseen and shrouds from us that which was manifest." Nevertheless, the science of History is a great bulwark against the stream of Time; in a way it checks this irresistible flood, it holds in a tight grasp whatever it can seize floating on the surface and will not allow it to slip away into the depths of Oblivion.
Anna Comnena (1083-1153), The Alexiad, translated by E.R.A. Sewter [Penguin Classics, 1969, p.17].
The process by which a writer like Shakespeare ceases to be easily understood by speakers of the recent language is one that McWhorter seems quite happy to see speeding along. As with his disdain for Classical languages, the result is the same: the loss of the past. Through most of human history, this would have been viewed with alarm. The loss of sacred languages -- Hebrew, Sanskrit, Arabic -- would even have been viewed with horror and fear. It is only now that people assert or affect no interest in the past, seeing it as a weight and a shackle to the new and better. As discussed above, this is an attitude of great folly, and not just as a matter of intellectual curiosity. John McWhorter certainly should know better, but his is a common attitude in linguistics. The practice of science is usually to study rather than use. Linguistics ironically studies "dead" languages as much as living ones, but it then sees "use" only in terms of people speaking, not in terms of reading the words of those long dead -- people whose minds nevertheless still live through their writings. That is the miracle of the written language, through which Socrates, Dante, and Confucius come to life, and without which Linguistics as a science as well as books by John McWhorter would be impossible.
As it happens, McWhorter himself is alarmed about a closely related issue. In his Doing Our Own Thing: The Degradation of Language and Music and Why We Should, Like, Care [Gotham Books, 2004], McWhorter laments the loss of the elevated oratorical and rhetorical tradition of spoken language which derives from the written medium. He sees the virtually illiterate modes of spoken language alone coming to dominate public speech, literature, poetry, and also even music, to the great loss of art, intellect, and sophisticated communication. I don't think that McWhorter appreciates, however, the degree to which his own dismissal of Classical languages and prescriptive grammar contributes to this "degradation," as he calls it himself, of language. After all, as he admits, the whole tradition of oratory and rhetoric goes back to Classical models. Famous speakers of the past, such as Edward Everett, whom he considers at the beginning of his book, had certainly been educated with a Classical and grammatical emphasis that McWhorter otherwise decries or dismisses. I don't think he appreciates that the loss or "degradation" of one part of the tradition marches in step with the general loss of literate sophistication -- and the consequent loss of the past -- that he targets. We can certainly benefit from a better understanding of questions about grammar, and I would thus agree with McWhorter that educational reforms about language were in order, with respect to many attitudes of the 19th century or earlier; but I do think he has missed an essential part of what has happened, and that he has therefore himself been a part of the negative and degrading tendency.

The basic meaning of civilization is the presence of cities, and the basic meaning of history is the presence of written records. There can be civilization without writing (the Incas), and perhaps writing without much in the way of cities (runes), but the creation of writing gives to the earliest historical civilizations a role that prior urban culture (as at Jericho) could not match. The four earliest centers of historical civilization stretch diagonally across south Asia into Africa. They are defined by their writing systems. The earliest is in Sumer (or Sumeria), where we now have evidence of a long pre-history of writing. After early pictograms, the writing system that emerged, cuneiform, is named after the wedge shapes that were made by reed pens on clay tablets. This was a cumbersome and messy medium for writing but possesses the virtue from our point of view that burned tablets can become as durable as bricks.
The Sumerians themselves did not last long, and are no longer distinguishable as a people after the end of the III Dynasty of Ur, around 2000 BC. Their language has no known affinities, though the Caucasus is still home to similarly isolated and unique language groups, three of them. A chain of ancient non-Indo-European and non-Semitic languages -- of Elam, the Kassites, the Hurrians, and Urartu -- stretched from Sumer to the Caucasus, but too little is known of these languages, or of the early forms of the Caucasian ones, for certain connections to be drawn. Sumerian civilization, however, did not die, since most of its elements, and the cuneiform writing system itself, were adapted to writing a Semitic language, Akkadian, whose daughters, Babylonian and Assyrian, bore the literature of subsequent Mesopotamian civilization, even while lovingly preserving knowledge of Sumerian. The last cuneiform text is from 75 AD, and so this is taken as marking the end of Sumerian civilization, even if the end of the Sumerians themselves long antedates it.
Hard on the heels of Sumer came Egypt, with evidence of Sumerian influence, where a new writing system, hieroglyphics, developed -- now with some evidence emerging of its antecedents in Egypt. Of the durable systems of writing, hieroglyphics alone retained its pictographic character, though the Egyptians developed cursive and abbreviated forms for more practical purposes. The Egyptians also developed a more practical medium for writing, papyrus scrolls, though these have the drawback, from our point of view, of easily burning and decaying. An intact Egyptian papyrus is a prize, though these are more common in the dry climate of Egypt than similarly volatile media would be in the damp Ganges Valley of India. The Egyptians themselves, and their writing, were somewhat more durable than Sumer. The last hieroglyphic inscription was carved in 394 AD, and the last cursive (Demotic) papyrus is from 480 AD. That, even then, the Egyptian language survived, as Coptic, written in the Greek alphabet, is discussed elsewhere.
The Indus Valley of India is where the next civilization emerges, again with evidence of Sumerian influence. The Indus pictographic script is not well attested and remains undeciphered. Nor, unlike hieroglyphics and cuneiform, are there any bilingual texts to aid in decipherment. So we don't even know what the people of the Indus Valley called themselves or their place -- perhaps the closest we can get is that the Sumerians called the place Meluḫḫa. The problem is that the Indus Valley civilization did not survive, flourishing only from around 2800 to 1500 BC (or even just from 2600 to 1900 BC). The examples of Indus writing are brief and fragmentary. Just what happened is still mysterious. The advent of Indo-European steppe peoples with horses and chariots undoubtedly had the kind of effect that is also evident in the Middle East, where small numbers of such people established regimes in Babylonia (the Kassite Dynasty) and Mitanni, and the technology made a foreign regime possible in Egypt. The Indus cities, however, now seem to have been already declining, vulnerable, and perhaps even abandoned, perhaps because of climatic and hydrological changes. There is little real evidence of violent conquest, though a similar absence is also noteworthy with respect to the Kassite regime in Babylon, the Mitanni, or the Hyksos in Egypt. In any case, India passed into a Dark Age and emerged contemporaneous with the beginning of Classical civilization in Greece, circa 800 BC.

While contact between Sumeria, Egypt, and the Indus occurred early, the fourth center of civilization, in China, remained relatively isolated and emerged considerably later, with the Shang Dynasty, about the time that India was passing temporarily out of history. Of all the early systems of writing, Chinese Characters, the direct descendants of Shang pictographs, are the only one still in use today. The Indian system, of course, ended with the Indus civilization. Cuneiform and hieroglyphics were replaced by alphabetic scripts that developed, perhaps under Egyptian influence, in Phoenicia and Canaan.
A striking geographical feature of the early civilizations is that they were all in river valleys, and not only that, but desert river valleys. That circumstance might be overlooked in the Middle East, where the climate is uniformly dry, but is conspicuous in India and China, where the rivers in deserts (the Indus), or at least relatively dry areas (the Huang He), are matched by rivers that are in areas of heavy rainfall (the Ganges & Yangtze). In China, an old saying has it that in the north (Huang He valley) you go by horse, and in the south (Yangtze valley) you go by boat. That life and agriculture likely was easier in rainy areas may have been just the problem. The irrigation systems that were necessary for reliable agriculture in the desert climates imply a level of organization and technological development, let alone records, which are just what we find in the earliest days of Sumer, Egypt, and the Indus valley. In these terms, it should not be surprising that civilization in India began on the Indus rather than the Ganges, and in China on the Huang He rather than the Yangtze. This even made a difference in the Chinese diet, since rice, which we think of as the Chinese staple, would only grow in the wet south. In the north, it was wheat that was grown, and the staple diet was based on something else which is still conspicuous in Chinese cooking, noodles.

Another curious, but unexplained, feature of these civilizations is that the delay in the development of China, and the hiatus in the development of India, end up producing a philosophical culture simultaneously with the development of Greek philosophy, while the independent Egyptian and Mesopotamian civilizations were far gone in decline. The multiple points of similarity between the thought of Greece, India, and China, evident in the simplest terms in their respective treatment of the physical elements, cannot be accounted for by mutual influence, which does not seem to have existed at the earliest period. The undoubted transfer of ideas between Greece and India in the Hellenistic Period, and the export of Buddhism from India to China beginning in the Han Dynasty, provide us points of comparison with what came before in the uninfluenced traditions. The time when Parmenides, Confucius, and the Buddha all lived, the end of the 5th century BC, has been called the "axial age"; but it remains mysterious that such simultaneous and sometimes parallel development should have occurred.
The age was also one of religious innovation. In India, where religion and philosophy remain closely related, Buddhism, Jainism, and Upanishadic Hinduism straddle the distinction. In China, schools that are pretty purely philosophical, Confucianism and Taoism, eventually attract religious elements and grow, with Buddhism, into the three religious "Ways" of Chinese civilization. Greek religion, of course, was doomed to extinction, replaced by Prophetic Judaism and its daughter religions, Christianity and Islâm. Meanwhile, of course, the Jewish tradition had been profoundly influenced by Greek philosophy, so that when Christianity was adopted by Rome, it could be said to represent a synthesis of "Athens and Jerusalem." The place in this of the religious revolution in Irân, Zoroastrianism, is more obscure. The moral rigor of Zoroaster, in separating all evil from God, may actually be the source of similar reforms in both Judaism and in Greek philosophy, but there is little in the way of direct evidence of this.

The grassland across Eastern Europe and Central Asia, the Steppe, is one of the great highways of world history. Equipped with horses and cattle, people could live easily on the Steppe and move freely across it, all the way from Mongolia to Hungary. From the Second Millennium BC until well into the Middle Ages, movements back and forth across the Steppe, and especially off of it at the periphery, profoundly influenced the history of the surrounding lands in Europe and Asia, particularly Eastern Europe, the Middle East, India, and China. The most dramatic example of this in Ancient times was the descent of the Iranians into the Middle East and India.
Introducing horses and chariots for the first time into these areas of earlier civilization, the Iranian invaders not only revolutionized warfare, but were the ones to reap the first advantages from the innovation. The occupation of the Iranian plateau established their permanent presence there, with the great successor kingdoms of the Medes and the Persians. An Iranian elite, at least, imposed themselves on the Hurrians and the Kassites of the Middle East. With the Hurrians, the kingdom of the Mitanni then became one of the great second millennium powers of Eastern Syria, while with the Kassites, an ascendancy was established over Babylonia. In a 13th century treaty with the Hittites, the Mitanni listed their gods, which are evidently the same gods as in contemporary Iran and in Vedic India.
Recent scholarship has begun to discount the role of the Iranian invaders in the introduction of the horse and in dominating the Hurrians and Kassites, since horses appear before their arrival and the evidence of the Iranian element among the Hurrians and Kassites is thin. However, the Iranian movement was certainly more that of a migration than the invasion of organized armies. As such, its influences were somewhat more in the way of diffusion than of conquest. Horses arrive before an Iranian population does because they were traded ahead of the migrants. Someone brought them -- horses (and chariots) did not simply suddenly wander across the Central Asian deserts into the Middle East and India. There is no doubt that horses did not exist in the 3rd millennium BC in Egypt or Sumeria. When horses do arrive, they are adopted as quickly as possible. Horses clearly arrived in Egypt with an invasion, that of the Hyksos, but there is no evidence that the Hyksos were Iranians. Whether or not the Hurrians were ever dominated by Iranians, there is undoubted Iranian influence there, with the names of the gods. The gods can have diffused along with the horses. That the Iranians are near is incontestable -- they are soon revealed on the Iranian plateau, and in India, and their languages have been in those places ever since. Consequently, it doesn't make much sense to completely discount the traditional notions of the role and influence of the Iranians.
India is where the eastern branch of Iranian invaders, the Arya, imposed themselves and, erasing whatever establishment or vestiges were in place of the older Indus Valley Civilization -- known to the Sumerians as Meluḫḫa -- laid the foundation of a new civilization with their own language and gods. A subsequent Iranian people, the Sakas, later also invaded India. The Sakas had been dislodged, as the most distant Indo-European occupants of the Steppe, known as the Yüeh Chih (Pinyin Yuezhi, the "Moon Tribe") to the Chinese, were thrown back into the Tarim Basin (the Lesser Yüeh Chih) and Transoxania (the Greater Yüeh Chih). The Greater Yüeh Chih, organized as the kingdom of the Kushans, later continued the tradition by invading India themselves.
Argument continues over the role of the Arya invaders in the end of the Indus Valley Civilization, which is now usually said to have been in decline and to have come to an end through its own decadence (cf. The Oxford Companion to Archaeology [Oxford University Press, 1996], "Indus Civilization," pp. 348-351) or natural disasters (cf. "Indus Civilization, Clues to an Ancient Puzzle," National Geographic Magazine, Vol.197, No.6, June 2000, pp.108-129). Civilizations, indeed, have their ups and downs. Egypt during its Intermediate Periods, or China after the fall of the Han, are good examples. However, they rarely disappear as the result of such low points. Even Mayan Civilization, which essentially did collapse, still left the Mayan people, speaking their own language, behind. It is thus hard to imagine no connection between the invasion and the disappearance, even as the probable language of the Indus Valley, a Dravidian language, was erased from most of the north of India. Egypt had a tough enough time shaking off the Hyksos.
Claims are also now made, perhaps not coincidentally starting with Indian scholars, that the Arya originated in India, that the Vedic language is closely related to the Dravidian languages and the source of all other Indo-European languages, and that the hitherto undeciphered Indus Valley script is actually the basis of both the much later (700 or 800 years) Brahmi alphabet in India and even the Phoenician/Canaanite alphabet of the Middle East. These are inherently suspect and improbable claims, examined in some detail elsewhere.
Since the early Iranian peoples were illiterate, much of their movement and activities remain concealed from history. With the spread of literate civilization, however, much more can be discerned when the entire process of spreading across the Steppe was repeated all over again in the Middle Ages by the Turks.
Moving from east to west, the Turks did not come as far west on the Steppe itself. The remaining Turkish presence in Europe looks at least in part associated with the later Mongol invasions. Thus Kazan, which was the capital of the Mongol Khanate of Kazan, is now the capital of the Russian republic of Tatarstan, where the Turkish language of Tatar is spoken. Adjacent to Tatarstan we find the closely related languages of Chuvash and Bashkir (now in Bashkortostan). Other Tatar speakers remain in Central Asia. The Crimean Tatars, surviving from the Mongol Khanate of the Crimea, were deported to Central Asia by Stalin in 1944 for supposedly collaborating with the Germans. Recently some have been returning. In the Ukraine, earlier Turkic peoples like the Khazars, Patzinaks, and Cumans have disappeared. The Bulgars, originally Turkic, were absorbed by their Slavic subjects. Nor did the Turks settle nearly as much of the Middle East. However, the whole area around the Aral Sea became permanently Turkish, now dignified as "Turkestan," while the defeat of the Romanian/Byzantine Emperor Romanus IV at the battle of Manzikert by the Seljuk Turk Great Sultân Alp Arslan in 1071 opened up Asia Minor, which Iranians had never penetrated (despite the Persian conquest), to permanent Turkish occupation and settlement. The presence of Turkey amidst and upon older Indo-European peoples, the Greeks and the Armenians, and overlapping an Iranian people, the Kurds, has not made for forgiveness or forgetfulness of their recent advent (i.e. almost 1000 years ago). Although a Turkish ethnic presence was never established in India, Turkish princes in Afghanistan profoundly influenced Indian history, first by the invasion of Mahmûd of Ghazna in 1008, when Islâm was first solidly planted in Indian civilization, and by the later invasion of Babur, the first of the Moghuls, in 1526.
The most spectacular and rapid movement of conquest across the Steppe, through the Middle East, and into Europe and China, was that of the Mongols in the 13th Century. Although several significant and a few durable kingdoms resulted from this conquest, little remains by way of permanent Mongol ethnic presence. The Kalmyks on the lower Volga are the only Mongolian speaking group left in Europe, Buddhist in religion and evidently associated with the Khanate of Astrakhan. The Mongols thus repeated the earlier, and less well documented, career of the Huns, who also left few durable marks of their presence (like the name "Hungary"). Subsequently, the conquest of the Steppe came from off of the Steppe, especially as the Russians, who had moved across Asia north of the Steppe, in the forest land of the Taiga, occupied much of the central portion of it from the north in the 19th Century.
The advent of gunpowder removed the advantage that the horse and the composite reflex bow had given nomadic Steppe dwellers for so long.
Grasslands at the corresponding latitudes elsewhere in the world did not have the same impact on world history as the Steppe. The Pampas of South America and the Veldt of South Africa were far too small to provide a highway between different cultural regions. The Prairie of North America, although extensive, was still not as extensive as the Steppe and lacked the key ingredient: a domesticated animal for riding. Although the horse had actually evolved in North America, it died out there and historically is only found in Asia and Africa (Zebras).
The reintroduction of the horse into the New World by the Spanish set off the development of a romantic Plains culture among American Indian tribes who adopted the horse; but this did not involve the historic transmission of cultures around the periphery, nor did it last very long -- only a century and a half, at most. What these grasslands could mean in modern life was as appropriate ranges to grow a domesticated grass, wheat, and similar agricultural staples. The steppe and similar provinces thus have gone from being highways of history to being breadbaskets of the world.

The Vedas and all their parts are shruti, "revelation." The sectarian teachings, Vais.n.avite (the sect of Vis.n.u), Shaivite (the sect of Shiva), & Shâkta (Tantric, sect of Shakti), may regard their texts, the âgamas, as shruti also. The pious view is that the Vedas are eternal and uncreated and exist essentially as sound. More conventional, but still pious, scholarship may still exaggerate the antiquity of the Vedas, sometimes claiming they go back to 10,000 BC or earlier. Now, however, it looks like even the oldest parts of the R.g Veda do not antedate the arrival of the Arya in India, although the gods and elements of the stories are older, since they are attested with Iranian peoples and the Mitanni, with parallels in Greek and Latin mythology. The word "veda" is from the root vid, "to know," making for other derivatives like vidya, "knowledge," and avidya, "ignorance." The vid root is cognate to idea in Greek, video in Latin, and wit in English. Claims can be found on the Internet that the Arya and their gods were autochthonous to India; but the linguistic, archeological, and epigraphic evidence is overwhelming in favor of their arrival from the Steppe, like the Turks and Mongols centuries later, and of their origin elsewhere.
| THE FOUR VEDAS | |
|---|---|
| I. R.g Veda | the oldest Veda, from c.1500 BC; from r.c, "sacred hymn or verse"; liturgical manual of the hotr., chief sacrificial priest. |
| II. Sâma Veda | from sâman, "song, chant"; hymnal of singing udgâtr. priest, assistant of the hotr.. |
| III. Yajur Veda | from yajus, "sacrificial formula"; liturgical manual of adhvaryu priest, assistant of hotr. charged with ritual preparations, "practical work." |
| IV. Atharva Veda | the youngest Veda, c. 800 BC; from atharvan, the "fire priest," not originally associated with Vedic sacrifice, later added as brâhman.a, the fourth sacrificial priest. |
Each Veda consists of four parts.
| THE FOUR PARTS OF THE VEDAS | | |
|---|---|---|
| I. Samhitâs (or Mantras): Hymns | R.g Veda Samhitâs: 10522 verses | The samhitâs and brâhman.as are the karmakân.d.a, "action part," of the Vedas, studied by the pûrva mîmâmsâ, "prior interpretation," or Mîmâmsâ school. |
| | Sâma Veda Samhitâs: 1984 verses | |
| | Yajur Veda Samhitâs: 1875 verses | |
| | Atharva Veda Samhitâs: 5977 verses | |
| II. Brâhman.as: Ritual Texts | The brâhman.as, using much mythic material, are commentaries on and explanations of the hymns and ritual practices. | |
| III. Âran.yakas: Forest Treatises | The âran.yakas verge into philosophical writing but often are indistinguishable from the brâhman.as; they may be regarded as philosophical texts written by or for forest dwelling hermits or as brâhman.a ritual texts written for forest dwellers who cannot practice the ordinary household rituals described in the brâhman.as proper. | The âran.yakas and upanis.ads are the jñanakân.d.a, "knowledge part," of the Vedas, studied by the uttara mîmâmsâ, "posterior interpretation," or Vedânta, "End of the Vedas," school. |
| IV. Upanis.ads: Philosophical Texts | | |
The Vedas are traditionally taught by a Brahmin teacher (guru) orally to a student (brahmacârin) in sequences (called "branches") of associated samhitâs, brâhman.as, âran.yakas, and upanis.ads. For example:
| R.g Veda | |||
|---|---|---|---|
| Shakala Branch Samhitâs | Aitareya Brâhman.a | Aitareya Âran.yaka | Aitareya Upanis.ad |

Translations of Upanis.ads range from the classic Sacred Books of the East versions by F. Max Müller in two volumes, originally published in 1879 and 1884 (reissued by Dover in 1962), to a recent new translation by Patrick Olivelle (Upanis.ads, Oxford, 1996). Olivelle should be consulted for the most recent thinking and scholarship. Mircea Eliade says that only the Upanis.ads listed here, out of more than 200, are considered to be shruti. At the same time, I've seen claims that the Upanis.ads are not even part of the Vedas. Olivelle says that these (his collection does not include the Maitri Upanis.ad) are in fact the whole of the original Upanis.adic literature -- indeed, that the Br.hadâran.yaka and Chândogya together "constitute about two-thirds of the corpus of ancient Upanis.adic documents" (p. 4). Subsequently, however, numerous, often Sectarian, documents were produced, as late as the 16th century, which were regarded as Upanis.ads by different people (p. xxxiii). Collections of Upanis.ads varied by region. Olivelle mentions a northern collection of 52 and a southern collection of 108 (p. xxxiii). The only modern publication of the additional Upanis.ads I've seen is K. Narayanasvami Aiyar's Thirty Minor Upanis.ads (Parimal Publications, Delhi, 1997), although I have not been keeping up with the literature. The late Upanis.ads are not part of the Brâhman.a literature and tend to be ascribed to the Atharva Veda. This is already true of some of the Upanis.ads listed here.
The Upanis.ads are basically about two things: Brahman and Âtman. Brahman is ultimate reality in the external world, the Âtman the ultimate reality in the internal world and hence the Self. The Buddhist doctrine is anâtman or "no self." The fundamental division in Vedânta, which is the interpretation of the Upanis.ads, is whether Brahman and the Âtman are identical or different. If they are identical, we have a school of Advaita or "non-dual" Vedânta. A "non-dual" doctrine can also be called "Monism," that there is only one thing. If Brahman and the Âtman are different, we have a school of Dvaita or "dualistic" Vedânta. The Dvaita Vedânta of Madhva is a Theistic doctrine of a personal God, with the "five differences": that (1) Brahman is different from Âtmans, (2) Brahman is different from matter, (3) Âtmans are different from each other, (4) Âtmans are different from matter, and (5) pieces of matter are different from each other. Thus, it is a pluralistic metaphysics, not just dualistic. In the "qualified" Advaita Vedânta of Ramanuja, Brahman is a personal God, who nevertheless contains all reality, including multiple selves and the world. This may be called a "Pantheism" and is comparable to the metaphysics of Baruch Spinoza. The God of both Madhva and Ramanuja is identified as the devotionalistic deity Vis.n.u.
In the "unqualified" Advaita Vedânta of Shãnkara, Brahman is the only thing that exists, and the world and individual selves are part of illusion, Mâya (which is not illusion, but the creative power of God, for Theistic or other realistic versions of Vedânta). Since the Âtman, identical to Brahman, is not an individual self or soul, individuality over time and from life to life must be carried by the "subtle" bodies that are examined in the following Mân.d.ûkya Upanis.ad. Brahman is left without much in the way of positive characteristics, much like the "One" of Being in Parmenides. But there are three essential attributes of Brahman that are expressed in the formula Saccidânanda.
First is sat, which is "existence." This is the same root as satya, "truth," which turns up in the Satyâgraha, or "Truth Force," of Mahâtmâ Gandhi. Second is cit, which is consciousness. This diverges from the characterization of Being by Parmenides, who left the existence of both consciousness and the world unexplained. Here, the world, even as illusion, can exist as a representation within the consciousness of Brahman. Third and finally there is ânanda, which is "bliss." "Ânanda" was also the name of the Buddha's personal attendant, who figures in many stories about the Buddha. Brahman is existence, consciousness, and bliss. The ultimate Self within each of us, the Âtman, is this also. So we do exist, and our consciousness is the consciousness of Brahman.
However, where is the Bliss? I may have the existence and the consciousness, but the bliss is missing. That is where we are damaged by Mâya. To be free of illusion and to achieve Bliss in Brahman is Salvation or Liberation, the goal of religious and meditative practice, the Yogas.
| Hymns (Action Part) | Brâhman.as (Action Part) | Forest Treatises (Knowledge Part) | Upanis.ads (Knowledge Part) |
|---|---|---|---|
| R.g Veda, Shakala Branch | Aitareya Brâhman.a [2] | Aitareya Âran.yaka | Aitareya Upanis.ad [1] |
| R.g Veda | Kaus.îtaki (Shânkhâyana) Brâhman.a [2] | Shânkhâyana Âran.yaka | Kaus.îtaki Upanis.ad [1] |
| Sâma Veda | Jaiminîya (Talavakâra) Upanis.ad Brâhman.a [3] | | Kena Upanis.ad |
| | Chândogya Brâhman.a [4] | | Chândogya Upanis.ad [1][5] |
| Kr.s.n.a (Black) Yajur Veda | Taittirîya Brâhman.a [6] | Taittirîya Âran.yaka | Taittirîya Upanis.ad [1] |
| | | | Maitri (Maitrâyani-Brâhman.a) Upanis.ad [7] |
| | | | Shvetâshvatara Upanis.ad [7] |
| | | | Kât.ha Upanis.ad [7][8] |
| Shukla (White) Yajur Veda, Vâjasaneyi Samhitâ | | | Îshâ Samhitâ Upanis.ad [9] |
| | Shatapatha Brâhman.a [10] | | Br.hadâran.yaka Upanis.ad [1][11] |
| Atharva Veda, Pippalâda Branch [12] | | | Prashna Upanis.ad [7] |
| Atharva Veda, Shaunaka Branch | | | Mun.d.aka Upanis.ad [7] |
| Atharva Veda | | | Mân.d.ûkya Upanis.ad [7][13][14] |
Note 1: Early Upanis.ad, between 800 and 500 BC.
Note 2: The R.g Veda contains only the two Brâhman.as listed.
Note 3: The Sâma Veda contains eight Brâhman.as, including the Pañcavim.sha Brâhman.a, the S.ad.vim.sha Brâhman.a, and the Praud.ha Brâhman.a.
Note 4: Eight out of ten chapters are the upanis.ad.
Note 5: Contains the mahâvakya, "great sentence," for the Sâma Veda: tat tvam asi, "thou art that."
Note 6: The only Brâhman.a of the Black Yajur Veda.
Note 7: Upanis.ads of the middle period, between 500 and 200 BC.
Note 8: Some attribute the Kât.ha Upanis.ad to the Atharva Veda or the Sâma Veda.
Note 9: Unusually positioned with samhitâs instead of with âran.yakas.
Note 10: The only Brâhman.a of the White Yajur Veda.
Note 11: Oldest upanis.ad, appended to the Shatapatha Brâhman.a, contains the teaching of the Unknown Knower and the mahâvakya, "great sentence," for the Yajur Veda: aham brahmâsmi, "I am brahman."
Note 12: The only Brâhman.a of the Atharva Veda is the Gopatha Brâhman.a. I am not aware of the Branch to which it is supposed to belong.
Note 13: The Mân.d.ûkya Upanis.ad is found embedded in the Kârikâ Âgama, not attached to a Brâhman.a or Âran.yaka.
Note 14: Contains the mahâvakya, "great sentence," for the Atharva Veda: ayam âtmâ brahma, "this self is brahman."
Copyright (c) 1997, 1998, 2008, 2012 Kelley L. Ross, Ph.D. All Rights Reserved

The Mân.d.ûkya Upanis.ad
translation substantially that of Sarvepalli Radhakrishnan & Charles A. Moore (1957) and Thomas E. Wood (1990) with modifications

The Mân.d.ûkya Upanis.ad is one of the shortest Upanis.ads, only twelve verses long, and it is very late, conventionally associated with the Atharva Veda; but it is also one of the most important Upanis.ads, with commentaries by great Indian philosophers, like Shãnkara. Different schools of philosophy interpret the text according to their own doctrine. Quoted definitions are dictionary citations from Arthur MacDonell, A Practical Sanskrit Dictionary [Oxford, 1929].
1. Om: this syllable is all this. A further exposition of it is: what was, what is, and what will be -- all is only Om. And whatever else is beyond the three times, that also is only Om.
The Mân.d.ûkya Upanis.ad says it is about the syllable "Om." This is a sacred syllable that can be used as a mantra for meditation or written on things for good luck. I have an Indian cookbook whose author says that her mother wrote Om on her tongue in butter when she was born. The different ways to write the word are discussed at "Greek, Sanskrit, and Closely Related Languages." An abbreviated writing is given at the top of this page. Nevertheless, nothing is really said about Om here. It is used as a device to symbolize what the Mân.d.ûkya is really talking about, which is consciousness.
2. All this, indeed, is Brahman. This Self is Brahman. This Self itself has four quarters.
"This Self is Brahman," Ayam âtmâ brahma, is the mahâvakya, "great sentence," of the Atharva Veda. The four "great sentences," one from each Veda, express the fundamental teaching of the Upanis.ads. The other three are: tat tvam asi, "thou art that," aham brahmâsmi, "I am Brahman," and sarvam khalu idam brahma, "all this indeed is Brahman." The latter looks like it is also included in this verse but, sorry folks, only one great sentence per Veda. The "four quarters" of the Self are going to be the four levels of consciousness. Tat tvam asi, "thou art that," is the most famous of these propositions and the only one commonly quoted in Sanskrit.
3. The waking state, outwardly cognitive, having seven limbs, having nineteen mouths, enjoying the gross, the worldly (vaishvânara), is the first quarter.
This verse is about the first state of the jîva, or the individual phenomenal self. The physical body accompanies the waking state. The "nineteen mouths" are the five senses (sight, hearing, touch, taste, smell), five organs of action (speech, hands, feet, genitals, anus), five vital principles (prân.a, apâna, samâna, udâna, vyâna), and the "sensorium" (manas), reason (buddhi), ego (ahamkâra), and apperception (citta). "Gross" is sthûla, "thick, bulky, big, large, stout, massive; coarse, gross; dull, stupid; material, tangible (phil.)..." The "worldly," which here characterizes the waking state, means "belonging to all men; universal, dwelling or worshipped everywhere, generally known....consisting of all men...intellect conditioned by the aggregate (Vedânta phil.)..."
4. The dreaming state, inwardly cognitive, having seven limbs, having nineteen mouths, enjoying the exquisite, the brilliant (taijasa), is the second quarter.
This verse is about the second state of the jîva, dreaming. The dreaming or "astral" body accompanies the dreaming state (calling it the "astral" body is borrowed from Neoplatonism). Much has been made recently of "astral projection," where real journeys can supposedly be made in the separated astral body -- though the ultimate would be "teleportation," where an astral projection ends with the physical body appearing where the astral body traveled.
5. Where one, asleep, does not desire any desire whatever, sees no dream whatever, this is deep sleep. The sleeping state, which has become one, just pure cognition, made of bliss (ânanda), verily an enjoyer of bliss, whose mouth is thought, the cognitional (prâjña), is the third quarter.
This verse is about the third state of the jîva, deep sleep. The causal body (kâran.asharîra) or karmic body (kârman.asharîra) accompanies the third state [note]. For the Jains this simply consists of one's karma and is responsible for the existence and circumstances of life in the phenomenal world. "Pure" cognition which is neither inner nor outer seems to be cognition without an object altogether. This is what we call "unconsciousness," but there is no unconsciousness in âtman or Brahman. The Br.hadâran.yaka Upanis.ad had asked the question what the Knower without the Known would be. Since the Known, and any object of consciousness, would be part of illusion, Mâya, the Knower without the Known would be the subject without an object. So "pure" cognition looks like what to us, and also to Dvaita Vedânta (and others), would be unconsciousness. Since Brahman is defined as sac-cid-ânanda, "existence, consciousness, & bliss," "made of bliss" strongly suggests that deep sleep is closer to Brahman than the previous two states, a hierarchy rejected by those opposed to the Monistic interpretation of Vedânta (see comment on verse #6). The "cognitional," prâjña (an intensifying prefix on jña, one of the knowing roots in Sanskrit), is "intelligence associated with individuality (phil.)"; "Intensely Conscious Being or Conscious Intensity." Interpretations of the third state as a state of ignorance (avidya), of unconsciousness, will have some difficulty explaining why it would be called prâjña.
6. This is the lord (îshvara) of all; this is the knower of all; this is the inner controller; this is the source of all, indeed the origin as well as the end of all beings.
Does this verse go with verse #5, and so with deep sleep, or with verse #7, and so with the âtman? "Îshvara" is traditionally interpreted to mean God in a personal sense, which in #5 would be merely part of Mâya, "Illusion," (as in monistic Advaita Vedânta) or in #7 would be identical with Brahman (as in theistic schools of Vedânta). Verse #6 stylistically does seem to go with verse #7, but "îshvara" may not be used here in the loaded sense of meaning a personal God -- the difference may not be conceived as clearly in this Upanis.ad as it would be in later Vedânta, and we might regard the causal principle in the third state as no more than the causal body.
7. Not inwardly cognitive, not outwardly cognitive, not cognitive both ways, not pure cognition, neither cognitive nor non-cognitive, unseen, beyond speech, ungraspable, without any distinctive marks, unthinkable, undesignatable, the essence of the knowledge of the one Self, the cessation of the phenomenal world, quiescent, auspicious, nondual (advaita) -- [such] they think, is the fourth. He is the Self. He is to be known.
This verse is about the fourth state, which leaves the jîva behind and now is the pure Self, the âtman. Note that the term "nondual" (advaita) is actually used in the text. "Auspicious" in Sanskrit is Shiva, which is the name of a devotionalistic God in Hinduism, but important theistic interpretations of the Mân.d.ûkya tend to be Vaishnavite rather than Shaivite.
8. This is the Self with regard to the syllable "Om", with regard to the elements: the quarters are the elements and the elements are the quarters: the letter a, the letter u, the letter m.
Om was originally pronounced aum; and this is remembered here, where Om is analyzed into three parts, with an intangible fourth part.
9. Vaishvânara (the worldly) is the waking state, the letter a, the first element, either from "âpti" (obtaining) or from "âdimattva" (being first). Verily, he obtains (âpnoti) all desires and becomes first (âdi) -- he who knows this.
10. Taijasa (the brilliant) is the dreaming state, the letter u, the second element, either from "utkars.a" (exaltation) or from "ubhayatva" (intermediateness). Verily, he exalts the stream of knowledge and becomes equal-minded; no one ignorant of Brahman is born in the family of him who knows this.
11. Prâjña (the cognitional) is the sleeping state, the letter m, the third element, either from "miti" (erecting) or from "apîti" (merging). Verily, he erects (minoti) this all and he becomes its merging -- he who knows this.
"Miti" can also be translated "measuring" -- the translation preferred by those who see "îshvara" as a creative God to be identified with the fourth state. A third state which "erects" the world does not require that kind of function in the fourth. However, the theistic interpretations of the text are up against another problem. The theistic Dvaita Vedânta view is that the third state is a state of unconsciousness and ignorance; but this is contradicted by the very name of the third state, "Prâjña," which means "intelligent, wise, clever" (from jña, "know"). This is not ignorance. But what "erects" the world doesn't have to be God even in the third state. It can be karma.
12. The fourth is what is without an element, what cannot be dealt with or spoken of, the cessation of the phenomenal world, auspicious, nondual. Thus Om is the very Self. He enters the Self with the Self -- he who knows this.
Copyright (c) 1997, 1998, 2000, 2004, 2005, 2009, 2010, 2012 Kelley L. Ross, Ph.D. All Rights Reserved
The Mân.d.ûkya Upanis.ad, Note:
Subtle Bodies
The astral body and the causal body, in contrast to the gross physical body, are "subtle" bodies. This works clearly and simply in the Mân.d.ûkya Upanis.ad. There is a tendency, however, mainly outside of India, to add more subtle bodies. In the diagram we see the addition of a particular subtle body, the "etheric" body, in different systems. The "various" heading shows the etheric body inserted between the physical and the astral bodies. I have found this in two different books about astral projection and even in a book about massage (where the subtle bodies are described as auras extending beyond the physical body). The astral projection books disagree about what the etheric body is supposed to be. In one, it is part of the astral body left behind in the physical body during astral projection; in the other, it is comparable to the astral body in that it projects, but it travels to physical locations, while the full astral body travels to locations on the astral plane. The other cases are from Theosophy and Eckankar (whose main project used to be teaching astral projection but now apparently places emphasis on meditation). In the former, the etheric body, as the "lower mental body," is added between the causal and the astral body, while in the latter it is added, as the "mental body," above the causal body. These various and conflicting interpretations of the etheric body perhaps attest to its late introduction.