Published: August 23, 2012
Biologists using tools developed for drawing evolutionary family trees say that they have solved a longstanding problem in archaeology: the origin of the Indo-European family of languages.
The family includes English and most other European languages, as well as Persian, Hindi and many others. Despite the importance of the languages, specialists have long disagreed about their origin.
Linguists believe that the first speakers of the mother tongue, known as proto-Indo-European, were chariot-driving pastoralists who burst out of their homeland on the steppes above the Black Sea about 4,000 years ago and conquered Europe and Asia. A rival theory holds that, to the contrary, the first Indo-European speakers were peaceable farmers in Anatolia, now Turkey, about 9,000 years ago, who disseminated their language by the hoe, not the sword.
The new entrant to the debate is an evolutionary biologist, Quentin Atkinson of the University of Auckland in New Zealand. He and colleagues have taken the existing vocabulary and geographical range of 103 Indo-European languages and computationally walked them back in time and place to their statistically most likely origin.
The result, they announced in Thursday’s issue of the journal Science, is that “we found decisive support for an Anatolian origin over a steppe origin.” Both the timing and the root of the tree of Indo-European languages “fit with an agricultural expansion from Anatolia beginning 8,000 to 9,500 years ago,” they report.
But despite its advanced statistical methods, their study may not convince everyone.
The researchers started with a menu of vocabulary items that are known to be resistant to linguistic change, like pronouns, parts of the body and family relations, and compared them with the inferred ancestral word in proto-Indo-European. Words that have a clear line of descent from the same ancestral word are known as cognates. Thus “mother,” “mutter” (German), “mat’ ” (Russian), “madar” (Persian), “matka” (Polish) and “mater” (Latin) are all cognates derived from the proto-Indo-European word “mehter.”
Dr. Atkinson and his colleagues then scored each set of words on the vocabulary menu for the 103 languages. In languages where the word was a cognate, the researchers assigned it a score of 1; in those where the cognate had been replaced with an unrelated word, it was scored 0. Each language could thus be represented by a string of 1’s and 0’s, and the researchers could compute the most likely family tree showing the relationships among the 103 languages.
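The scoring step can be pictured with a small sketch. The meanings, language names and 0/1 scores below are invented for illustration and are not the study's data. Counting mismatches between two strings is also only a stand-in for the statistical tree-building the researchers used, which the article does not describe in detail.

```python
# Toy illustration only: meanings, language names and scores are hypothetical.
MEANINGS = ["mother", "water", "two", "hand"]

# 1 = the language keeps a cognate of the ancestral word, 0 = it was replaced.
SCORES = {
    "lang_a": [1, 1, 1, 0],
    "lang_b": [1, 1, 0, 0],
    "lang_c": [1, 0, 0, 1],
}


def encode(scores):
    """Turn a list of 0/1 scores into a string such as '1110'."""
    return "".join(str(bit) for bit in scores)


def hamming(a, b):
    """Count the vocabulary items on which two languages disagree."""
    if len(a) != len(b):
        raise ValueError("score lists must cover the same meanings")
    return sum(x != y for x, y in zip(a, b))


def closest_pair(scores_by_language):
    """Return (distance, first, second) for the most similar pair of languages."""
    names = sorted(scores_by_language)
    best = None
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = hamming(scores_by_language[first], scores_by_language[second])
            if best is None or d < best[0]:
                best = (d, first, second)
    return best


if __name__ == "__main__":
    for name, scores in SCORES.items():
        print(name, encode(scores))
    print(closest_pair(SCORES))
```

In this toy data, the two languages that disagree on the fewest items would be the first candidates for grouping on a shared branch. A real analysis weighs many possible trees rather than joining nearest neighbors.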
A computer was then supplied with known dates of language splits. Romanian and other Romance languages, for instance, started to diverge from Latin after A.D. 270, when Roman troops pulled back from the Roman province of Dacia. Applying those dates to a few branches in its tree, the computer was able to estimate dates for all the rest.
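The idea of dating the whole tree from a few known splits can be sketched with simple arithmetic. This sketch assumes that words are replaced at one constant rate on every branch, a simplification the article does not attribute to the study. The difference counts are hypothetical; only the A.D. 270 date and the 2012 publication year come from the article.

```python
# Hypothetical sketch of calibrating a "clock" from one known split.
# Assumption (not from the study): vocabulary changes at a constant rate.

PUBLICATION_YEAR = 2012   # year of the article
KNOWN_SPLIT_YEAR = 270    # A.D. date given for the Romanian/Latin divergence


def rate_from_calibration(differences, years_since_split):
    """Differences accumulated per year along a branch pair of known age."""
    if years_since_split <= 0:
        raise ValueError("years_since_split must be positive")
    return differences / years_since_split


def estimate_age(differences, rate):
    """Years needed to accumulate `differences` at the calibrated rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return differences / rate


if __name__ == "__main__":
    calibration_years = PUBLICATION_YEAR - KNOWN_SPLIT_YEAR  # 1742 years
    # Invented counts: 20 differing items on the dated pair, 60 on an undated pair.
    rate = rate_from_calibration(20, calibration_years)
    print(round(estimate_age(60, rate)))
```

With these invented counts, a pair of languages showing three times as many differences as the dated pair would be estimated as three times as old.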
The computer was also given geographical information about the present range of each language and told to work out the likeliest pathways of distribution from an origin, given the probable family tree of descent. The calculation pointed to Anatolia, particularly a lozenge-shaped area in what is now southern Turkey, as the most plausible origin — a region that had also been proposed as the origin of Indo-European by the archaeologist Colin Renfrew, in 1987, because it was the source from which agriculture spread to Europe.
Dr. Atkinson’s work has integrated a large amount of information with a computational method that has proved successful in evolutionary studies. But his results may not sway supporters of the rival theory, who believe the Indo-European languages were spread some 5,000 years later by warlike pastoralists who conquered Europe and India from the Black Sea steppe.
A key piece of their evidence is that proto-Indo-European had a vocabulary for chariots and wagons that included words for “wheel,” “axle,” “harness-pole” and “to go or convey in a vehicle.” These words have numerous descendants in the Indo-European daughter languages. So Indo-European itself cannot have fragmented into those daughter languages, historical linguists argue, before the invention of chariots and wagons, the earliest known examples of which date to 3500 B.C. This would rule out any connection between Indo-European and the spread of agriculture from Anatolia, which occurred much earlier.
“I see the wheeled-vehicle evidence as a trump card over any evolutionary tree,” said David Anthony, an archaeologist at Hartwick College who studies Indo-European origins.
Historical linguists cite further evidence: the first Indo-European speakers had words for “horse” and “bee,” and lent many basic words to proto-Uralic, the mother tongue of Finnish and Hungarian. The best place to have found wild horses and bees while living close to speakers of proto-Uralic is the steppe region above the Black Sea and the Caspian. The Kurgan people who occupied this area from around 5000 to 3000 B.C. have long been candidates for the first Indo-European speakers.
In a recent book, “The Horse, the Wheel and Language,” Dr. Anthony describes how the steppe people developed a mobile society and social system that enabled them to push out of their homeland in several directions and spread their language east, west and south.
Dr. Anthony said he found Dr. Atkinson’s language tree of Indo-European implausible in several details. Tocharian, for instance, is a group of Indo-European languages once spoken in northwest China. It is hard to see how Tocharians could have migrated there from southern Turkey, he said, whereas there is a well-known migration from the Kurgan region to the Altai Mountains of eastern Central Asia, which could be the precursor of the Tocharian-speakers who lived along the Silk Road.
Dr. Atkinson said that this was a “hand-wavy argument” and that such conjectures should be judged in a quantitative way.
Dr. Anthony, noting that neither he nor Dr. Atkinson is a linguist, said that cognates were only one ingredient for reconstructing language trees, and that grammar and sound changes should also be used. Dr. Atkinson’s reconstruction is “a one-legged stool, so it’s not surprising that the tree it produces contains language groupings that would not survive if you included morphology and sound changes,” Dr. Anthony said.
Dr. Atkinson responded that he did indeed run his computer simulation on a grammar-based tree constructed by Don Ringe, an expert on Indo-European at the University of Pennsylvania, but that the resulting origin was, again, Anatolia, not the Pontic steppe.