The following draft of a paper was presented at the Fourth International Conference on Austronesian Linguistics (FoCAL), in Suva, Fiji, in August 1984, under the title “The Balkans and Papua New Guinea: Language Contact Issues.” It briefly touches on some of the new (and disturbing) ideas about Sprachbund issues that I encountered during my Fulbright year in Romania in 1983–84. It was a frustrating year for linguistic research, but a wonderful year for language learning—and for travel, it being my first trip to Europe.


To many who limit themselves to the study of European languages, “the Balkan languages represent a unique case of evolution from genealogical divergence toward typological convergence” (Saramandu 1979:177). It is likely, however, that any large language family has some members who have to some extent forsaken their relatives for their neighbors. One such group in the Austronesian (AN) language family comprises the New Guinea Oceanic languages. (I continue to use “New Guinea Oceanic” as a typological, not a genetic, label.)

The Balkan Sprachbund may receive more publicity than its counterpart in Papua New Guinea, but in neither area are the issues anywhere near resolved. I intend here to outline some of these issues and to compare the progress being made toward resolving them in each of the two areas of study. The Balkans will receive greater attention because I assume that most Austronesianists are less familiar with that area.


The core of the Balkan Sprachbund is composed of five languages: Albanian, Bulgarian, Macedonian, Modern Greek, and Romanian. Compared to the hundreds of languages involved in New Guinea, the number seems quite manageable. Moreover, Bulgarian and Macedonian are sufficiently close that they can be considered together for most purposes. More peripherally involved in the Balkan Sprachbund are Serbocroatian and Turkish. Turkish is usually considered only as an outside donor language, but it would be interesting to compare Balkan or western Turkish with eastern dialects or with other Turkic languages to see to what extent it may also have acquired Balkan, or at least European, features.

In order to determine what is specifically “Balkan” about the core languages, one can compare Bulgarian with the other Slavonic languages and with Old Bulgarian (that is, Old Church Slavonic) dating from the 9th to 11th centuries A.D. (Rosetti 1978:480). One can compare Romanian with the other Romance languages and with Latin. The earliest documents in Romanian itself date from the 16th century (Rosetti 1978:482). Records of Greek go back millennia, so it is perhaps the most tractable of the Balkan languages. Albanian, being an isolate with only a brief written history, is harder to deal with, but at least there are two dialects to compare. The southern (or Tosk) dialect shares more features with Bulgarian, Greek, and Romanian than does the northern (or Gheg) dialect. (Comrie 1981:198.)

The surviving languages of the Balkan Sprachbund, then, all belong to different branches of Indo-European. For most of these branches, there is some documentary or comparative basis for sorting out areal features from genetic features. (Comrie 1981:198.) Matters are considerably complicated, however, by the knowledge that the original Balkan substratum did not survive. The most common terms used to refer to this substratum are “Thracian”, “Dacian”, and “Illyrian”. No one is sure whether these are different names for the same language, different dialects of the same language, or three different languages, each with separate dialects. Assumptions vary from linguist to linguist. So does the importance assigned to the role of the substratum in accounting for the similarities shared by the present-day Balkan languages. I shall discuss the substratum issue in greater detail shortly.


Early studies of the Balkan languages taken as a unit perhaps tended to overstate the similarities among them. Sandfeld, in his classic synthesis on the subject (1930), says that, “in going from one of these languages to another … one is struck by the fact that the manner in which things are expressed remains essentially the same throughout the entire territory covered by these languages” (1930:6-7; Grace’s [1981:27] translation).

First, let me illustrate the kinds of explanations I had hoped to find, by briefly summarizing the loss of the infinitive.

In the Balkan languages, finite verbs are used where other European languages would use the infinitive. The loss of the infinitive in Greek can be explained on language-internal grounds. Loss of word-final [n] in Greek made the infinitive formally identical to the 3d person present indicative form of the verb. Distributional evidence suggests that this innovation spread north from Greece. Bulgarian lacks an infinitive entirely. Citation forms of verbs are usually 1st person present indicative. The infinitive exists in Albanian but is used more in the northern dialect than in the southern one. In Serbo-Croatian, Serbians prefer to use subordinate finite verbs where Croatians use the infinitive. In Romanian, too, more northern dialects use the infinitive more than the southern ones. I believe there is general agreement on this question.

Unfortunately, not many other issues are as well resolved.

One can say almost the same thing about some areas of Papua New Guinea, but only where the languages involved are all from the same family. The convergence between AN and Papuan languages is on a much grosser level, at least in most cases.

More recent work on Balkan languages, especially that by scholars from the Balkan countries themselves, seems to pay more attention to the differences among the various languages. One reason may be that the Balkan scholars have a greater concern for questions of their own national identity than did the outsiders who originally popularized the concept of the Sprachbund. In fact, Dumitru Macrea, a Romanian scholar, has expressed the view that the whole concept of a Balkan linguistic union being somehow comparable to a language family had its origin in the desire of Germany and Austria to propagate the idea of a unitary Balkan area which those powers then planned to dominate politically, economically, and culturally (Macrea 1982:284).

Another reason more recent scholarship may emphasize the differences among the languages is that there is simply much more data available than there used to be. Finer differences have become more salient. The same thing is happening with regard to Papua New Guinea languages too, as more data becomes available. I suspect that detailed study of the Kupwar village languages would also turn up many, many cases in which those languages are not as perfectly intertranslatable as they are often assumed to be. Even if many texts are morpheme-for-morpheme translatable, I suspect comparable morphemes are never full synonyms.

This raises an important issue. Is absolute convergence necessary? Is it desirable? Is it even possible? What kinds of differences are most tolerable? If fluently bilingual speakers maintain one of their languages solely for emblematic purposes, that is, solely to mark themselves off from speakers of other languages, what portion of their language will serve that emblematic function? Will they be content to say, “You say tomayto and we say tomahto,” or “You call it eggplant and we call it aubergine”? Or might they also focus on larger differences, like “You put object complements before the verb and we put them after,” or “You have all those heart idioms and we have all those liver ones”? Virtually any recognizable difference would seem sufficient to be emblematic.

Unifying factors

What is it that accounts for the unity that does exist among the Balkan languages? It is significant that no mention at all is made of the possibility of a common Balkan substratum in two recent general works in English that devote some attention to Balkan areal features. These two works are Comrie’s (1981) introduction to typology and universals and Bynon’s (1977) textbook in historical linguistics. Bynon mentions the Byzantine Empire and Greek Orthodox church as unifying factors, while Comrie emphasizes the mutual bilingualism that enabled innovations to spread across language boundaries. Schaller’s (1975) introduction to Balkan linguistics (in German) also tends to discount the role of the substratum and appeal more to the Greek and Latin adstrata as unifying factors. The overdependence on substratum by earlier linguists to explain language change seems to have made many western linguists shy of using the term.

Substratum is generally given a more prominent role, however, by those linguists for whom it is not just an academic issue but also a question of national ethnogenesis. Romanian linguists, for instance, often talk of the history of their language in geological terms. Romanian is said to consist of an autochthonous (pre-Roman Dacian) substratum, a core stratum from Latin, and a superstratum of Slavic. To some, the central problem in Balkan linguistics is the identification of pre-Roman, pre-Slavic, autochthonous elements in the Balkan languages (see Brancus 1978). In spite of much effort, not much progress has been made in this direction (Brancus 1978:374). The only records we have of the Dacian language are a handful of proper names and between 10 and 20 Dacian glosses in two Greek lists of medicinal plants (Academia R.S.R. 1969:314-316).

Al. Rosetti, the Romanian linguist who has concerned himself most with Balkan linguistics in the broader sense—that is, the study of the Sprachbund as a whole, not just the attempt to reconstruct the pre-Roman substratum—nevertheless uses the term “substrate influence” rather loosely to designate any sort of interference between two languages (Rosetti 1978:205). This perhaps parallels the use of loaded terms like “mixed language” or “language mixture” to describe any sort of contamination between AN and Papuan languages in the New Guinea area.

Gheorghe Ivanescu, one of the principal Romanian Indo-Europeanists, holds a fascinatingly particular view that requires a substrate motivation for each and every sound change. He attacks the “neolinguist” view that phonetic changes are imitative and therefore transferable across language boundaries (1980:735). He asserts instead that a phonetic change is realized only by a change in the “base of articulation”, that is, by a change in the characteristic shape of the oral cavity at rest within a given population (1980:8). He attacks the structuralists for failing to recognize the innateness of certain articulatory tendencies, and suggests that phonetic similarities between some Caucasian languages and Romanian (such as the presence of phonemic schwa) “are to be explained by the anthropological relationship between the peoples of the Caucasus and those of the Carpathians” (1980:733).

An interesting corollary of Ivanescu’s view is that languages do not change at a constant rate. Instead, language change depends on external changes in the speaker population. The “base of articulation”, for instance, changes over time “through changes in the quantitative relationships between the component human types [of a population], as well as through mixtures with other populations, maybe even through biological mutations between one generation and the next” (1980:9).

However, according to Ivanescu (1980:11), the “articulatory basis” of a language can be suppressed. “It does not manifest itself in those eras in which there exists an intense traffic of goods and people” (1980:11). It “cannot manifest itself either in the capitalist era or in the socialist era, except in popular speech … [It] only shows itself in eras in which there is a natural economy, thus in the primitive-commune and feudal eras” (1980:11). For instance, “the adaptation of Latin to the articulatory and psychological bases of the romanized populations, thus the birth of the Romance languages, was not possible except with the change from a trade economy during the slavery era to a natural economy during the medieval era” (1980:11). (This “natural” economy was organized on a feudal basis in the West and on the basis of village collectives in the East [1980:11].)

A “natural” economy, however, does not allow languages to attain their “natural” condition. In a “natural” economy, divergent local bases of articulation are free to influence phonology, while divergent local temperaments are free to influence morphosyntax (1980:13). These influences are “completely avoided only in eras of intense circulation” of people and goods, thus in eras of higher technological development when unitary literary languages are born (1980:13). “[O]nly in such eras can languages completely attain their natural condition: that of relative stability” (1980:13).

I’ve lingered over Ivanescu’s views somewhat more than might be necessary for two reasons. In the first place, we often tend to take our shared assumptions for granted. It is healthy sometimes to bring some of them into sharp relief by considering radically different viewpoints. Second, the divergence of assumptions among those of us working on New Guinea language history is relatively narrow compared to that encountered among those working on Balkan language history. Let me give a few more illustrations:

I have already mentioned Macrea’s opinion that Germanic imperialism is responsible for propagating the Sprachbund idea. Macrea (1982:285) and Ivanescu (1980:48 ff.) see similar forces at work in an early hypothesis that attempted to explain the particularly close similarities between Romanian and Albanian. The hypothesis was that the Romanian language and people originally took shape south of the Danube close to where the Albanians are now. A corollary assumption is that when the armies of the Roman Empire retreated south of the Danube in A.D. 275, the whole Romanized population came with them. One can see why this hypothesis would weaken the historical argument for Romanian territorial claims. Although this hypothesis is still kept alive by some Hungarian irredentists (see Du Nay 1977), it is no longer considered seriously by any present-day Romanian linguists. Instead, Romanian linguists are inclined to attribute the similarities between Romanian and Albanian to a common Thraco-Dacian substrate, on the theory that the Romanians continue that portion of the substrate population that adopted Latin as its mother-tongue, while the Albanians continue that portion that borrowed a lot from Latin but did not switch over to Latin (Ivanescu 1980:57).

Romanian linguists, then, are far less reticent than their Western counterparts about appealing to a common substratum as a unifying factor in the Balkan Sprachbund. I believe that part of the appeal to substratum as an explanatory factor is motivated by the desire to establish prior territorial claim to present Romanian-speaking areas. So far, historical linguistics in the New Guinea area has been relatively free from involvement in territorial claims. I hope that situation continues.

Other unifying factors mentioned in the Romanian literature are:

(1) similar conditions of life among the Balkan peoples, particularly the relative mobility their livestock-centered economy afforded them;
(2) exposure to Byzantine civilization, especially the Eastern Church;
(3) subjugation to the Ottoman Empire, a condition which actually reinforced the church as a unifying factor;
(4) widespread bilingualism (Saramandu 1979).

Saramandu (1979), a younger Romanian Balkanologist, distinguishes what he calls “passive” and “active” bilingualism. The distinction is not unfamiliar, but I would use the terms “restricted” and “unrestricted” to describe the two types. By “passive” bilingualism, Saramandu means bilingualism restricted to certain social occasions (religious services, for instance) or certain social strata (priests, administrators, itinerant merchants or craftsmen). The mass of the population would presumably recognize but not use another tongue. By “active” bilingualism, Saramandu means the bilingualism of a person who masters and uses two or more languages in more or less equal measure.

I’m not sure that, for a given population, the end result of either of these types of bilingualism would be very different, except that the second permits the possibility of complete language shift. On an a priori basis, one might suppose that the foreign languages in which a population is passively bilingual might contribute more loanwords or loan translations, and have less effect on phonology, morphology, or syntax; while the foreign languages controlled actively by the mass of a population would influence the phonology and phraseology as much as the lexicon. But French, for instance, seems to have penetrated into every corner of English (except perhaps phonology) even though the great mass of Anglo-Saxons after 1066 were certainly no more than passively bilingual. If sufficiently influential, active bilinguals can spread foreignisms among their own passively bilingual kith and kin at least as efficiently as foreigners can.

Here ends the draft of the paper I presented but never submitted for the conference proceedings. The only record I preserved was a hand-annotated printout from the Wang word processor at the accounting firm where I was working (the Honolulu office of Deloitte). Unfortunately, the bibliography seems to have gone missing. I scanned, OCRed, and then cleaned up those pages to get the text above.

My wife and I began that fascinating year teaching summer extension courses in Yap, Micronesia, during a severe drought that had us bathing out of buckets in our air-conditioned hotel room. Little did we realize at the time what types of shortages we would face during our long, cold, dark winter in Romania. We both made the trip to Fiji, where we stayed in a village near the conference hotel, along with several other participants from far corners of the globe. For the two of us, especially after Romania, that Pacific Island village made us feel we were back home again.



  1. aspalathos

    Good Day Mr., SomeHow I Found Your Station And It Is Interesting One. I Am Not Into Linguistics But As I Can See You Are Trying To Help Those People In The Oceanic Group Of Languages. The Reason I Am Writing To You Is Your Writing In Composition Of Above Article That Is: “More peripherally involved in the Balkan Sprachbund are Serbocroatian and Turkish”. What Bothers Me Is The Word “Serbocroatian”. I Do Not Understand Are You Aware That There Are Serbian As Well As Croatian People Living In The Balkan Peninsula. It Is The Right Of Croatian People As Well As Any Other Nation On This Earth To Have Their Own Name For Their Mother Tongue Language. The Croatian People Call It Croatian Language(Even As Croatian Is Similar To Serbian Language). Obviously There Should Be A Reason For Writing “Serbocroatian” In Such Form And It Is Only You That Can Answer It. If It Was Written “Serbo-Croatian” Or “Croatian-Serbian” Or “B-C-S” I Will Not Bother You And Wasting My And Your Time. Thank You For Your Attention. Ciao, Adios, Dovidjenja, Au revoir&Shalom.

    This was a paper I wrote in 1984, before the violent breakup of Yugoslavia, at a time when people troubled themselves to emphasize the essential unity of the Serbian, Croatian, and Bosnian languages, and not their differences. Romanian usage tends not to hyphenate and internally capitalize such compounds: thus Macedoromân, Meglenoromân, Grecoromân, etc.

