languages


Conlangers and linguists should enjoy the interesting dialect maps at http://projetbabel.org/vosgien/langues_oil.htm:

[Image: isoglosse_beau]

Best not to dwell on the Microsoft Paint airbrushing.  I find such maps fascinating; I once started to make a series of them for Verdurian, but never finished them.

What amazes me is the sheer variety in a relatively small area (the width of France is about the same as Illinois + Indiana + Ohio).  And if you compare maps, each lexical item has its own story… the isoglosses don’t match up except in very broad terms (there’s often a northwestern-coastal area and a northeastern area).  It’s also notable that Paris doesn’t seem to influence the patois all that much.  (That is, standard French may replace the patois, but while they exist they retain their own particular words.)

I read about the world’s lop-sided linguistic situation, where a handful of big languages (e.g. English, French, Spanish, Arabic, various Chinese languages, etc.) dominate, and how the ones that do not (e.g. Welsh, Breton, Hawaiian, various aboriginal languages of the Americas and Australia, etc.) are dying off, with some people and organizations (even governments) trying to prevent and reverse this trend. Do you think it is possible for these small languages to survive in the modern world, or is it inevitable that they will just die off, the attractions of the big languages being (like cities) too great? I’ve heard that only one language has really successfully been revived: Hebrew. Is this true? If it is, why is that so? If it isn’t, what can be learned from the Hebrew revival that can possibly be applied to other language revivals? Also, how important do you think government policy on language is? What do you think is the best government policy toward language?

—Christopher

Language revival, and people’s thoughts in general about what other people should speak, often go astray for failure to address a huge fact: languages are hard to learn.  It’s a multi-year commitment, best done in childhood (not because children are better at it but because they have the time for it).  And at the community level, it doesn’t really work unless almost the whole community makes that effort.

The irony is that public policy through the 1950s or so was the opposite of today’s concerns: people seemed to be terrified that minorities had their own languages, and did their best to discourage and destroy them.  But languages that survived this now faced a greater threat: modern communications and mobility.  Learning the national language becomes the easiest thing to do, and once the younger generation isn’t taught the minority language any more, it’s likely to die out.

You’re right that Hebrew is the one clear case of language revival.  However, it had two great advantages that most other attempts lack:

  1. There was a large population with a good non-native understanding of the language.  There was thus less trouble finding adults to jump-start educating the kids.
  2. It became the language of a state, and thus something people had to learn.  You’re not soon going to see this happening with Hawai’ian or Cherokee.

That’s not to say it can’t be done.  Many people are trying, with anything from Cornish to Ainu to Amerindian languages.  It’s just that token efforts (naming it “official”, sprinkling a few words around,  having a half-hour class once a week) won’t do the trick.

Modern communications offer advantages, too: activists and language learners can easily connect up; recordings make it easier to share the spoken language; desktop or web publishing is easier and cheaper than print.

Governments can help by e.g. funding immersion schools and production of cultural material.  Sometimes unexpected things help: e.g. the Peruvian government required a fraction of Peruvian content on the radio, which led to a lot of exposure of Quechua songs.  People respond to cultural content much more readily than government decrees!

As to how successful efforts have been, I found this interesting discussion from people who know more about it than I do:

Language Hat: Reviving Passamaquoddy

I recently picked up Geoffrey Sampson’s Empirical Linguistics. Sampson has a bone to pick with Chomsky– he wrote an earlier book called Educating Eve: The ‘Language Instinct’ Debate where he engages with Steven Pinker’s version of Chomskyanism, without apparently ever getting a response. But he’s actually a pretty good bone-picker. He thinks Chomsky carried some very unlikely propositions by sheer force of personality rather than argument, and he makes a good case.

Speaker intuitions

The target here is the idea that models of language should be based on speakers’ intuitions of grammaticality. Historically, Chomsky probably insisted on this as part of his reaction against Skinnerian behaviorism, which had little role for the mind; Chomsky asserted that speakers do have a model of language.  Indeed he reified this out of all proportion into a mini-organ precisely specified by genetics– though he never says how.

For decades, syntax was conducted by linguists consulting speaker intuitions– meaning their own. In effect they were making up their own data, which is an invitation to trouble. Sampson mentions William Labov scathingly documenting how Chomsky treated his own intuitions as scientific fact, and those of others, when they disagreed with his, as fallible opinions.

More seriously, speaker intuitions can be demonstrably wrong. People can be quite sure that they never say something– one example is a speaker who convincingly insisted that he never used any more positively, as in John is smoking a lot anymore; but then he was caught spontaneously saying Do you know what’s a lousy show anymore? Johnny Carson.

Intuitions are all right for basic matters; the problem is that syntax today is so sophisticated, and the sample sentences people are asked to judge so complicated and improbable, that it’s unlikely any pre-existing rule covers them. Only nativists like Chomsky can really maintain that the grammar covers all possible situations in advance. It’s simpler to maintain that, as in other cultural domains like law or fashion, people creatively approach new situations when they’re confronted with them.

These delusions are especially dangerous when theoretical edifices have been built on top of them. Sampson recalls giving a seminar which covered center embedding; he referred to the conventional wisdom that multiple center embedding was impossible. Anne de Roeck asked, “But don’t you find that sentences that people you know produce are easier to understand?” Sampson was well into an extended answer to the question before he realized that de Roeck’s question was in fact a counter-example.

The experience spurred a new interest in empirical investigation of linguistic claims. He started to work with linguistic corpora, using computers for searching and analysis; much of the book is a set of reports on how such work is done and what sort of things come up. Not surprisingly, what people actually say and write is more varied and interesting than the somewhat artificial constructs linguists make up.

This is heresy from a Chomskyan point of view– isn’t a grammar supposed to generate all possible sentences and divide them into acceptable and unacceptable? Well, no, that’s just Chomsky’s pet idea. All a grammar has to do is tell us what people say and write– their positive performance. We don’t need to posit a mechanism to deal with the sentences people don’t actually utter or encounter. (It can be a convenient shorthand, of course, to say “We don’t say XXX”. It can eliminate pontes asinorum– or just contrast the dialect being described to others where XXX does occur.)

One chapter of the book departs from the overall topic to consider The Logical Structure of Linguistic Theory, a huge work Chomsky wrote in 1955 and didn’t publish till 1975, but frequently referred to.  It had semi-mythical status when it could only be consulted by the priestly elite of Chomskyanism; once Sampson could finally read it, he found it extremely disappointing.

Meaning

Curiously, when it comes to meaning, Sampson takes the opposite tack. He thinks meaning can’t be investigated scientifically (that is, empirically, subject to falsification) at all. That doesn’t mean it can’t be handled at all, but the approach will be that of the humanities: essentially narrative documentation of human creativity.

He blisteringly attacks Fodor and Katz’s semantic primitives– the analysis of “bachelor” as [+HUMAN] [-MARRIED] [+MALE], for instance. As a minor point, he shows that the idea only works for nouns and adjectives anyway– it’s useless for verbs. For verbs you might as well just use inferences: e.g. if John buys trout from Sally, then Sally sells trout to John. This can be extended to nouns: If John is a bachelor, John is human, unmarried, and male.
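To make the inference idea concrete, here is a minimal Python sketch.  It is my own illustration, not anything from Sampson: the tuple format and the names infer and closure are assumptions made up for the example, and the only two rules are the buy/sell and bachelor inferences mentioned above.

```python
# Hypothetical meaning postulates: each rule maps one fact to the facts it licenses.
# Facts are plain tuples; the relation names are illustrative, not a real formalism.

def infer(fact):
    """Return the set of facts directly inferable from a single fact."""
    kind = fact[0]
    if kind == "buys":            # ("buys", buyer, goods, seller)
        _, buyer, goods, seller = fact
        return {("sells", seller, goods, buyer)}
    if kind == "bachelor":        # ("bachelor", person)
        _, person = fact
        return {("human", person), ("unmarried", person), ("male", person)}
    return set()

def closure(facts):
    """Apply infer() repeatedly until no new facts appear."""
    known = set(facts)
    frontier = set(facts)
    while frontier:
        new = set()
        for fact in frontier:
            new |= infer(fact) - known
        known |= new
        frontier = new
    return known

facts = closure({("buys", "John", "trout", "Sally"), ("bachelor", "John")})
for fact in sorted(facts):
    print(fact)
```

Note that nothing here decomposes a word into primitives; the rules just relate whole statements to other statements, which is why the same trick covers verbs and nouns alike.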

But worse yet, the primitives don’t hold up under analysis. Meanings are too fluid. Does a cup have to have a handle or not? Does it have width or height requirements? Labov tried to approach this by showing people specially constructed objects which were designed to test aspects of the definition of ‘cup’; he found that people’s responses weren’t atomistic, but probabilistic. A certain width-height ratio might produce a given probability that people would judge the object a cup. And he didn’t even get into (say) the usage of the objects. Other researchers concluded that far from coming to a consensus, people could come up with “a myriad” of possible common features.

Adam Kilgarriff, working on computational linguistics, concluded that “I don’t believe in word senses”.  Actual usage is so fluid and vague that it makes no sense to ask in the abstract how many meanings a word has; you can only ask what meanings it could be convenient to distinguish for a given task.

Barbara Partee apparently once found young children asking her whether she had a father. This confused her till she realized that they were asking if she had a husband. They didn’t understand the adult notion that fathers were related to conception; they understood them as male heads of household. Presumably once we understand the facts of life, we adapt our definitions. In other words, faced with new information, we either follow the social consensus or create a new one.

As an example of the latter process, Sampson notes the new reality that people can change sex. If a man fathers a child, then becomes a woman, is he now the child’s mother? Until society faces the question, there is no answer, except in terms of individual creativity. This can’t just be handled by fiddling with the [+MALE] node.

Sampson chides linguists for ignoring the philosophical debates on meaning… linguists’ books on semantics often barely consider Wittgenstein, White, or Quine.  (The sources I’ve read do mention them, but I think I should probably check them out directly.)

On the whole I think Sampson is mostly right about empirical verification of syntactic claims, and probably right about semantics. He obviously dislikes Chomsky’s work very personally, but he’s not a crank– he gives good arguments for his skepticism.

I’ve been rereading the Appendix to 1984. The Party planned to ditch English and have all its members speaking Newspeak only by 2050. (It’s not certain what they planned for the Proles; O’Brien thought they were ineducable, in which case they would still be using Oldspeak.) But Newspeak was designed to have no redundancy in its lexicon and also to be spoken in a rapid, monotonous voice, with no variation of stress or tone (duckspeak) which would make it very hard to follow even in a moderately noisy environment. Do you think a language like that is viable?

—Mornche Geddick

Your question was an opportunity to reread Orwell’s description of Newspeak.  I think it’s a brilliant satire of totalitarian and authoritarian modes of thought; it should be read along with his less fantastical but equally perceptive “Politics and the English Language”.

The main sources or targets seem to be these:

  • An aesthete’s aggrieved reaction to the regularities of artificial languages like Esperanto.  Though this is slightly provincial— what’s wrong with agglutinative languages?— it fits in very well with the Party’s blunt destruction of everything from the past.
  • The careless meaninglessness and deceitfulness of political jargon.
  • The Soviet fashion for syllabic abbreviations, e.g. Sovnarkom for “council of people’s commissars”.

But that’s not your question.  Would it work?  As a written language, purposely impoverished in meaning and cut off from the past, I don’t see why not.  There are clear examples of the latter: Atatürk’s adoption of the Roman alphabet cut off Turks from centuries of literature; the adoption of báihuà (the Mandarin vernacular) over wényán (the classical literary language), plus the script reform, did the same for China.  To be sure scholars in both cases could continue to learn and study past works, but it was a new barrier.

Could the Party keep the new language immaculate of heretical meanings?  Only by retaining absolute power, which of course is a political not a linguistic question.

Newspeak depends on what’s normally called the Sapir-Whorf hypothesis; it was intended to make all other modes of thought but Ingsoc impossible.  But if the political side wasn’t there, I doubt that the linguistic side would hold up.  Suppose the totalitarian state simply collapses, as in V for Vendetta: would the absence of metaphorical uses of “free” continue?  I doubt it; people would simply invent new words or senses.  Writing in 1948, with all the European empires intact, Orwell might suppose that uneducated peoples (denied access to sophisticated liberal thought) could never rebel; I think it’s clear by now that this was wrong.  Despite his own hatred for imperialism, Orwell grossly underestimated the capacities of non-Europeans.

As for the monotonous delivery “without involving the higher brain centres at all”, I think this should be taken as a parody of political speeches, especially the propagandists for extremists, mouthing out verbiage with no concern for careful thought, beauty, or internal contradiction.  In the world of 1984, it wasn’t a bug but a feature if torrents of Newspeak were hard to follow; the aim was the suppression of thought and progress.

Blogger Hans Perk posts this invitation from a Disney party in 1932, and comments that “This is how Mickey should talk”.  Here’s a sample:

minnie an’ me’s gonna have a big shindig over t’ th’ studio on june 25th on account of we’re gonna say good-bye t’ columbia an’ hello to united artists, an’ we want you should help us… an’ this invitation’s good for two people, so’s ya can bring a guest if’ya like…

My question is… who talks this way?  Or ever talked this way?  Even given that some of the slang is outdated, is this supposed to be urban, or rural, or childish, or what?  It seems like a mishmash that says nothing about where Mickey is from (except that he evidently never went to college).

It’s linguistically interesting, though annoying, when writers reproduce supposedly substandard patterns that are actually near-universal in speech.  Who actually pronounces the d in and in “and dancing”?  I assume the apostrophe in t’ and th’ represents a schwa, not an elision; again, who says [ovr æt ði studio] instead of [ovr ət ðə studio]?  Perhaps there was still a contrast in 1932, or perhaps it was a tired bit of pseudo-folksiness even then.

I’ve just read Michael Tomasello’s Constructing a Language (2003), and no, it’s not about conlangs.  It’s about how children acquire language.  It’s one of the best books on language I’ve read; also one of the most difficult.

As adumbrated in Chapter 1, the Generative Grammar hypothesis focuses only on grammar and claims that the human species has evolved during its phylogeny a genetically based universal grammar. 

Jeez, don’t open with a joke or anything.

Contra Chomsky

I already reviewed an essay of Tomasello’s; the book is an extended form.  In brief, he destroys the idea of the poverty of the stimulus.  Chomsky was reacting against Skinner’s stimulus-response notion of language, rightfully pointing out that there has to be a good deal of mental machinery dealing with language— language is nothing like a conditioned response.  His mistake was to assume that this machinery was innate— that children don’t hear enough language to deduce its principles.  As Tomasello shows, reviewing study after study, there is no evidence for this.  Children learn most easily precisely those constructions they hear the most often, and the more difficult constructions take longest to learn.

In later years Chomsky, faced with the vast array of human languages, elaborated the idea of parameters: there are a bunch of switches (OS or SO?  pro-drop or not?  AN or NA?) which determine a language’s particular grammar; all the child has to do is learn what the settings are for the language she hears.  Again, the evidence is against this.  Simply put, children don’t suddenly acquire settings; their competence increases slowly and (this is a key point) item by item, construction by construction. There’s not a sudden point where (say) they realize that English isn’t pro-drop.  They mix constructions with pronouns and those without. 

A major argument for innateness is that children learn languages “naturally”, supposedly within a window of opportunity and much more easily than adults.  I’ve addressed this before, and Tomasello repeats some of the same objections, but adds a good new one: children have the advantage of having no first-language interference.

Attention!

How do children learn language?  We know most about the years from 1 to 4, which have been best studied.  Tomasello doesn’t believe in a language organ at all; he maintains that human language depends simply on human cognitive abilities, and the key one, appearing at about 2 years of age, is the ability to maintain joint attentional frames… that is, the child interacts with an adult, about some situation.  The key word is attention: the child only now can understand that others have mental states, and seek to affect them.  Animal language is all about expressing states: the animal is horny or hungry or wants to go home, or sees a predator.  Other animals may react to these expressions, but they’re not intended as communication— in fact the animal is quite likely to make the same expressions when alone.  What distinguishes human language is the ability to model other minds (and thus to try to affect them). 

Joint attentional frames are Tomasello’s response to Quine’s dilemma about ostension: pointing to a rabbit, do we mean the rabbit, the rabbit’s foot, the act of running, the color of the fur, or a bag of rabbit parts? 

Tomasello points out, by the way, that ostension is of less use than we might think in language learning.  Verbs, for instance, are most often used not to point out an ongoing action, but to describe one that just occurred or that’s about to occur; neither of these can be pointed to.  Even nouns are often used when their referents aren’t present (“Where’s Daddy?”  “What does a cow say?”).

What the frames provide is meaning and context.  Basically, toddlers learn language because it’s the commentary to a situation they already understand.  (To put it another way, if you leave the TV on, they won’t learn about elections or American Idol.  There’s no attentional frame to give them a handle on the words from the TV, so they don’t learn anything from it.)  A child won’t learn ‘rabbit’ from a random act of pointing.  They learn the word in a familiar, information-rich context: playing with a pet, visiting a zoo, reading a book, whatever.  They pretty much already understand what the adult is doing and what the utterance means, and they can use that to figure out what any unfamiliar words mean.

Bag those trees

Tomasello rejects generative grammar and formal linguistics entirely.  The language organ hypothesis posits that children have a full adult understanding of grammar and only need to learn how to activate it.  This just doesn’t match the years-long struggle children have to acquire language and the mistakes they make.

How do they acquire language?  Item by item— and the items may be words, phrases with open slots, or entire constructions (e.g. passive voice).  The evidence is that they don’t learn to link up these items right away.  E.g., learning the verb hit, they don’t really have a concept of the verb’s subject and object.  They learn the word’s particular slots: hitter and hittee.  It’s only much later that they abstract out general syntactic categories like subject; and particular items may indeed remain as anomalies in adult speech. 

As an example, generative grammar treats questions as a transformation of statements… “Where’s the rabbit?” is related to locatives like “The rabbit is in the cage.”  But Tomasello points out that for many children, the first multi-word constructions they produce are questions: where X, what’s X?  They can hardly be transforming statements when they’re not producing statements yet.  Rather, they learn the questions because they hear similar questions from adults. 

Some aspects of language are delayed because they require more cognitive sophistication.  The proper use of pronouns and definite articles, for instance, requires an understanding of what other people know.  Young children use these features based only on what they themselves know.  There are items and constructions that aren’t mastered until well into school age.

Further research needed

The book starts with words and simple constructions, and progresses to more complicated ones.  It gets weaker as it goes on, not because Tomasello’s argument declines, but because the research gets thinner.  There just aren’t enough studies of how children learn the more complex constructions of their language.

Still, his usage-based linguistics is perhaps the first overall theory of language that strikes me as being on the right track in general.  He rejects generative syntax and innate linguistic competence entirely, and that may be going too far.  But as a heuristic, it’s completely correct: we should explain as much as we can with general cognitive abilities before positing language-specific ones. 

Chomskyan linguistics in particular seems like arid speculation verging on pseudo-science.  The whole idea of parameters, for instance, is an invitation to fool oneself: any anomalous data can be swept under the rug by adding a new parameter.  The best alternatives so far have been people who are more sensible (e.g. Lakoff and McCawley) but who still are very far from the neurochemistry of the brain.  I’ve long felt that we won’t be getting near the truth till linguistics is a lot more like color theory: read Hardin’s Color for Philosophers and note how much vision and color perception derive directly from facts about neurons.

Tomasello isn’t at the neural level yet, but he deals refreshingly in facts, both facts about child language acquisition and facts about human cognitive development.  It doesn’t exactly liven up the book, but a few decades of this and I think we’ll get a whole lot closer to exactly how we do this language thing we do.

I picked up a book on the human voice, memorably titled The Human Voice, by Anne Karpf, mostly because it’s about what I’ve always regretted as a lack in linguistics… essentially, everything that voice can do that’s not reducible to writing– intonation, expression, accent, prosody. 

It’s astonishing how much information content there is in the voice.  If I hear you speak even briefly, I know your sex, region, class, rough age, your emotional state, your closeness to me, a good deal of your personality.  It encodes general information and also much of your irreducible individuality.  It’s also a powerful social force: it can soothe or inflame, insult or amuse, completely independent of the words used; it’s a major means of bonding between mother and child.  (Fathers too, to some extent; one of Karpf’s curious facts is that babies aren’t terribly moved by their father’s voice.)

Linguistics generally ignores these things, or issues a vague promissory note– we’ll deal with it after we nail down syntax, maybe.  It’s true that science proceeds as much by excluding as by including domains: Galileo and Newton wouldn’t have got so far with the laws of motion if they hadn’t ignored friction.  On the other hand, if we haven’t explored the territory, the one thing we should know is that we have no idea how big it is or what it contains.  One reason I distrust many current models of language is that they’re adapted to what we do know– the relatively discrete worlds of phonemes and words, which seem amenable to largely computational methods.  But our methods just don’t seem so useful for voice, which is full of analog effects, individual variation, and phenomena we can barely name or talk about.

Unfortunately the book really only underlines how little we know.  It’s full of interesting facts and tantalizing studies, but it’s entirely theory-free.  It’ll probably take another thirty years before we understand, not what the voice can do, but how the brain handles it.  I suspect it’ll be a revolution.

For the two TF2 fans who haven’t already seen this:

http://www.dailymotion.com/video/x6v3cg_gogolrush_videogames

Quite lovely character animation for machinima.  You gotta feel for that demoman.

Also the opportunity to learn two useful French terms: bêtisier “gag reel”, and gogol “tard”.

Look at this picture of the Chinese Olympic mascots, bearing in mind that their names are supposed to spell out 北京欢迎你 Běijīng huānyíng nǐ “Beijing welcomes you”.  What leaps out at you?

Why, that the names don’t match!  What nefarious message do they have for us?

贝贝 Bèibèi uses the character ‘shells, valuables’.  Interestingly, the tone doesn’t match Běijīng.

晶晶 Jīngjīng is ‘bright, shining’.

欢欢 Huānhuān is ‘cheerful’; this one does match huānyíng.

迎迎 Yíngyíng ‘welcome, meet’ is OK too.

妮妮 Nīnī is ‘girl’; again a tone mismatch.
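If you’d rather see the comparison as a table, here’s a small Python sketch that lines up each slogan syllable with the syllable its mascot doubles.  The characters and pinyin are exactly the ones listed above; the function name compare and the row layout are just my own choices for illustration.  Comparing the pinyin strings (tone marks included) checks syllable and tone at once.

```python
import unicodedata

# Slogan syllables and the syllable each mascot name doubles, as listed above.
slogan  = [("北", "běi"), ("京", "jīng"), ("欢", "huān"), ("迎", "yíng"), ("你", "nǐ")]
mascots = [("贝", "bèi"), ("晶", "jīng"), ("欢", "huān"), ("迎", "yíng"), ("妮", "nī")]

def nfc(text):
    """Normalize so that accented pinyin letters compare reliably."""
    return unicodedata.normalize("NFC", text)

def compare(slogan, mascots):
    """Return (mascot name, same character?, same pinyin including tone?) rows."""
    rows = []
    for (s_char, s_pinyin), (m_char, m_pinyin) in zip(slogan, mascots):
        rows.append((m_char * 2, s_char == m_char, nfc(s_pinyin) == nfc(m_pinyin)))
    return rows

rows = compare(slogan, mascots)
for name, same_char, same_pinyin in rows:
    print(name,
          "same character" if same_char else "different character",
          "same tone" if same_pinyin else "different tone",
          sep="\t")
```

Only the middle three mascots keep the slogan’s pronunciation, and only two of those keep its character.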

OK, there isn’t actually a subversive message; I guess names like “You-you” and “North-north” just didn’t sound cute enough.

福娃 fúwá is ‘happy’ + ‘baby’, not to be confused with Japanese futa.

Ryan North of Dinosaur Comics must be reading some linguistics.  He recently had a comic on the Great Vowel Shift, and now he has a two-comic series on Paul Grice’s conversational maxims:

http://www.qwantz.com/archive/001271.html

Prove me wrong if you can: I’d venture to say that these are the only comics to date to focus on Grice’s conversational maxims.  (I think Stan Lee was planning a 4-issue Power Man miniseries covering Grice as well as speech acts, but John Romita couldn’t figure out how to draw a kick-ass presupposition.)

Here’s an explanation in case T-Rex’s isn’t clear enough:

http://www.zompist.com/xurnash.htm#Implicature
