languages


Someone over at Metafilter had a great question: What syntactic category are mathematical operators? (Their username is notsnot, in case this needs to go in a dissertation someday.)

Let’s start with something like

Three plus four is seven.

For now, we’ll say the numerals are NPs.  In a construction NP <word> NP, the <word> could be various things: a verb, a conjunction, a preposition. We can immediately rule out verbs, since plus and its friends (minus, times, over, etc.) are not conjugated.

We should also look at non-mathematical sentences, like

Determination plus luck means victory.

Let’s do some syntax. There are some standard though fallible tests for prepositions.  For instance, they can usually be modified by right:

He fell right in the river.
She lives right down the street.
Go to the cave right in the forest.

*Determination right plus luck means victory.
*Three right plus four is seven.

Prepositional phrases (PP) can often be fronted:

Up the hill she walked.
*Plus four three is seven.

You can front a PP and replace the NP with an interrogative, or front just the questioned element:

Sam is the king of England.  Seven is three plus four.
Of what is Sam the king? *Plus what is seven three?
What is Sam king of?  *What is seven three plus?

PPs allow gapping:

Sam is king of England, and Joe, of France.
*Seven is three plus four, and eight, plus five.

These tests aren’t definitive, but plus is failing every one of them. A better match might be conjunctions. Plus, like and, can link NPs or sentences, and can be used multiple times:

Bill and Anne and Rahesh came.  Two plus two plus one make five.
Bill came, and Anne left.  Bill came, plus Anne left.

On the other hand, this transformation sure doesn’t work:

Sam is a king and Sam is a dancer.  >> Sam is a king and a dancer.
X is 4, plus X is sin θ >> *X is 4 plus sin θ.

It looks like the construction S, plus S isn’t really the same plus as in two plus two.  And other operators don’t allow it at all:

*Bill left, minus Anne stayed.
*Bill left, over Anne stayed.

A bigger problem is that English sentences allow literally infinite amounts of inserted material.

Sam and Alice are nobles.
Sam and Alice are fine, just nobles.
Sam and Alice are still nobles.
Sam, prince of Florin, and Alice, duchess of Guilder, are nobles of Sylvania.
Sam and possibly Alice are, as of Tuesday, nobles.

How much of this can we do with mathematical expressions?

Three plus four is seven.
Three plus lovely four is seven.
Three plus four is still seven.
Three, square root of nine, plus four, half of eight, are seven.
Three and possibly four are, as of Tuesday, seven.

These are not impossible, but at best they sound jocular.  The additions are not math; they’re intrusions of ordinary English.

There’s also the complication that mathematical plus and minus can be unary: you can say Minus three plus four is one. You can have a conjunction beginning a sentence (And the Lord said to Moses…), but that’s not how minus is working here; it’s obviously a modifier for three.

Not to belabor the obvious, but many of the basic things we can do with a sentence don’t really work in mathematics.  You can’t really put a mathematical expression in the past tense, or use the present perfect, or use pronouns, or passivize, or insert a relative clause, or nominalize, or cleft, or topicalize.

And all this is looking at a very basic expression that probably did arise out of normal syntax.  It’s even harder to apply our notions of normal English syntax to something like

x equals minus b plus or minus square root of b squared minus four a c over two a.

or

e to the i n equals cosine of n plus i times sine of n.

I’ve gone into this much detail to convince you (and myself) that ordinary English syntax doesn’t really explain mathematical expressions.  I hope my conclusion doesn’t shock or appall you: mathematical expressions don’t follow English syntactic rules; they follow mathematical rules.

Now, maybe you could shoehorn the quadratic formula or Euler’s formula into the syntactic framework of your choice. I will bet you, however, that you’ll end up with a pile of very idiosyncratic special rules and special syntactic categories, and a bunch of ad hoc exclusions of normal English rules.

And there’s an alternative formulation that would end up far simpler than that: mathematical expressions have their own cross-linguistic syntax, based on their written form, and languages have conventions on how to say them aloud.

I don’t think this is terribly surprising… it’s like discovering that the Russian of Tolstoy’s War and Peace contains a number of passages which are written in the Roman alphabet and don’t follow ordinary Russian syntax.  Is this a revolutionary discovery about weird undercurrents of Russian?  No, it’s just that Tolstoy included quite a bit of French in the text.  Similarly, English sentences can have embedded mathematics.

Still, I hadn’t thought about it this way, and I find it interesting that a pretty ordinary part of English turns out to be, well, not really English at all.

Now, for historical and practical reasons, there’s a certain overlap, especially with basic arithmetic. People undoubtedly said “Two and two are four” (or “twá and twá sind féower”) long before international mathematics was formalized. So these behave more like ordinary English than the quadratic formula does.

Plus, the conventions for speaking math out loud were, of course, invented by speakers of the language out of existing (or newly borrowed) words, and follow ordinary language conventions– where possible.  So you can read cos (2θ) as “cosine of two times theta”.  On the other hand you can just read it as “cos two theta”, which probably has no non-math analogue in English.

(I should add that programmers are very familiar with the idea that math expressions have a particular syntax.  They don’t bother with linguistic categories at all; they define their own, such as operators, variables, constants, functions, and statements.)
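To make that concrete, here is a minimal sketch in Python of the "own syntax plus reading conventions" idea. The class names (Num, Neg, BinOp), the word tables, and the functions say and evaluate are all invented for this illustration; they are not taken from any real library, and the word tables cover only a handful of numbers. The point is just that the expression's structure is defined once, and each language supplies a table for saying it aloud.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Neg:
    """Unary minus: a modifier on one operand, not a conjunction."""
    operand: "Expr"


@dataclass
class BinOp:
    op: str          # one of "+", "-", "*", "/"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Neg, BinOp]

# Hypothetical per-language reading conventions (tiny on purpose).
OPERATOR_WORDS = {
    "en": {"+": "plus", "-": "minus", "*": "times", "/": "over"},
    "fr": {"+": "plus", "-": "moins", "*": "fois", "/": "sur"},
}
NUMBER_WORDS = {
    "en": ["zero", "one", "two", "three", "four", "five"],
    "fr": ["zéro", "un", "deux", "trois", "quatre", "cinq"],
}
ARITHMETIC = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def say(expr: Expr, lang: str = "en") -> str:
    """Read an expression aloud using the chosen language's word tables."""
    ops = OPERATOR_WORDS[lang]
    if isinstance(expr, Num):
        words = NUMBER_WORDS[lang]
        # Fall back to digits for numbers outside the tiny word table.
        return words[expr.value] if 0 <= expr.value < len(words) else str(expr.value)
    if isinstance(expr, Neg):
        return f"{ops['-']} {say(expr.operand, lang)}"
    if isinstance(expr, BinOp):
        return f"{say(expr.left, lang)} {ops[expr.op]} {say(expr.right, lang)}"
    raise TypeError(f"not an expression: {expr!r}")


def evaluate(expr: Expr):
    """Compute the value; this part does not depend on any spoken language."""
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Neg):
        return -evaluate(expr.operand)
    if isinstance(expr, BinOp):
        return ARITHMETIC[expr.op](evaluate(expr.left), evaluate(expr.right))
    raise TypeError(f"not an expression: {expr!r}")


example = BinOp("+", Neg(Num(3)), Num(4))
print(say(example), "is", evaluate(example))
print(say(example, "fr"))
```

Nothing in the tree mentions NPs, conjunctions, or prepositions; the only categories are numbers, a unary operator, and binary operators, and the English or French words are looked up at the last moment.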

 

 


The book of mine which I use the most is The Conlanger’s Lexipedia. Enough, in fact, that my paperback copy is getting too worn. So I created a hardcover edition!


Lulu charges more than I’d like, but on the other hand I can put it on sale! So for now, you can pick it up for $28.76. That’s less than it costs to go out for dinner! And heck, I’ve put the hardcover Language Construction Kit on sale too.

I also took the opportunity to update the text, correcting a few embarrassing errors. Also, the latest copy of Word, amazingly, can hold the whole book in memory at once without crashing. So I was able to add the first few chapters to the index.

Go buy a few!

The Fan’s Guide to Neo-Sindarin, by Fiona Jallings, is now out. Here’s where you can buy it. It’s about Neo-Sindarin.


This is partly a Yonagu Books production: I edited the book and did the book design. But I enjoyed the book a lot and I think most conlangers would.

Tolkien is the greatest of conlangers, and one of the most frustrating. He has an effortless good taste that few of us can match.

I goth ’wîn drega o gwen sui ’wath drega o glawar!
the enemy our flees from us like shadow flees from sunlight
Our enemy flees from us like a shadow flees from sunlight!

You get the feeling that every word has been carefully hand-crafted and polished for decades, probably because it has. He was a linguist, knew his Indo-European and sound changes inside out, and knew how to make a language seem familiar yet with few outright borrowings. The feel of his languages is so natural that it’s become a cliché. (If you’re planning an orcish language, I advise you not to imitate the Black Speech.)

What he couldn’t do for the life of him was finish a language, or write a grammar. He kept messing with things, and he never properly explained even some of the basics. Quenya is in pretty good shape, but Sindarin is woefully underspecified.

That’s where Neo-Sindarin comes in. It’s an attempt by multiple people to finish the language, at least to the point of usability.  There are glaring holes— entire tenses or lines of paradigms, the copula, the pronominal system, just aren’t complete. It would be a little grotesque to make up words to fill things out, and the Neo-Sindarinists don’t do that. They scour the published texts and the slowly accumulating extra material; they extrapolate carefully from Proto-Elvish or from early drafts of Noldorin.

Because so much material has been published only in the last few years, Fiona’s book is pretty much state of the art. It’s a textbook (with exercises), organized in such a way that it can serve as a reference grammar.  You can learn Neo-Sindarin or just learn how it works. It’s also an annotated introduction to the reconstruction process; you can see exactly what was reconstructed, and by whom, and what that’s based on. And it’s lively, or at least as lively as a language textbook can be.

There are also sections on (e.g.) naming and cosmology that remind us that Tolkien was not only a linguist, but a medievalist. The elves are more different from modern humans than many an sf alien.

For me, the most interesting bit was peeking behind the curtain into Tolkien’s study as he conlangs. As I’ve been studying Sanskrit, it’s fascinating to see glimpses of Indo-European poke out in Elvish, such as umlaut and multiple verb stems.

In Sindarin, Tolkien made extensive— really extensive— use of mutations, as in Celtic (and these are not dissimilar to Sanskrit’s sandhi).  There are half a dozen types of mutation, and they make for patterns like this:

drambor – a fist
i dhrambor – the fist
in dremboer – the fists

The article i, you see, triggers vocalic mutation, while the plural in triggers nasal mutation. Often mutation takes on a syntactic role: e.g. only the presence of mutation distinguishes the structure i ’wend bain “the maiden is beautiful” from i ’wend vain “the beautiful maiden”. (Bain is the un-mutated form.)

Sindarin has particularly complex pluralization rules, yet they go back to a very simple rule: add –i to the end. Only the i triggers two separate sound changes, one affecting potentially every vowel in the word, the other moving the –i into the last syllable (and causing some changes there).  And for some words you need to know the ancient form.

Beginning conlangers often want to make simpler languages, Esperanto-style; but later on we usually get a taste for complexity. But merely being weird or randomly irregular is not interesting. Sindarin is a master class in getting complexity out of some fairly simple ideas.

And also, you know, in finishing your grammar. Tolkien had the reworking bug; he was one of those people who can’t stop fiddling with his creation. But really, people, take a sheet of paper and write out all your pronouns.

The other area where most conlangers could learn from Tolkien is in the lexicon. Creating words, he was in his element. This is the opposite of machine-generating a word list and assigning each an English meaning. His words have a history going back to Proto-Elvish and interesting derivations, and they all sound good.

Anyway, I hope you have a wide collection of natlang grammar and a few conlangs; Fiona’s book is a great addition to that part of the shelf.

I finally got around to something I wanted to do for a while… find out what some of the signs on the Hanamura map in Overwatch say.

In the arcade, there are intriguing posters of a lanky woman, not D.Va, who may have a mecha of her own.

ow-machine

Super マシン2 = Super Machine 2

音樂! = Ongaku! = Music!

ow-panther

パンター X = Pantā X = Hunter X

パワーガール = Pawāgāru = Power Girl

The sign on the outside door of the castle:

花村城跡地。立ち入り禁止。

Hanamura-jō atochi. Tachiiri kinshi.

Site of Hanamura castle. No trespassing.

The Rikimaru shop is labeled, not very excitingly,

ラーメン屋 Rāmen-ya = Ramen shop

Finally, the van outside the arcade says

うまさ世界 デリバリ = Umasa sekai – deribari = Tastiness World – Delivery

Thanks to alert reader Hirofumi Nagamura for corrections!

Edit: And also for providing translations for these signs inside the castle:

ow-temple

Left: 七転八起 = Shichiten hakki = “Fall seven times, rise eight times”, i.e. “Don’t be discouraged by multiple setbacks.”

Right: 竜の心で気合全開 = Ryū no kokoro de kiai zenkai = “With a dragon’s heart, go all out with your fighting spirit.”

I’m at the point in my book where I need some sample sentences in Hindi. If you (or a friend or relative) know Hindi and can translate them for me, please contact me. There’s a couple dozen or so.

(I have versions of them already, but they’re either copied from textbooks or they’re my attempt at modifications. I’d rather have a native speaker produce original ones.)

Also, it’d be helpful to have a short (one-paragraph) text in Hindi I can use as a sample text. It should be in the public domain.

I know you were all waiting to hear what the king said. Here’s a bit more of the passage. The order of the lines is Devanāgarī, transliteration (with sandhi), pre-sandhi words, glosses, English.
एतच्चिंतयित्वा स राजा पंडितसभां कारितवान् ।

etacciṃtayitvā sa rājā paṃḍita-sabhāṃ kāritavān

etad cintayitvā sas rājā paṇḍita-sabhām kāritavān

this-s.nom.n think-gerund that-s.nom.m king-s.nom wise-assembly-s.acc make-PassPart-caus-s.nom.m

Having considered these things, the King convened an assembly of wise men.

राजोवाच । भो भोः पंडिताः श्रूयतां ।

rājovāca bhobhoḥ paṃḍitāḥ śrūyatāṃ

rājā uvāca bhobhos paṇḍitās śrūyatām

The King said, “O wise men, let it be heard:
अस्ति कश्चिदेवंभूतो विद्वान्यो

asti kaś-cid-evaṃ-bhūto vidvān yo

asti kas-cid evam-bhūtas vidvān yas

be-PresPart-3s who-s.nom-ever such-s.nom.m sage-s.nom.m who-s.nom.m

Is there any sage among you who—
मम पुत्राणां

mama putrāṇāṃ

mama putrāṇām

I-gen son-p.gen

my sons
नित्यमुन्मार्गगामिनामनधिगतशास्त्राणामिदानीं

nityam-unmārga-gāminām-an-adhigata-śāstrāṇām-idānīṃ

nityam unmārga-gāminām an-adhigata-śāstrāṇām idānīm

constantly wrong.way-go-gerund-p.m not-read-PassPart-book-p.m. now

being always wayward and never reading books—
नीतिशास्त्रोपदेशेन पुनर्जन्म कारयितुं समर्थः ।

nīti-śāstr-opadeśena punar-janma kārayituṃ samarthaḥ?

nīti-śāstra-upadeśena punar-janma kārayitum sam-arthas?

behavior-book-instruction-s.ins again-birth-s.acc effect-infinitive with-capable-s.nom.m

can instruct them in reading and proper behavior, [giving them] a second birth?”

 

This is from the prologue to the Hitopadeśa.  The king, whose name is Sudarśana, has a problem many kings have had: his sons are pretty worthless. He asks the pundits for help. (Yep, pundit is a borrowing from Sanskrit.) As he appears in a book written by a brahmin, the dude who steps up to help, one Viṣṇuśarman, believes that the answer is that they sit with a brahmin, i.e. himself, and learn moral tales.

I will report back later on the actual fables. But for now let’s look at one of the words in the text:

नित्यमुन्मार्गगामिनामनधिगतशास्त्राणामिदानीं

nityamunmārgagāmināmanadhigataśāstrāṇāmidānīṃ

First, you may well ask, is that one word?  It’s written as one. And by the rules of sandhi, it’s pronounced as one. But Müller transliterates it as four words:

nityam – constantly
unmārga-gāminām – wrong-ways-going
an-adhigata-śāstrāṇām – non-reading-books
idānīm – now

The first three words are a description of the unruly princes, and grammatically this can be considered a really big compound. Idānīm ‘now’ probably got dragged in only because it was too tempting to combine the initial i– with the preceding –m.

Sanskrit is extremely fond of these combined words, and this is by no means on the longer end of the possibilities— you can easily have compounds with 20 or 30 roots.

Now, you can certainly do this in English:

“Can anyone instruct my undirected, non-book-reading sons by reading-conduct-instruction?”

But we usually consider this sort of thing inelegant; it reminds us of bureaucratic language: “You must submit the project extension protocol revision form to the acting assistant operations and processes group manager.” We’d be more likely to use subclauses:

“My sons are constantly going the wrong way and never read books; can anyone teach them to value good conduct and literature?”

You only have to inflect the last member of a compound, so possibly the compounds were easier than regular clauses. Or perhaps they were embraced for their difficulty. After all, when the Hitopadeśa was written, the spoken language was already very different. A.L. Basham describes classical Sanskrit as one of the most “ornate and artificial” languages in the world. He also suggests that these compounds may be influenced by Tamil, which also encourages concatenations without explicit connectors or inflections.

 

 

 

If someone has gone through and transliterated it and done a word-for-word gloss. But I have worked through the grammar enough that I can at least follow that.

Let’s work through an example. We start, as Westerners have for more than a century, with the Hitopadeśa, a medieval book of sagely advice told through animal stories. I start with Max Müller’s 1864 edition.  Here’s a sample line.

राजोवाच । भो भोः पंडिताः श्रूयतां ।

râjâ   -jan, N.sg.  The King
uvâcha:  vach, 3.sg.Perf.Par.  said:
bho  Ind.  O
bhos  Ind.  ye
paṇḍitâs  -ta, V.pl.m.  wise,
śrûyatâm  śru, 3 sg. Imp. Pass.   be it heard

Now, Devanāgarī is not hard to read. It’s an abugida, meaning that the basic grapheme is a single consonant with an inherent vowel. E.g. it starts with क = ka. Diacritics modify it to change the vowel: कि ki, कु ku, का kā, and so on. If you really want a naked k, perhaps at the end of a word, you write क्.
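If you like to see this sort of thing as code, the abugida principle maps neatly onto how Unicode stores Devanāgarī. This is only a sketch: the function name akshara and the ASCII vowel keys ("a", "aa", "i", "u") are my own inventions for the example, and only the four vowels mentioned above are covered. The code points themselves are the standard Unicode ones.

```python
# The consonant letter carries an inherent "a"; vowel signs replace it,
# and the virama cancels it. Code points are from the Unicode Devanagari block.
KA = "\u0915"        # क  (ka)
VIRAMA = "\u094D"    # ्  (kills the inherent vowel)

# Hypothetical ASCII keys for the vowels discussed in the text.
VOWEL_SIGNS = {
    "a": "",          # inherent vowel: nothing to add
    "aa": "\u093E",   # ा  (ā)
    "i": "\u093F",    # ि  (drawn to the left, but stored after the consonant)
    "u": "\u0941",    # ु
}


def akshara(consonant: str, vowel=None) -> str:
    """Build one written syllable; vowel=None means a bare consonant."""
    if vowel is None:
        return consonant + VIRAMA
    return consonant + VOWEL_SIGNS[vowel]


for vowel in ("a", "i", "u", "aa"):
    print("k" + vowel, akshara(KA, vowel))
print("k", akshara(KA))
```

The one surprise is that the sign for i is stored after the consonant even though it is drawn before it; putting it in the right place is the renderer's job.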

If you actually transliterate Müller’s Devanāgarī, syllable by syllable, you get this:

rā-jo-vā-ca bho bhoḥ paṃ-ḍi-tāḥ śrū-ya-tāṃ

Which, if you look carefully, isn’t what Müller provides.  What happened?

Sandhi happened. All languages have processes of assimilation and relaxation that happen as words are uttered in context. Occasionally these become noticeable to people and they attempt to write them down— e.g. someone is represented as saying “I hafta go” for “I have to go”.  Sometimes the assimilations are lexicalized, which is why we write assimilation and not adsimilation.

Well, in Sanskrit there are a lot of such adaptations, and you have to write them all. So for instance the vowels ā + u combine into o: rājā uvāca > rājovāca. (Müller’s â / ch are older transliterations; we now use ā / c.)  The –s at the end of paṇḍitās changes to ḥ before the following ś, while the final –m in the last word changes to ṃ, which in this case indicates nasalization. Before a stop, ṃ is pronounced as a homorganic nasal, which is why paṇ- is written paṃ-.

There are special diacritics for these last two letters: e.g. kaṃ would be कं, and kaḥ would be कः.

So, Müller is providing the pre-sandhi versions of the words, which makes them easier to look up in a dictionary.
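As a rough illustration of going the other way, from dictionary forms to what is actually written, here is a toy Python sketch that applies just the three changes discussed above to this one line. The function name apply_sandhi is made up, and the rules are drastically simplified: real sandhi depends on the following sound, while this version changes every final s and m unconditionally, which happens to work here. It also leaves paṇ- alone rather than respelling it paṃ-.

```python
import unicodedata


def apply_sandhi(words):
    """Toy sandhi: handles only the three changes seen in the sample line."""
    # Normalize so that letters like ā are single code points.
    words = [unicodedata.normalize("NFC", w) for w in words]

    # Change 1: final ā + initial u merge into o, fusing the two words.
    fused = []
    for word in words:
        if fused and fused[-1].endswith("ā") and word.startswith("u"):
            fused[-1] = fused[-1][:-1] + "o" + word[1:]
        else:
            fused.append(word)

    # Changes 2 and 3: final s becomes ḥ, final m becomes ṃ.
    # (Simplification: applied everywhere, ignoring the following sound.)
    result = []
    for word in fused:
        if word.endswith("s"):
            word = word[:-1] + "ḥ"
        elif word.endswith("m"):
            word = word[:-1] + "ṃ"
        result.append(word)
    return " ".join(result)


pre_sandhi = ["rājā", "uvāca", "bho", "bhos", "paṇḍitās", "śrūyatām"]
print(apply_sandhi(pre_sandhi))
```

Reversing the process is the hard part for a learner, since a written o could come from several different vowel combinations; that is why the pre-sandhi forms are so useful.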

(A complication for the actual book I’m writing: It turns out that Word and Illustrator don’t properly handle Devanāgarī. They can’t do the combinations– e.g. nra should be written न्र, but they turn that into न् र, like barbarians. So I won’t be able to use a lot of Devanāgarī except as, shudder, bitmaps.)

Next we need to translate his glosses to a briefer and more modern convention:

king-s.nom say-perf.part.-3s oh wise-p.voc.m hear-imper.pass.-3s

Müller glosses bho bhos as “O ye”, but this is a bit confusing— bhos is not a pronoun. An online dictionary suggests that it’s an interjection often used in addressing people: oh! hello!  indeed!   And it seems that we’re actually dealing with a reduplicated form here, bhobhos.

Finally we can provide the translation:

The king said: O wise men, let it be heard…

That’s enough for today, but on request I’ll tell you what the king wanted heard. And you should request it, because then I can talk about Sanskrit’s insane mega-compounds.

By the way, classical Sanskrit wasn’t written in Devanāgarī— it was written in the local, contemporary script. All modern Indian scripts, and Southeast Asian ones as well, ultimately derive from Brāhmī, which is what Aśoka knew. If you write your vernacular in Devanāgarī, as of course Hindi speakers do, then you write your Sanskrit in Devanāgarī; but if you speak Tamil you use Tamil script, and so on.

How, you may wonder, does this compare to learning wényán for my China book? The script is way easier, of course. But sandhi is a nightmare, and the grammar is far less accessible. You can boldly translate wényán poems knowing little but the glosses, but I don’t think I’ll be doing my own translations of Sanskrit poetry.
