Syntax toys

I wanted to talk about my latest syntax toys, so I decided to post all three of them: ggg, gtg, mg.

To fully understand them, you’ll have to wait for my upcoming syntax book. But in brief: they are all apps for generating sentences.

  • ggg rearranges strings. You can use this for the toy grammars that syntacticians and computer programmers always start their books with, but it can also handle everything in Syntactic Structures. I’ve loaded it with some interesting sample grammars.
  • mg is the equivalent for the Minimalist Program. It’s actually way more fun than reading Chomsky, in much the same way it’s much more fun to try painting a watch than to watch paint drying. I’ll explain the basics in another post.
  • gtg rearranges trees. The idea is that the program knows about syntactic structure, so you can have rules that talk about or rearrange an NP, no matter what’s in it.  You can do this in ggg only by writing rules that apply to elements before they’re expanded into subtrees.
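
To give a rough idea of what "generating sentences" from a toy grammar means, here is a minimal Python sketch. It is not ggg's actual rule format: the GRAMMAR table, its tiny vocabulary, and the generate function are all made up for illustration. Each symbol is rewritten as one of its expansions until only words are left.

```python
import random

# Hypothetical toy grammar (an assumption, not ggg's own notation):
# each symbol maps to a list of possible expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N": [["fish"], ["fox"]],
    "V": [["sees"], ["eats"]],
}


def generate(symbol, rng):
    """Rewrite a symbol until only words remain; return the list of words."""
    if symbol not in GRAMMAR:
        # Anything without a rule is treated as a word.
        return [symbol]
    expansion = rng.choice(GRAMMAR[symbol])
    words = []
    for part in expansion:
        words.extend(generate(part, rng))
    return words


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(" ".join(generate("S", rng)))
```

Note that a rule here only ever sees a symbol like NP before it is expanded; once it has become words, nothing in the program remembers that they formed an NP. That is the limitation a tree-based tool is meant to get around.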

I’m going to talk some more about gtg, since I’ve been working avidly on it for the last few weeks.

I showed some of these to a non-linguist friend, and I think he was polite but didn’t get it. That’s fine; like I say, it requires a book to explain. But from his questions, like “Could you write poetry with it?”, it was clear that he expected it to be something rather different– a wide-ranging text generator.

That is, he was more or less measuring the program’s intelligence by the size of its vocabulary.  gtg only knows about a dozen nouns and a dozen verbs (and some other stuff). It would be possible to add hundreds more, but that’s not the point.  The point is to model basic English syntax.  That’s hard enough!

It’s not an ultra-hard problem by any means, or I couldn’t have done it in a few weeks. On the other hand, I had Chomsky’s and other linguists’ rules to start with!

The thing is, English speakers all know these rules… unconsciously. Which means you’re not impressed when you see someone produce a simple but correct sentence. Well, let’s see how aware you are of the rules.  Here are some variants of sentences:

  • The fish were caught by her
  • She has eaten fish
  • She must like fish
  • She’s eating fish

That’s passive, perfect, modal, and progressive. All four can occur in one sentence. Without trying out alternatives in your head, what order do they appear in?

Here’s another: some sentences require an added do, some don’t:

  • We don’t keep kosher.
  • Did you take out the trash?
  • What does the fox say?
  • We aren’t going to St. Ives.
  • Can’t you keep a secret?

Again, without trying it out in your head, just from general knowledge: can you state when this added do appears?

Or, can you say precisely when you use he and when you use him?  If you are a conlanger or you know an inflected language, you probably immediately think “He is nominative.”  Well, what about in Sarah wants him to move out? Him is the subject of ‘move out’, isn’t it?  (It’s not the object of want. What does Sarah want? “Him”? No, she wants “for him to move out”.)

The rules aren’t terribly difficult… indeed, if you look in the boxes on the gtg page, they’re all right there! But they’re difficult enough to make a fairly involved computing problem.

Now, syntacticians devising rules like to use formal notation… but they almost always supplement it with English descriptions. Programming forces you to be much more explicit.

Now, when I began the program, I started out with rules that looked something like this:


If you look at mg, the rules are still like that… and since I wrote that a few months ago, I don’t even remember how they work. But besides being unreadable, such rules are very ad hoc, and hide a bunch of details in the program code.

What I ended up doing instead was writing myself a tiny programming language.  This forced me to come up with the smallest steps possible, and to encode as little grammatical information as possible within the program itself.

Here’s an example: the rules for making a sentence negative.

* negative
maybe if Aux lex not insert Neg
maybe if no Aux find T lex not insert Neg

The first line is a comment. The rest are commands.

  • Maybe says that a rule is optional– the program will execute it only sometimes.
  • If looks for something of a particular category, in this case an auxiliary verb. If it’s not found, we skip to the next rule. If it is, we remember the current location.
  • Lex not means to look up the word not in the lexicon and keep it on the clipboard.
  • Insert says to insert what’s on the clipboard into the sentence at the current location.

Note that this mini-language only has two ‘variables’, what I’ve called the clipboard and the current location. I haven’t found a rule yet that requires more than that.
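
For the programmers, here is a rough Python sketch of how an interpreter for these two rules could work. It is not gtg's code, and it makes several assumptions of its own: the sentence is a flat list of (category, word) pairs rather than a tree, maybe fires half the time, insert puts the new item just after the current location, and "if no" and find behave the way their names suggest. The names run_rule, run_rules, find and LEXICON are hypothetical.

```python
import random

# Hypothetical lexicon; the real gtg lexicon is not shown in this post.
LEXICON = {"not": "not"}


def find(sentence, category):
    """Return the index of the first item with this category, or None."""
    for index, (cat, _word) in enumerate(sentence):
        if cat == category:
            return index
    return None


def run_rule(rule, sentence, rng=random):
    """Apply one rule to a list of (category, word) pairs; return a new list."""
    tokens = rule.split()
    result = list(sentence)
    clipboard = None  # first 'variable'
    location = None   # second 'variable': an index into the sentence
    i = 0
    while i < len(tokens):
        command = tokens[i]
        if command == "maybe":
            # Assumption: an optional rule runs half the time.
            if rng.random() >= 0.5:
                return result
            i += 1
        elif command == "if" and tokens[i + 1] == "no":
            # Assumed meaning: go on only if the category is absent.
            if find(result, tokens[i + 2]) is not None:
                return result
            i += 3
        elif command in ("if", "find"):
            # Remember where the category is; give up on the rule if missing.
            location = find(result, tokens[i + 1])
            if location is None:
                return result
            i += 2
        elif command == "lex":
            clipboard = LEXICON[tokens[i + 1]]
            i += 2
        elif command == "insert":
            if clipboard is None or location is None:
                raise ValueError("insert needs a clipboard and a location")
            # Assumption: the new item goes right after the current location.
            result.insert(location + 1, (tokens[i + 1], clipboard))
            i += 2
        else:
            raise ValueError(f"unknown command: {command}")
    return result


def run_rules(text, sentence, rng=random):
    """Run every rule in order, skipping blank lines and '*' comments."""
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("*"):
            sentence = run_rule(line, sentence, rng)
    return sentence


NEGATIVE = """* negative
maybe if Aux lex not insert Neg
maybe if no Aux find T lex not insert Neg"""

if __name__ == "__main__":
    sentence = [("NP", "she"), ("Aux", "must"), ("V", "like"), ("NP", "fish")]
    print(run_rules(NEGATIVE, sentence))
```

Because each maybe is a coin flip, running this several times gives the sentence sometimes with a Neg item and sometimes without. The whole state of a rule really is just the two variables, clipboard and location.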

The help file for gtg explains all the commands and annotates each of the grammatical rules I used.

This is not how syntacticians write their rules; but one conclusion I’ve come to after reading a bunch of syntax books is that all the formalisms are far less important than their inventors think. Chomsky started people thinking that there was One True Theory of Syntax, but there isn’t. It’s less like solving the Dirac equation and more like proving the Pythagorean theorem: there are many ways to do it, and the fact that they look and feel different doesn’t mean that most of them are wrong. Writing rules in this simple language worked out for me and it’s no worse than, say, the extremely unintuitive rules of Minimalism.

Can you use these toys for writing grammars for your language or conlang?  Well, best to wait for the book to come out, but in general, sure, you can try.

I have to warn you, though: it’s not quite as straightforward as using the SCA, and plenty of people have trouble with that.  You have to think like a programmer: be able to break a problem into tiny pieces, and work out all the complications.

On the other hand, tools like gtg can help keep you honest: if the rules don’t work, the program produces crappy sentences, so you know something’s wrong. Plus it keeps you thinking about a wide variety of sentences. (Good syntacticians can quickly run through a bunch of transformations in their heads, applying them to some problem. When you’re new to the concept, you can think only about simple sentences and miss 90% of the complications.)

Also, I hope to keep improving the program, so it may be easier later on.