I’m at the point in my book where I need some sample sentences in Hindi. If you (or a friend or relative) know Hindi and can translate them for me, please contact me. There’s a couple dozen or so.

(I have versions of them already, but they’re either copied from textbooks or they’re my attempt at modifications. I’d rather have a native speaker produce original ones.)

Also, it’d be helpful to have a short (one-paragraph) text in Hindi I can use as a sample text. It should be in the public domain.

I know you were all waiting to hear what the king said. Here’s a bit more of the passage. The order of the lines is Devanāgarī, transliteration (with sandhi), pre-sandhi words, glosses, English.
एतच्चिंतयित्वा स राजा पंडितसभां कारितवान् ।

etacciṃtayitvā sa rājā paṃḍita-sabhāṃ kāritavān

etad cintayitvā sas rājā paṇḍita-sabhām kāritavān

this-s.nom.n think-gerundive that-s.nom.m king-s.nom wise-assembly-s.acc make-PassPart-caus-s.nom.m

Having considered these things, the King convened an assembly of wise men.

राजोवाच । भो भोः पंडिताः श्रूयतां ।

rājovāca bhobhoḥ paṃḍitāḥ śrūyatāṃ

rājā uvāca bhobhos paṇḍitās śrūyatām

The King said, “O wise men, let it be heard:
अस्ति कश्चिदेवंभूतो विद्वान्यो

asti kaś-cid-evaṃ-bhūto vidvān yo

asti kas-cid evam-bhūtas vidvān yas

be-PresPart-3s who-s.nom-ever such-s.nom.m sage-s.nom.m who-s.nom.m

Is there any sage among you who—
मम पुत्राणां

mama putrāṇāṃ

mama putrāṇām

I-gen son-p.gen

my sons


nityam unmārga-gāminām an-adhigata-śāstrāṇām idānīm

constantly wrong.way-go-gerund-p.m not-read-PassPart-book-p.m. now

being always wayward and never reading books—
नीतिशास्त्रोपदेशेन पुनर्जन्म कारयितुं समर्थः ।

nīti-śāstr-opadeśena punar-janma kārayituṃ samarthaḥ?

nīti-śāstra-upadeśena punar-janma kārayitum sam-arthas?

behavior-book-instruction-s.ins again-birth-s.acc effect-infinitive with-capable-s.nom.m

can instruct them in reading and proper behavior, [giving them] a second birth?”


This is from the prologue to the Hitopadeśa. The king, whose name is Sudarśana, has a problem many kings have had: his sons are pretty worthless. He asks the pundits for help. (Yep, pundit is a borrowing from Sanskrit.) As he appears in a book written by a brahmin, the dude who steps up to help, one Viṣṇuśarman, believes that the answer is that they sit with a brahmin, i.e. himself, and learn moral tales.

I will report back later on the actual fables. But for now let’s look at one of the words in the text:



First, you may well ask, is that one word?  It’s written as one. And by the rules of sandhi, it’s pronounced as one. But Müller transliterates it as four words:

nityam – constantly
unmārga-gāminām – wrong-ways-going
an-adhigata-śāstrāṇām – non-reading-books
idānīm – now

The first three words are a description of the unruly princes, and grammatically this can be considered a really big compound. Idānīm ‘now’ probably got dragged in only because it was too tempting to combine the initial i– with the preceding –m.

Sanskrit is extremely fond of these combined words, and this is by no means on the longer end of the possibilities— you can easily have compounds with 20 or 30 roots.

Now, you can certainly do this in English:

“Can anyone instruct my undirected, non-book-reading sons by reading-conduct-instruction?”

But we usually consider this sort of thing inelegant; it reminds one of bureaucratic language: “You must submit the project extension protocol revision form to the acting assistant operations and processes group manager.” We’d be more likely to use subclauses:

“My sons are constantly going the wrong way and never read books; can anyone teach them to value good conduct and literature?”

You only have to inflect the last member of a compound, so possibly the compounds were easier than regular clauses. Or perhaps they were embraced for their difficulty. After all, when the Hitopadeśa was written, the spoken language was already very different. A.L. Basham describes classical Sanskrit as one of the most “ornate and artificial” languages in the world. He also suggests that these compounds may be influenced by Tamil, which also encourages concatenations without explicit connectors or inflections.




If someone has gone through and transliterated it and done a word-for-word gloss. But I have worked through the grammar enough that I can at least follow that.

Let’s work through an example. We start, as Westerners have for more than a century, with the Hitopadeśa, a medieval book of sagely advice told through animal stories. I start with Max Müller’s 1864 edition.  Here’s a sample line.

राजोवाच । भो भोः पंडिताः श्रूयतां ।

râjâ   -jan,  The King
uvâcha:  vach,  said:
bho  Ind.  O
bhos  Ind.  ye
paṇḍitâs  -ta,  wise,
śrûyatâm  śru, 3 sg. Imp. Pass.   be it heard

Now, Devanāgarī is not hard to read. It’s an abugida, meaning that the basic grapheme is a single consonant with an inherent vowel. E.g. it starts with क = ka. Diacritics modify it to change the vowel: कि ki, कु ku, का kā, and so on. If you really want a naked k, perhaps at the end of a word, you write क्.
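The abugida principle is easy to sketch in code. What follows is a toy illustration in Python, not a real transliteration library: the function name `transliterate` and the three lookup tables are my own inventions for this example, and the tables cover only the handful of signs that appear in this post.

```python
# Toy Devanagari-to-roman converter illustrating the abugida principle.
# All names here are hypothetical; the tables are deliberately incomplete.

CONSONANTS = {
    "क": "k", "र": "r", "ज": "j", "व": "v", "च": "c", "भ": "bh",
    "प": "p", "ड": "ḍ", "त": "t", "श": "ś", "य": "y",
}

# Dependent vowel signs (diacritics that replace the inherent "a").
VOWEL_SIGNS = {
    "\u093e": "ā",  # sign AA
    "\u093f": "i",  # sign I
    "\u0940": "ī",  # sign II
    "\u0941": "u",  # sign U
    "\u0942": "ū",  # sign UU
    "\u0947": "e",  # sign E
    "\u094b": "o",  # sign O
}

# Marks written after the syllable.
MARKS = {
    "\u0902": "ṃ",  # anusvara
    "\u0903": "ḥ",  # visarga
}

VIRAMA = "\u094d"  # strips the inherent vowel, leaving a naked consonant


def transliterate(text):
    """Convert Devanagari to roman letters, one code point at a time."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])  # diacritic overrides "a"
                i += 1
            elif nxt == VIRAMA:
                i += 1                        # no vowel at all
            else:
                out.append("a")               # the inherent vowel
        elif ch in MARKS:
            out.append(MARKS[ch])
        else:
            out.append(ch)                    # spaces, punctuation, unknowns
        i += 1
    return "".join(out)


print(transliterate("क"))        # ka
print(transliterate("राजोवाच"))  # rājovāca
```

The only real logic is the three-way choice after each consonant: a vowel sign, a virāma, or nothing (in which case the inherent a is supplied).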

If you actually transliterate Müller’s Devanāgarī, syllable by syllable, you get this:

rā-jo-vā-ca bho bhoḥ paṃ-ḍi-tāḥ śrū-ya-tāṃ

Which, if you look carefully, isn’t what Müller provides.  What happened?

Sandhi happened. All languages have processes of assimilation and relaxation that happen as words are uttered in context. Occasionally these become noticeable to people and they attempt to write them down— e.g. someone is represented as saying “I hafta go” for “I have to go”.  Sometimes the assimilations are lexicalized, which is why we write assimilation and not adsimilation.

Well, in Sanskrit there are a lot of such adaptations, and you have to write them all. So for instance the vowels ā + u combine into o: rājā uvāca > rājovāca. (Müller’s â / ch are older transliterations; we now use ā / c.) The –s at the end of paṇḍitās changes to ḥ before the following ś, while the final m in the last word changes to ṃ, which in this case indicates nasalization. Before a stop, ṃ is pronounced as a homorganic nasal, which is why paṇ- changed to paṃ-.

There are special diacritics for these last two letters: e.g. kaṃ would be कं, and kaḥ would be कः.

So, Müller is providing the pre-sandhi versions of the words, which makes them easier to look up in a dictionary.
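To make the direction of the process concrete, here is a rough Python sketch that goes from the pre-sandhi words to the written form of this one line. It is nowhere near a real sandhi engine: the function name `apply_sandhi` is made up, and it knows only the four adjustments seen in this sample (ā + u > o, final s > ḥ, final m > ṃ, and the ṇḍ > ṃḍ spelling). The contexts I have given for the s and m rules (before a voiceless consonant or at the end of the line; before anything but a vowel) are my simplifying assumptions.

```python
# Toy sandhi sketch covering only the rules visible in the sample line.
# Hypothetical names throughout; real Sanskrit sandhi has many more rules.

VOWELS = "aāiīuūṛeo"
VOICELESS = ("k", "p", "ś", "ṣ", "s")  # assumed contexts for final s > ḥ


def apply_sandhi(words):
    """Join pre-sandhi words into the form actually written."""
    out = []
    for i, word in enumerate(words):
        nxt = words[i + 1] if i + 1 < len(words) else ""

        # Nasal before a stop is spelled with anusvara in this edition.
        word = word.replace("ṇḍ", "ṃḍ")

        if word.endswith("s") and (nxt == "" or nxt.startswith(VOICELESS)):
            word = word[:-1] + "ḥ"
        elif word.endswith("m") and (nxt == "" or nxt[0] not in VOWELS):
            word = word[:-1] + "ṃ"

        # Final ā + initial u fuse into o, merging the two words.
        if out and out[-1].endswith("ā") and word.startswith("u"):
            out[-1] = out[-1][:-1] + "o" + word[1:]
        else:
            out.append(word)
    return " ".join(out)


line = ["rājā", "uvāca", "bho", "bhos", "paṇḍitās", "śrūyatām"]
print(apply_sandhi(line))  # rājovāca bho bhoḥ paṃḍitāḥ śrūyatāṃ
```

Going the other way, from the written text back to dictionary forms, is the hard part, since several different inputs can produce the same output; that is the work Müller has done for the reader.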

(A complication for the actual book I’m writing: It turns out that Word and Illustrator don’t properly handle Devanāgarī. They can’t do the combinations– e.g. nra should be written न्र, but they turn that into न् र, like barbarians. So I won’t be able to use a lot of Devanāgarī except as, shudder, bitmaps.)

Next we need to translate his glosses to a briefer and more modern convention:

king-s.nom say-perf.part.-3s oh wise-p.voc.m hear-imper.pass.-3s

Müller glosses bho bhos as “O ye”, but this is a bit confusing— bhos is not a pronoun. An online dictionary suggests that it’s an interjection often used in addressing people: oh! hello!  indeed!   And it seems that we’re actually dealing with a reduplicated form here, bhobhos.

Finally we can provide the translation:

The king said: O wise men, let it be heard…

That’s enough for today, but on request I’ll tell you what the king wanted heard. And you should request it, because then I can talk about Sanskrit’s insane mega-compounds.

By the way, classical Sanskrit wasn’t written in Devanāgarī— it was written in the local, contemporary script. All modern Indian scripts, and Southeast Asian ones as well, ultimately derive from Brāhmī, which is what Aśoka knew. If you write your vernacular in Devanāgarī, as of course Hindi speakers do, then you write your Sanskrit in Devanāgarī; but if you speak Tamil you use Tamil script, and so on.

How, you may wonder, does this compare to learning wényán for my China book? The script is way easier, of course. But sandhi is a nightmare, and the grammar is far less accessible. You can boldly translate wényán poems knowing little but the glosses, but I don’t think I’ll be doing my own translations of Sanskrit poetry.

My wife has just returned from Peru, and brought back a list of Peruvian names from the newspapers. Odd spellings for foreign names are muy de onda (very hip).






Airon (Aaron?)
Jeylo (J. Lo)
Jhunior Brayan
Itan (Ethan?)
Johan Jonathán










I’ve updated the Numbers from 1 to 10 page!  For the first time in, well, many years.

Note: if you don’t see the new page, because the old page is cached, you may have to hit shift-refresh.

The major features:

  • It now makes extensive use of Unicode to finally present the numbers as they were intended to be seen. (If you can’t see all the characters it’s dredged up, check the notes page for how to download comprehensive Unicode fonts.)
  • As a corollary, I’ve started to include the native writing system for key languages.
  • The families are color-coded to help you navigate.
  • The page uses Javascript to allow you to customize the results.

Now the story behind the update. The original source file was an enormous Mac Word 5.1 file. To generate the html files, I would output the source file into RTF (which is how you were supposed to access .doc files). Then I ran a custom C program that converted the RTF into html.

So far so good, only my old PowerPC died a few years back, which meant I could no longer run Mac Word 5.1, which meant I couldn’t generate the RTF or the html files, which meant no updates period.

Sigh, Mac Word 5.1, released in 1991, was a thing of beauty. It had little of the cruft of later versions of Word, I had all the commands in muscle memory, and on the PowerPC it was damn fast. Plus it never crashed. I had to switch to Word 2008 when I needed Unicode, but I kept using 5.1 until I couldn’t. I’ve gotten used to Word 2008, but it is just not the reliable workhorse that 5.1 was. It crashes unpredictably with certain large files, especially if they have a lot of formatting— as many of my books do.

Word 5.1 had a neat feature that I used extensively for the numbers list: you could overtype characters. This was necessary to represent the many many arcane and wacky characters that linguists have used over the last couple centuries to write their grammars and wordlists. Word 2008 can read these, but apparently can’t create them.

I had long envisioned a database or a text document that could hold the numbers, letting the web page itself be very simple. I was a bit worried that the database would be huge and slow, but then I remembered that most web pages these days pull down megabytes of cruft.

So, the source file is now plaintext.  I still use Word to create it, because it looks better there and I can use bolding to help me navigate, but all I do to make the plaintext file is copy and paste into TextEdit. It turns out that the whole file is only 400K, far smaller than the 1.4M html file that was the old mondo partly-Unicoded version. The text file is human-readable, but some pretty simple Javascript reads and prettifies it for the actual web page.
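For the curious, the general shape of "plaintext in, prettified HTML out" looks something like the sketch below. Everything specific in it is an assumption made up for illustration: the real file's layout is not shown here, so the `# Family` header lines, the `Language: words` format, and the function names `parse_numbers` and `to_html_row` are all hypothetical. It is also written in Python rather than the Javascript the page actually uses.

```python
# Sketch of reading a human-readable word list and emitting HTML rows.
# The file format and every name below are hypothetical illustrations.
from html import escape

SAMPLE = """\
# Germanic
English: one two three
German: eins zwei drei
"""


def parse_numbers(text):
    """Turn the plaintext into a list of records, tracking the family."""
    family = None
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):          # assumed family header line
            family = line[1:].strip()
            continue
        language, _, words = line.partition(":")
        records.append({
            "family": family,
            "language": language.strip(),
            "numbers": words.split(),
        })
    return records


def to_html_row(record):
    """Render one record as a table row; the family becomes a CSS class."""
    cells = "".join(f"<td>{escape(n)}</td>" for n in record["numbers"])
    family = escape(record["family"] or "")
    return f'<tr class="{family}"><th>{escape(record["language"])}</th>{cells}</tr>'


for rec in parse_numbers(SAMPLE):
    print(to_html_row(rec))
```

Using the family as a class name is one simple way to get the color-coding: the stylesheet can then assign a color per family without the data file knowing anything about presentation.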

There are probably some typos in the file, due to quirks in the old process which I may have missed or messed up during the conversion process.  On the other hand, there were a lot of kludges in the old html version; the new version is much closer to the original sources.

I haven’t dealt with the sources page yet. (The problems are similar; it will be another fairly tedious project to update the document and access page.)

Edit: The sources page is done now too! As with the numbers page, you can zero in on specific regions.

If you happen to be a linguist or for some other reason study the less-spoken languages, I’m always open to additions and corrections, and finally I can make them again.


In your review of Overwatch, you said that you appreciate the fact that characters speak appropriately in Chinese, Korean, Russian, and French. However, I have read some complaints that the French accent of Widowmaker sounds fake. Since I have heard similar complaints about Leliana of the Dragon Age series, and since both are voiced by French people, I would like to know if this perception comes from actors deliberately exaggerating their pronunciation, or if Hollywood or something similar has misled people about what constitutes a true foreign accent.
Antonin BRAULT

Standards are changing, so I think this issue is in flux.

I can tell you what isn’t acceptable any more: mangling foreigners’ accents as in this book.

That is, it would be completely offensive if instead of having a Korean-Japanese-American woman (Charlet Chung) voice D.Va, they’d had a white American attempt a Korean accent.

So far as I can judge, Chloé Hollings, the voice of Widowmaker, pronounces the French perfectly— as she should; she’s French.

Is her French accent exaggerated? Yes, of course; Hollings is bilingual and speaks excellent English. I don’t have any inside knowledge of Blizzard’s production, but one can imagine for many of these voices a scene something like this:

Voice actor: (pronounces a line perfectly)

Director: Great! Only… can you make it sound more French?

And the director does have a point! If they’ve gone to the trouble of hiring bilingual voice actors, they kind of don’t want perfectly unaccented English. The characters are supposed to be cartoony, so they want to reach the sweet spot where the accents communicate the character but remain attractive. (Americans, at least, react negatively to a heavy foreign accent, but find a light accent enchanting.)

With Dragon Age, I saw a page that noted that Corinne Kempa (voice of Leliana) simply didn’t have the type of French accent Americans expect to hear. Again, American viewers aren’t very sophisticated here; few could even identify different varieties of French. (I liked Leliana— it was nice to have a fantasy game that didn’t over-rely on British accents.)

It’s hard to make everybody happy, but I think Blizzard took a pretty good approach. I also like the fact that, except for the two ninjas, the characters aren’t defined by their nationalities. E.g. Mei is a climatologist, who just happens to be Chinese. Zarya is much more defined as “butch power-lifting soldier” than as Russian. They do paint with a broad brush, but they’re nodding much more to media images than to ethnic stereotypes— e.g. McCree is a version of Clint Eastwood; Junkrat refers to Mad Max.  One character they could have done better with, in my opinion, is Pharah, who should speak some Arabic.

Edit: The new character, Ana, does speak some Arabic.

I saw this on Twitter, and decided that this was an important phrase to learn in Chinese:



wǎng-shàng xūnǐ jiāoxīn bù yí

web-above virtual entrust not should

You should not make virtual commitments online.


While we’re at it, my Overwatch pals have been quoting D.Va’s comments in Korean, so let’s look at those in more detail.


a̠nɲjʌ̹ŋ ɦa̠sʰe̞jo

Annyeong haseyo!

peace you.have

Do you have peace? = How are you?

That first word is a borrowing from Chinese 安寧 (Mandarin ānníng ‘peace, tranquility’). You will undoubtedly recognize the first character from 西安 Xī’ān, the ancient capital of China; also Heian, the ancient name for Kyoto.

D.Va is very informal and also from the future, so she just says Annyeong!



Kamsa hamnida!

thanks have.assertive

I am thankful! = Thank you!

Again, the first word is a borrowing: 感謝 gǎnxiè ‘gratitude’; the common way to say “Thank you” in Mandarin— which you can hear Mei say in Overwatch— is 謝謝 xièxiè.

And again, D.Va informally says just Kamsa!

Mei’s “Hello” is 你好 Nǐhǎo, literally “you good?”

