Thanks to alert reader Kit La Touche, I’ve now seen the actual article from Quentin Atkinson.  It has lots of statistical detail, but doesn’t really answer my objections.  There’s nothing at all about the sampling problem, nothing about historical trends, if any, in phoneme inventory size, and nothing to indicate he realizes that the Bantu explosion basically erased most of African language diversity.

Plus, though he mentions the idea that modern population size can’t be projected into the past, he doesn’t seem to realize that it may be entirely uncorrelated with ancient (more than 15,000 years ago) population sizes.  For example, Mandarin doesn’t have a billion speakers today because it was a particularly large tribe in 13,000 BC.  For most of our existence we were hunter-gatherers, and most languages probably didn’t exceed 500 speakers, except when a tribe could expand into virgin territory.

Here are his languages, which he gets from WALS (the World Atlas of Language Structures).  I was worried that he was overrepresenting the Polynesian languages, but it seems not.  On the other hand, some areas are strangely thin.  There can’t be many Khoisan languages there, and quite a few areas are worryingly sparse: India, East Africa, southern Australia, North America.

At one point he contrasts an analysis based on families, which suggests Africa as the origin, with one based on individual languages, which narrows it down to sub-Saharan Africa, especially the west.  But dude, look at your map; you only have four data points in northern Africa, and only four on the east coast.

And here’s his actual scatterplot, with legend.  That’s an awfully, well, scattered distribution; note that his own analysis suggests that the distance from Africa accounts for just 19% of the variation.
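
To put that 19% in perspective, here's a quick back-of-the-envelope sketch.  It assumes the figure behaves like the R-squared of a simple straight-line fit with distance as the only predictor, which is my simplification and not necessarily how the paper's model is set up; the function name is mine too.

```python
import math

def correlation_from_r_squared(r_squared, negative=True):
    """Correlation implied by R-squared in a one-predictor linear fit.

    Assumption: a simple linear regression, where R-squared is just the
    square of the correlation coefficient. The sign is supplied by the
    caller, since squaring throws it away.
    """
    r = math.sqrt(r_squared)
    return -r if negative else r

# 19% is the figure quoted above; diversity falls with distance,
# so the implied correlation is taken as negative.
r_squared = 0.19
r = correlation_from_r_squared(r_squared, negative=True)
unexplained = 1 - r_squared

print(f"implied correlation: {r:.2f}")
print(f"share of variation left unexplained: {unexplained:.0%}")
```

Under that assumption, the correlation is a bit weaker than -0.5, and about four fifths of the variation is due to something other than distance.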

