Linguistics beach reads

Since I started grad school, I have made a practice of reading books, and pop-science linguistics books in particular. I genuinely think I’ve gotten a lot out of it over the years. Let me make a few recommendations for your summer beach reading, focusing on lighter fare.

  • The Riddle of the Labyrinth: The Quest to Crack an Ancient Code (Margalit Fox, 2013) is a breezy take on the decipherment of Linear B, with particular emphasis on crucial early work done by Brooklyn College professor Alice Kober. Kober was in heavy correspondence with the amateur Michael Ventris, who announced the decipherment just eighteen months after her untimely death at age 43. (Ventris himself died even younger, at 34, in a car accident that some think a concealed suicide.) The Linear B saga is a neverending source of interest, and Fox is good on both the drama (she used to write the obituaries in the Times) and the linguistics (she has a master’s degree from Stony Brook).
  • Chinese Characters across Asia: How the Chinese Script Came to Write Japanese, Korean, and Vietnamese (Zev Handel, 2025) talks amateurs through the history of writing in East Asia, summarizing Handel’s much more technical 2019 book on the same topic for a non-linguistic audience.
  • Patterns In The Mind: Language And Human Nature (Ray Jackendoff, 1994) is my favorite of the Language Instinct-alikes. It is focused more or less on selling the idea of UG to normies, and on those terms, it succeeds mightily.
  • Because Internet: Understanding the New Rules of Language (Gretchen McCulloch, 2019) does a good job summarizing disparate threads in the sociolinguistics of computer-mediated language with just enough humor to lighten the mood.
  • Language and Problems of Knowledge: The Managua Lectures (Noam Chomsky, 1987) is the text of five lectures given to a lay audience in Nicaragua, illustrating the core ideas of the generative program. Most of the examples are based on comparing the syntax of English and Spanish, and the book is easily the most accessible thing Chomsky has written (and far more relevant to current thinking than, say, the equally-accessible Syntactic Structures). 

I of course welcome other suggestions in the comments section. 

Actually, chess has a tech tree

A few years ago, Elon posted (it’s a real tweet; just screenshotting for posterity) that chess doesn’t have a “tech tree”:

see full tweet at https://x.com/elonmusk/status/1841521084559945980?lang=en

I disagree. First, there’s promotion of pawns. Then, there is castling, which moves the king from a center file to relative safety on the side, at the same time moving a rook into the center and a more active role. And of course we speak of developing one’s knights (who could be likened to indirect fire), bishops (who provide enfilade fire), and rooks by moving them from their starting positions toward the middle of the board, into positions where they can more actively attack and defend. There are even various systems for scoring board position based on piece development. If this isn’t a tech tree, I don’t know what is.

More imperative-dominant defectivity in English

Aidan Malanoski adds the following to our list of imperative-dominant defective verbs in English: come V, go V. These seem (to Aidan and me) to show a similar distribution to the ones discussed in the previous post: imperatives are ok (Come say hi! Go give your mom a kiss!), as are infinitives (She shouted for him to come greet me.), and everything else is degraded.

Tipping for counter service

In the States, which already had a large and diverse (some would say annoying) set of interactions which de facto required substantial tips (e.g., 15 to 20% of the purchase price), tips are now also requested (though not yet required) for food and/or drinks purchased at the counter. Hip cafes have long had tip jars (with no particular expectation that one uses them), and one is expected to tip the bartender at least a dollar for an alcoholic drink purchased, but the new thing is that anywhere with an iPad-based payment system also requests a percentage tip.

A lot of people are dismayed by this. I for one select “no tip” or “0%” or whatever the majority of the time. But the new “turning around the iPad” normal doesn’t bother me either; at worst it’s a small, progressive tax transferring money from easily-influenced (rule-following? introverted?) consumers making luxury food purchases to counter-service employees who probably need the money more.

Vibe coding in 2025

I have seen a decent amount of software developed via vibe coding, i.e., coding with heavy AI assistance, but I have not seen anything that is passable as professionally-developed software. One big tell is that the AI assistant tends to add in copious checks for boundary conditions that’d never occur to a human developer, because those conditions are unlikely to arise in practice. Another is bizarre style, which imposes a heavy cognitive cost on the reader (whose time is qualitatively more valuable than that of the computer).

For people who basically can’t code anywhere near a professional level, I see why this is valuable, but I think these people should be honest with themselves and admit they can’t really code, and wouldn’t know good code if it hit them in the face.

For people who can (and who can already avail themselves of various autocompletion tools, whether AI-powered or knowledge-based), I don’t see the value proposition. It creates hard-to-review code, and code review is a more difficult, important, cognitively taxing, and rarefied skill than development itself, so this technology could, in the worst case, actually drive up the already high cost of software development.

Neural fossils

Neural network cognitive modeling had a brief, precocious golden era between 1986 (the year the Parallel Distributed Processing books came out) and maybe about 1997 (at which point the limitations of those models were widely known…though I’m a little fuzzier about when this realization settled in). During that period, I think it’s fair to say, a lot of people got hired into the faculty, in psychology and linguistics in particular, simply because they knew a bit about this exciting new approach. Some of those people went on to do other interesting things once the shine had worn off, but a lot of them didn’t, and some of them are even still around, haunting the halls of R1s. I think something similar will happen to the new crop of LLMologists in the academy: some have the skills to pivot should we reach peak LLM (if we haven’t already), but many don’t.

Hiring season

It’s hiring season and your dean has approved your linguistics department for a new tenure line. Naturally, you’re looking to hire an exciting young “hyphenate” type who can, among other things, strengthen your computational linguistics offerings, help students transition into industry roles, and perhaps even incorporate generative AI into more mundane parts of your curriculum (sigh). There are two problems I see with this. First, most people applying for these positions don’t actually have relevant industry experience, so while they can certainly teach your students to code, they don’t know much about industry practices. Second, an awful lot of them would probably prefer to be full-time software engineers, all things considered, and are going to take leave, if not quit outright, if the opportunity ever becomes available. (“Many such cases.”) The only way to avoid this scenario, as I see it, is to find people who have already been software engineers and no longer want to be, and fortunately, there are several of us.

News from the east

I am a total sucker for cute content from East Asia. I loved to watch Pangzai do his little drinking tricks. I love to hear what the “netizens” are up to. I love the greasy little hippo. I love the horse archer raves. I even love the chow chows painted as pandas. It’s delightful. Is this propaganda? Maybe; certainly it’s embedded in a larger matrix of Western-oriented soft-power diplomacy. (That’s why we have so many Thai restaurants.) But I suppose I’m blessed to live in a time where you can get so much cute news from halfway across the world.

Learned tokenization

Conventional (i.e., non-neural, pre-BERT) NLP stacks tend to use rule-based systems for tokenizing sentences into words. One good example is spaCy, which provides rule-based tokenizers for the languages it supports. I am sort of baffled this is considered a good idea for languages other than English, since it seems to me that most languages need machine learning for even this task to properly handle phenomena like clitics. If you like the spaCy interface (I admit it’s very convenient) and work in Python, you may want to try the spacy-udpipe library, which exposes the UDPipe 1.5 models for Universal Dependencies 2.5; these in turn use learned tokenizers (and taggers, morphological analyzers, and dependency parsers, if you care) trained on high-quality Universal Dependencies data.
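
To make the clitic problem concrete, here is a toy sketch in plain Python. It is not how spaCy or UDPipe actually work: the function names and the (incomplete) list of Spanish clitics are my own assumptions for illustration. A whitespace-and-punctuation rule leaves verlo ("to see it") as one token, and a naive suffix rule that fixes this also wrongly splits pelo ("hair"), which is roughly why a tokenizer that has learned the lexicon from data is appealing.

```python
import re

# Hypothetical, deliberately incomplete list of Spanish enclitic pronouns.
CLITICS = ("lo", "la", "me", "te", "se")


def rule_based_tokenize(text):
    """Split on whitespace and punctuation only, as a simple rule might."""
    return re.findall(r"\w+|[^\w\s]", text)


def naive_clitic_split(token):
    """Peel one trailing clitic off a token, knowing nothing about the lexicon."""
    for clitic in CLITICS:
        if token.endswith(clitic) and len(token) > len(clitic):
            return [token[: -len(clitic)], clitic]
    return [token]


tokens = rule_based_tokenize("Quiero verlo.")
print(tokens)  # ['Quiero', 'verlo', '.']

# The suffix rule gets the verb right...
print(naive_clitic_split("verlo"))  # ['ver', 'lo']
# ...but over-splits an ordinary noun that happens to end in "lo".
print(naive_clitic_split("pelo"))  # ['pe', 'lo']
```

One could keep patching the rule with exception lists, but at that point one is hand-building a lexicon, which is the part a learned tokenizer picks up from annotated data.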

Pied piping and style

I find pied-piping in English a bit stilted, even if it is sometimes the prescribed option. Consider the following contrast:

(1) I’m not someone to fuck with.
(2) I’m not someone with whom to fuck.

In (1) the preposition with is stranded; in (2) it raises along with the wh-element. What are your impressions of a speaker who says (2)? For me, they sound a bit like a nerd, or perhaps a cartoonish villain. I thought about this the other day because I was watching Alien Resurrection (1997), which is okay but not one of my favorite entries in the Weyland-Yutani cinematic universe, and one of the first bits of characterization we get for mercenary “Ron Johner”, played by badass Ron Perlman, is the following bit of dialogue (here taken directly from Joss Whedon’s screenplay):

This would work if Johner was a sort of evil genius, or if it was some kind of callback to something earlier, but I think this is probably just unanalyzed language pedantry ruining the vibe a little.