A thought about academic jobs

I try not to pontificate about the academic job market. I recognize that I am incredibly fortunate to have the job I have. I recognize that it is hard to get such a job, that in some sense it comes down to luck, that there are more PhDs than faculty jobs, and finally that my job is not my friend. That said…

A colleague of mine had a PhD advisee who was offered a more or less ideal tenure-track job, at an excellent state school specializing in the advisee’s subarea, in a very pleasant town. The student, believe it or not, turned it down, and is now starting more or less from scratch on the alt-ac path. I genuinely don’t understand this. Earning a PhD in your field is the one always-necessary condition for getting a faculty job, even if the skills transfer to other pursuits. What a graduate program expects from you is, to a great degree, what is necessary to get a faculty job. There are of course extra steps (that qualifying paper has to be sent off to a journal, and so on), but in terms of effort they are nothing compared to the work needed to get your degree. If you are doing well in your PhD program and if you are enjoying your studies, why not, for as long as you are able, consider applying for faculty positions? If you are not meeting your program’s expectations, your pessimism about the academic job market is beside the point, and if you are meeting or exceeding those expectations, you really might want to consider it.

High school as signaling behavior

When you meet an adult for the first time in Cincinnati, where I grew up, it is customary to ask them where they went to high school. Even though I have had basically nothing to do with Cincinnati since I reached the age of majority, I can learn so much about someone by learning they went to St. Ursula, or Walnut Hills, or Elder, or Summit Country Day, or Wyoming. (This is helped along by the fact that Cincinnati is, for historical reasons, rather Catholic.) It’s one of the first things I ask born-and-raised New Yorkers too, and it tends to yield a lot of information. I know half a dozen graduates of Bronx Science (including the president of my college); I believe David Pesetsky is one of several well-known linguists who attended Horace Mann; Hunter High is also a very promising sign, as is Stuyvesant. I even know about some of the elite high schools of Illinois at this point.

While virtually all the focus on “elite institutions” is directed at undergraduate colleges, I think this is something of a misdirection. While this may seem self-serving, I think high school choice might be a stronger signal than college choice, at least in parts of the country where it is common for one (with the help and possibly financial support of one’s parents, of course) to more or less pick a high school, with many magnet and private options.

My personal experience bears this out. I went to a very good suburban public school system (Lakota) until I was 14, and the strongest students at 14 who continued on to high school in that system are not living particularly impressive lives. In contrast, my class at my very good Catholic high school (St. Xavier) includes, among other impressive individuals, two centimillionaires (though one of those two is a phony and a scoundrel). I for one did not gain much personal ambition from St. Xavier, but I did acquire a love of learning (as someone once described it to me, “a pseudo-erotic attachment to knowledge”). Also, without any particular intentionality, I attended a good (but not selective) “R1” public college, and I feel like high school left me particularly well-positioned to take advantage of it. I didn’t even seriously consider elite colleges; I grew up in a solidly middle class family where there was no particular knowledge of elite institutions, to the point that I didn’t even find out what the Ivy League was until after I’d been accepted to Penn for my PhD. Had I been drawn from a slightly higher class stratum, I might have applied to Ivies, or at least one of those pricey private liberal arts schools on the East Coast like Vassar, and had I done so, I would have taken on an onerous load of personal debt in the process. And for what? It wouldn’t have made me any better a scholar.

Stop capitalizing so much

One of the absolute scourges of student writing is the tendency to capitalize just about every multi-word noun phrase. The rule in English is pretty simple: you only capitalize proper names, and these are, roughly, the names of people, locations, or organizations. Technical concepts do not qualify. It doesn’t matter if the phrase is abbreviated by an acronym: we capitalize the acronym but not necessarily the full phrase. Natural language processing is not a proper name; cognitive science isn’t either; logistic regression certainly is not a proper name, nor is conditional random fields or hidden Markov model or support vector machine or…

Rich people shouldn’t drive

I don’t understand why the filthy rich ever drive. Sure, I get why Ferdinand Habsburg gets into the ~~Eva~~ cockpit: an F1 race is the modern-day tournament. But driving is a dangerous, high-liability, cognitively taxing activity and it’s easy for the rich to offload those hazards to a specialist. I don’t understand why, for example:

Warren Buffett (alleged net worth $138B) supposedly drives his own older Cadillac to the office (though maybe not anymore, given that he’s now 93).
“Little” Sam Altman (alleged net worth $1B) drives a low-to-the-ground sports car through stop-and-go traffic in downtown Palo Alto (it’s giving Dukakis).
“Bumpin’ dat” Justin Timberlake (alleged net worth $250M) got busted for a DUI on Long Island.
Alec Baldwin (alleged net worth $70M) settled out of court with a guy he allegedly punched in the jaw over a parking spot.

In the unlikely event that I hit centimillion status, the first thing I’m doing is buying a black, under-the-radar town car and hiring a chauffeur with good personal recommendations. And before that, when I enter decamillion territory, I’m just calling UberXen. No alternate-side parking, no DUIs for me. I don’t know about Justin, but surely Warren and Sam have something better to do than be behind the wheel. They could be power napping, meditating, watching the market, or catching up on X (“the everything app”) in the back of their car instead.

Linguistic relativity and i-language

Elif Batuman’s autofiction novel The Idiot follows Selin, a Harvard freshman in the mid 1990s. Selin initially declares her major in linguistics and describes two classes in more detail. One is with a soft-spoken professor who is said to be passionate about Turkic phonetics (no clue who this might be: anybody?) and the other is with a professor described as an Italian semanticist who wears beautiful suits (maybe this is Gennaro Chierchia; not sure). Selin is taken aback by the stridency with which her professor (presumably the Turkic phonetician) rails against the Sapir-Whorf hypothesis (she regrets how the professor repeatedly mentions Whorf’s day job as a fire prevention specialist) and finds linguistic relativity so intuitive she changes her major at the end of the book.

Batuman is not the only person to draw a connection between rejection of the stronger forms of the Sapir-Whorf hypothesis and generativism. Here’s the thing though: there is no real connection between these two ideas! Generativism has no particular stance on any of this. The only connection I see between these two ideas is that, when you adopt the i-language view, you simply have more interesting things to study. If you truly understand, say, poverty of the stimulus arguments, you just won’t feel the need to entertain intuitive-popular views of language because you’ll recognize that the human condition vis-à-vis language is much richer and much stranger than Whorf ever imagined.

Medical bills

About two years ago, I got an unexpected medical bill in the mail. The amount wasn’t very high, but I was quite frustrated and annoyed. First, this was from a local College of Dentistry, where most procedures are free for the insured (and probably for the uninsured too); there was no “explanation of benefits” that explained this was a co-pay, or that my insurance only covered some portion. Second, I hadn’t been to the College of Dentistry in quite a while, so I had no idea which of the various procedures this was or even what day I received the billed service. Third, there was no way to get more information: the absolute worst thing about this provider is that the administrative staff are some of the most overloaded and overworked people I have ever seen, and I have witnessed them just let the phone ring because they’re dealing with a huge line of in-person patients (some of whom are bleeding from their mouth). So I didn’t pay it. The bills kept coming, though, and after a while I started to worry. Was I wasting paper for no reason? Would this harm my credit score? So I put about an hour into finding a way to actually get in touch with the billing office: turns out this was a Google Form buried somewhere on a website, and if you fill it out, someone calls you (in my case, within the hour!), looks up your chart, and can tell you the date of service and why you were billed. Why didn’t they just include this in the bill in the first place? I have to imagine this makes it even harder for the College to actually collect on these debts.

“Indic” considered harmful

Indic is an adjective referring to the Indo-Aryan languages such as Hindi-Urdu or Bengali. These languages are spoken mostly in the northern parts of India, as well as in Bangladesh, Pakistan, Sri Lanka, Nepal, and the Maldives. This term can be confusing, because hundreds of millions of people in the Indian subcontinent (and nearby island nations) speak non-Indic first languages: over 250 million people, particularly in the south of India and the north of Sri Lanka, speak Dravidian languages, which include Malayalam, Tamil, and Telugu. Austronesian, Tibeto-Burman, and Tai-Kadai languages, and many language isolates, are also spoken in India and the other nations of the subcontinent, as is English (and French, and Portuguese). Unfortunately, there is now a trend to use Indic to mean ‘languages of the subcontinent’. See here for a prominent example. This is a new sense for Indic, and while there is probably a need for a lexeme to express this notion (language of India or subcontinental language would work), reusing Indic, which already has a distinct and well-established sense, just adds unnecessary confusion.

A minor syntactic innovation in English: “BE crazy”

I recently became aware of an English syntactic construction I hadn’t noticed before. It involves the predicate BE crazy, which itself is nothing new, but here the subject of that predicate is, essentially, quoted speech from a second party. I myself am apparently a user of this variant. For example, a friend told me of someone who describes themselves (on an online dating platform) as someone who “…likes travel and darts”, and I responded, simply, “Likes darts is crazy.” That is to say, I am making some kind of assertion that the description “likes darts”, or perhaps the speech act of describing oneself as such, is itself a bit odd. Now in this case, the subject is simply the quotation (with the “travel and” part elided), and while this forms a constituent, a tensed VP, we don’t normally accept tensed VPs as the subjects of predicates. And I suspect constituenthood is not even required. So this is distinct from the ordinary use of BE crazy with a nominal subject.

I suspect, though I do not have the means to prove it, that this is a relatively recent innovation; I hear it from my peers (i.e., those of similar age, not my colleagues at work, who may be older) and students, but not often elsewhere. I also initially thought it might be associated with the Mid-Atlantic but I am no longer so sure.

Your thoughts are welcome.

“Segmented languages”

In a recent paper (Gorman & Sproat 2023), we complain about conflation of writing systems with the languages they are used to write, highlighting the nonsense underlying common expressions like “right-to-left language”, “syllabic language”, or “ideographic language” found in the literature. Thus we were surprised to find the following:

Four segmented languages (Mandarin, Japanese, Korean and Thai) report character error rate (CER), instead of WER… (Gemini Team 2023:18)

Since the most salient feature of the writing systems used to write Mandarin, Japanese, Korean, and Thai is the absence of segmentation information (e.g., whitespace used to indicate word boundaries), presumably the authors mean to say that the data they are using has already been pre-segmented (by some unspecified means). But this is not a property of these languages, but rather of the available data.
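To make the distinction concrete, here is a minimal sketch of the two metrics. The function names and the toy Mandarin sentence are my own illustration, not anything taken from the Gemini report, and real evaluations typically normalize text before scoring. The point is just that word error rate presupposes word boundaries, whereas character error rate does not.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            # Deletion, insertion, or substitution/match.
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate; "words" are whatever whitespace delimits."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def cer(ref, hyp):
    """Character error rate; needs no segmentation at all."""
    return edit_distance(ref, hyp) / len(ref)


# Toy example (hypothetical): "I like cats" vs. "I like dogs".
raw_ref, raw_hyp = "我喜欢猫", "我喜欢狗"
# The same pair after some (unspecified) word segmenter has been applied.
seg_ref, seg_hyp = "我 喜欢 猫", "我 喜欢 狗"

print(wer(raw_ref, raw_hyp))  # whole sentence counts as one wrong "word"
print(cer(raw_ref, raw_hyp))  # one of four characters is wrong
print(wer(seg_ref, seg_hyp))  # one of three segmented words is wrong
```

On the unsegmented text, WER treats the whole sentence as a single token, so it is only informative once a segmenter has been applied to the data, which is a fact about the data and not about Mandarin.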

[h/t: Richard Sproat]

References

Gemini Team. 2023. Gemini: A family of highly capable multimodal models. arXiv preprint 2312.11805. URL: https://arxiv.org/abs/2312.11805.

Gorman, K. and Sproat, R. 2023. Myths about writing systems in speech & language technology. In Proceedings of the Workshop on Computation and Written Language, pages 1-5.

Growing consensus

Any time I read a paper that begins, roughly, “there is a growing consensus that P”, there is not in fact, as far as I can tell, a growing consensus in support of P.