X moment

A Reddit moment is an expression used to refer to a certain type of cringe ‘cringeworthy behavior or content’ judged characteristic of Redditors, habitual users of the forum website reddit.com. It is hard to pin down exactly what makes cringe Redditor-like, but discussion on Urban Dictionary suggests that one salient feature is a belief in one’s own superiority, or in the superiority of Redditors in general; a related feature is irl behavior that takes Reddit too seriously. The normal usage is as an interjection of sorts: presented with cringeworthy internet content (a screenshot or URL), one might simply respond “Reddit moment”.

However, Reddit is not the only community that can be the target of this sort of pejorative X moment. One can find many instances of crackhead moment, describing unpredictable or spazzy behavior. A more complicated example comes from a friend, who shared a link to a story about a software developer who deliberately sabotaged a widely used JavaScript library to protest the Russian invasion of Ukraine. The JavaScript ecosystem, and the Node.js community in particular, has proven extremely vulnerable both to deliberate sabotage and to accidental bricking ‘irreversible destruction of technology’, and naturally my friend sent the link with the commentary “js moment”. The one thing that seems to unite all X moment snowclones is a negative evaluation of the relevant community, one already present in the common ground.

Evaluations from the past

In a literature review, speech and language processing specialists often feel tempted to report evaluation metrics like accuracy, F-score, or word error rate for the prior systems they describe. In my opinion, this is only informative if the prior and present work are evaluated on exactly the same data set(s). (Such results should probably be presented in a table alongside results from the present work, not in the body of the literature review.) If instead the prior systems were tested on some proprietary data set, an obsolete corpus, or a data set the authors of the present work have declined to evaluate on, this information is inactionable. Authors should omit it, and reviewers and editors should insist that it be omitted.

It is also clear to me that these numbers are rarely meaningful as measures of how difficult a task is “generally”. To take an example from an unnamed 2019 NAACL paper (one guilty of the sin described above), the word error rates reported for a single task in a single language range from 9.1% to 23.61% (note also the inconsistent precision). What could we possibly conclude from such an enormous spread of results across different data sets?
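For readers who have not encountered it, word error rate is simply the Levenshtein (edit) distance between the hypothesis and reference transcripts, computed over word tokens and normalized by the length of the reference. The following is a minimal sketch, not any particular toolkit’s implementation; the function name and example strings are mine, purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance over word tokens, normalized by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table: d[i][j] is the edit distance between
    # the first i reference words and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / len(ref)


# One substitution against a six-word reference: WER = 1/6.
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Note that because the denominator is the length of the reference, the number depends entirely on the particular transcripts being scored, which is exactly why a single figure tells us little about the difficulty of the task itself.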