On the past tense debate; Part 3: the overestimation of overirregularization

One final, and still unresolved, issue in the past tense debate is the role of so-called overirregularization errors.

It is well known that children acquiring English tend to overregularize irregular verbs; that is, they apply the regular -d suffix to verbs which form irregular pasts in adult English, producing, e.g., *thinked for thought. Maratsos (2000) estimates that this happens very frequently; for instance, Abe, recorded roughly 45 minutes a week from ages 2;5 to 5;2, overregularizes rare irregular verbs as much as 58% of the time, and even the most frequent irregular verbs are overregularized 18% of the time. Abe appears to have been exceptional in that he had a very large receptive vocabulary for his age (as measured by the Peabody Picture Vocabulary Test), giving him more opportunities (and perhaps more grammatical motivation) for overregularization,¹ but Maratsos estimates that less-precocious children have somewhat lower but broadly similar rates of overregularization.

In contrast, it is generally agreed that overirregularization, or the application of irregular patterns (in English, ablaut, shortening, etc.), is quite a bit rarer. The only serious attempt to count overirregularizations is by Xu & Pinker (1995; henceforth XP). They estimate that children produce such errors no more than 0.2% of the time, which would make overirregularizations roughly two orders of magnitude rarer than overregularizations. This is a substantial difference. If anything, I think that XP overestimate overirregularizations. For instance, XP count brang as an overirregularization, even though this form does exist quite robustly in adult English (though it is somewhat stigmatized). Furthermore, XP count *slep for slept as an overirregularization, though this is probably just ordinary (td)-deletion, a variable rule that is attested already in early childhood (Payne 1980). But by any account, overirregularization is extremely rare. The same is found in nonce word elicitation experiments such as those conducted by Berko (1958): both children and adults are loath to generate irregular past tenses for nonce verbs.²
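The "two orders of magnitude" comparison can be checked with some quick arithmetic. The sketch below simply takes the figures quoted above (Abe's 18% and 58% overregularization rates from Maratsos, and XP's 0.2% upper bound) and computes the base-10 log of their ratio. The variable and function names are my own, and the comparison is rough at best: Abe's rates come from a single precocious child, whereas XP's figure is an upper bound across children.

```python
import math

# Rates quoted in the text, expressed as proportions.
overreg_frequent = 0.18  # Abe, most frequent irregular verbs (Maratsos 2000)
overreg_rare = 0.58      # Abe, rare irregular verbs (Maratsos 2000)
overirreg_upper = 0.002  # upper bound on overirregularization (Xu & Pinker 1995)


def orders_of_magnitude(a: float, b: float) -> float:
    """Return how many powers of ten separate rate a from rate b."""
    return math.log10(a / b)


# Hypothetical helper names; the inputs are the only figures taken from the text.
low = orders_of_magnitude(overreg_frequent, overirreg_upper)
high = orders_of_magnitude(overreg_rare, overirreg_upper)

print(f"ratio: {overreg_frequent / overirreg_upper:.0f}x to "
      f"{overreg_rare / overirreg_upper:.0f}x")
print(f"orders of magnitude: {low:.2f} to {high:.2f}")
```

On these figures, overregularization comes out roughly 90 to 290 times more frequent than overirregularization, i.e., about two orders of magnitude, and more if XP's count is itself an overestimate.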

This is a problem for most existing computational models. Nearly all of them (Albright & Hayes’ (2003) rule-based model, for which see their §4.5.3; O’Donnell’s (2015) rules-plus-storage system; and all analogical models and neural networks I am aware of) not only overregularize, as children do, but also overirregularize at rates far exceeding those of children. I submit that any computational model which produces substantial overirregularization is simply on the wrong track.

Endnotes

  1. It is amusing to note that Abe is now, apparently, a trial lawyer and partner at a white-shoe law firm.
  2. As I mentioned in a previous post, this is somewhat obscured by ratings tasks, but that’s further evidence we should disregard such tasks.

References

Albright, A. and Hayes, B. 2003. Rules vs. analogy in English past tenses: a computational/experimental study. Cognition 90(2): 119-161.
Berko, J. 1958. The child’s learning of English morphology. Word 14: 150-177.
Maratsos, M. 2000. More overregularizations after all: new data and discussion on Marcus, Pinker, Ullman, Hollander, Rosen & Xu. Journal of Child Language 27: 183-212.
O’Donnell, T. 2015. Productivity and Reuse in Language: a Theory of Linguistic Computation and Storage. MIT Press.
Payne, A. 1980. Factors controlling the acquisition of the Philadelphia dialect by out-of-state children. In W. Labov (ed.), Locating Language in Time and Space, pages 143-178. Academic Press.
Xu, F. and Pinker, S. 1995. Weird past tense forms. Journal of Child Language 22(3): 531-556.
