The Wordlikeness Project

We (Karthik Durvasula, Jimin Kahng, and I) recently got the good news that our NSF collaborative research proposal has been funded. This work springs ultimately from my dissertation. There I argue, using a mix of logical argumentation and “archival” wordlikeness data mostly taken from appendices of previously published work, that the view of phonotactic grammar as statistical patterns or constraints projected from the lexicon is not strongly supported by the available data. My conclusions are perhaps weakened by the low overall quality of this archival data, which was collected using a variety of stimulus presentation modalities (i.e., auditory vs. orthographic) and response modalities (Likert scale vs. binary forced-choice vs. transcription). In the NSF study, we will be collecting wordlikeness data in English and Korean, manipulating these stimulus presentation and response modalities, and this data will be made publicly available under the name of the Wordlikeness Project. (Here we draw inspiration from the English Lexicon Project and spinoffs.) We will also be using this data for extensive computational modeling, to answer some of the questions raised in my dissertation and in Karthik and Jimin’s subsequent work.