Term paper specifications

You are responsible for a term paper that counts something of linguistic interest using a non-trivial number of Python features.

If you are not sure whether your project satisfies the above specifications, email a brief description to Kyle before proceeding.

A brief list of ideas:

  1. Count the words most associated with each of the 12 zodiac signs in a corpus of horoscopes
  2. Count the number of words ending in various derivational suffixes in a digital dictionary
  3. Count the number of words ending in syllabic sonorants in a pronunciation dictionary
  4. Count the frequencies of the different pronunciations of the word live using a tagger (n.b.: this works because one pronunciation is used when it's an adjective, and the other when it's a verb)
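
As a rough illustration of what idea 2 might look like, here is a minimal sketch. The suffix inventory, the function name count_suffixes, and the toy word list are all hypothetical and chosen only for illustration; a real project would read its words from an actual digital dictionary.

```python
from collections import Counter

# Hypothetical suffix inventory, chosen only for illustration.
SUFFIXES = ("ation", "ness", "ment", "ity")


def count_suffixes(words, suffixes=SUFFIXES):
    """Counts how many words end in each suffix, using string matching only."""
    counts = Counter()
    for word in words:
        word = word.casefold()
        for suffix in suffixes:
            # Requires at least one character of stem before the suffix.
            if len(word) > len(suffix) and word.endswith(suffix):
                counts[suffix] += 1
                break  # Counts each word under at most one suffix.
    return counts


# Toy word list standing in for a digital dictionary.
words = ["happiness", "government", "sanity", "nation", "kindness", "cat"]
suffix_counts = count_suffixes(words)
print(suffix_counts)
```

Note that pure string matching counts "nation" under "-ation" even though it is not derived with that suffix; a write-up would need to say how such cases were handled.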

What to submit

Your submission should include:

  1. Any interesting samples of code (though I won't be reviewing code quality in my grading)
  2. Data used (or instructions or code to obtain it, if it's more than 10 MB or so)
  3. A write-up of 3-4 pages describing:
    1. the data you used
    2. what you counted
    3. what the counts were (please make a table, don't just dump Python output here)
    4. why this might be an interesting thing to count
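
For the table of counts, one possible approach is to format a Counter as aligned plain text before pasting it into the write-up. This is only a sketch: the function name format_table and the counts shown are made up for illustration.

```python
from collections import Counter


def format_table(counts, header=("item", "count")):
    """Renders counts as an aligned plain-text table, most frequent first."""
    rows = [(str(key), str(value)) for key, value in counts.most_common()]
    # Column widths are set by the longest cell, including the header.
    key_width = max([len(header[0])] + [len(key) for key, _ in rows])
    value_width = max([len(header[1])] + [len(value) for _, value in rows])
    lines = [f"{header[0]:<{key_width}}  {header[1]:>{value_width}}"]
    lines.extend(
        f"{key:<{key_width}}  {value:>{value_width}}" for key, value in rows
    )
    return "\n".join(lines)


# Made-up counts, for illustration only.
toy_counts = Counter({"-ness": 2, "-ment": 1})
table = format_table(toy_counts)
print(table)
```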

Rubric

The term paper will be graded on the degree to which the submission satisfies the above specifications.

I will grade submissions up to the point where I am required to submit grades to the registrar's office; this is usually a week or so after the end of the semester. If I have not received a term paper by then, you will receive an "I" (incomplete) grade until you submit the term paper.

Hints

  1. While it's technically possible to work with audio data for this project, it's a lot harder than working with discrete data (e.g., text) unless you've also studied acoustic phonetics and/or signal processing.
  2. It's okay (good, even) if this harmonizes with some other projects you're doing for credit (e.g., qualifying papers), so long as you make it clear in your write-up what part of the project is unique to the term paper.