Term paper specifications

You are responsible for a term paper that builds and evaluates some language technology discussed in class.

If you are not sure whether your project satisfies the above specifications, email a brief description to Kyle before proceeding.

A brief list of ideas:

  1. Write a finite-state grapheme-to-phoneme conversion grammar using Pynini, then evaluate it against pronunciation dictionaries from WikiPron.
  2. Decode ambiguous text (e.g., text written in "chatspeak" or containing ambiguous abbreviations) with a language model and a finite-state covering grammar, built using NGram and Pynini.
  3. Train and evaluate a tagger (for part-of-speech, NP chunks, or named entities) using one of several existing tagger implementations.
  4. Train and evaluate a text classifier using scikit-learn.
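
Each of the ideas above ends with an evaluation step. As a rough illustration for the first idea, the sketch below scores predicted pronunciations against a gold dictionary using word error rate and phone error rate. It assumes the gold data is tab-separated, with one word and one space-separated pronunciation per line (the layout WikiPron uses). The sample entries and the toy_g2p function are hypothetical stand-ins; in a real project toy_g2p would be replaced by a call into your own grammar.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of phone symbols."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def read_pron_tsv(lines: Iterable[str]) -> Dict[str, List[List[str]]]:
    """Maps each word to all of its gold pronunciations (lists of phones)."""
    gold: Dict[str, List[List[str]]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        word, pron = line.split("\t")[:2]
        gold.setdefault(word, []).append(pron.split())
    return gold


def evaluate(
    gold: Dict[str, List[List[str]]],
    predict: Callable[[str], List[str]],
) -> Tuple[float, float]:
    """Returns (word error rate, phone error rate).

    When a word has several gold pronunciations, the prediction is
    scored against the closest one.
    """
    word_errors = 0
    phone_errors = 0
    phone_total = 0
    for word, prons in gold.items():
        hyp = predict(word)
        dist, ref_len = min((edit_distance(ref, hyp), len(ref)) for ref in prons)
        word_errors += int(dist > 0)
        phone_errors += dist
        phone_total += ref_len
    return word_errors / len(gold), phone_errors / phone_total


# Hypothetical gold entries, invented for illustration only.
SAMPLE = ["cat\tk æ t", "dog\td ɔ ɡ", "fish\tf ɪ ʃ"]

# Hypothetical system output standing in for a real G2P grammar.
TOY_OUTPUT = {"cat": "k æ t", "dog": "d ɑ ɡ", "fish": "f ɪ s h"}


def toy_g2p(word: str) -> List[str]:
    return TOY_OUTPUT[word].split()


gold = read_pron_tsv(SAMPLE)
wer, per = evaluate(gold, toy_g2p)
print(f"WER: {wer:.3f}  PER: {per:.3f}")
```

Reporting both numbers is a design choice rather than a requirement: word error rate counts a pronunciation as wrong if any phone differs, while phone error rate gives partial credit for near misses.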

What to submit

Your submission should include:

  1. Any interesting samples of code (though I won't be reviewing code quality in my grading)

  2. Data used (or instructions or code to obtain it, if it's more than 10 MB or so)

  3. A write-up of several pages describing:

    1. what you did
    2. why it might be a useful thing to automate
    3. the data you used
    4. the software you used and/or developed
    5. the results of your evaluation

Rubric

The term paper will be graded on the degree to which the submission satisfies the above specifications.

I will grade the submission up to the point where I am required to submit grades to the registrar's office; this is usually a week or so after the end of the semester. If I have not received a term paper by then, you will receive an "I" (incomplete) grade until you submit the term paper.

Hints

  1. While it's possible to work with audio data for this project, it's a lot harder than working with discrete data (e.g., text) unless you've also studied acoustic phonetics and/or signal processing.
  2. It's okay (good, even) if this harmonizes with some other projects you're doing for credit (e.g., qualifying papers), so long as you make it clear in your write-up what part of the project is unique to the term paper.

Proposal

To propose a topic, send a brief description of the project to both Kyle and Spencer (in the same email).

Submission

Submit the term paper via email, once again sending it to both Kyle and Spencer (in the same email).