This is a course on the foundations of computational linguistics.

Goals

Most courses taught on computational linguistics or natural language processing tend to be fairly advanced survey courses, assuming a large amount of knowledge of computer science and covering a variety of somewhat unrelated modeling techniques, problem domains, and tasks. This course takes a different approach, seeking to build up a small set of conceptual tools and empirical methods from the ground up.

Some specific goals of the course include:

Content

The modern study of language originated in the 1950s when tools from the recently-emerged theory of computation started to be applied to the problem of language. One important early result in this area was the Chomsky Hierarchy which gave a containment hierarchy of phrase-structure grammars or grammars built out of rewrite rules like the following:

\[\texttt{S} \longrightarrow \texttt{NP}\ \texttt{VP}\]

Chomsky showed that restricting the form of these rules into four classes (left-linear, context-free, context-sensitive, and unrestricted) gave rise to classes of formal languages which were strictly nested.

\[\mathcal{L}_{\texttt{LL}} \subset \mathcal{L}_{\texttt{CF}} \subset \mathcal{L}_{\texttt{CS}} \subset \mathcal{L}_{\texttt{UR}}\]

The Chomsky hierarchy remains a foundational result in theoretical computer science, and forms a basic entry point for studying formal grammars in linguistics. The conceptual tools underlying the hierarchy are useful for understanding even modern systems such as recurrent or recursive neural networks and transformers.

In COMPLING 445/645 we will study the Chomsky Hierarchy up to the left-linear languages. We will cover refinements of this hierarchy and how to work with its extensions using probability. Along the way, we will introduce the foundations of probability theory and Bayesian inference.

What this course is not

What you need: curiosity, ambition, fearlessness, and ability to use Google to find answers to technical problems.

What is the difference between various CL/NLP offerings at McGill?

McGill now offers a number of courses in CL/NLP that cover somewhat overlapping but mostly complementary material, with differences in assumed background knowledge and focus. While some topics will appear in several courses, the emphasis will be different enough that you could fruitfully take all courses (if you are an undergraduate student).

Programming

We will use Clojure, a LISP based on the \(\lambda\)-calculus. Clojure has several advantages.

Resources

General LISP and Clojure learning resources:


2 Models of Computation