How does language work? How is it represented in the mind? How is it processed? How is it learned? These are the foundational questions which motivates the field of linguistics. The modern study of language or generative linguistics began in the late 1940s and early 1950s when it was realized that the emerging understanding of computation that resulted from the work of Kurt Gödel, Alan Turing, Alonzo Church, Emil Post, Rózsa Péter, Stephen Kleene and others might be able to shed light on this question.

These researchers had begun to explore simple machine models of computation such as the Turing machine, Post rewrite systems, the \(\mu\)-recursive functions, and the \(\lambda\)-calculus. These models were originally intended to formalize the notion of mathematical computation or the reasoning processes used by a mathematician in proving theorems in a step-by-step fashion. Two surprising facts were soon discovered. First, Turing machines, \(\lambda\)-calculus, and other systems so-introduced proved to be equivalent in the sense that they could be used to define exactly the same set of functions. It was conjectured that this set of formalisms, the Turing-complete formalisms, in fact, defined the maximal set of function that could be effectively computed—that is, actually implemented in real physical hardware (this conjecture is known as the Church-Turing thesis). Second, it was discovered that there existed particular machines in each formalism which were universal computers. These machines could be programmed using their input to simulate the behavior of any other machine in the class. In modern terms, these machines are programmable interpreters able to perform any computation definable by a Turing machine.

These discoveries, which mostly occurred in the 1930s, launched the modern theory of computation. It wasn’t long before researchers began to realize that such models might have applications in modeling more general processes of the human mind beyond theorem proving. In particular, beginning in the early 1950s researchers such as Yoshua Bar Hillel, Charles Hockett and, most notably, Noam Chomsky, began to apply and adapt these tools to understanding how language might be produced and comprehended.

In another thread of research, Claude Shannon began studying the problem of transmission of information under conditions of uncertainty, and laid the foundations for the modern field of information theory. These tools were also quickly adapted to be used in the study of problems of the mind by researchers such as George Miller, and in especially in the case of language, Charles Hockett.

The ideas of these early researchers still form the foundations the modern fields of linguistics, artificial intelligence, and natural language processing. In this course, we will study these foundations with the goal of understanding the logic of how the tools can be applied to build models of language. Along the way, we will acquire many of the conceptual tools still needed to read the modern literature in natural language processing. However, the goal is not only to understand how such models work, but to be able to critically evaluate how computational models advance our scientific understanding of language.

But first, we need a model of computation….

The \(\lambda\)-Calculus and LISP

In this course, we will use the lambda calculus as our basic model of computation. The \(\lambda\)-calculus was introduced by Alonzo Church in the 1930s as a computable model of mathematical logic. Church based this model on mathematical functions applied to arguments, building up complex computable functions (originally representing logical expressions) from simple ones. In his famous paper introducing the Turing machine, Alan Turing showed that Church’s \(\lambda\)-Calculus and Turing machines were equivalent in the sense that they computed the same set of functions. In 1960, John McCarthy introduced LISP, one of the first practical programming languages, based on the \(\lambda\)-Calculus using list-based data structures (hence the name LISt Processor).

LISP was the first programming language in the paradigm that became known as functional programming which now includes languages such as Haskell, ML, Erlang, and some aspects of popular imperative programming languages such as Java (e.g., Guy Steele, the designer of Java, also was one of the designers of the LISP dialect known as Scheme). The behavior of functional programs is often easier to reason about than imperative programs and easier to parallelize. Thus, these languages have become important again in recent years after a period of less popularity. Functional languages also form the basis for most probabilistic programming languages—high-level programming languages that make it easy to specify probabilistic models. The Church programming language is one inspiration for this course. The \(\lambda\)-calculus is also one of the main computational tools used in formal semantics as studied in linguistics. Thus, a thorough understanding of the \(\lambda\)-calculus will be helpful in many areas beyond this course.

We will use a variant of LISP called Clojure. Althought the language is very powerful, the basic syntax of Clojure is very simple and can be learned in a few minutes. In the next few sections, we will discuss the basics of Clojure, before proceeding to use the language to begin our study of natural language using these tools.

Functions, Expressions, and Evaluation

The most important operation in the \(\lambda\)-calculus is the application of a function (also known as a procedure) to some arguments. In LISP, procedure application is expressed in Polish notation like this: (function argument1 argument2). For example, we can add 1 and 2 as follows.

(+ 1 2)

There are a few things to note about the example above. First, the bit of code (+ 1 2) is called an expression in LISP. An expression is any syntactically well-formed piece of code that can be run or evaluated by the LISP interpreter.

In fact, in LISP, eval, which runs the interpreter on an expression, is just another function, as shown below. We will see the meaning of quote shortly, for now, it just tells the interpreter to pass the actual expression (+ 1 2) to eval without first evaluating it.

(eval (quote (+ 1 2)))

The process of computing the output of an expression by running the interpreter on it is called evaluation and the result is known as the value of the expression. The ideas of expressions and evaluation are very important in functional programming languages because they are used to define the behavior of the interpreter.

Types

In LISP, every expression and every value has a type. The type of the expression tells the LISP interpreter what to do with that expression. The expression (+ 1 2) is of type list in Clojure and consists of three subexpressions: the symbol + which is used here to refer to the primitive function that performs addition (more on symbols below), and it’s arguments 1 and 2 which are constants of primitive type integer.

List expressions like (+ 1 2)—where the first element of the list is a procedure and the remaining elements are arguments to that procedure—have a special meaning in LISP. Whenever the interpreter sees such an expression, it tries to apply the function in the first position to the values of the expressions in the argument positions.

LISP is a homoiconic language. This means that LISP code itself is built out of the same datatypes that are processed by the language when code is run. In LISP you can write a program which takes lists as inputs and produces lists as outputs. But the program itself will also be expressed using lists, numbers, and other fundamental LISP datatypes of the same kind that are being processed by the code. We will see below how this makes it easy to reason about the behavior of LISP code.

To understand the behavior of the LISP interpreter, it is very important to keep track of the evaluation rule for each type in the language. The evaluation rule is just the principle that says what the interpreter does with an expression of each type that it is passed. So far, we have seen that when the interpreter is passed a list, it will usually try to apply the first element of the list to the other elements as a function.

Creating functions with \(\lambda\)

The true power of LISP comes from the ability to define new procedures that perform new computations—not just from using existing primitive functions. The operator that allows the definition of new functions in the \(\lambda\)-calculus is called \(\lambda\) and is what gives the \(\lambda\)-calculus its name. In Clojure we write fn instead of \(\lambda\). The general form of a \(\lambda\) expression in Clojure looks like this:

(fn [parameters...] body...)
(fn [x y] (+ x y))

Let’s take a look at the definition above. It has three parts. First, there is the fn (\(\lambda\)) operator. Second, there is a list of formal parameters to the procedure. These are variables that will hold the values of inputs to the procedure. In this case, this includes x and y. Third, there is the procedure body which expresses the computation performed by the procedure using the formal parameters. In this case, the body simple adds the values stored in the two variables. Note what happens when we evaluate the expression above. The value of a \(\lambda\)/fn expression is a function. This is our second evaluation rule: List expressions that begin with the fn operator are evaluated to create a new function.

How can we use a \(\lambda\)-defined procedure? Consider the following example.

((fn [x y] (+ x y)) 1 2)

This is a little hard to parse at first, but with a little effort you can see that this code snippet is similar to (+ 1 2) but with + replaced with the procedure definition (fn [x y] (+ x y)). In this example, the procedure is constructed and then immediately applied to the arguments 1 and 3 to compute the output value 3, and then “forgotten” by the language. Such functions that don’t have names are sometimes called anonymous functions. How can we make a function that doesn’t disappear like this?

Naming things with def

One mechanism we can use to make things which don’t disappear is by naming them with a variable. In Clojure we can create named variables with the def primitive.

(def x 1)

def binds the value 1 to the variable name x. Names variables, like x, are special kinds of expression with a type we have already seen, known as a symbol. The interpreter assumes that expressions of type symbol are variables and tries to look up the value that has been associated with the symbol.

(+ x x)

Now when we run the expression (+ x x) the language first looks up the value of the variable x twice. Then, it looks up the value of the symbol +, which happens to be bound to the built-in addition function. It then applies this addition function + to the two values of x (which, of course, contain the same identical value).

We can also use def to name functions created with fn.

(def my-addition (fn [x y] (+ x y)))

Now we have a function named my-addition which is defined using the fn term we introduced above. Note an important fact, for Clojure there is no difference between binding a number like 1 to a variable-name like x or a function to a variable name like my-addition. Functions are first class objects, they are simply a type of object in the system like any other. They can be passed as values to other functions, assigned to variables, stored in data structures, etc.

(my-addition 1 2)
(my-addition x x)

Because we very frequently need to define procedures, Clojure provides a shorthand for doing this. Instead of writing

(def my-addition (fn [parameters] body))

in Clojure we can also write

(defn my-addition [parameters] body)

which means the same thing.

(defn my-addition [x y] (+ x y))
(my-addition 1 2)

Internally, variable bindings live in an environment, which can be thought of as a (nested) table of symbol-value pairs.

Variable Name (Symbol) Binding
x 1
my-addition #object[Function]

Clojure uses lexical scope , which means that if you redefine a variable like x in some local context the new definition is only used in that scope, and when you leave that scope you will revert to the original definition.

Some basic datatypes

All expressions and values in LISP have a type. In this section, we discuss some of the most important types in the programming language.

Numbers

Expressions like 1 and 2 are numbers which are an atomic type in LISP. In LISP all atomic types evaluate to themselves.

(eval 1)

Symbols

Above, we discussed variables. A variable is named location in memory that stores some value for us. The kinds of things that can be the names of variables in LISP are a special datatype known as a symbol. Whenever the Clojure interpreter encounters an expression which is of type symbol, it assumes it is the name for a variable and tries to evaluate the symbol by looking up the value stored with the variable. That is, symbols evaluate to the value bound to the variable with that name.

x

If the variable has not been bound to value, you will get a special value in Clojure called nil which is Clojure’s way of referring to “nothing.”

some-symbol

What if we want to talk about the symbol itself, instead of always assuming that it is the name of a variable bound to some value? In this case, we can use the quote special form. quote takes an argument and returns a quoted version of that argument. Quoted values are treated as is by the interpreter (as if they were a constant). In other words, the interpreter doesn’t try to apply its normal rules to anything within a quote, the value of a quoted expression is the expression itself.

(def my-variable (quote some-symbol))
my-variable

In the expression above, we created a variable called my-variable and bound it to the value of the expression (quote some-symbol). We then asked Clojure to interpret the symbol my-variable. The language assumed that the symbol my-variable was a variable, and looked up its value which, thanks to the quote, was the symbol my-symbol.

Any expression, of any complexity, can be quoted, if you want the interpreter to treat it “as is”.

(def my-variable (quote (+ 1 2 3)))
my-variable

Here the variable my-variable is bound to an expression which is a list consisting of three elements: the symbol x and the constants 1, 2, and 3.

Since we frequently want to quote expressions, Clojure provides a shorthand for this. Any expression preceded by a single quote ' is considered to be quoted in the language.

(def my-variable 'some-symbol)
my-variable

Strings

In addition to numbers, procedures, and symbols, LISP also has strings. A string is written "some string".

(def my-string "some string")
my-string

Note that symbols are not strings.

Lists

In Clojure lists are written as a sequence of things appearing between parentheses (v1 v2 v3 ...). Lists are perhaps the most important and common type of expression in Clojure. In particular, function applications, def-expressions, quote-expressions, and several other special kinds of expressions are all expressed as lists in LISP code.

We can explicitly construct a list using the list procedure.

(list 1 2 3)

In Clojure, we can add a single value on to the beginning of a list using the cons procedure.

(cons 0 (list 1 2 3))

If we want to remove the first element of a list, we can use the rest primitive.

(rest (list 0 1 2 3))

Or combined.

(rest (cons 0 (list 1 2 3)))

If we want to retrieve just the first element of a list, we can use the first primitive.

(first (list 0 1 2 3))

Or, using cons, first and rest.

(cons
  (first (list 0 'x 'y 'z))
  (rest (list 'w 1 2 3)))

A function application is expressed a list with a procedure in the first position.

(list (fn [x y] (+ x y)) 1 2)

The result of evaluating the expression above is a list with a procedure in the first position and the values 1 and 2 in the second and third position. This expression can be evaluated to produce an output using the eval procedure.

(eval (list (fn [x y] (+ x y)) 1 2))

We can also quote a list to prevent the language from treating it as a function application.

'(1 2 3)

Notice what happens when we fail to quote the expression above.

(1 2 3)

There is an important special case of lists in LISP which is usually written '(). This is the empty list and is very important for reasons we will see later. We could also get the empty list by calling list with no arguments like this (list).

(= '() (list))

Note the use of the quote symbol in '(). The empty list is actually displayed as (), and in Clojure these two expressions are equivalent. But, to be clear that we don’t want the interpreter to try to use function application (applying no function to nothing), we quote the list so that interpreter simply evaluates the empty list to itself.

Booleans

Another important datatype in LISP are the boolean values: true and false. In Clojure these are written true and false. For example, we can check the equality of two objects with the = procedure which returns true if the two objects have the same value and false otherwise.

(= '(1 2) (list 1 2))
(= '(1 2) (list 1 2 3))

In LISP, programmers usually follow the convention that boolean-valued functions, also known as predicates, end in a question mark. An important predicate we will see below is the predicate empty? which checks if its argument is an empty list/collection.

(empty? '())
(empty? (list 1 2 3))

On very important kind of expression that relies on booleans in LISP is the conditional expression. Conditional is just another word for an if…then structure. In Clojure, conditionals start with if and have three parts:

(if boolean-condition consequent alternative)

The boolean condition is any boolean expression. If it is true, the expression in the consequent position is evaluated and its value is returned as the value of the whole conditional expression. If the boolean condition is false, the expression in the alternative is evaluated and its value is returned as the value of the whole conditional. Note that unlike in many imperative programming languages, conditionals have values in functional languages just like any other expression.

(if (= 1 2) "hi" "bye")
(if (= 1 1) "hi" "bye")

Conditionals in functional programming languages illustrate the important fact that every valid expression has a value. In most imperative languages, and if-then statement itself has no value, but simply passes control to one branch or the other, depending on the value of the boolean-condition. In functional languages, however, the whole expression has the value of the succeeding branch.

(eval '(if (= 1 1) "hi" "bye"))

let expressions

Another important kind of expression in Clojure is the let expression. let allows you to bind variables temporarily within the scope of the expression.

(let [x 3
      y 5] 
(+ x y))

It consists of three parts (i) the let key word, (ii) a list of pairs of variables and expressions to evaluate to get their values and (iii) the body of the let expression where the variable binding will be used.

The apply function

Just like the evaluation of an expression is implemented by a function, eval in Clojure, the application of a function to some arguments is also implemented by a function apply.

(apply + '(1 2 3 4))

apply takes two arguments: the function and a list containing the arguments to that function.

Summary of Evaluation Rules

Input Type Evaluation Action
Atomic types (e.g., numbers, strings) Evaluate to themselves
Symbol Lookup the value bound to that variable name
List starting with quote Return the body “as is”
List starting with def Evaluate the body of the def statement, bind the resulting value to the symbol
List starting with fn Make a function
List starting with if Evaluate condition, if true evaluate consequent otherwise evaluate alternative
List starting with let Evaluate each expression in the binding expression and assign its values to the corresponding variable, evaluate the body in the context of these bindings
Other lists Evaluate each sub-expression, apply the value of the first subexpression to the values of the others as arguments

1 Introduction 3 Recursion