2 Models of Computation
How does language work? How is it represented in the mind? How is it processed? How is it learned? These are the foundational questions which motivates the field of linguistics. The modern study of language or generative linguistics began in the late 1940s and early 1950s when it was realized that the emerging understanding of computation that resulted from the work of Kurt Gödel, Alan Turing, Alonzo Church, Emil Post, Rózsa Péter, Stephen Kleene and others might be able to shed light on this question.
These researchers had begun to explore simple machine models of computation such as the Turing machine, Post rewrite systems, the \(\mu\)-recursive functions, and the \(\lambda\)-calculus. These models were originally intended to formalize the notion of mathematical computation or the reasoning processes used by a mathematician in proving theorems in a step-by-step fashion. Two surprising facts were soon discovered. First, Turing machines, \(\lambda\)-calculus, and other systems so-introduced proved to be equivalent in the sense that they could be used to define exactly the same set of functions. It was conjectured that this set of formalisms, the Turing-complete formalisms, in fact, defined the maximal set of function that could be effectively computed—that is, actually implemented in real physical hardware (this conjecture is known as the Church-Turing thesis). Second, it was discovered that there existed particular machines in each formalism which were universal computers. These machines could be programmed using their input to simulate the behavior of any other machine in the class. In modern terms, these machines are programmable interpreters able to perform any computation definable by a Turing machine.
These discoveries, which mostly occurred in the 1930s, launched the modern theory of computation. It wasn’t long before researchers began to realize that such models might have applications in modeling more general processes of the human mind beyond theorem proving. In particular, beginning in the early 1950s researchers such as Yoshua Bar Hillel, Charles Hockett and, most notably, Noam Chomsky, began to apply and adapt these tools to understanding how language might be produced and comprehended.
In another thread of research, Claude Shannon began studying the problem of transmission of information under conditions of uncertainty, and laid the foundations for the modern field of information theory. These tools were also quickly adapted to be used in the study of problems of the mind by researchers such as George Miller, and in especially in the case of language, Charles Hockett.
The ideas of these early researchers still form the foundations the modern fields of linguistics, artificial intelligence, and natural language processing. In this course, we will study these foundations with the goal of understanding the logic of how the tools can be applied to build models of language. Along the way, we will acquire many of the conceptual tools still needed to read the modern literature in natural language processing. However, the goal is not only to understand how such models work, but to be able to critically evaluate how computational models advance our scientific understanding of language.
But first, we need a model of computation….
The \(\lambda\)-Calculus and LISP
In this course, we will use the lambda calculus as our basic model of computation. The \(\lambda\)-calculus was introduced by Alonzo Church in the 1930s as a computable model of mathematical logic. Church based this model on mathematical functions applied to arguments, building up complex computable functions (originally representing logical expressions) from simple ones. In his famous paper introducing the Turing machine, Alan Turing showed that Church’s \(\lambda\)-Calculus and Turing machines were equivalent in the sense that they computed the same set of functions. In 1960, John McCarthy introduced LISP, one of the first practical programming languages, based on the \(\lambda\)-Calculus using list-based data structures (hence the name LISt Processor).
LISP was the first programming language in the paradigm that became known as functional programming which now includes languages such as Haskell, ML, Erlang, and some aspects of popular imperative programming languages such as Java (e.g., Guy Steele, the designer of Java, also was one of the designers of the LISP dialect known as Scheme). The behavior of functional programs is often easier to reason about than imperative programs and easier to parallelize. Thus, these languages have become important again in recent years after a period of less popularity. Functional languages also form the basis for most probabilistic programming languages—high-level programming languages that make it easy to specify probabilistic models. The Church programming language is one inspiration for this course. The \(\lambda\)-calculus is also one of the main computational tools used in formal semantics as studied in linguistics. Thus, a thorough understanding of the \(\lambda\)-calculus will be helpful in many areas beyond this course.
We will use a variant of LISP called Clojure. Althought the language is very powerful, the basic syntax of Clojure is very simple and can be learned in a few minutes. In the next few sections, we will discuss the basics of Clojure, before proceeding to use the language to begin our study of natural language using these tools.
Functions, Expressions, and Evaluation
The most important operation in the \(\lambda\)-calculus is the
application of a function (also known as a procedure) to some
arguments. In LISP, procedure application is expressed in Polish
notation like this: (function argument1 argument2)
. For
example, we can add 1 and 2 as follows.
(+ 1 2)
There are a few things to note about the example above. First, the bit
of code (+ 1 2)
is called an expression in LISP. An
expression is any syntactically well-formed piece of code that can be
run or evaluated by the LISP interpreter.
In fact, in LISP, eval
, which runs the interpreter on an
expression, is just another function, as shown below. We will see
the meaning of quote
shortly, for now, it just tells the
interpreter to pass the actual expression (+ 1 2)
to
eval
without first evaluating it.
(eval (quote (+ 1 2)))
The process of computing the output of an expression by running the interpreter on it is called evaluation and the result is known as the value of the expression. The ideas of expressions and evaluation are very important in functional programming languages because they are used to define the behavior of the interpreter.
Types
In LISP, every expression and every value has a type. The type of
the expression tells the LISP interpreter what to do with that
expression. The expression (+ 1 2)
is of type list in Clojure and
consists of three subexpressions: the symbol +
which is used here
to refer to the primitive function that performs addition (more on
symbols below), and it’s arguments 1
and 2
which are constants
of primitive type integer.
List expressions like (+ 1 2)
—where the first element of the
list is a procedure and the remaining elements are arguments to that
procedure—have a special meaning in LISP. Whenever the
interpreter sees such an expression, it tries to apply the function in
the first position to the values of the expressions in the argument
positions.
LISP is a homoiconic language. This means that LISP code itself is built out of the same datatypes that are processed by the language when code is run. In LISP you can write a program which takes lists as inputs and produces lists as outputs. But the program itself will also be expressed using lists, numbers, and other fundamental LISP datatypes of the same kind that are being processed by the code. We will see below how this makes it easy to reason about the behavior of LISP code.
To understand the behavior of the LISP interpreter, it is very important to keep track of the evaluation rule for each type in the language. The evaluation rule is just the principle that says what the interpreter does with an expression of each type that it is passed. So far, we have seen that when the interpreter is passed a list, it will usually try to apply the first element of the list to the other elements as a function.
Creating functions with \(\lambda\)
The true power of LISP comes from the ability to define new procedures
that perform new computations—not just from using existing
primitive functions. The operator that allows the definition of new
functions in the \(\lambda\)-calculus is called \(\lambda\) and is
what gives the \(\lambda\)-calculus its name. In Clojure we write
fn
instead of \(\lambda\). The general form of a \(\lambda\)
expression in Clojure looks like this:
(fn [parameters...] body...)
(fn [x y] (+ x y))
Let’s take a look at the definition above. It has three parts. First,
there is the fn
(\(\lambda\)) operator. Second, there is a list of
formal parameters to the procedure. These are variables that will
hold the values of inputs to the procedure. In this case, this
includes x
and y
. Third, there is the procedure body which
expresses the computation performed by the procedure using the formal
parameters. In this case, the body simple adds the values stored in
the two variables. Note what happens when we evaluate the expression
above. The value of a \(\lambda\)/fn
expression is a
function. This is our second evaluation rule: List expressions that
begin with the fn
operator are evaluated to create a new function.
How can we use a \(\lambda\)-defined procedure? Consider the following example.
((fn [x y] (+ x y)) 1 2)
This is a little hard to parse at first, but with a little effort you
can see that this code snippet is similar to (+ 1 2)
but with +
replaced with the procedure definition (fn [x y] (+ x y))
. In this
example, the procedure is constructed and then immediately applied to
the arguments 1
and 3
to compute the output value 3
, and then
“forgotten” by the language. Such functions that don’t have names are
sometimes called anonymous functions. How can we make a function
that doesn’t disappear like this?
Naming things with def
One mechanism we can use to make things which don’t disappear is by
naming them with a variable. In Clojure we can create named
variables with the def
primitive.
(def x 1)
def
binds the value 1
to the variable name x
. Names
variables, like x
, are special kinds of expression with a type we
have already seen, known as a symbol. The interpreter assumes that
expressions of type symbol are variables and tries to look up the
value that has been associated with the symbol.
(+ x x)
Now when we run the expression (+ x x)
the language first looks up
the value of the variable x
twice. Then, it looks up the value of
the symbol +
, which happens to be bound to the built-in addition
function. It then applies this addition function +
to the two
values of x
(which, of course, contain the same identical value).
We can also use def
to name functions created
with fn
.
(def my-addition (fn [x y] (+ x y)))
Now we have a function named my-addition
which is defined using the
fn
term we introduced above. Note an important fact, for Clojure
there is no difference between binding a number like 1
to a
variable-name like x
or a function to a variable name like
my-addition
. Functions are first class objects, they are simply a
type of object in the system like any other. They can be passed as
values to other functions, assigned to variables, stored in data
structures, etc.
(my-addition 1 2)
(my-addition x x)
Because we very frequently need to define procedures, Clojure provides a shorthand for doing this. Instead of writing
(def my-addition (fn [parameters] body))
in Clojure we can also write
(defn my-addition [parameters] body)
which means the same thing.
(defn my-addition [x y] (+ x y))
(my-addition 1 2)
Internally, variable bindings live in an environment, which can be thought of as a (nested) table of symbol-value pairs.
Variable Name (Symbol) | Binding |
---|---|
x | 1 |
my-addition | #object[Function] |
Clojure uses lexical scope , which means that if you redefine a
variable like x
in some local context the new definition is only
used in that scope, and when you leave that scope you will revert to
the original definition.
Some basic datatypes
All expressions and values in LISP have a type. In this section, we discuss some of the most important types in the programming language.
Numbers
Expressions like 1
and 2
are numbers which are an atomic type
in LISP. In LISP all atomic types evaluate to themselves.
(eval 1)
Symbols
Above, we discussed variables. A variable is named location in memory that stores some value for us. The kinds of things that can be the names of variables in LISP are a special datatype known as a symbol. Whenever the Clojure interpreter encounters an expression which is of type symbol, it assumes it is the name for a variable and tries to evaluate the symbol by looking up the value stored with the variable. That is, symbols evaluate to the value bound to the variable with that name.
x
If the variable has not been bound to value, you will get a special
value in Clojure called nil
which is Clojure’s way of referring to
“nothing.”
some-symbol
What if we want to talk about the symbol itself, instead of always
assuming that it is the name of a variable bound to some value? In
this case, we can use the quote
special form. quote
takes an
argument and returns a quoted version of that argument. Quoted
values are treated as is by the interpreter (as if they were a
constant). In other words, the interpreter doesn’t try to apply its
normal rules to anything within a quote
, the value of a quoted
expression is the expression itself.
(def my-variable (quote some-symbol))
my-variable
In the expression above, we created a variable called my-variable
and bound it to the value of the expression (quote some-symbol)
. We
then asked Clojure to interpret the symbol my-variable
. The language
assumed that the symbol my-variable
was a variable, and looked up
its value which, thanks to the quote, was the symbol my-symbol
.
Any expression, of any complexity, can be quoted, if you want the interpreter to treat it “as is”.
(def my-variable (quote (+ 1 2 3)))
my-variable
Here the variable my-variable
is bound to an expression which is a
list consisting of three elements: the symbol x
and the constants
1
, 2
, and 3
.
Since we frequently want to quote expressions, Clojure provides a
shorthand for this. Any expression preceded by a single quote '
is
considered to be quoted in the language.
(def my-variable 'some-symbol)
my-variable
Strings
In addition to numbers, procedures, and symbols, LISP also has
strings. A string is written "some string"
.
(def my-string "some string")
my-string
Note that symbols are not strings.
Lists
In Clojure lists are written as a sequence of things appearing between
parentheses (v1 v2 v3 ...)
. Lists are perhaps the most important and
common type of expression in Clojure. In particular, function
applications, def
-expressions, quote
-expressions, and several
other special kinds of expressions are all expressed as lists in LISP
code.
We can explicitly construct a list using the list
procedure.
(list 1 2 3)
In Clojure, we can add a single value on to the beginning of a list
using the cons
procedure.
(cons 0 (list 1 2 3))
If we want to remove the first element of a list, we can use the
rest
primitive.
(rest (list 0 1 2 3))
Or combined.
(rest (cons 0 (list 1 2 3)))
If we want to retrieve just the first element of a list, we can use
the first
primitive.
(first (list 0 1 2 3))
Or, using cons
, first
and rest
.
(cons
(first (list 0 'x 'y 'z))
(rest (list 'w 1 2 3)))
A function application is expressed a list with a procedure in the first position.
(list (fn [x y] (+ x y)) 1 2)
The result of evaluating the expression above is a list with a
procedure in the first position and the values 1
and 2
in the
second and third position. This expression can be evaluated to produce
an output using the eval
procedure.
(eval (list (fn [x y] (+ x y)) 1 2))
We can also quote a list to prevent the language from treating it as a function application.
'(1 2 3)
Notice what happens when we fail to quote the expression above.
(1 2 3)
There is an important special case of lists in LISP which is usually
written '()
. This is the empty list and is very important for
reasons we will see later. We could also get the empty list by calling
list
with no arguments like this (list)
.
(= '() (list))
Note the use of the quote symbol in '()
. The empty list is actually
displayed as ()
, and in Clojure these two expressions are
equivalent. But, to be clear that we don’t want the interpreter to try
to use function application (applying no function to nothing), we
quote the list so that interpreter simply evaluates the empty list to
itself.
Booleans
Another important datatype in LISP are the boolean values: true
and false. In Clojure these are written true
and false
. For
example, we can check the equality of two objects with the =
procedure which returns true
if the two objects have the same value
and false
otherwise.
(= '(1 2) (list 1 2))
(= '(1 2) (list 1 2 3))
In LISP, programmers usually follow the convention that boolean-valued
functions, also known as predicates, end in a question mark. An
important predicate we will see below is the predicate empty?
which
checks if its argument is an empty list/collection.
(empty? '())
(empty? (list 1 2 3))
On very important kind of expression that relies on booleans in LISP
is the conditional expression. Conditional is just another word
for an if…then structure. In Clojure, conditionals start with if
and have three parts:
(if boolean-condition consequent alternative)
The boolean condition is any boolean expression. If it is true, the expression in the consequent position is evaluated and its value is returned as the value of the whole conditional expression. If the boolean condition is false, the expression in the alternative is evaluated and its value is returned as the value of the whole conditional. Note that unlike in many imperative programming languages, conditionals have values in functional languages just like any other expression.
(if (= 1 2) "hi" "bye")
(if (= 1 1) "hi" "bye")
Conditionals in functional programming languages illustrate the important fact that every valid expression has a value. In most imperative languages, and if-then statement itself has no value, but simply passes control to one branch or the other, depending on the value of the boolean-condition. In functional languages, however, the whole expression has the value of the succeeding branch.
(eval '(if (= 1 1) "hi" "bye"))
let
expressions
Another important kind of expression in Clojure is the let
expression. let
allows you to bind variables temporarily within the
scope of the expression.
(let [x 3
y 5]
(+ x y))
It consists of three parts (i) the let
key word, (ii) a list of
pairs of variables and expressions to evaluate to get their values and
(iii) the body of the let
expression where the variable binding
will be used.
The apply
function
Just like the evaluation of an expression is implemented by a
function, eval
in Clojure, the application of a function to some
arguments is also implemented by a function apply
.
(apply + '(1 2 3 4))
apply
takes two arguments: the function and a list containing the
arguments to that function.
Summary of Evaluation Rules
Input Type | Evaluation Action |
---|---|
Atomic types (e.g., numbers, strings) | Evaluate to themselves |
Symbol | Lookup the value bound to that variable name |
List starting with quote |
Return the body “as is” |
List starting with def |
Evaluate the body of the def statement, bind the resulting value to the symbol |
List starting with fn |
Make a function |
List starting with if |
Evaluate condition, if true evaluate consequent otherwise evaluate alternative |
List starting with let |
Evaluate each expression in the binding expression and assign its values to the corresponding variable, evaluate the body in the context of these bindings |
Other lists | Evaluate each sub-expression, apply the value of the first subexpression to the values of the others as arguments |