# Lambda calculus


In mathematical logic and computer science, lambda calculus, also written as λ-calculus, is a formal system for function definition, function application and recursion. It was introduced by Alonzo Church in the 1930s as part of an investigation into the foundations of mathematics.[1][2] After the original system was shown to be logically inconsistent (the Kleene–Rosser paradox), Church isolated and published in 1936[3] just the portion relevant to computation, what is now called the untyped lambda calculus. In 1940, he also introduced a computationally weaker but logically consistent system, known as the simply typed lambda calculus.[4] In both typed and untyped versions, ideas from lambda calculus have found application in the fields of logic, recursion theory (computability), and linguistics, and have played an important role in the development of the theory of programming languages (with untyped lambda calculus being the original inspiration for functional programming, in particular Lisp, and typed lambda calculus serving as the foundation for modern type systems). This article deals primarily with the untyped lambda calculus.

## Informal description

### Motivation

Functions are a fundamental concept within computer science and mathematics. The λ-calculus provides simple semantics for computation with functions so that properties of functional computation can be studied.

Consider the following two examples. The identity function I(x) = x takes a single input, x, and immediately returns x (i.e. the identity does nothing with its input), whereas the function sqadd(x, y) = x*x + y*y takes a pair of inputs, x and y and returns the sum of their squares, x*x + y*y. Using these two examples, we can make some useful observations that motivate the major ideas in the lambda calculus.

The first observation is that functions need not be explicitly named. That is, the function sqadd(x, y) = x*x + y*y can be rewritten in anonymous form as x, y → x*x + y*y (read as “the pair of x and y is mapped to x*x + y*y”). Similarly, I(x) = x can be rewritten in anonymous form as x → x, and a quick check will confirm that this does indeed express the identity function.

The second observation is that the specific choice of name for a function's arguments is largely irrelevant. That is, x → x and y → y express the same function: the identity. Similarly, x, y → x*x + y*y and u, v → u*u + v*v also express the same function.

Finally, any function that requires two inputs, for instance the aforementioned sqadd function, can be reworked into an equivalent function that accepts a single input, and as output returns another function, that in turn accepts a single input. For example, x, y → x*x + y*y can be reworked into x → (y → x*x + y*y). This transformation is called currying, and can be generalized to functions accepting an arbitrary number of arguments.

This final observation may seem somewhat obscure, and is best grasped intuitively through the use of an example. So, compare the function x, y → x*x + y*y with its curried form, x → (y → x*x + y*y). Given two arguments, we have:
(x, y → x*x + y*y)(5, 2)
= 5*5 + 2*2 = 29.
Applying the curried form to the same two arguments, one at a time, we have:
(x → (y → x*x + y*y))(5)(2)
= (y → 5*5 + y*y)(2)
= 5*5 + 2*2 = 29
and we see the uncurried and curried forms compute the same result.
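
The same comparison can be sketched in Python. This is only an illustration: the names `sqadd_curried` and `curry2` are hypothetical helpers introduced here, not part of the lambda calculus.

```python
def sqadd(x, y):
    """Uncurried form: takes both inputs at once."""
    return x * x + y * y


def sqadd_curried(x):
    """Curried form: takes x and returns a function that takes y."""
    return lambda y: x * x + y * y


def curry2(f):
    """Hypothetical helper: curry any two-argument function."""
    return lambda x: lambda y: f(x, y)


print(sqadd(5, 2))             # 29
print(sqadd_curried(5)(2))     # 29
print(curry2(sqadd)(5)(2))     # 29
```

The intermediate value `sqadd_curried(5)` corresponds to the step y → 5*5 + y*y in the calculation above.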

### The lambda calculus

The lambda calculus consists of lambda terms and some extra operations.

Since the names of functions are largely a convenience, the lambda calculus has no means of naming a function. Since all functions expecting more than one input can be transformed into equivalent functions accepting a single input (via currying), the lambda calculus has no means for creating a function that accepts more than one argument. Since the names of arguments are largely irrelevant, the native notion of equality on lambda terms is alpha-equivalence, which codifies this principle.

#### Lambda terms

The syntax of lambda terms is particularly simple. A lambda term may be a variable, x. If t is a lambda term, and x is a variable, then λx.t is a lambda term (called a lambda abstraction). If t and s are lambda terms, then ts is a lambda term (called an application). Nothing else is a lambda term, though bracketing may be used to disambiguate terms.

Intuitively, a lambda abstraction λx.t represents an anonymous function that takes a single input, and the λ is said to bind x in t. An application ts represents the application of the function t to the input s. In the lambda calculus, functions are taken to be first class values, so functions may be used as the inputs to other functions, and functions may return functions as their outputs.

For example, λx.x represents the identity function, x → x, and (λx.x)y represents the identity function applied to y. Further, (λx.y) represents the constant function x → y, the function that always returns y, no matter the input. It should be noted that function application is left associative, so (λx.x)y z = ((λx.x)y)z.

Lambda terms on their own aren't particularly interesting. What makes them interesting are the various notions of equivalence and reduction that can be defined over them.

#### Alpha equivalence

A basic form of equivalence, definable on lambda terms, is alpha equivalence. This states that the particular choice of bound variable, in a lambda abstraction, doesn't (usually) matter. For instance, λx.x and λy.y are alpha equivalent lambda terms, representing the same identity function. Note that the terms x and y aren't alpha equivalent, because they are not bound in a lambda abstraction. In many presentations, it is usual to identify alpha equivalent lambda terms.
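
One way to test alpha equivalence mechanically is sketched below. It uses a technique not described in this article: each bound variable is replaced by the distance to its binder (a de Bruijn index), after which alpha-equivalent terms become structurally identical. The classes `Var`, `Lam`, `App` and the functions `nameless` and `alpha_eq` are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def nameless(term, bound=()):
    """Replace each bound variable by its distance to its binder."""
    if isinstance(term, Var):
        if term.name in bound:
            # index() finds the innermost binder first
            return ("bound", bound.index(term.name))
        return ("free", term.name)
    if isinstance(term, Lam):
        return ("lam", nameless(term.body, (term.param,) + bound))
    return ("app", nameless(term.fn, bound), nameless(term.arg, bound))


def alpha_eq(a, b):
    """Two terms are alpha equivalent iff their nameless forms coincide."""
    return nameless(a) == nameless(b)


print(alpha_eq(Lam("x", Var("x")), Lam("y", Var("y"))))   # True
print(alpha_eq(Var("x"), Var("y")))                       # False
```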

The following definitions are necessary in order to be able to define beta reduction.

#### Free variables

The free variables of a term are those variables not bound by a lambda abstraction. That is, the free variables of x are just x; the free variables of λx.t are the free variables of t, with x removed, and the free variables of ts are the union of the free variables of t and s.

For example, the lambda term representing the identity λx.x has no free variables, but the constant function λx.y has a single free variable, y.
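
The definition translates directly into a recursive function. The following is a minimal sketch, assuming a hypothetical representation of terms as the Python classes `Var`, `Lam` and `App`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def free_vars(term):
    """Return the set of names of the free variables of a term."""
    if isinstance(term, Var):
        return {term.name}
    if isinstance(term, Lam):
        # the abstraction binds its parameter, so remove it
        return free_vars(term.body) - {term.param}
    return free_vars(term.fn) | free_vars(term.arg)


print(free_vars(Lam("x", Var("x"))))   # set()
print(free_vars(Lam("x", Var("y"))))   # {'y'}
```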

#### Capture avoiding substitutions

Using the definition of free variables, we may now define a capture avoiding substitution. Suppose t, s and r are lambda terms and x and y are variables where x does not equal y. We write t[x := r] for the substitution of r for x in t, in a capture avoiding manner. That is:

• x[x := r] = r;
• y[x := r] = y;
• (ts)[x := r] = (t[x := r])(s[x := r]);
• (λx.t)[x := r] = λx.t;
• (λy.t)[x := r] = λy.(t[x := r]) if y is not in the free variables of r (a freshness condition).

For example, (λx.x)[y := y] = λx.(x[y := y]) = λx.x, and ((λx.y)x)[x := y] = ((λx.y)[x := y])(x[x := y]) = (λx.y)y.

The freshness condition is crucial in order to ensure that substitution does not change the meaning of functions. For example, suppose we define another substitution action without the freshness condition. Then, (λx.y)[y := x] = λx.(y[y := x]) = λx.x, and the constant function λx.y turns into the identity λx.x by substitution.

If our freshness condition is not met, then we may simply alpha-rename with a suitably fresh variable. For example, switching back to our correct notion of substitution, in (λx.y)[y := x] the lambda abstraction can be renamed with a fresh variable z, to obtain (λz.y)[y := x] = λz.(y[y := x]) = λz.x, and the meaning of the function is preserved by substitution.
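
The rules above, together with the alpha-renaming step, can be sketched as follows. The term classes, the `fresh` helper and its `z0, z1, ...` naming scheme are assumptions made for the illustration; any scheme that yields an unused name would do.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)


def fresh(avoid):
    """Return a variable name that does not occur in `avoid`."""
    for i in count():
        name = f"z{i}"
        if name not in avoid:
            return name


def subst(t, x, r):
    """Compute t[x := r] in a capture avoiding manner."""
    if isinstance(t, Var):
        return r if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fn, x, r), subst(t.arg, x, r))
    if t.param == x:
        return t                      # (λx.t)[x := r] = λx.t
    if t.param in free_vars(r):
        # freshness condition fails: alpha-rename the binder first
        new = fresh(free_vars(r) | free_vars(t.body) | {x})
        t = Lam(new, subst(t.body, t.param, Var(new)))
    return Lam(t.param, subst(t.body, x, r))


# (λx.y)[y := x] keeps its meaning as a constant function
print(subst(Lam("x", Var("y")), "y", Var("x")))
```

With this naming scheme the result is λz0.x, which is alpha equivalent to the λz.x obtained above.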

#### Beta reduction

Beta reduction states that an application of the form (λx.t)s reduces to the term t[x := s] (we write (λx.t)s → t[x := s] as a convenient shorthand for “(λx.t)s beta reduces to t[x := s]”). For example, (λx.x)s → x[x := s] = s, demonstrating that λx.x really is the identity. Similarly, (λx.y)s → y[x := s] = y, demonstrating that λx.y really is a constant function.

The lambda calculus may be seen as an idealised functional programming language, like Haskell or Standard ML. Under this view, beta reduction corresponds to a computational step, and in the untyped lambda calculus, as presented here, reduction need not terminate. For instance, consider the term (λx.xx)(λx.xx). Here, we have (λx.xx)(λx.xx) → (xx)[x := λx.xx] = (x[x := λx.xx])(x[x := λx.xx]) = (λx.xx)(λx.xx). That is, the term reduces to itself in a single beta reduction, and therefore reduction will never terminate.
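
These three examples can be tried with Python's own anonymous functions. This is only an analogy: Python evaluates arguments eagerly and has a bounded call stack, so the non-terminating term shows up as a `RecursionError` rather than an endless loop. The helper `diverges` is hypothetical.

```python
identity = lambda x: x      # λx.x
const_y = lambda x: "y"     # λx.y, with the free variable y modelled as a string
omega = lambda x: x(x)      # λx.xx


def diverges(thunk):
    """Return True if calling thunk exhausts Python's recursion limit."""
    try:
        thunk()
    except RecursionError:
        return True
    return False


print(identity("s"))                    # s
print(const_y("s"))                     # y
print(diverges(lambda: omega(omega)))   # True
```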

Another problem with the untyped lambda calculus is the inability to distinguish between different kinds of data. For instance, we may want to write a function that only operates on numbers. However, in the untyped lambda calculus, there's no way to prevent our function from being applied to truth values, or strings, for instance.

Typed lambda calculi, that will be introduced later in the article, attempt to rule out as many misbehaved terms as possible.

## Formal definition

### Definition

Lambda expressions are composed of

• variables v1, v2, ..., vn
• the abstraction symbols λ and .
• parentheses ( )

The set of lambda expressions, Λ, can be defined recursively:

1. If x is a variable, then x ∈ Λ
2. If x is a variable and M ∈ Λ, then (λx.M) ∈ Λ
3. If M, N ∈ Λ, then (M N) ∈ Λ

Instances of rule 2 are known as abstractions and instances of rule 3 are known as applications.[5]

### Notation

To keep the notation of lambda expressions uncluttered, the following conventions are usually applied.

• Outermost parentheses are dropped: M N instead of (M N).
• Applications are assumed to be left associative: M N P means (M N) P.
• The body of an abstraction extends as far right as possible: λx.M N means λx.(M N) and not (λx.M) N.
• A sequence of abstractions is contracted: λx.λy.λz.N is abbreviated as λxyz.N.[6]

### Free and bound variables

The abstraction operator, λ, is said to bind its variable wherever it occurs in the body of the abstraction. Variables that fall within the scope of a lambda are said to be bound. All other variables are called free. For example in the following expression y is a bound variable and x is free: λy.x x y. Also note that a variable binds to its "nearest" lambda. In the following expression one single occurrence of x is bound by the second lambda: λx.y (λx.z x)

The set of free variables of a lambda expression, M, is denoted as FV(M) and is defined by recursion on the structure of the terms, as follows:

1. FV(x) = {x}, where x is a variable
2. FV(λx.M) = FV(M) - {x}
3. FV(M N) = FV(M) ∪ FV(N)[7]

An expression which contains no free variables is said to be closed. Closed lambda expressions are also known as combinators and are equivalent to terms in combinatory logic.

## Reduction

The meaning of lambda expressions is defined by how expressions can be reduced.[8]

There are three kinds of reduction:

• α-conversion: changing bound variables;
• β-conversion: applying functions to their arguments;
• η-conversion: which captures a notion of extensionality.

We also speak of the resulting equivalences: two expressions are β-equivalent if they can be β-converted into the same expression, and α/η-equivalence are defined similarly.

### α-conversion

Alpha-conversion allows bound variable names to be changed. For example, alpha-conversion of λx.x might yield λy.y. Frequently in uses of lambda calculus, terms that differ only by alpha-conversion are considered to be equivalent.

The precise rules for alpha-conversion are not completely trivial. First, when alpha-converting an abstraction, the only variable occurrences that are renamed are those that are bound to the same abstraction. For example, an alpha-conversion of λxx.x could result in λyx.x, but it could not result in λyx.y. The latter has a different meaning from the original.

Second, alpha-conversion is not possible if it would result in a variable getting captured by a different abstraction. For example, if we replace x with y in λxy.x, we get λyy.y, which is not at all the same.

#### Substitution

Substitution, written E[V := E′], is the process of replacing all free occurrences of the variable V by expression E′. Substitution on terms of the λ-calculus is defined by recursion on the structure of terms, as follows.

x[x := N]        ≡ N
y[x := N]        ≡ y, if x ≠ y
(M1 M2)[x := N]  ≡ (M1[x := N]) (M2[x := N])
(λy.M)[x := N]   ≡ λy.(M[x := N]), if x ≠ y and y ∉ FV(N)

To substitute into a lambda abstraction, it is sometimes necessary to α-convert the expression. For example, it is not correct for (λx.y)[y := x] to result in (λx.x), because the substituted x was supposed to be free but ended up being bound. The correct substitution in this case is (λz.x), up-to α-equivalence. Notice that substitution is defined uniquely up-to α-equivalence.

### β-reduction

Beta-reduction captures the idea of function application. Beta-reduction is defined in terms of substitution: the beta-reduction of (λV.E) E′ is E[V := E′].

For example, assuming some encoding of 2, 7, *, we have the following β-reduction: (λn.n*2) 7 → 7*2.
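
A single β-reduction step can be sketched on top of substitution. The code below is an illustrative sketch under stated assumptions: the term classes, the `v0, v1, ...` renaming scheme and the choice to contract the leftmost, outermost redex first are all made for the example, not taken from the definition.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)


def subst(t, x, n):
    """t[x := n], alpha-converting a binder when it would capture."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fn, x, n), subst(t.arg, x, n))
    if t.param == x:
        return t
    if t.param in free_vars(n):
        avoid = free_vars(n) | free_vars(t.body) | {x}
        new = next(f"v{i}" for i in count() if f"v{i}" not in avoid)
        t = Lam(new, subst(t.body, t.param, Var(new)))
    return Lam(t.param, subst(t.body, x, n))


def beta_step(t):
    """Contract the leftmost, outermost redex; None if there is none."""
    if isinstance(t, App):
        if isinstance(t.fn, Lam):
            return subst(t.fn.body, t.fn.param, t.arg)   # (λV.E) E′ → E[V := E′]
        fn = beta_step(t.fn)
        if fn is not None:
            return App(fn, t.arg)
        arg = beta_step(t.arg)
        return None if arg is None else App(t.fn, arg)
    if isinstance(t, Lam):
        body = beta_step(t.body)
        return None if body is None else Lam(t.param, body)
    return None


print(beta_step(App(Lam("x", Var("x")), Var("s"))))   # Var(name='s')
```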

### η-conversion

Eta-conversion expresses the idea of extensionality, which in this context is that two functions are the same if and only if they give the same result for all arguments. Eta-conversion converts between λx.f x and f whenever x does not appear free in f.

This conversion is not always appropriate when lambda expressions are interpreted as programs. Evaluation of λx.f x can terminate even when evaluation of f does not.
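
The difference can be observed in an eagerly evaluated language. In the Python sketch below, `loop` is a hypothetical stand-in for an expression whose evaluation does not finish (Python reports this as a `RecursionError`); wrapping it in a lambda delays the evaluation until the wrapper is called.

```python
def f(n):
    return n + 1


# Eta-expansion of a well-behaved function changes nothing observable.
eta_expanded = lambda x: f(x)    # λx.f x


def loop():
    """Stand-in for an expression whose evaluation never finishes."""
    return loop()


# Building λx.(loop()) x succeeds: the body is not evaluated yet.
wrapped = lambda x: loop()(x)

print(eta_expanded(3) == f(3))   # True
print(callable(wrapped))         # True
```

Evaluating `loop()` on its own fails, while constructing `wrapped` succeeds; only a later call such as `wrapped(0)` triggers the failure.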

## Encoding datatypes

The basic lambda calculus may be used to model booleans, arithmetic, data structures and recursion, as illustrated in the following sub-sections.

### Arithmetic in lambda calculus

There are several possible ways to define the natural numbers in lambda calculus, but by far the most common are the Church numerals, which can be defined as follows:

0 := λfx.x
1 := λfx.f x
2 := λfx.f (f x)
3 := λfx.f (f (f x))

and so on. A Church numeral is a higher-order function: it takes a single-argument function f, and returns another single-argument function. The Church numeral n is a function that takes a function f as argument and returns the n-th composition of f, i.e. the function f composed with itself n times. This is denoted f^(n) and is in fact the n-th power of f (considered as an operator); f^(0) is defined to be the identity function. Such repeated compositions (of a single function f) obey the laws of exponents, which is why these numerals can be used for arithmetic. (In Church's original lambda calculus, the formal parameter of a lambda expression was required to occur at least once in the function body, which made the above definition of 0 impossible.)

We can define a successor function, which takes a number n and returns n + 1 by adding an additional application of f:

SUCC := λnfx.f (n f x)

Because the m-th composition of f composed with the n-th composition of f gives the m+n-th composition of f, addition can be defined as follows:

PLUS := λmnfx.m f (n f x)

PLUS can be thought of as a function taking two natural numbers as arguments and returning a natural number; it can be verified that

PLUS 2 3 and 5

are equivalent lambda expressions. Since adding m to a number n can be accomplished by adding 1 m times, an equivalent definition is:

PLUS := λmn.m SUCC n[9]
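
The numerals, SUCC and both versions of PLUS can be transcribed into Python lambdas to check the claim. The decoder `to_int` is a hypothetical helper that counts how many times f is applied; it is not part of the encoding.

```python
ZERO = lambda f: lambda x: x
ONE = lambda f: lambda x: f(x)
TWO = lambda f: lambda x: f(f(x))
THREE = lambda f: lambda x: f(f(f(x)))

SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
PLUS = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
PLUS_VIA_SUCC = lambda m: lambda n: m(SUCC)(n)


def to_int(n):
    """Decode a Church numeral by applying 'add one' to 0."""
    return n(lambda k: k + 1)(0)


print(to_int(SUCC(ZERO)))                  # 1
print(to_int(PLUS(TWO)(THREE)))            # 5
print(to_int(PLUS_VIA_SUCC(TWO)(THREE)))   # 5
```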

Similarly, multiplication can be defined as

MULT := λmnf.m (n f)[10]

Alternatively

MULT := λmn.m (PLUS n) 0

since multiplying m and n is the same as repeating the add n function m times and then applying it to zero. Exponentiation has a rather simple rendering in Church numerals, namely

POW := λbe.e b

The predecessor function defined by PRED n = n - 1 for a positive integer n and PRED 0 = 0 is considerably more difficult. The formula

PRED := λnfx.n (λgh.h (g f)) (λu.x) (λu.u)

can be validated by showing inductively that if T denotes (λgh.h (g f)), then T^(n)(λu.x) = (λh.h (f^(n-1)(x))) for n > 0. Two other definitions of PRED are given below, one using conditionals and the other using pairs. With the predecessor function, subtraction is straightforward. Defining

SUB := λmn.n PRED m,

SUB m n yields m - n when m > n and 0 otherwise.
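
The remaining arithmetic definitions can be checked the same way. In this sketch `church` and `to_int` are hypothetical conversion helpers between Python integers and Church numerals; everything else is a direct transcription of the terms above.

```python
def church(k):
    """Build the Church numeral for a non-negative Python int."""
    def numeral(f):
        def apply(x):
            for _ in range(k):
                x = f(x)
            return x
        return apply
    return numeral


def to_int(n):
    return n(lambda k: k + 1)(0)


ZERO = church(0)
PLUS = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

MULT = lambda m: lambda n: lambda f: m(n(f))
MULT_VIA_PLUS = lambda m: lambda n: m(PLUS(n))(ZERO)
POW = lambda b: lambda e: e(b)
PRED = lambda n: lambda f: lambda x: (
    n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)
)
SUB = lambda m: lambda n: n(PRED)(m)

print(to_int(MULT(church(2))(church(3))))   # 6
print(to_int(POW(church(2))(church(3))))    # 8
print(to_int(PRED(church(3))))              # 2
print(to_int(SUB(church(5))(church(2))))    # 3
print(to_int(SUB(church(2))(church(5))))    # 0
```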

### Logic and predicates

By convention, the following two definitions (known as Church booleans) are used for the boolean values TRUE and FALSE:

TRUE := λxy.x
FALSE := λxy.y
(Note that FALSE is equivalent to the Church numeral zero defined above)

Then, with these two λ-terms, we can define some logic operators (these are just possible formulations; other expressions are equally correct):

AND := λpq.p q p
OR := λpq.p p q
NOT := λpab.p b a
IFTHENELSE := λpab.p a b

We are now able to compute some logic functions, for example:

AND TRUE FALSE
≡ (λpq.p q p) TRUE FALSE →β TRUE FALSE TRUE
≡ (λxy.x) FALSE TRUE →β FALSE

and we see that AND TRUE FALSE is equivalent to FALSE.
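
The Church booleans and operators can be transcribed into Python to confirm such reductions. The decoder `to_bool` is a hypothetical helper that lets a Church boolean choose between Python's `True` and `False`.

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

AND = lambda p: lambda q: p(q)(p)
OR = lambda p: lambda q: p(p)(q)
NOT = lambda p: lambda a: lambda b: p(b)(a)
IFTHENELSE = lambda p: lambda a: lambda b: p(a)(b)


def to_bool(b):
    """Decode a Church boolean into a Python bool."""
    return b(True)(False)


print(to_bool(AND(TRUE)(FALSE)))          # False
print(to_bool(OR(TRUE)(FALSE)))           # True
print(to_bool(NOT(TRUE)))                 # False
print(IFTHENELSE(FALSE)("then")("else"))  # else
```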

A predicate is a function which returns a boolean value. The most fundamental predicate is ISZERO which returns TRUE if its argument is the Church numeral 0, and FALSE if its argument is any other Church numeral:

ISZERO := λn.n (λx.FALSE) TRUE

The following predicate tests whether the first argument is less-than-or-equal-to the second:

LEQ := λmn.ISZERO (SUB m n),

and since m = n iff LEQ m n and LEQ n m, it is straightforward to build a predicate for numerical equality.
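
A Python sketch of these predicates follows. `EQ` is one possible way to build the equality predicate described above, combining the two LEQ tests with AND; the name `EQ` and the helpers `church` and `to_bool` are assumptions introduced for the example.

```python
def church(k):
    """Hypothetical helper: Church numeral for a non-negative int."""
    return lambda f: lambda x: x if k == 0 else f(church(k - 1)(f)(x))


TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
AND = lambda p: lambda q: p(q)(p)
to_bool = lambda b: b(True)(False)

PRED = lambda n: lambda f: lambda x: (
    n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)
)
SUB = lambda m: lambda n: n(PRED)(m)

ISZERO = lambda n: n(lambda x: FALSE)(TRUE)
LEQ = lambda m: lambda n: ISZERO(SUB(m)(n))
EQ = lambda m: lambda n: AND(LEQ(m)(n))(LEQ(n)(m))

print(to_bool(ISZERO(church(0))))           # True
print(to_bool(LEQ(church(2))(church(3))))   # True
print(to_bool(LEQ(church(3))(church(2))))   # False
print(to_bool(EQ(church(4))(church(4))))    # True
```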

The availability of predicates and the above definition of TRUE and FALSE make it convenient to write "if-then-else" expressions in lambda calculus. For example, the predecessor function can be defined as

PRED := λn.n (λgk.ISZERO (g 1) k (PLUS (g k) 1)) (λv.0) 0

which can be verified by showing inductively that n (λgk.ISZERO (g 1) k (PLUS (g k) 1)) (λv.0) is the add n - 1 function for n > 0.

### Pairs

A pair (2-tuple) can be defined in terms of TRUE and FALSE, by using the Church encoding for pairs. For example, PAIR encapsulates the pair (x,y), FIRST returns the first element of the pair, and SECOND returns the second.

PAIR := λxyf.f x y
FIRST := λp.p TRUE
SECOND := λp.p FALSE
NIL := λx.TRUE
NULL := λp.p (λxy.FALSE)

A linked list can be defined as either NIL for the empty list, or the PAIR of an element and a smaller list. The predicate NULL tests for the value NIL. (Alternatively, with NIL := FALSE, the construct l (λhtz.deal_with_head_h_and_tail_t) (deal_with_nil) obviates the need for an explicit NULL test).

As an example of the use of pairs, the shift-and-increment function that maps (m, n) to (n, n + 1) can be defined as

Φ := λx.PAIR (SECOND x) (SUCC (SECOND x))

which allows us to give perhaps the most transparent version of the predecessor function:

PRED := λn.FIRST (n Φ (PAIR 0 0)).
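
The pair encoding and the pair-based predecessor can be checked with a direct transcription into Python. `PHI` stands for Φ, and `to_int` and `to_bool` are hypothetical decoding helpers.

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

ZERO = lambda f: lambda x: x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
THREE = SUCC(SUCC(SUCC(ZERO)))

PAIR = lambda x: lambda y: lambda f: f(x)(y)
FIRST = lambda p: p(TRUE)
SECOND = lambda p: p(FALSE)
NIL = lambda x: TRUE
NULL = lambda p: p(lambda x: lambda y: FALSE)

# Φ maps (m, n) to (n, n + 1)
PHI = lambda x: PAIR(SECOND(x))(SUCC(SECOND(x)))
PRED = lambda n: FIRST(n(PHI)(PAIR(ZERO)(ZERO)))

to_int = lambda n: n(lambda k: k + 1)(0)
to_bool = lambda b: b(True)(False)

print(to_int(PRED(THREE)))                # 2
print(to_int(PRED(ZERO)))                 # 0
print(to_bool(NULL(NIL)))                 # True
print(to_bool(NULL(PAIR(ZERO)(NIL))))     # False
```

Applying Φ n times to (0, 0) yields (n - 1, n) for n > 0, so the first component is the predecessor.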

### Recursion and fixed points

Recursion is the definition of a function using the function itself; on the face of it, lambda calculus does not allow this. However, this impression is misleading. Consider for instance the factorial function f(n) recursively defined by

f(n) = 1, if n = 0; else n × f(n - 1).

In lambda calculus, a function cannot refer directly to itself. However, a function may accept a parameter which is assumed to be itself. As an invariant, this argument is typically the first. Binding it to the function yields a new function which may recurse. To achieve recursion, the self-referencing argument (called r here) must always be passed to itself within the function body.

g := λr. λn.(1, if n = 0; else n × (r r (n-1)))
f := g g
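
The self-application trick works as written in Python. The sketch below uses Python's native integers and conditional expression in place of Church numerals and Church booleans, purely for readability.

```python
# g takes "itself" as the first argument r and passes r to r when recursing
g = lambda r: lambda n: 1 if n == 0 else n * r(r)(n - 1)
f = g(g)

print(f(5))   # 120
```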

This solves the specific problem of the factorial function, but a generic solution is also possible. Given a lambda term representing the body of a recursive function or loop, taking itself as the first argument, the fixed-point operator will return the desired recursive function or loop. The function does not need to be explicitly passed to itself at any point. In fact there are many possible definitions for this operator, generally known as fixed point combinators. The simplest is defined as such:

Y = λg.(λx.g (x x)) (λx.g (x x))

In the lambda calculus, Y g is a fixed-point of g, as it expands to:

Y g
(λh.(λx.h (x x)) (λx.h (x x))) g
(λx.g (x x)) (λx.g (x x))
g ((λx.g (x x)) (λx.g (x x))) - Compare with the previous step
g (Y g).

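Now, to complete our recursive call to the factorial function, we would simply call g (Y g) n, where n is the number we are calculating the factorial of. Here g is no longer the self-applying term defined earlier: it takes the recursive function f as its first argument and calls f directly, i.e. g := λfn.(1, if n = 0; and n·(f(n-1)), if n>0).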

Given n = 5, for example, this gives:

g (Y g) 5
(λfn.(1, if n = 0; and n·(f(n-1)), if n>0)) (Y g) 5

and expands to:

(λn.(1, if n = 0; and n·((Y g) (n-1)), if n>0)) 5
1, if 5 = 0; and 5·(g(Y g) (5-1)), if 5>0
5·(g(Y g) 4)
5·(λn.(1, if n = 0; and n·((Y g) (n-1)), if n>0) 4)
5·(1, if 4 = 0; and 4·(g(Y g) (4-1)), if 4>0)
5·(4·(g(Y g) 3))
5·(4·(λn.(1, if n = 0; and n·((Y g) (n-1)), if n>0) 3))
5·(4·(1, if 3 = 0; and 3·(g(Y g) (3-1)), if 3>0))
5·(4·(3·(g(Y g) 2)))
...

And so on, evaluating the structure of the algorithm recursively. Every recursively defined function can be seen as a fixed point of some other suitable function, and therefore, using Y, every recursively defined function can be expressed as a lambda expression. In particular, we can now cleanly define the subtraction, multiplication and comparison predicate of natural numbers recursively.
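
A fixed-point combinator can also be tried in Python, with one caveat: Python evaluates arguments eagerly, so a literal transcription of Y keeps expanding x x and never returns. The sketch below therefore uses an eta-expanded variant (often called the Z combinator), which is a substitution made for this example rather than the Y defined above. Native integers stand in for Church numerals.

```python
# Literal transcription of Y: diverges under eager evaluation.
Y = lambda g: (lambda x: g(x(x)))(lambda x: g(x(x)))

# Eta-expanded variant: x(x) is delayed until an argument v arrives.
Z = lambda g: (lambda x: g(lambda v: x(x)(v)))(lambda x: g(lambda v: x(x)(v)))

# The factorial body takes the recursive function f as its first argument.
fact_step = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

fact = Z(fact_step)
print(fact(5))   # 120

try:
    Y(fact_step)
except RecursionError:
    print("Y diverges under eager evaluation")
```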

### Standard combinators

Certain combinators have commonly accepted names:

I := λx.x
K := λxy.x
S := λxyz.(x z (y z))
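
As a small illustration of how these combinators interact, the Python sketch below checks the well-known identity S K K = I on a few sample arguments (S K K z reduces to K z (K z), which reduces to z). The sample values are arbitrary.

```python
I = lambda x: x
K = lambda x: lambda y: x
S = lambda x: lambda y: lambda z: x(z)(y(z))

SKK = S(K)(K)

print(SKK(42))        # 42
print(SKK("term"))    # term
print(K("a")("b"))    # a
```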

## Computable functions and lambda calculus

A function F: N → N of natural numbers is a computable function if and only if there exists a lambda expression f such that for every pair of x, y in N, F(x)=y if and only if f x =β y, where x and y are the Church numerals corresponding to x and y, respectively, and =β means equivalence with beta reduction. This is one of the many ways to define computability; see the Church-Turing thesis for a discussion of other approaches and their equivalence.

## Undecidability of equivalence

There is no algorithm which takes as input two lambda expressions and outputs TRUE or FALSE depending on whether or not the two expressions are equivalent. This was historically the first problem for which undecidability could be proven. As is common for a proof of undecidability, the proof shows that no computable function can decide the equivalence. Church's thesis is then invoked to show that no algorithm can do so.

Church's proof first reduces the problem to determining whether a given lambda expression has a normal form. A normal form is an equivalent expression which cannot be reduced any further. Then he assumes that this predicate is computable, and can hence be expressed in lambda calculus. Building on earlier work by Kleene and constructing a Gödel numbering for lambda expressions, he constructs a lambda expression e which closely follows the proof of Gödel's first incompleteness theorem. If e is applied to its own Gödel number, a contradiction results.

## Lambda calculus and programming languages

As pointed out by Peter Landin's 1965 paper A Correspondence between ALGOL 60 and Church's Lambda-notation, sequential procedural programming languages can be understood in terms of the lambda calculus, which provides the basic mechanisms for procedural abstraction and procedure (subprogram) application.

Lambda calculus reifies "functions" and makes them first-class objects, which raises implementation complexity when implementing lambda calculus. A particular challenge is related to the support of higher-order functions, also known as the Funarg problem. Lambda calculus is usually implemented using a virtual machine approach. The first practical implementation of lambda calculus was provided in 1963 by Peter Landin, and is known as the SECD machine. Since then, several optimized abstract machines for lambda calculus were suggested, such as the G-machine[11] and the Categorical abstract machine.

The most prominent counterparts to lambda calculus in programming are functional programming languages, which essentially implement the calculus augmented with some constants and datatypes. Lisp uses a variant of lambda notation for defining functions, but only its purely functional subset ("Pure Lisp") is really equivalent to lambda calculus.

### First-class functions

Anonymous functions are sometimes called lambda expressions in programming languages. An example of a lambda expression in Lisp:

```
(lambda (x) (* x x))
```

and in Haskell:

```
\x -> x*x -- where the \ denotes the Greek λ
```

The above Lisp example is an expression which evaluates to a first-class function. The symbol `lambda` creates an anonymous function, given a list of parameter names, `(x)` — just a single argument in this case, and an expression which is evaluated as the body of the function, `(* x x)`. The Haskell example is identical.

Functional languages are not the only ones to support functions as first-class objects. Numerous imperative languages, e.g. Pascal, have long supported passing subprograms as arguments to other subprograms. In C and the C-like subset of C++ the equivalent result is obtained by passing pointers to the code of functions (subprograms), called function pointers. Such mechanisms are limited to subprograms written explicitly in the code, and do not directly support higher-order functions. Some imperative object-oriented languages have notations that represent functions of any order; such mechanisms are available in C++, Smalltalk and more recently in Eiffel ("agents") and C# ("delegates"). As an example, the Eiffel "inline agent" expression

```
agent (x: REAL): REAL do Result := x * x end
```

denotes an object corresponding to the lambda expression λx.x*x (with call by value). It can be treated like any other expression, e.g. assigned to a variable or passed around to routines. If the value of square is the above agent expression, then the result of applying square to a value a (β-reduction) is expressed as square.item ([a]), where the argument is passed as a tuple.

A Python example of this uses the lambda form of functions:

```
func = lambda x: x ** 2
```

This creates a new anonymous function and names it func which can be passed to other functions, stored in variables, etc. Python can also treat any other function created with the standard def statement as first-class objects.

The same holds for the Smalltalk expression

```
[ :x | x * x ]
```

This is a first-class object (block closure), which can be stored in variables, passed as arguments, etc.

A similar C++ example (using the Boost.Lambda library):

```
std::for_each(c.begin(), c.end(), std::cout << _1 * _1 << '\n');
```

Here the standard library function for_each iterates over all members of container 'c', and prints the square of each element. The _1 notation is Boost.Lambda's convention (originally derived from Boost.Bind) for representing the first placeholder element (the first argument), represented as x elsewhere.

### Reduction strategies

Whether a term is normalizing or not, and how much work needs to be done in normalizing it if it is, depends to a large extent on the reduction strategy used. The distinction between reduction strategies relates to the distinction in functional programming languages between eager evaluation and lazy evaluation.

The following uses the term redex, short for reducible expression. For example, (λx.M) N is a beta-redex; λx.M x is an eta-redex if x is not free in M. The expression to which a redex reduces is called its reduct; using the previous example, the reducts of these expressions are respectively M[x:=N] and M.

• Full beta reductions: Any redex can be reduced at any time. This means essentially the lack of any particular reduction strategy: with regard to reducibility, "all bets are off".
• Applicative order: The rightmost, innermost redex is always reduced first. Intuitively this means a function's arguments are always reduced before the function itself. Applicative order always attempts to apply functions to normal forms, even when this is not possible. Most programming languages (including Lisp, ML and imperative languages like C and Java) are described as "strict", meaning that functions applied to non-normalising arguments are non-normalising. This is done essentially using applicative order, call by value reduction (see below), but usually called "eager evaluation".
• Normal order: The leftmost, outermost redex is always reduced first. That is, whenever possible the arguments are substituted into the body of an abstraction before the arguments are reduced.
• Call by name: As normal order, but no reductions are performed inside abstractions. For example λx.(λx.x)x is in normal form according to this strategy, although it contains the redex (λx.x)x.
• Call by value: Only the outermost redexes are reduced: a redex is reduced only when its right hand side has reduced to a value (variable or lambda abstraction).
• Call by need: As normal order, but function applications that would duplicate terms instead name the argument, which is then reduced only "when it is needed". In practical contexts this is called "lazy evaluation". In implementations this "name" takes the form of a pointer, with the redex represented by a thunk.

Applicative order is not a normalising strategy. The usual counterexample is as follows: define Ω = ωω where ω = λx.xx. This entire expression contains only one redex, namely the whole expression; its reduct is again Ω. Since this is the only available reduction, Ω has no normal form (under any evaluation strategy). Using applicative order, the expression KIΩ = (λxy.x) (λx.x)Ω is reduced by first reducing Ω to normal form (since it is the rightmost redex), but since Ω has no normal form, applicative order fails to find a normal form for KIΩ.

In contrast, normal order is so called because it always finds a normalising reduction if one exists. In the above example, KIΩ reduces under normal order to I, a normal form. A drawback is that redexes in the arguments may be copied, resulting in duplicated computation (for example, (λx.xx) ((λx.x)y) reduces to ((λx.x)y) ((λx.x)y) using this strategy; now there are two redexes, so full evaluation needs two more steps, but if the argument had been reduced first, there would now be none).
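
The KIΩ example can be imitated in Python, which is itself a strict language. In the sketch below the eager version evaluates Ω before calling K I and fails (Python reports the endless reduction as a `RecursionError`), while a version that passes arguments as zero-argument functions, a rough stand-in for substituting unevaluated arguments, returns I. The names `K_name`, `strict_KI_omega` and `by_name_KI_omega` are hypothetical.

```python
I = lambda x: x
K = lambda x: lambda y: x
omega = lambda x: x(x)          # ω = λx.xx, so Ω = omega(omega)


def strict_KI_omega():
    # Python reduces the argument omega(omega) before applying K(I) to it.
    return K(I)(omega(omega))


# Arguments are wrapped in thunks and only forced when actually used.
K_name = lambda x: lambda y: x()


def by_name_KI_omega():
    return K_name(lambda: I)(lambda: omega(omega))


try:
    strict_KI_omega()
except RecursionError:
    print("eager evaluation fails to find a normal form")

print(by_name_KI_omega() is I)   # True
```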

The positive tradeoff of using applicative order is that it does not cause unnecessary computation if all arguments are used, because it never substitutes arguments containing redexes and hence never needs to copy them (which would duplicate work). In the above example, in applicative order (λx.xx) ((λx.x)y) reduces first to (λx.xx)y and then to the normal form yy, taking two steps instead of three.

Most purely functional programming languages (notably Miranda and its descendants, including Haskell), and the proof languages of theorem provers, use lazy evaluation, which is essentially the same as call by need. This is like normal order reduction, but call by need manages to avoid the duplication of work inherent in normal order reduction using sharing. In the example given above, (λx.xx) ((λx.x)y) reduces to ((λx.x)y) ((λx.x)y), which has two redexes, but in call by need they are represented using the same object rather than copied, so when one is reduced the other is too.
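
The effect of sharing can be sketched with a memoising thunk. In the Python code below, the hypothetical `Thunk` class caches its result after the first evaluation, and a list records each time the argument redex (λx.x)y is reduced. Passing a plain zero-argument function models copying the redex; passing a `Thunk` models call by need. Here y is taken to be an identity function so that the self-application y y is well defined in Python.

```python
class Thunk:
    """A delayed computation whose result is computed at most once."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value


evaluations = []
y = lambda v: v


def arg_redex():
    """Model the redex (λx.x) y and record each time it is reduced."""
    evaluations.append("reduced")
    return (lambda x: x)(y)


# Copying the redex: λx.xx evaluates its argument at each use.
evaluations.clear()
(lambda x: x()(x()))(arg_redex)
by_name_count = len(evaluations)

# Sharing the redex: both uses refer to the same thunk.
evaluations.clear()
shared = Thunk(arg_redex)
(lambda x: x.force()(x.force()))(shared)
by_need_count = len(evaluations)

print(by_name_count, by_need_count)   # 2 1
```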

While the idea of beta reduction seems simple enough, it is not an atomic step, in that it must have a non-trivial cost when estimating computational complexity.[12] To be precise, one must somehow find the location of all of the occurrences of the bound variable V in the expression E, implying a time cost, or one must keep track of these locations in some way, implying a space cost. A naïve search for the locations of V in E is O(n) in the length n of E. This has led to the study of systems which use explicit substitution. Sinot's director strings [13] offer a way of tracking the locations of free variables in expressions.

### Parallelism and concurrency

The Church-Rosser property of the lambda calculus means that evaluation (β-reduction) can be carried out in any order, even in parallel. This means that various nondeterministic evaluation strategies are relevant. However, the lambda calculus does not offer any explicit constructs for parallelism. One can add constructs such as Futures to the lambda calculus. Other process calculi have been developed for describing communication and concurrency.

## Semantics

The fact that lambda calculus terms act as functions on other lambda calculus terms, and even on themselves, led to questions about the semantics of the lambda calculus. Could a sensible meaning be assigned to lambda calculus terms? The natural semantics was to find a set D isomorphic to the function space D → D of functions on itself. However, no nontrivial such D can exist, because the set of all functions from D into D has greater cardinality than D.
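The cardinality argument can be sketched as follows, assuming that "nontrivial" means D has at least two distinct elements, here called a and b (names introduced only for this sketch). Every subset S of D determines a different function from D to D, namely the one sending elements of S to a and all other elements to b, so by Cantor's theorem:

$$
|D \to D| \;\ge\; |\{a, b\}^{D}| \;=\; 2^{|D|} \;>\; |D|
$$

Hence no bijection between D and the full function space D → D can exist.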

In the 1970s, Dana Scott showed that, if only continuous functions were considered, a set or domain D with the required property could be found, thus providing a model for the lambda calculus.

This work also formed the basis for the denotational semantics of programming languages.

## Notes and References

1. ^ A. Church, "A set of postulates for the foundation of logic", Annals of Mathematics, Series 2, 33:346–366 (1932).
2. ^ For a full history, see Cardone and Hindley's "History of Lambda-calculus and Combinatory Logic" (2006).
3. ^ A. Church, "An unsolvable problem of elementary number theory", American Journal of Mathematics, Volume 58 (1936), pp. 354-363.
4. ^ A. Church, "A Formulation of the Simple Theory of Types", Journal of Symbolic Logic, Volume 5 (1940).
5. ^ Barendregt, Hendrik Pieter (1984), The Lambda Calculus: Its Syntax and Semantics, Studies in Logic and the Foundations of Mathematics, 103 (Revised ed.), North Holland, Amsterdam. ISBN 0-444-87508-5
6. ^ Selinger, Peter, Lecture Notes on the Lambda Calculus, Department of Mathematics and Statistics, University of Ottawa, p. 9
7. ^ Barendregt, Henk; Barendsen, Erik (March 2000), Introduction to Lambda Calculus
8. ^ de Queiroz, Ruy J.G.B. "A Proof-Theoretic Account of Programming and the Role of Reduction Rules." Dialectica 42(4), pages 265-282, 1988.
9. ^ Felleisen, Matthias; Matthew Flatt (2006). Programming Languages and Lambda Calculi. p. 26.
10. ^ Selinger, Peter, Lecture Notes on the Lambda Calculus, Department of Mathematics and Statistics, University of Ottawa, p. 16
11. ^ Simon Peyton Jones, Implementation of Functional Programming Languages, Prentice Hall, 1987
12. ^ R. Statman, "The typed λ-calculus is not elementary recursive." Theoretical Computer Science 9 (1979), pp. 73-81.
13. ^ F.-R. Sinot. "Director Strings Revisited: A Generic Approach to the Efficient Representation of Free Variables in Higher-order Rewriting." Journal of Logic and Computation 15(2), pages 201-218, 2005.

• Morten Heine Sørensen, Paweł Urzyczyn, Lectures on the Curry-Howard isomorphism, Elsevier, 2006, ISBN 0444520775 is a recent monograph that covers the main topics of lambda calculus from the type-free variety, to most typed lambda calculi, including more recent developments like pure type systems and the lambda cube. It does not cover subtyping extensions.
• Pierce, Benjamin (2002). Types and Programming Languages. MIT Press. ISBN 0-262-16209-1. Covers lambda calculi from a practical type system perspective; some topics like dependent types are only mentioned, but subtyping is an important topic.

Some parts of this article are based on material from FOLDOC, used with permission.

# Simple English

In mathematical logic and computer science, lambda calculus, also written λ-calculus, is a formal system. It was designed to investigate the definition of functions, and how to apply them. It is also a tool for analysing recursion. It was introduced by Alonzo Church and Stephen Cole Kleene in the 1930s. Church used lambda calculus in 1936 to give a negative answer to the Entscheidungsproblem. Lambda calculus can be used to define what a computable function is. No general algorithm can answer the question of whether two lambda calculus expressions are equivalent. This was the first question, even before the halting problem, for which undecidability could be proved. Lambda calculus has greatly influenced functional programming languages, such as Lisp, ML and Haskell.

Lambda calculus can be called the smallest universal programming language. It consists of a single transformation rule (variable substitution) and a single function definition scheme. Lambda calculus is universal in the sense that any computable function can be expressed and evaluated using this formalism. It is thus as powerful as the Turing machine formalism. However, lambda calculus emphasizes the use of transformation rules. It does not care about the actual machine that implements them. It is an approach more related to software than to hardware.