Notes (Week 10 Monday)
These are the notes I wrote during the lecture.
The deal with the λ-calculus slides:
- we have *way* too many slides
- there's no way we'll get through them all
- therefore: whatever we don't do, I promise won't be on the exam :)

Once we have the λ-notation for function application, we don't need to
call functions by name anymore.

  f(x) = x + 5

I want to write: f(3)
In the "normal" notation I can't do this without separately defining
the function :(
With λ-notation I can just write:

  f = λx. x + 5

I can still write f(3), or (equivalently) I can write:

  (λx. x + 5)(3)
  ^ I can *inline* functions: use them in expressions without naming
    them

To evaluate this expression:

  (λx. x + 5) 3

take the body x + 5 and replace all occurrences of x by 3:

  (λx. x + 5) 3  ↦β  (x+5)[x:=3]  =  3 + 5
                 ^ "beta-reduces to" (more later)

- Syntax

λ-calculus terms take three forms:
- variables (similar to atoms in propositional logic, or to variables
  in predicate logic)
- function application
- function abstraction

(x, y, z range over variables)
(s, t, u range over λ-calculus terms)

  t ::= x      -- variable
      | s t    -- function application
      | λx. t  -- function abstraction

^ That's literally the whole syntax (in most presentations).
The slides also include constants as a base case (they will be needed
for the HOL material, but we won't get that far).

Some syntactic conventions:
- associativity: function application associates *to the left*
    f x y    read this as (f x) y
- we only have single-argument functions λx. t
  But: we can encode multi-argument functions by nesting λs

    (λx. λy. x + y)
    ^ a function that accepts two arguments and adds them together
    ^ more precisely: a function which, given an argument x, returns
      *another function*, which, given an argument y, adds them
      together

Currying: the technique of encoding multi-argument functions as nested
single-argument functions

  def: f(x,y) = x + y
  app: f(3,5)

  (λx. λy. x + y) 3 5      read as ((λx. λy. x + y) 3) 5
  ↦β (λy. 3 + y) 5
  ↦β 3 + 5

That's why function application is left-associative.

More conventions:
- Nested lambdas are often grouped together. Instead of
    λx. λy. x + y      ∀x. ∀y. x < y
  we often write
    λx y. x + y        ∀x y. x < y
- The scope of a λ extends as far to the right as possible:
    λx. f x y
  is the same thing as (λx. (f x y)),
  which is different from (λx. f x) y.

- Substitution

  ((λx. x + 1) x)[x := 2]
         ^ bound    ^ free
  = (λx. x + 1) 2

λx. is a *binder* for x, in the same way that ∃x or ∀x is.

- Semantics (β-reduction)

The semantics is easy in the sense that it's tiny: whenever you have a
β-redex, you can reduce it. A β-redex is an expression of the
following form:

  (λx. t) s

To reduce it, replace all occurrences of x in t by s:

  ____________________ (β-rule)
  (λx. t) s ↦β t[x:=s]

There are three more (structural) rules. Morally, they say "you can
β-reduce anywhere":

     t ↦β t'               s ↦β s'                s ↦β s'
  ______________ (app1)  ______________ (app2)  ___________________ (lam)
   s t ↦β s t'            s t ↦β s' t           (λx. s) ↦β (λx. s')

Example derivation:

  _______________ β-rule
  (λx. x) 3 ↦β 3
  _____________________ app1
  f ((λx. x) 3) ↦β f 3
  ____________________________ app2
  (f ((λx. x) 3)) 4 ↦β (f 3) 4

Q: If I can β-reduce an expression to another, are they in some sense
   equal?
A: We don't consider them *syntactically* equal, but they would be
   semantically equal. In λ-calculus speak, they are αβ-equivalent.

Two terms are α-equivalent, t ≡α t', if they differ only in the choice
of placeholders for bound names:

  (λx. x) ≡α (λy. y)

β-equivalence: the smallest equivalence relation that relates each
β-redex to its reduct.

Q: Can you write a denotation of λ-terms?  ⟦λx. x⟧ : ...
A: Yes, but (read some Dana Scott if you really want to know)
   ^ *way* beyond the scope of the course

- Church encodings (aka how do you program in this stupid thing?)

I won't *prove* that every computable function can be written in the
λ-calculus. Instead I will give you an intuition for how you might, by
sketching how to encode some common programming constructs.
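(Side note, not from the lecture: Python's lambda is essentially this
notation, so the inlining and currying conventions above can be played
with directly. A small sketch; note Python evaluates eagerly rather
than by β-reduction, and the name add is mine.)

```python
# Inline, anonymous function: (λx. x + 5) 3
print((lambda x: x + 5)(3))    # 8

# Currying: λx. λy. x + y, applied as ((λx. λy. x + y) 3) 5
add = lambda x: lambda y: x + y
print(add(3)(5))               # 8
```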
The standard Church encodings of:
- booleans
- if-then-else
- natural numbers
- recursive functions

Cheesy slogan for the Church encodings: "a Church encoding is what it
does". We're going to encode a data type (bool, nat, ..) as the thing
we typically do to consume it.

Q: What do you do with a nat in a program?
A: Iterate over it (repeat something n times)
therefore: a Church numeral is a function that repeats another
function n times.

  Church numeral     Peano
  λf x. x            0        Z
  λf x. f x          1        S Z
  λf x. f (f x)      2        S (S Z)
  λf x. f^n x        n        S^n Z

To illustrate how you can compute on Church numerals, let's define
- the successor function

    S = λn. λf x. f (n f x)      (or equivalently: λn. λf x. n f (f x))

    n f x     = apply f to x n times
    f (n f x) = apply f to x n+1 times

    S 0 ≡α (λn. λf x. f (n f x)) (λg y. y)
    ↦β λf x. f ((λg y. y) f x)
    ↦β λf x. f ((λy. y) x)
    ↦β λf x. f x
     = 1

- addition

    (+) = λn m. λf x. n f (m f x)

  Let's do 1+1:

    (+) 1 1 = (λn m. λf x. n f (m f x)) 1 1
    ↦β (λm. λf x. 1 f (m f x)) 1
    ↦β (λf x. 1 f (1 f x))
     = (λf x. 1 f ((λf x. f x) f x))
    ↦β (λf x. 1 f ((λx. f x) x))
    ↦β (λf x. 1 f (f x))
     = (λf x. (λf x. f x) f (f x))
    ↦β (λf x. (λx. f x) (f x))
    ↦β (λf x. f (f x))
     = 2

Q: What do you do with a bool in a program?
A: Branch on it
therefore: a Church boolean is a function that chooses between two
options.

  true  = λx y. x
  false = λx y. y

negation ¬b:

  (¬) = λb. λx y. b y x      (or equivalently: λb. b false true)

  (¬) true = (λb. λx y. b y x) (λz w. z)
  ↦β (λx y. (λz w. z) y x)
  ↦β (λx y. (λw. y) x)
  ↦β (λx y. y)
   = false

if-then-else:

  if b then c else d    <-- your favourite language
  b c d                 <-- λ-calculus

It's not obvious at first glance that you could write a λ-term whose
evaluation doesn't terminate. But we do have self-application:

  (λx. x x) (λx. x x)
  ↦β (x x)[x := (λx. x x)]
   = (λx. x x) (λx. x x)
  ^ basically the λ equivalent of   while(true) do skip;
    or   f(x) = return f(x)
  ^ this is an example of a λ-calculus term with *no normal form*

A *normal form* is a λ-term with no β-redexes.

- Fixpoint combinators
  ^ potentially diverging λ-terms that do something interesting

The Y combinator is a λ-term that satisfies the following equation:

  Y f ≡αβη f (Y f)
  ^ this equation characterises iteration or recursion:
    while true do P  ≡  P; while true do P
  (we don't talk about η)

  Y = λf. (λx. f (x x)) (λx. f (x x))

  Y g = (λf. (λx. f (x x)) (λx. f (x x))) g
  ↦β (λx. g (x x)) (λx. g (x x))
  ↦β g ((λx. g (x x)) (λx. g (x x)))

^ does this really give us recursion?

  f(x) = if x = 0 then return 3 else return f(x-1)

  Y (λf x. (x = 0) 3 (f(x-1)))
     // x: the argument to f above
     // f: a function that behaves as the f above
  ≡β λx. (x = 0) 3 ((Y (λf x. (x = 0) 3 (f(x-1)))) (x-1))

- Confluence

A property that all λ-calculus terms have.
Intuitively: a term cannot have two different normal forms.

  (λx. x) ((λy. y) z)    <-- has two β-redexes

  (λx. x) ((λy. y) z) ↦β (λy. y) z ↦β z
  (λx. x) ((λy. y) z) ↦β (λx. x) z ↦β z

Confluence (aka the Church-Rosser theorem) is the property that the
choice doesn't matter: different choices may diverge initially, but
they can always be brought back together.

         s
        / \
     β /   \ β
      /     \
     t       u
      \     /
   β* \   / β*
        \ /
         v

Formally: if s ↦β t and s ↦β u, then there exists v such that
t ↦β* v and u ↦β* v.
Intuitively: any choice of evaluation order yields the same final
result.

You can usually encode loops as tail-recursive functions:

  while (x ≠ 0) { y := x * y; x := x - 1 }

  f(x,y) = if x = 0 then return y else return f(x-1, x*y)

Final point: why λ-calculus (out of the box) fails as a foundation of
mathematics.

  Y (¬) ≡ (¬) (Y (¬))
  ^ in words: there exists a term that's equal to its own negation

Stray board fragments (on Church booleans):

  λa b. a a b    <-- "or"
  λa b. a b a    <-- "and"
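(Not from the lecture: the Church encodings above transcribe directly
into Python lambdas. A sketch; the names zero, succ, plus, true_,
false_, neg, to_int, to_bool are my own, and the two to_* helpers just
convert a Church value back to a Python value so we can inspect it.)

```python
# Church numerals: n = λf x. f^n x
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))               # S = λn. λf x. f (n f x)
plus = lambda n: lambda m: lambda f: lambda x: n(f)(m(f)(x))  # λn m. λf x. n f (m f x)

# read off a Church numeral by iterating (+1) over 0
to_int = lambda n: n(lambda k: k + 1)(0)

one = succ(zero)
two = plus(one)(one)
print(to_int(two))          # 2

# Church booleans: true = λx y. x, false = λx y. y
true_  = lambda x: lambda y: x
false_ = lambda x: lambda y: y
neg    = lambda b: lambda x: lambda y: b(y)(x)                # ¬ = λb. λx y. b y x

# read off a Church boolean by handing it the two Python branches
to_bool = lambda b: b(True)(False)
print(to_bool(neg(true_)))  # False
```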
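(Also not from the lecture: the Y combinator as written loops forever
in an eagerly evaluated language like Python, because Y f keeps
expanding before it is needed. The standard workaround is the
η-expanded variant, usually called the Z combinator. A sketch, driving
the f(x) = if x = 0 then 3 else f(x-1) example from above; the names
Z, g, const3 are mine.)

```python
# Z combinator: η-expanded Y for strict languages
# Z = λf. (λx. f (λv. x x v)) (λx. f (λv. x x v))
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# g = λf x. if x = 0 then 3 else f(x-1)
g = lambda f: lambda x: 3 if x == 0 else f(x - 1)

const3 = Z(g)       # tie the recursive knot without naming const3 inside g
print(const3(7))    # 3
```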
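(The loop-to-tail-recursion encoding above, written out as a runnable
sketch; this is my transcription, not lecture code.)

```python
# while (x ≠ 0) { y := x * y; x := x - 1 }  as a tail-recursive function
def f(x, y):
    return y if x == 0 else f(x - 1, x * y)

print(f(5, 1))   # 120, i.e. 5!
```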