COMP2111
System Modelling and Design
Term 1, 2024

Notes (Week 10 Monday)

These are the notes I wrote during the lecture.

Deal with λ-calculus:
- we have *way* too many slides
- there's no way we'll get through them all
- therefore: whatever we don't to I promise won't be on the exam :)

Once we have the λ-notation for function application,
  we don't need to call functions by name anymore.

  f(x) = x + 5

   I want to write:
      f(3)

   In the "normal" notation I can't do this without separately
   defining the function :(

   With λ-notation I can just write:

      f = λx. x + 5

   I can still write
      f(3)

   Or (equivalently) I can write:

      (λx. x + 5)(3)
      ^ I can *inline* functions: use them in expressions
        without naming them

To evaluate this expression:
      (λx. x + 5) 3

Take the body
  x+5
and replace all occurences of x by 3

                      "beta-reduces to" (more later)
      (λx. x + 5) 3   ↦β  (x+5)[x:=3] = 3 + 5


- Syntax

The λ-calculus terms take three forms:

 - variables (similar to atoms in propositional logic,
              or to variables in predicate logic)
 - function application
 - function abstractions

  (x,y,z ranges over variables)
  (s,t,y ranges over λ-calculus terms)         

  t  ::=    x       -- variable
      |     s t     -- function application
      |     λx. t   -- function abstraction

  ^ That's literally the whole syntax (in most presentations)
  The slides also include constants as a base case
    (will be needed for the HOL material, but we won't get that far)

Some syntactic conventions:

  associativity:
   function application associates *to the left*

    f x y      read this as    (f x) y

We only have single-argument functions λx. t
But: we can encode multi-argument functions by nesting λs

  (λx. λy. x + y)    <-- a function that accepts two arguments
                         and adds them together

  ^ a function, which (given an argument x)
     returns *another function* which given an argument y
       adds them together

  Currying: the technique of encoding multi-argument functions
   by nested function application

   def: f(x,y) = x + y
   app: f(3,5)

  (λx. λy. x + y) 3 5
    read as
  ((λx. λy. x + y) 3) 5 
  ↦β
  (λy. 3 + y) 5
  ↦β
  3 + 5

That's why function application is left-associative.

More conventions:
  - Nested lambdas are often grouped together
      Instead of
        λx. λy. x + y          ∀x. ∀y. x < y
      we often write
        λx y. x + y            ∀x y. x < y
  - The scope of a λ extends as far to the right as possible

      λx. f x y        is the same thing as
      (λx. (f x y))
          which is different from
             (λx. f x) y

- Substitution

   ((λx. x + 1)   x)[x := 2]
         ^ bound  ^ free
   =
    (λx. x + 1) 2

   λx. is a *binder* for x
   in the same way that ∃x or ∀x is

- Semantics (β-reduction)

The semantics is easy in the sense that it's tiny:

  whenever you have a β-redex, you can reduce it.

   A β-redex is an expression of the following form:

       (λx. t) s

  To reduce it, replace all occurences of x in t by s.

    ____________________ (β-rule)
     (λx. t) s ↦β t[x:=s]

There are three more (structural) rules.
  Morally, they say "you can β-reduce anywhere"

          t ↦β t'
    _________________________ (app1)
        s t ↦β s t'

          s ↦β s'
    _________________________ (app2)
        s t ↦β s' t

          s ↦β s'
    _________________________ (lam)
        (λx. s) ↦β (λx. s')



        _____________ β-rule
       (λx. x) 3 ↦ 3
      _____________________ app1
      f ((λx. x) 3) ↦β f 3
   ____________________________ app2
   (f ((λx. x) 3)) 4  ↦ (f 3) 4

Q: If I can β-reduce an expression to another, are they in some sense
   equal?
A: We don't consider them *syntactically* equal,
   but they would be semantically equal.
   In λ-calculus speak, they are:

     alpha-beta equivalent.

   Two terms are
     α-equivalent

      t ≡α t'

    if they differ only in choice of placeholders for bound names

      (λx. x) ≡α (λy. y)

    β-equivalence:

      the smallest equivalence relation that relates β-redexes

Q: can you write a denotation of λ-terms?  ⟦λx. x⟧ : ...
A: yes, but (read some Dana Scott if you really want to know)
   ^ *way* beyond the scope of the course

- Church encodings (aka how do you program in this stupid thing?)

I won't *prove* that we can do all computable functions in λ-calculus.
Instead I will give you an intution for how you might,
 by sketching how to encode some common programming constructs.

  The standard Church encodings of
  - booleans
  - if-then-else        
  - natural numbers
  - recursive functions

For the Church encodings
 cheesy slogan: "A Church encoding is what it does"
   we're going to encode a data type (bool, nat, ..)
   as the thing we typically do to consume it.

   Q: What do you do with a nat in a program?
   A: Iterate over it (repeat something n times)
      therefore: a Church number is a function
                 that repeats another function n times


       Church            Numbers      Peano
       λf x. x           0            Z
       λf x. f x         1            S Z
       λf x. f (f x)     2            S (S Z)
       λf x. f^n x       n            S^n Z

   To illustrate how you can compute on Church numerals,
    let's define
     - successor function

           S  =  λn. λf x. f(n f x)
            (or equivalently)
                 λn. λf x. n f (f x)

            n f x    = apply f to x n times
            f(n f x) = apply f to x n+1 times


       S 0 ≡α
       (λn. λf x. f(n f x)) (λg y. y)
       ^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^
       ↦β
       λf x. f((λg y. y) f x)
               ^^^^^^^^^^^
       ↦β
       λf x. f((λy. y) x)
               ^^^^^^^^^
       ↦β
       λf x. f x
       =
       1

     - addition

        λn m. λf x. n f (m f x)

     Let's do 1+1

     (+) 1 1
     =
     (λn m. λf x. n f (m f x)) 1 1
     ↦β
     (λm. λf x. 1 f (m f x)) 1
     ↦β
     (λf x. 1 f (1 f x))
     =
     (λf x. 1 f ((λf x. f x) f x))
     ↦β
     (λf x. 1 f ((λx. f x) x))
     ↦β
     (λf x. 1 f (f x))
     =
     (λf x. (λf x. f x) f (f x))
            ^^^^^^^^^^^^^
     ↦β
     (λf x. (λx. f x) (f x))        
            ^^^^^^^^^^^^^^^
     ↦β
     (λf x. f (f x))
     =
     2


   Q: What do you do with a bool in a program?
   A: do branching on it
      therefore: a Church boolean is going to be
                 a function that chooses between two options


             λx y. x          true
             λx y. y          false

      negation  ¬b

           λb. λx y. b y x
          or equivalently:
           λb. b false true

       (¬) (true)
       =
       (λb. λx y. b y x) (λz w. z)
       ↦β
       (λx y. (λz w. z) y x)
              ^^^^^^^^^^^
       ↦β
       (λx y. (λw. y) x)
       ↦β
       (λx y. y)
       =
       false

      if-then-else
                if b then c else d  <-- your favourite language
                b c d

It's not obvious at first glance that you could write a λ-term
 whose evaluation doesn't terminate.
But we do have self-application

  (λx. x x) (λx. x x)
  ↦β
  x x[x:=(λx. x x)]
  (λx. x x) (λx. x x)

   ^ basically the λ equivalent of   while(true) do skip;
      or           f(x) = return f(x)

 ^ this is an example of a λ-calculus term with *no normal form*

A *normal form* is a λ-term with no β-redexes

- Fix-point combinators
   ^ potentially diverging λ-terms that do something interesting

  Y combinator is a
   λ-term that satisfies the following equation:

      Y f ≡αβη f(Y f)

       ^ this equation characterises iteration or recursion

      while true do P  ≡  P; while true do P

      (we don't talk about η)

    Y = (λf. (λx. f (x x)) (λx. f(x x)))

    Y g
    =  (λf. (λx. f (x x)) (λx. f(x x))) g
    ↦β (λx. g (x x)) (λx. g(x x))
    ↦β g ((λx. g(x x)) (λx. g(x x)))

    ^ does this really give us recursion?

    f(x) =
      if x = 0 then
        return 3
      else
        return f(x-1)

    // x: the argument to f above
    // f: a function that behaves as the f above                
    Y (λf x. (x = 0) 3 (f(x-1)))
    ≡β
    (λx. (x = 0) 3 ((Y (λf x. (x = 0) 3 (f(x-1))))(x-1)))


- Confluence

  A property that all λ-calculus terms have:
     intuitively: a term cannot have two different normal forms

      (λx. x) ((λy. y) z)   <--- has two β-redexes

      (λx. x) ((λy. y) z) ↦β (λy. y) z ↦β z

      (λx. x) ((λy. y) z) ↦β (λx. x) z ↦β z

  Confluence (aka the Church-Rosser theorem) is the property that
   the choice doesn't matter, in the sense that you'll get on
   different paths initially but you can always get back on
   the same path

               s
              / \
          β  /   \ β
            /     \
            t     u
            \     /
           β*\   / β*
              \ /
               v

   Formally:
      If s ↦β t and s ↦β u,
       then there exists v such that
         t ↦β* v  and  s ↦β* v.

   Intuitively: any choice of eval order yields the same final result.

  You can usually encode loops as tail-recursive functions

    while(x ≠ 0)
      x := x - 1;
      y := x * y;

    f(x,y) =
      if x = 0 then
        return y
      else
        return f(x-1,x*y)


Final point: why λ-calculus (out of the box)
  fails as a foundation of mathematics.

   Y (¬)
   ≡
   (¬) (Y (¬))

   ^ in words:
      there exists a term that's equal to its own negation


   λa b. a a b
   λa b. a b a

2024-04-19 Fri 10:38

Announcements RSS