COMP3311 Week 9 Wednesday Lecture

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [0/36]
❖ Week 09 Wednesday

In today's lecture ...

Things to do ...

Things to note ...

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [1/36]
❖ PostgreSQL Query Costs

PostgreSQL provides the explain statement to

Usage:

EXPLAIN [ANALYZE] Query

Without ANALYZE, EXPLAIN shows plan with estimated costs.

With ANALYZE, EXPLAIN executes query and prints real costs.

Note that runtimes may show considerable variation due to buffering.

If simply want to know the runtime is ok, maybe \timing is good enough

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [2/36]
❖ EXPLAIN Examples

Note that PostgreSQL builds a query evaluation tree, rather than a linear plan, e.g.

[Diagram:Pics/dbms/pg-plan.png]

EXPLAIN effectively shows a pre-order traversal of the plan tree

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [3/36]
❖ EXPLAIN Examples (cont)

Example: Select on indexed attribute

db=# explain select * from Students where id=123475;
                          QUERY PLAN
------------------------------------------------------------
 Index Scan using students_pkey on students
                  (cost=0.28..8.30 rows=1 width=8)
   Index Cond: (id = 123475)

db=# explain analyze select * from Students where id=123475;
                          QUERY PLAN
------------------------------------------------------------
 Total runtime: 3.252 ms
 Index Scan using students_pkey on students
                  (cost=0.28..8.30 rows=1 width=8)
                  (actual time=0.011..0.012 rows=1 loops=1)
   Index Cond: (id = 123475)
 Planning Time: 0.058 ms
 Execution Time: 0.025 ms

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [4/36]
❖ EXPLAIN Examples (cont)

Example: Select on non-indexed attribute

db=#  explain select * from people where origin=13;
                         QUERY PLAN                         
------------------------------------------------------------
 Seq Scan on people  (cost=0.00..178.35 rows=1323 width=37)
   Filter: (origin = 13)

db=# explain analyze select * from people where origin=13;
                        QUERY PLAN                            
----------------------------------------------------------
 Seq Scan on people
          (cost=0.00..178.35 rows=1323 width=37)
          (actual time=0.173..1.214 rows=1323 loops=1)
   Filter: (origin = 13)
   Rows Removed by Filter: 7185
 Planning Time: 0.058 ms
 Execution Time: 1.299 ms

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [5/36]
❖ EXPLAIN Examples (cont)

Example: aggregate on an indexed attribute

db=#  explain analyze select max(id) from People;
                          QUERY PLAN      
--------------------------------------------------------------------
 Result  (cost=0.31..0.32 rows=1 width=4)
         (actual time=0.020..0.021 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.29..0.31 rows=1 width=4)
                (actual time=0.017..0.017 rows=1 loops=1)
         ->  Index Only Scan Backward using people_pkey on people
                        (cost=0.29..253. 17 rows=8508 width=4)
                        (actual time=0.016..0.016 rows=1 loops=1)
               Index Cond: (id IS NOT NULL)
               Heap Fetches: 0
 Planning Time: 0.090 ms
 Execution Time: 0.039 ms

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [6/36]
❖ EXPLAIN Examples (cont)

Example: Join on a primary key (indexed) attribute

db=# explain analyze select s.id, p.zid, p.full_name
db=# from Students s join People p on s.id=p.id;
                         QUERY PLAN                     
---------------------------------------------------------------
 Hash Join  (cost=163.91..343.34 rows=6085 width=20)
            (actual time=1.959..5.190 rows=6 085 loops=1)
   Hash Cond: (p.id = s.id)
   ->  Seq Scan on people p
                (cost=0.00..157.08 rows=8508 width=20)
                (actual time=0.00 4..0.910 rows=8508 loops=1)
   ->  Hash  (cost=87.85..87.85 rows=6085 width=4)
             (actual time=1.942..1.943 rows=608 5 loops=1)
         Buckets: 8192  Batches: 1  Memory Usage: 278kB
         ->  Seq Scan on students s
                      (cost=0.00..87.85 rows=6085 width=4)
                      (actual time=0.004..0.864 rows=6085 loops=1)
 Planning Time: 0.178 ms
 Execution Time: 5.569 ms

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [7/36]
❖ EXPLAIN Examples (cont)

Example: Join on a non-indexed attribute

db=# explain analyze
db=# select s1.code, s2.code
db=# from Subjects s1 join Subjects s2 on s1.owner=s2.owner;
                        QUERY PLAN
----------------------------------------------------------------
 Hash Join  (cost=152.58..10257.41 rows=785511 width=18)
            (actual time=1.851..125.883 rows=785511 loops=1)
   Hash Cond: (s1.owner = s2.owner)
   ->  Seq Scan on subjects s1
                (cost=0.00..95.59 rows=4559 width=13)
                (actual time=0.  006..0.563 rows=4559 loops=1)
   ->  Hash  (cost=95.59..95.59 rows=4559 width=13)
             (actual time=1.826..1.827 rows=45 59 loops=1)
         Buckets: 8192  Batches: 1  Memory Usage: 278kB
         ->  Seq Scan on subjects s2
                      (cost=0.00..95.59 rows=4559 width=13)
                      (actual t ime=0.004..0.851 rows=4559 loops=1)
 Planning Time: 0.112 ms
 Execution Time: 168.613 ms

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [8/36]
❖ Exercise: RA Tree to RA Plan

Re-write the above tree-based plans as sequences of RA ops

Example:

Index Scan using students_pkey on students
                 (cost=0.28..8.30 rows=1 width=8)
  Index Cond: (id = 123475)

is simply

Res = Sel[id=123475]Students

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [9/36]
❖ Transactions, Concurrency


DBMSs provide valuable information resources in an environment that is:

Each user should see the system as: Ultimate goal: data integrity is maintained at all times.
COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [10/36]
❖ Transactions, Concurrency (cont)

Transaction processing

Concurrency control Recovery mechanisms
In COMP3311, we consider only transactions and concurrency
COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [11/36]
❖ Transactions

A transaction is

Transactions happen in a multi-user, unreliable environment.

To maintain integrity of data, transactions must be:

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [12/36]
❖ Example Transaction

Bank funds transfer

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [13/36]
❖ Example Transaction (cont)

Example function to implement bank transfer ...

create or replace function
   transfer(N integer, Src text, Dest text)
   returns integer
declare
   sID integer; dID integer; avail integer;
begin
   select id,balance into sID,avail
   from   Accounts where name=Src;
   if (sID is null) then
      raise exception 'Invalid source account %',Src;
   end if;
   select id into dID
   from Accounts where name=Dest;
   if (dID is null) then
      raise exception 'Invalid dest account %',Dest;
   end if;
...

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [14/36]
❖ Example Transaction (cont)

Example function to implement bank transfer (cont)...

...
   if (avail < N) then
      raise exception 'Insufficient funds in %',Src;
   end if;
   -- total funds in system = NNNN
   update Accounts set balance = balance-N
   where  id = sID;
   -- funds temporarily "lost" from system
   update Accounts set balance = balance+N
   where  id = dID;
   -- funds restored to system; total funds = NNNN
   return nextval('tx_id_seq');
end;

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [15/36]
❖ Exercise: Transaction Failures

Describe any problems from the transaction failing after:

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [16/36]
❖ Transaction Concepts

A transaction must always terminate, either:


[Diagram:Pics/xact/tx-states1.png]

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [17/36]
❖ Transaction Concepts (cont)

To describe transaction effects, we consider:

Normally abbreviated to R(X), W(X), A, C

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [18/36]
❖ Transaction Consistency

Transactions typically have intermediate states that are invalid.

However, states before and after transaction must be valid.


[Diagram:Pics/xact/tx-exec1.png]



Valid = consistent = satisfying all specified constraints on the data

COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [19/36]
❖ Transaction Consistency (cont)

Transaction descriptions can be abstracted

A schedule defines ... Abribtrary interleaving of operations causes anomalies, so that ...
  • two consistency-preserving transactions
  • produce a final state which is not consistent
  • COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [20/36]
    ❖ Serial Schedules

    Serial execution:   T1 then T2   or   T2 then T1

    T1: R(X) W(X) R(Y) W(Y)
    T2:                     R(X) W(X)
    

    or

    T1:           R(X) W(X) R(Y) W(Y)
    T2: R(X) W(X)
    

    Serial execution guarantees a consistent final state if

    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [21/36]
    ❖ Concurrent Schedules

    Concurrent schedules interleave T1,T2,... operations

    Some concurrent schedules are ok, e.g.

    T1: R(X) W(X)      R(Y)      W(Y)
    T2:           R(X)      W(X)
    

    Other concurrent schedules cause anomalies, e.g.

    T1: R(X)      W(X)      R(Y) W(Y)
    T2:      R(X)      W(X)
    

    Want the system to ensure that only valid schedules occur.

    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [22/36]
    ❖ Serializability

    Serializable schedule:

    Abstracting this needs a notion of schedule equivalence.

    Two common formulations of serializability:

    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [23/36]
    ❖ Conflict Serializability

    A characterization of serializability based on conflicting operations

    Consider two transactions T1 and T2 acting on data item X.

    Possible orders for read/write operations by T1 and T2:

    T1 first   T2 first   Equiv?
    R1(X) R2(X)   R2(X) R1(X)   yes
    R1(X) W2(X)   W2(X) R1(X)   no
    W1(X) R2(X)   R2(X) W1(X)   no
    W1(X) W2(X)   W2(X) W1(X)   no

    Note: if T1 and T2 act on different data items (e.g. R(X), W(Y)), order does not matter.

    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [24/36]
    ❖ Conflict Serializability (cont)

    Two transactions have a potential conflict if

    In such cases, the order of operations affects the result.

    If no conflict, can swap order without affecting the result.

    If we can transform a schedule

    then we say that the schedule is conflict serializible.
    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [25/36]
    ❖ Conflict Serializability (cont)


    Example: transform a concurrent schedule to serial schedule

    T1: R(A) W(A)      R(B)      W(B)
    T2:           R(A)      W(A)      R(B) W(B)
    swap
    T1: R(A) W(A) R(B)           W(B)
    T2:                R(A) W(A)      R(B) W(B)
    swap
    T1: R(A) W(A) R(B)      W(B)
    T2:                R(A)      W(A) R(B) W(B)
    swap
    T1: R(A) W(A) R(B) W(B)
    T2:                     R(A) W(A) R(B) W(B)
    

    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [26/36]
    ❖ Conflict Serializability (cont)

    Checking for conflict-serializability:

    Method for doing this:
    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [27/36]
    ❖ Example Precedence Graph


    Example schedule which is not conflict serializable:

    T1: R(X)           R(Y) W(X)      W(Y)
    T2:           R(X)           W(X)
    T3:      R(X)                          W(X)
    

    Precendence graph for the above schedule:

    [Diagram:Pics/xact/prec-graph.png]

    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [28/36]
    ❖ Exercise: Precedence Graphs


    Build precedence graphs for the following schedules:

    1. T1: R(X) W(X)      R(Y)      W(Y)
      T2:           R(X)      W(X)
      


    2. T1: R(X)      W(X)      R(Y) W(Y)
      T2:      R(X)      W(X)
      

    For each schedule:
    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [29/36]
    ❖ View Serializability

    View Serializability is

    As with CS, it is based on a notion of schedule equivalence The idea: if across the two schedules ... then they are view equivalent
    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [30/36]
    ❖ View Serializability (cont)

    Two schedules S and S' on T1 .. Tn are view equivalent iff

    To check serializibilty of S ...
    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [31/36]
    ❖ Exercise: Checking Serializability

    Check whether each schedule is conflict/view serializable:


    1. T1: R(X)      W(X)
      T2:      R(X)      W(X)
      


    2. T1: W(X)      R(Y)
      T2:      R(Y)      R(X)
      


    3. T1: R(X)      W(X)
      T2:      W(X)
      T3:                W(X)
      

    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [32/36]
    ❖ Concurrency Control


    Serializability tests are useful theoretically ...

    But don't provide a mechanism for organising schedules

    What is required are methods that ...
    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [33/36]
    ❖ Concurrency Control (cont)

    Approaches to ensuring ACID transactions:

    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [34/36]
    ❖ Lock-based Concurrency Control

    Synchronise access to shared data items via following rules:

    These rules alone do not guarantee serializability ...
    Locking also introduces potential for deadlock and starvation.
    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [35/36]
    ❖ Lock-based Concurrency Control (cont)

    Examples of locking:

    Schedule 1
    T1: Lx(X)       R(X)           W(X) U(X)
    T2:       Lx(Y)      R(Y) W(Y)           U(Y)
    
    Schedule 2
    T1: Lx(X)       R(X) W(X) U(X)
    T2:       Lx(X) .............. R(X) W(X) U(X)
    

    New operations: Lx() = exclusive lock, Ls() = shared lock, U() unlock

    Note: in Schedule 2, locking effectively forces serial execution

    This is not generally the case; there may be some concurrency

    In general, locking reduces concurrency, but gains safety

    COMP3311 23T3 ♢ Week 9 Wednesday Lecture ♢ [36/36]


    Produced: 10 Nov 2023