COMP9315 Week 08 Thursday Lecture

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [0/63]
❖ Things To Note

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [1/63]
❖ Query Optimization

Input: relational algebra expression tree

Output: tree of instantiated relation operations

[Diagram:Pics/qproc/query-transform.png]

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [2/63]
❖ Query Optimization (cont)

Where query optimisation fits in query evaluation:

[Diagram:Pics/qproc/qproc2.png]

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [3/63]
❖ Query Optimization (cont)

Approximate algorithm for cost-based optimisation:

translate SQL query to RAexp
for enough transformations RA' of RAexp {
   while (more choices for RelOps in RA') {
      Plan = {}
      for each node N of RA' (recursively) {
         ROp = select RelOp method for N
         Plan = Plan ∪ ROp
      }
      cost = 0
      for each node P of Plan (bottom-up) 
         { cost += Cost(P) // using child info }
      if (cost < MinCost)
         { MinCost = cost;  BestPlan = Plan }
   }
}

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [4/63]
❖ Cost Models and Analysis

The cost of evaluating a query is determined by:

Analysis of costs involves estimating:
COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [5/63]
❖ Choosing Access Methods (RelOps)

Performed for each node in RA expression tree ...

Inputs:

Output:
COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [6/63]
❖ Choosing Access Methods (RelOps) (cont)

Example:

giving

tmp[i]   := BtreeSearch[name='John'](Student)
tmp[i+1] := LinearSearch[age>21](tmp[i])

Where possible, use pipelining to avoid storing tmp[i] on disk.

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [7/63]
❖ Choosing Access Methods (RelOps) (cont)

Rules for choosing σ access methods:

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [8/63]
❖ Choosing Access Methods (RelOps) (cont)

Rules for choosing access methods:

 
(bnl = block nested loop;   inl = index nested loop;   sm = sort merge)
COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [9/63]
❖ Cost Estimation

Without executing a plan, cannot always know its precise cost.

Thus, query optimisers estimate costs via:

Result size estimated by statistical measures on relations, e.g.
rS cardinality of relation S
RS avg size of tuple in relation S
V(A,S) # distinct values of attribute A
min(A,S) min value of attribute A
max(A,S) max value of attribute A
COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [10/63]
❖ Estimating Projection Result Size

Straightforward, since we know:

Assume page size B,   bout = ceil(rT / cout),   where cout = floor(B/Rout)

If using select distinct ...

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [11/63]
❖ Estimating Selection Result Size

Selectivity = fraction of tuples expected to satisfy a condition.

Common assumption: attribute values uniformly distributed.

Example: Consider the query

select * from Parts where colour='Red'

If   V(colour,Parts)=4,   r=1000  ⇒  |σcolour=red(Parts)|=250

In general, | σA=c(R) |  ≅  rR / V(A,R)


Heuristic used by PostgreSQL: | σA=c(R) |  ≅  r/10    (unless primary key)

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [12/63]
❖ Estimating Selection Result Size (cont)

Estimating size of result for e.g.

select * from Enrolment where year > 2015;

Could estimate by using:

Assume: min(year)=2010, max(year)=2019, |Enrolment|=105
Heuristic used by some systems:   | σA>c(R) | ≅ r/3
COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [13/63]
❖ Estimating Selection Result Size (cont)

Estimating size of result for e.g.

select * from Enrolment where course <> 'COMP9315';

Could estimate by using:

e.g. | V(course,Enrolment) | = 2000,   | σA<>c(E) | = r * 1999/2000


Heuristic used by some systems:   | σA<>c(R) | ≅ r

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [14/63]
❖ Exercise: Selection Size Estimation

Assuming that

Give formulae for the number of expected results for
  1. select * from R where not A=k
  2. select * from R where A=k and B=j
  3. select * from R where A in (k,l,m,n)
where j, k, l, m, n are constants.

Assume: V(A,R) = 10  and V(B,R)=100  and r=1000

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [15/63]
❖ Selection Size Estimation

How to handle non-uniform attribute value distributions?

So, for part colour example, might have distribution like:

White: 35%   Red: 30%   Blue: 25%   Silver: 10%

Use histogram as basis for determining # selected tuples.

Disadvantage: cost of storing/maintaining histograms.

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [16/63]
❖ Exercise: Selection Size Estimation (ii)

Given a relation R with attribute X whose statistics are:

Estimate the size of the result of the following queries:
  1. select * from R where X is not null
  2. select * from R where X >= 'c'
  3. select * from R where X between 'b' and 'd'
COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [17/63]
❖ Selection Size Estimation

Summary: analysis relies on operation and data distribution:

E.g. select * from R where a = k;

Case 1:   uniq(R.a)   ⇒   0 or 1 result

Case 2:   rR  tuples && size(dom(R.a)) = n   ⇒   rR / n results

E.g. select * from R where a < k;

Case 1:   k ≤ min(R.a)   ⇒   0 results

Case 2:   k > max(R.a)   ⇒   ≅ rR results

Case 3:   size(dom(R.a)) = n   ⇒   ? min(R.a) ... k ... max(R.a) ?

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [18/63]
❖ Estimating Join Result Size

Analysis relies on semantic knowledge about data/relations.

Consider equijoin on common attr: R ⋈a S

Case 1:   values(R.a) ∩ values(S.a) = {}   ⇒   size(R ⋈a S) = 0

Case 2:   uniq(R.a) and uniq(S.a)   ⇒   size(R ⋈a S)  ≤  min(|R|, |S|)

Case 3:   pkey(R.a) and fkey(S.a)   ⇒   size(R ⋈a S)  ≤  |S|

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [19/63]
❖ Exercise: Join Size Estimation

How many tuples are in the output from:

  1. select * from R, S where R.s = S.id
    where S.id is a primary key and R.s is a foreign key referencing S.id
  2. select * from R, S where R.s <> S.id
    where S.id is a primary key and R.s is a foreign key referencing S.id
  3. select * from R, S where R.x = S.y
    where R.x and S.y have no connection except that dom(R.x)=dom(S.y)
Under what conditions will the first query have maximum size?
COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [20/63]
❖ Cost Estimation: Postscript

Inaccurate cost estimation can lead to poor evaluation plans.

Above methods can (sometimes) give inaccurate estimates.

To get more accurate cost estimates:

Either way, optimisation process costs more (more than query?)

Trade-off between optimiser performance and query performance.

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [21/63]
❖ PostgreSQL Query Optimiser

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [22/63]
❖ Overview of QOpt Process

Input: tree of Query nodes returned by parser

Output: tree of Plan nodes used by query executor

Intermediate data structures are trees of Path nodes All Node types are defined in include/nodes/*.h
COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [23/63]
❖ Overview of QOpt Process (cont)

 

[Diagram:Pics/qproc/qopt-trees1.png]

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [24/63]
❖ QOpt Data Structures

Generic Path node structure:


typedef struct Path
{
   NodeTag     type;          // scan/join/...
   NodeTag     pathtype;      // specific method
   RelOptInfo *parent;        // output relation
   PathTarget *pathtarget;    // list of Vars/Exprs, width 
   // estimated execution costs for path ...
   Cost        startup_cost;  // setup cost
   Cost        total_cost;    // total cost
   List       *pathkeys;      // sort order
} Path;


PathKey = (opfamily:Oid, strategy:SortDir, nulls_first:bool)

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [25/63]
❖ QOpt Data Structures (cont)

Specialised Path nodes (simplified):


typedef struct IndexPath
{
   Path    path;
   IndexOptInfo *indexinfo; // index for scan 
   List   *indexclauses; // index select conditions 
   ...
   ScanDirection  indexscandir; // used by planner 
   Selectivity  indexselectivity; // estimated #results 
} IndexPath;

typedef struct JoinPath
{
   Path      path;
   JoinType  jointype;     // inner/outer/semi/anti 
   Path     *outerpath;    // outer part of the join 
   Path     *innerpath;    // inner part of the join 
   List     *restrictinfo; // join condition(s) 
} JoinPath;

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [26/63]
❖ Query Optimisation Process

Query optimisation proceeds in two stages (after parsing)...

Rewriting:

Planning and optimisation: Then produces a Plan tree from the selected path.
COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [27/63]
❖ Top-down Trace of QOpt

Top-level of query execution: backend/tcop/postgres.c


exec_simple_query(const char *query_string)
{
  // lots of setting up ... including starting xact
  parsetree_list = pg_parse_query(query_string);
  foreach(parsetree, parsetree_list) {
    // Query optimisation
    querytree_list = pg_analyze_and_rewrite(parsetree,...);
    plantree_list = pg_plan_queries(querytree_list,...);
    // Query execution
    portal = CreatePortal(...plantree_list...);
    PortalRun(portal,...);
  }
  // lots of cleaning up ... including close xact
}

Assumes that we are dealing with multiple queries (i.e. SQL statements)

COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [28/63]
❖ Top-down Trace of QOpt (cont)

query_planner() produces plan for a select/join tree

  • invoke make_one_rel() to find best path/plan

    Code in: backend/optimizer/plan/planmain.c

  • COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [29/63]
    ❖ Top-down Trace of QOpt (cont)

    make_one_rel() generates possible plans, selects best

    Code in: backend/optimizer/path/allpaths.c
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [30/63]
    ❖ Join-tree Generation

    make_rel_from_joinlist() arranges path generation

    Standard path tree generator (standard_join_search()): Code in:   backend/optimizer/path/{allpaths.c,joinrels.c}
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [31/63]
    ❖ Join-tree Generation (cont)

    Genetic query optimiser (geqo):

    Code in:   backend/optimizer/geqo/*.c
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [32/63]
    ❖ Join-tree Generation (cont)

    Basic idea of genetic algorithm:

    Optimize(join)
    {
       t = 0
       p = initialState(join)  // pool of (random) join orders
       for (t = 0; t < #generations; t++) {
          p' = recombination(p) // get parts of good join orders 
          p'' = mutation(p')    // generate new variations 
          p = selection(p'',p)  // choose best join orders 
       }
    }
    


    #generations determined by size of initial pool of join orders

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [33/63]
    ❖ Query Execution

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [34/63]
    ❖ Query Execution

    Query execution:   applies evaluation plan result tuples

    [Diagram:Pics/qproc/qproc3.png]

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [35/63]
    ❖ Query Execution (cont)

    Example of query translation:

    select s.name, s.id, e.course, e.mark
    from   Student s, Enrolment e
    where  e.student = s.id and e.semester = '05s2';
    

    maps to

    πname,id,course,mark(Stu ⋈e.student=s.id (σsemester=05s2Enr))

    maps to

    Temp1  = BtreeSelect[semester=05s2](Enr)
    Temp2  = HashJoin[e.student=s.id](Stu,Temp1)
    Result = Project[name,id,course,mark](Temp2)
    

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [36/63]
    ❖ Query Execution (cont)

    A query execution plan:

    Results may be passed from one operator to the next:
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [37/63]
    ❖ Materialization

    Steps in materialization between two operators

    Advantage: Disadvantage:
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [38/63]
    ❖ Pipelining

    How pipelining is organised between two operators:

    Advantage: Disadvantage:
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [39/63]
    ❖ Iterators (reminder)

    Iterators provide a "stream" of results:

    Other possible operations: reset to specific point, restart, ...
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [40/63]
    ❖ Pipelining Example

    Consider the query:

    select s.id, e.course, e.mark
    from   Student s, Enrolment e
    where  e.student = s.id and
           e.semester = '05s2' and s.name = 'John';
    

    which maps to the RA expression

    Proj[id,course,mark](Join[student=id](Sel[05s2](Enr),Sel[John](Stu)))
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [41/63]
    ❖ Pipelining Example (cont)

    Evaluated via communication between RA tree nodes:

    [Diagram:Pics/qproc/qtree.png]



    Note: likely that projection is combined with join in PostgreSQL

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [42/63]
    ❖ Disk Accesses

    Pipelining cannot avoid all disk accesses.

    Some operations use multiple passes (e.g. merge-sort, hash-join).

    Thus ...
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [43/63]
    ❖ PostgreSQL Query Execution

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [44/63]
    ❖ PostgreSQL Query Execution

    Defs: src/include/executor and src/include/nodes

    Code: src/backend/executor

    PostgreSQL uses pipelining ...

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [45/63]
    ❖ PostgreSQL Executor

    Modules in src/backend/executor fall into two groups:

    execXXX (e.g. execMain, execProcnode, execScan)

    nodeXXX   (e.g. nodeSeqscan, nodeNestloop, nodeGroup)
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [46/63]
    ❖ Example PostgreSQL Execution

    Consider the query:

    -- get manager's age and # employees in Shoe department
    select e.age, d.nemps
    from   Departments d, Employees e
    where  e.name = d.manager and d.name ='Shoe'
    

    and its execution plan tree

    [Diagram:Pics/qproc/qtree1.png]

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [47/63]
    ❖ Example PostgreSQL Execution (cont)

    The execution plan tree

    [Diagram:Pics/qproc/qtree1.png]

    contains three nodes:

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [48/63]
    ❖ Example PostgreSQL Execution (cont)

    Initially InitPlan() invokes ExecInitNode() on plan tree root.

    ExecInitNode() sees a NestedLoop node ...
       so dispatches to ExecInitNestLoop() to set up iterator
       then invokes ExecInitNode() on left and right sub-plans
           in left subPlan, ExecInitNode() sees an IndexScan node
            so dispatches to ExecInitIndexScan() to set up iterator
           in right sub-plan, ExecInitNode() sees a SeqScan node
            so dispatches to ExecInitSeqScan() to set up iterator

    Result: a plan state tree with same structure as plan tree.

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [49/63]
    ❖ Example PostgreSQL Execution (cont)

    Then ExecutePlan() repeatedly invokes ExecProcNode().

    ExecProcNode() sees a NestedLoop node ...
       so dispatches to ExecNestedLoop() to get next tuple
       which invokes ExecProcNode() on its sub-plans
           in left sub-plan, ExecProcNode() sees an IndexScan node
                so dispatches to ExecIndexScan() to get next tuple
                if no more tuples, return END
                for this tuple, invoke ExecProcNode() on right sub-plan
                    ExecProcNode() sees a SeqScan node
                        so dispatches to ExecSeqScan() to get next tuple
                        check for match and return joined tuples if found
                        continue scan until end
                    reset right sub-plan iterator

    Result: stream of result tuples returned via ExecutePlan()

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [50/63]
    ❖ Query Performance

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [51/63]
    ❖ Performance Tuning

    How to make a database-backed system perform "better"?

    Improving performance may involve any/all of:

    Remembering that, to some extent ...
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [52/63]
    ❖ Performance Tuning (cont)

    Tuning requires us to consider the following:

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [53/63]
    ❖ Performance Tuning (cont)

    Performance can be considered at two times:

    Difficult to predict what query optimiser will do, so ...
    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [54/63]
    ❖ PostgreSQL Query Tuning

    PostgreSQL provides the explain statement to

    Usage:

    EXPLAIN [ANALYZE] Query
    

    Without ANALYZE, EXPLAIN shows plan with estimated costs.

    With ANALYZE, EXPLAIN executes query and prints real costs.

    Note that runtimes may show considerable variation due to buffering.

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [55/63]
    ❖ EXPLAIN Examples

    Database

    
    people(id, family, given, title, name, ..., birthday)
    courses(id, subject, semester, homepage) 
    course_enrolments(student, course, mark, grade, ...) 
    subjects(id, code, name, longname, uoc, offeredby, ...)
    ...
    

    where

    
           table_name          | n_records 
    ---------------------------+-----------
     people                    |     55767
     courses                   |     73220
     course_enrolments         |    525688
     subjects                  |     18525
    ...
    

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [56/63]
    ❖ EXPLAIN Examples (cont)

    Example: Select on non-indexed attribute

    
    uni=# explain
    uni=# select * from Students where stype='local';
                         QUERY PLAN
    ----------------------------------------------------
     Seq Scan on students
                 (cost=0.00..562.01 rows=23543 width=9)
       Filter: ((stype)::text = 'local'::text)
    

    where

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [57/63]
    ❖ EXPLAIN Examples (cont)

    More notes on explain output:

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [58/63]
    ❖ EXPLAIN Examples (cont)

    Example: Select on non-indexed attribute with actual costs

    
    uni=# explain analyze
    uni=# select * from Students where stype='local';
                           QUERY PLAN
    ----------------------------------------------------------
     Seq Scan on students
                 (cost=0.00..562.01 rows=23543 width=9)
                 (actual time=0.011..4.704 rows=23551 loops=1)
       Filter: ((stype)::text = 'local'::text)
       Rows Removed by Filter: 7810
     Planning time: 0.054 ms
     Execution time: 5.875 ms
    

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [59/63]
    ❖ EXPLAIN Examples (cont)

    Example: Select on indexed, unique attribute

    
    uni=# explain analyze
    uni-# select * from Students where id=100250;
                           QUERY PLAN
    -------------------------------------------------------
     Index Scan using student_pkey on student
                (cost=0.00..8.27 rows=1 width=9)
                (actual time=0.049..0.049 rows=0 loops=1)
       Index Cond: (id = 100250)
     Planning Time: 0.088 ms
     Execution Time: 0.057 ms
    

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [60/63]
    ❖ EXPLAIN Examples (cont)

    Example: Select on indexed, unique attribute

    
    uni=# explain analyze
    uni-# select * from Students where id=1216988;
                           QUERY PLAN
    -------------------------------------------------------
     Index Scan using students_pkey on students
                      (cost=0.29..8.30 rows=1 width=9)
                      (actual time=0.011..0.012 rows=1 loops=1)
       Index Cond: (id = 1216988)
     Planning time: 0.066 ms
     Execution time: 0.062 ms
    

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [61/63]
    ❖ EXPLAIN Examples (cont)

    Example: Join on a primary key (indexed) attribute  (2016)

    
    uni=# explain analyze
    uni-# select s.id,p.name
    uni-# from Students s, People p where s.id=p.id;
                          QUERY PLAN
    ----------------------------------------------------------
    Hash Join (cost=988.58..3112.76 rows=31048 width=19)
              (actual time=11.504..39.478 rows=31048 loops=1)
      Hash Cond: (p.id = s.id)
      -> Seq Scan on people p
             (cost=0.00..989.97 rows=36497 width=19)
             (actual time=0.016..8.312 rows=36497 loops=1)
      -> Hash (cost=478.48..478.48 rows=31048 width=4)
              (actual time=10.532..10.532 rows=31048 loops=1)
              Buckets: 4096  Batches: 2  Memory Usage: 548kB
          ->  Seq Scan on students s 
                  (cost=0.00..478.48 rows=31048 width=4)
                  (actual time=0.005..4.630 rows=31048 loops=1)
    Total runtime: 41.0 ms
    

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [62/63]
    ❖ EXPLAIN Examples (cont)

    Example: Join on a primary key (indexed) attribute  (2018)

    
    uni=# explain analyze
    uni-# select s.id,p.name
    uni-# from Students s, People p where s.id=p.id;
                          QUERY PLAN
    ----------------------------------------------------------
    Merge Join  (cost=0.58..2829.25 rows=31361 width=18)
                (actual time=0.044..25.883 rows=31361 loops=1)
      Merge Cond: (s.id = p.id)
      ->  Index Only Scan using students_pkey on students s
                (cost=0.29..995.70 rows=31361 width=4)
                (actual time=0.033..6.195 rows=31361 loops=1)
            Heap Fetches: 31361
      ->  Index Scan using people_pkey on people p
                (cost=0.29..2434.49 rows=55767 width=18)
                (actual time=0.006..6.662 rows=31361 loops=1)
    Planning time: 0.259 ms
    Execution time: 27.327 ms
    

    COMP9315 24T1 ♢ Week 8 Thursday Lecture ♢ [63/63]


    Produced: 4 Apr 2024