Performance Tuning

COMP3311 20T3 ♢ Performance Tuning ♢ [0/14]
❖ DB Application Performance

In order to make DB applications efficient, it is useful to know:

and then, "encourage" the DBMS to use the most efficient methods

Achieve by using indexes and avoiding certain SQL query structures

COMP3311 20T3 ♢ Performance Tuning ♢ [1/14]
❖ DB Application Performance (cont)

Application programmer choices that affect query cost:

COMP3311 20T3 ♢ Performance Tuning ♢ [2/14]
❖ DB Application Performance (cont)


Whatever you do as a DB application programmer

You have no control over the optimisation process
COMP3311 20T3 ♢ Performance Tuning ♢ [3/14]
❖ DB Application Performance (cont)

Example: query to find sales people earning more than $50K

select name from Employee
where  salary > 50000 and
       empid in (select empid from Worksin
                 where  dept = 'Sales')

A query evaluator might use the strategy

SalesEmps = (select empid from WorksIn where dept='Sales')
foreach e in Employee {
    if (e.empid in SalesEmps && e.salary > 50000)
        add e to result set
}

Needs to examine all  employees, even if not in Sales

This is not a good expression of the query.

COMP3311 20T3 ♢ Performance Tuning ♢ [4/14]
❖ DB Application Performance (cont)

A different expression of the same query:

select name
from   Employee join WorksIn using (empid)
where  Employee.salary > 5000 and
       WorksIn.dept = 'Sales'

Query evaluator might use the strategy

SalesEmps = (select * from WorksIn where dept='Sales')
foreach e in (Employee join SalesEmps) {
    if (e.salary > 50000)
        add e to result set
}

Only examines Sales employees, and uses a simpler test

This is a good expression of the query.

COMP3311 20T3 ♢ Performance Tuning ♢ [5/14]
❖ DB Application Performance (cont)

A very poor expression of the query (correlated subquery):

select name from Employee e
where  salary > 50000 and
       'Sales' in (select dept from Worksin where empid=e.id)

A query evaluator would be forced to use the strategy:

foreach e in Employee {
    Depts = (select dept from WorksIn where empid=e.empid)
    if ('Sales' in Depts && e.salary > 50000)
        add e to result set
}

Needs to run a query for every  employee ...

COMP3311 20T3 ♢ Performance Tuning ♢ [6/14]
❖ Indexes

Indexes provide efficient content-based access to tuples.

Can build indexes on any (combination of) attributes.

Definining indexes:

CREATE INDEX name ON table ( attr1, attr2, ... )

attri  can be an arbitrary expression (e.g. upper(name)).

CREATE INDEX also allows us to specify

COMP3311 20T3 ♢ Performance Tuning ♢ [7/14]
❖ Indexes (cont)

Indexes can significantly improve query costs.

Considerations in applying indexes:

COMP3311 20T3 ♢ Performance Tuning ♢ [8/14]
❖ Query Tuning

Sometimes, a query can be re-phrased to affect performance:

Examples which may  prevent optimiser from using indexes:

select name from Employee where salary/365 > 100
       -- fix by re-phrasing condition to (salary > 36500)
select name from Employee where name like '%ith%'
select name from Employee where birthday is null
       -- above two are difficult to "fix"
select name from Employee
where  dept in (select id from Dept where ...)
       -- fix by using Employee join Dept on (e.dept=d.id)

COMP3311 20T3 ♢ Performance Tuning ♢ [9/14]
❖ Query Tuning (cont)

Other tricks in query tuning (effectiveness is DBMS-dependent)

COMP3311 20T3 ♢ Performance Tuning ♢ [10/14]
❖ PostgreSQL Performance Analysis

PostgreSQL provides the explain statement to

Usage:

EXPLAIN [ANALYZE] Query

Without ANALYZE, EXPLAIN shows plan with estimated costs.

With ANALYZE, EXPLAIN executes query and prints real costs.

Note that runtimes may show considerable variation due to buffering.

If simply knowing the runtime is ok, maybe \timing is good enough

COMP3311 20T3 ♢ Performance Tuning ♢ [11/14]
❖ EXPLAIN Examples

Note that PostgreSQL builds a query evaluation tree, rather than a linear plan, e.g.

[Diagram:Pics/dbms/pg-plan.png]

EXPLAIN effectively shows a pre-order traversal of the plan tree

COMP3311 20T3 ♢ Performance Tuning ♢ [12/14]
❖ EXPLAIN Examples (cont)

Example: Select on indexed attribute

db=# explain analyze select * from Students where id=100250;
                            QUERY PLAN
------------------------------------------------------------
 Index Scan using student_pkey on student
                  (cost=0.00..5.94 rows=1 width=17)
                  (actual time=3.209..3.212 rows=1 loops=1)
   Index Cond: (id = 100250)
 Total runtime: 3.252 ms

Example: Select on non-indexed attribute

db=# explain analyze select * from Students where stype='local';
                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on student  (cost=0.00..70.33 rows=18 width=17)
             (actual time=0.061..7.784 rows=2512 loops=1)
   Filter: ((stype)::text = 'local'::text)
 Total runtime: 7.554 ms

COMP3311 20T3 ♢ Performance Tuning ♢ [13/14]
❖ EXPLAIN Examples (cont)

Example: Join on a primary key (indexed) attribute

db=# explain
db-# select s.sid,p.name
db-# from Students s join People p on s.id=p.id;

                        QUERY PLAN
-------------------------------------------------------
 Hash Join  (cost=70.33..305.86 rows=3626 width=52)
   Hash Cond: ("outer".id = "inner".id)
   -> Seq Scan on people p
               (cost=0.00..153.01 rows=3701 width=52)
   -> Hash  (cost=61.26..61.26 rows=3626 width=8)
       -> Seq Scan on student s
                   (cost=0.00..61.26 rows=3626 width=8)

COMP3311 20T3 ♢ Performance Tuning ♢ [14/14]


Produced: 12 Nov 2020