❖ Things To Note |
PersonName ::= Family','Given | Family', 'Given
Family     ::= NameList
Given      ::= NameList
NameList   ::= Name | Name' 'NameList
Name       ::= Upper Letters
Letter     ::= Upper | Lower | Punc
Letters    ::= Letter | Letter Letters
Upper      ::= 'A' | 'B' | ... | 'Z'
Lower      ::= 'a' | 'b' | ... | 'z'
Punc       ::= '-' | "'"
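For example, this grammar accepts "Smith,John" and "O'Brien, Mary-Jane Anne" (hyphens and apostrophes may appear within a name), but rejects "smith,John", since every Name must begin with an uppercase letter.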
❖ Debugging Assignment 1 |
Symptoms: a message like FATAL: the database system ...
Solution: write debugging output to the log
See elog()
Still can't work it out?
give
❖ Scanning Relations |
Example: simple scan of a table ...
select name from Employee
implemented as:
DB db = openDatabase("myDB");
Relation r = openRel(db,"Employee");
Scan s = start_scan(r);
Tuple t;  // current tuple
while ((t = next_tuple(s)) != NULL)
{
   char *name = getStrField(t,2);  // assumes name is field 2
   printf("%s\n", name);
}
❖ Scanning Relations (cont) |
Consider the following simple Scan implementation:
typedef struct {
   Relation rel;
   Page *curPage;  // page buffer
   int   curPID;   // current pid
   int   curTID;   // current tid
} ScanData;

typedef ScanData *Scan;

Scan start_scan(Relation rel)
{
   Scan new = malloc(sizeof(ScanData));
   new->rel = rel;
   new->curPage = get_page(rel,0);
   new->curPID = 0;
   new->curTID = 0;
   return new;
}
❖ Scanning Relations (cont) |
Implementation of next_tuple()
Tuple next_tuple(Scan s)
{
   if (s->curTID == nTuples(s->curPage)) {
      // finished current page, get next
      if (s->curPID + 1 == nPages(s->rel))
         return NULL;   // no pages left
      s->curPID++;
      s->curPage = get_page(s->rel, s->curPID);
      s->curTID = 0;
   }
   Record r = get_record(s->curPage, s->curTID);
   s->curTID++;
   return makeTuple(s->rel, r);
}
❖ Scanning in PostgreSQL |
Scanning defined in: backend/access/heap/heapam.c
Implements iterator data/operations:
HeapScanDesc ... struct holding the state of the scan
scan = heap_beginscan(rel,...,nkeys,keys) ... start a new scan
tup = heap_getnext(scan, direction) ... get next tuple
heap_endscan(scan) ... finish the scan, freeing the scan descriptor
res = HeapKeyTest(tuple,...,nkeys,keys) ... test a tuple against the ScanKeys
❖ Scanning in PostgreSQL (cont) |
typedef struct HeapScanDescData
{
   // scan parameters
   Relation      rs_rd;        // heap relation descriptor
   Snapshot      rs_snapshot;  // snapshot ... tuple visibility
   int           rs_nkeys;     // number of scan keys
   ScanKey       rs_key;       // array of scan key descriptors
   ...
   // state set up at initscan time
   PageNumber    rs_npages;    // number of pages to scan
   PageNumber    rs_startpage; // page # to start at
   ...
   // scan current state, initially set to invalid
   HeapTupleData rs_ctup;      // current tuple in scan
   PageNumber    rs_cpage;     // current page # in scan
   Buffer        rs_cbuf;      // current buffer in scan
   ...
} HeapScanDescData;

typedef HeapScanDescData *HeapScanDesc;
❖ Scanning in other File Structures |
The above examples are for heap files; PostgreSQL implements analogous scanning for its other file structures:
btree
hash
gist
gin
❖ The Sort Operation |
Sorting appears explicitly in queries only via the order by clause, e.g.
select * from Students order by name;
Sorting is also used internally to implement other operations, e.g. group by.
For large data on disks, need external sorts such as merge sort.
❖ Two-way Merge Sort (cont) |
Requires three in-memory buffers:
Assumption: cost of Merge operation on two in-memory buffers ≅ 0.
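A minimal sketch of the merge step in C, assuming two sorted runs already sit in the in-memory input buffers and a two-argument tuple comparator cmp() (all names here are illustrative, not a real DBMS API):

// Merge two sorted runs into the output buffer; with all three
// buffers in memory, this step costs no disk I/O of its own
void merge(Tuple *in1, int n1, Tuple *in2, int n2, Tuple *out)
{
   int i = 0, j = 0, k = 0;
   while (i < n1 && j < n2) {
      if (cmp(in1[i], in2[j]) <= 0)
         out[k++] = in1[i++];
      else
         out[k++] = in2[j++];
   }
   while (i < n1) out[k++] = in1[i++];   // drain leftovers of run 1
   while (j < n2) out[k++] = in2[j++];   // drain leftovers of run 2
}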
❖ Comparison for Sorting |
The above assumes that we have a function to compare tuples,
which understands the ordering on the various attribute data types.
Need a function tupCompare(r1,r2,f) (cf. C's strcmp for strings):
int tupCompare(r1,r2,f)
{
   if (r1.f < r2.f) return -1;
   if (r1.f > r2.f) return 1;
   return 0;
}
Assume =, <, > are available for all attribute types.
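A concrete version in plain C, assuming a simplified in-memory tuple struct (hypothetical; real tuples hold raw record data):

#include <string.h>

typedef struct {
   int  age;        // example numeric attribute
   char name[20];   // example string attribute
} TupleT;

// compare on a string field: strcmp already gives -ve/0/+ve
int tupCompareName(TupleT *r1, TupleT *r2)
{
   return strcmp(r1->name, r2->name);
}

// compare on a numeric field
int tupCompareAge(TupleT *r1, TupleT *r2)
{
   if (r1->age < r2->age) return -1;
   if (r1->age > r2->age) return 1;
   return 0;
}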
❖ Comparison for Sorting (cont) |
In reality, need to sort on multiple attributes and ASC/DESC, e.g.
-- example multi-attribute sort
select * from Students
order by age desc, year_enrolled
Sketch of multi-attribute sorting function
int tupCompare(r1,r2,criteria)
{
   foreach (f,ord) in criteria {
      if (ord == ASC) {
         if (r1.f < r2.f) return -1;
         if (r1.f > r2.f) return 1;
      }
      else {
         if (r1.f > r2.f) return -1;
         if (r1.f < r2.f) return 1;
      }
   }
   return 0;
}
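The same idea as runnable C, assuming tuples are arrays of integer field values and the criteria come as an array (all names are illustrative):

typedef enum { ASC, DESC } Order;

typedef struct {
   int   field;   // index of the attribute to compare on
   Order ord;     // sort direction for this attribute
} Criterion;

int tupCompare(int *r1, int *r2, Criterion *crit, int ncrit)
{
   for (int i = 0; i < ncrit; i++) {
      int f = crit[i].field;
      int cmp = (r1[f] < r2[f]) ? -1 : (r1[f] > r2[f]) ? 1 : 0;
      if (crit[i].ord == DESC) cmp = -cmp;   // flip for descending
      if (cmp != 0) return cmp;              // decided by this attribute
   }
   return 0;   // equal on all sort attributes
}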
❖ Cost of Two-way Merge Sort |
For a file containing b data pages: each pass reads and writes all b pages, and ⌈log₂ b⌉ passes are needed, giving Cost = 2b·⌈log₂ b⌉.
Example: Relation with r = 10⁵ tuples and c = 50 tuples/page ⇒ b = 2000 pages.
Number of passes for sort: ⌈log₂ 2000⌉ = 11
Reads/writes entire file 11 times! Can we do better?
❖ n-Way Merge Sort |
Initial pass: uses all B available buffers
Reads B pages at a time, sorts in memory, writes out in order
❖ n-Way Merge Sort (cont) |
Method:
// Produce B-page-long runs
for each group of B pages in Rel {
   read B pages into memory buffers
   sort group in memory
   write B pages out to Temp
}
// Merge runs until everything sorted
numberOfRuns = ⌈b/B⌉
while (numberOfRuns > 1) {
   // n-way merge, where n = B-1
   for each group of n runs in Temp {
      merge into a single run via input buffers
      write run to newTemp via output buffer
   }
   numberOfRuns = ⌈numberOfRuns/n⌉
   Temp = newTemp   // swap input/output files
}
❖ Cost of n-Way Merge Sort |
Consider file where b = 4096, B = 16 total buffers:
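Working through the arithmetic for this example:

Initial pass: sorts groups of B = 16 pages ⇒ ⌈4096/16⌉ = 256 runs (2b = 8192 I/Os)
Merge passes (n = B-1 = 15-way): 256 → 18 → 2 → 1 runs, i.e. 3 passes (2b I/Os each)
Total cost = 2b × (1 + 3) = 8 × 4096 = 32768 page reads+writes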
❖ Cost of n-Way Merge Sort (cont) |
Generalising from previous example ...
For b data pages and B buffers: the initial pass produces b₀ = ⌈b/B⌉ sorted runs; each merge pass reduces the number of runs by a factor of n = B-1, so ⌈log_n b₀⌉ merge passes are needed; every pass reads and writes all b pages, giving Cost = 2b·(1 + ⌈log_n b₀⌉).
❖ Exercise: Cost of n-Way Merge Sort |
How many reads+writes to sort the following:
❖ Sorting in PostgreSQL |
Sort uses a merge-sort (from Knuth) similar to the above:
tuples to be sorted are collected into an array of SortTuples
if all tuples fit in memory, sort them there using qsort()
If memory fills while reading, form "runs" and do disk-based sort.
❖ Sorting in PostgreSQL (cont) |
Disk-based sort has phases: form initial sorted runs, then merge them;
the memory available for sorting is limited by the workMem parameter.
Implementation of "tapes": backend/utils/sort/logtape.c
❖ Sorting in PostgreSQL (cont) |
Sorting comparison operators are obtained via the catalog (e.g. pg_operator):
// gets pointer to function via pg_operator
struct Tuplesortstate { ... SortTupleComparator ... };

// returns negative, zero, positive
ApplySortComparator(Datum datum1, bool isnull1,
                    Datum datum2, bool isnull2,
                    SortSupport sort_helper);
Flags indicate: ascending/descending, nulls-first/last.
ApplySortComparator() plays the same role as the tupCompare() function sketched earlier.
❖ The Projection Operation |
Consider the query:
select distinct name,age from Employee;
If the Employee relation contains the following tuples:
(94002, John, Sales, Manager, 32)
(95212, Jane, Admin, Manager, 39)
(96341, John, Admin, Secretary, 32)
(91234, Jane, Admin, Secretary, 21)
then the result of the projection is:
(Jane, 21)
(Jane, 39)
(John, 32)
Note that duplicate tuples (e.g. (John,32)) have been eliminated.
❖ The Projection Operation (cont) |
The projection operation needs to:
scan the entire relation as input
produce output tuples containing only the projected attributes
eliminate any duplicates (when distinct is specified)
❖ Sort-based Projection |
Requires a temporary file/relation (Temp):
for each tuple T in Rel {
   T' = mkTuple([attrs],T)
   write T' to Temp
}
sort Temp on [attrs]
for each tuple T in Temp {
   if (T == Prev) continue
   write T to Result
   Prev = T
}
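The duplicate-removal pass could be written in C using the scan API from earlier; tupEqual(), append_tuple(), temp and result are assumed helpers/handles, not defined here:

Scan s = start_scan(temp);        // Temp is already sorted on [attrs]
Tuple t, prev = NULL;
while ((t = next_tuple(s)) != NULL) {
   if (prev == NULL || !tupEqual(t, prev))
      append_tuple(result, t);    // first occurrence: keep it
   prev = t;                      // duplicates are adjacent after sorting
}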
❖ Exercise: Cost of Sort-based Projection |
Consider a table R(x,y,z) with tuples:
Page 0:  (1,1,'a')   (11,2,'a')  (3,3,'c')
Page 1:  (13,5,'c')  (2,6,'b')   (9,4,'a')
Page 2:  (6,2,'a')   (17,7,'a')  (7,3,'b')
Page 3:  (14,6,'a')  (8,4,'c')   (5,2,'b')
Page 4:  (10,1,'b')  (15,5,'b')  (12,6,'b')
Page 5:  (4,2,'a')   (16,9,'c')  (18,8,'c')
SQL: create table T as (select distinct y from R)
Assuming:
R
T
❖ Cost of Sort-based Projection |
The costs involved are (assuming B = n+1 buffers for sort):
scanning the original relation Rel: bR reads
writing the projected tuples to Temp: bT writes (bT depends on the size of the projected tuples)
sorting Temp: 2·bT·(1 + ⌈log_n b₀⌉), where b₀ = ⌈bT/B⌉
scanning sorted Temp to remove duplicates: bT reads
writing the result relation: bOut writes
❖ Hash-based Projection (cont) |
Algorithm for both phases:
// Phase 1: partition tuples on the projected attributes
for each tuple T in relation Rel {
T' = mkTuple([attrs],T)
H = h1(T', n)
B = buffer for partition[H]
if (B full) write and clear B
insert T' into B
}
// Phase 2: eliminate duplicates within each partition
for each partition P in 0..n-1 {
for each tuple T in partition P {
H = h2(T, n)
B = buffer for hash value H
if (T not in B) insert T into B
// assumes B never gets full
}
write and clear all buffers
}
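The two phases need two different hash functions; one common trick is to derive both from different bits of a single base hash (a sketch, where hash() on the projected attributes is assumed):

unsigned int hash(Tuple t);   // assumed base hash on projected attributes

unsigned int h1(Tuple t, int n) { return hash(t) % n; }         // partitioning phase
unsigned int h2(Tuple t, int n) { return (hash(t) / n) % n; }   // duplicate-elimination phase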
❖ Exercise: Cost of Hash-based Projection |
Consider a table R(x,y,z) with tuples:
Page 0: (1,1,'a') (11,2,'a') (3,3,'c')
Page 1: (13,5,'c') (2,6,'b') (9,4,'a')
Page 2: (6,2,'a') (17,7,'a') (7,3,'b')
Page 3: (14,6,'a') (8,4,'c') (5,2,'b')
Page 4: (10,1,'b') (15,5,'b') (12,6,'b')
Page 5: (4,2,'a') (16,9,'c') (18,8,'c')
-- and then the same tuples repeated for pages 6-11
SQL: create table T as (select distinct y from R)
Assuming:
R
T
❖ Cost of Hash-based Projection |
The total cost is the sum of the following:
scanning the original relation R: bR reads
writing and re-reading the partitions: 2·bP page I/Os
writing the result relation: bOut writes
giving Cost = bR + 2·bP + bOut
To ensure that n is larger than the largest partition ...
❖ Projection on Primary Key |
No duplicates, so the above approaches are not required.
Method:
bR = nPages(Rel)
for i in 0 .. bR-1 {
   P = read page i
   for j in 0 .. nTuples(P)-1 {
      T = getTuple(P,j)
      T' = mkTuple(pk, T)
      if (outBuf is full) write and clear
      append T' to outBuf
   }
}
if (nTuples(outBuf) > 0) write
❖ Index-only Projection |
Can do projection without accessing the data file iff the relation has an index whose entries contain all of the projected attributes.
Basic idea: scan the index file instead of the data file; the index file is typically much smaller than the data file.
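A sketch in the same pseudocode style as above, assuming each index entry holds the values of the projected attributes:

for each page P in the index file {
   for each entry e in page P {
      T' = mkTuple([attrs], e)
      add T' to the output, eliminating duplicates as before
   }
}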
❖ Comparison of Projection Methods |
Difficult to compare, since they make different assumptions:
Best case scenario for each (assuming n+1 in-memory buffers):
❖ Projection in PostgreSQL |
Code for projection forms part of execution iterators:
ExecProject(projInfo,...)
check_sql_fn_retval(...)
ExecStoreTuple(newTuple,...)
❖ Varieties of Selection |
Selection: select * from R where C
R ... the relation being selected from
C ... a boolean condition on the attributes of each tuple
The best implementation depends on the form of C and on the file structure used for R.
❖ Varieties of Selection (cont) |
Examples of different selection types:
one (at most one matching tuple):
      select * from R where id = 1234
pmr (partial-match retrieval, possibly many matches):
      select * from R where age = 65
      select * from R where age = 65 and gender = 'm'
rng (range query):
      select * from R where age ≥ 18 and age ≤ 21
      select * from R
      where  age between 18 and 21
      and    height between 160 and 190
❖ Exercise: Query Types |
Using the relation:
create table Courses (
   id       integer primary key,
   code     char(8),  -- e.g. 'COMP9315'
   title    text,     -- e.g. 'Computing 1'
   year     integer,  -- e.g. 2000..2016
   convenor integer references Staff(id),
   constraint once_per_year unique (code,year)
);
give examples of each of the query types described above (one, pmr, rng):
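One possible set of answers (any queries of the right shape will do):

one:  select * from Courses where id = 1234
one:  select * from Courses where code = 'COMP9315' and year = 2016  -- via unique (code,year)
pmr:  select * from Courses where code = 'COMP9315'
rng:  select * from Courses where year between 2000 and 2016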
❖ Implementing Select Efficiently |
Two basic approaches:
Our analyses assume: 1 input buffer available for each relation.
If more buffers are available, most methods benefit.
❖ Heap Files |
❖ Selection in Heaps |
For all selection queries, the only possible strategy is:
// select * from R where C
for each page P in file of relation R {
for each tuple t in page P {
if (t satisfies C)
add tuple t to result set
}
}
i.e. linear scan through file searching for matching tuples
❖ Selection in Heaps (cont) |
The heap is scanned from the first to the last page:
Cost_range  =  Cost_pmr  =  b
If we know that only one tuple matches the query (one query),
a simple optimisation is to stop the scan once that tuple is found.
Cost_one :  Best = 1,  Average = b/2,  Worst = b
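In the same pseudocode style as above, the optimised scan for a one query:

for each page P in file of relation R {
   for each tuple t in page P {
      if (t satisfies C) {
         add tuple t to result set
         stop scan   // at most one tuple can match
      }
   }
}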
❖ Insertion in Heaps |
Insertion: new tuple is appended to file (in last page).
rel = openRelation("R", READ|WRITE); pid = nPages(rel)-1; get_page(rel, pid, buf); if (size(newTup) > size(buf)) { deal with oversize tuple } else { if (!hasSpace(buf,newTup)) { pid++; nPages(rel)++; clear(buf); } insert_record(buf,newTup); put_page(rel, pid, buf); }
Cost_insert  =  1r + 1w
Plus possible extra writes for oversize tuples, e.g. PostgreSQL's TOAST
❖ Insertion in Heaps (cont) |
Alternative strategy:
find any page in R with enough free space for the new tuple
requires the DBMS to maintain free-space information for the pages of R
PostgreSQL takes this approach: backend/access/heap/{heapam.c,hio.c}
❖ Insertion in Heaps (cont) |
PostgreSQL's tuple insertion:
heap_insert(Relation relation,   // relation desc
            HeapTuple newtup,    // new tuple data
            CommandId cid, ...)  // SQL statement
It finds a page with enough free space for newtup, copies the tuple data into it, and sets tuple header fields (e.g. the xmin transaction id) used for visibility checking.
❖ Deletion in Heaps |
SQL: delete from R where C
Implementation of deletion:
rel = openRelation("R",READ|WRITE); for (p = 0; p < nPages(rel); p++) { get_page(rel, p, buf); ndels = 0; for (i = 0; i < nTuples(buf); i++) { tup = get_record(buf,i); if (tup satisfies Condition) { ndels++; delete_record(buf,i); } } if (ndels > 0) put_page(rel, p, buf); if (ndels > 0 && unique) break; }
❖ Exercise: Cost of Deletion in Heaps |
Consider the following queries ...
delete from Employees where id = 12345               -- one
delete from Employees where dept = 'Marketing'       -- pmr
delete from Employees where 40 <= age and age < 50   -- range
Show how each will be executed and estimate the cost, assuming:
Generalise the cost models for each query type.
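As a starting point (assuming a heap file with b pages and no indexes):

one:    scan until the matching tuple is found, rewrite that one page
        Cost = b/2 reads (average) + 1 write
pmr:    scan the whole file, rewriting each page that had deletions
        Cost = b reads + bq writes  (bq = number of pages containing matches)
range:  same access pattern as pmr
        Cost = b reads + bq writes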
Produced: 29 Feb 2024