COMP9315 24T1 |
Exercises 07 Implementing Join: Nested-loop, Sort-merge, Hash |
DBMS Implementation |
Does the buffer replacement policy matter for sort-merge join? Explain your answer.
Suppose that the relation R (with 150 pages) consists of one attribute a and S (with 90 pages) also consists of one attribute a. Determine the optimal join method for processing the following query:
select * from R, S where R.a > S.a
Assume there are 10 buffer pages available to process the query and there are no indexes available. Assume also that the DBMS only has available the following join methods: nested-loop, block nested loop and sort-merge. Determine the number of page I/Os required by each method to work out which is the cheapest.
Unless otherwise noted, compute all costs as number of page I/Os, except that the cost of writing out the result should be uniformly ignored (it's the same for all methods).
What is the cost of joining R and S using page-oriented simple nested loops join? What is the minimum number of buffer pages required for this cost to remain unchanged?
What is the cost of joining R and S using block nested loops join? What is the minimum number of buffer pages required for this cost to remain unchanged?
What is the cost of joining R and S using sort-merge join? What is the minimum number of buffer pages required for this cost to remain unchanged?
What is the cost of joining R and S using grace hash join? What is the minimum number of buffer pages required for this cost to remain unchanged?
What would be the lowest possible I/O cost for joining R and S using any join algorithm? How much buffer space would be needed to achieve this cost?
What is the maximum number of tuples that the join of R and S could produce, and how many pages would be required to store this result?
How would your answers to the above questions change if you are told that R.a is a foreign key that refers to S.b?
Consider performing the join:
select * from R, S where R.x = S.y
where we have bR = 1000, bS = 5000 and where R is used as the outer relation.
Compute the page I/O cost of performing this join using hybrid hash join:
if we have N=100 memory buffers available
if we have N=512 memory buffers available
if we have N=1024 memory buffers available