| COMP3311 24T2 |
Week 10 Relational Algebra, Transaction Processing |
Database Systems |
Relational algebra operators can be composed. What precisely does this mean? And why is it important?
The natural join ( R Join S ) joins two tables on their common attributes. Consider a theta-join ( R Join[C] S ) where the join condition C is a conjunction of R.A = S.A for each attribute A appearing in the schemas of both R and S (i.e. it joins on the common attributes). What is the difference between the above natural join and theta join?
The definition of relational algebra in lectures was based on sets of tuples. In practice, commercial relational database management systems deal with bags (or multisets) of tuples.
Consider the following relation that describes a collection of PCs
| Model | Speed | RAM | Disk | Price |
|---|---|---|---|---|
| 1001 | 700 | 64 | 10 | 799 |
| 1002 | 1500 | 128 | 60 | 2499 |
| 1003 | 1000 | 128 | 20 | 1499 |
| 1004 | 1000 | 256 | 40 | 1999 |
| 1005 | 700 | 64 | 30 | 999 |
Consider a projection of this relation on the processor speed
attribute, i.e. Proj[speed](PCs).
Consider the following two relations:
R
|
S
|
Give the relation that results from each of the following relational algebra expressions on these relations:
R Div S
R Div (Sel[B != b1](S))
R Div (Sel[B != b2](S))
R × S) - (Sel[R.C=S.C](R Join[B=B] S)
Consider two relations R1 and R2, where R1 contains N1 tuples and R2 contains N2 tuples, and N1 > N2 > 0. Give the minimum and maximum possible sizes (in tuples) for the result relation produced by each of the following relational algebra expressions. In each case state any assumptions about the schemas of R1 and R2 that are needed to make the expression meaningful.
(Date, exercise 6.11) Let A and B be two arbitrary relations. In terms of the keys of A and B, state the keys for each of the following RA expressions. Assume in each case that A and B meet the requirements of the operation (e.g. they are union-compatible for the Union and Intersect operations).
Sel[cond](A), where cond is any condition
Proj[attrs](A), where attrs is any set of atributes
A × B
A Union B
A Intersect B
A - B
A Div B
Consider the following relational schema:
Suppliers(sid, sname, address) Parts(pid, pname, colour) Catalog(supplier, part, cost)
Assume that the ids are integers,
that cost is a real number,
that all other attributes are strings,
that the supplier field
is a foreign key containing a supplier id,
and
that the part field is a foreign key
containing a part id.
Write a relational algebra expression to answer each
of the following queries:
Find the names of suppliers who supply some red part.
Find the sids of suppliers who supply some red or green part.
Find the sids of suppliers who supply some red part or whose address is 221 Packer Street.
Find the sids of suppliers who supply some red part and some green part.
Find the sids of suppliers who supply every part.
Find the sids of suppliers who supply every red part.
Find the sids of suppliers who supply every red or green part.
Find the sids of suppliers who supply every red part or supply every green part.
Find the pids of parts that are supplied by at least two different suppliers.
Find pairs of sids such that the supplier with the first sid charges more for some part than the supplier with the second sid.
Find the pids of the most expensive part(s) supplied by suppliers named "Yosemite Sham".
Find the pids of parts supplied by every supplier at a price less than 200 dollars (if any supplier either does not supply the part or charges more than 200 dollars for it, the part should not be selected).
Using the Suppliers-Parts-Catalog schema from the previous question, state what the following relational algebra expressions compute:
Proj[sname]( Proj[sid](Sel[colour='red'](Parts)) Join Sel[cost<100](Catalog) Join Suppliers )
Proj[sname]( Proj[sid]( Sel[colour='red'](Parts) Join Sel[cost<100](Catalog) Join Suppliers ) )
Proj[sname]( Sel[colour='red'](Parts) Join Sel[cost<100](Catalog) Join Suppliers ) Intersect Proj[sname]( Sel[colour='green'](Parts) Join Sel[cost<100](Catalog) Join Suppliers )
Proj[sid]( Sel[colour='red'](Parts) Join Sel[cost<100](Catalog) Join Suppliers ) Intersect Proj[sid]( Sel[colour='green'](Parts) Join Sel[cost<100](Catalog) Join Suppliers )
Proj[sid,sname]( Proj[sid,sname]( Sel[colour='red'](Parts) Join Sel[cost<100](Catalog) Join Suppliers ) Intersect Proj[sid,sname]( Sel[colour='green'](Parts) Join Sel[cost<100](Catalog) Join Suppliers ) )
Consider the following relational schema containing flight, aircraft and pilot information for an airline:
Flights(flno, from, to, distance, departs, arrives) Aircraft(aid, aname, cruisingRange) Certified(employee, aircraft) Employees(eid, ename, salary)
The Flights relation describes a particular timetabled
fight from a source (city) to a destination (city) at a particular time
each week.
Note that this schema doesn't model which specific aircraft makes
the flight.
The Aircraft relation describes individual planes,
with the aname field containing values like
'Boeing 747', 'Airbus A300', etc.
The Certified relation indicates which pilots (who
are employees) are qualified to fly which aircraft.
The employee field contains the eid of the
pilot, while the aircraft field contains
Finally, the Employees relation contains information
about all of the employees of the airline, including the pilots.
Write, where possible, a relational algebra expression to answer each of the following queries. If it is not possible to express any query in relational algebra, suggest what extra mechanisms might make it possible.
Find the ids of pilots certified for 'Boeing 747' aircraft.
Find the names of pilots certified for 'Boeing 747' aircraft.
Find the ids of all aircraft that can be used on non-stop flights from New York to Los Angeles.
Identify the flights that can be piloted by a pilot whose salary is more than $100,000.
Find the names of pilots who can operate planes with a range greater than 3000 miles, but are not certified on 'Boeing 747' aircraft.
Find the total amount paid to employees as salaries.
Find the ids of employees who make the highest salary.
Find the ids of employees who make the second highest salary.
Find the ids of employees who are certified for the largest number of aircraft.
Find the ids of employees who are certified for exactly three aircraft.
Find if there is any non-stop or single-stop way to travel by air from Sydney to New York.
Is there a sequence of flights from Sydney to Timbuktu (somewhere in the middle of Africa)? Each flight in the sequence is required to depart from the destination of the previous flight; the first flight must leave Sydney, and the last flight must arrive in Timbuktu, but there is no restriction on the number of intermediate flights.
[based on Ramakrishnan, ex.17.1]
Give a brief definition for each of the following terms:
Draw the precedence graph for the following schedule (where C means "commit"):
T1: R(A) W(Z) C T2: R(B) W(Y) C T3: W(A) W(B) CT3 --> T1 -->
[based on Ramakrishnan, ex.17.2]
Consider the following incomplete schedule S:
T1: R(X) R(Y) W(X) W(X) T2: R(Y) R(Y) T3: W(Y)
[based on Ramakrishnan, ex.17.3]
For each of the following schedules, state whether it is
conflict-serializable and/or view-serializable.
If you cannot decide whether a schedule belongs to either
class, explain briefly.
The actions are listed in the order they are scheduled,
and prefixed with the transaction name.
T1:R(X) T2:R(X) T1:W(X) T2:W(X)
T1:W(X) T2:R(Y) T1:R(Y) T2:R(X)
T1:R(X) T2:R(Y) T3:W(X) T2:R(X) T1:R(Y)
T1:R(X) T1:R(Y) T1:W(X) T2:R(Y) T3:W(Y) T1:W(X) T2:R(Y)
T1:R(X) T2:W(X) T1:W(X) T3:W(X)
Is the following schedule serializable? Show your working.
T1: R(X)W(X)W(Z) R(Y)W(Y) T2: R(Y)W(Y)R(Y) W(Y)R(X) W(X)R(V)W(V)