SQL: Queries on Multiple Tables

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [0/15]
❖ Queries on Multiple Tables

Queries involving a single table are useful.

Exploiting all data in the DB requires

Example: Which brewers makes beers that John likes?

select b.brewer
from   Beers b join Likes L on (b.name = L.beer)
where  L.drinker = 'John';

Info on brewers is in Beers; info on who likes what in Likes.

Need to combine info from both tables using "common" attributes

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [1/15]
❖ Queries on Multiple Tables (cont)

Example Beers and Likes tuples:

Beers(80/-, Caledonian, Scotch Ale)     Likes(John, Sculpin)
Beers(New, Toohey's, Lager)             Likes(John, Red Nut)
Beers(Red Nut, Bentspoke, Red IPA)      Likes(Adam, New)
Beers(Sculpin, Ballast Point, IPA)      Likes(John, 80/-)

"Merged" tuples resulting from

Joined(80/-, Caledonian, Scotch Ale, John, 80/-)
Joined(New, Toohey's, Lager, Adam, New)
Joined(Red Nut, Bentspoke, Red IPA, John, Red Nut)
Joined(Sculpin, Ballast Point, IPA, John, Sculpin)

In the query, the where clause removes all tuples not related to John

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [2/15]
❖ Join


Join is the SQL operator that combines tuples from tables.

Such an important operation that several variations exist


We focus on theta-join and outer join in this course
COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [3/15]
❖ Join (cont)


Join fits into SELECT queries as follows:

SELECT Attributes
FROM   R1
       JOIN R2 ON (JoinCondition1)
       JOIN R3 ON (JoinCondition2)
       ...
WHERE  Condition

Can include an arbitrary number of joins.

WHERE clause typically filters out some of the joined tuples.

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [4/15]
❖ Join (cont)


Alternative syntax for joins:

SELECT brewer
FROM   Likes L, Beers b
WHERE  L.beer = b.name
       AND L.drinker = 'John';

Join condition(s) are specified in the WHERE clause

We prefer the explicit JOIN syntax, but this is sometimes more compact

Note: duplicates could be eliminated by using distinct

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [5/15]
❖ Join (cont)

Operational semantics of   R1 JOIN R2 ON (Condition):

FOR EACH tuple t1 in R1 DO
    FOR EACH tuple t2 in R2 DO
        check Condition for current
           t1, t2 attribute values
        IF Condition holds THEN
            add (t1,t2) to result
        END
    END
END

Easy to generalise: add more relations, include WHERE condition

Requires one tuple variable for each relation, and nested loops over relations.
But this is not how it's actually computed!

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [6/15]
❖ Name Clashes in Conditions

If a SELECT statement

use the table name to disambiguate.

Example: Which hotels have the same name as a beer?

SELECT Bars.name
FROM   Bars, Beers
WHERE  Bars.name = Beers.name;
-- or, using table aliases ...
SELECT r.name
FROM   Bars r, Beers b
WHERE  r.name = b.name

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [7/15]
❖ Explicit Tuple Variables

Table-dot-attribute doesn't help if we use same table twice in SELECT.

To handle this, define new names for each "instance" of the table

SELECT r1.a, r2.b  FROM  R r1, R r2  WHERE  r1.a = r2.a

Example: Find pairs of beers by the same manufacturer.

SELECT b1.name, b2.name
FROM   Beers b1 JOIN Beers b2 ON (b1.brewer = b2.brewer)
WHERE  b1.name < b2.name;

The WHERE condition is used to avoid:

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [8/15]
❖ Outer Join


Join only produces a result tuple from tR and tS where ...

SELECT * FROM R JOIN S WHERE (Condition)

Sometimes, we want a result for every  R tuple

These kinds of requests often include "for each" or "for every"
COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [9/15]
❖ Outer Join (cont)

Example: for each suburb with a bar, find out who drinks there.

Theta-join only gives results for suburbs where people drink.

SELECT B.addr, F.drinker
FROM   Bars B
       JOIN Frequents F ON (F.bar = B.name)
ORDER  BY addr;

   addr    | drinker 
-----------+---------
 Coogee    | Adam
 Coogee    | John
 Kingsford | Justin
 Sydney    | Justin
 The Rocks | John

But what if we want all suburbs, even if some have are no drinkers?

This is from an older and simpler instance of the beers database.

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [10/15]
❖ Outer Join (cont)


An outer join solves this problem.

For   R   OUTER JOIN S   ON (Condition)

This outer join variant is called LEFT OUTER JOIN.
COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [11/15]
❖ Outer Join (cont)

Solving the example query with an outer join:

SELECT B.addr, F.drinker
FROM   Bars B
       LEFT OUTER JOIN Frequents F on (F.bar = B.name)
ORDER BY B.addr;

    addr    | drinker 
------------+---------
 Coogee     | Adam
 Coogee     | John
 Kingsford  | Justin
 Randwick   | 
 Sydney     | Justin
 The Rocks  | John

Note that Randwick is now mentioned (because of the Royal Hotel).

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [12/15]
❖ Outer Join (cont)

Operational semantics of   R1 LEFT OUTER JOIN R2 ON (Cond):

FOR EACH tuple t1 in R1 DO
    nmatches = 0
    FOR EACH tuple t2 in R2 DO
        check Cond for current
           t1, t2 attribute values
        IF Cond holds THEN
            nmatches++
            add (t1,t2) to result
        END
    END
    IF nmatches == 0 THEN
        t2 = (null,null,null,...)
        add (t1,t2) to result
    END
END

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [13/15]
❖ Outer Join (cont)

Many RDBMSs provide three variants of outer join:

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [14/15]
❖ Subqueries

The result of a query can be used in the WHERE clause of another query.

Case 1: Subquery returns a single, unary tuple

SELECT * FROM R WHERE R.a = (SELECT S.x FROM S WHERE Cond1)

Case 2: Subquery returns multiple values

SELECT * FROM R WHERE R.a IN (SELECT S.x FROM S WHERE Cond2)


This approach is often used in the initial discussion of SQL in some textbooks.

These kinds of queries can generally be solved more efficiently using a join

SELECT * FROM R JOIN S ON (R.a = S.x) WHERE Cond

COMP3311 20T3 ♢ SQL: Queries on Multiple Tables ♢ [15/15]


Produced: 3 Oct 2020