NULLSELECTSELECTSELECTINEXISTSSELECTSQL = Structured Query Language (sometimes called "sequel").
SQL is an ANSI/ISO standard language for querying and manipulating relational DBMSs.
Designed to be a "human readable" language comprising:
SQL was developed at IBM (San Jose Lab) during the 1970's, and standardised in 1986.
DBMSs typically implement the SQL2 standard (aka SQL-92).
Unfortunately, they also:
In these slides, we try to use only standard (portable) SQL2.
Since SQL2, there have been three new proposed standards:
SQL:1999 added e.g.
MERGEMajor DBMSs (Oracle, DB2, SQLServer, PostgreSQL MySQL):
SQL provides high-level, declarative access to data.
However, SQL is not a Turing-complete programming language.
Applications typically embed evaluation of SQL queries into PL's:
SQL's query sub-language is based on relational algebra.
Relational algebra:
Example relational algebra operations:
In order to demonstrate aspects of SQL, we use two databases:
bankbeerER design for a simple banking application:
Relational schema corresponding to the ER design:
We will use the following instance of this schema:
Branch relation/table instance:
branchName | address | assets ------------+----------------+-------- Clovelly | Clovelly Rd. | 1000 Coogee | Coogee Bay Rd. | 40000 Maroubra | Anzac Pde. | 17000 Randwick | Alison Rd. | 20000 UNSW | near Library | 3000
Customer relation/table instance:
name | address | customerNo | homebranch --------+----------------+------------+------------ Adam | Belmore Rd. | 12345 | Randwick Bob | Rainbow St. | 32451 | Coogee Chuck | Clovelly Rd. | 76543 | Clovelly David | Anzac Pde. | 82199 | UNSW George | Anzac Pde. | 81244 | Maroubra Graham | Malabar Rd. | 92754 | Maroubra Greg | Coogee Bay Rd. | 22735 | Coogee Jack | High St. | 12666 | Randwick
Account relation/table instance:
branchName | accountNo | balance ------------+-----------+--------- UNSW | U-245 | 1000 UNSW | U-291 | 2000 Randwick | R-245 | 20000 Coogee | C-123 | 15000 Coogee | C-124 | 25000 Clovelly | Y-123 | 1000 Maroubra | M-222 | 5000 Maroubra | M-225 | 12000
Owner relation/table instance:
account | customer ---------+---------- U-245 | 12345 U-291 | 12345 U-291 | 12666 R-245 | 12666 C-123 | 32451 C-124 | 22735 Y-123 | 76543 M-222 | 92754 M-225 | 12345
ER design for beers/bars/drinkers database:
Relational schema corresponding to the ER design:
We will use the following instance of this schema:
Bars relation/table instance:
name | addr | license
------------------+-----------+---------
Australia Hotel | The Rocks | 123456
Coogee Bay Hotel | Coogee | 966500
Lord Nelson | The Rocks | 123888
Marble Bar | Sydney | 122123
Regent Hotel | Kingsford | 987654
Royal Hotel | Randwick | 938500
Drinkers relation/table instance:
name | addr | phone --------+----------+------------ Adam | Randwick | 9385-4444 Gernot | Newtown | 9415-3378 John | Clovelly | 9665-1234 Justin | Mosman | 9845-4321
Beers relation/table instance:
name | manf
---------------------+---------------
80/- | Caledonian
Bigfoot Barley Wine | Sierra Nevada
Burragorang Bock | George IV Inn
Crown Lager | Carlton
Fosters Lager | Carlton
Invalid Stout | Carlton
Melbourne Bitter | Carlton
New | Toohey's
Old | Toohey's
Old Admiral | Lord Nelson
Pale Ale | Sierra Nevada
Premium Lager | Cascade
Red | Toohey's
Sheaf Stout | Toohey's
Sparkling Ale | Cooper's
Stout | Cooper's
Three Sheets | Lord Nelson
Victoria Bitter | Carlton
Frequents relation/table instance:
drinker | bar ---------+------------------ Adam | Coogee Bay Hotel Gernot | Lord Nelson John | Coogee Bay Hotel John | Lord Nelson John | Australia Hotel Justin | Regent Hotel Justin | Marble Bar
Likes relation/table instance:
drinker | beer ---------+--------------------- Adam | Crown Lager Adam | Fosters Lager Adam | New Gernot | Premium Lager Gernot | Sparkling Ale John | 80/- John | Bigfoot Barley Wine John | Pale Ale John | Three Sheets Justin | Sparkling Ale Justin | Victoria Bitter
Sells relation/table instance:
bar | beer | price
------------------+------------------+-------
Australia Hotel | Burragorang Bock | 3.50
Coogee Bay Hotel | New | 2.25
Coogee Bay Hotel | Old | 2.50
Coogee Bay Hotel | Sparkling Ale | 2.80
Coogee Bay Hotel | Victoria Bitter | 2.30
Lord Nelson | Three Sheets | 3.75
Lord Nelson | Old Admiral | 3.75
Marble Bar | New | 2.80
Marble Bar | Old | 2.80
Marble Bar | Victoria Bitter | 2.80
Regent Hotel | New | 2.20
Regent Hotel | Victoria Bitter | 2.20
Royal Hotel | New | 2.30
Royal Hotel | Old | 2.30
Royal Hotel | Victoria Bitter | 2.30
SQL definitions, queries and statements are composed of:
--CREATESELECTTABLEintegervarchardate'John''blue''it''s'"Students""Really Silly!"While SQL identifiers and keywords are case-insensitive, we generally:
SELECTFROMWHERECREATECustomersStudentsOwnsEnrolledInidnamepartNumberisActiveWe ignore the above conventions when typing in lectures.
A categorised list of frequently-used SQL92 keywords:
Querying Defining Data Changing Data SELECT CREATE INSERT FROM TABLE INTO WHERE INTEGER VALUES GROUP BY REAL UPDATE HAVING VARCHAR SET ORDER BY CHAR DELETE DESC KEY DROP EXISTS PRIMARY ALTER IS NULL FOREIGN NOT NULL REFERENCES IN CONSTRAINT DISTINCT CHECK AS
There are 225 reserved words in SQL92 ... not a small language.
A list of PostgreSQL's SQL keywords:
ALL DEFERRABLE IS OVERLAPS ANALYSE DESC ISNULL PRIMARY ANALYZE DISTINCT JOIN PUBLIC AND DO LEADING REFERENCES ANY ELSE LEFT RIGHT AS END LIKE SELECT ASC EXCEPT LIMIT SESSION_USER BETWEEN FALSE NATURAL SOME BINARY FOR NEW TABLE BOTH FOREIGN NOT THEN CASE FREEZE NOTNULL TO CAST FROM NULL TRAILING CHECK FULL OFF TRUE COLLATE GROUP OFFSET UNION COLUMN HAVING OLD UNIQUE CONSTRAINT ILIKE ON USER CROSS IN ONLY USING CURRENT_DATE INITIALLY OR VERBOSE CURRENT_TIME INNER ORDER WHEN CURRENT_USER INTERSECT OUTER WHERE DEFAULT INTO
Note that some SQL92 reserved words are not reserved words in PostgreSQL.
Names are used to identify
Can create arbitrary indentifiers by enclosing in "..."
Example identifiers:
employee student Courses last_name "That's a Great Name!"
Oracle SQL also allows unquoted hash (#$
Since SQL does not distinguish case, the following are all treated as being the same identifier:
employee Employee EmPlOyEe
Most RDBMSs will let you give the same name to different kinds of objects
(e.g. a table called BeerBeer
Some common naming conventions:
DrinkersTheDrinkersAllDrinkersbeerSellsNumeric constants have same syntax as programming languages, e.g.
10 3.14159 2e-5 6.022e23
String constants are written in single quotes, e.g.
'John' 'some text' '!%#%!$' 'O''Brien'
'"' '[A-Z]{4}\d{4}' 'a VeRy! LoNg String'
PostgreSQL provides extended strings containing \
E'\n' E'O\'Brien' E'[A-Z]{4}\\d{4}' E'John'
Boolean constants: TRUEFALSE
PostgreSQL also allows 't''true''yes''f''false''no'
Other kinds of constants are typically written as strings.
Dates: '2008-04-13''13:30:15'
Timestamps: '2004-10-19 10:23:54'
PostgreSQL also recognises: 'January 26 11:05:10 1988 EST'
Time intervals: '10 minutes''5 days, 6 hours'
PostgreSQL also has IP address, XML, etc. data types.
All attributes in SQL relations are typed (i.e. have domain specified)
SQL supports a small set of useful built-in data types:
text string, number (integer,real), date, boolean, binary
Various type conversions are available (e.g. date to string, string to date, integer to real) and applied automatically "where they make sense".
Basic domain (type) checking is performed automatically.
The NULL
No structured data types are available (in SQL2).
Various kinds of number types are available:
INTEGERINTSMALLINTREALDOUBLE PRECISIONNUMBER(,)serialcurrency$1,000.00Two string types are available:
CHAR()VARCHAR()'abc'::CHAR(2) = 'ab' 'abc'::CHAR(4) = 'abc '
PostgreSQL also provides TEXT
Dates are simply specially-formatted strings, with a range of operations to implement date semantics.
Format is typically YYYY-MM-DD'1998-08-02'
Accepts other formats (and has format-conversion functions), but beware of two-digit years (year 2000)
Comparison operators implement before (<>
Subtraction counts number of days between two dates.
Etc. etc. ... consult your local SQL Manual
PostgreSQL also supports several non-standard data types.
textbyteapointcirclepolygon
CREATE TABLE Employees (
empid integer primary key,
name text,
pay_rate float[]
);
INSERT INTO Employees VALUES
(1234, 'John', '{35.00,45.00,60.00}');
SELECT pay_rate[2] FROM Employees ...
Tuple and set constants are both written as:
( val1, val2, val3, ... )
The correct interpretation is worked out from the context.
Examples:
INSERT INTO Student(stude#, name, course)
VALUES (2177364, 'Jack Smith', 'BSc')
-- tuple literal
CREATE TABLE Academics (
id integer,
name varchar(40),
job varchar(10) CHECK
job IN ('Lecturer', 'Tutor');
-- set literal
SQL data types provide coarse-grained control over values.
If more fine-grained control over values is needed:
CREATE DOMAIN PositiveInt AS INTEGER
CHECK (VALUE > 0);
CREATE DOMAIN Colour AS
CHECK (VALUE IN ('red','yellow','green','blue','violet'));
CREATE TABLE T (
x Colour,
y PositiveInt,
z INTEGER CHECK (z BETWEEN 10 AND 20)
);
Comparison operators are defined on all types:
< > <= >= = <> (or !=)
Boolean operators ANDORNOT
Note ANDOR&&||
Most data types also have type-specific operations available
See PostgreSQL Documentation Chapter 8/9 for data types and operators
String comparison:
<LIKE%.*_.Examples (using SQL92 pattern matching):
Name LIKE 'Ja%' |
Name |
|
Name LIKE '_i%' |
Name |
|
Name LIKE '%o%o%' |
Name |
|
Name LIKE '%ith' |
Name |
|
Name LIKE 'John' |
Name |
PostgreSQL also supports case-insensitive match: ILIKE
Most Unix-based DBMSs utilise the regexp library
~Attr ~ 'RegExp'
PostgreSQL also provides full-text searching (see doc)
Examples (using POSIX regular expressions):
Name ~ '^Ja' |
Name |
|
Name ~ '^.i' |
Name |
|
Name ~ '.*o.*o.*' |
Name |
|
Name ~ 'ith$' |
Name |
|
Name ~ 'John' |
Name |
String manipulation:
||lower()substring()Note that above operations are null-preserving (strict):
NULLNULL(a||' '||b||' '||c)NULLabcArithmetic operations:
+ - * / abs ceil floor power sqrt sin
Aggregations apply to a column of numbers in a relation:
count()sum()avg()min/max()count
NULLNULL
3 + NULL = NULL 1 / NULL = NULL
NULL
sum(1,2,3,4,5,6) = 21 sum(1,2,NULL,4,NULL,6) = 13 avg(1,2,3,4,5) = 3 avg(NULL,2,NULL,4) = 3
NULL
Expressions containing NULLNULL
However, boolean expressions use three-valued logic:
| a | b | a AND |
a OR |
| TRUE | TRUE | TRUE | TRUE |
| TRUE | FALSE | FALSE | TRUE |
| TRUE | NULL | NULL | TRUE |
| FALSE | FALSE | FALSE | FALSE |
| FALSE | NULL | FALSE | NULL |
| NULL | NULL | NULL | NULL |
NULL
Important consequence of NULL
These expressions do not work as (might be) expected:
x = NULL x <> NULL
Both return NULL
Can only test for NULL
x IS NULL x IS NOT NULL
NULL
Other ways PostgeSQL provides for dealing with NULL
coalesce(,,)
nullif(,)coalesce()
In order to give a relational data model, we need to:
A database schema is a collection of relation schemas that defines the structure of and constraints on an entire database.
So far, we have given relational schemas informally, e.g.
Account(accountNo, branchName, balance) Branch(branchNo, address, assets) Customer(customerNo, name, address, homeBranch) Owner(customer,branch)
SQL is normally considered to be a query language.
However, it also has a data definition sub-language (DDL) for describing database schemas.
The SQL DDL allows us to specify:
Relations (tables) are described using:
CREATE TABLE RelName (
attribute1 domain1 constraints,
attribute2 domain2 constraints,
...
table-level constraints, ...
)
where constraints can include details about primary keys, foreign keys, default values, and constraints on attribute values.
This not only defines the table schema but also creates an empty instance of the table.
Tables are removed via DROP TABLE;
Consider the relational diagram for the example schema:
This shows explicitly the connection between foreign key attributes and their corresponding key attributes.
The SQL DDL provides notation for expressing this in the table definition.
SQL DDL for the example schema:
CREATE TABLE Branch (
name varchar(30),
address varchar(50),
assets float,
PRIMARY KEY (name)
);
Note: name
More SQL DDL for the example schema:
CREATE TABLE Customer (
customerNo integer,
name varchar(40),
address varchar(50),
homeBranch varchar(30) not null,
PRIMARY KEY (customerNo),
FOREIGN KEY (homeBranch)
REFERENCES Branch(name)
);
Note: the not null
More SQL DDL for the example schema:
CREATE TABLE Account (
accountNo char(5),
heldAtBranch varchar(30) not null,
balance float,
PRIMARY KEY (accountNo),
FOREIGN KEY (heldAtBranch)
REFERENCES Branch(name)
);
Note: the not null
More SQL DDL for the example schema:
CREATE TABLE OwnedBy (
account char(5),
customer integer,
PRIMARY KEY (account,customer),
FOREIGN KEY (account)
REFERENCES Account(accountNo),
FOREIGN KEY (customer)
REFERENCES Customer(customerNo)
);
Note: it is not possible in SQL to capture the semantics that Accounts are required to be owned by some Customer.
Primary keys:
accountNo char(5) PRIMARY KEY,
name varchar(40), address varchar(50), ... PRIMARY KEY (name,address)
If we want to define a numeric primary key, e.g.
CREATE TABLE R ( id INTEGER PRIMARY KEY, ... );
we still have the problem of generating unique values.
Most DBMSs provide a mechanism to
CREATE TABLE R ( id SERIAL PRIMARY KEY, ... );
Foreign keys:
customer integer
REFERENCES Customer(customerNo)
-- or
customer integer REFERENCES Customer
-- or
FOREIGN KEY (customer)
REFERENCES Customer(customerNo)
Foreign keys: (cont)
name varchar(40),
addr varchar(50),
...
FOREIGN KEY (name,addr)
REFERENCES Person(name,address)
FOREIGN KEYDeclaring foreign keys assures referential integrity.
Example:
Account.branchNameBranch
If we want to delete a tuple from BranchAccount
AccountAccountCan force the alternative delete behaviours via e.g.
-- to cascade deletes customer integer REFERENCES Customer(customerNo) ON DELETE CASCADE -- to set foreign keys to NULL customer integer REFERENCES Customer(customerNo) ON DELETE SET NULL
Example of different deletion strategies:
Can specify that an attribute must have a non-null value, e.g.
barcode varchar(20) NOT NULL, price float NOT NULL
Can specify that an attribute must have a unique value, e.g.
barcode varchar(20) UNIQUE, isbn varchar(15) UNIQUE NOT NULL
Primary keys are automatically UNIQUE NOT NULL
Can specify a DEFAULT
CREATE TABLE Account (
accountNo char(5) PRIMARY KEY,
branchName varchar(30)
REFERENCES Branch(name)
DEFAULT 'Central',
balance float DEFAULT 0.0
);
INSERT INTO Account(accountNo) VALUES ('A-456')
-- produces the tuple
Account('A-456','Central',0.0)
In fact, NOT NULL
SQL has a more general mechanism for specifying such constraints.
attrName type CHECK ( condition )
The Condition can be arbitrarily complex, and may even
involve other attributes, relations and SELECT
(but many RDBMSs (e.g. Oracle and PostgreSQL) don't allow SELECTCHECK
Example:
CREATE TABLE Example
(
gender CHAR(1) CHECK (gender IN ('M','F')),
Xvalue INT NOT NULL,
Yvalue INT CHECK (Yvalue > Xvalue),
Zvalue FLOAT CHECK (Zvalue >
(SELECT MAX(price)
FROM Sells)
)
);
Any constraint in an SQL DDL can be named via
CONSTRAINT constraintName constraint
Example:
CREATE TABLE Example
(
gender CHAR(1) CONSTRAINT GenderCheck
CHECK (gender IN ('M','F')),
Xvalue INT NOT NULL,
Yvalue INT CONSTRAINT XYOrder
CHECK (Yvalue > Xvalue),
);
Mechanism for creating databases is typically DBMS-specific.
Many implement a (non-standard) SQL-like statement:
CREATE DATABASE DBname;
Many provide an external command, e.g PostgreSQL's
$ createdb DBname
Produces an empty database (no tables, etc) called DBname
A database can be completely removed (no backup) via
$ dropdb DBname
This removes all tuples, all tables, all traces of DBname
Tables can be removed from a database schema via:
DROP TableName
All tuples can be removed from a table via:
DELETE FROM TableName
Loading a schema with PostgreSQL:
$ createdb mydb $ psql mydb ... mydb=# \i schema.sql ...
or
$ psql -f schema.sql mydb
Running the above as:
$ psql -a -f schema.sql mydb
intersperses messages with the schema definition.
Useful for debugging, since errors appear in context.
Re-loading schemas is not well-supported in PostgreSQL.
Simplest approach is:
$ dropdb mydb $ createdb mydb $ psql -f schema.sql mydb
An alternative is to leave DB but drop all tables:
$ psql mydb
...
mydb=# drop Table1;
mydb=# drop Table2;
etc. etc. in correct order
mydb=# \i schema.sql
...
Later, we'll see how to write functions to automate this.
The entire contents of a database may be dumped:
$ pg_dump mydb > mydb.dump
Dumps all definitions needed to re-create entire DB
create tablevarchar(30)character varying(30)Dumps may be used for backup/restore or copying DBs
$ pg_dump mydb > mydb.dump -- backup $ createdb newdb $ psql newdb -f mydb.dump -- copy
Result: newdbmydb
SQL provides mechanisms for modifying data (tuples) in tables:
INSERTDELETEUPDATE(See description of relational model for details of which checking applied when)
Also provides mechanisms for modifying table meta-data:
CREATE TABLEDROP TABLEALTER TABLECREATE VIEWCREATE FUNCTIONCREATE RULEDROP VIEWDROP FUNCTIONDROP RULEUPDATECREATE OR REPLACE
Accomplished via the INSERT
INSERT INTO RelationName VALUES (val1, val2, val3, ...) INSERT INTO RelationName(Attr1, Attr2, ...) VALUES (valForAttr1, valForAttr2, ...)
Each form adds a single new tuple into RelationName.
INSERT INTO VALUES (,,...)
CREATE TABLEINSERT INTO (,,...) VALUES (,,...)Example: Add the fact that Justin likes 'Old'.
INSERT INTO Likes VALUES ('Justin','Old');
-- or --
INSERT INTO Likes(drinker,beer)
VALUES('Justin', 'Old');
-- or --
INSERT INTO Likes(beer,drinker)
VALUES('Old','Justin');
Example: Add a new drinker with unknown phone number.
INSERT INTO Drinkers(name,addr)
VALUES('Frank','Coogee');
-- which inserts the tuple ...
('Frank', 'Coogee', null)
Example: insertion with default values
ALTER TABLE Likes
ALTER COLUMN beer SET DEFAULT 'New';
ALTER TABLE Likes
ALTER COLUMN drinker SET DEFAULT 'Joe';
INSERT INTO Likes(drinker)
VALUES('Fred');
INSERT INTO Likes(beer)
VALUES('Sparkling Ale');
-- inserts the two new tuples ...
('Fred', 'New')
('Joe', 'Sparkling Ale')
Example: insertion with insufficient values.
E.g. specify that drinkers' phone numbers cannot be NULL
ALTER TABLE Drinkers ALTER COLUMN phone SET NOT NULL;
And then try to insert a new drinker whose phone number we don't know:
INSERT INTO Drinkers(name,addr)
VALUES ('Zoe', 'Manly');
ERROR: ExecInsert: Fail to add null value
in not null attribute phone
Can use the result of a query to perform insertion of multiple tuples at once.
INSERT INTO Relation ( Subquery );
Tuples of Subquery must be projected into a suitable format (i.e. matching the tuple-type of Relation ).
Example: Create a relation of potential drinking buddies (i.e. people who go to the same bars as each other).
CREATE TABLE DrinkingBuddies (
drinker varchar(20) references Drinkers(name),
buddy varchar(20) references Drinkers(name),
primary key (drinker,buddy)
);
INSERT INTO DrinkingBuddies (
SELECT a.drinker AS drinker,
b.drinker AS buddy
FROM Frequents a, Frequents b
WHERE a.bar = b.bar AND a.drinker <> b.drinker
);
Note: this is better done as a view (treat this as a materialized view).
Tuples may be inserted individually:
insert into Stuff(x,y,s) values (2,4,'green'); insert into Stuff(x,y,s) values (4,8,null); insert into Stuff(x,y,s) values (8,null,'red'); ...
but this is tedious if 1000's of tuples are involved.
It is also inefficient, because all relevant constraints are checked after insertion of each tuple.
Most DBMSs provide non-SQL methods for bulk insertion:
Example: PostgreSQL's copy
copy Stuff(x,y,s) from stdin; 2 4 green 4 8 \N 8 \N red \.
Can also copy from a named file.
Accomplished via the DELETE
DELETE FROM Relation WHERE Condition
Removes all tuples from Relation that satisfy Condition.
Example: Justin no longer likes Sparkling Ale.
DELETE FROM Likes
WHERE drinker = 'Justin'
AND beer = 'Sparkling Ale';
Special case: Make relation R empty.
DELETE FROM R;
Example: remove all expensive beers from sale.
DELETE FROM Sells WHERE price >= 3.00;
Example: remove all drinkers with no fixed address.
DELETE FROM Drinkers WHERE addr IS NULL;
This fails if such Drinkers are referenced in other tables.
Method A for DELETE FROMWHERE
FOR EACH tuple T in R DO
IF T satisfies Cond THEN
remove T from relation R
END
END
Method B for DELETE FROMWHERE
FOR EACH tuple T in R DO
IF T satisfies Cond THEN
make a note of this T
END
END
FOR EACH noted tuple T DO
remove T from relation R
END
Does it matter which method is used?
Example: Delete all beers for which there is another beer by the same manufacturer.
DELETE FROM Beers b
WHERE EXISTS
(SELECT name
FROM Beers
WHERE manf = b.manf
AND name <> b.name);
Does the query result in ...
FROMExample continued ...
Different results come from different evaluation methods ..
DELETEAn update allows you to modify values of specified attributes in specified tuples of a relation:
UPDATE R SET list of assignments WHERE Condition
Each tuple in relation R that satisfies Condition has the assignments applied to it.
Assignments may:
SET price = 2.00SET price = price * 0.5Example: Adam changes his phone number.
UPDATE Drinkers SET phone = '9385-2222' WHERE name = 'Adam';
Example: John moves to Coogee.
UPDATE Drinkers
SET addr = 'Coogee',
phone = '9665-4321'
WHERE name = 'John';
Can update many tuples at once (all tuples that satisfy condition)
Example: Make $3 the maximum price for beer.
UPDATE Sells SET price = 3.00 WHERE price > 3.00;
Example: Increase beer prices by 10%.
UPDATE Sells SET price = price * 1.10;
Accomplished via the ALTER TABLE
ALTER TABLE Relation Modifications
Some possible modifications are:
NULLExample: Add phone numbers for hotels.
ALTER TABLE Bars ADD phone char(10) DEFAULT 'Unlisted';
This appends a new column to the table and sets value
for this attribute to 'Unlisted'
Specific phone numbers can subsequently be added via:
UPDATE Bars SET phone = '9665-0000' WHERE name = 'Coogee Bay Hotel';
If no default value is given, new column is set to all NULL
Full details are in the PostgreSQL Reference Manual.
See the section "SQL Commands", which has entries for
INSERTDELETEUPDATECREATEDROPALTER
A query is a declarative program that retrieves data from a database.
Analogous to an expression in relational algebra.
But SQL does not implement relational algebra precisely.
Queries are used in two ways in RDBMSs:
psqlThe most common kind of SQL statement is the SELECT query:
SELECT attributes FROM relations WHERE condition
The result of this statement is a relation, which is typically displayed on output.
The SELECT
The question "What beers are made by Toohey's?", can be phrased:
SELECT Name FROM Beers WHERE Manf = 'Toohey''s';
This gives a subset of the Beers
name
-------------
New
Old
Red
Sheaf Stout
Notes:
'''''\''
SELECT
For SQL SELECT
SELECT Attributes FROM R WHERE Condition
Formal semantics (relational algebra):
SELECTOperationally, we think in terms of a tuple variable ranging over all tuples of the relation.
Operational semantics:
FOR EACH tuple T in R DO
check whether T satisfies the condition
in the WHERE clause
IF it does THEN
print the attributes of T that are
specified in the SELECT clause
END
END
For a relation R and attributes X ⊆ R, the relational algebra expression πX(R) is implemented in SQL as:
SELECT X FROM R
Example: Names of drinkers = πName(Drinkers)
SELECT Name FROM Drinkers; name -------- Adam Gernot John Justin
Example: Names/addresses of drinkers = πName,Addr(Drinkers)
SELECT Name, Addr FROM Drinkers; name | addr --------+---------- Adam | Randwick Gernot | Newtown John | Clovelly Justin | Mosman
The symbol *
Example: All information about drinkers = (Drinkers)
SELECT * FROM Drinkers; name | addr | phone --------+----------+------------ Adam | Randwick | 9385-4444 Gernot | Newtown | 9415-3378 John | Clovelly | 9665-1234 Justin | Mosman | 9845-4321
SQL implements renaming (ρ) via the ASSELECT
Example: rename Beers(name,manf) to Beers(beer,brewer)
SELECT name AS beer, manf AS Brewer
FROM Beers;
beer | brewer
---------------------+---------------
80/- | Caledonian
Bigfoot Barley Wine | Sierra Nevada
Burragorang Bock | George IV Inn
Crown Lager | Carlton
Fosters Lager | Carlton
...
AS
Example: display beer prices in Yen, rather than dollars
SELECT bar, beer, price*120 AS PriceInYen FROM Sells;
bar | beer | priceinyen
------------------+------------------+------------------
Australia Hotel | Burragorang Bock | 420
Coogee Bay Hotel | New | 270
Coogee Bay Hotel | Old | 300
Coogee Bay Hotel | Sparkling Ale | 335.999994277954
Coogee Bay Hotel | Victoria Bitter | 275.999994277954
Lord Nelson | Three Sheets | 450
Lord Nelson | Old Admiral | 450
...
Trick: to put specific text in output columns
ASSELECT drinker, 'likes Cooper''s' AS WhoLikes FROM Likes WHERE beer = 'Sparkling Ale'; drinker | wholikes ---------+---------------- Gernot | likes Cooper's Justin | likes Cooper's
The relational algebra expression σCond(Rel) is implemented in SQL as:
SELECT * FROM Rel WHERE Cond
Example: All about the bars at The Rocks
SELECT * FROM Bars WHERE Addr='The Rocks';
name | addr | license
-----------------+-----------+---------
Australia Hotel | The Rocks | 123456
Lord Nelson | The Rocks | 123888
(2 rows)
The condition can be an arbitrarily complex boolean-valued expression using the operators mentioned previously.
Example: Find the price that The Regent charges for New
SELECT price FROM Sells WHERE bar = 'Regent Hotel' AND beer = 'New'; price ------- 2.2
This can be formatted better via to_char
SELECT to_char(price,'$99.99') AS price FROM Sells WHERE bar = 'Regent Hotel' AND beer = 'New'; price --------- $ 2.20
to_char()
SELECT
Syntax is similar to simple SELECT
SELECT Attributes FROM R1, R2, ... WHERE Condition
Difference is that FROM
Also, the condition typically includes cross-relation (join) conditions.
SELECTExample: Find the brewers whose beers John likes.
SELECT Manf as brewer
FROM Likes, Beers
WHERE beer = name AND drinker = 'John';
brewer
---------------
Caledonian
Sierra Nevada
Sierra Nevada
Lord Nelson
Note: duplicates could be eliminated by using DISTINCT
SELECTThe above example corresponds to a relational algebra evaluation like:
BeerDrinkers = Likes Join[beer=name] Beers JohnsBeers = Sel[drinker=John](BeerDrinkers) Brewers = Proj[manf](JohnsBeers) Result = Rename[manf->brewer](Brewers)
The SQL compiler knows how to translate tests
SELECT
For SQL SELECT
SELECT Attributes FROM R1, R2, ... Rn WHERE Condition
Formal semantics (relational algebra):
SELECT
Operational semantics of SELECT
FOR EACH tuple T1 in R1 DO
FOR EACH tuple T2 in R2 DO
...
check WHERE condition for current
assignment of T1, T2, ... vars
IF holds THEN
print attributes of T1, T2, ...
specified in SELECT
END
...
END
END
Requires one tuple variable for each relation, and nested loops over relations. This is not how it's actually computed!
If a selection condition
Example: Which hotels have the same name as a beer?
SELECT Bars.name FROM Bars, Beers WHERE Bars.name = Beers.name;
(The answer to this query is empty, but there is nothing special about this)
Can use such qualified names, even if there is no ambiguity:
SELECT Sells.beer FROM Sells WHERE Sells.price > 3.00;
Advice:
AS
The relation-dot-attribute convention doesn't help
if we happen to use the same relation twice in a SELECT
To handle this, we need to define new names for each "instance"
of the relation in the FROM
Syntax:
SELECT r1.a, r2.b FROM R r1, R r2 WHERE r1.a = r2.a
Example: Find pairs of beers by the same manufacturer.
SELECT b1.name, b2.name
FROM Beers b1, Beers b2
WHERE b1.manf = b2.manf AND b1.name < b2.name;
name | name
---------------------+------------------
Crown Lager | Fosters Lager
Crown Lager | Invalid Stout
Crown Lager | Melbourne Bitter
Crown Lager | Victoria Bitter
Fosters Lager | Invalid Stout
Fosters Lager | Melbourne Bitter
...
The second part of the condition is used to avoid:
(New,New)(New,Old) (Old,New)
A common alternative syntax for
SELECT r1.a, r2.b FROM R r1, R r2 WHERE r1.a = r2.a
uses the as
SELECT r1.a, r2.b FROM R as r1, R as r2 WHERE r1.a = r2.a
SQL supports syntax for explicit joins:
SELECT...FROM A natural join B SELECT...FROM A join B using (A1,...,An) SELECT...FROM A join B on Condition
The natural joinjoin using
Example: Find the beers sold at bars where John drinks
SELECT Sells.bar, beer, price
FROM Sells, Frequents
WHERE drinker = 'John'
AND Sells.bar = Frequents.bar;
could also be expressed as
SELECT bar, beer, price
FROM Sells natural join Frequents
WHERE drinker='John';
-- joins on the only common attribute: bar
The example could also be expressed as
SELECT bar, beer, price
FROM Sells join Frequents using (bar)
WHERE drinker='John';
-- only one bar attribute in join result
or
SELECT Sells.bar, beer, price
FROM Sells join Frequents
on Sells.bar = Frequents.bar
WHERE drinker='John';
-- bar attribute occurs twice in join result
Join only produces tuples where there are matching values in both of the relations involved in the join.
Often, it is useful to produce results for all tuples in one relation, even if it has no matches in the other.
Consider the query: for each region, find out who drinks there.
A regular join only gives results for regions where people drink.
SELECT B.addr, F.drinker
FROM Bars as B join Frequents as F
on (bar = name)
ORDER BY addr;
addr | drinker
-----------+---------
Coogee | Adam
Coogee | John
Kingsford | Justin
Sydney | Justin
The Rocks | John
But what if we want a result that shows all regions, even if there are no drinkers there?
An outer join solves this problem.
For R OUTER JOIN
NULLLEFT OUTER JOINSolving the example query with an outer join:
SELECT B.addr, F.drinker
FROM Bars as B
left outer join
Frequents as F
on (bar = name)
ORDER BY B.addr;
addr | drinker
------------+---------
Coogee | Adam
Coogee | John
Kingsford | Justin
Randwick |
Sydney | Justin
The Rocks | John
Note that Randwick is now mentioned (because of the Royal Hotel).
Many RDBMSs provide three variants of outer join:
LEFT OUTER JOINRIGHT OUTER JOINNULLFULL OUTER JOINNULL
The result of a SELECT-FROM-WHEREWHERE
Simplest Case: Subquery returns a single, unary tuple
Can treat the result as a single constant value and use in expressions.
Syntax:
SELECT *
FROM R
WHERE R.a = (SELECT x FROM S WHERE Cond)
-- assume only one result
Example: Find bars that serve New at the same price as the Coogee Bay Hotel charges for VB.
SELECT bar
FROM Sells
WHERE beer = 'New' AND
price =
(SELECT price
FROM Sells
WHERE bar = 'Coogee Bay Hotel'
AND beer = 'Victoria Bitter');
bar
-------------
Royal Hotel
The inner query finds the price of VB at the CBH, and uses this as an argument to a test in the outer query.
Note the potential ambiguity in references to attributes of Sells
SELECT bar
FROM Sells
WHERE beer = 'New' AND
price =
(SELECT price
FROM Sells
WHERE bar = 'Coogee Bay Hotel'
AND beer = 'Victoria Bitter');
This introduces notions of scope: an attribute refers to the most closely nested relation with that attribute.
Parentheses around the subquery are required (and set the scope).
Note also that the query could be answered via:
SELECT s1.bar
FROM Sells as s1, Sells as s2
WHERE s1.beer = 'New'
AND s1.price = s2.price
AND s2.bar = 'Coogee Bay Hotel'
AND s2.beer = 'Victoria Bitter';
In general, expressing a query via joins will be much more efficient than expressing it with sub-queries.
Complex Case: Subquery returns multiple unary tuples.
Treat it as a list of values, and use the various operators
on lists/sets (e.g. IN
Complex Case: Subquery returns a relation.
Most of the "list operators" also work on relations.
INTests whether a specified tuple is contained in a relation.
tuple IN
Conversely for tuple NOT IN
Syntax:
SELECT *
FROM R
WHERE R.a IN (SELECT x FROM S WHERE Cond)
-- assume multiple results
INExample: Find the name and brewer of beers that John likes.
SELECT *
FROM Beers
WHERE name IN
(SELECT beer
FROM Likes
WHERE drinker = 'John');
name | manf
---------------------+---------------
80/- | Caledonian
Bigfoot Barley Wine | Sierra Nevada
Pale Ale | Sierra Nevada
Three Sheets | Lord Nelson
The subexpression answers the question "What are the names of the beers that John likes?"
IN
Note that this query can be answered equally well without using IN
SELECT Beers.name, Beers.manf
FROM Beers, Likes
WHERE Likes.drinker = 'John' AND
Likes.beer = Beers.name;
name | manf
---------------------+---------------
80/- | Caledonian
Bigfoot Barley Wine | Sierra Nevada
Pale Ale | Sierra Nevada
Three Sheets | Lord Nelson
The version with the subquery corresponds more closely to the way the original query was expressed, and is probably "more natural".
The subquery version is, however, potentially less efficient.
EXISTS
EXISTS()
Example: Find the beers that are the unique beer by their manufacturer.
SELECT name, manf
FROM Beers b1
WHERE NOT EXISTS
(SELECT *
FROM Beers
WHERE manf = b1.manf
AND name != b1.name);
Note the scoping rule: to refer to outer Beersb1
A subquery that refers to values from a surrounding query is called a correlated subquery.
ANYALL
Example: Find the beers sold for the highest price.
SELECT beer
FROM Sells
WHERE price >=
ALL(SELECT price FROM sells);
Beware: in common use, "any" and "all" are often synonyms.
E.g. "I'm better than any of you" vs. "I'm better than all of you".
SQL implements the standard set operations on "union-compatible" relations:
UNION |
set of tuples in either R1 or R2 | |
INTERSECT |
set of tuples in both R1 and R2 | EXCEPT |
set of tuples in R1 but not R2 |
Oracle deviates from the SQL standard and uses MINUSEXCEPT
Example: Find the drinkers and beers such that the drinker likes the beer and frequents a bar that sells it.
(SELECT * FROM Likes) INTERSECT (SELECT drinker,beer FROM Sells natural join Frequents); drinker | beer ---------+----------------- Adam | New John | Three Sheets Justin | Victoria Bitter
An SQL relation is really a bag (multiset):
UNIONINTERSECTMINUSBag Union
Sum the times an element appears in the two bags
Take the minimum number of occurrences from each bag.
Proper-subract the number of occurrences in the two bags.
Default result for SELECT-FROM-WHERE
Default result for UNIONINTERSECTMINUS
Why the difference?
A bag can be produced faster because no need to worry about eliminating duplicates (which typically requires sorting).
Can force set semantics with SELECT DISTINCT
Can force bag semantics with UNION ALL
Example: What beer manufacturers are there?
SELECT DISTINCT manf FROM Beers;
manf
---------------
Caledonian
Carlton
Cascade
Cooper's
George IV Inn
Lord Nelson
Sierra Nevada
Toohey's
Note that the result is sorted.
If we omit DISTINCT
Not all SQL implementations provide a divide operator, but the same effect can be achieved by combination of existing operations.
Example: Find bars that each sell all of the beers Justin likes.
SELECT DISTINCT a.bar
FROM Sells a
WHERE NOT EXISTS (
(SELECT beer FROM Likes
WHERE drinker = 'Justin')
EXCEPT
(SELECT beer FROM Sells b
WHERE bar = a.bar)
);
Selection clauses can contain aggregation operations.
Example: What is the average price of New?
SELECT AVG(price)
FROM Sells
WHERE beer = 'New';
avg
------------------
2.38749998807907
Note:
If we want set semantics, we can force using DISTINCT
Example: How many different bars sell beer?
SELECT COUNT(DISTINCT bar)
FROM Sells;
count
-------
6
Without DISTINCTSells
The following operators apply to a list (bag) of numeric values in one column of a relation:
SUM AVG MIN MAX COUNT
The notation COUNT(*)
Example: How many different beers are there?
SELECT COUNT(*) FROM Beers;
count
-------
18
SELECT-FROM-WHEREGROUP BY
Example: How many beers does each brewer make?
There is one entry for each beer by each brewer in the
Beers
The following gives us a list of brewers:
SELECT manf FROM Beers;
The number of occurrences of each brewer is the number of beers that they make.
Ordering the list makes it much easier to work out:
SELECT manf FROM Beers ORDER BY manf;
but we still need to count length of runs by hand.
If we group the runs, we can count(*)
SELECT manf, COUNT(manf)
FROM Beers
GROUP BY manf;
manf | count
---------------+-------
Caledonian | 1
Carlton | 5
Cascade | 1
Cooper's | 2
George IV Inn | 1
Lord Nelson | 2
Sierra Nevada | 2
Toohey's | 4
GROUP BY
SELECT attributes/aggregations FROM relations WHERE condition GROUP BY attribute
Semantics:
SELECT-FROM-WHEREThe query
select manf,count(manf) from Beers group by manf;
first produces a partitioned relation and then counts the number of tuples in each partition:
Grouping is typically used in queries involving the phrase "for each".
Example: For each drinker, find the average price of New at the bars they go to.
SELECT drinker, AVG(price) as "Avg.Price"
FROM Frequents, Sells
WHERE beer = 'New'
AND Frequents.bar = Sells.bar
GROUP BY drinker;
drinker | Avg.Price
---------+-----------
Adam | 2.25
John | 2.25
Justin | 2.5
SELECT
When using grouping, every attribute in the SELECT
GROUP-BYSELECT bar, MIN(price) FROM Sells WHERE beer = 'New';
PostgreSQL's response to this query:
ERROR: Attribute sells.bar must be GROUPed
or used in an aggregate function
SELECTHow to answer the query: Which bar sells 'New' cheapest?
SELECT bar
FROM Sells
WHERE beer = 'New' AND
price <= (SELECT MIN(price)
FROM Sells
WHERE beer = 'New');
bar
--------------
Regent Hotel
SELECTAlso, cannot use grouping to simply re-order results.
Incorrect Example: Print beers grouped by their manufacturer.
SELECT name, manf FROM Beers
GROUP BY manf;
ERROR: Attribute beers.name must be GROUPed
or used in an aggregate function
SELECTHow to print beers grouped by their manufacturer?
SELECT name, manf FROM Beers
ORDER BY manf;
name | manf
---------------------+---------------
80/- | Caledonian
Crown Lager | Carlton
Fosters Lager | Carlton
Invalid Stout | Carlton
Melbourne Bitter | Carlton
Victoria Bitter | Carlton
Premium Lager | Cascade
...
ORDER BY
In some queries, you can use the WHERE
Example: Average beer price by suburb excluding hotels in The Rocks.
SELECT Bars.addr, AVG(Sells.price)
FROM Sells, Bars
WHERE Bars.addr != 'The Rocks'
AND Sells.bar = Bars.name
GROUP BY Bars.addr;
For more complex conditions on groups, use the HAVING
HAVINGGROUP-BY
SELECT attributes/aggregations FROM relations WHERE condition (on tuples) GROUP BY attribute HAVING condition (on group);
Semantics of HAVING
GROUP-BYHAVINGHAVINGFROMWHEREExample: Find the average price of common beers (i.e. those that are served in more than one hotel).
SELECT beer,
to_char(AVG(price),'9.99')
as "$$$"
FROM Sells
GROUP BY beer
HAVING COUNT(bar) > 1;
beer | $$$
-----------------+-------
New | 2.39
Old | 2.53
Victoria Bitter | 2.40
The HAVING
Example: Find the average price of beers that are either commonly served (in more than one hotel) or are manufactured by Cooper's.
SELECT beer, AVG(price)
FROM Sells
GROUP BY beer
HAVING COUNT(bar) > 1
OR beer in
(SELECT name
FROM beers
WHERE manf = 'Cooper''s');
beer | avg
-----------------+------------------
New | 2.38749998807907
Old | 2.53333330154419
Sparkling Ale | 2.79999995231628
Victoria Bitter | 2.39999997615814
GROUP-BY and HAVING also provide an alternative formulation for division.
Example: Find bars that each sell all of the beers Justin likes.
SELECT DISTINCT S.bar
FROM Sells S, Likes L
WHERE S.beer = L.beer
AND L.drinker = 'Justin'
GROUP BY bar
HAVING count(S.beer) =
(SELECT count(beer) FROM Likes
WHERE drinker = 'Justin');
Sometimes it is useful to
GROUP-BYGROUP-BYPARTITION
Syntax for PARTITION
SELECT attr1, attr2, ...,
aggregate1 OVER (PARTITION BY attri),
aggregate2 OVER (PARTITION BY attrj), ...
FROM Table
WHERE condition on attributes
Note: the condition cannot include the aggregate value(s)
Example: show each city with daily temperature and temperature range
Schema: Weather(city,date,temperature)
SELECT city, date, temperature as temp,
min(temperature) OVER (PARTITION BY city) as lowest,
max(temperature) OVER (PARTITION BY city) as highest
FROM Weather;
Output: Result(city, date, temp, lowest, highest)
Example showing GROUP BYPARTITION
SELECT city, min(temperature) max(temperature) FROM Weather GROUP BY city
Result: one tuple for each city Result(city,min,max)
SELECT city, date, temperature as temp,
min(temperature) OVER (PARTITION BY city),
max(temperature) OVER (PARTITION BY city)
FROM Weather;
Result: one tuple for each temperature measurement.
Example: get a list of low-scoring students in each course
(low-scoring = mark is less than average mark for class)
Schema: Enrolment(course,student,mark)
Approach:
SELECT course, student, mark,
avg(mark) OVER (PARTITION BY course)
FROM Enrolments;
We now look at several ways to complete this data request ...
For complex queries, it is often useful to
FROMWITHWHEREDefining complex queries using views:
CREATE VIEW CourseMarksAndAverages(course,student,mark,avg)
AS
SELECT course, student, mark,
avg(mark) OVER (PARTITION BY course)
FROM Enrolments;
SELECT course, student, mark
FROM CourseMarksAndAverages
WHERE mark < avg;
In the general case:
CREATE VIEW View1(a,b,c,d) AS Query1; CREATE VIEW View2(e,f,g) AS Query2; ... SELECT a,f FROM View1, View2 WHERE c = e;
Notes:
Defining complex queries using FROM
SELECT course, student, mark
FROM (SELECT course, student, mark,
avg(mark) OVER (PARTITION BY course)
FROM Enrolments) AS CourseMarkAndAverages
WHERE mark < avg;
Avoids the need to define views.
In the general case:
SELECT attributes
FROM (Query1) AS X,
(Query2) AS Y,
...
WHERE X.a = Y.b AND other conditions
Notes:
a
Defining complex queries using WITH
WITH CourseMarksAndAverages AS
(SELECT course, student, mark,
avg(mark) OVER (PARTITION BY course)
FROM CourseEnrolments)
SELECT course, student, mark, avg
FROM CourseMarksAndAverages
WHERE mark < avg;
Avoids the need to define views.
In the general case:
WITH Name1(a,b,c) AS (Query1),
Name2 AS (Query1), ...
SELECT attributes
FROM Name1, Name2, ...
WHERE conditions with attributes of Name1 and Name2
Notes:
WITH
Recursive queries are structured as:
WITH RECURSIVE Recurs(attributes) AS (
SELECT ... not involving Recurs
UNION
SELECT ... FROM Recurs, ...
)
SELECT attributes
FROM Recurs, ...
WHERE condition involving Recurs attributes
Useful for scenarios in which we need to traverse multi-level relationships.
Simple example involving a "virtual" table.
Sum the numbers from 1 to 100:
WITH RECURSIVE t(n) AS (
SELECT 1
UNION
SELECT n+1 FROM t WHERE n < 100
)
SELECT sum(n) FROM t;
In the general case:
WITH RECURSIVE Recurs(attributes) AS (
Q1 (non-recursive query)
UNION
Q2 (recursive query)
)
SELECT * FROM Recurs;
Requires the use of several temporary tables:
How recursion works:
Working = Result = evaluate Q1
while (Working table is not empty) {
Temp = evaluate Q2, using Working in place of Recurs
Temp = Temp - Result
Result = Result UNION Temp
Working = Temp
}
I.e. generate new tuples until we see nothing not already seen.
Example: count number of each sub-part in a given part.
Schema: Parts(part, sub_part, quantity)
WITH RECURSIVE IncludedParts(sub_part, part, quantity) AS (
SELECT sub_part, part, quantity
FROM Parts WHERE part = GivenPart
UNION ALL
SELECT p.sub_part, p.part, p.quantity
FROM IncludedParts i, Parts p
WHERE p.part = i.sub_part
)
SELECT sub_part, SUM(quantity) as total_quantity
FROM IncludedParts
GROUP BY sub_part
A view is like a "virtual relation" defined via a query.
View definition and removal:
CREATE VIEW ViewName AS Query CREATE VIEW ViewName [ (AttributeNames) ] AS Query DROP VIEW ViewName
The Query may be any SQL query, involving
The stored tables in a view are referred to as base tables.
Views are defined only after their base tables are defined.
A view is valid only as long as its underlying query is valid.
Dropping a view has no effect on the base tables.
Example: An avid Carlton drinker might not be interested in any other kinds of beer.
CREATE VIEW MyBeers AS
SELECT name, manf
FROM Beers
WHERE manf = 'Carlton';
SELECT * FROM MyBeers;
name | manf
------------------+---------
Crown Lager | Carlton
Fosters Lager | Carlton
Invalid Stout | Carlton
Melbourne Bitter | Carlton
Victoria Bitter | Carlton
A view might not use all attributes of the base relations.
Example: We don't really need the address of inner-city hotels.
CREATE VIEW InnerCityHotels AS
SELECT name, license
FROM Bars
WHERE addr in ('The Rocks','Sydney');
SELECT * FROM InnerCityHotels;
name | license
-----------------+---------
Australia Hotel | 123456
Lord Nelson | 123888
Marble Bar | 122123
A view might use computed attribute values.
Example: Number of beers produced by each brewer.
CREATE VIEW BeersBrewed AS
SELECT manf as brewer,
count(*) as nbeers
FROM beers GROUP BY manf;
SELECT * FROM BeersBrewed;
brewer | nbeers
---------------+--------
Caledonian | 1
Carlton | 5
Cascade | 1
...
This can be achieved in two different ways:
CREATE VIEW InnerCityHotels AS
SELECT name AS pub, license AS lic
FROM Bars
WHERE addr IN ('The Rocks', 'Sydney');
CREATE VIEW InnerCityHotels(pub,lic) AS
SELECT name, license
FROM Bars
WHERE addr IN ('The Rocks', 'Sydney');
Views can be used in queries as if they were stored relations.
However, they differ from stored relations in two important respects:
Example: of view changing when base table changes.
SELECT * FROM InnerCityHotels;
name | license
-----------------+---------
Australia Hotel | 123456
Lord Nelson | 123888
Marble Bar | 122123
-- then the Lord Nelson goes broke
DELETE FROM Bars WHERE name = 'Lord Nelson';
-- no explict update has been made to InnerCityHotels
SELECT * FROM InnerCityHotels;
name | license
-----------------+---------
Australia Hotel | 123456
Marble Bar | 122123
Explicit updates are allowed on views satisfying the following:
RWHERERWHERESELECTSELECTNULL
Example: Our InnerCityHotel
INSERT INTO InnerCityHotels
VALUES ('Jackson''s on George', '9876543');
creates a new tuple in the Bars
(Jackson's on George, NULL, 9876543)
when we SELECT
addr IN ('The Rocks', 'Sydney')
If we had chosen to omit the license
CREATE VIEW CityHotels AS
SELECT name,addr FROM Bars
WHERE addr IN ('The Rocks', 'Sydney');
INSERT INTO CityHotels
VALUES ('Jackson''s on George', 'Sydney');
creates a new tuple in the Bars
(Jackson's on George, Sydney, NULL)
which would appear in the view after the insertion.
Updatable views in PostgreSQL require us to specify explicitly how updates are done:
CREATE RULE InsertCityHotel AS
ON INSERT TO CityHotels
DO INSTEAD
INSERT INTO Bars VALUES
(new.name, new.addr, NULL);
CREATE RULE UpdateCityHotel AS
ON UPDATE TO CityHotels
DO INSTEAD
UPDATE Bars
SET addr = new.addr
WHERE name = old.name;
Two alternative ways of implementing views:
Example: Using the InnerCityHotels
CREATE VIEW InnerCityHotels AS
SELECT name, license
FROM Bars
WHERE addr IN ('The Rocks', 'Sydney');
SELECT name
FROM InnerCityHotels
WHERE license = '123456';
--is rewritten into the following form before execution
SELECT name
FROM Bars
WHERE addr IN ('The Rocks', 'Sydney')
AND license = '123456';
Demonstrate the rewriting process via relational algebra.
Some abbreviations
nnamellicenseLlicense = ''123456'Aaddr IN ('The Rocks', 'Sydney')InnerCityHotels = π(n,l)(σ(A)(Bars))
Rewriting of query involving a view:
= SELECT name from InnerCityHotels WHERE license = '123456' = π(n)(σ(L)(InnerCityHotels)) = π(n)(σ(L)(π(n,l)(σ(A)(Bars)))) = π(n)(π(n,l)(σ(L)(σ(A)(Bars)))) = π(n)(σ(L)(σ(A)(Bars))) = π(n)(σ(L & A)(Bars)) = π(n)(σ(A & L)(Bars)) = SELECT name FROM Bars WHERE addr IN ('The Rocks', 'Sydney') AND license = '123456'
Naive implementation of materialized views:
However, in a situation where
Materialized views are used extensively in data warehouses.
Produced: 13 Sep 2020