❖ Week 07 Thursday |
ass2
❖ Relational Design Theory |
We know how to express ER data models and SQL schemas
But how do we know that our models/schemas are "good"?
Properties of "good" models/schemas
Relies on the notion of functional dependency
❖ Relational Design Theory (cont) |
Functional dependencies
❖ Relational Design and Redundancy |
In database design, redundancy is generally a "bad thing":
❖ Relational Design and Redundancy (cont) |
Consider the following relation defining bank accounts/branches:
accountNo | balance | customer | branch | address | assets |
A-101 | 500 | 1313131 | Downtown | Brooklyn | 9000000 |
A-102 | 400 | 1313131 | Perryridge | Horseneck | 1700000 |
A-113 | 600 | 9876543 | Round Hill | Horseneck | 8000000 |
A-201 | 900 | 9876543 | Brighton | Brooklyn | 7100000 |
A-215 | 700 | 1111111 | Minus | Horseneck | 400000 |
A-222 | 700 | 1111111 | Redwood | Palo Alto | 2100000 |
A-305 | 350 | 1234567 | Round Hill | Horseneck | 8000000 |
Careless updating of this data may introduce inconsistencies.
❖ Exercise: Update Anomalies |
Show what happens when the following changes occur:
❖ Update Anomalies |
Insertion anomaly:
❖ Avoiding Update Anomalies |
Redundancy in stored data can lead to update anomalies
Functional dependencies allow us to
❖ Notation/Terminology |
Most texts adopt the following terminology:
upper-case letters, denoting set of all attributes (e.g. R, S, P, Q ) | ||
lower-case letter corresponding to schema (e.g. r(R), s(S), p(P), q(Q) ) | ||
lower-case letters (e.g. t, t', t1, u, v ) | ||
upper-case letters from start of alphabet (e.g. A, B, C, D ) | ||
simple concatenation of attribute names (e.g. X=ABCD, Y=EFG ) | ||
tuple[attrSet] (e.g. t[ABCD], t[X]) |
❖ Database Design (revisited) |
To avoid update anomalies, normalization should give:
❖ Database Design (revisited) (cont) |
But we already have a design procedure (ER-then-relational)
❖ Functional Dependency |
A relation instance r(R) satisfies a dependency X → Y if
We say that "Y is functionally dependent on X".
Attribute sets X and Y may overlap; trivially true that X → X.
Notes:
❖ Identifying Functional Dependencies |
Consider an instance r(R) of the relation schema R(ABCDE):
A | B | C | D | E |
a1 | b1 | c1 | d1 | e1 |
a2 | b1 | c2 | d2 | e1 |
a3 | b2 | c1 | d1 | e1 |
a4 | b2 | c2 | d2 | e1 |
a5 | b3 | c3 | d1 | e1 |
What dependencies can we observe among its attributes?
❖ Identifying Functional Dependencies (cont) |
Since A values are unique, the definition of fd gives:
❖ Identifying Functional Dependencies (cont) |
Other observations:
In practice, choose a minimal set of fds (basis)
❖ Exercise: Functional Dependencies (1) |
Real estate agents conduct visits to rental properties
❖ Exercise: Functional Dependencies (1) (cont) |
Describe any functional dependencies that exist in this data:
P# | When | Address | Notes | S# | Name | CarReg ----+-------------+------------+---------------+------+-------+------- P4 | 03/06 15:15 | 55 High St | Bathroom leak | S44 | Rob | ABK754 P1 | 04/06 11:10 | 47 High St | All ok | S44 | Rob | ABK754 P4 | 03/07 12:30 | 55 High St | All ok | S43 | Dave | ATS123 P1 | 05/07 15:00 | 47 High St | Broken window | S44 | Rob | ABK754 P1 | 05/07 15:00 | 47 High St | Leaking tap | S44 | Rob | ABK754 P2 | 13/07 12:00 | 12 High St | All ok | S42 | Peter | ATS123 P1 | 10/08 09:00 | 47 High St | Window fixed | S42 | Peter | ATS123 P3 | 11/08 14:00 | 99 High St | All ok | S41 | John | AAA001 P4 | 13/08 10:00 | 55 High St | All ok | S44 | Rob | ABK754 P3 | 05/09 11:15 | 99 High St | Bathroom leak | S42 | Peter | ATS123
State assumptions used in determining the fds.
❖ Exercise: Functional Dependencies (2) |
Consider this data (possibly from a spreadsheet):
id | name | subj | term | title | mark | grade --------+-----------------+----------+------+-----------------+------+------- 5123485 | Lillian Seng... | COMP9020 | 17s1 | Foundations ... | | AF 5123485 | Lillian Seng... | COMP9021 | 17s2 | Principles o... | 46 | FL 5123485 | Lillian Seng... | COMP9020 | 18s1 | Foundations ... | 44 | FL 5123485 | Lillian Seng... | COMP9021 | 18s2 | Principles o... | 46 | FL 5123504 | Yap Huawen | MATH5960 | 17s1 | Bayesian Inf... | 56 | PS 5123504 | Yap Huawen | COMP9318 | 17s1 | Data Warehou... | 76 | DN 5123504 | Yap Huawen | COMP9444 | 17s2 | Neural Netwo... | 53 | PS 5123504 | Yap Huawen | COMP9814 | 18s1 | Ext Artifici... | | NF 5123504 | Yap Huawen | COMP6714 | 18s2 | Info Retriev... | 86 | HD 5123523 | Cindy Nava G... | COMP9313 | 17s1 | Big Data Man... | 55 | PS 5123523 | Cindy Nava G... | COMP9444 | 17s2 | Neural Netwo... | 63 | PS 5123523 | Cindy Nava G... | COMP9417 | 18s1 | Machine Lear... | | AW 5123523 | Cindy Nava G... | COMP9318 | 19T1 | Data Warehou... | 43 | FL 5123523 | Cindy Nava G... | COMP6714 | 19T3 | Info Retriev... | | AW 5123523 | Cindy Nava G... | COMP9321 | 20T1 | Data Service... | | SY
Identify functional dependencies
❖ Exercise: Functional Dependencies (2) |
What functional dependencies exist in the following table:
A | B | C | D -----+-----+-----+----- 1 | a | 6 | x 2 | b | 7 | y 3 | c | 7 | z 4 | d | 6 | x 5 | a | 6 | y 6 | b | 7 | z 7 | c | 7 | x 8 | d | 6 | y
How is this case different to the previous ones?
❖ Exercise: Functional Dependencies (3) |
What functional dependencies exist in the following table:
A | B | C | D -----+-----+-----+----- 1 | a | 6 | x 2 | b | 7 | y 3 | c | 7 | z 4 | d | 6 | x 5 | a | 6 | y 6 | b | 7 | z 7 | c | 7 | x 8 | d | 6 | y
How is this case different to the previous ones?
❖ Functional Dependency |
Above, we considered dependency within a relation instance r(R).
More important for design is dependency across all possible instances of the relation (i.e. a schema-based dependency).
This is a simple generalisation of the previous definition:
E.g. course enrolment example
❖ Functional Dependency (cont) |
Can we generalise some ideas about functional dependency?
E.g. are there dependencies that hold for any relation?
❖ Inference Rules |
Armstrong's rules are general rules of inference on fds.
F1. Reflexivity e.g. X → X
F2. Augmentation e.g. X → Y ⇒ XZ → YZ
F3. Transitivity e.g. X → Y, Y → Z ⇒ X → Z
F4. Additivity e.g. X → Y, X → Z ⇒ X → YZ
F5. Projectivity e.g. X → YZ ⇒ X → Y, X → Z
F6. Pseudotransitivity e.g. X → Y, YZ → W ⇒ XZ → W
F1 - F3 are essential; the rest are useful to have
❖ Inference Rules (cont) |
Example: determining validity of AB → GH, given:
F = { AB → E, AG → J, BE → I, E → G, GI → H }
AB → E | (given) | |||
E → G | (given) | |||
AB → G | (using F3 on 1,2) | |||
AB → AB | (using F1) | |||
AB → B | (using F5 on 4) | |||
AB → BE | (using F4 on 1,5) | |||
BE → I | (given) | |||
AB → I | (using F3 on 6,7) | |||
AB → GI | (using F4 on 3,8) | |||
GI → H | (given) | |||
AB → H | (using F3 on 9,10) | |||
AB → GH | (using F4 on 3,11) |
❖ Closures |
Given a set F of fds, how many new fds can we derive?
For a finite set of attributes, there must be a finite set of derivable fds.
Closure = F+ = largest collection of dependencies derivable from F
Closures allow us to answer two interesting questions:
❖ Closures (cont) |
For the question "is X → Y derivable from F?" ...
R = ABC, F = { AB → C, C → B }
F+ = { A → A, AB → A,
AC → A, AB → B,
BC → B, ABC → B,
C → C, AC → C,
BC → C, ABC → C,
AB → AB, . . . . . . ,
AB → ABC, AB → ABC,
C → B, C → BC,
AC → B, AC → AB }
❖ Closures (cont) |
Algorithms based on F+ rapidly become infeasible.
To solve this problem ...
Given a set X of attributes and a set F of fds, the closure of X (denoted X+) is
We can prove (using additivity) that (X → Y) ∈ F+ iff Y ⊂ X+.
For computation purposes, |X+| is bounded by the number of attributes.
❖ Closures (cont) |
For the question "is X → Y derivable from F?" ...
❖ Exercise: Determining Keys |
Determine possible primary keys for each of the following:
Produced: 30 Mar 2023