❖ Normalisation |
Normalisation aims to put a schema into xNF
❖ Normalisation (cont) |
In practice, BCNF and 3NF are the most important normal forms.
Boyce-Codd Normal Form (BCNF):
❖ Relation Decomposition |
The standard transformation technique to remove redundancy:
We accomplish decomposition by
May require several decompositions to achieve acceptable NF.
Normalisation algorithms tell us how to choose S and T.
❖ Schema (Re)Design |
Consider the following relation for BankLoans:
branchName | branchCity | assets | custName | loanNo | amount |
Downtown | Brooklyn | 9000000 | Jones | L-17 | 1000 |
Redwood | Palo Alto | 2100000 | Smith | L-23 | 2000 |
Perryridge | Horseneck | 1700000 | Hayes | L-15 | 1500 |
Downtown | Brooklyn | 9000000 | Jackson | L-15 | 1500 |
Mianus | Horseneck | 400000 | Jones | L-93 | 500 |
Round Hill | Horseneck | 8000000 | Turner | L-11 | 900 |
North Town | Rye | 3700000 | Hayes | L-16 | 1300 |
This schema has all of the update anomalies mentioned earlier.
❖ Schema (Re)Design (cont) |
To improve the design, decompose the BankLoans relation.
The following decomposition is not helpful:
Branch(branchName, branchCity, assets)
CustLoan(custName, loanNo, amount)
because we lose information (which branch is a loan held at?)
Another possible decomposition:
BranchCust(branchName, branchCity, assets, custName)
CustLoan(custName, loanNo, amount)
❖ Schema (Re)Design (cont) |
The BranchCust relation instance:
branchName | branchCity | assets | custName |
Downtown | Brooklyn | 9000000 | Jones |
Redwood | Palo Alto | 2100000 | Smith |
Perryridge | Horseneck | 1700000 | Hayes |
Downtown | Brooklyn | 9000000 | Jackson |
Mianus | Horseneck | 400000 | Jones |
Round Hill | Horseneck | 8000000 | Turner |
North Town | Rye | 3700000 | Hayes |
❖ Schema (Re)Design (cont) |
The CustLoan relation instance:
custName | loanNo | amount |
Jones | L-17 | 1000 |
Smith | L-23 | 2000 |
Hayes | L-15 | 1500 |
Jackson | L-15 | 1500 |
Jones | L-93 | 500 |
Turner | L-11 | 900 |
Hayes | L-16 | 1300 |
❖ Schema (Re)Design (cont) |
Now consider the result of (BranchCust ⨝ CustLoan)
branchName | branchCity | assets | custName | loanNo | amount |
Downtown | Brooklyn | 9000000 | Jones | L-17 | 1000 |
Downtown | Brooklyn | 9000000 | Jones | L-93 | 500 |
Redwood | Palo Alto | 2100000 | Smith | L-23 | 2000 |
Perryridge | Horseneck | 1700000 | Hayes | L-15 | 1500 |
Perryridge | Horseneck | 1700000 | Hayes | L-16 | 1300 |
Downtown | Brooklyn | 9000000 | Jackson | L-15 | 1500 |
Mianus | Horseneck | 400000 | Jones | L-93 | 500 |
Mianus | Horseneck | 400000 | Jones | L-17 | 1000 |
Round Hill | Horseneck | 8000000 | Turner | L-11 | 900 |
North Town | Rye | 3700000 | Hayes | L-16 | 1300 |
North Town | Rye | 3700000 | Hayes | L-15 | 1500 |
❖ Schema (Re)Design (cont) |
This is clearly not a successful decomposition.
The fact that we ended up with extra tuples was symptomatic of losing some critical "connection" information during the decomposition.
Such a decomposition is called a lossy decomposition.
In a good decomposition, we should be able to reconstruct the original relation exactly:
Such a decomposition is called lossless join decomposition.
❖ BCNF Normalisation Algorithm |
The following algorithm converts an arbitrary schema to BCNF:
Inputs: schema R, set F of fds Output: set Res of BCNF schemas Res = {R}; while (any schema S ∈ Res is not in BCNF) { choose any fd X → Y on S that violates BCNF Res = (Res-S) ∪ (S-Y) ∪ XY }
The last step means: make a table from XY; drop Y from table S
The "choose any" step means that the algorithm is non-deterministic
❖ BCNF Normalisation Example |
Recall the BankLoans schema:
BankLoans(branchName, branchCity, assets, custName, loanNo, amount)
Rename to simplify ...
B = branchName, C = branchCity, A = assets, N = CustName, L = loanNo, M = amount
So ... R = BCANLM, F = { B → CA, L → MN }, key(R) = BL
R is not in BCNF, because B → CA is not a whole key
Decompose into
❖ BCNF Normalisation Example (cont) |
S = BCA is in BCNF, only one fd and it has key on LHS
T = BLNM is not in BCNF, because L → NM is not a whole key
Decompose into ...
❖ 3NF Normalisation Algorithm |
The following algorithm converts an arbitrary schema to 3NF:
Inputs: schema R, set F of fds Output: set Ri of 3NF schemas let Fc be a reduced minimal cover for F Res = {} for each fd X → Y in Fc { if (no schema S ∈ Res contains XY) { Res = Res ∪ XY } } if (no schema S ∈ Res contains a key for R) { K = any candidate key for R Res = Res ∪ K }
❖ 3NF Normalisation Example |
Recall the BankLoans schema:
BankLoans(branchName, branchCity, assets, custName, loanNo, amount)
Rename to simplify ...
R = BCANLM, F = { B → CA, L → MN }, key(R) = BL
Compute minimal cover = { B → C, B → A, L → M, L → N }
Reduce minimal cover = { B → CA, L → MN }
Convert into relations: S = BCA, T = LNM
No relation has key BL, so add new table containing key U = BL
Result is S = BCA, T = LNM, U = BL ... same as BCNF
❖ Database Design Methodology |
To achieve a "good" database design:
Produced: 5 Nov 2020