Relational Data Model


Relational Data Model

The relational data model describes the world as:

Goal of relational model:

Can be viewed as an attempt to formalise the file organisations that were in common use at the time the model was developed.

Relational Data Model (cont)

The relational data model has existed for over 30 years.

(The original description is Codd, Communications of the ACM, 13(6), 1970)

The relational model has provided the basis for:

It is a very influential development in CS, for which Codd received a Turing award.

Terminology

A note on the terminology used in the relational model ...

The relational model is a mathematical theory; it has no "standard".

However, it also has a close connection to file/data structures.

There are thus two kinds of terminology in use:

Textbooks alternate between the two; treat them as synonyms.


Relations

The relational model has one structuring mechanism ...

Each relation schema (denoted R,S,T,...) has: Each attribute (denoted A,B,... or a1,a2,...) has:

Relations (cont)

Attribute values are atomic   (no composite or multi-valued attributes).

Attribute domains are typically: numbers, strings, booleans.

There is a distinguished value NULL that belongs to all domains.

A database is a collection of associated relations.


Relations (cont)

Example relation (bank accounts):

[Diagram:Pic/er-rel/table.png]


Example Database

[Diagram:Pic/er-rel/db.png]


Example Database (cont)

A tuple is a set of values; a relation is a set of tuples.

Since a relation is a set, there is no ordering on rows.

Normally, we define a standard ordering on components of a tuple.

The following are different presentations of the same relation:

[Diagram:Pic/er-rel/tables.png]


Example Database (cont)

Consider a relation R :

R(a1, a1, ... an)   (alternatively, D1 × D2 × ... × Dn) A particular subset r of D1 × D2 × ... × Dn

Example Database (cont)

Schema names are typically unique within a given database.

So, we often use R as a synonym for R(a1, a1, ... an).

r(R) is used to denote that r is an instance of the schema R.

The number of attributes (n) in a schema is its degree (arity).

Note: the phrase " the relation R " can refer to either

The intended usage is generally clear from the context.


Example Database (cont)

E.g. the Accounts schema has type String × String × Int :

   Account(branchName, accountNo, balance)

E.g. the Account instance (set of tuples) from the diagram:


{
  (Downtown, A-101, 500),   (Mianus, A-215, 700),
  (Perryridge, A-102, 400), (Round Hill, A-305, 350),
  (Brighton, A-201, 900),   (Redwood, A-222, 700),
  (Brighton, A-217, 750)
}

Notes:


Changing Relations

In making changes to relations, it is ...

The reasons: Schema update is a well-known and not well-solved problem in RDBMSs.

Constraints on Relational Data


Constraints

A very important feature of the relational model:

This is useful because it allows

Integrity Constraints

Relations allow us to represent data and associations.

Domains limit the values that attributes can take.

However, to fully represent the semantics of real-world problems, we need more detailed ways of specifying

Integrity constraints are logical statements about data that provide such information.

Some examples:


Integrity Constraints (cont)

Several kinds of constraints exist e.g.

key combination of attributes must be unique
entity integrity no attribute in key may be NULL
referential integrity references to tuples in other relations must be valid
domain value of attribute must satisfy certain property

Functional dependencies are another important kind of constraint, related to database design; we cover them in considerable detail later.


Integrity Constraints (cont)

Associating an attribute to a domain restricts its possible values to a well-defined set (e.g. integer).

Domain constraints allow more "fine-grained" definition of potential attribute values.

Example:

An age attribute is typically defined as integer ...

Note: the NULL value satisfies any domain constraint.

Integrity Constraints (cont)

Relational tuples have no notion of identity like OIDs.

Identity is value-based (as in ER model)

Relational model supports same notions of key as ER model: Keys are often implemented by introducing an artificial attribute specifically for the purpose of being a key (e.g. student ID, account number).

Integrity Constraints (cont)

Referential integrity constraints are relevant for inter-relation references.

Example:

The notion that the branchName must refer to a valid branch is a referential integrity constraint.

Foreign Keys

Referential integrity is related to the notion of a foreign key.

A set of attributes FK from a relation schema R1 is a foreign key if:


Foreign Keys (cont)

Foreign keys are critical in relational databases

Special notation for foreign/primary keys:

Foreign Keys (cont)

Foreign key examples:

[Diagram:Pic/er-rel/fkeys.png]


Relational Databases

Relations, keys, foreign keys, and integrity constraints provide a complete toolkit for building relational databases.

A relational database schema is

A relational database instance is

Constraint Checking

If we have a database instance that satisfies all integrity constraints, what can go wrong?

The data might change     constraints need to be re-checked.

Possible changes:


Constraint Checking (cont)

For domain constraints ...

Insert:

Delete: Update:

Constraint Checking (cont)

These changes satisfy domain constraints:

insert Account(Downtown, A-456, 600)
insert Account(Perryridge, A-321, 200)
insert Account(Perryridge, A-102, 750)
   # but note duplicate key value
insert Account(Perryrige, A-131, 450)
   # value looks ok, but isn't correct

These changes do not satisfy domain constraints:

insert Account(Downtown, A-321, money)
   # 3rd attribute (a3) fails type check
insert Account(XYZZY, Hello, 300)
   # if we check for "lexically sensible" values on a1,a2
insert Account(Brighton, A-402, -500)
   # if we check for positive opening balance


Constraint Checking (cont)

For key constraints ...

Insert:

Delete: Update:

Constraint Checking (cont)

These changes satisfy key constraints:

insert Account(Downtown, A-456, 600)
insert Depositor(A-101, 9876543)
   # ok, only part of key duplicated
update Account(Downtown, A-101, 500)
    to Account(Downtown, A-101, 600)
   # ok, key attributes were not changed
insert Depositor(A-305, 8888888)
   # but no such customer

These changes do not satisfy key constraints:

insert Account(Perryridge, A-102, 750)
   # key A-102 already exists in relation
update Account(Downtown, A-101, 500)
    to Account(Downtown, A-201, 500)
   # key A-201 already exists in relation


Constraint Checking (cont)

For referential integrity constraints ...

Insert:

Delete: Update:

Constraint Checking (cont)

Example of deletion with foreign keys:

[Diagram:Pic/er-rel/deletion.png]


Constraint Checking (cont)

How to handle violation of referential constraints on deletion?

One approach:

Another approaches:

Constraint Checking (cont)

These changes satisfy referential integrity constraints:

insert Account(Downtown, A-456, 600)
insert Depositor(A-215, 9876543)
update Account(Downtown, A-101, 500)
    to Account(Perryridge, A-101, 500)
   # ok, a1 changed to valid FK

These changes do not satisfy referential integrity constraints:

insert Account(Wombatville, A-987, 500)
   # no such branch
insert Depositor(A-305, 8888888)
   # valid account, but no such customer
update Account(Downtown, A-101, 500)
    to Account(Nowhere, A-101, 500)
   # no such branch


Constraint Checking (cont)

These changes satisfy referential integrity constraints:

delete Depositor(A-102, 1234567)
delete Depositor(A-101, 1313131)
   # although A-101 now has no "owner"
delete Branch(North Town, Rye, 3700000)
   # ok, since no accounts or customers (but assets?)

These changes do not satisfy referential integrity constraints:

delete Branch(Perryridge, Horseneck, 1700000)
   # some accounts are held at Perryridge
delete Customer(Smith, Rye, 1234567, Mianus)
   # Depositor records become invalid


Mapping ER Designs to Relational Schemas


ER to Relational Mapping

As noted earlier, one useful strategy for database design:

By examining semantic correspondences, a formal mapping between the ER and relational models has been developed.

Because it is formal, it can be automated, and commercial tools now exist to perform it.


ER to Relational Mapping (cont)

If we have tools, why worry about the mapping process itself?

It is still useful to understand how mapping occurs because:

Also, you're CSE students and you like to know how things work.


Relational Model vs ER Model

The relational and ER data models have some obvious correspondences:

Entity/Relationship Relational
Attributes Attributes (atomic)
Entity
Relationship
Relation schema
Entity instance
Relationship instance
Tuple (row, record)
Entity set
Relationship set
Relation instance

Relational Model vs ER Model (cont)

There are also differences between relational and ER models.

Compared to ER, the relational model:


Relational Model vs ER Model (cont)

At first glance, Relational looks less powerful than ER:

However, the relational model:

Mapping Strong Entities

An entity consists of:

A relation schema consists of: So, even the mapping from entity to relation schema is not simple.

Mapping Strong Entities (cont)

In one special case, there is an obvious mapping:

maps to Each row in the relation R corresponds to an entity in E.

The key for the relation is the same (set of attributes) as for the entity set.


Mapping Strong Entities (cont)

ER supports composite (hierarchical) attributes.

The relational model supports only atomic attributes.

Composite attributes consist of

[Diagram:Pic/er-rel/composite.png]


Mapping Strong Entities (cont)

One approach to mapping composite attributes:

E.g.   Struct A {W, Struct B {X,Y}, Z} (W,X,Y,Z)

It is common to retain structuring attribute as part of name to resolve name conflicts.

E.g. Struct Addr {number,street,suburb,pcode}
maps to (AddrNumber,AddrStreet,AddrSuburb,AddrPcode)


Mapping Strong Entities (cont)

Alternative approach to mapping composite attributes:

E.g.   Struct name {"John","Smith"} "John Smith"

However, this approach "hides" information from data manipulation languages:


Mapping Strong Entities (cont)

Example:

[Diagram:Pic/er-rel/mapstrent.png]


Mapping Weak Entities

A weak entity set W

If we simply form a relation for W by mapping its attributes, it would not be a valid relation because it would not have a key.

The solution:

This always yields a relation with a valid key.

Mapping Weak Entities (cont)

More formally:

The key is E's key (foreign key in W) plus the discriminator of W.

The weak relationship set between W and E is not explicitly represented.


Mapping Weak Entities (cont)

Example:

[Diagram:Pic/er-rel/mapwkent.png]


Mapping N:M Relationships

A binary relationship set B between entity sets E and F gives associations between pairs of entities in E and F

[Diagram:Pic/er-rel/nmrelationship.png]

We can represent

But how to represent B?


Mapping N:M Relationships (cont)

One possibility: represent the relationship set B explicitly by a relation R.

Each tuple (row) in R represents the relationship between a specific pair of entities from E and F.

For this to work, the tuple would need to contain information to identify the entities involved

This is achieved by storing the keys of the related entities.

It is somewhat like breaking the ER diagram up as follows:

[Diagram:Pic/er-rel/nmrelationship2.png]


Mapping N:M Relationships (cont)

A relationship set B(E,F) is represented by a relation R containing:

where S and T are relations representing entity sets E and F.

The key for R is the union of the key attributes for S and T.


Mapping N:M Relationships (cont)

This approach for representing relationships works generally:

but requires a new relation to be created for each relationship set.

This can slow down query processing considerably.

In certain special cases, we do not need to create a new relation (see later).


Mapping N:M Relationships (cont)

Example:

[Diagram:Pic/er-rel/mapnnrel.png]


Mapping 1:N Relationships

Consider a 1:N relationship R between entity sets E and F

As above, we represent E and F by relations S and T.

How to capture the association between an entity in F and the corresponding entity in E?

We have already seen one solution: introduce a new relation for R.


Mapping 1:N Relationships (cont)

Since there is (at most) one corresponding entity, add attributes in F:

In other words, we insert a foreign key for E into F,
along with any attributes for the relationship R.

If an entity in F has no relationship with E give NULL values to the "extra" attributes in F.


Mapping 1:N Relationships (cont)

Example (generic mapping):

[Diagram:Pic/er-rel/map1nrel1.png]


Mapping 1:N Relationships (cont)

Example (optimised mapping):

[Diagram:Pic/er-rel/map1nrel2.png]


Mapping 1:1 Relationships

1:1 relationships are handled in a similar manner to 1:N relationships.

The difference is that we could choose either relation to hold the key of the other relation, to represent the correspondence.

Choose the entity set that participates totally, if only one of them does.

For a 1:1 relationship between entity sets E and F (S and T):


Mapping 1:1 Relationships (cont)

Example:

[Diagram:Pic/er-rel/map11rel.png]


Mapping Multi-valued Attributes

An attribute in a relation may hold a single atomic value.

An attribute in an entity may hold multiple (structured) values.

A multi-valued attribute may be viewed as:

so treat it like an N:M relationship between entities and values.

Create a new relation where each tuple contains:


Mapping Multi-valued Attributes (cont)

Example:

[Diagram:Pic/er-rel/mapmva.png]


Mapping Multi-valued Attributes (cont)

This approach is like altering the ER diagram as follows:

[Diagram:Pic/er-rel/mapmva1.png]


Mapping Multi-valued Attributes (cont)

Example: the two entities


Person(12345, John, 12-feb-1990, [red,green,blue])
Person(54321, Jane, 25-dec-1990, [green,purple])

would be represented as

Person(12345, John, 12-feb-1990)
Person(54321, Jane, 25-dec-1990)
FavColour(12345, red)
FavColour(12345, green)
FavColour(12345, blue)
FavColour(54321, green)
FavColour(54321, purple)


Mapping Subclasses

Each subclass is represented as a separate relation.

Each entity in the subclass:

Use the superclass entity's primary key to capture the association.

Each tuple in the subclass relation contains:


Mapping Subclasses (cont)

Example:

[Diagram:Pic/er-rel/mapsubclass.png]


Mapping Subclasses (cont)

This approach is like transforming the ER as follows:

[Diagram:Pic/er-rel/mapsubclass1.png]


Mapping Subclasses (cont)

This approach to subclass mapping is called "ER style"

There are two other approaches to subclass mapping:


Mapping Subclasses (cont)

Example of object-oriented mapping:

[Diagram:Pic/er-rel/mapsubclass2.png]


Mapping Subclasses (cont)

Example of single-table-with-nulls mapping:

[Diagram:Pic/er-rel/mapsubclass3.png]


Mapping Subclasses (cont)

Which mapping is best depends on other requirements ...


Produced: 13 Sep 2020