COMP9315 Week 01 Thursday Lecture

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [0/53]
❖ Things To Note


COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [1/53]
❖ Database Engine Operations

DB engine = "relational algebra virtual machine".

Common operators of relational algebra (RA):

selection (σ ) projection (π ) join ( )
union () intersection () difference (-)
sort group aggregate

For each of these operations:

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [2/53]
❖ Relational Algebra

All RA operators return a result of type relation.

For convenience, we can name a result and use it later.

E.g. database   R1(x,y),   R2(y,z),  


Tmp1(x,y)   = Sel[x>5]R1
Tmp2(y,z)   = Sel[z=3]R2
Tmp3(x,y,z) = Tmp1 Join Tmp2
Res(x,z)    = Proj[x,z] Tmp3
-- which is equivalent to
Tmp1(x,y,z) = R1 Join R2
Tmp2(x,y,z) = Sel[x>5 & z=3] Tmp1
Res(x,z)    = Proj[x,z]Tmp2

Each "intermediate result" has a well-defined schema.

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [3/53]
❖ Exercise: Relational Algebra (i)

Assume a schema: R(a,b,c), S(x,y)

Translate each of the following SQL statements to RA

Indicate: the fields and # tuples in the result
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [4/53]
❖ Exercise: Relational Algebra (ii)

Using the following student/course/enrolment schema:

Students(sid, name, degree, ...)
Courses(cid, code, term, title, ...)
Enrolments(sid, cid, mark, grade)

write relational algebra expressions to solve the problem

Express it as a sequence of steps, where each step uses one RA operation.
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [5/53]
❖ DBMS Query Evaluation

Path of a query through a typical DBMS:

[Diagram:Pics/intro/qryeval1.png]

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [6/53]
❖ A Relational Algebra Engine

[Diagram:Pics/intro/dbmsarch1.png]

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [7/53]
❖ PostgreSQL

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [8/53]
❖ PostgreSQL

PostgreSQL is a full-featured open-source (O)RDBMS.

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [9/53]
❖ User View of PostgreSQL

Users interact via SQL in a client process, e.g.


$ psql webcms
psql (15.6)
Type "help" for help.
webcms2=# select * from calendar;
 id | course |   evdate   |      event
----+--------+------------+---------------------------
  1 |      4 | 2001-08-09 | Project Proposals due
 10 |      3 | 2001-08-01 | Tute/Lab Enrolments Close
 12 |      3 | 2001-09-07 | Assignment #1 Due (10pm)
 ...

or


$dbconn = pg_connect("dbname=webcms");
$result = pg_query($dbconn,"select * from calendar");
while ($tuple = pg_fetch_array($result))
   { ... $tuple["event"] ... }

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [10/53]
❖ PostgreSQL Functionality

PostgreSQL systems deal with various kinds of objects:

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [11/53]
❖ PostgreSQL Functionality (cont)

PostgreSQL's dialect of SQL is mostly standard (but with extensions).

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [12/53]
❖ PostgreSQL Functionality (cont)

PostgreSQL stored procedures differ from SQL standard:

create function ( ArgTypes ) returns ResultType
as $$
... body of function definition ...
$$ language FunctionBodyLanguage;

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [13/53]
❖ PostgreSQL Functionality (cont)

Example:

create or replace function
    barsIn(suburb text) returns setof Bars
as $$
declare
    r record;
begin
    for r in
        select * from Bars where location = suburb
    loop
       return next r;
    end loop;
end;
$$ language plpgsql;
used as e.g.
select * from barsIn('Randwick');

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [14/53]
❖ PostgreSQL Functionality (cont)

Uses multi-version concurrency control (MVCC)

Disadvantages of this approach:
PostgreSQL also provides locking to enforce critical concurrency.
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [15/53]
❖ PostgreSQL Functionality (cont)

PostgreSQL has a well-defined and open extensibility model:

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [16/53]
❖ Catalogs

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [17/53]
❖ Database Objects

RDBMSs manage different kinds of objects

Many objects have names (and, in PostgreSQL, all can have OIDs).

How are the different types of objects represented?

How do we go from a name (or OID) to bytes stored on disk?

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [18/53]
❖ Catalogs

Consider what information the RDBMS needs about relations:

Similarly for other DBMS objects (e.g. views, functions, triggers, ...)

This information is stored in the system catalog tables

Standard for catalogs in SQL:2003: INFORMATION_SCHEMA

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [19/53]
❖ Catalogs (cont)

The catalog is affected by several types of SQL operations:

where Object is one of table, view, function, trigger, schema, ...

E.g. drop table Groups; produces something like

delete from Tables
where  schema = 'public' and name = 'groups';

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [20/53]
❖ Catalogs (cont)

In PostgreSQL, the system catalog is available to users via:


The low-level representation is available to sysadmins via:
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [21/53]
❖ Catalogs (cont)

You can explore the PostgreSQl catalog via psql commands

You can also explore via SQL on the catalog tables
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [22/53]
❖ Catalogs (cont)

A PostgreSQL installation (cluster) typically has many DBs

Some catalog information is global, e.g.

Other catalog information is local to each database, e.g
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [23/53]
❖ Catalogs (cont)

Side-note:   PostgreSQL tuples contain

OIDs are used as primary keys in many of the catalog tables.
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [24/53]
❖ Representing Databases

Above the level of individual DB schemata, we have:

These tables are global to each PostgreSQL cluster.

Keys are names (strings) and must be unique within cluster.

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [25/53]
❖ Representing Databases (cont)

pg_database contains information about databases:

pg_namespace contains information about schemata: pg_tablespace contains information about tablespaces: PostgreSQL represents access via array of access items:

Role=Privileges/Grantor

where Privileges is a string enumerating privileges, e.g.

jas=arwdRxt/jas,fred=r/jas,joe=rwad/jas

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [26/53]
❖ Representing Tables

Representing one table needs tuples in several catalog tables.

Due to O-O heritage, base table for tables is called pg_class.

The pg_class table also handles other "table-like" objects:

All tuples in pg_class have an OID, used as primary key.

Some fields from the pg_class table:

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [27/53]
❖ Representing Tables (cont)

Details of catalog tables representing database tables

pg_class holds core information about tables

pg_attribute contains information about attributes pg_type contains information about types
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [28/53]
❖ Exercise: Table Statistics

Using the PostgreSQL catalog, write a PLpgSQL function

create type TableInfo as (table text, ntuples int);
create function pop() returns setof TableInfo ...

Hints:

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [29/53]
❖ Exercise: Extracting a Schema

Write a PLpgSQL function:

It should behave as follows:

db=# select * from schema();
            tables
---------------------------
 table1(x, y, z)
 table2(a, b)
 table3(id, name, address)
...

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [30/53]
❖ Exercise: Enumerated Types

PostgreSQL allows you to define enumerated types, e.g.

create type Mood as enum ('sad', 'happy');

Creates a type with two ordered values 'sad' < 'happy'

What is created in the catalog for the above definition?

Hint:

pg_type(oid, typname, typelen, typetype, ...)
pg_enum(oid, enumtypid, enumlabel)

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [31/53]
❖ PostgreSQL Architecture

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [32/53]
❖ PostgreSQL Architecture

Client/server architecture:

[Diagram:Pics/intro/proc-arch.png]


The listener process is sometimes called postmaster

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [33/53]
❖ PostgreSQL Architecture (cont)

Memory/storage architecture:

[Diagram:Pics/intro/mem-arch.png]

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [34/53]
❖ PostgreSQL Architecture (cont)

File-system architecture:

[Diagram:Pics/intro/file-arch.png]

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [35/53]
❖ Exercise: PostgreSQL Data Files

PostgreSQL uses OIDs as

Using the pg_catalog tables, find .. Relevant catalog info ...

pg_database(oid,datname,...)
pg_class(oid,relname,...)

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [36/53]
❖ PostgreSQL Source Code

Top-level of PostgreSQL distribution contains:

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [37/53]
❖ PostgreSQL Source Code (cont)

The source code directory (src) contains:

along with Makefiles to build system and other directories ...

Code for backend (DBMS engine)

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [38/53]
❖ PostgreSQL Source Code (cont)

How to get started understanding the workings of PostgreSQL:

Some helpful information is available via:
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [39/53]
❖ PostgreSQL Source Code (cont)

PostgreSQL documentation has detailed description of internals:

See also "How PostgreSQL Processes a Query"
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [40/53]
❖ Life-cycle of a PostgreSQL query

How a PostgreSQL query is executed:

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [41/53]
❖ Life-cycle of a PostgreSQL query (cont)

Data flow to get to execute a query:

[Diagram:Pics/intro/pg-processes.png]

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [42/53]
❖ PostgreSQL server

PostgresMain(int argc, char *argv[], ...)

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [43/53]
❖ PostgreSQL server (cont)

As well as handling SQL queries, PostgresqlMain also

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [44/53]
❖ PostgreSQL Data Types

Data types defined in *.h files under src/include/

Two important data types: Node and List

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [45/53]
❖ PostgreSQL Query Evaluation

exec_simple_query(const char *query_string)

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [46/53]
❖ PostgreSQL Query Evaluation (cont)

pg_parse_query(char *sqlStatements)

pg_analyze_and_rewrite(Node *parsetree, ...)
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [47/53]
❖ PostgreSQL Query Evaluation (cont)

Each query is represented by a Query structure

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [48/53]
❖ PostgreSQL Query Evaluation (cont)

pg_plan_queries(querytree_list, ...)

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [49/53]
❖ PostgreSQL Query Evaluation (cont)

Each executable query is represented by a PlannedStmt node

Each Plan node represents one relational operation
COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [50/53]
❖ PostgreSQL Query Evaluation (cont)

PlannedStmt *planner(Query *parse, ...)

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [51/53]
❖ PostgreSQL Query Evaluation (cont)

Queries run in a Portal environment containing

Portal defined in src/include/utils/portal.h

PortalRun() function also requires

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [52/53]
❖ PostgreSQL Query Evaluation (cont)

How query evaluation happens in exec_simple_query():

COMP9315 24T1 ♢ Week 1 Thursday Lecture ♢ [53/53]


Produced: 16 Feb 2024