Programming with Databases

COMP3311 20T3 ♢ Programming with Databases ♢ [0/12]
❖ Programming with Databases

So far, we have seen ...

All of the above programming
COMP3311 20T3 ♢ Programming with Databases ♢ [1/12]
❖ Programming with Databases (cont)

While SQL (+ PLpgSQL) gives a powerful data access mechanism

Complete applications require code to "Conventional" programming languages (PLs) provide these.

We need PL + DBMS connectivity.

COMP3311 20T3 ♢ Programming with Databases ♢ [2/12]
❖ Programming with Databases (cont)

Requirements of an interface between PL and RDBMS:

Distance between PL and DBMS is variable, e.g.
COMP3311 20T3 ♢ Programming with Databases ♢ [3/12]
❖ Programming with Databases (cont)

Programming Language / DBMS archtecture:

[Diagram:Pics/pldb/dbpl-arch.png]

COMP3311 20T3 ♢ Programming with Databases ♢ [4/12]
❖ PL/DB Interface

Common DB access API used in programming languages

db = connect_to_dbms(DBname,User/Password);

query = build_SQL("SqlStatementTemplate",values);

results = execute_query(db,query);

while (more_tuples_in(results))
{
   tuple = fetch_row_from(results);
   // do something with values in tuple ...
}

This pattern is used in many different libraries:

COMP3311 20T3 ♢ Programming with Databases ♢ [5/12]
❖ PL/DB Interface (cont)

DB access libraries have similar overall structure.

But differ in the details:

Object-relational mappers (ORMs) ...
COMP3311 20T3 ♢ Programming with Databases ♢ [6/12]
❖ PL/DB Mismatch

There is a tension between PLs and DBMSs

If not handled carefully, can lead to inefficient use of DB.

Note: relative costs of DB access operations from PL:

COMP3311 20T3 ♢ Programming with Databases ♢ [7/12]
❖ PL/DB Mismatch (cont)

Consider this (imaginary) PL/DBMS access method:

--  establish connection to DBMS
db = dbAccess("DB");
query = "select a,b from R,S where ... ";
--  invoke query and get handle to result set
results = dbQuery(db, query);
--  for each tuple in result set
while (tuple = dbNext(results)) {
    --  process next tuple
    process(tuple['a'], tuple['b']);
}


Estimated costs:  dbAccess = 500ms,  dbQuery = 200ms,  dbNext = 10ms

In later cost estimates, ignore dbAccess ... same base cost for all examples

COMP3311 20T3 ♢ Programming with Databases ♢ [8/12]
❖ PL/DB Mismatch (cont)

Example: find mature-age students   (e.g. 10000 students, 500 over 40)

query = "select * from Student";
results = dbQuery(db, query);
while (tuple = dbNext(results)) {
    if (tuple['age'] >= 40) {
        --  process mature-age student
    }
}


We transfer 10000 tuples from DB, 9500 are irrelevant

Cost = 1*200 + 10000*10 = 100200ms = 100s

COMP3311 20T3 ♢ Programming with Databases ♢ [9/12]
❖ PL/DB Mismatch (cont)

E.g. should be implemented as:

query = "select * from Student where age >= 40";
results = dbQuery(db, query);
while (tuple = dbNext(results)) {
    --  process mature-age student
}


Transfers only the 500 tuples that are needed.

Cost = 1*200 + 500*10 = 5200ms = 5s

COMP3311 20T3 ♢ Programming with Databases ♢ [10/12]
❖ PL/DB Mismatch (cont)

Example: find info about all marks for all students

query1 = "select id,name from Student";
res1 = dbQuery(db, query1);
while (tuple1 = dbNext(res1)) {
    query2 = "select course,mark from Marks"
             + " where student = " + tuple1['id'];
    res2 = dbQuery(db,query2);
    while (tuple2 = dbNext(res2)) {
        --  process student/course/mark info
    }
}

E.g. 10000 students, each with 8 marks, ⇒ run 10001 queries

Cost = 10001*200 + 80000*10 = 2800s = 46min

COMP3311 20T3 ♢ Programming with Databases ♢ [11/12]
❖ PL/DB Mismatch (cont)

E.g. should be implemented as:

query = "select id,name,course,mark"
        + " from Student s join Marks m "
        + " on (s.id=m.student)"
results = dbQuery(db, query);
while (tuple = dbNext(results)) {
    --  process student/course/mark info
}


We invoke 1 query, and transfer same number of tuples.

Cost = 1*200ms + 80000*10ms = 800s = 13min

COMP3311 20T3 ♢ Programming with Databases ♢ [12/12]


Produced: 25 Oct 2020