Week 04 Lectures


Tuples


Tuples2/84

Each page contains a collection of tuples

[Diagram:Pics/storage/page-tuples-small.png]


What do tuples contain? How are they structured internally?


Records vs Tuples3/84

A table is defined by a schema, e.g.

create table Employee (
   id   integer primary key,
   name varchar(20) not null,
   job  varchar(10), 
   dept number(4) references Dept(id)
);

where a schema is a collection of attributes  (name,type,constraints)

Schema information (meta-data) is stored in the DB catalog


... Records vs Tuples4/84

Tuple = collection of attribute values based on a schema, e.g.

(33357462, 'Neil Young', 'Musician', 0277)


Record = sequence of bytes, containing data for one tuple, e.g.

[Diagram:Pics/storage/tuple-bytes-small.png]


Bytes need to be interpreted relative to schema to get tuple
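
This interpretation step can be sketched as follows. The layout and names here are illustrative assumptions (a 4-byte id followed by a NUL-terminated name), not a real DBMS API:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory tuple for the Employee schema above */
typedef struct { uint32_t id; char name[21]; } EmployeeTuple;

/* Decode raw record bytes using knowledge of the schema:
 * without the schema, the same bytes could be read as anything */
void interpret(const unsigned char *rec, EmployeeTuple *t) {
    memcpy(&t->id, rec, sizeof(uint32_t));          /* first 4 bytes: id */
    strncpy(t->name, (const char *)(rec + 4), 20);  /* next bytes: name */
    t->name[20] = '\0';
}
```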


Converting Records to Tuples5/84

A Record is an array of bytes (byte[])

A Tuple is a collection of named, typed values  (cf. C struct)

[Diagram:Pics/storage/rec-to-tuple-small.png]


... Converting Records to Tuples6/84

Information on how to interpret bytes in a record ...

For variable-length records, some formatting info ...


Operations on Records7/84

Common operation on records ... access record via RecordId:

Record get_record(Relation rel, RecordId rid) {
    (pid,tid) = rid;
    Page buf = get_page(rel, pid);
    return get_bytes(rel, buf, tid);
}

Cannot use a Record directly; need a Tuple:

Relation rel = ... // relation schema
Record rec = get_record(rel, rid)
Tuple t = mkTuple(rel, rec)

Once we have a Tuple, we can access individual attributes/fields


Operations on Tuples8/84

Once we have a record, we need to interpret it as a tuple ...

Tuple t = mkTuple(rel, rec)


Once we have a tuple, we want to examine its contents ...

Typ   getTypField(Tuple t, int i)

E.g.   int x = getIntField(t,1),   char *s = getStrField(t,2)


Fixed-length Records9/84

A possible encoding scheme for fixed-length records:

[Diagram:Pics/storage/fixed-length-small.png]

Since record format is frequently used at query time, cache in memory.
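
Because every field has a known size, the record format (field offsets) can be computed once from the schema and then cached. A minimal sketch, with hypothetical names:

```c
#include <assert.h>

/* Hypothetical cached format entry: (offset,length) per field */
typedef struct { int offset; int length; } FieldInfo;

/* For fixed-length records, offsets are just a running sum
 * of the field lengths taken from the schema */
void compute_offsets(const int lengths[], int nfields, FieldInfo out[]) {
    int off = 0;
    for (int i = 0; i < nfields; i++) {
        out[i].offset = off;
        out[i].length = lengths[i];
        off += lengths[i];
    }
}
```

E.g. for assumed field sizes (id 4 bytes, name 21, job 11, dept 2), the dept field starts at offset 4+21+11 = 36.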


Variable-length Records10/84

Possible encoding schemes for variable-length records:


Data Types11/84

DBMSs typically define a fixed set of base types, e.g.

DATE,  FLOAT,  INTEGER,  NUMBER(n),  VARCHAR(n), ...

This determines implementation-level data types for field values:

DATE        time_t
FLOAT       float, double
INTEGER     int, long
NUMBER(n)   int[] (?)
VARCHAR(n)  char[]

PostgreSQL allows new base types to be added


Field Descriptors12/84

A Tuple could be implemented as

typedef struct {
  ushort    nfields;   // number of fields/attrs
  ushort    data_off;  // offset in struct for data
  FieldDesc *fields;   // array of field descriptions
  Record    data;      // pointer to record in buffer
} Tuple;

Fields are derived from relation descriptor + record instance data.
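
Given such a structure, an accessor like getIntField() can be sketched as below. This is a simplified, assumption-laden version (own type definitions, int fields copied by offset), not a real implementation:

```c
#include <assert.h>
#include <string.h>

typedef unsigned char *Record;
typedef unsigned short ushort;
typedef struct { ushort offset, length, type; } FieldDesc;
typedef struct {
    ushort     nfields;  // number of fields/attrs
    FieldDesc *fields;   // field descriptions
    Record     data;     // pointer to record in buffer
} Tuple;

/* Extract field i by copying bytes from its recorded offset */
int getIntField(Tuple *t, int i) {
    int val = 0;
    memcpy(&val, t->data + t->fields[i].offset, sizeof(int));
    return val;
}
```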


... Field Descriptors13/84

Tuple data could be

[Diagram:Pics/storage/rec8-small.png]


... Field Descriptors14/84

Or, tuple data could be ...

[Diagram:Pics/storage/rec9-small.png]


Exercise 1: How big is a FieldDesc?15/84

FieldDesc = (offset, length, type)

If pages are 8KB in size, how many bits are needed for each?

E.g.

[Diagram:Pics/storage/field-desc-small.png]


PostgreSQL Tuples16/84

Definitions: include/postgres.h,  include/access/*tup*.h

Functions: backend/access/common/*tup*.c  e.g.

PostgreSQL implements tuples via:


... PostgreSQL Tuples17/84

Tuple structure:

[Diagram:Pics/storage/pg-tuple-struct-small.png]


... PostgreSQL Tuples18/84

Tuple-related data types: (cont)

// TupleDesc: schema-related information for HeapTuples

typedef struct tupleDesc 
{
  int         natts;       // # attributes in tuple 
  Oid         tdtypeid;    // composite type ID for tuple type 
  int32       tdtypmod;    // typmod for tuple type 
  bool        tdhasoid;    // does tuple have oid attribute? 
  int         tdrefcount;  // reference count (-1 if not counting)
  TupleConstr *constr;     // constraints, or NULL if none 
  FormData_pg_attribute attrs[];
  // attrs[N] is the description of attribute N+1 
} *TupleDesc;


... PostgreSQL Tuples19/84

Tuple-related data types: (cont)

// FormData_pg_attribute:
// schema-related information for one attribute

typedef struct FormData_pg_attribute
{
  Oid      attrelid;    // OID of reln containing attr
  NameData attname;     // name of attribute
  Oid      atttypid;    // OID of attribute's data type
  int16    attlen;      // attribute length
  int32    attndims;    // # dimensions if array type
  bool     attnotnull;  // can attribute have NULL value
  .....                 // and many other fields
} FormData_pg_attribute;

For details, see include/catalog/pg_attribute.h


... PostgreSQL Tuples20/84

HeapTupleData contains information about a stored tuple

typedef struct HeapTupleData *HeapTuple;

typedef struct HeapTupleData
{
  uint32           t_len;  // length of *t_data 
  ItemPointerData t_self;  // SelfItemPointer 
  Oid         t_tableOid;  // table the tuple came from 
  HeapTupleHeader t_data;  // -> tuple header and data 
} HeapTupleData;

HeapTupleHeader is a pointer to a location in a buffer


... PostgreSQL Tuples21/84

PostgreSQL stores a single block of data for each tuple

  • containing a tuple header, followed by data byte[]

    typedef struct HeapTupleHeaderData // simplified
    {
      HeapTupleFields t_heap;
      ItemPointerData t_ctid;      // TID of newer version
      uint16          t_infomask2; // #attributes + flags
      uint16          t_infomask;  // flags e.g. has_null
      uint8           t_hoff;      // sizeof header incl. t_bits
      // above is fixed size (23 bytes) for all heap tuples
      bits8           t_bits[1];   // bitmap of NULLs, var.len.
      // OID goes here if HEAP_HASOID is set in t_infomask
      // actual data follows at end of struct
    } HeapTupleHeaderData;
    


    ... PostgreSQL Tuples22/84

    Some of the bits in t_infomask ..

    #define HEAP_HASNULL      0x0001
            /* has null attribute(s) */
    #define HEAP_HASVARWIDTH  0x0002
            /* has variable-width attribute(s) */
    #define HEAP_HASEXTERNAL  0x0004
            /* has external stored attribute(s) */
    #define HEAP_HASOID_OLD   0x0008 
            /* has an object-id field */
    

    Location of NULLs is stored in t_bits[] array


    ... PostgreSQL Tuples23/84

    Tuple-related data types: (cont)

    typedef struct HeapTupleFields  // simplified
    {
      TransactionId t_xmin;  // inserting xact ID
      TransactionId t_xmax;  // deleting or locking xact ID
      union {
        CommandId   t_cid;   // inserting or deleting command ID
        TransactionId t_xvac;// old-style VACUUM FULL xact ID 
      } t_field3;
    } HeapTupleFields;
    

    Note that not all of the stored tuple's system fields appear here


    PostgreSQL Attribute Values24/84

    Values of attributes in PostgreSQL tuples are packaged as Datums

    // representation of a data value
    typedef uintptr_t Datum;
    

    The actual data value may be held in the Datum itself (pass-by-value
    types) or referenced via a pointer held in the Datum (pass-by-reference types)


    ... PostgreSQL Attribute Values25/84

    Attribute values can be extracted as Datum from HeapTuples

    Datum heap_getattr(
          HeapTuple tup,     // tuple (in memory)
          int attnum,        // which attribute
          TupleDesc tupDesc, // field descriptors
          bool *isnull       // flag to record NULL
    )
    

    isnull is set to true if value of field is NULL

    attnum can be negative ... to access system attributes (e.g. OID)

    For details, see include/access/htup_details.h


    ... PostgreSQL Attribute Values26/84

    Values of Datum objects can be manipulated via e.g.

    // DatumGetBool:
    //   Returns boolean value of a Datum.
    
    #define DatumGetBool(X) ((bool) ((X) != 0))
    
    // BoolGetDatum:
    //   Returns Datum representation for a boolean.
    
    #define BoolGetDatum(X) ((Datum) ((X) ? 1 : 0))
    

    For details, see include/postgres.h
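
A round trip through these macros, with the definitions repeated from above so the fragment stands alone:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uintptr_t Datum;

/* macros as shown above (from include/postgres.h) */
#define DatumGetBool(X) ((bool) ((X) != 0))
#define BoolGetDatum(X) ((Datum) ((X) ? 1 : 0))

/* PostgreSQL defines analogous pairs for other pass-by-value
 * types, e.g. DatumGetInt32()/Int32GetDatum() */
```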


    Implementing Relational Operations


    DBMS Architecture (revisited)28/84

    Implementation of relational operations in DBMS:

    [Diagram:Pics/scansortproj/dbmsarch-relop-small.png]


    Relational Operations29/84

    DBMS core = relational engine, with implementations of

    In this part of the course ...

    Terminology reminder: ...


    ... Relational Operations30/84

    Two "dimensions of variation":

    Each query method involves an operator and a file structure.

    As well as query costs, consider update costs (insert/delete).


    ... Relational Operations31/84

    SQL vs DBMS engine


    Cost Models


    Cost Models33/84

    An important aspect of this course is

    Cost can be measured in terms of ...

    Primary assumptions in our cost models: ...


    ... Cost Models34/84

    Since time cost is affected by many factors

    we do not consider time cost in our analyses.

    For comparing methods, page cost is better

    Estimating costs with multiple concurrent ops and buffering is difficult!!

    Additional assumption: every page request leads to some i/o


    ... Cost Models35/84

    In developing cost models, we also assume:

    [Diagram:Pics/scansortproj/file-struct0-small.png]


    ... Cost Models36/84

    Our cost models are "rough" (based on assumptions)

    But do give an O(x) feel for how expensive operations are.

    Example "rough" estimation: how many piano tuners in Sydney?

    Actual number of tuners in Yellow Pages = 120

    Example borrowed from Alan Fekete at Sydney University.


    Query Types37/84

    Type   SQL                                  RelAlg            a.k.a.
    Scan   select * from R                      R                 -
    Proj   select x,y from R                    Proj[x,y]R        -
    Sort   select * from R order by x           Sort[x]R          ord
    Sel1   select * from R where id = k         Sel[id=k]R        one
    Seln   select * from R where a = k          Sel[a=k]R         -
    Join1  select * from R,S where R.id = S.r   R Join[id=r] S    -

    Different query classes exhibit different query processing behaviours.


    Example File Structures38/84

    When describing file structures ...

    [Diagram:Pics/scansortproj/one-page-small.png]


    ... Example File Structures39/84

    Consider three simple file structures:

    All files are composed of b primary blocks/pages

    [Diagram:Pics/scansortproj/file-struct0-small.png]

    Some records in each page may be marked as "deleted".


    Exercise 2: Operation Costs40/84

    For each of the following file structures ...

    Determine #page-reads + #page-writes for insert and delete

    You can assume the existence of a file header containing ...

    Assume also ...


    Scanning


    Scanning42/84

    Consider the query:

    select * from Rel;
    

    Operational view:

    for each page P in file of relation Rel {
       for each tuple t in page P {
          add tuple t to result set
       }
    }
    

    Cost: read every data page once

    Time Cost = b.Tr,     Page Cost = b


    ... Scanning43/84

    Scan implementation when file has overflow pages, e.g.

    [Diagram:Pics/scansortproj/file-struct1-small.png]


    ... Scanning44/84

    In this case, the implementation changes to:

    for each page P in data file of relation Rel {
        for each tuple t in page P {
            add tuple t to result set
        }
        for each overflow page V of page P {
            for each tuple t in page V {
                add tuple t to result set
    }   }   }
    

    Cost: read each data page and each overflow page once

    Cost = b + bOv

    where bOv = total number of overflow pages


    Selection via Scanning45/84

    Consider a "one" query like:

    select * from Employee where id = 762288;
    

    In an unordered file, search for matching tuple requires:


    [Diagram:Pics/scansortproj/file-search-small.png]


    Guaranteed at most one answer; but could be in any page.


    ... Selection via Scanning46/84

    Overview of scan process:

    for each page P in relation Employee {
        for each tuple t in page P {
            if (t.id == 762288) return t
    }   }
    

    Cost analysis for one searching in unordered file

    Page Costs:   Costavg = b/2    Costmin = 1    Costmax = b


    Exercise 3: Cost of Search in Hashed File47/84

    Consider a hashed file structure with b = 10, c = 4, h(k) = k%10

    [Diagram:Pics/scansortproj/hash-file-10-small.png]

    Describe how the following queries

    select * from R where k = 51;
    select * from R where k > 50;
    

    might be solved in a file structure like the above (h(k) = k%b).

    Estimate the minimum and maximum cost (as #pages read)


    Iterators48/84

    Access methods typically involve iterators, e.g.


    Scan s = start_scan(Relation r, ...)


    Tuple next_tuple(Scan s)


    Example Query49/84

    Example: simple scan of a table ...

    select name from Employee
    

    implemented as:

    DB db = openDatabase("myDB");
    Relation r = openRelation(db,"Employee",READ);
    Scan s = start_scan(r);
    Tuple t;  // current tuple
    while ((t = next_tuple(s)) != NULL)
    {
       char *name = getStrField(t,2);
       printf("%s\n", name);
    }
    


    Exercise 4: Implement next_tuple()50/84

    Consider the following possible Scan data structure

    typedef struct {
       Relation rel;
       Page     *curPage;  // Page buffer
       int      curPID;    // current pid
       int      curTID;    // current tid
    } ScanData;
    

    Assume tuples in page p are indexed 0..nTuples(p)-1

    Assume pages in relation rel are indexed 0..nPages(rel)-1

    Implement the  Tuple next_tuple(Scan)  function

    P.S. What's in a Relation object?


    Relation Copying51/84

    Consider an SQL statement like:

    create table T as (select * from S);
    

    Effectively, copies data from one table to a new table.

    Process:

    make empty relation T
    s = start scan of S
    while (t = next_tuple(s)) {
        insert tuple t into relation T
    }
    


    ... Relation Copying52/84

    Possible that T is smaller than S

    [Diagram:Pics/scansortproj/copy-files-small.png]


    ... Relation Copying53/84

    In terms of existing relation/page/tuple operations:

    
    Relation in;       // relation handle (incl. files)
    Relation out;      // relation handle (incl. files)
    int ipid,opid,tid; // page and record indexes
    Record rec;        // current record (tuple)
    Page ibuf,obuf;    // input/output file buffers
    
    in = openRelation("S", READ);
    out = openRelation("T", NEW|WRITE);
    clear(obuf);  opid = 0;
    for (ipid = 0; ipid < nPages(in); ipid++) {
        ibuf = get_page(in, ipid);
        for (tid = 0; tid < nTuples(ibuf); tid++) {
            rec = get_record(ibuf, tid);
            if (!hasSpace(obuf,rec)) {
                put_page(out, opid++, obuf);
                clear(obuf);
            }
            insert_record(obuf,rec);
    }   }
    if (nTuples(obuf) > 0) put_page(out, opid, obuf);
    


    Exercise 5: Cost of Relation Copy54/84

    Analyse cost for relation copying:

    1. if both input and output are heap files
    2. if input is sorted and output is heap file
    3. if input is heap file and output is sorted
    Assume ...

    Give cost in terms of #pages read + #pages written


    Scanning in PostgreSQL55/84

    Scanning defined in: backend/access/heap/heapam.c

    Implements iterator data/operations:


    ... Scanning in PostgreSQL56/84

    
    typedef HeapScanDescData *HeapScanDesc;
    
    typedef struct HeapScanDescData
    {
      // scan parameters 
      Relation      rs_rd;        // heap relation descriptor 
      Snapshot      rs_snapshot;  // snapshot ... tuple visibility 
      int           rs_nkeys;     // number of scan keys 
      ScanKey       rs_key;       // array of scan key descriptors 
      ...
      // state set up at initscan time 
      PageNumber    rs_npages;    // number of pages to scan 
      PageNumber    rs_startpage; // page # to start at 
      ...
      // scan current state, initially set to invalid 
      HeapTupleData rs_ctup;      // current tuple in scan
      PageNumber    rs_cpage;     // current page # in scan
      Buffer        rs_cbuf;      // current buffer in scan
       ...
    } HeapScanDescData;
    


    Scanning in other File Structures57/84

    Above examples are for heap files

    Other access file structures in PostgreSQL:


    Sorting


    The Sort Operation59/84

    Sorting is explicit in queries only in the order by clause

    select * from Students order by name;
    

    Sorting is used internally in other operations:

    Sort methods such as quicksort are designed for in-memory data.

    For large data on disks, need external sorts such as merge sort.


    Two-way Merge Sort60/84

    Example:

    [Diagram:Pics/scansortproj/two-way-ex2-small.png]


    ... Two-way Merge Sort61/84

    Requires three in-memory buffers:

    [Diagram:Pics/scansortproj/two-way-buf-small.png]


    Assumption: cost of Merge operation on two in-memory buffers ≅ 0.
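
The Merge step itself is a single linear pass over the two input buffers. A sketch on arrays of ints (buffer management omitted; names are ours):

```c
#include <assert.h>

/* Merge two sorted arrays a[0..na-1] and b[0..nb-1] into out[];
 * returns the number of elements written */
int merge(const int a[], int na, const int b[], int nb, int out[]) {
    int i = 0, j = 0, k = 0;
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < na) out[k++] = a[i++];   /* drain leftovers from a */
    while (j < nb) out[k++] = b[j++];   /* drain leftovers from b */
    return k;
}
```

Each element is moved exactly once, which is why the in-memory merge cost is treated as ≅ 0 relative to page i/o.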


    Comparison for Sorting62/84

    Above assumes that we have a function to compare tuples.

    Needs to understand ordering on different data types.

    Need a function tupCompare(r1,r2,f) (cf. C's strcmp)

    int tupCompare(r1,r2,f)
    {
       if (r1.f < r2.f) return -1;
       if (r1.f > r2.f) return 1;
       return 0;
    }
    

    Assume =, <, > are available for all attribute types.


    ... Comparison for Sorting63/84

    In reality, need to sort on multiple attributes and ASC/DESC, e.g.

    -- example multi-attribute sort
    select * from Students
    order by age desc, year_enrolled
    

    Sketch of multi-attribute sorting function

    int tupCompare(r1,r2,criteria)
    {
       foreach (f,ord) in criteria {
          if (ord == ASC) {
             if (r1.f < r2.f) return -1;
             if (r1.f > r2.f) return 1;
          }
          else {
             if (r1.f > r2.f) return -1;
             if (r1.f < r2.f) return 1;
          }
       }
       return 0;
    }
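
A concrete, compilable version of the sketch above, representing tuples as int arrays and criteria as (field, order) pairs. These representations are illustrative assumptions, not the course's required interface:

```c
#include <assert.h>

enum { ASC, DESC };
typedef struct { int field; int order; } SortCrit;

/* Compare r1,r2 on each criterion in turn; first difference decides */
int tupCompare(const int r1[], const int r2[],
               const SortCrit crit[], int ncrit) {
    for (int i = 0; i < ncrit; i++) {
        int f = crit[i].field;
        int cmp = (r1[f] > r2[f]) - (r1[f] < r2[f]);
        if (crit[i].order == DESC) cmp = -cmp;   /* flip for descending */
        if (cmp != 0) return cmp;
    }
    return 0;   /* equal on all criteria */
}
```

E.g. for order by age desc, year_enrolled, the criteria would be {(AGE,DESC), (YEAR,ASC)}.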
    


    Cost of Two-way Merge Sort64/84

    For a file containing b data pages:

    Gives total cost:   2.b.ceil(log2 b)

    Example: Relation with r = 10^5 and c = 50  ⇒  b = 2000 pages.

    Number of passes for sort:   ceil(log2 2000)  =  11

    Reads/writes entire file 11 times!    Can we do better?


    n-Way Merge Sort65/84

    Initial pass uses: B total buffers

    [Diagram:Pics/scansortproj/n-way-buf-pass0-small.png]

    Reads B pages at a time, sorts in memory, writes out in order


    ... n-Way Merge Sort66/84

    Merge passes use:   n input buffers,   1 output buffer

    [Diagram:Pics/scansortproj/n-way-buf-small.png]


    ... n-Way Merge Sort67/84

    Method:

    
    // Produce B-page-long runs
    for each group of B pages in Rel {
        read B pages into memory buffers
        sort group in memory
        write B pages out to Temp
    }
    // Merge runs until everything sorted
    numberOfRuns = ceil(b/B)
    while (numberOfRuns > 1) {
        // n-way merge, where n=B-1
        for each group of n runs in Temp {
            merge into a single run via input buffers
            write run to newTemp via output buffer
        }
        numberOfRuns = ceil(numberOfRuns/n)
        Temp = newTemp // swap input/output files
    }
    


    Cost of n-Way Merge Sort68/84

    Consider file where b = 4096, B = 16 total buffers:

    (cf. two-way merge sort which needs 11 passes)


    ... Cost of n-Way Merge Sort69/84

    Generalising from previous example ...

    For b data pages and B buffers

    • first pass: reads/writes b pages, gives b0 = ceil(b/B) sorted runs
    • then need ceil(logn b0) merge passes until sorted, where n = B-1
    • each pass reads and writes b pages   (i.e. 2.b page accesses)

    Cost = 2.b.(1 + ceil(logn b0)),   where b0 and n are defined above


    Exercise 6: Cost of n-Way Merge Sort70/84

    How many reads+writes to sort the following:

    • r = 1048576 tuples (2^20)
    • R = 62 bytes per tuple (fixed-size)
    • B = 4096 bytes per page
    • H = 96 bytes of header data per page
    • D = 1 presence bit per tuple in page directory
    • all pages are full
    Consider for the cases:
    • 9 total buffers, 8 input buffers, 1 output buffer
    • 33 total buffers, 32 input buffers, 1 output buffer
    • 257 total buffers, 256 input buffers, 1 output buffer


    Sorting in PostgreSQL71/84

    Sort uses a merge-sort (from Knuth) similar to above:

    • backend/utils/sort/tuplesort.c
    • include/utils/sortsupport.h
    Tuples are mapped to SortTuple structs for sorting:
    • containing pointer to tuple and sort key
    • no need to reference actual Tuples during sort
    • unless multiple attributes used in sort
    If all data fits into memory, sort using qsort().

    If memory fills while reading, form "runs" and do disk-based sort.


    ... Sorting in PostgreSQL72/84

    Disk-based sort has phases:

    • divide input into sorted runs using HeapSort
    • merge using N buffers, one output buffer
    • N = as many buffers as workMem allows
    Described in terms of "tapes"   ("tape" ≈ sorted run)

    Implementation of "tapes": backend/utils/sort/logtape.c


    ... Sorting in PostgreSQL73/84

    Sorting comparison operators are obtained via catalog (in Type.o):

    // gets pointer to function via pg_operator
    struct Tuplesortstate { ... SortTupleComparator ... };
    
    // returns negative, zero, positive
    ApplySortComparator(Datum datum1, bool isnull1,
                        Datum datum2, bool isnull2,
                        SortSupport sort_helper);
    

    Flags in SortSupport indicate: ascending/descending, nulls-first/last.

    ApplySortComparator() is PostgreSQL's version of tupCompare()


    Implementing Projection


    The Projection Operation75/84

    Consider the query:

    select distinct name,age from Employee;
    

    If the Employee relation has four tuples such as:

    (94002, John, Sales, Manager,   32)
    (95212, Jane, Admin, Manager,   39)
    (96341, John, Admin, Secretary, 32)
    (91234, Jane, Admin, Secretary, 21)
    

    then the result of the projection is:

    (Jane, 21)   (Jane, 39)   (John, 32)
    

    Note that duplicate tuples (e.g. (John,32)) are eliminated.


    ... The Projection Operation76/84

    The projection operation needs to:

    1. scan the entire relation as input
      • already seen how to do scanning

    2. remove unwanted attributes in output tuples
      • implementation depends on tuple internal structure
      • essentially, make a new tuple with fewer attributes
        and where the values may be computed from existing attributes

    3. eliminate any duplicates produced   (if distinct)
      • two approaches: sorting or hashing


    Sort-based Projection77/84

    Requires a temporary file/relation (Temp)

    for each tuple T in Rel {
        T' = mkTuple([attrs],T)
        write T' to Temp
    }
    
    sort Temp on [attrs]
    
    for each tuple T in Temp {
        if (T == Prev) continue
        write T to Result
        Prev = T
    }
    


    Exercise 7: Cost of Sort-based Projection78/84

    Consider a table R(x,y,z) with tuples:

    Page 0:  (1,1,'a')   (11,2,'a')  (3,3,'c')
    Page 1:  (13,5,'c')  (2,6,'b')   (9,4,'a')
    Page 2:  (6,2,'a')   (17,7,'a')  (7,3,'b')
    Page 3:  (14,6,'a')  (8,4,'c')   (5,2,'b')
    Page 4:  (10,1,'b')  (15,5,'b')  (12,6,'b')
    Page 5:  (4,2,'a')   (16,9,'c')  (18,8,'c')
    

    SQL:   create table T as (select distinct y from R)

    Assuming:

    • 3 memory buffers, 2 for input, one for output
    • pages/buffers hold 3 R tuples (i.e. cR=3), 6 T tuples (i.e. cT=6)
    Show how sort-based projection would execute this statement.


    Cost of Sort-based Projection79/84

    The costs involved are (assuming B=n+1 buffers for sort):

    • scanning original relation Rel:   bR   (with cR)
    • writing Temp relation:   bT     (smaller tuples, cT > cR, sorted)
    • sorting Temp relation:
        2.bT.ceil(logn b0),   where b0 = ceil(bT/B)
    • scanning Temp, removing duplicates:   bT
    • writing the result relation:   bOut     (maybe fewer tuples)
    Cost = sum of above = bR + bT + 2.bT.ceil(lognb0) + bT + bOut


    Hash-based Projection80/84

    Partitioning phase:

    [Diagram:Pics/scansortproj/hash-project-small.png]


    ... Hash-based Projection81/84

    Duplicate elimination phase:

    [Diagram:Pics/scansortproj/hash-project2-small.png]


    ... Hash-based Projection82/84

    Algorithm for both phases:

    for each tuple T in relation Rel {
        T' = mkTuple([attrs],T)
        H = h1(T', n)
        B = buffer for partition[H]
        if (B full) write and clear B
        insert T' into B
    }
    for each partition P in 0..n-1 {
        for each tuple T in partition P {
            H = h2(T, n)
            B = buffer for hash value H
            if (T not in B) insert T into B
            // assumes B never gets full
        }
        write and clear all buffers
    }
    


    Exercise 8: Cost of Hash-based Projection83/84

    Consider a table R(x,y,z) with tuples:

    Page 0:  (1,1,'a')   (11,2,'a')  (3,3,'c')
    Page 1:  (13,5,'c')  (2,6,'b')   (9,4,'a')
    Page 2:  (6,2,'a')   (17,7,'a')  (7,3,'b')
    Page 3:  (14,6,'a')  (8,4,'c')   (5,2,'b')
    Page 4:  (10,1,'b')  (15,5,'b')  (12,6,'b')
    Page 5:  (4,2,'a')   (16,9,'c')  (18,8,'c')
    -- and then the same tuples repeated for pages 6-11
    

    SQL:   create table T as (select distinct y from R)

    Assuming:

    • 4 memory buffers, one for input, 3 for partitioning
    • pages/buffers hold 3 R tuples (i.e. cR=3), 4 T tuples (i.e. cT=4)
    • hash functions:   h1(x) = x%3,   h2(x) = (x%4)%3
    Show how hash-based projection would execute this statement.


    Cost of Hash-based Projection84/84

    The total cost is the sum of the following:

    • scanning original relation R:   bR
    • writing partitions:   bP   (bR vs bP ?)
    • re-reading partitions:   bP
    • writing the result relation:   bOut
    Cost = bR + 2bP + bOut

    To ensure that no partition is too large to fit in the available buffers ...

    • use hash functions (h1,h2) with uniform spread
    • allocate at least sqrt(bR)+1 buffers
    • if insufficient buffers, significant re-reading overhead


    Produced: 10 Mar 2020