❖ Indexes |
A 1-d index is based on the value of a single attribute A.
Some possible properties of A:
index on unique field, may be sorted on A | ||
index on non-unique field, file sorted on A | ||
file not sorted on A |
A given table may have indexes on several attributes.
❖ Indexes (cont) |
Indexes themselves may be structured in several ways:
every tuple is referenced by an entry in the index file | ||
only some tuples are referenced by index file entries | ||
tuples are accessed directly from the index file | ||
may need to access several index pages to reach tuple |
Index file has total i pages (where typically i ≪ b)
Index file has page capacity ci (where typically ci ≫ c)
Dense index: i = ceil( r/ci ) Sparse index: i = ceil( b/ci )
❖ Selection with Primary Index |
For one queries:
ix = binary search index for entry with key K
if nothing found { return NotFound }
b = getPage(pageOf(ix.tid))
t = getTuple(b,offsetOf(ix.tid))
-- may require reading overflow pages
return t
Worst case: read log2i index pages + read 1+Ov data pages.
Thus, Costone,prim = log2 i + 1 + Ov
Assume: index pages are same size as data pages ⇒ same reading cost
❖ Selection with Primary Index (cont) |
For range queries on primary key:
❖ Selection with Primary Index (cont) |
Method for range queries
// e.g. select * from R where a between lo and hi pages = {} results = {} ixPage = findIndexPage(R.ixf,lo) while (ixTup = getNextIndexTuple(R.ixf)) { if (ixTup.key > hi) break; pages = pages ∪ pageOf(ixTup.tid) } foreach pid in pages { // scan data page plus ovflow chain while (buf = getPage(R.datf,pid)) { foreach tuple T in buf { if (lo<=T.a && T.a<=hi) results = results ∪ T } } }
❖ Insertion with Primary Index |
Overview:
tid = insert tuple into page P at position p find location for new entry in index file insert new index entry (k,tid) into index file
Problem: order of index entries must be maintained
Reorganisation requires, on average, read/write half of index file:
Costinsert,prim = (log2i)r + i/2.(1r+1w) + (1+Ov)r + (1+δ)w
❖ Deletion with Primary Index |
Overview:
find tuple using index mark tuple as deleted delete index entry for tuple
If we delete index entries by marking ...
❖ Clustering Index |
Data file sorted; one index entry for each key value
Cost penalty: maintaining both index and data file as sorted
(Note: can't mark index entry for value X until all X tuples are deleted)
❖ Secondary Index |
Data file not sorted; want one index entry for each key value
Costpmr =
(log2i
❖ Multi-level Indexes |
Secondary Index used two index files to speed up search
Ix1
Ix2
Ix1
Ix2
Ix1
Ix3
Ix2
Ix3
Ultimately, reduce top-level of index hierarchy to one page.
❖ Multi-level Indexes (cont) |
Example data file with three-levels of index:
Assume: not primary key, c = 20, ci = 3
In reality, more likely c = 100, ci = 1000
❖ Select with Multi-level Index |
For one query on indexed key field:
xpid = top level index page
for level = 1 to d {
read index entry xpid
search index page for J'th entry
where index[J].key <= K < index[J+1].key
if (J == -1) { return NotFound }
xpid = index[J].page
}
pid = xpid // pid is data page index
search page pid and its overflow pages
Costone,mli = (d + 1 + Ov)r
(Note that d = ceil( logci r ) and ci is large because index entries are small)
Produced: 15 Mar 2023