Week 10 Lectures (61)

Data Storage in PDBs (cont)

Hash partitioning

use hash value to determine which node and page
e.g. i = hash(tuple) so tuple is placed on the iᵗʰ node
helpful for equality-based queries on hashing attribute
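A minimal sketch of hash partitioning, under some assumptions not in the notes: the hash is taken on one chosen attribute (position key_index), CRC32 stands in for the hash function, and the value is reduced mod the number of nodes so that i falls in 0..n-1. The names node_for_key and hash_partition are hypothetical.

```python
import zlib

def node_for_key(value, n_nodes):
    """Map a hashing-attribute value to a node number in 0..n_nodes-1.

    CRC32 is an illustrative choice of hash; any deterministic hash would do.
    """
    return zlib.crc32(str(value).encode("utf-8")) % n_nodes

def hash_partition(tuples, key_index, n_nodes):
    """Place each tuple on the node chosen by hashing its key attribute."""
    nodes = [[] for _ in range(n_nodes)]
    for t in tuples:
        nodes[node_for_key(t[key_index], n_nodes)].append(t)
    return nodes

# Example: 50 tuples spread over 4 nodes, hashing on the first attribute.
rows = [(k, f"name{k}") for k in range(50)]
nodes = hash_partition(rows, key_index=0, n_nodes=4)

# An equality query on the hashing attribute only needs to visit one node.
target = node_for_key(17, 4)
matches = [t for t in nodes[target] if t[0] == 17]
```

The equality lookup is where this scheme pays off: the same hash that placed the tuple also says which single node to ask.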

Range partitioning

ranges of attr values are assigned to processors
e.g. values 1-10 on node₀, 11-20 on node₁, ..., 91-100 on nodeₙ₋₁
potentially helpful for range-based queries
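Range partitioning can be sketched in a similar way. This assumes each node is described by the largest value it stores (a hypothetical upper_bounds list, sorted ascending) and that values below the first range do not occur; range_node and nodes_for_range are illustrative names.

```python
from bisect import bisect_left

def range_node(value, upper_bounds):
    """Return the node whose range contains value.

    upper_bounds[i] is the largest value stored on node i.
    """
    i = bisect_left(upper_bounds, value)
    if i == len(upper_bounds):
        raise ValueError(f"{value} is above every partition range")
    return i

def nodes_for_range(lo, hi, upper_bounds):
    """Return the nodes that a query lo <= attr <= hi has to visit."""
    first = range_node(lo, upper_bounds)
    last = range_node(hi, upper_bounds)
    return list(range(first, last + 1))

# Ten nodes holding 1-10, 11-20, ..., 91-100.
bounds = list(range(10, 101, 10))

print(range_node(10, bounds))           # 0
print(range_node(11, bounds))           # 1
print(nodes_for_range(15, 32, bounds))  # [1, 2, 3]
```

A range query touches only the nodes whose ranges overlap it, which is why this layout can help range-based queries while hash partitioning generally cannot.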

In both cases, data skew may lead to unbalanced load
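The effect of skew can be illustrated by counting tuples per node. This is a sketch with made-up data (90 of 100 tuples share the value 5) and deliberately simple placement functions: ranges of width 10 over 1-100, and v mod 10 as a stand-in hash. The helper load_per_node is hypothetical.

```python
from collections import Counter

def load_per_node(values, n_nodes, node_of):
    """Count how many tuples each node receives under placement node_of."""
    counts = Counter(node_of(v) for v in values)
    return [counts.get(i, 0) for i in range(n_nodes)]

# Skewed attribute values: one value dominates.
values = [5] * 90 + list(range(11, 21))

range_load = load_per_node(values, 10, lambda v: (v - 1) // 10)
hash_load = load_per_node(values, 10, lambda v: v % 10)

print(range_load)  # [90, 10, 0, 0, 0, 0, 0, 0, 0, 0]
print(hash_load)   # [1, 1, 1, 1, 1, 91, 1, 1, 1, 1]
```

Under both schemes one node ends up with about 90% of the data, because tuples with the same attribute value always land on the same node.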