Hash Join
Basic idea:
- use hashing as a technique to partition relations
- to avoid having to consider all pairs of tuples
Requires sufficent memory buffers
- to hold substantial portions of partitions
- (preferably) to hold largest partition of outer relation
Other issues:
- works only for equijoin
R.i=S.j (but this is a common case)
- susceptible to data skew (or poor hash function)
Variations: simple, grace, hybrid.
|