[prev] 28 [next]

Hash Join

Basic idea:
  • use hashing as a technique to partition relations
  • to avoid having to consider all pairs of tuples
Requires sufficent memory buffers
  • to hold substantial portions of partitions
  • (preferably) to hold largest partition of outer relation
Other issues:
  • works only for equijoin   R.i=S.j   (but this is a common case)
  • susceptible to data skew   (or poor hash function)
Variations:   simple,   grace,   hybrid.