Parallelism in DB Operations (cont)

Parallel sorting
  • scan in parallel, range-partition during scan
  • pipeline into local sort on each processor
  • combine sorted partitions in range order (ranges are disjoint, so this is concatenation)
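
The three steps above can be sketched as follows. This is a minimal illustration, not a DBMS implementation: the names `parallel_sort` and `split_points` are hypothetical, and a thread pool stands in for the separate processors (in CPython, threads will not give real CPU parallelism).

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(rows, split_points):
    """Range-partition rows, sort each partition locally, concatenate.

    split_points must be sorted; partition i receives values v with
    split_points[i-1] <= v < split_points[i].
    """
    # Step 1: range-partition during the scan.
    partitions = [[] for _ in range(len(split_points) + 1)]
    for v in rows:
        partitions[bisect_right(split_points, v)].append(v)

    # Step 2: local sort on each "processor" (here: one worker per partition).
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        sorted_parts = list(pool.map(sorted, partitions))

    # Step 3: partitions cover disjoint, ordered ranges, so combining them
    # in partition order yields a fully sorted result.
    return [v for part in sorted_parts for v in part]


result = parallel_sort([5, 3, 9, 1, 7, 2, 8], split_points=[3, 6])
print(result)
```

With split points 3 and 6, the input is divided into the ranges below 3, from 3 up to (not including) 6, and 6 or above; each range is sorted independently.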

Potential problem:
  • data skew: badly chosen partition points give some processors far more tuples than others
  • resolve by sampling the data first and choosing partition points from the sample
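
One way to pick partition points from a sample is to sort the sample and take evenly spaced quantiles. The sketch below assumes this approach; the function name `choose_split_points` and the parameters `sample_size` and `seed` are hypothetical, chosen only for illustration.

```python
import random


def choose_split_points(rows, num_partitions, sample_size=100, seed=0):
    """Estimate partition points from a random sample of rows.

    Returns num_partitions - 1 values taken at evenly spaced positions
    in the sorted sample, so each range should receive a similar share
    of the data even when the values are unevenly distributed.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    sample = sorted(rng.sample(rows, min(sample_size, len(rows))))
    if not sample:
        return []
    return [sample[(i * len(sample)) // num_partitions]
            for i in range(1, num_partitions)]


# Sampling every row of a uniform data set gives exact quartile boundaries.
points = choose_split_points(list(range(1000)), num_partitions=4, sample_size=1000)
print(points)
```

A larger sample gives better-balanced partitions at the cost of a longer initial pass; the right sample size depends on the data and is not specified here.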