Large Data

Some modern applications have massive data sets (e.g. Google)
  • far too large to store on a single machine/RDBMS
  • query load far too high for a single DBMS, even if the data could be stored there
Approach to dealing with such data
  • distribute data over a large collection of nodes (with redundancy, i.e. replicated copies)
  • provide mechanisms for distributing computation across those nodes
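
A minimal sketch of how data might be distributed with redundancy, assuming simple hash partitioning: each key is hashed to a primary node and copied to the next few nodes. The function name `nodes_for_key` and its parameters are hypothetical, chosen for illustration; real systems typically use more elaborate schemes (e.g. consistent hashing) so that adding nodes moves less data.

```python
import hashlib
from typing import List


def nodes_for_key(key: str, num_nodes: int, replicas: int = 2) -> List[int]:
    """Return the indices of the nodes that hold a copy of `key`.

    Hypothetical scheme: hash the key to pick a primary node, then
    place the remaining copies on the following nodes (wrapping around).
    """
    if not 1 <= replicas <= num_nodes:
        raise ValueError("replicas must be between 1 and num_nodes")
    # Use a stable hash so every machine computes the same placement.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % num_nodes
    return [(primary + i) % num_nodes for i in range(replicas)]


if __name__ == "__main__":
    # Each key maps to 3 distinct nodes out of 5.
    for k in ["page:1", "page:2", "image:7"]:
        print(k, nodes_for_key(k, num_nodes=5, replicas=3))
```

If one node fails, any of the other replicas can still serve the key, which is the point of the redundancy.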
Often such data does not need full relational querying (e.g. selection on arbitrary attributes)
  • represent data via (key,value) pairs
  • unique keys can be used for addressing data
  • values can be large objects (e.g. web pages, images, ...)
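
The (key,value) model above can be sketched as a tiny in-memory store. The class `KeyValueStore` and its `put`/`get`/`delete` methods are hypothetical names used for illustration only, not the interface of any particular system; values are treated as opaque bytes, so the store can look data up only by key.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Illustrative in-memory (key, value) store; values are opaque bytes."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Keys are unique: storing under an existing key overwrites it.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Lookup is by key only; there is no selection on value contents.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Return True if the key existed and was removed.
        return self._data.pop(key, None) is not None


if __name__ == "__main__":
    store = KeyValueStore()
    store.put("http://example.org/", b"<html>...</html>")  # a web page as the value
    print(store.get("http://example.org/"))
    print(store.get("no-such-key"))
```

Because every operation addresses a single key, such a store is easy to spread over many nodes: the key alone determines where the value lives.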