Scale, Distribution, Replication
Data for modern applications is very large (TB, PB, EB)
- not feasible to store on a single machine
- not feasible to store in a single location
Many systems opt for massive networks of simple nodes
- each node holds moderate amount of data
- each data item is replicated on several nodes
- nodes clustered in different geographic sites
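A minimal sketch of how each data item might be placed on several nodes. The function name replica_nodes, the node names, and the replica count of 3 are illustrative assumptions, not part of any particular system; real systems typically also take geographic site into account when choosing replicas.

```python
import hashlib

def replica_nodes(key, nodes, replicas=3):
    """Choose up to `replicas` distinct nodes to hold the item with this key.

    Hypothetical placement scheme: hash the key to a starting position in the
    sorted node list, then take the following nodes in order, wrapping around.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    ordered = sorted(nodes)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(ordered)
    count = min(replicas, len(ordered))
    return [ordered[(start + i) % len(ordered)] for i in range(count)]

# Hypothetical node names: site prefix plus a number.
nodes = ["syd-1", "syd-2", "lon-1", "lon-2", "nyc-1"]
print(replica_nodes("user:42", nodes))
```

The same key always maps to the same nodes, so any client can work out where an item lives without consulting a central directory.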
Benefits:
- reliability, fault-tolerance, availability
- proximity ... use data closest to client
- scope for parallel execution/evaluation
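The availability and proximity benefits can be sketched as follows: a read goes to the closest replica that is still up. The function name closest_live_replica, the node names, and the latency figures are assumptions made up for illustration.

```python
def closest_live_replica(replicas, latency_ms, down=frozenset()):
    """Return the live replica with the lowest latency to the client.

    `latency_ms` maps node name to an assumed round-trip time;
    `down` is the set of nodes currently unavailable.
    """
    live = [node for node in replicas if node not in down]
    if not live:
        raise RuntimeError("no live replica holds this item")
    return min(live, key=lambda node: latency_ms[node])

# Hypothetical replicas of one item and latencies seen by a client in Sydney.
replicas = ["syd-1", "lon-1", "nyc-1"]
latency_ms = {"syd-1": 12, "lon-1": 250, "nyc-1": 180}

nearest = closest_live_replica(replicas, latency_ms)
fallback = closest_live_replica(replicas, latency_ms, down={"syd-1"})
print(nearest, fallback)
```

With three replicas the item stays readable if one node fails, at the cost of higher latency when the nearest copy is the one that is down.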