Week 10 Lectures (90)


Scale, Distribution, Replication

Data for modern applications is very large (TB, PB, EB)

not feasible to store on a single machine
not feasible to store in a single location

Many systems opt for massive networks of simple nodes

each node holds moderate amount of data
each data item is replicated on several nodes
nodes clustered in different geographic sites
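
A minimal sketch of how a data item might be assigned to several nodes spread across sites. The node names, site names, the `place_replicas` function and the choice of three copies are all hypothetical, chosen for illustration only; real systems use their own placement schemes. The idea shown is just a deterministic hash ranking of nodes per key, preferring one replica per site.

```python
import hashlib

# Hypothetical cluster layout: node name -> geographic site (assumed for illustration)
NODES = {
    "n1": "sydney", "n2": "sydney",
    "n3": "frankfurt", "n4": "frankfurt",
    "n5": "virginia", "n6": "virginia",
}

def _score(key, node):
    """Deterministic pseudo-random rank of `node` for `key`."""
    digest = hashlib.sha256(f"{key}:{node}".encode()).hexdigest()
    return int(digest, 16)

def place_replicas(key, nodes=NODES, copies=3):
    """Pick up to `copies` nodes for `key`, preferring distinct sites."""
    ranked = sorted(nodes, key=lambda n: _score(key, n))
    chosen, used_sites = [], set()
    # First pass: at most one replica per site
    for n in ranked:
        if nodes[n] not in used_sites:
            chosen.append(n)
            used_sites.add(nodes[n])
        if len(chosen) == copies:
            return chosen
    # Second pass: more copies than sites, so reuse sites
    for n in ranked:
        if n not in chosen:
            chosen.append(n)
        if len(chosen) == copies:
            break
    return chosen
```

Calling `place_replicas("user:42")` returns the same three nodes every time, one per site, so any client can work out where the copies live without a central directory.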

Benefits:

reliability, fault-tolerance, availability
proximity ... clients use the copy of the data closest to them
scope for parallel execution/evaluation
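
The first two benefits can be illustrated with a small sketch: a client reads from the closest replica that is still responding, and falls back to a more distant one if the nearest node is down. The latency figures, node names and the `read_nearest` function are assumptions made up for this example, not measurements or a real API.

```python
# Hypothetical round-trip latencies (ms) from one client to each site (assumed values)
LATENCY_MS = {"sydney": 5, "frankfurt": 280, "virginia": 200}

def read_nearest(replicas, alive, latency=LATENCY_MS):
    """Choose the live replica with the lowest latency.

    replicas: list of (node, site) pairs holding the data item
    alive:    set of node names currently responding
    """
    candidates = [(latency[site], node) for node, site in replicas if node in alive]
    if not candidates:
        raise RuntimeError("no live replica for this item")
    return min(candidates)[1]

# Example replica set for one data item (hypothetical)
REPLICAS = [("n1", "sydney"), ("n3", "frankfurt"), ("n5", "virginia")]
```

With all nodes up, `read_nearest(REPLICAS, {"n1", "n3", "n5"})` picks `n1`; if `n1` fails, the same call with `{"n3", "n5"}` picks `n5`, so the item stays available at the cost of a slower read.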