Hadoop DFS
Apache Hadoop Distributed File System
- a hierarchical file system (directories & files a la Linux)
- designed to run on a large number of commodity computing nodes
- supporting very large files (TB) distributed/replicated over nodes
- providing high reliability (node failure is the norm, not the exception)
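
Replication and block size are tunable per cluster. A minimal sketch of the relevant `hdfs-site.xml` settings (property names are Hadoop's; the values shown are illustrative, not prescribed by these notes):

```xml
<configuration>
  <!-- number of replicas kept for each block; 3 is the common default -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- block size in bytes; large blocks (here 128 MB) suit very large files -->
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
</configuration>
```

Large blocks reduce NameNode metadata and favour sequential throughput; replication lets the system tolerate failed nodes by re-reading a block from another replica.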
Provides support for Hadoop map/reduce implementation.
Optimised for write-once-read-many applications
- simplifies data coherence
- aim is maximum throughput rather than low latency