ABSTRACT
Understanding Failures in Petascale Computers
Bianca Schroeder, Garth A. Gibson
SciDAC 2007. Journal of Physics: Conference Series 78 (2007) 012022
We develop a consistent mutable replication extension
for NFSv4 tuned to meet the rigorous demands of large-scale
data sharing in global collaborations. The system
uses a hierarchical replication control protocol that dynamically
elects a primary server at various
granularities. Experimental evaluation indicates a substantial
performance advantage over a single server system.
With hierarchical replication control, the overhead
of replication is negligible even
when applications mostly write and replication servers
are widely distributed.