PDSI Talks & Publications

Journals

Failure Tolerance in Petascale Computers. Garth Gibson, Bianca Schroeder, Joan Digney. CTWatch Quarterly, vol. 3 no. 4. Volume on Software Enabling Technologies for Petascale Science. November 2007. www.ctwatch.org
PDF

Understanding Failures in Petascale Computers. Bianca Schroeder, Garth A. Gibson. SciDAC 2007. Journal of Physics: Conference Series 78 (2007) 012022.
Abstract / PDF / Permanent JPCS Link

All 100 open access volumes of the Journal of Physics: Conference Series (JPCS) are available via the journal home page: http://herald.iop.org/JPCS_home/m294/crk//link/1520

Understanding Disk Failure Rates: What does an MTTF of 1,000,000 hours mean to you? Bianca Schroeder, Garth A. Gibson. ACM Transactions on Storage (TOS), Volume 3 Issue 3, October 2007.

A Replicated File System for Grid Computing. Jiaying Zhang and Peter Honeyman. Concurrency and Computation: Practice and Experience, 2007; 00:1–7.
Abstract / PDF

Data Management: the Victorian Era Child of the 21st Century. Farber R. PNNL-SA-53343, Pacific Northwest National Laboratory, Richland, WA. Published in Scientific Computing, vol. 24 no. 4, March 2007.
HTML

Balancing Computation and Experiment. Farber R. PNNL-SA-54125, Pacific Northwest National Laboratory, Richland, WA. Published in Innovation: America's Journal of Technology Commercialization, vol. 5 no. 24, April/May 2007.
HTML

Early Experiences on the Journey Towards Self-* Storage. Michael Abd-El-Malek, William V. Courtright II, Chuck Cranor, Gregory R. Ganger, James Hendricks, Andrew J. Klosterman, Michael Mesnier, Manish Prasad, Brandon Salmon, Raja R. Sambasivan, Shafeeq Sinnamohideen, John D. Strunk, Eno Thereska, Matthew Wachs, Jay J. Wylie. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, September 2006.
Abstract / PDF

Conferences

On Application-level Approaches to Avoiding TCP Throughput Collapse in Cluster-Based Storage Systems. E. Krevat, V. Vasudevan, A. Phanishayee, D. Andersen, G. Ganger, G. Gibson, S. Seshan. Proceedings of the 2nd International Petascale Data Storage Workshop (PDSW '07) held in conjunction with Supercomputing '07. November 11, 2007, Reno, NV.
Abstract / PDF

GIGA+: Scalable Directories for Shared File Systems. Swapnil V. Patil, Garth A. Gibson, Sam Lang, Milo Polte. Proceedings of the 2nd International Petascale Data Storage Workshop (PDSW '07) held in conjunction with Supercomputing '07. November 11, 2007, Reno, NV.
Abstract / PDF

Accelerating Reed-Solomon Coding in RAID Systems with GPUs. Matthew Curry (University of Alabama at Birmingham, USA); Lee Ward (Sandia National Laboratories, USA); Tony Skjellum (University of Alabama at Birmingham, USA); Ron Brightwell (Sandia National Laboratories, USA). 22nd IEEE International Parallel and Distributed Processing Symposium, April 14-18, 2008, Miami, FL.
Abstract / PDF

An Analysis of Data Corruption in the Storage Stack. L. Bairavasundaram, G. Goodson, B. Schroeder, A. Arpaci-Dusseau, R. Arpaci-Dusseau. 6th USENIX Conference on File and Storage Technologies (FAST 2008).
Abstract / PDF

Scalable Security for Petascale Parallel File Systems. Andrew Leung, Ethan L. Miller, and Stephanie Jones. SC '07, Reno, NV, November 2007.
Abstract / PDF

POTSHARDS: Secure Long-Term Storage Without Encryption. Mark W. Storer, Kevin Greenan, Ethan L. Miller, Kaladhar Voruganti. Proceedings of the 2007 USENIX Technical Conference, June 2007.
Abstract / PDF

PRIMS: Making NVRAM Suitable for Extremely Reliable Storage. Kevin Greenan, Ethan L. Miller. Proceedings of the 3rd Workshop on Hot Topics in System Dependability (HotDep '07), June 2007.
Abstract / PDF

Direct-pNFS: Scalable, Transparent, and Versatile Access to Parallel File Systems. Dean Hildebrand, Peter Honeyman. Proc. 16th IEEE International Symp. on High Performance Distributed Computing (HPDC 2007), Monterey. June 2007.
Abstract / PDF

Modeling the Relative Fitness of Storage. Michael P. Mesnier, Matthew Wachs, Raja R. Sambasivan, Alice X. Zheng, Gregory R. Ganger. SIGMETRICS'07, June 12-16, 2007, San Diego, California, USA. ACM. Awarded Best Paper.
Abstract / PDF

Hierarchical Replication Control in a Global File System. Jiaying Zhang and Peter Honeyman. Proc. 7th IEEE International Symp. on Cluster Computing and the Grid (CCGrid07), Rio de Janeiro. May 2007.
Abstract / PDF

pNFS and Linux: Working towards a Heterogeneous Future. Dean Hildebrand, Peter Honeyman, and W.A. (Andy) Adamson. Proc. 8th LCI International Conf. on High-Performance Clustered Computing, South Lake Tahoe. May 2007.
Abstract / PDF

Fingerpointing Correlated Failures in Replicated Systems. Soila Pertet, Rajeev Gandhi and Priya Narasimhan. USENIX Workshop on Tackling Computer Systems Problems with Machine Learning Techniques (SysML), Cambridge, MA. April 2007.
Abstract / PDF

MultiMap: Preserving Disk Locality for Multidimensional Datasets. Minglong Shao, Steven W. Schlosser, Stratos Papadomanolakis, Jiri Schindler, Anastassia Ailamaki, Gregory R. Ganger. IEEE 23rd International Conference on Data Engineering (ICDE 2007), Istanbul, Turkey, April 2007. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-05-102, March 2005.
Abstract / PDF

The Computer Failure Data Repository. Bianca Schroeder, Garth Gibson. Invited contribution to the Workshop on Reliability Analysis of System Failure Data (RAF'07) MSR Cambridge, UK, March 2007.
Abstract / PDF

//TRACE: Parallel Trace Replay with Approximate Causal Events. Michael Mesnier, Matthew Wachs, Raja R. Sambasivan, Julio Lopez, James Hendricks, Gregory R. Ganger. Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST '07), February 13-16, 2007, San Jose, CA. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-06-108, September 2006.
Abstract / PDF

Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You? Bianca Schroeder, Garth A. Gibson. Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST '07), February 13-16, 2007, San Jose, CA. Best Paper Award.
Abstract / PDF

A Large Scale Study of Failures in High-performance-computing Systems. Bianca Schroeder, Garth Gibson. International Symposium on Dependable Systems and Networks (DSN 2006). IEEE Transactions on Dependable and Secure Computing (TDSC).
Abstract / PDF

Argon: Performance Insulation for Shared Storage Servers. Matthew Wachs, Michael Abd-El-Malek, Eno Thereska, Gregory R. Ganger. Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST '07), February 13-16, 2007, San Jose, CA.
Abstract / PDF

Towards Fingerpointing in the Emulab Dynamic Distributed System. Michael P. Kasick, Priya Narasimhan, Kevin Atkinson, Jay Lepreau. Proceedings of the 3rd USENIX Workshop on Real, Large Distributed Systems (WORLDS '06), Seattle, WA. Nov. 5, 2006.
Abstract / PDF

NFSv4 Replication for Grid Storage Middleware. Jiaying Zhang and Peter Honeyman. Proc. 4th International Workshop on Middleware for Grid Computing, Melbourne. November 2006.
Abstract / PDF

Ceph: A Scalable, High-Performance Distributed File System. Sage Weil, Scott A. Brandt, Ethan L. Miller, Darrell D. E. Long, Carlos Maltzahn. Proceedings of the 7th Conference on Operating Systems Design and Implementation (OSDI '06), November 2006.
Abstract / PDF

CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data. Sage Weil, Scott A. Brandt, Ethan L. Miller, Carlos Maltzahn. Proceedings of SC '06, November 2006.
Abstract / PDF

Reliability Mechanisms for File Systems Using Non-Volatile Memory as a Metadata Store. Kevin Greenan, Ethan L. Miller. Proceedings of the 6th ACM & IEEE Conference on Embedded Software (EMSOFT '06), October 2006, pages 178-187.
Abstract / PDF

Scalable Security for Large, High Performance Storage Systems. Andrew Leung, Ethan L. Miller. Proceedings of the 2nd ACM Workshop on Storage Security and Survivability (StorageSS 2006), October 2006.
Abstract / PDF

Other

Network Transparency in Wide Area Collaborations. Jiaying Zhang. Ph.D. Dissertation, University of Michigan, Ann Arbor, May 2007.
Abstract / PDF

Distributed Access to Parallel File Systems. Dean Hildebrand. Ph.D. Dissertation, University of Michigan, Ann Arbor, February 2007.
Abstract / PDF

Posters

PDSI Shared Information Resources for HEC Storage. PDSI PIs. ASCR PI meeting, March 31, 2008, Denver, CO.
PDF

PDSI Data Releases and Repositories. PDSI PIs. 6th USENIX Conference on File and Storage Technologies (FAST '08). Feb. 26-29, 2008. San Jose, CA.
PDF

Talks

Petascale Data Storage Institute - Access Methods. Garth Gibson, Carnegie Mellon University. SDM-PDSI Mini Workshop. Nov 30, 2007. Seattle, WA
PDF [495K]

Performance Challenges for Extreme Scale Computing. John T. Daly, Los Alamos National Lab. SDI Seminar, Carnegie Mellon University.
PDF [4.6M]

Understanding Failure in Petascale Computers. Garth Gibson (Joint work with Bianca Schroeder). 2007 SciDAC Conference, June 25, Boston MA.
PDF [899K]

Design and Expressions of a Scalable Supercomputer. Lee Ward. Sandia’s MPP, 10/31/2006.
PDF [673K]

Last updated 2008-05-01 | ©2008 Carnegie Mellon University