Petascale Data Storage Workshops
News
August 4, 2009 - CView software released by PNL. CView is a 3D graphics engine designed for displaying graphically represented cluster performance data. It also includes a data management library for representing groups of related data.
June 16, 2009 - PLFS Source Code Released on Sourceforge.net
April 18, 2009 - LANL's Parallel Log-Structured File System (PLFS)
April 13, 2009 - PDSW '08 papers now available through IEEE Xplore
August 6, 2008 - Sandia Releases Application Traces
Recent Events
Wednesday March 25, 2009 - PDSI Seminar - Storage for Petascale Computing
John Bent, Los Alamos National Lab
1 pm - 2:20 pm
BH 136A, Adamson Wing
November 21, 2008 - Supercomputing '08 Panel: Exa and Yotta Scale Data - Are We Ready?
Soon after reaching teraflops, HPC facilities were handling petabytes of data. The challenges of exabyte and yottabyte data will be a significant, possibly dominant, limiter on the productivity of HPC users. This panel will address the question "Is the HPC community ready for exabyte data?" and will discuss the challenges of yottabytes.
November 19, 2008 - Supercomputing '08 BOF: pNFS Protocol after Final Draft and before RFC (Slides available!)
pNFS is an extension to NFSv4 that allows clients to overcome NFS scalability and performance barriers. Like NFS, pNFS is a client/server protocol implemented with secure and reliable remote procedure calls. A pNFS server manages storage metadata and responds to client requests for storage layout information. pNFS departs from conventional NFS by allowing clients to access storage directly and in parallel. By separating data and metadata access, pNFS eliminates the server bottlenecks inherent to NAS access methods. By combining parallel I/O with the ubiquitous standard for Internet filing, pNFS insulates storage architects from the risks of deploying best-of-breed technologies, promising state-of-the-art performance, massive scalability, and interoperability across standards-compliant application platforms. pNFS is approaching RFC publication, and this BoF will discuss the three pNFS flavors and their advantages and disadvantages, in an effort to accelerate adoption and invite users to try it.
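The split between metadata and data access described above can be illustrated with a toy model. The following is a minimal sketch, not the pNFS protocol itself: it uses no RPCs or NFSv4 operations, and every name in it (`DataServer`, `MetadataServer`, `stripe_file`, `parallel_read`, `STRIPE_SIZE`) is hypothetical. It assumes a simple round-robin striping scheme and only shows the idea that a client asks one server for a layout and then fetches the stripe units directly from the storage nodes in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_SIZE = 4  # bytes per stripe unit; tiny on purpose, for illustration only


class DataServer:
    """Hypothetical storage node holding stripe units keyed by (file, index)."""

    def __init__(self):
        self.units = {}

    def read(self, name, index):
        return self.units[(name, index)]


class MetadataServer:
    """Hypothetical metadata server: it hands out layouts, never file data."""

    def __init__(self, data_servers):
        self.data_servers = data_servers
        self.unit_counts = {}  # file name -> number of stripe units

    def get_layout(self, name):
        # Layout = list of (stripe index, data server id), round-robin (assumed).
        n = len(self.data_servers)
        return [(i, i % n) for i in range(self.unit_counts[name])]


def stripe_file(mds, name, payload):
    """Demo setup helper: place payload round-robin across the data servers."""
    count = 0
    for offset in range(0, len(payload), STRIPE_SIZE):
        index = offset // STRIPE_SIZE
        server = mds.data_servers[index % len(mds.data_servers)]
        server.units[(name, index)] = payload[offset:offset + STRIPE_SIZE]
        count += 1
    mds.unit_counts[name] = count


def parallel_read(mds, name):
    """Ask the metadata server for the layout, then read the units directly."""
    layout = mds.get_layout(name)

    def fetch(entry):
        index, server_id = entry
        return mds.data_servers[server_id].read(name, index)

    # One worker per data server; pool.map returns results in layout order.
    with ThreadPoolExecutor(max_workers=len(mds.data_servers)) as pool:
        chunks = list(pool.map(fetch, layout))
    return b"".join(chunks)


if __name__ == "__main__":
    servers = [DataServer() for _ in range(3)]
    mds = MetadataServer(servers)
    stripe_file(mds, "demo", b"petascale storage")
    print(parallel_read(mds, "demo"))
```

In this sketch the metadata server is consulted once per read and carries no file data, so adding data servers adds read bandwidth without adding load on the metadata path.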
November 17, 2008 - Petascale Data Storage Workshop at SC '08 - Papers now available through IEEE Xplore
Petascale computing infrastructures make petascale demands on information storage capacity, performance, concurrency, reliability, availability, and manageability. This one-day workshop focuses on the data storage problems and emerging solutions found in petascale scientific computing environments, with special attention to issues in which community collaboration can be crucial: problem identification, workload capture, solution interoperability, standards with community buy-in, and shared tools.
This workshop seeks contributions on relevant topics, including but not limited to: performance and benchmarking results and tools, failure tolerance problems and solutions, APIs for high performance features, parallel file systems, high bandwidth storage architectures, wide area file systems, metadata intensive workloads, autonomics for HPC storage, virtualization for storage systems, archival storage advances, resource management innovations, etc.
Code and Data Releases
- NEW - March 24, 2009! Redstorm S3d I/O kernel trace data (Sandia)
- Sandia: 2 sets of application I/O trace outputs taken during runs of the Sandia code Alegra.
- LANL releases workstation file system statistics gathered with fsstats tool (08/07/08)
- Sandia's IOR software, for benchmarking parallel file systems using POSIX, MPIIO, or HDF5 interfaces. This software was developed at LLNL, with Bill Loewe.
- FSSTATS code release
- LANL PDSI Research Data and Open Source Code
- PFS code release
- Computer Failure Data Repository (CFDR)
PDSI Overview
Petascale computing infrastructures for scientific discovery make petascale demands on information storage capacity, performance, concurrency, reliability, availability, and manageability. The last decade has shown that parallel file systems can barely keep pace with high performance computing along these dimensions; this poses a critical challenge when petascale requirements are considered. The Petascale Data Storage Institute will focus on the data storage problems found in petascale scientific computing environments, with special attention to community issues such as interoperability, community buy-in, and shared tools. Leveraging experience in applications and diverse file and storage systems expertise of its members, the institute allows a group of researchers to collaborate extensively on developing requirements, standards, algorithms, and development and performance tools. Mechanisms for petascale storage and results will be made available to the petascale computing community. The institute will hold periodic workshops and develop educational materials on petascale data storage for science.
The Petascale Data Storage Institute is a collaboration between researchers at Carnegie Mellon University, National Energy Research Scientific Computing Center, Pacific Northwest National Laboratory, Oak Ridge National Laboratory, Sandia National Laboratories, Los Alamos National Laboratory, University of Michigan, and the University of California at Santa Cruz.
The Drive to Petascale Computing
Faster computers need more data, faster.
Challenges
Contact Us
Garth Gibson
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
phone: 412-268-5890
Angela Miller, Administrative Asst.
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
phone: 412-268-6645









