OSD

Object Storage Device (OSD) for archiving MS raw files

Here we present an OSD archiving system that we use for the long-term preservation of large MS raw data files. The Dell DX6000 OSD serves as the hardware backbone of an online archive. The software used to transfer, archive and recover raw data was developed in-house.

DELL DX6000 OBJECT STORAGE PLATFORM

The Dell DX6000 cluster is designed to store and access fixed digital content. The architecture utilizes simple and cost-effective methods, including some powerful “self-management” features:

  • Policy-based data management based on metadata for automated object deletion
  • Self-healing functionality for automated error detection and repair, automatic reconfiguration and object regeneration
  • Non-disruptive capacity expansion and automatic disc shutdown-on-error
  • Automated disc capacity and workload balancing
  • Data protection without traditional backup by creating a user-defined number of object replicas
  • Multiple access options: HTTP and SDKs (C++, Java, Python, C#)
  • 128-bit flat address space (no hierarchy and file-system complexity) for storing up to 3.4 × 10^38 objects
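
The object count quoted for the 128-bit address space follows directly from 2^128. The short Python sketch below only checks that arithmetic and shows what a 128-bit identifier looks like when written as hexadecimal text. The function name random_object_id is hypothetical, and how the DX platform actually assigns identifiers is not described here.

```python
import secrets

# Number of distinct identifiers available in a 128-bit flat address space.
ADDRESS_BITS = 128
address_space = 2 ** ADDRESS_BITS


def random_object_id() -> str:
    """Return a random 128-bit value as 32 hexadecimal characters.

    Illustrative only: this is not the platform's own ID scheme.
    """
    return secrets.token_hex(ADDRESS_BITS // 8)


print(f"{address_space:.1e}")  # prints 3.4e+38
print(random_object_id())      # a 32-character hex string, different on each run
```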

MASS SPECTROMETRY DX CLIENT (MSDX)

For MS data transfer and efficient metadata management we implemented our own client software, called Mass Spectrometry DX Storage Solution (MSDX). It supports not only object-integrated default and custom metadata but also a separate SQL database for efficient filtering and searching of metadata. This standalone tool, written in Java, uses the manufacturer’s software development kit (SDK), which provides communication and data streaming functions based on a subset of the HTTP/1.1 protocol called Simple Content Storage Protocol (SCSP). The software provides the following features:

  • User login
  • File and folder compression (tar.gz)
  • MySQL database interface based on Java Object Oriented Querying (jOOQ) for storing fetched metadata, enabling flexible searches of information and data
  • Parsing of file information for storage in metadata database
  • Streaming transfer of data objects including default and custom metadata (integrity seal, content-md5, policy for object immutability with administration override)
  • Rollback functionality: data is deleted from the archive if an error occurs, keeping the metadata database and the archive consistent
  • Multithreading: simultaneous data compression on multiple cores
  • Standard Widget Toolkit (SWT) GUI for file selection and file transfer initiation
  • Logging
  • HTML output result file
  • File deletion from local disc after successful transfer
  • Program configuration via an INI file for easy customization
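
Three of the features above (tar.gz compression, the content-md5 value and rollback on error) can be illustrated with a minimal sketch. It is written in Python for brevity, whereas MSDX itself is a Java tool built on the manufacturer's SDK, so none of this is the actual MSDX code. The class InMemoryArchive, the function archive_with_rollback, the register_metadata callback and the metadata key x-source-name are hypothetical stand-ins for the object store, the transfer routine and the metadata database.

```python
import base64
import hashlib
import io
import tarfile
from pathlib import Path


def compress_to_tar_gz(source: Path) -> bytes:
    """Pack a file or folder into an in-memory tar.gz archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        archive.add(str(source), arcname=source.name)
    return buffer.getvalue()


def content_md5(payload: bytes) -> str:
    """Base64-encoded MD5 digest, the form used by the HTTP Content-MD5 header."""
    return base64.b64encode(hashlib.md5(payload).digest()).decode("ascii")


class InMemoryArchive:
    """Hypothetical stand-in for the object store; not the vendor SDK."""

    def __init__(self):
        self.objects = {}

    def write(self, payload: bytes, metadata: dict) -> str:
        # Derive a 32-character ID from the payload; a real store assigns its own.
        object_id = hashlib.sha256(payload).hexdigest()[:32]
        self.objects[object_id] = (payload, metadata)
        return object_id

    def delete(self, object_id: str) -> None:
        self.objects.pop(object_id, None)


def archive_with_rollback(source: Path, archive, register_metadata) -> str:
    """Compress, store, then register metadata; undo the store if registering fails."""
    payload = compress_to_tar_gz(source)
    metadata = {"Content-MD5": content_md5(payload), "x-source-name": source.name}
    object_id = archive.write(payload, metadata)
    try:
        register_metadata(object_id, metadata)
    except Exception:
        # Rollback: remove the object so archive and database stay consistent.
        archive.delete(object_id)
        raise
    return object_id
```

Calling archive_with_rollback(Path("run01.raw"), InMemoryArchive(), some_dict.__setitem__) would store the compressed file and record its metadata in some_dict. If the callback raises, the object is removed again before the error propagates.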

DX RECOVERY TOOL (DXRECOVER)

The web application DXrecover provides a browser-based search interface for recovering data from the OSD archive. First, the metadata database is searched according to the user’s request and the results are displayed in the browser. Each metadata record contains a link to the corresponding object on the OSD archive, which allows the raw data to be downloaded to local computers.
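
The search-then-download flow can be sketched as a metadata query that returns object links. The example below is an assumption-laden illustration in Python: it uses an in-memory SQLite database in place of the MySQL metadata database, and the table raw_files, its columns, the file names and the example.org URLs are all hypothetical rather than the real DXrecover schema.

```python
import sqlite3


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory metadata table with a few made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_files (file_name TEXT, instrument TEXT, object_url TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_files VALUES (?, ?, ?)",
        [
            ("sample_A.raw.tar.gz", "instrument-1", "http://osd.example.org/object-1"),
            ("sample_B.raw.tar.gz", "instrument-2", "http://osd.example.org/object-2"),
            ("blank_01.raw.tar.gz", "instrument-1", "http://osd.example.org/object-3"),
        ],
    )
    return conn


def find_download_links(conn: sqlite3.Connection, name_fragment: str) -> list:
    """Return (file_name, object_url) pairs whose file name contains the fragment."""
    rows = conn.execute(
        "SELECT file_name, object_url FROM raw_files "
        "WHERE file_name LIKE ? ORDER BY file_name",
        (f"%{name_fragment}%",),  # parameterized to avoid SQL injection
    )
    return rows.fetchall()


conn = build_demo_database()
for file_name, object_url in find_download_links(conn, "sample"):
    print(file_name, "->", object_url)
```

In a web front end, each returned URL would be rendered as a download link next to the matching record.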


For further information, see the published article:

The Amino Acid's Backup Bone - Storage Solutions for Proteomics Facilities.
Meckel H, Stephan C, Bunse C, Krafzik M, Reher C, Kohl M, Meyer HE, Eisenacher M.

If you are interested in obtaining the software packages, please do not hesitate to contact:

Hagen Meckel
Bioinformatics / Biostatistics Group

Medizinisches-Proteom-Center
Ruhr-Universität Bochum
Zentrum für klinische Forschung I (ZKF I) Raum E.042
Universitätsstrasse 150
D-44780 Bochum
Phone: +49 234 32 25999
Fax: +49 234 32 14554
Contact person: Hagen Meckel