
IGBMC – New HPC storage for genomics

The Institute of Genetics and Molecular and Cellular Biology (IGBMC) was founded in 1994 by Pierre Chambon, a leading figure in biomedical research, and is now one of the leading European research centres in genetics and molecular biology. In France, it is the largest research unit involving INSERM (National Institute for Health and Medical Research), CNRS (National Centre for Scientific Research) and the University of Strasbourg. In addition to its four scientific departments, the IGBMC has developed scientific services and high-tech platforms for both its own internal use and use by the entire scientific community. The Institute aims to further develop transdisciplinary research at the interface of biology, biochemistry, physics and medicine, and to attract students from all over the world by offering top-class training in the biomedical sciences. The IGBMC campus is located in the Illkirch Innovation Park, in the suburbs of Strasbourg, an exceptional academic and industrial environment designed for collaboration and technology transfer.

Project details.

  • Professional Services
  • Server & Storage
  • Education / Research
  • Lenovo
  • 250–1,500 employees

PROJECT OBJECTIVES.

The GenomEast platform leverages a computing infrastructure that consists of a computational cluster (20 nodes / 304 physical cores / 608 virtual cores) in conjunction with 390 TB of storage.

The platform's sequencer is connected to this cluster via a 1 Gb/s network link and transmits data as it is generated.
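As a rough illustration of what a 1 Gb/s link implies for sequencer output, the sketch below estimates how long a given data volume takes to transfer. The 1 TB example volume and the 90% link efficiency are illustrative assumptions, not figures from the project.

```python
def transfer_time_hours(data_tb: float,
                        link_gbps: float = 1.0,
                        efficiency: float = 0.9) -> float:
    """Estimate the hours needed to move `data_tb` terabytes over a link.

    Decimal units are used throughout (1 TB = 10**12 bytes,
    1 Gb/s = 10**9 bits/s). `efficiency` is an assumed fraction of the
    line rate that is actually available to payload data.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = data_tb * 1e12 * 8                      # payload size in bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical 1 TB of sequencer output over the 1 Gb/s link.
    print(f"{transfer_time_hours(1.0):.1f} h")
```

At full line rate, 1 TB takes 8,000 seconds (a little over two hours), which is why streaming the data as it is generated is preferable to a bulk copy at the end of a run.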

 

Who can access this platform?

  • Bioinformaticians using the platform with read and write rights can read files from the sequencer, run computations on these files and then write the corresponding results (most of these files are compressed).
  • Platform clients with read-only rights can retrieve their data via an FTP server, which exposes the data stored in the platform’s storage cluster through NFS mount points.

 

Challenges.

  • Acquisition of a new storage infrastructure (and the accompanying parallel virtual file system)
  • Integration of a tiering solution that allows drives with different performance levels to be used within a single logical volume
  • A fast-drive tier of at least 20 TB
  • The solution must be expandable to at least 1 PB simply by adding components (scaling up), in expansion increments of at least 250 TB, and must at least match the performance of the original solution.
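
The expansion requirement can be checked with simple arithmetic. The sketch below assumes decimal units (1 PB = 1,000 TB), takes the ~736 TB delivered capacity quoted under Business Benefits as the starting point, and uses the minimum 250 TB increment. The function name is hypothetical.

```python
import math


def expansion_steps(current_tb: float,
                    target_tb: float = 1000.0,
                    increment_tb: float = 250.0):
    """Return (number of increments, resulting capacity in TB).

    Computes how many fixed-size expansion increments are needed for
    the capacity to reach at least `target_tb`.
    """
    if increment_tb <= 0:
        raise ValueError("increment_tb must be positive")
    shortfall = max(0.0, target_tb - current_tb)
    steps = math.ceil(shortfall / increment_tb)
    return steps, current_tb + steps * increment_tb


if __name__ == "__main__":
    steps, final_tb = expansion_steps(736.0)
    print(f"{steps} increments -> {final_tb:.0f} TB")
```

Under these assumptions, two increments of 250 TB take the system past the 1 PB mark.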

     

SOLUTION.

The solution consists of two Spectrum Scale servers based on Lenovo SR650 servers, connected to three Lenovo storage arrays. The main array (the GPFS controller) has two active/active dual-port 12 Gb/s SAS controllers and holds 4 × 3.2 TB SSDs combined with 20 × 1.6 TB SSDs. It is complemented by 2 expansion enclosures for 60 NL-SAS drives (12 TB, 7.2K RPM each), plus a further 20 NL-SAS drives (12 TB, 7.2K RPM) and 4 × 3.2 TB SSDs. The overall configuration is organised into several LUNs of 10 drives each (8 data + 2 parity, 8+2P), with a cumulative throughput of more than 2 GB/s.
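
In an 8+2P layout, each 10-drive LUN dedicates the equivalent of 8 drives to data and 2 to parity. A minimal sketch of that arithmetic for a single LUN of 12 TB NL-SAS drives follows. The function name is hypothetical, and the calculation ignores spare drives, metadata and file system overhead, so it gives an upper bound rather than the delivered capacity.

```python
from typing import Tuple


def lun_capacity_tb(drive_tb: float,
                    data_drives: int = 8,
                    parity_drives: int = 2) -> Tuple[float, float]:
    """Return (raw, usable) capacity in TB for one parity-protected LUN.

    Raw capacity counts every drive in the LUN; usable capacity counts
    only the data drives, since parity consumes the rest.
    """
    if data_drives <= 0 or parity_drives < 0:
        raise ValueError("invalid drive counts")
    raw = drive_tb * (data_drives + parity_drives)
    usable = drive_tb * data_drives
    return raw, usable


if __name__ == "__main__":
    raw, usable = lun_capacity_tb(12.0)
    print(f"raw={raw:.0f} TB usable={usable:.0f} TB "
          f"efficiency={usable / raw:.0%}")
```

For 12 TB drives this gives 120 TB raw and 96 TB usable per LUN, an 80% capacity efficiency, while tolerating the loss of any two drives in the LUN.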

     

To secure the existing IGBMC data for the long term, the data was migrated from the existing storage system to the new Spectrum Scale system.

     

The new infrastructure is covered by a 5-year hardware warranty and an 8-year software warranty.

BUSINESS BENEFITS.

The solution offers high storage density (~736 TB in 14 U of rack space), high availability thanks to the 2 access servers and its hardware design, and high performance (> 2 GB/s). It also offers enterprise features such as tiering, snapshots, replication, quotas and multi-protocol export.