Survey on memory and devices disaggregation solutions for HPC systems

Maciej Bielski 1, 2, Christian Pinto, Daniel Raho, Renaud Pacalet 1, 2
1 LabSoC - System on Chip
LTCI - Laboratoire Traitement et Communication de l'Information
Abstract :

Traditionally, HPC workloads are characterized by different requirements in CPU and memory resources, which in addition vary over time in an unpredictable manner. For this reason, HPC system designs that assume physical co-location of CPU and memory on a single motherboard strongly limit scalability and lead to inefficient over-provisioning of resources. Also, peripherals available in the system need to be globally accessible to allow optimal usage. In this context, modern HPC designs tend to support disaggregated memory, compute nodes, remote peripherals and hardware extensions to support virtualization techniques. This paper presents a qualitative survey of different attempts at memory and device disaggregation. In addition, alternative future directions for device disaggregation are proposed in the context of the work planned in the H2020 dRedBox project.

https://hal.telecom-paristech.fr/hal-02287406
Contributor : Telecomparis Hal
Submitted on : Friday, September 13, 2019 - 4:56:47 PM
Last modification on : Sunday, September 15, 2019 - 1:12:10 AM

Identifiers

  • HAL Id : hal-02287406, version 1

Citation

Maciej Bielski, Christian Pinto, Daniel Raho, Renaud Pacalet. Survey on memory and devices disaggregation solutions for HPC systems. 19th IEEE International Conference on Computational Science and Engineering - CSE 2016, Aug 2016, Paris, France. ⟨hal-02287406⟩
