HeAT (The Helmholtz Analytics Toolkit)

Contact person

Claudia Comito

Description

HeAT is a distributed tensor framework for high performance data analytics. It provides highly optimized algorithms and data structures for tensor computations using CPUs, GPUs and distributed cluster systems on top of MPI. HeAT builds on PyTorch and mpi4py to provide high-performance computing infrastructure for memory-intensive applications within the NumPy/SciPy ecosystem.

With HeAT you can:

- port existing NumPy/SciPy code from single-CPU to multi-node clusters with minimal coding effort;
- exploit the combined RAM of all your nodes for memory-intensive operations and algorithms;
- run your NumPy/SciPy code on GPUs (CUDA and ROCm; Apple MPS support is planned).

The goal of HeAT is to fill the gap between machine learning libraries that have a strong focus on exploiting GPUs for performance, and traditional, distributed high-performance computing (HPC). The basic idea is to provide a generic, distributed tensor library with machine learning methods built on top of it.

Among other things, this makes it possible to tackle use cases that would otherwise exceed the memory limits of a single node.
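The idea of working around a single node's memory limit can be illustrated without any library: each node holds only a slice of the data and contributes a small partial result, and the partial results are then combined. The sketch below simulates this in plain Python. The helper names `split_evenly` and `distributed_mean` are hypothetical, and this is not a description of how HeAT implements it internally.

```python
def split_evenly(n_items, n_nodes):
    """Return (start, stop) index ranges dividing n_items across n_nodes."""
    base, extra = divmod(n_items, n_nodes)
    ranges, start = [], 0
    for rank in range(n_nodes):
        # The first `extra` nodes take one additional item each.
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def distributed_mean(data, n_nodes):
    """Compute the mean from per-node partial sums and counts."""
    partials = []
    for start, stop in split_evenly(len(data), n_nodes):
        chunk = data[start:stop]  # on a cluster, only this slice lives on the node
        partials.append((sum(chunk), len(chunk)))
    # Only the small (sum, count) pairs need to be exchanged between nodes.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


data = list(range(10))
ranges = split_evenly(len(data), 3)
result = distributed_mean(data, 3)
print(ranges, result)
```

No simulated node ever needs the full data set, only its own slice, which is why the total problem size can exceed what one machine could hold.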

Model(s)

HeAT v1.3: MIT license

Programming language(s)

Python

Tags

#optional #infrastructure #data analysis #dataanalytics #datascience #statistics #linearalgebra
