Remote Jobs Ace

RunPod

Manager, HPC Storage Engineer

Remote, USA

Role brief

What this role is asking for.

Runpod is pioneering the future of AI and machine learning, offering cutting-edge cloud infrastructure for full-stack AI applications. Founded in 2022, we are a rapidly growing, well-funded, remote-first company with a global team across the US, Canada, and Europe. Our mission is to create a foundational platform that enables developers and companies to build, deploy, and scale custom AI systems with speed and flexibility.

As AI workloads continue to push the limits of throughput, latency, and parallelism, Runpod is investing heavily in next-generation storage architectures purpose-built for GPU-centric compute.

We are looking for an Engineering Manager, Datacenter Storage Engineering to lead the team responsible for Runpod’s distributed storage infrastructure across all regions. This role owns the end-to-end storage stack, from NAND and NVMe devices through filesystems, transport protocols, and cluster-level deployment, ensuring performance, reliability, and scalability for AI workloads. You will manage engineers designing and operating large-scale SAN and NFS-based systems, including high-performance shared filesystems for training workloads. This role requires deep technical fluency and architectural leadership, combined with strong people management and operational discipline.

Responsibilities

Own Distributed Storage Architecture: Define, evolve, and operate Runpod’s glob

Company role signals

RunPod role signals.

Repeated tags across 17 active roles show the current hiring pattern.