Together AI
Senior Software Engineer, Together Cloud Infrastructure
Amsterdam
Role brief
What this role is asking for.
About the Role

Together AI is building the AI Acceleration Cloud, an end-to-end platform for the full generative AI lifecycle, combining the fastest LLM inference engine with state-of-the-art AI cloud infrastructure. As a Senior AI Infrastructure Engineer, you will play a key role in building the next-generation AI cloud platform: a highly available, global, blazing-fast cloud infrastructure that virtualizes cutting-edge ML hardware (GB200s/GB300s, BlueField DPUs) and provides state-of-the-art ML practitioners with self-serve AI cloud services, such as on-demand and managed Kubernetes and Slurm clusters. This platform serves both our internal SaaS products (inference, fine-tuning) and our external cloud customers, spanning dozens of data centers across the world.

Hybrid working: two days a week in the Amsterdam office.

Responsibilities

Design, build, and maintain performant, secure, and highly available backend services/operators that run in our data centers and automate hardware management, such as InfiniBand partitioning, in-DC parallel storage provisioning, and VM provisioning.
Design and build out the IaaS software layer for a new GB200 data center with thousands of GPUs.
Work on a global multi-exabyte high-performance object store, serving massive datasets for pretraining.
Build advanced observability stacks for our customers with automated node lifecycle management for faul...
Company role signals