Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

As a C++ Machine Learning Engineer on the AI Models team at Tenstorrent, you’ll work on the training framework behind our most advanced models. You’ll write high-performance C++ code, shape how new layers and operators are implemented, and help models scale across our custom silicon. If you enjoy building the guts of ML systems and seeing them run fast, this role is for you.

This role is hybrid, based out of Warsaw or Gdansk, Poland. We also consider remote candidates on a case-by-case basis.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the level listed in this posting.

Who You Are

  • Strong in C++ and low-level systems programming, especially in performance-critical code.
  • Comfortable thinking in tensors, memory layout, and compiler graphs.
  • Familiar with PyTorch and curious about how frameworks map to hardware.
  • A builder who enjoys solving technical puzzles and digging into the details.

What We Need

  • Extend and optimize our ML training framework with new ops, layers, and training features.
  • Debug and tune model performance on Tenstorrent chips.
  • Work with compiler and kernel teams to make sure models compile and run as expected.
  • Support integration of real-world models and help bring them into production.

What You Will Learn

  • How ML frameworks and compilers connect at the system level.
  • How to translate training workloads into low-level operations optimized for custom silicon.
  • How large-scale model training works under the hood, from memory layout to operator fusion.
  • What it takes to build infrastructure that supports fast iteration in research and production.

Tenstorrent offers a highly competitive compensation package and benefits.
