What you'll need to be successful:
This role requires a unique blend of a software engineer's mindset and operational expertise. You'll thrive in this role if you have:
- A proactive and systematic approach to problem-solving, with a high degree of ownership.
- Proven experience working autonomously in production environments that support large-scale, mission-critical applications.
- Proficiency in at least one programming language, with a preference for Go. You should be comfortable writing custom applications, not just scripts.
- Experience with infrastructure as code (Terraform), containers and container orchestration (Docker, Kubernetes), and GitOps (ArgoCD).
- Demonstrable expertise in a major cloud provider (Azure, AWS, or GCP).
- A strong grasp of microservices architecture, databases (SQL and NoSQL), and networking fundamentals, so you can see where custom code solves platform-level issues.
- An understanding of core SRE principles, including SLIs, SLOs, and error budgets.
- Experience in an on-call rotation for a 24/7 cloud-based environment.
- Exceptional communication and collaboration skills, with a proven ability to work effectively in a remote, distributed team where much of the work is self-directed.
We're looking for someone who wants not just a job but a career-defining opportunity to tackle complex challenges at massive scale. If you're a curious, motivated engineer who's passionate about building reliability directly into the platform, we'd love to hear from you.