Site Reliability Engineer - Gaming
As a Site Reliability Engineer, your primary objective will be to ensure the stability, reliability, and performance of your area's services across our main online casino product.
Our Product Development organisation is truly global, with cross-functional teams spanning 6 Tech Hubs: Malta, Budapest, Stockholm, Tallinn, Kyiv and Athens. Nearly 600 professionals strong, the Product Development organisation is spearheaded by our CTO-CPO, with all our talented Area Teams working together.
Key Responsibilities:
- Incident & Problem Management: Investigate system incidents, drive Root Cause Analyses (RCAs), and implement long-term remedial fixes. Proactively reduce the number of incidents caused by system changes.
- Observability & Metrics: Define and enforce Service Level Agreements (SLAs), Service Level Objectives (SLOs), and success metrics for new initiatives. Build and maintain comprehensive dashboards to achieve observability excellence.
- Performance & Capacity: Identify and help resolve performance bottlenecks. Optimize infrastructure and code to keep services fast, and conduct capacity planning to forecast future hardware or cloud resource requirements.
- Availability & Change Management: Ensure the Platform components remain highly available and functional for users. Oversee deployments to ensure new code does not disrupt the existing system.