Work Mode: Full Time
Locations: Warsaw

About SimCorp:

Join SimCorp, a leader in FinTech, as we drive the evolution of financial technology. We are an innovative, curious, and collaborative company that embraces challenges and fosters growth. Our values – caring, customer success-driven, collaborative, curious, and courageous – guide us. We are a people-centered organization focused on skills development, relationship building, and client success, creating an environment where all team members can grow and feel heard, valued, and empowered.

Role Importance:

As part of the Deutsche Börse Group, SimCorp supports clients 24/7 with delivery centers worldwide. Our strategy focuses on Platform leadership, SaaS acceleration, and Ecosystem scaling by 2025. Platform Azure Operations is crucial for ensuring reliability, availability, and operational excellence across our cloud-native services. The Lead Problem Manager is key to this mission, driving root cause analysis, preventing recurring issues, and enhancing platform stability and performance.

The Lead Problem Manager is part of the Delivery Excellence team within the Platform organization.

As Lead Problem Manager, you will identify, manage, and resolve systemic and recurring issues across our cloud infrastructure and services. You will use monitoring tools, ticketing systems (e.g., Opsgenie, Salesforce, Cadalys), and operational platforms (Azure DevOps, Grafana) to uncover insights, lead problem investigations, and drive cross-functional corrective actions. You will establish problem management workflows, track key metrics, and ensure that lessons learned from incidents are integrated into future improvements, including automation opportunities. This work promotes a data-driven approach to reliability and operational excellence.

Responsibilities:

  • Problem Analysis & Root Cause Identification: Lead deep-dive investigations into major and recurring issues. Facilitate root cause analysis (RCA) sessions, coordinate across engineering and operations, and maintain thorough problem records. Ensure RCAs are completed within agreed SLAs.
  • Platform Stability & Preventive Action: Identify systemic risks and implement mitigation strategies through infrastructure changes, configuration updates, or automation solutions to prevent issue recurrence.
  • Tooling & Process Integration: Develop and maintain integrations across incident, change, and problem management systems to ensure seamless data flow and traceability.
  • Automation & Self-Healing Initiatives: Collaborate with Service Enablement specialists, SREs, and platform engineers to design and implement automation for detecting, mitigating, or resolving known errors and platform vulnerabilities.
  • Knowledge Enablement & RCA Documentation: Create and maintain clear RCA documentation, known error databases, and self-service materials for engineering and support teams.
  • Analytics & Reporting: Provide visibility into recurring issues, mean time to resolution (MTTR), and problem trends through dashboards and reports, influencing prioritization of technical debt and improvement initiatives.
  • Continuous Improvement Leadership: Drive post-incident reviews and retrospectives, fostering a culture of learning and accountability through feedback loops and operational best practices.

Qualifications:

  • 5+ years of experience in Problem Management, Service Management (ITIL), or platform operations, preferably in cloud environments.
  • Proficiency in ITIL practices, particularly in problem, incident, and change management.
  • Experience with cloud monitoring and alerting platforms (e.g., Opsgenie, Grafana, Prometheus).
  • Proficiency in RCA methodologies (e.g., 5 Whys, Fishbone, Pareto) and problem tracking systems (e.g., Salesforce, Cadalys, ServiceNow, Jira).
  • Familiarity with automation and self-healing frameworks for cloud operations.
  • Experience with reporting and data analysis tools (e.g., Power BI, Azure Log Analytics).
  • Strong communication, facilitation, and cross-functional collaboration skills.
  • Ability to prioritize effectively in a diverse, global, and multicultural environment.

Benefits:

SimCorp offers a comprehensive benefits package that may vary by country, including opportunities for training and certification, flexible hours, vacation time, and work-from-home options to support work-life balance.