WHAT MAKES US, US

Join some of the most innovative thinkers in FinTech as we lead the evolution of financial technology. If you are an innovative, curious, collaborative person who embraces challenges and wants to grow, learn and pursue outcomes with our prestigious financial clients, say Hello to SimCorp!

At its foundation, SimCorp is guided by our values: caring, customer success-driven, collaborative, curious, and courageous. Our people-centered organization focuses on skills development, relationship building, and client success. We take pride in cultivating an environment where all team members can grow and feel heard, valued, and empowered.

If you like what we’re saying, keep reading!

WHY THIS ROLE IS IMPORTANT FOR US

As part of the Deutsche Börse Group, and with delivery centers worldwide (Manila, Noida, Kyiv, Warsaw, and Mexico City), we support our clients 24/7. Our strategy focuses on Platform leadership, SaaS acceleration, and Ecosystem scaling by 2025.

Within this journey, Platform Azure Operations plays a vital role by ensuring reliability, availability, and operational excellence across our cloud-native services. The Lead Problem Manager is a pivotal role in this mission, responsible for driving root cause analysis, preventing recurring issues, and continuously improving platform stability and performance.

The Lead Problem Manager sits within the Delivery Excellence team, which is part of the Platform organization.

As Lead Problem Manager, you will be responsible for identifying, managing, and resolving systemic and recurring issues across our cloud infrastructure and services. You’ll work across monitoring tools, ticketing systems (e.g., Opsgenie, Salesforce, Cadalys), and operational platforms (e.g., Azure DevOps, Grafana) to surface insights, lead problem investigations, and drive cross-functional corrective actions.

You will establish problem management workflows, track key metrics, and ensure that lessons learned from incidents are institutionalized into future improvements, including automation opportunities. This position plays a vital role in ensuring a data-driven approach to reliability and operational excellence.

WHAT YOU WILL BE RESPONSIBLE FOR

Problem Analysis & Root Cause Identification:

  • Lead deep-dive investigations into major and recurring issues. Facilitate root cause analysis (RCA) sessions, coordinate across engineering and operations, and maintain thorough problem records.
  • Ensure RCAs are completed and delivered within the timelines set by service level agreements (SLAs) to meet compliance and stakeholder expectations.

Platform Stability & Preventive Action:

  • Recognize systemic risks and take steps to mitigate them through changes in infrastructure, updates to configurations, or automation solutions. Prevent recurrence of high-impact issues.

Tooling & Process Integration:

  • Develop and maintain integrations (in collaboration with our Service Enablement team) across incident, change, and problem management systems. Ensure seamless data flow and traceability from incidents to RCAs and preventive actions.

Automation & Self-Healing Initiatives:

  • Partner with Service Enablement specialists, SREs and platform engineers to design and implement automation that detects, mitigates, or resolves known errors and platform vulnerabilities.

Knowledge Enablement & RCA Documentation:

  • Create and maintain clear RCA documentation, known error databases, and self-service materials to upskill engineering and support teams.

Analytics & Reporting:

  • Deliver visibility into recurring issues, mean time to resolution (MTTR), and problem trends via dashboards and reports. Use insights to influence prioritization of technical debt and improvement initiatives.

Continuous Improvement Leadership:

  • Drive post-incident reviews and retrospectives. Champion a culture of learning and accountability through feedback loops and operational best practices.

WHAT WE VALUE

We are looking for candidates with the following qualifications:

  • 5+ years of experience in problem management, IT service management (ITIL) more broadly, or platform operations, preferably in cloud environments
  • Well-versed in ITIL practices, especially in problem, incident, and change management
  • Experience with cloud monitoring and alerting platforms (e.g., Opsgenie, Grafana, Prometheus)
  • Proficiency in RCA methodologies (e.g., 5 Whys, Fishbone, Pareto) and problem tracking systems (e.g., Salesforce, Cadalys, ServiceNow, Jira)
  • Familiarity with automation and self-healing frameworks for cloud operations
  • Experience with reporting and data analysis tools (e.g., Power BI, Azure Log Analytics)
  • Strong communication, facilitation, and cross-functional collaboration skills
  • Ability to prioritize effectively in a diverse, global, and multicultural environment

BENEFITS

SimCorp offers several benefits that may be a significant factor in deciding whether to accept a job offer. Because SimCorp operates in 30+ offices worldwide, the benefits package may vary from country to country.

WHO WE ARE

For over 50 years, we have worked closely with investment and asset managers to become the world’s leading provider of integrated investment management solutions. We are 3,000+ colleagues with a broad range of nationalities, educations, professional experiences, ages, and backgrounds.

SimCorp is an independent subsidiary of the Deutsche Börse Group. Following the recent merger with Axioma, we leverage the combined strength of our brands to provide an industry-leading, full, front-to-back offering for our clients.

We are committed to building a culture where diverse perspectives and expertise are integrated into our everyday work. We believe in the continual growth and development of our employees, so that we can provide best-in-class solutions to our clients.