Company Background

The customer is a global leader in diagnostics and drug development, employing over 70,000 professionals and serving clients in more than 100 countries. With over $14 billion in annual revenue, they are committed to advancing healthcare and empowering patients, providers, and researchers through data-driven solutions. Their mission is to improve health and improve lives by delivering clear and confident answers in a complex medical landscape.

Project Description

The project involves building a new internal system to support Laboratory Information Management (LIM). The goal is to modernize and streamline data handling processes, improve data accessibility, and enhance performance across large-scale data workflows. The team is responsible for both designing and implementing a high-performance solution, working closely with the client on architecture decisions, performance tuning, and best practices in data engineering.

Technologies

  • Python
  • Databricks
  • Apache Spark
  • Hive
  • AWS EMR
  • S3
  • Oracle SQL
  • DataStage
  • CI/CD tools
  • Mainframe systems

What You'll Do

  • Design and implement scalable data processing pipelines using Spark, Hive, and Python
  • Collaborate with stakeholders on system architecture, performance tuning, and design decisions
  • Optimize SQL and Spark queries to ensure fast and efficient data access
  • Develop ETL processes and manage data flow across systems using tools like DataStage
  • Contribute to the CI/CD pipeline setup, automation, and deployment strategies
  • Participate in code reviews, documentation, and cross-functional planning meetings
  • Support the data modeling process within a data warehouse environment
  • Work collaboratively with the client’s engineering and architecture teams

Job Requirements

  • 5+ years of experience in Data Engineering or a related role
  • 5+ years of hands-on Python development experience
  • Experience with Databricks, Spark, Hive, AWS EMR/S3
  • Proficiency in Oracle SQL and query tuning
  • Familiarity with CI/CD tools and modern DevOps practices
  • Strong understanding of SDLC and software engineering principles
  • Experience with data modeling and ETL in large-scale environments
  • Exposure to mainframe systems is a plus
  • English level: B1+ (spoken and written)

What We Offer

The global benefits package includes:

  • Technical and non-technical training for professional and personal growth
  • Internal conferences and meetups to learn from industry experts
  • Support and mentorship from an experienced colleague to help you grow and develop professionally
  • Internal startup incubator
  • Health insurance
  • English courses
  • Sports activities to promote a healthy lifestyle
  • Flexible work options, including remote and hybrid opportunities
  • Referral program for bringing in new talent
  • Work anniversary program and additional vacation days

Coherent Solutions Sp z o.o.