Engineering Manager - Observability & Reliability Engineering Obsession (x/f/m)
We are looking for an Engineering Manager to join the OREO (Observability & Reliability Engineering Obsession) team in Platform Engineering.
As an Engineering Manager, your mission will be to lead the Reliability & Observability team and drive the evolution of Doctolib's observability platform, supporting the exponential growth of Doctolib services while building and empowering a world-class SRE team.
Working in the tech team at Doctolib involves building innovative products and features to improve the daily lives of care teams and patients. We work in feature teams in an agile environment, while collaborating with product, design, and business teams.
You will lead a team of Site Reliability Engineers who are responsible for shaping Doctolib's observability strategy and ensuring our platform remains reliable, debuggable, and scalable. This role sits at the intersection of people management, technical leadership, and strategic planning, with a particular focus on building organizational capabilities around logging, metrics, tracing, and alerting.
Your team also owns and operates critical transversal services that enable secure, scalable infrastructure management across the organization, including HashiCorp Vault for secrets management and Terraform Enterprise for infrastructure as code.
Your responsibilities include but are not limited to:
People Leadership:
- Lead, coach, and grow a team of Site Reliability Engineers, supporting their technical development and career progression
- Create a culture of operational excellence, continuous improvement, and psychological safety within the team
- Conduct regular 1:1s, performance reviews, and career development conversations
- Recruit, onboard, and retain top SRE talent aligned with Doctolib's mission and values
Technical Strategy:
- Partner with SREs and senior engineers to define and evolve the observability strategy across the platform, focusing on logging, metrics, tracing, and alerting
- Own the strategy and evolution of critical transversal services including HashiCorp Vault and Terraform Enterprise
- Drive prioritization and roadmap planning for large-scale reliability and observability initiatives
- Ensure alignment between team objectives and broader engineering and business goals
- Advocate for and allocate resources toward reducing technical debt and improving developer experience
Operational Excellence:
- Own the team's on-call experience and contribute to the incident response processes, ensuring sustainable practices and continuous improvement
- Ensure high availability and reliability of transversal services that are critical to the entire engineering organization
- Lead postmortem reviews and drive systemic improvements to prevent recurring issues
Cross-functional Collaboration:
- Work closely with Product Managers, Engineering Managers, and architects to align observability capabilities with product and platform needs
- Partner with security and infrastructure teams to evolve secrets management and IaC practices across the organization
- Represent the OREO team in engineering leadership forums, architectural reviews, and strategic planning sessions
- Foster strong partnerships with software engineering teams to improve instrumentation quality and adoption of observability best practices
About our tech environment:
- Our solutions are built on a single, fully cloud-native platform that supports web and mobile app interfaces and multiple languages, and is adapted to country and healthcare specialty requirements. To address these challenges, we are modularizing our platform, which runs in a distributed architecture, through reusable components.
- Our stack is composed of Rails, TypeScript, Java, Python, Kotlin, Swift, and React Native.
- We leverage AI ethically across our products to empower patients and health professionals.
Who you are:
- You have at least 5 years of software engineering or SRE experience, with a strong technical background in cloud-native environments (preferably AWS, GCP, and/or Kubernetes-based)
- You have 3+ years of engineering management experience, leading technical teams (ideally SRE, platform, or infrastructure teams)
- You have a deep understanding of observability tooling and architecture (Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Prometheus, Thanos, Datadog)
- You have experience with infrastructure as code (Terraform, OpenTofu) and secrets management systems (Vault, AWS Secrets Manager)
- You have a proven ability to balance technical depth with people leadership: you can mentor engineers, review technical designs, and guide architectural decisions
It would be a plus if you:
- Have experience scaling SRE or platform teams in fast-growing, high-traffic environments
- Have a background in designing and operating high-scale telemetry pipelines
- Have hands-on experience with HashiCorp Vault and Terraform Enterprise in production environments
- Have hands-on experience with backend programming languages (e.g., Go, Python, Ruby)
- Have experience driving cultural and technical transformations
What we offer:
- Free comprehensive health insurance for you and your children
- Parent Care Program: receive one additional month of leave on top of the legal parental leave
- Free mental health and coaching services through our partner Moka.care
- For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
- Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
- Work council subsidy to refund part of a sports club membership or a creative class
- Up to 14 days of RTT
- Lunch vouchers on a Swile card
Interview process:
- 30-min phone screen with a Tech Recruiter
- 1h30 technical interview (SRE System Design & Architecture)
- 1h15 behavioral interview (Leadership & People Management)
- 1h30 Engineering Management case study (team scenarios, prioritization, and conflict resolution)
- 1h manager interview with Senior Engineering Leadership
- At least one reference check
Job details:
- Permanent position
- Full-time
- Levallois-Perret, France
- Start date: as soon as possible