SRE / Incident Management (x/f/m)
We are looking for a Site Reliability Engineer / Incident Management to join the Doctolib Operations Center team.
As a Site Reliability Engineer / Incident Management, your mission will be to improve the daily lives of care teams and patients by ensuring the reliability and operational excellence of our platform. You will lead efforts in incident and problem management, enhance day-to-day operational capabilities, and consult on infrastructure and platform improvements with teams across the organization.
Working in the tech team at Doctolib involves building innovative products and features to improve the daily lives of care teams and patients. We work in feature teams in an agile environment, while collaborating with product, design, and business teams.
Your responsibilities include but are not limited to:
- Create monitoring tools for the platform and act as first-level responder for site impairments
- Communicate with internal and external stakeholders to assess platform quality and suggest improvements using your broad knowledge of the stack
- Act as Incident Manager during incidents to ensure teams follow appropriate lines of inquiry and have the necessary resources to perform their tasks
- Help teams conduct thorough root cause analyses after incidents and track the implementation of improvements
- Assess non-standard changes as part of the Change Advisory Board to prevent production impact and ensure proper communication with incident management
- Our solutions are built on a single, fully cloud-native platform that supports web and mobile app interfaces and multiple languages, and is adapted to country and healthcare specialty requirements. To address these challenges, we are modularizing our platform, which runs in a distributed architecture, through reusable components.
- Our stack is composed of Rails, TypeScript, Java, Python, Kotlin, Swift, and React Native.
- We leverage AI ethically across our products to empower patients and health professionals.
- You have strong organizational skills and an insatiable curiosity
- You are willing to communicate across several levels of the organization and, when needed, with external business partners
- You have strong knowledge of Linux-based systems, including the CLI and bash commands
- You have strong knowledge of cloud-based systems (AWS preferred)
- You are fluent in English
- You have knowledge of a high-level language like Python or Ruby
- You have working knowledge of the Kubernetes ecosystem
- You are familiar with common software engineering tools like Confluence, Jira, DataDog, and Elasticsearch
- Free comprehensive health insurance for you and your children
- Parent Care Program: receive one additional month of leave on top of the legal parental leave
- Free mental health and coaching services through our partner Moka.care
- For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
- Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
- Work Council subsidy to refund part of a sports club membership or creative class
- Up to 14 days of RTT
- Lunch vouchers on a Swile card
- 30 min phone screen with a Tech Recruiter
- 1h30 Technical interview (SRE)
- 1h30 Behavioral interview
- At least one reference check
- Permanent position
- Full-time
- Paris, France
- Start date: as soon as possible