Infrastructure Engineer

Full Time
Toronto, ON, Canada
2 months ago

At Lyft, our mission is to improve people’s lives with the world’s best transportation. To do this, we start with our own community by creating an open, inclusive, and diverse organization.

The Datastores and Persistence teams are responsible for managing data in transit and at rest for all Lyft services. Our persistent storage systems safely read and write petabytes of data each month. The teams manage and maintain our cloud storage services, including RDS, Aurora, DynamoDB, S3, ElastiCache and ZooKeeper, to ensure that all Lyft services are able to access and store information reliably and efficiently. We are also responsible for ensuring that data in transit is accessible, providing data pipelines that our customers can subscribe to in order to respond in real time to defined events. This infrastructure is built upon a Kafka foundation, with Lyft-customized tooling added to support our specific needs.

Our infrastructure has been built by Lyft to reliably store and stream billions of unique events each day. You will be part of a team that continuously monitors and tunes these systems in order to ensure that we can maintain a high level of service for our teams across the company.

Responsibilities:
  • Maintain and analyze metrics from operating systems, control planes and applications to assist in fault detection and performance enhancement
  • Design, develop and deploy tooling and systems that continually improve the reliability, scalability and efficiency of our platform
  • Balance feature development speed and reliability with service-level objectives
  • Operate and improve our infrastructure using industry best practices and tools
  • Participate in design and production readiness reviews, platform management and capacity planning ceremonies with cross-functional teams
  • Document infrastructure operations processes and insights, identify repeatable actions and ruthlessly automate repetitive tasks
  • Participate in our team's on-call rotations, respond to incidents and support other teams in mitigating customer-impacting events
Experience:
  • 3+ years of experience working on teams responsible for software development, automation and systems engineering
  • Bachelor’s Degree or equivalent experience in Computer Science or relevant discipline
  • Ability to create production-ready code in one or more high-level languages, such as Go, Python, Rust or C/C++
  • Experience operating large scale infrastructure in public cloud environments, such as AWS, Google Cloud or Microsoft Azure
  • Experience operating and monitoring Kubernetes and Envoy Proxy in large production environments
  • Experience with distributed storage technologies such as S3, RDS, DynamoDB and Aurora, and distributed configuration systems such as ZooKeeper and etcd
  • Experience using monitoring, alerting and logging systems at massive scale, such as Prometheus, Telegraf and M3
  • Experience identifying nascent problems, performance bottlenecks and areas for improvement and developing and executing plans to mitigate them
Benefits:
  • Extended health and dental coverage options, along with life insurance and disability benefits
  • Mental health benefits
  • Family building benefits
  • Access to a Health Care Savings Account
  • In addition to provincially observed holidays, team members get 15 days of paid time off, with an additional day for each year of service
  • 4 floating holidays each calendar year, prorated based on date of hire
  • 10 paid sick days per year regardless of province
  • 18 weeks of paid parental leave. Biological, adoptive, and foster parents are all eligible

Lyft proudly pursues and hires a diverse workforce. Lyft believes that every person has a right to equal employment opportunities without discrimination because of race, ancestry, place of origin, colour, ethnic origin, citizenship, creed, sex, sexual orientation, gender identity, gender expression, age, marital status, family status, disability, pardoned record of offences, or any other basis protected by applicable law or by Company policy.  Lyft also strives for a healthy and safe workplace and strictly prohibits harassment of any kind.  Accommodation for persons with disabilities will be provided upon request in accordance with applicable law during the application and hiring process.  Please contact your recruiter now if you wish to make such a request.

This role will be in-office on a hybrid schedule following the establishment of a Lyft office in Toronto. Team members will be expected to work in the office 3 days per week: Mondays, Thursdays and a team-specific third day. Additionally, hybrid roles have the flexibility to work from anywhere for up to 4 weeks per year.