Senior Site Reliability Engineer (SRE) / DevOps Engineer
SAM Labs is an award-winning, venture-capital-funded EdTech start-up. Founded in 2014, we are growing fast and making a difference. With over 9,000 school customers in 60+ countries already using SAM Labs products and lesson materials, SAM Labs is looking for more talent to join its team!
SAM Labs inspires generations of problem solvers with Coding and STEAM. We empower teachers with innovative curriculum resources, tools, and the mindset to do so. Teachers and students learn with SAM Labs kits and lessons by designing anything from energy-efficient lighting grids for ‘smart’ cities to solar-powered habitats for endangered species, all while meeting curriculum standards.
The Opportunity
We're seeking an exceptional Senior Site Reliability Engineer / DevOps Engineer with a passion for crafting and operating highly scalable, resilient, cloud-based systems. You'll play a vital role in automating, optimizing, and continuously improving our software delivery infrastructure. In this collaborative role, you will work alongside talented developers and operations specialists to ensure the seamless delivery and reliability of our applications. This is a contract position running until November 2024.
Responsibilities:
- Design, implement, and manage Kubernetes (AWS EKS) clusters to ensure high availability and disaster recovery capabilities.
- Automate CI/CD pipelines using GitHub Actions and manage code repositories on GitHub.
- Apply a deep understanding of AWS cloud services (e.g., EC2, S3, RDS, Lambda, DynamoDB) and use AWS Organizations for multi-account management.
- Develop and maintain infrastructure as code (IaC) using Terraform.
- Ensure system reliability and efficiency through comprehensive observability solutions, including Grafana, Kibana, and AWS CloudWatch.
- Conduct system performance analysis and optimize infrastructure to meet business requirements.
- Collaborate with cross-functional teams to integrate DevOps methodologies into development, testing, and production environments.
- Lead incident response efforts, including post-mortem analysis and implementing preventive measures.
- Stay updated on emerging technologies and propose enhancements to our DevOps practices.
Why this Role?
- Career Growth: Joining SAM Labs as a Senior Site Reliability Engineer / DevOps Engineer offers the unique opportunity to shape and enhance the architecture of our growing platform. You will work on ensuring the reliability and scalability of our systems.
- Learning and Development: This role will deepen your expertise in cloud services, infrastructure as code, and DevOps practices. You will gain hands-on experience with AWS, Kubernetes, Terraform, and CI/CD automation.
- Impactful Work: Your contributions will directly support the education of thousands of students worldwide, enabling innovative STEAM learning experiences that meet curriculum standards.
- Collaborative Environment: You will work with a talented team of developers and operations specialists, fostering a culture of creativity, transparency, and fun.
By joining SAM Labs, you become part of a mission-driven team dedicated to transforming education through technology. Embrace the challenge, grow with us, and make a difference in the lives of students worldwide.