Site Reliability Engineer II

Full Time
Richardson, TX, USA
1 week ago

Our Opportunity:

We are looking for a Site Reliability Engineer II at our facility in Richardson, Texas to contribute to the enhancement of site reliability and resiliency, system operations, infrastructure as code, observability, security hardening, and performance engineering.

What You’ll Do: 

  • Support the implementation and management of Chewy platform standards to facilitate the seamless transition of applications to production by leveraging AWS services, containerization, and Infrastructure as Code (IaC) techniques
  • Provide a comprehensive framework for automating and optimizing processes, thereby minimizing the reliance on manual intervention
  • Utilize tools such as Python and Terraform to achieve efficient process automation
  • Establish a robust framework for site reliability that can be measured and reported to our customers
  • Implement scalable processes using various process automation tools
  • Take charge of maintaining security hardening for load balancers and oversee regular upgrades and software maintenance
  • Handle daily operations and routine developer and admin activities on the Chewy platform, and share reports across the organization

What You’ll Need:

  • Bachelor's degree in Computer Science, Information Science, Network Engineering, Cyber Security, Site Reliability, or a related field, and 5 years of experience
  • Will accept Master’s degree and 3 years of experience
  • Experience must include 3 years with: hands-on use of cloud services, specifically AWS, including EC2 instances, ECS and EKS container platforms, IAM roles, network VPC configurations, Load Balancers, and other essential AWS services
  • Broader AWS global setup including knowledge of different regions, availability zones, and designing systems for reliability and fault tolerance
  • Containerization technologies including Docker and AWS Fargate
  • Expertise in orchestrating containers using Elastic Container Services or Kubernetes
  • Utilizing observability tools such as Datadog or Splunk
  • Service Level Objectives (SLOs) and ability to measure reliability of services
  • CI/CD tools and processes, including pipelines-as-code (Jenkins, GitHub Actions)
  • Configuration management and infrastructure-as-code (Terraform) for cloud provisioning
  • Troubleshooting and handling outages and incidents

Chewy is committed to equal opportunity. We value and embrace diversity and inclusion of all Team Members. If you have a disability under the Americans with Disabilities Act or similar law, and you need an accommodation during the application process or to perform these job requirements, or if you need a religious accommodation, please contact CAAR@chewy.com.

 

If you have a question regarding your application, please contact HR@chewy.com.

 

To access Chewy's Customer Privacy Policy, please click here. To access Chewy's California CPRA Job Applicant Privacy Policy, please click here.