Director, Site Reliability Engineering

Full Time

San Francisco, CA, USA

11 months ago

About Pinterest:

Millions of people across the world come to Pinterest to find new ideas every day. It’s where they get inspiration, dream about new possibilities and plan for what matters most. Our mission is to help those people find their inspiration and create a life they love. In your role, you’ll be challenged to take on work that upholds this mission and pushes Pinterest forward. You’ll grow as a person and leader in your field, all the while helping Pinners make their lives better in the positive corner of the internet.

Creating a life you love also means finding a career that celebrates the unique perspectives and experiences that you bring. As you read through the expectations of the position, consider how your skills and experiences may complement the responsibilities of the role. We encourage you to think through your relevant and transferable skills from prior experiences.

Our new progressive work model is called PinFlex, a term unique to Pinterest that describes our flexible approach to living and working. Visit our PinFlex landing page to learn more.

At Pinterest, the Site Reliability Engineering organization is accountable for both ensuring Pinterest's overall availability and helping Engineering teams design, build, and operate resilient systems at scale.

We are looking for a skilled SRE Director to join our growing team and take the lead on developing and maintaining a world-class Site Reliability Engineering (SRE) organization. In this leadership position, your extensive technical knowledge will drive innovation, optimize system performance, and ensure the scalability and reliability of our crucial infrastructure.

What You’ll Do:

Lead, manage, mentor, and cultivate high-performing SRE teams by fostering an innovative, collaborative, and responsible culture.
Drive professional growth, career development, and strategic succession planning for the team, ensuring a bench strength of talent with diverse capabilities.
Create an inclusive, welcoming and open environment for robust communication and risk-taking, inspiring positive team dynamics and engagement.
Attract and retain top-tier talent, prioritizing professional growth, leadership development and a supportive work culture.
Lead IMOC, implement freeze rotations, and mentor the team during technical incidents and critical business decisions related to production reliability and stability.
Establish, manage, and own key SRE north-star metrics, and continuously plan for their improvement.
Establish the team's charter and direction in alignment with the strategic goals of the broader SRE and Engineering organizations.
Oversee the formulation and execution of the SRE strategy, ensuring alignment with business objectives and key industry trends.
Manage operational and technical quality guided by established metrics and processes, ensuring consistent enhancement of the team's performance.
Drive the adoption of new technologies and tools to enhance SRE efficiency and effectiveness.
Facilitate the building of an operating framework to deliver significant value to customer organizations while developing tools and systems beneficial to the wider Engineering organization.

What We’re Looking For:

10+ years of experience in SRE or a related field, with demonstrated ability to manage and grow high-performing SRE teams.
Demonstrable experience in leadership roles involving teams of 25 or more engineers across various locations; capacity to establish effective cross-functional partnerships.
Deep knowledge of at least one major public cloud platform (AWS, Azure, GCP), including proficiency in IaaS, PaaS, container services, and networking.
Experience in designing and implementing multi-cloud resiliency solutions, disaster recovery, and business continuity planning.
Experience leading teams skilled in large-scale frameworks and automation languages or tools such as Python, Java, Go, Lambda, and/or Terraform.
Extensive expertise in infrastructure automation, command of configuration management tools (for example, Ansible, Chef, Puppet), and thorough understanding of Infrastructure as Code (IaC) principles.
Experience in formulating and applying Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
Solid grasp of security best practices in a cloud environment.
Exceptional ability to explain complex technical problems to non-technical audiences and promote strategies to key stakeholders.
Experience and successful track record in building and evolving effective teams, nurturing both technical and people-centric leaders.

Relocation Statement:

This position is not eligible for relocation assistance. Visit our PinFlex page to learn more about our working model.

At Pinterest we believe the workplace should be equitable, inclusive, and inspiring for every employee. In an effort to provide greater transparency, we are sharing the base salary range for this position. The position is also eligible for equity. Final salary is based on a number of factors including location, travel, relevant prior experience, or particular skills and expertise.

Information regarding the culture at Pinterest and benefits available for this position can be found here.

US-based applicants only: $204,754 to $421,578 USD

Our Commitment to Diversity:

Pinterest is an equal opportunity employer and makes employment decisions on the basis of merit. We want to have the best qualified people in every job. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you require an accommodation during the job application process, please notify accessibility@pinterest.com for support.