Engineering Manager, SRE
About Index:
We shaped the earliest forms of ad tech, and we’re looking for the technical expertise to help shape its future. Our customers have unique problems that can only be solved at internet scale, and that’s where the technical skills of our team make a real difference.
Our exchange handles more than 450 billion requests every day (for comparison, Google serves an estimated 9 billion searches a day), all running in our own global data centers. Every member of our technology team has an enormous amount of autonomy in building and managing our systems to support and enable our growing scale. Through the transparency of our technology, dedication to innovation and integrity, and long-standing customer relationships, we lead through change.
What’s it like to work at Index?
We have more than 600 Indexers around the globe dedicated to building a safe and transparent marketplace that provides a trusted experience for consumers.
Index is an exciting and fast-paced place to work. We’re built on our values of change, support, learning and teaching, trust, and intention. We pride ourselves on our independence and openness, not only in our technology, but in our teams, too. Our diverse and inclusive culture celebrates how we can leverage our unique differences to help drive Index forward.
Our culture of success is truly supportive and collaborative. In working together across our teams, we’re continually investing in the people and technology to solve the industry’s most complex problems. As we extend the promise of ad tech to every channel, we’re looking for talented engineers to help move Index, and the industry, forward.
About The Role
We are seeking an experienced Engineering Manager with a strong background in Site Reliability Engineering (SRE) to lead and develop a high-performance team of engineers. The ideal candidate will have a deep technical understanding of on-premise and hybrid cloud environments and a proven track record of managing SRE teams in a global setting.
Here’s What You’ll be Doing
- Team Leadership: Build and lead a world-class SRE team, fostering a culture of innovation, collaboration, and accountability. Provide mentorship, guidance, and professional development opportunities to team members.
- Technical Expertise: Possess a deep understanding of on-premise and hybrid cloud environments, with a focus on optimizing low-latency performance on Kubernetes platforms that support a robust developer experience framework.
- Operational Excellence: Drive operational excellence through proactive monitoring, automation, and the development of robust incident management processes. Ensure the team meets or exceeds its service level objectives (SLOs), as measured by the corresponding service level indicators (SLIs).
- Software Engineering Skills: Collaborate with software engineering teams to implement SRE best practices in the software development life cycle, including designing scalable and resilient systems.
- Incident Management: Lead incident response efforts, ensuring rapid resolution and post-incident analysis to prevent recurrence. Maintain incident reports and contribute to continuous improvement.
- Reporting and Metrics: Develop and maintain meaningful performance metrics and reporting mechanisms to track the health and reliability of our systems. Use data-driven insights to guide decision-making and triaging.
- Global Scale: Manage SRE operations at global scale, considering regional nuances and ensuring consistent, reliable service delivery across geographies.
- Project Management: Act as a technical leader on projects, designing their architecture to meet business needs and align with the existing architectural vision. Collaborate with subject matter experts and a network of peers to ensure on-time, high-quality delivery.
Here’s What You Need
- Proven experience (6+ years) in SRE roles, with a focus on low-latency, global-scale environments built on upstream Kubernetes.
- Strong software engineering skills, including proficiency in programming languages such as Go, Python, and Perl.
- Excellent understanding of on-premise and hybrid cloud architectures.
- Exceptional leadership and team-building skills, with at least 3 years of experience developing high-performing teams.
- Expertise in incident management, root cause analysis, and post-incident reviews.
- Strong analytical and problem-solving abilities.
- Experience with industry-standard SRE tools and technologies within the CNCF portfolio.
- Excellent communication skills, with the ability to collaborate effectively with cross-functional teams.
Why You’ll Love Working Here
- Comprehensive health, dental, and vision plans at no cost to you
- Time off and flexible work schedules
- Retirement plan with a 5% company match
- Stock options and equity packages
- Generous parental leave
- Monthly wellness stipend plus fitness discounts and quarterly wellness group activities
- Home office stipend
- Community engagement opportunities and donation-matching program
- Annual virtual company retreats and regular community-led team events
Equal employment opportunity
At Index Exchange, we believe that successful products are built by teams just as diverse as the audience who uses them. As such, we are committed to equal employment opportunities. We celebrate diversity of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or expression, and veteran status. Additionally, we realize that diversity is deeper than any status or classification; diversity is the human experience. For those who show grit, passion, and humility, Index will welcome you.
Accessibility for applicants with disabilities
Index Exchange is committed to working with and providing access and reasonable accommodations to applicants with disabilities. Please let us know if you’d like to request a reasonable accommodation.
*COVID-19 guidance: We have reopened offices in various cities following local guidelines, but we continue to maintain a flexible work environment.