Staff Site Reliability Engineer

Full Time
Bengaluru, Karnataka, India
See yourself at Twilio

Join the team as Twilio’s next Staff Site Reliability Engineer on Twilio’s Segment platform observability team.

Who we are & why we’re hiring

Twilio powers real-time business communications and data solutions that help companies and developers worldwide build better applications and customer experiences.

Although we're headquartered in San Francisco, we have a presence throughout South America, Europe, Asia and Australia. We're on a journey to become a global company that actively opposes racism and all forms of oppression and bias. At Twilio, we support diversity, equity & inclusion wherever we do business.

About the job

Are you passionate about building and maintaining cutting-edge observability solutions to empower engineers with real-time insights into complex cloud-based systems?

As a Staff Site Reliability Engineer on the Platform Observability Team, your primary mission will be to help drive decisions and improve outcomes across our cloud-based SaaS platform using the insights that observability tooling provides. You will work closely with cross-functional teams to build developer-friendly observability solutions, allowing product and infrastructure engineers to build and ship with x-ray vision into their software's metrics, logs, traces and performance profiles.

Responsibilities

In this role, you’ll:

  • Design, implement, and maintain observability infrastructure and tooling, focusing on logging, distributed tracing, metrics, and continuous profiling.
  • Collaborate with software engineers to provide comprehensive instrumentation to capture relevant telemetry data for observability purposes.
  • Leverage open-source standards, such as OpenTelemetry, to build scalable and interoperable solutions.
  • Build and leverage FinOps tools and solutions to provide cost visibility to engineering stakeholders and teams. 
  • Develop data pipelines to handle high-cardinality data and enable interactive troubleshooting capabilities for engineers.
  • Enable effective telemetry correlation and allow engineers to understand the behavior of distributed systems.
  • Build affordable and engineer-friendly observability tooling that facilitates real-time root-cause analysis and reduces mean time to resolution (MTTR) for incidents.
  • Support and enhance our Datadog platform, including standard and custom metrics, at an extremely high scale.
  • Contribute to the development of the Observability platform's features and functionalities, continuously enhancing the user experience and ensuring self-service capabilities for other teams.
  • Collaborate with the OpenTelemetry community and contribute to open-source initiatives to foster a broader adoption of observability solutions.

Qualifications

Not all applicants will have skills that match a job description exactly. Twilio values diverse experiences in other industries, and we encourage everyone who meets the required qualifications to apply. While having “desired” qualifications makes for a strong candidate, we encourage applicants with alternative experiences to also apply. If your career is just starting or hasn't followed a traditional path, don't let that stop you from considering Twilio. We are always looking for people who will bring something new to the table!

Required:

  • 8+ years of experience writing production-grade code in a modern programming language.
  • Proven experience in designing, implementing, and maintaining observability solutions, preferably within a cloud-based SaaS environment.
  • Strong proficiency in programming languages such as Python, Go, or Java.
  • Familiarity with open-source observability tools and standards, including Prometheus, Grafana, OpenTelemetry, and others.
  • Knowledge of distributed tracing, log management, and metric aggregation techniques.
  • Proficiency in IaC, Kubernetes, and AWS concepts, best practices, and tools.
  • Willingness to participate in team on-call rotations.
  • Solid problem-solving skills, proactive attitude, and ability to work collaboratively in a dynamic team environment.

Desired:

  • Experience with context propagation and telemetry correlation to enable effective troubleshooting and monitoring of distributed systems.
  • Experience in building data pipelines.
  • Understanding of high-cardinality data challenges and strategies for handling complex telemetry data.
  • Proficiency in optimizing cloud infrastructure and compute costs through the implementation of cost observability software and workflows.

Location 

This role will be based remotely in India.

Approximately 10% travel is anticipated. 

What We Offer

In addition to competitive pay, Twilio offers many benefits, including generous time off, ample parental and wellness leave, healthcare, a retirement savings program, and much more. Offerings vary by location.

Twilio thinks big. Do you?

We like to solve problems, take initiative, pitch in when needed, and are always up for trying new things. That's why we seek out colleagues who embody our values — something we call Twilio Magic. Additionally, we empower employees to build positive change in their communities by supporting their volunteering and donation efforts.

So, if you're ready to unleash your full potential, do your best work, and be the best version of yourself, apply now!

If this role isn't what you're looking for, please consider other open positions.

Twilio is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Additionally, Twilio participates in the E-Verify program in certain locations, as required by law.

Twilio is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, please contact us at accommodation@twilio.com.