Sr Site Reliability Engineer

Full Time
Chennai, Tamil Nadu, India
5 months ago

Over the past 15 years, we have seen a shift in the focus of business models across every industry – from selling physical products via one-time transactions to monetizing services via ongoing customer (aka subscriber) relationships. This is the “Subscription Economy,” a phrase coined by our CEO, Tien Tzuo, who even wrote the book on it: Subscribed.

Companies have realized that the path to growth going forward is to establish direct, digital relationships with their customers, and monetize these relationships through an ever-growing set of digital services.

Our vision is simple: we call it “The World Subscribed.” It’s the idea that one day every company will join the Subscription Economy, a $1.5 trillion opportunity by 2025, according to UBS.

Our mission: to power the world’s best companies to win in the Subscription Economy.

The Team & Role

Zuora’s TechOps teams are responsible for cloud infrastructure, performance and uptime monitoring, internal and external shared services, infrastructure services and more for Zuora’s customer-facing SaaS products and platforms. Our technologists sit across the US, Beijing, India and Costa Rica, using a follow-the-sun model to provide 24x7x365 coverage for critical functions. They partner closely with our Engineering, Customer Support, Security, Global Services and Sales teams on a daily basis to keep our customers front and center.

In this role you’ll get to

  • Ensure Service Availability, Scalability, Security & Capacity
  • Run our global infrastructure using Ansible, Terraform, CI/CD & Kubernetes in a multi-cloud platform
  • Automate, and continue to push for new levels of efficiency
  • Drive high reliability through proactive, preventative enablement
  • Architect and enable preventative, proactive solutions and infrastructure services
  • Be on an on-call (PagerDuty) rotation to respond to incidents that impact the availability of Zuora’s products and services, and provide leadership and drive restoration outcomes for service engineers during customer incidents.
  • Drive and coordinate bridges for critical-impact issues, collaborating through to root cause and restoration.
  • Use your on-call shift to prevent incidents from ever happening.
  • Run our infrastructure with Puppet, Ansible, Terraform, Git, CI/CD, Jenkins, ECS, and Kubernetes.
  • Incorporate feedback from incidents back into monitoring that alerts on symptoms rather than on outages.
  • Work with engineering teams on maintaining and improving runbooks, including documenting cases where runbooks are missing and needed.
  • Support and maintain core infrastructure that enables Zuora to scale to support all of our customers’ needs.
  • Help debug production issues across services and levels of the stack.

Job Involves

  • Take every task that requires a person to execute it, strip it down & automate it
  • Tackle capacity planning head-on, shaping the multi-cloud world
  • Resolution of complex and critical issues, participation in major incidents as an SME
  • Configure monitoring and alerting to ensure integrity, reliability & the performance that skeptics thought couldn’t be done
  • Act as a service expert, ensuring that expertise is reflected in SOPs and that documentation is shared
  • End-to-end tuning needs, optimizing resource utilization, as load patterns fluctuate
  • Instrumentation and metrics that clearly describe the service behaviors
  • Consult on new capabilities ensuring a scalable infrastructure
  • Resiliency and recoverability, ensuring that backup / restore and disaster recovery capabilities are implemented, tested and maintained

Who we’re looking for

  • 10+ years of overall experience
  • You bring your excellent communication, problem solving, critical thinking & passion to the table each day to disrupt, make an impact & rewrite the rulebook.
  • Operating System:
    • Strong knowledge of the Linux operating system.
    • User and File System Management in Linux/Unix
    • Troubleshooting at OS level.
  • Oracle:
    • Strong knowledge of Oracle architecture and administration activities.
    • Backup and recovery process (logical and RMAN)
    • Strong knowledge of Database recovery process.
    • Performance and troubleshooting expertise in DB related issues.
  • AWS:
    • Good exposure to core AWS functionality.
    • EC2, volume management, IAM, S3, AWS CLI, load balancers, security groups, VPCs, subnets.
    • CloudWatch, ECS container services.
    • SSL certificate management.
    • Troubleshooting knowledge of AWS resources.
  • DevOps:
    • Good knowledge of Jenkins.
    • Knowledge of shell scripting or Python.
    • Exposure to Ansible and Terraform modules.
  • Experience in infrastructure services (DNS, Mail Relays, NTP, CDN, SSL Certificates)
  • Experience running and leading command center bridges
  • Experience driving incidents to isolation and aligning them with the corresponding service

Benefits*

  • Competitive compensation, company equity, and retirement programs
  • Medical Insurance
  • Paid holidays, “wellness” days, and a company-wide winter break
  • Generous, flexible time off 
  • 6 months fully paid parental leave
  • Learning & Development stipend
  • Opportunities to volunteer and give back, including charitable donation match
  • Free resources and support for your mental wellbeing

*Specific benefits offerings may vary by country

About Zuora 

As the Subscription Economy leader, Zuora empowers today’s innovative companies to nurture and monetize direct, digital relationships. Our award-winning multi-product portfolio now includes Zuora Revenue, Zuora Collect and Zuora Central Platform. More recently, we’ve added subscription experience platform Zephr to our family, further expanding our capabilities to serve as an intelligent hub that monetizes the complete quote to cash and revenue recognition process at scale.

Through our combination of technology and expertise, Zuora (NYSE: ZUO) helps more than 1,000 companies around the world, including BMC Software, Box, Caterpillar, General Motors, Penske Media Corporation, Schneider Electric, Siemens and Zoom, nurture and monetize direct, digital customer relationships. Headquartered in Silicon Valley, Zuora operates offices around the world in the U.S., EMEA, APAC and LATAM.

“ZEO” Culture

At Zuora, we’re building an inclusive, high-performance culture that every ZEO wants to subscribe to. We want ZEOs at every level to feel valued, included, and inspired to innovate, connect and collaborate authentically as we pioneer the Subscription Economy. You’ll be empowered to think like an owner and take initiative. Together with your team, you’ll push each other to the next level and help transform business models everywhere.

To learn more visit www.zuora.com

Zuora is proud to be an Equal Employment Opportunity Employer.

Think, be and do you! At Zuora, different perspectives, experiences and contributions matter. Everyone counts. Zuora is proud to be an Equal Opportunity Employer committed to creating an inclusive environment for all.

Zuora does not discriminate on the basis of, and considers individuals seeking employment with Zuora without regard to, race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics.

We encourage candidates from all backgrounds to apply. Applicants in need of special assistance or accommodation during the interview process or in accessing our website may contact us by sending an email to assistance@zuora.com.