Sr. DevOps Engineer
About Us:
LogicMonitor is the leading fully automated, cloud-based infrastructure monitoring and observability platform for enterprise IT and managed service providers.
We love going to work and think you should too. We are customer obsessed, work as one agile team, and strive to be better every day while building trust. These are our core values. We foster a culture of performance and recognition, allowing us to transform growth as we enable our employees to do the best work of their careers.
This position is located in Pune. You'll be working in a major tech center of Pune, India. We call our offices Centers of Energy, because they’re where we accelerate work, spark creativity, and ignite our culture of connection and celebration. Our teams coordinate their time in Centers of Energy to reflect how they work best.
LogicMonitor is an equal opportunity employer. We deeply care about our employees' well-being, creating an environment where everyone feels valued and respected. We celebrate the diversity of our team and are committed to fostering a culture of inclusivity. When you join LogicMonitor, you're not just an employee to us, but a valued member of our community. Come as you are, be yourself, and let's grow together.
To learn more about life at LogicMonitor, check out our Careers Page.
What You'll Do:
This role takes a lead in the operational uptime and continued expansion of LogicMonitor's production TechOps infrastructure by serving as a technical architect and a facilitator of operational excellence. Responsibilities include designing and implementing new production deployments of SOA-based software across global physical and cloud data centers, as well as providing guidance in organizing, securing, and automating existing infrastructure and deployments. This position works with developers and provides feedback to drive operational performance improvements within the LM product platform and operations infrastructure.
Here's a closer look at this key role:
- Maintain uptime of LogicMonitor's SaaS-based service and drive technical/process enhancements to improve uptime
- Deploy production applications and drive improvements to the deployment process
- Design and deploy new application components
- Design and deploy new infrastructures and integrations
- Ensure security of the production environment
- Meet with prospective customers as needed
- Write code to automate various aspects of infrastructure maintenance and deployments
- Support development and work closely with developers to drive operational and architecture/design changes
- Own, manage, and execute multiple large and technically complex projects across teams
- Act as a strategic resource for the company with the ability to develop and deliver technical presentations for other departments, customers, and conferences
- Consistently lead by example in providing good documentation, thorough runbooks, attention to detail, and DDFD completeness in work.
- Own the new-hire process
- Align business objectives with the team's pursuit of technology improvements
- Own remediation actions for Service Disruptions and Outages
- Develop and maintain relationships with other groups within LogicMonitor to help ensure the forward trajectory of the company
- In conjunction with the Principal Engineer and the Director of TechOps, help monitor work output, quality, and timeliness; help maintain the balance between planned and unplanned activities; and help with task prioritization and assignment
- In a boots-on-the-ground role, make decisions, build consensus, escalate, and communicate as appropriate to facilitate the success of a project and the success and cohesiveness of the team and the company.
- Provide direct technical guidance to help team members achieve goals and improve their productivity.
- Participate in the recruitment and hiring of new engineers
What You'll Need:
- 5+ years working in senior data center operations positions
- Expert understanding of Linux system administration and 4+ years of hands-on experience
- Expert understanding of Amazon Web Services
- Thorough understanding of various software application stacks
- Well versed in security principles, both system and network
- Extensive experience in various application scaling methodologies, including (but not limited to) load balancers
- Extensive experience with configuration management tools such as Chef, Puppet, or Ansible
- Extensive experience with Java applications.
- Extensive experience with CI and build systems
- Significant experience with virtualization and container technologies (Docker, Kubernetes, etc.)
- Significant experience with relational databases (MySQL) and NoSQL databases (e.g., MongoDB) in both administration and querying
- Significant experience programming and scripting (Java/Ruby/Python/shell/Go).
- Significant experience with source code management tools (Git).
- High-level understanding of networking technologies (routing, switching, firewalls, iptables, etc.)
- Experience with Bamboo or other continuous integration build environments.
- Experience with package management systems (RPM, RubyGems, etc.)
- Experience with log management tools like Sumo Logic and Kibana, and monitoring tools like Grafana
- A strong understanding of SOA and High Availability systems
- Experience successfully training and mentoring technical personnel
- Excellent written and verbal communication skills with a track record of improving documentation and processes
Nice to Have:
- Expert level understanding of LogicMonitor's technology stack and processes
- A track record of providing leadership and inspiration to team members, maintaining a positive trajectory, and driving to constantly improve results
- A track record of working with minimal supervision and under pressure
- A track record of solving complex problems.
- A track record of providing strong leadership and continually raising the bar for the team
- A track record of leading by example with proactive communication, attention to detail, and excellent follow-through
- A track record of being security conscious in all decision making, and regularly improving the security posture of the company and product
- Seen as a 'go-to' person for any technical or logistical challenges while also being a force multiplier with the team
- Ability to handle complexity and ambiguity with ease
- Ability to grasp the big picture and distill it down to actionable tasks
- A desire not just to resolve problems, but to fully understand them. A tenacity and skill to quickly delve to the root of the problem, understand why it happened, and prevent it in the future.