Senior Software Engineer - MLOps/Infra

Full Time
2 months ago

Location: Open to 100% remote within the United States or Canada

As a Sr. Software Engineer - MLOps/Infra, you will contribute to and lead the design and development of an innovative MLOps/LLMOps platform. This team collaborates closely with other Product Development teams and works through the entire feature lifecycle, including ideation, dataset construction, experimental validation, prototyping, production implementation, deployment, and operations.

Responsibilities:

  • Identifying and validating opportunities for the application of AI/ML or data-driven techniques
  • Driving technical delivery through the full feature lifecycle, from idea to production and operations
  • Collaborating within and beyond the team to identify problems and deliver solutions 
  • Assessing requirements and approaches for large-scale data and AI/ML platform components
  • Collaborating with UX/UI teammates on the usability of product features
  • Owning the uptime and reliability of delivered services and capabilities, including on-call rotation
  • Developing supporting tooling, automation, and microservices to accelerate the team

Requirements:

  • B.S. / M.S. / Ph.D. in Computer Science or related disciplines
  • 6+ years of industry experience with a proven track record of ownership and delivery
  • Experience as an SME/tech lead on MLOps/LLMOps technologies such as CI/CD on cloud-native microservice platforms; ML model training, evaluation, and serving; and LLM prompt engineering, fine-tuning, and evaluation
  • Excellent collaboration and communication skills
  • Experience with software engineering of production-grade services in cloud environments
  • Experience formulating use cases as ML problems and putting ML models into production
  • Knowledge of and/or curiosity to learn about specific Sumo Logic customer problem domains
  • Operational excellence orientation: SLIs/SLOs, monitoring and troubleshooting, on-call rotations
  • Solid grounding in core ML concepts, basic statistics, and the judicious use of abstraction

Desirable:

  • Cloud-based application and infrastructure deployment and management
  • Common ML libraries (e.g., scikit-learn, PyTorch) and components (e.g., Airflow, MLflow)
  • Relevant cloud provider services (e.g., AWS Bedrock, AWS SageMaker)
  • LLMOps-related experience; enterprise-grade copilot or AI agent deployment and monitoring experience

About Us

Sumo Logic, Inc. empowers the people who power modern, digital business. Sumo Logic enables customers to deliver reliable and secure cloud-native applications through its Sumo Logic SaaS Analytics Log Platform, which helps practitioners and developers ensure application reliability, secure and protect against modern security threats, and gain insights into their cloud infrastructures. Customers worldwide rely on Sumo Logic to get powerful real-time analytics and insights across observability and security solutions for their cloud-native applications. For more information, visit www.sumologic.com.

Sumo Logic Privacy Policy. Employees will be responsible for complying with applicable federal privacy laws and regulations, as well as organizational policies related to data protection.

The expected annual compensation range for this role is $160k-$180k + 10% bonus + equity. Compensation varies based on a variety of factors, which include (but aren’t limited to) role level, skills and competencies, qualifications, knowledge, location, and experience. In addition to base pay, certain roles are eligible to participate in our bonus or commission plans, as well as our benefits offerings and equity awards.

Other details

  • Health, dental, and vision insurance
  • 401k and Life Insurance options
  • Unlimited PTO with 15+ days of recognized holidays
  • Quarterly Wellness days
  • 100% remote with the option to be in the office if you want (Bay Area, Austin, Denver, NYC)
  • 3 months of paid parental leave
