Senior Software Engineer, Workload Scheduling

Full Time
1 month ago

MongoDB’s mission is to empower innovators to create, transform, and disrupt industries by unleashing the power of software and data. We enable organizations of all sizes to easily build, scale, and run modern applications by helping them modernize legacy workloads, embrace innovation, and unleash AI. Our industry-leading developer data platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available in more than 115 regions across AWS, Google Cloud, and Microsoft Azure. Atlas allows customers to build and run applications anywhere—on premises, or across cloud providers. With offices worldwide and over 175,000 new developers signing up to use MongoDB every month, it’s no wonder that leading organizations, like Samsung and Toyota, trust MongoDB to build next-generation, AI-powered applications.

The Workload Scheduling team focuses on the operational resiliency of MongoDB clusters. We build the core server workload management infrastructure so that MongoDB can queue operations, shed load, and make decisions that stabilize performance when system resources become constrained. We are moving workload management policies out of the core server into a separate process for better modularity, and building the infrastructure to observe, measure, and tune workload mechanisms and policies to improve MongoDB’s performance and stability under load.

We are a new, small team looking for an experienced engineer who is excited about building something from the ground up. The role offers many opportunities to make a large impact on the operational resiliency story at MongoDB.

This role can be based remotely in the United States or Canada.

Candidate Profile
  • Minimum 5 years of experience in programming, debugging, and performance tuning distributed and/or highly concurrent low-level software systems. Experience in C++ and/or Rust is preferable.
  • Strong systems fundamentals, including multi-threaded programming and performance profiling.
  • Excellent verbal and written technical communication skills and a desire to collaborate with colleagues and mentor junior engineers and interns.
  • Passion for learning new things in the domains of software engineering, distributed systems, and performance.
  • Familiarity with distributed system concepts such as fault tolerance, consistency, and availability.
  • Experience with writing software to observe and mitigate overload scenarios for highly performant and concurrent systems.
Position Expectations
  • Write production-ready database code in C++, Rust, or possibly another low-level language
  • Write unit tests and integration tests in C++, Rust, JavaScript, and Python to demonstrate application correctness
  • Investigate the performance characteristics of the server and write performance regression tests
  • Build and optimize workload management primitives such as operation queueing/ticketing mechanisms as well as load-shedding
  • Implement and continuously improve server workload policies to schedule, prioritize, and make decisions about incoming and in-progress operations to maximize goodput while under system load
  • Build a process external to the core server that monitors and tunes policies of a MongoDB server
  • Improve the workload and system observability/diagnostics of the core server
  • Diagnose performance and correctness test failures, identify bugs and/or deficiencies in existing code, and fix them
  • Interview candidates for software engineering positions
  • Handle customer escalations
  • Collaborate with stakeholders and engineering teams across the company to jointly work on large initiatives
  • Mentor new and junior engineers on the team and help facilitate technical growth across the team
  • Lead projects by writing scope and technical design documents, and estimate and plan execution streams across contributors
Success Measures
  • In the first month, you will have understood the high level architecture of MongoDB and fixed a few bugs
  • In three months, you will have contributed to the development of a project slated for the next major release of MongoDB, and diagnosed and fixed a few customer or testing-reported issues
  • In six months, you will have taken on code review responsibilities and be involved in reviewing designs for new features
  • In twelve months, you will be leading the development of a new feature and helping to mentor new engineers on the team

To drive the personal growth and business impact of our employees, we’re committed to developing a supportive and enriching culture for everyone. From employee affinity groups, to fertility assistance and a generous parental leave policy, we value our employees’ wellbeing and want to support them along every step of their professional and personal journeys. Learn more about what it’s like to work at MongoDB, and help us make an impact on the world!

MongoDB is committed to providing any necessary accommodations for individuals with disabilities within our application and interview process. To request an accommodation due to a disability, please inform your recruiter.

MongoDB is an equal opportunities employer.