Lead Distributed Systems Engineer (Machine Learning)
As Lead Distributed Systems Engineer (Machine Learning), you will play an instrumental role in advancing the services and systems that make up our Machine Learning Platform. You will apply your technical expertise and leadership skills to drive the development and deployment of high-quality, scalable AI solutions. Research and product releases in this field move quickly, so there is always something new to work on, which keeps the role engaging and challenging. By combining digital marketing and AI, you have the potential to make a significant impact on the way businesses interact with their customers.
This role reports to the Director of Engineering (AI).
Responsibilities
- Lead strategy, architecture, and development of scalable, highly available, and fault-tolerant services powering our Machine Learning Platform according to industry standards for performance, monitoring, orchestration, and testing.
- Design, deploy, and maintain Machine Learning services such as a feature store, an experimentation platform, model endpoint management/blue-green deployment, and vector databases.
- Collaborate with product and engineering stakeholders to empathetically understand and define requirements for complex systems, and develop complex projects from conception into rigorous technical specifications with a clear path to production.
- Minimize risk across platform/system deployments, features, and processes.
- Foster close collaboration with AI research teams to ensure that their innovations are effectively integrated into the product development process.
- Build systems that deliver measurable and tangible business value.
- Conduct design reviews and hold the team to technical and operational rigor.
- Serve as a thought leader, providing technical guidance and mentorship to junior developers and contributing to the overall technical excellence of the organization.
Requirements
- 7+ years of software engineering experience
- Bachelor’s degree in Computer Science, Mathematics, Engineering, or related technical field
- Experience architecting, building, and maintaining production distributed systems at scale
- Exemplary software engineering skills (design, unit testing, Git, code review, CI/CD)
- Proficiency with Python
- Experience with large-scale data processing frameworks (we use SQL, PySpark, Kafka)
- Experience with cloud computing platforms (we use Google Cloud Platform, or GCP)
- Experience with modern cloud technologies (we use Kubernetes, Terraform, etc.)
- Experience implementing performant microservices (we use gRPC)
- Proficiency in database management, including designing database schemas, crafting efficient queries, and performing basic DBA tasks, along with knowledge of common databases relevant to Python development
- Enjoys collaborating with AI researchers, product managers, and other engineering teams
- Strong technical leadership skills
- Able to inspire excellence and elevate the quality of engineering solutions
- A desire to always be learning and contributing to a collaborative environment
Studies have shown that women, communities of color, and historically underrepresented people are less likely to apply to jobs unless they meet every single qualification. We are committed to building a diverse and inclusive culture where all Inkers can thrive. If you’re excited about the role but don’t meet all of the qualifications listed above, we encourage you to apply. Our differences bring a breadth of knowledge and perspectives that makes us collectively stronger.
We welcome and employ people regardless of race, color, gender identity or expression, religion, genetic information, parental or pregnancy status, national origin, sexual orientation, age, citizenship, ethnicity, family or marital status, physical and mental ability, political affiliation, disability, veteran status, or other protected characteristics. We are proud to be an equal opportunity employer.