Software Engineer (AI Platforms)
At SingleStore, we’re not just building a database company; we’re defining the future of data management. Going beyond multi-cloud, we offer customers flexible networking, storage, and compute options to meet their requirements. With a few clicks, our cloud service spins up production-grade infrastructure using the latest capabilities of major cloud providers and the industry-standard Kubernetes ecosystem.
As data systems evolve, the “database” is no longer just where queries run; it is becoming the foundation for real-time AI applications: retrieval, reasoning, agent workflows, and intelligent automation over enterprise data. That’s the direction we’re building toward.
About the AI Platform Team
We build the software platform that powers AI-native experiences across SingleStore: AI/ML capabilities, agent runtimes, tool integration, and the operational layer required to run these systems reliably at scale. Our work sits at the intersection of distributed systems, cloud infrastructure, and practical applied AI.
This team is not “pure research”; it is engineering-heavy. You’ll build product-grade systems that let customers safely and reliably use AI on their data.
Role Summary
We are looking for a Software Engineer to design and implement core platform capabilities for AI/ML and AI Agents in SingleStore Cloud. You’ll work on services that enable model/tool orchestration (e.g., MCP-style tool discovery and execution), agent workflows, retrieval pipelines (embeddings/vector search), evaluation/observability, and secure multi-tenant operations.
You will work primarily with Go, Python, Kubernetes, and cloud primitives, reaching for the right tool for the job, while applying solid AI/ML fundamentals to make sound engineering decisions.
Role and Responsibilities
- Build and evolve backend services that power AI features: agent orchestration, tool execution, retrieval/RAG pipelines, and model serving integrations.
- Design APIs and control plane workflows for AI platform components (tenant-aware, secure by default, observable).
- Implement MCP-style tool discovery and integration patterns so agents can safely call tools, connectors, and internal services (see the sketch after this list).
- Work closely with product managers, designers, customers, and partner engineering teams to deliver high quality AI experiences.
- Engineer for reliability and scale: latency, cost controls, rate limiting, fallbacks, rollouts, and incident response readiness.
- Establish best practices around evaluation: offline test sets, regression detection, prompt/model/version tracking, and quality gates.
- Contribute to secure-by-design AI: permissions, data access boundaries, prompt-injection defenses, and auditability.
- Mentor junior engineers and contribute to a welcoming, high-ownership team environment.
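For illustration only, here is a minimal sketch of the kind of MCP-style tool registry the responsibilities above describe. The type and method names (Tool, Registry, Execute) are hypothetical, not SingleStore or MCP APIs; a real platform would add per-tenant permission checks, rate limiting, and audit logging where noted.

```go
package tools

import (
	"context"
	"fmt"
)

// Tool is a hypothetical interface an agent-callable tool might satisfy:
// a name, an LLM-readable description for discovery, and an execution entry point.
type Tool interface {
	Name() string
	Description() string
	Call(ctx context.Context, args map[string]any) (any, error)
}

// Registry holds the tools an agent is allowed to discover and invoke.
type Registry struct {
	tools map[string]Tool
}

func NewRegistry() *Registry {
	return &Registry{tools: make(map[string]Tool)}
}

func (r *Registry) Register(t Tool) {
	r.tools[t.Name()] = t
}

// List supports discovery: the agent (or an MCP-style server) enumerates
// available tools and their descriptions before deciding what to call.
func (r *Registry) List() []string {
	names := make([]string, 0, len(r.tools))
	for name := range r.tools {
		names = append(names, name)
	}
	return names
}

// Execute dispatches a tool call by name. In a multi-tenant platform, this is
// where tenant-scoped authorization, rate limiting, and audit logging would live.
func (r *Registry) Execute(ctx context.Context, name string, args map[string]any) (any, error) {
	t, ok := r.tools[name]
	if !ok {
		return nil, fmt.Errorf("unknown tool %q", name)
	}
	return t.Call(ctx, args)
}
```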
Required Skills and Experience
This is a software engineering role that requires strong fundamentals plus working knowledge of AI/ML concepts.
- Strong software engineering skills with experience in distributed systems (Go, Python, or similar).
- Experience building cloud-native services: Kubernetes, containers, service-to-service APIs, CI/CD.
- 4+ years of experience working on a SaaS product or production platform.
- Solid understanding of AI/ML fundamentals (you don’t need to be a researcher, but you should understand the concepts well enough to build correct systems):
  - Supervised learning basics (training vs. inference, overfitting, evaluation metrics, classification, anomaly detection, forecasting, regression, etc.)
  - LLM basics (tokens, context windows, prompting, tool/function calling concepts)
  - Embeddings and vector search fundamentals (similarity, indexing tradeoffs, retrieval pitfalls); a brute-force sketch follows this list
- Strong debugging and problem-solving skills, including incident-style troubleshooting across services and infrastructure.
- Intellectual curiosity about investigating issues that impact product quality, reliability, latency, and business metrics.
- Passion for building robust, maintainable systems in a fast-paced, team-oriented environment.
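As a rough illustration of the embeddings and vector-search fundamentals listed above (not a SingleStore implementation detail), exact brute-force cosine-similarity retrieval looks like the sketch below; the function names are illustrative. Approximate indexes (e.g., HNSW or IVF) trade a little recall for much better latency at scale, which is the indexing tradeoff the bullet refers to.

```go
package retrieval

import "math"

// cosine returns the cosine similarity between two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearest does an exact, brute-force scan for the document vector most similar
// to the query embedding, returning its index and similarity score.
func nearest(query []float64, docs [][]float64) (bestIdx int, bestScore float64) {
	bestIdx, bestScore = -1, math.Inf(-1)
	for i, d := range docs {
		if s := cosine(query, d); s > bestScore {
			bestIdx, bestScore = i, s
		}
	}
	return bestIdx, bestScore
}
```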
Nice to Have
- Hands-on experience with AI agents and orchestration frameworks (tool calling, workflows, planners/executors).
- Practical experience with RAG systems, reranking, grounding, and evaluation strategies.
- Experience with model serving patterns (batch/online inference, caching, streaming responses).
- Knowledge of security considerations for AI systems (data isolation, RBAC, prompt injection threats, audit logs).
- Familiarity with vector databases or vector capabilities in modern data platforms.
- Experience with observability stacks (structured logging, metrics, tracing) and SLO-driven engineering.
Technologies you’ll work with: Go, Python, Kubernetes, cloud infrastructure, distributed systems, APIs, and modern AI tooling (LLM providers, embeddings, retrieval systems, eval/observability pipelines).