Research Scientist, World Modeling
At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunities regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.
Snapshot
Join an ambitious project to build generative models that simulate the physical world. We believe scaling pretraining on video and multimodal data is on the critical path to artificial general intelligence. World models will power numerous domains, such as visual reasoning and simulation, planning for embodied agents, and real-time interactive entertainment. The team will collaborate with and build on work from the Gemini, Veo and Genie teams, and tackle critical new problems to scale world models to the highest levels of compute.
About us
Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.
The Role
Key responsibilities:
Implement core infrastructure and conduct research to build generative models of the physical world. Solve essential problems to train world simulators at massive scale, develop metrics and scaling laws for physical intelligence, curate and annotate training data, enable real-time interactive generation, and study integration of world models with multimodal language models. Embrace the bitter lesson and seek simple methods that scale, with emphasis on strong systems and infrastructure.
Areas of focus:
- Systems for training multimodal transformers at massive scale.
- Infrastructure for large-scale video data pipelines and annotation.
- Inference optimization and distillation for real-time generation.
- Methods for native multimodal generation in language models.
- Methods for ultra-long-context transformers.
- Quantitative evals for physical accuracy and intelligence.
- Scaling law science for video pretraining.
We seek individuals who are passionate about world models and believe learning from data of the physical world is crucial to intelligence. We strive for simple methods that scale and look for candidates excited to improve models through infrastructure, data, evals, and compute.
In order to set you up for success as a Research Scientist at Google DeepMind, we look for the following skills and experience:
- Experience with large-scale transformer models and/or large-scale data pipelines.
- MSc or PhD in computer science or machine learning, or equivalent industry experience.
- Track record of releases, publications, and/or open source projects relating to video generation, world models, multimodal language models, or transformer architectures.
- Strong systems and engineering skills in deep learning frameworks like JAX or PyTorch.
In addition, the following would be an advantage:
- Experience building training codebases for large-scale video or multimodal transformers.
- Expertise in optimizing the efficiency of distributed training and/or inference systems.
The US base salary range for this full-time position is $136,000 to $245,000, plus bonus, equity, and benefits. Your recruiter can share more about the specific salary range for your targeted location during the hiring process.