Research Scientist, Gemini Data - Mountain View

Full Time
Mountain View, CA, USA
9 hours ago
Snapshot

The Gemini Data team 

We are seeking a highly motivated and talented Research Scientist to join our team to work on Gemini data. Our goal is to organize the world's information and generate and curate high-quality tokens for Gemini core model training.

About Us

Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The Role

We are seeking a highly motivated and talented Research Scientist to join our team to work on Gemini data.

Key responsibilities

  • Research and develop methods to create diverse, high-quality synthetic data, scale its creation through collaborations, and evaluate and improve its effectiveness through ablations in pretraining, post-training, and distillation.
  • Research and develop methods to identify quality issues across the pretraining data corpus, devise new ways to fix them, and evaluate and improve the effectiveness of those fixes through ablations, from experiment to landing.
  • Stay up to date with the latest advancements in LLM research.

About You

In order to set you up for success as a Research Scientist at Google DeepMind, we look for the following skills and experience:

  • PhD in Computer Science or related field.
  • In-depth experience in, and familiarity with, LLM training and/or agents.
  • Strong publication record in top machine learning conferences (e.g., NeurIPS, CVPR, ICML, ICLR, ICCV, ECCV).
  • Solid skills and experience in software engineering for ML.

In addition, the following would be an advantage: 

  • Excellent communication and teamwork skills.
  • Passion for research and a desire to make a significant impact in the pretraining data area.
  • Expertise in one or more of the following areas of LLMs: synthetic data, data quality, scaling data.

The US base salary range for this full-time position is between $166,000 and $220,000 + bonus + equity + benefits. Your recruiter can share more about the specific salary range for your targeted location during the hiring process.

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.