Research Scientist, Sound Team

DeepMind

Full Time

Mountain View, CA, USA

5 months ago

Snapshot

The Sound Team is a group of researchers working on audio understanding, editing, and generation. We are part of Frontier AI, the unit responsible for building and scaling the next generation of our core models. Our research includes, but is not limited to, sound understanding, joint audio-video generation, audio-visual editing, and long-context modeling. Work with us to create a future where speech, music, and general audio are central to AI understanding, generation, and modification.

About Us

Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The Role

Research Scientists at Google DeepMind lead our efforts in developing novel algorithmic architectures towards the end goal of solving and building Artificial General Intelligence. We seek individuals who are passionate about audio and about developing novel architectures that push the state of the art. In this role, you will make key contributions to advancing research in sound understanding, joint audio-video generation, and audio editing.

Key responsibilities:

Data: Unlocking new audio capabilities within the model, both in pre-training and post-training.
Models: Improving the quality of models for understanding and generation. This includes research to improve our tokenizers, develop better techniques for generation quality, and explore joint audio and visual representations.
Evals: Developing better evaluation methods (human raters, auto raters, automated metrics) to measure quality on open-ended tasks.

About You

In order to set you up for success as a Research Scientist at Google DeepMind, we look for the following skills and experience:

Minimum Qualifications

PhD in Computer Science or a related Machine Learning field.
Audio understanding and/or generation experience.
A proven track record of research and publications in some of the following areas: audio generation, video generation, LLMs.

In addition, the following would be an advantage:

Preferred Qualifications

Experience working with LLMs.
A real passion for Audio and Sound!

The US base salary range for this full-time position is between $141,000 and $202,000 + bonus + equity + benefits. Your recruiter can share more about the specific salary range for your targeted location during the hiring process.

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.