Two new robot models from Google DeepMind can think before taking action.
Google DeepMind has unveiled two robot models that work together to make robots think before they act. Gemini Robotics 1.5 (an action model) and Gemini Robotics-ER 1.5 (an embodied reasoning model) bring generative AI's reasoning capabilities into the physical world. The goal is to make robots far more versatile than they are today.
Two Models: One That Thinks, One That Acts
DeepMind built upon the Gemini foundation models and optimized them specifically for robotics. The approach consists of two separate but collaborating components:
- Gemini Robotics-ER 1.5 is a vision-language model that reasons about a task. It processes visual input and textual instructions, can consult tools (such as web searches), and generates a step-by-step plan in natural language: what needs to be done and why.
- Gemini Robotics 1.5 is a vision-language-action (VLA) model that converts these steps into actual robot actions such as gripping, moving, and repositioning itself. The model also reasons briefly about practicalities to avoid errors or awkward movements.
This separation reflects how humans often work: first planning, then executing.
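The plan-then-execute split can be sketched in code. This is an illustrative stand-in only, not DeepMind's API: all class and method names below are hypothetical, and the canned plan replaces what would really be vision-and-language reasoning.

```python
from dataclasses import dataclass


@dataclass
class Step:
    description: str  # one natural-language step produced by the planner


class ReasoningModel:
    """Stand-in for an ER-style planner: turns a task into ordered steps."""

    def plan(self, task: str) -> list[Step]:
        # A real embodied-reasoning model would use visual input, text
        # instructions, and tools here; this stub returns a fixed plan.
        return [
            Step("locate the target object"),
            Step("grasp the object"),
            Step("move it to the goal position"),
        ]


class ActionModel:
    """Stand-in for a VLA-style executor: turns a step into robot motion."""

    def execute(self, step: Step) -> str:
        # A real VLA model would emit joint and gripper trajectories;
        # here we simply report what was executed.
        return f"executed: {step.description}"


def run_task(task: str) -> list[str]:
    # The planner produces the full plan first; the actor then works
    # through it step by step -- the "think before acting" loop.
    planner, actor = ReasoningModel(), ActionModel()
    return [actor.execute(step) for step in planner.plan(task)]
```

The key design point is the interface between the two: the planner communicates only in natural-language steps, so either side can be swapped out independently.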
Learning across Different Models
According to DeepMind, a major gain is the ability to transfer skills across different “embodiments.” A model that learns to work with two arms can be applied to a humanoid robot with more complex hands, such as Apollo, without extensive retraining. This eliminates the need to build a completely new model for each robot platform.
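One common way to picture such cross-embodiment transfer is a thin adapter layer per robot body, with the learned skill shared above it. The sketch below is an assumption about the general pattern, not DeepMind's actual mechanism; the class names are invented for illustration.

```python
from abc import ABC, abstractmethod


class Embodiment(ABC):
    """Adapter that maps an abstract skill onto one robot's hardware."""

    @abstractmethod
    def apply(self, skill: str) -> str: ...


class TwoArmStation(Embodiment):
    def apply(self, skill: str) -> str:
        return f"two-arm station performs '{skill}' with parallel grippers"


class HumanoidRobot(Embodiment):
    def apply(self, skill: str) -> str:
        return f"humanoid performs '{skill}' with dexterous hands"


def transfer_skill(skill: str, robots: list[Embodiment]) -> list[str]:
    # The same high-level skill is reused across bodies; only the thin
    # per-robot adapter differs, so no per-platform model is rebuilt.
    return [robot.apply(skill) for robot in robots]
```

The point of the pattern is that the skill string (or, in the real system, the learned policy) never changes; only the mapping to each robot's actuators does.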
What Can They Do?
DeepMind cites several examples such as sorting laundry: the ER model breaks the task down into steps (identify whites versus colors, pick up a garment, check its material, place it in the correct bin) and the action model physically executes those steps. More realistic use cases include complex assembly tasks, laboratory automation, and warehouse work where environments vary.
According to DeepMind, this approach opens the door to more general, rapidly deployable robots.
Limitations and Availability
However, DeepMind emphasizes that we’re not yet ready for a household robot that independently does the laundry. Gemini Robotics 1.5 (the action model) is currently limited to trusted testers; the ER model, with “simulated reasoning,” is now available through Google AI Studio for developers who want to generate robot instructions and experiment with embodied workflows.
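For developers experimenting via Google AI Studio, a request to the ER model could look roughly like the sketch below, using the official `google-genai` Python SDK. The model ID is an assumption on my part; check AI Studio for the exact name available to your account, and note that `request_plan` requires an API key and network access.

```python
import os

# Assumed model ID -- verify the exact name in Google AI Studio.
MODEL_ID = "gemini-robotics-er-1.5-preview"


def build_planning_prompt(task: str) -> str:
    """Compose a task-decomposition prompt for the reasoning model."""
    return (
        f"Break the robot task '{task}' into short, numbered steps "
        "that a robot could execute one at a time."
    )


def request_plan(task: str) -> str:
    # Imported here so the sketch stays importable without the SDK;
    # requires `pip install google-genai` and a GEMINI_API_KEY env var.
    from google import genai

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=MODEL_ID,
        contents=build_planning_prompt(task),
    )
    return response.text
```

Usage would be `print(request_plan("sort the laundry"))`, which should return a numbered, natural-language plan the action model (or a human reviewer) could step through.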