Google’s latest robotics model, RT-2, combines artificial intelligence with physical robots. In a recent demonstration, a robot picked up a specific object from a table based on a given instruction. The achievement was made possible by integrating large language models, similar to those powering chatbots like ChatGPT and Bard, into the robot’s control system. This approach represents a significant breakthrough in robotics because it allows robots to acquire new skills on their own rather than being programmed for each task individually.
Traditionally, robots were trained with specific instructions for each task, a slow and labor-intensive process. By connecting language models trained on vast amounts of internet text to robots, Google’s researchers significantly advanced the robots’ ability to reason and solve problems.
The new RT-2 model is called a “vision-language-action” model: it not only perceives and analyzes the robot’s surroundings but also outputs the movements the robot should make. By tokenizing the robot’s movements and incorporating them into the language model’s training data, RT-2 can learn to perform actions such as picking up objects or following complex instructions.
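The tokenization idea can be illustrated with a minimal sketch. The exact scheme used by RT-2 is not specified here; this is a hedged, hypothetical version assuming each dimension of a continuous action vector (for example, gripper position, rotation, and open/close state) is discretized into a fixed number of bins, so an action becomes a short sequence of discrete tokens the language model can predict like words:

```python
import numpy as np

def tokenize_action(action, low=-1.0, high=1.0, n_bins=256):
    """Map each continuous action dimension to one of n_bins discrete tokens.

    The bin count and [low, high] range are illustrative assumptions,
    not the exact values used by RT-2.
    """
    clipped = np.clip(action, low, high)
    bins = np.round((clipped - low) / (high - low) * (n_bins - 1)).astype(int)
    return bins.tolist()

def detokenize_action(tokens, low=-1.0, high=1.0, n_bins=256):
    """Invert the mapping: recover approximate continuous values from tokens."""
    return [low + t / (n_bins - 1) * (high - low) for t in tokens]

# Hypothetical 7-dimensional action: 3 position deltas, 3 rotation deltas,
# and a gripper open/close value.
action = np.array([0.1, -0.25, 0.5, 0.0, 0.0, 0.3, 1.0])
tokens = tokenize_action(action)       # e.g. a list of 7 integers in [0, 255]
recovered = detokenize_action(tokens)  # close to the original action
```

Because the tokens live in the same discrete space as text, action sequences can be mixed into the model’s training data alongside images and instructions, which is what lets a single model map “pick up the apple” to a concrete motion.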
During a demonstration, RT-2 showcased impressive capabilities, such as accurately following multilingual instructions and making abstract connections between related concepts. While the robot still has limitations and makes occasional errors, its potential applications are promising. Google believes that robots equipped with language models could eventually be deployed in settings ranging from warehouses and medical facilities to household assistance tasks like folding laundry or tidying up.
While concerns about the risks of A.I. language models powering robots are acknowledged, Google says RT-2 has safety features in place to prevent harmful actions. Although integrating A.I. language models with robotics introduces new challenges, researchers are excited about the possibilities it offers for advancing the field.