Google DeepMind’s robotics team recently introduced three systems, AutoRT, SARA-RT, and RT-Trajectory, designed to improve robots’ decision-making, safety, and task execution in human environments.
AutoRT: Revolutionizing Decision-making and Safety
AutoRT, the flagship system, is a data-collection system that combines a Visual Language Model (VLM) with a Large Language Model (LLM). The VLM analyzes the surroundings and identifies objects, while the LLM proposes creative tasks for the robot to carry out. The LLM is also given a “Robot Constitution,” a set of safety-focused prompts that instruct it to avoid tasks involving people, animals, sharp objects, or electrical appliances. As further safeguards, the robots stop automatically when the force on their joints exceeds a set threshold, and a physical emergency switch lets a human intervene.
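The two safety layers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: AutoRT applies its constitution through LLM prompting, not keyword matching, and every name here (`FORBIDDEN`, `violated_category`, `filter_tasks`, `should_halt`), the keyword lists, and the threshold value are hypothetical.

```python
import re
from typing import List, Optional, Sequence

# Hypothetical keyword lists for the four categories named in the text.
# A real system would rely on the LLM's judgment, not a fixed word list.
FORBIDDEN = {
    "people": {"person", "people", "human", "colleague"},
    "animals": {"animal", "dog", "cat"},
    "sharp objects": {"knife", "scissors", "blade"},
    "electrical appliances": {"toaster", "kettle", "outlet"},
}


def violated_category(task: str) -> Optional[str]:
    """Return the first forbidden category a task mentions, or None."""
    # Match whole words so that "locate" does not trigger on "cat".
    words = set(re.findall(r"[a-z]+", task.lower()))
    for category, keywords in FORBIDDEN.items():
        if words & keywords:
            return category
    return None


def filter_tasks(proposed: Sequence[str]) -> List[str]:
    """Keep only the proposed tasks that touch no forbidden category."""
    return [task for task in proposed if violated_category(task) is None]


def should_halt(joint_forces: Sequence[float], threshold: float) -> bool:
    """Signal a stop when any joint force magnitude exceeds the threshold."""
    return any(abs(force) > threshold for force in joint_forces)
```

For example, `filter_tasks(["Pick up the cup", "Pet the dog"])` keeps only the first task, and `should_halt([1.0, -12.5, 3.0], threshold=10.0)` signals a stop because one joint exceeds the limit.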
Deployment and Performance
Over seven months, Google deployed 53 AutoRT robots across four office buildings and ran more than 77,000 trials. Some robots were operated remotely by humans, while others followed scripted routines or ran autonomously using the Robotic Transformer AI model. Each robot currently consists of a manipulator arm on a mobile base, with integrated cameras for assessing its surroundings.
Optimizing Robotic Operation
The second innovation, SARA-RT (Self-Adaptive Robust Attention for Robotics Transformers), aims to make the RT-2 model more efficient. The model’s attention mechanism scales quadratically with input size, so doubling the input quadruples the computational resources required. A novel “up-training” fine-tuning method converts this quadratic cost into a linear one, enabling faster model operation without compromising quality, notes NIX Solutions.
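The quadrupling mentioned above follows from how standard attention scales. The sketch below compares rough operation counts for quadratic attention and for a linear-attention variant of the kind DeepMind describes up-training as producing. The function names and the simplified cost formulas are assumptions for illustration, not SARA-RT’s actual implementation.

```python
def quadratic_attention_cost(n_tokens: int, dim: int) -> int:
    """Rough multiply-add count for standard softmax attention.

    Computing the n x n score matrix costs n*n*dim, and applying it to
    the values costs another n*n*dim.
    """
    return 2 * n_tokens * n_tokens * dim


def linear_attention_cost(n_tokens: int, dim: int) -> int:
    """Rough multiply-add count for a linear-attention variant.

    Assumes the keys and values are first combined into a dim x dim
    summary (n*dim*dim), which each query then reads (n*dim*dim).
    """
    return 2 * n_tokens * dim * dim


# Doubling the input length from 256 to 512 tokens at dim = 64.
quadratic_growth = quadratic_attention_cost(512, 64) / quadratic_attention_cost(256, 64)
linear_growth = linear_attention_cost(512, 64) / linear_attention_cost(256, 64)
```

Under these assumptions, `quadratic_growth` is 4.0 while `linear_growth` is 2.0: doubling the input quadruples the cost of standard attention but only doubles the cost of the linear variant.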
Facilitating Task Training
Google DeepMind’s engineers also developed RT-Trajectory, an AI model that simplifies robot task training. Operators can demonstrate a task directly, and RT-Trajectory analyzes the human-specified trajectory and translates it into robot actions, significantly streamlining the training process.
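To make the idea of a human-specified trajectory concrete, here is a minimal sketch that resamples a rough 2D path into evenly spaced waypoints a controller could follow. It is not RT-Trajectory’s method. The function name `resample_trajectory` and the representation of a trajectory as a list of `(x, y)` points are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def resample_trajectory(points: Sequence[Point], n: int) -> List[Point]:
    """Resample a 2D polyline into n waypoints evenly spaced by arc length."""
    # Cumulative arc length at each input point.
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]
    if total == 0 or n < 2:
        # Degenerate path or too few waypoints: repeat the start point.
        return [points[0]] * n

    waypoints = []
    seg = 0  # index of the segment that contains the current target
    for i in range(n):
        target = total * i / (n - 1)
        while seg < len(points) - 2 and dists[seg + 1] < target:
            seg += 1
        span = dists[seg + 1] - dists[seg]
        t = 0.0 if span == 0 else (target - dists[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        # Linear interpolation within the segment.
        waypoints.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return waypoints
```

For example, `resample_trajectory([(0, 0), (4, 0)], 5)` yields five waypoints spaced one unit apart along the line from `(0, 0)` to `(4, 0)`.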
In summary, these innovations mark significant strides in enhancing robot adaptability, safety, and performance within dynamic human-centric environments.