
Large language model-driven natural language interaction control framework for single-operator bimanual teleoperation

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Article number: 1621033
Journal publication date: 17/07/2025
Journal: Frontiers in Robotics and AI
Volume: 12
Publication status: Published
Original language: English

Abstract

Bimanual teleoperation imposes heavy cognitive and coordination demands on a single human operator tasked with simultaneously controlling two robotic arms. Although assigning each arm to a separate operator can distribute the workload, it often leads to ambiguities in decision authority and degrades overall efficiency. To overcome these challenges, we propose a novel bimanual teleoperation large language model assistant (BTLA) framework, an intelligent co-pilot that augments a single operator’s motor control capabilities. In particular, BTLA enables operators to directly control one robotic arm through conventional teleoperation while directing a second assistive arm via simple voice commands, thereby commanding two robotic arms simultaneously. By integrating the GPT-3.5-turbo model, BTLA interprets contextual voice instructions and autonomously selects among six predefined manipulation skills, including real-time mirroring, trajectory following, and autonomous object grasping. Experimental evaluations in bimanual object manipulation tasks demonstrate that BTLA increased task coverage by 76.1% and success rate by 240.8% relative to solo teleoperation, and outperformed dyadic control with a 19.4% gain in coverage and a 69.9% gain in success rate. Furthermore, NASA Task Load Index (NASA-TLX) assessments revealed a 38–52% reduction in operator mental workload, and 85% of participants rated the voice-based interaction as “natural” and “highly effective.”
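
As a rough illustration of the skill-selection step the abstract describes, the Python sketch below maps a transcribed voice command to one of the predefined manipulation skills via a GPT-3.5-turbo call. This is a minimal sketch, not the paper's implementation: the abstract names only three of the six skills (the other three entries here are hypothetical placeholders), and the prompt wording, `select_skill` function, and fallback behavior are all assumptions.

```python
# Minimal sketch of LLM-driven skill selection, assuming an OpenAI-style
# chat API (pip install openai) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

# The paper defines six skills but the abstract names only three;
# the last three entries are hypothetical placeholders.
SKILLS = [
    "real_time_mirroring",
    "trajectory_following",
    "autonomous_grasping",
    "hold_position",       # assumed placeholder
    "object_handover",     # assumed placeholder
    "release_object",      # assumed placeholder
]

client = OpenAI()

def select_skill(voice_command: str) -> str:
    """Map a transcribed voice command to exactly one predefined skill."""
    prompt = (
        "You assist a teleoperator by controlling a second robot arm. "
        f"Choose exactly one skill from this list: {', '.join(SKILLS)}.\n"
        f"Operator command: {voice_command!r}\n"
        "Answer with the skill name only."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic selection for repeatability
    )
    choice = response.choices[0].message.content.strip()
    # Fall back to a safe default if the model answers off-list.
    return choice if choice in SKILLS else "hold_position"

# Example: select_skill("mirror my movements") -> "real_time_mirroring"
```

In a framework like BTLA, the returned skill name would then dispatch to a pre-programmed controller for the assistive arm, keeping the LLM in the discrete decision loop rather than in low-level motor control.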