Interact with robots and other devices by gesturing, using wearable muscle and motion sensors
From spaceships to Roombas, robots have the potential to be valuable assistants and to extend our capabilities. Yet it can still be hard to tell them what to do: we would like to interact with a robot as naturally as with another person, but pre-specified voice or touchscreen commands and elaborate sensor setups are often clumsy. Allowing robots to understand nonverbal cues such as gestures, with minimal setup or calibration, can be an important step towards more pervasive human-robot collaboration.
This system, dubbed Conduct-a-Bot, aims to take a step towards these goals by detecting gestures from wearable muscle and motion sensors. A user can make gestures to remotely control a robot by wearing small sensors on their biceps, triceps, and forearm. The current system detects 8 predefined navigational gestures without requiring offline calibration or training data – a new user can simply put on the sensors and start gesturing to remotely pilot a drone.
For more information, check out the virtual presentation below.
By using a small number of wearable sensors and plug-and-play algorithms, the system aims to start reducing the barrier to casual users interacting with robots. It builds an expandable vocabulary for communicating with a robot assistant or other electronic devices in a more natural way. We look forward to extending this vocabulary to additional scenarios and to evaluating it with more users and robots.
Photos by Joseph DelPreto, MIT CSAIL
Sensors: Wearable EMG and IMU
Gestures are detected using wearable muscle and motion sensors. Muscle sensors, called electromyography (EMG) sensors, are worn on the biceps and triceps to detect when the upper arm muscles are tensed. A wireless device with EMG and motion sensors is also worn on the forearm.
In the current experiments, MyoWare processing boards with Covidien electrodes and an NI data acquisition device were used to stream biceps and triceps activity. The Myo Gesture Control Armband was used to monitor forearm activity. Alternative sensors and acquisition devices could be substituted in the future.
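As a rough illustration of how raw EMG can be turned into an activation level, the sketch below rectifies and smooths a single channel to form an envelope that rises when the muscle is tensed. The sampling rate, window length, and simulated burst are illustrative assumptions rather than values used in this project.

```python
import numpy as np

def emg_envelope(samples, fs=1000, window_s=0.05):
    """Rectify raw EMG and smooth it with a moving average.

    samples: 1-D array of raw EMG voltages from one channel
    fs: sampling rate in Hz (assumed value, not the project's setting)
    window_s: smoothing window length in seconds
    """
    rectified = np.abs(samples - np.mean(samples))  # remove DC offset, then rectify
    win = max(1, int(fs * window_s))
    kernel = np.ones(win) / win
    return np.convolve(rectified, kernel, mode="same")

# Example: a simulated burst of biceps activity embedded in baseline noise.
rng = np.random.default_rng(0)
signal = rng.normal(0, 0.02, 2000)
signal[800:1200] += rng.normal(0, 0.3, 400)         # the "tensed" segment
envelope = emg_envelope(signal)
print("peak envelope value:", envelope.max())
```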
EMG sensors monitor biceps, triceps, and forearm muscles. The forearm device also includes an IMU.
Photos by Joseph DelPreto, MIT CSAIL
Gesture Detection: Classification Pipelines
Machine learning pipelines process the muscle and motion signals to classify 8 possible gestures at any time. For most of the gestures, unsupervised classifiers process the muscle and motion data to learn how to separate gestures from other motions in real time; Gaussian Mixture Models (GMMs) are continuously updated to cluster the streaming data and create adaptive thresholds. This lets the system calibrate itself to each person’s signals while they’re making gestures that control the robot. Since it doesn’t need any calibration data ahead of time, this can help users start interacting with the robot quickly.
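The general idea can be sketched with a two-component Gaussian Mixture Model that is periodically refit on a rolling buffer of recent values, with the midpoint between the two component means serving as the boundary between rest and gesture activity. This is a simplified stand-in built on scikit-learn rather than the project's actual online clustering algorithm, and the buffer size, refit interval, and separation heuristic are arbitrary choices.

```python
import numpy as np
from collections import deque
from sklearn.mixture import GaussianMixture

class AdaptiveThreshold:
    """Adaptive rest-vs-gesture boundary learned from unlabeled streaming data.

    A two-component GMM is refit on a rolling buffer of recent values; once the
    two clusters are well separated, the midpoint between their means is used
    as the decision threshold. Simplified illustration, not the paper's method.
    """
    def __init__(self, buffer_len=2000, refit_every=100):
        self.buffer = deque(maxlen=buffer_len)
        self.refit_every = refit_every
        self.count = 0
        self.threshold = None

    def update(self, value):
        """Add one sample; return True if it exceeds the current threshold."""
        self.buffer.append(value)
        self.count += 1
        if self.count % self.refit_every == 0 and len(self.buffer) > 50:
            data = np.array(self.buffer).reshape(-1, 1)
            gmm = GaussianMixture(n_components=2, n_init=2).fit(data)
            lo, hi = sorted(gmm.means_.ravel())
            spread = np.sqrt(gmm.covariances_).ravel().mean()
            if hi - lo > 2 * spread:              # only trust well-separated clusters
                self.threshold = (lo + hi) / 2.0  # adaptive decision boundary
        return self.threshold is not None and value > self.threshold

# Example: stream a synthetic envelope with a burst of activity at the end.
rng = np.random.default_rng(0)
stream = np.concatenate([rng.normal(0.05, 0.01, 1500), rng.normal(0.6, 0.05, 500)])
detector = AdaptiveThreshold()
flags = [detector.update(v) for v in stream]
print("samples flagged as gesture activity:", sum(flags))
```

Each incoming sample both updates the model and is classified against the latest boundary, which is what lets the interface adapt to a new user's signal levels on the fly.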
In parallel with these classification pipelines, a neural network predicts wrist flexion or extension from forearm muscle signals. The network is trained on data from previous users instead of requiring new training data from each user.
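A minimal sketch of this train-once, deploy-to-anyone idea is shown below using a small fully connected classifier on flattened windows of forearm EMG. The channel count, window length, label set, and network size are illustrative assumptions, and the synthetic arrays stand in for recordings gathered from previous users.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical setup: each example is a 50-sample window from 8 forearm EMG
# channels, flattened to a 400-dimensional vector; labels are 0 = neutral,
# 1 = wrist flexion, 2 = wrist extension. Real training data would come from
# previous users, so a new user needs no calibration session.
rng = np.random.default_rng(1)
X_prev_users = rng.normal(size=(600, 8 * 50))
y_prev_users = rng.integers(0, 3, size=600)

# Train once, offline, on the prior users' data.
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300)
clf.fit(X_prev_users, y_prev_users)

# At run time, an incoming window from a new user is classified directly.
new_window = rng.normal(size=(1, 8 * 50))
print(clf.predict(new_window))   # e.g., [1] -> wrist flexion
```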
Machine learning pipelines classify gestures from wearable muscle and motion sensors
The closed-loop system consists of EMG and IMU acquisition, classification pipelines, and robot control. LEDs cue gestures during open-loop blocks.
Muscle and motion signal traces from all subjects for (a) arm stiffening, (b) fist clenching, (c) rotation gestures, (d) clusters of left and right gestures, and (e) clusters of up and down gestures.
Classifiers successfully identified their respective gestures during cued blocks for each subject. Each bar summarizes 40 trials, except wrist activation, which has 80 trials.
Confusion matrices illustrate classification performance during (a) open-loop gesture blocks and (b) closed-loop robot control blocks. Main numbers and coloring represent final outputs, while parenthetical values in (a) represent predictions before a gesture hierarchy is imposed and before wrist flexion and extension are gated by an adaptive threshold.
Images by Joseph DelPreto, MIT CSAIL
Conference Media: Human-Robot Interaction 2020 (HRI ’20)
Virtual Presentation | Demo Video
Publications
- J. DelPreto and D. Rus, “Plug-and-Play Gesture Control Using Muscle and Motion Sensors,” in Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction (HRI), New York, NY, USA, 2020, pp. 439–448. doi:10.1145/3319502.3374823
Abstract: As the capacity for machines to extend human capabilities continues to grow, the communication channels used must also expand. Allowing machines to interpret nonverbal commands such as gestures can help make interactions more similar to interactions with another person. Yet to be pervasive and effective in realistic scenarios, such interfaces should not require significant sensing infrastructure or per-user setup time. The presented work takes a step towards these goals by using wearable muscle and motion sensors to detect gestures without dedicated calibration or training procedures. An algorithm is presented for clustering unlabeled streaming data in real time, and it is applied to adaptively thresholding muscle and motion signals acquired via electromyography (EMG) and an inertial measurement unit (IMU). This enables plug-and-play online detection of arm stiffening, fist clenching, rotation gestures, and forearm activation. It also augments a neural network pipeline, trained only on strategically chosen training data from previous users, to detect left, right, up, and down gestures. Together, these pipelines offer a plug-and-play gesture vocabulary suitable for remotely controlling a robot. Experiments with 6 subjects evaluate classifier performance and interface efficacy. Classifiers correctly identified 97.6% of 1,200 cued gestures, and a drone correctly responded to 81.6% of 1,535 unstructured gestures as subjects remotely controlled it through target hoops during 119 minutes of total flight time.
@inproceedings{delpreto2020emgImuGesturesDrone,
  author    = {DelPreto, Joseph and Rus, Daniela},
  title     = {Plug-and-Play Gesture Control Using Muscle and Motion Sensors},
  booktitle = {Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction (HRI)},
  series    = {HRI '20},
  year      = {2020},
  month     = {March},
  isbn      = {9781450367462},
  publisher = {ACM},
  address   = {New York, NY, USA},
  location  = {Cambridge, United Kingdom},
  pages     = {439--448},
  numpages  = {10},
  url       = {https://dl.acm.org/doi/10.1145/3319502.3374823?cid=99658989019},
  doi       = {10.1145/3319502.3374823},
  keywords  = {Robotics, EMG, Wearable Sensors, Human-Robot Interaction, Gestures, Plug-and-Play, Machine Learning, IMU, Teleoperation}
}
In the News
Special thanks to the MIT CSAIL communications team, especially Rachel Gordon and Tom Buehler.