Skip to Main Content
Article navigation
Purpose

The purpose of this study is to address the challenge of object manipulation in scenarios where the target is not explicitly defined, requiring robots to engage in efficient planning to determine the sequence of actions for picking, placing and positioning objects. The aim is to develop a multistep skill learning method that integrates perception with a set of primitive actions, including a novel action of orienting, to enable robots to perform complex tasks that require multistep planning and interaction with various objects in cluttered and unstructured environments.

Design/methodology/approach

To achieve the purpose, the authors propose a pipeline that decomposes the object manipulation task into three independent stages, each trained end-to-end with raw visual inputs using off-policy reinforcement learning algorithms. The Q-learning algorithm is used to simultaneously train three fully convolutional neural networks for each primitive action – grasping, pushing, placing and orienting – from scratch. The framework is designed to be modular, allowing for easy extension to multistep manipulation tasks.

Findings

The findings demonstrate that robots can learn complex behaviors through both simulated and real-world experiments. In simulation, the robot achieved an efficient block-stacking success rate of up to 98% during testing. When transferring the model to a real universal robots UR3 (UR3) robot using effective domain randomization, the robot achieved a 100% completion rate with convex objects and a 92% completion rate with various objects not seen during training.

Originality/value

We develop a novel multistep skill learning method that integrates perception with multiple primitive actions, including a new action of orienting, and the use of off-policy reinforcement learning algorithms for end-to-end training. The modular design of the framework allows for easy extension to more complex manipulation tasks, and the encouraging results in both simulated and real-world experiments demonstrate significant improvements over current long-term planning methods.

Licensed re-use rights only
You do not currently have access to this content.
Don't already have an account? Register

Purchased this content as a guest? Enter your email address to restore access.

Please enter valid email address.
Email address must be 94 characters or fewer.
Pay-Per-View Access
$41.00
Rental

or Create an Account

Close Modal
Close Modal