Self-supervised reinforcement learning for multi-step object manipulation skills

Wang, Jiaqi; Chen, Chuxin; Liu, Jingwei; Du, Guanglong; Zhu, Xiaojun; Guan, Quanlong; Qiu, Xiaojian

doi:10.1108/IR-12-2024-0534


Research Article | March 31 2025


Jiaqi Wang (ORCID: 0009-0006-5048-2572), South China University of Technology, Guangzhou, China

Chuxin Chen (ORCID: 0000-0002-0008-5302), South China University of Technology, Guangzhou, China

Jingwei Liu, South China University of Technology, Guangzhou, China

Guanglong Du (ORCID: 0000-0001-9425-843X), Department of CS, South China University of Technology, Guangzhou, China

Guanglong Du can be contacted at: csgldu@scut.edu.cn

Xiaojun Zhu (ORCID: 0000-0002-9506-9005), Jianghuai Advanced Technology Center, Hefei, China

Quanlong Guan, Jinan University, Guangzhou, China

Xiaojian Qiu, Institute for Military-Civilian Integration of Jiangxi Province, Nanchang, China


Publisher: Emerald Publishing

Received: December 02 2024

Revision Received: January 20 2025

Accepted: February 24 2025

Online ISSN: 1758-5791

Print ISSN: 0143-991X

2025 Emerald Publishing Limited. Licensed re-use rights only.

Industrial Robot (2025) 52 (6): 853–865.

https://doi.org/10.1108/IR-12-2024-0534

Purpose

The purpose of this study is to address the challenge of object manipulation in scenarios where the target is not explicitly defined, requiring robots to engage in efficient planning to determine the sequence of actions for picking, placing and positioning objects. The aim is to develop a multistep skill learning method that integrates perception with a set of primitive actions, including a novel action of orienting, to enable robots to perform complex tasks that require multistep planning and interaction with various objects in cluttered and unstructured environments.

Design/methodology/approach

To achieve the purpose, the authors propose a pipeline that decomposes the object manipulation task into three independent stages, each trained end-to-end with raw visual inputs using off-policy reinforcement learning algorithms. The Q-learning algorithm is used to simultaneously train three fully convolutional neural networks for each primitive action – grasping, pushing, placing and orienting – from scratch. The framework is designed to be modular, allowing for easy extension to multistep manipulation tasks.
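As a rough illustration of the idea described above, the sketch below treats each primitive action as having its own dense map of per-pixel Q-values (the kind of output a fully convolutional network produces) and picks the primitive and pixel with the highest value. It also shows the standard one-step Q-learning target. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the nested-list stand-in for network outputs, the discount factor and the epsilon-greedy exploration rule are all hypothetical.

```python
import random

# Primitive names follow the abstract's list of actions; the data layout is assumed.
PRIMITIVES = ("grasp", "push", "place", "orient")


def select_action(q_maps):
    """Return (primitive, row, col, q_value) with the highest predicted Q-value.

    q_maps maps each primitive name to a 2D list of per-pixel Q-values,
    standing in for the dense output of a fully convolutional network.
    """
    best = None
    for name, q_map in q_maps.items():
        for r, row in enumerate(q_map):
            for c, q in enumerate(row):
                if best is None or q > best[3]:
                    best = (name, r, c, q)
    return best


def td_target(reward, next_q_maps, gamma=0.5, terminal=False):
    """One-step Q-learning target: r + gamma * max over next actions of Q.

    gamma=0.5 is an arbitrary illustrative value, not taken from the paper.
    """
    if terminal:
        return reward
    return reward + gamma * select_action(next_q_maps)[3]


def epsilon_greedy(q_maps, epsilon, rng=random):
    """Explore with probability epsilon, otherwise act greedily (assumed scheme)."""
    if rng.random() < epsilon:
        name = rng.choice(list(q_maps))
        q_map = q_maps[name]
        r = rng.randrange(len(q_map))
        c = rng.randrange(len(q_map[0]))
        return (name, r, c, q_map[r][c])
    return select_action(q_maps)
```

Because the maximum is taken jointly over primitives and pixels, choosing what to do and where to do it becomes a single argmax, which is one common way to couple perception with a discrete set of primitives.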

Findings

Simulated and real-world experiments demonstrate that robots can learn complex behaviors. In simulation, the robot achieved a block-stacking success rate of up to 98% during testing. When the model was transferred to a real Universal Robots UR3 robot using domain randomization, the robot achieved a 100% completion rate with convex objects and a 92% completion rate with various objects not seen during training.
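The abstract does not say which properties are randomized for the sim-to-real transfer. As a generic illustration of domain randomization, the sketch below draws a fresh set of scene parameters for each training episode so that a policy does not overfit to one simulated appearance. Every parameter name and range here is a hypothetical placeholder, not a value from the paper.

```python
import random

# Hypothetical parameters and ranges, chosen only for illustration.
RANGES = {
    "light_intensity": (0.5, 1.5),
    "camera_offset_m": (-0.02, 0.02),
    "block_hue": (0.0, 1.0),
    "table_friction": (0.4, 1.0),
}


def sample_scene(rng):
    """Draw one randomized scene configuration, uniformly within each range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that runs are repeatable
    for episode in range(3):
        print(episode, sample_scene(rng))
```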

Originality/value

The authors develop a novel multistep skill learning method that integrates perception with multiple primitive actions, including a new orienting action, and uses off-policy reinforcement learning algorithms for end-to-end training. The modular design of the framework allows for easy extension to more complex manipulation tasks, and the encouraging results in both simulated and real-world experiments demonstrate significant improvements over current long-term planning methods.
