Purpose

Object detection and instance segmentation play an important role in autonomous driving, where vehicles must perceive their surroundings reliably. In practice, these tasks are commonly addressed using separate models, which increases both training complexity and deployment cost. To overcome this issue, we propose UniPercepNet-S, a lightweight dual-task framework inspired by YOLOF that brings detection and segmentation into a single unified network, aiming to support real-time perception in resource-constrained environments.

Design/methodology/approach

UniPercepNet-S follows a YOLOF-style one-level detection design and strengthens the backbone with a channel attention module to improve feature quality. To enable instance segmentation, we add a simple yet efficient mask prediction branch that operates directly on detected objects while keeping computation low. We evaluate the proposed framework on MS COCO and BDD100K, covering both general object segmentation and autonomous-driving-oriented scenarios.
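The abstract does not say which channel attention variant the backbone uses. As a rough illustration only, the sketch below assumes a squeeze-and-excitation style module: each channel is averaged to a single value, passed through two small fully connected layers, and the resulting sigmoid gate rescales that channel. The function name `channel_attention`, the bias-free weights `w1` and `w2`, and the nested-list layout are hypothetical choices for this example and are not taken from the paper.

```python
import math
import random


def channel_attention(feature, w1, w2):
    """SE-style channel attention (an assumed variant, not the paper's module).

    feature: feature map as nested lists with shape [C][H][W]
    w1:      reduction weights, shape [C_r][C] (hypothetical, no bias)
    w2:      expansion weights, shape [C][C_r] (hypothetical, no bias)
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling collapses each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature
    ]
    # Excitation, step 1: fully connected reduction followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, step 2: fully connected expansion followed by sigmoid,
    # giving one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Rescale: multiply every value in a channel by that channel's gate.
    rescaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature, gates)
    ]
    return rescaled, gates


# Toy usage with random weights: 4 channels, reduced to 2 hidden units.
random.seed(0)
channels, reduced, height, width = 4, 2, 3, 3
toy_feature = [
    [[random.random() for _ in range(width)] for _ in range(height)]
    for _ in range(channels)
]
toy_w1 = [[random.uniform(-1, 1) for _ in range(channels)] for _ in range(reduced)]
toy_w2 = [[random.uniform(-1, 1) for _ in range(reduced)] for _ in range(channels)]
toy_out, toy_gates = channel_attention(toy_feature, toy_w1, toy_w2)
print([round(g, 3) for g in toy_gates])
```

The module adds only two small matrix products per feature map, which is consistent with the lightweight goal stated above. A real implementation would operate on tensors inside a deep learning framework instead of Python lists.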

Findings

The proposed UniPercepNet-S achieves a mask AP of 38.0 on MS COCO, placing it among the top-performing entries in the COCO Detection Challenge for segmentation tasks. On BDD100K, which reflects real-world driving conditions, the model reaches an AP of 20.3, showing that it generalizes well across different datasets. These results suggest that UniPercepNet-S can deliver accurate detection and segmentation while remaining suitable for real-time use.

Originality/value

This work contributes a unified and lightweight one-level framework that performs object detection and instance segmentation simultaneously, avoiding the need for heavy multi-scale architectures or separate task-specific models. By combining attention-enhanced representations with an efficient segmentation branch, UniPercepNet-S provides a practical solution for real-time perception. Its balance between simplicity, accuracy, and speed makes it especially valuable for autonomous driving and other embedded vision applications.

Licensed re-use rights only