This study addresses the ellipse distortion and resulting measurement inaccuracy that perspective projection introduces in the robotic inspection of engine cylinder heads.
A robot-assisted multi-view pose estimation framework is proposed. Unlike traditional single-view methods that rely solely on 2D image gradients, this method integrates robot kinematics with multi-view geometry constraints. By minimizing the reprojection error across multiple views, the system recovers the spatial parameters of circular holes. The method was validated using a UR5e robot and a Basler industrial camera, comparing its performance against the gradient-based Keypoint-Based Detection algorithm under varying pose disturbances.
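As an illustration of the multi-view idea, the following minimal Python sketch evaluates a reprojection error over several camera poses and minimizes it over the hole radius. It is not the implementation used in this study. The pinhole intrinsics, the camera poses (standing in for poses obtained from robot kinematics), the synthetic edge observations with known point correspondences, the fixed hole centre and plane normal, and the helper names (`project`, `reprojection_rmse`, `golden_section`) are all assumptions made for the example.

```python
import math

def rot_y(theta):
    """Rotation about the Y axis; stands in for a robot-induced view change."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def circle_points(center, radius, n=36):
    """Sample a 3D circle in a plane parallel to world XY (assumed normal +Z)."""
    return [[center[0] + radius * math.cos(2 * math.pi * k / n),
             center[1] + radius * math.sin(2 * math.pi * k / n),
             center[2]] for k in range(n)]

def project(point, R, t, f=1000.0, cx=640.0, cy=480.0):
    """Pinhole projection; f, cx, cy are assumed intrinsics in pixels."""
    X, Y, Z = (a + b for a, b in zip(matvec(R, point), t))
    return (f * X / Z + cx, f * Y / Z + cy)

def reprojection_rmse(center, radius, views, observations):
    """RMS pixel distance between projected circle samples and observations,
    accumulated over all views (point correspondences assumed known)."""
    sq, count = 0.0, 0
    for (R, t), obs in zip(views, observations):
        model = circle_points(center, radius, len(obs))
        for p, (u, v) in zip(model, obs):
            pu, pv = project(p, R, t)
            sq += (pu - u) ** 2 + (pv - v) ** 2
            count += 1
    return math.sqrt(sq / count)

def golden_section(f, lo, hi, tol=1e-10):
    """1-D golden-section search for the minimum of a unimodal function."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0

# Hypothetical setup: five camera poses 0.5 m from a 10 mm radius hole.
TRUE_CENTER = [0.0, 0.0, 0.0]
TRUE_RADIUS = 0.010  # metres
views = [(rot_y(math.radians(a)), [0.0, 0.0, 0.5]) for a in (-40, -15, 0, 20, 45)]

# Synthetic, noise-free "detected" edge points in each view.
observations = [[project(p, R, t) for p in circle_points(TRUE_CENTER, TRUE_RADIUS)]
                for R, t in views]

# Recover the radius by minimizing the multi-view reprojection error.
r_hat = golden_section(
    lambda r: reprojection_rmse(TRUE_CENTER, r, views, observations),
    0.005, 0.020)
print(f"estimated radius: {r_hat * 1000:.4f} mm")
```

In a full pipeline the hole centre and plane normal would be optimized jointly with the radius, typically with a nonlinear least-squares solver, and the observations would be detected edge points rather than synthetic projections.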
Experiments show that the proposed multi-view approach outperforms the single-view baseline. Under large viewing-angle variations, it reduced the root mean square error (RMSE) of the diameter measurement by 71% relative to the baseline and achieved subpixel accuracy. Processing efficiency also improved by 45.6%, which is attributed to eliminating the complex iterative filtering that single-view methods require.
Single-view perception was found to be highly sensitive to pose deviations and surface reflections, limiting its reliability for high-precision tasks compared to the multi-view geometric constraint approach.
Commercial systems for hole-parameter measurement exist in industrial inspection. However, most are closed proprietary solutions whose algorithms and data sets are not publicly accessible, which makes direct experimental comparison difficult. This work therefore aims to investigate the algorithmic effectiveness of the proposed perception strategies and to provide a flexible vision-based solution that can be deployed on general robot-mounted vision platforms.
The primary contribution is the integration of active robot pose constraints into the ellipse fitting process, which resolves the ill-posedness of single-view fitting under the extreme perspective distortion common in industrial environments.
