Purpose

Self-Admitted Technical Debt (SATD) consists of source-code comments in which developers explicitly acknowledge suboptimal design or implementation decisions that require future improvement. These comments often convey emotional signals such as frustration, urgency, or concern, which may reflect the perceived severity and priority of technical debt. While sentiment analysis has been increasingly applied to SATD, little attention has been paid to the interpretability and reliability of sentiment predictions produced by modern deep learning models. This study aims to investigate how explainable artificial intelligence (XAI) techniques interpret SATD sentiment predictions and whether different model-agnostic explanation methods provide consistent or divergent explanations. Specifically, we examine the reliability, agreement, and limitations of popular post-hoc explainers when applied to BERT-based SATD sentiment classification.

Design/methodology/approach

We formulate SATD sentiment analysis as a binary classification task that distinguishes negative from non-negative comments and fine-tune a BERT model on a manually curated SATD sentiment dataset using ten-fold cross-validation. For all correctly predicted instances, we generate local token-level explanations using three model-agnostic XAI techniques: LIME, SHAP, and BreakDown. We quantitatively assess explanation behaviour and cross-method consistency using feature contribution distributions, top-k token overlap, semantic similarity based on BERT embeddings, and Spearman rank correlation.
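Two of the consistency measures named above, top-k token overlap and Spearman rank correlation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of Jaccard similarity for the overlap, the ranking of tokens by absolute contribution, and the example attribution values are all assumptions made for illustration. The sketch uses only the Python standard library.

```python
import math
from typing import Dict, List, Set


def top_k_tokens(attributions: Dict[str, float], k: int) -> Set[str]:
    """Return the k tokens with the largest absolute contribution."""
    ranked = sorted(attributions, key=lambda t: abs(attributions[t]), reverse=True)
    return set(ranked[:k])


def top_k_overlap(a: Dict[str, float], b: Dict[str, float], k: int) -> float:
    """Jaccard overlap of the top-k token sets (assumed overlap definition)."""
    ta, tb = top_k_tokens(a, k), top_k_tokens(b, k)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def spearman(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Spearman correlation of signed contributions over shared tokens."""
    tokens = sorted(set(a) & set(b))
    if len(tokens) < 2:
        return math.nan
    ra = average_ranks([a[t] for t in tokens])
    rb = average_ranks([b[t] for t in tokens])
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return math.nan  # correlation is undefined for constant ranks
    return cov / math.sqrt(var_a * var_b)


# Hypothetical token attributions for one comment from two explainers.
# These numbers are invented for illustration and are not study results.
lime_attr = {"todo": 0.10, "ugly": 0.45, "hack": 0.30, "fix": 0.05, "later": -0.02}
shap_attr = {"todo": 0.25, "ugly": 0.05, "hack": 0.40, "fix": -0.10, "later": 0.02}

if __name__ == "__main__":
    print("top-2 overlap:", top_k_overlap(lime_attr, shap_attr, k=2))
    print("Spearman rho:", spearman(lime_attr, shap_attr))
```

In this toy case the two explainers share only one of their top two tokens, so the overlap is low even though the rank correlation is moderately positive. That is why the measures are reported together rather than treated as substitutes.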

Findings

The results reveal substantial divergence among the three explanation methods. LIME, SHAP, and BreakDown assign markedly different contribution magnitudes to influential tokens, exhibit near-zero overlap in top-k features, and frequently produce contradictory ranking orders and sentiment contribution directions. Statistical tests further confirm that these differences are systematic rather than random across both negative and non-negative sentiment categories.

Originality/value

Our findings demonstrate that model-agnostic explanation techniques cannot be used interchangeably for interpreting SATD sentiment predictions. Relying on a single explainer may lead to incomplete or misleading interpretations of developer intent. We therefore recommend multi-method triangulation and manual validation when explanation results are used to support technical debt prioritisation, code review, or maintenance decision-making. This study provides a comprehensive and reproducible empirical analysis of explanation reliability and divergence for SATD sentiment analysis, contributing a foundation for trustworthy and interpretable SATD analytics.
