This study presents a new startup investment decision support model (SIDSM) that is specifically designed to mitigate the inherent challenges of class imbalance and uncertainty in startup investment decisions.
Based on financial and structural indicators sourced from the Crunchbase database, the proposed model follows a multi-stage methodology. First, a systematic feature selection process that integrates the SHAP, Boruta and Elbow methods retains the informative features. Subsequently, uncertainty estimates are computed at the feature and observation levels using the Shannon entropy and DeepGini metrics and incorporated into XGBoost's learning process through a user-defined loss function and the integration of a label-distribution-aware margin (LDAM). The Cuckoo Search metaheuristic algorithm is used to tune the hyperparameters for model robustness, and class-based threshold optimization is used to refine the decision boundaries.
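To make the two uncertainty metrics concrete, the following is a minimal sketch of how Shannon entropy and DeepGini could be computed for a single observation. It assumes the scores are derived from a vector of predicted class probabilities; the function names and the normalization step are illustrative assumptions, not the study's implementation, and the feature-level computation is not covered.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def deepgini(probs):
    """DeepGini impurity: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in probs)


def normalized_uncertainty(probs):
    """Scale both scores to [0, 1] by their maxima under a uniform distribution.

    Assumption: this normalization is added here only so that the two
    scores are comparable; it is not described in the source.
    """
    k = len(probs)
    return (
        shannon_entropy(probs) / math.log(k),
        deepgini(probs) / (1.0 - 1.0 / k),
    )


# Hypothetical predictions for two startups (probability of failure, success).
confident = [0.95, 0.05]
ambiguous = [0.50, 0.50]

print(normalized_uncertainty(confident))
print(normalized_uncertainty(ambiguous))
```

Both scores reach their maximum for the evenly split prediction and shrink as the prediction becomes more confident, which is the property that allows them to flag difficult observations.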
The experimental findings demonstrate that SIDSM outperforms the baseline models, achieving a macro F1-score of 89.47% and more stable minority class detection, thereby indicating its potential to support startup investment decisions in a reliable, transparent and evidence-driven manner under class imbalance.
This study proposes a novel, data-conscious and holistic approach that addresses the methodological limitations of traditional classification approaches. By integrating feature explanatory power and observation-level uncertainties into the learning process, rather than relying solely on error-driven optimization, the model becomes sensitive to uncertainty, better captures difficult-to-learn and minority-class patterns, and provides a robust and explainable framework for startup investment decision-making.
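As an illustration of how uncertainty and a label-distribution-aware margin might enter a gradient-boosting objective, the sketch below computes the per-observation gradient and Hessian of a binary logistic loss. It assumes the standard LDAM form, in which the margin of class j is proportional to n_j^(-1/4), and a hypothetical weighting w = 1 + alpha * u for an uncertainty score u in [0, 1]. The names `ldam_margins`, `weighted_ldam_logistic`, `max_margin` and `alpha` are illustrative; the exact loss used in the study is not specified here.

```python
import math


def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j^(-1/4).

    The margins are rescaled so that the rarest class receives max_margin
    (an assumed hyperparameter).
    """
    raw = [n ** -0.25 for n in class_counts]
    scale = max_margin / max(raw)
    return [r * scale for r in raw]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_ldam_logistic(logits, labels, uncertainty, margins, alpha=1.0):
    """Gradient and Hessian of an uncertainty-weighted, margin-shifted log loss.

    margins[0] and margins[1] are the margins of class 0 and class 1.
    """
    grads, hessians = [], []
    for z, y, u in zip(logits, labels, uncertainty):
        # Shift the raw score against the true class, so the model must
        # clear a larger margin for the rarer class.
        shifted = z - margins[1] if y == 1 else z + margins[0]
        p = sigmoid(shifted)
        w = 1.0 + alpha * u  # assumed weighting: uncertain rows count more
        grads.append(w * (p - y))
        hessians.append(w * p * (1.0 - p))
    return grads, hessians


# Hypothetical data: 900 majority-class and 100 minority-class startups.
margins = ldam_margins([900, 100])
grads, hessians = weighted_ldam_logistic(
    logits=[0.0, 0.0], labels=[1, 0], uncertainty=[0.0, 1.0], margins=margins
)
print(margins, grads, hessians)
```

A function returning gradients and Hessians in this form matches the general shape that XGBoost expects from a custom objective, although wiring it into training would additionally require extracting labels from the training matrix and returning arrays.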
