Traditional modeling techniques for forecasting turbulence often select their inputs using correlation-based criteria, which may favor variables that correlate with the target without driving its dynamics. This limits model interpretability, generalization and efficiency. This study aims to overcome these limitations by introducing an observational causality-based approach to input selection that identifies the variables responsible for the future evolution of a target quantity while disregarding noncausal factors.
The authors’ approach is grounded in the synergistic-unique-redundant decomposition (SURD) of causality, which dissects the information that candidate inputs provide about a target variable into unique, redundant and synergistic causal components. These components are directly linked to the theoretical limits of predictive performance, quantified through the information-theoretic notion of irreducible error. To estimate these causal contributions in practice, the authors leverage neural mutual information estimators. The authors demonstrate the methodology by forecasting wall-shear stress using direct numerical simulation (DNS) data of turbulent channel flow.
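The decomposition described above can be written schematically as follows. The notation is an assumption introduced here for illustration and is not taken from the study: $Q^+$ denotes the future target, $\mathbf{Q} = (Q_1, \dots, Q_N)$ the candidate inputs, and $\Delta I^{R}$, $\Delta I^{U}$ and $\Delta I^{S}$ the redundant, unique and synergistic contributions, with the sums running over the relevant single inputs or groups of inputs.

```latex
\begin{aligned}
I(Q^+;\mathbf{Q}) &= H(Q^+) - H(Q^+ \mid \mathbf{Q}) \\
                  &= \sum_{\mathbf{i}} \Delta I^{R}_{\mathbf{i}}
                   + \sum_{i} \Delta I^{U}_{i}
                   + \sum_{\mathbf{i}} \Delta I^{S}_{\mathbf{i}}
\end{aligned}
```

In this reading, the conditional entropy $H(Q^+ \mid \mathbf{Q})$ is the uncertainty about the target that remains once the inputs are known, which is one way to interpret the irreducible error mentioned above: no model that uses only $\mathbf{Q}$ can remove it.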
The analysis reveals that variables with high unique or synergistic causal contributions enable compact forecasting models with strong predictive performance, whereas redundant variables can be excluded without compromising accuracy. Specifically, when predicting future wall-shear stress using two wall-parallel planes separated in the wall-normal direction, the streamwise velocity near the wall provides unique information about the target. In contrast, when both planes are located close to the wall, their information is largely redundant, and either can serve as input without degrading predictive accuracy. Finally, synergistic interactions emerge between different velocity components, which, when combined, enhance the prediction of future wall-shear stress beyond what each component achieves individually.
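The difference between redundant and synergistic inputs can be illustrated with a toy example. The sketch below is not the SURD algorithm and does not use neural estimators or DNS data; it assumes synthetic binary variables and a simple plug-in (histogram) estimate of mutual information, and all function and variable names are hypothetical.

```python
import numpy as np


def entropy_bits(*columns):
    """Plug-in Shannon entropy, in bits, of the joint distribution of discrete columns."""
    joint = np.stack(columns, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def mutual_info_bits(target, *inputs):
    """I(target; inputs) = H(target) + H(inputs) - H(target, inputs)."""
    return entropy_bits(target) + entropy_bits(*inputs) - entropy_bits(target, *inputs)


def summarize(target, x1, x2):
    """Mutual information of the target with each input alone and with both together."""
    return {
        "x1": mutual_info_bits(target, x1),
        "x2": mutual_info_bits(target, x2),
        "x1,x2": mutual_info_bits(target, x1, x2),
    }


# Synthetic data (an assumption for illustration): two independent fair bits.
rng = np.random.default_rng(0)
n = 100_000
a = rng.integers(0, 2, n)
b = rng.integers(0, 2, n)

# Redundant case: the second input is a copy of the first, so it adds nothing.
redundant = summarize(target=a, x1=a, x2=a.copy())

# Synergistic case: the target is the XOR of the inputs, so neither input
# is informative alone but the pair determines the target.
synergistic = summarize(target=a ^ b, x1=a, x2=b)

for name, result in [("redundant", redundant), ("synergistic", synergistic)]:
    print(name, {key: round(value, 3) for key, value in result.items()})
```

In the redundant case each input alone carries roughly one bit about the target and the pair carries no more, so either input can be dropped. In the synergistic case each input alone carries almost nothing, while the pair carries roughly one bit, which is why selection based on single-variable scores would miss it.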
This work presents a causality-based approach for input selection in turbulence forecasting. The method quantifies the causal contributions of candidate variables to the prediction of a future quantity of interest and connects them to the fundamental limits of predictive accuracy achievable by any model. This enables more interpretable and compact models by reducing input dimensionality without sacrificing performance. Beyond turbulence, the approach provides a general-purpose tool for variable selection in scientific machine learning, flow control and data-driven modeling of complex systems.
