2026., Bias and Beyond: Rethinking Data Practices and Inspiring Collective Dialogue, Soyeon Ahn, April Mann
The dominant modeling approach in recent peer-reviewed literature is one in which:
This approach is largely the product of years of peer-reviewed cost function estimation by William Duncombe, John Yinger, and colleagues of the Maxwell School at Syracuse University (Duncombe & Yinger, 1999, 2011). Here, we provide the rationale for this approach.
Our goal is to elicit from district spending data the “cost” of achieving specific outcome levels. We therefore set up a model that predicts spending levels from educational outcomes (narrowly measured as student achievement in Math and Language Arts) and other factors, rather than predicting outcomes from spending levels. Doing so requires statistical steps to correct for the fact that spending is influenced by outcomes while, simultaneously, outcomes are affected by spending (the circular feedback-loop relationship in the picture). More spending can lead to better student outcomes, because increased funding can be used to reduce class sizes, recruit better-qualified personnel, provide support services, and so on. At the same time, higher outcomes in a community may drive increased spending, because homeowners want their schools to continue to be perceived as high-performing, which keeps their property values relatively high. There is thus no single causal direction: the two factors affect each other simultaneously. The relevant statistical approach for isolating the effect of outcomes on spending (as distinct from the effect of spending on outcomes) is a two-stage model in which exogenous (outside the loop) measures of each district’s competitive context are used to correct for endogeneity (feedback inside the loop) in the outcome measure.
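To make the two-stage logic concrete, the following is a minimal sketch on simulated data, not the specification estimated in the literature cited above. Every variable name and coefficient is a hypothetical chosen for illustration: `z` stands in for an exogenous competitive-context measure, `need` for an exogenous cost factor, and `TRUE_B` for the effect of outcomes on spending that the model is meant to recover. The simulation builds in the feedback loop directly, then compares a naive regression of spending on outcomes with a two-stage estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # hypothetical number of districts

# Hypothetical structural parameters (assumptions for illustration only).
TRUE_B = 0.5      # effect of outcomes on spending (the quantity of interest)
FEEDBACK_D = 0.4  # effect of spending on outcomes (the feedback path)

z = rng.normal(size=n)             # exogenous competitive-context measure
need = rng.normal(size=n)          # exogenous cost factor
e = rng.normal(size=n)             # unobserved spending shock
v = rng.normal(scale=0.5, size=n)  # unobserved outcome shock

# The two equations hold simultaneously:
#   spending = 1.0 + TRUE_B * outcome + 0.8 * need + e
#   outcome  = FEEDBACK_D * spending + 1.0 * z - 0.5 * need + v
# Substituting the first into the second gives the solved form of outcome.
outcome = (
    FEEDBACK_D * (1.0 + 0.8 * need + e) + 1.0 * z - 0.5 * need + v
) / (1.0 - FEEDBACK_D * TRUE_B)
spending = 1.0 + TRUE_B * outcome + 0.8 * need + e


def ols(X, y):
    """Least-squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)

# Naive regression: outcome is correlated with the spending shock e,
# so its coefficient absorbs part of the feedback.
b_ols = ols(np.column_stack([const, outcome, need]), spending)[1]

# Stage 1: predict outcome from exogenous variables only.
Z = np.column_stack([const, z, need])
outcome_hat = Z @ ols(Z, outcome)

# Stage 2: regress spending on the predicted (exogenous) part of outcome.
b_2sls = ols(np.column_stack([const, outcome_hat, need]), spending)[1]

print(f"true effect: {TRUE_B:.3f}")
print(f"naive OLS:   {b_ols:.3f}")
print(f"two-stage:   {b_2sls:.3f}")
```

In this simulated setup the naive estimate overstates the effect of outcomes on spending, while the two-stage estimate lands close to the assumed value. Running the second stage by hand, as here, gives correct point estimates but not correct standard errors; a dedicated instrumental-variables routine would be needed for inference.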
