This paper investigates the usefulness and validity of student evaluations of teaching (SET) by estimating multiple biases, measuring their cumulative effect and assessing the implications for evaluating teaching effectiveness.
The study applies linear and quantile regressions to a rich dataset from a Polish university to estimate SET biases related to course difficulty, class size and instructor characteristics. It then measures the cumulative effect of these biases and analyzes how they changed during the COVID-19 pandemic.
The cumulative SET bias exceeds one point on a 1–5 Likert scale, which challenges the reliability of raw SET scores. Low and high SET scores behave asymmetrically: poor initial evaluations of a teacher predict low ratings in the future, while contests for top-rated teacher are often decided by chance rather than teaching quality.
The findings suggest that universities should stop using raw SET scores for faculty evaluation and instead adjust the scores for the identified biases. Such adjusted scores would provide a more accurate measure of teaching performance.
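One way such an adjustment could work is sketched below. It is an illustration only, not the model estimated in the paper: it fits an ordinary least squares regression of raw SET scores on two bias covariates and removes the fitted bias relative to an average course. The covariates (`class_size`, `difficulty`), the helper names and all data values are hypothetical, and the quantile regressions used in the study are not reproduced here.

```python
from statistics import mean


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, scores):
    """Return [intercept, slope_1, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def adjust_scores(rows, scores):
    """Subtract the fitted bias of each course relative to the average course."""
    coefs = fit_ols(rows, scores)
    slopes = coefs[1:]
    col_means = [mean(col) for col in zip(*rows)]
    adjusted = []
    for r, y in zip(rows, scores):
        bias = sum(b * (x - m) for b, x, m in zip(slopes, r, col_means))
        adjusted.append(y - bias)
    return coefs, adjusted


# Hypothetical courses: (class_size, difficulty on a 1-3 scale) and raw SET scores.
courses = [(20, 1), (40, 2), (60, 1), (80, 3), (100, 2), (30, 3)]
raw_scores = [4.6, 4.1, 4.2, 3.3, 3.9, 4.0]

coefficients, adjusted_scores = adjust_scores(courses, raw_scores)
for course, raw, adj in zip(courses, raw_scores, adjusted_scores):
    print(course, raw, round(adj, 2))
```

Because the covariates are centered on their means, the adjustment leaves the average score unchanged and only shifts courses relative to one another. In practice a statistics package would replace the hand-written solver.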
This paper builds on earlier studies that applied econometric frameworks to analyze SET bias predictors and offers a novel, comprehensive assessment of cumulative SET biases and their asymmetries. It is the first to evaluate the effects of multiple SET biases within a single model and the first to document how SET biases intensified during the pandemic, emphasizing the need for significant reform in teaching evaluation practices.
