Open Conference Systems, ITACOSM 2019 - Survey and Data Science

Correcting sample biases: an analysis of teaching evaluations
Shira Fano, Rosanna Cataldo, Tiziana Venittelli

Building: Learning Center Morgagni
Room: Aula 210
Date: 2019-06-06 03:30 PM – 04:40 PM
Last modified: 2019-05-23


Students’ evaluation of teaching has become a widespread practice in higher education, as it allows universities to monitor student satisfaction and improve teaching quality (Spooren, 2010). Nevertheless, one of the issues faced is the reliability of the data, particularly when students fill in evaluations on a voluntary basis (Thielsch et al., 2018; Beleche et al., 2012).

We exploit a change in the regulation governing the collection of teaching evaluations in Italian universities. Before the academic year 2017/2018, students could decide whether to fill in teaching evaluations on a voluntary basis; after that date, it became compulsory. First, we take advantage of this discontinuity to show how collecting information from a non-random sample can lead to biased estimates. Second, we focus on the teaching evaluations of the academic year 2017/2018 and estimate an ordinal logit model to investigate the main determinants of students’ overall satisfaction with courses.
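The mechanism behind the first point can be illustrated with a small simulation. The sketch below is not based on the paper’s data: the population distribution of satisfaction scores and the response probabilities are assumed values chosen only to show how self-selection under voluntary collection can shift the estimated mean.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population of 1-4 satisfaction scores (assumed distribution)
population = rng.choice([1, 2, 3, 4], size=100_000, p=[0.10, 0.20, 0.40, 0.30])

# Under voluntary collection, suppose more satisfied students are more likely
# to respond (assumed response probabilities, for illustration only)
response_prob = np.array([0.05, 0.10, 0.20, 0.35])[population - 1]
responded = rng.random(population.size) < response_prob
voluntary_sample = population[responded]

print(f"population mean:       {population.mean():.2f}")
print(f"voluntary-sample mean: {voluntary_sample.mean():.2f}")
```

Because response propensity is correlated with the outcome itself, the voluntary sample overstates average satisfaction relative to the full (compulsory) population.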

The data consist of the universe of teaching evaluations collected in 2016/2017 and 2017/2018 at a large Italian public university. We have information on overall satisfaction with the course, the quality of infrastructure, the organization of teaching activities, and the ability of teachers to deliver lessons. The evaluations collected number N = 59,314 for 2016/2017 and N = 289,033 for 2017/2018.

We compare students’ answers across the two years using t-tests and Kolmogorov–Smirnov tests; for all items we reject the null hypotheses that the answers have equal means and come from the same distribution. For example, students in the 2016/2017 survey report greater interest in the topics taught in class, while students in the 2017/2018 cohort rate the teaching material significantly better than did their colleagues in the previous year. We perform robustness checks and argue that the significant differences in answers are mainly due to the non-random sample available for the first cohort; drawing inference from that sample would lead to biased estimates.
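The two-sample comparison described above can be sketched with standard SciPy routines. The Likert-style response distributions below are invented for illustration and do not come from the study’s data; with real data, the two arrays would hold the item responses from each cohort.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical 1-4 Likert responses: voluntary cohort vs. compulsory cohort
# (assumed probabilities, for illustration only)
voluntary = rng.choice([1, 2, 3, 4], size=2_000, p=[0.05, 0.15, 0.35, 0.45])
compulsory = rng.choice([1, 2, 3, 4], size=2_000, p=[0.10, 0.25, 0.35, 0.30])

# Welch t-test for equal means; Kolmogorov-Smirnov test for equal distributions
t_stat, t_p = stats.ttest_ind(voluntary, compulsory, equal_var=False)
ks_stat, ks_p = stats.ks_2samp(voluntary, compulsory)

print(f"Welch t-test p-value:          {t_p:.3g}")
print(f"Kolmogorov-Smirnov p-value:    {ks_p:.3g}")
```

Small p-values on both tests would, as in the abstract, lead to rejecting equality of means and of distributions across the two cohorts.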

Results of the ordinal logit model, where the dependent variable is the survey answer to “Are you overall satisfied with this course?” (from 1 to 4), suggest that explaining the program and objectives of the course clearly significantly increases the probability of high overall satisfaction. For example, answering 4 rather than 1 to this question increases the odds of being in a higher category of the outcome variable by 307%. Teachers’ attitude and ability also have a significant effect on student satisfaction, and the magnitude of the effect is large. The quality of classrooms and labs is significant as well, but the magnitude is smaller.
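Under the proportional-odds interpretation of an ordinal logit, a reported percentage change in the odds maps directly to an exponentiated coefficient. The coefficient below is a hypothetical value, chosen only so that the arithmetic reproduces a 307% increase; it is not taken from the paper’s estimates.

```python
import math

# Hypothetical ordinal-logit coefficient for answering 4 vs. 1 on
# "program and objectives explained clearly" (assumed value, chosen
# to match a 307% increase in odds for illustration)
beta = math.log(4.07)

odds_ratio = math.exp(beta)          # proportional-odds ratio, exp(beta)
pct_change = (odds_ratio - 1) * 100  # percentage increase in the odds

print(f"odds ratio = {odds_ratio:.2f}, odds increase = {pct_change:.0f}%")
```

In general, an odds ratio of r corresponds to a (r - 1) x 100% change in the odds of being in a higher outcome category, holding the other covariates fixed.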


Spooren, P. (2010). On the credibility of the judge: A cross-classified multilevel analysis on students’ evaluation of teaching. Studies in educational evaluation, 36(4), 121-131.

Thielsch, M. T., Brinkmoller, B., & Forthmann, B. (2018). Reasons for responding in student evaluation of teaching. Studies in Educational Evaluation, 56, 189-196.

Beleche, T., Fairris, D., & Marks, M. (2012). Do course evaluations truly reflect student learning? Evidence from an objectively graded post-test. Economics of Education Review, 31(5), 709-719.