A comparative study of benchmarking procedures for interrater and intrarater agreement studies
Amalia Vanacore, Maria Sole Pellegrino

Decision-making processes typically rely on subjective evaluations provided by human raters. In the absence of a gold standard against which to check evaluation trueness, the magnitude of inter/intra-rater agreement coefficients is commonly interpreted as a measure of the rater’s evaluative performance. In this study, some benchmarking procedures for characterizing the extent of agreement are discussed and compared via a Monte Carlo simulation.


