In statistics, inter-rater reliability, inter-rater agreement, or concordance is the degree of agreement among raters. It gives a score of how much homogeneity, or consensus, there is in the ratings given by judges. It is useful in refining the tools given to human judges, for example by determining whether a particular scale is appropriate for measuring a particular variable. If various raters do not agree, either the scale is defective or the raters need to be re-trained.
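One of the simplest such scores is percent agreement: the proportion of items on which two raters assign the same rating. The sketch below illustrates the idea with hypothetical data; note that percent agreement does not correct for agreement expected by chance, which is why statistics such as Cohen's kappa are often preferred.

```python
# A minimal sketch of percent agreement between two raters.
# The rating lists below are hypothetical illustrations.

def percent_agreement(ratings_a, ratings_b):
    """Return the fraction of items rated identically by two raters."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("Both raters must rate the same set of items.")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

# Two judges rate ten essays on a 1-5 scale (hypothetical data).
judge_1 = [4, 3, 5, 2, 4, 4, 1, 3, 5, 2]
judge_2 = [4, 3, 4, 2, 4, 5, 1, 3, 5, 2]

print(percent_agreement(judge_1, judge_2))  # 0.8: the judges agree on 8 of 10 essays
```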