Combining Confidence-Tagged Expert Opinions by Alternate Maximization of Likelihood
N. N. Schraudolph. Combining Confidence-Tagged Expert Opinions by Alternate Maximization of Likelihood. Technical Report IDSIA-25-98, Istituto Dalle Molle di Studi sull'Intelligenza Artificiale, 1998.
Abstract
We address the problem of combining the subjective, confidence-tagged opinions of experts that independently review a set of like items, such as submissions to a scientific conference. The conventional approach of confidence-weighted averaging is improved upon by augmenting a probabilistic error model with bias and trust parameters that characterize the subjectivity of the referees' quality and confidence judgments, respectively. The likelihood of the review data under this model is then optimized by alternate maximization (AM) with respect to item scores and referee parameters. Since conditionally optimal trust parameters cannot be calculated explicitly, we provide two iterative schemes for this purpose. In preliminary experiments the resulting generalized AM algorithm was found to be robust, efficient and effective. We are set to field-test it in the peer review process of the NIPS*98 conference.
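The alternation described in the abstract can be illustrated with a minimal Python sketch. It is not the report's algorithm: the abstract does not give the error model, so the sketch assumes a Gaussian model in which each review score equals item quality plus referee bias plus noise whose variance is the inverse of the stated confidence. Trust parameters are left out entirely, since the abstract notes they have no explicit conditional optimum. The names `confidence_weighted_average` and `alternate_maximization` and the tuple layout of `reviews` are hypothetical.

```python
def confidence_weighted_average(reviews):
    """Baseline: confidence-weighted mean score per item.

    reviews: iterable of (referee, item, score, confidence) tuples,
    with confidence > 0 (hypothetical layout, not from the report).
    """
    num, den = {}, {}
    for _, item, score, conf in reviews:
        num[item] = num.get(item, 0.0) + conf * score
        den[item] = den.get(item, 0.0) + conf
    return {item: num[item] / den[item] for item in num}


def alternate_maximization(reviews, n_iter=50):
    """Alternate between item scores and referee biases.

    Assumed model (an illustration, not the report's):
        score = quality[item] + bias[referee] + noise,
        noise ~ Normal(0, 1 / confidence).
    Under this assumption each step is a closed-form weighted mean
    that maximizes the likelihood with the other parameters held fixed.
    """
    reviews = list(reviews)
    bias = {referee: 0.0 for referee, _, _, _ in reviews}
    quality = {}
    for _ in range(n_iter):
        # Step 1: item scores given the current biases.
        debiased = [(r, i, s - bias[r], c) for r, i, s, c in reviews]
        quality = confidence_weighted_average(debiased)
        # Step 2: referee biases given the current item scores.
        num, den = {}, {}
        for r, i, s, c in reviews:
            num[r] = num.get(r, 0.0) + c * (s - quality[i])
            den[r] = den.get(r, 0.0) + c
        bias = {r: num[r] / den[r] for r in num}
    return quality, bias


if __name__ == "__main__":
    # Referee "b" scores every paper one point higher than referee "a".
    data = [("a", "p1", 5.0, 1.0), ("a", "p2", 7.0, 1.0),
            ("b", "p1", 6.0, 1.0), ("b", "p2", 8.0, 1.0)]
    quality, bias = alternate_maximization(data)
    print(quality, bias)
```

In this simplified model the likelihood is unchanged if a constant is added to every bias and subtracted from every quality, so only differences between referees are meaningful; the sketch settles on one such solution from a zero-bias start.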
BibTeX Entry
@techreport{mali,
  author = {Nicol N. Schraudolph},
  title = {\href{http://nic.schraudolph.org/pubs/mali.pdf}{Combining Confidence-Tagged Expert Opinions by Alternate Maximization of Likelihood}},
  number = {IDSIA-25-98},
  institution = {Istituto Dalle Molle di Studi sull'Intelligenza Artificiale},
  address = {Galleria 2, CH-6928 Manno, Switzerland},
  year = 1998,
  b2h_type = {Other},
  b2h_topic = {Other},
  abstract = {We address the problem of combining the subjective, confidence-tagged opinions of experts that independently review a set of like items, such as submissions to a scientific conference. The conventional approach of confidence-weighted averaging is improved upon by augmenting a probabilistic error model with bias and trust parameters that characterize the subjectivity of the referees' quality and confidence judgments, respectively. The likelihood of the review data under this model is then optimized by alternate maximization (AM) with respect to item scores and referee parameters. Since conditionally optimal trust parameters cannot be calculated explicitly, we provide two iterative schemes for this purpose. In preliminary experiments the resulting generalized AM algorithm was found to be robust, efficient and effective. We are set to field-test it in the peer review process of the NIPS*98 conference.}
}