Evaluation Paradigms in Question Answering
Abstract
Question answering (QA) primarily descends from two branches of research: (1) Alan Turing’s investigation of machine intelligence at Manchester University and (2) Cyril Cleverdon’s comparison of library card catalog indices at Cranfield University. This position paper names and distinguishes these paradigms. Despite substantial overlap, subtle but significant distinctions exert an outsize influence on research. While one evaluation paradigm values creating more intelligent QA systems, the other paradigm values building QA systems that appeal to users. By better understanding the epistemic heritage of QA, researchers, academia, and industry can more effectively accelerate QA research.