Challenges and Opportunities in NLP Benchmarking
Over the last few years, models in NLP have become much more powerful, driven by advances in transfer learning. One consequence of this dramatic increase in performance is that existing benchmarks have been left behind. Recent models “have outpaced the benchmarks to test for them” (AI Index Report 2021), quickly reaching super-human performance on standard benchmarks such as SuperGLUE and SQuAD. Does this mean that we have solved natural language processing? Far from it. However, the traditional practices for evaluating performance […]