Sign in

Training on the Test Task Confounds Evaluation and Emergence

By Ricardo Dominguez-Olmedo and others
We study a fundamental problem in the evaluation of large language models that we call training on the test task. Unlike wrongful practices like training on the test data, leakage, or data contamination, training on the test task is not a malpractice. Rather, the term describes a growing set of... Show more
July 10, 2024
=
0
Loading PDF…
Loading full text...
Similar articles
Loading recommendations...
=
0
x1
Training on the Test Task Confounds Evaluation and Emergence
Click on play to start listening