[2111.15366] AI and the Everything in the Whole Wide World Benchmark
https://arxiv.org/abs/2111.15366
... we argue that benchmarks presented as measurements of progress towards general ability within vague tasks such as “visual understanding” or “language understanding” are as ineffective as the finite museum is at representing “everything in the whole wide world,” ...
Argues that machine learning's focus on a relatively small number of benchmarks is counterproductive: progress on arbitrarily chosen benchmarks gets conflated with general progress in the field.
Related By Tags
- 🔗 Eye-catching advances in some AI fields are not real | Science | AAAS
- 🔗 Large image datasets: A pyrrhic win for computer vision? | OpenReview
- 🔗 The steep cost of capture | ACM Interactions
- 🔗 Get Started · Snorkel
- 🔗 [1903.03129] SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems
- 🔗 ai-tree.pdf
- 🔗 [2005.03220] Fractional ridge regression: a fast, interpretable reparameterization of ridge regression
- 🔗 Straight to Spam
- 🔗 Getting machine learning to production · Vicki Boykis
- 🔗 Disembodied Machine Learning: On the Illusion of Objectivity in NLP | OpenReview