Computer scientists Sayash Kapoor and Arvind Narayanan at Princeton University reported earlier this year that the problem of data leakage (when there is insufficient separation between the data used to train an AI system and those used to test it) has caused reproducibility issues in 17 fields that they examined, affecting hundreds of papers. They argue that naive use of AI is leading to a reproducibility crisis.
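The kind of leakage described above can be surprisingly subtle. As a hedged illustration (all names and numbers here are invented, not from the Princeton study), the sketch below shows one common form: computing a preprocessing statistic on the full dataset before splitting, so information from the test set bleeds into the training features.

```python
# Illustrative sketch of data leakage via preprocessing.
# Everything here is a toy example, not code from any cited study.

def split(data, train_frac=0.8):
    """Split a list into train and test portions."""
    cut = int(len(data) * train_frac)
    return data[:cut], data[cut:]

def center(values, mean):
    """Subtract a mean from each value."""
    return [v - mean for v in values]

data = [1.0, 2.0, 3.0, 4.0, 100.0]  # the test portion holds an outlier

# Leaky pipeline: the mean is computed on ALL data, so the test-set
# outlier shifts the training features before any split happens.
leaky_mean = sum(data) / len(data)
train, test = split(data)
leaky_train = center(train, leaky_mean)

# Correct pipeline: statistics come from the training portion only.
train_mean = sum(train) / len(train)
clean_train = center(train, train_mean)

print(leaky_train != clean_train)  # the leak altered the training data
```

A model trained on the leaky features has implicitly "seen" the test set, which tends to inflate reported performance and makes results hard to reproduce on genuinely unseen data.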
Other researchers, meanwhile, worry that ill-informed use of AI software is driving a deluge of papers whose claims cannot be replicated, or that are wrong or useless in practical terms.
There has been no systematic estimate of the extent of the problem, but researchers say that, anecdotally, error-strewn AI papers are everywhere. "This is a widespread issue impacting many communities beginning to adopt machine-learning methods," Kapoor says.