Researchers at the Massachusetts Institute of Technology (MIT) say they have developed a new approach to information extraction that turns conventional machine learning on its head.
The researchers trained their system on sparse data because, in the scenario they are investigating, that is usually all that is available. They then make extraction from that limited information an easier problem to solve. If the system produces a low confidence score, it automatically generates a Web search query to locate texts likely to contain the data it is trying to extract. It then attempts to extract the relevant data from one of the new texts and reconciles the results with those of its initial extraction.
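The loop described above (extract, check confidence, search, extract again, reconcile) can be sketched in a few lines of Python. This is only an illustrative sketch, not the MIT system: `Extraction`, `extract_with_retrieval`, the 0.8 threshold, and the toy extractor and search function are all hypothetical names and values, and "keep the higher-confidence answer" is a simple stand-in for whatever reconciliation the researchers actually use.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Extraction:
    value: str
    confidence: float  # assumed to lie in [0, 1]


def extract_with_retrieval(
    text: str,
    extract: Callable[[str], Extraction],
    make_query: Callable[[str], str],
    search: Callable[[str], Iterable[str]],
    threshold: float = 0.8,  # hypothetical cutoff
    max_docs: int = 3,       # hypothetical limit on extra texts
) -> Extraction:
    """Extract from `text`; if confidence is low, try texts found by search."""
    result = extract(text)
    if result.confidence >= threshold:
        return result  # confident enough, no search needed

    # Low confidence: generate a query and look at a few retrieved texts.
    for doc in list(search(make_query(text)))[:max_docs]:
        candidate = extract(doc)
        # Stand-in reconciliation: keep whichever answer is more confident.
        if candidate.confidence > result.confidence:
            result = candidate
        if result.confidence >= threshold:
            break
    return result


# Toy stand-ins for a weak extractor and a Web search (both hypothetical).
def toy_extract(text: str) -> Extraction:
    match = re.search(r"(\d+) people", text)
    if match:
        return Extraction(match.group(1), 0.9)
    return Extraction("unknown", 0.2)


def toy_search(query: str) -> list[str]:
    return ["Officials gave no figures.", "Police said 4 people were hurt."]


result = extract_with_retrieval(
    "Several people were hurt downtown.", toy_extract, lambda t: t, toy_search
)
print(result.value, result.confidence)
```

In this toy run the first text yields only a low-confidence guess, so the sketch falls back to the retrieved texts and keeps the answer from the one the weak extractor can handle.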
"So you have something that's a very weak extractor, and you just find data that fits it automatically from the Web," says MIT graduate student Adam Yala.
The researchers compared the system's performance with that of several conventional extractors and found that, for each data item extracted, the new system outperformed its predecessors by about 10 percent.
The researchers "have this super-clever part of the model that goes out and queries for more information that might result in something that's simpler for it to process," says University of Pennsylvania professor Chris Callison-Burch.
From MIT News
Abstracts Copyright © 2016 Information Inc., Bethesda, Maryland, USA