Indiana University scientists create computational algorithm for fact-checking
In the first use of this method, IU scientists created a simple computational fact-checker that assigns “truth scores” to statements concerning history, geography and entertainment, as well as random statements drawn from the text of Wikipedia, the well-known online encyclopedia.
In multiple experiments, the automated system consistently matched the assessment of human fact-checkers in terms of their certitude about the accuracy of these statements.
The results of the study, “Computational Fact Checking From Knowledge Networks,” are reported in today's issue of PLOS ONE.
“These results are encouraging and exciting,” said Giovanni Luca Ciampaglia, a postdoctoral fellow at the Center for Complex Networks and Systems Research in the IU Bloomington School of Informatics and Computing, who led the study. “We live in an age of information overload, including abundant misinformation, unsubstantiated rumors and conspiracy theories whose volume threatens to overwhelm journalists and the public.
“Our experiments point to methods to abstract the vital and complex human task of fact-checking into a network analysis problem, which is easy to solve computationally.”
The team selected Wikipedia as the information source for their experiment due to its breadth and open nature. Although Wikipedia is not 100 percent accurate, previous studies estimate the online encyclopedia is nearly as reliable as traditional encyclopedias, but also covers many more subjects.
Using factual information from infoboxes on the site, IU scientists built a “knowledge graph” with 3 million concepts and 23 million links between them. A link between two concepts in the graph can be read as a simple factual statement, such as “Socrates is a person” or “Paris is the capital of France.”
In what the IU scientists describe as an “automatic game of trivia,” the team applied their algorithm to answer simple questions related to geography, history and entertainment, including statements that matched states or nations with their capitals, presidents with their spouses and Oscar-winning film directors with the movie for which they won the Best Picture awards, with the majority of tests returning highly accurate truth scores.
Lastly, the scientists used the algorithm to fact-check excerpts from the main text of Wikipedia, which were previously labeled by human fact-checkers as true or false, and found a positive correlation between the truth scores produced by the algorithm and the answers provided by the fact-checkers.
Significantly, the IU team found their computational method could even assess the truthfulness of statements about information not directly contained in the infoboxes. For example, the fact that Steve Tesich — the Serbian-American screenwriter of the classic Hoosier film “Breaking Away” — graduated from IU, despite the information not being specifically addressed in the infobox about him.
“The measurement of the truthfulness of statements appears to rely strongly on indirect connections, or 'paths,' between concepts,” Ciampaglia said. “If we prevented our fact-checker from traversing multiple nodes on the graph, it performed poorly since it could not discover relevant indirect connections. But because it's free to explore beyond the information provided in one infobox, our method leverages the power of the full knowledge graph.”
Although the experiments were conducted using Wikipedia, the IU team's method does not assume any particular source of knowledge. The scientists aim to conduct additional experiments using knowledge graphs built from other sources of human knowledge, such as Freebase, the open-knowledge base built by Google, and note that multiple information sources could be used together to account for different belief systems.
“Misinformation endangers the public debate on a broad range of global societal issues,” said Filippo Menczer, director of the Center for Complex Networks and Systems Research and a professor in the IU School of Informatics and Computing, who is a co-author on the study. “With increasing reliance on the Internet as a source of information, we need tools to deal with the misinformation that reaches us every day. Computational fact-checkers could become part of the solution to this problem.”
The team added a significant amount of natural language processing research, and other work remains before these methods could be made available to the public as a software tool.
###
Additional co-authors on the study are professor Luis M. Rocha, associate professors Johan Bollen and Alessandro Flammini, and doctoral student Prashant Shiralkar, all in the School of Informatics and Computing. This work was supported in part by the Swiss National Science Foundation, the Lilly Endowment, the James S. McDonnell Foundation, the National Science Foundation and the Department of Defense.
Media Contact
All latest news from the category: Information Technology
Here you can find a summary of innovations in the fields of information and data processing and up-to-date developments on IT equipment and hardware.
This area covers topics such as IT services, IT architectures, IT management and telecommunications.
Newest articles
First-of-its-kind study uses remote sensing to monitor plastic debris in rivers and lakes
Remote sensing creates a cost-effective solution to monitoring plastic pollution. A first-of-its-kind study from researchers at the University of Minnesota Twin Cities shows how remote sensing can help monitor and…
Laser-based artificial neuron mimics nerve cell functions at lightning speed
With a processing speed a billion times faster than nature, chip-based laser neuron could help advance AI tasks such as pattern recognition and sequence prediction. Researchers have developed a laser-based…
Optimising the processing of plastic waste
Just one look in the yellow bin reveals a colourful jumble of different types of plastic. However, the purer and more uniform plastic waste is, the easier it is to…