Automatic Quality Assessment: NLP methods for semantic mapping of life-science texts
The increasing prevalence of misinformation and disinformation represents a serious threat to our democratic society. Disinformation is typically defined as any form of communication that is intended to mislead. Spread by political lobby groups to manipulate public discourse, it can make it hard for recipients to determine whether the information they receive is true or false. Disinformation is also prevalent in scientific settings and can therefore have a direct impact on scientists’ work. In some fields, such as medicine, false information may even put people's health at risk.
To help tackle this problem, a team of ZB MED experts has launched AQUAS, a project that aims to compile the first German-language database on disinformation in the life sciences. To identify cases of disinformation, AQUAS will consult reports and assessments issued by relevant bodies, including the German Federal Agency for Civic Education, the fact-finding team run by German broadcaster ARD, and the NGO MedWatch. This data will then be used to develop a machine-learning application that will endeavour to classify previously unseen literature as either science, popular science or disinformation by detecting similarities with items the AI has already encountered.
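The idea of classifying unseen texts by their similarity to already labelled items can be illustrated with a minimal sketch. It is not the AQUAS model, whose architecture is not described here: it assumes a simple nearest-neighbour comparison over word counts with cosine similarity, and the function names and the three toy example texts are invented purely for illustration.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Turn a text into a bag-of-words count vector (lower-cased words)."""
    return Counter(re.findall(r"[a-zäöüß]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def classify(text, labelled_examples):
    """Return (label, score) of the most similar labelled example.

    Returns (None, 0.0) when the text shares no words with any example,
    so that "no evidence" is not silently mapped to a category.
    """
    query = vectorize(text)
    best_label, best_score = None, 0.0
    for example, label in labelled_examples:
        score = cosine(query, vectorize(example))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

# Invented toy examples, one per category named in the project description.
EXAMPLES = [
    ("randomised controlled trial shows reduced mortality in the treatment group",
     "science"),
    ("why sleep matters: what researchers have learned about rest and health",
     "popular science"),
    ("miracle cure that doctors hide from you heals every disease overnight",
     "disinformation"),
]

label, score = classify("this miracle cure heals every disease", EXAMPLES)
print(label, round(score, 2))
```

A real system would need far more training data and a stronger text representation; the sketch only shows how a similarity score can drive the assignment to one of the three categories.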
In many cases, users will also be able to consult supplementary information on whether items comply with good research practice as defined by the German Research Foundation (DFG). This will include information on citations, peer review and retraction status (i.e. whether a journal has subsequently retracted an article after becoming aware of errors), all of which can help pinpoint misinformation and disinformation.
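As a rough illustration of how such supplementary information could be attached to an item, the sketch below models the three indicators mentioned above as an optional record. The class name, field names and wording of the notes are hypothetical and do not reflect the actual AQUAS data model; fields default to `None` because the information will only be available in many, not all, cases.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PracticeIndicators:
    """Hypothetical record of good-research-practice indicators for one item."""
    citation_count: Optional[int] = None   # None means unknown
    peer_reviewed: Optional[bool] = None   # None means unknown
    retracted: Optional[bool] = None       # None means unknown

def caution_notes(indicators: PracticeIndicators) -> List[str]:
    """List indicators that may merit a closer look; unknown values are skipped."""
    notes = []
    if indicators.retracted is True:
        notes.append("retracted by the journal")
    if indicators.peer_reviewed is False:
        notes.append("not peer reviewed")
    if indicators.citation_count == 0:
        notes.append("no recorded citations")
    return notes

item = PracticeIndicators(citation_count=12, peer_reviewed=True, retracted=True)
print(caution_notes(item))
```

Keeping unknown values distinct from negative ones avoids flagging an item merely because its metadata is missing.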
Based on the enrichment methods that have already been developed, the AQUAS project team will implement a service that can be accessed via an application programming interface (API). The service’s first main application will be on the ZB MED search portal LIVIVO, where it will be used to classify the literature available to ZB MED users. By improving the knowledge infrastructure of the LIVIVO platform, AQUAS will initially benefit life-science researchers, students and practitioners in healthcare professions.
It is important to note that AQUAS does not seek to give a final recommendation on which items are worth reading; nor does it seek to censor content. By making its results available through the ZB MED search portal LIVIVO, AQUAS simply supports readers in making an informed decision on the nature of scientific – or pseudo-scientific – literature.
Based on the principles of Open Science, ZB MED will endeavour to make everything involved in the project open source, from the database and the model to the workflow used to train the AI and, where possible, the software used to run the service. This will allow others to benefit from the application and even adapt it for use in other areas. Users will also be provided with clear information on how the AI works.
Duration
1 December 2022 - 30 November 2025
Funding bodies
- German Research Foundation - Scientific Library Services and Information Systems (DFG-LIS)