Natural Language Processing for Digital Disease Detection in a Fast-Moving World

last modified Sep 10, 2015 02:49 PM
Nigel Collier, Department of Theoretical and Applied Linguistics

Accurate and timely collection of facts from a range of sources is crucial for supporting the work of experts in detecting and understanding highly complex diseases. In this talk I illustrate several applications that our group at the Language Technology Laboratory is working on, all of which exploit Natural Language Processing (NLP) on large-scale Web and linked data: (1) in the BioCaster project (JST, 2008-2012), working with global public health colleagues, we employed high-throughput text mining on multilingual news to detect and map infectious disease outbreaks in near real time on a global scale; (2) in the SIPHS project (EPSRC, 2015-2020), we are beginning to explore techniques for encoding personal health reports from a variety of social media sources to support real-time knowledge discovery about infectious diseases and adverse drug reactions; and (3) in the LION project (MRC, 2015-2018), we plan to explore how text mining could support literature-based discovery in cancer biology. Our aim is to develop a tool that cancer biologists can use to generate and test novel research hypotheses on the basis of knowledge already published in the scientific literature.