
The Unreasonable Effectiveness of Co-occurrence Based Models

last modified Sep 10, 2015 02:49 PM
Gabriel Recchia, CRASSH

Given the current excitement over complex and computationally intensive machine learning techniques such as deep learning and conditional random fields, it may seem unlikely that much useful information could be obtained simply by computing normalized counts of the number of times that pairs of words co-occur (within a sentence or a lexical window of a particular size, for example). However, co-occurrence-based methods have proven surprisingly useful in a wide variety of natural language processing tasks. Furthermore, their simplicity and transparency afford them certain advantages in particular contexts. I will present research demonstrating the surprising success of co-occurrence-based methods in a variety of circumstances: predicting the degree to which two words are similar in meaning, predicting the excavation sites of archaeological artifacts, and predicting the grammatical classes of words in an unsupervised fashion, among others. Practical tips and software packages for such methods will also be discussed.
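As a rough illustration of what "normalized counts" of co-occurring word pairs can look like, here is a minimal Python sketch. It is not taken from the talk: the window size, the toy sentence, the function names, and the choice of pointwise mutual information (PMI) as the normalization are all assumptions made for the example, and the probability estimates are deliberately simplistic.

```python
import math
from collections import Counter


def cooccurrence_counts(tokens, window=2):
    """Count unordered pairs of distinct words at most `window` tokens apart."""
    pair_counts = Counter()
    for i, word in enumerate(tokens):
        # Look only to the right so each pair of positions is counted once.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                pair_counts[tuple(sorted((word, other)))] += 1
    return pair_counts


def pmi(pair_counts, tokens):
    """Normalize raw pair counts with pointwise mutual information.

    Assumption for this sketch: pair probabilities are estimated from the
    total number of counted pairs, word probabilities from token frequency.
    """
    word_counts = Counter(tokens)
    total_words = len(tokens)
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (a, b), n in pair_counts.items():
        p_pair = n / total_pairs
        p_a = word_counts[a] / total_words
        p_b = word_counts[b] / total_words
        scores[(a, b)] = math.log2(p_pair / (p_a * p_b))
    return scores


# Toy corpus (hypothetical), used only to exercise the functions.
tokens = "the cat sat on the mat while the dog sat on the rug".split()
counts = cooccurrence_counts(tokens, window=2)
scores = pmi(counts, tokens)

# Print the five most strongly associated pairs under this normalization.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(pair, round(score, 2))
```

On a real corpus one would typically count within sentences or larger windows and apply smoothing or a frequency threshold, since PMI tends to overrate rare pairs.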
