
Big Data and the Role of Statistical Scalability

When Feb 28, 2018
from 09:30 AM to 05:00 PM
Where Isaac Newton Institute, Cambridge


The ability to collect and store data has increased exponentially in recent years. So too have the challenges of managing the huge volumes generated and extracting meaningful information from them. However, it is widely acknowledged that such ‘Big Data’ has the potential to transform many aspects of people’s lives, particularly in data-rich areas including industry, government agencies, science and technology.

The important role of statistics within Big Data has been clear for some time. Statistical techniques for handling population sampling, confounders, multiple testing, bias, overfitting and, more generally, variation in the data are essential for modelling and effective analysis. A major challenge of working with Big Data is that the volume can exceed what is computationally feasible, and traditional methods can fail to scale up. There has also been a tendency to focus purely on algorithmic scalability, e.g. developing versions of existing statistical algorithms that scale better with the amount of data. However, such approaches ignore the fact that fundamentally new issues often arise and that highly innovative solutions are required.


Aims and Objectives

This knowledge exchange event by the Turing Gateway to Mathematics will seek to extend the reach of the research being undertaken as part of the INI Statistical Scalability Research Programme. It will open up the discussion to a wider audience, including those working in multiple industrial sectors, Government and the public sector.

Because interest in Big Data is so intense, the field is developing very rapidly. This event will therefore facilitate the dissemination of state-of-the-art statistical research and highlight a number of key future research directions, such as:

  • Statistical inference after model selection
  • Model misspecification
  • Heterogeneity
  • Trade-offs between statistical and computational efficiency
  • Sequential decision problems
  • New data types

There will also be three end-user sessions featuring speakers from the health, energy and communications sectors. Speakers will describe how Big Data scaling is managed in their organisations and the challenges they face. Each session will include time for discussion and feedback from the audience.

The workshop will include a poster exhibition, which will run during lunch and the drinks/networking session, and will finish with a short discussion and question session. It is expected to bring together industrial and academic experts from a diverse set of backgrounds and areas, including healthcare, medicine, manufacturing, finance, defence, engineering, security, communications, Government and the public sector.


Registration and Venue

A registration fee is charged to cover attendance at this event.

This is £25 for academic and public sector attendees and £50 for industrial attendees.

Further information, and details of how to register, can be found here.