Cambridge Big Data works with groups of researchers, Departments, and other Strategic Research Initiatives and Networks to promote collaboration, share ideas, scope out key research questions in data science, and support the development of interdisciplinary funding proposals.
Cambridge Big Data members are eligible to apply for small amounts of seed funding for the organisation of research workshops (for more information, see the call for proposals).
Research challenges are areas of interdisciplinary strength at Cambridge, for which Cambridge Big Data provides ongoing support. This includes assisting in the organisation of workshops and seminar series, preparing applications for research funding, and engaging external partners to develop new collaborations and research directions. Cambridge Big Data currently supports the following Challenge areas:
Algorithms and Systems for Energy Efficient Computing
Joint activity between the Energy@Cambridge and Big Data Strategic Research Initiatives.
Algorithms and Systems for Energy Efficient Computing is a joint Grand Challenge between the Energy@Cambridge and Big Data Strategic Research Initiatives. The ICT industry has proved a major stimulus for world-wide economic growth over the last two decades, but this has come at a cost in terms of growing energy demand. Increased energy efficiency is required, not just for environmental reasons, but to exploit opportunities using Big Data concepts. This Grand Challenge focuses specifically on: energy efficiency algorithms; novel energy-efficient architectures for Big Data challenges; and energy-efficient system design, data management and programming models, with strong links to the Energy@Cambridge Grand Challenge in Materials for Energy Efficient ICT.
The Ethics of Big Data
As a society we are creating ever-larger volumes and varieties of data, which are also being shared at increasing velocities. The embedding of sensor networks in ‘smart cities’, the rapid expansion of mobile phone and, particularly, mobile internet use, and the growth of social, political and cultural interactions on social media platforms are some of the factors behind this phenomenon. Methods and tools for the computational analysis of such massive and complex datasets are being adopted in a wide range of settings by governments, international institutions, corporations, civil society organisations and academic researchers. However, the growing prevalence of big data research across the disciplines has significantly outpaced our knowledge of its ethical ramifications (boyd and Crawford 2012).
The aims of this interdisciplinary Ethics of Big Data Research Group are to explore these ethical ramifications and to develop concrete resources for scholars conducting big data research. In addition, the Research Group intends to contribute to our understanding of research ethics more broadly in terms of their relationship to rapidly evolving research practices and in terms of how they translate across disciplines.
The programme of the research group for the first term will focus on the ‘what’, ‘how’, ‘who’ and ‘why’ of the Ethics of Big Data, which we will explore through a series of public seminars and workshops.
What is big data?
How is big data produced?
Who creates, collects and researches big data?
Why do ethical approaches to big data matter?
These questions naturally provoke others: What is old and what is new in big data research? Are there specific ethical challenges arising from big data research? Which ethical frameworks can we use or adapt to meet those challenges?
Research Workshops are an important part of our programme. Workshops range from half-day meetings between researchers in Cambridge Departments to discuss specific research questions, to multi-day conferences with external speakers and delegates. Workshop outcomes include strengthened networks, intellectual exchange and the development of new project ideas.
We can also provide small amounts of seed funding for the organisation of research workshops (for more information, see the call for proposals).
10 June 2016
The workshop supported an interdisciplinary conversation at the University of Cambridge about the ethics of big data research. Its aims were both to raise awareness of ethical issues associated with big data and to contribute to the development of material for the Research Group’s digital reader - a publicly accessible, interactive online resource on the ethics of big data research.
We invited speakers from the worlds of academia and policy to discuss the ethical challenges of big data research. The Ethics of Big Data team also presented an overview of our findings from the year’s programme of activities, including the development of innovative formats for discussing ethics in research, through the performance of a mock ethics review.
The Ethics of Big Data Research Group are developing an Ethics of Big Data reader from the discussions over the past year. This will include case studies and performance notes and scenarios for researchers looking to stage a mock ethics review panel as a tool for engaging researchers, students, or practitioners in other contexts in discussions about ethics in big data research. A journal article is also in preparation.
- Human-Data Interaction (20 April 2015, 14:00 - 16:00)
- What is Big Data? Discovery through a Data Walkshop (7 October 2015, 14:00 - 16:00)
- Inside Snowden’s suitcase (21 October 2015)
- Ethics of Big Data in practice: Health and Policy research in Africa (13 January 2016, 12:00 - 14:00)
- Ethics of Big Data in practice: Patient record linkage in hospitals (27 January 2016, 12:00 - 14:00)
- Ethics of Big Data in practice: Administrative data (10 February 2016, 12:00 - 14:00)
- Ethics of Big Data in Social Media Research (24 February 2016, 12:00 - 14:00)
Data Science for Smart Infrastructure
31 May 2016
This collaborative workshop brought together researchers in data science with the Centre for Smart Infrastructure and Construction, to address challenges in management of data from distributed sensor networks, as well as techniques for the analysis and optimisation of traffic data.
14-15 March 2016
The conference brought together 72 delegates from institutions across the UK and Europe to discuss approaches to the preservation and long-term curation of digital data in a range of disciplines.
The conference focused largely on scientific and research data, but also looked at the challenges in national archives and memory institutions. Future areas for research include personal data archives, particularly those involving new forms of data, such as social media, online interactions, email and photos, which have unique sensitivities, for example their vulnerability to changing commercial policies and discontinuation of services.
The specific objectives were:
- To assemble a broad and diverse community of interest
- To identify key shared challenges and share knowledge and expertise in digital preservation
- To better define the required areas of research, including technology research
- To assess and define additional areas of training, education and skills development in long-term data preservation for science and research
- To inform the case for sustained investment in preservation and in education around preservation models and their associated cost
The range of disciplines covered in the talks included high energy physics, astronomy, infrastructure modelling, bioinformatics, libraries, archives, history, policy, medical research and law.
Videos of the keynote talks, and slides for the majority of the talks on both days of the conference, are available at the conference website, along with a report.
9 March 2016
We are currently experiencing many exciting new developments in imaging technology in biology and medicine. New advances in tomographic imaging, such as photoacoustic tomography, electron tomography, multicontrast magnetic resonance tomography (MRT) and combined MR with positron emission tomography (PET), as well as new technology in microscopy such as lightsheet microscopy, mark only the beginning of an era that is revolutionising the extent of what we can see. New imaging technology always goes hand in hand with the need for mathematical models to maximise the information gain from these novel imaging techniques.
This one-day meeting aimed to bring together those working on advances in imaging technology with researchers who investigate new image analysis methods, to help address these challenges. In particular, there was a focus on the following topics:
- Big data problems and solutions
- Dynamic imaging
1 February 2016
IfM hosted a workshop exploring the role Big Data will play in the future of manufacturing. The objective of the meeting was to identify research priorities that address the specific challenges encountered by manufacturers in using data science. Four key topics were discussed:
- Challenges of Big Data analytics in manufacturing
- Best practices and applications for manufacturing analytics
- Technologies and ICT infrastructure for Big Data analytics and deployment
- New business models for smart manufacturing systems
The workshop identified seven main challenges of big data analytics in manufacturing: (1) awareness and acceptance, resulting from resistance to cultural change; (2) the need for Big Data standardisation; (3) data management issues such as data quality and integration; (4) financial constraints, such as the perceived value of big data; (5) the knowledge and skillset needed to implement analytics solutions; (6) policy and government support; and (7) the need for repeatable best-practice implementation processes.
A report on recommendations for future research is forthcoming.
22 January 2016
This EPSRC-funded workshop brought together high-dimensional big data researchers from academia with practitioners from industry. The presentations given by the invited speakers covered state-of-the-art research and cutting-edge technologies, spanning both the theoretical foundations of big data analysis and the algorithms and data structures required for high performance in analysis, indexing and search. As well as enhanced collaboration networks, outcomes include:
- A publication in IEEE Transactions on Big Data, on approximation of high-dimensional data http://www.cl.cam.ac.uk/~lw525/publications/kvasir-tbd.pdf
- A publication at the IEEE INFOCOM conference, on how to efficiently transmit data http://www.cl.cam.ac.uk/~lw525/publications/renewable_infocom_2016.pdf
- A working paper on optimal data structure, in preparation for submission http://arxiv.org/pdf/1509.06957v1.pdf
- A functioning system on high-dimensional data, Kvasir project, http://www.cl.cam.ac.uk/~lw525/kvasir/
- A successful proposal for a workshop on “Advances in High-Dimensional Data” at the IEEE Big Data Conference in December 2016 http://cci.drexel.edu/bigdata/bigdata2016/. The website of the previous year’s workshop can be found here: https://sites.google.com/site/adhdbigdata/
In collaboration with Cambridge Neuroscience
25 November 2015
On 25th November 2015, over 70 researchers from across the University of Cambridge gathered for an interdisciplinary workshop at Corpus Christi College on Neurocomputation: from brains to machines, chaired by Professor Zoe Kourtzi and organised by Cambridge Neuroscience with support from Cambridge Big Data. The workshop aimed to advance our understanding of how biological and artificial systems solve sensory and motor challenges, and brought together speakers from a range of disciplines, from cognitive neuroscience and brain imaging to engineering, computer vision and robotics. The goal was to encourage dialogue using a common language of computational techniques that allow us to extract informative signals from rich biological data and design artificial systems with practical applications.
A recurring theme of the workshop was the progress that has been achieved in developing computational models of the brain that are also biologically plausible, highlighting the importance of a continuing dialogue between biological scientists and engineers to uncover, and take inspiration from, the mechanisms underlying brain function and cognition.
See here for a summary of the talks.
In collaboration with Cambridge Public Policy
24 September 2015
Big data is expanding in its contribution across the social sciences and in public policy. While there are many practical, technical and ethical questions associated with this trend, a recent Cambridge workshop focused on its practical application across the social sciences. The workshop was hosted by two interdisciplinary research initiatives at the university, in Big Data and Public Policy. It brought together researchers from a range of departments and disciplines to present and discuss big data applications in social science research and public policy.
A full report of the workshop is available here.
In collaboration with the British Antarctic Survey
7 September 2015
Remote and extreme environments present new challenges for scientific data acquisition, processing and transfer. From extremely cold and remote locations in the Antarctic to novel geotechnical sensing technologies and the Big Data challenge of the Square Kilometre Array, which will operate from remote desert locations, scientists share challenges of accessibility, physical conditions, power supply and networking.
This joint workshop between the University of Cambridge and the British Antarctic Survey brought together researchers, developers and engineers addressing complementary challenges in data acquisition, transfer and processing, to share knowledge, develop new connections and collaborations, and experience hands-on demonstrations of state-of-the-art data acquisition and processing technology.
19 June 2015
The data generated by medical care and medically relevant research are rapidly becoming bigger and more complex, particularly with the advent of new technologies. Our ability to advance medical care and efficiently translate science into modern medicine is bounded by our capacity to access and process these big data. From human genetics and pathogen genomics to routine clinical documentation, from internal imaging to motion capture, from digital epidemiology to pharmacokinetics, and from treatment pathways to life course assessment, the big Vs of Big Data - volume, variety, velocity and veracity - abound in medicine. Statistical, mathematical, visualisation and computational approaches from a wide range of disciplines, as well as systems for innovative ICT-based interventions, are needed to keep pace with the complexity in Big Data and to advance medicine.
On 19th June 2015 at the Cancer Research UK Cambridge Institute, Cambridge-based researchers from all Schools of the University and local research institutes, the pharmaceutical industry and our funding and commissioning partners met for an afternoon of talks demonstrating methods and opportunities for harnessing Big Data in medicine.
26 January 2015
Big Data is everywhere, spanning the entire range of academic research. No matter what you do – from humanities to natural sciences, from social sciences to engineering to medicine – you are bound to come across copious amounts of data: this is the outcome of modern technology which allows us to collect, measure and sample. Yet data on its own, no matter how “Big”, is of little use. The challenge is to distill Big Data into actionable, useful information. This requires a range of tools from mathematics, statistics and computer science, methodologies which might appear intimidating to the uninitiated.
On 26 January 2015, eight Cambridge academics gave short presentations introducing The Vocabulary of Big Data, the range of concepts and ideas which underlie modern analysis of large data sets. In a maths-free manner, light on technicalities yet rich in content, they aimed to outline the meaning and intuition behind the methodologies. Click the link above to view the talks.