Mineotaur: interactive visual analytics for high-content microscopy screens

Event: Big Data in Medicine: Exemplars and Opportunities in Data Science

Bálint Antal, Department of Genetics, University of Cambridge

Bálint Antal, Anatole Chessel, Rafael E. Carazo Salas

University of Cambridge, Department of Genetics

ba328@cam.ac.uk, ac744@cam.ac.uk, cre20@cam.ac.uk

Abstract

Despite the ground-breaking discoveries in genomics, the genomes of most organisms remain black boxes, with the function of the majority of genes and gene products still unknown. Moreover, many genes and proteins play roles in multiple biological processes. High-throughput/high-content microscopy-based screening (HT/HCS) provides an increasingly powerful tool to discover and functionally annotate genes and biological pathways, and has already led to several important discoveries, such as the systematic identification of genes important for mitosis, endocytosis, and other fundamental processes. However, specialised large-scale image and data analysis methods are needed to produce phenotypic data, which limits such functional genomic annotation techniques to research groups that possess that expertise. As a result, the community at large is limited in its access to the data and its ability to mine it further after publication, reducing the impact of expensive HT/HC screens. Overall, while technical advances have led to an explosion in the amount of data being acquired, suitable data handling, visualization and analysis techniques are still lagging behind.

Here we propose a novel data visualization tool called Mineotaur (http://www.mineotaur.org), which will allow the community to further mine the raw multidimensional feature data and knowledge from published HT/HC screens, leading to better exploitation of experimental results. The user interface allows members of the community without any computational expertise to extract meaningful information from the data. The web interface can be used to query the data, and the results are visualized as plots (e.g. scatter plots, histograms) in real time. The tool is based on a novel data model that allows the visualization and analysis of extremely large amounts of data. As a demonstrative example, we use phenotypic data extracted from a high-throughput fission yeast screen.