Canada, EU, and Africa combine to allow researchers to analyze health data on the largest, most diverse scale
Common Infrastructure for National Cohorts in Europe, Canada, and Africa (CINECA) is an unprecedented multi-continental project that will build the infrastructure -- data standards, technical protocols, and software -- to allow queries and analyses over distributed data sets that are contributed and controlled by each partner.
Canadian Centre for Computational Genomics, based at SickKids and McGill, to lead in management and analysis of federated health data
A patient develops a rare condition and needs answers, so their clinician searches urgently for patients with similar rare symptoms and similar possible causes. To understand the mechanisms of a debilitating disease, a medical researcher tries to separate the “signal” of that disease’s causes from the “noise” of natural biological variation in human lives and conditions.
Getting the answers those patients and researchers need requires the ability to analyze or query health and genomic data from an enormous number of patients - patients who have their own needs, and deserve to have their data kept at the highest levels of security and privacy.
On January 24, 2019, a collaboration of African, Canadian, and EU researchers came together to announce the CINECA (Common Infrastructure for National Cohorts in Europe, Canada, and Africa) project. CINECA is an unprecedented multi-continental project that will build the infrastructure -- data standards, technical protocols, and software -- to allow queries and analyses over distributed data sets that are contributed and controlled by each partner.
The Hospital for Sick Children (SickKids) is directly involved with CINECA and, along with its Canadian Distributed Infrastructure for Genomics (CanDIG) partner institution McGill University, will lead the building of standard methods for federating queries and actively participate in building compatible and interoperable systems for login, access control, and running complex distributed analyses. Canada’s health data system has always been federated, and CanDIG’s experience with building federated queries and analyses over locally controlled private health data is essential to the project.
“The technical goals we have set for ourselves are ambitious”, said Dr. Michael Brudno, Senior Scientist in the Genetics & Genome Biology program at SickKids and PI of the CanDIG project. “But we’re confident that we can not only meet those goals, but build open-source standards-based solutions for the entire community.”
“CanDIG is already connecting several important Canadian health data sets in cancer research”, said Guillaume Bourque, Director of the Centre for Computational Genomics at McGill and Co-PI of CanDIG. “As part of this project, we are proposing to connect additional Canadian data sets, and then connect those to an even larger number of data sets internationally. Those new connections between data sets are going to allow Canadian researchers much deeper insight into even that data that they already had access to.”
“Key to this project’s success is trusted, reliable, federated data querying and analysis”, said Steve Jones, Head of Bioinformatics and Co-Director, Michael Smith Genome Sciences Centre, and Co-PI of CanDIG. “We’ve shown how this can be done in support of real science and insight, while retaining control over the data we have been entrusted with; and we’re excited to bring our expertise in data federation to the international community.”
The CINECA project is funded by both the EU, through the Horizon 2020 Research and Innovation Programme, and the Canadian Government, through the Canadian Institutes of Health Research.
CanDIG is a Canadian national health and genomics platform that allows authorized queries and analyses over locally controlled private data sets.
The Canadian Centre for Computational Genomics provides bioinformatics analysis and HPC services for the life science research community.