Information about genetic data in the Tromsø Study


The Tromsø Study resource has the following genomic data:

- Genotypes and imputation for 31,000 participants (of 45,000 participants in total)

- Exome sequences for 2,000 participants

 

What are genetic analyses?  

Genetic analyses use a range of techniques and procedures to study an individual’s genetic material (DNA). Using information about genes or genotypes in populations, researchers worldwide seek insight into genetic variation and how it relates to genetic predisposition, ancestry, and risk of disease. Studies can examine a large number of genetic markers across the genome (genome-wide) or specific variants in a gene of interest, such as single nucleotide polymorphisms (SNPs). Approaches such as genome-wide association studies (GWAS) and Mendelian randomization (MR) are often used in this research.
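As an illustration of the idea behind a GWAS, the sketch below tests a single marker for association with a case/control trait using an allelic chi-square test. It is a minimal example and not the analysis pipeline of the Tromsø Study: the function names, the 0/1/2 genotype coding, and the toy data are assumptions made for illustration. A real GWAS repeats a test like this for every marker, usually with regression models that adjust for covariates, and applies a stringent significance threshold (conventionally p < 5e-8).

```python
from math import erfc, sqrt


def allele_counts(genotypes):
    """Count alternate and reference alleles.

    Genotypes are coded as the number of alternate alleles (0, 1 or 2).
    This coding is an assumption of the sketch.
    """
    alt = sum(genotypes)
    ref = 2 * len(genotypes) - alt
    return alt, ref


def allelic_chi_square(case_genotypes, control_genotypes):
    """Chi-square test (1 degree of freedom) on the 2x2 allele count table."""
    a, b = allele_counts(case_genotypes)     # cases: alt, ref
    c, d = allele_counts(control_genotypes)  # controls: alt, ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # Monomorphic marker or empty group: the test is not informative.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Toy data (hypothetical): six cases and six controls at one marker.
cases = [2, 2, 1, 1, 1, 0]
controls = [0, 0, 0, 1, 0, 0]
chi2, p = allelic_chi_square(cases, controls)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

The allelic test is used here only because it needs no external libraries; it treats each allele as an independent observation, which is a simplification.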

Introduction and status for genetic data in the Tromsø Study 

By the end of 2024, genetic data for ~31,000 participants in the Tromsø Study will be accessible for research projects. The dataset covers 69% of the Tromsø Study participants. Approximately 9,000 participants were genotyped in 2014-2015 and approximately 22,000 in 2020-2021. The genotyping and preprocessing of data were harmonized with the procedures used for the HUNT Study genetic data, and the data are comparable to genetic data from other population-based studies in Norway. Current efforts aim to impute the genotyped data using available reference panels. The recent genotyping was funded by UiT The Arctic University of Norway, whereas previous genotyping initiatives were funded by individual projects.

All genotyping was conducted at the NTNU Genomic Core Facility (GCF) using Illumina HumanCoreExome arrays (see details below). The analyses were performed in parallel with samples from the HUNT Study, and the planning and quality control of data were performed in collaboration with researchers at the HUNT Centre for Molecular and Clinical Epidemiology, NTNU.

Genome-wide genotyping

Brief description of the laboratory analyses

DNA extraction

The HUNT Biobank, NTNU, at Levanger, Trøndelag, Norway, is an established, modern research biobank with high-technology equipment for storage, analysis, sample handling and delivery of samples. HUNT Biobank is ISO 9001 certified.

DNA was extracted from buffy coat or whole blood samples, normalized, and dispensed onto 96-well plates, with samples randomized according to sex and survey.
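The exact randomization procedure is not described here. The sketch below shows one possible reading, in which samples are shuffled within each sex/survey group and then dealt across plates so that every plate receives a similar mix. The function name, the input format, and the omission of control wells are all assumptions made for illustration.

```python
import random
from collections import defaultdict

WELLS_PER_PLATE = 96


def assign_plates(samples, seed=0):
    """Assign samples to plate wells, balancing sex and survey across plates.

    samples: list of (sample_id, sex, survey) tuples (hypothetical format).
    Returns a dict mapping sample_id to (plate_index, well_index).
    """
    rng = random.Random(seed)

    # Group samples into strata defined by sex and survey.
    strata = defaultdict(list)
    for sample_id, sex, survey in samples:
        strata[(sex, survey)].append(sample_id)

    # Shuffle within each stratum, then concatenate the strata.
    ordered = []
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)
        ordered.extend(ids)

    # Deal samples round-robin so each stratum is spread over all plates.
    n_plates = -(-len(ordered) // WELLS_PER_PLATE)  # ceiling division
    plates = [[] for _ in range(n_plates)]
    for i, sample_id in enumerate(ordered):
        plates[i % n_plates].append(sample_id)

    # Randomize well positions within each plate.
    layout = {}
    for plate_index, ids in enumerate(plates):
        rng.shuffle(ids)
        for well_index, sample_id in enumerate(ids):
            layout[sample_id] = (plate_index, well_index)
    return layout


# Toy input (hypothetical): 200 samples from four surveys.
samples = [(f"S{i}", "F" if i % 2 else "M", f"T{4 + i % 4}") for i in range(200)]
layout = assign_plates(samples, seed=1)
print(layout["S0"])
```

Spreading each group over all plates helps keep plate-level technical effects from being confounded with sex or survey.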

Genotyping analyses

The genotyping was conducted at the NTNU GCF, which offers services including high-throughput genetic analyses. The GCF has broad experience in genotyping and sequencing analyses of DNA and other genetic markers using Illumina platforms. GCF performed the genotyping analyses of the Tromsø Study samples in 2015 and in 2020-2022 with largely the same procedures and personnel.

In the 2015 analyses, genotyping was performed using Illumina HumanCoreExome arrays (HumanCoreExome 12 v1.1), whereas the 2022 analyses used a customized HumanCoreExome array (UM HUNT Biobank v1.0 / Infinium CoreExome-24+ v1.3), which included variants selected by researchers associated with the HUNT and Tromsø Study surveys. The custom content comprised approximately 60,000 markers, chosen to make the array as similar as possible to the one used in the previous analyses and tailored to the research interests of the HUNT and Tromsø studies. The Illumina arrays contained around 600,000 directly measured genetic markers (550,000 + 60,000 custom content). Further descriptions can be found here:

HumanCoreExome12 v1.1: https://support.illumina.com/downloads/humancoreexome-12-v1-1-product-files.html  

UM HUNT Biobank v2.0: https://support.illumina.com/downloads/infiniumcoreexome-24-v1-3-product-files.html 

 

Quality control (QC) of data

Further processing of the genotyping results was performed in collaboration with the HUNT Centre for Molecular and Clinical Epidemiology (formerly the K. G. Jebsen Centre for Genetic Epidemiology). The work followed a strict QC protocol and resulted in genotypes for 358,964 polymorphic genetic markers.
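The details of the QC protocol are not given here. As a rough illustration of what marker-level QC can involve, the sketch below keeps markers that have a high call rate and are polymorphic (minor allele frequency above zero). The function names, the genotype coding, and the 0.99 call rate threshold are assumptions made for illustration, not the thresholds used for the Tromsø Study data. QC protocols typically also include sample-level checks and further marker-level tests.

```python
def marker_stats(genotypes):
    """Return (call_rate, minor_allele_frequency) for one marker.

    genotypes: one value per sample, coded 0, 1 or 2 (copies of the
    alternate allele) or None for a missing call (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes) if genotypes else 0.0
    if not called:
        return call_rate, 0.0
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return call_rate, maf


def passes_qc(genotypes, min_call_rate=0.99, min_maf=0.0):
    """Keep markers with enough calls and a minor allele frequency above min_maf.

    The default thresholds are hypothetical; min_maf=0.0 keeps every
    polymorphic marker and drops monomorphic ones.
    """
    call_rate, maf = marker_stats(genotypes)
    return call_rate >= min_call_rate and maf > min_maf


# Toy data (hypothetical): three markers typed in four samples.
markers = {
    "rs_example_1": [0, 1, 2, 1],     # polymorphic, fully called
    "rs_example_2": [0, 0, 0, 0],     # monomorphic
    "rs_example_3": [1, None, 2, 0],  # one missing call
}
kept = [name for name, g in markers.items() if passes_qc(g)]
print(kept)
```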

 

Imputation

A process known as "imputation" is often applied to genotyped data. It statistically infers genetic variation in regions of the DNA that have not been directly measured, using the measured genotypes together with a reference panel of fully characterized genomes. This expands the dataset from hundreds of thousands of genetic markers to many millions.

Current efforts aim to impute the genotyped data using the Haplotype Reference Consortium (HRC), TOPMed, and 1000 Genomes reference panels, via the Sanger Imputation Service (UK), to expand the dataset to more than 33 million variants.
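To give a feel for the principle, the toy sketch below fills in unmeasured markers on a single haplotype by copying alleles from the reference haplotype that best matches the measured markers. This is a deliberate simplification: production imputation tools use probabilistic models over phased haplotypes and large reference panels, and the function name and data here are hypothetical.

```python
def impute_haplotype(target, reference_panel):
    """Fill untyped markers by copying from the best-matching reference haplotype.

    target: list of alleles (0 or 1), with None at markers that were not typed.
    reference_panel: list of complete haplotypes over the same markers.
    """
    def mismatches(ref):
        # Count disagreements at the markers that were actually typed.
        return sum(1 for t, r in zip(target, ref) if t is not None and t != r)

    best = min(reference_panel, key=mismatches)
    # Keep typed alleles as measured; copy the rest from the best match.
    return [r if t is None else t for t, r in zip(target, best)]


# Toy reference panel (hypothetical): three haplotypes over five markers.
panel = [
    [0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1],
]
typed = [1, None, 0, 1, None]  # markers 2 and 5 were not on the array
print(impute_haplotype(typed, panel))
```

Measured alleles are never overwritten in this sketch; only the gaps are filled.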

 

Data management and storage 

Generating genetic data entails handling and storing sensitive information. The Tromsø Study data is stored in the HUNT Cloud service at NTNU. HUNT Cloud is a digital infrastructure specifically designed to enable secure storage, access control, and research analysis of sensitive data. HUNT Cloud complies with two international standards: ISO/IEC 27001 for information security and privacy, and ISO 9001 for quality management.

QC processing, long-term storage, and data governance of the Tromsø Study genotype data all take place in HUNT Cloud. The data will be managed as part of the Tromsø Study biomarker data.

 

Description of sample sets

Overview of participants with genetic data  

A total of 45,473 participants attended at least one of the first seven surveys in the Tromsø Study, and a subset of 31,280 (69%) have been genotyped. The genotyping was performed in three batches: n=5,527 from Tromsø4, performed in 2014; n=3,423 from Tromsø6, performed in 2015; and n=22,555 from Tromsø4-Tromsø7, performed in 2020-2021. DNA was extracted from samples collected in several surveys from Tromsø4 onwards, but only one sample was analysed per person.

All participants were initially planned for genotyping, but some could not be included because of: i) limitations in the consent given by the participant; ii) withdrawal from the study; or iii) no available sample for DNA extraction.

Specific informed consent is required for genotyping. For those who participated in Tromsø4 only and not in subsequent surveys, the consent given in Tromsø4 was deemed insufficient to cover genotyping. In December 2020, an information letter was sent to participants (passive consent). Deceased participants were not included in the genotyping.

[Table 1]

[Table 2]

Whole exome sequencing 

Whole-exome sequencing measures the regions of the genome (about 2%) that are involved in coding for proteins and is particularly suitable for identifying disease-causing and/or rare genetic variants. 

 

How can I get access? 

The genetic data is governed by Norwegian legislation and cannot be deposited in a public repository. Researchers affiliated with Norwegian research institutions can apply for data after approval by a Regional Committee for Medical and Health Research Ethics. Researchers from outside Norway can gain access by collaborating with a Norwegian researcher.

Applications for data access follow the procedures described here. If access is granted, a dataset will be made available in, or transferred to, an approved secure computing solution such as HUNT Cloud or TSD. Note that fees for storage and analysis apply. We strongly encourage researchers to make summary statistics publicly available in open repositories, for example Dataverse.

 

Further questions? 

Send an email to the Tromsø Study at tromsous@uit.no