ugli

Differences

This shows you the differences between two versions of the page.

ugli [2025/12/19 15:14] sylviaugli [2026/02/16 14:40] (current) sylvia
Line 46: Line 46:
 ====SNP array intensity files====
 Raw intensity data from the GSA will be made available to the researchers.
 +
 +====Updated version - v3====
 +Changes in the GSA genotype and imputed data version 3 (Oct 2025)
 +
 +1. During the QC of the Affymetrix data (UGLI2+3), the genetic relationships among all samples genotyped with the CytoSNP, GSA or Affymetrix array were determined and compared with the reported pedigree relations. Some DNA samples appeared not to come from the expected individual and could not be matched to any other individual on the basis of the observed genetic relationships. These samples were therefore excluded, resulting in a new sample size of N=36,233.
 +
 +2. A new phasing and imputation strategy was adopted. The online HRC imputation on the Sanger Imputation Server uses the SHAPEIT2 or EAGLE2 phasing tools and the PBWT imputation tool. These software tools are relatively old (2014). Instead, an in-house approach was used, with a subset of ~11,000 HRC samples as the reference panel, EAGLE2 for phasing and IMPUTE5 for imputation.
 +
 +3. A new PCA was run on the combined data of the CytoSNP, GSA and Affymetrix chips to identify non-European individuals, and a new list of non-European GSA individuals has been constructed from its principal components (PCs). It is up to the researcher whether to exclude these individuals or to correct for population structure in the analysis.
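The exclusion option in point 3 could be sketched as below. This is a minimal illustration only: the sample IDs, the function name `split_samples` and the in-memory lists are hypothetical and do not reflect the actual file formats or identifiers of the release. Researchers who prefer to keep all individuals would instead add the PCs as covariates to their model.

```python
def split_samples(sample_ids, non_european_ids):
    """Return (kept, excluded) sample IDs, preserving the input order.

    Both arguments are plain iterables of sample ID strings; how they are
    read from the release files is left out on purpose (assumption).
    """
    flagged = set(non_european_ids)  # set gives fast membership checks
    kept = [s for s in sample_ids if s not in flagged]
    excluded = [s for s in sample_ids if s in flagged]
    return kept, excluded


# Hypothetical sample IDs, for illustration only.
samples = ["S001", "S002", "S003", "S004"]
non_european = ["S003"]

kept, excluded = split_samples(samples, non_european)
print(f"kept {len(kept)} samples, excluded {len(excluded)}")
```

Keeping the excluded IDs alongside the retained ones makes it easy to report how many individuals were dropped from an analysis.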
  
 \\ \\
Line 53: Line 62:
 As of March 2023, data of an additional 28,149 genotyped participants has been made available. Samples in this release, called UGLI2, were genotyped using the FinnGen Thermo Fisher Axiom® custom array.
  
-29,366 participants were selected for UGLI 2 release and assessed using the pre mentioned array. All genotypes were included for QC screening, but the QC focussed on the the autosomes and chromosomes X for which there are N=617,715 and 22,346 markers available, respectively. A final set of 28,149 samples and 441,596 markers on autosomal and 18,450 X chromosomes markers passed the QC steps described in 
-{{ :qc_report_ugli2_release_1_-v1.pdf |}}.
+29,366 participants were selected for the UGLI2 release and assessed using the aforementioned array. All genotypes were included for QC screening, but the QC focussed on the autosomes and the X chromosome, for which N=617,715 and 22,346 markers are available, respectively. A final set of 28,149 samples, 441,596 autosomal markers and 18,450 X-chromosome markers passed the QC steps described in the quality control report.
  
 ^ UGLI2 - Affymetrix cohort - samples that passed QC           ||
Line 69: Line 77:
  
 ==== Quality Checks ====
-An UGLI2 - Affymetrix (release 2.0) Quality Control Report is available, describing in detail the QC steps that were taken during the quality control (QC) process of the second release of UGLI comprising the genotype of 29,366 participants assessed using the FinnGen Thermo Fisher Axiom® custom array. {{ :qc_report_ugli2_release_1_-v1.pdf |}}
+A UGLI2 - Affymetrix (release 2.0) Quality Control Report is available, describing in detail the steps taken during the quality control (QC) process of the second release of UGLI, which comprises the genotypes of 29,366 participants assessed using the FinnGen Thermo Fisher Axiom® custom array. {{ :qc_report_ugli2_release_2_-v1.pdf |}}
  
 ====Imputation====
Line 81: Line 89:
 ===== UGLI2+3 - Affymetrix =====
 Important to note: **UGLI3 currently has a publication restriction**. The UGLI consortium is currently preparing its manuscript describing the dataset and its primary analyses. Other manuscripts using UGLI3 data may not be submitted **before 31 December 2026**.
-\\ 
-As of december 2025, data of 60,157 genotyped participants has been made available. Samples in this release, called UGLI2+3, were genotyped using the FinnGen Thermo Fisher Axiom® custom array. The sample includes the previously genotyped participants from UGLI2 (see above). The QC and imputation was done on the combined dataset of UGLI2 and UGLI3. 
  
-63,553 participants were selected for UGLI 2+3 release and assessed using the pre mentioned array. All genotypes were included for QC screening, but the QC focussed on the the autosomes and chromosomes X for which there are N=615,682 and 22,346 markers available, respectively. A final set of 60,157 samples and 476,693 markers on autosomal and X chromosomes passed the QC steps described in {{ ::qcreport_ugli2and3_oct2025.pdf |}}.
+As of December 2025, data of 60,157 genotyped participants have been made available. Samples in this release, called UGLI2+3, were genotyped using the FinnGen Thermo Fisher Axiom® custom array. The sample includes the previously genotyped participants from UGLI2 (see above). The QC and imputation were done on the combined dataset of UGLI2 and UGLI3.
 +
 +63,553 participants were selected for the UGLI2+3 release and assessed using the aforementioned array. All genotypes were included for QC screening, but the QC focussed on the autosomes and the X chromosome, for which N=615,682 and 22,346 markers are available, respectively. A final set of 60,157 samples and 476,693 markers on the autosomes and the X chromosome passed the QC steps described in the quality control report.
  
 ^ UGLI2+3 - Affymetrix cohort - samples that passed QC           ||
Line 99: Line 107:
  
 ==== Quality Checks ====
-An UGLI2 - Affymetrix (release 2.0) Quality Control Report is available, describing in detail the QC steps that were taken during the quality control (QC) process of the second release of UGLI comprising the genotype of 29,366 participants assessed using the FinnGen Thermo Fisher Axiom® custom array. {{ :qc_report_ugli2_release_1_-v1.pdf |}}
+A UGLI2+3 - Affymetrix (release 3.0) Quality Control Report is available, describing in detail the steps taken during the quality control (QC) process of the third release of UGLI, which comprises the genotypes of 60,157 participants assessed using the FinnGen Thermo Fisher Axiom® custom array. {{ ::qcreport_ugli2and3_oct2025.pdf |}}
  
 ====Imputation====
 A final set of 60,157 samples and 476,693 markers on the autosomes and the X chromosome passed all QC steps described in the QC report and was used for genetic phasing and imputation. Genetic imputation was done through the
-Sanger imputation service using the Haplotype Reference Consortium (http://www.haplotypereference-consortium.org) panel.
+Sanger imputation service using the Haplotype Reference Consortium (http://www.haplotype-reference-consortium.org) panel.
 Method: See QC report
  