Franke group tools, consortia projects and federated data pipelines

Tools and Methods

2025

The Asterix pipeline calls CYP2D6 copy number variants to generate low-cost, clinical-grade pharmacogenetic passports for 12 PGx genes. Asterix code is available at: https://github.com/molgenis/asterix

Reference: Lanting P, Warmerdam R, Slager J, Brugge H, Ochi T, Benjamins M, Lopera-Maya E, Jankipersadsing S, Gelderloos-Arends J, Teuben D, Hendriksen D, Charbon B, Johansson L, Munnink TO, de Boer-Veger N; Lifelines NEXT Cohort Study; Lifelines Cohort Study; Wilffert B, Swertz M, Touw D, Deelen P, Knoers N, Dekens J, Franke L. Low-cost generation of clinical-grade, layperson-friendly pharmacogenetic passports using oligonucleotide arrays. Am J Hum Genet. 2025 May 1;112(5):1015-1028. doi: 10.1016/j.ajhg.2025.03.003.

2024

Downstreamer is a framework that prioritizes key genes from GWAS summary statistics using 57 tissue-specific co-expression networks derived from Recount3 data. Downstreamer is freely available at: https://github.com/molgenis/systemsgenetics/wiki/Downstreamer

Reference: Bakker OB, Claringbould A, Westra HJ, Wiersma H, Boulogne F, Võsa U, Urzúa-Traslaviña CG, Mulcahy Symmons S, Zidan MMM, Sadler MC, Kutalik Z, Jonkers IH, Franke L, Deelen P. Identification of rare disease genes as drivers of common diseases through tissue-specific gene regulatory networks. Sci Rep. 2024 Dec 4;14(1):30206. doi: 10.1038/s41598-024-80670-1.

PICALO uses principal interaction component analysis to identify discrete technical, cell-type, and environmental factors that mediate eQTLs. The software is available at https://github.com/molgenis/PICALO under the BSD 3-Clause “New” or “Revised” License.

Reference: Vochteloo M, Deelen P, Vink B; BIOS Consortium; Tsai EA, Runz H, Andreu-Sánchez S, Fu J, Zhernakova A, Westra HJ, Franke L. PICALO: principal interaction component analysis for the identification of discrete technical, cell-type, and environmental factors that mediate eQTLs. Genome Biol. 2024 Jan 22;25(1):29. doi: 10.1186/s13059-023-03151-0.

2023

OTTERS (Omnibus Transcriptome Test using Expression Reference Summary data) is a TWAS framework that adapts multiple polygenic risk score methods to estimate eQTL weights from summary-level eQTL reference data and conducts an omnibus TWAS. Source code is available from https://github.com/daiqile96/OTTERS.

Reference: Dai Q, Zhou G, Zhao H, Võsa U, Franke L, Battle A, Teumer A, Lehtimäki T, Raitakari OT, Esko T; eQTLGen Consortium; Epstein MP, Yang J. OTTERS: a powerful TWAS framework leveraging summary-level reference data. Nat Commun. 2023 Mar 7;14(1):1271. doi: 10.1038/s41467-023-36862-w.

2022

Idéfix is a method that uses polygenic risk scores to identify accidental sample mix-ups in biobank data. It can be used to select a set of high-quality participants whose samples are very unlikely to be mixed up. It is freely available at https://github.com/molgenis/systemsgenetics/wiki/Idefix.

Reference: Warmerdam R, Lanting P; Lifelines Cohort Study; Deelen P, Franke L. Idéfix: identifying accidental sample mix-ups in biobanks using polygenic scores. Bioinformatics. 2022 Jan 27;38(4):1059-1066. doi: 10.1093/bioinformatics/btab783.

2020

Decon-cell & Decon-eQTL. Decon2 is a statistical framework that estimates cell counts from molecular profiling data, such as expression or methylation data, in heterogeneous samples such as bulk blood (Decon-cell) and then deconvolutes expression quantitative trait loci into each cell subpopulation (Decon-eQTL). Decon2 is available as an R package and Java application at https://github.com/molgenis/systemsgenetics/tree/master/Decon2 and as a web tool at www.molgenis.org/deconvolution.

Reference: Aguirre-Gamboa R, de Klein N, di Tommaso J, Claringbould A, van der Wijst MG, de Vries D, Brugge H, Oelen R, Võsa U, Zorro MM, Chu X, Bakker OB, Borek Z, Ricaño-Ponce I, Deelen P, Xu CJ, Swertz M, Jonkers I, Withoff S, Joosten I, Sanna S, Kumar V, Koenen HJPM, Joosten LAB, Netea MG, Wijmenga C; BIOS Consortium; Franke L, Li Y. Deconvolution of bulk blood eQTL effects into immune cell subpopulations. BMC Bioinformatics. 2020 Jun 12;21(1):243. doi: 10.1186/s12859-020-03576-5.

2019

GeneNetwork uses gene co-regulation to predict pathway membership and associations to Human Phenotype Ontology terms by integrating 31,499 public RNA-seq samples. To use GeneNetwork online see: genenetwork.nl

Reference: Deelen P, van Dam S, Herkert JC, Karjalainen JM, Brugge H, Abbott KM, van Diemen CC, van der Zwaag PA, Gerkes EH, Zonneveld-Huijssoon E, Boer-Bergsma JJ, Folkertsma P, Gillett T, van der Velde KJ, Kanninga R, van den Akker PC, Jan SZ, Hoorntje ET, Te Rijdt WP, Vos YJ, Jongbloed JDH, van Ravenswaaij-Arts CMA, Sinke R, Sikkema-Raddatz B, Kerstjens-Frederikse WS, Swertz MA, Franke L. Improving the diagnostic yield of exome-sequencing by predicting gene-phenotype associations using large-scale gene expression analysis. Nat Commun. 2019 Jun 28;10(1):2837. doi: 10.1038/s41467-019-10649-4.

GADO (GeneNetwork Assisted Diagnostic Optimization) uses RNA-seq data from 31,499 samples to predict which genes cause specific disease phenotypes. To use GADO online: https://www.genenetwork.nl/gado. For code see: https://github.com/molgenis/systemsgenetics/wiki/GADO-Command-line

Reference: Deelen P, van Dam S, Herkert JC, Karjalainen JM, Brugge H, Abbott KM, van Diemen CC, van der Zwaag PA, Gerkes EH, Zonneveld-Huijssoon E, Boer-Bergsma JJ, Folkertsma P, Gillett T, van der Velde KJ, Kanninga R, van den Akker PC, Jan SZ, Hoorntje ET, Te Rijdt WP, Vos YJ, Jongbloed JDH, van Ravenswaaij-Arts CMA, Sinke R, Sikkema-Raddatz B, Kerstjens-Frederikse WS, Swertz MA, Franke L. Improving the diagnostic yield of exome-sequencing by predicting gene-phenotype associations using large-scale gene expression analysis. Nat Commun. 2019 Jun 28;10(1):2837. doi: 10.1038/s41467-019-10649-4.

2017

The eQTL mapping pipeline performs QTL mapping using linear models and direct meta-analysis of these data through weighted Z-score analysis. For the code see: https://github.com/molgenis/systemsgenetics/tree/master/eqtl-mapping-pipeline
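
The weighted Z-score approach mentioned above can be illustrated with a minimal sketch. It assumes the common convention of weighting each cohort's Z-score by the square root of its sample size; the function names and the weighting choice are illustrative assumptions, not taken from the pipeline's code.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort Z-scores into one meta-analysis Z-score.

    Assumption: weights are sqrt(sample size), a common convention for
    weighted Z-score meta-analysis. Illustrative only.
    """
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need one sample size per Z-score, and at least one cohort")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value for a standard-normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical Z-scores for one SNP-gene pair in three cohorts
meta_z = weighted_z_meta([2.1, 1.4, 2.8], [500, 1200, 900])
meta_p = two_sided_p(meta_z)
```

Because only Z-scores and sample sizes are needed, cohorts can be combined without sharing individual-level data.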

2014

Genotype Harmonizer is a command-line tool to harmonize genetic datasets by automatically solving issues concerning genomic strand and file format. Software is available open source under license LGPLv3 from: http://www.molgenis.org/systemsgenetics.

Reference: Deelen P, Bonder MJ, van der Velde KJ, Westra HJ, Winder E, Hendriksen D, Franke L, Swertz MA. Genotype harmonizer: automatic strand alignment and format conversion for genotype data integration. BMC Res Notes. 2014 Dec 11;7:901. doi: 10.1186/1756-0500-7-901.
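
To illustrate what a strand issue looks like, the sketch below compares the alleles of a biallelic SNP in a study dataset with those in a reference. It is a simplified illustration under stated assumptions, not Genotype Harmonizer's implementation: the function name and the returned labels are hypothetical, and ambiguous A/T and C/G SNPs are only flagged, not resolved.

```python
# Complement of each nucleotide on the opposite DNA strand
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def harmonize_alleles(study, reference):
    """Classify a biallelic SNP by comparing study and reference alleles.

    Assumes single-nucleotide alleles (A, C, G, T). Returns a tuple of
    (action, alleles), where action is one of the hypothetical labels
    "keep", "flip", "ambiguous" or "mismatch".
    """
    study = tuple(a.upper() for a in study)
    reference = tuple(a.upper() for a in reference)

    # A/T and C/G SNPs look identical on both strands, so allele
    # comparison alone cannot tell which strand was used.
    if COMPLEMENT[study[0]] == study[1]:
        return "ambiguous", study

    if set(study) == set(reference):
        return "keep", study

    # Try the opposite strand by complementing both alleles
    flipped = tuple(COMPLEMENT[a] for a in study)
    if set(flipped) == set(reference):
        return "flip", flipped

    return "mismatch", study


# Hypothetical example: study reports A/G, reference reports T/C
action, alleles = harmonize_alleles(("A", "G"), ("T", "C"))
```

In this example the study alleles match the reference only after complementing, so the SNP is classified as a strand flip.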

2011

MixupMapper is an algorithm to detect and correct sample mix-ups in genome-wide studies of gene expression levels. MixupMapper is freely available at: http://www.genenetwork.nl/mixupmapper/

Reference: Westra HJ, Jansen RC, Fehrmann RS, te Meerman GJ, van Heel D, Wijmenga C, Franke L. MixupMapper: correcting sample mix-ups in genome-wide datasets increases power to detect small genetic effects. Bioinformatics. 2011 Aug 1;27(15):2104-11. doi: 10.1093/bioinformatics/btr323.

2004

TEAM is a tool for the integration of expression, linkage and association maps.

Reference: Franke L, van Bakel H, Diosdado B, van Belzen M, Wapenaar M, Wijmenga C. TEAM: a tool for the integration of expression, and linkage and association maps. Eur J Hum Genet. 2004 Aug;12(8):633-8. doi: 10.1038/sj.ejhg.5201215.

Consortium datasets and federated pipelines

The International eQTLGen Consortium aims to investigate the genetic architecture of blood gene expression and to understand the genetic underpinnings of complex traits. The project has studied blood samples from >30,000 individuals to reveal the influence of genetics on gene expression. For more information on the project and its data, pipelines, cookbooks and publications see: https://eqtlgen.github.io/eqtlgen-web-site/

The International Single-Cell eQTLGen Consortium of 28 participating groups was established to identify the upstream interactors and downstream consequences of trait-related genetic variants in specific immune cell types. By defining the cell types in which the eQTL effects manifest, regulatory information obtained from the eQTLGen Consortium can be put in the correct context. For more information on the project, its data and its publications see: https://eqtlgen.github.io/eqtlgen-web-site/

MetaBrain is a large-scale eQTL meta-analysis of previously published human brain eQTL datasets. A browser for the cis-eQTL and trans-eQTL meta-analysis in cortex samples from individuals of European ancestry, together with a gene network that can be used to annotate genes and to perform gene set enrichment analyses, is available at https://www.metabrain.nl/

scMetaBrain is a federated single-cell consortium for cell-type specific eQTL analysis of neurological disease variants.

Last modified: 19 January 2026 4.36 p.m.