Publication

reGenotyper: Detecting mislabeled samples in genetic data

Zych, K., Snoek, L. B., Elvin, M., Rodriguez, M., van der Velde, K. J., Arends, D., Westra, H-J., Swertz, M. A., Poulin, G., Kammenga, J. E., Breitling, R., Jansen, R. C. & Li, Y., 13-Feb-2017, In: PLoS ONE. 12, 2, 11 p., e0171324.

Research output: Contribution to journal, Article, Academic, peer-review

In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the "ideal" genotype and identify "best-matched" labels for mislabeled samples. On average, we identified 4% of samples as mislabeled in eight published datasets, highlighting the necessity of applying a "data cleaning" step before standard data analysis.

Original language: English
Article number: e0171324
Number of pages: 11
Journal: PLoS ONE
Volume: 12
Issue number: 2
Publication status: Published - 13-Feb-2017

Keywords

  • GENOME-WIDE ASSOCIATION, NATURAL VARIATION DATA, C. ELEGANS, MIX-UPS, EXPRESSION, QTL, DISEASE, IDENTIFICATION, PERTURBATION, POPULATIONS

ID: 39744111