Evaluating the detection ability of a range of epistasis detection methods on simulated data for pure and impure epistatic models.

Russ D; Williams JA; Cardoso VR; Bravo-Merodio L; Pendleton SC; Aziz F; Acharjee A; Gkoutos GV

doi:10.1371/journal.pone.0263390

Affiliations

1. Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, United Kingdom.

Authors

Russ D¹
Williams JA¹
Cardoso VR¹
Bravo-Merodio L¹
Pendleton SC¹
Aziz F¹
Acharjee A¹
Gkoutos GV¹

PLoS One, 18 Feb 2022, 17(2):e0263390
https://doi.org/10.1371/journal.pone.0263390 PMID: 35180244 PMCID: PMC8856572

This article is in the Europe PMC Open access subset. Refer to the copyright information in the article for licensing details.

This article has been corrected. See PLoS One. 2023 Jul 6;18(7):e0288416.

Abstract

Background

Numerous approaches have been proposed for the detection of epistatic interactions within GWAS datasets in order to better understand the drivers of disease and genetics.

Methods

A selection of state-of-the-art approaches was assessed. These included the statistical tests fast-epistasis, BOOST, logistic regression and wtest; the swarm intelligence methods AntEpiSeeker, epiACO and CINOEDV; and the data mining approaches MDR, GSS, SNPRuler and MPI3SNP. Data were simulated to provide randomly generated models with no individual main effects at different heritabilities (pure epistasis), as well as models based on penetrance tables with some main effects (impure epistasis). Detection of both two- and three-locus interactions was assessed across a total of 1,560 simulated datasets. The different methods were also applied to a section of the UK Biobank cohort for Atrial Fibrillation.
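
The distinction between pure and impure epistasis can be illustrated with a two-locus penetrance table. The following is a minimal sketch, not the simulation procedure used in the study: the tables `PURE` and `IMPURE`, the minor allele frequency of 0.5, and the helper names are all assumptions chosen for illustration. A model is treated as pure when the marginal penetrance of each locus is identical across its three genotypes, so that neither locus shows a main effect on its own.

```python
def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for 0, 1 and 2 copies of the minor allele."""
    return [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]


def marginal_penetrances(table, maf_a, maf_b):
    """Average a 3x3 penetrance table over the other locus.

    table[i][j] is the disease probability for genotype i at locus A
    and genotype j at locus B.
    """
    fa, fb = genotype_freqs(maf_a), genotype_freqs(maf_b)
    marg_a = [sum(table[i][j] * fb[j] for j in range(3)) for i in range(3)]
    marg_b = [sum(table[i][j] * fa[i] for i in range(3)) for j in range(3)]
    return marg_a, marg_b


def is_pure(table, maf_a, maf_b, tol=1e-9):
    """True when neither locus has a marginal (main) effect."""
    marg_a, marg_b = marginal_penetrances(table, maf_a, maf_b)
    flat_a = max(marg_a) - min(marg_a) < tol
    flat_b = max(marg_b) - min(marg_b) < tol
    return flat_a and flat_b


# Illustrative tables (assumed values, not taken from the study).
PURE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]
# Raising one cell gives both loci a marginal effect.
IMPURE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.2],
]

if __name__ == "__main__":
    print(is_pure(PURE, 0.5, 0.5))
    print(is_pure(IMPURE, 0.5, 0.5))
```

With both minor allele frequencies at 0.5, every row and column of `PURE` averages to the same penetrance, so a single-locus test has nothing to detect. Raising one cell in `IMPURE` breaks that balance.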

Results

For pure two-locus interactions, PLINK's implementation of BOOST recovered the highest number of correct interactions (53.9%), performing significantly better than the other methods (p = 4.52e-36). For impure two-locus interactions, MDR exhibited the best performance, recovering 62.2% of the most significant impure epistatic interactions (p = 6.31e-90 for all but one test). The assessment of three-locus interaction prediction revealed that wtest recovered the highest number (17.2%) of pure epistatic interactions (p = 8.49e-14). wtest also recovered the highest number of three-locus impure epistatic interactions (p = 6.76e-48), while AntEpiSeeker ranked the highest number of such interactions (40.5%) as the most significant. Finally, the methods were applied to a real dataset for Atrial Fibrillation, most notably finding an interaction between SYNE2 and DTNB.
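
The results above separate two outcomes: a method recovering the correct interaction at all, and a method ranking it as the most significant. The sketch below shows one way such rates could be tallied across simulated datasets. It is an assumed scoring scheme for illustration, not the study's evaluation code: the function name `recovery_rates`, the input layout, and the example SNP identifiers are hypothetical.

```python
def recovery_rates(results, truths):
    """Return (recovered_rate, top_ranked_rate) across datasets.

    results: one ranked list of SNP combinations per dataset, with the
             most significant first, e.g. [("rs1", "rs2"), ...].
    truths:  the simulated (true) SNP combination for each dataset.
    SNP order within a combination is ignored.
    """
    if len(results) != len(truths):
        raise ValueError("results and truths must have the same length")
    if not truths:
        raise ValueError("at least one dataset is required")

    recovered = 0
    top_ranked = 0
    for ranked, truth in zip(results, truths):
        target = frozenset(truth)
        candidates = [frozenset(combo) for combo in ranked]
        if target in candidates:
            recovered += 1
            if candidates[0] == target:
                top_ranked += 1

    n = len(truths)
    return recovered / n, top_ranked / n


if __name__ == "__main__":
    # Hypothetical output of one method on four simulated datasets.
    results = [
        [("rs1", "rs2"), ("rs3", "rs4")],  # truth found and ranked first
        [("rs5", "rs6"), ("rs2", "rs1")],  # truth found, ranked second
        [("rs7", "rs8")],                  # truth missed
        [],                                # nothing reported
    ]
    truths = [("rs2", "rs1")] * 4
    print(recovery_rates(results, truths))
```

Comparing combinations as unordered sets means the same sketch applies to two-locus and three-locus interactions.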

Evaluating the detection ability of a range of epistasis detection methods on simulated data for pure and impure epistatic models

Dominic Russ (#, 1, 2, *): Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft
John A. Williams (#, 1, 2): Conceptualization, Data curation, Investigation, Methodology, Supervision, Writing – original draft, Writing – review & editing
Victor Roth Cardoso (1, 2): Investigation, Methodology, Writing – review & editing
Laura Bravo-Merodio (1, 2): Investigation, Methodology, Writing – review & editing
Samantha C. Pendleton (1, 2): Investigation, Methodology, Writing – review & editing
Furqan Aziz (1, 2): Investigation, Methodology, Writing – review & editing
Animesh Acharjee (1, 2, 3, ‡): Conceptualization, Investigation, Methodology, Writing – review & editing
Georgios V. Gkoutos (1, 2, 3, 4, 5, ‡): Conceptualization, Investigation, Methodology, Writing – review & editing