Multifactor dimensionality reduction explained

Multifactor dimensionality reduction (MDR) is a statistical approach, also used in automated machine learning,[1] for detecting and characterizing combinations of attributes or independent variables that interact to influence a dependent or class variable.[2] [3] [4] [5] [6] [7] MDR was designed specifically to identify nonadditive interactions among discrete variables that influence a binary outcome, and it is considered a nonparametric and model-free alternative to traditional statistical methods such as logistic regression.

The basis of the MDR method is a constructive induction or feature engineering algorithm that converts two or more variables or attributes to a single attribute.[8] This process of constructing a new attribute changes the representation space of the data.[9] The end goal is to create or discover a representation that facilitates the detection of nonlinear or nonadditive interactions among the attributes such that prediction of the class variable is improved over that of the original representation of the data.

Illustrative example

Consider the following simple example using the exclusive OR (XOR) function. XOR is a logical operator that is commonly used in data mining and machine learning as an example of a function that is not linearly separable. The table below represents a simple dataset where the relationship between the attributes (X1 and X2) and the class variable (Y) is defined by the XOR function such that Y = X1 XOR X2.

Table 1

X1 X2 Y
0 0 0
0 1 1
1 0 1
1 1 0

A machine learning algorithm would need to discover or approximate the XOR function in order to accurately predict Y using information about X1 and X2. An alternative strategy would be to first change the representation of the data using constructive induction to facilitate predictive modeling.

The MDR algorithm would change the representation of the data (X1 and X2) in the following manner. MDR starts by selecting two attributes. In this simple example, X1 and X2 are selected. Each combination of values for X1 and X2 is examined, and the number of times Y=1 and the number of times Y=0 are counted. In this simple example, Y=1 occurs zero times and Y=0 occurs once for the combination of X1=0 and X2=0. With MDR, the ratio of these counts is computed and compared to a fixed threshold. Here, the ratio of counts is 0/1, which is less than our fixed threshold of 1. Since 0/1 < 1, we encode a new attribute (Z) as a 0. When the ratio is greater than 1, we encode Z as a 1. This process is repeated for all unique combinations of values for X1 and X2. Table 2 illustrates our new transformation of the data.

Table 2

Z Y
0 0
1 1
1 1
0 0
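
The encoding step described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name mdr_encode and its threshold parameter are hypothetical, and assigning Z = 0 when the ratio exactly equals the threshold is an assumption, since the text only covers ratios strictly below or above 1.

```python
from collections import defaultdict

def mdr_encode(rows, labels, threshold=1.0):
    """Map each combination of attribute values to a new binary attribute Z."""
    # combination -> [count of Y=0, count of Y=1]
    counts = defaultdict(lambda: [0, 0])
    for row, y in zip(rows, labels):
        counts[tuple(row)][y] += 1
    # Z = 1 when (count of Y=1) / (count of Y=0) exceeds the threshold.
    # Comparing n1 > threshold * n0 avoids dividing by zero when a
    # combination has no Y=0 records. Ties are encoded as 0 (an assumption).
    mapping = {combo: int(n1 > threshold * n0)
               for combo, (n0, n1) in counts.items()}
    return [mapping[tuple(row)] for row in rows], mapping

# The XOR dataset from Table 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [0, 1, 1, 0]

Z, mapping = mdr_encode(X, Y)
print(Z)  # [0, 1, 1, 0], matching the Z column of Table 2
```

The returned mapping records which value combinations were encoded as 1 and which as 0, so the same encoding can be applied to records that were not used to build it.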

The machine learning algorithm now has much less work to do to find a good predictive function. In fact, in this very simple example, the function Y = Z has a classification accuracy of 1. A nice feature of constructive induction methods such as MDR is the ability to use any data mining or machine learning method to analyze the new representation of the data. Decision trees, neural networks, or a naive Bayes classifier could be used in combination with measures of model quality such as balanced accuracy[10] [11] and mutual information.[12]
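
As a rough illustration of one such quality measure, balanced accuracy is the arithmetic mean of sensitivity and specificity. The sketch below uses a hypothetical helper named balanced_accuracy and assumes that both classes occur in the true labels; the imbalanced labels in the second example are invented purely for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # fraction of Y=1 records predicted as 1
    specificity = tn / (tn + fp)  # fraction of Y=0 records predicted as 0
    return (sensitivity + specificity) / 2

# The model Y = Z on the XOR example: every record is classified correctly.
Y = [0, 1, 1, 0]
Z = [0, 1, 1, 0]
print(balanced_accuracy(Y, Z))  # 1.0

# Hypothetical imbalanced data: always predicting 1 is right 80% of the
# time, yet balanced accuracy exposes it as no better than guessing.
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 10
print(balanced_accuracy(y_true, y_pred))  # 0.5
```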

Machine learning with MDR

As illustrated above, the basic constructive induction algorithm in MDR is very simple. However, its implementation for mining patterns from real data can be computationally complex. As with any machine learning algorithm, there is always concern about overfitting. That is, machine learning algorithms are good at finding patterns even in completely random data, so it is often difficult to determine whether a reported pattern is an important signal or just chance.

One approach is to estimate the generalizability of a model to independent datasets using methods such as cross-validation.[13] [14] [15] [16] Models that describe random data typically do not generalize. Another approach is to generate many random permutations of the data to see what the data mining algorithm finds when given the chance to overfit. Permutation testing makes it possible to generate an empirical p-value for the result.[17] [18] [19] [20] Replication in independent data may also provide evidence for an MDR model but can be sensitive to differences between the data sets.[21] [22] These approaches have all been shown to be useful for choosing and evaluating MDR models.

An important step in a machine learning exercise is interpretation. Several approaches have been used with MDR, including entropy analysis[23] and pathway analysis.[24] [25] Tips and approaches for using MDR to model gene-gene interactions have been reviewed.[26] [27]
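
The permutation testing idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of any particular MDR package: the function names are hypothetical, the statistic is the training accuracy of the MDR rule with a threshold of 1, the add-one correction in the p-value is a common convention that the text does not specify, and the toy data simply repeat the XOR table ten times.

```python
import random

def mdr_training_accuracy(rows, labels):
    """Accuracy of the MDR rule on the same data it was built from."""
    counts = {}
    for row, y in zip(rows, labels):
        counts.setdefault(tuple(row), [0, 0])[y] += 1
    # With a threshold of 1, each combination predicts its more frequent
    # class, so it classifies max(n0, n1) of its records correctly.
    return sum(max(n0, n1) for n0, n1 in counts.values()) / len(labels)

def permutation_p_value(rows, labels, statistic, n_permutations=1000, seed=0):
    """Empirical p-value: how often shuffled labels score at least as well."""
    rng = random.Random(seed)
    observed = statistic(rows, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real link between rows and labels
        if statistic(rows, shuffled) >= observed:
            hits += 1
    # Add-one correction (an assumption) keeps the estimate above zero.
    return (hits + 1) / (n_permutations + 1)

# Toy data: the XOR table repeated ten times.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 10
labels = [0, 1, 1, 0] * 10

p = permutation_p_value(rows, labels, mdr_training_accuracy)
print(p)
```

Because the statistic is recomputed on every shuffled copy of the labels, the null distribution reflects how well the procedure scores when it is given the chance to overfit data with no real signal.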

Extensions to MDR

Numerous extensions to MDR have been introduced. These include family-based methods,[28] [29] [30] fuzzy methods,[31] covariate adjustment,[32] odds ratios,[33] risk scores,[34] survival methods,[35] [36] robust methods,[37] methods for quantitative traits,[38] [39] and many others.

Applications of MDR

MDR has mostly been applied to detecting gene-gene interactions or epistasis in genetic studies of common human diseases such as atrial fibrillation,[40] [41] autism,[42] bladder cancer,[43] [44] [45] breast cancer,[46] cardiovascular disease, hypertension,[47] [48] [49] obesity,[50] [51] pancreatic cancer,[52] prostate cancer[53] [54] [55] and tuberculosis.[56] It has also been applied to other biomedical problems such as the genetic analysis of pharmacology outcomes.[57] [58] [59] A central challenge is the scaling of MDR to big data such as that from genome-wide association studies (GWAS).[60] Several approaches have been used. One approach is to filter the features prior to MDR analysis.[61] This can be done using biological knowledge through tools such as BioFilter.[62] It can also be done using computational tools such as ReliefF.[63] Another approach is to use stochastic search algorithms such as genetic programming to explore the search space of feature combinations.[64] Yet another approach is a brute-force search using high-performance computing.[65] [66] [67]
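
To give a sense of why scaling is difficult, the sketch below runs an exhaustive search over all pairs of attributes in a tiny synthetic dataset and then counts the pairs that would need scoring in a larger study. The function names are hypothetical, the score is simple training accuracy rather than the cross-validated measures used in practice, and the figure of 500,000 variants is an illustrative assumption, not a value from the text.

```python
from itertools import combinations, product
from math import comb

def mdr_accuracy(rows, labels, columns):
    """Training accuracy of the MDR rule built from the chosen columns."""
    counts = {}
    for row, y in zip(rows, labels):
        key = tuple(row[c] for c in columns)
        counts.setdefault(key, [0, 0])[y] += 1
    # With a threshold of 1, each cell predicts its more frequent class.
    return sum(max(n0, n1) for n0, n1 in counts.values()) / len(labels)

def exhaustive_search(rows, labels, order=2):
    """Score every combination of `order` attributes and return the best."""
    n_attributes = len(rows[0])
    return max(
        combinations(range(n_attributes), order),
        key=lambda cols: mdr_accuracy(rows, labels, cols),
    )

# Synthetic data: three binary attributes, where Y is the XOR of the
# first and third and the second is irrelevant.
rows = list(product([0, 1], repeat=3))
labels = [r[0] ^ r[2] for r in rows]

best = exhaustive_search(rows, labels)
print(best, mdr_accuracy(rows, labels, best))  # (0, 2) 1.0

# Number of pairs to score for a hypothetical 500,000-variant study.
print(comb(500_000, 2))
```

The number of combinations grows quickly with both the number of attributes and the interaction order, which is the motivation for the filtering, stochastic search, and high-performance computing strategies mentioned above.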

Implementations

Notes and References

  1. McKinney. Brett A.. Reif. David M.. Ritchie. Marylyn D.. Moore. Jason H.. 2006-01-01. Machine learning for detecting gene-gene interactions: a review. Applied Bioinformatics. 5. 2. 77–88. 1175-5636. 3244050. 16722772. 10.2165/00822942-200605020-00002.
  2. Ritchie. Marylyn D.. Hahn. Lance W.. Roodi. Nady. Bailey. L. Renee. Dupont. William D.. Parl. Fritz F.. Moore. Jason H.. 2001-07-01. Multifactor-Dimensionality Reduction Reveals High-Order Interactions among Estrogen-Metabolism Genes in Sporadic Breast Cancer. The American Journal of Human Genetics. English. 69. 1. 138–147. 10.1086/321276. 0002-9297. 1226028. 11404819.
  3. Ritchie. Marylyn D.. Hahn. Lance W.. Moore. Jason H.. 2003-02-01. Power of multifactor dimensionality reduction for detecting gene-gene interactions in the presence of genotyping error, missing data, phenocopy, and genetic heterogeneity. Genetic Epidemiology. en. 24. 2. 150–157. 10.1002/gepi.10218. 12548676. 6335612. 1098-2272.
  4. Hahn. L. W.. Ritchie. M. D.. Moore. J. H.. 2003-02-12. Multifactor dimensionality reduction software for detecting gene-gene and gene-environment interactions. Bioinformatics. en. 19. 3. 376–382. 10.1093/bioinformatics/btf869. 12584123. 1367-4803. free.
  5. Hahn. Lance W.. Moore. Jason H.. 2004-01-01. Ideal Discrimination of Discrete Clinical Endpoints Using Multilocus Genotypes. In Silico Biology. en. 4. 2. 183–194. 15107022. 1386-6338.
  6. Moore. Jason H.. 2004-11-01. Computational analysis of gene-gene interactions using multifactor dimensionality reduction. Expert Review of Molecular Diagnostics. 4. 6. 795–803. 10.1586/14737159.4.6.795. 15525222. 26324399. 1473-7159.
  7. Book: Moore, Jason H.. Computational Methods for Genetics of Complex Traits . Detecting, Characterizing, and Interpreting Nonlinear Gene–Gene Interactions Using Multifactor Dimensionality Reduction . 2010-01-01. Advances in Genetics. 72. 101–116. 10.1016/B978-0-12-380862-2.00005-9. 0065-2660. 21029850. 978-0-12-380862-2.
  8. Moore. Jason H.. Gilbert. Joshua C.. Tsai. Chia-Ti. Chiang. Fu-Tien. Holden. Todd. Barney. Nate. White. Bill C.. 2006-07-21. A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. Journal of Theoretical Biology. 241. 2. 252–261. 10.1016/j.jtbi.2005.11.036. 16457852. 2006JThBi.241..252M .
  9. Michalski. R.. February 1983. A theory and methodology of inductive learning. Artificial Intelligence. en. 20. 2. 111–161. 10.1016/0004-3702(83)90016-4.
  10. Velez. Digna R.. White. Bill C.. Motsinger. Alison A.. Bush. William S.. Ritchie. Marylyn D.. Williams. Scott M.. Moore. Jason H.. 2007-05-01. A balanced accuracy function for epistasis modeling in imbalanced datasets using multifactor dimensionality reduction. Genetic Epidemiology. 31. 4. 306–315. 10.1002/gepi.20211. 0741-0395. 17323372. 28156181.
  11. Namkung. Junghyun. Kim. Kyunga. Yi. Sungon. Chung. Wonil. Kwon. Min-Seok. Park. Taesung. 2009-02-01. New evaluation measures for multifactor dimensionality reduction classifiers in gene-gene interaction analysis. Bioinformatics. 25. 3. 338–345. 10.1093/bioinformatics/btn629. 1367-4811. 19164302. free.
  12. Bush. William S.. Edwards. Todd L.. Dudek. Scott M.. McKinney. Brett A.. Ritchie. Marylyn D.. 2008-01-01. Alternative contingency table measures improve the power and detection of multifactor dimensionality reduction. BMC Bioinformatics. 9. 238. 10.1186/1471-2105-9-238. 1471-2105. 2412877. 18485205 . free .
  13. Coffey. Christopher S.. Hebert. Patricia R.. Ritchie. Marylyn D.. Krumholz. Harlan M.. Gaziano. J. Michael. Ridker. Paul M.. Brown. Nancy J.. Vaughan. Douglas E.. Moore. Jason H.. 2004-01-01. An application of conditional logistic regression and multifactor dimensionality reduction for detecting gene-gene Interactions on risk of myocardial infarction: The importance of model validation. BMC Bioinformatics. 5. 49. 10.1186/1471-2105-5-49. 1471-2105. 419697. 15119966 . free .
  14. Motsinger. Alison A.. Ritchie. Marylyn D.. 2006-09-01. The effect of reduction in cross-validation intervals on the performance of multifactor dimensionality reduction. Genetic Epidemiology. en. 30. 6. 546–555. 10.1002/gepi.20166. 16800004. 20573232. 1098-2272.
  15. Gory. Jeffrey J.. Sweeney. Holly C.. Reif. David M.. Motsinger-Reif. Alison A.. 2012-11-05. A comparison of internal model validation methods for multifactor dimensionality reduction in the case of genetic heterogeneity. BMC Research Notes. 5. 623. 10.1186/1756-0500-5-623. 1756-0500. 3599301. 23126544 . free .
  16. Winham. Stacey J.. Slater. Andrew J.. Motsinger-Reif. Alison A.. 2010-07-22. A comparison of internal validation techniques for multifactor dimensionality reduction. BMC Bioinformatics. 11. 394. 10.1186/1471-2105-11-394. 1471-2105. 2920275. 20650002 . free .
  17. Pattin. Kristine A.. White. Bill C.. Barney. Nate. Gui. Jiang. Nelson. Heather H.. Kelsey. Karl T.. Andrew. Angeline S.. Karagas. Margaret R.. Moore. Jason H.. 2009-01-01. A computationally efficient hypothesis testing method for epistasis analysis using multifactor dimensionality reduction. Genetic Epidemiology. en. 33. 1. 87–94. 10.1002/gepi.20360. 1098-2272. 2700860. 18671250.
  18. Book: Enabling personal genomics with an explicit test of epistasis. Biocomputing 2010: Pacific Symposium on Biocomputing. Greene. Casey S.. Himmelstein. Daniel S.. Nelson. Heather H.. Kelsey. Karl T.. Williams. Scott M.. Andrew. Angeline S.. Karagas. Margaret R.. Moore. Jason H.. 2009-10-01. World Scientific. 9789814299473. 327–336. 10.1142/9789814295291_0035. 19908385. 2916690.
  19. Dai. Hongying. Bhandary. Madhusudan. Becker. Mara. Leeder. J. Steven. Gaedigk. Roger. Motsinger-Reif. Alison A.. 2012-05-22. Global tests of P-values for multifactor dimensionality reduction models in selection of optimal number of target genes. BioData Mining. 5. 1. 3. 10.1186/1756-0381-5-3. 1756-0381. 3508622. 22616673 . free .
  20. Motsinger-Reif. Alison A.. 2008-12-30. The effect of alternative permutation testing strategies on the performance of multifactor dimensionality reduction. BMC Research Notes. 1. 139. 10.1186/1756-0500-1-139. 1756-0500. 2631601. 19116021 . free .
  21. Greene. Casey S.. Penrod. Nadia M.. Williams. Scott M.. Moore. Jason H.. 2009-06-02. Failure to Replicate a Genetic Association May Provide Important Clues About Genetic Architecture. PLOS ONE. 4. 6. e5639. 10.1371/journal.pone.0005639. 1932-6203. 2685469. 19503614. 2009PLoSO...4.5639G. free.
  22. Book: Piette. Elizabeth R.. Moore. Jason H.. Applications of Evolutionary Computation . Improving the Reproducibility of Genetic Association Results Using Genotype Resampling Methods . 2017-04-19. 10199. en. 96–108. 10.1007/978-3-319-55849-3_7. Lecture Notes in Computer Science. 978-3-319-55848-6.
  23. Book: Moore. Jason H.. Hu. Ting. Epistasis . Epistasis Analysis Using Information Theory . 2015-01-01. Methods in Molecular Biology. 1253. 257–268. 10.1007/978-1-4939-2155-3_13. 1940-6029. 25403536. 978-1-4939-2154-6.
  24. Kim. Nora Chung. Andrews. Peter C.. Asselbergs. Folkert W.. Frost. H. Robert. Williams. Scott M.. Harris. Brent T.. Read. Cynthia. Askland. Kathleen D.. Moore. Jason H.. 2012-07-28. Gene ontology analysis of pairwise genetic associations in two genome-wide studies of sporadic ALS. BioData Mining. 5. 1. 9. 10.1186/1756-0381-5-9. 1756-0381. 3463436. 22839596 . free .
  25. Cheng. Samantha. Andrew. Angeline S.. Andrews. Peter C.. Moore. Jason H.. 2016-01-01. Complex systems analysis of bladder cancer susceptibility reveals a role for decarboxylase activity in two genome-wide association studies. BioData Mining. 9. 40. 10.1186/s13040-016-0119-z. 5154053. 27999618 . free .
  26. Book: Moore. Jason H.. Andrews. Peter C.. Epistasis . Epistasis Analysis Using Multifactor Dimensionality Reduction . 2015-01-01. Springer New York. 9781493921546. Moore. Jason H.. Williams. Scott M.. Methods in Molecular Biology. 1253. 301–314. English. 10.1007/978-1-4939-2155-3_16. 25403539.
  27. Gola. Damian. Mahachie John. Jestinah M.. van Steen. Kristel. König. Inke R.. 2016-03-01. A roadmap to multifactor dimensionality reduction methods. Briefings in Bioinformatics. 17. 2. 293–308. 10.1093/bib/bbv038. 1477-4054. 4793893. 26108231.
  28. Martin. E. R.. Ritchie. M. D.. Hahn. L.. Kang. S.. Moore. J. H.. 2006-02-01. A novel method to identify gene-gene effects in nuclear families: the MDR-PDT. Genetic Epidemiology. 30. 2. 111–123. 10.1002/gepi.20128. 0741-0395. 16374833. 25772215.
  29. Lou. Xiang-Yang. Chen. Guo-Bo. Yan. Lei. Ma. Jennie Z.. Mangold. Jamie E.. Zhu. Jun. Elston. Robert C.. Li. Ming D.. 2008-10-01. A combinatorial approach to detecting gene-gene and gene-environment interactions in family studies. American Journal of Human Genetics. 83. 4. 457–467. 10.1016/j.ajhg.2008.09.001. 1537-6605. 2561932. 18834969.
  30. Cattaert. Tom. Urrea. Víctor. Naj. Adam C.. De Lobel. Lizzy. De Wit. Vanessa. Fu. Mao. Mahachie John. Jestinah M.. Shen. Haiqing. Calle. M. Luz. 2010-04-22. FAM-MDR: a flexible family-based multifactor dimensionality reduction technique to detect epistasis using related individuals. PLOS ONE. 5. 4. e10304. 10.1371/journal.pone.0010304. 1932-6203. 2858665. 20421984. 2010PLoSO...510304C. free.
  31. Leem. Sangseob. Park. Taesung. 2017-03-14. An empirical fuzzy multifactor dimensionality reduction method for detecting gene-gene interactions. BMC Genomics. 18. Suppl 2. 115. 10.1186/s12864-017-3496-x. 1471-2164. 5374597. 28361694 . free .
  32. Gui. Jiang. Andrew. Angeline S.. Andrews. Peter. Nelson. Heather M.. Kelsey. Karl T.. Karagas. Margaret R.. Moore. Jason H.. 2010-01-01. A simple and computationally efficient sampling approach to covariate adjustment for multifactor dimensionality reduction analysis of epistasis. Human Heredity. 70. 3. 219–225. 10.1159/000319175. 1423-0062. 2982850. 20924193.
  33. Chung. Yujin. Lee. Seung Yeoun. Elston. Robert C.. Park. Taesung. 2007-01-01. Odds ratio based multifactor-dimensionality reduction method for detecting gene-gene interactions. Bioinformatics. 23. 1. 71–76. 10.1093/bioinformatics/btl557. 1367-4811. 17092990. free.
  34. Dai. Hongying. Charnigo. Richard J.. Becker. Mara L.. Leeder. J. Steven. Motsinger-Reif. Alison A.. 2013-01-08. Risk score modeling of multiple gene to gene interactions using aggregated-multifactor dimensionality reduction. BioData Mining. 6. 1. 1. 10.1186/1756-0381-6-1. 3560267. 23294634 . free .
  35. Gui. Jiang. Moore. Jason H.. Kelsey. Karl T.. Marsit. Carmen J.. Karagas. Margaret R.. Andrew. Angeline S.. 2011-01-01. A novel survival multifactor dimensionality reduction method for detecting gene-gene interactions with application to bladder cancer prognosis. Human Genetics. 129. 1. 101–110. 10.1007/s00439-010-0905-5. 1432-1203. 3255326. 20981448.
  36. Lee. Seungyeoun. Son. Donghee. Yu. Wenbao. Park. Taesung. 2016-12-01. Gene-Gene Interaction Analysis for the Accelerated Failure Time Model Using a Unified Model-Based Multifactor Dimensionality Reduction Method. Genomics & Informatics. 14. 4. 166–172. 10.5808/GI.2016.14.4.166. 1598-866X. 5287120. 28154507.
  37. Gui. Jiang. Andrew. Angeline S.. Andrews. Peter. Nelson. Heather M.. Kelsey. Karl T.. Karagas. Margaret R.. Moore. Jason H.. 2011-01-01. A robust multifactor dimensionality reduction method for detecting gene-gene interactions with application to the genetic analysis of bladder cancer susceptibility. Annals of Human Genetics. 75. 1. 20–28. 10.1111/j.1469-1809.2010.00624.x. 1469-1809. 3057873. 21091664.
  38. Gui. Jiang. Moore. Jason H.. Williams. Scott M.. Andrews. Peter. Hillege. Hans L.. van der Harst. Pim. Navis. Gerjan. Van Gilst. Wiek H.. Asselbergs. Folkert W.. 2013-01-01. A Simple and Computationally Efficient Approach to Multifactor Dimensionality Reduction Analysis of Gene-Gene Interactions for Quantitative Traits. PLOS ONE. 8. 6. e66545. 10.1371/journal.pone.0066545. 1932-6203. 3689797. 23805232. 2013PLoSO...866545G. free.
  39. Lou. Xiang-Yang. Chen. Guo-Bo. Yan. Lei. Ma. Jennie Z.. Zhu. Jun. Elston. Robert C.. Li. Ming D.. 2007-06-01. A generalized combinatorial approach for detecting gene-by-gene and gene-by-environment interactions with application to nicotine dependence. American Journal of Human Genetics. 80. 6. 1125–1137. 10.1086/518312. 0002-9297. 1867100. 17503330.
  40. Tsai. Chia-Ti. Lai. Ling-Ping. Lin. Jiunn-Lee. Chiang. Fu-Tien. Hwang. Juey-Jen. Ritchie. Marylyn D.. Moore. Jason H.. Hsu. Kuan-Lih. Tseng. Chuen-Den. 2004-04-06. Renin-Angiotensin System Gene Polymorphisms and Atrial Fibrillation. Circulation. en. 109. 13. 1640–1646. 10.1161/01.CIR.0000124487.36586.26. 0009-7322. 15023884. free.
  41. Asselbergs. Folkert W.. Moore. Jason H.. van den Berg. Maarten P.. Rimm. Eric B.. de Boer. Rudolf A.. Dullaart. Robin P.. Navis. Gerjan. van Gilst. Wiek H.. 2006-01-01. A role for CETP TaqIB polymorphism in determining susceptibility to atrial fibrillation: a nested case control study. BMC Medical Genetics. 7. 39. 10.1186/1471-2350-7-39. 1471-2350. 1462991. 16623947 . free .
  42. Ma. D.Q.. Whitehead. P.L.. Menold. M.M.. Martin. E.R.. Ashley-Koch. A.E.. Mei. H.. Ritchie. M.D.. DeLong. G.R.. Abramson. R.K.. 2005-09-01. Identification of Significant Association and Gene-Gene Interaction of GABA Receptor Subunit Genes in Autism. The American Journal of Human Genetics. English. 77. 3. 377–388. 10.1086/433195. 0002-9297. 1226204. 16080114.
  43. Andrew. Angeline S.. Nelson. Heather H.. Kelsey. Karl T.. Moore. Jason H.. Meng. Alexis C.. Casella. Daniel P.. Tosteson. Tor D.. Schned. Alan R.. Karagas. Margaret R.. 2006-05-01. Concordance of multiple analytical approaches demonstrates a complex relationship between DNA repair gene SNPs, smoking and bladder cancer susceptibility. Carcinogenesis. 27. 5. 1030–1037. 10.1093/carcin/bgi284. 16311243. 0143-3334. free.
  44. Andrew. Angeline S.. Karagas. Margaret R.. Nelson. Heather H.. Guarrera. Simonetta. Polidoro. Silvia. Gamberini. Sara. Sacerdote. Carlotta. Moore. Jason H.. Kelsey. Karl T.. 2008-01-01. DNA Repair Polymorphisms Modify Bladder Cancer Risk: A Multi-factor Analytic Strategy. Human Heredity. english. 65. 2. 105–118. 10.1159/000108942. 0001-5652. 2857629. 17898541.
  45. Andrew. Angeline S.. Hu. Ting. Gu. Jian. Gui. Jiang. Ye. Yuanqing. Marsit. Carmen J.. Kelsey. Karl T.. Schned. Alan R.. Tanyos. Sam A.. 2012-01-01. HSD3B and gene-gene interactions in a pathway-based analysis of genetic susceptibility to bladder cancer. PLOS ONE. 7. 12. e51301. 10.1371/journal.pone.0051301. 1932-6203. 3526593. 23284679. 2012PLoSO...751301A. free.
  46. Cao. Jingjing. Luo. Chenglin. Yan. Rui. Peng. Rui. Wang. Kaijuan. Wang. Peng. Ye. Hua. Song. Chunhua. 2016-12-01. rs15869 at miRNA binding site in BRCA2 is associated with breast cancer susceptibility. Medical Oncology. en. 33. 12. 135. 10.1007/s12032-016-0849-2. 27807724. 26042128. 1357-0560.
  47. Williams. Scott M.. Ritchie. Marylyn D.. Phillips. John A., III. Dawson. Elliot. Prince. Melissa. Dzhura. Elvira. Willis. Alecia. Semenya. Amma. Summar. Marshall. 2004-01-01. Multilocus Analysis of Hypertension: A Hierarchical Approach. Human Heredity. english. 57. 1. 28–38. 10.1159/000077387. 15133310. 21079485. 0001-5652.
  48. Sanada. Hironobu. Yatabe. Junichi. Midorikawa. Sanae. Hashimoto. Shigeatsu. Watanabe. Tsuyoshi. Moore. Jason H.. Ritchie. Marylyn D.. Williams. Scott M.. Pezzullo. John C.. 2006-03-01. Single-Nucleotide Polymorphisms for Diagnosis of Salt-Sensitive Hypertension. Clinical Chemistry. en. 52. 3. 352–360. 10.1373/clinchem.2005.059139. 0009-9147. 16439609. free.
  49. Moore. Jason H.. Williams. Scott M.. 2002-01-01. New strategies for identifying gene-gene interactions in hypertension. Annals of Medicine. 34. 2. 88–95. 10.1080/07853890252953473. 12108579. 25398042. 0785-3890.
  50. De. Rishika. Verma. Shefali S.. Holzinger. Emily. Hall. Molly. Burt. Amber. Carrell. David S.. Crosslin. David R.. Jarvik. Gail P.. Kuivaniemi. Helena. 2017-02-01. Identifying gene-gene interactions that are highly associated with four quantitative lipid traits across multiple cohorts. Human Genetics. 136. 2. 165–178. 10.1007/s00439-016-1738-7. 1432-1203. 27848076. 24702049.
  51. De. Rishika. Verma. Shefali S.. Drenos. Fotios. Holzinger. Emily R.. Holmes. Michael V.. Hall. Molly A.. Crosslin. David R.. Carrell. David S.. Hakonarson. Hakon. 2015-01-01. Identifying gene-gene interactions that are highly associated with Body Mass Index using Quantitative Multifactor Dimensionality Reduction (QMDR). BioData Mining. 8. 41. 10.1186/s13040-015-0074-0. 4678717. 26674805 . free .
  52. Duell. Eric J.. Bracci. Paige M.. Moore. Jason H.. Burk. Robert D.. Kelsey. Karl T.. Holly. Elizabeth A.. 2008-06-01. Detecting pathway-based gene-gene and gene-environment interactions in pancreatic cancer. Cancer Epidemiology, Biomarkers & Prevention. 17. 6. 1470–1479. 10.1158/1055-9965.EPI-07-2797. 1055-9965. 4410856. 18559563.
  53. Xu. Jianfeng. Lowey. James. Wiklund. Fredrik. Sun. Jielin. Lindmark. Fredrik. Hsu. Fang-Chi. Dimitrov. Latchezar. Chang. Baoli. Turner. Aubrey R.. 2005-11-01. The Interaction of Four Genes in the Inflammation Pathway Significantly Predicts Prostate Cancer Risk. Cancer Epidemiology, Biomarkers & Prevention. en. 14. 11. 2563–2568. 10.1158/1055-9965.EPI-05-0356. 1055-9965. 16284379. free.
  54. Lavender. Nicole A.. Rogers. Erica N.. Yeyeodu. Susan. Rudd. James. Hu. Ting. Zhang. Jie. Brock. Guy N.. Kimbro. Kevin S.. Moore. Jason H.. 2012-04-30. Interaction among apoptosis-associated sequence variants and joint effects on aggressive prostate cancer. BMC Medical Genomics. 5. 11. 10.1186/1755-8794-5-11. 1755-8794. 3355002. 22546513 . free .
  55. Lavender. Nicole A.. Benford. Marnita L.. VanCleave. Tiva T.. Brock. Guy N.. Kittles. Rick A.. Moore. Jason H.. Hein. David W.. Kidd. La Creis R.. 2009-11-16. Examination of polymorphic glutathione S-transferase (GST) genes, tobacco smoking and prostate cancer risk among men of African descent: a case-control study. BMC Cancer. 9. 397. 10.1186/1471-2407-9-397. 1471-2407. 2783040. 19917083 . free .
  56. Collins. Ryan L.. Hu. Ting. Wejse. Christian. Sirugo. Giorgio. Williams. Scott M.. Moore. Jason H.. 2013-02-18. Multifactor dimensionality reduction reveals a three-locus epistatic interaction associated with susceptibility to pulmonary tuberculosis. BioData Mining. 6. 1. 4. 10.1186/1756-0381-6-4. 3618340. 23418869 . free .
  57. Wilke. Russell A.. Reif. David M.. Moore. Jason H.. 2005-11-01. Combinatorial Pharmacogenetics. Nature Reviews Drug Discovery. en. 4. 11. 911–918. 10.1038/nrd1874. 16264434. 11643026. 1474-1776.
  58. Motsinger. Alison A.. Ritchie. Marylyn D.. Shafer. Robert W.. Robbins. Gregory K.. Morse. Gene D.. Labbe. Line. Wilkinson. Grant R.. Clifford. David B.. D'Aquila. Richard T.. 2006-11-01. Multilocus genetic interactions and response to efavirenz-containing regimens: an adult AIDS clinical trials group study. Pharmacogenetics and Genomics. 16. 11. 837–845. 10.1097/01.fpc.0000230413.97596.fa. 1744-6872. 17047492. 26266170.
  59. Ritchie. Marylyn D.. Motsinger. Alison A.. 2005-12-01. Multifactor dimensionality reduction for detecting gene-gene and gene-environment interactions in pharmacogenomics studies. Pharmacogenomics. 6. 8. 823–834. 10.2217/14622416.6.8.823. 1462-2416. 16296945. 10348021.
  60. Moore. Jason H.. Asselbergs. Folkert W.. Williams. Scott M.. 2010-02-15. Bioinformatics challenges for genome-wide association studies. Bioinformatics. 26. 4. 445–455. 10.1093/bioinformatics/btp713. 1367-4811. 2820680. 20053841.
  61. Sun. Xiangqing. Lu. Qing. Mukherjee. Shubhabrata. Crane. Paul K.. Elston. Robert. Ritchie. Marylyn D.. 2014-01-01. Analysis pipeline for the epistasis search – statistical versus biological filtering. Frontiers in Genetics. 5. 106. 10.3389/fgene.2014.00106. 4012196. 24817878. free.
  62. Pendergrass. Sarah A.. Frase. Alex. Wallace. John. Wolfe. Daniel. Katiyar. Neerja. Moore. Carrie. Ritchie. Marylyn D.. 2013-12-30. Genomic analyses with biofilter 2.0: knowledge driven filtering, annotation, and model development. BioData Mining. 6. 1. 25. 10.1186/1756-0381-6-25. 3917600. 24378202 . free .
  63. Book: Moore, Jason H.. Epistasis Analysis Using ReliefF . Epistasis. 2015-01-01. Methods in Molecular Biology. 1253. 315–325. 10.1007/978-1-4939-2155-3_17. 1940-6029. 25403540. 978-1-4939-2154-6.
  64. Book: Genetic Programming Theory and Practice IV. Moore. Jason H.. White. Bill C.. Genome-Wide Genetic Analysis Using Genetic Programming: The Critical Need for Expert Knowledge . 2007-01-01. Springer US. 9780387333755. Riolo. Rick. Genetic and Evolutionary Computation. 11–28. en. 10.1007/978-0-387-49650-4_2. 55188394. Soule. Terence. Worzel. Bill.
  65. Greene. Casey S.. Sinnott-Armstrong. Nicholas A.. Himmelstein. Daniel S.. Park. Paul J.. Moore. Jason H.. Harris. Brent T.. 2010-03-01. Multifactor dimensionality reduction for graphics processing units enables genome-wide testing of epistasis in sporadic ALS. Bioinformatics. 26. 5. 694–695. 10.1093/bioinformatics/btq009. 1367-4811. 2828117. 20081222.
  66. Bush. William S.. Dudek. Scott M.. Ritchie. Marylyn D.. 2006-09-01. Parallel multifactor dimensionality reduction: a tool for the large-scale analysis of gene-gene interactions. Bioinformatics. 22. 17. 2173–2174. 10.1093/bioinformatics/btl347. 1367-4811. 4939609. 16809395.
  67. Sinnott-Armstrong. Nicholas A.. Greene. Casey S.. Cancare. Fabio. Moore. Jason H.. 2009-07-24. Accelerating epistasis analysis in human genetics with consumer graphics hardware. BMC Research Notes. 2. 149. 10.1186/1756-0500-2-149. 1756-0500. 2732631. 19630950 . free .
  68. Winham. Stacey J.. Motsinger-Reif. Alison A.. 2011-08-16. An R package implementation of multifactor dimensionality reduction. BioData Mining. 4. 1. 24. 10.1186/1756-0381-4-24. 1756-0381. 3177775. 21846375 . free .
  69. Calle. M. Luz. Urrea. Víctor. Malats. Núria. Van Steen. Kristel. 2010-09-01. mbmdr: an R package for exploring gene-gene interactions associated with binary or quantitative traits. Bioinformatics. 26. 17. 2198–2199. 10.1093/bioinformatics/btq352. 1367-4811. 20595460. free.