Published In

Journal of Molecular Engineering and Systems Biology

Document Type

Article

Publication Date

10-3-2012

Subjects

Reconstructability Analysis, Information Theory, Probabilistic graphical modeling, Multivariate analysis, Discrete multivariate modeling, Data mining

Abstract

Background: Reconstructability Analysis (RA) has been used to detect epistasis in genomic data; in that work, even the simplest RA models (variable-based models without loops) outperformed two other methods. A follow-on theoretical study showed that RA also offers higher-resolution models, namely variable-based models with loops and state-based models, which are likely to be even more effective in modeling epistasis; that study also described several mathematical approaches to classifying types of epistasis.

Methods: The present paper extends this second study by discussing a non-standard use of RA: the analysis of epistasis in quantitative as opposed to nominal variables; such quantitative variables are, for example, encountered in genetic characterizations of gene expression, e.g., eQTL data. Three methods are investigated for applying variable- and state-based RA to quantitative dependent variables: (i) k-systems analysis, which treats continuous function values as pseudofrequencies, (ii) b-systems analysis, which derives continuous values from binned DVs using expected value calculations, and (iii) u-systems analysis, which treats continuous function values as pseudo-utilities subject to a lottery. These methods are demonstrated and compared on synthetic data.
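The three transformations named above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the paper's actual procedure: the function names (`k_system`, `b_system`, `u_system`), the toy two-locus expression values, the choice of bin midpoints as representative values, and the min-max rescaling used for the lottery are all hypothetical, and the sketch covers only the data transformation step, not the RA model search itself.

```python
# Illustrative sketch only: names, data, and exact formulas are assumptions,
# not taken from the paper. States are (locus_A, locus_B) tuples.

def k_system(values):
    """Treat non-negative function values as pseudofrequencies:
    normalize them so they sum to 1 (assumes a positive total)."""
    total = sum(values.values())
    return {state: v / total for state, v in values.items()}

def b_system(samples, bin_edges):
    """Bin the dependent variable, then recover one continuous value per
    state as the expected bin midpoint under p(bin | state).
    Assumes every sample lies within the bin edges."""
    bins = list(zip(bin_edges, bin_edges[1:]))
    mids = [(lo + hi) / 2 for lo, hi in bins]
    out = {}
    for state, ys in samples.items():
        counts = [0] * len(bins)
        for y in ys:
            for i, (lo, hi) in enumerate(bins):
                # Bins are half-open; the last bin also includes its upper edge.
                if lo <= y < hi or (i == len(bins) - 1 and y == hi):
                    counts[i] += 1
                    break
        n = sum(counts)
        out[state] = sum(c / n * m for c, m in zip(counts, mids))
    return out

def u_system(values):
    """Rescale values to [0, 1] and read each as the 'win' probability of a
    two-outcome lottery (assumes the values are not all equal)."""
    lo, hi = min(values.values()), max(values.values())
    result = {}
    for state, v in values.items():
        p_win = (v - lo) / (hi - lo)
        result[state] = (p_win, 1 - p_win)
    return result

# Hypothetical mean expression level for each two-locus state.
expression = {(0, 0): 2.0, (0, 1): 4.0, (1, 0): 4.0, (1, 1): 10.0}
# Hypothetical raw expression samples per state, for the binned approach.
samples = {(0, 0): [1.0, 3.0], (0, 1): [3.0, 5.0], (1, 1): [9.0, 10.0, 11.0]}

k = k_system(expression)
b = b_system(samples, bin_edges=[0.0, 4.0, 8.0, 12.0])
u = u_system(expression)
print(k, b, u, sep="\n")
```

In each case the output is a distribution (or a value derived from one) defined over the states of the independent variables, which is the kind of object that RA models would then be fitted to.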

Results: The three methods of k-, b-, and u-systems analyses, both variable-based and state-based, are then applied to a published SNP dataset. A preliminary search is done with b-systems analysis, followed by more refined k- and u-systems searches. The analyses suggest candidates for epistatic interactions that affect the level of gene expression. As in the synthetic data studies, state-based RA is more powerful than variable-based RA.

Conclusions: While the previous RA studies looked at epistasis in nominal (or discretized) data, this paper shows that RA can also analyze epistasis in quantitative expression data without discretizing these data. Since RA can also model epistasis in frequency distributions and detect linkage disequilibrium, its successful application here to continuous functions as well shows that it offers a flexible methodology for the analysis of genomic interaction effects.

Description

Copyright 2012 Zwick et al; licensee Herbert Publications Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0). This permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

DOI: http://dx.doi.org/10.7243/2050-1412-1-4

DOI

10.7243/2050-1412-1-4

Persistent Identifier

http://archives.pdx.edu/ds/psu/11013
