Sponsor
This work was partially supported by the National Science Foundation under grant ECS-9904378.
Published In
Kybernetes
Document Type
Post-Print
Publication Date
2004
Subjects
Reconstructability Analysis, Information Theory, Probabilistic graphical modeling, Multivariate analysis, Discrete multivariate modeling, Data mining
Abstract
Extended dependency analysis (EDA) is a heuristic search technique for finding significant relationships between nominal variables in large data sets. The directed version of EDA searches for maximally predictive sets of independent variables with respect to a target dependent variable. The original implementation of EDA was an extension of reconstructability analysis. Our new implementation adds a variety of statistical significance tests at each decision point that allow the user to tailor the algorithm to a particular objective. It also utilizes data structures appropriate for the sparse data sets customary in contemporary data mining problems. This paper gives two examples that illustrate different approaches to testing model quality.
DOI
10.1108/03684920410534010
Persistent Identifier
http://archives.pdx.edu/ds/psu/16490
Citation Details
Shannon, T. and Zwick, M. 2004. “Directed Extended Dependency Analysis for Data Mining.” Kybernetes, vol. 33, no. 5/6, pp. 973-983.
Description
This is the authors' version of a paper which subsequently appeared in Kybernetes, published by Emerald Group Publishing Limited. The version of record may be found at http://dx.doi.org/10.1108/03684920410534010.