Document Type

Presentation

Publication Date

7-2018

Subjects

Reconstructability Analysis, Information Theory, Probabilistic graphical modeling, Multivariate analysis, Discrete multivariate modeling, Data mining

Abstract

Reconstructability Analysis (RA) is an analytical approach developed in the systems community that combines graph theory and information theory. Graph theory provides the structure of relations (the model of the data) between variables, and information theory characterizes the strength and nature of those relations. RA has three primary approaches to modeling data: variable-based (VB) models without loops (acyclic graphs), VB models with loops (cyclic graphs), and state-based models (nearly always cyclic, with individual states specifying model constraints). These models can be either directed or neutral. Directed models focus on a single response variable, whereas neutral models focus on all relations between variables.

The lattice of possible graph structures for an RA neutral system VB model with loops depends on the number of variables in the data. With three variables there are nine possible specific structures; with six variables there are over seven million. The lattice grows hyper-exponentially with the number of variables. For data with n variables, the lattice is constructed by adding all possible new single dependent relations to the structures of the prior level until the saturated model, which is the data itself, is reached.
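As a rough illustration of the counts quoted above, the following sketch enumerates neutral RA structures by brute force rather than by climbing the lattice level by level. It assumes that a specific structure can be treated as a set of relations (subsets of the variables) in which every variable appears and no relation is contained in another. The names `ra_structures` and `label` are hypothetical, and the approach is practical only for very few variables.

```python
from itertools import combinations


def ra_structures(variables):
    """Brute-force list of neutral RA structures (loops allowed).

    Assumption: a structure is a set of relations that covers every
    variable and in which no relation is a subset of another.
    """
    variables = tuple(variables)
    everything = frozenset(variables)
    # All candidate relations: every non-empty subset of the variables.
    relations = [
        frozenset(c)
        for size in range(1, len(variables) + 1)
        for c in combinations(variables, size)
    ]
    found = []
    for k in range(1, len(relations) + 1):
        for model in combinations(relations, k):
            covers_all = frozenset().union(*model) == everything
            no_nesting = all(
                not (a < b or b < a) for a, b in combinations(model, 2)
            )
            if covers_all and no_nesting:
                found.append(model)
    return found


def label(model):
    """Format a structure in colon notation, e.g. AB:AC:BC."""
    ordered = sorted(model, key=lambda r: (-len(r), sorted(r)))
    return ":".join("".join(sorted(r)) for r in ordered)


if __name__ == "__main__":
    for model in ra_structures("ABC"):
        print(label(model))
```

Under these assumptions the enumeration yields nine structures for three variables, from A:B:C up to the saturated model ABC, and 114 for four variables.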

Bayesian networks (BNs) are the subset of chain graphs (also known as block recursive models) that have directed edges only. For a three-variable BN there are twenty-five specific structures, for a four-variable BN there are 543, and for a five-variable BN there are almost thirty thousand. As with RA, the number of structures grows hyper-exponentially with the number of variables because of the possible combinations of variables and edge directions.
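The BN counts can be checked with Robinson's recurrence for the number of labeled directed acyclic graphs, assuming that each specific BN structure corresponds to exactly one labeled DAG. The function name `count_dags` is illustrative. Under that assumption the recurrence gives 25 structures for three variables, 543 for four, and 29,281 (almost thirty thousand) for five.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n nodes (Robinson's recurrence).

    a(0) = 1
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k(n-k)) * a(n-k)
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, count_dags(n))
```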

Generally speaking, the BN lattice is developed in the same manner as the RA lattice, by adding a single new relation at each level, with the additional step that all possible edge orientations are identified at each level. The lattices of neutral system RA structures and BN structures largely overlap; however, there are some RA structures that BNs cannot represent and some BN structures that RA cannot represent. For example, the structure ABCA:C, with probability distribution p(A)p(C)p(B|AC), is not found in the RA lattice but is common in the BN lattice. In contrast, the RA neutral system structure AB:AC:BC, which contains a loop, is not found in the BN lattice.
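To make the BN-only example concrete, the sketch below builds a joint distribution of the form p(A)p(C)p(B|AC) and checks two properties that follow from that factorization: A and C are independent marginally, yet become dependent once B is observed. All probability values are hypothetical and chosen only for illustration, as are the function names.

```python
from itertools import product

# Hypothetical binary distributions (illustrative values, not from the talk).
p_a = {0: 0.6, 1: 0.4}
p_c = {0: 0.7, 1: 0.3}
# P(B = 1 | A = a, C = c), keyed by (a, c).
p_b1_given_ac = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}

STATES = (0, 1)


def joint(a, b, c):
    """p(a, b, c) = p(a) * p(c) * p(b | a, c)."""
    p_b1 = p_b1_given_ac[(a, c)]
    return p_a[a] * p_c[c] * (p_b1 if b == 1 else 1.0 - p_b1)


def marginal_gap():
    """Largest |p(a, c) - p(a) p(c)|; zero means A and C are independent."""
    return max(
        abs(sum(joint(a, b, c) for b in STATES) - p_a[a] * p_c[c])
        for a, c in product(STATES, repeat=2)
    )


def conditional_gap(b):
    """Largest |p(a, c | b) - p(a | b) p(c | b)| for a fixed value of B."""
    p_b = sum(joint(a, b, c) for a, c in product(STATES, repeat=2))
    p_a_b = {a: sum(joint(a, b, c) for c in STATES) / p_b for a in STATES}
    p_c_b = {c: sum(joint(a, b, c) for a in STATES) / p_b for c in STATES}
    return max(
        abs(joint(a, b, c) / p_b - p_a_b[a] * p_c_b[c])
        for a, c in product(STATES, repeat=2)
    )


if __name__ == "__main__":
    print("marginal gap:", marginal_gap())
    print("conditional gap given B=1:", conditional_gap(1))
```

With these values the marginal gap is zero up to floating-point error, while the conditional gap is clearly nonzero.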

This talk highlights and compares preliminary results from RA, BN, and standard linear regression in predicting dynamics on the bulk electric grid. The best prediction models were identified using exploratory RA and BN search algorithms that search the lattice of possible graph structures for the best model fit. Preliminary results show that RA and BN both outperform linear regression in prediction in the tails of the distribution, and that BN marginally outperforms RA overall, although RA uses fewer degrees of freedom. This talk will focus on these preliminary results and offer explanations for the differences in prediction performance, as well as opportunities for extending the research.

Description

Presented at ISSS 2018, Corvallis, July 22-27

Persistent Identifier

https://archives.pdx.edu/ds/psu/26676
