From patterned response dependency to structured covariate dependency: Entropy based categorical-pattern-matching

Hsieh Fushing, Shan Yu Liu, Yin Chen Hsieh, Brenda Mccowan

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Data generated from a system of interest typically consist of measurements on many covariate features and possibly multiple response features across all subjects in a designated ensemble. Such data are naturally represented by one response matrix set against one covariate matrix. A matrix lattice is an advantageous platform for simultaneously accommodating heterogeneous data types (continuous, discrete and categorical) and for exploring hidden dependency among and between features and subjects. After each feature is individually renormalized with respect to its own histogram, the categorical version of mutual conditional entropy is evaluated for all pairs of response and covariate features according to combinatorial information theory. Then, by applying Data Cloud Geometry (DCG) algorithmic computations to this mutual conditional entropy matrix, the features are partitioned into multiple synergistic feature-groups. Distinct synergistic feature-groups embrace distinct structures of dependency. The explicit details of dependency among members of a synergistic feature-group are seen through multiscale compositions of blocks computed by a computing paradigm called Data Mechanics. We then propose a categorical pattern matching approach to establish a directed associative linkage: from the patterned response dependency to serial structured covariate dependency. The graphic display of such a directed associative linkage is termed an information flow, and the degrees of association are evaluated via tree-to-tree mutual conditional entropy. This new universal way of discovering system knowledge is illustrated through five data sets. In each case, the emergent visible heterogeneity is an organization of discovered knowledge.
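The pairwise step described in the abstract (categorize each feature, then score a response/covariate pair by mutual conditional entropy) can be illustrated with a minimal sketch. This record does not give the paper's formulas, so two assumptions are made here: equal-width binning stands in for the histogram-based renormalization, and mutual conditional entropy is taken to be the plain average of H(X|Y) and H(Y|X). The function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def categorize(values, n_bins=4):
    """Map numeric values to bin labels 0..n_bins-1.

    Assumption: equal-width bins stand in for the histogram-based
    renormalization mentioned in the abstract.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of categorical labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(x, y):
    """H(X | Y), computed as H(X, Y) - H(Y)."""
    return entropy(list(zip(x, y))) - entropy(y)


def mutual_conditional_entropy(x, y):
    """Assumed form: the average of H(X|Y) and H(Y|X).

    Zero means each feature determines the other; larger values mean
    weaker dependency.
    """
    return 0.5 * (conditional_entropy(x, y) + conditional_entropy(y, x))


# A numeric covariate is categorized, then compared with a categorical response.
covariate = categorize([1.0, 2.0, 3.0, 4.0], n_bins=2)
dependent_response = ["a", "a", "b", "b"]
unrelated_response = ["a", "b", "a", "b"]
print(mutual_conditional_entropy(covariate, dependent_response))
print(mutual_conditional_entropy(covariate, unrelated_response))
```

Evaluating this score for every response/covariate pair would give the kind of matrix that the abstract says is passed to the DCG computations; the clustering step itself is not sketched here.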

Original language: English (US)
Pages (from-to): e0198253
Journal: PLoS One
Volume: 13
Issue number: 6
DOI: 10.1371/journal.pone.0198253
State: Published - Jan 1 2018

Fingerprint

Pattern matching
Entropy
Information Theory
Mechanics
Display devices
Association reactions
Geometry
Chemical analysis

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

From patterned response dependency to structured covariate dependency: Entropy based categorical-pattern-matching. / Fushing, Hsieh; Liu, Shan Yu; Hsieh, Yin Chen; Mccowan, Brenda.

In: PLoS One, Vol. 13, No. 6, 01.01.2018, p. e0198253.

@article{5b54c05b048a4287a095fe7ad07fa101,
title = "From patterned response dependency to structured covariate dependency: Entropy based categorical-pattern-matching",
author = "Hsieh Fushing and Liu, {Shan Yu} and Hsieh, {Yin Chen} and Brenda Mccowan",
year = "2018",
month = "1",
day = "1",
doi = "10.1371/journal.pone.0198253",
language = "English (US)",
volume = "13",
pages = "e0198253",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "6",

}

TY - JOUR

T1 - From patterned response dependency to structured covariate dependency

T2 - Entropy based categorical-pattern-matching

AU - Fushing, Hsieh

AU - Liu, Shan Yu

AU - Hsieh, Yin Chen

AU - Mccowan, Brenda

PY - 2018/1/1

Y1 - 2018/1/1

UR - http://www.scopus.com/inward/record.url?scp=85057224952&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057224952&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0198253

DO - 10.1371/journal.pone.0198253

M3 - Article

C2 - 29902187

AN - SCOPUS:85057224952

VL - 13

SP - e0198253

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 6

ER -