Supporting Analysis of Dimensionality Reduction Results with Contrastive Learning

Takanori Fujiwara, Oh Hyun Kwon, Kwan Liu Ma

Research output: Contribution to journal › Article

Abstract

Dimensionality reduction (DR) is frequently used for analyzing and visualizing high-dimensional data as it provides a good first glance of the data. However, to interpret the DR result for gaining useful insights from the data, it would take additional analysis effort such as identifying clusters and understanding their characteristics. While there are many automatic methods (e.g., density-based clustering methods) to identify clusters, effective methods for understanding a cluster's characteristics are still lacking. A cluster can be mostly characterized by its distribution of feature values. Reviewing the original feature values is not a straightforward task when the number of features is large. To address this challenge, we present a visual analytics method that effectively highlights the essential features of a cluster in a DR result. To extract the essential features, we introduce an enhanced usage of contrastive principal component analysis (cPCA). Our method, called ccPCA (contrasting clusters in PCA), can calculate each feature's relative contribution to the contrast between one cluster and other clusters. With ccPCA, we have created an interactive system including a scalable visualization of clusters' feature contributions. We demonstrate the effectiveness of our method and system with case studies using several publicly available datasets.
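The contrastive PCA (cPCA) idea that the abstract builds on can be sketched in a few lines. The sketch below is illustrative only and is not the authors' ccPCA implementation. It assumes the standard cPCA formulation (the top eigenvector of the target covariance minus alpha times the background covariance), treats one cluster as the target and all other points as the background, and uses a fixed contrast parameter `alpha`. The function name `cpca_feature_loadings` and the toy data are hypothetical. The paper's own choice of target and background matrices and its selection of the contrast parameter are not reproduced here.

```python
import numpy as np


def cpca_feature_loadings(target, background, alpha=1.0):
    """Return the first contrastive principal direction (unit length).

    Assumption: standard cPCA, i.e. the top eigenvector of
    cov(target) - alpha * cov(background). Each entry of the returned
    vector is the loading of one original feature on that direction.
    """
    # Center each group on its own mean before computing covariances.
    t = target - target.mean(axis=0)
    b = background - background.mean(axis=0)
    cov_t = t.T @ t / (len(t) - 1)
    cov_b = b.T @ b / (len(b) - 1)

    # The difference matrix is symmetric, so eigh applies; it returns
    # eigenvalues in ascending order, so the last column is the top one.
    _, evecs = np.linalg.eigh(cov_t - alpha * cov_b)
    direction = evecs[:, -1]

    # Eigenvector sign is arbitrary: make the largest loading positive.
    strongest = np.argmax(np.abs(direction))
    return direction * np.sign(direction[strongest])


# Toy data with three features (hypothetical, for illustration only):
# feature 0 varies mostly in the background, feature 1 varies equally
# in both groups, and feature 2 varies mostly in the target cluster.
rng = np.random.default_rng(0)
background = rng.normal(size=(500, 3)) * np.array([5.0, 3.0, 0.5])
target = rng.normal(size=(500, 3)) * np.array([0.5, 3.0, 5.0])

loadings = cpca_feature_loadings(target, background, alpha=1.0)
top_feature = int(np.argmax(np.abs(loadings)))
print("loadings:", np.round(loadings, 3))
print("feature with the largest contrastive loading:", top_feature)
```

In this toy setup the feature whose variance is large in the target but small in the background receives the largest absolute loading, which is the kind of per-feature signal that a contribution visualization could display. Ordinary PCA on the target alone would not separate it as cleanly from feature 1, which varies in both groups.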

Original language: English (US)
Article number: 8805461
Pages (from-to): 45-55
Number of pages: 11
Journal: IEEE Transactions on Visualization and Computer Graphics
Volume: 26
Issue number: 1
DOI: 10.1109/TVCG.2019.2934251
State: Published - Jan 2020

Fingerprint

Principal component analysis
Visualization

Keywords

  • contrastive learning
  • Dimensionality reduction
  • high-dimensional data
  • principal component analysis
  • visual analytics

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Computer Graphics and Computer-Aided Design

Cite this

Supporting Analysis of Dimensionality Reduction Results with Contrastive Learning. / Fujiwara, Takanori; Kwon, Oh Hyun; Ma, Kwan Liu.

In: IEEE Transactions on Visualization and Computer Graphics, Vol. 26, No. 1, 8805461, 01.2020, p. 45-55.

@article{38df0e4dbdf64aa086c08709e256a5f2,
title = "Supporting Analysis of Dimensionality Reduction Results with Contrastive Learning",
abstract = "Dimensionality reduction (DR) is frequently used for analyzing and visualizing high-dimensional data as it provides a good first glance of the data. However, to interpret the DR result for gaining useful insights from the data, it would take additional analysis effort such as identifying clusters and understanding their characteristics. While there are many automatic methods (e.g., density-based clustering methods) to identify clusters, effective methods for understanding a cluster's characteristics are still lacking. A cluster can be mostly characterized by its distribution of feature values. Reviewing the original feature values is not a straightforward task when the number of features is large. To address this challenge, we present a visual analytics method that effectively highlights the essential features of a cluster in a DR result. To extract the essential features, we introduce an enhanced usage of contrastive principal component analysis (cPCA). Our method, called ccPCA (contrasting clusters in PCA), can calculate each feature's relative contribution to the contrast between one cluster and other clusters. With ccPCA, we have created an interactive system including a scalable visualization of clusters' feature contributions. We demonstrate the effectiveness of our method and system with case studies using several publicly available datasets.",
keywords = "contrastive learning, Dimensionality reduction, high-dimensional data, principal component analysis, visual analytics",
author = "Fujiwara, Takanori and Kwon, {Oh Hyun} and Ma, {Kwan Liu}",
year = "2020",
month = "1",
doi = "10.1109/TVCG.2019.2934251",
language = "English (US)",
volume = "26",
pages = "45--55",
journal = "IEEE Transactions on Visualization and Computer Graphics",
issn = "1077-2626",
publisher = "IEEE Computer Society",
number = "1"
}

TY - JOUR

T1 - Supporting Analysis of Dimensionality Reduction Results with Contrastive Learning

AU - Fujiwara, Takanori

AU - Kwon, Oh Hyun

AU - Ma, Kwan Liu

PY - 2020/1

Y1 - 2020/1

AB - Dimensionality reduction (DR) is frequently used for analyzing and visualizing high-dimensional data as it provides a good first glance of the data. However, to interpret the DR result for gaining useful insights from the data, it would take additional analysis effort such as identifying clusters and understanding their characteristics. While there are many automatic methods (e.g., density-based clustering methods) to identify clusters, effective methods for understanding a cluster's characteristics are still lacking. A cluster can be mostly characterized by its distribution of feature values. Reviewing the original feature values is not a straightforward task when the number of features is large. To address this challenge, we present a visual analytics method that effectively highlights the essential features of a cluster in a DR result. To extract the essential features, we introduce an enhanced usage of contrastive principal component analysis (cPCA). Our method, called ccPCA (contrasting clusters in PCA), can calculate each feature's relative contribution to the contrast between one cluster and other clusters. With ccPCA, we have created an interactive system including a scalable visualization of clusters' feature contributions. We demonstrate the effectiveness of our method and system with case studies using several publicly available datasets.

KW - contrastive learning

KW - Dimensionality reduction

KW - high-dimensional data

KW - principal component analysis

KW - visual analytics

UR - http://www.scopus.com/inward/record.url?scp=85075631291&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85075631291&partnerID=8YFLogxK

U2 - 10.1109/TVCG.2019.2934251

DO - 10.1109/TVCG.2019.2934251

M3 - Article

C2 - 31425080

AN - SCOPUS:85075631291

VL - 26

SP - 45

EP - 55

JO - IEEE Transactions on Visualization and Computer Graphics

JF - IEEE Transactions on Visualization and Computer Graphics

SN - 1077-2626

IS - 1

M1 - 8805461

ER -