Towards automatic clustering analysis using traces of information gain: The InfoGuide method

Paulo Rocha, Diego Pinheiro, Martin Cadeiras, Carmelo Bastos-Filho

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Clustering analysis has become a ubiquitous information retrieval tool in a wide range of domains, but a more automatic framework is still lacking. Although internal metrics are key to the successful retrieval of clusters, their effectiveness on real-world datasets is still not fully understood, mainly because they make unrealistic assumptions about the underlying datasets. We hypothesized that capturing traces of information gain between increasingly complex clustering retrievals (InfoGuide) enables an automatic clustering analysis with improved clustering retrievals. We validated the InfoGuide hypothesis by capturing the traces of information gain using the Kolmogorov-Smirnov statistic and comparing the clusters retrieved by InfoGuide against those retrieved by other commonly used internal metrics in artificially generated, benchmark, and real-world datasets. Our results suggested that InfoGuide can enable a more automatic clustering analysis and may be more suitable for retrieving clusters in real-world datasets displaying nontrivial statistical properties.
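
The abstract names the Kolmogorov-Smirnov statistic as the tool for capturing traces of information gain, but it does not say which distributions are compared. As a minimal sketch only, assuming the compared quantity is each point's distance to its assigned cluster center under two successive clusterings (an assumption, not taken from the paper), the two-sample statistic could be computed as follows. The function name `ks_statistic` and the distance values are hypothetical and purely illustrative.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs
    of the two samples.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # The largest gap between two step functions is attained at an observed value,
    # so it is enough to evaluate both empirical CDFs at every data point.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Made-up point-to-center distances for a clustering with k clusters and
# one with k + 1 clusters (illustrative numbers, not results from the paper).
dist_k = [0.9, 1.1, 1.4, 2.0, 2.3, 2.8]
dist_k_plus_1 = [0.3, 0.4, 0.6, 0.9, 1.0, 1.2]

# Assumption: one entry of the "trace" is the KS statistic between successive clusterings.
gain_trace = ks_statistic(dist_k, dist_k_plus_1)
print(f"KS statistic between successive clusterings: {gain_trace:.3f}")
```

A value near 0 would indicate that the two distributions are nearly indistinguishable, and a value near 1 that they barely overlap. How InfoGuide turns such values into a decision about the number of clusters is not stated in the abstract.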

Original language: English (US)
Title of host publication: Proceedings of the 33rd International Florida Artificial Intelligence Research Society Conference, FLAIRS 2020
Editors: Eric Bell, Roman Bartak
Publisher: The AAAI Press
Pages: 428-433
Number of pages: 6
ISBN (Electronic): 9781577358213
State: Published - 2020
Event: 33rd International Florida Artificial Intelligence Research Society Conference, FLAIRS 2020 - North Miami Beach, United States
Duration: May 17 2020 to May 20 2020

Publication series

Name: Proceedings of the 33rd International Florida Artificial Intelligence Research Society Conference, FLAIRS 2020

Conference

Conference: 33rd International Florida Artificial Intelligence Research Society Conference, FLAIRS 2020
Country/Territory: United States
City: North Miami Beach
Period: 5/17/20 to 5/20/20

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
