The novel hierarchical clustering approach using self-organizing map with optimum dimension selection.

Tripathi K

doi:10.1002/hcs2.90

Tripathi K ¹

Affiliations

1. Department of Computer Applications The Maharaja SayajiRao University of Baroda Vadodara Gujarat India.

ORCIDs linked to this article

Tripathi K | 0000-0003-3226-8717

Health Care Science, 11 Apr 2024, 3(2):88-100
https://doi.org/10.1002/hcs2.90 PMID: 38939618

Abstract

Introduction

Data clustering is an important field of machine learning with applications in many areas, such as business analysis, manufacturing, energy, healthcare, travel, and logistics. A variety of clustering applications have already been developed. Data clustering approaches based on the self-organizing map (SOM) generally use map (grid) dimensions ranging from 2 × 2 to 8 × 8 (4-64 neurons, or microclusters) without any explicit justification for the particular dimension chosen, and therefore do not obtain optimized results. These algorithms then use a secondary approach to map the microclusters onto a lower dimension (the actual number of clusters), such as 2, 3, or 4, depending on the optimum number of clusters for the specific data set. In most published work, this secondary approach is not a SOM but another algorithm, such as cut tree.

Methods

This work proposes an approach for selecting the optimal higher dimension of the SOM for a given data set; the map at this dimension is then clustered again into the lower, actual dimension. Both the primary and the secondary stages use the SOM to cluster the data, and the work finds that the weight matrix of the SOM is very meaningful. The optimized two-dimensional configuration of the SOM is not the same for every data set, and this work also attempts to discover this configuration.
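The two-stage idea can be illustrated with a minimal sketch. This is not the author's implementation: the function names, the linear decay schedules, the sample-based initialisation, and the 1 × n_clusters layout of the second map are all assumptions made for illustration. The abstract does not state how the optimal grid size is selected, so `rows` and `cols` are simply passed in.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM online; returns one weight vector per neuron."""
    rng = random.Random(seed)
    # Assumption: initialise every neuron from a randomly drawn sample.
    weights = [list(rng.choice(data)) for _ in range(rows * cols)]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighbourhood radius
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                grid_d2 = ((coords[j][0] - coords[bmu][0]) ** 2
                           + (coords[j][1] - coords[bmu][1]) ** 2)
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def two_level_som(data, rows, cols, n_clusters, seed=0):
    """Cluster data with a SOM, then cluster that SOM's weight matrix with a second SOM."""
    # Stage 1: the higher-dimension map produces rows * cols microclusters.
    w1 = train_som(data, rows, cols, seed=seed)
    # Stage 2 (assumed layout): a 1 x n_clusters SOM trained on the stage-1 weights.
    w2 = train_som(w1, 1, n_clusters, seed=seed)
    neuron_to_cluster = [best_matching_unit(w2, w) for w in w1]
    # Each sample inherits the final cluster of its stage-1 best-matching neuron.
    return [neuron_to_cluster[best_matching_unit(w1, x)] for x in data]
```

Calling `two_level_som(samples, rows=3, cols=3, n_clusters=2)` on a list of equal-length numeric vectors returns one integer cluster label per sample. In practice the attributes would normally be scaled to a common range first, since the best-matching-unit search relies on Euclidean distance.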

Results

The adjusted Rand index obtained on the Iris, Wine, Wisconsin diagnostic breast cancer, New Thyroid, Seeds, A1, Imbalance, Dermatology, Ecoli, and Ionosphere data sets is 0.7173, 0.9134, 0.7543, 0.8041, 0.7781, 0.8907, 0.8755, 0.7543, 0.5013, and 0.1728, respectively. According to the author, these values outperform all other results available on the web; no attribute reduction was applied in this work.
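For readers unfamiliar with the metric, the adjusted Rand index (ARI) compares a predicted clustering with reference labels and corrects for chance agreement: 1 means identical partitions and values near 0 mean agreement no better than random. The sketch below uses the standard pair-counting formula; the function name is illustrative, and this is not the code used to produce the figures above.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same points."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare; treat as perfect agreement

    # Contingency counts: number of points sharing each (true, predicted) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of point pairs placed together in both, in true, and in predicted.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2.0
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (sum_pairs - expected) / (max_index - expected)
```

The index depends only on which points are grouped together, not on the label values themselves, so a clustering that swaps the names of two clusters still scores 1 against the reference.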

Conclusions

SOM is found to be superior to or on par with other clustering approaches, such as k-means, and could be used successfully to cluster all types of data sets. Ten benchmark data sets from diverse domains, including medical, biological, and chemical data as well as synthetic data sets, were tested in this work.
