Original scientific article

FEATURE SELECTION METHOD USING HYBRID SWARM WITH IMPROVED FUZZY C-MEANS CLUSTERING IN DATA MINING FOR DISEASE DETECTION

By
M. Birundha Rani
Mother Teresa Women’s University, Dindigul, India

Dr. A. Subramani
M.V. Muthiah Govt. Arts College for Women, Dindigul, India

Abstract

Feature Selection (FS) is a key method for addressing the dimensionality problem in Data Mining (DM) tasks, but conventional FS techniques do not scale well to very large feature spaces. The HPSO-IKM approach has a rather long processing time, so its stages still need to be refined to shorten detection time. In addition, the poor local search capability of PSO (Particle Swarm Optimization) and its slow convergence in the refining search phase prevent it from offsetting poor initialization by minimizing IC (Intra-Clustering) errors. This paper proposes a novel approach to the dimensionality problem in which a good feature subset is produced by combining a correlation metric with clustering. After pre-processing with Z-Score Normalization (ZSN), a computational model is constructed to identify the relevant features under the applicable constraints, and features are extracted via Principal Component Analysis (PCA). Next, Multi-Objective Glowworm Swarm Optimization with Improved Fuzzy C-Means Clustering (MOGWO-IFCM) removes unnecessary features and selects non-redundant features from every cluster based on correlation measures. In this approach, the GSO (Glowworm Swarm Optimization) technique first produces an optimal solution, which is then used as the initial clustering center and refined by the IFCM technique. The proposed approach is evaluated on UCI datasets using a Modified Long Short-Term Memory (MLSTM) classifier, and the results are compared with those of other well-known FS methods. Percentage-based criteria are used to assess the accuracy of the proposed technique with varying numbers of relevant features. The experimental results demonstrate the accuracy and efficiency of the proposed technique.
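
The core idea of the abstract (normalize the data, cluster the features, then keep one feature per cluster according to a correlation measure) can be illustrated with a minimal sketch. This is not the authors' MOGWO-IFCM method: it uses standard fuzzy c-means with random initialization in place of the GSO-seeded improved variant, omits the PCA and MLSTM stages, and assumes absolute Pearson correlation with the target as the correlation measure. All function names and parameters (`zscore`, `fuzzy_c_means`, `select_features`, `m`, `n_iter`, `tol`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def zscore(X):
    """Z-score normalization: zero mean and unit variance per feature (column)."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant features
    return (X - mu) / sigma


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (assumption: not the paper's improved variant).

    Returns (centers, U), where U[i, k] is the membership of point i in cluster k.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((points.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)  # memberships of each point sum to 1
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the points.
        centers = (W.T @ points) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def select_features(X, y, n_clusters):
    """Cluster the features, then keep the one most correlated with y per cluster.

    Assumes y is not constant (its standard deviation must be non-zero).
    """
    Z = zscore(X)
    _, U = fuzzy_c_means(Z.T, n_clusters)  # each feature (column) is one point
    labels = U.argmax(axis=1)              # harden the fuzzy memberships
    yc = (y - y.mean()) / y.std()
    relevance = np.abs(Z.T @ yc) / len(y)  # |Pearson correlation| with target
    selected = []
    for k in range(n_clusters):
        members = np.flatnonzero(labels == k)
        if members.size:                   # a cluster may end up empty
            selected.append(int(members[relevance[members].argmax()]))
    return sorted(selected)
```

Calling `select_features(X, y, n_clusters)` on a samples-by-features array `X` and a numeric target vector `y` returns the column indices of at most `n_clusters` features. Replacing the random initialization in `fuzzy_c_means` with centers supplied by a swarm optimizer would bring the sketch closer to the pipeline the abstract describes.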

This is an open access article distributed under the Creative Commons Attribution Non-Commercial (CC BY-NC) License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
