Unsupervised feature selection: A fuzzy multi-criteria decision-making approach

Document Type: Original Manuscript


Department of Computer Engineering, Faculty of Engineering, Lorestan University, Khorramabad, Iran


Feature selection (FS) has shown remarkable performance in reducing the dimensionality of high-dimensional datasets by selecting a good subset of features. Because labeling high-dimensional data can be expensive and time-consuming, labeled samples are not always available. Effective unsupervised FS methods are therefore essential in machine learning. This article proposes a fuzzy multi-criteria decision-making method for unsupervised FS in which an ensemble of unsupervised FS rankers is used to evaluate the features. The outputs of these rankers are aggregated with a fuzzy TOPSIS method. This is the first time a fuzzy multi-criteria decision-making approach has been used for an FS problem. The proposed strategy is compared with multiple competing FS methods to show its optimality and effectiveness. In terms of two classification metrics, F-score and accuracy, our approach appears superior to comparable strategies. It also runs quickly.
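To make the aggregation idea concrete, the following is a minimal sketch of a fuzzy TOPSIS aggregation over ranker scores. It is not the authors' implementation: the abstract does not specify the fuzzification, the criterion weights, or the distance measure, so the sketch assumes triangular fuzzy numbers with a fixed spread, equal weights for all rankers, the commonly used vertex distance, and ideal solutions fixed at (1, 1, 1) and (0, 0, 0). The function name `fuzzy_topsis_rank`, the `spread` parameter, and the sample scores are hypothetical.

```python
from math import sqrt


def fuzzy_topsis_rank(scores, spread=0.1):
    """Aggregate crisp ranker scores (rows = features, columns = rankers).

    Returns (order, closeness): feature indices sorted best-first and the
    closeness coefficient of every feature. Higher scores are assumed better.
    """
    n_rankers = len(scores[0])
    # Min-max normalise each ranker's column to [0, 1].
    columns = list(zip(*scores))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]  # guard constant columns

    def fuzzify(x):
        # Assumed fuzzification: symmetric triangular number clipped to [0, 1].
        return (max(x - spread, 0.0), x, min(x + spread, 1.0))

    def vertex_distance(a, b):
        # Vertex distance between two triangular fuzzy numbers.
        return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / 3.0)

    best, worst = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)  # fuzzy ideal and anti-ideal
    closeness = []
    for row in scores:
        fuzzy_row = [fuzzify((row[j] - lows[j]) / spans[j]) for j in range(n_rankers)]
        d_best = sum(vertex_distance(t, best) for t in fuzzy_row)
        d_worst = sum(vertex_distance(t, worst) for t in fuzzy_row)
        # Closeness coefficient: 1 means at the ideal, 0 at the anti-ideal.
        closeness.append(d_worst / (d_best + d_worst))
    order = sorted(range(len(scores)), key=lambda i: closeness[i], reverse=True)
    return order, closeness


# Hypothetical scores of three features from two rankers (illustrative only).
order, closeness = fuzzy_topsis_rank([[0.9, 0.8], [0.1, 0.2], [0.5, 0.5]])
print(order)  # [0, 2, 1]
```

In this sketch each ranker acts as one decision criterion and each feature as one alternative; selecting the top-k entries of `order` would give the final feature subset under the stated assumptions.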

