(2302-7893) Improving the genetic algorithm in fuzzy cluster analysis for numerical data and its applications

Document Type: Research Paper

Authors

1 Faculty of Mechanical - Electrical and Computer Engineering, School of Technology, Van Lang University, Ho Chi Minh City, Vietnam

2 College of Natural Science, Can Tho University, Can Tho City, Vietnam

Abstract

This study proposes an automatic genetic algorithm for fuzzy cluster analysis of numerical data. A new measure,
called the FB index, serves as the objective function of the genetic algorithm. The algorithm not only determines
the appropriate number of groups but also improves the steps of the traditional genetic algorithm, namely the
crossover, mutation, and selection operators. The proposed algorithm is illustrated step by step through a
numerical example and runs quickly using an established Matlab procedure. Experimental results show that the
proposed algorithm outperforms existing algorithms. Moreover, it has been applied to image recognition and to
building a fuzzy time series model. These applications show the potential of this study for real problems in
many different fields.
