Weighted K-nearest neighbors classification based on Whale optimization algorithm

Document Type: Research Paper


Department of Computer Engineering, Miyaneh Branch, Islamic Azad University, Miyaneh, Iran


K-Nearest Neighbors (KNN) is a supervised machine learning classification algorithm that labels a sample by a vote among its nearest neighbors. Its performance is affected by several factors, such as imbalanced class distributions, poor scalability, and the assumption that all training samples are equally important. Because of the importance of KNN, several improved versions have been introduced, such as fuzzy KNN, weighted KNN, and KNN with a variable number of neighbors. This paper proposes a weighted KNN based on the Whale Optimization Algorithm (WOA) to increase detection accuracy. The proposed algorithm assigns a weight to every feature of each training sample and uses the WOA to search for the optimized weight matrix. The algorithm is implemented and evaluated on five standard datasets. The evaluation results show that the proposed algorithm performs better than both weighted KNN based on the Genetic Algorithm (GA) and the classic KNN algorithm.
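The idea summarized above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the weighted Euclidean distance, the use of validation accuracy as the fitness function, the clamping of weights to [0, 1], the scalar WOA coefficients, the spiral constant b = 1, and all function names (`weighted_knn_predict`, `accuracy`, `to_matrix`, `woa_optimize`) are assumptions made for illustration. The update rules follow the commonly described form of the WOA (encircling the best solution, random exploration, and a spiral move).

```python
import math
import random
from collections import Counter


def weighted_knn_predict(train_X, train_y, W, x, k=3):
    """Majority vote among the k training samples nearest to x.

    W[i][j] is an assumed non-negative weight for feature j of training
    sample i, applied inside a weighted Euclidean distance.
    """
    dists = []
    for xi, yi, wi in zip(train_X, train_y, W):
        d = math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(wi, xi, x)))
        dists.append((d, yi))
    dists.sort(key=lambda t: t[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, W, val_X, val_y, k=3):
    """Fraction of validation samples classified correctly (assumed fitness)."""
    hits = sum(
        weighted_knn_predict(train_X, train_y, W, x, k) == y
        for x, y in zip(val_X, val_y)
    )
    return hits / len(val_y)


def to_matrix(flat, n_rows, n_cols):
    """Reshape a flat whale position into an n_rows x n_cols weight matrix."""
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def woa_optimize(fitness, dim, n_whales=8, n_iter=20, seed=0):
    """Simplified WOA that maximizes `fitness` over vectors in [0, 1]^dim."""
    rng = random.Random(seed)
    whales = [[rng.random() for _ in range(dim)] for _ in range(n_whales)]
    # Seed one whale with all-ones weights, i.e. plain unweighted KNN.
    whales[0] = [1.0] * dim
    scores = [fitness(w) for w in whales]
    best_i = max(range(n_whales), key=scores.__getitem__)
    best, best_score = whales[best_i][:], scores[best_i]

    for t in range(n_iter):
        a = 2.0 * (1 - t / n_iter)  # decreases linearly from 2 towards 0
        for i in range(n_whales):
            r1, r2, p = rng.random(), rng.random(), rng.random()
            l = rng.uniform(-1, 1)
            A, C = 2 * a * r1 - a, 2 * r2
            if p < 0.5:
                # |A| < 1: encircle the best whale; otherwise explore
                # around a randomly chosen whale.
                ref = best if abs(A) < 1 else whales[rng.randrange(n_whales)]
                new = [rj - A * abs(C * rj - xj)
                       for rj, xj in zip(ref, whales[i])]
            else:
                # Spiral move towards the best whale (spiral constant b = 1).
                new = [abs(bj - xj) * math.exp(l) * math.cos(2 * math.pi * l) + bj
                       for bj, xj in zip(best, whales[i])]
            # Clamp so weights stay valid for the square root in the distance.
            whales[i] = [min(1.0, max(0.0, v)) for v in new]
            s = fitness(whales[i])
            if s > best_score:
                best, best_score = whales[i][:], s
    return best, best_score


if __name__ == "__main__":
    # Toy data, invented purely for this illustration.
    train_X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
    train_y = [0, 0, 1, 1]
    val_X = [[0.1, 0.2], [1.0, 1.0]]
    val_y = [0, 1]
    n, m = len(train_X), len(train_X[0])

    def fitness(flat):
        return accuracy(train_X, train_y, to_matrix(flat, n, m), val_X, val_y)

    best, score = woa_optimize(fitness, dim=n * m)
    print("best validation accuracy:", score)
    print("weight matrix:", to_matrix(best, n, m))
```

Because one whale starts at the all-ones weight vector, the best fitness found in this sketch is never lower than that of unweighted KNN on the same validation split. A weight matrix with one entry per training sample and feature grows with the size of the training set, so a practical implementation would likely need a larger population and more iterations than the defaults used here.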

