Comparing different stopping criteria for fuzzy decision tree induction through IDFID3

Document Type: Research Paper

Authors

1 Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

2 Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

Abstract

Fuzzy Decision Tree (FDT) classifiers combine decision trees with the approximate reasoning offered by fuzzy representation to deal with language and measurement uncertainties. When an FDT induction algorithm uses stopping criteria to halt the tree's growth early, the threshold values of those criteria control the number of nodes. Finding a proper threshold value for a stopping criterion is one of the greatest challenges in FDT induction. In this paper, we propose a new method for FDT induction, named Iterative Deepening Fuzzy ID3 (IDFID3), that controls the tree's growth by dynamically setting the threshold value of the stopping criterion in an iterative procedure. The final FDT induced by IDFID3 and the one obtained by the common Fuzzy ID3 (FID3) are the same when the induced FDTs have equal numbers of nodes, but our main intention in introducing IDFID3 is to compare different stopping criteria through this algorithm. To that end, a new stopping criterion named Normalized Maximum fuzzy information Gain multiplied by Number of Instances (NMGNI) is proposed, and IDFID3 is used to compare it against the other stopping criteria. Generally speaking, this paper presents a method, based on IDFID3, for comparing different stopping criteria independently of their threshold values. The comparison results show that FDTs induced with the proposed stopping criterion are superior to the others in most situations, and that the number-of-instances stopping criterion performs better than the fuzzy information gain stopping criterion in terms of complexity (i.e., number of nodes) and classification accuracy. In addition, both the tree depth and fuzzy information gain stopping criteria outperform the fuzzy entropy, accuracy, and number-of-instances criteria in terms of the mean depth of the generated FDTs.
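The iterative procedure described above can be illustrated with a loose sketch. It is not the paper's algorithm: the abstract gives no pseudocode, so everything here is an assumption made for illustration. The tree builder `grow` is a crisp stand-in that halves a node's instances instead of partitioning them with fuzzy sets, the only stopping criterion is a minimum number of instances, and the names `Node`, `grow`, `count_nodes`, `leaves`, and `iterative_deepening` are hypothetical. The sketch shows one plausible reading of "dynamically setting the threshold": start with a threshold strict enough to leave only the root, then relax it step by step until the next step would exceed a node budget.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    size: int  # number of instances reaching this node (crisp stand-in)
    children: List["Node"] = field(default_factory=list)


def grow(size: int, min_instances: int) -> Node:
    """Toy induction: halve a node unless the stopping criterion fires.

    Assumed criterion: stop when the node holds fewer than min_instances.
    """
    node = Node(size)
    if size >= min_instances and size > 1:
        left = size // 2
        node.children = [grow(left, min_instances),
                         grow(size - left, min_instances)]
    return node


def count_nodes(node: Node) -> int:
    return 1 + sum(count_nodes(child) for child in node.children)


def leaves(node: Node) -> List[Node]:
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def iterative_deepening(n_instances: int, max_nodes: int) -> Tuple[Node, int]:
    """Relax the threshold step by step while the tree fits the node budget."""
    threshold = n_instances + 1           # strict enough that only the root exists
    best = grow(n_instances, threshold)
    while True:
        # Leaves that the criterion stopped but that could still be split.
        expandable = [leaf.size for leaf in leaves(best) if leaf.size > 1]
        if not expandable:
            break
        # Relax the threshold just enough to expand the largest stopped leaf.
        next_threshold = max(expandable)
        candidate = grow(n_instances, next_threshold)
        if count_nodes(candidate) > max_nodes:
            break                          # next step would exceed the budget
        best, threshold = candidate, next_threshold
    return best, threshold


tree, final_threshold = iterative_deepening(n_instances=16, max_nodes=7)
print(count_nodes(tree), final_threshold)
```

In this toy setting the caller specifies a node budget rather than a threshold, and the loop reports the threshold it ended on. That mirrors, under the stated assumptions, why such a procedure lets stopping criteria be compared at equal tree sizes instead of at hand-picked threshold values.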

Keywords

