Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents

Document Type: Research Paper


1 Control and Intelligent Processing Center of Excellence, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran

2 Control and Intelligent Processing Center of Excellence, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran and School of Cognitive Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran


This article introduces the notions of functional space and functional concept as a means of knowledge representation and abstraction for reinforcement learning agents. These notions are then used as tools for knowledge transfer among agents. The agents are assumed to be heterogeneous: they have different state spaces but share the same dynamics, reward function, and action space. In other words, the agents have different representations of the environment while performing the same actions. The learning framework is $Q$-learning. Each dimension of the functional space is the normalized expected value of an action. An unsupervised clustering approach is used to form the functional concepts as fuzzy regions in the functional space. These concepts are abstracted further into a hierarchy using the same clustering approach, and the hierarchical concepts are then employed for knowledge transfer among agents. Properties of the proposed approach are examined in a set of case studies. The results show that the approach is highly effective for transfer learning among heterogeneous agents, especially in the early episodes of learning.
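The construction described above can be sketched in code. The following is a minimal, illustrative sketch only, not the paper's algorithm: it assumes a tabular $Q$-learning agent, maps each state's row of $Q$-values to a point in the functional space (each dimension the normalized expected value of one action), and then clusters those points. Plain k-means is substituted here for the paper's unsupervised fuzzy clustering, so the fuzzy membership degrees and the concept hierarchy are omitted; all function names are hypothetical.

```python
import numpy as np

def functional_point(q_row):
    """Map one state's Q-values to the functional space: each
    dimension is the normalized expected value of an action."""
    q = np.asarray(q_row, dtype=float)
    shifted = q - q.min()              # shift so all values are non-negative
    total = shifted.sum()
    if total == 0:                     # all actions equally valued
        return np.full(q.size, 1.0 / q.size)
    return shifted / total

def cluster_concepts(points, k, iters=50, seed=0):
    """Plain k-means over functional-space points; each centroid
    stands in for one functional concept (crisp, not fuzzy)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # distance of every point to every center, then nearest-center labels
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels

# Toy Q-table: 6 states x 3 actions, with three groups of
# functionally similar states (states preferring the same action).
Q = np.array([[1.0, 0.1, 0.1], [0.9, 0.2, 0.1],
              [0.1, 1.0, 0.1], [0.2, 0.9, 0.2],
              [0.1, 0.1, 1.0], [0.1, 0.2, 0.9]])
F = np.array([functional_point(row) for row in Q])
centers, labels = cluster_concepts(F, k=3)
```

Because the functional points depend only on the relative values of actions, agents with entirely different state spaces can still place their states in the same functional space, which is what makes the concepts usable for transfer.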

