1 Control and Intelligent Processing Center of Excellence, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
2 Control and Intelligent Processing Center of Excellence, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran and School of Cognitive Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
This article introduces the notions of functional space and functional concept as a means of knowledge representation and abstraction for reinforcement learning agents. These notions are used as a tool for knowledge transfer among agents. The agents are assumed to be heterogeneous: they have different state spaces but share the same dynamics, reward, and action space. In other words, the agents have different representations of the same environment while sharing the same actions. The learning framework is $Q$-learning. Each dimension of the functional space is the normalized expected value of an action. An unsupervised clustering approach is used to form the functional concepts as fuzzy regions in the functional space. The functional concepts are then abstracted further into a hierarchy using the clustering approach. The hierarchical concepts are employed for knowledge transfer among agents. Properties of the proposed approach are tested in a set of case studies. The results show that the approach is effective for transfer learning among heterogeneous agents, especially in the early episodes of learning.
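The idea of a functional space shared by heterogeneous agents can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the method of the article: the abstract does not specify the normalization, so min-max scaling per state is assumed here, and a plain nearest-point match stands in for the unsupervised fuzzy clustering. The function names, state labels, and $Q$-values are hypothetical.

```python
from typing import Dict, Hashable, List

def to_functional_space(q_row: List[float]) -> List[float]:
    """Map one state's action values to a point in [0, 1]^|A|.

    Assumption: min-max normalization per state; the article may
    use a different normalization.
    """
    lo, hi = min(q_row), max(q_row)
    if hi == lo:
        return [0.5] * len(q_row)  # no preference among actions
    return [(q - lo) / (hi - lo) for q in q_row]

def functional_points(q_table: Dict[Hashable, List[float]]) -> Dict[Hashable, List[float]]:
    """Project every state of a Q-table into the functional space."""
    return {s: to_functional_space(row) for s, row in q_table.items()}

def squared_distance(p: List[float], q: List[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_state(point: List[float], points: Dict[Hashable, List[float]]) -> Hashable:
    """Return the state whose functional point is closest to `point`.

    Stand-in for concept formation: the article clusters points into
    fuzzy concepts instead of matching single states.
    """
    return min(points, key=lambda s: squared_distance(point, points[s]))

# Two hypothetical agents: different state spaces, same three actions.
q_a = {(0, 0): [1.0, 3.0, 2.0], (0, 1): [5.0, 1.0, 1.0]}   # grid-cell states
q_b = {"s7": [10.0, 30.0, 20.0], "s9": [0.2, 0.1, 0.1]}    # sensor-code states

points_a = functional_points(q_a)
points_b = functional_points(q_b)

# States of agent B are matched to states of agent A by their action preferences.
match_s7 = nearest_state(points_b["s7"], points_a)
match_s9 = nearest_state(points_b["s9"], points_a)
print(match_s7, match_s9)
```

Because both agents share the action space, their states land in the same three-dimensional functional space even though the state labels have nothing in common. This is what makes the space a candidate medium for transfer.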