Lecture Notes in Artificial Intelligence 3213
Edited by J. G. Carbonell and J. Siekmann
Subseries of Lecture Notes in Computer Science
Knowledge-Based Intelligent Information and Engineering Systems
8th International Conference, KES 2004
Wellington, New Zealand, September 20-25, 2004
Proceedings, Part I
Springer
eBook ISBN: 3-540-30132-1
Print ISBN: 3-540-23318-0
©2005 Springer Science + Business Media, Inc.
Print ©2004 Springer-Verlag Berlin Heidelberg
All rights reserved
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher
Created in the United States of America
Visit Springer's eBookstore at: http://ebooks.springerlink.com
and the Springer Global Website Online at: http://www.springeronline.com
Preface
We were very pleased to once again extend to the delegates, and, we are pleased to say, our friends, the warmest of welcomes to the International Conference on Knowledge-Based Intelligent Information and Engineering Systems, held at the Wellington Institute of Technology in Wellington, New Zealand.

The KES conferences attract a wide range of interest. The broad focus of the conference series is the theory and applications of computational intelligence and emergent technologies. Once purely a research field, intelligent systems have advanced to the point where their abilities have been incorporated into many conventional application areas. The quest to encapsulate human knowledge and capabilities in domains such as reasoning, problem solving and sensory analysis has been avidly pursued, because these abilities have been demonstrated to have definite practical applications. The techniques long ago reached the point where they are being exploited to provide commercial advantages for companies and real beneficial effects on profits.

KES 2004 provided a valuable mechanism for delegates to obtain an in-depth view of the latest intelligent-systems research into a range of algorithms, tools and techniques. It also gave delegates the chance to come into contact with those applying intelligent systems in diverse commercial areas. The combination of theory and practice represents a uniquely valuable opportunity to appreciate the full spectrum of intelligent-systems activity and the “state of the art”.

For the first time in the short history of KES, the conference came to New Zealand. KES 2004 aimed to provide not only a high-tech forum for presenting results on the theory and applications of intelligent systems and techniques, but also a focus on significant emerging intelligent technologies, including evolvable hardware (EHW), evolutionary computation in computational intelligence, DNA computing, artificial immune systems (AIS), bioinformatics using intelligent and machine learning techniques, and intelligent Web mining.

The impressive reach of the KES conference series was confirmed, and we broke some KES records: about 500 attendees from 55 countries took part and, for the first time in the conference's history, more than one third of the participants presenting high-quality papers were Ph.D. students from all over the world. This last point reflects the major role that the KES organization and its conferences play in supporting and educating practitioners working in intelligent systems and emergent technologies.

Thanking all the individuals who contributed to a conference like this is always fraught with difficulty, as someone is always unintentionally omitted. The WelTec team, including Gary Hartley, the conference administrator, Michael Hyndman, the conference Web page designer, and the Local Organizing Committee, chaired by Dr. Linda Sissons, WelTec CEO, all worked hard to bring the conference to a high level of organization. On behalf of the KES 2004 General Chair, we would also like to express special appreciation for the hard work done by David Pritchard of the WelTec Centre for Computational Intelligence. We extend our praise and thanks to them all.
An important distinction of the KES conferences over others is the Invited Session Program. Invited sessions give new and dedicated researchers an opportunity to present a “mini-conference” of their own, and by this means to bring to public view a topic at the leading edge of intelligent science and technology. This mechanism for feeding new blood into the research community is immensely valuable and strengthens the KES conferences enormously. For this reason we extend our thanks to the Invited Session Chairs who contributed in this way.

We would like to thank the KES 2004 International Program Committee and the KES 2004 Reviewers Team, who were essential in providing reviews of the papers. We are immensely grateful for this service, without which the conference would not have been possible. We also thank the high-profile keynote speakers and invited tutorial lecturers for providing interesting and informed talks to catalyze subsequent discussions.

In some ways, the most important contributors to KES 2004 were the authors, presenters and delegates, without whom the conference could not have taken place; we thank them all for their contributions. Finally, we thank the “unsung heroes”: the army of administrators, caterers and hoteliers, and the people of Wellington, for welcoming us and providing for the conference.

We hope the attendees all found KES 2004 a worthwhile, informative and enjoyable experience, and we hope to see them in Melbourne for KES 2005, which will be hosted by La Trobe University, Melbourne, Australia.

June 2004
Prof. Mircea Gh. Negoita
Dr. R.J. Howlett
Prof. Lakhmi C. Jain
KES 2004 Conference Organization
General Chair
Mircea Negoita
Centre for Computational Intelligence, School of Information Technology
Wellington Institute of Technology (WelTec), Wellington, New Zealand
Co-director of NZ-German School on Computational Intelligence at KES 2004
Conference Founder and Honorary Programme Committee Chair
Lakhmi C. Jain
Knowledge-Based Intelligent Information and Engineering Systems Centre
University of South Australia, Australia
KES Executive Chair
Bob Howlett
Intelligent Systems and Signal Processing Laboratories/KTP Centre
University of Brighton, UK
KES 2004 Invited Co-chair
Bernd Reusch
Department of Computer Science
University of Dortmund, Germany
Co-director of NZ-German School on Computational Intelligence at KES 2004
KES Journal General Editor
Bogdan Gabrys
University of Bournemouth, UK
Local Organizing Committee
Linda Sissons – Chair, WelTec CEO
Gary Hartley, Mircea Gh. Negoita, Murray Wills
Wellington Institute of Technology (WelTec), New Zealand
KES 2004 Web Page Designer
Michael Hyndman
Wellington Institute of Technology (WelTec), New Zealand
Technical Emergence Desktop Team
Doug StJust, Ali Rashid Mardani
Wellington Institute of Technology (WelTec), New Zealand
KES 2004 Liaison Officer
Lesley Lucie-Smith
Wellington Institute of Technology (WelTec), New Zealand
Proceedings Assembling Team
David Pritchard, Paulene Mary Crook, Ian Hunter, Terry Jeon, Des Kenny, Sara Rule, Nick Tullock
Wellington Institute of Technology (WelTec), New Zealand
International Program Committee
Hussein Abbass, University of New South Wales, Australia
Peter Andreae, Victoria University, Wellington, New Zealand
Viorel Ariton, “Danubius” University of Galatzi, Romania
Akira Asano, Hiroshima University, Higashi-Hiroshima, Japan
K. Vijayan Asari, Old Dominion University, Norfolk, Virginia, USA
Norio Baba, Osaka Kyoiku University, Japan
Robert Babuska, Delft University of Technology, Delft, The Netherlands
Andrzej Bargiela, Nottingham Trent University, UK
Marius Bazu, Institute of Microtechnology, Bucharest, Romania
Yevgeniy Bodyanskiy, Kharkiv National University of Radioelectronics, Ukraine
Patrick Bosc, IRISA/ENSSAT, Lannion, France
Pascal Bouvry, Luxembourg University of Applied Sciences, Luxembourg
Phillip Burrell, South Bank University, London, UK
Yen-Wei Chen, University of the Ryukyus, Okinawa, Japan
Vladimir Cherkassky, University of Minnesota, USA
Krzysztof Cios, University of Colorado at Denver, USA
Carlos A. Coello, LANIA, Mexico
George Coghill, Auckland University, Auckland, New Zealand
David W. Corne, University of Exeter, UK
David Cornforth, Charles Sturt University, Albury, Australia
Ernesto Damiani, University of Milan, Italy
Da Deng, University of Otago, Dunedin, New Zealand
Da Ruan, Belgian Nuclear Research Centre (SCK · CEN), Belgium
Vladan Devedzic, University of Belgrade, Belgrade, Serbia
Didier Dubois, IRIT, Université Paul Sabatier, Toulouse, France
Duncan Earl, Oak Ridge National Laboratory, USA
Madjid Fathi, National Magnet Lab., Florida, USA
Marcus Frean, Victoria University, Wellington, New Zealand
Peter Funk, Mälardalen University, Västerås, Sweden
Bogdan Gabrys, University of Bournemouth, UK
Boris Galitsky, Birkbeck College, University of London, UK
Hugo de Garis, Utah State University, USA
Max H. Garzon, University of Memphis, USA
Tamas Gedeon, Murdoch University, Murdoch, Australia
Mitsuo Gen, Waseda University, Kitakyushu, Japan
Vladimir Gorodetski, St. Petersburg Institute of Informatics, Russian Academy of Sciences, Russia
Manuel Grana, Facultad de Informatica, UPV/EHU, Spain
David Gwaltney, NASA George C. Marshall Space Flight Center, Huntsville, USA
Lars Kai Hansen, Technical University of Denmark, Lyngby, Denmark
Chris Harris, University of Southampton, UK
Lars Hildebrand, Dortmund University, Dortmund, Germany
Tetsuya Higuchi, National Institute of Advanced Industrial Science and Technology, Japan
Yuzo Hirai, University of Tsukuba, Japan
Dawn Holmes, University of California, Santa Barbara, USA
Daniel Howard, University of Limerick, Ireland
Tzung-Pei Hong, National University of Kaohsiung, Taiwan
Keiichi Horio, Kyushu Institute of Technology, Japan
Hitoshi Iba, University of Tokyo, Tokyo, Japan
Florin Ionescu, University of Applied Sciences, Konstanz, Germany
Hisao Ishibuchi, Osaka Prefecture University, Osaka, Japan
Naohiro Ishii, Aichi Institute of Technology, Toyota City, Japan
Mo M. Jamshidi, University of New Mexico, Albuquerque, USA
Norbert Jesse, Dortmund University, Dortmund, Germany
Seong-Joon Yoo, Sejong University, Seoul, Korea
Janusz Kacprzyk, Polish Academy of Sciences, Poland
Nikos Karacapilidis, University of Patras, Greece
Vojislav Kecman, Auckland University, Auckland, New Zealand
Rajiv Khosla, La Trobe University, Melbourne, Australia
Laszlo T. Koczy, Budapest University of Technology and Economics, Budapest and Szechenyi Istvan University, Gyor, Hungary
Hiroyasu Koshimizu, Chukyo University, Toyota, Japan
Susumu Kunifuji, Japan Advanced Institute of Science & Technology, Japan
Andrew Kusiak, University of Iowa, Iowa City, USA
W.K. Lai, MIMOS Bhd., Kuala Lumpur, Malaysia
Pier Luca Lanzi, Polytechnic Institute, Milan, Italy
Raymond Lee, Hong Kong Polytechnic University, Kowloon, Hong Kong
Chee-Peng Lim, University of Science Malaysia, Penang, Malaysia
Jason Lohn, NASA Ames Research Center, Mountain View, CA, USA
Ignac Lovrek, University of Zagreb, Croatia
Bruce MacDonald, Auckland University, Auckland, New Zealand
Bob McKay, University of NSW, Australian Defence Force Academy, Australia
Luis Magdalena-Layos, EUSFLAT & Universidad Politecnica de Madrid, Spain
Dan C. Marinescu, University of Central Florida, Orlando, USA
Jorma K. Mattila, Lappeenranta University of Technology, Finland
Radko Mesiar, Slovak Technical University, Bratislava, Slovakia
Claudio Moraga, University of Dortmund, Germany
Hirofumi Nagashino, University of Tokushima, Tokushima, Japan
Noriko Nagata, Kwansei Gakuin University, Japan
Ryohei Nakatsu, Kwansei Gakuin University, Japan
Koji Nakajima, Tohoku University, Sendai, Japan
Akira Namatame, National Defense Academy, Yokosuka, Japan
Victor Emil Neagoe, Technical University Bucharest, Romania
Ciprian Daniel Neagu, University of Bradford, UK
Charles Nguyen, Catholic University of America, Washington, DC, USA
Ngoc Thanh Nguyen, Wroclaw University of Technology, Poland
Toyoaki Nishida, University of Tokyo, Japan
Nikhil R. Pal, Indian Statistical Institute, Calcutta, India
Vasile Palade, Oxford University, UK
Costas Pappis, University of Piraeus, Greece
Ian C. Parmee, University of the West of England, Bristol, UK
Carlos-Andrés Pena-Reyes, Swiss Federal Institute of Technology–EPFL, Lausanne, Switzerland
Theodor Popescu, National Institute for Research and Development Informatics, Bucharest, Romania
John A. Rose, University of Tokyo, Tokyo, Japan
Eugene Roventa, York University, Toronto, Canada
Rajkumar Roy, Cranfield University, UK
Takeshi Samatsu, Kyushu Tokai University, Japan
Elie Sanchez, Université de la Méditerranée, Marseille, France
Marc Schoenauer, INRIA Rocquencourt, Le Chesnay, France
Udo Seiffert, Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany
Barry Smyth, University College Dublin, Ireland
Flavio Soares Correa da Silva, Instituto de Matematica e Estatistica, University of São Paulo, Brazil
Von-Wun Soo, National Tsing Hua University, Taiwan
Adrian Stoica, NASA Jet Propulsion Laboratory, Pasadena, USA
Noriaki Suetake, Yamaguchi University, Japan
Sarawut Sujitjorn, Suranaree University of Technology, Thailand
Mieko Tanaka-Yamawaki, Tottori University, Japan
Takushi Tanaka, Fukuoka Institute of Technology, Japan
Eiichiro Tazaki, Toin University of Yokohama, Japan
Jon Timmis, University of Kent at Canterbury, UK
Jim Torresen, University of Oslo, Norway
Kazuhiko Tsuda, University of Tsukuba, Japan
Andy M. Tyrrell, University of York, UK
Eiji Uchino, University of Yamaguchi, Japan
Angel Navia Vazquez, Universidad Carlos III de Madrid, Spain
Jose Luis Verdegay, University of Granada, Granada, Spain
Dianhui Wang, La Trobe University, Melbourne, Australia
Pei Wang, Temple University, Philadelphia, USA
Junzo Watada, Waseda University, Kitakyushu, Fukuoka, Japan
Keigo Watanabe, Saga University, Japan
Takeshi Yamakawa, Kyushu Institute of Technology, Graduate School of Life Science and Systems Engineering, Japan
Xin Yao, University of Birmingham, UK
Kaori Yoshida, Kyushu Institute of Technology, Japan
Lotfi A. Zadeh, University of California at Berkeley, USA
Ricardo Zebulum, NASA Jet Propulsion Laboratory, Pasadena, USA
Invited Session Chairs Committee
Akinori Abe, ATR Intelligent Robotics & Communication Labs, Kyoto, Japan
Yoshinori Adachi, Chubu University, Japan
Alicia d’Anjou, Universidad del Pais Vasco, Spain
Norio Baba, Osaka Kyoiku University, Japan
Pascal Bouvry, Luxembourg University of Applied Sciences, Luxembourg
Malu Castellanos, Hewlett-Packard Laboratories, Palo Alto, CA, USA
Yen-Wei Chen, Ritsumeikan University, Japan
George G. Coghill, Auckland University, New Zealand
Ernesto Damiani, University of Milan, Italy
Vladan Devedzic, University of Belgrade, Serbia and Montenegro
Marijan Druzovec, University of Maribor, Slovenia
Richard Duro, Universidad de A Coruña, Spain
Minoru Fukumi, University of Tokushima, Japan
Boris Galitsky, Birkbeck College, University of London, UK
Max H. Garzon, University of Memphis, USA
Wanwu Guo, Edith Cowan University, Australia
Manuel Graña, Universidad Pais Vasco, Spain
Jerzy M. Grzymala-Busse, University of Kansas, USA
Robert F. Harrison, University of Sheffield, UK
Philip Hingston, Edith Cowan University, Australia
Tzung-Pei Hong, National University of Kaohsiung, Taiwan
Nikhil Ichalkaranje, University of South Australia, Adelaide, Australia
Takumi Ichimura, Hiroshima University, Japan
Nobuhiro Inuzuka, Nagoya Institute of Technology, Japan
Yoshiteru Ishida, Toyohashi University of Technology, Japan
Naohiro Ishii, Aichi Institute of Technology, Japan
Yuji Iwahori, Chubu University, Japan
Lakhmi C. Jain, University of South Australia, Adelaide, Australia
Taki Kanda, Bunri University of Hospitality, Japan
Radoslaw P. Katarzyniak, Wroclaw University of Technology, Poland
Le Kim, University of South Australia, Adelaide, Australia
Tai-hoon Kim, Korea Information Security Agency (KISA), Korea
Rajiv Khosla, La Trobe University, Melbourne, Australia
Peter Kokol, University of Maribor, Slovenia
Naoyuki Kubota, Tokyo Metropolitan University, Tokyo, Japan
Mineichi Kudo, Hokkaido University, Japan
Chiaki Kuroda, Tokyo Institute of Technology, Tokyo, Japan
Susumu Kunifuji, Japan Advanced Institute of Science and Technology, Japan
Weng Kim Lai, MIMOS Berhad, Technology Park, Malaysia
Dong Chun Lee, Howon University, Korea
Huey-Ming Lee, Chinese Culture University, Taiwan
Raymond Lee, Hong Kong Polytechnic University, Kowloon, Hong Kong
Chee-Peng Lim, University of Science Malaysia, Malaysia
Bruce MacDonald, Auckland University, New Zealand
Jun Munemori, Wakayama University, Japan
Tetsuya Murai, Hokkaido University, Japan
Hirofumi Nagashino, University of Tokushima, Japan
Koji Nakajima, Tohoku University, Sendai, Japan
Kazumi Nakamatsu, University of Hyogo, Japan
Hirotaka Nakayama, Konan University, Kobe, Japan
Ryohei Nakano, Nagoya Institute of Technology, Japan
Ngoc T. Nguyen, Wroclaw University of Technology, Poland
Toyoaki Nishida, Graduate School of Informatics, Kyoto University, Japan
Mariusz Nowostawski, University of Otago, Dunedin, New Zealand
Yukio Ohsawa, University of Tsukuba and University of Tokyo, Japan
Abhijit S. Pandya, Florida Atlantic University, USA
Gloria E. Phillips-Wren, Loyola College in Maryland, Baltimore, USA
Lech Polkowski, Polish-Japanese Institute of Information Technology, Koszykowa, Poland
Theodor D. Popescu, National Institute for Research and Development in Informatics, Bucharest, Romania
Marina Resta, University of Genoa, Italy
David C. Rees, CSIRO ICT Centre, Epping, Australia
John A. Rose, University of Tokyo, Japan
Steffen Rothkugel, Luxembourg University of Applied Sciences, Luxembourg
Kazumi Saito, Nagoya Institute of Technology, Nagoya, Japan
Udo Seiffert, Leibniz Institute of Plant Genetics and Crop Plant Research, Germany
David McG. Squire, Monash University, Australia
Hirokazu Taki, Wakayama University, Japan
Kazuhiko Tsuda, University of Tsukuba, Japan
Claudio Turchetti, Università Politecnica delle Marche, Ancona, Italy
Katsuji Uosaki, Osaka University, Japan
Dianhui Wang, La Trobe University, Melbourne, Australia
Pei Wang, Birkbeck College, University of London, UK
Junzo Watada, Waseda University, Japan
Tatjana Welzer, University of Maribor, Slovenia
Yoshiyuki Yamashita, Tohoku University, Japan
Mieko Tanaka-Yamawaki, Tottori University, Japan
Seong-Joon Yoo, Sejong University, Seoul, Korea
Katsumi Yoshida, St. Marianna University, School of Medicine, Japan
Yuji Yoshida, University of Kitakyushu, Kitakyushu, Japan
Takashi Yoshino, Wakayama University, Japan
Valentina Zharkova, Bradford University, UK
KES 2004 Reviewers
R. Abdulah, University of Science Malaysia, Malaysia
A. Abe, ATR Intelligent Robotics & Communication Labs., Kyoto, Japan
Y. Adachi, Chubu University, Aichi, Japan
P. Andreae, Victoria University, Wellington, New Zealand
A. Asano, Hiroshima University, Higashi-Hiroshima, Japan
K.V. Asari, Old Dominion University, Norfolk, Virginia, USA
N. Ashidi, KES 2004 Reviewers Team
D. Arita, Kyushu University, Fukuoka, Japan
N.A. Aziz, MIMOS, Malaysia
N. Baba, Osaka Kyoiku University, Japan
R. Babuska, Delft University of Technology, Delft, The Netherlands
O. Boissier, École des Mines de Saint-Étienne, France
P. Bosc, IRISA/ENSSAT, France
P. Bouvry, Luxembourg University of Applied Sciences, Luxembourg
G. Bright, Massey University, Auckland, New Zealand
D.A. Carnegie, Waikato University, Hamilton, New Zealand
M. Castellanos, Hewlett-Packard Laboratories, Palo Alto, CA, USA
C.-T. Chang, National Cheng Kung University, Taiwan
Y.-W. Chen, Ritsumeikan University, Japan
S.-C. Chi, Huafan University, Taiwan
B.-C. Chien, I-Shou University, Taiwan
G.G. Coghill, Auckland University, Auckland, New Zealand
D.W. Corne, University of Exeter, UK
D. Cornforth, Charles Sturt University, Albury, Australia
A. Czyzewski, Gdansk University of Technology, Gdansk, Poland
E. Damiani, University of Milan, Italy
R.J. Deaton, University of Arkansas, USA
Da Deng, University of Otago, Dunedin, New Zealand
V. Devedzic, University of Belgrade, Serbia and Montenegro
P.M. Drezet, University of Sheffield, UK
R. Dunlog, University of Canterbury, Christchurch, New Zealand
C. Elamvazuthi, MIMOS, Malaysia
T. Ejima, Aichi University of Education, Aichi, Japan
M. Fathi, National Magnet Lab., Florida, USA
M. Frean, Victoria University, Wellington, New Zealand
W. Friedrich, Industrial Research Limited, Auckland, New Zealand
T. Fujinami, JAIST, Japan
P. Funk, Mälardalen University, Västerås, Sweden
B. Gabrys, Bournemouth University, UK
M.H. Garzon, University of Memphis, USA
B. Galitsky, Birkbeck College, University of London, UK
T. Gedeon, Murdoch University, Murdoch, Australia
V. Gorodetski, St. Petersburg Institute of Informatics, Russia
M. Grana, Universidad Pais Vasco, Spain
J.W. Grzymala-Busse, University of Kansas, USA
N. Guelfi, Luxembourg University of Applied Sciences, Luxembourg
F. Guinand, Le Havre University, France
W. Guo, Edith Cowan University, Australia
M. Hagiya, University of Tokyo, Japan
L.K. Hansen, Technical University of Denmark, Lyngby, Denmark
A. Hara, Hiroshima City University, Japan
R.F. Harrison, University of Sheffield, UK
Y. Hayakawa, Tohoku University, Japan
L. Hildebrand, University of Dortmund, Germany
P. Hingston, Edith Cowan University, Australia
K. Hirayama, University of Kitakyushu, Kitakyushu, Japan
O.S. Hock, University of Malaya, Malaysia
T.-P. Hong, National University of Kaohsiung, Taiwan
K. Horio, Kyushu Institute of Technology, Fukuoka, Japan
D. Howard, University of Limerick, Ireland
T. Ichikawa, Shizuoka University, Japan
T. Ichimura, Hiroshima City University, Japan
N. Ichalkaranje, University of South Australia, Australia
F. Ishida, University of Electro-communications, Japan
Y. Ishida, Toyohashi University of Technology, Japan
N. Ishii, Aichi Institute of Technology, Japan
S. Ito, ATR, Japan
Y. Iwahori, Chubu University, Aichi, Japan
S. Iwamoto, Kyushu University, Fukuoka, Japan
M.E. Jefferies, Waikato University, Hamilton, New Zealand
N. Jesse, University of Dortmund, Germany
K. Juszczyszyn, Wroclaw University of Technology, Poland
D. Khadraoui, CRP Tudor, Luxembourg
K. Kakusho, Kyoto University, Kyoto, Japan
T. Kanda, Bunri University of Hospitality, Japan
T. Kanai, Meijin-gakuin University, Japan
N. Karacapilidis, University of Patras, Greece
R.P. Katarzyniak, Wroclaw University of Technology, Poland
N. Katayama, Tohoku University, Japan
P. Kazienko, Wroclaw University of Technology, Poland
V. Kecman, Auckland University, New Zealand
S.J. Kia, New Zealand
C.W. Kian, Ohio Northern University, USA
L. Kim, University of Canberra, Australia
C.P. Lian, DSTO, Australia
C.-P. Lim, University of Science Malaysia, Malaysia
D.N.C. Ling, Multimedia University, Malaysia
M. Kinjo, Tohoku University, Japan
Y. Kinouchi, University of Tokushima, Japan
A.T. Khader, University of Science Malaysia, Malaysia
R. Khosla, La Trobe University, Melbourne, Australia
T. Koda, Kyoto University, Japan
T. Komatsu, Future University Hakodate, Hakodate, Japan
T. Kondo, KES 2004 Reviewers Team
B. Kostek, Gdansk University of Technology, Gdansk, Poland
N. Kubota, Tokyo Metropolitan University, Tokyo, Japan
M. Kudo, University of Hokkaido, Japan
N. Kulathuramaiyer, University Malaysia Sarawak, Malaysia
S. Kumamoto, University of Kitakyushu, Japan
S. Kunifuji, Japan Advanced Institute of Science and Technology (JAIST), Japan
H.-C. Kuo, National Chiayi University, Taiwan
M. Kurano, Chiba University, Japan
C. Kuroda, Tokyo Institute of Technology, Japan
T. Kuroda, KES 2004 Reviewers Team
S. Kurohashi, University of Tokyo, Japan
Y. Kurosawa, Hiroshima City University, Japan
A. Kusiak, University of Iowa, Iowa City, USA
W.K. Lai, MIMOS Berhad, Technology Park, Malaysia
D.C. Lee, Howon University, Korea
H.-M. Lee, Chinese Culture University, Taiwan
R. Lee, Hong Kong Polytechnic University, Hong Kong
C.P. Lian, KES 2004 Reviewers Team
J.-H. Lin, I-Shou University, Taiwan
W.-Y. Lin, I-Shou University, Taiwan
D.N.C. Ling, KES 2004 Reviewers Team
C.-P. Lim, University of Science Malaysia, Penang, Malaysia
H. Li, Edith Cowan University, Australia
C. Liu, Shenyang Institute of Technology, Shenyang, China
I. Lovrek, University of Zagreb, Croatia
B. MacDonald, Auckland University, New Zealand
B. McKay, University of New South Wales, Australian Defence Force Academy, Australia
David McG. Squire, Monash University, Australia
Z. Ma, Northeast Normal University, China
L. Magdalena-Layos, EUSFLAT and Universidad Politecnica de Madrid, Spain
N.A. Mat Isa, University of Science Malaysia, Malaysia
C. Messom, Massey University, Auckland, New Zealand
C. Moraga, University of Dortmund, Germany
N. Mort, University of Sheffield, UK
K. Mera, Hiroshima City University, Japan
M. Minoh, ACCMS, Kyoto University, Japan
M. Miura, JAIST, Japan
Y. Mizugaki, University of Electro-communications, Japan
T. Mizuno, Shizuoka University, Japan
Y. Moria, Nagoya Women’s University, Japan
J. Munemori, Wakayama University, Japan
T. Murai, Hokkaido University, Japan
J. Murata, Kyushu University, Fukuoka, Japan
H. Nagashino, University of Tokushima, Japan
J. Nakagami, Chiba University, Chiba, Japan
K. Nakajima, Tohoku University, Japan
K. Nakamatsu, University of Hyogo, Japan
M. Nakamura, Hiroshima City University, Japan
Y. Nakamura, ACCMS, Kyoto University, Japan
R. Nakano, Nagoya Institute of Technology, Nagoya, Japan
R. Nakatsu, Kwansei Gakuin University, Japan
H. Nanba, Hiroshima City University, Japan
C.-D. Neagu, University of Bradford, UK
M.Gh. Negoita, Wellington Institute of Technology, New Zealand
N.T. Nguyen, Wroclaw University of Technology, Poland
T. Nishida, Kyoto University, Japan
K. Nishimoto, JAIST, Japan
T. Noguchi, JAIST, Japan
M. Nowostawski, University of Otago, Dunedin, New Zealand
S. Oeda, Kisarazu College of Technology, Japan
Y. Ohsawa, University of Tsukuba and University of Tokyo, Japan
T. Okamoto, Kanagawa Institute of Technology, Atsugi, Japan
O. Ono, Meiji University, Japan
T. Onomi, Tohoku University, Japan
M. Ozaki, Chubu University, Aichi, Japan
V. Palade, Oxford University, UK
A.S. Pandya, Florida Atlantic University, USA
M. Paprzycki, Wroclaw University of Technology, Poland
C.-A. Pena-Reyes, Swiss Federal Institute of Technology–EPFL, Lausanne, Switzerland
J.F. Peters, University of Manitoba, Winnipeg, Canada
G.E. Phillips-Wren, Loyola College in Maryland, USA
L. Polkowski, Polish-Japanese Institute of Information Technology, Koszykowa, Poland
Th.D. Popescu, National Institute for Research and Development in Informatics, Bucharest, Romania
M. Purvis, University of Otago, Dunedin, New Zealand
A.R. Ramli, University Putra Malaysia, Malaysia
D.C. Rees, CSIRO ICT Centre, Epping, Australia
J.A. Rose, The University of Tokyo, Tokyo, Japan
S. Rothkugel, Luxembourg University of Applied Sciences, Luxembourg
K. Saito, NTT Communication Science Labs., Japan
M.-J.E. Salami, International Islamic University of Malaysia, Kuala Lumpur, Malaysia
S. Salcedo-Sanz, University of Birmingham, UK
M. Sano, University of Tokyo, Japan
S. Sato, Tohoku University, Japan
R. Sakamoto, JAIST, Japan
E. Sanchez, Université de la Méditerranée, Marseille, France
C. Schommer, Luxembourg University of Applied Sciences, Luxembourg
S. Scott, Asia Pacific Institute of Technology, Malaysia
N. Seeman, New York University, USA
U. Seiffert, Leibniz Institute of Plant Genetics and Crop Plant Research, Germany
F. Seredynski, PJWSTK/IPIPAN, Poland
T. Shimooka, Hokkaido University, Sapporo, Japan
F.S. Correa da Silva, Instituto de Matematica e Estatistica, University of São Paulo, Brazil
V.-W. Soo, National Tsing Hua University, Taiwan
U. Sorger, Luxembourg University of Applied Sciences, Luxembourg
P. Sturm, University of Trier, Germany
N. Suetake, Yamaguchi University, Japan
K. Sugiyama, JAIST, Japan
M. Suka, St. Marianna University, Japan
S. Sujitjorn, Suranaree University of Technology, Thailand
Y. Sumi, Kyoto University, Kyoto, Japan
N. Surayana, Multimedia University, Malaysia
A. Suyama, University of Tokyo, Japan
M. Takano, University of Tokyo, Japan
H. Taki, Wakayama University, Japan
Y.-H. Tao, National Pingtung University of Technology and Science, Taiwan
T. Tanaka, Fukuoka Institute of Technology, Fukuoka, Japan
R. Taniguchi, Kyushu University, Fukuoka, Japan
E.H. Tat, Multimedia University, Malaysia
J. Timmis, University of Kent at Canterbury, UK
J. Torresen, University of Oslo, Norway
K. Tsuda, University of Tsukuba, Tokyo, Japan
C. Turchetti, Università Politecnica delle Marche, Ancona, Italy
E. Uchino, University of Yamaguchi, Japan
H. Ueda, Hiroshima City University, Japan
K. Ueda, University of Tokyo, Japan
K. Umemoto, JAIST, Japan
K. Unsworth, Auckland University, New Zealand
K. Uosaki, Osaka University, Japan
J. Xiao, Edith Cowan University, Australia
N. Xiong, KES 2004 Reviewers Team
H. Yamaba, Miyazaki University, Japan
T. Yamakami, ACCESS, Japan
Y. Yamashita, Tohoku University, Japan
H. Yan, Duke University, USA
X. Yao, University of Birmingham, UK
M. Yasuda, Chiba University, Japan
S.-J. Yoo, Sejong University, Seoul, Korea
J. Yoon, Institute of Science and Technology, Korea
K. Yoshida, St. Marianna University, Japan
Y. Yoshida, University of Kitakyushu, Japan
T. Yoshino, Wakayama University, Japan
K.-M. Yu, Chung-Hua University, Taiwan
D.C.K. Yuen, Auckland University, New Zealand
T. Yuizono, Shimane University, Japan
D. Wang, La Trobe University, Melbourne, Australia
P. Wang, Temple University, Philadelphia, USA
S.-L. Wang, New York Institute of Technology, USA
X. Wang, Hebei University, China
J. Watada, Waseda University, Japan
K. Watanabe, Saga University, Japan
Y. Watanabe, Toyohashi University of Technology, Japan
E. Weidert, Luxembourg University of Applied Sciences, Luxembourg
T. Welzer, University of Maribor, Slovenia
S. Wilk, Poznan University of Technology, Poland
C.-H. Wu, Shu-Te University, Taiwan
V. Zharkova, University of Bradford, UK
A. Zomaya, University of Sydney, Australia
C. Zhao, Edith Cowan University, Australia
Z. Zheng, Chinese Academy of Sciences, Beijing, China
Sponsors
Table of Contents, Part I
Keynote Lectures
Web Intelligence, World Knowledge and Fuzzy Logic – The Concept of Web IQ (WIQ) Lotfi A. Zadeh
1
Industrial Applications of Evolvable Hardware Tetsuya Higuchi
6
Equilibrium Modelling of Oligonucleotide Hybridization, Error, and Efficiency for DNA-Based Computational Systems John A. Rose
8
Chance Discovery with Emergence of Future Scenarios Yukio Ohsawa
11
Brain-Inspired SOR Network and Its Application to Trailer Track Back-up Control Takanori Koga, Takeshi Yamakawa
13
Dual Stream Artificial Neural Networks Colin Fyfe
16
Session Papers
DNA-Based Semantic Information Processing
Improving the Quality of Semantic Retrieval in DNA-Based Memories with Learning Andrew Neel, Max Garzon, Phani Penumatsa
18
Conceptual and Contextual DNA-Based Memory Russell Deaton, Junghuei Chen
25
Semantic Model for Artificial Intelligence Based on Molecular Computing Yusei Tsuboi, Zuwairie Ibrahim, Osamu Ono
32
The Fidelity of the Tag-Antitag System III. Robustness in the Excess Limit: The Stringent Temperature John A. Rose
40
Emergent Computational Intelligence Approaches – Artificial Immune Systems and DNA Computing
Robust PID Controller Tuning Using Multiobjective Optimization Based on Clonal Selection of Immune Algorithm Dong Hwa Kim, Jae Hoon Cho
50
Intelligent Tuning of PID Controller With Robust Disturbance Rejection Function Using Immune Algorithm Dong Hwa Kim
57
The Block Hidden Markov Model for Biological Sequence Analysis Kyoung-Jae Won, Adam Prügel-Bennett, Anders Krogh
64
Innovations in Intelligent Agents and Their Applications
Innovations in Intelligent Agents and Applications Gloria E. Phillips-Wren, Nikhil Ichalkaranje
71
An Intelligent Aircraft Landing Support System Steve Thatcher, Lakhmi Jain, Colin Fyfe
74
Teaming Humans and Agents in a Simulated World Christos Sioutis, Jeffrey Tweedale, Pierre Urlings, Nikhil Ichalkaranje, Lakhmi Jain
80
Contextual-Knowledge Management in Peer to Peer Computing E.V. Krishnamurthy, V.K. Murthy
87
Collaborating Agents in Distributed Networks and Emergence of Collective Knowledge V.K. Murthy, E.V. Krishnamurthy
95
Intelligent Decision Making in Information Retrieval Gloria E. Phillips-Wren, Guiseppi A. Forgionne
103
Innovations in Intelligent Agents, Web and Their Applications Gloria E. Phillips-Wren, Nikhil Ichalkaranje
110
Novel Intelligent Agent-Based System for Study of Trade Tomohiro Ikai, Mika Yoneyama, Yasuhiko Dote
113
Testing of Multi-agent-based System in Ubiquitous Computing Environment Ken’ichi Takahashi, Satoshi Amamiya, Tadashige Iwao, Guoqiang Zhong, Makoto Amamiya
124
Helping Users Customize Their Pedagogical Agents: Issues, Approaches and Examples Anders I. Mørch, Jan Eirik B. Nævdal
131
Intelligent Web Site: Understanding the Visitor Behavior Juan D. Velásquez, Pablo A. Estévez, Hiroshi Yasuda, Terumasa Aoki, Eduardo Vera
140
Data Mining and Knowledge Discovery
Mining Transformed Data Sets Alex Burns, Andrew Kusiak, Terry Letsche
148
Personalized Multilingual Web Content Mining Rowena Chau, Chung-Hsing Yeh, Kate A. Smith
155
Intelligent Multimedia Information Retrieval for Identifying and Rating Adult Images Seong-Joon Yoo
164
Using Domain Knowledge to Learn from Heterogeneous Distributed Databases Sally McClean, Bryan Scotney, Mary Shapcott
171
A Peer-to-Peer Approach to Parallel Association Rule Mining Hiroshi Ishikawa, Yasuo Shioya, Takeshi Omi, Manabu Ohta, Kaoru Katayama
178
FIT: A Fast Algorithm for Discovering Frequent Itemsets in Large Databases Jun Luo, Sanguthevar Rajasekaran
189
Frequency-Incorporated Interdependency Rules Mining in Spatiotemporal Databases Ickjai Lee
196
Robotics: Intelligent Control and Sensing
Theoretical Considerations of Multiple Particle Filters for Simultaneous Localisation and Map-Building David C.K. Yuen, Bruce A. MacDonald
203
Continuous Walking Over Various Terrains – A Walking Control Algorithm for a 12-DOF Locomotion Interface Jungwon Yoon, Jeha Ryu
210
Vision Controlled Humanoid Robot Tool-Kit Chris Messom
218
Modular Mechatronic Robotic Plug-and-Play Controller Jonathan R. Zyzalo, Glen Bright, Olaf Diegel, Johan Potgieter
225
The Correspondence Problem in Topological Metric Mapping - Using Absolute Metric Maps to Close Cycles Margaret E. Jefferies, Michael C. Cosgrove, Jesse T. Baker, Wai-Kiang Yeap
232
Intelligent Tutoring Systems
Developing a “Virtual Student” Model to Test the Tutor and Optimizer Agents in an ITS Mircea Gh. Negoita, David Pritchard
240
Considering Different Learning Styles when Transferring Problem Solving Strategies from Expert to End Users Narin Mayiwar, Anne Håkansson
253
ULMM: A Uniform Logic Modeling Method in Intelligent Tutoring Systems Jinxin Si, Cungen Cao, Yuefei Sui, Xiaoli Yue, Nengfu Xie
263
Mining Positive and Negative Fuzzy Association Rules Peng Yan, Guoqing Chen, Chris Cornelis, Martine De Cock, Etienne Kerre
270
Intelligence and Technology in Educational Applications
An Adaptation Framework for Web Based Learning System T.T. Goh, Kinshuk
277
Ontologies for Creating Learning Object Content
284
PASS: An Expert System with Certainty Factors for Predicting Student Success Ioannis Hatzilygeroudis, Anthi Karatrantou, C. Pierrakeas
292
Student Modeling in Design Pattern ITS
299
Supporting Self-Explanation in an Open-Ended Domain Amali Weerasinghe, Antonija Mitrovic
306
Creativity Support Systems
Evaluation of the IRORI: A Cyber-Space that Catalyzes Face-to-Face Informal Communication Masao Usuki, Kozo Sugiyama, Kazushi Nishimoto, Takashi Matsubara
314
Information Sharing System Based on Location in Consideration of Privacy for Knowledge Creation Toshiyuki Hirata, Susumu Kunifuji
322
A Method of Extracting Topic Threads Towards Facilitating Knowledge Creation in Chat Conversations Kanayo Ogura, Masato Ishizaki, Kazushi Nishimoto
330
Support Systems for a Person with Intellectual Handicap from the Viewpoint of Universal Design of Knowledge Toshiaki Ikeda, Susumu Kunifuji
337
Intelligent Media Technology for Communicative Intelligence – Knowledge Management and Communication Model
Intelligent Conversational Channel for Learning Social Knowledge Among Communities S.M.F.D. Syed Mustapha
343
An Algorithm for Avoiding Paradoxical Arguments Among the Multi-agent in the Discourse Communicator S.M.F.D. Syed Mustapha
350
Gallery: In Support of Human Memory Hung-Hsuan Huang, Yasuyuki Sumi, Toyoaki Nishida
357
Evaluation of the Communication Atmosphere Tomasz M. Rutkowski, Koh Kakusho, Victor Kryssanov, Michihiko Minoh
364
A Method for Estimating Whether a User is in Smooth Communication with an Interactive Agent in Human-Agent Interaction Takanori Komatsu, Shoichiro Ohtsuka, Kazuhiro Ueda, Takashi Komeda, Natsuki Oka
371
A Meaning Acquisition Model Which Induces and Utilizes Human’s Adaptation Atsushi Utsunomiya, Takanori Komatsu, Kazuhiro Ueda, Natsuki Oka
378
Intelligent Media Technology for Communicative Intelligence – Interaction and Visual Content
Video Content Manipulation by Means of Content Annotation and Nonsymbolic Gestural Interfaces Burin Anuchitkittikul, Masashi Okamoto, Sadao Kurohashi, Toyoaki Nishida, Yoichi Sato
385
Structural Analysis of Instruction Utterances Using Linguistic and Visual Information Tomohide Shibata, Masato Tachiki, Daisuke Kawahara, Masashi Okamoto, Sadao Kurohashi, Toyoaki Nishida
393
Video Contents Acquisition and Editing for Conversation Scene Takashi Nishizaki, Ryo Ogata, Yuichi Nakamura, Yuichi Ohta
401
Video-Based Interactive Media for Gently Giving Instructions Takuya Kosaka, Yuichi Nakamura, Yoshinari Kameda, Yuichi Ohta
411
Real-Time Human Proxy: An Avatar-Based Interaction System Daisaku Arita, Rin-ichiro Taniguchi
419
Soft Computing Techniques in the Capital Markets
Reliability and Convergence on Kohonen Maps: An Empirical Study Marcello Cattaneo Adorno, Marina Resta
426
A New Trial for Improving the Traditional Technical Analysis in the Stock Markets Norio Baba, Tomoko Kawachi
434
Prediction of Business Failure by Total Margin Support Vector Machines Yeboon Yun, Min Yoon, Hirotaka Nakayama, Wataru Shiraki
441
Tick-Wise Predictions of Foreign Exchange Rates Mieko Tanaka-Yamawaki
449
Knowledge-Based Systems for e-Business
A Rule-Based System for eCommerce Applications Jens Dietrich
455
Analyzing Dynamics of a Supply Chain Using Logic-Based Genetic Programming Ken Taniguchi, Takao Terano
464
From Gaming Simulation to Case Method – Empirical Study on Business Game Development and Evaluation Kenji Nakano, Takao Terano
472
A Study of a Constructing Automatic Updating System for Government Web Pages Keiichiro Mitani, Yoshikatsu Fujita, Kazuhiko Tsuda
480
Efficient Program Verification Using Binary Trees and Program Slicing Masakazu Takahashi, Noriyoshi Mizukoshi, Kazuhiko Tsuda
487
An Efficient Learning System for Knowledge of Asset Management Satoru Takahashi, Hiroshi Takahashi, Kazuhiko Tsuda
494
Extracting Purchase Patterns in Convenience Store E-Commerce Market Using Customer Cube Analysis Yoshinori Fukue, Kessoku Masayuki, Kazuhiko Tsuda
501
A Study of Knowledge Extraction from Free Text Data in Customer Satisfaction Survey Yukari Iseyama, Satoru Takahashi, Kazuhiko Tsuda
509
Network Information Mining for Content Delivery Route Control in P2P Network Yoshikatsu Fujita, Jun Yoshida, Kenichi Yoshida, Kazuhiko Tsuda
516
A Method of Customer Intention Management for a My-Page System Masayuki Kessoku, Masakazu Takahashi, Kazuhiko Tsuda
523
New Hierarchy Technique Using Co-occurrence Word Information El-Sayed Atlam, Elmarhomy Ghada, Masao Fuketa, Kazuhiro Morita, Jun-ichi Aoe
530
A New Method of Detecting Time Expressions for E-mail Messages Toru Sumitomo, Yuki Kadoya, El-Sayed Atlam, Kazuhiro Morita, Shinkaku Kashiji, Jun-ichi Aoe
541
A New Classification Method of Determining the Speaker’s Intention for Sentences in Conversation Yuki Kadoya, El-Sayed Atlam, Kazuhiro Morita, Masao Fuketa, Toru Sumitomo, Jun-ichi Aoe
549
A Fast Dynamic Method Using Memory Management Shinkaku Kashiji, Toru Sumitomo, Kazuhiro Morita, Masaki Ono, Masao Fuketa, Jun-ichi Aoe
558
A Method of Extracting and Evaluating Popularity and Unpopularity for Natural Language Expressions Kazuhiro Morita, Yuki Kadoya, El-Sayed Atlam, Masao Fuketa, Shinkaku Kashiji, Jun-ichi Aoe
567
Intelligent Hybrid Systems for Medical Diagnosis
Evaluating a Case-Based Reasoner for Clinical Decision Support Anna Wills, Ian Watson
575
Early Detection of Breast Cancer Using Mathematical Morphology Özgür Özsen
583
Diagnosis of Cervical Cancer Using Hybrid Multilayered Perceptron (HMLP) Network Dzati Athiar Ramli, Ahmad Fauzan Kadmin, Mohd. Yousoff Mashor, Nor Ashidi Mat Isa
591
Mammographic Image and Breast Ultrasound Based Expert System for Breast Diseases Umi Kalthum Ngah, Chan Choyi Ping, Shalihatun Azlin Aziz
599
A Study on Nonparametric Classifiers for a CAD System of Diffuse Lung Opacities in Thin-Section Computed Tomography Images Yoshihiro Mitani, Yusuke Fujita, Naofumi Matsunaga, Yoshihiko Hamamoto
608
Techniques of Computational Intelligence for Web Applications
Recognition of Grouping Areas in Trademarks Considering Proximity and Shape Similarity Koji Abe, Debabrata Roy, John P. Eakins
614
Multidimensional Visualization and Navigation in Search Results Will Archer Arentz, Aleksander Øhrn
620
A Hybrid Learning Approach for TV Program Personalization Zhiwen Yu, Xingshe Zhou, Zhiyi Yang
630
An Adaptive-Learning Distributed File System Joseph D. Gradecki, Ilkeyun Ra
637
Intelligent Information Processing for Remote Sensing
Review of Coding Techniques Applied to Remote Sensing Joan Serra-Sagrista, Francesc Auli, Fernando Garcia, Jorge Gonzales, Pere Guitart
647
Efficient and Effective Tropical Cyclone Eye Fix Using Genetic Algorithms Chi Lap Yip, Ka Yan Wong
654
Spectral Unmixing Through Gaussian Synapse ANNs in Hyperspectral Images J.L. Crespo, R.J. Duro, F. López-Peña
661
A Hyperspectral Based Multisensor System for Marine Oil Spill Detection, Analysis and Tracking F. López-Peña, R.J. Duro
669
Some Experiments on Ensembles of Neural Networks for Hyperspectral Image Classification Carlos Hernández-Espinosa, Mercedes Fernández-Redondo, Joaquín Torres Sospedra
677
A Modular Approach to Real-Time Sensorial Fusion Systems F. Gil-Castiñeira, P.S. Rodríguez-Hernández, F.J. Gonzáles-Castaño, E. Costa-Montenegro, R. Asorey-Cacheda, J.M. Pousada Carballo
685
Feature Extraction by Linear Spectral Unmixing M. Graña, A. D’Anjou
692
Intelligent and Knowledge-Based Solutions for Mobile and Ad-Hoc Networks
Decision Support System on the Grid M. Ong, X. Ren, J. Allan, V. Kadirkamanathan, H.A. Thompson, P.J. Fleming
699
Representing Knowledge in Controlled Natural Language: A Case Study Rolf Schwitter
711
Supporting Smart Applications in Multihop Ad-Hoc Networks - The GecGo Middleware Peter Sturm, Hannes Frey, Daniel Görgen, Johannes Lehnert
718
A Heuristic for Efficient Broadcasting in the Metropolitan Ad hoc Networks Luc Hogie, Frederic Guinand, Pascal Bouvry
727
ADS as Information Management Service in an M-Learning Environment Matthias R. Brust, Daniel Görgen, Christian Hutter, Steffen Rothkugel
734
Rough Sets – Theory and Applications
Noise Reduction in Audio Employing Spectral Unpredictability Measure and Neural Net Andrzej Czyzewski, Marek Dziubinski
743
Forming and Ranking Musical Rhythm Hypotheses Bozena Kostek, Jaroslaw Wojcik
750
A Comparison of Two Approaches to Data Mining from Imbalanced Data Jerzy W. Grzymala-Busse, Jerzy Stefanowski, Szymon Wilk
757
Measuring Acceptance of Intelligent System Models James F. Peters, Sheela Ramanna
764
Rough Set Based Image Texture Recognition Algorithm Zheng Zheng, Hong Hu, Zhongzhi Shi
772
Sets of Communicating Sequential Processes. A Topological Rough Set Framework L. Polkowski, M. Semeniuk-Polkowska
779
Soft Computing Techniques and Their Applications
Robust System Identification Using Neural Networks Shigenobu Yamawaki, Lakhmi Jain
786
A Consideration on the Learning Behaviors of the HSLA Under the Nonstationary Multiteacher Environment and Their Application to Simulation and Gaming Norio Baba, Yoshio Mogami
792
Genetic Lips Extraction Method with Flexible Search Domain Control Takuya Akashi, Minoru Fukumi, Norio Akamatsu
799
Medical Diagnosis System Using the Intelligent Fuzzy Systems Yasue Mitsukura, Kensuke Mitsukura, Minoru Fukumi, Norio Akamatsu, Witold Pedrycz
807
Music Compression System Using the GA Hiroshi Kawasaki, Yasue Mitsukura, Kensuke Mitsukura, Minoru Fukumi, Norio Akamatsu
827
Effects of Chaotic Exploration on Reinforcement Maze Learning Koichiro Morihiro, Nobuyuki Matsui, Haruhiko Nishimura
833
Face Search by Neural Network Based Skin Color Threshold Method Takashi Imura, Minoru Fukumi, Norio Akamatsu, Kazuhiro Nakaura
840
Face Edge Detection System by Using the GAs Hideaki Sato, Katsuhiro Sakamoto, Yasue Mitsukura, Norio Akamatsu
847
A Feature Extraction of EEG with Individual Characteristics Shin-ichi Ito, Yasue Mitsukura, Norio Akamatsu
853
Proposal of Neural Recognition with Gaussian Function and Discussion for Rejection Capabilities to Unknown Currencies Baiqing Sun, Fumiaki Takeda
859
Development of DSP Unit for Online Tuning and Application to Neural Pattern Recognition System Hironobu Satoh, Fumiaki Takeda
866
Face Identification Based on Ellipse Parameter Independent of Varying Facial Pose and Lighting Condition Hironori Takimoto, Yasue Mitsukura, Norio Akamatsu
874
Object Extraction System by Using the Evolutionary Computations Seiki Yoshimori, Yasue Mitsukura, Minoru Fukumi, Norio Akamatsu
881
Wrist EMG Pattern Recognition System by Neural Networks and Multiple Principal Component Analysis Yuji Matsumura, Minoru Fukumi, Norio Akamatsu, Fumiaki Takeda
891
Age Classification from Face Images Focusing on Edge Information Miyoko Nakano, Fumiko Yasukata, Minoru Fukumi
898
Evolutionary Computation in the Soft Computing Framework
Why Do Machine Learning Based Techniques Fail to Accelerate the Evolution of Neural Networks? Hugo de Garis, Thayne Batty
905
An Optimiser Agent that Empowers an ITS System to “on-the-fly” Modify Its Teaching Strategies Mircea Gh. Negoita, David Pritchard
914
A Constraint-Based Optimization Mechanism for Patient Satisfaction Chi-I Hsu, Chaochang Chiu, Pei-Lun Hsu
922
Optimizing Beam Pattern of Adaptively Linear Array Antenna by Phase Perturbations Using Genetic Algorithms Chao-Hsing Hsu, Chun-Hua Chen
929
The Optimal Airline Overbooking Strategy Under Uncertainties Chaochang Chiu, Chanhsi Tsao
937
Determination of Packet Priority by Genetic Algorithm in the Packet Switching Networks Taner Tuncer,
946
A New Encoding for the Degree Constrained Minimum Spanning Tree Problem Sang-Moon Soak, David Corne, Byung-Ha Ahn
952
Neurodynamics and Its Hardware Implementation
Towards Cortex Sized Artificial Nervous Systems Christopher Johansson, Anders Lansner
959
A Memory Model Based on Dynamical Behaviour of the Hippocampus Hatsuo Hayashi, Motoharu Yoshida
967
Analysis of Limit-Cycles on Neural Networks with Asymmetrical Cyclic Connections Using Approximately Activation Functions Shinya Suenaga, Yoshihiro Hayakawa, Koji Nakajima
974
Inverse Function Delayed Model for Optimization Problems Yoshihiro Hayakawa, Tatsuaki Denda, Koji Nakajima
981
Switched-Capacitor Large-Scale Chaotic Neuro-Computer Prototype and Chaotic Search Dynamics Yoshihiko Horio, Takahide Okuno, Koji Mori
988
A Convolutional Neural Network VLSI Architecture Using Thresholding and Weight Decomposition Osamu Nomura, Takashi Morie, Keisuke Korekado, Masakazu Matsugu, Atsushi Iwata
995
Pulse Codings of a Spiking Neuron Having Quantized State Hiroyuki Torikai, Hiroshi Hamanaka, Toshimichi Saito
1002
Design of Single Electron Circuitry for a Stochastic Logic Neural Network Hisanao Akima, Shigeo Sato, Koji Nakajima
1010
Advances in Design, Analysis and Applications of Neural/Neuro-Fuzzy Classifiers
An Improved Time Series Prediction Scheme Using Fuzzy Logic Inference Bin Qiu, Xiaoxiang Guan
1017
Fuzzy Classification of Secretory Signals in Proteins Encoded by the Plasmodium falciparum Genome Erica Logan, Richard Hall, Nectarios Klonis, Susanna Herd, Leann Tilley
1023
Web Users’ Classification Using Fuzzy Neural Network Fang Yuan, Huanrui Wu, Ge Yu
1030
Enhancing Generalization Capability of SVM Classifiers with Feature Weight Adjustment Xizhao Wang, Qiang He
1037
GREN-Networks in WDI-Based Analysis of State Economies Iveta Mrázová
1044
Learning Pseudo Metric for Multimedia Data Classification and Retrieval Dianhui Wang, Xiaohang Ma
1051
Several Aspects in Ubiquitous Pattern Recognition Techniques
Projection Learning Based Kernel Machine Design Using Series of Monotone Increasing Reproducing Kernel Hilbert Spaces Akira Tanaka, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo, Masaaki Miyakoshi
1058
Combination of Weak Evidences by D-S Theory for Person Recognition Masafumi Yamada, Mineichi Kudo
1065
Time-Frequency Decomposition in Gesture Recognition System Using Accelerometer Hidetoshi Nonaka, Masahito Kurihara
1072
A Method of Belief Base Revision for Extended Logic Programs Based on State Transition Diagrams Yasuo Kudo, Tetsuya Murai
1079
Monotonic and Nonmonotonic Reasoning in Zoom Reasoning Systems Tetsuya Murai, M. Sanada, Yasuo Kudo, Y. Sato
1085
Interaction and Intelligence
An Exoskeleton for Human Shoulder Rotation Motion Assist Kazuo Kiguchi
1092
Networked Intelligent Robots by Ontological Neural Networks Eri Sato, Jun Kawakatsu, Toru Yamaguchi
1100
Some Emergences of Mobiligence in the Pursuit Game Seiichi Kawata, Kazuya Morohashi, Takeshi Tateyama
1107
Use of Successful Policies to Relearn for Induced States of Failure in Reinforcement Learning Tadahiko Murata, Hiroshi Matsumoto
1114
A Perceptual System for a Vision-Based Mobile Robot Under Office Automation Floors Naoyuki Kubota, Kazuhiko Taniguchi, Atsushi Ueda
1121
Performance Evaluation of a Distributed Genetic Algorithm with Cellular Structures on Function Optimization Problems Tadahiko Murata, Kenji Takada
1128
New Development, Trends and Applications of Intelligent Multi-Agent Systems
On-Line Update of Situation Assessment Based on Asynchronous Data Streams Vladimir Gorodetsky, Oleg Kasaev, Vladimir Samoilov
1136
Mobility Management for Personal Agents in the All-mobile Network Ignac Lovrek, Vjekoslav Sinkovic
1143
A Multi-agent Perspective on Data Integration Architectural Design Stéphane Faulkner, Manuel Kolp, Tai Nguyen, Adrien Coyette
1150
Identification of Structural Characteristics in Product Spectra Maik Maurer, Udo Lindemann
1157
Policies, Rules and Their Engines: What do They Mean for SLAs? Mark Perry, Michael Bauer
1164
Forecasting on Complex Datasets with Association Rules Marcello Bertoli, Andrew Stranieri
1171
Using a Multi-agent Architecture to Manage Knowledge in the Software Maintenance Process Oscar M. Rodríguez, Aurora Vizcaíno, Ana I. Martínez, Mario Piattini, Jesús Favela
1181
Engineering Techniques and Developments of Intelligent Systems
Evolution Strategies Based Particle Filters for Nonlinear State Estimation Katsuji Uosaki, Yuuya Kimura, Toshiharu Hatanaka
1189
Coordination in Multiagent Reinforcement Learning Systems M.A.S. Kamal, Junichi Murata
1197
Measurement of Shaft Vibration Using Ultrasonic Sensor in Sump Pump Systems Shogo Tanaka, Hajime Morishige
1205
Behavior Learning of Autonomous Agents in Continuous State Using Function Approximation Min-Kyu Shon, Junichi Murata
1213
Some Experiences with Change Detection in Dynamical Systems Theodor D. Popescu
1220
Computational Intelligence for Fault Diagnosis
The KAMET II Approach for Knowledge-Based System Construction Osvaldo Cairó, Julio César Alvarez
1227
A Recursive Component Boundary Algorithm to Reduce Recovery Time for Microreboots Chanwit Kaewkasi, Pitchaya Kaewkasi
1235
Electric Power System Anomaly Detection Using Neural Networks Marco Martinelli, Enrico Tronci, Giovanni Dipoppa, Claudio Balducelli
1242
Capturing and Applying Lessons Learned During Engineering Equipment Installation Ian Watson
1249
Moving Towards a New Era of Intelligent Protection Through Digital Relaying in Power Systems Kongpan Areerak, Thanatchai Kulworawanichpong, Sarawut Sujitjorn
1255
Capacitor Switching Control Using a Decision Table for a 115-kV Power Transmission System in Thailand Phinit Srithorn, Kasem Khojulklang, Thanatchai Kulworawanichpong
1262
Author Index
1269
Table of Contents, Part II
Methods of Computational Intelligence with Applications for Product Development and Human Resource Recruitment
Integration of Psychology, Artificial Intelligence and Soft Computing for Recruitment and Benchmarking of Salespersons Rajiv Khosla, Tharanga Goonesekera
1
FHP: Functional Heuristic Planning Joseph Zalaket, Guy Camilleri
9
Planning with Recursive Subgoals Han Yu, Dan C. Marinescu, Annie S. Wu, Howard Jay Siegel
17
Development of a Generic Computer Aided Deductive Algorithm for Process Parameter Design K.P. Cheng, Daniel C.Y. Yip, K.H. Lau, Stuart Barnes
28
Epistemic Logic and Planning Shahin Maghsoudi, Ian Watson
36
Tàtari: An Open Source Software Tool for the Development and Evaluation of Recommender System Algorithms Halah Hassan, Ian Watson
46
DCPP: Knowledge Representation for Planning Processes Takushi Tanaka, Koki Tanaka
53
An IS Framework to Support the Collaborative Design of Supply Chains Nikos Karacapilidis, Emmanuel Adamides, Costas P. Pappis
62
Knowledge-Based Interface Systems
A New Similarity Evaluation Function for Writer Recognition of Chinese Character Yoshinori Adachi, Min Liu, Masahiro Ozaki
71
Development of Teaching Materials Which Dynamically Change in Learning Process Masahiro Ozaki, Koji Koyama, Saori Takeoka, Yoshinori Adachi
77
Analog VLSI Layout Design of Motion Detection for Artificial Vision Model Masashi Kawaguchi, Takashi Jimbo, Masayoshi Umeno, Naohiro Ishii
83
Development of High-Precise and No-Contacting Capacitance Measuring System Using Dipmeter Shoji Suzuki, Yoshinori Adachi
89
Similarity of Documents Using Reconfiguration of Thesaurus Tomoya Ogawa, Nobuhiro Inuzuka
95
On Refractory Parameter of Chaotic Neurons in Incremental Learning Toshinori Deguchi, Naohiro Ishii
103
Automatic Virtualization of Real Object Based on Shape Knowledge in Mixed Reality Kenji Funahashi, Kazunari Komura, Yuji Iwahori, Yukie Koyama
110
Generation of Virtual Image from Multiple View Point Image Database Haruki Kawanaka, Nobuaki Sado, Yuji Iwahori
118
Correlation Computations for Movement Detection in Neural Networks Naohiro Ishii, Masahiro Ozaki, Hiroshi Sasaki
124
Intelligent Human Computer Interaction Systems
Information Acquisition Using Chat Environment for Question Answering Calkin A.S. Montero, Kenji Araki
131
Design and Implementation of Natural Language Interface for Impression-Based Music-Retrieval Systems Tadahiko Kumamoto
139
InTREND: An Interactive Tool for Reflective Data Exploration Through Natural Discourse Mitsunori Matsushita, Kumiyo Nakakoji, Yasuhiro Yamamoto, Tsuneaki Kato
148
Using Mitate-shi Related to the CONTAINER Schema for Detecting the Container-for-Contents Metonymy Yoshiaki Kurosawa, Takumi Ichimura, Teruaki Aizawa
156
Character Learning System Using Inter-stroke Information Jungpil Shin, Atsushi Takeda
165
Construction of Conscious Model Using Reinforcement Learning Masafumi Kozuma, Hirokazu Taki, Noriyuki Matsuda, Hirokazu Miura, Satoshi Hori, Norihiro Abe
175
Advice Recording Method for a Lesson with Computers Katsuyuki Harada, Noriyuki Matsuda, Hirokazu Miura, Hirokazu Taki, Satoshi Hori, Norihiro Abe
181
Acquiring After-Sales Knowledge from Human Motions Satoshi Hori, Kota Hirose, Hirokazu Taki
188
Emotion Analyzing Method Using Physiological State Kazuya Mera, Takumi Ichimura
195
Posters
A Lyapunov Function Based Direct Model Reference Adaptive Fuzzy Control Youngwan Cho, Yangsun Lee, Kwangyup Lee, Euntai Kim
202
Semi-automatic Video Object Segmentation Method Based on User Assistance and Object Tracking J. G. Choi, S. W. Lee, B. J. Yun, H. S. Kang, S. H. Hong, J. Y. Nam
211
Design and Evaluation of a Scale Patching Technique for VOD Servers Hyo-Young Lee, Sook-Jeong Ha, Sun-Jin Oh, Ihn-Han Bae
219
Optimal Gabor Encoding Scheme for Face Recognition Using Genetic Algorithm Inja Jeon, Kisang Kwon, Phill-Kyu Rhee
227
T-shape Diamond Search Pattern for New Fast Block Matching Motion Estimation Mi Gyoung Jung, Mi Young Kim
237
Motion Estimation Using Cross Center-Biased Distribution and Spatio-Temporal Correlation of Motion Vector Mi Young Kim, Mi Gyoung Jung
244
A Fast Motion Estimation Using Prediction of Motion Estimation Error Hyun-Soo Kang, Seong-Mo Park, Si-Woong Lee, Jae-Gark Choi, Byoung-Ju Yun
253
Ontology Revision Using the Concept of Belief Revision Seung Hwan Kang, Sim Kim Lau
261
Novelty in the Generation of Initial Population for Genetic Algorithms Ali Karci
268
Framework for Personalized e-Mediator Dong-Hwee Kim, Soon-Ja Kim
276
Advances in Intelligent Data Processing Techniques and Applications

Weightless Neural Networks for Typing Biometrics Authentication Shereen Yong, Weng Kin Lai, George Coghill
284
Intelligent Pressure-Based Typing Biometrics System Azweeda Dahalan, M.J.E. Salami, W.K. Lai, Ahmad Faris Ismail
294
Classifiers for Sonar Target Differentiation C.K. Loo, W.S. Lim, M.V.C. Rao
305
Design and Development of Intelligent Fingerprint-Based Security System Suriza Ahmad Zabidi, Momoh-Jimoh E. Salami
312
Weightless Neural Networks: A Comparison Between the Discriminator and the Deterministic Adaptive RAM Network Paul Yee, George Coghill
319
Extracting Biochemical Reaction Kinetics from Time Series Data Edmund J. Crampin, Patrick E. McSharry, Santiago Schnell
329
PCA and ICA Based Signal and Image Processing

Image Feature Representation by the Subspace of Nonlinear PCA Yen-Wei Chen, Xiang-Yan Zeng
337
Improving ICA Performance for Modeling Image Appearance with the Kernel Trick Qingshan Liu, Jian Cheng, Hanqing Lu, Songde Ma
344
Random Independent Subspace for Face Recognition Jian Cheng, Qingshan Liu, Hanqing Lu, Yen-Wei Chen
352
An RDWT Based Logo Watermark Embedding Scheme with Independent Component Analysis Detection Thai Duy Hien, Zensho Nakao, Yen-Wei Chen
359
Real-Time Independent Component Analysis Based on Gradient Learning with Simultaneous Perturbation Stochastic Approximation Shuxue Ding, Jie Huang, Daming Wei, Sadao Omata
366
Intelligent Data Processing in Process Systems and Plants

Extraction of Operation Know-How from Historical Operation Data – Using Characterization Method of Time Series Data and Data Mining Method – Kazuhiro Takeda, Yoshifumi Tsuge, Hisayoshi Matsuyama
375
Handling Qualitative Aspects of Human Knowledge in Diagnosis Viorel Ariton
382
Qualitative Analysis for Detection of Stiction in Control Valves Yoshiyuki Yamashita
391
Agent-Based Batch Process Control Systems Masaru Sakamoto, Hajime Eguchi, Takashi Hamaguchi, Yutaka Ota, Yoshihiro Hashimoto, Toshiaki Itoh
398
Acquisition of AGV Control Rules Using Profit Sharing Method and Evaluation of the Rules Hisaaki Yamaba, Hitoshi Yoshioka, Shigeyuki Tomita
405
Dynamic Acquisition of Models for Multiagent-Oriented Simulation of Micro Chemical Processes Naoki Kimura, Hideyuki Matsumoto, Chiaki Kuroda
412
Acquisition of Engineering Knowledge on Design of Industrial Cleaning System through IDEF0 Activity Model Tetsuo Fuchino, Takao Wada, Masahiko Hirao
418
Intelligent Systems for Spatial Information Processing and Imaging

Exchanging Generalized Maps Across the Internet Min Zhou, Michela Bertolotto
425
Adaptive Spatial Data Processing System (ASDPS) Wanwu Guo
432
Modified ASDPS for Geochemical Data Processing Chi Liu, Hui Yu
440
Gravity Data Processing Using ASDPS Kai Ding, Baishan Xu
447
Remote Sensing Image Processing Using MCDF Zhiqiang Ma, Wanwu Guo
454
Coarse-Grained Parallel Algorithms for Spatial Data Partition and Join Processing Jitian Xiao
461

Image Processing and Intelligent Information Applications

Multi-agents for Decision Support Manoj Achuthan, Bala Balachandran, Dharmendra Sharma
469
Dynamic Scheduling Using Multiagent Architecture Dharmendra Sharma, Dat Tran
476
Using Consensus Ensembles to Identify Suspect Data David Clark
483
Fuzzy Analysis of X-Ray Images for Automated Disease Examination Craig Watman, Kim Le
491
New Background Speaker Models and Experiments on the ANDOSL Speech Corpus Dat Tran, Dharmendra Sharma
498
Immunity-Based Systems and Approaches

An Approach for Self-repair in Distributed System Using Immunity-Based Diagnostic Mobile Agents Yuji Watanabe, Shigeyuki Sato, Yoshiteru Ishida
504
Artificial Immune System for Personal Identification with Finger Vein Pattern Toshiyuki Shimooka, Koichi Shimizu
511
A Switching Memory Strategy in an Immune Network Model Kouji Harada
519
A Process Algebra Model of the Immune System Raúl Monroy
526
Mechanism for Generating Immunity-Based Agents that Detect Masqueraders Takeshi Okamoto, Takayuki Watanabe, Yoshiteru Ishida
534
Machine and Computer Vision, Neural Networks, Intelligent Web Mining and Applications

False Alarm Filter in Neural Networks for Multiclass Object Detection Mengjie Zhang, Bunna Ny
541
iJADE Scene Segmentator – A Real-Time Scene Segmentation System Using Watershed-Based Neuro-Oscillatory Network Gary C.L. Li, Raymond S.T. Lee
549
Visual Tracking by Using Kalman Gradient Vector Flow (KGVF) Snakes Toby H. W. Lam, Raymond S. T. Lee
557
Chart Patterns Recognition and Forecast Using Wavelet and Radial Basis Function Network James N.K. Liu, Raymond W.M. Kwong, Feng Bo
564
Appearance-Based Face Recognition Using Aggregated 2D Gabor Features King Hong Cheung, Jane You, James Liu, Tony W.H. Ao Ieong
572
Ontology-Based Web Agents Using Concept Description Flow Nengfu Xie, Cungen Cao, Bingxian Ma, Chunxia Zhang, Jinxin Si
580
Web Page Recommendation Model for Web Personalization Abdul Manan Ahmad, Mohd. Hanafi Ahmad Hijazi
587
iJADE Face Recognizer - A Multi-agent Based Pose and Scale Invariant Human Face Recognition System Tony W.H. Ao Ieong, Raymond S.T. Lee
594
Neural Networks for Data Mining

Piecewise Multivariate Polynomials Using a Four-Layer Perceptron Yusuke Tanahashi, Kazumi Saito, Ryohei Nakano
602
Learning an Evaluation Function for Shogi from Data of Games Satoshi Tanimoto, Ryohei Nakano
609
Extended Parametric Mixture Model for Robust Multi-labeled Text Categorization Yuji Kaneda, Naonori Ueda, Kazumi Saito
616
Visualisation of Anomaly Using Mixture Model Tomoharu Iwata, Kazumi Saito
624
Obtaining Shape from Scanning Electron Microscope Using Hopfield Neural Network Yuji Iwahori, Haruki Kawanaka, Shinji Fukui, Kenji Funahashi
632
Neural Networks as Universal Approximators and Paradigms for Information Processing – Theoretical Developments and Applications

Speech Recognition for Emotions with Neural Network: A Design Approach Shubhangi Giripunje, Anshish Panat
640
Neuro-Genetic Approach for Bankruptcy Prediction Modeling Kyung-shik Shin, Kyoung Jun Lee
646
Design of a Robust and Adaptive Wavelet Neural Network for Control of Three Phase Boost Rectifiers Farzan Rashidi, Mehran Rashidi
653
The Comparison of Characteristics of 2-DOF PID Controllers and Intelligent Tuning of a Gas Turbine Generating Plant Dong Hwa Kim
661
Bankruptcy Prediction Modeling Using Multiple Neural Network Models Kyung-shik Shin, Kyoung Jun Lee
668
Interpreting the Output of Certain Neural Networks as Almost Unique Probability Bernd-Jürgen Falkowski
675
A Stochastic Model of Neural Computing Paolo Crippa, Claudio Turchetti, Massimiliano Pirani
683
Theoretical Developments and Applications of Fuzzy Techniques and Systems

Classification of Fuzzy Data in Database Management System Deval Popat, Hema Sharda, David Taniar
691
An Efficient Fuzzy Method for Handwritten Character Recognition Romesh Ranawana, Vasile Palade, G.E.M.D.C. Bandara
698
The GA_NN_FL Associated Model for Authentication Fingerprints Le Hoai Bac, Le Hoang Thai
708
Fuzzy Modeling of Zero Moment Point Trajectory for a Biped Walking Robot Dongwon Kim, Nak-Hyun Kim, Sam-Jun Seo, Gwi-Tae Park
716
Adaptive Resource Scheduling for Workflows Considering Competence and Preference Keon Myung Lee
723
Analysis of Chaotic Mapping in Recurrent Fuzzy Rule Bases Alexander Sokolov, Michael Wagenknecht
731
Highly Reliable Applications of Fuzzy Engineering

Damping Enhancement in Power Systems Using a Robust Fuzzy Sliding Mode Based PSS Controller Farzan Rashidi, Mehran Rashidi
738
Design a Robust and Adaptive Reinforcement Learning Based SVC Controller for Damping Enhancement in Power Systems Farzan Rashidi, Mehran Rashidi
745
A Rule-Based Approach for Fuzzy Overhaul Scheduling Hongqi Pan, Chung-Hsing Yeh
753
Fuzzy Kolmogorov’s Network Vitaliy Kolodyazhniy, Yevgeni Bodyanskiy
764
Fuzzy Selection Mechanism for Multimodel Prediction Y. Bodyanskiy, S. Popov
772
Efficient Approximate Reasoning with Positive and Negative Information Chris Cornelis, Martine De Cock, Etienne Kerre
779
Chance Discovery

Chance Discovery as Novel Empathy with TV Programs Masashi Taguchi, Yukio Ohsawa
786
Enhancing Chance Discovery: Dimensions, Strategies and Tools Daniel Howard, Mark A. Eduards
793
Consumer Behavior Analysis by Graph Mining Technique Katsutoshi Yada, Hiroshi Motoda, Takashi Washio, Asuka Miyawaki
800
A Chance Discovery Process to Understanding Spiral Behaviors of Consumers Noriyuki Kushiro, Yukio Ohsawa
807
Nursing Risk Prediction as Chance Discovery Akinori Abe, Kiyoshi Kogure, Norihiro Hagita
815
Exploring Collaboration Topics from Documented Foresights of Experts Yumiko Nara, Yukio Ohsawa
823
Condensation and Picture Annotations of Scenario Map for Consensus in Scenario Mining Kenichi Horie, Takashi Yamaguchi, Tsuneki Sakakibara, Yukio Ohsawa
831
Emergence of Product Value from On-line Communications Koichi Takahashi, Yukio Ohsawa, Naohiro Matsumura
839
Emerging Scenarios by Using DDM: A Case Study for Japanese Comic Marketing Hiroshi Tamura, Yuichi Washida, Yukio Ohsawa
847

Intelligent Cooperative Work

A Mobile Clickstream Time Zone Analysis: Implications for Real-Time Mobile Collaboration Toshihiko Yamakami
855
Interpretation of Emotionally Expressive Characters in an Intercultural Communication Tomoko Koda
862
Development and Evaluation of an Intercultural Synchronous Collaboration System Takashi Yoshino, Tomohiro Shigenobu, Shinji Maruno, Hiroshi Ozaki, Sumika Ohno, Jun Munemori
869
A Proposal of Knowledge Creative Groupware for Seamless Knowledge Takaya Yuizono, Jun Munemori, Akifumi Kayano, Takashi Yoshino, Tomohiro Shigenobu
876
comDesk: A Cooperative Assistance Tool Based on P2P Techniques Motoki Miura, Buntaoru Shizuki, Jiro Tanaka
883
Development of an Emotional Chat System Using Sense of Touch and Face Mark Hajime Yoshida, Takashi Yoshino, Jun Munemori
891
Dual Communication System Using Wired and Wireless Correspondence in a Small Space Kunihiro Yamada, Yoshihiko Hirata, Yukihisa Naoe, Takashi Furumura, Yoshio Inoue, Toru Shimizu, Koji Yoshida, Masanori Kojima, Tadanori Mizuno
898
The Beijing Explorer: Two-way Location Aware Guidance System Jun Munemori, Daisuke Kamisaka, Takashi Yoshino, Masaya Chiba
905
Development of a System for Learning Ecology Using 3D Graphics and XML Satoru Fujii, Jun Iwata, Yuka Miura, Kouji Yoshida, Sanshiro Sakai, Tadanori Mizuno
912
Practice of Linux Lesson in Blended Learning Kazuhiro Nakada, Tomonori Akutsu, Chris Walton, Satoru Fujii, Hiroshi Ichimura, Kunihiro Yamada, Kouji Yoshida
920
Requisites for Talented People in Industry and the Method of Education Teruhisa Ichikawa
928
Logic Based Intelligent Information Systems

Para-Fuzzy Logic Controller Jair Minoro Abe
935
Paraconsistent Artificial Neural Networks: An Introduction Jair Minoro Abe
942
The Study of the Effectiveness Using the Expanded Neural Network in System Identification Shigenobu Yamawaki, Lakhmi Jain
949
A Paraconsistent Logic Program Based Control for a Discrete Event Cat and Mouse Kazumi Nakamatsu, Ryuji Ishikawa, Atsuyuki Suzuki
954
EVALPSN Based Railway Interlocking Simulator Kazumi Nakamatsu, Yosuke Kiuchi, Atsuyuki Suzuki
961
Learning by Back-Propagating Output Correlation in Winner-takes-all and Auto-associative Networks Md. Shahjahan, K. Murase
968
Similarity Measures for Content-Based Multimedia Retrieval

Content-Based Video Retrieval Using Moving Objects’ Trajectories Choon-Bo Shim, Jae-Woo Chang
975
Content-Based Image Retrieval Using Multiple Representations Karin Kailing, Hans-Peter Kriegel, Stefan Schönauer
982
Similarity of Medical Images Computed from Global Feature Vectors for Content-Based Retrieval Thomas M. Lehmann, Mark O. Güld, Daniel Keysers, Thomas Deselaers, Henning Schubert, Berthold Wein, Klaus Spitzer
989
Similarity: Measurement, Ordering and Betweenness Walter ten Brinke, David McG. Squire, John Bigelow
996
Engineering of Intelligent Systems – Components and Activities

Qualitative Model for Quality Control in Production Marjan Družovec, Tatjana Welzer
1003
A Functional Language for Mobile Agents with Dynamic Extension Yasushi Kambayashi, Munehiro Takimoto
1010
Verifying Clinical Criteria for Parkinsonian Disorders with CART Decision Trees Petra Povalej, Gregor Štiglic, Peter Kokol, Bruno Stiglic, Irene Litvan, Dušan Flisar
1018
Improving Classification Accuracy Using Cellular Automata Petra Povalej, Gregor Štiglic, Tatjana Welzer, Peter Kokol
1025
Using Web Services and Semantic Web for Producing Intelligent Context-Aware Services Kimmo Salmenjoki, Tatjana Welzer
1032
Internationalization Content in Intelligent Systems – How to Teach it? Tatjana Welzer, David Riaño, Boštjan Brumen, Marjan Družovec
1039
Intelligent System Design

Recognizing Frontal Faces Using Neural Networks Stephen Karungaru, Minoru Fukumi, Norio Akamatsu
1045
Identification of the Multi-layered Neural Networks by Revised GMDH-Type Neural Network Algorithm with PSS Criterion Tadashi Kondo, Abhijit S. Pandya
1051
Detection of Transition of Various Time Series Model Using BP Neural Networks Takahiro Emoto, Masatake Akutagawa, Hirofumi Nagashino, Yohsuke Kinouchi
1060
A Pattern Generator for Multiple Periodic Signals Using Recurrent Neural Networks Fumihiko Takahashi, Masatake Akutagawa, Hirofumi Nagashino, Yohsuke Kinouchi
1068
Identification of Number of Brain Signal Sources Using BP Neural Networks Hirofumi Nagashino, Masafumi Hoshikawa, Qinyu Zhang, Masatake Akutagawa, Yohsuke Kinouchi
1074
Knowledge-Based Intelligent Systems for Health Care

Development of Coronary Heart Disease Database Machi Suka, Takumi Ichimura, Katsumi Yoshida
1081
Extraction of Rules from Coronary Heart Disease Database Using Automatically Defined Groups Akira Hara, Takumi Ichimura, Tetsuyuki Takahama, Yoshinori Isomichi
1089
Immune Multi Agent Neural Network and Its Application to the Coronary Heart Disease Database Shinichi Oeda, Takumi Ichimura, Katsumi Yoshida
1097
FESMI: A Fuzzy Expert System for Diagnosis and Treatment of Male Impotence Constantinos Koutsojannis, Ioannis Hatzilygeroudis
1106
Disease Diagnosis Support System Using Rules, Neural Network and Fuzzy Logic Le Hoai Bac, Nguyen Thanh Nghi
1114
Partial Merging of Semi-structured Knowledgebases Ladislau Bölöni, Damla Turgut
1121
Emotion Oriented Intelligent System for Elderly People Kazuya Mera, Yoshiaki Kurosawa, Takumi Ichimura
1128
Multi-modal Data Fusion: A Description Sarah Coppock, Lawrence J. Mazlack
1136
Multiagent Systems: Ontologies and Conflicts Resolution

Null Values and Chase in Distributed Information Systems Agnieszka Dardzinska Glebocka
1143
Soft Implementations of Epistemic Satisfaction Relations in Communicative Cognitive Agents
1150
Multi-agent Web Recommendation Method Based on Indirect Association Rules
1157
Migration Mechanisms for Multi-class Objects in Multiagent Systems Dariusz Król
1165
A Distributed Model for Institutions in Open Multi-agent Systems Marcos De Oliveira, Martin Purvis, Stephen Cranefield, Mariusz Nowostawski
1172
Deriving Consensus for Conflict Situations with Respect to Its Susceptibility Ngoc Thanh Nguyen, Michal Malowiecki
1179
A Collaborative Multi-agent Based Workflow System Bastin Tony Roy Savarimuthu, Maryam Purvis
1187
A Subjective Logic-Based Framework for Aligning Multiple Ontologies Krzysztof Juszczyszyn
1194
Operations Research for Intelligent Systems

When to Stop Range Process – An Expanded State Space Approach Kazuyoshi Tsurusaki, Seiichi Iwamoto
1201
A Nondeterministic Dynamic Programming Model Toshiharu Fujita, Takayuki Ueno, Seiichi Iwamoto
1208
Toward The Development of an Auto-poietic Multi-agent Simulator Katsumi Hirayama
1215
A Mean Estimation of Fuzzy Numbers by Evaluation Measures Yuji Yoshida
1222
An Objective Function Based on Fuzzy Preferences in Dynamic Decision Making Yuji Yoshida, Masami Yasuda, Jun-ichi Nakagami, Masami Kurano, Satoru Kumamoto
1230

Intelligent Data Analysis and Application

An Efficient Clustering Algorithm for Patterns Placement in Walkthrough System Shao-Shin Hung, Ting-Chia Kuo, Damon Shing-Min Liu
1237
Distance Preserving Mapping from Categories to Numbers for Indexing Huang-Cheng Kuo, Yi-Sen Lin, Jen-Peng Huang
1245
An Evolutionary Clustering Method for Part Family Formation with Multiple Process Plans Sheng-Chai Chi, In-Jou Lin, Min-Chuan Yan
1252
Design the Hardware of Genetic Algorithm for TSP and MSA Wen-Lung Shu, Chen-Cheng Wu, Wei-Cheng Lai
1260
Robust Bayesian Learning with Domain Heuristics for Missing Data Chian-Huei Wun, Chih-Hung Wu
1268
OLAM Cube Selection in On-Line Multidimensional Association Rules Mining System Wen-Yang Lin, Ming-Cheng Tseng, Min-Feng Wang
1276
Mining Fuzzy Association Rules with Multiple Minimum Supports Using Maximum Constraints Yeong-Chyi Lee, Tzung-Pei Hong, Wen-Yang Lin
1283
Author Index
1291
Table of Contents, Part III

Engineering of Ontology and Multi-agent System Design

Implementing EGAP-Based Many-Valued Argument Model for Uncertain Knowledge Taro Fukumoto, Takehisa Takahashi, Hajime Sawamura
1
Ontology Revision Using the Concept of Belief Revision Seung Hwan Kang, Sim Kim Lau
8
A Robust Rule-Based Event Management Architecture for Call-Data Records C. W. Ong, J. C. Tay
16
Adaptive Agent Integration in Designing Object-Based Multiagent System Jaya Sil
24
Ontological Representations of Software Patterns Jean-Marc Rosengard, Marian F. Ursu
31
Intelligent Multimedia Solution and the Security for the Next Generation Mobile Networks

Dynamic Traffic Grooming and Load Balancing for GMPLS-Centric All Optical Networks Hyuncheol Kim, Seongjin Ahn, Jinwook Chung
38
Probabilistic Model of Traffic Breakdown with Random Propagation of Disturbance for ITS Application Bongsoo Son, Taewan Kim, Hyung Jin Kim, Soobeom Lee
45
Novel Symbol Timing Recovery Algorithm for Multi-level Signal Kwang Ho Chun, Myoung Seob Lim
52
Development Site Security Process of ISO/IEC TR 15504 Eun-ser Lee, Tai-hoon Kim
60
Improving CAM-DH Protocol for Mobile Nodes with Constraint Computational Power Yong-Hwan Lee, Il-Sun You, Sang-Surm Rhee
67
Space Time Code Representation in Transform Domain Gi Yean Hwang, Jia Hou, Moon Ho Lee
74
A Multimedia Database System Using Mobile Indexing Agent in Wireless Network Jong-Hee Lee, Kwang-Hyoung Lee, Moon-Seog Jun, Keun-Wang Lee
81
Bus Arrival Time Prediction Method for ITS Application Bongsoo Son, Hyung Jin Kim, Chi-Hyun Shin, Sang-Keon Lee
88
RRAM Spare Allocation in Semiconductor Manufacturing for Yield Improvement Youngshin Han, Chilgee Lee
95
A Toolkit for Constructing Virtual Instruments for Augmenting User Interactions and Activities in a Virtual Environment Kyoung S. Park, Yongjoo Cho
103
Mobility Grouping Scheme to Reduce HLR Traffic in IMT-2000 Networks Dong Chun Lee, Gwang-Hyun Kim, Seung-Jae Yoo
110
Security Requirements for Software Development Tai-hoon Kim, Myong-chul Shin, Sang-ho Kim, Jae Sang Cha
116
Operations Research Based on Soft Computing

Intelligent Control Model of Information Appliances Huey-Ming Lee, Ching-Hao Mao, Shu-Yen Lee
123
Effective Solution of a Portfolio Selection Based on a Block of Shares by a Meta-controlled Boltzmann Machine Teruyuki Watanabe, Junzo Watada
129
Soft Computing Approach to Books Allocation Strategy for Library Junzo Watada, Keisuke Aoki, Takayuki Kawaura
136
Analysis of Human Feelings to Colors Taki Kanda
143
Possibilistic Forecasting Model and Its Application to Analyze the Economy in Japan Yoshiyuki Yabuuchi, Junzo Watada
151
A Proposal of Chaotic Forecasting Method Based on Wavelet Transform Yoshiyuki Matsumoto, Junzo Watada
159
Fuzzy Multivariant Analysis Junzo Watada, Masato Takagi, Jaeseok Choi
166
Web Mining and Personalization

Using Coherent Semantic Subpaths to Derive Emergent Semantics D.V. Sreenath, W.I. Grosky, F. Fotouhi
173
Retrieval of Product Reputations from the WWW Takahiro Hayashi, Yosuke Kinosita, Rikio Onai
180
A Logic-Based Approach for Matching User Profiles Andrea Calì, Diego Calvanese, Simona Colucci, Tommaso Di Noia, Francesco M. Donini
187
Learning and Soft Computing with Support Vector Machines (SVM) and RBF NNs

Pose Classification of Car Occupant Using Stereovision and Support Vector Machines Min-Soo Jang, Yong-Guk Kim, Hyun-Gu Lee, Byung-Joo Lee, Soek-Joo Lee, Gwi-Tae Park
196
A Fully Automatic System Recognizing Human Facial Expressions Yong-Guk Kim, Sung-Oh Lee, Sang-Jun Kim, Gwi-Tae Park
203
A Study of the Radial Basis Function Neural Network Classifiers Using Known Data of Varying Accuracy and Complexity Patricia Crowther, Robert Cox, Dharmendra Sharma
210
Novel Methods in Evolutionary Computation

Top Down Modelling with Genetic Programming Daniel Howard
217
A Two Phase Genetic Programming Approach to Object Detection Mengjie Zhang, Peter Andreae, Urvesh Bhowan
224
Mapping XML Schema to Relations Using Genetic Algorithm Vincent Ng, Chan Chi Kong, Stephen Chan
232
Diagnosing the Population State in a Genetic Algorithm Using Hamming Distance Radu Belea, Sergiu Caraman, Vasile Palade
246
Optimizing a Neural Tree Using Subtree Retraining Wanida Pensuwon, Rod Adams, Neil Davey
256
Bioinformatics Using Intelligent and Machine Learning Techniques

Cluster Analysis of Gene Expression Profiles Using Automatically Extracted Seeds Miyoung Shin, Seon-Hee Park
263
Prediction of Plasma Membrane Spanning Region and Topology Using Hidden Markov Model and Neural Network Min Kyung Kim, Hyun Seok Park, Seon Hee Park
270
Speed Control and Torque Ripple Minimization in Switch Reluctance Motors Using Context Based Brain Emotional Learning Mehran Rashidi, Farzan Rashidi, Mohammad Hossein Aghdaei, Hamid Monavar
278
Practical Common Sense Reasoning

Reasoning in Practical Situations Pei Wang
285
Commonsense Reasoning in and Over Natural Language Hugo Liu, Push Singh
293
A Library of Behaviors: Implementing Commonsense Reasoning About Mental World Boris Galitsky
307
Handling Default Rules by Autistic Reasoning Don Peterson, Boris Galitsky
314
Systems for Large-scale Metadata Extraction and Maintenance

An Ontology-Driven Approach to Metadata Design in the Mining of Software Process Events Gabriele Gianini, Ernesto Damiani
321
Knowledge Extraction from Semi-structured Data Based on Fuzzy Techniques Paolo Ceravolo, Maria Cristina Nocerino, Marco Viviani
328
Managing Ontology Evolution Via Relational Constraints Paolo Ceravolo, Angelo Corallo, Gianluca Elia, Antonio Zilli
335
Service Customization Supporting an Adaptive Information System Antonio Caforio, Angelo Corallo, Gianluca Elia, Gianluca Solazzo
342
Soft Computing in Fault Detection and Diagnosis

Using Design Information to Support Model-Based Fault Diagnosis Tasks Katsuaki Tanaka, Yoshikiyo Kato, Shin’ichi Nakasuka, Koichi Hori
350
Fault Detection and Diagnosis Using the Fuzzy Min-Max Neural Network with Rule Extraction Kok Yeng Chen, Chee Peng Lim, Weng Kin Lai
357
Refinement of the Diagnosis Process Performed with a Fuzzy Classifier C. D. Bocaniala, J. Sa da Costa, V. Palade
365
ANN-Based Structural Damage Diagnosis Using Measured Vibration Data Eric W.M. Lee, H.F. Lam
373
Induction Machine Diagnostic Using Adaptive Neuro Fuzzy Inferencing System Mohamad Shukri, Marzuki Khalid, Rubiyah Yusuf, Mohd Shafawi
380

Intelligent Feature Recognition and Classification in Astrophysical and Medical Images

Real Time Stokes Inversion Using Multiple Support Vector Regression David Rees, Ying Guo, Arturo López Ariste, Jonathan Graham
388
Extracting Stellar Population Parameters of Galaxies from Photometric Data Using Evolution Strategies and Locally Weighted Linear Regression Luis Alvarez, Olac Fuentes, Roberto Terlevich
395
Using Evolution Strategies to Find a Dynamical Model of the M81 Triplet Juan Carlos Gomez, Olac Fuentes, Lia Athanassoula, Albert Bosma
404
Automated Classification of Galaxy Images Jorge de la Calleja, Olac Fuentes
411
Automatic Solar Flare Tracking Ming Qu, Frank Shih, Ju Jing, Haimin Wang, David Rees
419
Source Separation Techniques Applied to Astrophysical Maps E. Salerno, A. Tonazzini, L. Bedini, D. Herranz, C. Baccigalupi
426
Counting Magnetic Bipoles on the Sun by Polarity Inversion Harrison P. Jones
433
Correlation of the He I 1083 nm Line Width and Intensity as a Coronal Hole Identifier Olena Malanushenko, Harrison P. Jones
439
Automated Recognition of Sunspots on the SOHO/MDI White Light Solar Images S. Zharkov, V. Zharkova, S. Ipson, A. Benkhalil
446
A Procedure for the Automated Detection of Magnetic Field Inversion in SOHO MDI Magnetograms S.S. Ipson, V.V. Zharkova, S.I. Zharkov, A. Benkhalil
453
Automatic Detection of Active Regions on Solar Images A. Benkhalil, V. Zharkova, S. Ipson, S. Zharkov
460
Automatic Detection of Solar Filaments Versus Manual Digitization N. Fuller, J. Aboudarham
467
Adaptation of Shape Dendritic Spines by Genetic Algorithm A. Herzog, V. Spravedlyvyy, K. Kube, E. Korkotian, K. Braun, E. Michaelis
476
Detection of Dynamical Transitions in Biomedical Signals Using Nonlinear Methods Patrick E. McSharry
483
Applications of Machine Learning Concepts

On Retrieval of Lost Functions for Feedforward Neural Networks Using Re-Learning Naotake Kamiura, Teijiro Isokawa, Kazuharu Yamato, Nobuyuki Matsui
491
Analyzing the Temporal Sequences for Text Categorization Xiao Luo, A. Nur Zincir-Heywood
498
Prediction of Women’s Apparel Sales Using Soft Computing Methods Les M. Sztandera, Celia Frank, Balaji Vemulapali
506
A Try for Handling Uncertainties in Spatial Data Mining Shuliang Wang, Guoqing Chen, Deyi Li, Deren Li, Hanning Yuan
513
Combining Evidence from Classifiers in Text Categorization Yaxin Bi, David Bell, Jiwen Guan
521
Predicting the Relationship Between the Size of Training Sample and the Predictive Power of Classifiers Natthaphan Boonyanunta, Panlop Zeephongsekul
529
Topographic Map Formation Employing kMER with Units Deletion Rule Eiji Uchino, Noriaki Suetake, Chuhei Ishigaki
536
Neuro-Fuzzy Hybrid Intelligent Industrial Control and Monitoring

Study on Weld Quality Control of Resistance Spot Welding Using a Neuro-Fuzzy Algorithm Yansong Zhang, Guanlong Chen, Zhongqin Lin
544
Exploring Benefits of Neuro Fuzzy Controller with Vehicle Health Monitoring Preeti Bajaj, Avinash Keskar
551
Improvement of Low Frequency Oscillation Damping in Power Systems Via an Adaptive Critic Based NeuroFuzzy Controller Farzan Rashidi, Behzad Moshiri
559
Use of Artificial Neural Networks in the Prediction of the Kidney Transplant Outcomes Fariba Shadabi, Robert Cox, Dharmendra Sharma, Nikolai Petrovsky
566
Intelligent Hybrid Systems for Robotics

An SoC-Based Context-Aware System Architecture Keon Myung Lee, Bong Ki Sohn, Jong Tae Kim, Seung Wook Lee, Ji Hyong Lee, Jae Wook Jeon, Jundong Cho
573
An Intelligent Control of Chaos in Lorenz System with a Dynamic Wavelet Network Yusuf Oysal
581
Intelligent Robot Control with Personal Digital Assistants Using Fuzzy Logic and Neural Network Seong-Joo Kim, Woo-Kyoung Choi, Hong-Tae Jeon
589
Mobile Robot for Door Opening in a House Dongwon Kim, Ju-Hyun Kang, Chang-Soon Hwang, Gwi-Tae Park
596
Hybrid Fuzzy-Neural Architecture and Its Application to Time Series Modeling Dongwon Kim, Sam-Jun Seo, Gwi-Tae Park
603

Techniques of Computational Intelligence for Affective Computing

Accelerometer Signal Processing for User Activity Detection Jonghun Baek, Geehyuk Lee, Wonbae Park, Byoung-Ju Yun
610
Neural Network Models for Product Image Design Yang-Cheng Lin, Hsin-Hsi Lai, Chung-Hsing Yeh
618
Evaluation of Users’ Adaptation by Applying LZW Compression Algorithm to Operation Logs Hiroshi Hayama, Kazuhiro Ueda
625
Study on Segmentation Algorithm for Unconstrained Handwritten Numeral Strings Zhang Chuang, Wu Ming, Guo Jun
632

Information Agents on the Internet and Intelligent Web Mining

Wavelet-Based Image Watermarking Using the Genetic Algorithm Prayoth Kumsawat, Kitti Attkitmongcol, Arthit Srikaew, Sarawut Sujitjorn
643
Extraction of Road Information from Guidance Map Images Hirokazu Watabe, Tsukasa Kawaoka
650
Dynamic Customer Profiling Architecture Using High Performance Computing Qiubang Li, Rajiv Khosla, Chris Lai
657
Intelligent Information Systems Using Case-Based Reasoning or Search Engineering

Predicting Business Failure with a Case-Based Reasoning Approach Angela Y.N. Yip
665
Capturing and Applying Lessons Learned During Engineering Equipment Installation Ian Watson
672
Case-Based Adaptation for UML Diagram Reuse Paulo Gomes, Francisco C. Pereira, Paulo Carreiro, Paulo Paiva, Nuno Seco, José L. Ferreira, Carlos Bento
678
Harmonic Identification for Active Power Filters Via Adaptive Tabu Search Method Thanatchai Kulworawanichpong, Kongpol Areerak, Kongpan Areerak, Sarawut Sujitjorn
687
Active Power Filter Design by a Simple Heuristic Search Thanatchai Kulworawanichpong, Kongpol Areerak, Sarawut Sujitjorn
695
Stochastic Local Search for Incremental SAT and Incremental MAX-SAT Malek Mouhoub, Changhai Wang
702
Finite Convergence and Performance Evaluation of Adaptive Tabu Search Deacha Puangdownreong, Thanatchai Kulworawanichpong, Sarawut Sujitjorn
710
Applications of Computational Intelligence to Signal and Image Processing

Knowledge-Based Method to Recognize Objects in Geo-Images Serguei Levachkine, Miguel Torres, Marco Moreno, Rolando Quintero
718
Fast Design of 2-D Narrow Bandstop FIR Filters for Image Enhancement Pavel Zahradnik,
726
Fast Design of Optimal Comb FIR Filters Pavel Zahradnik,
733
Artificial Intelligence Methods in Diagnostics of the Pathological Speech Signals Andrzej Izworski, Ryszard Tadeusiewicz, Wieslaw Wszolek
740
Intelligent Sub-patch Texture Synthesis Algorithm for Smart Camera Jhing-Fa Wang, Han-Jen Hsu, Hong-Ming Wang
749
Exploration of Image Features for Describing Visual Impressions of Black Fabrics Chie Muraki Asano, Satoshi Hirakawa, Akira Asano
756

Emergent Global Behaviors of Distributed Intelligent Engineering and Information Systems

Distributed Resource Allocation via Local Choices: General Model and a Basic Solution Marian F. Ursu, Botond Virginas, Chris Voudouris
764
Behavior Profiling Based on Psychological Data and Emotional States Rajiv Khosla, Chris Lai, Tharanga Goonesekera
772
Extension of Multiagent Data Mining for Distributed Databases Ayahiko Niimi, Osamu Konishi
780
Agent-Based Approach to Conference Information Management Hee-Seop Han, Jae-Bong Kim, Sun-Gwan Han, Hyeoncheol Kim
788
Mining Frequency Pattern from Mobile Users John Goh, David Taniar
795
Semi-supervised Learning from Unbalanced Labeled Data – An Improvement Te Ming Huang, Vojislav Kecman
802
Posters

Handling Emergent Resource Use Oscillations Mark Klein, Richard Metzler, Yaneer Bar-Yam
809
A Practical Timetabling Algorithm for College Lecture-Timetable Scheduling Kyoung-Soon Hwang, Keon Myung Lee, Joongnam Jeon
817
Java Bytecode-to-.NET MSIL Translator for Construction of Platform Independent Information Systems YangSun Lee, Seungwon Na
826
A Scale and Viewing Point Invariant Pose Estimation M. Y. Nam, P. K. Rhee
833
A Novel Image Preprocessing by Evolvable Neural Network M.Y. Nam, W.Y. Han, P.K. Rhee
843
Transition Properties of Higher Order Associative Memory of Sequential Patterns Hiromi Miyajima, Noritaka Shigei, Yasuo Hamakawa
855
Morphological Blob-Mura Defect Detection Method for TFT-LCD Panel Inspection Young-Chul Song, Doo-Hyun Choi, Kil-Houm Park
862
A Recommendation System for Intelligent User Interface: Collaborative Filtering Approach Ju-Hyoung Yoo, Kye-Soon Ahn, Jeong Jun, Phill-Kyu Rhee
869
Fast Half Pixel Motion Estimation Based on the Spatial Correlation Hyo Sun Yoon, Guee Sang Lee
880
A New Vertex Selection Scheme Using Curvature Information Byoung-Ju Yun, Si-Woong Lee, Jae-Soo Cho, Jae Gark Choi, Hyun-Soo Kang
887
Author Index
895
Web Intelligence, World Knowledge and Fuzzy Logic – The Concept of Web IQ (WIQ)

Lotfi A. Zadeh
Professor in the Graduate School, Computer Science Division, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California, USA
Director, Berkeley Initiative in Soft Computing (BISC)
Fuzzy Conceptual Matching: Tool for Intelligent Knowledge Management and Discovery in the Internet

Given the ambiguity and imprecision of the “concept” in the Internet, which may be described with both textual and image information, the use of Fuzzy Conceptual Matching (FCM) is a necessity for search engines. In the FCM approach, the “concept” is defined by a series of keywords with different weights depending on the importance of each keyword. Ambiguity in concepts can be defined by a set of imprecise concepts. Each imprecise concept, in fact, can be defined by a set of fuzzy concepts. The fuzzy concepts can then be related to a set of imprecise words given the context. Imprecise words can then be translated into precise words given the ontology and ambiguity resolution through a clarification dialog. By constructing the ontology and fine-tuning the strength of links (weights), we could construct a fuzzy set to integrate piecewise the imprecise concepts and precise words to define the ambiguous concept.
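As a concrete illustration of the weighted-keyword view of a concept sketched above, the following minimal Python fragment scores a document against a concept represented as keywords with importance weights. The keywords, weights, and the min/weighted-mean aggregation operators here are hypothetical choices for illustration only, not the definition of FCM.

```python
# Illustrative sketch only: a "concept" is a set of keywords with importance
# weights, and a document is scored by fuzzy aggregation of keyword matches.
# Keywords, weights, and operators below are hypothetical, not the FCM spec.

def fcm_score(concept: dict[str, float], doc_terms: dict[str, float]) -> float:
    """Match a weighted-keyword concept against fuzzy term memberships of a document."""
    # Degree of match per keyword: min(importance, membership in the document);
    # concept-level score: weighted mean of those degrees.
    total_weight = sum(concept.values())
    matched = sum(min(w, doc_terms.get(kw, 0.0)) for kw, w in concept.items())
    return matched / total_weight if total_weight else 0.0

# Hypothetical concept "vehicle" built from weighted keywords
concept = {"car": 1.0, "truck": 0.8, "wheel": 0.4}
document = {"car": 0.9, "wheel": 0.7, "road": 0.5}   # fuzzy term memberships
print(round(fcm_score(concept, document), 3))        # -> 0.591
```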
References

[1] Nikravesh M, Zadeh L A (2000-2004) Perception Based Information Processing and Retrieval Application to User Profiling, Berkeley-BT project.
[2] Nikravesh M, Azvine B (2001) FLINT 2001, New Directions in Enhancing the Power of the Internet, UC Berkeley Electronics Research Laboratory, Memorandum No. UCB/ERL M01/28, August 2001.
[3] Loia V, Nikravesh M, Zadeh L A (2004) Journal of Soft Computing, Special Issue: Fuzzy Logic and the Internet, Springer Verlag (to appear).
[4] Nikravesh M (2001) Fuzzy Logic and Internet: Perception Based Information Processing and Retrieval, Berkeley Initiative in Soft Computing, Report No. 2001-2-SI-BT, September 2001.
[5] Nikravesh M (2001) BISC and The New Millennium, Perception-based Information Processing, Berkeley Initiative in Soft Computing, Report No. 2001-1-SI, September 2001.
From Search Engines to Question-Answering Systems – The Need for New Tools

Search engines, with Google at the top, have many remarkable capabilities. But what is not among them is the deduction capability – the capability to synthesize an answer to a query by drawing on bodies of information which are resident in various areas of
the knowledge base. It is this capability that differentiates a question-answering system (Q/A-system) from a search engine. Question-answering systems have a long history. Search engines as we know them today owe their existence and capabilities to the web. Upgrading a search engine to a Q/A system is a complex, effort-intensive and open-ended problem. Semantic web and related systems may be viewed as steps in this direction. However, the thrust of the following is that substantial progress is unattainable through the use of existing tools, which are based on bivalent logic and probability theory. The principal obstacle is the nature of world knowledge. Reflecting the bounded ability of sensory organs, and ultimately the brain, to resolve detail and store information, perceptions are intrinsically imprecise. The imprecision of perceptions puts them well beyond the reach of existing methods of meaning-representation based on predicate logic and probability theory. What this implies is that new tools are needed to deal with world knowledge in the context of search, deduction and decision analysis. The principal new tool is based on the recently developed methodology of computing with words and perceptions (CWP). The point of departure in CWP is the assumption that perceptions are described in a natural language. In this way, computing with perceptions is reduced to computing with propositions drawn from a natural language, e.g., “If A/person works in B/city then it is likely that A lives in or near B.” A concept which plays a key role in CWP is that of precisiated natural language (PNL). A proposition, p, in NL is precisiable if it is translatable into a precisiation language. In the case of PNL, the precisiation language is the generalized constraint language (GCL). By construction, GCL is maximally expressive. One of the principal functions of PNL is that of serving as a knowledge-description language and, more particularly, as a world-knowledge-description language. In this context, PNL is employed to construct what is referred to as an epistemic (knowledge-directed) lexicon (EL).

The BISC Initiative: Fuzzy Logic and the Internet (FLINT); Perception Based Information Processing and Analysis

This project is focused on the need for an initiative to design an intelligent search engine based on two main motivations. The web environment is, for the most part, unstructured and imprecise; to deal with information in the web environment, we need a logic that supports modes of reasoning that are approximate rather than exact. While searches may retrieve thousands of hits, finding decision-relevant and query-relevant information in an imprecise environment is a challenging problem, which has to be addressed. Another less obvious issue is deduction in an unstructured and imprecise environment given the huge stream of complex information. As a result, intelligent search engines with growing complexity and technological challenges are currently being developed. This requires new technology in terms of understanding, development, engineering design, and visualization. While the technological expertise of each component becomes increasingly complex, there is a need for better integration of each component into a global model adequately capturing the imprecision and deduction capabilities.
The objective of this initiative is to develop an intelligent computer system with deductive capabilities to conceptually match and rank pages based on predefined linguistic formulations and rules defined by experts or based on a set of known homepages. The Conceptual Fuzzy Set (CFS) model will be used for intelligent information and knowledge retrieval through conceptual matching of both text and images (here defined as “Concept”). The selected query doesn’t need to match the decision criteria exactly, which gives the system a more human-like behavior. The CFS can also be used for constructing fuzzy ontology or terms related to the context of the search or query to resolve the ambiguity. The expert knowledge of the Berkeley groups will also be combined with soft computing tools.
References

[1] Nikravesh M, Azvine B (2001) FLINT 2001, New Directions in Enhancing the Power of the Internet, UC Berkeley Electronics Research Laboratory, Memorandum No. UCB/ERL M01/28, August 2001.
[2] Loia V, Nikravesh M, Zadeh L A (2004) Journal of Soft Computing, Special Issue: Fuzzy Logic and the Internet, Springer Verlag (to appear).
[3] Nikravesh M (2001) Fuzzy Logic and Internet: Perception Based Information Processing and Retrieval, Berkeley Initiative in Soft Computing, Report No. 2001-2-SI-BT, September 2001.
[4] Nikravesh M (2001) BISC and The New Millennium, Perception-based Information Processing, Berkeley Initiative in Soft Computing, Report No. 2001-1-SI, September 2001.
[5] Nikravesh M, Zadeh L A (2000-2004) Perception Based Information Processing and Retrieval Application to User Profiling, Berkeley-BT project, 2000-2004.
Biography

Lotfi A. Zadeh is a Professor in the Graduate School, Computer Science Division, Department of EECS, University of California, Berkeley. In addition, he is serving as the Director of BISC (Berkeley Initiative in Soft Computing). Prof. Lotfi A. Zadeh is an alumnus of the University of Teheran, MIT and Columbia University. He held visiting appointments at the Institute for Advanced Study, Princeton, NJ; MIT; IBM Research Laboratory, San Jose, CA; SRI International, Menlo Park, CA; and the Center for the Study of Language and Information, Stanford University. His earlier work was concerned in the main with systems analysis, decision analysis and information systems. His current research is focused on fuzzy logic, computing with words and soft computing, which is a coalition of fuzzy logic, neurocomputing, evolutionary computing, probabilistic computing and parts of machine learning. The guiding principle of soft computing is that, in general, better solutions can be obtained by employing the constituent methodologies of soft computing in combination rather than in stand-alone mode. Prof. Zadeh is a fellow of the IEEE, AAAS, ACM and AAAI, and a member of the National Academy of Engineering. He held NSF Senior Postdoctoral Fellowships in 1956-57 and 1962-63, and was a Guggenheim Foundation Fellow in 1968. Prof. Zadeh
was the recipient of the IEEE Education Medal in 1973 and a recipient of the IEEE Centennial Medal in 1984. In 1989, Prof. Zadeh was awarded the Honda Prize by the Honda Foundation, and in 1991 received the Berkeley Citation, University of California. In 1992, Prof. Zadeh was awarded the IEEE Richard W. Hamming Medal “For seminal contributions to information science and systems, including the conceptualization of fuzzy sets.” He became a Foreign Member of the Russian Academy of Natural Sciences (Computer Sciences and Cybernetics Section) in 1992 and received the Certificate of Commendation for AI Special Contributions Award from the International Foundation for Artificial Intelligence. Also in 1992, he was awarded the Kampe de Feriet Prize and became an Honorary Member of the Austrian Society of Cybernetic Studies. In 1993, Prof. Zadeh received the Rufus Oldenburger Medal from the American Society of Mechanical Engineers “For seminal contributions in system theory, decision analysis, and theory of fuzzy sets and its applications to AI, linguistics, logic, expert systems and neural networks.” He was also awarded the Grigore Moisil Prize for Fundamental Researches, and the Premier Best Paper Award by the Second International Conference on Fuzzy Theory and Technology. In 1995, Prof. Zadeh was awarded the IEEE Medal of Honor “For pioneering development of fuzzy logic and its many diverse applications.” In 1996, Prof. Zadeh was awarded the Okawa Prize “For outstanding contribution to information science through the development of fuzzy logic and its applications.” In 1997, Prof. Zadeh was awarded the B. Bolzano Medal by the Academy of Sciences of the Czech Republic “For outstanding achievements in fuzzy mathematics.” He also received the J.P. Wohl Career Achievement Award of the IEEE Systems, Science and Cybernetics Society. He served as a Lee Kuan Yew Distinguished Visitor, lecturing at the National University of Singapore and the Nanyang Technological University in Singapore, and as the Gulbenkian Foundation Visiting Professor at the New University of Lisbon in Portugal. In 1998, Prof. Zadeh was awarded the Edward Feigenbaum Medal by the International Society for Intelligent Systems, and the Richard E. Bellman Control Heritage Award by the American Council on Automatic Control. In addition, he received the Information Science Award from the Association for Intelligent Machinery and the SOFT Scientific Contribution Memorial Award from the Society for Fuzzy Theory in Japan. In 1999, he was elected to membership in Berkeley Fellows and received the Certificate of Merit from IFSA (International Fuzzy Systems Association). In 2000, he received the IEEE Millennium Medal; the IEEE Pioneer Award in Fuzzy Systems; the ASPIH 2000 Lifetime Distinguished Achievement Award; and the ACIDCA 2000 Award for the paper, “From Computing with Numbers to Computing with Words – From Manipulation of Measurements to Manipulation of Perceptions.” In 2001, he received the ACM 2000 Allen Newell Award for seminal contributions to AI through his development of fuzzy logic. Prof. Zadeh holds honorary doctorates from Paul-Sabatier University, Toulouse, France; State University of New York, Binghamton, NY; University of Dortmund, Dortmund, Germany; University of Oviedo, Oviedo, Spain; University of Granada,
Granada, Spain; Lakehead University, Canada; University of Louisville, KY; Baku State University, Azerbaijan; the Silesian Technical University, Gliwice, Poland; the University of Toronto, Toronto, Canada; the University of Ostrava, Ostrava, the Czech Republic; the University of Central Florida, Orlando, FL; the University of Hamburg, Hamburg, Germany; and the University of Paris (6), Paris, France. Prof. Zadeh has authored close to two hundred papers and serves on the editorial boards of over fifty journals. He is a member of the Advisory Board, Fuzzy Initiative, North Rhine-Westfalia, Germany; Advisory Board, Fuzzy Logic Research Center, Texas A&M University, College Station, Texas; Advisory Committee, Center for Education and Research in Fuzzy Systems and Artificial Intelligence, Iasi, Romania; Senior Advisory Board, International Institute for General Systems Studies; the Board of Governors, International Neural Networks Society; and is the Honorary President of the Biomedical Fuzzy Systems Association of Japan and the Spanish Association for Fuzzy Logic and Technologies. In addition, he is a member of the International Steering Committee, Hebrew University School of Engineering; a member of the Advisory Board of the National Institute of Informatics, Tokyo; a member of the Governing Board, Knowledge Systems Institute, Skokie, IL; and an honorary member of the Academic Council of NAISO-IAAC.
Industrial Applications of Evolvable Hardware

Dr. Tetsuya Higuchi
MIRAI Project, National Institute of Advanced Industrial Science and Technology, Japan
[email protected]
In this talk, the concept of Evolvable Hardware (EHW) is first introduced, with a focus on: the basic concept of evolvable hardware; digital hardware evolution (gate-level and function-level evolvable hardware); analogue hardware evolution; and mechanical hardware evolution. The industrial applications of EHW are then discussed, including: (a) a cellular phone analog LSI EHW chip, installed in cellular phones since December 2001; (b) GHz-processor clock optimisation, i.e. improvement of clock frequency by EHW clock timing adjustment with a GA; (c) high-speed data transfer; (d) an (evolvable) femto-second laser system; (e) an artificial hand, in which EHW implements pattern recognition hardware specific to individuals; and (f) an EMG (electromyograph) prosthetic hand that can adapt to individuals very quickly by means of an EHW chip.
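To make application (b) above a little more concrete, the following is a minimal, purely illustrative sketch of a genetic algorithm searching over per-block clock-timing settings. The chromosome encoding, the placeholder fitness function, and all parameters are assumptions made for illustration: on a real EHW chip, fitness would instead be measured by testing the LSI at speed.

```python
# Illustrative GA sketch for clock-timing adjustment. The fitness below is a
# stand-in that rewards balanced (low-skew) delay settings; a real EHW chip
# would evaluate each candidate by running test vectors at increasing clock
# frequency. Encoding and parameters are hypothetical.
import random

N_BLOCKS, POP, GENS = 8, 20, 50   # delay registers per chip, population, generations

def fitness(delays):
    mean = sum(delays) / len(delays)
    return -max(abs(d - mean) for d in delays)   # smaller skew -> higher fitness

def evolve():
    pop = [[random.randint(0, 15) for _ in range(N_BLOCKS)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: POP // 2]                 # truncation selection
        children = []
        while len(children) < POP - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_BLOCKS)
            child = a[:cut] + b[cut:]             # one-point crossover
            child[random.randrange(N_BLOCKS)] = random.randint(0, 15)  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

print(evolve())   # best per-block delay settings found
```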
Reference

[1] Higuchi T (1999) Real-world applications of analog and digital evolvable hardware. In: IEEE Trans. on EC, September, 1999
Biography

Dr. Tetsuya Higuchi heads the new circuit/system group of the MIRAI semiconductor project at the National Institute of Advanced Industrial Science and Technology, Japan. He is also a professor at the University of Tsukuba, and he has chaired the ICES conferences.
Outline of MIRAI Project

Millennium Research for Advanced Information Technology (MIRAI) is a research project authorized by the New Energy and Industrial Technology Development Organization (NEDO) under a program funded by the Ministry of Economy, Trade and Industry (METI) of Japan. MIRAI is a seven-year project divided into a first phase (three years) and a second phase (four years). In 2003, the last year of the first phase, the project was assessed by outside experts, and received high evaluation marks. The allocated grants are provided to the Advanced Semiconductor Research Center (ASRC) and the Association of Super-Advanced Electronics Technologies (ASET), the two organizations that conduct the joint research project. Work on the project is to be shared among five R&D groups, each organized around a specific theme. One of these five teams is called New Circuits and System
Technology, having Dr. Tetsuya Higuchi, ASRC, National Institute of Advanced Industrial Science and Technology, as its group leader. The research of this group is focused on the following themes: development of post-production adjustment technology to make possible 45-nm-generation LSIs with higher processing speeds and lower power consumption; and development of technology for yield enhancement by installing adjustment circuits within LSIs to compensate for signal delay lags in LSIs.
Equilibrium Modelling of Oligonucleotide Hybridization, Error, and Efficiency for DNA-Based Computational Systems

John A. Rose
The University of Tokyo, Dept. of Computer Science, and Japan Science and Technology Corporation, CREST
[email protected]
http://hagi.is.s.u-tokyo.ac.jp/johnrose/
A principal research area in biomolecular computing [1] is the development of analytical methods for evaluating computational fidelity and efficiency [2]. In this work, the equilibrium theory of the DNA helix-coil transition [3] is reviewed, with a focus on current applications to the analysis and design of DNA-based computers. Following a brief overview, a discussion is presented of the typical basic application to modeling the characteristics of coupled DNA systems, via decomposition into component equilibria which are then assumed to proceed independently [4–6]. Extension to support the explicit modeling of the gross behavior of coupled equilibria, via an estimate of the mean error probability per hybridized conformation, or computational incoherence, is then discussed, including approximate application [7–11] to estimate the fidelities of the annealing biostep of DNA-based computing [1] and of DNA microarray-based Tag-Antitag (TAT) systems [12]. Finally, a variation of this method is discussed, which models the computational efficiency of an autonomous DNA hairpin-based computer, Whiplash PCR [13], via a pseudo-equilibrium, Markov chain approach, by assuming process equilibrium between successive computational steps [14]. Illustrative simulations, computed under the enhanced statistical zipper [5] and all-or-none models of duplex formation, combined with a nearest-neighbor model of duplex energetics [15], are presented for three DNA-based systems of interest: (1) melting curves for typical, perfectly-matched and mismatched DNA oligonucleotides; (2) coupled error-response curves for a small TAT system, with comparison to the behavior expected via a consideration of the isolated melting curves, as well as approximate solution of the inverse problem of high-fidelity design; and (3) prediction of the efficiency of recursive hairpin formation/extension during Whiplash PCR, along with a brief discussion of rational re-design.
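As a small illustration of the all-or-none (two-state) duplex model mentioned above, the sketch below computes the fraction of hybridized strands as a function of temperature. The enthalpy/entropy values and strand concentration are hypothetical stand-ins; in practice they would be derived from a nearest-neighbor model of duplex energetics such as [15], and this sketch does not reproduce the enhanced statistical zipper model used in the talk.

```python
# Two-state (all-or-none) duplex model sketch: A + B <-> AB, with duplex
# fraction obtained from K(T) = exp(-(dH - T*dS)/(R*T)). dH, dS, and c0 are
# hypothetical example values, not measured parameters.
import math

R = 1.987e-3            # gas constant, kcal/(mol K)
dH, dS = -80.0, -0.220  # hypothetical duplex enthalpy (kcal/mol) and entropy (kcal/(mol K))
c0 = 1e-6               # initial concentration of each strand, mol/L

def duplex_fraction(T: float) -> float:
    """Fraction of strands hybridized at temperature T (kelvin)."""
    K = math.exp(-(dH - T * dS) / (R * T))
    # Solve K = x / (c0 - x)^2 for the duplex concentration x, taking the
    # physical root of K*x^2 - (2*K*c0 + 1)*x + K*c0^2 = 0.
    b = 2.0 * K * c0 + 1.0
    x = (b - math.sqrt(b * b - 4.0 * K * K * c0 * c0)) / (2.0 * K)
    return x / c0

for celsius in (30, 40, 47, 55, 65):   # crude melting curve
    print(celsius, round(duplex_fraction(celsius + 273.15), 3))
```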
References

1. Rose J A, Wood D H, Suyama A (2004) Natural Computing
2. Adleman L (1994) Science 266, 1021-4
3. Wartell R, Benight A (1985) Physics Reports 126, 67-107
4. Wetmur J (1999) In: DNA Based Computers III (eds) Rubin H, Wood D, A.M.S., pp 1-25
5. Hartemink A, Gifford D (1999) In: DNA Based Computers III (eds) Rubin H, Wood D, A.M.S., pp 25-38
6. Deaton R, et al (1998) Phys. Rev. Lett. 80, 417-20
7. Rose J A, et al (1999) In: Proc. 1999 Genet. Evol. Comp. Conf. (GECCO’99), Morgan Kauffman, San Francisco, pp 1829-1834
8. Rose J A, et al (2003) In: DNA Computing, LNCS 2054 (Proc DNA 6), (eds) Condon A, Rozenberg G, Springer-Verlag, pp 231-246
9. Rose J A, et al (2002) In: DNA Computing, LNCS 2340 (Proc DNA 7), (eds) Jonoska N, Seeman N, Springer-Verlag, pp 138-149
10. Rose J A, et al (2003) In: Proc. 2003 Cong. Evol. Comp. (CEC’03), Canberra, Australia, pp 2740-47
11. Rose J A (2004) In: Proc. Int’l Conf. Knowledge-based Intel. Inf. & Eng. Sys. (KES’04), Wellington, New Zealand, in press
12. BenDor A, et al (2000) J. Comput. Biol. 7, 503-519
13. Sakamoto K, et al (1999) Biosystems 52, 81-91
14. Rose J A, et al (2002) Equilibrium Analysis of the Efficiency of an Autonomous Molecular Computer. Physical Review E 65, Article 02910, pp 1-13
15. SantaLucia J (1998) P.N.A.S. USA 95, pp 1460-5
16. Rubin H, Wood D (eds) (1999) DNA Based Computers III, A.M.S.
Biography

Dr. John A. Rose is currently with the Department of Computer Science, U.P.B.S.B., The University of Tokyo, and the Japan Science and Technology Corporation, CREST, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, JAPAN. Tel/Fax: +81-3-5841-4691. Email: [email protected]. Web-page: http://hagi.is.s.u-tokyo.ac.jp/johnrose/

The main steps of his formal education are as follows:

Undergraduate (Rhodes College, Memphis, Tennessee)
Dates of Attendance: 09/1982-05/1987
Degree: Bachelor of Arts, Physics (conferred 05/1990)
Honors: Rhodes Trustee Scholarship/Lowenstein Scholarship

Graduate, Masters (Memphis State University, Memphis, Tennessee)
Dates of Attendance: 01/1988-08/1990
Degree: Master of Science, Physics (conferred 08/1990)
Honors: NASA, L.A.R.S. Scholarship; A.T.I. Fellowship
Thesis: A Study on the Photoproduction of Psi-prime Mesons

Graduate, Doctoral (The University of Memphis, Memphis, Tennessee)
Dates of Attendance: 09/1993-12/1999
Degree: Doctor of Philosophy, Electrical Engineering (conferred 12/1999)
Honors: Herff Internship, 1997-1999; Eta Kappa Nu Honors Society, 1996
Dissertation: The Fidelity of DNA Computation

His postdoctoral and professional experience includes the following positions:
Postdoctoral Research Associate (University of Tokyo, Information Science Dept.)
Dates of Employment: 01/2000-03/2001
Research Focus: Physical Models of DNA-based Computation

JSPS Postdoctoral Research Fellow (Japan Society for the Promotion of Science)
Host Institution: University of Tokyo, Dept. of Computer Science
Dates of Award Tenure: 04/2001-01/2002
Research Focus: DNA-based Computing

Assistant Professor (University of Tokyo, Dept. of Computer Science, U.P.B.S.B.)
Dates of Employment: 04/2002-present
Research Focus: Physical models of hybridizing DNA systems; application to biotechnology (DNA microarrays, DNA computing-based protein evolution)
Chance Discovery with Emergence of Future Scenarios

Yukio Ohsawa (1, 2, 3)
(1) University of Tsukuba, (2) The University of Tokyo, and (3) The Chance Discovery Consortium
Office: GSSM, University of Tsukuba, 3-29-1 Otsuka, Bunkyo-ku, Tokyo 112-0012, Japan
Fax: +81-3-3942-6829
[email protected]
http://www.gssm.otsuka.tsukuba.ac.jp/staff/owawa
A “chance” is an event or a situation significant for making a decision in a complex environment. Since we organized a session on Chance Discovery at KES 2000, the basic theories have attracted an interdisciplinary community of researchers from philosophy, sociology, artificial intelligence, finance, complex systems, medical science, etc. Even stronger reactions from companies led to organizing the Chance Discovery Consortium in Japan, which has achieved substantial business results. In this talk, the methods of chance discovery are presented. In summary, visual data mining methods developed for chance discovery aid users’ individual thoughts and users’ group discussions about scenarios in the future. This process is called the Double Helix, where humans and computers cooperatively make a spiral deepening of their concerns with chances. In this process, valuable scenarios emerge with users’ awareness of chances, like creatures emerging from chromosomes crossing at crossover points. Emerging scenarios motivate the user to work in the real world to try actions, and the new data acquired from the real world accelerates the process. Chance discovery, in other words, is the child, and is also the parent, of scenario emergence. Participants of KES 2004 interested in human-human, human-environment, and human-machine interactions will find how all these kinds of interactions are integrated to make real benefits.
References
1. Ohsawa Y, McBurney P (eds) (2003) Chance Discovery. Advanced Information Processing Series, Springer-Verlag. ISBN 3-540-00549-8
2. Ohsawa Y (2002) KeyGraph as Risk Explorer from Earthquake Sequence. J of Contingencies and Crisis Management 10(3): 119-128
3. Ohsawa Y, Fukuda H (2002) Chance Discovery by Stimulated Group of People - An Application to Understanding Rare Consumption of Food. J of Contingencies and Crisis Management 10(3): 129-138
4. Ohsawa Y (2002) Chance Discoveries for Making Decisions in Complex Real World. J New Generation Computing 20(2): 143-163
Biography
Bachelor of Engineering (1990), Dept. of Electronic Engineering, Faculty of Engineering, University of Tokyo. Thesis: Morpheme Analysis of Natural Language Sentences Including Unknown Words. Supervisor: Prof. Hiroya Fujisaki.
Master of Engineering (1992), Graduate School of Engineering, University of Tokyo. Thesis: Discovery of a New Stationary Solution of Femto-Second Optical Pulse Propagation in Optical Fiber (the solution is named the Super Soliton). Supervisor: Prof. Yoichi Fujii.
Doctor of Engineering (1995). Thesis: High-Speed Abduction. The method, Networked Bubble Propagation, achieves polynomial-time approximate computation for abduction, although the problem is NP-complete.
Research Associate (1995-1999), Osaka University.
Current positions: Graduate School of Business Sciences, University of Tsukuba (1999-); Researcher, Japan Science and Technology Corp. (2000-); Visiting Researcher, AIR Intelligent Robotics Lab (2003-); Member of the DISCUS Project, University of Illinois (2003-).
Brain-Inspired SOR Network and Its Application to Trailer Truck Back-up Control
Takanori Koga and Takeshi Yamakawa
Kyushu Institute of Technology, Graduate School of Life Science and Systems Engineering, Japan
[email protected]
The Self-Organizing Map (SOM) was presented as a model of the cortex by Prof. T. Kohonen in 1982; after unsupervised learning it facilitates vector quantization, topological mapping, and visualization of similarities. It can therefore be used for pattern classification based on the stochastic features of the input data, and its significant utility has been demonstrated in more than 6,000 papers so far. However, it cannot capture an input-output relationship, which would be very useful for industrial applications. In this keynote speech, the Self-Organizing Relationship (SOR) Network is proposed, in which the input-output relationship is established in a self-organizing manner by unsupervised learning, so that input data produces output data and vice versa. This unsupervised learning of the SOR network is effectively driven by desirable data with positive evaluations and undesirable data with negative evaluations. The evaluation is given either subjectively or objectively; an example of the former case is image enhancement, and of the latter, self-organized control systems. The back-up control of a trailer truck is very difficult because of its mechanism. The back-up control is successfully achieved by the SOR network together with human experts' common sense, which produces the evaluations of input-output data necessary for learning.
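As an illustration only (a minimal sketch under stated assumptions, not the authors' algorithm), a SOR-style update can be caricatured as a SOM whose neighborhood update is signed by the evaluation, so that desirable (input, output) samples attract the winning unit while undesirable samples repel it; all names and parameters here are invented for the example:

import numpy as np

def train_sor(data, evals, n_units=20, dim_in=2, dim_out=1,
              epochs=50, lr=0.3, sigma=2.0, seed=0):
    # Each unit stores a weight vector over the joint (input, output) space.
    # evals[i] > 0 marks a desirable sample, evals[i] < 0 an undesirable one.
    rng = np.random.default_rng(seed)
    dim = dim_in + dim_out
    w = rng.uniform(-1.0, 1.0, size=(n_units, dim))  # unit weight vectors
    pos = np.arange(n_units)                         # 1-D lattice positions
    for _ in range(epochs):
        for z, e in zip(data, evals):                # z = concat(input, output)
            winner = np.argmin(np.linalg.norm(w - z, axis=1))
            h = np.exp(-((pos - winner) ** 2) / (2 * sigma ** 2))
            # Signed update: attract toward desirable data, repel from undesirable.
            w += (lr * e * h)[:, None] * (z - w)
    return w

def recall_output(w, x, dim_in=2):
    # Given an input x, read off the output part of the closest unit.
    winner = np.argmin(np.linalg.norm(w[:, :dim_in] - x, axis=1))
    return w[winner, dim_in:]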
List of Selected Publications
Books
Yamakawa T, Matsumoto G (eds) (1999) Methodologies for the Conception, Design and Application of Soft Computing (2-volume set). ISBN 9810236328
Yamakawa T, Matsumoto G (eds) (1996) Methodologies for the Conception, Design, and Application of Intelligent Systems: Proceedings of the 4th International Conference on Soft Computing. International Fuzzy Systems Association, Iizuka, Fukuoka, Japan. ISBN 9810229305
Gupta M. M, Yamakawa T (eds) (1988) Fuzzy Logic in Knowledge-Based Systems, Decision and Control. ISBN 0444704507
Gupta M. M, Yamakawa T (eds) (1988) Fuzzy Computing: Theory, Hardware, and Applications. ISBN 0444704493
Papers
Yamakawa T, Horio K (2002) Modeling of Nonlinear Systems by Employing Self-Organization and Evaluation - SOR Network. In: AFSS 2002: 204-213
Horio K, Yamakawa T (2001) Feedback Self-Organizing Map and its Application to Spatio-Temporal Pattern Classification. International Journal of Computational Intelligence and Applications 1(1): 1-18
Yamakawa T (1998) A Novel Nonlinear Synapse Neuron Model Guaranteeing a Global Minimum - Wavelet Neuron. ISMVL 1998: 335-
Uchino E, Nakamura S, Yamakawa T (1997) Nonlinear Modeling and Filtering by RBF Network with Application to Noisy Speech Signal. Information Sciences 101(3-4): 177-185
Yamakawa T, Uchino E, Takayama M (1997) An Approach to Designing the Fuzzy IF-THEN Rules for Fuzzy-Controlled Static Var Compensator (FCSVC). Information Sciences 101(3-4): 249-260
Biography
Prof. Takeshi Yamakawa received the B.Eng. degree in electronics engineering in 1969 from the Kyushu Institute of Technology, Tobata, and the M.Eng. degree in electronics engineering in 1971 from Tohoku University, both in Japan. He received the Ph.D. degree for his studies on electrochemical devices in 1974 from Tohoku University, Japan. From 1974 to 1977, he engaged in the development of new electrochemical devices as a Research Assistant at Tohoku University. From 1977 to 1981 he served as a Research Assistant in electrical engineering and computer science at Kumamoto University, Japan. From 1981 to 1989 he was an Associate Professor at Kumamoto University. He joined the faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology (KIT), Iizuka, Japan, and received a full professorship in April 1989. Prof. Yamakawa established a national foundation, the Fuzzy Logic Systems Institute (FLSI), in Japan in 1990 to promote international collaboration on soft computing and the dissemination of its research results. He is now the chairman of FLSI and a professor of Computer Science and Systems Engineering at the Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology (KIT), Japan. His main research interests lie in the hardware implementation of fuzzy systems, fuzzy neural networks, chaotic systems, and self-organizing maps. He holds 11 patents in the U.S.A., 4 in Europe, 1 in Australia, 1 in Taiwan and 1 in Canada, and has applied for more than 75 patents in Japan. He is currently working as the project leader of the Center of Excellence entitled “World of Brain Computing Interwoven out of Animals and Robots”.
He serves as an editorial board member and regional editor of 10 international professional journals. Prof. Yamakawa has contributed to more than 30 international conferences as an organizer or a member of the organizing/program committee. He organizes the International Conference on Soft Computing (the IIZUKA Conference) every two years in Iizuka city, Japan. He is a Senior Member of the IEEE. Prof. Yamakawa practices Karate (a traditional Japanese martial art) and holds a black belt (5th Dan). He also enjoys swimming, unicycling, and horse riding, as well as the Shakuhachi and Sangen, traditional Japanese musical instruments.
Dual Stream Artificial Neural Networks
Colin Fyfe
Applied Computational Intelligence Research Unit, The University of Paisley, Scotland, United Kingdom
In this paper, we review the work of four PhD theses undertaken at the University of Paisley, by Dr Pei Ling Lai [4], Dr ZhenKun Gou [1], Dr Jos Koetsier [3] and Dr Ying Han [2]. Each of these theses examined the problem of simultaneously extracting information from two data streams which have an underlying common cause. An example is the work of Dr Lai, who began by trying to model Canonical Correlation Analysis (CCA), which finds the linear combinations of two data sets that show the greatest correlation under the constraint that the variance of each output is 1. Thus, if $x_1$ and $x_2$ are related inputs, we find $w_1$ and $w_2$ so that the expected value of $y_1 y_2$ is maximal, where $y_1 = w_1^T x_1$ and $y_2 = w_2^T x_2$. If we let $\lambda_1$ and $\lambda_2$ be Lagrange multipliers enforcing the unit-variance constraints, the learning rules for $w_1$ and $w_2$ take the form $\Delta w_1 = \eta\, x_1 (y_2 - \lambda_1 y_1)$ and $\Delta w_2 = \eta\, x_2 (y_1 - \lambda_2 y_2)$,
i.e., a mixture of Hebbian and anti-Hebbian learning. Each of the four theses used a somewhat different starting point, yet the resulting artificial neural networks often exhibit a remarkable likeness: they are often a combination of Hebbian learning between an input and the opposite output and anti-Hebbian learning between the input and the corresponding output. Of course, since CCA exists as a standard statistical technique, our artificial neural networks need to go beyond linear correlations in order to offer functionality beyond that available from CCA. We do this by considering nonlinear correlations, which we define in a number of different ways. The biological rationale for investigating such methods is that organisms must integrate information from more than one sensory stream in order to make sense of their environment. It is thus reassuring that Hebbian learning is capable of finding filters which integrate such information. As computer scientists, we are also interested in creating algorithms which perform difficult engineering tasks. Thus we discuss the application of these techniques to: blind source separation (extraction of one signal from a noisy mixture of signals); forecasting financial time series; and image registration.
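As a hedged sketch (not code from the theses; the adaptive-multiplier handling of the constraint is one standard choice), the rule above can be simulated directly:

import numpy as np

def neural_cca(x1, x2, eta=1e-3, eta0=1e-3, epochs=100, seed=0):
    # x1: (n_samples, d1) and x2: (n_samples, d2) are paired data streams.
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.1, size=x1.shape[1])
    w2 = rng.normal(scale=0.1, size=x2.shape[1])
    lam1 = lam2 = 1.0
    for _ in range(epochs):
        for a, b in zip(x1, x2):
            y1, y2 = w1 @ a, w2 @ b
            # Hebbian with the opposite output, anti-Hebbian with the own output.
            w1 += eta * a * (y2 - lam1 * y1)
            w2 += eta * b * (y1 - lam2 * y2)
            # Multiplier updates push each output variance toward 1.
            lam1 += eta0 * (y1 * y1 - 1.0)
            lam2 += eta0 * (y2 * y2 - 1.0)
    return w1, w2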
References
[1] Gou, Z.K. Canonical Correlation Analysis and Artificial Neural Networks, PhD Thesis, University of Paisley, 2003.
[2] Han, Y. Analysing Time Series using Artificial Neural Networks, PhD Thesis, University of Paisley, 2004.
[3] Koetsier, J. Context Assisted Learning in Artificial Neural Networks, PhD Thesis, University of Paisley, 2003.
[4] Lai, P.L. Neural Implementations of Canonical Correlation Analysis, PhD Thesis, University of Paisley, 2004.
Improving the Quality of Semantic Retrieval in DNA-Based Memories with Learning
Andrew Neel, Max Garzon, and Phani Penumatsa
Computer Science, University of Memphis, Memphis, TN 38153-3240
{aneel,mgarzon}@memphis.edu
Abstract. At least three types of associative memories based on DNA affinity have been proposed. Previously, we quantified the quality of retrieval of genomic information in simulation by comparison to state-of-the-art symbolic methods such as LSA (Latent Semantic Analysis). Retrieval quality is poor when performed without a proper compaction procedure. Here, we use a different compaction procedure, based on learning, to improve the ability of DNA-based memories to store abiotic data. We evaluate and compare the quality of the retrieval of semantic information. Performance is much closer to that of LSA, according to human expert ratings, and slightly better than the previous method based on a summarization procedure. These results are expected to improve, and to scale up feasibly, with actual DNA molecules in real test tubes.
1 Introduction
The use of DNA molecules for computing applications was suggested by Adleman [1] and has led to the now well-established field of biomolecular computing (BMC). Several applications of these methodologies are currently the subject of much research. A promising application is the creation of memories that can store very large data sets in minuscule spaces [2,3,7,12,14,15]. The enormous potential for storage capacity (over a million-fold compared to conventional electronic media), combined with advances in recombinant DNA over the last few decades, makes this approach appealing. Other research has estimated the capacity of large memories and determined the optimal concentration for efficient retrieval from memories without compaction [3]. Recently, interest has focused on encoding techniques to translate abiotic data into strands of oligonucleotides [7]. With improved capabilities for encoding and storing large-capacity memories, there is also a greater need for an equally viable protocol for retrieving relevant information from a memory. As the level of noise increases, the usefulness of the retrieved data and the time efficiency of the retrieval process decrease [6,7]. Therefore, a new technique for querying large memories and, consequently, improving the quality of the results is required. One method, the so-called memory P, was proposed based on a summarization procedure and evaluated in [13]. In this paper, we evaluate a new technique for compacting large memories of abiotic data, this time based on a learning procedure suggested by [3]. We begin by providing a complete description of the experimental design in Section 2, including
the two techniques (compaction by summarization and compaction by extension), and discuss the advantages of each. In Section 3, we compare the results to LSA, a state-of-the-art procedure for information retrieval in modern computing, and to a summarization technique (discussed below).
2 Experimental Design
The experimental evaluation data used in this paper was obtained from simulations in a virtual test tube of Garzon et al. [6,8,9], called Edna, an alternative to performing DNA experiments in vitro. The simulator works by moving the strands contained within the test tube one step per iteration. One iteration represents on the order of one millisecond of real time, which is roughly the time required for two DNA molecules to settle a hybridization event. Hybridization between data structures representing DNA strands is modeled by the so-called h-measure of how likely two given strands are to hybridize [7]. Recent work has shown that these simulations produce results that closely resemble, and at times are indistinguishable from, the protocols they simulate in wet tubes [7]. Furthermore, the ability to accurately simulate DNA reactions has recently been demonstrated by reproducing the experiments performed by Adleman [1] with random graphs of up to 15 vertices. The simulation matched Adleman's experiment well, with no false positives and under 0.4% false negatives. Similar results are obtained in simulations of more sophisticated protocols, such as PCR selection for word design [3]. Virtual test tubes have matched very well the results obtained in vitro by means of more recent and elaborate protocols, such as the selection protocol for DNA library (memory) design of [5]. Therefore, there is good evidence that, despite the lack of physical realism and microscopic granularity, virtual test tubes provide reliable estimates of the events in wet tubes ([8,9] contain a more detailed discussion). In this section, we concentrate on describing the elements and results of the comparison with standard symbolic methods for storage of abiotic data and retrieval of semantic information from these memories in a form useful to humans.
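For intuition, one plausible formalization of such a measure (a hedged sketch; the published h-distance in [7] is defined more carefully) scores every shifted alignment of one strand against the Watson-Crick complement of the other and keeps the best:

def wc_complement(s):
    # Watson-Crick complement, read in reverse for antiparallel pairing.
    comp = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
    return ''.join(comp[c] for c in reversed(s))

def h_distance(x, y):
    # Minimum over shifts of (mismatches in the overlap + unpaired bases of
    # the shorter strand); 0 would indicate a perfect hybridization site.
    yc = wc_complement(y)
    n, m = len(x), len(yc)
    best = min(n, m)
    for shift in range(-(m - 1), n):
        mismatches, overlap = 0, 0
        for i in range(m):
            j = shift + i
            if 0 <= j < n:
                overlap += 1
                if x[j] != yc[i]:
                    mismatches += 1
        if overlap > 0:
            best = min(best, mismatches + (min(n, m) - overlap))
    return best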
2.1 Latent Semantic Analysis
Latent Semantic Analysis (LSA) has been demonstrated [17] to be among the most effective forms of semantic retrieval from text corpora on conventional silicon-based computers. LSA captures the relative frequency of each word within the corpus (memory) to give “context” or “meaning” to each word. A document (e.g., a paragraph) is represented by a vector in a high-dimensional Euclidean space, which is compressed into a much smaller space spanned by the most significant eigendirections (dimension about 300). The compacted space can be used to determine the semantic similarity of two documents by measuring the angle (say, by the cosine value) between their projections [13,17]. This form of associative retrieval has been shown to be proficient enough to perform, on the (multiple-choice) TOEFL exam of English as a second language, at the level of competence of an average foreign graduate
student [11,16]. For this reason, our standard objective measure for evaluating the experimental results on text libraries will be LSA benchmarks. The quality of LSA-based semantic retrieval has also been evaluated in a tutoring setting by computing correlation coefficients with human expert assessments (more below). Further details about LSA can be found in [17]. LSA is not perfect, however, since it ignores structural features of language (e.g., word order and grammatical structure). DNA memories, on the other hand, store data in strands and more complex structures that can be retrieved associatively through hybridization [2], in such a way that more text structure may be captured. The question thus arises whether DNA-based retrieval using hybridization (which naturally takes into account word frequency and structure) might perform better than LSA-based techniques.
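The LSA pipeline itself can be sketched in a few lines (an illustration only, not the evaluation code used here; the truncation to about 300 dimensions follows the description above):

import numpy as np

def lsa_similarity(term_doc, q1, q2, k=300):
    # term_doc: (n_terms, n_docs) count matrix; q1, q2: (n_terms,) vectors.
    u, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    k = min(k, len(s))
    proj = u[:, :k].T                  # map term space into k latent dimensions
    a, b = proj @ q1, proj @ q2
    # Cosine of the angle between the two projected documents.
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)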
2.2 Text Corpus Encoding and Objective Measures
The corpus chosen was a selected sub-corpus (about 4,000 paragraphs) of the original corpus used in a prior LSA evaluation (which was, however, performed using LSA compaction of the original corpus). The corpus is semantically equivalent to a first-year undergraduate's knowledge of Newtonian physics, as given by a standard textbook for the class. The queries selected for evaluation were determined by a set of selected questions on the topic of qualitative physics. Ideal answers had already been determined by a panel of experts. These questions were put to students participating in experiments conducted to evaluate the efficiency of a computerized tutor for helping students learn basic physics [18]. The computerized tutor rates a student's answer (hereafter also referred to as a query [7]) by comparing it to an ontology consisting of a predetermined set of good and bad answers. In an evaluation of the tutor's ability to assess semantic similarity, cosines of the LSA comparison between the students' answers and the ideal answers were compared to corresponding match values determined by another panel of expert physicists for the same question/student-answer pairs. The effectiveness of LSA was gauged by computing the correlation between these LSA similarity indices and the experts' similarities across all question/student-answer pairs. As shown in the first column of Table 1, the corresponding value turns out to be about 0.40, and constitutes a benchmark for the performance of symbolic methods on semantic retrieval. In this paper, we analogously gauge the quality of DNA-based semantic retrieval by comparing it to LSA's benchmark. The corpus was mapped into strands of oligonucleotides by first extracting the complete set of words from the entire corpus, including all documents. Low-content parts of speech, such as articles and one- or two-character words, were removed. Markup data, such as punctuation and sentence labels, were also removed. Each remaining word was coded into a unique 8-mer DNA sequence. A hashing scheme was then used to translate arbitrary documents (such as the ideal and students' answers) into strings representing strands of DNA molecules. This process was applied to the memories supplied to the compaction protocol and to the set of queries used to probe both the memory (compacted corpus) and the whole corpus below. A form of the original sentences can be reconstructed by replacing each 8-mer word with the appropriate key from the hash.
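A minimal sketch of such an encoding and decoding hash (illustrative only: the paper's actual 8-mers were designed to avoid crosshybridization, whereas this toy version simply enumerates sequences):

import itertools
import re

def build_codebook(unique_words):
    # Assign each retained word a unique 8-mer over {A, C, G, T}.
    eightmers = (''.join(p) for p in itertools.product('ACGT', repeat=8))
    return {w: next(eightmers) for w in unique_words}

def encode_document(text, codebook):
    # Drop punctuation and one- or two-character words, then concatenate codes.
    words = [w.lower() for w in re.findall(r'[A-Za-z]+', text) if len(w) > 2]
    return ''.join(codebook[w] for w in words if w in codebook)

def decode_strand(strand, codebook):
    # Reconstruct a readable form by inverting the hash, 8 bases at a time.
    inv = {v: k for k, v in codebook.items()}
    return ' '.join(inv.get(strand[i:i + 8], '?') for i in range(0, len(strand), 8))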
2.3 The CDW Memory Protocol
In preliminary experiments with genomic DNA, it was established that DNA hybridization per se is not enough to perform semantic retrieval [3]. In order to test the same with abiotic data, we used two forms of memory compaction previously applied to genomic data [3]. Three protocols were evaluated. The first protocol is the most natural form of retrieval, namely retrieval from the entire library of data. Due to the large capacity of DNA memories [2], this protocol (hereafter called memory B) may retrieve too much irrelevant information. Our second and third protocols address this issue. The second protocol, P (retrieval by summarization based on PCR selection [4]), and the third protocol, CDW (retrieval by learning [3]), are tools that improve the efficiency of retrieval by hiding or removing portions of the memory. Both protocols are described in [2,3,16] and have been shown to retain significant information when the input data is a memory of naturally occurring genes [3]. The difference between the naturally occurring genes used in [12] and the symbolic data used in this paper is discussed below. Memory P was described in detail and shown to work very well with abiotic data in [16], but an analysis of memory CDW required further work. Retrieval by learning, a form of compaction called memory CDW [3,4,7], uses tags (analogous to “concepts”) that have been extended with random tails about 150 oligonucleotides in length. These strands are constructed independently of the corpus and in their initial state have no relationship to it. The strands of the corpus are inserted into the tube and allowed to hybridize to the tails of the memory strands, extending each tag by contributing new words. The hybridization process has been modified slightly from [4,13] in order to maintain the integrity of the extensions by never partially hybridizing words. After hybridization, the non-hybridized single-stranded portions of the memory are pruned away, taking care not to partially erase words. The tagged strands are melted in order for the process to continue. After several rounds, the extended tails are retrieved from the tube and constitute the CDW memory. At this point, some or all of the related information on a given topic has been captured in the extensions of the tags. Retrieval occurs by homology of probes to memory strands using a hybridization criterion (e.g., Gibbs energy in wet test tubes, or its combinatorial approximation, the h-distance [7], used below).
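In the abstract, the round structure of this compaction can be caricatured as follows (a deliberately toy sketch, far removed from the wet-lab protocol; affinity stands in for any hybridization-likeness score, e.g., the negated h-distance from the earlier sketch):

def cdw_compact(corpus_strands, tags, affinity, rounds=5, tail_len=40):
    # memory maps each tag to its growing extension; in each round, every tag
    # recruits the corpus strand with the highest affinity to its current
    # tail and is extended by it (word boundaries are respected by choosing
    # tail_len as a multiple of the 8-base word length).
    memory = {t: t for t in tags}
    for _ in range(rounds):
        for tag in memory:
            tail = memory[tag][-tail_len:]
            best = max(corpus_strands, key=lambda s: affinity(tail, s))
            memory[tag] += best
    return list(memory.values())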
2.4 Experiments Two sets of experiments using both compaction techniques described above were performed. The goal was to evaluate the consistency and soundness of the protocol and to evaluate the semantic quality of the retrievals. In the first set, the text corpus was used as input to the learning protocol in order to obtain a memory-CDW described above. After compaction, the quality of the memory was evaluated using a list of queries (students’ answers) previously used to evaluate LSA. For the second set, the text corpus was replaced by the set of ideal answers as a training set. Each experiment was performed about 10 times since hybridizations and their simulations are stochastic events. To verify the robustness of the results with respect to the encoding set, we used different encodings of the entire corpus, and hence of the ideal answers, in the various runs.
3 Retrieval Soundness and Semantic Quality
In the original experiments with LSA [17], a relatively high correlation was observed between a human expert's evaluation of the queries (i.e., student answers) and LSA's evaluation of those same answers with respect to predetermined ideal answers. LSA's choices correlated at about 0.40 with each of four human expert evaluations. This value is used as an objective standard of the semantic quality of symbolic retrieval in our assessment of the semantic quality of DNA-based retrievals. In associative memory applications, the h-measure can play the role of the LSA index in answering a given query. Thus, the h-measure was calculated between each query (student answer in DNA form) and the ideal answers. The value of this comparison is really a sine, since matches improve as it grows, so it was necessary to calculate the corresponding cosine from this value. Afterwards, the queries were substituted with the best matches from our compacted corpus and the cosines re-calculated.
Table 1 shows the average results of both experiments. The first column shows the correlation of human expert evaluations with LSA, as mentioned above. The second column shows the best results achieved using memory P to summarize the corpus, as presented in [13]. The third column shows the correlation between queries obtained using DNA memories without compaction and each of the four human expert evaluations of the student answers (rows); the result is a negative correlation with all four human judges. When the corpus is compacted by CDW learning, the fourth column shows the analogous correlations between the queries and the best matches from the CDW memory. The fifth column shows a major improvement, a 20 to 30% increase, which demonstrates the effectiveness of our compaction process. The sixth column shows a very stable error margin for this protocol (about 2% standard deviation).
The average results of the second set of experiments are shown in the next three columns. Here, there is a further average improvement of 2-4% in the semantic quality of retrieval from compaction over that of the entire corpus.
4 Conclusions
A new protocol has been proposed for the compaction of text corpora that improves the ability of the corresponding memory to perform semantic retrieval, relative to a previous method using another type of DNA memory. The new method uses learning to extract information from a subset of the full corpus by PCR extension. Further, we have shown that even minor enhancements to our compaction process provide sizable improvements in the quality of semantic retrieval. Although our protocol has not been able to surpass the best symbolic methods (best represented by LSA) in the semantic quality of retrieval, it has fallen short by only 10-20%. These results also make evident other important factors in the quality of semantic retrieval. Our protocol is highly dependent on the hybridization criterion (here, the h-distance) that is used in both learning and retrieval. Another factor is the quality of the encoding from text to DNA: in a previous set of experiments, not reported here, with a fixed encoding of lesser quality, the results were likewise of lesser quality. Finally, there is the lingering question of whether DNA memories and semantic retrieval can eventually become better than symbolic methods, in particular those based on LSA. The first observation is that the hybridization criterion, the h-distance, assumes that DNA strands are stiff and cannot form bulges and loops, for example. Better criteria are the Gibbs energy or approximations thereof [5], but they come at increased computational expense. The ultimate criterion is, of course, actual DNA strands in a wet test tube memory. These different choices will impact not only the training of the memory but also the retrieval process, in a critical way. It is therefore quite plausible that much better results will be obtained with actual DNA for this application, even when the approach is applied in simulation on Edna. At any rate, regardless of whether DNA can be made to outperform LSA, it is apparent that the enormous capacity of DNA for volume compaction and its thermodynamic efficiency will make it feasible where conventional techniques may fail, on terabyte- and petabyte-sized corpora. Likewise, further research is required to determine how well these techniques scale to domains wider than the narrow one (qualitative physics) used here.
References
[1] Adleman L. M: Molecular Computation of Solutions to Combinatorial Problems. Science 266 (1994) 1021-1024
[2] Baum E, Building an Associative Memory Vastly Larger Than the Brain. Science 268 (1995), 583-585.
[3] Chen J, Deaton R, Wang Y. Z. A DNA-based Memory with in vitro Learning and Associative Recall. Proc. of DNA-based computing DNA9 2003. Springer-Verlag Lecture Notes in Computer Science 2943 (2004), 145-156.
[4] Deaton R, Chen J, Bi H, Garzon M, Rubin H, Wood D. H. A PCR-Based Protocol for In Vitro Selection of Non-cross hybridizing Oligonucleotides. In [10], 105-114.
[5] Deaton R. J, Chen J, Bi H, Rose J. A: A Software Tool for Generating Non-cross hybridizing Libraries of DNA Oligonucleotides. In [10], 211-220.
[6] Garzon M, Blain D, Bobba K, Neel A, West M, Self-Assembly of DNA-like Structures in silico. Journal of Genetic Programming and Evolvable Machines 4 (2003), 185-200.
[7] Garzon M. H, Neel A, Bobba K, Efficiency and Reliability of Semantic Retrieval in DNA-based Memories. In Proc. of DNA-based computing DNA9 2003, Springer-Verlag Lecture Notes in Computer Science 2943 (2004), 157-169.
[8] Garzon M, Biomolecular Computing in silico. Bull. of the European Assoc. for Theoretical Computer Science EATCS 79 (2003), 129-145.
[9] Garzon M, Oehmen C: Biomolecular Computation on Virtual Test Tubes. In: Proc. DNA7, 2001. Springer-Verlag Lecture Notes in Computer Science 2340 (2002), 117-128.
[10] Hagiya M, Ohuchi A (eds.) Proceedings of the 8th Int. Meeting on DNA Based Computers, Hokkaido University, 2002. Springer-Verlag Lecture Notes in Computer Science 2568 (2003).
[11] Landauer T. K, Dumais S. T: A Solution to Plato's Problem: The Latent Semantic Analysis Theory of the Acquisition, Induction, and Representation of Knowledge. Psychological Review 104 (1997), 211-240.
[12] Neel A, Garzon M. Efficiency and Reliability of Genomic Information Storage and Retrieval in DNA-based Memories with Compaction. Congress for Evolutionary Computation CEC 2003, 2733-2739.
[13] Neel A, Garzon M, Penumatsa P. Semantic Retrieval in DNA-based Memories with Abiotic Data. Congress for Evolutionary Computation 2004, in press.
[14] Reif J. H, LaBean T. Computationally Inspired Biotechnologies: Improved DNA Synthesis and Associative Search Using Error-Correcting Codes and Vector Quantization. Proc. of the 6th International Workshop on DNA-Based Computers. Springer-Verlag Lecture Notes in Computer Science 2054, 145-172.
[15] Reif J. H, LaBean T, Pirrung M, Rana V. S, Guo B, Kingsford C, Wickham G. S. Experimental Construction of Very Large DNA Databases with Associative Search Capability. Proc. of DNA7, 2001. Springer-Verlag Lecture Notes in Computer Science 2340 (2002), 231-247.
[16] Test of English as a Foreign Language (TOEFL), Educational Testing Service, Princeton, New Jersey, http://www.ets.org/.
[17] Landauer T. K, Foltz P. W, Laham D: Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
[18] http://www.autotutor.org
Conceptual and Contextual DNA-Based Memory
Russell Deaton (1) and Junghuei Chen (2)
(1) Computer Science and Engineering, University of Arkansas, Fayetteville, AR, USA 72701. [email protected]
(2) Chemistry and Biochemistry, University of Delaware, Newark, DE, USA 19716. [email protected]
Abstract. DNA memories have the potential not only to store vast amounts of information with high density, but may also be able to process the stored information through laboratory protocols that match content and context. This might lead to knowledge mining applications on a massively parallel scale, and to a limited capability for intelligent processing of stored data to discover semantic information. In this paper, a design for such a DNA memory is presented.
1 Introduction
The original excitement around the idea of DNA computing was caused by its potential to solve computational problems through massive parallelism [1]. It appears, however, that current technology is not capable of the level of control of biomolecules that is required for large, complex computations [2]. Thus, other paradigms for abiotic information processing with DNA have gained focus. In this paper, an associative DNA memory is described which could store massive amounts of information. In addition, the architecture of the memory is designed to represent and search formal conceptual structures [3], and to exploit contextual information for semantic information processing. The DNA memory is a searchable database of information stored in the sequences of a collection of DNA molecules. It is also a DNA computer that can perform computations on the stored data by manipulating the contents of the test tube. There are four tasks that need investigation. The first task is to map records
onto DNA sequences, which are called coding strands (Section 3). The second task, which is called the DNA memory architecture, is to design methods for searching, retrieving, and manipulating the information stored in DNA (Section 2). In the third task, the memory can be queried to match new data with stored records. This involves protocols to process the query strands in vitro for retrieval of information based on context and content (Section 4). Fourth, the output of the DNA memory (output strands) is read and converted back into a readable format (Section 5). Ultimately, the goal is for the DNA memory not only to store and recall data, but also to actively manipulate its contents in order to make reasoned inferences about relationships among the data. Many data processing and mining tasks involve messy and error-prone data, and thus a modicum of machine intelligence is an advantage in recognizing relevant relationships and matches. For instance, customer data could include, among other things, business names, addresses, consumer names, and purchases. The problem of customer data integration (i.e., matching businesses with potential customers) is difficult because of errors, incomplete and misleading information, and the size and diversity of the data sources. A particular difficulty is the accurate matching of information in different records. For example, business names can include typographical errors, phonetic spellings, homonyms (words that are spelled the same, but have different meanings), synonyms (words with the same meaning, but different spellings), polysemy (words with multiple meanings), and different combinations of word breaks. Business names can be abbreviated or aliased. For example, records which identify a single business might be “WILLIAM MILLER PHARMACY DRUG,” “MILLR SUPER D DRUG STORE,” and “SUPER D EXPRESS DRUG.” The challenge is to recognize these separate records as identifying the same business. Frequently, human intelligence is able to deal with these difficulties by using context and experience, or in other words, the meaning (semantics) of the information. Context is “the part of a text or statement that surrounds a particular word or passage and determines its meaning [4].” For example, the meaning of “bank” is evident in “The deposit was made at the bank,” and “The river overflowed its bank.” Likewise, the three records above can be connected as referring to the same business because they have the same context, i.e., certain words are shared among the records. The ultimate challenge is to use context to match records in vitro in a DNA memory. The intent is similar to the machine intelligence technique latent semantic analysis (LSA) [5].

Fig. 1. Representation of objects and attributes of Table 1 in DNA molecules
2 DNA Memory Architecture
The proposed DNA memory architecture is modeled upon formal contexts and concepts [3]:
Definition 1. A formal context $(O, A, I)$ consists of a set of objects $O$, a set of attributes $A$, and an incidence relation $I \subseteq O \times A$ (denoted $oIa$), which means that object $o$ has attribute $a$.
Example contexts from Table 1 are O1IA6 (Bream can move around) and O2IA2 (Frogs live in water). In the DNA memory, both objects and attributes are represented with DNA sequences (Figure 1). The sequence for the object becomes a label for a molecular record composed of attribute sequences.
Definition 2. A formal concept of the context $(O, A, I)$ is a pair $(B, C)$ with $B \subseteq O$ and $C \subseteq A$ such that $B' = C$ and $C' = B$, where $B'$ denotes the set of attributes common to all objects in $B$, and $C'$ the set of objects possessing all attributes in $C$.
A concept from Table 1 is {{O3, O4, O5}, {A1, A4, A5}}. When ordered by set inclusion, the relation $\le$ provides a hierarchical order on the concepts: $(B_1, C_1) \le (B_2, C_2)$ provided that $B_1 \subseteq B_2$ (equivalently, $C_2 \subseteq C_1$). The set of all concepts is a complete lattice under this order [6]. This structure allows the concept lattice to be explored by implementing primitive operations on sets of DNA oligonucleotides, in order to search, compute, and mine information stored in DNA sequences. Thus, the goal is to represent a conceptual space in DNA that has a prescribed hierarchical and relational structure. This space is composed of objects that have certain attributes. An example from [3] is shown in Table 1. The intent is to establish a reasoning system about these objects. For instance, Breams and Frogs share attributes A1 (needs water to live), A2 (lives in water), A6 (can move around), and A7 (has limbs); Frogs, Reeds, and Maize have attribute A3 (lives on land); and Reed has A1, A2, A3, A4 (needs chlorophyll), and A5 (one seed leaf). The DNA representation of object O1 is shown in Figure 1. Ideally, the entire conceptual space would be created in DNA and explored through in vitro operations. To accomplish this, we think of a logic on the DNA space that corresponds to navigation of the concept lattice. A DNA memory is a collection of DNA words M, which can be divided into two subsets representing objects O and attributes A. The DNA space is the power set of M. To navigate the space, set union and set intersection must be implemented in laboratory protocols. Under set union and intersection, the power set of DNA words is a complete lattice [6]. If an operation corresponding to set complement is implemented, with 1 represented by the universal set of all sequences and 0 by the empty set, then a Boolean algebra is present on the sets of sequences. Set union corresponds to logical OR, set intersection to logical AND, and set complement to NOT. The implementation of set union is relatively straightforward, and involves simply mixing two sets of molecules. The implementation of set intersection is more complicated, but involves several key components. The idea is to determine
sequence commonality among the sets of DNA words through hybridization, and then to separate duplexes that have components from both sets under comparison. For set complementation, the set to be complemented, $S$, is separated from the universal set of sequences $M$ to form $M \setminus S$. Both set intersection and complement could be done with column separation or biotin-avidin magnetic bead extraction.
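In silico, the derivation operators and the query logic above reduce to ordinary set operations; a minimal sketch (with a toy incidence relation, not the full Table 1):

def common_attributes(objs, I):
    # B': attributes shared by every object in objs; I is the set of
    # (object, attribute) incidence pairs.
    attrs = {a for (_, a) in I}
    return {a for a in attrs if all((o, a) in I for o in objs)}

def common_objects(attrs, I):
    # C': objects possessing every attribute in attrs.
    objs = {o for (o, _) in I}
    return {o for o in objs if all((o, a) in I for a in attrs)}

# Toy incidence relation in the spirit of Table 1 (labels illustrative only).
I = {('O1', 'A6'), ('O1', 'A7'), ('O2', 'A2'), ('O2', 'A6'), ('O2', 'A7'),
     ('O3', 'A1'), ('O3', 'A4'), ('O3', 'A5'), ('O4', 'A1'), ('O4', 'A4'),
     ('O4', 'A5'), ('O5', 'A1'), ('O5', 'A4'), ('O5', 'A5')}

B = {'O1', 'O2'}
C = common_attributes(B, I)                 # {'A6', 'A7'}
assert common_objects(C, I) == B            # so (B, C) is a formal concept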
3 Mapping Abiotic Data onto DNA Sequences
For large data sets, the cost of synthesizing the DNA strands would be prohibitive, even at pennies per nucleotide base. An alternative approach is to use cloning technology. The intent is to reduce the cost of synthesis by starting with random sequences. These random sequences would have known primer sequences on each end, similar to the selection protocol for non-crosshybridizing oligonucleotides [7]. This makes it possible to amplify the starting material. Also, the starting material could be processed with the selection protocol to remove crosshybridizing strands. In a year's worth of effort, 50,000 sequences could be isolated, from which the number of records that could be formed would be huge. Moreover, 500,000 sequences are eventually possible, which would be enough to represent every word in the Oxford English Dictionary.

Fig. 2. Sequences representing attributes and objects would be mapped to different colonies. The sequences would be ligated together to form complete records, producing in vitro formation of molecules corresponding to permutations and combinations of attributes in objects

The primer sequences will have an embedded restriction site so that the internal sequences can be inserted in a plasmid. These plasmids will be transformed into E. coli under conditions that promote one plasmid per cell, and then colonies grown. Each clone would on average incorporate just one sequence from the starting set. The sequences can then be extracted and, perhaps, sequenced to build a library of coding strands to which abiotic information can be mapped. It might be possible to assign data without the sequence being known (Figure 2). Sequencing the coding strands is a potential bottleneck. Some sequencing could be done for small-scale applications, but sequencing large numbers of coding strands slows scaling of the memory and adds cost. Thus, it might be more efficient and cost-effective to avoid sequencing the coding strands. In this
scenario, DNA would be indexed by the clone from which it was extracted. Terms would be assigned to DNA molecules from specific clones. For example, “MILLER” would be assigned to DNA from clone 1, “PHARMACY” to DNA from clone 2, and so on. This can be accomplished without knowing the specific sequences. Likewise, using cDNA arrays, coding strands from specific clones could be attached to a solid support for output without knowing the sequence. There is a potential for error in this approach from transformations that produce no plasmid uptake, multiple plasmid uptake, and colonies that have the same sequence, but optimization of the cloning process would minimize this [8]. In addition, an appropriate representation of concepts in DNA is required. Ideally, DNA molecules corresponding to the rows of Table 1 would be created, with DNA words labeling objects ligated to words representing attributes (Figure 2). For a business name application, the object would correspond to a record identifier, and the attributes to the terms in the business name. Using the restriction sites to hybridize input words should produce every possible combination and permutation of words representing the attributes of the object. This is important for capturing contextual information because, theoretically, all terms or attributes then occur in the context of (adjacent to) all others. Thus, the individual coding strands, which represent the terms in a record, are mixed in a test tube, their primers or restriction sites are allowed to hybridize, and ligation is done (Figure 2). This is similar in design to how Adleman [1] formed all possible paths in a graph to find the Hamiltonian path. In this application, however, the combinatorial power of DNA computing is being used to form all possible combinations and permutations of attributes in a given object.

Fig. 3. Queries can be done via column separation or biotin-avidin bead extraction. Query on objects of Table 1 produces concept {{O1, O2}, {A6, A7}}
4 Query of the Memory
The memory can be searched for matching of queries to the closest object, for categorization of objects according to shared attributes, and for formal concept generation. For matching of queries to objects (Figure 3), a molecular represen-
tation is formed of the query, composed of permutations and combinations of complements of the query terms. This is used to separate those objects that have the desired attributes. In Figure 3, a query on objects O1 and O2 extracts all attributes that these objects share and, in the process, forms a molecular representation of the formal concept {{O1, O2}, {A6, A7}}. Shared attributes mean that molecules share sequences, which can be used through affinity separation to sense similar content. The context of an attribute is the set of other attributes that a particular object shares. Thus, in the molecular representation of that object, attributes occur in the same context because their sequences are common to a given molecule. Term content and context are thereby translated to, and sensed through, sequence content and context in the DNA memory, and as a result the memory is content-addressable, or associative. For semantic processing, the idea is that sequences representing different attributes occur in the context of the same molecule. Thus, by appropriate query, meaning can be deduced, as shown in Figure 4, and, through the query process, records in which terms occur in similar contexts are matched. In set notation, the query in Figure 4 corresponds to two such set operations. The effect is similar to latent semantic analysis (LSA) [5], and has been explored in simulations [9]. There are other ways of doing the query matching, for instance with separation columns, which is similar to Sticker DNA Memories [10]. Moreover, these capabilities are achieved in vitro with the advantages of massive parallelism.

Fig. 4. Contextual information is captured by the presence of sequences in the same molecule. In this case, context is used to distinguish two meanings of the word bank
5 Output
Output in a readable format is accomplished by attaching the cloned coding sequences to an array. Thus, each spot would represent either an object or an attribute. Readout occurs directly, by sensing fluorescent tags attached to the memory strands used as probes.
6 Conclusion
To summarize the DNA memory: data is mapped to coding sequences that have been cloned from random starting material or, alternatively, to reduce hybridization errors, from sequences selected in vitro to be non-crosshybridizing [7]. This has the advantage that sequences do not have to be synthesized at great cost.
The coding strands are concatenated together in vitro to form molecules representing the possible permutations and combinations of attributes for a given object. These molecules become the memory strands, and by implementing simple lab protocols that represent set operations, logical inferences can be made from the in vitro data. Thus, the architecture of the memory is equivalent to formal contexts and concepts [3]. Sets of query strands are formed, annealed to memory strands, and used to extract records with similar contexts. Since sequences, which represent abiotic information, occur in a molecular context with other sequences, contextual information can be extracted, providing a rudimentary semantic processing capability. Readout is accomplished using DNA microarrays with coding strands as spots.
References
1. Adleman, L.M.: Molecular computation of solutions to combinatorial problems. Science 266 (1994) 1021-1024
2. Adleman, L.: DNA computing FAQ. http://www.usc.edu/dept/molecular-science/ (2004)
3. Ganter, B., Wille, R.: Formal Concept Analysis. Springer-Verlag, Berlin (1999)
4. Definition of context. http://www.dictionary.com (2003)
5. Deerwester, S., Dumais, S.T., Landauer, T.K., Furnas, G.W., Harshman, R.A.: Indexing by latent semantic analysis. Journal of the Society for Information Science 41 (1990) 391-407
6. Davey, B.A., Priestley, H.A.: Introduction to Lattices and Order. Cambridge University Press, Cambridge, UK (1990)
7. Deaton, R., Chen, J., Bi, H., Garzon, M., Rubin, H., Wood, D.H.: A PCR-based protocol for in vitro selection of non-crosshybridizing oligonucleotides. In Hagiya, M., Ohuchi, A., eds.: DNA Computing: 8th International Workshop on DNA-Based Computers, Hokkaido University, Sapporo, Japan, June 2002, Springer-Verlag (2003). Lecture Notes in Computer Science 2568
8. Sambrook, J., Fritsch, E.F., Maniatis, T.: Molecular Cloning: A Laboratory Manual. Second edn. Cold Spring Harbor Laboratory Press (1989)
9. Garzon, M., Bobba, K., Neel, A.: Efficiency and reliability of semantic retrieval in DNA-based memories. In Chen, J., Reif, J., eds.: DNA Computing: 9th International Workshop on DNA-Based Computers, University of Wisconsin-Madison, Madison, WI, June 2003, Springer-Verlag (2004) 157-169. Lecture Notes in Computer Science 2943
10. Roweis, S., Winfree, E., Burgoyne, R., Chelyapov, N.V., Rothemund, P.W.K., Adleman, L.M.: A sticker based model for DNA computation. In Landweber, L.F., Baum, E.B., eds.: DNA Based Computers II. Volume 44, DIMACS, American Mathematical Society, Providence, RI (1998) 1-30. DIMACS Workshop, Princeton, NJ, June 10-12, 1996
Semantic Model for Artificial Intelligence Based on Molecular Computing
Yusei Tsuboi, Zuwairie Ibrahim, and Osamu Ono
Control System Laboratory, Institute of Applied DNA Computing, Graduate School of Science & Technology, Meiji University, 1-1-1 Higashimita, Tama-ku, Kawasaki-shi, Kanagawa, 214-8571 Japan
{tsuboi, zuwairie, ono}@isc.meiji.ac.jp
Abstract. In this work, a new DNA-based semantic model is proposed and described theoretically. This model, referred to as the 'semantic model based on molecular computing' (SMC), has the structure of a graph formed by the set of all attribute-value pairs contained in the set of represented objects, plus a tag node for each object. Attribute layers, composed of attribute values, then line up. Each path in the network from an initial object-representing tag node to a terminal node represents the object named on the tag. Application of the model to a reasoning system is proposed, via virtual DNA operations. On input, object-representing dsDNAs are formed via parallel self-assembly from encoded ssDNAs representing (value, attribute) pairs (nodes), as directed by ssDNA splinting strands representing relations (edges) in the network. The computational complexity of the implementation is estimated via simple simulation, which indicates the advantage of the approach over a simple sequential model.
1 Introduction
Our research group focuses on developing a semantic net (semantic network) [1] via a new computational paradigm. Human information processing often involves comparing concepts. There are various ways of assessing concept similarity, which vary depending on the adopted model of knowledge representation. In featural representations, concepts are represented by sets of features. In Quillian's model of semantic memory, concepts are represented by their relationships, named via links. Links are labeled by the name of the relationship and are assigned “criteriality tags” that attest to the importance of the link. In artificial computer implementations, criteriality tags are numerical values that represent the degree of association between concept pairs (i.e., how often the link is traversed) and the nature of the association. The association is positive if the existence of the link indicates some sort of similarity between the end nodes, and negative otherwise. For example, superordinate links (the term used for 'is-a...' relationships) have a positive association, while 'is-not-a...' links have a negative association. Just as there are at least two research communities that deal necessarily with questions of generalization in science, there are at least two bodies of
knowledge concerned with representation of the known world as discovered and explained by science. On one hand, knowledge can be fundamentally procedural and causal; on the other, knowledge is fundamentally judgemental [2]. Naturally, the knowledge representation schemas are quite different; thus, the manner in which the knowledge may be processed to generate new knowledge in each model is also quite different. Semantic modeling provides a richer data-structuring capability for database applications. In particular, research in this area has articulated a number of constructs that provide mechanisms for representing structurally complex interrelations among data typically arising in commercial applications. Eric Baum [3] first proposed the idea of using DNA annealing to perform parallel associative search in large databases encoded as sets of DNA strands. This idea is very appealing since it represents a natural way to execute a computational task in massively parallel fashion. Moreover, the required volume scales only linearly with the base size. Retrievals and deletions under stringent conditions occur reliably (98%) within very short times (hundreds of milliseconds), regardless of the degree of stringency of the recall or the number of simultaneous queries in the input. Arita et al. [4] suggest a method for encoding data and report experimental results for performing concatenation and rotation of DNA. This work also demonstrates the feasibility of join operations in a relational database with molecules. However, this database work is not based on semantic nets. We believe that one way to approach a memory with power near to that of a human is to construct a semantic model based on molecular computing. In this light, we ask: what type of model is most suitable for implementing such a DNA-based architecture? In this paper, we propose a new semantic model and its application. The semantic model works on a DNA-based architecture, using standard tools from DNA computing. The application is an Adleman-like [5] scheme which employs the primitive motion of DNA strands in a vessel to effect parallel computation. An important point of this work is the verification of the effectiveness of these approaches via actual experiment.
2 Methodology
In this section, we first provide an overview of the structure of a basic semantic net. Second, we describe how to create a new model based on DNA molecules. Finally, the proposed model is represented by double-stranded DNAs for purposes of application.
2.1 Structure of Semantic Net
The basic structure of a semantic net is a two-dimensional graph, similar to a network. It is relatively easy for humans to deal with a semantic net, because it represents an object (or concept) created from knowledge based on human memories. The semantic net is made of three relations: Object, O; Attribute, A; and Attribute Value, V. In general, this list representation is denoted as follows:

$O = \{(A_1, V_1), (A_2, V_2), \ldots, (A_m, V_m)\}$
A basic semantic net may be described as a graph with nodes, edges, and labels representing their relations. $O$ is reasoned out by the relations between the $A_i$ and $V_i$. Because the semantic net is simply defined with nodes and edges, it is a suitable system to support the search for multiple objects in parallel, and to be used as a knowledge-based system. In general, semantic net size increases with the number of attributes or attribute values. On the other hand, it is imperative to transform complicated graphs into simpler ones. The AND/OR graph enables the reduction of graph size and facilitates easy understanding. Thus, instead of using the standard semantic net described above, in the next section we define a new model, developed to make the most of DNA computing.
2.2 Semantic Model Based on Molecular Computing
First, a tag naming an object is set as an initial node in the graph. After we determine the number and kinds of attributes extracted from the object, a node carrying each attribute and its attribute value is created, following the tag node. Second, the relation between nodes and edges is represented using a newly defined AND/OR graph. In Fig. 1-a, a directed edge toward the terminal is connected between nodes in series, except in the following case: if two nodes have the same attribute but different attribute values, the directed edges are connected in parallel, as shown in Fig. 1-b. Each edge denotes only the connection between nodes in the directed graph. Finally, labels such as '(Tag)' and '(Attribute, Attribute Value)' are attached to the nodes. A node denotes either the name of the object or an attribute together with its value. In short, one path from an initial node to a terminal node represents one object, named on the tag. We define this graph as our knowledge representation model. The model represents an object, reasoned out from the combinations of nodes connected by edges. For example, Fig. 2 illustrates this object representation for an apple (named via the tag). An overall graph is then formed as the union of a set of such basic objects, each described in a similarly simple fashion. Fig. 3 shows an example of such a network. We call such a graph a semantic model based on molecular computing (SMC). An SMC contains all attributes common to every object, as well as each attribute value. Attribute layers consist of attribute values, lined up. If an object has no value for a certain attribute, the attribute value is assigned 'no value'.
Fig. 1. AND/OR graph connecting nodes in series and in parallel
Semantic Model for Artificial Intelligence Based on Molecular Computing
35
Fig. 2. Simple object model of an apple. The three determined attributes are shape, color, and size
Fig. 3. Semantic Model Based on Molecular Computing (SMC), which collectively models a set of objects, given a total number of attributes, m
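As a concrete illustration (a hedged sketch; the tags, attributes, and values are invented for the example), the SMC construction just described can be mirrored with ordinary graph data structures:

def build_smc(objects):
    # objects maps a tag to an ordered list of (attribute, value) pairs.
    # Objects that share an (attribute, value) pair share that node, which
    # yields the AND/OR structure of Fig. 1.
    nodes, edges = set(), set()
    for tag, pairs in objects.items():
        path = [('Tag', tag)] + list(pairs)
        nodes.update(path)
        edges.update(zip(path, path[1:]))   # directed edge to the next layer
    return nodes, edges

objects = {'apple': [('shape', 'round'), ('color', 'red'), ('size', 'medium')],
           'lemon': [('shape', 'oval'), ('color', 'yellow'), ('size', 'medium')]}
nodes, edges = build_smc(objects)           # shared node: ('size', 'medium')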
2.3 DNA Representation of SMC
Each of the nodes and edges of an SMC may be represented by a DNA strand, as follows. Each node (except for tags) is mapped onto a unique, single-stranded (ss) DNA oligonucleotide in a DNA library of strands. In the DNA library, a row corresponds to an attribute and a column to an attribute value, and each DNA sequence is designed according to these relations to prevent mishybridization with other, unmatching sequences. Every object-naming tag node is represented by a random sequence of unique length (200, 300, 400, ...) to distinguish the objects. Each edge from one node to the next is designed to be Watson-Crick complementary to the adjoining node sequences, derived from the 3' 10-mer of the upstream node and the 5' 10-mer of the downstream node. Except for the initial and terminal edge strands of each graph path, each edge is a ssDNA oligonucleotide of length 20; the initial and terminal edge strands are instead sized to match exactly the free ends of the initial and terminal node strands. In this way, the SMC is represented by double-stranded (ds) DNAs. Fig. 4 shows one of the paths of the apple model in Fig. 3, as represented by a dsDNA.
Fig. 4. One of the double-stranded DNAs representing the graph of the apple in Fig. 2
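The strand-level mapping just described can be sketched as follows; the sequences here are randomly generated placeholders rather than the carefully designed, mishybridization-free library the paper requires:

```python
# Sketch of the DNA mapping described above (sequences are random placeholders,
# not designed to avoid mishybridization as the paper requires).
import random

COMP = str.maketrans("ACGT", "TGCA")

def revcomp(s):
    return s.translate(COMP)[::-1]

def random_strand(length=20):
    return "".join(random.choice("ACGT") for _ in range(length))

# Node strands: one 20-mer per (attribute, value) node; tags get unique lengths
# (200, 300, 400, ...) so objects can later be distinguished by product size.
nodes = [random_strand() for _ in range(3)]
tag = random_strand(length=200)

# Each internal edge strand (length 20) is complementary to the 3' 10-mer of
# one node and the 5' 10-mer of the next, so ligation links adjacent nodes.
edges = [revcomp(a[-10:] + b[:10]) for a, b in zip([tag] + nodes, nodes)]
print([len(e) for e in edges])  # three 20-mer edge strands
```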
3 Application
The following demonstrates the application of the semantic model to a reasoning system. The system is implemented by chemical operations with DNA molecules.
3.1 Reasoning System
This reasoning system consists of: (a) input, (b) knowledge base, (c) reasoning engine, and (d) output.
a) Input: the attribute values are extracted from an input object separately, according to the previously determined attributes. Using the attributes and attribute values, a ssDNA is synthesized as an input molecule.
b) Knowledge base: a ssDNA representation of each edge and tag in the network is synthesized as a knowledge-based molecule.
c) Reasoning engine: the reasoning engine denotes the biochemical reactions which occur under experimental conditions, given a particular set of input molecules and the complete set of knowledge-based molecules.
d) Output: output refers to the dsDNA products, as determined via length.
3.2 Implementation
In this work, the system is implemented by virtual chemical operations. For reliable operation, each of the knowledge-based and input molecules must first be amplified sufficiently. Knowledge-based molecules are inserted into a test tube as a molecular knowledge-based memory. Input molecules are then put into the test tube. It is assumed that each ssDNA will spontaneously anneal to complementary sequences under defined reaction conditions in the test tube. The ssDNA sequences representing input molecules and knowledge-based molecules are then mixed in the presence of DNA ligase, which forms a covalent bond between each template-directed pair of adjacent DNAs, directed by a pair of complementary single-stranded overhangs. Thus, each set of sequences is ligated to form a dsDNA which represents a path between an initial node and a terminal node in the model. As a result, all possible dsDNAs representing the paths are generated at random. The generated dsDNA set must then be analyzed to determine the specific set of represented objects, as produced by the reaction. Generated dsDNAs are subjected to
gel electrophoresis, which separates the strands based on length; the products then appear as discrete bands in a lane of the gel. The length of each generated dsDNA, denoted N_S, is given by the simple relation

N_S = L_D × N_A + L_T,

where L_D is the length of each ligated dsDNA segment (excluding the tag sequence), N_A is the number of attributes, and L_T is the length of the tag. For instance, if the reference object is an apple with L_D = 20, N_A = 3, and L_T = 200, double-stranded DNAs of 260 bp (base pairs) are expected in the lane.
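As a quick check, the worked example can be reproduced numerically; the formula below is the relation reconstructed above from that example:

```python
def product_length(l_d, n_a, l_t):
    # N_S = L_D * N_A + L_T (reconstructed from the worked example in the text)
    return l_d * n_a + l_t

# Apple example: 20-mer node segments, 3 attributes, 200-base tag -> 260 bp band.
assert product_length(l_d=20, n_a=3, l_t=200) == 260
print(product_length(20, 3, 200), "bp")
```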
4 Discussion
The model and implementation presented in this paper rely on chemical processes such as annealing and gel electrophoresis. In actual practice, an effective way to select sequences so as to avoid mismatched, erroneous hybridization will have to be devised. Recently, substantial progress has been reported on this issue [6]-[8]. We expect that it will be resolved satisfactorily in the near future. The proposed model is applied to knowledge-based memory via DNA molecules, which is in some sense similar to human memory, due to the inherent massive parallelism. This performance is not realized in artificial, sequential models of computing. Although simulations are interesting, the inherent advantages provided by the design will therefore be evident only when using real DNA molecules. The advantage of the proposed model should ultimately be evaluated by running it on a DNA computer and comparing with a silicon-based computer; it is commonly acknowledged that simulating a chemical reaction on a silicon-based computer is difficult. DNA-based computers integrate software with hardware and calculate in parallel. A direct attempt to simulate the implemented reaction on a conventional silicon-based computer would be compromised by the potential for a combinatorial explosion in the number of distinct path molecules. Some studies in artificial intelligence have addressed avoiding such increases in knowledge and computational complexity. For this reason, in order to demonstrate the advantage of the proposed model over a simple, sequential model, we estimate the computational complexity required for a solution, assuming that every ssDNA encounters all others in the test tube. It is possible to reason out an object by the combinations between input molecules and knowledge-based molecules. Therefore, it is reasonable to expect the number of combinations to increase with the number of objects and attributes. Fig. 5 shows the relations between the attributes and the combinations. The number of combinations is estimated separately for a simple, sequential architecture and for a DNA-based architecture, when there are 3, 100, and 1000 target objects in the molecular knowledge-based memory. For the simple architecture, the blue, green, and red lines correspond to the cases of 3, 100, and 1000 objects, respectively. Each of these three lines increases exponentially with the number of attributes. In contrast, a single light blue line indicates the operation number required for a DNA-based architecture
for each of the cases of 3, 100, and 1000 objects. This line also increases exponentially with the attribute number. However, the number of combinations does not depend on the number of target objects, since the proposed application requires only DNA self-assembly, which proceeds for all objects in parallel. This simulation result suggests that the proposed implementation will be effective in reducing the computational time, under ideal conditions.
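The scaling argument can be illustrated with indicative counts. The paper does not give the exact formulas behind Fig. 5, so the functions below (including the assumed number of values per attribute, v) are only meant to reproduce the qualitative behavior: exponential growth in the number of attributes, with and without a factor for the number of objects:

```python
# Indicative operation counts (the paper's exact counting for Fig. 5 is not
# reproduced here; v = values per attribute is an assumed parameter).
def sequential_ops(n_objects, n_attributes, v=4):
    # A sequential search over all value combinations, repeated per object.
    return n_objects * (v ** n_attributes)

def dna_ops(n_attributes, v=4):
    # Self-assembly proceeds for all objects in parallel: no object factor.
    return v ** n_attributes

for m in (1, 5, 10):
    print(m, [sequential_ops(n, m) for n in (3, 100, 1000)], dna_ops(m))
```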
Fig. 5. Estimation of the computational complexity, with increasing number of attributes and objects in the knowledge based memory
5 Conclusion
In this work, a semantic model has been presented for knowledge representation with DNA molecules, as follows: (1) a newly defined semantic model, in which objects are represented by dsDNAs; (2) for the stated application, a reaction that proceeds via DNA self-assembly, a process which was outlined and analyzed via simulation from a theoretical point of view; (3) an estimate of the computational complexity of the DNA-based architecture, compared with that of a simple, sequential architecture.
Since the inception of DNA-based computing, a primary concern has been the development of applications in the field of engineering and artificial intelligence. The proposed model suggests that DNA-based computing should be applicable to the artificial intelligence field. It seems likely that this approach will be utilized as a natural application for problems involving pattern matching, deduction, and retrieval in the future.
Acknowledgement The authors would like to thank J. A. Rose of the University of Tokyo for helpful comments that led to many improvements in this paper.
References
1. Quillian, M.R.: Semantic Memory. In: Semantic Information Processing, M. Minsky, Ed. Cambridge, MA: MIT Press (1968)
2. Blanning, R.F.: Management Applications of Expert Systems. Information and Management, Vol. 7 (1984) 311-316
3. Baum, E.B.: How to Build an Associative Memory Vastly Larger than the Brain. Science 268 (1995) 583-585
4. Arita, M., Hagiya, M., Suyama, A.: Joining and Rotating Data with Molecules. IEEE International Conference on Evolutionary Computation (1997) 243-248
5. Adleman, L.M.: Molecular Computation of Solutions to Combinatorial Problems. Science, Vol. 266 (1994) 1021-1024
6. Deaton, R., Murphy, R.C., Garzon, M., Franceschetti, D.R., Stevens, S.E. Jr.: Good Encodings for DNA-based Solutions to Combinatorial Problems. DNA Based Computers II, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 44 (1999) 247-258
7. Rose, J.A., Deaton, R., Franceschetti, D., Garzon, M., Stevens, S.E. Jr.: A Statistical Mechanical Treatment of Error in the Annealing Biostep of DNA Computation. Proc. GECCO'99 (1999) 1829-1834
8. SantaLucia, J., Allawi, H., Seneviratne, P.: Improved Nearest-Neighbor Parameters for Predicting DNA Duplex Stability. Biochemistry, Vol. 35, No. 11 (1996) 3555-3562
9. Garzon, M., Bobba, K., Neel, A.: Efficiency and Reliability of Semantic Retrieval in DNA-based Memories. Lecture Notes in Computer Science, Springer-Verlag, Heidelberg (2003) 379-389
The Fidelity of the Tag-Antitag System III. Robustness in the Excess Limit: The Stringent Temperature
John A. Rose
Department of Computer Science, The University of Tokyo, and Japan Science and Technology Corporation, CREST
[email protected]
Abstract. The importance of DNA microarrays and tag-antitag (TAT) systems has prompted the recent development of various approaches for high-fidelity design, including analytical methods based on an ensemble average error probability per conformation, or computational incoherence. Although predictions for dilute inputs indicate the easy attainment of excellent fidelity, a sharp phase transition, from the low-error behavior predicted for dilute inputs to a high-error response, was recently predicted to accompany an asymmetric (i.e., single-tag) excess input. This error response, which is likely to be the critical test of TAT system robustness for DNA-based computing applications that employ non-linear amplification, is examined more closely, via derivation of an approximate expression for the computational incoherence in the single-tag, excess limit. The temperature dependence of this expression is then characterized, and applied to derive an expression for a novel TAT system error parameter, which defines the temperature of minimal error and is taken to provide a precise definition of the stringent reaction temperature previously discussed conceptually in the literature. A similar analysis, undertaken for a uniform excess multi-tag input, indicates the absence of a phase transition in the error response. The validity of each expression is discussed via simulation, with comparison to the general model. Applicability to both TAT system design and selection of an optimal reaction temperature is discussed.
1 Introduction
DNA microarrays are indexed arrays of single-stranded (ss) DNA probes which are immobilized on a solid substrate. When exposed to a set of unbound target ssDNA strands, the chip essentially performs an exhaustive, parallel search for complementary sequences between the immobilized probes and target species. DNA chips have been successfully applied to gene expression profiling (GEP) and genotyping on a genomic scale [1], and have also been suggested for applications in DNA computing [2], and DNA computing-based biotechnology [3]. Notably, design for computational application simplifies word selection, since the ssDNA species need not be correlated to a genome of interest, but may be selected arbitrarily. The resulting set forms a Tag-Antitag system [4], constrained only by
the requirement that each anchored probe, or 'antitag' species, be the Watson-Crick complement of exactly one corresponding target, or 'tag' species. The anchored component is often referred to as a 'universal' DNA chip. Although a number of design goals exist for TAT systems (e.g., uniform energetics, minimal folding, maximal specific affinity, etc. [5]), the design for maximal specific affinity is the most challenging, due to the highly coupled nature of the hybridizing system. Various heuristic methods for TAT system design have been proposed [4, 6, 7, 2, 8] for this purpose. In addition, a statistical thermodynamic approach for TAT system error analysis and design has also been reported [9, 5], which is attractive due to its physical motivation, the availability of energetic parameters, and the generation of a quantitative, well-defined measure of system performance. Fundamental to this approach is the modelling of system error in terms of an ensemble average probability of error hybridization per conformation (the computational incoherence), so that the inverse problem of system design is equated to the process of measure minimization. Thus far, results include approximate expressions for dilute single-tag and multi-tag inputs [9], and general single-tag inputs [5]. Note that an equilibrium approach has also been applied to investigate the fidelity of DNA-protein interactions [10], nucleic acid-based antisense agents [11], and, via the computational incoherence, the annealing [12] and annealing-ligation [13] biosteps of DNA-based computing.
1.1 Recent Work and Motivation
In [5], an approximate solution for the computational incoherence of the TAT system in response to a single-tag input was derived for the error response over a wide range of input tag concentrations. For all error conditions, the simulated dependence of the computational incoherence on the total input tag concentration indicated a sharp phase transition between high-error and low-error operation, in the vicinity of an equimolar input (i.e., an input equal to the total concentration of each antitag species), for temperatures T beneath the melting transition of the planned TAT species. In particular, TAT system fidelity was predicted to transition abruptly between: (1) a monotonically increasing function of T (dilute inputs), characterized by low-error operation; and (2) a convex function of T (excess inputs), characterized by an error minimum at a distinguished temperature, with exponentially increasing error away from this temperature. Intuitively, this transition signals saturation of the target antitag, which naturally accompanies an excess single-tag input, beneath the melting transition of the target TAT duplex. For simple, 1-step TAT system applications in biotechnology, dilute conditions may generally be safely assumed. Given the ease of attaining high-fidelity performance at low temperatures, predicted in the dilute regime [9, 5], the biasing of DNA computers to ensure dilute-regime operation of an associated TAT system component is clearly desirable. However, given the tendency for DNA computing architectures to implement repeated linear strand growth, via merge operations, as well as species-specific, non-linear strand growth via PCR amplification, over the course of multiple steps/rounds [8], there appears to be a strong
potential for computational processes to generate an asymmetric input, consisting of a dilute component combined with an excess component of one (or more) input species. In this case, consistent high-fidelity operation at low temperatures is predicted to become substantially more problematic, even for the best encodings (see Sec. 4). For these architectures, consideration of the associated TAT system's single-input, excess-error response curves yields valuable information for the selection of a reaction temperature appropriately robust to a range of asymmetric excess inputs. To support this analysis, the current work undertakes a closer examination of the single-tag error behavior in the excess regime, with the aim of identifying a design principle which renders the implemented TAT systems maximally robust to asymmetric, excess inputs. Following an overview of the general model for predicting the TAT system single-input error response (Sec. 2), an approximate expression is derived in Sec. 2.1 for the computational incoherence in the limit of excess input. Sec. 2.2 then discusses the temperature dependence of this expression, followed in Sec. 2.3 by the identification of a new TAT system parameter, which estimates the temperature of optimal fidelity given an excess input of a tag species, and is taken to provide a novel, precise definition of the stringent temperature previously discussed conceptually in the literature for the TAT system [4]. For completeness, Sec. 3 describes a parallel analysis for the uniformly excess input. Sec. 4 reports a set of statistical thermodynamic simulations undertaken to explore the validity and implications of the derived expressions. In closing, Sec. 5 discusses applicability to TAT system design.
2 The Single-Tag Input
The error probability per hybridized tag for a TAT system, in response to an input of a single tag species, is estimated by the expression given in [5] (Eq. 1). Here, the key quantity is the signal-to-noise ratio (Eq. 2), in which the net equilibrium constant of duplex formation between each tag species and antitag species contributes to the noise, while that of the matching TAT pair provides the signal. For approximation purposes, it is typical to assume a small overall error rate, so that the signal-to-noise ratio is large [9, 5]. At equilibrium, this condition, here referred to as weak orthogonality, takes a convenient form in terms of the equilibrium constants. Although this approximation will begin to fail for an excess (but not dilute) input, as the temperature is reduced to the vicinity of the melting temperature of the most stable error TAT pair, it nevertheless facilitates an investigation of the approximate functional form of the error response. Furthermore, upon failure, this approximation
will overestimate the error and thus provide a bounding value which, as simulations indicate, is not too far off the mark [5]. Following application of weak orthogonality, approximate solution of Eq. 1 involves re-expression of the error probability in terms of equilibrium constants and initial concentrations, via combination of the equations of strand conservation with an equation of mass action for each component equilibrium. In particular, strand conservation yields an equation of the form of Eq. 3 for each antitag species, and an equation of the form of Eq. 4 for the single input tag species, where the impact of tag-tag interaction has been neglected. Strict estimation then proceeds via numerical solution of the coupled equations formed by Eqs. 3 and 4. In [5], an approximate approach was used to derive a general solution applicable over a wide range of input concentrations. Readers are referred to the original paper for a detailed development and discussion. In the current work, attention is restricted to a more detailed analysis of TAT system behavior in the excess-input limit.
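The numerical route mentioned above can be sketched for a toy 2-tag, 2-antitag system. The generic mass-action and conservation forms assumed below stand in for the paper's Eqs. 3-4, which are not reproduced in this excerpt, and all concentrations and equilibrium constants are illustrative (scipy is assumed to be available):

```python
# Toy numerical solution of the coupled equilibria (generic mass-action forms
# assumed; the excerpt does not reproduce the paper's Eqs. 3-4 verbatim).
from scipy.optimize import fsolve

K = [[1e9, 1e4], [1e4, 1e9]]   # net duplex formation constants K[x][i] (1/M)
at0 = [1e-7, 1e-7]             # total antitag concentrations (M)
t0 = [1e-6, 0.0]               # single-tag input: only tag 0 present (M)

def residuals(z):
    t, at = z[:2], z[2:]
    # Strand conservation for each tag species...
    r = [t0[x] - t[x] - sum(K[x][i] * t[x] * at[i] for i in range(2))
         for x in range(2)]
    # ...and for each antitag species, with [duplex] = K * [tag] * [antitag].
    r += [at0[i] - at[i] - sum(K[x][i] * t[x] * at[i] for x in range(2))
          for i in range(2)]
    return r

solution = fsolve(residuals, [1e-8] * 4)
print(solution)  # free tag and antitag concentrations at equilibrium
```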
2.1 Behavior in the Excess Limit
A simple, approximate expression for the single-tag error response in the limit of excess input may be derived straightforwardly, by noting that the impact of hybridization on the equilibrium concentration of the input tag species may be neglected. This allows Eq. 4 to be approximated by setting the equilibrium tag concentration equal to its total input concentration. Substitution of this expression into Eq. 3 yields Eq. 5. Invoking weak orthogonality, followed by insertion of these expressions, reduces Eq. 1 to the desired approximate form (Eq. 6), which applies to the case of excess input. In the absence of significant hairpin formation, this reduces to a simple ratio (Eq. 7).
For comparison purposes, the approximate expression for the converse limit of dilute input, without hairpinning, was given in [5] as a simple ratio (Eq. 8).
2.2 Temperature Dependence
The temperature dependence of the excess-limit error response may be investigated by straightforward differentiation. Neglecting hairpin formation, this process yields Eq. 9, in which an ensemble average appears that is taken only over the set of error conformations, defined formally by Eq. 10. This quantity is distinguished from the full ensemble average by the absence of a contribution from the planned TAT pair in both numerator and denominator, due to the restriction of the measurement to the error ensemble. In contrast with the monotonically increasing form reported in [9], the form of Eq. 9 suggests that the excess-limit error response behaves as a convex function of T, with a minimum between the melting temperatures of the planned and error TAT pairs. This is discussed via simulation in Sec. 4.
2.3 Robustness in the Excess Limit: The Stringent Temperature
Eq. 9 may also be used to derive an approximate expression for the temperature at which the error response assumes a minimum value. This is accomplished by noting that the derivative vanishes at the minimum (Eq. 11), where the superscript '†' denotes strict evaluation at the minimizing temperature, followed by the application of three well-motivated approximations. First, as simulations [5] predict that the minimizing temperature is consistently located substantially above the melting transition of all error TAT species, the error duplexes may be taken to be essentially melted there. Secondly, the ensemble average enthalpy of formation for error species is assumed to be approximated, to first order, by the enthalpy of the single most dominant error species, given the usual dominance of this term in the weighted average. Finally, the planned TAT pair may be taken to be essentially formed, since the minimizing temperature is also expected to be located beneath the melting temperature of the planned TAT species [5], at least for
the case of excess input. Application of each of these approximations to Eq. 11, followed by rearrangement, yields Eq. 12, which defines the temperature for optimum-fidelity operation, given an excess input of a tag species. This new TAT system parameter is taken to provide a novel, precise definition of the intuitive concept of the stringent temperature, previously discussed conceptually in the literature [4]. Applicability of this parameter set to both TAT system design and selection of an optimal reaction temperature is discussed in Sec. 5.
3 The Excess Multi-tag Input
An approximate expression for the error response due to a multi-tag input in the excess limit may be derived similarly, beginning with the standard expression for the computational incoherence [12, 13], as applied to the TAT system [9, 5] (Eq. 13), and proceeding via approximation of the equilibrium concentrations, in a process similar to the single-tag development presented in Sec. 2. First, the impact of hybridization on each excess tag species is again neglected, so that the equation of strand conservation for each input tag again takes the approximate form of a fixed total concentration. Using this expression, the equation of strand conservation for each antitag species may then be written as Eq. 14. The sum over error terms may now be simplified by invoking the dual of 'weak orthogonality', which holds for all but the worst TAT encodings, but only under conditions of excess input for all tag species. Insertion of these expressions into Eq. 13 via mass action, and invoking weak orthogonality, yields the desired approximation (Eq. 15), where the subscript denotes excess input for all tag species. The form of this expression is similar to that reported for a dilute, multi-tag input [9]. The temperature dependence of this expression may be investigated by a rather tedious process of differentiation; here, only the result is presented (Eq. 16),
where hairpin formation has been neglected, and the bracketed quantities denote ensemble averages computed over all error TAT pairs. Again, this expression is functionally similar to the temperature dependence reported in [9] and [5] for the dilute multi-tag and single-tag inputs, respectively, although in [9] the relevant quantity was mistakenly identified in the text as the sum, rather than the ensemble average, over the enthalpies of formation for error conformations. From the form of Eq. 16, the error response due to a uniformly excess input is a monotonically increasing function of T, with no error minimum between the melting temperatures of the planned and dominant error duplexes (i.e., no stringent temperature). This behavior is in marked contrast with the logarithmically convex TAT system error behavior predicted in the vicinity of the error minimum, in response to an asymmetric input composed of an excess of a single (or several, but not all) tag species.
4 Simulations
In order to investigate the applicability of the excess-limit approximation (Eq. 7) for predicting the error response due to an asymmetric input of a single, excess tag species, a set of simulations was implemented in Mathematica™. Fig. 1 illustrates the simulation results, which predict the error response for the minimal-complexity (i.e., 2-probe) DNA chip, composed of ssDNAs of length 20 bases, in which the input target species may participate in a full-length planned hybrid, or in a single error duplex of length (a) 15 base pairs (bps), (b) 10 bps, or (c) 5 bps. Predictions are presented as a function of temperature, in response to specific dilute (panel 1, solid blue lines) and excess (panel 2, solid red lines) input tag concentrations, with each antitag present at a fixed total concentration, at pH = 7.0. Each equilibrium constant was estimated via the Gibbs factor, using a Watson-Crick, two-state model, assuming mean doublet stacking energetics. The impact of dangling ends, hairpin formation, and tag-tag interaction was neglected. Dashed lines in panels 1 and 2 present the corresponding predictions provided by the approximate expressions presented in Sec. 2.1 for the limiting cases of dilute and excess single-tag input, respectively. For each error condition, panel 3 compares the predicted temperature for optimal-fidelity excess operation, obtained via (1) visual inspection of the plotted curves, and (2) the approximate expression, Eq. 12. For comparison purposes, melting temperatures for each planned duplex (under both excess and dilute input conditions), and for each error duplex (excess conditions only), are also illustrated, as predicted in isolation. Each listed value corresponds to the temperature which maximizes the corresponding differential melting curve, generated via a statistical, two-state model of DNA melting [15, 5]. Panel 4, top inset, shows a blow-up of the high-error curve of panel 2(a); the middle and bottom insets illustrate the isolated differential melting curves predicted for the planned (solid curves) and dominant error (dashed curves) species, for excess ('10x') and dilute ('0.1x') input, respectively.
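The Gibbs-factor estimate underlying these simulations can be sketched as follows, with placeholder enthalpies and entropies rather than the mean doublet stacking energetics actually used, and a simplified (non-quadratic) expression for the hybridized fraction:

```python
# Two-state Gibbs-factor sketch: K = exp(-(dH - T*dS)/(R*T)), with placeholder
# energetics (not the mean doublet stacking parameters used in the paper).
import math

R = 1.987e-3  # gas constant, kcal/(mol*K)

def K_duplex(dH, dS, T):
    """Net equilibrium constant of duplex formation at temperature T (K)."""
    return math.exp(-(dH - T * dS) / (R * T))

def frac_duplex(dH, dS, T, ct=1e-7):
    """Simplified hybridized fraction at total strand concentration ct (M)."""
    k = K_duplex(dH, dS, T) * ct
    return k / (1.0 + k)

for T in (300, 320, 340, 360):
    # Planned full-length duplex vs a shorter error duplex (placeholder values).
    print(T, frac_duplex(-150.0, -0.40, T), frac_duplex(-75.0, -0.20, T))
```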
5 Discussion and Conclusion
As shown in Fig. 1 (panel 2), simulations for all error conditions indicate that the excess-limit approximation (Eq. 7) is in good agreement with the predictions of the general model reported in [5] for excess input, with only minor deviations at high and low temperatures. In each case, the excess-input error response is predicted to assume the expected logarithmically convex form, with a minimum at a distinguished temperature. This behavior is in stark contrast to the low-error, monotonically increasing error response predicted for both single-tag dilute inputs (panel 1) and multi-tag inputs which are either uniformly dilute [9, 5] or uniformly at excess (Eq. 15).
Fig. 1. Behavior and validity of the approximate models. Panels 1 and 2: estimates of the error response provided by the dilute-limit (Eq. 8) and excess-limit (Eq. 7) approximations (dashed curves), respectively, vs. full-model predictions (solid curves) for specific dilute (blue) and excess (red) inputs. Curve sets (a), (b), and (c) depict error responses due to a single dominant error duplex of length 15/20, 10/20, and 5/20 bps, respectively. Panel 3: optimal-fidelity temperatures for excess-input operation, estimated by visual inspection of (a-c) (row 1) and by Eq. 12 (row 2); melting temperatures for the planned duplex (excess and dilute inputs) and error duplexes (excess only), predicted in isolation, are also listed for comparison (rows 4-6). Panel 4 (top sub-panel): blow-up of the high-error curve; middle and bottom sub-panels: isolated differential melting curves for the planned (solid curves) and dominant error species (dashed curves), for excess ('10x') and dilute ('0.1x') inputs, respectively
For TAT systems which form a component of a DNA computer, the potential for excessive error due to an asymmetric input, consisting of a dilute component combined with an excess component of one (or more) input species, may be evaluated by examining that system's set of single-tag, excess-input error values at the operating temperature of interest. For such systems, the mean of these values is proposed as a well-defined measure for high-fidelity design. As indicated by Fig. 1 (panel 3), Eq. 12 provides a good approximation for the stringent temperature. Note that minimizing the error at this temperature has the additional desirable effect of decreasing the sensitivity of the excess error response to variations away from it, since this process broadens the width of the error minimum. The evident dominance of target duplex formation on the inflection point, as evinced by the general proximity of the stringent temperature to the melting temperature of the isolated planned TAT pair under excess conditions (see Fig. 1, panel 3), deserves further discussion. As illustrated in Fig. 1 (panel 4, top inset), in the context of the high-error case (panel 2, (a)), the sigmoidal portion of the error response beneath the stringent temperature (vertical line) is seen to just span the interval between the error and planned melting temperatures, as indicated by the differential melting curves of the isolated planned and error TAT pairs, predicted under excess-input conditions (panel 4, middle inset; solid and dashed red curves, respectively). Overall duplex formation in this regime, predicted to accompany successive decreases in temperature beneath the stringent temperature, is thus characterized by increasing concentrations of error TAT species, compensated for by increasingly smaller fractional increases in the concentration of planned TAT species, due to the onset of planned antitag saturation (thus, the sigmoidal shape). From a fidelity perspective, for systems in which the potential for asymmetric, excess input is unavoidable, the most robust operating condition is approximated by the mean value of the set of stringent temperatures. For this reason, design for uniform stringent temperatures, enabling uniformly error-resistant operation at the mean, is proposed as a second well-defined criterion for guiding high-fidelity statistical-thermodynamic TAT design. On the other hand, several points of care are required in interpreting this mean value as an optimal operating temperature. First of all, if an architecture can be verifiably biased to ensure operation of any associated TAT system strictly in the dilute regime, then a lower temperature of much greater fidelity may be employed, according to the temperature dependence of the dilute error response shown in Fig. 1, panel 1 [9]. If non-dilute operating conditions cannot be strictly avoided, then the simulations strongly suggest the utility of selecting a higher operating temperature, for which Eq. 12 should provide a guide. However, additional care is still required. A further concern is that operating conditions be selected which not only ensure high fidelity, but also allow substantial process completion, for all potential input conditions of interest (i.e., both excess and dilute). For this reason, a comparison with the melting temperatures of each planned TAT species (for a TAT system with distinct, single-tag inputs), as expected under dilute conditions, is also indicated. Based on the predictions provided by Fig. 1 (panel 3, row 6), adoption of the mean stringent temperature as the optimal operating temperature for general
system operation, although attractive due to its robustness to error-prone excess inputs, will always come at the cost of reduced completion of the planned TAT pair, and according to Fig. 1 is strictly satisfactory only for well-encoded TAT systems (for which it is located beneath the planned melting temperature for both dilute and excess inputs). This is illustrated more clearly in Fig. 1 (panel 4, bottom inset), which compares the depressed melting transition of the planned duplex under dilute input ('0.1x', solid blue curve; compare with the same transition under excess input, solid red curve, middle inset) with the elevated stringent temperature characteristic of a high-error system (vertical line), indicating a substantial lack of completion of the planned duplex at that temperature under dilute conditions. If the potential for operation in the non-dilute regime cannot be avoided (so that a suitable, high-fidelity, lower operating temperature cannot be selected), the best compromise is probably to select an operating temperature informed by the melting temperatures of the planned TAT pairs, assessed under the most dilute practical conditions of interest. Furthermore, to minimize this problem, it is evident that a third well-motivated design criterion is to encode for uniform melting temperatures of the planned interactions, as suggested previously [8].
Acknowledgements
Financial support was generously provided by Grants-in-Aid for Scientific Research B (15300100 and 15310084) from the Ministry of Education, Culture, Sports, Science, and Technology of Japan, from Nihon Tokutei, and by JST-CREST.
References
1. D. Lockhart and E. Winzeler, Nature 405, 827 (2000).
2. Q. Liu, et al., Nature 403, 175 (2000).
3. A. Suyama, et al., in Currents in Computational Molecular Biology, S. Miyano, et al., Eds. (Univ. Acad. Press, 2000), 12.
4. A. BenDor, et al., J. Comput. Biol. 7, 503 (2000).
5. J. Rose, M. Hagiya, and A. Suyama, in Proc. 2003 Cong. Evol. Comp., Vol. IV, R. Sarker, et al., Eds. (IEEE Press, 2003), 2740.
6. R. Deaton, et al., Phys. Rev. Lett. 80, 417 (1998).
7. Q. Liu, et al., Biosystems 52, 25 (1999).
8. H. Yoshida and A. Suyama, in DNA Based Computers V, E. Winfree and D. Gifford, Eds. (Am. Math. Soc., 2000), 9.
9. J. Rose, et al., in DNA Computing, N. Jonoska and N. Seeman, Eds. (Springer, Berlin, 2001), 138.
10. P. von Hippel and O. Berg, Proc. Natl. Acad. Sci. USA 83, 1608 (1986).
11. B. Eaton, et al., Chemistry and Biology 2, 635 (1995).
12. J. Rose, et al., in Proc. GECCO '99, W. Banzhaf, et al., Eds. (Morgan Kaufmann, San Francisco, 1999), 1829.
13. J. Rose and R. Deaton, in DNA Based Computers, A. Condon and G. Rozenberg, Eds. (Springer, Berlin, 2001), 231.
14. J. SantaLucia, Jr., Proc. Natl. Acad. Sci. 95, 1460 (1998).
15. R. Wartell and A. Benight, Physics Reports 126, 67 (1985).
Robust PID Controller Tuning Using Multiobjective Optimization Based on Clonal Selection of Immune Algorithm
Dong Hwa Kim and Jae Hoon Cho
Dept. of Instrumentation and Control Eng., Hanbat National University, 16-1 San Duckmyong-Dong, Yuseong-Gu, Daejon City, Korea, 305-719
Tel: +82-42-821-1170, Fax: +82-821-1164
[email protected] ial.hanbat.ac.kr
Abstract. The three-mode proportional-integral-derivative (PID) controller is widely used in industrial processes due to its ease of use and its robustness in the face of plant uncertainties. However, it is very difficult to achieve an optimal PID gain without experience, since the parameters of the PID controller have to be tuned manually by trial and error. This paper focuses on robust tuning of the PID controller using clonal selection of an immune algorithm, which has functions such as diversity, distributed computation, adaptation, and self-monitoring. After deciding the disturbance rejection condition for the given process, the gains of the PID controller are tuned for the required response by clonal selection of the immune algorithm, subject to disturbance rejection. To assess the suggested scheme, simulation results are compared with FNN-based responses and with a genetic algorithm.
1 Introduction
A proportional-integral-derivative (PID) controller has been used in most control loops of plants despite continual advances in control theory: process control, motor drives, thermal power plants and nuclear power plants, automotive, flight control, instrumentation, etc. This is due not only to its simple structure, which is conceptually easy to understand, but also to the fact that the algorithm provides adequate performance in the vast majority of applications [1]. The advantages of a PID controller include simplicity and robustness, but it cannot effectively control a complicated or fast-running system, since the response of the plant depends only on the gains P, I, and D. Because of this, a great deal of effort has been spent on finding the best choice of PID parameters for different process models. In the tuning of a PID process control loop, the classical tuning methods based on the ultimate gain and the period of the ultimate oscillation at the stability limit, approaches based on an exact form of the process expressed by a transfer function, self-tuning based on process parameter estimation, and self-adaptive tuning have typically been used. However, these approaches have some problems with tuning, such as oscillatory
behavior and difficulty in capturing the physical characteristics of real systems. That is, since most of the PID tuning rules developed in past years use conventional methods such as frequency-response methods, a highly technical experience is needed to apply them, and they cannot provide a simple tuning approach for determining the PID controller parameters. For example, the Ziegler-Nichols approach often leads to a rather oscillatory response to set-point changes, because the system has non-linearities such as directionally dependent actuator and plant dynamics, and various uncertainties, such as modeling error and external disturbances, are involved in the system. As a result of these difficulties, PID controllers are rarely tuned optimally. Therefore, to improve the performance of PID tuning for processes with changing dynamic properties, complicated systems, and dead-time processes, several tuning strategies, such as automatic PID tuning, adaptive PID, and intelligent tuning techniques, have been proposed [2]. However, the PID controller parameters are still computed using the classic tuning formulae, and these cannot provide good control performance in all situations. When there is a disturbance in a PID controller loop, the design of the PID controller has to take care of specifications on the responses to the disturbance signals, as well as robustness with respect to changes in the process. Since load disturbances are often the most common problem in process control, most design methods should therefore focus on disturbance rejection and try to find a suitable compromise between demands on performance at load disturbances and robustness. It would be a great advantage if this compromise could be decided using a tuning method. For instance, if a method gives a good approximation of the gain and phase margins of the system design, without having to solve the equations numerically, for a process model such as the first-order plus dead-time model, the tuning approach will be satisfactory. Therefore, in order to provide consistent, reliable, safe, and optimal parameters for industrial control problems, novel PID tuning schemes are needed. In this paper, an intelligent tuning method for the PID controller based on the gain margin and phase margin is suggested, using an immune algorithm for robust control.
2 Gain Margin and Phase Margin for PID Controller
2.1 Gain Margin and Phase Margin
When the PID controller C(s) is applied to a process G(s), the loop transfer function is given by L(s) = C(s)G(s). The basic definitions of the gain margin A_m and the phase margin φ_m are then given as [6]:

A_m = 1 / |C(jω_p)G(jω_p)|, where arg[C(jω_p)G(jω_p)] = -π,
φ_m = π + arg[C(jω_g)G(jω_g)], where |C(jω_g)G(jω_g)| = 1,

with ω_p and ω_g denoting the phase and gain crossover frequencies, respectively.
For a process specified by its gain k and dead time L, the final gain margin and phase margin can then be expressed in terms of these process parameters and the controller gains.
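These crossover-frequency definitions can be checked numerically. The sketch below assumes a first-order plus dead-time process k e^{-Ls}/(1 + Ts) under PID control, with all parameter values illustrative:

```python
# Numerical gain/phase margins for an assumed FOPDT process under PID control.
import numpy as np

k, T, L = 1.0, 1.0, 0.2      # illustrative process parameters
kp, ki, kd = 2.0, 1.0, 0.1   # illustrative PID gains

w = np.logspace(-3, 3, 20001)
s = 1j * w
loop = (kp + ki / s + kd * s) * k * np.exp(-L * s) / (1 + T * s)
mag, phase = np.abs(loop), np.unwrap(np.angle(loop))

ip = np.argmax(phase <= -np.pi)  # first phase crossover: arg(loop) = -pi
ig = np.argmax(mag <= 1.0)       # first gain crossover: |loop| = 1
print("gain margin A_m =", 1.0 / mag[ip])
print("phase margin (deg) =", np.degrees(np.pi + phase[ig]))
```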
3 Immune Algorithms for Tuning of PID Controller Based on Gain Margin and Phase Margin
3.1 Immune Algorithm for Tuning
When an antibody on the surface of a B cell binds an antigen, that B cell becomes stimulated. The level of stimulation depends not only on how well the B cell's antibody matches the antigen, but also on how it matches other B cells in the immune network [4], [5]. The stimulation level of the B cell thus also depends on its affinity with other B cells in the immune network. This network is formed by B cells possessing an affinity to other B cells in the system. If the stimulation level rises above a given threshold, the B cell becomes enlarged, and if the stimulation level falls below a given threshold, the B cell dies off. The more neighbors a B cell has an affinity with, the more stimulation it will receive from the network, and vice versa. Against the antigen, the level to which a B cell is stimulated relates partly to how well its antibody binds the antigen. We take into account both the strength of the match between the antibody and the antigen, and the B cell object's affinity to the other B cells, as well as its enmity. Therefore, in general, the concentration of the i-th antibody, denoted a_i, is calculated via the immune network dynamics of Eq. (4), following [3].
In Eq. (4), N is the number of antibodies, and α and β are positive constants. Further, m_ij denotes the affinity between antibody j and antibody i (i.e., the degree of interaction), and m_i represents the affinity between the detected antigen and antibody i.
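A discrete-time reading of this network dynamics can be sketched as follows. Since Eq. (4) itself is not reproduced in this excerpt, the update below is a Farmer-style interpretation with made-up affinities:

```python
# Farmer-style antibody concentration update (an interpretation; the paper's
# exact Eq. (4) is not reproduced in this excerpt). All values illustrative.
import numpy as np

N = 4
rng = np.random.default_rng(0)
m = rng.random((N, N))   # m[i][j]: affinity between antibodies i and j
g = rng.random(N)        # g[i]: affinity between antibody i and the antigen
a = np.full(N, 1.0 / N)  # antibody concentrations
alpha, beta, dt = 1.0, 1.0, 0.01

for _ in range(200):
    # Network stimulation and suppression, plus stimulation by the antigen.
    da = (alpha * (m @ a) - alpha * (m.T @ a) + beta * g) * a
    a = np.clip(a + dt * da, 0.0, None)
    a /= a.sum()  # keep the total concentration normalized

print(a)  # antibodies with higher antigen affinity tend to dominate
```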
3.2 Tuning of PID Controller Based on Gain Margin/Phase Margin and Immune Algorithm
In this paper, immune algorithms are applied to the constrained optimization tuning based on the gain margin and phase margin. That is, the immune algorithm is used to minimize a fitness function for the gain margin and phase margin in the
memory cell. For an optimal search of the controller parameters and gain/phase margins, the computation procedure is initialized with parameters within the search domain specified by the designer. These parameters are calculated by network theory based on the immune algorithm. The immune algorithm minimizes the fitness function while searching for the optimal gain/phase margins and the parameters of the PID controller over the generations. In the evaluation of the fitness function of the memory cell, individuals with higher fitness values are selected automatically, and those penalized in the memory cell will not survive the evolutionary process. For the implementation of the immune algorithm, this paper used tournament selection, arithmetic crossover, and mutation [3], [5]. The fitness value of each individual in the immune network is defined by Eq. (5), where n denotes the population size of the immune network. In this paper, there are five kinds of fitness contributions in the fitness function of Eq. (5). The contribution for the gain margin and phase margin is decided by the difference between the given margins (gain and phase) and the margins calculated by the immune algorithm; the larger the difference, the larger this contribution. When the overshoot of the reference model exceeds the reference value 1.2, the corresponding fitness value is 0, but if the overshoot is within the given value 1.2, the fitness value varies from 0 to 1 according to the level of the membership function defined in Fig. 1. The rise time and settling time contributions likewise vary with the levels of their membership functions in Fig. 1.
3.3 Computational Procedure for Optimal Selection of Parameters
[Step 1] Initialization and recognition of antigen: initialize the populations of the network and the memory cell.
[Step 2] Production of antibodies from the memory cell: for each individual of the network population, calculate the maximum value using the memory cell. If no individuals of the network satisfy the constraint, a feasible solution is assumed to be nonexistent.
[Step 3] Calculation for searching an optimal solution: calculate the fitness value for each individual.
[Step 4] Stimulation and suppression of antibodies: the expected value of the stimulation of an antibody is computed from the concentrations of the antibodies. Through this function, for each individual of the network, calculate the stimulation using the memory cell, and initialize the genes of each individual in the population.
[Step 5] Stimulation of antibodies: if the maximum number of generations of the memory cell is reached, stop and return the fitness of the best individual to the network; otherwise, go to Step 3.
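The overall search can be pictured with the following skeleton, in which a placeholder stands in for the margin computation and the membership-function fitness of Fig. 1; only the select-and-mutate structure of the procedure is intended to be faithful:

```python
# Skeleton of the immune-algorithm search for PID gains (placeholder fitness;
# the paper's margin-based fitness and membership functions are not shown).
import random

TARGET = (3.0, 60.0)  # desired (gain margin, phase margin in deg), assumed

def margins(gains):
    # Placeholder: stands in for computing A_m and phi_m of the closed loop.
    kp, ki, kd = gains
    return (4.0 - 0.5 * kp, 40.0 + 10.0 * ki - 5.0 * kd)

def fitness(gains):
    am, pm = margins(gains)
    # Larger deviation from the required margins -> larger (worse) fitness.
    return abs(am - TARGET[0]) + 0.1 * abs(pm - TARGET[1])

pop = [[random.uniform(0, 5) for _ in range(3)] for _ in range(20)]
for gen in range(50):
    pop.sort(key=fitness)
    survivors = pop[:10]                       # selection (memory cell analog)
    children = [[g + random.gauss(0, 0.1) for g in random.choice(survivors)]
                for _ in range(10)]            # mutation as clonal variation
    pop = survivors + children

print("best gains:", min(pop, key=fitness))
```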
Fig. 1. Shape of the membership functions for deciding the fitness level (membership function for settling time: f1; membership function for rise time: f2)
4 Simulations and Discussions
In order to evaluate the robust control scheme using the gain margin and phase margin, and the multiobjective optimization based on the clonal selection of the immune algorithm suggested in this paper, we used the plant models given by the equations in [1]. For this model, when the gain margin and phase margin are given as the design requirement, the tuning results obtained by gain margin/phase margin and by multiobjective optimization based on clonal selection of the immune algorithm are as shown in Figs. 2-7.
Fig. 2. Step response depending on clonal variation of immune algorithm (Cn: 5)
Fig. 3. Step response depending on clonal variation of immune algorithm (Cn: 20)
Fig. 4. Objective function and fitness function depending on the variation of clonal selection in immune algorithm
Fig. 5. Step response by objective function
Fig. 6. Comparison of responses by fuzzy neural network, memory cell, and clonal selection
Fig. 7. Responses for objective functions f2, f3, f4, and f5
Figs. 2-3 show the responses obtained by clonal selection of the immune algorithm, and Fig. 4 illustrates the objective function and the fitness function depending on the variation of clonal selection in the immune algorithm. Fig. 5 shows the variation of the parameters with respect to the objective function, obtained using clonal selection of the immune algorithm; in Fig. 6, these results are compared with the result tuned by a fuzzy neural network. The result by clonal selection shows the best shape in the response. Fig. 7 illustrates the plant response for each kind of objective function; that is, Fig. 7 represents, via simulation, the level of effectiveness of each objective function on the plant response.
5 Conclusions
The PID controller has been used to operate industrial processes, including nuclear power plants, since it has many advantages, such as an easy implementation and a control algorithm that is easy to understand. However, achieving an optimal PID gain is very difficult for a feedback control loop with disturbances. Since the gains of the PID controller have to be tuned manually by trial and error, tuning of the PID controller may not cover a plant with complex dynamics, such as large dead time, inverse response, and highly nonlinear characteristics, without any control experience. This paper focuses on tuning of the PID controller using gain/phase margins and an immune algorithm based multiobjective approach. The parameters P, I, and D, encoded in the
antibody, are randomly allocated to obtain an optimal gain for robustness based on the gain margin and phase margin. Optimal values for these parameters can be obtained through the clonal selection process. The parameters vary with the variation of the objective function, as shown in Table 1. The suggested tuning scheme is compared with a fuzzy neural network [7], and its result shows the best shape in the response.
References
1. Wang, Ya-Gang: PI tuning for processes with large dead time. Proceedings of the ACC, Chicago, Illinois, June (2000) 4274-4278
2. Matsumura, S.: Adaptive control for the steam temperature of thermal power plants. Proceedings of the IEEE Conference on Control Applications (1998) 1105-1109
3. Farmer, J.D., Packard, N.H., Perelson, A.S.: The immune system, adaptation, and machine learning. Physica D, No. 22 (1986) 187-204
4. Mori, Kazuyuki, Tsukiyama, Makoto: Immune algorithm with searching diversity and its application to resource allocation problem. Trans. JIEE, Vol. 113-C, No. 10 (1993)
5. Kim, Dong Hwa: Intelligent tuning of a PID controller using an immune algorithm. Trans. KIEE, Vol. 51-D, No. 1 (2002)
6. Ho, W.K., Hang, C.C., Cao, L.S.: Tuning of PID controllers based on gain and phase margin specifications. Automatica, Vol. 31, No. 3 (1995) 497-502
7. Lee, Ching-Hung, Lee, Yi-Hsiung, Teng, Ching-Cheng: A novel robust PID controller design by fuzzy neural network. Proceedings of the American Control Conference, Anchorage, May 8-10 (2002) 1561-1566
8. Kim, Dong Hwa: Intelligent tuning of a PID controller using an immune algorithm. Trans. KIEE, Vol. 51-D, No. 1 (2002)
9. Kim, Dong Hwa: Comparison of PID controller tuning of power plant using immune and genetic algorithms. Measurements and Applications, Lugano, Switzerland, 29-31 July (2003)
Intelligent Tuning of PID Controller with Robust Disturbance Rejection Function Using Immune Algorithm
Dong Hwa Kim
Dept. of Instrumentation and Control Eng., Hanbat National University, 16-1 San Duckmyong-Dong, Yuseong-Gu, Daejon City, Korea, 305-719
Tel: +82-42-821-1170, Fax: +82-821-1164
[email protected] ial.hanbat.ac.kr
Abstract. This paper focuses on robust tuning of the PID controller using an immune algorithm, which has functions such as diversity, distributed computation, adaptation, and self-monitoring. After deciding the disturbance rejection condition for the given process, the gains of the PID controller are tuned to obtain the required response by the fitness value of the immune algorithm, subject to disturbance rejection. Simulation results are compared with a genetic algorithm.
1 Introduction
The PID controller is still widely used in most control loops of plants, despite continual advances in control theory. This is due not only to its simple structure, which is conceptually easy to understand, but also to the fact that the algorithm provides adequate performance in the vast majority of applications [1], [2]. The advantages of a PID controller include simplicity and robustness, but it cannot effectively control a complicated or fast-running system, since the response of the plant depends only on the gains P, I, and D. Most of the PID tuning rules developed in past years use conventional methods such as frequency-response methods. These methods need a highly technical experience to apply, since they provide only simple tuning formulae to determine the PID controller parameters. For example, the Ziegler-Nichols approach often leads to a rather oscillatory response to set-point changes, for the following reasons [2], [6]: the system has non-linearities such as directionally dependent actuator and plant dynamics, and various uncertainties, such as modeling error and external disturbances, are involved in the system. As a result of these difficulties, PID controllers are rarely tuned optimally, and engineers need to settle for a compromise in performance given the time available for the exercise. In particular, to improve the performance of PID tuning for processes with changing dynamic properties, complicated systems, and dead-time processes, several tuning strategies, such as automatic PID tuning [3], adaptive PID [7], and intelligent tuning techniques [1], [2], [6], [7], have been proposed. Since load disturbances are often the most common problem in process control, most design
methods should therefore focus on disturbance rejection and try to find a suitable compromise between demands on performance at load disturbances and robustness [6].
Fig. 1. Control system with disturbance
In this paper, an intelligent tuning method for the PID controller using an immune algorithm is suggested for robust control with disturbance rejection in the control system, and the results are compared with genetic algorithm based control.
2 Disturbance Rejection Condition for Robust Control of PID Controller
2.1 Condition for Disturbance Rejection
In Fig. 1, the disturbance rejection constraint can be given by

||w(s) / (1 + K(s, c)G(s))||_∞ ≤ δ.

Here, δ < 1 is a constant defining the desired rejection level, and ||·||_∞ denotes the H∞-norm, which is defined as the supremum, over all frequencies, of the magnitude of the transfer function.
The disturbance rejection constraint thus becomes

max over ω of |w(jω) / (1 + K(jω, c)G(jω))| ≤ δ.

The controller K(s, c) is written as K(s, c) = c1 + c2/s + c3·s, and the vector c of the controller parameters is given by c = (c1, c2, c3). Hence, the condition for disturbance rejection is given as the constraint above, evaluated over the controller parameter vector c.
2.2 Performance Index for Optimal Controller Design
The performance index defined as the ITSE (integral of the time-weighted square of the error) is written as

PI(c) = ∫_0^∞ t e^2(t) dt.

Because E(s) contains the parameters of the controller c, the value of the performance index PI for a system of nth order can be minimized by adjusting the vector c. The optimal tuning is to find the vector c such that the ITSE performance index PI(c) is a minimum and the disturbance rejection constraint is satisfied, through real-coded immune algorithms.
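The ITSE index can be evaluated by simulating the closed loop and accumulating t·e(t)². The sketch below assumes a first-order plant G(s) = 1/(s + 1) and a simple Euler discretization, both of which are illustrative choices rather than the paper's:

```python
# ITSE evaluation by closed-loop simulation (assumed first-order plant,
# Euler discretization; all parameter values are illustrative).
def itse(c, t_end=20.0, dt=1e-3):
    c1, c2, c3 = c            # P, I, D gains
    y = i_term = e_prev = 0.0
    pi = t = 0.0
    while t < t_end:
        e = 1.0 - y           # unit step reference
        i_term += e * dt
        d_term = (e - e_prev) / dt if t > 0 else 0.0
        u = c1 * e + c2 * i_term + c3 * d_term
        y += dt * (-y + u)    # plant: dy/dt = -y + u, i.e. G(s) = 1/(s + 1)
        pi += t * e * e * dt  # accumulate the integral of t*e(t)^2
        e_prev, t = e, t + dt
    return pi

print(itse((2.0, 1.0, 0.1)))
```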
3 PID Controller Tuning with Disturbance Rejection Function by Immune Algorithms
3.1 Immune Algorithm for Tuning
When an antibody on the surface of a B cell binds an antigen, that B cell becomes stimulated. The level of stimulation depends not only on how well the B cell's antibody matches the antigen, but also on how it matches other B cells in the immune network [4], [5]. The stimulation level of the B cell thus also depends on its affinity with other B cells in the immune network. This network is formed by B cells possessing an affinity to other B cells in the system. If the stimulation level rises above a given threshold, the B cell becomes enlarged, and if the stimulation level falls below a given threshold, the B cell dies off. The more neighbors a B cell has an affinity with, the more stimulation it will receive from the network, and vice versa. Against the antigen, the level to which a B cell is stimulated relates partly to how well its antibody binds the antigen. We take into account both the strength of the match between the antibody and the antigen, and the B cell object's affinity to the other B cells, as well as its enmity.
3.2 Evaluation Method for Disturbance Rejection Based on Immune Algorithms
In this paper, immune algorithms are used for the constrained optimization tuning: the memory cell of the immune algorithm minimizes the performance index PI(c), while the network of the immune algorithm maximizes the disturbance rejection constraint, as depicted in Fig. 3. The immune network maximizes the disturbance rejection constraint during a fixed number of generations for each individual of the memory cell in the immune network. The resulting maximum value is then associated with the corresponding individual of the memory cell. Individuals of the memory cell
that satisfy the disturbance rejection constraint will not be penalized. In the evaluation of the fitness function of the memory cell, individuals with higher fitness values are selected automatically, and those penalized will not survive the evolutionary process. For the implementation of the immune algorithm, this paper used tournament selection, arithmetic crossover, and mutation [3]. An approach using a penalty function [3], [4] is employed to solve the constrained optimization problem. The fitness of each individual of the immune network is determined by an evaluation function, where n denotes the population size of the immune network. The penalty function is discussed in the following. Let the disturbance rejection constraint be as given above. The fitness of each individual of the memory cell is likewise determined by an evaluation function, where m denotes the population size of the memory cell. The penalty for an individual is calculated by means of the penalty function given below.
If an individual does not satisfy the stability test applied to the characteristic equation of the system, then it is an unstable individual, and it is penalized with a very large positive constant. If it satisfies the stability test, but not the disturbance rejection constraint, then it is an infeasible individual and is penalized, where n is a positive constant to be adjusted. Otherwise, the individual is feasible and is not penalized.
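The penalty scheme can be summarized in a small helper; the penalty constants are treated as tunable assumptions, since their exact values are not given in this excerpt:

```python
# Penalized fitness for constrained PID tuning (penalty constants assumed;
# the paper does not specify their exact values in this excerpt).
BIG = 1e9  # very large positive constant for unstable individuals

def penalized_fitness(pi_value, stable, rejection_ok, violation, n=1.0):
    """pi_value: ITSE; violation: amount by which the H-inf bound is exceeded."""
    if not stable:
        return BIG                       # unstable: heavily penalized
    if not rejection_ok:
        return pi_value + n * violation  # infeasible: graded penalty
    return pi_value                      # feasible: no penalty

print(penalized_fitness(0.8, stable=True, rejection_ok=False, violation=0.3))
```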
3.3 Computational Procedure for Optimal Selection of Parameters
The coding of an antibody in an immune network is very important, because a well designed antibody coding can increase the efficiency of the controller. As shown in Fig. 2, there are three types of antibodies in this paper: 1) antibody type 1 is encoded to represent only the P gain (c1) of the PID controller; 2) antibody type 2 is encoded to represent the I gain (c2); 3) antibody type 3 is encoded to represent the D gain (c3). The value of the k-th locus of antibody type 1 shows the P gain allocated to route 1; that is, the value of the first locus of antibody type 1 means that the P gain allocated to route 1 is obtained by route 2 [9]. On the other hand, the n-th locus of antibody type 2 represents the I gain (c2) for tuning of the PID controller with the disturbance rejection function. The objective function can then be written as follows. This algorithm is implemented by the following procedure: given the plant with transfer function G(s), the controller with fixed structure and transfer function C(s, c), and the weighting function W(s), determine the error signal E(s) and the disturbance rejection constraint.
Fig. 2. Allocation structure of the P, I, and D gains in the loci of the antibodies of the immune algorithm
[Step 1] Initialization and recognition of antigen: Initialize the populations of the network and the memory cell, and set the generation counter of the network to its initial value. [Step 2] Production of antibodies from the memory cell: The immune system produces the antibodies that were effective at killing the antigen in the past. If no individuals of the network satisfy the constraint, then a feasible solution is assumed to be nonexistent and the algorithm stops. [Step 3] Calculation for searching an optimal solution: Calculate the fitness value for each individual of the network by using (10) and (11). [Step 4] Differentiation of lymphocytes: The B-lymphocyte cell, i.e. the antibody that matched the antigen, is dispersed to the memory cells in order to respond to the next invasion quickly. [Step 5] Stimulation and suppression of antibodies: The expected stimulation value of each antibody is computed from its concentration among the antibodies. [Step 6] Production of new antibodies: To capture unknown antigens, new lymphocytes are produced in the bone marrow in place of the antibodies eliminated in Step 5. If the maximum number of generations of the memory cell is reached, stop and return the fitness of the best individual to the network; otherwise, advance the generation counter and go to Step 3.
4 Simulations and Discussions

4.1 Example 1: Model

The plant transfer function used for the simulation and the disturbance signal, given as a sine wave, are as specified; the simulation results for this system are shown in Figs. 3-4.
4.2 Example 2: Process Model
This paper simulates the suggested disturbance rejection function on a process plant [6]. Figs. 3-4 show the disturbance rejection response under parameter variation. To assess the performance of the control results, this paper introduces a performance index based on the ITSE, PI(c).
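For reference, the ITSE (integral of time-weighted squared error) criterion on which the index is based is conventionally defined as

\mathrm{ITSE} = \int_{0}^{\infty} t \, e^{2}(t) \, \mathrm{d}t,

where e(t) is the control error; PI(c) evaluates this criterion for a given gain vector c = (c1, c2, c3).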
Fig. 3. Comparison of step responses by immune algorithm based PID tuning (disturbance: 0.1 sin(t), performance index: ITSE, Pm = 0.02-0.1)

Fig. 4. Step response by immune algorithm based PID tuning (disturbance: 0.1 sin(t), performance index: ITSE)
6 Conclusions

This paper focuses on tuning of a PID controller with disturbance rejection by an immune algorithm. For this purpose, we suggest an immune algorithm based tuning method for the PID controller with disturbance rejection. The P, I, and D parameters encoded in the antibodies are randomly allocated during the selection processes to obtain optimal gains for the plant. The objective function can be minimized by gain selection for control, and a variety of gains is obtained, as shown in Table 2. The suggested controller can also be used effectively in a power plant, as seen from Figs. 3-4.
References
1. Wang Ya-Gang (2000) PI tuning for processes with large dead time. Proceedings of the ACC, Chicago, Illinois, June, 4274-4278
2. Matsumura S. (1998) Adaptive control for the steam temperature of thermal power plants. Proceedings of the IEEE Conference on Control Applications, 1105-1109
3. Farmer J. D., Packard N. H., Perelson A. S. (1986) The immune system, adaptation, and machine learning. Physica D, Vol. 22, 187-204
4. Mori Kazuyuki, Tsukiyama Makoto (1993) Immune algorithm with searching diversity and its application to resource allocation problem. Trans. JIEE, Vol. 113-C, No. 10
5. Kim Dong Hwa (2002) Intelligent tuning of a PID controller using an immune algorithm. Trans. KIEE, Vol. 51-D, No. 1
6. Ho Weng Khuen, Hang Chang Chien, Cao Li Sheng (1995) Tuning of PID controllers based on gain and phase margin specifications. Automatica, Vol. 31, No. 3, 497-502
7. Lee Ching-Hung, Lee Yi-Hsiung, Teng Ching-Cheng (2002) A novel robust PID controller design by fuzzy neural network. Proceedings of the American Control Conference, Anchorage, May 8-10, 1561-1566
8. Kim Dong Hwa (2002) Intelligent tuning of a PID controller using an immune algorithm. Trans. KIEE, Vol. 51-D, No. 1
9. Kim Dong Hwa (2003) Comparison of PID controller tuning of power plant using immune and genetic algorithms. IEEE International Symposium on Computational Intelligence for Measurement Systems and Applications, Lugano, Switzerland, 29-31 July
The Block Hidden Markov Model for Biological Sequence Analysis Kyoung-Jae Won1, Adam Prügel-Bennett1, and Anders Krogh2 1
ISIS Group, ECS Department, University of Southampton, SO17 1BJ, United Kingdom [email protected] 2
Bioinformatics Centre, University of Copenhagen, DK-2100 Copenhagen, Denmark
Abstract. Hidden Markov Models (HMMs) are widely used for biological sequence analysis because of their ability to incorporate biological information in their structure. An automatic means of optimising the structure of HMMs would be highly desirable. To maintain biologically interpretable blocks inside the HMM, we used a Genetic Algorithm (GA) that has HMM blocks in its coding representation. We developed special genetic operations that maintain the useful HMM blocks. To prevent over-fitting, a data set separate from that used for the Baum-Welch training is used for comparing the performance of the HMMs. The algorithm is applied to finding HMM structures for the promoter and coding regions of C. jejuni. The GA-HMM was capable of finding an HMM superior to a hand-coded HMM designed for the same task that had been published in the literature.
1 Introduction

In the field of bioinformatics, one of the most successful classes of techniques for analysing biological sequences has been Hidden Markov Models (HMMs). With their ability to encode biological information in their structure, they have proved highly successful for modeling biological sequences; see e.g. [1]. Because the performance of an HMM relies heavily on its structure, great care is required when designing HMM architectures. To create a proper HMM architecture for biological data, researchers have used their biological knowledge [2,3]. However, in many applications the biological mechanism is not well understood. In this paper we investigate Genetic Algorithms (GAs) for optimising the HMM structure while keeping the structure of the HMM human-interpretable. A Genetic Algorithm is a robust general-purpose optimisation technique which evolves a population of solutions [4]. It is easy to hybridise other algorithms, such as Baum-Welch training, within a GA. Furthermore, it is possible to design operators which favour biologically plausible changes to the structure of an HMM. There has been an attempt to find an HMM structure without biological knowledge [5]: the authors used a GA to search for a proper HMM topology for the motif patterns in primate promoters, and showed the possibility that GAs can be applied to finding HMM architectures. However, the result of their approach was not easy to interpret as a DNA pattern. Because their crossovers and mutations produce arbitrary transitions to all of the other states, it is very difficult to check the result by looking at the topology of the HMM. Our GA maintains human-readable blocks through its genetic operations, so as to find an optimal HMM structure without generating too complex a model. When presenting knowledge of DNA patterns, biologists usually use information such as conserved regions, periodic signals and the length of a pattern. For the construction of the promoter model, Petersen et al. [2] used small HMM blocks and combined them to form the whole structure. They used a line of states for the description of the TATA box, a line of states with a loop on the first state for a periodic signal, a line of states that can express several lengths for a spacer region, and states with self-loops for the modeling of background. Our idea is to use these small HMM blocks as the building blocks of an HMM. The genetic algorithm plays a crucial role in combining these blocks and evaluating the suitability of the HMM.
2 Methods

2.1 HMM Blocks and Encoding Methods

The commonly used HMM blocks for biological sequence analysis can be categorized as one of four types: linear, self-loop, forward jump and backward jump blocks (figure 1).
Fig. 1. Commonly used HMM blocks (a) a linear block (b) a self loop block (c) a forward jump block (d) a backward jump block
Linear blocks can model conserved regions. Self-loop blocks are usually used for the background signal of the sequence; this block has only two transitions, to itself and to the next state. To express sequence patterns of varying length, forward jump blocks are used: the forward jump model in figure 1(c) can model a sequence with a length of between 2 and 4. The backward jump blocks are for periodic signals in the sequence. These four blocks can be combined to construct the whole HMM architecture. To represent these blocks efficiently in string form, we compose each element of the string from a length and a type of block, where the length of the block is the number of states inside the HMM block. The whole structure of an HMM can then be represented in string form; a minimal sketch of this representation follows.
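As a concrete illustration of this encoding (our own sketch, not the paper's code), each individual can be held as a list of blocks, each carrying a type and a length; the helper names are ours, and the initial lengths of 3-6 follow section 2.3 below.

from dataclasses import dataclass
import random

# The four block types of figure 1, plus the zero block described below.
BLOCK_TYPES = ("linear", "self_loop", "forward_jump", "backward_jump", "zero")

@dataclass
class Block:
    kind: str    # one of BLOCK_TYPES
    length: int  # number of states inside the block (0 for a zero block)

def random_individual(n_blocks=5):
    # One string of blocks; initial lengths 3-6, as in section 2.3.
    return [Block(random.choice(BLOCK_TYPES[:4]), random.randint(3, 6))
            for _ in range(n_blocks)]

ind = random_individual()
print(" ".join(f"{b.kind}:{b.length}" for b in ind))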
We also define zero blocks. A zero block has no states inside it; thus its length is 0, and it does not affect the size of the whole HMM structure. An HMM with these block models is shown in figure 2.
Fig. 2. String representation of an HMM structure
In this scheme, we can make an initial population of fixed-length strings that contain the structural information of an HMM.
2.2 Genetic Operations

Mutations and crossovers can take place in any block of the string. The genetic operations are performed on the strings, which eventually changes the transition matrix of the HMM. Crossover between HMMs is unusual because the models are not of fixed length; that is, not all HMMs have the same number of states. In crossing over, blocks can be taken from any part of the first child and swapped with blocks from a different part of the second child. This is similar to crossover in Genetic Programming. Figure 3 shows the crossover scheme: the last block of the first child crosses with the second block of the second child. The matrix representation changes as the architecture changes.
Fig. 3. Block crossover. Crossover swaps the HMM states without breaking the property of HMM blocks
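A minimal sketch of such a block-level crossover, reusing the Block representation and the random import from the earlier sketch (the single-swap cut-point choice is our simplification):

def block_crossover(parent_a, parent_b):
    # Swap one randomly chosen block between two individuals. Blocks are
    # exchanged whole, so the block types are never broken apart; only
    # the overall architecture (and hence the transition matrix) changes.
    a, b = list(parent_a), list(parent_b)
    i = random.randrange(len(a))
    j = random.randrange(len(b))
    a[i], b[j] = b[j], a[i]
    return a, b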
Mutation can take place in any block. Figure 4 shows four possible mutations of an HMM block with 4 states: a mutation can delete (figure 4(a)) or add (b) a transition in a block, and add (c) or delete (d) a state. To minimize the disruption due to mutation, only a single state is added or deleted.
Fig. 4. Four possible types of mutations. From the forward jump block with 4 states, four types of mutations are possible
Another mutation is the type mutation. A type mutation changes the property of the block: when a block is chosen for type mutation, it is modified into one of the four basic blocks or into the zero block. This increases the diversity of HMM blocks and prevents the GA from losing any one of the block types. Sketches of these mutation operators, in the same illustrative style, follow.
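Continuing the same hypothetical sketch (again reusing Block, BLOCK_TYPES and random from above), the two families of mutation can be written as:

def size_mutation(individual):
    # Add or delete a single state in one block (figure 4 (c) and (d)),
    # minimizing disruption as described above.
    i = random.randrange(len(individual))
    individual[i].length = max(0, individual[i].length + random.choice((-1, 1)))
    return individual

def type_mutation(individual):
    # Change a chosen block into one of the four basic blocks or the
    # zero block, preserving diversity of block types.
    i = random.randrange(len(individual))
    individual[i].kind = random.choice(BLOCK_TYPES)
    if individual[i].kind == "zero":
        individual[i].length = 0
    return individual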
2.3 Training Procedures

For the block HMM, an initial population is created. The initial population consists of strings of fixed length, with the initial lengths of the blocks chosen randomly between 3 and 6. After being created, the individuals are transformed into matrix form; during this procedure the transition and emission probabilities are determined randomly. On every iteration the block HMM alternates Baum-Welch training and the genetic operations. To diminish over-fitting during training, the data are divided into a training set and an evaluation set; the fitness value of each individual is calculated only on the evaluation data set, and one quarter of the sequence data is used as the evaluation set. For the selection operator, stochastic universal sampling was used to reduce genetic drift [6]. Boltzmann selection was used to make the selective pressure invariant under addition or multiplication of a constant to the fitness [7]. The resulting fitness function depends on the standard deviation and the mean of the population fitness, with a term s that controls the strength of the selection.
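In the Boltzmann selection scheme of [7] that this describes, the scaled fitness plausibly takes a form such as

F_i = \exp\!\left( s \, \frac{f_i - \bar{f}}{\sigma} \right),

where f_i is the (evaluation-set) fitness of population member i, \bar{f} is the population mean and \sigma the population standard deviation; subtracting \bar{f} and dividing by \sigma is what makes the selective pressure invariant under addition or multiplication of a constant.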
3 Results

3.1 Simulation I: Coding Region Model of C. jejuni

To investigate the block-HMM's ability to find biologically interpretable solutions, we performed an experiment using 400 sequences from the coding regions of C. jejuni. The sequence data comprised a start codon (ATG), some number of codons and a stop codon (TAA, TAG or TGA). A simple HMM architecture for detecting this region would consist of a 3-state loop. Of the 400 sequences, 300 were used for training and 100 for evaluation. Table 1 shows the parameters used for this experiment. The length of each string was set to 10, which means each HMM has 5 HMM blocks inside it.
Figure 5 shows one of the HMMs found by the GA. All of the blocks have a length of 6, and inside each block the triplet structures are found. The first block of the model (6, -3) has 2 loops; the other 2 states in this block have shrunk during the training. Several results of the simulation can be obtained from http://www.ecs.soton.ac.uk/~kjw02r/blockhmm/result1.html. Like the commonly used 3-state loop model, this model contains the 3-state loop in its architecture. This shows that the proposed approach can replace the hand-coded architecture.
Fig. 5. The result of the block HMM simulation. It shows the triplet model
3.2 Simulation II: Promoter Model of C. jejuni

Unlike other organisms, C. jejuni does not have a conserved sequence in the -35 region [2]. We applied the block HMM to this model to see if it can find a good HMM architecture. For the simulation a population of 30 individuals was used. Because of the complexity of the promoter region, we set the block length to 12 for this region. We checked the transition and emission probabilities for some of the states. Figure 6 shows the whole structure of the model for the promoter of C. jejuni; the TATA box and the ribosomal binding site can both be located in it. This result shows that the block HMM could find the conserved regions without any knowledge of their location or emission probabilities. A TGx is located in front of the TATA box. In figure 6, transitions with probability less than 0.1 are not shown. The full result can be obtained from http://www.ecs.soton.ac.uk/~kjw02r/blockhmm/result2.html.
Fig. 6. The whole structural model for the promoter of C. jejuni
To test the accuracy of the HMMs, a five-fold cross-validation test was conducted. We assume that promoters predicted in the test sequences in the cross-validation experiment are true positives (TP). A window of size 121 bp is scanned across a random sequence of 500,000 bp to measure false positives (FP). A threshold value of the log-odds probability is used to distinguish promoter from non-promoter regions. The number of windows in the random sequence with log-odds greater than the threshold gave a measure of the number of false positives, while the percentage of the test set with log-odds above the threshold gave a measure of the sensitivity. To compare our HMM with those published in the literature, we set the threshold so that we predict 127 sequences to have a promoter in the cross-validation (sensitivity 72%), and predict seven promoters in 500,000 bp of random sequence. This result shows considerable improvement compared to Petersen's result, with a sensitivity of 68% when they found ten promoters in the same random sequence.
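A minimal sketch of this window-scanning evaluation (our own; log_odds stands in for the trained HMM's log-odds score against a background model, and the one-base stride is an assumption):

def count_hits(sequence, log_odds, threshold, window=121):
    # Slide a 121 bp window and count windows scoring above threshold.
    # On a random background sequence this count estimates false
    # positives; on held-out promoter sequences, the fraction of
    # sequences scoring above threshold estimates the sensitivity.
    hits = 0
    for start in range(len(sequence) - window + 1):
        if log_odds(sequence[start:start + window]) > threshold:
            hits += 1
    return hits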
4 Discussions

The goal of this work is to develop a human-interpretable HMM model from sequence information. We used 4 types of HMM blocks to construct an HMM structure. The proposed crossover scheme can search for the optimal structure while keeping interpretable blocks: the crossover plays a crucial role in the block HMM, combining the HMM blocks together, and the mutation enables the blocks to reach a suitable size. In simulation II the block HMM produced a more human-interpretable structure than previous work with GAs. In the simulation of the promoter model, even though it could not find the same model as the hand-coded one, the block HMM found the consensus of the TATA box and the ribosomal binding site easily. In the simulation with the random sequence, the block HMM produced a better result than the hand-coded structure. The strategy of dividing the data into training and test sets enabled the HMMs to be trained without over-fitting. The block HMM shows that well-organized GA methods can generate a human-readable model. An upgraded block method (e.g. unlimited transitions from one block to another block or to the block itself) is possible future work.
References
1. Durbin R., Eddy S., Krogh A., Mitchison G. (1998) Biological sequence analysis. Cambridge: Cambridge University Press.
2. Petersen L., Larsen T. S., Ussery D. W., On S. L. W., Krogh A. (2003) RpoD promoters in Campylobacter jejuni exhibit a strong periodic signal instead of a -35 box. Journal of Molecular Biology, 326(5): 1361-1372.
3. Krogh A., Larsson B., von Heijne G., Sonnhammer E. (2001) Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. Journal of Molecular Biology, 305(3): 567-580.
4. Goldberg D. E. (1989) Genetic Algorithms in Search, Optimization & Machine Learning. Addison-Wesley, Reading, Mass.
5. Yada T., Ishikawa M., Tanaka H., Asai K. (1994) DNA Sequence Analysis Using Hidden Markov Model and Genetic Algorithm. Genome Informatics, Vol. 5, 178-179.
6. Baker J. E. (1987) Reducing bias and inefficiency in the selection algorithm. Proceedings of the Second International Conference on Genetic Algorithms, Lawrence Erlbaum Associates, Hillsdale.
7. Prügel-Bennett A., Shapiro J. L. (1994) An analysis of genetic algorithms using statistical mechanics. Physical Review Letters, 72(9): 1305-1309.
Innovations in Intelligent Agents and Applications Gloria E. Phillips-Wren1 and Nikhil Ichalkaranje2 1
Sellinger School of Business and Management, Loyola College in Maryland, 4501 N. Charles Street, Baltimore, MD 21210 USA 2
[email protected]
School of EIE, University of South Australia, Mawson lakes Campus, Mawson lakes Boulevard SA 5095, AUSTRALIA [email protected]
Abstract. This paper provides an introduction to Session 1 of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference along with a brief summary of the papers in the session.
1 Introduction

Research into software agents has been pursued since the 1980s. However, it was not until the growth of the Internet in the mid 1990s that applications of intelligent agents expanded in an exponential manner [1]. Agent technology can aid and automate complex problem solving such as brokering in electronic commerce, and produce information or knowledge in areas such as financial services [2, 3, 4]. There are no agreed-upon criteria for determining whether a program is a software agent. In general, agents act on behalf of someone [2]. Thus, an agent is a software package that carries out tasks for others autonomously, with the others being human users, business processes, workflows or applications [2,3,4,5]. In some ways, a software agent is just a software program. However, according to Finan and Sherman [5], agents can be differentiated from other types of computer programs by combining the three essential properties of autonomy, reactivity and communication ability. Broadly stated, to achieve common goals agents need to communicate, coordinate and cooperate with both the human user and other agents [5]. Nwana and Ndumu [1] have described the domain of software agents in terms of two types: multi-agent systems and autonomous interface/information agents. Multi-agent systems interconnect agents that may have been developed separately to create a system that has more capability than any one of its elements; an example is a system of scheduling agents for travel. Interface agents are envisioned as proactive systems that could assist the user; an example is a personal software assistant that manages a person's daily schedule and resolves conflicts based on human input. While the potential of intelligent agents is great, developing practical systems has proven problematic. Research areas include designing systems that effectively work together, taxonomies, producing standards to allow interoperability, and producing usable applications [6,7]. Some of these topics are illustrated by the research papers
that formed Session 1 of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference. An introduction to the papers in this session is offered below.
2 Session Papers

The first paper, by Krishnamurthy and Murthy, entitled "Contextual-Knowledge Management in Peer to Peer Computing", investigates the environment of peer-to-peer computer interaction by looking at the context in which the conversation occurs [8]. Personalized services delivered with computer technology need to be cognizant of the context in order to correctly interpret language and automate a response. Eiffel, Java and UML are discussed as potential languages that are powerful enough to implement a "context-based workflow model" between multiple peers. The paper by Murthy and Krishnamurthy entitled "Collaborating Agents in Distributed Networks and Emergence of Collective Knowledge" describes how a set of intelligent agents can collaborate in an electronic auction [9]. They define an agent system in a similar manner to Finan and Sherman [5]: their agent system perceives events, represents information, and acts on the basis of this information. The research focuses on designing an agent system that effectively collaborates to achieve a common goal, and the paper describes agent collaboration for an electronic auction. The authors explain the "stochastic emergence of collective knowledge." Phillips-Wren and Forgionne describe the use of intelligent agent technology to retrieve information for the non-technical user in a technical application field in their paper entitled "Intelligent Decision Making in Information Retrieval" [10]. The paper focuses on a healthcare application in which the information search must be conducted in a rigorous technical database. The non-technical user requires assistance in making decisions about the search, and this can be provided through the use of intelligent agent technology. A paper by Sioutis, Tweedale, Urlings, Ichalkaranje, and Jain entitled "Teaming humans and agents in a simulated world" explores the use of intelligent agents in a hostile environment in which a human is at physical risk [11]. The vision is that humans and intelligent software agents form a team, thereby reducing risk for the human in a real environment. The work explores the theory and design of such systems, and is applied to a simulated hostile environment played as a game with offensive and defensive manoeuvres. The final paper in Session 1 is a contribution by Thatcher, Jain and Fyfe entitled "An Intelligent Aircraft Landing Support System" [12]. Intelligent agents are applied to the prevention of accidents in contemporary aircraft landings. Three autonomous agents coordinate their activities in order to improve safety. The agents have foreknowledge and beliefs about expected behaviors that are specific to the aircraft, airport and landing conditions. The agents communicate to identify dangers and alert the human operators.
Innovations in Intelligent Agents and Applications
73
3 Summary

The research papers in Session 1 of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference advance the field of intelligent agents by offering theory in agent collaboration and design, developing the concept of contextual knowledge, exploring teaming between humans and agents, and applying intelligent agents to important areas of e-commerce, information retrieval, hostile environments, and aircraft landing safety.
References
1. Nwana, H.S. and Ndumu, D.T.: A perspective on software agents research. The Knowledge Engineering Review, Vol. 14(2), 1-18 (1999) (Also available from http://agents.umbc.edu/introduction/hn-dn-ker99.html)
2. Bradshaw, J. (ed.): Software Agents. The MIT Press, Cambridge, MA (1997)
3. Huhns, M. and Singh, M. (eds.): Readings in Agents. Morgan Kaufmann Publishers, Inc., San Francisco, CA (1998)
4. Jennings, N. and Wooldridge, M. (eds.): Agent Technology: Foundations, Applications and Markets. Springer-Verlag, Berlin, Germany (1998)
5. Finan, T. and Sherman, T.: Secure Agent Communication Languages. Accessed from http://www.cs.umbc.edu/lait/research/sacl/ on May 13 (1999)
6. Vinaja, R. and Raisinghani, M.: A multi-attribute profile-based classification for intelligent agents. Proceedings of the Eighth Americas Conference on Information Systems, 1495-1502 (2002)
7. FIPA: Foundation for Intelligent Physical Agents. Accessed from http://www.fipa.org/ on April 28 (2004)
8. Krishnamurthy, E.V. and Murthy, V.K.: Contextual-Knowledge Management in Peer to Peer Computing. Proceedings of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference, Wellington, NZ (2004)
9. Murthy, V.K. and Krishnamurthy, E.V.: Collaborating Agents in Distributed Networks and Emergence of Collective Knowledge. Proceedings of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference, Wellington, NZ (2004)
10. Phillips-Wren, G. and Forgionne, G.: Intelligent Decision Making in Information Retrieval. Proceedings of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference, Wellington, NZ (2004)
11. Sioutis, C., Tweedale, J., Urlings, P., Ichalkaranje, N. and Jain, L.: Teaming humans and agents in a simulated world. Proceedings of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference, Wellington, NZ (2004)
12. Thatcher, S., Jain, L. and Fyfe, C.: An Intelligent Aircraft Landing Support System. Proceedings of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference, Wellington, NZ (2004)
An Intelligent Aircraft Landing Support System Steve Thatcher1, Lakhmi Jain1, and Colin Fyfe2 1
School of Electrical and Information Engineering, University of South Australia, Adelaide, Australia 2 Applied Computational Intelligence Research Unit, The University of Paisley, Paisley, Scotland
Abstract. We discuss the problem of continuing accidents in contemporary aircraft landings and propose three autonomous agents whose task it is to jointly monitor the aircraft and its flight crew. Two of these agents are monitoring the path of the aircraft, one armed with prior knowledge of how planes tend to land at that airport, the other with the ability to project forward from the plane’s current position in order to identify potential dangers. The third agent monitors the flight crew’s behavior. These three agents act together to improve safety in the specific process of landing the aircraft.
Introduction

Over the last century air travel has become extremely safe. This has largely been attributed to the increased mechanical reliability of the turbo jet, coupled with the increased reliability of on-board automated systems and the widespread development and implementation of flight crew training in team management and group effectiveness. Crew (previously Cockpit) Resource Management (CRM) training is now used by airlines all over the world in an effort to increase the safety of their airline operations. There is consensus that CRM has increased the safety of air travel. Thatcher [9] [10] has suggested that a further increase in safety could be achieved if CRM training and techniques were introduced earlier in a pilot's training, at the ab-initio level. However, even with all the advances in aviation safety there remains a stubborn remnant of air crashes which are seemingly not eradicable. Of these accidents, worldwide, Helmreich and Foushee [5] have suggested that 70% are due to flight crew actions or in some cases inactions. This is despite the fact that pilots are extremely technically competent and well trained in CRM. Pilots undergo regular line checks and are assessed frequently in the simulator, in both the technical and the human factors areas. There is no question that flight crews are highly trained to operate in the technical and human environments of the cockpit. This raises the question as to why such accidents happen and, perhaps more disturbingly, continue to happen. It seems that most are due to a momentary loss of concentration or awareness during which the flight crew did not consciously notice that a necessary event did not occur, or that an adverse event did occur. When subsequent events occur, the flight crew attempts to structure these events in terms of their current mental model, or awareness, of the situation. Thus an event can only be
perceived within the framework of the existing paradigm. This is termed situated cognition (Lintern [8]). Data will continue to be perceived and restructured in terms of the mental model until an event happens which forces an unsettling recognition that the pilot's mental model of the world (weltanschauung) is actually false. If this happens too late in a critical process, the result can be an adverse event. This is termed loss of situational awareness. Situational awareness is "the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future" [1] [2]. In terms of situational awareness and automation on the flight deck, Endsley and Strauch [3] maintain that "despite their high reliability, accurate flight path control, and flexible display of critical aircraft related information, automated flight management systems can actually decrease" a flight crew's "awareness of parameters critical to flight path control through out-of-the-loop performance decrements, over-reliance on automation, and poor human monitoring capabilities." Further, pilots can in some respects configure the Flight Management System to present a view of the physical world which supports their interpretation of the world, or their mental model of the current operating environment. Wiener [11] describes reports of pilots creating flight paths to wrong locations which went undetected and resulted in collision with a mountain. This is referred to as a controlled-flight-into-terrain (CFIT) accident. A Flight Safety Foundation (FSF) report concludes that from 1979 through 1991, CFIT and approach-and-landing accidents (ALAs) accounted for 80% of the fatalities in commercial transport-aircraft accidents (Flight Safety Foundation, 2001). The FSF Approach-and-landing Accident Reduction Task Force Report [6] concludes that the two primary causal factors for such accidents are "omission of action/inappropriate action" and "loss of positional awareness in the air". We will investigate the critical period associated with ALAs and CFIT accidents when the primary causal factors occur. In this paper, we propose to develop a trio of intelligent agents which will aid pilots during the critical approach and landing phase. One agent is physically situated on the ground (at air traffic control) and monitors the approaching aeroplane for deviations from normality. The other two agents are situated in the aeroplane itself: one predicts the future trajectory of the aircraft and identifies potential dangers, while the other monitors the actions of the pilot, searching for patterns of behavior which suggest that the flight crew is losing situational awareness, making an inappropriate action, or omitting to make a necessary action; that is, losing the knowledge which each pilot must keep in mind to maintain a mental model of the 4-dimensional (i.e. including time) situation in which the flight crew is placed. The interactions between the flight crew and the three agents form the backbone of a safety critical process.
Existing Technologies

In 1974 the Federal Aviation Administration (FAA) mandated that all heavy airliners be fitted with a GPWS. In 1978 this was extended to all turbine aircraft fitted with 10 or more passenger seats. This has led to a decrease in CFIT accidents; however, as discussed above, there continues to be a large number of fatalities attributed to ALA or CFIT accidents. These early GPWS used information from the radar altimeter and air data computer to determine the aircraft's vertical distance from the terrain below. The system was limited because it only perceived the vertical separation between the aircraft and the ground directly below the aircraft in real time. As a result, the Flight Safety Foundation (FSF) CFIT Task Force recommended that early model GPWS be replaced by the Enhanced GPWS (EGPWS) or Terrain Awareness and Warning Systems (TAWS), which have a predictive terrain hazard warning function. Accordingly, the FAA mandated in 2001 that all heavy transports be fitted with EGPWS and all turbine aircraft with 10 or more passenger seats be fitted with EGPWS after 2003. The EGPWS compares the aircraft's position and altitude, derived from the Flight Management and Air Data computers, with a 20 MB terrain database. In the terrain database the majority of the Earth's surface is reduced to a grid of 9x9 km squares, and each square is given a height index; in the vicinity of airports the grid resolution is increased to squares of 400 m x 400 m. The height index and the aircraft's predicted 3-dimensional position 20 to 60 seconds into the future are compared to see if any conflict exists. If it does, the EGPWS displays an alert or warning to the flight crew. Other than initially alerting the pilots with "TERRAIN" up to 40-60 s before impact, or warning the pilots to "PULL UP" up to 20-30 s before impact, it does not offer any other solution to the potential problem. This research aims to extend the EGPWS by using three intelligent software agents which can plot a course around, or over, possibly conflicting terrain and present a solution to the pilot on the cockpit display system or as input to the autopilot.
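To make the look-ahead comparison concrete, here is a toy sketch of the grid lookup described above; the dead-reckoning extrapolation, the flat-earth coordinate conversion and the clearance margin are our own simplifications, not the EGPWS algorithm itself.

def terrain_conflict(lat, lon, alt_m, vel, terrain, cell_km=9.0,
                     horizon_s=60, margin_m=150):
    # Extrapolate the aircraft 20-60 s ahead and test grid clearance.
    # terrain maps (row, col) grid cells to a height index in metres;
    # vel is (m/s north, m/s east, m/s vertical). Returns the seconds
    # until the first predicted conflict, or None if clear.
    deg_per_m = 1.0 / 111_000  # rough metres-to-degrees conversion
    for t in range(20, horizon_s + 1, 5):
        la = lat + vel[0] * t * deg_per_m
        lo = lon + vel[1] * t * deg_per_m
        alt = alt_m + vel[2] * t
        cell = (int(la * 111 / cell_km), int(lo * 111 / cell_km))
        if alt < terrain.get(cell, 0) + margin_m:
            return t
    return None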
Intelligent Agents

Wooldridge [12] describes an intelligent software agent as a program that performs a specific task on behalf of a user, independently or with little guidance. It performs tasks tailored to a user's needs, with or without humans or other agents telling it what to do. To accomplish these tasks, it should possess characteristics such as learning, cooperation, reasoning and intelligence. By analogy, a software agent mimics the role of an intelligent, dedicated and competent personal assistant. In this application we propose developing three agents, one ground-based and the other two aircraft-based, which will aid pilots during the critical approach and landing phase.

The Anomaly Detection Agent

The anomaly detection agent is situated on the ground in the air traffic control centre. Each airport has its own anomaly detection agent, and each agent is under local control; pilots will no doubt come to judge the effectiveness of different anomaly detection agents at different airports. A typical airport has many safe landings each day. These are recorded by the air traffic control authorities but not used for automatic sensing of dangerous landings: that is the task of the air traffic controller, who has ultimate authority in advising the pilots of danger. We propose creating an agent whose
Beliefs are in two major areas: firstly, the agent retains knowledge of all previously successful landings at that airport. This database can itself be hand-crafted by the (human) air traffic controllers, since there may have been some landings in the past which, despite being successful, followed a pattern of activity which the air traffic controllers deem to be not good practice. Secondly, the agent has beliefs centered on the current landing: the aircraft's height, horizontal distance from the landing strip, speed, heading, lateral distance from the landing strip, type of aircraft, weather conditions and any other factors which affect landing performance.
Desires are that the aircraft lands safely.
Intentions are to do nothing unless the plane is deemed to be deviating from the historical norm. If such a deviation is noted, the agent informs the air traffic controller who has responsibility for the plane, and the pilot himself.
This agent uses anomaly detection as its basic method. Consideration was given to a neural network anomaly detector (e.g. Kohonen's anomaly detector [7]), but because it is critical that the warning clearly identify why it has been raised, an expert system approach was used for this application. Thus a series of "if ... then ..." rules has been created from the database of past successful landings, and the current flight's data are compared with the rules associated with this database (a toy sketch of such a rule check appears at the end of this section).

The Prediction Agent

On board the aircraft we have two agents: the Prediction Agent monitors the aircraft's position, heading, etc., and the Pattern Matching Agent (next section) monitors the pilot's behavior. The Prediction Agent is essentially an improved version of the existing software described above; the improvements are intended to give earlier warning of potential problems. The Prediction Agent has beliefs about:
- the aircraft's position, heading, speed, rate of descent, etc.;
- the landing strip's position;
- weather conditions;
- the surrounding ground topology, particularly where dangers are to be found;
- the pilot (this may be controversial to the Pilots' Unions, but one must concede that different pilots will tackle tasks differently).
Again this agent desires that the plane be landed safely, and it again has the intention of doing nothing unless the patterns it is monitoring match potentially dangerous conditions. It might be thought that the Prediction Agent duplicates the work done by the Anomaly Detection Agent on the ground, but note that it monitors the descent in a very different manner. The Anomaly Detection Agent uses a database of previous landings at that landing strip to ensure that the current landing is bona fide. The Prediction Agent takes its knowledge of the current position etc. and of the local geography to extrapolate the plane's position 5 minutes ahead in order to predict dangerous conditions before they actually occur. This prediction will be done with an artificial neural network trained with standard radial basis function methods [4]. We use radial basis networks rather than the more common multilayered perceptron since they are more inherently modular, dissecting the input space into regions of responsibility for different basis functions. A full description of radial basis function networks is given in [4]. If the prediction suggests danger, the Prediction Agent will contact the Anomaly Detection Agent and the Pattern Matching Agent. The Anomaly Detection Agent can assert that the current landing pattern is within the recognized safe zone, but if it seems to be close to the edges of this zone, an alert will be issued to the pilot and the air traffic controller. The alert to the Pattern Matching Agent will be discussed in the next section.

The Pattern Matching Agent

The Pattern Matching Agent has beliefs about:
- the recent past behavior of the pilot;
- typical behaviors of the current pilot;
- behaviors which are typical of pilots losing situational awareness, performing an inappropriate action or not performing an appropriate action.
Again its desires are that the plane lands safely, and its intentions are to do nothing unless it matches the pilot's current behavior with dangerous practice. The Pattern Matching Agent is equipped with a database of behaviors which are suggestive of, or a prelude to, the loss of situational awareness. In other words, this agent fills the role of a dedicated professional who, sitting in the cockpit, would identify the pilot's actions as worthy of concern. This pattern matching is done by a simple associative artificial neural network [4] which approximately matches existing patterns of behavior to those in the database. We stated above that the Prediction Agent would contact the Pattern Matching Agent when it felt it had identified danger. We did this since we wish all agents to communicate at all times, and so each of the three agents has beliefs about the other two. When the Pattern Matching Agent receives a warning from either of the others, it will respond with a degree of confidence about the pilot's current situational awareness. This will not overrule the others' warnings but may reinforce them.
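A toy illustration of the expert-system style check proposed for the Anomaly Detection Agent; the rule bounds and feature names below are invented for illustration and are not taken from any real landing database.

# Each rule, derived from past successful landings: (feature, low, high).
RULES = [
    ("height_ft",        2500, 3500),   # e.g. at a fixed approach fix
    ("speed_kt",          160,  190),
    ("lateral_offset_m", -200,  200),
    ("descent_fpm",     -1100, -500),
]

def check_landing(flight):
    # Return the rules violated by the current flight's data. An empty
    # list means the approach matches the historical norm; otherwise each
    # entry states exactly why the warning was raised, which is the reason
    # an expert system was preferred here over a neural network detector.
    violations = []
    for feature, low, high in RULES:
        value = flight[feature]
        if not low <= value <= high:
            violations.append(f"{feature}={value} outside [{low}, {high}]")
    return violations

print(check_landing({"height_ft": 4100, "speed_kt": 175,
                     "lateral_offset_m": 20, "descent_fpm": -800}))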
Conclusion

We have identified the specific process of approach and landing, in which accidents persist, as one which might successfully be augmented with intelligent agent technology. We thus have proposed three agents:
1. The first is on the ground and has knowledge of typical landings at the current airport.
2. The second is on board the aircraft and attempts to use the aircraft's current position and heading, together with knowledge of the local geography, to predict potential dangers.
3. The third is also on board the aircraft and monitors the behavior of the flight crew for actions indicative of the loss of situational awareness.
This research is in its early stages, but we consider the interaction between these three agents to be central to the work, and future research will concentrate on this area.
References
1. Endsley, M.R. (1988) Design and evaluation for situational awareness enhancement. Proceedings of the 32nd Annual Meeting of the Human Factors Society, 97-101.
2. Endsley, M.R. (1988) Situation Awareness Global Assessment Technique (SAGAT). Proceedings of the National Aerospace and Electronics Conference, 789-795. New York: IEEE.
3. Endsley, M. & Strauch, B. (1997) Automation and Situation Awareness: The Accident at Cali, Colombia. In the Ninth International Symposium on Aviation Psychology, Columbus, OH.
4. Haykin, S. (1999) Neural Networks: a Comprehensive Foundation. Prentice Hall International.
5. Helmreich, R.L. & Foushee, H.C. (1993) Why Crew Resource Management? In Wiener, E.L., Kanki, B.G. & Helmreich, R.L. (Eds.), Cockpit Resource Management. San Diego: Academic Press.
6. Khatwa & Helmreich (1999) Analysis of Critical Factors During Approach and Landing in Accidents and Normal Flight. Data Acquisition and Analysis Working Group, Flight Safety Foundation Approach-and-landing Accident Reduction Task Force. Flight Safety Foundation, Flight Safety Digest, Nov 1998-Feb 1999.
7. Kohonen, T. (1988) Self-organization and associative memory. Springer-Verlag.
8. Lintern, G. (1995) Flight Instruction: The Challenge From Situated Cognition. The International Journal of Aviation Psychology, 5(4), 327-350.
9. Thatcher, S.J. (1997) Flight Instruction or Flight Facilitation: A Foundation for Crew Resource Management. In the Proceedings of the Ninth International Symposium on Aviation Psychology, Columbus, OH.
10. Thatcher, S.J. (2000) The foundations of crew resource management should be laid during ab initio flight training. In Lowe, A.R. & Hayward, B.J. (Eds.), Aviation Resource Management. Aldershot, England: Ashgate.
11. Wiener, E.L. (1988) Cockpit automation. In E.L. Wiener & D.C. Nagel (Eds.), Human Factors in Aviation, 433-461. San Diego: Academic Press.
12. Wooldridge, M. (2002) An Introduction to MultiAgent Systems. Chichester: John Wiley & Sons Ltd.
Teaming Humans and Agents in a Simulated World Christos Sioutis1, Jeffrey Tweedale2, Pierre Urlings2, Nikhil Ichalkaranje1, and Lakhmi Jain1 1
School of Electrical and Information Eng., University of South Australia
{Christos.Sioutis, L.Jain, Nikhil.Ichalkaranje}@unisa.edu.au 2
Airborne Mission Systems, Australian Defence Science and Technology Organisation {Pierre.Urlings, Jeffrey.Tweedale}@dsto.defence.gov.au
Abstract. Previous research on human-machine teaming [1] argued that a formal specification in human-agent systems could prove beneficial in a hostile environment, and proposed an initial demonstration which has been successfully implemented [2] using a test-bed based on JACK [3] and Unreal Tournament (UT) [4]. This paper describes how to harvest these results and proposes a team for situations where human-agent collaboration is crucial to success. A specific game type within Unreal Tournament is utilised, called Capture The Flag [5]. The proposed team is designed with the Prometheus agent design methodology [6] as a guide and is comprised of humans and agents, each having a specific role.
Introduction

The use of Intelligent Agents (IA) in teaming humans and machines has found a critical place in modern distributed artificial intelligence. The IA teaming domain has attracted a lot of research interest [7-11]. Previous research [1] describes the goal of achieving a paradigm shift in regard to teaming humans with agents based on the Belief-Desire-Intention (BDI) reasoning model [12]. The human-agent team has a complementing role that embraces the Situation Awareness of the human in that environment. Initial research established a stable and powerful test-bed that enabled such concepts using Unreal Tournament (UT) [4]. Interfacing software called UtJackInterface (UtJI) was developed to interface BDI agents developed using JACK [3] to UT. UtJI was verified by developing a simple agent that can achieve small mission goals such as exploring the environment and defending an asset [2]; this stimulated the further work presented here. This research aims to create a team of UT players comprised of a number of agents along with one or more humans. The team is required to win a game of Capture The Flag (CTF) in UT. In the CTF game type there are two teams, each having a base with their own flag which they must defend. Points are scored when the enemy team's flag is captured, returned to the defended home base and allowed to touch the home flag [5]. For this experiment the human player is integrated into the team in order to improve the team's performance and hence reduce the time required to achieve its goals.
Unreal Tournament is a popular computer video game that provides a multi-player simulated 3D world for team-based competitions.
The design of an agent-based system is not a trivial task. There is still no well-defined or mature methodology to follow, and those that do exist require an outcome that can be used to derive suitable action plans. In the last few years there has been increased interest in refining a generic agent-based development methodology [13, 14]. The Prometheus agent-oriented design methodology was developed by RMIT University in association with Agent Oriented Software (AOS), the authors of the JACK BDI-agent development platform [15]. This methodology is relatively new and documentation is still scarce. Nevertheless, Prometheus provides a useful starting point and a helpful guide regarding a number of important issues encountered when designing an agent system. It suggests maximising the power of the BDI execution model through using goals and sub-goals to define tasks whenever possible. Furthermore, it is natural to write many simple, alternative plans of action rather than a single complex plan [16]. The authors argue that this requires the execution engine provided by the agent-development system to provide some important features; in particular, when a specific plan fails, the agent still maintains the goal and tries to achieve it using alternative plans, rather than continuing to retry the same plan [16]. The Prometheus methodology has three stages. In the System Specification, the designer firstly determines external interfaces such as actions, percepts (Prometheus identifies any information that is received directly from the environment as a percept) and permanent data, and then determines system goals, functionalities and scenarios. In the Architectural Design, the designer defines the agents, incidents, interactions and also shared data structures. Finally, in the Detailed Design, the designer defines capabilities, plans, belief structures and events. This paper describes a subset of the design and emphasises the human-agent collaboration [6].
System Specification

There are two types of percepts available to agents in this system as implemented by UtJI, derived from the way it interfaces to UT, specifically the Gamebots [17] protocol. Firstly, synchronous percepts are received in a batch at a configurable interval and provide a range of information for a specific instant in time, including: the agent's state (health, weapons, etc.), the current score, a list of other players in the current field of view and a list of inventory items in the current field of view. Secondly, asynchronous percepts are received when some important events occur in UT, such as: obtaining an inventory item, colliding with a wall, falling off a ledge, enduring damage and receiving a text message. Finally, agents can take action in UT by sending specific commands to it, for example: jumping, running to a specific location, shooting a target, changing weapons, and sending a text message [17]. An important issue to consider here is that the agents occupy the same environment as the human: they have the ability to directly interact with the human and to communicate through text messages. Both entities exchange Situation Awareness via this process. A necessary step in the agent-oriented design methodology is to carefully consider the problem at hand and define any goals that the team needs to achieve (or strive for) to solve the problem. For this project, a number of goals are strategically asserted at the start of a game; these goals are intended to remain in force throughout the duration of the game and govern the pro-active behavior of everyone in the team. A number of agents are also created, incorporated into the team and assigned appropriate roles to achieve these goals. The goals can be better understood when structured in a hierarchy as shown in Fig. 1.
Fig. 1. UnrealAgents goal map
The highest level (primary) goal is WinCTF, which is asserted when the game begins and directs the team to win the game of CTF. It is shown in the centre of the diagram with lower level sub-goals spanning out in a concentric fashion. The full design contains a number of extra sub-levels to cater for different situations; however, only a portion of it is included in this paper due to space constraints. In order to satisfy the primary goal, the humans and agents in the team need to:
- Survive in the game world.
- Achieve scoring points as defined in the game rules (and prevent the other team from scoring).
- Explore the world in two ways: firstly, find important locations and items that are required, and secondly, find more efficient ways to achieve the primary goal and/or sub-goals.
- Communicate with other team members in order to achieve team coordination.
There are four types of data stores needed to achieve the required system functionality. Information data stores are updated directly whenever new percepts are received; they retain any important information that has been previously received, and a number of these have already been implemented as part of UtJI. Tactical data stores contain tactical information that the team has compiled through experiencing different situations; for example, WeaponsTactical contains information about the use of weapons, such as which weapons to use in which situations. Human collaboration data stores contain information obtained through observing and communicating with the human and are used for successfully coordinating with the human troops; for example, HumanAwareness contains information about the human's believed mental model of the world. Finally, Environmental data stores are used by agents to cope with the complexity of the UT environment; for example, TroopMovement can provide hints on what it means to be in a specific location and traveling at a specific velocity in a particular direction. A sketch of the goal hierarchy as a simple data structure is given below.
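As a rough illustration (ours, not the paper's JACK code), the goal hierarchy of Fig. 1 can be written down as a nested structure; the sub-goal names below the four listed goals are hypothetical.

# WinCTF at the centre; each goal maps to its sub-goals (leaves to []).
GOAL_MAP = {
    "WinCTF":      ["Survive", "Score", "Explore", "Communicate"],
    "Survive":     ["MaintainHealth", "AvoidHazards"],
    "Score":       ["CaptureEnemyFlag", "DefendHomeFlag"],
    "Explore":     ["FindKeyLocations", "FindBetterTactics"],
    "Communicate": ["ShareSituationAwareness", "CoordinateRoles"],
}

def assert_goals(goal="WinCTF", depth=0):
    # Print the hierarchy in the order goals would be asserted at game start.
    print("  " * depth + goal)
    for sub in GOAL_MAP.get(goal, []):
        assert_goals(sub, depth + 1)

assert_goals()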
Agent Architectural Design

The proposed design features one team, called UnrealAgents (UA), which encompasses the entire human-agent system. UA is comprised of a number of sub-teams, derived in JACK via UtJI, that engage in a specific subset of the pre-defined goals according to their role in the team. When the game starts, the sub-teams are created to assume their roles. If a human is found in the UT world, he/she is requested to choose a preferred role and join UA. It is required that agents and humans can dynamically be assigned or assume different roles depending on the situation at hand; this may include entering the simulated world as a troop, or assuming command of a sub-team. The three sub-teams within UA are: UADefence, which is responsible for preventing the opposition from scoring; UAOffence, which scores points for the team; and UASupport, which conducts reconnaissance in the UT world and provides support to other sub-teams when requested. A necessity in the system design is to define many instances of execution [15]. Scenarios need to be short and only describe how to satisfy a specific goal. Two example scenarios used with UA are given in Fig. 2: the first describes the steps that the UA team needs to take in order to score a point, and the second describes how UA can prevent the enemy team from scoring points against it.
Fig. 2. Sample scenarios for UT
The diagram shown in Fig. 3 illustrates the overall system. The box on the left of Fig. 3 is a simplified bird's-eye view of the UT world that contains a flag on each side, and a number of circles and triangles spread randomly throughout the environment; circles denote friendly players while triangles denote enemy players. The right side of Fig. 3 is more complex, since it depicts the internal structure of UA. The fact that all agents in UA are implemented using JACK is illustrated by the large rectangle with the thick border. The teams implemented are shown via dotted-line borders, while agents and control links are solid-line ellipsoids and arrows respectively. The highest level team is UnrealAgents, which encloses the UAOffence, UASupport and UADefence sub-teams. There are three types of troops (shown as agents on the left side of the sub-teams), each with specific roles that vary with each sub-team. It is shown that troops have direct control over characters within the simulated world of UT. In addition, all sub-teams have commanders that are responsible for making decisions about how to deploy their troops and can exercise control over them. Finally, UACommand monitors and controls the entire UA team and is responsible for the high-level team-wide decisions required to win the game. UA extends outside the JACK boundary to include a human. This implies that the system considers the human as a part of the UA team; in turn, the human also considers him/herself part of the team with a specified role. It is important to note that a human can assume any of the other roles within UA, in which case the architecture changes.
Fig. 3. UnrealAgents system architecture
Agent Detailed Design

The detailed design phase concentrates on the inner workings of the agents identified in the previous phases. Each agent has a set of capabilities that allows it to perform its intended role. In turn, each capability is formed as an intuitive grouping of a set of tasks that the agent is able to achieve. This section of the paper briefly describes some of the detailed design for agents in the UADefence team. The commanding agent in UADefence is UADefenceCommand. It receives orders from UACommand, makes tactical decisions and in turn asserts goals for the defence team. It has two capabilities: one for defending team assets, and the other for retrieving the home flag when it is stolen. UADefence has three troops, called UADefenceTroop to distinguish them from other team troops. A UADefenceTroop is able to survive in UT using its Move, UseWeapons and MaintainHealth capabilities, and it can also carry out orders received from UADefenceCommand through the tasks encapsulated within the rest of its capabilities. The complete detailed design would involve delving into each of the capabilities themselves and explicitly describing all the different tasks within them; however, a general overview of UnrealAgents is the aim of this paper.
Conclusion and Future Work

This paper describes the design of UnrealAgents, a team comprised of humans and agents that can win a game of CTF in UT. The design uses the Prometheus agent-oriented design methodology as a guide; Prometheus has three stages: system specification, architectural design and detailed design. UA is able to communicate with UT through UtJackInterface, software developed in 2003 and published previously [2]. Its primary goal is WinCTF, which directs UA to win the game of CTF. The design proposes that, in order to satisfy the primary goal, the team needs to be able to survive in the game world, score points, explore the world and collaborate to provide this functionality. A unique feature of the UA design is that humans are considered as a part of the team and, in turn, humans obtain the belief that they are a part of the team. The system is designed so that a human can assume the role of any agent in the team, from making command decisions to entering the world of UT as a troop. Humans that are members of UA achieve their goals more efficiently through enhanced Situational Awareness of the simulated environment, gained by collaborating with other team members during the game. Future work for this project involves the implementation, testing and evaluation of this design using the JACK platform, along with parallel research into agent learning and adaptation in order to implement aspects of improving the team's efficiency.
References
[1] Urlings P., Tweedale J., Sioutis C., and Ichalkaranje N., "Intelligent Agents and Situation Awareness," presented at the 7th International Conference on Knowledge-Based Intelligent Information & Engineering Systems, United Kingdom, 2003.
[2] Sioutis C., Ichalkaranje N., and Jain L., "A Framework for Interfacing BDI Agents to a Real-time Simulated Environment," presented at the 3rd International Conference on Hybrid Intelligent Systems, Melbourne, Australia, 2003.
[3] AOS, "JACK Intelligent Agents," Agent Oriented Software Pty. Ltd. (online, accessed 10/3/2004) http://www.agent-software.com.au/shared/home/
[4] Epic Games, "Unreal Tournament Website" (online, accessed 27/3/2004) http://www.unrealtournament.com
[5] InfoGrames, Epic Games, and Digital Extremes, Unreal Tournament Manual, 2000.
[6] Padgham L. and Winikoff M., "Prometheus: A Methodology for Developing Intelligent Agents," presented at the 1st International Joint Conference on Autonomous Agents and Multiagent Systems, Bologna, Italy, 2002.
[7] Appla D., Heinze C., Goss S., and Connell R., "Teamed Intelligent Agents Software," Defence Operations Analysis Symposium, DOAS 2002, Edinburgh, Australia, May 2002.
[8] Ioerger T., Yin J., and Miller M., "Modeling Teamwork in Multi-Agent Systems: The CAST Architecture," Computer Science, Texas A&M University, 2001.
[9] Urlings P. and Jain L., "Teaming Human and Machine: A Conceptual Framework," in Hybrid Information Systems, Abraham A. and Köppen M., Eds., Heidelberg: Physica-Verlag, 2002.
[10] Kaminka G. A., Veloso M. M., Schaffer S., Sollitto C., Adobbati R., Marshall N. A., Scholer A., and Tejada S., "GameBots: A Flexible Test Bed for Multi-Agent Team Research," Communications of the ACM, vol. 45, pp. 43-45, 2002.
[11] Osawa E., Kitano H., Asada M., Kuniyoshi Y., and Noda I., "RoboCup: The Robot World Cup Initiative," presented at the Second International Conference on Multi-Agent Systems, ICMAS-96 Proceedings, Menlo Park, CA, USA, 1996.
[12] Wooldridge M., Reasoning About Rational Agents, Intelligent Robotics and Autonomous Agents series, Cambridge, Massachusetts/London, England: MIT Press, 2000, 240 pp.
[13] Zambonelli F., Jennings N. R., and Wooldridge M., "Developing Multiagent Systems: The Gaia Methodology," ACM Transactions on Software Engineering Methodology, vol. 12, pp. 317-370, 2003.
[14] Wooldridge M., Jennings N. R., and Kinny D., "The Gaia Methodology for Agent-Oriented Analysis and Design," Autonomous Agents and Multi-Agent Systems, vol. 3, pp. 285-312, 2000.
[15] Padgham L. and Winikoff M., "Prometheus: A Pragmatic Methodology for Engineering Intelligent Agents," presented at the Workshop on Agent-Oriented Methodologies at OOPSLA'02, 2002.
[16] Padgham L., "Design of Multi Agent Systems," tutorial presented at Net.ObjectDays, Erfurt, Germany, 2002.
[17] Marshall A. N., Gamard S., Kaminka G., Manojlovich J., and Tejada S., "Gamebots: Network API" (online, accessed 1/3/2004) http://planetunreal.com/gamebots/docapi.html
Contextual-Knowledge Management in Peer to Peer Computing
E.V. Krishnamurthy1 and V.K. Murthy2
1 Australian National University, Canberra, ACT 0200, Australia [email protected]
2 Australian Defence Force Academy, Canberra, ACT 2600, Australia [email protected]
Abstract. In a pervasive computing environment consisting of peers (clients/servers or agents), contextual knowledge is an important feature to be embedded. Here, the traditional transaction model needs to be replaced by a "workflow model" between several peers that interact, compete and cooperate. Eiffel, the iContract tool for Java, and UML are powerful languages with which to implement the Peer-Peer-Pervasive-Program. They provide program constructs essential for dealing with the uncertain nature of the connectivity of pervasive devices and networks, and with the trial-and-error (subjunctive) nature of the processes and programs used in E-commerce and robotics.
1 Introduction The Oxford dictionary defines "context" as "the circumstances in which an event occurs". In our daily lives, where pervasive and ubiquitous computing systems (consisting of agent-based and peer-to-peer systems) are going to play a central role in providing comprehensive services, contextual dynamics plays an important role in offering personalized services for various applications, e.g., medical services, robotics and security monitoring. Accordingly, contextual-knowledge management is an important issue in acquiring and manipulating information and reacting to the situation. In the pervasive computing environment, the traditional transaction model needs to be replaced by a more realistic model, called a "workflow model", between peers that interact, compete and cooperate, realising a pervasive peer-peer program (PPPP). The various types of tasks that arise in many pervasive applications (hospital admission, E-checking, shipping, purchasing and market forecasting, virtual hospitals) require a context-aware programming approach consisting of intention, context and actions; as a consequence, they require a subjunctive or "what-if" programming approach that executes hypothetical or pseudo-transactions to test the intention of actions for trial-and-error design.
2 Background A pervasive computing environment consists of fixed and mobile computers linked together by a network (wired or wireless) so that they can communicate among
themselves using messages. The peers can receive inputs from either sensors or humans and generate suitable outputs to a fixed host (or another peer) via a fixed or wireless link. Fixed hosts provide mobile application services and information to mobile hosts. Each peer supports query invoking and information filtering from fixed hosts to provide personal information services and pervasive computing. A pervasive object programming system (POPS) can be interpreted as a collection of objects interacting through messages. Each object maintains its own share of data and has its own program piece to manipulate it; that is, each object combines data structure and functionality. Some of these objects can be active and behave like actors in a movie, each following its own script and interacting with other objects. A task submitted from a peer is called a pervasive workflow. It is a context-aware distributed task that can be executed partly within the peer itself, as internal transactions (Intrans), and partly in other peers (or fixed hosts (FH)), as external transactions (Extrans). Each peer has a coordinator (PC) that receives external transaction operations from mobile hosts and monitors their execution in the database servers within the fixed host; similarly, every other peer has a coordinator. Conventionally, transactions have to satisfy the ACID properties, namely: Atomicity: all or none of a transaction happens; Consistency: a transaction preserves the consistency of the database before and after its execution; Isolation: intermediate results are not made externally visible until commitment; Durability: the effects are made permanent when a transaction succeeds, and are recovered under failure. The ACID properties turn out to be restrictive for context-aware pervasive transactions and need to be relaxed, as illustrated in the following example. Example: Coordinating Transactions. In the emergency admission and treatment of a patient in a hospital, the hospital may be required to embed the context of the patient's health history by querying information from the patient's doctor's database and from the pharmacy that issued the medicine. The patient record is obtained from the Medicare number and the family and first names. This requires matching the attributes of the patient identity and acquiring the items from the database of the patient's doctors and the databases of one or more pharmacists. This coordination will require updates to prescription, medication and other pertinent relations, and has to be formulated as queries in a coordination formula, Serafini et al. [18]. The coordination formula can express both constraints and queries. The evaluation of this formula will involve the decomposition of queries into subqueries in another database with a different schema. This whole task is not a conventional transaction but a long-duration transaction, and we may not be able to enforce ACID properties.
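To make the example concrete, the fragment below is a minimal Java sketch of such a long-duration coordination task: each peer database is queried independently, with no global lock or two-phase commit spanning the peers, so atomicity and isolation are deliberately relaxed. The Peer interface, the method names and the use of a plain string key are illustrative assumptions, not an API from the paper.

```java
// Hedged sketch: a long-duration coordination task decomposed into
// sub-queries on independently administered peer databases.
import java.util.ArrayList;
import java.util.List;

class CoordinationExample {
    interface Peer {                      // a doctor's or pharmacist's database
        List<String> query(String matchingAttributes);
    }

    // The matching attributes of the patient identity (Medicare number,
    // family and first names) drive a sub-query in each peer's own schema.
    static List<String> gatherHistory(List<Peer> peers, String patientKey) {
        List<String> history = new ArrayList<>();
        for (Peer p : peers) {
            // Each sub-query is executed as an independent external
            // transaction (Extran); intermediate results become visible
            // to the hospital as they arrive, so ACID isolation does not hold.
            history.addAll(p.query(patientKey));
        }
        return history;                   // merged context for the hospital
    }
}
```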
3 Context Context plays a vital role in everyday life. Even a driving task on the road is highly context dependent. Here, communication takes place among the drivers through message passing. In simple cases, the context specifies: Who acts (name of individual
or role), on What (service object or person), Where (location, address), When (time), How (description of action), and Why (intention and necessity for action); Barkhuus [1], Brezillon [2]. For example, the car-driving task clearly specifies that a driver (who) driving a car or a truck (what) signals at some intersection (where), at a specified time (when), through a turn signal (how), since he is turning (why). In this case, the context information has to be evaluated by the other drivers, for suitable road conditions, to trigger appropriate new contexts and actions. Numerous examples can be constructed to understand the role of various contexts in driving. In more general situations, context can be classified under several categories; Jang and Woo [7], Louca [12], Munoz et al. [14], Satyanarayana [16]. In pervasive computation, the context can specify other details, e.g., user characteristics or the weather. Context as a Dynamical Object Context is a dynamic entity. We need to assume that the evolution of the context dynamics is not too fast for the sensor, so that the system is able to react on the fly. The object-oriented model is useful for representing context dynamics, since it permits operational and attribute descriptions, context-sensitive referencing, inheritance, collaboration and contracts, and dynamic routing; Orfali et al. [15]. Contextual Knowledge Contextual knowledge can be represented by a directed graph whose nodes denote actions and whose edges denote contexts, Brezillon [2]. A more powerful model is the Petri net (including the timed Petri net [11]). In the Petri net model, contexts can be represented by tokens in places and actions by transitions. Active and Passive Contexts There are two types of contexts: active and passive. An active context can directly trigger an action, as in an involuntary action, while in a passive context the user is prompted and made aware of the context in order to execute a voluntary action; e.g., a mobile phone can automatically adapt to a new time zone (active) or it can inform the user about the change in time zone (passive). Context, Intention and Actions In practice, not only the context but also the intention plays an important role in pervasive computing. In practical real-life applications, intention, context and action are interrelated and can enable, disable (inhibit) each other or remain dormant. Thus, to have a successful implementation of pervasive computing, we need a suitable set of rules (protocols) to deal with intention-context-action. Such a protocol specifies the required precondition for an appropriate action and is highly problem-domain dependent. The precondition can be a two-valued logical predicate (true or false) in the case of hard computation, or a fuzzy or probabilistic measure in soft computation. An essential requirement in the above cases is the introduction of attribute tolerance in the context and intention, so that the preconditions need not be strictly boolean predicates, but fuzzy measures having tolerances in their
parameters. Also, the actions performed can be exact or approximate, depending upon the context. Context Management In a pervasive environment, we need to set up a special context database with a private interface for each context. The context database can store the execution history of the context. Context evaluation involves a search, and the evaluation of the truth or falsity of boolean predicates or other measures. This requires: (i) a context monitor that monitors the values of context parameters, e.g., location and speed; (ii) a context server that stores information about the entities in the environment. Entities are people, places, things, time or other attributes, but this data structure can be extended to other types of context data. The context server can serve either actively or passively. Any peer can request, or the monitor can prompt and trigger, the necessary action if the context arises. Context Evaluation Context evaluation can be interpreted as the query evaluation function in a database, and the action can be interpreted as the updating function for a set of database instances. Hence, if one or several context conditions hold for several non-disjoint subsets of data at the same time, the choice made among them can be nondeterministic or probabilistic. This results in competing contexts. Such cases require careful evaluation of the strength of each context to decide which of the actions are to be done, based on the subsequent support received in real-life applications. However, if the context condition holds for several disjoint subsets of data at the same time, the actions can take place concurrently. This results in cooperative contexts. Context evaluation can be a time-consuming, long-duration operation. Therefore, we need to devise a suitable measure for the relevance (strength) of the context, and in certain situations also test whether the action performed has produced the desired outcome (support). Concurrency and Conflicts In pervasive computing, we need to consider how to speed up the system by permitting concurrent transactions between several peers. This requires analysing how the respective internal and external transactions interfere with each other when they are applied under varying conditions of context, intention and actions. That is, a previous action, intention and context can create the required precondition, and the resulting new action should ensure that the appropriate post condition (new context, new intention) is created after performing the new action. It is well known that the following two conditions are to be satisfied for global serialization: 1. At each peer, the local schedule performs the actions in a non-conflicting order with respect to intention and context (local serializability). 2. At each peer, the serialization order of the tasks dictated by every other peer is not violated. That is, for each pair of conflicting actions among transactions p and q, an
action of p precedes an action of q in any local schedule, if and only if, the preconditions required for p do not conflict with the preconditions required for the execution of the action of q in the required ordering of all tasks in all peers (global serializability). The above two conditions require that the preconditions for actions in different peers P(i) and P(j) do not interfere or cause conflicts; in fact, we can end up with deadlock cycles. Context Validation Through Contracts In a typical context-management situation the contract manager implements an event-oriented flow management by using a precondition "require" and a post condition "ensure", as in the Eiffel language.
4 Contract Based Workflow A workflow is a collection of tasks (including conventional transactions) organized to accomplish some business activity. Each task defines some unit of work to be carried out. A workflow ties together a group of tasks by specifying execution dependencies, constraints and the dataflow between tasks. In a workflow we need to ensure that certain conditions hold. This feature enables customers to decide just before committing whether the workflow satisfies the conditions; if not, the customer can reject or abort the workflow, since the contract is violated. Workflows, External and Internal: A global workflow (we call it an external transaction, or Extran) T(ij) is defined as a collection of tasks between two objects O(i) and O(j); it consists of a message sent from O(i) to execute a desired action in O(j), and this message is received by O(j). O(j) has a behaviour specified by: Pre(T(ij)), G(j), F(j), Post(T(ij)), where Pre() and Post() are respectively the pre- and post-states that are active before and after the transaction T(ij), G(j) is a guard of O(j), and F(j) is the command function, consisting of operations that map values to values in local domains (the operations used in G(j) and F(j) are assumed to be defined) and of the sending of messages. Thus the script specifies the context, namely, what message O(j) can accept and from whom, and what actions it has to perform when it receives the message while in state Pre(T(ij)) in order to satisfy the post condition Post(T(ij)). The Extran T(ij) can trigger numeric, symbolic or database computations in O(j). Each Extran T(ij) triggers a set of serializable computations in O(j), either in a total order or in a partial order, depending upon whether parallelism, concurrency and interleaving are possible locally within O(j). If the object O(j) is "made up" of subobjects, we may have to execute a workflow consisting of several local workflows (called internal transactions, or Intrans). After executing an Intran, the system reaches a new state s' from the old state s, such that s' = F(j)(s) and Post(T(ij)) holds in s', using the command set F(j). This is based on the contract approach, Meyer [13], which is widely used in the language Eiffel. The precondition is specified by "require" and the post condition by "ensure"; Jezequel et al. [8], Kramer [10], Meyer [13], Thomas & Weedon [19], Warmer & Kleppe [20], Clark & Warmer [4].
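To make the contract idea concrete, the fragment below is a minimal Java sketch (all names are illustrative assumptions, not code from the paper) of an object O(j) whose behaviour is the tuple Pre(T(ij)), G(j), F(j), Post(T(ij)): a message is accepted only when the precondition and guard hold, the command function produces the new state, and the postcondition is checked before the result is committed.

```java
// Minimal sketch of the Pre/G/F/Post behaviour tuple described above.
// The State type, the predicates and the command function are
// illustrative assumptions, not an API defined in the paper.
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

class ContractedObject<State> {
    private final Predicate<State> pre;      // Pre(T(ij)): "require"
    private final Predicate<State> guard;    // G(j)
    private final UnaryOperator<State> f;    // F(j): command function
    private final Predicate<State> post;     // Post(T(ij)): "ensure"
    private State state;

    ContractedObject(Predicate<State> pre, Predicate<State> guard,
                     UnaryOperator<State> f, Predicate<State> post, State s0) {
        this.pre = pre; this.guard = guard; this.f = f; this.post = post;
        this.state = s0;
    }

    // Accept an Extran message: s' = F(j)(s), with Post checked on s'.
    void receive() {
        if (!pre.test(state) || !guard.test(state))
            throw new IllegalStateException("require/guard violated");
        State next = f.apply(state);
        if (!post.test(next))
            throw new IllegalStateException("ensure violated"); // contract fails
        state = next;                        // commit the Intran result
    }
}
```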
If a crash occurs and the contract fails, an exception is raised; here, three possibilities arise: a. The exception is not justified: it is a false alarm, and we may ignore it. b. If we anticipated the exception when we wrote the routine and provided an alternative way to fulfil the contract, then the system will try that alternative. This is called resumption, Meyer [13]. c. If, however, we are still unable to fulfil the contract, we go into graceful degradation, or surrender with honour: we bring all the objects to an acceptable (precommitted) state and signal failure. This is called organized panic, and it should restore the invariant. At this point we initiate a retry; the effect of retry is to execute the body of the routine again. In Eiffel the rescue clause does all the above (this is essentially RESTART after recovery in transaction processing). Retry/Rescue and Reset: if it is a false alarm then retry; else rescue and restart, so that the invariants in all objects are reset to their pre-action state. Role of Extrans and Intrans A local commit of an Intran is an intentional commit that contains all the relevant details for an action commit of the Extran; the latter is bound by a contract between the peers that contains the required time tolerance (timeliness) and other attribute tolerances. Role of Agents As observed by Koubarakis [9], peer-to-peer systems and agents are very similar, in the sense that the former is a concrete hardware realization of the abstract notion of agents [3, 21]. Hence the arguments presented in this paper can be extended to agent-based systems, including agent-based speech acts, Huget and Woolridge [6].
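A rough Java analogue of the rescue/retry discipline just described might look as follows; the routine names are hypothetical stand-ins, and Eiffel expresses the same pattern natively with its rescue and retry constructs:

```java
// Rough Java analogue of Eiffel's rescue/retry discipline described
// above; attemptPrimary/attemptAlternative are hypothetical stand-ins.
class RetryRescue {
    static void fulfilContract() {
        boolean useAlternative = false;           // set after the first failure
        for (int attempt = 0; attempt < 2; attempt++) {
            try {
                if (useAlternative) attemptAlternative();  // resumption
                else attemptPrimary();
                return;                           // contract fulfilled
            } catch (IllegalStateException e) {   // contract violation
                useAlternative = true;            // Eiffel: rescue, then retry
            }
        }
        restoreInvariantAndSignalFailure();       // "organized panic"
    }

    static void attemptPrimary()     { /* body of the routine */ }
    static void attemptAlternative() { /* anticipated fallback */ }
    static void restoreInvariantAndSignalFailure() {
        // bring all objects back to an acceptable (precommitted) state
    }
}
```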
5 Language Support Eiffel [13,19], Java and UML [5] are powerful languages with which to implement a pervasive object programming system. They provide software contracts that capture mutual obligations using the program constructs "require [else]" (precondition) and "ensure [then]" (post condition). Eiffel provides for exception handling through a "rescue" clause, and a "retry" clause for dealing with recovery and resumption. The tool called iContract, Kramer [10], provides design-by-contract support for Java. The Unified Modelling Language (UML) [5] has also been gaining importance; OCL (Object Constraint Language) [20] is used along with UML and can be used to add design-by-contract information in UML, Sendall & Strohmeier [17]. These languages provide program constructs to take care of the unpredictable nature of the connectivity of mobile devices and networks, as well as the trial-and-error program design required in contextual-knowledge management.
6 Conclusion In the pervasive computing environment, the traditional transaction model needs to be replaced by a "context-based workflow model" between several peers that interact, compete and cooperate. The various types of workflow patterns that arise in such cases require a subjunctive or "what-if" programming approach, consisting of intention and actions for trial-and-error design, as well as contextual knowledge. Eiffel, the iContract tool for Java, and UML are powerful languages with which to implement the Peer-Peer-Pervasive-Program. They provide program constructs that can deal with the uncertain nature of the connectivity of pervasive devices and networks, and with the trial-and-error (subjunctive) nature of the processes and programs used in context-based information processing.
Acknowledgment The authors thank the reviewers for helpful suggestions in revising this paper.
References
1. Barkhuus, L., How to define the communication situation: Determining context cues in mobile telephony, Lecture Notes in Artificial Intelligence, Vol. 2680, Springer Verlag, New York (2003) 411-418.
2. Brezillon, P., Context dynamic and explanation of contextual graphs, CONTEXT 2003, Lecture Notes in Artificial Intelligence, Vol. 2680, Springer Verlag, New York (2003) 94-106.
3. Chen, Q. and Dayal, U., Multi agent cooperative transactions for E-commerce, Lecture Notes in Computer Science, Vol. 1901, Springer Verlag, New York (2000) 311-322.
4. Clark, A. and Warmer, J., Object Modeling with the OCL, Lecture Notes in Computer Science, Vol. 2263, Springer Verlag, New York (2002).
5. Gogolla, M. and Kobryn, C. (Eds.), Lecture Notes in Computer Science, Vol. 2185, Springer Verlag, New York (2001).
6. Huget, M-P. and Woolridge, M., Model checking for ACL compliance verification, ACL 2003, Lecture Notes in Artificial Intelligence, Vol. 2922, Springer Verlag, New York (2004) 75-90.
7. Jang, S. and Woo, W., Ubi-UCAM: A unified context-aware application model, Lecture Notes in Artificial Intelligence, Vol. 2680, Springer Verlag, New York (2003) 178-189.
8. Jezequel, M. et al., Design Patterns and Contracts, Addison Wesley, Reading, Mass. (2000).
9. Koubarakis, M., Multi agent systems and peer-to-peer computing, in Cooperative Information Agents VII, Lecture Notes in Computer Science, Vol. 2782 (2003) 46-62.
10. Kramer, R., iContract - the Java design by contract tool, 26th Conference on Technology of Object-Oriented Systems (TOOLS USA '98), Santa Barbara (1998).
11. Krishnamurthy, E.V., Parallel Processing, Addison Wesley, Reading, Mass. (1989).
12. Louca, J., Modeling context-aware distributed knowledge, Lecture Notes in Artificial Intelligence, Vol. 2926, Springer Verlag, New York (2003) 201-212.
13. Meyer, B., Applying design by contracts, IEEE Computer 25(10) (1992) 40-52.
14. Munoz, M.A., Rodriguez, M., Garcia, A.I., and Gonzalez, V.M., Context aware mobile communication in hospitals, Computer, Vol. 36(9) (2003) 38-47.
15. Orfali, R., et al., The Essential Distributed Objects, John Wiley, New York (1996).
16. Satyanarayana, M., Challenges in implementing a context-aware system, Editorial, IEEE Pervasive Computing, Vol. 1 (2002) 2-3.
17. Sendall, S. and Strohmeier, A., Specifying concurrent system behaviour and timing constraints using OCL and UML, Lecture Notes in Computer Science, Vol. 2185, Springer Verlag, New York (2002) 391-405.
18. Serafini, L., et al., Local relational model: A logical formalization of database coordination, Lecture Notes in Artificial Intelligence, Vol. 2680, Springer Verlag, New York (2003) 286-299.
19. Thomas, P. and Weedon, R., Object-Oriented Programming in Eiffel, Addison Wesley, Reading, Mass. (1998).
20. Warmer, J. and Kleppe, A., The Object Constraint Language, Addison Wesley, Reading, Mass. (1999).
21. Woolridge, M., An Introduction to Multi-Agent Systems, John Wiley, New York (2002).
Collaborating Agents in Distributed Networks and Emergence of Collective Knowledge
V.K. Murthy1 and E.V. Krishnamurthy2
1 UNSW@ADFA, University of New South Wales, Canberra, ACT 2600, Australia [email protected]
2 Australian National University, Canberra, ACT 0200, Australia [email protected]
Abstract. We describe how a set of agents can collaborate in E-marketing; in particular, we consider E-auction. We also give a theoretical basis for detecting the termination of the collaboration without indefinite cycling. We further discuss the possibility of self-organized criticality among interacting agents, in which there is a stochastic emergence of collective knowledge due to each agent's internal reasoning, as well as the incremental knowledge obtained from interactions with other agents.
1 Introduction The AOIS (agent-oriented information systems) community defines an agent thus: a system that is capable of perceiving events in its environment, of representing information about the current state of affairs, and of acting in its environment guided by perceptions and stored information. There have been several proposals for the agent-based paradigm [1,2,3,6,9,12]. Agents can be classified according to their functionality as: collaborative agents that compete or cooperate; interface agents that act as personal assistants; mobile agents that migrate among hosts to enhance the efficiency of computation and improve network throughput; information agents that manage, manipulate and collate information from many distributed sources; reactive agents that respond to stimuli in the environment where they are embedded; smart agents that learn from their actions; and hybrid agents that can combine any of the functionalities of the above agents. In this paper we use the integrated model described in [6], which consists of the salient features of several agent paradigms [1,2,3,9,12]. This model has the simplicity and adaptability required for realisation as a distributed transaction-based paradigm for negotiation and other E-marketing problems. The nature of the internal condition-event-action rules, their mode of application and the action set of an agent determine whether the agent is deterministic, nondeterministic, probabilistic or fuzzy. The rule application policy in a condition-event system can be modified by: (1) assigning probabilities/fuzziness for applying a rule; (2) assigning a strength to each rule by using a measure of its past success; (3) introducing a support for each rule by using a measure of its likely relevance to the current situation.
The above three factors provide for competition and cooperation among the different rules [7]. In particular, a probabilistic rule system can lead to emergence and self-organized criticality, resulting in smart agents [4], [8].
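As an illustration, the sketch below shows one way the three weighting factors could drive rule selection by roulette-wheel choice; the Rule record and the multiplicative scoring are assumptions for demonstration, not a formula taken from the paper.

```java
// Minimal sketch of the three rule-weighting factors listed above:
// probability, strength (past success) and support (current relevance).
import java.util.List;
import java.util.Random;

class RuleSelector {
    record Rule(String name, double probability, double strength, double support) {
        double weight() { return probability * strength * support; }
    }

    // Roulette-wheel choice: rules compete in proportion to their weight.
    static Rule choose(List<Rule> applicable, Random rng) {
        double total = applicable.stream().mapToDouble(Rule::weight).sum();
        double r = rng.nextDouble() * total;
        for (Rule rule : applicable) {
            r -= rule.weight();
            if (r <= 0) return rule;
        }
        return applicable.get(applicable.size() - 1);
    }
}
```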
2 What Is Collaboration? "Collaboration" is an interactive process among a number of agents that results in varying degrees of cooperation and competition, and ultimately in a commitment that leads to total agreement, consensus or disagreement. Agents connected by a network and sharing a common knowledge base exchange private knowledge through transactions and create new knowledge. Each agent transacts its valuable private knowledge with other agents, and the resulting transactional knowledge is shared as common knowledge. Agents may benefit by exchanging their private knowledge if their utility is thereby increased; knowledge is traded if, and only if, their utilities can be improved [11]. If, during a transaction, the difference between the external and internal knowledge is positive, this difference is added to private knowledge; otherwise it is treated as common knowledge. A collaboration protocol is viewed as a set of public rules that dictate the conduct of an agent with other agents so as to achieve a desired final outcome in sharing knowledge and performing actions that satisfy a desired goal under some utility functions. A directed graph can be used to represent a collaboration process. Such a directed graph, which expresses the connectivity relationship among the agents, can be real or conceptual, and dynamic or static, depending upon the problem at hand. Multiple agents can interact to achieve a common goal, completing a task to aid the customer. The interaction follows rule-based strategies that are computed locally by each agent's host server. Here, competing offers are to be considered; occasionally cooperation may be required. Special rules may be needed to take care of risk factors, domain-knowledge dependencies between attributes, and positive and negative end conditions. When making a transaction, several agents have to deliberate and converge to some final set of values that satisfies their common goal. Such a goal should also be cost effective, so that the agreed state is reached at the minimum cost of a utility function. To choose an optimal strategy, each agent must build a plan of action and communicate with other agents.
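A minimal sketch of the knowledge-trading rule above, with knowledge reduced to a single numeric value purely for illustration (the paper does not prescribe a representation):

```java
// Sketch of the knowledge-exchange rule described above; the numeric
// "knowledge value" is an illustrative assumption.
class KnowledgeTrade {
    double privateKnowledge;
    double commonKnowledge;

    // A positive difference between the external (received) and internal
    // knowledge is added to private knowledge; otherwise the received
    // knowledge is treated as common knowledge.
    void receive(double externalKnowledge) {
        double diff = externalKnowledge - privateKnowledge;
        if (diff > 0) privateKnowledge += diff;
        else          commonKnowledge += externalKnowledge;
    }
}
```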
3 Collaboration as a Transactional Paradigm Human collaboration uses an act-verify strategy through preconditions and actions. This process has a similarity to the transaction-handling problem, for each transaction is an exploratory, non-pre-programmed, real-time procedure that uses a memory recall (Read), acquires new information and performs a memory revision (Write). Each transaction is, in addition, provided with a facility for repair (recovery: Undo), much like the repair process encountered in human problem solving. In human problem solving, several independent or dependent pieces of information are acquired from various knowledge sources, and their consistency is verified before completing a step
of the solution to achieve each sub-goal; this process corresponds to committing a sub-transaction in a distributed transaction-processing system before proceeding to the next level of sub-goals arranged in a hierarchy. Thus the transactional approach provides a propose, act and verify strategy by offering a non-procedural style of programming (called 'subjunctive programming') that is well suited for agent-based collaboration [6].
4 Agent Collaboration Protocol We now describe how agents can collaborate by sending, receiving, hand-shaking and acknowledging messages, and by performing some local computations. A multi-agent collaboration protocol has the following features:
1. There is a seeding agent who initiates the collaboration.
2. Each agent can be active or inactive.
3. Initially all agents are inactive, except for a specified seeding agent, which initiates the computation.
4. An active agent can do local computation, send and receive messages, and can spontaneously become inactive.
5. An inactive agent becomes active, if and only if, it receives a message.
6. Each agent may retain its current belief, or revise or update its belief, as a result of receiving a new message, by performing a local computation. If it modifies its belief, it communicates the new belief to the other concerned agents; otherwise it does not modify its belief and remains silent.
7. The collaboration leads to a finite number of states.
8. The collaboration process has no infinite loop and reaches a terminal state.
In order for the collaboration protocol (C-protocol) to be successful, we need to ensure that all the above properties hold and that the process ultimately terminates. For detecting termination we describe an algorithm, called the "Commission-Savings-Tally Algorithm" (COSTA), that can detect the global termination of a C-protocol. This is a general algorithm; we will apply it to an example of E-auction. Let us assume that the N agents are connected through a communication network represented by a directed graph G with N nodes and M directed arcs. Let us denote the outdegree of each node i by Oud(i) and its indegree by Ind(i). We also assume that an initiator, or seeding agent, exists to initiate the transactions. The seeding agent (SA) holds an initial amount of money C. When the SA sends a data message to other agents, it pays a commission of C/(Oud(SA) + 1) to each of its agents and retains the same amount for itself. When an agent receives a credit it does the following:
a. Let agent j receive a credit C(M(i)) due to some data message M(i) sent from agent i. If j passes on data messages to other agents, j retains C(M(i))/(Oud(j)+1) for its credit and distributes the remaining amount to the other Oud(j) agents. If there is no data message from agent j to others, then j credits C(M(i)) for that message in its own savings account; this savings will not be passed on to any other agent, even if some other message is eventually received from another agent.
b. When no messages are received and no messages are sent out by an agent, it waits for a time-out and then sends, broadcasts or writes on a transactional blackboard its savings-account balance to the initiator.
c. The initiator, on receiving the broadcast messages, adds up the savings accounts of all the agents and its own, and verifies whether the total tallies to C.
d. In order to store savings and transmit commissions, we use an ordered pair of integers to denote a rational number, and we assume that each agent has a provision to handle exact rational arithmetic (see the sketch following this list). If we assume C = 1, we only need to carry out multiplication and store the denominator of the rational number.
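The following is a minimal sketch of the COSTA credit bookkeeping under rules a-d, with exact rational arithmetic as rule d requires; the Fraction helper and the method names are illustrative assumptions, and the message-passing layer is omitted.

```java
// Sketch of COSTA credit bookkeeping; messaging is omitted.
import java.math.BigInteger;

class Costa {
    record Fraction(BigInteger num, BigInteger den) {
        static Fraction one() { return new Fraction(BigInteger.ONE, BigInteger.ONE); }
        Fraction divide(int k) {
            return new Fraction(num, den.multiply(BigInteger.valueOf(k)));
        }
        Fraction add(Fraction o) {
            return new Fraction(num.multiply(o.den).add(o.num.multiply(den)),
                                den.multiply(o.den));
        }
    }

    // Rule a: on forwarding a message carrying credit c, an agent with
    // out-degree Oud keeps c/(Oud+1) and ships the same share to each
    // of its Oud recipients.
    static Fraction share(Fraction c, int oud) {
        return c.divide(oud + 1);
    }

    // Rule c: the initiator adds up every agent's savings (rule b) and
    // declares termination exactly when the total tallies to C = 1.
    static boolean terminated(Fraction totalSavings) {
        return totalSavings.num().equals(totalSavings.den());
    }
}
```

With C = 1 and each transmission in the E-auction example below going to three neighbours, every share is 1/4 of the incoming credit, which is consistent with the common denominator 4096 = 4^6 in the worked example of Section 5.1.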
We state the following theorems to describe the validity of the above algorithm; see [6] for proofs. Theorem 1: If there are collaboration cycles that correspond to indefinite arguments among the agents (including the initiator), then the initiator cannot tally its sum to C. Theorem 2: The above algorithm (COSTA) terminates if and only if the initiator tallies the sum of the savings of all the agents to C, i.e., the common resource is not wasted and all the agents have reached an agreement on their beliefs. Thus termination can happen only if the total sum tallies to C. We use the above algorithm for an E-auction with an auctioneer and a set of clients.
5 E-Auction The auction process is a controlled competition among a set of agents (clients and an auctioneer), coordinated by the auctioneer. In an auction, the belief is first obtained from the auctioneer and the other clients through communication, and these beliefs are successively updated. Finally, the distributed belief among all the participants is composed of all the existing beliefs of every agent involved in the process. The rules that govern the auction protocol are as follows (a code sketch of these rules appears after the list):
1. At the initial step, the auctioneer agent begins the process and opens the auction.
2. At every step, decided by a time stamp, only one of the client agents is permitted to bid, and the auctioneer relays this information. The bidding client agent is called active; it does not bid more than once, and it becomes inactive until a new round begins.
3. After the auctioneer relays the information, a new client becomes active and bids a value strictly greater than the earlier bid by at least a finite fixed amount. (This is an English auction; it can be modified for other auctions.)
4. If no client agent responds within a time-out period, the last bid is chosen for the sale of the goods and the auction is closed.
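The sketch below captures rules 2-4 in Java; the increment and time-out values are illustrative assumptions, since the paper does not specify them.

```java
// Minimal sketch of the English-auction rules above; the increment and
// time-out constants are illustrative assumptions.
class EnglishAuction {
    static final double MIN_INCREMENT = 1.0;      // rule 3: finite fixed amount
    static final long TIMEOUT_MS = 30_000;        // rule 4: time-out period

    double lastBid = 0.0;
    String lastBidder = null;
    long lastBidTime = System.currentTimeMillis();

    // Rules 2-3: one active bidder at a time; a strictly higher bid is
    // required, and the same client may not bid twice in a row.
    synchronized boolean bid(String client, double amount) {
        if (amount < lastBid + MIN_INCREMENT || client.equals(lastBidder)) return false;
        lastBid = amount;
        lastBidder = client;
        lastBidTime = System.currentTimeMillis();
        return true;                              // auctioneer relays this bid
    }

    // Rule 4: close when no client has responded within the time-out.
    synchronized boolean closed() {
        return System.currentTimeMillis() - lastBidTime > TIMEOUT_MS;
    }
}
```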
5.1 E-Auction Protocol Among Clients and Auctioneer Let us assume that there are three clients A, B, C and an auctioneer G. G initiates the auction. Then each of the clients A, B, C broadcasts its bid and negotiates, and the auctioneer relays the information. The bidding value is known to all the clients and
the auctioneer. When the bid reaches a price above a certain reserve price, and no further bid comes forth before a time-out, G terminates the auction and the object goes under the hammer for that price. The combined communication protocol and computational tree of the E-auction is shown in Fig. 1. At initiation, the node labelled G is the root and the seeding agent (auctioneer). It transmits to each client the information that the E-auction has begun. It also starts with a credit of 1 and retains a credit of 1/(Oud(SA) + 1) for itself, transmitting the same amount to its neighbours A, B, C; in this case the amount is 1/4. The retained credit for each transmission is indicated near the node. COSTA then proceeds as indicated, generating the communication tree of Figure 1. To start with, client A bids a value. Then all the clients and G get this information and the credits. Client B then updates its earlier belief from the new message received from G, but the other nodes A and C do not update their initial beliefs and remain silent. Client C then bids. Finally, as indicated in rules a, b, c, d of Section 4, we sum over all the retained credits after each transmission. These are, respectively (the denominator being 4096): node G: 1093; node A: 341; node B: 1301; node C: 1361. Note that the sum tallies to 1, since 1093 + 341 + 1301 + 1361 = 4096.
Fig. 1. E-Auction protocol
6 Emergence of Collective Knowledge The agent collaboration system can model an E-market with many traders (agents), popularly known as buyers and sellers. These agents collaborate over the Internet to sell or buy shares or stocks in a stock market. In an E-market situation, it is possible that the negotiation ultimately leads to self-organized criticality, causing crashes. That is, individual agents, which correspond to a microscopic system, can emerge as a self-organizing macroscopic system corresponding to a "percolation model" or the more
general "random cluster model" [8]. The agent paradigm can be modelled by different percolation models, in a manner analogous to modelling the spread of epidemics and forest fires [4,8,10]. For example, the forest-fire problem is modelled by assuming that a tree can be in three states: burnt, burning and not burning. For simplicity we can assume that the trees are in a two-dimensional square lattice to determine the relevant parameters. In epidemiology, we can use two states: "infected" and "susceptible". We can model the behaviour of E-market agents analogously, as follows: (i) the experience and economic knowledge of an agent deployed by a trader, based totally on individualistic, idiosyncratic criteria (elementary belief); (ii) the trader's knowledge acquired through communication with other selected agents; such a trader is called a fundamentalist (derived belief); (iii) the trader's knowledge acquired by observing the trends in the market from the collective opinion of other traders; such a trader is called a trend chaser (inferential belief). In practice a trader is influenced by all the above factors, and the modified knowledge is incorporated into the agent's set of beliefs, organization and rules for actions. The above three factors play an important role in deciding the number of possible states each agent can be in, and its inclination to buy, sell or wait in an E-marketing decision. Each agent corresponding to a trader can communicate with the others, and this creates a connectivity relationship (bond) among them, modifying the organizational knowledge of the agents. This bond is created with a certain probability, determined by a single parameter that characterises the willingness of an agent to comply with others. The three states of behaviour are obviously a very complicated function of the behavioural properties and personality of an individual, and of whether he uses elementary, derived or inferential beliefs. It is interesting to note that all of these beliefs are, in turn, a function of the speed with which information is available to an agent, his financial status, his ability to reason, and his susceptibility to pressure from neighbours. Thus, in a share-market or auction situation, we need to work out how the agents are linked in order to obtain information through communication, the personality factors (such as age and financial status), and the market trend. Using data-mining techniques, the above factors can be used to derive detailed information about the mechanism of bond formation among the agents. Based on this information, we can assume that any two agents are randomly connected with a certain probability. This divides the agents into clusters of different sizes, whose members are linked either directly or indirectly via a chain of intermediate agents. These groups are coalitions of market participants who share the same opinion about their activity. The decision of each group is independent of its size and of the decisions taken by the other clusters. In this situation, using the random cluster model, we can show that when every trader is on average connected to another, more and more traders join the spanning cluster, and the cluster begins to dominate the overall behaviour of the system. This can give rise to a "speculation bubble" (if the members all decide to buy), a crash (if the members all decide to sell) or stagnation (if the members all decide to wait). These are cooperative phenomena and depend upon the trading rules, the exchange of
information (its speed and volume), and the connectivity relationship. For the 3-state agents the critical probability is p(c) = 0.63. Thus, in a large network of interacting agents, if an agent shows even about 63% preference for the information from his neighbours, a crash or a bubble is bound to happen. A detailed study of the evolution of smart systems and the role of the percolation model is available in [8]. This example illustrates that in a peer-to-peer, agent-based distributed knowledge-management system, new knowledge can emerge as a result of interaction, with unpredictable consequences [5]. In a very recent paper, Sen et al. [11] describe how cooperative group formation can take place among agents. Also, in a recent paper, Krishnamurthy et al. [4] describe the evolution of a swarm of an interacting multiset of agents (birds, ants, cellular automata) that is able to optimize some global objective through a cooperative search of the space. Here, too, there is a general stochastic tendency for individuals to move toward a centre of mass in the population on critical dimensions, resulting in convergence to an optimum. Using such agents we can simulate random walks that are independent of the past history of the walk, as well as non-Markovian random walks that depend upon past history, such as self-avoiding or self-repelling walks and active random-walker models, and a swarm whose global dynamics emerges from local rules [4, 8]. Such global dynamics can evolve to self-organized criticality, through chaos or stochasticity, not only for physical states, as in the case of a swarm, but also for the mental states of the agents, which can lead to a harmonious whole or to disharmony above a critical threshold [8].
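To illustrate the random-cluster picture, the following self-contained Monte-Carlo sketch links agents with a given bond probability, finds the coalitions by union-find, and lets each coalition adopt one shared buy/sell/wait decision. The graph model, the population size and the probability value are assumptions chosen for demonstration only, not the paper's calibrated model.

```java
// Monte-Carlo sketch of coalition formation among 3-state traders.
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class MarketPercolation {
    static int find(int[] parent, int x) {
        while (parent[x] != x) { parent[x] = parent[parent[x]]; x = parent[x]; }
        return x;
    }
    static void union(int[] parent, int a, int b) {
        parent[find(parent, a)] = find(parent, b);
    }

    public static void main(String[] args) {
        int n = 500;                     // traders (illustrative size)
        double p = 0.004;                // illustrative bond probability
        Random rng = new Random(42);

        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                if (rng.nextDouble() < p) union(parent, i, j);  // willingness bond

        // Each coalition (cluster) shares one of the three decisions.
        String[] decisions = {"buy", "sell", "wait"};
        Map<Integer, String> clusterDecision = new HashMap<>();
        Map<Integer, Integer> clusterSize = new HashMap<>();
        for (int i = 0; i < n; i++) {
            int root = find(parent, i);
            clusterDecision.computeIfAbsent(root, r -> decisions[rng.nextInt(3)]);
            clusterSize.merge(root, 1, Integer::sum);
        }
        // A dominating cluster that decides "buy" models a speculation
        // bubble; "sell" a crash; "wait" stagnation.
        int largest = clusterSize.values().stream().max(Integer::compare).orElse(0);
        System.out.printf("largest coalition: %d of %d traders%n", largest, n);
    }
}
```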
7 Conclusion We described how a set of agents can be used for collaboration in E-marketing; in particular, we gave an example of E-auction. We also explained the stochastic emergence of collective knowledge based on each agent's internal reasoning and on the percolation of knowledge arising from interaction with its own peers.
Acknowledgment The authors thank the reviewers for helpful suggestions in revising this paper.
References
1. Chen, Q. and Dayal, U., Multi agent cooperative transactions for E-commerce, Lecture Notes in Computer Science, Vol. 1901, Springer Verlag, New York (2000) 311-322.
2. Dignum, F. and Sierra, C., Agent Mediated E-Commerce, Lecture Notes in Artificial Intelligence, Vol. 2003, Springer Verlag, New York (2002).
3. Fisher, M., Representing and executing agent-based systems, Lecture Notes in Computer Science, Vol. 890, Springer-Verlag, New York (1995) 307-323.
4. Krishnamurthy, E.V. et al., Biologically inspired multiset programming paradigm for soft computing, ACM Conference on Computing Frontiers, Ischia, Italy (2004).
5. Louca, J., Modeling context-aware distributed knowledge, Lecture Notes in Artificial Intelligence, Vol. 2926, Springer Verlag, New York (2003) 201-212.
6. Murthy, V.K. and Abeydeera, R., Multi-agent transactional negotiation, Lecture Notes in Computer Science, Vol. 2468, Springer Verlag, New York (2002) 151-164.
7. Murthy, V.K. and Krishnamurthy, E.V., Probabilistic parallel programming based on multiset transformation, Future Generation Computer Systems, Vol. 11 (1995) 283-293.
8. Murthy, V.K. and Krishnamurthy, E.V., Entropy and smart systems, International Journal of Smart Engineering System Design, Vol. 5 (2003) 1-10.
9. Nagi, K., Transactional Agents, Lecture Notes in Computer Science, Vol. 2249, Springer Verlag, New York (2001).
10. Paul, W. and Baschnagel, J., Stochastic Processes, Springer Verlag, New York (2000).
11. Sen, S. et al., Emergence and stability of collaborations among rational agents, Lecture Notes in Artificial Intelligence, Vol. 2782, Springer Verlag, New York (2003) 192-205.
12. Woolridge, M., An Introduction to Multi-Agent Systems, John Wiley, New York (2002).
Intelligent Decision Making in Information Retrieval
Gloria E. Phillips-Wren1 and Guisseppi A. Forgionne2
1 Sellinger School of Business and Management, Loyola College in Maryland, 4501 N. Charles Street, Baltimore, MD 21210 USA [email protected]
2 Department of Information Systems, University of Maryland Baltimore County, 1000 Hilltop Circle, Catonsville, MD 21250 USA [email protected]
Abstract. Information retrieval on the Internet is particularly challenging for the non-expert user seeking technical information with specialized terminology. The user can be assisted during the required search tasks with intelligent agent technology delivered through a decision making support system. This paper describes the technology and its application to suicide prevention searches performed for the National Institute of Mental Health.
1 Introduction Information retrieval is one of the most important uses of the Internet. Yet there is no consistent method of organizing or categorizing material [1]. Together with the large number of documents available, this makes determining relevant information difficult for search engines, because relevance is subjective and dependent on the individual user. Most search engines provide free-text entry by the user, and the selection of terms requires the user to make decisions about the descriptive words that will yield the desired information. The user may require numerous attempts to locate specific information, particularly in the case of inexperienced users or difficult-to-locate information. The task is particularly taxing for the non-expert in the case of technical information that uses specialized terminology, since the terms are generally not known to the user. The non-expert user needs guidance during the decision-making task of choosing appropriate search terms and evaluating the results, and assistance can be provided by presenting information in non-technical terms and by delivering the assistance in a transparent manner. This paper describes such an approach and its application to a suicide prevention search for the National Institute of Mental Health. The application illustrates how the approach can make pertinent technical information readily accessible to a non-expert searcher. The paper is organized as follows. We first describe the information retrieval process and its relationship to decision making. Next, we describe the application and its technical requirements. The user interface that assists decision making during the design phase is then discussed, along with the implementation. We then discuss the use of intelligent agent technology in the system and propose further enhancements to the system. We conclude with contributions to the literature and further research.
1.1 Information Retrieval Information retrieval informs the user of the existence or location of documents related to the desired information. It is differentiated from data retrieval, which retrieves primary information such as stock quotes [2]. The user chooses among alternatives to reach a goal in decision making [3,4], and information retrieval involves decision making since the user selects words that accomplish the task. Tasks in decision making can be classified as structured, semi-structured or unstructured. Structured tasks are accomplished with well-known and well-defined steps, making them candidates for automation. At the other end of the spectrum, unstructured tasks are so dependent on the decision maker that no general guidelines can be given. Tasks that fall between structured and unstructured are called semi-structured tasks, and these can be supported with decision-making support systems. Information retrieval by a non-expert in a technical field with a specialized vocabulary can be considered a semi-structured task, and decision making during the search can be aided. During the search, the user moves through Simon's classical model of decision making, with the major phases as shown in Table 1 [5].
Although the phases occur sequentially for most tasks, the process is continuous and the decision maker may loop back to previous phases at any time [6]. As new information is revealed, the decision maker iterates within the sub-tasks until a final choice is made. During the intelligence phase of information retrieval, the user observes events that motivate a search for information. In our application of suicide prevention, for example, the decision maker may observe a friend that he/she thinks may be suicidal. The decision maker may be motivated to gather additional details and facts in order to determine if he/she should intervene. The user accesses an Internet-based search engine during the design phase. During this phase, the user formulates an idea of those characteristics that are most relevant to the search and of how these ideas may interact. In the case of technical information such as suicide, medical terms are used to describe the literature and to categorize the information for retrieval. The non-expert user may need assistance with the terminology and the search. After repeated attempts, and possibly returning to the intelligence phase, the user selects one or more items deemed relevant from the search returns. The final phase, implementation, is concerned with the user applying the results of the search.
2 Application According to the U.S. Centers for Disease Control, there were 10.8 deaths from suicide per 100,000 people in the U.S. population in 2001 [7]. Suicide prevention in the military has been addressed by centers such as the Naval Health Research Center, which reports that suicide has been the second leading cause of death, following accidents, over the past decade in the U.S. Navy [8]. One of the objectives of our application is to provide suicide information, using an Internet-based methodology, to appropriate people who know someone who may be suicidal [9]. The strategy of delivering this support remotely through electronic communication technologies is called telemedicine [10]. The Preventing Suicide Network (PSN) is a telemedicine application that seeks to deliver personalized information on suicide prevention from technical databases to non-experts. The application, which is accessed through a Web-based interface developed under contract to the National Institute of Mental Health, has the appearance shown in Figure 1 [9]. The technical databases addressed in our application consist of the Web-accessible databases at the National Library of Medicine (NLM) shown in Table 2 [11]. As this table illustrates, there are seven databases at the NLM that are of interest to the PSN. The scope of the external databases is demonstrated by the over 12 million references in MedLine alone [12]. The primary search terms used to catalog the databases are defined by the Medical Subject Headings (MeSH®) controlled vocabulary, containing over 19,000 terms [12]. The NLM reports that biomedical subject specialists with degrees in science, who are also experts in one or more foreign languages, create the MeSH® terms and review complete articles rather than abstracts. Additional terms, primarily chemical concepts, are added to the MeSH® headings each year to provide for changing content [12].
Fig. 1. The Preventing Suicide Network homepage [9]
Terminology utilized in MeSH® to describe suicide and used to catalog material in the NLM databases is not known to the non-expert user. In addition, the user is not experienced with the terms that a medical specialist uses to describe suicide. The NLM provides a simple interface to the medical databases, called the Gateway, that
consists of a free-text search box. The combination of MeSH® terms, technical language and the lack of guidance from the medical databases suggests that the user would benefit from decision support in the information retrieval task.
3 Decision Support for Information Retrieval In our application, the user is assisted during the design phase of decision making. Terms were selected in consultation with the librarians at the NLM, with attention to the specific MeSH® terms describing suicide. The terms are presented to the user as a list of approximately 85 terms provided by a professional clinician in consultation with medical experts in the subject field of suicide [9]. Suggested terms were evaluated with respect to the MeSH® terms and selected for inclusion based on their representation of the subject matter. A portion of the terms is shown in Figure 2.
Fig. 2. Portion of the aided search provided in the Preventing Suicide Network [9]
The terms are presented to the user through the interface shown in Figure 2, using non-technical language. The terms sent to the search engine are different from those presented to the user and more closely match the MeSH® terms. For example, the term "family and physiopathology" is used for the database search rather than "family dysfunction", which is shown to the user. As another example, the terminology "(mental disorders AND suicide [mesh] NOT suicide, assisted NOT euthanasia)" is sent to the database search rather than "history of mental health disorders" [9]. The user selects the terms that are descriptive of his/her personal situation, essentially derived during his/her personal application of the intelligence phase of decision making. The user preferences are stored within the system to allow for
dynamic processing as he/she gains additional information, and to tailor the search for the particular user. Intelligent agent technology is used to implement this feature and to maintain currency.
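The term translation described above amounts to a lookup from the non-technical label shown to the user to the query actually sent to the NLM. In the sketch below, the two mappings are the paper's own examples, while the Map-based lookup itself is an implementation assumption.

```java
// Sketch of the user-term-to-MeSH translation; the two entries are the
// paper's own examples, the lookup mechanism is assumed.
import java.util.Map;

class MeshTermMapper {
    // Non-technical label shown to the user -> query sent to the NLM search.
    static final Map<String, String> USER_TO_MESH = Map.of(
        "family dysfunction",
            "family and physiopathology",
        "history of mental health disorders",
            "(mental disorders AND suicide [mesh] NOT suicide, assisted NOT euthanasia)");

    static String toQuery(String userTerm) {
        return USER_TO_MESH.getOrDefault(userTerm, userTerm);
    }
}
```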
4 Intelligent Agents Intelligent agents are software packages that carry out tasks for others autonomously, the others being human users, business processes, workflows or applications [13-15]. Agents in decision support systems can provide benefits to users by automating some tasks and reducing complexity [16]. Recent applications of intelligent agents to decision support range from medical support to business decisions [17,18]. Intelligent agents are useful for the retrieval of information to support decision-making processes [19]. In our application, with its specific, specialized information, agents can facilitate interaction with the user and can act as a guide through the decision-making process. Agents collect the responses and develop a profile of the user's description of the desired information. Agents then retrieve information from the external databases at the NLM, weight and filter the information according to the user's profile, and return the desired information to the user. In the current implementation, agents operate 24/7 to search the NLM databases for new information of interest to the user, based on the profile stored in a SQL database internal to the PSN website. When agents identify information of potential interest to the user, he/she is notified via an automatically generated email. Currently, the user must return to the PSN website to retrieve the new information. In the future, agents could interact with the user to guide the development of the user profile and to better tailor the information retrieval for a particular user.
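A hedged sketch of one monitoring pass by such an agent follows; every interface and name here is a hypothetical placeholder, since the paper does not publish the PSN implementation.

```java
// Sketch of a 24/7 monitoring agent: re-run each stored profile query,
// keep only unseen results, and e-mail the user. All interfaces are
// hypothetical placeholders, not the PSN's actual API.
import java.util.ArrayList;
import java.util.List;

class MonitoringAgent {
    interface NlmGateway { List<String> search(String meshQuery); }
    interface Mailer     { void notify(String user, List<String> newItems); }

    private final NlmGateway gateway;
    private final Mailer mailer;

    MonitoringAgent(NlmGateway gateway, Mailer mailer) {
        this.gateway = gateway;
        this.mailer = mailer;
    }

    // One pass over a user profile (stored in the PSN's internal SQL database).
    void checkProfile(String user, List<String> profileQueries, List<String> alreadySeen) {
        for (String query : profileQueries) {
            List<String> hits = new ArrayList<>(gateway.search(query));
            hits.removeAll(alreadySeen);                     // only new information
            if (!hits.isEmpty()) mailer.notify(user, hits);  // auto-generated email
        }
    }
}
```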
5 Summary and Contributions

Intelligent agent technology is being applied to facilitate decision making for information retrieval in telemedicine related to suicide prevention. The methodology, delivered through intelligent decision support with agents, has the potential to significantly enhance the accessibility of technical information for the non-expert user.
Acknowledgements

The authors would like to thank Florence Chang, chief of the Specialized Information Services Branch of the National Library of Medicine, for her invaluable assistance with the NLM databases. James Wren is acknowledged for his work on developing search statistics related to the Preventing Suicide Network. This work was supported in part by iTelehealth, Inc., and Consortium Research Management, Inc., under Small Business Innovation Research contract N44MH22044 from the National Institute of Mental Health for an Intermediary-Based Suicide Prevention Website Development Project.
References
1. Tang, M. and Sun, Y.: Evaluation of Web-Based Search Engines Using User-Effort Measures. LIBRES. Vol. 13(2). Sept (2003), http://libres.curtin.edu.au/libres13n2/tang.htm
2. van Rijsbergen, C.J.: Information Retrieval. Butterworths, London (1979)
3. Holsapple, C.W. and Whinston, A.B.: Decision Support Systems. West Publishing Company, St. Paul, MN (1996)
4. Turban, E. and Aronson, J.: Decision Support Systems and Intelligent Systems. A. Simon and Schuster Company, Upper Saddle River, NJ (1998)
5. Simon, H.: Administrative Behavior, fourth edition (original publication date 1945). The Free Press, New York, NY (1977)
6. Forgionne, G.A.: Decision Technology Systems: A Vehicle to Consolidate Decision Making Support. Information Processing and Management. Vol. 27(6). (1991) 679-797
7. CDC: Centers for Disease Control, http://www.cdc.gov/nchs/fastats/suicide.htm. Accessed on March 15 (2004)
8. NHRC: Military Suicide Research Program. http://www.nhrc.navy.mil/programs/donsir/. Accessed on March 5 (2004)
9. PSN: Preventing Suicide Network, http://www.preventingsuicide.com/. Accessed on February 15 (2004)
10. Field, M. (ed.): Telemedicine: A Guide to Assessing Telecommunications for Health Care. Institute of Medicine of the National Academy of Sciences, Washington, D.C. (1996)
11. NLM: National Library of Medicine. http://gateway.nlm.nih.gov/. Accessed on August 25 (2003)
12. NLM: National Library of Medicine. http://www.nlm.nih.gov/pubs/factsheets/bsd.html. Accessed on February 2 (2004)
13. Bradshaw, J. (ed.): Software Agents. MIT Press, Cambridge, MA (1997)
14. Huhns, M. and Singh, M. (eds.): Readings in Agents. Morgan Kaufmann Publishers, Inc., San Francisco, CA (1998)
15. Jennings, N. and Wooldridge, M. (eds.): Agent Technology: Foundations, Applications and Markets. Springer-Verlag, Berlin, Germany (1998)
16. Hess, T., Rees, L. and Rakes, T.: Using Autonomous Software Agents to Create the Next Generation of Decision Support Systems. Decision Sciences. Vol. 31(1). (2000) 1-31
17. Harper, P. and Shahani, A.: A decision support system for the care of HIV and AIDS patients in India. European Journal of Operational Research. Vol. 147(1). May (2003) 187
18. Chen, J. and Lee, S.: An exploratory cognitive DSS for strategic decision making. Decision Support Systems. Vol. 36(2). October (2003) 147
19. Lesser, V., Horling, B., Klassner, F., Raja, A., Wagner, T. and Zhang, S.: BIG: An Agent for Resource-Bounded Information Gathering and Decision Making. Artificial Intelligence Journal. Vol. 118(1-2). (2000) 197-244
Innovations in Intelligent Agents, Web and Their Applications

Gloria E. Phillips-Wren¹ and Nikhil Ichalkaranje²
¹ Sellinger School of Business and Management, Loyola College in Maryland, 4501 N. Charles Street, Baltimore, MD 21210, USA [email protected]
² School of EIE, University of South Australia, Mawson Lakes Campus, Mawson Lakes Boulevard, SA 5095, Australia [email protected]
Abstract. This paper provides an introduction to Session 2 of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference along with a brief summary of the papers in the session.
1 Introduction

Session 1 of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference presented research papers on innovations in intelligent agents and applications. Session 2 continues the research stream by extending agents to the Internet and Web applications. A general categorization of this work can be called web agents, and frameworks are being developed to apply mathematical models and statistical learning techniques so that Web agents can learn about their environment [1]. Some web agents have been categorized as mobile agents, defined as programs that can migrate from machine to machine in a heterogeneous network by deciding when and where to migrate [2]. These agents can migrate to another computer, suspend or initiate action, or resume execution on another machine. Nwana and Ndumu (1999) include in this category agents such as shopbots and information agents [3]. Research in the area of web agent systems is focused on the theory and development of a unified framework, the structure of the web, the semantic web, information retrieval from the Web, and ethical agents [1,2,3,4,5]. Some of these topics are illustrated by the research papers that formed Session 2 of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference. An introduction to the papers in this session is offered below.
2 Session Papers

The first paper, by Ikai, Yoneyama and Dote, entitled “Novel Intelligent Agent-Based System for Study of Trade”, investigates human activity by allowing the agent system to evolve based on a concept called sugarscape to govern the behavior of the agents [6]. Agents attempt to acquire the most sugar within their space, and the concept is
applied to the study of trade. Simulation programs are utilized to compare various approaches. The paper by Takahashi, Amamiya, Iwao, Zhong and Amamiya, entitled “Testing of Multi-Agent-based System in Ubiquitous Computing Environment”, attempts to move into the real world with an agent system in an e-commerce application [7]. The experiment described in the paper consists of two communities: a shopping mall and a user community with actual users. Morch and Nævdal describe user interface (or pedagogical) agents in their paper “Helping Users Customize their Pedagogical Agents: Issues, Approaches and Examples” [8]. These agents attempt to provide awareness of the social situation in a web-based collaborative learning environment. For example, agents could virtually take the place of a human teacher. The paper addresses both technical and social issues in the use of pedagogical agents. The final paper in Session 2 is a contribution by Velásquez, Estévez, Yasuda, Aoki and Vera entitled “Intelligent web site: Understanding the visitor behavior” [9]. In this paper the authors propose a portal generation technique which improves a site’s structure and content by analysing users’ (visitors’) behaviour. The prime focus of this paper is to model visitor behavior from the only information available, which is the user’s browsing behavior on the web. A framework is developed to extract knowledge from Web data and discover meaningful patterns. The method is applied to a commercial bank’s web site to provide recommendations for modifying the site.
3 Summary

The research papers in Session 2 of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference advance the field of web intelligent agents by offering theory and applications in agent frameworks, simulating human behavior with evolutionary systems, increasing awareness of social situations, and applying intelligent agents to the important areas of e-commerce, collaborative learning and banking.
References
1. Web Agents Group. Accessed from http://www.cs.brown.edu/research/webagent/ on April 30 (2004)
2. Dartmouth Agents. Accessed from http://agent.cs.dartmouth.edu/general/overview.html on April 30 (2004)
3. Nwana, H.S. and Ndumu, D.T.: A perspective on software agents research. The Knowledge Engineering Review. Vol. 14(2), 1-18 (1999) (Also available from http://agents.umbc.edu/introduction/hn-dn-ker99.html)
4. UMBC AgentWeb. Accessed from http://agents.umbc.edu/Applications_and_Software/Applications/index.shtml on April 30 (2004)
5. Eichmann, D.: Ethical Web Agents. Accessed from http://archive.ncsa.uiuc.edu/SDG/IT94/Proceedings/Agents/eichmann.ethical/eichmann.html on April 30 (2004)
6. Ikai, T., Yoneyama, M. and Dote, Y.: Novel Intelligent Agent-Based System for Study of Trade. Proceedings of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference, Wellington, NZ (2004)
7. Takahashi, K., Amamiya, S., Iwao, T., Zhong, G. and Amamiya, M.: Testing of Multi-Agent-based System in Ubiquitous Computing Environment. Proceedings of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference, Wellington, NZ (2004)
8. Morch, A. and Nævdal, J.: Helping Users Customize their Pedagogical Agents: Issues, Approaches and Examples. Proceedings of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference, Wellington, NZ (2004)
9. Velásquez, J., Estévez, P., Yasuda, H., Aoki, T. and Vera, E.: Intelligent web site: Understanding the visitor behavior. Proceedings of the Knowledge-Based Intelligent Information & Engineering Systems (KES) Conference, Wellington, NZ (2004)
Novel Intelligent Agent-Based System for Study of Trade Tomohiro Ikai, Mika Yoneyama, and Yasuhiko Dote Department of Computer Science and Systems Engineering, Muroran Institute of Technology, Mizumoto 27-1, Muroran 050-8585, Japan Phone: +81 143 46 5432, Fax: +81 143 46 5499 [email protected]
Abstract. In this paper a novel intelligent agent-based system (rule-based system), built on the sugarscape agent-based system (Epstein and Axtell 1996), is developed for the study of trade, introducing the architectures of hybrid cluster and peer-to-peer (Hybrid P2P) computer networks (Buyya 2002). It is confirmed by simulations that the sugarscape with the architecture of Hybrid P2P computer networks is the most efficient and flexible among the sugarscape without trade, the sugarscape with the architecture of cluster computer networks (Clusters), and the sugarscape with the architecture of P2P computer networks (Pure P2P). The developed agent-based system (rule-based system) is also more efficient and flexible than our previously developed agent-based system using artificial immune networks (Dote 2001). Keywords: agent-based system, sugarscape, cluster and peer-to-peer computing, artificial societies
1 Introduction

Agent-based computer modeling techniques for the study of human social phenomena, including trade, migration, group formation, combat, interaction with an environment, transmission of culture, propagation of disease, and population dynamics, have been developed. The broad aim is to begin the development of a computational approach that permits the study of these diverse spheres of human activity from an evolutionary perspective as a single social science, a transdiscipline subsuming such fields as economics and demography (complex systems). Our computer agent-based (rule-based) model is based on the sugarscape agent-based model (Epstein and Axtell 1996).
The sugarscape is a spatial distribution, or landscape, of a generalized resource that agents like to eat. The landscape consists of variously shaped regions, some rich in sugar, some relatively impoverished. Agents are born on the sugarscape with a vision, a metabolism, and other genetic attributes. Their movement is governed by a simple local rule. Paraphrasing, it amounts to the instruction: “Look around as far as your
vision permits, find the spot with the most sugar, go there and eat the sugar.” Every time an agent moves, it “burns” some sugar, an amount equal to its metabolic rate. Agents die if and when they burn up all their sugar. A remarkable range of phenomena emerges from the interaction of these simple agents. The ecological principle of carrying capacity, that a given environment can support only some finite population, quickly becomes evident. When “seasons” are introduced, migration is observed. Migrators can be interpreted as environmental refugees, whose immigration boosts population density in the receiving zone, intensifying the competition for resources there, a dynamic with “national security” implications. Since agents are accumulating sugar at all times, there is always a distribution of wealth, measured in sugar, in the agent society. Does the wealth distribution mimic anything observed in human societies? Under a great variety of conditions the distribution of wealth on the sugarscape is highly skewed, with most agents having little wealth. Highly skewed distributions of income and wealth are also characteristic of actual human societies, a fact first described quantitatively by the nineteenth-century mathematical economist Vilfredo Pareto. Thus we find the first instance of a qualitative similarity between extant human societies and artificial ones on the sugarscape. Spice is added to the sugarscape model, resulting in a trade model. On the other hand, cluster, grid, and peer-to-peer (P2P) computer networks have been developed (Buyya 2002). Cluster networks have the characteristic that they are fully centralized for user management, resource management, and allocation/scheduling. P2P networks are decentralized for user management and allocation/scheduling, and are distributed for resource management. In this paper the architectures of cluster and P2P (Hybrid P2P) networks are introduced into the sugarscape model to construct a more flexible and efficient intelligent agent-based system for the study of trade. It is confirmed by simulations that the proposed agent-based system (simulator) is more efficient and flexible for the study of trade than the sugarscape simulator without trade, the sugarscape simulator with the architecture of cluster networks or with P2P networks, and the agent-based system using artificial immune networks (Dote 2001). This paper is organized as follows. Section 2 describes the sugarscape (trade) model with the architecture of hybrid cluster and P2P (Hybrid P2P) networks. In Section 3 the simulation results of the proposed approach are given in comparison with those of the other approaches. Section 4 draws some conclusions.
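As a minimal sketch of the movement rule paraphrased above: an agent scans the lattice axes out to its vision, moves to the richest unoccupied site, eats, and pays its metabolic cost. The world object and field names here are illustrative, not taken from Epstein and Axtell's code.

```python
def move_and_eat(agent, world):
    """Sugarscape movement rule: look as far as vision permits along the
    lattice axes, go to the site with the most sugar, eat it, then burn
    sugar equal to the metabolic rate; the agent dies at zero sugar."""
    x, y = agent["pos"]
    candidates = [(x, y)]
    for d in range(1, agent["vision"] + 1):
        for nx, ny in ((x + d, y), (x - d, y), (x, y + d), (x, y - d)):
            if world.in_bounds(nx, ny) and not world.occupied(nx, ny):
                candidates.append((nx, ny))
    best = max(candidates, key=lambda p: world.sugar[p])
    agent["pos"] = best
    agent["sugar"] += world.sugar[best]      # eat all sugar at the new site
    world.sugar[best] = 0
    agent["sugar"] -= agent["metabolism"]    # every move burns sugar
    agent["alive"] = agent["sugar"] > 0
```

Iterating this rule for a population of agents is enough to reproduce the carrying-capacity and skewed-wealth phenomena mentioned above.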
2 Sugarscape Model (Sugar and Spice, Trade) with Architecture of Hybrid Cluster and Peer-to-Peer (Hybrid P2P) Computer Network

2.1 Agents, Environment, and Rules

Agents, environment, and rules play an important role in agent-based models.
Agents: Agents are the ‘people’ of artificial societies. Each agent has internal states and behavioral rules. Some states are fixed for the agent’s life, while others change through interaction with other agents or with the external environment. For example, in the model to be described below, an agent’s metabolic rate and vision are fixed for life. However, individual economic preferences and wealth can all change as agents move around and interact. These movements, interactions, and changes of state all depend on rules of behavior for the agents and the space.

Environment: Life in an artificial society unfolds in an environment of some sort. This could be a landscape, for example, a topography of renewable resource that agents eat and metabolize. Such a landscape is naturally modeled as a lattice of resource-bearing sites. However, the environment, the medium over which agents interact, can be a more abstract structure, such as a communication network whose very connection geometry may change over time. The point is that the ‘environment’ is a medium separate from the agents, on which the agents operate and with which they interact.

Rules: Finally, there are rules of behavior for the agents and for sites of the environment. A simple movement rule for agents might be: look around as far as you can, find the site richest in food, go there and eat the food. Such a rule couples the agents to their environment. One could think of this as an agent-environment rule. In turn, every site of the landscape could be coupled to its neighbors by cellular automata rules. For example, the rate of resource growth at a site could be a function of the resource levels at neighboring sites. This would be an environment-environment rule. Finally, there are rules governing agent-agent interactions: mating rules, combat rules, or trade rules, for example.

To begin, since trade involves an exchange of distinct items between individuals, the first task is to add a second commodity to the landscape. This second resource, ‘spice’, is arranged in two mountains opposite the original sugar mountains. At each position there is a sugar level and capacity, as well as a spice level and capacity. Each agent now keeps two separate accumulations, one of sugar and one of spice, and has two distinct metabolisms, one for each good. These metabolic rates are heterogeneous over the agent population, just as in the single commodity case, and represent the amount of the commodities the agents must consume each period to stay alive. Agents die if either their sugar or their spice accumulation falls to zero.

The Agent Welfare Function: We now need a way for the agents to compare the two goods. A ‘rational’ agent having, say, equal sugar and spice metabolisms but with a large accumulation of sugar and small holdings of spice should pursue sites having relatively more spice than sugar. One way to capture this is to have the agents compare how ‘close’ they are to starving to death due to a lack of either sugar or spice. They then attempt to gather relatively more of the good whose absence most jeopardizes their survival. In particular, imagine that an agent with metabolisms $(m_1, m_2)$ and accumulations $(w_1, w_2)$
computed the ‘amount of time until death given no further resource gathering’ for each resource; these durations are just $\tau_1 = w_1/m_1$ and $\tau_2 = w_2/m_2$. The relative size of these two quantities, the dimensionless number $\tau_1/\tau_2$,
is a measure of the relative importance of finding sugar to finding spice. A number less than one means that sugar is relatively more important, while a number greater than one means that spice is needed more than sugar. An agent welfare function giving just these relative valuations at the margin is

$$W(w_1, w_2) = w_1^{m_1/m_T} \, w_2^{m_2/m_T} \tag{1}$$

where $m_T = m_1 + m_2$. Note that this is a Cobb-Douglas functional form.
Internal Valuations: According to microeconomic theory, an agent’s internal valuations of economic commodities are given by its so-called marginal rate of substitution (MRS) of one commodity for another. An agent’s MRS of spice for sugar is the amount of spice the agent considers to be as valuable as one unit of sugar, that is, the value of sugar in units of spice. For the welfare function (1) above, the MRS can be shown to be

$$\mathrm{MRS} = \frac{\partial W / \partial w_1}{\partial W / \partial w_2} = \frac{m_1 w_2}{m_2 w_1} = \frac{w_2/m_2}{w_1/m_1} = \frac{\tau_2}{\tau_1}.$$
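In code, the welfare function (1) and the MRS reduce to a few lines; this sketch simply restates the formulas for an agent with metabolisms (m1, m2) and accumulations (w1, w2).

```python
def welfare(w1, w2, m1, m2):
    """Cobb-Douglas welfare W(w1, w2) = w1^(m1/mT) * w2^(m2/mT), mT = m1 + m2."""
    mT = m1 + m2
    return (w1 ** (m1 / mT)) * (w2 ** (m2 / mT))

def mrs(w1, w2, m1, m2):
    """MRS of spice for sugar, (w2/m2) / (w1/m1); a value above one means the
    agent is closer to sugar starvation, so sugar is the more valued good."""
    return (w2 / m2) / (w1 / m1)

# An agent with little sugar values sugar highly:
print(mrs(w1=2, w2=20, m1=1, m2=1))  # 10.0 -> one unit of sugar worth 10 of spice
```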
An agent’s MRS …

… > 0.13 AND Biomass Feed Rate > 0.4 THEN Efficiency = 88_90

Each rule is an IF-THEN statement including the premise and the conclusion. This algorithm was chosen due to its simplicity and its production of explicit and understandable rules.
3 Feature Selection Method

This section describes a model that identifies the best feature subset. Several transformation schemes are applied to the original data set. A genetic wrapper feature selection algorithm is utilized to identify the key features from each of the transformed data sets. The selected features are then combined into a single data set. The genetic wrapper selection method is applied to the combined data set (see Figure 1).
Fig. 1. Feature transformation and selection model
Analyzing the final set of selected features will provide insight into the process and the feature interactions. A selected feature that was not transformed indicates that it is sensitive and that any denoising would be detrimental to the knowledge quality, e.g., as measured with classification accuracy. Conversely, selected features with a high level of denoising may suggest that they are critical for outlier detection or large process shifts but are robust to small changes within the process. A data mining algorithm is applied to the final feature subset. Comparing the results of the data mining algorithm to the results obtained from applying the same
algorithm to the original data set will demonstrate the improvement in the quality of the discovered knowledge.
4 Case Study

The method outlined in the previous section was demonstrated on industrial data obtained from a Circulating Fluidized Boiler (CFB). The boiler provides an excellent case study due to the fact that it is a complex and temporal environment. Furthermore, there has been some research in the area that utilizes wavelets for data transformation. The applications include partial discharge monitoring ([8], [9], [10]), transforming inputs to a neural network [10], and fault detection [11]. For the purposes of this case study, data on fourteen features was collected in one-minute intervals over several days. The parameters consisted of both control parameters and observed parameters. The parameters included primary and secondary air flows, boiler temperatures, pressures, and oxygen levels. The resulting data set consisted of over 12,000 observations. The fourteen features were used to predict the combustion efficiency of the boiler in the applications of the decision tree. These applications include the fitness function of the GA wrapper and the applications of the decision tree for the evaluation of the feature subsets. Any transformation scheme can be utilized with this method, but moving averages and wavelets were the focus of the case study. Both schemes capture the time behavior of data (vertical relationships), which is of importance in mining temporal data sets. The transformations were applied and examined separately. Six moving average transformations (original data, and 10, 20, 30, 40, and 60 minute moving averages) were considered. Each transformation was applied to each of the features. The GA wrapper selected the best feature subset for each transformation scheme. That is, the GA wrapper selected the best feature subset from the set of all features that had been transformed with, say, a 20 minute moving average. This was repeated with each moving average transformation as well as the original data. The selected features were then combined together and the GA wrapper selected the best subset from the combined data set. Four wavelet transformations were analyzed (0.3, 0.2, 0.1, and 0.01). The same procedures that were used with the moving average transformations were applied to the wavelet transformation scheme. A sketch of this transform-then-select procedure follows.
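The sketch below combines the two steps, assuming a pandas DataFrame X of raw features and an efficiency class label y; the population size, rates, windows, and helper names are illustrative rather than the settings used in the study.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def transform_moving_average(X, windows=(1, 10, 20, 30, 40, 60)):
    """Build the combined data set: every feature under every smoothing
    window (window 1 keeps the original, untransformed signal)."""
    cols = {f"{c}_ma{w}": X[c].rolling(w, min_periods=1).mean()
            for w in windows for c in X.columns}
    return pd.DataFrame(cols)

def fitness(mask, X, y):
    """Wrapper fitness: cross-validated decision-tree accuracy on the subset."""
    if not mask.any():
        return 0.0
    tree = DecisionTreeClassifier(random_state=0)
    return cross_val_score(tree, X.loc[:, mask], y, cv=10).mean()

def ga_wrapper(X, y, pop=20, gens=30, p_mut=0.02, seed=0):
    """Evolve bit masks over the columns of X; return the best feature subset."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    population = rng.integers(0, 2, size=(pop, n)).astype(bool)
    for _ in range(gens):
        scores = np.array([fitness(ind, X, y) for ind in population])
        parents = population[np.argsort(scores)[::-1][: pop // 2]]
        children = []
        for _ in range(pop - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)
            child = np.concatenate([a[:cut], b[cut:]])   # one-point crossover
            child ^= rng.random(n) < p_mut               # bit-flip mutation
            children.append(child)
        population = np.vstack([parents, children])
    best = max(population, key=lambda ind: fitness(ind, X, y))
    return X.columns[best]

# selected = ga_wrapper(transform_moving_average(X), y)
```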
5 Results

The decision tree algorithm was applied to the original feature set as well as the final feature subsets generated from both the moving average and wavelet transformations. The classification accuracies are a result of 10-fold cross validation [13], and it should be noted that all applications of the decision tree were completed on the same computer. The results in terms of prediction accuracy, number of rules, and computation time for the moving average transformation are shown in Table 1.
It is evident that the features selected by the proposed approach improved the performance of the decision tree on all three metrics. The results from the wavelet transformations can be seen in Table 2. The difference between the original feature set metrics for the wavelet and moving average trials is due to the fact that the efficiency outcome was also transformed with a moving average or wavelet in the respective trials.
The results from the wavelet transformation are not as dramatic as those from the moving average, but there is still a marginal improvement in all metrics.
6 Conclusion

In this paper an approach for selecting the best transformed feature subset is presented. The approach utilizes a genetic algorithm wrapper and several data transformation schemes. The final feature subset contains not only the best features but also their best transformations. The feature transformation approach is well suited for temporal data, as it provides new insight into the dynamics of the data and determines parameter sensitivity. The approach was demonstrated on data from a boiler combustion process. A wavelet transformation scheme and a moving average scheme were applied to the data. The moving average scheme produced significant improvements in terms of classification accuracy and the reduction in the number of rules and processing time. The approach provided more insight by repeatedly selecting the same features regardless of the type of transformation scheme. These features might be crucial to controlling the process. Furthermore, there were some features that were selected for only specific transformations. These features may require only the level of control that was defined by the denoising transformation. The wavelet-transformed data produced little improvement; the wavelet transformations could have denoised the data too significantly. The type of denoising transformation as well as the denoising scheme itself are critical to the quality of solutions.
References
1. Kusiak, A. (2001) “Feature Transformation Methods in Data Mining,” IEEE Transactions on Electronic Packaging Manufacturing, Vol. 24, No. 3, pp. 214-221.
2. Weigend, A., Chen, F., Figlewski, S., and Waterhouse, S.R. (1998) “Discovering Technical Trades in the T-Bond Futures Market,” Proc. Fourth Int'l Conf. Knowledge Discovery and Data Mining (KDD '98), (Eds) Argawal, R., Stolorz, P., Piatetsky-Shapiro, G., pp. 354-358.
3. Vafaie, H. and De Jong, K. (1998) “Feature Space Transformation Using Genetic Algorithms,” IEEE Intelligent Systems, Vol. 13, No. 2, pp. 57-65.
4. Hubbard, B.B. (1998) The World According to Wavelets: The Story of a Mathematical Technique in the Making, Second ed., A.K. Peters, Natick, Massachusetts.
5. Goldberg, D.E. (1989) Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley.
6. Quinlan, J.R. (1986) “Induction of decision trees,” Machine Learning, Vol. 1, No. 1, pp. 81-106.
7. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., and Uthurusamy, R. (1995) Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press.
8. Hu, M.Y., Xie, H., Tiong, T.B., and Wu, X. (2000) “Study on a Spatially Selective Noise Filtration Technique for Suppressing Noises in Partial Discharge On-line Monitoring,” Proceedings of the 6th International Conference on Properties and Applications of Dielectric Materials, Vol. 2, pp. 689-692.
9. Hu, M.Y., Jiang, X., Xie, H., and Wang, Z. (1998) “A New Technique For Extracting Partial Discharge Signals In On-Line Monitoring with Wavelet Analysis,” Proceedings of the 1998 International Symposium on Electrical Insulating Materials, pp. 677-680.
10. Huang, C.M., and Huang, Y.C. (2002) “Combined Wavelet-Based Networks and Game-Theoretical Decision Approach for Real-Time Power Dispatch,” IEEE Transactions on Power Systems, Vol. 17, No. 3, pp. 633-639.
11. Smith, K., and Perez, R. (2002) “Locating partial discharges in a power generating system using neural networks and wavelets,” Annual Report Conference on Electrical Insulation and Dielectric Phenomena, pp. 458-461.
12. Masugi, M. (2003) “Multiresolution Analysis of Electrostatic Discharge Current from Electromagnetic Interference Aspects,” IEEE Transactions on Electromagnetic Compatibility, Vol. 45, No. 2, pp. 393-403.
13. Stone, M. (1974) “Cross-validatory choice and assessment of statistical predictions,” Journal of the Royal Statistical Society, Vol. 36, pp. 111-147.
Personalized Multilingual Web Content Mining Rowena Chau, Chung-Hsing Yeh, and Kate A. Smith School of Business Systems, Faculty of Information Technology, Monash University, Clayton, Victoria 3800, Australia {Rowena.Chau,ChungHsing.Yeh,Kate.Smith}@infotech.monash.edu.au
Abstract. Personalized multilingual Web content mining is particularly important for users who want to keep track of global knowledge that is relevant to their personal domain of interest over the multilingual WWW. This paper presents a novel concept-based approach to personalized multilingual Web content mining that constructs a personal multilingual Web space using self-organising maps. The multilingual linguistic knowledge required to define the multilingual Web space is made available by encoding all multilingual concept-term relationships using a multilingual concept map. With this map as the linguistic knowledge base, a concept-based multilingual text miner is developed to reveal the conceptual content of multilingual Web documents and to form concept categories of multilingual Web documents on a concept-based browsing interface. To construct the personal multilingual Web space, a concept-based user profile is generated from a user’s bookmark file for highlighting the user’s topics of information interest on the browsing interface. As such, personal multilingual Web mining activities ranging from explorative browsing to user-oriented concept-focused information filtering are facilitated.
1 Introduction

The rapid expansion of the World Wide Web throughout the globe means electronically accessible information is now available in an ever-increasing number of languages. With the majority of this Web data being unstructured text [2], Web content mining technology capable of discovering useful knowledge from multilingual Web documents thus holds the key to exploiting the vast human knowledge hidden beneath this largely untapped multilingual text. Web content mining has attracted much research attention in recent years [6]. It has emerged as an area of text mining specific to Web documents, focusing on analysing and deriving meaning from textual collections on the Internet [3]. Currently, Web content mining technology is still limited to processing monolingual Web documents. The challenge of discovering knowledge from textual data that is significantly linguistically diverse has been well recognised by text mining research [13]. In a monolingual environment, the conceptual content of documents can be discovered by directly detecting patterns of frequent features (i.e. terms) without precedential knowledge of the concept-term relationship. Documents containing an
identical known term pattern thus share the same concept. However, in a multilingual environment, vocabulary mismatch among diverse languages implies that documents exhibiting similar concepts will not contain identical term patterns. This feature incompatibility problem thus makes the inference of conceptual contents using term pattern matching inapplicable. To enable multilingual Web content mining, linguistic knowledge of concept-term relationships is essential to exploit any knowledge relevant to the domain of a multilingual document collection. Without such linguistic knowledge, no text or Web mining algorithm can effectively infer the conceptual content of the multilingual documents. In addition, in the multilingual WWW, a user’s motive of information seeking is global knowledge exploration. As such, major multilingual Web content mining activities include (a) explorative browsing, which aims at gaining a general overview of a certain domain, and (b) user-oriented concept-focused information filtering, which looks only for knowledge relevant to the user’s personal topics of interest. To support global knowledge exploration, it is thus necessary to reveal the conceptual content of multilingual Web documents by suggesting some scheme of document browsing to the user that suits the user’s information seeking needs. To address these issues, a concept-based approach to generating a personal multilingual Web space for personal multilingual Web content mining is proposed.
2 Personalized Multilingual Web Content Mining

The concept-based approach towards personalized multilingual Web content mining rests on the notion that while languages are culture bound, the concepts expressed by these languages are universal [12]. Moreover, conceptual relationships among terms are inferable from the way that terms are set down in text. Therefore, the domain-specific multilingual concept-term relationships can be discovered by analysing
relevant multilingual training documents. Figure 1 shows the framework for this concept-based approach to personalized multilingual Web content mining, facilitated by the generation of a personal multilingual Web space. First, a parallel corpus, which is a collection of documents and their translations, is used as the set of training documents for constructing a concept map using a self-organising map [5]. The concept map encodes all multilingual concept-term relationships as the linguistic knowledge base for multilingual text mining. With the concept map, a concept-based multilingual text miner is developed by organising the training documents on a second self-organising map. This multilingual text miner is then used to classify newly fetched multilingual Web documents, using the concept map as the linguistic knowledge base. Multilingual documents describing similar concepts will then be mapped onto a browsing interface as document clusters. To facilitate the construction of a personal multilingual Web space, a concept-based user profile is generated using the user’s bookmark file as the indicator of his/her information interests. Each user’s personal topics of interest are then highlighted on the browsing interface by mapping the user profile to relevant document clusters. As a result, explorative browsing that aims at gaining an overview of a certain domain and user-oriented concept-focused information filtering are both achieved.
Fig. 1. Personalized Multilingual Web Content Mining
3 Constructing the Multilingual Concept Map

From the viewpoint of automatic text processing, the relationships between terms’ meanings are inferable from the way that the terms are set down in the text. Natural language is used to encode and transmit concepts. A sufficiently comprehensive sample of natural language text, such as a well-balanced corpus, may offer a fairly complete representation of the concepts and conceptual relationships applicable within specific areas of discourse. Given corpus statistics of term occurrence, the associations among terms become measurable, and sets of semantically/conceptually related terms can be detected.
To construct a multilingual linguistic knowledge base encoding lexical relationships among multilingual terms, parallel corpora containing sets of documents and their translations in multiple languages are ideal sources of multilingual lexical information. Parallel documents basically contain identical concepts expressed by different sets of terms. Therefore, multilingual terms used to describe the same concept tend to occur with very similar inter- and intra-document frequencies across a parallel corpus. An analysis of paired documents has been used to infer the most likely translations of terms between languages in the corpus [1,4,7]. As such, co-occurrence statistics of multilingual terms across a parallel corpus can be used to determine clusters of conceptually related multilingual terms.

Given a parallel corpus D consisting of N pairs of parallel documents, meaningful terms from every language covered by the corpus are extracted. They form the set of multilingual terms for constructing the multilingual concept map. Each term is represented by an N-dimensional term vector. Each feature value of the term vector corresponds to the weight of one document, indicating the significance of that document in characterising the meaning of the term. Parallel documents, which are translated versions of one another within the corpus, are considered as the same feature. To determine the significance of each document in characterising the contextual content of a term based on the term’s occurrences, the following weighting scheme is used. It calculates the feature value $x_{kp}$ of document $d_p$, for $p = 1, \ldots, N$, in the vector of term $t_k$:

$$x_{kp} = \frac{tf_{kp} \cdot itf_p}{n_k}$$

where $tf_{kp}$ is the occurrence of term $t_k$ in document $d_p$; $itf_p = \log(T/T_p)$ is the inverse term frequency of document $d_p$, with $T$ the number of terms in the whole collection and $T_p$ the number of terms in document $d_p$ (the longer the document $d_p$, the smaller the inverse term frequency $itf_p$); and $n_k$, the document frequency of term $t_k$, is the normalisation factor. With this normalisation factor, the feature value relating a document to a term is reduced according to the total number of documents in which the term occurs. When the contextual contents of every multilingual term are well represented, they are used as the input to the self-organising map algorithm for constructing the multilingual concept map. Let $x_k$, for $k = 1, \ldots, M$, be the term vector of the $k$th multilingual term, where N is the number of documents in the parallel corpus for a single language (i.e. the total number of documents in the parallel corpus divided by the number of languages supported by the corpus) and M is the total number of multilingual terms. The self-organising map algorithm is applied to form a multilingual concept map, using these term vectors as the training input to the map. The map consists of a regular grid of nodes. Each node is associated with an N-dimensional model vector. Let $m_j$ be the model vector of the $j$th node on the map. The algorithm for forming the multilingual concept map is given below.

Step 1: Select a training multilingual term vector $x_k$ at random.
Step 2: Find the winning node s on the map with the model vector $m_s$ closest to $x_k$, such that

$$\| x_k - m_s \| = \min_j \| x_k - m_j \|$$
Step 3: Update the weight of every node in the neighbourhood $N_s$ of node s by

$$m_j(t+1) = m_j(t) + \alpha(t) \, [x_k(t) - m_j(t)]$$

where $\alpha(t)$ is the gain term at time $t$ that decreases in time and converges to 0.
Step 4: Increase the time stamp t and repeat the training process until it converges.

After the training process is completed, each multilingual term is mapped to the grid node closest to it on the self-organising map. A multilingual concept map is thus formed. This process corresponds to a projection of the multi-dimensional term vectors onto an orderly two-dimensional concept space where the proximity of the multilingual terms is preserved as faithfully as possible. Consequently, conceptual similarities among multilingual terms are explicitly revealed by their locations and neighbourhood relationships on the map. To represent the relationship between every language-independent concept and its associated multilingual terms on the concept map, each term vector representing a multilingual term is input once again to find its corresponding winning node on the self-organising map. All multilingual terms for which a node is the corresponding winning node are associated with that node. Therefore, a node will be
represented by several multilingual terms that are often synonymous. In this way, conceptually related multilingual terms are organised into term clusters within a common semantic space. The problem of feature incompatibility among multiple languages is thus overcome.
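A compact NumPy sketch of Steps 1-4 above for a small rectangular map; the Gaussian neighbourhood and the linearly decaying gain and radius are common choices standing in for the unspecified schedules.

```python
import numpy as np

def train_som(term_vectors, rows=10, cols=10, steps=10000, a0=0.5, seed=0):
    """Train a self-organising map on an (M, N) array of term vectors and
    return the (rows*cols, N) array of model vectors."""
    rng = np.random.default_rng(seed)
    M, N = term_vectors.shape
    m = rng.random((rows * cols, N))                      # model vectors m_j
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)])
    for t in range(steps):
        x = term_vectors[rng.integers(M)]                 # Step 1: random term
        s = np.argmin(((m - x) ** 2).sum(axis=1))         # Step 2: winning node
        alpha = a0 * (1 - t / steps)                      # decaying gain term
        radius = max(1.0, (rows / 2) * (1 - t / steps))   # shrinking neighbourhood
        d2 = ((grid - grid[s]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2 * radius ** 2))               # neighbourhood weight
        m += (alpha * h)[:, None] * (x - m)               # Step 3: update
    return m

# Each term is then assigned to its nearest node, argmin_j ||x_k - m_j||,
# and terms sharing a node form one concept cluster.
```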
4 Developing the Multilingual Text Miner

The objective of developing a concept-based multilingual text miner is to reveal the conceptual content of arbitrary multilingual Web documents by organising them into concept categories in accordance with their meanings. Sorting document collections with the self-organising map algorithm depends heavily on the document representation scheme. To form a map that displays relationships among document contents, a suitable method for document indexing must be devised. The contextual contents of documents need to be expressed explicitly in a computationally meaningful way. In information retrieval, the goal of indexing is to extract a set of features that represent the contents, or the ‘meaning’, of a document. Among the several approaches suggested for document indexing and representation, the vector space model [11] represents documents conveniently as vectors in a multi-dimensional space defined by a set of language-specific index terms. Each element of a document vector corresponds to the weight (or occurrence) of one index term. However, in a multilingual environment, the direct application of the vector space model is infeasible due to the feature incompatibility problem. Multilingual index terms characterising documents of different languages exist in separate vector spaces. To overcome the problem, a better representation of document contents incorporating information about semantic/conceptual relationships among multilingual index terms is desirable. Towards this end, the multilingual concept map obtained in Section 3 is applied. On the multilingual concept map, conceptually related multilingual terms are organised into term clusters. These term clusters, denoting language-independent concepts, are thus used to index multilingual documents in place of the documents’ original language-specific index terms. As such, a concept-based document vector that explicitly expresses the conceptual context of a document regardless of its language is obtained. The term-based document vector of the vector space model, which suffers from the feature incompatibility problem, can now be replaced with the language-independent concept-based document vector. The transformed concept-based document vectors are then organised using the self-organising map algorithm to produce a concept-based multilingual text miner. To do so, each document of the parallel corpus is indexed by mapping its text, term by term, onto the multilingual concept map, whereby statistics of its ‘hits’ on each multilingual term cluster (i.e. concept) are recorded. This is done by counting the occurrence of each term on the multilingual concept map at the node to which that term is associated. These statistics of term cluster occurrences can be interpreted as a kind of transformed ‘index’ of the multilingual document. The concept-based multilingual text miner is formed with the application of the self-organising map algorithm, using the transformed concept-based document vectors as inputs. A sketch of this indexing step follows.
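The transformed ‘index’ described above amounts to a histogram of hits over the nodes of the concept map; in the sketch below, term_to_node is assumed to come from the trained concept map of Section 3.

```python
from collections import Counter

def concept_vector(document_terms, term_to_node, n_nodes):
    """Concept-based document vector: for each node of the concept map,
    count how many of the document's term occurrences hit that node."""
    hits = Counter(term_to_node[t] for t in document_terms if t in term_to_node)
    return [hits.get(j, 0) for j in range(n_nodes)]

# A Spanish and an English document on the same topic yield similar vectors,
# since their different terms map to shared concept nodes.
```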
Let $v_i$, for $i = 1, \ldots, H$, be the concept-based document vector of the $i$th multilingual document, where G is the number of nodes existing in the multilingual concept map and H is the total number of documents in the parallel corpus. In addition, let $m_j$ be the G-dimensional model vector of the $j$th node on the map. The algorithm for forming the concept-based multilingual text miner is given below.

Step 1: Select a training concept-based document vector $v_i$ at random.
Step 2: Find the winning node s on the map with the model vector $m_s$ closest to document vector $v_i$, such that

$$\| v_i - m_s \| = \min_j \| v_i - m_j \|$$

Step 3: Update the weight of every node in the neighbourhood $N_s$ of node s by

$$m_j(t+1) = m_j(t) + \alpha(t) \, [v_i(t) - m_j(t)]$$

where $\alpha(t)$ is the gain term at time t that decreases in time and converges to 0.
Step 4: Increase the time stamp t and repeat the training process until it converges.

After the training process, multilingual documents from the parallel corpus that describe similar concepts are mapped onto the same node, forming document clusters on the self-organising map. Each node thus defines a concept category of the concept-based multilingual text miner and its corresponding browsing interface. The concept-based multilingual text miner is then used to classify newly fetched multilingual Web documents. To do so, the text of every multilingual Web document is first converted into a concept-based document vector using the multilingual concept map as the linguistic knowledge base. This document vector is then input to the multilingual text miner to find the winning concept category closest to it on the self-organising map. Consequently, every multilingual Web document is assigned to a concept category on a concept-based browsing interface based on the conceptual content it exhibits. Based on a predefined network of concepts associating correlated multilingual Web documents, the purpose of concept-based explorative browsing in multilingual Web content mining is thus achieved.
5 Generating the Personal Multilingual Web Space

With the overwhelming amount of information in the multilingual WWW, not every piece of information is of interest to a user. In such circumstances, a user profile, which models the user’s information interests, is required to filter out information that the user is not interested in. Common approaches to user profiling [8,9,10] build a representation of the user’s information interests based on the distribution of terms found in some previously seen documents which the user has found interesting. However, such a representation has
difficulties in handling situations where a user is interested in more than one topic. In addition, in a multilingual environment, the feature incompatibility problem resulting from the vocabulary mismatch phenomenon across languages makes a language-specific term-based user profile insufficient for representing a user’s information interests that span multiple languages. To overcome these problems, we propose a concept-based representation for building user profiles. Using language-independent concepts rather than language-specific terms implies that the resulting user profile is not only more semantically comprehensive but also independent of the language of the documents to be filtered. This is particularly important for multilingual Web content mining, where knowledge relevant to a concept in significantly diverse languages has to be identified. To understand the user’s information interests for personalising multilingual Web content mining, the user’s preferences on the WWW are used. Indicators of these preferences can be obtained from the user’s bookmark file. To generate a concept-based user profile from a user bookmark file, the Web documents pointed to by the bookmarks are first retrieved. Applying the multilingual concept map as the linguistic knowledge base, each Web document is then converted into a concept-based document vector using the procedure described in Section 4. Each concept-based document vector representing a bookmarked Web page is input to find its winning node on the multilingual text miner. All bookmarked multilingual Web pages for which a node is the winning node are associated with the same concept category. After mapping all bookmarks’ document vectors onto the multilingual text miner, the concept categories relevant to the user’s bookmark file are revealed. As such, these concept categories can be regarded as the user profile representing a user’s information interest in multiple topics. By highlighting these concept categories on the concept-based browsing interface, a personal multilingual Web space is generated. Hence, multilingual Web content mining is personalised. This task of user-oriented concept-focused information filtering is particularly important for a user who wants to keep track of global knowledge that is relevant to his/her personal domain of interest over the multilingual WWW.
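A sketch of this profiling step, reusing the concept_vector indexing above; winning_node stands for the nearest-node lookup on the trained text miner, and bookmarks holds the tokenised texts of the bookmarked pages (all names illustrative).

```python
def build_user_profile(bookmarks, term_to_node, n_nodes, winning_node):
    """Map each bookmarked page to its concept category on the text miner;
    the set of categories hit forms the concept-based user profile."""
    profile = set()
    for page_terms in bookmarks:
        v = concept_vector(page_terms, term_to_node, n_nodes)
        profile.add(winning_node(v))       # nearest node = concept category
    return profile                         # categories to highlight
```

Because the profile is a set of concept categories rather than one term distribution, a user interested in several unrelated topics simply contributes several categories, and pages in any language can match them.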
6 Conclusion

This paper has presented a concept-based approach to personalized multilingual Web content mining with the construction of a personal multilingual Web space using self-organising maps. The multilingual concept map is constructed to enable an automatic and unsupervised discovery of multilingual linguistic knowledge from a parallel corpus. A concept-based multilingual text miner is developed to realise a language-independent concept-based classification of multilingual Web documents onto a single browsing interface. A concept-based user profile is generated from the user’s bookmark file to model a user’s multilingual information interests comprising multiple topics. This approach to user profiling increases the semantic comprehensiveness, and the resultant user profile is independent of the language of the Web documents to be filtered. As a result, multilingual Web content mining activities
ranging from explorative browsing to concept-focused information filtering can be effectively personalised in a user’s individual information space.
References
[1] Carbonell, J.G., Yang, Y., Frederking, R.E., Brown, R.D., Geng, Y. and Lee, D. (1997) Translingual information retrieval: a comparative evaluation. (Ed) Pollack, M.E., In: IJCAI-97 Proceedings of the International Joint Conference on Artificial Intelligence, pp. 708-714.
[2] Chakrabarti, S. (2000) Data mining for hypertext: a tutorial survey. ACM SIGKDD Explorations, 1(2), pp. 1-11.
[3] Chang, C., Healey, M.J., McHugh, J.A.M. and Wang, J.T.L. (2001) Mining the World Wide Web: an information search approach. Kluwer Academic Publishers.
[4] Davis, M. (1996) New experiments in cross-language text retrieval at NMSU's Computing Research Lab. In: Proceedings of the Fifth Text Retrieval Conference (TREC-5). Gaithersburg, MD: National Institute of Standards and Technology.
[5] Kohonen, T. (1995) Self-Organizing Maps. Springer-Verlag, Berlin.
[6] Kosala, R. and Blockeel, H. (2000) Web mining research: a survey. ACM SIGKDD Explorations, 2(1), pp. 1-15.
[7] Landauer, T.K. and Littman, M.L. (1990) Fully automatic cross-language document retrieval. In: Proceedings of the Sixth Conference on Electronic Text Research, pp. 31-38.
[8] Lang, K. (1995) NewsWeeder: Learning to filter news. In: Proceedings of the International Conference on Machine Learning, Lake Tahoe, CA, Morgan Kaufmann, pp. 331-339.
[9] Lieberman, H., Van Dyke, N.W. and Vivacqua, A.S. (1999) Let's browse: A collaborative browsing agent. In: Proceedings of the 1999 International Conference on Intelligent User Interfaces, Collaborative Filtering and Collaborative Interfaces, pp. 65-68.
[10] Mukhopadhyay, S., Mostafa, J., Palakal, M., Lam, W., Xue, L. and Hudli, A. (1996) An adaptive multi-level information filtering system. In: Proceedings of the Fifth International Conference on User Modelling, pp. 21-28.
[11] Salton, G. (1989) Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, Reading, MA.
[12] Soergel, D. (1997) Multilingual thesauri in cross-language text and speech retrieval. In: Working Notes of AAAI Spring Symposium on Cross-Language Text and Speech Retrieval, Stanford, CA, pp. 164-170.
[13] Tan, A.H. (1999) Text mining: The state of the art and the challenges. In: Proceedings of PAKDD'99 Workshop on Knowledge Discovery from Advanced Databases, Beijing, pp. 65-70.
Intelligent Multimedia Information Retrieval for Identifying and Rating Adult Images Seong-Joon Yoo School of Computer Engineering, Sejong University, Seoul, 143-747, Korea [email protected]
Abstract. We applied an intelligent multimedia information retrieval technique to devise an algorithm for identifying and rating adult images. Given a query, the ten most similar images are retrieved from an adult image database and a non-adult image database in which we store existing images of each class. If the majority of the retrieved images are adult images, then the query is determined to be an adult image. Otherwise, it is determined to be of the non-adult class. Our experiment shows 99% true positives with 23% false positives with a database containing 1,300 non-adult images, and 93.5% correct detections with 8.4% false positives when experimented with a database containing 12,000 non-adult images. 9,900 adult images are used in both experiments. We also present an adult image rating algorithm which produces results that can be used as a reference for rating images.
1 Introduction

As the Internet proliferates, young children are easily exposed to adult content through web browsers or emails. To remedy this problem, there have been several efforts, such as developing systems filtering adult content either from the Internet or from spam mails. Several institutions build rating databases by exploring web sites every day and rating each page manually or semi-automatically. The rating database can be downloaded periodically into each filter of a user's PC so that children are blocked from accessing any adult-rated web pages. This semi-automatic rating requires an algorithm that classifies web pages by interpreting textual words or images. Since rating a web page based upon the interpretation of textual contents does not bring perfect results, interpretation of images needs to follow. There has been drastic evolution in image processing technology for decades. Although several papers [1,2,3,4,5,6] have presented methods of identifying nude pictures by applying the achievements of image processing technology, no previous work shows a satisfactory performance. In addition, we have not seen any research mentioning methods of rating adult images. In this paper, we devise an algorithm for identifying and rating adult images by utilizing an intelligent multimedia information retrieval (IMIR) technique. IMIR is defined as a multidisciplinary area that lies at the intersection of artificial intelligence, information
retrieval and multimedia computing [7]. Content-based Retrieval of Imagery (CBIR), as well as content-based retrieval of video and other intelligent multimedia retrieval topics, is one of the areas of IMIR. The CBIR method is exploited to identify and rate adult images in this paper, and its performance is better than that of any previous work based mainly on image understanding techniques. Given a query, the ten most similar images are retrieved from an adult image database and a non-adult image database in which we store existing images of each class. If the majority of the retrieved images are adult images, then the query is determined to be an adult image. Otherwise, it is determined to be of the non-adult class. Our experiment shows the results of the proposed method on objectionable images: a 99% detection rate and 23% false positives with a 1,300 non-adult training image database; a 93.5% detection rate and 8.4% false positives with a 12,000 non-adult training image database. We used 9,900 adult images in either case. We show how it can be incorporated into a detection system for adult images on the Internet in a later section. A method of rating adult images is also presented. In a similar way as described above, ten similar images are retrieved from databases of four types: A, B, C and D. If, for example, the majority of similar images are from the type B database, then the query image is potentially type B, that is, a fully naked person in the query picture. Due to the rather low performance of the proposed method, the rating result can be used as a reference instead of being used in practice.
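The detection and rating rules both reduce to a k-nearest-neighbour majority vote over the stored image databases. A sketch with k = 10, where distance stands for the feature-descriptor comparison of Section 3.2 and databases maps labels ('adult'/'non-adult', or 'A'/'B'/'C'/'D' for rating) to stored feature descriptors:

```python
from collections import Counter

def classify_image(query_feat, databases, distance, k=10):
    """Retrieve the k stored images nearest to the query and return the
    majority label among them."""
    scored = [(distance(query_feat, feat), label)
              for label, feats in databases.items() for feat in feats]
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

The same function serves both tasks: with two databases it answers adult versus non-adult; with four it produces the tentative A/B/C/D rating.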
2 Previous Work

Jones and Rehg [1] proposed statistical color models for skin and non-skin classes. A skin pixel detector based on the color models achieves a detection rate of 80% with 8.5% false positives. They built a detector for naked people using aggregated features such as the percentage of pixels detected as skin, the average probability of the skin pixels, the size in pixels of the largest connected component of skin, the number of connected components of skin, and the percent of colors with no entries in the skin and non-skin histograms, all computed from the skin detector. They used 10,679 adult and non-adult images to train a neural network classifier. Their color-based adult image detector achieved 85.8% correct detections with 7.5% false positives. Forsyth et al. [2,3,4] and Wang et al. [5] also developed systems for detecting images containing naked people. Forsyth et al. combined color and texture properties to obtain a mask for skin regions that is then fed to a geometric filter based on body plans. If the skin filter alone is used, the detection rate is 79.3% with 11.3% false positives; when combined with a geometry filter, the detection rate falls to 42.3%. However, the false positives fall to 4.2%. Wang et al. developed the WIPE system, which uses a manually-specified color histogram model as a pre-filter in an analysis pipeline. Images whose probability is high pass on to a final stage of analysis where they are classified using wavelet filters. This system shows a 96% detection rate with 9% false positives.
Drimbarean et al. [6] proposed an image processing technique to detect and filter objectionable images based on skin detection and shape recognition. The technique includes a method of matching skin tones based on a fuzzy classification scheme and shape recognition techniques to match faces and other elements of the human anatomy. However, since they did not present any performance results, we cannot compare detection correctness.
3 Identifying and Rating Adult Images
3.1 The Architecture and Data Flow
We build an intelligent adult image retrieval and rating system (AIRS). Fig. 1 shows its architecture and data flow. AIRS is composed of three layers: a query and rate processing layer, an indexing layer, and a model rate database layer. Once a query is issued to AIRS, the query processing layer extracts its MPEG-7 composition histogram features as introduced in Section 3.2 and [8]. The feature is compared with the features in the model rate database layer. The intelligent image retrieval method, described in Section 3.2, retrieves relevant images by computing the distances between feature descriptors (e.g., histograms) of images. The CBF multidimensional indexing scheme [15] is exploited to speed up the comparison. The type A database stores pictures with naked female breasts, the type B database includes pictures with male or female genitals, and the type C database contains pictures with explicit sexual action. In addition to detecting adult images, AIRS classifies them into these three groups, which can be used for reference. The eleven most relevant descriptors (images) are found in the four databases, and the query image itself is then excluded from the result set. Suppose that one image is retrieved from database A, two from B, five from C, and one from D; then the query image belongs potentially to type C.

Fig. 1. The Architecture and Data Flow of AIRS
3.2 Multimedia Intelligent Information Retrieval Method in AIRS
This research exploits a composition of three MPEG-7 visual descriptors. Among the MPEG-7 visual descriptors, we adopt three for our retrieval system: the edge histogram descriptor (EHD) [9,11], the color layout descriptor (CLD) [12], and the homogeneous texture descriptor (HTD) [13,14]. This multimedia information retrieval method is described extensively in [8].
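The exact composition of the three descriptors is defined in [8]; purely as an illustration, one plausible way to combine them is a weighted sum of per-descriptor L1 distances, as in the Python sketch below. The weights, the L1 metric and the dictionary layout are assumptions, not the method of [8].

def composite_distance(q, x, weights=(1.0, 1.0, 1.0)):
    # q and x each hold three descriptor vectors per image:
    # "ehd" (edge histogram), "cld" (color layout), "htd" (homogeneous texture).
    def l1(a, b):
        return sum(abs(u - v) for u, v in zip(a, b))
    w_ehd, w_cld, w_htd = weights
    return (w_ehd * l1(q["ehd"], x["ehd"])
            + w_cld * l1(q["cld"], x["cld"])
            + w_htd * l1(q["htd"], x["htd"]))

q = {"ehd": [0.1, 0.4], "cld": [0.2, 0.2], "htd": [0.3, 0.1]}
x = {"ehd": [0.1, 0.5], "cld": [0.25, 0.2], "htd": [0.3, 0.05]}
print(composite_distance(q, x))  # 0.2, up to floating-point rounding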
4 Experiment and Comparison
4.1 Detection Rate of AIRS
We tested the detection rate of AIRS with four different query settings: i) 9,900 training adult images and 1,700 training non-adult images as queries, with 3,300 A, 3,300 B, 3,300 C and 1,700 D images in the training databases; ii) 9,900 training adult images and 12,000 training non-adult images as queries, with 3,300 A, 3,300 B, 3,300 C and 12,000 D images in the training databases; iii) 800 A, 800 B, 800 C, and 400 D test images as queries, with 2,500 A, 2,500 B, 2,500 C and 1,300 D images in the training databases; iv) 800 A, 800 B, 800 C, and 2,000 D test images as queries, with 2,500 A, 2,500 B, 2,500 C and 10,000 D images in the training databases. We also tested the A, B, C image classification performance of AIRS using the query and database of test case i) above. This is summarized in Table 1. Experiments 1 and 2 use training images as query images, while experiments 3 and 4 do not use the training images in the databases.
Experiment 3 shows the best detection rate, with rather high false positives. We prefer a system with an almost perfect detection rate even if it has rather high false positives, since low or zero false negatives keep an automatic adult-image detection system from missing adult images. Therefore, in a commercial system we can use databases with a moderate number of training adult images and a small number of non-adult images.
4.2 Comparison of Detection Rate
Table 3 compares the detection rate and false positives of AIRS with those of previous works. While AIRS shows a rather large number of false positives, it is superior in detection rate and can identify nearly all adult images.
4.3 Rating Adult Images
As the number of images in the model databases increases, the hit ratio of type A images for type A query images increases. Given 3,300 query images of type A, 45% of the retrieved images are type A, 26% are type B, 28% are type C, and only 1% are rated as "non-adult" images. Given 3,300 query images of type B, 19% of the retrieved images are type A, 49% are type B, 32% are type C, and no image is rated "non-adult". Whereas AIRS is not excellent at distinguishing type A from type B, as shown by these figures, it shows good performance in identifying type C images.
5 Conclusion
We have found that the proposed method is sufficiently accurate at detecting adult images to be practically useful. AIRS detects adult images using a combination of MPEG-7 visual descriptors: the edge histogram descriptor, the color layout descriptor, and the homogeneous texture descriptor. Four experiments tested the performance with query pictures and 11,600 to 21,900 training images stored in four databases. If the query is similar to an adult image, then it is determined to be an adult picture. AIRS detects 99% of adult images in the best case, with 23% false positives. It is more accurate in detecting adult images than any other technique previously developed. The system is relatively fast for queries of this complexity since it adopts a multidimensional indexing scheme. We have shown how the algorithm can be applied to reducing the time needed for manual rating of Internet content, since the algorithm removes most (77%) of the non-adult images in the preliminary filtering. The experiments show that the detection rate and the false positives of AIRS vary according to the numbers of adult and non-adult images. More experiments are needed to find an optimal number of training images that minimizes the false positives and maximizes the detection rate. We are going to further improve the retrieval performance of the proposed method by adopting a relevance feedback approach. Specifically, we can utilize previous relevance feedback approaches [11], namely the query point movement and re-weighting methods. The query point movement method essentially tries to improve the estimate of the "ideal query point" by considering the user's choice of relevant and irrelevant images among the retrieved images. The re-weighting method, on the other hand, tries to improve the relative importance of the feature values for the similarity matching.
References
1. Michael J. Jones and James M. Rehg, (1998) "Statistical Color Models with Application to Skin Detection," Technical Report Series, Cambridge Research Laboratory, December.
2. Margaret Fleck, David Forsyth, and Chris Bregler, (1996) "Finding Naked People," European Conference on Computer Vision, Volume II, pp. 592-602.
3. David A. Forsyth and Margaret M. Fleck, (1996) "Identifying Nude Pictures," IEEE Workshop on the Applications of Computer Vision, pp. 103-108.
4. David A. Forsyth and Margaret M. Fleck, (1997) "Body Plans," IEEE Conference on Computer Vision and Pattern Recognition, pp. 678-683.
5. James Ze Wang, Jia Li, Gio Wiederhold and Oscar Firschein, (1997) "System for Screening Objectionable Images Using Daubechies' Wavelets and Color Histograms," in Proceedings of the International Workshop on Interactive Distributed Multimedia Systems and Telecommunications Services, pp. 20-30.
6. Alexandru F. Drimbarean, Peter M. Corcoran, Mihai Cucic and Vasile Buzuloiu, (2000) "Image Processing Techniques to Detect and Filter Objectionable Images Based on Skin Tone and Shape Recognition," IEEE International Conference on Consumer Electronics.
7. Mark T. Maybury, (1997) Intelligent Multimedia Information Retrieval, AAAI Press.
8. Kang Hee Beom, Park Dong Kwon, Won Chee Sun, Park Soon Jun, and Yoo Seong Joon, (2002) "Image Retrieval Using a Composition of MPEG-7 Visual Descriptors," CISST.
9. ISO/IEC JTC1/SC29/WG11/W4062, (2001) "FCD 15938-3 Multimedia Content Description Interface - Part 3 Visual," Singapore, March.
10. Park D. K., Jeon Y. S., Won C. S., and Park S. J., (2000) "Efficient Use of Local Edge Histogram Descriptor," in Workshop on Standards, Interoperability and Practices, ACM, pp. 52-54, Marina del Rey, CA, Nov. 4.
11. Yoon Su Jung, Park Dong Kwon, Park Soo Jun, and Won Chee Sun, (2001) "Image Retrieval Using a Novel Relevance Feedback for Edge Histogram Descriptor of MPEG-7," ICCE 2001, L.A., June.
12. Huang J., Kumar S., Zhu W. J., Zabih R., (1997) "Image Indexing Using Color Correlograms," in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition.
13. Manjunath B. S., Ma W. Y., (1996) "Texture Features for Browsing and Retrieval of Image Data," IEEE Transactions on PAMI, Vol. 18, No. 8, August.
14. Wu Peng, Ro Yong Man, Won Chee Sun, Choi Yanglim, (2001) "Texture Descriptors in MPEG-7," CAIP 2001, LNCS 2124, pp. 21-28.
15. Chonbuk University, (1999) Multidimensional Feature Data Indexing Technology, Report, Electronics and Telecommunications Research Institute.
16. Rui Y., Huang T. S., Mehrotra S., (1997) "Content-Based Image Retrieval with Relevance Feedback in MARS," in Proc. IEEE Int. Conf. on Image Processing.
17. ISO/IEC JTC1/SC29/WG11, (1999) "Core Experiment Results for Spatial Intensity Descriptor (CT4)," MPEG document M5374, Maui, December.
Using Domain Knowledge to Learn from Heterogeneous Distributed Databases
Sally McClean, Bryan Scotney, and Mary Shapcott
Faculty of Engineering, University of Ulster, Coleraine BT52 1SA, Northern Ireland
{SI.McClean, BW.Scotney, CM.Shapcott}@ulster.ac.uk
Abstract. We are concerned with the processing of data held in distributed heterogeneous databases using domain knowledge, in the form of rules representing high-level knowledge about the data. This process facilitates the handling of missing, conflicting or unacceptable outlying data. In addition, by integrating the processed distributed data, we are able to extract new knowledge at a finer level of granularity than was present in the original data. Once integration has taken place the extracted knowledge, in the form of probabilities, may be used to learn association rules or Bayesian belief networks. Issues of confidentiality and efficiency of transfer of data across networks, whether the Internet or Intranets, are handled by aggregating the native data in situ, typically behind a firewall, and carrying out further transportation and processing solely on multidimensional aggregate tables. Heterogeneity is resolved by utilisation of domain knowledge for harmonisation and integration of the distributed data sources. Integration is carried out by minimisation of the Kullback-Leibler information divergence between the target integrated aggregates and the distributed data values.
1 Background
Our approach to knowledge discovery involves the use of domain knowledge, in the form of rules, to refine and improve the extraction of probabilities from the integration of data held in distributed heterogeneous databases. Once integration has taken place, the new knowledge, in the form of probabilities, may be used for knowledge discovery using association rules or Bayesian belief networks. We have previously developed a methodology that combines domain knowledge stored as metadata, in the form of rules and ontological information, with micro data (raw data) and macro data (multidimensional tables), which may be subject to uncertainty and imprecision (McClean et al. 2000a, 2000b). Such domain knowledge may then be used to re-engineer the database so as to solve problems of heterogeneity, resolve conflicts and refine the data to increase precision, prior to aggregation. Issues of confidentiality and efficiency of transfer of data across networks, whether the Internet or intranets, are handled by aggregating the native data in situ, typically behind a firewall, and carrying out further transportation and processing solely on aggregate tables.
Such aggregate data are frequently stored in distributed Data Warehouses (Albrecht & Lehner, 1998). Formerly, Data Warehousing was concerned with combining possibly heterogeneous databases at a central location; more recently the focus has moved to keeping the native databases in a distributed environment and integrating them 'on the fly'. The use of Data Warehousing and OLAP technology along with data mining therefore allows for the possibility of carrying out analysis on large datasets that were previously inaccessible (Jiawei, 1998), where such data are often subject to both imprecision and uncertainty, including missing values (Parsons, 1996). In this paper we build on our previous work in a number of ways, as follows:
1. We extend to distributed heterogeneous databases our previous work on using background knowledge to improve knowledge extraction (McClean et al. 2000a, 2000b).
2. We describe how this approach can be used to resolve data conflicts.
3. We extend our previous work on integrating heterogeneous aggregate views of distributed databases (McClean et al., 2003) to data that are imprecise, and show how background knowledge may be used to improve this process.
2 Re-engineering Using the Background Knowledge
We are concerned with utilising domain knowledge in the form of rules; these rules may be specified as arising from a concept hierarchy of attributes via ontologies, as integrity constraints, from the integration of conflicting databases, or from knowledge possessed by domain experts. We have proposed a methodology that re-engineers the database by replacing missing, conflicting or unacceptably outlying data by subsets of the attribute domain (McClean et al., 2000b, 2001). This approach may be thought of as a preparatory stage that utilises inductive reasoning to re-engineer the data values, thus increasing their precision and accuracy. Probabilistic reasoning is then used to integrate the distributed data and extract a set of probabilities. These probabilities are in turn used to discover new knowledge via association rules or Bayesian belief networks. We assume that in the original database attribute values may be given either as singleton sets, as proper subsets of the domain, or as concepts that correspond to proper subsets of the attribute domain. In the last case the values may be defined in terms of a concept hierarchy or as part of an ontology. In addition, there are rules describing the domain. A partial value relation such as we propose, with values that are sets, as illustrated in Table 1, has been discussed previously, e.g. by Chen & Tseng (1996). Here a value for the attribute Job_title may be a singleton set, may be {NULL}, may be a concept from the concept hierarchy defined in Figure 1, e.g. {Academic}, or may be a subset of the base domain, e.g. {Technician, Computer Officer}. We note that the base domain is given here by the set {Professor, Senior Lecturer, Lecturer, Technician, Computer Officer}. We define integrity constraints for the data in Table 1, which are here associated with salary scales. For example, a Lecturer salary must be between £19,000 and £30,000. In addition, we might have some general rules such as "Non-Professorial salaries are always less than Professorial staff salaries". Examples of domain knowledge of this sort are presented in Table 2.
Database 1 adheres to Ontology 1 (Academic, Technician, Computer Officer), while Database 2 adheres to Ontology 2 (Professor, Non-Professorial, Technical). Ontologies represent a form of background data whereby the data providers can map their schema to another, often via another well-known ontology. These mappings are encapsulated in the concept hierarchy (Figure 1). The second type of background knowledge consists of rules of the form A ⇒ B, where A and B are user-defined predicates that are relational views of the original data. These relational views may be defined by relational operators, and ⇒ (implies) is the if-then logical connective. An example of such a rule in Table 2 is: (job_title = "Computer Officer") ⇒ (salary > 20000). The rule may also be generalised, as in the clause ∀X ∀Y [Professorial(X) ∧ Non-Professorial(Y) ⇒ salary(X) > salary(Y)], where ∀ (for all) is the universal quantifier. General rules of this sort may either be provided by the domain expert or induced from the database using inductive logic programming. Such rules may involve individual tuples, express constraints between tuples, or express constraints between relations. In addition, we may resolve conflicts by expanding the set of possible values so that all the conflicting values are accommodated; we assume here that at least one of the conflicting values is correct. Such concepts and rules may be encoded in a declarative language such as Prolog. A program may thus be developed which scans through the database, evaluating each data value and replacing it, if possible, with a subset of the base domain of finer
granularity. An example of how this database re-engineering works in practice is presented in Table 3, which is Table 1 re-engineered using the background knowledge in Table 2. Such re-engineering is possible when there is a functional dependency between the re-engineering attribute and the re-engineered attribute.
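To make the re-engineering step concrete, the following Python sketch narrows a set-valued Job_title using salary-scale integrity constraints in the spirit of Table 2. Only the Lecturer scale (£19,000 to £30,000) comes from the text; the remaining scales and the function name are hypothetical.

# Hypothetical salary scales in the spirit of Table 2 (only the Lecturer
# range is taken from the text; the others are illustrative).
SALARY_SCALES = {
    "Professor": (40000, 70000),
    "Senior Lecturer": (30000, 40000),
    "Lecturer": (19000, 30000),
    "Technician": (15000, 25000),
    "Computer Officer": (20000, 35000),
}

def refine_job_titles(candidate_titles, salary):
    # Keep only the titles whose salary scale is consistent with the tuple's
    # salary, replacing a coarse value set by a finer subset of the base domain.
    refined = {t for t in candidate_titles
               if SALARY_SCALES[t][0] <= salary <= SALARY_SCALES[t][1]}
    return refined or set(candidate_titles)  # never refine to the empty set

# Example: the concept {Academic} with a salary of 21,000 narrows to {Lecturer}.
print(refine_job_titles({"Professor", "Senior Lecturer", "Lecturer"}, 21000))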
Fig. 1. The concept hierarchy for the Job_title attribute
3 Integrating the Data Sources
Databases distributed over the Internet are often rich in latent information that may be extracted using knowledge discovery and data mining techniques. Data integration is another means of extracting new knowledge, where large benefits in knowledge acquisition can be obtained from the application of appropriate integration algorithms (e.g. Scotney et al., 1999a, 1999b). We are here concerned with the integration of heterogeneous data to provide new knowledge, often at a finer level of detail than was available at any of the contributing sources. Once the domain knowledge has been used to re-engineer the databases, integration is carried out by using the EM (expectation-maximisation) algorithm (Vardi & Lee, 1993) to minimise the Kullback-Leibler information divergence
between the aggregated probability distribution and the data; this is equivalent to maximising the likelihood of the model given the data, and it apportions uncertain belief in an intuitive manner. We consider an attribute A with corresponding domain $D = \{a_1, \dots, a_k\}$. Then, for the ontology $O_j$ corresponding to distributed database $DB_j$, the domain D is described by attribute subsets $S_r^{(j)}$ for $r = 1, \dots, m_j$, whose union is D. (We note that these subsets may overlap, and each subset consists of a number of the $a_i$.) Here $m_j$ is the number of categories in the classification scheme of attribute A in view j. We further define the cardinalities $n_r^{(j)}$, the numbers of tuples of $DB_j$ whose value set is $S_r^{(j)}$.
In our case, minimising the Kullback-Leibler information divergence, or equivalently maximising the log-likelihood, is achieved by an iterative scheme of the following form, an application of the EM (expectation-maximisation) algorithm (Vardi & Lee, 1993), which is widely used for the solution of incomplete data problems:

$\pi_i^{(t+1)} = \frac{1}{N} \sum_{r} \frac{\pi_i^{(t)} \, \mathbf{1}\{a_i \in S_r\}}{\sum_{l:\, a_l \in S_r} \pi_l^{(t)}}$

Here r indexes the tuples ($S_r$ denoting the value set of tuple r and N the total number of tuples), i indexes the base values, and the $\pi_i$ are the set of probabilities of the base values $a_i$. This formula has been shown to converge monotonically to the solution of the minimum information divergence equation (Vardi & Lee, 1993). In a series of papers (McClean et al. 1998, 2001, 2003; Scotney et al. 1999a, 1999b) we have described the derivation and implementation of such an approach to the integration of heterogeneous and/or distributed data. In our present context the cardinality values are derived at the level of the common ontology, and the resulting datasets are then integrated. Within this data model we can reclassify all local schemas (the local ontologies) with respect to the new classification (the common ontology). Here the common ontology may be computed from the ontology mappings; we call this the Dynamic Shared Ontology (DSO) (McClean et al., 2003). Alternatively, the common ontology may be user-specified (McClean et al., 2002). For example, using the re-engineered data in Table 3, we obtain a DSO of {Professor, Senior Lecturer, Lecturer, Technician, Computer Officer} (N.B. in this case the DSO comprises the base values, but this is not necessarily the case). Applying the iterative equations to this example yields
the corresponding probabilities (0.12, 0.12, 0.37, 0.16, 0.23).
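A minimal Python sketch of the iterative scheme, under the notational assumptions made above (one value set per tuple and uniform starting probabilities), might look as follows; it is an illustration, not the project implementation.

def em_aggregate(value_sets, domain, iterations=100):
    # value_sets: one set of possible base values per tuple (singletons for
    # precise tuples, larger subsets for imprecise ones).
    p = {v: 1.0 / len(domain) for v in domain}  # uniform starting distribution
    n = len(value_sets)
    for _ in range(iterations):
        new_p = {v: 0.0 for v in domain}
        for s in value_sets:
            mass = sum(p[v] for v in s)
            for v in s:
                # Apportion each tuple's unit weight over its possible values.
                new_p[v] += p[v] / mass / n
        p = new_p
    return p

# Example: two precise tuples and one imprecise tuple.
print(em_aggregate([{"Lecturer"}, {"Professor"}, {"Lecturer", "Professor"}],
                   ["Lecturer", "Professor", "Technician"]))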
4 Knowledge Discovery
Knowledge discovery from databases often consists of deriving beliefs or rules from database tuples. Rules tend to be based on sets of attribute values, partitioned into an antecedent and a consequent. Support for a rule is based on the proportion of tuples in the database which have the specified attribute values in both the antecedent and the consequent. Depending on the nature of the domain, rules may be strong, i.e. true most of the time, or weak, i.e. typically described in terms such as "if the antecedent is true then the probability of the consequent is significantly higher (or lower) than if the antecedent is false". Weak rules may be induced using probabilistic methods such as Bayesian belief models. Finding rules, whether weak or strong, can be computationally intensive, and involves finding all of the covering attribute sets A and then testing whether the rule "A implies B", for some attribute set B separate from A, holds with sufficient confidence. Efficient mechanisms are therefore required for searching the data and eliminating redundant or ineffective rules. In Section 2 we described the use of domain knowledge and inductive reasoning to re-engineer the database, thus permitting more accurate rules to be induced. In Section 3, the re-engineered local databases are integrated using probabilistic reasoning to obtain rules in the form of probability distributions, often at a finer level of detail than was available at any of the contributing sources. Such rules can be utilised for knowledge discovery by learning association rules and Bayesian belief networks. Such methods are based on the calculation of conditional probabilities; in our case these are computed from the probability distributions.
Acknowledgement
This work was funded by MISSION - Multi-agent Integration of Shared Statistical Information over the (inter)Net (IST project number 1999-10655) - within EUROSTAT's EPROS initiative.
References
1. Albrecht J. and Lehner W., (1998). On-line Analytical Processing in Distributed Data Warehouses. IDEAS 1998, 78-85.
2. Chen A. L. P. and Tseng F. S. C., (1996). Evaluating Aggregate Operations over Imprecise Data. IEEE Transactions on Knowledge and Data Engineering, 8, 273-284.
3. Jiawei H., (1998). Towards On-Line Analytical Mining in Large Databases. SIGMOD Record 27(1), 97-107.
4. McClean S. I., Scotney B. W. and Shapcott C. M., (1998). Aggregation of Imprecise and Uncertain Information for Knowledge Discovery in Databases. In Proceedings of KDD-98, New York, 269-273.
5. McClean S. I., Scotney B. W. and Shapcott C. M., (2000a). Using Background Knowledge in the Aggregation of Imprecise Evidence in Databases. Data and Knowledge Engineering, 32, 131-143.
6. McClean S. I., Scotney B. W. and Shapcott C. M., (2000b). Incorporating Domain Knowledge into Attribute-Oriented Data Mining. International Journal of Intelligent Systems, 6, 535-548.
7. McClean S. I., Scotney B. W. and Shapcott M., (2001). Aggregation of Imprecise and Uncertain Information in Databases. IEEE Transactions on Knowledge and Data Engineering (TKDE), 13(6), 902-912.
8. McClean S. I., Páircéir R., Scotney B. W., Greer K. R. C., (2002). A Negotiation Agent for Distributed Heterogeneous Statistical Databases. Proc. 14th IEEE International Conference on Scientific and Statistical Database Management (SSDBM), 207-216.
9. McClean S. I., Scotney B. W., Greer K. R. C., (2003). A Scalable Approach to Integrating Heterogeneous Aggregate Views of Distributed Databases. IEEE Transactions on Knowledge and Data Engineering, 15(1), 232-235.
10. Parsons S., (1996). Current Approaches to Handling Imperfect Information in Data and Knowledge Bases. IEEE Transactions on Knowledge and Data Engineering, 8, 353-372.
11. Scotney B. W., McClean S. I., Rodgers M. C., (1999a). Optimal and Efficient Integration of Heterogeneous Summary Tables in a Distributed Database. Data and Knowledge Engineering, 29, 337-350.
12. Scotney B. W., McClean S. I., (1999b). Efficient Knowledge Discovery through the Integration of Heterogeneous Data. Information and Software Technology, 41, 569-578.
13. Vardi Y. and Lee D., (1993). From Image Deblurring to Optimal Investments: Maximum Likelihood Solutions for Positive Linear Inverse Problems (with discussion). J. R. Statist. Soc. B, 569-612.
A Peer-to-Peer Approach to Parallel Association Rule Mining
Hiroshi Ishikawa, Yasuo Shioya, Takeshi Omi, Manabu Ohta, and Kaoru Katayama
Graduate School of Engineering, Tokyo Metropolitan University
Abstract. Distributed computing based on P2P (peer-to-peer) networks is a technology attainable at a relatively low cost. This enables us to propose a flexible approach, based on the "Partition" algorithm (an extension of the "Apriori" algorithm), to efficiently mine association rules by cooperatively partitioning and distributing processes to nodes on a virtually tree-like P2P network topology. The concept of cooperation here means that any internal node contributes to the control of the whole process. First, we describe the general design of our basic approach and compare it with related techniques. We explain in detail the basic algorithm (without load balancing), implemented as experimental programs. Next, we explain the simulation settings and discuss evaluation results, which validate the effectiveness of our basic approach. Further, we describe and evaluate the algorithm with load balancing as an extension of the basic algorithm.
1 Introduction
Recently, Grid computing [6] has attracted a lot of attention for providing a large-scale computing framework using computers connected by networks. Distributed computing based on P2P (peer-to-peer) networks [7] is such a technology, attainable at a relatively low cost. This enables us to propose a flexible approach based on the Partition [12] algorithm to efficiently mine association rules by cooperatively partitioning and distributing processes to nodes on P2P networks. The concept of cooperation here means that any internal node contributes to the control of the whole process. In this paper, we evaluate the effectiveness of our approaches by simulating the whole system. First, we describe Grid computing and association rule mining as related work and compare parallel data mining techniques with our approach. Next, we describe the general design of our approach and explain in detail the basic algorithm (without load balancing), implemented as experimental programs. Then we explain the simulation settings and discuss evaluation results as a feasibility study of the basic algorithm. Lastly, we describe and evaluate load balancing as an extension of the basic algorithm.
2 Related Works
Grid computing is a technique which can provide large-scale computing power by connecting computing resources available on networks. P2P distributed computing is
classified into this category, as exemplified by the SETI@home project [13]. The greatest merit of P2P computing is its high cost-effectiveness, in that it provides processing power equal to supercomputing at a relatively low cost. Association rule mining [1] is a technique that evolved from basket analysis for promoting sales and improving layouts in retail. It analyses transactions of items bought by customers, recorded in databases, and extracts association rules such as "60% of customers who bought a product X bought a product Y at the same time, and the ratio of such customers is 20% of the whole." We represent the set of all items as I = {i1, i2, ..., im} and call its subsets itemsets. We represent a database of transactions as D = {t1, t2, ..., tn}. Each transaction ti is a subset of I. An association rule is an implication of the form X ⇒ Y, where X and Y are itemsets satisfying X ∩ Y = ∅; X and Y are the antecedent and the consequent, respectively. There are two parameters, support and confidence, associated with association rules. The former is the ratio of transactions containing the itemsets within a rule over D. The latter is the ratio of transactions containing the itemsets of both the antecedent and the consequent over transactions containing the itemsets of the antecedent. Given a minimum support and a minimum confidence, we first create the itemsets satisfying the minimum support, which we call large itemsets. Second, based on the large itemsets, we extract association rules satisfying the minimum confidence. The former step consumes most of the whole processing time. This is because the creation of large itemsets must handle up to 2^m itemsets, given m distinct items, while the extraction of association rules has only to handle filtered data. Apriori [1] efficiently processes the creation of large itemsets by creating candidate itemsets from large itemsets and by reducing the counting of redundant itemsets. Partition [12], an extension of Apriori, partitions a database in order to reduce I/O costs and CPU overheads in the creation of large itemsets. Partition creates local large itemsets in each partitioned database and merges them to create global candidate itemsets, from which global large itemsets are created. There is a lot of work on parallel mining of association rules, which can be classified into distributed-memory approaches [2][10][14] and shared-memory approaches [4][16]. The authors of [2] parallelize basic Apriori by exchanging local support counts through an all-to-all broadcast. The authors of [10] exchange mandatory parts of hash tables for 2-itemsets based on Dynamic Hashing and Pruning [9]. The authors of [14] balance the loads of the processors by replicating extremely frequent itemsets among all processors based on Hash-Partitioned Apriori. Our approach also falls into this category, but it parallelizes the finding of local frequent itemsets based on Partition [12] and transfers data and itemsets through a virtually tree-like network topology, unlike the above three. The authors of [12] also just suggest a parallel extension of Partition, which exchanges local frequent itemsets based on an all-to-all broadcast, unlike our approach. The authors of [16] share hash trees as a parallel extension of Apriori. The authors of [14] virtually divide a database into logical partitions and perform Dynamic Itemset Counting [3] on each partition. These two are based on shared memory, differently from our approach.
3 Proposed Method - Basic Algorithm
We propose a cooperative method which partitions a database and distributes the partitions to nodes in P2P networks based on Partition. We introduce some more parameters: we define A as the number of times partitioned databases have been transmitted, T as the maximum value of A, and C as the number of connections to adjacent nodes. We partition a database into C sub-databases at a time. We begin partitioning at the DB-owning node and repeat the partitioning of databases T times. When the transmission count A reaches T, each node creates local large itemsets L from its partitioned database. The Ls are then transmitted towards the DB-owning node in the reverse direction and are repeatedly merged at intermediate nodes to create global candidate itemsets GC. Lastly, the DB-owning node creates global large itemsets from GC and extracts association rules. The P2P network assumed by our proposed method is based on the hybrid P2P model [15], which is efficient in searching for nodes. It has the following topology: 1) each node in the P2P network has information about adjacent nodes; 2) the first time a node connects to the P2P network, it can obtain information about adjacent nodes by consulting available indexing servers.
The information about nodes includes IP addresses, node performance (i.e., CPU clock speed, main memory size, available amount of hard disk), and the type and transmission speed of connection lines. These pieces of information are used in the selection of nodes to which processing is delegated. Note that the nodes obtained in the second step are a subset of the whole set of nodes participating in the P2P network. The network topology in the case where C is three and T is two is shown in Figure 1. We describe the process flow of our method as follows:
1) The DB-owning node connects to an indexing server.
2) It obtains information about adjacent nodes, selects appropriate ones and connects to them.
3) The DB is partitioned into C sub-databases, which are transmitted to the selected adjacent nodes in turn.
4) Steps 2 and 3 are repeated T times.
5) Local large itemsets L are created from the partitioned databases.
6) L are transmitted towards the DB-owning node.
7) L are merged at intermediate nodes, and as a result global candidate itemsets GC are created.
8) Global large itemsets GL are created from GC.
9) Association rules are extracted based on GL.
If the number of available nodes in the information about adjacent nodes, which a node either already has or learns by consulting an indexing server, exceeds the connection number C, the node must choose exactly C nodes among them to actually connect to. It selects the ones with higher CPU clock speeds and higher transmission speeds in order to efficiently process data mining tasks and transmit partitioned data. After it decides on the nodes to be connected, it sends them messages requesting processing and establishes the connections. If it is not possible to connect to them for some reason, other nodes are searched.
Fig. 1. Process Flow (C=3, A=2)
The DB-owning node partitions the database into C sub-databases and transmits them to its connected nodes. We do not partition databases in a serial way; rather, every time the TID of the scanned transactions changes, the target sub-database is switched (a sketch of this round-robin partitioning appears after the list below). This ensures even load balancing. Every time all partitioned sub-databases have been transmitted to connected nodes, the transmission count A is increased by one. The partitioning and transmission of the database terminates when A reaches T. After database partitioning and transmission have been repeated until A equals T, each node owning a partitioned sub-database (i.e., each leaf node of the node tree) creates local large itemsets L from its partitioned database. Here we denote large itemsets of length k by k-L and candidate itemsets of length k by k-C. The creation of L proceeds by the following steps until no more k-L are created:
1) create k-C from pairs of (k-1)-L which have the same items except one;
2) count the supports of k-C;
3) create the k-L satisfying the minimum support from k-C.
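A sketch of the round-robin, TID-switched partitioning described above (assuming rows arrive sorted by tid, as in the tid/item schema of the next section) is given below.

def partition_by_tid(transactions, c):
    # transactions: list of (tid, item) rows sorted by tid.
    parts = [[] for _ in range(c)]
    current_tid, target = None, -1
    for tid, item in transactions:
        if tid != current_tid:  # switch the target sub-database on every new TID
            current_tid = tid
            target = (target + 1) % c
        parts[target].append((tid, item))
    return parts

rows = [(1, "a"), (1, "b"), (2, "c"), (3, "a"), (3, "d"), (4, "b")]
for part in partition_by_tid(rows, 3):
    print(part)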
Once the leaf nodes have created the k-L, they transmit L towards the DB-owning node. After the intermediate nodes receive all L from their connected nodes, they merge the L of the same length k and create global candidate itemsets GC. GC is sent on to the nodes which requested processing from the intermediate nodes, and it is in turn merged into larger GC by them. This merging is repeated until GC reaches the DB-owning node. Once the DB-owning node receives GC from its connected nodes, it merges GC and creates the global large itemsets GL satisfying the minimum support. Next, it extracts the association rules satisfying the minimum confidence. Lastly in this section, we summarize the reasons for choosing the extended Partition scheme based on P2P networks as follows:
1) simplicity in control over distributed computing;
2) homogeneity in the tasks performed by individual nodes;
3) applicability of our scheme to tasks other than association rule mining.
4 Implementation
We implemented the processes described in the previous section, such as DB partitioning, L creation, GC merging, GL creation, and rule extraction, using perl [11] and SQL. Since we aim to realize Grid computing at a low cost, we chose MySQL [8], an open-source DBMS. We used DBI as the API to the DB and DBD::mysql as the driver, both provided by perl. We describe the schema of the transaction database used and the algorithms implemented in the following subsections. We designed the schema of the transaction database as a relation consisting of transaction ID (tid) and item (item) fields. We output partitioned databases, local large itemsets, and global candidate itemsets as files because they must be transmitted over the network. The file I/O implemented with MySQL outputs a table (table1) into a file (data1) in CSV (comma-separated values) format, and inputs a file (data1) in CSV format into a table (table2); a sketch of plausible statements is given below. We consider partitioning the transaction database into C sub-databases. The algorithm for database partitioning reads rows one by one and stores them into the partitioned sub-databases. Every time a different tid is read, the target sub-database is alternated. This avoids skew in the distribution of transaction data and realizes load balancing. We use the following three tables to create local large itemsets L from the partitioned sub-databases:
k-ct: consists of tid and k-candidate itemsets;
k-st: consists of k-large itemsets and their supports;
k-lt: consists of tid and k-large itemsets;
where k is the length of the itemsets. Global candidate itemsets GC are created by merging the tables k-lt by each length k. GC are similarly merged at relay (i.e., intermediate) nodes by each length k. We assume that L as well as GC are merged by each length k. We denote the maximum length of itemsets in L or GC as the maximum itemset length. We assume that GC has a table schema similar to that described previously (i.e., k-ct). We denote GC before merging and GC after merging by k-gct' and k-gct, respectively. Global large itemsets GL are created by selecting the itemsets satisfying the minimum support, together with their supports, from the table k-gct for each length k. The global supports are counted by intersecting TIDs using the item-TID index over the original database. The results are stored in k-glt, with the same schema as k-lt.
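The original listing of these file I/O statements is lost in extraction; the strings below show plausible MySQL statements for the described CSV export and import (SELECT ... INTO OUTFILE and LOAD DATA INFILE). The table and file names table1, data1 and table2 come from the text; everything else is an assumption.

# Plausible MySQL statements for the CSV file I/O described above, shown as
# strings as they might be issued through any MySQL client API.
export_sql = (
    "SELECT * FROM table1 INTO OUTFILE 'data1' "
    "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n'"
)
import_sql = (
    "LOAD DATA INFILE 'data1' INTO TABLE table2 "
    "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n'"
)
print(export_sql)
print(import_sql)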
We extract association rules by selecting itemsets satisfying the minimum confidence from k-glt. Association rules are created by using the table k-glt (k >= 1) and the table n-glt (n > k) containing k-glt. The antecedent and consequent of an association rule are k-glt and (n-glt minus k-glt), respectively. If we denote the supports of k-glt and n-glt by sup(k-glt) and sup(n-glt), the confidence conf of the rule is calculated as conf = sup(n-glt)/sup(k-glt). The SQL command for extracting association rules is as follows:
SELECT x.item1, ..., x.itemk, y.item1, ..., y.itemn, y.sup/x.sup
FROM k-glt x, n-glt y
WHERE y.sup/x.sup >= minconf
The above command selects the itemsets constituting a rule and computes the confidence of the rule satisfying the minimum confidence minconf. The command allows for overlapping of itemsets between the antecedent and consequent; we delete the overlapping itemsets using perl.
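As an illustration of this rule-extraction step, including the deletion of overlapping itemsets that the paper performs in perl, consider the small Python sketch below; the dictionary standing in for the k-glt tables is an assumption.

def extract_rules(glt, minconf):
    # glt maps frozenset(itemset) -> support, mirroring the k-glt tables.
    rules = []
    for antecedent, sup_a in glt.items():
        for full, sup_f in glt.items():
            if antecedent < full:  # proper subset, as with k-glt versus n-glt
                conf = sup_f / sup_a
                if conf >= minconf:
                    consequent = full - antecedent  # drop overlapping items
                    rules.append((set(antecedent), set(consequent), conf))
    return rules

glt = {frozenset({"x"}): 0.20, frozenset({"x", "y"}): 0.12}
print(extract_rules(glt, 0.10))  # one rule: {x} -> {y} with confidence 0.6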
5 Simulation
We simulated the algorithm to validate its effectiveness. We used PCs (Pentium III, 1 GHz and 550 MHz). We created a transaction database consisting of 10,000 transactions by using DBGen [5], a synthetic data generator for SQL tables. The number of different items is 50 and the average length of the itemsets is 5. We mined association rules with 1% minsup and 10% minconf, changing the final number of partitions from 3 through 9 to 27 (C=3; T=1, 2, 3). We compare the processing time of our methods with the processing time of a single node (i.e., the original Partition). The case for the 1 GHz CPU is illustrated in Table 1. The unit is seconds, and N is the number of nodes used for creating L.
We describe the case for 1 GHz (see Table 1). The time for partitioning databases with our methods increases from 19 to 26 seconds (N=9) and from 20 to 30 seconds (N=27) in comparison with single-node processing. This is because the partitioning is repeated T times in our cases.
On the other hand, the time for the creation of local large itemsets decreases to 1/2.9 (N=3), 1/8.4 (N=9), and 1/25.3 (N=27) of that for single-node processing. The total amount of time for mining rules decreases to 1/2.9 (N=3), 1/6.4 (N=9), and 1/7.3 (N=27). We see a similar improvement in the case of 550 MHz (see Table 2): the time for the creation of local large itemsets decreases to 1/3.0 (N=3), 1/8.7 (N=9), and 1/24.6 (N=27) of that for single-node processing. We conclude that the time of L creation can be improved in proportion to N. We also compare the results from the viewpoint of the difference in CPU clock speeds. The ratio of total processing time for 550 MHz to that for 1 GHz is 1.9 (N=3), 2.2 (N=9), and 2.8 (N=27). We can conclude that the improvement ratio becomes larger than the ratio of the two CPU clock speeds as N increases. In order to examine the skews and ranges in the time for L creation by each node, we calculated the average, variance, standard deviation, coefficient of variation, and range of this time (see Table 2 for 1 GHz). The coefficient of variation is normalized by dividing the standard deviation by the average. The range is the difference between the maximum and minimum values. The coefficient of variation of the time for L creation increases as N increases. This is reasonable because the coefficient of variation represents the variation in processing time. On the other hand, the range of the time for L creation is reduced as N increases. This is because the processing load on each node is reduced by increasing N, and the difference between nodes is reduced as a result. Therefore, for small N, the processing load on each node, as well as the difference between the processing times of the nodes, increases, and as a result the whole processing tends to be delayed. On the contrary, as N increases, the difference in the time for L creation decreases, and as a result the whole processing can be done faster. When we compare the effect of the difference in CPU clock speeds, the average time differs by a factor of 8.3 and the range for each node by a factor of 1.78 between the 550 MHz and 1 GHz CPUs. We can draw the conclusion that it is more desirable to use high-performance PCs and to partition a database into as many sub-databases as possible in order to effectively reduce the whole processing time. As preliminary experiments, we measured the time for transmitting data such as partitioned databases, large itemsets, and global candidate itemsets using ftp. The measured time was 0.13 sec at longest. This is because the file size was 976 kB at largest, which is sufficiently small from the viewpoint of transmission time. So we didn't
take such transmission time into account in this set of evaluations. As we discuss later, we must of course consider the effects of transmission, in addition to the DBMS and programs used, when we scale up the system.
6 Load Balancing - Extension
We have proposed a flexible approach to association rule mining which cooperatively distributes processes to nodes on P2P networks. We have successfully simulated the basic approach and validated its effectiveness. However, in our basic algorithm we have assumed that we distribute the same amount of data to every node. We now consider load balancing, which is expected to increase the total performance. To this end, we must solve the following issues: 1) we must determine the rank of a peer, that is, its overall capability for processing data mining tasks; 2) we must determine the amount of data dispatched to each peer based on its rank. First, we transfer benchmark data (i.e., of the same size) to each peer to determine its intrinsic rank, based only on its CPU and disks. Next, we calculate the synthetic rank of a peer based on both its own intrinsic rank and the synthetic ranks of the other networked peers as follows (rank formula):

$R_i = r_i + \sum_{j \in B_i} \frac{R_j}{n_j}$

where $r_i$ and $R_i$ denote the intrinsic rank and synthetic rank of a peer i, respectively, $n_j$ denotes the number of out-links from a peer j, and $B_i$ denotes the set of peers from which in-links come to the peer i. Then we allocate computing tasks to a peer based on its synthetic rank. The amount of a computing task is functionally determined by its data size. We denote the amount of a task and the size of the data by A and D, respectively. We can represent the relationship by using a function F specific to the mining tasks as follows: A = F(D). We have discovered that F is approximately proportional to the square of the input variable (i.e., D) in the context of association rule mining specific to the current implementation. In other words, its inverse function is approximately proportional to the square root of the input variable (i.e., A). So we determine the amount of data allocated to a peer i, and to each of its adjacent peers, as follows (data formula):

$D_i = D \cdot \frac{\sqrt{R_i}}{\sum_{j} \sqrt{R_j}}$

where D denotes the total amount of data and the sum runs over the peer and its adjacent peers.
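Putting the two formulas together, the small Python sketch below allocates data in proportion to the square root of each synthetic rank; the rank values are illustrative, not measured, and the normalization follows the data formula as reconstructed above.

from math import sqrt

def allocate_data(total_d, synthetic_ranks):
    # Task amounts are proportional to synthetic rank; data sizes follow from
    # the inverse of A = F(D) with F(D) proportional to D**2, i.e. D ~ sqrt(A).
    roots = {peer: sqrt(r) for peer, r in synthetic_ranks.items()}
    total_root = sum(roots.values())
    return {peer: total_d * root / total_root for peer, root in roots.items()}

print(allocate_data(10000, {"peer1": 1.0, "peer2": 2.0, "peer3": 2.0}))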
We have simulated the extended algorithm (i.e., the algorithm with load balancing) as follows. First, using the above rank formula, we determined the synthetic and intrinsic ranks of three peers participating in the network, where peer1 has out-links to peer2 and peer3.
We generated the following transaction data for our experiments: the number of transactions is 10,000, the number of items is 50, and the average length of transactions is 5. We discovered association rules with a minimum support of 1% and a minimum confidence of 10%. We determined the amount of data allocated to each peer by using the data formula. We measured the performance (i.e., processing time in seconds) of association rule mining without load balancing (the basic algorithm) and with it (the extended algorithm), as illustrated in Fig. 2, where P, Li, Rule, and Total denote the times for data partitioning, large itemset creation, association rule extraction, and total processing, respectively.

Fig. 2. Processing Time

In the extended algorithm, the times for large itemset creation are almost equal across peers, and therefore the total processing time is shorter in comparison with the basic algorithm. Lastly, we explain the network topology which we assume for the moment. In general, the network topology is a (connected) tree, although a child node can have
more than one parent node. It is a subset of the whole P2P network. Of course, we do not allow any cycle in the tree. We assume the following matrices:
r: a matrix of intrinsic ranks normalized by the number of in-links plus 1;
R: a matrix of synthetic ranks;
T: an adjacency matrix;
I: a unit matrix.
Then we have the relation R = TR + r. Therefore, we can calculate the synthetic ranks as $R = (I - T)^{-1} r$.
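A self-contained sketch of this computation solves R = TR + r by fixed-point iteration, which converges for the acyclic (tree-like) topologies assumed here. The 1/n_j out-link weighting in T follows the rank formula as reconstructed in Section 6 and is therefore an assumption, as are the intrinsic rank values.

def synthetic_ranks(T, r, iterations=100):
    # Solve R = T R + r by iterating R <- T R + r.
    n = len(r)
    R = r[:]
    for _ in range(iterations):
        R = [r[i] + sum(T[i][j] * R[j] for j in range(n)) for i in range(n)]
    return R

# peer1 has out-links to peer2 and peer3; row i of T lists the weights of the
# peers whose synthetic ranks feed into peer i.
T = [[0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0],  # peer2 receives peer1's rank over peer1's 2 out-links
     [0.5, 0.0, 0.0]]  # peer3 likewise
r = [1.0, 0.8, 0.6]    # illustrative intrinsic ranks
print(synthetic_ranks(T, r))  # [1.0, 1.3, 1.1]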
7 Conclusion
We have proposed flexible approaches to association rule mining which cooperatively distribute processes to nodes on low-cost P2P networks. We have described the basic algorithm (without load balancing) and the extended algorithm (with load balancing). We have simulated both of the approaches and have successfully validated their effectiveness. Some issues remain, as follows. First, the topology of the P2P network used by our proposal is straightforward and takes no account of a concrete protocol. We have to define the network protocol exactly by modeling P2P applications for exchanging files. We have simulated our approach by using a single PC and have not considered connection time and transmission time; therefore, we have to implement the system using more than one node on real P2P networks. Second, we have to apply our approach to large-scale transaction databases, where the time for transmitting partitioned databases and local large itemsets is crucial. The transaction database that we used is small enough for a single PC to process. Therefore, we have to improve our implementation and choose appropriate large-scale DBMSs in order to make our system scalable. Third, we have assumed that the maximum number of transmissions and the number of connected nodes are given by the user of the system. It is found from the results of the experiments that the time for creating local large itemsets can be shortened approximately in proportion to the number of partitioned databases, while the time for partitioning databases becomes longer accordingly. In addition to these facts, we have to take the time for transmitting partitioned databases into consideration when we determine the number of connected nodes that is ideal for improving the total time of data mining. We also have to determine how to select connected nodes by utilizing the P2P networks. For example, we can consider the following scenario: if a node selected from the adjacent nodes is not connectable, its adjacent nodes are delegated to instead. This can avoid the termination of the transmission of partitioned databases in the case that not all nodes in the list of adjacent nodes can be connected. Fourth, we have to devise security mechanisms which take into consideration direct communication, a feature of P2P networks. This is one of the keys to evolving our approach into Grid computing.
Acknowledgements
This work is partially supported by the Ministry of Education, Culture, Sports, Science and Technology, Japan, under a Grant-in-Aid for Scientific Research on Priority Areas (16016273).
References
1. Agrawal R., Srikant R., (1994) Fast Algorithms for Mining Association Rules in Large Databases. Proc. VLDB, 487-499.
2. Agrawal R. and Shafer J., (1996) "Parallel Mining of Association Rules," IEEE Trans. Knowledge and Data Eng., Vol. 8, No. 6, 962-969.
3. Brin S. et al., (1997) "Dynamic Itemset Counting and Implication Rules for Market Basket Data," Proc. ACM SIGMOD Conf. Management of Data, 255-264.
4. Cheung D., Hu K., and Xia S., (1998) "Asynchronous Parallel Algorithm for Mining Association Rules on Shared-Memory Multi-Processors," Proc. 10th ACM Symp. Parallel Algorithms and Architectures, 279-288.
5. DBGen: http://research.microsoft.com/~Gray/DBGen/
6. Foster I. and Kesselman C. (Eds.), (1999) The Grid: Blueprint for a New Computing Infrastructure. Morgan-Kaufmann.
7. Gong L., (2001) Project JXTA: A Technology Overview. Technical report, SUN Microsystems. http://www.jxta.org/project/www/docs/TechOverview.pdf
8. MySQL: http://www.mysql.com/
9. Park J. S., Chen M., and Yu P. S., (1995) "An Effective Hash Based Algorithm for Mining Association Rules," Proc. ACM SIGMOD Conf., 175-186.
10. Park J. S., Chen M., and Yu P. S., (1995) "Efficient Parallel Data Mining for Association Rules," Proc. ACM Int'l Conf. Information and Knowledge Management, 31-36.
11. Perl: http://www.perl.com/
12. Savasere A., Omiecinski E., Navathe S. B., (1995) An Efficient Algorithm for Mining Association Rules in Large Databases. Proc. VLDB, 432-444.
13. SETI@home: http://setiathome.ssl.berkeley.edu/
14. Shintani T. and Kitsuregawa M., (1996) "Hash Based Parallel Algorithms for Mining Association Rules," Proc. 4th Int'l Conf. Parallel and Distributed Information Systems, IEEE, 19-30.
15. Yang B., Garcia-Molina H., (2001) Comparing Hybrid Peer-to-Peer Systems. Proc. VLDB, 561-570.
16. Zaki M. J., et al., (1996) "Parallel Data Mining for Association Rules on Shared-Memory Multi-Processors," Proc. Supercomputing '96, IEEE.
FIT: A Fast Algorithm for Discovering Frequent Itemsets in Large Databases
Jun Luo (1) and Sanguthevar Rajasekaran (2)
(1) Computer Science Department, Ohio Northern University, Ada, OH 45810
(2) Department of CSE, University of Connecticut, CT 06269-3155*
Abstract. Association rule mining is an important data mining problem that has been studied extensively. In this paper, a simple but Fast algorithm for Intersecting attributes lists using hash Tables (FIT) is presented. FIT is designed for efficiently computing all the frequent itemsets in large databases. It deploys an idea similar to Eclat but has much better computational performance than Eclat, for two reasons: 1) FIT makes a smaller total number of comparisons for each intersection operation between two attributes lists, and 2) FIT significantly reduces the total number of intersection operations. Our experimental results demonstrate that the performance of FIT is much better than that of the Eclat and Apriori algorithms.
1 Introduction
Mining association rules is one of the classic data mining problems. The study of mining algorithms was initiated by Agrawal, Imielinski, and Swami [AIS], and since then numerous papers have been written on the topic. An itemset that contains k items from an item set (I) is called a k-itemset. An itemset is frequent if the number of transactions in a database (D) that contain the itemset is no less than a user-specified minimum support (minsup). Some notations used in this paper are listed in Table 1. If the union of two itemsets, X and Y for example, is a frequent itemset, then X and Y are defined to have a strong association relation; otherwise, X and Y are said to have a weak association relation. If A represents a set or a list, then we let |A| denote the number of elements in A. In this paper, we present a new algorithm called FIT. FIT is a simple but fast algorithm for computing all the frequent itemsets in a large database. The basic idea of FIT is similar to that of Eclat [ZPOW], but FIT has much better computational performance than Eclat. The remainder of this paper is organized as follows: Section 2 describes a simple method, Section 3 details the algorithm FIT, Section 4 discusses sample experimental results, and Section 5 presents some conclusions.
* This author has been supported in part by the NSF Grants CCR-9912395 and ITR-0326155.
2 A Simple Method If a transaction contains an itemset (X), then is treated as an attribute of X. The attribute can be represented by the unique transaction identification number of All the attributes of X form an attributes list. Attribute lists for all the items in I are generated by scanning a database (D) once. As a result, D is transformed into the attributes list format. Note that all the attributes lists whose support values are no less than minsup constitute With the attributes list format, calculations for frequent itemsets become straightforward: the support value of X is determined by the number of attributes in its attributes list. The support value of the union of two itemsets X and Y is calculated in two steps: 1) Intersect the attributes lists of X and Y, and 2) Calculate the number of attributes in the intersection. The intersection of any two attributes lists and can be calculated using a hash table. The length of the hash table depends on the largest attribute value in the attributes lists. The initial value for each hash table entry is set to -1. The calculation begins with scanning first. During the scan, attribute values are used as indices to access hash table entries, and values of entries being accessed are set to 1. Then, is scanned. During the scan, attribute values are also used as indices to access hash table entries. If the entry being accessed contains 1, the corresponding attribute is kept in the intersection. Otherwise, the attribute is discarded. The total number of comparisons for computing is For attributes lists intersections between an attributes list and each of the remaining attributes lists are computed as follows: scan once and initialize the hash table as discussed above. Then, successively scan each and calculate intersections. If all the attributes lists are arranged in such an order that the total number of comparisons for calculating and is equal to Starting with all the frequent itemsets of any size in D could be calculated in two ways: breadth-first calculation (BFC) or depth-first calculation (DFC). In BFC, all the frequent are identified before any possible is. In DFC, given an if the intersection results between an attributes list and the attributes lists that follow in generate a
non-empty Fk+1, then the calculations on Fk+1 start immediately, before any other calculations on Fk. Experiments show that DFC performs better than BFC. Given D and minsup, a formal description of the simple method is as follows:
Algorithm 1: A Simple Method
Step 1. Calculate F1 by scanning D once. Sort F1 in non-increasing order of attributes list length. Mark all the entries in F1 as unvisited.
Step 2. Establish a hash table hb with entries whose initial values are set to -1. Set k to 1.
Step 3. If all the itemsets in Fk have been visited and k equals 1, the calculation terminates. If all the itemsets in Fk have been visited and k does not equal 1, decrease k by 1.
Step 4. Scan the attributes list of the first unvisited itemset X in Fk. For each attribute t read, set hb[t] to 1. Mark X as visited.
Step 5. For each itemset Y that follows X in Fk, do the following. For each attribute t in the attributes list of Y, if hb[t] equals -1, discard t; if hb[t] equals 1, put t into a new attributes list (for X ∪ Y). If the number of attributes in the resulting attributes list is no less than minsup, put the itemset X ∪ Y and the resulting attributes list into Fk+1. Mark the itemset as unvisited in Fk+1.
Step 6. Reset the entries in hb to -1. If Fk+1 is not empty, increase k by 1 and go to Step 4. Otherwise, go to Step 3.
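To make the hash-table intersection at the heart of Algorithm 1 concrete, the following Python sketch shows one possible realization (the function and variable names, and the use of a plain array as the hash table, are illustrative assumptions, not the authors' code):

```python
def intersect(l1, l2, table_size):
    """Intersect two attributes (transaction-id) lists with a hash table:
    mark every attribute of l1, then keep the marked attributes of l2."""
    hb = [-1] * table_size                  # entries initialized to -1
    for t in l1:
        hb[t] = 1                           # scan l1: mark attribute t
    return [t for t in l2 if hb[t] == 1]    # scan l2: |l1| + |l2| steps in total

# The support of X ∪ Y is the length of the intersection of their lists.
lX, lY = [0, 2, 3, 7], [2, 3, 5, 7]
print(intersect(lX, lY, 8))  # [2, 3, 7], so the support of X ∪ Y is 3
```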
3 Algorithm FIT
Given n attributes lists, the simple method needs to perform n(n-1)/2 intersection calculations. If two itemsets X and Y have a weak association relation, the attributes list calculation for X ∪ Y is unnecessary. The overall computational performance can be greatly improved if such unnecessary intersection calculations are avoided. The technique for cutting down on the unnecessary intersection operations is based on Lemma 1.
Lemma 1. Let lu be the union of attributes lists l1, l2, ..., lm. If lu has a weak association relation with another attributes list l, then any attributes list li (1 ≤ i ≤ m) will also have a weak association relation with l.
Proof. Assume lu has a weak association relation with l and, without loss of generality, that li has a strong association relation with l. Since the attributes of li ∩ l form a subset of lu ∩ l, |lu ∩ l| is greater than or equal to |li ∩ l|. Thus, |lu ∩ l| is no less than minsup. This implies that lu has a strong association relation with l, which contradicts the assumption. So li cannot have a strong association relation with l.
Given an Fk, the attributes lists are logically divided into groups. Each group has g attributes lists (where g < |Fk|), with the possible exception of
the last group, which has the remaining attributes lists. For simplicity, we assume throughout that |Fk| is an integral multiple of g. The groups are denoted as G1, G2, ..., G|Fk|/g. For each group Gi (where 1 ≤ i ≤ |Fk|/g), do the following: 1) calculate the candidate set Ci; 2) for the attributes lists in Gi, the simple method is adopted here with the following change: each attributes list l in Gi calculates the intersections between l and only the other attributes lists that either are in Gi or remain in Ci, instead of all the attributes lists that follow l in Fk. The method of calculating Ci is as follows: at the beginning, Ci is set to the attributes lists that follow Gi in Fk. Then, the union lu of all the attributes lists in Gi is calculated. The intersections between lu and each attributes list l in Ci are calculated one at a time; if lu and l have a weak association relation, l is removed from Ci.
The algorithm FIT is simply a recursive version of the above algorithm. After the first logical division of Fk, if the size of each group is still large, then, after calculating Ci, each group is treated as a new set of frequent k-itemsets and the above algorithm is applied to it. This procedure repeats until the size of each subgroup is small enough. Note that when a group Gi is divided into smaller subgroups, for each subgroup the initial candidate set is the union of the subgroups that follow it within Gi
and the candidate set Ci of the parent group. A formal description of FIT is given below.
Algorithm 2: FIT
Step 1. Calculate F1 by scanning D once. Sort F1 in non-increasing order of attributes list length.
Step 2. Establish a hash table hb with entries initialized to -1.
Step 3. Determine a value for the group size g.
Step 4. Calculate the frequent itemsets by calling stage1(F1, Ø, 1, g).
The function stage1 receives four parameters: the parameters L and F represent sets of frequent itemsets with their attributes lists, and the parameters k and g denote the size of the current frequent itemsets and the size of the current groups, respectively.
Step 1. Divide L into groups G1, G2, ..., G|L|/g. For each group G, do Step 2, Step 3, and Step 4.
Step 2. Assume that the last attributes list in G is the i-th attributes list in L. Set C to the attributes lists that follow the i-th attributes list in L. Set all the entries in the hash table to -1. Scan all the attributes lists in G; for any attribute read, if its entry in the hash table contains -1, then set it to 1.
Step 3. For each attributes list l in C, do the following: set a counter c to 0. Scan through the attributes in l; for any attribute read, if its entry in the hash table contains 1, then increase c by 1. After l is scanned, if c is less than minsup, then remove l and its itemsets from C.
Step 4. If |G| is smaller than a threshold s, then calculate the frequent itemsets with stage2(G, 1, C). Otherwise, set a smaller group size g' and calculate the frequent itemsets with stage1(G, C, 1, g').
In stage1, the symbols s and g' denote two parameters. If the size of the current group is smaller than s, then the current group will not be further divided. Otherwise, the current group will be further divided into subgroups, and the size of each subgroup is g', with the possible exception of the last one.
The function stage2 receives three parameters. The parameters L, k, and C are defined in the same way as in stage1.
Step 1. Starting from the first attributes list in L until the last one, for each attributes list l, do Step 2, Step 3, Step 4, and Step 5.
Step 2. Set all the entries in the hash table to -1. Set Fk+1 to the empty set.
Step 3. Scan through the attributes list l. For each attribute read, set its entry in the hash table to 1.
Step 4. For each attributes list l' in C, do the following: for each attribute t in l', if the entry for t in the hash table is 1, then put t into a new attributes list R. At the end, if |R| is no less than minsup, add the itemset X ∪ Y and R into Fk+1, assuming that the itemsets X and Y are associated with the attributes lists l and l', respectively.
Step 5. If Fk+1 is not empty, determine a value for the group size g and call stage1(Fk+1, Ø, k+1, g).
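The pruning effect of Lemma 1 can be illustrated with a short sketch (hypothetical names; the weak/strong association test is reduced to counting the intersection with the group's union):

```python
def prune_candidates(group, candidates, minsup):
    """Lemma 1 sketch: if the union of a group's attributes lists has a weak
    association with a candidate list, every list in the group does too, so a
    single intersection with the union prunes the candidate for all members."""
    union = set().union(*group)
    return [c for c in candidates if len(union & set(c)) >= minsup]

group = [{1, 2, 3}, {2, 3, 4}]                 # attributes lists of one group
candidates = [{3, 4, 5}, {7, 8, 9}]
print(prune_candidates(group, candidates, 2))  # [{3, 4, 5}]; {7, 8, 9} is pruned
```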
4 Sample Experimental Results
We have implemented the algorithms Apriori [AS] and Eclat. Besides FIT, the simple method discussed in Section 2 was also implemented. All the programs were written in C++. For the same reason mentioned in [HGN], we did not implement the algorithm FP-Growth [HPY]. Also, in our implementation, Eclat was extended to calculate F2 directly from F1. All the experiments were performed on a Sun Ultra 80 workstation with four 450-MHz UltraSPARC II processors, each with a 4-MB L2 cache. The total main memory was 4 GB, and the operating system was Solaris 8. Synthetic datasets were created using the data generator in [AS]. The synthetic dataset used in this paper is D1 = T26I4N1kD10k, which means an average transaction size of 26, an average size of the maximal potentially frequent itemsets of 4, 1000 distinct items, and 10000 generated transactions. The number of patterns in the synthetic datasets is set to 10,000. The experimental results are shown in Fig. 1 and Fig. 2. Fig. 1 shows the run time comparisons. Fig. 2 illustrates the corresponding speedups of FIT over
Fig. 1. Run time comparisons
Fig. 2. Speedup comparisons
Eclat, the simple method, and Apriori. The run times of FIT were measured when the set F1 was divided into two levels of groups; the sizes of the groups at the first and second levels were set to 15 and 3, respectively. For any other set Fk (k > 1), only one level of groups was used, and the size of each group was set to 3. Our experimental results assert that FIT is consistently faster than the other three algorithms. As minsup decreases, the run times of FIT increase at a slower pace than those of Apriori and Eclat. However, when minsup is large enough, the database contains few frequent itemsets; as a result, the speedups of FIT over Eclat or Apriori become less significant. Our experimental results also illustrate that the simple method is always faster than Eclat.
5 Conclusions
In this paper, we have presented two novel algorithms for efficiently computing the frequent itemsets in large databases. The simple method and FIT were designed and implemented before we came to know about Eclat. Although Eclat, the simple method, and FIT all employ the so-called tid-list idea, the simple method and FIT have much better computational performance. In all of our experiments, FIT was consistently the fastest among all the algorithms that were tested.
References
[AIS] R. Agrawal, T. Imielinski, and A. Swami: Mining Association Rules between Sets of Items in Large Databases. Proceedings of the ACM SIGMOD 1993 International Conference on Management of Data, Washington, D.C., USA, 1993.
[AS] R. Agrawal and R. Srikant: Fast Algorithms for Mining Association Rules. Proceedings of the 20th International Conference on Very Large Databases, Santiago, Chile, 1994.
[HGN] J. Hipp, U. Güntzer, and G. Nakhaeizadeh: Algorithms for Association Rule Mining - A General Survey and Comparison. Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, USA, 2000.
[HPY] J. Han, J. Pei, and Y. Yin: Mining Frequent Patterns without Candidate Generation. ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 2000.
[ZPOW] M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li: New Algorithms for Fast Discovery of Association Rules. Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD), Newport Beach, CA, USA, 1997.
Frequency-Incorporated Interdependency Rules Mining in Spatiotemporal Databases Ickjai Lee School of Information Technology, James Cook University, Douglas Campus, Townsville, QLD 4811, Australia [email protected]
Abstract. Spatiotemporal association rules mining aims to reveal interrelationships within large spatiotemporal databases. One critical limitation of traditional approaches is that they are confined to qualitative attribute measures: quantitative frequencies are either ignored or discretized. In this paper, we propose a robust data mining method that efficiently reveals frequency-incorporated associations in spatiotemporal databases.
1 Introduction
Data mining is an emerging discipline that attempts to mine massive databases to efficiently find previously unknown patterns. Recently, data mining in the context of geography has emerged as an active research area [1]. Several spatial data mining attempts [2, 3] have been made to discover positive associations or co-location rules among attributes. Although these approaches efficiently discover multivariate spatial relationships, they are not general enough to handle various spatial types and attribute measures. Most importantly, they focus on relationships based on qualitative attribute measures (boolean or nominal). Quantitative measures are not adequately handled by these approaches; that is, frequencies are either ignored or discretized into categories. Frequencies are of great importance, particularly in area data, since the data is aggregated. In this paper, we propose a spatiotemporal data mining method that efficiently reveals frequency-incorporated positive associations among large area-aggregate spatiotemporal databases. We define frequency-incorporated interdependency rules mining along with new interesting measures. It detects subsets of tightly intercorrelated features given a set of large area-aggregate features.
2 Association Rules Mining
Association Rules Mining (ARM) has been a powerful tool for discovering positive associations and causal relationships among a set of items I in a transactional database D. Here, each transaction is a subset of I. An association rule is an expression of the form X ⟹ Y, where X ⊂ I, Y ⊂ I, and X ∩ Y = Ø. It is interpreted as "c% of the transactions in D that satisfy X also satisfy Y". Typically, support and confidence are two measures of
a rule's interestingness. Support is an estimate of P(X ∪ Y), and confidence is an estimate of P(Y | X). A set of items is referred to as an itemset. Two user-defined thresholds, minimum support and minimum confidence, are used to prune rules so that only interesting rules are found. Itemsets satisfying the required minimum support constraint are called frequent, while rules satisfying the two thresholds are called strong. ARM computes all strong rules. ARM has been applied to spatiotemporal data in two different ways. This is parallel to the fact that there are two popular views of geographical space in GIS: raster and vector. Vectors regard geographical space as a set of objects, while rasters see it as a set of locations. Spatial Association Rules Mining (SARM) [2] is similar to the raster view in the sense that it tessellates a study region S into discrete groups based on spatial or aspatial predicates derived from concept hierarchies. A spatial association rule is a rule that consists of a set of predicates in which at least one spatial predicate is involved. One drawback of this approach is that it is limited to qualitative measures: it either ignores frequencies or discretizes them. Co-Location Rule Mining (CLRM) [3] discovers a subset of features given a set of features frequently located together in a spatial context. It extends traditional ARM to the case where the set of transactions is a continuum in space; that is, there may not be an explicit finite set of transactions, due to the continuity of geographical space. Thus, this approach focuses on features, similar to the vector approach. Here, transactions correspond to point locations and items correspond to features. Since features have different numbers of occurrences, the sizes of transactions vary across features.
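As a concrete reminder of the two measures, the following minimal Python sketch estimates support and confidence over a toy transaction set (the names and data are ours, for illustration only):

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(X, Y, transactions):
    """Estimate of P(Y | X): support(X ∪ Y) / support(X)."""
    return support(X | Y, transactions) / support(X, transactions)

transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
print(support({"a", "b"}, transactions))       # 0.5
print(confidence({"a"}, {"b"}, transactions))  # 0.666...
```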
3 Frequency-Incorporated Interdependency Rules Mining
3.1 Problem Statement
Let us take an example to point out the problems of traditional methods and to outline our approach. Assume that there are five distinct features (layers) over S. Features are recorded as area-aggregate measures (for instance, features may represent crime incidents of different crime types) over six states and two territories in Australia, as shown in Fig. 1. Fig. 1(a) depicts S and Fig. 1(b) shows a relational table where columns correspond to features and rows correspond to areal units. The numbers are frequencies (occurrences) of the features; each feature has ten occurrences.

Fig. 1. Associations among features: (a) S; (b) A relational table with feature values

Let us consider two rules between different pairs of features. Although the distributions of the two feature pairs differ, BARM (boolean ARM) fails to discover the difference: both rules exhibit the same support (50%) and confidence (100%). However, the latter rule is more highly correlated than the former, since its two features exhibit higher interdependency. Thus, ignoring the frequencies of features hinders the discovery of some interesting interdependencies. QARM (quantitative ARM) seems to overcome this limitation to some degree. Each rule can be decomposed into quantitative rules over frequency intervals, from which we may induce a partly negative relationship between the former pair of features and a partly positive relationship between the latter pair. However, these rules do not reveal frequency-incorporated interdependencies (interactions and connectivity). Note that SARM is a hierarchical extension of BARM and QARM to spatial data; thus, it is unable to identify frequency-incorporated interdependencies. In fact, CLRM is designed for boolean point datasets and is not directly applicable to areal datasets. By assuming that frequencies are point objects and that the neighborhood is defined when objects lie within the same areal unit, we can apply CLRM to this area-aggregate dataset. Here, the interesting measures (prevalence and conditional probability) for both rules are all 100%, because all elements of the participating features take part: the consequent feature can be found in any areal unit (WA, NT, SA and QLD) at which the antecedent feature occurs. Obviously, CLRM is incapable of discriminating between the two rules. In spatiotemporal processes, more interactions imply stronger interdependencies; that is, features that interact have an influence on each other. Note that similar measures are used in other disciplines, such as coupling in software engineering and connectivity in graph theory. In this paper, a frequency-incorporated interdependency between two features is measured by the number of interactions between them. Interactions are defined within the same areal unit.
Fig. 2. Frequency-incorporated interdependencies between features
Fig. 2 explains how to define an interaction with a graph-theoretic approach, where occurrences of a feature correspond to nodes and interactions correspond to edges. For instance, Fig. 2 depicts the possible interactions between pairs of features. For two features, connections (interactions) are made between nodes occurring in the same areal unit. Thus, the number of connections for the first pair of features is 30; similarly, we can compute the number of connections for the second pair, which is 20. This shows that the first pair is more tightly associated than the second.
On the other hand, two pairs of features may show the same distribution, as depicted in Fig. 1(b), and yet the connectivity of one pair is weaker than that of the other. Also note that even if the distributions of two features are identical, their connectivity (30) can be less than that (40) of a pair with non-identical distributions. This is because connectivity (interaction) in our approach is measured by the number of possible interactions within the same areal unit, not by the degree of similarity of the distributions.
3.2 Problem Definition and Interesting Measures
Imagine that we are required to analyze large area-aggregate spatiotemporal datasets. Given a set of features F = {f1, f2, ..., fn} in an area-aggregate spatiotemporal database D, our aim is to efficiently discover all strong frequency-incorporated interdependency rules among F. Each feature fi may consist of time slices, where an element of feature fi that occurred at time period tj belongs to the j-th slice. An Area-aggregate Frequency-incorporated Interdependency Rule (AFIR) is of the form X ⟹ Y (IR%, RR%), where X ⊂ F, Y ⊂ F, and X ∩ Y = Ø. Each AFIR is associated with two interesting measures: the Interaction Ratio (IR) and the Realization Ratio (RR). IR is the ratio of the number of interactions (connections) to the number of total possible interactions, while RR is the ratio of the number of interactions to the number of total possible interactions given the antecedent; that is, RR is a conditional probability of the number of possible interactions given the distribution of the antecedent. Higher interaction ratios imply stronger connections. Since we are interested in highly interconnected featuresets, IR is used to prune loosely connected featuresets, and a rule's strength is measured by RR. Formal definitions are as follows. Let S be a set of areal units and oi(s) denote the occurrence (frequency) of feature fi at areal unit s; then, for a featureset Z,

  IR(Z) = ( Σ_{s∈S} Π_{fi∈Z} oi(s) ) / ( Π_{fi∈Z} Σ_{s∈S} oi(s) ).
A set of features is referred to as a featureset. For a given k, a featureset Z is called a k-featureset if its cardinality is k. If IR(Z) is greater than or equal to a user-defined Minimum IR (MIR), then Z is called tight. Area-aggregate Frequency-incorporated Interdependency Rule Mining (AFIRM) is to find all AFIRs that satisfy the two user-defined thresholds: MIR and the Minimum RR (MRR).
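Under the definition of IR reconstructed above, the interaction ratio can be computed directly from a relational table like Fig. 1(b)/Fig. 3(a); the Python sketch below (our names and toy data) reproduces the 10/36 value used in the worked example of Section 3.4:

```python
from math import prod

def interaction_ratio(table, features):
    """IR(Z): per-unit products of occurrences summed over areal units,
    divided by the product of the features' total occurrences."""
    interactions = sum(prod(row[f] for f in features) for row in table)
    possible = prod(sum(row[f] for row in table) for f in features)
    return interactions / possible

# Two features with 6 occurrences each over 6 areal units (cf. Fig. 3(a)):
table = [{"f1": 2, "f2": 2}, {"f1": 3, "f2": 2}, {"f1": 1, "f2": 0},
         {"f1": 0, "f2": 1}, {"f1": 0, "f2": 1}, {"f1": 0, "f2": 0}]
print(interaction_ratio(table, ["f1", "f2"]))  # 10/36 = (2*2 + 3*2)/(6*6)
```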
3.3 AFIRM Properties
Since our approach is based on the Apriori algorithm that uses the downward closure property [4], we need to prove that IR satisfies the downward closure.
Lemma 1. IR is downward closed.
Proof. We need to prove that when a k-featureset is tight, all of its (k-1)-sub-featuresets are tight. Let us consider a tight k-featureset Z; then it must satisfy IR(Z) ≥ MIR. Without loss of generality, let fk be the k-th feature in Z, and consider the (k-1)-featureset Z' = Z \ {fk}. Then, we need to prove IR(Z') ≥ IR(Z). Here, IR(Z) can be rewritten as follows:

  IR(Z) = ( Σ_{s∈S} ok(s) · Π_{fi∈Z'} oi(s) ) / ( Σ_{s∈S} ok(s) · Π_{fi∈Z'} Σ_{s∈S} oi(s) ).

Since ok(s) ≤ Σ_{s∈S} ok(s) for every areal unit s, the numerator is at most ( Σ_{s∈S} ok(s) ) · ( Σ_{s∈S} Π_{fi∈Z'} oi(s) ), and therefore IR(Z) ≤ IR(Z'). Hence IR(Z') ≥ IR(Z) ≥ MIR, and Lemma 1 is proved.
All tight k-featuresets are used to generate candidates for tight (k+1)-featuresets. That is, Apriori-like algorithms can be used in our approach to efficiently mine AFIRs. However, there is another property that prunes the search space even further. Note that the number of interactions is defined by the product of the occurrences in each areal unit of a featureset. Thus, the maximum number of interactions among all the areal units of a featureset limits the total number of interactions of any larger featureset containing it. The areal unit returning the maximum is called the maxunit of the featureset, and the number of interactions in the maxunit is called its maxnum. The ratio of the maxnum of a featureset Z to the number of total possible interactions is called the maxratio of Z, and it is as follows:

  MaxR(Z) = ( max_{s∈S} Π_{fi∈Z} oi(s) ) / ( Π_{fi∈Z} Σ_{s∈S} oi(s) ).
Lemma 2. If a k-featureset Z is tight, then the maxratios of all of its subsets are greater than or equal to MIR.
Proof. We need to prove that IR(Z) ≤ MaxR(Z') for any subset Z' of Z. Without loss of generality, let smax be the maxunit of Z'. Then IR(Z) satisfies

  IR(Z) ≤ ( Π_{fi∈Z'} oi(smax) · Σ_{s∈S} Π_{fj∈Z\Z'} oj(s) ) / ( Π_{fi∈Z} Σ_{s∈S} oi(s) ) ≤ MaxR(Z'),

because in every areal unit the product over Z' is bounded by the maxnum Π_{fi∈Z'} oi(smax), and the sum of products over the remaining features is bounded by the product of their totals. Hence IR(Z) ≥ MIR implies MaxR(Z') ≥ MIR.
Tight k-featuresets that satisfy Lemma 2 are referred to as heavy k-featuresets. Only they are used to generate candidates for (k+1)-featuresets. Finally, the relation below is derived: a (k+1)-featureset can be tight only if all of its k-sub-featuresets are heavy.
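Putting Lemmas 1 and 2 together yields an Apriori-style level-wise search. The following self-contained Python sketch (our naming; candidate generation is simplified) shows how tightness and heaviness prune the space:

```python
from math import prod
from itertools import combinations

def ir(table, fs):
    """IR(Z): co-located interaction count over total possible interactions."""
    return (sum(prod(r[f] for f in fs) for r in table) /
            prod(sum(r[f] for r in table) for f in fs))

def maxr(table, fs):
    """MaxR(Z): the best single areal unit's interactions over the same total."""
    return (max(prod(r[f] for f in fs) for r in table) /
            prod(sum(r[f] for r in table) for f in fs))

def heavy_featuresets(table, features, mir):
    """Level-wise AFIRM search: tight means IR >= MIR; heavy additionally
    means MaxR >= MIR; only heavy k-featuresets extend to (k+1)-candidates."""
    level = [frozenset([f]) for f in features if maxr(table, [f]) >= mir]
    found, k = list(level), 2
    while level:
        prev = set(level)
        cands = {a | b for a in level for b in level if len(a | b) == k
                 and all(frozenset(s) in prev for s in combinations(a | b, k - 1))}
        level = [z for z in cands if ir(table, z) >= mir and maxr(table, z) >= mir]
        found += level
        k += 1
    return found
```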
3.4 Working Principle
For illustration, we consider 5 features and 6 areal units in S. Each feature contains 6 objects, distributed over S. Fig. 3(a) depicts the dataset summarized in a 6 × 5 table. Here, we have two constraints, MIR = 7/36 and MRR = 1/2.

Fig. 3. The working principle of AFIRM with MIR = 7/36 and MRR = 1/2: (a) A 6 × 5 table; (b) Candidates for 1-featuresets; (c) Candidates for 2-featuresets; (d) Candidates for 3-featuresets; (e) Rules generated from the 3-featuresets

The interaction ratios of all 1-featuresets are 1, whilst their maxratios are not the same. This is illustrated in Fig. 3(b). All 1-featuresets satisfy MIR; thus they are all tight. However, they are not all heavy: one feature exhibits the smallest maxratio (1/6), which is smaller than MIR (7/36). Namely, all five 1-featuresets are heavy except that one. Thus, four 1-featuresets are used to generate candidates for 2-featuresets. Fig. 3(c) shows six 2-featuresets with their IR and MaxR. For instance, the number of interactions between the first two features is 10 (2·2 + 3·2),
and thus their IR becomes 10/36. The maxunit of another pair of features defines six interactions, so its MaxR becomes 6/36. Note that this is less than MIR; namely, that 2-featureset is not heavy. Thus, it is eliminated and is not used any more in candidate generation. Here, four 3-featuresets are possible. The first three cannot be generated, since some of their subsets are not heavy. Thus, we can derive only a single 3-featureset, shown in Fig. 3(d). Its MaxR is 5/27, which is less than MIR; thus, no 4-featuresets are generated. From the 3-featureset we can generate 6 different AFIRs. Fig. 3(e) depicts them with their realization ratios. Consider one of these rules: the maxunit of its antecedent has a maxnum of 2, so the realization ratio of the rule is (2·4·5 + 2·1·1)/(2·6·6), that is, 7/12. Among the six rules, three satisfy the MRR constraint; these three are the strong AFIRs.
4 Final Remarks
This paper investigates the problem of discovering frequency-incorporated interdependencies among area-aggregate spatiotemporal features. It outlines the limitations of traditional association mining and presents a frequency-incorporated interdependency rules mining method that efficiently reveals strongly and tightly correlated area-aggregate featuresets. Our algorithm is not designed to detect geographically widespread, frequently occurring weak patterns, but to identify strong and tight interdependencies determined by interactions in the same areal unit. Since the well-known Apriori-like pruning technique is used here, the computational efficiency of our algorithm is similar to that of Apriori pruning [4].
References
1. Miller, H.J., Han, J.: Geographic Data Mining and Knowledge Discovery: An Overview. Cambridge University Press, Cambridge, UK (2001)
2. Koperski, K., Han, J.: Discovery of Spatial Association Rules in Geographic Information Databases. In Egenhofer, M.J., Herring, J.R., eds.: Proc. of the Int. Symp. on Large Spatial Databases. LNCS 951, Portland, Maine, Springer (1995) 47–66
3. Shekhar, S., Huang, Y.: Discovering Spatial Co-location Patterns: A Summary of Results. In Jensen, C.S., Schneider, M., Seeger, B., Tsotras, V.J., eds.: Proc. of the 7th Int. Symp. on the Advances in Spatial and Temporal Databases. LNCS 2121, Redondo Beach, CA, Springer (2001) 236–256
4. Agrawal, R., Imielinski, T., Swami, A.N.: Mining Association Rules between Sets of Items in Large Databases. In Buneman, P., Jajodia, S., eds.: Proc. of the ACM Int. Conf. on Management of Data, Washington, D.C., ACM Press (1993) 207–216
Theoretical Considerations of Multiple Particle Filters for Simultaneous Localisation and Map-Building David C.K. Yuen and Bruce A. MacDonald Department of Electrical and Computer Engineering, University of Auckland, New Zealand. {d.yuen,b.macdonald}@auckland.ac.nz
Abstract. The rationale of adopting multiple particle filters to solve the Simultaneous Localisation and Map-building (SLAM) problem is discussed in this paper. SLAM can be considered as a combined state and parameter estimation problem. The particle filtering based solution is not only more flexible than the established extended Kalman filtering method, but also offers computational advantages. Experimental results based on a standard SLAM data set verify the feasibility of the method.
1 Introduction
Simultaneous Localisation And Map-building (SLAM), also known as Concurrent Mapping and Localisation (CML), was originally introduced by Smith et al. [1] to estimate both the robot and obstacle positions at the same time. It is a special form of combined state and parameter estimation, in which newly observed obstacles are expected to be detected and their position estimates added dynamically to the system. Extended Kalman Filtering (EKF) is the prevalent SLAM technique [2, 3], but its first-order Taylor approximation may fail in highly nonlinear systems, and the complexity of EKF also leaves room for improvement. This paper explores the alternatives for SLAM methods and describes an efficient SLAM algorithm using Multiple Particle Filters (MPFs). The proposed MPF algorithm is compared with the established EKF SLAM method using a standard SLAM testing data set, before concluding the paper.
2 Possible SLAM Architectures
Although data from many robotic sensors and actuators follow the Gaussian distribution, the SLAM posterior often behaves differently. The following statements explain how non-Gaussian characteristics could be introduced to the posterior distribution. The dynamics can be highly unpredictable, especially when the robot is being driven over a slippery surface; the noise model for the system would no longer be Gaussian.
Some sensors are vulnerable to highly nonlinear and bursty noise sources, for example, temporary loss of signal, cross talk, multiple path propagation and specular reflection, as in the case of an ultrasonic range finder. The environment itself may have a high degree of symmetry at both global and local levels; the posterior will appear multimodal if the current sensor data cannot fully disambiguate the possible position. Obstacles that fail to appear at the expected position also provide useful information, and the unimodal Gaussian model cannot handle this negative information [4]. By assuming the presence of only Gaussian noise and a linearisable system, EKF has been adopted and has established wide acceptance for SLAM applications. In addition to the fragile Gaussian posterior distribution assumptions, it has some additional shortcomings. EKF extends the original Kalman filtering method by linearising the nonlinear system and observation models of the target system; this approximation is not applicable to many more complex dynamic systems. EKF also maintains only the best position estimate. The data association procedure tries to match the current sensor data to already observed data with reference to this sole robot position estimate, and it is well known that a single incorrect association can lead to catastrophic failure in tracking [5]. Various improvements have been attempted. The Unscented Kalman Filter represents the mean and covariance accurately to the third order of the Taylor series expansion [6]. Multiple Hypothesis Tracking [7] uses multiple Kalman filters to track distributions. While these encouraging developments extend the application of the Kalman filter to certain types of highly nonlinear or multimodal systems, they are still limited by the Gaussian assumption. The Monte Carlo algorithms offer an alternative. For many dynamic systems, including SLAM, the latest estimate is the main interest, so it is necessary to marginalise the variables corresponding to the past. Instead of performing high-dimensional integration rigorously, Monte Carlo algorithms approximate the solution by simulating the distribution with many random samples. Since the outcome space is exponential in the dimension of the system, the generation of a uniformly sampled grid quickly becomes intractable. More sophisticated sampling approaches, e.g. Markov Chain Monte Carlo (MCMC) or importance sampling, are often adopted. MCMC constructs a Markov chain for the posterior. After a sufficiently long burn-in period, the Markov chain tends to converge to the true posterior. It is an iterative method; the posterior distribution can thus be simulated from the stationary part of the chain. For a dynamic system, the observation changes between time steps, and it is not straightforward to carry the stationary part of the Markov chain from one time step to another. The feasibility of estimating parameters online is still an open question [8]. Therefore, MCMC is not considered for SLAM implementation in this work. Importance sampling simulates the posterior by sampling from a proposed distribution which is supposedly similar. The constituents chosen for the sample
are selected by importance factors based on their similarity to the observation data. Particle filtering, also known as the Sequential Monte Carlo method, is an importance sampling implementation for dynamic systems, and has stimulated much interest in recent years. Particle filtering offers a very flexible, reasonably efficient and easy-to-implement estimation architecture. The application of particle filtering to robot localisation was introduced by Fox et al. [9] under the name of Monte Carlo Localisation. Many different forms of localisation problems, including position tracking, global localisation and even the difficult kidnapped robot problem [10], have been solved using Monte Carlo Localisation. Particle filtering has been developed primarily for state estimation, while SLAM is a state and parameter co-estimation problem. Murphy [11] demonstrated the potential of particle filtering by solving a discrete SLAM problem in a 10 × 10 grid world. It is a challenge to scale the application of particle filtering from a grid world to a continuous state space with hundreds or even thousands of obstacles (parameters). An MPF is proposed and discussed in the next section.
3 The Rationale of the Multiple Particle Filter (MPF) Approach
It is well known that the performance of importance sampling deteriorates for high-dimensional systems [12]. This is a particular concern for SLAM, as the number of obstacles, i.e. the dimensionality, can be very high in large maps. Fortunately, conditional independence makes it possible to estimate the robot and obstacle positions with different filters. The MPF approach should reduce the variance significantly, and thus improve the convergence behaviour, as it divides the original high-dimensional problem into multiple low-dimensional ones. The proposed method is similar to the idea of using a multiple model bootstrap filter for fast manoeuvring target tracking [13]. However, each of the filters in [13] estimates the same state, while our filters estimate different components of the full state. In our case, it is useful to first separate the state into two components, where x_k is an estimate of the time-varying portion, in this case the robot position, and m is an estimate of the time-invariant portion (the parameters), in this case the map of obstacles. Secondly, it is sensible to further separate the estimate of the parameters into each parameter, in this case dividing the obstacle map state component into a separate estimate m_i for each of the L obstacles. The intention is that a sensible subdivision of the estimation problem will lead to computational advantages as a result of the degree of independence between the components. Separate particle filters are thus introduced for the position estimate of the robot and for each of the obstacles. The filter that estimates the robot position x_k for the kth step, given sensor data z_{1:k} and actuator actions u_{1:k}, is denoted the state or localisation filter. The filters assigned to estimate the L obstacle positions m_1, ..., m_L are known as parameter or mapping filters.
The presence of dynamic objects has not been explicitly considered. The proposed MPF approach is a realisation of the factorised SLAM posterior expression (1), which was introduced by Murphy [11] and subsequently adopted by Montemerlo [14] for the development of FastSLAM:

  p(x_{1:k}, m | z_{1:k}, u_{1:k}) = p(x_{1:k} | z_{1:k}, u_{1:k}) · Π_{i=1..L} p(m_i | x_{1:k}, z_{1:k}).        (1)
The mapping filter being maintained is, however, different from that suggested in the factorised expression. The mapping filter estimates p(m_i | z_{1:k}, u_{1:k}), while the SLAM posterior in (1) requires p(m_i | x_{1:k}, z_{1:k}). This is because p(m_i | x_{1:k}, z_{1:k}) is conditionally dependent on the state estimate x_{1:k}. The localisation filter usually has a few tens to thousands of particles representing slightly different state estimates. A separate mapping filter would have to be allocated for each of these state particles if p(m_i | x_{1:k}, z_{1:k}) were estimated directly; in other words, a total of N_s × L mapping filters would be required, with N_s being the number of particles in the localisation filter. Instead, the proposed MPF approach estimates p(m_i | z_{1:k}, u_{1:k}) in the mapping filters, which marginalises x_{1:k} from p(m_i | x_{1:k}, z_{1:k}), i.e. removes an uninteresting variable from a conditional probability through some form of integration:

  p(m_i | z_{1:k}, u_{1:k}) = ∫ p(m_i | x_k, z_{1:k}) p(x_k | z_{1:k}, u_{1:k}) dx_k.        (2)
In step k, p(m_i | z_{1:k-1}, u_{1:k-1}) is taken as the prior probability for p(m_i | z_{1:k}, u_{1:k}). The calculation of p(m_i | x_k, z_{1:k}) allows the update of (1); then it is multiplied with p(x_k | z_{1:k}, u_{1:k}), available from the localisation filter, and the evaluation is repeated with a different x_k as part of the marginalisation process. Particle filtering approximates the continuous posterior distribution with a discrete one. Figure 1 shows the flowchart of the MPF SLAM algorithm. The algorithm is not specific to a particular robot platform or operating environment. The particle filters created for the MPF SLAM algorithm are based on the Sampling Importance Resampling (SIR) algorithm [12]. A total of L + 1 particle filters is allocated to estimate the robot and the L obstacle positions. The appearance of unseen obstacles is detected dynamically. The previous posterior is taken as the prior for the latest step, and this prior is in turn taken as the proposal distribution. The importance factor is proportional to the previous importance factor multiplied by the likelihood of the current observation. It is necessary to account for the influence of additional process noise upon the importance factor. The updating stage refines the estimates using the exteroceptive sensor data, and can be further divided into feature extraction, data association, new/unseen obstacle detection and filter update procedures.
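A single update of the SIR filters used here can be sketched as follows (a generic SIR cycle, not the authors' implementation; the motion and likelihood callables are assumed user-supplied models):

```python
import numpy as np

def sir_step(particles, weights, motion, likelihood, z, u):
    """One SIR cycle: predict with the motion model (the prior acts as the
    proposal), reweight by the observation likelihood, then resample."""
    particles = np.array([motion(p, u) for p in particles])
    weights = weights * np.array([likelihood(z, p) for p in particles])
    weights = weights / weights.sum()          # normalise importance factors
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```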
Fig. 1. The flowchart for MPF SLAM algorithm
The Generalised Likelihood Ratio Test (GLRT) is employed to identify new or unseen obstacles. A new mapping filter initialisation is triggered when the generalised likelihood ratio between the two hypotheses exceeds a threshold, where hypothesis 1 refers to the presence of a new obstacle and hypothesis 0 to the absence of a new obstacle. A new SIR obstacle filter is initialised for each newly detected obstacle from the current measurement. The filter particles are assigned randomly around the detected position. The likelihood calculation is central to the rest of the updating process: a high likelihood is assigned if the estimate stored with the particle fits closely to the measurements. The sampling rates of the sensors can be different; the number of predictions may not be the same as the number of updates, because of the multirate nature of the system.
4 Results
The MPF algorithm has been tested with the Victoria Park outdoor SLAM experimental data set, released by the University of Sydney [15]. The velocity and
steering angle of the vehicle were recorded by encoders. A laser range scanner mounted on the vehicle measures the distance and bearing to the surrounding obstacles. It is a large test space with dimensions of about 250 × 250 m, and the experiment covers a track of about 4 km driven through several loops. Measurement data were collected at more than 7000 time steps over the course of about 25 minutes. Trees are the predominant type of obstacle, so the centres of circular objects are extracted from the data as the features.
Fig. 2. The robot trajectory and the obstacle positions given by MPF SLAM and standard EKF SLAM
The MPF and EKF SLAM methods are compared in two different aspects: accuracy and calculation speed. The MPF method gives results comparable to EKF SLAM, as suggested in Figure 2(b) and 2(c). The trajectory of the robot consists of many loops; the MPF closes the loops fairly satisfactorily, which is a good indication of the success of a SLAM method. Moreover, MPF SLAM scales better than EKF in the presence of a large number of obstacles. Since only a fixed number of particles is introduced for each new obstacle, the computation of MPF is O(N), comparing favourably with EKF's O(N^2). The average calculation time per step is 0.04 s on an Athlon 1.8 GHz computer, applicable to near real-time SLAM applications.
5 Conclusion
This paper explores the possible SLAM architectures and examines the multiple SIR particle filter approach. Experimental results based on a standard SLAM data set are encouraging. MPF is a simulation-based method, so the solution can be slightly different in each run; a small proportion of the runs (less than 20%) is worse than that illustrated in this paper. Further research is needed to improve the robustness.
References
1. Smith, R., Self, M., Cheeseman, P.: Estimating uncertain spatial relationships in robotics. In: Uncertainty in Artificial Intelligence. Volume 2., New York, Elsevier Science (1988) 435–461
2. Dissanayake, M.W.M.G., Newman, P., Clark, S., Durrant-Whyte, H.F., Csorba, M.: A solution to the simultaneous localization and map building (SLAM) problem. IEEE Trans. Robot. Automat. 17 (2001) 229–241
3. Guivant, J., Nebot, E.: Optimization of the simultaneous localization and map-building algorithm for real-time implementation. IEEE Trans. Robot. Automat. 17 (2001) 242–257
4. Thrun, S.: Particle filters in robotics. In: Proceedings of Uncertainty in AI. (2002)
5. Dissanayake, G., Newman, P., Durrant-Whyte, H., Clark, S., Csorba, M.: An experimental and theoretical investigation into simultaneous localisation and map building (SLAM). In: Lecture Notes in Control and Information Sciences; Experimental Robotics VI, Springer (2000)
6. Wan, E., van der Merwe, R.: The unscented Kalman filter for nonlinear estimation. In: Proc. of IEEE Symposium 2000 on Adaptive Systems for Signal Processing, Communications and Control, Alberta, Canada (2000)
7. Jensfelt, P., Kristensen, S.: Active global localisation for a mobile robot using multiple hypothesis tracking. IEEE Transactions on Robotics and Automation 17 (2001) 748–760
8. Pitt, M.K., Shephard, N.: Filtering via simulation: Auxiliary particle filters. Journal of the American Statistical Association 94 (1999) 590–630
9. Fox, D., Burgard, W., Dellaert, F., Thrun, S.: Monte Carlo localization: Efficient position estimation for mobile robots. In: Proceedings of the National Conference on Artificial Intelligence, AAAI (1999)
10. Fox, D., Thrun, S., Burgard, W., Dellaert, F.: Chapter 19 in: Sequential Monte Carlo Methods in Practice. Springer (2000)
11. Murphy, K.: Bayesian map learning in dynamic environments. In: Advances in Neural Information Processing Systems. Volume 12., MIT Press (2000) 1015–1021
12. Doucet, A., de Freitas, N., Gordon, N.: An introduction to sequential Monte Carlo methods. In: Sequential Monte Carlo Methods in Practice. Springer (2000) 3–14
13. McGinnity, S., Irwin, G.: Chapter 23 in: Sequential Monte Carlo Methods in Practice. Springer (2000)
14. Montemerlo, M.: FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem With Unknown Data Association. PhD thesis, Carnegie Mellon University (2003)
15. Guivant, J.E.: Efficient Simultaneous Localization and Mapping in Large Environments. PhD thesis, University of Sydney (2002)
Continuous Walking Over Various Terrains – A Walking Control Algorithm for a 12-DOF Locomotion Interface Jungwon Yoon and Jeha Ryu Human-Machine-Computer Interface Laboratory, Department of Mechatronics, Gwangju Institute of Science and Technology, Bukgu, Gwangju 500-712, Korea {garden, ryu}@kjist.ac.kr
Abstract. This paper describes a control algorithm for continuous walking interaction with various terrains using a 12-DOF locomotion interface. The walking control algorithm is proposed so that a human can walk continuously on the infinite floor generated by the locomotion interface. For continuous walking, each independent platform of the locomotion interface follows a human foot during the swing phase, while the same platform moves back during the stance phase. The transition between swing and stance phase is detected by using thresholds on the ground height and the reaction force. For the moving-back motions of the locomotion interface, a triangular retreating velocity profile is used to generate a parabolic trajectory similar to the normal walking trajectory. Preliminary walking experiments with a 6-DOF locomotion interface show that a general user can walk naturally over level, slope, and stair terrain. This walking control algorithm can be applied to any locomotion interface for applications such as navigation, rehabilitation, and gait analysis.
1 Introduction
Virtual Reality (VR) technologies are rapidly developing in the areas of engineering, medical operation, teleoperation, welfare, and entertainment. A Locomotion Interface (LI) allows users to participate in a life-like walking experience in virtual environments [1], which include various terrains such as plains, slopes and stair surfaces, and to feel a real spatial sense through the generation of appropriate ground surfaces at the feet. An LI that can simulate walking interactions with virtual environments without restricting human mobility in a confined space will also become indispensable for enhancing the feeling of immersion in a VR system. LIs can be used in several application fields such as walking rehabilitation, virtual design, training, and exercise. For the simulation of various walking surfaces using an LI, Hollerbach et al. [2] simulated a slope display by utilizing a mechanical tether and a treadmill, since treadmill tilt mechanisms are typically too slow to present fast slope changes. However, the mechanical attachment to the human body can reduce the naturalness of walking and restrict the lift and yaw motions of the walker's body [3]. Similarly,
Hollerbach et al. [4] simulated side slopes by applying a lateral force to the waist of a person and compared the results with walking on "ATLAS" [5], which has a treadmill platform mounted on an active spherical joint. Iwata [6] simulated omnidirectional walking using the "Torus Treadmill", a mechanism with 12 treadmills. Iwata [7] also simulated adjustable stair-like terrain by using the "Gait Master". However, until now there have been no general algorithms for a locomotion interface that can simulate various terrains such as plane, slope and stair surfaces so that diverse walking environments can be experienced. Even though Iwata [7] conceptually suggested an algorithm to implement infinite stairs using the "Gait Master", the details of the algorithm, including the control of the interface, were not presented. Therefore, in order to simulate natural walking over various terrains, we propose a control algorithm that can generate infinite floors for various surfaces using a locomotion interface. To simulate walking over various terrains, the K-Walker [8] is utilized for natural walking; it can generate 6-DOF motions, including 3 translations and 3 orientations, at each foot. This paper describes a walking control algorithm that allows walking interactions over various terrains such as planes, slopes, and stairs using the K-Walker.
2 System Overview of a 6-DOF Locomotion Interface: K-Walker
Figure 1 shows the structure of the K-Walker, which is composed of two 3-DOF (X, Y, Yaw) planar devices and two 3-DOF (Pitch, Roll, and Z) footpads. The separation of planar and spatial motions can achieve a sufficiently large workspace for general walking and enough force capability to support the full weight of the user. The planar devices are driven by AC servomotors for generating fast motions, while the footpad devices are driven by pneumatic actuators for continuous support of the human weight. A user standing on the K-Walker can walk and interact with the virtual environment while wearing a Head Mounted Display (HMD) or watching a big display screen. The position and orientation of a human foot can be measured using a Polhemus 3D magnetic tracker, which is tightly connected to the shoe so that it precisely traces the foot motion without delay. In the meantime, vertical forces on the footpad
Fig. 1. The 6-DOF locomotion interface: K-Walker
device can be estimated by pressure sensors in the cylinders, which measure the reaction forces due to the human weight. Currently, only the platform for the right foot has been constructed; the other platform is under construction. Even though the two platforms do not operate simultaneously, walking experiments are possible, since during a complete gait cycle the motions of the left and right lower limbs are the same; the only difference is the phase of the two limbs.
3 Walking Control Algorithm
Given the traveling ground, the locomotion interface control system computes the desired locomotion geometry and generates infinite surfaces for continuous walking in a confined area. Thus, the control algorithm should be designed to keep the human at the neutral position of the K-Walker during walking. In order to achieve this goal, the principle of cancellation was used [7]: while one platform follows a foot during the swing phase, the other platform moves the foot back during the stance phase. However, that algorithm cannot be applied to normal gait including the double limb stance, in which both feet are in contact with the ground. Therefore, we suggest a new cancellation method, in which the cancellation of the walking motions is independent for the two feet. Thus, each independent platform follows the magnetic tracker attached to a foot during the swing phase, when the foot is moving forward without contact with any object, while the same platform moves back during the stance phase, when the foot is in contact with the ground. The transition between swing and stance phase is detected by using thresholds on the ground height and the reaction force exerted by the human foot. Since the K-Walker is composed of independent planar and spatial devices, the motions to generate continuous walking are divided into planar and spatial motions. For the planar motions of the K-Walker, it is considered that the foot is in contact with the ground when the foot height is lower than a reference height; thus, the device moves back. On the other hand, when the foot height is higher than the reference height, it is considered that the foot is moving forward without contact with the ground; thus, the device follows the human foot trajectory. The algorithm that implements this walking scenario on level ground is shown in Figure 2(a). Figure 2(a) shows in detail that if the z coordinate of the foot tracker is higher than the ground height, the gait cycle is recognized as the swing phase and the planar motion of
the foot tracker is inserted into the command input for motion control of the planar device. On the other hand, if the z coordinate is lower than the ground height, the gait cycle is recognized as the stance phase and the planar device moves back. In order to move the limb back to its original position while taking the walking speed into account, a triangular velocity profile with respect to time is suggested; with the peak retreating speed set to twice the average speed of the preceding swing, it can be written as

  v(t) = (4·Sd/St^2)·t            for 0 ≤ t ≤ St/2,
  v(t) = (4·Sd/St)·(1 - t/St)     for St/2 ≤ t ≤ St,        (1)

where Sd, St, and Sd/St are respectively the forward moving distance, the required time, and the average speed during the swing phase for x, y, and yaw, and v(t) is the velocity control input of the planar device for the given direction. It is observed that the triangular retreating velocity profile generates a parabolic trajectory, which is similar to the normal walking trajectory.
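A direct implementation of the reconstructed profile (1) is straightforward; the sketch below (our naming) returns the retreating speed at time t, with peak 2·Sd/St so that the retreat covers the distance Sd in the time St:

```python
def retreat_velocity(t, Sd, St):
    """Triangular retreating velocity profile: ramp up to 2*Sd/St at t = St/2,
    then ramp down, so the integral over [0, St] equals Sd."""
    peak = 2.0 * Sd / St
    if t <= St / 2.0:
        return peak * (2.0 * t / St)
    return peak * (2.0 * (St - t) / St)
```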
Fig. 2. Walking control algorithm
For the spatial motions of the K-Walker, the footpad device is used to generate various terrains such as stairs and slopes. In order to simulate stairs, the platform of the footpad device should have zero angles, since stairs have no slope; therefore, the cancellation method for planar motions can also be applied to the lift motion control. However, the collision detection between the human foot and a stair needs additional force information about the lift motions, because two collisions happen: when the human raises his foot and when he puts his foot onto the stair. Thus, for climbing up stairs, if the vertical reaction force of the footpad device is lower than a force threshold, we assume that the foot is being lifted above the stair, even though the ground height is below the stair height; the command input for the z coordinate of the footpad device then follows the foot tracker. On the other hand, if the foot height is lower than the stair height and the reaction force is higher than the threshold, the retreating velocity command for the lift motion is applied to the footpad device. If the foot height is higher than the ground height, the commanded lift motion of the footpad device keeps the stair surface. This algorithm for spatial motions is shown in Figure 2(b). Planar motions in the stair simulation can be controlled in the same way as in the level-ground walking simulation.
In summary, the stair surface should have the following parameters, as shown in Figure 3(a):

  θpitch = 0, θroll = 0, zground = hstair, zcommand = zr(t),        (2)

where θpitch and θroll are the pitch and roll angles of the slope on the ground, zground and hstair are the heights of the ground and the stairs, and zr(t) is the desired trajectory generated by the retreat velocity command in equation (1) for the lift motion during the stance phase. For slope surface generation, if the pitch angle of the footpad device has a constant value and the roll angle is zero, the surface will be an uphill or downhill slope. Conversely, if the pitch angle of the footpad device is zero and the roll angle has a constant value, the surface will be a side slope. If the ground has an up-slope, the pitch angle should be positive and the ground height should increase as the human foot proceeds in the forward direction, as shown in Figure 3(b). Therefore, to detect the contact of the human foot with the ground on slope surfaces, the ground height threshold should be computed as in equation (3), while the same walking scheme for planar motions and lift motions is applied to move the human foot back for continuous walking:
  zground = xf · tan(θpitch),        (3)

where θpitch and θroll are the desired pitch and roll angles of the footpad device, and xf is the back-and-forth motion of the human foot during the swing phase. This walking control algorithm will therefore sustain continuous walking over various terrains with the 6-DOF locomotion interface in a limited area.
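The phase detection driving both the planar and lift control can be condensed into a few lines (a sketch under our assumptions; the threshold follows the reconstructed equation (3)):

```python
import math

def ground_height(x_f, pitch):
    """Reconstructed Eq. (3) sketch: slope-adjusted ground-height threshold."""
    return x_f * math.tan(pitch)

def detect_phase(z_foot, x_f, pitch):
    """Swing if the tracked foot is above the ground-height threshold
    (platform follows the foot); stance otherwise (platform retreats)."""
    return "swing" if z_foot > ground_height(x_f, pitch) else "stance"
```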
Fig. 3. The walking surface generation
4 Walking Experiments and Evaluation
To implement real walking simulations with the K-Walker, the safety of the user should be guaranteed at all times. Therefore, the user wears a harness, which can fully support a human weight of above 100 kg and has a shock absorber. In addition, a
balance bar is constructed for the user to keep the balance of the body during walking; the balance bar is moved by the hand of the human walker. Figure 4 shows real experimental results of human walking using the K-Walker with the control algorithm for level ground. Figure 4(a) shows the tracker value for the lift motion, the tracker value for the back-and-forth motion, and the command input for the back-and-forth motion. The swing phase is recognized when the foot height is higher than the reference, and the interface device tracks the magnetic tracker attached to the shoe, while during the stance phase the interface device moves back satisfactorily with the proposed control algorithm. Figure 4(b) shows that the interface device tracks the trajectory of the command input well. This result shows that the control of the interface operates well enough to follow the human walking speed. It should be noted that the ratio of stance and swing phase during general walking with the K-Walker was very similar to the normal gait cycle. Figure 5 shows up-stairs walking with a stair height of 5 cm. Figure 5(a) shows that the footpad device can follow the trajectory of the foot tracker, with relatively slow downward motions compared with the planar device. Figure 5(b) shows that, according to the reaction force of the footpad device, the footpad device moves up when the force is below the threshold and goes down when the force is above it, and that the ratio of the stance and swing phase is about 7:3. For slope surface simulations, walking on a sloping surface
Fig. 4. Walking test for plane level
Fig. 5. Walking test for stair level
Fig. 6. Walking test for slope level
only involves a simple modification of walking on level ground: the reference height increases or decreases with respect to the given slope as the human proceeds forward during the swing phase or retreats during the stance phase. Figure 6(a) shows that the ground height threshold changes according to the back-and-forth motion of the human foot. Figure 6(b) shows that the footpad device moves up following the human foot during the swing phase and moves down during the stance phase according to the ground height threshold.
5 Conclusions
This paper presented a control algorithm that can simulate continuous natural walking over various terrains generated by a 12-DOF locomotion interface. The motions that generate continuous walking are divided into planar and spatial motions. The heave, pitch, and roll motions of the 6-DOF locomotion interface at each foot are used to generate stairs and side or up-down slopes of the ground surface, while the x, y, and yaw motions of the interface are used to generate infinite floors with the triangular retreating velocity command. Using this algorithm, real walking experiments were performed with the 12-DOF locomotion interface K-Walker. During preliminary experiments, we observed that natural walking over plane, slope, and stair terrain is achievable with the walking control algorithm. Future research will apply the locomotion interface, based on the suggested walking control algorithm, to lower limb rehabilitation and gait analysis. The algorithm also needs to be evaluated while using both platforms.
Acknowledgement Research reported here was supported by grant (No. R01-2002-000-00254-0) from the Basic Research Program of the Korea Science & Engineering Foundation.
References
[1] Hollerbach J. M., (2002) "Locomotion interfaces," in: Handbook of Virtual Environments Technology, (Eds) Stanney K. M., Lawrence Erlbaum Associates, Inc., pp. 239-254.
[2] Hollerbach J. M., Mills R., Tristano D., Christensen R. R., Thompson W. B., and Xu Y., (2001) "Torso force feedback realistically simulates slope on treadmill-style locomotion interfaces," Intl. J. Robotics Research, Vol. 20, pp. 939-952.
[3] Iwata H., (1999) "The Torus Treadmill: Realizing Locomotion in VEs," IEEE Computer Graphics and Applications, Vol. 19, No. 6, pp. 30-35.
[4] Hollerbach J. M., Checcacci D., Noma H., Yanaida Y., and Tetsutani N., (2003) "Simulating Side Slopes on Locomotion Interfaces using Torso Forces," Proc. of 11th Haptic Interfaces For Virtual Environment And Teleoperator Systems, pp. 247-253.
[5] Noma H., Miyasato T., (1998) "Design for Locomotion Interface in a Large Scale Virtual Environment, ATLAS: ATR Locomotion Interface for Active Self Motion," ASME-DSC Vol. 64, pp. 111-118.
[6] Iwata H., (1999) "Walking About Virtual Environments on an Infinite Floor," Proc. of IEEE Virtual Reality 99, pp. 286-293.
[7] Iwata H., (2001) "Gait Master: A Versatile Locomotion Interface for Uneven Virtual Terrain," Proc. of IEEE Virtual Reality 2001, pp. 131-137.
[8] Yoon J. and Ryu J., (2004) "A Novel Locomotion Interface with Independent Planar and Footpad Devices for Virtual Walking," 6th Asia-Pacific Conference on Computer-Human Interaction (APCHI), Rotorua, New Zealand.
Vision Controlled Humanoid Robot Tool-Kit Chris Messom Massey University, Auckland, New Zealand [email protected]
Abstract. This paper introduces a novel parallelised vision-based intelligent controller for a humanoid robot system. This intelligent controller is simulated dynamically and its performance evaluated for a standard benchmark problem. The parallel nature of the simulation architecture, which separates the image processing and control algorithms, allows the simulation to progress in real time or faster than real time. This allows automated control algorithms using neural network or evolutionary techniques to be developed efficiently and effectively.
1 Introduction Biped humanoid robot structures have been investigated for many years [1-3], but it is only recently that the cost of such robots has been reduced to the point that it is possible to consider placing humanoid robots in everyday working environments. Before we can put humanoid robots in work-related situations such as health care, aged care and miscellaneous service roles, machine intelligence must develop to the point that it is adequate to solve the problem. This research aims to develop robust humanoid robot controllers that can work reliably in dynamic environments along with people in a safe, fault-tolerant manner. Currently, systems with a small number of links and degrees of freedom (say 12, as in a biped robot) can be simulated in real time on a single-processor machine. Simulating a larger system (say 12 joints, their motors and sensors, as well as the robot's vision system) cannot be completed in real time on a single-processor machine, so a multiprocessor approach must be adopted. This study is significant because real-time or faster-than-real-time simulation of robotic systems is required for intelligent controller design. Currently, many techniques use kinematic models as a first approximation of a dynamic system so that the time complexity of the algorithms can be improved. While this approach is suitable for behaviours that are well understood and do not have complex dynamics, it is not suitable for investigating unknown or hard-to-model scenarios. For example, when investigating vision-based control of robots, a kinematic model would be suitable for a slow-moving wheeled mobile robot that has a small and constant delay for image processing, but it will not be suitable for fast-moving, legged robot systems with highly dynamic delays, as is the case with many modern image processing algorithms.
Early biped robot research focused on algorithms for static and dynamic gait control [4-7]; as these problems have essentially been solved, researchers have turned their attention to higher-level control algorithms. Several groups have taken the visual servo control approach [8-9], motivated particularly by the Hurosot benchmark problems, which are classified as simulation and real-robot benchmarks. Many of the active researchers have focused on the real robot system, particularly because of the difficulty of transferring a controller that works on a simulation model to a real robot system, and because of the difficulty of transferring between different simulation models and environments. This research aims to overcome this limitation of simulation models by using realistic models of real biped robot systems that capture the dynamics of the sensors and actuators used in the system as well as the robot morphology.
2 Real Biped Robot Systems Many research groups have built biped robot structures, beginning with Waseda's biped [1], followed by Honda [10], Sony [11], and MIT's Spring Flamingo [12] and, more recently, M2 [13]. In the last few years, following the benchmark problems proposed by RoboCup Humanoid and FIRA Hurosot [14], many university-based research groups have developed biped robot structures, including Hanyang University [15], Singapore Polytechnic [16-17], the University of Auckland (and Manitoba University) [18], the National University of Singapore [19], the University of Tokyo [20], and many more. These research biped robots can in general be categorised into two groups: servo-motor-controlled systems with limited sensor feedback and DC-motor-controlled systems with full sensor feedback. Servo-motor control works by giving the required set-point to the lower-level control system; but often, especially when using low-cost components that are under-strength, this set-point is not achieved. Most systems that use these motors do not use any sensor feedback, so the higher-level control system is often unaware that the set-points have not been achieved. The DC-motor systems tend to be higher-power and higher-speed systems with full sensor feedback, but they require more complex control circuitry for at least 10 motors, which presents hardware challenges that are hard to overcome. The morphology of the biped robot used in this study is based on the M2 robot from the MIT AI Lab [12-13], a 7-link, 12-degree-of-freedom biped robot (see Figure 1) that uses series elastic actuators [21]. The vision system and the robot control system can be simulated separately in parallel, giving real-time or faster-than-real-time performance.
3 Simulation System The simulation system is based on the Yobotics Inc. Simulation Construction Set, a Java-based 3D dynamic simulation and visualisation system. The system takes a morphological description of the robot and its environment, automating the development of the
Fig. 1. M2 morphology
dynamic models required to simulate the system. The dynamic models are simulated using Featherstone's algorithm [22-23], which has good time-complexity properties for the simulation of linked rigid bodies such as biped robots. Closed circuits of links cannot be simulated directly, but with the use of constraint forces most closed-circuit models can also be simulated. The extensions to this system developed for this study are the robot vision capture and image processing subsystems, the vision-based control system, and the distributed parallel programming infrastructure that allows different components to run on separate processors or machines. Several authors have studied parallel algorithms for simulating the dynamics of open and closed rigid-body links. Featherstone [22-23] has published a theoretical O(log(n)) bound on O(n) processors for systems of n rigid bodies. This theoretical limit is hard to achieve, as it does not take into account the latency of interprocess communication. This study uses a 132-processor cluster-based system, but simulations of one robot do not scale well beyond two processors. For small linked bodies such as these robots, the simulation algorithm is most efficient on a single processor, and improvements can only be made by separating components of the control system, such as the vision system, into separate parallel processing units.
3.1 Parallel Programming The communication in the parallel simulation is achieved using the Message Passing Interface. The Simulation Construction Set uses the Java interfaces to the Message Passing Interface (MPI) [24], while the image processing component (which uses a run-length-encoding (RLE) image processing algorithm [25] developed in C) uses the Java Native Interface to link to the MPI libraries. In a similar manner, other image processing libraries such as OpenCV [26] can also be used.
Fig. 2. Biped Robot View
The image captured by the camera on the M2 robot (Figure 2) must be sent to the image processing module. This module uses RLE image segmentation and executes on a different processor from the main simulation system. The image is sent to the second processor using MPI, while the results of processing the image (the positions of objects and people) are returned to the simulator so that they can be given to the controller. The intelligent controller modifies the motion of the robot based on the positions of obstacles and people, using a state-transition-based control approach [27].
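The exchange described above is essentially a send/receive pattern between a simulator process and a vision process. The sketch below uses Python and mpi4py for brevity, whereas the original toolkit used Java and JNI bindings to MPI; the segmentation helper is a placeholder standing in for the C RLE code.

```python
# Run with: mpirun -n 2 python vision_split.py
from mpi4py import MPI
import numpy as np

SIM, VISION = 0, 1

def segment(image):
    """Placeholder for the RLE segmentation step: returns (label, x, y)
    tuples. A real implementation run-length-encodes each scanline and
    merges runs into blobs, as in [25]."""
    return [("cone", 1.2, 0.4)]

comm = MPI.COMM_WORLD
if comm.rank == SIM:
    image = np.zeros((240, 320, 3), dtype=np.uint8)   # camera frame from simulator
    comm.Send(image, dest=VISION, tag=1)              # ship the raw frame
    positions = comm.recv(source=VISION, tag=2)       # object/people positions back
    # ... hand positions to the state-transition controller ...
elif comm.rank == VISION:
    image = np.empty((240, 320, 3), dtype=np.uint8)
    comm.Recv(image, source=SIM, tag=1)
    comm.send(segment(image), dest=SIM, tag=2)
```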
3.2 Vision Processing The image from the camera attached to the top of the robot is processed to identify the positions of obstacles as well as any landmarks in the field of view. Obstacles can be placed reasonably accurately relative to the robot, while landmarks allow the robot to determine where it and the obstacles are in world coordinates. The RLE algorithms identify objects in the image, providing their size and position. Once the objects have been located in the two-dimensional image, a coordinate transformation, based on the fact that the ground is level and all the joint angles are available, allows the object's position relative to the camera to be determined. If the joint angles are not available, an approximation of the camera position and orientation must be calculated from the image itself. Visual features that contribute to this calculation include the position of the horizon and any gravitationally vertical lines (such as the central line through a cone).
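Under the stated assumptions (level ground, known joint angles), the coordinate transformation amounts to intersecting a pixel's viewing ray with the ground plane. A minimal sketch, where K is the camera intrinsic matrix and R and cam_pos come from forward kinematics of the joint chain (all names are illustrative assumptions):

```python
import numpy as np

def pixel_to_ground(u, v, K, R, cam_pos):
    """Intersect the viewing ray of pixel (u, v) with the ground plane z = 0.

    K: 3x3 camera intrinsics; R: camera-to-world rotation; cam_pos: camera
    position in world coordinates (z up). Returns the (x, y, 0) world point.
    """
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray in camera frame
    ray_world = R @ ray_cam                             # rotate into world frame
    t = -cam_pos[2] / ray_world[2]                      # scale until z reaches 0
    return cam_pos + t * ray_world
```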
4 Path Planning The Hurosot benchmark tasks [14] include straight walking and weaving through a set of cones. If the distance between the cones and the starting position of the robot are known, it is possible to program a set of actions in an open-loop manner. To prevent this, the starting position and the distances between the cones can be varied, testing the robustness of the system.
Fig. 3. Gait path plan
Having identified the positions of the cones from the image processing system, it is possible to plan a path that avoids touching the cones using a piecewise-linear path. This is achieved by minimising the length of the path through the cones subject to the constraint that the distance to the cone must be greater than 0.5 m for the start and end cones and greater than 0.75 m for the remaining cones. Each linear component of this path should be the average step length, so that the approximate planned positions of the feet are easily calculated. The planned path is followed by modifying the lateral and yaw control of the robot. Lateral offsets allow any small lateral error in the position of the robot to be corrected, while yaw offsets correct for any errors in the direction of motion of the robot.
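One simple way to realise such a piecewise-linear plan is to place a waypoint on alternating sides of each cone at the required clearance, then resample each leg into average-step-length segments. The sketch below is an illustrative simplification (cones assumed roughly along the x-axis, clearance treated as a lateral offset), not the paper's planner:

```python
import numpy as np

def weave_waypoints(cones, step=0.4, end_clear=0.5, mid_clear=0.75):
    """cones: list of (x, y) cone positions in walking order.
    Returns a piecewise-linear path sampled at roughly one step length."""
    pts = []
    for i, c in enumerate(np.asarray(cones, dtype=float)):
        r = end_clear if i in (0, len(cones) - 1) else mid_clear
        side = 1.0 if i % 2 == 0 else -1.0           # pass on alternating sides
        pts.append(c + np.array([0.0, side * r]))    # lateral offset = clearance
    path = [pts[0]]
    for a, b in zip(pts, pts[1:]):                   # resample each linear leg
        n = max(1, int(np.linalg.norm(b - a) / step))
        for k in range(1, n + 1):
            path.append(a + (b - a) * k / n)
    return path
```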
5 Results The biped robot controller successfully walks through the cones as long as the space between the cones is larger than 3 m (the biped robot's legs are 0.87 m long); see Figure 4 for simulation results. Cone spacing of less than 3 m reduces the reliability of the walking algorithm; this can only be improved by modifying the algorithm to reduce the turning circle of the robot. The vision component executes within 10 ms, while the biped robot simulation executes in real time within the 16.67 ms sample time of the control system. When all components execute on a single processor, real-time performance cannot be achieved, since the sample time of the system is 16.67 ms. In the current system, using more than 2 processors for a robot simulation problem provides no improvement in performance, as the image processing and dynamic simulation algorithms have not been internally parallelised.
Fig. 4. a) Lateral movement, b) vertical movement and c) forward movement
6 Conclusions This paper presented a vision-based biped control system simulation toolkit that can operate in real time when executing on two or more processors. Simulations show that performance improves little with more than two processors, but it is hoped that with further work on distributing the simulation algorithm this obstacle can be overcome. This toolkit can form the basis for automatically developing vision-based control systems using genetic programming and neural network learning techniques, which can be developed rapidly before testing on real robot systems. Future work will extend the system so that multiple collaborating robots can be simulated in real time using multiple processors and machines.
References 1. Hashimoto, S. and Takanobu, H., (2000) Humanoid Robots in Waseda University - Hadaly-2 and WABIAN - IEEE-RAS International Conference on Humanoid Robots CD-ROM. 2. Hirai, K., Hirose, M., Haikawa, Y. and Takenaka, T., (1998) The development of Honda humanoid robot, In: Proc of the Int Conf on Robotics and Automation, vol 2, pp 1321-1326. 3. Pratt, J. and Pratt, G., (1998) Intuitive control of a planar bipedal walking robot, IEEE Conf on Robotics and Automation, pp 2014-2021. 4. Shih, C.L., (1996) The Dynamics and Control of a Biped Walking Robot with Seven Degrees of Freedom, Vol 118, Num 4, pp 683-690. 5. Raibert, M.H., Brown, H.B. and Chepponis, M., (1984) Experiments in Balance with a 3D One-Legged Hopping Machine, Int Journal of Robotics Research, Vol 3, No 2, pp 75-92. 6. Li, Z.X., (1989) Strategies for Biped Gymnastics, pp 433-437. 7. Zhou, C. and Meng, Q., (2003) Dynamic balance of a biped robot using fuzzy reinforcement learning agents, Fuzzy Sets and Systems, Vol 134, No 1, pp 169-187. 8. Okada, K., Kino, Y., Kanehiro, F., Kuniyoshi, Y., Inaba, M. and Inoue, H., (2002) Rapid Development System for Humanoid Vision-based Behaviours with Real-Virtual Common Interface, Proc. Int. Conf. on Intelligent Robots and Systems (IROS). 9. Cupec, R., Denk, J. and Schmidt, G., (2002) Practical Experience with Vision-Based Biped Walking, Proc Int Symp on Experimental Robotics (ISER'02). 10. Hirai, K., Hirose, M., Haikawa, Y. and Takenaka, T., (1998) The development of Honda humanoid robot, IEEE Conf on Robotics and Automation. 11. Ishida, T., Kuroki, Y., Yamaguchi, J., Fujita, M. and Doi, T.T., (2001) Motion Entertainment by a Small Humanoid Robot Based on OPEN-R, IEEE/RSJ Int Conf on Intelligent Robots and Systems. 12. Pratt, J.E., (2000) Exploiting Inherent Robustness and Natural Dynamics in the Control of Bipedal Walking Robots, unpublished PhD thesis, MIT. 13. Pratt, J. and Pratt, G., (1999) Exploiting Natural Dynamics in the Control of a 3D Bipedal Walking Simulation, Proc of the Int Conf on Climbing and Walking Robots. 14. Pratt, J. and Kun, A., Hurosot Simulation: Rules of the Game, FIRA web site http://www.fira.net 15. Park, J.H., (2001) Impedance Control for Biped Robot Locomotion, IEEE Transactions on Robotics and Automation, Vol 17, No 6.
16. Jagannathan, K., Pratt, G., Pratt, J. and Persaghian, A., (2001) Pseudo-trajectory Control Scheme for a 3-D Model of a Biped Robot, Proc of ACRA, pp 223-229. 17. Jagannathan, K., Pratt, G., Pratt, J. and Persaghian, A., (2001) Pseudo-trajectory Control Scheme for a 3-D Model of a Biped Robot (Part 2: Body Trajectories), Proc of CIRAS, pp 239-245. 18. Baltes, J. and McGrath, S., (2003) Tao-Pie-Pie, Proceedings of the RoboCup Symposium. 19. Zhang, R. and Vadakkepat, P., (2003) Motion Planning of Biped Robot Climbing Stairs, Proc FIRA Congress. 20. Miura, H. and Shimoyama, I., (1984) Dynamic Walk of a Biped, Int Journal of Robotics Research, Vol 3, No 2, pp 60-74. 21. Pratt, G.A. and Williamson, M.W., (1995) Series Elastic Actuators, Proc of the IEEE/RSJ Int Conf on Intelligent Robots and Systems (IROS-95), vol 1, pp 399-406. 22. Featherstone, R., (1999) A divide and conquer articulated-body algorithm for parallel O(log(n)) calculation of rigid-body dynamics, Part 1: Basic algorithm, International Journal of Robotics Research 18(9), pp 867-875. 23. Featherstone, R., (1999) A divide and conquer articulated-body algorithm for parallel O(log(n)) calculation of rigid-body dynamics, Part 2: Trees, loops and accuracy, International Journal of Robotics Research 18(9), pp 876-892. 24. Message Passing Interface Forum, (1995) The MPI message-passing interface standard, http://www.mpi-forum.org. 25. Messom, C.H., Demidenko, S., Subramaniam, K. and Sen Gupta, G., (2002) "Size/Position Identification in Real-Time Image Processing using Run Length Encoding", IEEE Instrumentation and Measurement Technology Conference, pp 1055-1060, ISBN 0-7803-7218-2. 26. OpenCV - Intel Open Source Computer Vision Library, available online http://www.intel.com/research/mrl/research/opencv/. 27. Sen Gupta, G., Messom, C.H. and Sng, H.L., (2002) "State Transition Based Supervisory Control for a Robot Soccer System", Proc of IEEE Int Workshop on Electronic Design, Test and Applications, pp 338-342, ISBN 0-7695-1453-7.
Modular Mechatronic Robotic Plug-and-Play Controller Jonathan R. Zyzalo1, Glen Bright2, Olaf Diegel1, and Johan Potgieter1 1
Institute of Technology and Engineering, Building 106, Albany Campus, Massey University, Auckland, New Zealand {O.Diegel, J.Potgiet, J.Zyzalo}@massey.ac.nz
2
School of Mechanical Engineering, KwaZulu-Natal University, Durban, South Africa [email protected]
Abstract. Most current industrial robot arms require a dedicated controller for their actuating systems. This can be a disadvantage when trying to integrate several robots into an agile manufacturing environment. More flexible and adaptive modular plug-and-play controllers can greatly enhance the use of these robots and ease their integration into modern, agile manufacturing environments. Interfacing automated machines can then be done at a PC level. At this level, "plug-and-play" becomes the benchmark for new devices being added to the system, allowing ease of operation and increased flexibility for agile manufacturing. The modular mechatronic control system described in this paper was used to operate a Unimate PUMA 560 series industrial robotic arm.
1 Introduction In today's manufacturing community, mass production of custom products is becoming an important issue. Flexible manufacturing systems are required by manufacturing companies to meet the demand for high-quality, low-cost products [1]. Agile manufacturing allows a manufacturer to efficiently change manufacturing processes or operations to produce custom products at mass-manufacturing speeds [2]. Agile manufacturing systems are controlled by computer-based technology. Since the advent of the microprocessor, computer-based technologies have made it possible to improve productivity, reduce manufacturing costs, and produce higher-quality goods. The development of the microprocessor has seen the use of robots for many applications. Agile manufacturing systems generally consist of a number of automated machine tools and materials-handling systems. These systems are flexible enough to reconfigure their elements or processes in order to produce custom products. Industrial robots are an important part of an agile manufacturing process due to the flexibility of robotic arms [3]. There are many brands of industrial robot arms available. A problem that has, however, occurred over the years is a lack of standardisation of the operating systems between the robots. There is little interoperability between different manufacturers' systems and between the different generations of automated machinery. Most robots and associated automated systems are custom built and expensive to upgrade [4]. The main disadvantage of current robots is that a dedicated and expensive controller is usually required for the robot's actuating systems. This proves costly and makes interfacing with the robot complex due to hardware and software variations. It also reduces the flexibility of the machine.
The objective of this research was to develop a low-cost modular mechatronic plug-and-play controller for the control of a 6-axis robot. The system would then be tested with a PUMA 560 robot and with a 2-axis CNC lathe to demonstrate the system's modularity and flexibility. Ultimately, a truly flexible modular mechatronic controller would not only be low cost, but would also allow the selection of any particular motorised system through software, and thus have the controller reconfigure itself appropriately. This means that a large-scale agile manufacturing system could be controlled from a single central PC, with fast set-up times and without any mismatches between hardware and software.
2 Industrial Robots A Unimate PUMA 560 series robot arm was donated to the Institute of Technology and Engineering for the project. This robot was a six-axis revolute robot arm. An initial aspect of the project was to become familiar with the PUMA 560’s actuation system. This industrial robot came supplied with the entire arm intact but without the control hardware or power supply.
Fig. 1. PUMA 560 [5]
Each joint was operated by a 40V brushed permanent-magnet DC motor. The motors for the bottom three joints were rated at 160W and the motors in the wrist at 80W. Each of the first three joints (waist, shoulder, elbow) was equipped with a 24V electromagnetic brake. All the joint motors were fitted with 250-line sine-cosine incremental encoders giving position feedback to the controller [6].
3 Hardware Development From the specifications of the PUMA 560 robot, it was found that a 40V DC power supply was needed to power the motors. A 24V DC supply was needed to disengage the electromagnetic brakes and a 5V logic supply was necessary to power all the encoder circuits on the motors and the microprocessors used in the controller.
A power supply was built for the system, including a main transformer rated at 640 VA. It also included logic power for the encoder circuits and the microprocessors, using a computer ATX switch-mode power supply unit. This was a very convenient logic supply as it provided 0V, +5V, -5V, +12V, and -12V. The encoders used on each robot motor were quadrature encoders, which generate two output signals, one offset 90° with respect to the other. By evaluating which signal is leading, the direction of rotation of the encoder disk can be determined. Based on an investigation into the workings of modern industrial incremental encoders, a square-wave output was expected from the encoder receivers [7]. However, due to the age of the encoder circuits, only small, 30mV peak-to-peak sine-wave outputs were detected. For the microprocessor to be able to count the encoder increments, the analogue sine-wave signal required conversion into a digital pulse train. Amplification was accomplished by using a standard LM741 operational amplifier to amplify the differential of each pair of signal lines. An LM339 voltage comparator then converted the amplified signal into a pulse train. The circuit shown in Figure 2 was implemented on each of the encoder lines of the six motors.
Fig. 2. Encoder output conversion circuit
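Once the comparator produces clean digital A/B pulse trains, direction and position follow from the standard quadrature state table. The sketch below illustrates the decoding logic in Python; it is not the PIC firmware itself.

```python
# (prev_state, new_state) -> count increment. States are 2-bit (A, B) pairs;
# the forward Gray-code sequence is 00 -> 01 -> 11 -> 10 -> 00.
QUAD_STEP = {
    (0b00, 0b01): +1, (0b01, 0b11): +1, (0b11, 0b10): +1, (0b10, 0b00): +1,
    (0b00, 0b10): -1, (0b10, 0b11): -1, (0b11, 0b01): -1, (0b01, 0b00): -1,
}

def update_count(count: int, prev_state: int, a: int, b: int):
    """Fold one new (A, B) sample into the position count.
    Invalid transitions (e.g. both channels changing at once) add 0."""
    state = (a << 1) | b
    return count + QUAD_STEP.get((prev_state, state), 0), state
```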
Modularity was one of the principal project requirements. The objective was to provide a reconfigurable system with the possibility of easily expanding it to control more motors, or of replacing any damaged modules as required. In this project, the system was to be tested on both the 6-axis PUMA 560 robot and a 2-axis CNC lathe. The design was therefore based around an industry-standard 19" rack. Each individual 4U-high rack bin could contain up to six separate 2-channel motor control module assemblies (Figure 3); a full rack bin could thus control up to 12 motors. The rack bin had several buses on the rear of the enclosure for the 5V and 0V logic supply, the 40V and 0V motor power supply, and the communication bus.
Fig. 3. 19” Racking System with slides
The project was designed so that the motor control modules could slide into the rack from the front as control cards. Each control card module was designed to control two motors. The block diagram for the modular system is shown in Figure 4.
Fig. 4. Motor Control Card Block Diagram
All the inputs and outputs of each card were routed to the back of the module so that the module could plug into the rack. The I/O for the microprocessor was wired to the back of the card with ribbon cable, and a female DB-25 connector was used to plug the I/O into the rack bin. Six gold-plated terminals plugged into the back of the 19" rack bin, providing the interface with all the high-power motor wiring and the communication bus. The H-bridges used for the project were Devantech MD03s. These H-bridges provided the voltage and current required to drive the motor system. Rated at 50V and 20A, the MD03 is a medium-power motor driver. The MD03 has four control modes; the mode used for the project was the Analogue Mode, which can be controlled using digital TTL logic levels supplied by the microprocessor. The SDL input of the MD03 indicated the direction: logic 0 for reverse and logic 1 for forward. The SDA input controlled the amount of voltage sent to the motor: 0V indicated no power and 5V full power. This mode allowed a pulse-width-modulation (PWM) signal on the SDA input to control the motor speed. PWM is a technique used by most microprocessors and other controllers to produce an output voltage at any value between the power rails. It consists of a pulse train whose duty cycle is varied so that it creates variable "on" and "off" states; the average output value approximates the same percentage of the rail as the "on" time [8]. In the case of the MD03 H-bridge used for the project, a 0% duty cycle represented 0V supplied to the motor, a 50% duty cycle represented half of the available supply voltage, and a 100% duty cycle was maximum voltage. The robot's motors were controlled using PIC18C252 microprocessors. Each PIC was implemented in an off-the-shelf module, the BrainStem Moto 1.0, supplied by Acroname. The BrainStem has the following specifications: 40MHz RISC processor, 2 motion-control channels with PWM frequencies from 2.5kHz to 5kHz, 5 analogue inputs with 10-bit ADC, 11 digital I/O lines, a 1 Mbit/s I²C port with routing, a status LED, 368 bytes of user RAM, and RS-232 serial port communication.
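The Analogue-Mode semantics described above can be summarised in a few lines. The sketch below maps a signed speed command onto the SDL/SDA pair; the function and names are illustrative, not the MD03 register interface.

```python
V_SUPPLY = 40.0  # motor rail voltage used in this project

def md03_outputs(command: float):
    """Map a signed command in [-1, 1] to MD03 Analogue-Mode inputs.

    SDL carries direction (1 = forward, 0 = reverse); SDA carries a PWM
    duty cycle whose average value sets the motor voltage: 0% -> 0 V,
    50% -> half rail, 100% -> full rail.
    """
    sdl = 1 if command >= 0 else 0
    duty = min(abs(command), 1.0)
    avg_voltage = duty * V_SUPPLY
    return sdl, duty, avg_voltage

# Example: a -0.5 command gives reverse direction at ~20 V average.
print(md03_outputs(-0.5))  # (0, 0.5, 20.0)
```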
Modular Mechatronic Robotic Plug-and-Play Controller
229
The Moto module was used in Encoder PID mode, which adjusts the position of the motor toward a desired value, called the set-point, based on feedback from the motor's encoder. A PID algorithm running on the PIC determined how much PWM was applied to the motor over time in order to maintain or move to a desired position. Proper selection of the PID gain constants minimised oscillations of the motor [9]. Figure 5 shows the overall logic flow of the control loop.
Fig. 5. Basic Flow of Encoder PID Mode
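A minimal sketch of one iteration of such an encoder-feedback PID loop is shown below. It is illustrative only (the actual BrainStem firmware is not public); dt is assumed to be the fixed, non-zero control period.

```python
class EncoderPID:
    """Encoder count in, clamped PWM command out."""

    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint: float, encoder_count: float, dt: float) -> float:
        err = setpoint - encoder_count
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt
        self.prev_err = err
        duty = self.kp * err + self.ki * self.integral + self.kd * deriv
        return max(-1.0, min(1.0, duty))  # clamp to the signed PWM range
```

With fixed gains, a large error drives the output straight to the clamp, which is exactly the full-voltage behaviour (and its gravity asymmetry on the large joints) reported in Section 5.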
The PIC allowed for reduced-instruction-set computing (RISC). The Moto responded to a limited set of pre-programmed commands, which were used to communicate with each microprocessor via a serial cable and to retrieve and set data in the microprocessor. Another important feature of the BrainStem Moto modules was that they could be daisy-chained on an I²C bus. This allowed control card modules to be placed anywhere on the bus; all that was required was for the bus address to be set on each BrainStem before "plugging in" the control card. This allowed communication with any BrainStem on the bus and between the BrainStems themselves.
4 Software Development The Graphical User Interface (GUI) was developed using Visual Basic 6.0 and was built up progressively to test the functionality of the system. It communicated with the microprocessors through RS-232 serial communication, via the first BrainStem Moto module in the chain, which acted as a router. Each packet of data sent from the PC started with the address of the BrainStem the message was intended for; if the packet was not addressed to the first module, the router sent the packet on the bus to the appropriate module.
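The router behaviour amounts to a one-line address check. A sketch with an assumed packet layout (the real BrainStem packet format is not reproduced here):

```python
def route(packet: dict, my_address: int, forward_on_bus, handle_locally):
    """First module in the chain: consume packets addressed to it,
    forward everything else onto the I2C bus."""
    if packet["addr"] == my_address:
        handle_locally(packet)
    else:
        forward_on_bus(packet)

# Example packet, invented for illustration:
# {"addr": 4, "cmd": "set_point", "data": 12345}
```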
230
J. R. Zyzalo et al.
The main GUI served the following major functions:
- Communication management of packets between the BrainStems and the PC.
- Input of set-points for each motor controlled.
- Manipulation of the settings of programs running on the PIC; for the BrainStems this included the PID control settings, mode selection, PWM, register monitoring, etc.
The GUI allowed for the control of up to six motors. Each motor's set-point could be changed and sent to the respective BrainStem Moto module.
5 System Performance When the entire system was assembled, control of the robot was possible. The system performed well enough to control the motion of each of the six joints of the PUMA robot arm to the specified set-points. Though the system eventually performed correctly and successfully controlled both the PUMA 560 and a CNC lathe, there were initially a few problems to overcome. The main problem occurred with the logic power supply. Sporadically, the logic power, which supplied all the encoder circuits and microprocessors, would fail to power the most essential parts of the system, resulting in unpredictable and unsafe behaviour of the robot arm. Further investigation revealed that the 0V ground of the logic power supply was floating, causing differences between the grounds of other system components. The problem was solved by using the 0V ground of a bench-top power supply. The PID control method did not account well for the effects of gravity because the feedback gains of the PID algorithm were fixed, so the system had a very low level of repeatability and accuracy. The Encoder PID mode of the BrainStem Moto 1.0 worked well for the wrist joints but did not perform well for the larger joints. When a new set-point was entered for a motor, the PID control loop output maximum voltage to the motor until it neared the new set-point, based on the feedback from the encoders and the PID gains. It did this without accounting for the effects of inertia and gravity, which meant that the shoulder and elbow joints would move more rapidly in the down direction than in the up direction. To correct this problem, a velocity control method was implemented. Another problem was that the BrainStem Moto 1.0 only had a 16-bit set-point number, reduced to 15 bits because the most significant bit was used to indicate direction. Due to the resolution of the 250-line incremental encoders, only limited movement of a joint could be completed with each command sent to a BrainStem. The BrainStem program was unable to detect when a movement was complete so that the next movement of the robot could take place, as no flag was set to indicate that a new set-point had been reached. This problem was bypassed by introducing a timer delay between selected movements. The robot joints still required calibration, which was done each time the robot was turned on. Potentiometers on the motors were connected directly to the analogue inputs of the BrainStem Moto 1.0, which provided a 10-bit analogue-to-digital conversion of each potentiometer value. This value gave an indication of the position of the robot when it was powered up.
The modular control system was initially tested and developed around the 6-axis PUMA 560 robot. After successfully being used to control the PUMA, it was then used to control a 2-axis CNC lathe. All that was required for the changeover were software changes to select the appropriate module addresses and to retune the PID software algorithms for the lathe. In future developments, the software would contain a library of standard robot, lathe, and other configurations that can be called up to automatically configure the control modules, as sketched below.
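A configuration library of the kind proposed might look like the following sketch. The device entries, bus addresses, and gains are invented for illustration, and set_gains stands in for whatever routine writes parameters over the bus.

```python
DEVICE_LIBRARY = {
    "puma560":   {"axes": 6, "modules": [2, 4, 6], "pid": [(2.0, 0.1, 0.05)] * 6},
    "cnc_lathe": {"axes": 2, "modules": [2],       "pid": [(1.5, 0.2, 0.02)] * 2},
}

def configure(device: str, set_gains) -> None:
    """set_gains(module_addr, channel, (kp, ki, kd)) writes one axis's gains.
    Each control card drives two motor channels, so axes map onto
    (module, channel) pairs in order."""
    cfg = DEVICE_LIBRARY[device]
    axis = 0
    for addr in cfg["modules"]:
        for channel in (0, 1):
            if axis < cfg["axes"]:
                set_gains(addr, channel, cfg["pid"][axis])
                axis += 1

# Switching machines is then a one-line change:
# configure("cnc_lathe", set_gains)
```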
6 Conclusion The objective of this project was to develop a modular mechatronic plug-and-play controller for the control of a 6-axis robot system. All steps in the research project were completed successfully, such that the robot was capable of movement through the developed software interface. The modular controller was then successfully used, with only some software reconfiguration, to control a CNC lathe. Further BrainStem Moto 1.0 development is required to improve the system's resolution. Further PIC programming could also improve the repeatability and accuracy of the control system. Research into more sophisticated control techniques is also an area for further development. Future work will include integrating the system with a computer-aided manufacturing (CAM) package for materials handling and assembly. The software will also be developed with a library of available robots and motor-driven devices so that the system can easily be configured to drive different devices, allowing seamless integration into an agile manufacturing environment.
References 1. Preiss, K., Goldman, S.L. and Nagel, R.N., (1996) Cooperate to Compete: Building Agile Business Relationships, Van Nostrand Reinhold. 2. Kidd, P.T., (1994) Agile Manufacturing: Forging New Frontiers, Addison-Wesley. 3. Krar, S. and Arthur, G., Exploring Advanced Manufacturing Technologies, New York: Industrial Press. 4. Grabowski, R., Navarro-Serment, L.E., Paredis, C. and Khosla, P.K., (2002) Heterogeneous Teams of Modular Robots, in Robot Teams, edited by T. Balch and L. Parker, Carnegie Mellon University. 5. Fu, K.S., Gonzalez, R.C. and Lee, C.S.G., (1984) Robotics: Control, Sensing, Vision and Intelligence, Singapore: McGraw-Hill. 6. Wyeth, G.F., Kennedy, J. and Lillywhite, J., (2000) Distributed Control of a Robot Arm, Proceedings of the Australian Conference on Robotics and Automation (ACRA 2000), August 30 - September 1, Melbourne, pp. 217-222. 7. Valentine, R. (Ed), (1998) Motor Control Electronics Handbook, McGraw-Hill, Boston. 8. Agrawal, J.P., (2001) Power Electronic Systems: Theory and Design, Upper Saddle River, N.J.: Prentice Hall. 9. Acroname, BrainStem: Moto, Retrieved Nov 2003, http://www.acroname.com
The Correspondence Problem in Topological Metric Mapping - Using Absolute Metric Maps to Close Cycles Margaret E. Jefferies1, Michael C. Cosgrove1, Jesse T. Baker1, and Wai-Kiang Yeap2 1
Department of Computer Science, University of Waikato, Hamilton, New Zealand {mjeff, mcc2, jtb5}@cs.waikato.ac.nz 2
Artificial Intelligence Technology Centre, Auckland University of Technology, Auckland, New Zealand [email protected]
Abstract. In Simultaneous Localisation and Mapping (SLAM) the correspondence problem, specifically detecting cycles, is one of the most difficult challenges for an autonomous mobile robot. In this paper we show how significant cycles in a topological map can be identified with a companion absolute global metric map. A tight coupling of the basic unit of representation in the two maps is the key to the method. Each local space visited is represented, with its own frame of reference, as a node in the topological map. In the global absolute metric map these local space representations from the topological map are described within a single global frame of reference. The method exploits the overlap which occurs when duplicate representations are computed from different vantage points for the same local space. The representations need not be exactly aligned and can thus tolerate a limited amount of accumulated error. We show how false-positive overlaps, which are the result of a misaligned map, can be discounted.
1 Introduction In this paper we describe one of the approaches we are using to solve the correspondence problem in Simultaneous Localisation and Mapping (SLAM). This is regarded as one of the hard problems in SLAM. It is often termed cycle or loop closing because the problem presents itself when the robot traverses a cycle in its environment. The challenge is to recognise that the cycle has been closed, that is, that parts of the environment observed from different vantage points correspond to the same physical space. The problem is encountered in both topological and absolute metric maps. For absolute metric maps, current localisation methods provide consistent enough local maps, but residual error accumulates over large distances. By the time a large cycle is encountered the map will contain significant inconsistencies (see Fig. 1). Current approaches use some form of probability evaluation to estimate the most likely pose of the robot given its current observations and the current state of its map [1-4]. Detecting the cycle allows the map to be aligned correctly but means the error has to be corrected backwards through the map.
Fig. 1. The topological and metric maps. (a) A corner of the robot's environment, a large semi-open laboratory and its surrounding corridor. (b) The topological map. (c) The global metric map. The ASRs are numbered in the order they are encountered
Most topological approaches to robot spatial mapping partition the environment in some way and link these partitions, as they are experienced, to form a topological map [5-8]. The advantage of this approach is that global consistency is not an issue, because the error cannot grow unbounded as in absolute metric maps. Consistency is not a problem within the partitions, as they are usually around the size of a local environment, and state-of-the-art localisation methods are good enough for local environments. In closing cycles in a topological map, the problem is to match two nodes in the topological map if they represent the same physical space (the correspondence problem) and to distinguish two nodes that look the same but represent different parts of the environment (the perceptual aliasing problem). Recently, hybrid topological/metric approaches have emerged [7, 9, 10], and in [7] the advantages of both the topological and metric mapping paradigms are exploited in closing large cycles. Hybrid approaches are popular in the cognitive mapping community [8, 11-13]; however, the metric and topological maps do not have equal status there: the topological map is the dominant representation in their models. Cognitive maps are often regarded as being like a "map in the head" that an agent (human, animal or robot) has of its experience of its spatial environment. In absolute metric maps, the need to match the local map associated with a particular pose and the need to propagate error corrections backwards through the map have seen the introduction of
topologically linked local metric maps for sequences of poses [1-3]. However, these are a means to an end, namely more consistent absolute metric maps. Our mapping system is based on our previous work, in which a computational theory of cognitive mapping was derived from empirical evidence of how humans and animals solve similar problems [8, 14]; an agent could be human, animal or robot. Cognitive mapping researchers have been interested in the correspondence problem for some time, but it was not clear from their computer simulations that their algorithms would handle all the uncertainties a robot faces in the real world [8, 11, 12]. Recently, cognitive mapping researchers have begun to adapt their theories and algorithms to the real-world problems robots encounter [15-17]. Our approach to mapping the robot's environment extends the hybrid model of [8] and adheres to the dominant cognitive mapping tenet that the prime representation is the topological map (see [5, 8] for a discussion of why this is so). Yeap and Jefferies' [8] topological map of metric local-space descriptions (see Fig. 1) has been implemented on a mobile robot with minor adaptations to handle input from a laser range sensor. Yeap and Jefferies [8] proposed a limited (in size) absolute metric map to close small cycles in the topological map; the restricted size of their absolute metric map reflects the limitations of the human or animal path-integration system under accumulating error [18]. The idea is that parts of the map distant enough from the agent's current pose will be significantly misaligned with the rest of the map due to accumulated error, and would simply drop out of the map. In practice, however, without some error correction the global metric map can detect very few cycles. In the implementation we describe here, using a locally consistent global metric map, we are able to detect significant cycles, and we use the global metric map to detect and close cycles in the topological map. False-positive matches are possible, but using the method in conjunction with topological verification we are able to eliminate most of them [17].
2 The Basic Mapping Approach The topological map comprises a representation for each local space visited, with connections to others which have been experienced as neighbours. The local space is defined as the space which "appears" to enclose the robot, and its representation is referred to as an Absolute Space Representation (ASR), a term which emphasises the separateness and independence of each individual local space. Each ASR in the topological map has its own local coordinate frame; note that these are local absolute spaces, in contrast to the global absolute metric representations referred to in Section 1. The basic algorithm described in [8] was modified to handle input from a laser range sensor and accumulating odometric and sensor errors, but the fundamentals of the algorithm remain. Yeap and Jefferies [8] argued that the exits should be constructed first, because they are the gaps in the boundary which tell the robot how it can leave the current space. An exit occurs where there is an occlusion and is formed by creating the shortest edge which covers the occlusion. Once the exits are formed it is a straightforward process to connect the surfaces which lie between them to form the boundary of the ASR. At the same time, surfaces which
are viewed through the exits, and are thus outside the ASR, are eliminated. Fig. 2(b) shows a sequence of two ASRs computed in this way. See [8] for an in-depth description of the basic algorithm and [17, 19] for the details of how it is implemented on an autonomous mobile robot using laser range sensing.
Fig. 2. (a) A section of the robot's environment. (b) The ASRs constructed correspond to the labelled sections of the environment in (a). E1 and E2 are exits; E1 links ASR1 and ASR2
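A minimal sketch of exit detection from an ordered range scan follows. An occlusion appears as a large jump between consecutive scan points, and the exit is the edge spanning it. The gap threshold is an assumed parameter; the full algorithm in [8] additionally selects the shortest covering edge and prunes surfaces seen through exits.

```python
import numpy as np

def find_exits(scan_xy, gap_threshold=0.8):
    """scan_xy: laser points (x, y) ordered by bearing around the robot.
    Returns candidate exits as (point_before, point_after) edge pairs."""
    exits = []
    for p, q in zip(scan_xy, scan_xy[1:]):
        p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
        if np.linalg.norm(q - p) > gap_threshold:  # depth discontinuity
            exits.append((p, q))                   # edge covering the occlusion
    return exits
```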
Rofer's [20] histogram-correlation localisation method is used to provide consistency within ASRs. New ASRs are computed whenever the robot crosses an exit into an unexplored region, and ASRs are linked, as they are experienced, via the exits which connect them to their neighbours in the topological map. The ASRs are the nodes of the topological map and the exits are its edges. Fig. 1(b) shows the topological map constructed in our large L-shaped open-plan laboratory and its surrounding corridor; ASRs 1-8 and ASR 13 comprise the laboratory, and the remainder the corridor. Tables and protruding desks provide occlusions where exits are computed. In large open spaces there are fewer occlusions and thus fewer opportunities to partition the space, for example ASR 7 in Fig. 1(b).
3 Closing Cycles with a Global Absolute Metric Map The main advantage of global absolute metric mapping should be that, because the robot's location is measured in absolute terms, returning to a previously visited place is clearly apparent from the robot's location within the absolute map. In reality, however, this is not the case: significant misalignment of the map occurs as residual errors accumulate (see Fig. 1(c)). We noted, though, that this misalignment is often not complete; even though there is significant misalignment in the map, the corresponding ASRs may continue to have substantial overlap. For example, in Fig. 1(c), due to the misalignment along the corridor comprising ASRs 11 and 12A, one cannot detect immediately from the robot's pose that the robot has re-entered ASR12A from ASR13. However, it can be seen that ASR12A overlaps with the ensuing duplicate ASR12B. Note that ASR12B is smaller than ASR12A, as the robot has yet to fully explore it. If we maintain the global metric map as a collection of ASRs in a single global coordinate system, we can exploit this overlap to detect that the robot is re-entering a known part of its environment.
The global metric map is discretised into the local-space descriptions which correspond to the nodes in the topological map. Whenever the robot crosses an untraversed exit it computes a new ASR for its current local environment. It then checks the known ASRs in the global metric map for overlap. We want to detect true overlap, i.e. overlap which is probably not as large as it should be because of the misaligned map, rather than the false overlap which results from the map misalignment. To minimise the effect of the latter we match ASR centres. The robot's position is first projected to the centre of the current ASR, and this location is checked for inclusion in the ASRs in the global map. For example, in Fig. 1(c) the robot's position is projected to the centre of ASR12B, and this position is checked for inclusion in ASRs 1-12A; the test succeeds for ASR12A. To minimise the effect of the spurious overlaps which are the result of the misalignment, we then perform a crosscheck of the matching ASRs' centres: in Fig. 1(c) we take the centre of ASR12A and check it for inclusion in ASR12B. This eliminates many of the false-positive matches at very little cost; the trade-off is that some true positive matches will be missed. The method tolerates a significant but limited amount of accumulated error, since each of the centres of the duplicate ASRs must lie inside the other. Fig. 3(b) shows an example of an overlap which would fail the centres crosscheck.
Fig. 3. (a) The environment (b) An example of where the overlap would not be detected. The centers of each of the overlapping ASRs are not inside the corresponding ASR
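The mutual centre-inclusion test is compact to state in code. The sketch below uses the third-party shapely geometry library and treats each ASR boundary as a polygon, with the polygon centroid standing in for the paper's ASR "centre" (an assumption on our part):

```python
from shapely.geometry import Polygon

def duplicate_asr(current: Polygon, known: Polygon) -> bool:
    """True if the two ASRs plausibly describe the same local space:
    each polygon's centre must lie inside the other, so the test tolerates
    limited misalignment but rejects the glancing overlaps of Fig. 3(b)."""
    return known.contains(current.centroid) and current.contains(known.centroid)

# Example with two roughly coincident rectangular ASRs:
a = Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])
b = Polygon([(0.5, 0.4), (4.5, 0.4), (4.5, 3.4), (0.5, 3.4)])
print(duplicate_asr(a, b))  # True: centres fall inside each other
```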
While the above check discounts many false-positive matches, if the accumulated error is significantly large then some false matches may still pass this test. The next step in the process is to "close the loop" in the topological map; in the example of Fig. 1(c), this means that ASR12A is linked to ASR13. To achieve this linking of ASRs, the corresponding exits need to be found, in particular the pair belonging to the edge which closes the cycle (see Fig. 4). Fortuitously, this provides another mechanism for eliminating false-positive matches: if the pair of corresponding exits cannot be found, the match is rejected. We do not attempt to combine ASR12A and ASR12B into a single integrated representation. The problem is that, even accounting for the fact that ASR12B has not been fully explored, there are significant differences between the boundaries of ASR12A and ASR12B. Some of this is due to sensing and odometry errors, but it can also be attributed to the fact that the ASRs are viewed from different vantage points; the same physical space does not look the same when viewed from different locations. Combining the ASRs would provide a neater map. However, from whichever
viewpoint the robot encountered the ASR, the map would be a compromise. This is problematic in dynamic environments, where discrepancies between the representation of the current view and a previous representation need to be attributed either to map errors or to real changes in the environment. Thus we maintain duplicate representations for the same physical space, corresponding to the different vantage points from which they were initially computed. The links in the topological map which correspond to duplicate ASRs are unidirectional. For example, in Fig. 4, when traversing from ASR11 to ASR13, ASR12A is used; when traversing from ASR3 to ASR11, ASR12B is used.
Fig. 4. The topological map with its cycle closed, i.e. ASR12A is linked to ASR13
The main purpose of our approach is to close cycles in the topological map. However, with the cycle closed there is the opportunity to realign the global map, correcting the error backwards through the map, and to develop a model of the residual error to assist future cycle detection. We are currently investigating this aspect of our approach and comparing it with Yeap and Jefferies' [8] limited-size global metric map, in which the misaligned parts of the map would simply drop off. We also employ landmark matching to identify and close cycles in the topological map [17]. Cycles detected in the topological map provide supporting evidence for cycles detected in the global metric map, and vice versa.
4 Conclusion We have shown that significant cycles in a topological map can be detected from the corresponding cycles in a global metric map. The key to the approach is to ensure that the global metric map is made up of the ASRs in the topological map. The approach is conservative: we sacrifice some true-positive matches so that we can reject most false-positive matches. Even so, combined with landmark cycle detection [17], we are able to close many cycles in large-scale environments.
Missing the opportunity to close a cycle in a topological map is not catastrophic, as it is in absolute metric mapping; the outcome is simply that the robot will take a longer route than it needs to.
References 1. Hahnel, D., Burgard, W., Fox, D., and Thrun, S. An efficient FastSLAM algorithm for generating maps of large-scale cyclic environments from raw laser range measurements. in Proceedings Intelligent Robots and Systems. 2003. 2. Thrun, S., Hahnel, D., Ferguson, D., Montemerlo, M., Triebel, R., Burgard, W., Baker, C., Omohundro, Z., Thayer, S., and Whittaker, W. A system for volumetric robotic mapping of abandoned mines. in Proceedings International Conference on Robotics and Automation. 2003. 3. Hahnel, D., Thrun, S., Wegbreit, B., and Burgard, W. Towards lazy data association in SLAM. in Proceedings 10th International Symposium of Robotics Research. 2003. 4. Gutmann, S. and Konolige, K. Incremental mapping of large cyclic environments. in Proceedings International Symposium on Computational Intelligence in Robotics and Automation. 1999. 5. Kuipers, B., The spatial semantic hierarchy. Artificial Intelligence, 2000. 119, 191-233. 6. Tomatis, N., Nourbakhsh, I., and Siegwart, R. Simultaneous localization and map building: A global topological model with local metric maps. in Proceedings International Conference on Intelligent Robots and Systems. 2001. 7. Bosse, M., Newman, P., Leonard, J., Soika, M., Feiten, W., and Teller, S. An Atlas framework for scalable mapping. in Proceedings International Conference on Robotics and Automation. 2003. 8. Yeap, W.K. and Jefferies, M.E., Computing a representation of the local environment. Artificial Intelligence, 1999. 107, 265-301. 9. Tomatis, N., Nourbakhsh, I., and Siegwart, R. Hybrid simultaneous localization and map building: closing the loop with multi-hypotheses tracking. in Proceedings IEEE International Conference on Robotics and Automation. 2002. Washington DC, USA. 10. Thrun, S., Learning metric-topological maps for indoor mobile robot navigation. Artificial Intelligence, 1998. 99(1), 21-71. 11. Kuipers, B.J. and Byun, Y.-T. A robust, qualitative method for robot spatial learning. in Proceedings of the National Conference on Artificial Intelligence (AAAI-88). 1988. 12. Yeap, W.K., Towards a computational theory of cognitive maps. Artificial Intelligence, 1988. 34, 297-360. 13. Chown, E., Kaplan, S., and Kortenkamp, D., Prototypes, location, and associative networks (PLAN): Towards a unified theory of cognitive mapping. Cognitive Science, 1995. 19, 1-51. 14. Jefferies, M.E. and Yeap, W.K. Representing the local space qualitatively in a cognitive map. in Proceedings Twentieth Annual Conference of the Cognitive Science Society. 1998. 15. Kuipers, B. and Beeson, P. Bootstrap learning for place recognition. in Proceedings 18th National Conference on Artificial Intelligence. 2002. 16. Beeson, P., MacMahon, M., Modayil, J., Provost, J., Savelli, F., and Kuipers, B. Exploiting local perceptual models for topological map building. in Proceedings IJCAI-2003 Workshop on Reasoning with Uncertainty in Robotics. 2003. 17. Jefferies, M.E., Weng, W., Baker, J.T., Cosgrove, M.C., and Mayo, M. A hybrid approach to finding cycles in hybrid maps. in Proceedings Australian Conference on Robotics and Automation. 2003.
18. Gallistel, C.R. and Cramer, A.E., Computations on metric maps in mammals: getting oriented and choosing a multi-destination route. The Journal of Experimental Biology, 1996. 199, 211-217. 19. Jefferies, M.E., Yeap, W.K., and Baker, J., Robot mapping with a topological map of local space representations, in Advances on Simulation, Systems Theory and Systems Engineering, N.E. Mastorakis, V.V. Kluev, and D. K., Editors. 2002, WSEAS Press. 287-294. 20. Rofer, T. Using histogram correlation to create consistent laser scan maps. in Proceedings IEEE International Conference on Intelligent Robotics Systems. 2002.
Developing a “Virtual Student” Model to Test the Tutor and Optimizer Agents in an ITS Mircea Gh. Negoita and David Pritchard School of Information Technology, Wellington Institute of Technology, Private Bag 39089, Wellington Buick Street, Petone, New Zealand [email protected] [email protected]
Abstract. Education is increasingly using Intelligent Tutoring Systems (ITS), both for modeling instructional and teaching strategies and for enhancing educational programs. The first part of the paper introduces the basic structure of an ITS, as well as common problems being experienced within the ITS community. The second part describes WITNeSS, an original hybrid intelligent system using Fuzzy-GA techniques for optimizing the presentation of learning material to a student. In part three, our original work relates to the concept of a "virtual student". This student model, built using fuzzy technologies, will be useful for any ITS, providing it with an optimal learning strategy for fitting the ITS itself to the unique needs of each individual student. Experiments focus on the problems of developing a "virtual student" model which simulates, in a rudimentary way, human learning behavior. The paper finishes with concluding remarks.
1 Introduction There would seem to be many students who really want to learn, who have a huge appetite to learn, but who constantly struggle with their work. They just have not been prepared to be independent learners who can think and solve problems. A recent but promising area for applying IHS (Intelligent Hybrid Systems) is intelligent tutoring systems. Intelligent tutoring systems based on IHS are becoming a highly effective approach to developing computer teaching systems. They model instructional and teaching strategies, enhancing educational programs and enabling them to decide on "what" to teach students, "when" to teach it and "how" to present it. A "stand-alone" intelligent (IHS-based) tutoring component is added to the usual learning environment to support the work done by lecturers in teaching their students (Negoita and Pritchard, 2003a, 2003b).
1.1 Using IHS in ITS Applications Human society is evolving, changing at a faster and faster rate. These changes are having a major influence on every aspect of global social-economic development, including education, business, commerce and industry. One of these changes is the
growing influence of the latest advances in information technology and computer science on the development of better learning systems. One of the ultimate achievements of this trend would be the development of an IHS-based learning system primarily designed to help any student become a better learner. Such artificial learning systems would challenge students to work with and improve on one of the most important factors in any learning situation: themselves. Although IHS and its CUAI techniques form the main field of knowledge involved in the development of the learning system, other disciplines are also closely associated, such as psychology, human-computer interface technologies, knowledge representation, databases, systems analysis and design and, not least, advanced computer programming technologies. One of the characteristics shared by all the users the ITS is trying to help is a high degree of uncertainty. People are different, and just when you think you have them worked out they do something different. At present, many electronic learning systems do nothing to address this problem: they would simply have the student change to fit the system. What is really required is for the learning system itself to fit the student's unique needs. So the system must be able to change itself to suit the student, which means a high level of adaptability must be displayed. The key role in accomplishing this degree of adaptability will be played by IHS technologies. We feel that it is only in this way that the main aim of an ITS, to provide adaptive multi-functioning for both teacher and student, will be achieved. With such a degree of uncertainty, it would seem that the most suitable modeling strategy for achieving the desired functionality is a fuzzy system. In achieving this highly desired learning strategy of fitting "itself to the student's unique needs", the system will be able to present the student with any content or skill set they wish to learn, in a way that suits their particular personal, individual learning style and psychological profile. The system will be able to deliver the right content to the right user in the right form at the right time (Smyth 2003). But it is more than just adaptability; there is also the question of how best to handle the resources available. Some recent research in the area of distance learning suggests that such systems could recommend to instructors how to manage distance learning courses more effectively (Kosba 2003). We see the whole question of handling resources as one of optimization of time and learning quality: how quickly can we move the student through a course of study, with the highest possible quality of learning as the outcome. We see WITNeSS having a Tutor agent that decides, using fuzzy logic, what to teach, and when and how. All the while there would be an Optimizer agent that uses genetic algorithms (GAs) to adjust the fuzzy rule set, coming up with a more effective way for the Tutor to make its decisions, one that results in quicker, higher-quality learning. For the system to be able to adapt to the student in this way, the system has to "know" the student. The work of (O'Riordan and Griffith 2003), who combine information retrieval methods and collaborative filtering techniques to generate personalized recommendations, both on course content and on peer-peer groups, was most enlightening.
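As a concrete illustration of the Optimizer agent's role, the toy GA below evolves a vector of fuzzy-rule weights against a fitness function supplied by the virtual-student model. Everything here is an illustrative assumption, not WITNeSS code: `evaluate` stands in for a simulated teaching run that scores learning speed and quality.

```python
import random

def evolve_tutor_rules(evaluate, n_rules=10, pop=20, gens=50):
    """Evolve fuzzy-rule weights in [0, 1].

    evaluate(rule_weights) -> fitness, assumed to run the Tutor with those
    weights against the virtual-student model and reward fast, high-quality
    learning.
    """
    population = [[random.random() for _ in range(n_rules)] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(population, key=evaluate, reverse=True)
        parents = scored[: pop // 2]                 # truncation selection
        children = []
        for _ in range(pop - len(parents)):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_rules)
            child = a[:cut] + b[cut:]                # one-point crossover
            i = random.randrange(n_rules)            # Gaussian mutation, clamped
            child[i] = min(1.0, max(0.0, child[i] + random.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=evaluate)
```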
Ultimately we would wish WITNeSS also to have a sophisticated student model that could capture all the vital characteristics of the student using the system. In this paper, however, our aim is to develop a "virtual" student model that can imitate the learning and forgetting of a human student. We then use this model to test our ideas of the Optimiser agent working with the Tutor.
1.2 The Basic Structure of an ITS

An ITS can assess a student's mastery of a subject and match this against a knowledge database of learning content and a database of teaching strategies. Through the dynamic, creative collaboration between the program's components and the student, a variety of interactive tutorial experiences can be provided for the student (Self 1999). The main ITS components (McTaggart 2001) are shown in Fig. 1.
Fig. 1. The basic schematic for a typical ITS (Intelligent Tutoring System)
The expert model – using expert systems and semantic networks, the expert's knowledge is captured and organised into a database of declarative and procedural knowledge (Orey 1993).
The student model – guides students through the system's knowledge base. It stores data about the student's learnt knowledge and behavior as a result of interacting with the ITS, providing the instructional model with information that enables it to make "what"/"how" decisions. This is the "virtual" student model that we have developed.
The instruction model (Buiu 1999) – makes "what"/"how" decisions on presenting learning activities to the student. It holds knowledge about tutoring tactics, based on an ever-adjusting student profile. This is the Tutor in WITNeSS.
The interface model – this human-computer interface is the student's window to the "mind" of the ITS.
The optimizer model – this is the agent we are developing to help the instructional model change the way it adapts to the student.
All ITS components interact to provide the student with a face-to-face encounter with the knowledge of a subject domain. This interaction results in the student being able to assimilate new knowledge into his or her current mental schemata.
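The components just listed can be pictured as cooperating objects. The sketch below is a schematic assumption, not an existing framework; class and method names (ExpertModel, next_item, present) are invented for illustration.

```python
# Schematic sketch of the classical ITS components interacting.
class ExpertModel:
    def __init__(self, facts):
        self.facts = facts                      # domain knowledge base

class StudentModel:
    def __init__(self):
        self.mastered = set()                   # what the student has learnt
    def update(self, item, correct):
        if correct:
            self.mastered.add(item)

class InstructionModel:
    def next_item(self, expert, student):
        """'What' decision: first domain item the student has not mastered."""
        for item in expert.facts:
            if item not in student.mastered:
                return item
        return None

class Interface:
    def present(self, item):
        print(f"Tutoring item: {item}")         # the student's window to the ITS

expert, student = ExpertModel(["fractions", "decimals"]), StudentModel()
tutor, ui = InstructionModel(), Interface()
item = tutor.next_item(expert, student)
if item:
    ui.present(item)
    student.update(item, correct=True)          # stub for a real assessment
```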
2 Main ITS Features of WITNeSS

One of the features of recent ITSs is a Bayesian-network student model, of which there are three approaches: expert-centric, efficiency-centric and data-centric (Mayo and Mitrovic 2001). The expert-centric approach is similar to expert systems: the complete structure and conditional probabilities are determined, directly or indirectly, by the domain expert, and every effort is made to "fit" the student model to the problem domain. The efficiency-centric approach is the exact opposite: the student model is restricted or limited in some way and the problem domain is made to "fit" the resulting student model. The weaknesses of both of these methods are described in depth in (Mayo et al. 2000), where the data-centric approach is preferred. The CAPIT model is created from observational data, so the model's predictive ability can easily be measured by testing the NN (neural network) on data that was not used to train it; because only observable data is used, the student models are smaller.

Intelligent technologies, either standalone or hybridised, remove the shortcomings of classical ITSs. Fuzzy systems remove the tractability/complexity problem, and they also offer the possibility of connecting an ITS with mental schemata through natural language. Integrated NN-based learning environments provide ITS agents that instruct and learn along with a student and his/her fellow (agent) pupils, helping to include even the social context of learning in the system. Bayesian networks are used to reason in a principled manner about multiple pieces of evidence.

WITNeSS (Wellington Institute of Technology Neural Expert Student Support) was conceived as an intelligent hybrid system, a fuzzy-neural soft computing system for optimising the presentation of learning material to a student. See Fig. 2 below for a description of WITNeSS's component blocks and their functions.
3 Experimental Results

3.1 The Reason for the "Virtual" Student Model

The idea is to create a "virtual" student model – a representation of the long- and short-term learning characteristics of a human student using the system. This model simulates the learning and forgetting that occur when a human learner attempts to acquire knowledge. The system could then use this model to try out, in the background, various teaching strategies in an effort to best decide "what" to teach next and "how" to present it to the human student. The "best what and how" at each moment of learning matters because it represents the most efficient optimization of the system for that particular student.
Fig. 2. The functional components of the WITNeSS System
It would enable the student to learn the knowledge content in the smallest possible time, with the highest possible quality of learning.
3.2 The Aim of the Experiment

The aim of the experiment was to test our ability to create a student model whose behaviour, in both learning and forgetting, depends on how we initialise it with the factors learning capability, general interest in learning and interest in mathematics. The experiment was to be judged a success if three different classifications of students (above average, average and below average) produced clear, distinct learning curves when repeatedly asked to simulate the learning of the same task, and if these curves accurately estimated the learning of similar human learners. We hoped to achieve something like the three learning curves shown in Fig. 4. The graph shows the hypothetical speeds of learning that, on average, one would expect from above average, average and below average students. At the same time, the quality of learning must be of the highest calibre.
Fig. 3a. How WITNeSS is set up to work with a student. Note: the numbers in brackets on the flow arrows refer to the description box of the same number
3.3 The Methodology

Sixty "virtual" student models were created and tested – twenty of each classification: above average, average and below average.
Fig. 3b. WITNeSS working. Note: the numbers in brackets on the flow arrows refer to the description box of the same number
It was the responsibility of the "experimenter" agent to run the experiment. This agent created the sixty "virtual" student models and tested them (see Fig. 5). Once the "experimenter" had created a "virtual" student model, it tested it in the following way (see Fig. 6).
Fig. 4. The results of the experiment
Fig. 5. Creating “Virtual” Student Models
Once the "experimenter" had finished, sixty report files had been generated, one on the learning performance of each "virtual" student model that the "experimenter" created. Each file contains the performance details that can be used to create a learning curve for that "virtual" student model. The ITS now has performance information on 20 above average, 20 average and 20 below average "virtual" student models. Using each set of 20 curves, we calculated an average curve for each of the three classifications. The graph in Fig. 7 shows the three learning curves representing the result of the experiment. This experiment was then repeated five times for validation.
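The experimental procedure described above can be summarised in a few lines of Python. This is a hedged sketch only: the learning/forgetting update is a crude stand-in for the fuzzy model, and the initialisation values simply echo the 8.0/5.0/2.0 pattern of the three classifications.

```python
import random

INIT = {"above average": 8.0, "average": 5.0, "below average": 2.0}

def run_student(long_term_learn, sessions=30):
    """Trace learnProbability over sessions for one virtual student model."""
    p, trace = 0.0, []
    for _ in range(sessions):
        p = min(1.0, p + 0.03 * long_term_learn * random.uniform(0.5, 1.0))  # learn
        for _ in range(random.randint(0, 3)):                                 # forget
            p = max(0.0, p - 0.03 * random.uniform(0, 1) / long_term_learn)
        trace.append(p)
    return trace

curves = {}
for label, ltl in INIT.items():
    traces = [run_student(ltl) for _ in range(20)]                 # 20 models per class
    curves[label] = [sum(col) / len(col) for col in zip(*traces)]  # average curve
```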
3.4 The Results

The graph in Fig. 7 shows the results of the experiment for run number 1.
Fig. 6. The testing of the "Virtual" Student Model
It must be stated from the beginning that the "virtual" student model is very rudimentary and is not, at the moment, based on any complex educational theory. The idea was to keep this model very simple and only have it be a rough but suitable approximation of the learning and forgetting that would take place in real life.

The "virtual" student models tested in this experiment are designed to be a crucial part of a complete intelligent learning system, in which the "virtual" student model will be used to approximate the learning behaviour of a real student using the system. The student model would be part of a component called the "optimiser", which is always, in the background, trying out different teaching strategies in order to come up with the best one. No matter where the human student is in their course of study, the "optimiser" would determine the best sequence of learning to take the student from where they are currently to the end of the course, in the shortest possible time and with the highest possible quality of learning. The system would always try to modify itself to best suit the student's learning needs. The idea was to keep the "virtual" student model simple so that other aspects of the system could be worked out. Only then would the "virtual" student model be modified to reflect the latest learning and teaching theory.
Fig. 7. Average learning curve for each classification of the “Virtual” Student Model
When the "virtual" student model is created, it is initialised with the variables learning capability, general interest in learning and interest in mathematics. These represent the relatively long-term learning influences on the student. Using these values as linguistic input variables to a fuzzy rule structure, a factor called longTermLearn is calculated.

After the first request by the "experimenter" to the "virtual" student to learn, the "experimenter" always follows with a random number of requests to the "virtual" student to forget. This simulates the situation in which learning activity on a topic does not necessarily happen consecutively, so that there are periods in which forgetting can occur. On most occasions 1, 2 or 3 forget requests are simulated. An example of a student's performance file will illustrate this (see Fig. 8). A random number called forget is generated, representing all factors that could influence forgetting at a particular moment in time. As stated before, no effort has yet been made to base this on any particular learning model; this will come later. It was decided that the final forgetRate used to simulate forgetting within the student would depend on the value of longTermLearn – the long-term learning characteristics of the student. For example, the above average student is assumed to forget less than the below average student.

Arrow 1. This row displays the designation of this model. It belongs to Run #1 and is a model of classification "above average".
Fig. 8. Example of file produced showing the performance of the "Virtual" Student model

Arrow 2. These three lines display how the "virtual" student model was initialised. The inputs were learningCapability, generalInterest and interestMaths, with universe-of-discourse values of 8.0, 8.0 and 8.0 respectively.
Arrow 3. In this experiment, 888 represented an "above average" student, 555 an "average" student and 222 a "below average" student.
Arrow 4. SESSION 00 represents the first request for the "virtual" student to learn. In this case the learnProbability went from 0.000 to 0.199.
Arrow 5. learnProbability represents the probability that the student will get the problem right when it is next presented. The "virtual" student always starts with a learnProbability of 0.000 and, through a process of learning and forgetting, progresses through a number of days until the task is learnt, i.e. learnProbability = 1.000.
Arrow 6. We can see an example of random forgetting in SESSION 03, when the "virtual" student was asked to simulate forgetting, which caused learnProbability to drop from 0.357 to 0.332 – a 0.025 forget.
Arrow 7. At other times two forget requests are randomly called, as in SESSIONs 09 and 10.

A combination of longTermLearn and the random forget value was used as linguistic variables in a fuzzy rule structure, and the final forgetRate was defuzzified out. This forgetRate was then used to arrive at the forget activity shown in the above table.

The first step in calculating the improvement in learning that results when the "virtual" student is requested to learn by the "experimenter" is to calculate a variable called level number. The result is a value from 1 to 5; the lower the number, the greater the improvement will be.
Level number is calculated by a fuzzy combination of the linguistic variables longTermLearn and concentration. The value of longTermLearn was calculated when the "virtual" student model was first created by the "experimenter". Concentration was calculated using fuzzy logic with the linguistic variables longTermLearn, motivation and intensity. Once again it is assumed that the better the student, the better he or she will be able to concentrate. Motivation is a variable that represents all those factors that would influence a student's ability to concentrate. Intensity is based on how intense the activity is, and was calculated using fuzzy logic with the linguistic variables intensity of practice and intensity of problem; these are determined randomly and represent, respectively, how long the learning activity was and how hard the problems were. The level number that emerges from these fuzzy structures is used to determine the amount of improvement that has occurred.

Future work will be to use the "virtual" student model inside an agent of the intelligent learning system called the "optimiser". Also, once the basic idea of the intelligent learning system has been proved, work will be done on the "virtual" student model to reflect current learning and teaching theory.
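As a rough illustration of the fuzzy chain described above, the toy sketch below combines practice and problem intensity into intensity, intensity with longTermLearn and motivation into concentration, and longTermLearn with concentration into a defuzzified level number from 1 to 5. The membership functions, aggregation operators and rule weights are invented stand-ins, not the rule base used in WITNeSS.

```python
def tri(x, a, b, c):
    """Triangular membership on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_level(long_term_learn, motivation, practice, problem):
    # All inputs on a 0-10 universe of discourse, as in the student files.
    intensity = (practice + problem) / 2            # crude aggregation stand-in
    concentration = (long_term_learn + motivation + intensity) / 3
    # Fire 'low/medium/high' sets for the two level-number antecedents.
    low  = max(tri(long_term_learn, -1, 0, 5),  tri(concentration, -1, 0, 5))
    med  = max(tri(long_term_learn, 2, 5, 8),   tri(concentration, 2, 5, 8))
    high = max(tri(long_term_learn, 5, 10, 11), tri(concentration, 5, 10, 11))
    # Defuzzify to a level 1..5 (low level number -> greater improvement).
    score = (high * 1 + med * 3 + low * 5) / max(high + med + low, 1e-9)
    return round(score)

print(fuzzy_level(8.0, 7.0, 6.0, 5.0))   # an "above average" student -> low level
```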
4 Concluding Remarks

An experiment was conducted to test the ability of WITNeSS to produce student models of different ability – for example, above average, average and below average. The experiment was replicated five times, and each time the averages of the 20 "above average", "average" and "below average" student models showed distinctly different learning curves. We will now be able to take this "virtual" student model and use it to test the Tutor and Optimiser components of the system. The interesting question will be how the system adapts its response to the student model. Will it develop a different strategy to deal more efficiently with the student model? The concept of the "virtual" student will be a key concept for ITS testing and its further development.
References 1. Buiu C (1999) Artificial intelligence in education – the state of the art and perspectives. In: ZIFF Papiere 111, Fern University, Institute for Research into Distance Education, Germany, (ERIC Document Reproduction Service No. ED 434903).
2. Kosba E, et al. (2003) Using Fuzzy Techniques to Model Students in Web-based Learning Environments. In: Knowledge-Based Intelligent Information and Engineering Systems, Springer-Verlag, Berlin Heidelberg New York, Part II, pp 222-228
3. McTaggart J (2001) Intelligent Tutoring Systems and Education for the Future. In: 512X Literature Review, April 30 2001, p 2
4. Mayo M, Mitrovic A (2001) Optimising ITS Behavior with Bayesian Networks and Decision Theory. In: International Journal of Artificial Intelligence in Education, 12, pp 124-153
5. Mayo M, et al. (2000) CAPIT: An Intelligent Tutoring System for Capitalisation and Punctuation. In: Kinshuk, Jesshope C, Okamoto T (eds) Advanced Learning Technology: Design and Development Issues, IEEE Computer Society, Los Alamitos, CA, ISBN 0-7695-0653, pp 151-154
6. Negoita Gh M, Pritchard D (2003a) Testing Intelligent Tutoring Systems by Virtual Students. In: Proceedings of the International Conference on Machine Learning and Applications (ICMLA '03), Los Angeles, USA, pp 98-104
7. Negoita Gh M, Pritchard D (2003b) Some Test Problems Regarding Intelligent Tutoring Systems. In: Palade V, Howlett J R, Jain L (eds) Knowledge-Based Intelligent Information and Engineering Systems, Springer-Verlag, Berlin Heidelberg New York, Part II, pp 986-992
8. Orey M A, Nelson W A (1993) Development Principles for Intelligent Tutoring Systems: Integrating Cognitive Theory into the Development of Computer-based Instruction. Journal of Educational Technology Research and Development, vol 41, no 1, pp 59-72
9. O'Riordan C, Griffith J (2003) Providing Personalised Recommendations in a Web-Based Education System. In: Palade V, Howlett J R, Jain L (eds) Knowledge-Based Intelligent Information and Engineering Systems, Springer-Verlag, Berlin Heidelberg New York, Part II, pp 245-251
10. Self J (1999) The Defining Characteristics of Intelligent Tutoring Systems Research: ITSs Care, Precisely. In: International Journal of Artificial Intelligence in Education, 10, pp 350-364
11. Smyth B (2003) Intelligent Navigation on the Mobile Internet. In: Palade V, Howlett J R, Jain L (eds) Knowledge-Based Intelligent Information and Engineering Systems, Springer-Verlag, Berlin Heidelberg New York, Part I, pp 17-19
Considering Different Learning Styles when Transferring Problem Solving Strategies from Expert to End Users

Narin Mayiwar and Anne Håkansson

Department of Information Science, Division of Computer Science, Uppsala University, Box 513, SE-751 20 Uppsala, Sweden
Tel: +46 18 471 10 73, Fax: +46 18 471 71 43
{Narin.Mayiwar, Anne.Hakansson}@dis.uu.se
Abstract. This paper discusses the manner in which a knowledge-based system can support different learning styles. There has been a long tradition of constructing knowledge-based systems as learning environments to facilitate understanding and tutor subjects. These systems transfer domain knowledge and reasoning strategies to the end users by making the knowledge available. However, the systems are not usually adapted to the individual end user and his or her way of learning: the systems use only a small number of ways of teaching, while end users have many different ways of learning. With this in mind, knowledge-based systems need to be extended to support these different learning styles and facilitate the individual end user's learning. Our focus in this article is on knowledge transfer, the process that enables learning to occur. We suggest using visualization and simulation to support the transfer of problem solving strategies from a domain expert to end users.
1 Introduction

Human beings learn all the time, and what we have learned affects almost everything we do. Hence it is very important that teachers challenge learners' preconceptions and encourage reflection in order to develop or change them [25]; this also applies to computer systems designed to support learning. Learning is a process whereby knowledge is created and stored in memory through the transformation of experience, and it can be constructed by the learner in a knowledge-building process, piece by piece [11]. Letting the user perform a task or solve a problem is one way to achieve this.

Systems containing knowledge can be utilised when tutoring both children and adults (see e.g. [10], [9], [18]). There is a wide variety of these kinds of systems. One type is knowledge-based systems, a term that includes all types of systems built on some kind of domain knowledge, independent of implementation [15]. These systems simulate human reasoning and judging capabilities by accepting knowledge from external sources and accessing stored knowledge through
the reasoning process to solve problems [1]. Another type is expert systems, which are designed to represent and apply factual knowledge of specific areas of expertise to solve problems [15]. Unfortunately, it seems that most of these systems do not support their users' individual ways of learning. Not all users have the same preferred learning styles, and this makes the learning process complex. Therefore, one objective should be to accommodate a number of styles of learning when transferring problem-solving strategies.

The remainder of this paper is structured as follows. In Section 2 we introduce learning, learning styles, the theory of multiple intelligences and knowledge transfer. Section 3 gives an introduction to the knowledge in the system, and Section 4 presents different reasoning strategies. In Section 5 our suggestion for providing support to the end user is presented, and in Section 6 our conclusions are discussed.
2 Learning In order to transfer problem solving strategies from expert to end user in an effective way, we need to consider some cognitive and educational issues of importance. Thus in this section the key issues of learning, learning styles, multiple intelligences, knowledge and knowledge transfer will be introduced.
2.1 Learning and Learning Styles
Mazur [22], a professor of psychology, states that in studying learning, psychologists follow two main theoretical approaches: the behavioral and the cognitive approach to learning. Behavioral psychologists focus on the changes that take place in an individual's behavior. Kimble [19], e.g., defined learning as "an experience, which produces a relative change in behavior, or potential behavior". Cognitive psychologists, on the other hand, prefer to study the change in an individual's knowledge, emphasizing mental processes such as thinking, memory, and problem solving. The most essential issue when defining learning is knowledge, and there are different definitions of knowledge; e.g., constructivists view knowledge as something that a learner actively constructs in a knowledge-building process, piece by piece. According to this view, knowledge is stored in schemata comprising our mental constructs of ideas and concepts [11]. The means by which people learn are referred to as learning styles [6]. A learning style can be defined as the way we begin to concentrate on, process and retain unfamiliar, difficult information [7].
2.2 Theory of Multiple Intelligences
Researchers agree that individual differences in learning exist. The leading developmental psychologist Dr. Howard Gardner has developed a theory called "Multiple Intelligences" [12], [13], [14]. The power of his theory is its categorization of different human abilities, ranging from verbal intelligence to the intelligence
involved in understanding oneself. Moreover, human beings adopt learning styles according to their personal intelligences. Gardner's nine types of intelligence, together with their corresponding learning styles, are [14]:

Verbal-Linguistic Intelligence: an ability to use language and words. The learners think in words rather than in pictures; they learn best by using words and languages.
Logical-Mathematical Intelligence: an ability to analyze problems logically and carry out mathematical operations. The learners learn best by categorizing, classifying and working with abstract patterns/relationships.
Rhythmical-Musical Intelligence: an ability to think musically. The learners learn better with music in the background.
Visual-Spatial Intelligence: an ability to represent the spatial world internally in the mind. The learners like to draw, read maps, look at graphs, etc.
Bodily-Kinesthetic Intelligence: an ability to use the body in a skilled way. The learners learn through moving, doing and touching.
Interpersonal Intelligence: an ability to perceive and understand other individuals. The learners learn through interaction.
Intrapersonal Intelligence: an ability to reflect upon oneself and be aware of one's inner state of being. The learners work best alone.
Naturalist Intelligence: an ability to love the outdoors, animals and field trips. More than this, though, these learners love to pick up on subtle differences in meanings.
Existentialist Intelligence: an ability to learn in the context of where mankind stands in the "big picture" of existence. These learners ask: "Why are we here?"

The assumption is that users with different types of intelligence and learning styles may need different kinds of teaching strategies and material, which would influence the system architecture in a learning environment [10].
2.3 Knowledge Transfer
"Knowledge transfer can be referred to as learning" [24]. The term knowledge transfer, in relation to knowledge-based systems, denotes the activity of moving knowledge from the knowledge base, created by the expert and the knowledge engineer, to an environment where it is made available to the end user. The result should be interpreted by the end user so that he or she is able to understand and apply the knowledge. The interpretation can, for example, be some advice to the end user, which may increase the performance of the end user's work [16]. Knowledge transfer can thus be seen as the process that enables learning to occur. It is based on a proven Instructional System Design (ISD) methodology, whose effectiveness for adult learning has been discussed in many studies. Although authors may vary in defining the number of steps, the content of the process is always the same [24]:

1. Establish the job-specific knowledge requirements and evaluate the current level of employee understanding.
2. Create a navigation methodology that allows operating data and procedures to be delivered in the form of information necessary to achieve the required learning objective. 3. Document that the learning objective has been achieved (i.e., knowledge transfer has occurred).
Davenport and Prusak [5] illustrate the goal of knowledge transfer as:

Transfer = Transmission + Absorption (and use)

In order to achieve transfer, knowledge not only needs to be sent to a recipient, but must also be absorbed and put to use [2]. Andersson argues that if we design a representation of knowledge that can be understood, we improve the chances that the knowledge will be absorbed. If the representation of the knowledge can be used by the computer for solving a task, then we increase the chances that the knowledge will be used, and thereby learnt by the user of the system.
3 Knowledge in the System

When a knowledge-based system is developed, it is important to extract certain types of knowledge. The knowledge to be acquired from the domain expert can be put into four categories: procedural, declarative, semantic and episodic knowledge [21]. "Procedural knowledge" refers to the skills of performing a task or an action. This knowledge is automatic and reactive; it includes learnt psychomotor skills and knowledge of one's native language. "Declarative knowledge" refers to the information that an expert can verbalise. It is an expression of what the expert is aware of, constituting conscious heuristics or rules of thumb. "Semantic knowledge" refers to organised knowledge about words and symbols and their meanings. It also consists of rules, their referents and interrelationships, and of algorithms for manipulating symbols, concepts and relations. This knowledge includes one's ability to memorise vocabulary, concepts, facts, definitions and relationships among facts. "Episodic knowledge" is an autobiographical, experiential type of information, which will have been chunked or compiled episodically or temporally. This knowledge is often described in terms of perceptual characteristics, to the extent that it is not possible to recall the behaviour itself.

In a knowledge-based system, the procedural and declarative knowledge categories are commonly covered [8]. Procedural knowledge describes how a problem is solved and provides directions on how to do something. Declarative knowledge describes what is known about a problem and includes true/false statements and lists of statements that more fully describe objects or concepts. Other types of knowledge handled by the system are meta-knowledge, heuristics and structural knowledge [8]. "Meta-knowledge" describes knowledge about knowledge. It is used to select the knowledge best suited for solving a problem, and it can enhance the efficiency of problem solving since it can direct the reasoning. "Heuristics" describes the rules of thumb that guide the reasoning process. Heuristics is called shallow knowledge, and refers to the
knowledge compiled by the expert through previous problem-solving experience. It is compiled from the fundamental knowledge, also called deep knowledge. "Structural knowledge" describes knowledge structures, i.e., it provides a model of a problem. The model includes concepts, sub-concepts and objects.
4 Reasoning Strategies

Reasoning is the process of drawing conclusions by utilising facts, rules and problem-solving strategies [8], and it is the focus of this article. Commonly used reasoning strategies are deductive, abductive, inductive, analogical, common sense and non-monotonic reasoning.

"Deductive reasoning" deduces new information from logically related information. To draw conclusions, it uses facts (axioms) and rules (implications), and its basic form of inference is the modus ponens rule, which uses an IF (conditions) - THEN (conclusion) syntactic form. This reasoning is logically appealing and one of the most common problem-solving techniques used by human beings.

"Abductive reasoning" is a form of deduction that allows plausible inference, where "plausible" refers to conclusions drawn from available information that might nevertheless be wrong. From an implication and a fact, it can infer an explanation of that fact.

"Inductive reasoning" arrives at general conclusions from a limited set of facts by generalisation. The abstract general rules are hypotheses explaining a set of facts: from a limited number of cases, a general rule is generated which probably applies to all cases of that type.

"Analogical reasoning": human beings form a mental model of some concept through their experience, using analogical reasoning to understand some situation or object. They draw analogies to find the similarities and differences that guide their reasoning. A frame provides a natural way of capturing stereotypical information and can be used to represent typical features of some set of similar objects.

"Common sense reasoning": through experience, human beings learn to solve problems efficiently, using their common sense to derive a solution. This reasoning relies on good judgement rather than on exact logic; the type of knowledge involved is called heuristics (rules of thumb).

Most reasoning presupposes that the truth of facts is static during problem solving, i.e., the facts remain constant. Sometimes, however, the facts change, and already derived conclusions may have to be withdrawn since they no longer follow logically. Such reasoning is said to be "non-monotonic". If the system is a truth maintenance system, non-monotonic reasoning can be used.
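Deductive reasoning of the IF-THEN kind described above can be illustrated with a few lines of forward chaining: modus ponens is applied repeatedly until no new facts can be derived. The facts and rules here are invented examples, not from the paper.

```python
facts = {"has_fever", "has_cough"}
rules = [({"has_fever", "has_cough"}, "flu_suspected"),
         ({"flu_suspected"}, "recommend_rest")]

changed = True
while changed:                       # forward chaining to a fixed point
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)    # modus ponens: conditions hold -> conclude
            changed = True

print(facts)  # {'has_fever', 'has_cough', 'flu_suspected', 'recommend_rest'}
```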
5 Suggestion to Support the End Users

As mentioned before, when it comes to learning, people have different approaches [12]. Education needs to be individualized if it is to provide all students with the opportunity to learn in accordance with their abilities [10]. Several knowledge-based systems have been constructed as learning environments intended to facilitate understanding and to tutor subjects. These
systems transfer domain knowledge and reasoning strategies to the end users by making the domain knowledge available. However, the systems are usually not adapted to the individual end user and his or her way of learning. This can make the learning process more complex. We have therefore chosen to take a closer look at different kinds of knowledge and problem solving strategies, to see how they can support users with different intelligences and learning styles.

Verbal-Linguistic Intelligence: refers to the ability to use language and words. People who have this kind of intelligence learn best by using language: reading, writing and speaking. They easily understand the order and meaning of words in both speech and writing. We believe that this kind of intelligence can be supported by the declarative and semantic knowledge in the system, because declarative knowledge is described through language and words to present facts, rules and conclusions to the end user, and semantic knowledge can be described through words and their meanings. These users can also be supported by deductive reasoning, since in deductive reasoning facts (axioms) and rules (implications) presented in words are used when drawing conclusions. Moreover, declarative and semantic knowledge can be used the other way around: facts, rules and conclusions can be used to present knowledge in words and language, which should support users with this kind of intelligence.

Logical-Mathematical Intelligence: an ability to analyze problems logically and carry out mathematical operations. Those who possess this intelligence can use numbers and logic to find and understand various patterns: number patterns, visual patterns, color patterns, and so on. In a knowledge-based system, logical-mathematical intelligence can be supported by semantic knowledge, which consists of rules, their referents and interrelationships, and of algorithms for manipulating symbols, concepts and relations. These are presented in a logical terminology and use technology that resembles a human's logical-mathematical analysis in order to reach conclusions; in other words, the system reasons like a human being. Heuristics should also support this intelligence by presenting rules of thumb (methods or procedures that come from practice or experience, without any formal basis). Knowledge is presented in the form of rules, which are patterns, and can thereby support logical-mathematical intelligence. This intelligence can also be supported by deductive reasoning, where new information is deduced from logically related information. In the system, the objects can be divided into groups and all the relevant objects presented, giving an overview of the domain [10]. Furthermore, a compound evaluation of the object groups can be displayed as a kind of formula, and the problem-solving rules, i.e., the rules of thumb within the domain, may be presented. The intelligence should also be supported by other reasoning strategies, e.g., common sense reasoning, which is closer to our logical thought, while others, such as abductive reasoning, are more vague.
Visual-Spatial Intelligence: an ability to represent the spatial world internally in the mind. It is often stated that "a picture is worth a thousand words". People provided with this intelligence learn best by looking at shapes, maps, images and so on. One way of transferring problem solving strategies from a domain expert to an end user via a system is to visualize the reasoning strategies of the system; visualization makes the reasoning strategy in the knowledge-based system easier to follow. Episodic knowledge is often described in terms of perceptual characteristics, and it can support this intelligence if the knowledge is expressed as icons or metaphors, since the system then uses pictures with inherent meanings. Moreover, analogical reasoning may also support this intelligence, since human beings form a mental model of some concept through their experience by using analogical reasoning to understand some situation or object. The word analogy is defined as similarity in some respects between things that are otherwise dissimilar, a comparison that determines the degree of similarity, or an inference based on resemblance or correspondence. By presenting a similar problem during problem solving, the system can help end users draw analogies to find the similarities and differences that guide their reasoning.

Bodily-Kinesthetic Intelligence: an ability to use the body in a skilled way. People gifted with this intelligence learn best through physical movement and by using their body. We argue that this kind of intelligence can be supported by procedural knowledge, because procedural knowledge shows a skill, and also by analogical reasoning (by presenting a similar task during problem solving). Additionally, common sense reasoning can also support this kind of intelligence. Procedural knowledge can be presented as a step-by-step performance, which can be understood more easily by people with this intelligence; e.g., when performing a new task it is easier to see a simulation of the task step by step. From this point of view, the pedagogical idea behind visualization and simulation environments should be of importance when solving a problem.

Interpersonal Intelligence: an ability to perceive and understand other individuals. People with this kind of intelligence enjoy communicating with other people and learn best through interaction with others, e.g. when working in a team. According to Preece et al. [23], much learning that goes on in the real world occurs through interacting and collaborating with others. Depending on the characteristics of the problem, different knowledge and strategies should support interpersonal intelligence. The system can provide support for interpersonal users through simulations and by using several different student models. Chen et al. [4] define a student model as a tuple SM=(SB,SH,SK), where SB=(student background), SH=(student history), and SK=(student knowledge). Through interaction with the different student models, the system can support communication with sets of knowledge.
For instance, learning may be facilitated by interacting with the different student models in the system, asking the system for suggestions, and comparing one's own solution with others'. Experience from, e.g., using the knowledge-based system Analyse More showed that co-operation within and between groups of students took place in the classroom [9]. Collaborative discussion when working with the system can encourage a dialogue beneficial to effective learning, according to constructivists [20].

Intrapersonal Intelligence: an ability to reflect upon oneself and be aware of one's inner state of being. Those who have this kind of intelligence have a good understanding of themselves and learn best by working alone. People with this intelligence can be supported by different kinds of knowledge and reasoning strategies. Pieces of knowledge can be presented in states depending on the situation, i.e., the situation determines in which order these states should be presented. This can support intrapersonal intelligence, since such people learn step by step in order to achieve a stated goal. This kind of intelligence can also be supported by giving the user the possibility of using the knowledge already stored in the system.

At the moment, we cannot find any knowledge or reasoning strategies to support rhythmical-musical, naturalist or existentialist intelligence.
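Chen et al.'s tuple SM=(SB,SH,SK), cited above, might be rendered as a simple data structure. The sketch below is an assumption about one possible encoding, with illustrative field contents.

```python
from dataclasses import dataclass, field

@dataclass
class StudentModel:
    background: dict = field(default_factory=dict)   # SB: e.g. preferred style
    history: list = field(default_factory=list)      # SH: interaction log
    knowledge: set = field(default_factory=set)      # SK: mastered concepts

sm = StudentModel(background={"learning_style": "visual-spatial"})
sm.history.append("viewed diagram: inference chain")
sm.knowledge.add("deductive reasoning")
```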
6 Concluding Remarks and Further Work

Theories of human problem solving and learning are of great importance for improving teaching. Each advance in the understanding of problem solving and learning processes provides new insights into the ways in which a learner must be supported. A knowledge-based system simulates human reasoning and judging capabilities by accepting knowledge from an external source and accessing stored knowledge through a reasoning process to solve problems. But the question is what kind of knowledge and reasoning strategies to manage, and for whom? By answering this question, we have tried to highlight the strengths and weaknesses of knowledge-based systems with respect to supporting different intelligences and learning styles. In order to realize transfer, knowledge needs not only to be sent to a recipient but also to be absorbed and put to use [2]. Thus, if the knowledge and problem solving strategies extracted from the knowledge base can satisfy the users' different learning styles, then the knowledge can be absorbed. This is desirable in an educational system directed towards deep learning.

We believe that by providing different users with interfaces adapted to different intelligences and learning styles, users can understand the knowledge and problem solving strategies better, and thereby learn more. One feature of an intelligent interface technology is that a representation of the user (a student model) is included. The representation describes facets of user behaviour, knowledge and aptitudes, and may have a greater or lesser degree of formality [3]. In sum, such a model can be used to
improve the transfer of knowledge and problem solving strategies in a knowledge-based system towards supporting different intelligences and learning styles. To conclude, the key to effective use of knowledge-based systems in solving problems, and in all other kinds of learning situations, is a better understanding of human abilities and of the role of technology in education.
References

1. Anderson, R.G.: Information & Knowledge Based Systems. An Introduction. Prentice Hall International, Great Britain (1992)
2. Andersson, K.: Knowledge Technology Applications for Knowledge Management. PhD thesis, Computer Science, Uppsala University, Uppsala, Sweden (2000)
3. Benyon, D.R., Murray, D.M.: Special issue on intelligent interface technology: editors' introduction. Interacting with Computers 12 (2000)
4. Chen, J., Okamoto, T., Belkada, S.: Interactive Tutoring on Communication Gaps in a Communicative Language Learning Environment. In: Proceedings of the International Conference on Computers in Education (ICCE'02), IEEE (2002)
5. Davenport, T.H., Prusak, L.: Working Knowledge: How Organizations Manage What They Know. Harvard Business School Press, Boston, Mass. (1998)
6. Davidson, G.V.: Matching learning styles with teaching styles: Is it a useful concept in instruction? Performance & Instruction, 29(4) (1990)
7. Dunn, R.: Understanding the Dunn and Dunn learning styles model and the need for individual diagnosis and prescription. Journal of Reading, Writing and Learning Disabilities, 6. New York: American Library (1990)
8. Durkin, J.: Expert System Design and Development. Prentice Hall International Editions. Macmillan Publishing Company, New Jersey (1994)
9. Edman, A.: Combining Knowledge Systems and Hypermedia for User Co-operation and Learning. PhD thesis, Computer Science, Uppsala University, Uppsala, Sweden (2001)
10. Edman, A., Mayiwar, N.: A Knowledge-Based Hypermedia Architecture Supporting Different Intelligences and Learning Styles. In: Proceedings of the Eleventh PEG2003 Conference: Powerful ICT for Teaching and Learning, St. Petersburg, Russia (2003)
11. Elmeroth, E.: Hypermedia som verktyg för lärande (Hypermedia as a tool for learning). Rapport D, Department of Pedagogy and Methodology, Kalmar University (1999)
12. Gardner, H.: Frames of Mind. New York: Basic Books (1983)
13. Gardner, H.: The Unschooled Mind: How Children Think and How Schools Should Teach. New York: Basic Books (1991)
14. Gardner, H.: Intelligence Reframed: Multiple Intelligences. New York: Basic Books (1999)
15. Hayes-Roth, F., Waterman, D., Lenat, D.: Building Expert Systems. Addison-Wesley (1983)
16. Håkansson, A.: Graphic Representation and Visualisation as Modelling Support for the Knowledge Acquisition Process. Uppsala. ISBN 91-506-1727-3 (2003)
17. Håkansson, A.: Visual Conceptualisation for Knowledge Acquisition in Knowledge-based Systems. In: Coenen, F. (ed.): Expert Update, (SGAI) Specialist Group on Artificial Intelligence, ISSN 1465-4091 (2003)
18. Håkansson, A., Öijer, C.: A Development of an Expert System for Applied Environmental Impact Assessment (Utveckling av expertsystem för miljökonsekvensanalys). Bachelor Thesis, Computing Science Department, Uppsala University, Sweden (1993)
19. Kimble, G.: Hilgard and Marquis' Conditioning and Learning. 2nd edition. New York: Appleton (1961)
20. Lim, C.P.: The dialogic dimensions of using a hypermedia learning package. Computers & Education 36 (2001)
21. McGraw, K.L., Harbison-Briggs, K.: Knowledge Acquisition: Principles and Guidelines. Prentice-Hall International, Inc. (1989)
22. Mazur, J.E.: http://encarta.msn.com/encyclopedia_761556088_5/Learning.html, Online Encyclopedia (2004)
23. Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S., Carey, T.: Human-Computer Interaction. Addison-Wesley (1994)
24. Resource Development Corporation: Compliance through knowledge transfer: the case for the active learner (1996), http://www.resourcedev.com
25. Säljö, R.: Lärande i praktiken: ett sociokulturellt perspektiv (Learning in practice: a socio-cultural perspective). Stockholm: Prisma (2000)
ULMM: A Uniform Logic Modeling Method in Intelligent Tutoring Systems

Jinxin Si (1,2), Cungen Cao (1), Yuefei Sui (1), Xiaoli Yue (1,2), and Nengfu Xie (1,2)
(1) Knowledge Acquisition and Sharing Group, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
(2) The Graduate School of the Chinese Academy of Sciences
{jxsi,cgcao,yfsui,xlyue,nfxie}@ict.ac.cn
Abstract. Researchers increasingly recognize that it is an important and emerging issue to describe, evaluate and measure intelligent tutoring mechanisms on a uniform theoretical foundation, in the form of a highly formalized and computable model, in order to explore ITSs and to advance effective and efficient cross-reference, fusion and integration among diverse intelligent tutoring models and systems. This paper proposes a novel uniform logic modeling method from an associative viewpoint and highlights concrete formal models of the three core elements in the architecture of an ITS: the knowledge model, the student model and the pedagogical strategy model.
1 Introduction

Intelligent tutoring systems are distinct from, and more individualized than, traditional computer-aided instruction systems, because tutoring processes receive more and more attention from researchers and developers in multiple domains, including artificial intelligence, cognitive science and pedagogical science. In recent years, a common agreement about the core elementary models of an ITS seems to have been reached in the ITS community: an expert knowledge model, a pedagogical knowledge model, a student or learner model, and a user interface model [8, 9, 19]. Furthermore, from the perspective of knowledge design and redesign, there are many implicit associative links among knowledge, users and strategies [18]. Recently, Reddy challenged AI researchers with an open problem named "Encyclopedia on Demand" [12]. It is becoming more and more evident that the knowledge model, the user model and the pedagogical strategy model should be developed as a whole in an ITS.

In this paper, we present ULMM as a uniform logic modeling method for ITS design and implementation. The remainder of the paper is organized as follows. In Section 2, we propose the idea of a uniform logic modeling method for an ITS and depict its 3-layer internal architecture. From Section 3 to Section 5, we present and discuss concrete descriptions of the three core elements of an ITS. Section 6 concludes the paper and raises a few future research problems.
2 What's ULMM?

ULMM is a uniform logic modeling method for ITSs through which ITS researchers and designers can precisely depict the features of domain knowledge and users, and teaching and learning strategies, in order to satisfy both teaching objectives from teachers and learning objectives from learners [16]. On the one hand, ULMM allows us to describe user states on three layers: the student meta-cognitive layer, the methodological layer, and the knowledge layer. On the other hand, ULMM provides a method for building teaching and learning strategies concerning state transition, reasoning, conflict detection and elimination, state validation and evaluation, etc. Ultimately, a mechanism of interaction and negotiation among teaching and learning strategies can be built and performed in an ITS at run time.

ULMM provides three kinds of modeling languages (ab. MLs) to represent an ITS, as defined below:

1. The substrate language: formalized as a language L_S, it is used to represent the concepts, and the relations between concepts, that are to be learned (or taught).
2. The object language: formalized as a language L_O, it is used to represent how a concept is learned (or taught), what actions should be taken during the learning process (or teaching process), and what strategy should be used to achieve a learning goal (or teaching goal).
3. The meta-language: formalized as a language L_M, it is used to represent and reason about the terms and sentences involved in the above two languages.

It is noticeable that the terms and formulas of the substrate language are taken as constants in the object language, while the predicates relating the two modalities are represented in the meta-language. Let L_S be the language representing the concepts and properties to be learned by learners (or taught by teachers); given a concept x, a property p of x is expressed as a formula p(x) of L_S. There is another language, L_O, representing pedagogic concepts and processes, which can take every formula of L_S as a term of its own. A strategy of interrogation is a formula of L_O; usually it has the form "if φ then α", where φ is a formula of L_S and α is a formula of L_O. For example, an interrogation strategy may state that if a concept ?x satisfies certain isa and has-attributes conditions in L_S, then the action test(?x) is taken, where isa and has-attributes belong to L_S and test belongs to L_O. We assume that a strategy works in a certain context c, which can be represented by a set of formulas in a language. Hence, a strategy is a rule of the form c: φ ⇒ s, which means that in a situation c, if φ is satisfied, the strategy s is triggered.
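The c: φ ⇒ s rule form can be prototyped directly. The sketch below is only an illustration of the layering, with invented predicates and data: substrate-language facts are plain tuples, a condition φ is a test over those facts, and a strategy fires an object-language action such as test(?x) when its context matches.

```python
# Substrate-language facts about concepts (illustrative).
substrate = {("isa", "triangle", "polygon"),
             ("has-attribute", "triangle", "three sides")}

def condition(facts):                        # phi, stated over substrate terms
    return ("isa", "triangle", "polygon") in facts

def test_action(concept):                    # an object-language action
    print(f"test({concept})")

# Each strategy is (context, phi, action), i.e. c: phi => s.
strategies = [("after-lecture", condition, lambda: test_action("triangle"))]

current_context = "after-lecture"
for ctx, phi, action in strategies:
    if ctx == current_context and phi(substrate):
        action()                             # strategy s is triggered
```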
3 Knowledge Logic Model

Generally, an ontology is an explicit specification of a conceptualization [4], and it is fundamental to the design of the knowledge logic model in an ITS. In the national
knowledge infrastructure (NKI) project of China, we have built more than 600 domain-specific ontologies and millions of domain assertions covering 19 different domains [e.g. 2]. Nevertheless, the knowledge base in the NKI was not developed directly for an ITS, and thus does not meet pedagogical needs. The knowledge logic model is the fine-grained knowledge base consisting of the concepts and relations to be taught and learned during a pedagogical process, and it is a formal conceptual foundation of ITSs for constructing adaptive and individualized plans of content and navigation.

Definition 1. A knowledge logic model is a 4-tuple (C, R_S, R_P, A), where

1. C is the set of concepts, and each element is of declarative type, algorithmic type, step type, theoretical type, instance type, exercise type, etc.
2. R_S is the set of semantic relations, and each relation is of is-a type, part-of type, or has-instance type.
3. R_P is the set of pedagogical relations, to be explained shortly.
4. A is the set of axioms over concepts and their relations.
During instructional design, identifying pedagogical relations among concepts is a vital task in producing a suitable instructional plan. In the NKI project, we proposed and adopted several pedagogical relations to model and regulate the "knowledge flow" for instructional purposes. Given a concept set C, we designed five predicates to describe pedagogical "anchors", as follows:

1. Prerequisite: mastering one concept is a prerequisite for mastering another concept.
2. Equivalence: mastering one concept can achieve the same goal as mastering another, because the extent and intent of the two concepts are equivalent.
3. Similarity: there are some similarities in a facet f between two concepts.
4. Conditional dependency: mastering a concept is dependent conditionally on mastering certain other concepts.
5. Difficulty ordering: the difficulty of one concept is less than that of another.
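A minimal sketch of how the 4-tuple and the prerequisite relation might drive content ordering follows; the concept names, the relation store and the teachable helper are illustrative assumptions, not the NKI implementation.

```python
# Knowledge logic model components (C, R_S, R_P, A), illustrative contents.
concepts = {"sets", "relations", "functions"}
semantic = {("is-a", "functions", "relations")}
pedagogical = {("prerequisite", "sets", "relations"),
               ("prerequisite", "relations", "functions")}

def teachable(concept, mastered):
    """A concept can be scheduled once all its prerequisites are mastered."""
    return all(pre in mastered
               for rel, pre, post in pedagogical
               if rel == "prerequisite" and post == concept)

mastered = {"sets"}
plan = [c for c in ["relations", "functions"] if teachable(c, mastered)]
print(plan)  # ['relations'] -- 'functions' still waits on 'relations'
```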
4 User Logic Model

The user logic model (ULM) helps an ITS determine the static characteristics and dynamic requirements of a given user in an efficient and effective way. John Self proposed that during a user modeling process there are an "ideal" student and a "real" student, such that the former holds no misconceptions and reasons and learns rationally, while the latter is naturally less considerate and more uncertain [13]. A student state can thus be depicted at any time as a pair of states, where one component indicates the practical student state from the student's perspective and the other indicates the planned student state from the tutor's perspective. We give an explicit and integrated description of a student's state using a
three-level structure that comprises a knowledge conceptual layer, a knowledge methodological layer and a meta-cognitive layer. Corresponding predicates are used to elaborate the functions of these three layers.
4.1 Knowledge Conceptual Layer

Given a knowledge concept k, there are three predicates derived from k in the knowledge conceptual layer, as follows:

1. known(k). The student knows and masters concept k after the instruction of k.
2. unknown(k). The student still does not know concept k after the instruction of k.
3. error(k). The student has cognitive problems with concept k after the instruction of k. Furthermore, error(k) is classified into two kinds: misconception(k) and missing(k).
Therefore, for a given student, the whole structure of knowledge concepts (written S_K) is defined as the set of these assertions over K, the set of all knowledge concepts for delivery.
4.2 Knowledge Methodological Layer

Given a method m, there are likewise three predicates derived from m in the knowledge methodological layer:

1. capable-use(m). The student is fully capable of applying m successfully in applicable scenarios.
2. incapable-use(m). The student is incapable of applying m in applicable scenarios, despite knowing the method m.
3. misuse(m). In a given applicable scenario, the student employs an inaccurate method, which leads to an undesired end-state.
Based on research results on human errors [6, 10, 11], we can classify the coarse-grained predicate misuse(m) into three finer-grained predicates: mistake(m), slip(m) and lapse(m). In [14], we gave an explicit classification of these predicates from the intentional perspective and depicted the origins of their causes, their prevention mechanisms and their performance occurrences. Accordingly, for a given student, the whole structure of knowledge methods (written S_M) is defined as the set of these assertions over M, the set of all knowledge methods.
4.3 Meta-Cognitive Layer

Given a cognitive ability c, there are three predicates derived from c in the meta-cognitive layer: good(c), average(c) and poor(c). Some
psychological experiments argue that the granularity of instructional actions taken for a learner is heavily dependent on the learner's meta-cognitive level [7, 15]. The taxonomy of cognitive levels was proposed by Bloom and his associates: knowledge, comprehension, application, analysis, synthesis and evaluation [1]. Based on their work, Wasson used three types of learning outcomes: fact, analysis, and synthesis [17]. We define all cognitive levels using the set C as follows:

C = {knowledge-ability, comprehension-ability, application-ability, analysis-ability, synthesis-ability, evaluation-ability}

Accordingly, for a given student, the whole structure of meta-cognitive levels is defined as the set of these assertions over C.
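The three layers and their predicate vocabularies might be stored as in the following sketch; the class layout is an assumption, while the predicate names follow the lists above.

```python
CONCEPT_PREDS = {"known", "unknown", "error"}
METHOD_PREDS = {"capable-use", "incapable-use", "misuse"}
COGNITIVE_PREDS = {"good", "average", "poor"}

class UserLogicModel:
    def __init__(self):
        self.concepts, self.methods, self.cognition = {}, {}, {}

    def assert_concept(self, k, pred):
        assert pred in CONCEPT_PREDS
        self.concepts[k] = pred          # e.g. known(k)

    def assert_method(self, m, pred):
        assert pred in METHOD_PREDS
        self.methods[m] = pred           # e.g. misuse(m)

    def assert_ability(self, c, pred):
        assert pred in COGNITIVE_PREDS
        self.cognition[c] = pred         # e.g. good(comprehension-ability)

ulm = UserLogicModel()
ulm.assert_concept("modus ponens", "known")
ulm.assert_method("proof by induction", "incapable-use")
ulm.assert_ability("comprehension-ability", "average")
```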
4.4 Comparative Operators Between Student States

We define some useful operators for comparing two student states, which can not only describe the same student from both the practical and the planned viewpoints, but also depict two different students from the same viewpoint [14].
5 Strategy Logic Model

The strategy logic model specifies an executable rule set through which an ITS can not only bring content and navigation adaptability into effect, but also decide and modify student states in the ULM. Theoretically, instructional strategies are sequences of interactive actions and events between tutor initiative and student initiative, and how to shift learner control from "tutor authority" to "tutee democracy" is becoming an important problem [5]. According to initiative type, tutoring strategies (ab. TS) can be divided further into two cases: teaching strategies (ab. TES) and learning strategies (ab. LES), such that the former emphasizes the implementation of teaching goals through individualized planning while the latter aims at achieving learning requirements through autonomous sequencing. Obviously, a pedagogical strategy is connected closely with pedagogical actions and goals, where the former reflect what to execute and the latter represent why to execute. Furthermore, pedagogical actions can be categorized into teaching actions (ab. TA) and learning actions (ab. LA). We give the formal schema of a tutoring strategy (TS), as illustrated below:
Based on the discussion above, we have developed more than 50 teaching and learning strategies, covering innovational interrogation strategies, cognitive-structure constructive strategies, rehearsal strategies, organizational strategies, elaborative strategies, etc. [3].
6 Conclusion

Based on a uniform logic modeling method from an associative viewpoint, this paper presents novel formal models of the three core elements in the architecture of an ITS. The main advantages of ULMM lie in that it can not only represent, in an objective manner, the intrinsic characteristics of and the compact relations among knowledge, student and strategy, but can also generate, regulate and monitor teaching plans and learning plans in an efficient and effective way so as to adapt to static features and dynamic expectations from both students and teachers. Moreover, the ULMM method can provide ITS designers and engineers with a unified and operable environment for acquiring and formalizing pedagogical strategies from versatile educational and cognitive knowledge resources. In our future work, a flexible interactive mechanism between teaching strategies and learning strategies will be further taken into account in ULMM. In addition, an explicit representation of increments and decrements between two states should be adopted in the action-state model so as to promote the conciseness and computability of strategy modalities.
Acknowledgements

This work is supported by the Natural Science Foundation (#60073017 and #60273019) and the Ministry of Science and Technology (#2001CCA03000 and #2002DEA30036).
References
1. Bloom, B. S., Englehart, M. D., Furst, E. J., Hill, W. H., Krathwohl, D. (1956). A Taxonomy of Educational Objectives: Handbook I, Cognitive Domain. New York: David McKay.
2. Cao, C., Feng, Q., Gao, Y., et al. (2002). Progress in the Development of National Knowledge Infrastructure. Journal of Computer Science and Technology, vol. 17, no. 5, pp. 523-534, May 2002.
3. Dick, W., Carey, L., Carey, J. (2001). The Systematic Design of Instruction, 5th Edition. New York: Harper Collins Publishers.
4. Gruber, T. R. (1993). A translation approach to portable ontology specification. Knowledge Acquisition, vol. 5, no. 2, pp. 199-220.
5. Kay, J. (2001). Learner control. User Modeling and User-Adapted Interaction, Tenth Anniversary Special Issue, 11(1-2), Kluwer, 111-127.
6. Laughery, K. R., Wogalter, M. S. (1997). Warnings and risk perception. In Salvendy (Ed.), Handbook of Human Factors and Ergonomics, Second Edition.
7. Lesgold, A. (1988). Towards a theory of curriculum for use in designing intelligent instructional systems. In: Mandl, H., Lesgold, A. (Eds.), Learning Issues for Intelligent Tutoring Systems. Berlin: Springer-Verlag.
8. Martens, A. (2003). Centralize the Tutoring Process in Intelligent Tutoring Systems. In: Proc. of the 5th International Conference on New Educational Environments, ICNEE'03, Lucerne, Switzerland, 26-28 May.
9. Murray, R. C., VanLehn, K. (2000). DT Tutor: A decision-theoretic, dynamic approach for optimal selection of tutorial actions. In: Proceedings of ITS 2000. Berlin: Springer-Verlag.
10. Rasmussen, J. (1983). Skills, rules, knowledge: Signals, signs, and symbols and other distinctions in human performance models. IEEE Transactions on Systems, Man, and Cybernetics, 13, 257-267.
11. Reason, J. (1990). Human Error. Cambridge: Cambridge University Press.
12. Reddy, R. (2003). Three open problems in AI. JACM 50(1): 83-86.
13. Self, J. A. (1994). Formal approaches to student modelling. In: Greer, J. E., McCalla, G. I. (Eds.), Student Modelling: The Key to Individualized Knowledge-Based Instruction. Berlin: Springer-Verlag.
14. Si, J., Yue, X., Cao, C., Sui, Y. (2004). PIModel: A Pragmatic ITS Model Based on Instructional Automata Theory. To appear in: Proceedings of the 17th International FLAIRS Conference, Miami Beach, Florida, May 2004. AAAI Press.
15. Siemer, J., Angelides, M. C. (1998). Towards an Intelligent Tutoring System Architecture that Supports Remedial Tutoring. Artificial Intelligence Review 12(6): 469-511.
16. Sui, Y., Si, J. (2004). A formative description method in intelligent tutoring systems. Technical report.
17. Wasson, B. (1990). Determining the Focus of Instruction: Content Planning for Intelligent Tutoring Systems. Ph.D. Thesis, Department of Computer Science, University of Saskatchewan, Saskatoon, Canada.
18. Yue, X., Cao, C. (2003). Knowledge Design. In: Proceedings of the International Workshop on Research Directions and Challenge Problems in Advanced Information Systems Engineering, Japan, Sept.
19. Zapata-Rivera, J., Greer, J. (2001). SMODEL server: student modelling in distributed multi-agent tutoring systems. In: Proceedings of the 10th International Conference on Artificial Intelligence in Education, pp. 446-455, San Antonio, Texas, May 19-23.
Mining Positive and Negative Fuzzy Association Rules*

Peng Yan¹, Guoqing Chen¹, Chris Cornelis², Martine De Cock², and Etienne Kerre²

¹ School of Economics and Management, Tsinghua University, Beijing 100084, China
{yanp, chengq}@em.tsinghua.edu.cn
² Fuzziness and Uncertainty Modelling Research Unit, Ghent University, Krijgslaan 281 (S9), B-9000 Gent, Belgium
{chris.cornelis, martine.decock, etienne.kerre}@UGent.be
http://fuzzy.UGent.be
Abstract. While traditional algorithms concern positive associations between binary or quantitative attributes of databases, this paper focuses on mining both positive and negative fuzzy association rules. We show how, by a deliberate choice of fuzzy logic connectives, significantly increased expressivity is available at little extra cost. In particular, rule quality measures for negative rules can be computed without additional scans of the database. Keywords: fuzzy association rules, positive and negative associations, quantitative attributes
1 Introduction and Motivation

Association rules [1], which provide a means of presenting dependency relations between attributes in databases, have become one of the most important fields in knowledge discovery. An association rule has the form $X \Rightarrow Y$, where X and Y are two separate sets of attributes (itemsets). An example of an association rule is {mobile, batteries} $\Rightarrow$ {phone card}, which means that a customer who buys a mobile and batteries is likely to buy a phone card as well. Since the attributes of real applications are not restricted to binary values but quantitative ones like age and income also exist, mining quantitative association rules is regarded as meaningful and important. A straightforward approach to this problem is to partition attribute domains into intervals and to transform the quantitative values into binary ones, in order to apply the classical mining
* This work was partly supported by the National Natural Science Foundation of China (79925001/70231010), the MOE Funds for Doctoral Programs (20020003095), the Bilateral Scientific and Technological Cooperation Between China and Flanders (174B0201), and the Fund for Scientific Research Flanders.
algorithm [9]. To avoid abrupt transitions between intervals, vagueness has been widely introduced into the model of quantitative association rule mining because of its flexibility w.r.t. knowledge representation (see e.g. [3-7]). Indeed, a quantitative rule like "If the customers are between the ages of [30, 60], then they tend to buy electronics at a price of [$1000, $5000]" may lead to the so-called "boundary problem" [7]; e.g. a customer aged 29 with a purchase of $4000 is not accounted for in the model. On the other hand, "Middle-aged customers tend to buy expensive electronics" is more flexible and would reflect this customer's buying behaviour. To deal with the sharp boundary problem, a number of fuzzy sets can be defined on the domain of each quantitative attribute, and the original dataset is transformed into an extended one with attribute values in the interval [0, 1]. On another count, classical algorithms merely concern positive association rules, that is, only those itemsets appearing frequently together will be discovered. However, a negative rule such as {¬high income} $\Rightarrow$ {¬expensive electronics} is also useful because it expresses that people who are not rich generally do not buy expensive electronics. Although this kind of knowledge has been noted by several authors [2, 5, 10], we believe that research on negative association rules has not received sufficient attention, for the following reason: association rule mining first emerged in the domain of supermarkets, whose databases always contain thousands of goods (attributes) but each customer buys only a few of them. In other words, most of the attribute values in a transaction are 0. If negative associations are also considered, a great deal of frequent negative patterns are generated, making algorithms unscalable and positive rules less noticed. In quantitative databases this problem is much less significant, because the fraction of attribute values equal to 0 is usually much smaller. In this paper, in Section 2 we introduce positive and negative quantitative association rules in the classical (crisp) case. We show that, for the computation of the traditional rule quality measures of support and confidence, as well as the more logic-inspired degree of implication, the use of negative association rules does not lead to additional database scans. Section 3 investigates the extension to a fuzzy framework, while Section 4 discusses important issues to be considered in a realistic application.
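To illustrate the fuzzy partitioning just described (a sketch of our own; the attribute, the three linguistic labels and the cut-off ages are arbitrary assumptions), a quantitative attribute such as age can be mapped to membership degrees in [0, 1], so that the 29-year-old customer from the example above still belongs to middle-aged to a positive degree:

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 below a, rising to 1 on [b, c], falling to 0 at d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# Hypothetical partition of the age domain into three fuzzy sets
age_sets = {
    "young":       lambda x: trapezoid(x, -1, 0, 25, 35),
    "middle-aged": lambda x: trapezoid(x, 25, 35, 55, 65),
    "old":         lambda x: trapezoid(x, 55, 65, 120, 121),
}
print({name: round(f(29), 2) for name, f in age_sets.items()})
# {'young': 0.6, 'middle-aged': 0.4, 'old': 0.0}
```

With such a partition the degrees of each value sum to 1, which is exactly the assumption used in Section 4 below to explain why negative patterns tend to dominate.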
2 Positive and Negative Association Rules

Let $D$ be a relational database of tuples (or transactions) with a set of binary attributes (or items) $I$; each transaction $t$ in D can be considered as a subset of I: $i \in t$ if $t[i] = 1$ and $i \notin t$ if $t[i] = 0$. An association rule is of the form $X \Rightarrow Y$, where X and Y are two disjoint non-empty subsets of I (itemsets). Support and confidence for rule $X \Rightarrow Y$ are defined as $supp(X \Rightarrow Y) = \frac{|X \cup Y|_D}{n}$ and $conf(X \Rightarrow Y) = \frac{|X \cup Y|_D}{|X|_D}$ respectively, where $n$ is the number of tuples in D, $|X|_D$ is the number of tuples in D that contain X and (hence) $|X \cup Y|_D$ is the number of tuples in
D that contain both X and Y. Also, we define the support of an itemset X as $supp(X) = \frac{|X|_D}{n}$; clearly $supp(X \Rightarrow Y) = supp(X \cup Y)$. A valid association rule is a rule with support and confidence greater than given thresholds [1]. When a database also contains a quantitative attribute Q, it is possible to "binarize" Q by partitioning its range into intervals and by replacing Q by new binary attributes $Q_1, \ldots, Q_p$ such that $t[Q_j] = 1$ when the value of $t$ for Q falls within the $j$-th interval, and 0 otherwise. We can then apply traditional mining algorithms to this transformed dataset [9]; these algorithms usually involve detecting all the frequent itemsets¹, and using them to construct valid association rules (e.g. the Apriori algorithm [8]). In [5, 6], the authors distinguish between positive, negative and irrelevant examples of an association rule $X \Rightarrow Y$. A transaction $t$ is called a positive example if $X \subseteq t$ and $Y \subseteq t$, a negative example if $X \subseteq t$ and $Y \not\subseteq t$, and an irrelevant example if $X \not\subseteq t$. It is clear that with this terminology, the support of $X \Rightarrow Y$ equals the relative fraction of database transactions that are positive examples to the rule. In [10], expressions of the form $X \Rightarrow \neg Y$ and $\neg X \Rightarrow Y$, where X and Y are itemsets, are introduced and called negative association rules. The understanding is that, e.g., each negative example of $X \Rightarrow Y$ is a positive example of $X \Rightarrow \neg Y$. However, this definition has an important drawback: a negative association rule {mobile} $\Rightarrow$ ¬{batteries, alarm clock} then means that customers who buy a mobile are unlikely to buy both batteries and alarm clocks. If a transaction contains mobile and batteries, but no alarm clock, it is then a positive example to the rule, because $X \subseteq t$ and $Y \not\subseteq t$. More generally, if $Y \subseteq Y'$, then the support of $X \Rightarrow \neg Y'$ is not less than that of $X \Rightarrow \neg Y$, which (informally) means that for two rules with the same antecedent, the negative rule with the longer consequent has larger support. This results in many more computations and uninteresting negative rules with long consequents. In real life, a more desirable kind of knowledge may be {mobile} $\Rightarrow$ {batteries, ¬alarm clock}, which means that customers buying mobiles are unlikely to buy alarm clocks but are likely to buy batteries. Therefore, we regard each item's complement as a new item in the database. That is, for a rule $X \Rightarrow Y$, X and Y are two disjoint itemsets of $I' = I \cup \{\neg i \mid i \in I\}$. As rule quality measures, we complement² support and confidence with a so-called degree of implication (see e.g. [3, 5]). The latter measure interprets the arrow sign in $X \Rightarrow Y$ as an implication relationship, and is defined as

$imp(X \Rightarrow Y) = \frac{n - |\{t \in D \mid X \subseteq t \text{ and } Y \not\subseteq t\}|}{n}$
¹ i.e., those meeting the support threshold.
² In [5] it was shown that under certain circumstances degree of implication may even replace confidence, but in principle the three measures can meaningfully co-exist. Degree of implication may be particularly relevant when considering incorporation of the mined association rules into a rule-based system (see e.g. [4]).
Clearly, this non-symmetrical measure computes the relative fraction of transactions that are not negative examples to the rule. A detailed investigation into this measure and its relationship to support and confidence was carried out in [5]. Because of the large size of the databases in real life applications, computations that require database scanning are by far the most time-consuming. It is therefore worthwhile to avoid them as much as possible. The following properties show that mining negative associations, as well as using $imp$, do not require additional database scans.

Proposition 1. No transaction simultaneously contains $i$ and $\neg i$.

During candidate frequent itemset generation, any itemset containing both an item and its complement can be pruned away immediately.

Proposition 2. $supp(X \Rightarrow \neg Y) = supp(X) - supp(X \cup Y)$.

Proposition 2 relates the support of a negative association rule to that of a corresponding positive rule. More generally, the following holds.

Proposition 3. Let $X = X^{+} \cup \{\neg i_1, \ldots, \neg i_k\}$, where $X^{+} \subseteq I$. Then $supp(X)$ equals $\sum_{S \subseteq \{i_1, \ldots, i_k\}} (-1)^{|S|} \, supp(X^{+} \cup S)$, where every support on the right-hand side involves positive items only.

Degree of implication can be derived from support, i.e. computing $imp(X \Rightarrow Y)$ does not lead to additional database scans.

Proposition 4. [3] $imp(X \Rightarrow Y) = 1 - supp(X) + supp(X \cup Y)$.

Finally, Proposition 5 gives us a hint about how to choose meaningful threshold values in the definition of a valid association rule.

Proposition 5. $supp(X \Rightarrow Y) \leq conf(X \Rightarrow Y) \leq imp(X \Rightarrow Y)$.
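As a concrete illustration of the scan-saving argument (our sketch, not code from the paper; all function names are ours), the supports of positive itemsets can be counted in a single pass over the database, after which the measures for negative rules follow from Propositions 2 and 4 by arithmetic alone:

```python
from itertools import combinations

def positive_supports(transactions, max_size=3):
    """Single scan over the database: support of every positive itemset up to max_size."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(t), size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {itemset: c / n for itemset, c in counts.items()}

def supp_negative(supports, X, Y):
    """Proposition 2: supp(X => not-Y) = supp(X) - supp(X u Y); no extra scan."""
    XY = tuple(sorted(set(X) | set(Y)))
    return supports.get(tuple(sorted(X)), 0.0) - supports.get(XY, 0.0)

def imp(supports, X, Y):
    """Proposition 4: imp(X => Y) = 1 - supp(X) + supp(X u Y); no extra scan."""
    XY = tuple(sorted(set(X) | set(Y)))
    return 1.0 - supports.get(tuple(sorted(X)), 0.0) + supports.get(XY, 0.0)

db = [{"mobile", "batteries"}, {"mobile", "batteries", "card"},
      {"mobile"}, {"batteries", "card"}]
supports = positive_supports(db)
print(supp_negative(supports, ["mobile"], ["card"]))  # 0.5: mobile but no card
print(imp(supports, ["mobile"], ["card"]))            # 0.5: not a negative example
```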
3 Positive and Negative Fuzzy Association Rules

In the framework of fuzzy association rules, transactions can be perceived as fuzzy sets in $I'$, so $t(i) \in [0, 1]$ for each item $i$; moreover, we assume $t(\neg i) = 1 - t(i)$. The idea is that a transaction can contain an item to a given extent. A standard approach to extend quality measures to fuzzy association rules is to replace set-theoretical operations by corresponding fuzzy set-theoretical operations. Specifically, we need extensions to the classical conjunction and implication. To this end, t-norms and implicators are used; some popular t-norms and implicators are listed in Table 1.

Table 1. Some popular t-norms and implicators: $T_M(x, y) = \min(x, y)$ (minimum), $T_P(x, y) = x \cdot y$ (product), $T_W(x, y) = \max(0, x + y - 1)$ (Łukasiewicz); $I_{KD}(x, y) = \max(1 - x, y)$ (Kleene-Dienes), $I_R(x, y) = 1 - x + x \cdot y$ (Reichenbach), $I_W(x, y) = \min(1, 1 - x + y)$ (Łukasiewicz).
Support. Given a t-norm T, the degree to which a transaction $t$ supports the itemset $X = \{x_1, \ldots, x_k\}$ is expressed by:

$D_t(X) = T(t(x_1), \ldots, t(x_k))$

Support is defined, by means of the cardinality of a fuzzy set, as:

$supp(X \Rightarrow Y) = \frac{1}{n} \sum_{t \in D} D_t(X \cup Y)$

Confidence.

$conf(X \Rightarrow Y) = \frac{\sum_{t \in D} D_t(X \cup Y)}{\sum_{t \in D} D_t(X)}$

Degree of Implication.

$imp(X \Rightarrow Y) = \frac{1}{n} \sum_{t \in D} I(D_t(X), D_t(Y))$

where I is an implicator. For a comparative study of the behaviour of various implicators w.r.t. fuzzy association rule mining we refer to [5]. Since ordinary sets are replaced by fuzzy sets, the properties mentioned in Section 2 need to be re-investigated. Proposition 1 does not generally remain valid, because $T(x, 1 - x) = 0$ does not hold for every t-norm (it does hold for $T_W$), which means that an item $i$ and its complement $\neg i$ can appear in an itemset simultaneously. To avoid meaningless rules like $\{i\} \Rightarrow \{\neg i\}$, we should explicitly include this restriction in the definition of a valid fuzzy association rule. For Proposition 2 to hold, $T(x, 1 - y) = x - T(x, y)$ should hold. As was discussed in [6], for $T_P$ the proposition is valid. It can be verified that Proposition 3 is also valid for $T_P$, hence the optimization strategy to reduce the number of candidate itemsets can still be used. As discussed in [3], Proposition 4 is maintained for some t-norm/implicator combinations, in particular for $(T_P, I_R)$ and $(T_M, I_W)$. Finally, Proposition 5 is valid as soon as Proposition 4 is valid.
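The fuzzy measures can be illustrated with a short sketch (ours; the helper names and the toy transactions are invented) using the product t-norm $T_P$ and the Reichenbach implicator $I_R$, a combination for which the identity of Proposition 4 can be checked numerically:

```python
def t_p(*xs):
    """Product t-norm T_P, extended n-ary by associativity."""
    result = 1.0
    for x in xs:
        result *= x
    return result

def i_r(x, y):
    """Reichenbach implicator: I_R(x, y) = 1 - x + x*y."""
    return 1.0 - x + x * y

def degree(t, itemset):
    """D_t(X): degree to which fuzzy transaction t supports itemset X."""
    return t_p(*(t[i] for i in itemset))

def supp(db, itemset):
    return sum(degree(t, itemset) for t in db) / len(db)

def imp(db, X, Y):
    return sum(i_r(degree(t, X), degree(t, Y)) for t in db) / len(db)

# Toy fuzzy transactions: membership degrees per item
db = [{"income": 0.9, "spend": 0.8},
      {"income": 0.2, "spend": 0.1},
      {"income": 0.6, "spend": 0.4}]
X, Y = ["income"], ["spend"]
print(imp(db, X, Y))                      # 0.76
print(1 - supp(db, X) + supp(db, X + Y))  # 0.76 as well: Proposition 4 holds
```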
4 Implementation and Discussion

To implement the fuzzy association rule mining procedure, we used a modified version of the Apriori algorithm. To guarantee that all simplifying properties from the previous section are valid, we chose the product t-norm $T_P$ and the Reichenbach implicator $I_R$. Note that these properties assure that the additional complexity caused by considering negative items and degree of implication can be kept within very reasonable bounds, and the algorithm is definitely much more economical than straightforwardly applying Apriori, treating negative items as new database attributes. It is also very much preferable to the approach for mining negative association rules from [10], which involves the costly generation of infrequent as well as frequent itemsets. Regarding the quality of the mined association rules, we observed that most of them are negative. This can be explained as follows: when for each transaction $t$ and each collection $Q_1, \ldots, Q_p$ of [0, 1]-valued positive attributes corresponding to a quantitative attribute Q, it holds that³

$\sum_{j=1}^{p} t(Q_j) = 1$

then at the same time

$\sum_{j=1}^{p} t(\neg Q_j) = p - 1.$

In other words, the overall support associated with positive items will be 1, while that associated with negative items will be $p - 1$, which accounts for the dominance of the latter. Since typically $p$ is between 3 and 5, the problem however manifests itself on a much smaller scale than in supermarket databases. To tackle it, we can e.g. use different support thresholds: $s_1$ for positive rules, and $s_2$ for rules that contain at least one negative item. However, this second threshold apparently should differ for every quantitative attribute, since it depends on the number of fuzzy sets used in the partition. A more robust, and only slightly more time-consuming, approach is to impose additional filtering conditions and interestingness measures to prune away the least valuable negative patterns.
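A minimal sketch of such a dual-threshold filter (entirely our illustration; the rule representation and threshold values are hypothetical):

```python
def keep_rule(rule, s_pos=0.10, s_neg=0.30, min_conf=0.60):
    """Dual-threshold filter: rules with a negative item must clear a stricter support bar."""
    has_negative = any(item.startswith("not ") for item in rule["X"] + rule["Y"])
    min_supp = s_neg if has_negative else s_pos
    return rule["supp"] >= min_supp and rule["conf"] >= min_conf

rules = [{"X": ["income"], "Y": ["spend"], "supp": 0.33, "conf": 0.70},
         {"X": ["not income"], "Y": ["not spend"], "supp": 0.25, "conf": 0.90}]
print([r for r in rules if keep_rule(r)])  # the negative rule is pruned (0.25 < 0.30)
```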
5 Conclusion

We introduced fuzzy negative association rules, and showed that their incorporation into mining algorithms does not cause additional database scans, making implementations efficient. Future work will focus on selecting adequate quality measures to dismiss uninteresting negative rules.
References
1. Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between Sets of Items in Large Databases. In: Proc. ACM-SIGMOD Int. Conf. on Management of Data (1993) 207-216
³ Note that this is a very natural assumption, since it means that each transaction makes the same overall contribution to the support measure. It is automatically fulfilled for classical quantitative association rules.
2. Brin, S., Motwani, R., Silverstein, C.: Beyond Market Baskets: Generalizing Association Rules to Correlations. In: Proc. ACM SIGMOD on Management of Data (1997) 265-276
3. Chen, G.Q., Yan, P., Kerre, E.E.: Computationally Efficient Mining for Fuzzy Implication-Based Association Rules in Quantitative Databases. In: International Journal of General Systems (to appear)
4. Cornelis, C.: Two-sidedness in the Representation and Processing of Imprecise Information (in Dutch), Ph.D. thesis
5. De Cock, M., Cornelis, C., Kerre, E.E.: Elicitation of Fuzzy Association Rules from Positive and Negative Examples. Submitted.
6. Dubois, D., Hüllermeier, E., Prade, H.: A note on Quality Measures for Fuzzy Association Rules. In: LNAI, Vol. 2715 (2003) 346-353
7. Gyenesei, A.: A Fuzzy Approach for Mining Quantitative Association Rules. TUCS technical report 336, University of Turku, Finland (2000)
8. Srikant, R., Agrawal, R.: Fast Algorithms for Mining Association Rules. In: Proc. VLDB Conference (1994) 487-499
9. Srikant, R., Agrawal, R.: Mining Quantitative Association Rules in Large Relational Tables. In: Proc. ACM-SIGMOD Int. Conf. on Management of Data (1996) 1-12
10. Wu, X., Zhang, C., Zhang, S.: Mining Both Positive and Negative Association Rules. In: Proc. 19th Int. Conf. on Machine Learning (2002) 658-665
An Adaptation Framework for Web Based Learning System

T.T. Goh¹ and Kinshuk²

¹ School of Information Management, Victoria University of Wellington, Wellington, New Zealand
[email protected]
² Department of Information Systems, Massey University, New Zealand
[email protected]
Abstract. There are many e-learning systems available nowadays, but most of them are geared towards access through desktop platforms. With the increasing use of mobile devices, it is apparent that learners will need to access these systems through a variety of devices such as PDAs, mobile phones or hybrid devices. Current design solutions do not cater for such multi-platform environments. This paper focuses on some of the issues from the perspective of mobility, multiple platforms and learner experience, and provides details of a mobile adaptation framework that is designed to circumvent these problems.
1 Introduction

The advances in wireless technologies and the increasing availability of high-bandwidth telecommunication networks such as 3G infrastructures in recent years have provided a fertile environment for the extension of traditional e-learning to mobile devices. With many web-based e-learning systems (KBS 2000, CBR 2002, BlackBoard 2002, SQL 2002) already in existence, one would think that mobile devices would be able to access these resources just as a desktop machine connected to a fixed network does. The fact is that these resources are created specifically for desktop scenarios, and accessing them through mobile devices could not only degrade the learning experience but also, in the worst case, deny access completely. Hence, there is a need to identify a framework that allows access to e-learning systems adaptively in a multiple platform environment.
2 Related Work - Content Adaptation

The work on content adaptation in typical web-based systems provides a good starting point for our adaptation framework (such as Bickmore & Schilit 1997, Bharadvaj et al. 1998, Smith et al. 1999, Fox et al. 1998, Buyukkokten et al. 2000 and Chen et al. 2001). We shall discuss some of the significant research attempts here.
According to Bickmore and Schilit (1997), one straightforward method for content adaptation is to re-author the original web content. Manual re-authoring can be done, but it is obviously not the most effective way. It also requires that the web pages be accessible for re-authoring, which sometimes poses practical constraints. However, the underlying principles and questions faced are identical regardless of the method used. What strategies are used to re-author the pages? What strategies are used to re-designate the navigation? What presentation styles are achievable? The underlying principle is to isolate and distinguish the web content objects, presentation objects, navigation objects and interactive objects of a desktop publication and re-map them into objects other devices are capable of handling. The re-authoring approach can be specific to one mobile device or can be tailored to multiple classes of devices. For multiple-device re-authoring, transformation style sheets (XSLT) and cascading style sheets (CSS) can also be used. An example of re-authoring is the Digestor system (Bickmore and Schilit 1997), which focuses on different display sizes rather than device capabilities or network conditions. However, the re-authoring techniques and heuristic guidelines suggested that content should be structurally organized. This finding is included in our adaptation framework.

Bharadvaj et al. (1998) used a transcoding technique by modifying the HTTP stream and changing its content dynamically without user intervention. Transcoding can be performed in both upstream and downstream directions. An implementation of this technique is MOWSER. MOWSER uses a proxy to perform transcoding. The HTTP stream is modified by the proxy to include the capabilities and preferences of the mobile users. The users' preferences and capabilities are stored on the server. Modification and update of preferences is done by a CGI form on a URL at a web site maintained by the proxy. The proxy then fetches the files with the most suitable format for the requesting client. This implementation assumes that different formats are available for content adaptation. Transcoding of images and videos is done using scaling, sub-sampling or sub-key-frame techniques. Transcoding of an HTML page is done by eliminating unsupported tags and allowing the users to select their preferences. This implementation, however, did not touch on the aspect of navigation and might not work well if adaptive navigation is required.

The AWCD framework (Chen et al., 2000) consists of user/client/network-discovery, decision engine, and content adaptation algorithm modules. The goal is to improve content accessibility and perceived quality of service for information access under changing network and viewer conditions. The approach intends to modify existing web content rather than providing an initial guideline for multiple-platform access environments. The session tracking approach of establishing a session ID is adopted in our framework. Instead of combining the user/client/network-discovery module, we separate this module into three modules. The separation makes a clearer distinction between the learner, the device and its capabilities, and the network environment. It should be noted that the AWCD framework did not consider the off-line scenario. Dynamic web pages with embedded scripts, active server pages and forms were not highlighted in the framework.
We need to consider these issues in our adaptation framework, especially in a learning environment where the mode of feedback is mainly through form actions.
3 The Adaptation Framework

The types of content adaptation discussed earlier are mostly multimedia-rich transformations and e-commerce focused. In contrast, e-learning systems that suit multiple platform environments, learner mobility and a satisfactory learning experience have yet to be researched extensively. In addition to failing to provide multiple platform environments, traditional e-learning systems are also very static in nature. They deliver identical content regardless of learner conditions such as need, environment, device capabilities and communication conditions, and they do not take mobile user characteristics into consideration. For example, according to Chen et al. (2002), a mobile learner is different from a desktop learner. From a mobility perspective, one of the unique characteristics of a mobile learner is the "urgency" of the learning need. That is to say, when a mobile learner engages in learning, he/she is likely to require the information urgently. Thus the adaptation framework must have the competency of packaging content suitable for such a condition rather than delivering content that might take a long time to download. Another example is the mobility of the learning setting. With increasing mobility, the learning environment could be anywhere, such as a hot spot, café, classroom, camping ground, or even a train or bus. The learning environment can be quiet or noisy. The adaptation framework must be able to take this into consideration. In this study, we identify the possible environmental dimensions and attributes that will influence the learning experience in a multi-platform environment. The adaptation framework identifies five core dimensions (Goh & Kinshuk 2002): the content dimension, user dimension, device dimension, connectivity dimension and coordination dimension.
3.1 Content Dimension

The content dimension represents the actual context and knowledge base of the application. It includes various sub-dimensions. The course module organization sub-dimension includes attributes such as parts, chapters and sections of the content. Another sub-dimension is the granularity level of the content, which indicates the level of difficulty of the content presented to the learner. The multimedia sub-dimension represents the multimedia representation of the content. This includes the use of text, audio, animation, video, 3-D video, and so on to represent the content to the learner. The pedagogy sub-dimension represents the teaching models and domain expertise that the system adopts. The adaptation framework must have the competency of organizing and selecting the appropriate content and delivering it according to the engaging situation.
3.2 User Dimension

The learning model sub-dimension of the user dimension includes attributes such as modules completed, weights and scores, time taken, date of last access and so on, depending on the algorithms used in determining the learner profile. The user preference sub-dimension contains attributes such as preferred difficulty level and learning style.
The environmental sub-dimension represents the actual location where the learner uses the system. Different environments, such as a café, a hot spot or a classroom situation, will have to be adapted to differently. The adaptation must also take into account the motivation sub-dimension, such as urgency of use. The adaptation framework must have the competency of organizing, extracting and utilizing this information to best suit the learner.
3.3 Device Dimension

The device dimension consists of the capabilities sub-dimension, which includes attributes such as the supported media types and their capabilities in presenting multimedia content, display capability, audio and video capability, multi-language capability, memory, bandwidth, cookies, operating platform, and so on. The adaptation framework must have the competency of identifying and utilizing some or all of these capabilities.
3.4 Connectivity Dimension

Under this dimension, there are three operating sub-dimensions. First, the user can operate in a real-time online mode. Another sub-dimension is the pre-fetching capability of the application. Here device capability, network reliability and connection type are the main considerations for adaptation. The third sub-dimension is the off-line synchronization sub-dimension. Here the attributes of depth and encrypted cookies need to be considered in order to provide seamless adaptation, especially for a web based learning assessment application where parameters regarding users' actions need to be returned to the server. The adaptation framework must have the competency of deciding which mode of operation is best suited for the condition.
3.5 Coordination Dimension

The coordination dimension represents the software and algorithms used for the application, as well as the presentation, the interactivity and the navigation of the application. This dimension provides the coordination to support the other four dimensions. The adaptation framework must have the competency of effectively isolating the content, presentation, navigation and interaction components and subsequently integrating them seamlessly and effectively.
3.6 Comparison

While some dimensions and attributes are similar to those of traditional e-learning systems, we would like to highlight some of the significant differences between the adaptation framework (MA) and traditional e-learning systems.
(a) Traditional systems are typically designed for desktop access and not for mobile access. MA provides multiple platform access, including access through mobile devices.
(b) Traditional systems usually deliver identical content 24x7. MA adapts to several environmental parameters such as connection, environment, motivation, and device capabilities.
(c) Traditional systems generally have only one type of content for delivery. MA has different content for the same concept for adaptive delivery.
(d) Traditional systems mostly use browser features to provide offline access. MA uses an application to ensure offline access is functional regardless of the browser.
(e) Traditional systems might detect the browser and adjust the presentation, but generally not the content. MA can detect the browser to present different content (but with the same meaning).
(f) Assessment in traditional systems is typically not designed for offline mode. MA assessment is designed to work offline.
(g) Traditional systems are usually static. MA provides a dynamic and adaptive environment.
(h) Traditional systems are unable to provide collaboration between devices. MA opens the collaboration channel among devices with capabilities such as Bluetooth.
4 Adaptation Framework Implementation

4.1 Prototype System

We developed a prototype system based on the five competencies of our adaptation framework. The system is a web based learning and assessment system. The learner is able to learn a module and later take an assessment. The system has been tested with multiple platforms such as desktop, laptop, PDA and cell phone simulator. Once the user has been authenticated, the system proceeds to a recommendation page. The recommendation is based on a decision tree within the algorithm sub-dimension. If the user accepts the recommendation, the server side script selects the appropriate style sheet and packages the delivery to the user. Figure 1 shows an adaptation using an animated GIF for revision instead of interactive Flash when the Flash plug-in is not detected on the PDA. If Flash is detected, interactive Flash content is delivered, as in Figure 2. In all these cases the task can be achieved without disparity. The off-line mode has also been tested with a wireless PDA using a Bluetooth device. The physical distance between the server and the PDA was increased until the round-trip response time fell below the acceptable level. The system then recommended an off-line operation to the user. Again, the appropriate style sheet transformed and packaged the content for delivery.
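The flavour of this capability-driven selection can be sketched as follows (our illustration, not the prototype's actual code; the capability flags, threshold and stylesheet names are hypothetical):

```python
def select_stylesheet(device, network):
    """Walk a small decision tree over detected conditions to pick an XSLT style sheet."""
    if network["round_trip_ms"] > 800:   # degraded link: package content for off-line use
        return "offline_package.xslt"
    if device.get("flash_plugin"):       # interactive Flash where the plug-in exists
        return "flash_interactive.xslt"
    if device.get("gif_animation"):      # fall back to animated GIF revision content
        return "gif_revision.xslt"
    return "text_only.xslt"              # last resort for minimal browsers

print(select_stylesheet({"flash_plugin": False, "gif_animation": True},
                        {"round_trip_ms": 120}))  # -> gif_revision.xslt
```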
4.2 Lessons Learnt

With respect to the device dimension, our initial framework contained a device type sub-dimension. However, after the prototype development, it became apparent that the
device type is not appropriate as an influence on content delivery; rather, the capabilities of the device need to be determined for adaptation. In other words, different device types such as PC or PDA can receive the same content if they have the same capabilities (which is increasingly the case due to significant technical advances in mobile device technology). The device type will only be useful in a situation where the capabilities detection algorithm fails. With respect to the content dimension, we have structured and organized our content according to the recommendations of the framework. This helps the coordination dimension in the selection of style sheets to perform transformations. In the pedagogy sub-dimension, we have adopted the multiple-representation approach for adaptation (Kinshuk & Goh 2003). We have used the principle of content revisiting to enhance the learning experience. Thus users have a second chance to learn the content if they are not successful in performing the exercises. The pedagogy works well for both on-line and off-line modes.
5 Conclusion and Future Work

The adaptation framework provides a competency guideline for developing a learning system that is capable of adapting and delivering content in a multiple platform environment. Using the content dimension, user dimension, device dimension, connectivity dimension and coordination dimension, the prototype system performed adequately. However, we have yet to fully evaluate the system with respect to learning experience. In order to strengthen the framework, our future work will focus on comparing the learning experience with that of traditional e-learning systems.
Fig. 1. Animated Gif image without Interactivity (No Flash)
Fig. 2. Flash with interactivity
References
1. Bharadvaj, H., Joshi, A. and Auephanwiriyakul, S. (1998) An active transcoding proxy to support mobile web access, 17th IEEE Symposium on Reliable Distributed Systems, October 1998.
2. Bickmore, T. and Schilit, B. (1997) Digestor: Device Independent Access to the World Wide Web, http://www.fxpal.com/papers/bic97/
3. Blackboard (2002). Blackboard, http://products.blackboard.com/cp/bb5/access/index.cgi
4. Buyukkokten, O., Garcia-Molina, H., Paepcke, A. and Winograd, T. (2000) Power Browser: Efficient Web Browsing for PDAs, in Proceedings CHI2000 (The Hague, April 2000)
5. Broadway, CLF, ROL, Hermes (2002). http://www-sop.inria.fr/aid/software.html#broadway
6. Chen, J., Yang, Y. and Zhang, H. (2000) An adaptive web content delivery system, International Conference on Adaptive Hypermedia and Adaptive Web-based Systems (AH2000), August 2000, Italy, pp. 28-30.
7. Chen, J., Zhou, B., Shi, J., Zhang, H. and Wu, Q. (2001) Functional-based object model towards website adaptation, WWW10, May 2001, Hong Kong, pp. 1-5, http://www10.org/cdrom/papers/296/
8. Chen, Y.S., Kao, T.C., Sheu, J.P. and Chiang, C.Y. (2002). A Mobile Scaffolding-Aid-Based Bird-Watching Learning System. In M. Milrad, H. U. Hoppe and Kinshuk (Eds.), IEEE International Workshop on Wireless and Mobile Technologies in Education (pp. 15-22). Los Alamitos, USA: IEEE Computer Society.
9. Fox, A., Goldberg, I., Gribble, S.D., Lee, D.C., Polito, A., and Brewer, E.A. (1998) Experience With Top Gun Wingman, A Proxy-Based Graphical Web Browser for the USR PalmPilot, in Proceedings of the IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing (Middleware '98) (Lake District, UK, Sept. 1998)
10. Goh, T. and Kinshuk (2002). Mobile Web based ITS Adaptation. Paper presented at the International Conference on Computers in Education 2002, Auckland, New Zealand.
11. KBS Hyperbook System (2002). http://www.kbs.uni-hannover.de/hyperbook/
12. Smith, J., Mohan, R. and Li, C. (1999) Scalable multimedia delivery for pervasive computing, ACM Multimedia.
13. SQL Tutor (2002). http://www.cosc.canterbury.ac.nz/~tanja/sql-tut.html.
Ontologies for Creating Learning Object Content

Dragan Gašević, Jelena Jovanović, and Vladan Devedžić

FON - School of Business Administration, University of Belgrade, POB 52, Jove Ilića 154, 11000 Belgrade, Serbia and Montenegro
[email protected], [email protected], [email protected]
http://goodoldai.org.yu
Abstract. This paper gives a proposal to enhance learning object (LO) content using ontological engineering. In previous work on using ontologies to describe LOs, researchers built ontologies for the description of metadata. Such semantically annotated metadata improves retrieval of objects describing the same or similar content. However, these ontologies do not improve the LO content itself. Our approach suggests creating LOs that have content marked up in accordance with domain ontologies. Accordingly, LOs can be used not only as learning materials, but also in real world applications (e.g. simulation and CASE tools, etc.). This approach is based on defining domain ontologies, annotation-based authoring tools, ontology languages (RDF), and transformations (e.g. XSLT). As an illustration, we developed a simple Web application for teaching Petri nets in a simulation-supported environment.
1 Introduction

The Semantic Web introduces better semantic interoperability of Web resources [1]. An adequate infrastructure, consisting of ontologies, XML-based descriptions, and the necessary tools, is essential for achieving the desired level of semantic interoperability. Using the Semantic Web we can easily find existing learning materials, understand their descriptions (e.g. purpose, creator, etc.), locate related Web materials, etc. In that way, the Semantic Web improves LOs' reusability. In the learning community reusability is connected with research on LOs' metadata [2]. Recently, researchers have mainly proposed the Semantic Web and ontologies for improving LOs' metadata. For example, Mohan and Brooks [3] analyze the relations of LOs and the Semantic Web, especially emphasizing the importance of ontologies. Accordingly, they identify several kinds of ontologies regarding LOs: an ontology of domain concepts, ontologies for e-learning, ontologies about teaching and learning strategies, and ontologies about the physical structuring of learning objects. In [4] the authors give an example of an ontology developed in accordance with the ACM Computer Classification System (ACM CCS). This ontology is described with RDF, and used in the Edutella system. However, none of these solutions enables reusing the same LO in different ways through provision of ontology-based content.
Firstly, we have to define the meaning of LO reusability - using a LO in different courses, by different teachers and learners. We advocate that ontologies can be used to describe a LO's content, thus providing LOs with a new dimension of reusability - a LO can be used within the same course, but in different ways (e.g. different presentations of the LO). For instance, an author can make domain ontology-based annotations of Web resources (e.g. a Web page about ancient Rome). Later, the annotated parts can be used as a LO in a course that (s)he creates. This LO can be prepared to be used in the same course in different ways (e.g. as a table, or as a bulleted sequence, etc.). Obviously, it is more useful to use different presentations of the same LO than a number of LOs describing the same problem. Also, with semantically marked up LO content we achieve better LO findability - based on its content. In order to achieve semantically marked up LO content we need suitable equipment: authoring tools, domain ontologies, annotation tools, and transformations.
2 Starting Points

We will first explain how an author creates learning materials, as well as how an author searches for LOs in a Web-based environment [5]. Educational materials may be distributed among different educational servers - specific Web applications running on physical servers. Intelligent pedagogical agents provide the necessary infrastructure for knowledge and information flows between clients (learning and authoring tools) and servers in the context of Web-based education. In the case where we have ontologically annotated LOs' metadata, pedagogical agents are additionally empowered to find more appropriate LOs. This approach is different from the approach suggested in [3], where the authors address smarter LOs. We think that LOs should be further enhanced by providing ontology-based knowledge for their content. That is, semantically organized LO content has better potential to be repurposed. The main point is that we have one LO (i.e. its content) that can be transformed into different presentations or accessed from different platforms. According to the previous discussion we can differentiate two kinds of ontologies regarding LOs:
1. ontologies that describe LOs' metadata
2. ontologies that describe LOs' content.
We have already mentioned the first group of ontologies [3] [4]. The main focus of this paper is the second group - ontologies that describe LOs' content. LO content is educational material that can be: a text, a paragraph, a Web page, an image, an audio file, etc. The meaning of the content can be described using ontology-based annotations, or more precisely, by inserting pointers to appropriate ontologies. Annotations can be remote or embedded [6], and the XML/RDF mechanism is used to describe annotations. Generally, annotating a Web resource means adding semantic content to it. As LO consumers we do not need these semantic marks, but they are essential for machines to be able to read and understand LO content. A LO created using this annotation principle gets a new dimension of reusability - it can be used in different ways within the same course. Furthermore, LOs created this way are more suitable for retrieval
since their content can be inspected using ontology-based conceptualization. This is important in computer science courses like, for example, object oriented modeling with the UML. A teacher uses a UML model in a Power Point presentation, while students should try the same model in a CASE tool (e.g. Rational Rose). Similarly, this principle can be used in other disciplines (e.g. philosophy, history). In the next section we explain enhancing LOs' content using ontologies.
3 Proposed Solution: Creating Ontology-Based LOs' Content

A LO can be created to be used in different courses, and its content can be created in many ways (e.g. using a text editor, a slide presentation creator, HTML editors, graphical tools, domain applications, etc.). In the classical LO creational schema, adding general LO descriptions (i.e. metadata - e.g. creator, purpose of the LO, etc.) is how semantics is attached. Here we extend the part of this schema related to LO content. In Figure 1 we depict this enhanced schema. The central part of this figure is a LO. A LO consists of two parts: metadata and content. LO metadata are described using the IEEE Learning Object Metadata (LOM) standard. Learning object repositories (LORs) contain either LOs' metadata or references to LOs (i.e. their metadata) on the Web. Also, this metadata can be enriched with ontologies (e.g. the ontology based on the ACM CCS [4]); this kind of ontologies is marked accordingly in Figure 1. An author accesses and retrieves available LOs in LORs. When (s)he finds a useful LO, (s)he takes it and incorporates it into the instructional model of a course (s)he is creating. Here we regard a course as an instructional model created in accordance with an Educational Modeling Language - EML (e.g. the EML, LMML, etc.) [7]. A resulting instructional model can be mapped into an XML form of the used EML (i.e. its XML bindings). This XML course description can be transformed (e.g. using eXtensible Stylesheet Language Transformations - XSLT) into a learner-suitable presentational form (e.g. HTML, Scalable Vector Graphics - SVG, etc.). In the extended LO creational schema we use domain ontologies to semantically mark up the content of a LO; these domain ontologies are also denoted in Figure 1. An author can use already developed ontologies (the preferred case) or develop her/his own domain ontology. A domain ontology describes the subject domain of the course for which a LO is being created. Generally, it would be better if an author did not have to explicitly know about the domain ontology. For example, we cannot expect that a teacher of social science knows how to develop an ontology. In order to overcome this problem we recommend either the usage of existing tools (i.e. annotation tools) or the development of new tools (see the next section) that have an appropriate GUI for creating annotations [6]. One possible solution is to provide a teacher with a tool that would, in the background, perform all required annotations of a Web document (i.e. create instances of ontology elements) while (s)he selects parts of the document (HTML or PDF). Later, the teacher extracts the annotated parts of the document and creates from them a
Fig. 1. Extended LO creational schema - the LO's content is related to a domain ontology
presentational form within a course. In fact, this idea has an analogy in marking parts of a printed book with a highlighter. While reading a printed text, a teacher uses these marks as reminders of the parts that (s)he found interesting for her/his course. An advantage that Web resources have is that the denoted (i.e. annotated) parts can be automatically extracted. Once created, a LO with its ontology-based content can be included in different courses. Formally, one can say we use LOs when a new instructional model is being constructed. Usually, this model consists of learning modules or units (the terms depend on the selected EML, since there is no generally adopted terminology for EML items). When an instructional model is finished it can be transformed into a learner-suitable presentational form (HTML, SVG, etc.). Since almost every EML has an XML binding, this transformation can be performed using the XSLT mechanism. Since LO annotations are XML-based, we can transform LOs using XSLT. These transformations can also mean content extraction, so that we do not show the full LO content, but only the parts suitable for a concrete situation. Furthermore, an instructional model in some of its parts uses LOs with enhanced content. When transforming this instructional model, we should use transformations for all included LOs. Accordingly, the transformations of the instructional model (i.e. its XSLTs) depend on the LOs' transformations (i.e. their XSLTs). In this way, the same instructional model can repurpose the content of a LO in different forms. We believe that this content transformation principle can be useful for adaptive tutoring systems, where the system should adapt teaching material to students' preferences, foreknowledge, etc. That means the same LO can be prepared according to a student's model.
4 Required Equipment

This section gives an overview of the recommended equipment for achieving semantically enhanced LO content.

Domain Ontologies. An author can use already created ontologies that describe the content of a LO. Also, authors should be provided with mechanisms that enable them to create their own ontology during LO construction. An author does not explicitly have to know (s)he is developing an ontology.

Authoring Tools. In order to enable creating LOs for the Semantic Web, we need to develop suitable tools. Besides widely accepted and well-known authoring tools (e.g. text processors, Power Point, HTML editors) we suggest employing additional tools. Here we consider two kinds of additional authoring tools: annotation tools and domain tools. Authoring tools should also have the ability to use domain specific XML-based formats (e.g. the W3C Mathematical Markup Language - MathML).

Annotation Tools. Annotation tools are a Semantic Web effort aimed at producing semantically marked up Web resources (http://annotation.semanticweb.org). Examples are well-known annotation tools like Annotea, the CREAM framework, SMORE, etc. The main characteristic of these tools is either relating Web content to a domain ontology or giving semantic meaning to Web content. In both cases, they produce annotations that can be understood (in the narrower sense) as RDF descriptions. Web content may be a Web page or part of a Web page (e.g. a paragraph, a heading, some XML tag). Present annotation tools have features that support annotation of, for instance: HTML pages (CREAM), SVG and MathML (Amaya), Power Point (Briefing Associate), etc. Future tools should implement support for other important Web formats, such as PDF, SMIL, etc. Also, annotation tools can be used to annotate different multimedia formats (e.g. animation, sound, etc.).

Domain Tools. In some engineering courses a teacher uses a LO, for example, in his/her Power Point presentation, while students should use this LO in a real world application (e.g. in object oriented courses the Rational Rose tool is used for UML-based object oriented modeling). This is also important for simulation-supported learning (see the next section). Consequently, we think that the best way to create LOs is to use an original domain tool. LOs created in that way can be shared using XML-based domain formats.

XML-Based Formats. Presently, XML is a widely adopted Web data sharing standard. There is also an increasing number of tools that support XML, as well as XML-defined domain sharing formats. Examples of XML-based formats are: MathML, the Petri Net Markup Language (PNML), XML Metadata Interchange (XMI), etc. LOs based on XML can be easily converted into different formats, e.g. HTML or SVG, but also into the formats of other specialized tools such as MS Power Point,
Word, or other domain specific tools. Accordingly, it would be an important advantage if we developed domain ontologies closely related to XML-based formats.

Transformations. Transformations are an important segment for achieving LO repurposing on the Semantic Web. XSLT is the most suitable mechanism for transforming semantically marked up LOs. We should note that the Annotea annotation tool uses this XSLT-based approach. Using XSLT we have the additional ability to convert XML-based LOs into valid ontology markup (e.g. OWL).
5 An Application Example

In this section we depict a simple educational Web application in order to illustrate the proposed approach. The application is for teaching Petri nets, and it introduces the well-known producer/consumer problem that is taught in many different computer science courses. The application is based on the Petri net ontology developed in the RDFS and OWL languages. We have developed the ontology to be closely related to the PNML in order to exploit as much as possible compatibility with present Petri net software tools. Relations between the PNML and the RDF/XML-based Petri net description are implemented using XSLT in both directions. For the educational Web-based application we also use P3 - a Petri net tool we have developed for teaching Petri nets [8]. The P3 tool has the ability to generate an RDF description of a Petri net as well as to produce an SVG description of a Petri net model. An SVG document can be annotated using RDF compliant with the ontology. In this case we regard a Petri net model in RDF-annotated SVG form as a LO. In Figure 2 we give a screenshot of the application. A created LO is incorporated in the Web application (the Petri net graph in Figure 2). In order to empower Web applications with the ability to perform interactive simulation of Petri net models, an implementation of the logic of Petri net execution is needed. This can be achieved using the PNML-based Web service for Petri net simulation developed in [9]. The Web application forwards a Petri net model to the Web service. This model is converted from the RDF-annotated SVG format into the PNML format using an XSLT. Once the simulation is finished, another XSLT is used to transform the result from PNML back to the RDF-annotated SVG format. Each Web page in the system contains a graphical presentation of the corresponding Petri net model (based on RDF-annotated SVG) and provides support for simulation with that model (using the Web service). The user can save a Petri net he/she is working with in PNML format, and that Petri net can be further imported into Petri net tools (e.g. P3, Renew, DaNAMiCS). The same model in SVG form can be used in other Web pages, but can also be shown in a tool such as Power Point. This application can be reached and used at http://p3lectures.europe.webmatrixhosting.net.
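As a rough illustration of what such an ontology-based annotation could look like (our sketch using the rdflib library; the ontology namespace, concept names and model URL are invented stand-ins, not the actual Petri net ontology):

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

PN = Namespace("http://example.org/petri-net-ontology#")  # hypothetical ontology URI
g = Graph()
g.bind("pn", PN)

# Annotate one SVG element (identified by its fragment id) as a Petri net place
place = URIRef("http://example.org/models/producer-consumer.svg#place_buffer")
g.add((place, RDF.type, PN.Place))
g.add((place, PN.name, Literal("buffer")))
g.add((place, PN.initialMarking, Literal(0)))

print(g.serialize(format="xml"))  # RDF/XML that could be embedded in the SVG document
```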
Fig. 2. The educational Web application for teaching Petri nets that uses enhanced LOs
6 Conclusions

In this paper we presented an approach that suggests using ontologies for annotating LO content. In that way we extended LO reusability - LOs can not only be used in different courses, but they can also be used in different ways (e.g. different presentations, platforms, etc.). Furthermore, using ontology-annotated LO content improves the retrieval of LOs. In order to provide an adequate environment for creating enhanced LO content we need a proper infrastructure that consists of: domain ontologies, authoring tools (annotation and domain tools), domain formats and transformations (i.e. XSLT-based). We believe that teachers can benefit from the proposed approach since it would enable them to create LOs while reading Web literature. Actually, they would just have to denote (i.e. annotate) suitable parts that could later be included in their new courses. We also hope that the proposed approach can help developers of LO authoring tools for the Semantic Web. In the future we are planning to explore relations between domain ontologies and didactical ontologies in order to obtain LO content that is more suitable to be used in EMLs. In that way pedagogical agents would be enabled to make "smarter" decisions while selecting, preparing, and adapting domain materials that should be shown to a student.
References 1. Berners-Lee, T. et al.: The Semantic Web, Scientific American, Vol. 284, No. 5 (2001) 34-43 2. McClelland, M.: Metadata Standards for Educational Resources, IEEE Computer, Vol. 36, No. 11 (2003) 107-109 3. Mohan, P., Brooks, C.: Learning Objects on the Semantic Web, In Proc. of the IEEE Int'l Conference on Advanced Learning Technologies, Athens, Greece (2003) 195-199 4. Brase, J., Nejdl, W.: Ontologies and Metadata for eLearning, In S. Staab & R. Studer (Eds.) Handbook on Ontologies, Springer-Verlag (2004) 555-574 5. Devedžić, V.: Key Issues in Next-Generation Web-Based Education, IEEE Transactions on SMC – Part C: Applications and Reviews, Vol. 33, No. 3 (2003) 339-349 6. Handschuh, S., et al.: Annotation for the Deep Web, IEEE Intelligent Systems, Vol. 18, No. 5 (2003) 42-48 7. Koper, R.: Educational Modeling Language: Adding Instructional Design to Existing Specifications, Workshop "Standardisierung im eLearning", Frankfurt, Germany (2002) 8. Gašević, D., Devedžić, V.: Software Support for Teaching Petri Nets: P3, In Proc. of the IEEE Int'l Conference on Advanced Learning Technologies, Athens, Greece (2003) 300-301 9. Havram, M. et al.: A Component-based Approach to Petri Net Web Service Realization with Usage Case Study, In Proc. of the Workshop Algorithms and Tools for Petri Nets, Eichstätt, Germany (2003) 121-130
PASS: An Expert System with Certainty Factors for Predicting Student Success Ioannis Hatzilygeroudis 1,2, Anthi Karatrantou 1, and C. Pierrakeas 3
1 Department of Computer Engineering & Informatics, University of Patras, GR-26500 Patras, Greece 2 R. A. Computer Technology Institute, P.O. Box 1122, GR-26110 Patras, Greece 3 Hellenic Open University, 23 Saxtouri Str., GR-26221 Patras, Greece {[email protected], [email protected]}
Abstract. In this paper, we present an expert system, called PASS (Predicting Ability of Students to Succeed), which is used to predict how certain it is that a student of a specific type of high school in Greece will pass the national exams for entering a higher education institute. Prediction is made at two points: an initial prediction after the second year of studies, and a final one after the end of the first semester of the third (last) year of studies. Predictions are based on various types of student data. The aim is to use the predictions to provide suitable support to the students during their studies towards the national exams. PASS is a rule-based system that uses a type of certainty factors. We introduce a generalized parametric formula for combining the certainty factors of two rules with the same conclusion. The values of the parameters (weights) are determined via training, before the system is used. Experimental results show that PASS is comparable to logistic regression, a well-known statistical method.
1 Introduction In the last decades, there has been extensive use of computer-based methods in education, for either administrative or pedagogical purposes. Those methods can be divided into traditional and artificial intelligence (AI) methods. Various forms of regression analysis are representative of the traditional methods, whereas the expert systems approach is a common representative of the AI methods. Both have been used in various applications in the education domain, e.g. admission decisions [1, 2], academic advising [3], and academic performance prediction [1]. In this paper, we use them in a somewhat different application: prediction of a student's success in the national exams for admission to a higher education institute. It is obvious that the ability to predict a student's success in the entry examinations to higher education could be useful in a number of ways. It is important for the teachers, as well as the directors of a secondary education school, to be able to recognize and locate students with a high probability of poor performance (students at risk), in order to provide extra help to them during their studies. So, it is useful to have a tool to assist them in this direction. This is the objective of the work presented here.
We use two methods, an expert system approach and a well-known statistical method, namely logistic regression, to achieve our objective. Logistic regression is used for comparison purposes. In the expert system, we introduce and use a modified version of MYCIN's certainty factors [4]. We call the expert system PASS (Predicting Ability of Students to Succeed). Our aim is to use PASS as an education-supporting tool, mainly addressed to high school teachers for the above-mentioned purpose. The design of PASS is based on an analysis of demographic, educational and performance data of students from an available database.
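A minimal sketch of MYCIN-style certainty-factor combination follows, together with a hypothetical weighted variant in the spirit of the generalized parametric formula; the weighting scheme shown is an assumption, not the formula actually used by PASS.

def combine_cf(cf1, cf2):
    # Standard MYCIN combination of two certainty factors in [-1, 1].
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

def combine_cf_weighted(cf1, cf2, w1=1.0, w2=1.0):
    # Hypothetical parametric variant: trained weights scale each factor's
    # contribution (clamped to [-1, 1]) before the standard combination.
    clamp = lambda v: max(-1.0, min(1.0, v))
    return combine_cf(clamp(w1 * cf1), clamp(w2 * cf2))

print(combine_cf(0.6, 0.5))                      # 0.8
print(combine_cf_weighted(0.6, 0.5, w1=0.9, w2=0.7))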
2 Modeling Prediction Knowledge 2.1 The Problem Our work concerns students of technical and vocational secondary education in Greece. In evening schools of that level, students attend a three-year program (grades A, B and C) and choose one of the specializations offered, such as electrology, mechanics, nursing, etc. Students attend a variety of courses: general education courses, specialization courses, etc. Each year has two semesters. At the end of each semester, students are given marks representing their performance. To enter a Technological Educational Institute (TEI), which is a higher-level institute, the students must pass the corresponding national exams. The exams include tests in three courses. It is important to note that the number of students who succeed is very low, which means that the students of this type of high school need some help. Thus, it is very important for a teacher to be able to recognize, as early as possible, the students who (a) have a high chance of succeeding, in order to help and encourage them during their studies, and (b) have a low chance of succeeding, in order to treat them properly during their studies. So, apart from the teacher's personal opinion, a tool that could make predictions about a student's chance of passing the national exams would be of great assistance. It would also be useful for school directors and curriculum designers, because it can offer them useful information about how to organize the school program.
2.2 Specifying the Parameters Knowledge acquisition in such problems mainly consists in specifying the parameters (input variables) that play some role in predicting a student's success. To this end, we interviewed some teachers with long experience. We also analyzed data from a student database, which contained 201 records of students who took the entry examinations during the last three years. We finally arrived at the following parameters as being important for predicting student success: sex, age, specialization, grade A (average mark of all first-year courses), grade B (average mark of all second-year courses), and grade SC (average mark, at the end of the first semester of the third year, of the three courses to be examined in the national exams). Marks in courses follow the 1–20 scale. Another crucial point was to determine the values of the parameters/variables, like age, grade A, grade B and grade SC. The variables and their decided values are as follows: specialization: electrology, mechanics, electronics, office clerks, nursing; sex: male, female; age: normal (’, ‘white’ in utterance 19 appeared in utterance 18. 3.3.4
Does the Current Utterance Include a Noun or a Noun Phrase that Already Appeared in a Previous Utterance? Fig. 4 shows the case where the current utterance 57 and the previous utterance 56 share the noun 'creation', which links these two utterances. Personal pronouns are not used for this check, because their reference is especially ambiguous in this setting.
Fig. 4. Example of an utterance including a noun or noun phrase that appeared in the previous utterance
3.3.5 Is the Speaker of the Current Utterance the Same as that of the Utterance Just Before? If no utterance is found to be linked after steps 1a)–1b), the speaker of the utterance just before is checked against the current speaker. If the speaker is the same, these utterances are linked, as in Figure 5; if not, the current utterance is determined not to have a link to any previous utterance.
Fig. 5. Example of an utterance whose speaker is the same as that of the utterance just before
3.4 Selecting Utterance Candidates that Should be Linked to Some Subsequent Utterances Discourse analysis has proposed a classification of utterances into three types: initiating, responding and following-up [4]. According to this classification, an initiating utterance predicts a responding utterance, and the pair is topically relevant. The proposed algorithm incorporates this observation as step 1a). In face-to-face conversations, it is usual that two persons talk with each other (one-to-one). In chat conversations, however, three or more participants often communicate with one another. In order to find the pair of utterances in such cases, an algorithm for extracting topic threads needs to find the participant to whom an initiating utterance is directed. Before this, whether the current utterance is initiating or not must be determined. The proposed algorithm uses explicit markers like '?' and '??' for this decision. For the problem of finding the participant whom the current initiating utterance targets, our algorithm examines expressions indicating participant names. If no name is explicit, the participant in the previous utterance linked to the
current utterance is used. This is based on our observation that people usually do not finish their conversation after just one exchange (a pair of initiating and responding utterances).
Fig. 6. Example of selecting utterance candidates that should be linked to some subsequent utterances: the proposed algorithm can identify utterance 40 as initiating using the participant name followed by the symbol ‘>’. Then it examines the utterance and can find ‘C’ as the participant who is expected to respond. This information is used by step 1a)
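Taken together, the explicit-addressing check, the lexical checks and the same-speaker fallback form a small cascade of heuristics. The following condensed sketch is illustrative only: the function names and regular expressions are not the authors' implementation, and the Japanese morphological analysis is replaced by a crude token check.

import re

def find_link(current, history):
    # Return the previous utterance the current one should be linked to,
    # or None if the current utterance starts a new topic thread.
    # Explicit addressing: "name>" (or "> name") marks the target participant.
    m = re.match(r"^(\w+)\s*>", current["text"])
    if m:
        for prev in reversed(history):
            if prev["speaker"] == m.group(1):
                return prev
    # Cut-and-pasted phrases / shared nouns; only a recent window is scanned.
    for prev in reversed(history[-10:]):
        if shares_noun(current["text"], prev["text"]):
            return prev
    # Fallback: same speaker as the utterance just before.
    if history and history[-1]["speaker"] == current["speaker"]:
        return history[-1]
    return None

def shares_noun(a, b):
    # Crude stand-in for morphological analysis: shared longer tokens.
    # The paper uses the Japanese morphological analyzer ChaSen instead.
    tokens = lambda s: {t.lower() for t in re.findall(r"\w{4,}", s)}
    return bool(tokens(a) & tokens(b))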
4 Evaluation of the Proposed Algorithm 4.1 Experimental Data Eight chat conversations (three with two participants (311 utterances) and five with three participants (870 utterances)) were collected experimentally for our analysis. Three researchers manually extracted topic threads for evaluating the proposed algorithm, based on the procedure proposed in [5] [6]. Differing decisions on topic threads were resolved by discussion.
4.2 Evaluation Result We compared the results of the proposed algorithm with a baseline method to show the usefulness of our algorithm, using the correct rate (Table 1). The correct rate is calculated as the proportion of correctly linked utterances, both among the utterances selected by each factor of the proposed algorithm and among all utterances in the experimental data. To calculate the correct rate of the baseline method we used the same algorithm as Coterie [3], which automatically separates out chat conversations using keywords and key-phrases.
As Table 1 shows, the average correct rate of the proposed algorithm is 53.0%. With practical use in mind, this result is not yet satisfactory, but it is better than the baseline result. Moreover, hardly any algorithms exist for extracting topic threads and utterance relations in chat conversations and dialogue, and from this point of view the result of our proposed algorithm is satisfactory. To examine these results in more detail, we analyzed which linguistic devices were involved in correct and incorrect answers. We found that about twenty percent of all utterances are recognized as the beginning of a topic because they are not selected by any factor of the proposed algorithm, and that noun phrases caused about half of the errors, especially those related to synonyms. On the other hand, a cut-and-pasted phrase, a phrase following the symbol '>', and an expression "> speaker name" caused only about five percent of the errors, and hence these devices are useful for determining linked utterances.
5 Applications of the History of Chat Conversations for Knowledge Creation Recently, there have been many studies of chat systems that emulate face-to-face conversation as closely as possible. Chat conversations naturally generate text documents, but this advantage does not seem to be actively utilized. We discuss how the history of chat conversations can be utilized for knowledge creation by applying our proposed algorithm. There are two promising applications: 1) an automatically generated knowledge base (after using chat systems), and 2) a supporting system for chat conversations (while using the systems synchronously).
5.1 An Automatically Generated Knowledge-Base We can extract topic threads, then sort and file them to generate a knowledge base. In addition, because assorted threads can be reorganized, we can use not only the contents of the conversation but also know-who knowledge, i.e., who knows the contents well, by relating extracted topic threads to participants.
5.2 A Supporting System for Chat Conversations The problem that chat users cannot easily read the history of conversations is solved by using the algorithm for extracting topic threads: users can refer to the history without extra operations in the chat system. In addition, because the situation of a conversation can be visualized and grasped, we can support activating the conversation with agents that offer new topics according to the situation, as in AIDE [7], and support making acquaintances who have similar interests.
6 Summary and Conclusion This paper proposed a robust algorithm for extracting topic threads in chat conversations using several linguistic devices, and discussed how chat conversations can
contribute to knowledge creation using the proposed algorithm for extracting topic threads. When we apply this algorithm in real systems, some problems remain. One problem is morphological analysis: when we processed chat histories for extracting topic threads with the Japanese morphological analyzer ChaSen [8], many words that should have been classified as nouns were classified as unknown words instead. The other problem concerns handling synonyms and deciding which linking elements should take precedence. Although applying our proposed algorithm raises these problems, it is clear that the linking elements based on features of chat conversations perform well.
References [1] Hosoma, H.: What do people presuppose in chat conversations – Timing Structure of chat and speech conversations, in Okada, M., Mishima, H. and Sasaki, M. (eds.) Embodiment and Computers, bit magazine, Kyoritsu Publisher, Japan, pp. 339-349, 2000. [2] Smith, M., Cadiz, J. J. and Burkhalter, B.: Conversation Trees and Threaded Chats, Proc. of CSCW'00, pp. 97-105, 2000. [3] Spiegel, D.: Coterie: A Visualization of the Conversational Dynamics within IRC, MIT Master's Thesis, 2001. [4] Sinclair, J. M. and Coulthard, R. M.: Towards an Analysis of Discourse, Oxford University Press, 1975. [5] Miura, A. and Shinohara, K.: An Exploratory Study of "Chat" Communication on the Internet, Japanese Journal of Interpersonal and Social Psychology, No. 2, pp. 25-34, 2002. [6] Mizukami, E. and Migita, M.: Order of Chat Conversations – Study of Conversation Structure by Interval Analysis, Cognitive Studies: Bulletin of the Japanese Cognitive Science Society, 9(1), pp. 77-88, 2002. [7] Nishimoto, K., Sumi, Y., Kadobayashi, R., Mase, K. and Nakatsu, R.: Group Thinking Support with Multiple Agents, Systems and Computers in Japan, Vol. 29, No. 14, pp. 21-31, 1998. [8] Matsumoto, Y., Kitauchi, A., Yamashita, T., Hirano, Y., Matsuda, H., Takaoka, K. and Asahara, M.: Japanese Morphological Analysis System ChaSen version 2.2.1, http://chasen.aist-nara.ac.jp/hiki/ChaSen/, 2000.
Support System for a Person with Intellectual Handicap from the Viewpoint of Universal Design of Knowledge Toshiaki Ikeda and Susumu Kunifuji School of Knowledge Science, Japan Advanced Institute of Science and Technology, Hokuriku, Asahidai, Tatsunokuchi, Ishikawa 923-1292, Japan {tosiakii, kuni}@jaist.ac.jp
Abstract. In special education schools, teachers are devising various supports for intellectually handicapped children. Many of these supports are also effective for people without a handicap. If we can develop products and systems that handicapped persons can use easily, they should also be safe and comfortable for everyone else. This paper introduces the possibility and usefulness of the universal design of knowledge, based on practice in a special education school.
1 Introduction Since the International Year of Disabled Persons in 1981, various campaigns have been conducted all over the world. The terms "barrier-free" and "universal design" are no longer especially new. Car parking spaces for wheelchairs and toilets for persons with disabilities have been installed everywhere, and appliances and services for visually impaired or hearing-impaired persons are available. It has been shown that those resources are useful also for senior citizens, small children, and persons with various physiological problems. Steep stairs have given way to pleasant slopes. Many companies and public offices are also beginning to understand the value of barrier-free and universal design. However, we can find no such provisions anywhere for persons with intellectual handicap. There are many unclear or complicated things around us: the operation manual of an electric appliance, the clauses of a life insurance policy, legal terms, medical jargon; the list is endless. An unintelligible thing irritates us, and sometimes such problems push us into dangerous situations. Serious accidents and disasters arise from slight failures of comprehension. If we can find a general method of making things intelligible, our lives will become safer and more comfortable. A total support system for mentally handicapped persons will change our society. We call this strategy the universal design of knowledge.
2 The Issue of Assistance for Mentally Handicapped Persons Handicapped persons are considered to make up about ten percent or more of the population. There are many support systems using high-tech devices for persons with physical handicap, visual impairment or auditory difficulties. However, there is almost no assistance for persons with intellectual handicap. For example, almost all stations, public offices, schools, and shopping centers have car parking spaces and slopes for people using wheelchairs, yet there are no supporting systems for persons with intellectual handicap, who earnestly ask for the intelligible articles and services required for daily life. Writing "danger" in big red letters has no significance for those who cannot read. Support systems for persons with intellectual handicap have yet to be built. Nor is this only a mentally handicapped persons' problem: there are many cases where an unclear thing makes our life inconvenient or unpleasant. Many people get lost in big basement garages every day. The thick manual of an electrical appliance puts the user off from the beginning. Many troubles, such as traffic accidents and disputes over contracts, are caused by unclear things. If a community is safe and comfortable for a disabled person, all people can live safely and comfortably.
3 The Example of Practice in a Special Education School 3.1 The Devices Currently Used in the Special Education School Makers of precision instruments, such as cameras or watches, often advertise that their goods have been tested under severe conditions. The difficulty of special education for mentally handicapped persons is as severe as designing a product to be used in space or in the polar zones. Various devices for conveying things intelligibly are used in the special education school for children with an intellectual handicap. These are some examples.
a. Simplification. Ex. A Kanji character is replaced with a Hiragana character; 125 yen is calculated as 100 yen.
b. Delay. Ex. Show the teaching materials slowly.
c. Emphasis. Ex. Use big letters and colored letters.
d. Repetition. Ex. Show repeatedly; tell repeatedly.
e. Subdivision. Ex. Show little by little; teach step by step.
f. Supplement. Ex. Add notes; add illustrations.
g. Add Modality. Ex. Add sounds, lights, smells, textures and motions.
h. Embodiment. Ex. Numbers are replaced with apples and counted.
i. Abstraction. Ex. Use symbols to count.
j. Familiarity. Ex. A favorite animation character is made to appear.
k. Gradualness. Ex. Some intermediate targets are prepared.
l. Order. Ex. Change order or keep order.
These devices can be classified into three groups: [1] making things intelligible by controlling the quantity of information (b, d, e, i, l); [2] making things intelligible by transforming their significance (a, c, f, g, h, j, k); [3] making things intelligible by preparing mental conditions (f, g, j, k).
These three techniques are effective not only in a special education school; they are actually used in all kinds of everyday situations. The important issue is that we should tailor the assistance to each user's conditions, such as age, sex, character, and experience. The most important point among these conditions is giving priority to the user's sense of security and pleasure.
3.2 Sample Case from the Special School: Transit Position The target student: female, age 15, mental retardation, autism. Problems: she sometimes bites classmates and teachers when she panics. Solution: in order to stabilize her emotions, we made a "transit position" in the classroom for her (Fig. 1). It has a video set, which she can use anytime she wants. She likes certain TV personalities and TV commercials, so we prepared some videotapes for her. When she feels nervous she can go there and watch some videos in order to settle her feelings. Later she wanted to take the tapes to her home, having found that the system might be effective there as well. She uses the transit position both before and after the subjects she is weakest at (Fig. 2). She can choose the tape and which scene to watch. We found that being able to control things (and times) is her most important demand. Using a controllable system helps her to prepare her mental condition and attain stability.
Fig. 1. The Transit position
Fig. 2. The observed state of her feeling
3.3 Sample from the Special School: Moving Transit Position The target student: male, age 15, mental retardation, does not speak any words. Problems: when something disagreeable happens, he beats his face severely. Solution: he likes certain characters from TV shows and cannot part with the character dolls and books. A panic has various causes, and the character doll itself causes a panic in many cases; some classmates also want to see the dolls and books. We printed the characters he loves on some T-shirts (Fig. 3). He was very pleased and wears three shirts one over another. He accepts this system as a "moving transit position": when he feels nervous he can look at these characters (Fig. 4). He also uses the shirts as a communication board, pointing at a character when he wants to draw some pictures or listen to the theme song. It is important to secure an implement for feeling at ease, and important that it can be used anytime, anywhere. He could easily control some information that is important to him, and thus he was able to attain mental stability.
Fig. 3. Moving Transit Position
Fig. 4. The state of the feeling
3.4 Sample from the Special School: Simple Keyboard System The target student: male, age 16, mental retardation, autism, does not speak any words. Problems: he wants to look at cars on the Internet, but he cannot use a mouse or a keyboard. He cannot wait even a moment until a picture appears, and he cannot stop striking keys and switches; finally, he falls into a serious panic. Solution: we made a simple keyboard system for him (Fig. 5). The keyboard has only two switches: one is a lever for going forward and back; the other is a button switch that has no function. At first, he struck the dummy key earnestly. He gradually stopped pressing the key, and finally he could wait without pushing it, with a shining smile. Later he eagerly collected pictures on the Internet, and we made some multimedia pictorial books. This intelligible system not only brought about mental stability but also drew out a new talent of his.
Fig. 5. Simple Keyboard System
Fig. 6. His collection book
4 Conclusion If we reconsider all the products and systems of our society from the viewpoint of the universal design of knowledge, everyone will be safer and more comfortable. The most important thing is offering intelligibility according to each individuality and condition. Of course, it is impossible to anticipate all cases. However, persons with intellectual handicap give us wonderful wisdom. We should adopt the following appraisal standards: Can the product or service be adjusted to a user's individuality and conditions? Does the product or service make the user's security and relief its first requirement? Even if each device is small, together they will prevent big troubles. When products and services based on the universal design of knowledge spread widely, safety, creativity and productivity will improve not only for handicapped persons but for everyone.
Acknowledgement We thank the children of Meiwa special school and Komatsu special school, and their families. They encourage us day and night.
Intelligent Conversational Channel for Learning Social Knowledge Among Communities S.M.F.D. Syed Mustapha Faculty of Computer Science and Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia [email protected] http://www.perdana.fsktm.um.edu.my/~symalek
Abstract. Recent studies have shown two approaches to building learning systems, corresponding to two types of knowledge: content knowledge and social knowledge. The former is knowledge about how to perform a task, while the latter is more about best practices. The Intelligent Conversational Channel (ICC) is built to support the learning of social knowledge. In this paper, the two types of knowledge are explained, along with how ICC can be used to support learning among communities.
1 Introduction There are numerous types of learning systems used as technologies for learning, such as intelligent tutoring systems, computer-aided learning, microworlds and computer-based learning [4]. Expert systems have also been used to train general practitioners to become specialists [1]. These systems support learning for a specific domain of knowledge. In the last decade, studies have shown that learning through social processes has become an integral learning method besides conventional self-learning. Community of Practice is a social learning theory that describes one's learning through participation and reification within community activities. Collaborative learning supports learning by sharing knowledge through mutual contribution [2]. Social Knowledge-building (SKB) describes collaborative knowledge building by a community [3]. Researchers regard story-telling as an effective mode of knowledge sharing and knowledge transfer [5]. The type of knowledge that suits learning through this approach is so-called social knowledge. Our approach to enabling the sharing of social knowledge has three prongs: facilitating communication through a virtual community, analyzing social interaction through a discourse analyzer, and building social knowledge through story-telling. The Intelligent Conversational Channel (hereafter, ICC) has been developed to support this approach through three main components: the Discourse Communicator, the Hyper-media Learning Space and the Discourse Analyzer. These three components are built on a community channel as the main venue for knowledge sharing. In Section 2, a descriptive analysis is given to differentiate between content knowledge and social knowledge; Section 3 describes the ICC components and the life cycle model of social knowledge; and Section 4 gives the conclusion and future work.
2 What Is Social Knowledge? In our daily activities, two types of knowledge are frequently used. The first is content knowledge. Content knowledge is all about learning how to perform certain tasks in a professional manner. It may be derived from basic principles learned through formal education, such as at a tertiary institution, or learned from an experienced expert. Many learning tools support the learning of content knowledge, as it reflects one's in-depth knowledge of one's skill and professionalism. Well-known tools include the expert system, the intelligent tutoring system, intelligent computer-aided learning, the microworld, etc. As a simple example, a medical doctor is called a specialist when he/she embarks on specialized courses and training in order to become an orthopedist or pediatrician. His/her knowledge accumulates over long years of experience. This type of knowledge is static, rigid and stable. The second type of knowledge, by contrast, is called social knowledge (or socially derivable knowledge), which may not be obtained through formal learning or experience but rather through community interactions. Knowledge about current epidemics, or about which medical center offers the best treatment, can only be obtained through interaction with the community. Knowledge about the best practices in conducting staff appraisals at blue-chip companies can be learned through social interactions. This type of knowledge is dynamic, fluid and unstable, in the sense that it may change from time to time and its validity can easily be superseded by more current knowledge. In another scenario, Denning [6] describes how a problem with Pakistan's highways was solved instantly after contact with colleagues who had experience solving similar problems in South Africa. The knowledge exchanged was not content knowledge (fundamental theories from an engineering course) but rather social knowledge, which can only be acquired through acquaintance. Because of the differences between content knowledge and social knowledge, the tools for facilitating their learning also differ. Content knowledge, which contains facts and fundamental theories, can be learned using courseware or computer-based learning software, while experience can be learned through an expert system or intelligent tutoring system. Social knowledge, however, requires the community as the integral source of knowledge. The process of building a system that supports the learning of social knowledge requires consideration of the following factors [7]: Multiplicity in learning objects – knowledge in the real world is delivered or obtained in different forms. The objects used as part of learning, whether directly or indirectly, are called learning objects, as described by Community of Practice [8]. Radio, television and LCD screens used for advertising are examples of broadcasting systems that contribute to one's knowledge. Newspapers, magazines, leaflets and brochures are pieces of information that transform into one's knowledge when read. Other forms of learning objects are working colleagues, animate and inanimate artifacts such as the copier machine, pets at home, video movies, and the neighbors one socializes with. In this respect, expert knowledge does not come from a single source, just as there is a multiplicity in
the methodology for delivering the knowledge. An expert's talk in an open seminar or on television is an example of a learning object. Open-world assumptions – assumptions are needed when one designs a system to be used as a problem solver. The assumptions are the perspective that draws the boundary of the intended world in order for the system to work successfully within the specified limits. In modeling content knowledge, the closed-world assumption is always used. Unlike content knowledge, social knowledge does not specify assumptions, as the knowledge is not modeled but shared in its original form. The knowledge contains descriptions of real-world problems and solutions rather than hypothesized ones. Rapid knowledge-building – content knowledge requires a system builder to analyze and study the domain, model the solution, build the system and test its performance. These processes are rather time-consuming and costly. Social knowledge, on the other hand, is built by the community in a progressive manner and can be learned immediately, without highly mechanistic and sophisticated processes. Knowledge is presented in a human-readable rather than machine-readable format. Unorganized, ubiquitous but retrievable – content knowledge built into an expert system is structurally organized and frequently validated by truth maintenance technology. The purpose is to avoid conflicting facts and retain consistency in delivering solutions. The retrieval of the solution depends on the reasoning technique employed in the system. Social knowledge is rather unstructured and ubiquitous. It allows conflicting solutions to a single problem, which can be treated as offering a choice of different perspectives. Learners are not confined to the solution of a single expert, as knowledge is contributed by the several experts or non-experts involved in the knowledge construction process. Social knowledge is retrieved through social interactions and dialogues with the communities. In the following section, we discuss the technology built into ICC as a tool for supporting the learning of social knowledge.
3 Components of the Intelligent Conversational Channel The technology of ICC is built to enable the operation of the upper stream of knowledge management, at the user or community level. There is research on techniques for extracting knowledge from resources such as documents, images, videos, audio and data warehouses using intelligent information retrieval, or from human experts through knowledge acquisition. Our claim is that these systems are not flexible enough to allow the knowledge to be shaped by the community, who are the main beneficiaries of the knowledge. For example, several educational software packages are designed according to the specifications of pedagogical theories that are predetermined by the designer. The design of an expert system considers a small scope of human users, while its application is expected to be wide. In all such cases, the design is known and fixed before development. ICC's approach to knowledge shaping is flexible: the community determines what knowledge will be placed in
the knowledge repository; the content of knowledge is extracted through a "mix and match"1 process by the community; the shaping of knowledge is resilient, steered by the responses and arguments posted to the community channel; and knowledge externalization is done through dynamic interaction with the virtual community. These ideas are illustrated in Fig. 1.
Fig. 1. Components of Intelligent Conversational Channel
3.1 Community Channel In the community channel, knowledge can be presented in two forms: narrated text typed into a story object, and uploaded multimedia objects such as video clips, images, documents and HTML files. Fig. 2 shows a user expressing his/her concern about a school delinquency problem and using an image file to share the reality. Other members have the choice of replying to the above message or submitting a new story object, as shown in Fig. 3. The text in
1 Each member of the community has his/her own way of extracting (matching) the gist of the knowledge he/she is interested in from a single source. The combination (mixing) of these knowledge collections gradually builds the community knowledge base.
Fig. 2. Community channel that supports two forms of knowledge representation
the left box is submitted by the user, who wants to start a new subtopic about the caning system practiced in the school. The right text box contains the responses of another two members, who respectively support the earlier statement and suggest a new solution. The taggers 1 and For example, assume that the action corresponding to is 'go-forward' with and that the action corresponding to is 'turn-right and go-forward' with In this case, the angle of the movement direction is calculated as follows:
where A = 1000. Therefore, the angle of the movement direction is obtained, and the agent moves in this direction (Fig. 2(b)).
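The worked example above suggests that the final movement direction blends the basic actions according to their fuzzy membership values. Since the formula itself is not legible here, the following is only a plausible sketch of such a blend; the function name and the weighted-average form are assumptions.

def blend_direction(basic_actions):
    # basic_actions: list of (angle_in_degrees, membership) pairs produced by
    # the fuzzy sets attached to the selected LVQ output nodes; memberships
    # are assumed positive.
    total = sum(mu for _, mu in basic_actions)
    return sum(angle * mu for angle, mu in basic_actions) / total

# 'go-forward' (0 deg, membership 0.7) blended with 'turn-right and
# go-forward' (45 deg, membership 0.3) yields a heading of 13.5 degrees.
print(blend_direction([(0.0, 0.7), (45.0, 0.3)]))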
4.2 Learning Algorithm
So that the modified LVQ algorithm can select the optimal action among the available actions, it imposes a limit on how close a weight vector can approach an input vector. This limit is realized by changing the learning rate appropriately as learning proceeds, and it depends on the amount of reward that the action receives. Therefore, an incorrect weight vector cannot come closer to the input vector than the correct weight vector.
The modified learning algorithm is defined as follows:
where,
Even if the same action receives a reward repeatedly, the weight vector cannot be close to the input vector beyond a limit which is given by
The limit depends on the amount of reward assigned to the input vector. As the received reward becomes larger, the weight vector is allowed to come closer to the input vector. Therefore, this modified LVQ algorithm can solve the problems described in the previous section. See [3] for more details.
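The exact form of Eqs. (6) and (7) in the paper differs, so the sketch below only encodes the idea stated in the text, with an illustrative linear limit: the winning weight vector moves toward the input but is clamped at a reward-dependent distance.

import numpy as np

def modified_lvq_update(w, x, reward, alpha=0.1, r_max=1.0, base_limit=1.0):
    # Ordinary LVQ attraction step toward the input (alpha < 1 assumed).
    w_new = w + alpha * (x - w)
    # Reward-dependent approach limit: a larger reward permits a closer approach.
    limit = base_limit * (1.0 - min(reward, r_max) / r_max)
    d_new = np.linalg.norm(x - w_new)
    if 0.0 < d_new < limit:
        # Clamp at the limit distance, so a weakly rewarded action can never
        # end up closer to the input than a strongly rewarded one.
        w_new = x + (w_new - x) * (limit / d_new)
    return w_new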
5 Example The objective of the agent is to learn its optimal behaviors from the start point to the goal area while avoiding four obstacles in the environment, as shown in Fig. 1. The agent uses the modified LVQ algorithm (Fig. 1). The agent (3cm x 3cm) has three sensors whose readings are fed to the modified LVQ network as inputs (left, front, right). Each sensor can detect the angle and distance of the nearest obstacle as real values (Fig. 3). The agent also detects its own heading direction as an input. The modified LVQ outputs the direction of movement. In one time step, the agent can move up to 2cm, but if there is an obstacle in its way, it stops just before the obstacle. The resolution of the agent's movement is 0.1mm. While exploring the environment, the agent selects its action based on an exploratory strategy [1]. The weight vectors are updated after every episode. An episode is a period during which the agent sets off from the start point and ends in a terminal state. There are two terminal states: one is the state where the agent reaches the goal area,
Fig. 3. Structure of an agent
and the other is the time-up state, when the agent cannot reach the goal area within a prespecified number of steps N = 300. When the agent has successfully arrived at the goal area, the weight vectors of the modified LVQ output nodes activated along the path to the goal area are updated by Eqs. (6) and (7) with
where is the episode number, is calculated by Eq. (7), and is the number of moves that the agent has made in this episode. The reward is given equally to all the actions in this episode. If the number of moves is small, i.e. the agent finds the goal area quickly, the agent gets a large reward. This, combined with the modified LVQ, drives the agent to find better actions. When the agent cannot reach the goal area within the prespecified number of steps N, the reward and its sign are set as
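A plausible reading of this episodic reward scheme is sketched below; the agent and environment interfaces, the reward constant, and the sign convention for the time-up case are all assumptions rather than the paper's exact formulas.

def run_episode(agent, env, max_steps=300, c=1000.0):
    # `agent` and `env` are hypothetical interfaces standing in for the
    # modified-LVQ controller and the obstacle environment.
    visited = []
    state = env.reset()
    for step in range(1, max_steps + 1):
        action = agent.select_action(state)   # exploratory action selection
        visited.append((state, action))
        state, reached_goal = env.step(action)
        if reached_goal:
            reward = c / step                 # quick success -> large reward
            break
    else:
        reward = -c / max_steps               # time-up -> negative reward
    for s, a in visited:                      # the same reward for every action
        agent.update(s, a, reward)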
Training was done for 1000 episodes with 5 different randomly initialized weights. Table 1 shows the length of the route (measured in time steps) found by Kohonen's LVQ algorithm and by the modified LVQ algorithm. As Table 1 shows, the ordinary LVQ algorithm can hardly find the route to the goal area, while the modified LVQ algorithm successfully finds the route in all cases.
Since the ordinary LVQ algorithm blindly updates all actions that get a reward, it does not know which action is actually good. The modified LVQ algorithm, however, knows how good a selected action is and updates its weight vector accordingly, so it easily finds the route to the goal area with high probability.
6 Conclusions This paper proposed behavior learning of an autonomous agent in continuous state spaces, using a function approximation technique based on the modified LVQ algorithm with fuzzy sets. The ordinary LVQ network has the good feature that it learns input-output relationships fast, but it sometimes mis-learns in the reinforcement learning framework. The modified LVQ algorithm overcomes this problem by imposing a limit on how close a weight vector can approach an input vector, based on the amount of reward that the action gets. Moreover, in order to represent various actions in the real world, the action of the agent is generated from basic actions using fuzzy sets. From the results of the example problem, it has been found that the modified LVQ network performs much better than the ordinary LVQ network. This research was partly supported by the 21st Century COE Program 'Reconstruction of Social Infrastructure Related to Information Science and Electrical Engineering'.
References 1. Sutton, R. S. and Barto, A. G.: Reinforcement Learning: An Introduction, MIT Press, 1998. 2. Watkins, C. J. C. H. and Dayan, P.: "Technical Note: Q-Learning", Machine Learning, 1992. 3. Shon, M. K., Murata, J. and Hirasawa, K.: "Behavior Learning of Autonomous Robots by Modified Learning Vector Quantization", Trans. of the Society of Instrument and Control Engineers, 2001. 4. Kohonen, T.: "Self-Organizing Maps", Springer Series in Information Sciences, 1995. 5. Kohonen, T.: "Learning Vector Quantization for Pattern Recognition", Technical Report TKK-F-A602, 1990.
Some Experiences with Change Detection in Dynamical Systems Theodor D. Popescu National Institute for Research and Development in Informatics 8-10 Averescu Avenue, Bucharest 011455, Romania [email protected]
Abstract. The problem of change detection in the poles of signals having unknown time-varying zeroes is addressed. Instead of a test statistic based on the standard likelihood ratio approach, which is of no help in this case because of the unknown zeroes, we use a test statistic based on an identification method (instrumental variables) that is known to decouple the AR and MA coefficients of the model. The procedure has been used for change detection in the modal characteristics of a vibrating structure, a multi-story reinforced concrete building, subjected to a strong seismic motion.
1 Introduction For complex mechanical structures, such as offshore platforms, bridges, buildings and dams, it is of crucial interest to monitor the vibrating characteristics without using artificial excitation, in the usual functioning mode under natural excitation (waves, wind, earthquakes, etc.). The vibrating characteristics of a mechanical structure reflect its state of health, and any deviation in these characteristics provides information of importance about its functioning mode. The main difficulty in this problem is that the measured signals reflect both the nonstationarities due to the surrounding excitation, which is always highly time-varying, and the nonstationarities due to changes in the eigencharacteristics of the mechanical object itself. In this case, what is of interest is the detection of changes in a subset of model parameters while the complementary subset of model parameters is completely unknown and thus has to be treated as nuisance parameters. The change detection and isolation problem can be formulated as follows: using an ARMA model with nonstationary unknown MA coefficients to model the excitation, detect a change in the AR part, assumed stationary, and determine which AR coefficients or which poles have changed (the isolation problem). The change detection procedure applied in this paper uses a test statistic based on an identification method (instrumental variables), which is known to decouple the two types of coefficients. A test statistic based on the standard likelihood ratio approach is of no help in this case because of the unknown MA part.
It can be shown that the central limit theorem holds for the statistic used, under both hypotheses: the null (no change) and the local alternative (small change). This gives a test statistic for a global test (change in the AR part), which will be used in a case study whose object is change detection in the modal characteristics of a vibrating structure, a multi-story reinforced concrete building, subjected to a seismic motion (Vrancea earthquake, Romania, 1986).
2 Fault Detection Procedure Let the observed signal be described by the following scalar ARMA model:
y_t = a_1 y_{t-1} + ... + a_p y_{t-p} + e_t + b_1(t) e_{t-1} + ... + b_q(t) e_{t-q}  (1)
where e_t is a Gaussian white noise with constant variance. We suppose that the unknown moving average coefficients b_j(t) are time-varying, and may even be subject to jumps. The problem to be solved is the detection of abrupt changes or jumps in the autoregressive parameters a_1, ..., a_p. We shall first recall the main result concerning the identification problem, which represents the starting point of the detection procedure.
2.1 Identification of the AR Coefficients
Assume that a single record of the process is available. It was proved that the instrumental variable method for identification [1] provides consistent estimates of the autoregressive parameters in the present framework. More precisely, the least squares solution of the linear system built from the empirical Hankel matrix of the process is a consistent estimate of the true vector parameter (a_1, ..., a_p) of model (1). This result does not require any stationarity assumption about the moving average parameters [1]. In this sense, the identification method of the AR part may be thought of as being robust with respect to the unknown MA part.
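As an illustration of the decoupling principle (though not the exact Hankel-matrix construction of [1]), the following sketch estimates the AR part of an ARMA(p, q) record by instrumental variables, using past outputs delayed beyond the MA order as instruments.

import numpy as np

def iv_ar_estimate(y, p, q):
    # Instrumental-variable AR estimation for an ARMA(p, q) record y:
    # instruments z_t = (y[t-q-1], ..., y[t-q-p]) are uncorrelated with the
    # MA noise terms, so the estimate does not depend on the (possibly
    # time-varying) MA coefficients.
    T = len(y)
    rows = range(p + q, T)
    Phi = np.array([[y[t - i] for i in range(1, p + 1)] for t in rows])
    Z = np.array([[y[t - q - i] for i in range(1, p + 1)] for t in rows])
    b = np.array([y[t] for t in rows])
    return np.linalg.solve(Z.T @ Phi, Z.T @ b)   # IV normal equations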
2.2 Change Detection in the AR Parameters
The use of standard observation-based likelihood ratio techniques for the change detection problem requires the identification of all the ARMA model coefficients. Because of the highly varying features of the MA coefficients, especially for excitations like wind, waves or earthquakes, this approach does not seem appropriate. Moreover, the Fisher information matrix of an ARMA model is not block diagonal. This means that there is an interaction between the AR and MA coefficients, and a coupling effect between the detection of changes in poles and in zeroes; therefore it is not convenient to use likelihood methods [2] for detecting changes in the poles when the zeroes have to be viewed as nuisance parameters. Based on the robustness properties of the identification procedure with respect to the nuisance parameters, the following off-line change detection procedure is proposed [3]. Assume that a reference model parameter has been estimated on a record of the signal, and consider the following problem: given a new record of the signal, decide whether it follows the same model or not. The solution proposed is the following: compute again the empirical Hankel matrix corresponding to this new record, and analyze the vector defined by:
If there has been no change in the AR part, this vector should be close to zero; in case of a change in the AR parameters, it should be significantly different from zero. The vector can be rewritten in a numerically more efficient way in terms of the moving average part of the process. Under the hypothesis of no change (i.e., when the reference parameter represents the AR part of the actual process), the vector is orthogonal to the instruments and its covariance matrix can be computed explicitly. Finally, a normalization matrix is formed from these quantities.
Despite the fact that the process, and thus the statistic, is nonstationary, it turns out that the nonstationary law of large numbers and the central limit theorem hold [4]. In this case, the use of the local approach for detecting changes [5] reduces the original problem to the problem of detecting a change in the mean value of a Gaussian process. Considering the generalized likelihood ratio [2] as the decision rule for this new problem, maximization with respect to all possible magnitudes of change is straightforward, and leads to the global test [3].
Two special cases, AR(p) and ARMA(p,p-1), are investigated in [3].
3 Case Study The previously described procedure has been applied to data representing the transversal (N-S) and longitudinal (E-W) components of the acceleration recorded at the roof level of a 12-storey reinforced concrete building during the August 30-31, 1986 Vrancea earthquake, Romania. Some preliminary identification results [6] obtained in this case point out that the transfer function corresponding to the input in the direction of the output is dominant, suggesting that the building can be investigated, from the change detection point of view, as two decoupled systems. The change detection procedure was therefore applied to 2 single-input single-output systems, corresponding to the investigated directions of seismic wave propagation: N-S and E-W. The total number of data points used was 2000 and the sampling period was 0.02 sec. The data are represented in Fig. 1 and Fig. 2. The Fourier amplitude spectra of the seismic acceleration components at the basement indicate that their frequency content is mainly in the range 0-5
Fig. 1. Input-output data: N-S direction
Fig. 2. Input-output data: E-W direction
Fig. 3. Test statistics and segmentation for the N-S acceleration
Hz for both components. Therefore, the data were lowpass filtered using a zero-phase, second-order Butterworth filter with a cut-off frequency of 5 Hz. The main idea used in this application consists of comparing two AR models estimated at different locations in the data, in order to detect changes in the structure dynamics for the investigated directions. One AR model was estimated in a fixed data window of 200 sample points, and another AR model was estimated in a moving data window of the same size. When the models are too different from each other, according to the test statistic presented, a change is detected in the system dynamics, the second model becomes the reference model, and so on. The model order has been chosen, and the necessary software support was provided by the CHANGE program package [7]. The analyzed output accelerations, as well as the change instants marked by vertical lines, are represented in Fig. 3 and Fig. 4, corresponding to the investigated directions. The same figures also show the evolution of the test statistic.
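The segmentation scheme just described can be summarized as follows. The sketch reuses the iv_ar_estimate routine from Section 2, and the parameter-distance test and threshold are crude stand-ins for the chi-squared-type statistic actually used; the default orders are arbitrary.

import numpy as np

def segment(y, p=4, q=3, win=200, step=50, threshold=1.0):
    # Fit a reference AR model in a fixed window, track a model in a moving
    # window, and declare a change when the two differ too much; then the
    # moving-window model becomes the new reference.
    changes = []
    ref = iv_ar_estimate(y[:win], p, q)
    for start in range(win, len(y) - win, step):
        cur = iv_ar_estimate(y[start:start + win], p, q)
        if np.linalg.norm(cur - ref) > threshold:
            changes.append(start)   # change instant (sample index)
            ref = cur
    return changes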
The frequency responses of the structure, determined for the quasi-stationary data segments where the system dynamics is unchanged according to the obtained results, are given in Fig. 5 and Fig. 6, for the transversal and
Fig. 4. Test statistics and segmentation for the E-W acceleration
Fig. 5. Frequency response evolution for N-S direction
Fig. 6. Frequency response evolution for E-W direction
longitudinal directions, respectively. In these figures the spectra are represented at the instants corresponding to the detected change times (see Fig. 3 and Fig. 4).
4 Conclusions The problem of change detection in the modal characteristics of nonstationary (scalar) signals has been addressed, with application to the monitoring of a multi-story reinforced concrete building during a strong seismic motion. The initial problem is translated into an equivalent problem: detecting changes in the poles of an ARMA model having nonstationary unknown moving average coefficients. On the basis of the results reported here, it is tentatively concluded that this approach is capable of adequately detecting the main change instants in the dynamic characteristics of systems subject to nonstationary unknown excitation such as waves, wind and earthquakes. The extension of the method to the case of vector signals is under study.
References 1. Benveniste, A., Fuchs, J. J.: Single Sample Modal Identification of a Nonstationary Stochastic Process. IEEE Trans. Automatic Control, 30 (1985), 66-74 2. Basseville, M., Nikiforov, I.: Detection of Abrupt Changes: Theory and Applications. Prentice Hall, Information and System Sciences Series, New York (1993) 3. Basseville, M., Benveniste, A., Moustakides, G.: Detection and Diagnosis of Abrupt Changes in Modal Characteristics of Nonstationary Digital Signals. Rapport de Recherche, no. 348, INRIA, Centre de Rennes (1984) 4. Moustakides, G., Benveniste, A.: Detecting Change in the AR Parameters of a Nonstationary ARMA Process. Stochastics, 16 (1986), 137-155 5. Nikiforov, I.: Modification and Analysis of the Cumulative Sum Procedure. Automatika i Telemekanika, 41 (1980), 74-80 6. Popescu, Th., Demetriu, S.: Identification of a Multi-story Structure from Earthquake Records. Preprints 10th IFAC Symp. on System Identification, SYSID'94, Copenhagen (1994) 411-416 7. Popescu, Th.: CHANGE – Software Support for Change Detection in Signals and Systems. Preprints 11th IFAC Symp. on System Identification, SYSID'97, Kitakyushu, Fukuoka (1997) 1619-1625
The KAMET II Approach for Knowledge-Based System Construction Osvaldo Cairó and Julio César Alvarez Instituto Tecnológico Autónomo de México (ITAM) [email protected] [email protected]
Abstract. Problem-solving methods are ready-made software components that can be assembled with domain knowledge bases to create application systems. In this paper, we describe this relationship and how it can be used in a principled manner to construct knowledge systems. We have developed ontologies for two purposes: first, to describe knowledge bases and problem-solving methods as independent components that can be reused in different application systems; and second, to mediate knowledge between the two kinds of components when they are assembled into a specific system. We present our methodology and a set of associated tools that have been created to support developers in building knowledge systems and that have been used to conduct problem-solving method reuse.
1 Ontologies Ontologies provide a structured framework for modeling the concepts and relationships of some domain of expertise. Ontologies support the creation of repositories of domain-specific reference knowledge (domain knowledge bases) for communication and sharing of this knowledge among people and computer applications. Ontologies also provide the structural basis for computer-based processing of domain knowledge to perform reasoning tasks; in other words, they enable the actual use of domain knowledge in computer applications. Problem-solving methods provide reusable reasoning components that participate in the principled construction of knowledge-based applications.
2 Problem-Solving Methods Overcoming the limitations of both rule-based systems and custom programs, problem-solving methods (PSMs) were introduced as a knowledge engineering paradigm to encode domain-independent, systematic and reusable sequences of inference steps involved in the process of solving certain kinds of application tasks with domain knowledge. Over the years, the knowledge-engineering community identified and developed PSMs of general usefulness or for specific high-level tasks such as classification,
diagnosis, design, and planning. A PSM that the knowledge-acquisition community has studied at length is generate-and-test, a method that conducts a state-based search to perform constraint-satisfaction problem solving. More specifically, generate-and-test calculates a valid configuration of domain parameters iteratively, by assigning values to the parameters, verifying that the resulting configuration does not violate domain constraints among the parameters, and revising the parameter assignments according to constraint-fixing procedures. This method has been applied to diagnosis tasks as well, where the objective is to find the fault that causes a system to malfunction: the diagnosis task is decomposed into smaller subfunctions, detailed later, which are executed under a generate-and-test strategy.
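To make the generate-and-test loop concrete, the following minimal Python sketch (our illustration, not code from the paper) enumerates candidate assignments of values to parameters and tests each against the domain constraints; the constraint-fixing (revision) step is omitted for brevity, and the toy parameters and constraint are hypothetical.

```python
# A minimal generate-and-test sketch for constraint satisfaction.
# The domains and the constraint below are hypothetical illustrations.
from itertools import product

def generate_and_test(domains, constraints):
    """domains: parameter name -> list of candidate values;
    constraints: predicates over a candidate assignment (a dict)."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):    # generate step
        candidate = dict(zip(names, values))
        if all(check(candidate) for check in constraints):  # test step
            return candidate                                # valid configuration
    return None                                             # no valid assignment

# Toy usage: two parameters whose sum must not exceed a limit.
print(generate_and_test(
    domains={"x": [1, 2, 3], "y": [1, 2, 3]},
    constraints=[lambda a: a["x"] + a["y"] <= 4],
))
```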
3 KAMET Architecture
The KAMET II methodology [2, 3] relies on a conceptual and formal framework for the specification of knowledge-based systems. The conceptual framework is developed in accordance with the CommonKADS model of expertise [10]; the formal means applied are based on combining variants of algebraic specification techniques and dynamic logic [8]. This conceptual framework is a four-component architecture that defines different elements to solve diagnosis problems. The framework consists of the following elements: a task that defines the problem to be solved by the knowledge-based system, a problem-solving method that defines the reasoning process of the knowledge-based system, and a domain model that describes the domain knowledge of the knowledge-based system. Each of these elements is described independently to enable the reuse of task descriptions in different domains, the reuse of problem-solving methods for different tasks and domains, and the reuse of domain knowledge for different tasks and problem-solving methods. The fourth element of a specification of a knowledge-based system is therefore an adapter, which is necessary to adjust the three other (reusable) parts to each other and to the specific application problem. This element is used to introduce assumptions and to map the different terminologies. Fig. 1 shows the architecture; [8] gives a detailed explanation of these elements.
4 An Ontology-Based Approach to Developing Knowledge Systems
The use of ontologies in constructing a knowledge system is pervasive. At the very least, ontologies support the modeling of the domain-knowledge component that is the counterpart of PSMs in knowledge applications. However, PSMs and domain ontologies are developed independently and therefore need to be reconciled to form a coherent knowledge system. As the basis for reconciliation, PSMs declare the format and semantics of the knowledge that they expect from the domain to perform their task [6]. A PSM provides a method ontology that makes explicit its input and output knowledge requirements, independently of any domain. For instance, the generate-and-test PSM
Fig. 1. The KAMET II Architecture
declares its input-knowledge needs in terms of state variables, constraints and fixes. In this way, the method ontology assigns roles that the domain knowledge needs to fill so that the PSM can operate on that knowledge. Further, the method ontology states the assumptions that the PSM makes about domain knowledge. Besides making all domain knowledge requirements explicit, refined versions of the PSM can be modeled directly by weakening or strengthening its assumptions by way of additional sets of ontological statements - the adapter component [6].
Fig. 2. Use of Ontologies in KAMET
To avoid impairing the independence of either the domain or the method ontologies, this approach includes a mediating component. This third, separate knowledge component holds the explicit relationships between the domain and the method ontologies assembled in a specific knowledge application [6]. Underlying this
mediating component is a mapping ontology that bridges the conceptual and syntactic gaps between the domain and method ontologies. [6] studies this in depth.
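As an illustration of the mediating component, the sketch below (an assumption of ours, not KAMET II code; all names are hypothetical) shows a mapping ontology as a simple translation table from domain-ontology terms to the roles declared by a method ontology, keeping both sides independent.

```python
# Hypothetical mapping ontology: domain terms -> generate-and-test roles.
DOMAIN_TO_METHOD = {
    "economic_indicator": "state_variable",
    "recession_rule":     "constraint",
    "policy_adjustment":  "fix",
}

def mediate(domain_kb):
    """Translate a domain knowledge base into method-ontology roles,
    so the PSM can operate on the knowledge without knowing the domain."""
    method_kb = {}
    for term, items in domain_kb.items():
        role = DOMAIN_TO_METHOD[term]
        method_kb.setdefault(role, []).extend(items)
    return method_kb

print(mediate({"economic_indicator": ["GDP", "unemployment"],
               "recession_rule": ["GDP contracts for two quarters"]}))
```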
5 Modeling a Diagnosis Task in KAMET
This section presents an example of how to model a diagnosis task by means of KAMET II. We take a simple economic problem: determining whether a country's economy is going through a recession. A recession is a prolonged period, typically from six months to two years, during which a nation's economy is slowing down or contracting; the economy is the production and consumption of goods and services. Before continuing, the symbols of KAMET are presented in order to make the example comprehensible. The symbols of KAMET are used for visually modeling knowledge models and problem-solving methods, and they are the means by which the concepts needed by ontologies are modeled. The KAMET II CML has three levels of abstraction: the first corresponds to structural constructors and components, the second to nodes and composition rules, and the third to the global model [2]. Tables 1, 2 and 3 present them.
Fig. 3 shows the causal chain that explains the behaviour of a nation's economy.
Fig. 3. When things go wrong
Fig. 4. When things go well
The knowledge base of this application, expressed in the KAMET II Conceptual Language, is presented below.
Fig. 5. Brief economy knowledge base
To meet the reuse needs of knowledge bases, the ontology architecture presented earlier must be applied. We therefore present an ontology description of the domain model in Fig. 6.
Fig. 6. The domain model
The problem-solving method presented here is generate-and-test. The diagrammatic representation in KAMET of this method is presented below.
Fig. 7. The problem-solving method
6 Conclusions
We presented the KAMET II capabilities for modeling domain knowledge as well as reasoning knowledge. We showed the facilities provided by KAMET for visual modeling, as well as the architecture that enables domain knowledge and
problem-solving knowledge reuse. We are currently working on a tool for taking advantage of these characteristics of the KAMET methodology.
References
1. Angele, J., Fensel, D., and Studer, R.: Developing Knowledge-Based Systems with MIKE. Journal of Automated Software Engineering.
2. Cairó, O.: A Comprehensive Methodology for Knowledge Acquisition from Multiple Knowledge Sources. Expert Systems with Applications, 14 (1998), 1-16.
3. Cairó, O.: The KAMET Methodology: Content, Usage and Knowledge Modeling. In Gaines, B. and Musen, M., editors, Proceedings of the 11th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop, pages 1-20. Department of Computer Science, University of Calgary, SRGD Publications.
4. Cairó, O., Barreiro, J., and Solsona, F.: Software Methodologies at Risk. In Fensel, D. and Studer, R., editors, 11th European Workshop on Knowledge Acquisition, Modeling and Management, volume 1621 of LNAI, pages 319-324. Springer Verlag.
5. Cairó, O., Barreiro, J., and Solsona, F.: Risks Inside-out. In Cairó, O., Sucar, L. and Cantu, F., editors, MICAI 2000: Advances in Artificial Intelligence, volume 1793 of LNAI, pages 426-435. Springer Verlag.
6. Crubézy, M., and Musen, M.: Ontologies in Support of Problem Solving. In Staab, S. and Studer, R., editors, Handbook on Ontologies in Information Systems. Springer.
7. Domingue, J., Motta, E., and Watt, S.: The Emerging VITAL Workbench.
8. Fensel, D.: Problem-Solving Methods: Understanding, Description, Development, and Reuse. Lecture Notes in Artificial Intelligence, Vol. 1791. Springer-Verlag, Berlin Heidelberg New York (2000).
9. Medsker, L., Tan, M., and Turban, E.: Knowledge Acquisition from Multiple Experts: Problems and Issues. Expert Systems with Applications, 9(1), 35-40.
10. Schreiber, G., and Akkermans, H.: Knowledge Engineering and Management: The CommonKADS Methodology. MIT Press, Cambridge, Massachusetts, 1999.
11. Schreiber, G., Crubézy, M., and Musen, M.: A Case Study in Using Protégé-2000 as a Tool for CommonKADS. In Dieng, R. and Corby, O., editors, 12th International Conference, EKAW 2000, Juan-les-Pins, France.
12. van der Gaag, L., and Helsper, E.: Experiences with Modeling Issues in Building Probabilistic Networks. In Gómez-Pérez, A. and Benjamins, R., editors, 13th International Conference, EKAW 2002, volume 2473 of LNAI, pages 21-26. Springer Verlag.
A Recursive Component Boundary Algorithm to Reduce Recovery Time for Microreboots
Chanwit Kaewkasi¹ and Pitchaya Kaewkasi²
¹ School of Computer Engineering, Suranaree University of Technology, Nakorn Ratchasima 30000, Thailand, [email protected]
² School of Laser Technology and Photonics, Suranaree University of Technology, Nakorn Ratchasima 30000, Thailand, [email protected]
Abstract. Recovery-Oriented Computing (ROC) is a research area that seeks to cope with faults instead of solving them. It is based on the idea that some unsolvable problems are not problems, but facts. A recent invention from ROC is the Microreboots technique. A microreboot is a server mechanism that reboots a subcomponent of the system when it fails. The main contribution of Microreboots is a reduction of the system's recovery time, because a server employing Microreboots does not need to restart the whole system when it crashes. Microreboots lead to a new principle: the better the modularization of the components, the shorter the recovery time. This paper introduces a new algorithm for clustering and modularizing components to make Microreboots more effective. Our recursive component boundary algorithm is based on a fault-driven approach. We have found that our technique significantly reduces time-to-recovery in a Microreboots system.
1 Introduction
Online systems, such as Internet services, are now important parts of businesses, and many critical services need servers running all the time. It is impossible to build a system that never crashes; a highly available system is difficult to build, but it is possible. Because all bugs cannot be eliminated from software, system failure cannot be avoided. Many techniques have been invented to help developers make better software, such as processes, methodologies, programming languages, and tools, but systems still fail. Software defects cannot be reduced to zero, so we need to cope with these problems instead of solving them. Recovery-Oriented Computing (ROC) [10] is a research area based on the above idea that some unsolvable problems are not problems, but facts. ROC considers the system's crashes, hangs, deadlocks, or infinite loops to be facts that cannot be avoided and that may occur at any time. ROC aims at coping with those facts by inventing both hardware and software techniques. Microreboot is a technique developed in the ROC project: a server mechanism that reboots only subcomponents of the system. When the server fails, it does not need to reboot the whole system; it restarts only the components that caused the failure. Because this technique works at a fine-
grained component level, it reduces the time-to-recovery of the system. There are two key properties identifying a component as Microreboot-compatible: first, the component must have lifecycle-management support; second, it must have a dependency declaration. In the Java 2 Enterprise Edition (J2EE) [11] platform, such components can be Servlet modules, Enterprise JavaBeans (EJB) [12] modules, etc. An additional technique for working with Microreboots is Recursive Recovery (RR) [3]. RR directly relates to Microreboots because it is the technique used for restarting a group of components: a Microreboots agent restarts the failed component, and the related components that caused the fault, recursively using RR. Microreboots suggest a possible enhancement at their open end: better clustering and modularizing of the component packages may significantly reduce the recovery time of the system. In this paper, we propose a new technique for refactoring a deployable component package. The main contribution of this work is an algorithm to improve the availability of an application server that employs the Microreboots technique. Our algorithm analyzes data collected from the application load-testing process. We perform load testing on the application and inject software exceptions during the test to construct a fault tree. A fault tree contains several fault paths and is filled up with fault frequency ratio (FFR) values. This technique is based on previous work [4]. Our technique finally uses the fault tree to find all deployment component boundaries. The rest of this paper is organized as follows: Section 2 discusses related work. Section 3 gives definitions, our algorithm, and an example. Section 4 presents the evaluation results of our approach. We end with conclusions and discussion in Section 5.
2 Related Work
This section discusses various technologies related to our work, starting with the base techniques that our algorithm uses, Microreboots and RR. Then, we provide background information on the autonomous fault propagation inference (AFPI) algorithm [4]. After that, information about current deployable component technology is given.
2.1 Microreboots and Recursive Recovery
Microreboots [2, 5] is a technique to reduce server downtime by rebooting only transiently failed components. It is part of the ROC project [10], which aims at improving system availability. The philosophy of ROC is based on the idea that system failures will happen due to both hardware and software faults. The Microreboots technique reboots the system at the most fine-grained component level, as opposed to the entire system. Recursive Recovery (RR) is the accompanying technique: Microreboots use RR to recursively restart a chain of components by walking through the corresponding fault path. In this step, the technique first creates a set of fault paths, called an f-map, of the application. The authors of [4] developed an algorithm to automatically build an f-map from J2EE deployment
components. The AFPI algorithm constructs all possible fault paths of the application by injecting a set of exceptions after the components have been deployed into the server. When a component fails, the Microreboot agent restarts it. If the system still fails, RR is applied to the corresponding f-map to restart other related components. The authors reported that these techniques significantly reduce time-to-recovery compared to rebooting the whole system [2]. Our technique is based on Microreboots and extends the f-map model by adding FFR values throughout the map. We believe that appropriate modularization of components can reduce the mean time to recovery of the application.
2.2 Deployable Components
Both J2EE [11] and the Microsoft .NET Framework [9] are popular platforms for developing server-side software components, and each has its own way of making a deployable component. In J2EE, a package contains several classes, and a deployable component may be one or more packages packed in a .JAR file. In the .NET Framework, a namespace contains several classes, and a deployable component, called an assembly, contains one or more namespaces; an assembly is in a special dynamic-link library (DLL) format. Both platforms provide techniques and tools to manage their deployable components. In particular, EJB [12] technology allows an application server vendor to define its own way of deploying an EJB component. Two key features are needed for a container to support Microreboots: component lifecycle management and component dependency declaration. Several containers [1, 7, 8], including EJB application servers, have these properties. Repackaging components is normally an unnecessary task, but microrebooting makes it a significant issue. Several applications are packed into a single deployment module; some J2EE applications deploy their WAR and JAR files in a single EAR file [12]. Our idea is to refactor these modules. We follow the Microreboots principle and strongly believe that better component packaging reduces the system's mean time to recovery.
3 Refactoring Algorithm
In this section, we introduce an algorithm to draw boundaries for deployable component modules. It is called Fault-driven Package Refactoring for Microreboots (FPRM). FPRM makes use of a fault frequency ratio (FFR).
3.1 Definitions
We first define the FFR value of a given class (node) c as follows:

$$\mathrm{FFR}(c) = \sum_{p} \frac{\mathrm{FF}_p(c)}{\mathrm{FU}(p)} \qquad (1)$$

where $\mathrm{FF}_p(c)$ is the frequency of failure (FF) of class c over path p, and $\mathrm{FU}(p)$ is the frequency of use (FU) of path p. Thus, the FFR value of class c is the summation, over all paths p, of the FF of c on p divided by the FU of p. We collect all FF and FU values from the load-testing process; the FFR value of each class is then computed using equation (1). The next step is to construct a fault tree and then find all boundaries for the application's components. Each branch of a fault tree has an average fault value, which we call its fault bar.
3.2 Algorithm
The FPRM algorithm is described here. After FFR values have been obtained for all nodes, the algorithm constructs a fault tree with the maximum-FFR node as its root; if there are several maximum-FFR nodes, there are several fault trees. Module boundaries are found by processing the fault tree. The complete algorithm is as follows:

put all nodes into a set N.
repeat
  1. find the node containing the maximum FFR value.
  2. choose the max-FFR node as the root of a fault tree.
  3. choose all 'caller paths' of the root and add them as its branches.
  4. find the fault bar value of each branch.
  5. bound the root with the branch having the maximum fault bar into a module.
  6. N = N - {root} - {x | x is on the selected branch}.
until N is empty.
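The sketch below renders the loop in Python under assumed data structures (our reading of the algorithm, not the authors' implementation): FFR values are precomputed per equation (1), and caller_paths maps each node to the branches of callers leading into it.

```python
# Sketch of FPRM. ffr_values: node -> FFR; caller_paths: node -> list of
# branches, each branch being a list of caller nodes (assumed structures).
def ffr(ff_over_paths, fu_of_paths):
    """Equation (1): FFR(c) = sum over paths p of FF_p(c) / FU(p)."""
    return sum(ff / fu_of_paths[p] for p, ff in ff_over_paths.items())

def fprm(ffr_values, caller_paths):
    remaining = set(ffr_values)
    modules = []
    while remaining:                                        # repeat until N is empty
        root = max(remaining, key=lambda n: ffr_values[n])  # steps 1-2
        branches = [[n for n in b if n in remaining]
                    for b in caller_paths.get(root, [])]    # step 3
        branches = [b for b in branches if b]
        best = max(branches,
                   key=lambda b: sum(ffr_values[n] for n in b) / len(b),
                   default=[])                              # step 4: max fault bar
        modules.append([root] + best)                       # step 5: bound into module
        remaining -= {root, *best}                          # step 6
    return modules
```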
3.3 Example
We illustrate our algorithm with a case study, shown in Fig. 1: a simple Web application containing 3 paths. We employ a set of notations from a special version of Use Case Maps for the application diagram. The paths carry the prefixes a, b, and c respectively; we write a:login for the login path, referred to as a for short. A dotted circle 'a:1/50' on a node means that the node has the value 1/50 over path a. The sum of the dotted-circle values on a node is its FFR value; for example, the FFR value of the Users node is 1/10 + 1/10 + 1/50 = 11/50. After applying the algorithm to the case study, we find that the maximum-FFR node is the Users node (FFR = 11/50). Its fault tree has 2 branches with fault bar values of 1.33 and 10 respectively. Thus, the Users node is clearly packed with the second branch. The fault tree is illustrated in Fig. 2.
A Recursive Component Boundary Algorithm to Reduce Recovery
1239
Fig. 1. The case study contains 3 paths denoted a, b, and c respectively. Each path starts with a shaded circle and ends with a shaded rectangle. A dotted circle denotes the FFR contribution of each node over its path
Fig. 2. The first fault tree has 5 nodes. The shaded nodes are selected by the FPRM algorithm to be packed into the same deployment component. Selection depends directly on the maximum fault bar value
4 Evaluation
Here we evaluate the algorithm described in the previous section, using data gathered from a simulated scenario. The evaluation covers two situations, the best and the worst case. In the worst case, related components must also be rebooted because of fault propagation; in the best case, only the component that caused the failure is rebooted.
A J2EE Web application is commonly packed into separate WAR and JAR files, and from the case study we have one WAR file and one JAR file: there are 5 classes in the WAR file and 3 classes in the JAR file. Assume that loading a class takes a fixed time $t_c$ seconds and restarting a component module takes $t_m$ seconds. We also pack all classes into a single package for evaluation; the recovery time of the single package then equals the cost of reloading all eight classes plus one module restart, and the best-case and worst-case results are the same value. Our algorithm, on the other hand, divides the application into 3 modules; the total time of these modules is shown in Table 1. In the worst case, there are 6 dependencies that cause recursive rebooting, which adds time compared to the best case. In our scenario, we inject 35 faults in 50 runs; exceptions occurred 20 times in the WAR file and 15 times in the JAR file. Table 1 shows the evaluation results. In the best case, our FPRM algorithm is 70% faster than the single component and 42% faster than the normal component packaging. In the worst case, we are 53% faster than the normal technique and approximately 60% faster than the single module.
5 Conclusion and Discussion
Recursive Microreboots [2, 3, 4, 5] can reduce the time-to-recovery of a failed system: the failed component and its related components are restarted recursively, instead of rebooting the whole system. This technique motivates a new approach to software package refactoring. We have presented a new algorithm to refactor deployment components. Based on a fault-driven approach, our technique helps speed up the recovery process in a Microreboots system by grouping components according to their failure frequency. Our technique is called Fault-driven Package Refactoring for Microreboots (FPRM). We have illustrated our algorithm by repackaging an example J2EE application consisting of J2EE Web modules and EJB modules. The evaluation shows that using our technique to modularize the components is better than the normal packaging technique used by software developers. There are several ways to further extend Microreboots. We are extending Microreboots to the cluster environment for J2EE application servers; using our technique with Microreboots in a large-scale cluster environment can greatly increase the availability of the system. We are also investigating applying Microreboots to lightweight Inversion of Control [6] component containers such as Apache Avalon Merlin [1] and the Spring framework [8]. This is possible because these
frameworks support the key properties of Microreboots. Additionally, a way to prevent nested exception throwing during the Microreboots process should be investigated. Our other work in this area is to develop a more robust refactoring algorithm working with an inter-component proxy for the J2EE server, and a new aspect-oriented server architecture to make Microreboots support more EJB servers. This work was supported and funded by Suranaree University of Technology. We also thank the anonymous reviewers for their useful comments.
References
1. Apache Software: Apache Avalon Merlin. 2004, Apache Software Foundation, http://avalon.apache.org/merlin/.
2. Candea, G., J. Cutler, and A. Fox: Improving Availability with Recursive Microreboots: A Soft-State System Case Study. Performance Evaluation Journal, 2004. 56(1-3).
3. Candea, G., et al.: Reducing Recovery Time in a Small Recursively Restartable System. Proceedings of the International Conference on Dependable Systems and Networks (DSN-2002). 2002. Washington, D.C.
4. Candea, G., et al.: Automatic Failure-Path Inference: A Generic Introspection Technique for Internet Applications. Proceedings of the 3rd IEEE Workshop on Internet Applications (WIAPP). 2003. San Jose, CA.
5. Candea, G., et al.: JAGR: An Autonomous Self-Recovering Application Server. Proceedings of the 5th International Workshop on Active Middleware Services. 2003. Seattle, WA.
6. Fowler, M.: Inversion of Control Containers and the Dependency Injection pattern. 2004, http://martinfowler.com/articles/injection.html.
7. Hammant, P., A. Hellesoy, and J. Tirsen: PicoContainer: User Documentation. 2004, PicoContainer, http://docs.codehaus.org/display/PICO/User+documentation.
8. Johnson, R., et al.: Spring Java/J2EE Application Framework: Reference Documentation. 2004, Springframework.org, http://www.springframework.org/docs/springreference.pdf.
9. Microsoft: .NET Framework. 2004, Microsoft Corp., http://www.microsoft.com/net.
10. Patterson, D., et al.: Recovery Oriented Computing (ROC): Motivation, Definition, Techniques, and Case Studies. 2002, Technical Report UCB/CSD-02-1175 of UC Berkeley: Berkeley, CA.
11. Sun Microsystems: Enterprise JavaBeans. 2004, Sun Microsystems Inc., http://java.sun.com/products/ejb.
12. Sun Microsystems: Java 2 Enterprise Edition. 2004, Sun Microsystems Inc., http://java.sun.com/j2ee.
Electric Power System Anomaly Detection Using Neural Networks
Marco Martinelli¹, Enrico Tronci¹, Giovanni Dipoppa², and Claudio Balducelli²
¹ Dip. di Informatica, Università di Roma "La Sapienza", Via Salaria 113, 00198 Roma, Italy
[email protected], [email protected]
² ENEA - Centro Ricerche Casaccia, Via Anguillarese 301, 00060 Roma, Italy
{giovanni.dipoppa, c_balducelli}@casaccia.enea.it
Abstract. The aim of this work is to propose an approach to monitor and protect Electric Power Systems by learning normal system behaviour at the substation level and raising an alarm signal when an abnormal status is detected. The problem is addressed by the use of autoassociative neural networks reading substation measures. Experimental results show that, with the proposed approach, neural networks can learn the parameters underlying system behaviour, and their output can be processed to detect anomalies due to hijacking of measures, changes in the power network topology (e.g. transmission line breaking), and unexpected power demand trends.
1 Introduction
Monitoring and protecting Large Complex Critical Infrastructures (LCCIs) is becoming more and more important, as growing interdependencies among structures and their increasing complexity make them vulnerable to failures and to deliberate attacks. Our goal is to detect anomalies in the dynamics (i.e. the evolution over time) of the measure vectors coming from the substations of an Electric Power System. In this paper, a neural-network-based approach to novelty detection is presented, along the same lines proposed by Thompson et al. [1], but in a different setting. The use of autoassociative neural networks is aimed at learning the normal behaviour of an LCCI's subcomponents, for a low-level, distributed monitoring approach: a dangerous attack or an accidental fault within the system would probably bring significant deviations at this level, thus causing novelty detection. Research, implementation and testing have been developed in the operative framework of the Safeguard¹ project.
¹ Safeguard is a European project investigating new ways of protecting Large Complex Critical Infrastructures, developed by ENEA, QMUL, AIA, LiU, and Swisscom. For further details refer to [5].
2 Basics
2.1 Electric Power Systems
An Electric Power System (EPS) can be seen as a set of nodes, called substations, connected to each other by transmission lines. Each substation, usually monitored by a Remote Terminal Unit (RTU), is composed of several components, each playing a specific role in the power generation/consumption process. Electric power is generated by generators, distributed through transmission lines, and consumed by loads, whose demand may vary hourly, weekly and monthly.
2.2 Artificial Neural Networks
An Artificial Neural Network (ANN) is built from simple, non-intelligent units (neurons) which are connected together and so become able to perform complex signal processing. In the learning phase, an ANN is presented with an input data set and is trained to produce the desired values at its output layer. The training algorithm iteratively modifies the weights on the connections through which signals are transmitted, in order to minimize the gap between the network output and the desired one.
The Autoencoder Model - An autoassociative neural network encoder (or simply autoencoder) has two primary features:
Autoassociative feature: the network is trained to reproduce at the output layer the same values presented as input; for this reason the input and output layers have the same size (i.e. the same number of neural units).
Bottleneck layer: at least one of the hidden layers of the network must be smaller than the input and output layers.
The architecture selected in this work consists of an input layer, 3 hidden layers, and an output layer (see Fig. 1). The three hidden layers form a "feature detection" architecture in which the bottleneck layer plays the key role in the identity mapping, as it forces the network to develop a compact representation of the training data that better models the underlying system parameters.
Fig. 1. Sample of autoassociative neural network encoder
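A minimal sketch of such an autoencoder in Python with Keras is given below; the layer sizes, activations and optimizer are our assumptions, since the paper does not report them, and only the overall shape (equal-sized input and output layers around a three-hidden-layer bottleneck) follows the text.

```python
# Sketch of a 3-hidden-layer autoassociative encoder with a bottleneck.
# Layer sizes and hyperparameters are illustrative assumptions.
from tensorflow import keras

def build_autoencoder(n_measures, bottleneck=4):
    model = keras.Sequential([
        keras.Input(shape=(n_measures,)),
        keras.layers.Dense(16, activation="tanh"),           # mapping layer
        keras.layers.Dense(bottleneck, activation="tanh"),   # bottleneck layer
        keras.layers.Dense(16, activation="tanh"),           # demapping layer
        keras.layers.Dense(n_measures, activation="linear"), # same size as input
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Training is autoassociative: the target equals the input.
# model = build_autoencoder(n_measures=24)
# model.fit(x_train, x_train, epochs=6000, verbose=0)
```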
1244
M. Martinelli et al.
3 Problem Definition
The aim of this work is to build a system able to perform strict on-line monitoring of substations belonging to an electric power network, reading measures from RTUs, and to raise an alarm signal in case of anomaly detection. One of the major difficulties in LCCI monitoring is the non-linear nature of the system's behaviour; the problem becomes even harder when a large number of unpredictable abnormal states can arise, due to local or generalized faults. Numerical methods are usually time- and resource-consuming and may not be suitable for on-line measure monitoring with a small sampling time. Presently, numerical estimation algorithms are often used to rebuild the state of the power system in case of missing and/or corrupted data; however, the state-estimator approach does not address the problem of giving a normal/abnormal state assessment, and in some cases it may tend to hide traces of an ongoing attack or of other anomalies. Moreover, state-estimator efficiency and accuracy depend on the size of the network, and the estimation of the state is often based on prior knowledge about substation sensor reliability. The peculiarities of electric systems and problem-specific features suggested the use of neural networks, as they are able to deal with continuous values coming from physical fields, have good performance, are suitable for on-line data processing, and are able to implicitly learn the underlying aspects of the data featuring the system behaviour.
4 Power Net Monitoring
In our approach, a specific autoencoder is deployed on each substation (e.g. through a software-agent platform). Due to the peculiarities of each substation in terms of components, geographic location, and role in power generation and/or distribution, a different, specific training session is necessary for each deployed neural network.
4.1 Substation Measures as Autoencoder Input
Some data preprocessing is needed in order to feed the autoencoder with substation measures: it can be useful to make a selection among the available measures in order to reduce the input and output layer size, thus saving time in the training phase. In order to obtain some learning of the data variation over time, measures are composed in a sliding window, so that each autoencoder is fed a data vector containing current as well as past measures. The gap between the slots should be wide enough to give a sensible delta of variation for the measured values. Since it turns out that we only need the signal's first derivative, we use two slots in our sliding window. It can be noticed that the granularity of the substation monitoring depends on the sampling rate at which measures are retrieved from the electric field and stored in the buffer. Let $m(t)$ be the substation measure vector at time $t$, $T$ the sampling rate, and $\Delta$ the time gap between the sliding-window slots; then, at each step, one slot is filled with $m(t)$
while the other slot is filled with $m(t - \Delta)$. As can be noticed, there is no dependence between the parameters $T$ and $\Delta$, as past measures are buffered and ready to be processed at the right time (see Fig. 2).
Fig. 2. The sliding window technique: measures coming from the power network are preprocessed and stored in a buffer, so that two measure vectors at once are read by the neural network
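The buffering just described can be sketched as follows (a minimal illustration under the two-slot assumption above; the class and its names are ours):

```python
# Two-slot sliding window: the network input concatenates m(t - delta)
# and m(t), where delta is expressed here as a number of sampling steps.
from collections import deque

class SlidingWindow:
    def __init__(self, gap_steps):
        self.buffer = deque(maxlen=gap_steps + 1)  # keeps delta steps of history

    def push(self, measures):
        self.buffer.append(list(measures))

    def input_vector(self):
        """Return m(t - delta) + m(t) once enough history is buffered."""
        if len(self.buffer) < self.buffer.maxlen:
            return None                            # still warming up
        return self.buffer[0] + self.buffer[-1]
```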
4.2 Autoencoder Output and Novelty Assessment
In the proposed approach, a novelty assessment is obtained by measuring the absolute gap vector between the input data set and the network output, and then performing some post-processing on it (see Fig. 3). A properly trained autoencoder is able to successfully reproduce normal data sets; it will now be shown how the autoencoder output is processed to obtain a measure of the anomaly level. Let $i(t)$ and $o(t)$ be respectively the input and output data vectors of the autoencoder at time $t$, both composed of $n$ measures. First of all, consider the vector $d(t)$ whose components are calculated as $d_j(t) = |i_j(t) - o_j(t)|$, and let $\mu(t)$ be the mean value of the components of $d(t)$ at time $t$, that is $\mu(t) = \frac{1}{n} \sum_{j=1}^{n} d_j(t)$. The value of $\mu(t)$ itself could be used as a measure of the anomaly level detected by the autoencoder, but it is opportune to introduce some smoothing.
Fig. 3. Comparison between the input data vector and the autoencoder output vector. Vector $d$ is the absolute difference between the autoencoder input and output vectors
The smoothing averages this value over a sliding window. Thus, with $W$ being the width of the sliding window, a new value is introduced as:

$$A(t) = \frac{1}{W} \sum_{k=0}^{W-1} \mu(t-k)$$

As some measures in the autoencoder output vector are more sensitive to anomalies than others, the average absolute deviation of the measures from their mean has also been used. Also in this case the value is observed over a sliding window, so that the final calculated value at time $t$ is:

$$D(t) = \frac{1}{W} \sum_{k=0}^{W-1} \sigma(t-k)$$

where $\sigma(t)$ is the average absolute deviation of the measures from their mean at time $t$, that is $\sigma(t) = \frac{1}{n} \sum_{j=1}^{n} |d_j(t) - \mu(t)|$. The values $A(t)$ and $D(t)$ can be used to take a decision about novelty detection. By analyzing these two values during the system's normal behaviour, two threshold values are defined, $\theta_A$ and $\theta_D$: an alarm signal is raised if $A(t) > \theta_A$ or $D(t) > \theta_D$.
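In code, the whole post-processing chain can be sketched as follows (using the notation reconstructed above; the window width and thresholds are illustrative):

```python
# Sketch of the novelty assessment: d = |input - output| per time step,
# mu(t) its mean, sigma(t) its mean absolute deviation, and A(t), D(t)
# their averages over a sliding window of width W.
import numpy as np

def novelty_scores(inputs, outputs, W=5):
    d = np.abs(np.asarray(inputs) - np.asarray(outputs))  # one row per step
    mu = d.mean(axis=1)
    sigma = np.abs(d - mu[:, None]).mean(axis=1)
    kernel = np.ones(W) / W
    A = np.convolve(mu, kernel, mode="valid")             # smoothed mu
    D = np.convolve(sigma, kernel, mode="valid")          # smoothed sigma
    return A, D

def alarms(A, D, theta_A, theta_D):
    """Alarm wherever either smoothed score exceeds its threshold."""
    return (A > theta_A) | (D > theta_D)
```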
5 Experimental Results
Experimental tests have been conducted by implementing the IEEE RTS-96 electric network model [6] in an electric power network simulator.
5.1 Training the Autoencoder
The training session was carried out on a 72-hour data set consisting of 432 training patterns. Using the backpropagation training algorithm, a root mean squared error (RMSE) equal to 0.015 can be reached in about 6000 epochs. Experimental results have shown that such an RMSE value permits the autoencoder to generalize well, recognizing as normal any data with small variation with respect to the training set.
5.2 Testing the Autoencoder
The following experimental tests are aimed at verifying whether it is possible to discriminate the values of $A(t)$ and $D(t)$ generated when processing normal data from those obtained when processing data containing anomalies.
Normal Behaviour Data - The data set used to test the network during normal behaviour consisted of vectors obtained by simulating a load demand with an added zero-average, uniformly distributed random perturbation. For robustness, calling $L(t)$ the nominal load demand value at time $t$, at each step the simulation is
executed with a load demand value $L(t)(1+\epsilon)$, where $\epsilon$ is randomly generated in a suitable interval (e.g. $[-0.01, 0.01]$ for a 1% random perturbation). Several test sets have been generated with the above criteria: the trained neural network was able to generalize well and successfully reproduced the substation behaviour. As stable (small) values for $A(t)$ and $D(t)$ were obtained, it has been easy to choose the thresholds $\theta_A$ and $\theta_D$ to discriminate normal values of $A(t)$ and $D(t)$ from abnormal ones. The next step is to prove that, using the autoencoder and the chosen thresholds, it is possible to raise an alarm signal in case of anomaly.
Novelty Detection - One of the critical points of this work is the lack of a priori knowledge about system behaviour in case of anomaly. To test the autoencoder on non-normal values, the following kinds of editing have been made on the data sets:
1. introduction of random noise on each measure vector;
2. changing the curve shape of the load demand over time;
3. changing the electric network topology or component status.
The first approach is aimed at verifying whether the relationships among the data have been embedded in the autoencoder connection weights. The original measure vector produced by the simulation has been perturbed by varying each component value by a certain percentage. With $m(t)$ being the measure vector at time $t$, the value of each component has been recalculated as $m_j(t)(1+\eta_j)$, where $\eta_j$ is a zero-average, uniformly distributed random noise. The results of this kind of test are particularly significant: the values of $A(t)$ and $D(t)$ given by the post-processing of the autoencoder output are very different from the normal ones even when the random noise introduced is just 2%. Plots of the anomalous values and thresholds are shown in Fig. 4 (middle plot). Several simulations have been performed scheduling different demand curves: the right side of the plot in Fig. 4 shows the values of $A(t)$ and $D(t)$ becoming greater than the established thresholds where the demand trend differs from the learned one.
Fig. 4. Plots of $A(t)$ (solid) and $D(t)$ (dashed) in case of an anomalous state: generator fault in a neighbouring substation (on the left), data hijacking with 2% of noise (center), and unexpected demand trend (on the right); threshold lines are also plotted
With the third data set we want to investigate whether the autoencoder is able to detect variations in the network topology (e.g. due to a transmission line breaking or a power generator fault). The left side of the plot in Fig. 4 shows peak values in correspondence with a (simulated) power generator fault in a substation belonging to the same area as the one monitored by the autoencoder. The values of $A(t)$ and $D(t)$ during normal system behaviour can be seen in the portions of the plots outside the anomaly intervals.
6 Conclusions
As shown, using autoassociative neural networks it is possible to build a module to monitor electric power system substations: after training on data concerning the components' normal activity, the autoencoder becomes able to map the system behaviour. Through the processed values $A(t)$ and $D(t)$, the proposed approach successfully detects anomalies in the measures due to sensor failures or intentional data hijacking, network topology changes (e.g. component breaking or transmission line interruption), and unexpected power demand trends.
References
1. B. Thompson, R. Marks, J. Choi, M.A. El-Sharkawi, M. Huang and C. Bunje, Implicit Learning in Autoencoder Novelty Assessment, International Joint Conference on Neural Networks, 2002 IEEE World Congress on Computational Intelligence, Hawaii, May 12-17, 2002.
2. S. Haykin, Neural Networks: A Comprehensive Foundation (2nd Edition), Prentice Hall, 1998.
3. M. Markou, S. Singh, Novelty Detection: A Review - Part 2: Neural Network Based Approaches, Signal Processing, vol. 83, 2003.
4. T.M. Mitchell, Machine Learning, McGraw-Hill International Editions, 1997.
5. The Safeguard Project website: www.ist-safeguard.org
6. Reliability Test System Task Force of the Application of Probability Methods Subcommittee, The IEEE Reliability Test System - 1996, IEEE Transactions on Power Systems, Vol. 14, No. 3, August 1999.
Capturing and Applying Lessons Learned During Engineering Equipment Installation
Ian Watson
Dept. of Computer Science, University of Auckland, Auckland, New Zealand
[email protected], www.cs.auckland.ac.nz/~ian/
Abstract. This paper describes the implementation of a knowledge management tool to capture and reuse the lessons learned from the installation of engineering equipment. It has been developed as an adjunct to an existing system that uses case-based reasoning to reuse previous engineering installation specifications and designs. The system described lets engineers recall details of installation, commissioning and operational problems with systems. The paper discusses how lessons learned support the reuse and revision processes of the traditional CBR cycle.
1 Introduction
Several papers have been presented describing the author's collaboration on the development of a case-based reasoning (CBR) system, called Cool Air, that supports the installation of HVAC (heating, ventilation and air conditioning) equipment [Watson & Gardingen 1999a & b]. This system has been successfully fielded. It was designed to meet several goals: to reduce the installation specification and quotation time from five days or more to two days, to reduce the margin of error built into pricing and thereby produce more competitive quotations, and to reduce the burden on head-office engineers of checking every detail of every specification. The fielded system met these goals and generated a good return on its investment. However, although it was capturing and reusing engineering designs and specifications, it was not adequately or consistently enabling the lessons learned¹ during design and installation to be applied. This was because the system did not
¹ A lesson learned is only a lesson learned if it has been applied to help prevent a mistake or error; otherwise it is merely a lesson stored [Aha et al., 1999].
proactively offer relevant lessons learned (LL) to engineers. Instead, like most LL repositories, it relied on engineers actively searching for and retrieving relevant LL knowledge. This paper describes enhancements to the Cool Air system to support the proactive delivery of LL knowledge to engineers when that knowledge is likely to be relevant, thus enhancing its knowledge management (KM) role. The enhancements described are currently at the research-demonstrator stage.
2 System Architecture
Cool Air is a distributed client-server system operating on the Internet. On the engineer's (client) side, a Java applet is used to gather the customer's requirements and send them as structured XML to the server. On the server side, a Java servlet uses this information to query the database (approx. 14,000 records) to retrieve a set of similar records. This process takes the original query and relaxes terms in it to ensure that a useful number of records are retrieved from the database. This is similar to the query relaxation technique used by Kitano & Shimazu [1996] in the SQUAD system at NEC, although, as discussed in [Watson 2000], we have improved its efficiency using an introspective learning heuristic.
Fig. 1. System Architecture
The Java servlet then converts the set of records into XML and sends them to the client-side applet, which uses a simple k-nearest neighbour algorithm to rank the set of cases (a sketch of this ranking is given below). Once a matching case is retrieved, the engineer obtains the installation specification files from the company FTP server. These files include CAD drawings, technical
specifications, bills of quantities, contracts and notes (or trouble tickets) made by previous engineers describing problems with the installation, commissioning and operation of the HVAC system. This requires the engineer to proactively download, via FTP, the appropriate trouble tickets and read them. The engineer is not presented with the file names and locations of trouble tickets from other similar installations, which may also be relevant. Consequently, the lessons learned from previous similar installations are not being transmitted and therefore cannot be learned.
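The client-side ranking mentioned above can be pictured with the following sketch (our illustration, not the Cool Air code; the feature weights and local similarity measures are hypothetical):

```python
# Sketch of weighted k-nearest-neighbour ranking of retrieved records.
def local_sim(a, b):
    """Hypothetical local similarity: normalised distance for numbers,
    exact match for symbolic values."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return 1.0 / (1.0 + abs(a - b))
    return 1.0 if a == b else 0.0

def knn_rank(query, cases, weights, k=5):
    def score(case):
        total = sum(w * local_sim(query[f], case[f])
                    for f, w in weights.items() if f in query and f in case)
        return total / sum(weights.values())
    return sorted(cases, key=score, reverse=True)[:k]

# Hypothetical usage: rank past installations against a new enquiry.
best = knn_rank(query={"building": "office", "floor_area": 450},
                cases=[{"building": "office", "floor_area": 500},
                       {"building": "hospital", "floor_area": 450}],
                weights={"building": 2.0, "floor_area": 1.0})
print(best)
```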
Fig. 2. A Portion of a Symbol Hierarchy for Mechanical Heating & Cooling Systems
There are few publications referring to KM specifically in the construction industry, for example Cser et al. [1997], but there is a growing body of work on the application of CBR to KM: in particular, an AAAI workshop on Exploring Synergies of Knowledge Management and Case-Based Reasoning [Aha et al., 1999], a workshop at ICCBR'99 on Practical Case-Based Reasoning Strategies for Building and Maintaining Corporate Memories [Gresse von Wangenheim & Tautz, 1999] and a recent book, Applying Knowledge Management: techniques for building corporate memories [Watson, 2003]. This growing interest is not surprising, since the recognition of similar problems and their solutions is central to both CBR and KM. Moreover, the use of the Internet as a vehicle for supporting distributed KM is becoming more common [Caldwell et al., 1999]. Figure 3 shows parts of three example trouble tickets: one describing the need to reduce noise when installing a system in a residential nursing home, another describing the actual installed diameter of some ducting, and a third describing a problem with a thermostat located too far from a controller. Trouble tickets are indexed by Code (the job type), Location, Client (including a reference for client type) and a list of the equipment and contractors used (not shown in Figure 3).
In each trouble ticket, the problem and its solution are recorded. The trouble tickets are indexed in Cool Air's database by these key features, along with a file reference to the trouble ticket itself.
3 Lessons Learned
The LL system offers proactive, two-stage reminding. In the first stage, when the set of similar installation records (typically between 10 and 20) is sent to the client, all trouble tickets associated with these installations are also sent to the client as XML. Since these installations are similar, it is reasonable to assume that any problems encountered with them may be relevant. Engineers can peruse these and use the information gained to improve the resulting design.
Fig. 3. Three Sample Trouble Tickets
In CBR terms, the trouble tickets are being used to inform the case reuse and case revision (adaptation) processes. Once a specification for the job is finalised, the details of the new project are used to re-search the knowledge repository and obtain trouble tickets that might be relevant to the proposed job type, location, client type, equipment and contractors. This is relevant because the final adapted project specification may include significant variations from the cases upon which it was based; consequently, it is valid to check its proposals for potential installation, commissioning or in-use problems. Retrieval of trouble tickets uses CBR and the same abstraction hierarchies used by the query relaxation algorithm of the Cool Air system. An example hierarchy for mechanical heating and cooling equipment is shown in Figure 2. Using this hierarchy it is easy to see that U31A Athol and U32A Athol are both types of fan coil, are similar, and hence may share similar problems and trouble tickets (see the similarity sketch below). Searching is not performed on the body of the trouble ticket itself: like many CBR systems, Cool Air is feature-based and does not perform textual case-based reasoning [Ashley 1999]. Neither are trouble tickets retrieved using an iterative conversational CBR process [Aha & Breslow 1997]. However, textual CBR, conversational
CBR, or a combination of the two is being considered for future research. During the installation and commissioning of the HVAC system, engineers will be encouraged to create trouble tickets using simple web-based forms. Once the project reference number is known, all the relevant indexed features can be added to the trouble ticket automatically, leaving the engineer free to concentrate on its body. Through the forms interface engineers will be encouraged to record both the trouble encountered and the eventual solutions.
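One plausible way to score similarity over such an abstraction hierarchy is sketched below (our illustration; the hierarchy fragment follows Figure 2, and the Wu-Palmer-style measure is our choice, not necessarily Cool Air's):

```python
# Taxonomy-based similarity: symbols sharing a deeper common ancestor
# in the abstraction hierarchy score higher.
HIERARCHY = {                          # child -> parent, fragment of Fig. 2
    "U31A Athol": "fan coil",
    "U32A Athol": "fan coil",
    "fan coil": "mechanical heating & cooling",
}

def ancestors(node):
    path = [node]
    while node in HIERARCHY:
        node = HIERARCHY[node]
        path.append(node)
    return path                        # node up to the hierarchy root

def taxonomy_sim(a, b):
    """Wu-Palmer-style score: 2*depth(lca) / (depth(a) + depth(b))."""
    pa, pb = ancestors(a), ancestors(b)
    lca = next((n for n in pa if n in pb), None)
    if lca is None:
        return 0.0
    return 2 * len(ancestors(lca)) / (len(pa) + len(pb))

print(taxonomy_sim("U31A Athol", "U32A Athol"))  # both are fan coils: ~0.67
```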
Fig. 4. Lessons Learned & the CBR-cycle
4 Conclusions
The first stage of the LL enhancements to the Cool Air system has undergone some limited testing and received qualified support. The second stage has not been field tested yet (March 2004), although it has performed satisfactorily in the laboratory. However, I do not underestimate the significant management problems associated with the successful operation of an LL system. Primarily these centre not upon the technology itself, which performs satisfactorily, but upon the management of the process [Davenport, 1997]. Put simply, not all engineers take the time to record problems and their solutions, regarding this activity as a non-value-adding task or, at worst, a threat to their expertise and consequent value to the company. These issues, as many commentators have noticed, are as important to KM as the technology itself. Several methods are being suggested to overcome the reluctance of engineers to create and use trouble tickets, ranging from a system of rewards to encourage compliance to disciplinary penalties to enforce it.
However, CBR has proven itself useful in the retrieval of LL knowledge; moreover, in an interesting synergy, the LL knowledge is useful in guiding both the reuse and the revision (adaptation) processes of CBR. This is illustrated in Figure 4, which shows how LL knowledge first informs the selection of a past case upon which to base the subsequent solution, and then can be used to anticipate problems with that solution.
References
Aha, D., Becerra-Fernandez, I., Maurer, F. & Munoz-Avila, H. (1999). Exploring Synergies of Knowledge Management and Case-Based Reasoning. AAAI Workshop Technical Report WS-99-19. AAAI Press.
Aha, D. & Breslow, L.A. (1997). Refining conversational case libraries. In Proc. Int. Conf. on Case-Based Reasoning, pp. 267-78. Springer-Verlag.
Ashley, K. (1999). Progress in Text-Based Case-Based Reasoning. Invited talk at the Int. Conf. on Case-Based Reasoning. Online (March 2000): www.lrdc.pitt.edu/Ashley/TalkOverheads.htm
Gresse von Wangenheim, C. & Tautz, F. (1999). Practical Case-Based Reasoning Strategies for Building and Maintaining Corporate Memories. Online (March 2000): www.eps.ufsc.br/~gresse/call_ws2.html
Caldwell, N.M.H., Rodgers, P.A. and Huxor, A.P. (1999). Web-Based Knowledge Management for Distributed Design. IEEE Intelligent Systems, September, 1999.
Cser, J., Beheshti, R. and van der Veer, P. (1997). Towards the development of an integrated building management system. Proceedings of the Portland International Conference on Management & Technology: Innovation in Technology Management - The Key to Global Leadership, PICMET.
Davenport, T. (1997). Information Ecology: Mastering the Information and Knowledge Environment: Why Technology is not enough for Success in the Information Age. Oxford University Press, 1997.
Kitano, H., & Shimazu, H. (1996). The Experience Sharing Architecture: A Case Study in Corporate-Wide Case-Based Software Quality Control. In Case-Based Reasoning: Experiences, Lessons, & Future Directions. Leake, D.B. (Ed.), pp. 235-268. AAAI Press/The MIT Press, Menlo Park, Calif., US.
Watson, I. (2000). A Case-Based Reasoning Application for Engineering Sales Support using Introspective Reasoning. In Proc. IAAI 2000, Austin, Texas. AAAI Press. Forthcoming.
Watson, I. (2003). Applying Knowledge Management: techniques for building corporate memories. Morgan Kaufmann Publishers Inc., San Francisco, CA, US.
Watson, I. & Gardingen, D. (1999). A web-based CBR system for heating ventilation and air conditioning systems sales support. Knowledge Based Systems Journal, Vol. 12, Nos. 5-6, pp. 207-214.
Watson, I. & Gardingen, D. (1999). A Distributed Case-Based Reasoning Application for Engineering Sales Support. IJCAI'99, 31st July - 6th August 1999, Vol. 1, pp. 600-605. Morgan Kaufmann Publishers Inc.
Moving Towards a New Era of Intelligent Protection Through Digital Relaying in Power Systems
Kongpan Areerak, Thanatchai Kulworawanichpong, and Sarawut Sujitjorn
School of Electrical Engineering, Suranaree University of Technology, Nakhon Ratchasima 30000, Thailand
[email protected]
Abstract. This paper presents an intelligent approach to digital relaying design. Thanks to the powerful search scheme of the selected intelligent method, digital relays are well discriminated so as to satisfy a large number of constraints that are too complicated for conventional relay setting to handle. The proposed scheme is demonstrated on a small 5-bus power system in which six digital relays are to be discriminated.
1 Introduction
Short-circuit conditions can occur unexpectedly in any part of a power system at any time, due to various physical problems. Such situations cause a large fault current to flow through power system apparatus. The occurrence of a fault is harmful, and the fault must be isolated promptly by a set of protective devices. Over several decades, protective relaying has become the brain of power system protection [1]. Its basic function is to monitor abnormal operations as a "fault sensor": if a fault event occurs, the relay opens a contactor to separate the faulty part from the other parts of the network. To date, power transmission and distribution systems are bulky and complicated. This leads to the need for a large number of protective relays cooperating with one another to assure the secure and reliable operation of the whole system. Therefore, each protective device is designed to perform its action within a so-called "zone of protection" [2]: by this principle, no protective relay is operated by any fault outside its zone if the system is well designed. As is widely known, old-fashioned analogue relays are inaccurate, and it is difficult to establish discrimination among them; relay setting is thus conducted based on the experience of an expert or only a simple heuristic algorithm. However, with the advancement of digital technologies, a modern digital protective relay is more efficient and flexible, enabling fine adjustment of the time-dial setting, unlike the electromagnetic relay [3]. This paper proposes a new discrimination method for digital relaying, based on an efficient search algorithm called Adaptive Tabu Search (ATS) [4], in which the time-dial setting is appropriately adjusted in order to minimise operating time while the discriminated relays remain reliable. The discrimination of digital relaying systems is explained in Section 2, where the ATS method is
employed to achieve the system objective. A case study of 5-bus power system protection, where six digital over-current relays are discriminated, is discussed in Section 3. The last section provides conclusions and future work.
2 Intelligent Discrimination for Digital Relaying
At present, relay parameter setting is wearisome and time-consuming. Although special computer software exists to help power engineers set up the key parameters of relays, it cannot guarantee that the system under consideration has an optimal operating time [5-6]. This leads to the need for an alternative approach able to search for an optimal relay parameter setting through a complicated search space. This setting is intrinsically an offline parameter tuning, so an artificial-intelligence search method is a potential candidate, since computation time is not critical. In this paper, the ATS method is used to find the optimal time-dial setting of digital relays. This method has fast convergence, verified by some intensive works [4,7], and is thus suitable as the optimiser for this adjustment. Given a set of digital relays to be discriminated in the considered power system, the framework of the intelligent discrimination is summarised as follows: 1) perform a steady-state power flow calculation at the maximum load condition for CT-ratio selection for all the relays; 2) calculate all fault conditions and select some cases, or even all of them, depending on the design engineer (the worst-case scenarios must be included); 3) assign the operating curve to all involved digital relays; 4) set the pick-up current for the relays, taking into account the maximum load and the minimum fault current; 5) apply an efficient intelligent search method, here the ATS method, to find the optimal time-dial setting for the digital relays.
2.1 Review of the ATS Method

The Adaptive Tabu Search (ATS) method [4] is a modified form of the original tabu search, proposed by Glover [8] in 1986 especially for combinatorial optimisation problems. The modified version was developed to meet the need for a powerful search method for non-linear continuous optimisation problems. Two features distinguish it from the original: 1) the continuous search space is discretised, and 2) back-tracking and adaptive-radius mechanisms are employed to enhance the overall performance of the search process. Its effectiveness has been verified by several intensive studies [4,7-8]. The ATS algorithm is briefly presented as a flow diagram in the appendix. In this paper, it serves as the optimiser for the relay discrimination problem.
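The flow diagram itself is given in the appendix; purely to illustrate the two distinguishing features named above, a minimal ATS-style loop might look like the Python sketch below. All parameter names and values here are illustrative assumptions, not taken from [4].

import random

def adaptive_tabu_search(objective, lower, upper, n_dims, radius=0.2,
                         shrink=0.5, n_neighbours=20, tabu_size=50, max_iter=500):
    """Neighbourhood search with a discretised tabu memory, back-tracking
    to elite solutions, and an adaptive (shrinking) search radius."""
    x = [random.uniform(lower, upper) for _ in range(n_dims)]
    best_x, best_f = x[:], objective(x)
    tabu, elites, stall = [], [(best_f, best_x[:])], 0
    for _ in range(max_iter):
        # sample candidates inside the current radius, skipping any whose
        # discretised coordinates are on the tabu list
        neighbours = []
        for _ in range(n_neighbours):
            cand = [min(upper, max(lower, xi + random.uniform(-radius, radius)))
                    for xi in x]
            if tuple(round(c, 4) for c in cand) not in tabu:
                neighbours.append(cand)
        if not neighbours:
            continue
        x = min(neighbours, key=objective)
        tabu.append(tuple(round(c, 4) for c in x))
        del tabu[:-tabu_size]                # keep only the most recent entries
        f = objective(x)
        if f < best_f:
            best_x, best_f, stall = x[:], f, 0
            elites.append((f, x[:]))
        else:
            stall += 1
        if stall and stall % 5 == 0:
            radius *= shrink                 # adaptive radius: refine the search
        if stall >= 10:
            x = random.choice(elites)[1][:]  # back-track to an elite solution
            stall = 0
    return best_x, best_f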
3 Case Study, Results and Discussions

A 5-bus power system is set up as the test case, as shown in figure 1. The test requires pre-calculations from a power flow solver and a fault calculator. The power flow calculation is used to obtain the normal operating condition under the maximum load for setting the CT ratios, as shown in figure 1, whereas the fault calculation is used to evaluate the fault current distributed in the power network. Although there are many possible fault locations and types, in this demonstration only two fault conditions are considered; their results are presented graphically in figures 2 and 3. These two fault cases are used to form the objective function for relay discrimination, which is explained in the last part of this section.
Fig. 1. Power flow solution for the maximum load operation
To achieve the optimum relay time grading, the system objective must be formed carefully, with all necessary constraints taken into account. In this paper, the sum of the operating times of all considered relays is the objective function, while the time-grading margins between pairs of associated upstream and downstream relays are inequality constraints, as in equation (1):
minimise $f = \sum_{k=1}^{m} \sum_{i=1}^{n} t_i^{(k)}$, with $t_i^{(k)} = TDS_i \cdot A_i / \big[ (I_{f,i}^{(k)} / I_{p,i})^{B_i} - 1 \big]$,
subject to $t_{u,j} - t_{d,j} \geq \Delta t$, $j = 1, 2, \dots, J$   (1)

where $f$ is the system objective function, $A_i$ and $B_i$ are arbitrary constants of relay $i$, $TDS_i$ is the time-dial setting of relay $i$, $I_{p,i}$ is the pick-up current of relay $i$, $I_{f,i}^{(k)}$ is the fault current seen by relay $i$ for case $k$, $\Delta t$ is the time-grading margin allowance, $t_{u,j}$ and $t_{d,j}$ are the operating times of the upstream and downstream relays of pair $j$, and $n$, $m$ and $J$ are the total numbers of designed relays, fault cases and relay pairs, respectively.
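For a search method such as ATS, equation (1) can be folded into a single scalar by penalising violated grading margins. The Python sketch below does this, reusing operating_time from the sketch in Section 2; the margin value and the penalty weight are illustrative assumptions.

DELTA_T = 0.3     # time-grading margin allowance in seconds (assumed value)
PENALTY = 1.0e3   # weight on violated margin constraints (assumed value)

def discrimination_objective(tds, relays, fault_cases, relay_pairs):
    """tds[i]: candidate time-dial setting of relay i.
    relays[i]: (pick-up current, curve constant A, curve constant B).
    fault_cases[k][i]: fault current seen by relay i in fault case k.
    relay_pairs: (upstream index, downstream index, fault case) triples."""
    times = [[operating_time(tds[i], i_f, relays[i][0], relays[i][1], relays[i][2])
              for i, i_f in enumerate(case)]
             for case in fault_cases]
    total = sum(sum(t_k) for t_k in times)    # sum of all operating times
    for up, down, k in relay_pairs:           # grading-margin constraints
        violation = DELTA_T - (times[k][up] - times[k][down])
        if violation > 0.0:
            total += PENALTY * violation
    return total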
Fig. 2. Fault current distribution for the occurrence of fault at bus 2
Fig. 3. Fault current distribution for the occurrence of fault at bus 5
In this case study, all the digital relays are characterised by the standard inverse operating curve [9] with a 100% pick-up current setting, and the selected CT ratios are shown in Table 1. To minimise the objective function, the search space of the problem is [0.05, 1.00] for the time-dial setting. Optimising the objective function with the ATS method yields the optimal solution (TDS) shown in Table 1, together with the operating times for the two fault cases. The grading graph interpreted from the obtained optimal solution is shown in figure 4, and the convergence of the search process is presented in figure 5.
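Putting the sketches together, a hypothetical driver for the six-relay case could look as follows. The relay data, fault currents and relay pairs below are placeholders (the real values come from the power-flow and fault calculations); only the TDS bounds [0.05, 1.00] are taken from the text.

# Placeholder network data; real values come from steps 1-4 of Section 2.
relays = [(400.0, 0.14, 0.02)] * 6                  # (pick-up, A, B) per relay
fault_cases = [[3200.0, 2800.0, 2500.0, 2100.0, 1800.0, 1500.0],
               [3000.0, 2600.0, 2300.0, 2000.0, 1700.0, 1400.0]]
relay_pairs = [(0, 1, 0), (1, 2, 0), (3, 4, 1), (4, 5, 1)]

best_tds, best_f = adaptive_tabu_search(
    lambda tds: discrimination_objective(tds, relays, fault_cases, relay_pairs),
    lower=0.05, upper=1.00, n_dims=6)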
Fig. 4. Grading graph of the discriminated digital relay system
Fig. 5. Convergence of the search process
4 Conclusions

This paper discusses a new method of discrimination for digital over-current relaying based on an intelligent search method. The proposed setting scheme yields the minimum relay operating time while the relaying system remains reliable and well discriminated. The parameter setting design is simple and consumes little time. In addition, the method is applicable to large and complicated power systems with many constraints to take into account. Although the formulation of this scheme is rather involved, the design engineer is required to set up only the objective function; once it is created, the whole process is carried out by a digital computer. This may lead to improved relay discrimination with fast operating times while the system still works in a secure and reliable operating region.
Acknowledgement

The financial support from Suranaree University of Technology, Nakhon Ratchasima, Thailand, is gratefully acknowledged.
References

1. Blackburn, J.L.: Protective Relaying. Marcel Dekker, 1987
2. Horowitz, S.H., Phadke, A.G.: Power System Relaying. Research Studies Press, 1995
3. Qin, B.L., Guzman-Casillas, A., Schweitzer, E.O.: A New Method for Protection Zone Selection in Microprocessor-Based Bus Relays. IEEE Trans. on Power Delivery 3 (2000), 876-887
4. Puangdownreong, D., Areerak, K-N., Sujitjorn, S., Totarong, P., Srikaew, A.: System Identification via Adaptive Tabu Search. IEEE Int. Conf. on Industrial Technology (ICIT'02), 2 (2002), 915-920
5. So, C.W., Li, K.K., Lai, K.T., Fung, K.Y.: Application of Genetic Algorithm to Overcurrent Relay Grading Coordination. Int. Conf. on Advances in Power System Control, Operation and Management (APSCOM-97), Hong Kong, November 1997, 283-287
6. So, C.W., Li, K.K., Lai, K.T., Fung, K.Y.: Application of Genetic Algorithm for Overcurrent Relay Coordination. IEE Int. Conf. on Developments in Power System Protection, March 1997, 66-69
7. Kulworawanichpong, T., Puangdownreong, D., Sujitjorn, S.: Finite Convergence of Adaptive Tabu Search. ASEAN Journal (accepted 2004)
8. Glover, F.: Future Paths for Integer Programming and Links to Artificial Intelligence. Computers and Operations Research 13 (1986), 533-549
9. IEEE Std C37.112-1996: IEEE Standard Inverse-Time Characteristic Equations for Overcurrent Relays
Appendix
Fig. 6. Flow diagram for the ATS method
Capacitor Switching Control Using a Decision Table for a 115-kV Power Transmission System in Thailand

Phinit Srithorn¹, Kasem Khojulklang², and Thanatchai Kulworawanichpong³

¹ Department of Electrical Engineering, Rajamangala Institute of Technology, Northeastern Campus, Nakhon Ratchasima, 30000, Thailand, [email protected]
² Division of Planning and Operation, Provincial Electricity Authority, Nakhon Ratchasima, 30000, Thailand, [email protected]
³ School of Electrical Engineering, Suranaree University of Technology, Nakhon Ratchasima, 30000, Thailand, [email protected]
Abstract. This paper simplifies the operation of switched capacitor-bank control in a 115-kV power transmission system by using a decision table. A system operator simply looks up the appropriate operating condition in a provided table to decide the action. This control scheme is demonstrated on a 115-kV power transmission system in Thailand to evaluate its use. The scheme is quite simple yet gives satisfactory results, leading to voltage profile improvement and power loss reduction.
1 Introduction

Due to the imperfection of electrical power systems, considerable power losses and voltage drops usually occur along transmission lines in normal operation. Various techniques for voltage control and reactive power compensation have been developed to improve the overall system performance [1]. A set of capacitor banks installed at substations is one possible solution [2]. They are partly or fully switched into the system in order to regulate the substation voltage close to the nominal value and also to reduce the power losses. The switching control depends on the bus voltage and the power factor of the substation. There are a number of works related to switched capacitor-bank control in which the switching pattern is carefully designed. Over the decades, many AI techniques have been applied, such as genetic algorithms [3] and fuzzy logic [4-5]. These require large databases and appropriate, problem-dependent parameter settings, and each new problem requires its own individual adjustment. In this paper, the capacitor switching control problem is handled through a simple scheme called a decision table. The decision table is a method based on operator experience: the switching operation is decided by looking up a table provided by an expert, which serves as a reference for operators working at substations. The control rules are written in this table. The system operator only observes an operating condition of the substation, and the action of the switched capacitor banks can then be easily decided. The decision table is introduced and described in the next section. To demonstrate this control scheme, a 115-kV power transmission system in Thailand is used as the test system. The tests are presented in Section 3, while Section 4 gives the conclusions.
2 Decision Table for Capacitor Switching Control

Consider a substation where switched capacitor banks are installed and ready to use for improving the system performance. The control problem is to decide whether or not capacitor banks need to be switched at that instant and, if switching is required, how many steps of the capacitor bank to switch. Although there are many key factors, such as bus voltage, power factor and load factor, this demonstration uses only the bus voltage to generate a simple decision table [6], as follows. The bus voltage is treated as a logical variable having five attributes (Very High, High, Medium, Low and Very Low; VH, H, M, L, VL). The system operator reads the substation voltage and assigns the attribute of the bus voltage; the capacitor switching action is then simply the instruction given in the decision table for the assigned bus-voltage attribute. In power transmission systems, the system voltage profile is typically regulated within a ±5% variation around the nominal value. An example of the bus-voltage classification for a 115-kV power transmission system is presented in Table 1.
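A minimal Python sketch of this classification is given below. Since the exact band edges of Table 1 are not reproduced in this text, the thresholds here are illustrative only, chosen to be consistent with the stated ±5% regulation around the 115-kV nominal value.

def voltage_attribute(v_bus_kv, v_nominal_kv=115.0):
    """Map a measured bus voltage onto the five logical attributes."""
    ratio = v_bus_kv / v_nominal_kv
    if ratio > 1.05:
        return "VH"          # very high: above the +5% regulation band
    if ratio > 1.02:
        return "H"
    if ratio >= 0.98:
        return "M"           # medium: close to the nominal value
    if ratio >= 0.95:
        return "L"
    return "VL"              # very low: below the -5% regulation band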
The operator at a substation with capacitor banks installed must decide the capacitor switching action according to the bus-voltage condition. Table 2 gives a decision table for this purpose. There are two types of action, depending on the total number of capacitors installed at the substation; a sketch of such a lookup follows.
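A hypothetical rendering of such a lookup, reusing voltage_attribute from the previous sketch, is shown below. The actions are illustrative, since Table 2 itself is not reproduced in this text: positive entries switch capacitor steps in, negative entries switch them out, clamped to the number of steps installed.

ACTION = {"VH": -2, "H": -1, "M": 0, "L": +1, "VL": +2}  # steps to add or remove

def switching_action(v_bus_kv, steps_in_service, total_steps=6):
    """Apply the decision table to the observed bus voltage and return
    the new number of energised capacitor-bank steps."""
    change = ACTION[voltage_attribute(v_bus_kv)]
    return min(total_steps, max(0, steps_in_service + change))

For example, a reading of 109 kV (about 0.95 of nominal, attribute VL) with three of six steps in service would switch two more steps into the system.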
3 Case Study, Results and Discussions

A 115-kV power transmission system of the Provincial Electricity Authority of Thailand in Nakhon Ratchasima, north-eastern Thailand, is used as the test system, as shown in figure 1. There are two capacitor-bank-equipped substations (CHOKCHAI and NONGKI), and the test is conducted by controlling the capacitor banks at these two substations. A daily load curve, as shown in figure 2, is used to generate working conditions, hour by hour, to evaluate the effectiveness of the switching operation.
Fig. 1. 115-kV power transmission system in Nakhon Ratchasima, Thailand
Table 3 presents the capacitor switching operations resulting from the use of the decision tables given in Tables 1 and 2. Note that each of the CHOKCHAI and NONGKI substations has a 6-step capacitor bank installed (2.4 MVAr per step). As can be seen from Table 3, NONGKI is required to switch more capacitor steps into the system even when all of its installed steps are fully switched in. The system performance would therefore be further improved with more capacitor banks at this substation, so there is a need to install additional capacitors at NONGKI substation.
Fig. 2. Daily load curve for the test system
Another 6-step capacitor bank is therefore committed to be installed. Moreover, as about 50% of the total load in the system is connected at HUATALAE substation, it would be useful to install a 6-step capacitor bank at this bus as well, as depicted in figure 3. This leads to test case 2, shown in Table 4.
Fig. 3. Modified test system for test case 2
As can be seen, when the voltage at NONGKI substation is low, one or more capacitor steps are switched into the system. As a result, the voltage at NONGKI and at other parts of the system is lifted, and the total power losses are reduced. This shows that it is worthwhile to install additional capacitor banks at NONGKI substation. In contrast, although the load demand at HUATALAE substation is heavy (about 50% of the system load), the capacitors installed there do not have a significant effect on the voltage at this substation, because a large share of the required reactive power is already supplied by the capacitors at NONGKI substation. This implies that the capacitor bank at HUATALAE substation might be unnecessary for these test-case scenarios.
4 Conclusions

This paper demonstrates the use of a decision table for capacitor switching problems. The switching action depends only upon the operating condition at that instant.
A system operator simply looks up the table entry corresponding to the operating condition to decide what the action should be. As a result, the voltage profile is improved and the total power losses are reduced. Furthermore, the satisfactory results also indicate which substations significantly require additional switched capacitors to be installed.
5 Acknowledgement

Information on the 115-kV power transmission system that appears in this work was provided by the Provincial Electricity Authority of Thailand. In addition, one of the authors, Mr Phinit Srithorn, would like to acknowledge the financial support of the Rajamangala Institute of Technology, Thailand.
References

1. Song, Y.H., Johns, A.T.: Flexible AC Transmission Systems (FACTS). The Institution of Electrical Engineers, London, UK, 1999
2. Collum, M.: Capacitor Bank Control Using Traditional Microprocessor Based Relays. Rural Electric Power Conference, May 2000, D5/1-D5/5
3. Hu, Z., Wang, X., Chen, H., Taylor, G.A.: Volt/VAr Control in Distribution Systems Using a Time-Interval Based Approach. IEE Proc.-Gener. Transm. Distrib. 5 (2003), 548-554
4. So, A.T.P., Chan, W.L., Tse, C.T.: Fuzzy Logic Based Automatic Capacitor Switching for Reactive Power Compensation. Proceedings of the International Forum on Applications of Neural Networks to Power Systems (ANNPS'93), April 1993, 41-46
5. Deng, Y., Ren, X.: Fuzzy Modelling of Capacitor Switching for Radial Distribution Systems. IEEE Power Engineering Society Winter Meeting, Columbus, USA, 830-834
6. Witten, I.H., Frank, E.: Data Mining. Morgan Kaufmann Publishers, San Diego, USA, 2000