Mobile Agents: 6th International Conference, MA 2002, Barcelona, Spain, October 22-25, 2002, Proceedings (Lecture Notes in Computer Science, 2535) 3540000852, 9783540000853


English Pages 220 [213] Year 2002


Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis, and J. van Leeuwen

2535


Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Tokyo

Niranjan Suri (Ed.)

Mobile Agents 6th International Conference, MA 2002 Barcelona, Spain, October 22-25, 2002 Proceedings


Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editor
Niranjan Suri
University of West Florida
Institute for Human and Machine Cognition
40 South Alcaniz Street, Pensacola, FL 32501, USA
E-mail: [email protected]

Cataloging-in-Publication Data applied for

Bibliographic information published by Die Deutsche Bibliothek
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at http://dnb.ddb.de

CR Subject Classification (1998): C.2.4, D.1.3, D.2, D.4.4-7, I.2.11, K.6.5
ISSN 0302-9743
ISBN 3-540-00085-2 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer-Verlag Berlin Heidelberg New York
a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de
© Springer-Verlag Berlin Heidelberg 2002
Printed in Germany
Typesetting: Camera-ready by author, data conversion by PTP-Berlin, Stefan Sossna e.K.
Printed on acid-free paper
SPIN: 10871542 06/3142 543210

Preface

Welcome to the proceedings of the 6th IEEE International Conference on Mobile Agents. MA 2002 took place in Barcelona, Spain and was co-located with the 4th International Workshop on Mobile Agents for Telecommunications Applications. Both events were held at the Universitat Pompeu Fabra, October 22–25, 2002.

Mobile agents may be defined as programs that, with varying degrees of autonomy, can move between hosts across a network. Mobile agents combine the notions of mobile code, mobile computation, and mobile state. Capabilities of mobile agents include:

– Supporting unreliable networks and disconnected operation
– Counteracting low-bandwidth, high-latency communication links
– Deploying new behaviour (through mobile code) and reconfiguring systems on-the-fly
– Distributing processing load across systems
– Improving survivability in the face of network and system failure.

Given the above capabilities, mobile agents (while they may not be referred to as such) are now becoming accepted as a fundamental architectural construct for the design and development of complex adaptive systems that need to operate in highly dynamic environments. Mobile agents also support applications in several domains such as ubiquitous computing, grid computing, remote sensing, data mining, system management, and agile computing.

The conference call for papers included the following areas of interest:

– Security Requirements, Implications, Algorithms, and Implementation
– Resource Control and Management
– Debugging and Visualization Tools
– Tracking, Directory, and Matchmaking Services
– Support for Small Devices
– Deployment and Interoperability
– Scalability
– End-User Configuration / Customization
– Novel Architectures and Infrastructure
– Applications
– Performance Evaluation and Metrics
– Survivable Systems
– Formal Models and Theories of Mobility
– Coordination Models

The conference received 48 submissions in total from authors in many countries. All submissions were carefully reviewed by the Program Committee, consisting of 18 highly distinguished researchers from the mobile agents community. Each paper was reviewed by three program committee members (four in the case of a paper authored by a member of the program committee). Only 13 of the best papers were accepted into the conference and are contained in this volume. All of these papers help push the current state of the art in mobile agents to the next stage. I hope you enjoy reading them!

October 2002

Niranjan Suri

Organization

MA 2002 was co-located with the 4th International Workshop on Mobile Agents for Telecommunication Application (MATA 2002) and was held at the Universitat Pompeu Fabra in Barcelona, Spain.

Steering Committee

David Kotz            Dartmouth College, USA
Friedemann Mattern    ETH Zurich, Switzerland
Amy Murphy            University of Rochester, USA
Gian Pietro Picco     Politecnico di Milano, Italy
Volker Roth           Fraunhofer IGD, Germany
Niranjan Suri         Institute for Human & Machine Cognition, University of West Florida, USA
Giovanni Vigna        University of California at Santa Barbara, USA
David Wong            Mitsubishi Electric Research Labs, USA

Organizing Committee

General & Program Chair:    Niranjan Suri, Institute for Human & Machine Cognition, University of West Florida, USA
Local Arrangements Chair:   Jaime Delgado, Universitat Pompeu Fabra, Spain
Treasurer:                  David Kotz, Dartmouth College, USA


Program Committee

Israel Ben-Shaul      Technion – Israel Institute of Technology, Israel
Walter Binder         Technical University of Vienna, Austria
Ciaran Bryce          University of Geneva, Switzerland
Grzegorz Czajkowski   Sun Microsystems Labs, USA
Robert Gray           Dartmouth College, USA
Doug Lea              SUNY Oswego, USA
Hwa-Chun Lin          National Tsing Hua University, Taiwan
Keith Marzullo        University of California San Diego, USA
Luc Moreau            University of Southampton, UK
Amy Murphy            University of Rochester, USA
Gian Pietro Picco     Politecnico di Milano, Italy
Kimmo Raatikainen     Nokia Research Center and University of Helsinki, Finland
Volker Roth           Fraunhofer IGD, Germany
Ichiro Satoh          National Institute of Informatics, Japan
Christian Tschudin    Uppsala University, Sweden
Giovanni Vigna        University of California at Santa Barbara, USA
David Wong            Mitsubishi Electric Research Labs, USA
Franco Zambonelli     Università di Modena e Reggio Emilia, Italy

Sponsoring Institutions IEEE Technical Committee on the Internet and IEEE Computer Society Universitat Pompeu Fabra

Table of Contents

aZIMAS: Web Mobile Agent System
  Subramanian Arumugam, Abdelsalam (Sumi) Helal, Amar Nalla

Mobile Code in .NET: A Porting Experience
  Márcio Delamaro, Gian Pietro Picco

Mobile Agents and Logic Programming
  Hisashi Hayashi, Kenta Cho, Akihiko Ohsuga

Empowering Mobile Software Agents
  Volker Roth

An Intrusion Detection System for Aglets
  Giovanni Vigna, Bryan Cassell, Dave Fayram

Fine-Grained Interlaced Code Loading for Mobile Systems
  Luk Stoops, Tom Mens, Theo D’Hondt

Improving Scalability of Replicated Services in Mobile Agent Systems
  JinHo Ahn, Sung-Gi Min, ChongSun Hwang

Toward Interoperability of Mobile-Agent Systems
  Arne Grimstrup, Robert Gray, David Kotz, Maggie Breedy, Marco Carvalho, Thomas Cowin, Daria Chacón, Joyce Barton, Chris Garrett, Martin Hofmann

Mobile Intermediaries Supporting Information Sharing between Mobile Users
  Norliza Zaini, Luc Moreau

A Mobile Agent Enabled Fully Distributed Mutual Exclusion Algorithm
  Jiannong Cao, Xianbing Wang, Jie Wu

Using a Secure Mobile Object Kernel as Operating System on Embedded Devices to Support the Dynamic Upload of Applications
  Walter Binder, Balázs Lichtl

Supporting Flexible Data Feeds in Dynamic Sensor Grids through Mobile Agents
  Marco Carvalho, Maggie Breedy

Physical Mobility and Logical Mobility in Ubiquitous Computing Environments
  Ichiro Satoh

Author Index

aZIMAS: Web Mobile Agent System

Subramanian Arumugam, Abdelsalam (Sumi) Helal, and Amar Nalla

Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL-32611, USA
[email protected]

Abstract. Mobile Agents technology opens up new avenues in personalizing and customizing the web experience of users. It provides new possibilities for deploying distributed applications using existing web infrastructure. One reason why mobile agents are not yet popular on the web is the lack of an easily deployable framework that would facilitate their existence. Existing mobile agent systems usually require heavy infrastructure that lacks interoperability if deployed on the Internet. In this paper, we describe aZIMAS (almost Zero Infrastructure Mobile Agent System), a framework that will enable the execution of lightweight mobile agents on the Internet and remove some of the constraints imposed by existing systems. aZIMAS uses existing platform-independent protocols like HTTP to achieve code mobility and agent interaction. Our approach involves adding a minimal infrastructure layer, called Agent Environment (AE), over existing web servers and using web browsers as clients. By basing our framework firmly on existing web servers and browsers, we hope to leverage the pervasiveness of web browsers and servers and achieve similar pervasiveness for mobile agents.

Keywords: Mobile Agents, Agent Systems, World Wide Web

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 1–15, 2002. © Springer-Verlag Berlin Heidelberg 2002

1 Introduction

Mobile Agents technology opens up new avenues in personalizing and customizing the web experience of users. It provides new possibilities for deploying distributed applications using existing web infrastructure.

There are two important factors that we believe will make mobile agents a critical part of the Internet in the near future. First, due to the explosive growth of content on the web, the average user is subjected to an overwhelming amount of information. It has become increasingly necessary for the underlying software to take an active role in presenting useful information. This involves information processing tasks like filtering data based on user preferences and automating common tasks. Many commercial software products attempt to alleviate this problem [11]. With the advent of web services, more resources are available in a structured and easily accessible form than ever before. The second phenomenon is the increasing number of mobile users. Unlike users of connected computers, mobile users often have to deal with a weak network connection and limited resources (power, memory, screen area). Hence, it is attractive to consider options that would move resource- and bandwidth-consuming processes to remote locations.

Some of the problems arising out of both these trends can be attributed to the passive role played by existing software tools, which merely display retrieved information. Through the use of mobile agents, by contrast, one can develop tools that actively participate in a user’s web interaction process. Mobile users can benefit immensely in areas like remote processing and data rendering, whereas the average web user can draw on agents to cope with the information overload. With the Internet having been identified as the most desirable platform for distributed applications, agents can also help realize richer and more dynamic forms of collaboration and cooperation. Given these advantages, the goal of our work is to provide a framework for agents to operate on the Internet and access common web services. The framework should integrate seamlessly with existing infrastructure tools, realizing agents as a natural part of the World Wide Web.

Our previous work [5] investigated the merits of an agent system that is based upon existing web infrastructure software like web servers. The work resulted in the project aZIMAS (almost Zero Infrastructure Mobile Agent System), a mobile agent system based on the Apache web server. The system had a very simple architecture and provided no support for features like agent communication and collaboration. Since the system was custom designed for Apache, deploying it in other web servers required extensive modification. We realized the need to extend the aZIMAS framework. This resulted in our current improved version, which consists of a simple framework called Agent Environment (AE) that can plug into a web server by means of a server extension module specific to that web server.
The contributions of this paper can be summarized as follows:

(1) We demonstrate a new agent system that is based on existing web technologies and requires no additional client-side components.
(2) We present a new technique for integrating agent systems with web servers that is based on a lightweight web server-specific module. This approach enables the deployment of the agent system in any web server with no modification to the agent environment itself.
(3) We investigate the performance implications of deploying an agent system along with a web server.

The rest of this paper is organized as follows. In Section 2, we give an overview of research that deals with web integration of agent systems. In Section 3, we present the architecture of the aZIMAS System with a detailed description of its Agent Environment (AE). Section 4 provides an overview of aZIMAS agents and describes how issues like agent communication, mobility, and security are handled. Section 5 deals with implementation issues. In Section 6, we present some preliminary performance results. We then describe usage examples and application scenarios in Section 7. Section 8 concludes the paper with a note on our current work.

2 Related Work

There are a number of research projects that deal specifically with the issue of integrating mobile agents into web servers. The approaches taken by these projects fall into one of the following two categories:

• Custom Web Server based: Develop a custom web server integrating an agent execution component. The agent component handles agent-related requests. Projects like WASP [1] (Web Agent-based Service Providing) and Agent Server [2] take this approach.

• Script based: Make use of server-side facilities like common gateway interface scripts to launch applications that handle agent requests. The M&M framework [3] and WebVector [4] fall under this category.

In the custom web server approach, the agent system architecture is usually bound tightly to a specific web server. The web server normally contains an embedded agent server environment which enables it to host mobile agents. Agent-specific requests are exchanged via HTTP POST messages with unique MIME types. The web server recognizes messages with specific MIME types as agent related and passes them on to the agent server for further handling. The problem with this approach is that existing web sites may not want to replace their current web infrastructure in favor of a custom-made web server. Custom-made web servers may not match the power, reliability, and efficiency of production-quality commercial web servers. Thus, it is likely that any agent enhancement solution that gains acceptance among web content providers will be based on existing web infrastructure software. Another issue with this approach is that the tight coupling between the web server and the agent architecture makes it difficult to deploy the solution uniformly in a wide variety of web servers.

In projects that make use of server-side scripts, a script normally handles the task of receiving agents and supporting their operation. In the M&M agent system, web deployment is achieved by making use of servlet technology. WebVector uses common gateway interface scripts to receive and transport agents. Whereas the custom server based approach makes the level of coupling tight, this approach tends to decouple the web server and the agent system completely. This has some important performance consequences:

• The web server will incur a significant overhead if the agent system is bulky.
• Since the web server has no awareness of the agent system, neither the agent system nor the web server can take advantage of internal optimizations.
• To access local resources, the agent system will have to act like any other normal HTTP client.

Our approach seeks to strike a compromise between the two approaches discussed above. Our solution consists of an agent environment with a well-defined interface, so that it can easily plug into existing web servers. In our approach, the agent environment remains the same for all web servers. The agent environment is integrated into a web server through the use of a simple server module developed specifically for that server. Though this approach makes the server module specific to each web server, we believe this is not a constraint on the deployment of the aZIMAS AE. This is because the only task performed by the server module is to forward an agent-specific HTTP request to the AE, get the reply back from the AE, and communicate it to the client. Hence, developing a server module for any given web server should be a relatively easy task, as almost all commercial web servers provide a library of routines to help developers create extension modules. We have developed an extension module for the popular Apache web server to route requests to the agent environment. We are currently investigating whether a similar architecture can be achieved on an IIS server, which would need an ISAPI extension that interfaces with the AE.

Using a server module to integrate an AE offers a number of other advantages. The web server usually loads server modules during start-up, so server modules incur little or no overhead in processing a request. In addition, all important server data structures are usually available to the server modules; hence clever optimizations can be done to improve request processing. Finally, by moving web server specific tasks to the server module, we can keep the agent environment the same for any web server.

3 aZIMAS System Architecture

Figure 1 shows the high-level design consisting of both the client and server components of the aZIMAS System.

[Figure 1 labels: Web Browser; Any Web Server; Server Module; aZIMAS Agent Environment (AE)]

Fig. 1. Architecture of aZIMAS

3.1 Client Side Components

Client components in agent systems are typically used by users to launch new agents, check the status of launched agents, and in general to monitor and direct the actions of an agent. In contrast to many available agent systems and to the previous version, the current aZIMAS system does not require a separate client side component for launching agents. Instead, users interact with the aZIMAS Agent Environment using a web browser. The aZIMAS WebInterface module exposes the functionality of the aZIMAS System. Interaction with the AE takes place by means of HTTP POST requests. The request structure that the AE requires for HTTP POST messages is well defined, making it easy for user applications to issue requests to the AE. We have adopted HTTP POST as the basic communication mode for interacting with the AE since it is easy to define a request structure using POST. POST also enables us to deal with arbitrarily large data blocks.

We have also developed a preliminary programming model called Web Agent Programming Model (WAPM) that would enable even non-programmers to use the functionality of the agent system. WAPM defines a simple, very high-level scripting language that can be used to define and direct an agent’s actions.
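To make the client interaction concrete, the following is a minimal sketch of how a user application might compose such an HTTP POST request. The request structure itself is not given in this paper, so the endpoint URL, the form field names (action, agent, script), and the class name are purely hypothetical. The request is built but not sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds (but does not send) an HTTP POST aimed at an AE.
final class AeRequestSketch {

    // Encodes fields as application/x-www-form-urlencoded, in map iteration order.
    static String encodeForm(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // The field names below are placeholders; the real AE request structure is not shown here.
    static HttpRequest launchRequest(String aeUrl, String agentName, String script) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("action", "launch");
        fields.put("agent", agentName);
        fields.put("script", script);
        return HttpRequest.newBuilder(URI.create(aeUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
    }
}
```

A browser form posting to the same URL would produce an equivalent request, which is why no dedicated client component is needed.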

3.2 Server Side Components

The server side components consist of the server extension module and the aZIMAS Agent Environment (AE) (Figure 2).

[Figure 2 labels: AE; Web Content Provider; WebInterface; Logger; HTTP POST Messages From / To Web server Module; AgentServices; Messaging; AgentEngine; Authentication; SecurityManager; Name Registry; AgentAPI; Communication Manager; Agents; Data Store; Resource Controller; AgentSpace]

Fig. 2. aZIMAS Agent Environment (AE)

The server extension module captures HTTP requests that are intended for the AE (identified by a specific request extension) and routes them to the AE. The aZIMAS AE is a minimal layered framework that consists of a Messaging and WebInterface component, an AgentEngine component, and an AgentSpace component. The Messaging component acts like a gateway in the AE. It sends and receives messages between the web server module and the AgentEngine. The WebInterface component acts like a content publisher. It exposes the functionality of the agent environment through static and dynamic web pages and provides the World Wide Web interface to the AE.

The AgentEngine provides services like verification of the request structure of agent messages, and authorization and authentication of incoming messages. The AgentEngine passes incoming agents to the AgentSpace after verification. The AgentEngine also includes a Logger to log system activities and an integrated SecurityManager to control access to the aZIMAS AgentAPI. The AE exposes its services via the aZIMAS API.

The AgentSpace component forms the heart of the aZIMAS agent architecture. AgentSpace directly supports the lifecycle of an agent. It provides all the functionality needed by agents, such as migration, agent communication, and persistence. On receiving agents from the AgentEngine, AgentSpace registers them in a Name Registry and then loads their state and code for execution. AgentSpace keeps track of the various agents in the AE and provides the interfaces to the Agent API. To some extent, AgentSpace also controls the agents' usage of resources like memory and disk space through a Resource Controller. A Communication Manager handles message transfer among agents as well as the transport of agent packets as HTTP messages to other hosts. Storage requirements of agents are handled using a Data Store.

4 Agents in aZIMAS

We define an aZIMAS agent as a program that acts on behalf of an entity and has the ability to move autonomously from one host to another. The entity that owns the agent (the owner) can be a user, a program, or an organization. Agents in aZIMAS can only exist within the context of the AgentSpace in an Agent Environment (AE), which provides the functionality necessary for the agents to operate. In aZIMAS, every agent is associated with a home base, which refers to the AE at which the agent was first created. For an agent, hosts other than the home base are referred to as foreign bases. An aZIMAS agent can be described by the following attributes:

– ID, which uniquely identifies it within an AE
– Home base, the AE at which the agent was first created
– Credentials to prove its authenticity
– Data (like program state and output, itineraries, agent type, etc.) and
– Code that forms the agent program

These attributes are packaged and represented as an AgentTravelPack when an agent is transferred between aZIMAS hosts. In aZIMAS, agents are classified as either Interactive or Non-Interactive:

• Interactive Agents have the ability to respond to the activity of their owners, and possibly other agents. Interactive agents can cooperate with other agents in realizing a common goal. These agents are capable of synchronous and asynchronous collaboration and can be used to build distributed Internet applications. Interactive agents can optionally specify AppletContexts, which provide an applet interface for their owners to direct their actions.

• Non-Interactive Agents are primarily concerned with information sources available at different hosts on the Internet and are suitable for information search and filtering applications. These agents have the ability to replicate or clone themselves. A non-interactive agent and its clones form a group, within which synchronous and asynchronous communication is possible. Non-interactive agents normally do not interact with agents outside their group and typically communicate results and messages to their owners in an asynchronous manner.
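As an illustration of how the agent attributes listed earlier in this section might be packaged for transfer, the sketch below bundles them into a serializable object and round-trips it through Java serialization. The actual layout of AgentTravelPack is not described in this paper, so the class name, field names, and field types here are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for AgentTravelPack; the real class layout is not given in the paper.
final class TravelPackSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String id;            // unique within an AE
    final String homeBase;      // AE at which the agent was first created
    final byte[] credentials;   // proof of authenticity (format assumed)
    final String itinerary;     // part of the agent's data
    final boolean interactive;  // agent type: interactive or non-interactive
    final byte[] code;          // the agent program (e.g. class bytes)

    TravelPackSketch(String id, String homeBase, byte[] credentials,
                     String itinerary, boolean interactive, byte[] code) {
        this.id = id;
        this.homeBase = homeBase;
        this.credentials = credentials;
        this.itinerary = itinerary;
        this.interactive = interactive;
        this.code = code;
    }

    // Serializes the pack so it could be carried as the body of an HTTP message.
    byte[] toBytes() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        }
        return buffer.toByteArray();
    }

    // Rebuilds a pack from bytes received at the destination host.
    static TravelPackSketch fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (TravelPackSketch) in.readObject();
        }
    }
}
```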

Every incoming agent is expected to identify its type (interactive or non-interactive). The purpose of this classification is to aid the aZIMAS AE in load balancing decisions. When the agents in the system push resource usage past a given threshold (for example, on available system memory or disk space), the AE’s Resource Controller is invoked. The Resource Controller typically suspends some agents to ease the strain on the AE. Usually, low-priority agent threads are suspended in no particular order, with preference given to interactive agents and to agents belonging to the AE’s domain (home base agents).
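One possible reading of this policy is sketched below. It assumes that "preference" means interactive agents and home base agents are kept running longest, so non-interactive foreign agents are suspended first. The scoring scheme and all names are illustrative and not taken from the aZIMAS implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative suspension policy; the scoring is an assumption, not the aZIMAS algorithm.
final class SuspensionSketch {

    static final class AgentInfo {
        final String id;
        final boolean interactive;
        final boolean homeBase;

        AgentInfo(String id, boolean interactive, boolean homeBase) {
            this.id = id;
            this.interactive = interactive;
            this.homeBase = homeBase;
        }

        // Higher score means more reason to keep the agent running.
        int keepScore() {
            return (interactive ? 1 : 0) + (homeBase ? 1 : 0);
        }
    }

    // Returns the ids of up to `count` agents to suspend, lowest keepScore first.
    static List<String> pickForSuspension(List<AgentInfo> agents, int count) {
        List<AgentInfo> sorted = new ArrayList<>(agents);
        sorted.sort(Comparator.comparingInt(AgentInfo::keepScore)); // stable sort
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < Math.min(count, sorted.size()); i++) {
            ids.add(sorted.get(i).id);
        }
        return ids;
    }
}
```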

4.1 Agent Communication

For cooperation and collaboration between agents to succeed, we need an effective communication medium. A communication medium facilitates an agent's interaction with other agents and with its owner. aZIMAS provides support for both synchronous and asynchronous agent communication.

Asynchronous communication is implemented through the use of mailboxes. Every agent is allocated a mailbox, which stores incoming messages for the agent. A message sent by an agent is stored in the receiver's mailbox. An agent can do a blocking or non-blocking wait for messages. When an agent blocks waiting for a message, it is notified upon the arrival of a message. If the agent is non-blocking, it has to check its mailbox explicitly to retrieve messages.

Synchronous communication takes place through a rendezvous point established by the aZIMAS system. A rendezvous point opens up a connection between two agents in an AE, through which data transfer is possible.

aZIMAS provides limited location transparency support, made possible through a forwarding service provided by every AE. Thus, when an agent that receives messages asynchronously chooses to migrate to a new location, it will continue to receive messages at its new location.
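The mailbox behaviour described above can be approximated with a standard blocking queue, as in the minimal sketch below. The class and method names are hypothetical, and messages are plain strings for simplicity; the actual aZIMAS mailbox API is not shown in this paper.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical per-agent mailbox built on a standard blocking queue.
final class MailboxSketch {
    private final BlockingQueue<String> messages = new LinkedBlockingQueue<>();

    // Called on behalf of a sender: stores the message in the receiver's mailbox.
    void deliver(String message) {
        messages.add(message);
    }

    // Blocking wait: the calling agent thread sleeps until a message arrives.
    String awaitMessage() throws InterruptedException {
        return messages.take();
    }

    // Non-blocking check: returns the oldest message, or null if the mailbox is empty.
    String checkMessage() {
        return messages.poll();
    }
}
```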


4.2 Event Management

aZIMAS uses a type of publisher-subscriber model for event management. In this model, a publisher acts as an event generator. Subscribers receive events generated by publishers. In aZIMAS, an event manager facilitates event management and acts as the deliverer of events to the subscribers. Publishers register themselves with the event manager and deliver events to it. Agents can query the event manager for available publishers and can subscribe to specific publishers. An agent can act as both a publisher and a subscriber simultaneously. A subscriber in an aZIMAS AE will continue to receive event notification even after it has moved to a different host. Events are Java objects in aZIMAS, which provide a general description of the event along with its details.
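A bare-bones version of this publisher-subscriber arrangement is sketched below. Publishers are identified by name and subscribers are simple callbacks. Both choices, and all identifiers, are assumptions made for illustration; the sketch does not cover delivery to subscribers that have moved to another host.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

// Illustrative event manager; names and callback style are not from the aZIMAS API.
final class EventManagerSketch {
    private final Map<String, List<Consumer<Object>>> subscribers = new HashMap<>();

    // A publisher registers itself before delivering events.
    void registerPublisher(String publisher) {
        subscribers.putIfAbsent(publisher, new ArrayList<>());
    }

    // Agents can query which publishers are available.
    Set<String> availablePublishers() {
        return new TreeSet<>(subscribers.keySet());
    }

    // Returns false if the publisher is unknown.
    boolean subscribe(String publisher, Consumer<Object> subscriber) {
        List<Consumer<Object>> list = subscribers.get(publisher);
        if (list == null) {
            return false;
        }
        list.add(subscriber);
        return true;
    }

    // Delivers the event to every subscriber and returns how many were notified.
    int publish(String publisher, Object event) {
        List<Consumer<Object>> list = subscribers.getOrDefault(publisher, new ArrayList<>());
        for (Consumer<Object> subscriber : list) {
            subscriber.accept(event);
        }
        return list.size();
    }
}
```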

4.3 Mobility and Agent Persistence

Mobile agents, by their very nature, need to have the ability to move from one host to another. aZIMAS supports only weak migration for agents. In weak migration, only the code and data state of the agent are transferred, as opposed to strong migration, where the execution state is transferred as well. Attempts have been made to provide for strong migration [6] by either extending the Java Virtual Machine or capturing the execution state through the use of a backup object. Though strong migration has a number of positive aspects, the high overhead introduced by these techniques makes it unattractive for our purposes.

In aZIMAS, agent migration is handled by the AE. Agents are passive participants in the migration process and cannot react to events that take place during migration. An agent issues a migration request through a move method call. The move method takes an itinerary string as a parameter and, once invoked, never returns. An itinerary string is an ordered triplet (destination_host: method: parameter), indicating the destination host to which the agent wishes to migrate, the method to be invoked upon arrival at the destination, and the parameters to be passed to that method. As of now, only simple parameters like strings and primitive types can be passed as method parameters.

Agent persistence is supported through the use of Java’s serialization mechanism. An agent’s state and code, once serialized, persist physically at the home base. At foreign bases, the serialized agent state and code are maintained only in memory. The agent’s home base always keeps track of the location of all agents belonging to its domain (home agents). This is made possible by an update to the location information at the home base whenever the agent moves between hosts. When an AE receives a foreign agent, it sends the agent’s home base AE a status message indicating the agent’s new location.
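The itinerary triplet can be illustrated with a small parser. The sketch below assumes that the three fields are separated by colons, that the host part itself contains no colon (so no explicit port), and that the parameter may contain further colons. The paper does not spell out these details, and the class name is hypothetical.

```java
// Hypothetical parser for the itinerary triplet; the separator rules are assumptions.
final class ItinerarySketch {
    final String destinationHost;
    final String method;
    final String parameter;

    private ItinerarySketch(String destinationHost, String method, String parameter) {
        this.destinationHost = destinationHost;
        this.method = method;
        this.parameter = parameter;
    }

    // Splits on the first two colons only, so the parameter may itself contain colons.
    static ItinerarySketch parse(String itinerary) {
        String[] parts = itinerary.split(":", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host: method: parameter, got: " + itinerary);
        }
        return new ItinerarySketch(parts[0].trim(), parts[1].trim(), parts[2].trim());
    }
}
```

With these assumptions, `ItinerarySketch.parse("hostB.example.org: collect: news")` yields the host `hostB.example.org`, the method `collect`, and the parameter `news`.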


4.4 Security

A mobile agent system needs to define a proper security framework before it can be trusted and deployed widely, since it will often have to execute arbitrary code from unknown agents. In addition to the security issues related to agents in general, new issues arise when mobile agents are deployed on the web. Fortunately, web security is not as hard to deal with as agent security, and acceptable solutions exist for many problems [8].

To prevent unauthorized web use, the aZIMAS AE requires each user to complete a registration process before being allowed to access agent services. This enables the AE to associate each request with a specific aZIMAS user. Since HTTP is a plain-text protocol, snoopers can gain access to confidential information. Using secure HTTP where possible should mitigate this problem. Note that the current system makes no provision for sharing user information across AEs. Thus, a user registered at a particular AE can use the services of other AEs only indirectly, through agent programs. To make registration information common across AEs, one can adopt a central registration server.

A detailed analysis of security issues in agent systems is provided in [7]. Broadly, there are three main classes of threats in agent systems:

(1) Agent-to-Platform: The agent exploits weaknesses in the agent system to gain unauthorized access to resources, or otherwise launches attacks on the agent system. Numerous techniques have been developed to protect an agent system from harmful agents, such as safe code interpretation, signed code, proof-carrying code, and state appraisal.

(2) Platform-to-Agent: This category of threats represents situations in which the agent’s security is compromised by attacks from the agent system. Preventing this type of attack is generally difficult, since the agent system controls the execution environment of an agent. There are some techniques to counter attacks from the agent system, but they tend more towards detection than prevention.

(3) Agent-to-Agent: In this category, an agent launches an attack against another agent, possibly exploiting a security weakness of the other agent. These types of attacks can usually be tackled using the same techniques that are employed for protecting the agent platform.

The current system does not have any provision to prevent attacks by the platform. We assume that aZIMAS AEs are hosts trusted by the agents. To prevent attacks by agents, aZIMAS employs a combination of techniques. Every agent is expected to carry credentials identifying its owner and its home base. Like many other agent systems, aZIMAS makes use of Java’s built-in security mechanisms to prevent runtime attacks, by providing a Security Manager that controls access to potentially harmful system libraries. aZIMAS also offers some security features to prevent agent-to-agent attacks. These include agent blocking, in which an agent asks the AE to block messages addressed to it by another agent.
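Agent blocking can be pictured as a per-receiver block list consulted by the AE before it delivers a message. The Java sketch below is only an illustration under that assumption; the class BlockList and its methods are hypothetical names, and the paper does not describe how the AE actually stores block requests.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical block list kept by an agent environment (AE). */
final class BlockList {
    // receiver agent id -> ids of senders whose messages must not be delivered
    private final Map<String, Set<String>> blocked = new ConcurrentHashMap<>();

    /** Records that receiver no longer wants messages from sender. */
    void block(String receiver, String sender) {
        blocked.computeIfAbsent(receiver, k -> ConcurrentHashMap.newKeySet()).add(sender);
    }

    /** Lifts a previous block, if any. */
    void unblock(String receiver, String sender) {
        Set<String> senders = blocked.get(receiver);
        if (senders != null) {
            senders.remove(sender);
        }
    }

    /** Checked by the AE before handing a message to the receiver. */
    boolean mayDeliver(String sender, String receiver) {
        Set<String> senders = blocked.get(receiver);
        return senders == null || !senders.contains(sender);
    }
}
```

Blocking is directional in this sketch: an agent that blocks a sender can still send messages to it.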


S. Arumugam, A. Helal, and A. Nalla

5 Implementation

We have implemented most of the framework described in the previous two sections. We have developed the basic agent environment component, which supports the lifecycle of the agents. We have implemented a server extension module for the popular Apache web server. As part of our work, we have also developed a simple web server called Pluto, which interfaces with the AE. Pluto makes it easy to see the project in action even on systems with no installed web server.

The Agent Environment is implemented in Java. Java provides a number of features useful to agent systems, such as a serialization mechanism, on-demand code loading, and a relatively strong security framework, which make it an excellent language on which to base an agent system. Some popular agent systems based on Java [9] include Aglets, Concordia, Voyager, Odyssey, and Ajanta. The web server dictates the choice of language for the server extension module. In our implementation of the Apache module, we have used the C language.

The core of the system is accommodated in five Java classes: AzimasMessaging, AzimasWeb, AzimasAgentEngine, AzimasAgentSpace, and AzimasSecurityManager. The implementation is modeled according to the aZIMAS design described in Section 3. All aZIMAS agents are derived from a basic agent class called Azimas, which helps enforce certain minimum guidelines for the agent programs. Agents are represented as threads in the AE. The base agent class has an attribute called AzimasSpaceInterface, which needs to be set by the Agent Environment. This is a Java object through which the agent accesses many services of the AE. To facilitate invocation as a thread, the system mandates that agents provide a run() method, from which the agent code normally starts executing. Apart from these requirements, the AE’s SecurityManager constrains the agent programs to make calls only to safe library methods.

The aZIMAS API provides a safe way to access many useful functions. The method names closely follow Java library conventions. For example, to issue a request to an external URL, the agent just needs to call the getLocation method, which takes a URL string as a parameter. If successful, the method call returns the data retrieved from the URL.
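The relationship between the base class, the space interface, and the run() entry point might look roughly like the Java sketch below. Only the names Azimas, AzimasSpaceInterface, run, and getLocation come from the text; the signatures, the setter, the FetchAgent class, and the AgentEnvironmentSketch launcher are assumptions introduced for illustration and should not be read as the real aZIMAS API.

```java
/** Assumed shape of the service handle that the AE gives to each agent. */
interface AzimasSpaceInterface {
    String getLocation(String url); // assumed signature: returns the data behind the URL
}

/** Assumed base class: agents are runnable and start executing in run(). */
abstract class Azimas implements Runnable {
    private AzimasSpaceInterface space;

    // In this sketch the agent environment calls this before starting the thread.
    final void setSpace(AzimasSpaceInterface space) {
        this.space = space;
    }

    protected final AzimasSpaceInterface space() {
        if (space == null) {
            throw new IllegalStateException("the AE has not set the space interface");
        }
        return space;
    }
}

/** Hypothetical agent that fetches a single page through the AE. */
final class FetchAgent extends Azimas {
    volatile String result;

    @Override
    public void run() {
        result = space().getLocation("http://example.org/");
    }
}

/** Hypothetical stand-in for the part of the AE that starts agents as threads. */
final class AgentEnvironmentSketch {
    static void launch(Azimas agent, AzimasSpaceInterface space) {
        agent.setSpace(space);
        Thread thread = new Thread(agent);
        thread.start();
        try {
            thread.join(); // wait for the agent to finish, for demonstration only
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Routing every external request through the space object gives the AE a single place to apply its security checks, which matches the intent described above.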

6 Performance Evaluation

In this section, we present some of the performance results obtained after deploying our AE on the Apache web server. The performance of a web server is critical and should not be adversely affected by the presence of additional components.


to be present in all messages directed to the aZIMAS AE. It helps the AE in deciding how to process a request. Depending on the type of the request, the AE may expect additional fields. For example, a message containing agent code is handled differently based on the request type:





- If the request type is createAgent, a new agent needs to be created at the AE. The AE stores the code of the agent in a separate folder created for the user. The code is then loaded and the agent execution begins, provided the necessary resources are available. The AE expects fields such as userName, userPassword, agentName, and agentType.
- If the request type is agentPack, the request carries a visiting agent. The AE allocates only temporary space for the agent under a visitors folder. The AE also informs the home AE about the successful migration and then restores the state of the visiting agent. The AE expects fields such as agentTravelPack.
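The two cases above amount to a dispatch on the request type field. A minimal Java sketch of such a dispatch follows; the class RequestDispatcher, the choice of required fields, and the returned description strings are assumptions for illustration, and the real AE performs the described actions rather than returning text.

```java
import java.util.Map;

/** Hypothetical dispatcher that mirrors the two request types described above. */
final class RequestDispatcher {

    /** Returns a short description of the action an AE would take for the request. */
    static String dispatch(Map<String, String> fields) {
        String type = fields.get("requestType");
        if (type == null) {
            throw new IllegalArgumentException("missing requestType");
        }
        switch (type) {
            case "createAgent":
                require(fields, "userName", "agentName", "agentType");
                return "create " + fields.get("agentName")
                        + " in folder of " + fields.get("userName");
            case "agentPack":
                require(fields, "agentTravelPack");
                return "restore visiting agent and notify home AE";
            default:
                throw new IllegalArgumentException("unknown requestType: " + type);
        }
    }

    // Rejects requests that lack a field the chosen request type depends on.
    private static void require(Map<String, String> fields, String... names) {
        for (String name : names) {
            if (!fields.containsKey(name)) {
                throw new IllegalArgumentException("missing field: " + name);
            }
        }
    }
}
```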

The request structure expected by aZIMAS is well defined. This enables user applications to access aZIMAS services by following the correct request structure. An example POST request illustrating a createAgent request is shown below.

-----------------------------30881262316024
Content-Disposition: form-data; name="requestType"

createAgent
-----------------------------30881262316024
Content-Disposition: form-data; name="agentName"

Matrix1
-----------------------------30881262316024
Content-Disposition: form-data; name="agentType"

Non-Interactive
-----------------------------30881262316024
Content-Disposition: form-data; name="userName"

[email protected]
-----------------------------30881262316024
Content-Disposition: form-data; name="userPasswd"

*********
-----------------------------30881262316024
Content-Disposition: form-data; name="agentFile"; filename="matrix.class"

[…… Class Bytes ……]
-----------------------------30881262316024--

7.1 Application Scenarios

We now describe some application scenarios where our system can be of use.

Search Agent

aZIMAS provides a means for agents to search information available on the Internet. For example, the search results from the popular search engine Google are available as a Web Service. aZIMAS provides a search routine that relies on Google to obtain the results. An agent’s call to the aZIMAS search routine results in a service query to the Google search engine. An agent program can make use of this feature and perform automated processing and filtering of the search results to fit a particular criterion. The agent can follow an itinerary, visiting and collecting information from various websites. For example, one can write an agent that indexes all results from .edu sites for a given search phrase. The agent can further refine these results to match specific keywords in the document. By its nature, this agent is non-interactive. This application can be useful when there is a need to process a large amount of information on the Internet. By moving the agent application close to the source, the user can save bandwidth and at the same time free his or her machine for other activities. Moving processing to remote locations is also attractive for machines that are weakly connected or poor in resources, such as low-memory devices.

Info Agent

An Info agent acts as an information repository for a user. It keeps track of the user’s personal and contact information, bookmarks, and address book. The agent provides an interface on the Internet so that anyone can add new entries to the user’s address book. The agent returns to the user’s host machine at periodic intervals and synchronizes with the address book maintained locally by the user. An advantage of this agent application is that it gives owners access to their address book and bookmarks anywhere, anytime. The agent can also be used to set alerts and reminders based on date or time. The agent will then remind the user about the impending event through email or other means of communication. This application qualifies easily as an interactive agent.
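As an illustration of the filtering step in the Search Agent scenario, the Java sketch below keeps only those result URLs whose host ends in .edu. The class and method names are hypothetical, and the list of URLs stands in for whatever the aZIMAS search routine would return, whose format is not specified here.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical post-processing step a search agent could apply to its results. */
final class SearchResultFilter {

    /** Keeps the URLs whose host name ends in ".edu"; malformed entries are skipped. */
    static List<String> keepEduResults(List<String> urls) {
        List<String> kept = new ArrayList<>();
        for (String url : urls) {
            String host;
            try {
                host = URI.create(url).getHost();
            } catch (IllegalArgumentException e) {
                continue; // not a syntactically valid URI
            }
            if (host != null && host.toLowerCase(Locale.ROOT).endsWith(".edu")) {
                kept.add(url);
            }
        }
        return kept;
    }
}
```

Checking the parsed host rather than the raw string avoids false matches on URLs that merely contain "edu" in their path.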

8 Conclusion and Future Work

In this paper we described aZIMAS, a novel framework that realizes mobile agent systems within the context of the World Wide Web. The approach interfaces the agent system (Agent Environment) with web servers by means of a minimal server-specific extension module and makes use of web browsers as clients. Because only the extension module is server-specific, the agent system can be deployed uniformly across a wide variety of web servers. Our framework provides a reasonably secure environment for mobile agents to function on the Web. We are currently working on abstracting the functionality of common web applications and making it available as part of the aZIMAS Agent API. Another area of interest is to realize aZIMAS agents as providers of web services. We are also investigating ways to incorporate proactive security features, such as verification of mobile code to check for harmful intent. We are fine-tuning the various features of the system and plan to have a public release of the AE soon, along with web server modules for popular web servers.


References

1. Funfrocken, S., How to integrate mobile agents into Web Servers, 6th Workshop on Enabling Technologies Infrastructure for Collaborative Enterprises, 1997
2. Anselm Lingnau and Oswald Drobnik and Peter, An HTTP-based Infrastructure for Mobile Agents, Proceedings of the 4th International WWW Conference, 1995
3. P. Marques, R. Fonseca, P. Simões, L. Silva, J. Silva, Integrating Mobile Agents into Off-the-Shelf Web Servers: The M&M Approach, Proc. of the International Workshop on Internet Bots: Systems and Applications, 2001
4. T. Goddard, V.S. Sunderam, WebVector: Agents With URLs, Proceedings of the 6th Workshop on Enabling Technologies Infrastructure for Collaborative Enterprises, 1997
5. Amar Nalla, Abdelsalam (Sumi) Helal and Vidya Renganarayanan, aZIMAs: Almost Zero Infrastructure Mobile Agents System, Proceedings of the IEEE Wireless Communications and Networking Conference, March 2002
6. S. Bouchenak, Pickling threads state in the Java system, Third European Research Seminar on Advances in Distributed Systems (ERSADS'99), Madeira Island, Portugal, April 1999
7. Wayne Jansen and Tom Karygiannis, Mobile Agent Security, National Institute of Standards and Technology, Special Publication 800-19, August 1999
8. Stefanos Gritzalis and Diomidis Spinellis, Addressing threats and security issues in World Wide Web technology, Proceedings of the 3rd IFIP TC6/TC11 International Joint Working Conference on Communications and Multimedia Security, 1997
9. Damir Horvat, Dragana Cvetkovic and Veljko Milutinovic, Mobile Agents and Java Mobile Agents Toolkits, Proceedings of the 33rd Hawaii International Conference on System Sciences, 1998
10. David Kotz, Robert S. Gray, Mobile Agents and the Future of the Internet, ACM Operating Systems Review, 33(3), August 1999
11. Internet Agents, http://cws.internet.com/agents.html

Mobile Code in .NET: A Porting Experience

Márcio Delamaro¹ and Gian Pietro Picco²

¹ Fundação Eurípedes Soares da Rocha, Av Hygino Muzzi Filho, 529, 17525901, Marilia-SP, Brazil. Phone: +55(14) 421-0833, [email protected]
² Dipartimento di Elettronica e Informazione, Politecnico di Milano, P.za Leonardo da Vinci, 32, I-20133 Milano, Italy. Phone: +39-02-23993519, [email protected]

Abstract. Mobile code systems typically rely on the Java language, since it provides many of the necessary building blocks. Nevertheless, Microsoft recently released the .NET platform, which includes at its core a virtual machine supporting multi-language programming, and a new language called C#. The competition between .NET and Java is evident, and so are the analogies between these two technologies. From the point of view of code mobility, a natural question to ask is then whether .NET supports mobile code, and how the mechanisms provided compare with those available in Java. This paper aims at providing a preliminary set of answers to this simple question. The work we report was not driven by the goal of providing a thorough comparison. Instead, it was driven by the practical need to port an existing toolkit for code mobility written in Java, μCode, to the .NET environment. This approach forced us to verify our mobile code design on a concrete example, rather than just think about the problem in the abstract. The resulting software artifact constitutes, to the best of our knowledge, the first implementation of a mobile code system written for .NET. In the paper, we provide an overview of the .NET mechanisms supporting mobile code, show how they are exploited in our port, and discuss similarities and differences with the Java platform.

1 Introduction

Code mobility [2] is increasingly being considered as part of the mainstream techniques for developing distributed systems. In some cases, code migration takes place behind the scenes, e.g., in middleware like RMI and Jini [4], where code mobility is exploited to increase the flexibility of service invocation. In other cases, the ability to trigger code migration is directly under the control of the programmer. This is the case of a number of systems supporting mobile code, where code can be explicitly relocated from one host to another. In particular, mobile code techniques and mechanisms are typically exploited by mobile agent systems [9], which allow the migration of an entire execution unit (e.g., a thread or a process) to a different host.

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 16–31, 2002. © Springer-Verlag Berlin Heidelberg 2002

The popularity and pervasiveness of mobile code can be ascribed largely to the success of the Java language. Modern technologies supporting code mobility all rely on this language. Even without considering social factors, like marketing and hype, there are a number of technical reasons that justify this phenomenon. In Java, a number of fundamental building blocks for code mobility, like multithreading and communication primitives, are readily available either at the language level or in the standard library. Furthermore, portability and the availability of a programmable class loader opened up unprecedented levels of flexibility.

To counter the success of Java as a language for Internet programming, Microsoft has recently released the first version of its .NET [7] platform, in an evident effort to regain control of the distributed application market. The .NET environment includes as its core a virtual machine supporting multi-language programming, and a new language called C#. Natural questions to ask are then what features in .NET can be exploited to support mobile code, and how they compare to those available in Java. This paper aims at providing a preliminary set of answers to these questions. The work we report was not driven by the goal of providing an extensive comparison. Instead, we followed a bottom-up approach, driven by the need to port an existing open source toolkit for code mobility written in Java, μCode [8,6], to the .NET platform. This approach forced us to verify our mobile code design on a concrete example, rather than just think about the problem in the abstract. The resulting software artifact constitutes, to our knowledge, the first implementation of a mobile code system written for .NET.

The paper is structured as follows. Section 2 provides the reader with a concise overview of the .NET platform, and of the features of C# that are relevant to the content of this paper. Section 3 analyzes more specifically those features of .NET that facilitate and support the development of systems involving code mobility. Section 4 provides a minimal description of μCode, the mobile code toolkit whose port under .NET is described in Section 5. Section 6 elaborates on the findings of our porting experience, and provides some preliminary comparison between Java and .NET as platforms supporting mobile code. Finally, Section 7 ends the paper by describing opportunities for future work and by providing some concluding remarks.

2 .NET

.NET is a software platform that aims at providing a complete solution for the development of distributed applications. The core of .NET is represented by the .NET Framework¹, which provides the base infrastructure for the rest of the platform and unifies the corresponding programming model across several languages.

¹ Usually, when people refer to “.NET” they mean the “.NET Framework” rather than the whole platform. Hereafter, we adopt the same convention, since the rest of the paper is concerned only with the .NET Framework.

At the heart of the .NET Framework is the Common Language Runtime (CLR) [11], the virtual machine providing the core services of memory management, thread management, compilation, and communication, and handling code management and execution in a secure way. The CLR provides language interoperability by defining an intermediate language (called MSIL, Microsoft Intermediate Language) and a single format (called PE, Portable Executable format) for executable code. Compilers generate MSIL code from the specific source language (e.g., Visual Basic or C) and output it into a PE file that contains, besides the MSIL code, metadata that instructs the CLR about how to allocate memory to objects, enforce security, locate referenced modules, and so on. Since PE files are language-independent, code generated from different languages can interoperate easily. The MSIL is similar in concept to Java bytecode, and is a stack-based language with a rich instruction set. While in principle the MSIL could be interpreted, the CLR translates MSIL to native code using a just-in-time compiler. In addition, the CLR also manages the interoperability of managed (MSIL) code and unmanaged (native) code. Unmanaged code is code that does not comply with .NET, and hence does not provide the same guarantees in terms of execution and security. The need for such interoperability typically arises with reuse of legacy code, like native DLL libraries or COM components.

The other fundamental component of the .NET Framework is the Base Class Library, a comprehensive set of classes that provide the API towards the services offered by the CLR, together with many other features like graphics, interoperation with DBMSes, collections, I/O, and XML support, to name a few.

The current version of the .NET Software Development Kit includes compilers that produce MSIL from Visual Basic, C++ and C#. The last one is a new language that constitutes one of the novelties of .NET. C# [10] is an object-oriented language and can be regarded as a high-level version of the MSIL, in that all the features of C# are natively supported by the CLR. On the surface, C# is quite similar to Java. However, it provides some distinctive features², of which the most prominent and most relevant to the content of this paper are the notion of delegate, the support for application events as a first-class language construct, and the notion of attribute.

² More details can be found in the .NET documentation, or in one of the many books on the topic, e.g., [1].

Delegates are similar to interfaces in the sense that they specify a contract between a caller and a specific method. Unlike interfaces, however, delegates are defined at runtime, to create a sort of “instance of a method”. Hence, delegates are often used for the same purpose as function pointers in C, e.g., to implement callbacks. For example, in the code below the method ProcessString takes as parameter a delegate, whose interface is defined by StringProcessor, that is supposed to perform some computation on the string passed as a parameter. In Main, the actual method ComputeLength is passed as a delegate:

    public class Example {
        public delegate int StringProcessor(string x);
        public int ComputeLength(string x) { return x.Length; }
        public void ProcessString(StringProcessor x, string y) { x(y); }
        public static void Main() {
            Example e = new Example();
            e.ProcessString(new StringProcessor(e.ComputeLength), "abc");
        }
    }

Delegates are often used in conjunction with events, to specify the handler associated with the occurrence of an event. In C#, the interface of a class can specify which events it can raise, as in the following code fragment:

    public delegate void TempExceededEventHandler(object source, EventArgs e);
    public class Sensor {
        public event TempExceededEventHandler TempExceeded;
        ...
    }

Events can be raised easily from within the class that declares them, as in

    TempExceeded(aSensor, e);

while the delegates describing the behavior of event handlers can be associated with events with statements like

    aSensor.TempExceeded += new TempExceededEventHandler(OnTempExceeded);

where OnTempExceeded stands for any method whose signature matches the delegate (the name is used here only for illustration).

It is interesting to note that delegates and events are first-class elements not only in C# but also in the CLR, and hence potentially available to other languages. Finally, attributes are auxiliary, declarative information that can be associated with given elements of the language, and that is stored in the metadata associated with the compiled code. This information can later be retrieved using reflection, and can be used at runtime. For instance, a [Serializable] attribute is used to tag a class as serializable. A number of predefined attributes are provided, together with mechanisms to create new ones.
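For readers coming from Java, the closest counterparts of delegates and events are functional interfaces and listener lists. The Java sketch below mirrors the two C# fragments under that reading; it is an analogy, not a feature-for-feature equivalent, and all names in it (TempExceededHandler, TemperatureSensor, DelegateAnalogy) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Plays the role of the TempExceededEventHandler delegate type. */
@FunctionalInterface
interface TempExceededHandler {
    void handle(Object source, double temperature);
}

/** Plays the role of the Sensor class with its TempExceeded event. */
final class TemperatureSensor {
    private final List<TempExceededHandler> handlers = new ArrayList<>();
    private final double threshold;

    TemperatureSensor(double threshold) {
        this.threshold = threshold;
    }

    // Rough counterpart of "TempExceeded += handler" in C#.
    void addTempExceededHandler(TempExceededHandler handler) {
        handlers.add(handler);
    }

    // Raises the "event" by calling every registered handler.
    void report(double temperature) {
        if (temperature > threshold) {
            for (TempExceededHandler handler : handlers) {
                handler.handle(this, temperature);
            }
        }
    }
}

final class DelegateAnalogy {
    // Counterpart of ProcessString: the processor is passed like a delegate.
    static int processString(ToIntFunction<String> processor, String text) {
        return processor.applyAsInt(text);
    }
}
```

A method reference such as String::length can then be passed to processString in the same way that ComputeLength is passed as a delegate in the C# example.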

3 .NET Features Supporting Mobile Code

The success of Java as a language for mobile code relies on some of its features, which provide the fundamental building blocks for this kind of system. Roughly, these features can be grouped together as support for concurrency, object serialization, code loading, and reflection. In this section, we illustrate how .NET supports similar features, highlighting significant departures from Java whenever appropriate. Our description is based on C#, since this is the language we used to develop our port, but most of our considerations should also hold for the other languages supported by .NET. The content of this section should not be regarded by any means as complete. The API of the .NET Framework provides a huge array of functionality, whose surface can only be scratched in a short paper like this. Our intent here is to give a concise overview of the fundamental features found in .NET that can be exploited to support the development of mobile code systems and applications. For more technical detail, we refer the interested reader to the documentation that accompanies the .NET Framework.

3.1 Concurrency

The ability to handle multiple, concurrent activities is fundamental for mobile code systems, and especially for those supporting mobile agents. Java-based mobile code systems typically pick threads as their unit of concurrency: for instance, a mobile agent is usually implemented by a thread. A thread is represented in Java as an object of class Thread, and executes within the (operating system) process containing the JVM. Java threads are granted shared access to objects residing in the process containing them, while sharing of objects contained in different processes must be handled through interprocess facilities.

At their core, the features provided by .NET to deal with threads are very similar to those found in Java. Threads are represented by objects of class System.Threading.Thread, and methods to start, suspend, resume, interrupt, join, and abort a thread, and to get a reference to the current thread, are provided, similarly to Java. Interestingly, the thread’s code is not bound to reside in a run() method, as in Java. Instead, applications can specify at thread creation time which method contains the thread behavior, by passing a delegate.

The constructs provided for controlling concurrency are instead a little different. The core functionality is provided by the class Monitor, which offers features similar to those found in Java’s Object. The Enter, TryEnter, and Exit methods allow a thread to acquire or release a lock on an object, and thus define a critical region. Moreover, the methods Wait, Pulse, and PulseAll allow a thread to explicitly synchronize with other threads. The lock statement provides a syntactic shortcut to define a critical section of code that can be executed only after a lock on an object has been acquired. Hence, lock is equivalent to a synchronized block in Java. Synchronized methods are instead declared by attaching a [MethodImpl(MethodImplOptions.Synchronized)] attribute to methods.
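For comparison, the Java side of this correspondence can be sketched as follows: a synchronized method plays the role of Enter and Exit, while wait and notifyAll correspond to Wait and PulseAll. The Mailbox and MonitorDemo classes are hypothetical examples written for this comparison and are not taken from μCode.

```java
/** One-slot mailbox guarded by the monitor of the Mailbox object itself. */
final class Mailbox {
    private String message; // null means the slot is empty

    // synchronized acquires the object's monitor on entry and releases it on exit.
    synchronized void put(String m) throws InterruptedException {
        while (message != null) {
            wait(); // releases the monitor until another thread calls notifyAll
        }
        message = m;
        notifyAll(); // wakes up threads waiting in take()
    }

    synchronized String take() throws InterruptedException {
        while (message == null) {
            wait();
        }
        String m = message;
        message = null;
        notifyAll(); // wakes up threads waiting in put()
        return m;
    }
}

final class MonitorDemo {
    /** Passes one message from a producer thread to the calling thread. */
    static String exchange(String text) {
        Mailbox box = new Mailbox();
        Thread producer = new Thread(() -> {
            try {
                box.put(text);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            String received = box.take();
            producer.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The wait calls sit inside loops because a woken thread must recheck its condition before proceeding, a discipline that applies equally to Monitor.Wait.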
Several additional utility features are provided, like a ThreadPool class for managing collections of threads, a Timer class, and several classes (e.g., Mutex, ReaderWriterLock, and so on) supporting low-level synchronization of concurrent activities.

Nevertheless, the concurrency model put forth by .NET is richer than the Java one since, in addition to processes and threads, it provides the notion of application domain, which is a sort of hybrid between the other two. Like threads, application domains are lightweight and run inside a process. However, unlike threads and similarly to processes, application domains cannot directly share code or objects. In essence, application domains are a way to provide isolation between separate applications without incurring the overhead of handling them through multiple processes, hence enhancing performance and scalability. According to the .NET documentation, “Application domains form an isolation, unloading, and security boundary for managed code.”

Hence, not only is performance improved, but the management of applications is also more flexible, since an application running in an application domain can be stopped and its code unloaded from the system, without having to stop the process containing it. Similarly, different policies for different applications can coexist in the same process. Application domains can be thought of as providing a notion of “logical process” inside a “physical” operating system process. Threads can still be exploited within and across application domains. Nevertheless, since memory cannot be shared across application domains, the programmer is forced to resort to mechanisms similar to interprocess communication. In .NET, these mechanisms are provided by the Remoting facility, which provides a form of remote method invocation that can be used not only locally, to cross the application domain boundary, but also to enable communication between applications on remote hosts. Hence, there is a tradeoff between the benefits brought by application domains and the performance overhead and increased complexity when accessing shared resources.

Application domains are available to programmers as AppDomain objects. Methods are available to create a new application domain, load code into it, and unload an application domain and its code. Moreover, the interface of AppDomain also exports two events, TypeResolve and AssemblyResolve, that can be used to implement code loading schemes, as we describe in Section 5.

3.2 Object Serialization

Serialization is clearly a fundamental building block of mobile code systems based on an object-oriented programming language. It allows a structured object to be transformed into a flat data structure, typically a stream of bytes or characters, for subsequent use with an I/O channel, e.g., a socket. Serializable classes are tagged as such by using the [Serializable] attribute, as in:

    [Serializable]
    public class Person {
        public String name = "John";
        public String surname = "Doe";
        public int age = 20;
    }

Notably, this attribute is not inherited. For instance, if a subclass of Person is meant to be serializable, the [Serializable] attribute must be explicitly attached to it. This is a significant departure from Java, where a serializable class is declared by implementing the tagging interface java.io.Serializable, and hence serializability is automatically inherited by subclasses. Class fields that are not meant to be serialized can be tagged with the [NonSerialized] attribute, analogous to Java’s transient fields. .NET provides an ISerializable interface as well, but with a different meaning from its Java counterpart. In fact, this interface is provided to allow an object to govern its own serialization and deserialization, which is achieved in Java by defining the writeObject and readObject methods of a class implementing Serializable. As in Java, the object code is not stored with the object state. Instead, information about the type of the object is stored with the serialized data, so that the correct type can be retrieved upon deserialization.
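The Java behavior used as the point of comparison can be seen in the short sketch below: the subclass inherits serializability without any extra marker, and a transient field is skipped, much like a [NonSerialized] field in .NET. The classes PersonRecord, EmployeeRecord, and SerializationDemo are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class PersonRecord implements Serializable {
    private static final long serialVersionUID = 1L;
    String name = "John";
    String surname = "Doe";
    int age = 20;
    transient String sessionToken = "secret"; // not written to the stream
}

// Serializability is inherited: no extra marker is needed on the subclass.
class EmployeeRecord extends PersonRecord {
    private static final long serialVersionUID = 1L;
    String company = "ACME";
}

final class SerializationDemo {
    /** Serializes the object to a byte array and reads a copy back. */
    static EmployeeRecord roundTrip(EmployeeRecord original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (EmployeeRecord) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After the round trip the copy carries the regular fields of both classes, while the transient field holds its default value because field initializers are not re-run during deserialization.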


In Java, (de)serialization is achieved by using a specific I/O stream class, like java.io.ObjectInputStream. Instead, (de)serialization in .NET is delegated to a formatter object, which must implement the interface IFormatter. For instance, the following snippet serializes an object of type Person and writes it to a file:

    IFormatter f = new BinaryFormatter();
    Stream fs = new FileStream("person.dat", FileMode.Create,
                               FileAccess.Write, FileShare.None);
    Person obj = new Person();
    f.Serialize(fs, obj);
    fs.Close();

Two formatter implementations are provided by .NET, providing binary serialization and XML serialization. The latter allows an object’s public properties and fields to be serialized into an XML file, and is meant to be used for generating human-readable descriptions of an object, and for interacting with Web services based on SOAP. Instead, binary serialization is closer to Java serialization, in that it preserves the type of the object, and is typically exploited within the Remoting API for passing parameters in a remote method invocation, similarly to Java RMI.

Interestingly, serialization is even more important in .NET, due to the aforementioned impossibility of sharing objects directly across application domains. While serialization is exploited in Java only across processes, typically residing on different hosts, in .NET it becomes relevant even in the scope of a single process, to implement object sharing.

3.3 Code Loading

The fundamental mechanism enabling code mobility is the ability to load code dynamically, either into a running application or into a newly spawned concurrent executing unit. In Java, the unit of code loading is an object type. The bytecode corresponding to a class or interface can be loaded dynamically by the runtime, and more specifically by the class loader, typically when the name of a class that has not yet been loaded is encountered during the execution of an application. In .NET the unit of code loading is more coarse-grained than a single type, and is constituted by an assembly. According to the .NET documentation, “[Assemblies] form the fundamental unit of deployment, version control, reuse, activation scoping, and security permissions. An assembly is a collection of types and resources that are built to work together and form a logical unit of functionality. [...] To the runtime, a type does not exist outside the context of an assembly.”

At first sight, an assembly vaguely resembles a Java JAR file. For instance, each assembly contains a manifest, containing the assembly metadata. Nevertheless, while JAR files are only relevant for deployment, to package together code and resources, assemblies are first-class entities in .NET. Assemblies are made accessible to the programmer through the class Assembly that, among other things, enables introspection, as we discuss later. More importantly, assemblies play a fundamental role in code loading, since they provide the context for the code representing a type.

Assembly loading can be triggered when the runtime attempts to resolve a reference to another assembly. References can be defined statically or dynamically. Static references are typically generated by the compiler and stored in the assembly manifest. A typical example is a method call on an object whose class belongs to another assembly. Dynamic references are instead generated when a program requests the CLR to load an assembly that was not referenced statically, or to create and load an assembly from a byte array containing its code. This latter feature is fundamental for implementing mobile code, and can be achieved by invoking either the Assembly.Load or the AppDomain.Load method. The effect is similar to the ClassLoader.defineClass found in Java, where the byte array is reified into a class object.

The sequence of steps performed during assembly resolution is the following:

1. Determine the assembly version required. This is performed by consulting a number of system- and application-defined configuration files. Notably, versioning is built into the unit of code loading. This is a significant improvement over Java, where class versioning is still largely an open issue.
2. Check whether the assembly has been bound before within the runtime. If yes, the previously loaded assembly is used.
3. If the assembly is strong-named, check the global assembly cache. Strong-named assemblies are assemblies that have been signed with a key at creation time. They are meant to be shared among several applications on one machine, and are stored in a machine-wide cache.
4. Probe for the assembly. Probing is a process that attempts to determine potential locations for the assembly, by looking at a number of configuration files and applying heuristics.
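The Java mechanism mentioned above, ClassLoader.defineClass, can be illustrated with a small sketch in which a byte array is turned into a class by a dedicated loader. Reading the bytes of an already compiled class from the classpath stands in for receiving them from the network; that shortcut, and the names Greeter, ByteArrayClassLoader, and CodeLoadingDemo, are assumptions made to keep the example self-contained.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Class whose bytecode is shipped around in this sketch. */
class Greeter {
    @Override
    public String toString() {
        return "hello from mobile code";
    }
}

/** Loader that reifies a received byte array into a class object. */
final class ByteArrayClassLoader extends ClassLoader {
    ByteArrayClassLoader(ClassLoader parent) {
        super(parent);
    }

    Class<?> define(String name, byte[] code) {
        // defineClass is the protected primitive that creates the class from raw bytes.
        return defineClass(name, code, 0, code.length);
    }
}

final class CodeLoadingDemo {
    // Stand-in for the network: reads the compiled bytes of a class from the classpath.
    static byte[] bytecodeOf(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("bytecode not found: " + resource);
            }
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines a second copy of the given class in a fresh loader. */
    static Class<?> reload(Class<?> type) {
        ByteArrayClassLoader loader = new ByteArrayClassLoader(type.getClassLoader());
        return loader.define(type.getName(), bytecodeOf(type));
    }
}
```

The class returned by reload has the same name as the original but belongs to a different loader, so the JVM treats it as a distinct type. This per-loader name space is what Java mobile code systems typically rely on to keep code from different sources apart.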
This sequence of steps cannot be changed directly by the programmer, who can intervene only by manipulating configuration information. In other words, in .net there is no direct equivalent of the Java programmable class loader. Nevertheless, as we discuss in Section 5, programmable code loading can be provided by a combination of the loading facility provided by the Assembly class, the protected name space provided by AppDomain, and the reactive features provided by events and delegates.

3.4

Reflection

Reflection, i.e., the ability to obtain type information about an object, is not necessarily exploited within the runtime of a mobile code system, but it is often a fundamental asset for applications that exploit mobile code. Through reflection, an application can instantiate an object using a class that becomes available only dynamically. More importantly, reflection can be used to determine what portion of the class closure must be transferred during code migration.

M. Delamaro and G.P. Picco

Fig. 1. The architecture of mainstream mobile agent systems (left) and μCode (right).

At the core of .net reflection is the System.Type class, which provides functionality similar to Java's java.lang.Class. Thus, for instance, GetType(String) returns the Type with the given name, similarly to Class.forName(String) in Java. Once the type is obtained, a new instance of the type can be created by retrieving the appropriate constructor (through GetConstructors) and by invoking it (through InvokeMember). Besides constructors, InvokeMember can also be used to invoke methods or to access members. Through the Type class, information can be gathered about the members of a type, its ancestors, the interfaces implemented, and so on. Unlike Class, however, Type is used not only for classes and interfaces, but also for scalar types, arrays, pointers, and enumerations, since the type system of the clr is unified. Nevertheless, as we mentioned before, types are always contained in assemblies. Hence, it is no surprise to find reflective features in the Assembly class as well. For instance, Assembly.CreateInstance(String) creates a new instance of the named type by invoking the default constructor, after the latter is retrieved from the assembly. Methods for querying an assembly for all the Types and resources contained in it are also provided. In general, the reflection features provided by .net are more sophisticated (and complex) than their Java counterparts. Part of the motivation lies in the fact that .net addresses a multilanguage platform, and aims at retaining compatibility with unmanaged code belonging to legacy applications, e.g., COM components. Thus, for instance, a number of methods are also provided that give access to the internal representation of a type, and a lot of flexibility is provided in defining the rules by which a given member is queried and retrieved.
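For comparison, the Java side of this correspondence can be sketched in a few lines. This is only an illustration and not code from μCode: the class ReflectionSketch and its helper method are hypothetical names, while Class.forName, getConstructor, newInstance, getMethod, and invoke are standard Java reflection calls. The comments point to the .net members named in the text.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

// Hypothetical helper, not part of any toolkit discussed in the text.
final class ReflectionSketch {

    // Looks up a class by name, builds an instance through its
    // single-String constructor, and invokes a public no-argument method.
    static Object instantiateAndCall(String className, String argument, String methodName) {
        try {
            Class<?> type = Class.forName(className);                    // cf. GetType(String)
            Constructor<?> constructor = type.getConstructor(String.class);
            Object instance = constructor.newInstance(argument);         // cf. invoking a retrieved constructor
            Method method = type.getMethod(methodName);
            return method.invoke(instance);                              // cf. InvokeMember
        } catch (ReflectiveOperationException e) {
            // Wrapped so that callers need not declare checked exceptions.
            throw new IllegalStateException("reflection failed for " + className, e);
        }
    }
}
```

For instance, instantiateAndCall("java.lang.String", "abc", "length") creates a String reflectively and returns its length as an Integer. The class to instantiate is known only as a string at run time, which is the situation an application faces when the class arrives with mobile code.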

4

μCode

μCode [8] is a lightweight and flexible toolkit for code mobility that, in contrast with most analogous platforms, strives for minimality and places a lot of emphasis on modularity. While mainstream mobile agent systems tend to provide a rich set of features but with a monolithic design, μCode decouples the core mechanisms supporting code mobility from the other features (see Figure 1). This design achieves modularity, thus improving the understanding and management of the system, and optimizing its deployment. μCode revolves around three fundamental concepts: groups, group handlers, and class spaces. Groups are the unit of mobility, and provide a container that can be filled with arbitrary classes and objects (including Thread objects) and shipped to a destination. Classes and objects need not belong to the same thread. Moreover, the programmer may choose to insert in the group only some of the classes needed at the destination, and let the system download and link the missing classes from a remote target specified at group creation time. The destination of a group is a μServer, an abstraction of the run-time support. In the destination μServer, the mix of classes and objects must be extracted from the group and used in some coherent way, possibly to generate new threads. This is the task of the group handler, an object that is instantiated and accessed in the destination μServer, and whose class is specified by the programmer at group creation time. Any object can be a group handler. Programmers can define their own specialized group handlers and, in doing so, effectively define their own mobility primitives. During group reconstruction, the system needs to locate classes and make them available to the group handler. The classes extracted from the group must be placed into a separate name space, to avoid name clashes with classes reconstructed from other groups. This capability is provided by the class space. Classes shipped in the same group are placed together in a private class space, associated with that group.
However, if and when needed, these classes can later be “published” in a shared class space associated to a μServer, where they become available to all the threads executing in it, as well as to remote ones. Class spaces also play a role in the resolution of class names. When a class name C needs to be resolved during the execution of a class originally retrieved from a group g managed by a μServer S, the redefined class loader of μCode is invoked to search for C’s bytecode by performing the following steps:

i) check whether C is a ubiquitous class, i.e., a class available on every μServer (e.g., system classes);
ii) search for C in the private class space associated with g in S;
iii) search for C in the shared class space associated with S;
iv) if dynamic download is allowed, retrieve C from the remote μServer specified by the user at migration time, and load C;
v) if C cannot be found, throw a ClassNotFoundException.

Moreover, μCode provides higher-level abstractions built on the core concepts defined thus far. These abstractions include primitives to remotely clone and spawn threads, ship and fetch classes to and from a remote μServer, and a full-fledged implementation of the mobile agent concept. μCode is available as open source under the Library GNU Public License (LGPL). Binaries, source, documentation, and examples are available at mucode.sourceforge.net.
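The lookup order of steps i) to v) can be modelled with a small, self-contained sketch. It is not the μCode class loader: the class ClassSpaceLookup, its fields, and the use of in-memory maps in place of class spaces and of the remote μServer are all assumptions made for illustration. Instead of throwing ClassNotFoundException, the sketch reports NOT_FOUND so that the outcome of each step is easy to inspect.

```java
import java.util.Map;
import java.util.Set;

// Illustrative model only: names and types are assumptions, not the μCode API.
final class ClassSpaceLookup {

    enum Origin { UBIQUITOUS, PRIVATE_SPACE, SHARED_SPACE, REMOTE, NOT_FOUND }

    private final Set<String> ubiquitous;            // classes assumed present on every server
    private final Map<String, byte[]> privateSpace;  // bytecode shipped with the group
    private final Map<String, byte[]> sharedSpace;   // bytecode published on the local server
    private final Map<String, byte[]> remoteSource;  // stands in for the remote server
    private final boolean dynamicDownloadAllowed;

    ClassSpaceLookup(Set<String> ubiquitous,
                     Map<String, byte[]> privateSpace,
                     Map<String, byte[]> sharedSpace,
                     Map<String, byte[]> remoteSource,
                     boolean dynamicDownloadAllowed) {
        this.ubiquitous = ubiquitous;
        this.privateSpace = privateSpace;
        this.sharedSpace = sharedSpace;
        this.remoteSource = remoteSource;
        this.dynamicDownloadAllowed = dynamicDownloadAllowed;
    }

    // Reports where a class name would be resolved, trying the steps in order.
    Origin locate(String className) {
        if (ubiquitous.contains(className)) {
            return Origin.UBIQUITOUS;                 // step i
        }
        if (privateSpace.containsKey(className)) {
            return Origin.PRIVATE_SPACE;              // step ii
        }
        if (sharedSpace.containsKey(className)) {
            return Origin.SHARED_SPACE;               // step iii
        }
        if (dynamicDownloadAllowed && remoteSource.containsKey(className)) {
            return Origin.REMOTE;                     // step iv
        }
        return Origin.NOT_FOUND;                      // step v: a real loader would throw here
    }
}
```

Because the private class space is consulted before the shared one, a class shipped with a group shadows a published class with the same name, which is what keeps groups from clashing with each other.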

5

μCode from Java to C#

In this section, we describe the most relevant issues we faced when porting μCode from the Java platform to .net. Although the port was developed in C#, most if not all of the content of this section should hold for the other languages supported by .net. Clearly, there are different strategies that can be exploited when porting an application to a different platform. In particular, a tradeoff stems from the desire to keep the port as close as possible to the original, and at the same time to use the features of the target environment effectively. In the work we report here, we strove to implement our port (which we henceforth call μCode.net to distinguish it from the original) so that its API and runtime functionality are as close as possible to μCode. Nevertheless, we also tried to reach this goal by using the features provided by .net in the most natural way. This first experience already allowed us to draw some relevant considerations about .net for mobile code, as discussed in Section 6, and provides the basis for ongoing work in improving our prototype. As in the original μCode, the Group class allows one to pack objects and code together, and relocate them to a remote μServer. The main difference from the original μCode, however, is that classes are not shipped individually but as part of an assembly. This is clearly a consequence of the fact that .net forces types to be always contained in an assembly, and does not provide mechanisms to load the former without the latter. Hence, adding a class to a group automatically triggers also the insertion of the corresponding assembly. A method AddAssembly, for explicitly inserting an assembly into a group, is also provided in μCode.net. At the destination μServer, the assemblies in the group are unpacked and made available to the group handler through the private class space. In μCode, the private class space is realized by associating a customized class loader to the group.
The semantics of Java class loading effectively yields separation of name spaces, and hence avoids class name conflicts. In .net, such isolation is provided by the notion of application domain. For this reason, we decided to perform the unpacking of a group into a newly created application domain. Since application domains behave like logical processes, the code belonging to different groups is kept isolated. The new application domain is created by using the methods in the AppDomain class. Then, an object is created in this new application domain that is responsible for obtaining the group from the μServer, unpacking it, registering some delegates (as mentioned later), instantiating the group handler, and delivering the group to it. Since application domains enforce code isolation, the instantiation and the access to this “bootstrap” object in the new application domain require the use of the Remoting API. The disadvantage of this solution is the increased complexity of access to shared resources. Threads created in an application domain associated to a received group can share an object with a thread in another application domain only by using some form of interprocess communication. As a special case, this holds also for access to the public, shared class space associated to the μServer, where code (assemblies) is stored and made available to local and remote nodes. In fact, the μServer itself runs in a separate application domain. In our current implementation, we exploited TCP sockets as a form of interprocess communication. We are currently reworking it to exploit the Remoting API, and simplify object sharing. Once the application domain is created, assemblies are explicitly loaded from the group into the application domain using AppDomain.Load. This method may accept an assembly name as a parameter, or even a byte array, from which an assembly is automatically reconstructed. As we already mentioned, this latter feature is similar to the ClassLoader.defineClass found in Java. In the Java implementation of μCode, all the class bytecode coming from other μServers is kept in a hashtable, so that it can be retrieved when the class needs to be transferred again. In fact, a byte array is easily obtained from the file containing the class, but once the bytecode has been reified into a class object there are no means to transform it back into a byte array. For similar reasons, in μCode.net we first store the byte array in a hashtable upon arrival, and then load it in the application domain. The code running in the application domain associated with a group is not necessarily self-contained. Most likely, mobile code is exploited further to enable additional code to be downloaded on demand whenever necessary. In order to load the code associated with the group, the default code loading strategy needs to be redefined. Again, in μCode this is obtained by redefining the class loader, while in μCode.net this ability is provided through application domains. The AppDomain class defines two public events that allow the programmer to deal with code loading into the application domain.
TypeResolve is fired whenever the clr cannot find the assembly containing the requested type, while AssemblyResolve is fired whenever the resolution of an assembly, carried out as discussed in Section 3.3, fails. Both events are handled through delegates of type ResolveEventHandler. By handling these two events, it is possible to obtain the equivalent of the Java class loader. The following is a snippet of the actual code of μCode.net, registering the event handlers:

curApp.TypeResolve += new ResolveEventHandler(this.TryToLoadType);
curApp.AssemblyResolve += new ResolveEventHandler(this.TryToLoadAssembly);

The methods TryToLoadType and TryToLoadAssembly are responsible for finding and retrieving the code, and hence for reproducing the original μCode strategy for class resolution. The correspondence between the original strategy used by μCode and the one implemented by μCode.net through the aforementioned delegates is as follows:

1. μCode checks whether the class to be resolved is ubiquitous. In μCode.net, default ubiquitous assemblies are those of the core .net runtime, and those containing μCode.net itself; other assemblies can be defined as ubiquitous by the programmer. Nevertheless, because of their nature they are assumed to be present in well-known places in the file system. Hence, the delegates need not worry about them: these assemblies are either found in the file system (and hence the aforementioned events would not fire) or, if they are not found, they are not meant to be replaced with foreign code.
2. μCode searches in the private class space. The private class space of a group coincides with its application domain. The assemblies in the private class space have already been loaded in the application domain upon unpacking of the group, so the delegates should never get a chance to be called. Nevertheless, we experienced that, for some undocumented reason, the migrated assemblies unpacked from the group and loaded in the application domain are not found when the application tries to use them for the first time. Hence, when a class or assembly is missing the delegates first search their own application domain in any case.
3. μCode searches in the shared class space. μCode.net behaves in the same way, by contacting the μServer that created the application domain and asking for the missing code.
4. μCode attempts a dynamic download from the address in the group. Again, μCode.net behaves the same, by looking at the value of the dynamic link source field of the group, and contacting the corresponding μServer.
5. μCode raises a ClassNotFoundException. In this case, the delegates return a null value to the runtime, which in turn raises the exception.
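The fallback role of the two delegates has a rough analogue in Java's own loader, which may help readers who know only one of the two platforms. By default, ClassLoader.loadClass first delegates to the parent loader and calls findClass only when that delegation fails, so a loader that overrides findClass alone is consulted only after the default lookup has given up. The sketch below is our own illustration of that behaviour; the class FallbackLoader and its members are hypothetical and do not appear in μCode or μCode.net.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; it only records which names reach the fallback hook.
final class FallbackLoader extends ClassLoader {

    final List<String> fallbackRequests = new ArrayList<>();

    // Called by loadClass only after the parent loaders failed to find the class,
    // much like a resolve handler that runs after the default resolution fails.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        fallbackRequests.add(name);
        // A real handler would search the private space, the shared space,
        // and a remote server here; this sketch simply gives up.
        throw new ClassNotFoundException(name);
    }

    // Convenience wrapper that reports success instead of throwing.
    boolean canLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

Loading java.lang.String through this loader never reaches findClass, because the parent resolves it; only a name that the default lookup cannot resolve is recorded in fallbackRequests. A Java loader can also override loadClass itself and take over before delegation, which has no counterpart in the event-based scheme.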

Clearly, frequent dynamic class loading may result in communication overhead, or even in the impossibility to proceed with execution, if the code repository is currently unavailable. This is a general issue with mobile code systems, and μCode is one of the very few systems that tackle it. In μCode, the programmer is provided with tools that extend Java reflection with the ability to compute the full closure of a given type. The reflection API of Java captures only the type information associated with the type declaration, i.e., fields, methods, parameters, exceptions, inner classes, superclasses and interfaces. Nevertheless, this constitutes only a fraction of the information required to compute the type closure: types that are used only within the body of a method but are not part of the type declaration are not captured by the reflection primitives. The only way to determine such information is through bytecode inspection, which is performed in μCode by the ClassInspector utility class. This way, the programmer can determine the fraction of the (full) type closure that must be relocated during a code migration, and hence reduce or eliminate the need for dynamic class loading. In .net, things are simplified by the role of assemblies as type containers. In fact, by their very nature assemblies already provide a way to pre-package together types that are somehow related. Moreover, the manifest of a given assembly contains information about the other assemblies it depends on, and this information can be easily obtained through reflection from the Assembly class, using GetReferencedAssemblies. Nevertheless, although information about these dependencies among assemblies gets inserted by the compiler, it must be explicitly supplied by the programmer at compilation or linking time.
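Whichever way the direct dependencies are obtained (bytecode inspection in μCode, manifest references in .net), the set of units to ship is the transitive closure of those dependencies. A minimal sketch of that computation follows. It does not use ClassInspector or GetReferencedAssemblies: the class DependencyClosure and the map of direct references passed to it are hypothetical stand-ins for whatever source of dependency information is available.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical utility: computes everything reachable from a root unit.
final class DependencyClosure {

    // `references` maps a unit (class or assembly name) to the units it refers to directly.
    static Set<String> closure(String root, Map<String, List<String>> references) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (!visited.add(current)) {
                continue; // already handled; this also makes cyclic references safe
            }
            List<String> direct = references.getOrDefault(current, Collections.<String>emptyList());
            for (String dependency : direct) {
                pending.push(dependency);
            }
        }
        return visited;
    }
}
```

A caller would then subtract the ubiquitous units from the returned set and place the remainder in the group, leaving only what is deliberately omitted to dynamic download.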

6

Discussion

Our presentation thus far has shown how .net provides a number of features that can be exploited for supporting code mobility. At the same time, however, it is also evident that .net is the result of design criteria that are rather different from those that guided the development of Java. A significant difference between the two platforms is in the unit of mobility. .net assemblies define a rather coarse-grained unit, if compared with Java classes. While it is possible to define assemblies containing only one type, it would be cumbersome to do so, since it is the programmer’s responsibility to keep track of relationships among assemblies. Moreover, it would be a stretch of the assembly abstraction. Assemblies are really designed to be the unit of application deployment: migrating an assembly is just one of the mechanisms for deploying it. Instead, the ability to relocate a single Java class supports a more flexible (and radical) vision where applications are built out of fine-grained components that can reside in any place of the network, and hence are really distributed. This feature is exploited also in Java-based middleware, like RMI or Jini, to enhance the flexibility of remote method invocation: it allows objects to be passed as parameters even when their classes are not pre-deployed at the destination. On the other hand, assemblies provide richer metadata, including version information, dependencies, and security information, while class versioning is still largely an open problem in Java. Code loading, another cornerstone of mobile code systems, is also designed according to significantly different principles in Java and .net. Java is more liberal about the (re)definition of how code loading is performed. The programmer is free to change the entire class loading behavior, and a class can be loaded by a redefined class loader even when it is accessible from the standard one.
Instead, in .net the programmer has a chance to modify the loading behavior only when the predefined one fails. In other words, while Java supports proactive code loading, .net supports only reactive code loading, and in a fashion that always privileges the default loading strategy. In general, the code loading mechanisms provided by Java are characterized by a lower level of abstraction, if compared to .net. This results in a high degree of freedom and flexibility when dealing with mobile code, but also in increased complexity: programming with class loaders is often difficult and error prone. In particular, class loaders are only tied to classes, i.e., to the static elements of the language. Instead, with mobile code and especially mobile agents, one would like to associate a different class loading strategy to each executing unit (usually a thread), that is, to the dynamic elements of the language. Nevertheless, achieving this binding in Java is not a straightforward task. Instead, .net nicely leverages the separation among logical processes provided by application domains. Code can be loaded in an application domain, where it becomes part of the code segment of the logical process, and hence separated from the others. This provides a natural and intuitive abstraction for managing mobile code, and is associated to a unit of execution. The disadvantage of this solution is that the code and objects residing in different application domains cannot be shared directly, and instead must be accessed through the Remoting API. This is an issue particularly for mobile agent systems, where agents often seek co-location for the sole purpose of sharing objects. Moreover, the unit of execution chosen may be too coarse grained. Code loading is associated to an application domain, not to a single thread. In our prototype, we chose the most natural solution of associating each group to a separate application domain. However, this may not be the right choice for an application that needs to run in a single application domain and yet leverage mobile code. Future work will explore whether an alternative design where code loading is realized on a per-thread basis is feasible. Application domains also provide a nice solution to a key problem in the support for code mobility: code unloading. The ability to unload a class is of paramount importance for mobile code, since the codebase evolves dynamically, and yet has a finite size. Unloading becomes critical when the mobile agent paradigm is exploited: the memory of a mobile code server may easily get saturated by a flow of mobile agents, each carrying a different set of classes. Similarly, code unloading is key in dealing with resource-limited devices, like PDAs or cellular phones. In Java,

“A class or interface may be unloaded if and only if its class loader is unreachable. The bootstrap class loader is always reachable; as a result, system classes may never be unloaded.” ([5], §2.17.8)

Again, unloading is tied to class loaders, which can be managed in arbitrary ways by the programmer. Moreover, even when the programmer has dealt with class loaders accurately, there is no guarantee that a given class will actually be unloaded by the JVM. Instead, in .net unloading is associated to the application domain. When an application domain is discarded, all the assemblies that were loaded in it are unloaded as well. Hence, application domains provide an effective and intuitive mechanism to define the scope of code loading and unloading. Mobile code systems can be distinguished according to whether they provide support for strong mobility. Strong mobility is defined [2] as the ability of an executing unit to retain the execution state (e.g., the program counter) across migration. Strong mobility is desirable especially for mobile agents, since it makes migration completely transparent to the migrated agents. Nevertheless, it significantly complicates the design of the run-time support, and for this reason it is not supported by Java. Not surprisingly, a similar decision was made also for .net, although the design of the clr, and in particular the ability to store richer metadata information in the bytecode, may be exploited to support strong mobility. Our future activities will explore this line of research. In any case, the design of the clr may already simplify the achievement of a desirable feature of mobile code systems: support for multiple languages. Traditionally, this is achieved through ad hoc design of the mobile code platform, as in the case of D’Agents [3]. In .net, the chore is simplified by the fact that the clr is designed specifically to accommodate multiple languages. Clearly, the problem of interoperating between different runtimes remains unaltered.

Finally, in this paper we did not consider security issues at all. The .net platform provides a sophisticated set of security features that build on the concepts we introduced thus far, like application domains and assemblies. Nevertheless, while we agree that security is a relevant feature in mobile code, our work was driven by the desire to first learn what the core mechanisms supporting the migration of code are, before delving into the details of how to deal with such migration in a secure way.

7

Conclusions and Future Work

In this paper, we reported on our experience in porting an existing mobile code toolkit, called μCode, from Sun’s Java to Microsoft’s .net. The port gave us the opportunity of learning about the features of .net that can be exploited to support mobile code, and of exploring architectural tradeoffs in implementing mobile code mechanisms in this platform. The software artifact will soon be released publicly, under an open source license. Future work on the topic of this paper will include the exploration of alternative designs for mobile code, including different code loading schemes, and the provision of strong mobility features in the clr.

Acknowledgments. This work has been partially supported by the “Network-Aware Programming and Interoperability (NAPI)” project, sponsored by Microsoft Research. Márcio Delamaro thanks CNPq (the Brazilian Federal Funding Agency) for supporting his stay at Politecnico di Milano.

References

1. T. Archer. Inside C#. Microsoft Press, 2001.
2. A. Fuggetta, G.P. Picco, and G. Vigna. Understanding Code Mobility. IEEE Transactions on Software Engineering, 24(5):342–361, May 1998.
3. R.S. Gray, G. Cybenko, D. Kotz, R.A. Peterson, and D. Rus. D’Agents: Applications and Performance of a Mobile-Agent System. Software: Practice and Experience, 2001. To appear.
4. Jini Web page. http://www.sun.com/jini.
5. T. Lindholm and F. Yellin. The Java Virtual Machine Specification. Addison-Wesley, 2nd edition, 1999.
6. μCode Web page. http://mucode.sourceforge.net.
7. .net Web page. http://www.microsoft.com/net.
8. G.P. Picco. μCode: A Lightweight and Flexible Mobile Code Toolkit. In Proc. of Mobile Agents: 2nd Int. Workshop MA’98, LNCS 1477, pages 160–171. Springer, September 1998.
9. G.P. Picco. Mobile Agents: An Introduction. J. of Microprocessors and Microsystems, 25(2):65–74, April 2001.
10. ECMA TC39/TG2. Draft C# Language Specification. Technical report, ECMA, September 2001.
11. ECMA TC39/TG3. The CLI Architecture. Technical report, ECMA, October 2001.

Mobile Agents and Logic Programming

Hisashi Hayashi, Kenta Cho, and Akihiko Ohsuga

Computer and Network Systems Laboratory
Corporate Research and Development Center
TOSHIBA CORPORATION
1 Komukai, Toshiba-cho, Saiwai-ku, Kawasaki-shi, 212-8582, Japan
{hisashi3.hayashi, kenta.cho, akihiko.ohsuga}@toshiba.co.jp

Abstract. In many mobile agent systems, it is normal for mobile agents to be uninformed about the environment of a computer until they actually arrive at the computer. If the environment of computers is updated frequently, it is even more difficult to execute actions as expected. This paper introduces a new procedure for mobile agents that work in such a dynamic world. The new procedure smoothly integrates planning, action execution, knowledge updates, and plan modifications.

1

Introduction

Adaptation to a dynamic environment is a very important subject in mobile agent systems. Usually, mobile agents have only incomplete information about the environment. Even if a mobile agent has complete information about the environment of a computer, the environment might be updated by the time the mobile agent actually moves to the computer. Therefore, it is normal for mobile agents not to execute actions as expected.

One way to tackle this problem is to make inferences before performing actions. In particular, Prolog-like procedures have been used to implement intelligent mobile agent systems. For example, the MiLog agents [3] make inferences using Prolog. The planner for Plangent [10] makes plans using a Prolog-like procedure. It seems that mobile agents can adapt to a dynamic environment by making inferences before performing actions. However, things are not that simple. Considering the fact that mobile agents have only incomplete information about the world, the inferences that the mobile agents make might be invalid. Although Prolog is useful for implementing inference engines in a static world, it cannot deal with the dynamic nature of the world. Once Prolog performs a computation, it will not modify the computation afterwards.

In a dynamic environment, things do not always work as planned. Mobile agents might not be able to execute some actions. They might revise their knowledge and, based on the new knowledge, some old plans might become invalid. When switching from plan A to plan B, the actions which mobile agents have already performed might prevent the execution of plan B because of the side effects of the executed actions.

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 32–46, 2002. © Springer-Verlag Berlin Heidelberg 2002

In this paper, we will show how a Prolog-like procedure can be used to implement mobile agent systems in the dynamic world. Our new procedure for mobile agents integrates planning, action execution, knowledge updates, and plan modifications. Consequently, our mobile agents check and modify the plans while executing actions. The rest of this paper is organized as follows. Section 2 explains how Prolog-like procedures can be used for planning. Section 3 shows how plans can be modified when the program (knowledge) is updated. Section 4 defines a new derivation that records extra information to prepare for future program updates and plan modifications. Section 5 presents the method to modify several plans, considering side effects, when an action is executed. Section 6 defines the life cycle of the agent, which integrates planning, action execution, program updates, and plan modifications. Section 7 discusses efficiency issues. Section 8 defines the semantics. And Section 9 shows how the life cycle of the agent in Section 6 is applied to our mobile agent system.

2

Prolog as a HTN Planner

Prolog is different from the planners that are based on the state of the world (or situation) described by the values of fluents, which are propositions whose values change in different situations. Prolog is closer to planners that decompose goals into more primitive goals. In this sense, Prolog is related to HTN (Hierarchical Task Network) planners, such as SHOP [9], which decompose a compound task into more primitive tasks. For example, if we want to satisfy the literal move(london,kyoto) using the clause

move(london,kyoto):-move(london,tokyo),move(tokyo,kyoto).

Prolog decomposes move(london,kyoto) into the literals move(london,tokyo) and move(tokyo,kyoto). Because Prolog tries to satisfy move(london,tokyo) before move(tokyo,kyoto), we can regard the list [move(london,tokyo), move(tokyo,kyoto)] as a total-order plan. Hereafter, literal refers to the positive literal¹ of Prolog without negation, clause refers to a definite clause² of Prolog, and plan refers to a total-order plan which is described by a list of literals. Some literals are executable and called actions. Actions are not defined by clauses. Literals that are not executable will be decomposed by a Prolog-like procedure to make them executable. A plan is considered executable if and only if all the literals in the plan are executable.

¹ A positive literal is either a predicate or of the form P(T1, ..., Tn) where n ≥ 0, P is a predicate, and T1, ..., Tn are terms. Constants, variables, and compound terms are terms. A compound term is of the form F(T1, ..., Tn) where n ≥ 0, F is a function, and T1, ..., Tn are terms. We also use a list as a term. A list is of the form [T1, ..., Tn] where T1, ..., Tn are terms and n ≥ 0. [T|TAIL] represents the list [T, T1, ..., Tn] where T is a term, TAIL is the list [T1, ..., Tn], and n ≥ 0.

² A definite clause is of the form H:-B1, ..., Bn where n ≥ 0, H is a positive literal, and B1, ..., Bn are positive literals. H is called the head of the clause, and the sequence B1, ..., Bn of positive literals is called the body of the clause. When there is no literal in the body of the clause (or n = 0), H is used as the abbreviation of the clause H:- and called a fact.
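The decomposition of a goal list into a total-order plan can be illustrated with a deliberately simplified sketch. It is not the procedure defined in this paper: it handles only ground (variable-free) literals written as strings, allows a single clause per head, performs no unification or backtracking, and treats every literal without a defining clause as an action. The class GroundDecomposer and the map used to hold clauses are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified, variable-free model of goal decomposition; not the paper's procedure.
final class GroundDecomposer {

    // Expands goals left to right. A goal with an entry in `clauses` is replaced
    // by the decomposition of its body; any other goal is assumed to be an action.
    // Recursive clause sets without a base case would not terminate.
    static List<String> decompose(List<String> goals, Map<String, List<String>> clauses) {
        List<String> plan = new ArrayList<>();
        for (String goal : goals) {
            List<String> body = clauses.get(goal);
            if (body == null) {
                plan.add(goal);                          // no defining clause: keep as is
            } else {
                plan.addAll(decompose(body, clauses));   // replace the head by its body
            }
        }
        return plan;
    }
}
```

With the single clause from the text, decomposing [move(london,kyoto)] yields the list [move(london,tokyo), move(tokyo,kyoto)], in the same left-to-right order in which Prolog would try the two literals.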

3

Knowledge Updates and Plan Modifications

In dynamic environments, agents might acquire new information by observation, communication, learning, and so on. When an agent obtains new information, the agent might update its knowledge. If a Prolog-like procedure is used to compute plans, knowledge (program) updates might affect the plans. For example, a plan becomes invalid if the plan depends on a clause and the clause is deleted from the program when updating the program.

Example 1. Consider the following program for a mobile agent, where goto(_) (_ is arbitrary) is an action and the clauses defining connected might be updated later.

move(A,B):-connected(A,B),goto(B).
move(A,B):-connected(A,X),goto(X),move(X,B).
connected(node1,node2).
connected(node2,node3).

Given the literal move(node1,node3) as an initial goal, we can decompose this literal into an executable plan as follows, where the literals are underlined when they are selected.

1. [move(node1,node3)]
2. [connected(node1,node3),goto(node3)]
   [connected(node1,X1),goto(X1),move(X1,node3)]
3. [connected(node1,X1),goto(X1),move(X1,node3)]
4. [goto(node2),move(node2,node3)]
5. [goto(node2),connected(node2,node3),goto(node3)]
   [goto(node2),connected(node2,X2),goto(X2),move(X2,node3)]
6. [goto(node2),goto(node3)]
   [goto(node2),connected(node2,X2),goto(X2),move(X2,node3)]

In Step 6, the plan [goto(node2),goto(node3)] is executable. This plan depends on the clauses connected(node1,node2) and connected(node2,node3). These clauses can be recorded in Step 4 and Step 6. If one of these clauses is deleted from the program, this plan becomes invalid. We can find invalid plans in this way. On the other hand, if a new clause is added to the program, we might be able to create a new plan using the new clause. The following example shows how to create a new plan using a newly added clause. Example 2. Suppose that the clause: connected(node1,node3) is added to the program in Step 6 in Example 1. Replanning from scratch gives us the

Mobile Agents and Logic Programming


new plan: [goto(node3)]. However, we would like to avoid replanning from scratch because the agent might have executed some actions which have side effects. (For example, if the action goto(node2) has been executed, the agent should go back to node1 before using the new plan [goto(node3)].) Note that we can incorporate executed actions into plans, as will be explained in Section 5. Therefore, we would like to keep the current plans and modify them when an action is executed or the program is updated.

Using the newly added clause, the new plan [goto(node3)] can be created from the following plan: [connected(node1,node3),goto(node3)]. Therefore, in order to prepare for clause addition, we separately record this plan and the selected literal connected(node1,node3) in Step 3. We call this separately recorded plan a supplementary plan. Note that the definition clauses of the selected literal in the supplementary plan are subject to change. For the same reason, we record the following supplementary plans at Step 4 and Step 6, respectively:

– [connected(node1,X1),goto(X1),move(X1,node3)]
– [goto(node2),connected(node2,node3),goto(node3)]

4 A New Derivation for Plan Decomposition

Based on the previous section, we define a new derivation for planning. The derivation records extra information so that plans can be modified after a program update, as shown in the previous section. The definition of this derivation includes dynamic literals, which are non-executable. Dynamic literals can be defined not only by facts but also by clauses. A program is a set of clauses such that no action occurs at the head of a clause. (Actions cannot be defined by clauses.) The clauses defining dynamic literals are subject to change when the program is updated.

Definition 1. Let Lk be a literal that is unifiable with the head of clause³ C. The resolvent of the plan [L1, ..., Lk, ..., Ln] on Lk by the clause C is the plan:

θ([L1, ..., Lk−1, B1, ..., Bm, Lk+1, ..., Ln])

where H:-B1, ..., Bm (m ≥ 0) is a new variant⁴ of the clause C and θ is a most general unifier⁵ (mgu) of Lk and H. Clauses and a history of action execution⁶ recorded in association with a plan are also recorded in its resolvents.

³ No action is unifiable with the head of a clause.
⁴ C1 is a variant of C2 if there exist substitutions θ1 and θ2 such that C1 is identical to θ1(C2) and C2 is identical to θ2(C1).
⁵ Substitution θ is a unifier of P and Q if and only if θ(P) and θ(Q) are identical. Unifier θ of P and Q is a most general unifier of P and Q if and only if for any unifier σ of P and Q, there exists a substitution ρ such that ρ(θ(P)) is identical to σ(P).
⁶ A history of action execution will be defined in Definition 4 in the next section.
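The resolvent of Definition 1 can be sketched in a few lines of Python (chosen here only for brevity; picoPlangent itself is implemented in Java). This is an illustrative sketch under stated assumptions, not the authors' implementation: terms are encoded as nested tuples such as ("connected", "node1", "X"), strings beginning with an uppercase letter are treated as variables, the occurs check is omitted, and the names unify, rename, and resolvent are hypothetical.

```python
import itertools

_fresh = itertools.count(1)  # counter used to build new variants of clauses


def is_var(t):
    # Assumption: a variable is a string starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    # Follow the bindings of substitution s until t is unbound or not a variable.
    while is_var(t) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    """Extend substitution s to a unifier of a and b, or return None.

    No occurs check is performed (a simplification of this sketch)."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


def subst(t, s):
    # Apply substitution s to term t.
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(x, s) for x in t[1:])
    return t


def rename(t, n):
    # Build a new variant by tagging every variable with the number n.
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(x, n) for x in t[1:])
    return t


def resolvent(plan, k, clause):
    """Resolvent of plan on plan[k] by clause = (head, body), or None."""
    n = next(_fresh)
    head = rename(clause[0], n)
    body = [rename(b, n) for b in clause[1]]
    s = unify(plan[k], head, {})
    if s is None:
        return None
    return [subst(lit, s) for lit in plan[:k] + body + plan[k + 1:]]


# Usage: decompose move(node1,node3) with the recursive clause for move.
MOVE2 = (("move", "A", "B"),
         [("connected", "A", "X"), ("goto", "X"), ("move", "X", "B")])
plan = resolvent([("move", "node1", "node3")], 0, MOVE2)
```

Resolving the resulting plan on its first literal by the fact connected(node1,node2) then binds the fresh variable and yields the plan for goto(node2) followed by move(node2,node3).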


Definition 2. A derivation under the program P from PLANS1 to PLANSn is a sequence PLANS1, PLANS2, ..., PLANSn of sets of plans such that n ≥ 2, and each PLANSi+1 (1 ≤ i ≤ n − 1) is derived from PLANSi by one of the following derivation rules:

p1 Select a plan [L1, ..., Lk, ..., Lm] from PLANSi, and select a literal Lk from this plan such that Lk is neither an action⁷ nor a dynamic literal. Make PLANSi+1 from PLANSi by replacing the plan [L1, ..., Lk, ..., Lm] with all the plans each of which is a resolvent of the plan [L1, ..., Lk, ..., Lm] on Lk by a clause in the program P. If supplementary plans are recorded in association with PLANSi, record also those supplementary plans (and their selected literals) in association with PLANSi+1.

p2 Select a plan [L1, ..., Lk, ..., Lm] from PLANSi, and select a literal Lk from this plan such that Lk is a dynamic literal. Make PLANSi+1 from PLANSi by replacing the plan [L1, ..., Lk, ..., Lm] with all the plans each of which is a resolvent R of the plan [L1, ..., Lk, ..., Lm] on Lk by a clause C in the program P, with the clause C recorded in association with the plan R. Record the plan [L1, ..., Lk, ..., Lm] (and the selected literal Lk) in association with PLANSi+1 as a supplementary plan. If some supplementary plans are recorded in association with PLANSi, record also those supplementary plans (and their selected literals) in association with PLANSi+1.

When the definition clauses of the selected literal are subject to change, Rule p2 is applied and extra information is recorded. As explained in the previous section, supplementary plans are recorded in association with a set of plans to prepare for clause addition, and some clauses are recorded in association with a plan (resolvent) to check the validity of the plan when deleting a clause.

The number of supplementary plans, which are recorded in association with a set of plans, represents how many times dynamic literals have been selected to derive the set of plans. The number of clauses recorded in association with a plan represents how reliable the plan is. (The fewer the recorded clauses, the more reliable the plan. Recorded clauses might be deleted in the future.) Note that executable literals (actions) are not resolved in the above definition. An action will be satisfied when it is executed successfully.
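The bookkeeping of Rule p2 can be illustrated with a small Python sketch. It is a simplification under stated assumptions rather than the authors' procedure: dynamic literals are assumed to be defined by ground facts only, so one-way matching replaces full unification; a plan is a pair (literals, recorded clauses); and the names match, subst, and apply_p2 are hypothetical.

```python
def is_var(t):
    # Assumption: a variable is a string starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()


def match(pattern, fact, s):
    """One-way match of a literal (possibly with variables) against a ground fact."""
    if is_var(pattern):
        if pattern in s:
            return s if s[pattern] == fact else None
        return {**s, pattern: fact}
    if (isinstance(pattern, tuple) and isinstance(fact, tuple)
            and len(pattern) == len(fact)):
        for p, f in zip(pattern, fact):
            s = match(p, f, s)
            if s is None:
                return None
        return s
    return s if pattern == fact else None


def subst(t, s):
    # Replace bound variables in t by their values.
    if isinstance(t, tuple):
        return tuple(subst(x, s) for x in t)
    return s.get(t, t)


def apply_p2(plans, supplementary, i, k, facts):
    """Apply Rule p2 to plans[i] on its k-th literal, a dynamic literal.

    Each resolvent records the fact it was built from, and the original
    plan is kept as a supplementary plan together with the index of the
    selected literal."""
    literals, recorded = plans[i]
    resolvents = []
    for fact in facts:
        s = match(literals[k], fact, {})
        if s is not None:
            rest = literals[:k] + literals[k + 1:]
            resolvents.append(([subst(lit, s) for lit in rest],
                               recorded | {fact}))
    new_plans = plans[:i] + resolvents + plans[i + 1:]
    return new_plans, supplementary + [(literals, k, recorded)]


# Usage with the facts of Example 1 (Step 3 to Step 4).
C12 = ("connected", "node1", "node2")
C23 = ("connected", "node2", "node3")
STEP3 = [([("connected", "node1", "X1"), ("goto", "X1"),
           ("move", "X1", "node3")], frozenset())]
STEP4, SUPP = apply_p2(STEP3, [], 0, 0, [C12, C23])
```

When no fact matches the selected literal, the plan simply disappears from the set of plans but survives as a supplementary plan, which is what allows a later clause addition to revive it.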

5 Action Execution and Side Effects

After a mobile agent moves from one computer to another, it might not be able to execute an action as planned. This kind of problem usually occurs because the environment of the computer has changed. In such cases, it is necessary to switch from the current plan to another plan. However, some of the executed actions might need to be undone because execution of an action in a plan might prevent execution of another plan. Therefore, each time an action is executed, the plan is modified by the following action-execution rules, where the result of execution of action A is one of the following:

– The execution of action A is successful and does not have side effects⁸.
– The execution of action A is successful but has side effects that can be undone by the execution of action A⁻¹.
– The execution of action A is successful but has side effects that cannot be undone.
– The execution of action A is unsuccessful⁹.

Definition 3. Given the result of execution of action A, the plan [L1, L2, ..., Ln] is modified by one of the following action-execution rules where n ≥ 0 and L1 is an action (if n ≥ 1):

s1 If A is executed successfully, n ≥ 1, and L1 is unifiable¹⁰ with A, then unify L1 with a new variant of A, and delete¹¹ L1 from the plan [L1, L2, ..., Ln].
s2 If A is executed successfully, L1 is an action that is not unifiable with A, action A has a side effect, and action A⁻¹ undoes the execution of A, then add¹² A⁻¹ to the top of the plan [L1, L2, ..., Ln].
s3 If A is executed successfully, L1 is an action that is not unifiable with A, action A has a side effect, and no action can undo the execution of A, then delete the plan [L1, L2, ..., Ln].
s4 Otherwise¹³, the plan [L1, L2, ..., Ln] is not modified.

Rule s1 erases the first literal of a plan, if it has been satisfied, to avoid redundant execution of the action. Rule s2 applies to the case where the executed action has a side effect and it is necessary to undo the action execution before using the plan. Rule s3 cuts a plan if the executed action has a side effect and it is impossible to undo the action execution before using the plan. Rule s4 is applied if the plan is not affected.

Note that the action-execution rules are not applied to the plan [L1, ..., Ln] if the first literal L1 is not executable. In this case, in order to apply an action-execution rule, the non-executable literal L1 has to be decomposed further until it becomes executable. However, we might not decompose L1 immediately if some other plans are available at the moment. In order to apply the action-execution rules to the plan [L1, ..., Ln] after decomposing L1 into actions later, we need to record the history of action execution in association with that plan.

Definition 4. A history of action execution is a list of the form [H1, ..., Hn], where n ≥ 0, and each Hk (1 ≤ k ≤ n) is a result of action execution.

⁷ Actions are not resolved by the derivation rules.
⁸ We assume that this action does not prevent other plans from functioning. Therefore, we do not have to undo this action in any case.
⁹ When the action execution is unsuccessful, we assume that the situation has not changed.
¹⁰ Even if L2 is unifiable with A, we do not delete L2 from the plan [L1, L2, ..., Ln] because this plan is a total-order plan and L1 has to be satisfied before L2. We would like to extend our procedure so that it can deal with partial-order plans in the future.
¹¹ The modified plan is θ([L2, ..., Ln]) where θ is an mgu of L1 and a new variant of A.
¹² The modified plan is [A⁻¹, L1, L2, ..., Ln].
¹³ This rule is not applied if L1 is not an action.
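The action-execution rules can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: actions are assumed to be ground, so "unifiable" reduces to equality; a result of execution is encoded as one of the hypothetical tuples ("ok", A), ("undoable", A, A_inv), ("irreversible", A), or ("failed", A); and the function names are illustrative.

```python
def apply_rule(plan, result):
    """Apply rules s1 to s4 to a plan whose first literal (if any) is an action.

    Returns the modified plan, or None when rule s3 deletes the plan."""
    kind, action = result[0], result[1]
    if kind == "failed" or not plan:
        return plan                    # s4: nothing to modify
    if plan[0] == action:
        return plan[1:]                # s1: the first action has been satisfied
    if kind == "undoable":
        return [result[2]] + plan      # s2: undo A before using this plan
    if kind == "irreversible":
        return None                    # s3: the plan can no longer be used
    return plan                        # s4: A has no side effect


def after_execution(plan, history, result, is_action):
    """Apply a rule now, or defer it by recording the result in the history."""
    if not plan or is_action(plan[0]):
        return apply_rule(plan, result), history
    return plan, [result] + history    # first literal not executable yet


# Usage: the agent executed goto(node2), which goto(node1) would undo.
G1, G2, G3 = ("goto", "node1"), ("goto", "node2"), ("goto", "node3")
RESULT = ("undoable", G2, G1)
shortened = apply_rule([G2, G3], RESULT)   # rule s1
detour = apply_rule([G3], RESULT)          # rule s2
```

Plans whose first literal is not yet executable are left untouched; only their history grows, so that the deferred results can be applied once the literal has been decomposed into actions.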


We assume that the agent has knowledge about canceling actions. In our implementation, the Prolog clause:

undo(cancel(Hotel),book(Hotel)).

means that the action cancel(Hotel) will undo the execution of book(Hotel). The Prolog clause:

undo(plan(goal),action1).

means that the execution of action1 will be undone if a plan for goal is executed. In order to execute the plan [plan(goal), a1, a2, ...], we need to decompose the plan [goal, a1, a2, ...] until it becomes executable. Using Prolog clauses, we can set meta-level conditions. For example, the Prolog clause:

undo(goto(Location),goto(_)):-previousLocation(Location).

means that a goto action will be canceled by going back to the previously recorded Location.

Note that in our implementation, if no canceling action is known, we assume that the executed action does not have side effects. Therefore, if previousLocation is not recorded in the knowledge base, the agent will not undo the goto action. In other words, we record previousLocation only when we want to cancel the goto action. previousLocation(location) is recorded by executing the action setLocation(location). In the simplest case, we record previousLocation each time the agent executes a goto action. Here, the canceling action of each goto action moves the agent to the previous node. If previousLocation is not recorded when executing the goto action, action-execution rule s2 will not be applied to plans and the executed goto action will not be undone. Similarly, actions whose side effects cannot be undone and actions without side effects can be expressed using Prolog clauses.
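How the undo knowledge determines the result of an action can be sketched as follows. This Python fragment only mirrors the two undo clauses for book and goto shown above; the dictionary-based knowledge base and the function names are assumptions made for illustration and are not part of picoPlangent.

```python
def set_location(knowledge, location):
    # Mirrors the action setLocation(location): remember where to go back to.
    knowledge["previousLocation"] = location


def canceling_action(action, knowledge):
    """Return the action that undoes `action`, or None if none is known."""
    name, arg = action
    if name == "book":
        return ("cancel", arg)          # undo(cancel(Hotel),book(Hotel)).
    if name == "goto" and "previousLocation" in knowledge:
        # undo(goto(Location),goto(_)):-previousLocation(Location).
        return ("goto", knowledge["previousLocation"])
    return None


def result_of(action, knowledge):
    """Classify a successfully executed action using the undo knowledge."""
    inverse = canceling_action(action, knowledge)
    if inverse is None:
        # No canceling action is known: treated as having no side effects.
        return ("ok", action)
    return ("undoable", action, inverse)


# Usage: the same goto is classified differently once a location is recorded.
kb = {}
before = result_of(("goto", "node2"), kb)
set_location(kb, "node1")
after = result_of(("goto", "node2"), kb)
```

Actions whose side effects cannot be undone would need a third kind of entry in the knowledge base; that case is left out of this sketch.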

6 Life Cycle of the Agent

This section defines the life cycle of the agent. Based on the techniques in the previous sections, our agent integrates planning, action execution, knowledge (program) updates, and plan modifications as outlined below. The derivation is defined in Definition 2, the action-execution rules are defined in Definition 3, and the history of action execution is defined in Definition 4. Note that extra information (clauses and a history of action execution) will be recorded in association with a plan. Also, supplementary plans will be recorded separately.

Definition 5. Given the literal G (initial goal) and the program P (current program), the life cycle of the agent is as follows:

1. Record the empty history [] of action execution in association with the plan [G], and let {[G]} be the current set of plans.
2. Repeat the following procedure until one of the plans in the current set of plans becomes an empty plan []:


   a) (Program Updates) Repeat the following program updates if necessary:
      i. Assert (or retract) a clause¹⁴ C that defines a dynamic literal to (respectively, from) the current program.
      ii. If the clause C has been retracted from the current program in Step 2(a)i, delete each plan PLAN in the current set of plans and each supplementary plan PLAN recorded in association with the current set of plans, such that C is recorded in association with PLAN.
      iii. If the clause C has been asserted to the current program in Step 2(a)i, for each supplementary plan [L1, ..., Lk, ..., Ln] recorded in association with the current set of plans such that 1 ≤ k ≤ n, Lk is the selected literal, and the head of C is unifiable with Lk, add the resolvent of [L1, ..., Lk, ..., Ln] on Lk by C to the current set of plans.
   b) (Action Execution) If possible, select¹⁵ a plan [A, ...] from the current set of plans such that the first literal A is executable¹⁶, try to execute¹⁷ the action A, and for each plan [L1, ..., Ln] (n ≥ 0) in the current set of plans and for each supplementary plan [L1, ..., Ln] recorded in association with the current set of plans, modify the plan [L1, ..., Ln] or its history of action execution [H1, ..., Hm] (m ≥ 0) as follows:
      i. If n = 0 or the first literal L1 of the plan [L1, ..., Ln] is executable, then apply an action-execution rule to the plan [L1, ..., Ln] following the result of execution of action A.
      ii. Otherwise, add¹⁸ the result H of the execution of action A to the top of the history [H1, ..., Hm] of action execution.
   c) (Planning) If possible, make a derivation¹⁹ under the current program from the current set of plans to PLANS, and replace the current set of plans with PLANS.
   d) (Updates of Plans and their Histories) While there exists a plan [L1, ..., Ln] in the current set of plans or there exists a supplementary plan [L1, ..., Ln] recorded in association with the current set of plans, such that n ≥ 0, the first literal L1 is executable (if n ≥ 1), the history of action execution recorded in association with the plan [L1, ..., Ln] is [H1, H2, ..., Hm], and m ≥ 1, repeat the following:
      i. Apply an action-execution rule to the plan [L1, ..., Ln] following the result H1 of action execution and delete²⁰ H1 from [H1, H2, ..., Hm].

¹⁴ We assume that the agent has a module that decides to update the program. This module chooses the clause to add or delete at Step 2(a)i.
¹⁵ We assume that the agent has a module that selects a plan to execute.
¹⁶ As long as A is executable, the other literals in the plan [A, ...] do not have to be executable.
¹⁷ We assume that the agent has a module that executes the action A and returns the result of execution of A.
¹⁸ The history of action execution is updated to [H, H1, ..., Hm].
¹⁹ We assume that the agent has a module that makes a derivation using the derivation rules. The module decides how many plans it makes and how long it spends in making a derivation.
²⁰ The history of action execution is updated to [H2, ..., Hm].


3. The initial goal has been satisfied under the current program. (When the current program has to be updated, go to Step 2.)

Example 3. Consider the following program P where goto(_) (where _ is arbitrary) is an action and connected(_,_) (where each _ is arbitrary) is a dynamic literal.

move(A,B):-connected(A,B),goto(B).
move(A,B):-connected(A,X),goto(X),move(X,B).
connected(node1,node2).
connected(node2,node3).

Given the initial goal move(node1,node3) and the current program P, let us start the life cycle of the agent. The current set of plans is {[move(node1,node3)]}. The agent records the empty history [] of action execution in association with this current set of plans. Since the initial goal is not executable, the agent starts planning. As shown in Example 1, the agent can make a derivation from the current set of plans to the set of the following two plans:

– [goto(node2),goto(node3)];
– [goto(node2),connected(node2,X2),goto(X2),move(X2,node3)].

The set of the above two plans is now the current set of plans. As explained in Example 1, the clause connected(node2,node3) is recorded in association with the first plan. The clause connected(node1,node2) is recorded in association with the first plan and the second plan. The empty history [] of action execution is also recorded in association with each of the above plans. As in Example 2, the following three plans are recorded separately as supplementary plans:

– [connected(node1,node3),goto(node3)];
– [connected(node1,X1),goto(X1),move(X1,node3)];
– [goto(node2),connected(node2,node3),goto(node3)].

No clauses are recorded in association with the first two supplementary plans. The clause connected(node1,node2) is recorded in association with the third supplementary plan. The empty history [] of action execution is recorded in association with each of the above three supplementary plans.

Suppose that the agent has selected the plan [goto(node2),goto(node3)] from the current set of plans and executed the action goto(node2) successfully. The agent has just moved from node1 to node2. The agent modifies each plan in the current set of plans as follows:

– [goto(node3)];
– [connected(node2,X2),goto(X2),move(X2,node3)].

Note that the action goto(node2) has been removed from the above plans. The agent also modifies the supplementary plans as follows:

– [connected(node1,node3),goto(node3)];
– [connected(node1,X1),goto(X1),move(X1,node3)];
– [connected(node2,node3),goto(node3)].

The action goto(node2) has been removed from the third supplementary plan. Suppose that this successful execution of goto(node2) has a side effect and that the action goto(node1) will undo²¹ the execution of goto(node2). We express this result of action execution as undo(goto(node1),goto(node2)). Because the first literal of the first supplementary plan is not executable, the agent updates the history of action execution recorded in association with the first supplementary plan from [] to [undo(goto(node1),goto(node2))]. This history of action execution might be used to modify the first supplementary plan after decomposing the literal connected(node1,node3). For the same reason, the history of action execution recorded in association with the second supplementary plan is updated to [undo(goto(node1),goto(node2))].

Now suppose that the agent has found that node2 is not connected to node3. The agent retracts the clause connected(node2,node3) from the current program. The agent then removes the plan [goto(node3)] from the current set of plans because the deleted clause connected(node2,node3) is recorded in association with this plan. The only plan which exists in the current set of plans is as follows:

– [connected(node2,X2),goto(X2),move(X2,node3)].

Now suppose that the agent has found that node1 is connected to node3. The agent asserts the clause connected(node1,node3) to the current program. The agent then adds the plan [goto(node3)] to the current set of plans. Note that this new plan ([goto(node3)]) is the resolvent of the supplementary plan [connected(node1,node3),goto(node3)] on connected(node1,node3) by the asserted clause connected(node1,node3). Therefore, the history of action execution recorded in association with this supplementary plan is also recorded in association with the new plan [goto(node3)]. Using this history of action execution ([undo(goto(node1),goto(node2))]), the agent modifies this new plan to [goto(node1),goto(node3)]. Now the current set of plans is the set of the following plans:

– [goto(node1),goto(node3)];
– [connected(node2,X2),goto(X2),move(X2,node3)].

Note that the first plan has been made using the newly asserted clause, and the action goto(node1) will undo the executed action goto(node2). The agent continues its life cycle in this manner.
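The pruning performed when a clause is retracted (Step 2(a)ii of Definition 5) can be sketched as below, using the state of Example 3 just after goto(node2) has been executed. This is an illustrative Python sketch, not the picoPlangent implementation: literals and clauses are kept as plain strings, each plan is a dictionary holding its literals and its recorded clauses, and the name retract_clause is hypothetical.

```python
def retract_clause(program, plans, supplementary, clause):
    """Remove `clause` and every (supplementary) plan that recorded it."""

    def still_valid(plan):
        return clause not in plan["clauses"]

    program = [c for c in program if c != clause]
    plans = [p for p in plans if still_valid(p)]
    supplementary = [p for p in supplementary if still_valid(p)]
    return program, plans, supplementary


C12 = "connected(node1,node2)"
C23 = "connected(node2,node3)"

program = [C12, C23]
plans = [
    {"literals": ["goto(node3)"], "clauses": {C12, C23}},
    {"literals": ["connected(node2,X2)", "goto(X2)", "move(X2,node3)"],
     "clauses": {C12}},
]
supplementary = [
    {"literals": ["connected(node1,node3)", "goto(node3)"], "clauses": set()},
    {"literals": ["connected(node1,X1)", "goto(X1)", "move(X1,node3)"],
     "clauses": set()},
    {"literals": ["connected(node2,node3)", "goto(node3)"], "clauses": {C12}},
]

# The agent finds out that node2 is not connected to node3.
program2, plans2, supplementary2 = retract_clause(
    program, plans, supplementary, C23)
```

Only the plan [goto(node3)] recorded the retracted clause, so it is the only plan removed; the three supplementary plans survive and remain available for later clause additions.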

7 Efficiency Issues

Our method of modifying plans after a program update is based on Dynamic SLDNF (DSLDNF) [4] [5]. Experiments described in [4] and [5] have confirmed that when a program is updated, DSLDNF evaluates the initial goal faster than SLDNF under the updated program. This is true as long as DSLDNF can save computation time by pruning many parts of the search tree. Note that SLDNF has to reevaluate the initial goal from scratch under the updated program. Therefore, DSLDNF can significantly reduce the computation time if it takes SLDNF a long time to make the plans that are maintained by DSLDNF. However, in our new procedure, this kind of efficiency is not particularly important because we also modify plans after executing an action. If computation is started from scratch, the side effects of executed actions will be ignored. We would like to maintain the plans that take the executed actions into account.

²¹ By recording the previous location, it is possible to find that goto(node1) will undo goto(node2). This technique was explained in Section 5.

8 Semantics

This section defines the semantics for the life cycle of the agent. We define some axioms and explain the intuitions behind these axioms.

Definition 6. axioms(P) is defined as follows, where P is a program:

Axiom 1 For any literal L, axioms(P) implies: ∀hold(L, [L]).
Axiom 2 For each clause H:-B1, ..., Bn in the program P, axioms(P) implies: ∀hold(H, [B1, ..., Bn]).
Axiom 3 For any literals Y1, ..., Yk−1, Yk, Yk+1, ..., Yn, Z1, ..., Zm (1 ≤ k ≤ n, m ≥ 0) and X, axioms(P) implies: ∀(hold(X, [Y1, ..., Yk−1, Z1, ..., Zm, Yk+1, ..., Yn]) ← hold(X, [Y1, ..., Yk−1, Yk, Yk+1, ..., Yn]) ∧ hold(Yk, [Z1, ..., Zm])).
Axiom 4 For any action A and for any literals Y1, ..., Yk, ..., Yn (0 ≤ k ≤ n) and X, if the execution of A does not have a side effect, axioms(P) implies: ∀(hold(X, [Y1, ..., Yk, A, Yk+1, ..., Yn]) ← hold(X, [Y1, ..., Yk, Yk+1, ..., Yn])).
Axiom 5 For any action A and for any literals Y1, ..., Yk, ..., Yn (0 ≤ k ≤ n) and X, if the action A⁻¹ will undo the execution of A, axioms(P) implies: ∀(hold(X, [Y1, ..., Yk, A, A⁻¹, Yk+1, ..., Yn]) ← hold(X, [Y1, ..., Yk, Yk+1, ..., Yn])).

When the life cycle of the agent starts, the agent makes a plan [G] from the initial goal G. From Axiom 1, we can prove that the initial goal G is satisfied when the plan [G] is satisfied. According to Axiom 2, the clause H:-B1, ..., Bn in the program means that the literal H is satisfied when the plan [B1, ..., Bn] is satisfied. Axiom 3 says that if the literal L is satisfied when the plan PLAN is satisfied, then the literal L is also satisfied when a resolvent of PLAN is satisfied. Axiom 3 justifies literal decompositions in planning.

Plans can be affected by a program update. When a clause is added to the current program, some new plans might be added to the current set of plans. This is justified by Axiom 3 because these newly added plans are resolvents of supplementary plans. Note that the initial goal is satisfied when a supplementary plan is satisfied. When a clause is deleted from the current program, the plans which depend on the deleted clause will become invalid. (Actually, these plans will lose justifications that are based on Axiom 2.) These invalid plans will be deleted by the plan modifications. Note that clauses are recorded in association with a plan for this purpose.

Plan modifications after action execution are justified by Axiom 4 and Axiom 5. Axiom 4 means that if action A does not have any side effects, it can be executed in the middle of a plan. Axiom 4 justifies the action-execution rule s4. Axiom 5 means that if the action A⁻¹ undoes the action execution of A and the two actions A, A⁻¹ are executed in this order, then the execution of these two actions (A, A⁻¹) will not affect any plan. Axiom 5 justifies the action-execution rule s2. Some executed actions might have side effects that cannot be undone. In this case, useless plans will be abandoned by the action-execution rule s3. Using the axioms 1 to 5, we can prove the following theorem.

Theorem 1. (Soundness) When the life cycle of the agent is terminated successfully in Step 3 in Definition 5, there exists a substitution θ such that the following holds:

axioms(P) |= hold(θ(G), [θ(A1), ..., θ(An)])

where P is the current program, G is the initial goal, A1, ..., An (n ≥ 0) are all the actions that have been executed successfully since the beginning of the life cycle of the agent, and each Ai has been executed before Ai+1 (1 ≤ i ≤ n − 1).
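As a worked illustration (a sketch, not part of the original proof), the axioms can be instantiated for the running example. The abbreviations G, c_{ij}, g_i, and m_{23} are introduced here only for readability, and the recursive clause for move is read with move(X,B) as its last body literal, as the derivation in Example 1 requires. The first part derives the plan of Example 1; the second part uses the program of Example 3 after connected(node1,node3) has been asserted, together with the assumption that goto(node1) undoes goto(node2).

```latex
% G      = move(node1,node3)        c_{ij} = connected(node_i,node_j)
% g_i    = goto(node_i)             m_{23} = move(node2,node3)
\begin{align*}
&\mathit{hold}(G,\,[c_{12},\, g_2,\, m_{23}]) && \text{Axiom 2, second clause for move}\\
&\mathit{hold}(c_{12},\,[\,])                 && \text{Axiom 2, fact}\\
&\mathit{hold}(G,\,[g_2,\, m_{23}])           && \text{Axiom 3}\\
&\mathit{hold}(m_{23},\,[c_{23},\, g_3])      && \text{Axiom 2, first clause for move}\\
&\mathit{hold}(G,\,[g_2,\, c_{23},\, g_3])    && \text{Axiom 3}\\
&\mathit{hold}(c_{23},\,[\,])                 && \text{Axiom 2, fact}\\
&\mathit{hold}(G,\,[g_2,\, g_3])              && \text{Axiom 3}\\[1ex]
% Example 3, after asserting connected(node1,node3):
&\mathit{hold}(G,\,[c_{13},\, g_3])           && \text{Axiom 2, first clause for move}\\
&\mathit{hold}(c_{13},\,[\,])                 && \text{Axiom 2, asserted fact}\\
&\mathit{hold}(G,\,[g_3])                     && \text{Axiom 3}\\
&\mathit{hold}(G,\,[g_2,\, g_1,\, g_3])       && \text{Axiom 5 with } A = g_2,\ A^{-1} = g_1,\ k = 0
\end{align*}
```

The last line has the shape required by Theorem 1 for an agent that executes goto(node2), goto(node1), and goto(node3) in this order.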

9 Implementation of a Mobile Agent System

The agent life cycle has been applied to our new mobile agent system called picoPlangent, which is implemented in Java. This section shows how picoPlangent has been implemented. As shown in Figure 1, the picoPlangent agent works on a platform on which some components are installed. The agent uses a component on the platform when executing an action. Components are sometimes used to decompose an action into more primitive actions. In this case, we need another axiom:

Axiom 6 If a component replaces action A in a plan with the sequence²² A1, ..., An (n ≥ 0) of actions, then axioms(P) implies: ∀hold(A, [A1, ..., An]).

This action decomposition is based on the local rule of the platform. On the other hand, literal decomposition in planning (derivation) is based on the program of the agent.

The agent has its own knowledge such as a program. This knowledge also includes information about actions and dynamic literals. To declare that goto(_) is an action and connected(_,_) is a dynamic literal, we use the following Prolog clauses:

action(goto(_)).
dy(connected(_,_)).

Based on the program and other information in the knowledge, the agent makes (and modifies) plans using the planner component on the platform. The agent also uses its knowledge to check the side effects of actions. When moving from one platform to another, the agent carries plans, information that is necessary for future plan modifications, and the knowledge (compiled into Java objects). However, it normally does not carry any of the components, including the planner component.

One of the FAQs about program updates is when and how the agent updates the program. The picoPlangent agent updates the program by executing the action assert(C) or the action retract(C) where C is a clause. As in Prolog, assert(C) adds the clause C to the program and retract(C) deletes the clause C from the program. These actions are handled by the planner component. When executing the action assert(C), some new plans might be added to the current set of plans. Those added plans are created from supplementary plans using the clause C. When executing the action retract(C), some plans might be deleted from the current set of plans. The plans are deleted because they were created using the deleted clause C and are therefore invalid.

These actions (assert(C)/retract(C)) are put into plans by literal decompositions. Note that a literal is decomposed either by a planner or by a component. A planner decomposes literals following the program of the agent. In other words, the agent can decide when the program is updated. In contrast, a component decomposes literals (actions) following the local rules of the platform. This means that the component can instruct the agent to update its program.

²² The order of A1, ..., An matters.

Fig. 1. The picoPlangent Platform (figure labels: Network; Platform 1; Agent 1, "Say Hello", "Hello", Component 1, "Succeeded"; Agent 2, "Execute goto(platform2)", Component 2, "Replace goto(platform2) with gotoRMI(platform2)"; "Platform 2 is connected by RMI")
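The action decomposition by a component (Axiom 6) can be pictured with the following Python sketch. The only rule used is the one that appears in Figure 1, where goto(platform2) is replaced with gotoRMI(platform2); the dictionary of local rules, the function name, and the trailing sayHello action are assumptions made for illustration.

```python
def decompose_first_action(plan, local_rules):
    """Replace the first action of the plan using the platform's local rules.

    The replacement sequence keeps its order, and the rest of the plan
    is left unchanged."""
    if plan and plan[0] in local_rules:
        return list(local_rules[plan[0]]) + plan[1:]
    return plan


# Local rule of a platform that reaches platform2 through RMI (from Fig. 1).
LOCAL_RULES = {"goto(platform2)": ["gotoRMI(platform2)"]}

decomposed = decompose_first_action(
    ["goto(platform2)", "sayHello"], LOCAL_RULES)
```

Because the rules belong to the platform and not to the agent, the same plan may be decomposed differently on another platform.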

10 Related Work

As agents are becoming a major AI topic, an increasing amount of research is being carried out on the use of logic programs in dynamic environments. A dynamic logic programming agent architecture [8] uses a collection of (generalized) logic programs, and each program can be updated. This collection of programs can be mutually contradictory because each program might have a clause with a negation at its head. A special semantics is used to remove this kind of contradiction, and models are changed when a program is updated.

In reference [13], a procedure for robot navigation that combines sensing, planning, and action execution is proposed. However, when a new observation invalidates a plan, the procedure does not modify the plan. Instead, it replans from scratch.

Replanning is an important subject in the area of planning. Reference [11] explains the standard replanning method based on the IPEM planner [1] which smoothly integrates partial-order planning, execution, and monitoring. In the standard partial-order planning, protected links play important roles in replanning. A protected link records a fluent which has to remain true between two specified time points. Therefore, using protected links, it is possible to check if fluents do hold true as expected.

Our plan modification method in Section 3 is based on Dynamic SLDNF (DSLDNF) [4] [5], and it can deal with general program updates including clause addition and clause deletion. The following studies are related to this plan modification method: In reference [7], integrity constraints are used to reactively assimilate observed facts. However, the procedure in [7] cannot delete observed facts. A closely related work is Satoh's "speculative computation," as pointed out in [12]. In his system, some assumed facts (abducibles) are recorded in association with a process, and those assumed facts might be deleted later. His process modification method is similar to our plan modification method when deleting a clause. Note that our procedure records some clauses in association with a plan, and those clauses might be deleted from the program. (Our clause corresponds to an assumed fact in reference [12], while our plan corresponds to a process in the same reference.) Reference [6] shows that our procedure is also useful for speculative computation in multi-agent systems.

Transaction logic programming [2] is related to our plan modification method in Section 5. Like our procedure, the procedure in [2] executes literals in the body of a clause from left to right. When backtracking, it undoes some actions in a similar manner to our action-execution rule s2. However, our action-execution rule s2 is different from the action-canceling method of transaction logic programming because our action-execution rules modify plans without backtracking.

11 Conclusions

We have shown how to integrate planning, action execution, program updates, and plan modifications through logic programming. Our procedure is similar to Prolog to some extent, and it is relatively easy to imagine how the procedure works even if the program is written declaratively. Although our procedure does not use the knowledge describing the effects of actions on the state of the world expressed by fluents, it takes into account the side effects of executed actions. This plan modification is necessary because an action in a plan might prevent another plan from functioning. We can use three types of actions: actions without side effects, actions with side effects that can be undone, and actions with side effects that cannot be undone.

Also, our procedure enables modification of plans after a program update. When a clause is deleted from the program, some plans might become invalid. Our procedure erases those invalid plans. On the other hand, when the agent adds a new clause to the program, there might be new plans which depend on the newly added clause. Our procedure can create those new plans. In Section 9, we have shown how to implement a mobile agent system based on the agent life cycle defined in Section 6.

References

1. J. Ambrose-Ingerson and S. Steel. Integrating Planning, Execution, and Monitoring. National Conference on Artificial Intelligence (AAAI), pp. 83-88, 1988.
2. A. J. Bonner and M. Kifer. Transaction Logic Programming. International Conference on Logic Programming, pp. 257-279, 1993.
3. N. Fukuta, T. Ito, and T. Shintani. MiLog: A Mobile Agent Framework for Implementing Intelligent Information Agents with Logic Programming. Pacific Rim International Workshop on Intelligent Information Agents, 2000.
4. H. Hayashi. Replanning in Robotics by Dynamic SLDNF. IJCAI Workshop "Scheduling and Planning Meet Real-Time Monitoring in a Dynamic and Uncertain World," 1999.
5. H. Hayashi. Computing with Changing Logic Programs. PhD Thesis, Imperial College of Science, Technology and Medicine, University of London, 2001.
6. H. Hayashi, K. Cho, and A. Ohsuga. Speculative Computation and Action Execution in Multi-Agent Systems. ICLP Workshop on Computational Logic in Multi-Agent Systems, Electronic Notes in Theoretical Computer Science 70(5), 2002.
7. R. Kowalski and F. Sadri. From Logic Programming to Multi-Agent Systems. Annals of Mathematics and Artificial Intelligence, 1999.
8. J. A. Leite, J. J. Alferes, and L. M. Pereira. MINERVA - A Dynamic Logic Programming Agent Architecture. International Workshop on Agent Theories, Architectures, and Languages, 2001.
9. D. Nau, Y. Cao, A. Lotem, and H. Muñoz-Avila. SHOP: Simple Hierarchical Ordered Planner. International Joint Conference on Artificial Intelligence, pp. 968-975, 1999.
10. A. Ohsuga, Y. Nagai, Y. Irie, M. Hattori, and S. Honiden. PLANGENT: An Approach to Making Mobile Agents Intelligent. IEEE Internet Computing, 1(4):50-57, 1997.
11. S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice-Hall, 1995.
12. K. Satoh. Speculative Computation and Abduction for an Autonomous Agent. International Workshop on Non-Monotonic Reasoning, pp. 191-199, 2002.
13. M. Shanahan. Reinventing Shakey. In Logic-Based Artificial Intelligence, Kluwer Academic Press, pp. 233-253, 2000.

Empowering Mobile Software Agents

Volker Roth

Fraunhofer Institute for Computer Graphics, Dept. Security Technology
Fraunhoferstr. 5, 64283 Darmstadt, Germany
[email protected]

Currently at International Computer Science Institute, University of California, Berkeley
1947 Center Street, Suite 600, Berkeley CA 94704-1198, USA
[email protected]

Abstract. Recent work has shown that several cryptographic protocols for the protection of free-roaming mobile agents are vulnerable to protocol interleaving attacks. This paper presents equivalent protocols meant to be robust against this type of attack. Moreover, it describes the required processes and data structures at a level of detail that can be translated to an implementation in a straightforward way. Our aim is to demonstrate how cryptographic processing can be implemented transparently for agent programmers, thereby reducing the risk of human error in (secure) mobile agent programming.

Keywords: Mobile agent security, malicious host, protocol interleaving attack, applied cryptography.

1

Introduction

One class of mechanisms for the protection of data in free-roaming mobile software agents against malicious hosts (as opposed to e.g., protecting the confidentiality of an agent’s computation) is based on cryptographic protocols. In general, the objectives are twofold: first, an agent carries confidential information that is revealed only while the agent is on a trusted host, and second, the agent transports partial results back to its origin in a way that assures the integrity (and optionally the confidentiality) of the partial results. Furthermore, the owner of the agent shall be able to derive the identity of the host on which a given partial result was acquired. Recently, several of these protocols were shown to be vulnerable to interleaving attacks [1]. An interleaving attack [2, §10.5] is “an impersonation or other deception involving selective combination of information from one or more previous or simultaneously ongoing protocol executions (parallel sessions), including possible origination of 

This research was supported by the DAAD (German Academic Exchange Service). Views and conclusions contained in this document are those of the author and do not necessarily represent the official opinion, either expressed or implied, by the DAAD.

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 47–63, 2002. © Springer-Verlag Berlin Heidelberg 2002


one or more protocol executions by an adversary itself.” Figure 1 illustrates how interleaving attacks may be mounted in the domain of mobile agents: the adversary receives an agent, and copies protocol data back and forth between this agent and agents she sent herself.

Fig. 1. Basic scheme of attacks we mount against various protocols. Triangles denote agents. Triangles shaded in gray denote agents created by the adversary Eve.

In this paper, we put forward protocols that are designed to be robust against this type of attack.¹ We apply techniques from [1,3,4] and define their application at a level of abstraction that can be translated to an implementation in a straightforward way. We have the following additional objectives:

– Cryptographic protocol data shall be produced only if a partial result is generated.
– The implementation shall be independent of a particular agent programming language.
– There shall be a security policy that is simple and easy to understand.
– Cryptographic processing shall be transparent for agent programmers.

The last two objectives take into account the potential for human error. Security is often added as an afterthought rather than at design time. On top of that, it is customary that (agent) programmers need to “get the application going” within a tight schedule, and security features are likely sacrificed close to a deadline. Hence, it is prudent not to leave the management and implementation of cryptographic functions up to agent programmers – a task for which they are hardly ever trained in the first place.

We chose to represent a mobile agent as a ZIP [5] archive. Our previous approach [3] was based on the JAR [6] format. As it turned out, that choice offered little benefit to us and complicated the implementation. JAR Manifest sections do not require a canonical ordering and may contain additional (even custom) attributes. Therefore, said sections would require caching and complicated re-ordering so that automatic regeneration of Manifests does not invalidate, e.g., digital signatures that are computed on them. For the

¹ We do not take into account interleaving attacks based on malicious modification of an agent’s mutable state. This needs to be addressed by other means.


sake of simplicity, we changed to a format that has a more concise set of features, contains less redundancy, and is easier to parse and recreate in a cryptographically unambiguous way. Our current mobile agent structure still bears a strong resemblance to a JAR. In particular, we keep most of the JAR terminology, although the structures to which we refer have a different format. Please note, though, that the archive format is purely an implementation choice. The abstract mechanisms that we describe can be applied to other representations of a mobile agent as well, e.g., a representation based on XML [7].

The remainder of the paper is structured as follows. An overview of the mechanisms we describe is given in sect. 2. Section 3 gives the definitions that we use throughout the paper. The details of our work are the focus of sect. 4. A subset of the functionality that we describe in sect. 4 was implemented, and sect. 5 summarizes the results of our performance measurements. We describe related work in sect. 6, and give our conclusions in sect. 7.

2

Overview and Security Rationale

We implemented security functions on top of the ZIP format by means of meta files with a well-defined cryptographic interpretation. A schematic overview is given below:

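\[
\overbrace{f_0 \; f_1 \; \dots \; f_n}^{\text{Scope of } S_0} \;\;
\underbrace{S_0 \; f'_0 \; f'_1 \; \dots \; f'_m}_{\text{Scope of } S_z} \;\;
S_z \; M \; V \; D_0 \; \dots \; D_l
\]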

Files fi are static files, and f′i stands for mutable files in the agent. The static files are signed by the agent’s owner, which yields signature S0. The message digest h(S0) uniquely identifies the protocol run that is represented by that particular agent instance. Each host Ez signs the mutable files including the file with the owner’s signature, which yields signature Sz, thereby replacing the signature of the previous host E_{z-1}. We assume that a signer’s identity can be derived from the signature. Signatures are computed on Manifest sections of the files. The Manifest sections are kept current in file M.

Whenever the agent produces a partial result, that result is encrypted for the entity who produced S0, and the ciphertext is added to the mutable files. If the previous host added a partial result, then that host’s signature S_{z-1} is saved in the agent as well. Additionally, the current host adds the difference Dl between the previous version of M and the version it signs so that the input to the verification of S_{z-1} can be recovered later. If the previous host did not add a partial result then Dl is merged with D_{l-1}.

Signature Sz serves a dual purpose. It protects the overall integrity of the agent during transport and binds partial results to it for later verification. The indirection by means of the M file and the differential Manifests Di assures that saved signatures can be verified on a set of files where some files are allowed to change and others are not. Thus, the chain of (encrypted) partial results P1, ..., Pn is protected qualitatively as given below:

\[
\{h(S_0), \dots \{h(S_0), \{h(S_0), P_1\}_{S_1^{-1}}, P_2\}_{S_2^{-1}}, \dots, P_n\}_{S_n^{-1}}
\]


Signing h(S0) along with Pi prevents a Pi from being used in the context of an agent other than the agent with signature S0. Without further precautions, this scheme still allows arbitrary truncation of partial results, where a malicious host strips off an arbitrary number of outer signatures and grows a fake stem. This risk is reduced by means of a variant of Partial Result Authentication Codes (PRAC) [4] and encryption. Each encrypted partial result Pi is equivalent to ({mi}Ni, {Ni}K0), where K0 is the public encryption key of the entity who produced S0, mi is the plaintext of the partial result, and Ni is a random secret key chosen by the host that produced Pi. Key Ni is reused as input to a hash function that is computed on Ni, Si, and the previous partial result authentication code V_{i-1}. The new value Vi = h(Ni || V_{i-1} || Si) replaces the previous value V_{i-1} in the agent. This construction serves two purposes:

– The PRAC can be predicted neither into the past nor into the future based on the current value. The agent’s owner has access to the secret keys and thus can validate the sequence of computations that led to the final value of the PRAC.
– The input of Si and Ni into the PRAC computation generally prevents the adversary from signing and adding cipher texts for which he does not know the corresponding secret key (e.g., cipher texts that the adversary collected elsewhere).

In particular, these measures prevent the adversary from stripping off the last signature in the (truncated) chain of encrypted partial results, adding a signature of his own, and thus claiming to be the origin of the encrypted partial result (albeit not knowing the plain text).

Adversaries may still truncate the partial results at a position for which they know a valid PRAC. In particular, any host on a loop in the agent’s itinerary can replace the agent by a previously saved copy, thus undoing all state changes of that agent between consecutive visits.
There does not seem to be a way around this problem unless there is a notion of agent freshness, or an external state is used (e.g., based on co-operating agents [8]). Readers may notice that at this point we did not include a forward reference to the next hop of the agent into the PRAC computation, as is occasionally proposed in related work [9,10,11]. The reasons for this decision go beyond the scope of this paper and will probably be described elsewhere. We describe our security mechanisms in greater detail in sect. 4. For the sake of completeness, this includes mechanisms for binding encrypted data to a mobile agent that is made available to it only on trusted hosts. We covered that material already in [3, 1], see also def. 7 for details.
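The PRAC update described in this section can be sketched in a few lines. The following is only an illustration under stated assumptions: SHA-256 stands in for the hash function h, all inputs are byte strings of fixed length so that plain concatenation is unambiguous, and the names prac_update and prac_after are hypothetical. The input order N || V || S follows the formula given in this section.

```python
import hashlib


def prac_update(v_prev, n_i, s_i):
    """One PRAC step: V_i = h(N_i || V_{i-1} || S_i).

    SHA-256 is an assumed choice for h; the paper does not fix one here.
    """
    return hashlib.sha256(n_i + v_prev + s_i).digest()


def prac_after(v0, hops):
    """Fold the update over a list of (N_i, S_i) pairs, one per partial result."""
    v = v0
    for n_i, s_i in hops:
        v = prac_update(v, n_i, s_i)
    return v


# Placeholder values; a real V0 and the keys N_i would be chosen at random.
V0 = bytes(32)
HOPS = [
    (b"\x01" * 16, b"signature-of-host-1"),
    (b"\x02" * 16, b"signature-of-host-2"),
]

full = prac_after(V0, HOPS)
truncated = prac_after(V0, HOPS[:1])
print(full != truncated)  # dropping the last partial result changes the PRAC
```

Because each value depends on a secret key that only the producing host and the owner know, a host that sees only the current value cannot derive the next or the previous one without those keys.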

3

Definitions and Data Types

Table 1 gives an overview of the meta-structure of a mobile agent. Almost all data types are defined in terms of ASN.1 [12], and instances thereof are encoded by means of the Distinguished Encoding Rules (DER), which are defined in [13]. Each data type is associated with a file extension; the mapping is given in tab. 1. For the sake of brevity and precision, we also define some data types in an abstract way, including the operations that we require on these types. We will write encryption of some plaintext into a ciphertext symbolically as c = {m}K, where K is the key being used. A digital signature will be written as an encryption


with a private signing key S^{-1}. We will write S^{-1}(m) when we refer to the bare signature rather than the union of the signature and the signed data. We assume that the identity of the signer can be extracted from her signature. A cryptographic hash of some input will be written h(m). Unless noted otherwise, we assume that h is preimage resistant and collision resistant [2, §9.2.2].

Definition 1 (Collection). A collection C = {(si, vi) | i = 0, ..., n} is an ordered set of key/value pairs where si (the key) is a name and vi is the value. Each name occurs at most once in C, so si = sj ⇔ i = j. The sort order is the lexicographic order of the keys. Let zip and unzip be two functions so that for a given collection C: zip(C) is a value, and

unzip(zip(C)) = C

We assume that zip adds some redundancy to the value (to be verified by unzip) so that the probability that a random value is accepted by unzip is negligible. Below, we define additional operations on collections. The first operation selects all pairs from a collection C whose keys have a given prefix p (the prefix is removed from the keys). The second operation prepends a given prefix to the keys of a collection. The third operation removes values from the collection whose keys have a given prefix. The fourth operation returns the subset of a collection based on a list of keys. The fifth operation returns the keys of a given collection. The sixth operation returns the value that is associated with a given key.

\[
\begin{aligned}
\mathrm{children}(C, p) &= \{(s, v) \mid (p \circ s, v) \in C\} \\
\mathrm{move}(C, p) &= \{(p \circ s, v) \mid (s, v) \in C\} \\
\mathrm{remove}(C, p) &= \{(s, v) \in C \mid \forall s' : p \circ s' \neq s\} \\
\mathrm{select}(C, L) &= \{(s, v) \in C \mid s \in L\} \\
\mathrm{keys}(C) &= \{s \mid \exists v : (s, v) \in C\} \\
\mathrm{value}(C, s) &= \begin{cases} v & \text{if } (s, v) \in C \\ \text{error} & \text{else} \end{cases}
\end{aligned}
\]

where ’◦’ is the string concatenation. We furthermore define a binary copy operator ’⊲’ so that for two collections C1 and C2:

\[
C_1 \triangleleft C_2 = \{(s, v) \in C_1 \mid s \notin \mathrm{keys}(C_2)\} \cup C_2
\]

Thus, C2 “overwrites” the values in C1.

Definition 2 (Aliases). A mobile agent A is a collection. A Manifest is a collection. The elements of a Manifest are also called Manifest sections.

Definition 3 (Differential Manifest). Let M1, M2 be two Manifests. We define a difference operator diff and a differential Manifest D as given below:

\[
\begin{aligned}
\mathit{retain} &= M_1 \setminus M_2 && (1) \\
\mathit{delete} &= \mathrm{keys}(M_2) \setminus \mathrm{keys}(M_1) && (2) \\
\Rightarrow D = \mathrm{diff}(M_2, M_1) &= (\mathit{retain}, \mathit{delete}) && (3)
\end{aligned}
\]


We define the subtraction of a differential Manifest D from a Manifest M as follows:

\[
\mathrm{sub}(M, D) = \{(s, v) \in M \mid s \notin \mathrm{keys}(\mathit{retain}) \cup \mathit{delete}\} \cup \mathit{retain} \qquad (4)
\]
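As a minimal sketch of the diff and sub operators, assume that a Manifest is modelled as a Python dict that maps file names to digest strings; the file names and digest values below are made up for illustration. The example checks that subtracting diff(M2, M1) from M2 yields M1 again.

```python
def diff(m2, m1):
    """diff(M2, M1) = (retain, delete), following (1)-(3)."""
    # retain = M1 \ M2: sections of M1 that are missing from or differ in M2.
    retain = {s: v for s, v in m1.items() if s not in m2 or m2[s] != v}
    # delete = keys(M2) \ keys(M1): sections that exist only in M2.
    delete = set(m2) - set(m1)
    return retain, delete


def sub(m, d):
    """sub(M, D): drop the sections named in retain or delete, then add retain."""
    retain, delete = d
    result = {s: v for s, v in m.items() if s not in retain and s not in delete}
    result.update(retain)
    return result


# Hypothetical Manifests: M1 is the older version, M2 the newer one.
m1 = {"Main.class": "d1", "state.bin": "d2", "notes.txt": "d3"}
m2 = {"Main.class": "d1", "state.bin": "d9", "VAR-INF/0.ear": "d4"}

d = diff(m2, m1)
print(sub(m2, d) == m1)
```

Only the sections that changed, disappeared, or were added between the two versions are recorded, which is what keeps a differential Manifest compact.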

Differential Manifests represent a series of incremental changes of a Manifest in a compact way. This is used in sects. 4.6 and 4.7 to compute and verify cryptographic “checkpoints” of an agent’s state.

Theorem 1 (Manifest reconstruction). Let M2, M1 be two Manifests with identical message digest algorithm identifiers. Then the following equation holds:

\[
\mathrm{sub}(M_2, \mathrm{diff}(M_2, M_1)) = M_1
\]

Proof of Theorem 1. Follows by a simple argument over defs. (1) – (4).

Definition 4 (Meta, regular, static, and mutable files). The meta files of a given mobile agent A are the elements whose names start with “META-INF/”. The regular files regular(A) of A are all elements that are not meta files. The static files of the agent are the regular files marked • in tab. 1, plus any additional files that shall not be modified during the agent’s lifetime. The mutable files are all regular files that are not static files.

Definition 5 (Entities, groups, and associated key pairs). Let E = {E0, ..., En} be a set of entities. Any nonempty subset G ⊆ E is a group. Let (Si, Si^{-1}), (Ki, Ki^{-1}) be key pairs of Ei for use with a digital signature algorithm and a public key encryption scheme. For simplicity, we do not distinguish between entities and their identities. The difference should be clear from the context.

Definition 6 (Encrypted archives and seals). Let G = {E0, ..., En} be a group of entities with corresponding key pairs Ki, Ki^{-1}, 0 ≤ i ≤ n that are suitable for encryption. Let N be a randomly chosen group key, and let v be a value. Then {v}N is an encrypted archive and P = {(E0, {N}K0), ..., (En, {N}Kn)} is a collection where the keys are the identities of the group members and the values are the group keys encrypted with the keys of the group members. P is also called a seal, and the group members are called authorized recipients.

Definition 7 (Install file). Let A be a mobile agent, E0 be the agent’s owner, and let S0 be the owner’s verification key. Let G0, ..., Gm be groups of authorized recipients. Furthermore, let N0, ..., Nm be a list of randomly chosen group keys where Ni corresponds to group Gi. Then an install file I is defined as:

\[
\begin{aligned}
\mathit{acl} &= \{(p_i, j_i) \mid i = 0, \dots, n\} \\
\mathit{proofs} &= \{(k, h(N_k \,\|\, S_0)) \mid k = 0, \dots, m\} \\
\Rightarrow I &= (\mathit{acl}, \mathit{proofs})
\end{aligned}
\]


where ji ∈ {0, ..., m} and all pi are prefix-free. The first set of I is the access control policy of A and its interpretation is as follows: pi is the path name prefix of a collection of files in A, k is the identifier of the encrypted archive in which that file collection is kept, and ji is the index of the group whose group key the archive is encrypted with. The second set of I contains per group proofs of knowledge of the group key. They are used in order to ascertain that a given encrypted archive actually belongs to a given mobile agent.

Definition 8 (Useful macros). Let X be the symbol of an abstract data type as listed in tab. 1. Then name(X) is the corresponding path name. In order to render our descriptions of the algorithms more compact, we define the “algorithm snippets” given below:

\[
\begin{aligned}
\mathrm{store}(A, X_0, \dots, X_n) &= \Big[\, A \leftarrow A \triangleleft \bigcup_{i=0}^{n} \{(\mathrm{name}(X_i), X_i)\} \,\Big] \\
\mathrm{load}(A, X_0, \dots, X_n) &= \begin{bmatrix} X_0 \leftarrow \mathrm{value}(A, \mathrm{name}(X_0)) \\ \vdots \\ X_n \leftarrow \mathrm{value}(A, \mathrm{name}(X_n)) \end{bmatrix} \\
\mathrm{check}(C, C', X_0, \dots, X_n) &= \bigwedge_{i=0}^{n} \big(\mathrm{value}(C, \mathrm{name}(X_i)) = \mathrm{value}(C', \mathrm{name}(X_i))\big)
\end{aligned}
\]

where A is an agent, C, C′ are two collections, and X0, ..., Xn are abstract data types. In other words, store writes files to the meta-structure of agent A, load initializes abstract data types from data that is stored in agent A, and check compares corresponding sections of two collections for equality. Encoding and decoding is done according to the file types given in tab. 1.
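The collection operations of def. 1 map naturally onto dictionary operations. The sketch below assumes that a collection is a Python dict from names to byte strings; the function name copy for the binary copy operator and the sample agent are hypothetical. A dict does not enforce the lexicographic order of def. 1, so code that needs the order would iterate over sorted(c).

```python
def children(c, p):
    """Pairs whose keys start with prefix p, with the prefix removed."""
    return {s[len(p):]: v for s, v in c.items() if s.startswith(p)}


def move(c, p):
    """Prepend prefix p to every key."""
    return {p + s: v for s, v in c.items()}


def remove(c, p):
    """Drop every pair whose key starts with prefix p."""
    return {s: v for s, v in c.items() if not s.startswith(p)}


def select(c, names):
    """Keep only the pairs whose keys are listed in names."""
    return {s: v for s, v in c.items() if s in names}


def copy(c1, c2):
    """Binary copy operator: entries of c2 overwrite those of c1."""
    return {**c1, **c2}


agent = {
    "code/Main.class": b"\xca\xfe",
    "spool/a.txt": b"first",
    "spool/b.txt": b"second",
}

spool = children(agent, "spool/")
rebuilt = copy(remove(agent, "spool/"), move(spool, "spool/"))
print(sorted(spool), rebuilt == agent)
```

With these helpers, store amounts to copy(agent, {name: value}) and load to a plain lookup agent[name].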

4

Processes

In this section we define the cryptographic transformations of a mobile agent. These transformations are meant to implement the following security policy:

1. On creation of his mobile agent A, the owner E0 defines a path p and a group G of authorized recipients. If A is at some E ∈ G then the files in path p are made available to A. Please note that the opposite is not true. If some file is available in path p then this does not necessarily mean that the agent is at some E ∈ G. Extend this to n paths and corresponding groups.
2. Let p be the path “spool/”. All files in that path are partial results acquired at the current host Ec. These results are transported back to the agent’s owner E0 in confidentiality and integrity. Whenever A resumes execution at some host Ec it finds path “spool/” empty.


Table 1. The meta-structure of a mobile agent (first table). The symbols refer to the definitions of abstract data types to which the files correspond. The ’i’ in file names is replaced by the string value of the index of the corresponding symbol. Files marked • are static files and must be signed by the owner of the agent, files marked ◦ are signed by the agent’s sender, and files marked × are not signed. File extensions (second table) denote content that is encoded according to distinct formatting rules. Data structures marked with a † are wrapped into a PKCS#7 ContentInfo. Data structures marked with a ‡ are defined in this paper.

path/name          mark   symbol
META-INF/
  manifest.mf      ×      M
  owner.sf         ×      L0
  owner.p7s        ◦      S0
  sender.p7s       ×      Si, i > 0
  prac.bin         ×      V
  i.dmf            ×      Di
SEAL-INF/
  owner.cert       •      K0
  install.cfg      •      I
  i.p7m            •      Pi
  i.ear            ◦      Ai
VAR-INF/
  i.ear            ◦      Ri
  i.p7m            ◦      Qi
  i.p7s            ◦      Ti
name.class         •

Extension   Formatting
mf          ManifestFile‡
sf          SignatureFile‡
dmf         ManifestSections‡
p7s         PKCS#7 SignedData†
p7m         PKCS#7 EnvelopedData†
ear         raw encrypted ZIP file
cfg         InstallFile‡
bin         binary data
cert        X.509v3 Certificate

3. Upon return of A to E0, the agent finds the n’th partial result in path “results/n/”. Please note that again the opposite is not true. If an agent finds partial results then it need not necessarily be at the host of its owner.

This security policy gives meaning to certain paths in a mobile agent. It is enforced by the cryptographic processes that we describe below. Dealing with the security policy, rather than dealing with cryptographic detail, simplifies the task of agent programmers.

4.1

Inflation and Deflation

Several processing steps of a mobile agent require that, as a subprocess:

– cipher text is decrypted and installed in the meta-structure of the agent;
– clear text is encrypted into cipher text, and deleted subsequently from the meta-structure.

We refer to these processes as inflation and deflation of the agent. Both processes take as their arguments the encrypted archives of type either A or R, the paths where the clear text is installed, and the secret keys required to encrypt or decrypt the clear text.

acl = {(pi, ji) | i = 0, ..., n}, ji ∈ {0, ..., m}
secrets = {(i, Ni) | i = 0, ..., m}

Both algorithms are given below, where X is a placeholder for an encrypted archive of type either A or R:

deflateX(A, acl, secrets)
Require: Agent A, acl, secrets
1: for all (pi, ji) ∈ acl do
2:   N ← value(secrets, ji)
3:   Xi ← {zip(children(A, pi))}N
4:   A ← remove(A, pi)
5: end for
6: store(A, X0, ..., Xn)

inflateX(A, acl, secrets)
Require: Agent A, acl, secrets
1: for all (pi, ji) ∈ acl do
2:   if children(A, pi) ≠ ∅ then
3:     error {Agent not clean.}
4:   end if
5:   if ji ∈ keys(secrets) then
6:     load(A, Xi)
7:     N ← value(secrets, ji)
8:     C ← unzip({Xi}N)
9:     A ← A ⊲ move(C, pi)
10:  end if
11: end for
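The two algorithms above can be sketched as follows. This is an illustration under loud assumptions, not the implementation measured in sect. 5: the agent is a dict from names to bytes, zip and unzip are realized with Python's zipfile module on in-memory buffers, the archive names follow the SEAL-INF/i.ear pattern of tab. 1, and toy_cipher is a placeholder XOR keystream that is not secure and merely stands in for a real symmetric cipher.

```python
import hashlib
import io
import zipfile


def zip_collection(c):
    """zip(C): pack a name -> bytes collection into an in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(c):
            zf.writestr(name, c[name])
    return buf.getvalue()


def unzip_collection(blob):
    """unzip: raises zipfile.BadZipFile if the blob is not a valid archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def toy_cipher(key, data):
    """Placeholder XOR keystream (NOT secure); applying it twice decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def archive_name(i):
    return "SEAL-INF/%d.ear" % i  # assumed naming, following tab. 1


def deflate(agent, acl, secrets):
    """Encrypt the files under each ACL prefix and remove the clear text."""
    agent = dict(agent)
    archives = {}
    for i, (prefix, j) in enumerate(acl):
        files = {s[len(prefix):]: v for s, v in agent.items() if s.startswith(prefix)}
        archives[archive_name(i)] = toy_cipher(secrets[j], zip_collection(files))
        agent = {s: v for s, v in agent.items() if not s.startswith(prefix)}
    agent.update(archives)
    return agent


def inflate(agent, acl, secrets):
    """Decrypt the archives whose group key is known and install the files."""
    agent = dict(agent)
    for i, (prefix, j) in enumerate(acl):
        if any(s.startswith(prefix) for s in agent):
            raise ValueError("Agent not clean.")
        if j in secrets:
            blob = toy_cipher(secrets[j], agent[archive_name(i)])
            files = unzip_collection(blob)
            agent.update({prefix + s: v for s, v in files.items()})
    return agent


acl = [("secret/", 0)]
secrets = {0: b"k" * 16}
agent = {"code/Main.class": b"\xca\xfe", "secret/key.txt": b"hello"}

deflated = deflate(agent, acl, secrets)
inflated = inflate(deflated, acl, secrets)
print(sorted(deflated), inflated["secret/key.txt"])
```

A host that holds no group key simply leaves the archive untouched, and an agent that already contains files under an ACL prefix is rejected before anything is decrypted.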

The assertion children(A, pi) = ∅, which is verified in line 2 of the inflation algorithm, prevents unauthorized recipients from merging data into the agent. Assume that the assertion is not verified. Then an unauthorized recipient may add (pi ◦ s, x) to the agent for some s, x. During inflation at an authorized recipient of Ai, that data would be merged with the plain text and subsequently be copied into Ai by the deflation algorithm.

4.2

Instantiation of a Mobile Agent

Each mobile agent has an owner E0 with two different key pairs S0, S0^{-1} and K0, K0^{-1} for signing and encryption. The owner creates a file collection A and fills it with the data that is required by the mobile agent, e.g., the code, itinerary, and serialized object graph of the agent. This data shall not interfere with the definition of special files in tab. 1. E0 defines the groups G0, ..., Gm of authorized recipients and the access control policy acl = {(pi, ji) | i = 0, ..., n}, ji ∈ {0, ..., m}. Then, E0 computes the seals and the install file, and deflates the agent.

Require: Agent A, groups, S0
1: secrets ← ∅
2: proofs ← ∅
3: for j = 0, ..., m do
4:   Compute random key Nj
5:   secrets ← secrets ∪ {(j, Nj)}
6:   proofs ← proofs ∪ {(j, h(Nj || S0))}
7:   Compute Pj based on Gj, Nj according to def. 6
8: end for
9: I ← (acl, proofs)
10: store(A, I, P0, ..., Pm)
11: deflateA(A, acl, secrets)

Next, E0 decides which files of the agent shall be static. All program code of the agent must be static, as well as all files marked as being static in tab. 1. Let L0 be the names of these files. E0 computes the Manifest M of his agent, and signs it. The signature is


saved to the agent, as well as a secret initial partial result authentication code which is chosen randomly.

Require: Agent A, list of static files L0, private key S0^{-1}
1: M ← {(s, h(v)) | (s, v) ∈ regular(A)}
2: S0 ← S0^{-1}(select(M, L0), t0), t0
3: Compute random number V0
4: store(A, S0, L0, V0)

The owner of the agent is also its first sender. Therefore, E0 also signs the entire agent. More precisely, E0 updates the Manifest so that sections for the added files are included, and signs the sections of all mutable files as well as a digest of S0. This binds the mutable files of A to the agent’s static kernel, which prevents protocol interleaving attacks of the form described in [1].

Require: Agent A, list of static files L0, private key S0^{-1}, signature S0
1: L ← keys(regular(A)) \ L0
2: M ← {(s, h(v)) | (s, v) ∈ regular(A)}
3: S1 ← S0^{-1}(select(M, L), h(S0), t1), t1
4: store(A, M, S1)

E0 stores (h(S0), V0) for the purpose of verifying A after its return. This completes the instantiation of the agent. zip(A) is now sent to the first host. We do not consider replay of A to E0. This has to be detected by E0 by separate means.

4.3

Security Invariants of Migration

At each migration of a mobile agent, certain invariants must hold for the agent to be deemed valid. It is the responsibility of each host to verify these invariants upon receiving the agent. A failed verification means that the agent is invalid, and that it is not admitted to the host. Let A be an agent, let M be the Manifest as read from A, and let I be the installation file as read from A.

Invariant 1 All regular files in A must have a valid Manifest section with a matching message digest.

Invariant 2 All Manifest sections must be signed by either the owner or the sender of the agent. The Manifest sections signed by the owner are the sections of the static files.

Invariant 3 The sender must have signed the owner’s signature along with the Manifest sections of the mutable files.

Invariant 4 No two mobile agents with the same owner signature are admitted to the agent server.


Invariant 5 Whenever the current host E is an authorized recipient in the seal Pj of a group j there must be a proof of knowledge (j, x) in the install file of A so that x = h(Nj || S0 ) where Nj is the secret group key and S0 is the public key that verifies the owner’s signature. Since E is an authorized recipient, E can recover Nj from Pj .

Invariant 6 Let p be a path prefix (see def. 7) in I then there must not be a file in A whose name has p as its prefix.

Invariant 7 Code is loaded only from files whose Manifest sections are signed by the owner of the agent (files can also be loaded from remote code sources as long as the digest matches the one in the corresponding Manifest section).
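Some of the invariants above are purely structural and can be checked without any cryptographic key. The sketch below covers the digest part of invariant 1, the coverage part of invariant 2, and invariant 6. It assumes that the regular files and the Manifest are dicts, that SHA-256 serves as the digest algorithm, and that signature verification has already produced the two sets of signed section names; all function names are hypothetical.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).hexdigest()  # assumed digest algorithm


def invariant_1(regular_files, manifest):
    """Every regular file has a Manifest section with a matching digest."""
    return all(manifest.get(name) == digest(data)
               for name, data in regular_files.items())


def invariant_2(manifest, owner_signed, sender_signed):
    """Every Manifest section is covered by the owner's or the sender's signature."""
    return set(manifest) <= set(owner_signed) | set(sender_signed)


def invariant_6(agent_names, acl_prefixes):
    """No file in the incoming agent lies under a path prefix of the install file."""
    return not any(name.startswith(p) for name in agent_names for p in acl_prefixes)


files = {"code/Main.class": b"\xca\xfe", "state.bin": b"\x00\x01"}
manifest = {name: digest(data) for name, data in files.items()}

print(invariant_1(files, manifest))
print(invariant_2(manifest, {"code/Main.class"}, {"state.bin"}))
print(invariant_6(files, ["secret/"]))
```

Invariants 3 to 5 and 7 additionally depend on signature verification and key recovery, which this sketch leaves out.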

4.4

Processing of an Incoming Agent

Let E be a host and let A be a mobile agent that is received by E. E first checks the invariants 1-4. Then it determines the groups of authorized recipients to which it belongs and checks invariant 5 as given below:

Require: Agent A, public key S0 of the owner E0 of A
1: secrets ← ∅
2: load(A, I, P0, ..., Pn)
3: for all (j, x) ∈ proofs do
4:   if E ∈ keys(Pj) then
5:     Nj ← {value(Pj, E)}K_E^{-1}
6:     if x ≠ h(Nj || S0) then
7:       error {Ciphertext not owned by agent, invariant 5 violated}
8:     end if
9:     secrets ← secrets ∪ {(j, Nj)}
10:  end if
11: end for
12: inflateA(A, acl, secrets)

This completes the verification and the setup of incoming agents. Based on the established signing keys and host-specific policies the agent is either authorized and executed, or it is rejected.
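The proof-of-knowledge check of invariant 5 can be illustrated as follows. The sketch assumes SHA-256 for h and byte strings for keys; the decryption of the seal with the host's private key is not shown, so the dict recovered_keys stands for the group keys that the host has already recovered from the seals Pj. The names proof and collect_secrets are hypothetical.

```python
import hashlib
import hmac


def proof(group_key, owner_key):
    """h(N_j || S_0): ties group key N_j to the owner's verification key S_0."""
    return hashlib.sha256(group_key + owner_key).digest()


def collect_secrets(proofs, recovered_keys, owner_key):
    """Keep only group keys whose proof matches; fail on any mismatch.

    proofs:         j -> x, as read from the install file
    recovered_keys: j -> N_j, for the seals this host could open (assumed input)
    """
    secrets = {}
    for j, x in proofs.items():
        if j in recovered_keys:
            n_j = recovered_keys[j]
            if not hmac.compare_digest(x, proof(n_j, owner_key)):
                raise ValueError("Ciphertext not owned by agent, invariant 5 violated")
            secrets[j] = n_j
    return secrets


owner_a = b"public-key-of-owner-A"
owner_b = b"public-key-of-owner-B"
group_keys = {0: b"\x11" * 16, 1: b"\x22" * 16}
proofs_a = {j: proof(n, owner_a) for j, n in group_keys.items()}

# This host belongs to group 1 only.
print(collect_secrets(proofs_a, {1: group_keys[1]}, owner_a) == {1: group_keys[1]})
```

If an adversary copies the seal and the proofs into an agent signed by a different owner, the recomputed hash uses the other owner's key and the check fails, so the archive is not decrypted on the adversary's behalf.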

4.5

Execution of a Mobile Agent

For all code that is loaded into the code segment of the agent, invariant 7 must be assured. The agent is provided read access to its meta-structure by means of a suitable abstraction, and write access to all files in its meta-structure but those listed in tab. 1.

4.6

Processing of an Outgoing Agent

Let the current host Ez be the z’th sender of the agent with z > 1. The mobile agent A might have modified its meta-structure in a way that violates invariant 6. Therefore, this invariant must be assured, and all files that are subject to the agent’s access control policy must be re-encrypted with the group key.

Require: Agent A, public key S0 of the owner E0 of A
1: Compute acl and secrets as in sect. 4.4 (do not inflate A again)
2: Deflate A based on acl, secrets, and archive type A as described in sect. 4.1

The next step is to seal off any partial results the agent wishes to take home in confidentiality, if there are any. The agent must store these results in a spooling area in the meta-structure which is reserved for this purpose. First, E has to find the smallest unused index in the sequence of encrypted archives. Then E creates a seal with the owner of the agent as the authorized recipient, and an encrypted archive with the contents of the spool area. The plain text is deleted subsequently.

Require: Agent A, agent owner identity E0, public key Sz of current host
1: n ← min{i ∈ N0 | name(Qi) ∉ keys(A)}
2: load(A, K0, V)
3: if children(A, “spool/”) ≠ ∅ then
4:   Compute random key N
5:   Qn ← {(E0, {N}K0)}
6:   V ← h(V || N || Sz)
7:   store(A, Qn, V)
8:   deflateR(A, {(“spool/”, n)}, {(n, N)})
9: end if

If any encrypted archive of type R was added by the previous host then this fact shall be recorded by storing the signature of the agent’s most recent sender in the agent. The chain of signatures is verified by the agent’s owner after the agent’s return. Its purpose is not to provide non-repudiation of origin. Rather the purpose is to testify that the partial results have not been tampered with, and on which host the results were accumulated.
10: load(A, M, L0, S_{z-1})
11: if n > 0 then
12:   if name(T_{n-1}) ∈ keys(A) then
13:     load(A, D_{n-1})
14:     M ← sub(M, D_{n-1})
15:   else
16:     T_{n-1} ← S_{z-1}
17:     store(A, T_{n-1})
18:   end if
19:   M′ ← {(s, h(v)) | (s, v) ∈ regular(A)}
20:   D_{n-1} ← diff(M′, M)
21:   store(A, D_{n-1})
22: end if
23: L ← keys(regular(A)) \ L0
24: Sz ← Sz^{-1}(select(M, L), h(S0), tz), tz
25: store(A, M, Sz)
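The first step of sealing the spool area, finding the smallest index i for which no seal Qi exists yet, is easy to get wrong by one. A small sketch, assuming the VAR-INF/i.p7m naming for Qi from tab. 1 and a hypothetical helper name next_free_index:

```python
from itertools import count


def seal_name(i):
    return "VAR-INF/%d.p7m" % i  # assumed path of seal Q_i, following tab. 1


def next_free_index(agent_names):
    """Smallest i in N0 such that name(Q_i) is not yet a key of the agent."""
    return next(i for i in count() if seal_name(i) not in agent_names)


names = {"code/Main.class", "VAR-INF/0.p7m", "VAR-INF/0.ear", "VAR-INF/1.p7m"}
print(next_free_index(names))
```

With two seals already present the function yields 2, and for an agent without any partial results it yields 0. This also matches the final processing in sect. 4.7, where the index of the last stored result is this value minus one.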

The purpose of the algorithm given above is to compress a record of changes to the meta-structure that occurred between two additions of an encrypted archive. This means that cryptographic protocol data is added to the agent only if the agent generated partial results. Given a valid Manifest Mz and sender signature Sz, the application of a differential Manifest D_{n-1} to the given Manifest shall yield a Manifest M_{z-i} that can be validated with the previously recorded signature S_{z-i} = T_{n-1}, where no encrypted archives were added for i − 1 hops. Manifest M_{z-i} can be used to verify the integrity of data that was added by the (z − i)’th sender and has not been modified subsequently.

4.7

Final Processing of a Returning Mobile Agent

Upon the return of an agent that is owned by the receiving entity, the accumulated partial results must be retrieved and their integrity must be verified based on the cryptographic protocol data contained in the mobile agent. This takes place after the regular processing of an incoming agent as described in sect. 4.4. Verification takes place in two passes. The first pass verifies the chain of signatures and the integrity of the encrypted archives. The initial Manifest has already been verified successfully at this stage and thus the sections of these Manifests must bear the correct message digests of the agent’s contents.

Require: Agent A, E0, K0⁻¹, public key Sz−1 of the most recent sender
    load(A, M, L0, S0, V)
    n ← min{i ∈ N0 | name(Qi) ∉ keys(A)} − 1
    M′ ← M \ select(M, L0)
    if n ≥ 0 ∧ name(Tn) ∈ keys(M) then
        signers ← {(n, Sz−1)}
        n ← n − 1
    else
        signers ← {∅, ∅}
    end if
    for i = n, ..., 0 do
        load(A, Ti, Di)
        M′ ← sub(M′, Di)
        if ¬(E0 knows S : S verifies Ti, M′, S0) then
            error {Bad signature, results are incomplete.}
        end if
        if ¬check(M, M′, Qi, Ri) then
            error {Encrypted data was tampered with.}
        end if
        if n > 0 ∧ ¬check(M, M′, Ti−1) then
            error {Previous signature was tampered with.}
        end if
        signers ← signers ∪ {(i, S)}
    end for

The second pass verifies the correctness of the partial result authentication code. This prevents some forms of truncation attacks and an attack where a host claims to be the originator of a previously added partial result. The algorithm continues as follows:

    V′ ← V0
    for i = 0, ..., n do
        load(A, Qi)
        Ni ← {value(Qi, E0)}_{K0⁻¹}
        secrets ← secrets ∪ {(i, Ni)}
        acl ← acl ∪ {(“results/” ◦ i, i)}
        S ← value(signers, i)
        V′ ← h(V′ || Ni || S)
    end for
    if V ≠ V′ then
        error {Partial result authentication code mismatch.}
    end if

We have to make sure that no unauthorized data is merged into the partial results of the agent by the decryption process.

    if children(A, “results/”) ≠ ∅ then
        error {Agent is not clean.}
    end if
    inflateR(A, acl, secrets)

This completes the final processing of the mobile agent. Please note that the partial results must be deleted from the agent if it migrates after the final processing. Otherwise, the agent might leak the clear text.
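The chained computation V′ ← h(V′ || Ni || S) in the second pass can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: SHA-1 stands in for h, the secrets Ni and signer identities S are treated as opaque byte strings, and plain concatenation is used without any length framing. The names `prac` and `verify_prac` are hypothetical and do not come from the implementation discussed in this paper.

```python
import hashlib
import hmac
from typing import Iterable, Tuple


def prac(v0: bytes, entries: Iterable[Tuple[bytes, bytes]]) -> bytes:
    """Fold (secret, signer) pairs into a running digest: V' <- h(V' || N_i || S)."""
    v = v0
    for secret, signer in entries:
        # Assumption: h is SHA-1 and '||' is plain byte concatenation.
        v = hashlib.sha1(v + secret + signer).digest()
    return v


def verify_prac(carried: bytes, v0: bytes,
                entries: Iterable[Tuple[bytes, bytes]]) -> bool:
    """Recompute the chain and compare it with the value carried by the agent."""
    return hmac.compare_digest(carried, prac(v0, entries))
```

Because every link depends on all earlier links, dropping a trailing entry or attributing an entry to a different signer changes the final value, which is the property the second pass relies on.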

5

Implementation Notes

The algorithms given in sect. 4 are not yet entirely implemented. However, we implemented a (functionally equivalent) subset of these processes with the following exceptions and differences:
– Agents are encoded as JAR files; Manifests comply with the JAR standard.
– Partial results are handled differently and no PRACs are computed or verified.
– Pre-generated encrypted archives may be either static or mutable files so that an agent can modify or store partial results in them while being hosted by an authorized recipient.
– Signatures of previous hosts are not saved.
– Hence there is no final processing of a returning agent (sect. 4.7) on top of the regular processing of an incoming agent.


The remaining set of features allows an agent to carry encrypted data that is transparently revealed to the agent while it is hosted by an authorized recipient. Agents can modify the data, which is re-encrypted transparently before the agent migrates. Interleaving attacks on the encrypted data are detected as described in sect. 4. However, an adversary may replace an encrypted archive with a previous version of that archive, thus possibly rolling back modifications.

We measured the overhead of the cryptographic processing using a setup of four computers. The implementation was based on Java Version 1.3.0_01 (HotSpot VM, native threads, sunjit) and ran on Sun Ultra 5/10, UPA/PCI (UltraSPARC-IIi 333 MHz), 111 MHz, Solaris 8, which were connected to a Switched Fast Ethernet (100 MBit/s) that serves more than 200 workstations and PCs which are accessed by more than 320 staff members, researchers and students. The software was loaded from an Auspex 700 fileserver. The cryptographic processing of agents was implemented by means of security filter plugins for the SeMoA mobile agent server [17]. We used DSA for signing and RSA with Triple DES for encryption.

In our measurements, we let a mobile agent migrate 600 times between these computers. In each test, we varied the size of the payload carried by the agent and set it to 0, 32, 64 and 96 KB respectively. The payload consisted of random data produced by means of the SHA1 pseudorandom number generator that is included in the JDK. Our intention was to produce a payload that yields a low compression rate before encryption takes place. The size of the bare agent without payload was about 23 KB.

Figure 2 summarizes the results of our measurements. The dashed line gives the mean time per migration where neither signatures nor encryption is used. The solid line gives the mean migration time where only signatures are computed on the agent. This means that on each migration two signatures were verified and one signature was generated. The dotted line gives the mean time per migration where the payload is decrypted and re-encrypted on each migration, on top of the signature computation. The reported agent size is the overall size of the agent as it was transported between hosts. Brevity prevents us from interpreting or comparing the results, which needs to be done in separate work.

6

Related Work

Several protocols of the type on which we focus in this paper were published in the past [9,10,11,14] and were found to be vulnerable [1]. The work we present in this paper is meant to “tie some loose ends” that led to the aforementioned attacks. Additional protocols, e.g., those published in [11,15], remain to be investigated with regard to the applicability of the attacks described in [1]. The representation of a mobile agent as a collection of files matches the Briefcase abstraction in Tacoma [16], and shares the same advantages. Most notably, the representation is independent of a particular agent implementation language. Our contribution is the addition of a security layer that is tailored to the needs of mobile agents.


Fig. 2. This figure gives the mean time per migration of a mobile agent with a varying payload.

7

Conclusions

In this paper we presented algorithms and data structures meant to protect free-roaming mobile agents against certain types of malicious host attacks, namely, attacks on the integrity and confidentiality of data that is brought or acquired by a mobile agent on its route. Our approach has the advantage that it:
– is robust against interleaving attacks as described in [1];
– can be implemented transparently for agent programmers by means of cryptographic preprocessing and postprocessing of the agent meta-structure based on a simple-to-understand security policy;
– simultaneously protects the overall integrity of the agent during transport;
– adds cryptographic protocol data only if a partial result is generated; and
– is file-based, yet suitable abstractions for, e.g., object-oriented agent programming languages are easily provided.

The level of detail at which we describe our approach eliminates ambiguities that may be introduced by a presentation that focusses on the pure protocol aspects, thereby discounting the side effects of a protocol’s translation into an implementation. It should be noted, though, that a mobile agent still needs to be programmed carefully so that tampering with its mutable state does not lead to leakage of sensitive data.

References

1. V. Roth, “On the robustness of some cryptographic protocols for mobile agent protection,” in Proc. Mobile Agents 2001, vol. 2240 of Lecture Notes in Computer Science, Springer Verlag, December 2001.
2. A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone, Handbook of Applied Cryptography. Discrete Mathematics and its Applications, New York: CRC Press, 1996. ISBN 0-8493-8523-7.
3. V. Roth and V. Conan, “Encrypting Java Archives and its application to mobile agent security,” in Agent Mediated Electronic Commerce: A European Perspective (F. Dignum and C. Sierra, eds.), vol. 1991 of Lecture Notes in Artificial Intelligence, pp. 232–244, Berlin: Springer Verlag, 2001.
4. M. Bellare and B. Yee, “Forward integrity for secure audit logs,” tech. rep., Computer Science and Engineering Department, University of California at San Diego, November 1997.
5. PKWARE Inc., 9025 N. Deerwood Dr., Brown Deer, WI 53223-2480, .ZIP File Format Specification, November 2001. Available at URL http://www.pkware.com/support/appnote.html.
6. T. Dell, D. Hopwood, D. Brown, B. Renaud, and D. Connelly, Manifest Format. Sun Microsystems Inc. and Netscape Corporation, March 1999. Available at URL http://java.sun.com/products/jdk/1.2/docs/guide/jar/.
7. T. Bray, E. Maler, J. Paoli, and C. M. Sperberg-McQueen, “Extensible Markup Language (XML) 1.0,” W3C Recommendation, W3C, October 2000. Available at URL http://www.w3.org/TR/2000/REC-xml-20001006.
8. V. Roth, “Mutual protection of co–operating agents,” in Secure Internet Programming: Security Issues for Mobile and Distributed Objects (J. Vitek and C. Jensen, eds.), vol. 1603 of Lecture Notes in Computer Science, pp. 275–285, New York, NY, USA: Springer-Verlag Inc., 1999.
9. A. Corradi, R. Montanari, and C. Stefanelli, “Mobile agents protection in the Internet environment,” in The 23rd Annual International Computer Software and Applications Conference (COMPSAC ’99), pp. 80–85, 1999.
10. G. Karjoth, N. Asokan, and C. Gülcü, “Protecting the computation results of free–roaming agents,” in Proceedings of the Second International Workshop on Mobile Agents (MA ’98) (K. Rothermel and F. Hohl, eds.), vol. 1477 of Lecture Notes in Computer Science, pp. 195–207, Berlin Heidelberg: Springer Verlag, September 1998.
11. G. Karjoth, “Secure mobile agent-based merchant brokering in distributed marketplaces,” in Proc. ASA/MA 2000 (D. Kotz and F. Mattern, eds.), vol. 1882 of Lecture Notes in Computer Science, pp. 44–56, Berlin Heidelberg: Springer Verlag, 2000.
12. International Telecommunication Union, Information technology – Abstract Syntax Notation One (ASN.1): Specification of basic notation, December 1997. ITU-T Recommendation X.680, equivalent to ISO/IEC International Standard 8824-1.
13. International Telecommunication Union, Information technology – ASN.1 encoding rules: Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER), December 1997. ITU-T Recommendation X.690, equivalent to ISO/IEC International Standard 8825-1.
14. N. M. Karnik and A. R. Tripathi, “Security in the Ajanta mobile agent system,” Technical Report TR-5-99, University of Minnesota, Minneapolis, MN 55455, U.S.A., May 1999.
15. S. Loureiro, Mobile Code Protection. Ph.D. thesis, Ecole Nationale Supérieure des Télécommunications, January 2001.
16. D. Johansen, R. van Renesse, and F. B. Schneider, “An introduction to the TACOMA distributed system version 1.0,” Technical Report 95-23, Department of Computer Science, University of Tromsø, June 1995.
17. V. Roth and M. Jalali, “Concepts and architecture of a security-centric mobile agent server,” in Proc. Fifth International Symposium on Autonomous Decentralized Systems (ISADS 2001), (Dallas, Texas, U.S.A.), pp. 435–442, IEEE Computer Society, March 2001. ISBN 0-7695-1065-5.

An Intrusion Detection System for Aglets

Giovanni Vigna, Bryan Cassell, and Dave Fayram

Department of Computer Science
University of California Santa Barbara
{vigna,dfayram,bryanc}@cs.ucsb.edu

Abstract. Mobile agent systems provide support for the execution of mobile software components, called agents. Agents acting on behalf of different users can move between execution environments hosted by different organizations. The security implications of this model are evident and these security concerns have been addressed by extending the authentication and access control mechanisms originally conceived for distributed operating systems to mobile agent systems. Other well-known security mechanisms have been neglected. In particular, satisfactory auditing mechanisms have seldom been implemented for mobile agent systems. The lack of complete and reliable auditing makes it difficult to analyze the actions of mobile components to look for evidence of malicious behavior. This paper presents an auditing facility for the Aglets mobile agent system and an intrusion detection system that takes advantage of this facility. The paper describes how auditing was introduced into the Aglets system, the steps involved in developing the intrusion detection system, and the empirical evaluation of the approach. Keywords: Mobile Agents, Security, Intrusion Detection, Auditing.

1

Introduction

Mobile agent systems provide a distributed computing infrastructure that supports the execution of mobile components, called mobile agents [9]. In the most general case, mobile agents act on behalf of different users and, in addition, the nodes that compose the infrastructure may be managed by different authorities (e.g., a university or a private company). The mobile agent paradigm provides a number of advantages with respect to the traditional client-server paradigm. The ability to relocate the components of an application supports service customization, optimized access to distributed resources, and deployment in a mobile networking environment [25]. On the other hand, the ability to move and execute code fragments has serious security implications [8,5,15]. In particular, the recent worm attacks [4,6] showed that malicious mobile software is tolerant to eradication and allows one to perform distributed denial-of-service attacks.

The security issues introduced by mobile agents have been addressed by extending the authentication and access control mechanisms originally conceived for distributed systems to address mobility [10]. The goal of authentication and access control mechanisms is to prevent impersonation and unauthorized access to system resources. These mechanisms provide little support for the detection of attacks that either circumvent protection mechanisms or abuse legitimate access to resources. These attacks can be detected by analyzing the operating system information generated by the actions performed by users and applications. The process of collecting this information is called auditing [2].

Meaningful and complete auditing information is a prerequisite for effective intrusion detection. If the information included in the collected events is not complete, attacks may go undetected. In addition, intrusion detection may not be possible at all because attacks may not have a manifestation that can be identified in the audit trail [32]. Unfortunately, this is the case for mobile agent systems. Most mobile agent systems provide no auditing mechanisms or are able to produce only incomplete information about the activity of mobile agents. In addition, different agents are usually executed within a single user process that acts as the execution environment (e.g., a Java Virtual Machine). Therefore, the resulting audit trail at the operating system level contains the actions performed by the execution environment as a whole and cannot be “sorted” to associate a subset of the audit records with the execution of a single mobile agent. To overcome this problem, it is necessary to perform auditing at the agent system level, where meaningful information about the actions of each agent can be collected.

This paper describes the design and implementation of a facility for the collection of audit trails in the Aglets system [20] and an intrusion detection system that takes advantage of the auditing information. The auditing system provides extensive information about the activity of mobile agents running within an execution environment. This information is leveraged by an intrusion detection system specifically developed for Aglets. The system allows one to detect attacks that bypass the authorization mechanisms, abuse legitimate access, or violate the security policy of an Aglets server.

The paper is structured as follows. Section 2 presents related work on the topic of auditing and intrusion detection. Section 3 outlines our approach to detect intrusions in the Aglets system. Section 4 describes how auditing was introduced into the Aglets system. Section 5 presents an intrusion detection system for Aglets. Section 6 contains a quantitative evaluation of the auditing system and shows examples of the use of the intrusion detection system. Section 7 draws some conclusions and outlines future work.

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 64–77, 2002. © Springer-Verlag Berlin Heidelberg 2002

2

Related Work

Intrusion detection is performed by analyzing one or more input event streams, looking for the manifestation of an attack. Traditionally, the input stream is either represented by packets transmitted on a network segment or by the audit records produced by the auditing facility of an operating system. Examples of event streams are the audit records generated by the Solaris Basic Security Module (BSM) [27], traffic logs collected using tcpdump [28], and syslog messages.


Historically, detection has been achieved following two different approaches: anomaly detection and misuse detection. Anomaly detection relies on models of the “normal” behavior of a computer system. These models may focus on the users, the applications, or the network. Behavior profiles may be built by performing statistical analysis on historical data [12,17] or by using rule-based approaches to specify behavior patterns [19,34,35,26]. Anomaly detection compares actual usage patterns against the established profiles to identify abnormal patterns of activity. Misuse detection systems take a complementary approach. The detection tools are equipped with a number of attack descriptions. These descriptions (or “signatures”) are matched against the stream of audit data looking for evidence that the modeled attack is occurring [14,21,24].

Misuse and anomaly detection both have advantages and disadvantages. Misuse detection systems can perform focused analysis of the audit data and usually they produce few, if any, false positives. At the same time, misuse detection systems can detect only those attacks that have been modeled. Anomaly detection systems have the advantage of being able to detect previously unknown attacks. This advantage is paid for with a large number of false positives and the difficulty of training a system for a very dynamic environment.

Mobile agents have sometimes been advocated as a means to perform intrusion detection in distributed systems [13,16,3,29]. In this context, intrusion detection systems are designed as mobile applications that roam the network to detect attacks and track intruders. The approach described in this paper takes a different perspective. The approach focuses on the detection of attacks against mobile agent systems and, in particular, against Aglets. Detection is achieved by instrumenting the Aglets system to produce events about the activity of mobile agents and then using a misuse detection tool to analyze the event stream.

Similar approaches to auditing of mobile code have been proposed for the Anchor Toolkit [22] and to support policy verification for mobile programs [11]. Unfortunately, the former does not provide complete information about agents’ activity and, to the best of our knowledge, has not been implemented. The latter does not take into account the mobile agent paradigm and focuses on policy specification and access control for code on demand.

3

Architecture

The goal of our research is to develop mechanisms, techniques, and tools to support intrusion detection in the context of mobile agent systems. The approach includes the design and implementation of mechanisms for the collection of complete auditing information about mobile agent execution and the development of an intrusion detection system that uses the collected audit trails to detect attacks against both the mobile agents and the execution environments. This paper presents the application of the approach to a particular mobile agent system, namely Aglets, version 2.0.2. Aglets is a well-known mobile agent system whose sources have been made available through the SourceForge open-source initiative [1]. In the Aglets system, mobile agents are called “aglets” and


with code that logs the information about the requested operation and the identity of the involved parties. In addition, the Java Security Manager implementation provided with the system was extended to log the parameters of both successful and failed operations. The aglets interaction procedures that were instrumented are create, clone, dispose, dispatch, and retract. In addition, the Java Security Manager was modified to log detailed information about system-level operations. For example, logging was implemented for file operations (e.g., open, close, read, and write), socket operations (e.g., accept and listen), and system property operations (for example, the request to modify the java.home property). To log these events, it was necessary to access the information about an executing aglet from its corresponding Java thread. A method was added to the AgletThread class. The method returns the aglet’s AgletInfo structure, which contains all the information necessary to identify the aglet. The logs produced by the auditing system are in XML format. An audit trail associated with the execution of an Aglets server is an XML document containing a series of events. The structure of an event is designed to accommodate a variety of actions that are relevant in the Aglets system. Each event contains a source, an action, and a result. An example of an event element is shown in Figure 2.









Fig. 2. An example of an event entry in the Aglets audit trail.

The source element is used to specify the originator of an action. The originator is either a server (which is usually the local server) or an aglet. The identity of the source is specified in the name property.


The action element is used to specify the operation performed by the source. This element contains a type property that defines the type of the action. In the case of actions that are specific to the Aglets system, the type property contains the keyword AgletAction, followed by the name of the operation invoked (e.g., “clone”). In the case of actions involving Java security permissions, the type property contains the name of the permission’s class (for example, java.io.FilePermission) followed by the actions associated with the permission (for example, “execute”). A target sub-element is included within the action element if the action requires additional information to specify the object of an operation. The result element is used to specify the outcome of the operation. This element contains the property status that indicates the success or failure of the action. The property type specifies the type of data enclosed within the result tag. If there is no result data, the result element is empty and the type property is set to “none”.
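The event structure described above (a source, an action with an optional target, and a result) can be sketched in Python with the standard `xml.etree.ElementTree` module. The element names `event`, `source`, `action`, `target`, and `result` and the property names `name`, `type`, and `status` follow the description in the text; the helper `make_event`, the literal status values, and the exact placement of the target data are assumptions made for illustration, since the content of Fig. 2 is not reproduced here.

```python
import xml.etree.ElementTree as ET


def make_event(source_name, action_type, target=None,
               status="success", result_type="none", result_text=None):
    """Build one audit event: a source, an action (optional target), a result."""
    event = ET.Element("event")
    ET.SubElement(event, "source", {"name": source_name})
    action = ET.SubElement(event, "action", {"type": action_type})
    if target is not None:
        # Assumption: the target is carried as the text of a sub-element.
        ET.SubElement(action, "target").text = target
    result = ET.SubElement(event, "result",
                           {"status": status, "type": result_type})
    if result_text is not None:
        result.text = result_text
    return event


if __name__ == "__main__":
    ev = make_event("aglet-1", "java.io.FilePermission read",
                    target="/etc/passwd", status="failure")
    print(ET.tostring(ev, encoding="unicode"))
```

When there is no result data, the `result` element stays empty and its `type` is "none", matching the convention stated above.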

5

An Intrusion Detection System for Aglets

The information generated by the auditing facility of Aglets is used as input to an intrusion detection system, called AgletSTAT. AgletSTAT has been developed by leveraging the STAT framework [33]. The STAT framework provides a platform for the development of intrusion detection sensors by extending a generic runtime with domain-specific components. The STAT framework is centered around three concepts: the STAT technique, the STATL language, and the STAT Core [31].

The STAT technique is used to represent high-level descriptions of computer attacks. Attack scenarios are abstracted into states, which describe the security status of a system, and transitions, which model the evolution between states.

STATL is an extensible language [7] that is used to represent STAT attack scenarios. The language defines the domain-independent features of the STAT technique. The STATL language can be extended to express the characteristics that are specific to a particular domain. The extension process includes the definition of a set of C++ classes that represent the events in the event stream to be analyzed. In addition, the language is extended with types and predicates that support the definition of events and the testing of domain-specific properties. Event and predicate definitions are grouped in a language extension module. The module is compiled into a dynamically linked library (i.e., a “.so” file in a UNIX system or a DLL file in a Windows system). Once the event set and associated predicates for a language extension are available, it is possible to use them in a STATL scenario description by including them with the STATL use keyword.

STATL scenarios are matched against a stream of events by the STAT Core. The STAT Core represents the runtime of the STATL language. The STAT Core implements the domain-independent characteristics of STATL, such as the concepts of state, transition, timer, matching of events, etc. At run time, the STAT Core performs the actual intrusion detection analysis process by matching an incoming stream of events against a number of attack scenarios. The input event stream is provided by one or more event providers. An event provider collects events from the external environment (e.g., by obtaining packets from the network driver), creates events as defined in one or more STAT language extensions, encapsulates these events into generic STAT events, and inserts these events into the input queue of the STAT Core.

In summary, a STAT-based sensor is created by developing a language extension that describes the particular domain of the application, an event provider that retrieves information from the environment and produces STAT events, and attack scenarios that describe attacks in terms of patterns of STAT events.

The AgletSTAT intrusion detection system was developed following the process outlined above. AgletSTAT was built by developing a language extension module that defines Aglets-specific events, an event provider that parses Aglets audit trails and generates Aglets events, and a number of scenarios that detect attacks by analyzing the Aglets event stream. The AgletSTAT language extension contains the definition of the Aglets event, auxiliary types, and Aglets-specific predicates. Figure 3 shows a simplified version of the class for the Aglets event, which is an abstraction of an entry in the audit log generated by the Aglets auditing facility.

class AgletsEvent : public STATExtEvent
{
public:
    Time timestamp;              // Timestamp
    AgletsComponent source;      // Originator of this event
    AgletsAction action;         // Action performed
    AgletsComponent target;      // Target of this action
    bool outcome;                // Outcome of the operation
    AgletsActionResult result;   // Result of the operation
    [...]
};

Fig. 3. The AgletsEvent defined in the AgletSTAT language extension.
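To make the role of the event provider more concrete, the following Python sketch parses one XML audit entry into a record whose fields loosely mirror the class in Fig. 3. It is not the AgletSTAT code: `AgletsEventRecord` and `parse_event` are hypothetical names, the XML layout (attribute names `name`, `type`, `status`, and a `target` sub-element) is assumed from the prose description of the audit format, and the timestamp field is omitted because its XML representation is not given in the text.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgletsEventRecord:
    source: str            # originator of the event
    action: str            # action performed (the 'type' property)
    target: Optional[str]  # object of the action, if any
    outcome: bool          # True if the operation succeeded
    result_type: str       # type of the enclosed result data


def parse_event(xml_text: str) -> AgletsEventRecord:
    """Turn one <event> element (assumed layout) into a record."""
    node = ET.fromstring(xml_text)
    action = node.find("action")
    target = action.find("target")
    result = node.find("result")
    return AgletsEventRecord(
        source=node.find("source").get("name"),
        action=action.get("type"),
        target=None if target is None else target.text,
        # Assumption: a successful operation is logged as status="success".
        outcome=result.get("status") == "success",
        result_type=result.get("type", "none"),
    )
```

A provider built along these lines would call `parse_event` for each entry appended to the audit log and hand the resulting records to the matching engine.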

The AgletSTAT event provider reads the events stored in the audit log file as they are generated. The event provider parses the XML representation of the events and creates the corresponding AgletsEvent objects. These objects are encapsulated into STAT events and inserted in the STAT Core event queue. The STAT Core extracts the events from the event queue and passes them to the active attack scenarios for analysis (see Figure 1). Attacks are represented by specifying state-transition patterns over the stream of AgletsEvent events. An example attack scenario is shown in Figure 4. The scenario detects an aglet attempting to perform a portscan against the local Aglets server. The attack is detected by counting the number of attempted connections to different ports on the local host and alerting when a predetermined threshold is reached.

[State-transition diagram: s0 → scanning via transition firstProbe; scanning → scanning via transition counter; scanning → end via transition toFinal.]

use agletstat;

scenario portscan(int threshold)
{
    global HashTable cloneWatch;
    HashTable portWatch;
    string agletid;
    int count = 0;

    initial state s0 {}
    state scanning {}
    state end {
        { stat_log("Aglet %s is performing a portscan", agletid); }
    }

    transition firstProbe (s0->scanning) nonconsuming {
        [AgletsEvent a]:
            (a.action.matches("java.net.SocketPermission connect,resolve") &&
             !(cloneWatch.contains(a.source.id)) &&
             (a.target.name.matches("127.0.0.1")))
        {
            count++;
            agletid = a.source.id;
            cloneWatch.put(agletid);
            portWatch.put(a.target.port);
        }
    }

    transition counter (scanning->scanning) consuming {
        [AgletsEvent a]:
            (a.action.matches("java.net.SocketPermission connect,resolve") &&
             (a.source.id == agletid) &&
             (a.target.name.matches("127.0.0.1")) &&
             (!(portWatch.contains(a.target.port))))
        {
            count++;
        }
    }

    transition toFinal (scanning->end) consuming {
        [AgletsEvent a]: (count > threshold)
        {
            cloneWatch.delete(agletid);
        }
    }
}

Fig. 4. An AgletSTAT attack scenario that detects a portscan. The scenario has been simplified with respect to the original, for the sake of exposition. For a detailed description of STATL syntax and semantics see [7].

A number of STATL scenarios have been developed to detect disclosure of sensitive information (such as accessing the password file on a UNIX system), denial-of-service attacks, and the scanning of remote systems. Note that STATL scenarios can also be used to specify intended aglet behavior. In this case, a scenario contains a state-transition description of the correct sequence of operations to be performed by an aglet. Deviations from the specification would then be detected by the system.
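The counting logic behind the portscan scenario of Fig. 4 can be approximated outside STATL. The Python sketch below is an analogue under assumptions, not a translation: the class `PortscanDetector` and its method names are hypothetical, and instead of one scenario instance per aglet it keeps a set of distinct ports per aglet identifier and raises a single alert once the number of distinct ports exceeds the threshold.

```python
class PortscanDetector:
    """Alert when one aglet probes more than `threshold` distinct local ports."""

    def __init__(self, threshold, local_host="127.0.0.1"):
        self.threshold = threshold
        self.local_host = local_host
        self.ports = {}       # aglet id -> set of distinct ports probed
        self.alerted = set()  # aglet ids that already triggered an alert

    def observe(self, aglet_id, action, host, port):
        """Return True exactly once, when the aglet first exceeds the threshold."""
        # Only socket permission checks against the local host are counted.
        if not action.startswith("java.net.SocketPermission"):
            return False
        if host != self.local_host:
            return False
        seen = self.ports.setdefault(aglet_id, set())
        seen.add(port)
        if len(seen) > self.threshold and aglet_id not in self.alerted:
            self.alerted.add(aglet_id)
            return True
        return False
```

Repeated probes of the same port do not advance the count, which corresponds to the `portWatch` check in the `counter` transition.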

6

Evaluation

The evaluation of the prototype implementation addresses both the auditing system and the intrusion detection tool. The auditing system has been evaluated quantitatively by determining the overhead introduced by the logging procedures. The intrusion detection tool has been evaluated from the functionality point of view by running malicious aglets and analyzing the effectiveness of the detection process.

6.1

The Auditing System

The evaluation of the auditing system was carried out on a system with dual Celeron CPUs clocked at 533 MHz, 384 MB of RAM, and an IBM 7200 rpm hard disk with a DMA-33 interface. The operating system was Linux, kernel version 2.4.16, with Java SDK 1.3.1.

To test the overhead introduced by the auditing system, three aglets were developed, namely CpuBound, IOBound, and Mixed. The CpuBound aglet performs purely computational operations with doubles, integers, and longs. The IOBound aglet is designed to stress-test the auditing system. The aglet simply opens and closes files in a loop, flooding the auditing system with logging requests. The Mixed aglet was designed to be a compromise between CpuBound and IOBound. This aglet opens a file and writes some data to it while doing some computations. Then, the aglet closes the file and opens it again to read the data, while doing more computations. All the aglets have the ability to spawn a specified number of clones of themselves.

Each aglet was run on both the original Aglets system and the instrumented version with 0, 9, and 24 clones, resulting in a total of 1, 10, and 25 aglets running simultaneously. Each test was executed 10 times. The CpuBound aglet was run for 50 million iterations with 1 aglet, 10 million iterations per aglet with 10 aglets, and 4 million iterations with 25 aglets. The IOBound aglet was run for 20 thousand iterations with 1 aglet, 2 thousand iterations with 10 aglets, and one thousand iterations with 25 aglets. The Mixed aglet was run for 3 thousand iterations with 1 aglet, 500 iterations with 10 aglets, and 200 iterations with 25 aglets. The number of iterations for each test was chosen so that all the tests would take roughly the same time. The setup was identical for each test.

The results of the tests are provided in Table 1. The table contains the execution time required by each test, in milliseconds. An average value is provided for each set of tests. The performance overhead is computed by comparing the results for the original system with the results for the instrumented version.

Table 1. Performance evaluation of the modified Aglets system.

                      CpuBound                    IOBound                     Mixed
aglets              1       10       25        1       10       25        1       10       25
Original Aglets
  run 1        63,834   63,120   63,987   24,455   26,265   37,833   51,475   49,267   50,655
  run 2        62,046   62,663   63,852   24,793   25,913   36,384   51,834   48,365   50,137
  run 3        62,050   62,871   63,862   24,145   25,681   36,936   52,075   48,679   50,453
  run 4        62,064   62,859   63,869   24,299   25,938   36,657   51,921   48,600   50,371
  run 5        62,104   63,179   64,102   24,453   26,285   36,775   52,458   48,347   50,383
  run 6        62,066   62,810   63,916   24,559   26,182   36,160   52,008   48,584   50,830
  run 7        62,067   62,820   64,004   24,186   25,898   36,515   51,771   48,749   50,316
  run 8        62,047   62,716   64,173   24,536   26,203   36,438   51,881   48,646   50,788
  run 9        62,046   63,080   64,298   24,298   25,764   36,375   52,770   49,244   50,743
  run 10       62,068   62,789   64,290   24,601   25,958   36,697   51,521   48,531   50,565
  average      62,239   62,891   64,035   24,433   26,009   36,677   51,971   48,701   50,524
Modified Aglets
  run 1        63,842   63,491   64,212   72,978   60,845   78,860   61,519   59,221   60,354
  run 2        63,854   62,757   63,962   72,625   59,569   78,013   62,771   59,032   59,484
  run 3        62,045   62,957   63,844   72,008   59,518   78,558   59,660   58,595   60,013
  run 4        62,068   63,012   63,933   71,849   59,603   77,882   59,655   58,717   59,541
  run 5        62,086   62,963   64,833   71,540   59,799   78,484   59,228   58,215   59,586
  run 6        62,047   62,850   63,983   71,883   59,554   78,254   59,920   58,830   59,334
  run 7        62,127   63,015   63,974   71,713   59,675   78,393   60,067   58,305   59,746
  run 8        62,098   62,993   64,951   71,892   60,202   78,684   59,705   58,853   60,109
  run 9        62,044   63,021   64,525   71,816   59,891   78,992   59,513   58,287   59,885
  run 10       62,046   62,996   64,064   71,824   59,836   78,711   59,589   58,521   59,684
  average      62,426   63,006   64,228   72,013   59,849   78,483   60,163   58,658   59,774
Overhead        0.30%    0.18%    0.30%  194.74%  130.11%  113.98%   15.76%   20.44%   18.31%
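The overhead row of Table 1 can be reproduced from the individual run times. The following is a minimal sketch in Python, assuming that overhead is the relative increase of the mean instrumented time over the mean original time; the paper does not spell out the formula, but this definition matches the reported figures. The function name is illustrative only.

```python
from statistics import mean

# IOBound aglet, 1 aglet: the ten run times (ms) listed in Table 1.
original = [24455, 24793, 24145, 24299, 24453, 24559, 24186, 24536, 24298, 24601]
modified = [72978, 72625, 72008, 71849, 71540, 71883, 71713, 71892, 71816, 71824]


def overhead_percent(original_ms, modified_ms):
    """Relative increase of the mean modified time over the mean original time.

    Assumed definition; it is not stated explicitly in the paper.
    """
    base = mean(original_ms)
    return (mean(modified_ms) - base) / base * 100.0


print(f"IOBound, 1 aglet: {overhead_percent(original, modified):.2f}% overhead")
```

Working from the raw runs rather than from the rounded averages avoids small rounding differences in the last digit.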

The impact of auditing is clear in the case of the IOBound aglet. This aglet represents the absolute worst possible case for the performance of the auditing system. Every time the IOBound aglet accesses a file, the logging system has to log the request, which creates considerable overhead. The worst case shows a three-fold increase in time with respect to the original Aglets system. The Mixed² aglet produced an overhead between 15% and 20%. This overhead is comparable to the overhead introduced by operating system-level auditing facilities [23]. Note that the auditing mechanism introduced into Aglets can be selectively disabled. The tests described here were performed with full logging.

² Note that although this aglet is called “Mixed”, the aglet is still very file-system intensive and generates a large number of logging requests.

6.2 Evaluating the Intrusion Detection System

The evaluation of the intrusion detection functionality of AgletSTAT was performed by developing a number of malicious aglets and STATL scenarios to detect the attacks.

A first class of malicious aglets performs denial-of-service attacks that attempt to monopolize and/or overload system resources in various ways. The first example is the CloneBomb aglet. This aglet generates hundreds of clones of itself. The cloned aglets also generate clones of themselves in a recursive fashion. While inelegant, this attack is extremely effective at monopolizing system resources and proved to be difficult to halt because explicitly disposing of one aglet has no effect. Fortunately, the auditing facility added to the Aglets system logs all aglet-level actions. Therefore, it was possible to develop a scenario to detect excessive cloning. A simple scenario could shut down the server in the case of a clone attack, while a more sophisticated scenario could attempt to explicitly kill all cloning aglets.

Another type of denial-of-service attack proved to be undetectable. In this attack, an aglet, called WindowBomb, simply spams the screen with a large number of windows. The aglet uses a collection object to prevent garbage collection of the window objects. Window creation operations cannot be logged properly because the Java security model does not allow for the interception of these events. The WindowBomb aglet would be difficult to stop if combined with the CloneBomb aglet, although in this case the attack would generate copious audit information and would be easily detected.

A second class of attacks exploits the aglet interaction procedures provided by the Aglets system. An example of this attack is an aglet called KillBomb that sends a “dispose” message to every other aglet in the system. The actions necessary to perform this attack are clearly visible in the logs, and a STAT scenario has been developed to detect the attack.

The test of the intrusion detection system also included the execution of attacks that attempt to access security-relevant system properties (e.g., java.home or user.name) and sensitive files in the operating system (e.g., /etc/passwd). These attacks were blocked by the Java Security Manager and logged by the auditing system. STAT scenarios to detect these attempts were developed.
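To make the idea of detecting excessive cloning from the audit trail concrete, here is a minimal sketch in Python. It is not a STATL scenario and does not use the AgletSTAT record format: the record layout (timestamp, aglet identifier, action), the threshold, and the window length are all assumptions chosen for illustration. Clone events are counted system-wide rather than per aglet, because in the CloneBomb attack each clone spawns further clones.

```python
from collections import deque


def detect_clone_bomb(records, max_clones=50, window_ms=1000):
    """Return the timestamp at which cloning first exceeds the threshold, else None.

    records: iterable of (timestamp_ms, aglet_id, action) tuples in time order.
    This layout is hypothetical, not the audit record format used by the paper.
    """
    recent = deque()  # timestamps of clone events inside the sliding window
    for timestamp, _aglet_id, action in records:
        if action != "clone":
            continue
        recent.append(timestamp)
        # Discard clone events that have fallen out of the window.
        while recent and timestamp - recent[0] > window_ms:
            recent.popleft()
        if len(recent) > max_clones:
            return timestamp
    return None


# Hypothetical trail: one clone event per millisecond from many aglets.
attack = [(t, f"aglet-{t}", "clone") for t in range(100)]
print(detect_clone_bomb(attack))
```

A response such as shutting down the server or disposing of the cloning aglets would be triggered at the returned timestamp; that part is left out of the sketch.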

7 Conclusions and Future Work

The goal of our research work is to develop mechanisms and tools to perform intrusion detection in the context of mobile agent systems. Detection of malicious activity performed by the agents is achieved by instrumenting a mobile agent system to generate complete and meaningful auditing information and then analyzing the audit trail using an intrusion detection system. This paper introduces the general approach and describes its application to the Aglets mobile agent system. The Aglets system was extended to generate auditing information about the actions of mobile agents. In addition, a STAT-based sensor was developed to analyze the audit trail and a number of attack scenarios were developed. The overhead of the auditing system and the effectiveness of detection were evaluated.

Future work will focus on extending this approach in a number of ways. First of all, we will take advantage of the lessons learned in developing our prototype to design a system-independent auditing facility for mobile agent systems. By doing this, a number of mobile agent systems can benefit from the services provided by the auditing mechanism. In order for the auditing facility to be easily included into different mobile agent systems, it is necessary to define both a standardized service interface and a common audit record format. A well-defined common audit trail format would be beneficial to both the intrusion detection and the mobile agent communities. It would support component and preprocessor reuse, simplify data sharing among different systems, and allow for the merging of different event streams.

As a second step, we plan to investigate how intrusion detection can be performed on audit trails collected on different execution environments. Mobile agents can roam from server to server and generate an audit trail at each of the visited hosts. As a consequence, the manifestation of an intrusion may span different hosts or even different mobile agent systems. Therefore, it is necessary to devise a mechanism to reliably collect the audit streams associated with the entire lifetime of an agent.
Unfortunately, mobile agent systems may be administered by different authorities with different levels of security and conflicting goals. The collection mechanism should ensure that the audit records associated with the actions of an agent within a server cannot be modified illegally by either the server or the roaming agent. We plan to build on the cryptographic tracing mechanism introduced in [30]. The mechanism generates a chain of signed checksums of the agent’s state and audit trail. The signed checksums are computed at each of the servers visited by an agent. This mechanism makes it impossible for either a server or an agent to tamper with the agent audit trail without being detected. The collected traces are the basis for detecting both attacks against mobile agent systems and attacks against mobile agents.
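The idea of a chain of checksums computed at each visited server can be illustrated with a small sketch in Python. This is not the mechanism of [30]: HMAC-SHA256 with a per-server key stands in for a real digital signature, and all names are hypothetical. The point shown is only that each link covers the previous link, so altering an earlier audit trail invalidates the links that follow.

```python
import hashlib
import hmac


def extend_chain(prev_link: bytes, server_key: bytes, state: bytes, audit: bytes) -> bytes:
    """Compute the next link from the previous link and this server's data."""
    # Hash state and audit trail separately so their boundary is unambiguous.
    checksum = hashlib.sha256(
        hashlib.sha256(state).digest() + hashlib.sha256(audit).digest()
    ).digest()
    # HMAC is an assumed stand-in for a public-key signature by the server.
    return hmac.new(server_key, prev_link + checksum, hashlib.sha256).digest()


def verify_chain(links, hops) -> bool:
    """hops: list of (server_key, state, audit); links: the chain carried by the agent."""
    if len(links) != len(hops):
        return False
    prev = b""
    for link, (key, state, audit) in zip(links, hops):
        if not hmac.compare_digest(link, extend_chain(prev, key, state, audit)):
            return False
        prev = link
    return True


# Hypothetical itinerary of three servers.
hops = [(b"key-A", b"state-0", b"audit-A"),
        (b"key-B", b"state-1", b"audit-B"),
        (b"key-C", b"state-2", b"audit-C")]
links, prev = [], b""
for key, state, audit in hops:
    prev = extend_chain(prev, key, state, audit)
    links.append(prev)
print(verify_chain(links, hops))
```

With HMAC the verifier must know every server key; a scheme based on public-key signatures, as the text describes, would let any party verify the chain without holding secrets.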

Acknowledgments. We would like to thank Prof. Richard Kemmerer for his comments on early drafts of this work. In addition, we want to thank the anonymous reviewers for their constructive comments that improved the quality of the paper. In particular, we would like to thank one of the reviewers (and give him/her credit) for pointing out a problem in the tracking of the identities of aglets and for providing a viable solution to the problem.


References

1. The Aglets Software Development Kit (ASDK). http://sourceforge.net/projects/aglets/, June 2002.
2. J.P. Anderson. Computer Security Threat Monitoring and Surveillance. James P. Anderson Co., Fort Washington, April 1980.
3. J.S. Balasubramaniyan, J.O. Garcia-Fernandez, D. Isacoff, E.H. Spafford, and D. Zamboni. An Architecture for Intrusion Detection Using Autonomous Agents. In Proceedings of ACSAC ’98, pages 13–24, 1998.
4. CERT/CC. “Code Red Worm” Exploiting Buffer Overflow In IIS Indexing Service DLL. Advisory CA-2001-19, July 2001.
5. D.M. Chess. Security Issues in Mobile Code Systems. In G. Vigna, editor, Mobile Agents and Security, volume 1419 of LNCS, pages 1–14. Springer-Verlag, June 1998.
6. CIAC. The Ramen Worm. Information Bulletin L-040, February 2001.
7. S.T. Eckmann. The STATL Attack Detection Language. PhD thesis, Department of Computer Science, UCSB, Santa Barbara, CA, June 2002.
8. W.M. Farmer, J.D. Guttman, and V. Swarup. Security for Mobile Agents: Issues and Requirements. In Proc. of the 19th National Information Systems Security Conf., pages 591–597, Baltimore, MD, USA, October 1996.
9. A. Fuggetta, G.P. Picco, and G. Vigna. Understanding Code Mobility. IEEE Transactions on Software Engineering, 24(5):342–361, May 1998.
10. R.S. Gray, D. Kotz, G. Cybenko, and D. Rus. D’Agents: Security in a multiple-language, mobile-agent system. In G. Vigna, editor, Mobile Agents and Security, volume 1419 of LNCS, pages 154–187. Springer-Verlag, 1998.
11. B. Hashii, S. Malabarba, R. Pandey, and M. Bishop. Supporting reconfigurable security policies for mobile programs. Computer Networks, 33(1-6):77–93, June 2000.
12. P. Helman and G. Liepins. Statistical Foundations of Audit Trail Analysis for the Detection of Computer Misuse. IEEE Transactions on Software Engineering, 19(9):886–901, 1993.
13. G. Helmer, J.S.K. Wong, V. Honavar, and L. Miller. Intelligent Agents for Intrusion Detection. In Proceedings of the IEEE Information Technology Conference, pages 121–124, Syracuse, NY, September 1998.
14. K. Ilgun, R.A. Kemmerer, and P.A. Porras. State Transition Analysis: A Rule-Based Intrusion Detection System. IEEE Transactions on Software Engineering, 21(3):181–199, March 1995.
15. W. Jansen and T. Karygiannis. Mobile Agent Security. NIST Special Publication 800-19, August 1999.
16. W. Jansen, P. Mell, T. Karygiannis, and D. Marks. Applying mobile agents to intrusion detection and response. Technical Report 6416, NIST, October 1999.
17. H.S. Javitz and A. Valdes. The NIDES Statistical Component Description and Justification. Technical report, SRI International, Menlo Park, CA, March 1994.
18. G. Karjoth, D. Lange, and M. Oshima. A Security Model for Aglets. In G. Vigna, editor, Mobile Agents and Security, volume 1419 of LNCS. Springer-Verlag, 1998.
19. C. Ko, M. Ruschitzka, and K. Levitt. Execution Monitoring of Security-Critical Programs in Distributed Systems: A Specification-based Approach. In Proceedings of the 1997 IEEE Symposium on Security and Privacy, pages 175–187, May 1997.
20. D.B. Lange and M. Oshima. Programming and Deploying Java Mobile Agents with Aglets. Addison-Wesley Longman, 1998.
21. U. Lindqvist and P.A. Porras. Detecting Computer and Network Misuse with the Production-Based Expert System Toolset (P-BEST). In IEEE Symposium on Security and Privacy, pages 146–161, Oakland, California, May 1999.
22. S. Mudumbai, A. Essiari, and W. Johnston. Anchor Toolkit, 1999.
23. S. Nitzberg. Performance benchmarking of unix system auditing. Master’s thesis, Monmouth College, August 1994.
24. M. Roesch. Snort - Lightweight Intrusion Detection for Networks. In Proceedings of the USENIX LISA ’99 Conference, November 1999.
25. G.-C. Roman, G.P. Picco, and A.L. Murphy. Software Engineering for Mobility: A Roadmap. In A. Finkelstein, editor, The Future of Software Engineering, pages 241–258. ACM Press, 2000.
26. F. Schneider. Enforceable security policies. ACM Transactions on Information and System Security, 3(1):30–50, February 2000.
27. Sun Microsystems, Inc. Installing, Administering, and Using the Basic Security Module. 2550 Garcia Ave., Mountain View, CA 94043, December 1991.
28. Tcpdump and Libpcap Documentation. http://www.tcpdump.org/, June 2002.
29. A. Tripathi, T. Ahmed, S. Pathak, A. Pathak, M. Carney, M. Koka, and P. Dokas. Active Monitoring of Network Systems using Mobile Agents. Technical report, Department of Computer Science, University of Minnesota, May 2002.
30. G. Vigna. Cryptographic Traces for Mobile Agents. In G. Vigna, editor, Mobile Agents and Security, volume 1419 of LNCS. Springer-Verlag, June 1998.
31. G. Vigna, S. Eckmann, and R. Kemmerer. The STAT Tool Suite. In Proceedings of DISCEX 2000, Hilton Head, South Carolina, January 2000. IEEE Computer Society Press.
32. G. Vigna, S.T. Eckmann, and R.A. Kemmerer. Attack Languages. In Proceedings of the IEEE Information Survivability Workshop, Boston, MA, October 2000.
33. G. Vigna, R.A. Kemmerer, and P. Blix. Designing a Web of Highly-Configurable Intrusion Detection Sensors. In W. Lee, L. Mé, and A. Wespi, editors, Proceedings of the 4th International Symposium on Recent Advances in Intrusion Detection (RAID 2001), volume 2212 of LNCS, pages 69–84, Davis, CA, October 2001. Springer-Verlag.
34. D. Wagner and D. Dean. Intrusion Detection via Static Analysis. In Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, May 2001. IEEE Press.
35. C. Warrender, S. Forrest, and B.A. Pearlmutter. Detecting intrusions using system calls: Alternative data models. In IEEE Symposium on Security and Privacy, pages 133–145, 1999.

Fine-Grained Interlaced Code Loading for Mobile Systems

Luk Stoops, Tom Mens¹, and Theo D’Hondt

Department of Computer Science, Programming Technology Laboratory
Vrije Universiteit Brussel, Belgium
{luk.stoops, tom.mens}@vub.ac.be
http://prog.vub.ac.be

Abstract. With the advent of ubiquitous mobile systems in general and mobile agents in particular, network latency becomes a critical factor. This paper investigates interlaced code loading, a promising technique that permutes the application code at method level and exploits parallelism between loading and execution of code to reduce network latency. It allows many applications to start execution earlier, especially programs with a predictable startup phase (such as building a GUI). The feasibility of the technique has been validated by implementing a prototype tool in Smalltalk and applying it to three applications over a wide range of bandwidths. We show how existing applications can be adapted to benefit maximally from the technique and provide design guidelines for new applications. For applications that rely on a GUI, the time required to build the GUI can be reduced to 21% on average.

¹ Tom Mens is a Postdoctoral Fellow of the Fund for Scientific Research – Flanders (Belgium).

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 78–92, 2002. © Springer-Verlag Berlin Heidelberg 2002

1 Introduction

An emerging technique for distributing applications involves mobile code: code that can be transmitted across the network and executed on the receiver's platform. Mobile code comes in many forms and shapes [10]. Mobile code can be represented by machine code, allowing maximum execution speed on the target machine but thereby sacrificing platform independence. Alternatively, the code can be represented as bytecodes, which are interpreted by a virtual machine (as is the case for Jini [1] and Smalltalk [5]). This approach provides platform independence, a vital property in worldwide heterogeneous networks. The third option, which also provides platform independence, consists of transmitting source code or program parse trees. Note that the side effect of platform independence is that an extra compilation step is necessary before the code can be executed on the receiving platform.

An important problem related to mobile code is network latency: the time delay introduced by the network before the code can be executed. This delay has three possible causes. The code must be (1) loaded over a network to the target platform, (2) possibly checked for errors and/or security constraints and (3) possibly compiled or transformed into an intermediate representation. Step (1) is in general the most time-consuming activity, and can lead to significant delays in the startup of the application. This is especially the case in low-bandwidth environments such as the current wireless communication systems or in overloaded networks. Therefore we need to tackle the load phase if we wish to reduce network latency.

In this paper we propose interlaced code loading, a promising technique that introduces parallelism between loading and execution of code to reduce the overall network latency. The technique allows us to start up the code before it is completely loaded. Our experiments involve adapting and running real code; consequently, our results are not obtained by simulating the applications. Only the network transmission, with different transmission rates, is simulated in order to evaluate the technique on load channels ranging from very low to very high bandwidths. We also provide some design-level guidelines for application developers to take advantage of the loading technique.

The paper is structured as follows. Section 2 presents some basic observations of current network and computer architectures and introduces the technique of interlaced code loading. Section 3 describes the experiments conducted to validate our approach and discusses our findings. Section 4 presents some related work. Next we conclude and present our future work.

2 Proposed Technique

2.1 Basic Observations

A first important observation is that code transmission over a network is inherently slower than compilation and evaluation², and this will remain the case for many years to come. The speed of wireless data communications has increased enormously over the last years, and with technologies such as HSCSD (High Speed Circuit Switched Data) and GPRS (General Packet Radio Services) we obtain transmission speeds of 2 Mbps [2]. Compared with the raw “number crunching” power of microprocessors, where processor speeds of Gbps are common, transmission speed is still several orders of magnitude slower. We expect that this will remain the case for several years to come since, according to Moore's Law [11], CPU speeds are known to double every year.

² We utilize the more general term evaluation to describe execution or interpretation of code.

A second observation is that current computer architectures provide separate processors for input/output (code loading) and main program execution.

A third observation is that for many applications, if we launch the application over and over again, its program flow after the start will always be the same for a certain amount of time. This time interval is called the predictable deterministic time zone. Most notably, those applications that communicate with the user by a graphical user interface (GUI) spend a lot of time building this GUI, and this process is the same each time the application is started. As soon as the user interacts for the first time with the application, the program flow becomes less predictable. Many applications without a user interface also seem to follow a predictable process during startup until their first interaction with an unpredictable environment, such as the connection with external systems or the generation of a true (non-pseudo) random number. The time needed to load, build and display the GUI is called the user interface latency. Loading the GUI code first can be very beneficial. The idle time where the system has to wait for user interaction can be exploited to load the rest of the code. This idle time is not negligible. For example, it takes approximately three seconds to select a command using a mouse interface [3].

As a final observation, typical source code contains a lot of low-priority chunks for which loading can be deferred until the last moment. A typical example is exception handling (unless exceptions are used to structure the program flow).

2.2 Interlaced Code Loading

The Interlaced Graphics Interchange Format (GIF) [13] is an image format that exploits the combination of low-bandwidth channels and fast processors by transmitting the image in successive waves of bit streams until the image appears at its full resolution. We propose interlaced code loading as a technique that applies the idea of progressive transmission to software code instead of images. The proposed technique splits a code stream into several successive waves of code streams. When the first wave finishes loading at the target platform, its execution starts immediately and runs in parallel with the loading of the second wave.

In a JIT compilation environment an extra compilation phase is needed, and therefore there are three processes that could potentially run in parallel: loading, compiling and evaluation. Extra time savings will only occur if different processors are deployed for the compilation and evaluation phases. Nevertheless, even if the same processor is shared by compilation and evaluation, the use of JIT compilation is advantageous for the proposed technique. Since the program flow of a classic compilation process is highly predictable, it guarantees that during this phase no unpredictable branches will occur, allowing a smooth parallel process between compilation and loading. In other words, incorporating a compilation phase increases the predictable deterministic time zone that is often found at the start of a program.

We deliberately chose a JIT compilation approach because of its advantages in a low-bandwidth environment: (1) source code has a smaller footprint than the corresponding native code; (2) source code preserves a high level of abstraction, thus enabling more powerful compression techniques; (3) JIT fits nicely in the proposed code interlacing technique, as explained before.
However, to anticipate possible criticism that the results heavily depend on this extra compilation step, we did not apply parallelism between the loading and compilation phases. All the results were obtained from parallelism between the running application and the code loading only. In our setup the compilation phase is part of the load process. If we also apply parallelism between compiling and code loading, the time gain will increase even more.

An important question is: what is the ideal unit of code to be split into successive waves? We propose to use as unit those program abstractions from which the code was built. For example, in Smalltalk likely candidates at different levels of granularity would be: statements, methods, method categories, classes, class hierarchies, class categories, etc. The unit of code should be sufficiently small to achieve a high flexibility in the possible places where the code can be split, which in turn enables a higher degree of parallelism. In object-oriented languages, it seems most appropriate to use methods as unit of code³. Especially for well-written object-oriented programs that adhere to the good programming practice of keeping methods small, the splitting flexibility remains high.

³ While code loading is achieved at method level in Smalltalk, this is not the case in Java for security reasons: the unit of Java code loading must be a class.

Before we can start to cut the code into different chunks, we need to permute the source code in such a way that the code that will be executed first will be loaded first as well. After the cutting, we need to apply some glue code in the form of semaphores. Semaphores temporarily suspend the application if the next chunk of code is not yet loaded.

The algorithm used to permute the source code is based on the dependencies between the different Smalltalk entities. A method cannot be loaded and compiled if the method's class description is not already available in the system. So in the Smalltalk environment a method depends on its class description. In the same spirit we notice that a class depends on its superclass, a class depends on its namespace, and a class initialization method depends on its class description and possibly on semaphore code that can prevent its invocation. A class also depends on the availability of relevant shared variables and, if the class is a subclass of ApplicationModel, the availability of the associated window specification resource. These dependencies do not cover all possible Smalltalk applications but were considered sufficiently comprehensive to cover all the dependencies in the current experimental setup.

3 Experiments

3.1 Setup

We describe some experiments to illustrate a generic approach of interlaced code loading and to provide a proof of concept. A prototype tool was implemented in Smalltalk (more specifically, VisualWorks Release 5i.4), a popular object-oriented language that allows fast prototyping. As a practical validation we tested our approach on three applications, each exhibiting some typical but distinct behavior. We feel that these three are representative of a whole range of typical mobile applications and suffice for a proof of concept. Nevertheless, experiments on a larger scale are needed to validate this approach for other types of applications as well.

Benchmark (ver: 5i.4; 80 kByte, 7 classes): A program that comes with the VisualWorks environment, adapted in such a way that after its Graphical User Interface (GUI) appears, it launches a standard test immediately, thereby simulating prompt user interaction.

CoolImage (ver: 2.0.0 5i.2 with fixes; 184 kByte, 60 classes): An extended image editor that draws on a non-trivial graphical user interface.

Gremlin (ver: Oct 7 ’99; 65 kByte, 4 classes): An application that lets an animated figure pop up from time to time without the need for user interaction, representing non-GUI applications.

To test these applications we designed a code loader to simulate different transmission rates. Essentially, the code loader waits for the amount of time needed to load the file containing the code under a given network bandwidth before actually loading the code from disk and passing it on to the compiler. For this setup five transmission rates were simulated: 2400 bps (very low bandwidth), 14.4 kbps (slow modem), 56 kbps (fast modem), 114 kbps (GPRS) and 2 Mbps (UMTS). These transmission rates were complemented by the rate obtained without network latency, 41 Mbps in our setup, giving six bandwidths in total.
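The delay that such a simulated loader has to wait follows directly from file size and bandwidth. The sketch below, in Python rather than the Smalltalk of the prototype, only computes that delay; it assumes 1 kByte = 1000 bytes and 2 Mbps = 2,000,000 bps, neither of which is stated in the paper, and the function name is hypothetical.

```python
def transfer_delay_seconds(size_bytes: int, bandwidth_bps: float) -> float:
    """Time needed to push size_bytes through a channel of bandwidth_bps bits per second."""
    return size_bytes * 8 / bandwidth_bps


# Simulated rates from the setup, in bits per second (decimal prefixes assumed).
RATES_BPS = {
    "2400 bps": 2_400,
    "14.4 kbps": 14_400,
    "56 kbps": 56_000,
    "114 kbps": 114_000,
    "2 Mbps": 2_000_000,
}

BENCHMARK_SIZE_BYTES = 80_000  # "80 kByte", read here as 80,000 bytes (assumption)

for label, bps in RATES_BPS.items():
    delay = transfer_delay_seconds(BENCHMARK_SIZE_BYTES, bps)
    print(f"{label:>9}: wait {delay:8.2f} s before reading the file from disk")
```

A loader built on this would sleep for the computed time, then read the file locally and hand it to the compiler, as described above.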

3.2 Permuting the Source Code

To simplify the permutation process somewhat, in this first setup we assumed that the code flow is completely deterministic. In other words, we assumed that for each run of the code the application always behaves the same way, thereby neglecting possible different user inputs or other random events. This makes the permutation process straightforward, since it suffices to determine the method invocation sequence once and rearrange the methods accordingly. The static structure of the permuted file will then reflect its dynamic behavior more closely.

Finding the ideal breakpoints is less straightforward. Profiling tools, together with the dynamic behavior statistics obtained as a side effect during the permutation process, can give us some hints as to where to split the code. In our initial experiment we will resort to some simple heuristics, such as cutting the file into equal pieces.

The permutation process, which is completely automated in our setup, consists of several distinct steps (Fig. 1). To obtain the necessary method invocation sequence, the original source code is instrumented with extra code that logs the time of invocation of each method. The instrumentation is accomplished by the source code instrumenter component. Then the instrumented source code is evaluated. The output is ignored at this time, but the instrumented methods will generate the necessary log information, in this case an XML file that contains the method invocation sequence. In another phase, which could be carried out in parallel with the steps described previously, the original source code is parsed by the source code parser component and the resulting descriptions (class, methods, comments and other descriptions) are stored in an intermediate repository.
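Under the determinism assumption, the core of the rearrangement is to order methods by their first appearance in the invocation log. The following minimal Python sketch shows that step only; the method names and the list-based log are hypothetical, and the class, superclass and namespace dependencies that the real tool must respect are deliberately left out.

```python
def permute_by_first_invocation(methods, invocation_log):
    """Order methods by first appearance in the log.

    Methods that were never invoked keep their original relative order
    and are placed last, so they end up in the final waves.
    """
    known = set(methods)
    ordered, seen = [], set()
    for name in invocation_log:
        if name in known and name not in seen:
            seen.add(name)
            ordered.append(name)
    ordered.extend(m for m in methods if m not in seen)
    return ordered


# Hypothetical source order and logged invocation sequence.
source_order = ["Editor>>save", "Editor>>open", "Window>>build", "Window>>close"]
log = ["Window>>build", "Editor>>open", "Window>>build", "Editor>>save"]
print(permute_by_first_invocation(source_order, log))
```

Placing never-invoked methods last matches the earlier observation that low-priority code, such as exception handling, can be deferred.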

to go. To favor a quick emergence of the GUI we will try to make the first cut immediately after the GUI code. In this way we can exploit the inevitable user delay [3], during which the system waits for user interaction, to load the rest of the code. If the method that finishes off the GUI is in the first half of the source code, as in our three test applications, then the first cutting place will be after that method. The remaining code is then divided equally into three parts: part 2, part 3 and part 4.

Three semaphores are then added at the loose ends of part 1, part 2 and part 3. Each semaphore is added to the last method of its part, at the beginning of the method body, so that no return statement can precede it and it is sure to be executed. The methods in which the semaphores reside are possibly invoked more than once. This means that the semaphore must be disabled after its first use. In this setting this is done by enclosing each semaphore in a conditional structure in such a way that the semaphore is bypassed after its first use:

    Interlacer.S1Active ifTrue: [Interlacer.S1 wait. Interlacer.S1Active := false]

The application is then loaded, compiled and run again in an interlaced style for each of the simulated channel bandwidths and the new timing results are gathered. These are referred to as "interlaced end" and "interlaced GUI" in the figures later. Each timing result is calculated as the average of three timing runs, to flatten occasional variations caused by the operating system or programming environment, such as garbage collection.
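The interplay between wave loading and gated execution can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Smalltalk tool: `threading.Event` stands in for the guarded semaphores (an event stays set once signalled, which gives the same one-shot behavior as the `S1Active` flag), the "load" is a dictionary update instead of a network transfer plus compilation, and the gate is checked before every call rather than being embedded in the last method of each part. All names are hypothetical.

```python
import threading


def run_interlaced(waves, program):
    """Run `program` while `waves` are still being loaded in the background.

    waves:   list of dicts mapping method name -> callable, in load order.
    program: list of method names in invocation order.
    Returns the results of the invoked methods, in invocation order.
    """
    loaded = {}                                            # methods available so far
    wave_of = {name: i for i, wave in enumerate(waves) for name in wave}
    ready = [threading.Event() for _ in waves]             # one gate per wave

    def loader():
        for i, wave in enumerate(waves):
            loaded.update(wave)                            # stand-in for load + compile
            ready[i].set()                                 # release evaluation waiting on wave i

    thread = threading.Thread(target=loader)
    thread.start()
    results = []
    for name in program:
        ready[wave_of[name]].wait()                        # suspend until that wave has arrived
        results.append(loaded[name]())
    thread.join()
    return results


# Hypothetical two-wave application: GUI code first, the rest afterwards.
waves = [{"buildGui": lambda: "gui shown"},
         {"runTests": lambda: "tests done"}]
print(run_interlaced(waves, ["buildGui", "runTests"]))
```

Evaluation of the first wave can begin as soon as its gate opens, while later waves continue to arrive; a call into a wave that has not yet arrived simply blocks until the loader signals it.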

3.3 Timing Results

For each of the three test applications, result times were measured with the six different bandwidths. For each of these bandwidths the time was measured in a normal setup (first load all the code and then compile and run) and in an interlaced setup, where the compilation and start of the code take place after the first part is loaded. For both loading types we measured the time it took for the GUI to display itself and the total time to complete the loading, compilation and evaluation of the application. The experiments were carried out on a Dell Inspiron 8100 computer with an Intel Pentium III Mobile CPU (AT/AT compatible) at 1 GHz and 256 MB RAM, running Windows 2000 and VisualWorks 5i.4.

3.3.1 Benchmark

Benchmark is an application that runs selectable tests on the VisualWorks environment. For this test the application was adapted in such a way that after the GUI pops up the application immediately runs a number of standard tests. Fig. 3 shows the parallel processes achieved at a bandwidth of 114 kbps, where load and compilation times are of the same order of magnitude. GUI building indicates the first part of the evaluation process, where the GUI is built. The second part of the evaluation process is indicated in the figure by application. The evaluation process has to share the processing power with the compile phases but can run in parallel with the load phases (except for load1). Note from Fig. 3 that the evaluation process can take advantage of the relatively long periods of load2 and load3 to be able to finish early. It will even finish before all the code is loaded. This means that all the code that remains to be loaded is not needed for the actual execution. Hence, we may stop loading the rest of the application.

Fig. 3. Parallel execution Benchmark @ 114 kbps. [Chart of the phases load1, compile1, load2, compile2, load3, compile3, load4, compile4, GUI building and application against execution time (in ms), 0 to 14000.]

Table 1. Timing results (in ms) for Benchmark application

Bandwidth (kbps)      2.4     14.4       56      114     2048    42308
normal GUI         279268    51184    18074    12352     7016     6562
normal end         280255    52175    19069    13361     8133     7526
interlaced GUI      74327    13341     4669     2995     1722     1341
interlaced end     221564    40291    14279     9279     6062     7609
GUI ratio          26.61%   26.06%   25.83%   24.25%   24.54%   20.43%
end ratio          79.06%   77.22%   74.88%   69.44%   74.54%  101.10%

Timing results are given in Table 1 and Fig. 4. The normal GUI entries of Table 1 show the time in milliseconds it normally takes to render the GUI for the different bandwidths. The normal end entries show the time in milliseconds the application normally needs to end. The interlaced GUI and interlaced end entries show the same times if the application is deployed in an interlaced code loading fashion. Finally, the GUI ratio and end ratio entries show the interlaced time as a percentage of the corresponding normal time, for presenting the GUI and for finishing the application respectively. In Fig. 4, the x and y scales are logarithmic to accommodate the wide range of bandwidths. Note also from this figure that, if the application is loaded via a network (all bandwidths except the last one, where no network delay was simulated), the application itself ends earlier (on average in 75% of the original time needed) if deployed in an interlaced mode.
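The two ratio entries can be recomputed directly from the measured times. The following is a minimal sketch, assuming that each ratio is simply the interlaced time divided by the normal time; the names `TABLE_1`, `ratio_percent`, `ratios` and `mean_end_ratio` are illustrative and do not come from the paper.

```python
# Timing data transcribed from Table 1 (Benchmark application), in milliseconds.
# Keys are bandwidths in kbps; 42308 is the run without simulated network delay.
TABLE_1 = {
    2.4:   {"normal_gui": 279268, "normal_end": 280255,
            "interlaced_gui": 74327, "interlaced_end": 221564},
    14.4:  {"normal_gui": 51184, "normal_end": 52175,
            "interlaced_gui": 13341, "interlaced_end": 40291},
    56:    {"normal_gui": 18074, "normal_end": 19069,
            "interlaced_gui": 4669, "interlaced_end": 14279},
    114:   {"normal_gui": 12352, "normal_end": 13361,
            "interlaced_gui": 2995, "interlaced_end": 9279},
    2048:  {"normal_gui": 7016, "normal_end": 8133,
            "interlaced_gui": 1722, "interlaced_end": 6062},
    42308: {"normal_gui": 6562, "normal_end": 7526,
            "interlaced_gui": 1341, "interlaced_end": 7609},
}


def ratio_percent(interlaced_ms, normal_ms):
    """Interlaced time expressed as a percentage of the normal time."""
    return 100.0 * interlaced_ms / normal_ms


def ratios(row):
    """Return (GUI ratio, end ratio) for one bandwidth."""
    return (ratio_percent(row["interlaced_gui"], row["normal_gui"]),
            ratio_percent(row["interlaced_end"], row["normal_end"]))


def mean_end_ratio(table, exclude=(42308,)):
    """Average end ratio over the runs that went through the simulated network."""
    values = [ratios(row)[1] for bw, row in table.items() if bw not in exclude]
    return sum(values) / len(values)


if __name__ == "__main__":
    for bw, row in TABLE_1.items():
        gui, end = ratios(row)
        print(f"{bw:>8} kbps  GUI ratio {gui:6.2f}%  end ratio {end:6.2f}%")
    print(f"mean end ratio (networked runs): {mean_end_ratio(TABLE_1):.2f}%")
```

A ratio below 100% means the interlaced run was faster; the last bandwidth, where no network delay is simulated, is excluded from the average because there is no load time to overlap with.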


When the application is launched, the animated figure pops up for the first time and a help window shows up. Table 3 shows the delays of the Gremlin application.

Table 3. Timing results (in ms) for Gremlin application

Bandwidth (kbps)  normal GUI  normal end  interlaced GUI  interlaced end  GUI ratio  end ratio
             2.4      230743      230745           51601          225385     22.36%     97.68%
            14.4       46392       46394           15194           39441     32.75%     85.01%
              56       19563       19565           11932           12777     60.99%     65.31%
             114       14713       14715           11005           11005     74.80%     74.79%
            2048       10469       10471           10283           10283     98.23%     98.21%
           42308       10262       10264           10439           10439    101.72%    101.70%

Since the Gremlin application starts with a popup of an animated figure, and during the rest of its life just does the same thing over and over again at different time intervals, all the resources need to be in place before the application can start. This is reflected in Table 3 by the fact that only for bandwidths of 56 kbps and lower is the GUI ratio lower than the end ratio, i.e., the first popup can finish earlier than the complete loading and compilation process. For bandwidths greater than 56 kbps it is the popup process itself that determines the end of the process.

The poor results of the GUI ratio obtained with the Gremlin application led us to the question whether it is possible to adapt the design of the application in such a way that interlacing could be applied more advantageously. If we could change the application so that it no longer depends on all of its resources for its first token of life, this would do the trick. To achieve this, we adapted the Gremlin application so that after it is launched only the help window appears (containing an explanation of the behavior of Gremlin and stating that the first popup is scheduled within 5 minutes). This is only a minor change to the main behavior of the application but, as Table 4 shows, there is now a significant time gain possible for the GUI building (now the text window) and the end of the application (now the loading and compilation of the source code but before the first popup). We came to the conclusion that small changes at the design level sometimes suffice to get significantly better behavior in an interlaced loading environment.

Table 4. Timing results (in ms) for adapted Gremlin application

Bandwidth (kbps)  normal GUI  normal end  interlaced GUI  interlaced end  GUI ratio  end ratio
             2.4      223468      223470           44183          224902     19.77%    100.64%
            14.4       39420       39422            8261           39441     20.96%    100.05%
              56       12568       12569            3049           12596     24.26%    100.22%
             114        7890        7892            2223            7968     28.17%    100.97%
            2048        3544        3546            1413            3557     39.88%    100.29%
           42308        3150        3152            1228            3197     38.98%    101.45%


interlaced code loading. More research is needed to devise these guidelines but some of the obvious ones are:
• Keep programming modules independent from each other (i.e., use low coupling and high cohesion).
• Start as soon as possible with building the GUI.
• Keep the code and the resources needed to present the first user interface as small as possible. Mostly this is the GUI the user is confronted with at startup.
• If necessary, enhance the GUI (e.g., extending the GUI menu) at a later time.
• Postpone heavily resource-dependent actions as long as possible.
• Postpone multithreaded processes as long as possible.

3.4.3 Dealing with Semaphores

As mentioned before, precautions must be taken to prevent methods from being triggered before they are loaded. Although it is possible to catch these exceptions at the level of the virtual machine or even at the level of the operating system, for this setup we chose the generic approach of adding semaphores in the source code. It can be assumed that for every application there exists an ideal number of pieces into which to split the code to obtain a maximum speedup. If the number of pieces increases, so will the total size of the code, since each piece of code needs extra statements for the semaphore code. And if the code size increases, so will the loading time and, since the extra code needs to be evaluated too, also the evaluation time: the very times that we wanted to decrease in the first place. Furthermore, there will be extra overhead at the receiver and sender platforms to administer the loading, compiling and evaluation of the different parts. More experiments are necessary to determine the optimal number of parts, but as shown in the examples, a simple heuristic of cutting the source code into four pieces and trying to put the first break at the point where the first GUI is built already provides significant results.

Provisions need to be made to disable the semaphores once they have served their purpose for the first time. Placing them in a conditional branch that bypasses them after first use seems to be a valid option, and this is the choice that we made in the experiments of this paper. If the method in which the semaphore is placed is triggered a significant number of times, complete removal of the semaphore code after its first use can be considered. Access to a precompiled version of the same method without the semaphore code can speed up that process. Deploying garbage collection agents to remove unused semaphores in the background is another possible approach. On the other hand, if we are dealing with mobile code that moves continuously from host to host, it may be advantageous to keep the semaphores in place.
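The guard-and-bypass idea can be illustrated with a small sketch. It is written in Python rather than in the Smalltalk of the experiments, and it uses a `threading.Event` in the role of the semaphore; the classes `LoadGuard` and `Application`, the function `load_part2` and the simulated delay are hypothetical names introduced only for this illustration, not the authors' code.

```python
import threading
import time


class LoadGuard:
    """Guards code that must not run before its code part has been loaded."""

    def __init__(self):
        self._loaded = threading.Event()  # plays the role of the semaphore
        self._passed = False              # True once the guard has served its purpose

    def signal_loaded(self):
        """Called by the loader once the guarded part is compiled and installed."""
        self._loaded.set()

    def wait(self):
        """Block until the part is available; bypass the wait after first use."""
        if self._passed:                  # conditional branch that skips the semaphore
            return
        self._loaded.wait()
        self._passed = True

    @property
    def passed(self):
        return self._passed


class Application:
    def __init__(self):
        self.part2_guard = LoadGuard()
        self.menu_items = None            # installed later, when part 2 arrives

    def build_gui(self):
        # Part 1 is available immediately, so no guard is needed here.
        return "window"

    def open_menu(self):
        # Depends on part 2, so wait until it has been installed.
        self.part2_guard.wait()
        return list(self.menu_items)


def load_part2(app, delay_s=0.05):
    """Simulated loader: part 2 arrives after a (hypothetical) network delay."""
    time.sleep(delay_s)
    app.menu_items = ["File", "Edit"]
    app.part2_guard.signal_loaded()


def run_demo():
    app = Application()
    loader = threading.Thread(target=load_part2, args=(app,))
    loader.start()
    gui = app.build_gui()                 # runs while part 2 is still loading
    menu = app.open_menu()                # blocks until part 2 has been installed
    loader.join()
    return gui, menu, app.part2_guard.passed
```

Calling `run_demo()` builds the GUI immediately and only blocks at the first use of the not-yet-loaded part. Subsequent calls to `open_menu` take the bypass branch, which corresponds to the conditional branch described above; removing the guard altogether would require swapping in an unguarded version of the method.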


L. Stoops, T. Mens, and T. D’Hondt

4 Related Work

A number of different techniques have been proposed in the research literature to reduce network latency: code compression, exploiting parallelism, reordering of code and data, and continuous compilation.

Code compression is the most common way to reduce overhead introduced by network delay in mobile code environments. Several approaches to compression have been proposed. Ernst et al. [4] describe an executable representation that is roughly the same size as gzipped x86 programs and can be interpreted without decompression. Franz [7] describes a compression scheme called slim binaries, based on adaptive methods such as LZW [14], but tailored towards encoding abstract syntax trees rather than character streams. The technique of code compression is orthogonal to the techniques proposed in this paper, and can be used to further optimize our results.

Exploiting parallelism is another way to reduce network latency. Krintz et al. [8] proposed to transfer different pieces of Java code in parallel, to ensure that the entire available bandwidth is exploited. Alternatively, they proposed to parallelise the processes of loading and compilation/execution, a technique that is also adopted by this paper. Compared to our paper, Krintz et al. also suggest parallelisation at the level of methods, and their experiments yielded even better results than ours: transfer delay could be decreased between 31% and 56% on average. An important difference with our approach is the implementation language (Java instead of Smalltalk). Moreover, because of the limitations of the Java virtual machine security model, Krintz et al. simulated their experiments. Additionally, they only considered two different bandwidths, while we explored a wider range of six different bandwidths in this paper.

Reordering of code and data is also essential for reducing transfer delay. Krintz et al. [9] suggest splitting Java code (at class level) into hot and cold parts. The cold parts correspond to code that is never or rarely used, and hence loading of this code can be avoided or at least postponed. With verified transfer, class file splitting reduces the startup time by 10% on average. Without code verification, the startup time can be reduced slightly more. To determine the optimal ordering of code, a more thorough analysis of the code is needed. This can be done either statically, using control flow analysis, or dynamically, using profiling. Both techniques are empirically investigated in [8] to predict the first-use ordering of methods in a class. These techniques are directly applicable to our approach as well. More sophisticated techniques for determining the most probable path in the control flow of a program are explored in [6].

Continuous compilation and ahead-of-time compilation are techniques that are typically used in a code-on-demand paradigm, such as dynamic class loading in Java. The goal of both compilation techniques, explored in [9] and [12], is to compile the code before it is needed for execution. Again, these techniques are complementary to our approach, and can be exploited to further optimize our results.


5 Conclusion

Network latency becomes a critical factor in the usability of applications that are loaded over a network. As the gap between processor speed and network speed continues to widen, it becomes more and more opportune to use the extra processor power to compensate for the network delays. Performance of an application is most commonly measured by overall program execution time, but in a mobile environment performance is also measured by invocation latency. Invocation latency is the time from application invocation to when execution of the program actually begins. From the viewpoint of the user, the most crucial latency is the user interface latency, being the time a user has to wait between his demand and a user interface reaction of the system. Exploiting parallelism between loading and execution proves to reduce user interface latency considerably (to 21% of the original time on average in the three applications tested). Besides this reduction of the user interface latency, the overall program execution time can also be significantly reduced (to 75% of the original time in the Benchmark application). Except for the simulation of a large range of transmission rates, our experiments do not rely on simulation techniques, which makes us confident about the obtained results.

6 Future Work

An industrial research project (funded by the Belgian government) that will start at the end of 2002 is situated around mobile code and low-bandwidth environments. This setting will give us a real-life test environment to validate our approach further on different platforms and will allow us to get more detailed results. Experiments with interlaced code loading will be performed on a larger scale, including applications and benchmarks that reflect typical mobile agent application behavior. The results and lessons learned will be distilled into interlacing design guidelines for developers willing to take advantage of low user interface latency.

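[Figure 6: timeline for an agent travelling over the hops A-B, B-C and C-D. Classic loading: Load A-B, Load B-C and Load C-D in sequence. Interlaced loading: three code parts per hop, forwarded over A-B, B-C and C-D in overlapping fashion, followed by evaluation.]

Fig. 6. Mobile agent traversing a multi-hop network in an interlaced mode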
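As a rough illustration of why interlacing may pay off in the multi-hop setting of Fig. 6, the following sketch compares the two transfer schemes in a simple store-and-forward timing model. The model is an assumption made here, not a result from the paper: all parts are equally large, all links are equally fast, a link carries one part at a time, different links can work simultaneously, and per-part overhead is ignored. The function names are illustrative.

```python
def classic_transfer_time(hops, parts, part_time):
    """Whole agent is forwarded hop by hop; a hop starts only after the
    previous hop has received every part."""
    return hops * parts * part_time


def interlaced_finish_times(hops, parts, part_time):
    """finish[h][p] is the time at which part p has completely crossed hop h
    when each part is forwarded as soon as it has arrived."""
    finish = [[0.0] * parts for _ in range(hops)]
    for h in range(hops):
        for p in range(parts):
            # The part must first have arrived at the sending node of hop h.
            ready = finish[h - 1][p] if h > 0 else 0.0
            # Link h carries one part at a time, in order.
            link_free = finish[h][p - 1] if p > 0 else 0.0
            finish[h][p] = max(ready, link_free) + part_time
    return finish


if __name__ == "__main__":
    hops, parts, t = 3, 3, 1.0            # hops A-B, B-C, C-D; code split in three
    finish = interlaced_finish_times(hops, parts, t)
    print("classic, all code at last node:   ", classic_transfer_time(hops, parts, t))
    print("interlaced, first part at last node:", finish[-1][0])
    print("interlaced, all code at last node:  ", finish[-1][-1])
```

Under these assumptions the last part arrives after (hops + parts - 1) part times instead of hops × parts, and the first part, which is what evaluation needs in order to start, arrives after only `hops` part times.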


We will also apply interlaced code loading to mobile agents operating in multi-hop networks. This environment promises an even more substantial decrease of the invocation latency. See Fig. 6, which compares classic and interlaced code loading in a multi-hop network. In the example the code is split into three parts. Further, we will look for a more formal approach to decide where to cut the original code and how and where to add semaphores or other guarding systems. Genetic algorithms may provide us with the right tool to find the most opportune cutting places.

Acknowledgments. We thank Alexandre Bergel, Karel Driessen, Johan Brichau, Franklin Vermeulen, Gert van Grootel and the anonymous referees for reviewing the paper. We also thank the Programming Technology Lab for their valuable comments.

References

1. K. Arnold, B. O'Sullivan, R.W. Scheiffer and J. Waldo, A. Wollrath. The Jini Specification. Addison-Wesley, 1999
2. S. Barberis, A CDMA-based radio interface for third generation mobile systems. Mobile Networks and Applications, Volume 2, Issue 1, ACM Press, June 1997
3. R. Dillen, J. Edey and J. Tombaugh. Measuring the true cost of command selection: techniques and results. CHI '90 proceedings, ACM Press, April 1990
4. J. Ernst, W. Evans, C. W. Fraser, T. A. Proebsting, S. Lucco, Code Compression. Proc. ACM SIGPLAN Conf. on Programming Language Design and Implementation. Volume 32, Issue 5, May 1997
5. A. Goldberg and D. Robson, Smalltalk-80 The Language, Addison-Wesley Publishing Company, ISBN 0-201-13688-0, 1989
6. R. Jason, C. Patterson, Accurate Static Branch Prediction by Value Range Propagation. Proc. ACM SIGPLAN Conf. on Programming Language Design and Implementation, pages 67-78 (La Jolla, San Diego), June 1995
7. M. Franz and T. Kistler. Slim Binaries. Comm. ACM, Volume 40, Issue 12, December 1997
8. C. Krintz, B. Calder, H. B. Lee, B. G. Zorn, Overlapping Execution with Transfer Using Non-Strict Execution for Mobile Programs. Proc. Int. Conf. on Architectural Support for Programming Languages and Operating Systems, San Jose, California U.S., October 1998
9. C. Krintz, B. Calder and U. Hölzle, Reducing Transfer Delay Using Class File Splitting and Prefetching. Proc. ACM SIGPLAN Conf. Object-Oriented Programming, Systems, Languages, and Applications, November 1999
10. D. Milojicic, Mobility processes, computers and agents, ACM Press, 1999
11. G. Moore, Cramming more components onto integrated circuits, Electronics, Vol. 38(8), pp. 114–117, April 19, 1965
12. M. P. Plezbert, Ron K. Cytron, Does "just in time" = "better late than never"? Proc. ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages, pp. 120–131, Paris, France, January 15–17, 1997
13. D. Siegel, Creating killer web sites, Indianapolis: Hayden Books, 1996
14. J. Ziv and A. Lempel, A Universal Algorithm for sequential Data compression, IEEE Transactions on Information Theory, Vol. 23, No. 3, May 1977

Improving Scalability of Replicated Services in Mobile Agent Systems

JinHo Ahn, Sung-Gi Min, and ChongSun Hwang

Dept. of Computer Science & Engineering, Korea University
5-1 Anam-dong, Sungbuk-gu, Seoul 136-701, Republic of Korea
{jhahn | hwang}@disys.korea.ac.kr, [email protected]

Abstract. In this paper, we propose a strategy to improve scalability of replicated services in mobile agent systems by using an appropriate passive replication protocol for each replicated service according to whether the service is deterministic or non-deterministic. For this purpose, two passive replication protocols are introduced for non-deterministic and deterministic services respectively. They both allow visiting mobile agents to be forwarded to and execute their tasks on any node performing a service agent, not necessarily the primary agent. Additionally, in the second protocol for deterministic services, after a backup service agent has received each mobile agent request and obtained its delivery sequence number from the primary service agent, the backup is responsible for processing the request and coordinating with the other replica service agents. Therefore, our strategy using the two proposed protocols can promise better scalability of replicated services that a large number of mobile agents attempt to access in mobile agent systems.

Keywords: Mobile agent system, replicated service, scalability, fault-tolerance, passive replication, determinism

1 Introduction

A mobile agent system consists of a number of mobile agents, service agents and places [13]. A mobile agent is an active and self-contained program that moves from node to node in a computer network and executes its task on behalf of the user using services provided on these nodes. A service agent is a stationary agent that provides other agents with system or application-level services on a particular node, and a place is an environment for safely executing local service agents and mobile agents. When agent migration occurs in the system, the agent's code and state information are captured and then transferred to the next node. After arriving at the node, the mobile agent is initiated and performs its task by interacting with local service agents on the node. Therefore, the mobile agent paradigm has two important advantages, i.e., reduction of network traffic and asynchronous interaction, compared with the client-server paradigm, and so is

This work was supported by Korea Research Foundation Grant.(KRF-2002-003D00248)

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 93–105, 2002. c Springer-Verlag Berlin Heidelberg 2002 


J. Ahn, S.-G. Min, and C. Hwang

widely used in several application areas such as electronic commerce, network management, telecommunication and Internet computing [11]. As the mobile agent system is gaining popularity and the number of mobile agent users rapidly increases, a large number of mobile agents may concurrently be transferred to a node supporting a particular service. In this case, the service agent on the node can be a performance bottleneck and, if the agent fails, the execution of all the transferred mobile agents is blocked. In order to solve these problems, the service agent function should be replicated at multiple nodes. This approach may balance the load caused by the mobile agents and, even if some service agents crash, allow the other service agents to continue providing the mobile agents with the service.

Each mobile agent issues a series of requests to one or more among the set of service agents. The interleaved sequence of requests sent from the mobile agents should be handled consistently at every service agent. There are two approaches used in distributed systems that can potentially be applied to satisfy this goal: active and passive replication [8]. If the active replication approach [16] is used in the mobile agent system, every non-faulty replicated service agent receives the requests from mobile agents in the same order, processes them and replies to the agents. This approach requires the operations on the service agent to be deterministic: given the initial state of each service agent and the same sequence of mobile agent requests previously performed by the service agent, it will produce the same result [18]. It provides failure transparency for mobile agents because, even if some replicated service agents fail, the other normal ones can handle the requests and send their replies to the mobile agents. Additionally, it requires low response time in case of failures compared with the passive approach. However, the approach has two important drawbacks: high resource consumption, because all replicated service agents have to process mobile agent requests, and the determinism constraint mentioned above.

In the passive replication approach [3], only one among a set of service agents, named the primary service agent, receives a request from each mobile agent, processes it and then forwards a message including its updated state to the other service agents, called backup service agents. When receiving the message, each backup agent updates its state using the message and returns an acknowledgement message to the primary agent. After the primary agent receives the messages from all non-faulty backup agents, it sends a response to the request to the mobile agent. When the primary agent fails, this approach suffers from a high reconfiguration cost and cannot provide failure transparency for mobile agents. But this approach has three desirable features we focus on. First, the approach enables consistency to be guaranteed even if replicated service agents are performed in a non-deterministic manner. Thus, it can be applied to every replicated service regardless of the execution behavior of the service. Second, it needs lower processing power during failure-free execution than the active replication approach. Third, mobile agents have only to use a unicast primitive, not a multicast one, because they send service requests only to the primary service agent.


However, the traditional passive replication approach may result in some scalability and performance problems when being applied to the mobile agent system as a fault-tolerant technique for replicated services. In other words, to the best of our knowledge, previous works [3,7] uniformly applied the traditional passive replication approach to each replicated service regardless of whether it is deterministic or non-deterministic. But, in this approach, every mobile agent request should be sent only to the primary service agent, which processes the request, coordinates with the other live replicas and then returns a response to the request to the mobile agent. This special role of the primary is necessarily required to ensure consistency for non-deterministic services. Moreover, the traditional passive replication approach forces all visiting mobile agents to be transferred to and execute their work in order only on the node running the primary service agent of each domain. These inherent features may cause an extreme load condition to occur on the primary service agent when a large number of mobile agents are forwarded to the service domain and access its resources. Thus, this previous strategy may not achieve high scalability and performance.

In this paper, we propose a scalable strategy to apply an appropriate passive replication protocol to each service according to its execution behavior, because deterministic services require weaker constraints to ensure their consistency than non-deterministic ones. For this goal, two passive replication protocols are designed in this paper. The first protocol, for non-deterministic services, is named PRPNS and the second protocol, for deterministic services, PRPDS. They both allow visiting mobile agents to be forwarded to and execute on any node performing a service agent, not necessarily the primary agent. In particular, in the second protocol PRPDS, after a backup service agent has received a mobile agent request and obtained the delivery sequence number of the request from the primary service agent, the backup agent, not the primary one, is responsible for processing the request and coordinating with the other replica service agents. Due to this feature, PRPDS is more lightweight than PRPNS, which tolerates non-deterministic servers such as multi-threaded servers.

The rest of the paper is organized as follows. In section 2, we describe the system model and in section 3, discuss the problems of the traditional protocols and present our passive replication protocols. Section 4 reviews related work and section 5 concludes this paper.

2 System Model

We consider an asynchronous distributed system where there is no global memory and clock, and no bound on message delay. This system is augmented with an unreliable failure detector [5] in order to solve the impossibility problem on distributed consensus [6]. The system provides mobile agents with a set of $n$ ($n > 0$) services $S = [s_0, s_1, \cdots, s_{n-1}]$. A service $s_i$ ($0 \le i \le n-1$) is accessed by the mobile agents and implemented by a set of $k$ ($k > 0$) service agents $u_i = [u_i^0, u_i^1, \cdots, u_i^{k-1}]$ as a group, where $u_i^w$ executes on a node


$node_i^w$ ($0 \le w \le k-1$). We consider that every service agent has its local state and atomically changes the state through agent interactions. In order to perform an assigned task on behalf of a user, a mobile agent $a_j$ executes on a sequence of $l$ ($l > 0$) places $p_j = [p_{j0}, p_{j1}, \cdots, p_{j(l-1)}]$ according to its itinerary, which may be statically determined before the mobile agent is launched at the agent source or dynamically while progressing its execution. Executing $a_j$ at a place $p_{jm}$ ($0 \le m \le l-1$) is called a stage $Stage_j^m$ of the agent execution. The resulting state of $a_j$ ($j \ge 0$) after executing all the stage operations on the places $p_{jr}$ ($0 \le r \le m$) is denoted by $a_j^{m+1}$, which will be forwarded from $p_{jm}$ to $p_{j(m+1)}$ and executed on place $p_{j(m+1)}$. For example, figure 1 illustrates an execution of three mobile agents $a_t^l$, $a_s^m$ and $a_r^n$ attempting to use a non-replicated service $u_i$ via service agent $u_i^0$ on node $node_i^0$. In this figure, the three mobile agents are initially put in a message queue of $u_i^0$, $wq_i^0$, and then perform their tasks in order by interacting with the service agent $u_i^0$ in a request-execute-response manner. Agents have crash-failure semantics, in which they lose the contents of their volatile memories and stop their executions [15]. We assume that the communication channels are immune to partitioning, reliable and FIFO.

Linearizability is a strong correctness criterion for replicated services that most passive replication protocols for real-world distributed systems in the literature ensure [8,19]. It is based on real-time dependency. In other words, it implies that agents appear to be interleaved at the granularity of complete operations, and that the order of non-overlapping operations is preserved [9]. We use it as the basic consistency criterion for the passive replication protocols shown in this paper.

A service agent $q$ in group $u_i$ can communicate with the other service agents in $u_i$ using unicast and multicast communication primitives. The multicast communication primitive used in the system is View Synchronous Multicast (VSCAST) [14], which is usually used to ensure correctness of the passive replication approach. VSCAST is defined in the context of a group $u_i$ and is based on the notion of a sequence of views $v_0(u_i), v_1(u_i), \ldots, v_f(u_i), \ldots$ of group $u_i$. View $v_f(u_i)$ ($f \ge 0$) is a set of members of the group $u_i$ that are recognized as being correct at some time $t$. If service agent $q \in v_f(u_i)$ is suspected to have failed, or some other service agent $z$ attempts to join the group $u_i$, a new view $v_{f+1}(u_i)$ is installed. If some process $q \in v_f(u_i)$ sends message $m$ to $v_f(u_i)$, VSCAST ensures the following property: if one process $q \in v_f(u_i)$ has delivered $m$ in $v_f(u_i)$ and then installed $v_{f+1}(u_i)$, then every $r \in v_f(u_i)$ that has installed $v_{f+1}(u_i)$ has done so after having delivered $m$.

3 The Strategy

In this section, we propose a strategy to improve scalability of mobile agent systems by using the appropriate passive replication protocol for each replicated service domain according to whether the service is deterministic or non-deterministic, as in figure 2. For this purpose, two passive replication protocols, PRPNS and PRPDS, are introduced in sections 3.1 and 3.2 respectively.


current node to the node where the primary service agent of $u_i$, $u_i^{prim}$, executes. Then, the following phases are performed:

(Phase 1.) When $u_i^{prim}$ receives a request message from mobile agent $a_j$, $u_i^{prim}$ processes the message. After the execution, it generates (response, psn, next state), where response is the response of the message, psn identifies the processed request message and next state is the state of $u_i^{prim}$ updated by processing the request.

(Phase 2.) $u_i^{prim}$ sends all backup service agents the update message (response, psn, next state, j, reqid) using VSCAST, where j identifies $a_j$ and reqid is the send sequence number of the request message. When each backup service agent receives the update message, it updates its state using next state, maintains (response, j, reqid) in its buffer and then sends an acknowledgement message to $u_i^{prim}$. In this case, (response, j, reqid) is needed to ensure the exactly-once semantics despite $u_i^{prim}$'s failure.

(Phase 3.) After receiving an acknowledgement message from every live backup service agent, $u_i^{prim}$ sends response to $a_j$.
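The three phases can be sketched as a small in-process simulation. This is only an illustration under simplifying assumptions: a plain loop over the live backups stands in for VSCAST, there is no failure detection or view change, and the class names `ServiceAgent` and `Primary`, the methods `on_request` and `on_update`, and the toy "running total" service are hypothetical names chosen here, not part of the protocol description.

```python
import random


class ServiceAgent:
    """A replica of the service; used directly as a backup."""

    def __init__(self, name):
        self.name = name
        self.state = 0        # local service state (here: a running total)
        self.replies = {}     # (agent id j, reqid) -> response, for exactly-once replies
        self.alive = True

    def on_update(self, response, psn, next_state, j, reqid):
        """Phase 2 at a backup: adopt the primary's state and remember the reply."""
        self.state = next_state
        self.replies[(j, reqid)] = response
        return ("ack", self.name, psn)


class Primary(ServiceAgent):
    def __init__(self, name, backups):
        super().__init__(name)
        self.backups = backups
        self.next_psn = 0

    def on_request(self, j, reqid, amount):
        # A retransmitted request is answered from the buffer, not executed twice.
        if (j, reqid) in self.replies:
            return self.replies[(j, reqid)]

        # Phase 1: process the request. The random term makes the service
        # non-deterministic, so backups cannot simply re-execute the request.
        psn = self.next_psn
        self.next_psn += 1
        next_state = self.state + amount + random.randint(0, 9)
        response = f"total={next_state}"
        self.state = next_state
        self.replies[(j, reqid)] = response

        # Phase 2: send the update to every live backup and collect the
        # acknowledgements (a plain loop stands in for VSCAST).
        live = [b for b in self.backups if b.alive]
        acks = [b.on_update(response, psn, next_state, j, reqid) for b in live]

        # Phase 3: every live backup has acknowledged, so reply to the agent.
        if len(acks) != len(live):
            raise RuntimeError("missing acknowledgement")
        return response
```

For instance, with two backups `b1` and `b2` and `prim = Primary("u2", [b1, b2])`, a call such as `prim.on_request(j=7, reqid=0, amount=5)` leaves all three replicas with the same state, and repeating the same call returns the buffered response without consuming a new psn. Because every request passes through `Primary.on_request`, the sketch also makes the bottleneck discussed next easy to see.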

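[Figure 3: service group $u_i$ with service agents $u_i^0$, $u_i^1$, $u_i^2$ and message queues $wq_i^0$, $wq_i^1$, $wq_i^2$ on nodes $node_i^0$, $node_i^1$, $node_i^2$; mobile agents $a_t^l$, $a_s^m$, $a_r^n$; labeled steps: request, execute, VSCAST, ack, response.]

Fig. 3. An execution of three mobile agents accessing a replicated service $u_i$ in the traditional passive replication protocol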


Therefore, this protocol forces all visiting mobile agents to be transferred to and execute their work in order only on the node running the primary service agent of each domain. This behavior may cause an extreme load condition to occur when a large number of mobile agents are forwarded to the service domain to use the non-deterministic service. For example, in figure 3, three mobile agents $a_t^l$, $a_s^m$ and $a_r^n$ are concurrently transferred to the node $node_i^2$ where the primary service agent of $u_i$, $u_i^2$, is running. The mobile agents perform the service of $u_i$ in order via only $u_i^2$ using the above mentioned protocol.

Thus, we introduce a passive replication protocol for non-deterministic service, PRPNS, to solve the problem by allowing mobile agents to be forwarded to and execute on any node performing a service agent, not necessarily the primary agent. The protocol forces each service agent $p$ to forward every request received from mobile agents to the primary service agent. Afterwards, the primary agent performs phases 1 through 2 previously mentioned to satisfy the consistency condition for non-deterministic service. After receiving an acknowledgement from every normal backup service agent, the primary agent sends the response to the request to the service agent $p$, which forwards it to the corresponding mobile agent.

Figure 4 illustrates how the protocol PRPNS executes. In this figure, mobile agents $a_t^l$, $a_s^m$ and $a_r^n$ are transferred to and execute on the nodes $node_i^2$, $node_i^1$ and $node_i^0$ for accessing service $u_i$. Then, the mobile agents perform their tasks via $u_i^2$, $u_i^1$ and $u_i^0$ respectively. For example, if $u_i^1$ receives a request from $a_s^m$, it forwards the request to the primary service agent $u_i^2$. Afterwards, $u_i^2$ processes the request and coordinates with the other service agents $u_i^1$ and $u_i^0$ using VSCAST. When receiving acknowledgements from the two backup service agents, it sends the response to the request to $u_i^1$, which forwards the response to $a_s^m$. From this example, we can see that PRPNS improves scalability of replicated services in case a large number of mobile agents attempt to use a particular service simultaneously. If a mobile agent accesses a service via a backup service agent in this protocol, two more messages are required per request compared with the traditional protocol. However, the additional message cost is not significant because each service domain is generally configured on a local area network like Ethernet.

3.2 Passive Replication Protocol for Deterministic Service

A deterministic replicated service requires weaker constraints to ensure consistency than a non-deterministic one. In other words, after the primary service agent has determined the processing order of every request from mobile agents, it is not necessary that only the primary agent handles all requests and coordinates with the other replica service agents as in the protocol PRPNS. With this observation, we attempt to use a lightweight passive replication protocol, PRPDS, for each deterministic service domain. The proposed protocol has the following features.
• Each mobile agent can use a service via the primary service agent or a backup one.


J. Ahn, S.-G. Min, and C. Hwang

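[Figure 4: diagram not reproduced. It shows nodes $node_i^0$, $node_i^1$ and $node_i^2$ hosting service agents $u_i^0$, $u_i^1$ and $u_i^2$ of service $u_i$, visited by mobile agents $a_r^n$, $a_s^m$ and $a_t^l$, with request, execute, VSCAST, ack and response messages.]

Fig. 4. An execution of three mobile agents accessing a replicated service $u_i$ in the protocol PRPNS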

• Only the primary service agent determines the processing order of every request from mobile agents.

• After a backup service agent has received a mobile agent request and obtained the order of the request from the primary service agent, the backup agent, not the primary one, processes the request and coordinates with the other replica service agents, including the primary agent.

Due to these desirable features, this protocol enables each visiting mobile agent to be forwarded to and execute on a node running any of the replicated service agents in the service domain. If mobile agent $a_j$ is transferred to the node where a backup service agent $u_i^{backup}$ executes, PRPDS executes the following phases. Otherwise, the three phases of PRPNS are performed.

(Phase 1.) When $u_i^{backup}$ receives a request message from $a_j$, $u_i^{backup}$ asks the primary service agent $u_i^{prim}$ for the psn of the request message. In this case, after $u_i^{prim}$ determines the psn of the message, it notifies $u_i^{backup}$ of the psn. Then, $u_i^{prim}$ processes the request message and saves (response, psn, next state, j, reqid) of the message in its buffer. When receiving the psn, $u_i^{backup}$ processes the corresponding request and generates (response, psn, next state) of the request.

(Phase 2.) $u_i^{backup}$ sends the other service agents the update message (response, psn, next state, j, reqid) using VSCAST. When each service agent except for $u_i^{prim}$ receives the update message, it updates its state using next state, maintains (response, j, reqid) in its buffer and then sends an acknowledgement message to $u_i^{backup}$. If $u_i^{prim}$ receives the update message from $u_i^{backup}$, it just removes the element (response, psn, next state, j, reqid) for the message from its buffer, saves (response, j, reqid) in the buffer and sends an acknowledgement message to $u_i^{backup}$.

(Phase 3.) Once $u_i^{backup}$ receives an acknowledgement message from every other live service agent, it sends response to $a_j$.

Figure 5 shows an execution of three mobile agents $a_t^l$, $a_s^m$ and $a_r^n$ attempting to use service $u_i$ via $u_i^2$, $u_i^1$ and $u_i^0$ in the protocol PRPDS. The figure pictures only the procedure that PRPDS executes when $a_s^m$ sends a request to $u_i^1$. In this case, $u_i^1$ asks the primary agent $u_i^2$ for the psn of the request. $u_i^2$ determines the psn and sends it to $u_i^1$. Then, $u_i^2$ processes the request and saves (response, psn, next state, j + 1, reqid) of the request in its buffer. Meanwhile, after obtaining the psn of the request from the primary agent, $u_i^1$ processes the request and coordinates with the other service agents $u_i^2$ and $u_i^0$ using VSCAST to satisfy the consistency condition for deterministic service. When $u_i^1$ receives an acknowledgement from every live agent, it sends the response to $a_s^m$. From this figure, we can see that the protocol PRPDS may significantly improve the scalability of replicated services: each visiting mobile agent can use a particular service via any of the replicated service agents in the service domain, and the request processing and coordination load is distributed among a set of service agents.

3.3

Recovery

If service agents fail in a service domain, the two proposed protocols, PRPNS and PRPDS, must perform recovery procedures that preserve the consistency condition for non-deterministic or deterministic service, respectively, despite the failures. Firstly, if the primary service agent $u_i^{prim}$ crashes, both protocols execute the appropriate recovery procedure in each of the following cases.

(Case 1.) The primary service agent $u_i^{prim}$ fails before finishing phase 1.
In this case, a new primary service agent $u_i^{prim'}$ is elected among all the backup ones. Since no response to the request can be received from $u_i^{prim}$, the request will be resent to $u_i^{prim'}$, which has only to perform phases 1 through 3.

(Case 2.) The primary service agent $u_i^{prim}$ fails after completing phase 1, but before sending the response to the corresponding mobile agent.
As in case 1, a new primary agent $u_i^{prim'}$ is selected. To ensure linearizability in this case, either all the backup service agents receive the update message, or none of them receives it. VSCAST is used to satisfy the atomicity condition of linearizability in an asynchronous mobile agent system with an unreliable failure detector. If no backup service agent receives the update message, this case is similar to case 1. Otherwise, the element (response, j, reqid) saved in phase 2 is used to ensure exactly-once semantics [17]. In other words, when the request from the corresponding mobile agent is sent again to the new primary service agent $u_i^{prim'}$, the latter immediately sends the stored response to the mobile agent without handling the request again.

Fig. 5. An execution of three mobile agents accessing a replicated service $u_i$ in the protocol PRPDS (diagram not reproduced)

(Case 3.) The primary service agent $u_i^{prim}$ fails after finishing phase 3.
As in case 1, a new primary service agent $u_i^{prim'}$ is selected and identified.

Secondly, when a backup service agent $u_i^{backup}$ crashes, PRPNS and PRPDS perform their recovery procedures as follows. In PRPNS, $u_i^{backup}$ is just removed from its service group. PRPDS, on the other hand, executes the corresponding recovery procedure in each of the following cases.

(Case 1.) Backup service agent $u_i^{backup}$ fails before asking the primary service agent $u_i^{prim}$ for the psn of a request in phase 1.
In this case, $u_i^{backup}$ is just removed from its group. Afterwards, the failure of $u_i^{backup}$ is detected because no response to the request arrives from it.

(Case 2.) Backup service agent $u_i^{backup}$ fails before finishing phase 1.


(Case 2.1.) The primary $u_i^{prim}$ fails.
The other backup service agents select a new primary service agent, $u_i^{prim'}$, among them.

(Case 2.2.) The primary $u_i^{prim}$ is alive.
On detecting the crash of $u_i^{backup}$, $u_i^{prim}$ retrieves every piece of update information, which has the form (response, psn, next state, j, reqid), from its buffer and sends it to each of the other live backup service agents. In this case, VSCAST is used to ensure linearizability.

(Case 3.) Backup service agent $u_i^{backup}$ fails after sending the update message to the other service agents in phase 2, but before sending the response to the corresponding mobile agent.
As $u_i^{backup}$ sent the update message to the other service agents by using VSCAST, overall consistency is ensured. Therefore, the service group of $u_i^{backup}$ has only to remove $u_i^{backup}$.

(Case 4.) Backup service agent $u_i^{backup}$ fails after completing phase 3.
$u_i^{backup}$ is removed from its group, and mobile agents can detect that $u_i^{backup}$ has failed.
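The PRPDS phases of Section 3.2 and the exactly-once check used during recovery can be illustrated with a small, single-process simulation. This is only a sketch under stated assumptions, not the authors' implementation: the class names `ServiceAgent` and `ServiceDomain`, the toy `process` function and the message counter are all hypothetical, the multicast loop merely stands in for VSCAST, and failures and concurrency are not modeled.

```python
from dataclasses import dataclass, field


def process(state, value):
    """Toy deterministic service (an assumption): add the value to the state."""
    next_state = state + value
    return f"sum={next_state}", next_state


@dataclass
class ServiceAgent:
    name: str
    state: int = 0
    done: dict = field(default_factory=dict)     # (j, reqid) -> response
    pending: dict = field(default_factory=dict)  # primary only: (j, reqid) -> update tuple


class ServiceDomain:
    """Hypothetical model of one deterministic service domain."""

    def __init__(self, names, primary):
        self.agents = {name: ServiceAgent(name) for name in names}
        self.primary = primary
        self.next_psn = 0
        self.messages = 0  # intra-domain messages, counted for illustration

    def request(self, via, j, reqid, value):
        agent = self.agents[via]
        key = (j, reqid)
        # Exactly-once check: a resent request is answered from the buffer.
        if key in agent.done:
            return agent.done[key]
        # Phase 1: only the primary assigns the processing sequence number.
        psn = self.next_psn
        self.next_psn += 1
        response, next_state = process(agent.state, value)
        update = (response, psn, next_state, j, reqid)
        if via != self.primary:
            self.messages += 2  # psn query and psn reply
            # The primary keeps the full tuple until the update arrives.
            # Its own processing is not repeated here, because a
            # deterministic service would produce the same tuple.
            self.agents[self.primary].pending[key] = update
        # Phase 2: the handling agent multicasts the update (VSCAST stand-in).
        agent.state = next_state
        agent.done[key] = response
        for other in self.agents.values():
            if other is agent:
                continue
            other.pending.pop(key, None)
            other.state = next_state
            other.done[key] = response
            self.messages += 2  # update and acknowledgement
        # Phase 3: every live agent has acknowledged, so respond.
        return response
```

Because every replica ends up holding (response, j, reqid), a request that is resent to a different service agent after a failure receives the stored response and does not change the service state a second time.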

4

Discussion

The semi-active replication protocol [12] is a variant of the active replication technique that attempts to lift the determinism constraint. It achieves this goal by forcing the primary to send its updated state to the other replicas, as in the passive replication technique, after each replica has received and processed a client request. However, this protocol still has the drawbacks of both the traditional active and passive replication techniques. Moreover, it assumes the synchronous system model.

The semi-passive replication protocol [7] was proposed to retain the advantage of the passive replication technique without requiring any group membership service. It is based on the rotating coordinator paradigm that has been used for solving the consensus problem [5]. Thus, the protocol allows for aggressive time-out values and for suspecting crashed processes, while incurring a lower cost than the traditional passive replication technique when the primary fails. However, this protocol also requires that the primary handle all the client requests and coordinate with the other replicas.

5

Conclusion

In this paper, to improve the scalability of mobile agent systems, we proposed a new strategy that applies an appropriate passive replication protocol to each replicated service according to its execution behavior, deterministic or non-deterministic. To this end, two passive replication protocols, PRPNS and PRPDS, were presented for non-deterministic and deterministic services respectively.


While ensuring linearizability, they both allow visiting mobile agents to be forwarded to and execute their tasks on any node running a service agent, not necessarily the primary agent. In particular, the more lightweight protocol PRPDS allows any service agent to process each mobile agent request and coordinate with the other replica service agents after receiving the request and obtaining its delivery sequence number from the primary agent. Thus, if PRPDS is well combined with existing load balancing schemes [2,4], the request processing and coordination load can be evenly distributed among a set of deterministic, replicated service agents based on the workload of each service agent. In conclusion, we believe that our strategy using PRPNS and PRPDS can promise better scalability for replicated services that a large number of mobile agents attempt to access in mobile agent systems. The role of load balancing schemes is very important for our strategy to work effectively. Thus, we have to perform various experiments to determine which existing load balancing schemes are appropriate for our protocols PRPNS and PRPDS. For this purpose, we currently intend to implement and test the two protocols with the load balancing schemes in some conventional mobile agent platforms such as Aglets [10] and Mole [1].

References

1. J. Baumann, F. Hohl, K. Rothermel and M. Straßer. Mole – Concepts of a Mobile Agent System. World Wide Web, 1(3):123–137, 1998.
2. H. Bryhni, E. Klovning and O. Kure. A Comparison of Load Balancing Techniques for Scalable Web Servers. IEEE Network, 14:58–64, 2000.
3. N. Budhiraja, K. Marzullo, F. B. Schneider and S. Toueg. The primary-backup approach. In S. Mullender (ed.), Distributed Systems, ch. 8, pp. 199–216, Addison-Wesley, second ed., 1993.
4. V. Cardellini, M. Colajanni and P. S. Yu. Dynamic load balancing on Web-server systems. IEEE Internet Computing, 3:28–39, 1999.
5. T. D. Chandra and S. Toueg. Unreliable failure detectors for reliable distributed systems. Journal of ACM, 43:225–267, 1996.
6. M. J. Fischer, N. A. Lynch and M. S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of ACM, 32:374–382, 1985.
7. X. Defago, A. Schiper and N. Sergent. Semi-Passive Replication. In Proc. of the 17th IEEE Symposium on Reliable Distributed Systems, pp. 43–50, 1998.
8. R. Guerraoui and A. Schiper. Software-Based Replication for Fault Tolerance. IEEE Computer, 30(4):68–74, 1997.
9. M. Herlihy and J. Wing. Linearizability: a correctness condition for concurrent objects. ACM Transactions on Progr. Languages and Syst., 12(3):463–492, 1990.
10. IBM Tokyo Research Labs. Aglets Workbench: Programming Mobile Agents in Java. 1996. http://www.trl.ibm.co.jp/aglets
11. V. Pham and A. Karmouch. Mobile Software Agents: An Overview. IEEE Communications Magazine, 36(7):26–37, 1998.
12. D. Powell, M. Chereque and D. Drackley. Fault-tolerance in Delta-4. ACM Operating Systems Review, SIGOPS, 25(2):122–125, 1991.
13. K. Rothermel and M. Schwehm. Mobile Agents. In A. Kent and J. G. Williams (Eds.): Encyclopedia for Computer Science and Technology, 40(25):155–176, 1999.
14. A. Schiper and A. Sandoz. Uniform reliable multicast in a virtually synchronous environment. In Proc. of the 13th International Conference on Distributed Computing Systems, pp. 561–568, 1993.
15. R. D. Schlichting and F. B. Schneider. Fail-stop processors: an approach to designing fault-tolerant distributed computing systems. ACM Transactions on Computer Systems, 1:222–238, 1985.
16. F. B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys, 22(4):299–319, 1990.
17. A. Spector. Performing remote operations efficiently on local computer network. Communications of the ACM, 25(4):246–260, 1982.
18. R. B. Strom and S. Yemini. Optimistic recovery in distributed systems. ACM Transactions on Computer Systems, 3:204–226, 1985.
19. M. Wiesmann, F. Pedone, A. Schiper, B. Kemme and G. Alonso. Understanding Replication in Databases and Distributed Systems. In Proc. of the 21st International Conference on Distributed Computing Systems, pp. 464–474, 2000.

Toward Interoperability of Mobile-Agent Systems

Arne Grimstrup (1), Robert Gray (1), David Kotz (1), Maggie Breedy (2), Marco Carvalho (2), Thomas Cowin (2), Daria Chacón (3), Joyce Barton (3), Chris Garrett (3), and Martin Hofmann (3)

(1) Dartmouth College
(2) Institute for Human & Machine Cognition, University of West Florida
(3) Lockheed Martin Advanced Technology Laboratories

Abstract. Growing recognition of the benefits of mobile agents in distributed systems, such as military C4ISR, has led to a proliferation of mobile agent systems. However, incompatibilities between proprietary systems prevent the greater potential benefits of ubiquitous mobile agent computing. In particular, agents cannot migrate to a host that runs a different mobile-agent system. Prior approaches to interoperability have tried to force agents to use a common API and so far none have succeeded. This goal led to our efforts to develop mechanisms that support runtime interoperability of mobile-agent systems. This paper describes the Grid Mobile-Agent System, which allows agents to migrate to different mobile-agent systems.

1

Introduction

Many mobile-agent systems have been developed, but with different proprietary Application Programming Interfaces (APIs) for the agents. This proliferation of incompatible APIs implies that agents developed for one agent platform cannot be run on, let alone migrate to, a system with a different agent platform. In our opinion, interoperability of platforms is essential for mobile agents to become a ubiquitous technology. Prior approaches, such as MASIF [4], defined a standard API and required all platforms that wish to interoperate to implement the common API. These approaches, however, have failed to encourage many systems to adopt the API (see footnote 1).

This paper describes initial results from ongoing work to enable interoperability of mobile-agent systems. We begin with a motivating application scenario. Section 2 then describes the overall design, followed by implementation details in Section 3. Section 4 presents a cost-benefit analysis of the interoperability system. Section 5 discusses related work. In Section 6, we discuss some future directions for our approach. Finally, in Section 7, we conclude by presenting a summary of important lessons learned.

This research was supported by the DARPA CoABS Program (contracts F30602-98-2-0107, F30602-98-C-0170 and F30602-98-C-0162 for Dartmouth, UWF, and Lockheed Martin respectively) and by the DoD MURI program (AFoSR contract F49620-97-1-03821 for both Dartmouth and Lockheed Martin). The contact author is Thomas Cowin [email protected].

1 The Mobile Agent System List identifies systems that do and do not comply with MASIF and other standards. The list is at http://mole.informatik.uni-stuttgart.de/mal/mal.html

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 106–120, 2002. © Springer-Verlag Berlin Heidelberg 2002

1.1

Motivating Application

The need for interoperability between information systems is readily apparent in peacekeeping and disaster-response operations. In these situations, a coalition of civilian and military organizations, each with its own intelligence or other information assets, is formed on short notice and required to operate in areas where the fixed communication infrastructure has been severely damaged or completely destroyed. The field units are forced to rely on limited-power devices that use an unreliable low-bandwidth data link to communicate between themselves and to access information held within constituent organizations’ headquarters. Today, the advantages provided by mobile agents are well suited to the coalition’s operational environment. By moving the computation to the information source or another server on the fixed network with stable power, battery power consumption can be reduced and bandwidth can be used more efficiently. These advantages are only available if every member of the coalition uses the same mobile-agent system, however. It is highly unlikely that every military, governmental and non-governmental organization will agree to use the same mobile-agent system a priori. The need for rapid response also precludes the conversion of existing agents to a new platform or major changes to the software infrastructure of any member organization. Clearly, the most desirable solution would allow existing agents to operate on foreign mobile-agent system hosts without modification. Our goal is to provide an inter-operation standard that can be used to provide migration inter-operability between different mobile-agent platforms.

2

Overall Design

Our basic approach to interoperability is to allow foreign agents to execute in a non-native mobile-agent system (MAS) by translating the foreign agent’s API into the local platform’s API. Instead of creating pairwise translations of MAS APIs, we defined a single common interoperability API (IAPI) that supports agent registration, lookup, messaging, launching, and mobility. Then each group provided translators between their MAS API and the IAPI. In this manner, it was not necessary to rewrite each MAS to conform to a new API, but only to write translators from each MAS to the IAPI and, conversely, from the IAPI to the MAS to support these specific agent operations (i.e., 2N adaptors vs. N system rewrites; see footnote 2).

Figure 1 shows the structure of the Grid Mobile Agent System (GMAS). In the diagram, we see a foreign agent and a native agent operating in three different MAS environments. Each layer presents an API to the layer above. A typical MAS implementation consists of three layers: the agent itself, the MAS API, and the network. Our design adds three new layers: foreign MAS API to GMAS API translation (Foreign2GMAS), GMAS API to native API translation (GMAS2Native), and a common communication and discovery service. In addition, our design adds two components to each participating MAS, an Agent Launcher and a Gateway. A MAS that wishes its agents to operate on other MAS’s must implement the Foreign2GMAS translator and the Gateway service, while those that are willing to host foreign agents as well must implement the GMAS2Native translator, the Agent Launcher, and the Gateway. The Gateway serves as the conduit to the transport mechanism at the source, while the Launcher serves the same function at the destination.

2 For readers familiar with the PBM image-translation tools, PBM uses the same approach.

[Figure 1: diagram not reproduced. It shows D’Agents, NOMADS and EMAA agents layered over their Foreign2GMAS and GMAS2Native adaptors and the host MAS, each host with an Agent Launcher and a Gateway, connected (via a Bridge for NOMADS and D’Agents) to the CoABS Grid network.]

Fig. 1. Structure of GMAS

Since each MAS has its own communication and directory services, it is impractical to construct mappings between them. Instead, we rely upon a common substrate to provide communication and discovery services in this heterogeneous environment. In GMAS, we use the Jini-based DARPA Control of Agent Based Systems (CoABS) Grid [7], but CORBA or any other similar mechanism can also be used. NOMADS and D’Agents implemented their support of the Grid interface with the help of a bridge (a proxy that speaks both the Native and Grid protocols) due to lack of native support for Jini.

Foreign agents move to other GMAS-capable MAS hosts via their local Gateway and the Agent Launcher at the destination. The Gateway provides the discovery, marshalling, and movement operations using the local implementation of the GMAS API, described below. The Agent Launcher handles incoming launch requests, unmarshalling, and execution support for newly arrived agents. Depending on the network topology and local policy, an Agent Launcher may serve an entire subnet or an individual host machine and function as either a stand-alone service or as an integral component of the host mobile-agent system.

The local system composes the Foreign2GMAS translator with the GMAS2Native translator to map the Foreign API to the Native API. When a foreign mobile agent arrives at the system, the local system dynamically loads the corresponding Foreign2GMAS translator from a specified remote server. This capability is the key to providing interoperability of mobile-agent systems.

GMAS supports systems that are Java based, so any agent written in Java can operate upon a GMAS-enabled host utilizing standard Java API method calls, provided that its agency-related calls (i.e., those of discovery, communication and migration) either conform to the GMAS API, or are translated to GMAS via a Foreign2GMAS adaptor. Access to certain Java methods may be circumscribed for host security reasons. An identifiable subset may evolve in common usage.
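The composition of the two translators can be sketched as follows. This is an illustrative model in Python rather than Java, and every name in it (`GmasApi`, `NativeMas`, `native_dispatch`, the foreign call `jump`) is hypothetical; the actual GMAS interfaces are not reproduced in the text. The two counting helpers only restate the arithmetic behind the 2N-adaptor argument, compared with one translator per ordered pair of platforms.

```python
class GmasApi:
    """Hypothetical stand-in for the common interoperability API (IAPI)."""

    def clone_agent(self, agent, destination):
        raise NotImplementedError


class NativeMas:
    """Hypothetical host platform with its own native call names."""

    def __init__(self):
        self.log = []

    def native_dispatch(self, agent, host):
        self.log.append(("dispatch", agent, host))
        return True


class Gmas2Native(GmasApi):
    """GMAS API -> native API translation, written once per host platform."""

    def __init__(self, mas):
        self.mas = mas

    def clone_agent(self, agent, destination):
        return self.mas.native_dispatch(agent, destination)


class Foreign2Gmas:
    """Foreign API -> GMAS API translation, written once per foreign platform."""

    def __init__(self, gmas):
        self.gmas = gmas

    def jump(self, agent, host):  # 'jump' is an invented foreign call name
        return self.gmas.clone_agent(agent, host)


def adaptors_needed(n_platforms):
    """Two adaptors (to and from the common API) per platform."""
    return 2 * n_platforms


def pairwise_translators_needed(n_platforms):
    """One translator per ordered pair of distinct platforms."""
    return n_platforms * (n_platforms - 1)
```

A foreign agent would call `Foreign2Gmas(Gmas2Native(host)).jump(...)`, so its own API call reaches the host through the common API. For five platforms the helpers give 10 adaptors against 20 pairwise translators.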

2.1

GMAS Interoperability API

The GMAS API provides methods to create an agent either by launching a new agent or by cloning the current agent on a remote host. The corresponding methods are launchAgent() and cloneAgent(). When launching a new agent, the agent’s initial state must be provided to the system as a parameter. An explicit moveAgent() method is not provided, but can be easily implemented as a clone operation followed by the termination of the original agent.

Cloning an agent requires that the state of the agent be moved to the destination. One of the main advantages of using Java for mobile-agent programming is the ability to use object serialization for packaging an agent before shipment to another host. This behavior is supported in our design too. Any agent that implements Java’s Serializable interface can be cloned through the GMAS API.

Many systems, such as D’Agents, do not support Java 2 serialization. To operate on those systems, an agent programmer must explicitly manage the conversion of the agent’s data to and from an external form. These agents are referred to as self-serializing agents and must implement the SelfSerializable interface to be supported by GMAS. The self-serialization process is straightforward. When a self-serializing agent attempts to move to a new host, the underlying mobility service retrieves from the agent, using the getAgentState() method, a container holding objects that represent the agent’s state. Each object’s value in the container is then converted to an XML-based representation containing its name, data type and value. An example of an agent’s state after self-serialization is shown in Figure 2.







Fig. 2. SelfSerialized Agent State Transfer Message

To restore the agent state, this XML-based representation is parsed and a new container containing the state object is reconstructed from the triples. The setAgentState() method is invoked on the newly created agent instance with the container as the only argument and the values are restored to the agent’s attributes. To assist the programmer in the conversions, we provide the AgentVariableState class as a means of storing variables as well as handling the conversions to and from the message format. The agent programmer is still responsible for packing and unpacking the AgentVariableState with appropriate data objects.
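The round trip through (name, data type, value) triples can be sketched as below. This is a Python model of the idea, not the GMAS Java code: the element and attribute names (`agentState`, `variable`, `name`, `type`) are assumptions, since the message format of Figure 2 is not reproduced here, and `get_agent_state` / `set_agent_state` are Python-style stand-ins for getAgentState() and setAgentState().

```python
import xml.etree.ElementTree as ET

# Supported value types and how to parse them back (an assumed, minimal set).
_PARSERS = {"int": int, "float": float, "str": str}


def state_to_xml(state):
    """Convert a {name: value} container into name/type/value triples."""
    root = ET.Element("agentState")
    for name, value in state.items():
        type_name = type(value).__name__
        if type_name not in _PARSERS:
            raise TypeError(f"unsupported type for {name}: {type_name}")
        var = ET.SubElement(root, "variable", name=name, type=type_name)
        var.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_state(text):
    """Parse the triples and rebuild the state container."""
    root = ET.fromstring(text)
    return {
        var.get("name"): _PARSERS[var.get("type")](var.text or "")
        for var in root.findall("variable")
    }


class CounterAgent:
    """Hypothetical self-serializing agent with two state variables."""

    def __init__(self):
        self.hops = 0
        self.owner = ""

    def get_agent_state(self):
        return {"hops": self.hops, "owner": self.owner}

    def set_agent_state(self, state):
        self.hops = state["hops"]
        self.owner = state["owner"]


def clone_via_xml(agent):
    """Pack the state, create a fresh instance, and restore the state into it."""
    message = state_to_xml(agent.get_agent_state())
    copy = type(agent)()
    copy.set_agent_state(xml_to_state(message))
    return copy, message
```

As in the text, the agent programmer remains responsible for deciding which variables go into the container and for restoring them; the helper functions only handle the conversion to and from the message format.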


The Agent Launcher handles cloning an existing agent and launching a new agent. In both cases, the Launcher creates a new instance of the agent on the receiving side. When cloning, the Launcher then copies the state of the original agent into the newly created agent. When launching a new agent, the Launcher copies the initial state into the new agent.

The agent’s meta information, including such items as its origin and the location of its class files, resides in an AgentMetaData object, and is provided to GMAS upon creation of the agent’s GMAS representation. The system may obtain access to the meta data by invoking the getMetaData() method.

2.2

Gateway to Launcher Communication Protocol

As in many agent systems, agent migration depends on message passing. In our agent transfer, the client Gateway uses a two-phase protocol to launch or clone an agent. During the first phase, the Gateway sends the agent’s meta information in the form of the AgentMetaData object to the remote Agent Launcher. If the remote Launcher decides to allow the migration, it returns a one-time-use token to the Gateway. The Gateway can then use the token to send the state of the agent, in either binary (Serializable) or XML (SelfSerializable) form, thereby completing the migration. This approach allows the receiver to perform authentication or other checks (such as server load) before granting the sender the right to send the agent.
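The two-phase exchange can be modeled in a few lines. The sketch below is a hypothetical in-process model: the class and method names, the dictionary used as metadata, and the simple load check are assumptions for illustration, and real Gateways and Launchers exchange Grid messages rather than method calls.

```python
import secrets


class AgentLauncher:
    """Hypothetical receiver side of the two-phase launch protocol."""

    def __init__(self, max_agents=2):
        self.max_agents = max_agents
        self.tokens = {}   # token -> metadata of an admitted, not yet received agent
        self.running = []  # (metadata, state) of agents that have arrived

    def request_launch(self, meta):
        """Phase 1: inspect the metadata; return a one-time token or None."""
        if len(self.running) + len(self.tokens) >= self.max_agents:
            return None  # refuse, for example because of server load
        token = secrets.token_hex(16)
        self.tokens[token] = meta
        return token

    def send_state(self, token, state):
        """Phase 2: accept the agent state only for an unused token."""
        meta = self.tokens.pop(token, None)
        if meta is None:
            return False  # unknown or already used token
        self.running.append((meta, state))
        return True


class Gateway:
    """Hypothetical sender side: runs both phases in order."""

    def clone_agent(self, launcher, meta, state):
        token = launcher.request_launch(meta)
        if token is None:
            return False
        return launcher.send_state(token, state)
```

Removing the token from the table when it is redeemed is what makes it single-use: a second `send_state` call with the same token is rejected.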

3

Implementation

To evaluate our design, we implemented the interoperable agent infrastructure for three different mobile-agent systems, D’Agents [6] from Dartmouth College, EMAA [3,9] from Lockheed Martin Advanced Technology Laboratories, and NOMADS [11] from the University of West Florida Institute for Human and Machine Cognition, as well as for a reference implementation based on an unmodified Java 2 virtual machine. We first defined the communications protocol between the gateways and the launchers. Next, we implemented the major components: the launchers, the gateways, and the adaptors. Finally, we made a few minor changes to the agent systems so they would work with the interoperability infrastructure. All of the above components are described in detail below.

Communication Protocol. As described in the design section, agent migration is accomplished via message passing between the source and the destination host. The CoABS Grid provides a well-integrated message-passing service, so implementing the launch request and response protocol only required that we create a Message object and invoke the appropriate message transmission method. In our design, we specified that a launch request comprises a description of the agent and that agent’s data state. Because we used the Grid to handle our communications, we were free to select either a string-based or a binary-based message format, since a Grid Message is equally capable of carrying either form of data. We used a string-based approach since it would be easier to construct protocol bridges to systems that could not directly use the Grid software. Using a string-based format requires special marshaling


and unmarshalling code at both ends of the communications link. Rather than write custom software to handle the translation, we defined a set of XML tags that hold the agent description and state information. Using an XML-based format gave us a structured way to translate agents to and from the launch request format, a way to leverage existing XML parsers such as Xerces3 , and a way to make changes in our data format with minimal disruption to the existing code base. XML does not support transfer of binary data, however, so we use Base64 to encode all serialized agents before inserting them into a launch request. Launcher Implementation. Our launcher is implemented as a stationary Grid service that accepts and processes incoming Grid messages bearing launch requests. It can stand alone or run as a separate thread within a mobile-agent system’s JVM. When a message is received, the launch request is parsed and the agent information is extracted. The agent is returned as either a serialized object or a two-object set: one containing the agent metadata and the other containing the data state. The required Java class files are then downloaded from the specified source locations, and the agent is deserialized or the data state is loaded into a new instance of the agent. The newly reconstructed agent object is now ready for execution. The launcher invokes an executor method, which is responsible for resuming the given agent object at the specified entry point. Depending on the executor selected, control may pass to the agent, a new thread may be created for the agent to run in, or the agent may be forwarded to another host. The behavior and types of executors available is dependent on the policy and configuration of the site where the launcher resides. Sites that have a single launcher supporting multiple MAS hosts will require an executor that forwards the incoming agent to the correct destination. 
Other sites may wish to present different execution environments to newly arrived agents based on an agent’s native MAS. For example, an external launcher that supports a D’Agents MAS host may be configured to execute incoming EMAA and NOMADS agents as separate threads in its JVM due to possible code incompatibilities. Gateway Implementation. The Gateway hides any local implementation details, such as special translation, from the agent programmer. In our reference implementation, this process is straightforward; the Gateway’s launchAgent() and cloneAgent() methods find the destination then create and send the launch request to the appropriate launcher. Since each platform provides a different Gateway implementation, the mobile agents should not carry a Gateway implementation object with them. Instead, the IAPI provides a wrapper class that returns the correct instance of the Gateway depending on the current location of the agent. The Gateway implementation returned is determined by a system property that can be set on the command line or in a configuration file. Using this additional layer of abstraction also allows the Launcher to change the returned mobility service implementation (i.e., Gateway) at runtime, allowing greater adaptability to changing local conditions. Adaptor Implementation. The adaptors make up the last component of the implementation. As mentioned above, there are two kinds of adaptors: Foreign2GMAS and 3

http://xml.apache.org/xerces-j/index.html

112

A. Grimstrup et al.

GMAS2Native. All three mobile-agent systems have implemented both adaptors. The remainder of this section describes the NOMADS implementation of the adaptors.

The NOMADS native API consists of a base Agent class that allows an agent to access all of the services provided by the underlying platform, including the mobility service. NOMADS supports both strong and weak mobility and hence provides two separate mobility services. Given that GMAS provides for only weak mobility, that is what we will address here.

When a NOMADS agent requests a move operation to a GMAS-enabled host, the mobility service encapsulates the agent into a WrapperAgent that conforms to the GMAS specification. The WrapperAgent thus cloaks the native NOMADS agent in a GMAS shell, strictly for the purpose of getting it launched successfully onto the destination host. The WrapperAgent’s job is complete once the NOMADS agent is running on the destination host.

The Foreign2GMAS adaptor consists of an alternate implementation of the NOMADS base Agent class. It provides access to the equivalent GMAS services while the agent is running and maps any mobility requests made by the NOMADS agent into the GMAS API. The agent uses this alternate Agent class as long as it executes on (and moves to) GMAS-enabled platforms. If the agent moves back to a NOMADS platform, the original base Agent class is used.

The GMAS2Native adaptor is used only when a GMAS Agent, or a Foreign Agent composed with its Foreign2GMAS adaptor, is resident on a NOMADS system and makes a GMAS call, such as cloneAgent(). The local implementation of the Grid Mobility Service (in other words, the GMAS2Native adaptor) translates this call into action within the NOMADS system and clones the subject agent on its destination.

Modifications Needed for Existing Systems. One of the key challenges in the design and implementation was to minimize changes to the existing mobile-agent systems. In NOMADS, we had to make only one change to the existing implementation. In our design, when an agent is running on its native platform, the agent uses the native platform’s API (i.e., the Agent class) directly without going through any adaptors. When the agent decides to move to a GMAS-supported platform, we run into a problem because the native platform’s API implementation does not know about the GMAS-supported platform. Therefore, the one necessary change in each native platform implementation was a fallback mechanism to try the gateway if the native transfer protocol failed. Note that an alternative would have been to modify the agents to use the interoperability API as a fallback, but our goal was to not make any changes to the code of the mobile agent.
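The fallback mechanism described above can be illustrated with a minimal sketch. It is not the NOMADS code: the class names NativeTransport and GmasGateway, the TransferError exception, and the string results are all hypothetical stand-ins, chosen only to show the control flow of trying the native transfer protocol first and using the gateway when that fails.

```python
class TransferError(Exception):
    """Raised when a transfer protocol cannot reach the destination."""


class NativeTransport:
    """Hypothetical stand-in for a platform's native transfer protocol."""

    def __init__(self, known_hosts):
        self.known_hosts = set(known_hosts)

    def send(self, agent, host):
        # The native protocol only knows hosts running the same platform.
        if host not in self.known_hosts:
            raise TransferError(f"{host} does not speak the native protocol")
        return f"native:{agent}->{host}"


class GmasGateway:
    """Hypothetical stand-in for the interoperability gateway."""

    def send(self, agent, host):
        # Wrap the native agent so that it conforms to the common format.
        wrapped = f"WrapperAgent({agent})"
        return f"gmas:{wrapped}->{host}"


def move(agent, host, native, gateway):
    """Try the native protocol first; fall back to the gateway on failure."""
    try:
        return native.send(agent, host)
    except TransferError:
        return gateway.send(agent, host)


if __name__ == "__main__":
    native = NativeTransport({"nomads-1"})
    gateway = GmasGateway()
    print(move("agentA", "nomads-1", native, gateway))
    print(move("agentA", "emaa-1", native, gateway))
```

Because the fallback lives in the platform's move path rather than in the agent, the agent code itself stays unchanged, which matches the stated design goal.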

3.1

Mobile-Agent Systems

We chose these particular mobile-agent systems because they were readily available to us and because they represent a range of design choices that impact interoperability. Since a complete presentation of each mobile-agent system is outside the scope of this work, the relevant features of each system are summarized in Table 1.


Table 1. Relevant features of systems used in our experiments.

     Feature        D’Agents               NOMADS           EMAA          GMAS Ref. Impl.
(a)  strong/weak    strong                 strong and weak  weak          weak
(b)  JVM version    1.0.2                  Aroma and 1.3.1  1.3.0-02      1.3.1
(c)  JVMs used      multiple               multiple         one           one
(d)  what moved     all code, data, stack  data, stack      data          data, code
(e)  code caching   no                     yes              preinstalled  no
(f)  encoding       custom, fat            custom           serialized    serialized, custom
(g)  communication  sockets                sockets          sockets       Grid Messages
(h)  socket reuse   no                     no               yes           no
(i)  security       off                    off              off           none

(a) Both D’Agents and NOMADS implement strong mobility, which moves the code, data, and control state of an agent between host machines. Supporting strong mobility required modifications to the Java VM to extract the state information. Since weak mobility systems such as EMAA operate in unmodified JVMs, we could not expect state-capture support to be available in other systems. Therefore, GMAS was forced to use a weak mobility model.

(b) The modifications made to support the strongly mobile D’Agents were made to an older version of the JVM. The Grid is based on a more recent release of Java and uses new technologies such as Jini in its core services. NOMADS supports both strong and weak mobility APIs. The weak mobility version of NOMADS (NOMADS-Spring) is written in pure Java and executes on the standard Java VM. The strong mobility version of NOMADS (NOMADS-Oasis) uses a clean-room implementation of the JVM, called Aroma, that is mostly compatible with JDK 1.2.2 but does not support Jini. D’Agents and NOMADS-Oasis required communication bridges between the servers and the remainder of the Grid to interact with the Grid. As a result, both these systems pay a performance penalty when agents jump due to the translation and additional handling at the bridge.

(c) D’Agents and NOMADS-Oasis both create a new JVM process for each arriving agent. NOMADS-Spring, the GMAS reference implementation, and EMAA only create a new thread for each arriving agent, which is a much faster operation.

(d) The GMAS reference implementation moves agent code via the Java class loader. The bytecodes for an agent would be downloaded from the machine specified in the agent’s metadata description. The downloaded code is not cached by the GMAS server. EMAA uses a similar mechanism that, when combined with caching, greatly reduces the transmission overhead. NOMADS also caches agent code. D’Agents has no caching support and must bring everything along as it jumps.
(f) D’Agents uses a custom encoding to carry agents from host to host but, due to the use of the older VM, cannot accept Java 2 serialized objects. NOMADS has the option to use either a custom encoding for its native agents or the regular Java serialization, while EMAA uses the object serialization services from Java exclusively. A GMAS-enabled agent may use either method described previously, but agents that use the SelfSerializable transfer mechanism suffer a greater performance penalty due to the additional overhead resulting from manual serialization.

(g) Since the Grid messaging system is based on RMI, GMAS suffers from higher per-message latency than the other socket-based systems.

(h) EMAA reuses sockets when an agent jumps, which saves connection setup and tear-down overhead. D’Agents and NOMADS create new sockets for each jump.

(i) The GMAS system has no support for encryption or other security features at present. Furthermore, the security features of D’Agents, EMAA and NOMADS are dramatically different, and it is unclear to what extent they could be made interoperable. In our experiments each system’s security mechanism was disabled to ensure a fair comparison.

4

Performance Results

We measured three types of system-to-system jumps for each agent system. Each agent carried a common cargo of specifically identified data types, i.e., the same size payload. The first set of measurements was based on the native mobile-agent system running on each host. We recorded the round-trip times as the native agent jumped back and forth, using the mobility provided by its mobile-agent system. Second, we measured self-serializable GMAS Reference Implementation agents and, third, Java-serializable agents. In all cases, we measured the average round-trip time over 100 round trips. Each host was a Gateway 9300XL laptop computer with a Pentium III 500 MHz processor, 128 MB of RAM, a 6 GB hard disk drive, and a WaveLAN Gold 11 Mbps wireless network adapter. All hosts ran Slackware Linux version 7 with the 2.2.13 kernel, JDK v1.3rc1, Jini v1.1, and the CoABS Grid v2.0.0beta. The results are summarized in Figure 3.

Fig. 3. Comparison of native vs. interoperable mobility operations. Note NOMADS transfer times were scaled down to fit on this graph. [Bar chart: mean round-trip time (msec) by agent system (EMAA, D’Agents, NOMADS, Reference Implementation), with Native, SelfSerializable, and Serializable bars.]

For EMAA, the Native jump was considerably faster than a jump using GMAS (i.e., the times indicated by the SelfSerializable and Serializable bars in the EMAA section), by a factor of between 5 and 6, depending upon the type of GMAS mobility. NOMADS-Oasis showed an even larger gap between Native and GMAS jump times, a factor of 10. This difference is attributable to some extent to the fact that the Aroma VM is restarted upon the agent’s arrival at each system, and the agent’s class must be retrieved via a URLClassLoader each time. This repeated retrieval is not necessary on systems that use Sun’s VM, as the VM persists and the class remains cached. NOMADS-Oasis also showed a significantly slower overall time due to the unoptimized VM.

Although not shown in the graph, we discovered that the CoABS Grid has significant startup overhead. Running an experiment with a freshly started Grid system added between 2 and 6 seconds to the first round-trip time.

The launchAgent() operation in the D’Agents environment is almost 100 times slower than native calls. Like NOMADS-Oasis, D’Agents cannot directly communicate via the Grid, so we had to use a communication bridge to manage the translation. While this accounts for some of the performance penalty, the D’Agents implementation suffered in other areas. In the GMAS Reference Implementation, we used the URLClassLoader to handle the transfer of the agent’s class file. This mechanism was not available in JDK 1.0.2, so D’Agents was forced to use a less efficient means. Also due to compatibility problems, we could not use our XML parser and were forced to use slower string manipulation methods to convert between GMAS messages and internal objects. Improvements in these areas should significantly cut the overhead cost for interoperable D’Agents.

We also measured the performance of the adaptors implemented for NOMADS and compared that with the native migration of agents in NOMADS. For this test, we used NOMADS-Spring and measured three different migration times: NOMADS-NOMADS (using the native migration protocol), NOMADS-GMAS (Reference Implementation), and GMAS-GMAS. In the NOMADS-GMAS test, an agent moves from a NOMADS-Spring platform to a GMAS platform and back. In the GMAS-GMAS case, the Foreign2GMAS adaptor was needed on both hosts. Therefore, the agent is wrapped and unwrapped during the course of each iteration. Note that the same Native NOMADS agent was used in each of these different tests. Figure 4 shows the results of the experiments.

Fig. 4. Comparison of native vs. interoperable mobility operations in NOMADS-Spring. [Bar chart: time in milliseconds (axis from 0 to 450) for NOMADS-Spring Native, NOMADS-Spring to GMAS, and GMAS to GMAS.]

It is apparent that interoperability has a cost, and in some cases not an insignificant one. If it takes your agent an extra 300-400 milliseconds to move between platforms, at least one of which is a foreign mobile-agent system that provides some service or benefit your agent cannot obtain on its native systems, that seems a small price to pay for this increase in utility. Of course, after tuning these implementations, the performance will improve.
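The measurement procedure described in this section (average round-trip time over 100 round trips, then a native-versus-interoperable ratio) can be sketched as follows. This is only an illustration of the arithmetic, not the harness used in the experiments: the jump callable, the injectable clock, and both function names are assumptions introduced here.

```python
import time
from statistics import mean


def mean_round_trip_ms(jump, trips=100, clock=time.perf_counter):
    """Average time, in milliseconds, of `trips` round trips.

    `jump` is a hypothetical callable that performs one one-way migration;
    a round trip is two jumps (out and back). `clock` returns seconds.
    """
    samples = []
    for _ in range(trips):
        start = clock()
        jump()  # outbound jump
        jump()  # return jump
        samples.append((clock() - start) * 1000.0)
    return mean(samples)


def slowdown(native_ms, interoperable_ms):
    """How many times slower the interoperable path is than the native one."""
    return interoperable_ms / native_ms
```

Passing the clock as a parameter is a design choice for this sketch: it lets the averaging logic be checked with a deterministic fake clock instead of real network jumps. A factor such as "between 5 and 6" in the text corresponds to slowdown() applied to the native mean and the GMAS mean for the same system.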

5

Related Work

Here we explain why we propose an interoperability standard, and how GMAS differs from the standards defined by the Object Management Group (OMG) and the Foundation for Intelligent Physical Agents (FIPA). We recognize that there are some fundamental requirements for interoperability among heterogeneous mobile-agent systems. These include (1) discovery of agents and services, (2) communication, and (3) migration. Without all of these together, the benefits of interoperability will not be realized.

The OMG Mobile Agent Facility (MAF) [12] builds on the CORBA naming, life cycle, externalization, and security services. It is intended to establish standards that support interoperability. This facility promises a degree of interoperability through common interfaces to two basic mobile-agent system components: the MAFAgentSystem and the MAFFinder. A MAFAgentSystem implementation provides agent management and transfer. The MAFFinder interface defines operations for registering and locating agents and agent systems. The specification assumes, however, that it is rare that two different agent systems can receive and execute agents from each other. Indeed, the operation get_nearby_agent_system_of_profile() is provided to find only those migration targets that are running compatible agent systems. In contrast, our GMAS approach directly addresses the issue, providing a standard to enable participating heterogeneous mobile-agent systems to receive and launch each other’s agents.

Like OMG MAF, GMAS leaves it up to the compliant agent system implementations to address security issues specific to mobile-agent systems and does not yet provide standard interfaces or implementations. Also like OMG MAF, GMAS does not specifically address transferring agents between agent systems written in different programming languages. This limitation holds for MAF despite the fact that the underlying CORBA standard supports remote procedure calls among objects written in different programming languages. Mobility across heterogeneous programming languages is much more complicated, since mobile agents must execute their code on the heterogeneous platform directly.

GMAS defines a rich, declarative representation of the mobile agent in transit, to provide adequate flexibility in the mobility infrastructure for the heterogeneous mobile-agent systems. This contrasts with the minimal representation used by the MAF, consisting of agent name, agent profile, agent (which can include class definition and state), place name, class name, code base, and agent sender. GMAS adds agent metadata that are unnecessary in the more homogeneous environment assumed by OMG MAF, such as the name of the start method and the agent’s programming language. GMAS structures the agent in transit as an XML message, instead of a set of method call parameters, to facilitate interpretation by heterogeneous implementations. The additional metadata and the XML representation take GMAS a big step towards language independence. We have shown GMAS-enabled interoperability among three agent systems running on three different versions and implementations of the Java virtual machine.
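To make the idea of an agent in transit as an XML message concrete, here is a minimal sketch. The element names (agent, name, language, className, codeBase, startMethod, state) are hypothetical and chosen only to mirror the kinds of fields mentioned above; the actual GMAS message schema is not reproduced in this section, and encoding the state as base64 is likewise an assumption of the sketch.

```python
import base64
import xml.etree.ElementTree as ET


def build_agent_message(name, language, class_name, code_base, start_method, state):
    """Serialize agent metadata and opaque state bytes into an XML string."""
    root = ET.Element("agent")
    fields = {
        "name": name,
        "language": language,          # metadata a homogeneous system would not need
        "className": class_name,
        "codeBase": code_base,         # where the receiving host can fetch the code
        "startMethod": start_method,   # entry point to invoke after arrival
        "state": base64.b64encode(state).decode("ascii"),
    }
    for tag, text in fields.items():
        ET.SubElement(root, tag).text = text
    return ET.tostring(root, encoding="unicode")


def parse_agent_message(xml_text):
    """Recover the metadata fields and the raw state bytes from the message."""
    root = ET.fromstring(xml_text)
    fields = {child.tag: child.text for child in root}
    # An empty state serializes to an empty element, whose text parses as None.
    fields["state"] = base64.b64decode(fields["state"] or "")
    return fields
```

A declarative message of this kind can be inspected by a receiving platform before it decides how, or whether, to launch the agent, which a fixed list of method call parameters does not allow as easily.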
FIPA defined a set of standards that represent a blueprint for constructing agent systems [5]. A few compliant agent system implementations are listed on the FIPA Web page (http://www.fipa.org/resources/livesystems.html), but FIPA’s standards have not been universally accepted. The FIPA mobility specification recognizes two extreme cases of the mobility protocol. At one end of the spectrum, agents using Simple Mobility Protocols communicate a single move request to their local agent platform, and the agent platform (system) takes care of moving the agent. At the other end of the spectrum, agents using the Full Mobility Protocols communicate with both the remote and the local agent platform and direct every stage of the move from remote request to local termination. Our GMAS model allows a spectrum of migration protocols that fall between the Simple and the Full Mobility Protocols.

GMAS aspires neither to provide standards as broad as FIPA nor to comply with a particular standard at a time when no universally accepted standard has emerged. For example, when an agent wants to find a remote peer, FIPA specifies a directory service, MAF its MAFFinder, and GMAS uses the CoABS Grid agent look-up service that is based on Sun’s Jini. GMAS does not regulate communications protocols among agents, but it assumes that heterogeneous agents are able to communicate with one another. Although we utilized the communication services provided by the Grid to craft GMAS, CoABS Grid adoption is not an absolute requirement: it would be equivalent to replace the directory and communication services with CORBA. The point is that you have to have some common substrate from which to work.

The mobile-agent description specified by FIPA is closer to our GMAS representation than the OMG MAF. It contains a parameter specifically designed to hold the agent’s state. It does not, however, address the needs of the two types of itinerant agent representations for both Self-Serializable and Serializable mobility as defined above.

Three recent papers describe mechanisms that allow mobile agents to migrate between distinct mobile-agent platforms. Magnin et al. [8] require the mobile agent to be written to use a Guest interface, which is layered on top of the native interface of each mobile-agent platform. In this way, it is quite similar to the simplest version of GMAS, where existing mobile agents must be rewritten to use a standard interface. GMAS, however, builds on this simpler functionality and allows existing agents to run unchanged on top of different mobile-agent platforms. Each agent uses the native interface of its “home” platform, and “adaptors” (Foreign2GMAS and GMAS2Native) translate from one native interface to another when the agent is visiting a foreign platform. The adaptors first translate one native interface into a common interface, and then from the common interface to the target interface. Such dual translation prevents a combinatorial explosion in the number of adaptors.

Tjung et al. [13] present a way of converting an Aglets agent into a Voyager agent (and vice versa), so that the agent can migrate between Aglets and Voyager platforms. The conversion requires access to the platform source code, however, something that is unlikely for commercial systems. GMAS avoids the need for source code access, and uses only the public interfaces of each mobile-agent system.

Misikangas and Raatikainen [10] have a similar approach in that they do not require that each MAS be rewritten to a new interoperability API. They use the metaphor of a “head” that contains the agent’s logic or abstracted mission, and a “body” that serves to translate migration, communication, and lookup calls to the specific platform in question, in place of GMAS’s GMAS2Native adaptor. The agent head would need a different body for each different MAS. This approach provides only for the migration of their “MONADS” agents to different agent platforms, however, as the head is a MONADS agent.

Although GMAS is not unique in its ability to enable agent migration among heterogeneous host agent systems, it allows mobile agents to use the native interface of their home platform, and in particular, existing agents do not need to be rewritten. Moreover, GMAS is scalable in the number of mobile-agent systems, since only two adaptors are required per agent platform, rather than one adaptor for each different pair of platforms. GMAS efficiently addresses the issues of packing, transferring, and unpacking an agent in a platform-independent manner, and translation between APIs of different mobile-agent systems.
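The scalability argument above is simple counting, which the following sketch works through. The function names are illustrative only, and whether a pairwise adaptor is counted once per unordered pair or once per direction is an assumption the text leaves open, so both readings are shown.

```python
def adaptors_with_common_interface(n):
    """Two adaptors per platform: one into the common interface, one out of it."""
    return 2 * n


def adaptors_pairwise(n, directed=True):
    """One adaptor per pair of distinct platforms.

    directed=True counts a separate translator for each direction, n * (n - 1);
    directed=False counts one bidirectional adaptor per pair, n * (n - 1) / 2.
    """
    pairs = n * (n - 1)
    return pairs if directed else pairs // 2


if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        print(n,
              adaptors_with_common_interface(n),
              adaptors_pairwise(n, directed=True),
              adaptors_pairwise(n, directed=False))
```

The common-interface count grows linearly while the pairwise count grows quadratically. With only a handful of platforms the two are comparable (for three platforms, 6 versus 6 directed or 3 undirected), so the benefit is in how the counts diverge as more systems join: the common interface needs fewer adaptors once there are more than three platforms under the directed reading, or more than five under the undirected one.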

6

Future Work

This section briefly discusses our anticipated future work in the areas of security, resource control, and agent management.

The current GMAS implementation completely ignores security. We expect that secure communication and secure agent transmission will be provided by the communication transport layer. GMAS, for example, can take advantage of the secure messaging features of the CoABS Grid, while a CORBA-based implementation could exploit CORBA’s security infrastructure. We do plan to incorporate basic agent authentication mechanisms into the Interoperability API. Mobile-agent systems have different security models, however, so we expect that providing full security interoperability will be difficult.

While most mobile-agent systems recognize the need for resource control, once again different systems have different models and capabilities. For example, D’Agents supports a market-based approach to resource allocation and control, whereas NOMADS supports a policy-based approach. NOMADS provides fine-grained control over resource usage (based on capabilities in the Aroma VM), whereas D’Agents and EMAA are constrained by the capabilities in the standard Java VMs. Therefore, coming up with interoperability mechanisms will be challenging. Our first step towards this goal is to make explicit the resource requirements of agents and to provide mechanisms that allow agents to query resource availability.

Agent management and system management are important in large-scale agent systems. Here again, current agent systems support different models and capabilities, making it difficult to manage heterogeneous agent systems as a single administrative unit. In NOMADS, we are exploring the domain management capabilities of the KAoS agent architecture [1,2], which is scalable to multiple agent systems.

The next phase of development is to provide a common messaging API and enhance the adaptors to map the native messaging APIs of the three mobile-agent platforms to the common messaging API.

7

Conclusion and Lessons Learned

This paper describes our design for runtime interoperability of mobile-agent systems. The first stage, described here, involved defining an interoperability API that supported agent migration and agent messaging. The current implementation operates over the DARPA CoABS Grid, which provides the basic registration, lookup, and messaging infrastructure. Performance for the three mobile-agent systems is slower by a factor of 10, but the system proves that interoperability is possible.

The distinguishing feature of our approach has been not to force a common API on all mobile-agent platforms. We recognize that past efforts using this approach have failed. Instead, our goal is to embrace the diversity of the different platforms and to create the necessary translation mechanisms to allow the systems to interoperate.

While we have successfully demonstrated interoperability for agent messaging (via direct CoABS Grid adoption) and agent migration, we also recognize that several tricky issues remain to be solved. The limitations to interoperability stem from the wide range of models and features of different mobile-agent systems. Mobile-agent systems are still at the stage where each system stresses different capabilities while ignoring others. One system difference for which we have no solution is strong versus weak mobility.

Another lesson involved incompatibilities in the Java API. For example, D’Agents currently uses JDK 1.0.2, whereas EMAA and NOMADS support Java 2. Therefore, although we can move an agent from NOMADS or EMAA to D’Agents, the agent may fail to execute because of the differences at the level of the Java API.

Overall, we are pleased with the ability of GMAS to cleanly enable agent migration among diverse agent platforms. The slow performance will improve with careful optimization. Nonetheless, for many applications, performance may be less important than the ability to obtain new services and benefits available only by migration to a foreign MAS platform.

References

1. Jeffrey M. Bradshaw, Stewart Dutfield, Pete Benoit, and John D. Woolley. KAoS: Toward an industrial-strength open agent architecture. In Jeff Bradshaw, editor, Software Agents, pages 375–418. AAAI/MIT Press, 1997.
2. Jeffrey M. Bradshaw, Niranjan Suri, Alberto K. Cañas, Robert Davis, Kenneth Ford, Robert Huffman, Renia Jeffers, and Thomas Reichherzer. Terraforming Cyberspace. IEEE Computer, 34(7):48–56, July 2001.
3. Daria Chacón, John McCormick, Susan McGrath, and Craig Stoneking. Rapid application development using agent itinerary patterns. Technical Report #01-01, Lockheed Martin Advanced Technology Laboratories, March 2000.
4. D. Milojicic et al. MASIF: The OMG mobile agent system interoperability facility. In Proc. of the Second International Workshop on Mobile Agents, volume 1477 of Lecture Notes in Computer Science, pages 50–67. Springer-Verlag, September 1998.
5. FIPA Architecture Board. Agent Management Support for Mobility Specification. Foundation for Intelligent Physical Agents, Geneva, Switzerland, June 2000.
6. Robert S. Gray. Agent Tcl: A flexible and secure mobile-agent system. Technical Report PCS-TR98-327, Dartmouth College, Computer Science, January 1998.
7. Martha Kahn. CoABS Grid User’s Manual. Available from http://coabs.globalinfotek.com/public/downloads/Grid/documents/GridUsersManual.v3.3.0.doc. Global InfoTeK Incorporated, October 2000. Version 2.0.0beta.
8. Laurent Magnin, Thang Viet Pham, Arnaud Dury, Nicolas Besson, and Arnaud Thiefaine. Our Guest agents are welcome to your agent platforms. In Proceedings of the Seventeenth ACM Symposium on Applied Computing 2002 (SAC 2002), Madrid, Spain, March 2002.
9. Susan McGrath, Daria Chacón, and Ken Whitebread. Intelligent mobile agents in the military domain. In Proceedings of the Autonomous Agents 2000 Workshop on Agents in Industry, Barcelona, Spain, 2000.
10. Pauli Misikangas and Kimmo Raatikainen. Agent migration between incompatible agent platforms. In Proceedings of the Twentieth International Conference on Distributed Computer Systems. IEEE Computer Society Press, April 2000.
11. Niranjan Suri, Jeffrey M. Bradshaw, Maggie R. Breedy, Paul T. Groth, Gregory A. Hill, and Renia Jeffers. Strong mobility and fine-grained resource control in NOMADS. In Proc. of ASA/MA2000, volume 1882 of Lecture Notes in Computer Science, pages 2–15, September 2000.
12. The Object Management Group. Mobile Agent Facility, January 2000. v1.0.
13. D. Tjung, M. Tsukamoto, and S. Nishio. A converter approach for mobile-agent system integration: A case of Aglet to Voyager. In Proceedings of the First International Workshop on Mobile Agents for Telecommunication Applications (MATA ’99), pages 179–195, Ottawa, Canada, October 1999.

Mobile Intermediaries Supporting Information Sharing between Mobile Users

Norliza Zaini and Luc Moreau

Department of Electronics and Computer Science, University of Southampton
Southampton SO17 1BJ UK
{nmz00r, L.Moreau}@ecs.soton.ac.uk

Abstract. Mobile devices’ networking capabilities offer opportunities for a new range of applications. We consider here a service that allows mobile users taking part in virtual meeting rooms to share information and documents. The sharing is promoted by a recommender system that assists users browsing documents by making recommendations in the form of URLs pointing to other documents that users in the virtual room have explicitly decided to share. A multi-user recommender system is a complex application requiring communication, memory and computing resources, and does not lend itself to a port to mobile devices with limited resources and intermittent connectivity. For this reason, we decided to offload the computationally intensive part of the application to the infrastructure, and to introduce the idea of an intermediary located in the network infrastructure, which interacts with applications on behalf of the mobile device, thereby hiding away the intermittent connectivity details. Our vision is that of a mobile intermediary, called Shadow, that will always be in close vicinity of the mobile device. We show that multiple Shadows may co-exist, and we propose a protocol capable of coordinating them. We present an abstraction layer, hiding away communication and coordination details, which offers a substrate for building the distributed recommender system across mobile devices and fixed infrastructure. Implementation details of our application are also presented.

1

Introduction

The context of this paper is the “ubiquitous computing environment” [12], where embedded devices and artifacts abound in buildings and homes, and have the ability to sense and interact with devices carried by people in their vicinity. Mobile devices’ networking capabilities offer opportunities for a new range of services, such as access to stock updates or latest news, or exchange of information with other mobile users discovered dynamically. However, as devices communicate over wireless networks, they are prone to temporary disconnections, and the low-bandwidth communication channel they use may lead to high network latency. Moreover, although they have the advantage of being small and easy to use, handheld devices such as PDAs suffer from low resource capability, such as low memory capacity, limited processing power and small display area. Consequently, these limited capabilities would prevent the large-scale deployment of advanced services to mobile users, as such services tend to be communication and computation intensive.

Instead of requiring complex applications to be installed on mobile devices and constant connectivity between mobile devices and fixed network to serve users’ requests, we believe that applications can be offloaded to the fixed infrastructure, and act semi-autonomously on behalf of the user. In this context, if infrastructure applications can perform tasks without direct control and monitoring from the user, then the proposed approach does not rely on permanent connectivity with mobile devices, it can save the mobile device’s resources, and it can take advantage of the available resources on the wired network.

While offloading applications to the fixed infrastructure solves the problem of limited device resources, it does not address the issue of how communications can take place between the mobile device and the infrastructure application. To this end, we introduce an intermediary process in the fixed infrastructure, whose responsibility is to spawn applications in reaction to the user’s requests and to store and forward messages between mobile devices and applications. Since a stationary intermediary may lead to long-distance communications, we decided to adopt a more flexible approach, in which the intermediary is mobile. Our vision is to have a mobile intermediary, which is a mobile agent [5], acting as a Shadow of the mobile user, migrating to the user’s vicinity when prevailing conditions permit it. The flexibility offered by the mobility can help reduce the bandwidth required for the application and improve its performance [3].

Besides, this has a number of other advantages: (i) Shadow and mobile device can communicate using specialised protocols, possibly dynamically chosen according to the current location or to a negotiation between parties; (ii) newly created applications would run in the user’s vicinity, making use of the local infrastructure; (iii) even if the local network is not connected to the Internet, local services could be accessed; (iv) Shadows and applications can communicate reliably using transparent routing of messages to mobile agents [8,9].

When a user moves to a new location, their mobile device, interacting with the infrastructure, will request the user’s Shadow to migrate to the new location. However, this may fail when the user’s local network is not connected with the user’s previous location. In order to support services in the current vicinity, we opted for a solution where new Shadows can be created dynamically. As a result, a user may be associated with multiple Shadows that need to be coordinated, and we shall describe in this paper a coordination protocol for such Shadows.

The purpose of this paper is to introduce an application for mobile users, which supports information sharing in virtual meeting rooms. The design and implementation of this application involves complex interactions between infrastructure and mobile device. Our specific contributions are:

1. An architecture supporting multiple mobile Shadows;
2. A coordination protocol between mobile devices and Shadows;
3. An abstraction layer, encapsulating migration and coordination, offering a substrate to program applications between mobile devices and infrastructure;
4. An application supporting information sharing between mobile users.

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 121–137, 2002. © Springer-Verlag Berlin Heidelberg 2002
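The store-and-forward role of the intermediary introduced above can be illustrated with a small sketch. It is a toy model under stated assumptions, not the Shadow implementation: the class and method names are invented for illustration, the `delivered` list stands in for the wireless link to the device, and migration and multi-Shadow coordination are left out entirely.

```python
from collections import deque


class Shadow:
    """Toy intermediary that buffers application messages for a mobile device."""

    def __init__(self):
        self.connected = False   # whether the device is currently reachable
        self.pending = deque()   # messages stored while the device is away
        self.delivered = []      # stand-in for the link to the device

    def from_application(self, message):
        """Forward a message immediately if possible, otherwise store it."""
        if self.connected:
            self.delivered.append(message)
        else:
            self.pending.append(message)

    def device_connected(self):
        """On reconnection, flush stored messages in arrival order."""
        self.connected = True
        while self.pending:
            self.delivered.append(self.pending.popleft())

    def device_disconnected(self):
        self.connected = False


if __name__ == "__main__":
    shadow = Shadow()
    shadow.from_application("recommendation 1")  # device offline: stored
    shadow.device_connected()                    # flushed on reconnect
    print(shadow.delivered)
```

The point of the sketch is that the infrastructure application never needs to know whether the device is reachable; it always hands its messages to the intermediary.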


In the next section, we overview our application scenario. Then, in Section 3, we introduce the architecture followed by Section 4, where we present the algorithms to be implemented by all its components. In Section 5, we further comment on the application’s implementation and finally, in the last section we present the summary and discuss related work.

2

Application Scenario

Today’s Internet offers various forms of virtual meeting rooms allowing online users to meet and interact with each other; examples include Internet Relay Chat (IRC) supporting text channels and the Access Grid (www.accessgrid.org) supporting audio and video multicast conferencing. When business meetings are conducted with such facilities, it is useful for users to access an out-of-line mechanism for sharing information or documents. Our approach is to rely on an agent-based recommender system [11] to recommend relevant documents to users as they browse information. The recommended documents will be selected from a pool of documents that users participating to the virtual room have actively decided to share; documents are recommended according to their similarity to other documents. Practically, a user navigates information, while recommendations are displayed in a browser sidebar; recommendations also include information regarding the user who decided to make the document available to the meeting room. The sidebar also allows users to export documents for possible recommendation in the meeting room. In the context of this paper, our goal is to allow mobile users to take part into virtual meetings. As mobile users roam to other networks, they remain in the virtual meeting rooms that had subscribed to, and they are given the opportunity to discover and use different virtual meeting rooms running on different networks. Here, we do not address the problem of delivering content of the meeting room (such as IRC, audio or video channels), instead, we investigate the infrastructure to allow the sharing of information through the recommender system. In short, our application’s functions include: (i ) to support information sharing between mobile users, (ii ) to provide recommendations based on the shared information to mobile users, (iii ) to allow a mobile user to remain a participant in a meeting room while the user roams to different networks. 
We have already developed an agent-based recommender system [11] capable of recommending documents that users have actively decided to share. This leaves us with the challenge of developing an application that is able to support mobile users accessing virtual meeting rooms hosted by the fixed infrastructure. In the following sections, we present an architecture supporting mobile intermediaries, implemented as mobile agents, which hide communication details between mobile devices and applications on the fixed infrastructure.

N. Zaini and L. Moreau

3 Architecture Overview

Our proposed architecture is composed of three major components, namely a mobile device, a Shadow and a Shadow Manager, which we describe together with the assumptions we make concerning their communication capabilities. We are also investigating the security aspects of the architecture, but we do not present them at this stage.

A mobile device has the ability to connect to a network in its vicinity. It may use specific methods to communicate with network hosts, e.g. infra-red or Bluetooth. We assume that the device is allocated an address, which may change as the device connects to another network, and which can be used by networked entities to communicate with it.

A Shadow Manager acts as a local daemon in a local network and is the first contact point of a mobile device with that network. A Shadow Manager is responsible for starting or migrating Shadows on behalf of mobile devices.

A Shadow is a mobile agent, acting as an intermediary between a mobile device and infrastructure applications. Being able to migrate allows it to move “closer” to the mobile device, and to communicate with it using the address allocated by the local network. The Shadow’s functions are: (i) to create applications on behalf of the mobile device; (ii) to send messages to the applications on behalf of the mobile device; (iii) to store and forward messages for the mobile device; (iv) to migrate to a location closer to the mobile device, whenever the mobile device changes its location, network connectivity permitting.

Our architecture may be summarised as follows. When connected to a network, a mobile device makes contact with a Shadow Manager, and requests it to migrate its Shadows to the manager’s location. If none of the user’s Shadows is active, the Shadow Manager creates a new Shadow for the mobile device. In the simplest case, there exists a single Shadow. If migration is successful, the Shadow interacts locally with the mobile device.
The Shadow spawns new applications as requested by the device and forwards messages to and from them; in essence, the Shadow acts as a router of messages to the applications. Communications between Shadow and applications are robust to the migration of Shadows, based on a transparent routing algorithm [8,9]; on the other hand, communications between device and Shadow may fail as the device changes location. If the migration of all Shadows fails, a new Shadow is spawned locally, and the device keeps a log of all created Shadows. When several Shadows are requested to migrate to a specific destination, the first Shadow to reach the location is assigned to be the main Shadow; the others coordinate with it to offload information about applications they were routing messages to. In the following section, we describe precisely the algorithm of each component. Our goal is to define an abstraction layer, which hides the details of communication and coordination between mobile devices, Shadows and applications. On top of this abstraction layer, we have constructed an application, which supports information sharing between mobile users in virtual meeting rooms: in the mobile device, a programming API is provided to communicate transparently with fixed infrastructure applications, while applications are given the possibility to interact transparently with mobile devices; the abstraction layer takes care

Mobile Intermediaries Supporting Information Sharing between Mobile Users


of all necessary routing and coordination. In addition, the coordination between multiple Shadows introduces a global property, which allows the mobile device to interact only with a local main Shadow, while Shadows are allowed to interact directly with the mobile device only if it is connected locally.

4 The Algorithm

In this section, we describe the algorithm coordinating the interactions between mobile devices, Shadows, Shadow Managers and applications. In Figure 1, we define the notation that we use in our algorithm.

LAN_i (Local Network i)
P_{i,j} (Platform j on LAN_i)
SM_{i,j} (Shadow Manager j on LAN_i)
LS (List of active Shadows)
LA (List of applications)
LOM (List of outgoing messages)
LIM (List of incoming messages)

β^MD, α^MD (Mobile Device: Identifier, Address)
β^MS, α^MS, γ^MS (Main Shadow: Identifier, Address, Counter)
β^S, α^S (Shadow: Identifier, Address)
β^{SM_{i,j}}, α^{SM_{i,j}} (SM_{i,j}: Identifier, Address)
β^App, α^App (Application: Identifier, Address)

send(β, α, Msg) (Send message Msg to entity identified by β, α)
receive(β, α, Msg) (Receive message Msg from entity identified by β, α)

List operations:
[M] : L (List composed of a head M and tail L)
L1 : L2 (Concatenation of two lists L1, L2)
L : [M] (Message M added to the end of list L)
enqueue(M, L) ≡ L := L : [M]

Fig. 1. Notation
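As a minimal illustration of the list notation in Fig. 1, the Python sketch below models a queue as a plain list. The helper name `split_head` is hypothetical (it does not appear in the paper) and mirrors the pattern L = [M] : L′; the empty list stands in for ⊥.

```python
from typing import Any, List, Optional, Tuple


def enqueue(msg: Any, queue: List[Any]) -> None:
    """enqueue(M, L) is L := L : [M], i.e. append M to the end of L."""
    queue.append(msg)


def split_head(queue: List[Any]) -> Optional[Tuple[Any, List[Any]]]:
    """Match L = [M] : L'; return (M, L'), or None when L is empty."""
    if not queue:
        return None
    return queue[0], queue[1:]


lom: List[Any] = []          # an outgoing message list (LOM)
enqueue("m1", lom)
enqueue("m2", lom)
head, rest = split_head(lom)  # head is the oldest message, rest is the tail
```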

Mobile Device. The behaviour of a mobile device is described in Figures 2 and 3. Each mobile device stores a main Shadow identifier and a list of Shadows (referred to by their identifier and address) that are active on the fixed network. Every time a mobile device connects to the fixed network, it resets its main Shadow identifier; it discovers a local Shadow Manager, and sends it a request to migrate its active Shadows to the platform on which the Shadow Manager is operating. If the request fails to be sent, another local Shadow Manager is discovered and a similar request is sent. Otherwise, the mobile device starts to wait for messages. If no “ShadowInformation” message is received within an expected time range, a timeout occurs and a new request is sent to the Shadow Manager. A “ShadowInformation” message informs the device about the existence of a Shadow and its current address. When received, this information is updated in the list of active Shadows. If no main Shadow exists, the message’s sender


Variables: LS, LOM, LIM, β^MD, γ^MS := 0;

When MD connects to LAN_i at address α:
– Online := true;
– β^MS := ⊥; α^MS := ⊥; //resets identifier and address of main Shadow
– D: discover SM_{i,j} with identifier β^{SM_{i,j}} and address α^{SM_{i,j}};
– MR: send(β^{SM_{i,j}}, α^{SM_{i,j}}, MigrateShadow(LS));
– if failed, then: discover other Shadow Manager, goto D;
– else:
  • if receive(β^{SM_{i,j}}, α^{SM_{i,j}}, SMAck), then:
    ∗ start_timer(ShadowInformation(β^MD, migrateCount^{Sx}), d);

Repeatedly process the incoming messages:
– if receive(β^S, α^S, ShadowInformation(β^MD, migrateCount^{Sx})), then:
  • stop_timer(ShadowInformation(β^MD, migrateCount^{Sx}), d);
  • LS[β^S] := α^S;
  • if β^MS = ⊥, then:
    ∗ β^MS := β^S; α^MS := α^S; γ^MS++;
    ∗ enqueue((β^MS, α^MS, MSAssignment(LS, γ^MS)), LOM);
  • else: enqueue((β^S, α^S, MSInformation(β^MS, α^MS, γ^MS)), LOM);
– if receive(β^MS, α^MS, MessageFailed(β^App, Msg)), then:
  • enqueue((β^MS, α^MS, Message(β^App, Msg)), LOM);
– if receive(β^MS, α^MS, CreationFailed(β^App, App)), then:
  • enqueue((β^App, CreationFailed(β^App, App)), LIM);
– if receive(β^MS, α^MS, Message(β^App, Msg)), then: enqueue((β^App, Msg), LIM);
– if receive(β^MS, α^MS, ShadowTermination(β^S)), then: LS[β^S] := ⊥;

In parallel:
– if timeout, then:
  • stop_timer(ShadowInformation(β^MD, migrateCount^{Sx}), d);
  • send(β^{SM_{i,j}}, α^{SM_{i,j}}, StartShadow);

Exported API:
– CreateApplication(β^App, App):
  • enqueue((β^MS, α^MS, CreateApplication(β^App, App)), LOM);
– SendMessageToApp(β^App, Msg):
  • enqueue((β^MS, α^MS, Message(β^App, Msg)), LOM);
– GetMessageFromApp():
  • if LIM = ⊥, then: return ⊥;
  • if LIM = [(β^App, Msg)] : LIM′, then: LIM := LIM′; return (β^App, Msg);

Fig. 2. Mobile Device’s Behaviour (i)

is assigned to be the main Shadow by sending it an “MSAssignment” message; if a main Shadow already exists, the Shadow is sent an “MSInformation” message, which notifies it about the current main Shadow.
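The handling of “ShadowInformation” messages described above can be sketched in Python as follows. This is only an illustrative model under stated assumptions: the class `MobileDevice`, its field names and the tuple encoding of messages are hypothetical, and `None` stands in for ⊥.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MobileDevice:
    shadows: Dict[str, str] = field(default_factory=dict)  # LS: identifier -> address
    main_id: Optional[str] = None                           # main Shadow identifier
    main_addr: Optional[str] = None                         # main Shadow address
    counter: int = 0                                        # main Shadow assignment counter
    outgoing: List[Tuple] = field(default_factory=list)     # LOM

    def connect(self) -> None:
        """On every (re)connection the main Shadow is reset."""
        self.main_id = None
        self.main_addr = None

    def on_shadow_information(self, shadow_id: str, addr: str) -> None:
        """Record the Shadow; the first one to report becomes the main Shadow."""
        self.shadows[shadow_id] = addr
        if self.main_id is None:
            self.main_id, self.main_addr = shadow_id, addr
            self.counter += 1
            msg = ("MSAssignment", dict(self.shadows), self.counter)
        else:
            msg = ("MSInformation", self.main_id, self.main_addr, self.counter)
        self.outgoing.append((shadow_id, addr, msg))
```

With two Shadows reporting after a reconnection, the first is assigned as main Shadow and the second is only told who the main Shadow is.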


An API is provided for the application layer to create a new application, to send a message to an application, or to receive a message from an application on the infrastructure; applications are referred to by identifiers passed as arguments to the API procedures. The first two procedures result in a message sent to the main Shadow. In return, the mobile device may receive a failure notification from the main Shadow indicating its failure to create the application or to send a message to the application. Incoming application messages are kept in a queue of messages until the application layer reads them.

Incoming messages change the internal state and queues of the mobile device. For instance, when a “ShadowTermination” message, which informs about a recently terminated Shadow, is received, the terminated Shadow’s details are removed from the mobile device’s list of active Shadows. Messages to be sent to Shadows are added to a queue of outgoing messages, while in parallel (cf. Figure 3), separate threads are responsible for processing the enqueued messages.

Additionally, messages are validated before they are sent. For this purpose, we use a counter “γ^MS”, which identifies the number of times the mobile device has changed location; this counter is also added as a “timestamp” to messages. A message can be outdated if the intended recipient no longer exists or has changed its status or address. An outdated message can still be valid if the intended recipient still holds the same status but has changed its address. Such messages are updated with the new address, while invalid messages are discarded from the queue. For instance, an “MSInformation” or an “MSAssignment” message is no longer valid if a new main Shadow is being assigned. If sending a message fails, the message is added back to the queue.

Continuously in parallel: if Online and β^MS ≠ ⊥ and α^MS ≠ ⊥, then:
– if LOM = [(β^R, α^R, Msg)] : LOM′, then:
  • LOM := LOM′;
  • if Msg = MSAssignment(LS′, γ′^MS), then:
    ∗ if γ′^MS = γ^MS, then: sendOut(β^MS, α^MS, MSAssignment(LS, γ^MS)); //use main Shadow’s latest address α^MS and latest LS
  • if Msg = MSInformation(β′^MS, α′^MS, γ′^MS), then:
    ∗ if γ′^MS = γ^MS, then: sendOut(β^R, LS[β^R], MSInformation(β^MS, α^MS, γ^MS)); //use β^R’s latest address
  • else: sendOut(β^R, α^R, Msg);

Subroutine: sendOut(β^R, α^R, Msg):
– send(β^R, α^R, Msg);
– if fail, then: enqueue((β^R, α^R, Msg), LOM);

When MD disconnects from LAN_i: Online := false;

Fig. 3. Mobile Device’s Behaviour (ii)
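One step of the outgoing-queue processing in Fig. 3 can be sketched in Python as below. The function `process_head`, the tuple encoding of messages (with the counter as the last element) and the boolean-returning `send` callback are assumptions made for illustration. Falling back to the queued address when the recipient is no longer in LS is also an assumption of this sketch.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple  # e.g. ("MSInformation", main_id, main_addr, counter)
Entry = Tuple[str, str, Message]


def process_head(lom: List[Entry],
                 shadows: Dict[str, str],
                 main: Tuple[str, str],
                 counter: int,
                 send: Callable[[str, str, Message], bool]) -> None:
    """Validate, refresh and send the message at the head of LOM."""
    if not lom:
        return
    rid, raddr, msg = lom.pop(0)
    kind = msg[0]
    if kind in ("MSAssignment", "MSInformation"):
        if msg[-1] != counter:
            return  # a newer main Shadow was assigned: discard the stale message
        if kind == "MSAssignment":
            rid, raddr = main  # use the main Shadow's latest address and latest LS
            msg = ("MSAssignment", dict(shadows), counter)
        else:
            raddr = shadows.get(rid, raddr)  # use the recipient's latest address
            msg = ("MSInformation", main[0], main[1], counter)
    if not send(rid, raddr, msg):
        lom.append((rid, raddr, msg))  # sending failed: put the message back
```

Stale main-Shadow messages are dropped without being sent, while ordinary application messages are retried until `send` succeeds.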


SM_{i,j} running at address α_{i,k}:
– advertise its presence at address α_{i,k};
– Repeatedly process the incoming messages:
– if receive(β^MD, α^MD, MigrateShadow(LS)), then:
  • send(β^MD, α^MD, SMAck);
  • C := 0; //a counter
  • for each pair (β^S, α^S) in LS:
    ∗ send(β^S, α^S, MigrateRequest(β^MD, α^MD, α_{i,k}));
    ∗ if successful, then C := C+1;
  • if C = 0, then:
    ∗ startShadow(β^MD, α^MD);
– if receive(β^MD, α^MD, StartShadow), then:
  • startShadow(β^MD, α^MD);

Subroutine: startShadow(β^MD, α^MD):
– start a Shadow S with new identifier β^S at address α^S;
– send(β^S, α^S, MDInformation(β^MD, α^MD)); resend on failure;

Fig. 4. Shadow Manager’s Behaviour
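The Shadow Manager’s handling of a “MigrateShadow” request in Fig. 4 can be sketched in Python as follows. The function name and the two callbacks (`send_migrate_request`, `start_shadow`) are hypothetical stand-ins for the platform’s messaging and agent-creation facilities.

```python
from typing import Callable, Dict


def handle_migrate_shadow(shadows: Dict[str, str],
                          send_migrate_request: Callable[[str, str], bool],
                          start_shadow: Callable[[], None]) -> int:
    """Ask every known Shadow to migrate; start a new Shadow if none was reached."""
    reached = 0  # the counter C of Fig. 4
    for shadow_id, addr in shadows.items():
        if send_migrate_request(shadow_id, addr):
            reached += 1
    if reached == 0:
        start_shadow()
    return reached
```

A device that arrives with an empty list of Shadows, or whose Shadows are all unreachable, therefore always ends up with a freshly started local Shadow.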

Shadow Manager. In our architecture, Shadow Managers are stationary agents running on the fixed network; for the algorithm to work, we assume that there is at least one Shadow Manager operating on each local network. When a Shadow Manager is started (cf. Figure 4), it advertises its presence through a service directory, such as Jini or LDAP, and then waits for messages. A Shadow Manager may receive a request from a mobile device to migrate Shadows; the Shadow Manager then sends a “MigrateRequest” message to each Shadow, requesting it to migrate to the platform on which the manager is operating. If no Shadow was able to migrate, or if the manager receives a “StartShadow” request, it starts a new Shadow, to which it sends an “MDInformation” message containing information on the requesting mobile device. The message is resent on failure until the Shadow eventually gets it.

Shadow. Figure 5 describes the global behaviour of Shadows, while Figures 6 and 7 describe the specific behaviour of a regular Shadow and of a main Shadow. A Shadow is able to create applications on request from the mobile device. Each application has an identifier and an address. A variable “LA”, which is initially empty, is used to keep a mapping from application identifier to application address. A Shadow has a handOver flag, which is set to false on its creation. This flag becomes true once the Shadow’s function has been transferred to the main Shadow. Messages to be sent to the mobile device are queued in a list of outgoing messages to the mobile device (LOM^MD), while messages to be sent to the main Shadow are queued in a list of outgoing messages to the main Shadow (LOM^MS). Incoming messages from the mobile device for the applications are queued in a list of incoming messages (LIM), while a list (LOM^Ack) queues acknowledgement messages to other Shadows. In parallel, separate threads are responsible for processing the


enqueued messages, and add any message that failed to be sent back to its respective queue.

Variables: β^S, β^MS, γ^MS, α^MD, β^MD, LA, handOver, LOM^MD, LOM^MS, LIM, LOM^Ack, migrateCount, handOverCount, MS, LS;

Continuously in parallel:
– if LOM^MD = [Msg] : LOM′^MD, then:
  • remove all ShadowInformation(β^MD, migrateCount′) messages from LOM^MD where migrateCount′ < migrateCount;
  • LOM^MD := LOM′^MD;
  • if β^MS = ⊥, then: send(β^MD, α^MD, Msg);
  • else: send(β^MS, α^MS, SendMessageToMD(Msg));
  • if failed, then: enqueue(Msg, LOM^MD);
– if LIM = [(β^App, Msg)] : LIM′, then:
  • LIM := LIM′;
  • send(β^App, LA[β^App], Msg); if failed, then: enqueue((β^App, Msg), LIM);
– if LOM^Ack = [(β^Sx, α^Sx, Msg)] : LOM′^Ack, then:
  • LOM^Ack := LOM′^Ack;
  • send(β^Sx, α^Sx, Msg); if failed, then: enqueue((β^Sx, α^Sx, Msg), LOM^Ack);

Subroutine: migrate(α^{P_{x,y}}):
– if (LAN_x ≠ LAN_i), then: migrate to α^{P_{x,y}};
– migrateCount++;
– β^MS := ⊥; α^MS := ⊥; MS := false;
– enqueue(ShadowInformation(β^MD, migrateCount), LOM^MD);

Fig. 5. Global Behaviour of Shadows
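The stale-message rule of Fig. 5, whereby a queued “ShadowInformation” message is dropped once the Shadow has migrated again, can be sketched in Python as follows. The function name and the tuple encoding `("ShadowInformation", device_id, migrate_count)` are assumptions made for illustration.

```python
from typing import List, Tuple


def drop_stale_shadow_information(lom_md: List[Tuple], migrate_count: int) -> List[Tuple]:
    """Keep every message except ShadowInformation messages from earlier migrations."""
    return [
        msg for msg in lom_md
        if not (msg[0] == "ShadowInformation" and msg[2] < migrate_count)
    ]


queue = [
    ("ShadowInformation", "md1", 1),   # announced from a platform the Shadow has left
    ("Message", "app1", "hello"),      # application traffic is never filtered here
    ("ShadowInformation", "md1", 2),   # announcement for the current platform
]
current = drop_stale_shadow_information(queue, migrate_count=2)
```

Only the announcement carrying the current migration count survives, so the device is never told about an address the Shadow no longer occupies.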

The variable migrateCount keeps track of the number of migrations of a Shadow. It is used to validate messages when they are sent out from the list of outgoing messages. For example, a “ShadowInformation” message, which is used to inform the mobile device about the Shadow’s arrival on a platform, is no longer valid if the Shadow has already migrated to a new platform. In this case, the migrateCount value recorded in the “ShadowInformation” message is less than the current migrateCount, which shows that the message is outdated and invalid. Such messages are discarded from the list. In a Shadow, there is a hook for intelligent decision making about migration; this decision is not part of the algorithm, and may depend on the state of the application or on prevailing network conditions. The output of this decision-making process is obtained through the “callback” canMigrate(), which returns true if the application layer decides to migrate.

Fig. 6. Regular Shadow’s Behaviour (MS=false)

A main Shadow S is running on platform P_{i,k} at address α_{i,k} on LAN_i:

Repeatedly process the incoming messages:
– if receive(β^{SM_{u,q}}, α^{SM_{u,q}}, MigrateRequest(β^MD, α′^MD, α^P)), then:
  • α^MD := α′^MD; MS := false; //behave as in Figure 6
  • if canMigrate(), then: migrate(α^P);
– if receive(β^Sx, α^Sx, LocationInformation(β^MD, α′^MD, α^P, γ′^MS)) and γ′^MS > γ^MS, then:
  • MS := false; α^MD := α′^MD; β^MS := β^Sx; α^MS := α^Sx; //behave as in Figure 6
  • if canMigrate(), then: migrate(α^P);
  • else: enqueue(TransferF(β^MD, LA, LOM^MD, LIM), LOM^MS);
– if receive(β^Sx, α^Sx, TransferF(β^MD, LA^Sx, LOM^MD_Sx, LIM^Sx)), then:
  • LA := LA : LA^Sx; LIM := LIM : LIM^Sx;
  • LOM^MD := LOM^MD : LOM^MD_Sx;
  • enqueue((β^Sx, α^Sx, TransferFAck), LOM^Ack);
– if receive(β^Sx, α^Sx, SendMessageToMD(Msg)), then:
  • enqueue(Msg, LOM^MD);
– if receive(β^Sx, α^Sx, TerminationMessage(β^MD)), then:
  • enqueue(ShadowTermination(β^Sx), LOM^MD);
  • enqueue((β^Sx, α^Sx, TerminationAck), LOM^Ack);

Continuously in parallel, if LS ≠ ⊥ then:
– if LS = [(β′^S, α′^S)] : LS′ and β′^S ≠ β^S, then:
  • send(β′^S, α′^S, LocationInformation(β^MD, α^MD, α_{i,k}, γ^MS));
  • if fails, then: LS := LS′ : [(β′^S, α′^S)]; else: LS := LS′;

Interface with Applications:
– if receive(β^MD, α^MD, CreateApplication(β^App, App)), then:
  • StartApplication(β^App, App); its address is α^App;
  • if success, then: LA[β^App] := α^App;
  • else: enqueue(CreationFailed(β^App, App), LOM^MD);
– if receive(β^MD, α^MD, Message(β^App, Msg)), then:
  • if LA[β^App] ≠ ⊥, then: enqueue((β^App, Msg), LIM);
  • else: enqueue(MessageFailed(β^App, Msg), LOM^MD);
– if receive(β^App, α^App, Msg), then:
  • if LA[β^App] ≠ ⊥, then: enqueue(Message(β^App, Msg), LOM^MD);
  • else: enqueue((β^App, MessageFailed(Msg)), LIM);

Fig. 7. Main Shadow’s Behaviour (MS=true)

When started (cf. Figure 6), a Shadow waits for an “MDInformation” message, which contains information about a mobile device. Then, the Shadow sends the mobile device a “ShadowInformation” message informing it of the Shadow’s existence at its current address. The Shadow waits for messages; if it receives an “MSAssignment” message, it sets its main Shadow (MS) flag to true, as it is being assigned by the mobile device to be the main Shadow. Instead of receiving an “MSAssignment” message, a Shadow may receive an “MSInformation” message, which signifies that another Shadow has been assigned to be the main Shadow. The Shadow sets its MS flag to false and updates its information about


the main Shadow accordingly. A Shadow may receive a “MigrateRequest” message from a Shadow Manager, which requests the Shadow to migrate to the platform on which the Shadow Manager is operating. If canMigrate() returns true, the Shadow migrates to the new platform. On arrival at the new platform, the Shadow resets its MS flag to false and sends a “ShadowInformation” message to the mobile device. Then it waits for messages. If a Shadow cannot migrate, it stays on the same platform and continues to wait for further messages.

A regular Shadow has to hand over its function to the main Shadow by sending its LA, LOM^MD and LIM in a “TransferF” message. The Shadow sets its handOver flag to true when it receives a “TransferFAck” message, and notifies all applications it is interacting with that the main Shadow is the new intermediary for communicating with the mobile device. Every message sent to the applications requires an acknowledgement to ensure that the recipient has successfully received the message. Subsequently, the Shadow is ready for termination; before terminating itself, it sends a “TerminationMessage” to the main Shadow and waits for an acknowledgement.

“MSAssignment”, “MSInformation” and “LocationInformation” are types of messages that carry information about the main Shadow. Each of these messages contains the main Shadow assignment counter “γ^MS”, which prevents Shadows from using outdated information about the main Shadow. For instance, if a Shadow receives an “MSInformation” message with a counter that is less than the one contained in a previously received message, the message is considered outdated and is discarded. Sometimes, a regular Shadow may receive a message intended for a main Shadow, such as when it receives a “TransferF” or a “TerminationMessage” message from another Shadow. In this case, an “MSInformation” message is returned informing the sender about the current main Shadow.
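The counter check on “MSInformation” messages can be sketched in Python as below. The class `ShadowView` and its field names are hypothetical; the sketch only models the rule that a message whose counter is lower than the last accepted one is discarded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShadowView:
    """What a regular Shadow currently believes about the main Shadow."""
    main_id: Optional[str] = None
    main_addr: Optional[str] = None
    counter: int = 0          # highest main Shadow assignment counter seen so far
    is_main: bool = False     # the MS flag

    def on_ms_information(self, main_id: str, main_addr: str, counter: int) -> bool:
        """Apply the update unless it is older than what is already known."""
        if counter < self.counter:
            return False  # outdated message: discard
        self.main_id, self.main_addr, self.counter = main_id, main_addr, counter
        self.is_main = False
        return True
```

Because the device increments the counter at every main Shadow assignment, a delayed message about an earlier main Shadow can never overwrite newer information.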
If a “SendMessageToMD” message is received, which requests the Shadow to send a message to the mobile device, a “SendMessageToNewMS” message containing an “MSInformation” and the message for the mobile device is sent in reply to the sending Shadow.

A main Shadow sends a “LocationInformation” message to all Shadows of the mobile device. The message indicates the current location of the mobile device. If a main Shadow receives a “TransferF” message, the LA, LOM^MD and LIM of the other Shadow, contained in the message, are appended to the main Shadow’s local lists. A main Shadow may also receive a “SendMessageToMD” request from a Shadow, which requires it to relay the message included in the request to the mobile device. A received “TerminationMessage” notifies the main Shadow about the termination of a Shadow. This information is relayed to the mobile device in a “ShadowTermination” message. For every message received from another Shadow, an acknowledgement is returned to the sender. As for messages coming from the mobile device, a main Shadow may receive requests to create an application or to send a message to an application on the fixed infrastructure. The details of a newly created application are added to LA. If the Shadow fails to create an application or to send a message, a failure notification is returned to


the mobile device. The Shadow also relays messages from the applications to the mobile device.

Summary. In our algorithm, we make sure that messages are not lost: whenever communication failures occur, the messages involved are put in queues. For instance, when a mobile device is disconnected from the network and no longer able to send messages to the main Shadow, those messages are added to the queue of outgoing messages. Once the mobile device reconnects to the network, messages from the queue are sent out. The same applies to the Shadows; when a Shadow fails to send a message to another Shadow or to the mobile device, the message is stored in a queue to be sent out again later. The algorithm also tries to terminate Shadows that no longer act as routers for applications and have handed over their functions to the main Shadow. Terminating such Shadows is important as it clears garbage in the system. The outcome is that some Shadows may be temporarily disconnected from the main Shadow, and therefore may lose the route to deliver messages to the mobile device. Messages are not lost; they remain in the queue and will be forwarded when connectivity is re-established. We are considering another approach where termination of Shadows is not as eager, in order to ensure some redundancy in the routing along the lines of [9]. Similarly, such an approach may be considered for handling host failures.

The coordination layer really benefits from Mobile Agent technology. First, mobile code can be transported to a remote platform and activated, in order to perform its tasks. Second, mobile agents also incorporate a state in addition to mobile code. Such a mobile state is needed to hold all information required to perform the coordination algorithm, which includes information on the applications the Shadow is interacting with, the mobile device location, the main Shadow information, the handOver flag and the queues of messages.
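The store-and-forward behaviour summarised above can be illustrated with the following Python sketch. The class `StoreAndForwardQueue` and the boolean-returning `send` callback are assumptions for illustration. Unlike the figures, which re-enqueue a failed message at the tail, this sketch keeps a failed message at the head so that delivery order is preserved.

```python
from collections import deque
from typing import Callable, Deque


class StoreAndForwardQueue:
    """Hold messages while the link is down and forward them once it is back."""

    def __init__(self, send: Callable[[str], bool]) -> None:
        self._send = send                      # returns False when delivery fails
        self._pending: Deque[str] = deque()

    def submit(self, msg: str) -> None:
        """Queue a message and try to deliver everything that is pending."""
        self._pending.append(msg)
        self.flush()

    def flush(self) -> int:
        """Send pending messages in order; stop at the first failure."""
        delivered = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()
            delivered += 1
        return delivered

    def pending(self) -> int:
        return len(self._pending)
```

In such a design, `flush` would be called again whenever the device reconnects, which corresponds to the queues being emptied once connectivity is re-established.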

5 Application Implementation

We have developed our application using the Southampton Framework for Agent Research (SoFAR) [10], which supports weak mobility. The algorithm of the abstraction layer is implemented by three agents, namely a Mobile Device Agent, a Shadow Manager Agent and a Shadow Agent. Our application is currently applicable to high-capability mobile machines such as laptops. We host a Mobile Device Agent on the mobile machine. The Shadow Agent is mobile, while the Mobile Device Agent and the Shadow Manager Agent are stationary. Although stationary on the hosting laptop, a Mobile Device Agent benefits from the physical mobility of its hosting environment. On top of the abstraction layer, we are prototyping the application described in Section 2. The Recommender system [11] is a stationary application located on the fixed infrastructure. The application interactions are illustrated in Figure 8. A mobile user uses a browser on a laptop to access information. The browser sidebar interacts directly with a User Application, which is also an agent running on the laptop. User’s


Fig. 8. Application Interactions (diagram not reproduced; components shown: Browser, Browser Sidebar, User Application and Mobile Device Agent on the mobile device; Shadow in the abstraction layer; User Agent, Recommender Interface Agent and Recommender System on the fixed infrastructure; messages shown: Request(CreateApplication), QueryRef(SimilarLinks), Inform(BookMarks))

requests are sent by the browser sidebar to the User Application. If no User Agent is started on the fixed infrastructure, the User Application requests the abstraction layer to start a User Agent. Then, the User Application is ready to route the user’s requests to the User Agent using the delivery mechanism we described in Section 4. For a request to get recommendations on related URLs or to get other users’ bookmarks, a “SimilarLinks” query is constructed. An inform “Bookmarks” message is created if the User Application receives a request to export a set of bookmarks to the infrastructure. These messages are then forwarded to the User Agent. For a “SimilarLinks” query, the User Application gets in return a set of URLs to similar documents or a set of other users’ bookmarks. These URLs or bookmarks are then forwarded to the browser sidebar to be displayed.

On the infrastructure, a User Agent queries a Registry Agent for a Recommender Interface Agent, with which it interacts in order to use the services offered by the Recommender system. Every “SimilarLinks” query or “Bookmarks” inform message received from the mobile device is forwarded to the Recommender Interface Agent. Results of “SimilarLinks” queries are returned to the mobile device. Interactions between a User Application on the mobile device and a User Agent on the fixed infrastructure are supported by our abstraction layer.

At this stage, we have not completed a formal evaluation of our architecture, but we are collecting observations about it. Given the application, we informally compared the use of a laptop with our abstraction layer and without it. In both cases, the laptop can interact with applications on the fixed infrastructure. In the second case, when the laptop is disconnected, messages may be lost and the resending of messages has to be programmed at the application level.
In the first case, the use of mobile agents and the transparent routing of messages to them [8] solved the problem of delivering messages to the laptop; alternatively, in the second case, full IPv6 may be necessary to route messages to the laptop


mobile address. Finally, by addressing the problem of the reliable delivery and of the routing of messages in an abstraction layer, we have designed a generic solution reusable by other applications having to support mobile users.
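The mapping from sidebar requests to agent messages described earlier in this section can be sketched in Python as follows. The function `to_agent_message`, the request names and the tuple encoding are hypothetical; only the “SimilarLinks” query, the “Bookmarks” inform message and the QueryRef and Inform labels of Fig. 8 come from the text.

```python
from typing import List, Tuple

AgentMessage = Tuple[str, str, List[str]]  # (performative, content type, URLs)


def to_agent_message(request: str, urls: List[str]) -> AgentMessage:
    """Translate a browser-sidebar request into a message for the User Agent."""
    if request in ("recommend", "other_bookmarks"):
        # both kinds of lookup are expressed as a SimilarLinks query
        return ("QueryRef", "SimilarLinks", urls)
    if request == "export":
        # exporting bookmarks to the meeting room is an inform message
        return ("Inform", "Bookmarks", urls)
    raise ValueError(f"unknown sidebar request: {request}")
```

In the architecture described here, the resulting message would then be handed to the abstraction layer for delivery rather than sent directly over the network.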

6 Related Work

In [6], mobile agents are used to move between resources on the fixed infrastructure to take advantage of those resources in order to accomplish tasks for a mobile user. An Agent Gateway, which is a stationary host, acts as the mediator between the wireless device and fixed infrastructure resources. This is different from our approach, as we adopt a mobile agent as the mediator. Having a mobile mediator is more flexible as it can move closer to the current location of the user, which allows local communication to be established.

The “Personal Agent System” [1] provides a mobile user with a personalised information retrieval service. The Personal Agent is a mobile agent that resides on the fixed infrastructure and communicates with agents residing on the mobile device. The system is similar to ours in the sense that it involves migration of the Personal Agent to other stationary servers so that it follows the mobile user around in the wired network, while the user moves around in the wireless network. In comparison, our abstraction layer provides more flexibility since we allow multiple mobile agents to exist when the user’s current local network is not connected with the user’s previous location.

The Mobile Agents Platform (MAP) architecture [2] involves data servers to store results acquired by a mobile agent for a mobile user once the mobile user is disconnected from the network. When reconnected, the user has to undergo multiple communication steps to get the result, such as querying the lookup table for the data server address and then querying the data server for the result. Our approach is simpler since we provide a store and forward mechanism built into the abstraction layer, which allows the results of the user’s queries to be forwarded to the mobile user once the mobile user is reconnected to the network.
In the M-Commerce Framework [7], a mobile agent called Service Agent moves around the wired network to gather information for a mobile user, while another mobile agent called Courier Agent migrates to the mobile device to establish an interaction with the Service Agent on the fixed infrastructure. Migrating the Courier Agent to the mobile device in order to interact with the Service Agent puts more burden on the network connection than a migration between two hosts on the fixed network, as it involves moving the agent state and code, which includes the serialisation and deserialisation of the transferred data through low-bandwidth wireless communication channels. In our abstraction layer, we adopt a simpler approach, where an application residing on the mobile device is responsible for interacting with applications on the fixed infrastructure.

In the Tacoma Architecture [4], support specific to PDA applications is provided using an entity called “hostel”, which is the host that a PDA normally uses to synchronise data with. The hostel is also assumed to act as the network


provider or proxy for the PDA, i.e. the hostel is a networked workstation. In this architecture, mobile agents are used to gather information on the wired network assuming the presence of a host that they can inquire in case the PDA is not connected. This approach is suitable for a PDA user that has the hostel as the only connection point needed for the PDA. But in the case of a user who is always on the move and needs to connect to different hosts, a more flexible approach such as having a mobile agent acting as the “hostel” is more suitable.

7 Conclusion

A mobile agent that is able to migrate around the network, trying to stay as close as possible to the mobile user, gives a major advantage by allowing local communication to be established with the mobile device. With this capability, the mobile agent is designed to be the main component in our abstraction layer, which allows transparent interactions between fixed infrastructure applications and applications on mobile devices. This paper has presented an application that supports information sharing between mobile users in virtual meeting rooms. The main challenge in developing the application is to construct an intermediary layer that supports seamless communication between a travelling mobile user and virtual meeting rooms hosted by the fixed infrastructure. We have introduced an architecture and algorithm for the intermediary layer, based on a mobile agent called Shadow. This intermediary layer takes care of the coordination of multiple Shadows, as well as the communication between a mobile device and its Shadows. It is defined as an abstraction layer, which hides the details of communication and coordination, allowing transparent interactions between fixed infrastructure applications and applications on a mobile device. Having the intermediary layer has made the implementation of the application straightforward, as the layer takes care of complex interactions with mobile devices. An agent-based Recommender system [11] is used in our application to provide an information sharing environment between User Agents, which are the mobile users’ representatives in the virtual meeting room. The User Agents interact transparently with a mobile device through the intermediary layer. We believe such an ability is important to allow more applications for mobile users to be easily developed.

Acknowledgement. This research is funded in part by QinetiQ and EPSRC Magnitude project (reference GR/N35816).

References

1. Debbie Chyi. An Infrastructure for a Mobile-Agent System that Provides Personalized Services to Mobile Devices. Technical Report TR2000-370, Dartmouth College Computer Science, 2000.
2. A. La Corte, A. Puliafito, and O. Tomarchio. An Agent-based Framework for Mobile Users. In Proceedings of European Research Seminar on Advances in Distributed Systems 1999, Madeira, Portugal, 1999.
3. Robert S. Gray, David Kotz, Ronald A. Peterson, Joyce Barton, Daria Chacon, Peter Gerken, Martin Hofmann, Jeffrey Bradshaw, Maggie R. Breedy, Renia Jeffers, and Niranjan Suri. Mobile-Agent versus Client/Server Performance: Scalability in an Information-Retrieval Task. In Mobile Agents, pages 229–243, 2001.
4. Kjetil Jacobsen and Dag Johansen. Mobile Software on Mobile Hardware – Experiences with TACOMA on PDAs. Technical Report 97-32, Department of Computer Science, University of Tromsø, Norway, 1997.
5. Danny B. Lange and Mitsuru Oshima. Programming and Deploying Java Mobile Agents with Aglets. Addison-Wesley, 1998.
6. Q.H. Mahmoud. MobiAgent – An Agent-based Approach to Wireless Information Systems. In Proceedings of the 3rd International Bi-Conference Workshop on Agent-Oriented Information Systems (AOIS-2001), Montreal, 2001.
7. Patrik Mihailescu and Walter Binder. A Mobile Agent Framework for M-Commerce. Computer Science 2001, GI/OCG annual Convention:2:959–967.
8. Luc Moreau. Distributed Directory Service and Message Router for Mobile Agents. Science of Computer Programming, 39(2–3):249–272, 2001.
9. Luc Moreau. A Fault-Tolerant Directory Service for Mobile Agents based on Forwarding Pointers. In The 17th ACM Symposium on Applied Computing (SAC'2002), Track on Agents, Interactions, Mobility and Systems, Madrid, March 2002.
10. Luc Moreau, Nick Gibbins, David DeRoure, Samhaa El-Beltagy, Wendy Hall, Gareth Hughes, Dan Joyce, Sanghee Kim, and Danius Michaelides. SoFAR with DIM Agents: An Agent Framework for Distributed Information Management. In Proceedings of the 5th International Conference on the Practical Application of Intelligent Agents and Multi-Agent Technology (PAAM 2000), pages 369–388, 2000.
11. Luc Moreau, Norliza Zaini, Jing Zhou, Nicholas R. Jennings, Yan Zheng Wei, Wendy Hall, David De Roure, Ian Gilchrist, Mark O'Dell, Sigi Reich, Tobias Berka, and Claudia Di Napoli. A Market-Based Recommender System. In Paolo Giorgini, Yves Lespérance, Gerd Wagner, and Eric Yu, editors, Proceedings of the Fourth International Bi-Conference Workshop on Agent-Oriented Information Systems at AAMAS 2002 (AOIS'02), Bologna, Italy, July 2002. http://CEUR-WS.org/Vol59/.
12. Mark Weiser. Some Computer Science Problems in Ubiquitous Computing. Communications of the ACM, 36(7):74–84, July 1993.

A Mobile Agent Enabled Fully Distributed Mutual Exclusion Algorithm

Jiannong Cao (1), Xianbing Wang (1,2), and Jie Wu (3)

(1) Internet and E-Commerce Lab, Dept. of Computing, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong {csjcao, csxbwang}@comp.polyu.edu.hk
(2) Computer Center, School of Computer, Wuhan University, Wuhan, Hubei, China
(3) Dept of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431-6498, USA [email protected]

Abstract. In this paper, we present a fully distributed algorithm using mobile agents to achieve mutual exclusion in a networking environment. The algorithm is designed within a framework for mobile agent enabled distributed server groups (MADSG), where cooperating mobile agents (CMA) are used to achieve coordination among the servers. When a node requests to execute in the critical section (CS), it dispatches a mobile agent to obtain permissions from other nodes. The agent travels across the nodes and exchanges information with them until it obtains enough permissions to decide its order to enter the CS. The algorithm is based on the well-known Majority Consensus Voting (MCV) scheme but, under heavy demand, an agent need not contact a majority of the nodes. We show that the proposed algorithm achieves mutual exclusion and is free from deadlock and starvation. We also present a performance analysis in terms of the number of agent migrations and the synchronization delay.

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 138–153, 2002. © Springer-Verlag Berlin Heidelberg 2002

1 Introduction

Mutual exclusion is one of the most fundamental problems in computing systems. It states that only a single process can be allowed access to a critical section (CS) at any time. A process that has the exclusive right to access the CS is said to hold the lock for the CS. When a process wants to access the CS, it first needs to obtain the lock so as to ensure that no other process will enter the CS at the same time. The problem of mutual exclusion becomes much more complex in distributed systems, and many algorithms have been proposed for distributed systems where nodes communicate by asynchronous message passing. Comprehensive surveys can be found in [4,19].

Mutual exclusion algorithms in distributed systems can be classified as either centralized or distributed. In a centralized algorithm [5], one node is dedicated to serving the requests for mutual exclusion from all nodes in the system. This approach has two shortcomings. First, the algorithm is vulnerable to failure or disconnection of the central node. Second, because two sequential messages are required to pass the lock from one node to another, the synchronization delay is as large as 2T (where T is the average delay of passing a message between two nodes).

Distributed mutual exclusion algorithms can solve the first problem. For example, in quorum-based algorithms [1,7,9,15,16], a node wishing to enter the CS asks a set of other nodes, called a quorum [6], to grant it their permissions to proceed. It then waits until the permissions have been received. However, many of these algorithms still require two sequential messages for the lock to be passed from one node to another, so the synchronization delay remains 2T. The second problem is addressed by some token-based algorithms [10,11,12,14,17,18,20]. In a token-based algorithm, a unique token is shared among the nodes. A node is allowed to enter the CS only if it possesses the token. When a node releases the CS, it passes the token directly to the next node using only one message, reducing the synchronization delay to T.

Another performance measure is the message complexity of a mutual exclusion algorithm, which is measured in terms of the number of messages exchanged per access to the CS. For a system with N processes, competitive algorithms have a message complexity between log(N) and 3(N-1) messages per access to the CS, depending on their features [27].

In this paper, we describe a novel, distributed and dynamic mutual exclusion algorithm, called MADME (Mobile Agent Enabled Distributed Mutual Exclusion), which uses cooperating mobile agents as an aid to achieve distributed mutual exclusion in a networking environment. Mobile agents (MAs) are programs that can autonomously halt execution on a host, travel across the network, and continue execution at another host [8, 13]. Cooperating mobile agents (CMAs) are a collection of mobile agents which come together for the purpose of exchanging information or in order to engage in cooperative task-oriented behaviors [3, 23, 24].
In our previous work, we have proposed a framework, called MADSG (Mobile Agent enabled Distributed Server Groups), which allows us to develop a novel system architecture for distributed server groups by having CMAs carry out coordination tasks for the cooperating servers [2, 3, 24, 25]. In traditional distributed client/server systems using the message passing based approach, the coordination code often has to be integrated into the server service model itself. This mixes the functionality of providing services with the service independent operations for maintaining consistency and ensuring performance. Using mobile agents allows us to provide clear and useful abstractions through the separation of different concerns. The server site functionality can be separated from the operations of maintaining the logical relationship between group members and providing the desired level of performance, which are realized by a collection of autonomous, cooperating mobile agents. This helps reduce the complexity of, and increase the flexibility in, implementing the servers. Also, because mobile agent technology is innately suitable for mobile computing [26], the framework could also support cooperation among a group of disconnected, mobile users. MADSG provides a general, flexible framework in which distributed control functions, such as checkpointing, load sharing, replication, etc., for a wide range of distributed applications can be achieved by using CMAs [2, 3, 24, 25]. Cooperating mobile agents encapsulate policies and algorithms for their interaction and coordination in order to implement various distributed control functions.

The MADME distributed mutual exclusion algorithm presented in this paper is designed within the MADSG framework. In MADME, mobile agents that carry requests from the server nodes travel across the network and cooperate with each other by exchanging information and deciding which nodes obtain the permission to enter the CS, and in which order. The MADME algorithm is a fully distributed algorithm based on the well-known Majority Consensus Voting (MCV) scheme [21]. It requires 1 to N (N is the number of nodes in the system) migrations for an MA to determine the order in which it enters the CS, and the synchronization delay is T.

The rest of this paper is organized as follows. Section 2 introduces the system model and data structures used in the MADME algorithm. Section 3 describes the design of the MADME algorithm. Section 4 presents the correctness arguments for MADME with respect to guaranteed mutual exclusion, deadlock freedom and starvation freedom. In Section 5, we analyze the performance of the algorithm. The final section concludes the paper.

2 System Model and Data Structure

2.1 System Model

Figure 1 illustrates the underlying system model of the MADME. A distributed server group consists of N server nodes, which are connected by a communication network. The underlying network is assumed to be logically fully connected, reliable, and to deliver messages in unpredictable but bounded time. At any time, each node initiates at most one outstanding request for mutual exclusion. Any node executing inside the CS exits it in finite time.

Fig. 1. System model of the MADME

Each node in the system is associated with a mobile agent server, which is responsible for creating, executing and destroying mobile agents, and for maintaining system information. Unlike the traditional message passing paradigm, the code for mutual exclusion is carried out by mobile agents which move around in the network. Relevant state information is stored at each node and each agent. Whenever an agent arrives at a node, it synchronizes the information it carries with the information stored at the node. More specifically, when a node needs to enter the CS, it will dispatch an agent to travel across other nodes to obtain permissions. Each node maintains an agent list (AL), which contains the identifiers of all the agents that have visited the node. When an agent reaches a node, it registers its request for the lock of the CS with the node by appending its identifier to the node's AL. The mobile agent also exchanges with the node the information it collected from the nodes on its migration path. We assume that, at a node, the visiting agents are served in a FIFO manner.

If an agent obtains enough information to determine its rank among all the agents, we say that the agent is ordered. The rank of an agent is defined by two parameters: the number of ALs in which the agent's identifier is placed on the top and the ID of its home node. The latter is used to resolve any tie. An ordered agent knows the order for it to enter the CS and will stop migration. If the agent finds it has the highest rank, it immediately returns home to inform its home node to enter the CS. Otherwise, it will move to its immediate preceding agent's home node and inform that node that it will enter the CS next. Then the agent will return home and wait for a RELEASE message from that node. If the agent is not yet ordered, it will continue its journey and choose a new node to visit.

2.2 Data Structures

Each node Ni (1 ≤ i ≤ N) is assigned a unique identifier i which is an integer ranging from 1 to N. Node Ni maintains the following three data structures, as illustrated in Figure 2.

Fig. 2. The data structure of a node

Fig. 3. The data structure of a mobile agent

Node System-Information Table (NSITi): it consists of N rows, one for each node (including Ni itself). Each row stores the information about a node known to Ni, including the ID, the timestamp TS, and the agent list AL of that node. AL stores the identifiers of the mobile agents which have visited the node. TS represents how up-to-date the information about the node is. Since the status of the node's information is updated whenever an agent visits the node or a new agent is created by the node, TS is implemented as a counter recording the number of agents that have visited the node or have been created by the node.

Nexti: it stores the ID of the node which will enter the CS immediately following Ni. When Ni exits the CS, it will send a RELEASE message to the node Nexti to inform it that it can enter the CS.

Node Ordered-Agent List (NOALi): it stores the identifiers of the mobile agents which have already been ordered.

When a mobile agent is created by a node, it is assigned a unique identifier consisting of the ID and the current value of TS of the creating node. The ID of an agent is defined by the tuple <NodeID, TS>, but for simplicity, hereafter, it is written as MANodeID^TS. Each mobile agent MAi^t maintains the following three data structures, as illustrated in Figure 3.

Agent System-Information Table (ASITi): it has the same structure as NSITi but is carried by the agent MAi^t.

Unvisited Node List (ULi): it stores a list of the IDs of the nodes which have not yet been visited by the agent MAi^t.

Agent Ordered-Agent List (AOALi): it has the same structure as NOALi but is carried by the agent MAi^t.
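The node and agent data structures described in this section can be sketched as plain records. The following is a minimal illustration, not the authors' implementation: the class and field names (`Row`, `NodeState`, `AgentState`, `create_agent`) are hypothetical, and the choice to copy the tables into the agent only after the new agent has been registered at its home node is an assumption.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# An agent identifier is the pair <NodeID, TS> of its creating node.
AgentId = Tuple[int, int]


@dataclass
class Row:
    """One row of a system-information table: what is known about one node."""
    ts: int = 0                                      # TS: freshness counter
    al: List[AgentId] = field(default_factory=list)  # AL: agents in arrival order


@dataclass
class NodeState:
    node_id: int
    n: int                                             # number of nodes N
    next_node: int = 0                                 # Next_i (0 means nobody is waiting)
    noal: List[AgentId] = field(default_factory=list)  # NOAL_i
    nsit: Dict[int, Row] = field(init=False)           # NSIT_i, one row per node

    def __post_init__(self) -> None:
        self.nsit = {k: Row() for k in range(1, self.n + 1)}


@dataclass
class AgentState:
    agent_id: AgentId
    asit: Dict[int, Row]   # ASIT_i, same shape as NSIT_i
    ul: Set[int]           # UL_i, nodes not yet visited
    aoal: List[AgentId]    # AOAL_i, same shape as NOAL_i


def create_agent(node: NodeState) -> AgentState:
    """Create an agent at its home node and hand it copies of the node's tables."""
    home = node.nsit[node.node_id]
    agent_id = (node.node_id, home.ts)   # identifier <NodeID, TS>
    home.al.append(agent_id)             # the home node registers its own agent
    home.ts += 1
    return AgentState(
        agent_id=agent_id,
        asit=copy.deepcopy(node.nsit),   # assumption: copied after registration
        ul=set(range(1, node.n + 1)) - {node.node_id},
        aoal=list(node.noal),
    )
```

Keeping the agent's tables as deep copies reflects the fact that the agent carries its own view of the system, which is only reconciled with a node's view when the two meet.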

3 The MADME Algorithm

The proposed algorithm consists of two parts. One part (Algorithm 1) is performed by a node. The other part (Algorithm 2) is performed by a mobile agent, which travels across the nodes in the system and exchanges information (Exchange Procedure) with those nodes, either making a decision (Order Procedure) when it obtains enough permissions to enter the CS, or choosing a new node to visit next.

3.1 Operations Performed by a Node Ni

Each node Ni executes Algorithm 1. Ni first initialises its data structures (lines 2-7). Initially, for each entry in NSITi, Ni sets the TS field to 0 and the AL field to null. It also sets Nexti to 0 and NOALi to null.

When Ni requests to enter the CS, it creates a mobile agent and assigns it an identifier. Ni also appends the ID of the agent to NSITi[i].AL and increases NSITi[i].TS (lines 9-11).

When Ni receives an agent, it first exchanges information with the agent by updating its data structures as requested by the agent. It then appends the ID of the visiting agent to NSITi[i].AL and increases NSITi[i].TS. Finally, it updates its agent-ordering information as requested by the visiting agent if the agent has determined the order for some agents to access the CS (lines 13-17).

When Ni receives the Enter message from its agent MAi^t, it deletes all agents which precede MAi^t from NOALi and enters the CS (lines 19-20).

After Ni finishes the CS, it sends a RELEASE message to the node represented by the variable Nexti. It then deletes its mobile agent MAi^t from NOALi (lines 21-26).

Algorithm 1: Operations performed by a node Ni
 1. Initialization:
 2.   for k = 1 to N do
 3.     NSITi[k].TS = 0;
 4.     NSITi[k].AL = null;
 5.   end for;
 6.   NOALi = null;
 7.   Nexti = 0;
 8. Upon requesting the CS:
 9.   Create a mobile agent MAi^t and assign it an identifier <i, NSITi[i].TS>;
10.   Append ID of MAi^t to NSITi[i].AL;
11.   NSITi[i].TS = NSITi[i].TS + 1;
12. Upon arrival of a mobile agent M:
13.   Update data structures as requested by M in the Exchange Procedure;
14.   As requested by M for registering:
15.     Append ID of M to NSITi[i].AL;
16.     NSITi[i].TS = NSITi[i].TS + 1;
17.   Update data structures as requested by M in the Order Procedure;
18. Upon receiving the Enter signal from its agent MAi^t:
19.   Delete IDs of agents which precede the agent MAi^t from NOALi;
20.   Enter the CS;
21. Upon releasing the CS:
22.   if Nexti ≠ 0 then
23.     send a RELEASE message to Nexti;
24.     Nexti = 0;
25.   end if
26.   Delete the agent MAi^t from NOALi;
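The event handlers of Algorithm 1 can be sketched as methods on a small class. This is an illustrative sketch only: the class `Node`, its method names, and the `send_release` callback standing in for the network are hypothetical, only the node's own NSIT row is modelled, and the Exchange and Order Procedure updates (lines 13 and 17) are left out.

```python
class Node:
    """Sketch of the per-node handlers of Algorithm 1 (own NSIT row only)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.ts = 0          # NSIT_i[i].TS
        self.al = []         # NSIT_i[i].AL, agents in arrival order
        self.noal = []       # NOAL_i, ordered agents, highest rank first
        self.next_node = 0   # Next_i, 0 means no successor is known
        self.in_cs = False

    def request_cs(self):
        """Lines 9-11: create an agent identifier and register it locally."""
        agent_id = (self.node_id, self.ts)
        self.al.append(agent_id)
        self.ts += 1
        return agent_id

    def register(self, agent_id):
        """Lines 15-16: a visiting agent registers its request."""
        self.al.append(agent_id)
        self.ts += 1

    def enter_cs(self, agent_id):
        """Lines 19-20: drop the agents that precede ours, then enter the CS."""
        if agent_id in self.noal:
            self.noal = self.noal[self.noal.index(agent_id):]
        self.in_cs = True

    def release_cs(self, agent_id, send_release):
        """Lines 22-26: notify the successor, if any, and forget our agent."""
        self.in_cs = False
        if self.next_node != 0:
            send_release(self.next_node)
            self.next_node = 0
        if agent_id in self.noal:
            self.noal.remove(agent_id)
```

Passing the transport in as a callback keeps the sketch independent of any particular messaging layer; in a real deployment the RELEASE message would travel over the network.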

3.2 Operations Performed by a Mobile Agent MAi^t

After being created, agent MAi^t obtains system information, including NSITi and NOALi, from its creating node Ni (Algorithm 2, lines 2-6). After being dispatched, the agent travels over the nodes and executes the rest of Algorithm 2 autonomously until it obtains the permission to enter the CS (lines 7-35). The two boolean variables BeOrdered and Highest_Priority are used to control the execution of the algorithm. If the agent has determined its order to enter the CS, BeOrdered is set to True. If the agent is on the top of AOALi, Highest_Priority is set to True and the agent migrates home to enter the CS immediately.

On visiting a node, the agent executes the Exchange Procedure (Section 3.2.2) to exchange information with the node and the Order Procedure (Section 3.2.1) to calculate its order. Depending on the output, three cases can result. In case 1, the agent is ordered with the highest priority (on top of AOALi). It will migrate home and enter the CS immediately. In case 2, the agent is ordered but not on top of AOALi. It will migrate to the immediate preceding agent's home node. In case 3, the agent cannot yet be ordered, so it will choose a node from ULi to visit next.

Case 1. In this case, the agent has obtained the Highest_Priority, so it will migrate home, exchange information with the home node and inform it to enter the CS immediately (Algorithm 2, lines 28-35).


Algorithm 2: Operations performed by a mobile agent MAi^t
 1. Initialization:
 2.   ULi = {1..N} - {i};
 3.   ASITi = NSITi;
 4.   AOALi = NOALi;
 5.   BeOrdered = False;
 6.   Highest_Priority = False;
 7. while BeOrdered ≠ True do
 8. begin
 9.   Randomly choose a node Nj in ULi to visit next;
10.   Delete node Nj from ULi;
11.   Migrate to node Nj;
12.   On arrival at the node Nj:
13.     Call Exchange Procedure to update NSITj and ASITi;
14.     Register to node Nj;
15.     Append itself to ASITi[j].AL;
16.     ASITi[j].TS = NSITj[j].TS;
17.     Call Order Procedure to calculate the order using ASITi;
18. end;
19. if not Highest_Priority then
20. begin
21.   migrate to Nk = AOALi[length(AOALi)-1].NodeID;
22.   On arrival at the node Nk:
23.     if AOALi[length(AOALi)-1] is not in NSITk and NOALk then
24.       Highest_Priority = True;
25.     else Nextk = AOALi[length(AOALi)].NodeID;
26.     Call Exchange Procedure to update NSITk and ASITi;
27. end;
28. migrate to home Ni;
29. On arrival at home Ni:
30.   Call Exchange Procedure to update NSITi;
31.   if not Highest_Priority then
32.     wait for the RELEASE message from the preceding priority agent;
33.     delete agents which precede this agent from AOALi;
34.   end if
35.   Send Enter signal to the node Ni to let it enter the CS;
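The way an agent works through its unvisited-node list (Algorithm 2, lines 2 and 9-10) can be illustrated in a few lines. This is a sketch under simple assumptions: the function names are hypothetical, and the random generator is passed in explicitly so that a run can be reproduced.

```python
import random


def unvisited_nodes(home, n):
    """Line 2: UL_i starts as every node ID except the agent's home node."""
    return set(range(1, n + 1)) - {home}


def pick_next(ul, rng):
    """Lines 9-10: choose a random unvisited node and remove it from UL_i."""
    nxt = rng.choice(sorted(ul))   # sorted() only makes the draw reproducible
    ul.remove(nxt)
    return nxt
```

Because each chosen node is removed from the list, an agent created in a system of N nodes can make at most N-1 such choices before the list is empty.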

Case 2. Because the agent is not on top of its AOALi, according to the Order Procedure it must be at the end of AOALi, and there must be another agent with a higher rank in AOALi. The immediate preceding agent is AOALi[length(AOALi)-1], and it will enter the CS before the current agent. The current agent therefore migrates to Nk = AOALi[length(AOALi)-1].NodeID (line 21). After arriving at node Nk, if the agent AOALi[length(AOALi)-1] cannot be found in NSITk or NOALk, which means that node Nk has finished the CS and the agent AOALi[length(AOALi)-1] has been deleted, then the current agent can enter the CS immediately (lines 23-24). Otherwise, the agent sets Nextk to AOALi[length(AOALi)].NodeID (line 25). Before migrating home, the agent exchanges its information with node Nk (line 26). After arriving home, the agent exchanges information with the home node and waits for the RELEASE message from the immediate preceding agent (lines 30-32). After it receives the message, it deletes all preceding agents from AOALi and then informs the home node to enter the CS (lines 33-35). When a node Ni releases the CS, it will send a RELEASE message to node Nexti, if any.

Case 3. Because the agent could not determine its order to enter the CS, it will randomly choose a node from ULi to migrate to (line 9). The node chosen will be deleted from ULi. When a mobile agent reaches the new node, it executes the Exchange Procedure to exchange information with the node, registers itself in the node's AL and updates the data structures of the agent and the node (lines 13-16). The TS of the node will be increased because the state has changed. Then the agent calls the Order Procedure to calculate its rank (line 17).

3.2.1 The Order Procedure

In this procedure, agent MAi^t arrives at node Nj and wants to determine whether it can be ordered. The boolean variable BeOrdered is set to TRUE if the agent has determined its rank for entering the CS. The rank of an agent is defined in terms of the number of ALs in which the agent's identifier is placed on the top. When more than one agent reaches the top in the same maximum number of ALs, the agent with the minimum NodeID will be assigned the highest rank (lines 5-7).

Order Procedure (on arrival at Nj)
 1. Continue = true;
 2. while Continue = true do
 3. begin
 4.   // Calculate ASITi;
 5-7. Find M (M ≤ N) mobile agents {Ah | 1 ≤ h ≤ M}, where Ah reaches the top of Sh ALs, and ∀k,l (1 ≤ k, l ≤ M) if (k < [...]

[...]

3.2.2 The Exchange Procedure

[...]

 5. if Length(AOALi) > Length(NOALj) then
 6.   for each agent A in AOALi and not in NOALj
 7.     Request node Nj to delete A from NSITj;
 8.   Request node Nj to set NOALj = AOALi;
 9. else
10.   for each agent A in NOALj and not in AOALi
11.     delete A from ASITi;
12.   AOALi = NOALj;
13. end if
14. for k = 1 to N do
15.   if NSITj[k].TS ≠ ASITi[k].TS then
16.     for each agent A in NSITj[k].AL do
17.       if (A.NodeID = k) and (A ∉ ASITi[k].AL) and (NSITj[k].TS < ASITi[k].TS) then
18.         Request node Nj to delete A from NSITj;
19.     for each agent B in ASITi[k].AL do
20.       if (B.NodeID = k) and (B ∉ NSITj[k].AL) and (ASITi[k].TS < NSITj[k].TS) then
21.         delete B from ASITi;
22.     if NSITj[k].TS < ASITi[k].TS then
23.       Request node Nj to set NSITj[k] = ASITi[k];
24.     else ASITi[k] = NSITj[k];
25.   end if
26. end for
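The ranking rule used by the Order Procedure of Section 3.2.1 (count the ALs in which an agent is on top, and break ties by the smaller home node ID) can be sketched as follows. This covers only the ranking rule, not the decision of when an agent may be declared ordered. The function name `rank_agents` and the table layout are hypothetical, and treating the first element of an AL as its top is an assumption based on the FIFO service order.

```python
from collections import Counter


def rank_agents(als):
    """Rank agents by how many ALs they top; ties go to the smaller home node ID.

    als maps a node ID to its agent list, earliest arrival first.
    Agents are (home_node_id, ts) pairs. Only agents that top at least
    one AL appear in the result, highest rank first.
    """
    tops = Counter(al[0] for al in als.values() if al)
    return sorted(tops, key=lambda agent: (-tops[agent], agent[0]))
```

For instance, with three non-empty ALs in which agent (2, 0) is first twice and agent (1, 0) is first once, (2, 0) is ranked ahead of (1, 0) even though its home node ID is larger; the node ID matters only when the counts are equal.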

First, the information in NOALj and AOALi is synchronized: outdated ordered agents are deleted from NOALj (lines 1-2) and AOALi (lines 3-4). According to lines 1-2, if an agent A is in AOALi but not in NOALj and NSITj[A.NodeID].AL, MAi^t knows that A has been ordered. However, from the point of view of the current node Nj, it may be the case that A has finished executing the CS or that A has never visited Nj. The two cases can be distinguished by the difference between the timestamps of A's home node maintained by the agent MAi^t and by the current node. If the timestamp of A's home node maintained by MAi^t is smaller than that of the current node, A must be an outdated agent and can be removed. In Algorithm 2 (lines 32-33), when an agent receives its RELEASE message, it knows that all the ordered agents preceding it have finished executing the CS and can be safely deleted from its AOAL. So, in the Exchange Procedure, agents that precede agent A can also be deleted (lines 2, 4). Otherwise, A will be added to NOALj in the rest of the procedure.

Next, NOALj and AOALi are combined if needed (lines 5-13), and relevant agents are deleted from NSITj (lines 6-7) or ASITi (lines 10-11).

Last, NSITj and ASITi are synchronized (lines 14-26). If the TS of a node Nk maintained by the agent and by the current node is the same, i.e., NSITj[k].TS = ASITi[k].TS, then no information needs to be exchanged because the agent and the current node maintain the same state of node Nk. Otherwise, an information update is needed. First, outdated agents, if any, are deleted from NSITj and ASITi (lines 16-21). Then NSITj[k] and ASITi[k] are synchronized with the one that has the bigger TS (lines 22-24).
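The timestamp-based synchronization of lines 14-26 of the Exchange Procedure can be sketched as below. This is a simplified illustration, not the authors' code: the tables are plain dictionaries, the function names `purge` and `synchronize` are hypothetical, and the node-side updates, which the paper describes as requests from the agent to the node, are applied directly.

```python
def purge(table, agent):
    """Delete an agent from every AL of a system-information table."""
    for row in table.values():
        if agent in row["al"]:
            row["al"].remove(agent)


def synchronize(nsit, asit):
    """Reconcile the node's NSIT with the agent's ASIT, row by row.

    Both tables map a node ID k to {"ts": int, "al": [agent, ...]},
    where an agent is a (home_node_id, ts) pair.
    """
    for k in nsit:
        if nsit[k]["ts"] == asit[k]["ts"]:
            continue                      # same state of node k: nothing to do
        if nsit[k]["ts"] < asit[k]["ts"]:
            older, newer = nsit, asit
        else:
            older, newer = asit, nsit
        # Lines 16-21: an agent created by node k that the fresher row no
        # longer lists is outdated and is removed from the staler table.
        stale = [a for a in older[k]["al"]
                 if a[0] == k and a not in newer[k]["al"]]
        for a in stale:
            purge(older, a)
        # Lines 22-24: the row with the bigger TS wins.
        older[k] = {"ts": newer[k]["ts"], "al": list(newer[k]["al"])}
```

After the call, both tables hold the same row for every node, which is the property the subsequent ordering step relies on.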

4 Correctness Proof

In this section, we argue the correctness of the MADME algorithm by demonstrating that it ensures mutual exclusion, deadlock freedom and starvation freedom.

Lemma 1. For each AL, |AL| ≤ N.
Proof. The distributed system consists of N nodes. Each node contains only one process that requests mutually exclusive access to the CS. At any time, each process initiates at most one outstanding request for mutual exclusion. So, a node cannot create a new agent until the node releases the CS and deletes the old agent. A node's AL stores the identifiers of agents that have visited it or have been created by the node. The Exchange Procedure ensures that an AL does not contain two different agents created by one node. When an agent MAi^t visits a node Nj, if there exist two different agents created by one node Nk in NSITj or ASITi, one agent created by Nk must be outdated and will be deleted in the Exchange Procedure. So there will be at most N agents contained in an AL.

Lemma 2. Agent MAi^t knows there must exist an agent which can be ordered if every AL in its ASITi is nonempty.
Proof. During the execution of the Order Procedure, MAi^t can obtain a sequence of M (M ≤ N) mobile agents, Aj (1 ≤ j ≤ M), which are ordered according to the number of ALs where they reached the top and their creating node IDs. If every AL in its ASITi is nonempty, the first agent A1 in the sequence is ensured to have the highest rank for entering the CS. Therefore, at least one agent in the sequence can be ordered.

Lemma 3. An agent MAi^t can determine its order to enter the CS with not more than N-1 migrations.
Proof. On its (N-1)th migration, the agent MAi^t has already traveled to every node. Therefore, it must be in each AL of its ASITi. This means that every AL in ASITi is nonempty. By Lemma 2, at least one agent can be ordered, and it is appended to AOALi and deleted from ASITi. If the ordered agent is not MAi^t, then MAi^t is still in each AL of ASITi, so by Lemma 2 and the Order Procedure another agent will be ordered and deleted from ASITi. By Lemma 1, there will be at most N agents contained in each AL, which means that after a finite number of steps MAi^t will be ordered.

Lemma 4. When an agent A is able to determine its order, if there is any agent which precedes A and has not finished executing the CS, A must know of the existence of such an agent.
Proof. When an agent B preceding A, if any, was ordered, according to the Order Procedure, B must have been ranked the highest among the M (M ≤ N) mobile agents competing for entering the CS at that time. Now assume that agent A is able to determine its order while B has not finished executing the CS. If A does not know that B has been ordered, then the rank that A can achieve cannot be higher than that of B. Consequently, A cannot determine its own order, because it needs to wait until it knows that B is ordered.

Lemma 5. Two different agents cannot achieve the same rank.
Proof. This is straightforward. By definition, an agent is ranked according to the number of ALs in which the agent's identifier is placed on the top and the ID of its home node. Two agents thus cannot achieve the same rank.

Lemma 6. In the Exchange Procedure, after outdated ordered agents are deleted, we have AOALi ⊆ NOALj or NOALj ⊆ AOALi, and the agents on the top of the two lists are the same if both are nonempty.
Proof. Assume the contrary: neither AOALi ⊆ NOALj nor NOALj ⊆ AOALi is true. Then |NOALj| > 0 and |AOALi| > 0, there exists an agent A ∈ AOALi with A ∉ NOALj, and there exists an agent B ∈ NOALj with B ∉ AOALi. We assume AOALi = {Ak} (1 ≤ k ≤ M) and NOALj = {Bk} (1 ≤ k ≤ M').
First, we consider NOALj ∩ AOALi = ∅. According to the Order Procedure, Ak1 is ranked higher than Ak2 if k1 < k2, and Bk'1 is ranked higher than Bk'2 if k'1 < k'2. By Lemma 5, two agents cannot achieve the same order. Because NOALj ∩ AOALi = ∅, by Lemma 4, AM precedes all agents in NOALj or BM' precedes all agents in AOALi. But in the first case, B1 must know that AM is ordered before it is ordered itself, so AM should exist in NOALj. In the second case, A1 must know that BM' is ordered before it is ordered itself, so BM' should exist in AOALi. This contradicts NOALj ∩ AOALi = ∅.
Second, we consider NOALj ∩ AOALi ≠ ∅. We consider the minimum k where Ak ≠ Bk. If k > 1, then Ak-1 = Bk-1, and Ak and Bk achieve the same rank, which is contrary to Lemma 5. Otherwise, k = 1, and there will exist a minimum k' with Ak' = Bk' and k' > 1, so that Ak'-1 ≠ Bk'-1. According to the proposed algorithm, both Ak'-1 and Bk'-1 precede Ak', and by Lemma 5, Ak'-1 and Bk'-1 cannot achieve the same rank. By Lemma 4, if Ak'-1 precedes Bk'-1, Bk'-1 will be appended to AOALi after Ak'-1 but before Ak'. And if Bk'-1 precedes Ak'-1, Ak'-1 will be appended to NOALj after Bk'-1 but before Bk'. This is contrary to the assumption. In each case, there is a contradiction.

Lemma 7. Agents in two different NOALs are ranked in the same order.
Proof. We assume an agent MAi^t migrates from node Nj to Nk with AOALi = NOALj and ASITi = NSITj. We denote AOALi and NOALk, after outdated ordered agents have been deleted, as AOALi' and NOALk'. By Lemma 6, after outdated ordered agents have been deleted, AOALi' ⊆ NOALk' or NOALk' ⊆ AOALi', and the agents on the top of the two lists are the same if both are nonempty. This means that agents in AOALi and NOALk are ranked in the same order, except for outdated ordered agents.
Without loss of generality, assume AOALi' ⊆ NOALk' and that the agents on the top of the two lists are the same if both are nonempty. Now, we want to prove that an outdated ordered agent, before it is deleted, is ordered the same in AOALi and NOALk. According to the definition of an outdated ordered agent, such an agent precedes all the ordered agents that are not outdated, and no such agent is in both NOALk and AOALi.
If both AOALi' and NOALk' are nonempty, we only need to prove that outdated agents must all exist in NOALk or in AOALi, but not both. Assume the contrary: there is an outdated ordered agent A ∈ AOALi with A ∉ NOALk and another outdated ordered agent B ∈ NOALk with B ∉ AOALi. By Lemma 5, agents A and B cannot be ranked the same. But by Lemma 4, if agent B precedes A, and A precedes all agents in NOALk', then A will be appended to NOALk after B and before all agents in NOALk'. Otherwise, agent A precedes B, and B will be appended to AOALi after A and before all agents in AOALi'. Both cases result in a contradiction.
If AOALi' is empty, we only need to prove that any outdated agent in AOALi precedes all agents in NOALk. Assume the contrary: there is an outdated ordered agent A ∈ AOALi with A ∉ NOALk and another outdated ordered agent B ∈ NOALk with B ∉ AOALi, and B precedes A. By Lemma 4, A will be appended to NOALk after B and before all agents in NOALk'. This contradicts A ∉ NOALk. Thus, agents in two different NOALs are ranked in the same order.

Lemma 8. When an agent MAi ’s home node Ni enters the CS, the agent must be on the top of both AOALi and NOALi. t Proof. According to Algorithm 2, there are only two ways for agent MAi ’s home t node Ni to enter the CS. One way is MAi determines its order to enter the CS and is on the top of AOALi, then gets the Highest_Priority and migrates home to enter the CS immediately. When the agent comes back, it gets the Highest_Priority, this means its t AOALi just has one agent. So, by Lemma 6, after exchange procedure, MAi is also on top of NOALi. t In another way, MAi determines its order to enter the CS, but not on the top of AOALi, then migrates to the immediate preceding agent’s home node Nk changing the NEXTk, and migrates home exchanging information with home node and waiting for RELEASE message from Nk. When it receives RELEASE message, it deletes all agents which precede it from AOALi, then it will be on the top of AOALi, and send Enter signal to home node Ni. When Ni receives the Enter signal, it will delete all t agents which precede MAi from NOALi, then it will be also on the top of NOALi. Theorem 1. Mutual exclusion is achieved. Proof. Mutual Exclusion is achieved when no pair of nodes is ever simultaneously in its critical section. For any pair of nodes, one must leave its critical section before other may enter. Assume the contrary, at some time two nodes (Ni and Nj) are simultaneously in their critical section, and the corresponding agents are A on node Ni and B on node Nj. According to Lemma 8, agent A must be on top of NOALi and agent B must be on top of NOALj. t Now, an agent MAk migrates from node Ni to Nj and AOALk = NOALi and ASITk = NSITi. Because both Ni and Nj are simultaneously in their critical section, both A and B are not outdated agents and can’t be deleted as in Exchange Procedure. It is contrary to Lemma 6 and Lemma 7. Theorem 2. Deadlock is impossible.


Proof. The system is deadlocked when no node is in its critical section and no requesting node can ever proceed to its own critical section. Assume the contrary, that deadlock is possible. In our algorithm, a deadlock can arise in only two cases. In the first case, no agent can determine its order to enter the CS. This contradicts Lemma 2 and Lemma 3, because every agent can determine its order to enter the CS after no more than N-1 migrations. In the second case, there exist three agents A, B and C such that A is waiting for the RELEASE message from B directly or indirectly, B is waiting for the RELEASE message from C directly or indirectly, and C is waiting for the RELEASE message from A directly or indirectly. Then A precedes C, C precedes B and B precedes A, which contradicts Lemma 7.

Theorem 3. Starvation is impossible.

Proof. Starvation occurs when one node must wait indefinitely to enter its critical section even though other nodes are entering and exiting their own critical sections. Assume the contrary, that starvation is possible. In our algorithm, an agent's migration and execution time, the time a process spends in the CS, and the message passing time are all finite. So if an agent is starved, it must either be waiting indefinitely for a message from a preceding agent or be migrating among nodes continuously. But by Lemma 7, the number of ordered agents that precede an agent is determined as soon as the agent gets ordered. So, in the first case the agent will receive the RELEASE message and enter the CS in finite time. The second case means that the agent cannot determine its order to enter the CS, which contradicts Lemma 3. Thus the theorem holds.
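The second case of the deadlock proof can be illustrated with a small sketch. It assumes, as a simplification that is not taken from Algorithm 2, that every ordered agent waits only for the agent immediately preceding it in one agreed order; the function names are hypothetical. Under that assumption the wait-for relation is a chain and cannot close into a cycle, whereas the three-agent circular wait from the proof is detected as a cycle.

```python
def wait_for_edges(order):
    """Map each agent to the agent it waits for: its immediate predecessor in the order."""
    return {later: earlier for earlier, later in zip(order, order[1:])}


def has_cycle(waits_for):
    """Follow each chain of 'waits for' links; reaching an agent twice means a cycle."""
    for start in waits_for:
        seen = {start}
        current = waits_for.get(start)
        while current is not None:
            if current in seen:
                return True
            seen.add(current)
            current = waits_for.get(current)
    return False


# A single agreed order (the situation Lemma 7 guarantees) yields a chain.
consistent = wait_for_edges(["A", "B", "C"])

# The circular wait assumed in the proof: A waits for B, B for C, C for A.
circular = {"A": "B", "B": "C", "C": "A"}
```

Here `has_cycle(consistent)` is false and `has_cycle(circular)` is true, which mirrors why a circular wait is incompatible with a single ranking of the agents.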

5 Performance

We evaluate the performance of distributed mutual exclusion algorithms using the following three metrics: (1) the number of migrations required for an agent to enter the CS; (2) synchronization delay, which is the number of sequential message exchanges required after a node leaves the CS and before the next node enters the CS; (3) response time, which is the time interval a request waits to execute the CS after its agent is created. The performance depends upon the loading conditions of the system and has been studied under two special loading conditions, light load and heavy load. Under the light load condition, there is seldom more than one request for mutual exclusion simultaneously in the system. Under the heavy load condition, there is more than one request for mutual exclusion simultaneously in the system.

5.1 Number of Agent Migrations

We discuss the two cases separately.

5.1.1 Under Light Load Condition

When the demand is light and contention rarely occurs, one agent of distributed mutual exclusion requires migrations to determine its order to enter the CS. After the
agent determines its order, it needs one migration to the preceding node if it is not itself on the top of AOAL, and one migration to return home. When the distributed system is initiated, there is no outdated information and the NSITs of all nodes are null. When an agent is created to travel across nodes, it will be on top of each AL. If it is created by node 1, after N/2+(N mod 2)-1 migrations it will find itself ordered with Highest_Priority and migrate home to enter the CS immediately. Otherwise, it will migrate home to enter the CS immediately after N/2 migrations. If there is outdated information, there are two cases. In the first case, the agent visits the current last ordered agent within the first N/2-1+(N mod 2) (if NodeID=1) or N/2 (if NodeID>1) migrations. Because the agent can delete outdated information from its ASIT and get ordered, the number of migrations will be the same as in the former situation. In the second case, the agent visits the current last ordered agent after N/2-1+(N mod 2) (if NodeID=1) or N/2 (if NodeID>1) migrations. The worst case is that the agent visits all nodes before it arrives on a node and learns the current last ordered agent. Then the maximum number of migrations will be N-1. So under the light load condition, an agent needs N/2 to N-1 migrations to determine its order to enter the CS.

5.1.2 Under Heavy Load Condition

Under this condition, there will be many agents travelling across nodes competing to enter the CS. If an agent A determines its order, it knows that there are in total M mobile agents competing to enter the CS. The minimum number of first ranks A needs to determine its order should be N/M+1, while the other M-1 agents each achieve an average share of the remaining first ranks. The minimum number of migrations for agent A to determine its order will be N/M+2: in this case A achieves N/M+1 first ranks during the first N/M migrations, finds the other M-1 agents' information during the second-to-last migration, and then migrates home to enter the CS. The maximum number of migrations will be N-1, as ensured by Lemma 3.

5.2 Synchronization Delay

In our algorithm, since only one RELEASE message needs to be passed, the synchronization delay is T.

5.3 Response Time

First, let us define the following notation:

tA: agent execution time;
T: message transfer and agent migration time;
E: CS execution time.

Under the light load condition, before an agent enters the CS, it needs N/2 to N-1 migrations to determine its order and one migration to return home to enter the CS. The response time will be (N/2+1)*(tA+T) to (N-1)*(tA+T). Under the heavy load condition, if each agent waits for the RELEASE message to enter the CS, the response time will be N*(T+E).
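The bounds stated in Sections 5.1 and 5.3 can be evaluated numerically with a short sketch. It is only a transcription of the formulas as printed; the function names are hypothetical, and reading N/2 and N/M as integer (floor) division is an assumption, since the text does not say how fractions are rounded.

```python
def light_load_migrations(n, node_id):
    """Migrations until an agent is ordered when no information is outdated (Section 5.1.1)."""
    if node_id == 1:
        return n // 2 + (n % 2) - 1
    return n // 2


def heavy_load_min_migrations(n, m):
    """Minimum migrations to get ordered with m competing agents (Section 5.1.2)."""
    return n // m + 2


def max_migrations(n):
    """Upper bound on the migrations needed to get ordered (N-1, by Lemma 3)."""
    return n - 1


def light_load_response_time(n, t_a, t):
    """(lower, upper) response time under light load, as printed in Section 5.3."""
    per_hop = t_a + t
    return ((n // 2 + 1) * per_hop, (n - 1) * per_hop)


def heavy_load_response_time(n, t, e):
    """Response time under heavy load when every agent waits for RELEASE (Section 5.3)."""
    return n * (t + e)


# Example: a 10-node system with tA = 1, T = 2 and E = 5 time units.
bounds = light_load_response_time(10, 1, 2)
heavy = heavy_load_response_time(10, 2, 5)
```

For these example values the light load bounds evaluate to 18 and 27 time units and the heavy load response time to 70 time units.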


6 Discussion and Concluding Remarks

In this paper, we have described an algorithm using mobile agents to achieve fully distributed mutual exclusion in a computer network of N nodes. We have also presented the proof of correctness of the algorithm, with respect to guaranteed mutual exclusion, deadlock freedom and starvation freedom, and analyzed the performance of the algorithm.

Compared with message passing based protocols, the proposed mobile agent-enabled protocol has several advantages. First, mobile agent technology provides an approach to overcome the difficulties that hamper tight interaction between the processes. After being dispatched, the mobile agents become independent of the creating process and can operate asynchronously and autonomously [2, 22]. In this way, they also support mobile computing by carrying out tasks for a mobile user temporarily disconnected from the network. Second, because a mobile agent can package a conversation and dispatch itself to a destination host, using mobile agents allows us to design algorithms that make use of the most up-to-date system state information for decision making. It may also lead to a reduction of the total amount of communication, as the interactions can take place locally. Furthermore, mobile agents bring flexibility and scalability to distributed, dynamic systems due to their ability to encapsulate policies and algorithms and to automatically tolerate transient faults and dynamic changes of the network.

Our future work includes a quantitative study of the performance of mobile agent enabled algorithms in comparison with traditional algorithms using message passing.

Acknowledgement. This work is partially supported by the University Grant Council of Hong Kong under the CERG Grant B-Q518 (PolyU 5076/01E) and the Hong Kong Polytechnic University, under HK PolyU ICRG grant A-P202.

References

1. D. Agrawal and A. El Abbadi, "An efficient and fault-tolerant solution for distributed mutual exclusion," ACM Transactions on Computer Systems, vol. 9, no. 1, Feb. 1991, 1–20.
2. J. Cao, G. H. Chan, W. Jia, and T. Dillon, "Checkpointing and Rollback of Wide-area Distributed Applications Using Mobile Agents," Proc. IEEE 2001 International Parallel and Distributed Processing Symposium (IPDPS2001), San Francisco, USA, Apr. 2001.
3. J. Cao, T. S. Chan and J. Wu, "Achieving Replication Consistency using Cooperating Mobile Agents," Proc. IPPC 2001 Workshop on Wireless Networks and Mobile Computing (IEEE Computer Society Press), Valencia, Spain, Sep. 2001.
4. Y. I. Chang, "A simulation study on distributed mutual exclusion," Journal of Parallel and Distributed Computing, vol. 33, 1996, 107–121.
5. E. W. Felten and M. Rabinovich, "A centralized token-based algorithm for distributed mutual exclusion," Univ. of Washington technical report TR-92-02-02.
6. H. Garcia-Molina and D. Barbara, "How to assign votes in a distributed system," Journal of the ACM, vol. 32, no. 4, 1985, 841–860.
7. L. Lamport, "Time, clocks, and the order of events in a distributed system," Communications of the ACM, vol. 21, no. 7, Jul. 1978, 558–565.
8. D. B. Lange and M. Oshima, "Seven Good Reasons for Mobile Agents," Communications of the ACM, vol. 42, no. 3, Mar. 1999, 88–89.
9. M. Maekawa, "A sqrt(n) algorithm for mutual exclusion in decentralized systems," ACM Transactions on Computer Systems, vol. 3, no. 2, May 1985, 145–159.
10. M. Mizuno, M. L. Neilsen and R. Rao, "A Token based distributed mutual exclusion algorithm based on Quorum Agreements," 11th Intl. Conference on Distributed Computing Systems, May 1991, 361–368.
11. M. Naimi, M. Trehel and A. Arnold, "A log(n) distributed mutual exclusion algorithm based on path reversal," Journal of Parallel and Distributed Computing, vol. 34, 1996, 1–13.
12. M. L. Neilsen and M. Mizuno, "A DAG-based algorithm for distributed mutual exclusion," 11th Intl. Conference on Distributed Computing Systems, May 1991, 354–360.
13. V. A. Pham and A. Karmouch, "Mobile Software Agents: An Overview," IEEE Communications Magazine, Jul. 1988, 26–37.
14. K. Raymond, "A tree-based algorithm for distributed Mutual Exclusion," ACM Transactions on Computer Systems, vol. 7, no. 1, Feb. 1989, 61–77.
15. G. Ricart and A. Agrawala, "An optimal algorithm for mutual exclusion in computer networks," Communications of the ACM, vol. 24, no. 1, Jan. 1981, 9–17.
16. B. Sanders, "The information structure of distributed mutual exclusion algorithms," ACM Transactions on Computer Systems, vol. 5, no. 3, Aug. 1987, 284–299.
17. M. Singhal, "A heuristically-aided algorithm for mutual exclusion in distributed systems," IEEE Transactions on Computers, vol. 38, no. 5, May 1989, 651–662.
18. M. Singhal, "A dynamic information-structure mutual exclusion algorithm for distributed systems," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 1, Jan. 1992, 121–125.
19. M. Singhal, "A Taxonomy of Distributed Mutual Exclusion," Journal of Parallel and Distributed Computing, vol. 18, 1993, 94–101.
20. Suzuki and T. Kasami, "A distributed mutual exclusion algorithm," ACM Transactions on Computer Systems, vol. 3, no. 4, Nov. 1985, 344–349.
21. R. H. Thomas, "A majority consensus approach to concurrency control for multiple copy databases," ACM Transactions on Database Systems, vol. 4, no. 2, Jun. 1979, 180–209.
22. C. Xu and D. Tao, "Building Distributed Applications with Aglet," http://www.cs.duke.edu/chong/aglet
23. N. Minar, K.H. Kramer and P. Maes, "Cooperative Mobile Agents for Dynamic network Routing," in Software Agents for Future Communication Systems, Springer-Verlag, 1999.
24. J. Cao, X.B. Wang, and S.K. Das, "A Framework of Using Cooperating Mobile Agents to Achieve Load Sharing in Distributed Web Server Groups," to appear in Proc. 5th Int'l Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), Oct. 2002, Beijing, China.
25. J. Cao, X.B. Wang, S. Lo and S.K. Das, "A Consensus Algorithm for Synchronous Distributed Systems using Mobile Agent," to appear in Proc. 2002 Pacific Rim Int'l Symposium on Dependable Computing, Tsukuba, Japan, Dec. 2002.
26. R. S. Gray, G. Cybenko, D. Kotz, and D. Rus, "Mobile agents: Motivations and State of the Art," in Jeffrey Bradshaw, editor, Handbook of Agent Technology, AAAI/MIT Press, 2001.
27. S. Lodha and A. Kshemkalyani, "A Fair Distributed Mutual Exclusion Algorithm," IEEE Trans. on Parallel and Distributed Systems, vol. 11, no. 6, June 2000, 537–549.

Using a Secure Mobile Object Kernel as Operating System on Embedded Devices to Support the Dynamic Upload of Applications

Walter Binder and Balázs Lichtl

CoCo Software Engineering GmbH
Margaretenstr. 22/9, 1040 Vienna, Austria
{w.binder | b.lichtl}@cocosoftware.com

Abstract. In this paper we present the architecture of an autonomous, multi-purpose station which securely executes dynamically uploaded applications. The station hardware is based on an embedded Java processor running the system software and applications. The system software is built on top of a flexible, lightweight, efficient, and secure mobile object kernel, which is able to receive mobile code and to execute it, while protecting the station from faulty applications. Mobile code is used for application upload, as well as for remote configuration and maintenance. The autonomous station relies on resource accounting and control in order to prevent an overuse of its computing resources. Moreover, applications executing on the station may be charged for their resource consumption. This paper also describes an initial application of the autonomous station, which has been recently deployed in a pilot project: Based on the architecture of the autonomous station, we have designed and implemented an on-demand bus stop.

1 Introduction

This paper gives an overview of the design and architecture of an autonomous station, which is able to securely and reliably execute dynamically uploaded applications. The autonomous station does not rely on an external power supply system, but it comprises a unit for the generation of current in order to ensure its autonomy. It is equipped with application-dependent sensors and actuators, and it may be deployed in inaccessible environments. It offers its specific equipment to applications in a time-sharing fashion. The applications are not hard-coded in the station, but they are dynamically uploaded on demand. They are charged for their utilization of the resources provided by the autonomous station. In order to support application upload, transmission of results, and remote system maintenance, the autonomous station is connected to a public or private wireless network. The hardware of the station is based on an embedded Java processor running our system software, which is implemented in pure Java. The system software rests upon a lightweight, efficient, and secure mobile object platform, which is able to receive mobile code and to execute it, while protecting the autonomous station from malicious or badly programmed applications.

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 154–170, 2002. © Springer-Verlag Berlin Heidelberg 2002


In recent years, numerous research works have focused on mobile object technology. This has resulted in a better understanding of the problems inherent to mobile objects and mobile code in general, which have to be solved in order to enable the widespread deployment of mobile code based solutions in various commercial settings, such as the embedding of a Java-based mobile object system within a proprietary hardware environment. In particular, significant research has concentrated on building secure environments for the execution of foreign, potentially malicious mobile code, which may seriously damage the execution environment itself or other components in the system [2,8,7,3]. Java has become the de facto standard implementation language for mobile object platforms due to its network-centric approach, its runtime virtual machine, and features that ease the development of mobile object systems, such as a portable code format (Java bytecode) [11], dynamic and customizable class loading, multi-threading, built-in support for object serialization, and language safety. Since the adoption of the real-time specification for Java [6], embedded Java processors conforming to this specification have been developed, which are able to meet certain (soft) real-time requirements. Despite these advantages, Java has not been designed for multi-tasking (see footnote 1). Currently, Java offers no support for isolating mobile objects from each other. The lack of a task model in Java also makes it difficult to terminate applications in order to reclaim their allocated resources. Furthermore, Java has no support for resource accounting and control, which makes it vulnerable to denial-of-service (DoS) attacks and prevents the deployment of applications that shall be charged for their resource consumption. See [5] for an in-depth discussion of the deficiencies of Java with respect to mobile code environments.
Consequently, many recent research works have aimed at working around these shortcomings of Java related to mobile code. Different approaches have put restrictions on the programming model, used bytecode arbitration to control the execution of mobile code, and developed special runtime systems with enhanced functionality usually found in operating systems. In the meantime, Java-based mobile object systems are available that are secure and reliable enough to justify their deployment in commercial settings. As mobile object environments are perfectly suited for software distribution, installation, and remote maintenance, the system software of our autonomous station is based on J-SEAL2 [3], a lightweight Java-based micro-kernel, which makes Java safe for the execution of untrusted mobile code. J-SEAL2 offers a hierarchical task model that makes it possible to isolate applications, to terminate them in a safe way, and to monitor and limit their resource consumption. Our system software has control over special hardware, the embedded Java processor and its peripherals (i.e., different sensors and actuators which are connected to the main station). We use a processor that natively executes Java bytecode programs, which supersedes a Java Virtual Machine (JVM) [11] implemented in software, as well as an underlying operating system layer. Therefore, the execution overhead is significantly reduced when compared to a JVM implemented in software. Furthermore, the rather low clock rates of current Java processors (about 40–400MHz) help to preserve power, which is crucial if the station has a limited power supply. Moreover, because all system components and applications are implemented within a Java-based, high-level, object-oriented programming model, the reliability of the overall system is greatly improved.

This paper is structured as follows: In the next section we discuss possible applications of the autonomous station and present our requirements and design goals. In section 3 we explain how the station is managed. We focus on application upload and on communication of application data. Section 4 outlines our business model and presents different options for charging applications executing on the station. In section 5 we explain why we selected the J-SEAL2 mobile object platform as operating system kernel for the autonomous station. Section 6 addresses technical issues regarding resource management on embedded Java systems. Using a special benchmark suite tailored to embedded devices we evaluate the overhead of CPU accounting based on program transformation techniques. Section 7 outlines an application of the autonomous station, which has been deployed in a pilot project. The last section concludes this paper.

Footnote 1: In this paper the term 'task' refers to the concept of a process in an operating system, which makes it possible to separate and isolate applications from each other.

2 Applications and Design Goals

The autonomous station is a multi-purpose, customizable, and extensible system, which may be deployed in many different configurations. Below we mention a few exemplary applications. Equipped with the necessary sensors and actuators, the same autonomous station may serve a series of applications at the same time.

– In a pilot project we have used autonomous stations to provide on-demand bus stops, where the bus only passes the stop if a customer has explicitly ordered it. In section 7 we present some details of this particular application.
– Autonomous stations can be used for traffic monitoring and for tracing. Equipped with a radar unit and a high-speed digital camera, they are a cost-effective alternative to traditional radar units operated by the police, because they are maintained remotely. The uploaded monitoring application may employ techniques for pattern recognition to extract and transmit only the relevant portion of the image, thus saving communication costs.
– The general architecture of our autonomous station is also well-suited for cheap multi-purpose satellites, as well as for spacecraft.
– In the context of industrial sensing, autonomous stations offer a cost-effective alternative to a system of wired sensing elements.
– Autonomous stations may be deployed to monitor their environment. Equipped with sensors to detect toxic substances, autonomous stations can improve civil protection.
– Research institutions may use autonomous stations to collect environmental information.


Research on distributed wireless sensor networks [12] has been addressing some of these application domains, especially environmental monitoring. In sensor networks communication bandwidth and energy usually are limited, while computing power is comparatively plentiful and inexpensive. Due to bandwidth constraints and because communication over a wireless network consumes considerable energy, in-network processing, such as localized data aggregation, is essential to optimize the exploitation of resources [9]. Our autonomous station suffers from similar resource constraints; driven by these limitations, we have developed a generic environment to move application code as close as possible to where the data is collected. Whereas many recent works on wireless sensor networks focus on routing protocols and on data distribution, a very simple communication architecture is sufficient for our purpose (for details see section 3). Our contribution is an extensible, multi-purpose, embedded system, where different applications are installed remotely on demand and execute in a controlled and secure way.

The hardware of the autonomous station comprises a main board with CPU and memory, a wireless communication module, a power supply system that typically consists of solar cells and a rechargeable battery, as well as application-specific sensors and actuators (input and output devices). Because of the wireless communication module and the integrated power supply, the operator may easily relocate stations to new sites without incurring high installation costs. Depending on the concrete application, the hardware components have to meet the following requirements:

– The autonomous station has to be built from off-the-shelf components, in order to keep the hardware costs low.
– The hardware has to be resistant against variations in temperature.
– The power supply system must be adaptable to the concrete operational area of the station. The size of the solar cells and the capacity of the rechargeable battery have to be selected according to the expected insolation.
– Because of the limited power supply, the processor has to offer competitive performance as well as reduced power consumption. Since we require a safe high-level language for application programming, the autonomous station is based on a modern Java processor, which provides a standard JVM implemented in hardware.
– Depending on the location it may be necessary to protect the hardware against vandalism.

The design decision to employ a Java processor also implies that all system software, as well as the applications, have to be represented by JVM bytecode. The software may be implemented in pure Java or in any other language that can be compiled to JVM bytecode. For the system software, we have the following requirements:

– Because the station may not be easily accessible after deployment, and in order to reduce the maintenance overhead, the system is designed for remote
maintenance. This means that diagnostic programs may be uploaded in order to detect the reason for a malfunction. Furthermore, new system components may be installed remotely, and existing components may be replaced with new versions.
– Applications are installed and updated remotely.
– Applications may be terminated, freeing their allocated resources and leaving the station in a consistent state.
– Applications are charged for their resource consumption. The resources that may be charged for include power consumption, CPU and memory utilization, access to sensors and actuators, and communication.
– All system components are protected from faulty applications. Sensors and actuators cannot be directly accessed and programmed by an application, but a device driver (a system component) mediates access to the device.
– The autonomous station may offer its resources to multiple applications in a time-sharing fashion. Applications are isolated from each other, since they may execute on behalf of different parties.
– A device manager ensures that concurrent applications use multiple sensors and actuators in a consistent way. The device manager must support preemption and revocation, if a high-priority application requires access to a device occupied by a low-priority application.

3 Communication

In this section we give an overview of the communication model and infrastructure of the autonomous station, which is being used for application upload and for communication with running applications.

3.1 Supervising Server

Each autonomous station is managed and controlled by a single supervising server (SuSe). A SuSe may be in charge of multiple stations. The SuSe is the only communication partner of an autonomous station (see footnote 2). Clients who want to upload applications to an autonomous station or to communicate with an already uploaded application have to contact the station's SuSe, which acts as a gateway: It receives client requests from the wired network (e.g., through a TCP/IP connection) and dispatches the request to the corresponding station, which is accessible only through a wireless network (see footnote 3). Vice versa, the SuSe receives messages from an application running on a station and forwards them to the client who has deployed the application.

This approach helps to protect the autonomous station, as all messages to the station are mediated by the SuSe. For instance, if an application is to be uploaded, the SuSe inspects the application code to ensure certain security properties. Moreover, this model simplifies the management and maintenance of the station, because the SuSe is its single authority, which also facilitates the charging of applications. In addition to this, the communication module within the autonomous station is significantly simplified by the fact that there is only a single communication partner.

The autonomous station and its SuSe employ symmetric key encryption to protect and to authenticate the messages communicated over the wireless network. The symmetric key is a shared secret between the station and its SuSe (see footnote 4). Because cryptography based on symmetric keys can be implemented much more efficiently than public key cryptography, this approach saves processing on the autonomous station and, hence, helps to preserve power. Furthermore, there is no need for the station to utilize any public key infrastructure. Considering the limited CPU, memory, and power resources available to the station, the system software has to be kept simple, small, and efficient.

Footnote 2: In order to prevent the SuSe from becoming a single point of failure, the autonomous station may communicate with a set of backup SuSes, if its primary SuSe fails. The physical replication of a SuSe is crucial for applications with (soft) real-time guarantees, such as applications for civil protection, industrial sensing, or military purpose. Within this paper we do not elaborate issues concerning the replication of SuSes.

3.2 Directory and Registration

Each SuSe provides a directory of the autonomous stations it manages. Within the namespace of a SuSe, each station has its own unique identifier. For each autonomous station, the directory comprises the services offered by the station, its location (GIS coordinates), a specification of the APIs needed to program the special sensors and actuators attached to the station, a description of the different quality-of-service (QoS) levels supported by the station, and detailed pricing information. The directory of a SuSe may be publicly accessible, or it may be restricted to registered clients only. Before a client can upload his first application to a station maintained by a particular SuSe, he has to register with the SuSe. The client has to post the information necessary for billing (e.g., billing address, credit-card details, etc.), as well as his public key, which will be used by the SuSe to authenticate the client's requests (such as requests for application upload). If desired by the client, the public key may also be used to encrypt application results, which are forwarded by the SuSe to the client. The SuSe verifies the client's public key as well as the given billing information. Upon successful validation, the client will be allowed to upload applications to autonomous stations.
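The directory contents listed above map naturally onto a small record type. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, the location is simplified to a latitude/longitude pair, and the access rule only models the choice between a public directory and one restricted to registered clients.

```python
from dataclasses import dataclass, field


@dataclass
class StationEntry:
    """One directory record; field names are illustrative, not taken from the paper."""
    station_id: str    # unique within the namespace of one SuSe
    services: list     # services offered by the station
    location: tuple    # assumed (latitude, longitude) pair
    device_apis: list  # APIs for the attached sensors and actuators
    qos_levels: list   # supported quality-of-service levels
    pricing: dict      # pricing information per resource


@dataclass
class Directory:
    """Directory of one SuSe, either public or restricted to registered clients."""
    public: bool = True
    entries: dict = field(default_factory=dict)
    registered_clients: set = field(default_factory=set)

    def add(self, entry):
        # Station identifiers must stay unique within this SuSe.
        if entry.station_id in self.entries:
            raise ValueError("duplicate station identifier")
        self.entries[entry.station_id] = entry

    def lookup(self, station_id, client_id=None):
        # A restricted directory only answers registered clients.
        if not self.public and client_id not in self.registered_clients:
            raise PermissionError("directory is restricted to registered clients")
        return self.entries[station_id]
```

A restricted directory would be created with `Directory(public=False)` and would serve a lookup only after the client identifier has been added to `registered_clients`.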

Footnote 3: The wireless network used to connect the autonomous station with its SuSe depends on the physical location of the station and the range of applications it shall support. For instance, in the configuration presented in section 7 the autonomous station is connected to a public GPRS (General Packet Radio Service) network [1].

Footnote 4: To protect against brute force attacks aimed at cracking the symmetric key, the autonomous station and its SuSe isochronously change the common key.
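The shared-secret protection between a station and its SuSe, together with the key change mentioned in footnote 4, can be sketched as follows. This is only an illustration under stated assumptions: it shows message authentication with HMAC-SHA256 from the Python standard library and omits encryption entirely, the paper does not name any concrete algorithm, and the key derivation rule in `next_key` is a hypothetical choice.

```python
import hashlib
import hmac

TAG_LEN = 32  # length in bytes of an HMAC-SHA256 tag


def protect(key: bytes, payload: bytes) -> bytes:
    """Append an authentication tag computed with the shared key."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def verify(key: bytes, message: bytes) -> bytes:
    """Return the payload if the tag matches, otherwise raise ValueError."""
    if len(message) < TAG_LEN:
        raise ValueError("message too short")
    payload, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return payload


def next_key(key: bytes, epoch: int) -> bytes:
    """Hypothetical rotation rule: both sides derive the key for the given epoch."""
    label = b"rotate" + epoch.to_bytes(8, "big")
    return hmac.new(key, label, hashlib.sha256).digest()
```

Because both sides can compute `next_key` locally, no key material has to cross the wireless link when the key changes; a message tagged under the old key is rejected once the receiver has moved to the new one.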

3.3 Application Upload

When a client wants to upload a new application to an autonomous station, he has to transmit the application to the station’s SuSe. For this purpose, the client sends a signed message to the SuSe, including the identifiers of the destination stations, a Java archive (a JAR file) containing the application classes and a deployment descriptor, as well as the network location (e.g., host address, port, and protocol) where application results shall be routed to. The deployment descriptor specifies the resource requirements of the application, QoS parameters, etc. The SuSe checks whether the requested QoS can be guaranteed. If the new application is accepted, it is assigned a unique application identifier (AID) within the SuSe’s namespace. The application archive is opened, the application classes are verified (and eventually modified to guarantee certain security properties, such as resource accounting and control of the application [4]), and the application is re-packaged in a special application transfer format, which may yield better compression (an important aspect regarding the low bandwidth of many wireless networks) and which can be handled by the mobile object system executing within the autonomous station. We are relying on compression algorithms that are especially tailored to Java class files [10] and achieve significantly better compression than commonly used methods such as ZIP. The AID is part of the transfer format. The re-packaged application is transmitted to the destination stations through a wireless network. As mentioned before, symmetric key encryption is used to secure the communication. All messages originating from the application will be tagged with the AID, allowing the SuSe to dispatch them to the client. Upon succesful installation of the application, the AID is returned to the client, who can use it to direct messages to the application. 
When an existing application is to be updated, the client has to provide the identifier of the application to be replaced. When the autonomous station receives an application which is already running (according to the AID), the old version is terminated before the new one is installed (footnote 5).

Footnote 5: This simple mechanism for application update may not be suited for soft real-time applications that require a high QoS. A more sophisticated protocol for application update may allow both versions of the application to execute concurrently for a short period of time, allowing the new version to take over the functions of the old one, which will be terminated in a coordinated way.

3.4 Communication of Application Data

Once installed, an application may transmit results to the client who has deployed the application. The communication module within the autonomous station tags the message with the correct AID. It is also responsible for buffering messages that cannot be transmitted immediately due to a network failure or crash of the SuSe. As with application upload, communication messages are encrypted with a symmetric key known only to the autonomous station and its SuSe.
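The tagging and buffering performed by the station's communication module can be sketched as follows. The class and method names are hypothetical, the wireless link is replaced by an in-memory list, and encryption is left out; the sketch only illustrates that the module (not the application) supplies the AID and that messages are held back while the link or the SuSe is unavailable.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a station-side module that tags and buffers messages. */
final class CommunicationModule {
    /** A message tagged with the identifier of the application that produced it. */
    static final class Tagged {
        final int aid;
        final byte[] payload;

        Tagged(int aid, byte[] payload) {
            this.aid = aid;
            this.payload = payload;
        }
    }

    private final Deque<Tagged> pending = new ArrayDeque<>();
    private final List<Tagged> delivered = new ArrayList<>(); // stands in for the wireless link
    private boolean linkUp = false;

    /** Called on behalf of an application; the module supplies the AID tag. */
    void send(int aid, byte[] payload) {
        pending.addLast(new Tagged(aid, payload));
        flush();
    }

    /** Called when the network or the SuSe becomes available or unavailable. */
    void setLinkUp(boolean up) {
        linkUp = up;
        flush();
    }

    /** Transmits buffered messages in FIFO order while the link is available. */
    private void flush() {
        while (linkUp && !pending.isEmpty()) {
            delivered.add(pending.removeFirst()); // encryption omitted in this sketch
        }
    }

    int pendingCount() {
        return pending.size();
    }

    List<Tagged> delivered() {
        return delivered;
    }
}
```

A bounded buffer would be needed on a memory-constrained station; the unbounded queue here is a simplification.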

Using a Secure Mobile Object Kernel as Operating System


The SuSe decrypts messages received from an autonomous station. Based on the AID, the SuSe is able to determine the recipient. If desired by the client, the SuSe signs the message and encrypts it with the receiver’s public key. Finally, the message is delivered to the network address, which the client has provided during installation of the application. If the receiver is temporarily not available, the SuSe buffers the message. The client may also send control messages to its application. For this purpose, he has to transmit the signed message to the SuSe, providing the destination AID, which the SuSe needs to route the message to the correct station. Within the autonomous station, the communication module dispatches the message to the corresponding application based on the AID.

3.5 Example

Figure 1 illustrates the deployment of an application on autonomous stations. Encryption details are not shown in this figure. In this example the SuSe controls the autonomous stations AS 1 and AS 2. First, the client wishing to upload an application registers at the SuSe (1). Then he transmits a package containing application App A to the SuSe (2), which verifies and, if necessary, rewrites the classes (3), before the application is uploaded to the autonomous stations (4). In this example, the client requests to upload the application to both stations. AS 2 is already executing another application App X, while AS 1 is idle before App A is installed. After App A has been installed and has started executing on AS 2 (5), it transmits application results to its SuSe (6), which dispatches (7) and forwards (8) the data to the client owning the application. Finally, the client is able to process the application results (9).

4 Charging for Resource Consumption

In order to amortize the investment in autonomous stations and to make money, clients may be charged for applications they have deployed. The total charge for running an application on a station may comprise the following costs:

– A flat rate for application deployment.
– A general fee for using the station, which may be charged on a month-by-month basis. This charge will depend on the QoS (or priority) granted to the application.
– Variable costs depending on the resources consumed by the application, such as power, CPU time, and memory. For many applications, power will be the most precious resource, as it is limited by the available insolation and the capacity of the rechargeable battery. The power consumption may be roughly determined from measurements (i.e., voltage metering), as well as from the utilization of other resources, such as the number of executed JVM instructions (i.e., CPU consumption), access to the communication module, to sensors, and to actuators (which usually consumes extra power), etc.
– Communication costs for messages originating from the application or sent to the application. This charge will depend on the costs of the underlying wireless network.
– An extra fee for application update.

Fig. 1. Application upload and communication.

The general fee and the variable costs for consumed resources help to amortize the investments in the station (hardware costs and license fees for the system software). Essentially, these investments are to be made by the station’s owner before applications can be deployed. Nevertheless, there are also some non-negligible costs for maintenance: Apart from upgrades of the system software, which can be done remotely in a similar way as application upload, costs for repairs, for cleaning and calibration of sensors, etc. have to be considered. Furthermore, because the lifetime of the rechargeable battery is limited, it has to be replaced occasionally.

The resource consumption of an application is monitored and accumulated by the system software within the autonomous station. Once it exceeds a given threshold, the information on consumed resources is communicated to the station’s SuSe. Communication causes additional costs for the owner of the autonomous station. Depending on the low-level transfer protocol offered by the wireless network and the communication frequency, these costs may become very high. Therefore, it is important that the SuSe, which is involved in every message transfer over the wireless network, logs every communication on persistent storage. Thus, the charging for communication will always be accurate (except for lost messages sent by the autonomous station).
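The cost components listed in this section can be combined as in the following sketch. All rates are invented values (in cents) chosen only for illustration, and the names `Charging`, `totalCharge`, and `UsageReporter` are hypothetical. The nested class shows one possible reading of the threshold-based reporting of accumulated consumption from the station to its SuSe.

```java
/** Hypothetical charging sketch; all rates are made-up values in cents. */
final class Charging {
    static final long DEPLOYMENT_FLAT_RATE = 5000;
    static final long MONTHLY_FEE = 2000;
    static final long UPDATE_FEE = 1000;
    static final long PRICE_PER_MILLION_BYTECODES = 3;
    static final long PRICE_PER_KILOBYTE = 2;

    /** Sums the flat rate, general fee, update fees, variable costs, and communication costs. */
    static long totalCharge(int months, int updates, long executedBytecodes, long bytesTransferred) {
        long variable = (executedBytecodes / 1_000_000L) * PRICE_PER_MILLION_BYTECODES;
        long kilobytes = (bytesTransferred + 1023) / 1024; // round up to whole kilobytes
        long communication = kilobytes * PRICE_PER_KILOBYTE;
        return DEPLOYMENT_FLAT_RATE
                + months * MONTHLY_FEE
                + updates * UPDATE_FEE
                + variable
                + communication;
    }

    /** Accumulates consumption on the station and reports it once a threshold is exceeded. */
    static final class UsageReporter {
        private final long threshold;
        private long unreported;

        UsageReporter(long threshold) {
            this.threshold = threshold;
        }

        /** Returns the amount to report to the SuSe, or 0 while at or below the threshold. */
        long record(long consumed) {
            unreported += consumed;
            if (unreported <= threshold) {
                return 0;
            }
            long report = unreported;
            unreported = 0;
            return report;
        }
    }
}
```

Reporting only after a threshold is crossed keeps the number of wireless messages low, which matters because each report itself incurs communication costs.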

5 Adaption of the J-SEAL2 Mobile Object Kernel

The system software of the autonomous station is based on J-SEAL2 (footnote 6) [3], a secure mobile object kernel. J-SEAL2 is a micro-kernel that offers a task model with strong isolation properties on top of standard Java runtime systems. It is based on the formal model of the Seal Calculus [14], which was first implemented by the JavaSeal kernel [7]. J-SEAL2 is able to securely and concurrently execute multiple Java applications in the same JVM, which are completely separated from each other. J-SEAL2 resembles the kernel of a traditional operating system, as it provides mechanisms for application isolation, efficient and mediated communication, safe termination, and resource control. The architecture of J-SEAL2 is well suited as the basis for mobile code systems, because it offers the necessary level of host security, which is not found in current standard Java runtime systems: Executing applications and system services within separate tasks, J-SEAL2 protects the platform from malicious or badly programmed applications, as well as applications from each other. We are exploiting the advanced security mechanisms of J-SEAL2 to protect the autonomous station from faulty applications, to isolate applications from each other, and to account and control their resource consumption, which is necessary for charging.

Footnote 6: http://www.jseal2.com/

We selected J-SEAL2 because it offers several advantages with respect to our requirements:

– J-SEAL2 has been specially designed for increased host security. It provides a hierarchical task model, which allows system components (such as the communication module or device drivers) and applications to be isolated while they are executing within the same JVM. In the task model of J-SEAL2 a parent task acts as communication controller, access controller, and resource manager of its children. Applications that are uploaded to the autonomous station are installed as children of a trusted mediator task, which controls the execution of the applications and limits their resource consumption.
– J-SEAL2 is implemented in pure Java. In contrast to other secure mobile object platforms and Java operating systems, such as KaffeOS [2], J-SEAL2 relies neither on a special JVM nor on native code. This is of paramount importance, as the platform has to run on a Java processor, which provides a standard JVM implemented in hardware.
– J-SEAL2 is a small and efficient micro-kernel. The kernel offers only essential primitives to implement secure and reliable system software. This is important, since the memory available on the autonomous station is limited.
– J-SEAL2 has a modular and extensible architecture. Special system services, such as device drivers, can be added dynamically.
– J-SEAL2 supports resource control for physical resources (i.e., CPU and memory), for logical resources (e.g., threads), and for access to service components. For CPU and memory control, the resource consumption of the application is reified [4,13]. That is, the application is modified to expose its resource consumption to the system. This approach enables resource control, even if the underlying Java runtime system does not support it. In our setting the application is modified for resource control by the SuSe before it is uploaded to the station. As these modifications are complex and time consuming, they should not be carried out by the station, where the resources are limited.

System components (e.g., device drivers, device manager, communication module, etc.) as well as applications execute within separate tasks. Each task may be terminated at any time, which frees its allocated resources and is guaranteed to leave the system in a consistent state (footnote 7). Consequently, we are able to implement application upload and update, installation of new system services, and update of system services in a similar way.

6 Resource Management on Embedded Java Systems

In order to support resource accounting and control on the autonomous station, the SuSe rewrites applications to keep track of the number of executed bytecode instructions (CPU accounting) and to update a memory account when objects are allocated or reclaimed by the garbage collector. Ideally, program transformations for resource management shall be compatible with existing Java runtime systems, shall cause only moderate overhead, and allow accurate accounting. While the accuracy of CPU accounting on a Java 2 Standard Edition JVM is limited, because of the significant part of native code that is not accounted for and due to optimizations performed by the compiler, the accounting precision on a Java processor can be much better, as the execution time of individual bytecode instructions can be measured and only very simple and well documented optimizations are performed, such as the combination of certain JVM instructions (instruction folding). However, sophisticated compiler optimizations help to reduce the overhead, and consequently the relative overhead on an embedded Java processor may be significantly higher than on a JVM with a modern compiler, where the overhead for CPU accounting is about 15–30% [4, 13].

Footnote 7: If the task to be terminated is executing a kernel operation, termination is delayed until completion of the kernel operation, in order to ensure the integrity of the kernel. Because kernel operations in J-SEAL2 are non-blocking and have a short and constant execution time, termination cannot be delayed arbitrarily. Details concerning task termination in J-SEAL2 are presented in [3].

In order to evaluate the overhead caused by the rewriting of application and of JDK classes on an embedded Java processor, we created a special benchmark suite including the Embedded Caffeine Mark 3.0 (footnote 8) benchmarks, as well as a series of custom benchmarks focused on cryptography (footnote 9). We measured encryption and decryption with the AES and RC6 symmetric key algorithms using 12kB of input data and a key length and block size of 128 bits. We also evaluated the performance of the RSA public key algorithm with 256 bytes of input data, a key length of 1024 bits for both the public and the private key, an exponent of 8 bits for the public key, and an exponent of 1016 bits for the private key. The cryptography benchmarks have high practical relevance for the autonomous station, as all communication with the station is encrypted. Our performance measurements were collected on a JStamp board by Systronix (footnote 10), which is based on an aJile aJ-80 processor (footnote 11) running at 80MHz. The memory resources of the JStamp are very limited: it offers only 512KB of SRAM and 512KB of flash memory. We measured the performance with the following 3 configurations:

– Unmodified (U): Neither the runtime system nor the benchmarks are rewritten. This setting gives a reference value for comparison.
– Rewritten (W): The runtime system and the benchmarks are rewritten for CPU accounting.
– Rewritten, Optimized (W*): In this setting some simple optimizations are applied to reduce the overhead. A simple loop detection algorithm marks the beginning of loops in the control flow graph of each method. Basically, CPU accounting is limited to the first basic block of code in a method, in an exception handler, and in a JVM subroutine, as well as to the blocks marked by the loop detection algorithm. The accounting weight is determined by the longest path in the control flow graph until the next accounting site is reached. Depending on the required precision, additional accounting sites may be inserted. For our measurements we did not require precise accounting.
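The difference between the W and W* configurations can be illustrated at source level. The following is a sketch only: the actual transformation operates on bytecode, the helper `account` and the global counter are hypothetical, and the weights are illustrative numbers rather than real bytecode counts. In the W variant every basic block carries an accounting site; in the W* variant only the method entry and the loop header do, each charged with the weight of the longest path to the next site.

```java
/** Source-level illustration of CPU accounting; the real rewriting works on bytecode. */
final class AccountingSketch {
    /** Global counter for this sketch; a real system would keep one per task. */
    static long consumed = 0;

    /** Hypothetical accounting hook inserted by the rewriting tool. */
    static void account(int weight) {
        consumed += weight;
    }

    /** Original method, as a client would write it. */
    static int sum(int[] a) {
        int s = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > 0) {
                s += a[i];
            }
        }
        return s;
    }

    /** "W" style: one accounting site per basic block (weights are illustrative). */
    static int sumW(int[] a) {
        account(2);                 // method entry block
        int s = 0;
        for (int i = 0; i < a.length; i++) {
            account(4);             // loop test and if-condition
            if (a[i] > 0) {
                account(3);         // then-branch
                s += a[i];
            }
        }
        account(1);                 // exit block
        return s;
    }

    /** "W*" style: sites only at method entry and loop header, weighted by the longest path. */
    static int sumWStar(int[] a) {
        account(3);                 // entry block plus exit block
        int s = 0;
        for (int i = 0; i < a.length; i++) {
            account(7);             // longest path through the loop body
            if (a[i] > 0) {
                s += a[i];
            }
        }
        return s;
    }
}
```

Because W* always charges the longest path, it may overestimate consumption when the shorter branch is taken; this is the loss of precision that the text accepts in exchange for fewer accounting sites.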
Since embedded systems tend to be memory constrained, we also recorded the overhead regarding the size of the binary program image (including the Java runtime system as well as the benchmark classes) that is written into the flash memory of the board. Table 1 presents the size of the binary program image for each setting; the overhead is only 7–16%. Tables 2 and 3 present our performance measurements. Comparing the performance of the cryptography benchmarks backs our design decision to use only symmetric key encryption for communication with the autonomous station, which does not have the necessary computing resources for public key cryptography. Especially the RC6 algorithm is well suited for embedded systems with limited resources, as it is based on simple and efficient operations.

Footnote 8: http://www.pendragon-software.com/pendragon/cm3/
Footnote 9: We used the implementation of ‘The Legion of the Bouncy Castle’, available at http://www.bouncycastle.org/, which is a free implementation of cryptography protocols offering, in addition to the JCE API, also a lightweight encryption API for Java 2 Micro Edition environments.
Footnote 10: http://www.systronix.com/
Footnote 11: http://www.ajile.com/

Table 1. Size of binary program image (values in bytes).

Benchmark         U                 W                  W*
Image size        152192 (1,00)     176712 (1,16)      163260 (1,07)

Table 2. Overhead of CPU accounting: Custom benchmarks (time in milliseconds).

Benchmark         U                 W                  W*
Bubblesort        4429 (1,00)       7023 (1,59)        5264 (1,19)
Hashtable         141 (1,00)        251 (1,78)         210 (1,49)
AES Enc. 12kB     4561 (1,00)       7352 (1,61)        6168 (1,35)
AES Dec. 12kB     6295 (1,00)       11865 (1,88)       8893 (1,41)
RC6 Enc. 12kB     956 (1,00)        1689 (1,77)        1480 (1,55)
RC6 Dec. 12kB     991 (1,00)        1710 (1,73)        1500 (1,51)
RSA Enc. 256B     7276 (1,00)       15265 (2,10)       10514 (1,45)
RSA Dec. 256B     921967 (1,00)     2084941 (2,26)     1347205 (1,46)
Geometric mean    4286 (1,00)       7830 (1,83)        6096 (1,42)

Table 3. Overhead of CPU accounting: Embedded CaffeineMark benchmarks (values are scores).

Benchmark         U                 W                  W*
Sieve             33 (1,00)         12 (2,75)          25 (1,32)
Loop              26 (1,00)         16 (1,63)          23 (1,13)
Logic             42 (1,00)         7 (6,00)           38 (1,11)
String            57 (1,00)         26 (2,19)          42 (1,36)
Float             24 (1,00)         19 (1,26)          22 (1,09)
Method            41 (1,00)         16 (2,56)          27 (1,52)
Overall           35 (1,00)         14 (2,50)          28 (1,25)

Without optimizations the overhead of CPU accounting is excessive, about 80% for the custom benchmarks and even more for the CaffeineMark suite. The measurements W* show the best results optimizations can achieve by minimizing the number of accounting sites (without changing the structure of programs). We can see that optimizations have the potential to significantly reduce the accounting overhead to 25% for CaffeineMark and about 40% for the more realistic custom benchmarks. For the moment, we have to accept this significant overhead, because resource accounting is essential for the proper protection of the autonomous station and for billing. However, since current Java processors do not perform sophisticated optimizations, we will also experiment with standard code optimization techniques typically applied by optimizing compilers, such as method inlining and loop unrolling. These techniques aim at increasing the average size of basic blocks of code, thus reducing the accounting overhead (in addition to eliminating branch instructions and method invocations). However, because embedded systems frequently are memory constrained, optimization techniques that increase the code size have to be applied with special caution.
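The factors given in parentheses in Tables 2 and 3 can be reproduced with a small calculation. The sketch below assumes, as the tables suggest, that for time measurements the factor is the rewritten time divided by the unmodified time, whereas for CaffeineMark scores (where higher is better) it is the unmodified score divided by the rewritten score. The class and method names are hypothetical.

```java
/** Hypothetical helper that derives overhead factors from benchmark measurements. */
final class Overhead {
    /** Slowdown factor for time measurements (lower time is better). */
    static double timeFactor(double unmodified, double rewritten) {
        return rewritten / unmodified;
    }

    /** Slowdown factor for benchmark scores (higher score is better). */
    static double scoreFactor(double unmodified, double rewritten) {
        return unmodified / rewritten;
    }

    /** Converts a slowdown factor into a percentage overhead, rounded to an integer. */
    static long percentOverhead(double factor) {
        return Math.round((factor - 1.0) * 100.0);
    }

    public static void main(String[] args) {
        // Bubblesort times from Table 2: U = 4429 ms, W = 7023 ms, W* = 5264 ms.
        System.out.println(percentOverhead(timeFactor(4429, 7023)) + "% (W)");
        System.out.println(percentOverhead(timeFactor(4429, 5264)) + "% (W*)");
        // Overall CaffeineMark scores from Table 3: U = 35, W = 14, W* = 28.
        System.out.println(percentOverhead(scoreFactor(35, 14)) + "% (W)");
        System.out.println(percentOverhead(scoreFactor(35, 28)) + "% (W*)");
    }
}
```

With the overall CaffeineMark scores this yields the 25% overhead for W* quoted in the text.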

7 Autonomous Stations as On-Demand Bus Stops

Recently, we have used autonomous stations based on the architecture described in this paper to provide on-demand bus stops within a pilot project. This application is the first deployment of our autonomous station under commercial settings. The project was initiated by a bus operator in Austria, in order to avoid empty buses in the rural area and to improve the QoS. The customer has to press a button on the bus stop to order the next bus. If there is no request by a customer, the bus will not pass the stop. The on-demand bus stops are deployed in areas where the bus service is used rarely and irregularly. Consequently, the bus driver may frequently select a shorter route to save fuel and time, which also helps to compensate for delays due to traffic jams. Therefore, the overall bus punctuality (and hence the QoS) is improved. This application involves four major components: The autonomous stations allowing customers to order buses, the stations’ SuSe, a logistics application to coordinate the bus routes, as well as a simple application running on cell phones to interact with the bus drivers. The logistics application communicates with the autonomous stations through the SuSe. In our setting the communication between the logistics application and the bus drivers’ cell phones is managed by the SuSe as well. In order to reduce the hardware costs, the user interface of the station (its peripherals) is kept as simple as possible: It comprises two vertically grouped buttons with built-in LEDs and a 20x4 character LCD display with green background illumination. The LEDs on the buttons signal the possible input choices to the user. See figure 2 for some pictures. Furthermore, the station contains a beeper for acoustic feedback of user actions, as well as a motion detection sensor to activate the illumination of the display when a potential customer approaches the station. The communication with the SuSe is based on UDP packets on top of a GPRS [1] connection. 
The station includes a Motorola g18 GPRS GSM embedded wireless module, which is connected to the mainboard of the station by the standard RS232 serial interface. Since we could not find a free implementation of the PPP/IP/UDP protocol stack in pure Java, we had to develop our own implementation from scratch. In the current version the application executing on the autonomous station is able to serve the schedule of one bus in two directions. The configuration of the application (i.e., the bus schedule) is completely dynamic: it is communicated by the server application periodically. Therefore, the bus schedule may be changed at runtime without uploading a new version of the application. The application running on the station also logs all user actions and periodically transmits the logging data to the server. This information is important to evaluate the user behaviour and the acceptance of the on-demand bus stop. Charging the application for consumed resources is not necessary in this pilot project, because the bus operator does not allow the uploading of foreign applications so far. The benefits of installing the flexible and extensible autonomous stations will become evident in the long term. If the number of deployed stations increases, managing software updates (e.g., bug fixes, improvements of the user dialog, etc.) remotely reduces the maintenance costs. Moreover, once a large number of stations has been installed in a wide area, this infrastructure becomes attractive to deploy additional applications in a cost-effective way, such as traffic or environmental monitoring. The operator has to add the required devices to the stations and may open his infrastructure to other parties that will be charged for this service.

Fig. 2. Pictures of the on-demand bus stop, its solar panel (also showing the GSM antenna and the motion detection sensor), as well as its simple user interface.

8 Conclusions

The contributions of our work are threefold: Firstly, we present the autonomous station, a new application of embedded Java, which relies on mobile code for program upload and remote maintenance. The station has a flexible and extensible architecture to support a wide range of different applications. Secondly, we show how a mobile object kernel can be adopted as an embedded operating system for the distribution and installation of applications by mobile code. We are exploiting the advanced security features of the J-SEAL2 mobile object kernel in order to provide a reliable and secure system to host foreign applications, which are charged for their resource consumption. Finally, we present a concrete application, the on-demand bus stop, which is based on the architecture of our autonomous station. Since resource management is a missing feature in current versions of Java, we are relying on program transformation techniques to expose the resource consumption of applications. Our techniques for resource accounting are fully portable and also work on embedded Java systems. However, because current Java processors perform no sophisticated optimizations, the accounting overhead is significantly higher than on standard Java runtime systems. With the aid of traditional program optimizations we will continue to reduce the overhead of portable resource management in embedded applications like our autonomous station.

Acknowledgments. Many thanks to Klaus Rapf and to Volker Roth for their useful comments that helped us to improve the paper.

References

1. 3GPP. 3GPP Specifications Home Page. Web pages at http://www.3gpp.org/specs/specs.htm.
2. G. Back, W. Hsieh, and J. Lepreau. Processes in KaffeOS: Isolation, resource management, and sharing in Java. In Proceedings of the Fourth Symposium on Operating Systems Design and Implementation (OSDI’2000), San Diego, CA, USA, Oct. 2000.
3. W. Binder. Design and implementation of the J-SEAL2 mobile agent kernel. In The 2001 Symposium on Applications and the Internet (SAINT-2001), San Diego, CA, USA, Jan. 2001.
4. W. Binder, J. Hulaas, A. Villazón, and R. Vidal. Portable resource control in Java: The J-SEAL2 approach. In ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA-2001), Tampa Bay, Florida, USA, Oct. 2001.
5. W. Binder and V. Roth. Secure mobile agent systems using Java: Where are we heading? In Seventeenth ACM Symposium on Applied Computing (SAC-2002), Madrid, Spain, Mar. 2002.
6. G. Bollella, B. Brosgol, P. Dibble, S. Furr, J. Gosling, D. Hardin, and M. Turnbull. The Real-Time Specification for Java. Addison-Wesley, Reading, MA, USA, 2000.
7. C. Bryce and J. Vitek. The JavaSeal mobile agent kernel. In First International Symposium on Agent Systems and Applications (ASA’99)/Third International Symposium on Mobile Agents (MA’99), Palm Springs, CA, USA, Oct. 1999.
8. G. Czajkowski and L. Daynes. Multitasking without compromise: A virtual machine evolution. In ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA ’01), Tampa Bay, Florida, Oct. 2001.
9. J. Heidemann, F. Silva, C. Intanagonwiwat, R. Govindan, D. Estrin, and D. Ganesan. Building efficient wireless sensor networks with low-level naming. In G. Ganger, editor, Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP-01), volume 35, 5 of ACM SIGOPS Operating Systems Review, pages 146–159, New York, Oct. 21–24 2001. ACM Press.
10. R. N. Horspool and J. Corless. Tailored compression of Java class files. Software Practice and Experience, 28(12):1253–1268, Oct. 1998.
11. T. Lindholm and F. Yellin. The Java Virtual Machine Specification. Addison-Wesley, Reading, MA, USA, second edition, 1999.
12. G. J. Pottie and W. J. Kaiser. Embedding the Internet: wireless integrated network sensors. Communications of the ACM, 43(5):51–51, May 2000.
13. A. Villazón and W. Binder. Portable resource reification in Java-based mobile agent systems. In Fifth IEEE International Conference on Mobile Agents (MA-2001), Atlanta, Georgia, USA, Dec. 2001.
14. J. Vitek and G. Castagna. Seal: A framework for secure mobile computations. In Internet Programming Languages, 1999.

Supporting Flexible Data Feeds in Dynamic Sensor Grids through Mobile Agents

Marco Carvalho and Maggie Breedy

Institute for Human and Machine Cognition / University of West Florida
{mcarvalho,mbreedy}@ai.uwf.edu

Abstract. This paper describes a framework for flexible data feeds in sensor grids where resource constraints, policies, and a dynamic topology are important factors. Mobile agents are used to dynamically establish the data flows and data transformations in the network. They also act as policy enforcers that are dynamically dispatched into the sensor network. A dynamic topology for the network is taken into consideration, where nodes can join and leave at any time. Mobile code provides the means to dynamically deploy capabilities to any participating host, and strong mobility allows process migration between nodes to ensure feed survivability and load balancing. The proposed framework relies on a strong mobility agent system (NOMADS) and the KAoS framework for policy enforcement and is currently being used to support a military coalition agent scenario (CoAX 2002).

1 Introduction

Our capacity to interact with the world around us is directly related to our capacity to collect and interpret data. Some of the more complex aspects of collecting and distributing information involve the aggregation and transformation of data, as well as the actual transmission from its source to its destination. More often than not, information is extracted from data collected by remote sensors and then distributed to the destination nodes utilizing various types of devices, links, and protocols. There are many different communication models available to regulate the exchange of messages between nodes in a data network. Broadcast-based models and publish-subscribe models based on unicast and multicast have been widely discussed and used on data networks. However, most of these protocols have been designed for direct delivery of data from the sensor to the client and fail to efficiently address issues of processing load distribution and bandwidth reduction for customized data feeds between many sensors and clients. Customized delivery of data usually requires additional processing load for data transformation that will happen either at the server or at the client side. Looking at services provided through the web today, we can easily identify these two cases. Some applications rely on heavy computation at the server side to create customized pages or database results, whereas others rely on applets or downloadable code to push this computational load to the client.

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 171–185, 2002. © Springer-Verlag Berlin Heidelberg 2002

Some specific data networks, however, impose very different kinds of restrictions and requirements. Consider, for instance, a state-of-the-art hospital where each patient is monitored with heart strips, temperature sensors, and other network-enabled diagnostic sensors. In such a hospital, doctors could carry PDAs that would, on demand, receive customized health information from any patient to whom they have access. Data now must be efficiently and securely transmitted from source (patient) to sink (doctors) to avoid unnecessary use of battery life or shared wireless bandwidth. The nurses might request 30-minute updates from their patients with temperature and pressure readings, while the patient records database would have to include readings of each patient’s vital signs at 5-minute intervals. Some doctors could also choose to request notification from specific patients but only if blood pressure goes beyond a certain level. Data broadcast models would be very expensive, and traditional publish-subscribe models would prove to be very inefficient because of the constant changes in topology and the processing load required by the sensor to create customized data feeds for each subscriber. The problem can be attenuated with the use of a centralized data server as a relay point for patient data. Centralized data servers are a common solution in these types of data networks, but they require at least a partially fixed topology, where some nodes will always be present and available for data collection and aggregation.

Even more complex scenarios can be considered, where the network infrastructure is dynamic and the notion of a centralized, high capacity server for data collection, processing, and distribution is not realistic. This is the case we address in this paper. This work describes FlexFeed, a mobile-agent based framework for dynamic creation of flexible data distribution feeds in a sensor grid, which ensures that sensor resources are preserved, client capabilities are honored, communications bandwidth is minimized, and information release policies are enforced.
In section 2, we will introduce a general description of the problem and the requirements. We then present a scenario in section 3 to be used as basis for discussion and illustration of the issues and proposed solutions. In sections 4 and 5, we discuss the flexible feed framework and its components, followed by some implementation details, future work, and conclusions in section 6.
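The kind of per-subscriber customization discussed in the hospital example (periodic updates for nurses, threshold-triggered notifications for doctors) can be sketched as a small filter that decides whether a sensor reading is forwarded. This is an illustrative assumption only: `FeedFilter` and its parameters are hypothetical names and are not part of the FlexFeed framework described in this paper.

```java
/** Hypothetical per-subscriber filter; not part of the FlexFeed API. */
final class FeedFilter {
    private final long minIntervalMillis; // e.g., 30 minutes for a nurse's feed
    private final double threshold;       // forward only readings above this value
    private long lastSent;
    private boolean sentBefore = false;

    FeedFilter(long minIntervalMillis, double threshold) {
        this.minIntervalMillis = minIntervalMillis;
        this.threshold = threshold;
    }

    /** Returns true if the reading should be forwarded to the subscriber. */
    boolean accept(long timestampMillis, double value) {
        if (value <= threshold) {
            return false; // below the subscriber's notification level
        }
        if (sentBefore && timestampMillis - lastSent < minIntervalMillis) {
            return false; // too soon after the previous update
        }
        lastSent = timestampMillis;
        sentBefore = true;
        return true;
    }
}
```

A nurse's feed could be configured as `new FeedFilter(30 * 60 * 1000L, Double.NEGATIVE_INFINITY)` and a doctor's alert as `new FeedFilter(0L, 140.0)`. Where such a filter executes (on the sensor, the client, or an intermediate node) is exactly the placement question the framework addresses.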

2 Problem Description

Sensor network systems vary greatly in both type and scale. We can find systems that go from ocean buoys connected through satellite links to temperature and pressure sensors in a power or production plant [1]. In general, the capabilities and constraints of these systems are compatible with their applications. However, it is easy to find scenarios where unforeseen changes in goals, topology or policies might require extensive changes in configuration and operation of already deployed systems. Dynamically changing data flows on a large system to accommodate, for instance, changes in topology, is a very complex and expensive task, especially when system policies, performance, and limitations of nodes must also be addressed [2][3]. The framework we propose in this work is mainly concerned with sensor networks where the sensor nodes have constraints in communication, processing, and power capabilities. We assume that communication links between these sensors and the clients must be dynamically established to account for a continuously changing network topology.


We can summarize our set of requirements for the problem as follows:

A) Processing and communications loads in sensors should be minimized to extend their life expectancy. Some sensor grids rely on small remote sensor nodes that are battery powered and hard to replace. Preserving the battery life by reducing processing and communication to a minimum is of paramount importance.
B) Clients should not receive more data than requested to avoid overload. Usually, the clients are also under processing and communication constraints. Clients may vary from large systems with high-bandwidth connections to small portable devices such as PDAs and cell phones with intermittent connectivity.
C) Every data stream and service request might be constrained by a set of policies put in place by the sensor administrator, owners of the data, or the various networks traversed.
D) Some of the nodes in the network might have arbitrary life spans or intermittent connectivity, meaning that they can join and leave the framework at any time. In cases where the framework is aware of the fact that a node is about to become unavailable, routes can be proactively changed to accommodate the new topology. Sudden changes in availability should trigger a recovery algorithm to re-establish the broken routes.

Note that data representation and sensor abstraction [4][5], as well as the many different algorithms and techniques for competitive and complementary data aggregation, will be widely used as examples when describing the framework, but they are not the subject of this paper. Our initial implementation of these capabilities draws heavily on current research in the field [6][7][8]. We will present and discuss our proposal to dynamically deploy these processing and filtering capabilities on arbitrary nodes of the framework (not necessarily the client or the sensor) with the help of mobile agents.
As far as data distribution is concerned, most current research in sensor networks relies on the use of Active Networks for efficient routing of sensor data [7][9]. Although very effective in many instances, these approaches usually fail to coordinate this capability with flexible distribution of data processing tasks. The FlexFeed framework offers a different approach to the problem, using mobile agents not only for routing optimization but also for appropriate distribution of correlation, processing, and policy enforcement tasks. Prior work by Qi [10] also uses mobile agents for distributed sensor networks but focuses on the problem of distributed data integration. In our approach, we concentrate on bandwidth-efficient data distribution and policy-enforcement capabilities. The next section describes a scenario that relies upon the FlexFeed framework. Note that the applicability of the FlexFeed framework is much broader than the scenario presented.

3 Scenario

The following scenario is a modified version of an experiment in coalition military operations that relies upon the FlexFeed framework. The scenario involves data


M. Carvalho and M. Breedy

collection and delivery from a sensor grid in a fictitious combat situation between an imaginary country and a coalition of several allied countries, including the USA and Australia [11]. The scenario assumes that the US and Australia are allied and share information collected from a sensor bed that each country has deployed in the combat theater. Troops and commanders from the US can then access and request information feeds from sensors owned by Australia, and vice versa. Each country is assumed to have policies in place that restrict access to its more sensitive sensors and information. For example, Australia might not want to reveal the real capabilities of its image sensors to the US, so it would put into effect a policy that reduces the resolution of data feeds from those sensors.

The requirements presented in Section 2 also hold in this scenario. The transmission of data between sensors and clients must be arranged so that the minimum processing and transmission load is placed on the sensors. Clients requesting a specific data feed must not receive a higher-bandwidth stream than they can handle. Information release policies must be enforced, and the dynamic nature of the network must be accommodated.

We also assume in our fictitious scenario that nodes other than sensors and clients participate in the network and might provide higher processing and communication capabilities. These nodes could be used as traffic relay stations and as data transformation or filtering stations. They could be trucks, stationary troops, command and control centers, or even aircraft and helicopters in the area that lend their processing capabilities to the framework while within communications range.

Within this scenario, we can imagine a situation where a US commander requests a video feed from sensor A, owned by the US, and, a few minutes later, a commander from Australia makes another request for a video feed from the same sensor.
The US commander has equipment capable of receiving a 15 frames per second (fps) feed with a resolution of 640x480, while the Australian commander can only handle a feed of 5 fps with a resolution of 320x240. Each requests a feed at the maximum capacity of their receiver. One possible way to address the requirements of both commanders would be to have sensor A broadcast the more demanding feed (15 fps / 640x480) to both clients and then have the commander from Australia filter the feed down to their capabilities. This approach, however, would violate requirement B: if this feed were provided to the Australian commander, it would probably flood their network connection or freeze the client with the processing demands of image handling and conversion. Another possibility, as presented in Figure 1, would be to have the sensor create two different feeds, with different resolutions and frame rates, and unicast them to each of the commanders. This approach would violate requirement A by increasing the load on the sensor. Yet another solution would be to have the sensor send the 15 fps / 640x480 feed to the US commander and have that commander transform and relay the image to the Australian commander at a lower resolution and frame rate. This solution also presents problems: the commanders may not be allowed to communicate directly with each other, or the US commander may lack the processing power or software to convert the feed into the lower resolution feed to be relayed.
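The load implications of the first two alternatives can be made concrete with a little arithmetic. The following is a minimal sketch that uses the raw pixel rate (width x height x frames per second) as a rough, uncompressed proxy for load; the class and method names are hypothetical and introduced only for illustration, and real video feeds would be compressed, so the numbers are meaningful only relative to each other.

```java
// Raw pixel rate used as a rough, uncompressed proxy for the load a feed
// places on a node. All names here are illustrative assumptions.
final class FeedSpec {
    final int width;
    final int height;
    final int fps;

    FeedSpec(int width, int height, int fps) {
        this.width = width;
        this.height = height;
        this.fps = fps;
    }

    long pixelsPerSecond() {
        return (long) width * height * fps;
    }
}

final class ScenarioLoad {
    static final FeedSpec US_FEED = new FeedSpec(640, 480, 15);
    static final FeedSpec AU_FEED = new FeedSpec(320, 240, 5);

    // Broadcast of the high-rate feed: the Australian client receives the full US feed.
    static long broadcastLoadOnAustralianClient() {
        return US_FEED.pixelsPerSecond();
    }

    // Two unicast feeds: the sensor has to produce and send both streams.
    static long dualUnicastLoadOnSensor() {
        return US_FEED.pixelsPerSecond() + AU_FEED.pixelsPerSecond();
    }

    public static void main(String[] args) {
        System.out.println("US feed:  " + US_FEED.pixelsPerSecond() + " pixels/s");
        System.out.println("AU feed:  " + AU_FEED.pixelsPerSecond() + " pixels/s");
        System.out.println("Broadcast, load on AU client: " + broadcastLoadOnAustralianClient());
        System.out.println("Dual unicast, load on sensor: " + dualUnicastLoadOnSensor());
    }
}
```

Under this proxy the US feed amounts to 4,608,000 pixels per second and the Australian feed to 384,000, so broadcasting the high-rate feed would hand the Australian client twelve times the data it asked for.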


An acceptable solution would then be the creation of a temporary data distribution network to serve the needs of both groups. The system, faced with both requests, could have the sensor send the higher resolution feed to a relay point, perhaps a nearby truck, tank, or airplane, and have that node process the feed and create a separate stream for each of the commanders. If the relay point becomes unavailable, the system would then build a different path through other relay points to obtain equivalent results.

[Figure 1: four panels (A, B, C, E) showing data feeds among USA and Australia nodes.]

Fig. 1. Possible data feed scenarios between a single sensor and two agents involving data transformation.

The last solution would also provide a means to enforce policies that could be in place for the camera sensor. One possible policy for a sensor could limit the maximum resolution to 160x120, so the request from the Australian commander would have to be either denied or adjusted to the new resolution. The dynamic establishment of data links, as proposed here, raises some issues that require special attention. It relies on a system coordinator capable of receiving the requests and providing a solution for them within a reasonable amount of time. The person or group of people performing this task would have to check each request against a list of policies and modify it accordingly so that the correct information is provided. It also relies on the intermediary node's capability to properly convert and relay the image. All this must be done without interfering with the normal activities of the intermediate node, which might be performing other processing tasks at the same time.
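The "denied or adjusted" choice for a resolution limit can be sketched as follows. This is only an illustration under stated assumptions: the classes `Resolution` and `MaxResolutionPolicy` and the `OnViolation` modes are hypothetical names, not part of the FlexFeed or KAoS APIs, and a real policy check would cover more than resolution.

```java
import java.util.Optional;

// Hypothetical value type for a requested or permitted resolution.
final class Resolution {
    final int width;
    final int height;

    Resolution(int width, int height) {
        this.width = width;
        this.height = height;
    }
}

// Hypothetical policy: a maximum resolution that is either enforced by
// denying the request or by adjusting it down to the limit.
final class MaxResolutionPolicy {
    enum OnViolation { DENY, ADJUST }

    private final Resolution max;
    private final OnViolation onViolation;

    MaxResolutionPolicy(Resolution max, OnViolation onViolation) {
        this.max = max;
        this.onViolation = onViolation;
    }

    // Returns the resolution to serve, or empty if the request is denied.
    Optional<Resolution> apply(Resolution requested) {
        boolean allowed = requested.width <= max.width && requested.height <= max.height;
        if (allowed) {
            return Optional.of(requested);
        }
        if (onViolation == OnViolation.DENY) {
            return Optional.empty();
        }
        // ADJUST: clamp each dimension to the policy limit.
        return Optional.of(new Resolution(
                Math.min(requested.width, max.width),
                Math.min(requested.height, max.height)));
    }
}
```

With a 160x120 limit in `ADJUST` mode, a 320x240 request would be served at 160x120; in `DENY` mode the same request yields an empty result.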

4 The Flexible Feed Framework Design

The flexible feed framework proposed in this work tries to address all the issues presented in the scenario above by using mobile agents to dynamically establish data paths and enforce communication policies between nodes. Mobile agents are fundamental to this application for several reasons:

a) Mobile agents can move media conversion code to any node in the network, providing on-the-fly capabilities, not previously available on the specific host, to convert and modify the data feed.

b) Once the link is terminated, the agents can be discarded, thereby removing the transformation code and saving storage capacity in the intermediate nodes.

c) Although not required, a strong mobility framework can also provide transparent forced mobility, migrating processes and resources to alternative nodes to ensure survivability and provide load balancing. This capability can greatly simplify the design and implementation of custom agents.

The FlexFeed framework assumes that all the nodes in the network run an execution environment for the mobile agents and that the execution environment is trusted and secure. Each agent has access to the framework through a FlexFeedManager interface that provides methods to request and terminate data feeds and to send data and control messages. All of the underlying communication between agents is provided by the FlexFeedManager. The establishment and teardown of all communication links are controlled by the FlexFeedCoordinator, which is also implemented as a mobile agent. Our current design uses a single, centralized coordinator. This coordinator receives all of the feed requests and, based upon policy restrictions and network topology, configures the best path for data delivery. The algorithm for path selection can be very complex, depending on the level of sophistication of the system.
Ideally, the coordinator should be able to simulate the network and estimate the load on each node, calculating an optimum state from an observer model approach. A cost function must then be evaluated to decide whether migration to the optimum state is recommended. The algorithms for the system coordinator, as well as some distributed algorithms for path selection, are topics for future work. At this point, only a simple algorithm based on network topology, minimum sensor load, and policies has been implemented. Figure 2 provides an illustration of the whole framework and the communication process.

An agent acting as a client must implement the FlexFeedSink interface and the callback functions that the framework uses to deliver data received from sensor agents. An agent acting as a source should implement the FlexFeedSource interface. Media conversion agents and policy enforcement agents implement both interfaces, since they act as both sources and sinks of data. The actions performed by the policy enforcer and the transformation agent are similar; in fact, the policy enforcement agent is a particular kind of transformation agent. The difference is that the policy enforcer applies a transformation to enforce a policy for a specific data feed, while the transformation agent does so to satisfy a requirement from the client or coordinator.

For example, consider the scenario in Figure 2. Agent A arrives at host A and requests, from the FlexFeedCoordinator, a feed from the camera sensor attached to host C. The FlexFeedCoordinator checks the policies and determines that agent A has unrestricted access to the sensor (camera); it then sends a control message to the camera agent in host C to start providing the data feed to agent A (dotted line). After a few minutes, agent B arrives and requests the same data feed. To ensure that no design requirements are violated, the FlexFeedCoordinator dispatches a relay agent to host R with instructions to clone the traffic received and send it to both agents A and B. After launching the relay agent, the Coordinator tears down the communication between C and A and re-establishes it through R, satisfying the agents' requests and the requirements of the framework. If policies were in place, the Coordinator could dispatch a policy enforcement agent to host R instead of a relay agent. The policy enforcer would then apply the appropriate feed transformations before delivering the data to agents A and B.
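The relay step in this example can be sketched in a few lines of Java. This is a minimal illustration and not the FlexFeedCoordinator itself: the class `SimpleCoordinator`, its `requestFeed` method, and the string form of links (such as `"C->R"`) are hypothetical names introduced here, a single fixed relay host is assumed, and policy checks are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: keeps one outgoing stream per source by routing
// the second and later sinks through a single relay host.
final class SimpleCoordinator {
    private final String relayHost;
    private final Map<String, List<String>> sinksBySource = new HashMap<>();

    SimpleCoordinator(String relayHost) {
        this.relayHost = relayHost;
    }

    // Registers the request and returns the links that should exist afterwards.
    List<String> requestFeed(String source, String sink) {
        List<String> sinks = sinksBySource.computeIfAbsent(source, k -> new ArrayList<>());
        sinks.add(sink);

        List<String> links = new ArrayList<>();
        if (sinks.size() == 1) {
            // A single client is served directly by the source.
            links.add(source + "->" + sink);
        } else {
            // Several clients: the source feeds the relay once and the relay fans out.
            links.add(source + "->" + relayHost);
            for (String s : sinks) {
                links.add(relayHost + "->" + s);
            }
        }
        return links;
    }
}
```

Calling `requestFeed("C", "A")` on a coordinator built with relay host `"R"` yields the single link `C->A`; a following `requestFeed("C", "B")` yields `C->R`, `R->A`, and `R->B`, so the direct link is replaced and the source still sends only one stream.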

[Figure 2: the FlexFeedCoordinator, hosts A, B, C, and R, the camera agent, the relay agent, and agents A and B.]

Fig. 2. A schematic view of the flexible feed framework.

One of the advantages of using mobile agents for the framework is their ability to adapt dynamically to the constantly shifting network topology while providing uninterrupted delivery of the data stream. In the previous example, host R can leave the network at any time, and the data flow then has to be re-routed through other nodes that must also guarantee the same restrictions imposed by the code running on host R. In this case, since the relay agent running on host R is implemented as a mobile agent, it can simply migrate to another node and transparently redirect its connections with no apparent interruption of the data feed. Figure 3 illustrates a scenario where our relay host (host R) is an airplane that is just about to leave communications range. In that case, the Coordinator can move the relay agent to a nearby truck or to any other idle system available in the network.
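One simple way to pick the node that takes over from a departing relay host is to choose the least loaded host that remains in range. The sketch below assumes the coordinator has a load estimate per host; the class `RelayPlacement`, the method `chooseReplacement`, and the load values are hypothetical, and the paper does not specify the selection rule actually used.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: selects a new host for a relay agent when its
// current host is about to leave communications range.
final class RelayPlacement {
    // loadByHost maps each host currently in range to an estimated load
    // (lower means more idle). Returns empty if no other host is available.
    static Optional<String> chooseReplacement(String departingHost, Map<String, Double> loadByHost) {
        String best = null;
        double bestLoad = Double.MAX_VALUE;
        for (Map.Entry<String, Double> entry : loadByHost.entrySet()) {
            if (entry.getKey().equals(departingHost)) {
                continue; // never migrate back onto the host that is leaving
            }
            if (entry.getValue() < bestLoad) {
                best = entry.getKey();
                bestLoad = entry.getValue();
            }
        }
        return Optional.ofNullable(best);
    }
}
```

If no replacement is found, the coordinator would have to fall back on another arrangement, for example serving the clients directly at the cost of a higher sensor load.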

[Figure 3: two panels, each showing a Sensor Agent, a Policy Enforcer, and a Client Agent.]

Fig. 3. Using mobility to ensure data feed survivability in dynamic networks. If both the aircraft and support vehicle provide a strong mobility execution environment, the policy enforcer agent can be transparently moved between the hosts.

5 System Implementation

The FlexFeed framework relies on two core technologies: the NOMADS [12] agent system, which provides strong and weak mobility, resource control, and resource redirection, and the KAoS [13][14] framework, which ensures policy management and enforcement. The framework consists of a Java API that allows agents to register and join as sensors, clients, relay nodes, or a combination of these. At this time, the framework has been implemented and tested. It is currently being used to support a military coalition agents scenario for CoAX 2002 [11], a DARPA-sponsored experiment.

All agents joining the framework obtain an instance of the FlexFeedManager, which acts as a proxy for the agent in the framework. The FlexFeedManager provides methods for the agent to request and terminate sensor data feeds, as well as methods to return data objects and notifications. Figure 4 shows the FlexFeedManager interface. One of the main objectives of the framework is to abstract the route selection and the actual source of data from the agent, optimizing feed requests and ensuring policy enforcement.

There are two main interfaces that a node can implement when joining the framework: the FlexFeedSink and FlexFeedSource interfaces (Figure 5). The interface indicates the role played by the agent within the framework. The FlexFeedSink interface defines the agent as a sink node, that is, a node that will receive data from the framework. By implementing this interface, the agent provides the actual mechanism to handle the received data messages. Each data packet received is tagged with a unique identifier that allows the client to handle multiple feeds simultaneously.
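The role of the feed identifier can be illustrated with a sink that keeps a separate buffer per feed. This is a sketch under stated assumptions: `SinkCallbacks` is a local stand-in that only mirrors the two callback signatures of FlexFeedSink so that the example compiles on its own, and `MultiFeedSink`, `expectFeed`, and `pending` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Local stand-in mirroring the two callbacks of the FlexFeedSink interface.
interface SinkCallbacks {
    void updateReceived(String sourceAgentName, String feedId, Object data);
    void feedTerminated(String feedId);
}

// Hypothetical client that demultiplexes updates from several feeds by feedId.
final class MultiFeedSink implements SinkCallbacks {
    private final Map<String, List<Object>> buffers = new HashMap<>();

    // Called by the client after a feed request returns its identifier.
    void expectFeed(String feedId) {
        buffers.put(feedId, new ArrayList<>());
    }

    @Override
    public void updateReceived(String sourceAgentName, String feedId, Object data) {
        List<Object> buffer = buffers.get(feedId);
        if (buffer != null) {
            buffer.add(data); // updates for unknown or terminated feeds are dropped
        }
    }

    @Override
    public void feedTerminated(String feedId) {
        buffers.remove(feedId);
    }

    // Number of updates buffered for the given feed (0 if the feed is unknown).
    int pending(String feedId) {
        List<Object> buffer = buffers.get(feedId);
        return buffer == null ? 0 : buffer.size();
    }
}
```

Because the identifier, rather than the source name, selects the buffer, the same client can hold two feeds from one sensor with different parameters without mixing their data.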


    package edu.uwf.nomads.flexfeed;

    import java.io.Serializable;

    public interface FlexFeedManager extends Serializable
    {
        public void register (String sourceAgentId, FeedParams fp, FlexFeedNode node);
        public void sendUpdate (String feedId, Object data);
        public String requestFeed (String sourceAgentId, FeedParams fp, FlexFeedSink target);
        public void stopFeed (String feedId);
    }

Fig. 4. The FlexFeedManager Interface

    package edu.uwf.nomads.flexfeed;

    public interface FlexFeedSink extends FlexFeedNode
    {
        public void updateReceived (String sourceAgentName, String feedId, Object data);
        public void feedTerminated (String feedId);
    }

    public interface FlexFeedSource extends FlexFeedNode
    {
        public void startFeed (String sinkAgentName, FeedParams fp, String feedId);
        public void stopFeed (String feedId);
    }

Fig. 5. The FlexFeedSink and FlexFeedSource Interfaces.

The whole process of optimizing the path for data delivery and providing policy enforcement is transparent to the agent. From the client agent's perspective, a request for a feed is made and data packets start arriving. Figure 6 shows sample code for a simple sink agent (or client). When a client agent requests a feed using the FlexFeedManager's "requestFeed" method (Figure 6), a control message is delivered to the FlexFeedCoordinator. The control message contains information about the source node and the parameters of the feed, encapsulated within the FeedParams object. The coordinator receives the control message and checks overall domain policies and the current network topology to decide how to optimally provide the feed. Depending on the request and the current state of the system, the coordinator either dispatches a media conversion agent to an intermediate node or leverages the feed from some previously established link.


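    public class ArabelloClient implements FlexFeedSink
    {
        public ArabelloClient()
        {
            // obtain the manager that acts as this agent's proxy in the framework
            FlexFeedManager _ffManager = FlexFeedSystem.getFlexFeedManager();
            _ffManager.register (_agentName, _fp, this);
            VideoFeedParams vfp = new VideoFeedParams (4, 50);
            // the returned identifier tags every update of this feed
            _sfeedId = _ffManager.requestFeed (sSensorName, vfp, this);
        }

        public void updateReceived (String srcName, String feedId, Object data)
        {
            // handle data received from node "srcName"
        }
    }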

Fig. 6. A simple sink agent (client).

Once the feed is established, the client agent starts receiving FlexFeedDataMessage objects from the framework, encapsulated in update messages. As far as the client agent is concerned, these are responses to the feed request, regardless of the path or media transformations put in place by the FlexFeedCoordinator. The first version of the Coordinator relies on a simple rule-based system.

The FlexFeed framework has been designed to transparently support three different message passing mechanisms. Messages can be exchanged between agents using the KAoS framework [13][14], the CoABS Grid system [15] (where the enforcement of policies would not be available), or TCP sockets. The transport mechanism is handled at the framework level and is completely encapsulated behind the agent interfaces, so no code changes are necessary to switch between them. The desired transport mechanism is determined from a system property set in the Java VM. The choice of transport mechanism mainly affects the portability of the framework and its ability to enforce policies.

5.1 Media Conversion Agents

The framework is designed to facilitate the process of creating and deploying sources, sinks, and media transformation agents, to ensure customization of policies and flexibility. A media conversion agent, sometimes referred to as a transformation agent, is both a sink and a source. It implements both interfaces and provides the appropriate data conversion algorithm between source and sink. A set of standard media conversion agents, developed primarily for video feeds, is available within the framework. A brief discussion of some of these agents is presented here for illustration purposes.
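The dual role of a conversion agent can be sketched as an object that receives updates like a sink and pushes converted updates onward like a source. The example below is only an assumption-laden illustration: `FeedSink` is a local stand-in for the data callback of FlexFeedSink, `ConversionAgent` is a hypothetical class, and the conversion is passed in as a plain function rather than implemented as a real media transformation.

```java
import java.util.function.Function;

// Local stand-in for the data callback of the FlexFeedSink interface.
interface FeedSink {
    void updateReceived(String sourceAgentName, String feedId, Object data);
}

// Hypothetical conversion agent: a sink towards the upstream source and a
// source towards the downstream sink.
final class ConversionAgent implements FeedSink {
    private final String agentName;
    private final Function<Object, Object> conversion;
    private final FeedSink downstream;

    ConversionAgent(String agentName, Function<Object, Object> conversion, FeedSink downstream) {
        this.agentName = agentName;
        this.conversion = conversion;
        this.downstream = downstream;
    }

    @Override
    public void updateReceived(String sourceAgentName, String feedId, Object data) {
        // Convert each update and pass it on under this agent's own name.
        downstream.updateReceived(agentName, feedId, conversion.apply(data));
    }
}
```

From the downstream client's point of view the conversion agent is simply the source of the feed, which matches the idea that the path and any transformations remain hidden from the client.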


Resolution Reduction Agents

The Resolution Reduction agents are designed to reduce the resolution of data feeds. They take, for instance, images or video feeds and transform them, sending the new resolution images to the sink. Resolution reduction might be necessary to satisfy client requirements, to enforce policies, or both. Figure 7 shows an illustration of this scenario, including the control messages represented by dashed lines.
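As a concrete illustration of what such an agent might do to each frame, the sketch below halves the resolution by keeping every second pixel in each dimension. This nearest-neighbour decimation is a technique chosen here for brevity, not necessarily the one used by the FlexFeed agents, and the class `ResolutionReducer` and the `int[][]` frame representation are assumptions.

```java
// Illustrative only: reduces a frame's resolution by keeping one pixel out
// of every `factor` pixels in each dimension (nearest-neighbour decimation).
final class ResolutionReducer {
    // frame is indexed as frame[row][column]; factor 2 turns 640x480 into 320x240.
    static int[][] decimate(int[][] frame, int factor) {
        if (factor < 1) {
            throw new IllegalArgumentException("factor must be >= 1");
        }
        int rows = frame.length / factor;
        int cols = rows == 0 ? 0 : frame[0].length / factor;
        int[][] reduced = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                reduced[r][c] = frame[r * factor][c * factor];
            }
        }
        return reduced;
    }
}
```

A production agent would more likely average neighbouring pixels or operate on the compressed stream, but the interface is the same: one frame in, one smaller frame out.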

[Figure 7: the FlexFeedCoordinator dispatches a Resolution Reduction policy enforcer to host R; agent A requests a 640x480 video feed and receives a 320x240 feed, while trusted agent C receives the 640x480 feed from the camera agent.]

Fig. 7. Sample application of the Resolution Reduction agent for policy enforcement.

In this example, an agent (C) has no policy restrictions applied to it and receives a direct, high resolution feed from the sensor agent (thin dotted line). When a policy-restricted agent (A) requests a high resolution feed from the same sensor, the coordinator has to dispatch a Resolution Reduction Agent to host (R) to provide the feed within the policy constraints. Assuming that policies allow a maximum resolution of 320x240 for a client (agent A), a Resolution Reduction Agent would be dispatched to reduce the high resolution feed (640x480) to the lower resolution feed (320x240). It is important to note that agent (C) should still receive the high resolution feed. To avoid two independent streams from agent (B), a single high resolution stream is provided to the transformation agent, which simply forwards the traffic to agent (C) and converts the media to be sent to agent (A).

Sampling-Rate Reduction Agents

Similar to the Resolution Reduction Agent, the Sampling-Rate Reduction agent can be used to reduce the frequency of the signal provided by the sensor. This, too, might be due either to client requirements or to policy restrictions. A variation of this agent is also provided to add delays or latency to the signal generated at the sensor. This way, if there are policies in place to prevent access to data in real time, this agent could be launched to relay the images to the client with a fixed time delay and/or a lower sampling rate.

Custom Filter Agents

Many other types of agents can be designed and deployed in the framework. For example, some Custom Filter agents that we have developed for the FlexFeed framework are responsible for detecting overall changes in image feeds. These agents are dispatched into the framework and receive video feeds from image sensors, comparing each image with subsequent ones until a significant change is detected. The agent then sends a message to the client with the image from the sensor. Another group of agents that fits in this category are the data fusion agents, which capture and correlate data from different sensors to build a customized data set for the client. Data fusion agents are still under development for the FlexFeed framework.

5.2 The NOMADS Agent System

One of the main challenges faced by the proposed framework is the heterogeneous and dynamic nature of the network. The size and type of nodes vary greatly, from simple miniature cameras or clients running on small PDAs to full computers or servers. Adopting an agent system that can seamlessly migrate and take advantage of each node's capabilities is a key factor. There is no requirement for a specific agent system, although some features, such as strong mobility, resource redirection, and extended portability, are desirable. The NOMADS agent system provides a rich combination of features that fully address this purpose. NOMADS is composed of two distinct execution environments, Oasis and Spring, which have different capabilities but are fully integrated and compatible with each other. The Oasis execution environment relies on the custom-written, Java-compatible Aroma VM [16], which provides a means for resource control and strong mobility. This execution environment can be deployed on a range of architectures and operating systems.
The Spring execution environment is the pure Java version of Oasis. It extends NOMADS portability to a large range of devices, including portable and pocket computers. Spring runs on many implementations of the Java VM, including the Personal Java VM, and is now being ported to Sun's KVM. One of the great advantages of NOMADS for this type of application is that NOMADS agents can move transparently between Spring and Oasis environments, taking advantage of Spring's portability and Oasis's resource control and strong mobility as needed. Taking our scenario into account, the transformation agent could be launched by the FlexFeedCoordinator to an Oasis environment available on a nearby tank. As Oasis provides both strong mobility and transparent redirection of resources, the agent could transparently move between trucks, airplanes, command and control centers, or any other Oasis-enabled system without disruption of the feed.


If needed, the agent could also move to a Spring-enabled environment running on a hand-held device and re-establish operation from there, assuming that resources are available. The communication restrictions imposed by the policies are handled in a higher layer by the KAoS framework, so, regardless of the execution environment, all restrictions related to communication between agents would remain in place. For most scenarios, strong mobility is not a requirement, but it greatly facilitates the design and implementation of the agents. The FlexFeed framework allows voluntary as well as forced mobility of processing agents. It supports the use of simple conversion or policy enforcement agents and still ensures (through forced mobility) some level of survivability and load balancing.

The NOMADS agent system also supports resource redirection. This is an important feature for the FlexFeed application and allows, for instance, the persistent network connectivity mentioned in the previous example. Transparent redirection of network connectivity is supported by Mobile Sockets (Mockets) [17], provided as part of the NOMADS agent system. Mockets provide a wrapper layer on top of standard Java sockets that ensures transparent connection mobility when the agent moves to another host. No trace is left behind to relay traffic, which ensures that after an agent moves from host (A) to host (B), even if host (A) goes offline, there will be no interruption of network connectivity between the agents. Mockets are an important factor for link survivability and load balancing in the FlexFeed framework. They can be used with both the strong and weak mobility environments of NOMADS. Additional types of resource mobility, such as RMI sessions and disk access, are currently under development for the NOMADS platforms and will add even more robustness and flexibility to the framework.
5.3 The KAoS Framework

KAoS provides a new paradigm for policy definition and enforcement in multi-agent environments [14]. It has been designed to support, amongst other things, explicit reusable communication policies that represent recurring patterns of interaction in agent communication, rather than particular syntactic message patterns. The framework provides a high-level policy layer that governs security, resource management, mobility, registration, access control, and obligation management for domains, hosts, and individual agents. KAoS's role in the framework is to provide high-level access to the establishment of policies that regulate agent interaction. It provides a means to abstract the policies from the agents and to simplify agent development and deployment [13]. In the FlexFeed framework, mobile agents are used as distributed points of enforcement for policies and restrictions.

6 Summary and Future Work

The current implementation of the FlexFeed framework relies on a centralized administrative authority at the domain level to build and deploy the agents that establish the temporary data feeds. The distributed nature of the system ideally requires a decentralized algorithm for feed establishment. The study of a decentralized approach for agent negotiation of data feeds is the subject of future work and a continuation of this effort.

Most of our examples for the FlexFeed framework have been tested with video feeds requested from specific sensors. We believe, though, that with proper implementation of sensor and data abstraction, the framework can be easily extended to most types of sensor networks. The continuation of this work will focus on the design of the framework coordinator and the integration of directed diffusion for data-centric networks [4][18]. Data-centric networks rely on the notion that the identification of the data can be abstracted from its actual source or location. Data correlation amongst sensors and data fusion of information are interesting research topics [1][19][20] and can be explored and studied in the context of flexible feed networks.

Another important research topic that will be explored in future work is the design of distributed algorithms to coordinate policy enforcement. We can foresee the need for policy enforcers that evaluate and approve each agreement between agents, ensuring that it is in accordance with the policies. In general terms, the policy infrastructure could be moved from a centralized domain manager to a set of distributed nodes that determine and enforce local policies on behalf of the domain administrator. Many other issues are still open in this area, such as directory services, security, and performance. We intend, in the future, to study and evaluate their applicability to flexible feeds.

The FlexFeed framework is now moving to the next level of development. The core infrastructure and APIs are implemented. Conceptually, mobile agents appear to be an appropriate fit for the problem. We are still addressing some performance issues and improving system capabilities, but the preliminary results from test and demo implementations are very encouraging.

Acknowledgements. This work has been prepared through participation in the Advanced Decision Architectures Collaborative Technology Alliance sponsored by the U.S. Army Research Laboratory under Cooperative Agreement DAAD19-01-20009, and was supported in part by the DARPA Control of Agent-based Systems (CoABS) Program.

References

1. Estrin, D., Pister, D., Sukhatme, G.: "Connecting the Physical World with Pervasive Networks." Pervasive Computing, IEEE, 2002.
2. Minar, N., Kramer, K., Maes, P.: "Cooperating Mobile Agents for Dynamic Network Routing." (1999) Available online at: http://www.media.mit.edu/~nelson/research/routes
3. Carzaniga, A., Picco, G., Vigna, G.: "Designing Distributed Applications with Mobile Code Paradigms." Proceedings of the 19th International Conference on Software Engineering, Boston, MA, USA, 1997.
4. Intanagonwiwat, C., Govindan, R., Estrin, D.: "Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks." Proceedings of the 6th Annual International Conference on Mobile Computing and Networking, ISBN: 1-58113-197-6 (2000).
5. Guibas, L.: "Sensing, Tracking, and Reasoning with Relations." IEEE Signal Processing Magazine, March (2002).
6. Iyengar, S.S., Jayasimha, D.N.: "A Versatile Architecture for the Distributed Sensor Integration Problem." IEEE Transactions on Computers, Vol. 43, No. 2, February, 1994.
7. Busse, I., Covaci, S., Leichsenring, A.: "Autonomy and Decentralization in Active Networks: A Case Study for Mobile Agents." Proceedings of IWAN'99, LNCS 1653, Springer, 1999.
8. Estrin, D., Govindan, R., Heidemann, J., Kumar, S.: "Next Century Challenges: Scalable Coordination in Sensor Networks." Mobicom '99, Seattle, Washington, USA (1999).
9. Michahelles, F., Samulowitz, M., Schiele, B.: "Detecting Context in Distributed Sensor Networks by Using Smart Context-Aware Packets." Proceedings of ARCS'2002, pp. 34-47, LNCS 2299, Springer, April, 2002.
10. Qi, H., Iyengar, S., Chakrabarty, K.: "Distributed Multiresolution Data Integration Using Mobile Agents." IEEE Transactions on Systems, Man, and Cybernetics Part C: Applications and Reviews, vol. 31, no. 3, pp. 383-391, 2001.
11. DARPA's Coalition Agents eXperiment (CoAX). Available online at: http://www.aiai.ed.ac.uk/project/coax/
12. Suri, N., Bradshaw, J.M., Breedy, M.R., Groth, P.T., Hill, G.A., Jeffers, R.: "Strong Mobility and Fine-Grained Resource Control in NOMADS." Proceedings of the 2nd International Symposium on Agent Systems and Applications and the 4th International Symposium on Mobile Agents (ASA/MA 2000), Zurich, Switzerland. Berlin: Springer-Verlag.
13. Bradshaw, J., Suri, N., Kahn, M., Sage, P., Weishar, D., Jeffers, R.: "Terraforming Cyberspace: Toward a Policy-Based Grid Infrastructure for Secure, Scalable, and Robust Execution of Java-Based Multi-Agent Systems." IEEE, 2001.
14. Bradshaw, J.M. et al.: "KAoS: Toward an Industrial-Strength Generic Agent Architecture." Software Agents, AAAI Press/MIT Press, Cambridge, Mass., 1997, pp. 375-418.
15. Thompson, C.: "Characterizing the Agent Grid." (1998) Available online at: http://www.objs.com/agility/tech-reports/9812-grid.html
16. Suri, N., Bradshaw, J.M., Breedy, M.R., Ford, K.M., Groth, T., Hill, G.A., Saavedra, R.: "State Capture and Resource Control for Java: The Design and Implementation of the Aroma Virtual Machine." White Paper. http://nomads.coginst.uwf.edu
17. Mitrovich, T., Ford, K., Suri, N.: "Transparent Redirection of Network Sockets." Available online at: http://nomads.coginst.uwf.edu/mockets.pdf
18. Hill, J., Szewczyk, R., Woo, A., Hollar, S., Culler, D., Pister, K.: "System Architecture Directions for Networked Sensors." ACM Press, ISSN 0163-5980 (2000).
19. Hillman, R., Hanna, J., Walter, M.: "Modeling the Joint Battle Infosphere." http://www.dodccrp.org/6thICCRTS/Cd/Tracks/Papers/Track7/128_tr7.pdf
20. United States Air Force Scientific Advisory: "Report on Building the Joint Battlespace Infosphere." Volumes 1 and 2, SAB-TR-99-02, Dec, 1999.

Physical Mobility and Logical Mobility in Ubiquitous Computing Environments

Ichiro Satoh
National Institute of Informatics / Japan Science and Technology Corporation
2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
Tel: +81-3-4212-2546 Fax: +81-3-3556-1916

Abstract. This paper presents a framework for building context-aware applications in ubiquitous and mobile computing settings. The framework provides people, places, and things with computational functionalities to support and annotate them. It is unique among existing systems because the functionalities are implemented by mobile agents. Using location-tracking systems, this framework can navigate mobile agents to stationary or mobile computers near the locations of the entities and places to which the agents are attached, even when the locations change. The framework provides a way for mobile agents to follow their users as they move about and to adhere to places as virtual Post-its. A prototype implementation of the framework has been built on a Java-based mobile agent system and tested with several practical applications, including follow-me applications and a user-navigation system.

1 Introduction

Ubiquitous computing and mobile computing will be key areas in future computing. However, the two approaches have their own advantages and disadvantages. The concept of ubiquitous computing implies computation with elements that are contained in the environment rather than carried on the person. Various computing and sensing devices are in fact already present in almost every room of a modern building or house and in many of the public facilities of cities. They may now be disappearing inside all sorts of appliances and thus integrating with every aspect of life. This demonstrates the suitability of ubiquitous computing for providing environmental information and services. However, this approach is not suited to providing multiple-purpose and personalized services, because the devices embedded in various items within the environment tend to have limited storage and processing capacities. They are thus incapable of internally maintaining a variety of software and profile databases on the potential users. This approach may also raise serious privacy issues, because a ubiquitous computing environment would be able to monitor the preferences and locations of individuals.

On the other hand, the concept of mobile computing means that computing devices, for example, notebook PCs, PDAs, and wearable computers, are carried by users rather than contained within the environment. Recently, portable computing devices have become very small and powerful, giving their users access to a variety of applications in personalized form, regardless of the users' locations. Each of these devices is intended to stay with a particular user, so the user's profile can be maintained in the portable device and can easily evolve over time, without having to be transferred from place to place in an external environment. Therefore, the mobile computing approach provides both personalization and privacy. However, its users are forced to carry devices, such as PCs, PDAs, and smart-phones, which may not be light and may only have small screens and cramped keyboards. Moreover, this approach is not suitable for context-dependent services because it is difficult for a portable device to sense its environment.

N. Suri (Ed.): MA 2002, LNCS 2535, pp. 186–201, 2002. © Springer-Verlag Berlin Heidelberg 2002

The two approaches are posed as polar opposites. We have attempted to alleviate the disadvantages of each approach by using the advantages of the other. Therefore, this paper presents a location-aware framework, called SpatialAgent, in which mobile agent technology is applied to provide a bridge between the two approaches. This framework enables mobile agents to be spatially bound to people, places, and things, which the agents support and annotate. Location-tracking systems are used within the framework to migrate such agents to stationary and mobile computing devices that are near the locations of the entities and places to which the agents are attached, even when the locations of the entities change.

Several ways of reducing the disadvantages of both approaches have been explored. AT&T's Sentient Computing [3], for example, proposed a so-called follow-me application to support the provision of personalized services in ubiquitous computing settings. With HP's Cooltown [6], mobile computing devices such as PDAs and smart phones are attached to positioning sensors to provide location-awareness to web-based applications running on the devices. In contrast to these approaches, the framework presented in this paper does not distinguish between mobile and ubiquitous computing.
Since mobile agents can travel between computers, the framework can naturally map the movements of physical entities such as people and objects to the movements of mobile agents in mobile and ubiquitous computing systems. In the remainder of this paper, we describe our design goals (Section 2), the design of our framework, called SpatialAgent, and a prototype implementation of the framework (Section 3). We also discuss our experience with several applications that we developed by using the framework (Section 4), and briefly review related work (Section 5). We briefly discuss some future issues (Section 6) and provide a summary (Section 7).

2 Approach

The framework presented in this paper aims to enhance users (particularly mobile users), things (including computing devices and non-electronic objects), and places (such as rooms, buildings, and cities) with computational functionalities.

2.1 Locating Systems

Our goal is to offer a location-aware system in which spatial regions can be determined to within a few square feet, so that one or more portions of a room or building can be distinguished. The framework itself is designed to be independent of any particular locational infrastructure and can work with more than one locating system. It determines the positions of objects by identifying the spatial regions that contain the objects. In general, such locating systems consist of RF (radio frequency) or infrared sensors, which detect the presence of small RF or infrared transmitters, often called tags, each of which periodically transmits a unique identifier. The framework assumes that physical entities and places are equipped with their own unique tags so that they are automatically locatable.

The framework consists of two parts: (1) mobile agents and (2) location information servers, called LISs. The former offer application-specific services, which are attached to physical entities and places, as collections of mobile agents. The latter provide a layer of indirection between the underlying location-sensing systems and mobile agents. Each LIS manages more than one sensor and provides the agents with up-to-date information on the state of the real world, such as the locations of people, places, and things, and the destinations that the agents should migrate to.

2.2 Application-Specific Services

This framework enables application-specific services to be implemented as mobile agents. Mobile agent technology also has the following advantages in ubiquitous and mobile computing settings.

– After arriving at its destination, a mobile agent can continue working without losing the work results it had at the source computer, for example the contents of instance variables in the agent's program. Thus, the technology enables us to easily build follow-me applications as proposed by Cambridge University [3].
– Mobile and ubiquitous computers often have only limited resources, such as fixed levels of CPU power and restricted memory. Mobile agents can help to conserve these limited resources, since each mobile agent needs to be present at a computer only when the computer needs the services provided by that agent.
– Each mobile agent is locally executed on the computing device it is visiting and is able to directly access the various equipment belonging to that device, as long as the security mechanisms of the device permit this.

In this framework, each mobile agent can be tied to a radio-ID or infrared-ID tag attached to a person, place, or thing in the physical world.

2.3 Narrowing the Gap between Physical and Logical Mobility

This framework can inform mobile agents attached to tags about their proper destinations according to the current position of the tags. We call computing devices that can execute mobile agent-based applications agent hosts. This framework permits agent hosts to be mobile or stationary, but each host needs to be equipped with its own tag and must advertise its profile information to the LISs that detect the tag. The framework supports two types of linkages between a physical entity or place and one or more mobile agents:

– The framework binds one or more mobile agents to a tag, which is attached to a moving entity such as a user or a non-electronic object. When a tagged entity moves within a place, the framework prompts the agents that are bound to the moving entity to move to appropriate stationary hosts within the same place, as shown in Fig. 1.
– The framework allows physical places to have their own agents, which support location-dependent services. When a user with network-enabled computing devices is in a given place, the framework instructs the agents that are attached to the place to migrate to the visiting devices, where they provide the location-dependent services of the place, as shown in Fig. 2.

This framework permits a combination of both forms of linkage, while existing related work, such as AT&T's Sentient Computing and HP's Cooltown, supports only one of them. In addition, the framework does not distinguish between mobile and stationary devices. In the framework, multiple sensors do not have to be neatly distributed in a space such as a room or building to completely cover it; instead, they can be placed near one or more agent hosts, and the coverage of sensors can overlap.

[Figure 1: two steps showing a user with a tag moving from cell 1 to cell 2, each cell having a sensor and a stationary computer (agent host), and the migration of the agent attached to the moving user.]

Fig. 1. Migration of an agent, which is attached to a moving entity, to a computer at the current location of the entity

[Figure 2: two steps showing a mobile agent attached to a place, running on an invisible computer (agent host), and its migration to the PDA (agent host) of a user who moves into the cell.]

Fig. 2. Migration of an agent, which is attached to a particular place, to a computer visiting that place

2.4 Design Principles

In addition to achieving the goals presented above, the framework has the following advantages:

Autonomy: When an LIS detects the movement of a tag in the physical world, it informs agents bound to the tag about the network addresses and the capabilities of one or more candidate destinations that the agents should visit, but the LIS itself does not send agents to a destination. Each of these agents selects one host among the candidate destinations recommended by the LIS and migrates to the selected host, since it is an autonomous entity. Moreover, when the capabilities of a candidate destination do not satisfy all the requirements of an agent, the agent itself should decide, on the basis of its own configuration policy, whether or not it will migrate to the destination and adapt itself to the destination's capabilities.

Scalability: Our final goal is widespread building-wide and city-wide deployment. It is almost impossible to deploy and administer a system in a scalable way when all of the control and management functions are centralized. Our framework consists of multiple servers, which are connected to one another in a peer-to-peer manner. Each LIS only maintains up-to-date information on the identifiers of tags that are present in the specific places it manages, instead of on tags in the whole space.

Extensibility: LISs and agent hosts may be dynamically deployed and frequently shut down. The framework permits each LIS to run independently of the other LISs and offers an automatic mechanism for the registration of agent hosts. The mechanism requires agent hosts to be equipped with tags so that they are locatable and can advertise their capabilities.

Reconfigurability: In the framework, not only portable components but also system components, such as the sensors and agent hosts, are movable. As a result, it is almost impossible to maintain a geographical model of the whole system.
To solve this problem, the framework provides a demand-driven mechanism for discovering the agents and agent hosts that are required; the mechanism was inspired by ad-hoc mobile networking technology [12].

Modularity and Application-Independence: The framework should be as independent as possible of the underlying sensor technologies and mobile agent systems. This minimizes the effects of the distribution and heterogeneity of the underlying locating infrastructure on the applications. The framework itself is independent of application-specific tasks because such tasks are performed within mobile agents.

Personalization and Privacy: The framework only maintains per-user profile information within those agents that are bound to the user. It promotes the movement of such agents to appropriate hosts near the user in response to the user's movement. Thus, the agents do not leak profile information on their users to other parties and can interact with their mobile users in personalized forms that have been adapted to the respective individual users.
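The autonomy principle stated in this section can be illustrated with a short Java sketch. It is only an illustration under stated assumptions, not the framework's implementation: the names HostCandidate, MigrationPolicy, and chooseDestination are hypothetical, and capabilities are assumed to be plain sets of strings. The point it shows is that the LIS merely supplies candidates, while the agent's own policy decides whether to migrate to a host that does not meet every requirement.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical candidate destination, as it might be recommended by an LIS.
class HostCandidate {
    final String address;
    final Set<String> capabilities;

    HostCandidate(String address, String... capabilities) {
        this.address = address;
        this.capabilities = new HashSet<String>(Arrays.asList(capabilities));
    }
}

// Hypothetical agent-side policy: the agent, not the LIS, picks the destination.
class MigrationPolicy {
    private final Set<String> required;
    private final boolean adaptToPartialMatch;

    MigrationPolicy(Set<String> required, boolean adaptToPartialMatch) {
        this.required = required;
        this.adaptToPartialMatch = adaptToPartialMatch;
    }

    // Returns the chosen host, or null if the agent decides to stay where it is.
    HostCandidate chooseDestination(List<HostCandidate> candidates) {
        HostCandidate best = null;
        int bestMatched = 0; // hosts that match no requirement are never chosen
        for (HostCandidate candidate : candidates) {
            int matched = 0;
            for (String requirement : required) {
                if (candidate.capabilities.contains(requirement)) {
                    matched++;
                }
            }
            if (matched == required.size()) {
                return candidate; // satisfies every requirement
            }
            if (matched > bestMatched) {
                bestMatched = matched;
                best = candidate;
            }
        }
        // No host satisfies everything: migrate to the best partial match
        // only if the agent's policy allows it to adapt itself.
        return adaptToPartialMatch ? best : null;
    }
}
```

With a policy that forbids adaptation, an agent requiring both a display and a speaker would stay put when only a display-only host is offered; with adaptation enabled, it would move to that host and be expected to adjust its services.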

3 Design and Implementation

This section presents the design of the SpatialAgent framework and describes a prototype implementation of the framework. Fig. 3 shows the basic structure of the framework.

[Figure 3: two location servers (A and B), each comprising a profile handler, a directory database, an event handler, and an abstraction layer, linked by peer-to-peer communication; locating sensors, tags, and agent hosts running MobileSpaces in cells 1 to 3; a desklamp-bound agent and a user-bound agent; agent migration following user migration between cells.]

Fig. 3. Architecture of the SpatialAgent Framework

3.1 Location Information Server

Each LIS can run on a stationary or mobile computer and provides the following functionality:

Management of Locating Sensors: Each LIS manages multiple sensors that detect the presence of tags and maintains up-to-date information on the identities of tags that are within the zone of coverage of its sensors. This is achieved by polling the sensors or receiving events issued by the sensors themselves. An LIS does not require any knowledge of other LISs. To hide differences among the underlying locating systems, each LIS maps low-level positional information from each of the locating systems into information in a symbolic model of location. An LIS represents an entity's location in terms of the unique identifier of the sensor that detects the tag of the entity. We call each sensor's coverage a cell, as in the model of location studied in [8].
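The symbolic location model described above can be sketched in Java as follows. This is a minimal illustration rather than the paper's implementation: the class name CellDirectory and its methods are hypothetical, and sensors are assumed to report tag sightings and losses as simple method calls. A tag's location is represented only by the identifiers of the sensors (cells) that currently detect it, so a tag may be in several overlapping cells at once.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: an LIS-side directory that records, for each tag,
// the set of cells (sensor identifiers) in which the tag is currently detected.
class CellDirectory {
    private final Map<String, Set<String>> cellsByTag = new HashMap<String, Set<String>>();

    // Called when a sensor reports a tag.
    // Returns true if the tag is new to that cell (an arrival event).
    boolean tagDetected(String tagId, String sensorId) {
        Set<String> cells = cellsByTag.get(tagId);
        if (cells == null) {
            cells = new TreeSet<String>();
            cellsByTag.put(tagId, cells);
        }
        return cells.add(sensorId);
    }

    // Called when a sensor no longer detects a tag.
    // Returns true if the tag had been present in that cell (a departure event).
    boolean tagLost(String tagId, String sensorId) {
        Set<String> cells = cellsByTag.get(tagId);
        if (cells == null) {
            return false;
        }
        boolean removed = cells.remove(sensorId);
        if (cells.isEmpty()) {
            cellsByTag.remove(tagId);
        }
        return removed;
    }

    // The symbolic location of a tag: a read-only copy of the cells that contain it.
    Set<String> cellsOf(String tagId) {
        Set<String> cells = cellsByTag.get(tagId);
        if (cells == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(new TreeSet<String>(cells));
    }
}
```

Because the directory stores only sensor identifiers, the same code would serve RF and infrared sensors alike, which is the kind of independence from the locating system that the abstraction layer is meant to provide.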


Mechanism for Agent Discovery: Each LIS discovers mobile agents bound to tags within its cells and maintains a database in which it stores information about each of the agent hosts and each of the mobile agents attached to a tagged entity or place. When an LIS detects a new tag in a cell, the LIS multicasts a query that contains the identity of the new tag and its own network address to all of the agent hosts in its current sub-network. It then waits for replies from the agent hosts. Here, there are two possible cases: the tag may be attached to an agent host or the tag may be attached to a person, place, or thing other than an agent host.

– In the first case, the newly arriving agent host will send its network address and device profile to the LIS; the profile describes the capabilities of the agent host, e.g., input devices and screen size. After receiving the reply, the LIS stores the profile in its database and forwards the profile to all agent hosts within the cell.
– In the second case, agent hosts that have agents tied to the tag will send their network addresses and the requirements of those agents to the LIS; the requirements for each agent specify the capabilities of the agent hosts that the agent can visit and at which it can perform its services. The LIS then stores the requirements of the agents in its database and navigates the agents to appropriate agent hosts in the way described below.

If the LIS has not received any replies from the agent hosts, it can multicast a query message to other LISs. When the absence of a tag is detected in a cell, each LIS multicasts a message with the identifier of the tag and the identifier of the cell to all agent hosts in its current sub-network.

Navigation Service: We will now explain how agents navigate to reach appropriate agent hosts. When an LIS detects the movement of a tag attached to a person or a thing to a cell, it searches its database for agent hosts that are present in the current cell of the tag.
It also selects candidate destinations from the set of agent hosts within the cell, according to their respective capabilities. The framework offers a language based on CC/PP (composite capability/preference profiles) [20]. The language is used to describe the capabilities of agent hosts and the requirements of mobile agents in an XML notation. For example, a description contains information on the following properties of a computing device: the vendor and model class of the device (PC, PDA, phone, etc.), its screen size, the number of colors, CPU, memory, input devices, secondary storage, and presence/absence of speakers. Each LIS is able to determine whether or not the device profile of each agent host satisfies the requirements of an agent by symbolically matching and quantitatively comparing properties. The LIS informs each agent about the profiles of the agent hosts that are present in the cell and satisfy the requirements of the agent. The agents are then able to autonomously migrate to the appropriate hosts. The current implementation allows each agent to specify the preferable capabilities of agent hosts that it may visit as well as the minimal capabilities.

When there are multiple candidate destinations, each of the agents that is tied to a tag must select one destination on the basis of the profiles of the destinations. Also, when two or more cells geographically overlap, a tag may be in multiple cells at the same time; agents tied to that tag may then receive candidate destinations from multiple LISs. Our goal is to provide physical entities and places with computational functionality from locations near them. Therefore, if there are no appropriate agent hosts in any of the cells in which a tag is present but there are some agent hosts in other cells, the current implementation of our framework is not intended to move agents tied to the tag to hosts in those other cells.

3.2 Agent Host

Each agent host has two forms of functionality: one for advertising its capabilities and the other for executing and migrating mobile agents. When a host receives a query with the identifier of a newly arriving tag from an LIS, it responds in one of the following three ways: (i) if the identifier in the message is equal to the identifier of the tag to which it is attached, it returns profile information on its capabilities to the LIS; (ii) if one of the agents running on it is tied to the tag, it returns its network address and the requirements of the agent; and (iii) if neither of the above cases applies, it ignores the message.¹

The current implementation of this framework is based on a Java-based mobile agent system called MobileSpaces [14].² Each MobileSpaces runtime system is built on the Java virtual machine, which hides differences between the platform architectures of source and destination hosts, such as the operating system and hardware. Each of the runtime systems moves agents to other agent hosts over a TCP/IP connection. The runtime system governs all agents inside it and maintains the life-cycle state of each agent. When the life-cycle state of an agent changes, for example, when it is created, terminates, or migrates to another host, the runtime system issues specific events to the agent. This is because the agent may have to acquire or release various resources, such as files, windows, and sockets, which it has previously captured. When a notification on the presence or absence of a tag is received from an LIS, the runtime system dispatches specific events to the agents that are tied to that tag and run inside it.

3.3 Mobile Agent Program

Each mobile agent is a collection of Java objects and is equipped with the identifier of the tag to which it is attached. It is a self-contained program and is able to communicate with other agents. An agent that is attached to a user always internally maintains that user's personal information and carries all its internal information to other hosts. A mobile agent may also have one or more graphical user interfaces for interaction with its users. When such an agent moves to another host, it can easily adjust its windows to the screen of the new host by using a compound document framework for the MobileSpaces system that was presented in our previous paper [15].

Next, we will explain the programming interface for our mobile agents. Every agent program must be an instance of a subclass of the abstract class TaggedAgent as follows:

    class TaggedAgent extends Agent implements Serializable {
        void go(URL url) throws NoSuchHostException { ... }
        void duplicate() throws IllegalAccessException { ... }
        void destroy() { ... }
        void setTagIdentifier(TagIdentifier tid) { ... }
        void setAgentProfile(AgentProfile apf) { ... }
        URL getCurrentHost() { ... }
        boolean isConformableHost(HostProfile hfs) { ... }
        ....
    }

¹ The current implementation assumes that LISs and agent hosts can be directly connected through a wireless LAN such as IEEE 802.11b and thus does not support any multiple-hop query mechanisms, unlike mobile ad-hoc networking technology [12].
² The framework itself is independent of the MobileSpaces mobile agent system and can thus work with other Java-based mobile agent systems.

Here are some of the methods defined in the TaggedAgent class. An agent executes the go(URL url) method to move, by means of its runtime system, to the destination host specified as url. The duplicate() method creates a copy of the agent, including its code and instance variables. The setTagIdentifier() method ties the agent to the identity of the tag specified as tid. Each agent can specify requirements that its destination hosts must satisfy by invoking the setAgentProfile() method, with the requirements specified as apf. The class provides a service method, isConformableHost(), which the agent uses to decide whether or not the capabilities of an agent host specified as an instance of the HostProfile class satisfy the requirements of the agent. Each agent can have one or more listener objects that implement a specific listener interface to hook certain events issued before or after changes in its life-cycle state or the movements of its tag.

    interface TaggedAgentListener extends AgentEventListener {
        // invoked after creation at url
        void agentCreated(URL url);
        // invoked before termination
        void agentDestroying();
        // invoked before migrating to dst
        void agentDispatching(URL dst);
        // invoked after arriving at dst
        void agentArrived(URL dst);
        // invoked after the tag arrived at another cell
        void tagArrived(HostProfile[] apfs, CellIdentifier cid);
        // invoked after the tag left the current cell
        void tagLeft(CellIdentifier cid);
        // invoked after an agent host arrived at the current cell
        void hostArrived(AgentProfile apfs, CellIdentifier cid);
        ....
    }

The above interface specifies the fundamental methods that are invoked by the runtime system when agents are created, destroyed, or migrate to another agent host. Also, the tagArrived callback method is invoked after the tag to which the agent is bound has entered another cell, to obtain the device profiles of the agent hosts that are present in the new cell. The tagLeft method is invoked after the tag is no longer in a cell.
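How an agent might react to the tagArrived callback can be sketched as follows. The sketch is self-contained and therefore uses stand-ins rather than the real MobileSpaces classes: HostProfile and CellIdentifier here are placeholder classes with hypothetical fields (the paper does not show how a profile exposes a host's address or capabilities), the agent handles the event itself instead of registering a separate listener object, go() only records the destination, and java.net.URI replaces URL purely to avoid checked exceptions in the example.

```java
import java.net.URI;

// Stand-in for the paper's HostProfile. Both fields are hypothetical.
class HostProfile {
    final URI address;
    final int screenWidth;

    HostProfile(URI address, int screenWidth) {
        this.address = address;
        this.screenWidth = screenWidth;
    }
}

// Stand-in for the paper's CellIdentifier.
class CellIdentifier {
    final String id;

    CellIdentifier(String id) {
        this.id = id;
    }
}

// A follow-me style agent, simplified for illustration.
class FollowMeAgent {
    private final int minScreenWidth; // the agent's only requirement in this sketch
    URI currentHost;                  // null until the agent has migrated

    FollowMeAgent(int minScreenWidth) {
        this.minScreenWidth = minScreenWidth;
    }

    // Stand-in for TaggedAgent.go(): records the destination instead of migrating.
    void go(URI destination) {
        currentHost = destination;
    }

    // Plays the role of TaggedAgent.isConformableHost().
    boolean isConformableHost(HostProfile profile) {
        return profile.screenWidth >= minScreenWidth;
    }

    // Plays the role of TaggedAgentListener.tagArrived(): move to the first
    // candidate host in the new cell that satisfies the agent's requirements.
    void tagArrived(HostProfile[] candidates, CellIdentifier cell) {
        for (HostProfile candidate : candidates) {
            if (isConformableHost(candidate)) {
                go(candidate.address);
                return;
            }
        }
        // No conformable host in this cell: stay at the current host.
    }
}
```

Choosing the first conformable host is only one possible policy; an agent could equally rank the candidates by its preferable capabilities before migrating.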

3.4 Current Status

The framework presented in this paper was implemented with Sun's Java Development Kit version 1.1 or later, including Personal Java. The remainder of this section discusses some features of the current implementation.

Locating Systems: The current implementation of our framework supports two commercial locating systems: RF Code's Spider and Elpas's EIRIS. The former provides active RF-tags. Each tag has a unique identifier and periodically (every second) emits an RF beacon that conveys the identifier. The system allows us to explicitly control the omnidirectional range of each of the RF receivers to read tags within a range of 1 to 20 meters. The latter provides active infrared-tags, which periodically (every four seconds) broadcast their identifiers through an infrared interface, like the Active Badge system [19]. Each infrared receiver has omnidirectional infrared coverage, which can be adjusted to cover distances within the range of 0.5 to 10 meters. Although there are many differences between the two locating systems, the framework minimizes these differences.

Performance Evaluation: Although the current implementation of the framework was not built for performance, we measured the cost of migration of an agent with a size of 3 Kbytes (zip-compressed) from a source host to the destination host recommended by the LIS. This experiment was performed with two LISs and two agent hosts, each of which was running on one of four computers (Pentium III-1GHz with Windows 2000 and JDK 1.4), which were directly connected via an IEEE 802.11b wireless network. The latency of an agent's migration to the destination after an LIS had detected the presence of the agent's tag was 380 msec; the cost of agent migration between two hosts over a TCP connection was 48 msec.
The latency includes the cost of the following processes: UDP-multicasting of the identifier of the tags from the LIS to the source host; TCP-transmission of the agent’s requirements from the source host to the LIS; TCP-transmission of a candidate destination from the LIS to the source host; marshaling of the agent; the migration of an agent from the source host to the destination host; unmarshaling of the agent; and security verification. We believe that this latency is acceptable for a location-aware system used in a room or building.

4 Initial Experience

To demonstrate the utility of the SpatialAgent framework, we developed several typical location-aware applications for mobile or ubiquitous computing settings.

4.1 Follow-Me Desktop Application

A simple application of the framework is a desktop teleporting system, like a follow-me application [3], within a richly equipped, networked environment such as a modern office. The system tracks the current location of a user and allows him or her to access his or her applications at the nearest computer as he or she moves around in the building. Unlike previous studies of such applications, our framework can migrate not only the user interfaces of applications but also the applications themselves to appropriate computers in the cell that contains the tag of the user. In our previous paper [15], we also developed a mobile window manager, which is a mobile agent that can carry its desktop applications as a whole to another computer and control the size, position, and overlap of the windows of the applications. Using the framework presented in this paper, the window manager and desktop applications can be automatically moved to and then executed at the computer that is in the current cell of the user and that has the resources required by the applications, in the manner shown in Fig. 4.

[Figure 4: a clock application and an editor application, both mobile agents, migrating between the agent hosts in cell 1 and cell 2 as the tagged user moves between the cells.]

Fig. 4. Follow-Me Desktop Applications

4.2 User Navigation System

We also developed a user navigation system that assists visitors to a building. Several researchers have reported on other similar systems [2,4]. In this example, tags are distributed to several places within the building, such as its ceilings, floors, and walls; each visitor carries a wireless-LAN enabled tablet PC, which is equipped with a locating sensor to detect tags and which also runs an LIS and an agent host. The system initially deploys place-bound agents to invisible computers within the building. When a tagged position enters the cell of the moving sensor, the LIS running on the visitor's tablet PC detects the presence of the tag. The LIS then discovers the place-bound agent that is tied to the tag. Next, it instructs the agent to migrate to its agent host and perform the agent's location-dependent services at the host. Fig. 5 shows a situation where a visitor with his/her tablet PC and sensor is roaming, first approaching place A and then place B. The system enables one or more agents tied to place A to move to the tablet PC. When the visitor moves on toward place B, these agents return to their home computer, and another agent, which is tied to place B, then moves to the tablet PC. Fig. 6 shows a place-bound agent displaying a map of its surrounding area on the screen of a tablet PC.

[Figure 5: three steps showing a user with a tablet PC (agent host) and a sensor for tags moving past tag A and tag B; the map agents for place A and place B migrate between invisible computers A and B (agent hosts) and the tablet PC.]

Fig. 5. The migration of an agent, which is attached to a place, to a visiting computer.

[Figure 6: panel (A), labeled "The positions of RF-sensors" and "RF-tag"; panel (B), a tablet PC (agent host) with an RF-sensor and IEEE 802.11b, running a place-bound agent (map viewer).]

Fig. 6. (A) The positions of RF-tags on a floor and (B) a screenshot of a map-viewer agent running on a tablet PC

4.3 Proactive Control of Home Appliances

We also used this framework to implement two prototype systems to control the lights in a room. Each light was equipped with a tag and was within the range covered by the sensor. In a previous project [11], we developed a generic server to control power outlets through a commercial protocol called X10; in both of the approaches we describe here, the lights are controlled by switching their power sources on or off through the X10 protocol.

User-aware Automatic Controller: The first system provides proactive control of room lighting through a similar approach to that used by the EasyLiving project [1]. Our approach can autonomously turn the room lights on whenever a tagged user is sufficiently close to them. Suppose that each light is attached to a tag and is within the 3-meters coverage of the stationary sensor for the RF Code’s Spider system. A tag attached to each of the lights is correlated with a mobile agent, which is a client of our X10-based server and is running on a stationary agent host in the room. When a tagged user approaches a light, an LIS in the room detects the presence of his/her tag in the cell that contains the light. Next, the LIS moves an agent that is bound to his/her tag to the agent host on


I. Satoh

which the light’s agent is running. The user’s agent then requests, through inter-agent communication, that the light’s agent turn the light on.

Location-aware Remote Controller: The second system allows us to use a PDA as a remote controller for nearby lights. In this system, place-bound controller agents, which can communicate with X10-based servers to switch the lights on or off, are attached to the places that contain room lights. Each user has a tagged PDA, which runs an agent host on WindowsCE and has a wireless LAN interface (see footnote 3). When a user with his/her PDA visits a cell that contains a light, the framework moves a controller agent to the agent host of the visiting PDA. The agent, now running on the PDA, displays a graphical user interface to control the light. When the user leaves that place, the agent automatically closes its user interface and returns to its home host.
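As a rough illustration of the user-aware automatic controller described in this section, the following Python sketch models the sequence of events: the LIS detects a tag in a cell, moves the agent bound to that tag to the agent host of the cell, and the arriving agent asks the light's agent to turn the light on. All class and method names here (`LightAgent`, `AgentHost`, `UserAgent`, `LocationInformationServer`, `tag_entered`) are hypothetical and are not taken from the prototype, which is built on MobileSpaces; migration is simulated by moving an object reference, and the X10 request is reduced to a flag.

```python
class LightAgent:
    """Agent correlated with the tag on one light (stand-in for an X10 client)."""

    def __init__(self, light_id):
        self.light_id = light_id
        self.is_on = False

    def handle(self, request):
        # A real agent would forward the request to the X10-based server.
        if request == "on":
            self.is_on = True
        elif request == "off":
            self.is_on = False


class AgentHost:
    """Stationary agent host in the room (assumed structure)."""

    def __init__(self, name):
        self.name = name
        self.light_agents = []
        self.user_agents = []


class UserAgent:
    """Mobile agent bound to a user's tag."""

    def __init__(self, tag_id):
        self.tag_id = tag_id
        self.host = None

    def migrate(self, host):
        # Leave the previous host, if any, and arrive at the new one.
        if self.host is not None:
            self.host.user_agents.remove(self)
        self.host = host
        host.user_agents.append(self)
        # On arrival, ask each co-located light agent to turn its light on.
        for light_agent in host.light_agents:
            light_agent.handle("on")


class LocationInformationServer:
    """Hypothetical LIS that maps cells to hosts and tags to user agents."""

    def __init__(self):
        self.host_of_cell = {}
        self.agent_of_tag = {}

    def tag_entered(self, tag_id, cell_id):
        agent = self.agent_of_tag.get(tag_id)
        host = self.host_of_cell.get(cell_id)
        if agent is None or host is None:
            return False  # unknown tag, or a cell without an agent host
        agent.migrate(host)
        return True


if __name__ == "__main__":
    host = AgentHost("room-host")
    lamp = LightAgent("lamp-1")
    host.light_agents.append(lamp)

    lis = LocationInformationServer()
    lis.host_of_cell["cell-1"] = host
    lis.agent_of_tag["tag-42"] = UserAgent("tag-42")

    lis.tag_entered("tag-42", "cell-1")
    print(lamp.is_on)
```

The sketch keeps the decision to switch the light inside the visiting agent rather than in the LIS, which mirrors the separation in the text: the LIS only tracks tags and moves agents, while the behavior on arrival belongs to the agent.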

[Figure 7. Labels: desklamp, PDA (agent host), RF-sensor, controller agent, RF-tag attached to a desklamp, X10 appliance module.]

Fig. 7. Controlling a desk lamp from a PDA
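The life cycle of the place-bound controller agent in the location-aware remote controller can be sketched in the same spirit. The Python fragment below is only an assumption-laden model: `ControllerAgent`, `pda_entered_cell`, `pda_left_cell`, and the `send_command` callback are hypothetical names, hosts are represented by plain strings, and the callback stands in for the connection to the X10-based server.

```python
class ControllerAgent:
    """Place-bound controller agent for one light (illustrative only)."""

    def __init__(self, light_id, home_host, send_command):
        self.light_id = light_id
        self.home_host = home_host
        self.current_host = home_host
        self.ui_visible = False
        # send_command stands in for the link to the X10-based server.
        self._send_command = send_command

    def pda_entered_cell(self, pda_host):
        # The framework moves the agent to the visiting PDA,
        # where it displays its user interface.
        self.current_host = pda_host
        self.ui_visible = True

    def pda_left_cell(self):
        # The agent closes its user interface and returns to its home host.
        self.ui_visible = False
        self.current_host = self.home_host

    def switch(self, on):
        # The light can only be operated while the interface is open on a PDA.
        if not self.ui_visible:
            raise RuntimeError("no user interface is open")
        self._send_command(self.light_id, "on" if on else "off")


if __name__ == "__main__":
    log = []
    agent = ControllerAgent("desklamp", "home-host",
                            lambda light, command: log.append((light, command)))
    agent.pda_entered_cell("pda-1")
    agent.switch(True)
    agent.pda_left_cell()
    print(log, agent.current_host)
```

Passing the command sender in as a callback keeps the sketch independent of any particular X10 library, none of which is assumed here.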

Footnote 3: Since existing Java VMs for WindowsCE-based PDAs are lacking in terms of function and performance, the current implementation of this example uses a lightweight version of the MobileSpaces system.

5 Related Work

This section discusses several systems that have influenced various aspects of this framework, which seamlessly integrates two different approaches: ubiquitous and mobile computing.

First, we will compare our framework with some projects that support mobile users in ubiquitous computing environments. Research on smart spaces and intelligent environments is becoming increasingly popular at many universities and corporate research facilities. Cambridge University/AT&T’s Sentient Computing project [3] provides a platform for location-aware applications using infrared-based or ultrasonic-based locating systems in a building. Using the VNC system [13], the platform can track the movement of a tagged entity, such as an individual or a thing, so that the graphical user interfaces of the user’s applications follow the user while he, she, or it moves around. Although the platform provides functionality similar to that of our framework, its management is centralized and it is difficult to dynamically reconfigure the platform when sensors are added to, or removed from, the environment. Since the applications must be executed on remote servers, the platform may have non-negligible interactive latency between the servers and the hosts that the user accesses locally. Recently, some Cambridge University researchers [9] have proposed a CORBA-based middleware, called LocALE, for the Sentient Computing project. The middleware can move CORBA objects to hosts according to the location of tagged objects. CORBA objects, however, are not always suitable for implementing user interface components. Microsoft’s EasyLiving project [1] provides context-aware spaces, with a particular focus on the home and office. Computer vision is used to track users within the spaces. The system can organize the dynamic aggregation of network-enabled devices in a space and control devices according to the location of users.

There have also been several studies on enhancing context-awareness in mobile computing. HP’s Cooltown [6] is an infrastructure for supporting context-aware services on portable computing devices. It is capable of automatically providing bridges between people, places, and things in the physical world and the web resources that are used to store information about them. The bridges that it forms allow users to access resources stored on the web via a browser, using standard HTTP communication. Although users’ familiarity with web browsers is an advantage of this system, all of the services available in the Cooltown system are constrained by the limitations of web browsers and HTTP. The NEXUS system [4], developed by Stuttgart University, offers a generic platform that supports location-aware applications for mobile users. As in the Cooltown system, users require handheld devices such as PDAs or tablet PCs, which are equipped with GPS-based positioning sensors and wireless communication. Applications that run on such devices, e.g., user navigation, maintain a spatial model of the current vicinity of users and gather spatial data from remote servers. Unlike our approach, however, neither of these approaches is suited to supporting mobile users from stationary computers distributed in a smart environment.

Although a number of mobile agent systems have been developed, few researchers have attempted to apply mobile agent technology to mobile and ubiquitous computing. Kangas [5] developed a location-aware augmented-reality system that, like our framework, enabled the migration of virtual objects to mobile computers, but only when the computer was located in particular spaces. However, the system was not designed to move such virtual objects to ubiquitous computing devices. Hive [10] is a mobile agent-based middleware for controlling devices in ubiquitous computing environments, but it does not support any location-aware services.

6 Future Work

Since the framework presented in this paper is designed as a general-purpose framework, in future work we need to apply it to various applications in addition to the three applications presented in this paper.

Moreover, the MobileSpaces system, which is one basis of the framework, allows an application-specific service to be implemented as a collection of multiple agents rather than as a single agent. We are now developing a mechanism to divide an application-specific service into multiple mobile agents. For example, a mobile agent-based service may require various I/O devices, such as a keyboard and speakers, but may not be able to find an agent host that has all of the devices. If there are two hosts, where one has a keyboard and the other has speakers, the two hosts in combination should be able to provide the service.

The current mechanism to exchange information between LISs is not yet satisfactory. We therefore plan to develop a publish-subscribe system for the framework. We currently have an approach for building and managing configurable sensor networks [18]. Since it allows sensor nodes to be organized and configured according to the requirements of applications and changes in the physical world, it is useful in dynamically customizing our location information servers. We have also developed an approach to test context-aware applications on mobile computers [16]. We are interested in providing a methodology for testing applications based on the framework.
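The keyboard-and-speakers example can be made concrete with a small sketch. The Python function below is not the mechanism under development; it merely illustrates, under the assumption that each agent host advertises a set of device names, how a smallest combination of hosts that jointly offers all required devices could be found by exhaustive search. The function name `select_hosts` and the host and device names are hypothetical.

```python
from itertools import combinations


def select_hosts(required, hosts):
    """Return the smallest tuple of host names whose devices jointly
    cover every device in `required`, or None if no combination does.

    `hosts` maps a host name to the set of devices it offers.
    Exhaustive search is used, so this is only practical for a few hosts.
    """
    required = set(required)
    if not required:
        return ()
    names = sorted(hosts)  # fixed order makes the result deterministic
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            covered = set().union(*(hosts[name] for name in combo))
            if required <= covered:
                return combo
    return None


if __name__ == "__main__":
    hosts = {
        "desk-host": {"keyboard", "display"},
        "wall-host": {"speakers"},
    }
    print(select_hosts({"keyboard", "speakers"}, hosts))
```

Once such a combination is known, the service would still have to be divided into agents, one per selected host, which is the part this sketch leaves open.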

7 Conclusion

This paper has presented a novel framework for developing and managing location-aware applications in mobile and ubiquitous computing environments. The framework provides people, places, and things with mobile agents to support and annotate them. Using location-tracking systems, the framework can migrate mobile agents to stationary or mobile computers near the locations of the people, places, and things to which the agents are attached. That is, it allows a mobile user to access his/her personalized services in a ubiquitous computing environment and provides location-dependent services to the user’s portable computing device. The framework is decentralized. In addition, it is a generic platform independent of any higher-level applications and locating systems. We designed and implemented a prototype system for the framework and tested several practical applications.

References

1. B. L. Brumitt, B. Meyers, J. Krumm, A. Kern, S. Shafer, “EasyLiving: Technologies for Intelligent Environments”, Proceedings of International Symposium on Handheld and Ubiquitous Computing, pp.12-27, September, 2000.
2. K. Cheverst, N. Davies, K. Mitchell, and A. Friday, “Experiences of Developing and Deploying a Context-Aware Tourist Guide: The GUIDE Project”, Proceedings of Conference on Mobile Computing and Networking (MOBICOM’2000), pp.20-31, 2000.
3. A. Harter, A. Hopper, P. Steggles, A. Ward, and P. Webster, “The Anatomy of a Context-Aware Application”, Proceedings of Conference on Mobile Computing and Networking (MOBICOM’99), pp.59-68, 1999.
4. F. Hohl, U. Kubach, A. Leonhardi, K. Rothermel, and M. Schwehm, “Next Century Challenges: Nexus - An Open Global Infrastructure for Spatial-Aware Applications”, Proceedings of International Conference on Mobile Computing and Networking (MOBICOM’99), pp.249-255, 1999.
5. K. Kangas and J. Roning, “Using Code Mobility to Create Ubiquitous and Active Augmented Reality in Mobile Computing”, Proceedings of Conference on Mobile Computing and Networking (MOBICOM’99), pp.48-58, 1999.
6. T. Kindberg, et al., “People, Places, Things: Web Presence for the Real World”, Technical Report HPL-2000-16, Internet and Mobile Systems Laboratory, HP Laboratories Palo Alto, February, 2000.
7. D. B. Lange and M. Oshima, “Programming and Deploying Java Mobile Agents with Aglets”, Addison-Wesley, 1998.
8. U. Leonhardt and J. Magee, “Towards a General Location Service for Mobile Environments”, Proceedings of IEEE Workshop on Services in Distributed and Networked Environments, pp.43-50, IEEE Computer Society, 1996.
9. D. Lopez de Ipina and S. Lo, “LocALE: a Location-Aware Lifecycle Environment for Ubiquitous Computing”, Proceedings of Conference on Information Networking (ICOIN-15), IEEE Computer Society, 2001.
10. N. Minar, M. Gray, O. Roup, R. Krikorian, and P. Maes, “Hive: Distributed agents for networking things”, Proceedings of Symposium on Agent Systems and Applications / Symposium on Mobile Agents (ASA/MA’99), IEEE Computer Society, 2000.
11. T. Nakajima, I. Satoh, and H. Aizu, “A Virtual Overlay Network for Integrating Home Appliances”, Proceedings of International Symposium on Applications and the Internet (SAINT’2002), pp.246-253, IEEE Computer Society, January, 2002.
12. C. E. Perkins, “Ad Hoc Networking”, Addison-Wesley, 2001.
13. T. Richardson, Q. Stafford-Fraser, K. Wood, A. Hopper, “Virtual Network Computing”, IEEE Internet Computing, Vol.2, No.1, 1998.
14. I. Satoh, “MobileSpaces: A Framework for Building Adaptive Distributed Applications Using a Hierarchical Mobile Agent System”, Proceedings of International Conference on Distributed Computing Systems (ICDCS’2000), pp.161-168, IEEE Computer Society, 2000.
15. I. Satoh, “MobiDoc: A Framework for Building Mobile Compound Documents from Hierarchical Mobile Agents”, Proceedings of Symposium on Agent Systems and Applications / Symposium on Mobile Agents (ASA/MA’2000), LNCS, Vol.1882, pp.113-125, Springer, 2000.
16. I. Satoh, “Flying Emulator: Rapid Building and Testing of Networked Applications for Mobile Computers”, Proceedings of Conference on Mobile Agents (MA’2001), LNCS, Vol.2240, pp.103-118, Springer, 2001.
17. M. Strasser, J. Baumann, and F. Hohl, “Mole: A Java Based Mobile Agent System”, Proceedings of 2nd ECOOP Workshop on Mobile Objects (eds. J. Baumann, C. Tschudin and J. Vitek), 1997.
18. T. Umezawa, I. Satoh, Y. Anzai, “A Mobile Agent-based Framework for Configurable Sensor Networks”, to appear in International Workshop on Mobile Agents for Telecommunication Applications (MATA’2002), LNCS, Springer, October, 2002.
19. R. Want, A. Hopper, A. Falcao, and J. Gibbons, “The Active Badge Location System”, ACM Transactions on Information Systems, Vol.10, No.1, pp.91-102, ACM Press, January, 1992.
20. World Wide Web Consortium (W3C), Composite Capability/Preference Profiles (CC/PP), http://www.w3.org/TR/NOTE-CCPP, 1999.

Author Index

Ahn, JinHo
Arumugam, Subramanian
Barton, Joyce
Binder, Walter
Breedy, Maggie
Cao, Jiannong
Carvalho, Marco
Cassell, Bryan
Chacón, Daria
Cho, Kenta
Cowin, Thomas
D’Hondt, Theo
Delamaro, Márcio
Fayram, Dave
Garrett, Chris
Gray, Robert
Grimstrup, Arne
Hayashi, Hisashi
Helal, Abdelsalam (Sumi)
Hofmann, Martin
Hwang, ChongSun
Kotz, David
Lichtl, Balázs
Mens, Tom
Min, Sung-Gi
Moreau, Luc
Nalla, Amar
Ohsuga, Akihiko
Picco, Gian Pietro
Roth, Volker
Satoh, Ichiro
Stoops, Luk
Vigna, Giovanni
Wang, Xianbing
Wu, Jie
Zaini, Norliza