Computing and Visualization in Science



English Pages 46 Year 2005


Comput Visual Sci (2004) Digital Object Identifier (DOI) 10.1007/s00791-004-0145-0

Computing and Visualization in Science

Regular article

On a modular architecture for finite element systems. I. Sequential codes

Krzysztof Banaś

Section of Applied Mathematics ICM, Cracow University of Technology, Warszawska 24, 31-155 Kraków, Poland (e-mail: [email protected] [email protected])

Received: 9 December 2002 / Accepted: 22 March 2003
Published online: 17 August 2004 – © Springer-Verlag 2004

Communicated by: G. Wittum

Abstract. The paper discusses basic principles for the modular design of sequential finite element codes. The emphasis is put on computational kernels, considering pre- and postprocessing as separate programs. Four fundamental modules (subsystems) of computational kernels are identified for the four main tasks: problem definition, approximation, mesh manipulation and linear system solution. Example interfaces between the four modules that can accommodate a broad range of application areas, approximation methods, mesh types and existing solvers are presented and discussed. The extensions for other modules and for interfaces with external environments are considered. The paper prepares ground for the next article considering the architecture of parallel finite element systems.

1 Introduction

Finite element codes are becoming more and more complex. Simulations concern multi-scale phenomena and multiphysics processes modeled by coupled systems of nonlinear partial differential equations posed in complicated 3D domains. Adaptive meshes require more elaborate data structures, efficient solvers use multi-level preconditioning, approximation methods operate in more sophisticated function spaces and use complex algorithms for error estimation. Additionally, for the full utilization of capabilities offered by today's parallel computers with memory hierarchies, the explicit model of programming with domain decomposition and message passing proves to be the most efficient [23].

All these call for a new programming paradigm for the finite element method. The implementations described in the popular textbooks, like [15, 21, 26], are considered the old paradigm. The finite element program is defined as a single unit with a single data structure. The data structure contains information on the problem solved, the approximation used and the finite element mesh. The attempts to define a new paradigm have continued for at least ten years [13, 17, 27] and have recently intensified (see e.g. articles in [3] and [16]).

There are basically two approaches that can be found in the literature. The first consists of case studies of large, complex systems [5–8]. For such systems it is necessary to introduce some kind of modularization of the program. Modular design makes codes easier to maintain, modify and extend. The basic principles and advantages of modularization are commonly known and acknowledged; the main problem, however, is to find a suitable modular structure for the software in question. Although there are many proposed architectures for finite element programs, the problem has not been the subject of separate investigations and there is no solution with widespread acceptance. The existing designs do not separate the specification of modules from their implementation and are usually influenced by the latter.

The aim of the present paper is to examine the structure of finite element codes, propose a modular design and define it in the most general form possible. To this end the specification for modules is done exclusively in terms of their interfaces with other modules. The specification tries to accommodate many existing variations of finite element approximations (continuous, discontinuous, higher order) and related algorithms. To limit the scope of investigations, the emphasis is put on computational kernels of finite element codes, the parts related to fundamental calculations in the finite element method. Similar efforts for specifying interfaces between different computational modules, in a slightly different context, have been undertaken in [14] and within the broader project described in [1].
The second approach for modernizing finite element codes consists in using features of modern programming languages and software development techniques. It is usually related to object oriented methodology and the use of C++


(see e.g. [4] and the articles in [12]). Classes (or class hierarchies) are created for low level objects such as vectors, elements or materials, as well as for top level constructs corresponding to fundamental finite element modules. The drawback of this approach is that class hierarchies often do not allow for the strict separation of modules and, therefore, limit the possible implementations of cooperating modules to a specified language. The new paradigm for designing finite element codes will be, with no doubt, related to the progress in computer languages and software engineering. Finite element programs, like other software, benefit from using such features as derived (constructed) data types, dynamic memory allocation or different implementations of inheritance. The profits of using new features of modern languages are counterbalanced by the costs of porting the old software and training programmers. Not all features prove to be equally advantageous for scientific computing [2]. The languages and their compilers change constantly. These make the choice of a programming language for the finite element method by no means obvious. There are now at least three modern languages, Fortran90, C and C++, widely used in scientific programming, each with its own advantages and disadvantages, all standardized and equipped with additional libraries and programming tools. The solution proposed in the present paper tries to define a top level architecture of finite element systems in a language independent fashion, using module interfaces that allow for interoperability of three mentioned above, standardized languages. The restriction to standardized languages ensures the portability as a crucial requirement for scientific codes. The choice of a particular language to implement a given module is not discussed. Certain languages seem to be most appropriate for certain tasks, like e.g. C++ for implementing the hierarchy of possible elements using inheritance. 
On the other hand, the majority of legacy codes are written in Fortran77. The specification of module interfaces for Fortran90 would allow for reusing the most valuable parts of this heritage with the relatively small effort of writing small intermediate Fortran90 wrappers.

Finite element analysis, in its classical form, is performed in three consecutive steps: pre-processing, processing and post-processing. Pre-processing includes geometrical modeling of a computational domain, mesh generation and detailed definition of the problem, comprising the specification of coefficients, boundary conditions and possibly an initial condition. Post-processing involves visualization of approximation results and possibly their further transformations. Processing contains all computations leading from the specification of the problem to the finite element approximation of the solution. All three steps may form a part of a larger simulation or a design process and the programs may be embedded in a problem solving environment. A recent trend is to consider a uniform user interface for the whole simulation as a means for an interactive and possibly collaborative control and steering [9, 18–20, 22]. The last issue has been the subject of intensive research in recent years, related to the emergence of new computing environments: distributed, web-based, "grid". The architecture proposed in the present paper does not include such extensions. Nevertheless, it has been designed as a computational kernel that can be easily parallelized and included into a distributed problem solving environment.

The computational kernel of a finite element code is defined here as a part that implements algorithms related to the processing phase. Some of these algorithms (e.g. time integration, nonlinear equations solution) are shared with other methods for approximating partial differential equations (finite difference, finite volume). Due to their generic character these algorithms are not considered in detail in the paper. The distinct features of the finite element method are: the use of a weak, integral statement as a basis for approximation, the division of the computational domain into finite elements and the use of function spaces spanned by basis functions that are defined element-wise. Therefore the most fundamental algorithms for the finite element analysis are those that transform the weak statement into the system of linear equations. Implementation of these algorithms is the main focus of the paper. From many modularizations proposed already in the literature, the one considered as fundamental is chosen as a basis for the architecture. The splitting consists of three modules: a mesh manipulation module, an approximation module and a problem dependent module. Since the system of linear equations produced by the finite element method has several particular characteristics, the interface between the computational kernel and linear equations solvers is also of crucial importance. Therefore a linear solver module is also considered as the fourth fundamental module. The rationale behind the proposed splitting is that each distinguished module can operate on its own independent data structure. This may produce simplifications to the complex data structures of finite element codes, especially in their parallel and adaptive versions. 
On the other hand, the splitting allows for making use of well known advantages of modular design: creation of prototype modules, that adhere to the part of specifications, easy modifications to the code by managing separate modules independently, increased flexibility of combined codes, easier testing and tuning of separate modules, etc. If designed in a proper way, small modules may be no more difficult to use than a single large code, and usually offer broader functionality than one, all purpose program. In the particular finite element context, modules can be created e.g. for hexahedral meshes with anisotropic refinements (e.g. for introducing needle elements), tetrahedral and prismatic meshes with slicing of prisms possible (e.g. to model boundary layers in complicated domains), for different approximation methods like: discontinuous Galerkin, mixed, hp-adaptive and so on. The question of implementation of particular modules is not discussed in the paper, besides few remarks given below. Since modules are defined in terms of their interfaces, their implementation may be done using different languages and constructs. Probably the most advantageous would be to further split modules into submodules related to different tasks. This can be done in an object oriented fashion or in a procedural style. Finally, standard libraries can be used for such tasks as e.g linear algebra or memory management. The modular design is also recommended here. A single module in a single file containing an interface between the finite element code and a library enables more flexibility in changing libraries or extending code’s functionality. The paper is the first in a series of articles discussing modular design of finite element systems. The second part will describe parallel codes and the following parts will present example implementations and numerical examples


in application domains such as CFD, multi-phase flow and electro-magnetics. There is only one main section in the paper, with different subsections discussing different modules and their interfaces. A small summarizing section concludes the paper.

2 Finite element computational kernel modules and their interfaces

The system of modules and their interfaces described below is a proposal for a general, modular design of finite element computational kernels. It is not comprehensive; the idea is to present the most important and useful abstractions and concepts, as well as to give illustrating examples. There are four modules considered: a mesh manipulation module, an approximation module, a problem dependent module and a linear solver module. Each module is built around its own data structure, not shared with other modules, and the interfaces between modules consist only of parameters (named constants) and procedure (subroutine) calls. Constants are used to specify dimensions of arrays being arguments of interface routines or to provide possible choices of parameters employed by different conventions. Interfaces, as usual, are contained in separate included files. All conventions adopted to make interface information unambiguous and comprehensive have to be explained in comments inside included files. Each implementation of a module should provide interfaces in all three languages, Fortran90, C and C++. The examples in the next subsections use C or Fortran90 interchangeably. For C, pointers are used exclusively as function arguments, to make interoperability with Fortran easier. Vector or matrix arguments are always presented as one dimensional arrays leaving storage details to the final specification. Two basic types of variables are used: integer and double precision real (with the name double for both, C and Fortran).
The question of user specified precision is not considered, as it belongs to a different level of specification (although the use of some convention in that matter is advisable).

2.1 Mesh manipulation module

The mesh manipulation module is responsible for reading, storing and modifying data concerning finite element meshes. It also provides all other modules with information on meshes. Mesh modifications comprise different kinds of refinements and derefinements: element divisions (h-refinements), vertex movements (r-refinements), element clustering. It is relatively easy to separate the mesh manipulation module from the rest of the finite element code. The module does not require any information from other fundamental modules. There is no information within the module on the approximation it is used for, nor on the kind of problem solved.

Using the example of the mesh module, the principles for defining a module and the ways in which it communicates with other modules are presented. The key idea is to equip each appearing entity of a specific kind with a unique identifier (ID). The entities comprise complex constructs like mesh or solver but also simple constructs like vertex or element. In some languages entities may be directly implemented as objects, but the specification does not even require the use of constructed (user defined) data types. For portability reasons it is assumed that an identifier is an integer number (as a type; in practice natural numbers are used).

A mesh is the fundamental entity for a mesh manipulation module. It is assumed that the module can handle several meshes, for approximating different field components or to handle multi-domain approximations. It is the responsibility of the approximation and the problem dependent modules to ensure the consistency of approximations on different meshes. To create a new mesh an initialization routine is called. The call, for the case of mesh data specified in a file, may have the form:

  mesh_id = mmr_init_mesh( control, filename)

where mesh_id is an identifier associated with the newly created mesh, control is an integer parameter with the meaning defined by a particular module (e.g. it may be used for parallel execution to specify whether the data in a file concern the whole mesh or a particular submesh only) and filename is the name of a file with mesh data. The identifier mesh_id should be used as an argument for all operations on the corresponding mesh. An example is given using a simple routine that returns a description of a mesh:

  call mmr_mesh_introduce( mesh_id, mesh_description)

where mesh_description is a character array (string) describing the type of the mesh according to some convention (e.g. "ANISOTROPIC HEXAHEDRAL" or "TRIANGULAR + QUADRILATERAL").

The most important mesh entity, as one could expect for the finite element method, is an element. From the theoretical point of view [10] an element is a triplet: a space domain, a set of shape functions (a local function space) and a set of degrees of freedom (functionals for the space). The latter two ingredients are directly related to approximation and will be considered later. The space domain is related indirectly: it provides a domain for integration of terms from a weak formulation (these terms comprise approximating functions). Integrals from a weak formulation are computed separately for each element. The mesh manipulation module must provide means for performing a loop over all elements that take part in the integration. These elements are called active and the loop may look as follows:

  elem_id=mmr_first_active_elem(mesh_id);
  do{
    ...
    elem_id=mmr_next_active_elem(mesh_id,elem_id);
  }while(elem_id [...]

  [...] 0 - equal size neighbor ID*/ /* 0 [...]
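The loop over active elements described above can be illustrated with a small, self-contained C sketch. It is only a toy stand-in for a mesh manipulation module: the storage (`toy_active`), the helper `count_active_elems`, the by-value `int` signatures and, in particular, the convention that the ID 0 marks the end of the loop are assumptions made for this illustration, not part of the specification quoted in the text.

```c
#include <assert.h>

/* Toy stand-in for a mesh manipulation module: one mesh, elements
   identified by integers 1..N_ELEMS. The end-of-loop convention
   (ID 0 means "no more active elements") is an assumption of this
   sketch, not something fixed by the text. */
#define N_ELEMS 6
#define NO_ELEM 0

/* Hypothetical private data of the module: 1 marks an active element. */
static const int toy_active[N_ELEMS + 1] = {0, 1, 0, 1, 1, 0, 0};

/* Return the first active element with ID greater than elem_id,
   or NO_ELEM if there is none. */
int mmr_next_active_elem(int mesh_id, int elem_id)
{
    (void)mesh_id;                    /* the toy module stores one mesh */
    assert(elem_id >= 0 && elem_id <= N_ELEMS);
    for (int id = elem_id + 1; id <= N_ELEMS; ++id)
        if (toy_active[id]) return id;
    return NO_ELEM;
}

int mmr_first_active_elem(int mesh_id)
{
    return mmr_next_active_elem(mesh_id, NO_ELEM);
}

/* Client code: visit every active element through the interface only,
   never touching the module's data structure directly. */
int count_active_elems(int mesh_id)
{
    int count = 0;
    int elem_id = mmr_first_active_elem(mesh_id);
    while (elem_id != NO_ELEM) {
        ++count;                      /* element integration would go here */
        elem_id = mmr_next_active_elem(mesh_id, elem_id);
    }
    return count;
}
```

A `while` loop is used here instead of the `do{...}while` form so that a mesh without active elements is handled as well. The client code sees only integer identifiers, which is the property that keeps the interface usable from Fortran90, C and C++ alike.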

Various numerical methods are known to solve Stefan problems: front-tracking, front-fixing and fixed domain methods. Since the concentration at the interface varies with time in a bounded domain, we restrict ourselves to a front-tracking method. Recently a number of promising methods have been proposed for multi-dimensional problems: phase field methods and level set methods, such as in [2, 11, 16, 19]. However, imposing the local equilibrium condition at the interface in such models is not as straightforward as in the front-tracking methods that are used here. A coupling between thermodynamics and a phase field model is presented by Grafe et al. [7]. Our main interest is to give an accurate discretization of the boundary conditions for this Stefan problem with one spatial co-ordinate. Therefore we use the classical moving grid method of Murray and Landis [13] to discretize the diffusion equations. In this paper we briefly describe the method; for more details for the case of a diagonalizable matrix, we refer to [21].

Transformation of the concentrations

We assume that the matrix D does not depend on time. First the eigenvalues and eigenvectors of the diffusion matrix are computed for the transformation of the concentration. Thereafter, the particle and initial concentrations are also transformed. The diffusion equation is discretized by using a Finite Difference Method where the time integration is implicit to guarantee numerical stability. A great advantage of the diagonalization argument is that a fully implicit method for diffusion, which is unconditionally stable, can be used easily to integrate the concentration profile in time since the equations are decoupled. This also holds when the diffusion matrix is not diagonalizable.

Discretization of the interior region

We use an implicit finite difference method to solve the diffusion equations in the inner region. An explicitly treated convection term due to grid-movement is included.
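For one decoupled component, an implicit finite difference step of the kind described above leads to a tridiagonal linear system. The following C sketch illustrates this under simplifying assumptions that are not taken from the paper: a uniform fixed grid, fixed (Dirichlet) end values, backward Euler in time, and no convection term for grid movement. The function names and the bound `MAX_NODES` are hypothetical.

```c
#include <assert.h>

#define MAX_NODES 64

/* Solve a tridiagonal system with the Thomas algorithm.
   lower[k], diag[k], upper[k] are the coefficients of row k and rhs is
   overwritten with the solution. lower[0] and upper[n-1] lie outside
   the matrix and should be set to zero. */
void solve_tridiagonal(int n, const double *lower, const double *diag,
                       const double *upper, double *rhs)
{
    double c[MAX_NODES];              /* modified upper diagonal */
    assert(n >= 2 && n <= MAX_NODES);
    c[0] = upper[0] / diag[0];
    rhs[0] /= diag[0];
    for (int k = 1; k < n; ++k) {     /* forward elimination */
        double pivot = diag[k] - lower[k] * c[k - 1];
        c[k] = upper[k] / pivot;
        rhs[k] = (rhs[k] - lower[k] * rhs[k - 1]) / pivot;
    }
    for (int k = n - 2; k >= 0; --k)  /* back substitution */
        rhs[k] -= c[k] * rhs[k + 1];
}

/* One backward Euler step of u_t = lambda * u_xx on a uniform grid.
   The end values u[0] and u[n-1] are kept fixed; u is updated in place. */
void implicit_diffusion_step(int n, double lambda, double dt, double dx,
                             double *u)
{
    double lower[MAX_NODES], diag[MAX_NODES], upper[MAX_NODES];
    double r = lambda * dt / (dx * dx);
    assert(n >= 3 && n <= MAX_NODES);
    for (int k = 0; k < n; ++k) {
        int boundary = (k == 0 || k == n - 1);
        lower[k] = boundary ? 0.0 : -r;
        diag[k]  = boundary ? 1.0 : 1.0 + 2.0 * r;
        upper[k] = boundary ? 0.0 : -r;
    }
    solve_tridiagonal(n, lower, diag, upper, u);
}
```

Because the step is implicit, no restriction on `dt` is needed for stability, which is the property the text relies on. A moving, geometrically graded grid would change the matrix coefficients and add the convection term to the right-hand side, but the system would remain tridiagonal.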
Since the magnitude of the gradient is maximal near the moving interface we use a geometrically distributed grid such that

F.J. Vermolen et al.

the discretization near the interface is fine and coarse farther away from the moving interface. Furthermore, we use a virtual grid-point near the moving boundary. The distance between the virtual node and the interface is chosen equal to the distance between the interface and the first grid-node. The resulting set of linear equations is solved using a tridiagonal matrix solver.

Discrete boundary conditions at the interface

We define the discrete approximation of the concentration as $u_{i,k}^{j}$, where j, i and k respectively denote the time-step, the index of the chemical (alloying) element and the gridnode. The virtual gridnode behind the moving interface and the gridnode at the interface respectively have indices k = −1 and k = 0. At the moving interface, we obtain from discretization of the Stefan condition for i ∈ {1, . . . , n − 1}

$$
\frac{\lambda_i}{u_i^{part} - u_i^{s}} \, \frac{u_{i,1}^{j+1} - u_{i,-1}^{j+1}}{2\Delta r}
=
\frac{\lambda_{i+1}}{u_{i+1}^{part} - u_{i+1}^{s}} \, \frac{u_{i+1,1}^{j+1} - u_{i+1,-1}^{j+1}}{2\Delta r} .
$$

Note that the concentration profile of each element is determined by the value of the interfacial concentration. The above equation can be re-arranged into a zero-point equation for all chemical elements. All interfacial concentrations satisfy the hyperbolic relation (1). Combining these gives, for i ∈ {1, . . . , n − 1} and i = n,

$$
f_i\left(u_{i,0}^{j+1}, u_{i+1,0}^{j+1}\right) := \lambda_i \left(u_{i,1}^{j+1} - u_{i,-1}^{j+1}\right)\left(u_{i+1}^{part} - u_{i+1}^{s}\right) - \lambda_{i+1} \left(u_{i+1,1}^{j+1} - u_{i+1,-1}^{j+1}\right)\left(u_{i}^{part} - u_{i}^{s}\right) = 0 ,
$$

$$
f_n\left(u_1^{s}, \ldots, u_n^{s}\right) := \left(\sum_{j=1}^{n} p_{1j} u_j^{s}\right)^{m_1} (\ldots) \left(\sum_{j=1}^{n} p_{nj} u_j^{s}\right)^{m_n} - K = 0 .
$$
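The text goes on to solve f = 0 with Newton's method and a discretely approximated Jacobian. A minimal C sketch of that idea for two equations in two unknowns is given below. It is not the authors' implementation: the function names, the forward-difference step `h`, the use of Cramer's rule and the model residual `toy_residual` (a linear relation coupled with a product relation of hyperbolic type) are assumptions chosen for illustration.

```c
#include <assert.h>

typedef void (*residual_fn)(const double x[2], double f[2]);

static double abs_val(double v) { return v < 0.0 ? -v : v; }

/* Newton iteration for two equations in two unknowns. The Jacobian is
   approximated column by column with forward differences of step h.
   Returns the number of iterations used, or -1 if the tolerance was
   not reached or the approximate Jacobian became singular. */
int newton_fd_2x2(residual_fn f, double x[2], double tol, int max_iter)
{
    const double h = 1.0e-7;
    for (int it = 0; it < max_iter; ++it) {
        double f0[2], fp[2], J[2][2];
        f(x, f0);
        if (abs_val(f0[0]) + abs_val(f0[1]) < tol) return it;
        for (int c = 0; c < 2; ++c) {          /* build Jacobian column c */
            double xp[2] = {x[0], x[1]};
            xp[c] += h;
            f(xp, fp);
            J[0][c] = (fp[0] - f0[0]) / h;
            J[1][c] = (fp[1] - f0[1]) / h;
        }
        double det = J[0][0] * J[1][1] - J[0][1] * J[1][0];
        if (det == 0.0) return -1;
        /* Solve J * dx = -f0 with Cramer's rule and update x. */
        x[0] += (-f0[0] * J[1][1] + f0[1] * J[0][1]) / det;
        x[1] += (-f0[1] * J[0][0] + f0[0] * J[1][0]) / det;
    }
    return -1;
}

/* Model problem for illustration only: a linear relation coupled with
   a product relation x0 * x1 = K, here with K = 8. */
void toy_residual(const double x[2], double f[2])
{
    f[0] = x[0] - 2.0 * x[1];
    f[1] = x[0] * x[1] - 8.0;
}
```

Starting from (1, 1), the iteration on the model problem converges to (4, 2). In the setting of the paper each evaluation of the residual would involve the discretized concentration profiles, so a difference approximation of the Jacobian avoids differentiating that dependence analytically.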

To approximate a root of the 'vector-function' f we use Newton's method combined with discrete approximations for the non-zero entries in the first n − 1 rows of the Jacobian matrix. The iteration is terminated when sufficient accuracy is reached. This is explained in more detail in [21].

Adaptation of the moving boundary

The interface position is computed by the use of the Stefan condition. In [20] the forward (explicit) Euler and Trapezium time integration methods are described and compared. It was found that the (implicit) Trapezium method was superior in accuracy. Furthermore, the iteration step to determine the interfacial concentrations is included in each Trapezium step to determine the interfacial position. Hence, the work per time iteration remains the same for both time integration methods. Therefore, the Trapezium rule is used to determine the interfacial position as a function of time. We terminate the iteration when sufficient accuracy is reached, i.e. let ε be the inaccuracy, then we stop the iteration when the inequality

$$
\sum_{i=1}^{n} \left| u_i^{s}(p+1) - u_i^{s}(p) \right| + \frac{\left| S^{j+1}(p+1) - S^{j+1}(p) \right|}{[\ldots]} < \varepsilon
$$

[...] x > S(t), t > 0:

$$
(P_1)\quad
\begin{cases}
\dfrac{\partial u}{\partial t} = \Lambda \dfrac{\partial^2 u}{\partial x^2} , \\[2mm]
\left(u^{p} - u^{s}\right) \dfrac{dS}{dt} = \Lambda \dfrac{\partial u}{\partial x}(S(t), t) , \\[2mm]
u(x, 0) = u^{0} , \quad S(0) = S_0 , \\[1mm]
u(S(t), t) = u^{s} .
\end{cases}
$$

First we deal with the diagonalizable case where we consider an exact solution and an asymptotic approximation. Subsequently we deal with the non-diagonalizable case where we also consider an exact solution and an asymptotic approximation. A self-similar solution, where the boundaries do not move, can be found in the book of Glicksman [5], chapters 23 and 24.

5.1 The exact solution for the diagonalizable case

As a trial solution of (P1) we assume that the interfacial concentrations $u^{s}$ are constant. Furthermore, we assume that the diffusion matrix, D, is diagonalizable.
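Suppose that the vector $u^{s}$ is known; then, using a similar procedure as in [21], one obtains the solution for each component:

$$
u_i = u_i^{0} + \left(u_i^{s} - u_i^{0}\right) \frac{\operatorname{erfc}\left(\dfrac{x - S_0}{2\sqrt{\lambda_i t}}\right)}{\operatorname{erfc}\left(\dfrac{k}{2\sqrt{\lambda_i}}\right)} , \quad \text{for } i \in \{1, \ldots, n\} .
$$

Here k is the rate parameter of the interface position $S(t) = S_0 + k\sqrt{t}$, so that $u_i = u_i^{s}$ at x = S(t) and $u_i \to u_i^{0}$ as x → ∞. Substituting this form of S(t) into the Stefan condition gives the following expression for k:

$$
\frac{u_i^{0} - u_i^{s}}{u_i^{p} - u_i^{s}} \sqrt{\frac{\lambda_i}{\pi}} \, \frac{e^{-\frac{k^2}{4\lambda_i}}}{\operatorname{erfc}\left(\dfrac{k}{2\sqrt{\lambda_i}}\right)} = \frac{k}{2} , \quad \text{for } i \in \{1, \ldots, n\} .
$$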

The above equation has to be solved for the parameter k. However, the transformed interfacial concentrations $u^{s}$ are not known either and hence one is faced with the following problem

Cross-diffusion controlled particle dissolution in metallic alloys

$$
(P_2)\quad
\begin{cases}
\dfrac{u_i^{0} - u_i^{s}}{u_i^{p} - u_i^{s}} \sqrt{\dfrac{\lambda_i}{\pi}} = \dfrac{k}{2} \, \dfrac{\operatorname{erfc}\left(\dfrac{k}{2\sqrt{\lambda_i}}\right)}{e^{-\frac{k^2}{4\lambda_i}}} , \quad \text{for } i \in \{1, \ldots, n\} , \\[4mm]
\left(\displaystyle\sum_{j=1}^{n} p_{1j} u_j^{s}\right)^{m_1} \left(\displaystyle\sum_{j=1}^{n} p_{2j} u_j^{s}\right)^{m_2} (\ldots) \left(\displaystyle\sum_{j=1}^{n} p_{nj} u_j^{s}\right)^{m_n} = K .
\end{cases}
$$

Here the unknowns are the transformed interfacial concentrations $u^{s}$ and the rate-parameter k. In the above problem there is no time-dependence, hence the ansatz of time-independent transformed interfacial concentrations (and hence the physical interfacial concentrations) is not contradicted. Due to the non-linear nature of the equations, the solution may not be unique. We apply a numerical zero-point method to obtain a solution of (P2).

Example

We illustrate the importance of the cross-diffusion term. The following input parameters are used, where we vary the value of the cross term $D_{12}$:

$$
c^{0} = (0, 0)^{T} , \quad c^{part} = (50, 50)^{T} , \quad D = \begin{pmatrix} 1 & D_{12} \\ D_{12} & 2 \end{pmatrix} , \quad K = 1 .
$$

The above diffusion matrix is symmetric. From Fig. 1 it is clear that the influence of the cross terms is significant. Since Kale et al. [8] indicate that the cross diffusion term can have the same order of magnitude as the diagonal terms in the diffusion matrix we choose the values of $D_{12}$ in the range [−1, 0]. From (P2) there is no explicit relation for k. In [22] we show that an approximate explicit solution for k can be derived provided that $\|u^{s} - u^{0}\| \ll \|u^{p} - u^{s}\|$.

Fig. 1. The interfacial position as a function of time for the exact self similar solution for several values of the cross diffusion terms (curves labelled $D_{12} = -1$, $-1/2$, $-1/4$ and $0$)

5.2 The exact solution for the non-diagonalizable case

In this section we consider a ternary example, so n = 2. Examples with more components can be treated similarly. When the matrix $D \in \mathbb{R}^{2 \times 2}$ is not diagonalizable then we obtain

$$
\Lambda = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}
$$

as the decomposed form of the diffusion matrix. The set of transformed diffusion equations becomes

$$
\begin{cases}
\dfrac{\partial u_1}{\partial t} = \lambda \dfrac{\partial^2 u_1}{\partial x^2} + \dfrac{\partial^2 u_2}{\partial x^2} , \\[3mm]
\dfrac{\partial u_2}{\partial t} = \lambda \dfrac{\partial^2 u_2}{\partial x^2} .
\end{cases}
\tag{11}
$$

From the above system it can be seen that the equation for $u_2$ is uncoupled. Its solution is computed using the self-similarity transformation and subsequently substituted into the equation for $u_1$. We consider self-similarity solutions $u_1, u_2(x, t) = u_1, u_2(\eta)$, where $\eta := \dfrac{x - S_0}{\sqrt{t}}$, and we apply a similar procedure as in Sect. 5.1 to obtain a system of ordinary differential equations for $u_1$ and $u_2$. These equations are solved to obtain the following expressions for $u_1$ and $u_2$:

$$
u_1 = \frac{C_1}{2\lambda^2} \left( -\eta \lambda e^{-\frac{\eta^2}{4\lambda}} + \lambda \sqrt{\pi\lambda} \, \operatorname{erf}\left(\frac{\eta}{2\sqrt{\lambda}}\right) \right) + C_2 \sqrt{\pi\lambda} \, \operatorname{erf}\left(\frac{\eta}{2\sqrt{\lambda}}\right) + C_3 ,
$$

$$
u_2 = C_1 \sqrt{\pi\lambda} \, \operatorname{erf}\left(\frac{\eta}{2\sqrt{\lambda}}\right) + C_4 .
$$

Again we use the trial solution $S(t) = S_0 + k\sqrt{t}$. A combination with the boundary conditions delivers expressions for the integration constants $C_1$, $C_2$, $C_3$ and $C_4$. Substitution of these constants into the expressions of $u_1$ and $u_2$ gives the transformed concentrations. Further, the rate factor of the interface movement, k, is obtained from combination of the Stefan condition and the expression for $u_2$. Then we get the following set of equations to be solved for k, $u_1^{s}$ and $u_2^{s}$:

$$
\begin{aligned}
& \frac{u_2^{0} - u_2^{s}}{u_2^{p} - u_2^{s}} \sqrt{\frac{\lambda}{\pi}} = \frac{k}{2} \, \frac{\operatorname{erfc}\left(\frac{k}{2\sqrt{\lambda}}\right)}{e^{-\frac{k^2}{4\lambda}}} , \\
& \frac{u_1^{0} - u_1^{s}}{u_1^{p} - u_1^{s}} \sqrt{\frac{\lambda}{\pi}} + \frac{u_2^{0} - u_2^{s}}{u_1^{p} - u_1^{s}} \, \frac{1}{\sqrt{\lambda\pi}} \left( \frac{1}{2} + \frac{k^2}{4\lambda} - \frac{k \, e^{-\frac{k^2}{4\lambda}}}{2\sqrt{\lambda\pi} \, \operatorname{erfc}\left(\frac{k}{2\sqrt{\lambda}}\right)} \right) = \frac{k}{2} \, \frac{\operatorname{erfc}\left(\frac{k}{2\sqrt{\lambda}}\right)}{e^{-\frac{k^2}{4\lambda}}} , \\
& \left( p_{11} u_1^{s} + p_{12} u_2^{s} \right)^{m_1} \left( p_{21} u_1^{s} + p_{22} u_2^{s} \right)^{m_2} = K .
\end{aligned}
\tag{12}
$$

Note that $p_1$ and $p_2$ respectively represent the eigenvector and generalized eigenvector that correspond to the eigenvalue λ of the defective matrix D. The above system of equations can be solved using a zero-point method. We remark here that we derived for this non-diagonalizable case an approximate solution under the condition $|u_2^{s} - u_2^{0}| \ll |u_2^{p} - u_2^{s}|$ in [22].
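The reduction of (11) to ordinary differential equations can be checked with the short derivation below. It is a sketch under the self-similar ansatz of Sect. 5.2; the primes, denoting derivatives with respect to η, are notation introduced here.

```latex
% Self-similar variable: \eta = (x - S_0)/\sqrt{t}
\frac{\partial}{\partial t} = -\frac{\eta}{2t}\,\frac{d}{d\eta} ,
\qquad
\frac{\partial^2}{\partial x^2} = \frac{1}{t}\,\frac{d^2}{d\eta^2} .

% Substitution into (11) and multiplication by t:
\lambda u_2'' + \frac{\eta}{2}\,u_2' = 0 ,
\qquad
\lambda u_1'' + \frac{\eta}{2}\,u_1' = -u_2'' .

% The uncoupled equation integrates to
u_2' = C_1\, e^{-\frac{\eta^2}{4\lambda}} ,
\qquad
u_2'' = -\frac{C_1 \eta}{2\lambda}\, e^{-\frac{\eta^2}{4\lambda}} .

% With u_1' = g(\eta)\, e^{-\eta^2/(4\lambda)} the coupled equation reduces to
g' = \frac{C_1 \eta}{2\lambda^2}
\quad\Longrightarrow\quad
u_1' = \left( \frac{C_1 \eta^2}{4\lambda^2} + C_2 \right) e^{-\frac{\eta^2}{4\lambda}} .

% One further integration uses
\int e^{-\frac{\eta^2}{4\lambda}}\, d\eta
  = \sqrt{\pi\lambda}\;\operatorname{erf}\!\left(\frac{\eta}{2\sqrt{\lambda}}\right) ,
\qquad
\int \eta^2 e^{-\frac{\eta^2}{4\lambda}}\, d\eta
  = -2\lambda\eta\, e^{-\frac{\eta^2}{4\lambda}}
    + 2\lambda\sqrt{\pi\lambda}\;\operatorname{erf}\!\left(\frac{\eta}{2\sqrt{\lambda}}\right) .
```

Integrating $u_1'$ and $u_2'$ with these two antiderivatives gives the expressions for $u_1$ and $u_2$ with the integration constants $C_1, \ldots, C_4$ used in Sect. 5.2.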

Example

In the illustration in Fig. 2 the exact solution (see (12)), the approximate solution ([22]) and the numerical solution are displayed. It can be seen that the difference between both analytical approaches is very small for the following data-set that is used for the calculations:

$$
c^{0} = (0, 0)^{T} , \quad c^{part} = (50, 50)^{T} , \quad D = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} , \quad K = 1 .
$$

The eigenvalue of the above matrix is equal to 2 and the matrix is not diagonalizable. In Fig. 2 a comparison is shown with the numerical solution (see Sect. 4). The interface concentration starts with $c^{s} = (0.8673, 1.1560)^{T}$. It can be seen that the agreement is perfect in the initial stages. However, the solutions start to deviate at later stages where soft impingement starts to play an important role. Finally, we remark that the above treatment can be extended to the more general case of n simultaneously diffusing alloying elements.

Fig. 2. The interface position as a function of time for the case that the diffusion matrix is not diagonalizable. The analytical and numerical solution are shown (axes: Time (s), Position (m); curves: exact solution, approximate solution, numerical solution)

6 Application to a Finite Element Model

In applications a one-dimensional approach is not always suitable (see [12] for instance where three-dimensional effects have to be taken into account). We limit ourselves to rotational symmetry such that the model only contains two spatial co-ordinates. The geometry is sketched in Fig. 3. The chemical compositions of the two phases are different. The cylindrical particle only dissolves at the rim, whereas the spherical particle grows. Note that we have two moving boundaries in this example. We consider a ternary alloy with artificial input data:

$$
D = \begin{pmatrix} 1 & -0.1 \\ -0.1 & 2 \end{pmatrix} , \quad c^{0} = (5, 5)^{T} , \quad c^{part} = (50, 50)^{T} , \quad K_S = 1 , \quad K_C = 50 .
\tag{13}
$$

Fig. 3. Sketch of the initial geometry of the growing semi-spherical particle and the dissolving cylindrical plate. Note that we have rotational symmetry (labels: semi-spherical particle, cylindrical plate)

Here $K_S$ and $K_C$ respectively denote the solubility product of the spherical particle and cylindrical plate. As initial geometrical data we used a sphere with radius 1 and a cylindrical plate with radius 4. The initial cylinder height is 1 and the whole domain measures 5 × 5. Further, the sizes are in micrometers and the diffusion coefficients are in terms of µm²/s. Since the Finite Element Code that we use here is not applicable to this vector-valued Stefan problem yet, we solve the diffusion equation for one element only. Furthermore, we assume that interface concentrations are constant in time and position. To obtain the interface concentrations, we solve the equations in problem (P2) after the use of the diagonalization argument. The transformed interface and particle concentrations and the eigenvalue of one element are used in the two-dimensional Finite Element code SEPRAN to compute the evolution of the interfaces. The result is shown in Fig. 4. As time proceeds the real interface concentrations are a function of time. Furthermore, it can be seen in Fig. 4 that the growing spherical phase exhibits a fingering behavior, i.e. the interface is unstable. As the off-diagonal terms of the diffusion matrix become more negative, the movement of the two interfaces is delayed.

Fig. 4. Simultaneous growth of a 'spherical' particle and decay of a 'cylindrical' phase at consecutive times. Input data are as given in (13). Further, only the area in the vicinity of the moving interfaces is shown (labels: 'sphere', 'cylindrical' phase)

7 Conclusions

A model based on a vector-valued Stefan problem has been developed to predict the dissolution kinetics of stoichiometric particles in multi-component alloys. Cross-diffusion is taken into account, which gives a strong coupling between the diffusion equations of the several alloying elements. A diagonalization of the diffusion matrix leads to a vector-valued Stefan problem with a weaker coupling. If the off-diagonal entries of the diffusion matrix are negative, then the delay of particle


dissolution due to these terms increases as the magnitude of the off-diagonal entries increases. Future work will be the implementation of vector-valued Stefan problems into the Finite Element Method for two and three spatial co-ordinates.

References

1. Chadam, J., Rasmussen, H.: Free boundary problems involving solids. Harlow: Longman Scientific & Technical 1993
2. Chen, S., Merriman, B., Osher, S., Smereka, P.: A simple level-set method for solving Stefan problems. J. Comp. Phys. 135, 8–29 (1997)
3. Crank, J.: Free and moving boundary problems. Oxford: Clarendon Press 1984
4. Farkas, M.: Two ways of modelling cross-diffusion. Nonlin. Anal., theory, methods and applications 30, 1225–1233 (1997)
5. Glicksman, M.E.: Diffusion in solids. New York: John Wiley and Sons 2000
6. Golub, G.H., van Loan, C.F.: Matrix Computations. Baltimore: The Johns Hopkins University Press 1996
7. Grafe, U., Bottger, B., Tiaden, J., Fries, S.G.: Coupling of multicomponent thermodynamic databases to a phase-field model: application to solidification and solid state transformations of superalloys. Scrip. Mater. 42(12), 1179–1186 (2000)
8. Kale, G.B., Bhanumurthy, K., Khera, S.K., Asundi, M.K.: Ternary diffusion in FCC phase of Fe-Ni-Cr alloy system at 1223 K. Mater. Trans., JIM 32(11), 1034–1041 (1991)
9. Kassab, A.J.: Stefan and moving front problems. Oxford: Elsevier Applied Science 1995
10. Kirkaldy, J.S., Young, D.J.: Diffusion in the condensed state. London: The Institute of Metals 1987
11. Kobayashi, R.: Modeling and numerical simulations of dendritic crystal growth. Phys. D 63, 410–423 (1993)
12. Kuijpers, N.C.W., Vermolen, F.J., Vuik, C., van der Zwaag, S.: A model of the β-AlFeSo to α-Al(FeMn)Si transformation in Al-Mg-Si alloys. In preparation (2003)
13. Murray, W.D., Landis, F.: Numerical and machine solutions of transient heat conduction problems involving freezing and melting. Transactions ASME (C), Journal of heat transfer 245, 106–112 (1959)
14. Naumann, E.B., Savoca, J.: An engineering approach to an unsolved problem in multi-component diffusion. AICHE Journal 47(5), 1016–1021 (2001)
15. Nolfi, F.V. jr., Shewmon, P.G., Foster, J.S.: The dissolution and growth kinetics of spherical particles. Trans. Metall. Soc. of AIME 245, 1427–1433 (1969)
16. Osher, S., Sethian, J.A.: Fronts propagating with curvature-dependent speed: Algorithms based on Hamilton-Jacobi formulations. J. Comp. Phys. 79, 12–49 (1988)
17. Protter, M.H., Weinberger, H.F.: Maximum principles in differential equations. Englewood Cliffs: Prentice-Hall 1967
18. Reiso, O., Ryum, N., Strid, J.: Melting and dissolution of secondary phase particles in AlMgSi-alloys. Metall. Trans. A 24A, 2629–2641 (1993)
19. van Til, W., Vuik, C., van der Zwaag, S.: An inventory of numerical methods to model solid-solid phase transformations in aluminium alloys. NIMR-report number P.004.001, Delft (2000)
20. Vermolen, F.J., Vuik, C.: A numerical method to compute the dissolution of second phases in ternary alloys. J. Comput. Appl. Math. 93, 123–143 (1998)
21. Vermolen, F.J., Vuik, C.: A mathematical model for the dissolution of particles in multi-component alloys. J. Comput. Appl. Math. 126, 233–254 (2001)
22. Vermolen, F.J., Vuik, C., van der Zwaag, S.: Some mathematical aspects of particle dissolution and cross-diffusion in multi-component alloys. Delft University of Technology, Department of Applied Mathematical Analysis 01–13, Delft, The Netherlands (2001)
23. Vermolen, F.J., Vuik, C., van der Zwaag, S.: Particle dissolution and cross-diffusion in multi-component alloys. Mater. Sc. and Engng. A 347(1–2), 265–279 (2003)
24. Visintin, A.: Models of phase transitions, Progress in nonlinear differential equations and their application. Boston: Birkhäuser 1996, p. 38
25. Whelan, M.J.: On the kinetics of particle dissolution. Met. Sc. 3, 95–97 (1969)

Comput Visual Sci (2004) Digital Object Identifier (DOI) 10.1007/s00791-004-0142-3

Computing and Visualization in Science

Regular article

The NumLab numerical laboratory for computation and visualisation

J. Maubach, A. Telea

Department of Mathematics and Computer Science, Eindhoven University of Technology, Den Dolech 2, 5600 MB Eindhoven, The Netherlands (e-mail: {maubach, alext}@win.tue.nl)

Received: 29 May 2000 / Accepted: 13 May 2002
Published online: 17 August 2004 – © Springer-Verlag 2004
Communicated by: G. Wittum

Abstract. A large range of software environments addresses numerical simulation, interactive visualisation and computational steering. Most such environments are designed to cover a limited application domain, such as finite element or finite difference packages, symbolic or linear algebra computations or image processing. Their software structure rarely provides a simple and extensible mathematical model for the underlying mathematics. Thus, assembling numerical simulations from computational and visualisation blocks, as well as building such blocks, is a difficult task for the researcher in numerical simulation. This paper presents the NumLab environment, a single numerical laboratory for computational and visualisation applications. Its software architecture models fundamental numerical mathematical concepts one-to-one and presents a generic framework for a large class of computational applications. Partial and ordinary differential equations, transient boundary value problems, linear and non-linear systems, matrix computations, image and signal processing, and other applications all use the same software architecture and are built in a simple and interactive visual manner. NumLab's one-to-one modelled mathematical concepts are illustrated with various applications.

1 Introduction

The NumLab (Numerical Laboratory) environment has been constructed after a thorough search through a wide range of software environments for numerical computation, interaction, and data visualisation. NumLab's goals include seamless integration of computation and visualisation, convenient application construction, communication with other software environments, and a high level of extensibility and customisability for research purposes. In order to assess the merits of the NumLab environment, we first consider the numerical simulation and visualisation software environments in general. From a structural point of view, such software environments can be classified into three categories (see for

instance [39]): Libraries, turnkey systems, and application frameworks. Libraries for numerics such as LAPACK [2], NAGLIB [37], or IMSL [26], or for visualisation such as OpenGL [27], Open Inventor [50], or VTK [44], provide services in the form of data structures and functions. Libraries are usually easy to extend with new data types and functions. However, using libraries to build a complete computational or visualisation application requires involved programming. Turnkey systems, such as Matlab [33], Mathematica [32], or the many existing dedicated numerical simulators on the market, are simpler to use than libraries to build a complete application. However, extending the functionality of such systems is usually limited to a given application domain, as in the case of the dedicated simulators, or to a fixed set of supported data types, as in the case of the Matlab programming environment. Application (computational) frameworks, such as the Diffpack and SciLab systems for solving differential equations [11, 43] or the Oorange system for experimental mathematics [23] combine the advantages of the libraries and turnkey systems. On one hand, frameworks have an open structure, similarly to libraries, so they can be extended with new components, such as solvers, matrix storage schemes, or mesh generators. On the other hand, some (notably visualisation) frameworks offer an easy manner to construct a complete application that combines visualisation, numerics, and user interaction. This is usually provided by means of visual programming tools such as Matlab’s Simulink [33] or the dataflow network editing tools of the AVS [49], IRIS Explorer [1], or Oorange [23] frameworks. In these frameworks, applications are constructed by assembling visual representations (icons) of the computational or visualisation components in a network. Program execution is implemented in terms of computational operations on the network nodes and data flows between these nodes respectively. 
With the above in mind, let us consider how the NumLab environment integrates the advantages of the above architectures. On the level of libraries, NumLab's C++ routines call Fortran, Pascal, C, and C++. Next, similar to a turnkey system, NumLab offers full integration of visualisation and numerical computation, and implements communication with other environments such as Simulink [33] and MathLink [32]. On the application framework level, NumLab provides interactive application construction with its visual programming dataflow system VISSION [46, 47]. Furthermore, NumLab provides an object-level (subroutine-level) make-concept which allows for interactive program validation.

In order to better address NumLab's merits on all levels, we need a closer look at computational frameworks. Though efficient and effective, most existing computational frameworks are limited in several respects. To begin with, limitations exist from the perspectives of the end user, application designer, and component developer [4, 19, 39, 46]. First, few computational frameworks facilitate convenient interaction between visualisation (data exploration) and computations (numerical exploration), both essential to the end user. Secondly, from the application designer perspective, the visual programming facility, often provided in visualisation frameworks such as AVS or Explorer [1, 49], is usually not available for numerical frameworks. Conversely, it is quite difficult to integrate large scale computational libraries in visualisation frameworks. Finally, from the numerical component developer perspective, understanding and extending a framework's architecture is usually still a very complex task, albeit noticeably simplified in object-oriented environments such as [11, 44].

Next to the limitations with respect to the three types of users, many computational frameworks are constrained in a more structural manner: similar mathematical concepts are not factored out into similar software components. As a consequence, most existing numerical software is heterogeneous, and thus hard to deploy and understand. For instance, in order to speed up the iterative solution of a system of linear equations, a preconditioner is often used.
Though iterative solvers and preconditioners fit into the same mathematical concept (that of an approximation x which is mapped into a subsequent approximation z ≈ F(x)), most computational software implements them incompatibly, so preconditioners cannot be used as iterative solvers and vice versa [11]. Another example emerges from finite element libraries. Such libraries frequently restrict reference element geometry and bases to a (sub)set of possibilities found in the literature. Because this set is hard-coded, extension to different geometries and bases for research purposes is difficult, or even impossible.

The design of NumLab addresses all the above problems. NumLab is a numerical framework which provides C++ software components (objects) for the development of a large range of interdisciplinary applications (PDEs, ODEs, non-linear systems, signal processing, and all combinations). Further, it provides interactive application design/use with its visual programming dataflow system VISSION [46, 47], data interchange (e.g. via Simulink and MathLink), and can be used both in a compiled and interpreted fashion. Its computational libraries factor out fundamental notions with respect to numerical computations (such as evaluation of operators z = F(x) and their derivatives), which keeps the number of basic components small. All components of these libraries are aware of dataflow, even in the absence of the VISSION dataflow system, and can for instance call back to see whether provided data is valid.

The remainder of this paper addresses some fundamental NumLab design aspects, as follows. In Sect. 2, the mathematics that we desire to model in software is reduced to a set of simple but generic concepts. Section 3 shows how these concepts are mapped to software entities. Section 4 illustrates the above for the concrete case of solving the Navier–Stokes partial differential equation. Section 5 presents how concrete simulations combining computations and visualisation are constructed and used in NumLab. Finally, Sect. 6 concludes the paper and presents further directions. In order to bound the list of references, citations have been kept to a minimum.

2 The mathematical framework

In order to reduce the complexity of the entire software solution, we show how NumLab formulates different mathematical concepts with a few basic mathematical notions. It turns out that in general NumLab's components are either operators F, or their vector space arguments x, y. The most frequent NumLab operations are therefore operator evaluations F(x) and vector space operations such as x + y. What matters is the manner in which NumLab facilitates the construction of complex problem-specific operators (for instance the transient Navier–Stokes equation with heat transfer), and related complex solvers. NumLab offers:

1. Problem-specific operators: Transient Finite Element, Volume, Difference operators F for transient boundary value problems (BVPs); operators which formulate systems of ordinary differential equations (ODEs); operators which act on linear operators (for instance image filters). The operator framework is open: users can define customised operators z = F(x).
2. Problem-specific solvers for systems of ODEs: Time-step and time-integration operators formulated with the use of (parts of) the problem-specific operators mentioned above. The former operators require non-linear solvers for the computation of solutions.
3. Solvers for systems of non-linear equations: Such systems are operators, and their solution is reduced to the solution of a sequence of linear systems.
4. Solvers for systems of linear equations: Such systems are also operators F(x) = Ax − b. Their solution is reduced to a sequence of operator evaluations and vector space operations.

The reduction from one type of operator into another is commented on in the subsections of Sect. 2, in the reverse order of the itemisation above. Thus, Sect. 2.1 examines systems of (non-)linear equations and preconditioners, Sect. 2.2 considers the reduction of systems of ODEs to non-linear systems, and Sect. 2.3 deals with an initial boundary value problem. The presented mathematical reductions are de facto standards; what is new is NumLab's software implementation, which maps one to one onto these techniques.

2.1 Non-linear systems and preconditioners

This subsection presents NumLab's operator approach, and demonstrates how operator evaluations reduce to repeated


vector space operations and operator evaluations. This is illustrated by means of examples, which include (non-)linear systems and preconditioning techniques.

First, consider linear systems of the form F(x) = f. Here F(x) = Ax is a linear operator, with an N by N coefficient matrix A, and $f \in \mathbb{R}^N$ is a right-hand side. The NumLab implementation of the evaluation z = F(x) is:

    F.eval(z, x);

The actual implementation of eval() varies with the application type (for instance full matrix, sparse matrix, image, etc.). Though z is a resulting value, its initial value can be used for computations (for instance as an initial guess). Next, NumLab formulates a linear system F(x) = f with the use of an affine operator G:

$$ G(x) = F(x) - f . \tag{1} $$

The user constructs this NumLab system with

    G.setO(F);
    G.setI(f);

and computes the residual z = G(x) with:

    G.eval(z, x);

The routines with name set- provide G with the linear operator and a right-hand side vector. Next, let $x \in \mathbb{R}^N$ be a given vector, and focus on the solution(s) of G(z) = x, i.e., on solution methods for affine operators. Assume that operator R approximates $G^{-1}$:

$$ G(z) = x \iff z = G^{-1}(x) \iff z \approx R(x) . \tag{2} $$

For the sake of demonstration, and without loss of generality, we assume that R is a left-preconditioned Richardson iterative solution method, with preconditioner P. Such a method is based on a successive substitution process:

$$ z^{(k+1)} = z^{(k)} - P\left( G\left(z^{(k)}\right) - x \right) , \tag{3} $$

which terminates as soon as $S(P(G(z^{(k)}) - x)) = 0$, for a user-provided stopping criterion $S : \mathbb{R}^N \to \{0, 1\}$. This recursion will converge if for instance G is as in (1) with F positive definite, and if P(x) = hx with h positive and small enough. The related NumLab operator R for (3) is defined by the implementation of its eval() method:

    R.eval(z, x) {
      P.setO(G);
      repeat {
        G.eval(r, z);
        r -= x;
        P.eval(s = 0, r);
        z -= s;
      } while (S(s) > 0);
    }

The system G(z) = x is solved with a few instructions:

    R.setO(G).setP(P).setS(S);
    R.eval(z, x);
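The operator idea behind (1)–(3) can be sketched in self-contained C++ as follows. This is a minimal illustration under stated assumptions, not NumLab's actual API: the class names `Operator`, `Matrix`, `Affine`, `Scale` and `Richardson`, the dense row-major storage, the constructor-based wiring (instead of the set- routines) and the norm-based stopping test are all hypothetical choices made here; only the `eval(z, x)` calling convention is taken from the text.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical common interface: systems, solvers and preconditioners
// all expose eval(z, x), meaning z = F(x). z and x must be distinct objects.
struct Operator {
    virtual ~Operator() = default;
    virtual void eval(Vec& z, const Vec& x) const = 0;
};

// Dense linear operator F(x) = A x, with A stored row by row.
struct Matrix : Operator {
    std::size_t n;
    Vec a;
    Matrix(std::size_t n_, Vec a_) : n(n_), a(std::move(a_)) { assert(a.size() == n * n); }
    void eval(Vec& z, const Vec& x) const override {
        z.assign(n, 0.0);
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j) z[i] += a[i * n + j] * x[j];
    }
};

// Affine operator G(x) = F(x) - f, as in (1).
struct Affine : Operator {
    const Operator* F;
    Vec f;
    Affine(const Operator& F_, Vec f_) : F(&F_), f(std::move(f_)) {}
    void eval(Vec& z, const Vec& x) const override {
        F->eval(z, x);                                            // operator evaluation
        for (std::size_t i = 0; i < z.size(); ++i) z[i] -= f[i];  // vector space operation
    }
};

// Simplest preconditioner P(r) = h r.
struct Scale : Operator {
    double h;
    explicit Scale(double h_) : h(h_) {}
    void eval(Vec& z, const Vec& x) const override {
        z = x;
        for (double& v : z) v *= h;
    }
};

// Left-preconditioned Richardson iteration (3): z <- z - P(G(z) - x).
// The incoming z is used as initial guess; iteration stops when ||s|| <= tol.
struct Richardson : Operator {
    const Operator* G;
    const Operator* P;
    double tol;
    Richardson(const Operator& G_, const Operator& P_, double tol_ = 1e-12)
        : G(&G_), P(&P_), tol(tol_) {}
    void eval(Vec& z, const Vec& x) const override {
        z.resize(x.size(), 0.0);
        Vec r, s;
        for (int k = 0; k < 10000; ++k) {
            G->eval(r, z);                                            // r = G(z)
            for (std::size_t i = 0; i < r.size(); ++i) r[i] -= x[i];  // r = r - x
            P->eval(s, r);                                            // s = P(r)
            double norm2 = 0.0;
            for (std::size_t i = 0; i < z.size(); ++i) { z[i] -= s[i]; norm2 += s[i] * s[i]; }
            if (std::sqrt(norm2) <= tol) break;                       // stopping criterion
        }
    }
};

// Solve A z = f, i.e. G(z) = 0, for A = [[2, 1], [1, 3]] and f = (3, 5).
Vec solve_example() {
    Matrix A(2, {2.0, 1.0, 1.0, 3.0});
    Affine G(A, {3.0, 5.0});
    Scale P(0.3);
    Richardson R(G, P);
    Vec z(2, 0.0);
    R.eval(z, Vec(2, 0.0));
    return z;
}
```

Because `Richardson` is itself an `Operator`, an instance of it could be passed as the preconditioner of another `Richardson` object, which mirrors the remark that operators can make use of operators.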

Observe that solver R uses z both as initial guess $z^{(0)} \in \mathbb{R}^N$ and final approximate solution, whereas preconditioner P must use 0 as an initial guess. If a preconditioner is not provided, a default (the identity operator) is substituted. The stopping criteria are dealt with similarly. Next, as could be observed, operators can make use of operators: the preconditioner for Richardson's algorithm could have been Richardson's algorithm itself, a diagonal preconditioner, an (incomplete) LU factorisation, and so forth. Furthermore, the eval() methods of the solver R, preconditioner P and system G are syntax-wise identical. The pseudo code for R above executes P.setO(G), so preconditioner P can use (has access to) G and its Jacobian. Further, the linear system F(z) = f could have been solved directly with R:

    R.setO(F);
    R.eval(z, f);

NumLab formulates systems, solvers and preconditioners all with the use of set- and eval() syntax, though the related mathematical concepts differ. A few other methods such as update() exist; they relate to data flow concepts, which are outside the current paper's scope.

A closer examination of the Richardson operator reveals more information of interest. NumLab implements all its operator evaluations with vector space operations and, for all that remains, nested operator evaluations. This is clearly demonstrated by R's implementation above:

$$ \begin{aligned} &(1)\quad r = G\left(z^{(k)}\right) ; \\ &(2)\quad r = r - x ; \\ &(1)\quad s = P(r) ; \\ &(2)\quad z^{(k+1)} = z^{(k)} - s , \end{aligned} \tag{4} $$

where (1) denotes operator evaluation and (2) vector space operation. This clear-cut classification of operations thoroughly simplifies the mathematical framework.

Though NumLab regards preconditioning as approximate function evaluation, which simplifies its framework, this does not solve the problem of proper preconditioning. Specific iterative solution methods might require preconditioners to preserve for instance symmetry (such as the preconditioned conjugate gradient method PCG [8]) or at least positive definiteness of the symmetric part (for minimal residual methods, see [42] for GMRES and [5] for GCGLS). All iterative solvers have some requirements: robust methods (e.g. [30]), multi-level methods (e.g. [7] and [31]), multi-grid methods (e.g. [25]), and incomplete factorisation methods (e.g. [24]). The application designer should keep these mathematical restrictions in mind when designing a suitable solver for the problem at hand.

Similar to linear systems, NumLab also formulates non-linear systems with the use of operators G, and looks for solutions of G(z) = x. The Jacobian (Fréchet derivative) of a (non-linear) operator G at point x is denoted by DG(x), or by DG if G is linear. Related non-linear solvers are again formulated as operators. Non-linear operators G which do not provide derivative evaluation can be solved with the use of a fixed point


method (comparable to the Richardson method above), or with a combinatorial fixed point method [48] (a multidimensional variant of the bisection method). Non-linear operators G which provide derivative evaluation can also be solved with (damped, inexact) Newton methods (see [18] and [20]). A typical NumLab code for an undamped Newton method is:

    Newton.eval(z, x) {
      repeat {
        G.eval(r, z);
        r -= x;
        Solver.setO(G.getJacobian(z));
        Solver.eval(s, r);
        z -= s;
      } while (S(s) > 0);
    }

and a system G(z) = x is solved by this method with:

    Newton.setO(G).setSolver(R);
    Newton.eval(z, x);

where Richardson's method is used to solve the linear systems. Again, the application designer should take care that the fixed point function is chosen properly, so it preserves properties of F, such as symmetry and positive definiteness of the symmetric part.

In order to close this section on systems of equations and solvers, note that images are also treated as operators

$$ F(x) = Ax , \tag{5} $$

where A is a (block-diagonal) matrix of colour intensities. Thus, image visualisation reduces to Jacobian visualisation. An application of the above to image processing is illustrated in Fig. 4m.

2.2 Ordinary differential equations

Standard discretisations of ordinary differential equations can also be formulated as operators whose evaluation reduces to a sequence of vector space operations and function evaluations. For instance, let E be an operator, and consider the initial value problem: Find x(t) for which:

$$ \frac{d}{dt} x(t) = E(t, x(t)) \quad (t > 0) , \qquad x(0) = x_0 . \tag{6} $$

Let h > 0 denote the discrete time-step, and define $t_k = kh$ for all k = 0, 1, 2, .... Provided with an approximation $x^{(k)}$ of $x(t_k)$, a fixed-step Euler backward method determines an approximation $x^{(k+1)}$ of $x(t_{k+1})$ from

$$ x^{(k+1)} - x^{(k)} = h \, E\left(t_{k+1}, x^{(k+1)}\right) , \tag{7} $$

which can be rewritten as

$$ x^{(k+1)} - x^{(k)} - h \, E\left(t_{k+1}, x^{(k+1)}\right) = 0 . \tag{8} $$

Define the operator T as follows:

$$ T(x) = x - x^{(k)} - h \, E(t_{k+1}, x) . \tag{9} $$
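As a worked illustration of (7)–(9), the following C++ sketch performs Euler backward steps by finding the root of T with an undamped Newton iteration. It is a simplified sketch under assumptions made here, not NumLab code: the unknown is a scalar, the names `Rhs`, `TimeStep`, `newton` and `integrate_logistic` are hypothetical, and the logistic equation dx/dt = x(1 − x) is chosen only as an example of a non-linear right-hand side E.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Right-hand side E(t, x) of dx/dt = E(t, x), and its derivative dE/dx.
struct Rhs {
    std::function<double(double, double)> E;
    std::function<double(double, double)> dEdx;
};

// Time-step operator (9): T(x) = x - x_k - h E(t_k + h, x).
struct TimeStep {
    Rhs rhs;
    double t_k, x_k, h;
    double eval(double x) const { return x - x_k - h * rhs.E(t_k + h, x); }
    // Jacobian DT(x) = 1 - h dE/dx(t_k + h, x).
    double jacobian(double x) const { return 1.0 - h * rhs.dEdx(t_k + h, x); }
};

// Undamped Newton iteration for T(x) = 0, started at the previous value x_k.
double newton(const TimeStep& T, double tol = 1e-14, int max_iter = 50) {
    double x = T.x_k;
    for (int it = 0; it < max_iter; ++it) {
        const double s = T.eval(x) / T.jacobian(x);  // solve DT(x) s = T(x)
        x -= s;
        if (std::fabs(s) <= tol) break;
    }
    return x;
}

// Integrate the logistic equation dx/dt = x (1 - x) with `steps`
// Euler backward steps of size h, starting from x0 at t = 0.
double integrate_logistic(double x0, double h, int steps) {
    Rhs rhs{[](double, double x) { return x * (1.0 - x); },
            [](double, double x) { return 1.0 - 2.0 * x; }};
    double x = x0;
    for (int k = 0; k < steps; ++k) x = newton(TimeStep{rhs, k * h, x, h});
    return x;
}
```

For x0 = 0.1 and h = 0.1, one step solves the quadratic 0.1 x² + 0.9 x − 0.1 = 0, whose positive root is approximately 0.10977; over many steps the iterates approach the stable equilibrium x = 1.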

Then $x^{(k+1)}$ is a solution of T(x) = 0. Of course, T depends on the user-provided values $x^{(k)}$, $t_k$ and h. The NumLab evaluation code of z = T(x) is:

    T.eval(z, x) {
      E.setT(t_k + h);
      E.eval(z, x);
      z *= -h;
      z += x;
      z -= x_k;
    }

Next, the approximate solution $x^{(k+1)}$ at time $t_{k+1}$ is computed with:

    T.setT(t_k).setX(x_k).setH(h);
    Newton.setO(T).setSolver(R);
    Newton.eval(z, 0);

The operator formulation $x^{(k+1)} = T^{-1}(0)$ applies to all explicit methods such as Runge–Kutta type methods [13], as well as to all implicit discretisation methods, such as Euler Backward and Backward Difference Formulas (BDF) [22]. It is obvious that the evaluation of T at a given x again only involves vector space operations and operator evaluations. Solving T(x) = 0 can thus be done by several methods: successive substitution, Newton type methods, preconditioned methods, etc.

Naturally, a time-step integrator complements the time-step mechanism. NumLab provides standard fixed time-step methods and, as required for stiff problems, adaptive time-step integrators of the PEC and PECE type [34]. An example is the solution of the Lotka-Volterra predator-prey problem, shown in Fig. 4l. Phase-plane plots can also be generated.

2.3 Partial differential equations and initial boundary value problems

In order to show how partial differential equations (PDEs) are reduced to (non-)linear systems of equations, consider an initial boundary value problem. Let $\Omega \subset \mathbb{R}^d$ be the bounded region of interest, and let ∂Ω denote its boundary. We denote points in this region with c ∈ Ω (x is reserved for vectors and related iterands). The problem of interest is: Find a solution u on [0, ∞) × Ω which satisfies

$$ \frac{\partial}{\partial t} u = \Delta u + f \quad (t > 0) , \tag{10} $$

subject to initial condition $u(0, c) = u_0(c)$ for all c ∈ Ω, and boundary conditions u(t, c) = γ(c) for all t ∈ [0, ∞) and c ∈ ∂Ω. For the sake of presentation, the boundary conditions are all assumed to be of Dirichlet type. With a method of lines (MOL) approach, (10) fits into the framework (6), for a suitable operator E, to be defined. As an alternative, one can first discretise in time, and next discretise in space, or simultaneously discretise with respect to both (see for instance [6]). For a MOL solution of (10), the region of interest Ω is covered with a grid of elements (with the use of a uniform,


Delaunay, or bisection type [35] grid generator). Next, the static equation −∆u = f is discretised with one of the available methods (standard conforming higher order finite elements, and non-conforming elements as for instance in [29]). A standard Galerkin approach assumes that the solution u is in a linear vector space V with basis $\{v_j\}_{j=1}^N$. For the method of lines approach applied to (10) one sets

$$ u(t, c) = \sum_i x_i(t) \, v_i(c) \tag{11} $$

for all time t and c ∈ Ω. Functions in V are identified with their coefficient vectors in $\mathbb{R}^N$, so u is identified with x. Multiplication of (10) with (test) functions $\{v_j\}_{j=1}^N$, followed by partial integration over Ω, leads to a system of ODEs

$$ M \frac{d}{dt} x(t) = -G(x(t)) \quad (t > 0) . \tag{12} $$

Here,

$$ [G(x)]_i = \int_\Omega \Big[ \nabla \Big( \sum_{j=1}^{N} x_j v_j \Big) \cdot \nabla v_i - f v_i \Big] \tag{13} $$

if variable i = 1, ..., N is not related to a Dirichlet point and $[G(x)]_i = 0$ otherwise, and

$$ [M]_{ij} = \int_\Omega [ v_j v_i ] \tag{14} $$

if neither variable i nor variable j is related to a Dirichlet point, and $[M]_{ij} = \delta_{ij}$, the Kronecker delta, otherwise. M is a standard finite element mass matrix. The Jacobian DG of G is

$$ [DG]_{ij} = \int_\Omega [ \nabla v_j \cdot \nabla v_i ] \tag{15} $$

if neither variable i nor variable j is related to a Dirichlet point, and $[DG]_{ij} = \delta_{ij}$, the Kronecker delta, otherwise.

Functions in the linear vector space V do not need to satisfy the Dirichlet boundary conditions. Let $\alpha : \partial\Omega \to \mathbb{R}$. Define the set (not necessarily a vector space)

$$ V^\alpha = \{ x \in V : x = \alpha \text{ at } \partial\Omega \} . \tag{16} $$

Then $V^0$ (homogeneous boundary conditions) is a vector space, and $V^\gamma$ is the set of all functions which satisfy the Dirichlet boundary conditions. The solution z of G(z) = 0 is obtained by application of a full Newton step

$$ z \to z + [DG(z)]^{-1} [-G(z)] \tag{17} $$

to an initial guess $z^{(0)}$. The NumLab code for the related undamped Newton method is:

    Newton.setO(G).setSolver(R);
    Newton.eval(z, 0);

In the example code, Richardson's iterative solver R is used for the solution of G(x) = 0. Note that operator G in (13) maps V onto $V^0$, and that its Jacobian matrix DG in (15) maps $V^0$ onto $V^0$. Assume that the (iterative) solver for the solution of the linear systems (1) maps $V^0$ onto $V^0$ and (2) is the identity on $V - V^0$. Then, by induction, also (17) satisfies both assumptions. Because all common linear solvers (PCG, GCGLS, CGS, Peaceman–Rachford, etc.) map $V^0$ onto $V^0$, all of NumLab's solvers map $V^\gamma$ onto $V^\gamma$. This holds for the (non-)linear (iterative) solvers, as well as for the solvers for systems of ordinary differential equations.

For linear systems G(z) = 0 with G affine, as for instance in (13), the use of Newton's method (17) may seem overkill. However, this is not the case: under the assumption that all coefficients of x are degrees of freedom, including the ones related to Dirichlet points, the solution of G(z) = 0 requires the solution of (17) above. In order to see this, define the linear system F(x) = f with

$$ [F(x)]_i = \int_\Omega \Big[ \nabla \Big( \sum_{j=1}^{N} x_j v_j \Big) \cdot \nabla v_i \Big] , \tag{18} $$

and

$$ [f]_i = \int_\Omega [ f v_i ] , \tag{19} $$

for all i. The problem F(x) = f has no unique solution because F is singular. Under the assumption that we compute with all coefficients $x_i$, F(x) = f is transformed below in a standard manner, which results in the system G(x) = 0, and requires solution method (17). First, define the projection $C : \mathbb{R}^N \to \mathbb{R}^N$ by

$$ [C(x)]_i = \begin{cases} x_i & \text{for all related non-Dirichlet supports } c_i , \\ 0 & \text{otherwise} . \end{cases} \tag{20} $$

Next, the vector x is coefficient-wise split into a vector $x^{(0)}$, which contains all Dirichlet related function values, and interior degrees of freedom d, i.e., we set

$$ x = x^{(0)} + d . \tag{21} $$

Then d turns out to be the solution to

$$ \Big( (I - C) + C \, DF\big(x^{(0)}\big) \, C^T \Big) d = C \Big( f - F\big(x^{(0)}\big) \Big) . \tag{22} $$

This shows that for the solution of a linear boundary value problem F(x) = f, we must solve (17), and in fact exactly solve G(x) = 0. The standard splitting (21) for linear systems makes use of an $x^{(0)} \in V^0$ (in (22)) which is zero at all nodal points, except those at the Dirichlet boundary. This so-called elimination of boundary conditions is a poor choice because $x^{(0)}$ has steep gradients near Dirichlet boundaries, whence the induced initial residual $r^{(0)}$ for the iterative solver is large. Fortunately, from (21) it follows that we can also take different $x^{(0)} \in V^0$. In order to minimize the number of iterative solution steps, it is best to use a smooth $x^{(0)}$.

Finally, we consider the NumLab formulation of an operator for the solution of (10). This operator, of which the discrete solution of (12) is a root, is constructed similarly to the operator constructed in (6)–(9), for E(t, x) := −G(x) and $T(x) = M(x - x^{(k)}) - h \, E(t_{k+1}, x)$. This construction is such


that a symmetric positive definite Jacobian of G(x) implies a likewise Jacobian of T(x). Therefore, the initial boundary value problem (10) reduces to a sequence of systems of non-linear equations.

2.4 Conclusions

The examples in Sects. 2.1, 2.2 and 2.3 have shown how mathematical problems with a seemingly different formulation can be reduced to the two basic operations of vector space computations and operator evaluation. Because of this, the NumLab software provides the basic notions as well as concrete specialisations of vector spaces V and operators G on V.

3 From the mathematical to the software framework

In this section, we show how the notion of operators F and arguments v in (cross-product) spaces V map to a software framework. As outlined in the previous section, a large class of solution methods for problems of the form F(x) = 0 can be reduced to a simple mathematical framework based on finite dimensional linear vector spaces and operators on those spaces. The software framework we propose will closely follow the mathematical model. As a consequence, the obtained software product will be simple and generic as well.

Consider the mathematical framework for spaces V and operators F in more detail. In general, let Ω be the bounded polygonal/polyhedral domain of interest, with smooth enough boundary ∂Ω. The linear vector space $V = V_1 \times \cdots \times V_n$ is a cross-product space of n spaces (n is the number of degrees of freedom). Each space $V_i$ is spanned by basis functions $\{v_{ij}\}_{j=1}^{N_i}$ where $v_{ij} : \Omega \to \mathbb{R}$. An element x ∈ V is a vector function from Ω to $\mathbb{R}^n$, and is written as $x = [x_1, \ldots, x_n]$, a vector of component functions. Each component $x_i \in V_i$ is a linear combination of basis functions:

$$ x_i(c) = \sum_{j=1}^{N_i} x_{ij} \, v_{ij}(c) \qquad \text{for all } c \in \Omega . \tag{23} $$

Each element $x_i$ is associated with a unique scalar vector $X_i = [x_{i1}, \ldots, x_{iN_i}] \in \mathbb{R}^{N_i}$. In turn, X denotes the aggregate of these vectors: $X = [X_1, \ldots, X_n]$, and $X_{ij} = [X_i]_j$. Summarised, we have vector functions $x = [x_1, \ldots, x_n]$ and related vectors of coefficient vectors $X = [X_1, \ldots, X_n]$. Whenever n = 1, we use a more standard notation. In this case, the space is V, spanned by basis functions $\{v_j\}_{j=1}^N$, and elements x ∈ V are related to coefficient vectors $x = [x_1, \ldots, x_N]$.

For most finite element computations, the basis functions $v_{ij}$ of $V_i$ have local support. However, basis functions have global support in spectral finite element computations. The local supports, also called elements, are created with the use of a triangulation algorithm. The next subsections describe NumLab's software components related to the mathematical concepts discussed in this section: grid generation in Sect. 3.1, bases generation in Sect. 3.2, vector functions in Sect. 3.3, and related operators in Sect. 3.4.
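The relation between a vector function x and its coefficient vectors X in (23) can be made concrete with a small C++ sketch. It rests on assumptions made here for illustration only: the domain is the unit interval, every component space uses piecewise linear "hat" basis functions on a uniform grid, and the names `hat`, `eval_component` and `eval_vector_function` are hypothetical and not taken from NumLab.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hat function v_j on a uniform grid of [0, 1] with N >= 2 nodes:
// equal to 1 at node j, 0 at all other nodes, and linear in between.
double hat(std::size_t j, std::size_t N, double c) {
    assert(N >= 2);
    const double h = 1.0 / static_cast<double>(N - 1);
    const double d = std::fabs(c - static_cast<double>(j) * h) / h;
    return d < 1.0 ? 1.0 - d : 0.0;
}

// One component x_i(c) = sum_j X_ij v_ij(c), cf. (23).
double eval_component(const std::vector<double>& Xi, double c) {
    double sum = 0.0;
    for (std::size_t j = 0; j < Xi.size(); ++j) sum += Xi[j] * hat(j, Xi.size(), c);
    return sum;
}

// A vector function x = [x_1, ..., x_n] is stored as X = [X_1, ..., X_n];
// the component spaces may have different dimensions N_i.
std::vector<double> eval_vector_function(const std::vector<std::vector<double>>& X, double c) {
    std::vector<double> values;
    values.reserve(X.size());
    for (const std::vector<double>& Xi : X) values.push_back(eval_component(Xi, c));
    return values;
}
```

With X_1 = [0, 0.5, 1] (the nodal values of the identity on three nodes) and X_2 = [1, 1, 1, 1, 1], evaluation at c = 0.3 gives 0.3 for the first component and 1 for the second, since the hat functions reproduce linear functions and sum to one.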

3.1 The Grid module

To be able to define local support for the basis functions $v_{ij}$ later on, we need to discretise the functions' domain $\Omega$. This is modelled in the software framework by the Grid module, which covers the domain with elements e. The Grid module takes a Contour as input, which describes the boundary $\partial\Omega$ of $\Omega$. The default contour is that of the unit square. In NUMLAB, a grid covers regions $\Omega$ in any dimension (e.g. 2D planar, manifold or 3D spatial), and consists of a variety of element shapes, such as triangles, quadrilaterals, tetrahedra, prisms, hexahedra, n-simplices (see [35]), and so on. All grids implement a common interface, which provides a deliberately small set of services. These include iteration over the grid elements and their related vertices, and topological queries such as finding the element which contains a given point. The number of services is kept to a minimum: modules which use a grid generator and need more services must compute the required relations from the provided information.

Specific Grid generator modules produce grids in different manners. NUMLAB contains Delaunay generators, simplicial generators, regular generators, and "generators" which read an existing grid from a file. The result of an example generator is illustrated in Fig. 4k, which shows a cubic finite element interpolant on a 2-manifold in $\mathbb{R}^3$.

3.2 The Space module

The linear vector space V is implemented by the software module Space. Space takes a Grid and BoundaryConditions as inputs. The grid's discretisation, in combination with the boundary conditions, is used to build the supports of its basis functions $v_{ij}$. The default boundary conditions are Dirichlet type conditions for all solution components. None, Robin, Neumann and vectorial boundary conditions are specified per boundary part. Recall that elements of V do not have to satisfy the essential (Dirichlet) boundary conditions.
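The way a Space is built from a Grid with a minimal interface plus boundary conditions given per boundary part can be sketched in one dimension. All type and member names below (UniformGrid1D, LinearSpace1D, BoundaryCondition, isDirichlet) are assumptions made for illustration; the text does not show NUMLAB's actual class interfaces.

```cpp
#include <cstddef>

// Illustrative types only, not NUMLAB's classes.
enum class BoundaryCondition { None, Dirichlet, Neumann, Robin };

// Minimal grid: n equal elements on [a, b], offering element/vertex counts,
// vertex coordinates and one topological query.
class UniformGrid1D {
public:
    UniformGrid1D(double a, double b, std::size_t n) : a_(a), b_(b), n_(n) {}
    std::size_t numElements() const { return n_; }
    std::size_t numVertices() const { return n_ + 1; }
    double vertex(std::size_t v) const {
        return a_ + (b_ - a_) * static_cast<double>(v) / static_cast<double>(n_);
    }
    // Index of the element containing c, or -1 if c lies outside [a, b].
    long elementContaining(double c) const {
        if (c < a_ || c > b_) return -1;
        const std::size_t e = static_cast<std::size_t>(
            (c - a_) / (b_ - a_) * static_cast<double>(n_));
        return static_cast<long>(e < n_ ? e : n_ - 1);  // c == b: last element
    }
private:
    double a_, b_;
    std::size_t n_;
};

// Piecewise linear space on the grid. Boundary conditions are given per
// boundary part (left and right end point) and default to Dirichlet.
class LinearSpace1D {
public:
    explicit LinearSpace1D(const UniformGrid1D& grid,
                           BoundaryCondition left = BoundaryCondition::Dirichlet,
                           BoundaryCondition right = BoundaryCondition::Dirichlet)
        : grid_(grid), left_(left), right_(right) {}
    // Every vertex carries a basis function, including Dirichlet vertices,
    // since elements of V need not satisfy the essential conditions.
    std::size_t numBasisFunctions() const { return grid_.numVertices(); }
    bool isDirichlet(std::size_t j) const {
        return (j == 0 && left_ == BoundaryCondition::Dirichlet) ||
               (j + 1 == numBasisFunctions() && right_ == BoundaryCondition::Dirichlet);
    }
private:
    const UniformGrid1D& grid_;
    BoundaryCondition left_, right_;
};
```

The space only uses the few services the grid offers; anything else it needs, such as which vertices lie on the boundary, it derives itself.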
Because Grid has a minimal interface, some information required by Space for the construction of the basis functions is not provided. Whenever this happens, Space internally computes the required information with the use of Grid's services. A specific Space module implements a specific set of basis functions, such as constant, linear, quadratic, or even higher order polynomials, matched to the elements' geometry.

The interface of the Space module follows the mathematical properties of the vector space V presented so far: elements $x, y \in V$ can be added together or scaled by real values. Furthermore, the elements $v_{ij}$ of V are functions, and V permits evaluation of such functions and their derivatives at points $c \in \Omega$. It should be kept in mind that the Space module represents the basis functions $v_{ij}$, not their linear combinations (those are represented by the Function module, Sect. 3.3). Therefore, the name Space is somewhat misleading. However, for brevity, the name Space will also be used in the sequel.

In most cases, the required basis functions have local support, also called element-wise support. The restriction of a global basis function $v_{ij}$ to a support e is called a local basis function $v_{ir}$. In software, this is coded as follows: for space component i (so $V_i$), element e, and local basis function r thereon, the index j := j(i, r) identifies the global basis function $v_{ij}$. The software implementation works at element level for efficiency: given a point $c \in \Omega$, Space determines which support e contains c for the evaluation of $v_{ij}(c)$.

3.3 The Function module

As discussed, a vector function $x : \Omega \to \mathbb{R}^n$ in a space V generated by $v_{ij}$ is uniquely related to a coefficient vector X with coefficients $X_{ij}$. Based on this observation, the NUMLAB software module Function implements a vector function x as a block vector of real-valued coefficients $X_{ij}$, combined with a reference to the related Space, which contains the related functions $v_{ij}$. The Function module provides services to evaluate the function and its derivatives at a given point $c \in \Omega$. To this end, both x's coefficient vector X and the point c are passed to the Space module referred to by x. In turn, the Space module returns the value of x(c). This is computed following the definition $x(c) = [\sum_j x_{ij} v_{ij}(c)]$, as described in the previous section. The computation of the partial derivatives of a given function x at a point c follows a similar implementation.

Providing evaluation of functions $x \in V$ and of their derivatives at given points is, strictly speaking, the minimal interface the Space module has to implement. However, it is sometimes convenient to be able to evaluate a function at a point given as an element number and local coordinates within that element. This is especially important for efficiency when one operation is iterated over all elements of a Grid, as in numerical integration.
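Evaluation by element number and local coordinate, and its use in element-wise numerical integration, can be sketched as follows. The example assumes a piecewise linear function on a uniform grid of [0, 1] and a two-point Gauss rule; the type PiecewiseLinear and its members are hypothetical names, not NUMLAB's interface.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Piecewise linear function on a uniform grid of [0, 1]; X holds the nodal
// values, so there are X.size() - 1 elements. Illustrative type only.
struct PiecewiseLinear {
    std::vector<double> X;

    std::size_t numElements() const { return X.size() - 1; }
    double elementSize() const { return 1.0 / static_cast<double>(numElements()); }

    // Evaluation at element e and local coordinate xi in [0, 1]:
    // no point-to-element search is needed.
    double evalLocal(std::size_t e, double xi) const {
        return (1.0 - xi) * X[e] + xi * X[e + 1];
    }
};

// Integrate f over [0, 1] element by element with the two-point Gauss rule.
double integrate(const PiecewiseLinear& f) {
    const double offset = 0.5 / std::sqrt(3.0);
    const double points[2] = {0.5 - offset, 0.5 + offset};  // on the unit interval
    const double weights[2] = {0.5, 0.5};
    double sum = 0.0;
    for (std::size_t e = 0; e < f.numElements(); ++e)
        for (int k = 0; k < 2; ++k)
            sum += weights[k] * f.elementSize() * f.evalLocal(e, points[k]);
    return sum;
}
```

For the nodal values {0, 1, 0} the function is a triangle of height 1 over [0, 1], and integrate returns its area, 0.5, up to rounding. Each quadrature point is addressed directly by (e, xi), which is the access pattern the text recommends for loops over all elements.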
If the Space module allows evaluating functions at points specified as elements and local element coordinates, the implementation of the numerical integration is considerably faster than when point-to-element location has to be performed. Consequently, we also provided the Space module with a function evaluation interface which accepts an element number and a point defined in the element's local coordinates.

3.4 The Operator module

As described previously, an operator G : V → W maps an element $x \in V$ to an element $z \in W$. The evaluation z = G(x) computes the coefficients $z_{ij}$ of z from the coefficients $x_{ij}$ of x, as well as from the bases $\{v_{ij}\}$ and $\{w_{ij}\}$ of V and W respectively. Besides the evaluation of G, derivatives such as the Jacobian operator DG of G are evaluated in a similar manner. Such derivatives are important in several applications. For example, they can be used to find a solution of G(z) = x with Newton's method.

The software implementation of the operator notion follows straightforwardly from the mathematical concepts introduced in Sect. 2. The implementation is done by the Operator module, which offers two services: evaluation of z = G(x), coded as G.eval(z,x), and evaluation of the Jacobian of G at a point y, z = DG(y)x, coded as G.getJ(y).eval(z,x). To evaluate z = G(x), the Operator module takes two Function objects z and x as input and computes the coefficients $z_{ij}$ using the coefficients $x_{ij}$ and the bases of the Space objects that z and x carry with them. It is important that both the 'input' x and the 'output' z of the Operator module are provided, since it is in this way that an Operator determines the spaces V and W, respectively. To evaluate z = DG(y)x, the Operator proceeds similarly. Internally, DG(y) is usually implemented as a coefficient matrix, and the operation DG(y)x is a matrix-vector multiplication. However, the implementation details are hidden from the user (DG(y)x may be computed element-wise, i.e. matrix-free), who works only with the Function and Operator mathematical notions.

Specific Operator implementations differ in the way they compute the above two evaluations. For example, a simple Diffusion operator z = G(x) may operate on a scalar function and produce a function z where $z_i = x_{i-1} - 2x_i + x_{i+1}$. A generic Linear operator may produce a vector of coefficients z = Ax, where A is a matrix. A Summator operator $z = G_1(x) + G_2(x)$ may take two inputs $G_1$ and $G_2$ and produce a vector of coefficients $z_i = [G_1(x)]_i + [G_2(x)]_i$. Note that the modules implementing the Linear and Summator operators actually have two inputs each. In both cases the function x is the first input, while the second is the matrix A for the Linear operator and the operators $G_1$ and $G_2$ for the Summator operator. These values could equally well be hard-coded in the operator implementation. In both cases, however, we see Operator as a function of a single variable x, as described in the mathematical framework.

3.5 The Solver module

We model the solving of G(z) = x by the module Solver in our software framework. Mathematically, Solver is similar to an operator S : W → V, where V and W are function spaces. The interface of Solver provides evaluation at functions $x \in W$, similarly to the Operator module. The implementation of the Solver evaluation operation z = S(x) should provide an approximation $z \approx G^{-1}(x)$.
However, Solver does not provide evaluation of its Jacobian, as this may be undesirably complex to compute in the general case. In practice, Solver takes as input an initial guess Function object x and an Operator object G. Its output z is such that G(z) = x. The operations performed by the solver are vector space operations, Operator evaluations, or evaluations of similar operators G(z). In the actual implementation, the latter are modelled by providing the Solver module with one or more extra inputs of type Solver. In this way, one can, for example, connect a nested chain of preconditioners to an iterative solver module.

The implementation of a specific Solver follows straightforwardly from its mathematical description. Iterative solvers such as Richardson, GMRES, and (bi)conjugate gradient, with or without preconditioners, are easily implemented in this software framework. As discussed in Sect. 2, the framework makes no structural distinction between a solver and a preconditioner; the sole difference is semantic. A solver is supposed to produce an exact solution of G(z) = x (up to a desired numerical accuracy), whereas a preconditioner is supposed to return an approximate one. Both are implemented as Solver modules, which allows easy cascading of a chain of preconditioners to an iterative solver as well as using preconditioners and solvers interchangeably in applications.

Furthermore, the framework makes no structural distinction between direct and iterative solvers. For example, an ILUSolver module is implemented to compute an incomplete LU factorisation of its input operator G. The ILUSolver module can be used as a preconditioner for a ConjugateGradient solver module. If the ILUSolver is not connected to the ConjugateGradient module's input, the latter performs unpreconditioned computations. Alternatively, a LUSolver module is implemented to provide a complete LU factorisation of its input operator G. The LUSolver can be used either directly to solve the equation G(z) = x, or as a preconditioner for another Solver module.

3.6 An object-oriented approach to the software framework

So far, the previous sections have outlined the structure of the proposed numerical software framework. This structure is based upon a few basic modules which parallel the mathematical concepts: Grid, Function, Space, Operator, and Solver. These modules provide their functionality via interfaces containing a small number of operations, such as the Operator's evaluation operation or the Grid's element-related services outlined previously. As stated at the beginning of this section, a large range of numerical problems can be modelled with these few generic modules. In order to capture the specifics of a given problem, such as the type of PDE to be solved or the basis functions of an approximation space, the generic modules have to be specialised. The specialised modules provide the interface declared by their class, but can implement it in any desirable fashion. For example, a ConjugateGradient module implements the Solver interface of evaluating $z = G^{-1}x$ by using the conjugate gradient iterative method.
The above architectural requirements are elegantly and efficiently captured by an object-oriented approach to software design [10, 12, 36, 41]. Consequently, we have implemented our numerical software framework as an object-oriented library written in the C++ language [45]. This design enabled us to naturally model the concepts of basic and specialised modules as class hierarchies. The software framework implements a few base classes: Grid, Function, Space, Operator, and Solver. These base classes declare the interface to their operations. The interface is then implemented by various specialisations of these base classes. An overview of the implemented specialisations follows:

– Grid: 2D and 3D grid generators for regular and unstructured grids, and grid file readers;
– Function: several specific functions $v_{ij}$ are generated, such as cosines, or piecewise (non-)conforming polynomial functions in several dimensions;
– Space: there is a single Space class, but a multitude of basis functions are implemented, as described further in Sect. 5;
– Operator: operators for several ODEs, PDEs, and non-linear systems have been implemented, such as Laplace, Stokes, Navier–Stokes, and elasticity problems. In addition, several operators for matrix manipulation and image processing have been implemented. For example, matrix sparsity patterns can be easily visualised, as in other applications like Matlab (Fig. 4j);
– Solver: a range of iterative solvers including bi-conjugate gradient, GMRES, GCGLS, QMR, etc. are implemented. Several preconditioners such as ILU are also provided as Solver specialisations, following the common treatment of solver and preconditioner modules described previously.

Besides the natural modelling of the mathematics in terms of class hierarchies, the object-oriented design allows users to easily extend the current framework with new software modules. Implementing a new solver, preconditioner, or operator usually involves writing only a few tens of lines of C++ to extend an existing one. The same approach also facilitates the reuse of existing numerical libraries such as LAPACK [2] or Templates [9] by integrating them in the current object-oriented framework.
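How a base class with a single evaluation operation can be specialised in a few lines is sketched below for the Diffusion and Summator operators of Sect. 3.4. The sketch works directly on coefficient vectors instead of Function objects, uses a scalar Scale operator as a stand-in for the generic Linear operator, and treats values outside the vector as zero in Diffusion; these simplifications and all class signatures are assumptions for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical base class declaring the evaluation interface z = G(x).
class Operator {
public:
    virtual ~Operator() = default;
    virtual void eval(Vec& z, const Vec& x) const = 0;
};

// Diffusion: z_i = x_{i-1} - 2 x_i + x_{i+1}; entries outside the vector
// are taken as zero (an assumption made here for the two end entries).
class Diffusion : public Operator {
public:
    void eval(Vec& z, const Vec& x) const override {
        const std::size_t n = x.size();
        Vec out(n);
        for (std::size_t i = 0; i < n; ++i) {
            const double left = i > 0 ? x[i - 1] : 0.0;
            const double right = i + 1 < n ? x[i + 1] : 0.0;
            out[i] = left - 2.0 * x[i] + right;
        }
        z = std::move(out);
    }
};

// Scale: z = a x, a scalar stand-in for the generic Linear operator z = A x.
class Scale : public Operator {
public:
    explicit Scale(double a) : a_(a) {}
    void eval(Vec& z, const Vec& x) const override {
        Vec out(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) out[i] = a_ * x[i];
        z = std::move(out);
    }
private:
    double a_;
};

// Summator: z = G1(x) + G2(x); the two operator inputs are fixed at
// construction, so eval remains a function of the single variable x.
class Summator : public Operator {
public:
    Summator(const Operator& g1, const Operator& g2) : g1_(g1), g2_(g2) {}
    void eval(Vec& z, const Vec& x) const override {
        Vec z1, z2;
        g1_.eval(z1, x);
        g2_.eval(z2, x);
        for (std::size_t i = 0; i < z1.size(); ++i) z1[i] += z2[i];
        z = std::move(z1);
    }
private:
    const Operator& g1_;
    const Operator& g2_;
};
```

A Summator built from a Diffusion and a Scale(2.0) maps the coefficients {1, 2, 3} to {2, 4, 2}. Since Summator is itself an Operator, it can in turn be used as an input to another Summator.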

4 Transient Navier–Stokes equations

This section examines the mathematical concepts underlying a NUMLAB solver for the transient Navier–Stokes equations. These concepts, (1)–(4) in Sect. 2, have been examined in Sects. 2.1–2.3 for small model problems suited to presentation purposes. Here, they are all worked out in relation to a single problem, the solution of the transient Navier–Stokes equations. Section 5 discusses the design of a NUMLAB application with the NUMLAB operators discussed here.

The transient Navier–Stokes equations have been chosen because the related finite element operators require a finite dimensional cross-product vector space V of basis functions, and because the transient formulation leads to differential algebraic equations (DAEs) and requires solution techniques related to ODEs. The DAE class of equations is non-trivial to solve, and common in industrial problems. Our claim (see Sect. 5 for details) is that NUMLAB provides a sophisticated framework for the integration of complex Navier–Stokes solvers, not that NUMLAB provides solvers better than those found in the literature.

First we examine the static problem. For a particular discretisation, we show that there exists a straightforward and lucid relation between the mathematical formulas and the NUMLAB software implementation: the NUMLAB implementation of F accomplishes the (numerical) integration required by the finite element method without the space V exposing its basis functions and element geometries to F. The static case is followed by the mathematical formulation of the transient problem. We demonstrate that (components of) the static problem operator F can be used in combination with all suitable time-integrators S, i.e. those suited for indefinite/stiff problems.
Due to the high degree of orthogonality between F, V and time-stepping methods, NUMLAB can and does offer a range of finite element types (higher order, as well as non-conforming Crouzeix–Raviart, see [17]) on rather arbitrary support geometries: simplices, parallelepipeds, prisms, etc. It facilitates and supports user-defined reference bases and geometries, as well as user-supplied geometries and grid generators. Existing applications do not have to be adapted for new bases and geometries, as long as all required mathematical conditions hold.

4.1 The Navier–Stokes equations

The incompressible Navier–Stokes equations describe an incompressible fluid with velocity u subject to forces f, in a region $\Omega \subset \mathbb{R}^2$ (two dimensions are assumed for the sake of brevity). The fluid velocities are $u = [u_1, u_2]$, and p denotes the pressure. The classical problem is to find sufficiently smooth (u, p) such that in $\Omega$:

$$\begin{cases} -\varepsilon \Delta u + u \nabla u + \nabla p = f , \\ \nabla \cdot u = 0 . \end{cases} \tag{24}$$

For the sake of demonstration, all boundary conditions are presumed to be of Dirichlet type (parabolic in/outflow profiles and no-slip along walls).

Problem (24) is discretised with the use of a finite element method. To this end, one first covers $\Omega$ by elements with the use of a grid generator module Grid (the construction and refinement of a suitable computational grid is a problem of its own, see for instance [21]). Then a triplet of finite dimensional (Hilbert) finite element spaces $V := V_1 \times V_2 \times V_3$ is chosen such that $V_1 \times V_2$ and $V_3$ satisfy the L.B.B. condition [3]. The NUMLAB implementation creates one Space module V, provided with three reference Basis modules. For the sake of presentation, quadratic conforming finite element bases are used for the velocities ($V_1$ and $V_2$), and a piecewise linear conforming finite element basis is used for the pressure ($V_3$).

Next, the equations (24) are multiplied by test functions $(v, q) \in V$, after which the first one is integrated by parts. This procedure results in a variational problem: find $x = [u_1, u_2, p] = (u, p) \in V$ such that for all $(v, q) \in V$

$$\begin{cases} \displaystyle\int_\Omega \varepsilon \nabla u : \nabla v - p\, \nabla \cdot v + (u \nabla u - f)\, v = 0 , \\[1ex] \displaystyle\int_\Omega (\nabla \cdot u)\, q = 0 . \end{cases} \tag{25}$$

Different finite element discretisations of (24), for instance an Oseen discretisation, are also possible. In order to facilitate the formulation of a NUMLAB application for our problem, system (25) is now reformulated into operator form: F(X) = 0.

The operator F related to the discrete variational formulation (25) has three components $F := [F_1, F_2, F_3]$, each related to one equation. For the definition of these components, first define $x = [x_1, x_2, x_3] := [u_1, u_2, p] \in V$, and set $z = [z_1, z_2, z_3] \in V$ (assume we use a Galerkin procedure). Recall that each vector function x is uniquely related to coefficients $x_{ij}$, which in turn are related to functions $v_{ij}$ from $\Omega$ to $\mathbb{R}$. The discrete Navier–Stokes operator, discretised in space, now is Z = F(X) with

$$\begin{aligned}
Z_{1j} &= [F_1(X_1, X_2, X_3)]_j = \int_\Omega \varepsilon \nabla x_1 \nabla v_{1j} - x_3\, \partial_x v_{1j} + (x_1 \partial_x x_1 + x_2 \partial_y x_1 - f_1)\, v_{1j} , \\
Z_{2j} &= [F_2(X_1, X_2, X_3)]_j = \int_\Omega \varepsilon \nabla x_2 \nabla v_{2j} - x_3\, \partial_y v_{2j} + (x_1 \partial_x x_2 + x_2 \partial_y x_2 - f_2)\, v_{2j} , \\
Z_{3j} &= [F_3(X_1, X_2, X_3)]_j = \int_\Omega (\partial_x x_1 + \partial_y x_2)\, v_{3j} .
\end{aligned} \tag{26}$$

Here, $x_1$ is the function related to the coefficients $X_1$, and so forth. It is evident, as stated earlier, that F uses the coefficients of x as well as the basis functions in order to compute the coefficients of the result z.

The integrals in (26) are computed support-wise, with the use of numerical integration, involving integration points $x_k$. As can be deduced from (26), this requires the values $v_{i,r}(x_k)$ and $\nabla v_{i,r}(x_k)$ as well as the values $x_i(x_k)$ and $\nabla x_i(x_k)$. In the NUMLAB code below, these values are returned in the arrays v(i)(k)(r), dv(i)(k)(r), x(i)(k), and dx(i)(k), respectively. The selection dv(i)(k)(r)(dY) returns the individual gradient component $\partial_y v_{i,r}(x_k)$. Define U1 = 0, U2 = 1, P = 2. The NUMLAB evaluation of z = F(x) and z = DF(x)*y for support e (typeset to fit this layout) is:

Operator z = F(x):

    z(U1)(j(U1)(r)) += qw(k)*
      (eps*dx(U1)(k)*dv(U1)(k)(r) -
       x(P)(k)*dv(U1)(k)(r)(dX) +
       (x(U1)(k)*dx(U1)(k)(dX) +
        x(U2)(k)*dx(U1)(k)(dY) -
        f1(qp(k))) * v(U1)(k)(r));
    z(U2)(j(U2)(r)) += qw(k)*
      (eps*dx(U2)(k)*dv(U2)(k)(r) -
       x(P)(k)*dv(U2)(k)(r)(dY) +
       (x(U1)(k)*dx(U2)(k)(dX) +
        x(U2)(k)*dx(U2)(k)(dY) -
        f2(qp(k))) * v(U2)(k)(r));
    z(P)(j(P)(r)) += qw(k)*
      ((dx(U1)(k)(dX) + dx(U2)(k)(dY)) * v(P)(k)(r));

Jacobian z = DF(x)*y:

    DF(U1)(U1)(j(U1)(r))(j(U1)(s)) += qw(k)*
      (dv(U1)(k)(s)*dv(U1)(k)(r) +
       v(U1)(k)(s)*dx(U1)(k)(dX) +
       x(U1)(k)*dv(U1)(k)(s)(dX));
    ...
    DF(P)(U2)(j(P)(r))(j(U2)(s)) += qw(k)*
      (dv(U2)(k)(s)(dY)*v(P)(k)(r));
    z = DF * y;

Both evaluation operations have an almost identical loop structure:

    V = x->getSpace();
    for (Integer e = 0; e < V->NElements(); e++) {
      V->fetch(e, j, v, dv, x, dx, .....);
      for (Integer i = 0; i < j.size(); i++)
        for (Integer r = 0; r < j(i).size(); r++)
          for (Integer k = 0; k < x(i).size(); k++)
            ...  // the += statements shown above
    }

The Jacobian has an extra inner loop over trial functions s. With regard to this implementation, several observations come to mind:


– First, F does not have the spaces V and W as input (i.e., as auxiliary variables). The spaces are obtained from the input/output variables. This technique simplifies computational networks.
– Secondly, because F performs numerical integration, it solely requires the values of (partial derivatives of) the basis functions at the quadrature points. The basis functions themselves are not required, so F operates orthogonally to V and W.
– Finally, the NUMLAB operator models the discrete Navier–Stokes equations in (25) in a convenient fashion. The software implementation is one-to-one with the mathematical syntax, and can in fact be automated.

In addition, recall that the derivative operator acts as the identity operator on variables related to Dirichlet points, which requires fetch to deliver the related information. This information is also required for non-homogeneous Neumann boundary conditions and Robin conditions.

4.2 The time discretisation

A transient version of the Navier–Stokes equations in the previous section can be formulated as so-called differential algebraic equations (DAEs):

$$\begin{cases} \dfrac{\partial}{\partial t} u = \varepsilon \Delta u - u \nabla u - \nabla p + f , \\[1ex] \nabla \cdot u = 0 , \end{cases} \qquad (t > 0) \tag{27}$$

with initial condition $u(0, c) = u_0(c)$ on $\Omega$ and boundary conditions $u(t, c) = u_1(c)$ for all $t \in [0, \infty)$ and $c \in \partial\Omega$. We now construct a non-linear NUMLAB time-step operator for a method-of-lines (MOL) discretisation of (27), which is implicit with respect to the constraint $\nabla \cdot u = 0$. For the sake of presentation, we use a rather basic time-step method, the θ-method, for the discretisation of the first (vectorial) equation in (27); recall that the constraint will be treated in an implicit manner below. In practice, for stiff problems (high Reynolds number), one would rather use a backward difference method. For $\theta \in [0, 1]$, the θ-method for a general non-linear system of ODEs

$$\frac{d}{dt} u(t) = E(t, u(t)) , \tag{28}$$

leads to the recursion:

$$u_0 = u(0) , \qquad u_{k+1} - u_k = h\theta\, E(t_k, u_k) + h(1-\theta)\, E(t_{k+1}, u_{k+1}) . \tag{29}$$

This all fits into the NUMLAB Operator style, if we define the time-step operator T, similar to (9), as follows:

$$T(u) := u - u^{(k)} - h\theta\, E\left(t_k, u^{(k)}\right) - h(1-\theta)\, E(t_{k+1}, u) . \tag{30}$$

In this manner, the Jacobian of T is positive definite for small h, if the Jacobian of E is, and the approximation $u^{(k+1)}$ of $u(t_{k+1})$ is a root of

$$T(u) = 0 . \tag{31}$$

For the solution of (27), we first discretise (27) in space, in a manner similar to how (24) was discretised to obtain (26), with a discrete solution as formulated in (11). This leads to a discretised version of (27):

$$\begin{cases} M \dfrac{d}{dt} X_1(t) = -F_1(X(t)) \\[1ex] M \dfrac{d}{dt} X_2(t) = -F_2(X(t)) \\[1ex] 0 = F_3(X(t)) , \end{cases} \tag{32}$$

subject to initial conditions on the two velocity components, $X_1(0) = g_0$ and $X_2(0) = g_1$. The operators $F_i$ are those defined in (26), and M is the mass matrix. Define $Y(t) = [X_1(t), X_2(t)]$, i.e., $X(t) = [Y(t), X_3(t)]$. Let the operator $E(X) := [-F_1(X), -F_2(X)]$, then (32) reduces to

$$\begin{cases} M \dfrac{d}{dt} Y = E(X) \\[1ex] 0 = F_3(X) , \end{cases} \tag{33}$$

with related initial and boundary conditions. Finally, with the application of the θ-method (30), the discrete solution $X^{(k+1)} = [Y^{(k+1)}, X_3^{(k+1)}]$ of (27) is a root of

$$G(X) := [W(X), F_3(X)] = [0, 0] , \tag{34}$$

where W(X) is defined by

$$M\left(Y - Y^{(k)}\right) - h\theta\, E\left(t_k, X^{(k)}\right) - h(1-\theta)\, E\left(t_{k+1}, X\right) . \tag{35}$$

Summarising (27)–(35), we have shown that each approximate solution $X^{(k)}$ of $X(t_k)$ must solve a non-linear system of equations G(X) = 0, which can be supplied to a NUMLAB non-linear solver.

Finally, some remarks and observations. First, the value which operator G attains at X is composed of the values which F attains at related points. Therefore, the Jacobian DG(X) can be formulated in terms of DF at related points. The NUMLAB implementation of time-steps exploits this: the basic Jacobian implementation of DG(X) is a sequence of call-backs to the Jacobians DF. It is pointed out that the saddle point problems related to (32) are hard to solve ([21, 40]).

5 Application design and use

The previous sections have presented the structure of the NUMLAB computational framework. It has been shown how new algorithms and numerical models can easily be embedded in the NUMLAB framework, due to its design based on a few generic mathematical concepts. This section treats the construction and use of numerical applications with the NUMLAB system.

As stated in Sect. 1, a numerical framework should provide an easy way to construct numerical experiments by assembling predefined components such as grids, problem definitions, solvers, and preconditioners. Next, one should be able to interactively change all parameters of the constructed application and monitor the produced results in a numerical or visual form. In short, we need to address the three roles of component development, application design, and interactive use for the scientific computing domain.

We have approached the above by integrating the NUMLAB component library in the VISSION system. VISSION is a general-purpose environment for VIsualisation and SImulation with Object-oriented Networks. The main feature of VISSION is its capability to load independently developed C++ component libraries and to display them in a visual, iconic form in its network editor user interface (Fig. 1a). The application designer can construct the desired computational or visualisation application by visually assembling the desired components in a dataflow network. VISSION automatically provides graphical user interfaces for all the loaded components, such as the example shown in Fig. 1b. Overall, VISSION provides code integration, application construction, and steering mechanisms similar to those of the AVS, IRIS Explorer, or Oorange environments, and generalises and simplifies their use for arbitrary component libraries written in C++ (as detailed in [46, 47]).

As NUMLAB is written as a C++ component library, its integration into VISSION was easy. Moreover, the structure of NUMLAB as a set of components that communicate by data streams in order to perform the desired computation matches VISSION's dataflow application model well. As no modification of the NUMLAB code was necessary, its integration in VISSION took only a few hours of work. Once all the NUMLAB components were integrated into VISSION, constructing numerical applications with interactive computational steering and visualisation was easily achieved by using VISSION's visual network construction and end user interaction facilities described above. We shall illustrate these with the Navier–Stokes problem discussed in the previous section.
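The dataflow model mentioned above, in which components are connected into a network and a parameter change triggers recomputation downstream, can be illustrated with a small sketch. It is not VISSION's implementation: the Module class, its pull-based update scheme and the restriction to scalar outputs are all assumptions chosen to keep the example short.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical dataflow module: caches its output and recomputes it only
// after it, or a module upstream of it, has been invalidated.
class Module {
public:
    using Compute = std::function<double(const std::vector<double>&)>;

    explicit Module(Compute compute) : compute_(std::move(compute)) {}

    // Connect the output of `source` to the next input port of this module.
    void connect(Module& source) {
        inputs_.push_back(&source);
        source.listeners_.push_back(this);
        invalidate();
    }

    // Mark this module and everything downstream of it as out of date,
    // e.g. after the user changed one of its parameters.
    void invalidate() {
        dirty_ = true;
        for (Module* m : listeners_) m->invalidate();
    }

    // Pull-based evaluation: bring the inputs up to date, then this module.
    double output() {
        if (dirty_) {
            std::vector<double> values;
            for (Module* in : inputs_) values.push_back(in->output());
            value_ = compute_(values);
            dirty_ = false;
        }
        return value_;
    }

private:
    Compute compute_;
    std::vector<Module*> inputs_;
    std::vector<Module*> listeners_;
    double value_ = 0.0;
    bool dirty_ = true;
};
```

In a usage scenario, a "grid" module whose compute function reads a mesh-size parameter feeds a "solver" module; after the parameter is changed and invalidate() is called on the grid module, the next call to the solver's output() recomputes both. The network is assumed to be acyclic.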

5.1 Navier–Stokes simulation

As outlined previously, numerical applications built with the NUMLAB components are actually VISSION dataflow networks. Figure 1a shows such a network built for the Navier–Stokes problem discussed in the previous section. The modules in the Navier–Stokes computational network in Fig. 1 are arranged in five groups. The functionality of these groups is explained in the following.

5.1.1 The computational domain. The first group contains modules which define the geometry of the computational domain. These modules accomplish three functions:
1. definition of the computational domain's contour;
2. definition of the reference geometric element;
3. mesh generation.

In our example, the computational domain is a rectangular region whose boundary is defined by the GeometryContourUnitSquareStandard module. This module allows the specification of the rectangle's dimensions, as well as a distribution of mesh points on the contour. Next, the GeometryGridUniformTriangle module produces a meshing of the rectangle into triangles. The reference triangle geometry is given by the GeometryReferenceTriangle module. The mesh produced by the GeometryGridUniformTriangle module conforms both to the reference element supplied as input and to the boundary points output by the GeometryContourUnitSquareStandard module. Different combinations of contour definitions, mesh generators, and reference elements are easily achieved by using different modules. In this way, 2D and 3D regular and unstructured meshes of various element types such as triangles, quadrilaterals, hexahedra, or tetrahedra can be produced. The produced mesh can be directly visualised or further used to define a computational problem.

Fig. 1. a Navier–Stokes simulation built with NUMLAB components. b User interface for the grid generator module

5.1.2 Function spaces. The second group contains modules that define the function space V over the computational domain. The modules in this group perform two functions:
1. definition of a set of basis functions $v_i$ that span V;
2. definition of V from the basis functions and the discretised computational domain.

The first task is done by the SpaceReferenceTriangleLinear and SpaceReferenceTriangleQuadratic modules, which define linear and quadratic basis functions, respectively, on the geometric triangles. The functions are next input into the Space module, which has already been discussed in the previous sections. The support of the basis functions is defined by the computational domain's discretisation, which is also input into Space. In our case, Space uses the quadratic basis function module twice and the linear basis function module once, as the 2D Navier–Stokes problem has two velocity components to be approximated quadratically and one linearly approximated pressure component.

An important advantage of the design of NUMLAB is the orthogonal combination of basis functions and geometric grids. Several other (e.g. higher order) basis function modules are provided as well, defined on different geometric elements. By combining them as inputs to the Space module, one can easily define a large range of approximation spaces for various computational problems. In the case of a diffusion PDE solved on a grid of quadrilaterals, for example, one would use a single SpaceReferenceQuadLinear basis function input to the Space module.

5.1.3 Operators and solvers. The third group contains modules that define the function F for which the equation F(x) = 0 is to be solved, as well as the solution method to be used.
This group thus contains specialisations of the Operator and Solver modules described in the previous sections. In our example, the discrete formulation (26) discussed in the previous section is implemented by the OperatorImplementationFiniteElementNavierStokes module. The Navier–Stokes problem is solved by a Newton solver implemented by the OperatorIteratorNonLinearNewtonDamped module. The linear system output by the Newton module is then solved by a conjugate gradient solver implemented by the OperatorIteratorLinearCGS module. The solution is accelerated by using an incomplete LU preconditioner OperatorIteratorLinearILU, which is passed as input to the conjugate gradient solver. Other problems can be readily modelled by choosing other operator implementations. Similarly, to use another solution or preconditioning method, a chain of Solver specialisations can be constructed. As solvers have an input of the same Solver type, complex solution algorithms can be built on the fly.

5.1.4 Functions. The fourth group contains specialisations of the Function module. These model the solution of a numerical problem as well as its initial conditions and other involved quantities such as material properties. In our example, the FunctionVector module holds both the velocity and pressure solution of the Navier–Stokes equation. The solution is updated at every iteration, as this module is connected to the solver module's output. As explained in the previous sections, a function is associated with a space. This is seen in the Function's input connection to the Space module. The solution of the problem is initialised by connecting the FunctionSymbolicBubble module to the FunctionVector's input. When the user changes the initial solution value, by changing an input of the FunctionSymbolicBubble module or by replacing it with another function, the network restarts the computations from this new value.

5.1.5 Visualisation. As presented in Sect. 1, a computational environment should provide extensive support for data visualisation and monitoring. Such support should cover the following:
– several dataset representations, such as structured, unstructured, curvilinear, rectilinear, uniform and locally refined grids, with several types of values defined per node or per cell (scalar, vector, tensor, colour, etc). Support for image datasets should be provided as well. Besides these discrete datasets, the possibility of defining continuous datasets (e.g. implicit functions) should also be taken into account.
– several dataset processing tools, such as dataset readers and writers for various data formats, filters producing streamlines, streamribbons, isosurfaces, warp planes, slices, dataset simplifications, feature extraction, and so on. Imaging operations should also be supported, such as image filtering, Fourier transforms, image segmentation, colour processing, etc.
– several visualisation primitives, such as 2D and 3D rendering of objects with various shading models, mapping scalars to colours via various colourmaps, direct manipulation of the viewed objects, interactive data probing and object picking, hard copy options, animation creation, and so on.

A second requirement is that the visualisation tools should be open for extension or customisation, as researchers often need to extend, adapt, optimise, or otherwise experiment with various visualisation algorithms and data structures. Writing such a library is clearly beyond the scope of a single person. Moreover, such libraries exist, offering various degrees of application domain specificity and numbers of components. In order to provide NumLab with the desired visualisation capabilities, we have integrated the Visualization Toolkit (VTK) [44] library into the VISSION environment. VTK is one of the most powerful freely available scientific visualisation libraries, with over 400 components for scalar, vector, and tensor visualisation, imaging, volume rendering, charting, and more. Similarly to NumLab, VTK is implemented as a set of C++ classes that specialise a few basic concepts such as datasets, filters, mappers, actors, viewers, and data readers and writers.

Returning to the Navier–Stokes simulation network of Fig. 1, we finally discuss the modules that provide visualisation facilities. The main module is the FunctionVTKViewer module group, which takes as input the current solution of the Navier–Stokes equation and the grid upon which it is defined. In VISSION, a module group represents a whole subnetwork of modules or groups which are treated as a single module. In our example, the FunctionVTKViewer module feeds the velocity and pressure solution components into various visualisation modules, such as streamlines and hedgehogs for the vector component, and colour plots and isolines for the scalar component. These modules are accessible to the interested user by double-clicking on the FunctionVTKViewer icon. Several other visualisation methods can easily be attached to the Navier–Stokes simulation by editing the contents of the FunctionVTKViewer module group. Keeping the visualisation back-end pipeline inside a single module group allows a natural separation of the computational network from the post-processing operations. This also helps to reduce the overall visual complexity of the network.

5.2 Navier–Stokes simulation steering and monitoring

Once the Navier–Stokes computational network is constructed, one can start an interactive simulation by changing the parameters of the various modules involved, such as mesh refinement, solver tolerance, or initial solution value. All the numerical parameters, as well as the parameters of the visualisation back-end, are accessible via the module interactors automatically created by VISSION (Fig. 1b). Moreover, the evolution of the intermediate solutions produced by the Newton solver can be interactively visualised. This is achieved by constructing a loop which connects the output of the OperatorIteratorNonLinearNewtonDamped module to its input. The module will then change the FunctionVector, and thus the visualisation pipeline downstream of it, at every iteration.
This allows one to interactively monitor the improvement of the solution at a given time step, and, if desired, to change other parameters in order to experiment with new solvers or preconditioners.
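The idea of passing one solver to another and of observing every intermediate iterate can be illustrated outside the environment with a small, self-contained Python sketch. This is not NumLab or VISSION code: the names `damped_newton` and `solve_2x2`, the backtracking rule and the 2x2 test problem are assumptions made purely for illustration. The nonlinear solver receives its linear solver as an argument, and it yields each iterate so that a consumer (here a print loop, standing in for a visualisation pipeline) can follow the convergence.

```python
import math
from typing import Callable, Iterator, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def solve_2x2(a: Matrix, b: Vector) -> Vector:
    """Direct 2x2 solve by Cramer's rule (stand-in for a linear solver module)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]


def norm(v: Vector) -> float:
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def damped_newton(
    f: Callable[[Vector], Vector],
    jac: Callable[[Vector], Matrix],
    x0: Vector,
    linear_solve: Callable[[Matrix, Vector], Vector],
    tol: float = 1e-10,
    max_iter: int = 50,
) -> Iterator[Tuple[int, Vector, float]]:
    """Yield (iteration, iterate, residual norm) after every Newton step."""
    x = list(x0)
    res = norm(f(x))
    yield 0, x, res
    for k in range(1, max_iter + 1):
        if res <= tol:
            return
        # The linear solver is an input of the nonlinear solver.
        step = linear_solve(jac(x), [-c for c in f(x)])
        lam = 1.0
        # Simple damping: halve the step until the residual decreases.
        while lam > 1e-4:
            trial = [xi + lam * si for xi, si in zip(x, step)]
            trial_res = norm(f(trial))
            if trial_res < res:
                break
            lam *= 0.5
        x, res = trial, trial_res
        yield k, x, res


if __name__ == "__main__":
    # Hypothetical test problem: x^2 + y^2 = 4 and x = y.
    f = lambda v: [v[0] ** 2 + v[1] ** 2 - 4.0, v[0] - v[1]]
    jac = lambda v: [[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]]
    for k, x, res in damped_newton(f, jac, [1.0, 0.5], solve_2x2):
        print(k, x, res)  # a viewer would redraw the solution here
```

Because the solver is a generator, the caller decides what to do between iterations (redraw, log, or stop early), which is one simple way to mimic the monitoring loop without coupling the numerical code to the display code.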

Figure 2 shows a snapshot from an interactive Navier–Stokes simulation. The simulation domain, shown meshed in Fig. 2a, consists of a 2D rectangular vessel with an inflow and an outflow. Both the inflow and the outflow have parabolic essential boundary conditions on the fluid velocity. The sharp obstacle placed in the middle of the container can be interactively manipulated by the end user by dragging its tip with the mouse anywhere inside the upper-left vessel picture. Once the obstacle’s shape is changed, the NumLab network re-meshes the new domain, recomputes the stationary solution for the Navier–Stokes simulation defined on this new domain, and displays the pressure and velocity solution (Fig. 2). Various other parameters, such as fluid viscosity, mesh refinement, and solver accuracy, can also be interactively controlled. The computational steering of the above problem proceeds at near-interactive rates for, e.g., 2000 elements on an SGI O2 R5000 machine. Consequently, such NumLab setups can be used for quick, interactive testing of the robustness and accuracy of various solvers, preconditioners, and mesh generators. For example, one can test the speed and robustness of an iterative solver for different combinations of obstacle size and shape, mesh coarseness, and fluid viscosity for the above problem. Alternatively, a fine mesh can be used for obtaining accurate solutions.

NumLab can also be used for solving large computational problems. In the following example, industrial glass pressing is considered. The process of moulding a hot glass blob pressed by a parison is simulated. The glass is modelled as a viscous fluid governed by the Navier–Stokes equations. The pressing simulation is a time-dependent process, where the size and shape of the computational domain are changed at every step, after which the stationary Navier–Stokes equations are solved on the new domain.
The flow equations can be solved on a two-dimensional cross-section of the glass, since the real 3D domain is axisymmetric. The simulation is analogous in many respects to the one previously presented. However, a mesh in the glass pressing simulation involves tens of thousands of finite elements, whereas the previous example used only a few hundred.

Fig. 2. Interactive Navier–Stokes simulation: domain, mesh, pressure, and velocity solution

J. Maubach, A. Telea

Fig. 3. 3D visualisation of glass pressing (top row). Pressure magnitude in 2D cross-section (bottom row)

Consequently, the latter simulation cannot be steered interactively. However, all computational parameters of the NumLab network involved can be interactively controlled at the beginning of the process, or between computation steps. Figure 3 shows several results of the glass pressing simulation. The first row depicts several snapshots of the 3D geometry of the moulded glass, reconstructed and realistically rendered in NumLab from the 2D computational domain. The second row in Fig. 3 shows fluid pressure snapshots taken during the 2D numerical simulation. The output of the NumLab visualisation pipeline can be connected to the MPEGCreator module. In this way, one can produce MPEG movies of the time-dependent simulation which can be viewed outside the VISSION environment as well. For the MPEGCreator module, we have actually reused the freely available code of the same module in the AVS system [49], by adding a simple C++ class interface to it. This reuse is typical for the open structure of the NumLab-VISSION combination.

The above has presented two computational applications built with the NumLab library in the VISSION system. Although different in terms of interactivity, computational complexity, and visualisation needs, these applications illustrate well the smooth integration of numerics, user interaction, and on-line visualisation that is achieved by embedding the NumLab library in the VISSION environment.

6 Conclusions and future work

The numerical laboratory NumLab was designed to address two categories of limitations of current computational environments. First, NumLab addresses the functional limitations of many computational systems by factoring out a few fundamental mathematical notions: vector functions x, spaces V, operators F on such spaces, and the implementation of the evaluation z = F(x). Based on these, all its (iterative) solution methods, preconditioners, time integrators, finite element/difference/volume operators, etc. are instances of approximate evaluations z ≈ F(x). Because roots are in general computed with the use of evaluations z = F(x), and sometimes evaluations involving Jacobians, NumLab extensions are simple. Its objects are close to the modelled mathematics. Secondly, NumLab is easy to extend and customise, and simple to use for a large application class. It provides interactive application construction, steering, and visualisation with its


Fig. 4. Visualisation of various numerical computations in the NumLab environment


network editor VISSION. In VISSION, but also outside it in compiled and interpreted programs, its numerical libraries can be intermixed with other visualisation, data processing, and data interchange libraries. NumLab separates its numerical libraries, the visualisation libraries VTK/OI, and the interaction and dataflow library VISSION. This makes their extension, maintenance, and understanding much easier than in systems where the above libraries are amalgamated in one (source) code. The scientific researcher who uses NumLab can easily focus on the testing of new mathematical algorithms and applications. This is shown by the large class of applications implemented in our framework (Fig. 4). Large datasets produced by computational fluid dynamics simulations are interactively visualised by, for instance, arrow plots (Fig. 4a), slices (Fig. 4b), or interactively placed stream tubes (Fig. 4c). Mathematical objects can be computed and visualised, such as scalar functions (the isosurface plot in Fig. 4d of a quadric function), or tensor functions (the hyperstreamline and the isosurface plots in Fig. 4e,f of a stress tensor caused by a point load on a semi-infinite domain). Various simulations have been implemented and interactively run, such as wave simulations (Fig. 4g), elasticity problems (Fig. 4h), and global illumination using the radiosity method (Fig. 4i), as described in [15, 16].

However flexible, the NumLab environment also has a number of conceptual and practical limitations, as follows. Conceptually, the network application model it uses can sometimes be unintuitive for its users. A NumLab network (see e.g. Fig. 1) directly reflects the object-oriented structure of the underlying C++ library. Understanding this structure and the role its fine-grained components play involves a certain learning curve for the end users. In many cases, the complexity of the networks can be hidden from the end user by the usage of groups (Sect. 5.1.5).
Overall, we believe that this extra complexity is a reasonable price to pay for the generic nature of the toolkit.

From the practical viewpoint, solving large PDE problems in NumLab is still slower than using specialised toolkits. This is due to the generic nature of the NumLab modules, which cannot make assumptions about specific data storage or discretisation properties provided by other modules (see e.g. Sect. 3.3). This problem can be tackled in several ways: implementing less generic (optimised) modules, reengineering the generic modules’ implementations to make more extensive use of data caching, or parallelising the numerical code, as outlined further in this section. A second limitation involves the need to program new Operator subclasses, e.g. to model new PDEs (see Sects. 3.4 and 4.1). A better approach would be to design generic Operators that accept their definition via a symbolic, interpreted notation. Implementing such generic Operators would raise the same efficiency problems outlined above.

Multigrid solvers can be used within the NumLab framework, but at present no special support is provided. In order to offer convenient multigrid solver modules, a few modules need to be extended, and others must be added. The grid and solution data types must be extended to handle internal stacks of grids and solutions. Restriction and prolongation require new modules, and application-specific preconditioners are desirable. The strong coupling between the grid and basic iterative solvers need not be a problem: operator evaluation can be grid based.

We plan to extend the NumLab library with even more modules, including readers and writers for the standards MathML and OpenMath [14]. Along with this, we plan to integrate a new technique for automatic, transparent parallelisation of all numerical code in NumLab.

References 1. Abram, G., Treinish, L.: An Extended Data-Flow Architecture for Data Analysis and Visualization. Proc. IEEE Visualization 1995, ACM Press, pp. 263–270 2. Anderson, E., Bai, Z., Bischof, C. et al.: LAPACK user’s guide. SIAM Philadelphia, 1995 3. Arnold, D.N., Brezzi, F., Fortin, M.: A stable finite element for the Stokes equations. Calcolo 21, 337–344 (1984) 4. Astheimer, P.: Sonification tools to supplement dataflow visualization. Scientific Visualization: Advances and Challenges, Academic Press 1994, pp. 251–263 5. Axelsson, O.: A generalized conjugate gradient, least square method. Numerische Mathematik 51, 209–227 (1987) 6. Axelsson, O., Maubach, J.: Global space-time finite element methods for time-dependent convection diffusion problems. Advances in Optimization and Numerical Analysis 275, 165–184 (1994) 7. Axelsson, O., Vassilevski, P.S.: Algebraic multilevel preconditioning methods I. Numerische Mathematik 56, 157–177 (1989) 8. Axelsson, O., Barker, V.A.: Finite Element Solution of Boundary Value Problems. Orlando, Florida: Academic Press 1984 9. Barret, R., Berry, M., Dongarra, J., Pozzo, R.: Templates for the Solution of Linear Systems, 2nd edn. SIAM, 1995 10. Booch, G.: Object-Oriented Analysis and Design. 2nd edn., Redwood City, CA: Benjamin/Cummings 1994 11. Bruaset, A.M., Langtangen, H.P.: A Comprehensive Set of Tools for Solving Partial Differential Equations: Diffpack. Numerical Methods and Software Tools in Industrial Mathematics, Daehlen, M., Tveito, A. (eds.), 1996 12. Budd, T.: An Introduction to Object-Oriented Programming. AddisonWesley 1997 13. Butcher, J.C.: The numerical analysis of ordinary differential equations: Runge–Kutta and general linear methods. Wiley 1987 14. Caprotti, O., Cohen, A.M.: On the role of OpenMath in interactive mathematical documents. J. Symbolic Comput. 32, 351–364 (2001) 15. Cohen, M.F. Wallace, J.R.: Radiosity and Realistic Image Synthesis. San Diego CA: Academic Press 1993 16. 
Cohen, M.F., Wallace, J.R., Chen, S.E., Greenberg, D.P.: A Progressive Refinement Approach to Fast Radiosity Image Generation. Computer Graphics (SIGGRAPH ’95 Proceedings), Vol. 22, No. 4 17. Crouzeix, M., Raviart, P.A.: Conforming and nonconforming finite element methods for solving the stationary Stokes equations, I. R.A.I.R.O. 3, 33–76 (1973) 18. Dembo, R.S., Eisenstat, S.C., Steihaug, T.: Inexact Newton methods. SIAM Journal on Numerical Analysis 19, 400–408 (1982) 19. Duclos, A.M., Grave, M.: Reference models and formal specification for scientific visualization. Scientific Visualization: Advances and Challenges. Academic Press 1994, pp. 251–263 20. Eisenstat, S.C., Walker, H.F.: Globally convergent inexact Newton methods. SIAM Journal on Optimization 4, 393–422 (1994) 21. Ervin, V., Layton, W., Maubach, J.: A Posteriori error estimators for a two level finite element method for the Navier–Stokes equations. Numerical Methods for PDEs 12, 333–346 (1996) 22. Gear, C.W.: Numerical initial value problems in ordinary differential equations. Prentice-Hall 1971 23. Gunn, C., Ortmann, A., Pinkall, U., Polthier, K., Schwarz, U.: Oorange: A Virtual Laboratory for Experimental Mathematics. Sonderforschungsbereich 288, Technical University Berlin. http://www-sfb288.math.tu-berlin.de/oorange/ OorangeDoc.html 24. Hackbusch, W., Wittum, G. (eds.): ILU algorithms, theory and applications, Proceedings Kiel 1992. In: Notes on numerical fluid mechanics 41. Vieweg 1993

25. Hackbusch, W., Wittum, G. (eds.): Multigrid Methods, Proceedings of the European Conference in: Lecture notes in computational science and engineering. Springer Verlag 1997 26. IMSL: FORTRAN Subroutines for Mathematical Applications, User’s Manual. IMSL 1987 27. Jackie, N., Davis, T., Woo, M.: OpenGL Programming Guide. Addison-Wesley 1993 28. The Java 3D Application Programming Interface: http://java.sun.com/products/java-media/3D/ 29. John, V., Maubach, J.M., Tobiska, L.: A non-conforming streamline diffusion finite element method for convection diffusion problems. Numerische Mathematik 78, 165–188 (1997) 30. Layton, W., Maubach, J.M., Rabier, P.: Robust methods for highly non-symmetric problems. Contemporary Mathematics 180, 265–270 (1994) 31. Margenov, S.D., Maubach, J.M.: Optimal algebraic multilevel preconditioning for local refinement along a line. Journal of Numerical Linear Algebra with Applications 2, 347–362 (1995) 32. Wolfram, S.: The Mathematica Book 4-th edition. Cambridge University Press 1999 33. Matlab: Matlab Reference Guide. The Math Works Inc. 1992 34. Mattheij, R.M.M., Molenaar, J.: Ordinary differential equations in theory and practice. Wiley 1996 35. Maubach, J.: Local bisection refinement for n-simplicial grids generated by reflections. SIAM Journal on Scientific Computing 16, 210–227 (1995) 36. Meyer, B.: Object-oriented software construction. Prentice Hall 1997 37. NAG: FORTRAN Library, Introductory Guide, Mark 14. Numerical Analysis Group Limited and Inc. 1990 38. Parker, S.G., Weinstein, D.M., Johnson, C.R.: The SCIRun computational steering software system. In: Arge, E., Bruaset, A.M., Langtangen, H.P. (eds.), Modern Software Tools for Scientific Computing. Switzerland: Birkhaeuser Verlag AG 1997, pp. 1–40

39. Ribarsky, W., Brown, B., Myerson, T., Feldmann, R., Smith, S., Treinish, L.: Object-oriented, dataflow visualization systems – a paradigm shift?. in Scientific Visualization: Advances and Challenges. Academic Press 1994, pp. 251–263 40. Roos, H.G., Stynes, M., Tobiska, L.: Numerical Methods for Singularly Perturbed Differential Equations. Springer Verlag 1996 41. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., Lorensen, W.: Object-Oriented Modelling and Design. Prentice-Hall 1991 42. Saad, Y., Schultz, M.H.: GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing 7, 856–869 (1986) 43. INRIA-Rocquencourt: Scilab Documentation for release 2.4.1., 2000 http://www-rocq.inria.fr/scilab/doc.html 44. Schroeder, W., Martin, K., Lorensen, B.: The Visualization Toolkit: An Object-Oriented Approach to 3D Graphics. Prentice Hall 1995 45. Stroustrup, B.: The C++ Programming Manual. Addison-Wesley 1993 46. Telea, A., van Wijk, J.J.: VISSION: An Object Oriented Dataflow System for Simulation and Visualization. Proceedings of IEEE VisSym ’99, Groeller, E., Ribarsky, W. (eds.), Springer 1999, pp. 95–104 47. Telea, A.: Combining Object Orientation and Dataflow Modeling in the VISSION Simulation System. Proceedings of TOOLS’99 Europe, Nancy 3–8 June 1999, IEEE Computer Society Press, 1999, Mitchell, R., Wills, A.C., Bosch, J., Meyer, B. (eds.) pp. 56–65 48. Todd, M.J.: The Computation of Fixed Points and Applications. Lecture Notes in Economics and Mathematical Systems 124. Springer Verlag 1976 49. Upson, C., Faulhaber, T., Kamins, D., Laidlaw, D., Schlegel, D., Vroom, J., Gurwitz, R., van Dam, A.: The Application Visualization System: A Computational Environment for Scientific Visualization. IEEE Computer Graphics and Applications, July 1989, 30–42. See also http://www.avs.com 50. Wernecke, J.: The Inventor Mentor: Programming Object-Oriented 3D Graphics with Open Inventor. 
Addison-Wesley 1993

Comput Visual Sci (2004) Digital Object Identifier (DOI) 10.1007/s00791-004-0143-2

Computing and Visualization in Science

Regular article

A study of motion recognition from video sequences

Xiang Yu, Simon X. Yang
Advanced Robotics and Intelligent Systems (ARIS) Lab, School of Engineering, University of Guelph, Guelph, N1G 2W1, Ontario, Canada (e-mail: [email protected])
Received: 15 March 2003 / Accepted: 4 January 2004
Published online: 17 August 2004 – © Springer-Verlag 2004
Communicated by: G. Wittum

Abstract. This paper proposes a method for recognizing human motions from video sequences, based on the cognitive hypothesis that there exists a repertoire of movement primitives in biological sensory motor systems. First, a content-based image retrieval algorithm is used to obtain statistical feature vectors from individual images. An unsupervised learning algorithm, the self-organizing map, is employed to cluster these shape-based features. Motion primitives are recovered by searching the resulting time series based on the minimum description length principle. Experimental results of motion recognition from a 37-second video sequence show that the proposed approach can efficiently recognize the motions, in a manner similar to human perception.

1 Introduction

The analysis of human actions by computer has attracted growing interest [3, 6, 7, 9, 12, 14, 16]. A significant part of this work is the recognition and modelling of human motions in video sequences, which provides a basis for applications such as human/machine interaction, humanoid robotics, animation, video database search, sports medicine, etc. For human/machine interaction, it is highly desirable that the machine can understand the human operator’s action and react correspondingly. The remote control of a camera view is a good example. An operator monitors a site by watching through a remote camera. The camera control system detects the movement of the operator’s head and eyes, estimates the operator’s point of interest and automatically changes the camera’s view to that point, so that the operator can see what he is interested in simply by turning his head and eyes. The recognition of human motions is also important for humanoid robotics research. For example, imitation is a powerful means of skill acquisition for humanoid robots, i.e., a robot learns its motions by understanding and imitating the action of a human model [9]. Another application is video database search. The increasing interest in the understanding of action or behaviour has led to a shift from static images to video sequences in computer vision.

A characteristic of the work on motion recognition is that it deals with a time series of video signals [12, 19]. A video sequence consists of many frames, which are the individual static images and generally the smallest unit we are concerned with. A contiguous set of frames representing a continuous action in time and space is called a shot. Basically, video segmentation, also called video recognition, is the process of dividing a video sequence into its component shots.

A conventional solution to human motion recognition is based on a kinematics model. For example, Sidenbladh et al. [16] introduced a human body model in which the human body is represented by a collection of articulated limbs. In this case, an action is considered as a collection of time series describing the joint angles as they evolve over time. Then, the motion recognition task can be simplified, probably oversimplified, to a parameter estimation problem, i.e., estimating the joint angles over the image series. One problem of this approach is how to decompose a time series into suitable temporal primitives in order to model these body angles.

Hidden Markov models (HMMs) have been widely used for the recognition of human action. Bregler [3] proposed a probabilistic decomposition of human dynamics at multiple abstraction levels. At the low level, EM (expectation-maximization) clustering is used to find coherent motion primitives. The middle level categories are simple movements represented by dynamic systems, and the high level complex gestures are represented by HMMs. The weakness of HMMs for modelling is that they do not capture well some of the intrinsic properties of biological motion, such as smoothness. Instead, human motions are often represented by explicit temporal curves that describe the change over time of 3D joint angles.
In [3], the topology of the HMMs is obtained by learning a hybrid dynamic model on periodic motion, and it is difficult to extend to other types of motion, which could be more complex. Moreover, manually segmenting and labelling training data, as required by supervised learning approaches, is a tedious and error-prone process.

In this paper, we propose an unsupervised learning based approach to model and represent human motion in video sequences, as illustrated in Fig. 1. The basic idea comes from psychophysical and neuroscientific evidence that motor

X. Yu, S.X. Yang

Fig. 1. The system scheme of the proposed method for motion discovery from video sequence

primitives for structuring movement exist in human cognitive systems, which constitute a motor vocabulary [9]. We use a self-organizing map to cluster image sequences to form motor primitives. In a computational sense, primitives can be viewed as a basis set of motor programs that are sufficient, through combination operators, for generating the entire movement repertoire. After the clustering, we apply a substructure discovery algorithm introduced by Cook and Holder [4] to find the motion primitives. The algorithm is based on the minimum description length principle. In addition, other knowledge about segmentation can be used to guide the search toward more appropriate substructures.

In the following sections, the perceptual model of motion primitives is first discussed. Then, the structure and learning algorithm of a self-organizing map are introduced in Sect. 3, followed by its application to video processing. Section 4 presents the basic idea of the recovery of primitives from the video sequence by searching. Section 5 describes the experiments and results. The final section gives some concluding remarks.

2 Motion primitives

Cognitive studies of human motion systems demonstrate that human beings tend to choose from a limited but possibly large repertoire of movement primitives when they take actions. Meanwhile, biological experiments carried out on vertebrates, e.g., frogs and rats, show convincing evidence of the existence of dedicated areas in the spine for specific movements [1]. Complete movements, such as reaching and wiping, could be produced by potentiating an electrode in different regions of the spine of spinalized frogs. Similar movements are observed from different frogs for similarly positioned activations of the spine. Further, it is observed that supra-spinal inputs from the brain co-activate to superimpose multiple primitives in an additive fashion, resulting in an even larger repertoire of meaningful movements.
All this evidence suggests that there exists a primitive set of actions in the motion planning system of vertebrates. The perceptual model of motion primitives has been applied to various studies on tracking human motion [16], learning by imitation [18], etc. In [16], a model of the human body is defined by a set of joint parameters, e.g., as shown in Fig. 2. Basically, the human body is simplified to a set of cylinders. Then, motions can be represented by a time series of the parameters. In [18], the primitives are used as a basis set of motion, serving as a vocabulary for classifying and imitating observed movements.

Fig. 2. A human body model based on motion primitives. It consists of a set of articulated cylinders with 25 degrees of freedom (DOF). Each limb i has a local coordinate system with the $Z_i$ axis directed along the limb. Joints have up to 3 DOF, expressed as relative rotations $(\theta_x^{j,i}, \theta_y^{j,i}, \theta_z^{j,i})$ between body parts i and j

The existence of a primitive set for motion generation leads to a direct solution to motion perception. Instead of extracting an explicit description of the trajectory of a limb anew for each action, video sequences are processed to derive the primitives and to build up a primitive set [1, 9], a so-called primitive vocabulary. In a computational sense, the primitive vocabulary can be viewed as a basis set of motor programs that are sufficient, through combination operators, for generating the entire movement repertoire. Then, in the recognition phase, actions are extracted from a video sequence by looking them up in the vocabulary. The major task of this action recognition system is therefore how to build up the primitive set.

3 Clustering of the self-organizing map

A key difficulty of a primitive-based approach is how to find and define the primitives. We propose an approach based on a self-organizing map. First, edge images are symbolized by clustering, whereby the video sequence is converted into a long symbolic series. Then, primitive actions are discovered by using a substructure searching algorithm.

Self-organizing maps (SOMs), also called Kohonen’s feature maps, are a type of neural network capable of providing a powerful unsupervised learning tool for the visualization and abstraction of high-dimensional data [10]. A SOM converts complex, nonlinear relationships among high-dimensional data items into simple geometric relationships in a low-dimensional space that is easy to display. As a result, information abstraction is achieved by compressing the information while preserving the most important topological and metric relationships of the primary data. There are many SOM algorithms, utilizing these two aspects for a wide range of applications such as process analysis, machine perception, control, and communication.

The SOM usually consists of a 2-dimensional (2D) grid of nodes, each of which represents a model of some observation. Basically, the SOM can be regarded as a similarity graph, or a clustering diagram [17]. After a nonparametric


and unsupervised learning process, the models are organized into a meaningful order in which similar models are closer to each other than dissimilar ones. The basic criterion for training a self-organizing map is the so-called “winner take all” principle, i.e., a winner node is selected and only this winner node will have a chance to learn its weights. Further, in order to organize a map with cooperation between nodes, the learning area is expanded to a kernel around the winner node, with the learning rate linearly decreasing from the winner node to nodes on the boundary of the kernel. Then, the learning is performed in such a way that the reference vector represented by these weights is moved closer to the input pattern. Denoting by $m_i$ the weight vector of the ith node and by $x$ an input, the learning process is [10]

$$ m_i(t+1) = m_i(t) + h_{c,i} \left( x_t - m_i(t) \right), \qquad (1) $$

where $h_{c,i}$ is a decreasing function defined as follows:

$$ h_{c,i} = \alpha(t) \exp\left( - \frac{\| r_i - r_c \|}{2\delta^2(t)} \right), \qquad (2) $$

where 1 > α(t) > 0 is the learning rate, a monotonically decreasing function of time; $r_i$ and $r_c$ are the vector locations of the ith node and the cth (winner) node, respectively; δ(t) is the width of the kernel function, which defines the size of the learning neighbourhood and decreases monotonically with time; and ||·|| denotes the Euclidean distance.

As a powerful clustering diagram, the SOM has attracted intensive attention from researchers in image processing. For example, PicSOM [11], a framework for content-based image retrieval, uses a self-organizing map to organize similar images in nearby neurons, building up a representation of an image database with similar images located near each other. With a number of tree-structured SOMs trained on different image features, the system yields an efficient retrieval of images similar to a set of reference images when one browses the image database. It has been suggested that SOMs provide an efficient tool for both low level image quantification and high level feature localization [5].

As shown in Fig. 3, a SOM usually consists of a 2D regular grid of nodes. A model of some observation is associated with each node. Before the training, as shown in the left part, all models are randomly located in the map. The right part of Fig. 3 illustrates the self-organization effect, i.e., similar images are clustered together, being associated with neighbouring nodes in the map.

In this paper, we use a SOM to cluster images based on shape features. The objective is to find a representation of a video sequence that illustrates the property of an action as a time series. After training, the SOM will generate a label for each input image, converting a video sequence to a label series. Then, a searching process for motor primitives is applied to construct a primitive vocabulary. In order to use the SOM, the images need to be transformed into feature vectors. In our approach, we use a shape-based Fourier feature [2].
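Before turning to the feature extraction itself, the update rule of Eqs. (1) and (2) and the conversion of frames into a label series can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the linear schedules chosen for α(t) and δ(t), the rectangular grid, and the random initialisation are all hypothetical choices. The kernel uses the distance exactly as printed in Eq. (2); the form commonly given by Kohonen squares this distance.

```python
import math
import random
from typing import List, Sequence

Vector = List[float]


def best_matching_unit(weights: Sequence[Vector], x: Sequence[float]) -> int:
    """Index c of the winner node, i.e. the weight vector closest to x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))


def train_som(data: Sequence[Vector], rows: int, cols: int,
              epochs: int = 30, alpha0: float = 0.5, seed: int = 0) -> List[Vector]:
    """Train a rows x cols map with the update of Eqs. (1)-(2)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # r_i: grid location of node i (rectangular grid assumed here).
    grid = [(float(r), float(c)) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    delta0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in rng.sample(list(data), len(data)):  # shuffled presentation
            frac = t / total
            alpha = alpha0 * (1.0 - frac)            # assumed schedule, 1 > alpha > 0
            delta = max(0.5, delta0 * (1.0 - frac))  # assumed shrinking kernel width
            c = best_matching_unit(weights, x)
            for i, m in enumerate(weights):
                # Eq. (2): neighbourhood kernel around the winner c.
                h = alpha * math.exp(-math.dist(grid[i], grid[c]) / (2.0 * delta ** 2))
                # Eq. (1): move the weight vector towards the input.
                for d in range(dim):
                    m[d] += h * (x[d] - m[d])
            t += 1
    return weights


def label_series(weights: Sequence[Vector], frames: Sequence[Vector]) -> List[int]:
    """Replace every frame feature vector by the index of its winner node."""
    return [best_matching_unit(weights, x) for x in frames]
```

Calling `label_series(train_som(features, 4, 4), features)` on a list of per-frame feature vectors would return one integer per frame; it is this kind of symbol sequence that a subsequent primitive search operates on.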
First, the image area is normalized such that the aspect ratio is maintained. Prewitt edge images are computed for the normalized frames. The edge image contains the most relevant shape information and the discrete Fourier transform can be used to describe it. The Fourier transform is computed for the normalized image using the

Fig. 3. An illustration of the map for image clustering. Each node in the hexagonal grid holds an image of a three-section pie. Left: the initial map before training. After a random initialization of the connection weights, all images are randomly located in the map. Right: the map after training. Images are clustered according to similarity. Notice that neighbouring models are mutually similar in this map

discrete fast Fourier transform (FFT) algorithm. Then, the magnitude image of the Fourier spectrum is low-pass filtered and thereafter decimated.

4 Searching for primitives

After symbolizing the video sequences, computational cost is the key concern for the primitive searching algorithm. An exhaustive search results in exponentially increasing complexity. Fortunately, exhaustive searching is not necessary when we consider the nature of the actions. Basically, we can describe an action as a transfer from one pose to another. A pose corresponds to a series of images that do not change significantly. Therefore, the whole search space can be divided into multiple spaces by detecting the poses. Further, by using the minimum description length principle, the repetitive substructures, i.e., the primitives, are identified. The rationale behind the minimum description length principle, introduced by Rissanen in 1978 [13], is that the best model for a data set is the one that minimizes the description length of the data set. Cook and Holder [4] applied the principle to identify repetitive substructures in structural data as a basic knowledge discovery approach. By replacing previously discovered substructures in the data, a hierarchical description of the structural regularities in the data is produced.

For a given video sequence, the trained relation maps all individual images onto a 2D network of neurons. Considering the time order of all images in the video, the video sequence forms tracks (paths) on the SOM. These tracks represent substructures in the video sequence that appear repeatedly. As described below, we propose an algorithm to discover these substructures. Consider an N × M SOM with P = N × M neurons. Feeding a video sequence into the map results in a directional map. Denote the P neurons as a symbol set $S = \{S_1, S_2, \ldots, S_P\}$. The directional map can be identically

X. Yu, S.X. Yang

represented by a symbol series. Due to the cooperative learning of the SOM, neurons within a neighbourhood represent images that are similar to each other. Meanwhile, the global competitive learning among neurons drives different images to different neurons. As a result, motion primitives in the video sequence are encoded as repetitive substructures in the symbol series.

In order to represent the direction of connections, we build a transition matrix, $T_{P \times P}$, with its rows and columns corresponding to the start and end points of a connection, respectively. Specifically, T(i, j) is the number of times a track from $S_i$ to $S_j$ is observed. The transition matrix $T_{P \times P}$ is computed by scanning the entire symbol series. Then, the maximal element of the matrix is located and two searching processes, which we call forward and backward, are designed to find a primitive. Assume the maximal element is T(n, m), standing for a connection from $S_n$ to $S_m$. The forward search finds the neuron from which the connection to $S_n$ is the strongest. Similarly, the backward search finds the neuron to which the connection from $S_m$ is the strongest. A threshold is defined on the connection strength to terminate the search, i.e., the connection is cut when it is weak. The above process is repeated over the entire matrix. As a result, motion primitives that are represented by repetitive substructures in the symbol series are discovered.

Figure 4 illustrates the process described above with a 3 × 3 SOM. A video sequence is fed into the map, which yields the directional map shown in Fig. 4. Denote the 9 neurons as A, B, C, . . . , I. Then, the directional map can be identically represented by a time series of the symbols A–I, i.e., ABCHIADADHIFABCADABCEHI. The transition matrix for this symbol series is shown in Fig. 5. In this simplified case, several elements share the maximal value of 3. First, we set the threshold for the connection strength to 2.
Consider the first element T(A, B) with a value of 3. The forward search looks for a neuron from which the connection to A is the strongest. The result is none because of the threshold of 2 (the strongest connection into A, from D, has a value of 2, which does not exceed the threshold). This means that A is the starting point of the current primitive. The backward search looks for the neuron to which the connection from B is the strongest, and we get neuron C. After that, the search is terminated because no neuron receives a strong connection from C (the maximal value in the C row is 1, which is below the threshold). Thus, we obtain the primitive (ABC). Following similar processes, we obtain (AD) and (HI).

In summary, the proposed algorithm is described as follows.

(1) Convert the 2D N × M SOM into an identical representation as a 1D series of symbols, $\{S_1, S_2, \ldots, S_P\}$. The series length, P, equals the number of neurons in the SOM.
(2) Create the transition matrix $T_{P \times P}$. Compute T(i, j) as the number of times a track from $S_i$ to $S_j$ is observed.
(3) Find the maximal element of T. Denote it as T(i′, j′). It represents a track from $S_{i'}$ to $S_{j'}$.
(4) Fetch the j′th row of T and find the maximal element of this row. The corresponding symbol is the next symbol after $S_{j'}$.
(5) Set the elements whose symbols have been tracked to zero. Then repeat Steps (4)–(5) until the current maximal element is less than half of the first maximal element.

Fig. 4. The directional graph obtained by feeding a video segment to a 3 × 3 SOM. The SOM has 9 neurons on the map, denoted as A, B, . . . , I, respectively. An arrow between two neurons, e.g., A and B, shows that an image corresponding to neuron A is exactly followed by an image of B in the video sequence

(6) After finding the global maximal element, the other process finds the previous symbol by fetching the i′th column of T and finding its maximal element. This process is also repeated until the current maximal element is less than half of the first maximal element.
(7) Repeat Steps (3)–(6) until there is no element larger than half of the first maximum.

The obtained sub-series of symbols represent the so-called primitives of motion. A new symbol can be defined for each primitive. Then, the whole video sequence can be represented by using these symbols, resulting in a concise representation of the video sequence.

The computational complexity of the proposed searching algorithm is proportional to the length of the video sequence and the size of the transition matrix. Suppose the length of the video sequence is L and the size of the transition matrix is $P^2$. To build the transition matrix, the scanning of the symbol series involves L comparison operations. Then, the computation to find primitives is basically a global sorting of the $P^2$ elements of the transition matrix. Because the length of a primitive is normally much less than that of the video sequence, the computation for the forward and backward searches is negligible in comparison to the computation for the transition matrix. Therefore, the total computational complexity is estimated as $O(L + P^2)$. In addition, the determination of the

Fig. 5. The transition matrix for the time series of “ABCHIADADHIFABCADABCEHI”. The transition graph is shown in Fig. 4. The number of “*” for an element of T(i, j ) represents the number of observations of the connection between Neuron i and Neuron j
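The worked example of Figs. 4 and 5 can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `transition_counts` and `find_primitives` are hypothetical, a dictionary of counts stands in for the P × P matrix, ties between equally strong connections are broken alphabetically, and a connection is followed only when its count strictly exceeds the threshold (which is what the example implies, since the connection from D to A with a count of 2 is rejected at a threshold of 2). The sketch follows the fixed threshold of the example rather than the "half of the first maximum" rule of the step list.

```python
from collections import Counter


def transition_counts(series):
    """Count how often each symbol is directly followed by another symbol."""
    return Counter(zip(series, series[1:]))


def find_primitives(series, threshold):
    """Greedy search for primitives in a symbol series.

    Assumption: a connection is kept only if its count is strictly
    greater than `threshold`; ties are broken alphabetically.
    """
    counts = transition_counts(series)
    primitives = []
    while counts:
        # Strongest remaining connection (seed of a new primitive).
        seed = max(sorted(counts), key=counts.get)
        if counts[seed] <= threshold:
            break
        chain = list(seed)
        del counts[seed]  # a tracked connection is not used again

        # Extend at the end: strongest connection leaving the last symbol
        # (called the "backward" search in the text).
        while True:
            outgoing = sorted(e for e in counts if e[0] == chain[-1])
            if not outgoing:
                break
            edge = max(outgoing, key=counts.get)
            if counts[edge] <= threshold:
                break
            chain.append(edge[1])
            del counts[edge]

        # Extend at the start: strongest connection entering the first symbol
        # (called the "forward" search in the text).
        while True:
            incoming = sorted(e for e in counts if e[1] == chain[0])
            if not incoming:
                break
            edge = max(incoming, key=counts.get)
            if counts[edge] <= threshold:
                break
            chain.insert(0, edge[0])
            del counts[edge]

        primitives.append("".join(chain))
    return primitives


series = "ABCHIADADHIFABCADABCEHI"
print(transition_counts(series)[("A", "B")])  # count of the A -> B connection
print(find_primitives(series, threshold=2))
```

With the symbol series of Fig. 5 and a threshold of 2, the sketch returns the three primitives named in the text, (ABC), (AD) and (HI).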

A study of motion recognition from video sequences

map size P is made so as to obtain an even distribution of all images on the map, as discussed later. For instance, we have P ∼ L/5 in our simulation. In contrast, the complexity of an exhaustive search for primitives is estimated as $O(L!)$. The proposed searching algorithm is therefore much more efficient than an exhaustive search.

5 Simulations

A web camera is used to capture a video sequence of a hand clicking on a mouse. With a resolution of 320 × 240 and a frame rate of 15 frames/second, a 37-second sequence with 555 frames is used to test the proposed approach.

5.1 Pre-processing of the video sequence

After we convert the video sequence into individual image files, the MATLAB Image Toolbox is used to compute shape-based Fourier feature vectors, to which the SOM can be applied for clustering. Certainly, the feature extraction algorithm is important for the performance because the clustering of the SOM is based on an efficient representation of images by the feature vectors. However, this is not the focus here. We choose the following algorithm for feature extraction from the literature [2]. First, the images are normalized to 512 × 512. The Prewitt method is used to compute the edge image. Then, an 8-point FFT is calculated. The resulting Fourier spectrum is low-pass filtered and decimated by a factor of 32, resulting in a 128D vector for each image. This algorithm is used because it is reported to be the most effective one in [2]. However, these shape-based Fourier features are by no means recommended for a generalized application of the proposed motion recognition method, because they do not represent local changes/movement well, which are generally important for capturing the motion characteristics. More discussion of the feature selection is presented in the final section.
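As an illustration of the edge-detection step only, the following Python sketch computes a Prewitt gradient magnitude on a small greyscale image stored as a list of rows. It is not the MATLAB routine used in the experiments: the function name `prewitt_magnitude` is hypothetical, border pixels are simply dropped, and no thresholding is applied. The normalization, FFT, low-pass filtering and decimation steps are not reproduced here because the text does not give enough detail to do so faithfully.

```python
import math


def prewitt_magnitude(image):
    """Prewitt gradient magnitude on the interior of a 2D image.

    `image` is a list of equally long rows of numbers. The result has
    two rows and two columns fewer than the input (borders are skipped).
    """
    height, width = len(image), len(image[0])
    result = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            # Horizontal gradient: right column minus left column, summed over 3 rows.
            gx = sum(image[r + dr][c + 1] - image[r + dr][c - 1] for dr in (-1, 0, 1))
            # Vertical gradient: lower row minus upper row, summed over 3 columns.
            gy = sum(image[r + 1][c + dc] - image[r - 1][c + dc] for dc in (-1, 0, 1))
            row.append(math.hypot(gx, gy))
        result.append(row)
    return result


# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 1, 1] for _ in range(4)]
print(prewitt_magnitude(step))
```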

Fig. 6. Illustration of the sample distribution on the trained SOM. There are 555 samples in total. The map is a 12 × 12 grid. The bar height demonstrates the number of samples that take the current node as the best-matching unit. This figure is used to help to determine the map size. Basically, we expect an even distribution of all samples over the map

5.2 Clustering by SOM

The feature vectors obtained in the above process are fed into a 12 × 12 SOM for clustering. As there is no prior knowledge for selecting the number of neurons, we apply a simple rule to guide the selection, i.e., an even distribution of samples/features over the whole map. Basically, too large a map fails to discover any similarity among samples, while too small a map may mix everything together. By monitoring the sample distribution, as shown in Fig. 6, we heuristically choose a structure with 12 × 12 neurons. The learning rate function α(t) is chosen as α(t) = a/(t + b), where a and b are chosen so that α(t = T) = 0.01 α(t = 0), where T is the time interval and α(t = 0) = 0.05. The kernel width δ(t) is set to be a linear function that changes from $\delta_{ini} = 12$ to $\delta_{final} = 1$. In particular, $\delta(t) = (\delta_{final} - \delta_{ini})\,t/T + \delta_{ini}$.
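The schedules above, together with the update rule of Eqs. (1) and (2), can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the experiments: the function names are hypothetical, weights are plain lists of floats, and the constants a and b are obtained by solving α(0) = a/b = 0.05 and α(T) = a/(T + b) = 0.01 α(0), which gives b = 0.01 T/0.99 and a = 0.05 b.

```python
import math


def learning_rate(t, T, alpha0=0.05, ratio=0.01):
    """alpha(t) = a / (t + b) with alpha(0) = alpha0 and alpha(T) = ratio * alpha0."""
    b = ratio * T / (1.0 - ratio)
    a = alpha0 * b
    return a / (t + b)


def kernel_width(t, T, delta_ini=12.0, delta_final=1.0):
    """Linear decrease of the kernel width from delta_ini to delta_final."""
    return (delta_final - delta_ini) * t / T + delta_ini


def som_step(weights, positions, x, t, T):
    """Apply Eq. (1) once to every node and return the winner index c.

    weights   -- list of weight vectors m_i (lists of floats), updated in place
    positions -- list of grid locations r_i, one per node
    x         -- input feature vector at time t
    """
    # Winner: the node whose weight vector is closest to the input.
    c = min(range(len(weights)), key=lambda i: math.dist(weights[i], x))
    alpha = learning_rate(t, T)
    delta = kernel_width(t, T)
    for i, m in enumerate(weights):
        # Neighbourhood function of Eq. (2).
        h = alpha * math.exp(-math.dist(positions[i], positions[c]) / (2.0 * delta ** 2))
        weights[i] = [mk + h * (xk - mk) for mk, xk in zip(m, x)]
    return c


T = 1000
print(learning_rate(0, T), learning_rate(T, T))
print(kernel_width(0, T), kernel_width(T, T))
```

After training, the winner index returned for each frame can serve as the label of that frame, which is how a video sequence becomes a label series.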

Fig. 7. Sample distribution on the trained SOM. The horizontal axis represents neurons, while the vertical axis is the number of samples that is clustered into the corresponding neuron
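A sample distribution of this kind can be tabulated from the best-matching-unit labels with a short Python sketch. The function names and the summary statistics (number of empty nodes, largest and mean count) are assumptions chosen for illustration; the text only asks for an approximately even distribution and does not define a numerical criterion.

```python
from collections import Counter


def hit_histogram(labels, n_rows, n_cols):
    """Number of samples whose best-matching unit is each (row, col) node."""
    counts = Counter(labels)
    return [[counts.get((r, c), 0) for c in range(n_cols)] for r in range(n_rows)]


def distribution_summary(hist):
    """Simple indicators of how evenly the samples are spread over the map."""
    flat = [v for row in hist for v in row]
    return {
        "empty": sum(v == 0 for v in flat),  # nodes that attract no sample
        "max": max(flat),                    # most crowded node
        "mean": sum(flat) / len(flat),       # average samples per node
    }


# Toy example on a 2 x 2 map; labels are (row, col) pairs, one per frame.
labels = [(0, 0), (0, 0), (0, 1), (1, 1)]
hist = hit_histogram(labels, 2, 2)
print(hist)
print(distribution_summary(hist))
```

For the 555 frames and the 12 × 12 map used here, the mean is 555/144, i.e., a little under four samples per node.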

Fig. 8. The unified distance matrix of the trained SOM. The distance value is illustrated by the grey level, as shown by the bar on the right. A high value means a large distance between neighbouring map units, and thus indicates the cluster boundary


As shown in Fig. 8, the so-called U-matrix shows the unified distances between neighbouring units and thus visualizes the cluster structure of the map. Note that the U-matrix visualization has many more hexagons than the component planes. This is because hexagons are inserted between map units to show the distance between neighbouring neurons, while a hexagon in the position of a map unit illustrates the average of the surrounding values. A high value on the U-matrix means a large distance between neighbouring map units, and thus indicates a cluster boundary. From the classification point of view, clusters are typically uniform areas of low values. Figure 9 presents a better view of the distance matrix as a 3D surface plot, where the X and Y coordinates correspond to the position of the neurons and the Z coordinate represents the distance. The high-value areas correspond to some distinct poses in an action, while the low-value areas mean a cluster of slowly changing poses. In the experiment, a typical movement starts from a low area, climbs to a high area and returns to a low area.

Figure 10 presents a more detailed view of the clustering by the SOM, showing the kernel for each neuron in the map. Each kernel is represented by a 128D vector, as is the visual feature of an individual image. Each waveform in Fig. 10 is drawn for the 128D kernel. In general, more similarity among the waveforms indicates that there is less motion-induced difference among the images that are clustered to the corresponding neurons.

Fig. 9. A 3D view of the distance matrix. The X and Y coordinates correspond to the position of the neurons and the Z coordinate represents the distance. A better view may be available if colours are displayed, since we also use colours to differentiate the distance. Intuitively, the high-value areas correspond to some distinct poses in an action, while the low-value areas indicate slowly changing poses

5.3 Motion primitives discovery

Fig. 10. Illustration of the codebook for all neurons on the SOM after the training. Each waveform is drawn as the 128D kernel for the corresponding neuron

The concept of pose detection in video segmentation is very similar to that of silence detection in speech processing. Intuitively, it is reasonable to consider a pose as a piece of silence for human actions. Silence detection is widely used for speech segmentation. At the least, sentences can always be picked out of a long speech by detecting the pauses between them. Speech silence is defined based on signal energy. The pose we define here is based on frequency, while speech information is basically encoded by frequency components.

In cognitive studies, scientists often refer to human behaviours as body language. Interestingly, this suggests an effective way to handle video signals. As time sequences, video signals have much in common with speech signals. A speech signal is a time series of sound waves in which language information is encoded. Similarly, a particular action can be described as a series of poses. Then, a primitive can be symbolized, just as we write down a word to represent a particular sound series. When enough action primitives are collected, a vocabulary is formed. Then, we can use this vocabulary to describe human behaviours.

Figure 11 shows some sample shots of actions in the video sequence. For example, the forefinger’s action is well recognized by searching for the series {(3,11), (6,12), . . . }. By applying the substructure searching algorithm presented above, primitives are extracted as series of neurons, which are represented by pairs of numbers according to their positions on the

Fig. 11. Some sample shots in the video sequence. The upper sequence shows a movement of the forefinger, while the lower sequence shows a movement of the middle finger. Each motion series is symbolized as a series of neurons in the 2D map. Neurons are represented by a pair of numbers according to their position on the map. For example, (3, 11) means the neuron in the 3rd row and the 11th column


map. Then, the whole video sequence is split automatically by dividing and representing the corresponding symbol series with the resulting primitives.
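The splitting step can be illustrated with the toy symbol series of Sect. 4. The Python sketch below is only one possible reading of it: the function name `segment` is hypothetical, and the greedy left-to-right matching that prefers longer primitives is an assumption, since the text does not specify how overlapping matches are resolved. Symbols that belong to no primitive are kept as single-symbol segments.

```python
def segment(series, primitives):
    """Split a symbol series into primitives and leftover single symbols.

    Assumption: matching is greedy from left to right, and longer
    primitives are tried before shorter ones.
    """
    ordered = sorted(primitives, key=len, reverse=True)
    segments = []
    i = 0
    while i < len(series):
        for primitive in ordered:
            if series.startswith(primitive, i):
                segments.append(primitive)
                i += len(primitive)
                break
        else:
            # No primitive starts here: keep the symbol on its own.
            segments.append(series[i])
            i += 1
    return segments


print(segment("ABCHIADADHIFABCADABCEHI", ["ABC", "AD", "HI"]))
```

For this series, the 23 symbols are covered by 11 segments, of which only F and E are not primitives.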

6 Conclusion and discussion

The video sequence processing approach proposed in this paper has two distinguishing features. First, due to the unsupervised learning mechanism of the self-organizing map, it saves some tedious manual computation that is necessary for conventional approaches such as hidden Markov based models. Secondly, it gains support from cognitive studies of motion primitives, and provides a better understanding of biological sensory motor systems.

The proposed searching algorithm for substructure discovery is efficient and effective. Compared to an exhaustive search, its computational complexity is very low, particularly when the sequence length is large. In fact, for real-world applications, where the video sequence is very long, exhaustive searching is almost useless. In contrast, as discussed in Sect. 4, the computational requirement of the proposed algorithm is proportional to the sequence length, which makes it practical for such applications.

The proposed approach is part of our effort on studies of biologically inspired robotics. One application of the proposed method is to enhance the learning-by-imitation capability of humanoid robots. Much work has been done on building a humanoid robot system that can automatically gain knowledge on how to control its own movement by observing the behaviour of a model (either a human or another robot) and trying to copy it, which is inspired by studies on how a child learns actions by watching and practising [9, 18]. The approach discussed in this paper focuses on the observation and understanding part of a learning-by-imitation procedure. Based on the hypothesis of primitives, it provides a method to interpret the motion of a model, e.g., a human, in the video input. Following that, the humanoid robot will try to perform in a similar way. Then, with a reward given on its performance, the robot is capable of learning the way to control its body, as a child learns to walk, reach, take, etc.
Another potential application to robotics is video-based navigation. As video is the dominant sensor input for robots, it is desirable to understand the video as much as possible. Motion recognition from video sequences could equip a robot with the ability to avoid dynamic obstacles. For example, a pedestrian can easily tell whether it is possible to avoid an approaching car and cross the street safely by taking a few looks along the way (here we assume a situation without traffic rules). This is because a human can obtain the motion information of the moving cars at a glance. A robot that can extract motion information from video input will hopefully be able to navigate in the world more naturally and safely.

Still, the approach is sensitive to the calculation of feature vectors, as data-driven methods normally are. This sensitivity may also be understood as meaning that global shape information is not robust for representing the differences that result from the motion [8]. Studies on biological motion perception systems suggest that global shape information is not necessary for recognizing motions. The shape-based Fourier features capture the global shape information in individual images well. However, they do not detect local changes/movement efficiently. For this reason, finger movements in the test video are exaggerated on purpose. For a generalized application of the proposed motion recognition approach, more investigation of the visual features should be conducted. As motions are generally involved in the representation of objects in images, the medial axis transform [15] will be investigated as a good candidate. This is part of our future work.

References

1. Bizzi, E., Giszter, S., Mussa-Ivaldi, F.A.: Computations Underlying the Execution of Movement: a Novel Biological Perspective. Science 253, 287–291 (1991)
2. Brandt, S., Laaksonen, J., Oja, E.: Statistical Shape Features in Content-Based Image Retrieval. Proceedings of 15th International Conference on Pattern Recognition, Barcelona, Spain, Vol. 12, pp. 1062–1065, September (2000)
3. Bregler, C.: Learning and Recognizing Human Dynamics in Video Sequences. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, pp. 568–574, June (1997)
4. Cook, D.J., Holder, L.B.: Substructure Discovery Using Minimum Description Length and Background Knowledge. Journal of Artificial Intelligence Research 1, 231–255 (1994)
5. Craw, I., Costen, N.P., Kato, T., Akamatsu, S.: How Should We Represent Faces for Automatic Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 21(8), 725–736 (1999)
6. Davis, J.W., Bobick, A.F.: The Representation and Recognition of Human Movement using Temporal Templates. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 928–934, June (1997)
7. Guo, A., Yang, S.X.: Neural Network Approaches to Visual Motion Perception. Science in China, Series B 37(2), 177–189 (1994)
8. Guo, A., Sun, H., Yang, S.X.: A Multilayer Neural Network Model for Perception of Rotational Motion. Science in China, Series C 40(1), 90–100 (1997)
9. Jenkins, O.C., Mataric, M.J.: Deriving Action and Behaviour Primitives from Human Motion Data. Proceedings of IEEE/RSJ International Conference on Intelligent Robots and System, Vol. 3, pp. 2551–2556, September (2002)
10. Kohonen, T.: Self-Organizing Maps. New York: Springer 1997
11. Laaksonen, J., Koskela, M., Laakso, S., Oja, E.: Self-Organizing Maps as a Relevance Feedback Technique in Content-Based Image Retrieval. Pattern Analysis and Applications 4(2), 140–152 (2001)
12. Lin, C.T., Wu, G.D., Hsiao, S.C.: New Techniques on Deformed Image Motion Estimation and Compensation. IEEE Transactions on Systems, Man, and Cybernetics, Part B 29(6), 846–859 (1999)
13. Rissanen, J.: Stochastic Complexity in Statistical Inquiry. Singapore: World Scientific 1989
14. Shah, M., Jain, R.: Motion-based recognition. Boston: Kluwer Academic Publishers 1997
15. Sherbrooke, E.C., Patrikalakis, N.M., Brisson, E.: An Algorithm for the Medial Axis Transform of 3D polyhedral Solids. IEEE Trans. on Visualization and Computer Graphics 2(1), 44–61 (1996)
16. Sidenbladh, H., De la Torre, F., Black, M.J.: A Framework for Modeling the Appearance of 3D Articulated Figures. Proceedings of Fourth IEEE Conference on Automatic Face and Gesture Recognition, pp. 368–375, March (2000)
17. Vesanto, J., Alhoniemi, E.: Clustering of the Self-Organizing Map. IEEE Transactions on Neural Networks 11(3), 586–600 (2000)
18. Weber, S., Jenkins, C., Mataric, M.J.: Imitation Using Perceptual and Motor Primitives. Proceedings of International Conference on Autonomous Agents, pp. 136–137, Barcelona, Spain, June 3–7 (2000)
19. Wu, Y., Huang, T.S.: Hand Modeling, Analysis, and Recognition for Vision-based Human Computer Interaction. IEEE Signal Processing Magazine 18(3), 51–60 (2001)