German Pages 240 Year 1989
Mathematical Research Advances in Mathematical Optimization edited by J. Guddat et al.
Volume 45
AKADEMIE-VERLAG
BERLIN
This series contains original contributions of mathematical research in all fields, such as research monographs, collections of papers on a single topic, and reports on congresses of exceptional interest for mathematical research. The series is aimed at promoting quick information and communication between mathematicians of the various special branches.
Manuscripts in English and German comprising at least 100 pages and not more than 500 pages can be admitted to this series. To ensure quick publication, the manuscripts are reproduced photomechanically. Authors who are interested in this series should contact the Akademie-Verlag directly. There they will get more detailed information about the form of the manuscripts and the modalities of publication.
Scientific contributions published by the Akademie der Wissenschaften der DDR, Karl-Weierstraß-Institut für Mathematik
Advances in Mathematical Optimization
Invited papers dedicated to Prof. Dr. Dr. h. c. F. Nožička on the occasion of his 70th birthday
edited by Jürgen Guddat, Bernd Bank, Horst Hollatz, Peter Kall, Diethard Klatte, Bernd Kummer, Klaus Lommatzsch, Klaus Tammer, Milan Vlach, Karel Zimmermann
Akademie-Verlag Berlin 1988
Editors:
Prof. Dr. Bernd Bank, Prof. Dr. Jürgen Guddat, Doz. Dr. Bernd Kummer, Doz. Dr. Klaus Lommatzsch: Humboldt-Universität zu Berlin, Sektion Mathematik
Prof. Dr. Horst Hollatz: Technische Universität Otto von Guericke Magdeburg, Sektion Mathematik/Physik
Prof. Dr. Peter Kall: Technische Universität Zürich
Prof. Dr. Diethard Klatte: Pädagogische Hochschule Halle
Doz. Dr. Klaus Tammer: Technische Hochschule Leipzig, Sektion Mathematik und Informatik
Prof. Dr. Milan Vlach, Doz. Dr. Karel Zimmermann: Karls-Universität Prag
The titles of this series are reproduced from the authors' original manuscripts.
ISBN 3-05-500543-0 ISSN 0138-3019
Published by Akademie-Verlag Berlin, DDR-1086 Berlin, Leipziger Str. 3-4
© Akademie-Verlag Berlin 1988
License number: 202·100/413/88
Printed in the German Democratic Republic
Production: VEB Kongreß- und Werbedruck, 9273 Oberlungwitz
Editor: Dr. Reinhard Höppner
LSV 1085
Order number: 763 862 6 (2182/45) 03200
PREFACE

The excellent scientist and outstanding university professor Prof. Dr. Dr. h.c. František Nožička celebrates his 70th birthday on 5 April 1988. Prof. Nožička was a full professor of mathematics at the Charles University in Prague and has been a visiting professor at the Section of Mathematics of the Humboldt University in Berlin for more than twenty years. Today he still works as a consulting professor at the Charles University and is fully active at the Humboldt University as a researcher and academic teacher.

During the last 30 years František Nožička has contributed remarkable results to several fields of mathematics and, in particular, to the development of theory and methods in mathematical optimization. The numerous papers and books he authored or co-authored testify to his creative and restless academic life. At both universities, in Prague and Berlin, he supervised hundreds of undergraduate and postgraduate students. A large number of national and international conferences, workshops and meetings on mathematical optimization and its applications saw him as an excellent organizer or co-organizer. During the sixties, when Prof. Nožička was vice-rector of the Charles University and director of its Computer Center, he essentially influenced the development of computerized numerical mathematics in Czechoslovakia.

A particularly fruitful period of his academic career began in the second half of the sixties, when he became a visiting professor at the Section of Mathematics of the Humboldt University in Berlin (GDR) in addition to his full professorship at the Charles University in Prague. In Berlin he earned inestimable merits as the scientific founder and spiritus rector of the optimization group at the Section of Mathematics. By his efforts in theory and applications he decisively influenced the development of mathematical optimization in the whole GDR. In 1978 the Mathematical and Natural Science Faculty of the Humboldt University recognized his extremely successful academic work by awarding him the degree of doctor honoris causa.

This brief appreciation of Prof. Nožička's achievements as researcher and university teacher shall not end without mentioning that he never stopped his efforts to bring theory and applications as close together as possible. His great personality as researcher, teacher and human being made him a highly esteemed scientist, beloved by students and colleagues in many countries.

On the occasion of his 70th birthday the editors, former students, colleagues and friends of Prof. Nožička, invited researchers in mathematical optimization and closely related areas to submit an article in his honour. The refereed versions of these contributions form the content of this volume. Among the papers collected here the reader finds full papers and survey articles. The topics treated range from integer programming, nonlinear optimization, stochastic programming, non-differentiable and parametric optimization to solution methods for nonlinear optimization problems, nonlinear operators and certain applications. The book will be of interest to researchers in optimization theory as well as in its applications.

The editors are grateful to the authors for submitting their contributions and to the various referees for their support. Special thanks go to the publishing house, Akademie-Verlag, and its editor R. Höppner.

Berlin, January 1988
The Editors
Contents

D. Auzinger, Hj. Wacker: First steps to an on-line control for pusher type reheating furnaces
B. Bank, R. Mandel: (Mixed-) integer solutions of quasiconvex polynomial inequalities
H. Bernau: An exact penalty function method for strictly convex quadratic problems
F. Fiedler: Doubly stochastic matrices and optimization
F. Giannessi: On a generalization of F. John Theorem for constrained extremum problems
J. Guddat, H. Th. Jongen: On global optimization based on parametric optimization
H. Hollatz: Extended subdifferential and minimization of nondifferentiable functions
P. Kall: Stochastic programming with recourse: upper bounds and moment problems - a review
D. Klatte: On strongly stable local minimizers in nonlinear programs
B. Kummer: Newton's method for non-differentiable functions
K. Lommatzsch: A minimization of a strictly convex function on a simplex
W. Oettli: Monoton-konvexe Funktionen - eine Bemerkung zum Satz von Browder-Minty
D. Pallaschke: Quasi-differentiable functions in non-differentiable optimization theory
J. Piehler: The global rank of Gomory cuts
B. Pshenichnyj: Necessary conditions for an extremum, penalty functions and regularity
S. Rolewicz: On uniform stability of uniformly bounded weak evolution operators
W. Römisch, R. Schultz: On distribution sensitivity in chance constrained programming
G. S. Rubinstein: The canonical group of increasing functions of bounded convex variation and some of its applications
J. Rückmann, K. Tammer: Theoretical foundations of two-level methods in nonconvex optimization
M. Schoch: An investigation of the Frobenius problem
H. Schramm, J. Zowe: A combination of the bundle approach and the trust region concept
K. Zimmermann: Optimization problems with max-separable constraints and additively separable objective function
S. Zlobec: Structural optima in nonlinear programming
M. Vlach: On aggregation of variables in linear programming
FIRST STEPS TO AN ON-LINE CONTROL FOR PUSHER TYPE REHEATING FURNACES
By D. Auzinger 1), Hj. Wacker 1)

1. Introduction

1.1. Organisation of the Paper

In steel processing it is very difficult to connect the process of continuous casting directly with the rolling process. Therefore, the slabs have to be reheated for the hot rolling. In our case this is done in pusher type furnaces. In this paper we first describe an efficient computation of the heat distribution of the slabs. Based on these results we propose a procedure that might be helpful for an on-line control of the reheating process. To be precise: given a fixed 'optimal' reheating curve in the furnace, how is the reheating process to be controlled to approximate the optimal heat distribution? Finally, we present some results on how to determine the optimal curve.
1.2. Description of the Furnace

Figure 1 gives a longitudinal section through the furnace. It is about 36 m long (x-direction), 7 m high (y-direction) and 13 m broad (z-direction). It consists of 4 zones, each being divided into an upper and a lower part. The slabs, which are blocks of steel of about 1.2 m × 0.2 m × 12 m, are pushed through the furnace with their length-direction in the z-direction of the furnace. The furnace is filled with about 30 slabs, which form a "slab band". Whenever a slab is pushed into the furnace, the slab band moves in x-direction and another slab leaves the furnace at the discharging end, at a temperature of about 1250 °C. In the first three zones (convective zone, preheating zone, heating zone) the slabs are pushed on water cooled skids, which cause the "skid marks" (compare Fig. 4). To diminish these skid marks, the lower part of zone 4 (soaking zone) is an unheated soaking hearth. The furnace is heated by coke gas and by natural gas. There are huge burners at the upper and the lower part of zones 2 and 3, which are the main heating zones, and there are a lot of small burners at the upper part of zone 4. The flue gas streams countercurrently to the slabs. In zone 1 there are no burners, but it is heated by the flue gas coming from zone 2. The gas leaves the furnace at the charging point and is used to preheat the air required for the combustion in a recuperator.

1) Universität Linz, Institut für Mathematik, A-4040 Linz
[Figure 1: pusher type reheating furnace]
1.3. Basic Assumptions for the Model

Some assumptions are required to idealize and simplify the real reheating process, which is much too complex to be modelled "exactly". The first assumption is the strongest:
(A1) steady operation of the furnace: slabs of the same size and quality are charged at a constant temperature. They are discharged after a constant heating time and at a constant temperature.
(A2) the real movement of the slabs is replaced by a movement with constant velocity v.
(A3) within the slabs no heat conduction in x-direction is considered, as this conductive heat transport is small compared to the other heat transport effects.
(A4) for controlling purposes we assume a 'mixed' temperature for the furnace zones influenced by the flue gas, by the flames of the burners, by the furnace wall and by the slabs. It is further assumed that we can influence this 'mixed' temperature (for instance by choosing the input of the combustion material).
2. Determination of the Heat Distribution

For on-line purposes it is essential to keep the computing time low. Nevertheless, to get more insight into the process, e.g. to study the influence of the skids, we decided to use a three dimensional model here. Computation is done by classical FE-techniques where the structure of the problem is exploited. This is done such that the gradients necessary for the performance of the on-line control are determined just by a simple extension.

2.1. The Mathematical Model
We consider a particular slab being pushed through the furnace, i.e. we use a slab fixed coordinate system. We determine those zone temperatures $W = (W_1^A, W_1^B, W_2^A, W_2^B, W_3^A, W_3^B, W_4^A)$ which give the bulk temperature curve closest to the ideal curve. We work in 3 dimensions: the height of the slabs (y), their breadth (z) and the time t. At first we reduce the size of the problem by symmetry considerations.

[Fig. 2: slab and skid]

Under the assumption $\left.\frac{\partial T}{\partial n}\right|_{L \cup R} = 0$ it suffices to consider $\Omega$. We get the following model for the heat distribution:

$$c_p(T)\,\rho\,\frac{\partial T}{\partial t} = \operatorname{div}\,[\lambda(T)\,\operatorname{grad} T], \qquad (y,z) \in \Omega,\ t \in (0, t_E] \qquad (1)$$

$$T(y,z,t)\big|_{t=0} = T_0 \qquad (2)$$

$$\lambda(T)\,\frac{\partial T}{\partial n}(y,z,t) = R(y,z,t,T) =
\begin{cases}
0 & L \cup R,\ \text{zones 1-4} \\
h_i^A\,(W_i^A - T) + r_i^A\,\big((W_i^A)^4 - T^4\big) & A,\ \text{zones 1-4} \\
h_i^B\,(W_i^B - T) + r_i^B\,\big((W_i^B)^4 - T^4\big) & B,\ \text{zones 1-3} \\
k^S\,(T_c - T) & S,\ \text{zones 1-3} \\
k^H\,(T_H - T) & B \cup S,\ \text{zone 4}
\end{cases} \qquad (3)$$

$h_i^A$ and $h_i^B$ are the convectivity constants at zone i, $r_i^A$ and $r_i^B$ the radiation constants, $k^S$ resp. $k^H$ the heat exchange coefficients slab-skids-cooling water resp. slab-soaking hearth. For the approximation of the ideal bulk temperature $T_{opt}(t)$ by the bulk temperature $T(y,z,t)$ in $L_2[0,t_E]$ we obtain the objective

$$f(W) = \int_0^{t_E} \big[T(y,z,t,W) - T_{opt}(t)\big]^2\,dt = \text{Min!} \qquad (4)$$
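As a minimal sketch of how an objective of type (4) could be evaluated once a bulk temperature curve has been computed, the following Python fragment approximates the integral with the trapezoidal rule. The function name `tracking_objective`, the sampled time grid and the sample curves are illustrative assumptions and do not come from the paper, which obtains the temperatures from an FE model.

```python
import numpy as np

def tracking_objective(t, bulk_temp, target_temp):
    """Approximate f = integral over [t[0], t[-1]] of (T - T_opt)^2 dt.

    t           : increasing sample times (hypothetical discretization)
    bulk_temp   : computed bulk temperature at the sample times
    target_temp : ideal bulk temperature T_opt at the sample times
    """
    t = np.asarray(t, dtype=float)
    sq_dev = (np.asarray(bulk_temp, dtype=float)
              - np.asarray(target_temp, dtype=float)) ** 2
    # Trapezoidal rule on a possibly non-uniform grid.
    return float(np.sum(0.5 * (sq_dev[1:] + sq_dev[:-1]) * np.diff(t)))

# Illustrative data only: a linear target curve and a curve offset by 2 degrees.
t = np.linspace(0.0, 3.0, 31)
target = 900.0 + 100.0 * t
bulk = target + 2.0
value = tracking_objective(t, bulk, target)
```

With a constant deviation of 2 over an interval of length 3, the value is 4 · 3 = 12, which gives a quick plausibility check of the quadrature.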
The control W must satisfy the (technical) box constraints

$$W_{min} \le W \le W_{max}.$$

2.2.
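$S^k$: Min $f(W^k + \alpha S^k)$, $W_{min} \le W^k + \alpha S^k \le W_{max}$ $\Rightarrow$ $\alpha_k$
$W^{k+1} = W^k + \alpha_k S^k$ (line search is performed by quadratic interpolation)
$W^{k+1}$ optimal $\Rightarrow$ stop
$B_{k+1}$ by the BFGS formula, $k = k+1$, GOTO S1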
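The BFGS step named in the iteration above can be sketched as follows. This is the textbook BFGS update of a Hessian approximation, written under the assumption that $B_k$ approximates the Hessian of f; the paper's own variant is not visible in the text, and the function name `bfgs_update` and the sample vectors are hypothetical.

```python
import numpy as np

def bfgs_update(B, s, y):
    """Textbook BFGS update of a symmetric positive definite matrix B.

    s = W_{k+1} - W_k               (step taken)
    y = grad f(W_{k+1}) - grad f(W_k)  (change of the gradient)
    The update is skipped when y^T s <= 0, so that B stays positive definite.
    """
    B = np.asarray(B, dtype=float)
    s = np.asarray(s, dtype=float)
    y = np.asarray(y, dtype=float)
    ys = float(y @ s)
    if ys <= 0.0:
        return B.copy()
    Bs = B @ s
    return B - np.outer(Bs, Bs) / float(s @ Bs) + np.outer(y, y) / ys

# Illustrative data only.
B0 = np.eye(3)
s = np.array([1.0, 0.5, -0.2])
y = np.array([2.0, 1.5, -0.1])
B1 = bfgs_update(B0, s, y)
```

The updated matrix satisfies the secant condition `B1 @ s == y` up to rounding and remains symmetric, which is what makes it usable for computing the next search direction.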
3.3. Numerical Results

Depending on the charging temperature, the quality of steel, the size of the slabs and the pushing velocity v there are different optimal curves $T_{opt}(t)$ for the bulk temperature, see Chapter 4. We will present the approximation of one typical cold charging optimal curve. In the following table the results are rounded. $\Delta T = \sqrt{f(W)/t_E}$ gives the mean difference of the calculated and the optimal bulk temperature.

Iteration     W_1^A   W_1^B   W_2^A   W_2^B   W_3^A   W_3^B   W_4^A   ΔT
1              600     550    1000    1050    1300    1350    1350    129.96
2              566     521     922     961    1196    1242    1330     58.93
3              563     519     924     965    1209    1260    1336     58.97
line search    656     520     923     963    1202    1251    1335     58.70
4              555     512     915     957    1210    1263    1339     54.81
5              520     484     886     933    1228    1296    1351     41.13
6              487     459     855     908    1236    1318    1360     31.69
7              450     429     816     877    1236    1336    1368     28.97

4. Determination of the Optimal Reheating Curve
We used a model consisting of three parts:
i) the slabs (two dimensional: (x,y))
ii) the flue gas (one dimensional: y)
iii) the recuperator
Further, we assumed that the flue gas in the zones 2 and 3 is "well-stirred", i.e. one has a uniform temperature of the flue gas in each zone. For zones 1 and 4 we assumed a differential flue gas model. There is no room here to give a detailed description. The procedure for the solution, however, is similar to that described in Chapter 2 and Chapter 3. First one determines the heating process, i.e. temperatures of the slabs, of the flue gas and of the preheated air, for a given energy input. Then one optimizes the energy input using similar techniques as in Chapter 3 to get the gradients. We confine ourselves to presenting two typical pictures for cold and hot charging optimization; for more details see [1].

[Fig. 7: cold charging optimization]
[Fig. 8: hot charging optimization]
Acknowledgements: The authors are indebted to Dipl.-Ing. B. Lindorfer (VOEST-ALPINE AG) and Dr. Walter Zulehner for their contributions to modelling and to Dr. H. Gfrerer for his advice concerning optimization. Financial support was given by the National Banking Foundations, the Austrian Science Foundation Fonds and the VOEST-ALPINE AG.

References:
[1] D. Auzinger, Hj. Wacker: Optimal Reheating of Slabs in a Pusher Type Furnace, in: H.W. Engl, Hj. Wacker, W. Zulehner (Eds.), Case Studies in Industrial Mathematics
[2] R. Hubmer, "Optimierung eines Stoßofens: Sollkurvenapproximation unter Anwendung der Methode der Finiten Elemente", diploma thesis 1986, Johannes-Kepler-Universität, Linz
[3] M. Lindner, "Hierarchische Formfunktionen angewandt auf ein parabolisches Anfangsrandwertproblem", diploma thesis 1987, Univ. Linz
[4] H.R. Schwarz, "Methode der finiten Elemente", Teubner, Stuttgart 1984
[5] K. Solchenbach, K. Stüben, U. Trottenberg, K. Witsch, "Efficient solution of a nonlinear heat conduction problem by multigrid methods", Sonderforschungsbereich 72, Universität Bonn.
(MIXED-) INTEGER SOLUTIONS OF QUASICONVEX POLYNOMIAL INEQUALITIES 1)
Bernd Bank and Reinhard Mandel 2)

Abstract. Properties of quasiconvex polynomials on $\mathbb{R}^n$ will be derived. Based on these properties the existence and stability of (mixed-) integer solutions of quasiconvex polynomial inequalities under weak assumptions is discussed.
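To make the notion of a quasiconvex polynomial concrete, the following Python sketch tests the defining inequality f(αx + (1-α)y) ≤ max{f(x), f(y)} on a finite sample of points. It is only a numerical plausibility check on sampled points, not a proof; the helper name `violates_quasiconvexity`, the sample grid and the two example polynomials are assumptions chosen for illustration.

```python
import itertools
import numpy as np

def violates_quasiconvexity(f, points, alphas, tol=1e-9):
    """Return True if some sampled (x, y, alpha) breaks
    f(alpha*x + (1-alpha)*y) <= max(f(x), f(y))."""
    for x, y in itertools.combinations(points, 2):
        bound = max(f(x), f(y))
        for a in alphas:
            if f(a * x + (1.0 - a) * y) > bound + tol:
                return True
    return False

def cubic_of_sum(x):
    # (x1 + x2)^3: quasiconvex but not convex (monotone function of a linear form).
    return (x[0] + x[1]) ** 3

def negative_square(x):
    # -(x1^2 + x2^2): concave and not quasiconvex.
    return -(x[0] ** 2 + x[1] ** 2)

grid = np.linspace(-2.0, 2.0, 9)
points = [np.array([u, v]) for u in grid for v in grid]
alphas = np.linspace(0.0, 1.0, 11)

cubic_ok = not violates_quasiconvexity(cubic_of_sum, points, alphas)
square_fails = violates_quasiconvexity(negative_square, points, alphas)
```

The first polynomial passes the sampled test although it is not convex, while the second fails already for x = (-2, 0), y = (2, 0) and α = 0.5.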
1) This work was partially supported by the Austrian Science Foundation (Fonds zur Förderung der wissenschaftlichen Forschung), Project S 32/01.
2) Humboldt-Universität zu Berlin, Sektion Mathematik, PSF 1297, BERLIN, 1086 DDR.

1. Introduction. This paper is intended to complete a stability theory of quasiconvex polynomial (mixed-) integer optimization problems presented in BANK/MANDEL (1987). In particular, it contains the proofs of several results omitted there. Essentially, the article contains basic properties of quasiconvex polynomials and of solution sets of systems of inequalities involving such polynomials. These properties are the basis for deriving stability (in the sense of lower and upper semi-continuity) of the corresponding multifunction described by such an inequality system if its right-hand sides vary. It is well-known that stability of the constraint sets of an optimization problem determines the stability behaviour (i.e. continuity properties of the extremal value and the optimal set) of the problem if data changes are considered. A fairly comprehensive study of this question can be found in BANK/GUDDAT/KLATTE/KUMMER/TAMMER (1982).

First, in Section 2, an analysis of elementary properties of quasiconvex polynomials is given. Here, BELOUSOV's (1977) results are extended to the class of quasiconvex polynomials. Section 3 is devoted to (mixed-) integer solutions of quasiconvex polynomial inequalities. The final section is concerned with the stability behaviour of real and (mixed-) integer solutions of such inequality systems where the right-hand sides change. As usual, we understand by a quasiconvex function f on a convex set S a function satisfying
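$$f(\alpha x + (1-\alpha) y) \le \max\{f(x), f(y)\} \quad \text{for all } x, y \in S,\ \alpha \in [0,1].$$
The degree of a polynomial $f: \mathbb{R}^n \to \mathbb{R}$ is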
denoted by deg f. A multifunction I : A * 2 continuous (briefly: ^ s ^ c ^ ) at V £ > 0 3 cJ » c J ( e ) > 0 : r ( > . ) c U £ and, lower^semicon^^ n
v o c R ,n ==>B
0
^6-A
. Actually,
at A
For
this
-oo} ,
(20)
the s e t I (A.) does not depend on the choice the f a c t t h a t n e i t h e r
u c K ( f j ) nor the constancy d i r e c t i o n s
• Therefore,
in M ( A ) we a s s o c i a t e
its stable set.
indices
, which i s a consequence of
directions
integer points
s e c t i o n f o r M) to M(A)
the r e c e s s i o n
u£R(fj),
i e IQ,
from S e c t i o n 3 . one concludes the f o l l o w i n g
depend two
corollaries. Corollary
4.
If
t h e assumptions o f
Proposition
of
indices
(20),
I
= I(/0,
Applying for
d e f i n e d in A.eA
a r e f u l f i l l e d and 1 ( A )
the
Now, u s i n g
the s e t
M(A),A€A
its
set
(21)
the e l i m i n a t i o n
p r o c e d u r e from t h e p r e v i o u s
of
section
to
I.
indices I
from ( 2 1 ) ,
we can a s s o c i a t e
to a set
s t a b l e s e t M j ( ^ ) by
{x 6. R n / f , ( x ) i j , , i e l i Uf l i ' From t h e p r e v i o u s s e c t i o n , i t i s c l e a r t h a t t h e r e c e s s i o n Mt(X)
is
.
X ' e A , one o b t a i n s
any
8.
then
(22)
=
1
VT = n 1 i t l
K(f.)
of M T ( X ) , ^ e A
1
cone
subspace.
4.,
one d e r i v e s
f r o a Leaoia G.
5.
L e t t h e assumptions o f be d e f i n e d as i n
(21)
£ >0 , there e x i s t s f £ (x)
a linear
1
On account of C o r o l l a r y Corollary
, is
j
6
In p a r t i c u l a r . A «•
Proposition and ( 2 2 ) ,
a p o i n t x e M j ( a ) such
VxfcU£(x*)
If
and l e t
K A
I
, then,
stablejaa^pinj^Mj: A — > 2
every
that
that
we can a s s i g n to the m u l t i f u n c t i o n M:/\—'»2 R
and M j ( ^ ) for
Viftl.
C o r o l l a r y 5, implies / MjU) / 0]
holds. Therefore,
Proposition
8 . be f u l f i l l e d ,
respectively.
n
d e f i n e d by the s e t s M j ( ; 0
from
Rn
its
(22).
9.
Under the hypotheses of P r o p o s i t i o n 8. the s t a b l e mapping Mj d e f i n e d by the c l o s e d convex s e t s domain
A
from ( 2 2 )
of M and MT i s a c l o s e d
set.
i s H-continuous.
The common
Proof. Since the subspace $V_I$ is constant for all $\lambda \in \Lambda$, the multifunction $M_I$ satisfies (CPC). Hence, $M_I$ is H-continuous and $\Lambda$ is closed. #

Let $0 < s \le n$. As in (16) we define the sets $(M(\lambda))_s$ and $(M_I(\lambda))_s$ of (mixed-) integer points in the sets $M(\lambda)$ and $M_I(\lambda)$ from (17) and (22), respectively. Further, let $B = \{\lambda \in \Lambda \,/\, (M(\lambda))_s \ne \emptyset\}$. In BANK/MANDEL (1987) the following was shown. If the hypotheses of Proposition 8. and (CPC) are fulfilled and, furthermore, if V satisfies (MIG), then there is a compact-valued, u.s.c. multifunction $C_s: B \to 2^{\mathbb{R}^n}$ such that $(M(\lambda))_s = C_s(\lambda) + V_s$ $\forall \lambda \in B$ holds, where $V_s$ denotes the set of (mixed-) integer points in V.

Proposition 10. Let the hypotheses of Proposition 8. be fulfilled. Further, let $V_I$, $(M(\lambda))_s$, $(M_I(\lambda))_s$ and B be defined as before. If $V_I$ satisfies (MIG), then
(i) $(M(\lambda))_s \ne \emptyset \iff (M_I(\lambda))_s \ne \emptyset$,
(ii) B is a closed set.

Proof. The first part can be shown in the same way as for Proposition 7. (remark: $\delta$ depends on $\lambda$). The second part is a consequence of the fact that $V_I$ is a constant subspace with a (mixed-) integer basis. Namely, by the remarks preceding this proposition, B is closed if $\Lambda$ is closed, but we know this from Proposition 9. #

5. Conclusion. We presented in this paper an analysis of those properties of quasiconvex polynomials which are essential for the existence of optimal points and the stability of (mixed-) integer optimization problems involving quasiconvex polynomials both as constraints and as objective function. We showed that the multifunction defined by quasiconvex polynomial inequalities with variable right-hand sides is lower semicontinuous on its domain. In particular, we developed a criterion under which the considered sets contain (mixed-) integer points even if the right-hand sides of the constraints vary. Moreover, we verified that the set of all right-hand sides leading to consistent (mixed-) integer quasiconvex polynomial optimization problems is a closed set. By our considerations, we demonstrated that the assumptions on the recession cone of the feasible sets (requiring the existence of (mixed-) integer generators) cannot be weakened.

Acknowledgements. We would like to thank Rainer E. Burkard for valuable comments and careful reading of the manuscript.
References.
Bank, B., E.G. Belousov, R. Mandel, O.N. Cheremnych and V.M. Shironin (1986). Matematyceskije programirowanje: voprozy rasreshinosti i ustoičivosti. Iszd. MGU, Moskwa (in Russian).
Bank, B., J. Guddat, D. Klatte, B. Kummer, K. Tammer (1982). Non-linear Parametric Optimization. Akademie-Verlag, Berlin. ((1983) Birkhäuser Verlag, Basel-Boston-Stuttgart).
Bank, B., R. Mandel (1987). Nonlinear Parametric Integer Programming. In Guddat et al. (Eds.), Proceedings of the International Conference on 'Parametric Optimization and Related Topics', Plaue/DDR, Okt. 1985, Akademie-Verlag, Berlin.
Bank, B., R. Mandel (to appear). Parametric Integer Optimization. Akademie-Verlag, Berlin.
Belousov, E.G. (1977). Vvedenie v vypuklyj analiz i zeločislennoje programmirovanie. Iszd. MGU, Moskwa (in Russian).
Mandel, R. (1985). Beiträge zur Theorie ganzzahliger Optimierungsaufgaben. Dissertation (B), Humboldt-Universität zu Berlin.
Shironin, V.M. (1980). O raspoloženii zeločislennych toček v neograničennych vypuklych mnoshestvach euklidivo prostranstwa. Diss. (kand. fiz. mat. nauk). Ekon. Fak. MGU, Moskwa (in Russian).
Stoer, J., C. Witzgall (1970). Convexity and Optimization in Finite Dimensions I. Springer-Verlag, Berlin-Heidelberg-New York.
AN EXACT PENALTY FUNCTION METHOD FOR STRICTLY CONVEX QUADRATIC PROBLEMS
Heinz Bernau

Abstract: A solution method for positive definite quadratic programming is described in which an exact penalty function is to be minimized. An important advantage of this algorithm is that, in case of inconsistent constraints, an $l_1$-solution of the problem will be obtained for a given value of the penalty parameter. This property is very important for successive quadratic programming methods, since there $l_1$-solutions of the quadratic subproblems may also be used as search directions if the subproblems are non-feasible. This fact was the principal motivation for the development of the method.
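As a small illustration of the kind of exact penalty function the abstract refers to, the Python sketch below evaluates Φ(x, ν) = ν f(x) + Q(x) for a strictly convex quadratic f(x) = ½ xᵀBx + cᵀx, where Q(x) sums the absolute violations of the equality constraints and the positive parts of the inequality constraints. The function name `l1_penalty`, the argument layout and the numerical data are assumptions made for this example; the sketch only evaluates the function and is not the solution algorithm of the paper.

```python
import numpy as np

def l1_penalty(x, nu, B, c, A_eq, b_eq, A_in, b_in):
    """Evaluate Phi(x, nu) = nu * f(x) + Q(x) with
    f(x) = 0.5 x^T B x + c^T x,
    Q(x) = sum_i |a_i^T x - b_i| + sum_j max(0, a_j^T x - b_j)."""
    x = np.asarray(x, dtype=float)
    f = 0.5 * x @ B @ x + c @ x
    eq_violation = np.abs(A_eq @ x - b_eq).sum()
    in_violation = np.maximum(0.0, A_in @ x - b_in).sum()
    return float(nu * f + eq_violation + in_violation)

# Illustrative data only: one equality x1 + x2 = 1 and one inequality x1 <= 0.5.
B = np.eye(2)
c = np.array([-1.0, -1.0])
A_eq, b_eq = np.array([[1.0, 1.0]]), np.array([1.0])
A_in, b_in = np.array([[1.0, 0.0]]), np.array([0.5])

phi_infeasible = l1_penalty([1.0, 1.0], 2.0, B, c, A_eq, b_eq, A_in, b_in)
phi_feasible = l1_penalty([0.5, 0.5], 2.0, B, c, A_eq, b_eq, A_in, b_in)
```

At the feasible point the penalty term vanishes and Φ reduces to ν f(x); at the infeasible point both constraint violations are added with unit weight, so a smaller ν puts relatively more weight on feasibility.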
*) This research was supported by Hung. Res. Found. OTKA/1044.
++) Computer and Automation Institute, Hungarian Academy of Sciences, H-1502 Budapest XI, Kende u. 17-23

1. Introduction

In successive quadratic programming methods (SQP-methods) to solve the nonlinear programming problem

$$\text{minimize } f(x) \quad \text{subject to } g_i(x) = 0,\ i \in E, \quad g_j(x) \le 0,\ j \in I, \qquad (1.1)$$

where the index sets E and I are finite, the subproblem solved to determine a search direction at the current estimate x often takes the form [1], [8], [11], [14]

$$\text{minimize } \tfrac{1}{2} d^T B d + \nabla f(x)^T d \quad \text{subject to } g_i(x) + \nabla g_i(x)^T d = 0,\ i \in E, \quad g_j(x) + \nabla g_j(x)^T d \le 0,\ j \in I, \qquad (1.2)$$

where B is an $n \times n$ symmetric positive definite matrix, $\nabla f(x)$ and $\nabla g_i(x)$, $i \in E \cup I$, denote the gradients of $f(x)$ and $g_i(x)$, respectively, and the superscript T denotes the transpose. A disadvantage of the application of the subproblems (1.2) is that the linearizations of the constraints of (1.1) may introduce inconsistencies, even though the problem (1.1) is consistent [15].

To avoid this difficulty Fletcher [4], [5], [6] proposes the following algorithm to minimize the exact penalty function

$$\Phi(x,\nu) = \nu f(x) + \sum_{i \in E} |g_i(x)| + \sum_{j \in I} \max\{0, g_j(x)\} \qquad (1.3)$$

of the problem (1.1), where $\nu$ is a penalty parameter. In this algorithm to determine search directions the following trust-region subproblem has to be solved
$$\text{minimize } \nu\big(\tfrac{1}{2} d^T B d + \nabla f(x)^T d\big) + \sum_{i \in E} |g_i(x) + \nabla g_i(x)^T d| + \sum_{j \in I} \max\{0,\ g_j(x) + \nabla g_j(x)^T d\} \quad \text{subject to } \|d\| \le \rho, \qquad (1.4)$$

where $\rho > 0$ is the radius of the trust-region [4], [5]. In this way we get feasible subproblems, but the correct choice of the parameter $\nu$ is uncertain. With the choice of the parameter $\nu$ we have to guarantee that a local minimizer of the function $\Phi(x,\nu)$ will also be a local solution of problem (1.1) [2], [9].

In [1] an SQP-method is given which solves subproblems of type (1.4) in such a way that the penalty parameter $\nu$ is adjusted to be as large as possible subject to the property that the solution of (1.4) is feasible for the problem (1.2). If the problem (1.2) has no feasible solution, the solution of (1.4) corresponding to $\nu = \nu_{min}$ is used as search direction, where $\nu_{min}$ is a fixed positive small value. In the following section we describe the method to determine search directions of this type.

2. An exact penalty function method

Independently from the notations used earlier we consider the following quadratic programming problem

$$\text{minimize } \tfrac{1}{2} x^T B x + c^T x \quad \text{subject to } a_i^T x - b_i = 0,\ i \in E, \quad a_j^T x - b_j \le 0,\ j \in I, \quad p \le x \le q, \qquad (2.1)$$

where B is an $n \times n$ symmetric positive definite matrix, $c, p, q$ and $a_i$, $i \in E \cup I$, are n-vectors and the index sets E and I are finite. Let's introduce the following notations

$$f(x) = \tfrac{1}{2} x^T B x + c^T x, \qquad Q(x) = \sum_{i \in E} |a_i^T x - b_i| + \sum_{j \in I} \max\{0,\ a_j^T x - b_j\}, \qquad \Phi(x,\nu) = \nu f(x) + Q(x),\ \nu > 0. \qquad (2.2)$$

Disregarding the bound constraints in (2.1), the function $\Phi(x,\nu)$ is an exact penalty function for this problem [4], [9] and an exact penalty function algorithm can be based on the following problem

$$\text{minimize } \Phi(x,\nu) \quad \text{subject to } p \le x \le q. \qquad (2.3)$$

The following lemma (see also [1], [4]) investigates the relations between the problems (2.1) and (2.3).

Lemma 2.1:
i.) For $\nu = \bar\nu$ let $\bar x$ be a solution of the problem (2.3). If $\bar x$ is feasible for problem (2.1), then $\bar x$ is the optimal solution of (2.1).
¡¡.)
for e v e r y ponding
Proof! sible If
i.) x of
problem
vf(x)
not
we
get
last
vf(x)
v = v, a n d dently
true,
the
fa I e m
(2.1).
the
solution
for
all
that
The
values
conditions
Theorem (2.3)
of
2. 1 •.
of
for
if
+ c)
is at
turns
0 «
A..S-1
'
*4*. ~ 0 ~ i where
i f
e. denotes J
Here
it
the
is r e m a r k a b l e
nonzero
linear
function
values.
This
the of
same v.
(At
v the
(2.1) The
(2.3)
is
v,
for
= sign(alx*
(Note,
therefore
their
J
solution
- b.)
if
a j x ' - b. /
= 1
if
X*
= 0
if
alx" - b . C O
xT = Kp. i i
, '
J
J
J
J
V ! » 0 ' i
, ,
the
problem s u c
0
^
, ie E
j« I ,
if
(2.4)
x ! = q. I I
p.* x*^ q . i i i ,
• ' = 1,2
n,
vector.
the v a l u e s
of
are uniquely
property
of
J
A.
j-th unit
optima-
V , i£ E u I ,
T a. x * - b . X D
will
the p r o b l e m the
where
(2.3)
for
subsequent
problem
the p r o b l e m
the m u l t i p l i e r s ,
determined
be used
problems,
of
which and
e MM :. t ; -
+ / j^i
1 1
that
solution.
is
pro-
intervall,
same
the
the optimal
existmultipliers
the current of
the
values
is a n
the
for
[l] , [i] :
x" , p s x ' f i q , there
has
optimal of
indepen-
the
to d e f i n e the
last
belonging
signs
of
a sequence
of
these
problems not
that be
of
for
equa-
problems
the v a l u e
of
for
has
value v will
the actual
feasible
to
these
an eventually m o d i f i e d
indicates
(2.3) will
by
value
problem
).
next
dratic
if
solution
set
(2.1),
that
for
(2.3), which
following
the d e f i n i t i o n
be decreased, of
as
time
x
unique.)
problem
quadratic
solution
seen
the p r o b l e m
ineqalities of
on
are
p s x s q ,
follows
these
the o p t i m a l i t y
the
of
x,
it
is b a s e d
are
problem
functions
lity c o n s t r a i n e d
fea-
,
same
that
and
this
and
J
if
from
convex
J
/
/•*r , r 6 J ( x " ) a r e (2.3).
,
. of
i«Z
problem
o}
= **
alx
straints
=
conditions
j€Z
where
- b.
independent
,
aTx*
Mi.) iv. )
' jo jo* c o n s t r a i n t s w i l l n o t b e c h a n g e d . In t h e n e w p r o b l e m o n l y « C i s zero.
the
give an e s t i m a t e
the a c t i v e c o n s t r a i n t s
of
x
k+1 for
is
also
the
lar-
problem
independent.
ii.) xk+1 is n o t f e a s i b l e w i l l b e r e d u c e d , i.e. w e set b I em
(2.13)
l^-solution
is a g a i n of p r o b l e m
The algorithm 1 prfx 1 . 0 , a n d the so m o d i f i e d p r o k k+1 s o l v e d . If v A v m j n , the p o i n t x is a c c e p t e d a s (2.1).
starts with -1
i.e. x
v 1 = 1.0, Z 1
is t h e u n c o n s t r a l n t
= ¡6, J 1
» ff, a n d a r b i t r a r y
m i n i m u m of
chosen
the c o r r e s p o n d i n g
x1,
objec-
t i v e f u n c t i o n in ( 2 . 1 3 ) . In the f o l l o w i n g w e s h o w t h e t h e f i n i t e n e s s of t h i s a l g o r i t h m . F r o m t h e L f i n i t l o n of d in ( 2 . 1 4 ) it f o l l o w s t h a t t h e m e t h o d is n o t i n c r e a s i n g w.r.t. that
the o b j e c t i v e
the n u m b e r
of
functions
nonzero
of
the p r o b l e m s
in t h e s e
(2.13). Furthermore
objectives
is m o n o t o n i c a I Iy
note, de-
de-
creasing,
i.e. w e
Therefore
after
case
c.), where
only
finite
finite A
first
return
to a n e a r l i e r
number
of
the a l g o r i t h m
different
number
of
results
values
terminates of
used quadratic
occurencies or
of
a.)
the value
v are possible,
problem.
or b . ) w e m u s t of
v
get
is r e d u c e d .
the a l g o r i t h m e n d s
As
in a
steps.
implementation
factorizations rage
cannot
a finite
of
the a l g o r i t h m ,
in the p r o g r a m Z Q P C V X
as
in a n
I ^SQP m e t h o d
using
the
sanje C h o l e s k y
, f"l3j, w a s
and
tested with
QR
encou-
.
Literature:
[1] Bernau, H.: Sequentielle quadratische Programmierungsverfahren, Working paper, MTA SZTAKI MO/72, (1987).
[2] Charalambous, C.: On conditions for optimality of the nonlinear l1 problem, Math. Progr. 17, (1979), 123-135.
[3] Fletcher, R.: A general quadratic programming algorithm, Journ. of the Inst. of Math. and its Appl., Vol. 7, (1971), 76-91.
[4] Fletcher, R.: Practical Methods of Optimization - Constrained Optimization, John Wiley and Sons, Chichester-New York-Toronto, (1981).
[5] Fletcher, R.: Numerical experiments with an exact l1-penalty function method, in Mangasarian, O.L., Meyer, R.R., Robinson, S.M. (eds.), Nonlinear Programming 4, Academic Press, New York, (1981), 99-129.
[6] Fletcher, R.: An l1-penalty method for nonlinear constraints, in Boggs, P.T., Byrd, R.H., Schnabel, R.B. (eds.), Numerical Optimization 1984, SIAM, Philadelphia, (1985), 26-40.
[7] Goldfarb, D., Idnani, A.: A numerically stable dual method for solving strictly convex quadratic programs, Math. Progr. 27, (1983), 1-33.
[8] Han, S.P.: Superlinearly convergent method for nonlinear programming, Math. Progr. 11, (1976), 263-282.
[9] Han, S.P., Mangasarian, O.L.: Exact penalty functions in nonlinear programming, Math. Progr. 17, (1979), 251-279.
[10] Powell, M.J.D.: A fast algorithm for nonlinearly constrained optimization, in Watson, G.A. (ed.), Numerical Analysis Dundee 1977, Lecture Notes in Math. 630, Springer-Verlag, Berlin, (1978), 144-157.
[11] Powell, M.J.D.: Variable metric methods for constrained optimization, in Bachem, A., Grötschel, M., Korte, B. (eds.), Mathematical Programming, Springer-Verlag, Berlin, (1983), 288-311.
[12] Powell, M.J.D.: ZQPCVX a Fortran subroutine for convex quadratic programming, Report DAMTP 83/NA17, University of Cambridge, (1983).
[13] Powell, M.J.D.: Corrections and Extensions to the Fortran listing of ZQPCVX, Report, University of Cambridge, (1984).
[14] Schittkowski, K.: The nonlinear programming method of Wilson, Han and Powell with an augmented Lagrangian type line search function, Num. Math. 38, (1981), 83-117.
[15] Tone, K.: Revision of constraint approximation in successive QP method for nonlinear programming problems, Math. Progr. 26, (1983), 144-152.
DOUBLY STOCHASTIC MATRICES AND OPTIMIZATION
Miroslav Fiedler

Let us recall first that doubly stochastic matrices are square nonnegative matrices all of whose row sums as well as column sums are equal to one. In terms of optimization problems, the set of all n-by-n doubly stochastic matrices is identical with the set of all feasible solutions of the transportation problem (without specifying the objective function)

$$x_{ik} \ge 0, \quad i, k = 1, \dots, n,$$
$$\sum_{k=1}^{n} x_{ik} = a_i, \quad i = 1, \dots, n,$$
$$\sum_{i=1}^{n} x_{ik} = b_k, \quad k = 1, \dots, n,$$

satisfying $a_i = b_k = 1$, $i, k = 1, \dots, n$. In fact, the famous Birkhoff's theorem [2] stating that this set is equal to the convex hull of all permutation matrices is a consequence of similar more general statements about the m-by-n transportation problem (as stated, e.g., in [1]) saying that the extremal points of the corresponding convex polyhedron correspond to those feasible solutions the matrix $(x_{ik})$ [...]

[...] For real n-dimensional column vectors $y = (y_1, \dots, y_n)^T$ and $z = (z_1, \dots, z_n)^T$ in $R_n$, with coordinates ordered as $y_{i_1} \ge y_{i_2} \ge \dots \ge y_{i_n}$ and $z_{k_1} \ge z_{k_2} \ge \dots \ge z_{k_n}$, we write $y \succ z$ ($y$ majorizes $z$) if

$$y_{i_1} \ge z_{k_1},$$
$$y_{i_1} + y_{i_2} \ge z_{k_1} + z_{k_2},$$
$$\dots$$
$$y_{i_1} + \dots + y_{i_{n-1}} \ge z_{k_1} + \dots + z_{k_{n-1}},$$
$$y_{i_1} + \dots + y_{i_n} = z_{k_1} + \dots + z_{k_n}.$$

The relation $\succ$ is reflexive and transitive. In addition, $e = (\tfrac{1}{n}, \tfrac{1}{n}, \dots, \tfrac{1}{n})^T$ is majorized by every vector in $R_n$ whose coordinates sum to one.

We shall also need the notion of an orthostochastic matrix. If $U = (u_{ik})$ is a real orthogonal n-by-n matrix, then the matrix $S = (u_{ik}^2)$ is clearly doubly stochastic. Matrices obtained in such a manner are called orthostochastic. Not all doubly stochastic n-by-n matrices are orthostochastic if $n \ge 3$. For instance, the doubly stochastic matrix [...] is not orthostochastic: the corresponding vectors $(u_{11}, u_{12}, u_{13})$ and $(u_{21}, u_{22}, u_{23})$ would not be orthogonal.

Theorem 1 ([6]). Let, in the previous notation, $y \in R_n$, $z \in R_n$. Then the following are equivalent:
1° $y \succ z$;
2° there exists a doubly stochastic matrix $D$ such that $z = Dy$;
3° there exists an orthostochastic matrix $S$ such that $z = Sy$;
4° there exists a real symmetric matrix with the eigenvalues $y_1, \dots, y_n$ and the diagonal entries $z_1, \dots, z_n$.

The proof was given in [6]. It was substantially simplified in [4].
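The implication from 2° to 1° in Theorem 1 is easy to try out numerically. The following is only a sketch: the helper names (`majorizes`, `is_doubly_stochastic`, `rows_can_be_orthogonal`), the matrix `D` and the vector `y` are illustrative assumptions, not taken from the text. The last helper tests only a necessary condition for orthostochasticity, namely that two rows of a candidate orthogonal matrix with the prescribed squared entries can be made orthogonal by some choice of signs.

```python
from fractions import Fraction
from itertools import product
from math import isclose, sqrt


def majorizes(y, z):
    """True if y majorizes z: ordered partial sums dominate, total sums agree."""
    ys, zs = sorted(y, reverse=True), sorted(z, reverse=True)
    if len(ys) != len(zs) or sum(ys) != sum(zs):
        return False
    partial_y = partial_z = 0
    for a, b in zip(ys[:-1], zs[:-1]):
        partial_y += a
        partial_z += b
        if partial_y < partial_z:
            return False
    return True


def is_doubly_stochastic(D):
    """Square, nonnegative, all row sums and column sums equal to one."""
    n = len(D)
    rows_ok = all(len(row) == n and all(v >= 0 for v in row) and sum(row) == 1
                  for row in D)
    cols_ok = all(sum(D[i][k] for i in range(n)) == 1 for k in range(n))
    return rows_ok and cols_ok


def apply(D, y):
    """Matrix-vector product z = D y."""
    return [sum(d * v for d, v in zip(row, y)) for row in D]


def rows_can_be_orthogonal(r1, r2):
    """Necessary test: some sign pattern must give sum_k (+-)sqrt(r1_k r2_k) = 0."""
    terms = [sqrt(a * b) for a, b in zip(r1, r2)]
    return any(isclose(sum(s * t for s, t in zip(signs, terms)), 0.0, abs_tol=1e-12)
               for signs in product((1, -1), repeat=len(terms)))


half = Fraction(1, 2)
D = [[half, half, 0], [0, half, half], [half, 0, half]]  # illustrative choice
y = [Fraction(5), Fraction(1), Fraction(0)]
z = apply(D, y)

print(is_doubly_stochastic(D), majorizes(y, z), majorizes(z, y))
print(rows_can_be_orthogonal(D[0], D[1]))
```

Exact rational arithmetic is used so that the equality of the total sums can be tested without a tolerance; only the sign test falls back to floating point because of the square roots.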
We intend to add here a maybe new geometric equivalent characterization.

Theorem 2. With the same notation as in Theorem 1, $y \succ z$ if and only if either $y = e$ $(= (\tfrac{1}{n}, \dots, \tfrac{1}{n})^T)$ and then $z = e$ as well, or $y \ne e$ and to the points $Y_1, \dots, Y_n, Z_1, \dots, Z_n$ in $R_1$ with the respective (single) coordinate $y_1, \dots, y_n, z_1, \dots, z_n$ there exist in some $R_p$ containing $R_1$ points $\hat Y_1, \dots, \hat Y_n, \hat Z_1, \dots, \hat Z_n$ with the following properties:
1° $Y_i$, $Z_k$ are orthogonal projections of $\hat Y_i$, $\hat Z_k$ on $R_1$, $i, k = 1, \dots, n$;
2° the points $\hat Y_1, \dots, \hat Y_n$ are linearly independent, thus forming vertices of an $(n-1)$-simplex;
3° all points $\hat Z_k$ are contained in the convex hull of $\hat Y_1, \dots, \hat Y_n$;
4° the arithmetic mean of the points $\hat Z_k$ (or, the centre of gravity of these points) coincides with the centre of gravity of $\hat Y_1, \dots, \hat Y_n$.

Proof. The first statement about $y = e$ being trivial, let us show that the given geometric condition in the case $y \ne e$ is equivalent to condition 2° in Theorem 1. Let first such points $\hat Y_i$, $\hat Z_k$ exist. Choose an orthonormal coordinate system in $R_p$ in such a way that $R_1$ is its first axis and $y_i$, $z_k$ will be the first coordinates of $\hat Y_i$, $\hat Z_k$. By 3°, $\hat Z_k$ is a convex combination of $\hat Y_1, \dots, \hat Y_n$:

$$\hat Z_k = \sum_{m=1}^{n} s_{km} \hat Y_m, \quad k = 1, \dots, n,$$

where $s_{km} \ge 0$, $\sum_{m=1}^{n} s_{km} = 1$. By 4°, we have for the arithmetic mean of the $\hat Z_k$

$$\frac{1}{n} \sum_{k=1}^{n} \hat Z_k = \frac{1}{n} \sum_{k,m=1}^{n} s_{km} \hat Y_m = \frac{1}{n} \sum_{m=1}^{n} \hat Y_m.$$

Therefore, the linear independence of the $\hat Y_m$ yields

$$\sum_{k=1}^{n} s_{km} = 1, \quad m = 1, \dots, n.$$

Thus the matrix $S = (s_{km})$ is doubly stochastic and - using the first coordinates only - satisfies $z = Sy$.

Conversely, if $z = Sy$ for a doubly stochastic matrix $S = (s_{km})$ and $y \ne e$, one can choose in an $(n-1)$-dimensional space $R_{n-1}$ containing $R_1$ linearly independent points $\hat Y_1, \dots, \hat Y_n$ whose projections on $R_1$ are $Y_1, \dots, Y_n$. It is then easily seen that the points $\hat Z_k$ defined by

$$\hat Z_k = \sum_{m=1}^{n} s_{km} \hat Y_m$$

satisfy the conditions 1° - 4°.

Let us recall that doubly stochastic matrices have been generalized to the rectangular case. These m-by-n doubly stochastic matrices are nonnegative matrices all of whose row sums are equal to one and all column sums are equal to $m/n$. We shall denote by $D_{mn}$ the set of all such matrices. In [3], Čihák studied a special matrix $E_{mn}$ in $D_{mn}$ which has, in a sense, analogous properties as the identity matrix in the class $D_{nn}$ of usual doubly stochastic matrices. This "unit" matrix $E_{mn}$ can be obtained by the following construction: one starts by completing the first column of $E_{mn}$ from top to the bottom, then the second column from top to the bottom etc., putting on every place the largest possible number (having in mind, of course, the row-sums and column-sums conditions). In other words, $E_{mn} = (e_{ik})$, $i = 1, \dots, m$; $k = 1, \dots, n$,
where m
in (1 - 2 1 e k , j n. It follows immediately that for
E
n+l,n
m ? k = n,
Gmn = Gmk. G.kn In addition, G belongs to D since E„., „e„ = e_ , where , . 1? mn " mn n+l,n n n+1 e n = (j-j, ..., (with n coordinates) as well as e T^ _ gT m m,m-1 m-1" Let us consider the following geometric construction in the class of polygonal lines in some space R g . Let P = P 1 P 2 ... P n (P1 P n points in R ) be an n-polygonal line in
$R_s$. If $A$ is a real m-by-n matrix, $A = (a_{ij})$, all of whose row sums are equal to one, then the transform of $P$ by $A$ is the m-polygonal line $Q = Q_1 Q_2 \dots Q_m$ the vertices $Q_k$ of which are given by

$$Q_k = \sum_{j=1}^{n} a_{kj} P_j, \quad k = 1, \dots, m. \tag{*}$$

This transform is affine, e.g. in the sense that if each of the points $P_j$ is moving with constant speed - which may be zero - in a constant direction then the same will be true for each of the points $Q_k$.

Assume, for a moment, that $P_1, \dots, P_n$ are linearly independent and that $Q_1, \dots, Q_{n+1}$ are obtained by (*) for $A = E_{n+1,n}$. It follows from the Example that $Q_1 = P_1$, $Q_{n+1} = P_n$ and $Q_i$ belongs to the segment $P_{i-1} P_i$ for $i = 2, \dots, n$. In addition, the arithmetic mean of the $Q_i$'s coincides with the arithmetic mean of the $P_i$'s. These conditions, in fact, determine the $Q_i$'s uniquely. Repeating this process (even if the vertices of the new polygonal line may not be linearly independent) we obtain the m-polygonal line obtained from $P_1 \dots P_n$ by applying the matrix $G_{mn}$. This method of obtaining, for a given n-polygonal line $P_1 P_2 \dots P_n$, the new m-line $Q_1 Q_2 \dots Q_m$ with $m > n$ is known (e.g. [5]). However, the following explicit formula for each entry of the matrix $G_{mn}$ seems to be new.

Theorem 5. Let $m \ge n \ge 2$. Then the $(i, k)$-th entry $(1 \le i \le m,\ 1 \le k \le n)$ of $G_{mn}$ is given by

$$g_{ik} = [\dots]$$

The proof is easily obtained by induction. We shall mention, however, that the number $m$ can tend to infinity ($n$ is fixed). In this case we obtain as a limit the curve

$$P(t) = \sum_{k=1}^{n} \binom{n-1}{k-1} t^{k-1} (1-t)^{n-k} P_k.$$

This curve which is an algebraic curve of degree $n - 1$ is known as Bézier curve corresponding to the n-polygonal line $P_1 P_2 \dots P_n$. For $n = 3$, it is a parabola touching the lines $P_1 P_2$ at $P_1$ and $P_3 P_2$ at $P_3$.
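The construction of the "unit" matrices and the repeated transform can be sketched in a few lines. Everything named here is an assumption made for illustration: `unit_matrix` fills the matrix column by column with the largest admissible entries (row sums one, column sums m/n), `elevation_matrix` assumes that $G_{mn}$ is the product $E_{m,m-1} \cdots E_{n+1,n}$, and `closed_form` is the binomial expression commonly quoted for repeated degree elevation of Bézier control polygons; it is used only as a cross-check and is not claimed to be the formula of Theorem 5.

```python
from fractions import Fraction
from math import comb


def unit_matrix(m, n):
    """Fill column by column, top to bottom, with the largest admissible entry
    (row sums 1, column sums m/n)."""
    row_left = [Fraction(1)] * m
    col_left = [Fraction(m, n)] * n
    E = [[Fraction(0)] * n for _ in range(m)]
    for k in range(n):
        for i in range(m):
            E[i][k] = min(row_left[i], col_left[k])
            row_left[i] -= E[i][k]
            col_left[k] -= E[i][k]
    return E


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def elevation_matrix(m, n):
    """Assumed definition: G_mn = E_{m,m-1} ... E_{n+1,n}."""
    G = [[Fraction(int(i == k)) for k in range(n)] for i in range(n)]
    for r in range(n, m):
        G = matmul(unit_matrix(r + 1, r), G)
    return G


def closed_form(m, n):
    """Binomial expression for repeated degree elevation (assumed cross-check)."""
    return [[Fraction(comb(n - 1, k) * comb(m - n, i - k), comb(m - 1, i))
             if 0 <= i - k <= m - n else Fraction(0)
             for k in range(n)] for i in range(m)]


def transform(A, P):
    """Q_k = sum_j a_kj P_j for points given as coordinate tuples."""
    return [tuple(sum(a * p[c] for a, p in zip(row, P)) for c in range(len(P[0])))
            for row in A]


def bezier(P, t):
    """Point of the Bezier curve of the polygonal line P at parameter t."""
    n = len(P)
    w = [comb(n - 1, k) * t ** k * (1 - t) ** (n - 1 - k) for k in range(n)]
    return tuple(sum(wk * p[c] for wk, p in zip(w, P)) for c in range(len(P[0])))


P = [(Fraction(0), Fraction(0)), (Fraction(1), Fraction(2)), (Fraction(3), Fraction(0))]
G = elevation_matrix(6, 3)
Q = transform(G, P)
t = Fraction(1, 3)
print(G == closed_form(6, 3), Q[0] == P[0], Q[-1] == P[-1], bezier(Q, t) == bezier(P, t))
```

With exact fractions the checks are equalities rather than approximations: the end points are kept, and the Bézier curve of the transformed polygonal line is the curve of the original one, which is consistent with the limit statement above.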
The final topic concerns normal matrices, i.e. square complex matrices $A$ satisfying $AA^* = A^*A$ where $A^*$ means complex conjugate transpose matrix to $A$. An equivalent characterization is that $A$ has the form

$$A = U L U^*$$

where $L$ is a complex diagonal matrix (its diagonal entries are eigenvalues of $A$) and $U$ is unitary, i.e. $UU^* = I$. If $U = (u_{ik})$ is a unitary matrix, then $S = (|u_{ik}|^2)$ is doubly stochastic. We shall call such a matrix unistochastic.

Theorem 6. Let $y$, $z$ be complex column n-vectors, $y = (y_1, \dots, y_n)^T$, $z = (z_1, \dots, z_n)^T$. Then the following are equivalent:
1° there exists a unistochastic matrix $S$ such that $z = Sy$;
2° there exists a normal n-by-n matrix with eigenvalues $y_1, \dots, y_n$ and the diagonal entries $z_1, \dots, z_n$.

Proof. Follows immediately, both ways, from the fact that the diagonal entries $a_{ii}$ of the matrix $A = U L U^*$ are equal to

$$a_{ii} = \sum_{k=1}^{n} |u_{ik}|^2 l_k, \quad i = 1, \dots, n.$$

The problem to characterize the vectors $y$, $z$ with this property is open. One can ask to find a similar geometric characterization to that of Theorem 2. The above counterexample of the 3-by-3 doubly stochastic matrix which is not orthostochastic (and also not unistochastic) shows easily that the conditions 1° - 4° in Theorem 2 applied to complex vectors (hence to $R_2$ instead of $R_1$) do not suffice.
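The identity behind the proof of Theorem 6 can be checked on a small case. This is a sketch under stated assumptions: the 2-by-2 unitary matrix `U`, the angles `theta` and `alpha`, and the eigenvalue vector `y` are arbitrary illustrative choices, and plain Python complex numbers are used instead of a matrix library.

```python
import cmath
from math import cos, sin


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def conj_transpose(A):
    return [[A[i][k].conjugate() for i in range(len(A))] for k in range(len(A[0]))]


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


theta, alpha = 0.3, 0.7                      # arbitrary illustrative angles
phase = cmath.exp(1j * alpha)
U = [[complex(cos(theta)), complex(sin(theta))],
     [-sin(theta) * phase, cos(theta) * phase]]   # unitary by construction
y = [2 + 1j, -1 + 0j]                        # prescribed eigenvalues
L = [[y[0], 0j], [0j, y[1]]]

A = matmul(matmul(U, L), conj_transpose(U))  # normal matrix U L U*
S = [[abs(u) ** 2 for u in row] for row in U]  # unistochastic matrix
z = [sum(s * v for s, v in zip(row, y)) for row in S]

print([A[i][i] for i in range(2)])
print(z)
```

The two printed vectors agree up to rounding, i.e. the diagonal of $U L U^*$ is $Sy$ with $S = (|u_{ik}|^2)$.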
References
[1] Bílý, J., Fiedler, M., Nožička, F.: Die Graphentheorie in Anwendung auf das Transportproblem. Czech. Math. J. 8 (1958), 94-121.
[2] Birkhoff, G.: Tres observaciones sobre el algebra lineal. Univ. Nac. Tucumán Rev. Ser. A 5 (1946), 147-151.
[3] Čihák, P.: On an exposed element of a set of doubly stochastic rectangular matrices. Comm. Math. Univ. Car. 11 (1970) 1, 99-113.
[4] Fiedler, M.: On a theorem by A. Horn. In: Mathematical structures, computational mathematics, mathematical modelling, Sofia 1975, 251-255.
[5] Goldman, R. N.: An urnful of blending functions. IEEE Comp. Graphics and Appl. 3 (1983) 7, 49-54.
[6] Horn, A.: Doubly stochastic matrices and the diagonal of a rotation matrix. Amer. J. Math. 76 (1954), 620-630.
[7] Marshall, A. W., Olkin, I.: Inequalities: Theory of majorization and its applications. Academic Press, N. York 1979.
ON A GENERALIZATION OF F. JOHN THEOREM FOR CONSTRAINED EXTREMUM PROBLEMS
F. Giannessi 1)

Abstract. A necessary optimality condition is proved for constrained extremum problems having a finite-dimensional image. Both the classic F. John and Euler conditions are derived as corollaries. The result is obtained by exploiting the concept of image of a constrained extremum problem [3,4,5,9,13,14].

1. Introduction

Assume we are given the positive integer $m$, a non-empty subset $X$ of a Banach space $Y$ whose norm is denoted by $\|\cdot\|$, and the real-valued functions $\varphi: X \to \mathbb{R}$ and $g: X \to \mathbb{R}^m$. Consider the problem:

$$\min \varphi(x), \quad x \in R := \{x \in X : g(x) \ge 0\}. \tag{1}$$

A crucial aspect of our analysis is based on the concept of image of (1), which will now be briefly recalled. Obviously $\bar x \in R$ is a local minimum point of (1) iff there exists a neighbourhood (open sphere of centre $\bar x$ and radius $\rho > 0$) $N = \{x \in Y : \|x - \bar x\| < \rho\}$ of $\bar x$, such that system (2),

$$f(x) := \varphi(\bar x) - \varphi(x) > 0; \quad g(x) \ge 0; \quad x \in X_N := X \cap N \tag{2}$$

is impossible. Hence, in order to state local optimality for (1) we are led to prove disjunction of sets $\mathcal{H}$ and $\mathcal{K}_N$, where: $\mathcal{H} := \{(u, v) \in \mathbb{R} \times \mathbb{R}^m : u > 0;\ v \ge 0\}$, $F(x) := (f(x), g(x))$, $\mathcal{K}_N := F(X_N)$. The set $\mathcal{K} := F(X)$, which will be called the image of $X$ under $F$, represents $\mathcal{K}_N$ when $N \supseteq X$ 2). To prove directly whether or not $\mathcal{H} \cap \mathcal{K}_N = \emptyset$ is generally impracticable; then we try to show such a disjunction by proving that the two sets lie in two disjoint halfspaces or, more generally, in two disjoint level sets, respectively. This separation approach is exactly equivalent to looking for a system which is in the alternative with (2). In this order of ideas the set $\mathcal{K}_N$ plays a key role. We are referred to Refs. [4,5,6,13] for details about sets and concepts which will be used here.

1) Professor, Department of Mathematics, Faculty of Sciences, University of Pisa, Via Buonarroti 2, PISA, Italy.
2) Both here and in the sequel, $\supseteq$ and $\supset$ will denote "contains" and "strictly contains", respectively; analogously for $\subseteq$ and $\subset$.

In the sequel, besides the finite-dimensional case where $X \subseteq \mathbb{R}^n$, we will
consider another particular case, namely, the one where $X = C^1[a,b]$ 1), with $a, b \in \mathbb{R}$, $X$ is equipped with the norm $\|z\| = \max_{t \in [a,b]} |z(t)|$, and where:

$$\varphi(x) = \int_a^b \psi_0(t, x(t), x'(t))\,dt; \quad g_i(x) = \int_a^b \psi_i(t, x(t), x'(t))\,dt, \ i \in I, \tag{3}$$

the functions $\psi_i: \mathbb{R}^3 \to \mathbb{R}$, $i \in \{0\} \cup I$, with $I = \{1, \dots, m\}$, being given. Fixed endpoint conditions can be included in the definition of $X$.

The problems which can be reduced to scheme (1) share the characteristic of having a finite-dimensional image. Hence certain problems, for instance of geodesic-type, escape from (1). In [5] it is shown how the present approach can be modified by means of multifunctions theory, in order to handle problems having an infinite-dimensional image; these will be analyzed in a subsequent paper.

The generalization of the concept of stationary point, which keeps up with that of necessary conditions, has received much attention [12]. The crucial part is the kind of convergence which is required. It seems impossible to handle every problem with a unique kind of convergence. Our present aim is to enlarge as much as possible the class of problems for which a Lagrangian-type necessary condition can be established by means of the simplest generalization of the concept of limit, namely lower (or upper) limit. This does not avoid the introduction of a more complicated notion of convergence, but reduces its use to a class as small as possible. In the remaining part of this section, $f$ denotes a generic function and not that previously introduced.

Definition 1. $\bar x \in X$ will be called a lower semistationary point of a problem of type: $\min f(x)$, with $f: X \to \mathbb{R}$, iff

$$\liminf_{x \to \bar x} \frac{f(x) - f(\bar x)}{\|x - \bar x\|} \ge 0. \tag{4}$$

An upper semistationary point is defined by replacing, above, liminf and $\ge$ with limsup and $\le$, respectively. A point which is both upper and lower semistationary is a stationary point; it obviously fulfils (4) as equality. The above definition is motivated by the following property [6]:

Proposition 1. (i) If $\bar x$ is a local minimum point of $f$ on $X$, then (4) holds. (ii) If $X$ and $f$ are convex, then a lower semistationary point of $f$ on $X$ is a global minimum point.

1) $C^r[a,b]$ denotes the set of real-valued functions of which the first $r$ derivatives are continuous.

The concept of stationarity expressed by (4) is equivalent to that mentioned in [12, page 58]. In fact, the latter requires the existence of a neighbourhood $N$ of $\bar x$ and of a function $\varepsilon: X \times X \to \mathbb{R}$, with $\lim_{x \to \bar x} \varepsilon(\bar x; x - \bar x)/\|x - \bar x\| = 0$, such that:

$$f(x) \ge f(\bar x) + \varepsilon(\bar x; x - \bar x), \quad \forall x \in X_N.$$

When $x \ne \bar x$, this inequality is equivalent to

$$[f(x) - f(\bar x)]/\|x - \bar x\| \ge \varepsilon(\bar x; x - \bar x)/\|x - \bar x\|, \quad \forall x \in X_N,$$

and hence to (4). Starting with (4) instead of equivalent formats, it is possible to achieve general theoretical results and also some "calculus" for a wide class of functions. If $f$ is differentiable, it is easy to show that the left-hand side of (4) collapses to the minimum directional derivative of $f$.

The concept of stationarity expressed by Definition 1 is not, of course, the only one conceivable for the general case. It is probably the simplest, and it enables one to achieve a necessary condition for a wide class of constrained extremum problems, including convex, differentiable, and some (even if not all) discontinuous ones. The concept of stationarity can be strengthened to embrace other classes of problems.

2. Some Properties of a Finite-Dimensional Image

Both here and in the sequel $T(\bar h; \mathcal{K})$ will denote the tangent cone 1) to $\mathcal{K}$ at $\bar h = F(\bar x)$ and $G(\bar h; \mathcal{K}) := \{h \in \mathbb{R}^{1+m} : h = \bar h + \alpha(k - \bar h),\ k \in \mathcal{K},\ \alpha \in [0, +\infty[\}$, the cone generated by $\mathcal{K}$ at $\bar h$; moreover $\langle \cdot, \cdot \rangle$ will denote scalar product and, given a cone $C$, $C^* := \{z : \langle y, z \rangle \ge 0,\ \forall y \in C\}$ will be its positive polar. To problem (1) we associate the generalized difference quotient 2):

$$Q(x; \theta, \lambda) := [w(f(x), g(x); \theta, \lambda) - w(f(\bar x), g(\bar x); \theta, \lambda)] / \|x - \bar x\|,$$

where

$$w(u, v; \theta, \lambda) := \theta u + \langle \lambda, v \rangle, \quad \theta \in \mathbb{R},\ \lambda \in \mathbb{R}^m. \tag{5}$$

1) The tangent cone to set $S \subset \mathbb{R}^\nu$ at $\bar s \in \operatorname{cl} S$ is defined as the set of $\bar s + s$ for which there exists a sequence $\{s_r\} \subseteq S$, such that $\lim_{r \to \infty} s_r = \bar s$, and a positive sequence $\{\alpha_r\} \subset \mathbb{R}$, such that $\lim_{r \to \infty} \alpha_r (s_r - \bar s) = s$. We stipulate that the tangent cone is empty if $S$ is empty; $S \ne \emptyset$ implies that the tangent cone is nonempty and closed.
2) Recall that $F(x) = (f(x), g(x))$, where $f$ is the function defined in Section 1. The symbols not defined in this section are those defined in Section 1.
Under a suitable assumption (5) is a class of weak separation functions, as defined in [4,5]. A necessary condition for (1) will be achieved by considering the upper semistationary points of $w(f(x), g(x); \theta, \lambda)$ on $X$. More general results can be obtained by adopting a nonlinear class of separation functions in place of (5). Recall that $N$ is a neighbourhood (open sphere) of $\bar x$ and set $\mathcal{L}(x; \theta, \lambda) := \theta \varphi(x) - \langle \lambda, g(x) \rangle$, with $\theta \in \mathbb{R}$, $\lambda \in \mathbb{R}^m$.

Proposition 2. There exist multipliers $\theta \in \mathbb{R}$, $\lambda \in \mathbb{R}^m$, with $(\theta, \lambda) \ne 0$, such that:

$$-(\theta, \lambda) \in [G(\bar h; \mathcal{K}_N) - \bar h]^*, \tag{6}$$

iff we have:

$$Q(x; \theta, \lambda) \le 0, \quad \forall x \in X_N \setminus \{\bar x\}. \tag{7}$$

Proof. (6) is equivalent to $\langle (\theta, \lambda), h - \bar h \rangle \le 0$, $\forall h \in G(\bar h; \mathcal{K}_N)$. This inequality, considering that $h \in G(\bar h; \mathcal{K}_N)$ iff either $h = \bar h$ or $\exists \alpha > 0$ such that $k = \bar h + \alpha(h - \bar h) \in \mathcal{K}_N$, holds iff $\langle (\theta, \lambda), k - \bar h \rangle \le 0$, $\forall k \in \mathcal{K}_N$, or:

$$\langle (\theta, \lambda), (f(x) - f(\bar x), g(x) - g(\bar x)) \rangle \le 0, \quad \forall x \in X_N,$$

and hence (7) follows. This completes the proof.

Condition (6) does not imply that $\bar x$ be a minimum point of (1) (take for instance $X = \mathbb{R}$, $\varphi(x) = x^2 - 1$, $g(x) = x$, $\bar x = 1$). Nevertheless (6) is a strong requirement which, however, cannot easily be weakened in the general case.
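The counterexample can be made concrete with a few lines of exact arithmetic. This is a sketch under assumptions: the example is read as $X = \mathbb{R}$, $\varphi(x) = x^2 - 1$, $g(x) = x$, $\bar x = 1$, and the multipliers $\theta = 1$, $\lambda = 2$ are a choice made here for illustration. With them the difference quotient reduces to $-|x - \bar x|$, so (7) holds around $\bar x$, although the feasible point $x = 0$ has a smaller objective value.

```python
from fractions import Fraction


def phi(x):
    return x * x - 1          # objective, as read from the example


def g(x):
    return x                  # single constraint g(x) >= 0


def w(u, v, theta, lam):
    return theta * u + lam * v


X_BAR = Fraction(1)
THETA, LAM = Fraction(1), Fraction(2)   # illustrative multipliers (assumption)


def f(x):
    return phi(X_BAR) - phi(x)


def Q(x):
    """Generalized difference quotient at x != X_BAR for the chosen multipliers."""
    num = w(f(x), g(x), THETA, LAM) - w(f(X_BAR), g(X_BAR), THETA, LAM)
    return num / abs(x - X_BAR)


grid = [X_BAR + Fraction(j, 100) for j in range(-50, 51) if j != 0]
print(all(Q(x) <= 0 for x in grid))          # condition (7) on a grid around x_bar
print(g(Fraction(0)) >= 0, phi(Fraction(0)) < phi(X_BAR))  # x = 0 is feasible and better
```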
The set $\mathcal{K}_N$ can be extended without losing anything from the viewpoint of optimality conditions; consequently the polar cone which appears in (6) is not enlarged, but the lost elements will turn out to be useless multipliers. Denote by $\mathcal{E}(S) := S - \operatorname{cl} \mathcal{H}$ the conic extension of set $S$ with respect to $\operatorname{cl} \mathcal{H}$ (see [5]). It is easy to show (see the proof of Theorem 3.1 of [4]) that, from the optimality viewpoint, $\mathcal{K}$ can be equivalently replaced with $\mathcal{E}(\mathcal{K})$. The same is not true if, before (or after) applying the conic extension, $\mathcal{K}$ is replaced, even locally, with the cone generated by it, unless a suitable assumption is made (for instance, the continuity of $F$).

To achieve a necessary condition for a wide class of problems (including some discontinuous ones, as will be shown by an Example) we have to restrict the class of problems (1); this is done by the following condition; int and conv denote interior and convex hull, respectively.

1) The vector difference is denoted by $-$ while the difference between sets is denoted by $\setminus$.
2) Note that in (6) we have $\bar h \in \mathcal{K}_N$ and this is crucial for achieving the thesis.

Condition C. Let $\bar x \in R$ and set $\bar h = F(\bar x)$. The following statement holds: if system (2) is impossible, then

$$(\operatorname{int} \mathcal{H}) \cap \operatorname{conv} \mathcal{E}[G(\bar h; \mathcal{K}_N)] = \emptyset. \tag{8}$$

Because of its obvious convexity, the set $\operatorname{conv} \mathcal{E}$ of (8) (and hence $\mathcal{E}$) admits a supporting halfspace at $\bar h$, so that (8) implies the following condition:

$$\{\mathcal{E}[G(\bar h; \mathcal{K}_N) - \bar h]\}^* \ne \{0\}. \tag{10}$$

The Theorem of next section will show that (8) is fulfilled in the convex case; it may also be satisfied, however, in some discontinuous cases, as will be shown by the Example of next section. Condition C requires the check of the optimality of $\bar x$. This is obviously a drawback; however this is only apparent. In fact, no difficulty arises in establishing a necessary condition (see the Lemma of next section), where the impossibility of (2) is an assumption. As concerns the check of (8), the Example of next section will show how it is possible, in practice, to overcome the above drawback. However, Condition C (and the corresponding necessary condition, i.e. the Lemma of next section) is conceived as a source for deriving necessary conditions for certain classes of functions, like the Theorem of next section.

Condition (8) is obviously equivalent to claiming the existence of a closed halfspace containing $\mathcal{K}_N$, the set $\operatorname{int} \mathcal{H}$ being contained in its complement. In comparison with this statement, (8) has the advantage that the existence of a separating hyperplane is expressed in terms of cone generated and conic extension, which simplify the proof of the Theorem and let us put Condition C in terms of $\varphi$, $g$ and $X$. As concerns this latest aspect, let us note that Condition C is conceived and expressed in the image space, and it will be used in this form, otherwise the proofs would be unnecessarily complicated. This does not exclude the possibility that the final version should be in terms of the given problem. In other words, we point out the importance of passing to the image space, where the analysis and the proofs are carried on more easily than in the space where the given problem runs, and eventually to interpret in this space the results found.

Now it will be shown that, within a suitable class of functions, we can reduce ourselves to a convex extended image. This will mean that we can handle, as regarding necessary conditions, some nonconvex problems in the same way as convex ones.
Definition 2. Let L and V denote respectively the sets of linear and sublinear functions. A function 1 ^ f will be said
y-dlfferentiable at x = X
Iff there exist the following two functions: g e V , which satisfies the Inequalities f + {%iz)= llmsup t+ 0 and function E: X
2
f
- f (*> E
will be said tc be of class J
at X iff
there exist a neighbourhood N of'x anda = a f (x); class
and
from
f(x)) -
will be called a lower support of f at X. f will be called of at x iff -f is of class y
; i^, = a
will be called
an
upper
support of f at x.
3. A Necessary Condition

First of all we will establish a necessary condition, which holds under a suitable hypothesis and which is expressed in terms of the image of (1) (even if it might be put in terms of (1); see [6]) to stress the importance of the image to carry on the analysis. Such a condition represents a source for deriving practically meaningful theorems. Denote by $\mathcal{L}(x; \theta, \lambda) := \theta \varphi(x) - \langle \lambda, g(x) \rangle$ the generalized Lagrangian function associated with (1).

1) In Definitions 2, 3 and only here $f$ denotes a generic function and not that previously introduced.
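Lemma. Let $\bar x \in X$ and $\bar h$ be the image of $\bar x$ through function $F(x) = (\varphi(\bar x) - \varphi(x), g(x))$, and assume that there exists a neighbourhood $N$ of $\bar x$ such that Condition C is fulfilled. If $\bar x$ is a minimum point of problem (1), then there exist multipliers $\theta \in \mathbb{R}$ and $\lambda \in \mathbb{R}^m$, such that:

$$\liminf_{x \to \bar x} \frac{\mathcal{L}(x; \theta, \lambda) - \mathcal{L}(\bar x; \theta, \lambda)}{\|x - \bar x\|} \ge 0; \tag{11a}$$
$$g(\bar x) \ge 0; \quad \theta \ge 0; \quad \lambda \ge 0; \quad (\theta, \lambda) \ne 0; \tag{11b}$$
$$\langle \lambda, g(\bar x) \rangle = 0. \tag{11c}$$

The multiplier-vector $(\theta, \lambda)$ belongs to the opposite of the polar cone in (9).

Proof. Since $G(\bar h; \mathcal{K}_N) \subseteq \mathcal{E}[G(\bar h; \mathcal{K}_N)]$, we have:

$$\{\mathcal{E}[G(\bar h; \mathcal{K}_N) - \bar h]\}^* \subseteq [G(\bar h; \mathcal{K}_N) - \bar h]^*.$$

From this inclusion and Proposition 2 - recalling that [...] $\theta \ge 0$, and this inclusion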
w S O , so that (12) holds. Now, ab absurdo, let us assume that (11c)
does
not hold. Then, observing that the opposite of a couple (8,X) fulfilling (lla,b) is precisely the above u , and recalling that h=(h o ,h^,....jh^) = = f(x)=(f(x)=0,g(3E)) is the image of x, so that g 1 ( x ) = h 1 ,
we
see
that
v(— 8, — X) € C * there exists an index 1=1,...,m (depending on 8 ,x ) such that X l g l ( x ) > 0 or
i^h^ < 0 . Then, considering that u s 0 and ha0,we find 0
and
6 0. Then it holds:
n r" 2 + 2_ Xj - t 40, and the corresponding looal maximal 1 1-2 3 value is less than the looal minimal value (of. Fig. 2.5a)). k = A, i. e.
2
Case II: 6 0, xe r" ,(^>0, X X-
vm
= { x I f(x) i = X,
4
0, £ . 0 ^ = 0 0 , £¿4. ^
00
;
min f(x->}, w. wj1 (X-.x.K 4 Oictl *• *
+ w, .
4
It is proved by EHLERT [1972] that

$$\lim_{k \to \infty} d(x^k, X^*) = 0,$$

if $X^*$ is a nonempty set.

The typical behaviour of the subgradient methods is an oscillating around the minimal point: the stepsizes $\alpha_k$ are chosen independently of the special situation. The behaviour of the EHLERT-method is smoother than the other one.

The $\varepsilon$-subdifferential is defined as follows (ZOWE [1985]):

$$\partial_\varepsilon f(x) = \{ s \mid s'(y - x) \le f(y) - f(x) + \varepsilon \ \ \forall y \}.$$

[...] (9)

$$X^* = \{ x^* \mid 0 \in \partial f(x^*) \} \ne \emptyset, \text{ bounded}. \tag{10}$$

There are two cases.

Case 1. If $s^k \ne 0$ holds for all $k$, then it is easy to prove using standard arguments that $\lim_{k \to \infty} d(x^k, X^*) = 0$.

Case 2. If there exists $k$ such that $s^k = 0$, then either [...] holds or after finitely many steps (reductions of $\varepsilon_k$) a nonzero vector $s^k$ will be found.

We summarize:

THEOREM. If $\inf f(x) = -\infty$, then $\lim f(x^k) = -\infty$. If the assumptions (9), (10) are fulfilled, then every accumulation point $x^*$ of the sequence $\{x^k\}$ minimizes the function $f$ over $\mathbb{R}^n$.

For an implementation it is possible to apply a bundle concept nearly to the $\varepsilon$-subdifferential method.
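The inequality defining the $\varepsilon$-subdifferential can be tried out on a one-dimensional example. The sketch below is an illustration only: the function $f(x) = |x|$, the test grid and the helper name `is_eps_subgradient` are assumptions, and checking the inequality on a finite grid is a necessary test, not a proof of membership.

```python
from fractions import Fraction


def f(x):
    return abs(x)


# finite grid of test points y (includes y = 0, where |.| has its kink)
GRID = [Fraction(j, 10) for j in range(-50, 51)]


def is_eps_subgradient(s, x, eps):
    """Check s*(y - x) <= f(y) - f(x) + eps on the grid (necessary condition only)."""
    return all(s * (y - x) <= f(y) - f(x) + eps for y in GRID)


x = Fraction(1)
print(is_eps_subgradient(Fraction(1), x, Fraction(0)))        # ordinary subgradient at x = 1
print(is_eps_subgradient(Fraction(3, 5), x, Fraction(0)))     # fails without slack
print(is_eps_subgradient(Fraction(3, 5), x, Fraction(1, 2)))  # admitted once eps = 1/2
print(is_eps_subgradient(Fraction(2, 5), x, Fraction(1, 2)))  # still violated at y = 0
```

Enlarging $\varepsilon$ enlarges the set of admissible $s$, which is the effect the method exploits when it reduces $\varepsilon_k$ step by step.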
References:
EHLERT, J., Ueber Erweiterungen eines allgemeinen Verfahrens von POLJAK zur Loesung von Extremalaufgaben mit Nebenbedingungen und seine Anwendung in der nichtlinearen Optimierung, Diss. Humboldt-Univ. Berlin 1972; Math. Operationsforsch. und Statist. 6 (1975) 1, 91 - 105
LEMARECHAL, C., Nondifferentiable optimization, in: Dixon, Spedicato and Szegoe (eds.), Nonlinear Optimization, Theory and Algorithms (1980)
LEMARECHAL, C., J. J. STRODIOT and A. BIHAIN, On the bundle algorithm for nonsmooth optimization, in: Mangasarian, Meyer and Robinson (eds.), Nonlinear Programming 4 (1981)
MIFFLIN, R., An algorithm for constrained optimization with semi-smooth functions, Mathematics of OR 2 (1977), 191 - 207
POLJAK, B. T., A general method for solving extremal problems, Soviet Math. Dokl. 174, 33 - 36 (1967)
WOLFE, P., A method of conjugate subgradients for minimizing non-differentiable functions, Math. Progr. Study 3 (1975), 145 - 173
ZOWE, J., Nondifferentiable optimisation - a motivation and a short introduction into the subgradient and the bundle concept, NATO ASI Series, Vol. F15, Computational Math. Progr., ed. by K. Schittkowski, 321 - 356 (1985)
STOCHASTIC PROGRAMMING WITH RECOURSE: UPPER BOUNDS AND MOMENT PROBLEMS - A REVIEW
P. Kall 1)

1) University of Zürich, Institut für Operations Research und mathematische Methoden der Wirtschaftswissenschaften, Moussonstrasse 15, CH-8044 Zürich

I. Introduction

Given a set $X \subset \mathbb{R}^n$, a random vector $\xi$ with range $\Xi \subset \mathbb{R}^k$, endowed with the Borel $\sigma$-algebra and a (induced) probability measure $P$, and a vector $c \in \mathbb{R}^n$, the stochastic program with recourse is usually stated as

$$\min_{x \in X} \{ f(x) + E_P Q(x, \xi) \}, \tag{1}$$

where $Q: X \times \Xi \to \mathbb{R}$ is the so-called recourse function and $E_P$ stands for the expectation with respect to the measure $P$. In general the recourse function is assumed to be not only measurable with respect to $\xi$ for any $x \in X$, but also convex in $\xi$ for each $x \in X$, which implies $\Xi$ to be assumed as a convex set. Throughout this paper we shall make this convexity assumption, except for one particular convex-concave case treated recently by K. Frauendorfer [8]. In particular in stochastic linear programming $X$ is assumed to be a convex polyhedral set and the recourse function is defined by

$$Q(x, \xi) = \min \{ q'y \mid Wy = h(\xi) - T(\xi)x,\ y \ge 0 \},$$

where $q \in \mathbb{R}^{n_1}$, $W$ is a $m_1 \times n_1$ matrix, and $h: \Xi \to \mathbb{R}^{m_1}$, $T: \Xi \to \mathbb{R}^{m_1 \times n}$ are assumed to be linear affine in $\xi$. Assuming that $W$ is a complete recourse matrix, i.e.

$$\{ t \mid t = Wy,\ y \ge 0 \} = \mathbb{R}^{m_1} \quad \text{and} \quad \{ u \mid W^T u \le q \} \ne \emptyset,$$

it is well known that $Q(x, \cdot)$ is not only convex for all $x \in X$, but even proper, piecewise linear and convex, more precisely it is polyhedral and sublinear in $h(\xi) - T(\xi)x$ (see e.g. [12], [20]). From the convexity of $Q(x, \cdot)$ we get for $\bar\xi = E_P \xi$ by Jensen's inequality that

$$Q(x, \bar\xi) \le E_P Q(x, \xi).$$

Hence we have a lower bound which requires only the evaluation of $Q$ in $(x, \bar\xi)$ - being in general a very simple task compared to the evaluation of the multiple integral $E_P Q(x, \xi)$, the latter being usually too complicated to be carried out repeatedly in some iterative solution method for (1). Therefore, there is also a need for upper bounds being computable easily compared to the evaluation of $E_P Q(x, \xi)$.
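The Jensen lower bound can be illustrated with the simplest complete recourse structure. The following sketch is built on assumptions that are not part of the text: a one-dimensional simple recourse with $W = (1, -1)$, $h(\xi) = \xi$, $T = 1$, hypothetical unit costs `q_plus` and `q_minus`, and a three-point distribution. For this structure the recourse linear program has the closed-form value used in `recourse`.

```python
from fractions import Fraction


def recourse(x, xi, q_plus=Fraction(3), q_minus=Fraction(1)):
    """Value of min{q+ y+ + q- y- : y+ - y- = xi - x, y+, y- >= 0} (simple recourse)."""
    d = xi - x
    return q_plus * d if d >= 0 else -q_minus * d


# hypothetical discrete distribution: (probability, realization)
SCENARIOS = [(Fraction(1, 4), Fraction(0)),
             (Fraction(1, 2), Fraction(2)),
             (Fraction(1, 4), Fraction(6))]

MEAN = sum(p * xi for p, xi in SCENARIOS)


def expected_recourse(x):
    return sum(p * recourse(x, xi) for p, xi in SCENARIOS)


def jensen_lower_bound(x):
    return recourse(x, MEAN)


for x in (Fraction(0), Fraction(2), Fraction(5)):
    print(x, jensen_lower_bound(x), expected_recourse(x))
```

The bound needs one recourse evaluation per $x$, whereas the expectation needs one per scenario; with a continuous distribution the latter becomes the multiple integral mentioned above.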
It turns out that most of the upper bounds proposed so far are
derived by considering various (generalized) moment problems. shall give a survey on these approaches.
In this paper we
To make them more accessible, the
underlying theoretical results on semi-infinite programming and generalized moment problems are reviewed first. The important question for strategies how to improve the lower and upper bounds and to drive simultaneously x towards a solution of (1) goes beyond the aim of this paper and is not answered in general yet.
The reader may find however
some special proposals in this direction for instance in [14], [9], [7] and [1].
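The Jensen lower bound can be illustrated numerically. The following is a minimal sketch, assuming a one-dimensional simple-recourse problem with $W = (1, -1)$, $h(\xi) = \xi$ and $T = 1$, for which the recourse LP has a closed-form value. The cost coefficients, the scenarios and the function names are hypothetical and serve only as an illustration.

```python
# Simple recourse as a special case of
#   Q(x, xi) = min{ q^T y | W y = h(xi) - T(xi) x, y >= 0 }
# with W = (1, -1), h(xi) = xi, T = 1, q = (q_plus, q_minus).
# All numbers below are hypothetical illustration data.

def recourse(x, xi, q_plus=3.0, q_minus=1.0):
    """Closed-form optimal value of the simple-recourse LP (convex in xi)."""
    shortage = max(xi - x, 0.0)   # y_1: amount by which xi exceeds x
    surplus = max(x - xi, 0.0)    # y_2: amount by which x exceeds xi
    return q_plus * shortage + q_minus * surplus


def expectation(values, probs):
    """Expectation of a discrete random variable."""
    return sum(v * p for v, p in zip(values, probs))


def jensen_lower_bound(x, scenarios, probs):
    """Q(x, E[xi]): a single evaluation of the recourse function."""
    return recourse(x, expectation(scenarios, probs))


def expected_recourse(x, scenarios, probs):
    """E[Q(x, xi)]: one recourse evaluation per scenario."""
    return expectation([recourse(x, xi) for xi in scenarios], probs)


scenarios = [0.0, 1.0, 4.0]
probs = [0.25, 0.5, 0.25]
x = 1.0

lower = jensen_lower_bound(x, scenarios, probs)
exact = expected_recourse(x, scenarios, probs)
print(lower, exact)
```

For these data the bound is 1.5 and the exact expectation is 2.5. With a discrete distribution the expectation is cheap, so the example only checks the inequality; the bound pays off when the expectation is a multiple integral.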
II. Semi-Infinite Programming and the Generalized Moment Problem

As reference for semi-infinite programming we have chosen the textbook of K. Glashoff and S.A. Gustafson [11] because it is very well written and does not need, for the statements of our interest, more than elementary real analysis and linear algebra. For $S$ an arbitrary index set (in general an infinite set), $c \in \mathbb{R}^n$, and $a : S \to \mathbb{R}^n$, $b : S \to \mathbb{R}$ arbitrary functions, we may consider as primal problem the semi-infinite program

$$(P)\qquad \min \{\, c^T y \mid a(s)^T y \ge b(s)\ \ \forall s \in S \,\} \tag{5}$$

and its dual problem, the generalized moment problem to find a positive finite discrete measure with values $x_i$ at points $s_i \in S$ solving

$$(D)\qquad \max \Big\{ \sum_{i=1}^{q} b(s_i)x_i \ \Big|\ \sum_{i=1}^{q} a(s_i)x_i = c,\ x_i \ge 0,\ s_i \in S,\ q \ge 1 \Big\}. \tag{6}$$

For any pair of feasible solutions $y$ of (P) and $\{s_1,\dots,s_q;\ x_1,\dots,x_q\}$ of (D) we get immediately

$$y^T c = y^T \sum_{i=1}^{q} a(s_i)x_i = \sum_{i=1}^{q} [y^T a(s_i)]\,x_i \ge \sum_{i=1}^{q} b(s_i)x_i \tag{7}$$

and hence for the values $v(P)$ and $v(D)$ of (P) and (D), respectively, defined by

$$v(P) := \inf \{\, c^T y \mid a(s)^T y \ge b(s)\ \ \forall s \in S \,\} \tag{8}$$

$$v(D) := \sup \Big\{ \sum_{i=1}^{q} b(s_i)x_i \ \Big|\ \sum_{i=1}^{q} a(s_i)x_i = c,\ x_i \ge 0,\ s_i \in S,\ q \ge 1 \Big\}, \tag{9}$$

the weak duality theorem

Lemma 1 [11]: $v(D) \le v(P)$.

The following statement allows in some sense for minimal representations of solutions of (D) and is a trivial extension of the so-called reduction theorem in [11].

Theorem 1 If $\{s_1,\dots,s_q;\ x_1,\dots,x_q\}$ is feasible in (D), then there exist a subset $\{s_{i_1},\dots,s_{i_m}\} \subset \{s_1,\dots,s_q\}$ and scalars $\hat x_{i_1},\dots,\hat x_{i_m}$ such that $\{s_{i_1},\dots,s_{i_m};\ \hat x_{i_1},\dots,\hat x_{i_m}\}$ is feasible in (D), the set of vectors $\{a(s_{i_j}) \mid \hat x_{i_j} > 0\}$ is linearly independent, and, if $v(D) < +\infty$,

$$\sum_{j=1}^{m} b(s_{i_j})\hat x_{i_j} \ge \sum_{i=1}^{q} b(s_i)x_i .$$

Proof Just rephrase the well-known fact in linear programming that feasibility implies the existence of basic solutions and boundedness yields the existence of an optimal basic solution. □

Defining the set of vectors $a(S) := \{a(s) \mid s \in S\}$, the constraints of (D) involve the positive hull of $a(S)$, i.e. the set

$$M_n := \operatorname{pos} a(S) = \Big\{ z \ \Big|\ z = \sum_{i=1}^{q} a(s_i)x_i,\ x_i \ge 0,\ s_i \in S,\ q \ge 1 \Big\},$$

a convex cone called the moment cone.
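Weak duality (7) can be checked on a small instance. The sketch below assumes the toy data $S = [0,1]$, $a(s) = (1, s)^T$, $b(s) = s^2$ and $c = (1, 0.5)^T$; these data and all function names are illustrative assumptions, and primal feasibility is only verified on a finite grid of $S$, which is a numerical check rather than a proof.

```python
# Toy semi-infinite program (all data are illustrative assumptions):
#   S = [0, 1], a(s) = (1, s), b(s) = s**2, c = (1, 0.5).
# (P) min c^T y            s.t. a(s)^T y >= b(s) for all s in S
# (D) max sum b(s_i) x_i   s.t. sum a(s_i) x_i = c, x_i >= 0, s_i in S

C = (1.0, 0.5)


def a(s):
    return (1.0, s)


def b(s):
    return s * s


def primal_feasible(y, grid_size=1001, tol=1e-12):
    """Check a(s)^T y >= b(s) on a grid of S (finite check, not a proof)."""
    for k in range(grid_size):
        s = k / (grid_size - 1)
        lhs = sum(ai * yi for ai, yi in zip(a(s), y))
        if lhs < b(s) - tol:
            return False
    return True


def dual_feasible(points, weights, tol=1e-12):
    """Check x_i >= 0, s_i in S and the moment equations sum a(s_i) x_i = c."""
    if any(x < 0.0 for x in weights):
        return False
    if any(not 0.0 <= s <= 1.0 for s in points):
        return False
    for j in range(len(C)):
        moment = sum(a(s)[j] * x for s, x in zip(points, weights))
        if abs(moment - C[j]) > tol:
            return False
    return True


def primal_value(y):
    return sum(cj * yj for cj, yj in zip(C, y))


def dual_value(points, weights):
    return sum(b(s) * x for s, x in zip(points, weights))


y = (0.0, 1.0)                          # s >= s**2 holds on [0, 1]
one_point = ([0.5], [1.0])              # unit mass at s = 0.5
two_point = ([0.0, 1.0], [0.5, 0.5])    # mass 1/2 at each end point

print(primal_value(y), dual_value(*one_point), dual_value(*two_point))
```

In this instance the two-point measure attains the primal value, so there is no duality gap, while the one-point measure gives a strictly smaller dual value.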
Obviously, the following holds.

Lemma 2 [11]: (D) is feasible iff $c \in M_n$.

We may have a situation where (P) is solvable, (D) is feasible but not solvable, but at least the values of (P) and (D) coincide. There are even cases where $v(D) < v(P)$ (duality gap), as can be seen in [11]. To state conditions under which the solvability of (D) and (P), respectively, follows, it is convenient to reformulate (D) as follows: Define

$$M_{n+1} := \operatorname{pos} \bar a(S), \quad\text{where}\quad \bar a(s) := (b(s), a(s)^T)^T .$$

Then, for $c$ fixed as in (6), our generalized moment problem is obviously equivalent to

(D) $\max \{\, c_0 \mid$ [...]

[...] $a(\cdot)$, $b(\cdot)$ should be continuous, implying that $S$ is some topological space, which we assume for simplicity to be a subset of $\mathbb{R}^k$. But this does not yet imply $M_{n+1}$ to be closed. Therefore we often find in the literature the

Assumption C: $S \subset \mathbb{R}^k$ is compact; $a(\cdot)$, $b(\cdot)$ are continuous.

As mentioned above we choose $S \subset \mathbb{R}^k$ just for simplicity; instead, the assumptions "S a compact metric space" or "S a compact Hausdorff space" are quite common, too. However, assumption C would also not yet yield $M_{n+1}$ closed in general.
Recalling nonlinear programming and the duality statements (and Kuhn-Tucker conditions) discussed there, we can hardly expect to get through without any regularity assumption in semi-infinite programming. Hence we may try the natural extension of the familiar Slater condition for convex programs, which is called in [11]

Superconsistency of (P): (P) is superconsistent if $\exists\, y : a(s)^T y > b(s)\ \ \forall s \in S$.

Theorem 3 [11] If (P) is superconsistent and assumption C holds, then $M_{n+1}$ is closed.

Proof $M_{n+1} = \operatorname{pos} \bar a(S) = \bigcup_{\lambda \ge 0} \lambda \cdot \operatorname{co} \bar a(S)$, where $\operatorname{co} \bar a(S)$ is the convex hull of $\bar a(S)$. By C obviously $\bar a(S)$ and hence $\operatorname{co} \bar a(S)$ are compact. With $z^\nu \in M_{n+1}$ and $z^\nu = \lambda_\nu y^\nu$, $y^\nu \in \operatorname{co} \bar a(S)$, $\lambda_\nu \ge 0$, let

$$z = \lim_{\nu \to \infty} z^\nu = \lim_{\nu \to \infty} \lambda_\nu y^\nu .$$

Then either $\{\lambda_\nu\}$ is bounded, implying $z \in M_{n+1}$, or $\{\lambda_\nu\}$ is unbounded and hence $\{\tfrac{1}{\lambda_\nu}(\lambda_\nu y^\nu)\} \subset \operatorname{co} \bar a(S)$ accumulates to 0, i.e. $0 \in \operatorname{co} \bar a(S)$. This implies the existence of $s_i \in S$, $\alpha_i \ge 0$, $i = 1,\dots,n+2$, such that

$$\sum_{i=1}^{n+2} \alpha_i = 1 \quad\text{and}\quad \sum_{i=1}^{n+2} \alpha_i \big(y^T a(s_i) - b(s_i)\big) = 0 \quad \forall y \in \mathbb{R}^n,$$

in contradiction to the Slater condition. □

Theorem 4 [11] If
(i) assumption C holds,
(ii) (D) is feasible,
(iii) (P) is superconsistent,
then (D) is solvable and $v(P) = v(D)$.
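Proof By Theorem 3, Lemma 1 and Theorem 2, (D) is solvable with $(c_0, c^T)^T \in M_{n+1}$, where $c_0 = v(D)$. Hence

$$(c_0 + \varepsilon, c^T)^T \notin M_{n+1} \quad \forall \varepsilon > 0,$$

and $\exists\, \tilde y \in \mathbb{R}^{n+1}$, $\tilde y \ne 0$, such that

$$\tilde y^T x \le 0 < \tilde y^T (c_0 + \varepsilon, c^T)^T \quad \forall x \in M_{n+1},$$

where $\tilde y_0 > 0$, since $(c_0, c^T)^T \in M_{n+1}$. Writing $\tilde y = (\tilde y_0, y^T)^T$, it follows, observing $(b(s), a(s)^T)^T \in M_{n+1}\ \forall s \in S$, that $-y/\tilde y_0$ is feasible in (P) with

$$v(P) \le -\frac{c^T y}{\tilde y_0} < c_0 + \varepsilon = v(D) + \varepsilon \le v(P) + \varepsilon \quad \forall \varepsilon > 0 . \qquad \square$$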
From Theorem 2 and the construction in the proof of Theorem 4 follows Corollary If (D) is feasible, v(D)
$u^0$ and $h^k \to h$ with $h \ne 0$, assumption (ii)' then yields the existence of some $(n,n)$-matrix $M_0$ such that

$$M_0 h \in F(x^0,u^0,t^0)h,$$

and, by the above relations, $h^T M_0 h \le 0$, which contradicts, together with (2.8), assumption (iii). This completes the proof. //

Proof of Theorem 1: (a) cf. Remark 1 above.
(b) Applying Lemma 1 with $t^k \equiv t^0$, $x^k \equiv x^0$ and for some sequence $\{y^k\} \subset S(t^0)$ satisfying $y^k \to x^0$ and $f_0(y^k,t^0) \le f_0(x^0,t^0)$ (for all $k$), we obtain that $y^k = x^0$ if $k$ is large enough, i.e., there exists some real number $s > 0$ such that

$$S(t^0) \cap B(x^0,s) = \{x^0\}. \tag{2.9}$$

The continuity assumptions on $f_i$ and $\nabla_x f_i$ ($i = 1,\dots,m$) and assumption (i) of the theorem allow to find positive real numbers $r \le s$ and $q$ such that the Mangasarian-Fromovitz CQ remains satisfied for all $x \in B(x^0,r)$ w.r. to $M(t)$, where $t \in B(t^0,q)$, and, moreover, such that the mapping

$$t \mapsto S(t) \cap B(x^0,r) \tag{2.10}$$

is upper semicontinuous in Berge's sense (u.s.c.) at $t^0$. For the u.s.c. property of mapping (2.10) we refer to Robinson [15, Th. 2.3]. In virtue of $S(t^0) \cap B(x^0,r) = \{x^0\}$ (by (2.9)) and of the u.s.c. of the mapping (2.10) at $t^0$, it follows immediately from Lemma 1 that there is some positive real number $q'$ such that $S(t) \cap B(x^0,r)$ contains at most one point if $t \in B(t^0,q')$.

On the other hand, the continuity of the functions $f_0, f_1, \dots, f_m$, assumption (i) and the fact that, by (a), $x^0$ is a strict local minimizer of $P(t^0)$ allow to apply a standard argument on persistence and continuity of local minimizers for perturbed programs, cf., for example, Robinson [16, Th. 4.3 and §5] or Klatte [8, Th. 1]: For each $\varepsilon \in (0,r]$ there exists some $q'' = q''(\varepsilon) > 0$ such that $B(x^0,\varepsilon)$ contains at least one local minimizer $x(t)$ of $P(t)$ if $t \in B(t^0,q'')$. If $q'' \le q$ then $t \in B(t^0,q'')$ implies that $x(t)$ satisfies the Mangasarian-Fromovitz CQ w.r. to $M(t)$, as shown above, and hence belongs to $S(t)$. Setting for any $\varepsilon \in (0,r]$, $\delta(\varepsilon) := \min\{q, q', q''(\varepsilon)\}$, we have the desired result.

(c) In (b), it was particularly proved that $x(t^0) = x^0$ and that $x^0$ is a strongly stable local minimizer of $P(t^0)$. If $x(\cdot)$ is the mapping constructed by (b), then it is a standard exercise to show that for all $t$ sufficiently close to $t^0$, the assumptions of the theorem remain true if $x^0$ is replaced by $x(t)$ (and that, hence, $x(t)$ is a strongly stable local minimizer of $P(t)$). The details of the proof are omitted here. //
3. Consequences and special cases

First we present a consequence of Theorem 1, which is of interest for two-level optimization problems. Consider the parametric program introduced in §1:

$$P(t):\quad \min_x \{\, f_0(x,t) \mid x \in M(t) \,\},\quad t \in T,$$

suppose $T = \mathbb{R}^k$ and put $v(t) := \inf_x \{\, f_0(x,t) \mid x \in M(t) \,\}$, $t \in T$. Let $\operatorname{dom} v := \{\, t \in T \mid v(t) \in \mathbb{R} \,\}$. If the optimization problem

$$\min_{(x,t)} \{\, f_0(x,t) \mid x \in M(t) \,\} \tag{3.1}$$

is intended to be solved by a decomposition method in two phases, the first phase is (usually) the parametric problem $P(t)$, $t \in T$, while the second phase will be an optimization problem w.r. to the variable $t$:

$$\min_t \{\, v(t) \mid t \in \operatorname{dom} v \,\}. \tag{3.2}$$
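The two-phase scheme (3.1)/(3.2) can be sketched on a toy instance. The example assumes $f_0(x,t) = (x-t)^2 + (t-1)^2$ with the constant feasible set $M(t) = [2, 10]$ and a scalar parameter $t$; these data, the search intervals and the function names are hypothetical. Both phases use ternary search, which is valid here only because the toy problem is convex; it is not the method of the paper.

```python
# Toy two-level problem (all data are illustrative assumptions):
#   f0(x, t) = (x - t)**2 + (t - 1)**2,  M(t) = [2, 10] for every t.
# Phase 1 solves P(t) for fixed t and returns x(t) and v(t);
# phase 2 minimizes the optimal value function v over t.

X_LOWER, X_UPPER = 2.0, 10.0


def f0(x, t):
    return (x - t) ** 2 + (t - 1.0) ** 2


def ternary_argmin(func, lo, hi, iters=200):
    """Minimize a unimodal function on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) <= func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def phase_one(t):
    """Parametric problem P(t): returns the minimizer x(t) and the value v(t)."""
    x_t = ternary_argmin(lambda x: f0(x, t), X_LOWER, X_UPPER)
    return x_t, f0(x_t, t)


def phase_two(t_lo=-5.0, t_hi=5.0):
    """Problem (3.2): minimize v(t) over an interval of parameters."""
    t_opt = ternary_argmin(lambda t: phase_one(t)[1], t_lo, t_hi)
    x_opt, v_opt = phase_one(t_opt)
    return x_opt, t_opt, v_opt


x_opt, t_opt, v_opt = phase_two()
print(x_opt, t_opt, v_opt)
```

For $t < 2$ the inner minimizer sits on the boundary $x(t) = 2$, so $v(t) = (2-t)^2 + (t-1)^2$, which is minimal at $t = 1.5$ with value $0.5$; the pair $(2, 1.5)$ also minimizes $f_0$ jointly over $x \in [2,10]$.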
Now we give a sufficient condition for a point $(x^0,t^0)$ to be a local minimizer of problem (3.1). In the case of $C^2$-data, the subsequent theorem becomes a version of Theorem 3.1 in Jongen, Möbert and Tammer [7].

Theorem 2. Consider the parametric program $P(t)$, $t \in T$, and the problems (3.1) and (3.2) under the general assumptions of §1, and let $T = \mathbb{R}^k$. Then a point $(x^0,t^0)$ is a local minimizer of the optimization problem (3.1) if $x^0$ is a local minimizer of $P(t^0)$ which satisfies the assumptions of Theorem 1, and if $t^0$ is a local minimizer of the program (3.2).

Proof: The mapping $x(\cdot)$ occurring in proposition (c) of Theorem 1 possesses all properties of the corresponding function considered in the proof of Theorem 3.1 in [7]. In order to prove the assertion of our theorem, one may exactly follow the arguments given in the mentioned proof in [7]. //

In the preceding theorem, the assumptions imposed on the local minimizer $x^0$ ensure, evidently, its strong stability. The importance of
the strong stability property of $x^0$ is illustrated in [7] by an example of an unconstrained optimization problem with a polynomial objective function of two variables.

We finish the paper by discussing the hypotheses of Theorem 1 for several special cases. First we note that assumption (iii) may be verified in a rather simple way if the functions $f_0(\cdot,t^0), f_1(\cdot,t^0), \dots, f_m(\cdot,t^0)$ have a particular form or belong to certain subclasses of $C^{1,1}(D)$; we refer to Klatte and Tammer [9]. Now, a more extensive discussion will concern the question how to simplify assumption (ii). We recall that Kojima [10, §8] studies strong stability of local minimizers under the additional assumptions (for $i = 0,1,\dots,m$)

$$f_i(\cdot,t) \text{ is twice differentiable on } D \quad (\forall t \in T), \tag{3.3}$$

$$\nabla_x^2 f_i \text{ is continuous on } D \times T. \tag{3.4}$$
Under these additional assumptions, (ii) is automatically fulfilled. Indeed, we have in this case (with $x \in D$, $u \in \mathbb{R}^m$, $t \in T$): $\partial_x^2 l(x,u,t) = \{\nabla_x^2 f_0$ [...]

[...] is locally bounded and closed on $D \times Q \times T$ (by (E1)). Moreover, we have that $\partial_x^2 l(x,u,t) \subset \operatorname{proj}_x \partial^2 l(x,u,t)$ (for $(x,u,t) \in D \times Q \times T$), where $\operatorname{proj}_x$ stands for the projection $(M_x, M_u, M_t) \in \partial^2 l(x,u,t) \mapsto M_x$. It is easy to verify that the multifunction $(x,u,t) \mapsto F(x,u,t) := \operatorname{proj}_x \partial^2 l(x,u,t)$ is also locally bounded and closed on $D \times Q \times T$ and thus fulfils (ii).

Remark 3. Literature on decomposition methods pays special attention to optimization problems in which the objective function is separable w.r. to two groups of variables, cf., for example, Bank et al. [2], Beer [3]. In this context, an interesting special case of $P(t)$, $t \in T$, is the following: Suppose $f_0(x,t) = g_0(x) + h_0(t)$, where $g_0 \in C^{1,1}(D)$, and $h_0 : T \to \mathbb{R}$ is continuous on $T$. Further, let $f_1,\dots,f_m$ satisfy (3.3) and (3.4), together with the general assumptions of §1. Of course
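[...]

holds for all $(x,u,t) \in D \times \mathbb{R}^m \times T$. If $t^0$, $x^0$ and [...] $\{(x,t) \in D \times T \mid f(x,t) = g_j(x,t)\}$. Then, by Theorem 4 in [9], it holds with $J(f,x,t) := \{\, j \in J \mid f(x,t) = g_j(x,t) \,\}$

$$\partial_x^2 f(x,t) \subset \operatorname{conv} \{\, \nabla_x^2 g_j(x,t) \mid j \in J(f,x,t) \,\} \quad (\text{for } (x,t) \in D \times T).$$

For any $(\bar x,\bar t) \in D \times T$, we have $J(f,x,t) \subset J(f,\bar x,\bar t)$ for all $(x,t)$ near $(\bar x,\bar t)$, since $f$ and $g_j$ are continuous functions. Taking the continuity of $\nabla_x^2 g_j$ (for all $j \in J$) into account, we can consequently conclude that for each function $f \in CS°(g_1,\dots,g_N)$, the multifunction

$$(x,t) \mapsto \operatorname{conv} \{\, \nabla_x^2 g_j(x,t) \mid j \in J(f,x,t) \,\} \tag{3.5}$$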
is locally bounded and closed on $D \times T$. Consider now the parametric problem $P(t)$, $t \in T$, and suppose there are no equality constraints (i.e., $p = 0$). Further suppose that for all $i = 0,1,\dots,m$, $f_i \in CS°(g_1,\dots,g_N)$. Let $t^0$, $x^0$ and $Q$ be given as in Remark 2. Replacing in (3.5) $f$ by $f_i$, we thus have that for each $i = 0,1,\dots,m$, the multifunction $F_i$ defined by

$$F_i(x,t) := \operatorname{conv} \{\, \nabla_x^2 g_j(x,t) \mid j \in J(f_i,x,t) \,\}, \quad (x,t) \in D \times T, \tag{3.6}$$

is locally bounded and closed on $D \times T$. Hence, the multifunction $F$ defined by

$$F(x,u,t) := F_0(x,t) + \sum_{i \in I^+(u)} u_i F_i(x,t), \quad (x,u,t) \in D \times Q \times T,$$

is locally bounded and closed on $D \times Q \times T$. Further, for each $h \in \mathbb{R}^n$,

$$\partial_x^2 l(x,u,t)h \subset \partial_x^2 f_0(x,t)h + \sum_{i \in I^+(u)} u_i\, \partial_x^2 f_i(x,t)h \subset F(x,u,t)h$$
holds on $D \times Q \times T$ in virtue of property (E2) of $C^{1,1}$-functions, and so, $F$ satisfies (ii). As $p = 0$, hence $I^+(u^0) = I_2^+(u^0)$, and definition (3.6) yield that the verification of assumption (iii) reduces, in this case, to checking the positive definiteness of $\nabla_x^2 g_j(x^0,t^0)$ on $W^+(x^0,u^0,t^0)$ for all indices $j$ belonging to some subset of $J$.

References

[1] Auslender, A.: Stability in mathematical programming with nondifferentiable data. SIAM J. Control and Optimization 22 (1984) 239-254.
[2] Bank, B., H. Mandel and K. Tammer: Parametrische Optimierung und Aufteilungsverfahren. In: Lommatzsch, K. (ed.): Anwendungen der linearen parametrischen Optimierung. Akademie-Verlag, Berlin, 1979.
[3] Beer, K.: Lösung großer linearer Optimierungsaufgaben. VEB Deutscher Verlag der Wissenschaften, Berlin, 1977.
[4] Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, New York, 1983.
[5] Guddat, J., Hj. Wacker and W. Zulehner: On imbedding and parametric optimization. Math. Programming Study 21 (1984) 79-96.
[6] Hiriart-Urruty, J.-B., J.J. Strodiot and V. Hien Nguyen: Generalized Hessian matrix and second-order optimality conditions for problems with $C^{1,1}$-data. Appl. Math. Optim. 11 (1984) 43-56.
[7] Jongen, H.Th., T. Möbert and K. Tammer: On iterated minimization in nonconvex optimization. Math. Ops. Res. 11 (1986) 679-691.
[8] Klatte, D.: On the stability of local and global optimal solutions in parametric problems of nonlinear programming. Seminarbericht Nr. 75, Sektion Mathematik, Humboldt-Universität, Berlin, 1985.
[9] Klatte, D. and K. Tammer: On second-order sufficient optimality conditions for $C^{1,1}$-optimization problems. Optimization 19 (1988), to appear.
[10] Kojima, M.: Strongly stable stationary solutions in nonlinear programs. In: Robinson, S.M. (ed.): Analysis and Computation of Fixed Points. Academic Press, New York, 1980.
[11] Kummer, B.: Linearly and nonlinearly perturbed optimization problems. In: Guddat, J., H.Th. Jongen, B. Kummer, F. Nožička (eds.): Parametric Optimization and Related Topics. Akademie-Verlag, Berlin, 1987.
[12] Kummer, B.: Ein Zugang zur quantitativen Stabilität in der nichtlinearen Optimierung: Variation von Kakutani-Abbildungen. Manuskript, Sektion Mathematik, Humboldt-Universität, Berlin, 1987.
[13] Lehmann, B.: On the numerical feasibility of continuation methods for nonlinear programming problems. Math. Operationsforsch. Stat., Series Optimization [...]

[...] For given $y \in Y$, let $L(y)$ denote the set of all $x$ satisfying $y \in F(x,t)$. We shall say that $M$ is regular (related to $t$) if there are positive reals $\varepsilon$ and $l$ such that

$$(M + \varepsilon B_X) \cap L(y) \subset M + l\,\|y\|\, B_X \quad \forall y \in \varepsilon B_Y \tag{1.3}$$

and

$$\lim_{\tau \to 0} T(\varepsilon,\tau) = 0 \tag{1.4}$$

hold. In the case $M = \{t\}$ we will simply say that $t$ is regular.

Proposition 1. If $M$ is regular (related to $t$) then there is some $\alpha > 0$ such that both

[...] (1.5)
r ^ 7 * Inf T ( r ' , t ) r'»r
(1.6)
where $\varepsilon$ and $l$ are from (1.3) and (1.4). Moreover, such an $\alpha$ is defined by the following conditions
D " « « « C ( 6 , t V t f e
(1.7)
c* B x
Proof. Let (1.5) and (1.7) be fulfilled, and let $\beta$ belong to the open interval $(0, \varepsilon - T(\varepsilon,\tau))$. Since $0 \in F(x,t+\tau)$ and $\operatorname{dist}(x,M) = r$, for each $r' > r$, we have $x \in M + r'B_X$. By the definition of $T$ one finds some $y$ such that

$$\|y\| \le T(r',\tau) + \beta \quad\text{and}\quad 0 \in F(x,t) + y.$$

Obviously, we may assume that $r'$ [...] (1.6).

As a particular result of the proposition we obtain that the solution sets $S(t+\tau)$, restricted to the neighbourhood $M + \alpha B_X$ of $M$, form an upper semicontinuous (at $\tau = 0$) mapping (in Hausdorff's sense), briefly u.s.c.(H). It is very clear that several simple assumptions allow to replace the estimation (1.6) by [...]

[...] $Df_j(x^k) = Df(x^k)$ for $j = \nu(k)$. After selecting some subsequence satisfying $\nu(k) = \nu'$ (for all $k$) this shows $A \in C$ and leads to $\partial f(x) \subset C$.
Knowing Proposition 4, the proof of (2.1), in the case I.2, can be organized as in the case I.1 before. We have only to note that it is enough to consider extremal points $A_\tau \in G(t+\tau)$ in (2.1), and we must replace the set $I(x)$ by $J(x)$.

Finally, let us look at the case II in the introduction, i.e. $f : \mathbb{R}^n \to \mathbb{R}^n$ locally Lipschitz, $G(x) = \partial f(x)$, $\operatorname{card} \partial f(x) = 1$. The verification of (2.1), for $t = x$, then follows via

$$\|f(t+\tau) - f(t) - Df(t)\tau\| \le o_1(\tau) \quad\text{and}\quad \limsup_{A_\tau \in \partial f(t+\tau),\ \tau \to 0} \|A_\tau - Df(t)\| = 0 \tag{2.3}$$

by the standard estimation

$$\|f(t+\tau) - f(t) - A_\tau \tau\| \le o_1(\tau) + \|A_\tau - Df(t)\| \cdot \|\tau\| .$$

The condition (*) means nothing else than that $Df(x)$ is a regular matrix. It should be mentioned that, in each of these cases, ALG 1 converges locally superlinearly, where the estimations are given by the theorem.

2.2. Complementarity problems
Let $F$ be given by the formula

$$F(x,t) = f(t) + G(t)(x-t) + N(x). \tag{2.4}$$

This mapping differs from (1.2) by the fixed multifunction $N$ not depending on $t$, and it is related to the complementarity problem

$$0 \in f(x) + N(x). \tag{2}$$

The mapping $N$, most often, is a normal cone mapping associated with some closed convex set. In this case we have $Y = X^*$. If we modify ALG 1 by solving, in step $k$,

$$0 \in f(x^k) + A(x - x^k) + N(x), \tag{2.5}$$

then the results above may be applied. Particularly, the function X will not depend on $N$. However, it becomes now more difficult to ensure regularity (see 1.3) as well as solvability of each inclusion (2.5) for $x^k$ near to some solution $x^*$ of (2) and for any $A \in G(x^k)$. In this context, we refer to B. Kummer 1987, 1984 and to the basic paper S.M. Robinson 1979. For the case that (2) describes a Kuhn-Tucker system, the papers D. Klatte/K. Tammer (1987) and D. Klatte (1988; in the present volume) are very interesting.
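Step (2.5) can be made concrete in the simplest setting. The sketch below assumes $X = \mathbb{R}$, $N$ the normal cone mapping of $[0,\infty)$, a smooth $f$, and $A = f'(x^k) > 0$ as the single element of $G(x^k)$; under these assumptions the inclusion (2.5) has the unique solution $\max\{0,\ x^k - f(x^k)/A\}$. ALG 1 itself is not restated in this section, so the loop, the stopping rule and the function names are hypothetical.

```python
# One-dimensional illustration of step (2.5), assuming N is the normal cone
# mapping of [0, inf):  0 in f(x) + N(x)  <=>  x >= 0, f(x) >= 0, x * f(x) = 0.
# A = f'(x_k) is used as the only element of G(x_k) (an assumption).


def newton_step(f, df, x_k):
    """Solve 0 in f(x_k) + A (x - x_k) + N(x) for A = df(x_k) > 0."""
    A = df(x_k)
    if A <= 0.0:
        raise ValueError("this sketch needs A > 0 for a unique solution")
    z = x_k - f(x_k) / A      # zero of the linearization
    return max(0.0, z)        # if z < 0, the solution is the boundary point 0


def solve(f, df, x0, tol=1e-12, max_iter=50):
    """Iterate the modified step until successive iterates agree up to tol."""
    x = x0
    for _ in range(max_iter):
        x_next = newton_step(f, df, x)
        if abs(x_next - x) <= tol:
            return x_next
        x = x_next
    return x


def residual(f, x):
    """Natural residual min(x, f(x)); it vanishes exactly at solutions."""
    return min(x, f(x))


# f(x) = x**2 - 4: the solution x = 2 lies in the interior, where f(x) = 0.
root_interior = solve(lambda x: x * x - 4.0, lambda x: 2.0 * x, x0=5.0)

# f(x) = x + 1: the solution is the boundary point x = 0 with f(0) = 1 > 0.
root_boundary = solve(lambda x: x + 1.0, lambda x: 1.0, x0=3.0)

print(root_interior, root_boundary)
```

The second example shows the role of $N$: the linearization has its zero at a negative point, and the inclusion selects the boundary point instead.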
2.3. A pathological example

In this section, we define a real Lipschitz function $f$ having the following properties.

$f(0) = 0$, $Df(0) = 1$, $\partial f(0) = [\tfrac{1}{2}, 2]$, $f^{-1}$ exists.

ALG 1 (with $G = \partial f$) fails to converge for each starting point $x^0 \ne 0$ provided that $Df(x^0)$ exists.

For the equation $f(x) - \lambda = 0$, ALG 1 finds the solution $x(\lambda)$ by one step whenever $\lambda \ne 0$ and $\|x^0 - x(\lambda)\|$ [...]

In order to construct $f$ we fix any natural number $n > 1$ and consider the interval $I_n = [\tfrac{1}{n}, \tfrac{1}{n-1}]$. Let $m$ and $m'$ be the middle points of $I_n$ and $I_{2n}$, respectively. Setting
8
=
2n
TRtt
'
k
b
=
8n-4
Tt
we define two linear functions by f„(x) = a(x+m), f 2 ( x ) They fulfil -1, 1 , _ 1 f2fJU f ( n 7 T T ) " TT-T' V n J f
n(m)
|.---.bn)» T g 5= ( g y f f ' t ^ t
(P^f-»Pm) » n g. : = a. . (j=H, ...,m) 0 iM
and
g 0
n := 1=1
b..
1
The concrete algorithm we will describe takes into account the following two specifics:

A) a series of problems (1) has to be solved, where the single problems arise by setting several unknowns $p_j$, a priori, equal to zero,

B) the moduli of the coefficients $g_j$ in the equation $g^T p = g_0$ differ from that of the right-hand side $g_0$ by 8 or 9 powers of 10.

Therefore a special algorithm for minimizing a strictly convex differentiable function $f(p)$ on a simplex $M$ of the form given above is developed. Let us denote

$$\min \{\, f(p) \mid p \in M \,\}. \tag{2}$$

Humboldt-Universität zu Berlin, Sektion Mathematik, PSF 1297, 1086 Berlin, GDR
The idea of the algorithm is to solve a sequence of quadratic subproblems on linear manifolds in the Euclidean $m$-space $\mathbb{R}^m$. Let

$$\mathbb{R}^m_+ := \{\, p \in \mathbb{R}^m \mid p \ge 0 \,\}, \quad U^0 := \{\, p \in \mathbb{R}^m \mid g^T p = g_0 \,\} \quad\text{and}\quad U^j := \{\, p \in \mathbb{R}^m \mid p_j = 0 \,\},\ j = 1,\dots,m.$$

Then the simplex $M = \mathbb{R}^m_+ \cap U^0$ is the feasible set of problem (2) and, for an arbitrary index set $J \subset \{1,\dots,m\}$ with $|J| \le m-1$, the set

$$S^J = \mathbb{R}^m_+ \cap U^0 \cap \Big( \bigcap_{j \in J} U^j \Big) \tag{3}$$

is a closed face of the simplex $M$. Its vertices have the form $v^j = (0,\dots,0, g_0/g_j, 0,\dots,0)$. Since $g_j$ and $g_0$ are positive, every point in $U^0$ has at least one positive coordinate. Now we can formulate the subproblems mentioned above:

$$\min \Big\{\, f(p) \ \Big|\ p \in U^0 \cap \Big( \bigcap_{j \in J} U^j \Big) \Big\} \tag{4}$$

Let $p^J$ be the uniquely defined solution of (4) and $J_- := \{\, j \mid p^J_j < 0 \,\}$. If $J_- = \emptyset$ then $p^J$ is a feasible point of (2); otherwise, in the case $J_- \ne \emptyset$, the optimal point of the problem $\min \{ f(p) \mid p \in S^J \}$ belongs at least to one of the faces $S^{J \cup \{j\}}$, $j \in J_-$, of the simplex (this immediately follows by the convexity of $f(p)$).

If the point $p^J$ is feasible, then it is optimal for the original problem (2) if and only if there is a number $u^J$ such that

$$u^J = \frac{1}{g_j} \frac{\partial f(p^J)}{\partial p_j},\ j \in J_+, \qquad u^J \le \frac{1}{g_j} \frac{\partial f(p^J)}{\partial p_j},\ j \notin J_+, \tag{5}$$

where $J_+ = \{\, j \mid p^J_j > 0 \,\}$. Indeed, these are the well-known Kuhn-Tucker conditions for problem (2). If there is an index $j^* \in J$ with $\frac{1}{g_{j^*}} \frac{\partial f(p^J)}{\partial p_{j^*}} < u^J$, then

$$\frac{\partial f(p^J)}{\partial p} (v^{j^*} - p^J) < 0$$

and the problem

$$\min \Big\{\, f(p) \ \Big|\ p \in U^0 \cap \Big( \bigcap_{j \in J \setminus \{j^*\}} U^j \Big) \Big\} \tag{6}$$

has points
superior in the sense of optimality. Note that an index set $\bar J \subset \{1,\dots,m\}$ exists for which the optimal point of the original problem (2) is a solution of the unbounded problem (4) where $J := \bar J$. On account of these properties one can propose the following algorithm:

Let the steps $1, 2, \dots, k-1$ be executed. By the step $k-1$ we obtained the index set $J^k$.

(a) Solve the unbounded problem (4) where $J := J^k$. If the resulting optimal point $p^k$ has only non-negative coordinates, then go to (b); otherwise set $J^k := J^k \cup \{\, j \mid p^k_j < 0 \,\}$ and repeat (a).

(b) Construct the set $J^k_- := \{\, j \in J^k \mid \frac{1}{g_j} \frac{\partial f(p^k)}{\partial p_j} < u^{J^k} \,\}$. If $J^k_-$ is empty, then $p^k$ is a solution of the given problem (2). Otherwise set $J^{k+1} := J^k \setminus J^k_-$ and go to step $k+1$.

As an initial index set $J^1$ we can choose any proper subset of $\{1,\dots,m\}$.

The substep (a) of the general step can be repeated at most $m-1-|J^k|$ times. Not all indices dropped from the index set $J^k$ in substep (b) will return into the index set $J^{k+1}$ constructed by the following substep (a). This is a consequence of the fact that $\frac{\partial f(p^k)}{\partial p}(p^{k+1} - p^k) < 0$ holds, where $p^k$ and $p^{k+1}$ are non-negative solutions of the unbounded problems (4) for $J := J^k$ and $J := J^{k+1}$ in the steps $k$ and $k+1$, respectively. Namely, rewriting this inequality we obtain (see the optimality conditions)

[...]

i.e. at least one of the coordinates $p^{k+1}_j$, $j \in J^k_-$, is positive. By similar arguments it can be shown that not all indices added in substep (a) to the index set $J^k$ can be dropped in the next substep (b). Of course, the values $f(p^k)$ corresponding to the iteration points $p^k$ do not necessarily decrease. A sequence $p^k$ with decreasing values $f(p^k)$ is obtained by the following modified algorithm, where the substep (a) is replaced by the substep
(a'): Solve the unbounded problem (4) where $J := J^k$. If the resulting optimal point $p^k$ has only non-negative coordinates, then go to (b). If the set [...]

[...] $\forall x \in \Gamma$. Then there exist real numbers $\mu_i \ge 0$ ($i = 1,\dots,n$) with $\sum_i \mu_i = 1$ such that $\sum_i \mu_i f_i(x) \ge 0\ \ \forall x \in \Gamma$.

The proof follows immediately from the separation theorem in $\mathbb{R}^n$: By assumption, $0 \in \mathbb{R}^n$ is not an interior point of the convex set $D := \{\, z \in \mathbb{R}^n \mid z_i > f_i(x)\ (i = 1,\dots,n),\ x \in \Gamma \,\}$. Hence there exists a linear functional $\mu \ne 0$ on $\mathbb{R}^n$ such that $\langle \mu, z \rangle \ge 0\ \ \forall z \in D$. Because of $D + \mathbb{R}^n_+ \subset D$ it follows that $\mu \ge 0$. One normalizes to $\sum_i \mu_i = 1$ and chooses as $z \in \bar D$ the points $z = (f_1(x),\dots,f_n(x))$ with $x \in \Gamma$. This yields the assertion of the lemma. □

The Browder-Minty theorem is found in the literature in several variants. Here we settle on the following set-valued version:

Theorem 1 (Browder-Minty). Let $E$ be a reflexive real Banach space. Let $E^*$, its topological dual space, be equipped with the weak* topology. Let $B \subset E$ be nonempty, closed and convex. Let $\Phi : B \to E^*$ be a set-valued mapping with $\Phi(u)$ [...] $\forall u \in B$, which satisfies the following conditions:

(i) $\Phi$ is monotone, i.e. $\langle u^* - v^*, u - v \rangle \ge 0\ \ \forall u, v \in B$, $\forall u^* \in$ [...]
Vu*£ ^ h(5)] f
at point
x.
From the convex analysis [2] it is well-known that under the given assumptions
3f (x) t 0
and
h(x) - max ( : x*e Df(x.x) , then D ° f ( x , x ) i s denoted by
i s always the u . c . a . I t s corresponding s u b d i f f e r e n t i a l d ° f ( x ) . Here i t i s elementary to oheok that D°f-(x,x) - D°f(x,-x)
and therefore 5 ° f ~ ( x ) » - £ ° f ( x ) . For a r b i t r a r y s u b d i f f e r e n t i a l s t h i s r e l a t i o n i s not v a l i d . A olass of functions which admit upper convex approximations i s very broad. Uethods of oaloulation of u . o . a . are given i n considerable d e t a i l i n [ 2 ] . Note also that q u a s i d i f f e r e n t i a l functions studied in [3] also have u . c . a . which cure e a s i l y calculated by t h e i r sub- and superdifferentials [3]. For i l l u s t r a t i o n we give the following theorem which i s e a s i l y proved. Theorem A.
Let
f
be a continuous oonvex function. Then
Df(x,x) = f ' ( x , x ) = lim MO
-
W
f
i s the u . c . a . , and the corresponding s u b d i f f e r e n t i a l dt coincides with the s u b d i f f e r e n t i a l 9°t and with the usual s u b d i f f e r e n t i a l of the convex function. At the same time Df-(x,x) - - f ' ( x , x ) £
f o r any element x * e d f ( x ) and therefore of f~ = - f at any ohoioe x * € 5 f ( x ) .
-x*
i s the s u b d i f f e r e n t i a l
This theorem shows a distinguishing f e a t u r e . The Clarke s u b d i f f e r e n t i a l d°£~ i s defined uniquely and equals -