SPRINGER BRIEFS IN OPTIMIZATION
Alexander J. Zaslavski
The Projected Subgradient Algorithm in Convex Optimization
SpringerBriefs in Optimization

Series Editors
Sergiy Butenko, Department of Industrial and Systems Engineering, Texas A&M University, College Station, TX, USA
Mirjam Dür, Department of Mathematics, University of Trier, Trier, Germany
Panos M. Pardalos, ISE Department, University of Florida, Gainesville, FL, USA
János D. Pintér, Lehigh University, Bethlehem, PA, USA
Stephen M. Robinson, University of Wisconsin-Madison, Madison, WI, USA
Tamás Terlaky, Lehigh University, Bethlehem, PA, USA
My T. Thai, CISE Department, University of Florida, Gainesville, FL, USA
SpringerBriefs present concise summaries of cutting-edge research and practical applications across a wide spectrum of fields. Featuring compact volumes of 50 to 125 pages, the series covers a range of content from professional to academic. Briefs are characterized by fast, global electronic dissemination, standard publishing contracts, standardized manuscript preparation and formatting guidelines, and expedited production schedules. Typical topics might include:

• A timely report of state-of-the-art techniques
• A bridge between new research results, as published in journal articles, and a contextual literature review
• A snapshot of a hot or emerging topic
• An in-depth case study
• A presentation of core concepts that students must understand in order to make independent contributions

SpringerBriefs in Optimization showcase algorithmic and theoretical techniques, case studies, and applications within the broad-based field of optimization. Manuscripts related to the ever-growing applications of optimization in applied mathematics, engineering, medicine, economics, and other applied sciences are encouraged.
More information about this series at http://www.springer.com/series/8918
Alexander J. Zaslavski Department of Mathematics Technion – Israel Institute of Technology Haifa, Israel
ISSN 2190-8354 ISSN 2191-575X (electronic)
SpringerBriefs in Optimization
ISBN 978-3-030-60299-4 ISBN 978-3-030-60300-7 (eBook)
https://doi.org/10.1007/978-3-030-60300-7
Mathematics Subject Classification: 49M37, 65K05, 90C25, 90C26, 90C30

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Contents

1 Introduction
   1.1 Subgradient Projection Method
2 Nonsmooth Convex Optimization
   2.1 Preliminaries
   2.2 Approximate Solutions
   2.3 Convergence to the Solution Set
   2.4 Superiorization
   2.5 Auxiliary Results for Theorems 2.1–2.10
   2.6 Proof of Theorem 2.1
   2.7 Proof of Theorem 2.2
   2.8 Proof of Theorem 2.3
   2.9 Proof of Theorem 2.5
   2.10 Proof of Theorem 2.8
   2.11 Proof of Theorem 2.10
   2.12 An Auxiliary Result for Theorems 2.11–2.15
   2.13 Proof of Theorem 2.11
   2.14 Proof of Theorem 2.12
   2.15 Proof of Theorem 2.13
   2.16 Proof of Theorem 2.14
   2.17 Proof of Theorem 2.15
   2.18 Proof of Theorem 2.16
   2.19 Proof of Theorem 2.17
   2.20 Proof of Theorem 2.18
3 Extensions
   3.1 Optimization Problems on Bounded Sets
   3.2 An Auxiliary Result for Theorem 3.2
   3.3 An Auxiliary Result for Theorem 3.3
   3.4 Proof of Theorem 3.2
   3.5 Proof of Theorem 3.3
   3.6 Optimization on Unbounded Sets
   3.7 Auxiliary Results
   3.8 Proof of Theorem 3.6
   3.9 Proof of Theorem 3.7
4 Zero-Sum Games with Two Players
   4.1 Preliminaries and an Auxiliary Result
   4.2 Zero-Sum Games on Bounded Sets
5 Quasiconvex Optimization
   5.1 Preliminaries
   5.2 The Main Lemma
   5.3 Optimization on Bounded Sets
   5.4 Optimization on Unbounded Sets
References
Chapter 1
Introduction
In this book we study the behavior of subgradient algorithms for constrained minimization problems in a Hilbert space. Our goal is to obtain a good approximate solution of the problem in the presence of computational errors. It is known that the algorithm generates a good approximate solution if the sequence of computational errors is bounded from above by a small constant. In our study, presented in this book, we take into consideration the fact that each iteration of every algorithm consists of several steps and that, in general, the computational errors for different steps are different. In this section we discuss several algorithms which are studied in the book.
1.1 Subgradient Projection Method

In this book we study the subgradient projection algorithm for the minimization of convex nonsmooth functions and for computing the saddle points of convex-concave functions in the presence of computational errors. It should be mentioned that the subgradient projection algorithm is one of the most important tools in optimization theory, nonlinear analysis, and their applications. See, for example, [1–3, 7, 10–12, 16, 17, 26, 28–30, 33–35, 37, 43–47, 50–56, 58–60, 64–68, 71–75, 77, 78]. The problem is described by an objective function and a set of feasible points. For this algorithm, each iteration consists of two steps. The first step is the calculation of a subgradient of the objective function, while in the second one we calculate a projection on the feasible set. In each of these two steps there is a computational error, and in general these two computational errors are different. In our recent research [77], we showed that the algorithm generates a good approximate solution if all the computational errors are bounded from above by a small positive constant. Moreover, if we know the computational errors for the two steps of the algorithm, we can determine what approximate solution can be obtained
and how many iterates one needs for this. In this book we generalize all these results for an extension of the projected subgradient method in which, instead of the projection onto the feasible set, a quasi-nonexpansive retraction onto this set is used.

We study the subgradient algorithm for constrained minimization problems in Hilbert spaces equipped with an inner product $\langle \cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$, and we use the following notation. For every $z \in R^1$, denote by $\lfloor z \rfloor$ the largest integer which does not exceed $z$:
$$\lfloor z \rfloor = \max\{i \in R^1 : i \text{ is an integer and } i \le z\}.$$
For every nonempty set $D$, every function $f : D \to R^1$, and every nonempty set $C \subset D$, we set
$$\inf(f, C) = \inf\{f(x) : x \in C\}.$$
Let $X$ be a Hilbert space equipped with an inner product $\langle \cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. For each $x \in X$ and each $r > 0$, set
$$B_X(x, r) = \{y \in X : \|x - y\| \le r\},$$
and for each $x \in X$ and each nonempty set $E \subset X$, set
$$d(x, E) = \inf\{\|x - y\| : y \in E\}.$$
For each nonempty open convex set $U \subset X$, each convex function $f : U \to R^1$, and every $x \in U$, set
$$\partial f(x) = \{l \in X : f(y) - f(x) \ge \langle l, y - x\rangle \text{ for all } y \in U\},$$
which is called the subdifferential of the function $f$ at the point $x$ [48, 49, 61]. Let $C$ be a nonempty closed convex subset of $X$ and let $f : X \to R^1$ be a convex function. Suppose that there exist $L > 0$, $M_0 > 0$ such that
$$C \subset B_X(0, M_0),$$
$$|f(x) - f(y)| \le L\|x - y\| \text{ for all } x, y \in B_X(0, M_0 + 2).$$
It is not difficult to see that for each $x \in B_X(0, M_0 + 1)$,
$$\emptyset \ne \partial f(x) \subset B_X(0, L).$$
It is well known that for every nonempty closed convex set $D \subset X$ and every $x \in X$, there is a unique point $P_D(x) \in D$ satisfying
$$\|x - P_D(x)\| = \inf\{\|x - y\| : y \in D\}.$$
We consider the minimization problem
$$f(z) \to \min, \quad z \in C.$$
Suppose that $\{\alpha_k\}_{k=0}^{\infty} \subset (0, \infty)$. Let us describe our algorithm.

Subgradient Projection Algorithm
Initialization: select an arbitrary $x_0 \in B_X(0, M_0 + 1)$.
Iterative step: given a current iteration vector $x_t \in U$, calculate $\xi_t \in \partial f(x_t)$ and the next iteration vector $x_{t+1} = P_C(x_t - \alpha_t \xi_t)$.

In [75] we study this algorithm in the presence of computational errors. Namely, in [75] we suppose that $\delta \in (0, 1]$ is a computational error produced by our computer system and study the following algorithm.

Subgradient Projection Algorithm with Computational Errors
Initialization: select an arbitrary $x_0 \in B_X(0, M_0 + 1)$.
Iterative step: given a current iteration vector $x_t \in B_X(0, M_0 + 1)$, calculate $\xi_t \in \partial f(x_t) + B_X(0, \delta)$ and the next iteration vector $x_{t+1} \in U$ such that
$$\|x_{t+1} - P_C(x_t - \alpha_t \xi_t)\| \le \delta.$$

In [77] we consider a more complicated, but more realistic, version of this algorithm. Clearly, for the algorithm each iteration consists of two steps. The first step is the calculation of a subgradient of the objective function $f$, while in the second one we calculate a projection on the set $C$. In each of these two steps there is a computational error produced by our computer system. In general, these two computational errors are different. This fact is taken into account in the following projection algorithm studied in Chapter 2 of [77].

Suppose that $\{\alpha_k\}_{k=0}^{\infty} \subset (0, \infty)$ and $\delta_f, \delta_C \in (0, 1]$.
Initialization: select an arbitrary $x_0 \in B_X(0, M_0 + 1)$.
Iterative step: given a current iteration vector $x_t \in B_X(0, M_0 + 1)$, calculate $\xi_t \in \partial f(x_t) + B_X(0, \delta_f)$ and the next iteration vector $x_{t+1} \in U$ such that
$$\|x_{t+1} - P_C(x_t - \alpha_t \xi_t)\| \le \delta_C.$$

Note that in practice, for some problems, the set $C$ is simple but the function $f$ is complicated. In this case, $\delta_C$ is essentially smaller than $\delta_f$. On the other hand, there are cases when $f$ is simple but the set $C$ is complicated, and therefore $\delta_f$ is much smaller than $\delta_C$. In our analysis of the behavior of the algorithm in [75, 77], properties of the projection operator $P_C$ play an important role. In the present book, we obtain generalizations of the results of [75, 77] for subgradient methods in the case when the set $C$ is not necessarily convex and the projection operator $P_C$ is replaced by a mapping $P : X \to C$ which satisfies
$$Px = x \text{ for all } x \in C, \tag{1.1}$$
$$\|Px - z\| \le \|x - z\| \text{ for all } z \in C \text{ and all } x \in X. \tag{1.2}$$
In other words, $P$ is a quasi-nonexpansive retraction onto $C$. Note that there are many mappings $P : X \to C$ satisfying (1.1) and (1.2). Indeed, in [57] we consider a space of mappings $P : X \to X$ satisfying (1.1) and (1.2), equipped with a natural complete metric, and show that for a generic (typical) mapping from this space, its powers converge to a mapping which also satisfies (1.1) and (1.2) and whose image is $C$.

Note that the generalizations considered in this book have, besides their obvious mathematical interest, also a significant practical meaning. Usually, the projection operator $P_C : X \to C$ can be calculated when $C$ is a simple set such as a linear subspace, a half-space, or a simplex. In practice, $C$ is an intersection of simple sets $C_i$, $i = 1, \dots, q$, where $q$ is a large natural number. The calculation of $P_C$ is not possible in principle. Instead, it is possible to calculate the product $P_{C_q} \cdots P_{C_1}$ and its powers $(P_{C_q} \cdots P_{C_1})^m$, $m = 1, 2, \dots$. It is well known [76] that, under certain regularity conditions on $C_i$, $i = 1, \dots, q$, the powers $(P_{C_q} \cdots P_{C_1})^m$ converge as $m \to \infty$ to a mapping $P : X \to C$ which satisfies (1.1) and (1.2). Thus, in practice we cannot calculate the projection operator $P_C$ but only a mapping $P : X \to C$ satisfying (1.1) and (1.2) [4, 5, 8, 9, 21, 24, 27, 31, 32, 36, 52, 62, 69] or, more exactly, its approximations. This shows that the results of this book are indeed important from the practical point of view.

In Chapter 2, we study the subgradient projection algorithm presented above for convex minimization problems with objective functions defined on the whole Hilbert space. In Chapter 3, we generalize some results of Chapter 2 for the case of problems with objective functions defined on subsets of the Hilbert space. In Chapter 4, we study the subgradient projection algorithm for zero-sum games with two players. In Chapter 5, we study the projected subgradient method for quasiconvex optimization problems.
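As a concrete illustration of the two-error scheme described above, here is a minimal numerical sketch in $R^2$ (an illustration only, not from the book): the objective $f(x) = \|x\|_1$, the feasible set $C$ is the closed unit ball, and all error magnitudes and the step-size are arbitrary illustrative choices.

```python
import numpy as np

def proj_ball(x, r=1.0):
    # Exact projection onto the Euclidean ball B(0, r).
    n = np.linalg.norm(x)
    return x if n <= r else (r / n) * x

def subgradient(x):
    # A subgradient of f(x) = ||x||_1 (the sign vector; 0 at a zero
    # coordinate is a valid choice).
    return np.sign(x)

def projected_subgradient(x0, alpha, n_iter, delta_f=0.0, delta_C=0.0, seed=0):
    # Iterate x_{t+1} ~ P_C(x_t - alpha * xi_t), where xi_t is a subgradient
    # perturbed by a vector of norm <= delta_f (the subgradient-step error)
    # and the projection result is perturbed by a vector of norm <= delta_C
    # (the projection-step error).  Returns the best objective value seen.
    rng = np.random.default_rng(seed)
    x, best = np.array(x0, dtype=float), np.inf
    for _ in range(n_iter):
        noise_f = rng.normal(size=x.shape)
        noise_f *= delta_f / max(np.linalg.norm(noise_f), 1e-12)
        xi = subgradient(x) + noise_f
        noise_C = rng.normal(size=x.shape)
        noise_C *= delta_C / max(np.linalg.norm(noise_C), 1e-12)
        x = proj_ball(x - alpha * xi) + noise_C
        best = min(best, np.sum(np.abs(x)))
    return best
```

Since the minimizer $0$ of $f$ lies in $C$, the best value found approaches $\inf(f, C) = 0$ up to an error controlled by the step-size and by the two error levels, in line with the qualitative picture above.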
Chapter 2
Nonsmooth Convex Optimization
In this chapter, we study an extension of the projected subgradient method for the minimization of convex nonsmooth functions in the presence of computational errors. The problem is described by an objective function and a set of feasible points. For this algorithm, each iteration consists of two steps. The first step is the calculation of a subgradient of the objective function, while in the second one we calculate a projection on the feasible set. In each of these two steps there is a computational error, and in general these two computational errors are different. In our recent research [77], we showed that the algorithm generates a good approximate solution if all the computational errors are bounded from above by a small positive constant. Moreover, if we know the computational errors for the two steps of the algorithm, we can determine what approximate solution can be obtained and how many iterates one needs for this. In this chapter, we generalize all these results for an extension of the projected subgradient method in which, instead of the projection onto the feasible set, a quasi-nonexpansive retraction onto this set is used.
2.1 Preliminaries

Let $(X, \langle \cdot,\cdot\rangle)$ be a Hilbert space with an inner product $\langle \cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. For each $x \in X$ and each nonempty set $A \subset X$, set
$$d(x, A) = \inf\{\|x - y\| : y \in A\}.$$
For each $x \in X$ and each $r > 0$, set
$$B_X(x, r) = \{y \in X : \|x - y\| \le r\}.$$
Assume that $f : X \to R^1$ is a convex continuous function which is Lipschitz on all bounded subsets of $X$. For each point $x \in X$ and each positive number $\epsilon$, let
$$\partial f(x) = \{l \in X : f(y) - f(x) \ge \langle l, y - x\rangle \text{ for all } y \in X\} \tag{2.1}$$
be the subdifferential of $f$ at $x$ [48, 49], and let
$$\partial_{\epsilon} f(x) = \{l \in X : f(y) - f(x) \ge \langle l, y - x\rangle - \epsilon \text{ for all } y \in X\} \tag{2.2}$$
be the $\epsilon$-subdifferential of $f$ at $x$.

Let $C$ be a closed nonempty subset of the space $X$. Assume that
$$\lim_{\|x\| \to \infty} f(x) = \infty. \tag{2.3}$$
This means that for each $M_0 > 0$ there exists $M_1 > 0$ such that if a point $x \in X$ satisfies the inequality $\|x\| \ge M_1$, then $f(x) > M_0$.

In this chapter, we consider the optimization problem
$$f(x) \to \min, \quad x \in C.$$
Define
$$\inf(f, C) = \inf\{f(z) : z \in C\}. \tag{2.4}$$
Since the function $f$ is Lipschitz on all bounded subsets of the space $X$, it follows from (2.3) that $\inf(f, C)$ is finite. Set
$$C_{\min} = \{x \in C : f(x) = \inf(f, C)\}. \tag{2.5}$$
It is well known that if the set $C$ is convex, then the set $C_{\min}$ is nonempty. Clearly, $C_{\min} \ne \emptyset$ if the space $X$ is finite-dimensional. We assume that
$$C_{\min} \ne \emptyset. \tag{2.6}$$
It is clear that $C_{\min}$ is a closed subset of $X$. Define
$$X_0 = \{x \in X : f(x) \le \inf(f, C) + 4\}. \tag{2.7}$$
In view of (2.3), there exists a number $\bar{K} > 1$ such that
$$X_0 \subset B_X(0, \bar{K}). \tag{2.8}$$
Since the function $f$ is Lipschitz on all bounded subsets of the space $X$, there exists a number $\bar{L} > 1$ such that
$$|f(z_1) - f(z_2)| \le \bar{L}\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, \bar{K} + 4). \tag{2.9}$$
Denote by $\mathcal{M}$ the set of all mappings $P : X \to X$ such that for all $x \in C$ and all $y \in X$,
$$\|x - Py\| \le \|x - y\|, \tag{2.10}$$
$$Px = x \text{ for all } x \in C. \tag{2.11}$$
For every $P \in \mathcal{M}$, set $P^0 x = x$, $x \in X$.
2.2 Approximate Solutions

In the following results, our goal is to obtain an approximate solution $x$ which is close to the set $C$ and such that $f(x)$ is close to $\inf(f, C)$. In the first theorem, the set $C$ is bounded, the computational errors $\delta_f, \delta_C$ are given, and the step-size $\alpha$ depends on $\delta_f, \delta_C$. It is proved in Section 2.6.

Theorem 2.1 Assume that $K_1 \ge \bar{K} + 1$, $L_1 \ge \bar{L}$, $\delta_f, \delta_C \in (0, 1]$,
$$C \subset B_X(0, K_1), \tag{2.12}$$
$$|f(z_1) - f(z_2)| \le L_1\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, K_1 + 2), \tag{2.13}$$
$$\delta_f(\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L}) \le 1, \quad \delta_C(\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L}) \le 1, \tag{2.14}$$
$$\epsilon = \max\{4\delta_f(\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L}),\ 40(L_1 + \bar{L})(\delta_C(\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L}))^{1/2}\} \tag{2.15}$$
and that
$$n = \lfloor 800(1 + K_1 + \bar{K})^2(L_1 + \bar{L})^2\epsilon^{-2} \rfloor + 2. \tag{2.16}$$
Let $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$ satisfy
$$P_i(X) = C, \quad i = 0, \dots, n - 1, \tag{2.17}$$
$$\alpha = 10^{-2}(L_1 + \bar{L})^{-2}\epsilon, \tag{2.18}$$
$\{x_i\}_{i=0}^{n} \subset X$, $\{\xi_i\}_{i=1}^{n-1} \subset X$,
$$\|x_0\| \le K_1, \quad \|x_1 - P_0 x_0\| \le \delta_C \tag{2.19}$$
and that for $i = 1, \dots, n - 1$,
$$B_X(\xi_i, \delta_f) \cap \partial_{\epsilon/4} f(x_i) \ne \emptyset, \tag{2.20}$$
$$\|x_{i+1} - P_i(x_i - \alpha\xi_i)\| \le \delta_C. \tag{2.21}$$
Then there exists $j \in \{1, \dots, n\}$ such that $f(x_j) \le \inf(f, C) + \epsilon$.

In the second theorem, the set $C$ is not necessarily bounded, the computational errors $\delta_f, \delta_C$ are given, and the step-size $\alpha$ depends on $\delta_f, \delta_C$. It is proved in Section 2.7.

Theorem 2.2 Assume that $K_1 \ge \bar{K} + 1$, $L_1 \ge \bar{L}$, $\delta_f, \delta_C \in (0, 1]$,
$$|f(z_1) - f(z_2)| \le L_1\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, 3K_1 + 2), \tag{2.22}$$
$$\delta_f(\bar{K} + 3K_1 + 2 + 5L_1 + 5\bar{L}) \le 1, \tag{2.23}$$
$$\delta_C(\bar{K} + 3K_1 + 2 + 5L_1 + 5\bar{L}) \le (10(\bar{L} + L_1))^{-2}, \tag{2.24}$$
$$\epsilon = \max\{4\delta_f(\bar{K} + 3K_1 + 2 + 5L_1 + 5\bar{L}),\ 40(L_1 + \bar{L})(\delta_C(\bar{K} + 3K_1 + 2 + 5L_1 + 5\bar{L}))^{1/2}\} \tag{2.25}$$
and that
$$n = \lfloor 800(1 + K_1 + \bar{K})^2(L_1 + \bar{L})^2\epsilon^{-2} \rfloor + 2. \tag{2.26}$$
Let $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$ satisfy
$$P_i(X) = C, \quad i = 0, \dots, n - 1, \tag{2.27}$$
$$\alpha = 10^{-2}(L_1 + \bar{L})^{-2}\epsilon, \tag{2.28}$$
$\{x_i\}_{i=0}^{n} \subset X$, $\{\xi_i\}_{i=1}^{n-1} \subset X$,
$$\|x_0\| \le K_1, \tag{2.29}$$
$$\|x_1 - P_0 x_0\| \le \delta_C \tag{2.30}$$
and that for all $i = 1, \dots, n - 1$,
$$B_X(\xi_i, \delta_f) \cap \partial_{\epsilon/4} f(x_i) \ne \emptyset, \tag{2.31}$$
$$\|x_{i+1} - P_i(x_i - \alpha\xi_i)\| \le \delta_C. \tag{2.32}$$
Then there exists $j \in \{1, \dots, n\}$ such that $f(x_j) \le \inf(f, C) + \epsilon$.

In the third theorem, the set $C$ is bounded, the computational errors $\delta_f, \delta_C$ and $\epsilon$ are given, and the step-sizes $\alpha_i$, $i = 1, \dots, n - 1$, are given too. It is proved in Section 2.8.

Theorem 2.3 Assume that $K_1 \ge \bar{K} + 1$, $L_1 \ge \bar{L}$, $\delta_f, \delta_C \in (0, 1]$,
$$C \subset B_X(0, K_1), \tag{2.33}$$
$$|f(z_1) - f(z_2)| \le L_1\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, K_1 + 2), \tag{2.34}$$
$$0 < \epsilon \le 16(\bar{L} + L_1 + 1) \tag{2.35}$$
and that
$$\delta_f(\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L}) \le \epsilon/8. \tag{2.36}$$
Let $n \ge 2$ be an integer, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$ satisfy
$$P_i(X) = C, \quad i = 0, \dots, n - 1, \tag{2.37}$$
$\alpha_i \in (0, 1]$, $i = 1, \dots, n - 1$, $\{x_i\}_{i=0}^{n} \subset X$, $\{\xi_i\}_{i=0}^{n-1} \subset X$,
$$\|x_0\| \le K_1, \tag{2.38}$$
$$\|x_1 - P_0 x_0\| \le \delta_C \tag{2.39}$$
and that for all $i = 1, \dots, n - 1$,
$$B_X(\xi_i, \delta_f) \cap \partial_{\epsilon/4} f(x_i) \ne \emptyset, \tag{2.40}$$
$$\|x_{i+1} - P_i(x_i - \alpha_i\xi_i)\| \le \delta_C. \tag{2.41}$$
Then
$$\min\{f(x_i) : i = 1, \dots, n\} - \inf(f, C),\quad f\Big(\Big(\sum_{j=1}^{n}\alpha_j\Big)^{-1}\sum_{i=1}^{n}\alpha_i x_i\Big) - \inf(f, C)$$
$$\le \Big(\sum_{j=1}^{n}\alpha_j\Big)^{-1}\sum_{i=1}^{n}\alpha_i\big(f(x_i) - \inf(f, C)\big)$$
$$\le \Big(\sum_{i=1}^{n}\alpha_i\Big)^{-1}(2K_1 + 1)^2 + 3\epsilon/4 + 25(L_1 + \bar{L})^2\Big(\sum_{i=1}^{n}\alpha_i^2\Big)\Big(\sum_{i=1}^{n}\alpha_i\Big)^{-1}$$
$$+ 2\Big(\sum_{i=1}^{n}\alpha_i\Big)^{-1}\delta_C(\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2)n.$$

Let $n \ge 2$ be an integer and $A > 0$ be given. We are interested in an optimal choice of the step-sizes $\alpha_i$, $i = 1, \dots, n$, satisfying $\sum_{i=1}^{n}\alpha_i = A$ which minimizes the right-hand side of the final inequality in the statement of Theorem 2.3. In order to meet this goal, we need to minimize the function
$$\phi(\alpha_1, \dots, \alpha_n) = \sum_{i=1}^{n}\alpha_i^2$$
on the set $\{(\alpha_1, \dots, \alpha_n) \in R^n : \alpha_i \ge 0,\ i = 1, \dots, n,\ \sum_{i=1}^{n}\alpha_i = A\}$. By Lemma 2.3 of [75], the minimizer of $\phi$ is $\alpha_i = n^{-1}A$, $i = 1, \dots, n$. Theorem 2.3 implies the following result.

Theorem 2.4 Assume that $K_1 \ge \bar{K} + 1$, $L_1 \ge \bar{L}$, $\delta_f, \delta_C \in (0, 1]$,
$$C \subset B_X(0, K_1),$$
$$|f(z_1) - f(z_2)| \le L_1\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, K_1 + 2),$$
$$0 < \epsilon \le 16(\bar{L} + L_1 + 1)$$
and that
$$\delta_f(\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L}) \le \epsilon/8.$$
Let $n \ge 2$ be an integer, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$ satisfy
$$P_i(X) = C, \quad i = 0, \dots, n - 1,$$
$\alpha \in (0, 1]$, $\{x_i\}_{i=0}^{n} \subset X$, $\{\xi_i\}_{i=0}^{n-1} \subset X$, $\|x_0\| \le K_1$, $\|x_1 - P_0 x_0\| \le \delta_C$ and that for all $i = 1, \dots, n - 1$,
$$B_X(\xi_i, \delta_f) \cap \partial_{\epsilon/4} f(x_i) \ne \emptyset,$$
$$\|x_{i+1} - P_i(x_i - \alpha\xi_i)\| \le \delta_C.$$
Then
$$\min\{f(x_i) : i = 1, \dots, n\} - \inf(f, C),\quad f\Big(n^{-1}\sum_{i=1}^{n}x_i\Big) - \inf(f, C)$$
$$\le n^{-1}\sum_{i=1}^{n}f(x_i) - \inf(f, C)$$
$$\le (n\alpha)^{-1}(2K_1 + 1)^2 + 3\epsilon/4 + 25(L_1 + \bar{L})^2\alpha + 2\alpha^{-1}\delta_C(\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2).$$

Now we can make the best choice of the step-size $\alpha$ in Theorem 2.4. Since $n$ can be arbitrarily large in view of Theorem 2.4, we need to minimize the function
$$25(L_1 + \bar{L})^2\alpha + 2\alpha^{-1}\delta_C(\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2), \quad \alpha > 0,$$
which has the minimizer
$$\alpha = 5^{-1}(L_1 + \bar{L})^{-1}(2(\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2))^{1/2}\delta_C^{1/2}.$$
With this choice of $\alpha$, the right-hand side of the last inequality in the statement of Theorem 2.4 is
$$n^{-1}(2K_1 + 1)^2 \cdot 5(L_1 + \bar{L})(2(\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2))^{-1/2}\delta_C^{-1/2}$$
$$+ 3\epsilon/4 + 10(L_1 + \bar{L})(2(\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2))^{1/2}\delta_C^{1/2}.$$
Now we should make the best choice of $n$. It is clear that $n$ should be of the same order as $\delta_C^{-1}$. In this case, the right-hand side of the last inequality in Theorem 2.4 does not exceed $c_1\delta_C^{1/2} + 3\epsilon/4$. Clearly, $\epsilon$ depends on $\delta_f$. In particular, we can choose $\epsilon = 8\delta_f(\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2)$.

In the next theorem, the set $C$ is not necessarily bounded, and the computational errors $\delta_f, \delta_C$ and the step-size $\alpha$ are given. It is proved in Section 2.9.

Theorem 2.5 Assume that $K_1 \ge \bar{K} + 2$, $L_1 \ge \bar{L}$, $\delta_f, \delta_C \in (0, 1]$, $\epsilon \in (0, 4]$,
$$|f(z_1) - f(z_2)| \le L_1\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, 3K_1 + 1), \tag{2.42}$$
$$\alpha \in (0, 25^{-1}(L_1 + \bar{L})^{-2}\epsilon], \tag{2.43}$$
$$8\delta_f(3\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L}) \le \epsilon \tag{2.44}$$
and that
$$\delta_C(3\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L}) \le \epsilon\alpha. \tag{2.45}$$
Let $n \ge 2$ be an integer, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$ satisfy
$$P_i(X) = C, \quad i = 0, \dots, n - 1, \tag{2.46}$$
$\{x_i\}_{i=0}^{n} \subset X$, $\{\xi_i\}_{i=0}^{n-1} \subset X$,
$$\|x_0\| \le K_1, \tag{2.47}$$
$$\|x_1 - P_0 x_0\| \le \delta_C \tag{2.48}$$
and that for all $i = 1, \dots, n - 1$,
$$B_X(\xi_i, \delta_f) \cap \partial_{\epsilon/4} f(x_i) \ne \emptyset, \tag{2.49}$$
$$\|x_{i+1} - P_i(x_i - \alpha\xi_i)\| \le \delta_C. \tag{2.50}$$
Then $\|x_i\| \le 2\bar{K} + K_1 + 1$, $i = 1, \dots, n$, and
$$\min\{f(x_i) : i = 1, \dots, n\} - \inf(f, C),\quad f\Big(n^{-1}\sum_{i=1}^{n}x_i\Big) - \inf(f, C)$$
$$\le n^{-1}\sum_{i=1}^{n}f(x_i) - \inf(f, C)$$
$$\le (2n\alpha)^{-1}(K_1 + 1 + \bar{K})^2 + \epsilon/2 + 15(L_1 + \bar{L})^2\alpha + \alpha^{-1}\delta_C(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2).$$

Now we can make the best choice of the step-size $\alpha$ in Theorem 2.5. Since $n$ can be arbitrarily large in view of Theorem 2.5, we need to minimize the function
$$15(L_1 + \bar{L})^2\alpha + \alpha^{-1}\delta_C(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2), \quad \alpha > 0,$$
which has the minimizer
$$\alpha = (L_1 + \bar{L})^{-1}(15^{-1}(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2))^{1/2}\delta_C^{1/2}.$$
Since $\alpha$ should satisfy (2.45), we obtain an additional condition on $\delta_C$:
$$(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2)\delta_C \le 15 \cdot 25^{-2}(L_1 + \bar{L})^{-2}\epsilon^2.$$
Together with the relations above, Theorem 2.5 implies the following result.

Theorem 2.6 Assume that $K_1 \ge \bar{K} + 2$, $L_1 \ge \bar{L}$, $\delta_f, \delta_C \in (0, 1]$, $\epsilon \in (0, 4]$,
$$|f(z_1) - f(z_2)| \le L_1\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, 3K_1 + 1),$$
$$\delta_f \le 8^{-1}\epsilon(3\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L})^{-1},$$
$$\delta_C \le 25^{-2} \cdot 15(L_1 + \bar{L})^{-2}(3\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L})^{-1}\epsilon^2$$
and that
$$\alpha = (L_1 + \bar{L})^{-1}(15^{-1}(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2))^{1/2}\delta_C^{1/2}.$$
Let $n \ge 2$ be an integer, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$ satisfy $P_i(X) = C$, $i = 0, \dots, n - 1$, $\{x_i\}_{i=0}^{n} \subset X$, $\{\xi_i\}_{i=0}^{n-1} \subset X$, $\|x_0\| \le K_1$, $\|x_1 - P_0 x_0\| \le \delta_C$ and that for all $i = 1, \dots, n - 1$,
$$B_X(\xi_i, \delta_f) \cap \partial_{\epsilon/4} f(x_i) \ne \emptyset,$$
$$\|x_{i+1} - P_i(x_i - \alpha\xi_i)\| \le \delta_C.$$
Then $\|x_i\| \le 2\bar{K} + K_1 + 1$, $i = 1, \dots, n$, and
$$\min\{f(x_i) : i = 1, \dots, n\} - \inf(f, C),\quad f\Big(n^{-1}\sum_{i=1}^{n}x_i\Big) - \inf(f, C)$$
$$\le n^{-1}\sum_{i=1}^{n}f(x_i) - \inf(f, C)$$
$$\le \epsilon/2 + 2^{-1}n^{-1}(K_1 + 1 + \bar{K})^2(L_1 + \bar{L})15^{1/2}(\delta_C(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2))^{-1/2}$$
$$+ 30(L_1 + \bar{L})(15^{-1}(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2)\delta_C)^{1/2}.$$

Now we should make the best choice of $n$ in Theorem 2.6. It is clear that $n$ should be of the same order as $\delta_C^{-1}$. Theorem 2.6 implies the following result.

Theorem 2.7 Assume that $K_1 \ge \bar{K} + 2$, $L_1 \ge \bar{L}$, $\delta_f, \delta_C \in (0, 1]$,
$$|f(z_1) - f(z_2)| \le L_1\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, 3K_1 + 1),$$
$$\delta_f \le 2^{-1}(3\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L})^{-1},$$
$$\epsilon = 8\delta_f(3\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L}),$$
$$\delta_C \le 25^{-2} \cdot 15(L_1 + \bar{L})^{-2}(3\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L})^{-1}\epsilon^2$$
and that
$$\alpha = (L_1 + \bar{L})^{-1}(15^{-1}(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2))^{1/2}\delta_C^{1/2}.$$
Let $n \ge 2$ be an integer, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$ satisfy $P_i(X) = C$, $i = 0, \dots, n - 1$, $\{x_i\}_{i=0}^{n} \subset X$, $\{\xi_i\}_{i=0}^{n-1} \subset X$, $\|x_0\| \le K_1$, $\|x_1 - P_0 x_0\| \le \delta_C$ and that for all $i = 1, \dots, n - 1$,
$$B_X(\xi_i, \delta_f) \cap \partial_{\epsilon/4} f(x_i) \ne \emptyset,$$
$$\|x_{i+1} - P_i(x_i - \alpha\xi_i)\| \le \delta_C.$$
Then
$$\|x_i\| \le 2\bar{K} + K_1 + 1, \quad i = 1, \dots, n,$$
and
$$\min\{f(x_i) : i = 1, \dots, n\} - \inf(f, C),\quad f\Big(n^{-1}\sum_{i=1}^{n}x_i\Big) - \inf(f, C)$$
$$\le n^{-1}\sum_{i=1}^{n}f(x_i) - \inf(f, C)$$
$$\le 4\delta_f(3\bar{K} + K_1 + 2 + 5L_1 + 5\bar{L})$$
$$+ 2^{-1}n^{-1}(K_1 + 1 + \bar{K})^2(L_1 + \bar{L})15^{1/2}(\delta_C(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2))^{-1/2}$$
$$+ 30(L_1 + \bar{L})(15^{-1}(3\bar{K} + K_1 + 5L_1 + 5\bar{L} + 2)\delta_C)^{1/2}.$$
Now we should make the best choice of $n$ in Theorem 2.7. It is clear that $n$ should be of the same order as $\delta_C^{-1}$. In this case, the right-hand side of the last inequality in Theorem 2.7 does not exceed $c_1\delta_f + c_2\delta_C^{1/2}$, where $c_1, c_2 > 0$ are constants.

In the previous theorems, we deal with the projected subgradient method: at every iterative step $i$ of the algorithm, $\xi_i$ is an approximation of a subgradient $v$. In the following theorems, we study the projected normalized subgradient method: at every iterative step $i$ of the algorithm, $\xi_i$ is an approximation of $\|v\|^{-1}v$, where $v$ is a subgradient. In the next result, the set $C$ is bounded, and the computational errors $\delta_f, \delta_C$ and $\epsilon$ are given. It is proved in Section 2.10.

Theorem 2.8 Assume that $K_1 \ge \bar{K} + 1$, $\delta_f, \delta_C \in (0, 1)$,
$$C \subset B_X(0, K_1), \tag{2.51}$$
$$0 < \epsilon \le 16\bar{L}, \tag{2.52}$$
$$8\bar{L}\delta_f(\bar{K} + K_1 + 1) \le \epsilon, \tag{2.53}$$
$$\delta_C(\bar{K} + K_1 + 3) \le \epsilon^2(32\bar{L})^{-2} \tag{2.54}$$
and that
$$n = \lfloor 2(16\bar{L})^2(1 + K_1 + \bar{K})^2\epsilon^{-2} \rfloor + 3. \tag{2.55}$$
Let $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$ satisfy
$$P_i(X) = C, \quad i = 0, \dots, n - 1, \tag{2.56}$$
$\{x_i\}_{i=0}^{n} \subset X$, $\{\xi_i\}_{i=1}^{n-1} \subset X$,
$$\{\alpha_i\}_{i=1}^{n-1} \subset [(32\bar{L})^{-1}\epsilon, (16\bar{L})^{-1}\epsilon], \tag{2.57}$$
$$\|x_0\| \le K_1, \tag{2.58}$$
$$\|x_1 - P_0 x_0\| \le \delta_C \tag{2.59}$$
and that for all $i = 1, \dots, n - 1$,
$$B_X(\xi_i, \delta_f) \cap \{\|v\|^{-1}v : v \in \partial_{\epsilon/4} f(x_i) \setminus \{0\}\} \ne \emptyset, \tag{2.60}$$
$$\|x_{i+1} - P_i(x_i - \alpha_i\xi_i)\| \le \delta_C. \tag{2.61}$$
Then there exists $j \in \{1, \dots, n\}$ such that $f(x_j) \le \inf(f, C) + \epsilon$, and if $i \in \{1, \dots, n\} \setminus \{j\}$, then $f(x_i) > \inf(f, C) + \epsilon$ and
$$\partial_{\epsilon/4} f(x_i) \subset \{v \in X : \|v\| \ge (3/4)\epsilon(K_1 + \bar{K})^{-1}\}.$$
Theorem 2.8 implies the following result.

Theorem 2.9 Assume that $K_1 \ge \bar{K} + 1$, $C \subset B_X(0, K_1)$, $\delta_f, \delta_C \in (0, 1)$,
$$\delta_f(\bar{K} + K_1 + 1) \le 2, \quad \delta_C(\bar{K} + K_1 + 3) \le 4^{-1},$$
$$\epsilon = \max\{8\bar{L}\delta_f(\bar{K} + K_1 + 1),\ 32\bar{L}(\delta_C(\bar{K} + K_1 + 3))^{1/2}\}$$
and that
$$n = \lfloor 2(16\bar{L})^2(K_1 + \bar{K})^2\epsilon^{-2} \rfloor + 2.$$
Let $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$ satisfy $P_i(X) = C$, $i = 0, \dots, n - 1$, $\{x_i\}_{i=0}^{n} \subset X$, $\{\xi_i\}_{i=1}^{n-1} \subset X$,
$$\{\alpha_i\}_{i=1}^{n-1} \subset [(32\bar{L})^{-1}\epsilon, (16\bar{L})^{-1}\epsilon],$$
$$\|x_0\| \le K_1, \quad \|x_1 - P_0 x_0\| \le \delta_C$$
and that for $i = 1, \dots, n - 1$,
$$B_X(\xi_i, \delta_f) \cap \{\|v\|^{-1}v : v \in \partial_{\epsilon/4} f(x_i) \setminus \{0\}\} \ne \emptyset,$$
$$\|x_{i+1} - P_i(x_i - \alpha_i\xi_i)\| \le \delta_C.$$
Then there exists $j \in \{1, \dots, n\}$ such that $f(x_j) \le \inf(f, C) + \epsilon$, and if $i \in \{1, \dots, n\} \setminus \{j\}$, then $f(x_i) > \inf(f, C) + \epsilon$ and
$$\partial_{\epsilon/4} f(x_i) \subset \{v \in X : \|v\| \ge (3/4)\epsilon(K_1 + \bar{K})^{-1}\}.$$

In the next theorem, the set $C$ is not necessarily bounded. It is proved in Section 2.11.

Theorem 2.10 Assume that $K_1 \ge \bar{K} + 2$, $\delta_f, \delta_C \in (0, 1]$,
$$4\bar{L}\delta_f(3\bar{K} + K_1 + 2) \le 1, \tag{2.62}$$
$$(8\bar{L})^2\delta_C(3\bar{K} + K_1 + 4) \le 1, \tag{2.63}$$
$$\epsilon = \max\{8\bar{L}\delta_f(3\bar{K} + K_1 + 2),\ 32\bar{L}(\delta_C(3\bar{K} + K_1 + 4))^{1/2}\} \tag{2.64}$$
and that
18
2 Nonsmooth Convex Optimization
¯ 2 (1 + K1 + K) ¯ 2 −2 + 2. n = 2(16L)
(2.65)
Let {Pi }n−1 i=0 ⊂ M satisfy Pi (X) = C, i = 0, . . . , n − 1,
(2.66)
{xi }ni=0 ⊂ X, {ξi }n−1 i=1 ⊂ X, ¯ −1 ¯ −1 {αi }n−1 i=1 ⊂ [(32L) , (16L) ] ⊂ (0, 1],
(2.67)
x0 ≤ K1 ,
(2.68)
x1 − P0 x0 ≤ δC
(2.69)
and that for i = 1, . . . , n − 1, BX (ξi , δf ) ∩ {v−1 v : v ∈ ∂/4 f (xi ) \ {0}} = ∅, xi+1 − Pi (xi − αi ξi ) ≤ δC .
(2.70) (2.71)
Then there exists j ∈ {1, . . . , n} such that f (xj ) ≤ inf(f, C) + and if i ∈ {1, . . . , n} \ {j }, then f (xi ) > inf(f, C) + and ∂/4 f (xi ) ⊂ {v ∈ X : v ≥ (3/4)(K1 + 3K¯ + 1)−1 }. All the results of this section are new.
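The projected normalized subgradient iteration of Theorems 2.8–2.10 can be sketched numerically. The following is a minimal illustration only, not the algorithm of the book verbatim: it specializes $X$ to $\mathbb{R}^2$, takes $f(x) = \|x\|_1$ and $C$ the closed unit ball (our choices), and uses the exact Euclidean projection and an exact subgradient, i.e. $\delta_C = \delta_f = 0$.

```python
import numpy as np

# Sketch of the projected normalized subgradient method (Theorems 2.8-2.10
# regime), with exact projection and exact subgradients standing in for
# their delta_C- and delta_f-approximations.

def subgradient(x):
    # an element of the subdifferential of the l1 norm
    return np.sign(x) + (x == 0)   # choose +1 on a kink; still a valid choice

def project(x):
    # exact Euclidean projection onto the unit ball C
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def normalized_subgradient_method(x0, alphas):
    x = project(x0)                    # x_1 plays the role of P_0 x_0
    best = x.copy()
    for a in alphas:
        v = subgradient(x)
        xi = v / np.linalg.norm(v)     # xi_i approximates ||v||^{-1} v
        x = project(x - a * xi)
        if np.sum(np.abs(x)) < np.sum(np.abs(best)):
            best = x.copy()
    return best

x_best = normalized_subgradient_method(np.array([5.0, -3.0]),
                                       alphas=[0.5 / (k + 1) for k in range(200)])
print(np.sum(np.abs(x_best)))  # f at the best iterate, near inf(f, C) = 0
```

As in the theorems, only the best iterate is guaranteed to be nearly optimal; the individual iterates oscillate on the scale of the step-size.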
2.3 Convergence to the Solution Set

We use the notation and definitions introduced in Section 2.1 and suppose that all the assumptions made there hold. We continue to study the minimization problem
$$f(x) \to \min,\quad x \in C.$$
In Section 2.2 we obtained an approximate solution $x$ which is close to the set $C$ and such that $f(x)$ is close to $\inf(f,C)$. In this section we obtain an approximate solution $x$ which is close to the set $C_{\min}$.
We also suppose that the following assumption holds.

(A1) For every positive number $\epsilon$ there exists $\delta > 0$ such that if a point $x \in C$ satisfies the inequality $f(x) \le \inf(f,C) + \delta$, then $d(x, C_{\min}) \le \epsilon$.

(It is clear that (A1) holds if the space $X$ is finite-dimensional.) For every number $\epsilon \in (0,\infty)$, let
$$\phi(\epsilon) = \sup\{\delta \in (0,1] :\ \text{if } x \in C \text{ satisfies } f(x) \le \inf(f,C) + \delta,\ \text{then } d(x, C_{\min}) \le \min\{1, \epsilon\}\}. \tag{2.72}$$
In view of (A1), $\phi(\epsilon)$ is well-defined for every positive number $\epsilon$.

In the following result the step-sizes converge to zero and their sum is infinity. It is proved in Section 2.13.

Theorem 2.11 Let $\{\alpha_i\}_{i=0}^{\infty} \subset (0,1]$ satisfy
$$\lim_{i\to\infty}\alpha_i = 0,\quad \sum_{i=1}^{\infty}\alpha_i = \infty \tag{2.73}$$
and let $M, \epsilon > 0$. Then there exist a natural number $n_0$ and $\delta > 0$ such that the following assertion holds.

Assume that an integer $n \ge n_0$, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$,
$$P_i(X) = C,\ i = 0,\dots,n-1, \tag{2.74}$$
$$\{x_i\}_{i=0}^{n} \subset X,\quad \|x_0\| \le M, \tag{2.75}$$
$$v_i \in \partial_\delta f(x_i) \setminus \{0\},\ i = 0,1,\dots,n-1, \tag{2.76}$$
$$\{\eta_i\}_{i=0}^{n-1},\ \{\xi_i\}_{i=0}^{n-1} \subset B_X(0,\delta),$$
and that for $i = 0,\dots,n-1$,
$$x_{i+1} = P_i(x_i - \alpha_i\|v_i\|^{-1}v_i - \alpha_i\xi_i) - \alpha_i\eta_i.$$
Then the inequality $d(x_i, C_{\min}) \le \epsilon$ holds for all integers $i$ satisfying $n_0 \le i \le n$.

In the following theorem the step-sizes are bounded from below by a sufficiently small positive constant. It is proved in Section 2.14.

Theorem 2.12 Let $M, \epsilon > 0$. Then there exists $\beta_0 \in (0,1)$ such that for each $\beta_1 \in (0,\beta_0)$ there exist a natural number $n_0$ and $\delta > 0$ such that the following assertion holds.

Assume that an integer $n \ge n_0$, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$,
$$P_i(X) = C,\ i = 0,\dots,n-1,\quad \{x_i\}_{i=0}^{n} \subset X,\quad \|x_0\| \le M, \tag{2.77}$$
$$v_i \in \partial_\delta f(x_i) \setminus \{0\},\ i = 0,1,\dots,n-1, \tag{2.78}$$
$$\{\alpha_i\}_{i=0}^{n-1} \subset [\beta_1, \beta_0], \tag{2.79}$$
$$\{\eta_i\}_{i=0}^{n-1},\ \{\xi_i\}_{i=0}^{n-1} \subset B_X(0,\delta) \tag{2.80}$$
and that for $i = 0,\dots,n-1$,
$$x_{i+1} = P_i(x_i - \alpha_i\|v_i\|^{-1}v_i - \alpha_i\xi_i) - \eta_i. \tag{2.81}$$
Then the inequality $d(x_i, C_{\min}) \le \epsilon$ holds for all integers $i$ satisfying $n_0 \le i \le n$.

In the previous two theorems we dealt with the projected normalized subgradient method: at every iterative step $i$ of the algorithm, $\xi_i$ is an approximation of $\|v\|^{-1}v$, where $v$ is a subgradient. In the following three theorems we study the projected subgradient method: at every iterative step $i$ of the algorithm, $\xi_i$ is an approximation of a subgradient $v$.

In the next theorem the set $C$ is bounded, the step-sizes converge to zero, and their sum is infinity. It is proved in Section 2.15.

Theorem 2.13 Let $\{\alpha_i\}_{i=0}^{\infty} \subset (0,1]$ satisfy
$$\lim_{i\to\infty}\alpha_i = 0,\quad \sum_{i=1}^{\infty}\alpha_i = \infty, \tag{2.82}$$
$$C \subset B_X(0, M) \tag{2.83}$$
and let $M, \epsilon > 0$. Then there exist a natural number $n_0$ and $\delta > 0$ such that the following assertion holds.

Assume that an integer $n \ge n_0$, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$,
$$P_i(X) = C,\ i = 0,\dots,n-1,\quad \{x_i\}_{i=0}^{n} \subset X,\quad \|x_0\| \le M, \tag{2.84}$$
$\{\xi_i\}_{i=0}^{n-1} \subset X$, and that for all $i = 0,\dots,n-1$,
$$B_X(\xi_i, \delta) \cap \partial_\delta f(x_i) \ne \emptyset, \tag{2.85}$$
$$\|x_{i+1} - P_i(x_i - \alpha_i\xi_i)\| \le \alpha_i\delta. \tag{2.86}$$
Then the inequality $d(x_i, C_{\min}) \le \epsilon$ holds for all integers $i$ satisfying $n_0 \le i \le n$.
In the next theorem the set $C$ is not necessarily bounded, but the step-sizes converge to zero and their sum is infinity. It is proved in Section 2.16.

Theorem 2.14 Let
$$M > 2\bar K + 1,\quad \epsilon \in (0,1),\quad L_0 > \bar L,$$
$$|f(z_1) - f(z_2)| \le L_0\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, 2M + 4) \tag{2.87}$$
and let $\{\alpha_i\}_{i=0}^{\infty} \subset (0,1]$ satisfy
$$\lim_{i\to\infty}\alpha_i = 0,\quad \sum_{i=1}^{\infty}\alpha_i = \infty,$$
$$\alpha_t \le 2^{-5}\epsilon^2(\bar L + L_0 + 2)^{-2},\ t = 0,1,\dots. \tag{2.88}$$
Then there exist a natural number $n_0$ and $\delta > 0$ such that the following assertion holds.

Assume that an integer $n \ge n_0$, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$,
$$P_i(X) = C,\ i = 0,\dots,n-1,\quad \{x_i\}_{i=0}^{n} \subset X,\quad \|x_0\| \le M,$$
$\{\xi_i\}_{i=0}^{n-1} \subset X$, and that for all $i = 0,\dots,n-1$,
$$B_X(\xi_i, \delta) \cap \partial_\delta f(x_i) \ne \emptyset,\quad \|x_{i+1} - P_i(x_i - \alpha_i\xi_i)\| \le \alpha_i\delta. \tag{2.89}$$
Then the inequality $d(x_i, C_{\min}) \le \epsilon$ holds for all integers $i$ satisfying $n_0 \le i \le n$.

In the next theorem the set $C$ is not necessarily bounded, while the step-sizes are bounded from below by a positive constant. It is proved in Section 2.17.

Theorem 2.15 Let $M, \epsilon > 0$. Then there exists $\beta_0 \in (0,1)$ such that for each $\beta_1 \in (0,\beta_0)$ there exist a natural number $n_0$ and $\delta > 0$ such that the following assertion holds.

Assume that an integer $n \ge n_0$, $\{P_i\}_{i=0}^{n-1} \subset \mathcal{M}$,
$$P_i(X) = C,\ i = 0,\dots,n-1,\quad \{x_i\}_{i=0}^{n} \subset X,\quad \|x_0\| \le M, \tag{2.90}$$
$$\{\alpha_i\}_{i=0}^{n-1} \subset [\beta_1, \beta_0], \tag{2.91}$$
$\{\xi_i\}_{i=0}^{n-1} \subset X$, and that for all $i = 0,\dots,n-1$,
$$B_X(\xi_i, \delta) \cap \partial_\delta f(x_i) \ne \emptyset, \tag{2.92}$$
$$\|x_{i+1} - P_i(x_i - \alpha_i\xi_i)\| \le \delta. \tag{2.93}$$
Then the inequality $d(x_i, C_{\min}) \le \epsilon$ holds for all integers $i$ satisfying $n_0 \le i \le n$.

Theorems 2.13–2.15 are new. In the case where $P_i = P_0$, $i = 0,\dots,n-1$, Theorems 2.11 and 2.12 were obtained in [71].
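The behavior guaranteed by Theorem 2.13 can be illustrated with a small sketch (our concrete choices, exact projections and subgradients, so $\delta = 0$): diminishing step-sizes with infinite sum drive the iterates into any prescribed neighborhood of $C_{\min}$, where they then remain.

```python
import numpy as np

# Sketch of the projected subgradient method with diminishing step-sizes
# (lim alpha_i = 0, sum alpha_i = infinity), in the spirit of Theorem 2.13.
# Illustrative problem: f(x) = ||x - p||_1 with p = (1, -1), C the
# nonnegative orthant, so C_min = {(1, 0)} and inf(f, C) = 1.

p = np.array([1.0, -1.0])
x_min = np.array([1.0, 0.0])

def subgradient(x):
    g = np.sign(x - p)
    g[g == 0] = 1.0          # any element of [-1, 1] is valid on a kink
    return g

def project(x):              # exact projection onto C = {x >= 0}
    return np.maximum(x, 0.0)

x = np.array([4.0, 3.0])
dists = []
for i in range(500):
    alpha = 1.0 / (i + 1)    # alpha_i -> 0, sum alpha_i = infinity
    x = project(x - alpha * subgradient(x))
    dists.append(np.linalg.norm(x - x_min))

# after enough iterations the iterates stay near C_min
print(dists[-1])
```

The late oscillation around $C_{\min}$ has amplitude on the order of the current step-size, which is why the step-sizes must tend to zero while their sum diverges.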
2.4 Superiorization

In this section we present three results on the projected subgradient method in the case when the step-sizes are summable. Such results are of interest in the study of superiorization and perturbation resilience of algorithms. See [9, 13, 25, 29] and the references mentioned therein. In the first two theorems, at the iterative step $i$ of the algorithm $\xi_i$ is an approximation of a subgradient $v$, while in the third theorem $\xi_i$ is an approximation of $\|v\|^{-1}v$, where $v$ is a subgradient.

In the first theorem the set $C$ is bounded. It is proved in Section 2.18.

Theorem 2.16 Let $K_1 \ge \bar K + 1$, $L_1 \ge \bar L$, $\{\alpha_i\}_{i=0}^{\infty} \subset (0,1]$ satisfy
$$\sum_{i=1}^{\infty}\alpha_i < \infty,$$
$$C \subset B_X(0, K_1) \tag{2.94}$$
and let
$$|f(z_1) - f(z_2)| \le L_1\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, K_1 + 2). \tag{2.95}$$
Assume that $\{P_i\}_{i=0}^{\infty} \subset \mathcal{M}$,
$$P_i(X) = C,\ i = 0,1,\dots, \tag{2.96}$$
$$\|x_0\| \le K_1, \tag{2.97}$$
$$\{x_i\}_{i=0}^{\infty} \subset X,\quad \{\xi_i\}_{i=0}^{\infty} \subset X,$$
and that for all integers $i \ge 0$,
$$\xi_i \in \partial f(x_i), \tag{2.98}$$
$$x_{i+1} = P_i(x_i - \alpha_i\xi_i). \tag{2.99}$$
Then there exists $x_* = \lim_{i\to\infty}x_i$, and at least one of the following properties holds:

$x_* \in C_{\min}$;

there exist an integer $n_0 \ge 1$ and $\epsilon_0 > 0$ such that for each integer $t \ge n_0$ and each $z \in C_{\min}$,
$$\|x_{t+1} - z\|^2 \le \|x_t - z\|^2 - \alpha_t\epsilon_0.$$
In the following two theorems the set $C$ is not necessarily bounded. The first of them is proved in Section 2.19.

Theorem 2.17 Let $K_1 \ge \bar K + 1$, $L_1 \ge \bar L$, $\{\alpha_i\}_{i=0}^{\infty} \subset (0,1]$,
$$|f(z_1) - f(z_2)| \le L_1\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, 3K_1 + 2). \tag{2.100}$$
Assume that $\{P_i\}_{i=0}^{\infty} \subset \mathcal{M}$,
$$P_i(X) = C,\ i = 0,1,\dots, \tag{2.101}$$
$$\{x_i\}_{i=0}^{\infty} \subset X,\quad \{\xi_i\}_{i=0}^{\infty} \subset X,\quad \|x_0\| \le K_1,$$
and that for all integers $i \ge 0$,
$$\alpha_i \in (0, 100(L_1 + \bar L)^{-2}], \tag{2.102}$$
$$\xi_i \in \partial f(x_i), \tag{2.103}$$
$$x_{i+1} = P_i(x_i - \alpha_i\xi_i). \tag{2.104}$$
Then there exists $x_* = \lim_{i\to\infty}x_i$, and at least one of the following properties holds:

$x_* \in C_{\min}$;

there exist an integer $n_0 \ge 1$ and $\epsilon_0 > 0$ such that for each integer $t \ge n_0$ and each $z \in C_{\min}$,
$$\|x_{t+1} - z\|^2 \le \|x_t - z\|^2 - \alpha_t\epsilon_0.$$
The next result is proved in Section 2.20.

Theorem 2.18 Let $K_1 \ge \bar K + 1$, $\{\alpha_i\}_{i=0}^{\infty} \subset (0,1]$ satisfy
$$\sum_{i=1}^{\infty}\alpha_i < \infty.$$
Assume that $\{P_i\}_{i=0}^{\infty} \subset \mathcal{M}$,
$$P_i(X) = C,\ i = 0,1,\dots, \tag{2.105}$$
$$\{x_i\}_{i=0}^{\infty} \subset X,\quad \{\xi_i\}_{i=0}^{\infty} \subset X \setminus \{0\},$$
$$\|x_0\| \le K_1, \tag{2.106}$$
and that for all integers $i \ge 0$,
$$\xi_i \in \partial f(x_i), \tag{2.107}$$
$$x_{i+1} = P_i(x_i - \alpha_i\|\xi_i\|^{-1}\xi_i). \tag{2.108}$$
Then there exists $x_* = \lim_{i\to\infty}x_i$, and at least one of the following properties holds:

$x_* \in C_{\min}$;

there exist an integer $n_0 \ge 1$ and $\epsilon_0 > 0$ such that for each integer $t \ge n_0$ and each $z \in C_{\min}$,
$$\|x_{t+1} - z\|^2 \le \|x_t - z\|^2 - \alpha_t\epsilon_0.$$
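The key feature of the summable-step regime of Theorems 2.16–2.18 is that the iterates always converge, whether or not the limit minimizes $f$ over $C$. A minimal sketch (our illustrative choices; exact projection and subgradients):

```python
import numpy as np

# Sketch of the summable step-size regime of Theorem 2.16: C bounded,
# xi_i a subgradient of f at x_i, x_{i+1} = P_i(x_i - alpha_i xi_i),
# sum alpha_i < infinity.  For the Euclidean projection used here
# (nonexpansive), ||x_{i+1} - x_i|| <= alpha_i ||xi_i|| once x_i lies in C,
# so the total movement is finite and x_* = lim x_i exists; it need not
# belong to C_min.  Illustrative choices: f(x) = ||x||_1, C the unit ball,
# alpha_i = (3/4)^i.

def subgradient(x):
    g = np.sign(x)
    g[g == 0] = 1.0
    return g

def project(x):
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

x = project(np.array([0.9, 0.6]))
trajectory = [x.copy()]
for i in range(200):
    alpha = 0.75 ** i
    x = project(x - alpha * subgradient(x))
    trajectory.append(x.copy())

# the tail of the trajectory settles: the iterates form a Cauchy sequence
tail_moves = [np.linalg.norm(trajectory[k + 1] - trajectory[k])
              for k in range(150, 200)]
print(max(tail_moves))
```

This is exactly the perturbation-resilience phenomenon exploited by superiorization: summable perturbations cannot destroy convergence of the basic iteration.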
2.5 Auxiliary Results for Theorems 2.1–2.10

Let $P_C \in \mathcal{M}$ be an arbitrary element of the space $\mathcal{M}$.

Lemma 2.19 Let
$$K_0, L_0, r > 0, \tag{2.109}$$
$$|f(z_1) - f(z_2)| \le L_0\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, K_0 + 1), \tag{2.110}$$
$$x \in B_X(0, K_0), \tag{2.111}$$
$$v \in \partial_r f(x). \tag{2.112}$$
Then $\|v\| \le L_0 + r$.

Proof By (2.112), for all $u \in X$,
$$f(u) - f(x) \ge \langle v, u - x\rangle - r. \tag{2.113}$$
In view of (2.113), for all $\xi \in B_X(0,1)$,
$$r + f(x + \xi) - f(x) \ge \langle v, \xi\rangle.$$
Together with (2.110) and (2.111), this implies that
$$\langle v, \xi\rangle \le r + L_0\|\xi\| \le r + L_0.$$
This implies that $\|v\| \le L_0 + r$. Lemma 2.19 is proved.

Lemma 2.20 Assume that $\epsilon > 0$, $x \in X$, $y \in X$,
$$f(x) > \inf(f,C) + \epsilon, \tag{2.114}$$
$$f(y) \le \inf(f,C) + \epsilon/4, \tag{2.115}$$
$$v \in \partial_{\epsilon/4} f(x). \tag{2.116}$$
Then $\langle v, y - x\rangle \le -\epsilon/2$.

Proof In view of (2.116), for all $u \in X$,
$$f(u) - f(x) \ge \langle v, u - x\rangle - \epsilon/4. \tag{2.117}$$
By (2.114), (2.115), and (2.117),
$$-(3/4)\epsilon \ge f(y) - f(x) \ge \langle v, y - x\rangle - \epsilon/4.$$
The inequality above implies that $\langle v, y - x\rangle \le -\epsilon/2$. This completes the proof of Lemma 2.20.
Lemma 2.21 Let
$$\bar x \in C_{\min}, \tag{2.118}$$
$K_0 > 0$, $L_0 > 0$,
$$|f(z_1) - f(z_2)| \le L_0\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, K_0 + 1), \tag{2.119}$$
$$\epsilon \in (0, 16(L_0 + \bar L)], \tag{2.120}$$
$\alpha \in (0,1]$, $\delta_f, \delta_C \in (0,1]$ satisfy
$$\delta_f(K_0 + \bar K + 5L_0 + 5\bar L + 1) \le \epsilon/4, \tag{2.121}$$
let a point $x \in X$ satisfy
$$\|x\| \le K_0,\quad f(x) > \inf(f,C) + \epsilon, \tag{2.122}$$
$$v \in \partial_{\epsilon/4} f(x), \tag{2.123}$$
$$\xi \in B_X(v, \delta_f) \tag{2.124}$$
and let
$$y \in B_X(P_C(x - \alpha\xi), \delta_C). \tag{2.125}$$
Then
$$\|y - \bar x\|^2 \le \|x - \bar x\|^2 - 2^{-1}\alpha\epsilon + \delta_C^2 + 2\delta_C(K_0 + \bar K + 5L_0 + 5\bar L) + 25(L_0 + \bar L)^2\alpha^2.$$
Proof Lemma 2.19, (2.119)–(2.121), and (2.123) imply that
$$\|v\| \le 5L_0 + 4\bar L.$$
Lemma 2.20, (2.118), (2.122), and (2.123) imply that
$$\langle v, \bar x - x\rangle \le -\epsilon/2. \tag{2.126}$$
Set
$$y_0 = x - \alpha\xi. \tag{2.127}$$
It follows from (2.7), (2.8), (2.118), (2.120), (2.121), (2.124), and (2.126) that
$$\|y_0 - \bar x\|^2 = \|x - \alpha\xi - \bar x\|^2 = \|x - \alpha v + (\alpha v - \alpha\xi) - \bar x\|^2$$
$$= \|x - \alpha v - \bar x\|^2 + \alpha^2\|v - \xi\|^2 + 2\alpha\langle v - \xi,\ x - \alpha v - \bar x\rangle$$
$$\le \|x - \alpha v - \bar x\|^2 + \alpha^2\delta_f^2 + 2\alpha\delta_f\|x - \alpha v - \bar x\|$$
$$\le \|x - \alpha v - \bar x\|^2 + \alpha^2\delta_f^2 + 2\alpha\delta_f(K_0 + \bar K + \alpha(5L_0 + 4\bar L))$$
$$\le \|x - \bar x\|^2 - 2\alpha\langle x - \bar x, v\rangle + \alpha^2(5L_0 + 4\bar L)^2 + \alpha^2\delta_f^2 + 2\alpha\delta_f(K_0 + \bar K + \alpha(5L_0 + 4\bar L))$$
$$\le \|x - \bar x\|^2 - 2\alpha(\epsilon/2) + \alpha^2(5L_0 + 4\bar L)^2 + \alpha^2\delta_f^2 + 2\alpha\delta_f(K_0 + \bar K + \alpha(5L_0 + 5\bar L))$$
$$\le \|x - \bar x\|^2 - \alpha\epsilon + 25\alpha^2(L_0 + \bar L)^2 + 2\alpha\delta_f(K_0 + \bar K + 5\alpha(L_0 + \bar L) + 1)$$
$$\le \|x - \bar x\|^2 - \alpha\epsilon + 25\alpha^2(L_0 + \bar L)^2 + \alpha\epsilon/2 \le \|x - \bar x\|^2 - \alpha\epsilon/2 + 25\alpha^2(L_0 + \bar L)^2. \tag{2.128}$$
In view of (2.128),
$$\|y_0 - \bar x\|^2 \le \|x - \bar x\|^2 + 25(L_0 + \bar L)^2. \tag{2.129}$$
Relations (2.5), (2.7), (2.8), (2.118), (2.121), and (2.129) imply that
$$\|y_0 - \bar x\| \le K_0 + \bar K + 5(L_0 + \bar L). \tag{2.130}$$
By (2.109), (2.110), (2.118), (2.125), (2.127), (2.128), and (2.130),
$$\|y - \bar x\|^2 = \|y - P_C(x - \alpha\xi) + P_C(x - \alpha\xi) - \bar x\|^2$$
$$\le \|y - P_C(x - \alpha\xi)\|^2 + 2\|y - P_C(x - \alpha\xi)\|\,\|P_C(x - \alpha\xi) - \bar x\| + \|P_C(x - \alpha\xi) - \bar x\|^2$$
$$\le \delta_C^2 + 2\delta_C(K_0 + \bar K + 5L_0 + 5\bar L) + \|y_0 - \bar x\|^2$$
$$\le \delta_C^2 + 2\delta_C(K_0 + \bar K + 5L_0 + 5\bar L) + \|x - \bar x\|^2 - \alpha\epsilon/2 + 25\alpha^2(L_0 + \bar L)^2.$$
Lemma 2.21 is proved.

Lemma 2.21 implies the following result.

Lemma 2.22 Let $K_0 > 0$, $L_0 > 0$,
$$|f(z_1) - f(z_2)| \le L_0\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, K_0 + 1),$$
$$\epsilon \in (0, 16(L_0 + \bar L)],$$
$\alpha \in (0,1]$, $\delta_f, \delta_C \in (0,1]$ satisfy
$$\delta_f(K_0 + \bar K + 5L_0 + 5\bar L + 1) \le \epsilon/4,$$
let a point $x \in X$ satisfy $\|x\| \le K_0$, $f(x) > \inf(f,C) + \epsilon$, $v \in \partial_{\epsilon/4} f(x)$, $\xi \in B_X(v, \delta_f)$, and let $y \in B_X(P_C(x - \alpha\xi), \delta_C)$. Then
$$d(y, C_{\min})^2 \le d(x, C_{\min})^2 - 2^{-1}\alpha\epsilon + \delta_C^2 + 2\delta_C(K_0 + \bar K + 5L_0 + 5\bar L) + 25(L_0 + \bar L)^2\alpha^2.$$
Applying Lemma 2.22 with $\epsilon = 4$, we obtain the following result.

Lemma 2.23 Let $K_0 > 0$, $L_0 > 0$,
$$|f(z_1) - f(z_2)| \le L_0\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, K_0 + 1),$$
$\alpha \in (0,1]$, $\delta_f, \delta_C \in (0,1]$ satisfy
$$\delta_f(K_0 + \bar K + 5L_0 + 5\bar L + 1) \le 1,$$
let a point $x \in X$ satisfy $\|x\| \le K_0$, $f(x) > \inf(f,C) + 4$, $v \in \partial_1 f(x)$, $\xi \in B_X(v, \delta_f)$,
and let $y \in B_X(P_C(x - \alpha\xi), \delta_C)$. Then
$$d(y, C_{\min})^2 \le d(x, C_{\min})^2 - 2\alpha + 2\delta_C(K_0 + \bar K + 5L_0 + 5\bar L + 1) + 25(L_0 + \bar L)^2\alpha^2.$$
Recall that
$$P_C \in \mathcal{M} \tag{2.131}$$
is an arbitrary element of the space $\mathcal{M}$.

Lemma 2.24 Let
$$\bar x \in C_{\min}, \tag{2.132}$$
$K_0 > 0$,
$$\epsilon \in (0, 16\bar L], \tag{2.133}$$
$\alpha \in (0,1]$, $\delta_f, \delta_C \in (0,1]$ satisfy
$$\delta_f(K_0 + \bar K + 1) \le \epsilon(8\bar L)^{-1}, \tag{2.134}$$
and let a point $x \in X$ satisfy
$$\|x\| \le K_0, \tag{2.135}$$
$$f(x) > \inf(f,C) + \epsilon, \tag{2.136}$$
$$v \in \partial_{\epsilon/4} f(x). \tag{2.137}$$
Then
$$\|v\| \ge 3\epsilon(4K_0 + 4\bar K)^{-1} \tag{2.138}$$
and for each
$$\xi \in B_X(\|v\|^{-1}v, \delta_f) \tag{2.139}$$
and each
$$y \in B_X(P_C(x - \alpha\xi), \delta_C) \tag{2.140}$$
the following inequality holds:
$$\|y - \bar x\|^2 \le \|x - \bar x\|^2 - \epsilon(4\bar L)^{-1}\alpha + 2\alpha^2 + \delta_C^2 + 2\delta_C(K_0 + \bar K + 2).$$
Proof In view of (2.5), (2.7)–(2.9), (2.132), and (2.133), for every point
$$z \in B_X(\bar x, \epsilon 4^{-1}\bar L^{-1}) \tag{2.141}$$
we have
$$f(z) \le f(\bar x) + \bar L\|z - \bar x\| \le f(\bar x) + \epsilon/4 = \inf(f,C) + \epsilon/4. \tag{2.142}$$
By (2.132), (2.136), and (2.137),
$$-\epsilon > f(\bar x) - f(x) \ge \langle v, \bar x - x\rangle - \epsilon/4,$$
and combined with (2.7), (2.8), (2.132), and (2.135) this implies that
$$\langle v, \bar x - x\rangle \le -(3/4)\epsilon,\quad (3/4)\epsilon \le \langle v, x - \bar x\rangle \le \|v\|\,\|x - \bar x\| \le \|v\|(K_0 + \bar K).$$
Therefore (2.138) is true. Let $\xi, y \in X$ satisfy (2.139) and (2.140). Lemma 2.20, (2.136), (2.137), (2.141), and (2.142) imply that for every point $z \in B_X(\bar x, \epsilon 4^{-1}\bar L^{-1})$ we have $\langle v, z - x\rangle \le -\epsilon/2$. Combined with (2.138), the inequality above implies that
$$\langle \|v\|^{-1}v, z - x\rangle < 0 \text{ for all } z \in B_X(\bar x, \epsilon(4\bar L)^{-1}). \tag{2.143}$$
Set
$$\tilde z = \bar x + \epsilon 4^{-1}\bar L^{-1}\|v\|^{-1}v. \tag{2.144}$$
It is easy to see that
$$\tilde z \in B_X(\bar x, \epsilon 4^{-1}\bar L^{-1}). \tag{2.145}$$
Relations (2.143)–(2.145) imply that
$$0 > \langle \|v\|^{-1}v, \tilde z - x\rangle = \langle \|v\|^{-1}v,\ \bar x + \epsilon 4^{-1}\bar L^{-1}\|v\|^{-1}v - x\rangle. \tag{2.146}$$
By (2.146),
$$\langle \|v\|^{-1}v, \bar x - x\rangle < -\epsilon 4^{-1}\bar L^{-1}. \tag{2.147}$$
Set
$$y_0 = x - \alpha\xi. \tag{2.148}$$
It follows from (2.8), (2.131), (2.132), (2.134), (2.135), (2.139), and (2.148) that
$$\|y_0 - \bar x\|^2 = \|x - \alpha\xi - \bar x\|^2 = \|x - \alpha\|v\|^{-1}v + \alpha(\|v\|^{-1}v - \xi) - \bar x\|^2$$
$$= \|x - \alpha\|v\|^{-1}v - \bar x\|^2 + \alpha^2\|\|v\|^{-1}v - \xi\|^2 + 2\alpha\langle \|v\|^{-1}v - \xi,\ x - \alpha\|v\|^{-1}v - \bar x\rangle$$
$$\le \|x - \alpha\|v\|^{-1}v - \bar x\|^2 + \alpha^2\delta_f^2 + 2\alpha\delta_f(K_0 + \bar K + 1)$$
$$\le \|x - \bar x\|^2 - 2\alpha\langle x - \bar x, \|v\|^{-1}v\rangle + \alpha^2(1 + \delta_f^2) + 2\alpha\delta_f(K_0 + \bar K + 1)$$
$$< \|x - \bar x\|^2 - 2\alpha\epsilon(4\bar L)^{-1} + \alpha^2(1 + \delta_f^2) + 2\alpha\delta_f(K_0 + \bar K + 1)$$
$$\le \|x - \bar x\|^2 - \alpha\epsilon(4\bar L)^{-1} + 2\alpha^2. \tag{2.149}$$
In view of (2.7), (2.8), (2.132), and (2.149),
$$\|y_0 - \bar x\|^2 \le (K_0 + \bar K)^2 + 2$$
and
$$\|y_0 - \bar x\| \le K_0 + \bar K + 2. \tag{2.150}$$
By (2.10), (2.131), (2.132), (2.140), and (2.148)–(2.150),
$$\|y - \bar x\|^2 = \|y - P_C(x - \alpha\xi) + P_C(x - \alpha\xi) - \bar x\|^2$$
$$\le \|y - P_C(x - \alpha\xi)\|^2 + \|P_C(x - \alpha\xi) - \bar x\|^2 + 2\|y - P_C(x - \alpha\xi)\|\,\|P_C(x - \alpha\xi) - \bar x\|$$
$$\le \|y_0 - \bar x\|^2 + \delta_C^2 + 2\delta_C\|y_0 - \bar x\| \le \|x - \bar x\|^2 - \alpha\epsilon(4\bar L)^{-1} + 2\alpha^2 + \delta_C^2 + 2\delta_C(K_0 + \bar K + 2).$$
This completes the proof of Lemma 2.24.

Applying Lemma 2.24 with $\epsilon = 16\bar L$, we obtain the following result.

Lemma 2.25 Let $\bar x \in C_{\min}$, $K_0 > 0$, $\alpha \in (0,1]$, $\delta_f, \delta_C \in (0,1]$ satisfy $\delta_f(K_0 + \bar K + 1) \le 2$; let a point $x \in X$ satisfy $\|x\| \le K_0$, $f(x) > \inf(f,C) + 16\bar L$, and $v \in \partial_{4\bar L} f(x)$. Then
$$\|v\| \ge 12\bar L(K_0 + \bar K)^{-1}$$
and for each $\xi \in B_X(\|v\|^{-1}v, \delta_f)$ and each $y \in B_X(P_C(x - \alpha\xi), \delta_C)$ the following inequality holds:
$$\|y - \bar x\|^2 \le \|x - \bar x\|^2 - 2\alpha + 2\delta_C(K_0 + \bar K + 3).$$
Lemma 2.24 implies the following result.

Lemma 2.26 Let $K_0 > 0$, $\epsilon \in (0, 16\bar L]$, $\alpha \in (0,1]$, $\delta_f, \delta_C \in (0,1]$ satisfy
$$\delta_f(K_0 + \bar K + 1) \le \epsilon(8\bar L)^{-1},$$
let a point $x \in X$ satisfy $\|x\| \le K_0$, $f(x) > \inf(f,C) + \epsilon$ and let $v \in \partial_{\epsilon/4} f(x)$. Then
$$\|v\| \ge 3\epsilon(4K_0 + 4\bar K)^{-1}$$
and for each $\xi \in B_X(\|v\|^{-1}v, \delta_f)$ and each $y \in B_X(P_C(x - \alpha\xi), \delta_C)$ the following inequality holds:
$$d(y, C_{\min})^2 \le d(x, C_{\min})^2 - \epsilon(4\bar L)^{-1}\alpha + 2\alpha^2 + \delta_C^2 + 2\delta_C(K_0 + \bar K + 2).$$
Lemma 2.27 Let
$$\bar x \in C_{\min}, \tag{2.151}$$
$K_0 \ge \bar K$, $L_0 > 0$,
$$|f(z_1) - f(z_2)| \le L_0\|z_1 - z_2\| \text{ for all } z_1, z_2 \in B_X(0, K_0 + 1), \tag{2.152}$$
$$\epsilon \in (0, 16(L_0 + \bar L)], \tag{2.153}$$
$\alpha \in (0,1]$, $\delta_f, \delta_C \in (0,1]$ satisfy
$$\delta_f(K_0 + \bar K + 5L_0 + 5\bar L + 1) \le \epsilon/8, \tag{2.154}$$
let a point $x \in X$ satisfy
$$\|x\| \le K_0, \tag{2.155}$$
$$v \in \partial_{\epsilon/4} f(x), \tag{2.156}$$
$$\xi \in B_X(v, \delta_f) \tag{2.157}$$
and let
$$y \in B_X(P_C(x - \alpha\xi), \delta_C). \tag{2.158}$$
Then
$$\|y - \bar x\|^2 \le \|x - \bar x\|^2 + 2\alpha(f(\bar x) - f(x)) + 3\alpha\epsilon/4 + 2\delta_C(K_0 + \bar K + 5L_0 + 5\bar L + 1) + 25(L_0 + \bar L)^2\alpha^2.$$
Proof Set
$$y_0 = x - \alpha\xi. \tag{2.159}$$
Lemma 2.19, (2.152), (2.153), and (2.156) imply that
$$\|v\| \le 5L_0 + 4\bar L. \tag{2.160}$$
It follows from (2.8), (2.151), (2.155), (2.157), (2.159), and (2.160) that
$$\|y_0 - \bar x\|^2 = \|x - \alpha\xi - \bar x\|^2 = \|x - \alpha v + (\alpha v - \alpha\xi) - \bar x\|^2$$
$$= \|x - \alpha v - \bar x\|^2 + \alpha^2\|v - \xi\|^2 + 2\alpha\langle v - \xi,\ x - \alpha v - \bar x\rangle$$
$$\le \|x - \alpha v - \bar x\|^2 + \alpha^2\delta_f^2 + 2\alpha\delta_f\|x - \alpha v - \bar x\|$$
$$\le \|x - \alpha v - \bar x\|^2 + \alpha^2\delta_f^2 + 2\alpha\delta_f(K_0 + \bar K + \alpha(5L_0 + 4\bar L))$$
$$\le \|x - \bar x\|^2 - 2\alpha\langle x - \bar x, v\rangle + \alpha^2(5L_0 + 4\bar L)^2 + \alpha^2\delta_f^2 + 2\alpha\delta_f(K_0 + \bar K + \alpha(5L_0 + 4\bar L)). \tag{2.161}$$
In view of (2.156),
$$\langle v, \bar x - x\rangle \le f(\bar x) - f(x) + \epsilon/4. \tag{2.162}$$
By (2.154), (2.161), and (2.162),
$$\|y_0 - \bar x\|^2 \le \|x - \bar x\|^2 + 2\alpha(f(\bar x) - f(x)) + \alpha\epsilon/2 + \alpha^2(5L_0 + 4\bar L)^2 + \alpha^2\delta_f^2 + 2\alpha\delta_f(K_0 + \bar K + \alpha(5L_0 + 4\bar L))$$
$$\le \|x - \bar x\|^2 + 2\alpha(f(\bar x) - f(x)) + \alpha\epsilon/2 + 25\alpha^2(L_0 + \bar L)^2 + 2\alpha\delta_f(K_0 + \bar K + 5(L_0 + \bar L))$$
$$\le \|x - \bar x\|^2 + 2\alpha(f(\bar x) - f(x)) + 3\alpha\epsilon/4 + 25\alpha^2(L_0 + \bar L)^2. \tag{2.163}$$
By (2.7), (2.8), (2.151), (2.153), and (2.163),
$$\|y_0 - \bar x\|^2 \le \|x - \bar x\|^2 + \alpha\epsilon + 25(L_0 + \bar L)^2\alpha^2 + 2\alpha L_0\|\bar x - x\|$$
$$\le (K_0 + \bar K)^2 + 16(L_0 + \bar L) + (5(L_0 + \bar L))^2 + 2L_0(K_0 + \bar K) \le (K_0 + \bar K + 5L_0 + 5\bar L)^2,$$
$$\|y_0 - \bar x\| \le K_0 + \bar K + 5(L_0 + \bar L). \tag{2.164}$$
It follows from (2.10), (2.131), (2.151), (2.158), (2.159), and (2.163) that
$$\|y - \bar x\|^2 = \|y - P_C(x - \alpha\xi) + P_C(x - \alpha\xi) - \bar x\|^2$$
$$\le \|y - P_C(x - \alpha\xi)\|^2 + 2\|y - P_C(x - \alpha\xi)\|\,\|P_C(x - \alpha\xi) - \bar x\| + \|P_C(x - \alpha\xi) - \bar x\|^2$$
$$\le \delta_C^2 + 2\delta_C(K_0 + \bar K + 5L_0 + 5\bar L) + \|y_0 - \bar x\|^2$$
$$\le \delta_C^2 + 2\delta_C(K_0 + \bar K + 5L_0 + 5\bar L) + \|x - \bar x\|^2 + 2\alpha(f(\bar x) - f(x)) + 3\alpha\epsilon/4 + 25\alpha^2(L_0 + \bar L)^2$$
$$\le \|x - \bar x\|^2 + 2\alpha(f(\bar x) - f(x)) + 3\alpha\epsilon/4 + 25\alpha^2(L_0 + \bar L)^2 + 2\delta_C(K_0 + \bar K + 5L_0 + 5\bar L + 1).$$
Lemma 2.27 is proved.
2.6 Proof of Theorem 2.1

By (2.17), (2.19), and (2.21),
$$B_X(x_i, \delta_C) \cap C \ne \emptyset,\ i = 1,\dots,n. \tag{2.165}$$
In view of (2.12), (2.19), and (2.165),
$$\|x_i\| \le K_1 + 1,\ i = 0,1,\dots,n. \tag{2.166}$$
Fix
$$\bar x \in C_{\min}. \tag{2.167}$$
It follows from (2.6), (2.7), and (2.167) that
$$\|\bar x\| \le \bar K \le K_1. \tag{2.168}$$
Assume that the assertion of the theorem does not hold. Then
$$f(x_i) > \inf(f,C) + \epsilon,\ i = 1,\dots,n. \tag{2.169}$$
Let $i \in \{1,\dots,n-1\}$. In view of (2.12)–(2.15), (2.18), (2.20), (2.21), and (2.167), we apply Lemma 2.21 with $K_0 = K_1 + 1$, $L_0 = L_1$, $\xi = \xi_i$, $x = x_i$, $y = x_{i+1}$ and obtain that
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 - 2^{-1}\alpha\epsilon + \delta_C^2 + 2\delta_C(K_0 + \bar K + 5L_1 + 5\bar L + 1) + 25(L_1 + \bar L)^2\alpha^2$$
$$\le \|x_i - \bar x\|^2 - 4^{-1}\alpha\epsilon + 2\delta_C(K_1 + \bar K + 5L_1 + 5\bar L + 2) \le \|x_i - \bar x\|^2 - 8^{-1}\alpha\epsilon. \tag{2.170}$$
Relations (2.166) and (2.168) imply that
$$(1 + K_1 + \bar K)^2 \ge \|x_1 - \bar x\|^2 \ge \|x_1 - \bar x\|^2 - \|x_n - \bar x\|^2 = \sum_{i=1}^{n-1}(\|x_i - \bar x\|^2 - \|x_{i+1} - \bar x\|^2) \ge 8^{-1}\alpha\epsilon(n-1).$$
Together with (2.18), this implies that
$$n \le 1 + 8(1 + K_1 + \bar K)^2\alpha^{-1}\epsilon^{-1} = 1 + 800(1 + K_1 + \bar K)^2(L_1 + \bar L)^2\epsilon^{-2}.$$
This contradicts (2.16). The contradiction we have reached completes the proof of Theorem 2.1.
2.7 Proof of Theorem 2.2

By (2.27), (2.30), and (2.32),
$$B_X(x_i, \delta_C) \cap C \ne \emptyset,\ i = 1,\dots,n. \tag{2.171}$$
Fix
$$\bar x \in C_{\min}. \tag{2.172}$$
It follows from (2.7), (2.8), and (2.172) that
$$\|\bar x\| \le \bar K. \tag{2.173}$$
It follows from (2.10), (2.29), (2.30), and (2.172) that
$$\|x_1 - \bar x\| \le \|x_1 - P_0(x_0)\| + \|P_0(x_0) - \bar x\| \le \delta_C + \|x_0 - \bar x\| \le 1 + K_1 + \bar K. \tag{2.174}$$
In view of (2.172)–(2.174),
$$\|x_1\| \le 1 + K_1 + 2\bar K \le 1 + 3K_1, \tag{2.175}$$
$$d(x_1, C_{\min}) \le 1 + K_1 + \bar K. \tag{2.176}$$
Assume that the assertion of the theorem does not hold. Then
$$f(x_i) > \inf(f,C) + \epsilon,\ i = 1,\dots,n. \tag{2.177}$$
We show by induction that for $i = 1,\dots,n$,
$$d(x_i, C_{\min}) \le 1 + K_1 + \bar K. \tag{2.178}$$
Assume that an integer $i$ satisfies $1 \le i < n$ and that (2.178) is true. (In view of (2.176), our assumption holds with $i = 1$.) By (2.7), (2.8), and (2.178),
$$\|x_i\| \le 1 + K_1 + 2\bar K \le 1 + 3K_1. \tag{2.179}$$
There are two cases:
$$f(x_i) \le \inf(f,C) + 4; \tag{2.180}$$
$$f(x_i) > \inf(f,C) + 4. \tag{2.181}$$
Assume that (2.180) holds. In view of (2.7), (2.8), and (2.180),
$$\|x_i\| \le \bar K. \tag{2.182}$$
It follows from (2.10), (2.32), (2.172), (2.173), and (2.182) that
$$\|x_{i+1} - \bar x\| \le \|x_{i+1} - P_i(x_i - \alpha\xi_i)\| + \|P_i(x_i - \alpha\xi_i) - \bar x\|$$
$$\le \delta_C + \|x_i - \alpha\xi_i - \bar x\| \le 1 + \|x_i - \bar x\| + \alpha\|\xi_i\| \le 1 + 2\bar K + \alpha\|\xi_i\|. \tag{2.183}$$
Lemma 2.19, (2.8), (2.9), and (2.182) imply that
$$\partial_{\epsilon/4} f(x_i) \subset B_X(0, \bar L + \epsilon/4). \tag{2.184}$$
By (2.23)–(2.25), (2.31), and (2.184),
$$\alpha\|\xi_i\| \le \alpha(\bar L + \epsilon/4 + \delta_f) \le \alpha(\bar L + 2) \le 1. \tag{2.185}$$
In view of (2.183) and (2.185),
$$\|x_{i+1} - \bar x\| \le 2 + 2\bar K \le 1 + K_1 + \bar K$$
and
$$d(x_{i+1}, C_{\min}) \le 1 + K_1 + \bar K.$$
Assume that (2.181) holds. In view of (2.23)–(2.25), (2.31), (2.32), (2.178), (2.179), and (2.181), we apply Lemma 2.23 with $P_C = P_i$, $K_0 = 3K_1 + 1$, $L_0 = L_1$, $\xi = \xi_i$, $x = x_i$, $y = x_{i+1}$ and obtain that
$$d(x_{i+1}, C_{\min})^2 \le d(x_i, C_{\min})^2 - 2\alpha + 2\delta_C(3K_1 + \bar K + 2 + 5L_1 + 5\bar L) + 25(L_1 + \bar L)^2\alpha^2$$
$$\le d(x_i, C_{\min})^2 - \alpha + 2\delta_C(3K_1 + \bar K + 2 + 5L_1 + 5\bar L) \le d(x_i, C_{\min})^2 - \alpha/2 \le d(x_i, C_{\min})^2$$
and
$$d(x_{i+1}, C_{\min}) \le d(x_i, C_{\min}) \le 1 + K_1 + \bar K.$$
Thus in both cases $d(x_{i+1}, C_{\min}) \le K_1 + \bar K + 1$. Thus we showed by induction that (2.178) holds for all integers $i = 1,\dots,n$ and that
$$\|x_i\| \le 1 + K_1 + 2\bar K \le 1 + 3K_1. \tag{2.186}$$
In view of (2.23)–(2.25), (2.28), (2.31), (2.32), (2.171), (2.172), (2.177), and (2.186), we apply Lemma 2.21 with $P_C = P_i$, $K_0 = 3K_1 + 1$, $L_0 = L_1$, $\xi = \xi_i$, $x = x_i$, $y = x_{i+1}$ and obtain that
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 - 2^{-1}\alpha\epsilon + 2\delta_C(3K_1 + \bar K + 2 + 5L_1 + 5\bar L) + 25(L_1 + \bar L)^2\alpha^2$$
$$\le \|x_i - \bar x\|^2 - 4^{-1}\alpha\epsilon + 2\delta_C(3K_1 + \bar K + 2 + 5L_1 + 5\bar L) \le \|x_i - \bar x\|^2 - 8^{-1}\alpha\epsilon. \tag{2.187}$$
By (2.28), (2.175), and (2.187),
$$(1 + K_1 + \bar K)^2 \ge \|x_1 - \bar x\|^2 \ge \|x_1 - \bar x\|^2 - \|x_n - \bar x\|^2 = \sum_{i=1}^{n-1}(\|x_i - \bar x\|^2 - \|x_{i+1} - \bar x\|^2) \ge 8^{-1}\alpha\epsilon(n-1)$$
and
$$n \le 1 + 8(1 + K_1 + \bar K)^2(\alpha\epsilon)^{-1} \le 1 + 800(1 + K_1 + \bar K)^2\epsilon^{-2}(L_1 + \bar L)^2.$$
This contradicts (2.26). The contradiction we have reached completes the proof of Theorem 2.2.
2.8 Proof of Theorem 2.3

Set
$$P_n = P_{n-1},\quad \xi_n \in \partial f(x_n),\quad \alpha_n = 1, \tag{2.188}$$
$$x_{n+1} = P_n(x_n - \xi_n). \tag{2.189}$$
By (2.37), (2.39), (2.41), (2.188), and (2.189),
$$B_X(x_i, \delta_C) \cap C \ne \emptyset,\ i = 1,\dots,n+1. \tag{2.190}$$
In view of (2.33), (2.38), and (2.190),
$$\|x_i\| \le K_1 + 1,\ i = 0,1,\dots,n+1. \tag{2.191}$$
Fix
$$\bar x \in C_{\min}. \tag{2.192}$$
It follows from (2.7), (2.8), and (2.192) that
$$\|\bar x\| \le \bar K \le K_1. \tag{2.193}$$
Let $i \in \{1,\dots,n\}$. In view of (2.34), (2.36), (2.40), (2.41), (2.188), (2.189), (2.191), and (2.192), we apply Lemma 2.27 with $K_0 = K_1 + 1$, $L_0 = L_1$, $\alpha = \alpha_i$, $\xi = \xi_i$, $x = x_i$, $y = x_{i+1}$ and obtain that
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 + 2\alpha_i(f(\bar x) - f(x_i)) + 3\alpha_i\epsilon/4 + 25(L_1 + \bar L)^2\alpha_i^2 + 2\delta_C(K_1 + \bar K + 5L_1 + 5\bar L + 2). \tag{2.194}$$
Inequality (2.194) implies that
$$\alpha_i(f(x_i) - f(\bar x)) \le 2^{-1}\|x_i - \bar x\|^2 - 2^{-1}\|x_{i+1} - \bar x\|^2 + 3\alpha_i\epsilon/4 + 25(L_1 + \bar L)^2\alpha_i^2 + 2\delta_C(K_1 + \bar K + 5L_1 + 5\bar L + 2). \tag{2.195}$$
In view of (2.195),
$$\sum_{i=1}^{n}\alpha_i(f(x_i) - f(\bar x)) \le 2^{-1}\|x_1 - \bar x\|^2 - 2^{-1}\|x_{n+1} - \bar x\|^2 + 3\epsilon\sum_{i=1}^{n}\alpha_i/4 + 25(L_1 + \bar L)^2\sum_{i=1}^{n}\alpha_i^2 + 2n\delta_C(K_1 + \bar K + 5L_1 + 5\bar L + 2). \tag{2.196}$$
It follows from (2.191), (2.193), (2.196), and the convexity of $f$ that
$$\min\{f(x_t) : t = 1,\dots,n\} - \inf(f,C),\quad f\Big(\Big(\sum_{j=1}^{n}\alpha_j\Big)^{-1}\sum_{i=1}^{n}\alpha_i x_i\Big) - \inf(f,C)$$
$$\le \Big(\sum_{j=1}^{n}\alpha_j\Big)^{-1}\sum_{i=1}^{n}\alpha_i(f(x_i) - \inf(f,C))$$
$$\le \Big(\sum_{i=1}^{n}\alpha_i\Big)^{-1}(2K_1 + 1)^2 + 3\epsilon/4 + 25(L_1 + \bar L)^2\Big(\sum_{i=1}^{n}\alpha_i^2\Big)\Big(\sum_{i=1}^{n}\alpha_i\Big)^{-1} + 2n\Big(\sum_{i=1}^{n}\alpha_i\Big)^{-1}\delta_C(\bar K + K_1 + 5L_1 + 5\bar L + 2).$$
Theorem 2.3 is proved.
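Theorem 2.3 bounds the objective both at the best iterate and at the step-size-weighted average of the iterates. A minimal numeric sketch of that averaging (our illustrative choices: $f(x)=\|x\|_1$, $C$ the unit ball, exact projection and subgradients, constant step-size):

```python
import numpy as np

# Sketch of the averaged-iterate bound of Theorem 2.3: besides
# min_t f(x_t), the weighted average (sum alpha_j)^{-1} sum alpha_i x_i
# is also nearly optimal.  All choices below are illustrative.

def subgradient(x):
    g = np.sign(x)
    g[g == 0] = 1.0
    return g

def project(x):
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

alpha, n_steps = 0.05, 400
x = project(np.array([3.0, -2.0]))
xs, ws = [], []
for _ in range(n_steps):
    xs.append(x)
    ws.append(alpha)
    x = project(x - alpha * subgradient(x))

x_avg = sum(w * xi for w, xi in zip(ws, xs)) / sum(ws)
best = min(np.sum(np.abs(xi)) for xi in xs)
# both quantities approach inf(f, C) = 0 as n grows and alpha shrinks
print(best, np.sum(np.abs(x_avg)))
```

With a constant step-size the individual iterates oscillate at scale $\alpha$, while averaging smooths the oscillation out, which is the practical appeal of the weighted-average bound.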
2.9 Proof of Theorem 2.5

Set
$$P_n = P_{n-1},\quad \xi_n \in \partial f(x_n), \tag{2.197}$$
$$x_{n+1} = P_n(x_n - \alpha\xi_n). \tag{2.198}$$
Fix
$$\bar x \in C_{\min}. \tag{2.199}$$
It follows from (2.7), (2.8), and (2.199) that
$$\|\bar x\| \le \bar K. \tag{2.200}$$
It follows from (2.10), (2.47), (2.48), and (2.200) that
$$\|x_1 - \bar x\| \le \|x_1 - P_0(x_0)\| + \|P_0(x_0) - \bar x\| \le \delta_C + \|x_0 - \bar x\| \le 1 + K_1 + \bar K. \tag{2.201}$$
We show that for all $i = 1,\dots,n+1$,
$$\|x_i - \bar x\| \le 1 + K_1 + \bar K. \tag{2.202}$$
Assume that an integer $i$ satisfies $1 \le i < n+1$ and that (2.202) is true. (In view of (2.201), our assumption holds with $i = 1$.) There are two cases:
$$f(x_i) \le \inf(f,C) + 4; \tag{2.203}$$
$$f(x_i) > \inf(f,C) + 4. \tag{2.204}$$
Assume that (2.203) holds. In view of (2.7), (2.8), and (2.203),
$$\|x_i\| \le \bar K. \tag{2.205}$$
Lemma 2.19, (2.7)–(2.9), (2.42), and (2.205) imply that
$$\partial_{\epsilon/4} f(x_i) \subset B_X(0, \bar L + \epsilon/4) \subset B_X(0, \bar L + 1). \tag{2.206}$$
By (2.49), (2.197), and (2.206),
$$\|\xi_i\| \le \bar L + 2. \tag{2.207}$$
It follows from (2.10), (2.43), (2.197), (2.198), (2.200), (2.205), and (2.207) that
$$\|x_{i+1} - \bar x\| \le \|x_{i+1} - P_i(x_i - \alpha\xi_i)\| + \|P_i(x_i - \alpha\xi_i) - \bar x\| \le \delta_C + \|x_i - \alpha\xi_i - \bar x\|$$
$$\le 1 + \|x_i - \bar x\| + \alpha\|\xi_i\| \le 1 + 2\bar K + \alpha(\bar L + 2) \le 1 + 2\bar K + 2 \le 1 + \bar K + K_1.$$
Assume that (2.204) holds. In view of (2.42)–(2.45), (2.50), (2.47), (2.48), (2.199), (2.202), and (2.204), we apply Lemma 2.27 with $K_0 = 1 + 2\bar K + K_1$, $L_0 = L_1$, $\xi = \xi_i$, $x = x_i$, $y = x_{i+1}$, $\epsilon = 4$ and obtain that
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 + 2\alpha(f(\bar x) - f(x_i)) + 3\alpha + 25(L_1 + \bar L)^2\alpha^2 + 2\delta_C(3\bar K + K_1 + 2 + 5L_1 + 5\bar L)$$
$$\le \|x_i - \bar x\|^2 - 8\alpha + 3\alpha + 25(L_1 + \bar L)^2\alpha^2 + 2\delta_C(3\bar K + K_1 + 2 + 5L_1 + 5\bar L) \le \|x_i - \bar x\|^2 - 2\alpha$$
and
$$\|x_{i+1} - \bar x\| \le \|x_i - \bar x\| \le 1 + \bar K + K_1. \tag{2.208}$$
Thus in both cases (2.208) holds. Therefore by induction we showed that
$$\|x_i - \bar x\| \le 1 + K_1 + \bar K,\ i = 0,\dots,n+1, \tag{2.209}$$
$$\|x_i\| \le 1 + K_1 + 2\bar K,\ i = 0,\dots,n+1. \tag{2.210}$$
Let $i \in \{1,\dots,n\}$. In view of (2.42), (2.45), (2.49), (2.50), (2.47), (2.48), and (2.210), we apply Lemma 2.27 with $K_0 = 2\bar K + K_1 + 1$, $L_0 = L_1$, $\xi = \xi_i$, $x = x_i$, $y = x_{i+1}$ and obtain that
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 + 2\alpha(f(\bar x) - f(x_i)) + 3\alpha\epsilon/4 + 2\delta_C(K_1 + 3\bar K + 5L_1 + 5\bar L + 2) + 25(L_1 + \bar L)^2\alpha^2.$$
This inequality implies that
$$2\alpha(f(x_i) - f(\bar x)) \le \|x_i - \bar x\|^2 - \|x_{i+1} - \bar x\|^2 + 3\alpha\epsilon/4 + 25(L_1 + \bar L)^2\alpha^2 + 2\delta_C(K_1 + 3\bar K + 5L_1 + 5\bar L + 2). \tag{2.211}$$
By (2.211),
$$2\alpha\sum_{i=1}^{n}(f(x_i) - f(\bar x)) \le \sum_{i=1}^{n}(\|x_i - \bar x\|^2 - \|x_{i+1} - \bar x\|^2) + n[3\alpha\epsilon/4 + 25(L_1 + \bar L)^2\alpha^2 + 2\delta_C(K_1 + 3\bar K + 5L_1 + 5\bar L + 2)]$$
$$\le \|x_1 - \bar x\|^2 + n[3\alpha\epsilon/4 + 25(L_1 + \bar L)^2\alpha^2 + 2\delta_C(K_1 + 3\bar K + 5L_1 + 5\bar L + 2)]. \tag{2.212}$$
It follows from (2.209) and (2.212) that
$$n^{-1}\sum_{i=1}^{n} f(x_i) - \inf(f,C) \le (2\alpha n)^{-1}(1 + K_1 + \bar K)^2 + 3\epsilon/8 + 2^{-1}\cdot 25(L_1 + \bar L)^2\alpha + \alpha^{-1}\delta_C(K_1 + 3\bar K + 5L_1 + 5\bar L + 2).$$
Theorem 2.5 is proved.
2.10 Proof of Theorem 2.8

Fix
$$\bar x \in C_{\min}. \tag{2.213}$$
By (2.51), (2.56), (2.58), (2.59), and (2.61), for every $i \in \{1,\dots,n\}$,
$$\|x_i\| \le K_1 + 1. \tag{2.214}$$
Assume that an integer $i \in \{1,\dots,n\} \setminus \{n\}$ satisfies
$$f(x_i) > \inf(f,C) + \epsilon. \tag{2.215}$$
We apply Lemma 2.24 with $P_C = P_i$, $K_0 = K_1$, $\xi = \xi_i$, $x = x_i$, $y = x_{i+1}$, $\alpha = \alpha_i$ and, in view of (2.52), (2.53), (2.54), (2.57), (2.58), (2.60), (2.61), and (2.215), obtain that
$$\partial_{\epsilon/4} f(x_i) \subset \{v \in X : \|v\| \ge (3/4)\epsilon(K_1 + \bar K)^{-1}\}, \tag{2.216}$$
$$\|\xi_i\| \ge (3/4)\epsilon(K_1 + \bar K)^{-1} - \delta_f,$$
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 - \epsilon(4\bar L)^{-1}\alpha_i + 2\alpha_i^2 + \delta_C^2 + 2\delta_C(K_1 + \bar K + 2)$$
$$\le \|x_i - \bar x\|^2 - \epsilon(8\bar L)^{-1}\alpha_i + 2\delta_C(K_1 + \bar K + 3) \le \|x_i - \bar x\|^2 - \epsilon(16\bar L)^{-1}\alpha_i.$$
Thus we have shown that the following property holds:

(a) If an integer $i \in \{1,\dots,n\} \setminus \{n\}$ and $f(x_i) > \inf(f,C) + \epsilon$, then (2.216) holds and
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 - \epsilon(16\bar L)^{-1}\alpha_i. \tag{2.217}$$
Assume that an integer $j \in \{1,\dots,n\} \setminus \{n\}$ and that
$$f(x_i) > \inf(f,C) + \epsilon,\ i = 1,\dots,j. \tag{2.218}$$
Property (a) implies that (2.216) and (2.217) hold for all $i = 1,\dots,j$. Combined with (2.7), (2.8), (2.57), (2.213), and (2.214), this implies that
$$(K_1 + \bar K + 1)^2 \ge \|x_1 - \bar x\|^2 \ge \|x_1 - \bar x\|^2 - \|x_{j+1} - \bar x\|^2 = \sum_{i=1}^{j}(\|x_i - \bar x\|^2 - \|x_{i+1} - \bar x\|^2)$$
$$\ge \epsilon(16\bar L)^{-1}(j+1)\min\{\alpha_i : i = 1,\dots,j\} \ge 2^{-1}(16\bar L)^{-2}\epsilon^2(j+1)$$
and
$$j + 1 \le 2(1 + K_1 + \bar K)^2\epsilon^{-2}(16\bar L)^2 \le n - 2.$$
This implies that there exists $j \in \{1,\dots,n\}$ such that $f(x_j) \le \inf(f,C) + \epsilon$, and if $i \in \{1,\dots,n\} \setminus \{j\}$, then $f(x_i) > \inf(f,C) + \epsilon$ and
$$\partial_{\epsilon/4} f(x_i) \subset \{v \in X : \|v\| \ge (3/4)\epsilon(K_1 + \bar K)^{-1}\}.$$
Theorem 2.8 is proved.
2.11 Proof of Theorem 2.10

In view of (2.66), (2.69), and (2.71),
$$B_X(x_i, \delta_C) \cap C \ne \emptyset,\ i = 1,\dots,n. \tag{2.219}$$
Fix
$$\bar x \in C_{\min}. \tag{2.220}$$
It follows from (2.7), (2.8), and (2.220) that
$$\|\bar x\| \le \bar K. \tag{2.221}$$
It follows from (2.10), (2.68), (2.69), (2.220), and (2.221) that
$$\|x_1 - \bar x\| \le \|x_1 - P_0(x_0)\| + \|P_0(x_0) - \bar x\| \le 1 + \|x_0 - \bar x\| \le 1 + K_1 + \bar K. \tag{2.222}$$
We show that for all $i = 1,\dots,n$,
$$\|x_i - \bar x\| \le 1 + K_1 + \bar K. \tag{2.223}$$
Assume that an integer $i$ satisfies $1 \le i < n$ and that (2.223) is true. (In view of (2.222), our assumption holds with $i = 1$.) We show that our assumption holds for $i+1$ too. There are two cases:
$$f(x_i) \le \inf(f,C) + 4; \tag{2.224}$$
$$f(x_i) > \inf(f,C) + 4. \tag{2.225}$$
Assume that (2.224) holds. In view of (2.7), (2.8), and (2.224),
$$\|x_i\| \le \bar K. \tag{2.226}$$
It follows from (2.10), (2.62), (2.64), (2.67), (2.70), (2.71), (2.226), and (2.220) that
$$\|x_{i+1} - \bar x\| \le \|x_{i+1} - P_i(x_i - \alpha_i\xi_i)\| + \|P_i(x_i - \alpha_i\xi_i) - \bar x\| \le \delta_C + \|x_i - \bar x\| + \alpha_i\|\xi_i\|$$
$$\le \delta_C + 2\bar K + 1 + \delta_f \le 3 + 2\bar K \le 1 + \bar K + K_1.$$
Assume that (2.225) holds. In view of (2.62)–(2.64), (2.67), (2.70), (2.71), (2.220), (2.221), (2.223), and (2.225), we apply Lemma 2.24 with $P_C = P_i$, $K_0 = 1 + 2\bar K + K_1$, $\alpha = \alpha_i$, $\xi = \xi_i$, $x = x_i$, $y = x_{i+1}$, $\epsilon = 4$ and obtain that
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 - \alpha_i\bar L^{-1} + 2\alpha_i^2 + \delta_C^2 + 2\delta_C(3\bar K + K_1 + 3)$$
$$\le \|x_i - \bar x\|^2 - (2\bar L)^{-1}\alpha_i + 2\delta_C(3\bar K + K_1 + 4) \le \|x_i - \bar x\|^2 - (4\bar L)^{-1}\alpha_i.$$
By the relation above,
$$\|x_{i+1} - \bar x\| \le \|x_i - \bar x\| \le 1 + \bar K + K_1.$$
Thus in both cases
$$\|x_{i+1} - \bar x\| \le 1 + K_1 + \bar K.$$
Therefore by induction we showed that (2.223) holds for all $i = 1,\dots,n$. This implies that for all $i = 1,\dots,n$,
$$\|x_i\| \le 1 + K_1 + 2\bar K.$$
Assume that $i \in \{1,\dots,n\} \setminus \{n\}$ and that $f(x_i) > \inf(f,C) + \epsilon$. In view of the inequality above, (2.62)–(2.64), (2.67), (2.70), (2.71), (2.220), and (2.223), we apply Lemma 2.24 with $P_C = P_i$, $K_0 = 2\bar K + K_1 + 1$, $\alpha = \alpha_i$, $\xi = \xi_i$, $x = x_i$, $y = x_{i+1}$ and obtain that
$$\partial_{\epsilon/4} f(x_i) \subset \{v \in X : \|v\| \ge (3/4)\epsilon(K_1 + 3\bar K + 1)^{-1}\}, \tag{2.227}$$
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 - \epsilon(4\bar L)^{-1}\alpha_i + 2\alpha_i^2 + \delta_C^2 + 2\delta_C(K_1 + 3\bar K + 3)$$
$$\le \|x_i - \bar x\|^2 - \epsilon(8\bar L)^{-1}\alpha_i + 2\delta_C(K_1 + 3\bar K + 4) \le \|x_i - \bar x\|^2 - \epsilon(16\bar L)^{-1}\alpha_i.$$
Thus we have shown that the following property holds:

(a) If an integer $i \in \{1,\dots,n\} \setminus \{n\}$ and $f(x_i) > \inf(f,C) + \epsilon$, then (2.227) holds and
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 - \epsilon(16\bar L)^{-1}\alpha_i.$$
Assume that an integer $j \in \{1,\dots,n\} \setminus \{n\}$ and that
$$f(x_i) > \inf(f,C) + \epsilon,\ i = 1,\dots,j.$$
Property (a) and the inequality above imply that for all $i = 1,\dots,j$, (2.227) holds and
$$\|x_{i+1} - \bar x\|^2 \le \|x_i - \bar x\|^2 - 2^{-1}(16\bar L)^{-2}\epsilon^2 \tag{2.228}$$
is true. By (2.65), (2.222), and (2.228),
$$(K_1 + \bar K + 1)^2 \ge \|x_1 - \bar x\|^2 \ge \|x_1 - \bar x\|^2 - \|x_{j+1} - \bar x\|^2 = \sum_{i=1}^{j}(\|x_i - \bar x\|^2 - \|x_{i+1} - \bar x\|^2) \ge 2^{-1}j(16\bar L)^{-2}\epsilon^2$$
and
$$j \le 2(1 + K_1 + \bar K)^2\epsilon^{-2}(16\bar L)^2 \le n - 2.$$
This implies that there exists $j \in \{1,\dots,n\}$ such that $f(x_j) \le \inf(f,C) + \epsilon$, and if $i \in \{1,\dots,n\} \setminus \{j\}$, then $f(x_i) > \inf(f,C) + \epsilon$ and
$$\partial_{\epsilon/4} f(x_i) \subset \{v \in X : \|v\| \ge (3/4)\epsilon(K_1 + 3\bar K + 1)^{-1}\}.$$
Theorem 2.10 is proved.
2.12 An Auxiliary Result for Theorems 2.11–2.15

Assume that all the assumptions made in Sections 2.1–2.3 hold.

Proposition 2.28 Let $\epsilon \in (0,1]$. Then for each $x \in X$ satisfying
$$d(x, C) < \min\{2^{-1}\bar L^{-1}\phi(\epsilon/2),\ \epsilon/2\}, \tag{2.229}$$
$$f(x) \le \inf(f,C) + \min\{2^{-1}\phi(\epsilon/2),\ \epsilon/2\}, \tag{2.230}$$
the inequality $d(x, C_{\min}) \le \epsilon$ holds.

Proof In view of the definition of $\phi$ (see (2.72)), $\phi(\epsilon/2) \in (0,1]$ and

if $x \in C$ satisfies $f(x) < \inf(f,C) + \phi(\epsilon/2)$, then $d(x, C_{\min}) \le \min\{1, \epsilon/2\}$. (2.231)

Assume that a point $x \in X$ satisfies (2.229) and (2.230). There exists a point $y \in C$ which satisfies
$$\|x - y\| < 2^{-1}\bar L^{-1}\phi(\epsilon/2) \text{ and } \|x - y\| < \epsilon/2. \tag{2.232}$$
Relations (2.7), (2.8), (2.230), and (2.232) imply that
$$x \in B_X(0, \bar K),\quad y \in B_X(0, \bar K + 1). \tag{2.233}$$
By (2.233), (2.232), and the definition of $\bar L$ (see (2.9)),
$$|f(x) - f(y)| \le \bar L\|x - y\| < \phi(\epsilon/2)2^{-1}. \tag{2.234}$$
It follows from the choice of the point $y$, (2.230), and (2.234) that $y \in C$ and
$$f(y) < f(x) + \phi(\epsilon/2)2^{-1} \le \inf(f,C) + \phi(\epsilon/2).$$
Combined with (2.231), this implies that $d(y, C_{\min}) \le \epsilon/2$. Together with (2.232), this implies that
$$d(x, C_{\min}) \le \|x - y\| + d(y, C_{\min}) \le \epsilon.$$
This completes the proof of Proposition 2.28.
2.13 Proof of Theorem 2.11 We may assume without loss of generality that < 1. In view of Proposition 2.28, there exist a number ¯ ∈ (0, /8)
(2.235)
such that if x ∈ X, d(x, C) ≤ 2¯ ,andf (x) ≤ inf(f, C) + 2¯ , then d(x, Cmin ) ≤ .
(2.236)
x¯ ∈ Cmin
(2.237)
¯ 0 ∈ (0, 4−1 ).
(2.238)
Fix
and
Since $\lim_{i\to\infty}\alpha_i=0$ (see (2.73)), there is an integer $p_0>0$ such that
$$\bar K+4<p_0 \tag{2.239}$$
and that for all integers $p\ge p_0-1$, we have
$$\alpha_p<(32\bar L)^{-1}\epsilon_0. \tag{2.240}$$
Since $\sum_{i=0}^{\infty}\alpha_i=\infty$ (see (2.73)), there exists a natural number $n_0>p_0+4$ such that
$$\sum_{i=p_0}^{n_0-1}\alpha_i>(4p_0+M+\|\bar x\|)^2\epsilon_0^{-1}16\bar L. \tag{2.241}$$
Fix
$$K_*>\bar K+4+M+4n_0+4\|\bar x\| \tag{2.242}$$
and a positive number $\delta$ such that
$$6\delta(K_*+1)<(16\bar L)^{-1}\epsilon_0. \tag{2.243}$$
Assume that an integer $n\ge n_0$ and that
$$\{P_k\}_{k=0}^{n-1}\subset\mathcal M,\ \{x_k\}_{k=0}^{n}\subset X,$$
$$\|x_0\|\le M,\quad P_i(X)=C,\ i=0,\dots,n-1, \tag{2.244}$$
$$v_k\in\partial_\delta f(x_k)\setminus\{0\},\ k=0,\dots,n-1, \tag{2.245}$$
$$\{\eta_k\}_{k=0}^{n-1},\{\xi_k\}_{k=0}^{n-1}\subset B_X(0,\delta), \tag{2.246}$$
and that for all integers $k=0,\dots,n-1$, we have
$$x_{k+1}=P_k(x_k-\alpha_k\|v_k\|^{-1}v_k-\alpha_k\xi_k)-\alpha_k\eta_k. \tag{2.247}$$
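As an aside, the perturbed iteration (2.247) can be sketched numerically. The toy instance below is entirely illustrative: the objective, the constraint set $C$ (taken as the unit ball), the step rule, and the noise level are assumptions made for the example, not data from the text.

```python
import numpy as np

def project_ball(x, radius=1.0):
    # Exact projection onto C = {x : ||x|| <= radius}.
    nrm = np.linalg.norm(x)
    return x if nrm <= radius else (radius / nrm) * x

def noisy_vector(rng, dim, bound):
    # A random vector of norm at most `bound`, modeling the errors eta_k, xi_k.
    v = rng.standard_normal(dim)
    return rng.uniform(0.0, bound) * v / np.linalg.norm(v)

def perturbed_projected_subgradient(x0, subgrad, steps, delta=1e-3, seed=0):
    # x_{k+1} = P(x_k - a_k*v_k/||v_k|| - a_k*xi_k) - a_k*eta_k  (cf. (2.247)).
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for a in steps:
        v = subgrad(x)
        xi, eta = noisy_vector(rng, x.size, delta), noisy_vector(rng, x.size, delta)
        x = project_ball(x - a * v / np.linalg.norm(v) - a * xi) - a * eta
    return x

# Toy instance: f(x) = ||x - c|| with c outside the unit ball C;
# the unique minimizer of f over C is c/||c|| = (1, 0).
c = np.array([2.0, 0.0])
subgrad = lambda x: (x - c) / np.linalg.norm(x - c)
steps = [1.0 / (k + 1) for k in range(200)]   # a_k -> 0 and sum a_k = infinity
x = perturbed_projected_subgradient(np.array([0.0, -1.0]), subgrad, steps)
```

With diminishing but non-summable steps and small errors, the iterates settle near the minimizer on the boundary, in line with the behavior the theorem quantifies.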
In order to prove the theorem, it is sufficient to show that $d(x_k,C_{\min})\le\epsilon$ for all integers $k$ satisfying $n_0\le k\le n$. Assume that an integer
$$k\in[p_0,n-1], \tag{2.248}$$
$$\|x_k\|\le K_*, \tag{2.249}$$
$$f(x_k)>\inf(f,C)+\epsilon_0. \tag{2.250}$$
In view of (2.237), (2.243), (2.245), (2.246), (2.247), (2.249), and (2.250), the conditions of Lemma 2.24 hold with $P_C=P_k$, $K_0=K_*$, $\epsilon=\epsilon_0$, $\delta_f=\delta$, $\delta_C=\delta\alpha_k$, $\alpha=\alpha_k$, $x=x_k$, $v=v_k$, $\xi=\xi_k+\|v_k\|^{-1}v_k$, and $y=x_{k+1}$, and combined with (2.240), (2.243), and (2.248), this lemma implies that
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-\alpha_k(4\bar L)^{-1}\epsilon_0+2\alpha_k^2+\alpha_k^2\delta^2+2\delta\alpha_k(K_*+\bar K+2)$$
$$\le\|x_k-\bar x\|^2-\alpha_k(8\bar L)^{-1}\epsilon_0+2\delta\alpha_k(K_*+\bar K+3)$$
$$\le\|x_k-\bar x\|^2-\alpha_k(16\bar L)^{-1}\epsilon_0.$$
Thus we have shown that the following property holds:

(P1) If an integer $k\in[p_0,n-1]$ and (2.249) and (2.250) are valid, then we have
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-(16\bar L)^{-1}\alpha_k\epsilon_0.$$

We claim that there exists an integer $j\in\{p_0,\dots,n_0\}$ such that $f(x_j)\le\inf(f,C)+\epsilon_0$. Assume the contrary. Then
$$f(x_i)>\inf(f,C)+\epsilon_0,\ i=p_0,\dots,n_0. \tag{2.251}$$
It follows from (2.10), (2.243), and (2.245)–(2.247) that for all integers $i=0,\dots,n-1$, we have
$$\|x_{i+1}-\bar x\|\le 1+\|P_i(x_i-\alpha_i\|v_i\|^{-1}v_i-\alpha_i\xi_i)-\bar x\|\le 1+\|x_i-\alpha_i\|v_i\|^{-1}v_i-\alpha_i\xi_i-\bar x\|\le 1+\|x_i-\bar x\|+2=\|x_i-\bar x\|+3. \tag{2.252}$$
By (2.242), (2.244), and (2.252), for all integers $i=0,\dots,n_0$,
$$\|x_i\|\le\|x_0-\bar x\|+3i+\|\bar x\|\le M+3i+2\|\bar x\|\le M+3n_0+2\|\bar x\|<K_*. \tag{2.253}$$
Let
$$i\in\{p_0,\dots,n_0-1\}. \tag{2.254}$$
It follows from (2.251), (2.253), (2.254), and property (P1) that
$$\|x_{i+1}-\bar x\|^2\le\|x_i-\bar x\|^2-(16\bar L)^{-1}\alpha_i\epsilon_0. \tag{2.255}$$
Relations (2.253) and (2.255) imply that
$$(M+3p_0+\|\bar x\|)^2\ge\|x_{p_0}-\bar x\|^2-\|x_{n_0}-\bar x\|^2=\sum_{i=p_0}^{n_0-1}\left(\|x_i-\bar x\|^2-\|x_{i+1}-\bar x\|^2\right)\ge(16\bar L)^{-1}\epsilon_0\sum_{i=p_0}^{n_0-1}\alpha_i$$
and
$$\sum_{i=p_0}^{n_0-1}\alpha_i\le 16\bar L\epsilon_0^{-1}(M+3p_0+\|\bar x\|)^2.$$
This contradicts (2.241). The contradiction we have reached proves that there exists an integer
$$j\in\{p_0,\dots,n_0\} \tag{2.256}$$
such that
$$f(x_j)\le\inf(f,C)+\epsilon_0. \tag{2.257}$$
By (2.243), (2.244), (2.246), and (2.247), we have
$$d(x_j,C)\le\alpha_{j-1}\delta<\bar\epsilon. \tag{2.258}$$
In view of (2.236), (2.238), (2.257), and (2.258),
$$d(x_j,C_{\min})\le\epsilon. \tag{2.259}$$
We claim that for all integers $i$ satisfying $j\le i\le n$, $d(x_i,C_{\min})\le\epsilon$. Assume the contrary. Then there exists an integer $k\in[j,n]$ for which
$$d(x_k,C_{\min})>\epsilon. \tag{2.260}$$
By (2.256), (2.259), and (2.260), we have
$$k>j\ge p_0. \tag{2.261}$$
By (2.259) we may assume without loss of generality that
$$d(x_i,C_{\min})\le\epsilon\ \text{ for all integers }i\text{ satisfying }j\le i<k. \tag{2.262}$$
Thus in view of (2.261) and (2.262),
$$d(x_{k-1},C_{\min})\le\epsilon. \tag{2.263}$$
There are two cases:
$$f(x_{k-1})\le\inf(f,C)+\epsilon_0; \tag{2.264}$$
$$f(x_{k-1})>\inf(f,C)+\epsilon_0. \tag{2.265}$$
Assume that (2.264) is valid. It follows from (2.7), (2.8), and (2.264) that
$$x_{k-1}\in X_0\subset B_X(0,\bar K). \tag{2.266}$$
By (2.244), (2.246), and (2.247), there exists a point $z\in C$ such that
$$\|x_{k-1}-z\|\le\delta. \tag{2.267}$$
By (2.10), (2.246), (2.247), and (2.267),
$$\|x_k-z\|\le\alpha_{k-1}\delta+\|z-P_{k-1}(x_{k-1}-\alpha_{k-1}\|v_{k-1}\|^{-1}v_{k-1}-\alpha_{k-1}\xi_{k-1})\|\le\delta+\|z-x_{k-1}\|+\alpha_{k-1}+\delta=3\delta+\alpha_{k-1}. \tag{2.268}$$
It follows from (2.240), (2.243), (2.261), and (2.268) that
$$d(x_k,C)\le 3\delta+\alpha_{k-1}<\epsilon_0. \tag{2.269}$$
In view of (2.267) and (2.268),
$$\|x_k-x_{k-1}\|\le\|x_k-z\|+\|z-x_{k-1}\|\le 4\delta+\alpha_{k-1}. \tag{2.270}$$
It follows from (2.240), (2.243), (2.261), (2.263), and (2.270) that
$$d(x_k,C_{\min})\le 2. \tag{2.271}$$
Relations (2.7), (2.8), (2.263), and (2.271) imply that $x_{k-1},x_k\in B_X(0,\bar K+2)$. Together with (2.9) and (2.270), the inclusion above implies that
$$|f(x_{k-1})-f(x_k)|\le\bar L\|x_{k-1}-x_k\|\le\bar L(4\delta+\alpha_{k-1}). \tag{2.272}$$
In view of (2.240), (2.243), (2.264), and (2.272), we have
$$f(x_k)\le f(x_{k-1})+\bar L(4\delta+\alpha_{k-1})\le\inf(f,C)+\epsilon_0+\bar L(4\delta+\alpha_{k-1})\le\inf(f,C)+2\epsilon_0. \tag{2.273}$$
It follows from (2.236), (2.238), (2.269), and (2.273) that
$$d(x_k,C_{\min})\le\epsilon.$$
This inequality contradicts (2.260). The contradiction we have reached proves (2.265). By (2.7), (2.8), and (2.263), we have
$$\|x_{k-1}\|\le\bar K+1. \tag{2.274}$$
It follows from (2.240), (2.242), (2.243), (2.245), (2.247), (2.263), (2.265), and (2.274) that Lemma 2.26 holds with $x=x_{k-1}$, $y=x_k$, $\xi=\xi_{k-1}$, $v=v_{k-1}$, $\alpha=\alpha_{k-1}$, $K_0=\bar K+1$, $\epsilon=\epsilon_0$, $\delta_f=\delta$, $\delta_C=\alpha_{k-1}\delta$, and this implies that
$$d(x_k,C_{\min})^2\le d(x_{k-1},C_{\min})^2-\alpha_{k-1}(4\bar L)^{-1}\epsilon_0+2\alpha_{k-1}^2+\alpha_{k-1}^2\delta^2+2\alpha_{k-1}\delta(2\bar K+3)$$
$$\le d(x_{k-1},C_{\min})^2-(8\bar L)^{-1}\alpha_{k-1}\epsilon_0+2\alpha_{k-1}\delta(2\bar K+4)$$
$$\le d(x_{k-1},C_{\min})^2-(16\bar L)^{-1}\alpha_{k-1}\epsilon_0\le d(x_{k-1},C_{\min})^2\le\epsilon^2.$$
This contradicts (2.260). The contradiction we have reached proves that $d(x_i,C_{\min})\le\epsilon$ for all integers $i$ satisfying $j\le i\le n$. Since $j\le n_0$, this completes the proof of Theorem 2.11.
2.14 Proof of Theorem 2.12

We may assume without loss of generality that
$$\epsilon<1,\quad M>\bar K+4. \tag{2.275}$$
Proposition 2.28 implies that there exists $\bar\epsilon\in(0,\epsilon/8)$ such that if $x\in X$, $d(x,C)\le 2\bar\epsilon$, and
$$f(x)\le\inf(f,C)+2\bar\epsilon, \tag{2.276}$$
then
$$d(x,C_{\min})\le\epsilon/4. \tag{2.277}$$
Set
$$\beta_0=(64\bar L)^{-1}\bar\epsilon. \tag{2.278}$$
Let
$$\beta_1\in(0,\beta_0). \tag{2.279}$$
There exists an integer $n_0\ge 4$ such that
$$\beta_1 n_0>16^2(3+2M)^2\bar\epsilon^{-1}\bar L. \tag{2.280}$$
Fix
$$K_*>2M+4+4n_0+2\bar K+2M \tag{2.281}$$
and a positive number $\delta$ such that
$$6\delta K_*<(64\bar L)^{-1}\bar\epsilon\beta_1. \tag{2.282}$$
Fix a point
$$\bar x\in C_{\min}. \tag{2.283}$$
Assume that an integer $n\ge n_0$,
$$\{P_i\}_{i=0}^{n-1}\subset\mathcal M,\quad P_i(X)=C,\ i=0,\dots,n-1, \tag{2.284}$$
$$\{x_i\}_{i=0}^{n}\subset X,\quad\|x_0\|\le M, \tag{2.285}$$
$$v_i\in\partial_\delta f(x_i)\setminus\{0\},\ i=0,1,\dots,n-1, \tag{2.286}$$
$$\{\alpha_i\}_{i=0}^{n-1}\subset[\beta_1,\beta_0],\quad\{\eta_i\}_{i=0}^{n-1},\{\xi_i\}_{i=0}^{n-1}\subset B_X(0,\delta), \tag{2.287}$$
and that for all integers $i=0,\dots,n-1$,
$$x_{i+1}=P_i(x_i-\alpha_i\|v_i\|^{-1}v_i-\alpha_i\xi_i)-\eta_i.$$
We claim that $d(x_k,C_{\min})\le\epsilon$ for all integers $k$ satisfying $n_0\le k\le n$. Assume that an integer
$$k\in[0,n-1], \tag{2.288}$$
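Unlike Theorem 2.11, the step sizes here stay in a fixed interval $[\beta_1,\beta_0]$. A constant step does not drive the iterates to the solution set exactly; they enter and then remain in a neighborhood whose size is governed by the step. The following toy demonstration (the objective, constraint set, and constants are illustrative assumptions) shows this plateau behavior.

```python
import numpy as np

def projected_subgradient_fixed(x0, subgrad, project, alpha, n_iter):
    # Normalized-step iteration with a constant step alpha in [beta1, beta0]:
    # the iterates reach, and then stay in, a neighborhood of the solution
    # set whose radius is governed by alpha.
    x = np.asarray(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(n_iter):
        v = subgrad(x)
        x = project(x - alpha * v / np.linalg.norm(v))
        traj.append(x.copy())
    return np.array(traj)

# Toy instance: f(x) = |x_1| + |x_2| on C = {x : ||x|| <= 2}, so Cmin = {0}.
subgrad = lambda x: np.sign(x) + (x == 0)        # a nonzero subgradient of the l1-norm
project = lambda x: x if np.linalg.norm(x) <= 2 else 2 * x / np.linalg.norm(x)
traj = projected_subgradient_fixed(np.array([1.5, -1.0]), subgrad, project,
                                   alpha=0.05, n_iter=400)
dist = np.linalg.norm(traj, axis=1)              # distances d(x_k, Cmin)
```

After an initial transient, `dist` oscillates at a level of order `alpha`, which is the quantitative picture behind the choice of $\beta_0$, $\beta_1$ above.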
$$\|x_k\|\le K_*,\quad f(x_k)>\inf(f,C)+\bar\epsilon/4. \tag{2.289}$$
It follows from (2.282), (2.283), and (2.286)–(2.289) that Lemma 2.24 holds with $\delta_f=\delta$, $\delta_C=\delta$, $\epsilon=\bar\epsilon/4$, $K_0=K_*$, $\alpha=\alpha_k$, $x=x_k$, $v=v_k$, $\xi=\xi_k+\|v_k\|^{-1}v_k$, and $y=x_{k+1}$, and this implies that
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-\alpha_k(16\bar L)^{-1}\bar\epsilon+2\alpha_k^2+\delta^2+2\delta(K_*+\bar K+2)$$
$$\le\|x_k-\bar x\|^2-\alpha_k(16\bar L)^{-1}\bar\epsilon+2\alpha_k^2+2\delta(K_*+\bar K+3).$$
Together with (2.278), (2.282), and (2.287), this implies that
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-\alpha_k(32\bar L)^{-1}\bar\epsilon+2\delta(\bar K+3+K_*)$$
$$\le\|x_k-\bar x\|^2-(32\bar L)^{-1}\bar\epsilon\beta_1+2\delta(\bar K+3+K_*)\le\|x_k-\bar x\|^2-\beta_1(64\bar L)^{-1}\bar\epsilon.$$
Thus we have shown that the following property holds:

(P2) if an integer $k\in[0,n-1]$ and (2.289) is valid, then we have
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-(64\bar L)^{-1}\beta_1\bar\epsilon.$$

We claim that there exists an integer $j\in\{1,\dots,n_0\}$ for which $f(x_j)\le\inf(f,C)+\bar\epsilon/4$. Assume the contrary. Then we have
$$f(x_j)>\inf(f,C)+\bar\epsilon/4,\ j=1,\dots,n_0. \tag{2.290}$$
It follows from (2.10), (2.283), (2.287), and (2.288) that for all integers $i=0,\dots,n-1$, we have
$$\|x_{i+1}-\bar x\|\le 1+\|x_i-\alpha_i\|v_i\|^{-1}v_i-\alpha_i\xi_i-\bar x\|\le\|x_i-\bar x\|+3. \tag{2.291}$$
By (2.281), (2.283), (2.285), and (2.291), for $i=0,\dots,n_0$,
$$\|x_i-\bar x\|\le\|x_0-\bar x\|+3i, \tag{2.292}$$
$$\|x_i\|\le 2\|\bar x\|+M+3n_0<K_*. \tag{2.293}$$
Let
$$k\in\{1,\dots,n_0-1\}. \tag{2.294}$$
It follows from (2.290), (2.293), (2.294), and property (P2) that
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-(64\bar L)^{-1}\beta_1\bar\epsilon. \tag{2.295}$$
Relations (2.295), (2.275), (2.283), (2.285), and (2.292) imply that
$$(M+\|\bar x\|+3)^2\ge\|x_1-\bar x\|^2-\|x_{n_0}-\bar x\|^2=\sum_{i=1}^{n_0-1}\left(\|x_i-\bar x\|^2-\|x_{i+1}-\bar x\|^2\right)$$
$$\ge(n_0-1)(64\bar L)^{-1}\bar\epsilon\beta_1\ge\beta_1 n_0(128\bar L)^{-1}\bar\epsilon,$$
$$(n_0/2)(64\bar L)^{-1}\bar\epsilon\beta_1\le(2M+3)^2.$$
This contradicts (2.280). The contradiction we have reached proves that there exists an integer
$$j\in\{1,\dots,n_0\} \tag{2.296}$$
for which
$$f(x_j)\le\inf(f,C)+\bar\epsilon/4. \tag{2.297}$$
By (2.284), (2.287), and (2.288), we have
$$d(x_j,C)\le\delta. \tag{2.298}$$
Relations (2.277), (2.282), (2.297), and (2.298) imply that
$$d(x_j,C_{\min})\le\epsilon. \tag{2.299}$$
We claim that for all integers $i$ satisfying $j\le i\le n$, we have $d(x_i,C_{\min})\le\epsilon$. Assume the contrary. Then there exists an integer
$$k\in[j,n] \tag{2.300}$$
for which
$$d(x_k,C_{\min})>\epsilon. \tag{2.301}$$
By (2.296) and (2.299)–(2.301),
$$k>j. \tag{2.302}$$
We may assume without loss of generality that
$$d(x_i,C_{\min})\le\epsilon\ \text{ for all integers }i\text{ satisfying }j\le i<k. \tag{2.303}$$
Then
$$d(x_{k-1},C_{\min})\le\epsilon. \tag{2.304}$$
There are two cases:
$$f(x_{k-1})\le\inf(f,C)+\bar\epsilon/4; \tag{2.305}$$
$$f(x_{k-1})>\inf(f,C)+\bar\epsilon/4. \tag{2.306}$$
Assume that (2.305) is valid. In view of (2.7), (2.8), (2.275), (2.276), and (2.305),
$$x_{k-1}\in X_0\subset B_X(0,\bar K). \tag{2.307}$$
By (2.284), (2.287), and (2.288), there exists a point $z\in C$ such that
$$\|x_{k-1}-z\|\le\delta. \tag{2.308}$$
It follows from (2.10), (2.287), (2.288), and (2.308) that
$$\|x_k-z\|\le\delta+\|z-P_{k-1}(x_{k-1}-\alpha_{k-1}\|v_{k-1}\|^{-1}v_{k-1}-\alpha_{k-1}\xi_{k-1})\|\le\delta+\|z-x_{k-1}\|+\alpha_{k-1}+\delta<3\delta+\alpha_{k-1}. \tag{2.309}$$
Relations (2.277), (2.282), (2.305), and (2.308) imply that
$$d(x_{k-1},C_{\min})\le\epsilon/4. \tag{2.310}$$
By (2.276), (2.278), (2.282), (2.287), (2.308), and (2.309),
$$\|x_k-x_{k-1}\|\le\|x_k-z\|+\|z-x_{k-1}\|\le 4\delta+\alpha_{k-1}<\bar\epsilon<\epsilon/8. \tag{2.311}$$
In view of (2.310) and (2.311), $d(x_k,C_{\min})\le\epsilon$. This inequality contradicts (2.301). The contradiction we have reached proves (2.306). In view of (2.7), (2.8), and (2.304),
$$\|x_{k-1}\|\le\bar K+1. \tag{2.312}$$
It follows from (2.286)–(2.288), (2.306), and (2.312) that Lemma 2.26 holds with $P_C=P_i$, $K_0=\bar K+1$, $x=x_{k-1}$, $y=x_k$, $v=v_{k-1}$, $\xi=\xi_{k-1}+\|v_{k-1}\|^{-1}v_{k-1}$, $\alpha=\alpha_{k-1}$, $\epsilon=4^{-1}\bar\epsilon$, $\delta_f=\delta$, $\delta_C=\delta$, and combined with (2.278), (2.282), (2.287), and (2.304), this implies that
$$d(x_k,C_{\min})^2\le d(x_{k-1},C_{\min})^2-\alpha_{k-1}(16\bar L)^{-1}\bar\epsilon+2\alpha_{k-1}^2+\delta^2+2\delta(2\bar K+4)$$
$$\le d(x_{k-1},C_{\min})^2-(32\bar L)^{-1}\alpha_{k-1}\bar\epsilon+2\delta(2\bar K+5)$$
$$\le d(x_{k-1},C_{\min})^2-(32\bar L)^{-1}\beta_1\bar\epsilon+2\delta(2\bar K+5)\le d(x_{k-1},C_{\min})^2\le\epsilon^2.$$
This contradicts (2.301). The contradiction we have reached proves that $d(x_i,C_{\min})\le\epsilon$ for all integers $i$ satisfying $j\le i\le n$. In view of the inequality $j\le n_0$, Theorem 2.12 is proved.
2.15 Proof of Theorem 2.13

We may assume without loss of generality that
$$M>\bar K+4,\quad\epsilon<1. \tag{2.313}$$
There exists $L_0>\bar L$ such that
$$|f(z_1)-f(z_2)|\le L_0\|z_1-z_2\|\ \text{ for all }z_1,z_2\in B_X(0,M+4). \tag{2.314}$$
In view of Proposition 2.28, there exists a number
$$\bar\epsilon\in(0,\epsilon/8) \tag{2.315}$$
such that if $x\in X$, $d(x,C)\le 2\bar\epsilon$, and $f(x)\le\inf(f,C)+2\bar\epsilon$, then
$$d(x,C_{\min})\le\epsilon. \tag{2.316}$$
Fix
$$\bar x\in C_{\min} \tag{2.317}$$
and
$$\epsilon_0\in(0,4^{-1}\bar\epsilon). \tag{2.318}$$
Since $\lim_{i\to\infty}\alpha_i=0$ (see (2.83)), there is an integer $p_0>0$ such that
$$M+4<p_0 \tag{2.319}$$
and that for all integers $p\ge p_0-1$, we have
$$\alpha_p<(20(\bar L+L_0+2))^{-2}\epsilon_0. \tag{2.320}$$
Since $\sum_{i=0}^{\infty}\alpha_i=\infty$ (see (2.83)), there exists a natural number
$$n_0>p_0+4 \tag{2.321}$$
such that
$$\sum_{i=p_0}^{n_0-1}\alpha_i>(4p_0+M+\|\bar x\|)^2\epsilon_0^{-1}16\bar L. \tag{2.322}$$
Fix
$$K_*>\bar K+4+M+4n_0+4\|\bar x\|+5L_0+5\bar L \tag{2.323}$$
and a positive number $\delta$ such that
$$6\delta(K_*+1)<16^{-1}\epsilon_0. \tag{2.324}$$
Assume that an integer $n\ge n_0$ and that
$$\{P_i\}_{i=0}^{n-1}\subset\mathcal M,\ \{x_i\}_{i=0}^{n}\subset X,\quad P_i(X)=C,\ i=0,\dots,n-1, \tag{2.325}$$
$$\|x_0\|\le M, \tag{2.326}$$
$\{\xi_i\}_{i=0}^{n-1}\subset X$ and that for all $i=0,\dots,n-1$,
$$B_X(\xi_i,\delta)\cap\partial_\delta f(x_i)\ne\emptyset, \tag{2.327}$$
$$\|x_{i+1}-P_i(x_i-\alpha_i\xi_i)\|\le\alpha_i\delta. \tag{2.328}$$
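Condition (2.327) only asks that $\xi_i$ lie within $\delta$ of some element of the $\delta$-subdifferential. For a concrete (purely illustrative) source of such elements: for a maximum of affine functions, the gradient of any piece that is $\delta$-active at $x$ is a $\delta$-subgradient, a standard fact sketched below with made-up data.

```python
import numpy as np

def delta_subgradient_max_affine(A, b, x, delta):
    # For f(x) = max_j (<A[j], x> + b[j]), any piece that is delta-active,
    # i.e. <A[j], x> + b[j] >= f(x) - delta, gives a delta-subgradient A[j]:
    # for every y,
    #   f(y) >= <A[j], y> + b[j] = <A[j], x> + b[j] + <A[j], y - x>
    #        >= f(x) - delta + <A[j], y - x>.
    vals = A @ x + b
    j = int(np.argmax(vals >= vals.max() - delta))   # first delta-active piece
    return A[j]

# Made-up data: f(x) = max(x_1, x_2, -x_1 - x_2) on R^2.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x0 = np.array([0.3, 0.31])
g = delta_subgradient_max_affine(A, b, x0, delta=0.05)   # returns A[0] here
```

At `x0` the first piece is within `delta` of the maximum, so `g = [1, 0]` even though the exact subdifferential at `x0` is the singleton `{[0, 1]}`; this slack is exactly what $\partial_\delta f$ tolerates.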
In order to prove the theorem, it is sufficient to show that $d(x_k,C_{\min})\le\epsilon$ for all integers $k$ satisfying $n_0\le k\le n$. By (2.83), (2.324)–(2.326), and (2.328),
$$\|x_i\|\le M+1\ \text{ for all integers }i=0,\dots,n. \tag{2.329}$$
Assume that an integer
$$k\in[p_0,n-1], \tag{2.330}$$
$$f(x_k)>\inf(f,C)+\epsilon_0. \tag{2.331}$$
In view of (2.314), (2.317), (2.323), (2.324), (2.326)–(2.328), and (2.331), the conditions of Lemma 2.21 hold with $K_0=M$, $\delta_f=\delta$, $\delta_C=\alpha_k\delta$, $\epsilon=\epsilon_0$, $\alpha=\alpha_k$, $x=x_k$, $\xi=\xi_k$, and $y=x_{k+1}$, and combined with (2.320), (2.323), (2.324), and (2.330), this lemma implies that
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-2^{-1}\alpha_k\epsilon_0+\alpha_k^2\delta^2+2\delta\alpha_k(M+\bar K+5L_0+5\bar L)+25\alpha_k^2(L_0+\bar L)^2$$
$$\le\|x_k-\bar x\|^2-4^{-1}\alpha_k\epsilon_0+2\delta\alpha_k(M+\bar K+5L_0+5\bar L)+25\alpha_k^2(L_0+\bar L)^2$$
$$\le\|x_k-\bar x\|^2-8^{-1}\alpha_k\epsilon_0+2\delta\alpha_k(M+\bar K+5L_0+5\bar L)$$
$$\le\|x_k-\bar x\|^2-16^{-1}\alpha_k\epsilon_0.$$
Thus we have shown that the following property holds:

(P3) If an integer $k\in[p_0,n-1]$ and (2.331) is valid, then we have
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-16^{-1}\alpha_k\epsilon_0.$$

We claim that there exists an integer $j\in\{p_0,\dots,n_0\}$ such that $f(x_j)\le\inf(f,C)+\epsilon_0$. Assume the contrary. Then
$$f(x_i)>\inf(f,C)+\epsilon_0,\ i=p_0,\dots,n_0. \tag{2.332}$$
It follows from (2.332) and property (P3) that
$$\|x_{i+1}-\bar x\|^2\le\|x_i-\bar x\|^2-16^{-1}\alpha_i\epsilon_0. \tag{2.333}$$
Relations (2.83), (2.313), (2.317), (2.325), (2.328), and (2.333) imply that
$$(2M+1)^2\ge\|x_{p_0}-\bar x\|^2-\|x_{n_0}-\bar x\|^2=\sum_{i=p_0}^{n_0-1}\left(\|x_i-\bar x\|^2-\|x_{i+1}-\bar x\|^2\right)\ge 16^{-1}\epsilon_0\sum_{i=p_0}^{n_0-1}\alpha_i$$
and
$$\sum_{i=p_0}^{n_0-1}\alpha_i\le 16\epsilon_0^{-1}(2M+1)^2.$$
This contradicts (2.322). The contradiction we have reached proves that there exists an integer
$$j\in\{p_0,\dots,n_0\} \tag{2.334}$$
such that
$$f(x_j)\le\inf(f,C)+\epsilon_0. \tag{2.335}$$
By (2.318), (2.324), and (2.328), we have
$$d(x_j,C)\le\alpha_{j-1}\delta<\bar\epsilon. \tag{2.336}$$
In view of (2.316), (2.335), and (2.336),
$$d(x_j,C_{\min})\le\epsilon. \tag{2.337}$$
We claim that for all integers $i$ satisfying $j\le i\le n$,
$$d(x_i,C_{\min})\le\epsilon.$$
Assume the contrary. Then there exists an integer
$$k\in[j,n] \tag{2.338}$$
for which
$$d(x_k,C_{\min})>\epsilon. \tag{2.339}$$
By (2.337)–(2.339), we have
$$k>j\ge p_0. \tag{2.340}$$
By (2.337)–(2.340), we may assume without loss of generality that
$$d(x_i,C_{\min})\le\epsilon\ \text{ for all integers }i\text{ satisfying }j\le i<k. \tag{2.341}$$
Thus in view of (2.341),
$$d(x_{k-1},C_{\min})\le\epsilon. \tag{2.342}$$
There are two cases:
$$f(x_{k-1})\le\inf(f,C)+\epsilon_0; \tag{2.343}$$
$$f(x_{k-1})>\inf(f,C)+\epsilon_0. \tag{2.344}$$
Assume that (2.343) is valid. By (2.325) and (2.328), there exists a point
$$z\in C \tag{2.345}$$
such that
$$\|x_{k-1}-z\|\le\delta\alpha_{k-2}. \tag{2.346}$$
By (2.10), (2.328), (2.345), and (2.346),
$$\|x_k-z\|\le\alpha_{k-1}\delta+\|z-P_{k-1}(x_{k-1}-\alpha_{k-1}\xi_{k-1})\|\le\delta\alpha_{k-1}+\|z-x_{k-1}\|+\alpha_{k-1}\|\xi_{k-1}\|\le\alpha_{k-1}\|\xi_{k-1}\|+\delta(\alpha_{k-1}+\alpha_{k-2}). \tag{2.347}$$
Lemma 2.19, (2.83), (2.313), (2.314), (2.345), and (2.346) imply that
$$\partial_\delta f(x_{k-1})\subset B_X(0,L_0+\delta)\subset B_X(0,L_0+1). \tag{2.348}$$
In view of (2.327) and (2.348),
$$\|\xi_{k-1}\|\le L_0+2. \tag{2.349}$$
Inequality (2.349) implies that
$$\alpha_{k-1}\|\xi_{k-1}\|\le\alpha_{k-1}(L_0+2). \tag{2.350}$$
By (2.347) and (2.350),
$$\|x_k-z\|\le\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}(L_0+2). \tag{2.351}$$
It follows from (2.320), (2.324), (2.334), (2.338), (2.345), and (2.351) that
$$d(x_k,C)\le\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}(L_0+2)\le\epsilon_0. \tag{2.352}$$
In view of (2.346) and (2.351),
$$\|x_k-x_{k-1}\|\le\|x_k-z\|+\|z-x_{k-1}\|\le\delta\alpha_{k-2}+\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}(L_0+2). \tag{2.353}$$
It follows from (2.83), (2.314), (2.346), (2.352), and (2.353) that
$$|f(x_{k-1})-f(x_k)|\le L_0\|x_{k-1}-x_k\|\le 2L_0\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}L_0(L_0+2). \tag{2.354}$$
In view of (2.320), (2.324), (2.334), (2.340), (2.343), and (2.354), we have
$$f(x_k)\le f(x_{k-1})+2L_0\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}L_0(L_0+2)\le\inf(f,C)+\epsilon_0+8^{-1}\epsilon_0+20^{-1}\epsilon_0\le\inf(f,C)+2\epsilon_0. \tag{2.355}$$
It follows from (2.316), (2.318), (2.352), and (2.355) that $d(x_k,C_{\min})\le\epsilon$. This inequality contradicts (2.339). The contradiction we have reached proves (2.344). It follows from (2.83), (2.314), (2.318), (2.323)–(2.325), (2.328), and (2.344) that Lemma 2.22 holds with $x=x_{k-1}$, $y=x_k$, $\xi=\xi_{k-1}$, $\alpha=\alpha_{k-1}$, $K_0=M+1$, $\epsilon=\epsilon_0$, $\delta_f=\delta$, $\delta_C=\alpha_{k-1}\delta$, and combined with (2.320), (2.323), (2.324), (2.334), (2.340), and (2.342), this implies that
$$d(x_k,C_{\min})^2\le d(x_{k-1},C_{\min})^2-2^{-1}\alpha_{k-1}\epsilon_0+25\alpha_{k-1}^2(L_0+\bar L)^2+\alpha_{k-1}^2\delta^2+2\alpha_{k-1}\delta(M+1+\bar K+5L_0+5\bar L)$$
$$\le d(x_{k-1},C_{\min})^2-4^{-1}\alpha_{k-1}\epsilon_0+2\alpha_{k-1}\delta(M+2+2\bar K+5L_0+5\bar L)$$
$$\le d(x_{k-1},C_{\min})^2-8^{-1}\alpha_{k-1}\epsilon_0<d(x_{k-1},C_{\min})^2\le\epsilon^2$$
and $d(x_k,C_{\min})\le\epsilon$. This contradicts (2.339). The contradiction we have reached proves that $d(x_i,C_{\min})\le\epsilon$ for all integers $i$ satisfying $j\le i\le n$. This completes the proof of Theorem 2.13.
2.16 Proof of Theorem 2.14

Fix
$$\bar x\in C_{\min}. \tag{2.356}$$
In view of Proposition 2.28, there exists a number
$$\bar\epsilon\in(0,\epsilon/8) \tag{2.357}$$
such that if $x\in X$, $d(x,C)\le 2\bar\epsilon$ and $f(x)\le\inf(f,C)+2\bar\epsilon$, then
$$d(x,C_{\min})\le\epsilon. \tag{2.358}$$
Fix
$$\epsilon_0\in(0,4^{-1}\bar\epsilon). \tag{2.359}$$
Since $\lim_{i\to\infty}\alpha_i=0$ (see (2.88)), there is an integer $p_0>0$ such that
$$M+4<p_0 \tag{2.360}$$
and that for all integers $p\ge p_0-1$, we have
$$\alpha_p<(20(\bar L+L_0+2))^{-2}\epsilon_0. \tag{2.361}$$
Since $\sum_{i=0}^{\infty}\alpha_i=\infty$ (see (2.88)), there exists a natural number
$$n_0>p_0+4 \tag{2.362}$$
such that
$$\sum_{i=p_0}^{n_0-1}\alpha_i>16(4p_0+2M+\|\bar x\|+2)^2\epsilon_0^{-1}. \tag{2.363}$$
Fix
$$K_*>\bar K+4+3M+4n_0+4\|\bar x\|+5L_0+5\bar L \tag{2.364}$$
and a positive number $\delta$ such that
$$6\delta(K_*+1)<16^{-1}\epsilon_0. \tag{2.365}$$
Assume that an integer $n\ge n_0$ and that
$$\{P_i\}_{i=0}^{n-1}\subset\mathcal M,\ \{x_i\}_{i=0}^{n}\subset X,\quad P_i(X)=C,\ i=0,\dots,n-1, \tag{2.366}$$
$$\|x_0\|\le M, \tag{2.367}$$
$\{\xi_i\}_{i=0}^{n-1}\subset X$ and that for all $i=0,\dots,n-1$,
$$B_X(\xi_i,\delta)\cap\partial_\delta f(x_i)\ne\emptyset, \tag{2.368}$$
$$\|x_{i+1}-P_i(x_i-\alpha_i\xi_i)\|\le\alpha_i\delta \tag{2.369}$$
and (2.88) is true. In order to prove the theorem, it is sufficient to show that $d(x_k,C_{\min})\le\epsilon$ for all integers $k$ satisfying $n_0\le k\le n$. We show that for all $t=0,\dots,n$,
$$\|x_t-\bar x\|\le 2+M+\bar K. \tag{2.370}$$
(In view of (2.356) and (2.367), the inequality above holds with $t=0$.) Assume that an integer $t$ satisfies $0\le t<n$ and that (2.370) is true. We show that our assumption holds for $t+1$ too. There are two cases:
$$f(x_t)\le\inf(f,C)+4; \tag{2.371}$$
$$f(x_t)>\inf(f,C)+4. \tag{2.372}$$
Assume that (2.371) holds. In view of (2.7), (2.8), and (2.371),
$$\|x_t\|\le\bar K. \tag{2.373}$$
Lemma 2.19, (2.9), and (2.373) imply that
$$\partial_\delta f(x_t)\subset B_X(0,\bar L+1). \tag{2.374}$$
By (2.365), (2.368), and (2.374),
$$\|\xi_t\|\le\bar L+2. \tag{2.375}$$
It follows from (2.7), (2.8), (2.10), (2.88), (2.356), (2.365), (2.366), (2.369), (2.373), and (2.375) that
$$\|x_{t+1}-\bar x\|\le\|x_{t+1}-P_t(x_t-\alpha_t\xi_t)\|+\|P_t(x_t-\alpha_t\xi_t)-\bar x\|\le\alpha_t\delta+\|x_t-\alpha_t\xi_t-\bar x\|\le\alpha_t\delta+\|x_t-\bar x\|+\alpha_t\|\xi_t\|\le 1+2\bar K+\alpha_t\|\xi_t\|\le 3+2\bar K\le M+2+\bar K.$$
Thus
$$\|x_{t+1}-\bar x\|\le M+2+\bar K. \tag{2.376}$$
Assume that (2.372) holds. In view of (2.356), (2.364), (2.365), (2.368), (2.369), and (2.372), we apply Lemma 2.21 with $P_C=P_t$, $\delta_f=\delta$, $\delta_C=\alpha_t\delta$, $K_0=M+4+\bar K$, $\alpha=\alpha_t$, $\xi=\xi_t$, $x=x_t$, $y=x_{t+1}$, $\epsilon=4$, and this lemma together with (2.88), (2.364), (2.365), and (2.370) implies that
$$\|x_{t+1}-\bar x\|^2\le\|x_t-\bar x\|^2-2\alpha_t+\alpha_t^2\delta^2+2\alpha_t\delta(2\bar K+M+2+5L_0+5\bar L)+25\alpha_t^2(L_0+\bar L)^2$$
$$\le\|x_t-\bar x\|^2-\alpha_t+2\alpha_t\delta(2\bar K+M+3+5L_0+5\bar L)\le\|x_t-\bar x\|^2$$
and
$$\|x_{t+1}-\bar x\|\le\|x_t-\bar x\|\le\bar K+M+2.$$
Thus (2.376) holds in both cases. Therefore by induction we showed that (2.370) holds for all $t=0,\dots,n$. Assume that an integer
$$k\in[p_0,n-1], \tag{2.377}$$
$$f(x_k)>\inf(f,C)+\epsilon_0. \tag{2.378}$$
In view of (2.356), (2.359), (2.364), (2.365), (2.368)–(2.370), (2.377), and (2.378), the conditions of Lemma 2.21 hold with $K_0=2M+4$, $\delta_f=\delta$, $\delta_C=\alpha_k\delta$, $\epsilon=\epsilon_0$, $\alpha=\alpha_k$, $x=x_k$, $\xi=\xi_k$, and $y=x_{k+1}$, and combined with (2.361), (2.365), and (2.377), this lemma implies that
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-2^{-1}\alpha_k\epsilon_0+\alpha_k^2\delta^2+2\delta\alpha_k(2M+2\bar K+5L_0+5\bar L+4)+25\alpha_k^2(L_0+\bar L)^2$$
$$\le\|x_k-\bar x\|^2-4^{-1}\alpha_k\epsilon_0+2\delta\alpha_k(2M+2\bar K+5L_0+5\bar L+4)\le\|x_k-\bar x\|^2-8^{-1}\alpha_k\epsilon_0.$$
Thus we have shown that the following property holds:

(P4) If an integer $k\in[p_0,n-1]$ and (2.378) is valid, then we have
$$\|x_{k+1}-\bar x\|^2\le\|x_k-\bar x\|^2-8^{-1}\alpha_k\epsilon_0.$$

We claim that there exists an integer $j\in\{p_0,\dots,n_0\}$ such that $f(x_j)\le\inf(f,C)+\epsilon_0$. Assume the contrary. Then
$$f(x_i)>\inf(f,C)+\epsilon_0,\ i=p_0,\dots,n_0. \tag{2.379}$$
It follows from (2.379) and property (P4) that for all $i=p_0,\dots,n_0-1$,
$$\|x_{i+1}-\bar x\|^2\le\|x_i-\bar x\|^2-8^{-1}\alpha_i\epsilon_0. \tag{2.380}$$
Relations (2.370) and (2.380) imply that
$$(M+\bar K+2)^2\ge\|x_{p_0}-\bar x\|^2-\|x_{n_0}-\bar x\|^2=\sum_{i=p_0}^{n_0-1}\left(\|x_i-\bar x\|^2-\|x_{i+1}-\bar x\|^2\right)\ge 8^{-1}\epsilon_0\sum_{i=p_0}^{n_0-1}\alpha_i$$
and
$$\sum_{i=p_0}^{n_0-1}\alpha_i\le 8\epsilon_0^{-1}(M+\bar K+2)^2.$$
This contradicts (2.363). The contradiction we have reached proves that there exists an integer
$$j\in\{p_0,\dots,n_0\} \tag{2.381}$$
such that
$$f(x_j)\le\inf(f,C)+\epsilon_0. \tag{2.382}$$
By (2.365), (2.366), and (2.369), we have
$$d(x_j,C)\le\alpha_{j-1}\delta<\bar\epsilon. \tag{2.383}$$
In view of (2.358), (2.359), (2.382), and (2.383),
$$d(x_j,C_{\min})\le\epsilon. \tag{2.384}$$
We claim that for all integers $i$ satisfying $j\le i\le n$, $d(x_i,C_{\min})\le\epsilon$. Assume the contrary. Then there exists an integer
$$k\in[j,n] \tag{2.385}$$
for which
$$d(x_k,C_{\min})>\epsilon. \tag{2.386}$$
By (2.384)–(2.386), we have
$$k>j\ge p_0. \tag{2.387}$$
We may assume without loss of generality that
$$d(x_i,C_{\min})\le\epsilon\ \text{ for all integers }i\text{ satisfying }j\le i<k. \tag{2.388}$$
Thus in view of (2.388),
$$d(x_{k-1},C_{\min})\le\epsilon. \tag{2.389}$$
There are two cases:
$$f(x_{k-1})\le\inf(f,C)+\epsilon_0; \tag{2.390}$$
$$f(x_{k-1})>\inf(f,C)+\epsilon_0. \tag{2.391}$$
Assume that (2.390) is valid. By (2.366) and (2.369), there exists a point $z\in C$ such that
$$\|x_{k-1}-z\|\le\delta\alpha_{k-2}. \tag{2.392}$$
By (2.10), (2.366), (2.369), and (2.392),
$$\|x_k-z\|\le\alpha_{k-1}\delta+\|z-P_{k-1}(x_{k-1}-\alpha_{k-1}\xi_{k-1})\|\le\delta\alpha_{k-1}+\|z-x_{k-1}\|+\alpha_{k-1}\|\xi_{k-1}\|\le\alpha_{k-1}\|\xi_{k-1}\|+\delta(\alpha_{k-1}+\alpha_{k-2}). \tag{2.393}$$
Lemma 2.19, (2.87), and (2.370) imply that
$$\partial_\delta f(x_{k-1})\subset B_X(0,L_0+\delta)\subset B_X(0,L_0+1). \tag{2.394}$$
In view of (2.368) and (2.394), $\|\xi_{k-1}\|\le L_0+2$. By the relation above,
$$\alpha_{k-1}\|\xi_{k-1}\|\le\alpha_{k-1}(L_0+2).$$
Together with (2.393), this implies that
$$\|x_k-z\|\le\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}(L_0+2). \tag{2.395}$$
It follows from the inclusion $z\in C$, (2.361), (2.365), and (2.387) that
$$d(x_k,C)\le\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}(L_0+2)\le\epsilon_0. \tag{2.396}$$
In view of (2.392) and (2.395),
$$\|x_k-x_{k-1}\|\le\|x_k-z\|+\|z-x_{k-1}\|\le\delta\alpha_{k-2}+\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}(L_0+2). \tag{2.397}$$
It follows from (2.87), (2.370), and (2.397) that
$$|f(x_{k-1})-f(x_k)|\le L_0\|x_{k-1}-x_k\|\le 2L_0\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}L_0(L_0+2). \tag{2.398}$$
In view of (2.361), (2.364), (2.365), (2.387), (2.390), and (2.398), we have
$$f(x_k)\le f(x_{k-1})+2L_0\delta(\alpha_{k-1}+\alpha_{k-2})+\alpha_{k-1}L_0(L_0+2)\le\inf(f,C)+\epsilon_0+8^{-1}\epsilon_0+20^{-1}\epsilon_0\le\inf(f,C)+2\epsilon_0. \tag{2.399}$$
It follows from (2.358), (2.359), (2.396), and (2.399) that $d(x_k,C_{\min})\le\epsilon$. This inequality contradicts (2.386). The contradiction we have reached proves (2.391). It follows from (2.87), (2.356), (2.365), (2.368)–(2.370), and (2.391) that Lemma 2.22 holds with $x=x_{k-1}$, $y=x_k$, $\xi=\xi_{k-1}$, $\alpha=\alpha_{k-1}$, $K_0=M+2\bar K+2$, $\epsilon=\epsilon_0$, $\delta_f=\delta$, $\delta_C=\alpha_{k-1}\delta$, and combined with (2.361), (2.364), (2.365), (2.387), and (2.389), this implies that
$$d(x_k,C_{\min})^2\le d(x_{k-1},C_{\min})^2-2^{-1}\alpha_{k-1}\epsilon_0+25\alpha_{k-1}^2(L_0+\bar L)^2+\alpha_{k-1}^2\delta^2+2\alpha_{k-1}\delta(M+2+2\bar K+5L_0+5\bar L)$$
$$\le d(x_{k-1},C_{\min})^2-4^{-1}\alpha_{k-1}\epsilon_0+2\alpha_{k-1}\delta(M+3+2\bar K+5L_0+5\bar L)\le d(x_{k-1},C_{\min})^2-8^{-1}\alpha_{k-1}\epsilon_0$$
and
$$d(x_k,C_{\min})\le d(x_{k-1},C_{\min})\le\epsilon.$$
This contradicts (2.386). The contradiction we have reached proves that $d(x_i,C_{\min})\le\epsilon$ for all integers $i$ satisfying $j\le i\le n$. This completes the proof of Theorem 2.14.
2.17 Proof of Theorem 2.15

We may assume without loss of generality that
$$\epsilon<1,\quad M>\bar K+4. \tag{2.400}$$
There exists $L_0>\bar L$ such that
$$|f(z_1)-f(z_2)|\le L_0\|z_1-z_2\|\ \text{ for all }z_1,z_2\in B_X(0,3M+4). \tag{2.401}$$
Proposition 2.28 implies that there exists
$$\bar\epsilon\in(0,\epsilon/8) \tag{2.402}$$
such that if $x\in X$, $d(x,C)\le 2\bar\epsilon$, and $f(x)\le\inf(f,C)+2\bar\epsilon$, then
$$d(x,C_{\min})\le\epsilon. \tag{2.403}$$
Set
$$\beta_0=(64(L_0+\bar L))^{-2}\bar\epsilon. \tag{2.404}$$
Let
$$\beta_1\in(0,\beta_0). \tag{2.405}$$
There exists an integer $n_0\ge 4$ such that
$$\beta_1 n_0>16^2(3+2M)^2\bar\epsilon^{-1}L_0. \tag{2.406}$$
Fix
$$K_*>6M+4+4n_0+5L_0+5\bar L \tag{2.407}$$
and a positive number $\delta$ such that
$$6\delta K_*<(64L_0)^{-1}\bar\epsilon\beta_1. \tag{2.408}$$
Assume that an integer $n\ge n_0$,
$$\{P_i\}_{i=0}^{n-1}\subset\mathcal M,\quad P_i(X)=C,\ i=0,\dots,n-1, \tag{2.409}$$
$$\{x_i\}_{i=0}^{n}\subset X,\quad\|x_0\|\le M, \tag{2.410}$$
$$\{\alpha_i\}_{i=0}^{n-1}\subset[\beta_1,\beta_0], \tag{2.411}$$
$\{\xi_i\}_{i=0}^{n-1}\subset X$, and that for all integers $i=0,\dots,n-1$,
$$B_X(\xi_i,\delta)\cap\partial_\delta f(x_i)\ne\emptyset, \tag{2.412}$$
$$\|x_{i+1}-P_i(x_i-\alpha_i\xi_i)\|\le\delta. \tag{2.413}$$
In order to prove the theorem, it is sufficient to show that $d(x_k,C_{\min})\le\epsilon$ for all integers $k$ satisfying $n_0\le k\le n$. Fix a point
$$\bar x\in C_{\min}. \tag{2.414}$$
We show that for all $t=0,\dots,n$,
$$\|x_t-\bar x\|\le 2+M+\bar K. \tag{2.415}$$
In view of (2.7), (2.8), (2.410), and (2.414), (2.415) holds with $t=0$. Assume that an integer $t$ satisfies $0\le t<n$ and that (2.415) is true. There are two cases:
$$f(x_t)\le\inf(f,C)+4; \tag{2.416}$$
$$f(x_t)>\inf(f,C)+4. \tag{2.417}$$
Assume that (2.416) holds. In view of (2.7), (2.8), and (2.416),
$$\|x_t\|\le\bar K. \tag{2.418}$$
Lemma 2.19, (2.401), and (2.418) imply that
$$\partial_\delta f(x_t)\subset B_X(0,\bar L+1). \tag{2.419}$$
By (2.412) and (2.419),
$$\|\xi_t\|\le\bar L+2. \tag{2.420}$$
It follows from (2.7), (2.8), (2.10), (2.404), (2.411), (2.413), (2.414), (2.418), and (2.420) that
$$\|x_{t+1}-\bar x\|\le\|x_{t+1}-P_t(x_t-\alpha_t\xi_t)\|+\|P_t(x_t-\alpha_t\xi_t)-\bar x\|\le\delta+\|x_t-\alpha_t\xi_t-\bar x\|\le\delta+\|x_t-\bar x\|+\alpha_t\|\xi_t\|\le 1+2\bar K+\beta_0\|\xi_t\|\le 3+2\bar K\le M+\bar K+2.$$
Thus
$$\|x_{t+1}-\bar x\|\le M+2+\bar K.$$
Assume that (2.417) holds. In view of (2.7), (2.8), (2.407), (2.408), (2.412)–(2.415), and (2.417), Lemma 2.21 holds with $P_C=P_t$, $\delta_f=\delta$, $\delta_C=\delta$, $K_0=M+2\bar K+2$, $\alpha=\alpha_t$, $\xi=\xi_t$, $x=x_t$, $y=x_{t+1}$, $\epsilon=4$, and this lemma together with (2.48), (2.404), (2.407), (2.408), and (2.411) implies that
$$\|x_{t+1}-\bar x\|^2\le\|x_t-\bar x\|^2-2\alpha_t+\delta^2+2\delta(3\bar K+M+2+5L_0+5\bar L)+25\alpha_t^2(L_0+\bar L)^2$$
$$\le\|x_t-\bar x\|^2-\alpha_t+\delta(3\bar K+M+3+5L_0+5\bar L)$$
$$\le\|x_t-\bar x\|^2-\beta_1+\delta(2\bar K+2M+3+5L_0+5\bar L)\le\|x_t-\bar x\|^2$$
and
$$\|x_{t+1}-\bar x\|\le\bar K+M+2$$
in both cases. Therefore by induction we showed that (2.415) holds for all $t=0,\dots,n$. Assume that an integer
$$k\in[0,n-1], \tag{2.421}$$
$$f(x_k)>\inf(f,C)+\bar\epsilon. \tag{2.422}$$
2.17 Proof of Theorem 2.15
75
It follows from (2.401), (2.404), (2.407), (2.408), (2.412)–(2.415), and (2.422) that Lemma 2.21 holds with δf = δ, δC = δ, = ¯ , K0 = K¯ + 2M + 2, x = xk , ξ = ξk , and y = xk+1 , and this Lemma together with (2.411) implies that ¯ 2 ≤ xk − x ¯ 2 − 2−1 αk ¯ xk+1 − x ¯ + 25αk2 (L0 + L) ¯ 2 +δ 2 + 2δ(2K¯ + 2M + 2 + 5L0 + L) ¯ ≤ xk − x ¯ 2 − 4−1 αk ¯ + 2δ(2K¯ + 2M + 3 + 5L0 + 5L) ¯ ≤ xk − x ¯ 2 − 4−1 β1 ¯ + 2δ(2K¯ + 2M + 3 + 5L0 + 5L) ≤ xk − x ¯ 2 − 8−1 β1 . ¯ Thus we have shown that the following property holds: (P5) if an integer k ∈ [0, n − 1] and (2.422) is valid, then we have xk+1 − x ¯ 2 ≤ xk − x ¯ 2 − 8−1 β1 ¯ . We claim that there exists an integer j ∈ {1, . . . , n0 } for which f (xj ) ≤ inf(f, C) + ¯ . Assume the contrary. Then we have f (xj ) > inf(f, C) + ¯ , j = 1, . . . , n0 .
(2.423)
It follows from (2.423) and property (P5) that for all k = 0, . . . , n0 − 1, ¯ 2 ≤ xk − x ¯ 2 − 8−1 β1 ¯ . xk+1 − x
(2.424)
Relations (2.7), (2.8), (2.410), (2.414), and (2.424) imply that ¯ 2 ≥ x0 − x ¯ 2 − xn0 − x ¯ 2 (M + K) =
n 0 −1
[xi − x ¯ 2 − xi+1 − x ¯ 2 ] ≥ 8−1 n0 ¯ β1 ,
i=0
¯ 2 ¯ −1 β −1 . n0 ≤ 8(M + K) 1 This contradicts (2.406). The contradiction we have reached proves that there exists an integer j ∈ {p0 , . . . , n0 } for which
$$f(x_j)\le\inf(f,C)+\bar\epsilon. \tag{2.425}$$
By (2.408), (2.409), and (2.413), we have
$$d(x_j,C)\le\delta<\bar\epsilon. \tag{2.426}$$
Relations (2.403), (2.425), and (2.426) imply that
$$d(x_j,C_{\min})\le\epsilon. \tag{2.427}$$
We claim that for all integers $i$ satisfying $j\le i\le n$, we have $d(x_i,C_{\min})\le\epsilon$. Assume the contrary. Then there exists an integer
$$k\in[j,n] \tag{2.428}$$
for which
$$d(x_k,C_{\min})>\epsilon. \tag{2.429}$$
By (2.426), (2.428), and (2.429),
$$k>j. \tag{2.430}$$
In view of (2.426) and (2.429), we may assume without loss of generality that
$$d(x_i,C_{\min})\le\epsilon\ \text{ for all integers }i\text{ satisfying }j\le i<k. \tag{2.431}$$
In particular,
$$d(x_{k-1},C_{\min})\le\epsilon. \tag{2.432}$$
There are two cases:
$$f(x_{k-1})\le\inf(f,C)+\bar\epsilon; \tag{2.433}$$
$$f(x_{k-1})>\inf(f,C)+\bar\epsilon. \tag{2.434}$$
Assume that (2.433) is valid. In view of (2.409) and (2.413), there exists a point
$$z\in C \tag{2.435}$$
such that
$$\|x_{k-1}-z\|\le\delta. \tag{2.436}$$
It follows from (2.10), (2.411), (2.413), (2.435), and (2.436) that
$$\|x_k-z\|\le\delta+\|z-P_{k-1}(x_{k-1}-\alpha_{k-1}\xi_{k-1})\|\le\delta+\|z-x_{k-1}\|+\alpha_{k-1}\|\xi_{k-1}\|\le 2\delta+\beta_0\|\xi_{k-1}\|. \tag{2.437}$$
Lemma 2.19, (2.401), (2.414), and (2.415) imply that
$$\partial_\delta f(x_{k-1})\subset B_X(0,L_0+\delta)\subset B_X(0,L_0+1). \tag{2.438}$$
In view of (2.412) and (2.438),
$$\|\xi_{k-1}\|\le L_0+2. \tag{2.439}$$
By (2.404), (2.408), (2.435), (2.437), and (2.439),
$$\|x_k-z\|\le 2\delta+\beta_0(L_0+2), \tag{2.440}$$
$$d(x_k,C)\le 2\delta+\beta_0(L_0+2)<\bar\epsilon. \tag{2.441}$$
By (2.436) and (2.440),
$$\|x_k-x_{k-1}\|\le\|x_k-z\|+\|z-x_{k-1}\|\le 3\delta+\beta_0(L_0+2). \tag{2.442}$$
In view of (2.401), (2.414), (2.415), and (2.442),
$$|f(x_{k-1})-f(x_k)|\le L_0\|x_k-x_{k-1}\|\le 3L_0\delta+\beta_0L_0(L_0+2). \tag{2.443}$$
In view of (2.404), (2.408), and (2.443),
$$f(x_k)\le f(x_{k-1})+3L_0\delta+\beta_0L_0(L_0+2)\le\inf(f,C)+\bar\epsilon+8^{-1}\bar\epsilon+8^{-1}\bar\epsilon<\inf(f,C)+2\bar\epsilon. \tag{2.444}$$
It follows from (2.403), (2.441), and (2.444) that
$$d(x_k,C_{\min})\le\epsilon.$$
This inequality contradicts (2.429). The contradiction we have reached proves (2.434). It follows from (2.401), (2.407), (2.408), (2.412)–(2.415), (2.432), and (2.434) that Lemma 2.22 holds with $P_C=P_k$, $K_0=M+2\bar K+2$, $x=x_{k-1}$, $y=x_k$, $\xi=\xi_{k-1}$, $\alpha=\alpha_{k-1}$, $\epsilon=\bar\epsilon$, $\delta_f=\delta$, $\delta_C=\delta$, and combined with (2.404) and (2.411), this implies that
$$d(x_k,C_{\min})^2\le d(x_{k-1},C_{\min})^2-2^{-1}\alpha_{k-1}\bar\epsilon+25\alpha_{k-1}^2(L_0+\bar L)^2+\delta^2+2\delta(M+3\bar K+2+5L_0+5\bar L)$$
$$\le d(x_{k-1},C_{\min})^2-4^{-1}\alpha_{k-1}\bar\epsilon+2\delta(M+3\bar K+3+5L_0+5\bar L)$$
$$\le d(x_{k-1},C_{\min})^2-4^{-1}\beta_1\bar\epsilon+2\delta(M+3\bar K+3+5L_0+5\bar L)\le d(x_{k-1},C_{\min})^2-8^{-1}\beta_1\bar\epsilon$$
and
$$d(x_k,C_{\min})\le d(x_{k-1},C_{\min})\le\epsilon.$$
This contradicts (2.429). The contradiction we have reached proves that $d(x_i,C_{\min})\le\epsilon$ for all integers $i$ satisfying $j\le i\le n$. Theorem 2.15 is proved.
2.18 Proof of Theorem 2.16

By (2.94), (2.95), (2.97), and (2.99), for each integer $i\ge 0$,
$$\|x_i\|\le K_1, \tag{2.445}$$
$$\|\xi_i\|\le L_1. \tag{2.446}$$
In view of (2.10), (2.96), (2.99), and (2.446), for each integer $i\ge 1$,
$$x_i\in C, \tag{2.447}$$
$$\|x_i-x_{i+1}\|=\|x_i-P_i(x_i-\alpha_i\xi_i)\|\le\|x_i-(x_i-\alpha_i\xi_i)\|\le\alpha_i\|\xi_i\|\le\alpha_iL_1. \tag{2.448}$$
By (2.448),
$$\sum_{i=1}^{\infty}\|x_i-x_{i+1}\|\le L_1\sum_{i=1}^{\infty}\alpha_i<\infty.$$
Thus $\{x_i\}_{i=1}^{\infty}$ is a Cauchy sequence, and there exists
$$x_*=\lim_{i\to\infty}x_i. \tag{2.449}$$
Assume that
$$x_*\notin C_{\min}. \tag{2.450}$$
It follows from (2.447) and (2.449) that
$$x_*\in C. \tag{2.451}$$
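The Cauchy argument behind (2.448)–(2.449) is purely about summability: bounded subgradients make each increment at most $L_1\alpha_i$, so summable steps force the tail sums, which dominate $\|x_i-x_m\|$ for all $m>i$, to vanish. A quick numeric check (the constant $L_1$ and the step sequence below are made up for illustration):

```python
import numpy as np

# With bounded subgradients (||xi_i|| <= L1) each increment satisfies
# ||x_i - x_{i+1}|| <= L1 * a_i, so for m > i,
#   ||x_i - x_m|| <= sum_{j=i}^{m-1} L1 * a_j <= tails[i],
# and summable steps make the tail sums vanish (the Cauchy criterion).
L1 = 2.0
steps = 1.0 / np.arange(1, 10_000, dtype=float) ** 2   # summable: sum < pi^2/6
bounds = L1 * steps
tails = np.cumsum(bounds[::-1])[::-1]                  # tails[i] = sum_{j>=i} L1*a_j
```

The monotone decay of `tails` to zero is exactly what guarantees the existence of the limit $x_*$ in (2.449).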
In view of (2.450) and (2.451), there exists $\epsilon\in(0,1)$ such that
$$f(x_*)>\inf(f,C)+\epsilon. \tag{2.452}$$
In view of (2.449), there exists a natural number $n_0$ such that for all integers $n\ge n_0$,
$$f(x_n)>\inf(f,C)+\epsilon, \tag{2.453}$$
$$\alpha_n\le 100^{-1}\epsilon(L_1+\bar L)^{-2}. \tag{2.454}$$
Fix
$$z\in C_{\min}. \tag{2.455}$$
Applying Lemma 2.21 with $K_0=K_1$, $L_0=L_1$, $x=x_t$, $y=x_{t+1}$, and $\xi=\xi_t$ and with arbitrary sufficiently small $\delta_f,\delta_C>0$, we obtain that for all integers $t\ge n_0$,
$$\|x_{t+1}-z\|^2\le\|x_t-z\|^2-2^{-1}\alpha_t\epsilon+25\alpha_t^2(L_1+\bar L)^2\le\|x_t-z\|^2-4^{-1}\alpha_t\epsilon.$$
Theorem 2.16 is proved.
2.19 Proof of Theorem 2.17

Fix
$$\bar x\in C_{\min}. \tag{2.456}$$
We show that for all $i=1,2,\dots$,
$$d(x_i,C_{\min})\le 1+K_1+\bar K.$$
We have
$$\|x_0\|\le K_1. \tag{2.457}$$
By (2.7), (2.8), (2.10), (2.101), (2.104), (2.456), and (2.457),
$$\|x_1-\bar x\|=\|\bar x-P_0(x_0-\alpha_0\xi_0)\|\le\|\bar x-x_0\|+\alpha_0\|\xi_0\|. \tag{2.458}$$
In view of (2.100), (2.103), and (2.457),
$$\|\xi_0\|\le L_1+1. \tag{2.459}$$
It follows from (2.7), (2.8), (2.102), and (2.457)–(2.459) that
$$\|x_1-\bar x\|\le K_1+\bar K+1. \tag{2.460}$$
Assume that $t\ge 1$ is an integer and
$$d(x_t,C_{\min})\le K_1+\bar K+1. \tag{2.461}$$
In view of (2.456) and (2.461),
$$\|x_t\|\le K_1+2\bar K+1. \tag{2.462}$$
There are two cases:
$$f(x_t)\le\inf(f,C)+4; \tag{2.463}$$
$$f(x_t)>\inf(f,C)+4. \tag{2.464}$$
Assume that (2.463) holds. In view of (2.7), (2.8), and (2.463),
$$\|x_t\|\le\bar K. \tag{2.465}$$
In view of (2.465),
$$\|\bar x-x_t\|\le 2\bar K. \tag{2.466}$$
By (2.10), (2.101), (2.104), and (2.466),
$$\|x_{t+1}-\bar x\|=\|P_t(x_t-\alpha_t\xi_t)-\bar x\|\le\|x_t-\bar x\|+\alpha_t\|\xi_t\|. \tag{2.467}$$
It follows from (2.100), (2.103), and (2.465) that
$$\|\xi_t\|\le L_1. \tag{2.468}$$
It follows from (2.102) and (2.466)–(2.468) that
$$\|x_{t+1}-\bar x\|\le 2\bar K+\alpha_tL_1\le 2\bar K+1. \tag{2.469}$$
Assume that (2.464) holds. In view of (2.7), (2.8), (2.102), (2.461), and (2.464), we apply Lemma 2.23 with $P_C=P_t$, $K_0=3K_1+1$, $L_0=L_1$, $\alpha=\alpha_t$, $\xi=\xi_t$, $x=x_t$, $y=x_{t+1}$ and with arbitrary sufficiently small positive $\delta_f,\delta_C$ and obtain that
$$d(x_{t+1},C_{\min})^2\le d(x_t,C_{\min})^2-2\alpha_t+25\alpha_t^2(L_1+\bar L)^2\le d(x_t,C_{\min})^2-\alpha_t\le d(x_t,C_{\min})^2$$
and
$$d(x_{t+1},C_{\min})\le K_1+\bar K+1.$$
Together with (2.456) and (2.459), this implies that the inequality above holds in both cases. Therefore by induction we showed that
$$d(x_t,C_{\min})\le K_1+\bar K+1\ \text{ for all integers }t\ge 0. \tag{2.470}$$
In view of (2.7), (2.8), (2.456), and (2.470),
‖x_t‖ ≤ K_1 + 2K̄ + 1 for all integers t ≥ 0.
(2.471)
It follows from (2.100), (2.103), and (2.471) that
‖ξ_t‖ ≤ L_1 for all integers t ≥ 0.
(2.472)
By (2.10), (2.101), (2.104), and (2.472),
‖x_{t+1} − x_t‖ = ‖P_t(x_t − α_tξ_t) − x_t‖ ≤ α_t‖ξ_t‖ ≤ α_tL_1.
Therefore
Σ_{t=0}^{∞} ‖x_{t+1} − x_t‖ ≤ L_1 Σ_{t=0}^{∞} α_t < ∞.
(2.473)
This implies that there exists
x_* = lim_{t→∞} x_t ∈ C.
(2.474)
Assume that x_* ∉ C_min. By the relation above, there exists ε ∈ (0, 1) such that
f(x_*) > inf(f, C) + ε.
(2.475)
In view of (2.474) and (2.475), there exists a natural number n_0 such that for all integers n ≥ n_0,
f(x_n) > inf(f, C) + ε,
(2.476)
α_n ≤ 100^{-1}(L_1 + L̄)^{-2}.
(2.477)
Let z ∈ C_min. In view of (2.100), (2.103), (2.104), (2.427), (2.456), and (2.471), we apply Lemma 2.21 with K_0 = 3K_1, L_0 = L_1, x = x_t, y = x_{t+1}, and ξ = ξ_t and with arbitrary sufficiently small δ_f, δ_C > 0 and obtain that for all integers t ≥ n_0,
‖x_{t+1} − z‖² ≤ ‖x_t − z‖² − 2^{-1}α_t + 25α_t²(L_1 + L̄)² ≤ ‖x_t − z‖² − 4^{-1}α_t.
Theorem 2.17 is proved.
2.20 Proof of Theorem 2.18

By (2.10), (2.105), (2.106), and (2.108), for each integer t ≥ 0,
‖x_t − x_{t+1}‖ = ‖x_t − P_t(x_t − α_t‖ξ_t‖^{-1}ξ_t)‖ ≤ α_t,
(2.478)
Σ_{t=0}^{∞} ‖x_t − x_{t+1}‖ ≤ Σ_{t=0}^{∞} α_t < ∞.
Thus {x_t}_{t=0}^{∞} is a Cauchy sequence, and there exists
x_* = lim_{t→∞} x_t ∈ C.
(2.479)
By (2.106) and (2.478), for all integers t ≥ 0,
‖x_t‖ ≤ ‖x_0‖ + Σ_{i=0}^{t} α_i ≤ K_1 + Σ_{i=0}^{∞} α_i.
(2.480)
Assume that x_* ∉ C_min.
(2.481)
In view of (2.479) and (2.481), there exists ε > 0 such that
f(x_*) > inf(f, C) + ε.
(2.482)
In view of (2.479) and (2.482), there exists a natural number n_0 such that for all integers n ≥ n_0,
f(x_n) > inf(f, C) + ε,
(2.483)
α_n ≤ 16^{-1}L̄^{-1}.
(2.484)
Fix
z ∈ C_min
(2.485)
and let t ≥ n_0 be an integer. In view of (2.107), (2.108), (2.480), and (2.483)–(2.485), we apply Lemma 2.24 with x̄ = z, K_0 = K_1 + Σ_{i=0}^{∞} α_i, α = α_t, x = x_t, y = x_{t+1}, and ξ = ξ_t and with arbitrary sufficiently small δ_f, δ_C > 0 and obtain that
‖x_{t+1} − z‖² ≤ ‖x_t − z‖² − (4L̄)^{-1}α_t + 2α_t² ≤ ‖x_t − z‖² − (8L̄)^{-1}α_t.
Theorem 2.18 is proved.
Chapter 3
Extensions
In this chapter we study the projected subgradient method for nonsmooth convex constrained optimization problems in a Hilbert space. For these problems, an objective function is defined on an open convex set and a set of admissible points is not necessarily convex. We generalize some results of Chapter 2 obtained in the case when an objective function is defined on the whole Hilbert space.
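For orientation, the iteration studied in this chapter can be sketched numerically. The snippet below is only an illustration under hypothetical choices, not the book's formal setting: it runs the inexact step x_{t+1} ≈ P(x_t − α_tξ_t) for f(x) = ‖x‖₁ on the Euclidean unit ball, with P the metric projection (one admissible choice of the mappings considered here), and small random perturbations standing in for the computational errors δ_f, δ_C.

```python
import numpy as np

def proj_ball(x, radius=1.0):
    # Metric projection onto B(0, radius): a nonexpansive retraction,
    # one admissible choice of the mappings P_t.
    n = np.linalg.norm(x)
    return x if n <= radius else (radius / n) * x

def projected_subgradient(x0, subgrad, steps, T, proj=proj_ball, err=0.0,
                          rng=np.random.default_rng(0)):
    # Inexact iteration x_{t+1} ≈ P(x_t - a_t * xi_t); `err` mimics the
    # computational-error level (delta_f, delta_C) of the text.
    x = np.asarray(x0, dtype=float)
    iterates = [x]
    for t in range(T):
        xi = subgrad(x) + err * rng.standard_normal(x.shape)   # inexact subgradient
        x = proj(x - steps[t] * xi) + err * rng.standard_normal(x.shape)  # inexact projection
        iterates.append(x)
    return iterates

# f(x) = ||x||_1 on the unit ball; a subgradient is sign(x); inf f = 0 at 0.
T = 200
steps = [1.0 / np.sqrt(t + 1) for t in range(T)]
xs = projected_subgradient([0.9, -0.7], np.sign, steps, T, err=1e-3)
vals = [np.abs(x).sum() for x in xs]
print(min(vals))  # the best iterate value is small, near inf f = 0
```

With diminishing steps and bounded errors, the best function value along the trajectory approaches the infimum, in line with the convergence results proved below.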
3.1 Optimization Problems on Bounded Sets

Let (X, ⟨·, ·⟩) be a Hilbert space with an inner product ⟨·, ·⟩ which induces a complete norm ‖·‖. We use the notation and definitions introduced in Chapter 2. Let C be a closed nonempty subset of the space X and U be an open convex subset of X such that
C ⊂ U.
(3.1)
Suppose that L, M > 0,
C ⊂ B_X(0, M),
(3.2)
and that a convex function f : U → R¹ satisfies
|f(u_1) − f(u_2)| ≤ L‖u_1 − u_2‖ for all u_1, u_2 ∈ U.
(3.3)
For each point x ∈ U and each positive number ε, let
∂f(x) = {l ∈ X : f(y) − f(x) ≥ ⟨l, y − x⟩ for all y ∈ U}
(3.4)
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. A. J. Zaslavski, The Projected Subgradient Algorithm in Convex Optimization, SpringerBriefs in Optimization, https://doi.org/10.1007/978-3-030-60300-7_3
and let
∂_εf(x) = {l ∈ X : f(y) − f(x) ≥ ⟨l, y − x⟩ − ε for all y ∈ U}.
(3.5)
Denote by M the set of all mappings P : X → C such that
P z = z for all z ∈ C, ‖P x − z‖ ≤ ‖x − z‖ for all x ∈ X and all z ∈ C.
(3.6)
Define
inf(f, C) = inf{f(z) : z ∈ C}.
(3.7)
It is clear that inf(f, C) is finite. Set
C_min = {x ∈ C : f(x) = inf(f, C)}.
(3.8)
For all P ∈ M set P⁰x = x, x ∈ X. We assume that C_min ≠ ∅. In view of (3.3), for each x ∈ U,
∂f(x) ⊂ B_X(0, L).
(3.9)
Proposition 3.1 Assume that ε, r > 0 and that x ∈ U,
B_X(x, r) ⊂ U.
Then ∂_εf(x) ⊂ B_X(0, L + εr^{-1}).
Proof Let ξ ∈ ∂_εf(x). By (3.5) and (3.10), for every h ∈ B_X(0, 1),
Lr ≥ Lr‖h‖ ≥ f(x + rh) − f(x) ≥ ⟨ξ, rh⟩ − ε, ⟨ξ, h⟩ ≤ L + ε/r.
This implies that ‖ξ‖ ≤ L + ε/r. Proposition 3.1 is proved.
In this chapter we prove the following two results.
(3.10)
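The bound of Proposition 3.1 can be sanity-checked numerically in one dimension. The snippet below is an illustration only (the function, point, and grids are arbitrary choices, not from the text): it tests membership l ∈ ∂_εf(x) by sampling the defining inequality over a grid of U, for f(x) = |x|, which has Lipschitz constant L = 1.

```python
import numpy as np

def in_eps_subdiff(l, x, eps, f, ys):
    # l ∈ ∂_ε f(x)  ⇔  f(y) − f(x) ≥ l·(y − x) − ε for all y ∈ U (sampled)
    return np.all(f(ys) - f(x) >= l * (ys - x) - eps)

f = np.abs                             # L = 1 on U = (-2, 2)
x, eps, r, L = 0.5, 0.1, 1.0, 1.0      # B(x, r) ⊂ U
ys = np.linspace(-1.99, 1.99, 4001)    # dense sample of U

members = [l for l in np.linspace(-3, 3, 1201)
           if in_eps_subdiff(l, x, eps, f, ys)]
# Every sampled ε-subgradient obeys |l| ≤ L + ε/r, as the proposition asserts.
print(max(abs(l) for l in members) <= L + eps / r)  # True
```

Since the inequality is only checked on a finite grid, this is a heuristic check rather than a proof, but it illustrates how the excess ε enlarges the ordinary subdifferential while keeping it inside B(0, L + εr^{-1}).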
Theorem 3.2 Assume that δ_f, δ_C ∈ (0, 1], T ≥ 1 is an integer, {α_t}_{t=0}^{T−1} ⊂ (0, 1],
{P_i}_{i=0}^{T−1} ⊂ M,
(3.11)
{x_i}_{i=0}^{T} ⊂ U, {ξ_i}_{i=0}^{T−1} ⊂ X,
‖x_0‖ ≤ M + 1,
(3.12)
and that for i = 0, . . . , T − 1,
B_X(ξ_i, δ_f) ∩ ∂f(x_i) ≠ ∅,
(3.13)
‖x_{i+1} − P_i(x_i − α_iξ_i)‖ ≤ δ_C.
(3.14)
Then
min{f(x_t) : t = 0, . . . , T − 1} − inf(f, C), f((Σ_{i=0}^{T−1} α_i)^{-1} Σ_{t=0}^{T−1} α_t x_t) − inf(f, C)
≤ (Σ_{j=0}^{T−1} α_j)^{-1} Σ_{t=0}^{T−1} α_t(f(x_t) − inf(f, C))
≤ 2^{-1}(2M + 1)²(Σ_{t=0}^{T−1} α_t)^{-1} + 2^{-1}L²(Σ_{t=0}^{T−1} α_t²)(Σ_{t=0}^{T−1} α_t)^{-1}
+ Tδ_C(Σ_{t=0}^{T−1} α_t)^{-1}(2M + L + 3) + δ_f(2M + L + 2).
(3.15)
Theorem 3.3 Assume that r > 0,
B_X(z, 2r) ⊂ U for all z ∈ C,
(3.16)
Δ > 0, δ_f, δ_C ∈ (0, 1], δ_C ≤ r, T ≥ 1 is an integer, {α_t}_{t=0}^{T−1} ⊂ (0, 1],
{P_i}_{i=0}^{T−1} ⊂ M,
(3.17)
{x_i}_{i=0}^{T} ⊂ U, {ξ_i}_{i=0}^{T−1} ⊂ X,
‖x_0‖ ≤ M + 1,
(3.18)
B_X(x_0, r) ⊂ U,
(3.19)
and that for i = 0, . . . , T − 1,
B_X(ξ_i, δ_f) ∩ ∂_Δf(x_i) ≠ ∅,
(3.20)
‖x_{i+1} − P_i(x_i − α_iξ_i)‖ ≤ δ_C.
(3.21)
Then
min{f(x_t) : t = 0, . . . , T − 1} − inf(f, C), f((Σ_{i=0}^{T−1} α_i)^{-1} Σ_{t=0}^{T−1} α_t x_t) − inf(f, C)
≤ (Σ_{j=0}^{T−1} α_j)^{-1} Σ_{t=0}^{T−1} α_t(f(x_t) − inf(f, C))
≤ 2^{-1}(2M + 1)²(Σ_{t=0}^{T−1} α_t)^{-1} + Δ + 2^{-1}(L + Δr^{-1})²(Σ_{t=0}^{T−1} α_t²)(Σ_{t=0}^{T−1} α_t)^{-1}
+ Tδ_C(Σ_{t=0}^{T−1} α_t)^{-1}(2M + L + 3 + Δr^{-1}) + δ_f(2M + L + 2 + Δr^{-1}).
(3.22)
Note that (3.15) is a particular case of (3.22) with Δ = 0. Theorems 3.2 and 3.3 are new. They are proved in Sections 3.4 and 3.5, respectively.
Let T ≥ 1 be an integer and A > 0 be given. We are interested in an optimal choice of α_t, t = 0, . . . , T − 1, satisfying Σ_{t=0}^{T−1} α_t = A which minimizes the right-hand side of (3.22). By Lemma 2.3 of [75], α_t = α = T^{-1}A, t = 0, . . . , T − 1. In this case the right-hand side of (3.22) is
2^{-1}(2M + 1)²T^{-1}α^{-1} + Δ + 2^{-1}(L + Δr^{-1})²α + δ_Cα^{-1}(2M + L + 3 + Δr^{-1}) + δ_f(2M + L + 2 + Δr^{-1}).
Now we can make the best choice of the step-size α > 0. Since T can be arbitrarily large, we need to minimize the function
δ_Cα^{-1}(2M + L + 3 + Δr^{-1}) + 2^{-1}(L + Δr^{-1})²α, α > 0,
which has a minimizer
α = (L + Δr^{-1})^{-1}(2δ_C(2M + L + 3 + Δr^{-1}))^{1/2}.
With this choice of α the right-hand side of (3.22) is
2^{-1}(2M + 1)²T^{-1}(L + Δr^{-1})(2δ_C(2M + L + 3 + Δr^{-1}))^{-1/2} + Δ + (L + Δr^{-1})(2^{-1}δ_C(2M + L + 3 + Δr^{-1}))^{1/2} + δ_f(2M + L + 2 + Δr^{-1}) + 2^{-1}(L + Δr^{-1})(2δ_C(2M + L + 3 + Δr^{-1}))^{1/2}.
Now we should make the best choice of T. It is clear that T should be of the same order as δ_C^{-1}. In this case the right-hand side of (3.22) does not exceed c_1δ_C^{1/2} + Δ + δ_f(2M + L + 2 + Δr^{-1}), where c_1 > 0 is a constant.
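The step-size choice above is easy to evaluate numerically. The helper below is a sketch of this calculation under illustrative parameter values (the numbers are arbitrary, not from the text); `rhs_bound` evaluates the right-hand side of (3.22) for the constant step α_t = α, and `best_alpha` implements the minimizer derived above.

```python
import math

def rhs_bound(T, alpha, M, L, r, Delta, delta_C, delta_f):
    # Right-hand side of (3.22) with the constant step alpha_t = alpha.
    c = 2 * M + L + 3 + Delta / r
    return (0.5 * (2 * M + 1) ** 2 / (T * alpha) + Delta
            + 0.5 * (L + Delta / r) ** 2 * alpha
            + delta_C * c / alpha
            + delta_f * (2 * M + L + 2 + Delta / r))

def best_alpha(M, L, r, Delta, delta_C):
    # Minimizer of delta_C*c/alpha + 0.5*(L + Delta/r)^2 * alpha over alpha > 0.
    c = 2 * M + L + 3 + Delta / r
    return math.sqrt(2 * delta_C * c) / (L + Delta / r)

M, L, r, Delta, dC, dF = 1.0, 1.0, 0.5, 0.1, 1e-4, 1e-4
a = best_alpha(M, L, r, Delta, dC)
# With T on the order of 1/delta_C the bound behaves like
# O(delta_C^{1/2}) + Delta + O(delta_f), as noted above.
print(rhs_bound(10_000, a, M, L, r, Delta, dC, dF))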
3.2 An Auxiliary Result for Theorem 3.2

Lemma 3.4 Let
P ∈ M,
(3.23)
α ∈ (0, 1], δ_f, δ_C ∈ (0, 1],
x ∈ U ∩ B_X(0, M + 1),
(3.24)
let ξ ∈ X satisfy
B_X(ξ, δ_f) ∩ ∂f(x) ≠ ∅,
(3.25)
and let
y ∈ B_X(P(x − αξ), δ_C).
(3.26)
Then for each z ∈ C,
2α(f(x) − f(z)) ≤ ‖x − z‖² − ‖y − z‖² + α²L² + 2δ_C(2M + L + 3) + 2αδ_f(2M + L + 2).
Proof Let z ∈ C. By (3.25), there exists
(3.27)
90
3 Extensions
v ∈ ∂f(x)
(3.28)
such that
‖ξ − v‖ ≤ δ_f.
(3.29)
In view of (3.9) and (3.28),
‖v‖ ≤ L.
(3.30)
By (3.29) and (3.30),
‖ξ‖ ≤ L + 1.
(3.31)
It follows from (3.2), (3.24), and (3.27)–(3.30) that
‖x − αξ − z‖² = ‖x − αv + (αv − αξ) − z‖²
≤ ‖x − αv − z‖² + α²‖v − ξ‖² + 2α⟨v − ξ, x − αv − z⟩
≤ ‖x − αv − z‖² + α²δ_f² + 2αδ_f‖x − αv − z‖
≤ ‖x − αv − z‖² + α²δ_f² + 2αδ_f(2M + L + 1)
≤ ‖x − z‖² − 2α⟨x − z, v⟩ + α²‖v‖² + α²δ_f² + 2αδ_f(2M + 1 + L)
≤ ‖x − z‖² − 2α⟨x − z, v⟩ + α²L² + α²δ_f² + 2αδ_f(2M + 1 + L).
(3.32)
In view of (3.28),
⟨v, z − x⟩ ≤ f(z) − f(x).
(3.33)
By (3.32) and (3.33),
‖x − αξ − z‖² ≤ ‖x − z‖² + 2α(f(z) − f(x)) + α²L² + 2αδ_f(2M + 2 + L).
(3.34)
It follows from (3.2), (3.24), (3.27), and (3.31) that
‖x − αξ − z‖ ≤ 2M + 2 + L.
(3.35)
By (3.26), (3.27), (3.34), and (3.35),
‖y − z‖² = ‖y − P(x − αξ) + P(x − αξ) − z‖²
≤ ‖y − P(x − αξ)‖² + 2‖y − P(x − αξ)‖‖P(x − αξ) − z‖ + ‖P(x − αξ) − z‖²
≤ δ_C² + 2δ_C‖x − αξ − z‖ + ‖x − αξ − z‖²
≤ δ_C² + 2δ_C(2M + L + 2) + ‖x − z‖² + 2α(f(z) − f(x)) + α²L² + 2αδ_f(2M + 2 + L).
This implies that
2α(f(x) − f(z)) ≤ ‖x − z‖² − ‖y − z‖² + α²L² + 2δ_C(2M + L + 3) + 2αδ_f(2M + L + 2).
Lemma 3.4 is proved.
3.3 An Auxiliary Result for Theorem 3.3

Lemma 3.5 Let P ∈ M, α ∈ (0, 1], δ_f, δ_C ∈ (0, 1], r, Δ > 0,
x ∈ U ∩ B_X(0, M + 1),
(3.36)
B_X(x, r) ⊂ U,
(3.37)
let ξ ∈ X satisfy
B_X(ξ, δ_f) ∩ ∂_Δf(x) ≠ ∅,
(3.38)
and let
y ∈ B_X(P(x − αξ), δ_C).
(3.39)
Then for each z ∈ C,
2α(f(x) − f(z)) ≤ ‖x − z‖² − ‖y − z‖² + 2αΔ + δ_C² + 2δ_C(2M + L + 2 + Δr^{-1}) + α²(L + Δr^{-1})² + 2αδ_f(2M + L + 2 + Δr^{-1}).
Proof Let z ∈ C.
(3.40)
By (3.38), there exists
v ∈ ∂_Δf(x)
(3.41)
such that
‖ξ − v‖ ≤ δ_f.
(3.42)
Proposition 3.1, (3.36), and (3.37) imply that ∂Δ f (x) ⊂ BX (0, L + Δr −1 ).
(3.43)
In view of (3.41) and (3.43),
‖v‖ ≤ L + Δr^{-1}.
(3.44)
By (3.42) and (3.44),
‖ξ‖ ≤ L + Δr^{-1} + 1.
(3.45)
It follows from (3.2), (3.36), (3.40), (3.42), and (3.44) that
‖x − αξ − z‖² = ‖x − αv + (αv − αξ) − z‖²
≤ ‖x − αv − z‖² + α²‖v − ξ‖² + 2α⟨v − ξ, x − αv − z⟩
≤ ‖x − αv − z‖² + α²δ_f² + 2αδ_f‖x − αv − z‖
≤ ‖x − αv − z‖² + α²δ_f² + 2αδ_f(2M + L + 1 + Δr^{-1})
≤ ‖x − z‖² − 2α⟨x − z, v⟩ + α²‖v‖² + α²δ_f² + 2αδ_f(2M + 1 + L + Δr^{-1})
≤ ‖x − z‖² − 2α⟨x − z, v⟩ + α²(L + Δr^{-1})² + α²δ_f² + 2αδ_f(2M + 1 + L + Δr^{-1}).
(3.46)
In view of (3.41),
⟨v, z − x⟩ ≤ f(z) − f(x) + Δ.
(3.47)
By (3.46) and (3.47),
‖x − αξ − z‖² ≤ ‖x − z‖² + 2α(f(z) − f(x)) + 2αΔ + α²(L + Δr^{-1})² + 2αδ_f(2M + 2 + L + Δr^{-1}).
(3.48)
It follows from (3.27), (3.36), (3.40), and (3.45) that
‖x − αξ − z‖ ≤ 2M + 2 + L + Δr^{-1}.
(3.49)
By (3.6), (3.39), (3.40), (3.48), and (3.49),
‖y − z‖² = ‖y − P(x − αξ) + P(x − αξ) − z‖²
≤ ‖y − P(x − αξ)‖² + 2‖y − P(x − αξ)‖‖P(x − αξ) − z‖ + ‖P(x − αξ) − z‖²
≤ δ_C² + 2δ_C‖x − αξ − z‖ + ‖x − αξ − z‖²
≤ δ_C² + 2δ_C(2M + L + 2 + Δr^{-1}) + ‖x − z‖² + 2α(f(z) − f(x)) + 2αΔ + α²(L + Δr^{-1})² + 2αδ_f(2M + 2 + L + Δr^{-1}).
This implies that
2α(f(x) − f(z)) ≤ ‖x − z‖² − ‖y − z‖² + 2αΔ + δ_C² + α²(L + Δr^{-1})² + 2δ_C(2M + L + 2 + Δr^{-1}) + 2αδ_f(2M + L + 2 + Δr^{-1}).
Lemma 3.5 is proved.
3.4 Proof of Theorem 3.2

Fix x̄ ∈ C_min. For every t = 0, . . . , T − 1 we apply Lemma 3.4 with P = P_t, x = x_t, y = x_{t+1}, ξ = ξ_t, α = α_t and obtain that
2α_t(f(x_t) − f(x̄)) ≤ ‖x_t − x̄‖² − ‖x_{t+1} − x̄‖² + α_t²L² + 2δ_C(2M + L + 3) + 2α_tδ_f(2M + L + 2).
Together with (3.2) and (3.12), this implies that
Σ_{t=0}^{T−1} α_t(f(x_t) − inf(f, C)) ≤ 2^{-1} Σ_{t=0}^{T−1}(‖x_t − x̄‖² − ‖x_{t+1} − x̄‖²)
+ 2^{-1} Σ_{t=0}^{T−1} α_t²L² + Tδ_C(2M + L + 3) + δ_f(2M + L + 2) Σ_{t=0}^{T−1} α_t
≤ 2^{-1}(2M + 1)² + 2^{-1} Σ_{t=0}^{T−1} α_t²L² + Tδ_C(2M + L + 3) + δ_f(2M + L + 2) Σ_{t=0}^{T−1} α_t.
This implies that
min{f(x_t) : t = 0, . . . , T − 1} − inf(f, C), f((Σ_{i=0}^{T−1} α_i)^{-1} Σ_{t=0}^{T−1} α_t x_t) − inf(f, C)
≤ (Σ_{j=0}^{T−1} α_j)^{-1} Σ_{t=0}^{T−1} α_t(f(x_t) − inf(f, C))
≤ 2^{-1}(2M + 1)²(Σ_{t=0}^{T−1} α_t)^{-1} + 2^{-1}L²(Σ_{t=0}^{T−1} α_t²)(Σ_{t=0}^{T−1} α_t)^{-1}
+ Tδ_C(Σ_{t=0}^{T−1} α_t)^{-1}(2M + L + 3) + δ_f(2M + L + 2).
Theorem 3.2 is proved.
3.5 Proof of Theorem 3.3

Fix x̄ ∈ C_min. By (3.2), (3.17), and (3.21), for all t = 0, . . . , T, ‖x_t‖ ≤ M + 1. In view of (3.19), B_X(x_0, r) ⊂ U. It follows from (3.16), (3.17), and (3.21) that for all integers t = 0, . . . , T − 1,
B_X(x_{t+1}, r) ⊂ B_X(P_t(x_t − α_tξ_t), r + δ_C) ⊂ B_X(P_t(x_t − α_tξ_t), 2r) ⊂ U.
For every t = 0, . . . , T − 1, in view of the relation above, we apply Lemma 3.5 with P = P_t, x = x_t, y = x_{t+1}, α = α_t, ξ = ξ_t and obtain that
2α_t(f(x_t) − inf(f, C)) ≤ ‖x_t − x̄‖² − ‖x_{t+1} − x̄‖² + 2α_tΔ + δ_C² + α_t²(L + Δr^{-1})² + 2δ_C(2M + L + 2 + Δr^{-1}) + 2α_tδ_f(2M + L + 2 + Δr^{-1}).
The relation above implies that
Σ_{t=0}^{T−1} α_t(f(x_t) − inf(f, C)) ≤ 2^{-1} Σ_{t=0}^{T−1}(‖x_t − x̄‖² − ‖x_{t+1} − x̄‖²)
+ Σ_{t=0}^{T−1} α_tΔ + 2^{-1} Σ_{t=0}^{T−1} α_t²(L + Δr^{-1})² + Tδ_C(2M + L + 3 + Δr^{-1})
+ δ_f(2M + L + 2 + Δr^{-1}) Σ_{t=0}^{T−1} α_t
≤ 2^{-1}‖x_0 − x̄‖² + Σ_{t=0}^{T−1} α_tΔ + 2^{-1} Σ_{t=0}^{T−1} α_t²(L + Δr^{-1})²
+ Tδ_C(2M + L + 3 + Δr^{-1}) + δ_f(2M + L + 2 + Δr^{-1}) Σ_{t=0}^{T−1} α_t.
Together with (3.2) and (3.18) this implies that
min{f(x_t) : t = 0, . . . , T − 1} − inf(f, C), f((Σ_{i=0}^{T−1} α_i)^{-1} Σ_{t=0}^{T−1} α_t x_t) − inf(f, C)
≤ (Σ_{j=0}^{T−1} α_j)^{-1} Σ_{t=0}^{T−1} α_t(f(x_t) − inf(f, C))
≤ 2^{-1}(2M + 1)²(Σ_{t=0}^{T−1} α_t)^{-1} + Δ + 2^{-1}(L + Δr^{-1})²(Σ_{t=0}^{T−1} α_t²)(Σ_{t=0}^{T−1} α_t)^{-1}
+ Tδ_C(Σ_{t=0}^{T−1} α_t)^{-1}(2M + L + 3 + Δr^{-1}) + δ_f(2M + L + 2 + Δr^{-1}).
Theorem 3.3 is proved.
3.6 Optimization on Unbounded Sets

Let (X, ⟨·, ·⟩) be a Hilbert space with an inner product ⟨·, ·⟩ which induces a complete norm ‖·‖. Let C be a closed nonempty subset of the space X, U be an open convex subset of X such that C ⊂ U, and f : U → R¹ be a convex function which is Lipschitz on all bounded subsets of U. For each point x ∈ U and each positive number ε, let
∂f(x) = {l ∈ X : f(y) − f(x) ≥ ⟨l, y − x⟩ for all y ∈ U}
and let
∂_εf(x) = {l ∈ X : f(y) − f(x) ≥ ⟨l, y − x⟩ − ε for all y ∈ U}.
Assume that
lim_{x∈U, ‖x‖→∞} f(x) = ∞.
(3.50)
This means that for each M_0 > 0 there exists M_1 > 0 such that if a point x ∈ U satisfies the inequality ‖x‖ ≥ M_1, then f(x) > M_0. Define inf(f, C) = inf{f(z) : z ∈ C}. Since the function f is Lipschitz on all bounded subsets of the space X, it follows from (3.50) that inf(f, C) is finite. Set
C_min = {x ∈ C : f(x) = inf(f, C)}.
(3.51)
It is well known that if the set C is convex, then the set C_min is nonempty. Clearly, C_min ≠ ∅ if the space X is finite-dimensional. We assume that C_min ≠ ∅. It is clear that C_min is a closed subset of C. Fix
θ_0 ∈ C.
(3.52)
Set
U_0 = {x ∈ U : f(x) ≤ f(θ_0) + 4}.
(3.53)
In view of (3.50) there exists a number K̄ > 1 such that
U_0 ⊂ B_X(0, K̄).
(3.54)
Since the function f is Lipschitz on all bounded subsets of U, there exists a number L̄ > 1 such that
|f(z_1) − f(z_2)| ≤ L̄‖z_1 − z_2‖ for all z_1, z_2 ∈ U ∩ B_X(0, K̄ + 4).
(3.55)
Denote by M_C the set of all mappings P : X → C such that
P z = z for all z ∈ C,
(3.56)
‖P z − x‖ ≤ ‖z − x‖ for all x ∈ C and all z ∈ X.
(3.57)
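The class M_C contains, in particular, the metric projection onto a closed convex set. The snippet below is a hypothetical illustration (the set C and the sampling are arbitrary choices, not from the text): it checks the two defining properties of M_C on random samples for the projection onto a Euclidean ball.

```python
import numpy as np

rng = np.random.default_rng(1)

def P(z, radius=1.0):
    # Metric projection onto C = B(0, radius); a candidate member of M_C.
    n = np.linalg.norm(z)
    return z if n <= radius else (radius / n) * z

# Check: P z = z for z in C, and ||P z - x|| <= ||z - x|| for x in C, z in X.
ok = True
for _ in range(1000):
    z = rng.uniform(-3, 3, size=3)       # arbitrary point of X
    x = P(rng.uniform(-3, 3, size=3))    # arbitrary point of C
    c = P(rng.uniform(-1, 1, size=3))    # point of C
    ok &= np.allclose(P(c), c)
    ok &= np.linalg.norm(P(z) - x) <= np.linalg.norm(z - x) + 1e-12
print(ok)  # True
```

Since the metric projection onto a convex set is nonexpansive and fixes the set pointwise, both sampled properties hold; mappings in M_C need only be nonexpansive toward points of C, so the class is wider than projections.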
We prove the following two theorems.
Theorem 3.6 Assume that K_1 ≥ K̄ + 4, L_1 ≥ L̄,
(3.58)
δ_f, δ_C ∈ (0, 1],
|f(z_1) − f(z_2)| ≤ L_1‖z_1 − z_2‖ for all z_1, z_2 ∈ B_X(0, 3K_1 + 2) ∩ U,
(3.59)
α ∈ (0, (1 + L̄)^{-2})
(3.60)
and that
δ_f(K̄ + 3K_1 + 2 + L_1) ≤ α, δ_C(K̄ + 3K_1 + L_1 + 3) ≤ α.
(3.61)
Let T ≥ 2 be an integer,
{P_t}_{t=0}^{T−1} ⊂ M_C,
(3.62)
{x_t}_{t=0}^{T} ⊂ U, {ξ_t}_{t=0}^{T−1} ⊂ X,
(3.63)
‖x_0‖ ≤ K_1,
(3.64)
B_X(x_0, δ_C) ∩ C ≠ ∅
(3.65)
and that for t = 0, . . . , T − 1,
B_X(ξ_t, δ_f) ∩ ∂f(x_t) ≠ ∅,
(3.66)
‖x_{t+1} − P_t(x_t − αξ_t)‖ ≤ δ_C.
(3.67)
Then ‖x_t‖ ≤ 2K̄ + K_1, t = 0, . . . , T, and
min{f(x_t) : t = 0, . . . , T − 1} − inf(f, C), f(T^{-1} Σ_{i=0}^{T−1} x_i) − inf(f, C)
≤ T^{-1} Σ_{i=0}^{T−1} f(x_i) − inf(f, C)
≤ (2Tα)^{-1}(K_1 + K̄)² + L_1²α + α^{-1}δ_C(K̄ + 3K_1 + L_1 + 3) + δ_f(3K_1 + K̄ + L_1 + 2).

Theorem 3.7 Assume that K_1 ≥ K̄ + 4, L_1 ≥ L̄, r_0 ∈ (0, 1],
B_X(z, r_0) ⊂ U, z ∈ C,
(3.68)
|f(z_1) − f(z_2)| ≤ L_1‖z_1 − z_2‖ for all z_1, z_2 ∈ B_X(0, 3K_1 + 1) ∩ U,
(3.69)
Δ ∈ (0, r_0], δ_f, δ_C ∈ (0, 2^{-1}r_0], α ∈ (0, (L̄ + 3)^{-2}],
(3.70)
and that
δ_f(3K̄ + K_1 + 4 + L_1) ≤ α,
(3.71)
δ_C(3K̄ + K_1 + L_1 + 2) ≤ α.
(3.72)
Let T ≥ 2 be an integer,
{P_t}_{t=0}^{T−1} ⊂ M_C,
(3.73)
{x_t}_{t=0}^{T} ⊂ U, {ξ_t}_{t=0}^{T−1} ⊂ X,
(3.74)
‖x_0‖ ≤ K_1,
(3.75)
B_X(x_0, δ_C) ∩ C ≠ ∅
(3.76)
and that for t = 0, . . . , T − 1,
B_X(ξ_t, δ_f) ∩ ∂_Δf(x_t) ≠ ∅,
(3.77)
‖x_{t+1} − P_t(x_t − αξ_t)‖ ≤ δ_C.
(3.78)
Then ‖x_t‖ ≤ 2K̄ + K_1, t = 0, . . . , T, and
min{f(x_t) : t = 0, . . . , T − 1} − inf(f, C), f(T^{-1} Σ_{i=0}^{T−1} x_i) − inf(f, C)
≤ T^{-1} Σ_{i=0}^{T−1} f(x_i) − inf(f, C)
≤ (2Tα)^{-1}(K_1 + K̄)² + (L_1 + 2)²α + α^{-1}δ_C(3K̄ + K_1 + L_1 + 2) + Δ + δ_f(K_1 + 3K̄ + L_1 + 4).
3.7 Auxiliary Results

Lemma 3.8 Let K_0, L_0 > 0,
|f(z_1) − f(z_2)| ≤ L_0‖z_1 − z_2‖ for all z_1, z_2 ∈ B_X(0, K_0 + 1) ∩ U,
(3.79)
x ∈ B_X(0, K_0) ∩ U, v ∈ ∂f(x).
(3.80)
Then ‖v‖ ≤ L_0.
Proof In view of (3.80), for all u ∈ U,
f(u) − f(x) ≥ ⟨v, u − x⟩.
(3.81)
There exists r ∈ (0, 1) such that
B_X(x, r) ⊂ U.
(3.82)
By (3.80) and (3.82),
B_X(x, r) ⊂ B_X(0, K_0 + 1).
(3.83)
It follows from (3.79), (3.81), (3.82), and (3.83) that for all h ∈ B_X(0, 1), x + rh ∈ U ∩ B_X(0, K_0 + 1),
⟨v, rh⟩ ≤ f(x + rh) − f(x) ≤ L_0r‖h‖ ≤ L_0r, ⟨v, h⟩ ≤ L_0.
Therefore ‖v‖ ≤ L_0. Lemma 3.8 is proved.
Lemma 3.9 Let K_0, L_0 > 0, r ∈ (0, 1], Δ > 0,
|f(z_1) − f(z_2)| ≤ L_0‖z_1 − z_2‖ for all z_1, z_2 ∈ B_X(0, K_0 + 1) ∩ U,
(3.84)
x ∈ BX (0, K0 ) ∩ U,
(3.85)
BX (x, r) ⊂ U.
(3.86)
Then ∂_Δf(x) ⊂ B_X(0, L_0 + Δr^{-1}).
Proof Let
ξ ∈ ∂_Δf(x).
(3.87)
By (3.85) and (3.86),
B_X(x, r) ⊂ U ∩ B_X(0, K_0 + 1).
(3.88)
In view of (3.88), for each h ∈ B_X(0, 1), x + rh ∈ U ∩ B_X(0, K_0 + 1). Together with (3.84) and (3.87) this implies that
L_0r ≥ L_0r‖h‖ ≥ f(x + rh) − f(x) ≥ ⟨ξ, rh⟩ − Δ, L_0 + Δr^{-1} ≥ ⟨ξ, h⟩.
This implies that ‖ξ‖ ≤ L_0 + Δr^{-1}. Lemma 3.9 is proved.
Lemma 3.10 Let P ∈ M_C, K_0 ≥ K̄, L_0 > 0,
|f(z_1) − f(z_2)| ≤ L_0‖z_1 − z_2‖ for all z_1, z_2 ∈ B_X(0, K_0 + 2) ∩ U,
(3.89)
α ∈ (0, 1], δf , δC ∈ (0, 1], x ∈ U ∩ BX (0, K0 + 1),
(3.90)
let ξ ∈ X satisfy
B_X(ξ, δ_f) ∩ ∂f(x) ≠ ∅,
(3.91)
and let
y ∈ B_X(P(x − αξ), δ_C) ∩ U.
(3.92)
Then for each z ∈ C satisfying f(z) ≤ f(θ_0) + 4,
2α(f(x) − f(z)) ≤ ‖x − z‖² − ‖y − z‖² + 2δ_C(K_0 + K̄ + L_0 + 3) + α²L_0² + 2αδ_f(K_0 + K̄ + L_0 + 2).
Proof Let
z ∈ C
(3.93)
satisfy
f(z) ≤ f(θ_0) + 4.
(3.94)
In view of (3.52), (3.53), and (3.94),
‖z‖ ≤ K̄.
(3.95)
By (3.91), there exists
v ∈ ∂f(x)
(3.96)
such that
‖ξ − v‖ ≤ δ_f.
(3.97)
In view of (3.89), (3.90), and (3.96),
‖v‖ ≤ L_0.
(3.98)
By (3.97) and (3.98),
‖ξ‖ ≤ L_0 + 1.
(3.99)
It follows from (3.90), (3.95), (3.97), and (3.98) that
‖x − αξ − z‖² = ‖x − αv + (αv − αξ) − z‖²
≤ ‖x − αv − z‖² + α²‖v − ξ‖² + 2α⟨v − ξ, x − αv − z⟩
≤ ‖x − αv − z‖² + α²δ_f² + 2αδ_f‖x − αv − z‖
≤ ‖x − αv − z‖² + α²δ_f² + 2αδ_f(K_0 + K̄ + L_0 + 1)
≤ ‖x − z‖² − 2α⟨x − z, v⟩ + α²‖v‖² + α²δ_f² + 2αδ_f(K_0 + K̄ + L_0 + 1)
≤ ‖x − z‖² − 2α⟨x − z, v⟩ + α²L_0² + α²δ_f² + 2αδ_f(K_0 + K̄ + L_0 + 1).
(3.100)
In view of (3.96),
⟨v, z − x⟩ ≤ f(z) − f(x).
(3.101)
By (3.100) and (3.101),
‖x − αξ − z‖² ≤ ‖x − z‖² + 2α(f(z) − f(x)) + α²L_0² + α²δ_f² + 2αδ_f(K_0 + K̄ + L_0 + 1).
(3.102)
It follows from (3.90), (3.95), and (3.99) that
‖x − αξ − z‖ ≤ K_0 + 2 + K̄ + L_0.
(3.103)
By (3.56), (3.57), (3.92), and (3.103),
‖y − z‖² = ‖y − P(x − αξ) + P(x − αξ) − z‖²
≤ ‖y − P(x − αξ)‖² + 2‖y − P(x − αξ)‖‖P(x − αξ) − z‖ + ‖P(x − αξ) − z‖²
≤ δ_C² + 2δ_C(K_0 + K̄ + L_0 + 2) + ‖x − αξ − z‖²
≤ δ_C² + 2δ_C(K_0 + K̄ + L_0 + 2) + ‖x − z‖² + 2α(f(z) − f(x)) + α²L_0² + α²δ_f² + 2αδ_f(K_0 + K̄ + L_0 + 1).
This implies that
2α(f(x) − f(z)) ≤ ‖x − z‖² − ‖y − z‖² + 2δ_C(K_0 + K̄ + L_0 + 3) + α²L_0² + 2αδ_f(K_0 + K̄ + L_0 + 2).
Lemma 3.10 is proved.
Lemma 3.11 Let P ∈ M_C, K_0 ≥ K̄, L_0 > 0,
|f(z_1) − f(z_2)| ≤ L_0‖z_1 − z_2‖ for all z_1, z_2 ∈ B_X(0, K_0 + 2) ∩ U,
(3.104)
α ∈ (0, 1], δ_f, δ_C ∈ (0, 1], Δ > 0, r ∈ (0, 1],
x ∈ U ∩ B_X(0, K_0 + 1),
(3.105)
B_X(x, r) ⊂ U,
(3.106)
let ξ ∈ X satisfy
B_X(ξ, δ_f) ∩ ∂_Δf(x) ≠ ∅,
(3.107)
and let
y ∈ B_X(P(x − αξ), δ_C) ∩ U.
(3.108)
Then for each z ∈ C satisfying f(z) ≤ f(θ_0) + 4,
2α(f(x) − f(z)) ≤ ‖x − z‖² − ‖y − z‖² + 2δ_C(K_0 + K̄ + L_0 + 3 + Δr^{-1}) + 2αΔ + α²(L_0 + Δr^{-1})² + 2αδ_f(K_0 + K̄ + L_0 + 2 + Δr^{-1}).
Proof Let
z ∈ C
(3.109)
satisfy
f(z) ≤ f(θ_0) + 4.
(3.110)
In view of (3.53), (3.54), (3.109), and (3.110),
‖z‖ ≤ K̄.
(3.111)
By (3.107), there exists
v ∈ ∂_Δf(x)
(3.112)
such that
‖ξ − v‖ ≤ δ_f.
(3.113)
Lemma 3.9 and (3.104)–(3.106) imply that ∂Δ f (x) ⊂ BX (0, L0 + Δr −1 ).
(3.114)
In view of (3.112) and (3.114),
‖v‖ ≤ L_0 + Δr^{-1}.
(3.115)
By (3.113) and (3.115),
‖ξ‖ ≤ L_0 + Δr^{-1} + 1.
(3.116)
It follows from (3.105), (3.111), (3.113), and (3.115) that
‖x − αξ − z‖² = ‖x − αv + (αv − αξ) − z‖²
≤ ‖x − αv − z‖² + α²‖v − ξ‖² + 2α⟨v − ξ, x − αv − z⟩
≤ ‖x − αv − z‖² + α²δ_f² + 2αδ_f‖x − αv − z‖
≤ ‖x − αv − z‖² + α²δ_f² + 2αδ_f(K_0 + K̄ + L_0 + 1 + Δr^{-1})
≤ ‖x − z‖² − 2α⟨x − z, v⟩ + α²‖v‖² + α²δ_f² + 2αδ_f(K_0 + K̄ + L_0 + 1 + Δr^{-1})
≤ ‖x − z‖² − 2α⟨x − z, v⟩ + α²(L_0 + Δr^{-1})² + α²δ_f² + 2αδ_f(K_0 + K̄ + L_0 + 1 + Δr^{-1}).
(3.117)
In view of (3.112),
⟨v, z − x⟩ ≤ f(z) − f(x) + Δ.
(3.118)
By (3.105), (3.111), and (3.116),
‖x − αξ − z‖ ≤ K_0 + K̄ + 2 + L_0 + Δr^{-1}.
(3.119)
It follows from (3.117) and (3.118) that
‖x − αξ − z‖² ≤ ‖x − z‖² + 2α(f(z) − f(x)) + 2αΔ + α²(L_0 + Δr^{-1})² + 2αδ_f(K_0 + K̄ + L_0 + 2 + Δr^{-1}).
(3.120)
It follows from (3.56), (3.57), (3.109), (3.119), and (3.120) that
‖y − z‖² = ‖y − P(x − αξ) + P(x − αξ) − z‖²
≤ ‖y − P(x − αξ)‖² + 2‖y − P(x − αξ)‖‖P(x − αξ) − z‖ + ‖P(x − αξ) − z‖²
≤ δ_C² + 2δ_C(K_0 + 2 + K̄ + L_0 + Δr^{-1}) + ‖P(x − αξ) − z‖²
≤ 2δ_C(K_0 + 3 + K̄ + L_0 + Δr^{-1}) + ‖x − z‖² + 2α(f(z) − f(x)) + 2αΔ + α²(L_0 + Δr^{-1})² + 2αδ_f(K_0 + K̄ + L_0 + 2 + Δr^{-1}).
This implies that
2α(f(x) − f(z)) ≤ ‖x − z‖² − ‖y − z‖² + 2δ_C(K_0 + 3 + K̄ + L_0 + Δr^{-1}) + 2αΔ + α²(L_0 + Δr^{-1})² + 2αδ_f(K_0 + K̄ + L_0 + 2 + Δr^{-1}).
Lemma 3.11 is proved.
3.8 Proof of Theorem 3.6

Fix x̄ ∈ C_min.
(3.121)
By (3.52), (3.54), and (3.121),
‖x̄‖ ≤ K̄.
(3.122)
In view of (3.64) and (3.122),
‖x_0 − x̄‖ ≤ K̄ + K_1.
(3.123)
By induction we show that for all i = 0, . . . , T,
‖x_i − x̄‖ ≤ K̄ + K_1.
(3.124)
In view of (3.123), inequality (3.124) holds for i = 0. Assume that i ∈ {0, . . . , T − 1} and that (3.124) is valid. There are two cases: f (xi ) ≤ inf(f, C) + 4;
(3.125)
f (xi ) > inf(f, C) + 4.
(3.126)
Assume that (3.125) holds. In view of (3.52)–(3.54) and (3.125),
‖x_i‖ ≤ K̄.
(3.127)
Lemma 3.8, (3.55), and (3.127) imply that
∂f(x_i) ⊂ B_X(0, L̄).
(3.128)
By (3.66) and (3.128),
‖ξ_i‖ ≤ L̄ + 1.
(3.129)
It follows from (3.56), (3.57), (3.63), (3.67), (3.121), (3.122), (3.127), and (3.129) that
‖x_{i+1} − x̄‖ ≤ ‖x_{i+1} − P_i(x_i − αξ_i)‖ + ‖P_i(x_i − αξ_i) − x̄‖ ≤ δ_C + ‖x_i − αξ_i − x̄‖ ≤ δ_C + 2K̄ + α(L̄ + 1) ≤ 2K̄ + 2 ≤ K_1 + K̄.
Assume that (3.126) holds. In view of (3.66), (3.121), (3.122), and (3.124), we apply Lemma 3.10 with P = P_i, K_0 = 3K_1, L_0 = L_1, ξ = ξ_i, x = x_i, y = x_{i+1}, z = x̄ and obtain that
2α(f(x_i) − f(x̄)) ≤ ‖x_i − x̄‖² − ‖x_{i+1} − x̄‖² + 2δ_C(K̄ + 3K_1 + L_1 + 3) + L_1²α² + 2αδ_f(3K_1 + K̄ + L_1 + 2).
Together with (3.60), (3.61), and (3.126), this implies that
‖x_{i+1} − x̄‖² ≤ ‖x_i − x̄‖² − 8α + 2δ_C(K̄ + 3K_1 + L_1 + 3)
+ L_1²α² + 2αδ_f(3K_1 + K̄ + L_1 + 2) ≤ ‖x_i − x̄‖² − 7α + 4α,
‖x_{i+1} − x̄‖ ≤ ‖x_i − x̄‖ ≤ K̄ + K_1.
Thus in both cases ‖x_{i+1} − x̄‖ ≤ K̄ + K_1 and the assumption made for i also holds for i + 1. Therefore (3.124) holds for all i = 0, . . . , T. In view of (3.122) and (3.124), for all i = 0, . . . , T,
‖x_i‖ ≤ 2K̄ + K_1.
(3.130)
Let i ∈ {0, . . . , T − 1}. In view of (3.58), (3.59), (3.66), and (3.130), we apply Lemma 3.10 with P = P_i, K_0 = 3K_1, L_0 = L_1, x = x_i, y = x_{i+1}, z = x̄, ξ = ξ_i and obtain that
2α(f(x_i) − f(x̄)) ≤ ‖x_i − x̄‖² − ‖x_{i+1} − x̄‖² + 2δ_C(K̄ + 3K_1 + L_1 + 3) + α²L_1² + 2αδ_f(3K_1 + K̄ + L_1 + 2).
(3.131)
By (3.131),
Σ_{i=0}^{T−1} α(f(x_i) − f(x̄)) ≤ 2^{-1} Σ_{i=0}^{T−1}(‖x_i − x̄‖² − ‖x_{i+1} − x̄‖²)
+ Tδ_C(K̄ + 3K_1 + L_1 + 3) + Tα²L_1² + Tαδ_f(3K_1 + K̄ + L_1 + 2).
Together with (3.121) and (3.123), the relation above implies that
min{f(x_t) : t = 0, . . . , T − 1} − inf(f, C), f(T^{-1} Σ_{i=0}^{T−1} x_i) − inf(f, C)
≤ T^{-1} Σ_{i=0}^{T−1} f(x_i) − inf(f, C)
≤ (2Tα)^{-1}(K_1 + K̄)² + L_1²α + α^{-1}δ_C(K̄ + 3K_1 + L_1 + 3) + δ_f(3K_1 + K̄ + L_1 + 2).
Theorem 3.6 is proved.
3.9 Proof of Theorem 3.7

Fix x̄ ∈ C_min.
(3.132)
By (3.52)–(3.54) and (3.132),
‖x̄‖ ≤ K̄.
(3.133)
In view of (3.75) and (3.133),
‖x_0 − x̄‖ ≤ K̄ + K_1.
(3.134)
By induction we show that for all i = 0, . . . , T,
‖x_i − x̄‖ ≤ K̄ + K_1.
(3.135)
In view of (3.134), inequality (3.135) holds for i = 0. Assume that i ∈ {0, . . . , T − 1} and that (3.135) is valid. There are two cases: f (xi ) ≤ inf(f, C) + 4;
(3.136)
f (xi ) > inf(f, C) + 4.
(3.137)
Assume that (3.136) holds. In view of (3.52)–(3.54) and (3.136),
‖x_i‖ ≤ K̄.
(3.138)
It follows from (3.68), (3.76), and (3.78) that BX (xi , r0 /2) ⊂ U.
(3.139)
Lemma 3.9, (3.55), (3.138), and (3.139) imply that
∂_Δf(x_i) ⊂ B_X(0, L̄ + 2Δr_0^{-1}).
(3.140)
110
3 Extensions
By (3.77) and (3.140),
‖ξ_i‖ ≤ L̄ + 1 + 2Δr_0^{-1}.
(3.141)
It follows from (3.56), (3.57), (3.70), (3.73), (3.78), (3.133), (3.138), and (3.141) that
‖x_{i+1} − x̄‖ ≤ ‖x_{i+1} − P_i(x_i − αξ_i)‖ + ‖P_i(x_i − αξ_i) − x̄‖ ≤ δ_C + ‖x_i − αξ_i − x̄‖ ≤ δ_C + 2K̄ + α(L̄ + 1 + 2Δr_0^{-1}) ≤ 2K̄ + 1 + α(L̄ + 3) ≤ 2K̄ + 2 ≤ K̄ + K_1.
Assume that (3.137) holds. In view of (3.133) and (3.135),
‖x_i‖ ≤ 2K̄ + K_1.
(3.142)
By (3.68), (3.70), (3.76), and (3.78), equation (3.139) is true. In view of (3.69), (3.70), (3.73), (3.74), (3.77), (3.78), and (3.142), we apply Lemma 3.11 with P = P_i, K_0 = 2K̄ + K_1, L_0 = L_1, r = r_0/2, ξ = ξ_i, x = x_i, y = x_{i+1}, z = x̄ and obtain that
2α(f(x_i) − f(x̄)) ≤ ‖x_i − x̄‖² − ‖x_{i+1} − x̄‖² + 2δ_C(3K̄ + K_1 + L_1 + 3 + 2Δr_0^{-1}) + 2αΔ + (L_1 + 2Δr_0^{-1})²α² + 2αδ_f(K_1 + 3K̄ + L_1 + 2 + 2Δr_0^{-1}).
(3.143)
Together with (3.70)–(3.72), (3.135), and (3.137), this implies that
‖x_{i+1} − x̄‖² ≤ ‖x_i − x̄‖² − 8α + 2δ_C(3K̄ + K_1 + L_1 + 3 + 2Δr_0^{-1}) + 2αΔ + (L_1 + 2Δr_0^{-1})²α² + 2αδ_f(K_1 + 3K̄ + L_1 + 2 + 2Δr_0^{-1}) ≤ ‖x_i − x̄‖² − 8α + 7α < ‖x_i − x̄‖²,
‖x_{i+1} − x̄‖ ≤ ‖x_i − x̄‖ ≤ K̄ + K_1.
Thus in both cases
‖x_{i+1} − x̄‖ ≤ K̄ + K_1
and the assumption made for i also holds for i + 1. Therefore (3.135) holds for all i = 0, . . . , T. In view of (3.133) and (3.135), for all i = 0, . . . , T,
‖x_i‖ ≤ 2K̄ + K_1.
(3.144)
Let i ∈ {0, . . . , T − 1}. By (3.68), (3.70), (3.76), and (3.78), equation (3.139) is true. In view of (3.69), (3.74), (3.77), (3.78), (3.139), and (3.143), we apply Lemma 3.11 with P = P_i, K_0 = K_1 + 2K̄, L_0 = L_1, r = r_0/2, x = x_i, y = x_{i+1}, z = x̄, ξ = ξ_i and obtain that (3.143) is true. By (3.70) and (3.143),
Σ_{i=0}^{T−1} α(f(x_i) − f(x̄)) ≤ 2^{-1} Σ_{i=0}^{T−1}(‖x_i − x̄‖² − ‖x_{i+1} − x̄‖²)
+ Tδ_C(3K̄ + K_1 + L_1 + 2) + αΔT + Tα²(L_1 + 2)² + Tαδ_f(K_1 + 3K̄ + L_1 + 4).
Together with (3.135) the relation above implies that
min{f(x_t) : t = 0, . . . , T − 1} − inf(f, C), f(T^{-1} Σ_{i=0}^{T−1} x_i) − inf(f, C)
≤ T^{-1} Σ_{i=0}^{T−1} f(x_i) − inf(f, C)
≤ (2Tα)^{-1}(K_1 + K̄)² + α^{-1}δ_C(3K̄ + K_1 + L_1 + 2) + Δ + α(L_1 + 2)² + δ_f(K_1 + 3K̄ + L_1 + 4).
Theorem 3.7 is proved.
Chapter 4
Zero-Sum Games with Two Players
In this chapter we study an extension of the projected subgradient method for zero-sum games with two players in the presence of computational errors. In our recent research [77], we show that our algorithm generates a good approximate solution if all the computational errors are bounded from above by a small positive constant. Moreover, if we know the computational errors for our algorithm, we find out which approximate solution can be obtained and how many iterates one needs for this. In this chapter we generalize these results for an extension of the projected subgradient method in which a quasi-nonexpansive retraction onto the feasible set is used instead of the projection onto this set.
4.1 Preliminaries and an Auxiliary Result

Let (X, ⟨·, ·⟩), (Y, ⟨·, ·⟩) be Hilbert spaces equipped with the complete norms ‖·‖ induced by their inner products. Let C be a nonempty closed convex subset of X, D be a nonempty closed convex subset of Y, U be an open convex subset of X, and V be an open convex subset of Y such that
C ⊂ U, D ⊂ V
(4.1)
and let a function f : U × V → R¹ possess the following properties: (i) for each v ∈ V, the function f(·, v) : U → R¹ is convex; (ii) for each u ∈ U, the function f(u, ·) : V → R¹ is concave. Assume that a function φ : R¹ → [0, ∞) is bounded on all bounded sets and that positive numbers M_1, M_2, L_1, L_2 satisfy
C ⊂ B_X(0, M_1),
D ⊂ BY (0, M2 ),
(4.2)
|f(u_1, v) − f(u_2, v)| ≤ L_1‖u_1 − u_2‖ for all v ∈ V and all u_1, u_2 ∈ U,
(4.3)
|f(u, v_1) − f(u, v_2)| ≤ L_2‖v_1 − v_2‖ for all u ∈ U and all v_1, v_2 ∈ V.
(4.4)
Let
x_* ∈ C and y_* ∈ D
(4.5)
satisfy
f(x_*, y) ≤ f(x_*, y_*) ≤ f(x, y_*)
(4.6)
for each x ∈ C and each y ∈ D. The following result was obtained in [77].
Proposition 4.1 Let T be a natural number, δ_C, δ_D ∈ (0, 1], {a_t}_{t=0}^{T} ⊂ (0, ∞) and let {b_{t,1}}_{t=0}^{T}, {b_{t,2}}_{t=0}^{T} ⊂ (0, ∞). Assume that {x_t}_{t=0}^{T+1} ⊂ U, {y_t}_{t=0}^{T+1} ⊂ V, for each t ∈ {0, . . . , T + 1},
B_X(x_t, δ_C) ∩ C ≠ ∅, B_Y(y_t, δ_D) ∩ D ≠ ∅,
for each z ∈ C and each t ∈ {0, . . . , T},
a_t(f(x_t, y_t) − f(z, y_t)) ≤ φ(‖z − x_t‖) − φ(‖z − x_{t+1}‖) + b_{t,1}
and that for each v ∈ D and each t ∈ {0, . . . , T},
a_t(f(x_t, v) − f(x_t, y_t)) ≤ φ(‖v − y_t‖) − φ(‖v − y_{t+1}‖) + b_{t,2}.
Let
x̂_T = (Σ_{i=0}^{T} a_i)^{-1} Σ_{t=0}^{T} a_t x_t, ŷ_T = (Σ_{i=0}^{T} a_i)^{-1} Σ_{t=0}^{T} a_t y_t.
Then
B_X(x̂_T, δ_C) ∩ C ≠ ∅, B_Y(ŷ_T, δ_D) ∩ D ≠ ∅,
|(Σ_{t=0}^{T} a_t)^{-1} Σ_{t=0}^{T} a_t f(x_t, y_t) − f(x_*, y_*)|
≤ (Σ_{t=0}^{T} a_t)^{-1} max{Σ_{t=0}^{T} b_{t,1}, Σ_{t=0}^{T} b_{t,2}}
+ max{L_1δ_C, L_2δ_D} + (Σ_{t=0}^{T} a_t)^{-1} sup{φ(s) : s ∈ [0, max{2M_1, 2M_2} + 1]},
|f(x̂_T, ŷ_T) − (Σ_{t=0}^{T} a_t)^{-1} Σ_{t=0}^{T} a_t f(x_t, y_t)|
≤ (Σ_{t=0}^{T} a_t)^{-1} sup{φ(s) : s ∈ [0, max{2M_1, 2M_2} + 1]}
+ (Σ_{t=0}^{T} a_t)^{-1} max{Σ_{t=0}^{T} b_{t,1}, Σ_{t=0}^{T} b_{t,2}} + max{L_1δ_C, L_2δ_D}
and for each z ∈ C and each v ∈ D,
f(z, ŷ_T) ≥ f(x̂_T, ŷ_T) − 2(Σ_{t=0}^{T} a_t)^{-1} sup{φ(s) : s ∈ [0, max{2M_1, 2M_2} + 1]}
− 2(Σ_{t=0}^{T} a_t)^{-1} max{Σ_{t=0}^{T} b_{t,1}, Σ_{t=0}^{T} b_{t,2}} − max{L_1δ_C, L_2δ_D},
f(x̂_T, v) ≤ f(x̂_T, ŷ_T) + 2(Σ_{t=0}^{T} a_t)^{-1} sup{φ(s) : s ∈ [0, max{2M_1, 2M_2} + 1]}
+ 2(Σ_{t=0}^{T} a_t)^{-1} max{Σ_{t=0}^{T} b_{t,1}, Σ_{t=0}^{T} b_{t,2}} + max{L_1δ_C, L_2δ_D}.
The following corollary was obtained in [77].
Corollary 4.2 Suppose that all the assumptions of Proposition 4.1 hold and that x̃ ∈ C, ỹ ∈ D satisfy
‖x̂_T − x̃‖ ≤ δ_C, ‖ŷ_T − ỹ‖ ≤ δ_D.
Then
|f(x̃, ỹ) − f(x̂_T, ŷ_T)| ≤ L_1δ_C + L_2δ_D
and for each z ∈ C and each v ∈ D,
f(z, ỹ) ≥ f(x̃, ỹ) − 2(Σ_{t=0}^{T} a_t)^{-1} sup{φ(s) : s ∈ [0, max{2M_1, 2M_2} + 1]}
− 2(Σ_{t=0}^{T} a_t)^{-1} max{Σ_{t=0}^{T} b_{t,1}, Σ_{t=0}^{T} b_{t,2}} − 4 max{L_1δ_C, L_2δ_D}
and
f(x̃, v) ≤ f(x̃, ỹ) + 2(Σ_{t=0}^{T} a_t)^{-1} sup{φ(s) : s ∈ [0, max{2M_1, 2M_2} + 1]}
+ 2(Σ_{t=0}^{T} a_t)^{-1} max{Σ_{t=0}^{T} b_{t,1}, Σ_{t=0}^{T} b_{t,2}} + 4 max{L_1δ_C, L_2δ_D}.
4.2 Zero-Sum Games on Bounded Sets

Let (X, ⟨·, ·⟩), (Y, ⟨·, ·⟩) be Hilbert spaces equipped with the complete norms ‖·‖ induced by their inner products. Let C be a nonempty closed convex subset of X, D be a nonempty closed convex subset of Y, U be an open convex subset of X, and V be an open convex subset of Y such that C ⊂ U, D ⊂ V. For each concave function g : V → R¹, each x ∈ V, and each ε > 0, set
∂g(x) = {l ∈ Y : ⟨l, y − x⟩ ≥ g(y) − g(x) for all y ∈ V},
(4.7)
∂_εg(x) = {l ∈ Y : ⟨l, y − x⟩ + ε ≥ g(y) − g(x) for all y ∈ V}.
(4.8)
Clearly, for each x ∈ V and each ε > 0,
∂g(x) = −(∂(−g)(x)),
(4.9)
∂_εg(x) = −(∂_ε(−g)(x)).
(4.10)
Suppose that there exist L_1, L_2, M_1, M_2 > 0 such that
C ⊂ B_X(0, M_1), D ⊂ B_Y(0, M_2),
(4.11)
a function f : U × V → R¹ possesses the following properties: (i) for each v ∈ V, the function f(·, v) : U → R¹ is convex; (ii) for each u ∈ U, the function f(u, ·) : V → R¹ is concave, for each v ∈ V,
|f(u_1, v) − f(u_2, v)| ≤ L_1‖u_1 − u_2‖ for all u_1, u_2 ∈ U
(4.12)
118
4 Zero-Sum Games with Two Players
and that for each u ∈ U , |f (u, v1 ) − f (u, v2 )| ≤ L2 v1 − v2 for all v1 , v2 ∈ V .
(4.13)
For each (ξ, η) ∈ U × V and each ε > 0, set
∂_xf(ξ, η) = {l ∈ X : f(y, η) − f(ξ, η) ≥ ⟨l, y − ξ⟩ for all y ∈ U},
(4.14)
∂_yf(ξ, η) = {l ∈ Y : ⟨l, y − η⟩ ≥ f(ξ, y) − f(ξ, η) for all y ∈ V},
(4.15)
∂_{x,ε}f(ξ, η) = {l ∈ X : f(y, η) − f(ξ, η) + ε ≥ ⟨l, y − ξ⟩ for all y ∈ U},
(4.16)
∂_{y,ε}f(ξ, η) = {l ∈ Y : ⟨l, y − η⟩ + ε ≥ f(ξ, y) − f(ξ, η) for all y ∈ V}.
(4.17)
In view of properties (i) and (ii), (4.12), and (4.13), for each ξ ∈ U and each η ∈ V,
∅ ≠ ∂_xf(ξ, η) ⊂ B_X(0, L_1),
(4.18)
∅ ≠ ∂_yf(ξ, η) ⊂ B_Y(0, L_2).
(4.19)
Let

x_* ∈ C and y_* ∈ D   (4.20)

satisfy

f(x_*, y) ≤ f(x_*, y_*) ≤ f(x, y_*)   (4.21)

for each x ∈ C and each y ∈ D. Denote by M_U the set of all mappings P : X → X such that

P x = x, x ∈ C,
‖P x − z‖ ≤ ‖x − z‖ for all x ∈ X and all z ∈ C,

and by M_V the set of all mappings P : Y → Y such that

P y = y, y ∈ D,
‖P y − z‖ ≤ ‖y − z‖ for all y ∈ Y and all z ∈ D.

Let δ_{f,1}, δ_{f,2}, δ_C, δ_D ∈ (0, 1] and {α_k}_{k=0}^∞ ⊂ (0, ∞). Let us describe our algorithm.

Subgradient Projection Algorithm for Zero-Sum Games

Initialization: select arbitrary x_0 ∈ U and y_0 ∈ V.

Iterative step: given the current iteration vectors x_t ∈ U and y_t ∈ V, calculate

ξ_t ∈ ∂_x f(x_t, y_t) + B_X(0, δ_{f,1}), η_t ∈ ∂_y f(x_t, y_t) + B_Y(0, δ_{f,2})

and the next pair of iteration vectors x_{t+1} ∈ U, y_{t+1} ∈ V such that

‖x_{t+1} − P_t(x_t − α_t ξ_t)‖ ≤ δ_C, ‖y_{t+1} − Q_t(y_t + α_t η_t)‖ ≤ δ_D,

where P_t ∈ M_U, Q_t ∈ M_V.

In this chapter we prove the following result.

Theorem 4.3 Let δ_{f,1}, δ_{f,2}, δ_C, δ_D ∈ (0, 1], {α_k}_{k=0}^∞ ⊂ (0, ∞),

{P_t}_{t=0}^∞ ⊂ M_U, P_t(X) = C, t = 0, 1, . . . ,
(4.22)
{Q_t}_{t=0}^∞ ⊂ M_V, Q_t(Y) = D, t = 0, 1, . . . .
(4.23)
Assume that {x_t}_{t=0}^∞ ⊂ U, {y_t}_{t=0}^∞ ⊂ V, {ξ_t}_{t=0}^∞ ⊂ X, {η_t}_{t=0}^∞ ⊂ Y,

B_X(x_0, δ_C) ∩ C ≠ ∅, B_Y(y_0, δ_D) ∩ D ≠ ∅
(4.24)
and that for each integer t ≥ 0, ξt ∈ ∂x f (xt , yt ) + BX (0, δf,1 ),
(4.25)
ηt ∈ ∂y f (xt , yt ) + BY (0, δf,2 ),
(4.26)
‖x_{t+1} − P_t(x_t − α_t ξ_t)‖ ≤ δ_C
(4.27)
and ‖y_{t+1} − Q_t(y_t + α_t η_t)‖ ≤ δ_D.
(4.28)
For each integer t ≥ 0 set

b_{t,1} = α_t² L_1² + δ_C(2M_1 + L_1 + 3) + α_t δ_{f,1}(2M_1 + L_1 + 2),
(4.29)
b_{t,2} = α_t² L_2² + δ_D(2M_2 + L_2 + 3) + α_t δ_{f,2}(2M_2 + L_2 + 2).
(4.30)
For each natural number T set

x̂_T = (Σ_{t=0}^T α_t)^{-1} Σ_{t=0}^T α_t x_t,   (4.31)

ŷ_T = (Σ_{t=0}^T α_t)^{-1} Σ_{t=0}^T α_t y_t.   (4.32)
Then for each natural number T,

B_X(x̂_T, δ_C) ∩ C ≠ ∅, B_Y(ŷ_T, δ_D) ∩ D ≠ ∅,

|(Σ_{t=0}^T α_t)^{-1} Σ_{t=0}^T α_t f(x_t, y_t) − f(x_*, y_*)|
≤ (Σ_{t=0}^T α_t)^{-1} max{Σ_{t=0}^T b_{t,1}, Σ_{t=0}^T b_{t,2}} + max{L_1 δ_C, L_2 δ_D}
+ 2(Σ_{t=0}^T α_t)^{-1}(max{2M_1, 2M_2} + 1)²,

|f(x̂_T, ŷ_T) − (Σ_{t=0}^T α_t)^{-1} Σ_{t=0}^T α_t f(x_t, y_t)|
≤ 2(Σ_{t=0}^T α_t)^{-1}(max{2M_1, 2M_2} + 1)² + max{L_1 δ_C, L_2 δ_D}
+ (Σ_{t=0}^T α_t)^{-1} max{Σ_{t=0}^T b_{t,1}, Σ_{t=0}^T b_{t,2}}

and for each z ∈ C and each v ∈ D,

f(z, ŷ_T) ≥ f(x̂_T, ŷ_T) − (Σ_{t=0}^T α_t)^{-1}(max{2M_1, 2M_2} + 1)² − max{L_1 δ_C, L_2 δ_D}
− 2(Σ_{t=0}^T α_t)^{-1} max{Σ_{t=0}^T b_{t,1}, Σ_{t=0}^T b_{t,2}},

f(x̂_T, v) ≤ f(x̂_T, ŷ_T) + (Σ_{t=0}^T α_t)^{-1}(max{2M_1, 2M_2} + 1)² + max{L_1 δ_C, L_2 δ_D}
+ 2(Σ_{t=0}^T α_t)^{-1} max{Σ_{t=0}^T b_{t,1}, Σ_{t=0}^T b_{t,2}}.
Proof By (4.11), (4.22)–(4.24), (4.27), and (4.28), for all integers t ≥ 0,

‖x_t‖ ≤ M_1 + 1, ‖y_t‖ ≤ M_2 + 1.
(4.33)
Let t ≥ 0 be an integer. Applying Lemma 3.4 with P = P_t, δ_f = δ_{f,1}, α = α_t, x = x_t, f = f(·, y_t), ξ = ξ_t, y = x_{t+1}, we obtain that for each z ∈ C,

α_t(f(x_t, y_t) − f(z, y_t)) ≤ 2^{-1}‖z − x_t‖² − 2^{-1}‖z − x_{t+1}‖² + α_t² L_1² + δ_C(2M_1 + L_1 + 3) + α_t δ_{f,1}(2M_1 + L_1 + 2)
≤ 2^{-1}‖z − x_t‖² − 2^{-1}‖z − x_{t+1}‖² + b_{t,1}.

Applying Lemma 3.4 with P = Q_t, α = α_t, x = y_t, f = −f(x_t, ·), ξ = −η_t, y = y_{t+1}, δ_f = δ_{f,2}, we obtain that for each v ∈ D,

α_t(f(x_t, v) − f(x_t, y_t)) ≤ 2^{-1}‖v − y_t‖² − 2^{-1}‖v − y_{t+1}‖² + δ_D(2M_2 + L_2 + 3) + α_t δ_{f,2}(2M_2 + L_2 + 2) + α_t² L_2²
≤ 2^{-1}‖v − y_t‖² − 2^{-1}‖v − y_{t+1}‖² + b_{t,2}.

Define φ(s) = 2^{-1}s², s ∈ R^1. It is easy to see that all the assumptions of Proposition 4.1 hold, and Proposition 4.1 implies Theorem 4.3.

Theorem 4.4 Let r_1, r_2 > 0,

B_X(z, 2r_1) ⊂ U for all z ∈ C,
(4.34)
BY (u, 2r2 ) ⊂ V for all u ∈ D,
(4.35)
Δ1 , Δ2 > 0, δf,1 , δf,2 , δC , δD ∈ (0, 1], δC ≤ r1 , δD ≤ r2 ,
(4.36)
{P_t}_{t=0}^∞ ⊂ M_U, {Q_t}_{t=0}^∞ ⊂ M_V,
(4.37)
and {α_t}_{t=0}^∞ ⊂ (0, 1],
Pt (X) = C, t = 0, 1, . . . , Qt (Y ) = D, t = 0, 1, . . . .
(4.38)
Assume that {x_t}_{t=0}^∞ ⊂ U, {y_t}_{t=0}^∞ ⊂ V, {ξ_t}_{t=0}^∞ ⊂ X, {η_t}_{t=0}^∞ ⊂ Y,

B_X(x_0, δ_C) ∩ C ≠ ∅, B_Y(y_0, δ_D) ∩ D ≠ ∅
(4.39)
and that for each integer t ≥ 0,

B_X(ξ_t, δ_{f,1}) ∩ ∂_{x,Δ_1} f(x_t, y_t) ≠ ∅, B_Y(η_t, δ_{f,2}) ∩ ∂_{y,Δ_2} f(x_t, y_t) ≠ ∅,
(4.40)
‖x_{t+1} − P_t(x_t − α_t ξ_t)‖ ≤ δ_C and ‖y_{t+1} − Q_t(y_t + α_t η_t)‖ ≤ δ_D.

For each integer t ≥ 0 set

b_{t,1} = α_t Δ_1 + 2^{-1}α_t²(L_1 + Δ_1 r_1^{-1})
(4.41)
+ δ_C(2M_1 + L_1 + 3 + Δ_1 r_1^{-1}) + α_t δ_{f,1}(2M_1 + L_1 + 2 + Δ_1 r_1^{-1}),

b_{t,2} = α_t Δ_2 + 2^{-1}α_t²(L_2 + Δ_2 r_2^{-1}) + δ_D(2M_2 + L_2 + 3 + Δ_2 r_2^{-1}) + α_t δ_{f,2}(2M_2 + L_2 + 2 + Δ_2 r_2^{-1}).

For each natural number T set

x̂_T = (Σ_{t=0}^T α_t)^{-1} Σ_{t=0}^T α_t x_t,

ŷ_T = (Σ_{t=0}^T α_t)^{-1} Σ_{t=0}^T α_t y_t.
Then for each natural number T,

B_X(x̂_T, δ_C) ∩ C ≠ ∅, B_Y(ŷ_T, δ_D) ∩ D ≠ ∅,

|(Σ_{t=0}^T α_t)^{-1} Σ_{t=0}^T α_t f(x_t, y_t) − f(x_*, y_*)|
≤ (Σ_{t=0}^T α_t)^{-1} max{Σ_{t=0}^T b_{t,1}, Σ_{t=0}^T b_{t,2}} + max{L_1 δ_C, L_2 δ_D}
+ 2(Σ_{t=0}^T α_t)^{-1}(max{2M_1, 2M_2} + 1)²,

|f(x̂_T, ŷ_T) − (Σ_{t=0}^T α_t)^{-1} Σ_{t=0}^T α_t f(x_t, y_t)|
≤ 2(Σ_{t=0}^T α_t)^{-1}(max{2M_1, 2M_2} + 1)² + max{L_1 δ_C, L_2 δ_D}
+ (Σ_{t=0}^T α_t)^{-1} max{Σ_{t=0}^T b_{t,1}, Σ_{t=0}^T b_{t,2}}

and for each z ∈ C and each v ∈ D,

f(z, ŷ_T) ≥ f(x̂_T, ŷ_T) − (Σ_{t=0}^T α_t)^{-1}(max{2M_1, 2M_2} + 1)² − max{L_1 δ_C, L_2 δ_D}
− 2(Σ_{t=0}^T α_t)^{-1} max{Σ_{t=0}^T b_{t,1}, Σ_{t=0}^T b_{t,2}},

f(x̂_T, v) ≤ f(x̂_T, ŷ_T) + (Σ_{t=0}^T α_t)^{-1}(max{2M_1, 2M_2} + 1)² + max{L_1 δ_C, L_2 δ_D}
+ 2(Σ_{t=0}^T α_t)^{-1} max{Σ_{t=0}^T b_{t,1}, Σ_{t=0}^T b_{t,2}}.
Proof By (4.11) and (4.39)–(4.41), for all integers t ≥ 0,

‖x_t‖ ≤ M_1 + 1, ‖y_t‖ ≤ M_2 + 1, B_X(x_t, δ_C) ∩ C ≠ ∅, B_Y(y_t, δ_D) ∩ D ≠ ∅.

In view of (4.34)–(4.36) and (4.39),

B_X(x_0, r_1) ⊂ U, B_Y(y_0, r_2) ⊂ V.

It follows from (4.34)–(4.36), (4.38), (4.40), and (4.41) that for all integers t ≥ 1,

B_X(x_t, r_1) ⊂ B_X(P_{t-1}(x_{t-1} − α_{t-1} ξ_{t-1}), δ_C + r_1) ⊂ B_X(P_{t-1}(x_{t-1} − α_{t-1} ξ_{t-1}), 2r_1) ⊂ U,

B_Y(y_t, r_2) ⊂ B_Y(Q_{t-1}(y_{t-1} + α_{t-1} η_{t-1}), δ_D + r_2) ⊂ B_Y(Q_{t-1}(y_{t-1} + α_{t-1} η_{t-1}), 2r_2) ⊂ V.

Let t ≥ 0 be an integer. Applying Lemma 3.5 with r = r_1, Δ = Δ_1, P = P_t, α = α_t, δ_f = δ_{f,1}, x = x_t, f = f(·, y_t), ξ = ξ_t, y = x_{t+1}, we obtain that for each z ∈ C,
2α_t(f(x_t, y_t) − f(z, y_t)) ≤ ‖z − x_t‖² − ‖z − x_{t+1}‖² + 2α_t Δ_1
+ δ_C² + α_t²(L_1 + Δ_1 r_1^{-1}) + 2δ_C(2M_1 + L_1 + 2 + Δ_1 r_1^{-1}) + 2α_t δ_{f,1}(2M_1 + L_1 + 2 + Δ_1 r_1^{-1})
≤ ‖z − x_t‖² − ‖z − x_{t+1}‖² + 2b_{t,1}.

Applying Lemma 3.5 with r = r_2, Δ = Δ_2, P = Q_t, α = α_t, δ_f = δ_{f,2}, x = y_t, f = −f(x_t, ·), ξ = −η_t, y = y_{t+1}, we obtain that for each v ∈ D,

2α_t(f(x_t, v) − f(x_t, y_t)) ≤ ‖v − y_t‖² − ‖v − y_{t+1}‖² + 2α_t Δ_2
+ δ_D² + 2δ_D(2M_2 + L_2 + 2 + Δ_2 r_2^{-1}) + 2α_t δ_{f,2}(2M_2 + L_2 + 2 + Δ_2 r_2^{-1}) + α_t²(L_2 + Δ_2 r_2^{-1})
≤ ‖v − y_t‖² − ‖v − y_{t+1}‖² + 2b_{t,2}.

Define φ(s) = 2^{-1}s², s ∈ R^1. It is easy to see that all the assumptions of Proposition 4.1 hold, and Proposition 4.1 implies Theorem 4.4.

Theorems 4.3 and 4.4 are new. We are interested in the optimal choice of α_t, t = 0, 1, . . . , T. Let T be a natural number and let A_T = Σ_{t=0}^T α_t be given. In order to make the best choice of α_t, t = 0, . . . , T, we need to minimize the function Σ_{t=0}^T α_t² on the set

{α = (α_0, . . . , α_T) ∈ R^{T+1} : α_i ≥ 0, i = 0, . . . , T, Σ_{i=0}^T α_i = A_T}.
By Lemma 2.3 of [75], this function has a unique minimizer α_i = (T + 1)^{-1} A_T, i = 0, . . . , T. Now let T be a natural number and α_t = α for all t = 0, . . . , T; we will find the best constant step α > 0. To meet this goal we need to choose α as a minimizer of the function
((T + 1)α)^{-1}(max{2M_1, 2M_2} + 1)² + 2α^{-1}(T + 1)^{-1} max{Σ_{t=0}^T b_{t,1}, Σ_{t=0}^T b_{t,2}}

= ((T + 1)α)^{-1}(max{2M_1, 2M_2} + 1)²
+ 2α^{-1}(T + 1)^{-1} max{(T + 1)(αΔ_1 + δ_C(2M_1 + 3 + L_1 + Δ_1 r_1^{-1}) + 2^{-1}α²(L_1 + Δ_1 r_1^{-1}) + αδ_{f,1}(2M_1 + 2 + L_1 + Δ_1 r_1^{-1})),
(T + 1)(αΔ_2 + δ_D(2M_2 + 3 + L_2 + Δ_2 r_2^{-1}) + 2^{-1}α²(L_2 + Δ_2 r_2^{-1}) + αδ_{f,2}(2M_2 + 2 + L_2 + Δ_2 r_2^{-1}))}

= ((T + 1)α)^{-1}(max{2M_1, 2M_2} + 1)²
+ 2 max{Δ_1 + α^{-1}δ_C(2M_1 + 3 + L_1 + Δ_1 r_1^{-1}) + 2^{-1}α(L_1 + Δ_1 r_1^{-1}) + δ_{f,1}(2M_1 + 2 + L_1 + Δ_1 r_1^{-1}),
Δ_2 + α^{-1}δ_D(2M_2 + 3 + L_2 + Δ_2 r_2^{-1}) + 2^{-1}α(L_2 + Δ_2 r_2^{-1}) + δ_{f,2}(2M_2 + 2 + L_2 + Δ_2 r_2^{-1})}

≤ ((T + 1)α)^{-1}(max{2M_1, 2M_2} + 1)² + 2 max{Δ_1, Δ_2}
+ 2 max{δ_{f,1}(2M_1 + 2 + L_1 + Δ_1 r_1^{-1}), δ_{f,2}(2M_2 + 2 + L_2 + Δ_2 r_2^{-1})}
+ 2α^{-1} max{δ_C(2M_1 + 3 + L_1 + Δ_1 r_1^{-1}), δ_D(2M_2 + 3 + L_2 + Δ_2 r_2^{-1})}
+ α max{L_1 + Δ_1 r_1^{-1}, L_2 + Δ_2 r_2^{-1}}.
Since T can be arbitrarily large, we need to find a minimizer of the function

ψ(α) := 2α^{-1} max{δ_C(2M_1 + 3 + L_1 + Δ_1 r_1^{-1}), δ_D(2M_2 + 3 + L_2 + Δ_2 r_2^{-1})} + α max{L_1 + Δ_1 r_1^{-1}, L_2 + Δ_2 r_2^{-1}}, α > 0.

This function has a unique minimizer

α_* = 2^{1/2} max{δ_C(2M_1 + 3 + L_1 + Δ_1 r_1^{-1}), δ_D(2M_2 + 3 + L_2 + Δ_2 r_2^{-1})}^{1/2} × max{L_1 + Δ_1 r_1^{-1}, L_2 + Δ_2 r_2^{-1}}^{-1/2}
and

ψ(α_*) = 2^{3/2} max{δ_C(2M_1 + 3 + L_1 + Δ_1 r_1^{-1}), δ_D(2M_2 + 3 + L_2 + Δ_2 r_2^{-1})}^{1/2} × max{L_1 + Δ_1 r_1^{-1}, L_2 + Δ_2 r_2^{-1}}^{1/2}.
For the appropriate choice of T, it should be of the same order as max{δ_C, δ_D}^{-1}.
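To make the scheme of this section concrete, here is a minimal numerical sketch of the subgradient projection algorithm for a toy bilinear zero-sum game f(x, y) = ⟨x, Ay⟩ on unit balls. All choices below (the matrix A, the exact projections and subgradients, i.e. δ_{f,1} = δ_{f,2} = δ_C = δ_D = 0, and the constant step α_t = α) are illustrative assumptions, not taken from the text; the equal-weight averages play the role of x̂_T, ŷ_T.

```python
import numpy as np

def project_ball(z, radius=1.0):
    # Metric projection onto B(0, radius): it fixes the ball and is
    # nonexpansive, so it belongs to the classes M_U and M_V above.
    n = np.linalg.norm(z)
    return z if n <= radius else z * (radius / n)

def saddle_subgradient(A, steps, alpha):
    # Projected subgradient iteration for f(x, y) = <x, A y>,
    # which is convex (linear) in x and concave (linear) in y.
    x = project_ball(np.array([1.0, 0.0]))
    y = project_ball(np.array([0.0, 1.0]))
    xs, ys = [], []
    for _ in range(steps):
        xi = A @ y                          # element of the subdifferential in x
        eta = A.T @ x                       # element of the superdifferential in y
        x = project_ball(x - alpha * xi)    # descent step in x
        y = project_ball(y + alpha * eta)   # ascent step in y
        xs.append(x)
        ys.append(y)
    # constant steps alpha_t = alpha make the weighted averages plain means
    return np.mean(xs, axis=0), np.mean(ys, axis=0)

A = np.array([[1.0, -1.0], [-1.0, 1.0]])    # saddle value over unit balls is 0
x_hat, y_hat = saddle_subgradient(A, steps=5000, alpha=0.02)
print(abs(x_hat @ A @ y_hat) < 0.15)
```

The ergodic averages land near the saddle point, in line with the O((Σ α_t)^{-1} + α) error terms of Theorems 4.3 and 4.4.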
Chapter 5
Quasiconvex Optimization
In this chapter we study an extension of the projected subgradient method to the minimization of quasiconvex nonsmooth functions in the presence of computational errors. The problem is described by an objective function and a set of feasible points. We extend some of the results of Chapter 2 and show that our algorithm generates a good approximate solution if all the computational errors are bounded from above by a small positive constant. Moreover, if we know the computational errors for the two steps of our algorithm, we find out which approximate solution can be obtained and how many iterates one needs for this.
5.1 Preliminaries

Let (X, ⟨·,·⟩) be a Hilbert space with an inner product ⟨·,·⟩ which induces a complete norm ‖·‖. We use the notation and definitions introduced in Chapter 2. For each x ∈ X and each nonempty set A ⊂ X, set

d(x, A) = inf{‖x − y‖ : y ∈ A}.

For each x ∈ X and each r > 0, set

B_X(x, r) = {y ∈ X : ‖x − y‖ ≤ r}.

The boundary of a set E ⊂ X is denoted by bd(E) and the closure of E is denoted by cl(E). Let C be a closed nonempty subset of the space X and U be an open convex subset of X such that

C ⊂ U.   (5.1)

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 A. J. Zaslavski, The Projected Subgradient Algorithm in Convex Optimization, SpringerBriefs in Optimization, https://doi.org/10.1007/978-3-030-60300-7_5
130
5 Quasiconvex Optimization
Suppose that f : U → R 1 is a continuous function such that f (λx + (1 − λ)y) ≤ max{f (x), f (y)}
(5.2)
for all x, y ∈ U and all λ ∈ [0, 1]. In other words, the function f is quasiconvex. We consider the minimization problem

f(x) → min, x ∈ C.

It should be mentioned that quasiconvex optimization problems are studied in [19, 38–43, 77]. We suppose that inf(f, C) := inf{f(x) : x ∈ C} is finite and that

C_min = {x ∈ C : f(x) = inf(f, C)} ≠ ∅.
(5.3)
Fix

x* ∈ C_min.   (5.4)

Set

S = {x ∈ X : ‖x‖ = 1}.   (5.5)
For each x ∈ U satisfying f(x) > inf{f(z) : z ∈ U}, define

∂*f(x) = {l ∈ X : ⟨l, y − x⟩ ≤ 0 for every y ∈ U satisfying f(y) < f(x)}.   (5.6)

This set is a closed convex cone and it is called the subdifferential of f at the point x. Its elements are called subgradients. Let ε ≥ 0. For each x ∈ U satisfying f(x) − ε > inf{f(z) : z ∈ U}, define

∂*_ε f(x) = {l ∈ X : ⟨l, y − x⟩ ≤ 0 for every y ∈ U satisfying f(y) < f(x) − ε}.   (5.7)

This set is a closed convex cone and it is called the ε-subdifferential of f at the point x. Its elements are called ε-subgradients.
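The defining inequality of ∂*f(x) can be checked numerically on a simple quasiconvex function. In the sketch below, f(x) = ‖x − c‖^{1/2} (a monotone transform of a convex function, hence quasiconvex) and g = (x − c)/‖x − c‖ lies in ∂*f(x) ∩ S; the function, point c, and sampling box are illustrative assumptions, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
c = np.array([1.0, -2.0, 0.5])

def f(x):
    # quasiconvex: a monotone transform of the convex map x -> ||x - c||
    return np.sqrt(np.linalg.norm(x - c))

def normalized_subgradient(x):
    # g = (x - c)/||x - c|| satisfies <g, y - x> <= 0 whenever f(y) < f(x),
    # i.e. whenever ||y - c|| < ||x - c||, so g is in the cone (5.6) and in S
    g = x - c
    return g / np.linalg.norm(g)

x = np.array([4.0, 1.0, -3.0])
g = normalized_subgradient(x)
violations = 0
for _ in range(10000):
    y = c + rng.uniform(-5, 5, size=3)
    if f(y) < f(x) and g @ (y - x) > 1e-12:
        violations += 1
print(violations)  # prints 0
```

Indeed, ⟨g, y − x⟩ = (⟨x − c, y − c⟩ − ‖x − c‖²)/‖x − c‖ ≤ ‖y − c‖ − ‖x − c‖ < 0 whenever f(y) < f(x), so no violations occur.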
Proposition 5.1 Let x ∈ U, L > 0, β > 0, ε ≥ 0, M_0 > 0,

x, x* ∈ B_X(0, M_0),   (5.8)

f(x*) < f(x) − ε   (5.9)

and let

|f(z) − f(x*)| ≤ L‖z − x*‖^β for all z ∈ B_X(0, 3M_0 + 1) ∩ U.   (5.10)

Then for all g ∈ ∂*_ε f(x) ∩ S,

f(x) − f(x*) ≤ L⟨g, x − x*⟩^β + ε.

Proof Let

g ∈ ∂*_ε f(x) ∩ S.
(5.11)
In view of the continuity and quasiconvexity of the function f, the set

E := {z ∈ U : f(z) < f(x) − ε}
(5.12)
is open and convex. By (5.9) and (5.12), x ∗ ∈ E.
(5.13)
By the continuity of f, (5.12) and (5.13) imply that bd(E) ≠ ∅ and that

{λx + (1 − λ)x* : λ ∈ [0, 1]} ∩ bd(E) ≠ ∅.
(5.14)
Set

r = inf{‖x* − u‖ : u ∈ bd(E) ∩ U}.   (5.15)
It follows from (5.8), (5.14), and (5.15) that

r ≤ ‖x* − x‖ ≤ 2M_0.
(5.16)
There exists a sequence {u_k}_{k=1}^∞ ⊂ bd(E) ∩ U such that

‖x* − u_k‖ ≤ r + k^{-1}, k = 1, 2, . . . .
(5.17)
The openness of E implies that

f(x) ≤ f(u) + ε for all u ∈ bd(E) ∩ U.
(5.18)
In view of (5.18), for every integer k ≥ 1,

f(x) − f(x*) ≤ f(u_k) − f(x*) + ε.
(5.19)
Let k ≥ 1 be an integer. By (5.8), (5.16), and (5.17),

‖u_k‖ ≤ r + k^{-1} + ‖x*‖ ≤ 3M_0 + 1.
(5.20)
It follows from (5.8), (5.10), (5.17), (5.19), and (5.20) that

f(x) − f(x*) ≤ f(u_k) − f(x*) + ε ≤ L‖u_k − x*‖^β + ε ≤ L(r + k^{-1})^β + ε
and, letting k → ∞,

f(x) − f(x*) ≤ Lr^β + ε.
(5.21)
In view of (5.15),

{z ∈ U : ‖z − x*‖ < r} ⊂ E.   (5.22)
Let k be a natural number. By (5.11) and (5.22), x ∗ + (1 − k −1 )rg ∈ E.
(5.23)
It follows from (5.11), (5.12), and (5.23) that

(1 − k^{-1})r − ⟨g, x − x*⟩ = (1 − k^{-1})r‖g‖² − ⟨g, x − x*⟩ = ⟨g, x* + (1 − k^{-1})rg − x⟩ ≤ 0

and, letting k → ∞,

r ≤ ⟨g, x − x*⟩.

Together with (5.21) this implies that

f(x) − f(x*) ≤ L⟨g, x − x*⟩^β + ε.

Proposition 5.1 is proved.

This chapter contains two main results, Theorems 5.3 and 5.4, which are proved in Sections 5.3 and 5.4, respectively. These results are new, and their proofs are based on our main Lemma 5.2, which is proved in Section 5.2.
5.2 The Main Lemma

Lemma 5.2 Let M_0 > 0, L > 0, β > 0, δ_f, δ_C ∈ (0, 1], Δ ≥ 0, α ∈ (0, 1], x ∈ U and let a mapping P : X → X satisfy

P y = y, y ∈ C,   (5.24)

‖P y − z‖ ≤ ‖y − z‖ for all y ∈ X and all z ∈ C.   (5.25)

Assume that

‖x*‖ ≤ M_0, ‖x‖ ≤ M_0,   (5.26)

|f(z) − f(x*)| ≤ L‖z − x*‖^β, z ∈ B_X(0, 3M_0 + 1) ∩ U,   (5.27)

f(x) > inf(f, C) + Δ,   (5.28)

ξ ∈ X satisfies

B_X(ξ, δ_f) ∩ ∂*_Δ f(x) ∩ S ≠ ∅   (5.29)

and that y ∈ U satisfies

‖y − P(x − αξ)‖ ≤ δ_C.   (5.30)

Then

‖y − x*‖² ≤ ‖x − x*‖² + α²(1 + δ_f²) + 2αδ_f(2M_0 + 1) + δ_C² + 2δ_C(2M_0 + 2) − 2αL^{-1/β}(f(x) − inf(f, C) − Δ)^{1/β}.
Proof In view of (5.29), there exists

g ∈ ∂*_Δ f(x) ∩ S   (5.31)

such that

‖g − ξ‖ ≤ δ_f.   (5.32)
Proposition 5.1 and (5.31) imply that

f(x) − inf(f, C) ≤ L⟨g, x − x*⟩^β + Δ.
(5.33)
By (5.3), (5.4), (5.25), (5.26), and (5.30)–(5.32),

‖y − x*‖² ≤ (‖y − P(x − αξ)‖ + ‖P(x − αξ) − x*‖)²
≤ ‖y − P(x − αξ)‖² + ‖P(x − αξ) − x*‖² + 2‖y − P(x − αξ)‖ ‖P(x − αξ) − x*‖
≤ δ_C² + ‖P(x − αξ) − x*‖² + 2δ_C ‖P(x − αξ) − x*‖
≤ δ_C² + ‖x − αξ − x*‖² + 2δ_C ‖x − αξ − x*‖
≤ δ_C² + ‖x − αξ − x*‖² + 2δ_C(2M_0 + 2).
(5.34)
It follows from (5.26), (5.28), and (5.31)–(5.33) that

‖x − αξ − x*‖² = ‖x − αg + (αg − αξ) − x*‖²
≤ ‖x − αg − x*‖² + α²‖g − ξ‖² + 2α‖g − ξ‖ ‖x − αg − x*‖
≤ ‖x − αg − x*‖² + α²δ_f² + 2αδ_f(2M_0 + 1)
≤ α²δ_f² + 2αδ_f(2M_0 + 1) + ‖x − x*‖² + α² − 2α⟨g, x − x*⟩
≤ α²δ_f² + 2αδ_f(2M_0 + 1) + ‖x − x*‖² + α² − 2αL^{-1/β}(f(x) − inf(f, C) − Δ)^{1/β}.
Together with (5.34) this implies that

‖y − x*‖² ≤ ‖x − x*‖² + α²(1 + δ_f²) + 2αδ_f(2M_0 + 1) − 2αL^{-1/β}(f(x) − inf(f, C) − Δ)^{1/β} + δ_C² + 2δ_C(2M_0 + 2).

Lemma 5.2 is proved.
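The one-step estimate of Lemma 5.2 can be sanity-checked numerically on a toy instance. All choices below are illustrative assumptions, not from the text: f(x) = ‖x − c‖^{1/2} on R² (so one may take L = 1 and β = 1/2), C the closed unit ball with P the metric projection, Δ = 0, and x* = (1, 0) the minimizer of f over C.

```python
import numpy as np

rng = np.random.default_rng(1)
c = np.array([3.0, 0.0])
f = lambda x: np.linalg.norm(x - c) ** 0.5     # quasiconvex; L = 1, beta = 1/2
proj = lambda z: z if np.linalg.norm(z) <= 1.0 else z / np.linalg.norm(z)
x_star = np.array([1.0, 0.0])                  # minimizer of f over the unit ball C
inf_fC, M0 = f(x_star), 1.0

ok = True
for _ in range(1000):
    x = rng.uniform(-0.7, 0.7, 2)              # ||x|| <= M0 and f(x) > inf(f, C)
    alpha = rng.uniform(0.01, 1.0)
    delta_f, delta_C = rng.uniform(0.001, 0.2, 2)
    g = (x - c) / np.linalg.norm(x - c)        # unit subgradient, g in S
    u = rng.normal(size=2); u /= np.linalg.norm(u)
    xi = g + delta_f * rng.uniform(0, 1) * u   # ||xi - g|| <= delta_f
    w = rng.normal(size=2); w /= np.linalg.norm(w)
    y = proj(x - alpha * xi) + delta_C * rng.uniform(0, 1) * w
    lhs = np.linalg.norm(y - x_star) ** 2
    rhs = (np.linalg.norm(x - x_star) ** 2 + alpha ** 2 * (1 + delta_f ** 2)
           + 2 * alpha * delta_f * (2 * M0 + 1) + delta_C ** 2
           + 2 * delta_C * (2 * M0 + 2)
           # L^(-1/beta) = 1 and 1/beta = 2 for this choice of f
           - 2 * alpha * (f(x) - inf_fC) ** 2)
    ok = ok and lhs <= rhs + 1e-9
print(ok)  # prints True
```

Every random draw of the step size, subgradient error, and projection error satisfies the inequality, as the lemma guarantees.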
5.3 Optimization on Bounded Sets

Denote by M the set of all mappings P : X → C such that

P z = z, z ∈ C,   (5.35)

‖P x − z‖ ≤ ‖x − z‖ for all x ∈ X and all z ∈ C.   (5.36)

Theorem 5.3 Let M > 1, L > 0, β > 0, δ_f, δ_C ∈ (0, 1], Δ ≥ 0, T be a natural number, {P_t}_{t=0}^{T-1} ⊂ M, {α_t}_{t=0}^{T-1} ⊂ (0, 1],
C ⊂ BX (0, M − 1),
(5.37)
|f(z) − f(x*)| ≤ L‖z − x*‖^β, z ∈ B_X(0, 3M + 1) ∩ U
(5.38)
and let

ε = Δ + (2^{-1}L^{1/β}(2M + 3)²(Σ_{t=0}^{T-1} α_t)^{-1} + L^{1/β}(Σ_{t=0}^{T-1} α_t²)(Σ_{t=0}^{T-1} α_t)^{-1} + L^{1/β}δ_f(2M + 1) + L^{1/β}δ_C(2M + 3)T(Σ_{t=0}^{T-1} α_t)^{-1})^β.   (5.39)

Assume that {x_t}_{t=0}^T ⊂ U, {ξ_t}_{t=0}^{T-1} ⊂ X,
B_X(x_0, δ_C) ∩ C ≠ ∅
(5.40)
and that for all t = 0, . . . , T − 1,

B_X(ξ_t, δ_f) ∩ ∂*_Δ f(x_t) ∩ S ≠ ∅,
(5.41)
‖x_{t+1} − P_t(x_t − α_t ξ_t)‖ ≤ δ_C.
(5.42)
Then

min{f(x_t) : t = 0, . . . , T − 1} ≤ inf(f, C) + ε.
(5.43)
Proof Clearly, ∂*_Δ f(z) is well defined whenever f(z) − Δ > inf{f(x) : x ∈ U}. We may assume without loss of generality that

f(x_t) − Δ > inf{f(x) : x ∈ U}, t = 0, . . . , T − 1.

We show that (5.43) holds. Assume the contrary. Then

f(x_t) > ε + inf(f, C) = ε + f(x*), t = 0, . . . , T − 1.
(5.44)
Let t ∈ {0, . . . , T − 1}. By (5.35)–(5.42), (5.44), and Lemma 5.2 applied with α = α_t, P = P_t, x = x_t, ξ = ξ_t, y = x_{t+1}, and M_0 = M,

‖x_{t+1} − x*‖² ≤ ‖x_t − x*‖² + 2α_t² + 2α_t δ_f(2M + 1)
− 2α_t L^{-1/β}(ε − Δ)^{1/β} + 2δ_C(2M + 3)

and

‖x_t − x*‖² − ‖x_{t+1} − x*‖² ≥ 2α_t L^{-1/β}(ε − Δ)^{1/β} − 2α_t² − 2α_t δ_f(2M + 1) − 2δ_C(2M + 3).
(5.45)
It follows from (5.37), (5.40), and (5.45) that

(2M + 2)² > ‖x_0 − x*‖² ≥ ‖x_0 − x*‖² − ‖x_T − x*‖² = Σ_{t=0}^{T-1}(‖x_t − x*‖² − ‖x_{t+1} − x*‖²)
≥ 2L^{-1/β}(ε − Δ)^{1/β} Σ_{t=0}^{T-1} α_t − 2 Σ_{t=0}^{T-1} α_t² − 2δ_f(2M + 1) Σ_{t=0}^{T-1} α_t − 2δ_C(2M + 3)T

and

(ε − Δ)^{1/β} < 2^{-1}L^{1/β}(2M + 3)²(Σ_{t=0}^{T-1} α_t)^{-1} + L^{1/β}(Σ_{t=0}^{T-1} α_t²)(Σ_{t=0}^{T-1} α_t)^{-1} + L^{1/β}δ_f(2M + 1) + L^{1/β}δ_C(2M + 3)T(Σ_{t=0}^{T-1} α_t)^{-1}.
This contradicts (5.39). The contradiction we have reached completes the proof of Theorem 5.3.

Let T be a natural number and suppose that Σ_{t=0}^{T-1} α_t = A in Theorem 5.3 is given. We need to choose α_t ≥ 0, t = 0, . . . , T − 1, in order to minimize ε. Clearly, we need to minimize the function Σ_{t=0}^{T-1} α_t² over the set

{(α_0, . . . , α_{T-1}) ∈ R^T : α_i ≥ 0, i = 0, . . . , T − 1, Σ_{i=0}^{T-1} α_i = A}.
By Lemma 2.3 of [75], we should choose α_t = AT^{-1}, t = 0, . . . , T − 1. Assume that α_t = α > 0, t = 0, . . . , T − 1. In this case, in view of (5.39),

ε = Δ + (2^{-1}L^{1/β}(2M + 3)²(αT)^{-1} + L^{1/β}α + L^{1/β}δ_f(2M + 1) + L^{1/β}δ_C(2M + 3)α^{-1})^β.
We need to choose α > 0 in order to minimize ε. Since a natural number T can be arbitrarily large, we need to minimize the function

α + δ_C(2M + 3)α^{-1}, α > 0.

Clearly, its minimizer is α = (δ_C(2M + 3))^{1/2}. In this case

ε = Δ + (2^{-1}L^{1/β}(2M + 3)²(δ_C(2M + 3))^{-1/2}T^{-1} + L^{1/β}(δ_C(2M + 3))^{1/2} + L^{1/β}δ_f(2M + 1) + L^{1/β}(δ_C(2M + 3))^{1/2})^β.

Evidently, T should be of the same order as δ_C^{-1}.
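As a sketch of the method analyzed in this section, the following toy run minimizes an illustrative quasiconvex objective over a box (the function f(x) = ‖x − c‖^{1/2}, the box C = [−1, 1]², the point c, and the step size are assumptions chosen for the example, not from the text); the iterate uses the normalized subgradient from ∂*f(x) ∩ S and an exact projection P_t ∈ M.

```python
import numpy as np

def proj_box(z, lo=-1.0, hi=1.0):
    # metric projection onto C = [lo, hi]^n; it satisfies (5.35)-(5.36)
    return np.clip(z, lo, hi)

c = np.array([2.0, 0.5])                     # c lies outside the box
f = lambda x: np.sqrt(np.linalg.norm(x - c)) # quasiconvex objective

def quasiconvex_subgradient_method(x0, alpha, T):
    x = proj_box(x0)
    best = f(x)
    for _ in range(T):
        g = (x - c) / np.linalg.norm(x - c)  # normalized subgradient in S
        x = proj_box(x - alpha * g)          # projected step x_{t+1} = P(x_t - alpha g)
        best = min(best, f(x))
    return best

best = quasiconvex_subgradient_method(np.array([-1.0, -1.0]), alpha=0.01, T=2000)
# the minimizer of f over the box is (1, 0.5), where f = 1.0
print(round(best, 2))
```

The best value found tracks inf(f, C) = 1 up to an error of the order of the step size, in agreement with Theorem 5.3.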
5.4 Optimization on Unbounded Sets

We continue to use the definitions and notation introduced in Section 5.1. Recall that M is the set of all mappings P : X → C which satisfy (5.35) and (5.36). Assume that

lim_{x∈U, ‖x‖→∞} f(x) = ∞.   (5.46)
This means that for each M_0 > 0, there exists M_1 > 0 such that if a point x ∈ U satisfies ‖x‖ ≥ M_1, then f(x) > M_0. Fix

θ_0 ∈ C   (5.47)

and set
U0 = {x ∈ U : f (x) ≤ f (θ0 ) + 4}.
(5.48)
By (5.46) and (5.48), there exists M̄ > 1 such that

U_0 ⊂ B_X(0, M̄).
(5.49)
Assume that there exists a number L̄ > 1 such that

|f(z) − f(x*)| ≤ L̄‖z − x*‖^β, z ∈ B_X(0, M̄ + 4) ∩ U.
(5.50)
Theorem 5.4 Let M ≥ M̄ + 4, L ≥ L̄ + 1, β > 0, δ_f, δ_C ∈ (0, 1], Δ ∈ (0, 1],

0 < α ≤ 4^{-1}L^{-1/β}, δ_C ≤ (6M + 3)^{-1}4^{-1}L^{-1/β}α, δ_f ≤ (6M + 3)^{-1}4^{-1}L^{-1/β}   (5.51)

and

|f(z) − f(x*)| ≤ L‖z − x*‖^β, z ∈ B_X(0, 9M + 4) ∩ U.   (5.52)
Assume that T is a natural number, T −1 ⊂ M, {Pt }t=0
(5.53)
x0 ≤ M,
(5.54)
B(x0 , δC ) ∩ C = ∅
(5.55)
T −1 ⊂ X, {xt }Tt=0 ⊂ U , {ξt }t=0
and that for all t = 0, . . . , T − 1, ∗ f (xt ) ∩ S = ∅, BX (ξt , δf ) ∩ ∂Δ
(5.56)
xt+1 − Pt (xt − αξt ) ≤ δC .
(5.57)
Then min{f (xt ) : t = 0, . . . , T − 1} ≤ inf(f, C) + Δ +L(2M 2 (αT )−1 + α + δC α −1 (6M + 2) + δf (6M + 2))β . Proof Clearly, ∂fΔ∗ (z) is well-defined if f (z) − Δ > inf{f (x) : x ∈ U }. We may assume without loss of generality that
(5.58)
5.4 Optimization on Unbounded Sets
139
f (xt ) − Δ > inf{f (x) : x ∈ U }, t = 0, . . . , T − 1. We show that (5.58) holds. Assume the contrary. Then for all t = 0, . . . , T − 1, f (xt ) − f (x ∗ ) > Δ + L(2M 2 (αT )−1 + α + δC α −1 (6M + 2) + δf (6M + 2))β . (5.59) In view of (5.3), (5.4), (5.49), and (5.54), ¯ x0 − x ∗ ≤ M + M. By induction we show that for all t = 0, . . . , T , ¯ xt − x ∗ ≤ M + M.
(5.60)
Clearly, (5.60) is true for t = 0. Assume that t ∈ {0, . . . , T − 1} and that (5.60) is true. There are two cases:

f(x_t) ≤ inf(f, C) + 4;
(5.61)
f (xt ) > inf(f, C) + 4.
(5.62)
Assume that (5.61) holds. By (5.47)–(5.49) and (5.61),

‖x_t‖ ≤ M̄.
(5.63)
It follows from (5.3), (5.4), (5.35), (5.36), (5.47)–(5.49), (5.56), (5.57), and (5.63) that

‖x_{t+1} − x*‖ ≤ ‖x_{t+1} − P_t(x_t − αξ_t)‖ + ‖P_t(x_t − αξ_t) − x*‖ ≤ δ_C + ‖x_t − αξ_t − x*‖ ≤ 2M̄ + 3 ≤ M + M̄.
Assume that (5.62) holds. By (5.60), which implies ‖x_t‖ ≤ 3M, by (5.51)–(5.53), (5.56), (5.57), (5.62), and Lemma 5.2 applied with P = P_t, x = x_t, ξ = ξ_t, y = x_{t+1}, and M_0 = 3M,

‖x_{t+1} − x*‖² ≤ ‖x_t − x*‖² + 2α² + 2αδ_f(6M + 1) − 2αL^{-1/β}3^{1/β} + 2δ_C(6M + 3)
and

‖x_{t+1} − x*‖ ≤ ‖x_t − x*‖ ≤ M + M̄.
Thus in both cases

‖x_{t+1} − x*‖ ≤ M + M̄.

Therefore (5.60) holds for all t = 0, . . . , T. In view of (5.3), (5.4), (5.47)–(5.49), and (5.60),

‖x_t‖ ≤ 3M, t = 0, . . . , T.
(5.64)
Let t ∈ {0, . . . , T − 1}. By (5.35), (5.36), (5.52), (5.56), (5.57), (5.59), (5.64), and Lemma 5.2 applied with P = P_t, x = x_t, ξ = ξ_t, y = x_{t+1}, and M_0 = 3M,

‖x_{t+1} − x*‖² ≤ ‖x_t − x*‖² + 2α² + 2αδ_f(6M + 1) + δ_C² + 2δ_C(6M + 2) − 2αL^{-1/β}(f(x_t) − f(x*) − Δ)^{1/β}
≤ ‖x_t − x*‖² + 2α² + 2αδ_f(6M + 1) + 2δ_C(6M + 3) − 2αL^{-1/β}(f(x_t) − f(x*) − Δ)^{1/β}.
(5.65)
It follows from (5.60) and (5.65) that

(2M)² ≥ ‖x_0 − x*‖² ≥ ‖x_0 − x*‖² − ‖x_T − x*‖² = Σ_{t=0}^{T-1}(‖x_t − x*‖² − ‖x_{t+1} − x*‖²)
≥ 2α Σ_{t=0}^{T-1}(L^{-1/β}(f(x_t) − f(x*) − Δ)^{1/β} − α − α^{-1}δ_C(6M + 2) − δ_f(6M + 1))
≥ 2αT((min{f(x_t) : t = 0, . . . , T − 1} − f(x*) − Δ)^{1/β}L^{-1/β} − α − α^{-1}δ_C(6M + 2) − δ_f(6M + 1))

and

(min{f(x_t) : t = 0, . . . , T − 1} − f(x*) − Δ)^{1/β} < L^{1/β}(2M²(Tα)^{-1} + α + α^{-1}δ_C(6M + 2) + δ_f(6M + 1)).

This implies that

min{f(x_t) : t = 0, . . . , T − 1} < inf(f, C) + Δ + L(2M²(Tα)^{-1} + α + α^{-1}δ_C(6M + 2) + δ_f(6M + 1))^β.

This contradicts (5.59). The contradiction we have reached proves Theorem 5.4.
In order to minimize the right-hand side of (5.58), we should choose α = (δ_C(6M + 2))^{1/2}. In this case T should be of the same order as δ_C^{-1}. Of course, we need (5.51) to hold. This leads us to the following condition on δ_C:

δ_C ≤ (6M + 3)^{-1}4^{-2}L^{-2/β}.

Together with our choice of α and the condition above on δ_C, Theorem 5.4 implies the following result.

Theorem 5.5 Let M ≥ M̄ + 4, L ≥ L̄ + 1, β > 0, δ_f, δ_C ∈ (0, 1], Δ ∈ (0, 1],

|f(z) − f(x*)| ≤ L‖z − x*‖^β, z ∈ B_X(0, 9M + 4) ∩ U,

δ_C ≤ (6M + 3)^{-1}4^{-2}L^{-2/β}, δ_f ≤ (6M + 3)^{-1}4^{-1}L^{-1/β}

and α = (δ_C(6M + 2))^{1/2}. Assume that T is a natural number,

{P_t}_{t=0}^{T-1} ⊂ M, {x_t}_{t=0}^T ⊂ U, {ξ_t}_{t=0}^{T-1} ⊂ X,
‖x_0‖ ≤ M, B_X(x_0, δ_C) ∩ C ≠ ∅

and that for all t = 0, . . . , T − 1,

B_X(ξ_t, δ_f) ∩ ∂*_Δ f(x_t) ∩ S ≠ ∅,
‖x_{t+1} − P_t(x_t − αξ_t)‖ ≤ δ_C.

Then

min{f(x_t) : t = 0, . . . , T − 1} ≤ inf(f, C) + Δ + L(2M²(T(δ_C(6M + 2))^{1/2})^{-1} + (δ_C(6M + 2))^{1/2} + (δ_C(6M + 3))^{1/2} + δ_f(6M + 2))^β.
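The optimal step sizes derived in this chapter all come from minimizing a function of the form ψ(α) = cα^{-1} + dα with c, d > 0, whose unique minimizer is α = (c/d)^{1/2} with ψ value 2(cd)^{1/2}. The sketch below confirms this numerically; the particular constants c and d are illustrative assumptions, not values from the text.

```python
import numpy as np

# psi(alpha) = c/alpha + d*alpha with c, d > 0 has the unique minimizer
# alpha* = sqrt(c/d), at which psi(alpha*) = 2*sqrt(c*d).
# Illustrative constants (e.g. c could play the role of delta_C*(2M + 3)):
c, d = 0.03, 2.0

alpha_star = np.sqrt(c / d)
psi = lambda a: c / a + d * a

# a fine grid search confirms the closed-form minimizer
grid = np.linspace(1e-4, 1.0, 100000)
alpha_grid = grid[np.argmin(psi(grid))]

print(abs(alpha_grid - alpha_star) < 1e-3)            # True
print(abs(psi(alpha_star) - 2 * np.sqrt(c * d)) < 1e-12)  # True
```

With c proportional to δ_C, this is exactly why α of the order (δ_C)^{1/2} is chosen, and why T must be of the same order as δ_C^{-1} for the (αT)^{-1} term to be comparable with the others.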
References
1. Alber YI (1971) On minimization of smooth functional by gradient methods. USSR Comp Math Math Phys 11:752–758 2. Alber YI, Iusem AN, Solodov MV (1997) Minimization of nonsmooth convex functionals in Banach spaces. J Convex Anal 4:235–255 3. Alber YI, Iusem AN, Solodov MV (1998) On the projected subgradient method for nonsmooth convex optimization in a Hilbert space. Math Program 81:23–35 4. Aleyner A, Reich S (2008) Block-iterative algorithms for solving convex feasibility problems in Hilbert and Banach spaces. J Math Anal Appl 343:427–435 5. Al-Mazrooei AE, Latif A, Qin X, Yao J-C (2019) Fixed point algorithms for split feasibility problems. Fixed Point Theory 20:245–254 6. Alsulami SM, Takahashi W (2015) Iterative methods for the split feasibility problem in Banach spaces. J Nonlinear Convex Anal 16:585–596 7. Barty K, Roy J-S, Strugarek C (2007) Hilbert-valued perturbed subgradient algorithms. Math Oper Res 32:551–562 8. Bauschke HH, Borwein JM (1996) On projection algorithms for solving convex feasibility problems. SIAM Rev 38:367–426 9. Bauschke HH, Koch, VR (2015) Projection methods: Swiss army knives for solving feasibility and best approximation problems with half-spaces. Contemp Math 636:1–40 10. Bauschke H, Wang C, Wang X, Xu J (2015) On subgradient projectors. SIAM J Optim 25:1064–1082 11. Beck A, Teboulle M (2003) Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper Res Lett 31:167–175 12. Burachik RS, Grana Drummond LM, Iusem AN, Svaiter BF (1995) Full convergence of the steepest descent method with inexact line searches. Optimization 32:137–146 13. Butnariu D, Davidi, R, Herman GT, Kazantsev IG (2007) Stable convergence behavior under summable perturbations of a class of projection methods for convex feasibility and optimization problems. IEEE J Select Top Signal Process 1:540–547 14. 
Butnariu D, Reich S, Zaslavski AJ (2006) Convergence to fixed points of inexact orbits of Bregman-monotone and of nonexpansive operators in Banach spaces. In: Fixed point theory and its applications. Yokohama Publishers, Mexico, pp 11–32 15. Butnariu D, Reich S, Zaslavski AJ (2008) Stable convergence theorems for infinite products and powers of nonexpansive mappings. Numer Func Anal Optim 29:304–323
16. Ceng LC, Hadjisavvas N, Wong NC (2010) Strong convergence theorem by a hybrid extragradient-like approximation method for variational inequalities and fixed point problems. J Glob Optim 46:635–646 17. Ceng LC, Wong NC, Yao JC (2015) Hybrid extragradient methods for fiinding minimum norm solutions of split feasibility problems. J Nonlinear Convex Anal 16:1965–1983 18. Censor Y, Cegielski A (2015) Projection methods: an annotated bibliography of books and reviews. Optimization 64:2343–2358 19. Censor Y, Segal A (2006) Algorithms for the quasiconvex feasibility problem. J Comput Appl Math 185:34–50 20. Censor Y, Segal A (2010) On string-averaging for sparse problems and on the split common fixed point problem. Contemp Math 513:125–142 21. Censor Y, Tom E (2003) Convergence of string-averaging projection schemes for inconsistent convex feasibility problems. Optim Methods Softw 18:543–554 22. Censor Y, Zaslavski AJ (2013) Convergence and perturbation resilience of dynamic stringaveraging projection methods. Comput Optim Appl 54:65–76 23. Censor Y, Zaslavski AJ (2014) String-averaging projected subgradient methods for constrained minimization. Optim Methods Softw 29:658–670 24. Censor Y, Elfving T, Herman GT (2001) Averaging strings of sequential iterations for convex feasibility problems. In: Butnariu D, Censor Y, Reich S (eds). Inherently parallel algorithms in feasibility and optimization and their applications. North-Holland, Amsterdam, pp 101–113 25. Censor Y, Davidi R, Herman GT (2010) Perturbation resilience and superiorization of iterative algorithms. Inverse Probl 26:12 pp 26. Censor Y, Gibali A, Reich S (2011) The subgradient extragradient method for solving variational inequalities in Hilbert space. J Optim Theory Appl 148:318–335 27. Censor Y, Chen W, Combettes PL, Davidi R, Herman GT (2012) On the effectiveness of projection methods for convex feasibility problems with linear inequality constraints. Comput Optim Appl 51:1065–1088 28. 
Censor Y, Gibali A, Reich S, Sabach S (2012) Common solutions to variational inequalities. Set-Valued Var Anal 20:229–247 29. Censor Y, Davidi R, Herman GT, Schulte RW, Tetruashvili L (2014) Projected subgradient minimization versus superiorization. J Optim Theory Appl 160:730–747 30. Chadli O, Konnov IV, Yao JC (2004) Descent methods for equilibrium problems in a Banach space. Comput Math Appl 48:609–616 31. Combettes PL (1996) The convex feasibility problem in image recovery. Adv Imag Electron Phys 95:155–270 32. Combettes PL (1997) Hilbertian convex feasibility problems: convergence of projection methods. Appl Math Optim 35:311–330 33. Demyanov VF, Vasilyev LV (1985) Nondifferentiable optimization. Optimization Software, New York 34. Gibali A, Jadamba B, Khan AA, Raciti F, Winkler B (2016) Gradient and extragradient methods for the elasticity imaging inverse problem using an equation error formulation: a comparative numerical study. Nonlinear Anal Optim Contemp Math 659:65–89 35. Griva I (2018) Convergence analysis of augmented Lagrangian-fast projected gradient method for convex quadratic problems. Pure Appl Funct Anal 3:417–428 36. He H, Xu H-K (2017) Splitting methods for split feasibility problems with application to Dantzig selectors. Inverse Probl 33:28 pp 37. Hiriart-Urruty J-B, Lemarechal C (1993) Convex analysis and minimization algorithms. Springer, Berlin 38. Hishinuma K, Iiduka H (2020) Fixed point quasiconvex subgradient method. Eur J Oper Res 282:428–437 39. Hu Y, Yang X, Sim C-K (2015) Inexact subgradient methods for quasi-convex optimization problems. Eur J Oper Res 240:315–327 40. Hu Y, Yu CKW, Li C (2016) Stochastic subgradient method for quasi-convex optimization problems. J Nonlinear Convex Anal 17:711–724
41. Hu Y, Yu CKW, Li C, Yang X (2016) Conditional subgradient methods for constrained quasiconvex optimization problems. J Nonlinear Convex Anal 17:2143–2158 42. Kiwiel KC (2001) Convergence and efficiency of subgradient methods for quasiconvex minimization. Math Programm 90:1–25 43. Konnov IV (2003) On convergence properties of a subgradient method. Optim Methods Softw 18:53–62 44. Konnov IV (2009) A descent method with inexact linear search for mixed variational inequalities. Russ Math (Iz VUZ) 53:29–35 45. Konnov IV (2018) Simplified versions of the conditional gradient method. Optimization 67:2275–2290 46. Korpelevich GM (1976) The extragradient method for finding saddle points and other problems. Ekon Matem Metody 12:747–756 47. Liu L, Qin X, Yao J-C (2019) A hybrid descent method for solving a convex constrained optimization problem with applications. Math Methods Appl Sci 42:7367–7380 48. Mordukhovich BS (2006) Variational analysis and generalized differentiation I: Basic theory. Springer, Berlin 49. Mordukhovich BS, Nam NM (2014) An easy path to convex analysis and applications. Morgan & Clayton Publishes, San Rafael 50. Nadezhkina N, Wataru T (2004) Modified extragradient method for solving variational inequalities in real Hilbert spaces. Nonlinear analysis and convex analysis. Yokohama Publishers, Yokohama, pp 359–366 51. Nedic A, Ozdaglar A (2009) Subgradient methods for saddle-point problems. J Optim Theory Appl 142:205–228 52. O’Hara JG, Pillay P, Xu HK (2006) Iterative approaches to convex feasibility problems in Banach spaces. Nonlinear Anal 64:2022–2042 53. Pallaschke D, Recht P (1985) On the steepest–descent method for a class of quasidifferentiable optimization problems. In: Nondifferentiable optimization: motivations and applications (Sopron, 1984). Lecture notes in economics and mathematical systems, vol 255. Springer, Berlin, pp 252–263 54. Polyak BT (1987) Introduction to optimization. Optimization Software, New York 55. 
Polyak RA (2015) Projected gradient method for non-negative least squares. Contemp. Math. 636:167–179 56. Qin X, Cho SY, Kang SM (2011) An extragradient-type method for generalized equilibrium problems involving strictly pseudocontractive mappings. J Global Optim 49:679–693 57. Reich S, Zaslavski AJ (2014) Genericity in nonlinear analysis. Developments in mathematics. Springer, New York 58. Shor NZ (1985) Minimization methods for non-differentiable functions. Springer, Berlin 59. Solodov MV, Zavriev SK (1998) Error stability properties of generalized gradient-type algorithms. J Optim Theory Appl 98:663–680 60. Su M, Xu H-K (2010) Remarks on the gradient-projection algorithm. J Nonlinear Anal Optim 1:35–43 61. Takahashi W (2009) Introduction to nonlinear and convex analysis. Yokohama Publishers, Yokohama 62. Takahashi W (2014) The split feasibility problem in Banach spaces. J Nonlinear Convex Anal 15:1349–1355 63. Takahashi W, Wen C-F, Yao J-C (2020) Strong convergence theorem for split common fixed point problem and hierarchical variational inequality problem in Hilbert spaces. J Nonlinear Convex Anal 21:251–273 64. Thuy LQ, Wen C-F, Yao J-C, Hai TN (2018) An extragradient-like parallel method for pseudomonotone equilibrium problems and semigroup of nonexpansive mappings. Miskolc Math Notes 19:1185–1201 65. Wang H, Xu H-K (2018) A note on the accelerated proximal gradient method for nonconvex optimization. Carpathian J Math 34:449–457
66. Xu H-K (2011) Averaged mappings and the gradient-projection algorithm. J Optim Theory Appl 150:360–378 67. Xu H-K (2017) Bounded perturbation resilience and superiorization techniques for the projected scaled gradient method. Inverse Probl 33:19 pp 68. Yao Y, Postolache M, Yao J-C (2019) Convergence of an extragradient algorithm for fixed point and variational inequality problems. J Nonlinear Convex Anal 20:2623–2631 69. Yao Y, Qin X, Yao J-C (2018) Constructive approximation of solutions to proximal split feasibility problems. J Nonlinear Convex Anal 19:2165–2175 70. Yao Y, Qin X, Yao J-C (2019) Convergence analysis of an inertial iterate for the proximal split feasibility problem. J Nonlinear Convex Anal 20:489–498 71. Zaslavski AJ (2010) The projected subgradient method for nonsmooth convex optimization in the presence of computational errors. Numer Funct Anal Optim 31:616–633 72. Zaslavski AJ (2012) The extragradient method for convex optimization in the presence of computational errors. Numer Funct Anal Optim 33:1399–1412 73. Zaslavski AJ (2012) The extragradient method for solving variational inequalities in the presence of computational errors. J Optim Theory Appl 153:602–618 74. Zaslavski AJ (2013) The extragradient method for finding a common solution of a finite family of variational inequalities and a finite family of fixed point problems in the presence of computational errors. J Math Anal Appl 400:651–663 75. Zaslavski AJ (2016) Numerical optimization with computational errors. Springer, Cham 76. Zaslavski AJ (2016) Approximate solutions of common fixed point problems. Springer optimization and its applications. Springer, Cham 77. Zaslavski AJ (2020) Convex optimization with computational errors. Springer optimization and its applications. Springer, Cham 78. Zeng LC, Yao JC (2006) Strong convergence theorem by an extragradient method for fixed point problems and variational inequality problems. Taiwanese J Math 10:1293–1303