154 79 6MB
English Pages 210 [202] Year 2023
Springer Proceedings in Mathematics & Statistics
Rentsen Enkhbat Altannar Chinchuluun Panos M. Pardalos Editors
Optimization, Simulation and Control ICOSC 2022, Ulaanbaatar, Mongolia, June 20–22
Springer Proceedings in Mathematics & Statistics Volume 434
This book series features volumes composed of selected contributions from workshops and conferences in all areas of current research in mathematics and statistics, including data science, operations research and optimization. In addition to an overall evaluation of the interest, scientific quality, and timeliness of each proposal at the hands of the publisher, individual contributions are all refereed to the high quality standards of leading journals in the field. Thus, this series provides the research community with well-edited, authoritative reports on developments in the most exciting areas of mathematical and statistical research today.
Rentsen Enkhbat • Altannar Chinchuluun • Panos M. Pardalos Editors
Optimization, Simulation and Control ICOSC 2022, Ulaanbaatar, Mongolia, June 20–22
Editors Rentsen Enkhbat Institute of Mathematics and Digital Technology Mongolian Academy of Sciences Ulaanbaatar, Mongolia
Altannar Chinchuluun Business School National University of Mongolia Ulaanbaatar, Mongolia
Panos M. Pardalos Department of Industrial and Systems Engineering University of Florida Gainesville, FL, USA
ISSN 2194-1009 ISSN 2194-1017 (electronic) Springer Proceedings in Mathematics & Statistics ISBN 978-3-031-41228-8 ISBN 978-3-031-41229-5 (eBook) https://doi.org/10.1007/978-3-031-41229-5 Mathematics Subject Classification: 49-XX, 49Mxx, 49-06, 93-XX, 93Exx © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.
Preface
This volume brings together a selection of papers presented at the 7th International Conference on Optimization, Simulation and Control (COSC 2022), held in Ulaanbaatar, Mongolia, during June 20–22, 2022. The aim of the conference was to bring together engineers, scientists, and mathematicians from a variety of related disciplines, who are at the forefront of their research fields, to exchange ideas and present original high-level unpublished research in the areas of optimization, optimal control, simulation, and related fields. After a thorough reviewing process, 14 papers have been accepted for publication in the conference proceedings. We would like to express our gratitude and appreciation for all of the reviewers for their constructive comments on the papers. We believe that this volume will be of great interest to graduate students and researchers in engineering and mathematics, as well as to engineers and scientists involved in the application of mathematics in engineering, economics, and management science. We would like to thank all the authors and presenters who provided the research insights comprising the essential substance of the conference. We would also like to express our gratitude for the hard work of the committee members for arranging the conference in a very professional and efficient manner. Ulaanbaatar, Mongolia Ulaanbaatar, Mongolia Gainesville, FL, USA May 2023
Rentsen Enkhbat Altannar Chinchuluun Panos M. Pardalos
v
Organization
COSC 2022 is organized by the Mongolian Academy of Sciences, the National University of Mongolia, the German-Mongolian Institute for Resources and Technology, and the University of the Humanities in Mongolia.
Executive Committee Conference Chair: Program Chairs:
Rentsen Enkhbat (Mongolian Academy of Sciences, Mongolia) Altannar Chinchuluun (National University of Mongolia, Mongolia) and Panos M. Pardalos (University of Florida, USA)
Program Committee Ider Tseveendorj (Université de Versailles Saint-Quentin, France), Altangerel Lkhamsuren (German-Mongolian Institute for Resources and Technology, Mongolia), Battuvshin Chuluundorj (University of the Humanities, Mongolia), Gantumur Tsogtgerel (McGill University, Canada), Athanasios Migdalas (Luleå University of Technology, Sweden), Alexander Strekalovsky (Institute for System Dynamics and Control Theory, Russia), Ashwin Arulselvan (University of Strathclyde, UK), Petros Xanthopoulos (Stetson University, USA), Masao Fukushima (Nanzan University, Japan), Masaru Kamada (Ibaraki University, Japan), Hexi Baoyin (Tsinghua University, China), Kok Lay Teo (Sunway University, Malaysia), Anatoly S. Antipin (Moscow Computational Center, Russia), Alexander S. Buldaev (Buryat State University, Russia), Olga Vasilieva (Universidad del Valle, Colombia), Biswa Nath Datta (Northern Illinois University), vii
viii
Organization
Joydeep Dutta (Indian Institute of Technology, India), Khalide Jbilou (Universite du Littoral Cote d’Opale, France), Honglei Xu (Curtin University, Australia), Radu Ioan Bot (University of Vienna, Austria), Mend-Amar Majig (National University of Mongolia, Mongolia), Richard Vogel (Farmingdale State College, USA), Kok Lay Teo (Sunway University, Malaysia), Saheya Barintag (Inner Mongolia Normal University, China), Jirimutu (Inner Mongolia University for the Nationalities, China), Hsiao-Fan Wang (National Tsing Hua University, China), Gerhard-Wilhelm Weber (Middle East Technical University, Turkey), Sheng Bau (University of the Witwaterstand, South Africa), Milagros Baldemor (Don Mariano Marcos Memorial State University, Philippines) Conference Secretary:
Tungalag Natsagdorj (National University of Mongolia, Mongolia)
Contents
Covering Balls and .HT -Differential for Convex Maximization. . . . . . . . . . . . . Ider Tseveendorj Employing the Cloud for Finding Solutions to Large Systems of Nonlinear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . William Trevena, Alexander Semenov, Michael J. Hirsch, and Panos M. Pardalos An Approximation Scheme for a Bilevel Knapsack Problem . . . . . . . . . . . . . . . Ashwin Arulselvan and Altannar Chinchuluun Efficient Heuristics for a Partial Set Covering Problem with Mutually Exclusive Pairs of Facilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Aleksander Belykh, Tatyana Gruzdeva, Anton Ushakov, and Igor Vasilyev A Hybrid Genetic Algorithm for the Budget-Constrained Charging Station Location Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Themistoklis Stamadianos, Nikolaos A. Kyriakakis, Magdalene Marinaki, Yannis Marinakis, and Athanasios Migdalas Optimal Advertising Expenditure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Justin Abeles, Worku T. Bitew, Radika Latchman, Nicholas Seaton, Michael Tartamella, and Richard Vogel Pre-clustered Generative Adversarial Network Model for Mongolian Font Style Transfer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Saheya Barintag, Zexing Zhang, Bohuai Duan, and Jinghang Wang Designing Information Sharing Platform Using IoT and AI for Farming Management System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Atsushi Ito, Munkhtuya Dooliokhuu, Shiori Ashibe, Yoshikazu Nagao, and Ariunbold Turtogtokh
1
21
35
45
65
79
89
99
ix
x
Contents
Monowave Boundary Construction Method for the Non-convex Reachable Set of the Controlled Dynamical System . . . . . . . . . . . . . . . . . . . . . . . . . 117 Tatiana Zarodnyuk and Alexander Gornov Storage Reduction of Forward-Backward Sweeping Method of Optimal Control of Active Queue Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 Wejdan Alrashdan and Amr Radwan Extremal Controls Searching Methods Based on Fixed Point Problems. . . 139 Alexander Buldaev and Ivan Kazmin The Globalized Modification of Rosenbrock Algorithm for Finding Anti-Nash Equilibrium in Bimatrix Game . . . . . . . . . . . . . . . . . . . . . . . . . . 153 Rentsen Enkhbat, Pavel Sorokovikov, Tatiana Zarodnyuk, and Alexander Gornov Optimal Choice of Parameters in Higher-Order Derivative-Free Iterative Methods for Systems of Nonlinear Equations . . . . . . . . . . . . . . . . . . . . . . 165 Zhanlav Tugal, Otgondorj Khuder, Mijiddorj Renchin-Ochir, and Saruul Lkhagvadash Extending Nonstandard Finite Difference Scheme for SIR Epidemic Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 Enkh-Amar Shagdar and Batgerel Balt
Covering Balls and HT -Differential for Convex Maximization .
Ider Tseveendorj
Abstract In this chapter a method of solving convex maximization problem is developed. The main idea of the method is to cover the domain of the problem by union of balls belonging to the Lebesgue set of the objectif function at the highest level. We introduce a novel notion of differentials of a function and normals to a set called as .HT -differential and the .HT -normal. For the problem of maximizing a convex function over a compact set, we derive global optimality conditions via nonlinear separation. Our results are expressed in terms of .HT -differential and the .HT -normal. We derive a global search algorithm exploiting the global optimality conditions for the convex maximization over a polytope with twice differentiable objective function. We give explicit formulas for .HT -differential through norm functions in some special cases. The convergence of the global search algorithm is investigated. Keywords Global optimality conditions · Global search algorithm · Convex maximization · Piecewise convex functions · Covering balls
1 Introduction The main purpose of this paper is to show the role of nonlinear separation in nonconvex optimization. Solving an optimization problem leads to find a feasible vector such that all it’s better vectors are separated from the domain set. In order to present our approach we consider a convex maximization (called also as a concave minimization) [1, 2, 4, 8, 9], the most studied problem of nonconvex optimization: .
maximize f (x) subject to x ∈ D
(1)
I. Tseveendorj () Laboratory of Mathematics of Versailles, Université Paris-Saclay, UVSQ, Paris, France e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 R. Enkhbat et al. (eds.), Optimization, Simulation and Control, Springer Proceedings in Mathematics & Statistics 434, https://doi.org/10.1007/978-3-031-41229-5_1
1
2
I. Tseveendorj
where .f : Rn → R is a convex function and D is a nonempty convex compact in n .R . The theory of the convex maximization has an entirely different character from the theory of the convex minimization since in a given case there are many local maxima besides the global maximum. In order to illustrate the difference between minimizing and maximizing a convex function over a convex set let us consider also the following convex optimization problem with a convex function .g : Rn → R : .
minimize g(x) subject to x ∈ D
(2)
The Lebesgue set of a function .f (·) for a real .α ∈ R is defined in the following way: Lα f = {x ∈ Rn | f (x) ≤ α}.
.
. Convex minimization For a feasible vector z of the problem (2) a set of vectors better than z in the minimizing sense is {x | g(x) < g(z)}
.
which is the interior of the Lebesgue set of the function .g(·) at .g(z) : {x | g(x) < g(z)} = int (Lg(z) g).
.
The vector .z ∈ D is the minimum of .g(·) over D if and only if there is no better than z feasible vector. Mathematically, it can be written as follows D ∩ int (Lg(z) g) = ∅.
.
(3)
Suppose .int (Lg(z) g) is not empty i.e. z is not a minimum of .g(·). Then by the separating theorem, there is an hyperplane (an affine function) separating n disjoint convex sets; i.e. there nonempty exist .a ∈ R \ {0} and .b ∈ R such that . a, x < b for all .x ∈ int (Lg(z) g) and . a, x ≥ b for all .x ∈ D. . Convex maximization For a feasible vector z of the problem (1) a set of vectors better than z in the maximizing sense is {x | f (x) > f (z)}
.
Covering Balls and .HT -Differential
3
which is a nonconvex set consequently there is no affine separation between the domain D and the set of better vectors .{x | f (x) > f (z)}. A lack of an affine separation for these two sets makes the convex maximization problem difficult and one needs a nonlinear separation. The set of vectors better than z is the complement of the Lebesgue set {x | f (x) > f (z)} = (Lf (z) f )c
.
with the convex set .Lf (z) f that describes a set of vectors no better than z for the problem (1). The vector .z ∈ D is the maximum of .f (·) over D if and only if all feasible vectors are no better than z. Thus .z ∈ D solves the problem (1) iff D ⊂ Lf (z) f.
.
(4)
Expressions (3), (4) show the principal difference between convex minimization (2) and maximization (1) as : . “separation between two convex sets” vs . “inclusion of two convex sets (one into another)”. The separating hyperplane theorem for nonempty disjoint convex sets makes the checking (3) easy for the convex minimization (2). But for checking the inclusion (4) for the convex maximization (1), an affine separation does not help much and one needs another more adapted tools. There is some consolation, however, that firstly the global maximum of a convex function f relative to a convex set D, generally occurs, not at just any point of D, but at some extreme point and secondly, the local maximum search can be done in relatively easy way.
2 Optimality Conditions We use notations from convex analysis for the subdifferential of a function .f (·) and the normal cone to D at y : ∂f (y) = {y ∗ ∈ Rn | f (x) − f (y) ≥ y ∗ , x − y for all x ∈ Rn }, N(D, y) = {y ∗ ∈ Rn | y ∗ , x − y ≤ 0 for all x ∈ D}.
.
4
I. Tseveendorj
2.1 Optimality Condition for Convex Minimization For the convex minimization problem (2), the separating hyperplane theorem implies necessary and sufficient optimality condition: A feasible vector z is a minimum of .g(·) over D if and only if there exists .z∗ ∈ ∂g(z) ∗ z , x − z ≥ 0 for all x ∈ D.
.
(5)
Due to convexity of .g(·), the optimality can be proved by the following chain of inequations: g(x) − g(z) ≥ z∗ , x − z ≥ 0 for all x ∈ D.
.
(6)
Geometrically, the optimality condition (5) can be interpreted that at a minimum z, a subgradient .z∗ ∈ ∂g(z) makes an angle less than or equal to 90.◦ with all feasible variations .x − z, x ∈ D, which means that the condition (5) is equivalent to ∂g(z) ∩ N(D, z) /= ∅
.
(7)
2.2 Optimality Condition for Convex Maximization Since a condition of type (5), (7) with .z∗ ∈ ∂f (z) ∗ z , x − z ≤ 0 for all x ∈ D.
.
(8)
may be satisfied by many local maxima, stationary points and other points of the problem (1), one needs more specific conditions for the global optimality in convex maximization. For the convex maximization problem (1), as we have seen, the inclusion (4) should be verified. Theorem 1 (Rocafellar’s Necessary Condition [11]) Let f be a convex function, and let D be a convex set on which f is finite but not constant. Suppose that the supremum of f relative to D is attained at a certain point .z ∈ ri(domf ). Then every .z∗ ∈ ∂f (z) is a non-zero vector normal to D at z. The assertion of this theorem can be written: ∂f (z) ⊂ N(D, z).
.
(9)
As an extension to the condition (9), the Strekalovsky’s conditions [12] characterize .z ∈ D a global maximum of f over D (necessary and sufficient global optimality conditions):
Covering Balls and .HT -Differential
5
∂f (y) ⊂ N(D, y) for all y such that f (y) = f (z)
.
(10)
It is clear that Strekalovsky’s condition (10) with .y = z implies Rockafellar’s condition (9). The following theorem extends the condition (7) to convex maximization case and provide a characterization of a global maximum of f over D. Theorem 2 ([13, 14]) Let .z ∈ D and there is .v ∈ Rn such that .f (v) < f (z). Then a necessary and sufficient condition for z being a global maximum of f over D is ∂f (y) ∩ N(D, y) /= ∅ for all y such that f (y) = f (z)
.
(11)
After all, the conditions (9), (10) and (11) use a linearization (the subdifferential, the normal cone) along with the affine separation.
3 Nonlinear Separation or Sandwich Inclusion As we have seen early, solving the convex maximization(1) leads, for the global maximum .z ∈ D, to check the following inclusion of two convex sets : D ⊂ Lf (z) f.
.
Our aim is to find a closed subset C such that .D ⊂ C ⊂ Lf (z) f . Definition 1 . We say sets A and B are separated if there is a continuous function .ϕ(·) satisfying: ϕ(x) ≤ ϕ(y) for all x ∈ A for all y ∈ B
.
(12)
. The function .ϕ(·) is called a separating function for the sets A and B, or is said to separate the sets A and B. . The separating function satisfies the stronger condition that ϕ(x) < ϕ(y) for all x ∈ A for all y ∈ B
.
(13)
This is called strict separation of the sets A and B. Lemma 1 Suppose A and B are nonempty disjoint subsets, i.e., .A∩B = ∅ and one of them is a compact. Then there is a separating function .ϕ(·) for the sets A and B. Definition 2 The complement of a set A, often denoted by .Ac , are the elements not in A :
6
I. Tseveendorj
Ac = {x ∈ Rn | x /∈ A}.
.
(14)
Lemma 2 If sets A and .B c are strictly separated then .A ⊂ B. Lemma 3 Let .ϕ(·) be a separating function for the sets A and .B c and let .ξ = sup{ϕ(x) | x ∈ A}. If sets A and .B c are strictly separated then the following sandwich : A ⊂ Lξ ϕ ⊂ B
.
holds. Definition 3 A continuous function .ϕ(·) ∈ Φ is called .HT -differential of a function .f (·) at y if Lϕ(y) ϕ ⊂ Lf (y) f and ϕ(y) = f (y).
.
A set of .HT -differentials of a function .f (·) at y is denoted by ∂ HT f (y) = {ϕ(·) | Lϕ(y) ϕ ⊂ Lf (y) f, ϕ(y) = f (y)}
.
(15)
Remark 1 It is worth to notice that a class of functions .Φ ⊂ C 0 [10] should be defined from the point of view that a maximization .ϕ ∈ Φ over D is somehow easier than the problem (1) with f . Thus, in practice the class of function chosen for .∂ HT f does not contain f if the initial problem (1) is difficult. Remark 2 The author named .HT -differential after his parents D..H andsuren and D..T seveendorj. Definition 4 A continuous function .ψ(·) is called a majorant to a function .f (·) at y if .
f (x) ≤ ψ(x) for all x ∈ Rn f (y) = ψ(y)
(16)
Remark 3 Any majorant .ψ(·) to .f (·) at y satisfies the following inequality f (x) − f (y) ≤ ψ(x) − ψ(y) for all x ∈ Rn
.
and therefore .ψ(·) is an .HT -differential of a function .f (·) at y. But the reverse is not true always. Definition 5 A continuous function .η(·) ∈ Φ is called .HT -normal to a set D at y ∈ D if
.
D ⊂ Lη(y) η
.
Covering Balls and .HT -Differential
7
A set of .HT -normals to a set D at .y ∈ D is denoted by N HT (D, y) = {η(·) | D ⊂ Lη(y) η}
.
(17)
Remark 4 From the Definition 5. .η ∈ N HT (D, y) implies that η(x) − η(y) ≤ 0 for all x ∈ D
.
Theorem 3 Let z be a feasible vector of the convex maximization problem (1). Then a necessary and sufficient condition for z being a global maximum of f over D is ∂ HT f (z) ∩ N HT (D, z) /= ∅
.
(18)
Proof . necessary Let z be a global maximum of f over D, i.e. f (x) ≤ f (z) for all x ∈ D.
.
By denoting D by A and .Lf (z) f by B we observe that A ∩ B c = ∅.
.
First, by the Lemma 1 there is a strictly separating function .ϕ(·) for the sets .A = D, .B c = {x | f (x) > f (z)}. Then by the Lemma 2 the inclusion .D ⊂ Lf (z) f holds. The Lemma 3 concludes that for .ξ = sup{ϕ(x) | x ∈ D} D ⊂ Lξ ϕ ⊂ Lf (z) f,
.
which means .ϕ ∈ ∂ HT f (z), .ξ = f (z) and .ϕ ∈ N HT (D, z), so therefore ϕ ∈ ∂ HT f (z) ∩ N HT (D, z).
.
. sufficient Let the condition (18) holds for a vector .z ∈ D: ∂ HT f (z) ∩ N HT (D, z) /= ∅.
.
Then there is a function .F (·) such that .F ∈ ∂ HT f (z) and .F ∈ N HT (D, z). The former implies LF (z) F ⊂ Lf (z) f
.
8
I. Tseveendorj
while the latter implies F (x) ≤ F (z) for all x ∈ D.
.
Altogether, D ⊂ LF (z) F ⊂ Lf (z) f
.
conclude that .f ∈ N HT (D, z). By the definition of the set of .HT -normals f (x) − f (z) ≤ 0 for all x ∈ D
.
therefore z is a global maximum of f over D. ⨆ ⨅
4 Convex Maximization Over Polytope In order to present a global search algorithm based on the global optimal condition (18) we consider a convex maximization over polytope: .
maximize f (x) subject to x ∈ D,
(CM)
where D is a polytope in .Rn defined by the linear inequalities D = {x ∈ Rn | a j , x ≤ bj , j ∈ I }, I = {1, 2, . . . , 𝓁}
.
(19)
and .f : Rn → R is a twice differentiable continuous convex function. We use the standard notations .∇f (·), ∇ 2 f (·) for the gradient, the Hessian of a function .f (·) respectively. This problem is known to be not easy one even with the quadratic objective function .f (·), specially if one wants to solve it globally. However, obtaining a global solution to convex maximization problems is more tractable than general non convex optimization due to the well known fact : “there exists an extreme point of convex compact D which globally maximizes the problem .(CM)”. It means that the convex function .f (·) attains its maximum at one of finitely many vertices (extreme points) of the polytope D. But a polytope could have an exponential number of extreme points, which makes the problem .(CM) challenging for the global maximum search.
Covering Balls and .HT -Differential
9
4.1 Local Search A local solution to .(CM) can be found easily due to the following a local search method (LSM): starting point : .x 0 ∈ D, iteration : .x k+1 = argmax{〈∇f (x k ), x〉 | x ∈ D} stopping criterion : .x k+1 = x k In the case of .(CM), at each iteration of (LSM) we solve linear programming problems such that .
∇f (x k ), x ≤ ∇f (x k ), x k+1 for all x ∈ D
and its accumulation point y satisfies the necessary local optimality condition : .
∇f (y), x − y ≤ 0 for all x ∈ D.
The vecteur y can be a stationary point, that occurs sometimes because of the necessary condition.
4.2 Covering Sets Definition 6 Let β a real number such that β ≤ max{f (x) | x ∈ D}. An open subset C satisfying conditions C ⊂ Lβ f and C /= int (Lβ f )
.
is called a covering set at level β. Lemma 4 Let C0 , C1 , . . . Ck be covering sets at levels β0 , β1 , . . . , βk respectively. Then ∪ki=0 Ci , the union of covering sets is also a covering set at level β = max{βi | i = 0, 1, . . . , k}. Proposition 1 Let y be a feasible point for (CM) such that f (y) = max{f (x) | x ∈ D}−δ for some δ > 0. Let C be a covering set at level f (y). Then the following problem is equivalent to (CM): .
maximize f (x) subject to x ∈ D \ C.
Proof For z the global maximum of (CM) satisfying f (x) ≤ f (z) for all x ∈ D
.
10
I. Tseveendorj
the inequality max{f | (D \C)} ≤ f (z) holds, since (D \C) ⊂ D. By the definition of the global maximum, y ∈ D implies f (y) ≤ f (z). The inclusion C ⊂ Lf (y) f along with f (y) ≤ f (z) imply z ∈ / C. Therefore z ∈ D \ C, that proves max{f | (D \ C)} = f (z). ⨆ ⨅ Remark 5 A subset C satisfying conditions of Proposition 1. generalizes the standard cutting-plane idea in one hand, and in other hand, it is an attempt on finding a nonlinear separation that will play crucial role for nonconvex optimization, in particular for the convex maximization. Candidates for C should be chosen in view of solving some optimization problems described partially by C. It seems to us, in nonconvex optimization practice spherical covering sets (covering balls) issued from the definitions of HT differential and HT normal, as the simplest nonlinear shape could replace standard the cutting planes. Motivated by above consideration, in this paper we develop an algorithm which will incorporate a local search method, the largest inscribed balls in D, different inscribed balls in Lα f obtained by HT -differential.
4.3 The Largest Inscribed Ball in Polytope We denote by .B(v, r) a closed n-dimensional ball with center .v ∈ Rn and of radius .r > 0 : B(v, r) = {x ∈ Rn |‖x − v‖≤ r}.
.
Let us introduce a nonsmooth function .h : Rn → R by the following formula: h(x) = min{bj − a j , x | j ∈ I }.
.
(20)
It is not difficult to see that . the function .h(·) is continuous and concave in .Rn , . moreover, .h(x) ≥ 0 if and only if .x ∈ D, and as a consequence, the problem .(CM) is equivalent to
maximize f (x) subject to h(x) ≥ 0.
.
In the remainder of the paper, for the sake of simplicity, it is assumed that the strict interior of D is not empty, in other words .dim(D) = n. Without loss of generality we assume also that ∀j ∈ I, ‖a j ‖= 1.
.
(21)
Covering Balls and .HT -Differential
11
Lemma 5 Let D be a nonempty polytope. Then a maximum for the function .h(·) defined in (20) exists and it is achieved at a center of the largest inscribed ball in D. Proof The boundedness of .D = {x | h(x) ≥ 0} implies that function .h(·) is bounded above. By the Weierstrass theorem, the real-valued continuous function .h(·) must attain a maximum. Let u be the maximum of .h(·) satisfying h(u) ≥ h(x) for all x ∈ Rn .
.
(22)
In the other hand, in view of assumption (21), the distance from .x ∈ int (D) to a hyperplane of j -th constraint .{x | a j , x = bj } is equal to .bj − a j , x , therefore the distance from x to the nearest border of D equals .h(x) which is a radius of the largest inscribed ball in D centered at x. So one can conclude from (22) that u is a center of the largest inscribed ball in D with radius .h(u). ⨆ ⨅ Remark 6 Note that at first sight the problem of finding a center of the largest inscribed ball is a nonsmooth nonlinear convex problem .
maximize h(x) subject to x ∈ Rn ,
(23)
but in practice it costs solving the following linear program in .Rn+1 : .
maximize xn+1 subject to a j , x + xn+1 ≤ bj , j ∈ I.
Lemma 6 Let .B(u, r) be the largest inscribed ball in a full dimensional polytope D. Then there are no extreme points of D belonging to .B(u, r). Proof Assume that y is an extreme point of D such that .y ∈ B(u, r). According to the definition of the extreme point, .x 2 = 2y − x 1 ∈ / D for all .x 1 ∈ D.
12
I. Tseveendorj
If .‖y − u‖< r, taking .x 1 ∈ D such that .‖x 1 − y‖≤ r− ‖y − u‖ one observes that .x 2 ∈ B(u, r), but .x 2 ∈ / D and therefore .B(u, r) /⊂ D, which is a contradiction. Similarly, after sufficiently small perturbation on r, one obtains a contradiction in the case of .‖y − u‖= r also. ⨆ ⨅
5 Preliminary Subproblems In this section we present a list of subproblems that we need to solve for the global search algorithm. It is assumed that the following subproblems (24), (25) and (26) should be solved in an efficient way.
5.1 Maximization Over a Ball We are given a ball .B(u, r) and a function .f (·) [3]. .
maximize f (x) subject to x ∈ B(u, r)
(24)
Lemma 7 Let .x 0 be a maximum of .f (·) over .B(u, r). Then .ϕ(x) =‖x − u‖ −r is an .HT -differential of .f (·) at .x 0 , i.e: .
‖x − u‖ −r ∈ ∂ HT f (x 0 ).
5.2 The Largest Inscribed Ball in Lf (y) f Containing a Given y We are given a vector y, for example a local maximum to (CM). The problem is to find .u ∈ Rn and .r > 0 such that .y ∈ B(u, r) and .
maximize r subject to B(u, r) ⊂ Lf (y) f
(25)
Covering Balls and .HT -Differential
13
Let .r ∗ ∈ R, u∗ ∈ Rn be the solution to this problem. Then an .HT -differential of .f (·) at y can be calculated from the solution by the following function : ϕ(x) =‖x − u∗‖ −r ∗ ∈ ∂ HT f (y).
.
Moreover, the following lemma gives an analytical solution to this problem. We recall that by the Taylor theorem any twice differentiable function .f (·) is approximated at y as follows : 1 f (x) ≈ f (y) + ∇f (y), x − y + ∇ 2 f (y)(x − y), x − y . 2
.
Lemma 8 For twice differentiable convex function .f (·) and for a vector .y such that Lf (y) f = / ∅ a .HT -differential of .f (·) at .y can be calculated analytically
.
1 ϕ(x) = f (y) + ∇f (y), x − y + λmax ‖x − y‖2 ∈ ∂ HT f (y) 2
.
where .λmax > 0 is the largest eigenvalue of .∇ 2 f (y). Proof Positive semi-definiteness of the matrice .λmax I − ∇ 2 f (y) implies .f (x) ≤ ϕ(x). The latter along with .f (y) = ϕ(y) prove the lemma. ⨆ ⨅ Example 1 We consider the following function .f : R2 → R in .x = (x1 , x2 )⏉ : f (x) =
.
1 2 1 2 x + x −1 9 1 4 2
14
I. Tseveendorj √
and are given vector .y = (1, 4 3 2 )⏉ with .f (y) = 0. ∇f (y) =
2
9 x1 1 2 x2
.
ϕ(x) =
.
1 2 (x1 − 1) + 9 2
; ∇ f (y) = 2
2 +
2 9
0
0
1 2
; λmax =
1 . 2
√ √ 2 1 4 2 76 2 2 − (x2 − )+ 2 3 3 81
5.3 The Largest Inscribed in Lα f Ball with a Fixed Center u We are given a real .α, a function .f (·) and a vector u such that .f (u) < α.
maximize r subject to B(u, r) ⊂ Lα f.
.
(26)
Lemma 9 Let .r ∗ be the largest radius and w be an edge point of the ball such that .‖ u − w ‖= r ∗ , f (w) = α. Then .ϕ(x) =‖ x − u ‖ − ‖ w − u ‖∈ ∂ HT f (w), .HT -differential of .f (·) at w. The problem (26) can be written also like: .
minimize 21 ‖x − u‖2 subject to f (x) = α.
A solution of the problem (26) permit us to dilate balls for covering a larger part of D.
Covering Balls and .HT -Differential
15
5.4 The Largest Inscribed Ball in D \ C We are given two sets .C, D; C is an open set and D is full dimensional polytope. .
maximize r subject to B(v, r) ⊂ D \ C
(27)
Difficulty of problem (27) defends on set C; it could be from linear programming problem to hard nonconvex optimization problem. Anyway it is assumed to be solvable at least locally. . If C is an open half space : C = {x ∈ Rn | c, x > β}
.
then problem (27) can be solved easily like the problem (23) by linear programming. . Let C be the union of open balls : C=
.
int (B(us , rs )).
s∈S
For solving the problem (27) similarly to (20) we introduce the following function
bj − a j , x , j ∈ I ˆ .h(x) = min ‖x − us ‖ −rs , s ∈ S and solve .
ˆ maximize h(x) subject to x ∈ Rn .
(28)
16
I. Tseveendorj
ˆ is piecewise convex function [7] since Remark 7 Clearly .h(·) ˆ h(x) = min{h(x), F (x)} with F (x) = min{‖x − us ‖ −rs | s ∈ S}.
.
We remark that the problems (27) and (28) are equivalent to .
maximize F (x) subject to x ∈ D
which is a special case of so-called piecewise convex maximization problem studied in [5–7, 14, 15].
6
Global Search Algorithm and Convergence
6.1 Global Search Algorithm Now we are at the position to describe the algorithm step by step. [Initialization] – Choose sufficiently small .ε > 0 ; – Set .k = 0 (local solutions), .S = ∅, s = 0 (covering sets index) ; – .C = s∈S int (B(us , rs )); [step 0] Solve (23) to obtain the largest inscribed ball .B(u0 , r0 ) in D, .S ← s; [step 1] Solve (24) to obtain .x 0 = argmax{f (x) | x ∈ B(us , rs )}; [step 2] Find a local solution y with a starting point .x 0 ; if (.f (y) > f (y k )) then set .k = k + 1 and .y k = y; [step 3] .s = s + 1, S ← s; Solve (25) to obtain the largest ball .B(us , rs ) in .Lf (y k ) f at .y k ; [step 4] For each .s ∈ S solve (26), dilate balls of centers .us to obtain .w s and s s .rs =‖w − u ‖; [step 5] .s = s + 1, S ← s; Solve (27) to obtain the largest inscribed ball .B(us , rs ) in .D \ C; [step 6] if (.rs ≤ ε) then terminate; .y k is the global maximum else GOTO [step 1 ];
6.2 Convergence of the Algorithm The following lemmas are needed to prove convergence of the algorithm. Lemma 10 The numerical sequence .{f (y k )} is nondecreasing and convergent.
Covering Balls and .HT -Differential
17
Proof For any starting point .x 0 ∈ D and for all .k = 0, 1, 2, . . . inequality 0 k k .f (x ) ≤ f (y ) holds since .y is found by local search method. By construction k s s .y ∈ B(u , rs ) and.B(u , rs ) ⊂ i∈S B(ui , ri ). The starting point for next local search lies in .D ∩ ( i∈S B(ui , ri )). Thus for all .k = 0, 1, 2, . . . .
f (y k ) = max{f (x) | x ∈ D ∩ B(us , rs )} ≤ i 0 k+1 ) ≤ max{f (x) | x ∈ D ∩ ( s+1 i=0 B(u , ri )} = f (x ) ≤ f (y
so that .{f (y k )} is nondecreasing. Compactness of D and continuity of .f (·) imply that .{f (y k )} is bounded above. A numerical sequence is convergent if it is nondecreasing and bounded above. ⨆ ⨅ Lemma 11 If .rs = 0 for some s, then .y k is the global maximum. Proof There are two cases of computing .rs : . on [step 3], inscribed ball in the Lebesgue set .Lf (y k ) f and . on [step 5], inscribed ball in .D \ C. Clearly .y k can not be a minimum of .f (·) that implies .Lf (y k ) f /= ∅. Hence, .rs =‖ us − y k‖> 0, therefore in the former case .rs > 0. In the latter case .rs = 0 together with .B(us , rs ) ⊂ D \ C imply that for all i .x ∈ D there is a ball containing .x ∈ B(u , ri ), consequently .x ∈ C since .C = s i i=0 int (B(u , ri )). Thus .D ⊂ C, moreover the following sandwich holds D ⊂ C ⊂ Lf (y k ) f
.
that completes the proof about .y k is the global maximum.
⨆ ⨅
Theorem 4 The accumulation point of the sequence .{y k } generated by the algorithm solves problem .(CM). Proof Consider .z = limk→+∞ y k . It follows from [step 4] that .w s ∈ B(us , rs ) ⊂ Lf (z) f and .f (w s ) = f (z) for all .s ∈ S. Denote by ϕs (x) =‖x − us ‖ − ‖w s − us ‖ for all s ∈ S.
.
By the definition of .HT -differential, for every .s ∈ S ϕs (x) ∈ ∂ HT f (w s ).
.
Thus, denoting by .ϕ(x) = min{ϕs (x) | s ∈ S}, on the one hand, we obtain ϕ ∈ ∂ HT f (z).
.
We now choose .ε → 0. Then algorithm terminates at .z = y k when .rs = 0. The radius of the largest inscribed ball in .D \ C equals zero implies .D \ C = ∅. Consequently,
18
I. Tseveendorj
D⊂C=
.
int (B(us , rs )),
s∈S
which means for all .x ∈ D there is .s ∈ S such that .x ∈ B(us , rs ), equivalently .
‖x − us ‖≤ rs =‖w s − us ‖ .
Using the above notation for .ϕs (x), ϕ(x), since .ϕ(z) = 0, we conclude that ϕ(x) − ϕ(z) ≤ 0 for all x ∈ D.
.
By the definition of .N HT (D, z)-normal we obtain on the other hand ϕ ∈ N HT (D, z).
.
The optimality condition (18) of the Theorem 3. holds : ϕ ∈ ∂ HT f (z) ∩ N HT (D, z) =⇒ ∂ HT f (z) ∩ N HT (D, z) /= ∅
.
which completes the proof and z is global maximum to the problem (CM).
⨆ ⨅
Acknowledgments This article is dedicated to the blessed memory of my beloved parents D..H andsuren and D..T seveendorj who sadly passed away in 2021 while it was under writing.
References 1. Chinchuluun, A., Enkhbat, R., Pardalos, P.M.: A novel approach for nonconvex optimal control problems. Optimization 58(7), 781–789 (2009) 2. Enkhbat, R.: Algorithm for global maximization of convex functions over sets of special structures. Ph.D. Thesis, Irkutsk State University (1991) 3. Enkhbat, R.: An algorithm for maximizing a convex function over a simple set. J. Global Optim. 8(4), 379–391 (1996) 4. Enkhbat, R.: On some theory, methods and algorithms for concave programming. In: Pardalos, P.M., Tseveendorj, I., Enkhbat, R. (eds.) Optimization and Optimal Control, pp. 79–102. World Scientific Publishing, River Edge (2003) 5. Fortin, D., Tsevendorj, I.: Piecewise-convex maximization problems: algorithm and computational experiments. J. Global Optim. 24(1), 61–77 (2002) 6. Fortin, D., Tseveendorj, I.: Piece adding technique for convex maximization problems. J. Global Optim. 48(4), 583–593 (2010) 7. Fortin, D., Tseveendorj, I.: Piecewise convex maximization problems: piece adding technique. J. Optim. Theory Appl. 148(3), 471–487 (2011) 8. Horst, R., Tuy, H.: Global Optimization, 2nd edn. Springer, Berlin (1993) 9. Horst, R., Pardalos, P.M., Thoai, N.V.: Introduction to Global Optimization. Nonconvex Optimization and its Application, vol. 48, 2nd edn. Kluwer, Dordrecht (2000) 10. Rentsen, E.: Quasiconvex optimization and its applications. In: Handbook of Nonconvex Analysis and Applications, pp. 507–542. International Press, Somerville (2010)
Covering Balls and .HT -Differential
19
11. Rockafellar, T.R.: Convex Analysis. Princeton Mathematical Series, vol. 28. Princeton University Press, Princeton (1970) 12. Strekalovski˘ı, A.S.: On the problem of the global extremum. Dokl. Akad. Nauk SSSR. 292(5), 1062–1066 (1987) 13. Tseveendorj, I.: On the conditions for global optimality. J. Mong. Math. Soc. 2, 58–61 (1998) 14. Tsevendorj, I.: Piecewise-convex maximization problems: global optimality conditions. J. Global Optim. 21(1), 1–14 (2001) 15. Tseveendorj, I, Fortin, D.: Survey of piecewise convex maximization and PCMP over spherical sets. In: Advances in Stochastic and Deterministic Global Optimization. Springer Optimization and Its Applications, vol. 107, pp. 33–52. Springer, Cham (2016)
Employing the Cloud for Finding Solutions to Large Systems of Nonlinear Equations William Trevena, Alexander Semenov, Michael J. Hirsch, and Panos M. Pardalos
Abstract Systems of nonlinear equations can be quite difficult to solve, even when the system is small. As the systems grow in size, the complexity can increase dramatically to find all solutions. This research discusses transforming the system into a global optimization problem and making use of a newly developed cloudbased optimization solver to efficiently find solutions. Examples on large systems are presented. Keywords System of nonlinear equations · Global optimization · Heuristics · Cloud-based solver
1 Introduction The research presented in this paper is concerned with finding one or more solutions to a system of nonlinear equations (SNE): f1 (x1 , x2 , . . . , xn ) = 0.
.
(1)
W. Trevena · A. Semenov University of Florida, Gainesville, FL, USA e-mail: [email protected]; [email protected] M. J. Hirsch () ISEA TEK LLC, Maitland, FL, USA TOXEUS Systems LLC, Maitland, FL, USA e-mail: [email protected]; [email protected] https://www.ISEATEK.com P. M. Pardalos University of Florida, Gainesville, FL, USA TOXEUS Systems LLC, Maitland, FL, USA e-mail: [email protected]; [email protected] https://www.TOXEUS.org © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 R. Enkhbat et al. (eds.), Optimization, Simulation and Control, Springer Proceedings in Mathematics & Statistics 434, https://doi.org/10.1007/978-3-031-41229-5_2
21
22
W. Trevena et al.
Fig. 1 Example SNE with two transcendental equations in two unknowns (Eq. (4) with blue contours and Eq. (5) with red contours)
f2 (x1 , x2 , . . . , xn ) = 0.
(2)
.. . fm (x1 , x2 , . . . , xn ) = 0
(3)
where .f1 , f2 , . . . , fm are real-valued continuous or continuously differentiable functions on the domain .Dn , and where at least one of .f1 , f2 , . . . , fm is nonlinear. We note that this system can be written in its more compact form, .Fm (x) = Θm ≡ (0, 0, . . . , 0)⏉ , where .Fm = (f1 , f2 , . . . , fm )⏉ . As a simple example in two dimensions, consider the system of transcendental equations given by Eqs. (4)–(5). The contour plots of these two equations are presented in Fig. 1, with Eq. (4) in blue and Eq. (5) in red. Solutions to this SNE are defined as the points where the blue and red contours intersect. .
f1 (x1 , x2 ) = x1 − x1 sin(x1 + 5x2 ) − x2 cos(5x1 − x2 ) = 0.
(4)
f2 (x1 , x2 ) = x2 − x2 sin(5x1 − 3x2 ) + x1 cos(3x1 + 5x2 ) = 0
(5)
Finding one or more solutions to a SNE is a challenging and ubiquitous task faced in many fields. Two recent survey papers [25, 26], as well as multiple books [10, 11, 31] provide examples of SNEs across many of the applied sciences. The problem of solving even a system of polynomial equations has been proven to be NP-hard [23]. Furthermore, it has also been proven [29] that no general algorithm exists for determining whether an integer solution exists for a polynomial equation with a finite number of unknowns and only integer coefficients. This is the
Cloud-Based Solutions for Large SNEs
23
famous 10th problem of Hilbert [15]. When the SNE is not polynomial, the general problem of finding solutions to a SNE is even more difficult. When .m > n, a SNE is referred to as an overdetermined SNE, while when .m < n, the SNE is referred to as an underdetermined SNE. When .m = n, the SNE is referred to as a square SNE [1]. Furthermore, a SNE is considered to be consistent if a solution exists which satisfies all equations [12]. If the system of equations is linear, there are many approaches available to determine the solutions (e.g., Gaussian elimination and matrix inverses [14], nullspace computation [13, 14, 19], linear programming [5], etc.). If the system happens to be polynomial in nature, then exact techniques employing resultants [7] or Gröbner bases [8, 9] can be employed. However, these techniques have their shortcomings. With resultants, some ordering of the eliminated variables can lead to extraneous solutions (i.e., a solution of the resultant that is not a solution to the original system). There is no way to determine in advance whether an ordering will cause such extraneous solutions, nor which solutions might be extraneous. Gröbner bases using Buchberger’s algorithm suffer from the generation of a large number of equations in the intermediate stages, thus making them intractable for even moderate-sized problems. When the equations in the system do not exhibit nice linear or polynomial properties, it can be very difficult to determine any or all solutions to the system. In this general case, one is left with numerical routines to determine solutions to the system. Unfortunately, there is no one numerical method that will guarantee that all solutions will be found to a general system of equations in finite time (a variation of the well-known ‘no free lunch’ theorem [32]). Many commonly utilized numerical techniques for finding solutions are based on Newton’s method [28]. However, common variants of Newton’s method can only be applied to square SNEs, and they typically utilize the Jacobian of the system (or an approximation of the Jacobian) and higher order derivatives. Obtaining derivatives of the system can be expensive to compute or approximate at each point, and for large systems, utilizing methods which rely upon the Jacobian or higher order derivatives can require large amounts of computational power. Alternatively, by reformulating a SNE as an optimization problem, one can attempt to use parallel optimization methods to search for many solutions of the SNE simultaneously. By using derivative-free optimization methods in particular, one can search for solutions to SNEs where the ability to compute derivatives is either prohibitively expensive or not even possible. Such methods are typically heuristic in nature, and although they do not guarantee finding all solutions to an arbitrary SNE in finite time, they have been shown to work well in practice on large and challenging SNEs [17, 30]. However, applying such methods to solve SNEs at scale in a time-constrained environment can still require vast amounts of computational power. Due to the substantial upfront costs associated with purchasing vasts amount of computational resources, cloud computing services have become increasingly attractive as they allow organizations to deploy new services and capabilities with limited to no up-front cost or commitment [21]. For resource constrained
24
W. Trevena et al.
environments in particular, cloud computing services can be especially useful as extremely large amounts of computational resources can now be utilized for short periods of time without substantial long-term financial investment. For example, Amazon Web Services now offers On-Demand Instances (servers) with 448 vCPU, 12288 GiB RAM, 100 Gigabit Network Performance [4]. The remaining sections of this chapter are organized as follows: In Sect. 2 a brief overview is provided on the cloud architecture being developed for our TOXEUS Cloud Solver (TCS). Section 3 presents some example SNEs from the literature and their solutions found using the TCS. Conclusions and future research directions are presented in Sect. 4.
2 Cloud Solver Architecture To solve SNEs at scale, we have developed the cloud solver architecture shown in Fig. 2. A client needing to solve a SNE will send the SNE to the cloud solver and receive the solutions when they are found. Internally, the TCS will efficiently make use of cloud resources to find solutions to the SNE.
2.1 Cloud Solver Request Handler The Solver Request Handler receives all the SNEs sent by clients to the solver, and dynamically allocates the computing resources needed to find their solution(s). The Solver Request Handler component consists of the Pre-processing, Problem Transformation, and Starting Point Selection sub-components. We note that, in practice, to prevent unauthorized access to the SNE Cloud Solver, clients should be required to authenticate themselves before their requests are received by the Solver Request Handler. For simplicity, it is assumed that any request that enters the SNE Cloud Solver (any request that makes it through the dashed blue border surrounding the SNE Cloud Solver in Fig. 2) has been successfully authenticated.
Pre-processing The Solver Request Handler routes each SNE that it receives to the SNE preprocessor. The SNE pre-processor utilizes multiple approaches to determine any simplifications of the SNE (e.g., combining like terms and/or factoring) or special structure that can be algorithmically exploited (e.g., separability of variables across the SNE).
Cloud-Based Solutions for Large SNEs
25
Fig. 2 Cloud solver architecture for finding multiple solutions to SNEs by performing optimization from multiple starting points in parallel
Transforming SNEs Into Optimization Problems After pre-processing is complete, the SNE (or appropriately simplified SNE) is transformed into an optimization problem. The SNE is transformed into a singleobjective unconstrained global optimization problem by introducing the objective function ϕ 0 (x) = α
m
.
|| |fi (x)| ||p
(6)
i=1
where .α > 0, .p > 0, and .||·||p is the well-defined .Lp norm [19]. Figure 3 provides two example transformation via Eq. (6) for the SNE defined by Eqs. (4) and (5). In order to find solutions to the SNE, the equivalent global optimization problem can be solved:
26
W. Trevena et al.
Fig. 3 The objective function surface .ϕ 0 (x) = α .α = 1, .p = 2 for the SNE defined by Eqs. (4)–(5)
.
m
i=1 |fi (x)|
min ϕ 0 (x).
x∈D
p
when (a) .α = 1, .p = 1 and (b)
(7)
It is clear that if .xˆ is a solution to the SNE defined by Eqs. (1) and (3), then .xˆ minimizes .ϕ 0 , and .ϕ 0 (x) ˆ = 0. In addition, if the SNE defined by Eqs. (1) and (3) has a solution, then any global minimum of .ϕ 0 (x) is a solution to the SNEs.
Selecting Starting Points for Optimization Selecting the number and location of starting points for any algorithm is a challenging problem in optimization. In our current implementation of the cloud solver, there are options, based on user specifications and problem characteristics, to determine the number, and location, of starting points to search for the global minima of .ϕ 0 (x). Lower and upper bounds for each problem variable are also determined at this step.
2.2 Parallel Optimization After all starting points for optimization have been selected, the Solver Request Handler will allocate a sufficient amount of computational resources for optimization to be performed in parallel, from each starting point. Based on client preferences and characteristics of the problem, the Solver Request Handler has the ability to select from a number of heuristic optimization algorithms for each starting point. Examples of such algorithms include Simulated Annealing [24], extensions to Continuous Greedy Randomized Adaptive Search Procedures [16–18], Continuous Variable Neighborhood Search (CVNS) [30], and local search, among others.
Cloud-Based Solutions for Large SNEs
27
2.3 Solution Aggregation After optimization from each starting point has completed, all of the solutions found by the cloud solver are evaluated with respect to how close their objective function value is to 0, and how close the solutions are to each other. A subset of solutions found by the cloud solver are then returned to the client.
3 Numerical Examples This section presents two large-sized SNE problems to show the performance of the TCS compared with other recently published approaches.
3.1 Discrete Integral Equation Martinez [27] introduced a Discrete Integral Equation problem, whose discretization results in a dense nonlinear system over an arbitrary number of variables, n. This 1 system is presented in Eq. (8). For this system, .h = n+1 , and .ti = i · h. We note that this is a classical discretization application and included as part of most undergraduate numerical differential equations courses. ⎡ i 3 h ⎣ · (1 − ti ) · tj · xj + tj + 1 + .fi (x) = xi + 2 j =1
⎤ n 3 1 − tj · xj + tj + 1 ⎦ ∀i = 1, . . . , n ti ·
(8)
j =i+1
Martinez [27] solved this system for .n = 500 and .n = 1000. Pei et al. [30] developed a CVNS algorithm, making use of Hooke-Jeeves [20] pattern search as the under-lying local minimizer. The CVNS algorithm solved this system for various values of n, up to .n = 1000. In Table 1, the number of objective function evaluations to find an optimal solution is shown for both the CVNS and the TCS algorithms, for up to .n = 1, 500. We see clearly that the TCS is able to find the optimal solution in about .1/3 to .1/2 the number of objective function evaluations as compared with CVNS, and the TCS was able to solve larger problems than the CVNS algorithm.
28
W. Trevena et al.
Table 1 Number of function evaluations to find the optimal solution for the Discrete Integral Equation (8)
n 20 50 100 200 300 400 500 600 700 800 900 .1, 000 1100 1200 1300 1400 1500
C-VNS 3362 8745 .19, 062 .38, 072 .66, 703 .75, 931 .93, 145 .110, 558 .135, 999 .179, 410 .176, 410 .204, 244 .− .− .− .− .−
TCS 1141 3747 5873 .15, 863 .30, 757 .36, 110 .36, 502 .44, 408 .59, 914 .59, 425 .68, 928 .77, 413 .85, 768 .85, 872 .100, 583 .178, 045 .132, 236
3.2 Van der Pol Equation The Van der Pol equation governs the flow of current in a vacuum tube with three internal elements [2, 3, 6]. This is defined via the differential equation in Eq. (9), where .μ > 0 and the initial boundary conditions are prescribed by .y(0) = 0 and .y(2) = 1.
y '' − μ · y 2 − 1 · y ' + y = 0
.
(9)
One approach to solving the Van der Pol equation (Eq. (9)) numerically is by making use of second order finite difference approximation schemes [22]. We can discretize the domain .[0, 2] as .x0 = 0 < x1 < . . . < xn = 2, with .xi = x0 + i · h, .h = 2/n, and .yi = y (xi ) for all .i = 1, . . . n. The finite difference equations approximating the first and second derivatives over this discretization is then given by Eqs. (10) and (11), where .i = 1, . . . , n − 1. yi+1 − yi−1 . 2·h yi+1 − 2 · yi + yi−1 yi'' = h2
yi' =
.
(10) (11)
Applying this discretization and finite-difference approximation to Eq. (9) results in the SNEs given by Eq. (12), which consists of .n − 1 equations in .n − 1 unknowns.
Cloud-Based Solutions for Large SNEs
29 ¯ .S
Table 2 Solution to the Van der Pol Equation, when .n = 10, as presented in [3]
Table 3 Metrics associated with the solution presented in [3] to the Van der Pol Equation (9), when .n = 10
.L1 .L2 .L∞ .Mean .Median
.y0
0
.y1
.−0.4795
.y2
.−0.9050
.y3
.−1.287
.y4
.−1.641
.y5
.−1.990
.y6
.−2.366
.y7
.−2.845
.y8
.−3.673
.y9
.−6.867
.y10
1
0.014911139 0.006064557 0.00408653 0.001656793 0.001589068
2 · h2 · yi − h · μ · yi2 − 1 · (yi+1 − yi−1 ) + 2 · (yi+1 − 2 · yi + yi−1 )
.
= 0 ∀ i = 1, . . . , n − 1
(12)
In this resulting SNE, both Alqahtani, Behl, Kansal [3] and Al-Obaidi and Darvishi [2] set the parameter .μ = 0.5, however the former considers only .n = 10, while the latter considers only .n = 100 and .n = 200. The solution
provided for .n = 10 [3] (when an initial solution is given by 1 .yi = log , for .i = 1, . . . , n − 1) is presented in Table 2. In addition, plugging i2 this solution back into the system of 9 equations in 9 unknowns defined by Eq. (12) produces the error metrics presented in Table 3. For completeness, if the given solution is .x ∗ , the error metrics are defined by Eqs. (13)–(17). m ∗ fi x . L1 x ∗ =
.
(13)
i=1
m ∗ fi (x ∗ )2. L2 x =
(14)
i=1
L∞ x ∗ = max fi x ∗ . i
(15)
30
W. Trevena et al.
Table 4 Solutions 1–5 found by the TCS for the Van der Pol Equation (9), when .n = 10 (Note that .Si refers to solution i) .y0 .y1 .y2 .y3 .y4 .y5 .y6 .y7 .y8 .y9 .y10
.S1
.S2
.S3
.S4
0 .0.273501552 .0.512426592 .0.71470764 .0.87958795 .1.006201768 .1.092925784 .1.137433272 .1.137279195 .1.091082247 1
0 .−0.478986506 .−0.904070898 .−1.285854729 .−1.640494411 .−1.989214196 .−2.365818661 .−2.844571173 .−3.673200122 .−6.868107212 1
0 .0.485910191 .0.91743423 .1.305940512 .1.668967012 .2.030135184 .2.428947241 .2.958101687 .3.963793845 .9.965384611 1
0 .0.57230171 .1.085097492 .1.563049293 .2.047764611 .2.61915569 .3.515670242 .6.444549256 .−1.939370466 .−12.91888874 1
.S5
0 .0.803470811 .1.547367689 .2.336255157 .3.45716339 .6.986023353 .−1.418398767 .−10.6588578 .2.480535389 .24.60674205
1
Table 5 Solutions 6–10 found by the TCS for the Van der Pol Equation (9), when .n = 10 (Note that .Si refers to solution i) .y0 .y1 .y2 .y3 .y4 .y5 .y6 .y7 .y8 .y9 .y10
.S6
.S7
.S8
.S9
0 .−0.798910791 .−1.537949762 .−2.31921543 .−3.419718676 .−6.756290502 .1.775624119 .12.28630981 .−1.383924593 .−16.30720049 1
0 .−0.568321887 .−1.077638523 .−1.551722731 .−2.030859326 .−2.591433888 .−3.455797487 .−6.103805825 .2.759365681 .20.21637952 1
0 .−1.499766437 .−3.135684847 .−7.134741241 .2.022481713 .14.43143106 .−0.56639203 .−14.55718478 .2.303994137 .28.30663233 1
0 .1.512986727 .3.169621928 .7.330858812 .−1.735069624 .−12.75012423 .1.305226269 .16.33113543 .−1.087753066 .−18.78470312 1
m 1 ∗ fi x . Mean x ∗ = m i=1 Median x ∗ = median f1 x ∗ , . . . , fm x ∗
.S10
0 .−50.12143002 .0.788693186 .49.77925235 .0.007326216 .−45.02512938 .0.887209852 .45.79774178 .0.039709203 .−41.36864665
1
(16) (17)
Using the same initial solution, the TCS algorithm found 10 solutions, presented in Tables 4 and 5. Metrics for these solutions are prvided in Tables 6 and 7. Note that all of these solutions have metrics approximately 10 times better when compared with the solution found by Alqahtani, Behl, and Kansal [3]. For .n = 100 and .n = 200, Al-Obaidi and Darvishi [2] use an initial solution of .yi = 1, for .i = 1, . . . , n − 1. However, they do not present the number of solutions found, nor the quality of those solutions. For .n = 100, TCS found 26 solutions. The
Cloud-Based Solutions for Large SNEs
31
Table 6 Metrics associated with solutions 1–5 to the Van der Pol Equation (9), with the TCS, when = 10
.n
.S1
.S2
.S3
.S4
.S5
.L1
.0.002801492
.0.002504353
.0.000999903
.L∞
.0.000451104
.Mean
.0.000311277
.Median
.0.000336411
0.00275064 0.000992326 0.00045965 0.000305627 0.000353736
.0.002341704
.L2
0.002556901 0.000997937 0.000462465 0.0002841 0.000345607
.0.000999057 .0.000499304 .0.000278261 .0.000316951
.0.000991869 .0.000611101 .0.000260189 .0.000214599
Table 7 Metrics associated with solutions 6–10 to the Van der Pol Equation (9), with the TCS, when .n = 10 .S6
.S7
.S8
.S9
.S10
.L1
.0.002352033
.0.002164205
.0.002428033
.0.002483718
.L2
.0.000975794
.0.000900324
.0.000945519
.L∞
.0.000582989
.0.000498407
.0.000544351
.Mean
.0.000261337
.0.000240467
.0.000269781
.Median
.0.000216279
.0.000214496
.0.000190281
0.002351115 0.000924277 0.0006674 0.000261235 0.00021149
Table 8 Summary metrics for the TCS associated with the solution to the Van der Pol Equation (9), when .n = 100
Table 9 Summary metrics for the TCS associated with the solution to the Van der Pol Equation (9), when .n = 200
.0.000992681 .0.000605802 .0.000275969 .0.000256426
.L1
.0.521233538
.L2
.0.062099231
.L∞
.0.013254278
.Mean
.0.005264985
.Median
.0.005595465
.L1
.0.681333948
.L2
.0.054035564
.L∞
.0.008708989
.Mean
.0.003423789
.Median
.0.003713471
solution quality of all 26 solutions are no larger than the metric values presented in Table 8. For .n = 200, TCS found 10 solutions. The solution quality of all 10 solutions are no larger than the metric values presented in Table 9.
4 Conclusions and Future Research

As problem applications increase in size, conventional computing architectures are not able to provide adequate computational power and efficiency to solve these problems. To address this growing need, we have developed the TOXEUS Cloud Solver to allow researchers in academia and industry to more easily and efficiently
solve large-scale SNEs and optimization problems in the cloud. By transforming SNEs into optimization problems, the TOXEUS Cloud Solver can take advantage of the numerous heuristics for optimization that have been developed in recent years. We have presented solutions to a few large SNEs, considering problems with thousands of variables and equations. Future research will continue to improve the pre-processing components of the TOXEUS Cloud Solver, with the goal of choosing which heuristic optimization algorithm to employ for a given SNE based upon the properties it exhibits.
References
1. Ahookhosh, M., Amini, K., Bahrami, S.: Two derivative-free projection approaches for systems of large-scale nonlinear monotone equations. Numer. Algorithms 64, 21–42 (2013)
2. Al-Obaidi, R., Darvishi, M.: A comparative study on qualification criteria of nonlinear solvers with introducing some new ones. J. Math. 2022, 1–20 (2022)
3. Alqahtani, H., Behl, R., Kansal, M.: Higher-order iteration schemes for solving nonlinear systems of equations. Mathematics 7(10), 937–951 (2019)
4. Amazon, Amazon EC2 on-demand pricing (2022). https://aws.amazon.com/ec2/pricing/ondemand
5. Bazaraa, M., Jarvis, J., Sherali, H.: Linear Programming and Network Flows, 2nd edn. Wiley, Hoboken (1990)
6. Burden, R., Faires, J.: Numerical Analysis, 5th edn. PWS Publishing Company, Boston (1993)
7. Cohen, H.: A Course in Computational Algebraic Number Theory. Springer, Berlin (1993)
8. Cox, D., Little, J., O'Shea, D.: Ideals, Varieties, and Algorithms, 2nd edn. Springer, New York (1997)
9. Cox, D., Little, J., O'Shea, D.: Using Algebraic Geometry, 2nd edn. Springer, New York (2005)
10. Floudas, C., Pardalos, P.: A Collection of Test Problems for Constrained Global Optimization Algorithms. Lecture Notes in Computer Science. Springer, New York (1990)
11. Floudas, C., Pardalos, P., Adjiman, C., Esposito, W., Gümüş, Z., Harding, S., Klepeis, J., Meyer, C., Schweiger, C.: Handbook of Test Problems in Local and Global Optimization. Springer, Berlin (1999)
12. Gatilov, S.: Properties of nonlinear systems and convergence of the Newton-Raphson method in geometric constraint solving. Bullet. Novosibirsk Comput. Center 32, 57–75 (2011)
13. Gleeson, R., Grosshans, F., Hirsch, M., Williams, R.: Algorithms for the recognition of 2D images from m points and n lines in 3D. Image Vision Comput. 21(6), 497–504 (2003)
14. Golub, G., Van Loan, C.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)
15. Hilbert, D.: Mathematical problems. Bullet. Am. Math. Soc. 8, 437–479 (1902)
16. Hirsch, M.: GRASP-based heuristics for continuous global optimization problems. PhD Thesis, University of Florida (2006)
17. Hirsch, M., Pardalos, P., Resende, M.: Solving systems of nonlinear equations with continuous GRASP. Nonlinear Anal. Real World Appl. 10(4), 2000–2006 (2009)
18. Hirsch, M., Pardalos, P., Resende, M.: Speeding up continuous GRASP. Eur. J. Oper. Res. 205(3), 507–521 (2010)
19. Hoffman, K., Kunze, R.: Linear Algebra, 2nd edn. Prentice Hall, Hoboken (1971)
20. Hooke, R., Jeeves, T.: Direct search solution of numerical and statistical problems. J. ACM 8(2), 212–229 (1961)
21. Hsu, P.: A deeper look at cloud adoption trajectory and dilemma. Inf. Syst. Front. 24(1), 177–194 (2022)
22. Iserles, A.: A First Course in the Numerical Analysis of Differential Equations. Cambridge Texts in Applied Mathematics. Cambridge University Press, Cambridge (1996)
23. Jansson, C.: An NP-hardness result for nonlinear systems. Reliab. Comput. 4(4), 345–350 (1998)
24. Kirkpatrick, S., Gelatt, D., Vecchi, M.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983)
25. Kotsireas, I., Pardalos, P., Semenov, A., Trevena, W., Vrahatis, M.: Survey of methods for solving systems of nonlinear equations, part I: root-finding approaches (2022). arXiv:2208.08530
26. Kotsireas, I., Pardalos, P., Semenov, A., Trevena, W., Vrahatis, M.: Survey of methods for solving systems of nonlinear equations, part II: optimization-based approaches (2022). arXiv:2208.08532
27. Martinez, J.: Solving systems of nonlinear equations by means of an accelerated successive orthogonal projection method. J. Comput. Appl. Math. 16(2), 169–179 (1986)
28. Martínez, J.: Algorithms for solving nonlinear systems of equations. In: Spedicato, E. (ed.) Algorithms for Continuous Optimization: The State of the Art, pp. 81–108. Springer, Dordrecht (1994)
29. Matiyasevich, Y.: Hilbert's Tenth Problem. MIT Press, Cambridge (1993)
30. Pei, J., Dražić, Z., Dražić, M., Mladenović, N., Pardalos, P.: Continuous variable neighborhood search (C-VNS) for solving systems of nonlinear equations. INFORMS J. Comput. 31(2), 235–250 (2019)
31. Rheinboldt, W.: Methods for Solving Systems of Nonlinear Equations, 2nd edn. Society for Industrial and Applied Mathematics, Philadelphia (1987)
32. Wolpert, D., Macready, W.: No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1(1), 67–82 (1997)
An Approximation Scheme for a Bilevel Knapsack Problem Ashwin Arulselvan and Altannar Chinchuluun
Abstract The Global Fund Allocation Problem (GFAP) is a variation of the bilevel knapsack problem comprising two players, a leader and a follower, each equipped with a budget. There is a set of projects with certain costs. In the setting we consider, the profit valuations of these projects are the same for both the leader and the follower, and projects can be picked fractionally. In addition, there is a special project of exclusive interest to the follower (i.e. the leader's profit for this project is 0). The leader is interested in providing a cost offset to these projects such that the total offset is within the leader's budget. The follower then solves a parametrised continuous knapsack problem over the projects with the offset costs. The leader's objective is to maximise the profits of the selected projects. In this work, we provide a complexity result and a polynomial time approximation scheme for this special case of the GFAP. This bilevel problem has continuous variables at both the upper and lower levels, but its bilevel nature makes it a difficult problem to solve. Keywords Knapsack problem · Bilevel optimization · Global fund allocation problem
1 Introduction

Bilevel programming problems have received much attention over the last few decades due to their practical applications in game theory, economics, and management science. A bilevel programming problem involves two optimization problems connected in a hierarchical way. The objective of the upper-level problem

A. Arulselvan — Department of Management Science, Strathclyde Business School, Glasgow, UK. e-mail: [email protected]
A. Chinchuluun — Business School, National University of Mongolia, Ulaanbaatar, Mongolia. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. R. Enkhbat et al. (eds.), Optimization, Simulation and Control, Springer Proceedings in Mathematics & Statistics 434, https://doi.org/10.1007/978-3-031-41229-5_3
depends on the solution of the lower-level problem [1–4]. In general, such problems are challenging to solve due to their complexity and non-convexity [5, 7]. Morton et al. [9] proposed a bilevel programming problem with knapsack constraints, called the Global Fund Allocation Problem (GFAP), as an alternative to traditional price subsidies provided to health-related projects. Traditionally, subsidies were provided by donor agencies via a cost-benefit analysis, in which projects are ranked by their cost-to-benefit ratio and funded in that order subject to the available budget. The recipients of this funding, typically low- or middle-income countries, can then offset their budgets by the subsidies allocated for the health projects. The rationale behind the new subsidy structure proposed in [9] is that cost-effective health projects should be funded by the recipients themselves, while subsidies are provided for projects that the recipients would not otherwise pick. This maximises the utility of the total set of healthcare projects that get funded. We consider the setting where projects can be picked fractionally and the follower solves a continuous knapsack problem.
2 Problem Definition

In GFAP, there are two players: a donor, who plays the role of the leader, and a recipient, who plays the role of the follower. We will use the notation [n] to represent the set {1, . . . , n}. A set [n] of healthcare projects with integer costs c : [n] → Z+ and profits p : [n] → Z+ is given. The donor agency and the recipient are equipped with budgets B_out and B_in, respectively. In addition, there is an external (non-healthcare) project that is of exclusive interest to the recipient, with profit p_0 and cost c_0. Note that this external project represents a portfolio of projects, each with a profit and a cost, so ideally p_0 should be modelled as a piecewise linear concave function. Let [m] be the set of non-healthcare projects with profits p_0^i and costs c_0^i, for i = 1, . . . , m, and let σ order the projects in [m] by decreasing profit-to-cost ratio. The piecewise linear profit function would then be given as

$$p_0(x) = \begin{cases} p_0^{\sigma(1)}\, x, & x \le c_0^{\sigma(1)}, \\ p_0^{\sigma(2)}\, x, & c_0^{\sigma(1)} \le x \le c_0^{\sigma(1)} + c_0^{\sigma(2)}, \\ \quad\vdots \\ p_0^{\sigma(i)}\, x, & \sum_{j=1}^{i-1} c_0^{\sigma(j)} \le x \le \sum_{j=1}^{i} c_0^{\sigma(j)}, \\ \quad\vdots \\ p_0^{\sigma(m)}\, x, & x \ge \sum_{j=1}^{m} c_0^{\sigma(j)}. \end{cases}$$
For now we study the simple case where .p0 is a linear function. We later provide an intuition for how our algorithm could be adapted to the piecewise-linear case.
GFAP is a bilevel problem in which the upper-level decision maker, the donor, decides on the level of subsidy for the healthcare projects in the set [n], within the budget B_out, so as to maximise the profit of the healthcare projects obtained from the optimal solution of the knapsack problem solved by the recipient, which comprises the cost-subsidised healthcare projects and the external project, subject to the budget B_in. More formally, the global fund allocation problem (GFAP) can be posed as the following bilevel program.
$$\max \; \sum_{i=1}^{n} p_i x_i \qquad (1)$$
$$\text{s.t.} \quad \sum_{i=1}^{n} c_i a_i \le B_{out}, \qquad (2)$$
$$a \in [0, 1]^n, \qquad (3)$$
$$x \in \arg\max \left\{ \sum_{i=1}^{n} p_i x_i + p_0 x_0 \;:\; \sum_{i=1}^{n} (c_i - c_i a_i)\, x_i + c_0 x_0 \le B_{in},\; (x, x_0) \in [0, 1]^{n+1} \right\} \qquad (4)$$
Solution procedures for bilevel problems normally follow an optimistic or a pessimistic approach. When there are alternative optimal solutions to the follower's problem, which is parametrised by the decision made by the leader, the follower chooses the solution that is most (resp. least) favourable to the leader in an optimistic (resp. pessimistic) approach. GFAP only considers the optimistic approach, as we expect cooperation between the leader and the follower.

Theorem 1 GFAP is NP-hard.

Proof We prove Theorem 1 through a reduction from the KNAPSACK problem, in which we are given n items, each item i having a profit p_i and cost c_i, and a budget B. The decision version asks whether there is a set of items whose combined profit is at least a certain value K and whose combined cost does not exceed the budget B. The reduction consists of creating the 0th project with profit and cost 0, while the rest of the items correspond to the projects in the outer problem. We take B_out = B and B_in = 0. The KNAPSACK problem has a YES answer if and only if the corresponding GFAP instance has a solution of value at least K, as we have a one-to-one correspondence.
2.1 Algorithm

We will first give an algorithm for the optimistic approach that involves solving a mixed integer program, and we will show its equivalence to GFAP. We will later provide a polynomial time approximation scheme to solve the mixed integer program. This variation of the knapsack problem with setup costs [8] might be of independent interest. The follower picks the best solution with respect to the leader in the event of a tie in an optimistic approach. Let us order the n projects by nonincreasing ratio p_i / c_i. Let π be this ordering, so we have

$$\frac{p_{\pi(1)}}{c_{\pi(1)}} \ge \frac{p_{\pi(2)}}{c_{\pi(2)}} \ge \cdots \ge \frac{p_{\pi(n)}}{c_{\pi(n)}}.$$
Let S be the set of projects whose profit-to-cost ratio is better than p_0 / c_0, i.e., S = {i : p_i / c_i ≥ p_0 / c_0, i = 1, . . . , n}. We denote the complement of S by S̄ = {1, . . . , n} \ S. For every project i ∈ [n], let m_i be the cost of the project at which the inner problem finds it competitive compared to the 0th project; more precisely, m_i = p_i c_0 / p_0. In other words, if the outer problem subsidises project i by c_i − m_i, then project i becomes profitable enough for the inner problem to favour it over project 0, provided there is budget to accommodate it. Note that by this definition we have m_i = c_i for all i ∈ S.

We now make a guess on whether project 0 is picked in the optimal solution of GFAP: not at all, fully, or fractionally. If it is picked fully, then we know the remaining projects have a combined budget of B_out + B_in − c_0 and they can be picked fractionally. So the problem reduces to solving the following LP:

$$P_1: \;\max \; \sum_{i=1}^{n} p_i x_i$$
$$\text{s.t.} \quad \sum_{i=1}^{n} c_i x_i \le B_{out} + B_{in} - c_0,$$
$$x \in [0, 1]^n.$$

Let the optimal value of the above problem be P_1. It is fairly easy to compute a subsidy for the projects in the outer problem of GFAP for which we obtain the same optimal value as the above linear program. In the event that project 0 is either picked fractionally or not picked at all, we know that every other project that was picked is more favourable than project 0 for the inner problem.
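Since P_1 is a continuous knapsack problem, it can be solved greedily without an LP solver. The following is a minimal sketch (our illustration, not the authors' code), assuming positive costs; at most one item ends up fractional.

```python
def continuous_knapsack(p, c, budget):
    """Greedy solution of a P1-style continuous knapsack:
    max sum p_i x_i  s.t.  sum c_i x_i <= budget, 0 <= x_i <= 1.
    Items are taken in nonincreasing profit-to-cost order."""
    n = len(p)
    x = [0.0] * n
    order = sorted(range(n), key=lambda i: p[i] / c[i], reverse=True)
    for i in order:
        if budget <= 0:
            break
        take = min(1.0, budget / c[i])  # take the item fully if it fits
        x[i] = take
        budget -= take * c[i]
    return x
```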
We now propose the following mixed integer program and show its equivalence with GFAP in the event that project 0 is not picked fully (i.e., it is picked fractionally or not at all):

$$P_2: \;\max \; \sum_{i=1}^{n} p_i x_i \qquad (5)$$
$$\text{s.t.} \quad \sum_{i=1}^{n} c_i x_i \le B_{out} + B_{in}, \qquad (6)$$
$$\sum_{i=1}^{n} m_i y_i \le B_{out}, \qquad (7)$$
$$x_i \le y_i, \quad i = 1, \ldots, n, \qquad (8)$$
$$y \in \{0, 1\}^n, \qquad (9)$$
$$x \in [0, 1]^n. \qquad (10)$$
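Where a MIP solver is available, P_2 can be written down directly. Below is a minimal sketch using the open-source PuLP modeller (our choice of tool, not the authors'; data are passed as plain lists).

```python
import pulp

def solve_P2(p, c, m, B_out, B_in):
    """Sketch of the MIP P2, constraints (5)-(10)."""
    n = len(p)
    prob = pulp.LpProblem("P2", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x{i}", 0, 1) for i in range(n)]             # continuous, (10)
    y = [pulp.LpVariable(f"y{i}", cat="Binary") for i in range(n)]     # (9)
    prob += pulp.lpSum(p[i] * x[i] for i in range(n))                  # objective (5)
    prob += pulp.lpSum(c[i] * x[i] for i in range(n)) <= B_out + B_in  # (6)
    prob += pulp.lpSum(m[i] * y[i] for i in range(n)) <= B_out         # (7)
    for i in range(n):
        prob += x[i] <= y[i]                                           # (8)
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [v.value() for v in x], [v.value() for v in y]
```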
Let the optimal value of P_2 be P_2. We can now return the solution corresponding to P_1 if P_1 ≥ P_2, and P_2 otherwise. We will later provide a PTAS to solve P_2; for the moment we just show the equivalence between GFAP and P_2. Note that m_i is not the optimal subsidy for the projects: it is non-trivial to retrieve the subsidy from the optimal solution of P_2, and there can be more than one way to subsidise the projects corresponding to the optimal projects picked in P_2. We provide a method to compute such a subsidy and show the equivalence of GFAP and P_2 in Theorem 2. We first give the following lemma on the optimality structure of a solution of P_2.

Lemma 1 There exists an optimal solution of P_2 in which every x_i takes a value of 0 or 1, except for at most one project.

Proof This observation follows from the fact that if we know the optimal y_i, then fixing those variables in P_2 gives us a continuous knapsack problem, and the optimality structure of the continuous knapsack problem naturally applies to P_2. We will assume that we work with such an optimal solution in our discussion.

Theorem 2 Let $(\tilde{x}, \tilde{y})$ and $(x^*, y^*)$ be the optimal solutions of GFAP and P_2, respectively. If $\tilde{x}_0 < 1$, then $\sum_{i=1}^{n} p_i \tilde{x}_i = \sum_{i=1}^{n} p_i x_i^*$.
Proof (⇒) We first compute a subsidy for the projects in the outer problem using the solution of P_2 and solve GFAP. The idea is that, for this computed subsidy, the inner problem will pick the solution corresponding to the optimal solution of P_2. Algorithm 1 builds the subsidy for the projects as the vector a. The subsidy is valid (the total subsidy provided by the outer problem is at most B_out) by construction:

$$\sum_{i=1}^{n} a_i = \sum_{i \in R} a_i + a_\ell + \sum_{i \in [n] \setminus R \cup \{\ell\}} a_i \qquad (11)$$
Algorithm 1 Compute subsidies from P_2
for i ∈ [n] do
  if y_i^* = 1 then a_i = m_i end if
end for
R = ∅
while B_in > 0 do
  for i ∈ π(1), . . . , π(n) do
    if y_i^* = 1 then
      if B_in − (c_i − m_i) > 0 then
        R = R ∪ {i}
        B_in = B_in − (c_i − m_i) x_i^*
      else
        a_i = a_i + (c_i − m_i) − B_in
        B_in = 0
        ℓ = i
      end if
    end if
  end for
end while
for i ∈ [n] \ R ∪ {ℓ} do
  a_i = a_i + (c_i − m_i) x_i^*
end for
$$= \sum_{i \in R} m_i y_i^* + m_\ell y_\ell^* + \left((c_\ell - m_\ell) - B_{in}\right) y_\ell^* + \sum_{i \in [n] \setminus R \cup \{\ell\}} \left( m_i y_i^* + (c_i - m_i)\, x_i^* \right) \qquad (12)$$
$$\le \sum_{i \in R} m_i y_i^* + (c_\ell - B_{in}) + \sum_{i \in [n] \setminus R \cup \{\ell\}} \left( m_i y_i^* + (c_i - m_i)\, y_i^* \right) \qquad (13)$$
$$\le \sum_{i \in R} m_i y_i^* + (c_\ell - B_{in}) + \sum_{i \in [n] \setminus R \cup \{\ell\}} c_i y_i^* \qquad (14)$$
$$\le \sum_{i \in R} m_i x_i^* + (c_\ell - B_{in}) + \sum_{i \in [n] \setminus R \cup \{\ell\}} c_i x_i^* \qquad (15)$$
$$\le \sum_{i=1}^{n} c_i x_i^* - B_{in} \qquad (16)$$
$$\le B_{out} \qquad (17)$$
Inequality (15) comes from Lemma 1. We now claim that, with the subsidy vector a, the inner problem produces the same output as P_2. We note that all projects with y_i^* = 1 that lie in [n] \ R are completely subsidised by the outer problem, so they will be picked by the inner problem. For all projects in R \ {ℓ}, we know that they are subsidised to the point where they are more profitable than project 0 and

$$\sum_{i \in R} (c_i - m_i) \le B_{in}.$$
So they will be picked in the inner problem as well. Finally, project ℓ is partially subsidised, is more competitive than project 0, and will also be picked in the inner problem. In addition, notice that there is no budget left for the inner problem to pick any other project with y_i^* = 0.

(⇐) We need to show that, under the assumption that GFAP has an optimal solution with x_0 ≠ 1, the optimal solution of GFAP can be used to construct a feasible solution for P_2 with the same objective value. Since x_0 is not picked fully, every other picked project is preferred over project 0, which implies that those projects are subsidised at least to the competitive level of project 0. We take the corresponding y_i values to be 1, which satisfies constraint (7), and take x_i to be the same values as in GFAP for all projects in [n]. It is easy to verify that, for these chosen values, constraints (6) and (8) are satisfied.
2.2 A Polynomial Time Approximation Scheme for P_2

We first give an intuition for the generalisation of this problem in which there is more than one external project, m in total (in the current case we have just project 0, i.e., m = 1). We have already noted that in this case we model the profit function as a piecewise linear concave function. Clearly, an optimal solution will pick projects in the order of the pieces of the concave function, since the inner problem is a linear knapsack problem. So, for each project i in this order, we make a guess on whether this project is not picked, or is fractionally picked while project i − 1 is fully picked. For each of these guesses, we solve the corresponding MIP above. Since there are polynomially many pieces, this still results in a PTAS for the general case.

The algorithm works very similarly to the algorithm proposed by Frieze and Clarke [6] for the multidimensional knapsack problem. We have two knapsack constraints, but we also have to take into account the two different types of variables for each project, so we modify the algorithm slightly. From Lemma 1 we have that, given a solution y, the problem reduces to a linear knapsack problem, so there exists an optimal solution to P_2 with at most one fractional x variable. Now observe that the following LP is a valid relaxation of P_2:

$$P_3: \;\max \; \sum_{i=1}^{n} p_i x_i$$
$$\text{s.t.} \quad \sum_{i=1}^{n} c_i x_i \le B_{out} + B_{in},$$
$$\sum_{i=1}^{n} m_i x_i \le B_{out},$$
$$x \in [0, 1]^n.$$

In fact, it has the exact same optimal value as the LP relaxation of P_2: given an optimal solution of the LP relaxation, we can reduce every y_i variable to the value of the corresponding x_i variable; this retains feasibility, and the objective value remains the same.

For a given ε > 0 chosen as the parameter of the PTAS, choose k = ⌈2(1 − ε)/ε⌉. Frieze and Clarke enumerated all subsets of items of size less than k, fixed them to be picked in the solution, and removed all items that cost more than the cheapest item in the set. For our problem, we enumerate all subsets S ⊆ [n] such that |S| ≤ k + 1. Each subset S_k ⊊ S of k items acts in the same way as the set used in Frieze and Clarke's algorithm. Let p_min = min_{i∈S_k} p_i. The (k + 1)st item is a wild card item that may or may not have a profit greater than p_min; the idea is that it acts as our guess for the possibly fractional healthcare project picked in an optimal solution. We now fix all variables in a manner similar to [6]:

$$x_j = \begin{cases} 1, & j \in S_k, \\ 0, & j \in T(S_k), \end{cases}$$

where T(S_k) := {j ∈ [n] \ S_k : p_j > p_min}. Let N(S) := [n] \ S ∪ T(S_k) and N(S_k) := [n] \ S_k ∪ T(S_k). Note that all other items j ∈ N(S) have a profit p_j ≤ p_min. Now we solve the following modified version of P_3 for a specific set S_k:

$$P_4(S_k): \;\max \; \sum_{i \in N(S_k)} p_i x_i$$
$$\text{s.t.} \quad \sum_{i \in N(S_k)} c_i x_i \le B_{out} + B_{in},$$
$$\sum_{i \in N(S)} m_i x_i + m_{k+1} y_{k+1} \le B_{out},$$
$$x_{k+1} \le y_{k+1},$$
$$x \in [0, 1]^{|N(S_k)|}, \quad y_{k+1} \in \{0, 1\}.$$

Note that P_4(S_k) has only one integer variable, so it is easy to solve by guessing the value of y_{k+1}. Now, for all projects i ≠ k + 1, we round down x_i; for k + 1 the values are taken as they are. Note that we will have at most two fractional x variables, and one of them could be k + 1. More formally, we present the algorithm:

Algorithm 2 PTAS for P_2
Set k := ⌈2(1 − ε)/ε⌉
Set x̄_j = 0 for all j ∈ [n]
for each S ⊊ [n] with |S| ≤ k + 1 do
  for each i ∈ S do
    Set S_k = S \ {i}
    Set x̂_j = 1 for all j ∈ S_k
    Set x̂_j = 0 for all j ∈ T(S_k)
    Solve P_4(S_k) and let x̃ be the solution
    x̂_j = ⌊x̃_j⌋ for all j ∈ N(S)
  end for
  if Σ_{i∈[n]} p_i x̄_i < Σ_{i∈[n]} p_i x̂_i then x̄ = x̂ end if
end for
Return x̄

2.3 Analysis

Theorem 3 The PTAS provided above terminates with a solution whose value is at least (1 − ε) times the optimal value.

Proof The case when no project is picked fractionally is easy and follows directly from Frieze and Clarke's algorithm. We deal here with the case when there is a fractionally picked project. Let the optimal projects picked by P_2 be i_1, i_2, . . . , i_r, i_{r+1}, where the (r + 1)st is the fractionally picked project, and let (x^*, y^*) be the optimal solution. Let p_{i_1} ≥ p_{i_2} ≥ · · · ≥ p_{i_r}. Now consider the subset S of size k + 1 whose first k projects are i_1, . . . , i_k and whose (k + 1)st project is i_{r+1}; let x̃ be the optimal solution of P_4(S_k) for this set, and let x̂ be the rounded-down solution described in the algorithm above. Every project j ∈ N(S) has

$$p_j \le p_{\min} \le \frac{\sum_{t=1}^{k} p_{i_t}}{k}.$$

Let opt be the optimal objective value of P_2. Then

$$\mathrm{opt} = \sum_{i \in [n]} p_i x_i^* \le \sum_{i \in [n]} p_i \tilde{x}_i \le \sum_{i \in [n]} p_i \hat{x}_i + \sum_{j \in N(S)} p_j \tilde{x}_j.$$

The number of projects whose profits we lose by rounding down is at most 2, since we have at most two fractional variables (it is one if one of them is the (k + 1)st project). But we know that these projects have a profit of at most $\frac{1}{k}\sum_{t=1}^{k} p_{i_t}$, and we have $\sum_{t=1}^{k} p_{i_t} \le \sum_{i \in [n]} p_i \hat{x}_i$. So we have

$$\mathrm{opt} \le \sum_{i \in [n]} p_i \hat{x}_i + \frac{2}{k} \sum_{i \in [n]} p_i \hat{x}_i,$$

and, since k ≥ 2(1 − ε)/ε, this yields $\sum_{i \in [n]} p_i \hat{x}_i \ge (1 - \varepsilon)\,\mathrm{opt}$.
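Since P_4(S_k) has a single binary variable, each call inside the enumeration can be solved as two LPs. The sketch below (our illustration; the helper name and data layout are assumptions) uses scipy's linprog and assumes the items fixed to 0/1 by the enumeration have already been removed and the budgets adjusted accordingly.

```python
import numpy as np
from scipy.optimize import linprog

def solve_P4(p, c, m, free, wild, B_out, B_in):
    """Solve a P4(S_k)-style LP over the still-free items plus the wildcard
    item 'wild', enumerating the single binary variable y_{k+1} in {0, 1}."""
    idx = list(free) + [wild]
    best_val, best_x = -np.inf, None
    for y_wild in (0, 1):
        obj = [-p[i] for i in idx]                     # maximise -> minimise -p
        A_ub = [[c[i] for i in idx],                   # budget constraint
                [m[i] for i in free] + [0.0]]          # wildcard enters via y_{k+1}
        b_ub = [B_out + B_in,
                B_out - m[wild] * y_wild]
        bounds = [(0, 1)] * len(free) + [(0, y_wild)]  # x_{k+1} <= y_{k+1}
        res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        if res.success and -res.fun > best_val:
            best_val, best_x = -res.fun, res.x
    return best_val, best_x
```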
3 Conclusion

In this work we studied a special case of a bilevel knapsack problem called the Global Fund Allocation Problem, in which the two players of the bilevel problem value the projects identically. For this variation, we provided an NP-hardness proof and a polynomial time approximation scheme. The approximation scheme is of little practical use in itself; it is presented to establish the existence of an approximation scheme and to characterise the complexity of the problem. A more practical algorithm that targets solving the MIP P_2 should work well in practice, especially when one has a piecewise linear profit function for the non-healthcare project. A number of open questions remain: we do not know whether a pseudo-polynomial algorithm exists for this problem, nor has an FPTAS been ruled out.
References
1. Chinchuluun, A., Pardalos, P.M., Huang, H.-X.: Multilevel (hierarchical) optimization: complexity issues, optimality conditions, algorithms. In: Gao, D.Y., Sherali, H.D. (eds.) Advances in Applied Mathematics and Global Optimization: In Honor of Gilbert Strang, Advances in Mechanics and Mathematics, vol. 17, pp. 197–221. Springer (2009)
2. Colson, B., Marcotte, P., Savard, G.: An overview of bilevel optimization. Ann. Oper. Res. 153(1), 235–256 (2007)
3. Dempe, S.: Foundations of Bilevel Programming. Kluwer Academic, Dordrecht (2002)
4. Dempe, S., Richter, K.: Bilevel programming with knapsack constraints. Cent. Eur. J. Oper. Res. 8, 93–107 (2000)
5. Deng, X.: Complexity issues in bilevel linear programming. In: Migdalas, A., Pardalos, P.M., Varbrand, P. (eds.) Multilevel Optimization: Algorithms and Applications, pp. 149–164. Kluwer Academic, Dordrecht (1998)
6. Frieze, A.M., Clarke, M.: Approximation algorithms for the m-dimensional 0–1 knapsack problem: worst-case and probabilistic analyses. Eur. J. Oper. Res. 15(1), 100–109 (1984)
7. Hansen, P., Jaumard, B., Savard, G.: New branch-and-bound rules for linear bilevel programming. SIAM J. Sci. Stat. Comput. 13(5), 1194–1217 (1992)
8. Michel, S., Perrot, N., Vanderbeck, F.: Knapsack problems with setups. Eur. J. Oper. Res. 196(3), 909–918 (2009)
9. Morton, A., Arulselvan, A., Thomas, R.: Allocation rules for global donors. J. Health Econ. 58, 67–75 (2018)
Efficient Heuristics for a Partial Set Covering Problem with Mutually Exclusive Pairs of Facilities Aleksander Belykh, Tatyana Gruzdeva, Anton Ushakov, and Igor Vasilyev
Abstract Set cover problems are among the most popular and well-studied models in facility location. In this paper, we address an extension of the set covering and partial set covering problems. We suppose that the set of customers to be served is divided into two non-overlapping subsets: "mandatory" customers who must always be covered, and "optional" customers who do not require obligatory coverage. To avoid a trivial solution that covers only the mandatory customers, we suppose that the number of covered optional customers must be at least as large as a predefined threshold. In the real world, local laws may prohibit locating a large number of facilities in some places to reduce harm to people and/or nature. To reflect this, our problem involves pairs of facilities that are mutually exclusive: if a facility from such a pair is located, the other one is prohibited. We formulate this problem as an integer linear program and develop and implement several fast heuristic approaches capable of finding high-quality solutions to large-scale problem instances arising in real-life settings. In particular, we develop a hybrid heuristic involving a greedy algorithm and local search embedded into a variable neighborhood framework. We test our algorithms in a series of intensive computational experiments on both real-life problems and synthetically generated instances. Keywords Partial set covering · Facility location · Local search · VNS · Greedy algorithms · Base stations location
A. Belykh · T. Gruzdeva () · A. Ushakov · I. Vasilyev Matrosov Institute for System Dynamics and Control Theory of SB RAS, Irkutsk, Russia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 R. Enkhbat et al. (eds.), Optimization, Simulation and Control, Springer Proceedings in Mathematics & Statistics 434, https://doi.org/10.1007/978-3-031-41229-5_4
1 Introduction

Facility location problems form an extensive class of combinatorial optimization problems with a wide range of diverse applications. They may be broadly divided into geographical and non-geographical applications. The former include the problems of locating facilities, plants, or infrastructure objects, e.g. ambulance stations, hospitals, schools, etc. The latter include applications of facility location problems in machine learning and other related fields, e.g. in cluster analysis (see [34]). A discrete facility location problem involves a set I = {1, 2, ..., n} of possible sites for locating facilities and a set J = {1, . . . , m} of customers or demands that must be served by facilities. The goal of the problem is to make two decisions: locate facilities at sites from I, and assign customers to open facilities. These decisions are specific to, and dependent on, both the problem objective and the service constraints. With respect to these decisions, one large class of discrete facility location problems imposes that a customer can receive a service only if she/he is located within a certain radius of an open facility. For example, this is of key importance when locating public emergency facilities like fire or ambulance stations (help must be provided within several minutes) or telecommunication towers and surveillance equipment. The location problems of this class are called set cover location problems, as a customer is said to be covered when being within a radius of an open facility.

In this paper we consider an extension of the conventional set covering location problem. The latter is the basic set covering problem aimed at minimizing the number (total cost) of located facilities in order to cover all customers. The problem was first introduced by Hakimi in [20] and formulated as a mathematical program related to facility location in [33]; it is known to be NP-hard [18]. In set covering problems we assume that there is a given threshold value s that defines the so-called "covering distance". Thus, if the distance d_ij between a customer j ∈ J and a facility site i is less than s, we say that customer j is covered by facility i. We denote all facility sites that cover customer j as I_j = {i ∈ I : d_ij ≤ s}. The objective of the problem is to find a subset of I such that all customers are covered by at least one facility from the subset. Facility sites may be assigned weights, in which case the goal is to find a minimum-cost subset that covers all customers. Note that how to measure distances between customers and how to determine the threshold s depends on the particular problem. The problem has been extensively studied over the last 50 years and there are numerous results related to its theoretical and algorithmic aspects (for a survey see [9, 16, 17, 28, 32]). As was noted in [3, 17], the set covering problem is one of the three special structures in pure integer programming with the most widespread applications. The first exact algorithm was proposed by Lemke et al. [24]: a branch-and-bound method based on exploiting the structure of the SCP formulation. Etcheberry [15] developed a branch-and-bound method based on the Lagrangian relaxation, where branching is done on constraints rather than variables.
A cutting plane algorithm combined with various primal and dual heuristics was later proposed in [2], and a branch-and-bound method combined with a dual ascent procedure was developed in [4, 6]. In general, Lagrangian heuristics and procedures based on Lagrangian duality have proved to be extremely efficient for set cover problems due to their nice structure. Further advances in exact algorithms were presented in [1], which developed a cutting plane algorithm based on the so-called SepGcuts, following the idea of local cuts introduced for the travelling salesman problem. Finally, an exact algorithm based on Benders decomposition was developed in [19] to solve a particular case of the SCP "almost satisfying" the consecutive ones property. Comprehensive theoretical studies of the combinatorial structure of SCP can be found in [13, 30, 31], etc.

One of the first effective heuristics for the set cover problem, proposed by Chvátal in the seminal paper [11], is the greedy algorithm, now considered the conventional one for SCP. Being one of the basic NP-hard combinatorial optimization problems, SCP has attracted numerous attempts to develop metaheuristic approaches; probably almost all metaheuristics have been adapted to it, e.g. simulated annealing [7], local search [22, 29], genetic algorithms [5, 25], hybrid metaheuristics [23], etc. Nevertheless, as noted above, among the most effective heuristics are primal-dual ones based on the Lagrangian relaxation. For example, in [8, 10] the authors introduced the so-called core heuristic and incorporated it into a primal-dual approach; its main idea is to use the values of the Lagrange multipliers to identify the most promising variables and fix the others to 0. In [26], the authors developed a heuristic based on solving a surrogate relaxation with a subgradient algorithm, followed by tests to remove columns of the constraint matrix.

There are numerous modifications of the set cover problem proposed in the literature to reflect real-life settings. One notable variant of SCP, named the partial set covering problem (PSCP), assumes that it is not necessary to cover all customers by open facilities. Instead, given a threshold value ε, one needs to find the minimal number of sites for locating facilities in order to cover at least ε customers. The reasoning behind this variant is that, in practical settings, a significant portion of customers may be covered with only a few facilities, whereas covering all customers may require locating a large number of facilities; budget limitations usually hinder locating so many facilities, hence we can demand serving only a portion of the clients. The problem was introduced in [14], where the authors proposed to solve it with a Lagrangian heuristic; it has received little attention since then, with only a few papers outside the field of facility location. Recently, Cordeau et al. [12] proposed an exact algorithm to effectively tackle large-scale PSCP instances with many more customers than facilities, based on branch-and-Benders-cut algorithms that exploit a combinatorial cut-separation procedure.

In this paper we address a modification of the set covering and partial set covering problems. We suppose that the set of customers J to be served is divided into two non-overlapping subsets. A set H ⊆ J contains customers who must be covered ("mandatory" customers), and the set J \ H consists of "optional" customers who
do not require obligatory coverage. To avoid a trivial solution that covers only customers from H, we suppose that the number of covered optional customers must be at least ε. In the real world, local laws may prohibit locating a large number of facilities in some places to reduce harm to people and/or nature. To reflect this, our problem involves pairs of facilities that are mutually exclusive: if a facility from such a pair is located, the other one is prohibited.

In this paper, we formulate this problem as an integer linear program and develop and implement several fast heuristic approaches capable of finding high-quality solutions to large-scale problem instances arising in real-life settings. In particular, we develop a hybrid heuristic involving a greedy algorithm and local search embedded into a variable neighborhood framework. We test our algorithms in a series of intensive computational experiments on both real-life problems and synthetically generated instances, and demonstrate that the algorithms find high-quality solutions for problems that turn out to be intractable for exact MIP solvers.
2 Problem Statement

Let us formulate our set covering problem as an integer program. Recall that we have a set I = {1, . . . , n} of candidate sites for locating facilities and a set J = {1, . . . , m} of customers who are supposed to be covered by a number of facilities. We suppose that each customer j ∈ J may be covered by the facilities I_j ⊂ I. There is a set H ⊆ J of customers that must always be covered by facilities, whereas the other customers are "optional". We suppose that not all optional customers must be served, but at least ε > 0 of them. As some facilities may not be opened simultaneously, we introduce a set V ⊂ I × I of mutually exclusive pairs, i.e., pairs of sites where facilities cannot be located at the same time. Let us introduce the following binary variables (i ∈ I, j ∈ J):

$$y_i = \begin{cases} 1, & \text{if a facility is located at site } i, \\ 0, & \text{otherwise;} \end{cases} \qquad x_j = \begin{cases} 1, & \text{if optional customer } j \text{ is covered}, \\ 0, & \text{otherwise.} \end{cases}$$

With these notations, the problem can be cast as the following integer linear program:

$$\min \; \sum_{i \in I} y_i, \qquad (1)$$
$$\sum_{i \in I_j} y_i \ge 1 \quad \forall j \in H, \qquad (2)$$
$$\sum_{i \in I_j} y_i \ge x_j \quad \forall j \in J \setminus H, \qquad (3)$$
$$\sum_{j \in J \setminus H} x_j \ge \varepsilon, \qquad (4)$$
$$y_{i_1} + y_{i_2} \le 1 \quad \forall (i_1, i_2) \in V, \qquad (5)$$
$$y_i \in \{0, 1\} \quad \forall i \in I, \qquad (6)$$
$$x_j \in \{0, 1\} \quad \forall j \in J. \qquad (7)$$
3 Solution Algorithms As the set cover problem is NP-hard, exact algorithms or primal-dual heuristics often require prohibitively many computing resources and may take a long time to find a quality solution of instances of real-world size involving millions of customers and hundreds of facility location cites. However, this setting is quite natural when locating base stations or cell tower antennas. Here, we try to develop a fast constructive and local search (greedy-interchange) heuristics for our particular case of the partial set cover problem. As far as we know, there only a couple attempts to devise such kind of heuristics even for the conventional partial set cover problem. The main drawback of local search and its combination with constructive heuristics (e.g. greedy algorithms), often referred to as greedy-interchange heuristic, is that it in general converges to a local optimal solution. Note that this holds when the optimization problem is non-convex. Thus, a common practice is to embed a local search (and constructive heuristics) into a metaheuristic framework to try to “jump out” local optima. Among plenty of metaheuristics, the variable neighborhood search is the most best known and widely used one due to its simplicity and high effectiveness. We tried to develop a very fast hybrid heuristic that incorporate a fast local search-based metaheuristic (VNS) with constructive heuristics.
50
A. Belykh et al.
3.1 Constructive Heuristics The most basic and fast type of the solution algorithms for SCP is constructive heuristics, in particular greedy algorithm. For our problem we develop a fast, efficient modification of the conventional greedy algorithm (Chatal’s algorithm), taking into account particular features of the problem. Note that Chvatal’s algorithm assumes a weighted counterpart of SCP, whereas in our problem we suppose that all location sites of the same weight. First of all, we perform a pre-processing step for an input problem instance. We can exclude all but one location sites covering the same sets of customers. Indeed, these locations can be viewed as “duplicates” and excluded from consideration. Secondly, we identify all the customers who have the same set of covering facilities. Indeed, we can consider them as a single customer with weight .wj equal to the number of such “duplicate” customers. In contrast to location sites, we cannot simply discard such customers due to the partial covering constraint (4) that must be satisfied. The pre-processing step can considerably reduce the problem size, which is critical for the run time of solution algorithms. The idea of our greedy algorithm is quite simple. It starts with an empty set of located facilities .I ∗ = ∅. At each step, it finds the facility that covers the maximal number of still uncovered “mandatory” customers, i.e. customers .j ∈ H . The outline of the algorithm is presented in Algorithm 1. Note that due to the set V of mutually exclusive pairs of facilities, not all facility location sites are available at iteration k. Indeed, if a facility i is added to the covering set .I ∗ at some iteration k, ¯ i) ∈ V are discarded and cannot be considered in subsequent iterations ¯ .(i, all sites .i: (see step 5). A covering is not feasible until all “mandatory” customers are covered. However, after serving these customers, a covering may be still infeasible due to the
Algorithm 1 Greedy algorithm Input: I = {1, . . . , n} is a set of candidate sites for base station locations, J = {1, . . . , m} is a set of customers to be served by facilities, H ⊆ J is a subset of customers, which must be covered, Mi ⊆ H is a subset of mandatory customers, which can be served from site i ∈ I , V ⊂ I × I is a subset of site pairs, which are mutually exclusive. Output: Covering I ∗ . 1: I ∗ ← ∅ |Si | < ε do 2: while H /= ∅ and i∈I ∗
3:
Find site i ∗ ∈ I such that i ∗ = argmax |Mi |. i∈I
4: Locate base station in site i ∗ , i.e. I ∗ ← I ∗ ∪ {i ∗ }. 5: I ← I \ {i ∗ }. ¯ ∈ V then 6: if pair (i, i) ¯ 7: I ← I \ {i}. 8: end if 9: H ← H \ Mi ∗ , Ms ← Ms \ Mi ∗ , s ∈ I, s /= i ∗ . 10: end while
Efficient Heuristics for a Partial Set Covering Problem
51
Algorithm 2 Random Search Input: I = {1, . . . , n} is a set of candidate sites for base station locations, J = {1, . . . , m} is a set of customers to be served by facilities, H ⊆ J is a subset of customers, which must be covered, Mi ⊆ H is a subset of mandatory customers, which can be served from site i ∈ I , V ⊂ I × I is a subset of site pairs, which are mutually exclusive. Output: Solution I ∗ . 1: I ∗ ← ∅. |Si | < ε do 2: while H /= ∅ and i∈I ∗
3: Randomly choose site i ∗ ∈ I and locate the base station. 4: I ∗ ← I ∗ ∪ {i ∗ }. 5: I ← I \ {i ∗ }. ¯ ∈ V then 6: if pair (i, i) ¯ 7: I ← I \ i. 8: end if 9: H ← H \ Mi ∗ . 10: end while
partial covering constraint. If this happens, the algorithm proceeds until the minimal number of customers is covered. Efficient implementation of the greedy algorithm followed by the pre-processing step can be extremely fast even for relatively large problem instances. Another straightforward algorithm that we can use to fast find initial feasible covering solutions is random covering algorithm. Its outline is presented in Algorithm 2. Its idea is very simple and consists in locating facilities at random until all mandatory customers have been served taking into account the partial covering constraints and mutually exclusive constraints. Note this algorithm is simple a natural way of generating feasible solutions for our partial set cover problem.
3.2 Local Search Though the constructive heuristics presented in the previous section may provide a relatively good solutions, these solutions are actually not locally optimal. The next natural step is to try to improve an initial constructive solution by a local search algorithm. Such combined algorithms are widely referred to in the facility location literature as greedy-interchange heuristics. A local search algorithm begins form an initial solution, called incumbent, and attempts to find a better solution in a neighborhood of incumbent. The neighborhood consists of all the (sometimes not necessarily feasible) solutions obtained by some small modifications of incumbent. If a neighbor with better objective value is found, it replaces the incumbent and the process is repeated for the new solution. If there is no a better solution, the current incumbent is claimed to be locally optimal in the neighborhood and the algorithm is terminated. Note that there are two principal
52
A. Belykh et al.
Algorithm 3 Local search Input: S — initial incumbent. Output: Local optimal solution S ∗ . 1: for i ∈ S do 2: Exclude facility i from solution S. 3: Open new facilities from I \ S in greedy fashion until feasibility is recovered. Denote the new solution as S ' (i), compute Z(S ' (i)). 4: end for 5: Find i ∗ ← argmin Z(S ' (i)). i
6: 7: 8: 9: 10: 11: 12:
if Z(S(i ∗ )) < Z(S) then S ← S'. go to step 1 else stop. end if Remove redundant facilities from S ∗ .
strategies of how to choose the next incumbent. In case of best improvement strategy, the algorithm always goes to the neighbor with the best objective value. At the same time, the first-improvement strategy supposes that the incumbent is replaced with the first discovered neighbor solution with better objective value. The idea of the local search algorithm for our particular partial set covering problem is similar to one of other local search techniques developed for other facility location problems. Its main concept is to search the neighborhood of the current incumbent obtained by dropping one open facility. As such a solution is actually infeasible, the feasibility is repaired by greedily opening the facilities in other places. A general outline of our local search algorithm is presented in Algorithm 3. Our local search algorithm follows the best improvement strategy, i.e. it enumerates all neighbor solutions and finds the best one. Greedy opening of facilities means that each location sites is evaluated with respect to the number of uncovered “mandatory” customers that it covers. Note that a new facility must not be conflicting with any facility in the current incumbent. If all mandatory customers are covered but the partial covering constraint does not hold, the algorithms proceed locating facilities in the same greedy fashion but with respect to the optional customers. The final step of the local search is to remove redundant facilities, i.e. those dropping of which does not affect solution feasibility. Note that the algorithms examines neighbor solutions obtained by dropping single facility. However, removing more than one facility is possible and allows one to extensively search for a new incumbent.
3.3 Variable Neighborhood Search As local search and greedy-interchange heuristics, in spite of their effectiveness and efficiency, may fast stuck in a local optimal solution, an obvious strategy is
Efficient Heuristics for a Partial Set Covering Problem
53
to embed the local search (and constructive heuristics) into a metaheuristic, which allows one to leave local optima and systematically search for better solution. Among a huge number of metaheuristics, one of the most widely used one is the variable neighborhood search (VNS) proposed by Mladenovi´c, and Hansen [27]. It has become especially popular due its simplicity, high speed, and effectiveness even for hard large-scale optimization problems. There numerous modifications and extensions of VNS. In our approach we adapt the most natural and basic scheme of the algorithm. The main idea of VNS is to switch systematically between neighborhoods to find a better solution [21]. The algorithm begins by running a local search that provides a locally optimal solution. To proceed looking for other possibly better solutions, the algorithm tries to explore more distant solutions from the current one. In other words, it randomly picks a solution from a ’far’ neighborhood and starts a local search from this new solution. This phase is called shaking. The distance between any two solutions in general is defined as the number of components which the solutions differ in. Defining the distance requires supplementing the solution space with an appropriate metric. If a better solution than the current incumbent is not found, the algorithm stays at the current incumbent and performs another shaking, otherwise jumps into a new improved solution. The algorithm stops if after a fixed number of shakings a better solution is not found. The neighborhoods used by VNS are usually ordered so as the search is performed consequentially for solutions from the nearest neighborhood to ones from the farthest neighborhood. This strategy is allowed to alternate intensification and diversification steps within the algorithm to discover local optimal solutions in a systematic way. Note that any feasible solution for our problem can be specified by cover set S corresponding to vector of variables y. We define a quite natural distance metric for the solution space U as d(S1 , S2 ) = |S1 \ S2 | = |S2 \ S1 |, S1 , S2 ∈ U
.
Using this distance measure, we can define the family of neighborhoods around a feasible solution as ¯ = k}, k = 1, . . . n − 1, Nk (S) = {S¯ ∈ U : d(S, S)
.
i.e. neighborhood .Nk (y) consists of all solutions that have distance k from S. We should especially note that solutions from .S¯ ∈ Ns (S) are not actually feasible. Indeed, removing some open facilities, we can get a solution that does not cover all “mandatory” customers or the enough number of “optional” customers. Thus, after shaking (i.e. selecting a neighbor from .Nk ), we have to recover feasibility of ¯ e.g. by using a greedy approach described in the the chosen neighbor solution .S, local search. We apply the aforementioned local search procedure to find a local optimal solution.
54
A. Belykh et al.
Algorithm 4 VNS Input: S – initial feasible solution. Output: solution S ∗ . 1: k ← 1 2: while k