2018 MATRIX Annals 9783030382292, 9783030382308



English Pages 430 Year 2020


Table of contents :
Preface......Page 6
Contents......Page 29
I Refereed Articles......Page 31
Part 1 On the Frontiers of High Dimensional Computation......Page 32
1 Introduction......Page 33
2 The main result......Page 37
2.1.2 Data-explicit estimates for the RTE......Page 40
2.1.3 Spectral properties of the RTE......Page 41
References......Page 42
2 A reduced-order-model Bayesian obstacle detection algorithm......Page 44
1 Introduction......Page 56
1.1 Variational eigenvalue problems......Page 57
2 Bounding the spectral gap......Page 59
3 Numerical results......Page 65
References......Page 69
3 Bounding the spectral gap for an elliptic eigenvalue problem with uniformly bounded stochastic coefficients......Page 55
1 Introduction......Page 70
2 Restricted Randomized Algorithms in a General Setting......Page 71
3 The Power of Restricted Randomized Algorithms......Page 80
Reference......Page 83
1 Introduction and Preliminaries......Page 85
2 Results......Page 89
References......Page 101
1 Introduction......Page 103
2 Preliminaries......Page 105
3 Existence result for worst-case error......Page 107
4 Numerical experiments on the conjecture......Page 118
References......Page 120
1 Introduction......Page 121
2 Approximation in periodic spaces of Sobolev type......Page 123
3 Preasymptotics for approximation in isotropic Sobolev spaces......Page 125
4 Preasymptotics for approximation in mixed Sobolev spaces......Page 133
References......Page 136
1 Introduction......Page 137
2 Background......Page 138
3 Numerical algorithms for semi-supervised learning......Page 139
4 Experimental results......Page 141
References......Page 143
1 Introduction......Page 144
1.1 The model problem......Page 145
2 The parametric and nonparametric P1–simplicial and quadrilateral nonconforming finite elements......Page 146
2.1 The parametric simplicial and rectangular NC elements in two and three dimensions......Page 147
2.2.2 The P1–NC quadrilateral element......Page 149
3 The P1–nonconforming polyhedral finite element......Page 150
3.2 Basis and its dimension......Page 153
3.4 The P1–NC polyhedral Galerkin methods......Page 154
References......Page 155
Part 2 Month of Mathematical Biology......Page 157
1 Introduction......Page 158
2.1 Background......Page 159
2.2 Analytical results for the errors of the stochastic QSSA......Page 163
References......Page 167
11 Accurate particle-based reaction algorithms for fixed timestep simulators......Page 169
1 Introduction......Page 170
2 Assumptions and Definitions......Page 171
3.1 Brownian bridge method, two-step algorithm......Page 173
3.2 Brownian bridge method, three-step algorithm......Page 175
3.3 RDF-matching, two-step......Page 176
3.4 RDF-matching, three-step......Page 178
3.5 RDF-matching, two-step with remapping......Page 179
3.6 Example of RDF-matching with remapping......Page 181
4 Discussion......Page 182
References......Page 183
Part 3 Recent Trends on Nonlinear PDEs of Elliptic And Parabolic Type......Page 185
1 Decay estimates, methods, results and perspectives......Page 186
2 Recurrence and transiency of long jump random processes......Page 194
References......Page 199
1 Introductory comments......Page 202
1.1 Moduli of continuity......Page 203
1.2 Motivation: Zero counting for equations in one space variable......Page 204
1.3 The heat equation on Euclidean space......Page 206
1.4 The Neumann heat equation on a bounded domain......Page 208
1.5 The Payne-Weinberger inequality......Page 209
2.1 Riemannian manifolds: Distance, Curvature and heat equations......Page 210
2.2 The Ricci non-negative case......Page 211
2.3 Nonlinear eigenvalues......Page 215
3.1 The fundamental gap conjecture......Page 217
3.2 The 1D case......Page 218
3.3 Converting to a Neumann problem......Page 219
3.4 Sharp log-concavity......Page 221
3.6 Sharp lower bound in terms of modulus of convexity of the potential......Page 224
4.1 Gradient estimates and P-functions......Page 225
4.2 The two-point estimate......Page 226
References......Page 229
1 Introduction......Page 231
1.1 A fractional obstacle problem......Page 232
1.2 General assumptions......Page 234
2 Main results......Page 235
2.1 Further comments and strategy of proofs......Page 236
3 Some mathematical background......Page 238
4 The case of convex obstacles: proofs of the main Theorem......Page 240
References......Page 243
1 Introduction......Page 245
2 Physical considerations......Page 246
3 Symmetry results......Page 252
3.1 Symmetry properties for the Allen-Cahn equation......Page 255
3.2 Symmetry properties for the fractional Allen-Cahn equation......Page 256
3.3 Symmetry properties for the water wave problem......Page 258
Acknowledgement......Page 261
References......Page 262
1 Introduction and main result......Page 265
2 Counterexamples in convex domains and proof of the theorem......Page 267
References......Page 277
17 A potential well argument for a semilinear parabolic equation with exponential nonlinearity......Page 280
1 Model parabolic problem......Page 281
2 Stable and unstable sets......Page 284
3 Sketch of the proof of Theorem 1......Page 285
References......Page 287
18 Quantitative analysis of a singularly perturbed shape optimization problem in a polygon......Page 289
1 Introduction......Page 290
2 Setting of the problem and main results.......Page 292
References......Page 297
1.1 The energy, notation......Page 298
1.3 The boundary value problem......Page 299
2.1 Previous geometric gap lemmas......Page 300
3 Estimates......Page 301
References......Page 304
1 Introduction......Page 305
2 Critical points in the Wulff problem......Page 308
3 Minimizers in the anisotropic liquid drop model......Page 310
References......Page 312
21 Liouville-type theorems for nonlinear elliptic and parabolic problems......Page 315
1.1 Motivation and classical results: Fujita, Gidas-Spruck, Liouville......Page 316
1.2 Equations vs. inequalities – a first method: rescaled test-functions......Page 317
2.1 Results and conjectures......Page 318
2.2 Radial case: proof based on zero-number......Page 319
2.3 Nonradial case: proof based on similarity variables and energy estimates......Page 321
3.1 Results......Page 323
3.2 Sketch of proof of Theorem 3.1(i) (initial-final blow-up estimate in Rn)......Page 324
4.1 Elliptic systems I: Lane-Emden......Page 326
4.2 Elliptic systems II: positive self-interaction......Page 328
4.3 Elliptic systems III: negative self-interaction......Page 329
5.2 Gradient structure-homogeneous case......Page 332
5.3 Gross-Pitaevskii case......Page 333
References......Page 335
II Other Contributed Articles......Page 338
Part 4 Algebraic Geometry, Approximation and Optimisation......Page 339
1 Introduction......Page 340
3.1 Prototype construction......Page 341
4.1 Vandermonde and generalised Vandermonde matrices......Page 343
5 Discussions and future research directions......Page 345
References......Page 346
Part 5 Dynamics, Foliations, and Geometry In Dimension 3......Page 347
1 Introduction......Page 348
1.1 Homotopy, integrability, and conjugacy......Page 349
1.2 Results......Page 350
2 General Outline......Page 351
2.1 Dichotomy for foliations......Page 352
2.3 Double translation......Page 353
3 A key general proposition......Page 354
3.2 Gromov hyperbolic leaves......Page 355
3.3 Coarse contraction and a key proposition......Page 356
4.2 Changing the lifts......Page 357
5.1 Uniform foliations and transverse pseudo-Anosov flows......Page 358
5.2 Forcing a particular type of dynamics on periodic center leaves......Page 359
5.4 No mixed behavior......Page 360
6.2 Double translation in hyperbolic manifolds......Page 361
6.3 More general manifolds......Page 362
References......Page 363
1 Introduction......Page 365
2 Projectively Anosov plugs......Page 366
2.1 Definition of projectively hyperbolic plug......Page 367
2.2 A plug with six Reeb components......Page 368
3 Filling a pseudo-Anosov flow......Page 370
4 Filling incoherent repellers......Page 371
5 A foliation/contact interpretation of the examples......Page 372
6 Questions and future directions......Page 374
References......Page 375
1 Introduction......Page 376
2 The proof......Page 377
3 Proof of lemma 2.4......Page 382
References......Page 384
Part 6 Geometric and Categorical Representation Theory......Page 385
1 Introduction......Page 386
2 An Introduction to Geometrically Adhesive Functors, with Motivation......Page 389
3 The Adhesive Site......Page 405
4 The Adhesive Fundamental Group......Page 420
5 Applications to Local Systems......Page 427
References......Page 430

MATRIX Book Series 3

David R. Wood (Editor-in-Chief), Jan de Gier, Cheryl E. Praeger, Terence Tao (Editors)

2018 MATRIX Annals


MATRIX is Australia’s international and residential mathematical research institute. It facilitates new collaborations and mathematical advances through intensive residential research programs, each lasting 1–4 weeks.

More information about this series at http://www.springer.com/series/15890


Editors

David R. Wood (Editor-in-Chief), School of Mathematics, Monash University, Melbourne, VIC, Australia

Jan de Gier, School of Mathematics and Statistics, University of Melbourne, Parkville, VIC, Australia

Cheryl E. Praeger, Department of Mathematics and Statistics, The University of Western Australia, Perth, WA, Australia

Terence Tao, Department of Mathematics, University of California Los Angeles, Los Angeles, CA, USA

ISSN 2523-3041 ISSN 2523-305X (electronic) MATRIX Book Series ISBN 978-3-030-38229-2 ISBN 978-3-030-38230-8 (eBook) https://doi.org/10.1007/978-3-030-38230-8 Mathematics Subject Classification (2010): 14-XX, 37-XX, 35-XX, 18-XX, 92Bxx, 91-08 © Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

MATRIX is Australia’s international and residential mathematical research institute. It was established in 2015 and launched in 2016 as a joint partnership between Monash University and The University of Melbourne, with seed funding from the ARC Centre of Excellence for Mathematical and Statistical Frontiers. The purpose of MATRIX is to facilitate new collaborations and mathematical advances through intensive residential research programs, which are currently held in Creswick, a small town nestled in the beautiful forests of the Macedon Ranges, 130 km west of Melbourne.

This book is a scientific record of the eight programs held at MATRIX in 2018:

• Non-Equilibrium Systems and Special Functions
• Algebraic Geometry, Approximation and Optimisation
  Guest Editors: Enrico Carlini, Jochen Garcke, James Saunderson
• On the Frontiers of High Dimensional Computation
  Guest Editor: Frances Kuo
• Month of Mathematical Biology
  Guest Editors: Mark Flegg and James Osborne
• Dynamics, Foliations, and Geometry In Dimension 3
  Guest Editors: Jonathan Bowden and Andy Hammerlindl
• Recent Trends on Nonlinear PDEs of Elliptic and Parabolic Type
  Guest Editor: Daniel Hauer
• Functional Data Analysis and Beyond
• Geometric and Categorical Representation Theory
  Guest Editor: Peter McNamara

The MATRIX Scientific Committee selected these programs based on scientific excellence and the participation rate of high-profile international participants. This committee consists of: Jan de Gier (University of Melbourne, Chair), Ben Andrews (Australian National University), Peter Bühlmann (ETH Zurich), Alison Etheridge (University of Oxford), Gary Froyland (University of New South Wales), Kerrie Mengersen (Queensland University of Technology), Joshua Ross (University of Adelaide), Terence Tao (University of California, Los Angeles), Ole Warnaar (University of Queensland), Geordie Williamson (University of Sydney), and David Wood (Monash University).
These programs involved organisers from a variety of Australian universities, including Australian National University, Monash University, Queensland University of Technology, RMIT University, Swinburne University of Technology, Federation University Australia, University of Adelaide, University of New South Wales, University of Melbourne, University of Queensland, and University of Sydney, along with international organisers and participants. Each program lasted 1–4 weeks, and included ample unstructured time to encourage collaborative research. Some of the longer programs had an embedded conference or lecture series. All participants were encouraged to submit articles to the MATRIX Annals.


The articles were grouped into refereed contributions and other contributions. Refereed articles contain original results or reviews on a topic related to the MATRIX program. The other contributions are typically lecture notes or short articles based on talks or activities at MATRIX. A guest editor organised appropriate refereeing and ensured the scientific quality of submitted articles arising from each program. The Editors (Jan de Gier, Cheryl E. Praeger, Terence Tao and myself) finally evaluated and approved the papers. Many thanks to the authors and to the guest editors for their wonderful work. MATRIX is hosting ten programs in 2019, with more to come in 2020; see www.matrix-inst.org.au. Our goal is to facilitate collaboration between researchers in universities and industry, and increase the international impact of Australian research in the mathematical sciences.

David R. Wood
MATRIX Book Series Editor-in-Chief

Non-Equilibrium Systems and Special Functions

8 January – 2 February 2018

Organisers

Vadim Gorin, MIT
Tomohiro Sasamoto, Tokyo Institute of Technology
Ole Warnaar, University of Queensland
Michael Wheeler, University of Melbourne

Participants

Patrik Ferrari (Bonn Uni), Craig Tracy (UC Davis), Vadim Gorin (MIT), Tomohiro Sasamoto (Tokyo Inst Technology), Ole Warnaar (Uni Queensland), Michael Wheeler (Uni Melbourne), Dan Betea (Université Paris Diderot), Jérémie Bouttier (CEA Saclay, ENS de Lyon), Zeying Chen (Uni Melbourne), Evgeny Dimitrov (MIT), Victor Dotsenko (Université Pierre et Marie Curie), Caley Finn (Université de Savoie), Alexandr Garbali (Uni Melbourne), Jan de Gier (Uni Melbourne), Liam Hodgkinson (Uni Queensland), Takashi Imamura (Uni Tokyo), Ivan Kostov (CEA Saclay), Atsuo Kuniba (Uni Tokyo), Vladimir Mangazeev (Australian National Uni), Masato Okado (Osaka City Uni), Vincent Pasquier (CEA Saclay), Leonid Petrov (Uni Virginia), Matthieu Vanicat (Uni Ljubljana), Mirjana Vuletić (Uni Massachusetts), Paul Zinn-Justin (Uni Melbourne), Yi Sun (Columbia Uni), Paul Pearce (Uni Melbourne), Chihiro Matsui (Uni Tokyo), Matteo Mucciconi (Tokyo Inst Technology), Iori Hiki (Tokyo Inst Technology), Xin Zhang (Uni Melbourne), Alessandra Vittorini Orgeas (Uni Melbourne)

This program focused on recent advances in the study of non-equilibrium statistical mechanical systems, in particular stochastic classical particle systems in the KPZ class. These systems have surprisingly deep connections to two-dimensional lattice models with strong boundary effects and also to the theory of special functions, such as the Macdonald polynomials and elliptic functions. The program gathered leading experts in these areas and promoted discussion and collaboration on related themes. Overall, the program drew good attendance from all over the globe. There were healthy numbers of researchers from US and European institutions, and a very strong Japanese contingent.
The program received very positive feedback from attendees. Many new collaborations were formed and the atmosphere of the program was extremely convivial. Overseas participants particularly appreciated the relaxed and stimulating environment. Weeks 1 and 2 were primarily devoted to collaboration/research, with one talk organised per day. These weeks also saw very good attendance. Week 3 of the program featured a conference on “KPZ universality, particle processes, Macdonald processes, and special functions”. This week drew the largest number of attendees during the program and had the highest density of talks.

Vadim Gorin, Tomohiro Sasamoto, Ole Warnaar, Michael Wheeler
Organisers

Algebraic Geometry, Approximation and Optimisation

5 – 16 February 2018

Organisers

Enrico Carlini, Politecnico di Torino
Jochen Garcke, University of Bonn
Wolfgang Hackbusch, Max Planck Institute
Markus Hegland, Australian National University
Vera Roshchina, RMIT University
Nadia Sukhorukova, Swinburne University of Technology
Julien Ugon, Federation University Australia
David Smyth, Australian National University

Participants

Pablo Parrilo (MIT), Levent Tunçel (Waterloo), Terry Rockafellar (Uni Washington), Ludmila Polyakova (Saint-Petersburg State University, Russia), Wolfgang Hackbusch (Max Planck Institute for Mathematics in the Sciences), Enrico Carlini (Politecnico di Torino, Italy), Andrew Eberhard (RMIT Uni), Jochen Garcke (Institut für Numerische Simulation), Lars Grasedyck (RWTH Aachen Uni), Markus Hegland (Australian National Uni), Vera Roshchina (RMIT Uni), James Saunderson (Monash Uni), Nadezda (Nadia) Sukhorukova (Swinburne Uni Technology), Julien Ugon (Federation Uni Australia), David Yost (Federation Uni Australia), Alex Kruger (Federation Uni Australia), Fei Lu (RMIT Uni), Thi Bui Hoa (Federation Uni Australia), Scott Lindstrom (Uni Newcastle), Abhishek Bhardwaj (Australian National Uni), Sebastian Kraemer (RWTH Aachen Uni), Nguyen Duy Cuong (Federation University Australia), Anand Rajendra Deopurkar (Australian National Uni), Zari Dzalilov (Federation Uni Australia), Kyle Broder (Australian National Uni), Shilu Feng (Australian National Uni), Edda Koo (Uni Sydney)

There has been notable success in applying the tools of algebraic geometry to a selection of approximation and optimisation problems. In optimisation, a whole new field of convex algebraic geometry has emerged based on the ideas of semidefinite programming relaxations of polynomial optimisation problems. In numerical analysis and approximation, understanding properties of, and developing algorithms for, working with tensors of low rank (and other structured tensors) naturally leads to the study of secant varieties and optimisation algorithms over such varieties. The 2-week MATRIX workshop ‘Algebraic geometry, approximation, and optimisation’ brought together leading experts working at the mutual interfaces between these three areas, with the aim of strengthening the emerging connections between these fields.

The morning sessions of the program consisted of a number of lecture series discussing instances of fruitful interactions between the topics of the workshop, from varying points of view.

• Enrico Carlini (Politecnico di Torino) discussed secant varieties and general notions of tensor rank;
• Wolfgang Hackbusch (Max Planck Institute) surveyed various formats for structured tensors and numerical methods for tensor approximation;
• Markus Hegland (Australian National Uni) and Anand Deopurkar (Australian National Uni) discussed connections between algebraic geometry and numerical analysis;
• Ludmila Polyakova (Saint-Petersburg State University) discussed various aspects of non-smooth analysis in approximation and optimisation;
• Pablo Parrilo (MIT) discussed convexification of secant varieties for linear inverse problems, and methods to exploit chordal sparsity in polynomial ideals; and
• Levent Tunçel (University of Waterloo) discussed optimisation algorithms for algebraically structured convex optimisation problems, and conditions for the exactness of convex relaxations.

In addition to these, there were presentations from James Saunderson (Monash University) and Sebastian Kraemer (RWTH Aachen University) related to hyperbolic programming and hierarchical tensors, respectively.
Throughout the two weeks of the program, the afternoons were focused on smaller group discussions among the participants. A number of program participants remained in the area for a third week to attend the Variational Analysis Down Under (VADU) workshop at Federation University in Ballarat. This workshop attracted leading senior researchers in variational and non-smooth analysis, including Asen Dontchev (University of Michigan and Mathematical Reviews) and Terry Rockafellar (University of Washington). One outcome of the workshop was the preliminary version, included in this volume, of the research paper “Schur functions for approximation problems” by Nadezda Sukhorukova, Julien Ugon and David Yost. In this paper the authors propose a new approach to least squares approximation problems. The approach has a combinatorial flavour, exploiting the connection between generalized Vandermonde matrices and Schur functions.

Enrico Carlini, Jochen Garcke, James Saunderson
Guest Editors

On the Frontiers of High Dimensional Computation

4 – 15 June 2018

Organisers

Frances Kuo, University of New South Wales
Hans De Sterck, Monash University
Josef Dick, University of New South Wales
Mahadevan Ganesh, Colorado School of Mines
Mike Giles, University of Oxford
Markus Hegland, Australian National University
Dirk Nuyens, KU Leuven, Belgium
Ian Sloan, University of New South Wales
Clayton Webster, Oak Ridge National Lab, USA
Henryk Wozniakowski, University of Warsaw and Columbia University

Participants

Christoph Aistleitner (TU Graz, Austria), Abhishek Bhardwaj (Australian National Uni), Johann Brauchart (TU Graz, Austria), Bruce Brown (UNSW Sydney), Tiangang Cui (Monash Uni), Hans de Sterck (Monash Uni), Jerome Droniou (Monash Uni), Mahadevan Ganesh (Colorado School of Mines), Alexander Gilbert (UNSW Sydney), Mike Giles (Uni Oxford), Ivan Graham (Uni Bath), Michael Griebel (Uni Bonn), Stuart Hawkins (Macquarie Uni), Markus Hegland (Australian National Uni), Stefan Heinrich (Uni Kaiserslautern), Kerstin Hesse (Uni Paderborn), Fred Hickernell (IIT Chicago), Yoshihito Kazashi (UNSW Sydney), Peter Kritzer (Austrian Academy of Sciences), Thomas Kühn (Uni Leipzig), Frances Kuo (UNSW Sydney), Bishnu Lamichhane (Uni Newcastle), Fanzi Meng (Australian National Uni), Hrushikesh Mhaskar (Claremont Graduate Uni), Giovanni Migliorati (Sorbonne Uni), James Nichols (Sorbonne Uni), Dirk Nuyens (KU Leuven, Belgium), Chaitanya Oehmigara (Australian National Uni), Sergei Pereverzyev (Austrian Academy of Sciences), Leszek Plaskota (Uni Warsaw), Stanislav Polishchuk (Monash Uni), Robert Scheichl (Uni Bath), Dongwoo Sheen (Seoul National Uni), Ian Sloan (UNSW Sydney), Yuguang Wang (UNSW Sydney), Grzegorz Wasilkowski (Uni Kentucky), Clayton Webster (Oak Ridge National Laboratory), Wolfgang Wendland (Uni Stuttgart), Robert Womersley (UNSW Sydney), Henryk Wozniakowski (Uni Warsaw and Columbia Uni), Yuan Xu (Uni Oregon), Yuesheng Xu (Sun Yat-sen University, China), Guannan Zhang (Oak Ridge National Laboratory), Houying Zhu (UNSW Sydney), Yuancheng Zhou (Australian National Uni)

High dimensional computation is a new frontier in scientific computing, with applications ranging from particle physics, chemical reactions, groundwater flow, heat transport, and wave propagation, to financial mathematics, risk management, and parameter estimation. Often the difficulties come from uncertainty or randomness in the data, which presents major challenges in the areas of data science and uncertainty quantification. This program provided a forum for interaction between Australian and international experts on the theory and application of high dimensional computation, including quasi-Monte Carlo and sparse grid methods, information based complexity, discrepancy and approximation theory, computational and Bayesian inverse problems, multi-level and multi-index techniques, energy and point configuration on spheres and manifolds, PDEs with random coefficients or boundaries, and more, with the aim of establishing new collaborations. We had a total of 45 participants, including 19 from Australia (ANU, Macquarie, Melbourne, Monash, Newcastle, and UNSW Sydney), and the remainder from Austria, Belgium, China, France, Germany, South Korea, Poland, UK, and USA. The first week of the program was devoted to research collaboration, with intentionally only one research seminar each day.
We had a workshop in the second week, with a total of 34 short research talks, and this allowed time each day for research collaboration both in the mornings and afternoons. During the program there were many groups of collaborators working on a multitude of topics, including:

• Kernel construction for machine learning; graph convolutional networks; graph-based methods for manifold learning; high dimensional data approximation.
• Probing the cosmic microwave background radiation maps; complex spherical designs; complex orthogonal polynomials on the unit disc; localized polynomial kernels; regularization parameter choice strategies for the reconstruction from noisy data on spheres; radial basis function approximation on the sphere from noisy scattered data.
• Tractability analysis for linear multivariate problems in terms of singular values; absolute value information; complexity of stochastic integration; approximation in spaces of smooth functions; sharp preasymptotics for approximation in periodic isotropic Sobolev spaces; approximation of Sobolev functions defined on high-dimensional Euclidean balls; tractability of approximation of weighted Sobolev embeddings.
• Lattice rules for function approximation and reconstruction; lattice rules without random shifting; hyperuniformity; low-discrepancy sampling for non-uniform measures; efficient integration over unbounded domains; reduced component-by-component constructions for product and order dependent weights; quasi-Monte Carlo community software.
• Application of quasi-Monte Carlo methods to stochastic wave propagation, elliptic eigenvalue problems with stochastic coefficients, Bayesian computation; uncertainty quantification for neutron transport.
• Multi-level methods for dimension reduction and infinite dimensional Markov chain Monte Carlo; efficient implementation of multivariate decomposition methods.
• Bayesian methods for wave propagation inverse problems; fully discrete spectral algorithm for dielectric media; 3D T-matrix software; analysis of wavefunction expansions in the near field; tensor train approximation for stochastic wave propagation.
• Computation of high dimensional oscillatory integrals in the high frequency case; fast solutions of boundary integral equations.
• Lévy-Ciesielski approximation of Brownian motion.

The nine refereed articles arising from this program cover a range of these topics. Participants reported a high level of satisfaction: their words were “excellent opportunity”, “interesting”, “inspiring”, “stimulating”, “productive”, “new research results and directions”, “new contacts and connections”, as well as “fantastic facilities and catering”. After this MATRIX program, most participants joined the Conference in Honour of Ian Sloan on the Occasion of His 80th Birthday from June 17 to 19 at UNSW Sydney. I would like to thank everyone who participated in this MATRIX program and the subsequent celebration in Sydney. I am also grateful to all authors and referees who contributed to this volume.
On behalf of all participants, I would like to thank MATRIX for hosting our program and giving us the wonderful opportunity to interact and collaborate in such a stimulating environment.

Frances Kuo
Guest Editor


Month of Mathematical Biology

27 June – 20 July 2018

Organisers

Ruth Baker, University of Oxford
Kevin Burrage, Queensland University of Technology
Helen Byrne, University of Oxford
Edmund Crampin, University of Melbourne
Mark Flegg, Monash University
Alexander Fletcher, Sheffield University
Edward Green, University of Adelaide
Samuel Isaacson, Boston University
James Osborne, University of Melbourne
Hans Othmer, University of Minnesota

Participants

Axel Almet (Uni Oxford), Satya Arjunan (RIKEN), Steve Andrews (Fred Hutchinson Cancer Research Center), Christopher Angstmann (UNSW), Sandy Anderson (Moffit Cancer Centre), Bartosz Bartmanski (Uni Oxford), Josh Bull (Uni Oxford), Phillip Brown (Uni Adelaide), Ana Victoria Ponce Bobadilla (Uni Heidelberg), Casper Beentjes (Uni Oxford), Jess Crawshaw (Uni Melbourne), James Cavallo (Monash Uni), Edmund Crampin (Uni Melbourne), Radek Erban (Uni Oxford), Mark Flegg (Monash Uni), Edward Green (Uni Adelaide), James Glazier (Indiana Uni), Bruce Gardiner (Murdoch Uni), Guillermo Gomez (Uni South Australia), Ramon Grima (Uni Edinburgh), Daniel Hahne (Leipzig Uni), Samuel Isaacson (Boston Uni), Stuart Johnston (Uni Melbourne), Melissa Knothe-Tate (UNSW), Yangjin Kim (Konkuk Uni), Ashfaq Khan (RMIT), André Leier (UAB), Karen Lipkow (Cambridge Uni), Brodie Lawson (QUT), Kynan Lawlor (MCRI), Sharon Lubkin (North Carolina State Uni), Shev Macnamara (UTS), Paul Macklin (Indiana Uni), Claire Miller (Uni Melbourne), Tatiana Marquez-Lago (UAB), Zoltan Neufeld (Uni Queensland), Don Newgreen (MCRI), Hans Othmer (Uni Minnesota), James Osborne (Uni Melbourne), Margriet Palm (Leiden Uni), Catherine Penington (Macquarie), Erika Tsingos (Heidelberg Uni), Chin Wee Tan (WEHI), Daniel Wilson (Uni Oxford), Ruth Williams (UCSD), Michael Watson (Uni Sydney).

This program consisted of two research workshops, “Virtual tissues: Progress and challenges in multicellular systems biology” (1–7 July) and “Spatio-temporal stochastic systems in biology” (15–20 July). Between these workshops, many participants attended the Society of Mathematical Biology meeting held in Sydney.

Virtual tissues: Progress and challenges in multicellular systems biology

This workshop brought together world-leading mathematical modellers, systems biologists and experimentalists to discuss advances in all aspects of cell and tissue modelling and simulation and to define (and begin working on) a set of grand challenges in the pathway to in silico drug discovery and improved therapies using multicellular models. The goals of the workshop were to:

• produce a set of collaborative projects focusing on grand challenges in colonic crypt, kidney, and enteric nervous system development and disease;
• develop new partnerships and strengthen existing interdisciplinary collaborations;
• advance functionality of existing multicellular modelling tools;
• improve recognition by the biological community of multicellular modelling as a research tool.

The workshop included around 20 talks (each 15–45 minutes) ranging from experimental talks through to technical talks on model developments and multicellular modelling standards. The remaining time was broken into large group discussions focussing on the workshop goals, led by the organisers and senior participants, and smaller group discussions using the facilities in MATRIX house.
During the program several group collaborations emerged. One looked at defining standards for multicellular modelling, and another looked at a repository for multicellular modelling benchmarks. Both of these projects have continued after the workshop. The goals of these groups are (i) to write a position paper on the necessity for a standard for multicellular model specification and (ii) to develop a set of benchmark problems for multicellular models and present these as an online resource for the community. The workshop was discussed many times at the Society of Mathematical Biology meeting which was held in Sydney the following week. The feedback was overwhelmingly positive and it was pleasing when praise came through third parties.

Spatio-temporal stochastic systems in biology

The first day of this workshop was dedicated to the training of junior participants. Flegg and Isaacson ran group discussions, and there were software presentations by participants Andrews (Smoldyn), Arjunan (Spatiocyte) and Coulier (URDME) showing implementations of various algorithms and mathematical frameworks. On subsequent days, the participants worked in small groups on five research focus problems, each led by a senior participant: nonlinear effects in spatiotemporal stochastic systems in biology (André Leier and Tatiana Marquez-Lago), classical molecular dynamics (Radek Erban), adaptation (Ruth Williams), reaction and diffusion processes (Hans Othmer), and particle-based simulation methods (Steve Andrews). There were also four talks given by Karen Lipkow, Ramon Grima, Brodie Lawson and Shev Macnamara covering a wide spectrum of topics ranging from fundamental mathematical paradigms and curiosities to complex biological systems and applications. While the focus of the workshop was research, these talks broke up the working sessions and provided some diversity to the daily schedule. A recurring thread in all the focus problems was the treatment of multiple scales, which is known to be a major challenge in mathematical biology. Often multiple scales (temporal, spatial and density) lie at the heart of a mathematical analysis of stochastic systems in biology. Results of the work were presented at the end of the workshop by the junior participants. All of the focus problems were successful and most will lead to publications.

Mark Flegg and James Osborne
Guest Editors

Dynamics, Foliations, and Geometry In Dimension 3

3–14 September 2018

Organisers
Jonathan Bowden, Monash University
Steven Frankel, Yale University, USA
Andy Hammerlindl, Monash University
Rafael Potrie, Universidad de la República, Uruguay

Participants
Jonathan Bowden (Monash Uni), Andy Hammerlindl (Monash Uni), Steven Frankel (Washington Uni, St Louis), Rafael Potrie (Uni de la República, Uruguay), Christian Bonatti (Uni de Bourgogne), Keith Burns (Northwestern), Danny Calegari (Uni Chicago), Vincent Colin (Uni Nantes), Sergio Fenley (Florida State Uni), Thomas Vogel (LMU Munich), Katie Mann (Brown Uni), Jessica Purcell (Monash Uni), Jana Rodriguez Hertz (SUSTech), Raul Ures (SUSTech), Helene Eynard-Bontemps (Uni Jussieu), Pierre Dehornoy (Uni Grenoble), Mario Shannon (Uni de Bourgogne), Yi Shi (Peking Uni), Andre de Carvalho (Uni São Paulo), Santiago Martinchich (Uni de la República), Verónica De Martino (Uni de la República), Josh Howie (Monash Uni), Dan Mathews (Monash Uni), Agnieszka Zelerowicz (Pennsylvania State Uni), Layne Hall (Monash Uni)

This 2-week workshop brought together a small but focussed team of researchers from Australia, Europe, Asia, and North and South America to study the interplay of dynamical systems, foliations, and contact structures with the geometry and topology of three-dimensional manifolds. One key focus of the event was on the construction and classification of partially hyperbolic diffeomorphisms and Anosov flows in dimension three. Associated to these dynamical systems is a pair of taut foliations in the manifold, and so the study of these foliations and the closely related concept of tight contact structures also took prominence during the workshop. The workshop schedule was of a relaxed nature, with most days having one or two hours of scheduled lectures and the remaining time left for informal discussion. There were four mini-courses during the event.
Danny Calegari gave a lecture series describing taut foliations and their properties. Andy Hammerlindl and Rafael Potrie presented a mini-course explaining the properties of the branching foliation theory developed by Brin, Burago, and Ivanov and its application to the classification of partially hyperbolic dynamics. Steven Frankel’s mini-course explored the relation between pseudo-Anosov and quasi-geodesic flows on hyperbolic 3-manifolds. Lastly, the mini-course of Thomas Vogel and Vincent Colin detailed the properties of tight contact structures, their approximation of taut foliations, and their relation to dynamics on the manifold.

The informal discussions pursued a number of avenues of research and we highlight a few of these here. Any Anosov flow in dimension 3 yields a pair of transverse taut foliations on the manifold. However, it is an open question whether any 3-manifold with such a pair of foliations must necessarily support an Anosov flow. This question was explored during the workshop and the article by Bonatti, Bowden, and Potrie in this volume discusses this work. Recent years have seen huge progress in the understanding of partial hyperbolicity in dimension 3, due both to classification results in certain families of manifolds and the construction of new examples with novel properties. A research announcement of Barthelmé, Fenley, Frankel, and Potrie concerns the classification of partially hyperbolic systems homotopic to the identity map. Finally, the classification of partially hyperbolic diffeomorphisms and Anosov flows is closely related to the long-standing open problem of the classification of Anosov diffeomorphisms. An article by Andy Hammerlindl looks at the classification problem for Anosov diffeomorphisms with global product structure.

This short but productive workshop passed far too quickly, and by all accounts the visitors thoroughly enjoyed their time in Creswick. The most-asked question at the end of the workshop was when we would host such an event again.

Jonathan Bowden and Andy Hammerlindl
Guest Editors


Recent Trends on Nonlinear PDEs of Elliptic And Parabolic Type 5 - 16 November 2018

Organisers
Yihong Du, University of New England
Daniel Hauer, University of Sydney
Angela Pistoia, Sapienza Università di Roma

Participants
Yihong Du (Uni New England), Daniel Hauer (Uni Sydney), Angela Pistoia (Sapienza Università di Roma), Susanna Terracini (Uni Turin), Philippe Souplet (Uni Paris 13), Changfeng Gui (Uni Texas at San Antonio), Robin Neumayer (Northwestern Uni), Jerome Coville (INRA), Isabella Ianni (Uni Campania), Massimo Grossi (Uni Roma 1), Benedetta Pellacci (Uni Campania), Serena Dipierro (Uni Milan), Yannick Sire (Johns Hopkins Uni), Enrico Valdinoci (Uni Milan), Florica Cirstea (Uni Sydney), Bernhard Ruf (Uni Milan), Michael Winkler (Uni Paderborn), Francesca Gladiali (Uni Sassari), Gianmaria Verzini (Politecnico di Milano), Hiroshi Matano (Meiji Uni), Ben Andrews (Australian National Uni), Xu-Jia Wang (Australian National Uni), Ki-Ahm Lee (Seoul National Uni), Masaharu Taniguchi (Okayama Uni), Daniel Daners (Uni Sydney), Julie Clutterbuck (Monash Uni), Wolfgang Arendt (Uni Ulm), Paul Bryan (Macquarie Uni), Guofang Wei (Uni California at Santa Barbara), Barbara Brandolini (Uni Naples Federico II), Glen Wheeler (Uni Wollongong), James McCoy (Uni Newcastle), Timothy Collier (Uni Sydney), Yuhan Wu (Uni Wollongong), Elisa Affili (Uni Milan), Pietro Miraglio (Uni Milan)

This meeting aimed to gain a deeper understanding of several key themes in nonlinear elliptic and parabolic PDEs, where spectacular progress has been made in recent years. These include spatial segregation in nonstandard diffusion, propagation in heterogeneous media, gradient blow-up of diffusive Hamilton-Jacobi equations, and some selected topics on geometric flows. We brought to the fore the key challenges for future research in these areas. The program intertwined talks from the communities working on each of the four main themes and highlighted the most salient ideas, proofs and questions, which are important and fertile for pushing forward the research in these areas in Australia and worldwide.

Week 1 featured the following two mini-courses:
• Susanna Terracini: “Spatial segregation with non standard diffusions”
• Philippe Souplet: “An introduction to nonlinear Liouville theorems for reaction-diffusion equations and systems and their applications”

Week 2 featured the following two mini-courses:
• Hiroshi Matano: “Front propagation in the presence of obstacles”
• Ben Andrews: “Multi-point maximum principles with applications to sharp gradient estimates and travelling waves”

Daniel Hauer
Guest Editor

Functional Data Analysis and Beyond 3 – 14 December 2018

Organisers Aurore Delaigle University of Melbourne Frederic Ferraty University of Toulouse Debashis Paul University of California at Davis

Participants
Alexander Aue (Uni California at Davis), Michelle Carey (University College Dublin), Ming-Yen Cheng (Hong Kong Baptist Uni), Sophie Dabo-Niang (Uni de Lille), Aurore Delaigle (Uni Melbourne), Marie-Hélène Descary (Uni de Montreal), Idris Eckley (Lancaster Uni), Frédéric Ferraty (Toulouse Jean Jaures Uni), Gery Geenens (UNSW), Rob Hyndman (Monash Uni), Ci-Ren Jiang (Academia Sinica), Pavel Krupskiy (UoM), Dominik Liebl (Bonn Uni), Eardi Lila (Cambridge Uni), Steve Marron (Uni North Carolina), James Ramsay (McGill Uni), Matthew Reimherr (Penn State Uni), Damla Senturk (Uni California at Los Angeles), Karim Seghouane (Uni Melbourne), Hanlin Shang (Australian National Uni), Jian Qing Shi (Newcastle Uni), Katharine Turner (Australian National Uni), Susan Wei (Uni Melbourne), Fang Yao (Uni Toronto), Jiajun Tang (Uni Melbourne), Zhuosong Zhang (Uni Melbourne), Wei Huang (Uni Melbourne), Ruoxu Tan (Uni Melbourne), Alessandra Vittorini Orgeas (Uni Melbourne), Andriy Olenko (La Trobe Uni), Jeng-Min Chiou (Academia Sinica), Heng Lian (City Uni Hong Kong), Jin-Tin Zhang (National Uni Singapore)

The program focused on recent developments in functional data analysis. This area of statistics has been widely used to answer science and policy questions, where the data are typically observed over time, space and other continuous variables. During the program there were several groups of collaborators. A real highlight of the program was when the young participants were able to present their work to senior researchers. As a result, the young researchers attending the workshop interacted significantly with the two most senior and prominent participants, James Ramsay and Steve Marron. The warm atmosphere of the program also encouraged significant interactions between young female participants and more senior female participants. Several groups of them were invited to each other's institutions to continue the discussion and collaboration initiated at the workshop. Another highlight is that several participants are organising another similar workshop in Senegal in 2020. The whole two weeks were extremely fruitful.

Aurore Delaigle, Frederic Ferraty, Debashis Paul
Organisers

Geometric and Categorical Representation Theory

10 – 21 December 2018

Organisers
Clifton Cunningham, University of Calgary
Masoud Kamgarpour, University of Queensland
Anthony Licata, Australian National University
Peter McNamara, University of Melbourne
Sarah Scherotzke, Bonn University
Oded Yacobi, University of Sydney

Participants
Peter McNamara (Uni Melbourne), Masoud Kamgarpour (Uni Queensland), Oded Yacobi (Uni Sydney), Tony Licata (Australian National Uni), Clifton Cunningham (Uni Calgary), Luca Migliorini (Università di Bologna), Geordie Williamson (Uni Sydney), Valentin Buciumas (Uni Queensland), Arun Ram (Uni Melbourne), Iva Halacheva (Uni Melbourne), Anna Romanov (Uni Sydney), Asilata Bapat (Australian National Uni), Yohan Brunebarbe (Institut de Mathématiques de Bordeaux), Joel Gibson (Uni Sydney), Giulian Wiggins (Uni Sydney), Josh Ciappara (Uni Oxford), Javier Fresan (Centre de Mathématiques Laurent Schwartz, École Polytechnique), Arik Wilbert (Uni Melbourne), Yaping Yang (Uni Melbourne), Gufang Zhao (Uni Melbourne), Gaston Burrull (Uni Sydney), Joseph Baine (Uni Sydney), Minhua Liu (Uni Sydney), Xun Xie (Uni Sydney), Geoff Vooys (Uni Calgary)

Representation theory is a central branch of mathematics, with connections to algebraic geometry, symplectic topology and number theory. The fundamental aims of the newer branches of geometric and categorical representation theory are to uncover deeper geometric and categorical structures underpinning familiar mathematical objects. Once such structures are discovered, they can then be applied to resolve classical open problems. Recent years have seen significant advances made in this field, partly fuelled by breakthroughs made by Australian researchers. This program brought together leading international researchers and early career researchers for an intense and productive two-week program.

Peter McNamara
Guest Editor


Contents

Preface

Part I Refereed Articles

1 On the Frontiers of High Dimensional Computation

J.C.H. Blake, I. Graham, F. Scheben, A. Spence “The radiative transport equation with heterogeneous cross-sections”
Mahadevan Ganesh, Stuart Hawkins “A reduced-order-model Bayesian obstacle detection algorithm”
Alexander Gilbert, Ivan Graham, Robert Scheichl, Ian Sloan “Bounding the spectral gap for an elliptic eigenvalue problem with uniformly bounded stochastic coefficients”
Stefan Heinrich “On the power of restricted Monte Carlo algorithms”
Fred Hickernell, Peter Kritzer, Henryk Wozniakowski “Exponential tractability of linear tensor product problems”
Yoshihito Kazashi, Ian Sloan “Worst-case error for unshifted lattice rules without randomization”
Thomas Kuhn “New preasymptotic estimates for approximation of Sobolev functions”
Hrushikesh Mhaskar, Sergei Pereverzyev, Vasyl Semenov, Evgeniya Semenova “Data based construction of kernels for classification”
Dongwoo Sheen “P1-nonconforming polyhedral finite elements in high dimensions”

2 Month of Mathematical Biology

Ana Victoria Ponce Bobadilla, Bartosz Jan Bartmanski, Ramon Grima, Hans G. Othmer “The status of the QSSA approximation in stochastic simulations of reaction networks”
Stuart T. Johnston, Christopher N. Angstmann, Satya N.V. Arjunan, Casper H.L. Beentjes, Adrien Coulier, Samuel A. Isaacson, Ash A. Khan, Karen Lipkow, Steven S. Andrews “Accurate particle-based reaction algorithms for fixed timestep simulators”

3 Recent Trends on Nonlinear PDEs of Elliptic And Parabolic Type

Elisa Affili, Serena Dipierro, Enrico Valdinoci “Decay estimates in time for classical and anomalous diffusion”
Ben Andrews “Multi-point maximum principles and eigenvalue estimates”
Jérôme Coville “A note on Liouville type results for a fractional obstacle problem”
Serena Dipierro, Pietro Miraglio, Enrico Valdinoci “Symmetry results for the solutions of a partial differential equation arising in water waves”
Francois Hamel, Nikolai Nadirashvili, Yannick Sire “Geometric properties of superlevel sets of semilinear elliptic equations in convex domains”
Michinori Ishiwata, Bernhard Ruf, Federica Sani, Elide Terraneo “A potential well argument for a semilinear parabolic equation with exponential nonlinearity”
Dario Mazzoleni, Benedetta Pellacci, Gianmaria Verzini “Quantitative analysis of a singularly perturbed shape optimization problem in a polygon”
James McCoy, Glen Wheeler “A rigidity theorem for ideal surfaces with flat boundary”
Robin Neumayer “On minimizers and critical points for anisotropic isoperimetric problems”
Philippe Souplet “Liouville-type theorems for nonlinear elliptic and parabolic problems”

Part II Other Contributed Articles

4 Algebraic Geometry, Approximation and Optimisation

Nadezda Sukhorukova, Julien Ugon, David Yost “Schur functions for approximation problems”

5 Dynamics, Foliations, and Geometry In Dimension 3

Thomas Barthelmé, Sergio R. Fenley, Steven Frankel, Rafael Potrie “Research announcement: Partially hyperbolic diffeomorphisms homotopic to the identity on 3-manifolds”
Christian Bonatti, Jonathan Bowden, Rafael Potrie “Some remarks on projective Anosov flows in hyperbolic 3-manifolds”
Andy Hammerlindl “Notes on global product structure”

6 Geometric and Categorical Representation Theory

Geoff Vooys “The Greenberg functor is site cocontinuous”

Part I

Refereed Articles

Chapter 1

On the Frontiers of High Dimensional Computation

The radiative transport equation with heterogeneous cross-sections J.C.H. Blake, I.G. Graham, F. Scheben, A. Spence

Abstract We consider the classical integral equation reformulation of the radiative transport equation (RTE) in a heterogeneous medium, assuming isotropic scattering. We prove an estimate for the norm of the integral operator in this formulation which is explicit in the (variable) coefficients of the problem (also known as the cross-sections). This result uses only elementary properties of the transport operator and some classical functional analysis. As a corollary, we obtain a bound on the convergence rate of source iteration (a classical stationary iterative method for solving the RTE). We also obtain an estimate for the solution of the RTE which is explicit in its dependence on the cross-sections. The latter can be used to estimate the solution in certain Bochner norms when the cross-sections are random fields. Finally we use our results to give an elementary proof that the generalised eigenvalue problem arising in nuclear reactor safety has only real and positive eigenvalues.

J.C.H. Blake, IBM, Hursley Park, Winchester SO21 2JN, UK
I.G. Graham, Mathematical Sciences, University of Bath, BA2 7AY, UK
F. Scheben, Am Vogelbusch 25, 24568 Kattendorf, Germany
A. Spence, Mathematical Sciences, University of Bath, BA2 7AY, UK

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_1

1 Introduction

In this note we present some elementary estimates for the steady-state monoenergetic Radiative (Boltzmann or Neutron) Transport Equation (RTE) with heterogeneous cross-sections. Although these are relatively straightforward to prove, and there is a huge literature on this topic, we were unable to find proofs of these precise results in the literature and so it seems useful to record them here. We emphasise that there are many excellent classical references for the general field discussed here - for example [2, 8, 4].

The estimates given here provide the tools needed to estimate the solution of the RTE explicitly in terms of the data (the so-called ‘cross-sections’). They also allow an explicit estimate for the rate of convergence of source iteration in terms of the scattering ratio. These estimates have recently proved essential for the rigorous analysis of uncertainty quantification techniques for the RTE [5, 6, 9]. Most of the estimates presented here appeared in the University of Bath PhD theses of Fynn Scheben and Jack Blake [13, 3, 14]. An application to problems with random data appears in the more recent University of Bath thesis of Matthew Parkinson [9] - see also [5, 6].

For $r \in V \subset \mathbb{R}^3$, where $V$ is a bounded convex spatial domain and $\Omega \in S^2$, the unit sphere in 3D, and assuming isotropic scattering, the RTE source problem takes the form

$$\Omega \cdot \nabla \psi(r, \Omega) + \sigma(r)\,\psi(r, \Omega) = \frac{\sigma_S(r)}{4\pi} \int_{S^2} \psi(r, \Omega')\, d\Omega' + Q(r), \tag{1}$$

where ∇ denotes the gradient with respect to r, Q is the source and σ , σS are, respectively, the total and scattering cross sections that satisfy

$$\sigma(r) = \sigma_S(r) + \sigma_A(r), \qquad \forall r \in V, \tag{2}$$

where $\sigma_A$ is the absorption cross section. All cross-sections are assumed to be pointwise bounded above and below on $V$ by strictly positive constants, i.e., for all $r \in V$,

$$0 < (\sigma_S)_{\min} \le \sigma_S(r) \le (\sigma_S)_{\max}, \qquad 0 < (\sigma_A)_{\min} \le \sigma_A(r) \le (\sigma_A)_{\max},$$

and thus $0 < (\sigma)_{\min} \le \sigma(r) \le (\sigma)_{\max}$, which ensures that $\|\sigma_S/\sigma\|_{L^\infty(V)} < 1$. In (1), the angular flux $\psi$ is to be found, subject to given boundary conditions. Here we only consider the vacuum boundary condition on the inflow boundary:

$$\psi(r, \Omega) = 0 \quad \text{when } n(r) \cdot \Omega < 0, \quad r \in \partial V, \tag{3}$$

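where $n$ denotes the outward normal from $V$. Introducing the transport and averaging operators

$$\mathcal{T}\psi(r, \Omega) = \Omega \cdot \nabla \psi(r, \Omega) + \sigma(r)\,\psi(r, \Omega) \quad \text{and} \quad \mathcal{P}(\cdot) = \frac{1}{4\pi} \int_{S^2} (\cdot)\, d\Omega, \tag{4}$$

(1) can be rewritten as

$$\mathcal{T}\psi(r, \Omega) = \sigma_S(r)\,\phi(r) + Q(r), \tag{5}$$

where $\phi = \mathcal{P}\psi$ is called the scalar flux. This is to be solved, subject to (3).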
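The strict bound $\|\sigma_S/\sigma\|_{L^\infty(V)} < 1$ used throughout can be checked directly from (2) and the assumed pointwise bounds on the cross-sections. The short derivation below is a sketch added for the reader; the explicit constant on the right is not stated in the original text.

```latex
\[
\frac{\sigma_S(r)}{\sigma(r)}
  = \frac{\sigma(r) - \sigma_A(r)}{\sigma(r)}
  = 1 - \frac{\sigma_A(r)}{\sigma(r)}
  \le 1 - \frac{(\sigma_A)_{\min}}{(\sigma)_{\max}}
  < 1
  \qquad \text{for all } r \in V,
\]
\[
\text{so that} \qquad
\left\| \frac{\sigma_S}{\sigma} \right\|_{L^\infty(V)}
  \le 1 - \frac{(\sigma_A)_{\min}}{(\sigma)_{\max}} < 1 .
\]
```

The bound degenerates as $(\sigma_A)_{\min} \to 0$, which is the regime where the scattering ratio approaches 1.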


Throughout, $L^2(V)$ will denote the space of square integrable functions on $V$, with inner product denoted $\langle \cdot, \cdot \rangle$. For any uniformly positive and bounded weight function $w$ on $V$, we will use $L^2(V, w)$ to denote the space of functions $v$ for which $\|v\|^2_{L^2(V,w)} := \int_V |v(r)|^2 w(r)\, dr < \infty$. When $w \equiv 1$, this reduces to the standard $L^2$ norm $\|v\|^2_{L^2(V)} = \langle v, v \rangle$.

Before studying (1), (3) it is useful to first consider the ‘pure transport problem’ $\mathcal{T}\psi = g$, subject to boundary condition (3) and with source $g \in L^2(V)$. If the scalar flux $\phi$ were known, then (5) would allow computation of $\psi(r, \Omega)$ for all $r$ and any fixed $\Omega$ by solving a single transport equation with vacuum boundary condition. Such ‘transport sweeps’ are ‘easy’ operations both in theory (the solution of the transport problem can be written down using characteristics - see Lemma 1), and in numerical practice (e.g., when discontinuous Galerkin methods are used to discretise the transport operator, the resulting linear system can usually be solved by a single sweep through the elements). This motivates the use of ‘source iteration’ for solving (5), which computes a sequence of approximations to $\phi$, starting with an initial guess $\phi^0$ and iterating by solving

$$\mathcal{T}\psi^{i+1}(r, \Omega) = \sigma_S\,\phi^{i}(r) + Q(r), \quad \text{subject to (3), and} \quad \phi^{i+1} := \mathcal{P}\psi^{i+1}. \tag{6}$$

The fact that this iteration is always well-defined and is equivalent to a fixed point iteration for a certain self-adjoint weakly singular integral operator is established in the following lemma.

Lemma 1. Let $g \in L^2(V)$ and consider the pure transport problem: Solve

$$\mathcal{T}\psi(r, \Omega) = g(r), \qquad r \in V, \ \Omega \in S^2,$$

for $\psi(r, \Omega)$, subject to boundary condition (3). This problem has a unique solution given by

$$\psi(r, \Omega) = \int_0^{d(r, \Omega)} \exp(-\tau(r, r - s\Omega))\, g(r - s\Omega)\, ds, \tag{7}$$

where $\tau(r, r') = \int_{l(r, r')} \sigma(z)\, dl(z)$ (the integral of $\sigma$ along the line $l(r, r')$ joining $r$ and $r'$) and $d(r, \Omega) = \inf\{s > 0 : r - s\Omega \notin V\}$.

Moreover, the corresponding scalar flux $\phi := \mathcal{P}\psi$ can be expressed as

$$\phi(r) := (\mathcal{K} g)(r), \tag{8}$$

where $\mathcal{K}$ is the integral operator defined by

$$(\mathcal{K} v)(r) := \int_V k(r, r')\, v(r')\, dr', \quad \text{with kernel} \quad k(r, r') := \frac{\exp(-\tau(r, r'))}{4\pi \|r - r'\|_2^2}. \tag{9}$$


Proof. It is easy to show (using the method of characteristics) that the formula (7) provides the unique solution to the pure transport problem. This is well-known in the neutron transport literature, e.g. [2, 11] and $\tau$ is called the optical path length. Once (7) is established, the formula (8) is obtained by applying $\mathcal{P}$ to each side of (7) and rewriting the result using spherical polar coordinates.

Remark 1. An interesting observation is that, although the original transport problem is far from being self-adjoint, the integral operator $\mathcal{K}$ is self-adjoint with a positive kernel (and in fact is a positive definite operator, as we see below). Since $\mathcal{K}$ has a weakly singular kernel and the domain of integration is a bounded Euclidean domain, it is to be expected that the solution $\phi$ will have (weak) boundary singularities. This property has been analysed in the classical literature - see, e.g., [10] and the references therein.

Returning to source iteration (6), we have the following simple corollary:

Corollary 1. The iterates $\phi^i$ and the errors $e^i := \phi - \phi^i$ satisfy the equations

$$\phi^{i+1} = \mathcal{K}\sigma_S \phi^{i} + \mathcal{K}Q \quad \text{and} \quad e^{i+1} = \mathcal{K}\sigma_S e^{i}. \tag{10}$$
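The two identities in (10) follow from Lemma 1 in a couple of lines. The derivation below is a sketch added for the reader; it assumes that $Q$ and each iterate $\phi^i$ lie in $L^2(V)$, so that Lemma 1 applies with the right-hand sides shown.

```latex
\begin{align*}
\phi^{i+1} &= \mathcal{P}\psi^{i+1}
            = \mathcal{K}\bigl(\sigma_S \phi^{i} + Q\bigr)
  && \text{(Lemma 1 applied to (6), with } g = \sigma_S \phi^{i} + Q\text{)},\\
\phi &= \mathcal{P}\psi
      = \mathcal{K}\bigl(\sigma_S \phi + Q\bigr)
  && \text{(Lemma 1 applied to (5), with } g = \sigma_S \phi + Q\text{)},\\
e^{i+1} &= \phi - \phi^{i+1}
         = \mathcal{K}\sigma_S\bigl(\phi - \phi^{i}\bigr)
         = \mathcal{K}\sigma_S\, e^{i}
  && \text{(subtracting, by linearity of } \mathcal{K}\text{)},\\
e^{i} &= (\mathcal{K}\sigma_S)^{i}\, e^{0}
  && \text{(by induction on } i\text{)}.
\end{align*}
```

The last line makes explicit why a norm bound on $\mathcal{K}\sigma_S$ below 1 yields geometric decay of the error.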

Hence iteration (6) will converge if and only if there is a norm in which the operator $\mathcal{K}\sigma_S$ is a contraction. (Here we emphasise that $\mathcal{K}\sigma_S$ denotes the composition of the operator of multiplication by $\sigma_S$ with the integral operator $\mathcal{K}$.) In Theorem 1 we prove that this is the case in a certain weighted $L^2$ norm induced by the total cross-section $\sigma$. Then Corollary 2 provides the result on the convergence of source iteration. In fact Theorem 1 has several other ramifications. Combining Lemma 1 with (5) we obtain that the scalar flux $\phi$ satisfies the second kind weakly singular integral equation

$$\phi - \mathcal{K}\sigma_S \phi = \mathcal{K}Q, \tag{11}$$

and Theorem 1 (and the Banach lemma) then readily tells us that this equation has a unique solution and provides a bound on its norm explicit in the cross-sections (Corollary 3).

One reason for providing these results in this paper is that their proofs are hard to locate in the literature. Another reason is that they have direct relevance to the modern theory of uncertainty quantification for the transport equation. When the cross-sections $\sigma_S$ and $\sigma$ are random fields, then both the error estimates for numerical methods for computing $\phi$ and also the rates of convergence of iterative methods for computing realisations of the scalar flux $\phi$ depend explicitly on the cross-sections through the theorems presented here. This dependence is used explicitly in recent work on UQ for transport problems [5, 9, 6].

It is known that source iteration converges when solving the neutron transport equation with constant cross sections - see, e.g., [13, Chapter 4]. Ashby et al. [1, Section 4] prove a similar result with spatially dependent cross sections for a special discrete case. This work motivated us to consider a general proof in the heterogeneous case. The results here are for the underlying operator before discretization. Extension to general discretizations is a complicated question.

We will present the theory for the full 3D case where $r \in V \subset \mathbb{R}^3$ and $\Omega \in S^2$, the unit sphere in 3D, but the result also applies to the 2D reduction where $r \in V \subset \mathbb{R}^2$ and $\Omega \in S^1$ and to the case when space and angle are one-dimensional (the so-called slab geometry case). Details of the proof in this case are given in [3].

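2 The main result

Our main goal in this section will be to prove the following theorem.

Theorem 1. For any function $\sigma^*$ satisfying $0 < \sigma^*_{\min} \le \sigma^*(r) \le \sigma^*_{\max}$ for all $r \in V$,

$$\|\mathcal{K}\sigma^*\|_{L^2(V,\sigma)} \le \left\| \frac{\sigma^*}{\sigma} \right\|_{L^\infty(V)}. \tag{12}$$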

(The left hand side of the inequality in (12) denotes the operator norm of $\mathcal{K}\sigma^*$ on the space $L^2(V)$, equipped with the weighted norm $\|\cdot\|_{L^2(V,\sigma)}$.) The proof depends on several lemmas. In these it is useful to introduce the operator $\mathcal{L} := \sigma^{1/2}\mathcal{K}\sigma^{1/2}$.

Lemma 2. The operators $\mathcal{K}$ and $\mathcal{L}$ are compact, self-adjoint and positive definite on $L^2(V)$.

Proof. First, $\mathcal{K}$ is compact on $L^2(V)$ because it is a weakly singular operator of potential type, see [7, p.332]. To see the positive definiteness, let $g$ be an arbitrary function in $L^2(V)$ and let $\psi$ be the solution of $\mathcal{T}\psi = g$, subject to vacuum boundary conditions (3). Then

$$\psi(r, \Omega)\, g(r) = \psi(r, \Omega)\, \Omega \cdot \nabla \psi(r, \Omega) + \sigma \psi^2(r, \Omega) = \frac{1}{2} \nabla \cdot (\Omega\, \psi^2(r, \Omega)) + \sigma(r)\, \psi^2(r, \Omega).$$

Hence, integrating over $V$ and using the divergence theorem, we obtain

$$\int_V \psi(r, \Omega)\, g(r)\, dr = \frac{1}{2} \int_{\partial V} \psi^2(r, \Omega)\, \Omega \cdot n(r)\, ds + \int_V \sigma(r)\, \psi^2(r, \Omega)\, dr \ \ge\ \int_V \sigma(r)\, \psi^2(r, \Omega)\, dr, \tag{13}$$


where we used the vacuum boundary condition to get the final inequality in (13). Now introducing $\phi = \mathcal{P}\psi$, and recalling (8), we also have $\mathcal{P}\psi = \phi = \mathcal{K}g$. Applying $\mathcal{P}$ to each side of (13), we then obtain

$$\langle g, \mathcal{K} g \rangle = \int_V \phi(r)\, g(r)\, dr \ \ge\ \frac{1}{4\pi} \int_V \sigma(r) \int_{S^2} \psi^2(r, \Omega)\, d\Omega\, dr.$$

This proves the positive definiteness of $\mathcal{K}$. Since $\mathcal{L}$ is a simple left and right scaling of $\mathcal{K}$ with the positive-valued function $\sigma^{1/2}$, the proof for $\mathcal{L}$ follows directly.

Our next result concerns an upper bound on $\mathcal{L}$.

Lemma 3. $\langle g, \mathcal{L} g \rangle \le \|g\|^2_{L^2(V)}$ for all $g \in L^2(V)$.

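Proof. In a variation of the proof of Lemma 2, let $g \in L^2(V)$, but this time let $\psi$ be the solution of

$$\mathcal{T}\psi = \sigma^{1/2} g, \tag{14}$$

subject to vacuum boundary conditions. Then set $\phi = \mathcal{P}\psi$, implying that $\mathcal{P}\psi = \phi = \mathcal{K}(\sigma^{1/2} g)$. This time, applying $\mathcal{P}$ directly to (14) and recalling that $g$ and $\sigma$ are both independent of $\Omega$, we get $\mathcal{P}(\Omega \cdot \nabla \psi) + \sigma \phi = \sigma^{1/2} g$, and so

$$\sigma^{1/2} \mathcal{K} \sigma^{1/2} g = \sigma^{1/2} \phi = g - \sigma^{-1/2}\, \mathcal{P}(\Omega \cdot \nabla \psi).$$

Multiplying each side of this relation by $g$ and integrating over $V$, we get

$$\langle g, \mathcal{L} g \rangle = \langle g, g \rangle - \int_V \sigma^{-1/2}(r)\, g(r)\, \mathcal{P}(\Omega \cdot \nabla \psi(r, \Omega))\, dr. \tag{15}$$

Examining the second term on the right-hand side of (15), we see that this may be written

$$\frac{1}{4\pi} \int_{S^2} \int_V \sigma^{-1/2}(r)\, g(r)\, \Omega \cdot \nabla \psi(r, \Omega)\, dr\, d\Omega. \tag{16}$$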

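Multiplying (14) by $\sigma^{-1}$, we obtain the formula $\sigma^{-1/2} g = \psi + \sigma^{-1}\, \Omega \cdot \nabla \psi$. Using this in (16) and then the divergence theorem again, we see that (16) is

$$\frac{1}{4\pi} \int_{S^2} \left[ \int_V \psi(r, \Omega)\, \Omega \cdot \nabla \psi(r, \Omega)\, dr + \int_V \sigma^{-1} (\Omega \cdot \nabla \psi(r, \Omega))^2\, dr \right] d\Omega \ \ge\ \frac{1}{8\pi} \int_{S^2} \int_V \nabla \cdot (\psi^2(r, \Omega)\, \Omega)\, dr\, d\Omega = \frac{1}{8\pi} \int_{S^2} \int_{\partial V} \psi^2(r, \Omega)\, \Omega \cdot n(r)\, ds\, d\Omega,$$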


the inner integral on the right-hand side being over the surface $\partial V$. This is non-negative because of the vacuum boundary conditions. Hence (16) is non-negative and combining this with (15), we obtain the result.

Lemma 4. $\|\mathcal{L} g\|^2_{L^2(V)} = \langle g, \mathcal{L}^2 g \rangle \le \|g\|^2_{L^2(V)}$ for all $g \in L^2(V)$.

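Proof. By [12, Chapter 104] the positive-definite self-adjoint operator $\mathcal{L}$ possesses a unique positive-definite self-adjoint square root, $\mathcal{L}^{1/2}$. Take any $g \in L^2(V)$. Then, using the self-adjointness of $\mathcal{L}^{1/2}$ and Lemma 3 (twice), we obtain

$$\langle g, \mathcal{L}^2 g \rangle = \langle g, \mathcal{L}^{1/2} \mathcal{L} \mathcal{L}^{1/2} g \rangle = \langle \mathcal{L}^{1/2} g, \mathcal{L} \mathcal{L}^{1/2} g \rangle \le \langle \mathcal{L}^{1/2} g, \mathcal{L}^{1/2} g \rangle = \langle g, \mathcal{L} g \rangle \le \langle g, g \rangle,$$

as required.

Using the above results we are now in a position to prove the main theorem.

Proof of Theorem 1. For any $g \in L^2(V, \sigma)$, we have $g \in L^2(V)$ and we can write

$$\sigma^{1/2} \mathcal{K} \sigma^* g = \mathcal{L}\, \frac{\sigma^*}{\sigma}\, \sigma^{1/2} g.$$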

Using Lemma 4, we then have

$$\|\mathcal{K} \sigma^* g\|_{L^2(V,\sigma)} = \|\sigma^{1/2} \mathcal{K} \sigma^* g\|_{L^2(V)} = \|\mathcal{L} (\sigma^*/\sigma) \sigma^{1/2} g\|_{L^2(V)} \le \|(\sigma^*/\sigma) \sigma^{1/2} g\|_{L^2(V)} \le \|\sigma^*/\sigma\|_{L^\infty(V)}\, \|g\|_{L^2(V,\sigma)},$$

and the result follows.

Remark 2. Although we have given the proof here only in the 3D case, the same result holds for classical 2D and 1D model problems. For example in the 1D “slab geometry” case formulated on the unit interval, the transport equation is

$$\mu \frac{d\psi}{dx}(x, \mu) + \sigma(x)\, \psi(x, \mu) = \frac{1}{2} \sigma_S(x) \int_{-1}^{1} \psi(x, \mu')\, d\mu' + Q(x),$$

where $x \in (0, 1)$ and $\mu \in (-1, 1)$. The counterpart of the integral operator $\mathcal{K}$ is

$$\mathcal{K} g(x) = \frac{1}{2} \int_0^1 E_1(|\tau(x, y)|)\, g(y)\, dy,$$

with $\tau$ denoting the optical path and $E_1$ the exponential integral. The counterpart of Theorem 1 for this case is proved using almost identical arguments to those given above (see, e.g., [3]).


2.1 Some applications of Theorem 1

2.1.1 Convergence of source iteration

From Theorem 1 and Corollary 1 we immediately have the following result on the convergence of source iteration.

Corollary 2. Under the definitions above we have:

\[ \|e^{i+1}\|_{L^2(V,\sigma)} \le \left\|\frac{\sigma_S}{\sigma}\right\|_{L^\infty(V)} \|e^{i}\|_{L^2(V,\sigma)}. \tag{17} \]

Since, by (2), $\|\sigma_S/\sigma\|_{L^\infty(V)} < 1$, we have $e^i \to 0$ as $i \to \infty$.

Remark 3. A stronger estimate than (17) can be obtained in the case of constant cross-sections. In [3] it was shown that on a spatial domain $V$ with diameter $d$

\[ \|e^{i+1}\|_{L^2(V)} \le \frac{\sigma_S}{\sigma}\,\big(1 - \exp(-\sigma d)\big)\, \|e^{i}\|_{L^2(V)}. \]

So on small domains source iteration can still converge rapidly, even if the scattering ratio is close to 1.
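To make the difference between the two contraction factors concrete, the following minimal Python sketch evaluates the factor in (17) and the factor from Remark 3, and counts how many iterations each bound needs to guarantee a given error reduction. The cross-section values, the domain diameter and the function names are illustrative assumptions and are not taken from the text. The two estimates are stated in different norms, so the iteration counts are indicative only.

```python
import math


def heterogeneous_factor(sigma_s, sigma):
    """Contraction factor in (17): the largest sampled value of sigma_S / sigma."""
    return max(s / t for s, t in zip(sigma_s, sigma))


def constant_factor(sigma_s, sigma, diameter):
    """Factor from Remark 3 for constant cross-sections on a domain of diameter d."""
    return (sigma_s / sigma) * (1.0 - math.exp(-sigma * diameter))


def iterations_to_tolerance(factor, tol):
    """Smallest i with factor**i <= tol, assuming 0 < factor < 1 and 0 < tol < 1."""
    return math.ceil(math.log(tol) / math.log(factor))


# Hypothetical data: scattering ratio up to 0.99 at three sample points.
rho_het = heterogeneous_factor([0.90, 0.50, 0.99], [1.0, 1.0, 1.0])
# Hypothetical constant cross-sections with the same worst ratio, small domain.
rho_const = constant_factor(0.99, 1.0, diameter=0.1)

n_het = iterations_to_tolerance(rho_het, 1e-6)
n_const = iterations_to_tolerance(rho_const, 1e-6)
print(rho_het, n_het)
print(rho_const, n_const)
```

With these assumed values the Remark 3 factor is much smaller than the scattering ratio, which is the point of the remark: on a small domain the bound predicts rapid convergence even when the scattering ratio is close to 1.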

2.1.2 Data-explicit estimates for the RTE

The next corollary gives data-explicit estimates for the pure transport problem and for the RTE.

Corollary 3. (i) Consider the pure transport problem $\mathcal{T}\psi = g$ with vacuum boundary conditions, as in Lemma 1, and let $\phi$ be the corresponding scalar flux. Then

\[ \|\phi\|_{L^2(V)} \le \frac{1}{\sigma_{\min}}\, \|g\|_{L^2(V)}. \]

(ii) Consider the RTE (1) with vacuum boundary conditions (3) and let $\phi$ be the corresponding scalar flux. Then

\[ \|\phi\|_{L^2(V)} \le \frac{1}{\sigma_{\min}} \left(1 - \left\|\frac{\sigma_S}{\sigma}\right\|_{L^\infty(V)}\right)^{-1} \|Q\|_{L^2(V)}. \]

Proof. (i) By Theorem 1, we have $\phi = \mathcal{K} g$, so

\[ \|\phi\|_{L^2(V,\sigma)} = \|\sigma^{1/2}\mathcal{K} g\|_{L^2(V)} = \|\mathcal{L}(\sigma^{-1/2} g)\|_{L^2(V)} \le \|\sigma^{-1/2} g\|_{L^2(V)} \]

(using Lemma 4), and this yields the result.


(ii) By Theorem 1, the operator $\mathcal{K}\sigma_S$ is a contraction on $L^2(V,\sigma)$ with norm bounded by $\|\sigma_S/\sigma\|_{L^\infty(V)} < 1$. Hence, by (11) and the Banach Lemma, we can write $\phi = (I - \mathcal{K}\sigma_S)^{-1}\mathcal{K} Q$, with

\[ \|\phi\|_{L^2(V,\sigma)} \le \left(1 - \left\|\frac{\sigma_S}{\sigma}\right\|_{L^\infty(V)}\right)^{-1} \|\mathcal{K} Q\|_{L^2(V,\sigma)}. \]

Writing $\sigma^{1/2}\mathcal{K} Q = \mathcal{L}(\sigma^{-1/2} Q)$ and proceeding as in part (i), we obtain (ii).

Remark 4. (i) The bounds in Corollary 3 provide the mechanism for estimating the flux $\phi$ in appropriate Bochner norms when the data $\sigma$, $\sigma_A$, $\sigma_S$ are random fields, i.e., when we wish to quantify how uncertainty in data propagates to uncertainty in the fluxes or in the criticality (see the next subsection). Integrability in probability space of the right-hand sides in each of the estimates (i) or (ii) above immediately implies the same integrability properties for the resulting flux. In [9, 5, 6] this is worked out in detail and the theory of multilevel Monte Carlo methods for computing quantities of interest is presented for the RTE in one and two-dimensional models.
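As an illustration of how the data-explicit bound behaves for random data, here is a minimal Python sketch that evaluates the right-hand side of Corollary 3(ii) for piecewise-constant cross-sections and averages it over random samples. It assumes $\sigma = \sigma_S + \sigma_A$ with $\sigma_A > 0$; the uniform sampling ranges, the number of cells and the function names are hypothetical choices made only for this example. This is plain Monte Carlo, not the multilevel method of [9, 5, 6].

```python
import random


def flux_bound(sigma_s, sigma_a, q_norm):
    """Right-hand side of Corollary 3(ii) for piecewise-constant data.

    sigma_s, sigma_a: values of the scattering and absorption cross-sections
    on each cell (assumption: sigma = sigma_S + sigma_A).
    q_norm: the L2(V) norm of the source Q.
    """
    sigma = [s + a for s, a in zip(sigma_s, sigma_a)]
    sigma_min = min(sigma)
    ratio = max(s / t for s, t in zip(sigma_s, sigma))  # sup of sigma_S / sigma
    return q_norm / (sigma_min * (1.0 - ratio))


def mean_flux_bound(n_samples, n_cells, q_norm, seed=0):
    """Plain Monte Carlo estimate of the expected value of the bound."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Hypothetical random fields: independent uniform values per cell.
        sigma_s = [rng.uniform(0.5, 2.0) for _ in range(n_cells)]
        sigma_a = [rng.uniform(0.1, 1.0) for _ in range(n_cells)]
        total += flux_bound(sigma_s, sigma_a, q_norm)
    return total / n_samples


print(flux_bound([1.0], [1.0], 2.0))
print(mean_flux_bound(n_samples=1000, n_cells=8, q_norm=1.0))
```

Because the absorption is bounded below by 0.1 in this sketch, the scattering ratio stays away from 1 and every sampled bound is finite, so the sample mean is well defined. That is the kind of integrability the remark refers to.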

2.1.3 Spectral properties of the RTE

In the study of nuclear reactor stability, one is concerned with the eigenvalues $\lambda$ of the generalised eigenproblem:

\[ \Omega\cdot\nabla\psi + \sigma\psi = \sigma_S\,\phi + \lambda\,\sigma_F\,\phi, \tag{18} \]

with vacuum boundary condition. Here (in this simplified model problem), $\sigma_F$ is the fission cross-section, which is also assumed bounded above and below on $V$ by positive constants, and now

\[ \sigma = \sigma_S + \sigma_F + \sigma_A. \]

In fact one is concerned with the fundamental eigenvalue of (18), the smallest in absolute value. The reactor is stable and efficient provided the fundamental eigenvalue is close to 1. In this case the neutrons produced by fission balance the neutrons lost by scattering and absorption. It is not completely obvious that the spectrum of problem (18) is discrete, real and bounded below by a positive number. This fact can be obtained from the elementary properties which we have derived above.

Corollary 4. The eigenvalues of problem (18) are real and positive.

Proof. Let $(\lambda, \psi)$ be an eigenpair of (18). Then, as in (11), $(I - \mathcal{K}\sigma_S)\phi = \lambda\,\mathcal{K}\sigma_F\,\phi$. Multiplying through by $\sigma_S^{1/2}$ and setting $v = \sigma_S^{1/2}\phi$, we have


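\[ (I - \mathcal{L}_{\sigma_S})\, v = \lambda\, \mathcal{L}_{\sigma_S}\left(\frac{\sigma_F}{\sigma_S}\, v\right), \tag{19} \]

where $\mathcal{L}_{\sigma_S} := \sigma_S^{1/2}\mathcal{K}\sigma_S^{1/2}$. Now, since $\mathcal{L}_{\sigma_S} v = (\sigma_S/\sigma)^{1/2}\mathcal{L}\big((\sigma_S/\sigma)^{1/2} v\big)$, we can use Lemma 4 to obtain

\[ \|\mathcal{L}_{\sigma_S} v\|_{L^2(V)} \le \left\|\frac{\sigma_S}{\sigma}\right\|_{L^\infty(V)} \|v\|_{L^2(V)}. \]

Since $\|\sigma_S/\sigma\|_{L^\infty(V)} < 1$, $\mathcal{L}_{\sigma_S}$ is a contraction on $L^2(V)$, and $I - \mathcal{L}_{\sigma_S}$ is an invertible operator. Thus $\lambda$ cannot vanish in (19) (since if $\lambda = 0$, then $v = 0$, which implies $\phi = 0$ and hence by (18), $\psi = 0$). Hence (19) is equivalent to

\[ \frac{1}{\lambda}\, v = \mathcal{M}\left(\frac{\sigma_F}{\sigma_S}\, v\right), \]

where $\mathcal{M} = (I - \mathcal{L}_{\sigma_S})^{-1}\mathcal{L}_{\sigma_S}$. It is easy to see that $\mathcal{M}$ is self-adjoint and compact and has eigenvalues $(1-\mu)^{-1}\mu$ where $\mu$ denotes an eigenvalue of $\mathcal{L}_{\sigma_S}$. Since $\mu$ is always positive and less than 1 it follows that the eigenvalues of $\mathcal{M}$ are all positive and so $\mathcal{M}$ is positive definite. Then setting $w = (\sigma_F/\sigma_S)^{1/2} v = \sigma_F^{1/2}\phi$, we have

\[ \frac{1}{\lambda}\, w = \mathcal{N} w, \]

where

\[ \mathcal{N} = \left(\frac{\sigma_F}{\sigma_S}\right)^{1/2} \mathcal{M} \left(\frac{\sigma_F}{\sigma_S}\right)^{1/2}. \]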

Since $\mathcal{N}$ is also self-adjoint and positive definite, the result follows.

Remark 5. (i) A more sophisticated argument based on the Krein-Rutman theorem can be used to show that the fundamental eigenvalue is simple with a positive eigenfunction. (ii) The eigenvalue problem is discussed in detail in the fundamental reference [4]. (iii) The quantity $1/\lambda$ is called “k-effective” in the nuclear engineering literature.

Acknowledgement We thank Professor Paul Smith (Wood plc., Poundbury, Dorset, UK) for many helpful discussions over many years’ collaboration, and for supporting the PhD theses of Fynn Scheben, Jack Blake and Matt Parkinson, whose work is partially reported here.

References

1. S. F. Ashby, P. N. Brown, M. R. Dorr, and A. C. Hindmarsh. A linear algebraic analysis of diffusion synthetic acceleration for the Boltzmann transport equation. SIAM J. Numer. Anal., 32:128–178, 1995.
2. G. I. Bell and S. Glasstone. Nuclear Reactor Theory. Van Nostrand Reinhold Company, 1970.
3. J.C.H. Blake, Domain decomposition methods for nuclear reactor modelling with diffusion acceleration. PhD Thesis, University of Bath (2016).
4. R. Dautray and J.L. Lions, Mathematical Analysis and Numerical Methods for Science and Technology: Volume 1, Physical Origins and Classical Methods. Springer, Heidelberg (2012).
5. Contemporary Computational Mathematics - A Celebration of the 80th Birthday of Ian Sloan. Editors: Dick, Josef, Kuo, Frances Y., Wozniakowski, Henryk (Eds.). Springer-Verlag, 2018.
6. I.G. Graham, M.J. Parkinson and R. Scheichl, Error Analysis and Applications for the heterogenous transport equation in slab geometry. arXiv:1903.11838 (2019).
7. L. V. Kantorovich and G. P. Akilov. Functional Analysis. Pergamon Press, 1982.
8. Lewis, E.E., Miller, W.F.: Computational methods of Neutron Transport. John Wiley and Sons, New York (1984).
9. M.J. Parkinson, Uncertainty Quantification in Radiative Transport, PhD thesis, University of Bath, 2018.
10. Pitkaranta, J., Scott, L.R.: Error estimates for the combined spatial and angular approximations of the transport equation for slab geometry. SIAM J. Numer. Anal. 20, 922–950 (1983).
11. A.K. Prinja and E.W. Larsen, General Principles of Neutron Transport, in Handbook of Nuclear Engineering, D.G. Cacuci, Ed., Springer Science and Business Media, 2010.
12. F. Riesz and B. Sz.-Nagy. Functional Analysis. Frederick Ungar Publishing Co., 1955.
13. F. Scheben, Iterative Methods for Criticality Computations in Neutron Transport Theory. PhD Thesis, University of Bath (2011).
14. F. Scheben and I. G. Graham, Iterative methods for neutron transport eigenvalue problems, SIAM Journal on Scientific Computing, 33 (2011), 2785–2804.

A reduced-order-model Bayesian obstacle detection algorithm

Mahadevan Ganesh and Stuart C. Hawkins

Abstract We develop an efficient Bayesian algorithm for solving the inverse problem of classifying and locating certain two dimensional objects using noisy far field data obtained by illuminating them with a radiating wave. While application of Bayesian algorithms for wave-propagation inverse problems is itself innovative, the principal novelty in this work is in using i) a surrogate Bayesian posterior distribution computed using a generalised polynomial chaos approximation; and ii) an efficient wave-propagation-specific reduced order model in place of the full multiple scattering forward model. We demonstrate the capability of this approach with simulations in which we accurately detect two dimensional objects, with shapes motivated by safety and security applications.

1 Introduction

The time harmonic radiating field $u$ scattered by a two dimensional scatterer $D$ in a homogeneous medium satisfies the Helmholtz partial differential equation (PDE)

\[ \Delta u + k^2 u = 0, \qquad x \in \mathbb{R}^2 \setminus D, \tag{1} \]

and the Sommerfeld radiation condition [5, Equation (3.85)]

\[ \lim_{r\to\infty} \sqrt{r}\left(\frac{\partial u}{\partial r} - iku\right) = 0, \qquad r = |x|, \tag{2} \]

M. Ganesh, Colorado School of Mines, Golden, CO 80401, USA
Stuart C. Hawkins, Macquarie University, Sydney, NSW 2109, Australia

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_2


uniformly for all directions $\theta$. Here $k = 2\pi/\lambda$ is the wavenumber and $\lambda$ is the wavelength. The two dimensional Helmholtz model, which arises in scattering of acoustic or electromagnetic waves by cylinders (where $D$ represents the cross section of the cylinder), is often used as a proving ground for developing scattering algorithms. The approach used in this paper generalises to the three dimensional case, and application to the three dimensional case will be the subject of a future work.

The radiating scattered field is induced by an incident field via a boundary condition imposed on the boundary $\partial D$ of the scatterer $D$. Our approach allows all of the standard wave propagation boundary conditions. For the numerical experiments considered in Section 3, we impose the Dirichlet boundary condition

\[ u(x) = -u^i(x), \qquad x \in \partial D, \tag{3} \]

which models acoustic scattering by a sound soft body, or electromagnetic scattering by a perfect electrical conductor under transverse electric (TE) incident polarisation. In this work we assume the incident field is generated by a point source located at a point $x_0$ close to the body $D$. The incident field is then

\[ u^i(x) = \frac{i}{4}\, H_0^{(1)}\big(k|x - x_0|\big), \tag{4} \]

where $H_n^{(1)}$ denotes the first kind Hankel function of order $n$. A consequence of the Sommerfeld radiation condition (2) is that at large distances from the body the scattered field satisfies

\[ u(r,\theta) \approx \frac{e^{ikr}}{\sqrt{r}}\, u^\infty(\theta). \tag{5} \]

The modulation function u∞ is known as the far field pattern of D. The well known radar cross section, which is important in applications, is computed from |u∞ | and is a measure of the strength of the reflection from D. The complex valued far field additionally includes phase information, which is important in our algorithm. In practice the far field u∞ is dependent on the shape of the body D. In many applications the shape of D (or a close approximation to it) can be described by a finite

Fig. 1 Visualisations of the reference scatterers: a gun (left), a bomb (right) and a knife (bottom).


number of parameters. An important example is the case where $D$ is a configuration of disjoint objects, which can be parametrised by choosing a local origin with coordinates $(x, y)$ inside each object, and describing the one dimensional boundary of the object in local polar coordinates $(r, \theta)$. For star-shaped objects the shape is determined by the function $r(\theta)$ and this function can itself be discretised using standard functional analysis techniques such as Fourier series or spline approximation so that the scatterer is described by a vector parameter. Such an approach has been used in various applications in science and engineering to describe or reconstruct the shape of a single object [10, 17, 20, 5, 21]. However, if the aim is to detect objects that match certain known shapes in a catalogue (such as in Figure 1) then the object shape can be parametrised using an integer (an index into the catalogue) and the configuration of $J$ obstacles is described using $d = 3J$ parameters. Such inverse problems are of interest in security detection type applications.

Motivated by such applications, we assume that $D$ is described by the vector parameter $\sigma \in \mathbb{R}^d$, and we denote the corresponding far field $u^\infty(\theta; \sigma)$. It is convenient to define $F(\sigma) = u^\infty(\cdot\,; \sigma)$. The problem of computing $F(\sigma)$ for a given $\sigma$ is commonly known as the forward problem. Given data $y$ the corresponding inverse problem is to find $\sigma$ such that

\[ F(\sigma) = y. \tag{6} \]

Typically the data incorporates noise and so may not be in the range of $F$. This is one of the classical wave propagation inverse problems and has received considerable attention in the literature (see the book [5] and references therein). The substantial literature for the ill-posed wave-propagation inverse problem is based on a deterministic approach using Tikhonov regularisation techniques.

A completely different approach is to formulate the deterministic inverse problem (6) as a stochastic problem in which the parameter $\sigma$ is modelled as a stochastic variable with associated probability distribution

\[ \rho_0(\sigma) = \rho_0^1(\sigma_1)\,\rho_0^2(\sigma_2)\cdots\rho_0^d(\sigma_d). \]

The $d$-dimensional distribution is known in the Bayesian setting as the prior. Under the assumption that the noise in the data is random with probability distribution $\rho$, the noisy data can be written in the form

\[ \tilde{y} = y + \eta. \tag{7} \]

Here $y$ is in the range of $F$ and $\eta$ is a $d$-dimensional sample of the random noise. Then Bayes’ Theorem gives an improved probability distribution for $\sigma$ conditional on the data $\tilde{y}$,

\[ \rho_{\tilde{y}}(\sigma) = \frac{\rho\big(\tilde{y} - F(\sigma)\big)\,\rho_0(\sigma)}{Z} \tag{8} \]

where

\[ Z = \int_\Omega \rho\big(\tilde{y} - F(\sigma)\big)\,\rho_0(\sigma)\, d\sigma, \tag{9} \]

and $\Omega = \Omega_1 \times \cdots \times \Omega_d \subseteq \mathbb{R}^d$ is the domain of the stochastic variable $\sigma$. The distribution $\rho_{\tilde{y}}$ is known in the Bayesian setting as the posterior. This approach was


introduced, in the context of PDEs, with justification using mathematical theory, in the seminal paper [24]. The use of Bayesian inversion for PDEs has been mostly limited to simple diffusion problems [24, 28, 14, 16] posed on a bounded domain, for which the associated forward problem can be solved efficiently using the finite element method, with fast evaluation using techniques such as the multigrid method. For this simple class of PDEs, fast evaluation justifies the use of Bayesian techniques. The application of Bayesian inversion to wave propagation has been limited to two dimensional acoustic models for shape reconstruction [25, 2, 20, 21, 6].

In practice we discretise by requiring that (6) holds at discrete points $\theta_1, \dots, \theta_M \in [0, 2\pi)$, leading to

\[ \mathbf{F}(\sigma) = \mathbf{y}, \tag{10} \]

where

\[ \mathbf{F}(\sigma) = \big(F^{(m)}(\sigma)\big)_{m=1,\dots,M}, \quad F^{(m)}(\sigma) = u^\infty(\theta_m; \sigma), \qquad \mathbf{y} = (y_m)_{m=1,\dots,M}, \quad y_m = y(\theta_m). \tag{11} \]
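The Bayesian formulation in (7)-(9) can be sketched on a one-parameter toy problem. In the Python sketch below the forward map `toy_forward` is a hypothetical stand-in (it is not the scattering model), the prior is uniform on an interval, the noise density is assumed Gaussian with standard deviation `tau`, and the normalising constant $Z$ in (9) is approximated with a midpoint rule. All names and numerical values are assumptions made for illustration.

```python
import math


def toy_forward(sigma, thetas):
    """Hypothetical stand-in for the forward map F(sigma), sampled at the
    observation points; this is NOT the scattering model of the paper."""
    return [math.sin(sigma * t) for t in thetas]


def grid_posterior(data, thetas, lo, hi, n_cells, tau):
    """Posterior (8) on a midpoint grid, with a uniform prior on [lo, hi] and
    an assumed Gaussian noise density with standard deviation tau."""
    h = (hi - lo) / n_cells
    grid = [lo + (i + 0.5) * h for i in range(n_cells)]
    prior = 1.0 / (hi - lo)
    unnormalised = []
    for s in grid:
        model = toy_forward(s, thetas)
        misfit = sum((d - f) ** 2 for d, f in zip(data, model))
        unnormalised.append(math.exp(-0.5 * misfit / tau ** 2) * prior)
    z = h * sum(unnormalised)  # midpoint-rule approximation of Z in (9)
    return grid, [u / z for u in unnormalised], h


thetas = [0.5, 1.0, 1.5, 2.0]
sigma_true = 1.205  # hypothetical parameter value used to generate the data
data = toy_forward(sigma_true, thetas)  # noise-free data, for simplicity
grid, posterior, h = grid_posterior(data, thetas, 0.0, 2.0, 200, tau=0.05)
mode = grid[max(range(len(grid)), key=posterior.__getitem__)]
print(mode)
```

Every grid point costs one forward evaluation, and a tensor grid in $d$ dimensions multiplies that cost accordingly. This is why the cost of the forward model dominates any practical use of (8).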

Solving (10) to determine D is extremely challenging, even when D is known to be a simply connected body in a known location, because the number of parameters d required to parametrise the body is large, and the corresponding posterior probability distribution (8) is d-dimensional. In view of the above, it is not feasible to use this approach to determine the shapes of several obstacles—that is, when D comprises several disjoint bodies—located in a large area. However, if specific information is available about the shapes of the obstacles then the number of parameters required to parametrise D may be substantially reduced. In this work we consider the case where D comprises two bodies D1 and D2 which are chosen from a catalogue of three known shapes, and whose centres lie in bounded rectangular regions S1 and S2 respectively. The shapes in the catalogue are visualised in Figure 1. Under these assumptions D(σ ) = D1 (σ ) ∪ D2 (σ ) can be parametrised by

\[ \sigma = (\sigma_1, \dots, \sigma_6) = (x_1, y_1, x_2, y_2, s_1, s_2) \tag{12} \]

where $(x_j, y_j) \in S_j \subseteq \mathbb{R}^2$ are the coordinates of the centre of $D_j$ and $s_j \in \{1, 2, 3\}$ identifies the shape of $D_j$. Such problems arise in several applications in which the aim is to determine whether or not certain objects are present in a particular area. This model can be extended to include changes of scale, in which the scatterers’ diameters are changed but their shapes remain the same, by adding the scaling factor as an extra parameter.

2 Efficient approximation of the forward model

The posterior distribution (8) contains rich information about the body $D$. However, even for the simplified model, this information is hard to access because the posterior distribution is defined on a $d = 6$ dimensional stochastic space. In practice, information about the posterior is typically extracted by sampling using Markov Chain Monte Carlo (MCMC) methods, computing conditional- and marginal-posterior distributions, or by computing the modal value. All of these approaches require a large number of evaluations of the posterior using (8), which in turn requires a large number of evaluations of the forward model (11).

Because the forward problem involves computing the far field, it is desirable to use surface integral based methods such as boundary element methods (BEM), which satisfy the radiation condition (2) exactly and for which the far field is readily available. However, for obstacles comprising heterogeneous materials or irregular shapes, such as the shapes in the catalogue (see Figure 1), application of BEM is challenging. In this work we couple a high order finite element method (FEM) with a high order Nyström BEM. The high-order coupled FEM-BEM scheme [1, 13] facilitates handling general materials and shapes using the FEM whilst properly incorporating the radiation condition (2) using the BEM.

2.1 gPC approximation

Evaluating the forward model (11) using the coupled FEM-BEM is typically time consuming, requiring a few seconds CPU time on a modern workstation even for simple cases, so that evaluating the posterior (8) at thousands of points is prohibitively expensive. To facilitate fast evaluation of the posterior (8) at thousands of points we replace the forward model in (8) with its generalised polynomial chaos (gPC) approximation. A similar approach has been used for diffusion problems [18, 22, 3, 4] and the resulting posterior (computed using the gPC approximation in place of the full forward model) is sometimes known as a surrogate posterior. Our degree $L$ gPC approximation is

\[ \mathbf{F}_L(\sigma) = \big(F_L^{(m)}(\sigma)\big)_{m=1,\dots,M}, \qquad F_L^{(m)}(\sigma) = \sum_{|l| \le L} c_l^{(m)}\, Q_l(\sigma), \tag{13} \]

where $l = (l_1, \dots, l_d)$ and $|l| = \max_{j=1,\dots,d} l_j$. Here the tensor product polynomial chaos polynomials are

\[ Q_l(\sigma) = Q^1_{l_1}(\sigma_1) \cdots Q^d_{l_d}(\sigma_d), \tag{14} \]

and for each $j = 1, \dots, d$ the polynomials $Q^j_0, \dots, Q^j_L$ are orthonormal with respect to the inner product on $\Omega_j$ induced by the prior distribution,

\[ \langle f, g\rangle_j = \int_{\Omega_j} f(\sigma)\, g(\sigma)\, \rho_0^j(\sigma)\, d\sigma. \tag{15} \]

The expansion coefficients are computed using

\[ c_l^{(m)} = \int_{\Omega} F^{(m)}(\sigma)\, Q_l(\sigma)\, d\sigma. \tag{16} \]

In practice the integral in (16) is approximated with high order accuracy using an $(L + 1)^d$ point tensor product Gauss quadrature rule. Once the coefficients $c_l^{(m)}$ have been computed for $m = 1, \dots, M$ and $|l| \le L$ the gPC approximation (13) can be evaluated for a given $\sigma \in \Omega$ very quickly. Thus a significant advantage of this approach is that the gPC approximation can be set up offline by computing and storing the coefficients (16) and subsequently evaluated online very quickly. The gPC approximation is spectrally accurate, with convergence rate $O(L^{-r})$, where $r$ is the regularity constant of the forward model $F$ with respect to the parameter $\sigma$. It is well known [5] that the far field is smooth with respect to the spatial variable even for configurations with non-smooth obstacles. Because of its connection to the far field, as demonstrated in [9], the forward model $F$ is smooth with respect to $\sigma$. Thus in practice, for the model considered in this article, $L \le 10$ is sufficient.

Next we describe an efficient reduced order model for the forward model with high-order accuracy. For the wave propagation problem, our approach facilitates efficient offline setup of the reduced order model, independent of $\sigma \in \mathbb{R}^d$.
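The offline/online split described above can be sketched in one stochastic dimension. The Python sketch below builds a degree-$L$ expansion in Legendre polynomials that are orthonormal with respect to a uniform prior on $(a, b)$, computes the coefficients with an $(L+1)$-point Gauss-Legendre rule, and then evaluates the surrogate. Two assumptions are made here: the prior density is included as a weight in the coefficient integral, so that the expansion is the orthogonal projection, and a cheap scalar function stands in for the forward model. The function names are illustrative.

```python
import math


def legendre(n, x):
    """Return (P_n(x), P_{n-1}(x)) using the three-term recurrence."""
    if n == 0:
        return 1.0, 0.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p, p_prev


def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(100):  # Newton iteration on P_n
            p, p_prev = legendre(n, x)
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        p, p_prev = legendre(n, x)
        dp = n * (x * p - p_prev) / (x * x - 1.0)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def orthonormal_legendre(l, t):
    """Legendre polynomial scaled to be orthonormal for the density 1/2 on [-1, 1]."""
    return math.sqrt(2 * l + 1) * legendre(l, t)[0]


def gpc_coefficients(f, a, b, degree):
    """Offline step: project f onto the basis for sigma ~ U(a, b)."""
    nodes, weights = gauss_legendre(degree + 1)
    coeffs = []
    for l in range(degree + 1):
        total = 0.0
        for t, w in zip(nodes, weights):
            sigma = 0.5 * (a + b) + 0.5 * (b - a) * t
            # The prior density 1/(b-a) times d(sigma) = (b-a)/2 dt gives 1/2.
            total += 0.5 * w * f(sigma) * orthonormal_legendre(l, t)
        coeffs.append(total)
    return coeffs


def gpc_evaluate(coeffs, a, b, sigma):
    """Online step: evaluate the stored expansion at sigma."""
    t = (2.0 * sigma - a - b) / (b - a)
    return sum(c * orthonormal_legendre(l, t) for l, c in enumerate(coeffs))


def forward(s):
    """Hypothetical smooth stand-in for one component of the forward model."""
    return math.exp(0.5 * s)


coeffs = gpc_coefficients(forward, -4.0, -2.0, degree=8)
print(gpc_evaluate(coeffs, -4.0, -2.0, -3.3), forward(-3.3))
```

In $d$ dimensions the same construction is applied coordinate by coordinate, which is where the $(L+1)^d$ forward evaluations in the offline step come from.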

2.2 Reduced order model

To compute the gPC coefficients using an $(L + 1)^d$ point tensor product Gauss quadrature rule requires $(L + 1)^d$ evaluations of the forward model (11). We accelerate setting up the gPC approximation by replacing the forward model in (16) with a reduced order model (ROM) approximation. For the scattering problem in this work there is a well established ROM based on the T-matrix. For a single scatterer $D$ with centre at the origin, the T-matrix ROM involves expanding the incident field in regular cylindrical wavefunctions,

\[ u^i(r, \theta) = \sum_{n=-\infty}^{\infty} a_n\, J_{|n|}(kr)\, e^{in\theta}. \tag{17} \]

Here $J_n$ is the Bessel function of order $n$ and the coefficients $a_n$ for the incident wave are given explicitly by

\[ a_n = H^{(1)}_{|n|}(kR)\, e^{in(\pi + \phi)}, \tag{18} \]

where $(R, \phi)$ are polar coordinates for the source location $x_0$. The scattered field is expanded in radiating cylindrical wavefunctions

\[ u(r, \theta) = \sum_{n=-\infty}^{\infty} b_n\, H^{(1)}_{|n|}(kr)\, e^{in\theta} \tag{19} \]


and from the linearity of the Helmholtz equation (1) it follows that there is a matrix $T$ such that

\[ b = Ta \tag{20} \]

where $b = (b_n)$, $a = (a_n)$. The matrix $T$ is called the T-matrix of $D$ and encapsulates the scattering properties of $D$ for any incident wave. In practice the series (17) and (19) are truncated for $|n| \le N$ and the corresponding finite T-matrix has dimension $(2N + 1) \times (2N + 1)$. The truncation parameter $N$ is chosen using Equation (12) in [11] and depends on the wavenumber $k$ and the radius of $D$.

The T-matrix was introduced by Waterman in the 1960s [26, 27] and has been extensively used since (see [19] and references therein). The T-matrix is usually computed using the Null Field method, which is well known to be numerically unstable for scatterers that are large or have a large aspect ratio [23, 15]. In this work we use the numerically stable formulation [7],

\[ T_{mn} = \frac{1}{4}\sqrt{\frac{k}{\pi}}\; i^{|m|}\,(1 + i) \int_0^{2\pi} u_n^\infty(\theta)\, e^{-im\theta}\, d\theta \tag{21} \]

to compute the (single-scattering) T-matrix of each scatterer in the catalogue, where $u_n^\infty$ is the far field of the scatterer computed with incident field

\[ u^i(r, \theta) = J_{|n|}(kr)\, e^{in\theta}. \tag{22} \]

Once the T-matrix of each scatterer in the catalogue is available it can be efficiently used for an online multiple scattering computation to obtain the far field of $D(\sigma)$ using the approach in [8]. The error analysis in [12] establishes that the error in the far field computed using the T-matrix ROM is of the same order as the error obtained using the high-order coupled FEM-BEM solver. However, we emphasise that the T-matrices are independent of $\sigma$ and so can be computed offline and stored before being used to compute the gPC coefficients in (16).
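The far-field formula for the T-matrix entries can be checked numerically without a scattering solver. The Python sketch below applies the trapezoidal rule to the integral over $[0, 2\pi)$ and recovers the coefficients of a synthetic far field. It relies on the standard large-argument asymptotics of the Hankel function, under which the radiating wavefunction $H^{(1)}_{|m|}(kr)e^{im\theta}$ has far field $\sqrt{2/(\pi k)}\,e^{-i\pi/4}(-i)^{|m|}e^{im\theta}$ in the normalisation of (5). The coefficient values and function names are hypothetical and serve only as a self-consistency check.

```python
import cmath
import math


def radiating_far_field_factor(m, k):
    """Far field amplitude of H^(1)_|m|(kr) exp(i m theta), from the
    large-argument asymptotics of the Hankel function."""
    return (math.sqrt(2.0 / (math.pi * k))
            * cmath.exp(-1j * math.pi / 4.0) * (-1j) ** abs(m))


def tmatrix_column(far_field, k, order, n_quad=None):
    """Apply the far-field formula for T_mn with the trapezoidal rule.

    far_field: callable theta -> far field for one incident wavefunction.
    Returns a dict {m: T_mn} for |m| <= order.
    """
    if n_quad is None:
        n_quad = 4 * order + 2  # enough points to avoid aliasing for |m| <= order
    h = 2.0 * math.pi / n_quad
    samples = [far_field(j * h) for j in range(n_quad)]
    column = {}
    for m in range(-order, order + 1):
        integral = h * sum(u * cmath.exp(-1j * m * j * h)
                           for j, u in enumerate(samples))
        column[m] = (0.25 * math.sqrt(k / math.pi)
                     * (1j) ** abs(m) * (1 + 1j) * integral)
    return column


# Hypothetical scattered-field coefficients b_m for one incident wavefunction.
k = math.pi
b_true = {-2: 0.3 - 0.1j, -1: 0.2j, 0: -0.5 + 0.4j, 1: 0.1 + 0.0j, 2: -0.05 + 0.2j}


def synthetic_far_field(theta):
    """Far field of the radiating expansion with coefficients b_true."""
    return sum(b * radiating_far_field_factor(m, k) * cmath.exp(1j * m * theta)
               for m, b in b_true.items())


column = tmatrix_column(synthetic_far_field, k, order=2)
print(max(abs(column[m] - b_true[m]) for m in b_true))
```

In an actual computation `far_field` would be supplied by the forward solver for each incident wavefunction (22), one column of the T-matrix per value of $n$.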

3 Numerical Results

We demonstrate our approach for configurations modelled stochastically by $D(\sigma) = D_1(\sigma) \cup D_2(\sigma)$ and parametrised as in (12) with

\[
\begin{aligned}
x_1 &= \sigma_1 \sim U(-4, -2), & y_1 &= \sigma_2 \sim U(-1, 1), & s_1 &= \sigma_5 \sim U\{1, 2, 3\}, \\
x_2 &= \sigma_3 \sim U(2, 4), & y_2 &= \sigma_4 \sim U(-1, 1), & s_2 &= \sigma_6 \sim U\{1, 2, 3\}.
\end{aligned}
\]

The configuration is illuminated from above by a point source at x0 = (0, 7) with field given by (4). We choose the incident wavenumber k = π so that the corresponding incident wavelength λ = 2 is about double the diameter of our test objects. (After scaling the dimensionless units we use in our numerical experiments, this corresponds to a wavelength a bit longer than used by Wi-Fi.)
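The stochastic parametrisation above is simple enough to write down directly. The following Python sketch draws a configuration from the stated prior, with four continuous coordinates and two discrete shape indices. The class and function names are illustrative and are not part of the authors' software.

```python
import random
from dataclasses import dataclass


@dataclass
class Configuration:
    """Parameter vector sigma = (x1, y1, x2, y2, s1, s2) as in (12)."""
    x1: float
    y1: float
    x2: float
    y2: float
    s1: int
    s2: int

    def as_vector(self):
        return (self.x1, self.y1, self.x2, self.y2, self.s1, self.s2)


def sample_prior(rng):
    """Draw one configuration from the prior stated in the text."""
    return Configuration(
        x1=rng.uniform(-4.0, -2.0),
        y1=rng.uniform(-1.0, 1.0),
        x2=rng.uniform(2.0, 4.0),
        y2=rng.uniform(-1.0, 1.0),
        s1=rng.choice((1, 2, 3)),  # index into the shape catalogue
        s2=rng.choice((1, 2, 3)),
    )


rng = random.Random(0)
sigma0 = sample_prior(rng)
wavelength = 2.0  # lambda = 2*pi/k with k = pi, as stated in the text
print(sigma0.as_vector())
```

Keeping the shape indices as integers mirrors the text: only the four continuous coordinates need a polynomial surrogate, while the two discrete ones can be enumerated.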


In our numerical experiments we create test configurations $D(\sigma^0)$ for samples $\sigma^0$ of $\sigma$. For each test configuration we generate far field data $y$ with

\[ y_m = u^\infty(\theta_m; \sigma^0), \qquad m = 1, \dots, M, \tag{23} \]

for $M = 20$ using the T-matrix based reduced order model described in Section 2. A detailed investigation to determine how many data points $M$ are required to obtain an accurate reconstruction of the configuration will be the subject of a future work and is beyond the scope of this paper. We generate synthetic noisy data from $y$ using (7) with 10% noise using

\[ \eta_m \sim N(0, \tau^2), \qquad \tau = 0.1\,|u^\infty(\cdot\,; \sigma^0)|. \tag{24} \]
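A minimal Python sketch of this noise model follows. Two details are assumptions rather than statements from the text: the size of the far field in the definition of $\tau$ is taken to be the largest modulus over the sampled directions, and, because the far field is complex valued, independent Gaussian perturbations with standard deviation $\tau$ are added to the real and imaginary parts. The sample values are hypothetical.

```python
import random


def add_noise(clean, level, rng):
    """Return (noisy data, tau) for complex far field samples.

    Assumptions: tau = level * max |y_m|, and the real and imaginary parts
    are perturbed independently by N(0, tau^2) samples.
    """
    tau = level * max(abs(v) for v in clean)
    noisy = [v + complex(rng.gauss(0.0, tau), rng.gauss(0.0, tau)) for v in clean]
    return noisy, tau


# Hypothetical far field samples standing in for y_m in (23).
clean = [complex(1.0, 0.5), complex(-0.3, 0.8), complex(0.2, -1.1)]
noisy, tau = add_noise(clean, level=0.1, rng=random.Random(1))
print(tau)
```

A different reading of $\tau$ (for example a direction-dependent value, or a different norm of the far field) would change only the first line of `add_noise`.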

We avoid the “inverse crime” in which the data is generated using the same forward model that is used for the inversion (see [5, Page 154]) by generating our synthetic data using the T-matrix based reduced order model rather than using the gPC approximation, and by including random noise. The far field associated with a particular sample $\sigma^0$ and the corresponding noisy data are visualised in Figure 2.

The posterior $\rho_y(\cdot)$ has a six-dimensional domain comprising four dimensions associated with continuous random variables and two dimensions associated with discrete random variables. The posterior is computed using (8) with the forward model approximated using the gPC polynomial (13) for the four dimensions associated with continuous random variables. (gPC approximation is not required for the dimensions associated with the discrete random variables.) Because the random variables in our stochastic parametrisation are uniform, our gPC basis (14) uses dilated and translated Legendre polynomials. The gPC truncation parameter $L$ was chosen so that the gPC approximation was accurate (in the maximum norm) to at least order $10^{-2}$. In practice the maximum norm was approximated using 64 points.

Although the posterior $\rho_y(\cdot)$ contains rich information about the configuration $D(\sigma^0)$, in practice it is hard to extract this information due to the high dimension of the domain $\Omega$. In this work we observed that typically the sets $\Omega_\tau = \{\sigma \in \Omega : \rho_y(\sigma) \ge \tau\}$ for $\tau > 0$ have very small diameter. This indicates that the data allows us to accurately locate the centre of each scatterer. However the small diameter of these sets presents problems for Markov Chain Monte Carlo (MCMC) sampling because there is low probability that one of the samples lands in $\Omega_\tau$. We have found it useful to compute the marginal posterior density of $\sigma_j$,

\[ \rho_y^{(j)}(\sigma_j) = \int_{\Omega_{-j}} \rho_y(\sigma)\, d\sigma_{-j} \tag{25} \]

where

\[ \sigma_{-j} = (\sigma_1, \dots, \sigma_{j-1}, \sigma_{j+1}, \dots, \sigma_6), \qquad \Omega_{-j} = \Omega_1 \times \cdots \times \Omega_{j-1} \times \Omega_{j+1} \times \cdots \times \Omega_6. \]

In practice we approximate the integral in (25) using a tensor product Gauss-Legendre rule with $4L + 1$ points in each dimension. In Figure 2 (left) we demonstrate the accuracy of our method for detecting the location of the scatterers by visualising the marginal posterior densities of $x_1, y_1, x_2, y_2$ (respectively $\sigma_1, \dots, \sigma_4$) computed using the data visualised in Figure 2 (right). Our visualisation includes a schematic of the configuration $D(\sigma^0)$ and the corresponding total field. The corresponding marginal posterior probabilities for the shapes are visualised in Figure 3.
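Marginalising a density with a tensor product Gauss-Legendre rule can be sketched as follows. For brevity this Python example uses a fixed 5-point rule per dimension instead of the $4L + 1$ points mentioned above, and integrates a hypothetical polynomial density on a three-dimensional box rather than the six-dimensional posterior. The names and the test density are illustrative assumptions.

```python
import itertools
import math

# Closed-form nodes and weights of the 5-point Gauss-Legendre rule on [-1, 1].
_A = math.sqrt(5.0 - 2.0 * math.sqrt(10.0 / 7.0)) / 3.0
_B = math.sqrt(5.0 + 2.0 * math.sqrt(10.0 / 7.0)) / 3.0
_WA = (322.0 + 13.0 * math.sqrt(70.0)) / 900.0
_WB = (322.0 - 13.0 * math.sqrt(70.0)) / 900.0
GL5 = [(-_B, _WB), (-_A, _WA), (0.0, 128.0 / 225.0), (_A, _WA), (_B, _WB)]


def scaled_rule(a, b):
    """The 5-point rule mapped from [-1, 1] to [a, b]."""
    half, mid = 0.5 * (b - a), 0.5 * (a + b)
    return [(mid + half * t, half * w) for t, w in GL5]


def marginal_density(density, boxes, j, value):
    """Integrate density over every coordinate except j, with coordinate j
    fixed at the given value, using a tensor product quadrature rule."""
    others = [i for i in range(len(boxes)) if i != j]
    rules = [scaled_rule(*boxes[i]) for i in others]
    total = 0.0
    for combo in itertools.product(*rules):
        point = [0.0] * len(boxes)
        point[j] = value
        weight = 1.0
        for i, (x, w) in zip(others, combo):
            point[i] = x
            weight *= w
        total += weight * density(point)
    return total


def toy_density(p):
    """Hypothetical product density on [-1, 1]^3; each factor integrates to 1."""
    return (1.5 * p[0] ** 2) * 0.5 * (0.75 * (1.0 - p[2] ** 2))


boxes = [(-1.0, 1.0)] * 3
print(marginal_density(toy_density, boxes, 0, 0.5))
```

The cost grows as the number of points per dimension raised to the number of marginalised coordinates, which is affordable here only because each evaluation of a surrogate posterior is cheap.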

Fig. 2 Left: visualisation of the total field $u^i(\cdot) + u(\cdot\,; \sigma)$ (main panel) and the marginal posteriors for the x and y coordinates of the left scatterer (magenta) and the right scatterer (red). Right: visualisation of the noisy data (red) and the originating cross section $10 \log_{10}\big(2\pi\,|u^\infty(\cdot\,; \sigma)|^2\big)$ (blue).

Fig. 3 Visualisation of the marginal posteriors for the shape of the left and right scatterers.


Acknowledgements SCH thanks Alistair Reid of Data61 for helpful discussions that guided part of this work.

References

1. Bagheri, S., Hawkins, S.C.: A coupled FEM-BEM algorithm for the inverse acoustic medium problem. ANZIAM 56, C163–C178 (2015)
2. Bui-Thanh, T., Ghattas, O.: An analysis of infinite dimensional Bayesian inverse shape acoustic scattering and its numerical approximation. J. Uncertainty Quant. 2, 203–222 (2014)
3. Chen, P., Schwab, C.: Sparse-grid, reduced-basis Bayesian inversion. Comput. Methods Appl. Mech. Engrg. 297, 84–115 (2015)
4. Chen, P., Schwab, C.: Sparse-grid, reduced-basis Bayesian inversion: nonaffine-parametric nonlinear equations. J. Comput. Phys. 316, 470–503 (2016)
5. Colton, D., Kress, R.: Inverse Acoustic and Electromagnetic Scattering Theory, 3rd edn. Springer (2012)
6. Daza, M.L., Capistrán, M.A., Christen, J.A., Guadarrama, L.: Solution of the inverse scattering problem from inhomogeneous media using affine invariant sampling. Math. Meth. Appl. Sci. 40, 3311–3319 (2017)
7. Ganesh, M., Hawkins, S.C.: A far-field based T-matrix method for two dimensional obstacle scattering. ANZIAM J. 51, C201–C216 (2009)
8. Ganesh, M., Hawkins, S.C.: A stochastic pseudospectral and T-matrix algorithm for acoustic scattering by a class of multiple particle configurations. J. Quant. Spectrosc. Radiat. Transfer 123, 41–52 (2013)
9. Ganesh, M., Hawkins, S.C.: A high performance computing and sensitivity analysis algorithm for stochastic many-particle wave scattering. SIAM J. Sci. Comput. 37, A1475–A1503 (2015)
10. Ganesh, M., Hawkins, S.C.: Scattering by stochastic boundaries: hybrid low- and high-order quantification algorithms. ANZIAM J. 56, C312–C338 (2016)
11. Ganesh, M., Hawkins, S.C.: Algorithm 975: TMATROM - a T-matrix reduced order model software. ACM Trans. Math. Softw. 44, 9:1–9:18 (2017)
12. Ganesh, M., Hawkins, S.C., Hiptmair, R.: Convergence analysis with parameter estimates for a reduced basis acoustic scattering T-matrix method. IMA J. Numer. Anal. 32, 1348–1374 (2012)
13. Ganesh, M., Morgenstern, C.: High-order FEM-BEM computer models for wave propagation in unbounded and heterogeneous media: Application to time-harmonic acoustic horn problem. J. Comp. Appl. Math. 37, 183–203 (2016)
14. Jiang, L., Ou, N.: Multiscale model reduction for Bayesian inverse problems of subsurface flow. J. Comput. Appl. Math. 319, 188–209 (2017)
15. Khlebtsov, N.: Anisotropic properties of plasmonic nanoparticles: depolarized light scattering, dichroism, and birefringence. J. Nanophotonics 4, 041587 (2010)
16. Knapik, B.T., van der Vaart, A.W., van Zanten, J.H.: Bayesian recovery of the initial condition for the heat equation. Commun. Stat. 42, 1294–1313 (2013)
17. Lamberg, L., Muinonen, K., Ylönen, J., Lumme, K.: Spectral estimation of Gaussian random circles and spheres. J. Comput. Appl. Math. 136, 109–121 (2001). DOI 10.1016/S0377-0427(00)00578-1
18. Lu, F., Morzfeld, M., Chorin, A.J.: Limitations of polynomial chaos expansion in the Bayesian solution of inverse problems. J. Comput. Phys. 282, 138–147 (2015)
19. Mishchenko, M.I., et al.: Comprehensive thematic T-matrix reference database: A 2013–2014 update. J. Quant. Spectrosc. Radiat. Transfer 146, 249–354 (2014)
20. Palafox, A., Capistrán, M.A., Christen, J.A.: Effective parameter dimension via Bayesian model selection in the inverse acoustic scattering problem. Math. Probl. Eng. 2014, 1–12 (2014)
21. Palafox, A., Capistrán, M.A., Christen, J.A.: Point cloud-based scatterer approximation and affine invariant sampling in the inverse scattering problem. Math. Meth. Appl. Sci. 40, 3393–3403 (2017)
22. Schillings, C., Schwab, C.: Sparse, adaptive Smolyak quadratures for Bayesian inverse problems. Inverse Problems 29, 1–28 (2013)
23. Somerville, W.R.C., Auguié, B., Ru, E.C.L.: Severe loss of precision in calculations of T-matrix integrals. J. Quant. Spectrosc. Radiat. Transfer 113, 524–535 (2012)
24. Stuart, A.: Inverse problems: A Bayesian perspective. Acta Numer. 19, 451–559 (2010)
25. Wang, Y., Ma, F., Zheng, E.: Bayesian method for shape reconstruction in the inverse interior scattering problem. Math. Probl. Eng. 2015, 1–12 (2015)
26. Waterman, P.C.: Matrix formulation of electromagnetic scattering. Proc. IEEE 53, 805–812 (1965)
27. Waterman, P.C.: New formulation of acoustic scattering. J. Acoust. Soc. Am. 45, 1417–1429 (1969)
28. Yang, K., Guha, N., Efendiev, Y., Mallick, B.: Bayesian and variational Bayesian approaches for flows in heterogeneous random media. J. Comput. Phys. 345, 275–293 (2017)

Bounding the spectral gap for an elliptic eigenvalue problem with uniformly bounded stochastic coefficients

Alexander D. Gilbert, Ivan G. Graham, Robert Scheichl and Ian H. Sloan

Abstract A key quantity that occurs in the error analysis of several numerical methods for eigenvalue problems is the distance between the eigenvalue of interest and the next nearest eigenvalue. When we are interested in the smallest or fundamental eigenvalue, we call this the spectral or fundamental gap. In a recent manuscript [Gilbert et al., https://arxiv.org/abs/1808.02639], the current authors, together with Frances Kuo, studied an elliptic eigenvalue problem with homogeneous Dirichlet boundary conditions, and with coefficients that depend on an infinite number of uniformly distributed stochastic parameters. In this setting, the eigenvalues, and in turn the eigenvalue gap, also depend on the stochastic parameters. Hence, for a robust error analysis one needs to be able to bound the gap over all possible realisations of the parameters, and because the gap depends on infinitely many random parameters, this is not trivial. This short note presents, in a simplified setting, an important result that was shown in the paper above: namely that, under certain decay assumptions on the coefficient, the spectral gap of such a random elliptic eigenvalue problem can be bounded away from 0, uniformly over the entire infinite-dimensional parameter space.

Alexander D. Gilbert, Institute for Applied Mathematics and Interdisciplinary Center for Scientific Computing, Universität Heidelberg, 69120 Heidelberg, Germany
Ivan G. Graham, Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, UK
Robert Scheichl, Institute for Applied Mathematics and Interdisciplinary Center for Scientific Computing, Universität Heidelberg, 69120 Heidelberg, Germany, and Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, UK
Ian H. Sloan, School of Mathematics and Statistics, University of New South Wales, Sydney NSW 2052, Australia

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_3


1 Introduction

Eigenvalue problems are useful for modelling many phenomena from applications in engineering and physics, e.g., structural mechanics, acoustic scattering, elastic membranes, criticality of neutron transport/diffusion and band gap calculations for photonic crystal fibres. In this work, we consider the following eigenvalue problem

$$
\begin{aligned}
-\nabla \cdot \big( a(\boldsymbol{x}, \boldsymbol{y}) \nabla u(\boldsymbol{x}, \boldsymbol{y}) \big) &= \lambda(\boldsymbol{y})\, u(\boldsymbol{x}, \boldsymbol{y}) , && \text{for } \boldsymbol{x} \in D , \\
u(\boldsymbol{x}, \boldsymbol{y}) &= 0 , && \text{for } \boldsymbol{x} \in \partial D ,
\end{aligned}
\tag{1.1}
$$

where the derivatives are taken with respect to $\boldsymbol{x}$, and $\boldsymbol{y} = (y_1, y_2, \ldots)$ is a random, infinite-dimensional vector with independent uniformly distributed components $y_j \sim \mathrm{U}([-\tfrac12, \tfrac12])$. We assume that the physical domain $D \subset \mathbb{R}^d$, for $d = 1, 2, 3$, is bounded with Lipschitz boundary, and denote the stochastic/parameter domain by $U := [-\tfrac12, \tfrac12]^{\mathbb{N}}$. In many uncertainty quantification (UQ) applications, the coefficient $a(\boldsymbol{x}, \boldsymbol{y})$ is given by a Karhunen–Loève expansion of a random field. Taking this as motivation, we assume that the coefficient is an affine map of $\boldsymbol{y}$, and satisfies

$$
0 < a(\boldsymbol{x}, \boldsymbol{y}) = a_0(\boldsymbol{x}) + \sum_{j=1}^{\infty} y_j a_j(\boldsymbol{x}) < \infty , \quad \text{for all } \boldsymbol{x} \in D,\ \boldsymbol{y} \in U .
$$

Further assumptions on the coefficient will be given explicitly in Assumption A1 below. If we ignore the $\boldsymbol{y}$ dependence, then (1.1) is a self-adjoint eigenvalue problem, which has been studied extensively in the literature, see, e.g., [7, 9]. In particular, it is well known that (1.1) has countably many eigenvalues $0 < \lambda_1 < \lambda_2 \le \lambda_3 \le \cdots$, and that the smallest eigenvalue is simple. However, in our setting the eigenvalues depend on $\boldsymbol{y}$: $\lambda_k = \lambda_k(\boldsymbol{y})$, and since the parameter domain is infinite-dimensional, care must be taken when transferring classical results to our setting. In particular, although it is well known that in the unparametrised setting the spectral gap, $\lambda_2 - \lambda_1$, is some fixed positive number, in our setting the spectral gap, $\lambda_2(\boldsymbol{y}) - \lambda_1(\boldsymbol{y})$, is a function defined on an infinite-dimensional domain, which could be arbitrarily close to 0. Our main result (see Theorem 3 for a full statement) is as follows. Assuming that the terms $a_j$ in the coefficient decay sufficiently fast (in a suitable norm), then there exists a $\delta > 0$, independent of $\boldsymbol{y}$, such that the spectral gap of the eigenvalue problem (1.1) satisfies

$$
\lambda_2(\boldsymbol{y}) - \lambda_1(\boldsymbol{y}) \ge \delta , \quad \text{for all } \boldsymbol{y} \in U .
$$


As an example of the important role that the spectral gap plays in error analysis, consider the random elliptic eigenvalue problem from [8]. There, an algorithm using dimension truncation, Quasi–Monte Carlo (QMC) quadrature and finite element (FE) methods was used to approximate the expectation with respect to the stochastic parameters of the smallest eigenvalue. Throughout the error analysis the reciprocal of the spectral gap occurred in: 1) the bounds on the derivatives of the eigenvalues with respect to y (required for the QMC and dimension truncation error analysis); 2) the constants for the FE error; and 3) the convergence rate for the eigensolver (by Arnoldi iteration). In short, the entire error analysis in [8] fails unless the gap can be bounded from below uniformly in y . In the remainder of this section we frame (1.1) as a variational eigenvalue problem, introduce the function space setting and summarise some known properties of the eigenvalues. Then, in Section 2 we prove that the spectral gap is uniformly bounded. Finally, in Section 3 we perform a numerical experiment for a specific example of (1.1), and present results on the size of the gap over different realisations of the parameter generated by a QMC pointset.

1.1 Variational eigenvalue problems

It is often useful to study the eigenvalue problem (1.1) in its equivalent variational form. In this section we introduce the variational eigenvalue problem, then present some well known properties and tools that are required for our analysis. First, we clarify the assumptions on the coefficient $a$ and our setting.

Assumption A1
1. The coefficient is of the form
$$
a(\boldsymbol{x}, \boldsymbol{y}) = a_0(\boldsymbol{x}) + \sum_{j=1}^{\infty} y_j a_j(\boldsymbol{x}) , \tag{1.2}
$$
with $a_j \in L^\infty(D)$, for all $j \ge 0$.
2. There exist $0 < a_{\min} < a_{\max} < \infty$ such that $a_{\min} \le a(\boldsymbol{x}, \boldsymbol{y}) \le a_{\max}$, for all $\boldsymbol{x} \in D$, $\boldsymbol{y} \in U$.
3. For some $p \in (0, 1)$,
$$
\sum_{j=1}^{\infty} \|a_j\|_{L^\infty}^{p} < \infty .
$$

The last condition (Assumption A1.3) is the same as is required for the QMC and dimension truncation analysis for corresponding source problems (see [11]). Note that in that paper they also allow $p = 1$, but with an extra condition on the size of the sum.


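Also, the assumption from the Introduction that each $y_j$ is uniformly distributed is not a restriction; the important point is that each $y_j$ belongs to a bounded interval.

Let $V = H_0^1(D)$ equipped with the norm $\|v\|_V := \|\nabla v\|_{L^2}$, and let $V^*$ denote the dual of $V$. We identify $L^2(D)$ with its dual, and denote the inner product on $L^2(D)$ by $\langle \cdot, \cdot \rangle$, which can be continuously extended to a duality pairing on $V \times V^*$, also denoted $\langle \cdot, \cdot \rangle$. Note that we have the following chain of compact embeddings $V \subset\subset L^2(D) \subset\subset V^*$. The parameter domain $U$ is equipped with the topology and metric of $\ell^\infty$.

Next, define the (parametric) symmetric bilinear form $A : U \times V \times V \to \mathbb{R}$ by

$$
A(\boldsymbol{y}; w, v) := \int_D a(\boldsymbol{x}, \boldsymbol{y})\, \nabla w(\boldsymbol{x}) \cdot \nabla v(\boldsymbol{x}) \, \mathrm{d}\boldsymbol{x} ,
$$

which is also an inner product on $V$. In this setting, for each $\boldsymbol{y} \in U$, the variational eigenvalue problem equivalent to (1.1) is: Find $0 \ne u(\boldsymbol{y}) \in V$ and $\lambda(\boldsymbol{y}) \in \mathbb{R}$ such that

$$
\begin{aligned}
A(\boldsymbol{y}; u(\boldsymbol{y}), v) &= \lambda(\boldsymbol{y}) \langle u(\boldsymbol{y}), v \rangle \quad \text{for all } v \in V , \\
\|u(\boldsymbol{y})\|_{L^2} &= 1 .
\end{aligned}
\tag{1.3}
$$

It follows from Assumption A1 that the bilinear form $A(\boldsymbol{y})$ is coercive and bounded, uniformly in $\boldsymbol{y}$:

$$
A(\boldsymbol{y}; v, v) \ge a_{\min} \|v\|_V^2 , \quad \text{for all } v \in V , \tag{1.4}
$$
$$
A(\boldsymbol{y}; w, v) \le a_{\max} \|w\|_V \|v\|_V , \quad \text{for all } w, v \in V . \tag{1.5}
$$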
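For completeness, the two bounds can be checked in one line each. The following is a short sketch of the standard argument, using only Assumption A1.2, the Cauchy–Schwarz inequality and the notation introduced above; it is not spelled out in the original text.

```latex
\begin{align*}
A(\boldsymbol{y}; v, v)
  &= \int_D a(\boldsymbol{x}, \boldsymbol{y})\, |\nabla v(\boldsymbol{x})|^2 \,\mathrm{d}\boldsymbol{x}
   \;\ge\; a_{\min} \|\nabla v\|_{L^2}^2
   \;=\; a_{\min} \|v\|_V^2 , \\
A(\boldsymbol{y}; w, v)
  &\le a_{\max} \int_D |\nabla w(\boldsymbol{x})|\, |\nabla v(\boldsymbol{x})| \,\mathrm{d}\boldsymbol{x}
   \;\le\; a_{\max} \|\nabla w\|_{L^2} \|\nabla v\|_{L^2}
   \;=\; a_{\max} \|w\|_V \|v\|_V .
\end{align*}
```

Both constants depend on the coefficient only through $a_{\min}$ and $a_{\max}$, which is why the bounds hold uniformly in $\boldsymbol{y}$.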

As a consequence, for each $\boldsymbol{y}$ we have a self-adjoint and coercive eigenvalue problem. Therefore, it is well known (see, e.g., [3, 9]) that (1.3) has a countable sequence of positive, real eigenvalues, which (counting multiplicities) we write as $0 < \lambda_1(\boldsymbol{y}) \le \lambda_2(\boldsymbol{y}) \le \cdots$, and the corresponding eigenvectors are denoted by $u_1(\boldsymbol{y}), u_2(\boldsymbol{y}), \ldots \in V$. The min-max principle [3, (8.36)]

$$
\lambda_k(\boldsymbol{y}) = \min_{\substack{V_k \subset V \\ \dim(V_k) = k}} \; \max_{v \in V_k} \frac{A(\boldsymbol{y}; v, v)}{\langle v, v \rangle}
$$

allows us to bound each eigenvalue above and below independently of $\boldsymbol{y}$. Indeed, by (1.4) and (1.5) we have

$$
a_{\min} \min_{\substack{V_k \subset V \\ \dim(V_k) = k}} \; \max_{v \in V_k} \frac{\langle \nabla v, \nabla v \rangle}{\langle v, v \rangle}
\;\le\; \lambda_k(\boldsymbol{y}) \;\le\;
a_{\max} \min_{\substack{V_k \subset V \\ \dim(V_k) = k}} \; \max_{v \in V_k} \frac{\langle \nabla v, \nabla v \rangle}{\langle v, v \rangle} .
$$

Now, using the min-max properties of the $k$th eigenvalue of the Dirichlet Laplacian on $D$, which we denote by $\chi_k$, the bounds above simplify to

$$
a_{\min} \chi_k \le \lambda_k(\boldsymbol{y}) \le a_{\max} \chi_k . \tag{1.6}
$$
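As a rough numerical illustration of (1.6), the sketch below discretises a one-dimensional analogue of (1.1) on $D = (0, 1)$ by finite differences and checks the two-sided bound for the first two eigenvalues over a few random parameter draws. Everything specific here is an assumption made for the example and not taken from the text: the basis functions $a_j(x) = c_0 j^{-2} \sin(j\pi x)$, the truncation at $s = 10$ terms, the finite-difference scheme (instead of finite elements) and all function names. In this discrete setting the analogue of (1.6) holds with $\chi_k$ replaced by the eigenvalues of the discrete Dirichlet Laplacian on the same grid.

```python
import math
import random


def sturm_count(diag, off, x):
    """Number of eigenvalues below x of the symmetric tridiagonal matrix (diag, off)."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        q = d - x - (off[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = -1e-30  # nudge an exactly zero pivot so the recurrence can continue
        if q < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, iterations=100):
    """k-th smallest eigenvalue (k = 1, 2, ...) via bisection on the Sturm count."""
    bound = max(abs(d) for d in diag) + 2.0 * max(abs(e) for e in off)
    lo, hi = -bound, bound  # Gershgorin-type enclosure of the whole spectrum
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sturm_count(diag, off, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def fd_matrix(coeff, n):
    """Finite-difference matrix of -(a u')' on (0, 1) with Dirichlet conditions."""
    h = 1.0 / (n + 1)
    a_half = [coeff((i + 0.5) * h) for i in range(n + 1)]  # a at cell midpoints
    diag = [(a_half[i] + a_half[i + 1]) / h**2 for i in range(n)]
    off = [-a_half[i + 1] / h**2 for i in range(n - 1)]
    return diag, off


def eigenvalue_bounds_demo(n=50, s=10, c0=1.0, samples=20, seed=0):
    """Return (lambda_1, lambda_2, chi_1, chi_2, a_min, a_max) per parameter draw."""
    rng = random.Random(seed)
    h = 1.0 / (n + 1)
    half_sum = 0.5 * c0 * sum(1.0 / j**2 for j in range(1, s + 1))
    a_min, a_max = 1.0 - half_sum, 1.0 + half_sum
    # Eigenvalues of the discrete Dirichlet Laplacian on the same grid.
    chi1, chi2 = (4.0 / h**2 * math.sin(k * math.pi * h / 2.0) ** 2 for k in (1, 2))
    rows = []
    for _ in range(samples):
        y = [rng.uniform(-0.5, 0.5) for _ in range(s)]

        def coeff(x, y=y):
            # Hypothetical basis functions a_j(x) = c0 * j**-2 * sin(j*pi*x).
            return 1.0 + sum(
                y[j - 1] * c0 / j**2 * math.sin(j * math.pi * x)
                for j in range(1, s + 1)
            )

        diag, off = fd_matrix(coeff, n)
        lam1, lam2 = (kth_eigenvalue(diag, off, k) for k in (1, 2))
        rows.append((lam1, lam2, chi1, chi2, a_min, a_max))
    return rows


if __name__ == "__main__":
    for lam1, lam2, chi1, chi2, a_min, a_max in eigenvalue_bounds_demo(samples=3):
        print(f"{a_min * chi1:.3f} <= {lam1:.3f} <= {a_max * chi1:.3f}")
        print(f"{a_min * chi2:.3f} <= {lam2:.3f} <= {a_max * chi2:.3f}")
```

The bisection solver is used only to keep the example free of external dependencies; any symmetric eigenvalue solver would serve equally well.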

To consider our problem in the framework of Kato [10] for perturbations of linear operators, we introduce, for each $\boldsymbol{y} \in U$, the solution operator $T(\boldsymbol{y}) : V^* \to V$, which for $f \in V^*$ is defined by

$$
A(\boldsymbol{y}; T(\boldsymbol{y}) f, v) = \langle f, v \rangle \quad \text{for all } v \in V . \tag{1.7}
$$

Clearly, $\mu = 1/\lambda$ is an eigenvalue of $T(\boldsymbol{y})$ if and only if $\lambda$ is an eigenvalue of (1.3), and their eigenspaces coincide. Alternatively, we can consider the operator $T(\boldsymbol{y}) : L^2(D) \to L^2(D)$. In this case, $T(\boldsymbol{y})$ is self-adjoint with respect to the $L^2$ inner product due to the symmetry of $A(\boldsymbol{y})$; it is compact because $V \subset\subset L^2(D)$; and finally, it is bounded due to the Lax–Milgram theorem, which states that for each $f \in L^2(D)$ there is a unique $T(\boldsymbol{y}) f \in V$ satisfying (1.7), with

$$
\|T(\boldsymbol{y}) f\|_V \le \frac{1}{a_{\min}} \|f\|_{V^*} \le \frac{1}{a_{\min} \sqrt{\chi_1}} \|f\|_{L^2} . \tag{1.8}
$$

In the last inequality, we have used the Poincaré inequality:

$$
\|v\|_{L^2} \le \chi_1^{-1/2} \|v\|_V , \quad \text{for } v \in V , \tag{1.9}
$$

and a standard duality argument. Note that we have expressed the Poincaré constant in terms of the smallest eigenvalue of the Dirichlet Laplacian on $D$, using again the min-max principle.
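The two inequalities in (1.8) can be unpacked as follows. This is a brief sketch of the standard argument for $f \in L^2(D)$, combining the coercivity (1.4), the definition (1.7), the Cauchy–Schwarz inequality and the Poincaré inequality (1.9); it is included only as a worked step and is not part of the original text.

```latex
\begin{align*}
a_{\min} \|T(\boldsymbol{y}) f\|_V^2
  &\le A\big(\boldsymbol{y}; T(\boldsymbol{y}) f, T(\boldsymbol{y}) f\big)
   = \langle f, T(\boldsymbol{y}) f \rangle
   \le \|f\|_{V^*} \, \|T(\boldsymbol{y}) f\|_V , \\
\|f\|_{V^*}
  &= \sup_{0 \ne v \in V} \frac{\langle f, v \rangle}{\|v\|_V}
   \le \sup_{0 \ne v \in V} \frac{\|f\|_{L^2} \|v\|_{L^2}}{\|v\|_V}
   \le \chi_1^{-1/2} \|f\|_{L^2} .
\end{align*}
```

Dividing the first line by $a_{\min} \|T(\boldsymbol{y}) f\|_V$ gives the first inequality in (1.8), and inserting the second line gives the last one.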

2 Bounding the spectral gap

The Krein–Rutman theorem guarantees that for every $\boldsymbol{y}$ the fundamental eigenvalue $\lambda_1(\boldsymbol{y})$ is simple, see, e.g., [9, Theorems 1.2.5 and 1.2.6]. However, it does not provide any quantitative statements about the size of the spectral gap, $\lambda_2(\boldsymbol{y}) - \lambda_1(\boldsymbol{y})$, for different parameter values $\boldsymbol{y}$. As discussed in the Introduction, when studying numerical methods for eigenvalue problems in a UQ setting (see, e.g., [8]) several areas of the error analysis require uniform positivity of this gap over all $\boldsymbol{y} \in U$. Here, we prove the required uniform positivity under the conditions of Assumption A1, in particular A1.3.

An explicit bound on the spectral gap can be obtained in slightly different settings or by assuming tighter restrictions on the coefficients. For example, for Schrödinger operators $(-\Delta + V)$ on $D$ with a weakly convex potential $V$ and Dirichlet boundary conditions, [2] gives an explicit lower bound on the fundamental gap. Alternatively, using the upper and lower bounds on the eigenvalues (1.6), we can determine restrictions on $a_{\min}$ and $a_{\max}$ such that the gap is bounded away from 0. Explicitly, if the coefficient $a$ is such that $a_{\min}$ and $a_{\max}$ satisfy

$$
\frac{a_{\min}}{a_{\max}} > \frac{\chi_1}{\chi_2} , \tag{2.1}
$$

then, by (1.6), $\lambda_2(\boldsymbol{y}) - \lambda_1(\boldsymbol{y}) \ge a_{\min} \chi_2 - a_{\max} \chi_1 > 0$. However, the condition (2.1) may prove to be too restrictive.

The general idea of our proof is to use the continuity of the eigenvalues to show that a non-zero minimum of the gap exists. A complication that arises in this strategy is that the parameter domain $U$ is not compact, so we cannot immediately conclude the existence of such a minimum; we know that $U$ cannot be compact in the topology of $\ell^\infty$ because it is the unit ball of $\ell^\infty$, and the unit ball of an infinite-dimensional Banach space is not compact. Our solution is based on the fact that although there are infinitely many parameters, because of the decay of the terms in the coefficient (see Assumption A1.3), the contribution of a parameter $y_j$ decreases as $j$ increases. Specifically, we reparametrise (1.3) as an equivalent eigenvalue problem whose parameters do belong to a compact set. The first step is the following elementary lemma, which shows that subsets of $\ell^\infty$ that are majorised by an $\ell^q$ sequence (for some $1 < q < \infty$) are compact.

Lemma 1 Let $\boldsymbol{\alpha} \in \ell^q$ for some $1 < q < \infty$. The set $U(\boldsymbol{\alpha}) \subset \ell^\infty$ given by

$$
U(\boldsymbol{\alpha}) := \Big\{ \boldsymbol{w} \in \ell^\infty : |w_j| \le \tfrac12 |\alpha_j| \Big\}
$$

is a compact subset of $\ell^\infty$.

Proof. Since $\ell^\infty$ is a normed (and hence a metric) space, $U(\boldsymbol{\alpha})$ is compact if and only if it is sequentially compact. To show sequential compactness of $U(\boldsymbol{\alpha})$, take any sequence $\{\boldsymbol{y}^{(n)}\}_{n \ge 1} \subset U(\boldsymbol{\alpha})$. Clearly, by definition of $U(\boldsymbol{\alpha})$, each $\boldsymbol{y}^{(n)} \in \ell^q$ and moreover,

$$
\|\boldsymbol{y}^{(n)}\|_{\ell^q} \le \tfrac12 \|\boldsymbol{\alpha}\|_{\ell^q} < \infty \quad \text{for all } n \in \mathbb{N} .
$$

So $\{\boldsymbol{y}^{(n)}\}_{n \ge 1}$ is a bounded sequence in $\ell^q$. Since $1 < q < \infty$, $\ell^q$ is a reflexive Banach space, and so by [4, Theorem 3.18] $\{\boldsymbol{y}^{(n)}\}_{n \ge 1}$ has a subsequence that converges weakly to a limit in $\ell^q$. We denote this limit by $\boldsymbol{y}^*$, and, with a slight abuse of notation, we denote the convergent subsequence again by $\{\boldsymbol{y}^{(n)}\}_{n \ge 1}$. We now prove that $\boldsymbol{y}^* \in U(\boldsymbol{\alpha})$ and that the weak convergence is in fact strong, i.e. we show $\boldsymbol{y}^{(n)} \to \boldsymbol{y}^*$ in $\ell^\infty$, as $n \to \infty$.

For any $j \in \mathbb{N}$, consider the linear functional $f_j : \ell^q \to \mathbb{R}$ given by $f_j(\boldsymbol{w}) = w_j$, where $w_j$ denotes the $j$th element of the sequence $\boldsymbol{w} = (w_j)_{j \ge 1} \in \ell^q$. Clearly, $f_j \in (\ell^q)^*$ (the dual space) and using the weak convergence established above, it follows that

$$
y_j^{(n)} = f_j(\boldsymbol{y}^{(n)}) \to f_j(\boldsymbol{y}^*) = y_j^* \quad \text{as } n \to \infty , \quad \text{for each fixed } j .
$$

That is, we have componentwise convergence. Furthermore, since $|y_j^{(n)}| \le \tfrac12 |\alpha_j|$ it follows that $|y_j^*| \le \tfrac12 |\alpha_j|$ for each $j$, and hence $\boldsymbol{y}^* \in U(\boldsymbol{\alpha})$. Now, for any $J \in \mathbb{N}$ we can write

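$$
\begin{aligned}
\|\boldsymbol{y}^{(n)} - \boldsymbol{y}^*\|_{\ell^q}^q
&= \sum_{j=1}^{J} |y_j^{(n)} - y_j^*|^q + \sum_{j=J+1}^{\infty} |y_j^{(n)} - y_j^*|^q \\
&\le J \max_{j = 1, 2, \ldots, J} |y_j^{(n)} - y_j^*|^q + \sum_{j=J+1}^{\infty} |\alpha_j|^q .
\end{aligned}
\tag{2.2}
$$

Let $\varepsilon > 0$. Since $\boldsymbol{\alpha} \in \ell^q$, we can choose $J \in \mathbb{N}$ such that

$$
\sum_{j=J+1}^{\infty} |\alpha_j|^q \le \frac{\varepsilon^q}{2} ,
$$

and since $\boldsymbol{y}^{(n)}$ converges componentwise we can choose $K \in \mathbb{N}$ such that

$$
|y_j^{(n)} - y_j^*| \le (2J)^{-1/q} \varepsilon \quad \text{for all } j = 1, 2, \ldots, J \text{ and } n \ge K .
$$

Thus, by (2.2) we have $\|\boldsymbol{y}^{(n)} - \boldsymbol{y}^*\|_{\ell^q}^q \le \varepsilon^q$ for all $n \ge K$, and hence $\|\boldsymbol{y}^{(n)} - \boldsymbol{y}^*\|_{\ell^q} \to 0$ as $n \to \infty$. Because $\|\boldsymbol{w}\|_{\ell^\infty} \le \|\boldsymbol{w}\|_{\ell^q}$ when $\boldsymbol{w} \in \ell^q$ and $1 < q < \infty$, this also implies that $\boldsymbol{y}^{(n)} \to \boldsymbol{y}^*$ in $\ell^\infty$, completing the proof. □

A key property following from the perturbation theory of Kato [10] is that the eigenvalues $\lambda_k(\boldsymbol{y})$ are continuous in $\boldsymbol{y}$, which for completeness is shown below in Proposition 2. First, recall that $T(\boldsymbol{y})$ is the solution operator as defined in (1.7), and let $\Sigma(T(\boldsymbol{y}))$ denote the spectrum of $T(\boldsymbol{y})$.

Proposition 2 Let Assumption A1 hold. Then the eigenvalues $\lambda_1, \lambda_2, \ldots$ are Lipschitz continuous in $\boldsymbol{y}$.

Proof. We prove the result by establishing the continuity of the eigenvalues $\mu_k(\boldsymbol{y})$ of $T(\boldsymbol{y})$. Let $\boldsymbol{y}, \boldsymbol{y}' \in U$ and consider the operators $T(\boldsymbol{y}), T(\boldsymbol{y}') : L^2(D) \to L^2(D)$ as defined in (1.7). Since $T(\boldsymbol{y}), T(\boldsymbol{y}')$ are bounded and self-adjoint with respect to $\langle \cdot, \cdot \rangle$, it follows from [10, V, §4.3 and Theorem 4.10] that we have the following notion of continuity of $\mu(\cdot)$ in terms of $T(\cdot)$

$$
\sup_{\mu \in \Sigma(T(\boldsymbol{y}))} \operatorname{dist}\big(\mu, \Sigma(T(\boldsymbol{y}'))\big) \le \|T(\boldsymbol{y}) - T(\boldsymbol{y}')\|_{L^2 \to L^2} . \tag{2.3}
$$

For an eigenvalue $\mu_k(\boldsymbol{y}) \in \Sigma(T(\boldsymbol{y}))$, (2.3) implies that there exists a $\mu_{k'}(\boldsymbol{y}') \in \Sigma(T(\boldsymbol{y}'))$ such that

$$
|\mu_k(\boldsymbol{y}) - \mu_{k'}(\boldsymbol{y}')| \le \|T(\boldsymbol{y}) - T(\boldsymbol{y}')\|_{L^2 \to L^2} . \tag{2.4}
$$

Note that this means there exists an eigenvalue of $T(\boldsymbol{y}')$ close to $\mu_k(\boldsymbol{y})$, but does not imply that the $k$th eigenvalue of $T(\boldsymbol{y}')$ is close to $\mu_k(\boldsymbol{y})$, that is, in (2.4) $k$ is not necessarily equal to $k'$. However, consider any $\mu_k(\boldsymbol{y})$ and let $m$ denote its multiplicity. Since $m < \infty$, we can assume without loss of generality that the collection $\mu_k(\boldsymbol{y}) = \mu_{k+1}(\boldsymbol{y}) = \cdots = \mu_{k+m-1}(\boldsymbol{y})$ is a finite system of eigenvalues in the sense of Kato. It then follows from the discussion in [10, IV, §3.5] that the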


eigenvalues in this system depend continuously on the operator with multiplicity preserved. This preservation of multiplicity is key to our argument, since it states that for $T(\boldsymbol{y}')$ sufficiently close to $T(\boldsymbol{y})$ there are $m$ consecutive eigenvalues $\mu_{k'}(\boldsymbol{y}'), \mu_{k'+1}(\boldsymbol{y}'), \ldots, \mu_{k'+m-1}(\boldsymbol{y}') \in \Sigma(T(\boldsymbol{y}'))$, no longer necessarily equal, that are close to $\mu_k(\boldsymbol{y})$. A simple argument then shows that each $\mu_k$ is continuous in the following sense

$$
|\mu_k(\boldsymbol{y}) - \mu_k(\boldsymbol{y}')| \le \|T(\boldsymbol{y}) - T(\boldsymbol{y}')\|_{L^2 \to L^2} . \tag{2.5}
$$

To see this, consider, for $k = 1, 2, \ldots$, the graphs of $\mu_k$ on $U$. Note that the separate graphs can touch (and in principle can even coincide over some subset of $U$), but by definition cannot cross (since at every point in $U$ the successive eigenvalues are nonincreasing); and by the preservation of multiplicity a graph cannot terminate and a finite set of graphs cannot change multiplicity at an interior point. Thus by (2.4) the ordered eigenvalues $\mu_k$ must be continuous for each $k \ge 1$ and satisfy (2.5).

It then follows from the relationship $\mu_k(\boldsymbol{y}) = 1/\lambda_k(\boldsymbol{y})$ along with the upper bound in (1.6) that we have a similar result for the eigenvalues $\lambda_k$ of (1.3):

$$
|\lambda_k(\boldsymbol{y}) - \lambda_k(\boldsymbol{y}')| \le (a_{\max} \chi_k)^2 \, \|T(\boldsymbol{y}) - T(\boldsymbol{y}')\|_{L^2 \to L^2} . \tag{2.6}
$$

All that remains is to bound the right hand side of (2.6) by $C_{\mathrm{Lip}} \|\boldsymbol{y} - \boldsymbol{y}'\|_{\ell^\infty}$, with $C_{\mathrm{Lip}} > 0$ independent of $\boldsymbol{y}$ and $\boldsymbol{y}'$. To this end, note that since the right hand side of (1.7) is independent of $\boldsymbol{y}$ we have

$$
A(\boldsymbol{y}; T(\boldsymbol{y}) f, v) = A(\boldsymbol{y}'; T(\boldsymbol{y}') f, v) \quad \text{for all } f \in L^2(D),\ v \in V .
$$

Rearranging and then expanding this gives

$$
\begin{aligned}
A\big(\boldsymbol{y}; (T(\boldsymbol{y}) - T(\boldsymbol{y}')) f, v\big)
&= A(\boldsymbol{y}'; T(\boldsymbol{y}') f, v) - A(\boldsymbol{y}; T(\boldsymbol{y}') f, v) \\
&= \int_D \big( a(\boldsymbol{x}, \boldsymbol{y}') - a(\boldsymbol{x}, \boldsymbol{y}) \big) \nabla [T(\boldsymbol{y}') f](\boldsymbol{x}) \cdot \nabla v(\boldsymbol{x}) \, \mathrm{d}\boldsymbol{x} .
\end{aligned}
$$

Letting $v = (T(\boldsymbol{y}) - T(\boldsymbol{y}')) f \in V$, the left hand side can be bounded from below using the coercivity (1.4) of $A(\boldsymbol{y})$, and the right hand side can be bounded from above using the Cauchy–Schwarz inequality to give

$$
a_{\min} \|(T(\boldsymbol{y}) - T(\boldsymbol{y}')) f\|_V^2 \le \|a(\boldsymbol{y}) - a(\boldsymbol{y}')\|_{L^\infty} \|T(\boldsymbol{y}') f\|_V \|(T(\boldsymbol{y}) - T(\boldsymbol{y}')) f\|_V .
$$

Dividing by $a_{\min} \|(T(\boldsymbol{y}) - T(\boldsymbol{y}')) f\|_V$ and using the upper bound in (1.8) we have

$$
\|(T(\boldsymbol{y}) - T(\boldsymbol{y}')) f\|_V \le \frac{1}{a_{\min}^2 \sqrt{\chi_1}} \|f\|_{L^2} \|a(\boldsymbol{y}) - a(\boldsymbol{y}')\|_{L^\infty} .
$$

Then, applying the Poincaré inequality (1.9) to the left hand side and taking the supremum over $f \in L^2(D)$ with $\|f\|_{L^2} \le 1$, in the operator norm we have

$$
\|T(\boldsymbol{y}) - T(\boldsymbol{y}')\|_{L^2 \to L^2} \le \frac{1}{a_{\min}^2 \chi_1} \|a(\boldsymbol{y}) - a(\boldsymbol{y}')\|_{L^\infty} .
$$

Using this inequality as an upper bound for (2.6) we see that the eigenvalues inherit the continuity of the coefficient, and so

$$
|\lambda_k(\boldsymbol{y}) - \lambda_k(\boldsymbol{y}')| \le \frac{a_{\max}^2 \chi_k^2}{a_{\min}^2 \chi_1} \|a(\boldsymbol{y}) - a(\boldsymbol{y}')\|_{L^\infty} , \tag{2.7}
$$

where the constant is clearly independent of $\boldsymbol{y}$ and $\boldsymbol{y}'$. Finally, to establish Lipschitz continuity with respect to $\boldsymbol{y}$, we recall Assumptions A1.1 and A1.3, expand the coefficients in (2.7) above and use the triangle inequality to give

$$
|\lambda_k(\boldsymbol{y}) - \lambda_k(\boldsymbol{y}')|
\le \frac{a_{\max}^2 \chi_k^2}{a_{\min}^2 \chi_1} \sum_{j=1}^{\infty} |y_j - y_j'| \, \|a_j\|_{L^\infty}
\le \frac{a_{\max}^2 \chi_k^2}{a_{\min}^2 \chi_1} \Big( \sum_{j=1}^{\infty} \|a_j\|_{L^\infty} \Big) \|\boldsymbol{y} - \boldsymbol{y}'\|_{\ell^\infty} .
$$

By Assumption A1 the sum is finite, and hence the eigenvalue $\lambda_k(\boldsymbol{y})$ is Lipschitz in $\boldsymbol{y}$, with the constant clearly independent of $\boldsymbol{y}$. □

Now that we have shown Lipschitz continuity of the eigenvalues and identified suitable compact subsets, we can prove the main result of this paper: namely, that the spectral gap is bounded away from 0 uniformly in $\boldsymbol{y}$. The strategy of the proof is to rewrite the coefficient as

$$
a(\boldsymbol{x}, \boldsymbol{y}) = a_0(\boldsymbol{x}) + \sum_{j=1}^{\infty} \tilde{y}_j \tilde{a}_j(\boldsymbol{x}) ,
$$

with $\tilde{y}_j = \alpha_j y_j$ and $\tilde{a}_j(\boldsymbol{x}) = a_j(\boldsymbol{x}) / \alpha_j$, choosing $\boldsymbol{\alpha} \in \ell^q$ to decay slowly enough such that $\sum_{j=1}^{\infty} \|\tilde{a}_j\|_{L^\infty} < \infty$ continues to hold. Then, using the intermediate result (2.7) from the proof of Proposition 2 we can show that the eigenvalues of the “reparametrised” problem are continuous in the new parameter $\tilde{\boldsymbol{y}}$, which now ranges over the compact set $U(\boldsymbol{\alpha})$. The required bound on the spectral gap is obtained by using the equivalence of the eigenvalues of the original and reparametrised problems.

Theorem 3. Let Assumption A1 hold. Then there exists a $\delta > 0$, independent of $\boldsymbol{y}$, such that

$$
\lambda_2(\boldsymbol{y}) - \lambda_1(\boldsymbol{y}) \ge \delta . \tag{2.8}
$$

Proof. We can assume, without loss of generality, that $p > 1/2$, because if Assumption A1.3 holds with exponent $p' \le 1/2$ then it also holds for all $p \in (p', 1)$. Consequently, set $\varepsilon = 1 - p \in (0, 1/2)$ and consider the sequence $\boldsymbol{\alpha}$ defined by

$$
\alpha_j = \|a_j\|_{L^\infty}^{\varepsilon} + 1/j , \quad \text{for each } j \in \mathbb{N} . \tag{2.9}
$$

Setting $q = p/\varepsilon = p/(1 - p) \in (1, \infty)$, using Assumption A1.3 and the triangle inequality, it is easy to see that $\boldsymbol{\alpha} \in \ell^q$. Moreover, the inclusion of $1/j$ in (2.9) ensures


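that $\alpha_j \ne 0$, for all $j \ge 1$. Hence, from now on, for $\boldsymbol{w} = (w_j)_{j=1}^{\infty} \in \ell^\infty$, we can define the sequences $\boldsymbol{\alpha} \boldsymbol{w} = (\alpha_j w_j)_{j=1}^{\infty}$ and $\boldsymbol{w}/\boldsymbol{\alpha} = (w_j / \alpha_j)_{j=1}^{\infty}$. Then, recalling the definition of $U(\boldsymbol{\alpha})$ in Lemma 1, it is easy to see that $\tilde{\boldsymbol{y}} \in U(\boldsymbol{\alpha})$ if and only if $\tilde{\boldsymbol{y}}/\boldsymbol{\alpha} \in U$, and moreover, $\boldsymbol{y} \in U$ if and only if $\boldsymbol{\alpha} \boldsymbol{y} \in U(\boldsymbol{\alpha})$. Now for $\boldsymbol{x} \in D$ and $\tilde{\boldsymbol{y}} \in U(\boldsymbol{\alpha})$, we define

$$
\tilde{a}(\boldsymbol{x}, \tilde{\boldsymbol{y}}) = a_0(\boldsymbol{x}) + \sum_{j=1}^{\infty} \tilde{y}_j \frac{a_j(\boldsymbol{x})}{\alpha_j} ,
$$

from which it is easily seen that

$$
\tilde{a}(\boldsymbol{x}, \tilde{\boldsymbol{y}}) = a(\boldsymbol{x}, \tilde{\boldsymbol{y}}/\boldsymbol{\alpha}) . \tag{2.10}
$$

Then we set

$$
\tilde{A}(\tilde{\boldsymbol{y}}; w, v) := \int_D \tilde{a}(\boldsymbol{x}, \tilde{\boldsymbol{y}}) \, \nabla w(\boldsymbol{x}) \cdot \nabla v(\boldsymbol{x}) \, \mathrm{d}\boldsymbol{x} \quad \text{for } w, v \in V ,
$$

and we consider the following reparametrised eigenvalue problem: Find $\tilde{\lambda}(\tilde{\boldsymbol{y}}) \in \mathbb{R}$ and $0 \ne \tilde{u}(\tilde{\boldsymbol{y}}) \in V$ such that

$$
\begin{aligned}
\tilde{A}(\tilde{\boldsymbol{y}}; \tilde{u}(\tilde{\boldsymbol{y}}), v) &= \tilde{\lambda}(\tilde{\boldsymbol{y}}) \langle \tilde{u}(\tilde{\boldsymbol{y}}), v \rangle \quad \text{for all } v \in V , \\
\|\tilde{u}(\tilde{\boldsymbol{y}})\|_{L^2} &= 1 .
\end{aligned}
\tag{2.11}
$$

Note that because we have equality between the original and reparametrised coefficients (2.10), for each $\boldsymbol{y} \in U$, and corresponding $\tilde{\boldsymbol{y}} = \boldsymbol{\alpha} \boldsymbol{y} \in U(\boldsymbol{\alpha})$, (2.10) implies that there is equality between eigenvalues $\lambda_k(\boldsymbol{y})$ of (1.3) and $\tilde{\lambda}_k(\tilde{\boldsymbol{y}})$ of the reparametrised eigenvalue problem (2.11)

$$
\lambda_k(\boldsymbol{y}) = \tilde{\lambda}_k(\tilde{\boldsymbol{y}}) \quad \text{for } k \in \mathbb{N} , \tag{2.12}
$$

and their eigenspaces coincide. Moreover, for an eigenvalue $\tilde{\lambda}_k(\tilde{\boldsymbol{y}})$ of (2.11), using (2.12) in the inequality (2.7) we have

$$
|\tilde{\lambda}_k(\tilde{\boldsymbol{y}}) - \tilde{\lambda}_k(\tilde{\boldsymbol{y}}')| \le \frac{a_{\max}^2 \chi_k^2}{a_{\min}^2 \chi_1} \|a(\tilde{\boldsymbol{y}}/\boldsymbol{\alpha}) - a(\tilde{\boldsymbol{y}}'/\boldsymbol{\alpha})\|_{L^\infty} ,
$$

which after expanding the coefficient and using the triangle inequality becomes

$$
|\tilde{\lambda}_k(\tilde{\boldsymbol{y}}) - \tilde{\lambda}_k(\tilde{\boldsymbol{y}}')| \le \underbrace{\frac{a_{\max}^2 \chi_k^2}{a_{\min}^2 \chi_1} \sum_{j=1}^{\infty} \frac{1}{\alpha_j} \|a_j\|_{L^\infty}}_{\tilde{C}_{\mathrm{Lip}}} \, \|\tilde{\boldsymbol{y}} - \tilde{\boldsymbol{y}}'\|_{\ell^\infty} ,
$$

where $\tilde{C}_{\mathrm{Lip}}$ is clearly independent of $\tilde{\boldsymbol{y}}$ and $\tilde{\boldsymbol{y}}'$. Now by (2.9) together with Assumption A1, we have

$$
\sum_{j=1}^{\infty} \frac{\|a_j\|_{L^\infty}}{\alpha_j} \le \sum_{j=1}^{\infty} \|a_j\|_{L^\infty}^{1 - \varepsilon} = \sum_{j=1}^{\infty} \|a_j\|_{L^\infty}^{p} < \infty .
$$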
s, where $a_0 > 0$ and $c_0 \in \mathbb{R}$ will determine the values of $a_{\min}$ and $a_{\max}$. Note that the stochastic dimension is here fixed at $s = 100$. Since multiplying the coefficient by any constant factor will simply rescale the eigenvalues by that same factor, without loss of generality we can henceforth set $a_0 = 1$ and vary $c_0$. For a coefficient (1.2) given by the basis functions (3.1), we can obtain a formula for the bounds $a_{\min}$, $a_{\max}$ as follows

$$
a_{\min} = 1 - \frac{c_0}{2} \sum_{j=1}^{s} \frac{1}{j^2} \approx 1 - 0.8175\, c_0 ,
\qquad
a_{\max} = 1 + \frac{c_0}{2} \sum_{j=1}^{s} \frac{1}{j^2} \approx 1 + 0.8175\, c_0 .
$$
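The formula for the bounds is easy to evaluate directly. The short sketch below does so for $s = 100$ and for the three values of $c_0$ used in the experiments reported later in this section; the function name `coefficient_bounds` is purely illustrative, and the only input is the formula stated above with $a_0 = 1$.

```python
def coefficient_bounds(c0, s=100, a0=1.0):
    """Evaluate a_min and a_max from the formula a0 -/+ (c0 / 2) * sum_{j<=s} 1 / j**2."""
    half_sum = 0.5 * sum(1.0 / j**2 for j in range(1, s + 1))
    return a0 - c0 * half_sum, a0 + c0 * half_sum


if __name__ == "__main__":
    for c0 in (1.0, 1.223, 0.5):
        a_min, a_max = coefficient_bounds(c0)
        print(f"c0 = {c0}: a_min = {a_min:.4f}, a_max = {a_max:.4f}")
```

The partial sum over 100 terms is approximately 1.635, so the factor multiplying $c_0$ is about 0.8175, which reproduces the rounded values of $a_{\min}$ and $a_{\max}$ quoted for Figures 1–3.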

We remark that these bounds are not sharp. We also consider the so-called log-normal coefficient:

$$
a(\boldsymbol{x}, \boldsymbol{y}) = a_* + \exp\Big( \sum_{j=1}^{s} \Phi^{-1}\big(y_j + \tfrac12\big)\, a_j(\boldsymbol{x}) \Big) , \tag{3.2}
$$


where $a_* \ge 0$, $a_j$ are as in (3.1), and $\Phi^{-1}$ is the inverse of the normal cumulative distribution function. In this way each $\Phi^{-1}(y_j + \tfrac12) \sim N(0, 1)$. Since $\Phi^{-1}$ maps $[0, 1]$ to $\mathbb{R}$, the coefficient (3.2) is unbounded, and although it is positive ($a_{\min} = a_*$), for $a_* = 0$ it could be arbitrarily close to 0. So in this case Assumption A1.2 does not hold, our compactness argument fails and we have no theoretical prediction.

To approximate the eigenvalue problem in space we use piecewise linear finite elements on a uniform mesh, with a meshwidth of $h = 1/64$. Numerical tests for different meshwidths ($h = 1/8$ to $h = 1/128$) produced qualitatively the same results.

The purpose of this section is to provide supporting evidence that for the problem above the gap remains bounded. We do this in a brute force manner by studying the minimum of the gap over a large number of parameter realisations. The parameter realisations are generated by a QMC point set; specifically, a base-2 embedded lattice rule (see [5]) with up to $2^{20}$ points and a single random shift. To generate the points we use the generating vector exod2_base2_m20_CKN from [12]. We choose QMC points as the test set because they can be shown to be well-distributed in high dimensions, see e.g. [5, 6]. The goal is not to find the minimum, but to provide evidence that as more and more of the parameter domain is searched (which corresponds to more realisations), the minimum of the gap over all realisations approaches a constant value.

The results are given in Figures 1–5, where in each figure we plot the minimum of the gap against the number of realisations $N$ (blue circles) and an estimate (see the next paragraph) of the distance between the estimated minimum and the true minimum (black triangles), along with a least-squares fit to $\alpha N^{-\beta}$ (dashed red line) of this estimate. Each data point corresponds to a doubling of the number of QMC points: $N = 1, 2, 4, \ldots, 2^{20}$, and the axes are in log-log scale.
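One common way to obtain a least-squares fit of the form $\alpha N^{-\beta}$ is linear regression in log-log coordinates, since $\log(\alpha N^{-\beta}) = \log \alpha - \beta \log N$. The sketch below illustrates this on synthetic data; the text does not say how the fit in the figures was actually computed, so both the log-log approach and the function name `fit_power_law` are assumptions made for illustration.

```python
import math


def fit_power_law(ns, values):
    """Least-squares fit of values ~ alpha * N**(-beta) in log-log coordinates."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(v) for v in values]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope  # (alpha, beta)


if __name__ == "__main__":
    # Synthetic data that follow an exact power law, sampled at N = 1, 2, 4, ...
    ns = [2**m for m in range(20)]
    values = [3.0 * n**-0.75 for n in ns]
    alpha, beta = fit_power_law(ns, values)
    print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

All data values must be strictly positive for the logarithm to be defined, so any point where the estimated distance is zero would have to be excluded before fitting.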
Letting $\delta_N$ denote the approximate minimum over the first $N$ realisations, we estimate the distance to the true minimum by

$$
\delta_N - \min_{\boldsymbol{y} \in U} \big( \lambda_2(\boldsymbol{y}) - \lambda_1(\boldsymbol{y}) \big) \approx \delta_N - \delta_{N^*} ,
$$

where $\delta_{N^*}$ corresponds to the most accurate estimate of the minimum, with $N^* = 2^{20}$. The purpose of including such an estimate of the distance to the true minimum is to demonstrate that not only does the minimum of the gap appear to plateau, but that the differences also decay like a power of $N$.

First we consider the affine coefficient in (1.2). The three different choices of $c_0$ are: in Figure 1 $c_0 = 1$, which gives $a_{\min} = 0.18$ and $a_{\max} = 1.82$; in Figure 2 $c_0 = 1.223$, which gives $a_{\min} = 2 \times 10^{-4}$ and $a_{\max} = 2$; and in Figure 3 $c_0 = 0.5$, which gives $a_{\min} = 0.59$ and $a_{\max} = 1.41$. Then, in Figure 4 we plot the log-normal coefficient (3.2) with $a_{\min} = a_* = 0$ and $c_0 = 1$, and in Figure 5 we plot the lognormal coefficient with $a_{\min} = a_* = 0.18$ and $c_0 = 1$.

For each different choice of the affine coefficient (1.2) (Figures 1, 2, 3), the minimum value of the gap appears to plateau and approach a nonzero minimum. The minimum of the gap seems fairly insensitive to changes in $c_0$. However, in Figures 4 and 5 for the log-normal coefficient (which, we recall, is not covered by the theory of the current work) the results are inconclusive. It appears that the gap tends to 0


in the case of a true lognormal coefficient ($a_* = 0$, Figure 4), and that it is bounded away from 0 for the “regularised” lognormal coefficient with $a_* = 0.18$ in Figure 5. However, in Figure 4 the smallest computed value of the gap is close to 1, and it is possible that it will plateau for a denser set of QMC points. Also, in Figure 5 the plateau is not as clearly developed as in Figures 1–3. It remains an open question if the gap can be bounded in the case of the lognormal coefficient (3.2), and whether $a_*$ needs to be strictly positive.

Fig. 1 Estimate of the minimum of the spectral gap and estimate of the distance to the true minimum: affine coefficient (1.2) with a0 = c0 = 1, amin = 0.18, amax = 1.82.

Fig. 2 Estimate of the minimum of the spectral gap and estimate of the distance to the true minimum: affine coefficient (1.2) with a0 = 1, c0 = 1.223, amin = 2 × 10⁻⁴, amax = 2.


Fig. 3 Estimate of the minimum of the spectral gap and estimate of the distance to the true minimum: affine coefficient (1.2) with a0 = 1, c0 = 0.5, amin = 0.59, amax = 1.41.

Fig. 4 Estimate of the minimum of the spectral gap and estimate of the distance to the true minimum: log-normal coefficient (3.2) with amin = a∗ = 0 and c0 = 1.

Fig. 5 Estimate of the minimum of the spectral gap and estimate of the distance to the true minimum: log-normal coefficient (3.2) with amin = a∗ = 0.18 and c0 = 1.


4 Conclusion

The spectral gap is an important quantity that occurs throughout several areas of the numerical analysis of eigenvalue problems, and in this work we proved that, under certain conditions on the coefficient, the spectral gap of a random elliptic eigenvalue problem is uniformly bounded from below. In all of our numerical experiments the results strongly suggest that the minimum of the gap approaches a nonzero constant value. The only exception is the lognormal coefficient, which is not covered by our theory.

References

1. R. Andreev and Ch. Schwab. Sparse tensor approximation of parametric eigenvalue problems. In I. G. Graham et al., editors, Numerical Analysis of Multiscale Problems, Lecture Notes in Computational Science and Engineering, 203–241. Springer-Verlag, Berlin Heidelberg, Germany, 2012.
2. B. Andrews and J. Clutterbuck. Proof of the fundamental gap conjecture. J. Amer. Math. Soc., 24, 899–916, 2011.
3. I. Babuška and J. Osborn. Eigenvalue problems. In P. G. Ciarlet and J. L. Lions, editors, Handbook of Numerical Analysis, Volume 2: Finite Element Methods (Part 1), 641–787. Elsevier Science, Amsterdam, The Netherlands, 1991.
4. H. Brezis. Functional Analysis, Sobolev Spaces and Partial Differential Equations. Universitext, Springer, New York, 2011.
5. R. Cools, F. Y. Kuo and D. Nuyens. Constructing embedded lattice rules for multivariate integration. SIAM J. Sci. Comp., 28, 2162–2188, 2006.
6. J. Dick, F. Y. Kuo, and I. H. Sloan. High-dimensional integration: The quasi-Monte Carlo way. Acta Numerica, 22, 133–288, 2013.
7. N. Dunford and J. T. Schwartz. Linear Operators, Part II: Spectral Theory, Self Adjoint Operators in Hilbert Space. Interscience Publishers, New York London, 1963.
8. A. D. Gilbert, I. G. Graham, F. Y. Kuo, R. Scheichl and I. H. Sloan. Analysis of quasi-Monte Carlo methods for elliptic eigenvalue problems with stochastic coefficients. Submitted, arXiv: https://arxiv.org/abs/1808.02639, 2018.
9. A. Henrot. Extremum Problems for Eigenvalues of Elliptic Operators. Birkhäuser Verlag, Basel, Switzerland, 2006.
10. T. Kato. Perturbation Theory for Linear Operators. Springer-Verlag, Berlin Heidelberg, Germany, 1984.
11. F. Y. Kuo, Ch. Schwab, and I. H. Sloan. Quasi-Monte Carlo finite element methods for a class of elliptic partial differential equations with random coefficients. SIAM J. Numer. Anal., 50, 3351–3374, 2012.
12. D. Nuyens. https://people.cs.kuleuven.be/~dirk.nuyens/qmc-generators/, accessed November 2, 2018.

On the Power of Restricted Monte Carlo Algorithms

Stefan Heinrich

Abstract We introduce a general notion of restricted Monte Carlo algorithms that generalizes previous notions in two ways: it includes full adaptivity and general (i.e. not only bit) restrictions. We show that for each such restricted setting there is a computational problem that can be solved in the general randomized setting but not under the restriction.

Stefan Heinrich, Department of Computer Science, University of Kaiserslautern, D-67653 Kaiserslautern, Germany

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_4

1 Introduction

Restricted Monte Carlo algorithms were considered in [11, 12, 15, 9, 13, 2, 17, 3, 4] (in the present paper the terms 'randomized' and 'Monte Carlo' will be used synonymously). Restriction usually means that the algorithm has access only to random bits or to random variables with finite range. In a number of numerical problems the admission of randomized algorithms brings considerable gains in terms of the convergence rate, e.g., in high dimensional integration, see [3, 5, 6, 12, 16] for this and other problems. So it is certainly of theoretical interest to understand how much randomness is really needed. Looking at algorithms that use random bits is an obvious way to quantify randomness.

Most of the papers on restricted randomized algorithms consider the non-adaptive case. Only [4] includes adaptivity, but considers a class of algorithms where each information call is tied to one random bit call. Since for a number of important problems essentially fewer random elements are needed than function values [9, 13, 2, 17], a general notion of adaptive restricted randomized algorithms is of interest.

Here we give such a general definition, which extends the previous notions in two ways: Firstly, we include full adaptivity, that is, the question whether to call a function value or a random element is decided solely on the basis of the outcome of the computation carried out so far. Secondly, we do not restrict the consideration to random bits, but include models in which the algorithms have access to an arbitrary, but fixed set of random variables, for example, uniform distributions on [0, 1]. This general definition was inspired by the approach to stochastic problems from [7, 8].

A first (simple, but technically somewhat involved) step to fit this notion into the existing IBC framework is to represent each restricted randomized algorithm as a general randomized algorithm of suitable cardinality. Secondly, and this is the main topic of the present paper, we study the question of the power of restricted as compared to general randomized algorithms. It became clear from the previous work that there are problems which can be solved by general Monte Carlo but not by Monte Carlo algorithms that use only random bits or random variables with finite range (see [15], Th. 3.1, or example 14 in [4], the latter credited to E. Novak). The question arises whether there are such problems for randomized algorithms with access to uniform distributions on [0, 1], or with access to an even more general, but still restricted set of random variables. We settle this question by showing that for each restriction model there is a problem which cannot be solved in this restricted model, but can be solved (in one step) by a general randomized algorithm. Our examples are based on cardinal number considerations and on probability measures on suitably large sets. This can be viewed as a generalization of the above-mentioned examples.

Let us finally mention that recently a very practical aspect came up, which motivates the consideration of restricted Monte Carlo algorithms.
The usual sequences of 'random numbers' on a computer are generated deterministically by number-theoretic methods; they are not random at all, but merely have appropriate statistical properties. With the appearance of quantum random number generators, the use of truly random bits has become realistic, see, e.g., [14] and the references therein. Consequently, the minimal number of random bits is now also of practical interest.

2 Restricted Randomized Algorithms in a General Setting

We work in the framework of information-based complexity theory (IBC) [12, 16]. First we recall the notions of deterministic and randomized algorithms in the IBC approach from [5, 6], see also [7, 8]. An abstract numerical problem $\mathcal P$ is given as

\[ \mathcal P = (F, G, S, K, \Lambda). \tag{1} \]

Here F is a non-empty set, G a Banach space and S is a mapping F → G. The operator S is called the solution operator, it sends the input f ∈ F of our problem to the exact solution S( f ). Moreover, Λ is a nonempty set of mappings from F to K, the set of information functionals, where K is any nonempty set - the set of values of information functionals.


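For illustration, let us consider an example (which will be used later on). Let $(Q, \mathcal F, \mu)$ be a probability space and let $L_2(Q, \mathcal F, \mu)$ denote the set of all $\mathcal F$-to-Borel measurable, $\mu$-square integrable functions $f : Q \to \mathbb R$. We set

\[ F = B_{L_2(Q,\mathcal F,\mu)} = \{ f \in L_2(Q,\mathcal F,\mu) : \|f\|_{L_2(Q,\mathcal F,\mu)} \le 1 \}, \qquad G = \mathbb R, \tag{2} \]

\[ S = I_{Q,\mu} : F \to \mathbb R, \qquad I_{Q,\mu}(f) = \int_Q f(x)\, d\mu(x) \quad (f \in F), \tag{3} \]

\[ K = \mathbb R, \qquad \Lambda = \{\delta_x : x \in Q\}, \qquad \delta_x(f) := f(x) \quad (f \in F). \tag{4} \]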

Thus, we want to compute (approximately) the integral of functions from the unit ball of $L_2(Q, \mathcal F, \mu)$. The set of available information functionals consists of function evaluations at arbitrary points of $Q$.

The basic IBC approach to a general notion of an algorithm is the following. The algorithm starts by evaluating an information functional $L_1 \in \Lambda$ at the input $f \in F$, that is, $L_1(f) \in K$. Next a termination function $\tau_1(L_1(f))$ is evaluated. If its value is 1, we stop the process of gathering information. If the value is 0, we go on and choose, depending on $L_1(f)$, another functional $L_2 \in \Lambda$, and $L_2(f)$ is evaluated. The termination function $\tau_2(L_1(f), L_2(f))$ decides whether to stop or to continue. In the latter case, the choice of the next functional $L_3 \in \Lambda$ may depend on $L_1(f)$ and $L_2(f)$, and so on. The procedure goes on until $\tau_n(L_1(f), \dots, L_n(f)) = 1$ for some $n$; thus $n$ values $L_j(f)$ $(j = 1, \dots, n)$ are obtained, the 'information' about $f$. On the basis of this information a final mapping $\varphi_n : K^n \to G$ is applied, representing the computations on the information leading to the approximation $A(f)$ to $S(f)$ in $G$.

This is formalized as follows (including also the case of choosing no information functionals at all, which we omitted in the above informal description). An (adaptive) deterministic algorithm for $\mathcal P$ is a tuple

\[ A = \big( (L_i)_{i=1}^\infty, (\tau_i)_{i=0}^\infty, (\varphi_i)_{i=0}^\infty \big) \]

such that $L_1 \in \Lambda$, $\tau_0 \in \{0, 1\}$, $\varphi_0 \in G$, and for $i \in \mathbb N$

\[ L_{i+1} : K^i \to \Lambda, \qquad \tau_i : K^i \to \{0, 1\}, \qquad \varphi_i : K^i \to G \tag{5} \]

are arbitrary mappings, where $K^i$ denotes the $i$-th Cartesian power of $K$. Given an input $f \in F$, we define $(\lambda_i)_{i=1}^\infty$ with $\lambda_i \in \Lambda$ as follows:

\[ \lambda_1 = L_1, \qquad \lambda_i = L_i(\lambda_1(f), \dots, \lambda_{i-1}(f)) \quad (i \ge 2). \tag{6} \]

Define $\operatorname{card}(A, f)$, the cardinality of $A$ at input $f$, to be 0 if $\tau_0 = 1$. If $\tau_0 = 0$, let $\operatorname{card}(A, f)$ be the first integer $n \ge 1$ with $\tau_n(\lambda_1(f), \dots, \lambda_n(f)) = 1$ if there is such an $n$. If $\tau_0 = 0$ and no such $n \in \mathbb N$ exists, put $\operatorname{card}(A, f) = +\infty$. We define the output $A(f)$ of algorithm $A$ at input $f$ as

\[ A(f) = \begin{cases} \varphi_0 & \text{if } \operatorname{card}(A, f) \in \{0, \infty\} \\ \varphi_n(\lambda_1(f), \dots, \lambda_n(f)) & \text{if } 1 \le \operatorname{card}(A, f) = n < \infty. \end{cases} \tag{7} \]
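The definition (5)–(7) can be traced in a few lines of code. The following Python sketch is only an illustration under stated assumptions: the encoding of $L_i$, $\tau_i$, $\varphi_i$ as functions of the tuple of values gathered so far, the step limit `max_steps` standing in for $\operatorname{card}(A, f) = \infty$, and the example rule (midpoints of four subintervals of $[0, 1]$ with early stopping) are all hypothetical and not taken from the paper.

```python
def run_deterministic(L, tau, phi, f, max_steps=10_000):
    """Return (A(f), card(A, f)) for an adaptive deterministic algorithm.

    Hypothetical encoding: L, tau, phi each take the tuple of information
    values gathered so far; L(()) plays the role of L_1, tau(()) of tau_0
    and phi(()) of phi_0. An information functional is a callable f -> value.
    """
    values = ()
    if tau(values) == 1:                 # tau_0 = 1: no information is used
        return phi(values), 0
    for n in range(1, max_steps + 1):
        lam = L(values)                  # lambda_n as in (6)
        values = values + (lam(f),)      # evaluate lambda_n(f)
        if tau(values) == 1:             # first n with tau_n = 1
            return phi(values), n
    # The step limit stands in for card(A, f) = infinity; the output is phi_0.
    return phi(()), float("inf")


# Example: point evaluations at the midpoints of four subintervals of [0, 1],
# with early termination as soon as a sampled value exceeds 10 in modulus.
def L(values):
    i = len(values) + 1
    x = (i - 0.5) / 4
    return lambda f: f(x)                # the functional delta_x

def tau(values):
    too_large = len(values) > 0 and abs(values[-1]) > 10
    return 1 if len(values) == 4 or too_large else 0

def phi(values):
    return sum(values) / len(values) if values else 0.0
```

For $f(x) = x$ the run uses four function values, whereas for a constant function of value 100 it stops after the first one, so the cardinality depends on the input, as in the definition of $\operatorname{card}(A, f)$.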


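The cardinality of $A$ is defined as

\[ \operatorname{card}(A, F) = \sup_{f \in F} \operatorname{card}(A, f). \]

Given $n \in \mathbb N_0$, we define $\mathcal A_n^{\rm det}(\mathcal P)$ as the set of deterministic algorithms $A$ for $\mathcal P$ with $\operatorname{card}(A, F) \le n$, the error of $A$ in approximating $S$ as

\[ e(S, A, F, G) = \sup_{f \in F} \|S(f) - A(f)\|_G, \]

and for $n \in \mathbb N_0$ the deterministic $n$-th minimal error of $S$ as

\[ e_n^{\rm det}(S, F, G) = \inf_{A \in \mathcal A_n^{\rm det}(\mathcal P)} e(S, A, F, G). \tag{8} \]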

In the case of the example (2)–(4) above a deterministic algorithm calls function values $\lambda_1(f) = f(x_1), \dots, \lambda_n(f) = f(x_n)$, where the sample points $x_i \in Q$ can be chosen adaptively, depending on the information computed so far, and also the termination number $n$ may be adaptive in this sense. The cardinality $\operatorname{card}(A, f)$ is the total number $n$ of function values called at input $f$. Finally, the mapping $\varphi_n$ is applied to produce the output of the algorithm, the approximation $\varphi_n(f(x_1), \dots, f(x_n))$ to the integral $I_{Q,\mu}(f)$.

An (unrestricted) randomized algorithm for $\mathcal P$ is a tuple

\[ A = \big( (\Omega, \Sigma, \mathbb P), (A_\omega)_{\omega \in \Omega} \big), \]

where $(\Omega, \Sigma, \mathbb P)$ is a probability space and for each $\omega \in \Omega$, $A_\omega$ is a deterministic algorithm for $\mathcal P$. Let $n \in \mathbb N_0$. Then $\mathcal A_n^{\rm ran}(\mathcal P)$ stands for the class of randomized algorithms $A$ for $\mathcal P$ with the following properties: For each $f \in F$ the mapping $\omega \to \operatorname{card}(A_\omega, f)$ is $\Sigma$-measurable, $\mathbb E \operatorname{card}(A_\omega, f) \le n$, and the mapping $\omega \to A_\omega(f)$ is $\Sigma$-to-Borel measurable and $\mathbb P$-almost surely separably valued, i.e., there is a separable subspace $G_f$ of $G$ such that $\mathbb P\{\omega : A_\omega(f) \in G_f\} = 1$. We define the cardinality of $A \in \mathcal A_n^{\rm ran}(\mathcal P)$ as

\[ \operatorname{card}(A, F) = \sup_{f \in F} \mathbb E \operatorname{card}(A_\omega, f), \]

the error as

\[ e(S, A, F, G) = \sup_{f \in F} \mathbb E\, \|S(f) - A_\omega(f)\|_G, \]

and the randomized $n$-th minimal error of $S$ as

\[ e_n^{\rm ran}(S, F, G) = \inf_{A \in \mathcal A_n^{\rm ran}(\mathcal P)} e(S, A, F, G). \]


Considering trivial one-point probability spaces $\Omega = \{\omega\}$ immediately yields

\[ e_n^{\rm ran}(S, F, G) \le e_n^{\rm det}(S, F, G). \tag{9} \]

A classical example of an unrestricted randomized algorithm is the standard Monte Carlo method for integration (2)–(4) with $n \in \mathbb N$ samples. Here we take a sufficiently large probability space, e.g., $(\Omega, \Sigma, \mathbb P) = (Q, \mathcal F, \mu)^n$, a sequence $(\xi_i)_{i=1}^n$ of independent, $\mu$-distributed random variables with values in $Q$ over $(\Omega, \Sigma, \mathbb P)$, and put $A_n = ((\Omega, \Sigma, \mathbb P), (A_{n,\omega})_{\omega \in \Omega})$, where

\[ A_{n,\omega}(f) = \frac{1}{n} \sum_{i=1}^n f(\xi_i(\omega)). \tag{10} \]

To view this algorithm in the formal context of (5), fix $n \in \mathbb N$ and $\omega \in \Omega$. Then we have $A_{n,\omega} = ((L_i)_{i=1}^\infty, (\tau_i)_{i=0}^\infty, (\varphi_i)_{i=0}^\infty)$ with

\begin{align}
L_i &\equiv \delta_{\xi_i(\omega)} \quad (i = 1, \dots, n) \tag{11} \\
\tau_i &\equiv 0 \quad (0 \le i < n), \qquad \tau_n \equiv 1 \tag{12} \\
\varphi_n(a_1, \dots, a_n) &= \frac{1}{n} \sum_{i=1}^n a_i \quad (a_i \in \mathbb R), \tag{13}
\end{align}

while all other algorithm components can be chosen arbitrarily, since they do not contribute to the output $A_{n,\omega}(f)$. It is well-known (see, e.g., [12], 2.1.3) that

\[ e(I_{Q,\mu}, A_n, B_{L_2(Q,\mathcal F,\mu)}, \mathbb R) = \sup_{f \in B_{L_2(Q,\mathcal F,\mu)}} \mathbb E\, |I_{Q,\mu}(f) - A_{n,\omega}(f)| \le n^{-1/2}. \tag{14} \]
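As a small numerical illustration of (10), the following Python sketch evaluates the standard Monte Carlo estimate for one realization $\omega$. The choices $Q = [0, 1]$ with the Lebesgue measure, the integrand $f(x) = x$, the seed and the sample size are assumptions made only for this example; the pseudorandom generator of the standard library stands in for the random variables $\xi_i$.

```python
import random


def monte_carlo(f, sample, n, rng):
    """A_{n,omega}(f) = (1/n) * sum_i f(xi_i(omega)), see (10).

    `sample(rng)` draws one mu-distributed point of Q; `rng` carries the
    realization omega (here a seeded pseudorandom generator).
    """
    return sum(f(sample(rng)) for _ in range(n)) / n


# Assumption for illustration: Q = [0, 1] with the Lebesgue measure.
rng = random.Random(0)
f = lambda x: x                  # L_2 norm 3**-0.5 <= 1, exact integral 1/2
estimate = monte_carlo(f, lambda r: r.random(), 10_000, rng)
error = abs(estimate - 0.5)
```

The bound (14) concerns the expected error over $\omega$, so a single run such as this one can only be expected to have an error of the order $n^{-1/2}$, not to satisfy the bound exactly.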

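Now we introduce the new notion of a restricted randomized algorithm for $\mathcal P$. A probability space with access restriction is a tuple

\[ \mathcal R = \big( (\Omega, \Sigma, \mathbb P), K', \Lambda' \big), \tag{15} \]

where $(\Omega, \Sigma, \mathbb P)$ is a probability space, $K'$ a non-empty set, and $\Lambda'$ a non-empty set of mappings from $\Omega$ to $K'$. With $\mathcal P = (F, G, S, K, \Lambda)$ as above we set

\[ \bar K = K \,\dot\cup\, K', \qquad \bar\Lambda = \Lambda \,\dot\cup\, \Lambda', \]

where $\dot\cup$ denotes the disjoint union. For $\lambda \in \bar\Lambda$ we define

\[ \lambda(f, \omega) = \begin{cases} \lambda(f) & \text{if } \lambda \in \Lambda \\ \lambda(\omega) & \text{if } \lambda \in \Lambda'. \end{cases} \]

An $\mathcal R$-restricted randomized algorithm for problem $\mathcal P$ is a tuple

\[ A = \big( (L_i)_{i=1}^\infty, (\tau_i)_{i=0}^\infty, (\varphi_i)_{i=0}^\infty \big) \]

such that $L_1 \in \bar\Lambda$, $\tau_0 \in \{0, 1\}$, $\varphi_0 \in G$, and for $i \in \mathbb N$

\[ L_{i+1} : \bar K^i \to \bar\Lambda, \qquad \tau_i : \bar K^i \to \{0, 1\}, \qquad \varphi_i : \bar K^i \to G \tag{16} \]

are any mappings. Given $f \in F$ and $\omega \in \Omega$, we define $(\lambda_i)_{i=1}^\infty$ with $\lambda_i \in \bar\Lambda$ as follows:

\[ \lambda_1 = L_1, \qquad \lambda_i = L_i(\lambda_1(f, \omega), \dots, \lambda_{i-1}(f, \omega)) \quad (i \ge 2). \tag{17} \]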

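Define $\operatorname{card}_{\bar\Lambda}(A, f, \omega)$, $\operatorname{card}_{\Lambda}(A, f, \omega)$, and $\operatorname{card}_{\Lambda'}(A, f, \omega)$ all to be 0 if $\tau_0 = 1$. If $\tau_0 = 0$, let $\operatorname{card}_{\bar\Lambda}(A, f, \omega)$ be the first integer $n \ge 1$ with $\tau_n(\lambda_1(f, \omega), \dots, \lambda_n(f, \omega)) = 1$ if there is such an $n$. If $\tau_0 = 0$ and no such $n \in \mathbb N$ exists, put $\operatorname{card}_{\bar\Lambda}(A, f, \omega) = +\infty$. Let

\begin{align}
\operatorname{card}_{\Lambda}(A, f, \omega) &= |\{k \le \operatorname{card}_{\bar\Lambda}(A, f, \omega) : \lambda_k \in \Lambda\}| \\
\operatorname{card}_{\Lambda'}(A, f, \omega) &= |\{k \le \operatorname{card}_{\bar\Lambda}(A, f, \omega) : \lambda_k \in \Lambda'\}|.
\end{align}

Clearly, $\operatorname{card}_{\bar\Lambda}(A, f, \omega) = \operatorname{card}_{\Lambda}(A, f, \omega) + \operatorname{card}_{\Lambda'}(A, f, \omega)$. We define the output $A(f, \omega)$ of algorithm $A$ at input $(f, \omega)$ as

\[ A(f, \omega) = \begin{cases} \varphi_0 & \text{if } \operatorname{card}_{\bar\Lambda}(A, f, \omega) \in \{0, \infty\} \\ \varphi_n(\lambda_1(f, \omega), \dots, \lambda_n(f, \omega)) & \text{if } 1 \le \operatorname{card}_{\bar\Lambda}(A, f, \omega) = n < \infty. \end{cases} \tag{18} \]

Thus, a restricted randomized algorithm depends on the randomness of $(\Omega, \Sigma, \mathbb P)$, but in a special way. Namely, $\omega \in \Omega$ can only be accessed through the functionals $\lambda(\omega)$ for $\lambda \in \Lambda'$. Intuitively, it seems clear that a restricted randomized algorithm is a special case of a randomized algorithm. Formally, though, this has to be checked on the basis of the respective definitions. Corollary 1 states that this is indeed the case. Also note the similarities of the definition of a restricted randomized algorithm with the notion of a deterministic algorithm for a stochastic problem from [7, 8].

Given $n, k \in \mathbb N_0$, we define $\mathcal A_{n,k}^{\rm ran}(\mathcal P, \mathcal R)$ as the set of those $\mathcal R$-restricted randomized algorithms for problem $\mathcal P$ with the following properties: For each $f \in F$ the mappings

\[ \omega \to \operatorname{card}_{\bar\Lambda}(A, f, \omega), \qquad \omega \to \operatorname{card}_{\Lambda}(A, f, \omega), \qquad \omega \to \operatorname{card}_{\Lambda'}(A, f, \omega) \]

are $\Sigma$-measurable,

\[ \mathbb E \operatorname{card}_{\Lambda}(A, f, \omega) \le n, \qquad \mathbb E \operatorname{card}_{\Lambda'}(A, f, \omega) \le k, \]

and the mapping $\omega \to A(f, \omega) \in G$ is $\Sigma$-to-Borel measurable and $\mathbb P$-almost surely separably valued. The error of $A \in \mathcal A_{n,k}^{\rm ran}(\mathcal P, \mathcal R)$ is defined as

\[ e(S, A, F, G) = \sup_{f \in F} \mathbb E\, \|S(f) - A(f, \omega)\|_G. \]

The $(n, k)$-th minimal $\mathcal R$-restricted randomized error of $S$ is defined as

\[ e_{n,k}^{\rm ran}(S, F, G) = \inf_{A \in \mathcal A_{n,k}^{\rm ran}(\mathcal P, \mathcal R)} e(S, A, F, G). \tag{19} \]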

For example, bit Monte Carlo algorithms fit the above definition with $K' = \{0, 1\}$, $\Lambda' = \{\xi_i : 1 \le i < \infty\}$, with $(\xi_i)$ being independent random variables on $(\Omega, \Sigma, \mathbb P)$ with $\mathbb P(\{\xi_i = 0\}) = \mathbb P(\{\xi_i = 1\}) = 1/2$. The restricted Monte Carlo algorithms considered by Novak in [11, 12] correspond to arbitrary $K'$ and $\Lambda'$ consisting of random variables on $(\Omega, \Sigma, \mathbb P)$ with finite range and rational distribution probabilities. Of particular interest, because it is most frequently used, is the case where $K' = [0, 1]$ and $\Lambda' = \{\eta_i : 1 \le i < \infty\}$, with $(\eta_i)$ being independent random variables over $(\Omega, \Sigma, \mathbb P)$, uniformly distributed on $[0, 1]$. Concerning example (2)–(4), one might ask whether the same rate as in (14) could be obtained with the help of a finite number of random variables uniformly distributed on $[0, 1]$ (and maybe suitable transformations). If $Q$ is too large, this may not be the case. In fact, it can happen that no rate whatsoever is possible. This statement is a special case of Theorem 1 below.

To a given $\mathcal R$-restricted randomized algorithm $A$ for $\mathcal P$ and $\omega \in \Omega$ we can associate a deterministic algorithm $A_\omega$ for $\mathcal P$. The following proposition is related to Lemma 3 in [7], with a refined statement about the cardinality of the resulting algorithm $A_\omega$.
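The two cardinalities $\operatorname{card}_{\Lambda}$ and $\operatorname{card}_{\Lambda'}$ are easy to follow on a toy bit Monte Carlo algorithm. The Python sketch below is an illustration only: the tagged-tuple encoding of functionals, the modelling of $\omega$ as a function $i \mapsto \xi_i(\omega)$, and the example rule (three random bits select one dyadic midpoint of $[0, 1]$, where $f$ is then evaluated) are hypothetical choices, not constructions from the paper.

```python
def run_restricted(L, tau, phi, f, omega, max_steps=10_000):
    """Return (A(f, omega), card_Lambda, card_Lambda') in the spirit of (17), (18).

    Hypothetical encoding: L(values) returns ("info", x) for the functional
    delta_x in Lambda or ("rand", i) for the random variable xi_i in Lambda';
    omega is modelled as a function i -> xi_i(omega).
    """
    values, n_info, n_rand = (), 0, 0
    if tau(values) == 1:                     # tau_0 = 1: nothing is called
        return phi(values), 0, 0
    for _ in range(max_steps):
        kind, arg = L(values)
        if kind == "info":                   # lambda in Lambda: uses f only
            values = values + (f(arg),)
            n_info += 1
        else:                                # lambda in Lambda': uses omega only
            values = values + (omega(arg),)
            n_rand += 1
        if tau(values) == 1:
            return phi(values), n_info, n_rand
    raise RuntimeError("no termination within max_steps")


M = 3  # number of random bits used to select one sample point

def L(values):
    i = len(values)
    if i < M:
        return ("rand", i + 1)               # call the bit xi_{i+1}
    k = sum(b << j for j, b in enumerate(values[:M]))
    return ("info", (k + 0.5) / 2**M)        # evaluate f at a dyadic midpoint

tau = lambda values: 1 if len(values) == M + 1 else 0
phi = lambda values: values[-1] if values else 0.0
```

With the bits $(1, 0, 1)$ the selected point is $5.5/8$, so for $f(x) = x$ the run returns the value $0.6875$ with one information call and three random bit calls.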

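Proposition 1. Let $A$ be an $\mathcal R$-restricted randomized algorithm for $\mathcal P$. Then for each $\omega \in \Omega$ there is a deterministic algorithm $A_\omega$ for $\mathcal P$ such that for all $f \in F$

\begin{align}
\operatorname{card}(A_\omega, f) &= \operatorname{card}_{\Lambda}(A, f, \omega) \tag{20} \\
A_\omega(f) &= A(f, \omega). \tag{21}
\end{align}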

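Proof. Let $\nu_0 \in \Lambda$ be any element, let $A = ((L_i)_{i=1}^\infty, (\tau_i)_{i=0}^\infty, (\varphi_i)_{i=0}^\infty)$, and fix $\omega \in \Omega$. Our goal is to define a suitable algorithm $A_\omega = ((L_{i,\omega})_{i=1}^\infty, (\tau_{i,\omega})_{i=0}^\infty, (\varphi_{i,\omega})_{i=0}^\infty)$. We start with the following construction. Given an arbitrary sequence $(y_l)_{l=1}^\infty \in K^{\mathbb N}$, we define two sequences $(\lambda_i)_{i=1}^\infty \in \bar\Lambda^{\mathbb N}$ and $(z_i)_{i=1}^\infty \in \bar K^{\mathbb N}$ inductively as follows. Let

\begin{align}
\lambda_1 &= L_1 \tag{22} \\
z_1 &= \begin{cases} y_1 & \text{if } \lambda_1 \in \Lambda \\ \lambda_1(\omega) & \text{if } \lambda_1 \in \Lambda'. \end{cases} \tag{23}
\end{align}

Now let $i \ge 1$, assume that $(\lambda_j)_{j \le i}$ and $(z_j)_{j \le i}$ have been defined, let

\[ l = |\{ j \le i : \lambda_j \in \Lambda \}|, \tag{24} \]

and set

\begin{align}
\lambda_{i+1} &= L_{i+1}(z_1, \dots, z_i) \tag{25} \\
z_{i+1} &= \begin{cases} y_{l+1} & \text{if } \lambda_{i+1} \in \Lambda \\ \lambda_{i+1}(\omega) & \text{if } \lambda_{i+1} \in \Lambda'. \end{cases} \tag{26}
\end{align}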

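Observe that, roughly speaking, $(\lambda_i)_{i=1}^\infty$ is something like the sequence (17), just with 'input' $(y_l)_{l=1}^\infty$ instead of $f$. It is convenient for us to set $\lambda_\infty = \nu_0$. Let $k_0 = 0$ and define for $l \in \mathbb N$

\[ k_l = \min\{ i \in \mathbb N : i > k_{l-1},\ \lambda_i \in \Lambda \}, \tag{27} \]

with the understanding that $\min \emptyset = \infty$. This defines the function

\[ \Psi : K^{\mathbb N} \to \bar\Lambda^{\mathbb N} \times \bar K^{\mathbb N} \times (\mathbb N_0 \cup \{\infty\})^{\mathbb N_0}, \qquad \Psi\big( (y_l)_{l=1}^\infty \big) = \big( (\lambda_i)_{i=1}^\infty, (z_i)_{i=1}^\infty, (k_l)_{l=0}^\infty \big). \]

We note that for each $l \in \mathbb N_0$ the following holds. Let $(\tilde y_j)_{j=1}^\infty \in K^{\mathbb N}$ be such that $(y_j)_{j \le l} = (\tilde y_j)_{j \le l}$ and let

\[ \Psi\big( (\tilde y_l)_{l=1}^\infty \big) = \big( (\tilde\lambda_i)_{i=1}^\infty, (\tilde z_i)_{i=1}^\infty, (\tilde k_l)_{l=0}^\infty \big). \]

Then $(\lambda_j)_{j \le k_{l+1}} = (\tilde\lambda_j)_{j \le k_{l+1}}$, $(z_j)_{j}$ [...] $\aleph$. By Cantor's theorem, one could take, e.g., $\aleph_1 = 2^{\aleph}$, see [10], Th. 6. We construct a probability space $(Q, \mathcal F, \mu)$ as follows. (For the case $\aleph = |\mathbb N|$ of this construction, see, e.g., [1], p. 29–30, exercise 2.12 (d).) Let $Q$ be any set with $|Q| = \aleph_1$. Define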


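\[ \mathcal F_0 = \{ B \subseteq Q : |B| \le \aleph \}, \qquad \mathcal F_1 = \{ B \subseteq Q : |Q \setminus B| \le \aleph \}, \qquad \mathcal F = \mathcal F_0 \cup \mathcal F_1, \]

and put for $B \in \mathcal F$

\[ \mu(B) = \begin{cases} 0 & \text{if } B \in \mathcal F_0 \\ 1 & \text{if } B \in \mathcal F_1. \end{cases} \]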

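Since the union of countably many sets $B_i \subseteq Q$ with $|B_i| \le \aleph$ satisfies $|\bigcup_{i \in \mathbb N} B_i| \le \aleph$ ([10], section 6, relations 6.1 and 6.4), it follows that $\mathcal F$ is a $\sigma$-algebra and $\mu$ is a (countably additive) probability measure on $(Q, \mathcal F)$. The structure of the space $L_2(Q, \mathcal F, \mu)$ is simple: Let $f : Q \to \mathbb R$ be $\mathcal F$-to-Borel measurable, thus

\[ Q(f, a) := \{ x \in Q : f(x) \le a \} \in \mathcal F \quad (a \in \mathbb R). \]

Observe that since

\[ \bigcap_{a \in \mathbb Q} Q(f, a) = \emptyset, \qquad \bigcup_{a \in \mathbb Q} Q(f, a) = Q, \]

with $\mathbb Q$ the set of rationals, it follows that $a_1 = \inf\{ a \in \mathbb R : Q(f, a) \in \mathcal F_1 \} \in \mathbb R$ and

\[ Q(f, a_1) = \bigcap_{a \in \mathbb Q,\ a > a_1} Q(f, a) \in \mathcal F_1. \]

For all $a > a_1$ we have $Q(f, a) \setminus Q(f, a_1) \in \mathcal F_0$, thus

\[ \{ x \in Q : f(x) \ne a_1 \} = \bigcup_{a \in \mathbb Q,\ a < a_1} Q(f, a) \ \cup \bigcup_{a \in \mathbb Q,\ a > a_1} \big( Q(f, a) \setminus Q(f, a_1) \big) \in \mathcal F_0. \]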

Thus, $f$ is constant except for a set of cardinality $\le \aleph$, that is, of $\mu$-measure zero. Since each such function is obviously $\mu$-square integrable, $L_2(Q, \mathcal F, \mu)$ consists of all these functions. Let us mention in passing that the respective space $L_2(Q, \mathcal F, \mu)$ of equivalence classes of functions equal up to a set of $\mu$-measure zero is one-dimensional and consists of equivalence classes of constant functions.

We let $\mathcal P = (B_{L_2(Q,\mathcal F,\mu)}, \mathbb R, I_{Q,\mu}, \mathbb R, \Lambda)$ be defined by (2)–(4), with $(Q, \mathcal F, \mu)$ as above. Let $A_1 = ((Q, \mathcal F, \mu), (A_{1,x})_{x \in Q})$ be the classical Monte Carlo method (10) with one sample, which is an unrestricted randomized algorithm, see (11)–(13). Obviously, $\operatorname{card}(A_1, B_{L_2(Q,\mathcal F,\mu)}) = 1$ and $A_{1,x}(f) = f(x)$ $(x \in Q,\ f \in F)$, hence for each $f \in F$ the mapping $x \to A_{1,x}(f) = f(x)$ is $\mathcal F$-to-Borel measurable. Moreover,

\[ \mu\Big\{ x \in Q : A_{1,x}(f) = \int_Q f(y)\, d\mu(y) \Big\} = 1, \]

which means

\[ e(I_{Q,\mu}, A_1, B_{L_2(Q,\mathcal F,\mu)}, \mathbb R) = \sup_{f \in B_{L_2(Q,\mathcal F,\mu)}} \int_Q |I_{Q,\mu}(f) - A_{1,x}(f)|\, d\mu(x) = 0, \]

proving (41).

Now let $n, k \in \mathbb N_0$ and let $A \in \mathcal A_{n,k}^{\rm ran}(\mathcal P, \mathcal R)$. By Proposition 1 for each $\omega \in \Omega$ there is a deterministic algorithm $A_\omega = ((L_{i,\omega})_{i=1}^\infty, (\tau_{i,\omega})_{i=0}^\infty, (\varphi_{i,\omega})_{i=0}^\infty)$ for $\mathcal P$ so that

\[ A_\omega(f) = A(f, \omega). \tag{43} \]

Consider the zero function $f_0(x) = 0$ $(x \in Q)$. For $\omega \in \Omega$ let, according to (6) and (7),

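\[ \delta_{t_{1,\omega}} = L_{1,\omega}, \qquad \delta_{t_{i,\omega}} = L_{i,\omega}(f_0(t_{1,\omega}), \dots, f_0(t_{i-1,\omega})) = L_{i,\omega}(0, \dots, 0) \quad (i \ge 2), \tag{44} \]

thus

\begin{align}
\operatorname{card}(A_\omega, f_0) &= \min\{ i \in \mathbb N_0 : \tau_{i,\omega}(0, \dots, 0) = 1 \} \tag{45} \\
A_\omega(f_0) &= \begin{cases} \varphi_{0,\omega} & \text{if } \operatorname{card}(A_\omega, f_0) \in \{0, \infty\} \\ \varphi_{n,\omega}(0, \dots, 0) & \text{if } 1 \le \operatorname{card}(A_\omega, f_0) = n < \infty. \end{cases} \tag{46}
\end{align}

Let

\[ B_0 = \{ t_{i,\omega} : i \in \mathbb N,\ \omega \in \Omega \}. \tag{47} \]

Then $|B_0| \le |\mathbb N| \times |\Omega| \le |\mathbb N| \times \aleph = \aleph$, hence $B_0 \in \mathcal F_0$. Define $f_j \in B_{L_2(Q,\mathcal F,\mu)}$ for $j \in \{1, 2\}$ by

\[ f_j(x) = \begin{cases} 0 & \text{if } x \in B_0 \\ (-1)^j & \text{if } x \in Q \setminus B_0. \end{cases} \tag{48} \]

With (6) and (7) we obtain

\[ \delta_{t_{1,\omega,j}} = L_{1,\omega}, \qquad \delta_{t_{i,\omega,j}} = L_{i,\omega}(f_j(t_{1,\omega,j}), \dots, f_j(t_{i-1,\omega,j})) \quad (i \ge 2), \tag{49} \]

and

\begin{align}
\operatorname{card}(A_\omega, f_j) &= \min\{ i \in \mathbb N_0 : \tau_{i,\omega}(f_j(t_{1,\omega,j}), \dots, f_j(t_{i,\omega,j})) = 1 \} \tag{50} \\
A_\omega(f_j) &= \begin{cases} \varphi_{0,\omega} & \text{if } \operatorname{card}(A_\omega, f_j) \in \{0, \infty\} \\ \varphi_{n,\omega}(f_j(t_{1,\omega,j}), \dots, f_j(t_{n,\omega,j})) & \text{if } 1 \le \operatorname{card}(A_\omega, f_j) = n < \infty. \end{cases} \tag{51}
\end{align}

Using (45)–(51), it is readily checked by induction that for all $i \in \mathbb N$, $\omega \in \Omega$, and $j \in \{1, 2\}$ we have $t_{i,\omega} = t_{i,\omega,j}$, therefore

\[ f_j(t_{i,\omega,j}) = 0, \qquad \operatorname{card}(A_\omega, f_j) = \operatorname{card}(A_\omega, f_0), \qquad A_\omega(f_j) = A_\omega(f_0). \]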
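The induction behind $t_{i,\omega} = t_{i,\omega,j}$ can be observed on a toy example. The Python sketch below is only a finite analogue under stated assumptions: it uses $Q = [0, 1)$, a single deterministic adaptive rule (one fixed $\omega$) invented for the illustration, and five evaluations; it does not model the cardinality argument that keeps $B_0$ in $\mathcal F_0$ when all $\omega \in \Omega$ are taken together.

```python
def run_and_trace(next_point, stop, combine, f, max_steps=1000):
    """Run an adaptive point-evaluation algorithm; return (output, points used)."""
    values, points = (), []
    while not stop(values) and len(points) < max_steps:
        x = next_point(values)           # t_i depends only on earlier values
        points.append(x)
        values = values + (f(x),)
    return combine(values), points


# Hypothetical adaptive rule: the next point depends on the values seen so far.
def next_point(values):
    return (0.1 * (len(values) + 1) + sum(values)) % 1.0

stop = lambda values: len(values) == 5
combine = lambda values: sum(values) / len(values)

# Step 1: run on the zero function f_0 and collect the visited points B_0.
f0 = lambda x: 0.0
out0, pts0 = run_and_trace(next_point, stop, combine, f0)
B0 = set(pts0)

# Step 2: f_j vanishes on B_0 and equals (-1)^j elsewhere, as in (48).
def make_fj(j):
    return lambda x: 0.0 if x in B0 else float((-1) ** j)

results = [run_and_trace(next_point, stop, combine, make_fj(j)) for j in (1, 2)]
```

Both runs in `results` visit exactly the points in `pts0` and return the output `out0`, although the two functions take the values $-1$ and $1$ everywhere outside the finite set `B0`.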


Consequently,

\begin{align}
e(I_{Q,\mu}, A, B_{L_2(Q,\mathcal F,\mu)}, \mathbb R) &= \sup_{f \in B_{L_2(Q,\mathcal F,\mu)}} \int_\Omega |I_{Q,\mu}(f) - A_\omega(f)|\, d\mathbb P(\omega) \\
&\ge \max_{j=1,2} \int_\Omega |I_{Q,\mu}(f_j) - A_\omega(f_j)|\, d\mathbb P(\omega) \\
&\ge \frac12 \sum_{j=1,2} \int_\Omega |I_{Q,\mu}(f_j) - A_\omega(f_j)|\, d\mathbb P(\omega) = \frac12 \sum_{j=1,2} \int_\Omega |I_{Q,\mu}(f_j) - A_\omega(f_0)|\, d\mathbb P(\omega) \\
&\ge \frac12 \int_\Omega |I_{Q,\mu}(f_1) - I_{Q,\mu}(f_2)|\, d\mathbb P(\omega) = 1.
\end{align}

This shows (42).

Acknowledgements This paper is partly based on research carried out while the author was guest of the International Mathematical Research Institute MATRIX, Melbourne, during the program 'On the frontiers of high dimensional computation'. The author is also grateful to Thomas Müller-Gronbach and Klaus Ritter for a discussion on the topic of Theorem 1, to Pawel Przybyłowicz for pointing out reference [1], and to the referees, whose suggestions helped to improve the presentation.

References

1. Billingsley, P.: Probability and Measure. Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, Inc., New York (1986)
2. Gao, W., Ye, P., Wang, H.: Optimal error bound of restricted Monte Carlo integration on anisotropic Sobolev classes. Progr. Natur. Sci. (English Ed.) 16, 588–593 (2006)
3. Giles, M. B., Hefter, M., Mayer, L., Ritter, K.: Random bit quadrature and approximation of distributions on Hilbert spaces. Found. Comput. Math. 19, 205–238 (2019)
4. Giles, M. B., Hefter, M., Mayer, L., Ritter, K.: Random bit multilevel algorithms for stochastic differential equations. J. Complexity 54, 101395 (2019)
5. Heinrich, S.: Monte Carlo approximation of weakly singular integral operators. J. Complexity 22, 192–219 (2006)
6. Heinrich, S.: The randomized information complexity of elliptic PDE. J. Complexity 22, 220–249 (2006)
7. Heinrich, S.: Lower complexity bounds for parametric stochastic Itô integration. In: Owen, A. B., Glynn, P. W. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2016, pp. 295–312. Springer Proceedings in Mathematics & Statistics 241, Berlin (2018)
8. Heinrich, S.: Complexity of stochastic integration in Sobolev classes. J. Math. Anal. Appl. 476, 177–195 (2019)
9. Heinrich, S., Novak, E., Pfeiffer, H.: How many random bits do we need for Monte Carlo integration? In: Niederreiter, H. (ed.) Monte Carlo and Quasi-Monte Carlo Methods 2002, pp. 27–49. Springer-Verlag, Berlin (2004)
10. Jech, T.: Set Theory. Academic Press, New York (1977)
11. Novak, E.: Eingeschränkte Monte Carlo-Verfahren zur numerischen Integration. In: W. Grossmann et al. (eds.) Proc. 4th Pannonian Symp. on Math. Statist., Bad Tatzmannsdorf, Austria 1983, pp. 269–282. Reidel, Dordrecht (1985)
12. Novak, E.: Deterministic and Stochastic Error Bounds in Numerical Analysis. Lecture Notes in Mathematics, vol. 1349, Springer-Verlag, Berlin (1988)
13. Novak, E., Pfeiffer, H.: Coin tossing algorithms for integral equations and tractability. Monte Carlo Methods Appl. 10, 491–498 (2004)
14. Symul, T., Assad, S. M., Lam, P. K.: Real time demonstration of high bitrate quantum random number generation with coherent laser light. Appl. Phys. Lett. 98, 231103 (2011)
15. Traub, J. F., Woźniakowski, H.: The Monte Carlo algorithm with a pseudorandom generator. Math. Comp. 58, 323–339 (1992)
16. Traub, J. F., Wasilkowski, G. W., Woźniakowski, H.: Information-Based Complexity. Academic Press, New York (1988)
17. Ye, P., Hu, X.: Optimal integration error on anisotropic classes for restricted Monte Carlo and quantum algorithms. J. Approx. Theory 150, 24–47 (2008)

Exponential tractability of linear tensor product problems

Fred J. Hickernell, Peter Kritzer, Henryk Woźniakowski

Abstract In this article we consider the approximation of compact linear operators defined over tensor product Hilbert spaces. Necessary and sufficient conditions on the singular values of the problem under which we can or cannot achieve different notions of exponential tractability are given in [8]. In this paper, we use the new equivalency conditions shown in [2] to obtain these results in an alternative way. As opposed to the algebraic setting, quasi-polynomial tractability is not possible for non-trivial cases in the exponential setting.

Fred J. Hickernell
Center for Interdisciplinary Scientific Computation, Illinois Institute of Technology, Pritzker Science Center (LS) 106A, 3105 Dearborn Street, Chicago, IL 60616, USA, e-mail: [email protected]

Peter Kritzer
Johann Radon Institute for Computational and Applied Mathematics (RICAM), Austrian Academy of Sciences, Altenbergerstr. 69, 4040 Linz, Austria, e-mail: [email protected]

Henryk Woźniakowski
Department of Computer Science, Columbia University, New York, NY, USA; Institute of Applied Mathematics, University of Warsaw, ul. Banacha 2, 02-097 Warszawa, Poland, e-mail: [email protected]

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_5

1 Introduction and Preliminaries

Tractability of multivariate problems is the subject of a considerable number of articles and monographs in the field of Information-Based Complexity (IBC). For an introduction to IBC, we refer to the book [11]. For a recent overview of the state
of the art in tractability studies we refer to the trilogy [5]–[7]. In this article we study tractability in the worst case setting for linear tensor product problems and for algorithms that use finitely many arbitrary continuous linear functionals.

The information complexity of a compact linear operator $S_d : H_d \to G_d$ is defined as the minimal number, $n(\varepsilon, S_d)$, of such linear functionals needed to find an $\varepsilon$-approximation. It is natural to ask how the information complexity of a given problem depends on both $d$ and $\varepsilon^{-1}$. In most of the literature on this subject, different notions of tractability are defined in terms of a relationship between $n(\varepsilon, S_d)$ and some powers of $d$ and $\max(1, \varepsilon^{-1})$. This is called algebraic (ALG) tractability. For a complete overview of a wide range of results on algebraic tractability, see [5]–[7].

On the other hand, a relatively recent stream of work defines different notions of tractability in terms of a relationship between $n(\varepsilon, S_d)$ and some powers of $d$ and $1 + \log \max(1, \varepsilon^{-1})$. Now the complexity of the problem increases only logarithmically as the error tolerance vanishes. This situation is referred to as exponential (EXP) tractability, which is the subject of this article. Precise definitions of ALG and EXP tractabilities are given below.

Algebraic tractability is usually obtained for classes of functions with finite smoothness and exponential tractability for classes of functions which are at least $C^\infty$ or analytic functions. Although we study general spaces in this paper, examples with positive algebraic or exponential tractability may be found for specific spaces in the papers cited before as well as in the papers [4] and [1, 3, 12].

General compact linear multivariate problems have been studied in the recent article [2]. Here, we deal with the case of tensor product problems for which the singular values of a $d$-variate problem are given as products of the singular values of univariate problems.
Exponential tractability for tensor product problems has been studied in [8]. In this paper we re-prove the results of [8] by a different argument via the criteria presented in [2].

Consider two Hilbert spaces $H_1$ and $G_1$ and a compact linear solution operator, $S_1 : H_1 \to G_1$. Let $\mathbb N$ denote the set of positive integers, and $\mathbb N_0 = \mathbb N \cup \{0\}$. For $d \in \mathbb N$, let

\[ H_d = H_1 \otimes H_1 \otimes \cdots \otimes H_1 \qquad \text{and} \qquad G_d = G_1 \otimes G_1 \otimes \cdots \otimes G_1 \]

be the $d$-fold tensor products of the spaces $H_1$ and $G_1$, respectively. Furthermore, let $S_d$ be the linear tensor product operator, $S_d = S_1 \otimes S_1 \otimes \cdots \otimes S_1$, on $H_d$. In this way, we obtain a sequence of compact linear solution operators

\[ S = \{ S_d : H_d \to G_d \}_{d \in \mathbb N}. \]

We now consider the problem of approximating $\{S_d(f)\}$ for $f$ from the unit ball of $H_d$ by means of algorithms $\{A_{d,n} : H_d \to G_d\}_{d \in \mathbb N,\, n \in \mathbb N_0}$. For $n = 0$, we set $A_{d,0} := 0$, and for $n \ge 1$, $A_{d,n}(f)$ depends on $n$ continuous linear functionals $L_1(f), L_2(f), \dots, L_n(f)$, so that

\[ A_{d,n}(f) = \varphi_n(L_1(f), L_2(f), \dots, L_n(f)) \tag{1} \]

for some $\varphi_n : \mathbb C^n \to G_d$ or $\varphi_n : \mathbb R^n \to G_d$ and $L_j \in H_d^*$. We allow an adaptive choice of $L_1, L_2, \dots, L_n$ as well as $n$, i.e., $L_j = L_j(\cdot\,; L_1(f), L_2(f), \dots, L_{j-1}(f))$ and $n$ can be a function of the $L_j(f)$, see [11] and [5] for details. The error of a given algorithm $A_{d,n}$ is measured in the worst case setting, which means that we need to deal with

\[ e(A_{d,n}) = \sup_{f \in H_d,\ \|f\|_{H_d} \le 1} \|S_d(f) - A_{d,n}(f)\|_{G_d}. \]

However, to assess the difficulty of the approximation problem, we would like to study not only the worst case errors of particular algorithms, but also a more general error measure. To this end, let

\[ e(n, S_d) = \inf_{A_{d,n}} e(A_{d,n}) \]

denote the $n$th minimal worst case error, where the infimum is extended over all admissible algorithms $A_{d,n}$ of the form (1). Then the information complexity $n(\varepsilon, S_d)$ is the minimal number $n$ of continuous linear functionals needed to find an algorithm $A_{d,n}$ that approximates $S_d$ with error at most $\varepsilon$. More precisely, we consider the absolute (ABS) and normalized (NOR) error criteria in which

\begin{align}
n(\varepsilon, S_d) &= n_{\rm ABS}(\varepsilon, S_d) = \min\{ n : e(n, S_d) \le \varepsilon \}, \\
n(\varepsilon, S_d) &= n_{\rm NOR}(\varepsilon, S_d) = \min\{ n : e(n, S_d) \le \varepsilon \|S_d\| \}.
\end{align}

It is known from [11] (see also [5]) that the information complexity is fully determined by the singular values of $S_d$, which are the same as the square roots of the eigenvalues of the compact self-adjoint and positive semi-definite linear operator $W_d = S_d^* S_d : H_d \to H_d$. We denote these eigenvalues by $\lambda_{d,1}, \lambda_{d,2}, \dots$. Then it is known that the information complexity can be expressed in terms of the eigenvalues $\lambda_{d,j}$. Indeed,

\begin{align}
n_{\rm ABS}(\varepsilon, S_d) &= \min\{ n : \lambda_{d,n+1} \le \varepsilon^2 \}, \tag{2} \\
n_{\rm NOR}(\varepsilon, S_d) &= \min\{ n : \lambda_{d,n+1} \le \varepsilon^2 \lambda_{d,1} \}. \tag{3}
\end{align}
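Formulas (2) and (3) translate directly into a short routine. The Python sketch below is an illustration under assumptions: the function name `info_complexity` is hypothetical, the eigenvalues are passed as a finite non-increasing list, and all eigenvalues beyond the end of the list are assumed to be zero.

```python
def info_complexity(eigs, eps, criterion="ABS"):
    """n(eps, S_d) = min{n : lambda_{d,n+1} <= eps^2 * CRI}, see (2) and (3).

    `eigs` is the non-increasing list lambda_{d,1}, lambda_{d,2}, ...;
    eigenvalues beyond the list are assumed to be zero (an assumption of
    this sketch). CRI is 1 for ABS and lambda_{d,1} for NOR.
    """
    cri = 1.0 if criterion == "ABS" else eigs[0]
    threshold = eps**2 * cri
    for n, lam in enumerate(eigs):       # eigs[n] is lambda_{d,n+1}
        if lam <= threshold:
            return n
    return len(eigs)                     # the next eigenvalue is zero


eigs = [4.0, 1.0, 0.25, 0.0625]
```

For this list, `info_complexity(eigs, 1.0, "NOR")` is zero, in line with the fact that the normalized criterion is trivial for $\varepsilon \ge 1$, while the absolute criterion at $\varepsilon = 1$ still needs one functional because $\lambda_{d,1} = 4 > 1$.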

Clearly, $n_{\rm ABS}(\varepsilon, S_d) = 0$ for $\varepsilon \ge \sqrt{\lambda_{d,1}} = \|S_d\|$, and $n_{\rm NOR}(\varepsilon, S_d) = 0$ for $\varepsilon \ge 1$. Therefore for ABS we can restrict ourselves to $\varepsilon \in (0, \|S_d\|)$, whereas for NOR to $\varepsilon \in (0, 1)$. Since $\|S_d\|$ can be arbitrarily large, to deal simultaneously with ABS and NOR we consider $\varepsilon \in (0, \infty)$. It is known that $n_{\rm ABS/NOR}(\varepsilon, S_d)$ is finite for all $\varepsilon > 0$ iff $S_d$ is compact, which justifies our assumption about the compactness of $S_d$.

We now recall that the spaces $H_d$ and $G_d$ are tensor product spaces. It is known that the eigenvalues $\lambda_{d,j}$ of $W_d$ are then given as products of the eigenvalues $\tilde\lambda_j$ of the operator $W_1 = S_1^* S_1 : H_1 \to H_1$, i.e.,

\[ \lambda_{d,j} = \prod_{\ell=1}^d \tilde\lambda_{j_\ell}. \tag{4} \]

Without loss of generality, we assume that the $\tilde\lambda_j$ are ordered, i.e., $\tilde\lambda_1 \ge \tilde\lambda_2 \ge \cdots$. Although the $\lambda_{d,j}$ are given by (4), the ordering of the $\tilde\lambda_j$ does not easily imply the ordering of the $\lambda_{d,j}$ since the map $j \in \mathbb N \mapsto (j_1, \dots, j_d) \in \mathbb N^d$ exists but does not have a simple explicit form. This makes the tractability analysis challenging.

We are ready to define exactly various notions of ALG and EXP tractabilities. To present them concisely, let $y \in \{{\rm ABS}, {\rm NOR}\}$ and

\[ z = \begin{cases} \max(1, \varepsilon^{-1}) & \text{in the case of ALG,} \\ 1 + \log \max(1, \varepsilon^{-1}) & \text{in the case of EXP.} \end{cases} \]

These definitions are as follows. The problem $S$ is . . .

• strongly polynomially tractable (SPT) if there are $C, q \ge 0$ such that
\[ n_y(\varepsilon, S_d) \le C z^q \qquad \forall d \in \mathbb N,\ \varepsilon \in (0, \infty), \]
• polynomially tractable (PT) if there are $C, p, q \ge 0$ such that
\[ n_y(\varepsilon, S_d) \le C d^p z^q \qquad \forall d \in \mathbb N,\ \varepsilon \in (0, \infty), \]
• quasi-polynomially tractable (QPT) if there are $C, p \ge 0$ such that
\[ n_y(\varepsilon, S_d) \le C \exp\big( p\, (1 + \log d)(1 + \log z) \big) \qquad \forall d \in \mathbb N,\ \varepsilon \in (0, \infty), \]
• $(s,t)$-weakly tractable ($(s,t)$-WT) if
\[ \lim_{d + \varepsilon^{-1} \to \infty} \frac{\log \max(1, n_y(\varepsilon, S_d))}{d^{\,t} + z^s} = 0, \]
• uniformly weakly tractable (UWT) if $(s,t)$-WT holds for all $s, t > 0$.

We use the prefix ALG- with the above tractability notions in the case $z = \max(1, \varepsilon^{-1})$ and EXP- in the case $z = 1 + \log \max(1, \varepsilon^{-1})$. A recent article [2] provides necessary and sufficient conditions on the eigenvalues $\lambda_{d,j}$ of $W_d$ for the various tractability notions above. For the special case of linear tensor product spaces considered here, it is natural to ask for conditions on the eigenvalues $\tilde\lambda_j$ of $W_1$ such that we obtain the different kinds of exponential tractability. For results on algebraic tractability for tensor product spaces, see again [5]–[7] and the articles cited therein. The notion of $(s,t)$-WT was introduced in [10], and UWT was introduced in [9]. See also [13] and [2] for results on $(s,t)$-WT and UWT in the algebraic sense.
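The product structure (4) and the two choices of $z$ can be made concrete with a brute-force computation. The Python sketch below is an illustration under assumptions: the univariate eigenvalue sequence is truncated to a finite list, all $d$-fold products are enumerated and sorted (which is feasible only for small $d$), and the function names are hypothetical. It uses `math.prod`, available from Python 3.8.

```python
import itertools
import math


def tensor_eigenvalues(univariate, d):
    """All products prod_l lambda~_{j_l} over (j_1, ..., j_d), sorted
    non-increasingly, see (4). `univariate` is a finite truncation of the
    univariate eigenvalue sequence (an assumption of this sketch)."""
    prods = (math.prod(c) for c in itertools.product(univariate, repeat=d))
    return sorted(prods, reverse=True)


def z_value(eps, setting):
    """z = max(1, 1/eps) for ALG and 1 + log max(1, 1/eps) for EXP."""
    base = max(1.0, 1.0 / eps)
    return base if setting == "ALG" else 1.0 + math.log(base)


lam2 = tensor_eigenvalues([1.0, 0.5], 2)
```

The explicit sort is the point of the sketch: the ordered sequence $\lambda_{d,1} \ge \lambda_{d,2} \ge \cdots$ is obtained here only by enumerating all multi-indices, which reflects the remark that the map $j \mapsto (j_1, \dots, j_d)$ has no simple explicit form.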


Finding necessary and sufficient conditions on the $\tilde\lambda_j$ for the different kinds of exponential tractability turns out to be a technically difficult question. Necessary and sufficient conditions have been considered in the paper [8]. Here we re-prove the results in [8], using a completely different technique, namely using criteria that have been shown very recently in the paper [2]. In some cases, our new technique enables us to obtain the desired results using shorter and/or less technical arguments than those that were used in [8].

2 Results

In this section we show results on tractability conditions in terms of the eigenvalues of the operator $W_1$. Note that, if all of the $\lambda_j$ equal zero, then the operators $S_d$ are all zero, and $n_{\mathrm{ABS/NOR}}(\varepsilon, S_d) = 0$ for all $d \ge 1$. Furthermore, if only $\lambda_1 > 0$ and $\lambda_2 = \lambda_3 = \cdots = 0$ (remember that the $\lambda_j$ are ordered), it can be shown that $n_{\mathrm{ABS/NOR}}(\varepsilon, S_d) \le 1$ for all $d \ge 1$. Hence, the problem is interesting only if at least two of the $\lambda_j$ are positive, which we assume from now on.

Before we state our main result, we state two technical lemmas and a theorem proved elsewhere. The first lemma is well known.

Lemma 1. For any $n \in \mathbb{N}$ and $a_1, a_2, \ldots, a_n \ge 0$ we have:
• For $s \ge 1$,
\[ \frac{1}{n^{s-1}} (a_1 + \cdots + a_n)^s \le a_1^s + \cdots + a_n^s \le (a_1 + \cdots + a_n)^s . \]
• For $s \le 1$,
\[ (a_1 + \cdots + a_n)^s \le a_1^s + \cdots + a_n^s \le n^{1-s} (a_1 + \cdots + a_n)^s . \]

Lemma 2. For all $k, n \in \mathbb{N}$ with $k < n$, it follows that
\[ \max\left\{ \left(\frac{n}{k}\right)^{k}, \left(\frac{n}{n-k}\right)^{n-k} \right\} \le \binom{n}{k} \le \min\left\{ \left(\frac{en}{k}\right)^{k}, \left(\frac{en}{n-k}\right)^{n-k} \right\} . \]

Proof. It is easy to see that
\[ \left(\frac{n}{k}\right)^{k} = \frac{n}{k} \cdots \frac{n}{k} \le \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1} = \binom{n}{k} = \binom{n}{n-k} \le \frac{n^k}{k!} = \left(\frac{n}{k}\right)^{k} \frac{k^k}{k!} \le \left(\frac{en}{k}\right)^{k} , \]
and the estimates of the lemma follow.
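The two-sided bound of Lemma 2 is easy to test numerically. The following is a minimal sketch, not part of the original argument; the function names `lemma2_bounds` and `lemma2_holds` are hypothetical, and a small relative tolerance is assumed to absorb floating-point rounding in the cases where the bounds are attained with equality.

```python
from math import comb, e


def lemma2_bounds(n, k):
    """Return (lower bound, binomial coefficient, upper bound) from Lemma 2."""
    lower = max((n / k) ** k, (n / (n - k)) ** (n - k))
    upper = min((e * n / k) ** k, (e * n / (n - k)) ** (n - k))
    return lower, comb(n, k), upper


def lemma2_holds(n_max, rel_tol=1e-12):
    """Check the bounds for all 1 <= k < n <= n_max (tolerance is an assumption)."""
    for n in range(2, n_max + 1):
        for k in range(1, n):
            lower, binom, upper = lemma2_bounds(n, k)
            if lower > binom * (1 + rel_tol) or binom > upper * (1 + rel_tol):
                return False
    return True
```

For instance, `lemma2_bounds(10, 3)` returns the exact binomial coefficient 120 between its two floating-point bounds.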



Fred J. Hickernell, Peter Kritzer, Henryk Wo´zniakowski


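Theorem 1. [2, Theorem 3] $S$ is EXP-(s,t)-WT-ABS/NOR iff
\[ \sup_{d \in \mathbb{N}} \sigma_{\mathrm{EWT}}(d,s,t,c) < \infty \qquad \forall c > 0, \tag{5} \]
where
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) := \exp(-c\, d^{\,t}) \sum_{j=1}^{\infty} \exp\left( -c \left[ 1 + \log\left( 2 \max\left( 1, \frac{\mathrm{CRI}_d}{\lambda_{d,j}} \right) \right) \right]^{s} \right), \]
where
\[ \mathrm{CRI}_d = \begin{cases} 1 & \text{for ABS}, \\ \lambda_{d,1} & \text{for NOR}. \end{cases} \]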

We now state and prove the main result of this article.

Theorem 2. Let $\lambda_1 \ge \lambda_2 > 0$. Consider the conditions
\[ \lim_{n \to \infty} \frac{\log \lambda_n^{-1}}{(\log n)^{1/\min(s,t)}} = \infty, \tag{6} \]
\[ \lim_{n \to \infty} \frac{\log \lambda_n^{-1}}{(\log n)^{1/s}} = \infty, \tag{7} \]
\[ \lim_{n \to \infty} \frac{\log \lambda_n^{-1}}{(\log n)^{1/\eta}} = \infty, \tag{8} \]
where $\eta$ is given below.

EXP-(s,t)-WT-ABS holds iff one of the following conditions is true:
• (A.1): $t > 1$, $s > 1$, $\lambda_1 > 1$, and (6) holds or
• (A.2): $t > 1$, $s \ge 1$, $\lambda_1 \le 1$, and (7) holds or
• (A.3): $t > 1$, $s < 1$, and (8) holds with $\eta = s(t-1)/(t-s)$ or
• (A.4): $t \le 1$, $s > 1$, $\lambda_1 \le 1$, $\lambda_2 < 1$, and (7) holds.

EXP-(s,t)-WT-NOR holds iff one of the following conditions is true:
• (N.1): $t > 1$, $s \ge 1$, and (7) holds or
• (N.2): $t > 1$, $s < 1$, and (8) holds with $\eta = s(t-1)/(t-s)$ or
• (N.3): $t \le 1$, $s > 1$, $\lambda_1 > \lambda_2$, and (7) holds.

Furthermore, EXP-UWT, EXP-QPT, EXP-PT, and EXP-SPT do not hold under any conditions on $\lambda_j$, i.e., even for $\lambda_3 = \lambda_4 = \cdots = 0$.

Proof. We know from Theorem 1 that EXP-(s,t)-WT holds iff


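\[ \sup_{d \in \mathbb{N}} \sigma_{\mathrm{EWT}}(d,s,t,c) < \infty \qquad \forall c > 0, \tag{9} \]
where
\[ \begin{aligned}
\sigma_{\mathrm{EWT}}(d,s,t,c) &:= \sum_{j=1}^{\infty} \exp\left( -c \left( d^{\,t} + \left[ \log(2e) + \log \max\left( 1, \mathrm{CRI}_d / \lambda_{d,j} \right) \right]^{s} \right) \right) \\
&= \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left( d^{\,t} + \left[ \log\left( 2e \max\left( 1, \prod_{\ell=1}^{d} \mathrm{CRI} / \lambda_{j_\ell} \right) \right) \right]^{s} \right) \right),
\end{aligned} \tag{10} \]
where
\[ \mathrm{CRI}_d = \begin{cases} 1 & \text{for ABS}, \\ \lambda_{d,1} & \text{for NOR}, \end{cases} \qquad \text{and} \qquad \mathrm{CRI} = \begin{cases} 1 & \text{for ABS}, \\ \lambda_1 & \text{for NOR}. \end{cases} \tag{11} \]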

We first show the necessity of the conditions on the eigenvalues $\lambda_1$ and $\lambda_2$ for ABS and NOR, and then the necessity of the conditions (6), (7), or (8), depending on the different cases. Then, we show the sufficient conditions for all the cases (A.1)–(A.4) and (N.1)–(N.3).

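Necessary Conditions

Case I: $t \le 1$ & $\lambda_1 > 1$ $\Longrightarrow$ NO EXP-(s,t)-WT-ABS

Choose the smallest non-negative $r$ such that $\lambda_2 \ge \lambda_1^{-r}$, and for every $d > r + 2$, let $k = \lfloor d/(r+2) \rfloor$. Then it follows that
\[ k \le \frac{d}{r+2}, \qquad d - k - kr \ge d \left( 1 - \frac{r+1}{r+2} \right) = \frac{d}{r+2} . \]
Focusing on just these eigenvalues of the form
\[ \lambda_{d,j} = \lambda_1^{d-k} \lambda_2^{k} = \lambda_1^{d-k-kr} \lambda_1^{kr} \lambda_2^{k} \ge \lambda_1^{d-k-kr} \lambda_1^{kr} \lambda_1^{-kr} = \lambda_1^{d-k-kr} \ge \lambda_1^{d/(r+2)} > 1, \]
$\sigma_{\mathrm{EWT}}(d,s,t,c)$ has the following lower bound via (10) and Lemma 2:
\[ \begin{aligned}
\sigma_{\mathrm{EWT}}(d,s,t,c) &\ge \binom{d}{k} \exp\left( -c \left( d^{\,t} + \left[ \log(2e) + \log \max\left( 1, \lambda_{d,j}^{-1} \right) \right]^{s} \right) \right) \\
&\ge \binom{d}{k} \exp\left( -c \left( d^{\,t} + [\log(2e)]^{s} \right) \right) \\
&\ge (r+2)^{d/(r+2)-1} \exp\left( -c \left( d^{\,t} + [\log(2e)]^{s} \right) \right) \\
&= \frac{\left[ (r+2)^{1/(r+2)} \right]^{d}}{r+2} \exp\left( -c \left( d^{\,t} + [\log(2e)]^{s} \right) \right) .
\end{aligned} \]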

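Since $(r+2)^{1/(r+2)} > 1$ and $t \le 1$, we have $\sigma_{\mathrm{EWT}}(d,s,t,c) \to \infty$ for small $c$ as $d \to \infty$, regardless of the value of $s$. Hence, we do not have EXP-(s,t)-WT-ABS.

Case II: $t \le 1$ & $\lambda_1 \ge \lambda_2 \ge 1$ $\Longrightarrow$ NO EXP-(s,t)-WT-ABS

We have $2^d$ eigenvalues no smaller than 1. Therefore
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) \ge 2^{d} \exp\left( -c \left( d^{\,t} + [\log(2e)]^{s} \right) \right) \to \infty \]
for small $c$ independently of $s$. Hence, we do not have EXP-(s,t)-WT-ABS.

Case III: $t \le 1$ & $s \le 1$ $\Longrightarrow$ NO EXP-(s,t)-WT-ABS

We have $2^d$ eigenvalues no smaller than $\lambda_2^{d}$. We then have
\[ \begin{aligned}
\sigma_{\mathrm{EWT}}(d,s,t,c) &\ge 2^{d} \exp\left( -c \left( d^{\,t} + \left[ \log\left( 2e \max(1, \lambda_2^{-d}) \right) \right]^{s} \right) \right) \\
&= \exp\left( d \log 2 - c\, d^{\,t} - c \left[ \log(2e) + d \log \max(1, \lambda_2^{-1}) \right]^{s} \right) .
\end{aligned} \]
Since $s,t \le 1$ and since $c$ can be arbitrarily small, we see that this latter term is not bounded for $d \to \infty$. Hence, we do not have EXP-(s,t)-WT-ABS.

From the analysis of all these cases, we see that EXP-(s,t)-WT-ABS may only hold when $t > 1$, or when $t \le 1 < s$, $\lambda_1 \le 1$, and $\lambda_2 < 1$. This completes the proof of the necessary conditions on $\lambda_1$ and $\lambda_2$ for ABS.

We turn to the necessary conditions on $\lambda_1$ and $\lambda_2$ for NOR. This corresponds to considering the ratios $\lambda_{d,1}/\lambda_{d,j}$, which are at least 1. We know that EXP-(s,t)-WT-NOR holds iff $\sup_{d \in \mathbb{N}} \sigma_{\mathrm{EWT}}(d,s,t,c) < \infty$ for all $c > 0$, where
\[ \begin{aligned}
\sigma_{\mathrm{EWT}}(d,s,t,c) &= \sum_{j=1}^{\infty} \exp\left( -c \left( d^{\,t} + \left[ \log(2e) + \log \frac{\lambda_{d,1}}{\lambda_{d,j}} \right]^{s} \right) \right) \\
&= \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left( d^{\,t} + \left[ \log\left( 2e \prod_{\ell=1}^{d} \frac{\lambda_1}{\lambda_{j_\ell}} \right) \right]^{s} \right) \right) .
\end{aligned} \]
Hence, it is the same as ABS if we assume that $\lambda_1 = 1$. Using the previous results on necessary conditions for ABS with $\lambda_1 = 1$ we obtain the results for the parameters $s$, $t$, $\lambda_1$, and $\lambda_2$ for NOR.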
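The lower bound of Case II can be explored numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it evaluates the logarithm of $2^d \exp(-c(d^t + [\log(2e)]^s))$ to avoid overflow, and the function name `log_case2_lower_bound` and the sample parameters are hypothetical.

```python
from math import e, log


def log_case2_lower_bound(d, s, t, c):
    """Logarithm of 2^d * exp(-c * (d^t + [log(2e)]^s)), the Case II lower bound."""
    return d * log(2.0) - c * (d ** t + log(2.0 * e) ** s)


# Illustrative parameters (assumed): with t <= 1 and c < log 2 the bound grows with d.
growing = [log_case2_lower_bound(d, s=2.0, t=1.0, c=0.1) for d in (10, 100, 1000)]

# For contrast (also assumed): with t > 1 the factor exp(-c d^t) eventually dominates.
decaying = [log_case2_lower_bound(d, s=2.0, t=1.5, c=0.1) for d in (10, 100, 1000)]
```

The first list increases without bound in $d$, which is the mechanism that rules out EXP-(s,t)-WT-ABS for $t \le 1$; the second list shows why the same lower bound gives no obstruction once $t > 1$.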


Next, we show the necessity of the conditions (6), (7), or (8), depending on the different cases.

Necessity of (6): The necessity of (6) for the corresponding subcases follows from Items L1 and L2 of Lemma 1 in [8]. We remark that these are only based on general definitions, and do not require the technical results used in the proof of the main theorem of [8].

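Necessity of (7): Assume first that EXP-(s,t)-WT-ABS/NOR holds and that the parameters $t$, $s$, $\lambda_1$, and $\lambda_2$ are as in Case (A.2), (A.4), (N.1), or (N.3), respectively. We prove that (7) holds. Take $d = 1$. Then we know that
\[ \lim_{\varepsilon \to 0} \frac{\log \max(1, n_{\mathrm{ABS/NOR}}(\varepsilon, S_1))}{(\log \varepsilon^{-1})^{s}} = 0 . \]
This means that for any (small) positive $\beta$ there is a positive $\varepsilon_\beta$ such that
\[ \log \max\left( 1, n_{\mathrm{ABS/NOR}}(\varepsilon, S_1) \right) \le \beta (\log \varepsilon^{-1})^{s} \qquad \text{for all } \varepsilon \le \varepsilon_\beta, \]
and equivalently
\[ \varepsilon^{2} \le \exp\left( -\frac{2}{\beta^{1/s}} \left[ \log \max(1, n_{\mathrm{ABS/NOR}}(\varepsilon, S_1)) \right]^{1/s} \right) \qquad \text{for all } \varepsilon \le \varepsilon_\beta . \]
Let $n = n_{\mathrm{ABS/NOR}}(\varepsilon, S_1)$. Since $\lambda_{n+1} \le \varepsilon^{2}\, \mathrm{CRI}$, with $\mathrm{CRI} = 1$ for ABS and $\mathrm{CRI} = \lambda_1$ for NOR, we obtain for $n \ge \max(1, n_{\mathrm{ABS/NOR}}(\varepsilon_\beta, S_1))$,
\[ \log \lambda_{n+1}^{-1} \ge \frac{2}{\beta^{1/s}} (\log n)^{1/s} + \log \mathrm{CRI}^{-1} . \]
Since $2/\beta^{1/s}$ can be arbitrarily large, this yields (7).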

Necessity of (8): The necessity of (8) for the corresponding subcases follows from Items L1 and L2 of Lemma 1 in [8]. We remark that these are only based on general definitions, and do not require the technical results used in the proof of the main theorem of [8].


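Sufficient Conditions

For technical reasons, we begin the proof by showing Case (A.2).

(A.2): $t > 1$, $s \ge 1$, $\lambda_1 \le 1$ & (7) $\Longrightarrow$ EXP-(s,t)-WT-ABS

Due to the assumption that $\lambda_1 \le 1$, it is clear that $\lambda_j^{-1} \ge 1$ for all $j$. We then have
\[ \begin{aligned}
\sigma_{\mathrm{EWT}}(d,s,t,c) &= \exp(-c\, d^{\,t}) \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left[ \log(2e) + \log \prod_{\ell=1}^{d} \frac{1}{\lambda_{j_\ell}} \right]^{s} \right) \\
&\le \exp(-c\, d^{\,t}) \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left[ \log \prod_{\ell=1}^{d} \frac{1}{\lambda_{j_\ell}} \right]^{s} \right) \\
&= \exp(-c\, d^{\,t}) \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left[ \sum_{\ell=1}^{d} \log \frac{1}{\lambda_{j_\ell}} \right]^{s} \right) \\
&\le \exp(-c\, d^{\,t}) \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \sum_{\ell=1}^{d} \left[ \log \frac{1}{\lambda_{j_\ell}} \right]^{s} \right),
\end{aligned} \]
where we used $s \ge 1$ and Lemma 1 in the last step. This yields
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) \le \exp(-c\, d^{\,t}) \left( \sum_{j=1}^{\infty} \exp\left( -c \left[ \log(1/\lambda_j) \right]^{s} \right) \right)^{d} . \]
Since Condition (7) holds, we know that
\[ \log(1/\lambda_j) = h_j (\log(j+1))^{1/s}, \]
where $(h_j)_{j \ge 1}$ is a sequence with $\lim_{j \to \infty} h_j = \infty$, i.e.,
\[ \begin{aligned}
\sigma_{\mathrm{EWT}}(d,s,t,c) &\le \exp(-c\, d^{\,t}) \left( \sum_{j=1}^{\infty} \exp\left( -c \left[ h_j (\log(j+1))^{1/s} \right]^{s} \right) \right)^{d} \\
&= \exp(-c\, d^{\,t}) \left( \sum_{j=1}^{\infty} \exp\left( -c\, h_j^{s} \log(j+1) \right) \right)^{d} \\
&= \exp(-c\, d^{\,t}) \left( \sum_{j=1}^{\infty} \exp\left( c\, h_j^{s} \log\left( 1/(j+1) \right) \right) \right)^{d} \\
&= \exp(-c\, d^{\,t}) \left( \sum_{j=1}^{\infty} \left( \frac{1}{j+1} \right)^{c\, h_j^{s}} \right)^{d} \\
&= \exp(-c\, d^{\,t})\, A_c^{d},
\end{aligned} \]
where
\[ A_c = \sum_{j=1}^{\infty} \left( \frac{1}{j+1} \right)^{c\, h_j^{s}} \]
is well defined and independent of $d$, since $c\, h_j^{s}$ is greater than one for sufficiently large $j$, and the series is convergent. Hence,
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) \le \exp(-c\, d^{\,t}) \exp(d \log A_c) . \]
As $t > 1$, we obtain EXP-(s,t)-WT-ABS.

We now show Case (A.1), and the other cases in the same order as they are stated in the theorem.

(A.1): $t > 1$, $s > 1$, $\lambda_1 > 1$ & (6) $\Longrightarrow$ EXP-(s,t)-WT-ABS

Subcase (A.1.1): $s \le t$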

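We define $\beta_j := \lambda_j / \lambda_1$ for $j \ge 1$, and consider the information complexity with respect to $\beta_j$ instead of $\lambda_j$. We denote the information complexity with respect to the sequence $\beta = (\beta_j)_{j \ge 1}$ by $n_{\mathrm{ABS}}^{(\beta)}$, and that with respect to the sequence $\lambda = (\lambda_j)_{j \ge 1}$ by $n_{\mathrm{ABS}}^{(\lambda)}$. Then it is straightforward to see that
\[ n_{\mathrm{ABS}}^{(\lambda)}(\varepsilon, S_d) = n_{\mathrm{ABS}}^{(\beta)}\left( \varepsilon / \lambda_1^{d/2}, S_d \right) . \]
Since $s \le t$, we have, by Lemma 1,
\[ d^{\,t} + \left( d \log(\lambda_1^{1/2}) + \log(1/\varepsilon) \right)^{s} \le d^{\,t} + 2^{s-1} d^{\,s} \left( \log(\lambda_1^{1/2}) \right)^{s} + 2^{s-1} (\log(1/\varepsilon))^{s} \le C_s \left( d^{\,t} + (\log(1/\varepsilon))^{s} \right) \]
for some positive constant $C_s$ depending on $s$, but not on $d$ or $\varepsilon$. This implies
\[ \frac{\log n_{\mathrm{ABS}}^{(\lambda)}(\varepsilon, S_d)}{d^{\,t} + \left( 1 + \log \max(1, \varepsilon^{-1}) \right)^{s}} \le \frac{\log n_{\mathrm{ABS}}^{(\beta)}\left( \varepsilon / \lambda_1^{d/2}, S_d \right)}{C_s \left[ d^{\,t} + \left( 1 + \log \max\left( 1, \varepsilon^{-1} \lambda_1^{d/2} \right) \right)^{s} \right]} \]
for some positive constant $C_s$. This means that EXP-(s,t)-WT-ABS holds with respect to $\lambda$ if it holds with respect to $\beta$. However, as $\beta_j \le 1$ for all $j \ge 1$, and since in this subcase $\min(s,t) = s$, the result follows from Case (A.2) above.

Subcase (A.1.2): $s > t$

Assume that (6) holds with $\min(s,t) = t$. We need to show that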


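\[ \lim_{d + \varepsilon^{-1} \to \infty} \frac{\log n_{\mathrm{ABS}}^{(\lambda)}(\varepsilon, S_d)}{d^{\,t} + \left( 1 + \log \max(1, \varepsilon^{-1}) \right)^{s}} = 0 . \]
However, note that
\[ \frac{\log n_{\mathrm{ABS}}^{(\lambda)}(\varepsilon, S_d)}{d^{\,t} + \left( 1 + \log \max(1, \varepsilon^{-1}) \right)^{s}} \le \frac{\log n_{\mathrm{ABS}}^{(\lambda)}(\varepsilon, S_d)}{d^{\,t} + \left( 1 + \log \max(1, \varepsilon^{-1}) \right)^{t}}, \]
and that
\[ \lim_{d + \varepsilon^{-1} \to \infty} \frac{\log n_{\mathrm{ABS}}^{(\lambda)}(\varepsilon, S_d)}{d^{\,t} + \left( 1 + \log \max(1, \varepsilon^{-1}) \right)^{t}} = 0 \]
by Case (A.1.1). This shows the result.

(A.3): $t > 1$, $s < 1$, & (8) with $\eta = s(t-1)/(t-s)$ $\Longrightarrow$ EXP-(s,t)-WT-ABS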

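Subcase (A.3.1): $\lambda_1 \le 1$

We study the expression
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) \le \exp(-c\, d^{\,t}) \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left[ \sum_{\ell=1}^{d} \log(1/\lambda_{j_\ell}) \right]^{s} \right) . \]
Note that the definition of $\eta$ together with $s < 1$ implies that $\eta < s < 1$, and $1/\eta > 1/s > 1$. Note furthermore that Condition (8) implies
\[ \log(1/\lambda_j) = h_j (\log(j+1))^{1/\eta} = h_j (\log(j+1))^{1/\eta - 1/s} (\log(j+1))^{1/s}, \tag{12} \]
where $(h_j)_{j \ge 1}$ is a sequence with $\lim_{j \to \infty} h_j = \infty$. Using (12) and the second item in Lemma 1, we obtain
\[ \begin{aligned}
\left[ \sum_{\ell=1}^{d} \log(1/\lambda_{j_\ell}) \right]^{s} &= \left[ \sum_{\ell=1}^{d} h_{j_\ell} (\log(j_\ell+1))^{1/\eta - 1/s} (\log(j_\ell+1))^{1/s} \right]^{s} \\
&\ge d^{\,s-1} \sum_{\ell=1}^{d} (\log(j_\ell+1))^{(s-\eta)/\eta}\, h_{j_\ell}^{s} \log(j_\ell+1) .
\end{aligned} \]
Consequently,
\[ \begin{aligned}
\sigma_{\mathrm{EWT}}(d,s,t,c) &\le \exp(-c\, d^{\,t}) \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c\, d^{\,s-1} \sum_{\ell=1}^{d} (\log(j_\ell+1))^{(s-\eta)/\eta}\, h_{j_\ell}^{s} \log(j_\ell+1) \right) \\
&= \exp(-c\, d^{\,t}) \left( \sum_{j=1}^{\infty} \exp\left( -c\, d^{\,s-1} (\log(j+1))^{(s-\eta)/\eta}\, h_j^{s} \log(j+1) \right) \right)^{d} .
\end{aligned} \]
Note that
\[ d^{\,s-1} (\log(j+1))^{(s-\eta)/\eta} \ge (c/2)^{(s-\eta)/\eta} \]
if and only if
\[ j \ge \exp\left( (c/2)\, d^{\,(1-s)\eta/(s-\eta)} \right) - 1 =: J_0 = J_0(c,d,s,\eta) . \]
This implies
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) \le \exp(-c\, d^{\,t}) \left( J_0 + \sum_{j=J_0}^{\infty} \exp\left( -c^{s/\eta}\, 2^{(\eta-s)/\eta}\, h_j^{s} \log(j+1) \right) \right)^{d} . \]
In the same way as in Case (A.2), we conclude that there exists a positive constant $A_c$ such that
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) \le \exp(-c\, d^{\,t}) (J_0 + A_c)^{d} \le \exp(-c\, d^{\,t}) \left[ \exp\left( (c/2)\, d^{\,(1-s)\eta/(s-\eta)} \right) + A_c \right]^{d} . \]
Since $A_c$ is independent of $d$, for sufficiently large $d$,
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) \le \exp(-c\, d^{\,t}) \left[ 2 \exp\left( (c/2)\, d^{\,(1-s)\eta/(s-\eta)} \right) \right]^{d} = \exp(-c\, d^{\,t})\, 2^{d} \exp\left( (c/2)\, d^{\,(1-s)\eta/(s-\eta)+1} \right) . \]
It is easily checked that $(1-s)\eta/(s-\eta) + 1 = t$, so we obtain
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) \le \exp(-c\, d^{\,t})\, 2^{d} \exp\left( (c/2)\, d^{\,t} \right) = \exp\left( -(c/2)\, d^{\,t} \right) \exp(d \log 2) . \]
As $t > 1$, we obtain EXP-(s,t)-WT-ABS.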
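The identity $(1-s)\eta/(s-\eta) + 1 = t$ used in Subcase (A.3.1), together with the ordering $\eta < s < 1$, can be confirmed in exact rational arithmetic. This is only a sanity check under assumed sample values of $s$ and $t$; the helper names `eta` and `exponent` are hypothetical.

```python
from fractions import Fraction


def eta(s, t):
    """eta = s (t - 1) / (t - s), as in conditions (A.3) and (N.2)."""
    return s * (t - 1) / (t - s)


def exponent(s, t):
    """The exponent (1 - s) eta / (s - eta) + 1 appearing in Subcase (A.3.1)."""
    h = eta(s, t)
    return (1 - s) * h / (s - h) + 1


# Sample parameters with t > 1 and 0 < s < 1 (chosen for illustration only).
samples = [(Fraction(1, 2), Fraction(3)), (Fraction(1, 3), Fraction(5, 4)),
           (Fraction(9, 10), Fraction(2))]
checks = [(exponent(s, t) == t, eta(s, t) < s) for s, t in samples]
```

The underlying algebra is $s - \eta = s(1-s)/(t-s)$, so that $(1-s)\eta/(s-\eta) = t - 1$.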

Subcase (A.3.2): $\lambda_1 > 1$

We again define $\beta_j := \lambda_j / \lambda_1$ for $j \ge 1$, and consider the information complexity with respect to $\beta_j$ instead of $\lambda_j$. Then, as in Case (A.1.1),
\[ n_{\mathrm{ABS}}^{(\lambda)}(\varepsilon, S_d) = n_{\mathrm{ABS}}^{(\beta)}\left( \varepsilon / \lambda_1^{d/2}, S_d \right) . \]
Since $t > 1$ and $s < 1$, we have
\[ d^{\,t} + \left( d \log(\lambda_1^{1/2}) + \log(1/\varepsilon) \right)^{s} \approx d^{\,t} + (\log(1/\varepsilon))^{s}, \]
and this implies that EXP-(s,t)-WT-ABS holds with respect to $\beta$ if and only if it holds with respect to $\lambda$. However, as $\beta_j \le 1$ for all $j \ge 1$, the result now follows from Case (A.3.1), similar to Case (A.1.1).

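(A.4): $t \le 1$, $s > 1$, $\lambda_1 \le 1$, $\lambda_2 < 1$ & (7) $\Longrightarrow$ EXP-(s,t)-WT-ABS

Due to the assumption that $\lambda_1 \le 1$, we again have $\lambda_j^{-1} \ge 1$ for all $j$, such that
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) \le \exp(-c\, d^{\,t}) \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left[ \log \prod_{\ell=1}^{d} \frac{1}{\lambda_{j_\ell}} \right]^{s} \right) . \]
Note that the approach taken in Case (A.2) does not work now since $t \le 1$. In the following we write $[d]$ for the index set $\{1, 2, \ldots, d\}$ and, for $u \subseteq [d]$, $\bar{u} = [d] \setminus u$. Due to (7), we can find $m \in \mathbb{N}$ such that
\[ \lambda_j^{-1} \ge \lambda_2^{-1} \exp\left( (\log j)^{1/s} (2/c)^{1/s} \right) \qquad \forall j \ge m+1 . \tag{13} \]
Now we study
\[ \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left[ \log \prod_{\ell=1}^{d} \frac{1}{\lambda_{j_\ell}} \right]^{s} \right) = \sum_{\substack{u \subseteq [d] \\ u = \{v_1, \ldots, v_{|u|}\} \\ \bar{u} = \{w_1, \ldots, w_{d-|u|}\}}} \; \sum_{j_{v_1}=1}^{m} \cdots \sum_{j_{v_{|u|}}=1}^{m} \; \sum_{j_{w_1}=m+1}^{\infty} \cdots \sum_{j_{w_{d-|u|}}=m+1}^{\infty} \exp\left( -c \left[ \log \prod_{\ell=1}^{d} \frac{1}{\lambda_{j_\ell}} \right]^{s} \right) . \tag{14} \]
Since $s > 1$, we have, for fixed $u \subseteq [d]$,
\[ \left[ \log \prod_{\ell=1}^{d} \frac{1}{\lambda_{j_\ell}} \right]^{s} \ge \left[ \log \prod_{\ell \in u} \frac{1}{\lambda_{j_\ell}} \right]^{s} + \left[ \log \prod_{\ell \in \bar{u}} \frac{1}{\lambda_{j_\ell}} \right]^{s}, \]
so the expression in (14) is bounded from above by
\[ \sum_{\substack{u \subseteq [d] \\ u = \{v_1, \ldots, v_{|u|}\} \\ \bar{u} = \{w_1, \ldots, w_{d-|u|}\}}} \left( \sum_{j_{v_1}=1}^{m} \cdots \sum_{j_{v_{|u|}}=1}^{m} \exp\left( -c \left[ \log \prod_{\ell \in u} \lambda_{j_\ell}^{-1} \right]^{s} \right) \right) \times \left( \sum_{j_{w_1}=m+1}^{\infty} \cdots \sum_{j_{w_{d-|u|}}=m+1}^{\infty} \exp\left( -c \left[ \log \prod_{\ell \in \bar{u}} \lambda_{j_\ell}^{-1} \right]^{s} \right) \right) . \]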


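We first study
\[ A_u := \sum_{j_{v_1}=1}^{m} \cdots \sum_{j_{v_{|u|}}=1}^{m} \exp\left( -c \left[ \log \prod_{\ell \in u} \frac{1}{\lambda_{j_\ell}} \right]^{s} \right) . \]
There are a total of $m^{|u|}$ terms of the form $\prod_{\ell \in u} \lambda_{j_\ell}^{-1}$ for $j_{v_1}, \ldots, j_{v_{|u|}} \in \{1, \ldots, m\}$ in $A_u$. For $k = 0, \ldots, |u|$ there are $(m-1)^{k} \binom{|u|}{k}$ of these terms containing the factor $\lambda_1^{-1}$ exactly $|u| - k$ times. Such terms are bounded below by $\lambda_2^{-k}$, so we obtain
\[ A_u \le \sum_{k=0}^{|u|} \binom{|u|}{k} (m-1)^{k} \exp\left( -c \left[ \log(\lambda_2^{-k}) \right]^{s} \right) . \]
We bound $\binom{|u|}{k}$ by $(e\,|u|/k)^{k}$ due to Lemma 2. Hence, we have
\[ A_u \le 1 + |u| \max_{k=1,\ldots,|u|} \exp(f(k)) \le 1 + \exp\left( \log(|u|) + f(k_{\max}) \right), \]
where
\[ \begin{aligned}
f(k) &= k + k \log(|u|/k) + k \log(m-1) - c\, k^{s} \left[ \log(\lambda_2^{-1}) \right]^{s}, \\
f'(k) &= \log(|u|/k) + \log(m-1) - c\, s\, k^{s-1} \left[ \log(\lambda_2^{-1}) \right]^{s}, \\
f(k_{\max}) &= \max_{k \in [1,|u|]} f(k) \ge \max_{k=1,\ldots,|u|} f(k) .
\end{aligned} \]
For $|u|$ large enough, we have
\[ \begin{aligned}
f'(1) &= \log(|u|) + \log(m-1) - c\, s \left( \log(\lambda_2^{-1}) \right)^{s} > 0, \\
f'(|u|) &= \log(m-1) - c\, s\, |u|^{s-1} \left( \log(\lambda_2^{-1}) \right)^{s} < 0,
\end{aligned} \]
hence, the maximum occurs in the interior. By setting $f'(k) = 0$, we obtain
\[ \begin{aligned}
0 &= \log(|u|/k_{\max}) + \log(m-1) - c\, s\, k_{\max}^{s-1} \left( \log(\lambda_2^{-1}) \right)^{s}, \\
f(k_{\max}) &= k_{\max} + k_{\max} \log(|u|/k_{\max}) + k_{\max} \log(m-1) - c\, k_{\max}^{s} \left[ \log(\lambda_2^{-1}) \right]^{s} \\
&= k_{\max} + c\,(s-1)\, k_{\max}^{s} \left[ \log(\lambda_2^{-1}) \right]^{s} .
\end{aligned} \]
The nonlinear equation defining $k_{\max}$ above implies that
\[ k_{\max} = O\left( (\log(|u|))^{1/(s-1)} \right), \qquad \text{and} \qquad f(k_{\max}) = O\left( (\log(|u|))^{s/(s-1)} \right) . \]
Consequently,
\[ A_u \le \exp\left( \log(|u|) + O\left( (\log(|u|))^{s/(s-1)} \right) \right) . \]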


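Due to the choice of $m$ in (13), we obtain
\[ \begin{aligned}
\left[ \log \prod_{\ell \in \bar{u}} \lambda_{j_\ell}^{-1} \right]^{s} &\ge \left[ \log \prod_{\ell \in \bar{u}} \lambda_2^{-1} + \log \prod_{\ell \in \bar{u}} \exp\left( (\log j_\ell)^{1/s} (2/c)^{1/s} \right) \right]^{s} \\
&= \left[ |\bar{u}| \log \lambda_2^{-1} + \sum_{\ell \in \bar{u}} (\log j_\ell)^{1/s} (2/c)^{1/s} \right]^{s} \\
&\ge |\bar{u}|^{s} \left[ \log \lambda_2^{-1} \right]^{s} + \frac{1}{c} \sum_{\ell \in \bar{u}} \log j_\ell^{2} .
\end{aligned} \]
Consequently,
\[ \begin{aligned}
B_u &:= \sum_{j_{w_1}=m+1}^{\infty} \cdots \sum_{j_{w_{d-|u|}}=m+1}^{\infty} \exp\left( -c \left[ \log \prod_{\ell \in \bar{u}} \lambda_{j_\ell}^{-1} \right]^{s} \right) \\
&\le \exp\left( -c\, |\bar{u}|^{s} \left[ \log \lambda_2^{-1} \right]^{s} \right) \sum_{j_{w_1}=m+1}^{\infty} \cdots \sum_{j_{w_{d-|u|}}=m+1}^{\infty} \exp\left( -c \sum_{\ell \in \bar{u}} \frac{1}{c} \log j_\ell^{2} \right) \\
&= \exp\left( -c\, |\bar{u}|^{s} \left[ \log \lambda_2^{-1} \right]^{s} \right) \sum_{j_{w_1}=m+1}^{\infty} \cdots \sum_{j_{w_{d-|u|}}=m+1}^{\infty} \frac{1}{j_{w_1}^{2}} \cdots \frac{1}{j_{w_{d-|u|}}^{2}} \\
&\le \exp\left( -c\, |\bar{u}|^{s} \left[ \log \lambda_2^{-1} \right]^{s} \right) (\zeta(2))^{|\bar{u}|} .
\end{aligned} \]
In total, we obtain
\[ \begin{aligned}
\sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left[ \log \prod_{\ell=1}^{d} \frac{1}{\lambda_{j_\ell}} \right]^{s} \right) &\le \sum_{u \subseteq [d]} \exp\left( -c\, |\bar{u}|^{s} \left[ \log \lambda_2^{-1} \right]^{s} + |\bar{u}| \log(\zeta(2)) + \log(|u|) + O\left( (\log(|u|))^{s/(s-1)} \right) \right) \\
&\le \exp\left( \log(d) + O\left( (\log(d))^{s/(s-1)} \right) \right) \sum_{u \subseteq [d]} \exp\left( -c\, |u|^{s} \left[ \log \lambda_2^{-1} \right]^{s} + |u| \log(\zeta(2)) \right),
\end{aligned} \]
where we used $\bar{u} = [d] \setminus u$ in the last step. Similarly as in the analysis of $A_u$, we see that the sum in the latter expression is bounded by
\[ \exp\left( \log(d) + O\left( (\log(d))^{s/(s-1)} \right) \right) . \]
This term grows more slowly with $d$ than $\exp(c\, d^{\,t})$, so its product with the prefactor $\exp(-c\, d^{\,t})$ remains bounded, and we obtain EXP-(s,t)-WT-ABS, as desired.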


(N.1): $t > 1$, $s \ge 1$ & (7) $\Longrightarrow$ EXP-(s,t)-WT-NOR

Note that in this case we have
\[ \sigma_{\mathrm{EWT}}(d,s,t,c) = \exp(-c\, d^{\,t}) \sum_{j_1=1}^{\infty} \cdots \sum_{j_d=1}^{\infty} \exp\left( -c \left[ \log(2e) + \log \prod_{\ell=1}^{d} \frac{\lambda_1}{\lambda_{j_\ell}} \right]^{s} \right), \]
since $\lambda_1 / \lambda_j \ge 1$ for all $j$. The rest of the argument is analogous to that in Case (A.2).

(N.2): $t > 1$, $s < 1$, & (8) with $\eta = s(t-1)/(t-s)$ $\Longrightarrow$ EXP-(s,t)-WT-NOR

This case can be treated in a similar way as Case (A.3), Subcase (A.3.1).

(N.3): $t \le 1$, $s > 1$, $\lambda_1 > \lambda_2$, & (7) $\Longrightarrow$ EXP-(s,t)-WT-NOR

This case can be treated in a similar way as Case (A.4).

Regarding all other tractability notions, we know from above that we do not have EXP-(s,t)-WT when $t \le 1$ and $s \le 1$. Since EXP-(s,t)-WT is a weaker tractability notion than all other tractability notions considered here, we cannot have any other stronger kind of tractability. This completes the proof of Theorem 2.

Acknowledgements The authors thank the MATRIX institute in Creswick, VIC, Australia, and its staff for supporting their stay during the program “On the Frontiers of High-Dimensional Computation” in June 2018. Furthermore, the authors thank the RICAM Special Semester Program 2018, during which parts of the paper were written. F. J. Hickernell gratefully acknowledges support by the United States National Science Foundation grant DMS-1522687. P. Kritzer gratefully acknowledges support by the Austrian Science Fund (FWF): Project F5506-N26, which is part of the Special Research Program “Quasi-Monte Carlo Methods: Theory and Applications”. H. Woźniakowski gratefully acknowledges the support of the National Science Centre, Poland, based on the decision DEC-2017/25/B/ST1/00945.

References

1. Chen, J., Wang, H.: Preasymptotics and asymptotics of approximation numbers of anisotropic Sobolev embeddings. J. Complexity 39, 94–110 (2017)
2. Kritzer, P., Woźniakowski, H.: Simple characterizations of exponential tractability for linear multivariate problems. To appear in J. Complexity (2019)
3. Kühn, Th., Mayer, S., Ullrich, T.: Counting via entropy: new preasymptotics for the approximation numbers of Sobolev embeddings. SIAM J. Numer. Anal. 54, 3625–3647 (2016)
4. Kühn, Th., Sickel, W., Ullrich, T.: Approximation of mixed order Sobolev functions on the d-torus: asymptotics, preasymptotics, and d-dependence. Constr. Approx. 42, 353–398 (2015)
5. Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems, Volume I: Linear Information. EMS, Zürich (2008)
6. Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems, Volume II: Standard Information for Functionals. EMS, Zürich (2010)
7. Novak, E., Woźniakowski, H.: Tractability of Multivariate Problems, Volume III: Standard Information for Operators. EMS, Zürich (2012)
8. Papageorgiou, A., Petras, I., Woźniakowski, H.: $(s, \ln^{\kappa})$-weak tractability of linear problems. J. Complexity 40, 1–16 (2017)
9. Siedlecki, P.: Uniform weak tractability. J. Complexity 29, 438–453 (2013)
10. Siedlecki, P., Weimar, M.: Notes on (s,t)-weak tractability: a refined classification of problems with (sub)exponential information complexity. J. Approx. Theory 200, 227–258 (2015)
11. Traub, J.F., Wasilkowski, G.W., Woźniakowski, H.: Information-Based Complexity. Academic Press, New York (1988)
12. Wang, H.: A note about EC-(s,t)-weak tractability of multivariate approximation with analytic Korobov kernels. Submitted (2019)
13. Werschulz, A., Woźniakowski, H.: A new characterization of (s,t)-weak tractability. J. Complexity 38, 68–79 (2017)

Worst-case error for unshifted lattice rules without randomisation

Yoshihito Kazashi and Ian H. Sloan

Yoshihito Kazashi: Mathematics Institute, CSQI, École Polytechnique Fédérale de Lausanne, Switzerland, e-mail: yoshihito.kazashi@epfl.ch
Ian H. Sloan: University of New South Wales, Sydney, NSW 2052, Australia, e-mail: [email protected]

© Springer Nature Switzerland AG 2020. J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_6

Abstract An existence result is presented for the worst-case error of lattice rules for high dimensional integration over the unit cube, in an unanchored weighted space of functions with square-integrable mixed first derivatives. Existing studies rely on random shifting of the lattice to simplify the analysis, whereas in this paper neither shifting nor any other form of randomisation is considered. Given that a certain number-theoretic conjecture holds, it is shown that there exists an $N$-point rank-one lattice rule which gives a worst-case error of order $1/\sqrt{N}$ up to a (dimension-independent) logarithmic factor. Numerical results suggest that the conjecture is plausible.

1 Introduction

This paper is concerned with an error estimate for a numerical integration rule for functions defined on the high-dimensional hypercube $[0,1)^s$, $s \in \mathbb{N}$,
\[ \int_{[0,1)^s} f(x)\, dx . \tag{1} \]
More specifically, we consider the worst-case error for rank-one lattice rules. The main contribution of this paper is the analysis of unshifted lattice rules without randomisation; we allow neither shifting nor any other form of randomisation. Given the truth of a certain conjecture with a number-theoretic flavour (Conjecture 1), our results show the existence of a deterministic cubature point set that attains the worst-case error of the order $1/\sqrt{N}$, up to a logarithmic factor, where $N$ is the number of cubature points, with a dimension-independent constant (Corollary 1).

An $N$-point rank-one lattice rule in $s$ dimensions is an equal-weight cubature rule (a quasi-Monte Carlo rule) for approximating the integral (1), of the form
\[ \frac{1}{N} \sum_{k=0}^{N-1} f(t_k), \tag{2} \]
with cubature points
\[ t_k = \left\{ \frac{kz}{N} \right\}, \qquad k = 0, \ldots, N-1, \tag{3} \]

for some $z \in \{1, \ldots, N-1\}^s$, where $\{x\} \in [0,1)^s$ for $x = (x_1, \ldots, x_s) \in [0,\infty)^s$ denotes the vector consisting of the fractional part of each component of $x$. The choice of $z$, known as the generating vector, completely determines the cubature points, and thus the quality of the cubature rule. Our interest in this paper lies in proving the existence of a good generating vector $z \in \{1, \ldots, N-1\}^s$.

The figure of merit we consider is the so-called worst-case error, defined by
\[ e(N,z) := e\left( N, (\{kz/N\})_k \right) := \sup_{f \in H_{s,\gamma},\ \|f\|_{H_{s,\gamma}} \le 1} \left| \int_{[0,1)^s} f(x)\, dx - \frac{1}{N} \sum_{k=0}^{N-1} f(\{kz/N\}) \right|, \]
where $H_{s,\gamma}$ is a suitable normed space consisting of non-periodic functions over $[0,1)^s$, specified below. As is standard nowadays, we will assume that the norm incorporates certain parameters $\gamma_u$, one for each subset $u \subseteq \{1, 2, \ldots, s\}$, since without weights integration problems are often intractable; see [1, 9] for more details.

It is natural to seek a generating vector $z$ that makes the worst-case error small. If $H_{s,\gamma}$ is a reproducing kernel Hilbert space then the worst-case error $e(N,z)$ can be computed for any value of $z$ (see below), but there is no known formula that gives a good value of $z$ for general $s$. The strategy we take in this paper is to prove an existence result, by considering the average of $e^2(N,z)$ over all possible generating vectors $z \in Z_N^s$, with $Z_N := \{1, 2, \ldots, N-1\}$, i.e. we compute
\[ e^2(N) := \frac{1}{(N-1)^s} \sum_{z \in Z_N^s} e^2(N,z); \tag{4} \]
and then use the well known principle that there must exist one choice of $z$ that is as good as average. With the support of a certain number-theoretic conjecture (Conjecture 1), which does not depend on the choice of $z$, we will show that
\[ e^2(N) \le \frac{C\, (\ln N)^{\alpha}}{N}, \]
with $C$ independent of $N$, where $\alpha > 0$ is an exponent appearing in the conjecture that depends on neither $s$ nor $N$. Moreover, $C$ is independent of $s$ for suitable weights


$\gamma_u$. It follows that there exists a generating vector $z^*$ for which the worst-case error $e(N, z^*)$ is bounded by $\sqrt{C}\, (\ln N)^{\alpha/2} / \sqrt{N}$ (Corollary 1).

For periodic function spaces, error estimates for rank-one lattice rules are well known; see [1, 4, 7, 8] and references therein. For non-periodic functions, with the aid of shifting (changing the cubature points from $\{kz/N\}$ to $\{kz/N + \Delta\}$ with elements $\Delta \in [0,1)^s$), good results have been obtained for shift-averaged worst-case errors; see [1] and references therein for more details. In the present paper, however, the function space is not periodic, and the worst-case error we consider is not shift-averaged. Approaches to estimating the error for lattice rules for non-periodic functions without randomisation include [2, 3], where a mapping called the tent transform was applied to the lattice rule. In this paper, however, no transformation of the lattice points is considered.

The shift-averaged worst-case error mentioned above is the expected worst-case error for randomly shifted lattice rules, see [1]. The present paper is a first step in our project to “derandomise” randomly shifted lattice rules, that is, to produce explicit shifts (for an untransformed rule) that give worst-case errors that lose no accuracy compared to the shift-averaged worst-case errors. While randomly shifted lattice rules have the advantages of providing us with an online error estimator and of being simple to analyse and construct, they are less efficient than a good deterministic rule, because of the need in practice to repeat the calculations of integrals with fixed $z$ for some number (say 30) of random shifts. In this first step in this programme, we study the case of zero shifts. (Experience suggests that this is a poor choice, perhaps the worst!)

There are related works in [5, 6] where a quantity called ‘R’, which is connected to the so-called (weighted) star discrepancy, was considered as the error criterion. In the weighted setting in [6], lattice rules can be constructed to achieve an $O(N^{-1+\delta})$ convergence rate for any $\delta > 0$, with the implied constant independent of $s$ and $N$ for suitable weights.

After establishing the setting in Section 2, the conjecture and the main results are stated in Section 3. Section 4 provides numerical evidence relating to the conjecture. Finally in Section 5 we give concluding remarks.

2 Preliminaries

In this section, we introduce the setting and recall some facts on lattice rules that will be needed later. Throughout this paper, we assume that $N$, the number of cubature points, is a prime number.

Let us start with a general reproducing kernel Hilbert space (RKHS) $H_s$ with a reproducing kernel $K : [0,1]^s \times [0,1]^s \to \mathbb{R}$ that satisfies
\[ \int_{[0,1]^s} \int_{[0,1]^s} K(x,y)\, dx\, dy < \infty . \]


It is well known that for a general quasi-Monte Carlo (QMC) rule (2), the square of the worst-case error in $H_s$,
\[ e\left( N, (t_k)_k \right) := \sup_{f \in H_s,\ \|f\|_{H_s} \le 1} \left| \int_{[0,1)^s} f(x)\, dx - \frac{1}{N} \sum_{k=0}^{N-1} f(t_k) \right|, \]
is given by
\[ e^2\left( N, (t_k)_k \right) = \int_{[0,1]^s} \int_{[0,1]^s} K(x,y)\, dx\, dy - \frac{2}{N} \sum_{k=0}^{N-1} \int_{[0,1]^s} K(t_k, x)\, dx + \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{k'=0}^{N-1} K(t_k, t_{k'}), \]
see for example [1, Theorem 3.5]. We specialise to the case
\[ \int_{[0,1]^s} K(x,y)\, dx = 1 \qquad \text{for any } y \in [0,1]^s, \]
to obtain
\[ e^2\left( N, (t_k)_k \right) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{k'=0}^{N-1} K(t_k, t_{k'}) - 1 . \tag{5} \]
In particular, for the QMC rule we here take an unshifted lattice rule with cubature points given by (3) for some $z \in Z_N^s$. Then, we have
\[ e^2(N,z) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{k'=0}^{N-1} K\left( \left\{ \frac{kz}{N} \right\}, \left\{ \frac{k'z}{N} \right\} \right) - 1 . \tag{6} \]
Now we further specialise the RKHS to $H_{s,\gamma}$ with kernel
\[ K_{s,\gamma}(x,y) = \sum_{u \subseteq \{1:s\}} \gamma_u \prod_{j \in u} \eta(x_j, y_j), \tag{7} \]
where
\[ \eta(x,y) := \frac{1}{2} B_2(|x-y|) + \left( x - \frac{1}{2} \right) \left( y - \frac{1}{2} \right), \qquad x, y \in [0,1] . \]
Here $B_2(t) = t^2 - t + 1/6$, $t \in \mathbb{R}$, is the Bernoulli polynomial of degree 2, $\{1:s\}$ is a shorthand notation for $\{1, 2, \ldots, s\}$, and the sum in (7) is over all subsets $u \subseteq \{1:s\}$, including the empty set; and $\gamma = \{\gamma_u\}_{u \subset \mathbb{N}}$ is an arbitrary collection of positive numbers called weights with $\gamma_{\emptyset} = 1$. The choice of weights plays an important role in deriving a dimension-independent error estimate, see Corollary 1. This space, discussed fully in [1], is an “unanchored” space of functions on the unit cube with square integrable mixed first derivatives. We again refer the readers to [1] for more details. For this space it follows from (5) that

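\[ e^2(N,z) = \sum_{\emptyset \ne u \subseteq \{1:s\}} \gamma_u\, e_u^2(N, z_u), \tag{8} \]
where for $u \subseteq \{1:s\}$ and $z_u = (z_j)_{j \in u}$, from (5) and (7)
\[ e_u^2(N, z_u) := \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{k'=0}^{N-1} \prod_{j \in u} \left[ \frac{1}{2} B_2\left( \left| \left\{ \frac{k z_j}{N} \right\} - \left\{ \frac{k' z_j}{N} \right\} \right| \right) + \left( \left\{ \frac{k z_j}{N} \right\} - \frac{1}{2} \right) \left( \left\{ \frac{k' z_j}{N} \right\} - \frac{1}{2} \right) \right] . \tag{9} \]
Thus the quantity $e_u^2(N, z_u)$ is a key to deriving an estimate for $e^2(N,z)$.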
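Formulas (6) to (9) translate directly into a small computation. The sketch below is illustrative only: it assumes product weights $\gamma_u = \prod_{j \in u} \gamma_j$ (a special case chosen here so that the kernel (7) factorises; the text allows general weights), and all function names are hypothetical. It evaluates $e^2(N,z)$ once through the factorised kernel in (6) and once through the subset decomposition (8) and (9), so that the two can be compared.

```python
from itertools import combinations
from math import prod


def bernoulli2(t):
    """Bernoulli polynomial B_2(t) = t^2 - t + 1/6."""
    return t * t - t + 1.0 / 6.0


def eta(x, y):
    """One-dimensional kernel factor eta(x, y) from the definition after (7)."""
    return 0.5 * bernoulli2(abs(x - y)) + (x - 0.5) * (y - 0.5)


def lattice_points(N, z):
    """Unshifted rank-one lattice points t_k = {k z / N}, k = 0, ..., N-1, as in (3)."""
    return [[((k * zj) % N) / N for zj in z] for k in range(N)]


def sq_wce_product_weights(N, z, gammas):
    """e^2(N, z) via (6), ASSUMING product weights gamma_u = prod_{j in u} gammas[j]."""
    pts = lattice_points(N, z)
    total = sum(prod(1.0 + g * eta(x, y) for g, x, y in zip(gammas, p, q))
                for p in pts for q in pts)
    return total / N ** 2 - 1.0


def sq_wce_by_subsets(N, z, gammas):
    """The same quantity via the decomposition (8)-(9), summing over nonempty u."""
    pts = lattice_points(N, z)
    s = len(z)
    total = 0.0
    for size in range(1, s + 1):
        for u in combinations(range(s), size):
            gamma_u = prod(gammas[j] for j in u)
            e_u = sum(prod(eta(p[j], q[j]) for j in u)
                      for p in pts for q in pts) / N ** 2
            total += gamma_u * e_u
    return total
```

The subset form costs a factor of order $2^s$ more work, which is why the factorised form is the practical one whenever product weights are an acceptable modelling choice.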

3 Existence result for worst-case error

In this section, we derive an existence result for the worst-case error. We first note the following property.

Proposition 1. Let $g$ be a function that satisfies $g(t) = g(1-t)$ for $t \in [0,1]$. Then for $a, b \ge 0$ we have
\[ g(|\{a\} - \{b\}|) = g(\{a - b\}), \]
where, as before, the braces indicate that we take the fractional part of the real number.

Proof. Note first that $\{a\}, \{b\} \in [0,1)$ and therefore $\{a\} - \{b\} \in (-1,1)$. It is clear that $\{a\} - \{b\}$ differs from $\{a - b\}$ by 1 or 0. If $\{a\} = \{b\}$, then $\{a - b\} = 0$ and the result is trivial. If $\{a\} > \{b\}$, then $\{a\} - \{b\} \in (0,1)$, and so $\{a\} - \{b\} = \{a - b\}$. Thus, again the result is trivial. If $\{a\} < \{b\}$, then $|\{a\} - \{b\}| = \{b\} - \{a\} \in (0,1)$ and so $|\{a\} - \{b\}| = \{b - a\}$. Thus, using $g(t) = g(1-t)$, $t \in [0,1]$, we have
\[ g(|\{a\} - \{b\}|) = g(\{b\} - \{a\}) = g(\{b - a\}) = g(1 - \{b - a\}) = g(\{a - b\}), \]
where in the last step we used the identity $\{t\} + \{-t\} = 1$ for $t \notin \mathbb{Z}$.
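Proposition 1 can be spot-checked numerically for $g = B_2$, which satisfies $B_2(t) = B_2(1-t)$. The following is a rough sketch with assumed sampling ranges and a hypothetical function name `max_discrepancy`; it is a plausibility check in floating point, not a substitute for the proof.

```python
import random


def bernoulli2(t):
    """B_2(t) = t^2 - t + 1/6, symmetric about t = 1/2."""
    return t * t - t + 1.0 / 6.0


def frac(x):
    """Fractional part {x}; Python's % maps negative inputs into [0, 1) as well."""
    return x % 1.0


def max_discrepancy(trials=1000, seed=0):
    """Largest |B_2(|{a}-{b}|) - B_2({a-b})| over random a, b >= 0 (ranges assumed)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a = rng.uniform(0.0, 10.0)
        b = rng.uniform(0.0, 10.0)
        lhs = bernoulli2(abs(frac(a) - frac(b)))
        rhs = bernoulli2(frac(a - b))
        worst = max(worst, abs(lhs - rhs))
    return worst
```

The returned discrepancy should be at the level of rounding error, consistent with the identity holding exactly.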



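In particular, Proposition 1 applies to the function $B_2(\cdot)$ so we can rewrite (9) as
\[ e_u^2(N, z_u) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{k'=0}^{N-1} \prod_{j \in u} \left[ \frac{1}{2} B_2\left( \left\{ \frac{(k - k') z_j}{N} \right\} \right) + \left( \left\{ \frac{k z_j}{N} \right\} - \frac{1}{2} \right) \left( \left\{ \frac{k' z_j}{N} \right\} - \frac{1}{2} \right) \right] . \tag{10} \]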

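Now we obtain the average over $z \in Z_N^s$. From (4) and (8) we have
\[ e^2(N) = \sum_{\emptyset \ne u \subseteq \{1:s\}} \gamma_u\, e_u^2(N), \tag{11} \]
where
\[ \begin{aligned}
e_u^2(N) &:= \frac{1}{(N-1)^s} \sum_{z \in Z_N^s} e_u^2(N, z_u) = \frac{1}{(N-1)^{|u|}} \sum_{z_u \in Z_N^{|u|}} e_u^2(N, z_u) \\
&= \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{k'=0}^{N-1} \left( X_{N;k,k'} + J_{N;k,k'} \right)^{|u|},
\end{aligned} \tag{12} \]
with
\[ X_{N;k,k'} := \frac{1}{2(N-1)} \sum_{z=1}^{N-1} B_2\left( \left\{ \frac{(k - k') z}{N} \right\} \right), \tag{13} \]
and
\[ J_{N;k,k'} := \frac{1}{N-1} \sum_{z=1}^{N-1} \left( \left\{ \frac{kz}{N} \right\} - \frac{1}{2} \right) \left( \left\{ \frac{k'z}{N} \right\} - \frac{1}{2} \right) . \tag{14} \]
Further, the binomial theorem gives
\[ e_u^2(N) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{k'=0}^{N-1} \sum_{v \subseteq u} \left( X_{N;k,k'} \right)^{|u \setminus v|} \left( J_{N;k,k'} \right)^{|v|} . \tag{15} \]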
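The averaging identity (12) can be confirmed by brute force for a small prime $N$. The sketch below uses exact rational arithmetic; the function names are hypothetical, and the choice $N = 5$, $|u| = 2$ in the usage note is only an example. It averages the summand of (10) over all $z_u \in Z_N^{|u|}$ and compares the result with the closed form built from (13) and (14).

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def frac(k, z, N):
    """Exact fractional part {k z / N} for integers k, z (k may be negative)."""
    return Fraction((k * z) % N, N)


def bernoulli2(t):
    return t * t - t + Fraction(1, 6)


def X(N, k, kp):
    """X_{N;k,k'} as defined in (13)."""
    return sum(bernoulli2(frac(k - kp, z, N)) for z in range(1, N)) / (2 * (N - 1))


def J(N, k, kp):
    """J_{N;k,k'} as defined in (14)."""
    return sum((frac(k, z, N) - HALF) * (frac(kp, z, N) - HALF)
               for z in range(1, N)) / (N - 1)


def e_u_sq(N, z_u):
    """e_u^2(N, z_u) computed from (10)."""
    total = Fraction(0)
    for k in range(N):
        for kp in range(N):
            term = Fraction(1)
            for zj in z_u:
                term *= (HALF * bernoulli2(frac(k - kp, zj, N))
                         + (frac(k, zj, N) - HALF) * (frac(kp, zj, N) - HALF))
            total += term
    return total / N ** 2


def average_bruteforce(N, size_u):
    """Average of e_u^2(N, z_u) over all z_u in {1, ..., N-1}^{|u|}."""
    zs = list(product(range(1, N), repeat=size_u))
    return sum(e_u_sq(N, z) for z in zs) / len(zs)


def average_formula(N, size_u):
    """Right-hand side of (12)."""
    return sum((X(N, k, kp) + J(N, k, kp)) ** size_u
               for k in range(N) for kp in range(N)) / N ** 2
```

For example, `average_bruteforce(5, 2) == average_formula(5, 2)` holds exactly, because the average over $z_u$ factorises across the coordinates $j \in u$.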

In seeking an error estimate for the generating-vector-averaged worst-case error $e^2(N)$, we take the point of view that estimates of order $1/N$ or higher are relatively harmless, so we concentrate on isolating the terms that converge more slowly. In the following two subsections, we derive estimates for $X_{N;k,k'}$ and $J_{N;k,k'}$. It turns out that, roughly speaking, the terms $(X_{N;k,k'})^{|u \setminus v|}$ yield the order $1/N$. The terms $(J_{N;k,k'})^{|v|}$ seem to converge more slowly, and require more detailed analysis.

3.1 Estimates for $X_{N;k,k'}$

We have the following expression for $X_{N;k,k'}$.

Lemma 1. For $N$ prime and $k, k' \in \{0, 1, \ldots, N-1\}$, the quantity $X_{N;k,k'}$ defined in (13) satisfies
\[ X_{N;k,k'} = \begin{cases} \dfrac{1}{12} & \text{if } k = k', \\[1ex] -\dfrac{1}{12N} & \text{if } k \ne k' . \end{cases} \tag{16} \]

Proof. For $k = k'$, we have $X_{N;k,k'} = \frac{1}{2} B_2(0) = \frac{1}{12}$. For $k \ne k'$, recalling the (absolutely convergent) series representation
\[ B_2(x) = \frac{1}{2\pi^2} \sum_{h \ne 0} \frac{\exp(2\pi i h x)}{h^2}, \qquad x \in [0,1], \]
we have
\[ \begin{aligned}
X_{N;k,k'} &= \frac{1}{4\pi^2 (N-1)} \sum_{h \ne 0} \frac{1}{h^2} \sum_{z=1}^{N-1} \exp\left( 2\pi i h (k - k') z / N \right) \\
&= \frac{1}{4\pi^2 (N-1)} \sum_{h \ne 0} \frac{1}{h^2} \left( \sum_{z=0}^{N-1} \exp\left( 2\pi i z h (k - k') / N \right) - 1 \right),
\end{aligned} \]
with
\[ \sum_{z=0}^{N-1} \exp\left( 2\pi i z h (k - k') / N \right) = \begin{cases} N & \text{if } h(k - k') \equiv_N 0, \\ 0 & \text{if } h(k - k') \not\equiv_N 0 . \end{cases} \]
Throughout this paper, the notation $a \equiv_N b$ means that $a \equiv b \pmod{N}$, and similarly $a \not\equiv_N b$ means that $a \not\equiv b \pmod{N}$. Since $N$ is prime and $k \ne k'$, we conclude that all possible values of $k - k'$, namely, $\pm 1, \pm 2, \ldots, \pm(N-1)$, are relatively prime to $N$, and so $h(k - k') \equiv_N 0 \iff h \equiv_N 0$. Thus
\[ \begin{aligned}
X_{N;k,k'} &= \frac{1}{4\pi^2 (N-1)} \left( N \sum_{\substack{h \ne 0 \\ h \equiv_N 0}} \frac{1}{h^2} - \sum_{h \ne 0} \frac{1}{h^2} \right) = \frac{1}{4\pi^2 (N-1)} \left( N \sum_{\ell \ne 0} \frac{1}{(\ell N)^2} - \frac{\pi^2}{3} \right) \\
&= \frac{1}{4\pi^2 (N-1)} \left( \frac{N}{N^2} \frac{\pi^2}{3} - \frac{\pi^2}{3} \right) = -\frac{1}{12N},
\end{aligned} \]
which completes the proof.

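We deduce the following estimate for $e_u^2(N)$.

Proposition 2. For $N$ prime, the quantity $e_u^2(N)$ defined in (15) satisfies
\[ e_u^2(N) \le c_u \frac{1}{N} + \frac{1}{N^2} \sum_{k=1}^{N-1} \sum_{k'=1}^{N-1} \left( J_{N;k,k'} \right)^{|u|}, \]
with
\[ c_u := \frac{2}{3^{|u|}} + \frac{1}{4^{|u|}} . \]

Proof. On separating out the diagonal terms of (15), we have
\[ e_u^2(N) = \frac{1}{N^2} \sum_{k=0}^{N-1} \left( X_{N;k,k} + J_{N;k,k} \right)^{|u|} + \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{\substack{k'=0 \\ k' \ne k}}^{N-1} \sum_{v \subseteq u} \left( X_{N;k,k'} \right)^{|u \setminus v|} \left( J_{N;k,k'} \right)^{|v|} . \tag{17} \]
From $X_{N;k,k} = \frac{1}{12}$ and $0 \le J_{N;k,k} \le \frac{1}{N-1} \sum_{z=1}^{N-1} \frac{1}{4} = \frac{1}{4}$, the first term in (17) can be bounded by
\[ \frac{1}{N^2} \sum_{k=0}^{N-1} \left( X_{N;k,k} + J_{N;k,k} \right)^{|u|} \le \frac{1}{3^{|u|} N} . \]
For the second term in (17), noting $|J_{N;k,k'}| \le \frac{1}{4}$, from Lemma 1 we have for any $v \subseteq u$
\[ \left| \left( X_{N;k,k'} \right)^{|u \setminus v|} \left( J_{N;k,k'} \right)^{|v|} \right| \le \frac{1}{(12N)^{|u \setminus v|}\, 4^{|v|}}, \]
and thus summing over $v \subsetneq u$ and estimating $N^{-|u \setminus v|}$ by $N^{-1}$ we obtain
\[ \left| \sum_{v \subsetneq u} \left( X_{N;k,k'} \right)^{|u \setminus v|} \left( J_{N;k,k'} \right)^{|v|} \right| \le \frac{1}{N} \sum_{v \subsetneq u} \frac{1}{12^{|u \setminus v|}\, 4^{|v|}} . \]
Further, from the binomial theorem we have
\[ \sum_{v \subsetneq u} \frac{1}{12^{|u \setminus v|}\, 4^{|v|}} = \left( \frac{1}{12} + \frac{1}{4} \right)^{|u|} - \frac{1}{4^{|u|}} = \frac{1}{3^{|u|}} - \frac{1}{4^{|u|}} . \]
Using this, together with the case $v = u$, we obtain
\[ e_u^2(N) \le \left( \frac{2}{3^{|u|}} - \frac{1}{4^{|u|}} \right) \frac{1}{N} + \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{\substack{k'=0 \\ k' \ne k}}^{N-1} \left( J_{N;k,k'} \right)^{|u|} . \]
Using again $|J_{N;k,k'}| \le 1/4$, we can separate out the contributions for $k = 0$ or $k' = 0$, to obtain
\[ \frac{1}{N^2} \sum_{k'=1}^{N-1} \left| J_{N;0,k'} \right|^{|u|} \le \frac{N-1}{4^{|u|} N^2} \le \frac{1}{4^{|u|} N} \qquad \text{and} \qquad \frac{1}{N^2} \sum_{k=1}^{N-1} \left| J_{N;k,0} \right|^{|u|} \le \frac{1}{4^{|u|} N} . \]
Finally noting $(J_{N;k,k})^{|u|} \ge 0$ yields the desired result.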



3.2 Estimates for $J_{N;k,k'}$

In this subsection, we derive estimates for $J_{N;k,k'}$ for $k, k' \ge 1$. In the following we will make use of the Fourier series for the real 1-periodic sawtooth function, defined on $[0, 1)$ by
\[
b(x) :=
\begin{cases}
x - 1/2 & \text{if } x \in (0, 1), \\
0 & \text{if } x = 0,
\end{cases}
\]
and then extended to the whole of $\mathbb{R}$ by $b(x) = b(x + 1)$ for all $x \in \mathbb{R}$. Thus $b(x)$ is the periodic version of the first-degree Bernoulli polynomial $B_1(x) = x - 1/2$. It is well known (following, for example, from the Dini criterion) that the symmetric partial sums in its Fourier series converge to $b(x)$ pointwise for all $x \in \mathbb{R}$, that is
\[
b(x) = \lim_{M \to \infty} \frac{i}{2\pi} \sum_{\substack{h = -M \\ h \neq 0}}^{M} \frac{\exp(2\pi i h x)}{h}, \qquad x \in \mathbb{R}.
\]
For notational simplicity we shall often omit the limit, writing simply
\[
b(x) = \frac{i}{2\pi} \sum_{h \neq 0} \frac{\exp(2\pi i h x)}{h}, \qquad x \in \mathbb{R},
\]
but this is always to be understood as the limit of the symmetric partial sum.
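The pointwise convergence of the symmetric partial sums can be observed numerically. The following is only an illustrative sketch with hypothetical function names; it uses the fact that pairing the terms $h$ and $-h$ turns the symmetric sum into $-\frac{1}{\pi} \sum_{h=1}^{M} \sin(2\pi h x)/h$.

```python
import math


def sawtooth(x):
    """1-periodic sawtooth b(x): x - 1/2 on (0, 1), and 0 at the integers."""
    frac = x - math.floor(x)
    return 0.0 if frac == 0.0 else frac - 0.5


def symmetric_partial_sum(x, M):
    """(i / 2 pi) * sum over 0 < |h| <= M of exp(2 pi i h x) / h, in real form."""
    total = sum(math.sin(2.0 * math.pi * h * x) / h for h in range(1, M + 1))
    return -total / math.pi


if __name__ == "__main__":
    for x in (0.0, 0.1, 0.3, 0.75):
        for M in (10, 100, 1000):
            print(x, M, symmetric_partial_sum(x, M), sawtooth(x))
```

At $x = 0$ every term vanishes, which matches the convention $b(0) = 0$; away from the integers the convergence is slow (of order $1/M$), which is why the limit has to be taken symmetrically.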


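We have the following expression for $J_{N;k,k'}$, $k, k' \in \{1, \ldots, N-1\}$.

Lemma 2. For $N$ prime and $k, k' \in \{1, \ldots, N-1\}$, the quantity $J_{N;k,k'}$ defined in (14) satisfies
\[
J_{N;k,k'} = \frac{1}{4\pi^2} \frac{N}{N-1} \sum_{h \neq 0} \sum_{\substack{h' \neq 0 \\ h'k' \equiv_N hk}} \frac{1}{h h'},
\tag{18}
\]
where the double sum is to be interpreted as the double limit
\[
\sum_{h \neq 0} \sum_{\substack{h' \neq 0 \\ h'k' \equiv_N hk}} \frac{1}{h h'}
:= \lim_{M \to \infty} \lim_{M' \to \infty} \sum_{h \in \{-M, \ldots, M\} \setminus \{0\}} \ \sum_{\substack{h' \in \{-M', \ldots, M'\} \setminus \{0\} \\ h'k' \equiv_N hk}} \frac{1}{h h'}.
\]

Proof. For $(x, y) \in (0, 1)^2$ we have
\[
B_1(x) B_1(y) = \frac{1}{4\pi^2} \sum_{h \neq 0} \sum_{h' \neq 0} \frac{e^{2\pi i h x} e^{-2\pi i h' y}}{h h'}
:= \lim_{M \to \infty} \lim_{M' \to \infty} \frac{1}{4\pi^2} \sum_{h \in \{-M, \ldots, M\} \setminus \{0\}} \ \sum_{h' \in \{-M', \ldots, M'\} \setminus \{0\}} \frac{e^{2\pi i h x} e^{-2\pi i h' y}}{h h'}.
\]
Thus for any $k, k' = 1, \ldots, N-1$ we have, noting that the finite sum over $z$ may be interchanged with the implied limits,
\[
J_{N;k,k'}
= \frac{1}{4\pi^2 (N-1)} \sum_{h \neq 0} \sum_{h' \neq 0} \frac{1}{h h'} \sum_{z=1}^{N-1} \exp\Bigl( 2\pi i z \frac{hk - h'k'}{N} \Bigr)
= -\frac{1}{4\pi^2 (N-1)} \sum_{h \neq 0} \sum_{h' \neq 0} \frac{1}{h h'}
+ \frac{1}{4\pi^2 (N-1)} \sum_{h \neq 0} \sum_{h' \neq 0} \frac{1}{h h'} \sum_{z=0}^{N-1} \exp\Bigl( 2\pi i z \frac{hk - h'k'}{N} \Bigr).
\]
The first term vanishes because it has as a factor the limit of the product of symmetric partial sums of the odd function $1/h$. For the second term we use
\[
\sum_{z=0}^{N-1} \exp\bigl(2\pi i z (hk - h'k')/N\bigr) =
\begin{cases}
N & \text{if } hk - h'k' \equiv_N 0, \\
0 & \text{if } hk - h'k' \not\equiv_N 0,
\end{cases}
\]
which leads to the desired formula.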



We now want to estimate $J_{N;k,k'}$ for $k, k' \ge 1$ using (18). It turns out that it suffices to consider $J_{N;\kappa,1}$, for $\kappa = 1, \ldots, N-1$.

Proposition 3. For $N$ prime, the quantity $e_u^2(N)$ defined in (15) satisfies


\[
e_u^2(N) \le c_u \frac{1}{N} + \frac{1}{N} \sum_{\kappa=1}^{N-1} |J_{N;\kappa,1}|^{|u|}.
\]

Proof. Because $N$ is prime, for each $k' \in \{1, \ldots, N-1\}$ there is a unique inverse $k'^{-1} \in \{1, \ldots, N-1\}$ such that $k' k'^{-1} \equiv_N 1$, and therefore
\[
h'k' \equiv_N hk \iff h' \equiv_N h (k k'^{-1}).
\]
It follows from (18) that
\[
J_{N;k,k'} = J_{N;\kappa,1}, \qquad \text{with} \qquad \kappa := k k'^{-1} \bmod N,
\]
and since $\kappa$ runs over all of $\{1, \ldots, N-1\}$ as $k$ runs over $\{1, \ldots, N-1\}$, we have
\[
\frac{1}{N^2} \sum_{k=1}^{N-1} \sum_{k'=1}^{N-1} (J_{N;k,k'})^{|u|}
= \frac{N-1}{N^2} \sum_{\kappa=1}^{N-1} (J_{N;\kappa,1})^{|u|}
\le \frac{1}{N} \sum_{\kappa=1}^{N-1} |J_{N;\kappa,1}|^{|u|}.
\]
Applying this to Proposition 2 yields the desired result.
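The reduction $J_{N;k,k'} = J_{N;\kappa,1}$ can be verified exactly for small primes. The sketch assumes, in line with the proof of Lemma 2, that for $k, k' \in \{1, \ldots, N-1\}$ the quantity $J_{N;k,k'}$ is the average over $z = 1, \ldots, N-1$ of $b(kz/N)\, b(k'z/N)$; definition (14) is not reproduced in this section, so this form and all function names are assumptions.

```python
from fractions import Fraction


def saw(num, N):
    """Exact value of the sawtooth b(num / N)."""
    rem = num % N
    return Fraction(0) if rem == 0 else Fraction(rem, N) - Fraction(1, 2)


def j_entry(N, k, kp):
    """Assumed form of J_{N;k,k'} for k, k' in 1..N-1."""
    total = sum(saw(k * z, N) * saw(kp * z, N) for z in range(1, N))
    return total / (N - 1)


def kappa_of(N, k, kp):
    """kappa = k * k'^{-1} mod N; pow(kp, -1, N) needs Python 3.8 or later."""
    return (k * pow(kp, -1, N)) % N


def check_reduction(N):
    """Return True if J_{N;k,k'} == J_{N;kappa,1} for all k, k' in 1..N-1."""
    for k in range(1, N):
        for kp in range(1, N):
            if j_entry(N, k, kp) != j_entry(N, kappa_of(N, k, kp), 1):
                return False
    return True


if __name__ == "__main__":
    for prime in (3, 5, 7, 11):
        print(prime, check_reduction(prime))
```

The check works because $z \mapsto k'z \bmod N$ is a bijection of $\{1, \ldots, N-1\}$ when $N$ is prime, which is the same observation used in the proof.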



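From Lemma 2 we have
\[
J_{N;\kappa,1} = \frac{1}{4\pi^2} \frac{N}{N-1} \sum_{h \neq 0} \sum_{\substack{h' \neq 0 \\ h' \equiv_N h\kappa}} \frac{1}{h h'}
= \frac{1}{4\pi^2} \frac{N}{N-1} \lim_{M \to \infty} \lim_{M' \to \infty} S(M, M'),
\tag{19}
\]
where
\[
S(M, M') := \sum_{h \in \{-M, \ldots, M\} \setminus \{0\}} \ \sum_{\substack{h' \in \{-M', \ldots, M'\} \setminus \{0\} \\ h' \equiv_N h\kappa}} \frac{1}{h h'}.
\tag{20}
\]
To further simplify $J_{N;\kappa,1}$, we note that for $h, h'$ satisfying $h' \equiv_N h\kappa$ with $\kappa \in \{1, \ldots, N-1\}$ we have
\[
h' \equiv_N 0 \iff h\kappa \equiv_N 0 \iff h \equiv_N 0.
\]
Hence, for the $h \equiv_N 0$ contribution to the double sum (20) we have
\[
\sum_{\substack{h \in \{-M, \ldots, M\} \setminus \{0\} \\ h \equiv_N 0}} \frac{1}{h} \sum_{\substack{h' \in \{-M', \ldots, M'\} \setminus \{0\} \\ h' \equiv_N 0}} \frac{1}{h'} = 0.
\]
Thus, we can restrict the double sum (20) to $h \not\equiv_N 0$ so that
\[
S(M, M') = \sum_{\substack{h \in \{-M, \ldots, M\} \setminus \{0\} \\ h \not\equiv_N 0}} \ \sum_{\substack{h' \in \{-M', \ldots, M'\} \setminus \{0\} \\ h' \equiv_N h\kappa}} \frac{1}{h h'}.
\tag{21}
\]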


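We now assume $N \ge 3$ so that $N - 1$ is even for $N$ prime. We can write $h \not\equiv_N 0$ as
\[
h = \ell N + q, \qquad \text{with } \ell \in \mathbb{Z} \text{ and } q \in \Bigl\{ -\frac{N-1}{2}, \ldots, \frac{N-1}{2} \Bigr\} \setminus \{0\} =: R_N.
\tag{22}
\]
Then, we can write $h' \equiv_N h\kappa$ with $h \not\equiv_N 0$ as
\[
h' = \ell' N + r(q\kappa, N), \qquad \text{with } \ell' \in \mathbb{Z},
\]
where $r(j, N)$ is the unique integer congruent to $j$ mod $N$ with the smallest magnitude. More precisely, the function $r(\cdot, N) : \mathbb{Z} \to R_N \cup \{0\}$ is defined for $j \ge 0$ by
\[
r(j, N) :=
\begin{cases}
j \bmod N & \text{if } j \bmod N \le \frac{N-1}{2}, \\
j \bmod N - N & \text{if } j \bmod N > \frac{N-1}{2},
\end{cases}
\tag{23}
\]
and extended to all integers $j$ by $r(j, N) = r(j + N, N)$. It follows that for $j > 0$ we have $r(-j, N) = r(N - j \bmod N, N) = -r(j, N)$. Hence the function is both $N$-periodic and odd. If $N$ divides $j$, then we have $r(j, N) = 0$, but otherwise $r(j, N) \in R_N$.

Using these representations of $h$ and $h'$, the double limit in $J_{N;\kappa,1}$ as in (19) can be rewritten as follows.

Lemma 3. For $N \ge 3$ prime and $\kappa \in \{1, \ldots, N-1\}$, the quantity $J_{N;\kappa,1}$ given by (19) satisfies
\[
J_{N;\kappa,1} = \frac{1}{2\pi^2} \frac{N}{N-1} \sum_{q=1}^{(N-1)/2}
\Bigl( \frac{1}{q} - \sum_{\ell=1}^{\infty} \frac{2q}{(\ell N)^2 - q^2} \Bigr)
\tag{24}
\]
\[
\qquad\qquad \times \Bigl( \frac{1}{r(q\kappa, N)} - \sum_{\ell'=1}^{\infty} \frac{2\, r(q\kappa, N)}{(\ell' N)^2 - r(q\kappa, N)^2} \Bigr),
\tag{25}
\]
where $r(\cdot, N)$ is defined as in (23).

Proof. We begin with the expression (19) for $J_{N;\kappa,1}$. Writing $M = LN + Q$ and $M' = L'N + Q'$ with $L, L' \in \mathbb{N}$ and $Q, Q' \in R_N \cup \{0\}$, the double sum (21) can be rewritten as
\[
S(M, M')
= \sum_{\substack{\ell \in \mathbb{Z},\ q \in R_N \\ |\ell N + q| \le LN + Q}} \ \sum_{\substack{\ell' \in \mathbb{Z},\ q' \in R_N \\ |\ell' N + q'| \le L'N + Q' \\ q' \equiv_N q\kappa}} \frac{1}{\ell N + q} \, \frac{1}{\ell' N + q'}
= \sum_{\substack{q, q' \in R_N \\ q' \equiv_N q\kappa}}
\Biggl( \sum_{\substack{\ell = -L \\ |\ell N + q| \le LN + Q}}^{L} \frac{1}{\ell N + q} \Biggr)
\Biggl( \sum_{\substack{\ell' = -L' \\ |\ell' N + q'| \le L'N + Q'}}^{L'} \frac{1}{\ell' N + q'} \Biggr),
\tag{26}
\]
where we used the fact that the inequalities in the summation conditions cannot hold if $|\ell| > L$ or $|\ell'| > L'$.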


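First we consider the sum over $\ell$ in (26). Since the condition $|\ell N + q| \le LN + Q$ always holds for $|\ell| \le L - 1$, we can write
\[
\sum_{\substack{\ell = -L \\ |\ell N + q| \le LN + Q}}^{L} \frac{1}{\ell N + q}
= \sum_{\ell = -L}^{L} \frac{1}{\ell N + q}
- \sum_{\substack{\ell = \pm L \\ |\ell N + q| > LN + Q}} \frac{1}{\ell N + q},
\]
where we have
\[
\sum_{\ell = -L}^{L} \frac{1}{\ell N + q}
= \frac{1}{q} + \sum_{\ell=1}^{L} \Bigl( \frac{1}{\ell N + q} + \frac{1}{-\ell N + q} \Bigr)
= \frac{1}{q} - \sum_{\ell=1}^{L} \frac{2q}{(\ell N)^2 - q^2}
\]
and
\[
\Biggl| \sum_{\substack{\ell = \pm L \\ |\ell N + q| > LN + Q}} \frac{1}{\ell N + q} \Biggr|
\le \frac{2}{LN + Q} \le \frac{2}{LN - N/2} \to 0 \qquad \text{as } L \to \infty.
\]
Thus we conclude that
\[
\lim_{L \to \infty} \sum_{\substack{\ell = -L \\ |\ell N + q| \le LN + Q}}^{L} \frac{1}{\ell N + q}
= \frac{1}{q} - \lim_{L \to \infty} \sum_{\ell=1}^{L} \frac{2q}{(\ell N)^2 - q^2}
= \frac{1}{q} - \sum_{\ell=1}^{\infty} \frac{2q}{(\ell N)^2 - q^2} =: P_N(q).
\]
The sum over $\ell'$ in (26) is similar.

Now since the double limit of $S(M, M')$ exists as $M \to \infty$ and $M' \to \infty$, it must equal the double limit of the last expression in (26) as $L \to \infty$ and $L' \to \infty$, with arbitrary $Q$ and $Q'$. (This is because for a particular pair $(Q, Q')$, the last expression in (26), when interpreted as a sequence in the double index $(L, L')$, can be considered as a subsequence of the convergent sequence $S(M, M')$ with double index $(M, M')$.) Hence we obtain
\[
\lim_{M \to \infty} \lim_{M' \to \infty} S(M, M')
= \sum_{\substack{q, q' \in R_N \\ q' \equiv_N q\kappa}} P_N(q) \, P_N(q')
= \sum_{q \in R_N} P_N(q) \, P_N(r(q\kappa, N)),
\]
where we used the fact that for a given $q \in R_N$, the only value of $q' \in R_N$ that satisfies $q' \equiv_N q\kappa$ is $q' = r(q\kappa, N)$. Finally, we observe that $P_N(-q) = -P_N(q)$, and $P_N(r(-q\kappa, N)) = -P_N(r(q\kappa, N))$ since $r(-q\kappa, N) = -r(q\kappa, N)$. Thus the contributions of $q$ and $-q$ to the sum are the same, and so we only need to sum over the positive values of $q$ and then double the result. Applying the result in (19) completes the proof.

Now we estimate the magnitude of $J_{N;\kappa,1}$.

Lemma 4. For $N \ge 3$ prime and $\kappa \in \{1, \ldots, N-1\}$, the quantity $J_{N;\kappa,1}$ from (25) satisfies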

\[
|J_{N;\kappa,1}| \le \frac{1}{2\pi^2} \frac{N}{N-1} \Bigl( T_N(\kappa) + \frac{10\pi^2 \ln N}{9N} \Bigr),
\]
where
\[
T_N(\kappa) := \sum_{q=1}^{(N-1)/2} \frac{1}{q \, |r(q\kappa, N)|} < \frac{\pi^2}{6}.
\tag{27}
\]

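Proof. We expand the two factors in the sum over $q$ in (25) and then apply the triangle inequality to obtain
\[
|J_{N;\kappa,1}| \le \frac{1}{2\pi^2} \frac{N}{N-1} \bigl( T_N(\kappa) + A_1 + A_2 + A_3 \bigr),
\]
with
\[
A_1 := \sum_{q=1}^{(N-1)/2} \frac{1}{|r(q\kappa, N)|} \Bigl( \sum_{\ell=1}^{\infty} \frac{2q}{(\ell N)^2 - q^2} \Bigr),
\]
\[
A_2 := \sum_{q=1}^{(N-1)/2} \frac{1}{q} \Bigl( \sum_{\ell'=1}^{\infty} \frac{2\,|r(q\kappa, N)|}{(\ell' N)^2 - r(q\kappa, N)^2} \Bigr),
\]
\[
A_3 := \sum_{q=1}^{(N-1)/2} \Bigl( \sum_{\ell=1}^{\infty} \frac{2q}{(\ell N)^2 - q^2} \Bigr) \Bigl( \sum_{\ell'=1}^{\infty} \frac{2\,|r(q\kappa, N)|}{(\ell' N)^2 - r(q\kappa, N)^2} \Bigr).
\]
Since $q \le N/2 \le \ell N/2$ and $|r(q\kappa, N)| \le N/2 \le \ell' N/2$, we have
\[
\sum_{\ell=1}^{\infty} \frac{2q}{(\ell N)^2 - q^2}
\le \sum_{\ell=1}^{\infty} \frac{N}{(\ell N)^2 - (\ell N/2)^2}
= \frac{4}{3N} \sum_{\ell=1}^{\infty} \frac{1}{\ell^2}
= \frac{2\pi^2}{9N},
\]
and
\[
\sum_{\ell'=1}^{\infty} \frac{2\,|r(q\kappa, N)|}{(\ell' N)^2 - r(q\kappa, N)^2}
\le \sum_{\ell'=1}^{\infty} \frac{N}{(\ell' N)^2 - (\ell' N/2)^2}
= \frac{2\pi^2}{9N}.
\]
Moreover, we have
\[
\sum_{q=1}^{(N-1)/2} \frac{1}{q} \le 1 + \int_{1}^{(N-1)/2} \frac{1}{t} \, dt \le 2 \ln N
\tag{28}
\]
and
\[
\sum_{q=1}^{(N-1)/2} \frac{1}{|r(q\kappa, N)|} = \sum_{t=1}^{(N-1)/2} \frac{1}{t} \le 2 \ln N,
\tag{29}
\]
where in the penultimate step we used the fact that $|r(q\kappa, N)|$ takes all the values from $1$ to $(N-1)/2$ exactly once as $q$ runs from $1$ to $(N-1)/2$. These estimates lead to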


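\[
A_1 + A_2 + A_3 \le \frac{4\pi^2 \ln N}{9N} + \frac{4\pi^2 \ln N}{9N} + \frac{N-1}{2} \, \frac{4\pi^4}{81 N^2}
\tag{30}
\]
\[
\le \frac{\pi^2 \ln N}{9N} \Bigl( 8 + \frac{2\pi^2}{9 \ln 3} \Bigr) \le \frac{10\pi^2 \ln N}{9N}.
\tag{31}
\]
On the other hand, a crude estimate for $T_N(\kappa)$ follows from the Cauchy-Schwarz inequality:
\[
T_N(\kappa)
\le \Bigl( \sum_{q=1}^{(N-1)/2} \frac{1}{q^2} \Bigr)^{1/2} \Bigl( \sum_{q=1}^{(N-1)/2} \frac{1}{r(q\kappa, N)^2} \Bigr)^{1/2}
= \Bigl( \sum_{q=1}^{(N-1)/2} \frac{1}{q^2} \Bigr)^{1/2} \Bigl( \sum_{t=1}^{(N-1)/2} \frac{1}{t^2} \Bigr)^{1/2}
< \frac{\pi^2}{6}.
\]
This completes the proof.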
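Lemmas 3 and 4 lend themselves to a quick numerical sanity check. The sketch below rests on two assumptions that are not stated in the text: first, that $P_N(q)$ has the closed form $\frac{\pi}{N} \cot(\pi q / N)$, which follows from the classical partial-fraction expansion of the cotangent; second, that $J_{N;\kappa,1}$ is the average over $z = 1, \ldots, N-1$ of $b(\kappa z/N)\, b(z/N)$, as suggested by the proof of Lemma 2. All function names are hypothetical.

```python
import math


def r(j, N):
    """Representative of j mod N of smallest magnitude, as in (23)."""
    m = j % N  # Python's % already lies in 0..N-1, also for negative j
    return m if m <= (N - 1) // 2 else m - N


def p_series(q, N, terms):
    """Truncation of P_N(q) = 1/q - sum_{l >= 1} 2q / ((lN)^2 - q^2)."""
    tail = sum(2.0 * q / ((l * N) ** 2 - q * q) for l in range(1, terms + 1))
    return 1.0 / q - tail


def p_closed(q, N):
    """Assumed closed form (pi / N) * cot(pi q / N) of P_N(q)."""
    return math.pi / (N * math.tan(math.pi * q / N))


def j_lemma3(N, kappa):
    """J_{N;kappa,1} via the sum over q in (24)-(25), using the closed form."""
    half = (N - 1) // 2
    total = sum(p_closed(q, N) * p_closed(r(q * kappa, N), N)
                for q in range(1, half + 1))
    return N / (2.0 * math.pi ** 2 * (N - 1)) * total


def j_direct(N, kappa):
    """Assumed definition: mean of b(kappa z / N) * b(z / N) over z = 1..N-1."""
    def saw(num):
        return (num % N) / N - 0.5  # num is never a multiple of N here
    return sum(saw(kappa * z) * saw(z) for z in range(1, N)) / (N - 1)


def t_value(N, kappa):
    """T_N(kappa) from (27)."""
    half = (N - 1) // 2
    return sum(1.0 / (q * abs(r(q * kappa, N))) for q in range(1, half + 1))


def lemma4_bound(N, kappa):
    """Right-hand side of the estimate in Lemma 4."""
    extra = 10.0 * math.pi ** 2 * math.log(N) / (9.0 * N)
    return N / (2.0 * math.pi ** 2 * (N - 1)) * (t_value(N, kappa) + extra)


if __name__ == "__main__":
    N = 101
    for kappa in (1, 2, 3, 50, 100):
        print(kappa, j_direct(N, kappa), j_lemma3(N, kappa), lemma4_bound(N, kappa))
```

Under these assumptions the two ways of computing $J_{N;\kappa,1}$ should agree up to rounding, and the magnitude should stay below the bound of Lemma 4 for every $\kappa$.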



Numerical experiments show that the value of $T_N(\kappa)$ is much smaller than the crude bound $\pi^2/6$ for most values of $\kappa$, and have led us to the following conjecture. Note that we have $r(q(N - \kappa), N) = r(-q\kappa, N) = -r(q\kappa, N)$, and so $T_N(N - \kappa) = T_N(\kappa)$. Moreover, from (25) we conclude that $J_{N;N-\kappa,1} = -J_{N;\kappa,1}$. Since we are only interested in the magnitude of $J_{N;\kappa,1}$ (see Proposition 3), it suffices to consider only $\kappa \in R_N^+ := \{1, 2, \ldots, (N-1)/2\}$.

Conjecture 1. For $N \ge 3$ prime and $\kappa \in R_N^+$, with $T_N(\kappa)$ as defined in (27), let $(\kappa_j)$ for $j \in R_N^+$ be an ordering of the elements of $R_N^+$ such that $(T_N(\kappa_j))$ is non-increasing. The conjecture is that there exist $C_1, C_2 > 0$ and $\alpha \ge 2$ independent of $N$ such that
\[
T_N(\kappa_j) \le C_1 \frac{(\ln N)^\alpha}{N} \qquad \text{for all } j > C_2 (\ln N)^\alpha.
\tag{32}
\]

Conjecture 1 together with Lemma 4 lead to an estimate for $|J_{N;\kappa_j,1}|$ of the following form:
\[
|J_{N;\kappa_j,1}| \le
\begin{cases}
C_3 & \text{for } j \le C_2 (\ln N)^\alpha, \\[0.5ex]
C_4 \dfrac{(\ln N)^\alpha}{N} & \text{for } j > C_2 (\ln N)^\alpha,
\end{cases}
\]
where $C_3$ and $C_4$ are known numerical constants. We will use this bound in the next subsection to obtain the desired result for the mean of the worst-case error.

3.3 Final results

Now we are ready to state our main results.


Theorem 1. Suppose that Conjecture 1 holds with some $\alpha \ge 2$. For arbitrary $u \subseteq \{1 : s\}$ and any prime number $N \ge 3$ such that $(\ln N)^\alpha / N \le 1$, the quantity $e_u(N)$ defined in (12) satisfies
\[
e_u(N) \le C_u \frac{(\ln N)^{\alpha/2}}{\sqrt{N}},
\tag{33}
\]
where
\[
C_u := c_u + 2 C_2 \Bigl( \frac{23}{24} \Bigr)^{|u|} + \Bigl( \frac{3 C_1}{4\pi^2} + \frac{5}{6} \Bigr)^{|u|}.
\]
Here, the constant $c_u$ is as in Proposition 2, and $C_1, C_2$ are as in Conjecture 1.

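Proof. From Proposition 3 together with $J_{N;N-\kappa,1} = -J_{N;\kappa,1}$, we have
\[
e_u^2(N) \le c_u \frac{1}{N} + \frac{2}{N} \sum_{j=1}^{(N-1)/2} |J_{N;\kappa_j,1}|^{|u|}.
\tag{34}
\]
For $j \le C_2 (\ln N)^\alpha$, we use $T_N(\kappa_j) \le \pi^2/6$, $\ln N / N \le 1$ and $N/(N-1) \le 3/2$ in Lemma 4 to obtain
\[
|J_{N;\kappa_j,1}| \le \frac{1}{2\pi^2} \frac{N}{N-1} \Bigl( \frac{\pi^2}{6} + \frac{10\pi^2}{9} \frac{\ln N}{N} \Bigr)
\le \frac{1}{2\pi^2} \, \frac{3}{2} \Bigl( \frac{\pi^2}{6} + \frac{10\pi^2}{9} \Bigr) = \frac{23}{24}.
\]
For $j > C_2 (\ln N)^\alpha$, we use $\ln N \ge 1$, $N/(N-1) \le 3/2$ and Conjecture 1 to obtain
\[
|J_{N;\kappa_j,1}| \le \frac{1}{2\pi^2} \frac{N}{N-1} \Bigl( C_1 \frac{(\ln N)^\alpha}{N} + \frac{10\pi^2}{9} \frac{\ln N}{N} \Bigr)
\le \Bigl( \frac{3 C_1}{4\pi^2} + \frac{5}{6} \Bigr) \frac{(\ln N)^\alpha}{N}.
\]
Combining these and using $(\ln N)^\alpha / N \le 1$, we obtain
\[
\sum_{j=1}^{(N-1)/2} |J_{N;\kappa_j,1}|^{|u|}
\le \sum_{1 \le j \le C_2 (\ln N)^\alpha} \Bigl( \frac{23}{24} \Bigr)^{|u|}
+ \sum_{C_2 (\ln N)^\alpha < j \le (N-1)/2} \Bigl( \frac{3 C_1}{4\pi^2} + \frac{5}{6} \Bigr)^{|u|} \Bigl( \frac{(\ln N)^\alpha}{N} \Bigr)^{|u|}
\]
\[
\le C_2 (\ln N)^\alpha \Bigl( \frac{23}{24} \Bigr)^{|u|} + \frac{N-1}{2} \Bigl( \frac{3 C_1}{4\pi^2} + \frac{5}{6} \Bigr)^{|u|} \frac{(\ln N)^\alpha}{N}
\le \Bigl( C_2 \Bigl( \frac{23}{24} \Bigr)^{|u|} + \frac{1}{2} \Bigl( \frac{3 C_1}{4\pi^2} + \frac{5}{6} \Bigr)^{|u|} \Bigr) (\ln N)^\alpha.
\]
This together with (34) yields the required result.

Corollary 1. Suppose that Conjecture 1 holds with some $\alpha \ge 2$. Let $N \ge 3$ be a prime number. Suppose that the weights $\gamma = (\gamma_u)_u$ satisfy C :=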



|u| 0 independent of s and N. As a consequence, there exists a generating vector $z^* \in \mathbb{Z}_N^s = \{ z \in \mathbb{Z} \mid 1 \le z \le N-1 \}^s$ that attains the worst-case error
\[
e(N, z^*) \le \sqrt{C} \, \frac{(\ln N)^{\alpha/2}}{\sqrt{N}}.
\tag{36}
\]

Proof. From (11) and Theorem 1, we have
\[
e^2(N) \le \sum_{\emptyset \neq u \subseteq \{1:s\}} \gamma_u C_u \frac{(\ln N)^\alpha}{N} \le C \frac{(\ln N)^\alpha}{N}.
\tag{37}
\]
Now, recall that $e^2(N)$ is defined in (11) as the average of $e^2(N, z)$ over all possible $z$. Thus, there must be at least one $z^*$ such that
\[
e^2(N, z^*) \le C \frac{(\ln N)^\alpha}{N},
\]
which yields the second statement.

4 Numerical experiments on the conjecture

In this section, we present numerical evidence relating to Conjecture 1. We compute the numbers $\{T_N(\kappa)\}_{\kappa=1}^{(N-1)/2}$, given by (27), for varying $N$. For each fixed $N$, we sort these values in non-increasing order, which we write as $(T_N(\kappa_j))_{j=1,\ldots,(N-1)/2}$, plot the values, and make a guess of the constants $C_1$, $C_2$ in Conjecture 1. We used Julia 0.6.2 for the experiments below.

Figure 1 shows the values of $\frac{N}{(\ln N)^\alpha} T_N(\kappa_j)$ against $j/(\ln N)^\alpha$ for $j = 1, \ldots, (N-1)/2$ with $\alpha = 2, 3$, and $N = 50021, 74687, 99991$. We see that for both $\alpha = 2$ and $3$ and these values of $N$ we can take constants $C_1$, $C_2$ such that for all $j/(\ln N)^\alpha > C_2$ with $j = 1, \ldots, (N-1)/2$ we have $T_N(\kappa_j) N / (\ln N)^\alpha \le C_1$: for example, $C_1 = 20$ and $C_2 = 10$. This is consistent with Conjecture 1, especially for $\alpha = 3$. Of course, we cannot be certain even in this case that the bounds will hold for very large $N$, with these or any constants. But even if the conjecture fails, the numerical experiments give us confidence, even for $\alpha = 2$, that the bounds in Theorem 1 will hold with $C_1 = 20$ and $C_2 = 10$ for $N$ up to at least a few hundred thousand.
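The experiment described in this section can be sketched as follows. The computations reported here were done in Julia 0.6.2; the sketch below swaps in Python, uses a much smaller prime in its demo run so that it finishes quickly, and all function names are hypothetical. It makes no claim about reproducing the plotted values.

```python
import math


def r(j, N):
    """Representative of j mod N of smallest magnitude, as in (23)."""
    m = j % N
    return m if m <= (N - 1) // 2 else m - N


def t_value(N, kappa):
    """T_N(kappa) from (27)."""
    half = (N - 1) // 2
    return sum(1.0 / (q * abs(r(q * kappa, N))) for q in range(1, half + 1))


def sorted_t_values(N):
    """T_N(kappa_j), j = 1..(N-1)/2, arranged in non-increasing order."""
    half = (N - 1) // 2
    return sorted((t_value(N, kappa) for kappa in range(1, half + 1)), reverse=True)


def scaled_profile(N, alpha):
    """Pairs (j / (ln N)^alpha, N * T_N(kappa_j) / (ln N)^alpha), as plotted in Fig. 1."""
    scale = math.log(N) ** alpha
    return [(j / scale, N * t / scale)
            for j, t in enumerate(sorted_t_values(N), start=1)]


def tail_maximum(N, alpha, C2):
    """Largest scaled value with j / (ln N)^alpha > C2, or None if the tail is empty."""
    tail = [y for x, y in scaled_profile(N, alpha) if x > C2]
    return max(tail) if tail else None


if __name__ == "__main__":
    N = 1009  # a small prime chosen only to keep the quadratic cost low
    for alpha in (2, 3):
        print(alpha, tail_maximum(N, alpha, C2=10))
```

The cost is quadratic in $N$, so primes of the size used in Figure 1 would need a faster implementation or a compiled language. A candidate $C_1$ for a given $C_2$ is any upper bound on the value returned by `tail_maximum`.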


Fig. 1: Values of $T_N(\kappa_j) N / (\ln N)^\alpha$ against $j/(\ln N)^\alpha$ for $j = 1, \ldots, (N-1)/2$. Top: $\alpha = 2$. Bottom: $\alpha = 3$. We see there exist constants $C_1$, $C_2$ such that for all $j/(\ln N)^\alpha > C_2$ we have $T_N(\kappa_j) N / (\ln N)^\alpha \le C_1$: for example, $C_1 = 20$ and $C_2 = 10$.

5 Concluding remarks

In this paper, we considered the worst-case error for unshifted lattice rules without randomisation. A conjecture to support the error estimate was proposed. Given the conjecture, in Corollary 1 we showed the existence of a generating vector that attains the worst-case error $1/\sqrt{N}$, up to a logarithmic factor. Numerical experiments suggest that the conjecture is plausible.

Corollary 1, which holds if the conjecture is true, shows that some lattice rules work well for non-periodic functions as well. We note that this would not be too surprising: as mentioned in Section 1, Joe [5, 6] has considered CBC constructions for unshifted lattice rules that give good star discrepancy bounds.

In closing, we mention that one difficulty in proving the conjecture is that the ordering of the $\kappa$ to ensure that $T_N(\kappa_j)$ is non-increasing typically changes completely when $N$ changes.

Acknowledgements We gratefully acknowledge the financial support from the Australian Research Council (FT130100655 and DP180101356). We are also indebted to Frances Y. Kuo for her stimulating comments.

References

1. J. Dick, F. Y. Kuo, and I. H. Sloan. High-dimensional integration: The quasi-Monte Carlo way. Acta Numer., 22:133–288 (2013).
2. J. Dick, D. Nuyens, and F. Pillichshammer. Lattice rules for nonperiodic smooth integrands. Numer. Math., 126(2):259–291 (2014).
3. T. Goda, K. Suzuki, and T. Yoshiki. Lattice rules in non-periodic subspaces of Sobolev spaces. Numer. Math., appeared online in October 2018.
4. F. J. Hickernell. Lattice rules: How well do they measure up?, in P. Hellekalek and G. Larcher, editors, Random and Quasi-Random Point Sets, Springer, Berlin, pp. 109–166, 1998.
5. S. Joe. Component by component construction of rank-1 lattice rules having $O(n^{-1} (\ln(n))^d)$ star discrepancy, in Monte Carlo and Quasi-Monte Carlo Methods 2002 (H. Niederreiter, ed.), Springer, pp. 293–298, 2004.
6. S. Joe. Construction of good rank-1 lattice rules based on the weighted star discrepancy, in Monte Carlo and Quasi-Monte Carlo Methods 2004 (H. Niederreiter and D. Talay, eds.), Springer, pp. 181–196, 2006.
7. H. Niederreiter. Random Number Generation and Quasi-Monte Carlo Methods, SIAM, Philadelphia, 1992.
8. I. H. Sloan and S. Joe. Lattice methods for multiple integration. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York, 1994.
9. I. H. Sloan and H. Woźniakowski. When are quasi-Monte Carlo algorithms efficient for high-dimensional integrals?, J. Complexity, 14:1–33 (1998).

New preasymptotic estimates for approximation of periodic Sobolev functions

Thomas Kühn

Dedicated to Ian Sloan on the occasion of his 80th birthday

Abstract Approximation of Sobolev embeddings is a well-studied subject in high-dimensional approximation, with many applications to different branches of mathematics. For example, for isotropic Sobolev spaces $H^s(\mathbb{T}^d)$ of fractional smoothness $s > 0$ on the $d$-dimensional torus it is known that the approximation numbers $a_n$ of the embedding $H^s(\mathbb{T}^d) \hookrightarrow L_2(\mathbb{T}^d)$ behave like $a_n \sim n^{-s/d}$ as $n \to \infty$, where the (weak) equivalence $\sim$ holds only up to multiplicative constants which are not known explicitly. However, for practical purposes it is more relevant to know the preasymptotic behaviour of the $a_n$, i.e. for small $n$, say $n \le 2^d$. In this range the dependence on $n$ is only logarithmic. The main results in this note are sharp two-sided preasymptotic estimates for approximation of isotropic Sobolev functions on $\mathbb{T}^d$. In particular we give explicit constants, which show the exact dependence on the dimension $d$, the smoothness $s$, and further parameters of the norm. This improves the known results in the literature. Moreover, we prove a new preasymptotic estimate for approximation of Sobolev functions of dominating mixed smoothness.

1 Introduction

Approximation of Sobolev functions is a classical topic in functional analysis, with numerous applications to other branches of mathematics like approximation theory or numerical analysis. The quality of such approximations can be expressed in terms of approximation numbers $a_n$ of embeddings of the corresponding Sobolev spaces. The error is measured in the norm of the target space, mostly in the $L_2$- or $L_\infty$-norm. The approximation numbers $a_n$ coincide with the worst-case error that can be achieved by linear algorithms that are allowed to use $n$ pieces of (general linear) information on the functions that we wish to approximate.

Thomas Kühn
Mathematisches Institut, Universität Leipzig, Augustusplatz 10, 04109 Leipzig, Germany
e-mail: [email protected]

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_7


The asymptotic rate of the approximation numbers $a_n$ as $n \to \infty$ is known in many cases, but often only up to non-specified multiplicative constants, i.e. the dependence of these constants on the smoothness parameter of the spaces and the dimension of the underlying domain has not yet been investigated very carefully. For practical purposes, however, this is a crucial point. Without further information on the constants, the asymptotic rate alone is fairly useless in numerical computations.

Another effect which is important for numerical approximation of functions on high-dimensional domains, e.g. on the $d$-dimensional unit cube or the $d$-dimensional torus (periodic case), is that one usually has to 'wait exponentially long' until the asymptotic rate 'becomes visible'. More precisely, for small $n$, say $n \le 2^d$, the behaviour of the approximation numbers is often quite different from the asymptotic behaviour. In many practical problems the dimension $d$ is very large, possibly up to hundreds or thousands. But for large or even moderate dimensions, $2^d$ pieces of information might already be well beyond the capacity of a computer. Therefore it is necessary to have good estimates in this so-called preasymptotic range.

Only quite recently has this topic attracted more attention, but there are still only a few papers devoted to a systematic study of the preasymptotic behaviour, among them [5], [4] and [6] for periodic functions, and [1] for functions on spheres. Prior to these papers, preasymptotic estimates appeared only occasionally in the literature. For detailed comments on this issue we refer to Section 4.5 of [6].

The aim of this note is to provide some new preasymptotic estimates for approximation of periodic functions from two classical types of Sobolev spaces, namely isotropic spaces and spaces of dominating mixed smoothness. In both cases we consider arbitrary fractional smoothness $s > 0$, and in the isotropic case we consider in addition a family of equivalent norms. Our results exhibit an interesting dependence on the parameters of these norms, showing again how sensitive approximation problems are with respect to a change of norms. Our main results improve the known results in the literature: in the isotropic case we obtain explicit constants, which were not available so far (cf. [5] and [4]), and in the mixed case we improve the exponent that was given in [6].

The paper is organized as follows: In Section 2 we collect some known facts on approximation numbers, and give the definitions of the Sobolev spaces that we consider in this paper. Sections 3 and 4 contain our main results on preasymptotic estimates for approximation in isotropic Sobolev spaces and in Sobolev spaces of dominating mixed smoothness, respectively.

Notation. Throughout the paper we use standard notation. As usual $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{R}$ and $\mathbb{C}$ denote the natural, integer, real and complex numbers, respectively. For sequences $(a_n)$ and $(b_n)$ of positive real numbers the weak equivalence $a_n \sim b_n$ means that there are constants $C, c > 0$ such that $c\, b_n \le a_n \le C\, b_n$ for all $n \in \mathbb{N}$.


2 Approximation in periodic spaces of Sobolev type

The approximation numbers of a (bounded linear) operator $T : X \to Y$ between two Banach spaces are defined by
\[
a_n(T) := \inf\{ \|T - A\| : \operatorname{rank}(A) < n \}, \qquad n \in \mathbb{N}.
\]
Since $T$ is compact if $\lim_{n \to \infty} a_n(T) = 0$, the rate of decay of $a_n(T)$ as $n \to \infty$ describes the 'degree' of compactness of $T$. If $T$ is a compact operator between Hilbert spaces, then the approximation numbers coincide with the well-known singular numbers,
\[
a_n(T) = s_n(T) := \lambda_n\bigl( (T^* T)^{1/2} \bigr),
\]
where the eigenvalues $\lambda_n((T^* T)^{1/2})$ are arranged in non-increasing order and counted according to their multiplicities. We will work only in this Hilbert space situation. For further properties of approximation numbers we refer to the monographs [7] or [3].

Let $\mathbb{T}$ be the torus, i.e. the interval $[0, 2\pi]$ where the endpoints are identified. In this note we consider Sobolev spaces on the $d$-dimensional torus $\mathbb{T}^d$, equipped with the normalized Lebesgue measure. Hence the Fourier coefficients of $f \in L_2(\mathbb{T}^d)$ are
\[
\hat{f}(k) := \frac{1}{(2\pi)^d} \int_{\mathbb{T}^d} f(x) \, e^{-ikx} \, dx, \qquad k \in \mathbb{Z}^d.
\]
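For a given sequence $w = (w(k))_{k \in \mathbb{Z}^d}$ of positive weights, bounded away from zero, let $H^w(\mathbb{T}^d)$ be the space of all $f \in L_2(\mathbb{T}^d)$ such that the norm
\[
\| f \,|\, H^w(\mathbb{T}^d) \| := \Bigl( \sum_{k \in \mathbb{Z}^d} w(k)^2 \, |\hat{f}(k)|^2 \Bigr)^{1/2}
\tag{1}
\]
is finite. Clearly, $H^w(\mathbb{T}^d)$ is a Hilbert space with respect to the inner product
\[
\langle f, g \rangle_w := \sum_{k \in \mathbb{Z}^d} \hat{f}(k) \, \overline{\hat{g}(k)} \, w(k)^2,
\]
and $\{ f_k \}_{k \in \mathbb{Z}^d}$ is an orthonormal basis (ONB) in $H^w(\mathbb{T}^d)$, where $f_k(x) := \frac{e^{ikx}}{w(k)}$ for $x \in \mathbb{T}^d$ and $k \in \mathbb{Z}^d$, and $kx := \sum_{j=1}^{d} k_j x_j$. For the concrete spaces that we will consider in this paper, the weights satisfy the additional conditions
\[
1 = w(0) \le w(k) \quad \text{for all } k \in \mathbb{Z}^d \qquad \text{and} \qquad \lim_{|k| \to \infty} w(k) = \infty.
\tag{2}
\]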

This ensures that there is a compact embedding $H^w(\mathbb{T}^d) \hookrightarrow L_2(\mathbb{T}^d)$ of norm one. Setting $e_k(x) := e^{ikx}$, we have for every $f \in H^w(\mathbb{T}^d)$


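\[
f = \sum_{k \in \mathbb{Z}^d} \langle f, f_k \rangle_w \, f_k = \sum_{k \in \mathbb{Z}^d} \frac{1}{w(k)} \langle f, f_k \rangle_w \, e_k
\tag{3}
\]
with convergence in $L_2(\mathbb{T}^d)$. The series in (3) is a Schmidt representation of the embedding $I_d : H^w(\mathbb{T}^d) \to L_2(\mathbb{T}^d)$, since $\{ e_k \}_{k \in \mathbb{Z}^d}$ is an ONB in $L_2(\mathbb{T}^d)$. Let $(\sigma_n)_{n \in \mathbb{N}}$ denote the non-increasing rearrangement of $\bigl( 1/w(k) \bigr)_{k \in \mathbb{Z}^d}$. Then
\[
a_n(I_d : H^w(\mathbb{T}^d) \to L_2(\mathbb{T}^d)) = s_n(I_d : H^w(\mathbb{T}^d) \to L_2(\mathbb{T}^d)) = \sigma_n.
\tag{4}
\]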

From a theoretical point of view the story would be finished here, but in concrete cases it is a highly non-trivial task to find this rearrangement. In particular, this requires subtle combinatorial estimates. Now we give the definitions of the Sobolev spaces that we are going to investigate. As in the general case (1) above, their norms are again weighted $\ell_2$-sums of Fourier coefficients.

Definition 2.1 (Isotropic spaces) Let $d \in \mathbb{N}$, $s > 0$ and $0 < p < \infty$. Then the Sobolev space $H^{s,p}(\mathbb{T}^d)$ is defined as the collection of all $f \in L_2(\mathbb{T}^d)$ such that
\[
\| f \,|\, H^{s,p}(\mathbb{T}^d) \| := \Bigl( \sum_{k \in \mathbb{Z}^d} \Bigl( 1 + \sum_{j=1}^{d} |k_j|^p \Bigr)^{2s/p} |\hat{f}(k)|^2 \Bigr)^{1/2} < \infty.
\tag{5}
\]

Remark 2.2 For fixed $s > 0$, all these norms are equivalent. That means, in fact all spaces $H^{s,p}(\mathbb{T}^d)$, $0 < p < \infty$, coincide, i.e. the superscript $p$ only indicates which norm we are using. Moreover let us point out that for integer smoothness $s = m \in \mathbb{N}$ this family of norms is closely related to the most common classical norms on the isotropic Sobolev space $H^m(\mathbb{T}^d)$. For $p = 2$ we have the inequalities
\[
\frac{1}{\sqrt{m!}} \, \| f \,|\, H^{m,2}(\mathbb{T}^d) \|
\le \Bigl( \sum_{|\alpha| \le m} \| D^\alpha f \,|\, L_2(\mathbb{T}^d) \|^2 \Bigr)^{1/2}
\le \| f \,|\, H^{m,2}(\mathbb{T}^d) \|.
\]
The term in the middle is the original norm on $H^m(\mathbb{T}^d)$. Note that the equivalence constants depend only on the smoothness parameter $m$ but not on the dimension $d$. For $p = 2m$ we have even an equality with another classical equivalent norm on $H^m(\mathbb{T}^d)$ that is often used and works only with the highest derivatives in each coordinate direction,
\[
\| f \,|\, H^{m,2m}(\mathbb{T}^d) \| = \Bigl( \| f \,|\, L_2(\mathbb{T}^d) \|^2 + \sum_{j=1}^{d} \Bigl\| \frac{\partial^m f}{\partial x_j^m} \,\Big|\, L_2(\mathbb{T}^d) \Bigr\|^2 \Bigr)^{1/2}.
\]
For the proof of these facts see Section 2 of [5].


Definition 2.3 (Spaces of dominating mixed smoothness) Let $d \in \mathbb{N}$ and $s > 0$. Then the Sobolev space $H^s_{\mathrm{mix}}(\mathbb{T}^d)$ is defined as the collection of all $f \in L_2(\mathbb{T}^d)$ such that
\[
\| f \,|\, H^s_{\mathrm{mix}}(\mathbb{T}^d) \| := \Bigl( \sum_{k \in \mathbb{Z}^d} \prod_{j=1}^{d} \bigl( 1 + |k_j| \bigr)^{2s} |\hat{f}(k)|^2 \Bigr)^{1/2} < \infty.
\tag{6}
\]
In [6] we considered some other (quite natural) equivalent norms on $H^s_{\mathrm{mix}}(\mathbb{T}^d)$, but for simplicity we restrict ourselves in this note just to the norm given in Definition 2.3.

3 Preasymptotics for approximation in isotropic Sobolev spaces

Approximation of isotropic Sobolev embeddings is a topic with a long history, and many authors have contributed to this subject, see the monographs by Temlyakov [9] and Tikhomirov [10], and the references therein. By Theorems 4.1 and 4.2 in Chapter 2 of [9] we have for the Sobolev spaces $H^{s,p}(\mathbb{T}^d)$ the two-sided estimates
\[
c_{s,p}(d) \, n^{-s/d} \le a_n(I_d : H^{s,p}(\mathbb{T}^d) \to L_2(\mathbb{T}^d)) \le C_{s,p}(d) \, n^{-s/d}, \qquad n \in \mathbb{N},
\]
with certain unspecified positive constants $c_{s,p}(d)$ and $C_{s,p}(d)$. Only recently the dependence of these constants on the parameters $s$, $p$ and, more importantly, on the dimension $d$ was investigated. For fixed $s > 0$ and some values of the parameter $p$ this was done in [5], further results can be found in [8, 1, 4, 12]. Quite surprisingly, it turned out that for fixed $s$ and $p$ the constants decay polynomially in $d$.

From a computational point of view, it is even more relevant to have estimates in the preasymptotic range, i.e. for small $n$, say $n \le 2^d$. In Theorem 4.6 of [5], for $s > 0$, $d \in \mathbb{N}$ and $2d + 2 \le n \le 2^d$, an almost matching two-sided estimate was shown,
\[
\Bigl( \frac{\ln 2}{\ln(4n)} \Bigr)^{s} \le a_n(I_d : H^{s,1}(\mathbb{T}^d) \to L_2(\mathbb{T}^d)) \le \Bigl( \frac{\ln(2d + 1)}{\ln n} \Bigr)^{s}.
\tag{7}
\]
But the proof worked only for the parameter value $p = 1$, the only case where an exact formula for the cardinalities
\[
C_p(r, d) := \operatorname{card}\{ k \in \mathbb{Z}^d : |k_1|^p + \ldots + |k_d|^p \le r \}, \qquad r, d \in \mathbb{N},
\tag{8}
\]
was available. Somewhat later a connection to entropy numbers was found which made it possible to deal with the full parameter range $0 < p < \infty$, see Theorem 1 of [4]. For all $s, p > 0$ and $d \le n \le 2^d$ the equivalence
\[
a_n(I_d : H^{s,p}(\mathbb{T}^d) \to L_2(\mathbb{T}^d)) \sim_{s,p} \Bigl( \frac{\ln(1 + d/\ln n)}{\ln n} \Bigr)^{s/p}
\tag{9}
\]


was shown, where the symbol $\sim_{s,p}$ means that the equivalence constants depend only on $s$ and $p$, but not on $d$ and $n$. However, in [4] these constants were not explicitly given. The aim of this section is to determine explicit constants in (9). Concerning the proof technique, we return to combinatorial estimates; the crucial quantities will be the cardinalities $C_p(r, d)$ defined in (8). Their role is explained by the following fact: If $r \le d$, then $(r + 1)^{-s/p} \in \{ (1 + |k_1|^p + \ldots + |k_d|^p)^{-s/p} : k \in \mathbb{Z}^d \}$, and hence, applying (4) to the weight that determines $H^{s,p}(\mathbb{T}^d)$, we have
\[
a_n(I_d : H^{s,p}(\mathbb{T}^d) \to L_2(\mathbb{T}^d)) = (r + 1)^{-s/p}, \qquad \text{if } n = C_p(r, d).
\tag{10}
\]
Before we can prove the main results in this section we need some preliminary estimates for the cardinalities $C_p(r, d)$. Our first result implies that, for $d \ge 4$ and all $p$, the range $2 \le n \le C_p(\lfloor d/2 \rfloor, d)$ is sufficiently large, i.e. contains the preasymptotic range $2 \le n \le 2^d$.

Lemma 3.1 Let $0 < p < \infty$ and $d, r \in \mathbb{N}$. Then we have
\[
C_p(r, d) \ge \Bigl( 1 + \frac{2d}{r} \Bigr)^{r} \qquad \text{for all } r \le d
\tag{11}
\]
and
\[
C_p(\lfloor d/2 \rfloor, d) \ge 2^d \qquad \text{for all } d \ge 4.
\tag{12}
\]

Proof. As in formula (3.3) in [5] we have for $r \le d$
\[
C_p(r, d) = 1 + \sum_{\ell=1}^{r} 2^{\ell} \binom{d}{\ell} \operatorname{card}\Bigl\{ (m_i) \in \mathbb{N}^{\ell} : \sum_{i=1}^{\ell} m_i^p \le r \Bigr\}.
\tag{13}
\]
For the convenience of the reader we sketch how to prove that. To find all vectors $(k_j) \in \mathbb{Z}^d$ with $\sum_{j=1}^{d} |k_j|^p \le r$ we note that the number $\ell$ of non-zero coordinates $k_j \neq 0$ is at most $r$. If $\ell = 0$, we only have the null vector. For $\ell = 1, \ldots, r$ we consider all vectors $(m_i) \in \mathbb{N}^{\ell}$ with $\sum_{i=1}^{\ell} m_i^p \le r$. Now select $\ell$ coordinates $j \in \{1, \ldots, d\}$ and place the $m_i$'s there; this gives a vector $(k_j)$. There are $\binom{d}{\ell}$ possibilities for selecting these positions. Moreover we can flip the sign in all non-zero coordinates, altogether there are $2^{\ell}$ choices of signs $\pm$. In this way we obtain all vectors $(k_j) \in \mathbb{Z}^d$ with $\sum_{j=1}^{d} |k_j|^p \le r$, which proves formula (13).

Clearly all cardinalities in (13) are at least one, just take all $m_i = 1$. This implies
\[
C_p(r, d) \ge \sum_{\ell=0}^{r} 2^{\ell} \binom{d}{\ell} \ge \sum_{\ell=0}^{r} 2^{\ell} \Bigl( \frac{d}{r} \Bigr)^{\ell} \binom{r}{\ell} = \Bigl( 1 + \frac{2d}{r} \Bigr)^{r}.
\]
Here we used the inequality $\binom{d}{\ell} \ge \bigl( \frac{d}{r} \bigr)^{\ell} \binom{r}{\ell}$ for $\ell = 0, 1, \ldots, r$. For $\ell = 0$ this is trivial, and for $1 \le \ell \le r$ we have


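\[
\frac{\binom{d}{\ell}}{\bigl( \frac{d}{r} \bigr)^{\ell} \binom{r}{\ell}} = \prod_{j=1}^{\ell} \frac{(d - \ell + j)\, r}{(r - \ell + j)\, d} \ge 1,
\]
since all factors are $\ge 1$. This proves (11).

To prove (12) we distinguish two cases.

Case 1. Let $d$ be an even integer, i.e. $d = 2m$ for some $m \in \mathbb{N}$. Then, taking only the last summand in (13) and using $\binom{n}{k} \ge \bigl( \frac{n}{k} \bigr)^{k}$, we get
\[
C_p(\lfloor d/2 \rfloor, d) = C_p(m, 2m) \ge 2^m \binom{2m}{m} \ge 2^m \cdot \Bigl( \frac{2m}{m} \Bigr)^{m} = 2^{2m} = 2^d.
\]

Case 2. Let now $d \ge 5$ be an odd integer, i.e. $d = 2m + 1$ for some $m \ge 2$. Taking now the last two summands in (13) we obtain
\[
C_p(\lfloor d/2 \rfloor, d) = C_p(m, 2m + 1)
\ge 2^{m-1} \binom{2m+1}{m-1} + 2^m \binom{2m+1}{m}
\ge 2^{m-1} \Bigl( \frac{2m+1}{m-1} \Bigr)^{m-1} + 2^m \Bigl( \frac{2m+1}{m} \Bigr)^{m}
\]
\[
= 2^{m-1} \cdot 2^{m-1} \Bigl( 1 + \frac{3}{2(m-1)} \Bigr)^{m-1} + 2^m \cdot 2^m \Bigl( 1 + \frac{1}{2m} \Bigr)^{m}
\ge 2^{2m-2} \Bigl( 1 + \frac{3}{2} \Bigr) + 2^{2m} \Bigl( 1 + \frac{1}{2} \Bigr)
= 2^{2m} \Bigl( \frac{5}{8} + \frac{3}{2} \Bigr) \ge 2^d,
\]
where we used in the last line the estimate $(1 + x)^n \ge 1 + nx$ for $n \in \mathbb{N}$ and $x > 0$, which follows from the binomial formula. This finishes the proof.

Our first main result in this section is a lower estimate for the approximation numbers $a_n$ of the embedding $H^{s,p}(\mathbb{T}^d) \hookrightarrow L_2(\mathbb{T}^d)$ in the preasymptotic range $n \le 2^d$.

Theorem 3.2 Let $s > 0$, $0 < p < \infty$, $d \ge 4$ and $2d + 1 \le n \le 2^d$. Then
\[
a_n(I_d : H^{s,p}(\mathbb{T}^d) \to L_2(\mathbb{T}^d)) \ge \Bigl( \frac{\ln\bigl( 1 + \frac{3d}{\ln n} \bigr)}{3 \ln n} \Bigr)^{s/p}.
\tag{14}
\]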

Proof. Denote $a_n := a_n(I_d : H^{s,p}(\mathbb{T}^d) \to L_2(\mathbb{T}^d))$. Fix $n \in \mathbb{N}$ with $2d + 1 \le n \le 2^d$. From the definition (8) of $C_p(r, d)$ we see that $C_p(1, d) = 2d + 1$. By Lemma 3.1 there is an $r \in \mathbb{N}$ with $2 \le r \le \lfloor d/2 \rfloor$ such that $C_p(r - 1, d) \le n \le C_p(r, d)$. Moreover
\[
\ln n \ge \ln C_p(r - 1, d) \ge (r - 1) \ln\Bigl( 1 + \frac{2d}{r - 1} \Bigr) \ge (r - 1) \ln 5,
\]
since $d \ge 2r$. Consequently
\[
\frac{1}{r - 1} \ge \frac{\ln 5}{\ln n}.
\]


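Inserting this in the previous inequality we get
\[
\ln n \ge (r - 1) \ln\Bigl( 1 + \frac{2 \ln 5 \cdot d}{\ln n} \Bigr) \ge (r - 1) \ln\Bigl( 1 + \frac{3d}{\ln n} \Bigr)
\qquad \text{and} \qquad
\frac{1}{r - 1} \ge \frac{\ln\bigl( 1 + \frac{3d}{\ln n} \bigr)}{\ln n}.
\]
Since $r \ge 2$, we have $r + 1 \le 3(r - 1)$, and altogether this yields
\[
a_n \ge a_{C_p(r, d)} = \frac{1}{(r + 1)^{s/p}} \ge \frac{1}{(3(r - 1))^{s/p}} \ge \Bigl( \frac{\ln\bigl( 1 + \frac{3d}{\ln n} \bigr)}{3 \ln n} \Bigr)^{s/p}.
\]
The proof is finished.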



Now we pass to upper estimates. Here we distinguish between two cases for the parameter $p$; we begin with $1 \le p < \infty$. The first estimate (15) in the theorem below is a slight improvement of (7): it holds for all parameter values $p \ge 1$ and not only for $p = 1$, while the second estimate (16) provides explicit constants for the correct order in $n$ and $d$, see (9).

Theorem 3.3 Let $d \in \mathbb{N}$, $d \ge 4$, $s > 0$, $1 \le p < \infty$ and $2d + 1 \le n \le 2^d$. Then
\[
a_n\bigl( I_d : H^{s,p}(\mathbb{T}^d) \to L_2(\mathbb{T}^d) \bigr) \le \Bigl( \frac{\frac{1}{2} + \ln d}{\ln n} \Bigr)^{s/p}
\tag{15}
\]
and
\[
a_n\bigl( I_d : H^{s,p}(\mathbb{T}^d) \to L_2(\mathbb{T}^d) \bigr) \le \Bigl( \frac{3 + 2 \ln \frac{d}{\ln n}}{\ln(2\sqrt{\pi}\, n)} \Bigr)^{s/p}.
\tag{16}
\]

Proof. Since $p\ge 1$, we have $|m|^p\ge |m|$ for all $m\in\mathbb Z$. So, for any $r\in\mathbb N$,
$$
C_p(r,d) = \operatorname{card}\Big\{k\in\mathbb Z^d : \sum_{j=1}^d |k_j|^p\le r\Big\} \le C_1(r,d) = \sum_{\ell=0}^{\min\{r,d\}} 2^\ell\binom{r}{\ell}\binom{d}{\ell}.
$$
The formula for $C_1(r,d)$ was shown in Lemma 3.2 in [5]. Given any $n\in\mathbb N$ with $2d+1\le n\le 2^d$, we have $C_p(r-1,d)\le n\le C_p(r,d)$ for some $r\in\mathbb N$ with $2\le r\le \frac d2$, and hence
$$
a_n := a_n\big(I_d : H^{s,p}(\mathbb T^d)\to L_2(\mathbb T^d)\big) \le a_{C_p(r-1,d)} = r^{-s/p}. \tag{17}
$$
Now let us prove (15). If $r=2$, then $d\ge 4$, and hence
$$
C_1(2,d) = 1+2d+2d^2 = d^2\Big(2+\frac2d+\frac1{d^2}\Big) \le \frac{41}{16}\,d^2 \le e\,d^2.
$$
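The closed formula for $C_1(r,d)$ quoted from [5] can be sanity-checked by brute force for small parameters. The following is only an illustrative sketch; the function names are ours and not part of the paper.

```python
from itertools import product
from math import comb


def count_l1_ball(r, d):
    """Brute-force count of k in Z^d with |k_1| + ... + |k_d| <= r."""
    return sum(
        1
        for k in product(range(-r, r + 1), repeat=d)
        if sum(abs(x) for x in k) <= r
    )


def c1_formula(r, d):
    """Closed formula: sum over l of 2^l * binom(r, l) * binom(d, l)."""
    return sum(2**l * comb(r, l) * comb(d, l) for l in range(min(r, d) + 1))


if __name__ == "__main__":
    for d in range(1, 5):
        for r in range(0, 5):
            print(d, r, count_l1_ball(r, d), c1_formula(r, d))
```

In particular the formula reproduces $C_1(1,d)=2d+1$ and $C_1(2,d)=1+2d+2d^2$, the two values used in the proof.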


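If $r\ge 3$, then $d\ge 6$. Using $\binom{d}{\ell}\le\frac{d^\ell}{\ell!}$ we get
$$
C_1(r,d) = \sum_{\ell=0}^{r} 2^\ell\binom{r}{\ell}\binom{d}{\ell}
\le \max_{\ell\ge 0}\frac{2^\ell}{\ell!}\cdot\sum_{\ell=0}^{r}\binom{r}{\ell}d^\ell
= 2(d+1)^r
\le 2^{r/3}\Big(1+\frac1d\Big)^r d^r
\le \Big(\frac76\sqrt[3]{2}\Big)^r d^r
\le (\sqrt{e}\,d)^r,
$$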

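which gives $\ln n\le \ln C_p(r,d)\le \ln C_1(r,d)\le r\big(\tfrac12+\ln d\big)$. Together with (17) this implies (15).

To prove inequality (16) we first observe that
$$
\binom{d}{\ell} \le \frac{d^\ell}{\ell!} = \frac{r^\ell}{\ell!}\Big(\frac dr\Big)^\ell
\le \max_{0\le\ell\le r}\frac{r^\ell}{\ell!}\cdot\Big(\frac dr\Big)^\ell
= \frac{r^r}{r!}\Big(\frac dr\Big)^\ell
\le \frac{e^r}{\sqrt{2\pi r}}\Big(\frac dr\Big)^\ell
$$
by Stirling's formula. This implies
$$
n \le C_p(r,d) \le C_1(r,d) \le \frac{e^r}{2\sqrt{\pi}}\sum_{\ell=0}^{r}\binom{r}{\ell}\Big(\frac{2d}{r}\Big)^\ell = \frac{e^r}{2\sqrt{\pi}}\Big(1+\frac{2d}{r}\Big)^r
$$
and
$$
\ln(2\sqrt{\pi}\,n) \le r + r\ln\Big(1+\frac{2d}{r}\Big) \le r\Big(1+\ln\Big(\frac{d}{2r}+\frac{2d}{r}\Big)\Big) = r\ln\frac{5ed}{2r}. \tag{18}
$$
As one can easily verify, the function $f(x)=\frac{\ln x}{\sqrt{x}}$ is decreasing for $x\ge e^2$, whence
$$
\ln x \le \sqrt{\frac{x}{x_0}}\,\ln x_0 \quad\text{whenever } x\ge x_0\ge e^2.
$$
Applying this with $x:=\frac{5ed}{2r}\ge x_0:=5e$ we obtain
$$
\ln n \le \ln(2\sqrt{\pi}\,n) \le r\sqrt{\frac{d}{2r}}\,\ln(5e) = \sqrt{\frac{dr}{2}}\,\ln(5e),
$$
which implies
$$
\frac{2d}{r} \le 2d\cdot\frac d2\Big(\frac{\ln(5e)}{\ln n}\Big)^2 = \Big(\frac{d\ln(5e)}{\ln n}\Big)^2.
$$
Since $n\le 2^d$, we have $1\le\frac{d\ln 2}{\ln n}$, and therefore
$$
1+\frac{2d}{r} \le \underbrace{\big((\ln 2)^2+(1+\ln 5)^2\big)}_{\le 7.29\,\le\,7.38\,\le\,e^2}\Big(\frac{d}{\ln n}\Big)^2 \le e^2\Big(\frac{d}{\ln n}\Big)^2.
$$
Inserting this in (18) we arrive at
$$
\frac1r \le \frac{1+\ln\big(1+\frac{2d}{r}\big)}{\ln(2\sqrt{\pi}\,n)} \le \frac{3+2\ln\frac{d}{\ln n}}{\ln(2\sqrt{\pi}\,n)},
$$
and in view of (17) the proof is finished.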

Remark 3.4 Clearly the approximation numbers strongly depend on the chosen norm in the Sobolev space. As shown by (14), (15) and (16), in the preasymptotic range the norm parameter $p$ appears even in the exponent $s/p$. This is an interesting effect, which is in contrast to the asymptotic behaviour, where the rate $a_n\sim n^{-s/d}$ is the same for all norms (considered in this paper), while the norm parameter $p$ influences only the hidden equivalence constants, see e.g. Section 4 of [5].

Remark 3.5 Comparing the two estimates in the theorem, it is easy to see that for small $n$ and $d$ the first inequality (15) gives a better bound than (16). However, for large $d$ and moderate $n$ the bound in (16) is better. For example, if $d\ge 26$ and $2^{\sqrt{26d}}\le n\le 2^d$, then
$$
\frac{e^3d^2}{(\ln n)^2} \le \frac{e^3d^2}{26d\cdot(\ln 2)^2} < 1.61\,d < \sqrt{e}\,d,
$$
and therefore
$$
\frac{3+2\ln\frac{d}{\ln n}}{\ln(2\sqrt{\pi}\,n)} < \frac{\ln\frac{e^3d^2}{(\ln n)^2}}{\ln n} < \frac{\frac12+\ln d}{\ln n}.
$$
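To get a feeling for Remark 3.5, the right-hand sides of (14), (15) and (16) can simply be evaluated. The sketch below does this with helper names of our own choosing; the sample values of $d$, $n$, $s$, $p$ are arbitrary illustrations.

```python
import math


def lower_bound_14(n, d, s, p):
    """Right-hand side of (14)."""
    return (math.log(1 + 3 * d / math.log(n)) / (3 * math.log(n))) ** (s / p)


def upper_bound_15(n, d, s, p):
    """Right-hand side of (15)."""
    return ((0.5 + math.log(d)) / math.log(n)) ** (s / p)


def upper_bound_16(n, d, s, p):
    """Right-hand side of (16)."""
    num = 3 + 2 * math.log(d / math.log(n))
    den = math.log(2 * math.sqrt(math.pi) * n)
    return (num / den) ** (s / p)


if __name__ == "__main__":
    # Large d, moderate n (d >= 26 and n >= 2^sqrt(26 d)): (16) is the smaller bound.
    print(upper_bound_15(2**60, 100, 1, 1), upper_bound_16(2**60, 100, 1, 1))
    # Small d and n: (15) is the smaller bound.
    print(upper_bound_15(16, 4, 1, 1), upper_bound_16(16, 4, 1, 1))
```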

Now let $0<p<1$. We will prove similar results as in the case $p\ge 1$, and also the proof strategy is the same. The heart of the proof is to find good upper bounds for the cardinalities $C_p(r,d)$. But the difficulty is that for $p<1$ they are larger than $C_1(r,d)$, and no exact formula is available. Therefore some modifications are required, which will involve volume arguments. As a preparation we state in the following lemma some auxiliary results that we will need later.

Lemma 3.6 (i) Let $0<p<\infty$ and $d\in\mathbb N$. Then the volume ($d$-dimensional Lebesgue measure) of $B_p^d := \big\{x\in\mathbb R^d : \sum_{j=1}^d |x_j|^p\le 1\big\}$ is
$$
\operatorname{vol}_d\big(B_p^d\big) = \frac{2^d\,\Gamma(1+1/p)^d}{\Gamma(1+d/p)}.
$$
(ii) For all real numbers $x>1$ and $a>0$ the Gamma function satisfies
$$
\Gamma(1+x)\le x^x \quad\text{and}\quad \Gamma(1+x)\ge a^x e^{-a}.
$$

Proof. Formula (i) is well known; it can be found e.g. in [11]. For a proof of the first inequality in (ii) see e.g. [5]. The second inequality in (ii) is easy to prove. We have
$$
\Gamma(1+x) = a^x\int_0^\infty\Big(\frac ta\Big)^x e^{-t}\,dt \ge a^x\int_a^\infty e^{-t}\,dt = a^x e^{-a}.
$$
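Both parts of Lemma 3.6 are easy to probe numerically with the standard-library Gamma function. The following is a small sketch (function names are ours), not a proof.

```python
import math


def vol_unit_ball(p, d):
    """Volume formula of Lemma 3.6 (i) for the unit ball of the p-(quasi-)norm in R^d."""
    return (2 * math.gamma(1 + 1 / p)) ** d / math.gamma(1 + d / p)


def gamma_bounds_hold(x, a):
    """Check a^x e^{-a} <= Gamma(1+x) <= x^x for given x > 1 and a > 0."""
    g = math.gamma(1 + x)
    return a**x * math.exp(-a) <= g <= x**x


if __name__ == "__main__":
    print(vol_unit_ball(1, 2))   # area of the diamond |x1| + |x2| <= 1
    print(vol_unit_ball(2, 2))   # area of the Euclidean unit disc
    print(all(gamma_bounds_hold(x, a)
              for x in (1.5, 2.0, 5.0, 10.0)
              for a in (0.5, 1.0, 5.0, 20.0)))
```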




Now we give upper preasymptotic estimates for the case $0<p<1$.

Theorem 3.7 Let $d\in\mathbb N$, $d\ge 4$, $s>0$, $0<p<1$ and $2d+1\le n\le 2^d$. Moreover let $c:=\frac{3\sqrt{\pi}}{2}=2.6586\ldots$. Then we have
$$
a_n\big(I_d : H^{s,p}(\mathbb T^d)\to L_2(\mathbb T^d)\big) \le e^{s}\Bigg(\frac{\frac2p+\ln\frac{2d}{\ln cn}}{\ln cn}\Bigg)^{s/p}.
$$

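Proof. Let $n\in\mathbb N$, $2d+1\le n\le 2^d$. By Lemma 3.1 there is an integer $r$ with $2\le r\le d/2$ and $C_p(r-1,d)\le n\le C_p(r,d)$. This implies
$$
a_n := a_n\big(I_d : H^{s,p}(\mathbb T^d)\to L_2(\mathbb T^d)\big) \le a_{C_p(r-1,d)} = r^{-s/p}.
$$
Now we must estimate $r$ in terms of $n$. According to (13) we have
$$
C_p(r,d) = 1+\sum_{\ell=1}^{r} 2^\ell\binom{d}{\ell}A_p(r,\ell), \tag{19}
$$
where $A_p(r,\ell)=\operatorname{card}\mathcal A_p(r,\ell)$ and
$$
\mathcal A_p(r,\ell) = \Big\{(m_i)\in\mathbb N^\ell : \sum_{i=1}^{\ell} m_i^p\le r\Big\}.
$$
Our first aim is to estimate the cardinalities $A_p(r,\ell)$. To this end we consider the pairwise disjoint cubes $Q_k := k+(-1,0]^\ell$, $k\in\mathbb N^\ell$. We have the inclusion
$$
\bigcup_{k\in\mathcal A_p(r,\ell)} Q_k \subseteq r^{1/p}B_p^\ell\cap\mathbb R_+^\ell,
$$
where $B_p^\ell=\{x\in\mathbb R^\ell : \sum_{j=1}^\ell |x_j|^p\le 1\}$ is the unit ball in $\mathbb R^\ell$ equipped with the $p$-norm. Taking volume (Lebesgue measure in $\mathbb R^\ell$) we get
$$
A_p(r,\ell) \le r^{\ell/p}\,\frac{\operatorname{vol}_\ell\big(B_p^\ell\big)}{2^\ell} = r^{\ell/p}\,\frac{\Gamma(1+1/p)^\ell}{\Gamma(1+\ell/p)}.
$$
Here we used the volume formula from part (i) of Lemma 3.6. Moreover, by part (ii) of the same lemma we have
$$
\Gamma(1+1/p)\le (1/p)^{1/p} \quad\text{and}\quad \Gamma(1+\ell/p)\ge (r/p)^{\ell/p}e^{-r/p},
$$
which implies $A_p(r,\ell)\le e^{r/p}$. Inserting this in (19) we obtain
$$
C_p(r,d) \le e^{r/p}\sum_{\ell=0}^{r} 2^\ell\binom{d}{\ell} \le e^{r/p}\sum_{\ell=0}^{r}\underbrace{\frac{(2d)^\ell}{\ell!}}_{=:\,b_\ell}.
$$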


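The sum can easily be bounded by comparing with a geometric series. Since $2\le r\le d/2$, the summands satisfy $\frac{b_\ell}{b_{\ell-1}}=\frac{2d}{\ell}\ge 4$ for $\ell=1,\ldots,r$. By induction we get $b_\ell\le 4^{\ell-r}b_r$, and using Stirling's formula we obtain
$$
\sum_{\ell=0}^{r} b_\ell \le b_r\sum_{\ell=0}^{r} 4^{\ell-r} \le \frac{4b_r}{3} = \frac43\cdot\frac{(2d)^r}{r!} \le \frac{4}{3\sqrt{2\pi r}}\Big(\frac{2ed}{r}\Big)^r \le \frac{2}{3\sqrt{\pi}}\Big(\frac{2ed}{r}\Big)^r.
$$
This implies, with constant $c:=\frac{3\sqrt{\pi}}{2}$,
$$
\ln cn \le \ln\big(c\,C_p(r,d)\big) \le r\ln\Big(\frac{2e^{1+1/p}d}{r}\Big).
$$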

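Now we set
$$
\alpha := \frac{p}{p+1}, \quad\text{which gives}\quad \frac1\alpha = 1+\frac1p \quad\text{and}\quad \frac{1}{1-\alpha} = p+1.
$$
It is easy to show that the function $f(x):=x^{-\alpha}\ln x$ attains its maximum over $[1,\infty)$ at $e^{1/\alpha}$, whence
$$
\ln x \le x^\alpha f(e^{1/\alpha}) = \frac{x^\alpha}{\alpha e} \quad\text{for all } x\ge 1.
$$
Applying this with $x := e^{1+1/p}\,\frac{2d}{r} = e^{1/\alpha}\,\frac{2d}{r}$ we get
$$
\ln cn \le r\ln x \le r\,\frac{x^\alpha}{\alpha e} = \frac r\alpha\Big(\frac{2d}{r}\Big)^\alpha.
$$
Multiplying this inequality with $\big(\frac{2d}{r}\big)^{1-\alpha}$ and dividing by $\ln cn$ we obtain
$$
\Big(\frac{2d}{r}\Big)^{1-\alpha} \le \frac{2d}{\alpha\ln cn}.
$$
Recall that $\frac1\alpha=1+1/p$ and $\frac{1}{1-\alpha}=p+1$, and therefore
$$
\frac{2d}{r} \le (1+1/p)^{p+1}\Big(\frac{2d}{\ln cn}\Big)^{p+1} \le e^{\frac{p+1}{p}}\Big(\frac{2d}{\ln cn}\Big)^{p+1} = \Big(e^{1/p}\,\frac{2d}{\ln cn}\Big)^{p+1}.
$$
We conclude that
$$
\ln cn \le r\ln\Big(e^{1/\alpha}\,\frac{2d}{r}\Big) \le r\ln\Big(e^{2/p}\,\frac{2d}{\ln cn}\Big)^{p+1} = r(p+1)\Big(\frac2p+\ln\frac{2d}{\ln cn}\Big).
$$
Observing that $(p+1)^{1/p}\le e$, we finally arrive at the desired estimate
$$
a_n \le r^{-s/p} \le e^{s}\Bigg(\frac{\frac2p+\ln\frac{2d}{\ln cn}}{\ln cn}\Bigg)^{s/p},
$$
and the proof is finished.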

4 Preasymptotics for approximation in mixed Sobolev spaces

In this final section we give some new preasymptotic estimates for the approximation numbers of $L_2$-embeddings of Sobolev spaces $H^s_{\mathrm{mix}}(\mathbb T^d)$ of dominating mixed smoothness as defined in Definition 2.3. In particular this definition implies
$$
\big\|I_d : H^s_{\mathrm{mix}}(\mathbb T^d)\to L_2(\mathbb T^d)\big\| = 1 \quad\text{for all } d\in\mathbb N \text{ and } s>0.
$$
For the long history of mixed Sobolev spaces we refer to the comments in [6]. It is well known that
$$
a_n\big(I_d : H^s_{\mathrm{mix}}(\mathbb T^d)\to L_2(\mathbb T^d)\big) \sim n^{-s}(\ln n)^{s(d-1)}.
$$
Note that the function $f(x):=x^{-1}(\ln x)^{d-1}$ is increasing for $1\le x\le e^{d-1}$, and $f(e^{d-1})=\big((d-1)/e\big)^{d-1}\gg 1$. Hence this equivalence relation is useless in the preasymptotic range $n\le 2^d$, since $a_n(I_d)\le\|I_d\|=1$ for all $n\in\mathbb N$. In Theorem 4.9 of [6] it was shown that for all $s>0$ and $d\in\mathbb N$ with $d\ge 2$ and $8\le n\le d^2 4^d$
$$
a_n\big(I_d : H^s_{\mathrm{mix}}(\mathbb T^d)\to L_2(\mathbb T^d)\big) \le \Big(\frac{e^2}{n}\Big)^{s/(2+\log_2 d)}. \tag{20}
$$
We shall improve this estimate as follows.

Theorem 4.1 Let $s>0$ and $d\in\mathbb N$. Then we have for all $n\ge 6$
$$
a_n\big(I_d : H^s_{\mathrm{mix}}(\mathbb T^d)\to L_2(\mathbb T^d)\big) \le \Big(\frac{16}{3n}\Big)^{s/(1+\log_2 d)}. \tag{21}
$$

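Proof. Our strategy is similar to the proof of Theorem 4.9 in [6]. For $r,d\in\mathbb N$ we consider the cardinalities
$$
C(r,d) := \operatorname{card}\Big\{k\in\mathbb Z^d : \prod_{j=1}^d(1+|k_j|)\le r\Big\}.
$$
Let $m:=\min\{d,\lfloor\log_2 r\rfloor\}$. By formula (3.1) in [6] we have
$$
C(r,d) = 1+\sum_{\ell=1}^{m} 2^\ell\binom{d}{\ell}A(r,\ell), \tag{22}
$$
where $A(r,\ell)=\operatorname{card}\mathcal A(r,\ell)$ with
$$
\mathcal A(r,\ell) = \Big\{k\in\mathbb N^\ell : \prod_{j=1}^{\ell}(1+k_j)\le r\Big\}.
$$
For the disjoint cubes $Q_k:=k+[0,1)^\ell$ with $k\in\mathbb N^\ell$ we get the inclusion
$$
\bigcup_{k\in\mathcal A(r,\ell)} Q_k \subseteq H(r,\ell) := \Big\{x\in\mathbb R^\ell : x_j\ge 1,\ \prod_{j=1}^{\ell}x_j\le r\Big\}.
$$
Taking volume in $\mathbb R^\ell$ this implies $A(r,\ell)\le\operatorname{vol}_\ell(H(r,\ell))$. The set $H(r,\ell)$ is of hyperbolic cross type, and for its volume we use the estimate, for all integers $\ell\in\mathbb N$ and all real numbers $r\ge 1$,
$$
v_\ell(r) := \operatorname{vol}_\ell(H(r,\ell)) \le \frac{r^2}{4}. \tag{23}
$$
Inequality (23) can easily be verified by induction on $\ell$. Indeed, for $\ell=1$ we have
$$
v_1(r) = r-1 \le \frac{r^2}{4},
$$
and if (23) holds for some $\ell\in\mathbb N$ and all $r\ge 1$, then the recursion formula
$$
v_{\ell+1}(r) = \int_1^r v_\ell(r/t)\,dt
$$
implies
$$
v_{\ell+1}(r) \le \frac{r^2}{4}\int_1^r\frac{dt}{t^2} \le \frac{r^2}{4}\cdot\int_1^\infty\frac{dt}{t^2} = \frac{r^2}{4}.
$$
If $r\ge 2$, then $1\le\frac{r^2}{4}$, and combining (22) and (23) we obtain
$$
C(r,d) \le \frac{r^2}{4}\sum_{\ell=0}^{m} 2^\ell\binom{d}{\ell} \le \frac{r^2}{4}\sum_{\ell=0}^{m}\frac{(2d)^\ell}{\ell!} = \frac{r^2}{4}\sum_{\ell=0}^{m} b_\ell, \tag{24}
$$
where we used $\binom{d}{\ell}\le\frac{d^\ell}{\ell!}$ and have set $b_\ell:=\frac{(2d)^\ell}{\ell!}$. Since $m\le d$, we have
$$
\frac{b_\ell}{b_{\ell-1}} = \frac{2d}{\ell} \ge 2 \quad\text{for all } \ell=1,\ldots,m.
$$
Similarly as in the proof of Theorem 3.7, by comparing with a geometric series, this implies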


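$$
\sum_{\ell=0}^{m} b_\ell \le 2b_m = 2\cdot\frac{4^m}{m!}\cdot\Big(\frac d2\Big)^m. \tag{25}
$$
Observing that $\max_{m\in\mathbb N}\frac{4^m}{m!}=\frac{4^3}{3!}=\frac{32}{3}$, formulae (24) and (25) yield for $d\ge 2$
$$
C(r,d) \le \frac{r^2}{4}\cdot\frac{64}{3}\cdot\Big(\frac d2\Big)^m \le \frac{16r^2}{3}\cdot\Big(\frac d2\Big)^{\log_2 r} = \frac{16}{3}\cdot r^{1+\log_2 d}. \tag{26}
$$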

This is also true for $d=1$, since $C(r,1)=\operatorname{card}\{k\in\mathbb Z : 1+|k|\le r\}=2r-1\le 3r$. For every $n\in\mathbb N$ with $n\ge 2$ there is an integer $r\ge 2$ such that $C(r-1,d)<n\le C(r,d)$. Then
$$
a_n\big(I_d : H^s_{\mathrm{mix}}(\mathbb T^d)\to L_2(\mathbb T^d)\big) = a_{C(r,d)}\big(I_d : H^s_{\mathrm{mix}}(\mathbb T^d)\to L_2(\mathbb T^d)\big) = r^{-s}.
$$
The first equality follows from the fact that $k\mapsto w(k):=\prod_{j=1}^d(1+|k_j|)$ is a map from $\mathbb Z^d$ onto $\mathbb N$, hence all elements of the non-increasing rearrangement $(\sigma_n)_{n\in\mathbb N}$ of $(1/w(k)^s)_{k\in\mathbb Z^d}$ are of the form $r^{-s}$ for some $r\in\mathbb N$. But the $\sigma_n$ are exactly the approximation numbers, see (4). Now $n\le C(r,d)$ and (26) finally imply the desired estimate
$$
a_n\big(I_d : H^s_{\mathrm{mix}}(\mathbb T^d)\to L_2(\mathbb T^d)\big) = r^{-s} \le \Big(\frac{16}{3n}\Big)^{s/(1+\log_2 d)},
$$
and the proof is finished.
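For small $r$ and $d$ the cardinalities $C(r,d)$ can be counted directly, which gives a quick check of the bound (26) and of the estimate in Theorem 4.1. The sketch below is illustrative only: the function names are ours, and the approximation number is computed via the characterization $a_n=r^{-s}$ for $C(r-1,d)<n\le C(r,d)$ used in the proof.

```python
import math
from itertools import product


def hyperbolic_cross_count(r, d):
    """Brute-force C(r, d) = #{k in Z^d : prod_j (1 + |k_j|) <= r}."""
    rng = range(-(r - 1), r)  # |k_j| <= r - 1 is necessary
    return sum(
        1
        for k in product(rng, repeat=d)
        if math.prod(1 + abs(x) for x in k) <= r
    )


def bound_26(r, d):
    """Right-hand side of (26)."""
    return 16 / 3 * r ** (1 + math.log2(d))


def approximation_number(n, d, s):
    """a_n = r^{-s}, where r is the smallest integer with n <= C(r, d)."""
    r = 1
    while hyperbolic_cross_count(r, d) < n:
        r += 1
    return r ** (-s)


def bound_21(n, d, s):
    """Right-hand side of (21)."""
    return (16 / (3 * n)) ** (s / (1 + math.log2(d)))


if __name__ == "__main__":
    for n in (6, 9, 17):
        print(n, approximation_number(n, 2, 1.0), bound_21(n, 2, 1.0))
```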



We conclude the paper by some comments.

Remark 4.2 Our new inequality (21) improves (20) in several respects:
• The range of $n$ is larger.
• The factor $\frac{16}{3}$ is smaller than $e^2$.
• Most importantly, the exponent $s/(1+\log_2 d)$ is better than $s/(2+\log_2 d)$.

Remark 4.3 In this note we only dealt with $L_2$-approximation, but it would be desirable to have corresponding results for $L_\infty$-approximation, too. This problem was addressed in [2], where several sharp asymptotic estimates were established, including exact constants and their dependence on the dimension $d$ and the smoothness parameter $s$. However, it seems much harder to prove preasymptotic estimates; this remains an open problem.

Acknowledgements This work was started in June 2018 during the MATRIX Workshop 'On the Frontiers of High Dimensional Computation' in Creswick. I would like to thank the organizers of the workshop for creating a stimulating scientific atmosphere, and the staff of the MATRIX Institute for providing excellent working conditions. Moreover, I thank two anonymous referees for constructive remarks.


References

1. Chen, J., Wang, H. P.: Approximation numbers of Sobolev and Gevrey type embeddings on the sphere and on the ball – Preasymptotics, asymptotics, and tractability. J. Complexity 50 (2019), 1–24.
2. Cobos, F., Kühn, T., Sickel, W.: Optimal approximation of multivariate periodic Sobolev functions in the sup-norm. J. Funct. Anal. 270 (2016), 4196–4212.
3. König, H.: Eigenvalue distribution of compact operators. Operator Theory: Advances and Applications, 16. Birkhäuser Verlag, Basel, 1986.
4. Kühn, T., Mayer, S., Ullrich, T.: Counting via entropy: new preasymptotics for the approximation numbers of Sobolev embeddings. SIAM J. Numer. Anal. 54 (2016), no. 6, 3625–3647.
5. Kühn, T., Sickel, W., Ullrich, T.: Approximation numbers of Sobolev embeddings – Sharp constants and tractability. J. Complexity 30 (2014), no. 2, 95–116.
6. Kühn, T., Sickel, W., Ullrich, T.: Approximation of mixed order Sobolev functions on the d-torus: Asymptotics, preasymptotics, and d-dependence. Constr. Approx. 42 (2015), no. 3, 353–398.
7. Pietsch, A.: Operator ideals. North-Holland Mathematical Library, 20. North-Holland Publishing Co., Amsterdam-New York, 1980.
8. Siedlecki, P., Weimar, M.: Notes on (s,t)-weak tractability: A refined classification of problems with (sub)exponential information complexity. J. Approx. Theory 200 (2015), 227–258.
9. Temlyakov, V.N.: Approximation of Periodic Functions. Nova Science, New York, 1993.
10. Tikhomirov, V.M.: Approximation Theory, in: Encyclopaedia of Math. Sciences, Analysis II, vol. 14, Springer, Berlin, 1990.
11. Wang, X.: Volumes of generalized unit balls. Math. Mag. 78 (2005), 390–395.
12. Werschulz, A., Woźniakowski, H.: A new characterization of (s,t)-weak tractability. J. Complexity 38 (2017), 68–79.

Data based construction of kernels for classification

Hrushikesh N. Mhaskar, Sergei V. Pereverzyev, Vasyl Yu. Semenov and Evgeniya V. Semenova

Hrushikesh N. Mhaskar: Institute of Mathematical Sciences, Claremont Graduate University, Claremont, CA 91711, USA
Sergei V. Pereverzyev: Johann Radon Institute for Computational and Applied Mathematics, Linz, Austria
Vasyl Yu. Semenov: R&D department, Scientific and Production Enterprise "Delta SPE", Kiev, Ukraine
Evgeniya V. Semenova: Institute of Mathematics of NASU, Kiev, Ukraine

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_8

Abstract This paper is an announcement for our longer paper in preparation. Traditional kernel based methods utilize either a fixed kernel or a combination of judiciously chosen kernels from a fixed dictionary. In contrast, we construct a data-dependent kernel utilizing the components of the eigen-decompositions of different kernels constructed using ideas from diffusion geometry, and use a regularization technique with this kernel with adaptively chosen parameters. In this paper, we illustrate our method using the two moons dataset, where we obtain a zero test error using only a minimal number of training samples.

1 Introduction

The problem of learning from labeled and unlabeled data (semi-supervised learning) has attracted considerable attention in recent years. A variety of machine learning algorithms use Tikhonov single penalty or multiple penalty schemes for regularizing with different approaches to data analysis. Many of these are kernel based algorithms that provide regularization in Reproducing Kernel Hilbert Spaces (RKHS). The problem of finding a suitable kernel for learning a real-valued function by regularization is considered, in particular, in the papers [16], [17] (see also the references therein), where different approaches were proposed. All the methods mentioned in these papers deal with some set of kernels that appear as a result of parametrization of classical kernels or linear combination of some functions. Such approaches lead to the problem of multiple kernel learning. In this way, the kernel choice problem is somehow shifted to the problem of a description of a dictionary of predefined kernels, on which multiple kernel learning is performed.

In the present paper we propose an approach to construct a kernel directly from observed data rather than choosing one from a given kernel dictionary in advance. The approach uses ideas from diffusion geometry (see, e.g. [1, 2, 3, 5, 11]), where the eigenvectors of the graph Laplacian associated to the unlabeled data are used to mimic the geometry of the underlying manifold that is usually unknown. The literature on this subject is too large to be cited extensively. The special issue [7] of Applied and Computational Harmonic Analysis is devoted to an early review of this subject. Most relevant to the current paper are the papers [5], [6], where the graph Laplacian associated with the data has been used to form additional penalty terms in a multi-parameter regularization functional of Tikhonov type. In contrast to [5], [6], we use eigenvectors and eigenfunctions of the corresponding family of graph Laplacians (rather than a combination of these graph Laplacians) to construct a data-dependent kernel that directly generates an RKHS.

In Section 2, we summarize some known theoretical facts relevant to our paper. Our numerical algorithm is described in Section 3. In Section 4, we present the experimental results with the two moons data set.

2 Background

The subject of diffusion geometry seeks to understand the geometry of the data $\{x_i\}_{i=1}^n\subset\mathbb R^D$ drawn randomly from an unknown probability distribution $\mu$, where $D$ is typically a large ambient dimension. It is assumed that the support of $\mu$ is a smooth sub-manifold of $\mathbb R^D$ having a small manifold dimension $d$. The theory works with eigenfunctions of the Laplace-Beltrami operator of this manifold. However, since the manifold is unknown, one needs to approximate the Laplace-Beltrami operator. One way to do this is using a graph Laplacian as follows. For $\varepsilon>0$ and $x,y\in\mathbb R^D$, let
$$
W^\varepsilon(x,y) := \exp\Big(-\frac{\|x-y\|^2}{4\varepsilon}\Big). \tag{1}
$$
We consider the points $\{x_i\}_{i=1}^n$ as vertices of an undirected graph with the edge weight between $x_i$ and $x_j$ given by $W^\varepsilon(x_i,x_j)$, thereby defining a weighted adjacency matrix, denoted by $\mathbf W^\varepsilon$. We define $\mathbf D^\varepsilon$ to be the diagonal matrix with the $i$-th entry on the diagonal given by $\sum_{j=1}^n W^\varepsilon(x_i,x_j)$. The graph Laplacian is defined by the matrix
$$
\mathbf L^\varepsilon = \frac1n\big\{\mathbf D^\varepsilon-\mathbf W^\varepsilon\big\}. \tag{2}
$$
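Definitions (1) and (2) translate directly into code. The following pure-Python sketch (helper names are ours; a practical implementation would use a numerical linear algebra library) builds the weight matrix, the degrees and the graph Laplacian for a handful of points, and evaluates the quadratic form $v^{T}\mathbf L^\varepsilon v$, which is non-negative because the weights are.

```python
import math


def weight(x, y, eps):
    """Gaussian weight W^eps(x, y) = exp(-|x - y|^2 / (4 eps)), cf. (1)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (4.0 * eps))


def graph_laplacian(points, eps):
    """Return (W, deg, L) with L = (D - W) / n, cf. (2)."""
    n = len(points)
    W = [[weight(points[i], points[j], eps) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in W]
    L = [
        [((deg[i] if i == j else 0.0) - W[i][j]) / n for j in range(n)]
        for i in range(n)
    ]
    return W, deg, L


def quadratic_form(L, v):
    """Compute v^T L v."""
    n = len(v)
    return sum(v[i] * L[i][j] * v[j] for i in range(n) for j in range(n))


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
    W, deg, L = graph_laplacian(pts, eps=0.5)
    print([round(sum(row), 12) for row in L])  # each row of L sums to zero
    print(quadratic_form(L, [1.0, -2.0, 0.5, 3.0]))
```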

We note that the eigenvalues of $\mathbf L^\varepsilon$ are all real and non-negative, and therefore can be ordered as
$$
0=\lambda_1^\varepsilon<\lambda_2^\varepsilon\le\cdots\le\lambda_n^\varepsilon. \tag{3}
$$
It is convenient to consider the eigenvector corresponding to $\lambda_k^\varepsilon$ to be a function on $\{x_j\}_{j=1}^n$ rather than a vector in $\mathbb R^n$, and denote it by $\phi_k^\varepsilon$; thus,
$$
\lambda_k^\varepsilon\phi_k^\varepsilon(x_i) = \sum_{j=1}^n L^\varepsilon_{i,j}\,\phi_k^\varepsilon(x_j) = \frac1n\Big(\phi_k^\varepsilon(x_i)\sum_{j=1}^n W^\varepsilon(x_i,x_j)-\sum_{j=1}^n W^\varepsilon(x_i,x_j)\,\phi_k^\varepsilon(x_j)\Big), \quad i=1,\ldots,n. \tag{4}
$$
Since the function $W^\varepsilon$ is defined on the entire ambient space, one can extend the function $\phi_k^\varepsilon$ to the entire ambient space using (4) in an obvious way (the Nyström extension). Denoting this extended function by $\Phi_k^\varepsilon$, we have (cf. (4), [19])
$$
\Phi_k^\varepsilon(x) = \frac{\sum_{j=1}^n W^\varepsilon(x,x_j)\,\phi_k^\varepsilon(x_j)}{\sum_{j=1}^n W^\varepsilon(x,x_j)-n\lambda_k^\varepsilon}, \tag{5}
$$
for all $x\in\mathbb R^D$ for which the denominator is not equal to 0. The condition that the denominator of (5) is not equal to 0 for any $x$ can be verified easily for any given $\varepsilon$. The violation of this condition for a particular $k$ can be seen as a sign that for a given amount $n$ of data the approximations of the eigenvalue $\lambda_k$ of the corresponding Laplace-Beltrami operator by eigenvalues $\lambda_k^\varepsilon$ cannot be guaranteed with a reasonable accuracy. The convergence of the extended eigenfunctions $\Phi_k^\varepsilon$, restricted to a smooth manifold $X$, to the actual eigenfunctions of the Laplace-Beltrami operator on $X$ is described in [4, Theorem 2.1].
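Formula (5) can be illustrated on the smallest non-trivial example, $n=2$, where the eigenpairs of the graph Laplacian are known in closed form: with $w=W^\varepsilon(x_1,x_2)$ the eigenvalues are $0$ and $w$, with eigenvectors proportional to $(1,1)$ and $(1,-1)$. The sketch below (function names are ours) checks that the extension reproduces $\phi_k^\varepsilon$ at the data points, as it must by (4).

```python
import math


def weight(x, y, eps):
    """Gaussian weight W^eps(x, y) from (1)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (4.0 * eps))


def nystrom_extension(x, points, phi, lam, eps):
    """Evaluate (5) at x for an eigenpair (lam, phi) of the graph Laplacian."""
    n = len(points)
    w = [weight(x, xj, eps) for xj in points]
    denom = sum(w) - n * lam
    if abs(denom) < 1e-12:
        raise ZeroDivisionError("denominator of (5) vanishes at this x")
    return sum(wj * pj for wj, pj in zip(w, phi)) / denom


if __name__ == "__main__":
    eps = 0.5
    points = [(0.0,), (1.0,)]
    w = weight(points[0], points[1], eps)
    lam = w                                   # second eigenvalue for n = 2
    phi = [1 / math.sqrt(2), -1 / math.sqrt(2)]
    print([nystrom_extension(p, points, phi, lam, eps) for p in points])
    print(nystrom_extension((0.4,), points, phi, lam, eps))  # value off the data
```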

3 Numerical algorithms for semi-supervised learning

The approximation theory utilizing the eigen-decomposition of the Laplace-Beltrami operator is well developed, even in greater generality than this setting, in [12, 10, 13, 14, 9]. In practice, the correct choice of $\varepsilon$ in the approximate construction of these eigenvalues and eigenfunctions is a delicate matter that greatly affects the performance of the kernel based methods based on these quantities. Some heuristic rules for choosing $\varepsilon$ have been proposed in [11, 8]. These rules are not applicable universally; they need to be chosen according to the data set and the application under consideration.

In contrast to the traditional literature, where a fixed value of $\varepsilon$ is used for all the eigenvalues and eigenfunctions, we propose in this paper the construction of a kernel of the form
$$
K_n(x,t) = \sum_k \big(n\lambda_k^{\varepsilon_{j_k}}\big)^{-1}\,\Phi_k^{\varepsilon_{j_k}}(x)\,\Phi_k^{\varepsilon_{j_k}}(t); \tag{6}
$$
i.e., we select the eigenvalues and the corresponding eigenfunctions from different kernels of the form $W^\varepsilon$ to construct our kernel. We note again that in contrast to the traditional method of combining different kernels from a fixed dictionary, we are constructing a single kernel using eigenvectors and eigenfunctions of different kernels from a dictionary.

Our rule for selecting the $\varepsilon_{j_k}$'s is based on the well-known quasi-optimality criterion [18] that is one of the simplest and oldest, but still a quite efficient strategy for choosing a regularization parameter. According to that strategy, one selects a suitable value of $\varepsilon$ (i.e. the regularization parameter) from a sequence of admissible values $\{\varepsilon_j\}$, which usually form a geometric sequence, i.e. $\varepsilon_j=\varepsilon_0q^j$, $j=1,2,\ldots,M$; $q<1$. We propose to employ the quasi-optimality criterion in the context of the approximation of the eigenvalues of the Laplace-Beltrami operator. Then by analogy to [18] for each particular $k$ we calculate the sequence of approximate eigenvalues $\lambda_k^{\varepsilon_j}$, $j=1,2,\ldots,M$, and select $\varepsilon_{j_k}\in\{\varepsilon_j\}$ such that the differences $|\lambda_k^{\varepsilon_j}-\lambda_k^{\varepsilon_{j-1}}|$ attain their minimal value at $j=j_k$.

Algorithm 1 Algorithm for kernel ridge regression with the constructed kernel (7)
  Given data $\{x_i\}_{i=1}^n\in X$; $\{x_i,y_i\}_{i=1}^m$ are the labeled examples; $y=\{y_i\}_{i=1}^m$.
  Introduce the grid for parameter $\alpha$: $\alpha_k=p^k$, $k=1,2,\ldots,N$.
  Calculate Gram matrix $\mathbf K_m$ consisting of the sub-matrix $\{K_n(x_i,x_j)\}_{i,j=1}^m$ defined by (7) in labeled points.
  for k = 1:N do
    Calculate $C_{\alpha_k}$ as $C_{\alpha_k}=(\alpha_k I+\mathbf K_m)^{-1}y$.
    Find the $\alpha_{\min}$ such that $\|\mathbf K_mC_{\alpha_k}-y\|$ is minimized.
  end for
  The decision-making function is $f_n^*(x)=\sum_{i=1}^m (C_{\alpha_{\min}})_i\,K_n(x,x_i)$.

Table 1 Results of testing for two moons dataset

  n     m     Error
  50    2     0%
  50    4     0%
  50    6     0%
  30    2     17%
  30    4     8%
  30    6     0%
  10    2     38%
  10    4     10%
  10    6     2%

Since the size of the grid of $\varepsilon_j$ is difficult to be estimated beforehand and, at the same time, has a strong influence on the performance of the method, we propose the following strategy for the selection of the grid size $M$. We note that the summation in formula (6) has to be done for indices $k$ for which the corresponding eigenvalue $\lambda_k=\lambda_k^{\varepsilon_{j_k}}$ is non-zero. It is also known that the first eigenvalue $\lambda_1^{\varepsilon_j}=0$. To prevent the other $\lambda_k^{\varepsilon_j}$ from becoming too close to zero with the decreasing of $\varepsilon_j$, we propose to stop continuation of the sequence $\varepsilon_j$ as soon as the value of $\lambda_2^{\varepsilon_M}$ becomes sufficiently small. So, the maximum grid size $M$ is the smallest integer for which $\lambda_2^{\varepsilon_M}<\lambda_2^{(thr)}$, where $\lambda_2^{(thr)}$ is some estimated threshold. Taking the abovementioned into account, we also replace the formula for the kernel calculation (6) by the kernel
$$
K_n(x,t) = 1+\sum_{k=2}^{n}\big(n\lambda_k^{\varepsilon_{j_k}}\big)^{-1}\,\Phi_k^{\varepsilon_{j_k}}(x)\,\Phi_k^{\varepsilon_{j_k}}(t). \tag{7}
$$
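The two selection rules of this section (the quasi-optimality choice of $j_k$ and the stopping rule for the grid size $M$) are simple enough to sketch in a few lines. Everything below is an illustration under our own naming; in particular `lambda2_of_eps` is a hypothetical callable standing in for the computation of $\lambda_2^{\varepsilon}$ from the data, and `max_M` is a safeguard we added.

```python
def geometric_grid(eps0, q, M):
    """Grid eps_j = eps0 * q^j for j = 1, ..., M (with q < 1)."""
    return [eps0 * q**j for j in range(1, M + 1)]


def quasi_optimal_index(values):
    """Quasi-optimality rule.

    values[j - 1] holds lambda_k^{eps_j} for j = 1, ..., M. Returns the
    1-based index j_k in {2, ..., M} minimizing |values_j - values_{j-1}|.
    """
    diffs = [abs(values[i] - values[i - 1]) for i in range(1, len(values))]
    best = min(range(len(diffs)), key=diffs.__getitem__)
    return best + 2  # diffs[0] compares j = 2 with j = 1


def grid_size(lambda2_of_eps, eps0, q, threshold, max_M=1000):
    """Smallest M with lambda_2^{eps_M} < threshold (capped at max_M)."""
    for M in range(1, max_M + 1):
        if lambda2_of_eps(eps0 * q**M) < threshold:
            return M
    return max_M


if __name__ == "__main__":
    print(quasi_optimal_index([1.0, 0.6, 0.55, 0.3]))
    # Toy stand-in: pretend lambda_2 decays like eps itself.
    print(grid_size(lambda eps: eps, eps0=1.0, q=0.5, threshold=0.1))
```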

Algorithm 1 described above uses the constructed kernel (7) in kernel ridge regression from labeled data. The regression is performed in combination with a discrepancy based principle for choosing the regularization parameter α . More details can be found in [15].
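A minimal, self-contained sketch of the regression step of Algorithm 1 follows. It is not the authors' implementation: a plain Gaussian kernel is used as a stand-in for the data-built kernel (7), the base `p` of the grid $\alpha_k=p^k$ and the grid length `N` are arbitrary illustrative values, and the small linear solver is included only to avoid external dependencies.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def gaussian_kernel(x, t, eps=0.5):
    """Stand-in kernel (assumption); the paper uses the kernel (7) instead."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, t)) / (4.0 * eps))


def fit_krr(points, labels, kernel, p=0.5, N=10):
    """Kernel ridge regression over the grid alpha_k = p^k, k = 1..N.

    Keeps the alpha with the smallest residual ||K C_alpha - y|| and returns
    the decision function together with that alpha.
    """
    m = len(points)
    K = [[kernel(points[i], points[j]) for j in range(m)] for i in range(m)]
    best = None
    for k in range(1, N + 1):
        alpha = p**k
        A = [[K[i][j] + (alpha if i == j else 0.0) for j in range(m)]
             for i in range(m)]
        C = solve_linear(A, labels)
        resid = math.sqrt(sum(
            (sum(K[i][j] * C[j] for j in range(m)) - labels[i]) ** 2
            for i in range(m)))
        if best is None or resid < best[0]:
            best = (resid, alpha, C)
    _, alpha_min, C = best

    def decision(x):
        return sum(C[i] * kernel(x, points[i]) for i in range(m))

    return decision, alpha_min


if __name__ == "__main__":
    f, alpha = fit_krr([(0.0, 0.0), (2.0, 0.0)], [1.0, -1.0], gaussian_kernel)
    print(alpha, f((0.1, 0.0)), f((1.9, 0.0)))
```

The sign of the decision function is then used as the predicted class label.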

4 Experimental results

In this section we consider classification of the two moons dataset that can be seen as the case $D=2$, $d=1$. The software and data were borrowed from bit.ly/2D3uUCk. For the two moons dataset we take $\{x_i\}_{i=1}^n$ with $n=50, 30, 10$ and subsets $\{x_i\}_{i=1}^m\subset\{x_i\}_{i=1}^n$ with $m=2, 4, 6$ labeled points. The goal of semi-supervised data classification problems is to assign correct labels for the remaining points $\{x_i\}_{i=1}^n\setminus\{x_i\}_{i=1}^m$.

Fig. 1 Classification of "two moons" dataset with extrapolation region. The values of parameters are $m=2$, $\varepsilon_0=1$, $q=0.9$, $\lambda_2^{(thr)}=10^{-6}$.

For every dataset (defined by the pair $(n,m)$) we performed 10 trials with randomly chosen labeled examples. As follows from the experiments, the accuracy of the classification improves as the number of unlabeled points grows. In particular, for $n\ge 50$, to label all points without error, it is enough to take only one labeled point for each of two classes ($m=2$). At the same time, if the set of unlabeled points is not big enough, then to increase the accuracy of prediction we should take more labeled points. The result of the classification for the two moons dataset with $m=2$ as well as the corresponding plot of selected $\varepsilon$ are shown in Figures 1 and 2. The big crosses correspond to the labeled data and other points are colored according to the constructed predictors. The parameters' values were $m=2$, $\varepsilon_0=1$, $q=0.9$, $\lambda_2^{(thr)}=10^{-6}$. The application of the proposed method to other classification problems including automatic gender identification can be found in [15].

Fig. 2 Plot of adaptively chosen $\varepsilon$ for two-moon dataset. The values of parameters are $m=2$, $\varepsilon_0=1$, $q=0.9$, $\lambda_2^{(thr)}=10^{-6}$.

Acknowledgements Sergei V. Pereverzyev and Evgeniya Semenova gratefully acknowledge the support of the consortium AMMODIT funded within EU H2020-MSCA-RISE. The research of Hrushikesh Mhaskar is supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via 2018-18032000002. The paper has been finalized while the first two co-authors took part in the workshop "On the frontiers of high-dimensional computation" at the MATRIX Research Institute, Creswick, June 2018. The support of MATRIX is gratefully acknowledged.


References

1. Belkin, M., Matveeva, I., Niyogi P.: Regularization and semi-supervised learning on large graphs. In: Learning theory, pp. 624–638. Springer (2004)
2. Belkin, M., Niyogi P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation 15, 1373–1396 (2003)
3. Belkin, M., Niyogi P.: Semi-supervised learning on Riemannian manifolds. Machine learning 56, 209–239 (2004)
4. Belkin, M., Niyogi P.: Convergence of Laplacian eigenmaps. Adv. neur. inform. process. 19 129, (2007)
5. Belkin, M., Niyogi P., Sindhwani V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7, 2399–2434 (2006)
6. Bertozzi A.L., Luo Xi., Stuart A.M., Zygalakis K.C.: Uncertainty Quantification in the Classification of High Dimensional Data. In: CoRR (2017) http://arxiv.org/abs/1703.08816
7. Chui, C.K., Donoho D.L.: Special issue: Diffusion maps and wavelets. Appl. Comput. Harm. Anal. 21 (2006)
8. Coifman, R. R. and Hirn, M. J.: Diffusion maps for changing data. Appl. Comput. Harmon. Anal. 36, 79–107 (2014)
9. Ehler, M., Filbir, F., Mhaskar H.N.: Locally Learning Biomedical Data Using Diffusion Frames. J. Comput. Biol. 19, 1251–1264 (2012)
10. Filbir, F., Mhaskar H.N.: Marcinkiewicz–Zygmund measures on manifolds. J. Complexity. 27, 568–596 (2011)
11. Lafon, S.S: Diffusion maps and geometric harmonics. Yale University (2004)
12. Maggioni, M., Mhaskar H.N.: Diffusion polynomial frames on metric measure space. Appl. Comput. Harmon. Anal. 24, 329–353 (2008)
13. Mhaskar, H.N.: Eignets for function approximation on manifolds. Appl. Comput. Harm. Anal. 29, 63–87 (2010)
14. Mhaskar, H.N.: A generalized diffusion frame for parsimonious representation of functions on data defined manifolds. Neural Networks 24, 345–359 (2011)
15. Mhaskar, H.N., Pereverzyev S.V., Semenov, V.Yu., Semenova E.V.: Data based construction of kernels for semi-supervised learning with less labels. RICAM Preprint (2018). https://www.ricam.oeaw.ac.at/files/reports/18/rep18-25.pdf
16. Micchelli, C.A., Mhaskar H.N.: Learning the kernel function via regularization. J.Mach.Learn.Res. 6, 10127–10134 (2005)
17. Pereverzyev, S.V., Tkachenko, P.: Regularization by the Linear Functional Strategy with Multiple Kernels. Frontiers Appl. Math. Stat. 3, 1 (2017)
18. Tikhonov, A.N., Glasko, V.B.: Use of regularization method in non-linear problems. Zh. Vychisl. Mat. Mat. Fiz. 5, 463–473 (1965)
19. von Luxburg U., Belkin, M., Bousquet O.: Consistency of spectral clustering. Ann. Statist. 36, 555–586 (2008)

P1–nonconforming polyhedral finite elements in high dimensions

Dongwoo Sheen

Abstract We consider the lowest–degree nonconforming finite element methods for the approximation of elliptic problems in high dimensions. The P1–nonconforming polyhedral finite element is introduced for any high dimension. Our finite element is simple and cheap as it is based on the triangulation of domains into parallelotopes, which are combinatorially equivalent to the d–dimensional cube, rather than the triangulation of domains into simplices. Our nonconforming element is nonparametric, and on each polytope it contains only linear polynomials, but it is sufficient to give optimal order convergence for second–order elliptic problems.

1 Introduction

We are interested in the lowest–degree nonconforming finite element methods for the approximation of elliptic problems in high dimensions. Efficient numerical methods to approximate solutions of partial differential equations in high dimensions are very demanding. For instance, in computational finance, efficient numerical methods are necessary to approximate high dimensional basket options (see [2, 21, 19] and the references therein). Also, in the approximation of the Einstein equations of general relativity, one needs to work on high dimensional dynamics modeling (see [1, 9, 22], and the references therein). For possible applications in fluid mechanics in high dimensions ≥ 4, see [13, 10, 11, 12, 24] and so on for the uniqueness, existence and regularity results on the solution of Navier–Stokes equations. However, practical application areas in fluid mechanics are hardly found.

In high dimensions it is much simpler to adopt cubic–type elements rather than simplicial elements. In our paper we develop finite elements based on the triangulation of domains into polytopes, which are combinatorially equivalent to the d–dimensional cube. In order to have lowest degree conforming finite elements on d–cubes, one needs to have multilinear polynomial spaces whose dimensions are at least $2^d$. Hence to reduce the dimension of the approximating polynomial space, we develop nonconforming elements which are nonparametric, but on each polytope they contain only P1 polynomials, which is sufficient to give optimal order convergence for second–order elliptic problems.

To present most effectively the idea of developing the nonconforming polyhedral finite elements, which are nonparametric, we briefly review the nonconforming elements of lowest degrees from parametric elements to nonparametric elements, and from rotated bilinear elements to P1–nonconforming quadrilateral elements. By this brief review it will be very natural to expose our idea to develop the final nonconforming polyhedral elements in high dimensions. In this section we present our model problem, and then some notations and preliminaries are given.

Dongwoo Sheen: Department of Mathematics, Seoul National University, Seoul 08826, Korea, e-mail: sheen@snu.ac.kr

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_9
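The dimension count behind this motivation is easy to tabulate: the multilinear space on a d–cube is spanned by the $2^d$ products of subsets of the coordinates, while the space of linear polynomials has dimension $d+1$. The helper names below are ours.

```python
def multilinear_dim(d):
    """Dimension of the multilinear (Q1) polynomial space in d variables."""
    return 2**d


def p1_dim(d):
    """Dimension of the linear (P1) polynomial space in d variables."""
    return d + 1


if __name__ == "__main__":
    for d in (2, 3, 5, 10, 20):
        print(d, multilinear_dim(d), p1_dim(d))
```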

1.1 The model problem

Let $\Omega\subset\mathbb R^d$ be a simply–connected polyhedral domain with boundary $\Gamma$. Consider the second–order elliptic problem:
$$
-\nabla\cdot(A(x)\nabla u)+cu = f \quad\text{in } \Omega, \tag{1a}
$$
$$
u = 0 \quad\text{on } \Gamma, \tag{1b}
$$
where the uniformly positive–definite matrix–valued function $A$ and the nonnegative function c > 0 are assumed to be sufficiently smooth. The weak formulation of (1) is to find $u\in H_0^1(\Omega)$ fulfilling
$$
a(u,v) = \ell(v) \quad\forall v\in H_0^1(\Omega), \tag{2}
$$
where the bilinear form $a(\cdot,\cdot):H_0^1(\Omega)\times H_0^1(\Omega)\to\mathbb R$ and the linear form $\ell:H_0^1(\Omega)\to\mathbb R$ are given by
$$
a(u,v) = (A\nabla u,\nabla v)+(cu,v), \tag{3a}
$$
$$
\ell(v) = (f,v), \tag{3b}
$$
for all $u,v\in H_0^1(\Omega)$.
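The passage from (1) to (2) is the usual integration-by-parts argument. As a brief sketch, assuming $u$ is regular enough for Green's formula and writing $\nu$ for the outward unit normal on $\Gamma$ (a symbol not used in the original), one multiplies (1a) by a test function $v\in H_0^1(\Omega)$ and integrates over $\Omega$:

```latex
\begin{aligned}
(f, v) &= \int_\Omega \big(-\nabla\cdot(A\nabla u) + c\,u\big)\,v\,dx \\
       &= \int_\Omega A\nabla u\cdot\nabla v\,dx
          - \int_\Gamma (A\nabla u\cdot\nu)\,v\,ds
          + \int_\Omega c\,u\,v\,dx \\
       &= (A\nabla u, \nabla v) + (cu, v) = a(u, v),
\end{aligned}
```

where the boundary integral vanishes because $v=0$ on $\Gamma$.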


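1.2 Notations and preliminary results

For a domain $S \subset \mathbb{R}^d$, we adopt standard notations for Sobolev spaces with the inner products and norms

$$L^2(S) = \Big\{ f : S \to \mathbb{R} \;\Big|\; \int_S |f(x)|^2\,dx < \infty \Big\}, \quad (f,g)_S = \int_S f(x)g(x)\,dx; \quad \|f\|_{0,S} = \sqrt{(f,f)_S};$$

$$H^1(S) = \{ f \in L^2(S) \mid \|\nabla f\|_{0,S} < \infty \}, \quad (f,g)_{H^1(S)} = (f,g)_S + (\nabla f, \nabla g)_S; \quad \|f\|_{1,S} = \sqrt{(f,f)_{H^1(S)}};$$

$$H_0^1(S) = \{ f \in H^1(S) \mid f|_{\partial S} = 0 \};$$

$$H^k(S) = \{ f \in L^2(S) \mid \|\partial^\alpha f\|_{0,S} < \infty \ \ \forall |\alpha| \le k \}, \quad (f,g)_{H^k(S)} = \sum_{|\alpha|\le k} (\partial^\alpha f, \partial^\alpha g)_S; \quad \|f\|_{k,S} = \sqrt{(f,f)_{H^k(S)}}.$$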

Here, and in what follows, if $S = \Omega$ the subindex $\Omega$ may be dropped, as may the subindex 0. Denote by conv S the interior of the convex hull of S, which is an open set. The 0– and 1–faces of a d–polyhedral domain S are the vertices and edges of S, respectively. In particular, the (d − 1)–faces of S will be called the “facets” of the d–dimensional polyhedral domain, and by $\mu_j$ we designate the barycenter of the facet $F_j$.

The organization of the paper is as follows. In Section 2, the lowest–degree parametric and nonparametric nonconforming quadrilateral elements for two and three dimensions are briefly reviewed. In Section 3, we introduce the nonparametric $P_1$–NC polyhedral finite element space in $\mathbb{R}^d$ for any $d \ge 2$. Here, and in what follows, $P_1$ means “piecewise linear” and NC means “nonconforming.”

2 The parametric and nonparametric P1–simplicial and quadrilateral nonconforming finite elements

In this section we review the simplicial and quadrilateral NC (nonconforming) finite element spaces in two and three dimensions.


Dongwoo Sheen

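2.1 The parametric simplicial and rectangular NC elements in two and three dimensions

The NC elements for elliptic and Stokes equations in two and three dimensions have been well known since the work of Crouzeix and Raviart [7] was published. Denote the reference element as follows:

$$\widehat K = \begin{cases} \widehat\Delta^d = d\text{–simplex, i.e., } \operatorname{conv}\{0, e_1, \cdots, e_d\},\\ \widehat Q^d = d\text{–cube, i.e., } (-1,1)^d. \end{cases} \tag{4}$$

1. The lowest–degree simplicial Crouzeix-Raviart element (1973) [7]:
   a. $\widehat K = \widehat\Delta^d$, $d = 2, 3$;
   b. $\widehat P_{\widehat K} = P_1(\widehat K) = \operatorname{Span}\{1, \widehat x_1, \cdots, \widehat x_d\}$;
   c. $\widehat\Sigma_{\widehat K} = \{\widehat\varphi(\widehat\xi_j),\ \widehat\xi_j \text{ barycenter of facets},\ j = 1, \cdots, d+1,\ \forall \widehat\varphi \in \widehat P_{\widehat K}\}$.

   All odd–degree simplicial NC elements were introduced for the Stokes problems in [7].

   Remark 1. It is straightforward to define the simplicial NC elements on a d–simplicial triangulation in any high dimension. However, in high dimensions it is not easy to see how the d–simplices are packed in the domain. Thus the development of d–cubical elements is beneficial in this regard.

2. The Han rectangular element (1984) [15]:
   a. $\widehat K = \widehat Q^2$;
   b. $\widehat P_{\widehat K} = P_1(\widehat K) \oplus \operatorname{Span}\{\widehat x_1^2 - \tfrac{5}{3}\widehat x_1^4,\ \widehat x_2^2 - \tfrac{5}{3}\widehat x_2^4\}$;
   c. $\widehat\Sigma_{\widehat K} = \{\widehat\varphi(\widehat\xi_j),\ \widehat\xi_j,\ j = 1, \cdots, 4, \text{ midpoints of facets};\ \int_{\widehat Q^2}\widehat\varphi \quad \forall \widehat\varphi \in \widehat P_{\widehat K}\}$.

3. The Rannacher–Turek rotated Q1 element (1992, [20], also Z. Chen [5]):
   a. $\widehat K = \widehat Q^d$, $d = 2, 3$;
   b. $\widehat P_{\widehat K} = P_1(\widehat K) \oplus \operatorname{Span}\{\widehat x_1^2 - \widehat x_d^2,\ \widehat x_{d-1}^2 - \widehat x_d^2\}$;
   c. $\widehat\Sigma^{(m)}_{\widehat K} = \{\widehat\varphi(\widehat\xi_j),\ \widehat\xi_j,\ j = 1, \cdots, 2d, \text{ barycenters of facets } F_j,\ \forall \widehat\varphi \in \widehat P_{\widehat K}\}$;
      $\widehat\Sigma^{(i)}_{\widehat K} = \{\tfrac{1}{|F_j|}\int_{F_j}\widehat\varphi\, d\sigma,\ F_j,\ j = 1, \cdots, 2d, \text{ are facets},\ \forall \widehat\varphi \in \widehat P_{\widehat K}\}$.

   Remark 2. The two DOFs generate two different NC elements, and for general quadrilateral meshes the NC element with the DOFs $\widehat\Sigma^{(i)}_{\widehat K}$ gives optimal convergence rates while that with the DOFs $\widehat\Sigma^{(m)}_{\widehat K}$ leads to suboptimal convergence rates.

4. The DSSY element (Douglas–Santos–Sheen–Ye, 1999) [8]: For $\ell = 1, 2$, define

$$\theta_\ell(t) = \begin{cases} t^2, & \ell = 0;\\ t^2 - \tfrac{5}{3}t^4, & \ell = 1;\\ t^2 - \tfrac{25}{6}t^4 + \tfrac{7}{2}t^6, & \ell = 2. \end{cases}$$

   a. $\widehat K = \widehat Q^d$, $d = 2, 3$;
   b. $\widehat P_{\widehat K} = P_1(\widehat K) \oplus \operatorname{Span}\{\theta_\ell(\widehat x_1) - \theta_\ell(\widehat x_d),\ \theta_\ell(\widehat x_{d-1}) - \theta_\ell(\widehat x_d)\}$;
   c. $\widehat\Sigma^{(m)}_{\widehat K} = \{\widehat\varphi(\widehat\xi_j),\ \widehat\xi_j \text{ barycenters of facets},\ j = 1, \cdots, 2d,\ \forall \widehat\varphi \in \widehat P_{\widehat K}\}$;
      $\widehat\Sigma^{(i)}_{\widehat K} = \{\tfrac{1}{|F_j|}\int_{F_j}\widehat\varphi\, d\sigma,\ F_j,\ j = 1, \cdots, 2d, \text{ are facets},\ \forall \widehat\varphi \in \widehat P_{\widehat K}\}$.

   Remark 3. The benefit of the DSSY element is that the Mean Value Property

$$\widehat\varphi(\widehat\xi_j) = \frac{1}{|F_j|}\int_{F_j}\widehat\varphi\, d\sigma \quad \forall \widehat\varphi \in \widehat P_{\widehat K} \tag{5}$$

   holds if $\ell = 1, 2$. Thus, for $\ell = 1, 2$, the two DOFs $\widehat\Sigma^{(m)}_{\widehat K}$ and $\widehat\Sigma^{(i)}_{\widehat K}$ generate identical NC elements with optimal convergence rates. The case of $\ell = 0$ reduces to the Rannacher–Turek rotated Q1 element.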
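The Mean Value Property (5) can be checked by hand on the reference square. The following is a minimal sketch, not part of the original paper: it takes d = 2, where the extra basis function is $\theta_\ell(\widehat x_1) - \theta_\ell(\widehat x_2)$, and compares its midpoint value with its average on each of the four edges using exact rational arithmetic. The affine part $P_1(\widehat K)$ is omitted because the midpoint value of an affine function always equals its edge average. The helper names (`THETA`, `mvp_gap`) are illustrative assumptions.

```python
from fractions import Fraction

# theta_l stored as {power: coefficient}, with exact rational coefficients
THETA = {
    0: {2: Fraction(1)},
    1: {2: Fraction(1), 4: Fraction(-5, 3)},
    2: {2: Fraction(1), 4: Fraction(-25, 6), 6: Fraction(7, 2)},
}

def value(poly, t):
    """Evaluate the polynomial at t."""
    return sum(c * Fraction(t) ** k for k, c in poly.items())

def mean_on_edge(poly):
    """Average of the polynomial over t in [-1, 1]."""
    # (1/2) * int_{-1}^{1} t^k dt equals 1/(k+1) for even k and 0 for odd k
    return sum(c / (k + 1) for k, c in poly.items() if k % 2 == 0)

def mvp_gap(ell):
    """Largest |midpoint value - edge average| of theta(x1) - theta(x2) over the 4 edges."""
    theta = THETA[ell]
    gaps = []
    for fixed in (-1, 1):
        # edge x1 = fixed, x2 varies along the edge
        mid = value(theta, fixed) - value(theta, 0)
        avg = value(theta, fixed) - mean_on_edge(theta)
        gaps.append(abs(mid - avg))
        # edge x2 = fixed, x1 varies along the edge
        mid = value(theta, 0) - value(theta, fixed)
        avg = mean_on_edge(theta) - value(theta, fixed)
        gaps.append(abs(mid - avg))
    return max(gaps)

for ell in (0, 1, 2):
    print(ell, mvp_gap(ell))
```

The gap vanishes exactly for $\ell = 1, 2$ because $\int_0^1 \theta_\ell(t)\,dt = 0$ in those cases, while for $\ell = 0$ (the Rannacher–Turek choice) the integral is 1/3, so the two sets of DOFs differ.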

5. For truly quadrilateral triangulations, $P_1(\widehat K)$ for the Rannacher–Turek element is replaced by $Q_1(\widehat K)$, and the DSSY element should be modified such that $P_1(\widehat K)$ is likewise replaced in the reference elements, with an additional DOF $\int_{\widehat Q^2}\widehat\varphi(\widehat x_1, \widehat x_2)\,\widehat x_1 \widehat x_2\, d\widehat x_1\, d\widehat x_2$ (Cai–Douglas–Santos–Sheen–Ye, CALCOLO, 2000) [4].

Let (Th )0 σa . The value of this absorption radius can be found by integrating the RDF after diffusion until the correct mass has been reached,

$$m_{\mathrm{react}} = 4\pi\int_0^{\sigma_a} g_{\mathrm{diff}}(r)\, r^2\, dr. \tag{19}$$

An alternative choice would be to react B molecules with uniform probability, $p_{\mathrm{react}}$, up to the binding radius and not beyond it, meaning that $P(\mathrm{react}|r) = p_{\mathrm{react}}$ for $r < \sigma_b$ and $P(\mathrm{react}|r) = 0$ for $r > \sigma_b$. For this choice, $p_{\mathrm{react}}$ can be found from

$$m_{\mathrm{react}} = 4\pi\, p_{\mathrm{react}}\int_0^{\sigma_b} g_{\mathrm{diff}}(r)\, r^2\, dr. \tag{20}$$

With either choice, reactions then convert the RDF after diffusion, $g_{\mathrm{diff}}(r)$, to the RDF after diffusion and reaction, $g_{d,\mathrm{rxn}}(r)$,

$$g_{d,\mathrm{rxn}}(r) = g_{\mathrm{diff}}(r)\,[1 - P(\mathrm{react}|r)]. \tag{21}$$

Next, this modified RDF, $g_{d,\mathrm{rxn}}(r)$, needs to be remapped to make it equal to the model RDF. Because the correct number of molecules have been reacted at this point, the mass of the RDF is conserved during this mapping step. Using the assumption that molecules are not allowed to overlap in the model, all RDF mass within the binding radius is excess and needs to be remapped to points outside of the binding radius. For convenience, also assume that the $g_{d,\mathrm{rxn}}(r)$ function is smaller than $g_{\mathrm{model}}(r)$ at all values that are outside of the binding radius (it is necessarily smaller on average, but things get complicated if it’s not also smaller at all individual points). The mass of the RDF that needs to be remapped is

$$m_{\mathrm{map}} = 4\pi\int_0^{\sigma_b} g_{d,\mathrm{rxn}}(r)\, r^2\, dr = 4\pi\int_{\sigma_b}^{\infty} [g_{\mathrm{model}}(r) - g_{d,\mathrm{rxn}}(r)]\, r^2\, dr. \tag{22}$$

Again, there are multiple options, now for how to map these molecules from $[0, \sigma_b)$ to their new locations in $(\sigma_b, \infty)$ in order to recover the steady-state profile. Simple choices are to maintain or invert radial ordering, where molecules near the origin are moved just outside $\sigma_b$ in the former option and further out toward $\infty$ in the latter option. It is not intuitively clear which would be more accurate. For the first approach, in which radial ordering is maintained, consider the cumulative function for the molecules that need to be moved,

$$C_{\mathrm{in}}(r_{\mathrm{in}}) = 4\pi\int_0^{r_{\mathrm{in}}} g_{d,\mathrm{rxn}}(r')\, r'^2\, dr'. \tag{23}$$

This is defined on $0 < r_{\mathrm{in}} < \sigma_b$ and returns a $C_{\mathrm{in}}(r_{\mathrm{in}})$ value that increases monotonically from 0 to the mass of molecules that need to undergo mapping, $m_{\mathrm{map}}$. As this mass is conserved upon mapping, there is an equivalent value in the cumulative function for the locations where the molecules get mapped to,

$$C_{\mathrm{out}}(r_{\mathrm{out}}) = 4\pi\int_{\sigma_b}^{r_{\mathrm{out}}} [g_{\mathrm{model}}(r') - g_{d,\mathrm{rxn}}(r')]\, r'^2\, dr'. \tag{24}$$

This cumulative function is defined on $\sigma_b < r_{\mathrm{out}} < \infty$ and also increases monotonically from 0 to the mass of molecules that need to undergo mapping, but now represents the spaces available for those molecules. The process of mapping a molecule from $r_{\mathrm{in}} \in [0, \sigma_b)$ to its new location at $r_{\mathrm{out}} \in (\sigma_b, \infty)$ is then:

1. Calculate $C_{\mathrm{in}}(r_{\mathrm{in}})$ for the initial location of the molecule from eq. 23.
2. Calculate the value of $r_{\mathrm{out}}$ such that $C_{\mathrm{out}}(r_{\mathrm{out}}) = C_{\mathrm{in}}(r_{\mathrm{in}})$ from eq. 24.
3. Move the molecule from $r_{\mathrm{in}}$ to $r_{\mathrm{out}}$.

The second case, in which radial ordering is reversed, is identical except that the outer cumulative function runs in reverse order, being defined as

$$C_{\mathrm{out}}(r_{\mathrm{out}}) = 4\pi\int_{r_{\mathrm{out}}}^{\infty} [g_{\mathrm{model}}(r') - g_{d,\mathrm{rxn}}(r')]\, r'^2\, dr'. \tag{25}$$


Accurate particle-based reaction algorithms for fixed timestep simulators

The same mapping process defined above works here as well.
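The three mapping steps can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cumulative functions of eqs. 23 and 24 are tabulated with the trapezoid rule, $C_{\mathrm{out}}$ is inverted by linear interpolation, and the outer integral is truncated at a hypothetical cutoff `r_max`. The toy profiles `g_model` and `g_drxn`, and the constants `C_IN`, `L` and `A`, are invented for the example and chosen only so that the mass inside the binding radius equals the deficit outside it.

```python
import bisect
import math

def cumulative(f, a, b, n=4000):
    """Grid on [a, b] and trapezoid values of 4*pi * int_a^r f(r') r'^2 dr'."""
    h = (b - a) / n
    rs = [a + i * h for i in range(n + 1)]
    w = [4.0 * math.pi * f(r) * r * r for r in rs]
    cum = [0.0]
    for i in range(n):
        cum.append(cum[-1] + 0.5 * h * (w[i] + w[i + 1]))
    return rs, cum

def interp(x, xs, ys):
    """Piecewise-linear interpolation of ys(xs) at x; xs must be nondecreasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x) - 1
    span = xs[i + 1] - xs[i]
    t = (x - xs[i]) / span if span > 0 else 0.0
    return ys[i] + t * (ys[i + 1] - ys[i])

def make_remap(g_drxn, g_model, sigma_b, r_max, reverse=False):
    """Return a function r_in -> r_out built from tabulated C_in and C_out."""
    r_in, c_in = cumulative(g_drxn, 0.0, sigma_b)                  # eq. 23
    deficit = lambda r: g_model(r) - g_drxn(r)
    r_out, c_out = cumulative(deficit, sigma_b, r_max)             # eq. 24
    def remap(r):
        c = interp(r, r_in, c_in)        # step 1: C_in(r_in)
        if reverse:
            c = c_out[-1] - c            # eq. 25: count mass from the outside in
        return interp(c, c_out, r_out)   # step 2: solve C_out(r_out) = c
    return remap

# Toy profiles (assumptions for illustration only)
SIGMA_B, C_IN, L = 1.0, 0.3, 0.5
A = C_IN / (3.0 * (L + 2 * L**2 + 2 * L**3))   # balances inner mass and outer deficit

def g_model(r):
    return 0.0 if r < SIGMA_B else 1.0 - 1.0 / (2.0 * r)

def g_drxn(r):
    if r < SIGMA_B:
        return C_IN
    return g_model(r) - A * math.exp(-(r - SIGMA_B) / L)

keep = make_remap(g_drxn, g_model, SIGMA_B, r_max=11.0)
flip = make_remap(g_drxn, g_model, SIGMA_B, r_max=11.0, reverse=True)
```

With `keep`, molecules closer to the origin land closer to $\sigma_b$; with `flip`, the ordering is inverted. In a simulator the tables would be built once per reaction and reused for every molecule, which is why the sketch separates table construction from the per-molecule lookup.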

3.6 Example of RDF-matching with remapping

Consider a 3D reaction process where the steady-state distribution follows the Collins and Kimball RDF. Using reduced variables for simplicity, it is

$$g_{\mathrm{model}}(\tilde r) = \begin{cases} 0, & \tilde r < 1,\\ 1 - \dfrac{1}{\tilde r(1 + \tilde\gamma)}, & 1 \le \tilde r, \end{cases} \tag{26}$$

where $\tilde\gamma$ is the reduced boundary coefficient (equal to $\gamma/\sigma_b$). After the diffusion step, the distribution becomes, from eq. 11 and ref. [9],

$$g_{\mathrm{diff}}(\tilde r) = \frac{\tilde s^2}{\tilde r}\,[G_{\tilde s}(\tilde r - 1) - G_{\tilde s}(\tilde r + 1)] + \frac{1}{2}(e_+ + e_-) + \frac{1}{2\tilde r(\tilde\gamma + 1)}(e_+ - e_-). \tag{27}$$

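The mass to be absorbed is given by substituting these two RDFs into eq. 18, yielding

$$m_{\mathrm{react}} = \frac{2\pi\tilde s^2}{1 + \tilde\gamma}. \tag{28}$$

Above, we described two possibilities for the reaction step: one is to react all molecules up to some radius $\sigma_a$, and the other is to react molecules up to the radius $\sigma_b$ with probability $p_{\mathrm{react}}$. The former is more difficult to solve analytically, so we consider the latter in this example. From eq. 20, the reaction probability is

$$p_{\mathrm{react}} = \frac{m_{\mathrm{react}}}{4\pi\displaystyle\int_0^1 g_{\mathrm{diff}}(\tilde r)\,\tilde r^2\, d\tilde r} \tag{29}$$

$$= \frac{6\tilde s^2}{\dfrac{2\sqrt{2}\,\tilde s}{\sqrt{\pi}}\Big\{(\tilde s^2 - 1)(\tilde\gamma + 1)\,e^{-2/\tilde s^2} - [\tilde s^2(\tilde\gamma + 1) - 3\tilde\gamma]\Big\} + [3\tilde s^2 - 4(\tilde\gamma + 1)]\operatorname{erf}\!\Big(\dfrac{\sqrt{2}}{\tilde s}\Big) + 4(\tilde\gamma + 1)}. \tag{30}$$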
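A closed form of this size is easy to mistranscribe, so a numerical cross-check is useful. The sketch below is an illustration under stated assumptions, not code from the paper: it evaluates eq. 29 by Simpson quadrature of $\tilde r^2 g_{\mathrm{diff}}(\tilde r)$ from eq. 27 and compares the result with the closed form of eq. 30, written with the sign convention that keeps its denominator positive for small $\tilde s$. Because eq. 11 lies outside this excerpt, the definitions used here are assumptions: $G_{\tilde s}$ is taken to be a one-dimensional Gaussian density with standard deviation $\tilde s$, and $e_\pm = \operatorname{erfc}\big((1 \pm \tilde r)/(\tilde s\sqrt{2})\big)$.

```python
import math

def gauss(x, s):
    """1D Gaussian density with standard deviation s (assumed form of G_s)."""
    return math.exp(-x * x / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))

def r2_g_diff(r, s, gamma):
    """r^2 * g_diff(r) from eq. 27, multiplied through by r to avoid 1/r at r = 0."""
    e_plus = math.erfc((1.0 + r) / (s * math.sqrt(2.0)))    # assumed meaning of e+
    e_minus = math.erfc((1.0 - r) / (s * math.sqrt(2.0)))   # assumed meaning of e-
    r_g = (s * s * (gauss(r - 1.0, s) - gauss(r + 1.0, s))
           + 0.5 * r * (e_plus + e_minus)
           + (e_plus - e_minus) / (2.0 * (gamma + 1.0)))
    return r * r_g

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def p_react_quadrature(s, gamma):
    """Reaction probability from eqs. 28 and 29, integrating numerically."""
    m_react = 2.0 * math.pi * s * s / (1.0 + gamma)
    inner = simpson(lambda r: r2_g_diff(r, s, gamma), 0.0, 1.0)
    return m_react / (4.0 * math.pi * inner)

def p_react_closed(s, gamma):
    """Closed form corresponding to eq. 30."""
    g1 = gamma + 1.0
    brace = (s * s - 1.0) * g1 * math.exp(-2.0 / (s * s)) - (s * s * g1 - 3.0 * gamma)
    denom = (2.0 * math.sqrt(2.0) * s / math.sqrt(math.pi) * brace
             + (3.0 * s * s - 4.0 * g1) * math.erf(math.sqrt(2.0) / s)
             + 4.0 * g1)
    return 6.0 * s * s / denom

for s, gamma in [(0.5, 1.0), (0.2, 3.0)]:
    print(s, gamma, p_react_quadrature(s, gamma), p_react_closed(s, gamma))
```

For large reduced step lengths the value can exceed 1, which signals that the uniform-probability choice cannot absorb enough mass within $\sigma_b$ for that time step; the sketch does not guard against this.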

Applying this to the RDF after diffusion, as in eq. 21, yields

$$g_{d,\mathrm{rxn}}(\tilde r) = \begin{cases} (1 - p_{\mathrm{react}})\, g_{\mathrm{diff}}(\tilde r), & \tilde r < 1,\\ g_{\mathrm{diff}}(\tilde r), & 1 < \tilde r, \end{cases} \tag{31}$$

with substitutions from eqs. 27 and 30. This RDF is lengthy but can still be expressed in closed form. However, the next step is to compute the cumulative masses both inside and outside of the binding radius with eqs. 23 and 24, which can only be done numerically. Once those numerical integrals are computed, they are equated to each other and then solved for $r_{\mathrm{out}}$ as a function of $r_{\mathrm{in}}$. This solution gives the required mapping.


Johnston, Angstmann, Arjunan, Beentjes, Coulier, Isaacson, Khan, Lipkow, Andrews

4 Discussion

This work presents two new algorithms for simulating bimolecular chemical reactions with particle-based simulators that use fixed time steps. Both are more accurate than existing methods but do not incur substantial computational penalties.

In the Brownian bridge approach, the simulator considers both the initial and final separation vectors between potential reactants and computes the probability of a reaction occurring for those values. All simulated results exactly match those of the underlying model for isolated pairs of molecules, making it exact at this level of detail (interactions among 3 or more molecules are still approximate). A 2-step version of this algorithm, in which the algorithm only diffuses and reacts molecules, is sufficient for the Smoluchowski and Doi models, whereas a 3-step version, adding an intermediate reflection step, is necessary for the Collins and Kimball model. The primary disadvantage of the Brownian bridge method is that it requires looking up reaction probabilities for each reaction in lookup tables that have a minimum of three dimensions (initial separation, final separation, and interior angle) and often more dimensions. This may create an undesirable computational cost.

In the RDF-matching method, the simulator only considers final separations between potential reactants, while effectively assuming that the initial separations are randomly chosen from the steady-state distribution for the model. This algorithm enables the simulator to match the model radial distribution function exactly, but only when at steady state. Moreover, the precise dynamics of single molecules do not quite statistically match those of the underlying particle reaction-diffusion model. We again developed 2-step and 3-step algorithm versions. Additionally, we developed a remapping method that resamples the position of unreacted molecules after a diffusion step to correctly reproduce the steady-state radial distribution function. This may aid in reducing the error introduced in the algorithm by only considering molecular separations. The underlying reaction probabilities that need to be sampled can be computed from closed form equations in simple cases, or can be stored in a one dimensional lookup table.

One way in which the proposed algorithms could be optimised is by replacing the lookup tables with suitable approximating functions. For example, many functions can be closely approximated by rational functions or continued fractions, for which there are simple and efficient evaluation methods.

These new algorithms are not just two of many possible improvements on existing algorithms, but are particularly accurate methods for particle-based simulations that use fixed time steps. Simulating diffusion using Gaussian distributed molecule displacements is a sensible approach as it simulates diffusion exactly for non-interacting molecules in free space. If separating diffusion and reaction into separate steps, as is common, then sampling reaction probabilities with the Brownian bridge method is the unique solution that produces exact agreement with the underlying model. Also, the RDF-matching approach is the simplest option for producing exact agreement with the steady-state model RDF.

Absent from this work was any consideration of reversible reactions. Accounting for them would minimally affect the Brownian bridge algorithm because it doesn’t


make any assumptions about molecule starting locations. On the other hand, they would affect the RDF-matching algorithm because reversibility changes the steady-state radial distribution functions.

An intriguing aspect of this work is that these algorithms can be set up to simulate realistic intermolecular potentials, such as a Lennard-Jones potential, nearly as easily as they can simulate the Smoluchowski or other simple models. Doing so could enable much more efficient simulation of these reaction dynamics than is currently possible.

Acknowledgements We thank Mark Flegg, Kevin Burrage, Ruth Baker, Samuel Isaacson, and Hans Othmer for organising the 2018 MATRIX workshop on “Spatio-temporal stochastic systems in biology”, where we began this work. SNVA was partially supported by JSPS KAKENHI Challenging Research (Pioneering) Grant No. 18H05371. SAI was partially supported by National Science Foundation award DMS-1255408.

References

1. Agbanusi, I.C., Isaacson, S.A.: A comparison of bimolecular reaction models for stochastic reaction–diffusion systems. Bulletin of Mathematical Biology 76(4), 922–946 (2014)
2. Aldridge, B.B., Burke, J.M., Lauffenburger, D.A., Sorger, P.K.: Physicochemical modelling of cell signalling pathways. Nature Cell Biology 8(11), 1195 (2006)
3. Andrews, S.S.: Serial rebinding of ligands to clustered receptors as exemplified by bacterial chemotaxis. Phys. Biol. 2, 111–122 (2005)
4. Andrews, S.S.: Spatial and stochastic cellular modeling with the Smoldyn simulator. In: Bacterial Molecular Networks, pp. 519–542. Springer (2012)
5. Andrews, S.S.: Smoldyn: particle-based simulation with rule-based modeling, improved molecular interaction and a library interface. Bioinformatics 33(5), 710–717 (2017)
6. Andrews, S.S.: Particle-based stochastic simulators. Encyclopedia of Computational Neuroscience (2018)
7. Andrews, S.S., Addy, N.J., Brent, R., Arkin, A.P.: Detailed simulations of cell biology with Smoldyn 2.1. PLoS Comput. Biol. 6, e1000705 (2010)
8. Andrews, S.S., Arkin, A.P.: Simulating cell biology. Current Biology 16(14), R523–R527 (2006)
9. Andrews, S.S., Bray, D.: Stochastic simulation of chemical reactions with spatial resolution and single molecule detail. Physical Biology 1(3), 137 (2004)
10. Andrews, S.S., Dinh, T., Arkin, A.P.: Stochastic models of biological processes. In: Encyclopedia of Complexity and Systems Science, pp. 8730–8749. Springer (2009)
11. Carslaw, H., Jaeger, J.: Conduction of Heat in Solids. Oxford University Press, Oxford, England (1959)
12. Clifford, P., Green, N.: On the simulation of the Smoluchowski boundary condition and the interpolation of Brownian paths. Molecular Physics 57(1), 123–128 (1986)
13. Collins, F.C., Kimball, G.E.: Diffusion-controlled reaction rates. Journal of Colloid Science 4(4), 425–437 (1949)
14. Doi, M.: Stochastic theory of diffusion-controlled reaction. Journal of Physics A: Mathematical and General 9(9), 1479 (1976)
15. Donovan, R.M., Tapia, J.J., Sullivan, D.P., Faeder, J.R., Murphy, R.F., Dittrich, M., Zuckerman, D.M.: Unbiased rare event sampling in spatial stochastic systems biology models using a weighted ensemble of trajectories. PLOS Comput Biol 12(2), e1004611 (2016)
16. ElKalaawy, N., Amr, W.: Methodologies for the modeling and simulation of biochemical networks, illustrated for signal transduction pathways: A primer. Biosystems 129, 1–18 (2015)
17. Erban, R.: From molecular dynamics to Brownian dynamics. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 470(2167), 20140036 (2014)
18. Erban, R., Chapman, J., Maini, P.: A practical guide to stochastic simulations of reaction-diffusion processes. arXiv preprint arXiv:0704.1908 (2007)
19. Erban, R., Chapman, S.J.: Stochastic modelling of reaction–diffusion processes: algorithms for bimolecular reactions. Physical Biology 6(4), 046001 (2009)
20. Grima, R., Schnell, S.: Modelling reaction kinetics inside cells. Essays in Biochemistry 45, 41–56 (2008)
21. Karplus, M., McCammon, J.A.: Molecular dynamics simulations of biomolecules. Nature Structural and Molecular Biology 9(9), 646 (2002)
22. Kerr, R.A., Bartol, T.M., Kaminsky, B., Dittrich, M., Chang, J.C.J., Baden, S.B., Sejnowski, T.J., Stiles, J.R.: Fast Monte Carlo simulation methods for biological reaction-diffusion systems in solution and on surfaces. SIAM Journal on Scientific Computing 30(6), 3126–3149 (2008)
23. Mogilner, A., Allard, J., Wollman, R.: Cell polarity: quantitative modeling as a tool in cell biology. Science 336(6078), 175–179 (2012)
24. Rice, S.A.: Diffusion-Limited Reactions. Elsevier (1985)
25. Robinson, M., Andrews, S.S., Erban, R.: Multiscale reaction-diffusion simulations with Smoldyn. Bioinformatics 31, 2406–2408 (2015)
26. Robinson, M., Flegg, M., Erban, R.: Adaptive two-regime method: application to front propagation. The Journal of Chemical Physics 140(12), 124109 (2014)
27. Schöneberg, J., Ullrich, A., Noé, F.: Simulation tools for particle-based reaction-diffusion dynamics in continuous space. BMC Biophysics 7(1), 11 (2014)
28. Smoluchowski, M.v.: Versuch einer mathematischen Theorie der Koagulationskinetik kolloider Lösungen. Z. Phys. Chem. 92, 129–168 (1917)
29. Tournier, A.L., Fitzjohn, P.W., Bates, P.A.: Probability-based model of protein-protein interactions on biological timescales. Algorithms for Molecular Biology 1(1), 25 (2006)
30. Turner, T.E., Schnell, S., Burrage, K.: Stochastic approaches for modelling in vivo reactions. Computational Biology and Chemistry 28(3), 165–178 (2004)
31. Tyson, J.J., Chen, K.C., Novak, B.: Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Current Opinion in Cell Biology 15(2), 221–231 (2003)
32. van Zon, J.S., Ten Wolde, P.R.: Simulating biochemical networks at the particle level and in time and space: Green’s function reaction dynamics. Physical Review Letters 94(12), 128103 (2005)

Chapter 3

Recent Trends on Nonlinear PDEs of Elliptic And Parabolic Type

Decay estimates in time for classical and anomalous diffusion

Elisa Affili, Serena Dipierro and Enrico Valdinoci

Abstract We present a series of results focused on the decay in time of solutions of classical and anomalous diffusive equations in a bounded domain. The size of the solution is measured in a Lebesgue space, and the setting comprises time-fractional and space-fractional equations and operators of nonlinear type. We also discuss how fractional operators may affect long-time asymptotics.

Elisa Affili
Dipartimento di Matematica, Università degli studi di Milano, Via Saldini 50, 20133 Milan, Italy, and Centre d’Analyse et de Mathématique Sociales, École des Hautes Études en Sciences Sociales, 54 Boulevard Raspail, 75006 Paris, France

Serena Dipierro
Department of Mathematics and Statistics, University of Western Australia, 35 Stirling Highway, Crawley WA 6009, Australia

Enrico Valdinoci
Department of Mathematics and Statistics, University of Western Australia, 35 Stirling Highway, Crawley WA 6009, Australia

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_12

1 Decay estimates, methods, results and perspectives

In this note we present some results, recently obtained in [2, 19], focused on the long-time behavior of solutions of evolution equations which may exhibit anomalous diffusion, caused by either time-fractional or space-fractional effects (or both). The case of several nonlinear operators will also be taken into account (and indeed some of the results that we present are new also for classical diffusion run by nonlinear operators). The results that we establish give quantitative bounds on the decay in time of smooth solutions, confined in a smooth bounded set with Dirichlet data. The size of the solution will be measured in classical Lebesgue spaces, and we will detect different types of decays according to the different cases that we take into consideration (the main order of decay being affected by the structure of the diffusion in time and by the possible nonlinear character of the spatial operator).

The evolution equation that we take into account is very general, and it can be written as an initial datum problem with homogeneous external Dirichlet condition of the type

$$\begin{cases} \lambda_1\partial_t^\alpha u + \lambda_2\partial_t u + \mathcal{N}[u] = 0 & \text{in } \Omega\times(0,+\infty),\\ u = 0 & \text{in } (\mathbb{R}^n\setminus\Omega)\times(0,+\infty),\\ u(\cdot,0) = u_0(\cdot) & \text{in } \Omega. \end{cases} \tag{1}$$

In this setting, $u = u(x,t)$ is a smooth solution of (1), $\Omega$ is a bounded set of $\mathbb{R}^n$ with smooth boundary (and we are not trying here to optimize the smoothness assumptions on the solution or on the domain), the convex parameters $\lambda_1, \lambda_2 \in [0,1]$ are such that $\lambda_1 + \lambda_2 = 1$, the (possibly nonlinear) operator $\mathcal{N}$ acts on the space variable $x$, and the time-fractional parameter $\alpha$ lies in $(0,1)$. Also, in our setting, the symbol $\partial_t^\alpha$ stands for the so-called Caputo time-fractional derivative, defined, up to normalizing constants that we omit for simplicity, by

$$\partial_t^\alpha v(t) := \frac{d}{dt}\int_0^t \frac{v(\tau) - v(0)}{(t-\tau)^\alpha}\, d\tau.$$

Such a time-fractional derivative naturally arises in many contexts, including geophysics [14], neurology [18, 36] (see also [29] and the references therein) and viscoelasticity [5], and can be seen as a natural consequence of classical models of diffusion in highly ramified media such as combs [4]. In addition, from the mathematical point of view, equations involving the Caputo derivatives can be framed into the broad line of research devoted to Volterra type integrodifferential operators, see [28, 40].

The operator $\mathcal{N}$ in (1) takes into account the diffusion in the space variable, and can be either of classical or of fractional type, and concrete choices will be made in what follows. More precisely, our setting always comprises, as particular situations, the cases of diffusion driven by the Laplacian or by the fractional Laplacian, defined by

$$(-\Delta)^s u(x) := \int_{\mathbb{R}^n} \frac{u(x) - u(y)}{|x-y|^{n+2s}}\, dy, \quad \text{with } s \in (0,1),$$

where the integral is taken in the principal value sense (to allow cancellations near the singularity). In our framework, we also deal with the case in which $\mathcal{N}$ is nonlinear, studying the cases of the classical p-Laplacian and porous media diffusion (see [17, 39])

$$\Delta_p u^m := \operatorname{div}(|\nabla u^m|^{p-2}\nabla u^m), \quad \text{with } p \in (1,+\infty) \text{ and } m \in (0,+\infty),$$

the case of graphical mean curvature, given in formula (13.1) of [23],

$$\operatorname{div}\left(\frac{\nabla u}{\sqrt{1 + |\nabla u|^2}}\right),$$

the case of the fractional p-Laplacian (see e.g. [10])

$$(-\Delta)^s_p u(x) := \int_{\mathbb{R}^n} \frac{|u(x) - u(y)|^{p-2}(u(x) - u(y))}{|x-y|^{n+sp}}\, dy, \quad \text{with } p \in (1,+\infty) \text{ and } s \in (0,1),$$

and possibly even the sum of different nonlinear operators of this type, with coefficients $\beta_j > 0$,

$$\sum_{j=1}^N \beta_j (-\Delta)^{s_j}_{p_j} u, \quad \text{with } p_j \in (1,+\infty) \text{ and } s_j \in (0,1),$$

the case of the anisotropic fractional Laplacian, that is the sum of fractional directional derivatives in the directions of the space $e_j$, given by

$$(-\Delta_\beta)^\sigma u(x) = \sum_{j=1}^n \beta_j (-\partial^2_{x_j})^{\sigma_j} u(x)$$

for $\beta_j > 0$, $\beta = (\beta_1, \ldots, \beta_n)$ and $\sigma = (\sigma_1, \ldots, \sigma_n)$, where

$$(-\partial^2_{x_j})^{\sigma_j} u(x) = \int_{\mathbb{R}} \frac{u(x) - u(x + \rho e_j)}{\rho^{1+2\sigma_j}}\, d\rho,$$

considered for example in [20]. The list of possible diffusion operators continues with two fractional porous media operators (see [13, 33])

$$P_{1,s}(u) := (-\Delta)^s u^m \quad \text{with } s \in (0,1) \text{ and } m \in (0,+\infty),$$

and

$$P_{2,s}(u) := -\operatorname{div}(u\nabla R(u)), \quad \text{where } R(u)(x) := \int_{\mathbb{R}^n} \frac{u(y)}{|x-y|^{n-2s}}\, dy$$

and $s \in (0,1)$, the graphical fractional mean curvature operator (see [6])

$$H^s(u)(x) := \int_{\mathbb{R}^n} F\left(\frac{u(x) - u(x+y)}{|y|}\right)\frac{dy}{|y|^{n+s}}, \quad \text{with } s \in (0,1) \text{ and } F(r) := \int_0^r \frac{d\tau}{(1+\tau^2)^{\frac{n+1+s}{2}}},$$

the classical Kirchhoff operator for vibrating strings

$$\mathcal{K}(u)(x) := -M\left(\|\nabla u\|^2_{L^2(\Omega)}\right)\Delta u(x),$$

and the fractional Kirchhoff operator (see [21])

$$\mathcal{K}_s(u)(x) := M\left(\iint_{\mathbb{R}^n\times\mathbb{R}^n} \frac{|u(y) - u(Y)|^2}{|y - Y|^{n+2s}}\, dy\, dY\right)(-\Delta)^s u(x),$$

with $M : [0,+\infty) \to [0,+\infty)$ nondecreasing and $s \in (0,1)$. The case of complex valued operators is also considered, in view of classical (see [26]) and fractional (see [32]) magnetic settings, in which we take into account the operators

$$\mathcal{M} u := -(\nabla - iA)^2 u,$$

and

$$\mathcal{M}_s u(x) := \int_{\mathbb{R}^n} \frac{u(x) - e^{i(x-y)\cdot A\left(\frac{x+y}{2}\right)} u(y)}{|x-y|^{n+2s}}\, dy, \quad \text{with } s \in (0,1),$$

where $A : \mathbb{R}^n \to \mathbb{R}^n$ represents the magnetic field. For further motivations and additional details on these operators, we refer to [2, 19]: here we just mention that, given the general assumptions that we take, the operator $\mathcal{N}$ in (1) comprises many cases of interest in both pure and applied mathematics, with applications in several disciplines, see for instance [11, 27, 30] for detailed discussions on anomalous diffusion with several applications in different contexts.

In our setting, we will obtain decay estimates in suitable Lebesgue spaces $L^\ell(\Omega)$, for some appropriate exponent $\ell \ge 1$. The typical estimate that we establish is that all solutions $u$ of (1) satisfy

$$\|u(\cdot,t)\|_{L^\ell(\Omega)} \le C_*\,\Theta(t) \quad \text{for all } t \ge 1, \tag{2}$$

where $C_* > 0$ depends on the structural assumptions of the problem (namely on $\Omega$, $\lambda_1$, $\lambda_2$, $\alpha$, $\mathcal{N}$, $u_0$ and $\ell$), and $\Theta : [1,+\infty) \to (0,+\infty)$ is an appropriate decay function, described here below in concrete situations, possibly depending on another constant $C > 0$.

The proof of the decay in (2) relies on energy estimates, which are in turn based on suitable Sobolev embeddings that employ the “parabolic” structure of the problem, leading to an appropriate ordinary differential inequality (if $\lambda_1$ in (1) is equal to zero), or an appropriate integral inequality (if $\lambda_1 = 1$), or a mixed differential/integrodifferential inequality (if $\lambda_1 \in (0,1)$), for the norm-map $t \mapsto \|u(\cdot,t)\|_{L^\ell(\Omega)}$. The solutions of the equations related to those inequalities are used as barriers and compared to the function $\|u(\cdot,t)\|_{L^\ell(\Omega)}$, as presented in Theorem 1.1 of [19] and Theorems 1.1 and 1.2 of [2]. More precisely, the “elliptic” character of the spatial diffusive operator is encoded in an inequality of the type

$$\|u(\cdot,t)\|^{\ell-1+\gamma}_{L^\ell(\Omega)} \le C\int_\Omega |u(x,t)|^{\ell-2}\,\mathrm{Re}\big\{\bar u(x,t)\,\mathcal{N}u(x,t)\big\}\, dx, \tag{3}$$


where $\gamma$ and $C$ are positive structural constants, “Re” denotes the real part and $\bar u$ is the complex conjugate of $u$ (in case the problem is set in the reals, the inequality in (3) obviously simplifies). We observe that (3) becomes more transparent when $\ell = 2$ and $\mathcal{N}u = -\Delta u$, with $u$ real valued: in such a case, after an integration by parts which takes into account the Dirichlet datum of $u$, the inequality in (3) boils down to the classical Sobolev-Poincaré inequality with $\gamma = 1$. Once the inequality in (3) is established for the operator $\mathcal{N}$ under consideration, one obtains a bound in terms of an ordinary differential equation, or more generally of a nonlinear integral equation on the variable $t$: depending on $\gamma$ and on the type of time-derivative, this provides an estimate on the decay in $t$ of the norm-map $t \mapsto \|u(\cdot,t)\|_{L^\ell(\Omega)}$, which can be either polynomial or exponential (in particular, different operators $\mathcal{N}$ can lead to different values of $\gamma$ and therefore to different asymptotics in time for the solution $u$).

This strategy, suitably adapted to the different situations, applies to many operators: the concrete cases that we comprise are listed explicitly in Tables 1 and 2, which present the main results achieved in [2, 19]. For the first table, the theorems cited in the last column are the ones proving (3) and a decay estimate in the case $\lambda_1 = 1$ and $\lambda_2 = 0$ for the operators in their row. Then, combining these results with Theorems 1.1 and 1.2 of [2], the declared estimates trivially follow. However, in Table 1 for the first time we apply the estimates for the case $\lambda_2 = 1$ of [2] to the operators analyzed in [19], stating the expected decays in a quantitative way.

Table 1: Results from [19].

| Operator | $\mathcal{N}$ | Values of $\lambda_1, \lambda_2$ | Range of $\ell$ | Decay $\Theta(t)$ | Reference |
|---|---|---|---|---|---|
| Nonlinear classical diffusion | $\Delta_p u^m$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha/(m(p-1))}$ | Thm 1.2 [19] |
| Nonlinear classical diffusion | $\Delta_p u^m$ with $(m,p) \ne (1,2)$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $1/t^{1/(m(p-1)-1)}$ | Thm 1.2 [19] |
| Bi-Laplacian | $\Delta_2 u$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.2 [19] |
| Graphical mean curvature | $\operatorname{div}\big(\nabla u/\sqrt{1+\lvert\nabla u\rvert^2}\big)$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha}$ | Thm 1.5 [19] |
| Graphical mean curvature | $\operatorname{div}\big(\nabla u/\sqrt{1+\lvert\nabla u\rvert^2}\big)$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.5 [19] |
| Fractional p-Laplacian | $(-\Delta)^s_p u$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha/(p-1)}$ | Thm 1.6 [19] |
| Fractional p-Laplacian | $(-\Delta)^s_p u$, $p > 2$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $1/t^{1/(p-2)}$ | Thm 1.6 [19] |
| Fractional p-Laplacian | $(-\Delta)^s_p u$, $p \le 2$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.6 [19] |
| Superposition of fractional p-Laplacians | $\sum_{j=1}^N \beta_j(-\Delta)^{s_j}_{p_j} u$, with $\beta_j > 0$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha/(p_{\max}-1)}$ | Thm 1.7 [19] |
| Superposition of fractional p-Laplacians | $\sum_{j=1}^N \beta_j(-\Delta)^{s_j}_{p_j} u$, with $\beta_j > 0$ and $p_{\max} > 2$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $1/t^{1/(p_{\max}-2)}$ | Thm 1.7 [19] |
| Superposition of fractional p-Laplacians | $\sum_{j=1}^N \beta_j(-\Delta)^{s_j}_{p_j} u$, with $\beta_j > 0$ and $p_{\max} \le 2$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.7 [19] |
| Superposition of anisotropic fractional Laplacians | $\sum_{j=1}^N \beta_j(-\partial^2_{x_j})^{s_j} u$, with $\beta_j > 0$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha}$ | Thm 1.8 [19] |
| Superposition of anisotropic fractional Laplacians | $\sum_{j=1}^N \beta_j(-\partial^2_{x_j})^{s_j} u$, with $\beta_j > 0$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.8 [19] |
| Fractional porous media I | $P_{1,s}(u)$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha/m}$ | Thm 1.9 [19] |
| Fractional porous media I | $P_{1,s}(u)$, $m > 1$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $1/t^{1/(m-1)}$ | Thm 1.9 [19] |
| Fractional porous media I | $P_{1,s}(u)$, $m \le 1$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.9 [19] |
| Fractional graphical mean curvature | $H^s(u)$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha}$ | Thm 1.10 [19] |
| Fractional graphical mean curvature | $H^s(u)$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.10 [19] |

Table 2: Results from [2].

| Operator | $\mathcal{N}$ | Values of $\lambda_1, \lambda_2$ | Range of $\ell$ | Decay $\Theta(t)$ | Reference |
|---|---|---|---|---|---|
| Fractional porous media II | $P_{2,s}$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha/2}$ | Thm 1.3 [2] |
| Fractional porous media II | $P_{2,s}$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $1/t$ | Thm 1.3 [2] |
| Classical Kirchhoff operator | $\mathcal{K}(u)$ with $M(0) > 0$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha}$ | Thm 1.4 [2] |
| Classical Kirchhoff operator | $\mathcal{K}(u)$ with $M(t) = bt$, $b > 0$ and $n \le 4$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha/3}$ | Thm 1.4 [2] |
| Classical Kirchhoff operator | $\mathcal{K}(u)$ with $M(t) = bt$, $b > 0$ and $n \ge 5$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | from $1$ to $\frac{2n}{n-4}$ | $1/t^{\alpha/3}$ | Thm 1.4 [2] |
| Classical Kirchhoff operator | $\mathcal{K}(u)$ with $M(0) > 0$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.4 [2] |
| Classical Kirchhoff operator | $\mathcal{K}(u)$ with $M(t) = bt$, $b > 0$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $1/\sqrt{t}$ | Thm 1.4 [2] |
| Fractional Kirchhoff operator | $\mathcal{K}_s(u)$ with $M(0) > 0$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha}$ | Thm 1.5 [2] |
| Fractional Kirchhoff operator | $\mathcal{K}_s(u)$ with $M(t) = bt$, $b > 0$ and $n \le 4s$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha/3}$ | Thm 1.5 [2] |
| Fractional Kirchhoff operator | $\mathcal{K}_s(u)$ with $M(t) = bt$, $b > 0$ and $n > 4s$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | from $1$ to $\frac{2n}{n-4s}$ | $1/t^{\alpha/3}$ | Thm 1.5 [2] |
| Fractional Kirchhoff operator | $\mathcal{K}_s(u)$ with $M(0) > 0$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.5 [2] |
| Fractional Kirchhoff operator | $\mathcal{K}_s(u)$ with $M(t) = bt$, $b > 0$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $1/\sqrt{t}$ | Thm 1.5 [2] |
| Classical magnetic operator | $\mathcal{M}(u)$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha}$ | Thm 1.6 [2] |
| Classical magnetic operator | $\mathcal{M}(u)$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.6 [2] |
| Fractional magnetic operator | $\mathcal{M}_s(u)$ | $\lambda_1 \in (0,1]$, $\lambda_2 \in [0,1)$ | $[1,+\infty)$ | $1/t^{\alpha}$ | Thm 1.7 [2] |
| Fractional magnetic operator | $\mathcal{M}_s(u)$ | $\lambda_1 = 0$, $\lambda_2 = 1$ | $[1,+\infty)$ | $e^{-t/C}$ | Thm 1.7 [2] |

It would be interesting to determine whether the estimates listed in Tables 1 and 2 are optimal, and to investigate other cases of interest as well. For related decay estimates, see [7, 22, 28]. As a matter of fact, decay estimates for evolutionary problems are a classical topic of research that has produced a very abundant, and extremely interesting, literature. Without aiming at providing an exhaustive list of all the important contributions on this topic, we mention that:
• the classical doubly-nonlinear operator $\Delta_p u^m$ with $\lambda_2 = 1$ has been addressed in [9],
• the classical 1-Laplace operator has been dealt with in [3, 24, 25],
• decay estimates for the fractional p-Laplacian $(-\Delta)^s_p$ with s ∈ (0, 1) and p > 1 with λ2 := 1 were first established in Section 6 of [16] (see also [15]),
• the case of the fractional 1-Laplacian, namely $(-\Delta)^s_p$ with s ∈ (0, 1) and p := 1, has been treated in [25],
• the porous medium equation for λ2 = 1 has been deeply analyzed in [7],
• some interesting estimates for the Kirchhoff equation are given in [22],
• see also [35], where several decay estimates have been obtained by using integral inequalities.
We also remark that the interplay between time derivatives and fractional diffusion produces interesting decay patterns also in nonlinear equations, see e.g. [34]. Furthermore, in general, the fractional aspect of the problem can cause significant differences, as can be observed also from the time decays of Table 1. For instance, one may notice that the decay switches from polynomial to exponential in the Kirchhoff equations when the time-diffusion changes from fractional to classical, and this happens independently of whether the space diffusion is of classical or fractional type (roughly speaking, in this context, it is just the character of the diffusion in time which determines the time decay, regardless of the character of the diffusion in space).
We also point out that classical and fractional operators share several common properties, but they also exhibit structural differences at a fundamental level. For instance, to exhibit an elementary but very interesting feature in which long-time behaviors are affected by fractional environments, we recall that fractional diffusion in space, as modeled by the fractional Laplacian $(-\Delta)^s$ with s ∈ (0, 1), is related to Lévy-type and 2s-stable stochastic processes, and in such case the long jumps of the underlying random walk cause significant differences with respect to the classical Brownian motion. In particular, fractional processes are typically recurrent only in dimension 1 and for values of s greater than or equal to 1/2 (being transient in dimension 2 and higher, and also in dimension 1 for values of s smaller than 1/2), and this is an important difference with respect to the case of Gaussian processes, which are recurrent in dimensions 1 and 2 (and transient in dimension 3 and higher). See Example 3.5 in [37] and the references therein for a detailed treatment of recurrence and transiency for Lévy-type processes. In this note, in § 2, we present a very simple, and somewhat heuristic, discussion of the recurrence and transiency properties related to the long jump random walks, based on PDE methods and completely accessible to a broad audience. For a detailed list of other elementary structural differences between classical and fractional diffusion see also § 2.1 in [1].

2 Recurrence and transiency of long jump random processes

In this section, we discuss a simple PDE approach to the recurrence of the long jump random walk related to $(-\Delta)^s$ in dimension 1 and for values of s greater than or equal to 1/2, and to its transiency in dimension 2 and higher, and also in dimension 1 for values of s smaller than 1/2. The treatment will comprise the classical case s = 1 as well, showing how the structural differences between the different regimes naturally arise from a PDE analysis. To this end, for s ∈ (0, 1], we denote by $G_s(x,t)$ the solution of the (possibly fractional) heat equation with initial datum given by the Dirac delta, namely

$$\begin{cases} \partial_t G_s + (-\Delta)^s G_s = 0 & \text{in } \mathbb{R}^n\times(0,+\infty),\\ G_s(x,0) = \delta_0(x). \end{cases}$$

When s = 1, this function reduces to the classical Gauss kernel for the heat flow, namely

$$G_1(x,t) = \frac{1}{(4\pi t)^{\frac{n}{2}}}\, e^{-\frac{|x|^2}{4t}}.$$

In general, when s ∈ (0, 1), the expression of $G_s$ is less explicit, except when s = 1/2; in the latter case, it holds that

$$G_{1/2}(x,t) = \frac{c\,t}{(t^2+|x|^2)^{\frac{n+1}{2}}},$$


where c > 0 is a normalizing constant, needed in view of the general mass conservation law

$$\int_{\mathbb{R}^n} G_s(x,t)\,dx = 1, \tag{4}$$

see also formula (2.29) in [1] and page 1363 in [12]. Furthermore (see again page 1363 in [12]), we have that

$$G_s(x,t) > 0 \quad\text{for all } (x,t)\in\mathbb{R}^n\times(0,+\infty), \tag{5}$$

and $G_s$ enjoys the natural scaling property

$$G_s(x,t) = \frac{1}{t^{\frac{n}{2s}}}\, G_s\!\left(\frac{x}{t^{\frac{1}{2s}}},\,1\right). \tag{6}$$
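As an illustration of (4) and (6), the following minimal Python sketch checks mass conservation and the scaling property in dimension n = 1 for the two explicit kernels above. It assumes the normalizing constant c = 1/π for $G_{1/2}$ when n = 1; the function names and the quadrature (a midpoint rule after the substitution x = tan θ) are choices made only for this example.

```python
import math

def gauss_kernel(x, t):
    """G_1 in dimension n = 1: the classical Gauss kernel."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def cauchy_kernel(x, t):
    """G_{1/2} in dimension n = 1, assuming the normalizing constant c = 1/pi."""
    return t / (math.pi * (t * t + x * x))

def total_mass(kernel, t, steps=20000):
    """Midpoint rule for the integral over the real line, after substituting x = tan(theta)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2.0 + (i + 0.5) * h
        x = math.tan(theta)
        total += kernel(x, t) * (1.0 + x * x) * h  # dx = (1 + x^2) dtheta
    return total

def rescaled(kernel, s, x, t):
    """Right-hand side of (6) with n = 1: t^(-1/(2s)) * G_s(x / t^(1/(2s)), 1)."""
    scale = t ** (1.0 / (2.0 * s))
    return kernel(x / scale, 1.0) / scale

if __name__ == "__main__":
    print(total_mass(gauss_kernel, 0.3), total_mass(cauchy_kernel, 2.0))
    print(gauss_kernel(0.7, 3.0), rescaled(gauss_kernel, 1.0, 0.7, 3.0))
    print(cauchy_kernel(0.7, 3.0), rescaled(cauchy_kernel, 0.5, 0.7, 3.0))
```

Both masses should come out close to 1, and each pair of values on the last two lines should agree up to rounding.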

See [8], [12], formulas (2.41)–(2.45) in [1] and the references therein for a discussion about the fractional heat kernel and its differences with the classical case. For every k ∈ {1, 2, 3, . . . } and ρ > 0, we define

$$q_k(s,\rho) := \int_{\mathbb{R}^n\setminus B_\rho} G_s(x,k)\,dx. \tag{7}$$

We observe that

$$0 \le q_k(s,\rho) \le \int_{\mathbb{R}^n} G_s(x,k)\,dx = 1, \tag{8}$$

thanks to (4). Let also

$$q(s,\rho) := \prod_{k=1}^{+\infty} q_k(s,\rho) \in [0,1] \qquad\text{and}\qquad q(s) := \lim_{\rho\to 0} q(s,\rho). \tag{9}$$

In view of (8), we can consider q(s) as related to the probability that the stochastic process associated with the operator $(-\Delta)^s$ “drifts away without coming back”. Namely (see [38]), we know that

$$\int_A G_s(x,t)\,dx$$

represents the probability that a particle starting at the origin at time 0 and following the stochastic process producing $(-\Delta)^s$ ends up in the region A ⊆ R^n at time t. In this sense, the quantity $q_k(s,\rho)$ in (7) represents the probability that this particle lies outside $B_\rho$ at time k. Roughly speaking, for small ρ, a natural Ansatz is to assume these events to be more or less independent from each other: indeed, in view of (6), using the substitution y := x/ρ we have that

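$$\begin{aligned} q_k(s,\rho) &= \frac{1}{k^{\frac{n}{2s}}}\int_{\mathbb{R}^n\setminus B_\rho} G_s\!\left(\frac{x}{k^{\frac{1}{2s}}},1\right)dx = \frac{\rho^n}{k^{\frac{n}{2s}}}\int_{\mathbb{R}^n\setminus B_1} G_s\!\left(\frac{\rho y}{k^{\frac{1}{2s}}},1\right)dy \\ &= \frac{1}{(k/\rho^{2s})^{\frac{n}{2s}}}\int_{\mathbb{R}^n\setminus B_1} G_s\!\left(\frac{y}{(k/\rho^{2s})^{\frac{1}{2s}}},1\right)dy = \int_{\mathbb{R}^n\setminus B_1} G_s\!\left(y,\frac{k}{\rho^{2s}}\right)dy, \end{aligned}$$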

representing the probability of a particle to lie outside $B_1$ at time $k/\rho^{2s}$. In view of this, since the time steps $k/\rho^{2s}$ are widely separated from each other when ρ is small, we may think that the quantity q(s, ρ) in (9) is a good approximation of the probability that the particle does not lie in $B_1$ at any of the time steps $k/\rho^{2s}$, as well as a good approximation of the probability that the particle does not lie in $B_\rho$ at any of the time steps k ∈ {1, 2, 3, . . . }. In this heuristic picture, the case in which the quantity q(s) in (9) is equal to 0 indicates that the particle will come back infinitely often to its original position at the origin in integer times (with probability 1); conversely, the case in which the quantity q(s) in (9) is equal to 1 indicates that the particle will return to its original position at the origin in integer times only with probability zero. In this sense, computing q(s) gives an interesting indication of the recurrence properties of the associated stochastic process, and, in our case, this calculation can be performed as follows. First of all, we notice that

$$\iota_s := \inf_{x\in B_1} G_s(x,1) > 0,$$

thanks to (5), and

$$\mu_s := \sup_{x\in\mathbb{R}^n} G_s(x,1) < +\infty.$$

As a consequence, recalling (4) and (6), if ρ ∈ (0, 1],

$$p_k(s,\rho) := 1 - q_k(s,\rho) = \int_{B_\rho} G_s(x,k)\,dx = \frac{1}{k^{\frac{n}{2s}}}\int_{B_\rho} G_s\!\left(\frac{x}{k^{\frac{1}{2s}}},1\right)dx \in \left[\frac{\iota_s\,|B_\rho|}{k^{\frac{n}{2s}}},\ \frac{\mu_s\,|B_\rho|}{k^{\frac{n}{2s}}}\right].$$

This gives that

$$q_k(s,\rho) = 1 - p_k(s,\rho) \in \left[1-\frac{C\rho^n}{k^{\frac{n}{2s}}},\ 1-\frac{c\rho^n}{k^{\frac{n}{2s}}}\right],$$

for some C > c > 0, depending only on n and s, and accordingly

$$\log q(s,\rho) = \log\prod_{k=1}^{+\infty} q_k(s,\rho) = \sum_{k=1}^{+\infty}\log q_k(s,\rho) \in \left[\sum_{k=1}^{+\infty}\log\left(1-\frac{C\rho^n}{k^{\frac{n}{2s}}}\right),\ \sum_{k=1}^{+\infty}\log\left(1-\frac{c\rho^n}{k^{\frac{n}{2s}}}\right)\right]. \tag{10}$$
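The two-sided bound on $p_k(s,\rho)$ can be made concrete in the one case where everything is explicit. The sketch below takes n = 1 and s = 1/2, and assumes the normalization $G_{1/2}(x,t) = t/(\pi(t^2+x^2))$, so that $\iota_{1/2} = 1/(2\pi)$, $\mu_{1/2} = 1/\pi$ and $|B_\rho| = 2\rho$; the names `p_k` and `bounds` are hypothetical helpers for this example only.

```python
import math

def p_k(k, rho):
    """p_k(1/2, rho) for n = 1: mass of G_{1/2}(., k) inside the interval (-rho, rho)."""
    return (2.0 / math.pi) * math.atan(rho / k)

IOTA = 1.0 / (2.0 * math.pi)  # infimum of G_{1/2}(x, 1) over |x| < 1
MU = 1.0 / math.pi            # supremum of G_{1/2}(x, 1) over the real line

def bounds(k, rho):
    """Lower and upper bounds iota |B_rho| / k and mu |B_rho| / k (here n/(2s) = 1)."""
    ball = 2.0 * rho  # length of B_rho in dimension 1
    return IOTA * ball / k, MU * ball / k

if __name__ == "__main__":
    for k in (1, 10, 100):
        low, high = bounds(k, 0.5)
        print(k, low, p_k(k, 0.5), high)
```

In this case the bound reduces to the elementary inequality z/2 ≤ arctan z ≤ z for z ∈ (0, 1], applied with z = ρ/k.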


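Also, for a fixed C0 ∈ (0, +∞), the convergence of the series

$$\sum_{k=1}^{+\infty}\log\left(1-\frac{C_0\rho^n}{k^{\frac{n}{2s}}}\right)$$

can be reduced to that of the series

$$-\sum_{k=1}^{+\infty}\frac{C_0\rho^n}{k^{\frac{n}{2s}}},$$

which converges if and only if n/(2s) > 1, and consequently

$$\sum_{k=1}^{+\infty}\log\left(1-\frac{C_0\rho^n}{k^{\frac{n}{2s}}}\right) = \begin{cases} -C_1\rho^n & \text{if } n > 2s,\\ -\infty & \text{if } n \le 2s, \end{cases}$$

for some C1 > 0 depending only on n, s and C0. This and (10) lead to

$$\log q(s,\rho) = -\infty \quad\text{if } n\le 2s, \qquad \log q(s,\rho) \in [-C_2\rho^n,\,-C_3\rho^n] \quad\text{if } n > 2s,$$

for some C2 > C3 > 0 depending only on n and s. Therefore,

$$q(s,\rho) = 0 \quad\text{if } n\le 2s, \qquad q(s,\rho) \in \left[e^{-C_2\rho^n},\,e^{-C_3\rho^n}\right] \quad\text{if } n > 2s.$$

Taking the limit as ρ → 0, we thereby find that

$$q(s) = 0 \quad\text{if } n\le 2s, \qquad q(s) = 1 \quad\text{if } n > 2s. \tag{11}$$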
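The dichotomy (11) can be observed numerically in the few cases where $q_k(s,\rho)$ has a closed form. The following Python sketch is only an illustration under stated assumptions: it uses the Gauss kernel for s = 1 in dimensions n = 1 and n = 3, and the kernel $G_{1/2}(x,t) = t/(\pi(t^2+x^2))$ for s = 1/2 in dimension n = 1; the helper names are hypothetical, and the infinite product (9) is truncated after finitely many factors.

```python
import math

def q_gauss_1d(k, rho):
    """q_k(1, rho) for n = 1: mass of the Gauss kernel outside (-rho, rho) at time k."""
    return math.erfc(rho / (2.0 * math.sqrt(k)))

def q_cauchy_1d(k, rho):
    """q_k(1/2, rho) for n = 1, assuming G_{1/2}(x, t) = t / (pi (t^2 + x^2))."""
    return 1.0 - (2.0 / math.pi) * math.atan(rho / k)

def q_gauss_3d(k, rho):
    """q_k(1, rho) for n = 3: mass of the Gauss kernel outside the ball B_rho at time k."""
    r = rho / math.sqrt(2.0 * k)  # radius measured in units of the standard deviation sqrt(2k)
    inside = math.erf(r / math.sqrt(2.0)) - math.sqrt(2.0 / math.pi) * r * math.exp(-r * r / 2.0)
    return 1.0 - inside

def partial_product(q, rho, terms):
    """Truncation of the infinite product (9) after `terms` factors."""
    log_sum = 0.0
    for k in range(1, terms + 1):
        log_sum += math.log(q(k, rho))
    return math.exp(log_sum)

if __name__ == "__main__":
    rho = 0.5
    for terms in (10**2, 10**3, 10**4):
        print(terms,
              partial_product(q_gauss_1d, rho, terms),   # n = 1 < 2s = 2
              partial_product(q_cauchy_1d, rho, terms),  # n = 1 = 2s
              partial_product(q_gauss_3d, rho, terms))   # n = 3 > 2s = 2
```

In the two cases with n ≤ 2s the truncated products keep decreasing as more factors are included (quickly for the Gauss kernel, only at a power rate in the borderline case n = 2s), while for n = 3 and s = 1 they stabilize at a value close to 1.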

When s = 1, we can write (11) as

$$q(s) = 0 \quad\text{if } n\in\{1,2\}, \qquad q(s) = 1 \quad\text{if } n\ge 3,$$

that is, in our framework, the classical random walk “comes back to the initial point” in dimensions 1 and 2, and it “drifts away forever” in dimension 3 and higher. When s ∈ (0, 1), the situation is different, since (11) produces the alternative

$$q(s) = 0 \quad\text{if } n = 1 \text{ and } s\in[1/2,1), \qquad q(s) = 1 \quad\text{if } n\ge 2, \text{ and also if } n = 1 \text{ and } s\in(0,1/2).$$

That is, in our setting, the fractional random walk “comes back to the initial point” only in dimension 1 and only if the fractional parameter is above a certain threshold (namely s ≥ 1/2). Conversely, the fractional random walk “drifts away forever” already in dimension 2, and even in dimension 1 if the fractional parameter is too small (namely s < 1/2). See e.g. [31] and the references therein for a comprehensive treatment of recurrence and transiency of general stochastic processes.

Acknowledgements This work has been supported by the Australian Research Council Discovery Project 170104880 NEW “Nonlocal Equations at Work”. Part of this work was carried out on the occasion of a very pleasant visit of the first author to the University of Western Australia, which we thank for the warm hospitality. The authors are members of INdAM/GNAMPA.

References

[1] Nicola Abatangelo and Enrico Valdinoci, Getting acquainted with the fractional Laplacian, Contemporary Research in Elliptic PDEs and Related Topics, 2019, pp. 1–105, DOI 10.1007/978-3-030-18921-1.
[2] Elisa Affili and Enrico Valdinoci, Decay estimates for evolution equations with classical and fractional time-derivatives, J. Differential Equations 266 (2019), no. 7, 4027–4060, DOI 10.1016/j.jde.2018.09.031. MR3912710
[3] Fuensanta Andreu-Vaillo, Vicent Caselles, and José M. Mazón, Parabolic quasilinear equations minimizing linear growth functionals, Progress in Mathematics, vol. 223, Birkhäuser Verlag, Basel, 2004. MR2033382
[4] V. E. Arkhincheev and É. M. Baskin, Anomalous diffusion and drift in a comb model of percolation clusters, J. Exp. Theor. Phys. 73 (1991), 161–165.
[5] Ron Bagley, On the equivalence of the Riemann-Liouville and the Caputo fractional order derivatives in modeling of linear viscoelastic materials, Fract. Calc. Appl. Anal. 10 (2007), no. 2, 123–126. MR2351653
[6] Begoña Barrios, Alessio Figalli, and Enrico Valdinoci, Bootstrap regularity for integrodifferential operators and its application to nonlocal minimal surfaces, Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) 13 (2014), no. 3, 609–639. MR3331523
[7] Piotr Biler, Cyril Imbert, and Grzegorz Karch, The nonlocal porous medium equation: Barenblatt profiles and other weak solutions, Arch. Ration. Mech. Anal. 215 (2015), no. 2, 497–529, DOI 10.1007/s00205-014-0786-1. MR3294409
[8] Krzysztof Bogdan and Tomasz Jakubowski, Estimates of heat kernel of fractional Laplacian perturbed by gradient operators, Comm. Math. Phys. 271 (2007), no. 1, 179–198, DOI 10.1007/s00220-006-0178-y. MR2283957
[9] Matteo Bonforte and Gabriele Grillo, Super and ultracontractive bounds for doubly nonlinear evolution equations, Rev. Mat. Iberoam. 22 (2006), no. 1, 111–129. MR2268115
[10] Lorenzo Brasco, Erik Lindgren, and Armin Schikorra, Higher Hölder regularity for the fractional p-Laplacian in the superquadratic case, Adv. Math. 338 (2018), 782–846, DOI 10.1016/j.aim.2018.09.009. MR3861716
[11] Claudia Bucur and Enrico Valdinoci, Nonlocal diffusion and applications, Lecture Notes of the Unione Matematica Italiana, vol. 20, Springer, [Cham]; Unione Matematica Italiana, Bologna, 2016. MR3469920
[12] Xavier Cabré and Jean-Michel Roquejoffre, Propagation de fronts dans les équations de Fisher-KPP avec diffusion fractionnaire, C. R. Math. Acad. Sci. Paris 347 (2009), no. 23-24, 1361–1366, DOI 10.1016/j.crma.2009.10.012 (French, with English and French summaries). MR2588782
[13] Luis A. Caffarelli and Juan Luis Vázquez, Asymptotic behaviour of a porous medium equation with fractional diffusion, Discrete Contin. Dyn. Syst. 29 (2011), no. 4, 1393–1404, DOI 10.3934/dcds.2011.29.1393. MR2773189
[14] Michele Caputo, Linear models of dissipation whose Q is almost frequency independent. II, Fract. Calc. Appl. Anal. 11 (2008), no. 1, 4–14. Reprinted from Geophys. J. R. Astr. Soc. 13 (1967), no. 5, 529–539. MR2379269
[15] Thierry Coulhon and Daniel Hauer, Regularisation effects of nonlinear semigroups, arXiv e-prints (2016), available at 1604.08737.
[16] Thierry Coulhon and Daniel Hauer, Regularisation effects of nonlinear semigroups – theory and applications, SpringerBriefs in Mathematics, Springer, Cham; BCAM Basque Center for Applied Mathematics, Bilbao. BCAM SpringerBriefs.
[17] Emmanuele DiBenedetto, Degenerate parabolic equations, Universitext, Springer-Verlag, New York, 1993. MR1230384
[18] Serena Dipierro and Enrico Valdinoci, A simple mathematical model inspired by the Purkinje cells: from delayed travelling waves to fractional diffusion, Bull. Math. Biol. 80 (2018), no. 7, 1849–1870, DOI 10.1007/s11538-018-0437-z. MR3814763
[19] Serena Dipierro, Enrico Valdinoci, and Vincenzo Vespri, Decay estimates for evolutionary equations with fractional time-diffusion, J. Evol. Equ. 19 (2019), no. 2, 435–462, DOI 10.1007/s00028-019-00482-z. MR3950697
[20] Alberto Farina and Enrico Valdinoci, Regularity and rigidity theorems for a class of anisotropic nonlocal operators, Manuscripta Math. 153 (2017), no. 1-2, 53–70, DOI 10.1007/s00229-016-0875-6. MR3635973
[21] Alessio Fiscella and Enrico Valdinoci, A critical Kirchhoff type problem involving a nonlocal operator, Nonlinear Anal. 94 (2014), 156–170, DOI 10.1016/j.na.2013.08.011. MR3120682
[22] Marina Ghisi and Massimo Gobbino, Hyperbolic-parabolic singular perturbation for mildly degenerate Kirchhoff equations: time-decay estimates, J. Differential Equations 245 (2008), no. 10, 2979–3007, DOI 10.1016/j.jde.2008.04.017. MR2454809
[23] Enrico Giusti and Graham Hale Williams, Minimal surfaces and functions of bounded variation, Vol. 2, Springer, 1984.
[24] Daniel Hauer and José M. Mazón, Kurdyka-Lojasiewicz-Simon inequality for gradient flows in metric spaces, arXiv e-prints (2017), available at 1707.03129.
[25] Daniel Hauer and José M. Mazón, Regularizing effects of homogeneous evolution equations: the case of homogeneity order zero, J. Evol. Equ., posted on 2019, DOI 10.1007/s00028-019-00502-y.
[26] Teruo Ikebe and Tosio Kato, Uniqueness of the self-adjoint extension of singular elliptic differential operators, Arch. Rational Mech. Anal. 9 (1962), 77–92, DOI 10.1007/BF00253334. MR0142894
[27] C. Ionescu, A. Lopes, D. Copot, J. A. T. Machado, and J. H. T. Bates, The role of fractional calculus in modeling biological phenomena: a review, Commun. Nonlinear Sci. Numer. Simul. 51 (2017), 141–159, DOI 10.1016/j.cnsns.2017.04.001. MR3645874
[28] Jukka Kemppainen, Juhana Siljander, Vicente Vergara, and Rico Zacher, Decay estimates for time-fractional and other non-local in time subdiffusion equations in R^d, Math. Ann. 366 (2016), no. 3-4, 941–979, DOI 10.1007/s00208-015-1356-z. MR3563229
[29] Toma M. Marinov, Nelson Ramirez, and Fidel Santamaria, Fractional integration toolbox, Fract. Calc. Appl. Anal. 16 (2013), no. 3, 670–681, DOI 10.2478/s13540-013-0042-7. MR3071207
[30] Ralf Metzler and Joseph Klafter, The random walk’s guide to anomalous diffusion: a fractional dynamics approach, Phys. Rep. 339 (2000), no. 1, 77, DOI 10.1016/S0370-1573(00)00070-3. MR1809268
[31] T. M. Michelitsch, B. A. Collet, A. P. Riascos, A. F. Nowakowski, and F. C. G. A. Nicolleau, Recurrence of random walks with long-range steps generated by fractional Laplacian matrices on regular networks and simple cubic lattices, J. Phys. A 50 (2017), no. 50, 505004, 29, DOI 10.1088/1751-8121/aa9008. MR3738798
[32] Hoai-Minh Nguyen, Andrea Pinamonti, Marco Squassina, and Eugenio Vecchi, New characterizations of magnetic Sobolev spaces, Adv. Nonlinear Anal. 7 (2018), no. 2, 227–245, DOI 10.1515/anona-2017-0239. MR3794886
[33] Arturo de Pablo, Fernando Quirós, Ana Rodríguez, and Juan Luis Vázquez, A fractional porous medium equation, Adv. Math. 226 (2011), no. 2, 1378–1409, DOI 10.1016/j.aim.2010.07.017. MR2737788
[34] Stefania Patrizi and Enrico Valdinoci, Relaxation times for atom dislocations in crystals, Calc. Var. Partial Differential Equations 55 (2016), no. 3, Art. 71, 44, DOI 10.1007/s00526-016-1000-0. MR3511786
[35] Maria Michaela Porzio, On decay estimates, J. Evol. Equ. 9 (2009), no. 3, 561–591, DOI 10.1007/s00028-009-0024-8. MR2529737
[36] E. È. Saftenku, Modeling of slow glutamate diffusion and AMPA receptor activation in the cerebellar glomerulus, J. Theoret. Biol. 234 (2005), no. 3, 363–382, DOI 10.1016/j.jtbi.2004.11.036. MR2139665
[37] Nikola Sandrić, On transience of Lévy-type processes, Stochastics 88 (2016), no. 7, 1012–1040, DOI 10.1080/17442508.2016.1178749. MR3529858
[38] Enrico Valdinoci, From the long jump random walk to the fractional Laplacian, Bol. Soc. Esp. Mat. Apl. SeMA 49 (2009), 33–44. MR2584076
[39] Juan Luis Vázquez, The porous medium equation, Oxford Mathematical Monographs, The Clarendon Press, Oxford University Press, Oxford, 2007. Mathematical theory. MR2286292
[40] Rico Zacher, Maximal regularity of type L^p for abstract parabolic Volterra equations, J. Evol. Equ. 5 (2005), no. 1, 79–103, DOI 10.1007/s00028-004-0161-z. MR2125407

Multi-point maximum principles and eigenvalue estimates

Ben Andrews

Abstract Estimates on modulus of continuity, isoperimetric profiles of various kinds, and quantities involving function values at several points have been central in several recent results in geometric analysis. In these lectures I will focus mostly on the applications to partial differential equations, and to estimates on eigenvalues. These lectures were presented at the MATRIX program on “Recent trends on Nonlinear PDE of Elliptic and Parabolic type” at Creswick, November 5-16, 2018.

Acknowledgements This survey describes work supported by Discovery Projects grants DP0985802, DP120102462, and DP120100097, and Laureate Fellowship FL150100126 of the Australian Research Council.

Introductory comments

In this article I want to describe some techniques which have been applied with some success recently to a variety of problems, ranging from my proof with Julie Clutterbuck of the sharp lower bound on the fundamental gap for Schrödinger operators [4] to Brendle’s proof of the Lawson conjecture [14] and my proof with Haizhong Li of the Pinkall-Sterling conjecture [7]. I will discuss several other interesting applications below. The common thread in these techniques is the application of the maximum principle to functions involving several points or to functions depending on the global structure of solutions. Further related ideas, and more details on some of the methods presented here, can be found in the author’s survey article [1].

Ben Andrews, Mathematical Sciences Institute, Australian National University, e-mail: [email protected]

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_13


Lecture 1: Controlling the modulus of continuity in heat equations

1.1 Moduli of continuity

Today I want to discuss how two-point maximum principles can be used to control the modulus of continuity for solutions of heat equations. This implies a lot of information including sharp gradient estimates. In particular the modulus of continuity estimates imply sharp decay estimates, which are the key to some sharp eigenvalue inequalities, the first of which we will reach by the end of today’s lecture. Recall that for a function f (on a metric space), a function ω of one positive variable is a modulus of continuity for f if ω(s) bounds the difference in function values |f(y) − f(x)| for all points with separation d(x, y) = s. I will adopt a slightly different definition, for the sake of simplicity further down the track: We say ω is a modulus of continuity for f if

$$\frac{|f(y)-f(x)|}{2} \le \omega\!\left(\frac{d(x,y)}{2}\right)$$

for all x and y. This differs from the usual definition by the factors of 2, and the reason for these will become clear in a moment. For a time-dependent function f(x,t), we say that a time-dependent function ω(s,t) is a modulus of continuity for f if ω(.,t) is a modulus of continuity for f(.,t) for each t, which means that

$$\frac{|f(y,t)-f(x,t)|}{2} \le \omega\!\left(\frac{d(x,y)}{2},\,t\right)$$

for all x and y and all t. In particular, there is a smallest modulus of continuity (which we will sometimes call ‘the modulus of continuity of f’) defined by

$$\omega_f(s,t) = \sup\left\{\frac{|f(y,t)-f(x,t)|}{2} \,:\, \frac{d(x,y)}{2} = s\right\}.$$

The following example is an important one:

Lemma 1. Suppose that f is a function on the real line which is odd, increasing, and concave on the positive half-line. Then $\omega_f(s) = f(s)$ for s > 0.

Proof. We will show that for any fixed s > 0, the supremum of |f(y) − f(x)| among points with |y − x| = 2s is attained at the points y = s, x = −s, so that

$$\omega_f(s) = \frac{f(s)-f(-s)}{2} = f(s)$$

since f is odd. First, we can assume that y > x and f(y) − f(x) = |f(y) − f(x)| since f is increasing. Then the function η(x) = f(x + s) − f(x − s) is even in x since f is odd, and we have for x ≥ s that

$$x-s = \frac{2s}{x+s}\,(0) + \frac{x-s}{x+s}\,(x+s); \qquad s = \frac{x}{x+s}\,(0) + \frac{s}{x+s}\,(x+s),$$

so since f is concave on [0, x + s] and f(0) = 0 we have

$$f(x-s) \ge \frac{x-s}{x+s}\,f(x+s); \qquad f(s) \ge \frac{s}{x+s}\,f(x+s); \qquad\Longrightarrow\qquad f(x-s) + 2f(s) \ge f(x+s),$$

which is equivalent to

$$\eta(x)-\eta(0) = f(x+s)-f(x-s)-f(s)+f(-s) = f(x+s)-f(x-s)-2f(s) \le 0.$$

If 0 < x < s then we have

$$\eta(x)-\eta(0) = f(x+s)-f(x-s)-2f(s) = f(x+s)+f(s-x)-2f(s) \le 0$$

since f is concave on the interval [s − x, s + x] ⊂ R+. Thus 0 is the global maximum of η, as claimed.

We observe that if f0 is an odd, increasing function which is concave for positive values, then the same remains true for f(.,t) for each t > 0 if f satisfies the heat equation

$$\frac{\partial f}{\partial t} = \frac{\partial^2 f}{\partial s^2} \tag{1}$$

with initial data f0. Thus we have the following curious corollary:

Corollary 1. If f0 is an odd, increasing function which is concave for positive values, then $\omega_f(s,t) = f(s,t)$ for all s > 0 and t > 0 if f evolves by (1). In particular, the modulus of continuity of f satisfies the one-dimensional heat equation. Furthermore, if u is a solution of the heat equation

$$\frac{\partial u}{\partial t} = \Delta u := \sum_{i=1}^n \frac{\partial^2 u}{\partial x_i^2} \tag{2}$$

on R^n which depends on only one of the spatial variables, so that u(x1, · · · , xn, t) = f(x1, t), where f is as above, then $\omega_u(s,t) = f(s,t)$, and $\omega_u$ is a solution of the one-dimensional heat equation. Later we will see this as the extreme case of a result for general solutions of heat equations.
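Lemma 1 is easy to test numerically. The sketch below is an illustration only: it approximates the supremum defining $\omega_f(s)$ by a maximum over a finite grid of midpoints c (so that y = c + s and x = c − s), and compares it with f(s) for two functions that are odd, increasing, and concave on the positive half-line, namely erf and tanh. The helper name `half_oscillation` and the grid are assumptions of this example.

```python
import math

def half_oscillation(f, s, centers):
    """Largest value of |f(y) - f(x)| / 2 over pairs y = c + s, x = c - s, so that |y - x| / 2 = s."""
    return max(abs(f(c + s) - f(c - s)) / 2.0 for c in centers)

# Midpoints c in [-10, 10] with spacing 0.01; the grid contains c = 0.
centers = [i / 100.0 for i in range(-1000, 1001)]

if __name__ == "__main__":
    for f in (math.erf, math.tanh):
        for s in (0.1, 0.7, 2.0):
            print(f.__name__, s, half_oscillation(f, s, centers), f(s))
```

For each function and each s the two printed values should coincide, the maximum being attained at the midpoint c = 0 as in the proof.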

1.2 Motivation: Zero counting for equations in one space variable

For equations in one spatial variable, we can use zero-counting methods to get a good understanding of how the modulus of continuity of a solution of the heat equation changes with time. This is based on the fact that the number of zeroes of a solution, or the number of intersections of two solutions, does not increase in time. This result was first observed by Sturm in 1836 for solutions of the linear heat equation, and refined into a very general tool more recently, particularly through work of Hiroshi Matano, Sigurd Angenent and others. Roughly speaking, as long as new zeroes (or intersections) are not introduced on the boundary or at infinity, then new ones cannot appear. It is also true that the number strictly decreases whenever a zero (or intersection) becomes degenerate in the sense that the first derivative also vanishes, but we will not need this fact.

Consider a bounded smooth solution u : R × R+ → [−M, M] of the heat equation on the real line. We will use the zero-counting argument to compare u with the special solution of (1) with the same range [−M, M] given by

$$\varphi(x,t) = M\,\operatorname{erf}\!\left(\frac{x-a}{2\sqrt{t}}\right),$$

for any a ∈ R. Since u is smooth and φ(.,t) approaches a Heaviside function as t → 0, for any ε > 0, we have exactly one intersection between (1 + ε)φ(.,t) and u(.,t) for t > 0 sufficiently small. Since the number of intersections does not increase with time (noting that (1 + ε)φ → (1 + ε)M > u as s → ∞ and (1 + ε)φ → −(1 + ε)M < u as s → −∞, so no new zeroes are produced near infinity), we have at most one intersection between (1 + ε)φ(.,t) and u(.,t) for every t > 0; on the other hand the asymptotics of φ near s = ±∞ also imply that there is at least one intersection, by the intermediate value theorem. Therefore we have exactly one intersection between (1 + ε)φ(.,t) and u(.,t) for each t > 0. For any given x ∈ R and t > 0, there is a unique a ∈ R such that (1 + ε)φ(x,t) = u(x,t). Since there is only one intersection between u(.,t) and (1 + ε)φ(.,t), and since (1 + ε)φ(s,t) > u(s,t) for large s, we have u(x + s,t) < (1 + ε)φ(x + s,t) for s > 0, and u(x + s,t) > (1 + ε)φ(x + s,t) for s < 0. This implies

$$|u(x+s,t)-u(x,t)| \le (1+\varepsilon)\,|\varphi(x+s,t)-\varphi(x,t)| \le 2(1+\varepsilon)\,\varphi\!\left(\frac{|s|}{2},\,t\right)$$

by Lemma 1 (where φ on the right-hand side is taken with a = 0), since φ is odd, increasing, and concave for positive values. We conclude that $\omega_u(s,t) \le (1+\varepsilon)\varphi(s,t)$. Finally, letting ε → 0 we deduce that $\omega_u \le \varphi$. This gives a universal bound on the modulus of continuity for solutions of the heat equation, depending only on M. Notice that this result is sharp, since equality holds in the particular case where u = φ. From the modulus of continuity estimate, we can also deduce a sharp gradient estimate: Taking y → x, we conclude that $|u'(x,t)| \le \varphi'(0,t) = \frac{M}{\sqrt{\pi t}}$. Again, this is sharp because equality holds on the solution φ.

We remark that this argument is very robust, and applies equally well for solutions of other parabolic equations such as the p-Laplacian heat flows and graphical curve shortening flow. In fact, interpreted in the right way, this idea gives sharp estimates for arbitrary parabolic equations in one dimension, by comparison to solutions which approach a Heaviside function at the initial time. Unfortunately there is no good analogue of the zero-counting argument known in higher dimensions, so we must find other ways to control the modulus of continuity.
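As a sanity check of the two bounds just obtained, one can test them on an explicit bounded solution. The Python sketch below uses u(x,t) = exp(−t) sin(x), which solves the heat equation on the real line with values in [−1, 1], so that M = 1; this particular solution, the grids, and the helper names are choices made for this example and are not taken from the lectures.

```python
import math

M = 1.0  # u(x, t) = exp(-t) sin(x) takes values in [-M, M]

def phi(s, t):
    """Comparison solution M erf(s / (2 sqrt(t))) of the one-dimensional heat equation (a = 0)."""
    return M * math.erf(s / (2.0 * math.sqrt(t)))

def modulus_of_u(s, t):
    """Modulus of continuity of u at (s, t).

    Since sin(c + s) - sin(c - s) = 2 cos(c) sin(s), the supremum of
    |u(y, t) - u(x, t)| / 2 over |y - x| = 2s equals exp(-t) |sin(s)|.
    """
    return math.exp(-t) * abs(math.sin(s))

def max_gradient_of_u(t):
    """Supremum over x of |u_x(x, t)| = exp(-t) |cos(x)|, which is exp(-t)."""
    return math.exp(-t)

times = [0.01 * j for j in range(1, 501)]          # t in (0, 5]
separations = [0.01 * i for i in range(1, 1001)]   # s in (0, 10]

# Both quantities should be negative if the bounds hold on the grid.
worst_modulus_gap = max(modulus_of_u(s, t) - phi(s, t) for t in times for s in separations)
worst_gradient_gap = max(max_gradient_of_u(t) - M / math.sqrt(math.pi * t) for t in times)

if __name__ == "__main__":
    print(worst_modulus_gap, worst_gradient_gap)
```

For this solution neither bound is attained: the ratio of exp(−t) to M/√(πt) is largest at t = 1/2, where it is about 0.76.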


In fact the result we just obtained does have a direct analogue in higher dimensions, but to prove it we instead use a two-point maximum principle argument.

1.3 The heat equation on Euclidean space

Now let us consider the case of the heat equation (2) on R^n. For simplicity, we will start by considering solutions which approach a constant at infinity (this assumption can be removed but makes the argument simpler). The maximum principle is a tool which can be used to show that pointwise inequalities (i.e. some function having a definite sign) can be preserved under a parabolic flow. In order to apply this to control the modulus of continuity, we first observe that the statement “ω is a modulus of continuity for u” is equivalent to the non-positivity of the function Z defined by

$$Z(x,y,t) = u(y,t) - u(x,t) - 2\omega\!\left(\frac{|y-x|}{2},\,t\right).$$

Thus we can try to apply a maximum principle to keep Z non-positive, if it is initially so (that is, if ω(., 0) is a modulus of continuity for u(., 0)). The price we pay for doing this is that Z is now a function of two points x and y as well as of t.

Proposition 1. Suppose that u is a smooth solution to the heat equation (2) on R^n with u(x,t) → 0 as |x| → ∞ and |u(x,t)| ≤ M for all (x,t). Suppose that ω : R+ × R+ → R+ satisfies (1) with ω(0,t) = 0 and $\lim_{s\to\infty}\omega(s,t) \ge M$, and such that ω(., 0) is a modulus of continuity for u(., 0). Then ω(.,t) is a modulus of continuity for u(.,t) for each t ≥ 0.

Proof. Fix ε > 0, and let $\tilde\omega(s,t) = \omega(s,t) + \varepsilon(1+t)$. We will prove that $Z(x,y,t) = u(y,t) - u(x,t) - 2\tilde\omega\!\left(\frac{|y-x|}{2},t\right)$ is strictly negative. Note that Z ≤ −ε on {t = 0} and {x = y} and as |y − x| → ∞, and since u → 0 at spatial infinity this also implies that Z < 0 for either |x| or |y| large. It follows that Z remains strictly negative unless there is some (x0, y0, t0) with x0 ≠ y0, t0 > 0, and Z(x0, y0, t0) = 0 and Z(x, y, t) ≤ 0 for all x and y and all t ∈ [0, t0]. We will derive a contradiction by considering this point: Here we have $\frac{\partial Z}{\partial t} \ge 0$, and the spatial derivatives satisfy DZ = 0 and D²Z ≤ 0. The first inequality gives

$$0 \le \frac{\partial Z}{\partial t}\Big|_{(x_0,y_0,t_0)} = \Delta u\big|_{(y_0,t_0)} - \Delta u\big|_{(x_0,t_0)} - 2\,\frac{\partial\tilde\omega}{\partial t}\Big|_{\left(\frac{|y_0-x_0|}{2},\,t_0\right)}. \tag{3}$$

The inequality D²Z ≤ 0 is in the sense of positive-definiteness of matrices (that is, −D²Z is positive semi-definite), and this gives a lot of information since it involves a (2n) × (2n) matrix (with n directions corresponding to moving x, and the other n corresponding to moving y). We will only need to compute certain components of this matrix, chosen to extract useful inequalities. In order to do this, we first choose an orthonormal basis for R^n for which $e_n = \frac{y_0-x_0}{|y_0-x_0|}$. Then we consider two special variations:

• Move x and y apart with equal velocities, i.e. $\frac{d}{ds}x = -e_n$, $\frac{d}{ds}y = e_n$. Then we have $\frac{d}{ds}|y-x| = 2$, and $\frac{d^2}{ds^2}|y-x| = 0$, so we obtain

$$0 \ge \frac{d^2Z}{ds^2} = D_nD_nu\big|_{(y_0,t_0)} - D_nD_nu\big|_{(x_0,t_0)} - 2\,\tilde\omega''\big|_{\left(\frac{|y_0-x_0|}{2},\,t_0\right)}.$$

• Move x and y in parallel in a direction orthogonal to the line between them, i.e. $\frac{d}{ds}x = \frac{d}{ds}y = e_i$ for some i < n. Then we have |y − x| constant, and so

$$0 \ge \frac{d^2Z}{ds^2} = D_iD_iu\big|_{(y_0,t_0)} - D_iD_iu\big|_{(x_0,t_0)}.$$

Adding these inequalities over all i gives

$$0 \ge \Delta u\big|_{(y_0,t_0)} - \Delta u\big|_{(x_0,t_0)} - 2\,\tilde\omega''\big|_{\left(\frac{|y_0-x_0|}{2},\,t_0\right)}. \tag{4}$$

Now we combine the inequalities (3) and (4) to give $\partial_t\tilde\omega \le \tilde\omega''$ at the point $\left(\frac{|y_0-x_0|}{2},\,t_0\right)$. But this is impossible, since

$$\partial_t\tilde\omega = \partial_t\omega + \varepsilon = \omega'' + \varepsilon > \omega'' = \tilde\omega''.$$

This is a contradiction. Therefore such a point (x0, y0, t0) cannot occur, and Z stays negative. Finally, letting ε approach zero gives the result of the Proposition, that ω(.,t) is a modulus of continuity for u(.,t) for each t.

It may be useful to make some remarks here about the particular directions which were used to obtain second derivative inequalities: The guide here is the ‘equality case’, which is when the solution is a ‘one-dimensional’ solution of the form u(x,t) = f(x · e1, t), where f is odd, increasing, and concave for positive values. As we saw earlier, in this case the modulus of continuity of u is f, and equality is attained at the points of the form (x, −x, t) for x > 0. That is, the equality set in this special case is {(x, y, t) : y · e1 = −x · e1}. In the argument we are using the inequality D²Z ≤ 0 in some direction, so if we want to get a sharp inequality (and not throw anything away) we can only use those directions for which D²Z = 0 in the equality case. It is easy to see that these are just the ones that we chose in the argument: The directions where equality holds are spanned by those where x and y move in parallel in a direction orthogonal to e1, and that where x and y move in opposite directions along the line between them. This principle is often a useful guide in constructing these arguments for two-point maximum principles. As a consequence of the above modulus of continuity estimate, we can deduce the same sharp gradient bound as the zero-counting argument gave us, but now in any dimension:

Multi-point maximum principles and eigenvalue estimates


Corollary 2. Let $u$ be a smooth solution to the heat equation (2) on $\mathbb{R}^n$, with $|u(x,t)| \le M$ and approaching zero at infinity. Then
\[ |Du(x,t)| \le \frac{M}{\sqrt{\pi t}} \]
for all $x \in \mathbb{R}^n$ and $t > 0$.

Proof. We can apply the Proposition with $\omega(s,t) = M\,\mathrm{erf}\left(\frac{s}{2\sqrt{t+a}}\right)$ for any sufficiently small $a > 0$. Letting $a \to 0$ gives the same bound on the modulus of continuity as for the one-dimensional case, and the same gradient bound.
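The constant in Corollary 2 can be checked numerically. The sketch below is only an illustration, assuming the barrier $M\,\mathrm{erf}(s/(2\sqrt{t+a}))$ from the proof with $a = 0$; the function names and the step size `h` are hypothetical choices, not part of the text. It compares a finite-difference slope of the barrier at $s = 0$ with the closed form $M/\sqrt{\pi t}$.

```python
import math

def omega(s, t, M=1.0, a=0.0):
    """Barrier M * erf(s / (2 * sqrt(t + a))) used in the proof of Corollary 2."""
    return M * math.erf(s / (2.0 * math.sqrt(t + a)))

def slope_at_zero(t, M=1.0, a=0.0, h=1e-6):
    """Central-difference estimate of the s-derivative of omega at s = 0."""
    return (omega(h, t, M, a) - omega(-h, t, M, a)) / (2.0 * h)

def gradient_bound(t, M=1.0):
    """Closed-form bound M / sqrt(pi * t) stated in Corollary 2."""
    return M / math.sqrt(math.pi * t)

if __name__ == "__main__":
    for t in (0.1, 1.0, 10.0):
        # The two columns should agree up to discretisation error.
        print(t, slope_at_zero(t, M=2.0), gradient_bound(t, M=2.0))
```

The agreement reflects the identity $\frac{d}{ds}\mathrm{erf}(s)\big|_{s=0} = 2/\sqrt{\pi}$, which is where the factor $\sqrt{\pi}$ in the bound comes from.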

1.4 The Neumann heat equation on a bounded domain

Next we consider solutions of the heat equation on bounded domains. The simplest case to consider is the Neumann condition, though (non-sharp) results can be obtained for other cases. The main result for the Neumann case is the following:

Proposition 2. Let $\Omega$ be a (smooth) bounded (strictly) convex domain in $\mathbb{R}^n$ with diameter $D = \sup\{|y - x| : x, y \in \Omega\}$, and let $u$ be a (smooth) solution of the heat equation (2) on $\Omega \times \mathbb{R}_+$, satisfying the Neumann condition $D_\nu u = 0$ on $\partial\Omega \times \mathbb{R}_+$. Suppose that $\omega : [0, D/2] \times \mathbb{R}_+ \to \mathbb{R}$ is a solution of the one-dimensional heat equation (1) such that $\omega'(s,t) \ge 0$ and $\omega(0,t) = 0$, and such that $\omega(\cdot, 0)$ is a modulus of continuity for $u(\cdot, 0)$. Then $\omega(\cdot,t)$ is a modulus of continuity for $u(\cdot,t)$ for each $t \ge 0$.

Proof. The proof is similar to the one we just gave, but we must also deal with the possibility that the maximum occurs on the boundary. As before, we first modify $\omega$ in order to obtain strict inequalities: Set $\tilde\omega(s,t) = \omega(s,t) + \varepsilon(1 + t) + \varepsilon s$. Since $Z$ is continuous on the compact set $\bar\Omega \times \bar\Omega$, it attains a maximum at each time, and the maximum is a continuous function of time. Therefore if $Z$ does not remain negative, there is a first time $t_0 > 0$ where the maximum of $Z$ reaches zero, and a point $(x_0, y_0)$ in $\bar\Omega \times \bar\Omega$ where this occurs. Since $Z \le -\varepsilon$ on $\{x = y\}$, we know that $x_0 \ne y_0$. This leaves two possibilities: Either $(x_0, y_0)$ is in the boundary $(\partial\Omega \times \bar\Omega) \cup (\bar\Omega \times \partial\Omega)$, or both $x_0$ and $y_0$ are interior points of $\Omega$. In the latter case, we derive a contradiction exactly as before, since $\partial_t\tilde\omega > \tilde\omega''$. So we need only consider the case where $x_0 \in \partial\Omega$ (the case $y_0 \in \partial\Omega$ is similar). We compute the derivative of $Z$ in the direction where $\dot x = -\nu$ (where $\nu$ is the outward-pointing unit normal to $\partial\Omega$ at $x_0$), and $\dot y = 0$. Since $D_\nu u|_{(x_0,t_0)} = 0$ by the Neumann condition, we have
\[ D_{(-\nu,0)} Z|_{(x_0,y_0,t_0)} = -\tilde\omega'|_{\left(\frac{|y_0-x_0|}{2},\,t_0\right)} \left\langle \frac{y_0 - x_0}{|y_0 - x_0|}, \nu \right\rangle > 0, \]
since $\langle y_0 - x_0, \nu\rangle < 0$ by the strict convexity of $\Omega$, and $\tilde\omega' = \omega' + \varepsilon > 0$. This contradicts the assumption that $Z$ attains a maximum at $(x_0, y_0)$ at time $t_0$, and the proof is complete.
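The sign condition $\langle y_0 - x_0, \nu\rangle < 0$ used in the boundary case is easy to see on a concrete strictly convex domain. The following sketch is an illustration only: it assumes $\Omega$ is the unit disk in the plane (so that $\nu(x) = x$ on the boundary) and samples a small polar grid; the helper names are hypothetical.

```python
import math

def inner_normal_product(theta, y):
    """<y - x, nu(x)> for x = (cos theta, sin theta) on the unit circle, where nu(x) = x."""
    x = (math.cos(theta), math.sin(theta))
    return (y[0] - x[0]) * x[0] + (y[1] - x[1]) * x[1]

def worst_case_on_disk(num_angles=12, radii=(0.0, 0.5, 1.0)):
    """Largest value of <y - x, nu(x)> over sampled boundary points x and disk points y != x."""
    worst = -math.inf
    for i in range(num_angles):
        theta = 2.0 * math.pi * i / num_angles
        for j in range(num_angles):
            angle = 2.0 * math.pi * j / num_angles
            for r in radii:
                if r == 1.0 and i == j:
                    continue  # skip the sample where y coincides with x
                y = (r * math.cos(angle), r * math.sin(angle))
                worst = max(worst, inner_normal_product(theta, y))
    return worst

if __name__ == "__main__":
    # A negative value means every sampled pair satisfies the strict inequality.
    print(worst_case_on_disk())
```

On the disk the product equals $y\cdot x - 1$, which is negative for every $y \ne x$ in the closed disk; this is the strict convexity that makes the boundary derivative of $Z$ positive.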


Ben Andrews

As before, we obtain sharp gradient estimates, determined by the gradient of the solution of the one-dimensional heat equation with Neumann condition and Heaviside initial data on the interval [−D/2, D/2]. In fact the modulus of continuity control allows us to prove a famous eigenvalue inequality, the Payne-Weinberger inequality:

1.5 The Payne-Weinberger inequality

Theorem 1 ([30]). If $\Omega$ is a bounded convex domain with diameter $D$ in $\mathbb{R}^n$, then the first Neumann eigenvalue
\[ \lambda_1^N(\Omega) = \inf\left\{ \frac{\int_\Omega |Du|^2}{\int_\Omega u^2} : u \in H^1(\Omega)\setminus\{0\},\ \int_\Omega u = 0 \right\} \]
is no less than $\frac{\pi^2}{D^2}$.

Proof. By approximation, it suffices to consider the case where the boundary of $\Omega$ is smooth and strictly convex. Let $u_1$ be the first eigenfunction of the Neumann Laplacian, so that $\Delta u_1 + \lambda_1^N u_1 = 0$ on $\Omega$, with $D_\nu u_1 = 0$ on $\partial\Omega$. Then define $u(x,t) = e^{-\lambda_1^N t} u_1(x)$, so that $u$ is a smooth solution of the Neumann heat equation on $\Omega$. We can apply Proposition 2 with
\[ \omega(s,t) = C e^{-\frac{\pi^2}{D^2}t} \sin\left(\frac{\pi s}{D}\right), \]
since this satisfies the one-dimensional heat equation on $[0, D/2]$ and is increasing and positive, provided we choose $C$ sufficiently large to ensure that $\omega(\cdot, 0)$ is a modulus of continuity for $u_1$ (this can always be done since $u_1$ is smooth).

This implies that for any $x$ and $y$,
\[ |u(y,t) - u(x,t)| \le 2\omega(D/2, t) = 2C e^{-\frac{\pi^2}{D^2}t}. \]
In particular this implies
\[ \operatorname{osc} u_1 \, e^{-\lambda_1^N t} \le 2C e^{-\frac{\pi^2}{D^2}t} \]
for each $t \ge 0$. Taking $t \to \infty$ implies that $\lambda_1^N \ge \frac{\pi^2}{D^2}$, as claimed.
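The two properties of the comparison function used in this proof (it solves the one-dimensional heat equation and is increasing on $[0, D/2]$) can be verified numerically. The sketch below assumes the barrier $\omega(s,t) = Ce^{-\pi^2 t/D^2}\sin(\pi s/D)$; the values $D = 2$, $C = 1$, the sample points, and the step `h` are arbitrary illustrative choices.

```python
import math

def omega(s, t, D=2.0, C=1.0):
    """Comparison function C * exp(-pi^2 t / D^2) * sin(pi s / D)."""
    return C * math.exp(-math.pi ** 2 * t / D ** 2) * math.sin(math.pi * s / D)

def heat_residual(s, t, D=2.0, C=1.0, h=1e-3):
    """Finite-difference value of omega_t - omega_ss, which should be close to zero."""
    w_t = (omega(s, t + h, D, C) - omega(s, t - h, D, C)) / (2.0 * h)
    w_ss = (omega(s + h, t, D, C) - 2.0 * omega(s, t, D, C) + omega(s - h, t, D, C)) / h ** 2
    return w_t - w_ss

def is_increasing_on_half_interval(t, D=2.0, C=1.0, samples=50):
    """Check monotonicity of s -> omega(s, t) on a uniform grid of [0, D/2]."""
    values = [omega(0.5 * D * k / samples, t, D, C) for k in range(samples + 1)]
    return all(v0 < v1 for v0, v1 in zip(values, values[1:]))

if __name__ == "__main__":
    print(heat_residual(0.3, 0.5), is_increasing_on_half_interval(0.5))
```

The decay rate $\pi^2/D^2$ of this barrier is exactly what is compared with $\lambda_1^N$ in the last step of the proof.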

Notes: The modulus of continuity bounds as presented here were developed in a series of papers with Julie Clutterbuck [2, 3]. The proof of the Payne-Weinberger inequality was included in [4].


Lecture 2: Heat equations on Riemannian manifolds and nonlinear eigenvalues

2.1 Riemannian manifolds: Distance, Curvature and heat equations

In this lecture we will investigate the application of the ideas we developed in the first lecture to a more general context, including equations more general than the heat equation, and domains more general than Euclidean domains. In order to do this we first need to review some Riemannian geometry.

A Riemannian manifold is a smooth manifold $M$ equipped with a Riemannian metric, which is a smoothly varying inner product $g_x$ on each tangent space $T_xM$. This allows us to make sense of the length of a smooth curve $\sigma : [a,b] \to M$ by setting
\[ L[\sigma] = \int_a^b \sqrt{g_{\sigma(s)}(\sigma'(s), \sigma'(s))}\,ds, \]
where $\sigma'(s) \in T_{\sigma(s)}M$ is the tangent vector to the curve. The length then allows us to define a distance function (if $M$ is path-connected), called the Riemannian distance, by setting
\[ d(x,y) := \inf\{L[\sigma] : \sigma \in C^\infty([0,1], M),\ \sigma(0) = x,\ \sigma(1) = y\}. \]
This defines a distance function in the sense of metric spaces. If the manifold is metrically complete, then the Hopf-Rinow theorem tells us that the distance between points is attained by a (possibly non-unique) geodesic, which is a locally length-minimizing curve. In the case where the manifold is $\mathbb{R}^n$ and the Riemannian metric is the Euclidean inner product, this is the usual Euclidean distance and the geodesics are straight lines. This is still true if $M$ is a convex subset of $\mathbb{R}^n$, but not for non-convex subsets. In general, the distance function is not smooth (for a simple example, consider the unit circle with arc length parameter $s$ from some given point $p$, on which the distance from $p$ is non-smooth at the antipodal point $s = \pi$, where it looks like $\pi - |s - \pi|$).

The Riemannian metric also defines a connection, which is a differential operator acting on vector fields. This is called the Riemannian connection or Levi-Civita connection, and is uniquely determined by the conditions that it be symmetric (so that $\nabla_i\partial_j = \nabla_j\partial_i$ in any local chart) and that it is compatible with the Riemannian metric, so that differentiating an inner product gives the same result as differentiating the vector fields inside the inner product, in the sense that $d_w g(U,V) = g(\nabla_wU, V) + g(U, \nabla_wV)$ for any vector fields $U$ and $V$ and any vector $w$. The geodesics are then (up to reparametrisation) the curves $\sigma$ which have parallel tangent vector, so that $\nabla_s\sigma'(s) = 0$ along $\sigma$. Accordingly, there is a unique geodesic starting at any point


with any given initial tangent vector, and this defines the exponential map: This takes any tangent vector $v$ at a point $x$ of $M$ to the endpoint $\exp_x(v)$ of the geodesic $\sigma : [0,1] \to M$ with $\sigma(0) = x$ and $\sigma'(0) = v$. This always exists for small $v$, but may not exist for all $v$ (the Hopf-Rinow theorem guarantees global existence if $M$ is metrically complete, however).

The curvature tensor measures the failure of the commutation of differentiation using the Riemannian connection: For vector fields $U, V, W, Z$ it is defined by
\[ R(U,V,W,Z) = g(\nabla_V\nabla_UW - \nabla_U\nabla_VW - \nabla_{[V,U]}W, Z). \]
Given a pair of orthonormal unit vectors $e_1$ and $e_2$, the sectional curvature of the plane they generate is defined by $R(e_1, e_2, e_1, e_2)$ (this is independent of the choice of orthonormal basis for this plane). Given a unit vector $e_1$, the Ricci curvature in direction $e_1$ is defined by
\[ \mathrm{Rc}(e_1, e_1) = \sum_{j>1} R(e_1, e_j, e_1, e_j), \]
for any orthonormal basis $\{e_i\}$ completing $e_1$. The Ricci curvature will be particularly important in what follows, as it arises naturally when computing Laplacians of the distance function.

For a smooth function $f$ on $M$, the Hessian of $f$ is the second derivative computed using the Riemannian connection, so that $\nabla^2f(u,v) = D_u(D_vf) - D_{\nabla_uv}f$ for any vectors $u$ and $v$. Equivalently, $\nabla^2f(e,e) = \frac{d^2}{ds^2} f\circ\sigma(s)\big|_{s=0}$, where $\sigma$ is the geodesic in $M$ with $\sigma'(0) = e$. The Laplacian $\Delta f$ of $f$ is the trace of the Hessian with respect to the metric (equivalently, the sum of diagonal elements with respect to any orthonormal basis). This allows us to make sense of the heat equation on a Riemannian manifold.

More generally, we will consider quasilinear equations (with coefficients depending on the gradient) which are 'isotropic', in the sense that the diffusion coefficients are unchanged by any orthogonal transformation fixing the gradient vector. This means that the equation has the form
\[ \frac{\partial u}{\partial t} = \mathcal{L}[u] := \left[ a(|Du|)\frac{u_iu_j}{|Du|^2} + b(|Du|)\left(\delta^{ij} - \frac{u_iu_j}{|Du|^2}\right) \right] \nabla_i\nabla_ju, \tag{5} \]
where $a$ and $b$ are positive functions, so that $a$ is the diffusion coefficient in the gradient direction and $b$ the coefficient in the directions orthogonal to the gradient. Examples include the heat equation with $a = b = 1$, the $p$-Laplacian heat flows with $a = (p-1)|Du|^{p-2}$ and $b = |Du|^{p-2}$, and the graphical mean curvature flow with $a = \frac{1}{1+|Du|^2}$ and $b = 1$.
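To make the structure of equation (5) concrete, the sketch below builds the coefficient matrix $a(|p|)\frac{p_ip_j}{|p|^2} + b(|p|)\big(\delta_{ij} - \frac{p_ip_j}{|p|^2}\big)$ for a given Euclidean gradient vector $p$ and evaluates it along the gradient and along a tangential direction. This is an illustration under simplifying assumptions (flat space, plain Python lists); the function names are hypothetical, and only the heat equation and graphical mean curvature flow coefficients are shown.

```python
import math

def coefficient_matrix(p, a, b):
    """Matrix a(|p|) p_i p_j / |p|^2 + b(|p|) (delta_ij - p_i p_j / |p|^2) for p != 0."""
    norm = math.sqrt(sum(x * x for x in p))
    if norm == 0.0:
        raise ValueError("the gradient vector must be nonzero")
    n = len(p)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            proj = p[i] * p[j] / norm ** 2          # projection onto the gradient direction
            delta = 1.0 if i == j else 0.0
            A[i][j] = a(norm) * proj + b(norm) * (delta - proj)
    return A

def quadratic_form(A, v):
    """Value of v^T A v."""
    n = len(v)
    return sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))

# Coefficient pairs (a, b) for two of the examples in the text.
HEAT = (lambda r: 1.0, lambda r: 1.0)
MEAN_CURVATURE = (lambda r: 1.0 / (1.0 + r * r), lambda r: 1.0)

if __name__ == "__main__":
    p = (3.0, 4.0)
    e = (0.6, 0.8)      # unit vector along the gradient
    tau = (-0.8, 0.6)   # unit vector orthogonal to the gradient
    A = coefficient_matrix(p, *MEAN_CURVATURE)
    print(quadratic_form(A, e), quadratic_form(A, tau))
```

Evaluating the quadratic form along `e` returns $a(|p|)$ and along `tau` returns $b(|p|)$, which is the sense in which the operator is isotropic: the matrix depends on $p$ only through its length and direction, with one eigenvalue for the gradient direction and one for its orthogonal complement.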

2.2 The Ricci non-negative case

In this lecture we will only consider the simplest case, where the Ricci curvature is non-negative on $M$. We will assume for now that $M$ is compact. We allow the possibility that $M$ has boundary, but in that case we require that the boundary is


convex. In this case there is a version of the Hopf-Rinow theorem which says that the distance between any pair of points in $M$ is attained by a geodesic. The result is as follows:

Proposition 3. Let $M$ be a compact Riemannian manifold with convex (or empty) boundary, with non-negative Ricci curvature and diameter $D = \sup\{d(x,y) : x, y \in M\}$. Let $u$ be a smooth solution of equation (5) with Neumann boundary condition $D_\nu u = 0$ on $\partial M$, where $a$ and $b$ are positive functions. Suppose that $\omega_0$ is a modulus of continuity for $u(\cdot, 0)$, which is increasing, and suppose $\omega$ satisfies $\partial_t\omega \ge a(\omega')\omega''$ on $[0, D/2] \times [0, T)$, with $\omega(0,t) = 0$ and $\omega'(D/2,t) = 0$. Then $\omega(\cdot,t)$ is a modulus of continuity for $u(\cdot,t)$ for each $t \ge 0$.

Proof. Assuming $a$ and $b$ are smooth and positive, we can find a sequence $\omega_\varepsilon$ which strictly decreases uniformly to $\omega$ as $\varepsilon \to 0$, such that $\partial_t\omega_\varepsilon > a(\omega_\varepsilon')\omega_\varepsilon''$ and $\omega_\varepsilon'(s,t) > 0$ on $(0, D/2) \times [0, T)$. As before, define $Z : M \times M \times [0,T) \to \mathbb{R}$ by
\[ Z(x,y,t) = u(y,t) - u(x,t) - 2\omega_\varepsilon\left(\frac{d(x,y)}{2}, t\right), \]
where $d$ is the Riemannian distance. By assumption $Z$ is strictly negative at $t = 0$, and where $x = y$ for any $t$. We consider a point $(x_0, y_0, t_0)$ where $Z$ first becomes zero, noting that we have $y_0 \ne x_0$.

Now we must confront the difficulty that $Z$ is in general not a smooth function: Since the Riemannian distance is not smooth (just Lipschitz in general) we cannot just differentiate $Z$ in some direction to deduce inequalities. To get around this problem, we go back to the definition of distance as an infimum of lengths of paths, and extend $Z$ to a function which lives on a space of paths. In doing this we appear to sacrifice something, since we now have to deal with functions defined on an infinite-dimensional space, but we gain smoothness: We define $\tilde Z$ to be the function on $C^\infty([0,1], M) \times [0,T)$ defined by
\[ \tilde Z[\sigma, t] = u(\sigma(1), t) - u(\sigma(0), t) - 2\omega_\varepsilon\left(\frac{L[\sigma]}{2}, t\right). \]
This corresponds naturally to $Z$ when $\sigma$ is a length-minimising geodesic, and since $\omega_\varepsilon$ is non-decreasing in the first argument we always have the inequality $\tilde Z[\sigma,t] \le Z(\sigma(0), \sigma(1), t)$, with equality if $\sigma$ is a length-minimising geodesic. Importantly for us, $\tilde Z$ is smooth, in the sense that if $\sigma : [0,1] \times I \to M$ is a smooth family of paths, then $\tilde Z[\sigma(\cdot, r), t]$ is a smooth function of $r$ and $t$.

It is useful to compute the derivatives of the length under such smooth variations: For the first derivative, we have
\[ \frac{d}{dr} L[\sigma(\cdot, r)] = \int_0^1 \frac{g(\sigma_s, \nabla_r\sigma_s)}{|\sigma_s|}\,ds \tag{6} \]
where the subscripts denote derivatives, and $s$ represents the parameter along the curve, while $r$ represents the variation parameter through the family of curves. Using the symmetry of the connection, and the compatibility of the connection with the metric, and assuming we have parametrised at constant speed (so that $|\sigma_s|$ is constant in $s$) we can re-write this as follows:


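\[ \frac{d}{dr} L[\sigma(\cdot, r)] = \int_0^1 \frac{g(\sigma_s, \nabla_s\sigma_r)}{|\sigma_s|}\,ds = g\left(\frac{\sigma_s}{|\sigma_s|}, \sigma_r\right)\bigg|_0^1 - \int_0^1 \frac{g(\nabla_s\sigma_s, \sigma_r)}{|\sigma_s|}\,ds. \tag{7} \]

In particular if $\sigma$ is a geodesic then $\nabla_s\sigma_s = 0$, and the second term vanishes. The second derivative may be computed by differentiating (6) with respect to $r$. We do this assuming that $\sigma$ is a geodesic (so that $|\sigma_s| = L$), and obtain the following:
\[ \begin{aligned} \frac{d^2}{dr^2} L[\sigma(\cdot, r)]\bigg|_{r=0} &= \frac{1}{L}\int_0^1 \left|(\nabla_r\sigma_s)^\perp\right|^2 + g(\sigma_s, \nabla_r\nabla_r\sigma_s)\,ds \\ &= \frac{1}{L}\int_0^1 \left|(\nabla_r\sigma_s)^\perp\right|^2 + g(\sigma_s, \nabla_s\nabla_r\sigma_r) - R(\sigma_r, \sigma_s, \sigma_r, \sigma_s)\,ds \\ &= \frac{1}{L}\int_0^1 \left|(\nabla_r\sigma_s)^\perp\right|^2 - R(\sigma_r, \sigma_s, \sigma_r, \sigma_s)\,ds + g\left(\frac{\sigma_s}{|\sigma_s|}, \nabla_r\sigma_r\right)\bigg|_0^1. \end{aligned} \tag{8} \]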

Using (7) we can rule out the possibility that $x_0$ or $y_0$ is in the boundary, in exactly the same way as in the Euclidean setting: If $\sigma_0$ is a minimising geodesic from $x_0 \in \partial M$ to $y_0$, define a smooth variation $\sigma : [0,1] \times (-\delta, \delta) \to M$ as follows: Set $e = -\nu \in T_{x_0}M$, and parallel transport along $\sigma_0$ to obtain $e(s) \in T_{\sigma_0(s)}M$ such that $\nabla_se = 0$. Then define $\sigma(s,r) = \exp_{\sigma_0(s)}(r(1-s)e(s))$. Note that when $r = 0$ this returns $\sigma_0(s)$, and $\sigma$ exists for small $r$ and is smooth in $s$ and $r$. We have $\sigma_r(s) = (1-s)e(s)$, so $\sigma_r(0) = -\nu$ and $\sigma_r(1) = 0$. This gives (since $D_\nu u = 0$ by the Neumann condition)
\[ \frac{d}{dr}\tilde Z[\sigma(\cdot, r), t_0]\bigg|_{r=0} = -\omega_\varepsilon'\, g\left(\frac{\sigma_s}{|\sigma_s|}, \sigma_r\right)\bigg|_0^1 = \omega_\varepsilon'\, g_{x_0}\left(\frac{\sigma_0'(0)}{|\sigma_0'(0)|}, -\nu\right) > 0 \]
since $\sigma_0'$ points strictly into $M$ at $x_0$ and $\omega_\varepsilon' > 0$. This contradicts the claim that $Z(\cdot,\cdot,t_0)$ has a maximum at $(x_0, y_0)$, since this would imply that $\tilde Z(\cdot, t_0)$ has a maximum at $\sigma_0$.

It follows therefore that $Z(\cdot,\cdot,t_0)$ attains a maximum for $x_0 \ne y_0$ both interior points of $M$, and therefore that $\tilde Z(\cdot, t_0)$ attains a maximum at $\sigma_0$, where $\sigma_0$ is a minimising geodesic from $x_0$ to $y_0$. For convenience we choose an orthonormal basis $\{e_i\}$ for $T_{x_0}M$ with $e_n = \frac{\sigma_0'}{|\sigma_0'|}$, and parallel transport along $\sigma_0$ to get an orthonormal basis $\{e_i(s)\}$ at each point $\sigma_0(s)$. Since $\sigma_0'$ is parallel along $\sigma_0$ we have that $e_n(s) = \frac{\sigma_0'(s)}{|\sigma_0'(s)|}$ for each $s$.

The first variation formula (7) yields the following: Since $\sigma_0$ is a maximum point of $\tilde Z(\cdot, t_0)$, we have for any smooth variation
\[ 0 = \frac{d}{dr}\tilde Z[\sigma(\cdot, r), t_0]\bigg|_{r=0} = D_{\sigma_r(1)}u - D_{\sigma_r(0)}u - \omega_\varepsilon'\Big( g_{y_0}(e_n(1), \sigma_r(1)) - g_{x_0}(e_n(0), \sigma_r(0)) \Big), \]


and therefore since we can choose variations with arbitrary $\sigma_r(0)$ and $\sigma_r(1)$ we must have $Du(y_0, t_0) = \omega_\varepsilon' e_n(1)$ and $Du(x_0, t_0) = \omega_\varepsilon' e_n(0)$. It follows that the time derivative of $\tilde Z$ at $(x_0, y_0, t_0)$ satisfies (since $\tilde Z(\sigma_0, t)$ increases to zero as $t$ increases to $t_0$) 0 ≤

∂ ˜ Z[σ0 ,t0 ] = a(ωε )∇n ∇n u(y0 ,t0 ) + b(ωε ) ∑ ∇i ∇i u(y0 ,t0 ) ∂t i 0 for all z ∈ [a, b]. Let ψ : [m, M] → [a, b] be the inverse function of ϕ . Then

\[ \psi(u(y)) - \psi(u(x)) \le |y - x| \]
for all $x, y \in \Omega$.

This two-point estimate has an immediate corollary obtained by allowing $y$ and $x$ to approach each other:

Corollary 3. Under the assumptions above, $|Du(x)| \le \phi'\circ\psi(u(x))$ for all $x \in \Omega$.

That is, the gradient of $u$ is bounded by that of $\phi$ at the point with the same height. This includes all of the gradient estimates mentioned above, but does not require any special form of the equation.

The proof proceeds as follows: We let $Z(x,y) = \psi(u(y)) - \psi(u(x)) - |y - x|$, and suppose that a positive supremum of $Z$ occurs at some point $(x_0, y_0) \in \bar\Omega \times \bar\Omega$. Since $Z = 0$ when $x = y$ we know that $y_0 \ne x_0$. We can rule out the possibility that $x_0$ or $y_0$ is on the boundary of $\Omega$: Suppose $x_0 \in \partial\Omega$. Then we compute
\[ \frac{d}{ds} Z(x_0 - s\nu(x_0), y_0)\bigg|_{s=0} = -\psi' D_\nu u(x_0) - \left\langle \frac{y_0 - x_0}{|y_0 - x_0|}, \nu(x_0) \right\rangle > 0, \]
where we used the Neumann boundary condition and the convexity of the domain to get the last inequality. This contradicts the assumption that $(x_0, y_0)$ is a maximum point, so this case cannot occur. The case where $y_0 \in \partial\Omega$ is similar.

This leaves us with the conclusion that $x_0$ and $y_0$ are distinct interior points of $\Omega$. We choose an orthonormal basis $\{e_i\}$ such that $e_n = \frac{y_0 - x_0}{|y_0 - x_0|}$. Then we compute variations of $Z$ in various directions: The first derivatives give
\[ 0 = \frac{d}{ds} Z(x_0 + se, y_0 + s\tilde e)\bigg|_{s=0} = \left\langle e, -\psi'(u(x_0))Du(x_0) + \frac{y_0 - x_0}{|y_0 - x_0|} \right\rangle + \left\langle \tilde e, \psi'(u(y_0))Du(y_0) - \frac{y_0 - x_0}{|y_0 - x_0|} \right\rangle. \]

Since $\psi\circ\phi$ is the identity map, we have $\psi'(u(y)) = \frac{1}{\phi'(z_y)}$, where $z_y = \psi(u(y))$, so we conclude that $Du(y_0) = \phi'(z_{y_0})e_n$ and $Du(x_0) = \phi'(z_{x_0})e_n$. Next we compute useful parts of the second derivatives: Moving $x_0$ towards $y_0$ we have
\[ 0 \ge \frac{d^2}{ds^2} Z(x_0 + se_n, y_0)\bigg|_{s=0} = -\psi'(u(x_0))D_nD_nu(x_0) - \psi''(u(x_0))(D_nu(x_0))^2; \]
Similarly, moving $y_0$ towards $x_0$ gives
\[ 0 \ge \frac{d^2}{ds^2} Z(x_0, y_0 - se_n)\bigg|_{s=0} = \psi'(u(y_0))D_nD_nu(y_0) + \psi''(u(y_0))(D_nu(y_0))^2. \]


Since $\psi'(\phi(z))\phi'(z) = 1$, differentiating further gives
\[ \psi''(\phi(z))\phi'(z)^2 + \psi'(\phi(z))\phi''(z) = 0, \]
so that
\[ \psi''(u(x)) = -\frac{\phi''(z_x)}{\phi'(z_x)^3}. \]
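The inverse-function identity $\psi''(\phi(z)) = -\phi''(z)/\phi'(z)^3$ can be sanity-checked numerically. The sketch below uses $\phi = \sin$ on an interval inside $(-\pi/2, \pi/2)$, where $\phi' > 0$, with inverse $\psi = \arcsin$; this pair is a hypothetical stand-in chosen only because both functions are available in closed form, and the step size `h` is an arbitrary choice.

```python
import math

def phi(z):
    """Sample increasing profile (assumption for illustration): phi = sin."""
    return math.sin(z)

def dphi(z):
    return math.cos(z)

def ddphi(z):
    return -math.sin(z)

def psi(u):
    """Inverse of phi on (-pi/2, pi/2)."""
    return math.asin(u)

def second_derivative(f, x, h=1e-4):
    """Central second difference of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

def psi_second_from_phi(z):
    """Right-hand side -phi''(z) / phi'(z)^3 of the identity."""
    return -ddphi(z) / dphi(z) ** 3

if __name__ == "__main__":
    z = 0.7
    print(second_derivative(psi, phi(z)), psi_second_from_phi(z))
```

With this identity, the second-derivative inequalities at $x_0$ and $y_0$ reduce to comparisons between $D_nD_nu$ and $\phi''$, because the terms involving $\psi''$ and $(D_nu)^2 = \phi'^2$ combine into $-\phi''/\phi'$.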

Substituting this above gives
\[ D_nD_nu(y_0) \le \phi''(z_{y_0}) \quad\text{and}\quad D_nD_nu(x_0) \ge \phi''(z_{x_0}). \]

Next we move $x$ and $y$ in parallel in a direction orthogonal to the line between them:
\[ 0 \ge \frac{d^2}{ds^2} Z(x_0 + se_i, y_0 + se_i)\bigg|_{s=0} = \frac{D_iD_iu(y_0)}{\phi'(z_{y_0})} - \frac{D_iD_iu(x_0)}{\phi'(z_{x_0})}. \tag{20} \]

Now we can use the equation: 0 = ai j (Du(y0 ))Di D j u(y0 ) + q(u(y0 ), |Du(y0 )|) = a(ϕ (zy0 ), ϕ (zy0 ))Dn Dn u(y0 ) + ∑ b(ϕ (zy0 ), ϕ (zy0 ))Di Di u(y0 ) + q(ϕ (zy0 ), ϕ (zy0 )), i 0, f  (0) < 0, f  (θ ) > 0.

(8) (9)

A Note on Liouville type results for a fractional obstacle problem


Observe that the assumptions on f imply that the associated potential is unbalanced, which is a necessary condition to observe the propagation of a front with a positive speed [1, 9]. Thus it seems reasonable to assume such conditions in our setting, since we expect that the solution u of (6) reflects the outcome of the invasion of the population in the environment Rn \ K.

2 Main results

For the local problem (3), the Liouville property obtained in [3] says that $u = 1$ in $\mathbb{R}^n \setminus K$ under some geometric conditions on $K$, in particular when $K$ is convex. A similar Liouville type property was recently obtained for continuous solutions of (6) when the singular kernel $\frac{1}{|z|^{n+2s}}$ is replaced by a non-negative integrable kernel $J$, i.e. $J \in L^1(\mathbb{R}^n)$, see [4]. More precisely, assume that $J$ satisfies

$J \in L^1(\mathbb{R}^n)$ is a non-negative, radially symmetric kernel with unit mass, and there are $0 \le r_1 < r_2$ such that $J(x) > 0$ for a.e. $x$ with $r_1 < |x| < r_2$, (10)

and that there exists a function $\varphi \in C(\mathbb{R})$ satisfying
\[ J_1 * \varphi - \varphi + f(\varphi) \ge 0 \text{ in } \mathbb{R}, \quad \varphi \text{ is increasing in } \mathbb{R}, \quad \varphi(-\infty) = 0, \quad \varphi(+\infty) = 1, \tag{11} \]
where $J_1 \in L^1(\mathbb{R})$ is the non-negative even function with unit mass given for a.e. $x \in \mathbb{R}$ by
\[ J_1(x) := \int_{\mathbb{R}^{n-1}} J(x, y_2, \cdots, y_n)\,dy_2\cdots dy_n. \]
Then in [4] the authors prove the following.

Theorem 1 (Brasseur, Coville, Hamel, Valdinoci [4]). Let $K \subset \mathbb{R}^n$ be a compact convex set. Assume that $f$ satisfies (8) and (9) and $J$ satisfies (10) and (11), and let $u \in C(\mathbb{R}^n \setminus K, [0,1])$ be a function satisfying
\[ \begin{cases} \displaystyle\int_{\mathbb{R}^n\setminus K} J(x - y)(u(y) - u(x))\,dy + f(u(x)) \le 0 & \text{for } x \in \mathbb{R}^n \setminus K, \\ u(x) \to 1 & \text{as } |x| \to +\infty. \end{cases} \tag{12} \]
Then, $u = 1$ in $\mathbb{R}^n \setminus K$.

Observe that the problems (12) and (6) only differ in their formulation by the singularity of the kernels used. In particular, the problem (12) can be reformulated into the framework of problem (6), since for all $J \in L^1(\mathbb{R}^n)$ and for all $x \in \mathbb{R}^n \setminus K$ and $u \in L^\infty(\mathbb{R}^n)$


Jérôme Coville



\[ \lim_{\varepsilon \to 0} \int_{\mathbb{R}^n\setminus K,\ |x-y|>\varepsilon} J(x - y)(u(y) - u(x))\,dy = \int_{\mathbb{R}^n\setminus K} J(x - y)(u(y) - u(x))\,dy. \]

Therefore, it is expected that (12) and (6) share some common properties. One of the goals of the present note is to extend the results known for (12) to the solutions of (6) and, when possible, to highlight the role of the singularity of the kernel in this context. Our main results show that under the right regularity assumptions we can transpose the results of Theorem 1 to solutions of (6). Namely, we first prove the following.

Theorem 2. Let $K \subset \mathbb{R}^n$ be a compact smooth convex set ($C^{0,1}$). Assume (8), (9) and $s \in (0, \frac12)$. Let $u \in C^{0,\beta}(\mathbb{R}^n \setminus K, [0,1])$ with $\beta > 2s$ be a function satisfying
\[ \begin{cases} \mathcal{L}[u] + f(u) \le 0 & \text{in } \mathbb{R}^n \setminus K, \\ u(x) \to 1 & \text{as } |x| \to +\infty. \end{cases} \tag{13} \]
Then, $u = 1$ in $\mathbb{R}^n \setminus K$.

Our second result completes the picture.

Theorem 3. Let $K \subset \mathbb{R}^n$ be a compact smooth convex set ($C^{0,1}$). Assume (8), (9) and $s \in \left[\frac12, 1\right)$. Let $u \in C^{1,\beta}(\mathbb{R}^n \setminus K, [0,1])$ with $\beta > 2s - 1$ be a function satisfying
\[ \begin{cases} \mathcal{L}[u] + f(u) \le 0 & \text{in } \mathbb{R}^n \setminus K, \\ \nabla u \cdot \nu = 0 & \text{on } \partial K, \\ u(x) \to 1 & \text{as } |x| \to +\infty. \end{cases} \tag{14} \]
Then, $u = 1$ in $\mathbb{R}^n \setminus K$.

We can already see clearly the effect of the singularity of the kernel. Indeed, unlike the nonlocal operators with integrable kernel, the $s$-fractional Laplacian is well defined in $\mathbb{R}^n \setminus K$ only for regular functions, i.e. $u$ should be at least $C^{0,\beta}$. In this singular setting, requiring that the super-solution $u$ is merely continuous is not enough. These results complete our knowledge on the validity of this type of Liouville property for a broad class of reaction-diffusion equations. They show some universality of this type of property and prove that such a rigidity result can be viewed as an intrinsic property of the problem, related to a generic property of the equation rather than to a special property of the diffusion process considered.

2.1 Further comments and strategy of proofs

Prior to proving these results, let us make some comments on our hypotheses and highlight some of the differences that arise when the singular measure $\frac{1}{|z|^{n+2s}}$ is replaced by an integrable kernel $J$.


First, let us observe that thanks to the regularising property of the regional fractional Laplacian $\mathcal{L}$, see [6, 7, 8], the continuity assumption made on $u$ can be easily weakened when $u$ is assumed to be a solution to (6) instead of a super-solution. Thus, in this situation, the results of Theorems 2 and 3 hold as well for a bounded solution $u$ that satisfies equation (13), respectively (14), in the sense of viscosity solutions. Note that, contrary to the regional fractional Laplacian $\mathcal{L}$, the nonlocal operator $\mathcal{M}[u] := \int_{\mathbb{R}^n\setminus K} J(x - y)(u(y) - u(x))\,dy$ has no regularising property and, as a consequence, weakening the regularity assumption on the solution $u$ is a hard task which, for the moment, can only be achieved by imposing further restrictions on the data $f$ and $J$. Nevertheless, in this non-regularising context, the regularity of the obstacle $K$ is no longer an issue and $K$ can be any arbitrary convex domain.

The regularisation effect on the solutions induced by the singularity is in fact the only main distinction between the problem (12) and the singular problems (13) and (14). This distinction also appears clearly in the set of assumptions needed for the existence of a monotone travelling front with a positive speed, which is a key element of the proof of Theorem 1. In particular, as already mentioned in Section 1.2, assumptions (8) and (9) are actually necessary and sufficient for the existence of a travelling wave solution with positive speed $c$ to the one-dimensional fractional equation, i.e. a monotone solution to $c\,\partial_z\phi = \partial_z^s\phi + f(\phi)$ with a positive speed $c > 0$. Such assumptions are no longer sufficient for the problem (12), for which there exist data $f$ and $J$ that satisfy (8)-(9) such that only discontinuous fronts with null speed exist.

Let us also note that similar conditions are also necessary and sufficient for the existence of a one-dimensional travelling wave with positive speed for the local problem, namely a solution of $c\,\phi' = \phi'' + f(\phi)$ with positive speed (see e.g. [2]). This fact suggests a strong connection between the regularity of the front and the minimal set of assumptions that are required to produce a front of positive speed. Let us emphasize that the motivation behind these assumptions is that, by analogy with the local problem (2), we expect a solution to (6) to be the large time limit of an entire solution to the evolution problem (5) which behaves like $\phi(x_1 + ct)$ when $t \to -\infty$. For this interpretation to even make sense it is necessary to work in a setting where the function $\phi$ exists.

Let us now say a word on our strategy of proofs. The proofs are a rather straightforward adaptation of the arguments developed in [4] for the nonlocal obstacle problem (12). The main idea is to compare, by means of adequate sliding methods, a family of planar functions of the type $\phi(x\cdot e - r)$, where $e \in \partial B_1$, $r \in \mathbb{R}$ and $\phi \in C^{1,1}(\mathbb{R})$ is a given monotone function, with a given super-solution $u$. To adapt such a technique to our situation, we first need to verify that, as proved for (12), similar comparison principles in half-spaces hold true for the fractional equations (13) and (14).

The outline of this note will be as follows. In Section 3 we provide several comparison principles and recall some known results on the 1d travelling fronts for


fractional bistable equation. Then in Section 4, following the arguments developed in [4], we prove the Liouville property described in Theorems 2 and 3.

3 Some mathematical background

In this section, we start by collecting some comparison principles that are suitable for our purposes. To shorten the presentation, we only fully state the necessary comparison principles for the regional fractional Laplacian with $s \in \left[\frac12, 1\right)$. Throughout this section, $K$ is any compact subset of $\mathbb{R}^n$ and $f$ is any $C^1(\mathbb{R})$ function. We start with a weak maximum principle.

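Lemma 1 (Weak maximum principle). Assume that $s \in \left[\frac12, 1\right)$ and
\[ f' \le -c_1 \text{ in } [1 - c_0, +\infty), \text{ for some } c_0 > 0,\ c_1 > 0. \tag{15} \]
Let $H \subset \mathbb{R}^n$ be an open affine half-space such that $K \subset\subset H^c = \mathbb{R}^n \setminus H$. Let $u, v \in L^\infty(\mathbb{R}^n \setminus K) \cap C^{1,\beta}(\overline{H})$ for some $\beta > 2s - 1$ be such that
\[ \begin{cases} \mathcal{L}[u] + f(u) \le 0 & \text{in } H, \\ \mathcal{L}[v] + f(v) \ge 0 & \text{in } H. \end{cases} \]
Assume also that
\[ u \ge 1 - c_0 \text{ in } H, \qquad \limsup_{|x|\to+\infty} \big(v(x) - u(x)\big) \le 0, \]
and that
\[ v \le u \quad \text{a.e. in } H^c \setminus K. \]
Then, $v \le u$ a.e. in $\mathbb{R}^n \setminus K$.

The next lemma is concerned with a strong maximum principle.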

Lemma 2 (Strong maximum principle). Assume that $s \in \left[\frac12, 1\right)$ and let $H \subset \mathbb{R}^n$ be an open affine half-space such that $K \subset\subset H^c$. Let $u, v \in L^\infty(\mathbb{R}^n \setminus K) \cap C^{1,\beta}(\overline{H})$ for some $\beta > 2s - 1$ be such that (1) holds true. Assume also that
\[ v \le u \quad \text{a.e. in } \mathbb{R}^n \setminus K \]
and that there exists $\bar x \in H$ such that $v(\bar x) = u(\bar x)$. Then,
\[ v = u \quad \text{a.e. in } H. \]

These comparison principles are in essence identical to the ones derived in [4], and as such we point the interested reader to [4] for a detailed proof of these results.


Remark 1. The above comparison principles have only been stated for regional fractional operators with exponent $s \in \left[\frac12, 1\right)$. Identical weak and strong maximum principles can be formulated for the regional fractional Laplacian with exponent $s \in \left(0, \frac12\right)$ as soon as we impose adequate regularity on the functions $u$ and $v$ in order to properly define the regional fractional Laplacian of $u$ and $v$. In such a case, the above statements hold true if instead of having $u, v \in L^\infty(\mathbb{R}^n \setminus K) \cap C^{1,\beta}(\overline{H})$ we assume that $u, v \in L^\infty(\mathbb{R}^n \setminus K) \cap C^{0,\beta}(\overline{H})$ with $\beta > 2s$.

Lastly, we recall some known results on the existence and properties of travelling fronts $\phi(x\cdot e + ct)$, entire solutions of the fractional evolution equation
\[ \partial_tu(t,x) = \Delta^su(t,x) + f(u(t,x)) \quad \text{for } t \in \mathbb{R},\ x \in \mathbb{R}^n, \]
that is, solutions of the following
\[ -c\,\partial_z\phi(z) + \partial_z^s\phi(z) + f(\phi(z)) = 0 \quad \text{for } z \in \mathbb{R}, \tag{16} \]
\[ \lim_{z\to+\infty}\phi(z) = 1, \qquad \lim_{z\to-\infty}\phi(z) = 0, \tag{17} \]
where $\partial_z^s\phi$ denotes the one-dimensional $s$-fractional Laplacian. The existence, uniqueness and some asymptotic properties of such a solution $\phi$ have been obtained in several contexts [1, 5, 9, 10]. The next statement is a summary of these results.

Theorem 4 (Fractional travelling wave [1, 5, 9, 10]). Assume $f$ is a bistable function that satisfies (8) and (9) and let $s \in (0,1)$. Then there exist a unique $c \in \mathbb{R}$ and a monotone smooth (at least $C^{1,1}$) increasing function $\phi$ such that $(c, \phi)$ is a solution to (16)-(17). Moreover, if $f$ is unbalanced with $\int_0^1 f(s)\,ds > 0$, then $c > 0$.

As a trivial consequence of the existence of a smooth front of positive speed, for any separating open affine half-space $H \subset \mathbb{R}^n$ such that $K \subset\subset H^c$ we can derive a family of functions which will be a "sub-solution to the problem (6)" for all $x \in H$. More precisely, let $H$ be an open affine half-space of $\mathbb{R}^n$ such that $K \subset H^c$. By definition of an affine half-space, there exist a unit vector $e \in \partial B_1$ and $x_0 \in \mathbb{R}^n$ such that $H = x_0 + H_e$ with $H_e$ the open half-space of direction $e$, i.e. $H_e := \{x \in \mathbb{R}^n \mid x\cdot e > 0\}$. For this direction $e$ and for all real $r \in \mathbb{R}$, we can define the family of functions $\varphi_{r,e}(x) := \phi(x\cdot e - r)$, where $\phi$ is the smooth increasing profile obtained in Theorem 4. By construction, since $\phi$ is monotone increasing we have
\[ \forall x \in H,\ \forall y \in K, \qquad \varphi_{r,e}(y) - \varphi_{r,e}(x) \le 0. \tag{18} \]
In addition, we can check that for all $x \in H$, we have




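\[ \begin{aligned} \mathcal{L}[\varphi_{r,e}](x) &= \lim_{\varepsilon\to0} \int_{\mathbb{R}^n\setminus K,\ |x-y|>\varepsilon} \frac{\varphi_{r,e}(y) - \varphi_{r,e}(x)}{|x-y|^{n+2s}}\,dy \\ &= \lim_{\varepsilon\to0} \left( \int_{\mathbb{R}^n,\ |x-y|>\varepsilon} \frac{\varphi_{r,e}(y) - \varphi_{r,e}(x)}{|x-y|^{n+2s}}\,dy - \int_{K,\ |x-y|>\varepsilon} \frac{\varphi_{r,e}(y) - \varphi_{r,e}(x)}{|x-y|^{n+2s}}\,dy \right) \\ &= \Delta^s\varphi_{r,e}(x) - \int_K \frac{\varphi_{r,e}(y) - \varphi_{r,e}(x)}{|x-y|^{n+2s}}\,dy, \end{aligned} \]
which combined with (18) enforces
\[ \mathcal{L}[\varphi_{r,e}](x) \ge \Delta^s\varphi_{r,e}(x) \quad \text{for } x \in H. \]
Hence, for all $x \in H$, we get
\[ \mathcal{L}[\varphi_{r,e}](x) + f(\varphi_{r,e}) \ge \Delta^s\varphi_{r,e}(x) + f(\varphi_{r,e}) \ge \partial_z^s\phi(x\cdot e - r) + f(\phi(x\cdot e - r)) = c\,\partial_z\phi(x\cdot e - r) > 0. \tag{19} \]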

4 The case of convex obstacles: proofs of the main Theorems

In this section, based on the arguments introduced in [4], we sketch the proof of our main results (Theorems 2 and 3). The proofs of Theorems 3 and 2 being identical, we only sketch the proof of Theorem 3. Before we start our discussion, let us begin with the following simple observation.

Lemma 3. Let $s \in \left[\frac12, 1\right)$, let $K \subset \mathbb{R}^n$ be a smooth compact convex set and assume (8) and (9). Let $u \in C^{1,\beta}(\mathbb{R}^n \setminus K, [0,1])$ with $\beta > 2s - 1$ be such that
\[ \mathcal{L}[u] + f(u) \le 0 \quad \text{in } \mathbb{R}^n \setminus K, \tag{20} \]
\[ \nabla u\cdot\nu = 0 \quad \text{on } \partial K, \tag{21} \]
\[ u(x) \to 1 \quad \text{as } |x| \to +\infty. \tag{22} \]
Then there exists $\gamma \in (0,1]$ such that $\gamma \le u \le 1$ in $\mathbb{R}^n \setminus K$.

The proof of this Lemma being an elementary adaptation of the argument used in [4], we refer to [4] for its proof. We now turn to the proof of Theorem 3.

Proof (Proof of Theorem 3). Let us fix $s \in \left[\frac12, 1\right)$ and let $K$, $f$, and $u$ be as in Theorem 3. Let us now follow the argument developed in [4]. Firstly, without loss of generality, one can assume by (8) that $f$ can be extended to a $C^1(\mathbb{R})$ function satisfying (15). Secondly, by (13) and the boundedness of $K$, there exists $R_0 > 0$ large enough so that $K \subset B_{R_0}$ and $u \ge 1 - c_0$ in $\mathbb{R}^n \setminus B_{R_0}$, where $c_0 > 0$ is given in (15).


We now proceed by contradiction, and suppose that
\[ \inf_{\mathbb{R}^n\setminus K} u < 1. \tag{23} \]
From (14) and (23), together with the continuity of $u$, there exists then $x_0 \in \mathbb{R}^n \setminus K$ such that
\[ u(x_0) = \min_{\mathbb{R}^n\setminus K} u \in [0,1). \]
We observe that, by Lemma 3, one has $u(x_0) > 0$. In addition, since $K$ is convex, there exists $e \in \partial B_1$ such that $K \subset H_e^c$, where $H_e$ is the open affine half-space defined by
\[ H_e := x_0 + \{x \in \mathbb{R}^n ;\ x\cdot e > 0\}. \]
As in Section 3, let us define for all $r \in \mathbb{R}$ the family of functions
\[ \varphi_r(x) := \varphi_{r,e}(x) = \phi(x\cdot e - r), \quad x \in \mathbb{R}^n, \]
where $\phi$ is a smooth monotone increasing function given by Theorem 4. Note that by construction, since $K \subset H_e^c$, we can check (as in Section 3) that for any $r \in \mathbb{R}$, $\varphi_r$ satisfies
\[ \mathcal{L}[\varphi_r](x) + f(\varphi_r(x)) > 0 \quad \text{for } x \in H_e. \tag{24} \]
First, we claim the following.

Proposition 1. There exists $r_0 \in \mathbb{R}$ such that $\varphi_{r_0} \le u$ in $\mathbb{R}^n \setminus K$.

Again the proof of this Proposition is an elementary adaptation of a proof done in [4] that we present for the sake of clarity.

Proof. First let us define $H := x_1 + H_e$ with $x_1$ to be chosen such that $B_{R_0} \subset H^c$. Let us fix $x_1$ such that $H \subset\subset H_e$. By construction the function $\phi$ is monotone increasing and satisfies $\lim_{z\to-\infty}\phi(z) = 0$. So we can find $r_0 \gg 1$ such that
\[ \varphi_{r_0}(x) = \phi(x\cdot e - r_0) \le u(x_0) \le u(x) \quad \text{for all } x \in H^c. \]
Now thanks to our choice of $x_1$ we have $H \subset\subset H_e$ and from (24) we deduce
\[ \begin{cases} \mathcal{L}[u](x) + f(u(x)) \le 0 & \text{for } x \in H, \\ \mathcal{L}[\varphi_{r_0}](x) + f(\varphi_{r_0}(x)) > 0 & \text{for } x \in H, \\ u(x) \ge \varphi_{r_0}(x) & \text{for } x \in H^c \setminus K. \end{cases} \]
We then get the desired result by applying the weak maximum principle (Lemma 1).

Equipped with Proposition 1, we can now define the following quantity
\[ r^* := \inf\{r \in \mathbb{R} ;\ \varphi_r \le u \text{ in } \mathbb{R}^n \setminus K\}. \]
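The sliding parameter $r^*$ can be illustrated on sampled data in one dimension. The sketch below is only a toy computation under loud assumptions: the logistic function stands in for the front profile (the actual fractional front of Theorem 4 is not known in closed form), the function $u$ is an arbitrary sample with values in $(0,1]$, and the infimum is taken over finitely many sample points. Since the profile is increasing, the constraint profile$(x - r) \le u(x)$ is equivalent to $r \ge x - \text{profile}^{-1}(u(x))$, so the infimum is the largest of these lower bounds.

```python
import math

def profile(z):
    """Hypothetical increasing profile with limits 0 and 1 (logistic stand-in for the front)."""
    return 1.0 / (1.0 + math.exp(-z))

def profile_inverse(v):
    """Inverse of the logistic profile, extended by +/- infinity at the end values."""
    if v >= 1.0:
        return math.inf
    if v <= 0.0:
        return -math.inf
    return math.log(v / (1.0 - v))

def sliding_threshold(xs, us):
    """inf of all r such that profile(x - r) <= u(x) at every sample point x."""
    return max(x - profile_inverse(u) for x, u in zip(xs, us))

if __name__ == "__main__":
    xs = [float(k) for k in range(-5, 6)]
    dipped = [1.0 - 0.5 * math.exp(-x * x) for x in xs]   # sample u with a dip below 1
    flat = [1.0 for _ in xs]                               # sample u identically 1
    print(sliding_threshold(xs, dipped), sliding_threshold(xs, flat))
```

When the sampled $u$ is identically 1 every translate lies below it and the threshold is $-\infty$, which mirrors the claim that follows; when $u$ dips below 1, the threshold is finite and the translated profile touches $u$ at some sample point.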


The whole game is now to show that r^* = −∞. So, we claim the following.

Claim. r^* = −∞.

Assume that the claim is true; then the proof of Theorem 3 is complete. Indeed, from this claim we infer that φ_r ≤ u in R^n \ K for any r ∈ R. In particular, recalling that ϕ(+∞) = 1, we get that

\[ 1 > u(x_0) \ge \lim_{r \to -\infty} \varphi_r(x_0) = \lim_{r \to -\infty} \phi(x_0 \cdot e - r) = 1, \]

a contradiction. Therefore (23) cannot hold. In other words, inf_{R^n \ K} u = 1, i.e. u = 1 in R^n \ K, thereby proving Theorem 3.

Let us now conclude our proof by establishing the Claim. Again, the proof of this last Claim is a very elementary adaptation of the arguments used to prove Theorem 1. As a consequence, we will only highlight the main differences.

Proof (of the Claim). The proof is by contradiction. We assume that r^* ∈ R. Then, there exists a sequence (ε_j)_{j∈N} of positive real numbers such that φ_{r^*+ε_j}(x) = ϕ(x · e − r^* − ε_j) ≤ u(x) for all x ∈ R^n \ K and ε_j → 0 as j → +∞. Thus, passing to the limit as j → +∞, we obtain that

\[ \varphi_{r^*}(x) \le u(x) \quad \text{for all } x \in \mathbb{R}^n \setminus K. \]

Let us denote by H the open affine half-space

\[ H = \{ x \in \mathbb{R}^n \,;\, x \cdot e > R_0 \}. \]

Notice that H ∩ K = ∅ and that u is well defined and continuous in H. We also observe that, by construction,

\[ \sup_{H^c} \varphi_{r^*} < 1. \tag{25} \]

Two cases may occur.

Case 1: inf_{H^c \ K} (u − φ_{r^*}) > 0. In this situation, the argument is identical to the one for (12), and we point the reader to [4] for the details.

Case 2: inf_{H^c \ K} (u − φ_{r^*}) = 0. In this situation, by (22) and (25), and by continuity of u and φ_{r^*}, there exists a point x̄ ∈ H^c \ K such that u(x̄) = φ_{r^*}(x̄). Note that x̄ ∈ H_e, since otherwise x̄ ∈ R^n \ H_e, namely x̄ · e < x0 · e, and the chain of inequalities

\[ u(\bar{x}) = \varphi_{r^*}(\bar{x}) < \varphi_{r^*}(x_0) \le u(x_0) = \min_{\mathbb{R}^n \setminus K} u \]

leads to a contradiction. Therefore, we have φ_{r^*} ≤ u in R^n \ K, with equality at a point x̄ ∈ R^n \ K ∩ H_e. Again, two situations can occur: either x̄ ∈ R^n \ K or x̄ ∈ ∂K. Assume for the moment that the latter situation occurs. Then, thanks to the convexity of K, the outward normal ν(x̄) to ∂K at x̄ is the vector e, i.e. ν(x̄) = e, and thanks to (14) we get ∇u · ν(x̄) = 0, from which we deduce

\[ 0 \le \nabla(u - \varphi_{r^*}) \cdot \nu(\bar{x}) \le -\nabla \varphi_{r^*} \cdot \nu(\bar{x}) = -\phi'(\bar{x} \cdot e - r^*) < 0. \]

This contradiction rules out this situation. Lastly, assume that x̄ ∈ R^n \ K. Then, in this situation, K ⊂⊂ H_e^c, and u and φ_{r^*} satisfy respectively

\[
\begin{cases}
L[u] + f(u) \le 0 & \text{in } H_e,\\
L[\varphi_{r^*}] + f(\varphi_{r^*}) > 0 & \text{in } H_e \text{ (by (24))}.
\end{cases}
\]

In particular, it follows from the strong maximum principle (Lemma 2) that φ_{r^*} = u in H_e. Thus, for any e⊥ ∈ ∂B_1 such that e⊥ · e = 0, one infers from (22) and the definition of φ_{r^*} that

\[ 1 = \lim_{t \to +\infty} u(x_0 + t e^{\perp}) = \lim_{t \to +\infty} \varphi_{r^*}(x_0 + t e^{\perp}) = \varphi_{r^*}(x_0) < 1. \]

This last contradiction rules out this situation as well, and therefore rules out Case 2 too.

Remark 2. The proof of Theorem 2 is identical to the one given in [4]. This is due to the fact that the s-fractional operator is well defined and continuous up to the boundary ∂K when s ∈ (0, 1/2), and as such the strong maximum principle (Lemma 2) also holds true for any half-space H such that K ⊂ H^c. In this situation, the case x̄ ∈ ∂K does not need to be analysed separately from the other cases.

Acknowledgements The author has been supported by the ANR DEFI project NONLOCAL (ANR-14-CE25-0013). The author wants to thank Professor Changfeng Gui for bringing this question to his attention during the MATRIX program “Recent Trends on Nonlinear PDEs of Elliptic and Parabolic Type”. These results have emerged through the scientific discussions during this event.

References
1. Achleitner, F., Kuehn, C., et al.: Traveling waves for a bistable equation with nonlocal diffusion. Advances in Differential Equations 20(9/10), 887–936 (2015)
2. Aronson, D.G., Weinberger, H.F.: Multidimensional nonlinear diffusion arising in population genetics. Adv. in Math. 30(1), 33–76 (1978)
3. Berestycki, H., Matano, H., Hamel, F.: Bistable traveling waves around an obstacle. Communications on Pure and Applied Mathematics 62(6), 729–788 (2009)
4. Brasseur, J., Coville, J., Hamel, F., Valdinoci, E.: Liouville type results for a non-local equation in a presence of an obstacle. To appear in PLMS (2019)
5. Cabré, X., Sire, Y.: Nonlinear equations for fractional Laplacians II: existence, uniqueness, and qualitative properties of solutions. Transactions of the American Mathematical Society 367(2), 911–941 (2015)
6. Caffarelli, L., Silvestre, L.: Regularity theory for fully nonlinear integro-differential equations. Comm. Pure Appl. Math. 62(5), 597–638 (2009). DOI 10.1002/cpa.20274. URL http://dx.doi.org/10.1002/cpa.20274
7. Guan, Q.Y., Ma, Z.M.: Boundary problems for fractional Laplacians. Stochastics and Dynamics 5(03), 385–424 (2005)
8. Guan, Q.Y., Ma, Z.M.: Reflected symmetric α-stable processes and regional fractional Laplacian. Probability Theory and Related Fields 134(4), 649–694 (2006)
9. Gui, C., Zhao, M.: Traveling wave solutions of Allen–Cahn equation with a fractional Laplacian 32(4), 785–812 (2015)
10. Palatucci, G., Savin, O., Valdinoci, E.: Local and global minimizers for a variational energy involving a fractional norm. Annali di Matematica Pura ed Applicata 192(4), 673–718 (2013)

Symmetry results for the solutions of a partial differential equation arising in water waves

Serena Dipierro, Pietro Miraglio and Enrico Valdinoci

Abstract This paper recalls some classical motivations in fluid dynamics leading to a partial differential equation which is prescribed on a domain whose boundary possesses two connected components, one endowed with a Dirichlet datum, and the other endowed with a Neumann datum. The problem can also be reformulated as a nonlocal problem on the component endowed with the Dirichlet datum. A series of recent symmetry results are presented and compared with the existing literature.

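1 Introduction

In this paper we present some recent results related to the partial differential equation

\[
\begin{cases}
\operatorname{div}(y^a \nabla v) = 0 & \text{for } x \in \mathbb{R}^n,\ y \in (0, 1),\\
v_y(x, 1) = 0 & \text{for } x \in \mathbb{R}^n,\ y = 1,\\
v(x, 0) = u(x) & \text{for } x \in \mathbb{R}^n,\ y = 0,\\
-\lim_{y \to 0} y^a v_y = f(v) & \text{for } x \in \mathbb{R}^n,\ y = 0,
\end{cases}
\tag{1}
\]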

Serena Dipierro
Department of Mathematics and Statistics, University of Western Australia, 35 Stirling Highway, Crawley WA 6009, Australia

Pietro Miraglio
Dipartimento di Matematica, Università degli studi di Milano, Via Saldini 50, 20133 Milan, Italy, and Universitat Politècnica de Catalunya, Departament de Matemàtiques, Avinguda Diagonal 647, 08028 Barcelona, Spain

Enrico Valdinoci
Department of Mathematics and Statistics, University of Western Australia, 35 Stirling Highway, Crawley WA 6009, Australia

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_15


with a ∈ (−1, 1). The problem in (1) is related to a water waves model and, in a suitable limit, it recovers a fractional Laplace operator. More precisely, a solution v of (1) can be related to its trace u by a nonlocal equation of the type

\[ L_a u = f(u) \quad \text{in } \mathbb{R}^n, \tag{2} \]

for a suitable linear operator La . The operator La can be written in Fourier modes and will present different asymptotic behaviours for small and large frequencies, making the problem particularly interesting. One of the main questions that we address is under which conditions the bounded and monotone solutions of (2) are necessarily one-dimensional — that is, as a counterpart, solutions of (1) that are monotone in one of the x-variables are necessarily functions only of x1 and y, up to a rotation. In Section 2 we recall some basic fluid dynamics motivations to give an elementary but exhaustive description of the problem in (1) in terms of classical physics. Then, in Section 3, we focus on the mathematics relative to (1) and (2), discussing symmetry results in the light of a classical conjecture by Ennio De Giorgi.

2 Physical considerations

In this section, we give a detailed description of the physical considerations that lead to the study of (1). To this end, we consider a possible physical description of an irrotational and inviscid fluid (the “ocean”) in R^{n+1}, though we commonly take n = 2 in the “real world”. The position of a fluid particle at time t will be denoted by X(t) = (x(t), y(t)) ∈ R^n × R. We suppose that, at time t, the region occupied by the ocean lies above the graph of a function b(·, t) (the “bottom of the ocean”) and below the graph of a function h(·, t) (the “surface of the ocean”). Therefore, in this model, the ocean can be described by the time-dependent domain

\[ \Omega(t) := \{ (x, y) \in \mathbb{R}^n \times \mathbb{R} \ \text{s.t.}\ b(x, t) \le y \le h(x, t) \}, \tag{3} \]

see Figure 1. Given a point X ∈ Ω(t), we denote by v(X, t) the velocity of the fluid particle at X at time t. We denote by Φ^t(X) the evolution produced by the vector field v at time t starting at the point X at time zero, that is the solution of the initial value problem

\[
\begin{cases}
\dfrac{d}{dt} \Phi^t(X) = v(\Phi^t(X), t) & \text{for (small) } t > 0,\\
\Phi^0(X) = X.
\end{cases}
\tag{4}
\]

Fig. 1 The domain Ω(t) in (3), bounded by the graphs y = b(x, t) and y = h(x, t).

We suppose that the density of the water is described by a positive function ρ = ρ(X, t). Then, the mass of the fluid lying in a region Ω̃ ⊂ R^{n+1} at time t is described by the quantity

\[ \int_{\widetilde{\Omega}} \rho(X, t) \, dX. \tag{5} \]

The rate at which a fluid mass enters Ω̃ through an infinitesimal portion of ∂Ω̃ in the vicinity of a point X ∈ ∂Ω̃ is given by the density times the velocity at X in the direction of the inner normal of ∂Ω̃ at X. That is, if ν(X) denotes the exterior normal of ∂Ω̃ at X, we find that the rate at which a fluid mass enters Ω̃ is given by

\[ -\int_{\partial \widetilde{\Omega}} \rho(X, t) \, v(X, t) \cdot \nu(X) \, d\mathcal{H}^n(X). \]

Comparing with (5), and using the Divergence Theorem, this leads to

\[
\begin{aligned}
\int_{\widetilde{\Omega}} \partial_t \rho(X, t) \, dX
&= \frac{d}{dt} \int_{\widetilde{\Omega}} \rho(X, t) \, dX \\
&= -\int_{\partial \widetilde{\Omega}} \rho(X, t) \, v(X, t) \cdot \nu(X) \, d\mathcal{H}^n(X) \\
&= -\int_{\widetilde{\Omega}} \operatorname{div}_X \big( \rho(X, t) \, v(X, t) \big) \, dX.
\end{aligned}
\]

From this, since the volume region Ω̃ is arbitrary, we obtain the “mass conservation law” (also known as “continuity equation”) given by

\[ \partial_t \rho(X, t) + \operatorname{div}_X \big( \rho(X, t) \, v(X, t) \big) = 0 \quad \text{in } \Omega(t). \tag{6} \]

Let us now analyze the conditions occurring at the bottom and at the surface of the fluid. At the bottom, we assume that the fluid cannot penetrate inside the

ground, hence its velocity is tangent to the seabed. Recalling the notation in (3), we have that v needs to be orthogonal to the normal direction of the graph of b, and thus, using the notation X = (x, y) ∈ R^n × R,

\[ v(X, t) \cdot \big( \nabla_x b(x, t), -1 \big) = 0 \quad \text{if } y = b(x, t). \tag{7} \]

We can therefore collect the results in (6) and (7) by writing

\[
\begin{cases}
\partial_t \rho(X, t) + \operatorname{div}_X \big( \rho(X, t) \, v(X, t) \big) = 0 & \text{in } \Omega(t),\\
v(X, t) \cdot \big( \nabla_x b(x, t), -1 \big) = 0 & \text{on } \{ y = b(x, t) \}.
\end{cases}
\tag{8}
\]

From (8) one sees that the vector field ρv has perhaps more physical meaning than v alone, since it represents the density-weighted velocity of the flow, and it is somehow more meaningful to prescribe a bound on ρv rather than on v itself. For instance, a situation in which v becomes unbounded can still be physically realistic if ρv remains bounded, since, in this case, roughly speaking, only a negligible amount of fluid would travel at exceptionally high speed. Therefore, though the equations are perfectly equivalent in case of “nice” vector fields v and densities ρ, we prefer to write (8) in a form in which the quantity ρv, rather than v alone, appears directly. This is done by multiplying the identity on the bottom of the ocean by the density, to find

\[
\begin{cases}
\partial_t \rho(X, t) + \operatorname{div}_X \big( \rho(X, t) \, v(X, t) \big) = 0 & \text{in } \Omega(t),\\
\rho(X, t) \, v(X, t) \cdot \big( \nabla_x b(x, t), -1 \big) = 0 & \text{on } \{ y = b(x, t) \}.
\end{cases}
\tag{9}
\]

We also assume that the fluid particles do not “circulate in a cyclone way”, namely that the fluid is irrotational, see Figure 2. To formalize this notion in an arbitrarily large number of dimensions in an elementary geometric way (without using the notion of higher dimensional curls), we assume that, for every fixed time, the integral of the velocity vector field along any closed one-dimensional curve in R^n vanishes. As a matter of fact, it would be enough to require such a condition along polygonal lines, and in fact it would be sufficient to require it along triangular connections. This irrotationality condition implies (and, in fact, is equivalent to the fact) that the velocity field admits a potential, namely that there exists a scalar function u = u(X, t) such that

\[ v(X, t) = \nabla_X u(X, t). \tag{10} \]

We stress that (10) is a rather striking formula, since it reduces the knowledge of a vector valued function (namely, v) to the knowledge of (the derivatives of) a single scalar function. The construction of the potential u is standard, and can be performed along the following argument: we let Γ_X be the oriented segment starting at the origin and arriving at X, and we set

\[ u(X, t) := \int_{\Gamma_X} v := \int_0^1 v(\vartheta X, t) \cdot X \, d\vartheta. \]

Fig. 2 The velocity field v always has a positive component along the tangential direction of the closed curve, hence it is not irrotational.

To prove (10), let j ∈ {1, . . . , n} and δ ≠ 0, to be taken arbitrarily small in what follows. We also denote by Γ_{X,δ,j} the oriented segment from X to X + δ e_j. Also, given two adjacent segments Γ1 and Γ2, we denote by Γ1 ∪ Γ2 the broken line joining the initial point of Γ1 to the end point of Γ1 (which coincides with the initial point of Γ2) and that to the end point of Γ2. Furthermore, we denote by −Γ1 the segment Γ1 run in the opposite direction. With this notation, we have that Γ_{X+δ e_j} ∪ (−Γ_{X,δ,j}) ∪ (−Γ_X) forms a closed triangle and accordingly, by the irrotationality condition,

\[
\begin{aligned}
0 &= \int_{\Gamma_{X + \delta e_j} \cup (-\Gamma_{X,\delta,j}) \cup (-\Gamma_X)} v
 = \int_{\Gamma_{X + \delta e_j}} v - \int_{\Gamma_{X,\delta,j}} v - \int_{\Gamma_X} v \\
&= u(X + \delta e_j, t) - \delta \int_0^1 v(X + \vartheta \delta e_j, t) \cdot e_j \, d\vartheta - u(X, t).
\end{aligned}
\]

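Dividing by δ and sending δ → 0, we obtain (10), as desired. Then, inserting (10) into (9), we conclude that

\[
\begin{cases}
\partial_t \rho(X, t) + \operatorname{div}_X \big( \rho(X, t) \, \nabla_X u(X, t) \big) = 0 & \text{in } \Omega(t),\\
\rho(X, t) \, \nabla_x u(X, t) \cdot \nabla_x b(x, t) - \rho(X, t) \, \partial_y u(X, t) = 0 & \text{on } \{ y = b(x, t) \}.
\end{cases}
\tag{11}
\]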
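The segment-integral construction of the potential used to prove (10) can be sanity-checked numerically. The following is only a minimal sketch under stated assumptions: the two-dimensional field `velocity` (the gradient of x² + xy + y³) is a hypothetical example that does not come from the text, the integral over [0, 1] is approximated by the midpoint rule, and the gradient of u is approximated by central finite differences.

```python
def velocity(x, y):
    """Hypothetical irrotational field: gradient of p(x, y) = x**2 + x*y + y**3."""
    return (2.0 * x + y, x + 3.0 * y * y)


def potential(x, y, steps=2000):
    """u(X) = integral over [0, 1] of v(theta * X) . X d(theta), midpoint rule."""
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) / steps
        vx, vy = velocity(theta * x, theta * y)
        total += vx * x + vy * y
    return total / steps


def gradient(func, x, y, h=1e-4):
    """Central finite-difference gradient of a scalar function of two variables."""
    gx = (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    gy = (func(x, y + h) - func(x, y - h)) / (2.0 * h)
    return (gx, gy)


def max_gradient_error(points):
    """Largest componentwise gap between grad u and v over the sample points."""
    worst = 0.0
    for (x, y) in points:
        gx, gy = gradient(potential, x, y)
        vx, vy = velocity(x, y)
        worst = max(worst, abs(gx - vx), abs(gy - vy))
    return worst


if __name__ == "__main__":
    samples = [(1.0, 2.0), (-0.5, 0.7), (0.3, -1.2)]
    print("max |grad u - v| =", max_gradient_error(samples))
```

For a field that is not a gradient, the same computation would in general show a gap between ∇u and v that does not shrink under refinement, which is the numerical counterpart of the role played by the irrotationality condition in the argument.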


We observe that the setting in (1) is a particular case of that in (11), in which one considers the steady case of stationary solutions (i.e. ρ does not depend on time), with X = (x, y) ∈ Ω = R^n × (0, 1), and ρ(X) = y^a, with a ∈ (−1, 1).

Remark 1. Concerning the setting in (4), we recall that in the literature one also considers the “streamlines” of the fluid, described by a parameter τ ∈ R, which are (local) solutions of the differential equation (for fixed time t)

\[ \frac{d}{d\tau} X(\tau, t) = v(X(\tau, t), t). \]

Notice that, if the velocity field v is independent of time, we can actually identify the curve parameter τ with the usual time t and then the streamlines describe the physical trajectories of the fluid particle. But in general, for velocity fields which depend on time, streamlines do not represent the physical trajectories. Nevertheless, streamlines are always instantaneously tangent to the velocity field of the flow and therefore they indicate the direction in which the fluid particle at a given point travels in time. We maintain the distinction between streamlines and physical trajectories of the flow, and in this note only the latter objects will be taken into account for the main computations.

Remark 2. We point out that in the literature one often assumes that the fluid is “incompressible”, that is, for any fixed reference domain Ω̃,

\[ \frac{d}{dt} \int_{\widetilde{\Omega}} \rho(\Phi^t(X), t) \, dX = 0. \]

This condition together with (4) leads to

\[ \partial_t \rho(\Phi^t(X), t) + \nabla_X \rho(\Phi^t(X), t) \cdot v(\Phi^t(X), t) = 0, \tag{12} \]

or, equivalently, changing the name of the space variable,

\[ \partial_t \rho(X, t) + \nabla_X \rho(X, t) \cdot v(X, t) = 0. \tag{13} \]

The incompressibility condition (13) may also be understood from a “discrete analogue” by thinking that the density ρ(X, t) of a gas formed by indistinguishable molecules at a point X at time t is measured by “counting the number of molecules” in the vicinity of X at time t. That is, fixing r > 0, the gas density could be defined as the number of molecules lying in B_r(X) at time t. If the gas is incompressible, we expect that the number of molecules around the evolution Φ^t(X) of X remains the same. This gives that ρ(Φ^t(X), t) = ρ(X, 0), which leads to (12) and (13). To appreciate the structural difference between the mass conservation law in (6) and the incompressibility condition in (13), let us consider two examples. In the first example, let

\[ v(X, t) := -X \quad \text{and} \quad \rho(X, t) := e^{nt}, \]

with n > 0. In this case, the velocity field pushes all the fluid towards the origin, preserving the mass according to (6): as a consequence, the particles of the fluid get “packed” and their density increases, and the incompressibility condition (13) is indeed violated. As a second example, let us consider the case in which

\[ v(X, t) := -X \quad \text{and} \quad \rho(X, t) := 1. \]

In this case, the fluid elements are still pushed towards the origin, but the density remains constant. This means that there must be a leak somewhere, from which the fluid escapes. In this situation, the incompressibility condition in (13) is satisfied, but the mass is lost and accordingly (6) does not hold. We also point out that if the mass conservation law in (6) and the incompressibility condition in (13) are both satisfied, then

\[
\begin{aligned}
0 &= \partial_t \rho(X, t) + \operatorname{div}_X \big( \rho(X, t) \, v(X, t) \big) \\
&= \partial_t \rho(X, t) + \nabla_X \rho(X, t) \cdot v(X, t) + \rho(X, t) \operatorname{div}_X v(X, t) \\
&= \rho(X, t) \operatorname{div}_X v(X, t),
\end{aligned}
\]

and, as a consequence,

\[ \operatorname{div}_X v(X, t) = 0 \quad \text{in } \Omega(t). \]

In this note, we will not explicitly take into account incompressibility assumptions, but merely the conservation of mass in (6).

Remark 3. Concerning the top surface of the fluid, in the literature it is often assumed that fluid particles on this surface remain there forever (i.e., there is no “mixing effect” between the top surface of the sea and the rest of the water mass). This condition, in the notation of (3) and (4), would translate into

\[ \Phi_2^t(X) = h(\Phi_1^t(X), t), \]

as long as X = (x, y) and y = h(x, 0), where Φ^t(X) = (Φ_1^t(X), Φ_2^t(X)) ∈ R^n × R. Hence, in view of (4),

\[ 0 = \frac{d}{dt} \Big( h(\Phi_1^t(X), t) - \Phi_2^t(X) \Big) = v(\Phi^t(X), t) \cdot \big( \nabla_x h(\Phi_1^t(X), t), -1 \big) + \partial_t h(\Phi_1^t(X), t). \]

In this note, we do not need to assume this additional no mixing condition.
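The two examples of Remark 2 can be checked numerically by evaluating the left-hand sides of (6) and (13). The sketch below is only illustrative: the helper names are hypothetical, all derivatives are central finite differences, and we assume that X ranges in a space of dimension d with the exponent in the first example equal to d·t, so that div_X(−X) = −d balances the growth of the density.

```python
import math


def residuals(rho, vel, X, t, h=1e-5):
    """Return (mass, incompressibility) residuals at the point (X, t).

    mass             = d_t rho + div_X(rho * v)   (left-hand side of (6))
    incompressibility = d_t rho + grad_X(rho) . v  (left-hand side of (13))
    """
    dt_rho = (rho(X, t + h) - rho(X, t - h)) / (2.0 * h)
    v_here = vel(X, t)
    div_flux = 0.0
    grad_dot_v = 0.0
    for i in range(len(X)):
        Xp = list(X)
        Xm = list(X)
        Xp[i] += h
        Xm[i] -= h
        # i-th partial derivative of the i-th component of the flux rho * v
        div_flux += (rho(Xp, t) * vel(Xp, t)[i] - rho(Xm, t) * vel(Xm, t)[i]) / (2.0 * h)
        # i-th partial derivative of rho, paired with the i-th velocity component
        grad_dot_v += (rho(Xp, t) - rho(Xm, t)) / (2.0 * h) * v_here[i]
    return dt_rho + div_flux, dt_rho + grad_dot_v


def example_one(d):
    """First example: v = -X and rho = exp(d * t), with d the space dimension."""
    return (lambda X, t: math.exp(d * t)), (lambda X, t: [-c for c in X])


def example_two():
    """Second example: v = -X and rho = 1."""
    return (lambda X, t: 1.0), (lambda X, t: [-c for c in X])


if __name__ == "__main__":
    X, t = [0.3, -0.7, 1.1], 0.2
    for name, (rho, vel) in [("first", example_one(3)), ("second", example_two())]:
        mass, incompressibility = residuals(rho, vel, X, t)
        print(name, "example: (6) residual", mass, " (13) residual", incompressibility)
```

In the first example the residual of (6) vanishes up to discretization error while that of (13) does not; in the second example the roles are exchanged.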


3 Symmetry results

Now, we present some results for an elliptic problem related to the stationary case of the model introduced in Section 2. Besides assuming no dependence on time t, we also consider the simplification of a “flat ocean”, by taking b(x) = H > 0 and h(x) = 0 (recall the notation in (3)). This choice implies that we now consider the sea as Ω = {(x, y) ∈ R^n × R s.t. 0 ≤ y ≤ H}, and that we are “reversing the vertical direction”, in order to have the ocean surface on {y = 0}. This last simplification is done for pure mathematical convenience and does not affect the model. In our setting, we can use (11) in order to associate a velocity potential in the whole slab R^2 × [0, H] with a given datum on the surface of the ocean. Given the values of the velocity potential on {y = 0} and denoting such datum by u, we consider the velocity potential v in the whole slab R^2 × [0, H] that solves

\[
\begin{cases}
0 = \operatorname{div}(\rho V) = \operatorname{div}(\rho \nabla v) & \text{in } \mathbb{R}^2 \times (0, H),\\
0 = V_3 \big|_{y = H} = v_y \big|_{y = H} & \text{on } \mathbb{R}^2 \times \{ y = H \},\\
v \big|_{y = 0} = u & \text{on } \mathbb{R}^2 \times \{ y = 0 \}.
\end{cases}
\tag{14}
\]

In relation to water waves and in view of the discussion in Section 2, we are interested in the weighted vertical velocity on the surface of the ocean. Thus, the operator that we want to study is

\[ L_a u(x) := -\lim_{y \to 0} \rho(y) \, v_y(x, y). \tag{15} \]

When ρ := 1 and H → +∞ (which is the case of a fluid with constant density and an “infinitely deep sea”), the operator L_a is the square root of the Laplacian, see e.g. [10]. For finite values of H the operator described in (15) is nonlocal, but also not of purely fractional type, as we are going to see. In the following, we choose

\[ \rho(y) := y^a \tag{16} \]

as a density, where a ∈ (−1, 1). We notice that, in this case,

\[ \text{the limit as } H \to +\infty \text{ corresponds to the } s\text{-th root of the Laplacian,} \tag{17} \]

with s := (1 − a)/2, but for a finite value of H the problem is not of purely fractional type. From now on, we normalize the domain by setting H := 1. From a physical point of view, the choice in (16) corresponds to the situation in which the density of the fluid at a point depends only on the depth, in a power-like fashion, and it is constant in the horizontal directions. Possibly, some of the results that we present here can be extended to the case of a more general density ρ(y), and we intend to investigate the possibility of this generalization in a forthcoming work.


After generalizing the physical setting R^2 × [0, 1] to the mathematically interesting case R^n × [0, 1] (with coordinates x ∈ R^n and y ∈ [0, 1]), the extension problem in (14) reads

\[
\begin{cases}
\operatorname{div}(y^a \nabla v) = 0 & \text{in } \mathbb{R}^n \times (0, 1),\\
v_y(x, 1) = 0 & \text{on } \mathbb{R}^n \times \{ y = 1 \},\\
v(x, 0) = u(x) & \text{on } \mathbb{R}^n \times \{ y = 0 \}.
\end{cases}
\tag{18}
\]

Therefore, in light of (16), the Dirichlet to Neumann operator L_a in (15) is given by

\[ L_a u(x) = -\lim_{y \to 0} y^a v_y(x, y), \]

and, for a given nonlinearity f ∈ C^{1,γ}(R), we want to study the equation

\[ L_a u(x) = f(u) \quad \text{in } \mathbb{R}^n. \tag{19} \]

As a technical remark, we notice that, in order to have the operator L_a well defined for every smooth function u : R^n → R, we need to choose the extension v in (18) in a unique way. Indeed, for example, if v is a solution of (18) with a = 0, then so is the function v(x, y) + e^{πx/2} sin(πy/2). To overcome this problem and uniquely determine v in (18), we choose among all the possible solutions of (18) the one which is a minimizer of the energy

\[ \mathcal{E}_a(w) := \int_{\mathbb{R}^n \times (0, 1)} y^a |\nabla w(x, y)|^2 \, dx \, dy, \tag{20} \]

in the class of all the functions w ∈ W^{1,2}(R^n × (0, 1), y^a) such that w(x, 0) = u(x). Such a minimizer v exists, it is unique, due to the convexity of the energy functional in (20), and it solves the problem in (18); see [29] for all the details. With the setting in (18), the problem in (19) can be formulated in the following way:

\[
\begin{cases}
\operatorname{div}(y^a \nabla v) = 0 & \text{in } \mathbb{R}^n \times (0, 1),\\
v_y(x, 1) = 0 & \text{on } \mathbb{R}^n \times \{ y = 1 \},\\
-\lim_{y \to 0} y^a v_y = f(v) & \text{on } \mathbb{R}^n \times \{ y = 0 \},
\end{cases}
\tag{21}
\]

where f ∈ C^{1,γ}(R) with γ > 0. Problem (21) has a variational structure, since solutions of (21) correspond to critical points of the energy functional

\[ \mathcal{E}(v) := \frac{1}{2} \int_{\mathbb{R}^n \times (0, 1)} y^a |\nabla v(x, y)|^2 \, dx \, dy + \int_{\mathbb{R}^n \times \{ y = 0 \}} F(v(x, 0)) \, dx, \tag{22} \]

where the associated potential F is such that F′ = −f.
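The non-uniqueness example mentioned before (20) can be verified numerically. The following is a minimal sketch, assuming n = 1 and a = 0, so that the equation in (18) reduces to the Laplace equation in the strip; the helper names are hypothetical and all derivatives are finite differences. It checks that w(x, y) = e^{πx/2} sin(πy/2) is harmonic, vanishes on {y = 0} and has zero vertical derivative on {y = 1}, so that adding it to a solution of (18) produces another solution with the same datum u.

```python
import math


def w(x, y):
    """Candidate function exp(pi * x / 2) * sin(pi * y / 2), with x in R (n = 1)."""
    return math.exp(math.pi * x / 2.0) * math.sin(math.pi * y / 2.0)


def laplacian(func, x, y, h=1e-3):
    """Five-point finite-difference Laplacian in the variables (x, y)."""
    return (func(x + h, y) + func(x - h, y)
            + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / (h * h)


def d_dy(func, x, y, h=1e-5):
    """Central finite-difference derivative in the vertical variable y."""
    return (func(x, y + h) - func(x, y - h)) / (2.0 * h)


if __name__ == "__main__":
    x0, y0 = 0.3, 0.4
    print("Laplacian of w at an interior point:", laplacian(w, x0, y0))
    print("w on {y = 0}:", w(x0, 0.0))
    print("w_y on {y = 1}:", d_dy(w, x0, 1.0))
```

Since w grows exponentially in x, it has infinite energy in the sense of (20), which is consistent with the choice of selecting the extension through energy minimization.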


Since problem (21) is set in a slab of fixed height, it is technically convenient to localize the energy functional on cylinders. Namely, we define the cylinder

\[ C_R := B_R \times (0, 1), \tag{23} \]

where B_R ⊂ R^n denotes the ball of radius R centered at 0. Then, by (22), the localized energy functional associated to problem (21) reads

\[ \mathcal{E}_R(v) := \frac{1}{2} \int_{C_R} y^a |\nabla v(x, y)|^2 \, dx \, dy + \int_{B_R \times \{ y = 0 \}} F(v(x, 0)) \, dx. \]

In particular, the potential F is naturally defined up to an additive constant, hence, focusing on bounded solutions, we can also suppose that F ≥ 0. For this kind of problem, the model case is the nonlinearity f(t) := t − t^3, which arises in the study of phase transitions and is, up to the sign, the derivative of the double-well potential

\[ F(t) = \frac{1}{4} \left( 1 - t^2 \right)^2. \]

The usual notions of minimizer of the energy and of stable solution to problem (21) can be defined in a standard way. We say that a bounded function v ∈ C^1(R^n × (0, 1)) is a minimizer for (21) if

\[ \mathcal{E}_R(v) \le \mathcal{E}_R(w) \]

for every R > 0 and for every bounded competitor w such that v ≡ w on ∂B_R × (0, 1). We say that a bounded solution v of (21) is stable if the second variation of the energy is non-negative, i.e.

\[ \int_{\mathbb{R}^n \times [0, 1]} y^a |\nabla \xi|^2 \, dx \, dy - \int_{\mathbb{R}^n \times \{ y = 0 \}} f'(u) \, \xi^2 \, dx \ge 0 \]

for every function ξ ∈ C_0^1(R^n × [0, 1]). Clearly, if v is a minimizer for (21) then, in particular, it is a stable solution. Another important subclass of stable solutions that we consider in this paper is given by the monotone solutions of (21). We say that a solution v of (21) is monotone if it is strictly monotone in one horizontal direction, say ∂_{x_n} v > 0. For this kind of problem, it is possible to prove that monotone solutions are stable using a nonvariational characterization of stability (see for example Lemma 3.1 in [13] for all the details). See also [21] for a complete introduction to stable solutions in elliptic PDEs. Problem (21) was initially studied by de la Llave and the third author in [16] with constant density, so with a = 0. In particular, they proved a Liouville theorem that assures the one-dimensional symmetry of monotone solutions on the trace, provided that a suitable energy estimate for the functional associated to the problem holds true. Since this energy estimate in dimension n = 2 is a direct consequence of a


classical gradient bound, they obtain that monotone solutions of (21) with a = 0 depend on only one horizontal variable if n = 2. We now describe some recent symmetry and rigidity results for problem (21) in the light of a long-lasting line of investigation that was opened by a celebrated conjecture by Ennio De Giorgi.

3.1 Symmetry properties for the Allen-Cahn equation

One of the main interests in proving the one-dimensional symmetry of monotone solutions comes from a conjecture formulated by Ennio De Giorgi for the classical Allen-Cahn equation. Indeed, in 1979 De Giorgi posed the following question.

Conjecture 1. Let u be a bounded and smooth solution of the Allen-Cahn equation

\[ -\Delta u = u - u^3 \quad \text{in } \mathbb{R}^n, \]

such that ∂_{x_n} u > 0. Is it true that, if n ≤ 8, then u is one-dimensional?

A heuristic motivation of the conjecture can be formulated in light of the work of Modica and Mortola [30]. Indeed, they proved that a proper rescaling of the energy functional associated to the Allen-Cahn equation Γ-converges to the perimeter functional, as the rescaling parameter goes to zero. This means that a proper rescaling of the minimizers of the Allen-Cahn equation converges to characteristic functions of sets of minimal perimeter. The threshold dimension n = 8 comes from the fact that super-level sets of monotone functions are expected to be epigraphs (though this is a tricky point, see e.g. formula (5) in [22]), and minimal graphs are flat if n − 1 ≤ 7. For a complete discussion of minimal surfaces, see the illuminating monograph [28]. Summing up, the above heuristic argument would give that, at least in dimension n ≤ 8, if we look at monotone solutions “from very far” (through a rescaling), their level sets are close to hyperplanes. The question in Conjecture 1 asks if, for this to hold, the level sets of the function must be necessarily parallel hyperplanes.

The conjecture of De Giorgi remained unanswered in every dimension n for almost twenty years. It was proved to hold if n = 2 by Ghoussoub and Gui [26] and by Berestycki, Caffarelli and Nirenberg [2], and if n = 3 by Ambrosio and Cabré [1]. Regarding dimensions 4 ≤ n ≤ 8, Savin proved the conjecture in [31] by assuming the following additional hypothesis about the limits in the monotone direction:

\[ \lim_{x_n \to \pm\infty} u(x', x_n) = \pm 1. \tag{24} \]

Condition (24) can be weakened by assuming two-dimensional symmetry of the profiles at infinity, see [23]. More precisely, a number of symmetry results hold true under appropriate assumptions of geometric type. Without claiming to be exhaustive, we mention, for example the following results from [23]:


Symmetry from the profiles. Let −Δu = u − u^3 in R^n, with ∂_{x_n} u > 0. Let

\[ \underline{u}(x') := \lim_{x_n \to -\infty} u(x', x_n) \quad \text{and} \quad \overline{u}(x') := \lim_{x_n \to +\infty} u(x', x_n). \tag{25} \]

Then:
• If both $\underline{u}$ and $\overline{u}$ depend on (at most) two Euclidean variables, then $\underline{u}$ is identically −1 and $\overline{u}$ is identically +1; if also n ≤ 8, then u is one-dimensional;
• If either $\underline{u}$ or $\overline{u}$ depends on (at most) two Euclidean variables and n ≤ 4, then u is one-dimensional.

Symmetry from level sets being graphs. Let −Δu = u − u^3 in R^n, with ∂_{x_n} u > 0, and let the notation in (25) hold true. Assume that one level set of u is a graph in the nth Euclidean direction. Then $\underline{u}$ is identically −1 and $\overline{u}$ is identically +1; if also n ≤ 8, then u is one-dimensional.

Symmetry for monotone minimizers. Let u be a local minimizer of the energy functional

\[ \int \frac{1}{2} |\nabla u(x)|^2 + \frac{1}{4} \left( 1 - u^2(x) \right)^2 dx, \]

and assume that ∂_{x_n} u > 0. Suppose that n ≤ 8. Then u is one-dimensional.

Symmetry for minimizers with uniform limits. Let u be a local minimizer of the energy functional

\[ \int \frac{1}{2} |\nabla u(x)|^2 + \frac{1}{4} \left( 1 - u^2(x) \right)^2 dx, \]

and assume that either

\[ \lim_{x_n \to -\infty} u(x', x_n) = -1 \quad \text{or} \quad \lim_{x_n \to +\infty} u(x', x_n) = +1, \]

uniformly for x′ ∈ R^{n−1}. Then u is one-dimensional.

As a counterpart of the results giving positive answers to Conjecture 1 (possibly under additional assumptions), del Pino, Kowalczyk and Wei provided in [17] an example of a monotone solution to the Allen-Cahn equation in dimension n = 9 which is not one-dimensional. In this way, they proved that dimension n = 8 in Conjecture 1 is the optimal one. We refer to [12, 22] for more detailed surveys on topics related to Conjecture 1.

3.2 Symmetry properties for the fractional Allen-Cahn equation

The fractional analogue of Conjecture 1 can be formulated as follows:

Conjecture 2. Let s ∈ (0, 1) and u be a bounded and smooth solution of the fractional Allen-Cahn equation

\[ (-\Delta)^s u = u - u^3 \quad \text{in } \mathbb{R}^n, \tag{26} \]

such that ∂_{x_n} u > 0. Is it true that, if n is sufficiently small, then u is one-dimensional?

This question is also motivated by an analogue in the fractional setting of the Γ-convergence result by Modica and Mortola. Indeed, the third author and Savin proved in [34] that a proper rescaling of the energy associated to (26) Γ-converges to the classical perimeter if s ∈ [1/2, 1) and to the fractional perimeter if s ∈ (0, 1/2). The fractional perimeter was introduced by Caffarelli, Roquejoffre and Savin in [9], and, without going into the details, can be thought of as a nonlocal version of the classical perimeter, counting the interactions between points which lie in the two separated sides of the boundary of the set. As in the classical case, one could relate, at least at the level of motivations, the validity of Conjecture 2 to the regularity and rigidity properties of the minimizers of the limit energy functional, namely to the classical minimal surfaces when s ∈ [1/2, 1), and to the nonlocal minimal surfaces when s ∈ (0, 1/2). With respect to this, we recall that nonlocal minimal surfaces are known to be smooth only in dimension 2 (see [35]) and up to dimension 7 provided that s ∈ (1/2 − ε0, 1/2) and ε0 is sufficiently small (see [11]). Nonlocal minimal surfaces that are entire graphs are known to be necessarily hyperplanes only in dimension 2 and 3, and up to dimension 8 provided that s ∈ (1/2 − ε0, 1/2) and ε0 is sufficiently small (see [25]). Till now, no singular nonlocal minimal surface is known; see however [14] for the construction of a singular cone in dimension 7 which is a stable critical point of the fractional perimeter when s is sufficiently small. Of course, this lack of knowledge for the nonlocal minimal surfaces (when compared to the classical minimal surfaces) provides a series of conceptual difficulties when dealing with Conjecture 2, especially in the regime s ∈ (0, 1/2).

The problem posed by Conjecture 2 was solved in dimension n = 2 by Cabré and Solà-Morales in [8] for s = 1/2, and then by Cabré, Sire and the third author in [7, 36] for every s ∈ (0, 1). A positive answer in dimension n = 3 was given by Cabré and Cinti in [4] and [5] in the cases s = 1/2 and s ∈ (1/2, 1), respectively. Regarding the strongly nonlocal regime, namely when s ∈ (0, 1/2), recently the conjecture has been proved in dimension n = 3 by Farina and the first and the third authors in [19] (using an improvement of flatness result by [20]) and by Cabré, Cinti and Serra in [6] (by a different approach which relies on some sharp energy estimates and a blow-down convergence result for stable solutions). Very recently, Figalli and Serra proved in [24] Conjecture 2 to be true for s = 1/2 and n = 4 (also providing one-dimensional symmetry of stable solutions in dimension n = 3). Concerning higher dimensions, Savin proved in [32, 33] the conjecture for 4 ≤ n ≤ 8 and s ∈ [1/2, 1) under the additional assumption (24). Moreover, in [20] it has been proved that Conjecture 2 is true in dimensions 4 ≤ n ≤ 8 if s ∈ (1/2 − ε0, 1/2), for some ε0 sufficiently small, under the additional assumption (24). We also recall that, similarly to what happens in the classical case, it is possible to obtain one-dimensional symmetry from the geometry of the profiles of the monotone solutions, defined in (25). More precisely, it has been proved in [19] that monotone solutions with two-dimensional limit profiles are necessarily one-dimensional in dimension n ≤ 8, as long as s ∈ (1/2 − ε0, 1/2), for a sufficiently small ε0 ∈ (0, 1/2). Besides these results, Conjecture 2 is still open in its generality, and the critical dimension might depend on the fractional parameter s.

3.3 Symmetry properties for the water wave problem

Since, in our framework, we are dealing with a generalization of fractional Laplace operators, which are attained in the limit according to (17), a natural counterpart of Conjecture 2 is the following one:

Conjecture 3. Let a ∈ (−1, 1) and u be a bounded and smooth solution of the fractional Allen-Cahn equation

\[ L_a u = u - u^3 \quad \text{in } \mathbb{R}^n, \]

such that ∂_{x_n} u > 0. Is it true that, if n is sufficiently small, then u is one-dimensional?

Conjecture 3 is related to, but structurally different from, Conjecture 2. As a matter of fact, to point out the differences between problem (19)-(21) treated in these notes and its analogue for the fractional Laplacian, we consider the Fourier transform of the Dirichlet to Neumann operator L_a. It can be computed as

\[ \widehat{L_a(u)}(\xi) = c_1(s)\,\frac{J_{1-s}(-i|\xi|)}{J_{s-1}(-i|\xi|)}\,|\xi|^{2s}\,\widehat{u}(\xi), \]

where J_m(·) is the Bessel function of the first kind of order m, and c_1(s) is a constant depending only on s ∈ (0, 1). As customary, the symbol \widehat{u} denotes the Fourier transform of u. Therefore, the operator L_a can be seen as a Fourier operator with symbol

\[ S_s(\xi) := c_1(s)\,\frac{J_{1-s}(-i|\xi|)}{J_{s-1}(-i|\xi|)}\,|\xi|^{2s}. \]

The symbol S_s(ξ) was already known in [3, 16] in the special case s = 1/2 as

\[ S_{1/2}(\xi) = \frac{e^{|\xi|} - e^{-|\xi|}}{e^{|\xi|} + e^{-|\xi|}}\,|\xi| \]

and it has been computed later by the second and third author in [29] for every fractional parameter s ∈ (0, 1). By evaluating the limits of S_s(ξ) as |ξ| goes to zero and infinity, we observe that

\[ S_s(\xi) \sim |\xi|^2 \ \text{ as } |\xi| \to 0; \qquad S_s(\xi) \sim |\xi|^{2s} \ \text{ as } |\xi| \to +\infty. \tag{27} \]
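The two regimes in (27) can be checked numerically in the explicit case s = 1/2, where the symbol quoted above reduces to |ξ| tanh|ξ|. The following is only a minimal sketch: the function names are chosen for this illustration and do not come from [16] or [29], and only the case s = 1/2 is covered.

```python
import math


def symbol_half(xi: float) -> float:
    """S_{1/2}(xi) = |xi| * (e^|xi| - e^-|xi|) / (e^|xi| + e^-|xi|) = |xi| * tanh|xi|."""
    r = abs(xi)
    return r * math.tanh(r)


def ratio_low(xi: float) -> float:
    """S_{1/2}(xi) / |xi|^2, expected to approach 1 as |xi| -> 0."""
    return symbol_half(xi) / (xi * xi)


def ratio_high(xi: float) -> float:
    """S_{1/2}(xi) / |xi|^{2s} with s = 1/2, expected to approach 1 as |xi| -> +infinity."""
    return symbol_half(xi) / abs(xi)


if __name__ == "__main__":
    for xi in (1e-3, 1e-2, 1e-1):
        print(f"xi = {xi:8.3g}   S/|xi|^2 = {ratio_low(xi):.8f}")
    for xi in (5.0, 20.0, 50.0):
        print(f"xi = {xi:8.3g}   S/|xi|   = {ratio_high(xi):.8f}")
```

Both ratios tend to 1, which matches the Laplacian-like behaviour at small frequencies and the half-Laplacian-like behaviour at high frequencies.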


This fact is already evident in the simpler case s = 1/2, but it can be shown also in the general case s ∈ (0, 1) (see again [29] for all the details). To better understand the implications of this behaviour, we recall that |ξ|^2 is the symbol of the classical Laplacian, and that the fractional Laplacian can also be written in the Fourier setting as

\[ \widehat{(-\Delta)^s u}(\xi) = |\xi|^{2s}\,\widehat{u}(\xi), \]

see for example [18]. Looking at the asymptotics (27), it becomes evident that the operator L_a is not of purely fractional type: it shows a nonlocal behaviour for high frequencies, but it becomes similar to the Laplacian for small frequencies.

In this setting, Conjecture 3 was first addressed by de la Llave and the third author in [16], for the special case a = 0. As mentioned above, their main result is a Liouville theorem, which gives one-dimensional symmetry of monotone solutions under an assumption about the growth of the Dirichlet energy of the solution. In this way, they establish Conjecture 3 for a = 0 and n = 2 (see in particular Theorem 1 in [16]). The results in [16] have been extended in [13] from monotone to stable solutions, also considering all the fractional parameters a ∈ (−1, 1) and not only a = 0. In this setting, the result in [13] reads as follows.

Theorem 1. Let f ∈ C^{1,γ}(R), with γ > max{0, −a}, and let v be a bounded and stable solution of (21). Suppose that there exists C > 0 such that

\[ \int_{\mathcal{C}_R} y^a\,|\nabla_x v(x,y)|^2\,dx\,dy \le C R^2 \tag{28} \]

for any R ≥ 2, where the notation in (23) has been used. Then, there exist v_0 : R × (0, 1) → R and ω ∈ S^{n−1} such that

\[ v(x,y) = v_0(\omega\cdot x, y) \quad \text{for any } (x,y) \in \mathbb{R}^n \times (0,1). \]

In particular, the trace u of v on {y = 0} can be written as u(x) = u_0(ω · x). Finally, either u_0' > 0 or u_0' ≡ 0.

Remark 4. For this kind of elliptic problems, it is a standard fact that bounded solutions have bounded gradients, see for example [27]. For this reason, if we assume n = 2, then hypothesis (28) is trivially verified by any bounded stable solution. Therefore, we deduce that bounded stable solutions of (21) are one-dimensional on the trace if n = 2. In particular, this implies the validity of Conjecture 3 in dimension n = 2, for all a ∈ (−1, 1), as a corollary of Theorem 1.

In [13], Conjecture 3 is also addressed when n = 3. For this, the strategy is based on energy estimates and the use of Theorem 1. Namely, in [13] the following result is proved:


Theorem 2 (Energy estimate for minimizers). Let f ∈ C^{1,γ}(R), with γ > max{0, −a}, and let v be a bounded minimizer for problem (21). Then, we have

\[ E_R(v) = \frac12 \int_{\mathcal{C}_R} y^a\,|\nabla v|^2\,dx\,dy + \int_{B_R\times\{y=0\}} F(v)\,dx \le C R^{n-1}, \tag{29} \]

for any R ≥ 2, where the notation in (23) has been used.

We point out that (29) holds in general for minimizers of the energy associated to problem (21) in every dimension n, but the application to symmetry problems usually becomes relevant only in dimension n ≤ 3.

Let us now give a brief look at the proof of Theorem 2, which is based on a direct comparison. Indeed, since we are assuming that v is a minimizer, for every admissible competitor w it holds that E_R(v) ≤ E_R(w). We say that a competitor w is admissible if w ≡ v on the lateral boundary ∂B_R × (0, 1). The key point for the proof is defining a competitor \bar{w} constantly equal to the minimum of the potential F in a cylinder of radius R − 1, and then cutting it off in order to make it admissible. In such a way, one is able to estimate the energy of \bar{w} in a cylinder of radius R and obtain (29). A strategy of this type has been used also in [1] to solve the classical De Giorgi conjecture in dimension n = 3.

Restricting to the case n = 3, it is possible to prove the same estimate as in Theorem 2 for bounded solutions whose traces on {y = 0} are monotone in some direction, according to the following result:

Theorem 3 (Energy estimate for monotone solutions for n = 3). Let f ∈ C^{1,γ}(R), with γ > max{0, −a}, and let v be a bounded solution of (21) with n = 3 such that its trace u(x) = v(x, 0) is monotone in some direction. Then, we have

\[ E_R(v) = \frac12 \int_{\mathcal{C}_R} y^a\,|\nabla v|^2\,dx\,dy + \int_{B_R\times\{y=0\}} F(v)\,dx \le C R^{2}, \]

for any R ≥ 2, where the notation in (23) has been used.

We stress the fact that this energy estimate holds for monotone solutions of (21) only if we are in the case n = 3. As mentioned above, this is due to the proof, and in particular to the fact that we know from Remark 4 that stable solutions enjoy rigidity properties when we are in one dimension less, i.e. when n = 2.

Let us briefly sketch the proof of Theorem 3. The interested reader can find all the details in Section 5 of [13]. First, it is necessary to define the two limit profiles of the monotone solution v. Indeed, since v is monotone in one direction, say ∂_{x_3} v > 0, we can define

\[ \underline{v}(x_1,x_2,y) := \lim_{x_3\to-\infty} v(x,y), \qquad \overline{v}(x_1,x_2,y) := \lim_{x_3\to+\infty} v(x,y). \]

These functions are well defined thanks to the monotonicity hypothesis, they are solutions of (21) in one dimension less, so with n = 2, and in particular they are stable. Here the dimension plays a key role, since we can use Theorem 1 and deduce that the lower profile \underline{v} and the upper profile \overline{v} are one-dimensional functions on the trace. From the existence of such solutions, it is possible to characterize the potential F. This is fundamental in the proof.

On the other hand, a monotone solution v is a minimizer of the energy in the class

\[ \mathcal{A}_v := \big\{ w \in H^1(\mathcal{C}_R; y^a) \ \text{such that } w = v \text{ on } \partial B_R \times (0,1) \ \text{and } \underline{v} \le w \le \overline{v} \text{ in } \mathcal{C}_R \big\}. \]

For the detailed proof of this fact in the setting of water waves, see Lemma 5.6 in [13]. At this point, one can use the characterization of the potential F provided by the previous steps in order to show that the competitor \bar{w} used in the proof of Theorem 2 belongs to the class \mathcal{A}_v. Since the energy of \bar{w} in \mathcal{C}_R can be bounded by C R^2, this finishes the proof of Theorem 3.

The energy estimates in Theorems 2 and 3 give as a corollary the following rigidity result for minimizers and for monotone solutions in dimension n = 3. Indeed, in this case hypothesis (28) of Theorem 1 is fulfilled and the application is straightforward.

Corollary 1. Let f ∈ C^{1,γ}(R), with γ > max{0, −a}, and let n = 3. Assume that one of the two following conditions is satisfied:
• v is a bounded minimizer for problem (21);
• v is a bounded solution of (21) such that its trace u(x) = v(x, 0) is monotone in some direction.
Then, there exist v_0 : R × (0, 1) → R and ω ∈ S^2 such that

\[ v(x,y) = v_0(\omega\cdot x, y) \quad \text{for all } (x,y) \in \mathbb{R}^3 \times (0,1). \]

In particular, the trace u of v on {y = 0} can be written as u(x) = u_0(ω · x).

Hence, Corollary 1 establishes the validity of Conjecture 3 when n = 3. The case of dimension n ≥ 4 remains open.
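To make the statement that hypothesis (28) is fulfilled more concrete, here is a short worked computation. It is only a sketch, under two assumptions that are not spelled out in the passage above: that the cylinder in (23) is \mathcal{C}_R = B_R × (0, 1), as the lateral boundary ∂B_R × (0, 1) suggests, and that the potential is normalized so that F ≥ 0. The constant M stands for a bound on |∇_x v| as in Remark 4.

```latex
% Case n = 2 (Remark 4): bounded gradient, and y^a is integrable on (0,1) since a > -1.
\begin{align*}
\int_{\mathcal{C}_R} y^a\,|\nabla_x v|^2\,dx\,dy
  \;\le\; M^2\,|B_R| \int_0^1 y^a\,dy
  \;=\; \frac{\pi M^2}{1+a}\,R^2 .
\end{align*}

% Case n = 3 (Theorems 2 and 3): drop the nonnegative potential term in E_R(v).
\begin{align*}
\int_{\mathcal{C}_R} y^a\,|\nabla_x v|^2\,dx\,dy
  &\le \int_{\mathcal{C}_R} y^a\,|\nabla v|^2\,dx\,dy \\
  &= 2\Big( E_R(v) - \int_{B_R\times\{y=0\}} F(v)\,dx \Big) \\
  &\le 2\,E_R(v) \;\le\; 2C\,R^{2} .
\end{align*}
```

In both cases the left-hand side grows at most like R^2, which is the growth required in (28).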

Acknowledgement

The first author has been supported by the DECRA Project DE180100957 “PDEs, free boundaries and applications”. The first and third authors have been supported by the Australian Research Council Discovery Project DP170104880 “N.E.W. Nonlocal Equations at Work”. The second author has been supported by MINECO grant MTM2017-84214-C2-1-P and is part of the Catalan research group 2017 SGR 1392. Part of this work was carried out on the occasion of a very pleasant visit of the second author to the University of Western Australia, which we thank for the warm hospitality.

References

[1] L. Ambrosio and X. Cabré, Entire solutions of semilinear elliptic equations in R^3 and a conjecture of De Giorgi, J. Amer. Math. Soc. 13 (2000), no. 4, 725–739, DOI 10.1090/S0894-0347-00-00345-3. MR1775735
[2] H. Berestycki, L. Caffarelli, and L. Nirenberg, Further qualitative properties for elliptic equations in unbounded domains, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 25 (1997), no. 1-2, 69–94 (1998). Dedicated to Ennio De Giorgi. MR1655510
[3] C. Bucur and E. Valdinoci, Nonlocal diffusion and applications, Lecture Notes of the Unione Matematica Italiana, vol. 20, Springer, [Cham]; Unione Matematica Italiana, Bologna, 2016. MR3469920
[4] X. Cabré and E. Cinti, Energy estimates and 1-D symmetry for nonlinear equations involving the half-Laplacian, Discrete Contin. Dyn. Syst. 28 (2010), no. 3, 1179–1206, DOI 10.3934/dcds.2010.28.1179. MR2644786
[5] X. Cabré and E. Cinti, Sharp energy estimates for nonlinear fractional diffusion equations, Calc. Var. Partial Differential Equations 49 (2014), no. 1-2, 233–269, DOI 10.1007/s00526-012-0580-6. MR3148114
[6] X. Cabré, E. Cinti, and J. Serra, Stable nonlocal phase transitions, In preparation (2019).
[7] X. Cabré and Y. Sire, Nonlinear equations for fractional Laplacians II: Existence, uniqueness, and qualitative properties of solutions, Trans. Amer. Math. Soc. 367 (2015), no. 2, 911–941, DOI 10.1090/S0002-9947-2014-05906-0. MR3280032
[8] X. Cabré and J. Solà-Morales, Layer solutions in a half-space for boundary reactions, Comm. Pure Appl. Math. 58 (2005), no. 12, 1678–1732, DOI 10.1002/cpa.20093. MR2177165
[9] L. Caffarelli, J.-M. Roquejoffre, and O. Savin, Nonlocal minimal surfaces, Comm. Pure Appl. Math. 63 (2010), no. 9, 1111–1144, DOI 10.1002/cpa.20331. MR2675483
[10] L. Caffarelli and L. Silvestre, An extension problem related to the fractional Laplacian, Comm. Partial Differential Equations 32 (2007), no. 7-9, 1245–1260, DOI 10.1080/03605300600987306. MR2354493
[11] L. Caffarelli and E. Valdinoci, Regularity properties of nonlocal minimal surfaces via limiting arguments, Adv. Math. 248 (2013), 843–871, DOI 10.1016/j.aim.2013.08.007. MR3107529
[12] H. Chan and J. Wei, On De Giorgi’s conjecture: recent progress and open problems, Sci. China Math. 61 (2018), no. 11, 1925–1946, DOI 10.1007/s11425-017-9307-4. MR3864761
[13] E. Cinti, P. Miraglio, and E. Valdinoci, One-dimensional symmetry for the solutions of a three-dimensional water wave problem, J. Geom. Anal., to appear.
[14] J. Dávila, M. del Pino, and J. Wei, Nonlocal s-minimal surfaces and Lawson cones, J. Differential Geom. 109 (2018), no. 1, 111–175, DOI 10.4310/jdg/1525399218. MR3798717
[15] E. De Giorgi, Convergence problems for functionals and operators, Proceedings of the International Meeting on Recent Methods in Nonlinear Analysis (Rome, 1978), Pitagora, Bologna, 1979, pp. 131–188. MR533166
[16] R. de la Llave and E. Valdinoci, Symmetry for a Dirichlet-Neumann problem arising in water waves, Math. Res. Lett. 16 (2009), no. 5, 909–918, DOI 10.4310/MRL.2009.v16.n5.a13. MR2576707
[17] M. del Pino, M. Kowalczyk, and J. Wei, On De Giorgi’s conjecture in dimension N ≥ 9, Ann. of Math. (2) 174 (2011), no. 3, 1485–1569, DOI 10.4007/annals.2011.174.3.3. MR2846486
[18] E. Di Nezza, G. Palatucci, and E. Valdinoci, Hitchhiker’s guide to the fractional Sobolev spaces, Bull. Sci. Math. 136 (2012), no. 5, 521–573, DOI 10.1016/j.bulsci.2011.12.004. MR2944369
[19] S. Dipierro, A. Farina, and E. Valdinoci, A three-dimensional symmetry result for a phase transition equation in the genuinely nonlocal regime, Calc. Var. Partial Differential Equations 57 (2018), no. 1, Art. 15, 21, DOI 10.1007/s00526-017-1295-5. MR3740395
[20] S. Dipierro, J. Serra, and E. Valdinoci, Improvement of flatness for nonlocal phase transitions, Amer. J. Math. (2019).
[21] L. Dupaigne, Stable solutions of elliptic partial differential equations, Chapman & Hall/CRC Monographs and Surveys in Pure and Applied Mathematics, vol. 143, Chapman & Hall/CRC, Boca Raton, FL, 2011. MR2779463
[22] A. Farina and E. Valdinoci, The state of the art for a conjecture of De Giorgi and related problems, Recent progress on reaction-diffusion systems and viscosity solutions, World Sci. Publ., Hackensack, NJ, 2009, pp. 74–96. MR2528756
[23] A. Farina and E. Valdinoci, 1D symmetry for solutions of semilinear and quasilinear elliptic equations, Trans. Amer. Math. Soc. 363 (2011), no. 2, 579–609, DOI 10.1090/S0002-9947-2010-05021-4. MR2728579
[24] A. Figalli and J. Serra, On stable solutions for boundary reactions: a De Giorgi-type result in dimension 4 + 1, ArXiv e-prints (2017), available at 1705.02781.
[25] A. Figalli and E. Valdinoci, Regularity and Bernstein-type results for nonlocal minimal surfaces, J. Reine Angew. Math. 729 (2017), 263–273, DOI 10.1515/crelle-2015-0006. MR3680376
[26] N. Ghoussoub and C. Gui, On a conjecture of De Giorgi and some related problems, Math. Ann. 311 (1998), no. 3, 481–491, DOI 10.1007/s002080050196. MR1637919
[27] D. Gilbarg and N. S. Trudinger, Elliptic partial differential equations of second order, Classics in Mathematics, Springer-Verlag, Berlin, 2001. Reprint of the 1998 edition. MR1814364
[28] E. Giusti, Minimal surfaces and functions of bounded variation, Monographs in Mathematics, vol. 80, Birkhäuser Verlag, Basel, 1984. MR775682
[29] P. Miraglio and E. Valdinoci, Energy asymptotics of a Dirichlet to Neumann problem related to water waves, forthcoming.
[30] L. Modica and S. Mortola, Un esempio di Γ⁻-convergenza, Boll. Un. Mat. Ital. B (5) 14 (1977), no. 1, 285–299 (Italian, with English summary). MR0445362
[31] O. Savin, Regularity of flat level sets in phase transitions, Ann. of Math. (2) 169 (2009), no. 1, 41–78, DOI 10.4007/annals.2009.169.41. MR2480601
[32] O. Savin, Rigidity of minimizers in nonlocal phase transitions, Anal. PDE 11 (2018), no. 8, 1881–1900, DOI 10.2140/apde.2018.11.1881. MR3812860
[33] O. Savin, Rigidity of minimizers in nonlocal phase transitions II, ArXiv e-prints (2018), available at 1802.01710.
[34] O. Savin and E. Valdinoci, Γ-convergence for nonlocal phase transitions, Ann. Inst. H. Poincaré Anal. Non Linéaire 29 (2012), no. 4, 479–500, DOI 10.1016/j.anihpc.2012.01.006. MR2948285
[35] O. Savin and E. Valdinoci, Regularity of nonlocal minimal cones in dimension 2, Calc. Var. Partial Differential Equations 48 (2013), no. 1-2, 33–39, DOI 10.1007/s00526-012-0539-7. MR3090533
[36] Y. Sire and E. Valdinoci, Fractional Laplacian phase transitions and boundary reactions: a geometric inequality and a symmetry result, J. Funct. Anal. 256 (2009), no. 6, 1842–1864, DOI 10.1016/j.jfa.2009.01.020. MR2498561

Geometric properties of superlevel sets of semilinear elliptic equations in convex domains

François Hamel, Nikolai Nadirashvili and Yannick Sire

Abstract In this paper, we report on some recent results dealing with geometrical properties of solutions of some semilinear elliptic equations in bounded smooth convex domains. We investigate the quasiconcavity, i.e. the fact that the superlevel sets of a positive solution are convex or not. We actually construct a counterexample to this fact in two dimensions, showing that the solutions under consideration do not always inherit the convexity of the domain. We report on the results in [23].

François Hamel, Aix-Marseille Université, CNRS, Centrale Marseille, I2M, UMR 7373, 13453 Marseille, France
Nikolai Nadirashvili, Aix-Marseille Université, CNRS, Centrale Marseille, I2M, UMR 7373, 13453 Marseille, France
Yannick Sire, Johns Hopkins University, 21218 Baltimore, MD, USA

© Springer Nature Switzerland AG 2020. J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_16

1 Introduction and main result

This paper is concerned with some geometrical properties of real-valued solutions of semilinear elliptic equations

\[ \Delta u + f(u) = 0 \tag{1} \]

in bounded domains Ω ⊂ R^N, in dimension N = 2, with Dirichlet-type boundary conditions on ∂Ω. By domains, we mean non-empty open connected subsets of R^N. The domains Ω are assumed to be convex. One is interested in knowing how these geometrical properties of Ω are inherited by the solutions u, under some suitable boundary conditions, that is, how the shape of the solutions is influenced by the shape of the underlying domains. It is well-known that the convexity or the concavity of the solutions are too strong properties which are not true in general (see e.g. [32]). However, a typical question we address in this paper is the following one: assuming that Ω is convex and that u is a solution of (1) which is positive in Ω and vanishes on ∂Ω, is it true that the superlevel sets {x ∈ Ω ; u(x) > λ} of u are all convex? We prove that the answer to this question can be negative, that is, we show that the superlevel sets of some solutions u of problems of the type (1) are not all convex. Various examples of solutions with non-convex superlevel sets in convex rings have also been given in [23, 39]. Regarding problem (1) in convex domains, the example given in Theorem 1.1 below is the first one, to our knowledge.

Let us consider the semilinear elliptic problem

\[ \begin{cases} \Delta u + f(u) = 0 & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \\ u > 0 & \text{in } \Omega. \end{cases} \tag{2} \]

Throughout the paper, the function f : [0, +∞) → R is assumed to be locally Hölder continuous. The domains Ω are always assumed to be of class C^{2,α} (with α > 0, we then say that the domains Ω are smooth) and the solutions u are understood in the classical sense C^2(\overline{Ω}). The superlevel set {x ∈ Ω ; u(x) > 0} of a solution u of (2) is equal to the domain Ω, which is convex by assumption. A natural question is to know whether the superlevel sets {x ∈ Ω ; u(x) > λ} for λ ≥ 0 are all convex or not. If this is the case, u is called quasiconcave.

In his paper [37] (see Remark 3, page 268), P.-L. Lions writes that, in a convex domain Ω, “[he] believe[s] that [...] for general f, the [super]level sets of any solution u of [(2)] are convex”. There is indeed a vast literature containing some proofs of the above statement for various nonlinearities f. We here list some of the most classical references.
Firstly, Makar-Limanov [38] proved that, for the two-dimensional torsion problem, that is f(u) = 1 with N = 2, the solution u is quasiconcave, since √u is actually concave. Brascamp and Lieb [12] showed that, if f(u) = λu (λ is then necessarily the principal eigenvalue of the Laplacian with Dirichlet boundary condition), then the principal eigenfunction u is quasiconcave and more precisely it is log-concave, that is log u is concave. The proof uses the fact that log-concavity is preserved by the heat equation (but quasiconcavity is not in general, see [24]). When f(u) = λu^p with 0 < p < 1 and λ > 0, Keady [29] for N = 2 and Kennington [30] for N ≥ 2 proved that u^{(1−p)/2} is concave, whence u is quasiconcave. Many generalizations under more general assumptions on f and alternate proofs have been given. A possible strategy is to prove that g(u) is concave for some suitable increasing function g, by showing that

\[ g(u(tx + (1-t)y)) - t\,g(u(x)) - (1-t)\,g(u(y)) \ge 0 \quad \text{for all } (t,x,y) \in [0,1]\times\Omega\times\Omega \]


and by using the elliptic maximum principle or the preservation of concavity of g(u) by a suitable parabolic equation, see [15, 21, 27, 28, 30, 31, 32, 37]. Other strategies consist in studying the sign of the curvatures of the level sets of u or in proving that the Hessian matrix of g(u) for some suitable increasing g has a constant rank, see [3, 11, 14, 34, 36, 44]. Lastly, we refer to [4, 19] for further references using the quasiconcave envelope and singular perturbations arguments, and to the book of Kawohl [26] for a general overview.

The following result, which can be found in [23], is the first counterexample to the quasiconcavity of solutions u of (2) in convex domains Ω. We give its proof here.

Theorem 1.1 In dimension N = 2, there are some smooth bounded convex domains Ω and some C^∞ functions f : [0, +∞) → R such that f(s) ≥ 1 for all s ≥ 0 and for which problem (2) admits both a quasiconcave solution v and a solution u which is not quasiconcave.

Remark 1.2 It would be interesting to study the stability of the solution constructed in Theorem 1.1 in connection with results of Cabré and Chanillo [13]. We refer the reader to the remark at the end of this paper.

2 Counterexamples in convex domains and proof of the theorem

We construct explicit examples of bounded smooth convex two-dimensional domains Ω and of functions f for which problem (2) admits some non-quasiconcave solutions u. The construction is divided into five main steps. Firstly, we define a one-parameter family (Ω_a)_{a≥1} of more and more elongated stadium-like convex domains. Secondly, for each value of the parameter a ≥ 1, we solve a variational problem in H_0^1(Ω_a) with a nonlinear constraint, whose solution u_a solves an elliptic equation of the type (2) in Ω_a with some function f_a. Thirdly, we prove some a priori estimates for the superlevel sets of the functions u_a. Next, we compare u_a with a one-dimensional profile in Ω_a when a is large enough. Lastly, we show that the superlevel sets of the functions u_a cannot be all convex when a is large enough.

As a preliminary step, let us fix a C^∞ function g : R → [0, 1] such that

\[ g = 0 \ \text{on } (-\infty, 1], \quad g = 1 \ \text{on } [2, +\infty) \quad \text{and} \quad g' \ge 0 \ \text{on } \mathbb{R}. \tag{3} \]

The function g is fixed throughout the proof.


Step 1: construction of a family of smooth bounded convex domains (Ω_a)_{a≥1}.

We first introduce a family of stadium-like smooth convex domains. Let ϕ : [−1, 1] → R be a fixed continuous nonnegative concave even function such that ϕ(±1) = 0. For a ≥ 1, we define

\[ \Omega_a = \big\{ (x,y) \in \mathbb{R}^2 ;\ -a - \varphi(y) < x < a + \varphi(y),\ -1 < y < 1 \big\} \tag{4} \]

and we choose ϕ once for all so that Ω_1 (and then Ω_a for every a ≥ 1) be of class C^{2,α} with α > 0 (this means that ϕ is of class C^{2,α}_{loc}(−1, 1) and that ϕ satisfies some compatibility conditions at ±1). The C^{2,α} bounded domains Ω_a for a ≥ 1 are all convex and axisymmetric with respect to both axes {x = 0} and {y = 0}, see Figure 1.

Fig. 1 The convex stadium-like domain Ω_a
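As a rough illustration of definition (4), the sketch below tests membership in Ω_a and samples random pairs of points to look for violations of midpoint convexity. The cap profile `phi` is a hypothetical choice made for this example only (a half-disc profile): it is continuous, nonnegative, concave and even with phi(±1) = 0, but it yields a classical stadium, which is not C^{2,α}, so it does not satisfy the smoothness requirement imposed on ϕ in the text.

```python
import math
import random


def phi(y: float) -> float:
    """Hypothetical cap profile: nonnegative, concave, even, with phi(+-1) = 0."""
    return math.sqrt(max(0.0, 1.0 - y * y))


def in_omega(x: float, y: float, a: float) -> bool:
    """Membership test for Omega_a as in (4): -a - phi(y) < x < a + phi(y), -1 < y < 1."""
    if not -1.0 < y < 1.0:
        return False
    return -a - phi(y) < x < a + phi(y)


def sample_point(a: float, rng: random.Random) -> tuple:
    """Rejection-sample a point of Omega_a from its bounding box."""
    while True:
        x = rng.uniform(-a - 1.0, a + 1.0)
        y = rng.uniform(-1.0, 1.0)
        if in_omega(x, y, a):
            return x, y


def midpoint_convexity_violations(a: float, trials: int = 2000, seed: int = 0) -> int:
    """Count sampled pairs of points of Omega_a whose midpoint falls outside Omega_a."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x1, y1 = sample_point(a, rng)
        x2, y2 = sample_point(a, rng)
        if not in_omega(0.5 * (x1 + x2), 0.5 * (y1 + y2), a):
            bad += 1
    return bad


if __name__ == "__main__":
    for a in (1.0, 3.0, 10.0):
        print(a, midpoint_convexity_violations(a))
```

A random test of this kind cannot prove convexity; here convexity follows from the concavity of ϕ, and the sampling only serves as a sanity check of the definition.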

Our goal is to show that the conclusion of Theorem 1.1 holds with these convex domains Ω_a and some functions f_a, when a is large enough.

Step 2: a constrained variational problem in Ω_a.

In this step, we fix a parameter a ≥ 1. We construct a C^{2,α}(\overline{Ω_a}) function u_a as a minimizer of a constrained variational problem in Ω_a. Let I_a be the functional defined in H_0^1(Ω_a) by

\[ I_a(u) = \frac12 \int_{\Omega_a} |\nabla u|^2 - \int_{\Omega_a} u, \quad u \in H_0^1(\Omega_a). \]

It is well-known that this functional has a unique minimizer in H_0^1(Ω_a), which is the classical C^{2,α}(\overline{Ω_a}) solution v_a of the torsion problem

\[ \begin{cases} \Delta v_a + 1 = 0 & \text{in } \Omega_a, \\ v_a = 0 & \text{on } \partial\Omega_a. \end{cases} \tag{5} \]


It follows from the strong maximum principle and the definition of Ωa that 0 < va (x, y)
0 such that u_a ≥ m a.e. in Ω_a, contradicting the fact that u_a ∈ H_0^1(Ω_a) has a zero trace on ∂Ω_a. Hence, g'(u_a) cannot be the zero function and the differential of the map H_0^1(Ω_a) ∋ u ↦ ∫_{Ω_a} g(u) is not zero at u_a. From the Euler-Lagrange formulation and elliptic regularity theory, any such minimizer u_a is then a classical C^{2,α}(\overline{Ω_a}) solution of an equation of the type

\[ \begin{cases} \Delta u_a + f_a(u_a) = 0 & \text{in } \Omega_a, \\ u_a = 0 & \text{on } \partial\Omega_a, \end{cases} \tag{7} \]

where

\[ f_a(s) = 1 + \mu_a\, g'(s) \quad \text{for } s \in \mathbb{R} \]

and μ_a ∈ R is a Lagrange multiplier. Observe that the function f_a is of class C^∞(R). Furthermore, Δ(u_a − v_a) = −μ_a g'(u_a) has a constant sign in Ω_a, since g' is nonnegative. As a consequence of the maximum principle, the function u_a − v_a itself has a constant sign in Ω_a. But

\[ \max_{\overline{\Omega_a}} u_a > 1 \tag{8} \]


because of (3) and by definition of U_a. Therefore, from (6), the function v_a cannot majorize u_a. The strong maximum principle finally implies that

\[ 0 < v_a(x,y) < u_a(x,y) \quad \text{for all } (x,y) \in \Omega_a. \tag{9} \]

Thus, the function u_a is a classical solution of the problem (2) in Ω_a with the function f_a. Notice also that the sign of Δ(u_a − v_a) is therefore nonpositive and, since u_a and v_a are not identically equal, one has μ_a > 0. In particular,

\[ f_a(s) \ge 1 \quad \text{for all } s \in \mathbb{R}. \tag{10} \]

On the other hand, since f_a(s) = 1 for all s ≥ 2 because of (3), the maximum principle also yields

\[ u_a(x,y) < \frac{1 - y^2}{2} + 2 \quad \text{for all } (x,y) \in \Omega_a. \tag{11} \]

The uniqueness of the minimizer u_a of I_a in the set U_a is not clear, and is anyway not needed in the sequel. However, we point out an important geometrical property fulfilled by u_a, which will be used in the next step. Namely, since Ω_a is convex and symmetric with respect to the axes {x = 0} and {y = 0}, it follows from [18] that u_a is even in x and y and is decreasing with respect to |x| and |y|.

In the sequel, we are going to show that, for a large enough, the conclusion of Theorem 1.1 holds with Ω_a, f_a and u_a, that is, the minimizers u_a have some non-convex superlevel sets. Notice that f_a satisfies (10), as stated in Theorem 1.1. Before going further on, we also point out that the solution v_a of the torsion problem (5) also solves the same equation (2) as u_a, with f_a in Ω_a, because of (6) and the fact that f_a = 1 on [0, 1] ⊃ [0, 1/2] due to (3). Therefore, problem (2) with f_a in Ω_a admits the solution v_a, which is always quasiconcave by [38] applied to (5), whereas the solutions u_a will be proved to be non-quasiconcave for a large.
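For orientation, the form f_a = 1 + μ_a g' of the nonlinearity can be recovered by a formal Lagrange-multiplier computation. The sketch below assumes that the constraint set is U_a = {u ∈ H_0^1(Ω_a) : ∫_{Ω_a} g(u) = 1}, which is how U_a is used in the proof of Lemma 2.1, and that ψ denotes an arbitrary test function in C_c^∞(Ω_a) (a symbol introduced here for illustration). Regularity questions are ignored.

```latex
\begin{align*}
\frac{d}{dt}\Big|_{t=0} I_a(u_a + t\psi)
  &= \int_{\Omega_a} \nabla u_a \cdot \nabla \psi - \int_{\Omega_a} \psi, \\
\frac{d}{dt}\Big|_{t=0} \int_{\Omega_a} g(u_a + t\psi)
  &= \int_{\Omega_a} g'(u_a)\,\psi, \\
\int_{\Omega_a} \nabla u_a \cdot \nabla \psi - \int_{\Omega_a} \psi
  &= \mu_a \int_{\Omega_a} g'(u_a)\,\psi
  \quad\Longrightarrow\quad
  \Delta u_a + 1 + \mu_a\, g'(u_a) = 0 \ \text{ in } \Omega_a .
\end{align*}
```

Subtracting the torsion equation Δv_a + 1 = 0 from (5) then gives Δ(u_a − v_a) = −μ_a g'(u_a), the identity used to compare u_a and v_a.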

Step 3: a priori estimates of the size of a superlevel set of the functions u_a.

In this step, we study the location of the superlevel sets

\[ \omega_a = \big\{ (x,y) \in \Omega_a ;\ u_a(x,y) > 1 \big\} \]

of the minimizers u_a of I_a in U_a when a is large. From (8) and the remarks of the previous step, the sets ω_a are non-empty open sets, they are all symmetric with respect to the axes {x = 0} and {y = 0}, and they are convex with respect to both variables x and y. The key point in this step is to show a uniform control of the size of the sets ω_a. We first begin with a bound in the x-direction, meaning that the sets ω_a are not too elongated.

Lemma 2.1 There exists a constant C_x > 0 such that

\[ 0 \le \sup_{(x,y)\in\omega_a} |x| < C_x \tag{12} \]

for any a ≥ 1 and for any minimizer u_a of I_a in U_a.

Fig. 2 The set ω_a where u_a > 1

Proof. The proof is divided into two main steps. We first estimate from above the quantities I_a(u_a) by introducing a suitable test function in the set U_a, which is not too far from the one-dimensional function y ↦ (1 − y^2)/2. Then, we estimate I_a(u_a) from below by observing that if u_a(x, 0) is larger than 1 then the contribution of u_a(x, ·) to I_a(u_a) in the section Ω_a ∩ ({x} × R) will be uniformly larger than that of the minimizer y ↦ (1 − y^2)/2. This eventually provides a control of the size of such points x and then of the size of ω_a, independently of a.

Throughout the proof, one can assume without loss of generality that a is any real number such that a ≥ 2 (since sup_{(x,y)∈Ω_a} |x| ≤ a + ‖ϕ‖_{L^∞(−1,1)} for all a ≥ 1 by the definition (4) of Ω_a). We consider any minimizer u_a of the functional I_a in the set U_a and we set

\[ x_a = \sup_{(x,y)\in\omega_a} |x|. \tag{13} \]

Let us first bound I_a(u_a) from above by using the minimality of u_a and comparing I_a(u_a) with the value of I_a at some suitably chosen test function. Let w be a fixed C^∞(R^2) nonnegative function such that w = 0 in R^2 \ (−1, 1)^2 and w > 0 in [−2/3, 2/3]^2. The function w is independent of a. Let φ_0 be the H_0^1(−1, 1) function defined by

\[ \phi_0(y) = \frac{1 - y^2}{2} \quad \text{for all } y \in [-1, 1]. \tag{14} \]

We point out that φ_0 is the unique minimizer in H_0^1(−1, 1) of the functional J defined by

\[ J(\phi) = \frac12 \int_{-1}^{1} \phi'(y)^2\,dy - \int_{-1}^{1} \phi(y)\,dy, \quad \phi \in H_0^1(-1,1). \tag{15} \]
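The minimality of φ_0 for J can be illustrated numerically. The sketch below evaluates J with a composite Simpson rule and compares J(φ_0) with the value at a perturbed function φ_0 + ε h, where h(y) = cos(πy/2) is a hypothetical perturbation chosen for this example (it vanishes at ±1). Since φ_0'' = −1, one expects J(φ_0) = −1/3 and J(φ_0 + ε h) − J(φ_0) = (ε²/2) ∫ h'² = ε² π²/8, which is the kind of quadratic gap behind the coercivity argument used later in the proof.

```python
import math


def simpson(f, lo: float, hi: float, n: int = 2000) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def J(phi, dphi) -> float:
    """J(phi) = 1/2 * int phi'(y)^2 dy - int phi(y) dy over (-1, 1), as in (15)."""
    return 0.5 * simpson(lambda y: dphi(y) ** 2, -1.0, 1.0) - simpson(phi, -1.0, 1.0)


def phi0(y: float) -> float:
    """The one-dimensional profile (14)."""
    return (1.0 - y * y) / 2.0


def dphi0(y: float) -> float:
    """Derivative of phi0."""
    return -y


def excess(eps: float) -> float:
    """J(phi0 + eps * h) - J(phi0) for the illustrative perturbation h(y) = cos(pi y / 2)."""
    phi = lambda y: phi0(y) + eps * math.cos(math.pi * y / 2.0)
    dphi = lambda y: dphi0(y) - eps * (math.pi / 2.0) * math.sin(math.pi * y / 2.0)
    return J(phi, dphi) - J(phi0, dphi0)


if __name__ == "__main__":
    print("J(phi0) =", J(phi0, dphi0))
    for eps in (0.1, 0.5, 1.0):
        print(eps, excess(eps), eps ** 2 * math.pi ** 2 / 8.0)
```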


From Lebesgue’s dominated convergence theorem, the function

\[ G : t \mapsto \int_{(-1,1)^2} g(\phi_0(y) + t\,w(x,y))\,dx\,dy \]

is continuous in R. Furthermore, G(0) = 0 from (3) and (14), and

\[ \lim_{t\to+\infty} G(t) = \int_{\{w(x,y)>0\}} dx\,dy \ge \Big(\frac43\Big)^2 > 1. \]

Therefore, there is t_0 ∈ (0, +∞), independent of a, such that

\[ G(t_0) = \int_{(-1,1)^2} g(\phi_0(y) + t_0\,w(x,y))\,dx\,dy = 1. \]
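The existence of t_0 is an intermediate value argument, which can be mimicked numerically by bisection. The sketch below is only illustrative: the concrete functions `g` and `w` are hypothetical choices compatible with (3) and with the requirements on w (the text fixes them only abstractly), and G is replaced by a midpoint-rule approximation on a 60 × 60 grid, so the computed value approximates a root of the discretized G rather than the exact t_0.

```python
import math


def h(t: float) -> float:
    """exp(-1/t) for t > 0 and 0 otherwise: a standard smooth building block."""
    return math.exp(-1.0 / t) if t > 0.0 else 0.0


def g(s: float) -> float:
    """Hypothetical smooth nondecreasing step: g = 0 on (-inf, 1], g = 1 on [2, +inf), cf. (3)."""
    num = h(s - 1.0)
    return num / (num + h(2.0 - s))


def bump(s: float) -> float:
    """Smooth one-dimensional bump, positive exactly on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0


def phi0(y: float) -> float:
    """The profile (14)."""
    return (1.0 - y * y) / 2.0


N = 60
NODES = [-1.0 + (2 * k + 1) / N for k in range(N)]  # midpoints of N equal cells of (-1, 1)
CELL = (2.0 / N) ** 2


def G(t: float) -> float:
    """Midpoint-rule approximation of the integral of g(phi0(y) + t w(x, y)) over (-1, 1)^2,
    with the hypothetical choice w(x, y) = bump(x) * bump(y)."""
    total = 0.0
    for y in NODES:
        base, by = phi0(y), bump(y)
        for x in NODES:
            total += g(base + t * bump(x) * by)
    return total * CELL


def find_t0(target: float = 1.0, iters: int = 50) -> float:
    """Bisection on the nondecreasing continuous function G: bracket, then halve."""
    lo, hi = 0.0, 1.0
    while G(hi) < target:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if G(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t0 = find_t0()
    print("t0 ~", t0, " G(t0) ~", G(t0))
```

Monotonicity of G in t (which holds here because g is nondecreasing and w ≥ 0) is what makes bisection applicable; the proof in the text only needs continuity.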

Let us now consider the test function w_a defined in Ω_a by

\[ w_a(x,y) = \phi_0(y)\,\chi_a(x) + t_0\,w(x,y), \]

where χ_a : R → [0, 1] is even and defined in [0, +∞) by

\[ \chi_a(x) = \begin{cases} 1 & \text{if } x \in [0, a-1], \\ a - x & \text{if } x \in (a-1, a), \\ 0 & \text{if } x \ge a. \end{cases} \]

The function w_a belongs to H_0^1(Ω_a). Furthermore, since a ≥ 2, one has w_a(x, y) = φ_0(y) + t_0 w(x, y) for all (x, y) ∈ (−1, 1)^2, while w_a(x, y) = φ_0(y) χ_a(x) ≤ φ_0(y) < 1 for all (x, y) ∈ Ω_a \ (−1, 1)^2. Therefore,

\[ \int_{\Omega_a} g(w_a) = \int_{(-1,1)^2} g(w_a) = \int_{(-1,1)^2} g(\phi_0(y) + t_0\,w(x,y))\,dx\,dy = G(t_0) = 1. \]

In other words, w_a ∈ U_a. By definition of u_a, one infers that

\[ I_a(u_a) \le I_a(w_a). \tag{16} \]

Let us now estimate I_a(w_a) from above. By using the facts that the domain Ω_a is symmetric in x and that the function χ_a is even in x, and by decomposing the integral I_a(w_a) into three subdomains, one gets that

\[ \begin{aligned} I_a(w_a) ={}& \int_{(-1,1)^2} \frac{|\nabla(\phi_0(y) + t_0\,w(x,y))|^2}{2}\,dx\,dy - \int_{(-1,1)^2} (\phi_0(y) + t_0\,w(x,y))\,dx\,dy \\ &+ 2\int_{(1,a-1)\times(-1,1)} \frac{|\nabla\phi_0(y)|^2}{2}\,dx\,dy - 2\int_{(1,a-1)\times(-1,1)} \phi_0(y)\,dx\,dy \\ &+ 2\int_{(a-1,a)\times(-1,1)} \frac{|\nabla(\phi_0(y)\chi_a(x))|^2}{2}\,dx\,dy - 2\int_{(a-1,a)\times(-1,1)} \phi_0(y)\chi_a(x)\,dx\,dy \\ ={}& 2(a-2)\,J(\phi_0) + \beta, \end{aligned} \tag{17} \]

where β is a real number which does not depend on a (it is indeed immediate to see by setting x = x' + a in the last two integrals of (17) that these quantities do not depend on a). Finally, it follows from (16) and (17) that

\[ I_a(u_a) \le 2(a-2)\,J(\phi_0) + \beta. \tag{18} \]

In the second step, we bound I_a(u_a) from below. On the set Ω_a \ (−a, a) × (−1, 1), one simply uses the fact that

\[ \int_{\Omega_a\setminus(-a,a)\times(-1,1)} \Big( \frac{|\nabla u_a|^2}{2} - u_a \Big) \ge -\int_{\Omega_a\setminus(-a,a)\times(-1,1)} \frac52 \ge -10\,\|\varphi\|_{L^\infty(-1,1)} \]

from (11) and from the definition (4) of Ω_a. Therefore,

\[ \begin{aligned} I_a(u_a) &\ge \int_{(-a,a)\times(-1,1)} \Big( \frac{|\nabla u_a|^2}{2} - u_a \Big) - 10\,\|\varphi\|_{L^\infty(-1,1)} \\ &\ge \int_{-a}^{a} J(u_a(x,\cdot))\,dx - 10\,\|\varphi\|_{L^\infty(-1,1)}, \end{aligned} \tag{19} \]

where the functional J has been defined in (15) and where we have used the fact that u_a(x, ·) belongs to H_0^1(−1, 1) for all x ∈ (−a, a). Remember that φ_0 is the (unique) minimizer of J. As a consequence,

\[ J(u_a(x,\cdot)) \ge J(\phi_0) \quad \text{for all } x \in (-a, a). \tag{20} \]

On the other hand, by definition of $x_a$ in (13) and by convexity and symmetry of $\omega_a$ with respect to both variables $x$ and $y$, it follows that $(x,0) \in \omega_a$ for all $x \in (-x_a,x_a)$, whence $u_a(x,0) > 1 > \varphi_0(0)$ for all $x \in (-x_a,x_a)$. Hence, there is a positive real number $\gamma > 0$, independent of $a$, such that
$$\|u_a(x,\cdot) - \varphi_0\|_{H^1(-1,1)} \ge \gamma > 0 \quad \text{for all } x \in (-x_a,x_a).$$
By definition of $\varphi_0$ and from the coercivity of the functional $J$, one infers the existence of a positive constant $\delta > 0$, independent of $a$, such that


Franc¸ois Hamel, Nikolai Nadirashvili and Yannick Sire

$$J(u_a(x,\cdot)) \ge J(\varphi_0) + \delta \quad \text{for all } x \in (-x_a,x_a).$$
From (19) and (20), one then gets that
$$I_a(u_a) \ge 2\delta \min(x_a,a) + 2aJ(\varphi_0) - 10\,\|\phi\|_{L^\infty(-1,1)}. \tag{21}$$
Putting together (18) and (21) with the inequality $x_a - \|\phi\|_{L^\infty(-1,1)} \le \min(x_a,a)$ yields
$$2\delta\big(x_a - \|\phi\|_{L^\infty(-1,1)}\big) + 2aJ(\varphi_0) - 10\,\|\phi\|_{L^\infty(-1,1)} \le 2(a-2)J(\varphi_0) + \beta,$$
where $\beta > 0$ and $\delta > 0$ are independent of $a$. Hence, there exists a constant $C_x > 0$, independent of $a$, such that $0 \le x_a < C_x$, that is (12). The proof of Lemma 2.1 is thereby complete.

The second lemma gives a bound from below of the "vertical" size of the sets $\omega_a$, meaning that the sets $\omega_a$ are not too thin. We just state the lemma, which is an immediate consequence of Lemma 2.1 and the definition of $U_a$.

Lemma 2.2 There exists a constant $C_y > 0$ such that
$$0 < C_y < \sup_{(x,y)\in\omega_a} |y| \tag{22}$$

for any $a \ge 1$ and for any minimizer $u_a$ of $I_a$ in $U_a$.

Step 4: comparison of $u_a(x,y)$ with $\varphi_0(y)$ when $a$ is large. In this step, we prove that the minimizers $u_a$ of $I_a$ in $U_a$ are close to the one-dimensional profile $\varphi_0(y) = (1-y^2)/2$ far away from the origin and far away from the leftmost and rightmost points of $\Omega_a$ in the direction $x$.

Lemma 2.3 For all $\varepsilon > 0$, there exist $A \ge 1$ and $M \in [0,A/2]$ such that
$$|u_a(x,y) - \varphi_0(y)| = \Big|u_a(x,y) - \frac{1-y^2}{2}\Big| \le \varepsilon \quad \text{in } \big([-a+M,-M] \cup [M,a-M]\big) \times [-1,1] \ (\subset \Omega_a),$$
for all $a \ge A$ and for any minimizer $u_a$ of $I_a$ in $U_a$.

Proof. Assume that the conclusion does not hold for some $\varepsilon > 0$. Then there are some sequences $(a_n)_{n\in\mathbb{N}}$ and $(x_n,y_n)_{n\in\mathbb{N}}$ of real numbers and points in $\mathbb{R}^2$ such that
$$a_n \ge n, \quad \frac{n}{2} \le |x_n| \le a_n - \frac{n}{2}, \quad |y_n| \le 1, \quad |u_{a_n}(x_n,y_n) - \varphi_0(y_n)| > \varepsilon \quad \text{for all } n \in \mathbb{N}, \tag{23}$$
where $u_{a_n}$ is a minimizer of the functional $I_{a_n}$ in the set $U_{a_n}$. For each $n \in \mathbb{N}$, define

$$u_n(x,y) = u_{a_n}(x+x_n,y) \quad \text{for all } (x,y) \in \Omega_{a_n} - (x_n,0) = \big\{(x,y) \in \mathbb{R}^2 ;\ (x+x_n,y) \in \Omega_{a_n}\big\}.$$
Each function $u_n$ satisfies a semilinear elliptic equation of the type (7) in $\Omega_{a_n} - (x_n,0)$ with a nonlinearity $f_{a_n} = 1 + \mu_{a_n} g$ for some $\mu_{a_n} \in \mathbb{R}$. Lemma 2.1 and (9) imply that $0 < u_{a_n}(x,y) \le 1$ for all $n \in \mathbb{N}$ and $(x,y) \in \Omega_{a_n} \setminus (-C_x,C_x)\times(-1,1)$. Hence, because of (3) and (23), for every fixed $C \ge 0$, there holds
$$0 \le u_n(x,y) \le 1 \quad \text{and} \quad \Delta u_n(x,y) + 1 = 0 \quad \text{for all } (x,y) \in [-C,C]\times[-1,1],$$
for all $n$ large enough. From standard elliptic estimates up to the boundary, it follows that, up to extraction of a subsequence, the functions $u_n$ converge in $C^2_{loc}(\mathbb{R}\times[-1,1])$ to a classical solution $u_\infty$ of
$$\begin{cases} \Delta u_\infty + 1 = 0 & \text{in } \mathbb{R}\times[-1,1],\\ 0 \le u_\infty \le 1 & \text{in } \mathbb{R}\times[-1,1],\\ u_\infty = 0 & \text{on } \mathbb{R}\times\{\pm 1\}. \end{cases}$$
Without loss of generality, one can also assume that $y_n \to y_\infty \in [-1,1]$ as $n \to +\infty$, whence
$$|u_\infty(0,y_\infty) - \varphi_0(y_\infty)| \ge \varepsilon \tag{24}$$
from (23). On the other hand, a standard Liouville-type result implies that $u_\infty$ is necessarily identically equal to the one-dimensional profile $\varphi_0(y)$ in $\mathbb{R}\times[-1,1]$. Indeed, the function $h(x,y) = u_\infty(x,y) - \varphi_0(y)$ is bounded and harmonic in $\mathbb{R}\times[-1,1]$, and it vanishes on $\mathbb{R}\times\{\pm 1\}$. The maximum principle implies that
$$|h(x,y)| \le \eta \cos\Big(\frac{\pi y}{4}\Big) \cosh\Big(\frac{\pi x}{4}\Big)$$
for all $(x,y) \in \mathbb{R}\times[-1,1]$ and for all $\eta > 0$ (otherwise, the same inequality would hold in $\mathbb{R}\times[-1,1]$ for some $\eta^* > 0$, with equality at some point in $\mathbb{R}\times(-1,1)$, contradicting the strong maximum principle). Thus, since $\eta > 0$ can be arbitrarily small, one gets that $h(x,y) = 0$ for all $(x,y) \in \mathbb{R}\times[-1,1]$. In other words, $u_\infty(x,y) = \varphi_0(y)$ for all $(x,y) \in \mathbb{R}\times[-1,1]$. This is in contradiction with (24) and the proof of Lemma 2.3 is thereby complete.
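The comparison function used in the Liouville-type argument above can be checked by hand. The following computation is only a sketch of why it is an admissible barrier; the name $b_\eta$ is introduced here for illustration and does not appear in the original text.

```latex
% b_eta is a label introduced for this sketch only.
\[
  b_\eta(x,y) := \eta \cos\Big(\frac{\pi y}{4}\Big) \cosh\Big(\frac{\pi x}{4}\Big),
  \qquad
  \Delta b_\eta
  = \eta \Big(\frac{\pi}{4}\Big)^{2} \cos\Big(\frac{\pi y}{4}\Big) \cosh\Big(\frac{\pi x}{4}\Big)
  - \eta \Big(\frac{\pi}{4}\Big)^{2} \cos\Big(\frac{\pi y}{4}\Big) \cosh\Big(\frac{\pi x}{4}\Big)
  = 0 ,
\]
\[
  b_\eta(x,y) \ \ge\ \eta \cos\Big(\frac{\pi}{4}\Big) \cosh\Big(\frac{\pi x}{4}\Big)
  \ \longrightarrow\ +\infty
  \quad \text{as } |x| \to +\infty, \text{ uniformly in } y \in [-1,1].
\]
```

So $b_\eta$ is harmonic, strictly positive on the closed strip, and unbounded as $|x| \to +\infty$. Since $h$ is bounded, $|h| \le b_\eta$ holds for $|x|$ large, and $|h| = 0 < b_\eta$ on $y = \pm 1$; the maximum principle on a large rectangle then gives $|h| \le b_\eta$ in the whole strip, and letting $\eta \to 0$ yields $h \equiv 0$.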


Step 5: the superlevel sets of the minimizers $u_a$ cannot be all convex when $a$ is large enough. In this last step, we complete the proof of Theorem 1.1. Actually, Lemma 2.2 and the one-dimensional convergence given in Lemma 2.3 will prevent any minimizer $u_a$ of $I_a$ in $U_a$ from being quasiconcave when $a$ is large enough. Given $C_y > 0$ as in Lemma 2.2, let $P$, $Q_a$ and $R_a$ be the points of $\mathbb{R}^2$ whose coordinates are given by
$$P = (0,C_y), \quad Q_a = \Big(\frac{a}{4},\frac{C_y}{2}\Big), \quad \text{and} \quad R_a = \Big(\frac{a}{2},0\Big)$$
for all $a \ge 1$, see Figure 3. From Lemma 2.2 and the convexity and symmetry of $\omega_a$ with respect to $x$ and $y$, there holds $P \in \omega_a$, that is, $u_a(P) > 1$ for any minimizer $u_a$ of $I_a$ in $U_a$. On the other hand, the point $R_a$ belongs to $\Omega_a$ for all $a \ge 1$ by definition (4) of $\Omega_a$, and the point $Q_a$ is the midpoint of the segment $[P,R_a]$ and is thus in $\Omega_a$ too by convexity of $\Omega_a$.

Fig. 3 The aligned points P, Qa and Ra

Furthermore, Lemma 2.3 implies that
$$u_a(Q_a) \longrightarrow \frac{1-(C_y/2)^2}{2} = \frac{1}{2} - \frac{C_y^2}{8} \quad \text{and} \quad u_a(R_a) \longrightarrow \frac{1}{2} \quad \text{as } a \to +\infty,$$
for any minimizer $u_a$ of $I_a$ in $U_a$. As a consequence, given any real number $\lambda$ such that
$$\frac{1}{2} - \frac{C_y^2}{8} < \lambda < \frac{1}{2},$$
the superlevel set
$$\big\{(x,y) \in \Omega_a ;\ u_a(x,y) > \lambda\big\} \tag{25}$$
of any minimizer $u_a$ of $I_a$ in $U_a$ is not convex when $a$ is large enough, whence $u_a$ is not quasiconcave. The proof of Theorem 1.1 is thereby complete.

Remark 2.4 By replacing $Q_a$ by $\widetilde{Q}_a = (\varepsilon a, (1-2\varepsilon)C_y)$ and by choosing $\varepsilon \in (0,1/2)$ arbitrarily small, it follows from the above arguments that, given any real number $\lambda$ such that
$$\frac{1-C_y^2}{2} < \lambda < \frac{1}{2}, \ \ldots$$

… $> 0$, and $f(u) := 2\alpha_0 u e^{\alpha_0 u^2}$, for some $\alpha_0 > 0$. We take into account initial data in the energy space $H^1(\mathbb{R}^2)$, i.e. $u_0 \in H^1(\mathbb{R}^2)$, and in view of the Trudinger-Moser inequality, the nonlinearity $f$ (which has square exponential growth at infinity) is in the energy critical regime. We look for sufficient conditions in order to predict from the initial data whether the solution blows up in finite time or the solution exists globally in time. Our main tools are energy methods, and the so-called potential well argument. If $0 < \lambda < \frac{1}{2\alpha_0}$, we prove that for energies below the ground state level, the dichotomy between blow-up and global existence is determined by the sign of a suitable functional.

Michinori Ishiwata
Osaka University, Osaka, 560-8531, Japan

Bernhard Ruf
Università di Milano, via C. Saldini 50, Milano 20133, Italy

Federica Sani
Università di Milano, via C. Saldini 50, Milano 20133, Italy

Elide Terraneo
Università di Milano, via C. Saldini 50, Milano 20133, Italy

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_17


1 Model parabolic problem

We consider the Cauchy problem for a two space dimensional parabolic equation with square exponential nonlinearity; more precisely, we focus on the following model problem:
$$\begin{cases} \partial_t u = \Delta u - u + \lambda f(u) & \text{in } (0,T)\times\mathbb{R}^2,\\ u(0,x) = u_0(x) & \text{in } \mathbb{R}^2, \end{cases} \tag{1}$$
where
$$\lambda > 0, \qquad f(u) := 2\alpha_0 u e^{\alpha_0 u^2}, \quad \text{for some } \alpha_0 > 0,$$
and we consider initial data in the energy space $H^1(\mathbb{R}^2)$, i.e. $u_0 \in H^1(\mathbb{R}^2)$. In this framework, energy refers to the functional associated with the stationary problem:
$$I(v) := \frac{1}{2}\|v\|_{H^1}^2 - \lambda \int_{\mathbb{R}^2} F(v)\,dx,$$
where
$$\|v\|_{H^1} := \big(\|\nabla v\|_{L^2}^2 + \|v\|_{L^2}^2\big)^{1/2} \quad \text{and} \quad F(v) := \int_0^v f(\eta)\,d\eta = e^{\alpha_0 v^2} - 1.$$
The above functional is well defined in $H^1(\mathbb{R}^2)$, and the nonlinear term $f$ that we are considering is critical in the energy space in view of the Trudinger-Moser embedding [1, 19]. Concerning local existence and uniqueness for (1), Ibrahim, Jrad, Majdoub and Saanouni [5] proved that, for any $u_0 \in H^1(\mathbb{R}^2)$, the Cauchy problem (1) has a local in time solution $u$ up to some finite time $T > 0$ satisfying
$$u \in C\big([0,T]; H^1(\mathbb{R}^2)\big),$$
and the solution is unique. Then the smoothing effect of the heat kernel implies that the local in time solution $u$ found in [5] belongs to the class
$$u \in L^\infty_{loc}\big((0,T]; L^\infty(\mathbb{R}^2)\big) \cap C^1\big((0,T); L^2(\mathbb{R}^2)\big) \cap C^{1,2}\big((0,T)\times\mathbb{R}^2\big),$$
see [9, Remark 4.1], and $\Delta u \in C\big((0,T), L^2(\mathbb{R}^2)\big)$. Therefore, for any $t \in (0,T)$, we have
$$\|\partial_t u(t)\|_{L^2}^2 = -\frac{d}{dt} I\big(u(t)\big), \tag{2}$$
and
$$\frac{1}{2}\frac{d}{dt}\|u(t)\|_{L^2}^2 = -\|u(t)\|_{H^1}^2 + \lambda \int_{\mathbb{R}^2} u(t) f\big(u(t)\big)\,dx. \tag{3}$$

Potential well argument for a parabolic equation with exponential nonlinearity


We define the maximal existence time $T_*$ of the solution $u$ as
$$T_* := \sup\big\{T > 0 : \text{the solution } u \text{ to (1) satisfies } u \in C\big([0,T]; H^1(\mathbb{R}^2)\big)\big\} \in (0,+\infty],$$
and the following blow-up alternative holds:
$$\text{if } T_* < +\infty \text{ then } \limsup_{t\to T_*} \|u(t)\|_{L^\infty} = +\infty, \tag{4}$$
see [5, Lemma 4.6]. Our aim is to find sufficient conditions in order to determine from the initial data $u_0 \in H^1(\mathbb{R}^2)$ whether the solution blows up in finite time (i.e. $T_* < +\infty$) or the solution is global in time (i.e. $T_* = +\infty$).

The same problem for nonlinear parabolic equations with polynomial nonlinearities has been widely studied via the potential well argument starting from the seminal papers by Tsutsumi [22], Ishii [10], and Payne and Sattinger [17]. Let us recall the central idea of this method following the presentation given in [18]. Let $\Omega \subset \mathbb{R}^N$, $N \ge 3$, be a bounded set with smooth boundary, and let us consider
$$\begin{cases} \partial_t u = \Delta u + |u|^{p-1}u & \text{in } (0,T)\times\Omega,\\ u(t,x) = 0 & \text{in } (0,T)\times\partial\Omega,\\ u(0,x) = u_0(x) & \text{in } \Omega, \end{cases} \tag{5}$$
with $1 < p \le 2^* - 1$, where $2^* = \frac{2N}{N-2}$. For any initial data in the energy space $H^1_0(\Omega)$ there exists some finite time $T > 0$ and a local in time solution $u$ belonging to $C\big([0,T]; H^1_0(\Omega)\big)$ (this is a consequence of the $L^{p+1}$-existence result in [2] for any $1 < p \le 2^* - 1$, and of the smoothing effect of the heat kernel). In this case, the energy functional is given by
$$I_p(v) := \frac{1}{2}\|\nabla v\|_{L^2}^2 - \frac{1}{p+1}\|v\|_{L^{p+1}}^{p+1}.$$
Let $v \in H^1_0(\Omega)\setminus\{0\}$, and let us analyze the energy of the function $\sigma v$ for any $\sigma \ge 0$. By an easy computation, one can show that
$$I_p(\sigma v) = \frac{\sigma^2}{2}\|\nabla v\|_{L^2}^2 - \frac{\sigma^{p+1}}{p+1}\|v\|_{L^{p+1}}^{p+1}$$
attains its unique maximum at a point $\bar\sigma > 0$, and $\bar v := \bar\sigma v$ satisfies
$$\|\nabla \bar v\|_{L^2}^2 - \|\bar v\|_{L^{p+1}}^{p+1} = 0.$$
Therefore, the energy $I_p(\sigma v)$ has the structure of a potential well, and every ray $\sigma v$, for any $\sigma > 0$ and for $v \in H^1_0(\Omega)\setminus\{0\}$, has a unique intersection with the Nehari manifold
$$\mathcal{N} = \big\{v \in H^1_0(\Omega)\setminus\{0\} : \|\nabla v\|_{L^2}^2 - \|v\|_{L^{p+1}}^{p+1} = 0\big\}.$$
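The "easy computation" along a ray can be made concrete. The sketch below is an illustration under stated assumptions, not code from the paper: it takes as inputs the two numbers A = ‖∇v‖² (in L²) and B = ‖v‖^{p+1} (in L^{p+1}) for a fixed v, and the function names and sample values are hypothetical. It returns the scaling σ̄ at which the ray meets the Nehari manifold and the corresponding maximal energy.

```python
import math


def ray_energy(sigma, grad_sq, lp_pow, p):
    """I_p(sigma * v) = sigma^2/2 * A - sigma^(p+1)/(p+1) * B,
    where A = ||grad v||_{L^2}^2 and B = ||v||_{L^{p+1}}^{p+1}."""
    return 0.5 * sigma ** 2 * grad_sq - sigma ** (p + 1) / (p + 1) * lp_pow


def nehari_scaling(grad_sq, lp_pow, p):
    """Unique sigma_bar > 0 solving sigma^2 * A = sigma^(p+1) * B,
    i.e. the critical point of sigma -> I_p(sigma * v)."""
    return (grad_sq / lp_pow) ** (1.0 / (p - 1))


def ray_maximum(grad_sq, lp_pow, p):
    """Closed form of max over sigma >= 0 of I_p(sigma * v)."""
    sigma_bar = nehari_scaling(grad_sq, lp_pow, p)
    return (p - 1) / (2 * (p + 1)) * sigma_bar ** 2 * grad_sq


# Hypothetical sample values for A, B and p (not taken from the paper).
A, B, P = 3.0, 2.0, 3.0
SIGMA_BAR = nehari_scaling(A, B, P)
```

With these sample values, σ̄ = (A/B)^{1/(p−1)} = √1.5 and the maximal energy is ((p−1)/(2(p+1)))·σ̄²·A = 1.125.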


The depth of the well is given by the lowest pass over the ridge defined by all possible $I_p(\sigma v)$ as $v$ ranges over $H^1_0(\Omega)\setminus\{0\}$, namely
$$d_p := \inf_{v \in H^1_0(\Omega)\setminus\{0\}} \ \max_{\sigma \ge 0} I_p(\sigma v).$$
It is well known that $d_p$ can be characterized as
$$d_p = \inf_{v \in \mathcal{N}} I_p(v),$$
and also
$$d_p = \frac{p-1}{2(p+1)}\,\Lambda^{2(p+1)/(p-1)},$$
where $\Lambda = \Lambda_{p+1}(\Omega)$ is the best constant in the Sobolev embedding $H^1_0(\Omega) \subset L^{p+1}(\Omega)$, i.e.
$$\Lambda = \inf_{v \in H^1_0(\Omega)\setminus\{0\}} \frac{\|\nabla v\|_{L^2}}{\|v\|_{L^{p+1}}}.$$
If $1 < p < 2^* - 1$ then $d_p$ is the energy level of ground state solutions, i.e.
$$d_p = \inf\big\{I_p(v) : v \in H^1_0(\Omega)\setminus\{0\} \text{ satisfies } \langle dI_p(v), \phi\rangle = 0 \text{ for any } \phi \in H^1_0(\Omega)\big\}.$$
The potential well associated with the Cauchy problem (5) is the set (stable set)
$$W_p := \big\{v \in H^1_0(\Omega) : I_p(v) < d_p,\ \|\nabla v\|_{L^2}^2 - \|v\|_{L^{p+1}}^{p+1} > 0\big\} \cup \{0\},$$
and the exterior of the potential well (unstable set) is
$$V_p := \big\{v \in H^1_0(\Omega) : I_p(v) < d_p,\ \|\nabla v\|_{L^2}^2 - \|v\|_{L^{p+1}}^{p+1} < 0\big\}.$$
The sets $V_p$ and $W_p$ are both invariant under the flow associated with the problem (5). Concerning the stable set, if $1 < p < 2^* - 1$, any solution which enters the stable set $W_p$ exists globally in time. This result is a direct consequence of the fact that, in the subcritical case, the time $T$ of local existence of the solution to (5) depends only on the size of the norm of the initial data in $H^1_0(\Omega)$, and for any $v \in W_p$ the norm $\|\nabla v\|_{L^2}$ is uniformly bounded (see [22]). Similar results have also been proven for $p = 2^* - 1$, where the situation is different because the local existence time of the solution to (5) depends on the specific initial data rather than on its size (see [10], [8], [11], and [12]). On the other hand, if $1 < p \le 2^* - 1$ then any solution which intersects the unstable set $V_p$ blows up in finite time (see [17] and [10]). Related results can be found in [16, 15, 3, 7]. For the case $p = 2^* - 1$ and $\Omega = \mathbb{R}^N$, $N \ge 3$, we refer to [6] and the references therein.

In the same spirit as these results, we show that for energies below the ground state level the dichotomy between blow-up and global existence for the Cauchy problem (1) can be determined by means of a potential well argument.


2 Stable and unstable sets

In analogy with the polynomial case, the energy $I$ associated with our model problem (1) also has a potential well structure. More precisely, for any fixed $v \in H^1(\mathbb{R}^2)\setminus\{0\}$, the function $\sigma \mapsto I(\sigma v)$ has the shape of a potential well. This can be deduced from the study of the sign of the so-called Nehari functional
$$J(v) := \langle dI(v), v\rangle = \|v\|_{H^1}^2 - \lambda \int_{\mathbb{R}^2} v f(v)\,dx;$$
in fact
$$\frac{d}{d\sigma} I(\sigma v) = \frac{1}{\sigma} J(\sigma v).$$
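The relation between the energy along a ray and the Nehari functional can be explored numerically. The following sketch makes several assumptions that are not in the paper: the sample profile v(x) = exp(−|x|²/2), the values α₀ = 1 and λ = 0.25 (so that 0 < λ < 1/(2α₀)), and a plain midpoint rule in the radial variable. All function names are hypothetical.

```python
import math

ALPHA0 = 1.0  # assumed value of alpha_0
LAM = 0.25    # assumed value of lambda, with 0 < LAM < 1 / (2 * ALPHA0)


def radial_integral(g, r_max=8.0, n=4000):
    """Midpoint rule for the integral over R^2 of a radial function g(|x|),
    truncated at |x| = r_max (the integrands used here decay like exp(-r^2))."""
    h = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += g(r) * 2.0 * math.pi * r
    return total * h


def v(r):
    """Sample radial profile (an assumption of this sketch)."""
    return math.exp(-0.5 * r * r)


def dv(r):
    """Radial derivative of the sample profile."""
    return -r * math.exp(-0.5 * r * r)


# Squared H^1 norm of v; for this Gaussian profile it equals 2*pi.
H1_SQ = radial_integral(lambda r: dv(r) ** 2 + v(r) ** 2)


def energy(sigma):
    """I(sigma v) = sigma^2/2 * ||v||_{H^1}^2 - lambda * int (exp(alpha0 sigma^2 v^2) - 1)."""
    big_f = lambda r: math.expm1(ALPHA0 * (sigma * v(r)) ** 2)
    return 0.5 * sigma ** 2 * H1_SQ - LAM * radial_integral(big_f)


def nehari(sigma):
    """J(sigma v) = sigma^2 * ||v||_{H^1}^2 - lambda * int (sigma v) f(sigma v)."""
    def v_times_f(r):
        s2 = (sigma * v(r)) ** 2
        return 2.0 * ALPHA0 * s2 * math.exp(ALPHA0 * s2)
    return sigma ** 2 * H1_SQ - LAM * radial_integral(v_times_f)


def nehari_root(lo=0.1, hi=5.0, iters=40):
    """Bisection for the sign change of sigma -> J(sigma v) on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if nehari(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


SIGMA_BAR = nehari_root()
```

For this particular profile the integrals can also be computed by hand: J(σv) = 2π(c/α₀ − λ(e^c − 1)) with c = α₀σ², which gives an independent check of the root found by bisection. A centred finite difference of `energy` can be compared with `nehari(sigma) / sigma` to observe the identity stated above.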

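Proposition 1. Assume that
$$0 < \lambda < \frac{1}{2\alpha_0}. \tag{6}$$
Then, for any $v \in H^1(\mathbb{R}^2)\setminus\{0\}$, there exists a unique $\bar\sigma = \bar\sigma(v) > 0$ such that
$$J(\sigma v) \ \begin{cases} > 0 & \text{if } 0 < \sigma < \bar\sigma,\\ = 0 & \text{if } \sigma = \bar\sigma,\\ < 0 & \text{if } \sigma > \bar\sigma. \end{cases} \tag{7}$$
Moreover,
$$\lim_{\sigma\to+\infty} I(\sigma v) = -\infty, \tag{8}$$
and $\bar\sigma$ is the unique maximum point of the function $\sigma \mapsto I(\sigma v)$ on $[0,+\infty)$.

Next, we introduce the depth of the well
$$d := \inf\big\{I(v) : v \in H^1(\mathbb{R}^2)\setminus\{0\},\ J(v) = 0\big\},$$
which coincides with the mountain pass level
$$c := \inf_{v \in H^1(\mathbb{R}^2)\setminus\{0\}} \ \max_{\sigma > 0} I(\sigma v).$$
More precisely,
$$c = d. \tag{9}$$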

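We recall that the existence of a mountain pass solution for the stationary problem
$$-\Delta v + v = \lambda f(v) \quad \text{in } \mathbb{R}^2 \tag{10}$$
with $\lambda$ in the range (6) is proved in [20], where it is also shown that $0 < d < \ldots$ In analogy with the polynomial case, the unstable set $V$ and the stable set $W$ are
$$V := \big\{v \in H^1(\mathbb{R}^2) : I(v) < d,\ J(v) < 0\big\}, \qquad W := \big\{v \in H^1(\mathbb{R}^2) : I(v) < d,\ J(v) > 0\big\} \cup \{0\}.$$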

Theorem 1. Let $u \in C\big([0,T_*); H^1(\mathbb{R}^2)\big)$ be the maximal solution to (1) with $\lambda$ as in (6), and $u_0 \in H^1(\mathbb{R}^2)$.
i) If $u(t_0) \in V$ for some $t_0 \in [0,T_*)$ then $T_* < +\infty$.
ii) There exists $t_0 \in [0,T_*)$ such that $u(t_0) \in W$ if and only if
$$T_* = +\infty \quad \text{and} \quad \lim_{t\to+\infty} \|u(t)\|_{H^1} = 0.$$

Theorem 1.i) can be seen as an improvement of the blow-up result obtained in [5] for non-positive energies:

Theorem 2. [5, Theorem 2.1.3] Let $u \in C\big([0,T_*); H^1(\mathbb{R}^2)\big)$ be the maximal solution to (1) with $0 < \lambda \le \frac{1}{2\alpha_0}$, and $u_0 \in H^1(\mathbb{R}^2)\setminus\{0\}$. If $I\big(u(t_0)\big) \le 0$ for some $t_0 \in [0,T_*)$ then $T_* < +\infty$.

To our knowledge, Theorem 1 is a new application of the potential well argument to heat equations with critical exponential nonlinearities in the 2-dimensional case. The subcritical exponential case is studied in [4] and [21]. In the next section, we will sketch the proof of Theorem 1.i), and the part of the proof of Theorem 1.ii) concerning global existence in $W$. Complete proofs and detailed explanations can be found in [13].

3 Sketch of the proof of Theorem 1

Blow-up. Thanks to the monotonicity of the energy along the solution and the geometry of the sublevel sets of the energy $I$, one can show the invariance of the set $V$ with respect to the heat flow associated with (1). In order to prove that the solutions in $V$ blow up in finite time, we will apply the following blow-up Lemma containing the classical idea of the concavity method due to Levine [14].

Lemma 1. There exists no non-negative and increasing function $y \in C^2(\bar t,+\infty)$, with $\bar t \in \mathbb{R}$, such that, for some $\beta > 1$,
$$y(t)\,y''(t) \ge \beta\,[y'(t)]^2 \ \text{ on } (\bar t,+\infty), \quad \text{and} \quad \lim_{t\to+\infty} y(t) = +\infty.$$
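One standard way to see why such a function cannot exist is sketched below; this is an illustration of the concavity idea, not necessarily the argument given in [14], and the auxiliary function $z$ is introduced here only for this purpose. Wherever $y > 0$ (which holds for all large $t$, since $y \to +\infty$), set $z = y^{1-\beta}$:

```latex
\[
  z(t) := y(t)^{1-\beta}, \qquad
  z'(t) = (1-\beta)\, y(t)^{-\beta}\, y'(t), \qquad
  z''(t) = (1-\beta)\, y(t)^{-\beta-1} \Big( y(t)\, y''(t) - \beta\, [y'(t)]^{2} \Big) \le 0 .
\]
```

The sign of $z''$ follows from $1 - \beta < 0$ and the assumed differential inequality, so $z$ is positive and concave. Since $y$ is increasing and unbounded, there is a time $t_1$ with $y(t_1) > 0$ and $y'(t_1) > 0$, hence $z'(t_1) < 0$. Concavity then gives $z(t) \le z(t_1) + z'(t_1)(t - t_1)$ for $t \ge t_1$, and the right-hand side becomes negative for large $t$, contradicting $z > 0$.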

The concavity method works in our setting due to the fact that the Nehari functional along solutions entering $V$ is bounded away from zero by a strictly negative constant.

Proposition 2. Let $u \in C\big([0,T_*); H^1(\mathbb{R}^2)\big)$ be the maximal solution to (1) with $u_0 \in H^1(\mathbb{R}^2)$. If $u(t_0) \in V$ for some $t_0 \in [0,T_*)$ then there exists $\varepsilon > 0$ such that $J\big(u(t)\big) < -\varepsilon$ for any $t \in [t_0,T_*)$.

We argue by contradiction assuming $T_* = +\infty$, and we apply the blow-up Lemma 1 to the non-negative and increasing $C^2$-function defined by
$$y(t) := \frac{1}{2}\int_{t_0}^{t} \|u(s)\|_{L^2}^2\,ds, \quad t \in [t_0,+\infty).$$
In view of (3), we have
$$y''(t) = \frac{1}{2}\frac{d}{dt}\|u(t)\|_{L^2}^2 = -J\big(u(t)\big) > \varepsilon, \quad t \in (t_0,+\infty), \tag{12}$$
where $\varepsilon > 0$ is given by Proposition 2. From (12), we deduce that
$$\lim_{t\to+\infty} y'(t) = \lim_{t\to+\infty} y(t) = +\infty. \tag{13}$$
Finally, refining the estimate in (12), it is possible to show that
$$y(t)\,y''(t) \ge \beta\,[y'(t)]^2, \quad \text{for any large } t, \quad \text{for some } \beta > 1.$$
Therefore we are in the framework of the blow-up Lemma 1, and we reach a contradiction.

Global existence. Thanks to the uniqueness of the solution to (1), it is possible to prove that the stable set $W$ is invariant under the flow associated with the problem (1). In order to prove that solutions which intersect the set $W$ at some time $t_0 \in [0,T_*)$ are global in time, we first remark that (3) yields the boundedness in $L^2(\mathbb{R}^2)$ of any solution satisfying $u(t) \in W$ for any $t \in [t_0,T_*)$, more precisely
$$\sup_{t \in [t_0,T_*)} \|u(t)\|_{L^2} < +\infty.$$

Moreover, the following property of $W$ in the energy space holds:

Proposition 3. For any $v \in W$, we have $\|\nabla v\|_{L^2}^2 < 2d$.

Therefore, if the solution $u(t) \in W$ for any $t \in [t_0,T_*)$ then there exists $M > 0$ such that
$$\sup_{t \in [t_0,T_*)} \|u(t)\|_{L^2}^2 \le M, \quad \text{and} \quad \sup_{t \in [t_0,T_*)} \|\nabla u(t)\|_{L^2}^2 \le 2d \ \ldots$$
… $> 0$ is uniform with respect to the $H^1$-norm of the initial data, i.e. $T = T(\|u_0\|_{H^1})$. Nevertheless, if we consider only small initial data, we can find a uniform local existence time for the solution to (1), and we can quantify the smallness condition as follows:

Theorem 3. Let $0 < m < \frac{4\pi}{\alpha_0}$, and $M > 0$. There exists $T = T(m,M) > 0$ such that, for any $u_0 \in H^1(\mathbb{R}^2)$ with $\|\nabla u_0\|_{L^2}^2 \le m$ and $\|u_0\|_{L^2}^2 \le M$, the Cauchy problem (1) has a unique solution $u \in C\big([0,T]; H^1(\mathbb{R}^2)\big)$.

Therefore, in a similar way as in the polynomial case, we can conclude that any solution intersecting the stable set exists globally in time.

References

1. Adachi, S., Tanaka, K.: Trudinger type inequalities in R^N and their best exponents. Proc. Amer. Math. Soc. 128, 2051–2057 (2000)
2. Brezis, H., Cazenave, T.: A nonlinear heat equation with singular initial data. J. Anal. Math. 68, 277–304 (1996)
3. Cazenave, T., Lions, P. L.: Solutions globales d'équations de la chaleur semi linéaires [Global solutions of semilinear heat equations]. Comm. Partial Differential Equations 9, 955–978 (1984)
4. Dai, H., Zhang, H.: Energy decay and nonexistence of solution for a reaction-diffusion equation with exponential nonlinearity. Bound. Value Probl. 2014, 9 pp.
5. Ibrahim, S., Jrad, R., Majdoub, M., Saanouni, T.: Local well posedness of a 2D semilinear heat equation. Bull. Belg. Math. Soc. Simon Stevin 21, 535–551 (2014)
6. Ikehata, R., Ishiwata, M., Suzuki, T.: Semilinear parabolic equation in R^N associated with critical Sobolev exponent. Ann. I. H. Poincaré 27, 877–900 (2010)
7. Ikehata, R., Suzuki, T.: Stable and unstable sets for evolution equations of parabolic and hyperbolic type. Hiroshima Math. J. 26, 475–491 (1996)
8. Ikehata, R., Suzuki, T.: Semilinear parabolic equations involving critical Sobolev exponent: local and asymptotic behavior of solutions. Differential Integral Equations 13, 869–901 (2000)
9. Ioku, N., Ruf, B., Terraneo, E.: Existence, non-existence and uniqueness for a heat equation with exponential nonlinearity in R^2. Math. Phys. Anal. Geom. 18 (2015), 19 pp.
10. Ishii, H.: Asymptotic stability and blowing up of solutions of some nonlinear equations. J. Differential Equations 26, 291–319 (1977)
11. Ishiwata, M.: Existence of a stable set for some nonlinear parabolic equation involving critical Sobolev exponent. Discrete Contin. Dyn. Syst. 2005, suppl., 443–452.
12. Ishiwata, M.: Asymptotic behavior of strong solutions of semilinear parabolic equations with critical Sobolev exponent. Adv. Differential Equations 13, 349–366 (2008)
13. Ishiwata, M., Ruf, B., Sani, F., Terraneo, E.: Asymptotics for a parabolic equation with critical exponential nonlinearity. Preprint (2019)
14. Levine, H. A.: Instability and nonexistence of global solutions to nonlinear wave equations of the form Pu_tt = −Au + F(u). Trans. Amer. Math. Soc. 192, 1–21 (1974)
15. Lions, P. L.: Asymptotic behavior of some nonlinear heat equations. Phys. D 5, 293–306 (1982)
16. Ôtani, M.: Existence and asymptotic stability of strong solutions of nonlinear evolution equations with a difference term of subdifferentials. Qualitative theory of differential equations, Vol. I, II (Szeged, 1979), pp. 795–809, Colloq. Math. Soc. János Bolyai, 30, North-Holland, Amsterdam-New York (1981)
17. Payne, L. E., Sattinger, D. H.: Saddle points and instability of nonlinear hyperbolic equations. Israel J. Math. 22, 273–303 (1975)
18. Quittner, P., Souplet, P.: Superlinear Parabolic Problems. Blow-up, Global Existence and Steady States. Birkhäuser Verlag, Basel (2007)
19. Ruf, B.: A sharp Trudinger-Moser type inequality for unbounded domains in R^2. J. Funct. Anal. 219, 340–367 (2005)
20. Ruf, B., Sani, F.: Ground states for elliptic equations in R^2 with exponential critical growth. Geometric properties for parabolic and elliptic PDE's, 251–267, Springer, Milan (2013)
21. Saanouni, T.: A note on the inhomogeneous nonlinear heat equation in two space dimensions. Mediterr. J. Math. 13, 3651–3672 (2016)
22. Tsutsumi, M.: On solutions of semilinear differential equations in a Hilbert space. Math. Japon. 17, 173–193 (1972)

Quantitative analysis of a singularly perturbed shape optimization problem in a polygon

Dario Mazzoleni, Benedetta Pellacci and Gianmaria Verzini

Abstract We carry on our study of the connection between two shape optimization problems with spectral cost. On the one hand, we consider the optimal design problem for the survival threshold of a population living in a heterogeneous habitat Ω; this problem arises when searching for the optimal shape and location of a shelter zone in order to prevent extinction of the species. On the other hand, we deal with the spectral drop problem, which consists in minimizing a mixed Dirichlet-Neumann eigenvalue in a box Ω. In a previous paper [12] we proved that the latter one can be obtained as a singular perturbation of the former, when the region outside the refuge is more and more hostile. In this paper we sharpen our analysis in case Ω is a planar polygon, providing quantitative estimates of the optimal level convergence, as well as of the involved eigenvalues.

AMS-Subject Classification. 49R05, 49Q10; 92D25, 35P15, 47A75.

Keywords. Singular limits, survival threshold, mixed Neumann-Dirichlet boundary conditions, α-symmetrization, isoperimetric profile.

Dario Mazzoleni
Dipartimento di Matematica e Fisica, Università Cattolica del Sacro Cuore, Via Trieste 17, 25121 Brescia, Italy

Benedetta Pellacci
Dipartimento di Matematica e Fisica, Università della Campania "Luigi Vanvitelli", viale A. Lincoln 5, Caserta, Italy

Gianmaria Verzini
Dipartimento di Matematica, Politecnico di Milano, piazza Leonardo da Vinci 32, 20133 Milano, Italy

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_18


1 Introduction

In this note we investigate some relations between the two following shape optimization problems, settled in a box $\Omega \subset \mathbb{R}^N$, that is, a bounded, Lipschitz domain (open and connected).

Definition 1. Let $0 < \delta < |\Omega|$ and $\beta > \frac{\delta}{|\Omega| - \delta}$. For any measurable $D \subset \Omega$ such that $|D| = \delta$, we define the weighted eigenvalue
$$\lambda(\beta,D) := \min\left\{ \frac{\int_\Omega |\nabla u|^2\,dx}{\int_D u^2\,dx - \beta\int_{\Omega\setminus D} u^2\,dx} : u \in H^1(\Omega),\ \int_D u^2\,dx > \beta \int_{\Omega\setminus D} u^2\,dx \right\}, \tag{1}$$
and the optimal design problem for the survival threshold as
$$\Lambda(\beta,\delta) = \min\big\{\lambda(\beta,D) : D \subset \Omega,\ |D| = \delta\big\}. \tag{2}$$

Definition 2. Let $0 < \delta < |\Omega|$. Introducing the space
$$H^1_0(D,\Omega) := \big\{u \in H^1(\Omega) : u = 0 \text{ q.e. on } \Omega\setminus D\big\}$$
(where q.e. stands for quasi-everywhere, i.e. up to sets of zero capacity), we can define, for any quasi-open $D \subset \Omega$ such that $|D| = \delta$, the mixed Dirichlet-Neumann eigenvalue as
$$\mu(D,\Omega) := \min\left\{ \frac{\int_\Omega |\nabla u|^2\,dx}{\int_\Omega u^2\,dx} : u \in H^1_0(D,\Omega)\setminus\{0\} \right\}, \tag{3}$$
and the spectral drop problem as
$$M(\delta) = \min\big\{\mu(D,\Omega) : D \subset \Omega,\ \text{quasi-open},\ |D| = \delta\big\}. \tag{4}$$

The two problems above have been the subject of many investigations in the literature. The interest in the study of the eigenvalue $\lambda(\beta,D)$ goes back to the analysis of the optimization of the survival threshold of a species living in a heterogeneous habitat $\Omega$, with the boundary $\partial\Omega$ acting as a reflecting barrier. As explained by Cantrell and Cosner in a series of papers [3, 4, 5] (see also [11, 9, 12]), the heterogeneity of $\Omega$ makes the intrinsic growth rate of the population, represented by an $L^\infty(\Omega)$ function $m(x)$, positive in favourable sites and negative in the hostile ones. Then, if $m^+ \not\equiv 0$ and $\int_\Omega m < 0$, it turns out that the positive principal eigenvalue $\lambda = \lambda(m)$ of the problem
$$\begin{cases} -\Delta u = \lambda m u & \text{in } \Omega,\\ \partial_\nu u = 0 & \text{on } \partial\Omega, \end{cases}$$
i.e.
$$\lambda(m) = \min\left\{ \frac{\int_\Omega |\nabla u|^2\,dx}{\int_\Omega m u^2\,dx} : u \in H^1(\Omega),\ \int_\Omega m u^2\,dx > 0 \right\},$$
acts as a survival threshold, namely the smaller $\lambda(m)$ is, the greater the chances of survival become. Moreover, by [11], the minimum of $\lambda(m)$ w.r.t. $m$ varying in a suitable class is achieved when $m$ is of bang-bang type, i.e. $m = 1_D - \beta 1_{\Omega\setminus D}$, with $D \subset \Omega$ of fixed measure. As a consequence, one is naturally led to the shape optimization problem introduced in Definition 1.

On the other hand, the spectral drop problem has been introduced in [2] as a class of shape optimization problems where one minimizes the first eigenvalue $\mu = \mu(D,\Omega)$ of the Laplace operator with homogeneous Dirichlet conditions on $\partial D \cap \Omega$ and homogeneous Neumann ones on $\partial D \cap \partial\Omega$:
$$\begin{cases} -\Delta u = \mu u & \text{in } D,\\ u = 0 & \text{on } \partial D \cap \Omega,\\ \partial_\nu u = 0 & \text{on } \partial D \cap \partial\Omega. \end{cases}$$
In our paper [12], we analyzed the relations between the above problems, showing in particular that $M(\delta)$ arises from $\Lambda(\beta,\delta)$ in the singularly perturbed limit $\beta \to +\infty$, as stated in the following result.

Theorem 1 ([12, Thm. 1.4, Lemma 3.3]). If $0 < \delta < |\Omega|$, $\beta > \frac{\delta}{|\Omega| - \delta}$ and $\frac{\delta}{\beta} < \varepsilon < |\Omega| - \delta$ then
$$M(\delta+\varepsilon)\left(1 - \sqrt{\frac{\delta}{\varepsilon\beta}}\right)^2 \le \Lambda(\beta,\delta) \le M(\delta).$$
As a consequence, for every $0 < \delta < |\Omega|$,
$$\lim_{\beta\to+\infty} \Lambda(\beta,\delta) = M(\delta).$$

In respect of this asymptotic result, let us also mention [8], where the relation between the above eigenvalue problems has been recently investigated for D ⊂ Ω fixed and regular. In [12], we used the theorem above to transfer information from the spectral drop problem to the optimal design one. In particular, we could give a contribution in the comprehension of the shape of an optimal set D∗ for Λ(β , δ ). This topic includes several open questions starting from the analysis performed in [4] (see also [9, 11]) when Ω = (0, 1): in this case it is shown that any optimal set D∗ is either (0, δ ) or (1 − δ , 1). The knowledge of analogous features in the higher dimensional case is far from being well understood, but it has been recently proved in [9] that when Ω is an N-dimensional rectangle, then ∂ D∗ does not contain any portion of sphere, contradicting previous conjectures and numerical studies [1, 15, 7]. This result prevents the existence of optimal spherical shapes, namely optimal D∗ of the form D∗ = Ω ∩ Br(δ ) (x0 ) for suitable x0 and r(δ ) such that |D∗ | = δ .


On the other hand, we have shown that spherical shapes are optimal for $M(\delta)$, for small $\delta$, when $\Omega$ is an N-dimensional polytope. This, together with Theorem 1, yields the following result.

Theorem 2 ([12, Thm. 1.7]). Let $\Omega \subset \mathbb{R}^N$ be a bounded, convex polytope. There exists $\bar\delta > 0$ such that, for any $0 < \delta < \bar\delta$:
• $D^*$ is a minimizer of the spectral drop problem in $\Omega$, with volume constraint $\delta$, if and only if $D^* = B_{r(\delta)}(x_0) \cap \Omega$, where $x_0$ is a vertex of $\Omega$ with the smallest solid angle;
• if $|D| = \delta$ and $D$ is not a spherical shape as above, then, for $\beta$ sufficiently large,
$$\lambda(\beta,D) > \lambda(\beta, B_{r(\delta)}(x_0) \cap \Omega).$$
In particular, in case $\Omega = (0,L_1)\times(0,L_2)$, with $L_1 \le L_2$, and $0 < \delta < L_1^2/\pi$, then any minimizing spectral drop is a quarter of a disk centered at a vertex of $\Omega$.

Then, even though the optimal shapes for $\Lambda(\beta,\delta)$ cannot be spherical for any fixed $\beta$, they are asymptotically spherical as $\beta \to +\infty$, at least in the qualitative sense described in Theorem 2.

The main aim of the present note is to somehow reverse the above point of view: we will show that, in case $M(\delta)$ is explicit as a function of $\delta$, one can use Theorem 1 in order to obtain quantitative bounds on the ratio
$$\frac{\Lambda(\beta,\delta)}{M(\delta)}.$$
In particular, we will pursue this program in case $\Omega$ is a planar polygon: indeed, on the one hand, in such case the threshold $\bar\delta$ in Theorem 2 can be estimated explicitly; on the other hand, such theorem implies that the optimal shapes for $M(\delta)$ are spherical, so that $M(\delta)$ can be explicitly computed. This will lead to quantitative estimates about the convergence of $\Lambda(\beta,\delta)$ to $M(\delta)$. As a byproduct of this analysis, we will also obtain some quantitative information on the ratio
$$\frac{\lambda(\beta, B_{r(\delta)}(p) \cap \Omega)}{\Lambda(\beta,\delta)},$$
thus providing a quantitative version of the second part of Theorem 2. These new quantitative estimates are the main results of this note, and they are contained in Theorems 3 and 4, respectively. The next section is devoted to their statements and proofs, together with further details of our analysis.

2 Setting of the problem and main results

Let $\Omega \subset \mathbb{R}^2$ denote a convex n-gon, $n \ge 3$. We introduce the following quantities and objects, all depending on $\Omega$:
• $\alpha_{\min}$ is the smallest interior angle;
• $V_{\min}$ is the set of vertices having angle $\alpha_{\min}$;
• $e_1, \dots, e_n$ are the (closed) edges;
• $d$ denotes the following quantity: $d = \min\{\operatorname{dist}(e_i \cap e_j, e_k) : i \ne j,\ i \ne k,\ j \ne k\}$.

Under the above notation, we define the threshold
$$\bar\delta := \frac{d^2}{2\alpha_{\min}}. \tag{5}$$
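The quantities entering the threshold (5) are elementary to compute for a concrete polygon. The sketch below is only an illustration: the function names are hypothetical, the polygon is assumed convex and given by its vertices in order, and $d$ is computed as the smallest distance from a vertex to an edge that does not contain it, which is how the definition above reads when $e_i \cap e_j$ is a vertex.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the closed segment [a, b]."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def interior_angles(vertices):
    """Interior angles of a convex polygon whose vertices are listed in order."""
    n = len(vertices)
    angles = []
    for i in range(n):
        px, py = vertices[i - 1]
        cx, cy = vertices[i]
        qx, qy = vertices[(i + 1) % n]
        ux, uy = px - cx, py - cy
        wx, wy = qx - cx, qy - cy
        cos_a = (ux * wx + uy * wy) / (math.hypot(ux, uy) * math.hypot(wx, wy))
        angles.append(math.acos(max(-1.0, min(1.0, cos_a))))
    return angles


def vertex_edge_distance(vertices):
    """d = min distance between a vertex and an edge not containing it."""
    n = len(vertices)
    best = math.inf
    for i in range(n):
        for k in range(n):
            # Edge k joins vertices k and k+1; skip the two edges through vertex i.
            if i == k or i == (k + 1) % n:
                continue
            dist = point_segment_distance(vertices[i], vertices[k], vertices[(k + 1) % n])
            best = min(best, dist)
    return best


def threshold(vertices):
    """delta_bar = d^2 / (2 * alpha_min), as in (5)."""
    d = vertex_edge_distance(vertices)
    return d * d / (2.0 * min(interior_angles(vertices)))


RECTANGLE = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 2.0)]
TRIANGLE = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0)]
```

For the rectangle with sides $L_1 = 1 \le L_2 = 2$ this gives $d = 1$, $\alpha_{\min} = \pi/2$ and $\bar\delta = 1/\pi$, which agrees with the bound $\delta < L_1^2/\pi$ quoted in Theorem 2.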

Remark 1. Notice that, as long as $n \ge 4$, $d$ corresponds to the shortest distance between two non-consecutive edges:
$$d = \min\{\operatorname{dist}(x_i,x_j) : x_i \in e_i,\ x_j \in e_j,\ e_i \cap e_j = \emptyset\}.$$
Moreover, for any $n$, $0 < \bar\delta < |\Omega|$. Indeed, let $e_i \cap e_j \in V_{\min}$, with $|e_i| \le |e_j|$. Then
$$d \le |e_i| \sin\alpha_{\min} \quad \text{and} \quad |\Omega| \ge \frac{1}{2}|e_i||e_j| \sin\alpha_{\min},$$
and the claim follows since $\sin\alpha_{\min} < \alpha_{\min}$.

Our main results are the following.

Theorem 3. Let $\Omega \subset \mathbb{R}^2$ denote a convex n-gon, let $\bar\delta$ be defined in (5), and let us assume that $0 < \delta < \bar\delta$. Then $M(\delta)$ is achieved by $D^*$ if and only if $D^* = B_{r(\delta)}(p) \cap \Omega$, where $p \in V_{\min}$. Moreover
$$\beta > \max\left\{ \left(\frac{\delta}{\bar\delta - \delta}\right)^3, 1 \right\} \implies (1+\beta^{-1/3})^{-1}\big(1-\beta^{-1/3}\big)^2 < \frac{\Lambda(\beta,\delta)}{M(\delta)} < 1.$$
By taking advantage of the asymptotic information on $\Lambda(\beta,\delta)/M(\delta)$, we can deduce the corresponding relation between the eigenvalue of a spherical shape and the minimum $\Lambda(\beta,\delta)$.

Theorem 4. Let $\Omega \subset \mathbb{R}^2$ denote a convex n-gon, $\beta > 1$, and let us assume that

δ … R(Dδ, Ω) ≥ … whenever $\delta < \bar\delta$, which is fixed as $d^2/(2\alpha_{\min})$, so that we again get a contradiction, concluding the proof. Finally, the assertion concerning $K(\Omega,\bar\delta)$ follows from its definition and from the fact that, as we have just shown (see also [12, Corollary 4.3]), for all $\delta \le \bar\delta$ the quantity $I(\Omega,\delta) = \sqrt{2\alpha_{\min}}$ is a constant independent of $\delta$.

Fig. 1 Some possibilities for cases B (above) and C (below) in the proof of Lemma 1. The Dirichlet boundary ∂D∗ ∩ Ω is dashed.


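Remark 2. Notice that the threshold $\bar\delta$ in Lemma 1 has no reason to be optimal. On the other hand, one can easily check that in the case of a rectangle, as treated in Theorem 2, it is actually optimal, since, for $\delta > \bar\delta$, $I(\Omega,\delta)$ is achieved by a rectangle (see e.g. [12, Remark 4.5]).

We are now in position to prove our main results.

Proof (Proof of Theorem 3). First of all, we take $\varepsilon \in (\delta/\beta,\, \bar\delta-\delta) \neq \emptyset$ by the assumption on $\delta$, and we apply [12, Corollary 4.3] and Lemma 1 to deduce that
$$M(\delta) = K^2(\Omega,\delta)\,\delta^{-1}\lambda_1^{\mathrm{Dir}} = \alpha_{\min}(2\delta)^{-1}\lambda_1^{\mathrm{Dir}},$$
$$M(\delta+\varepsilon) = K^2(\Omega,\delta+\varepsilon)\,(\delta+\varepsilon)^{-1}\lambda_1^{\mathrm{Dir}} = \alpha_{\min}[2(\delta+\varepsilon)]^{-1}\lambda_1^{\mathrm{Dir}},$$
where $\lambda_1^{\mathrm{Dir}}$ stands for the first eigenvalue of the Dirichlet-Laplacian in the ball of unit radius. By Theorem 1 we obtain
$$1 \ge \frac{\Lambda(\beta,\delta)}{M(\delta)} \ge \frac{M(\delta+\varepsilon)}{M(\delta)}\left(1-\sqrt{\frac{\delta}{\varepsilon\beta}}\right)^{2} = \frac{\delta}{\delta+\varepsilon}\left(1-\sqrt{\frac{\delta}{\varepsilon\beta}}\right)^{2},$$
for all $\varepsilon \in (\delta/\beta,\, \bar\delta-\delta)$. Then we make the choice $\varepsilon = \delta/\beta^{1/3}$, which is admissible since $\beta > 1$ and $\delta < \beta^{1/3}\bar\delta/(1+\beta^{1/3})$, and obtain
$$1 \ge \frac{\Lambda(\beta,\delta)}{M(\delta)} \ge \frac{1}{1+\beta^{-1/3}}\left(1-\beta^{-1/3}\right)^{2}, \tag{9}$$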

yielding the conclusion.

Proof (Proof of Theorem 4). Calling $D^* = B_{r(\delta)}(p) \cap \Omega$, for some $p \in V_{\min}$, and using conclusion 2 of [12, Lemma 3.1], we infer that $\lambda(\beta,D^*) \le \mu(D^*,\Omega)$. As a consequence we can use Theorem 3 to write
$$1 \le \frac{\lambda(\beta,D^*)}{\Lambda(\beta,\delta)} \le \frac{M(\delta)}{\Lambda(\beta,\delta)} \le \left(1+\beta^{-1/3}\right)\left(1-\beta^{-1/3}\right)^{-2}.$$

Remark 3. The estimate of Theorem 4 can be read as
$$1 \le \frac{\lambda(\beta,D^*)}{\Lambda(\beta,\delta)} \le 1 + 3\beta^{-1/3} + o(\beta^{-1/3}) \qquad \text{as } \beta \to \infty.$$
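The bounds in Remark 3 are elementary consequences of the closed-form upper bound $(1+t)(1-t)^{-2}$ with $t = \beta^{-1/3}$ from Theorem 4. The following is only a numerical sanity check, not part of the proof; the function names are illustrative. It compares the closed form with the first-order expansion $1 + 3t$ and with the explicit bound $1 + 15t + 14t^2$ stated for $\beta > 8$.

```python
def theorem4_bound(beta: float) -> float:
    """Closed-form upper bound (1 + t) / (1 - t)**2 with t = beta**(-1/3)."""
    if beta <= 1.0:
        raise ValueError("the bound is stated for beta > 1")
    t = beta ** (-1.0 / 3.0)
    return (1.0 + t) / (1.0 - t) ** 2


def first_order_bound(beta: float) -> float:
    """Leading part 1 + 3 t of the expansion as beta -> infinity."""
    t = beta ** (-1.0 / 3.0)
    return 1.0 + 3.0 * t


def explicit_bound(beta: float) -> float:
    """Explicit bound 1 + 15 t + 14 t**2, stated in the text for beta > 8."""
    t = beta ** (-1.0 / 3.0)
    return 1.0 + 15.0 * t + 14.0 * t * t


if __name__ == "__main__":
    for beta in (10.0, 1e3, 1e6, 1e9):
        print(beta, first_order_bound(beta), theorem4_bound(beta), explicit_bound(beta))
```

The gap between the closed form and $1 + 3t$ equals $t^2(5-3t)/(1-t)^2$, which is $O(\beta^{-2/3})$ and hence consistent with the $o(\beta^{-1/3})$ remainder.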

On the other hand, even without using asymptotic expansions, the estimate becomes more precise as $\beta$ increases. As an example, for all $\beta > 8$ one has the explicit estimate
$$1 \le \frac{\lambda(\beta,D^*)}{\Lambda(\beta,\delta)} \le 1 + 15\beta^{-1/3} + 14\beta^{-2/3}.$$

Acknowledgements Work partially supported by the project ERC Advanced Grant 2013 n. 339958: “Complex Patterns for Strongly Interacting Dynamical Systems - COMPAT”, by the PRIN-2015KB9WPT Grant: “Variational methods, with applications to problems in mathematical physics and geometry”, and by the INdAM-GNAMPA group.

References

1. Berestycki, H., Hamel, F., Roques, L.: Analysis of the periodically fragmented environment model. I. Species persistence. J. Math. Biol. 51(1), 75–113 (2005). http://dx.doi.org/10.1007/s00285-004-0313-3
2. Buttazzo, G., Velichkov, B.: The spectral drop problem. In: Recent advances in partial differential equations and applications, Contemp. Math., vol. 666, pp. 111–135. Amer. Math. Soc., Providence, RI (2016). https://doi.org/10.1090/conm/666/13236
3. Cantrell, R.S., Cosner, C.: Diffusive logistic equations with indefinite weights: population models in disrupted environments. Proc. Roy. Soc. Edinburgh Sect. A 112(3-4), 293–318 (1989). http://dx.doi.org/10.1017/S030821050001876X
4. Cantrell, R.S., Cosner, C.: The effects of spatial heterogeneity in population dynamics. J. Math. Biol. 29(4), 315–338 (1991). http://dx.doi.org/10.1007/BF00167155
5. Cantrell, R.S., Cosner, C.: Spatial ecology via reaction-diffusion equations. Wiley Series in Mathematical and Computational Biology. John Wiley & Sons, Ltd., Chichester (2003). http://dx.doi.org/10.1002/0470871296
6. Cianchi, A.: On relative isoperimetric inequalities in the plane. Boll. Un. Mat. Ital. B (7) 3(2), 289–325 (1989)
7. Kao, C.Y., Lou, Y., Yanagida, E.: Principal eigenvalue for an elliptic problem with indefinite weight on cylindrical domains. Math. Biosci. Eng. 5(2), 315–335 (2008). http://dx.doi.org/10.3934/mbe.2008.5.315
8. Kielty, D.: Singular limits of sign-changing weighted eigenproblems. ArXiv e-prints arxiv:1812.03617 (2018). https://arxiv.org/pdf/1812.03617
9. Lamboley, J., Laurain, A., Nadin, G., Privat, Y.: Properties of optimizers of the principal eigenvalue with indefinite weight and Robin conditions. Calc. Var. Partial Differential Equations 55(6), Paper No. 144, 37 (2016). http://dx.doi.org/10.1007/s00526-016-1084-6
10. Lions, P.L., Pacella, F., Tricarico, M.: Best constants in Sobolev inequalities for functions vanishing on some part of the boundary and related questions. Indiana Univ. Math. J. 37(2), 301–324 (1988). https://doi.org/10.1512/iumj.1988.37.37015
11. Lou, Y., Yanagida, E.: Minimization of the principal eigenvalue for an elliptic boundary value problem with indefinite weight, and applications to population dynamics. Japan J. Indust. Appl. Math. 23(3), 275–292 (2006). http://projecteuclid.org/euclid.jjiam/1197390801
12. Mazzoleni, D., Pellacci, B., Verzini, G.: Asymptotic spherical shapes in some spectral optimization problems. J. Math. Pures Appl., in press, https://doi.org/10.1016/j.matpur.2019.10.002
13. Pacella, F., Tricarico, M.: Symmetrization for a class of elliptic equations with mixed boundary conditions. Atti Sem. Mat. Fis. Univ. Modena 34(1), 75–93 (1985/86)
14. Ritoré, M., Vernadakis, E.: Isoperimetric inequalities in Euclidean convex bodies. Trans. Amer. Math. Soc. 367(7), 4983–5014 (2015). https://doi.org/10.1090/S0002-9947-2015-06197-2
15. Roques, L., Hamel, F.: Mathematical analysis of the optimal habitat configurations for species persistence. Math. Biosci. 210(1), 34–59 (2007). http://dx.doi.org/10.1016/j.mbs.2007.05.007

A rigidity theorem for ideal surfaces with flat boundary

James McCoy and Glen Wheeler

Abstract We are interested in surfaces with boundary satisfying a sixth order nonlinear elliptic partial differential equation associated with extremal surfaces of the L2 -norm of the gradient of the mean curvature. We show that such surfaces satisfying so-called ‘flat boundary conditions’ and small L2 -norm of the second fundamental form are necessarily planar.

1 Introduction

1.1 The energy, notation

We are interested in the geometric energy
$$\mathcal{F}[f] = \int_\Sigma |\nabla H|^2\,d\mu$$
under the hypothesis
$$\int_\Sigma |A|^2\,d\mu \le \varepsilon_0 \tag{1}$$
where $\varepsilon_0 > 0$ is a small, universal constant. Here $f : \Sigma \to \mathbb{R}^3$ is a smooth immersion of a surface $\Sigma$ with boundary; $d\mu$ is the induced surface area element; $H = \kappa_1 + \kappa_2$ and $|A|^2 = \kappa_1^2 + \kappa_2^2$ are respectively the mean curvature and the squared norm of the second fundamental form of $f(\Sigma)$; and $\nabla$ is the covariant derivative on $f(\Sigma)$.

James McCoy, School of Mathematical and Physical Sciences, University of Newcastle
Glen Wheeler, School of Mathematics and Applied Statistics, University of Wollongong

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_19

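1.2 The normal variation

Lemma 1. Given a smooth normal variation $\phi : \Sigma \to \mathbb{R}^3$ of $f : \Sigma \to \mathbb{R}^3$,
$$\begin{aligned}
\frac{d}{d\varepsilon}\mathcal{F}[f+\varepsilon\phi]\Big|_{\varepsilon=0}
 ={}& -2\int_\Sigma \left(\Delta^2 H + |A|^2\Delta H - (A^0)^{ij}\nabla_i H\nabla_j H\right)\langle\phi,\nu\rangle\,d\mu \\
 & + 2\int_{\partial\Sigma}\left\langle \left(\Delta\varphi + |A|^2\varphi\right)\nabla H + \nabla\Delta H\,\varphi - \Delta H\,\nabla\varphi,\ \eta\right\rangle d\sigma,
\end{aligned} \tag{2}$$
where $\varphi := \langle\phi,\nu\rangle$.

Above, $(A^0)^{ij} = A^{ij} - \frac{1}{2}H g^{ij}$ is the trace-free part of $A$. The metric is denoted $g_{ij}$; the entries of its inverse matrix are $g^{ij}$. The unit normal to $f(\Sigma)$ is denoted $\nu$, while the unit conormal to the boundary is denoted $\eta$.

Idea of proof: The result follows from the variations
$$\frac{\partial}{\partial\varepsilon} g^{\varepsilon}_{ij}\Big|_{\varepsilon=0} = -2\varphi A_{ij}, \qquad \frac{\partial}{\partial\varepsilon} g_{\varepsilon}^{ij}\Big|_{\varepsilon=0} = 2\varphi A^{ij},$$
$$\frac{\partial}{\partial\varepsilon}\sqrt{\det(g^{\varepsilon}_{ij})}\,\Big|_{\varepsilon=0} = -H\varphi\sqrt{\det(g_{ij})}, \qquad \frac{\partial}{\partial\varepsilon} H_\varepsilon\Big|_{\varepsilon=0} = \Delta\varphi + \varphi|A|^2.$$
The boundary terms arise from ‘integration by parts’ on $\Sigma$ with boundary, i.e. via the Divergence Theorem
$$\int_\Sigma \operatorname{div}_\Sigma X\,d\mu = \int_{\partial\Sigma}\langle X,\eta\rangle\,d\sigma.$$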

1.3 The boundary value problem

If $f(\Sigma)$ were closed without boundary, there would be no boundary terms and critical points of $\mathcal{F}[f]$ would satisfy
$$I[f] := \Delta^2 H + |A|^2\Delta H - (A^0)^{ij}\nabla_i H\nabla_j H = 0. \tag{3}$$
We will impose flat boundary conditions on $\partial\Sigma$:
$$|A| = 0 \quad\text{and}\quad \nabla_\eta H = \nabla_\eta\Delta H = 0 \tag{4}$$
(defined in terms of limits approaching the boundary). Then the boundary terms in (2) disappear and we are left with (3) for critical points of the energy. We study smooth solutions of (3) with boundary conditions (4) and smallness condition (1).

Theorem 1. Suppose $f : \Sigma \to \mathbb{R}^3$ satisfies (3) with boundary conditions (4). If $f$ also satisfies (1) for $\varepsilon_0 > 0$ sufficiently small, then $f(\Sigma)$ is part of a flat plane.

Remark: In the case of $f : \Sigma \to \mathbb{R}^k$, $k > 3$, (3) may be replaced by the condition $\langle I[f], H\rangle = 0$.

Idea of proof: We establish that surfaces satisfying (4) and (1) also satisfy
$$\int_\Sigma \left(|\nabla^2 A|^2 + |A|^2|\nabla A|^2 + |A|^4|A^0|^2\right)\gamma^p\,d\mu \le c\int_\Sigma I[f]\,H\,\gamma^p\,d\mu + c\,c_\gamma^4\int_{[\gamma>0]}|A|^2\,d\mu,$$
for appropriately chosen cutoff functions $\gamma$ (see Definition 1 below). For surfaces additionally satisfying (3), the first term on the right hand side above disappears and the other term is bounded by $c/\rho^4$. Taking $\rho \to \infty$ we see that $f(\Sigma)$ must have $|A|^4|A^0|^2 \equiv 0$. Since $|A^0|^6 \le |A|^4|A^0|^2$ we see that $|A^0|^2 \equiv 0$, implying $f(\Sigma)$ is either part of a sphere or part of a plane. The boundary condition (4) implies $f(\Sigma)$ is part of a plane.
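The pointwise inequality $|A^0|^6 \le |A|^4|A^0|^2$ used in the last step of this argument can be checked directly from the principal curvatures $\kappa_1, \kappa_2$ introduced in Section 1.1. The short computation below is a sketch using only the definitions $H = \kappa_1 + \kappa_2$, $|A|^2 = \kappa_1^2 + \kappa_2^2$ and $A^0 = A - \frac12 H g$:

```latex
\begin{align*}
|A^0|^2 &= |A|^2 - \tfrac12 H^2
         = \kappa_1^2 + \kappa_2^2 - \tfrac12(\kappa_1 + \kappa_2)^2
         = \tfrac12(\kappa_1 - \kappa_2)^2, \\
|A^0|^2 &\le |A|^2
  \quad\Longrightarrow\quad
  |A^0|^6 = |A^0|^4\,|A^0|^2 \le |A|^4\,|A^0|^2 .
\end{align*}
```

In particular $A^0 \equiv 0$ forces $\kappa_1 = \kappa_2$ at every point, so $f(\Sigma)$ is totally umbilic, which is why only pieces of spheres and planes remain.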

2 Earlier related work

2.1 Previous geometric gap lemmas

Previous higher-order geometric gap lemmas closest in spirit to ours are for Willmore surfaces [2]; for stationary solutions of the surface diffusion flow [7]; for biharmonic surfaces [8]; for some Helfrich surfaces (constrained Willmore surfaces) [5], with some others covered in [1]; for triharmonic surfaces [3] and for polyharmonic surfaces [6]. For several of these there are also versions for surfaces with boundary, with either of two boundary conditions:

1. umbilic boundary conditions $|\nabla A^0| = |A^0| = 0$; or
2. flat boundary conditions $|\nabla A| = |A| = 0$.

Under suitable smallness conditions, umbilic boundary conditions lead to parts of spheres and planes, while flat boundary conditions allow planes only [9]. Often (including here) results hold for arbitrary codimension, but we will restrict here to codimension 1 for convenience.

2.2 Some tools

Definition 1 (Cut-off functions). For $\tilde\gamma \in C_c^2(\mathbb{R}^3)$, $\gamma = \tilde\gamma \circ f : \Sigma \to [0,1]$ satisfies
$$\|\nabla\gamma\|_\infty \le c_\gamma, \qquad \|\nabla^2\gamma\|_\infty \le c_\gamma\left(c_\gamma + |A|\right),$$
for some absolute constant $c_\gamma < \infty$.

Theorem 2 (Michael-Simon Sobolev Inequality with boundary, [4]). For $f : M^m \to \mathbb{R}^n$ a smooth immersion of $M$ with boundary $\partial M$ into $\mathbb{R}^n$ and any $u \in C^1(\overline{M})$,

 M

 m−1

m

|u| m−1 d μ

2

4m+1



1/m

ωm

 M

(|∇u| + |H| |u|) d μ +



 ∂M

|u| d σ

where ωm is the volume of the unit ball in Rm .

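3 Estimates

Using the Divergence Theorem on $\Sigma$ (integration by parts) we begin with

Lemma 2. Surfaces satisfying (4) also satisfy
$$\begin{aligned}
\int_\Sigma (\Delta H)^2\gamma^p\,d\mu
 ={}& \int_\Sigma I[f]\,H\,\gamma^p\,d\mu + \int_\Sigma |A|^2|\nabla H|^2\gamma^p\,d\mu + \int_\Sigma H\,\nabla_i H\,\nabla^i|A|^2\,\gamma^p\,d\mu \\
 & + \int_\Sigma H\,(A^0)^{ij}\nabla_i H\nabla_j H\,\gamma^p\,d\mu
   + p\int_\Sigma \left(H\nabla^i\Delta H + \left(H|A|^2 - \Delta H\right)\nabla^i H\right)\nabla_i\gamma\cdot\gamma^{p-1}\,d\mu .
\end{aligned}$$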
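A sketch of where the identity of Lemma 2 comes from, under the assumption (which the flat boundary conditions (4) are designed to guarantee) that every boundary integral produced by the Divergence Theorem vanishes: multiply $I[f]$ by $H\gamma^p$ and integrate the two highest-order terms by parts.

```latex
\begin{align*}
\int_\Sigma H\,\Delta^2 H\,\gamma^p\,d\mu
 &= -\int_\Sigma \nabla^i\Delta H\,\nabla_i H\,\gamma^p\,d\mu
    - p\int_\Sigma H\,\nabla^i\Delta H\,\nabla_i\gamma\,\gamma^{p-1}\,d\mu \\
 &= \int_\Sigma (\Delta H)^2\gamma^p\,d\mu
    + p\int_\Sigma \Delta H\,\nabla^i H\,\nabla_i\gamma\,\gamma^{p-1}\,d\mu
    - p\int_\Sigma H\,\nabla^i\Delta H\,\nabla_i\gamma\,\gamma^{p-1}\,d\mu, \\
\int_\Sigma H\,|A|^2\,\Delta H\,\gamma^p\,d\mu
 &= -\int_\Sigma |A|^2|\nabla H|^2\gamma^p\,d\mu
    - \int_\Sigma H\,\nabla_i H\,\nabla^i|A|^2\,\gamma^p\,d\mu
    - p\int_\Sigma H\,|A|^2\,\nabla^i H\,\nabla_i\gamma\,\gamma^{p-1}\,d\mu .
\end{align*}
```

Adding the two lines, subtracting $\int_\Sigma H (A^0)^{ij}\nabla_i H\nabla_j H\,\gamma^p\,d\mu$ to form $\int_\Sigma I[f]\,H\,\gamma^p\,d\mu$, and solving for $\int_\Sigma (\Delta H)^2\gamma^p\,d\mu$ gives the stated identity.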

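Lemma 3. Surfaces satisfying (4) also satisfy
$$\begin{aligned}
\int_\Sigma |\nabla^2 H|^2\gamma^p\,d\mu
 \le{}& c\int_\Sigma I[f]\,H\,\gamma^p\,d\mu + c\int_\Sigma |A|^2|\nabla A|^2\gamma^p\,d\mu + c\,c_\gamma^2\int_\Sigma |\nabla A^0|^2\gamma^{p-2}\,d\mu \\
 & + p\int_\Sigma \left(H\nabla^i\Delta H + \left(H|A|^2 - \Delta H\right)\nabla^i H\right)\nabla_i\gamma\cdot\gamma^{p-1}\,d\mu .
\end{aligned}$$

Idea of proof: [9] has, for a universal constant $c$,
$$\frac{1}{c}\int_\Sigma \left(|\nabla^2 H|^2 + H^2|\nabla H|^2\right)\gamma^p\,d\mu \le \int_\Sigma \left((\Delta H)^2 + |A^0|^2|\nabla H|^2\right)\gamma^p\,d\mu + c_\gamma^2\int_\Sigma |\nabla A^0|^2\gamma^{p-2}\,d\mu .$$
Further, we estimate
$$\int_\Sigma H\,(A^0)^{ij}\nabla_i H\nabla_j H\,\gamma^p\,d\mu \le \frac{1}{2c}\int_\Sigma H^2|\nabla H|^2\gamma^p\,d\mu + \frac{c}{2}\int_\Sigma |A^0|^2|\nabla H|^2\gamma^p\,d\mu ;$$
$$\int_\Sigma H\,\nabla_i|A|^2\,\nabla^i H\,\gamma^p\,d\mu \le \tilde c\int_\Sigma |A|^2|\nabla A|^2\gamma^p\,d\mu .$$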

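Combining these with the previous Lemma yields the result.

Lemma 4. Surfaces satisfying (4) also satisfy
$$\begin{aligned}
\int_\Sigma \left(|\nabla^2 H|^2 + |A|^4|A^0|^2\right)\gamma^p\,d\mu
 \le{}& c\int_\Sigma I[f]\,H\,\gamma^p\,d\mu + c\int_\Sigma |A|^2|\nabla A|^2\gamma^p\,d\mu + c\int_\Sigma |A^0|^6\gamma^p\,d\mu + c\,c_\gamma^2\int_\Sigma |\nabla A^0|^2\gamma^{p-2}\,d\mu \\
 & + c\,c_\gamma^4\int_\Sigma |A^0|^2\gamma^{p-4}\,d\mu + p\int_\Sigma \left(H\nabla^i\Delta H + \left(H|A|^2 - \Delta H\right)\nabla^i H\right)\nabla_i\gamma\cdot\gamma^{p-1}\,d\mu .
\end{aligned}$$

Idea of proof: [9] has
$$\int_\Sigma \left(H^4|A^0|^2 + H^2|\nabla A^0|^2\right)\gamma^p\,d\mu \le c\int_\Sigma \left(H^2|\nabla H|^2 + |A^0|^2|\nabla A^0|^2 + |A^0|^6\right)\gamma^p\,d\mu + c\,c_\gamma^4\int_\Sigma |A^0|^2\gamma^{p-4}\,d\mu .$$
The result follows using Lemma 3 and
$$\int_\Sigma |A|^4|A^0|^2\gamma^p\,d\mu \le (1+\varepsilon)\int_\Sigma H^4|A^0|^2\gamma^p\,d\mu + (1 + c(\varepsilon))\int_\Sigma |A^0|^6\gamma^p\,d\mu .$$

Lemma 5. Surfaces satisfying (4) also satisfy
$$\begin{aligned}
\int_\Sigma \left(|\nabla^2 A|^2 + |A|^2|\nabla A|^2 + |A|^4|A^0|^2\right)\gamma^p\,d\mu
 \le{}& c\int_\Sigma I[f]\,H\,\gamma^p\,d\mu + c\int_\Sigma |A|^2|\nabla A|^2\gamma^p\,d\mu + c\int_\Sigma |A^0|^6\gamma^p\,d\mu + c\,c_\gamma^2\int_\Sigma |\nabla A^0|^2\gamma^{p-2}\,d\mu \\
 & + c\,c_\gamma^4\int_\Sigma |A^0|^2\gamma^{p-4}\,d\mu + p\int_\Sigma \left(H\nabla^i\Delta H + \left(H|A|^2 - \Delta H\right)\nabla^i H\right)\nabla_i\gamma\cdot\gamma^{p-1}\,d\mu .
\end{aligned}$$

Idea of proof: Simons’ identity implies
$$|\Delta A^0|^2 \le c\,|\nabla^2 H|^2 + c\,H^4|A^0|^2 + c\,|A^0|^6 ;$$
interchange of second covariant derivatives and the Divergence Theorem then shows
$$\int_\Sigma |\nabla^2 A^0|^2\gamma^p\,d\mu \le 2\int_\Sigma |\Delta A^0|^2\gamma^p\,d\mu + c\int_\Sigma \left(|A|^2|\nabla A|^2 + |A^0|^6\right)\gamma^p\,d\mu + c\,c_\gamma^2\int_\Sigma |\nabla A^0|^2\gamma^{p-2}\,d\mu .$$
The result then follows using the previous Lemmas.

Lemma 6. Surfaces satisfying (4) and (1) also satisfy
$$\int_\Sigma \left(|\nabla^2 A|^2 + |A|^2|\nabla A|^2 + |A|^4|A^0|^2\right)\gamma^p\,d\mu \le c\int_\Sigma I[f]\,H\,\gamma^p\,d\mu + c\,c_\gamma^4\int_\Sigma |A|^2\gamma^{p-4}\,d\mu .$$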



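Idea of proof: Write $\|A\|^2_{2,[\gamma>0]} = \int_{[\gamma>0]} |A|^2\,d\mu$. The idea is to use the smallness condition to estimate the terms on the right hand side of Lemma 4. [9] showed using the Michael-Simon Sobolev inequality
$$\int_\Sigma \left(|A^0|^2|A|^4 + |A|^2|\nabla A|^2\right)\gamma^p\,d\mu \le c\,\|A\|^2_{2,[\gamma>0]}\int_\Sigma \left(|\nabla^2 A^0|^2 + |A|^2|\nabla A^0|^2 + |A|^4|A^0|^2\right)\gamma^p\,d\mu + c\,c_\gamma^4\,\|A\|^4_{2,[\gamma>0]},$$
so we can absorb the non-$c_\gamma$ terms on the right hand side of Lemma 4. We estimate the $c_\gamma$ terms from Lemma 4 as follows:
$$c\,c_\gamma^4\int_\Sigma |A^0|^2\gamma^{p-4}\,d\mu \le c\,c_\gamma^4\int_\Sigma |A|^2\gamma^{p-4}\,d\mu ;$$
via the Divergence Theorem, Cauchy-Schwarz and Peter-Paul
$$c\,c_\gamma^2\int_\Sigma |\nabla A^0|^2\gamma^{p-2}\,d\mu \le \varepsilon\int_\Sigma |\nabla^2 A|^2\gamma^p\,d\mu + c(\varepsilon)\,c_\gamma^4\int_\Sigma |A|^2\gamma^{p-4}\,d\mu ;$$
with this in turn we estimate
$$p\int_\Sigma \Delta H\,\nabla^i H\,\nabla_i\gamma\cdot\gamma^{p-1}\,d\mu \le c\,c_\gamma\int_\Sigma |\nabla^2 A|\,|\nabla H|\,\gamma^{p-1}\,d\mu \le \varepsilon\int_\Sigma |\nabla^2 A|^2\gamma^p\,d\mu + c(\varepsilon)\,c_\gamma^4\int_\Sigma |A|^2\gamma^{p-4}\,d\mu ;$$
$$p\int_\Sigma H\,|A|^2\,\nabla^i H\,\nabla_i\gamma\cdot\gamma^{p-1}\,d\mu \le \varepsilon\int_\Sigma |A|^2|\nabla A^0|^2\gamma^p\,d\mu + c(\varepsilon)\,c_\gamma^2\int_\Sigma H^2|A|^2\gamma^{p-2}\,d\mu .$$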

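Now by the Michael-Simon Sobolev inequality ($|A| = 0$ on $\partial\Sigma$)
$$\begin{aligned}
\int_\Sigma H^2|A|^2\gamma^{p-2}\,d\mu
 &\le c\int_\Sigma |A|^4\gamma^{p-2}\,d\mu \\
 &\le c\left(\int_\Sigma \left|\nabla|A|^2\right|\gamma^{\frac{p-2}{2}}\,d\mu\right)^{2} + c\left(\int_\Sigma |A|^3\gamma^{\frac{p-2}{2}}\,d\mu\right)^{2} + c\,c_\gamma^2\left(\int_\Sigma |A|^2\gamma^{\frac{p-4}{2}}\,d\mu\right)^{2} \\
 &\le c\left(\int_\Sigma |\nabla A|\,|A|\,\gamma^{\frac{p-2}{2}}\,d\mu\right)^{2} + c\,c_\gamma^2\left(\int_\Sigma |A|^2\gamma^{\frac{p-4}{2}}\,d\mu\right)^{2} + c\,\|A\|^2_{2,[\gamma>0]}\int_\Sigma |A|^4\gamma^{p-2}\,d\mu .
\end{aligned}$$
Absorbing on the left and using the Cauchy-Schwarz inequality we obtain
$$\int_\Sigma |A|^4\gamma^{p-2}\,d\mu \le c\,\|A\|^2_{2,[\gamma>0]}\int_\Sigma |\nabla A|^2\gamma^{p-2}\,d\mu + c\,c_\gamma^2\,\|A\|^4_{2,[\gamma>0]}$$
and so
$$c\,c_\gamma^2\int_\Sigma H^2|A|^2\gamma^{p-2}\,d\mu \le c\,c_\gamma^2\,\|A\|^2_{2,[\gamma>0]}\int_\Sigma |\nabla A|^2\gamma^{p-2}\,d\mu + c\,c_\gamma^4\,\|A\|^4_{2,[\gamma>0]} .$$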

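For the remaining term we use the Divergence Theorem ($H = 0$ on $\partial\Sigma$)
$$\int_\Sigma H\,\nabla^i\Delta H\,\nabla_i\gamma\cdot\gamma^{p-1}\,d\mu = -\int_\Sigma \Delta H\,\nabla^i H\,\nabla_i\gamma\cdot\gamma^{p-1}\,d\mu - \int_\Sigma H\,\Delta H\,\Delta\gamma\cdot\gamma^{p-1}\,d\mu - (p-1)\int_\Sigma H\,\Delta H\,|\nabla\gamma|^2\gamma^{p-2}\,d\mu .$$
We now estimate
$$\int_\Sigma \Delta H\,\nabla^i H\,\nabla_i\gamma\cdot\gamma^{p-1}\,d\mu \le \varepsilon\int_\Sigma (\Delta H)^2\gamma^p\,d\mu + c(\varepsilon)\,c_\gamma^2\int_\Sigma |\nabla H|^2\gamma^{p-2}\,d\mu ;$$
$$\begin{aligned}
\int_\Sigma H\,\Delta H\,\Delta\gamma\cdot\gamma^{p-1}\,d\mu
 &\le c\,c_\gamma\int_\Sigma |H|\,|\Delta H|\left(c_\gamma + |A|\right)\gamma^{p-1}\,d\mu \\
 &\le \varepsilon\int_\Sigma (\Delta H)^2\gamma^p\,d\mu + c\,c_\gamma^4\int_\Sigma |A|^2\gamma^{p-2}\,d\mu + c\,c_\gamma^2\int_\Sigma H^2|A|^2\gamma^{p-2}\,d\mu
\end{aligned}$$
and
$$-(p-1)\int_\Sigma H\,\Delta H\,|\nabla\gamma|^2\gamma^{p-2}\,d\mu \le \varepsilon\int_\Sigma (\Delta H)^2\gamma^p\,d\mu + c\,c_\gamma^4\int_\Sigma H^2\gamma^{p-4}\,d\mu .$$
Inserting all these estimates and absorbing on the left yields the result.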



Acknowledgements Research supported by the Australian Research Council DP150100375 and DP180100431. The authors also acknowledge Benjamin Maldon (University of Newcastle) for assistance with typesetting through the University of Newcastle Priority Research Centre for Computer-Assisted Research Mathematics and its Applications (CARMA).

References

1. Y. Bernard, G. Wheeler, V.-M. Wheeler, Spherocytosis and the Helfrich model, Interfaces And Free Boundaries, 19 (2017) no. 4, 495–523.
2. E. Kuwert, R. Schätzle, The Willmore flow with small initial energy, J. Differential Geom. 57 (2001), 409–441.
3. J. McCoy, S. Parkins, G. Wheeler, The geometric triharmonic heat flow of immersed surfaces near spheres, Nonlinear Anal. 161 (2017), 44–86.
4. J. H. Michael, L. Simon, Sobolev and mean-value inequalities on generalized submanifolds of R^n, Comm. Pure Appl. Math. 26 (1973), 361–379.
5. J. McCoy, G. Wheeler, A classification theorem for Helfrich surfaces, Math. Ann. 357 (2013) no. 4, 1485–1508.
6. S. Parkins, A selection of higher-order parabolic curvature flows, PhD thesis, Uni. Wollongong, 2017.
7. G. Wheeler, Surface diffusion flow near spheres, Calc. Var. 44 (2012), no. 1-2, 131–151.
8. G. Wheeler, Chen’s conjecture and ε-superbiharmonic submanifolds of Riemannian manifolds, Int. J. Math. 24 (2013) no. 4, 1350028.
9. G. Wheeler, Gap phenomena for a class of fourth-order geometric differential operators on surfaces with boundary, Proc. Amer. Math. Soc. 143 (2014) no. 4,

On minimizers and critical points for anisotropic isoperimetric problems

Robin Neumayer

Abstract Anisotropic surface energies are a natural generalization of the perimeter functional that arise, for instance, in scaling limits for certain probabilistic models on lattices. We survey two recent results concerning isoperimetric problems with anisotropic surface energies. The first is joint work with Delgadino, Maggi, and Mihaila and provides a weak characterization of critical points in the anisotropic isoperimetric problem. The second is joint work with Choksi and Topaloglu and describes energy minimizers in an anisotropic variant of a model for atomic nuclei.

Robin Neumayer, Institute for Advanced Study

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_20

1 Introduction

The Euclidean isoperimetric problem, in which one minimizes the perimeter among sets of a fixed volume, is one of the most classical problems in mathematics and its study dates back over two millennia. In the language of modern calculus of variations, it is the minimization problem
$$\inf\{P(E) : |E| = 1\}, \tag{1}$$
where $E \subset \mathbb{R}^n$ is a set of finite perimeter and $P(E)$ is the distributional perimeter; see [27]. Modulo translations, the unique minimizer of (1) is the ball of volume one. Rephrased in a scaling invariant way, this fact gives the isoperimetric inequality:
$$P(E) \ge n|B|^{1/n}|E|^{(n-1)/n},$$
with equality if and only if $E$ is a translation or dilation of the unit ball $B$.

Anisotropic surface energies are a natural generalization of the perimeter functional, which frequently arise in models for equilibrium shapes of crystalline materials and in scaling limits of probabilistic models on lattices [5, 2]. Given a surface tension $f : \mathbb{R}^n \to \mathbb{R}$, i.e. a convex, positively one-homogeneous function with $f|_{S^{n-1}} > 0$, the anisotropic surface energy of a smooth open set $E \subset \mathbb{R}^n$ is given by
$$\mathcal{F}(E) = \int_{\partial E} f(\nu_E)\,d\mathcal{H}^{n-1}$$

where $\nu_E$ is the outer unit normal to $E$. The definition extends to sets of finite perimeter by integrating over the reduced boundary $\partial^* E$ and taking $\nu_E$ to be the measure theoretic outer unit normal. One can then study the corresponding minimization problem
$$\inf\{\mathcal{F}(E) : |E| = 1\}. \tag{2}$$
This problem is known as the Wulff problem, so named for the Russian crystallographer George Wulff, who in 1901 conjectured the form of energy minimizers [32]. The unique minimizer, modulo translations, is given by the Wulff shape
$$K = \bigcap_{\nu \in S^{n-1}} \{x \cdot \nu < f(\nu)\};$$
see [4, 15, 16, 30, 31]. This bounded convex set $K$ plays the role of the ball in the anisotropic setting. As with the Euclidean isoperimetric problem, one can express this minimality in the scaling invariant form
$$\mathcal{F}(E) \ge n|K|^{1/n}|E|^{(n-1)/n}, \tag{3}$$
with equality if and only if $E = rK + x$ for some $r > 0$ and $x \in \mathbb{R}^n$. Of course, the case when $f$ is the Euclidean norm corresponds to the classical notion of perimeter, in which case the Wulff shape is a ball. A less trivial example comes from the class of smooth, elliptic surface tensions: those that are smooth on $\mathbb{R}^n \setminus \{0\}$ and are ($\lambda$-)elliptic in the sense that
$$\lambda\,\mathrm{Id} \le \nabla^2 f(\nu) \le \lambda^{-1}\,\mathrm{Id} \quad \text{on } \nu^\perp \quad \forall \nu \in S^{n-1}.$$
The Wulff shapes for such norms are smooth and uniformly convex. From an analytic perspective, the surface energies arising from smooth elliptic norms share many desirable properties with the perimeter functional. However, many examples of surface tensions are neither smooth nor elliptic; in fact, every bounded convex set is the Wulff shape for some surface energy. In typical applications, the physically relevant surface tensions are crystalline surface tensions, those that are the maximum of finitely many linear functions. Wulff shapes corresponding to crystalline norms are convex polyhedra. So, from both applied and theoretical perspectives, an important question is to understand which structural aspects of anisotropic isoperimetric problems are dictated by the smoothness and ellipticity of the surface tension, and which are preserved when these assumptions are relaxed.
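For a crystalline surface tension in the plane, the Wulff inequality (3) can be tested by hand on polygons. The sketch below is illustrative only and makes its own choices: it takes $n = 2$ and $f(\nu) = \|\nu\|_1$, whose Wulff shape is the square $K = [-1,1]^2$ with $|K| = 4$, and all function names are hypothetical helpers. It evaluates the deficit $\mathcal{F}(E) - 2|K|^{1/2}|E|^{1/2}$ for a few counterclockwise polygons.

```python
import math
from typing import Callable, List, Tuple

Vec = Tuple[float, float]


def l1_norm(v: Vec) -> float:
    """Crystalline surface tension f(nu) = |nu_1| + |nu_2|."""
    return abs(v[0]) + abs(v[1])


def polygon_area(vertices: List[Vec]) -> float:
    """Shoelace formula; positive for counterclockwise vertex order."""
    pts = list(vertices)
    total = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        total += x0 * y1 - x1 * y0
    return 0.5 * total


def surface_energy(vertices: List[Vec], f: Callable[[Vec], float]) -> float:
    """Anisotropic energy of a counterclockwise polygon: sum of f(normal) * length."""
    pts = list(vertices)
    energy = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        dx, dy = x1 - x0, y1 - y0
        # (dy, -dx) is the outer unit normal scaled by the edge length; since f
        # is positively one-homogeneous, f((dy, -dx)) = f(nu) * length.
        energy += f((dy, -dx))
    return energy


def wulff_deficit(vertices: List[Vec], f: Callable[[Vec], float], wulff_area: float) -> float:
    """F(E) - n |K|^(1/n) |E|^((n-1)/n) for n = 2."""
    area = polygon_area(vertices)
    return surface_energy(vertices, f) - 2.0 * math.sqrt(wulff_area * area)


SQUARE = [(0.0, 0.0), (3.0, 0.0), (3.0, 3.0), (0.0, 3.0)]    # a dilated, translated K
DIAMOND = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]  # K rotated by 45 degrees
RECTANGLE = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]

if __name__ == "__main__":
    for name, poly in (("square", SQUARE), ("diamond", DIAMOND), ("rectangle", RECTANGLE)):
        print(name, wulff_deficit(poly, l1_norm, 4.0))
```

The axis-aligned square has zero deficit, as expected of a homothety of $K$, while the rotated square and the rectangle have strictly positive deficit.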

One example of a property that is independent of smoothness and ellipticity assumptions is seen through the work of [14]. The main result there states that the deviation of a set from achieving equality in (3) quadratically controls the distance of the set to a homothety of the Wulff shape. More specifically,
$$\frac{\mathcal{F}(E) - n|K|^{1/n}|E|^{(n-1)/n}}{n|K|^{1/n}|E|^{(n-1)/n}} \ge c\,\inf_{x \in \mathbb{R}^n}\left\{\frac{|E\,\Delta\,(rK + x)|}{|E|} : |rK| = |E|\right\}^{2}.$$
Remarkably, the constant $c$ depends only on the dimension. This can be seen as a uniform convexity property of the energy profile of $\mathcal{F}(E)$ near the global minimizer: after modding out by translations and dilations, the energy $\mathcal{F}(E)$ grows from its global minimum quadratically in the symmetric difference, with a modulus of convexity that is independent of the surface energy.

On the other hand, the following example provides a property that is dependent on ellipticity. If $f$ is an elliptic surface tension$^1$, then for any set of finite perimeter $E$ and any half-space $H$ intersecting $E$ nontrivially (i.e. $|E \cap H| > 0$ and $|E \setminus H| > 0$), one has
$$\mathcal{F}(E) > \mathcal{F}(E \cap H). \tag{4}$$
In particular, if one considers the Plateau problem with respect to $\mathcal{F}(E)$ and with boundary data, say, a copy of $S^{n-2}$ that is contained in a hyperplane, then the unique solution is given by the $(n-1)$-dimensional ball contained in this hyperplane. One sees (4) from the following simple calibration argument. For simplicity, say that $E$ is smooth. Let $\nu_H$ be the outer unit normal to $H$ and let $-x_0 \in \mathbb{R}^n$ be the slope of a supporting hyperplane to the convex function $f$ at $\nu_H$. The ellipticity of $f$ ensures that the hyperplane with slope $x_0$ is a supporting hyperplane to $f$ at exactly one $\nu_0 \in S^{n-1}$, and that $f(\nu) > \nu \cdot x_0$ for every other $\nu \in S^{n-1}$. So
$$\mathcal{F}(E) - \mathcal{F}(E \cap H) = \int_{\partial E \setminus H} f(\nu_E) - \int_{\partial H \cap E} f(\nu_H) \tag{5}$$
$$> \int_{\partial E \setminus H} x_0 \cdot \nu_E - \int_{\partial H \cap E} (-x_0) \cdot \nu_H = \int_{\partial R} x_0 \cdot \nu_R, \tag{6}$$
where we let $R = E \setminus H$. Now, by the divergence theorem we see that the right-hand side is equal to zero, which establishes (4). In contrast, in the absence of ellipticity assumptions on $f$, one can construct examples of quite dramatic failure of uniqueness for Plateau’s problem. For instance, considering $f(\nu) = \|\nu\|_\infty$ in $\mathbb{R}^2$, the line segment joining $(-1, 0)$ and $(1, 0)$ has the same energy as the segment joining $(-1, 0)$ and $(0, 1)$ union the segment joining $(0, 1)$ and $(1, 0)$.

Here, we survey two results concerning minimizers and critical points of anisotropic isoperimetric problems. The first, which is joint work with Delgadino, Maggi, and Mihaila [11], points toward further properties of the energy profile of $\mathcal{F}(E)$ that are independent of smoothness and ellipticity. The second, which is joint work with Choksi and Topaloglu [7], demonstrates a variational problem in which the character of energy minimizers depends crucially on smoothness and ellipticity assumptions on the surface tension.

$^1$ One can actually assume slightly less; $f$ needs only to be strictly convex in tangential directions to its level sets.
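The failure of uniqueness for Plateau’s problem with $f(\nu) = \|\nu\|_\infty$ mentioned in the introduction can be confirmed with a few lines of arithmetic. The sketch below is illustrative only, and the helper names are not from the source. It computes the anisotropic energy of a polygonal curve in $\mathbb{R}^2$ and compares the straight segment from $(-1,0)$ to $(1,0)$ with the two-segment path through $(0,1)$, for both the $\ell^\infty$ norm and the Euclidean norm.

```python
import math
from typing import Callable, List, Tuple

Vec = Tuple[float, float]


def linf_norm(v: Vec) -> float:
    """Surface tension f(nu) = max(|nu_1|, |nu_2|)."""
    return max(abs(v[0]), abs(v[1]))


def euclidean_norm(v: Vec) -> float:
    """Isotropic surface tension; the energy is then the usual length."""
    return math.hypot(v[0], v[1])


def curve_energy(points: List[Vec], f: Callable[[Vec], float]) -> float:
    """Energy of an open polygonal curve: sum over segments of f(normal) * length.

    Both norms used here are even, so the choice of normal orientation is
    irrelevant; (-dy, dx) is a unit normal scaled by the segment length.
    """
    energy = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        energy += f((-dy, dx))
    return energy


STRAIGHT = [(-1.0, 0.0), (1.0, 0.0)]
TENT = [(-1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]

if __name__ == "__main__":
    print("l-infinity:", curve_energy(STRAIGHT, linf_norm), curve_energy(TENT, linf_norm))
    print("Euclidean: ", curve_energy(STRAIGHT, euclidean_norm), curve_energy(TENT, euclidean_norm))
```

With the $\ell^\infty$ norm both curves have energy 2, whereas with the Euclidean norm the straight segment is strictly shorter, in line with the strict inequality (4) for elliptic surface tensions.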

2 Critical points in the Wulff problem

Suppose $f$ is a smooth elliptic surface tension. Then the first variation of the surface energy $\mathcal{F}(E)$ for a variation with initial velocity $X \in C_c^1(\mathbb{R}^n, \mathbb{R}^n)$ is given by
$$\delta\mathcal{F}(E)[X] = \int_{\partial^* E} \operatorname{div}_\tau\left(\nabla f(\nu_E)\right)\, X \cdot \nu_E .$$
Here $\operatorname{div}_\tau X$ denotes the tangential divergence. Notice that when $f$ is the Euclidean norm, this is the usual first variation of perimeter, and we call $H_E^f := \operatorname{div}_\tau(\nabla f(\nu_E))$ the anisotropic mean curvature of $E$ in analogy with the isotropic case. If $E$ is a critical point of $\mathcal{F}(E)$ with respect to variations that preserve area, then
$$H_E^f = \text{const} \quad \text{on } \partial^* E .$$
In such a case, this constant is given by
$$H_0 := \frac{(n-1)\,\mathcal{F}(E)}{n|E|} .$$

A celebrated theorem of Aleksandrov [1] says that the only smooth, bounded, connected, embedded hypersurfaces in $\mathbb{R}^n$ of constant mean curvature are spheres. Or, in the language of the present setting, if a smooth, bounded, connected set $E$ is a critical point of the perimeter among variations that preserve volume, then $E$ is a ball. For smooth elliptic norms, the analogous result was shown in [20]: a smooth bounded connected set $E$ with constant anisotropic mean curvature $H_E^f$ is a translation or dilation of the Wulff shape.

For a generic surface tension, the interpretation of what constant anisotropic mean curvature means is a subtle issue in itself; if a surface tension is not $C^1$, then the first variation is not even a linear functional. Following the common theme in analysis, one may interpret “constant anisotropic mean curvature” via approximation by smooth objects. Given any surface tension $f$, we approximate $f$ point-wise by a sequence of smooth $\lambda_h$-elliptic norms $\{f_h\}$. (Note that necessarily $\lambda_h \to 0$ if $f$ fails to be smooth or elliptic.) We quantify the $L^2$ deficit of a smooth set $E$ from having constant $H_E^{f_h}$ with the scaling invariant quantity
$$\delta_{f_h}(E) = \left(\fint_{\partial E}\left|\frac{H_E^{f_h}}{H_0} - 1\right|^{2}\right)^{1/2}.$$
A natural definition for a set $E$ to have constant $f$-mean curvature is for $E$ to be approximated in $L^1$ by smooth, bounded sets $E_h$ with $\delta_{f_h}(E_h) \to 0$. The following theorem, proven in [11], roughly says that such a set must be the union of Wulff shapes.

Theorem 1. Let $f$ be any surface tension and let $K$ be the corresponding Wulff shape. Let $\{f_h\}$ be a sequence of smooth $\lambda_h$-elliptic norms approximating $f$ in a point-wise sense. Suppose $\{E_h\}$ is a sequence of smooth, bounded open sets normalized to have $H_0 = n$ that satisfy $H_{E_h}^{f_h} \ge \varepsilon$ on $\partial E_h$, $\sup \operatorname{diam}(E_h) < \infty$, and $\mathcal{F}_h(E_h) \le L\,\mathcal{F}_h(K)$. If $\lambda_h^{-2}\delta_h(E_h) \to 0$ and $E_h \to E$ in $L^1$, then
$$E = \bigcup_{i=1}^{M} (K + x_i), \tag{7}$$

where $M \le L$ and the $K + x_i$ are pairwise disjoint.

The fact that $E$ is a finite union of Wulff shapes, instead of just one, is an instance of the type of bubbling phenomenon exhibited in many geometric variational problems, and is an artifact of only using first order information. On the other hand, while our proof requires $\delta_h(E_h)$ to converge to zero faster than the loss of ellipticity of $f_h$, we expect that this assumption is purely technical and the result should likely hold if one simply assumes that $\delta_h(E_h) \to 0$.

We refer the reader to [11] for the proof of Theorem 1, and here we attempt only to indicate some of the key ideas. The starting point for proving Theorem 1 is an anisotropic version of the Heintze-Karcher inequality, which states the following. Let $f$ be a smooth, elliptic surface tension and let $E$ be a smooth, bounded, connected set with $H_E^f > 0$. Then
$$\int_{\partial E} \frac{n-1}{H_E^f}\, f(\nu_E)\,d\mathcal{H}^{n-1} \ge n|E|, \tag{8}$$
with equality if and only if $E = rK + x$. Notice that (8) implies the result of [20] (indeed, this is the method of proof in [20]). Indeed, if $H_E^f$ is constant, then it is not difficult to show that this constant must be
$$H_E^f = \frac{(n-1)\,\mathcal{F}(E)}{n|E|} .$$
Plugging this into the left-hand side of (8), we immediately see that such a set achieves equality in (8) and thus is a homothety of the Wulff shape. A further key point is that, provided $H_E^f \ge \varepsilon$, the scale-invariant deficit $\delta(E)$ from having constant anisotropic mean curvature controls the deficit from equality in (8). As such, a crucial part of the proof of Theorem 1 is a quantitative analysis of (8).

To derive quantitative estimates for sets almost achieving equality in (8), it is fruitful to trace through a PDE proof of the inequality, which is due to Ros in the isotropic case [29]. In this proof, we consider the solution to the equation
$$\begin{cases} L_f u = 1 & \text{in } E, \\ u = 0 & \text{on } \partial E, \end{cases} \tag{9}$$
where the elliptic operator $L_f$ is given by
$$L_f u = \operatorname{div}\left(\nabla\!\left[f^2/2\right](\nabla u)\right).$$
When $E = K$, the solution is given by
$$u_K(x) = \frac{f_*(x)^2}{2n},$$
where $f_*(x) = \sup\{x \cdot \nu : f(\nu) < 1\}$ is the dual norm to $f$.

The above discussion holds only for smooth and elliptic surface tensions. In the proof of Theorem 1, we solve the equation (9) with $f = f_h$ and $E = E_h$. The main idea of the proof is to show that these solutions $u_h$ converge to a sum of (translations of) the model function $u_K$ corresponding to the limit surface energy, and from there deduce that the support of this limit function is the $L^1$ limit of $E_h$ and takes the form (7). To this end, we first establish Lipschitz estimates allowing us to produce a $C^0$ limit $u$ of the sequence $\{u_h\}$. We then prove quantitative estimates from (8) that allow us to show that $u$ is supported on a countable union of disjoint Wulff shapes $r_i K + x_i$, possibly of different radii $r_i$. Some of the more difficult analysis comes into showing that all the radii are equal to one. In this step, we establish a family of Pohozaev-type identities involving integral quantities of $\nabla u_h$. With a somewhat delicate Young measure argument, we can pass these identities to the limit function $\nabla u$, despite having only weak-∗ convergence in $L^\infty$ for the gradients. Then, pairing these identities with a scaling argument allows us to conclude that $r_i = 1$ for all $i$.
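As a sanity check of the Heintze-Karcher inequality (8) and of the formula for the constant $H_0$, one can work out the isotropic case by hand. The computation below is a sketch under the assumptions that $f$ is the Euclidean norm (so $\mathcal{F} = P$ and $f(\nu_E) = 1$), that $E = B_r$ is a ball of radius $r$, and that $\omega_n$ denotes the volume of the unit ball in $\mathbb{R}^n$ (a symbol introduced here for convenience):

```latex
\begin{align*}
H_{B_r} &= \frac{n-1}{r}, \qquad
P(B_r) = n\,\omega_n r^{n-1}, \qquad
|B_r| = \omega_n r^{n}, \\
\int_{\partial B_r} \frac{n-1}{H_{B_r}}\,d\mathcal{H}^{n-1}
  &= r\,P(B_r) = n\,\omega_n r^{n} = n\,|B_r|, \\
H_0 &= \frac{(n-1)\,P(B_r)}{n\,|B_r|}
     = \frac{(n-1)\,n\,\omega_n r^{n-1}}{n\,\omega_n r^{n}}
     = \frac{n-1}{r} .
\end{align*}
```

So balls achieve equality in (8), and their constant mean curvature coincides with $H_0$, as the general statement predicts for homotheties of the Wulff shape.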

3 Minimizers in the anisotropic liquid drop model

Gamow’s liquid drop model [19] is among the principal models, along with the shell and cluster models, used to describe atomic nuclei [10]. (None of these models come from first principles, and none individually can be used to describe all observed phenomena.) In its simplest form, the liquid drop model assumes that the nucleus of an atom minimizes an energy comprising the sum of a perimeter term and a Coulombic self-interaction term:
$$\inf\{P(E) + V(E) : |E| = m\}. \tag{10}$$
Here, for a fixed parameter $\alpha \in (0, n)$, $V(E)$ is a nonlocal repulsion term defined by
$$V(E) = \int_E \int_E \frac{dx\,dy}{|x-y|^{\alpha}} .$$
The physical case is $n = 3$ and $\alpha = 1$, corresponding to a Coulombic potential in three dimensional space. This model predicts that nuclei of small mass are spherical and nuclei of sufficiently large mass do not exist.

While the liquid drop model was introduced in the 1930s, the study of the variational problem (10) in the calculus of variations community has mostly been concentrated in the past decade. Due in large part to the work of Knüpfer and Muratov in [22, 23], along with important contributions [3, 8, 18, 21, 25, 12], the state of the art for global minimizers of (10) is as follows. For any $n \ge 2$, we have:

(i) for all $\alpha \in (0, n)$ there exists $m_1 > 0$ such that if $m \le m_1$, then the problem admits a minimizer;
(ii) for all $\alpha \in (0, n)$ there exists $m_0 > 0$ such that if $m \le m_0$, then the minimizer is uniquely (modulo translations) given by the ball of mass $m$; and
(iii) for all $\alpha \in (0, 2)$ there exists $m_2 > 0$ such that if $m > m_2$, then no minimizer exists.

It is conjectured in [9] that $m_0 = m_1 = m_2$ when $n = 3$ and $\alpha = 1$. While the conjecture remains open, it was shown in [3] that $m_0 = m_1 = m_2$ in any dimension for $\alpha$ sufficiently small. It also remains open whether the nonexistence result (iii) can be extended to $\alpha \in [2, n)$.

Nuclei can exhibit distortions from a spherical shape, and some of the physics literature [26] suggests that this is due to the fact that “nuclei may possess anisotropic surface tension.” This motivates the replacement of the perimeter functional by an anisotropic surface energy in [7], leading to the minimization problem
$$\inf\{\mathcal{F}(E) + V(E) : |E| = m\}. \tag{11}$$

(11)

The properties (i) and (iii) for (10) are, at their core, consequences of the inhomogeneous scaling of the energy $P(E) + V(E)$ with respect to dilations. As the anisotropic surface energy scales in the same way as perimeter, it comes as no surprise that in [7] we readily establish analogous existence and nonexistence properties for (11). More interesting is the question of what form property (ii) should take in the setting of (11). Given that the Wulff shape plays the role of the ball for the anisotropic surface energy, it is natural to wonder whether it is a minimizer of (11). We show in [7] that the answer depends in a crucial way on the regularity and ellipticity of the surface tension:

Theorem 2. Fix $n \ge 2$, $\alpha \in (0, n - 1/3)$, and $m > 0$. Suppose $f$ is a smooth elliptic surface tension with Wulff shape $K$. Then $K$ is a critical point of (11) if and only if $f$ is the Euclidean norm.

Robin Neumayer

300

Theorem 3. Let n = 2 and $f(\nu) = \|\nu\|_{1}$. There exists m2 such that if m ≤ m2 then the Wulff shape is the unique minimizer of (11).

These results show that the nature of the anisotropic problem is notably different from the isotropic case, and that the character of minimizers depends crucially on the regularity and ellipticity of the surface tension. The proof of Theorem 2 makes use of a first variation argument, and in fact we prove a stronger statement: if the Wulff shape is a critical point of (11) for any mass m, then f is the Euclidean norm. Indeed, a critical point satisfies

$$H_E^f + v_E = \text{const} \quad \text{on } \partial^* E, \tag{12}$$

where $v_E(x) = \int_E |x-y|^{-\alpha}\,dy$ is the first variation of V(E). One directly computes that $H_K^f = n - 1$. So, if K satisfies (12), then

$$v_K = \text{const} \quad \text{on } \partial K.$$

Theorem 2 then follows from the following characterization:

Proposition 1. Fix n ≥ 2 and α ∈ (0, n − 1/3). Let K be a smooth set with $v_K$ = const on ∂K. Then K is a ball.

Proposition 1 was established for the Coulombic case α = n − 2 in [17] and was extended to α ∈ (0, n − 1) in [24]. Both proofs use the method of moving planes; see also [28]. The case α ≥ n − 1 is significantly more delicate, principally because the Riesz potential $v_K$ is merely Hölder continuous in this case. Our proof of Proposition 1 in the subtler case α ∈ [n − 1, n − 1/3) pairs the method of moving planes on integral forms in the spirit of [24, 6] with some new reflection arguments and estimates on how the Riesz potential $v_K$ grows compared to its reflection across a hyperplane.

In the setting of Theorem 3, the Wulff shape is a square in R² with sides aligned with the coordinate axes. The proof of Theorem 3 makes use of a structure theorem proven in [13] for crystalline surface energies in R². This result says that suitably defined quasi-minimizers of such a crystalline surface energy must be convex polyhedra whose set of normal vectors is contained in the set of normal vectors of the Wulff shape. Pairing this result with a compactness argument, we are able to deduce that any minimizer of (11) for small enough mass in the setting of Theorem 3 is a rectangle with sides aligned with the coordinate axes. From here, the study of minimizers of (11) essentially reduces to a one-dimensional variational problem, and the analysis can be done in a quite explicit way.

References

1. Aleksandrov, A.D.: Uniqueness theorems for surfaces in the large. V. Vestnik Leningrad. Univ. 13(19), 5–8 (1958)

On minimizers and critical points for anisotropic isoperimetric problems


2. Auffinger, A., Damron, M., Hanson, J.: 50 years of first-passage percolation, University Lecture Series, vol. 68. American Mathematical Society, Providence, RI (2017)
3. Bonacini, M., Cristoferi, R.: Local and global minimality results for a nonlocal isoperimetric problem on $\mathbb{R}^N$. SIAM J. Math. Anal. 46(4), 2310–2349 (2014)
4. Brothers, J.E., Morgan, F.: The isoperimetric theorem for general integrands. Michigan Math. J. 41(3), 419–431 (1994). DOI 10.1307/mmj/1029005070. URL http://dx.doi.org.ezproxy.lib.utexas.edu/10.1307/mmj/1029005070
5. Cerf, R.: The Wulff crystal in Ising and percolation models, Lecture Notes in Mathematics, vol. 1878. Springer-Verlag, Berlin (2006). Lectures from the 34th Summer School on Probability Theory held in Saint-Flour, July 6–24, 2004, With a foreword by Jean Picard
6. Chen, W., Li, C., Ou, B.: Classification of solutions for an integral equation. Comm. Pure Appl. Math. 59(3), 330–343 (2006). DOI 10.1002/cpa.20116. URL https://doi.org/10.1002/cpa.20116
7. Choksi, R., Neumayer, R., Topaloglu, I.: Anisotropic liquid drop models. Preprint available at arXiv:1810.08304
8. Choksi, R., Peletier, M.: Small volume fraction limit of the diblock copolymer problem: I. Sharp-interface functional. SIAM J. Math. Anal. 42(3), 1334–1370 (2010). DOI 10.1137/090764888. URL http://dx.doi.org/10.1137/090764888
9. Choksi, R., Peletier, M.: Small volume-fraction limit of the diblock copolymer problem: II. Diffuse-interface functional. SIAM J. Math. Anal. 43(2), 739–763 (2011). DOI 10.1137/10079330X. URL http://dx.doi.org/10.1137/10079330X
10. Cook, N.: Models of the Atomic Nucleus: Unification Through a Lattice of Nucleons. Springer Berlin Heidelberg (2010). URL https://books.google.com/books?id=CwRGogWF5-oC
11. Delgadino, M.G., Maggi, F., Mihaila, C., Neumayer, R.: Bubbling with $L^2$-Almost Constant Mean Curvature and an Alexandrov-Type Theorem for Crystals. Arch. Ration. Mech. Anal. 230(3), 1131–1177 (2018). DOI 10.1007/s00205-018-1267-8. URL https://doiorg.turing.library.northwestern.edu/10.1007/s00205-018-1267-8
12. Figalli, A., Fusco, N., Maggi, F., Millot, V., Morini, M.: Isoperimetry and stability properties of balls with respect to nonlocal energies. Comm. Math. Phys. 336-1, 441–507 (2015)
13. Figalli, A., Maggi, F.: On the shape of liquid drops and crystals in the small mass regime. Arch. Ration. Mech. Anal. 201(1), 143–207 (2011). DOI 10.1007/s00205-010-0383-x. URL http://dx.doi.org/10.1007/s00205-010-0383-x
14. Figalli, A., Maggi, F., Pratelli, A.: A mass transportation approach to quantitative isoperimetric inequalities. Invent. Math. 182(1), 167–211 (2010). DOI 10.1007/s00222-010-0261-z. URL http://dx.doi.org/10.1007/s00222-010-0261-z
15. Fonseca, I.: The Wulff theorem revisited. Proc. Roy. Soc. London Ser. A 432(1884), 125–145 (1991). DOI 10.1098/rspa.1991.0009. URL http://dx.doi.org.ezproxy.lib.utexas.edu/10.1098/rspa.1991.0009
16. Fonseca, I., Müller, S.: A uniqueness proof for the Wulff theorem. Proc. Roy. Soc. Edinburgh Sect. A 119(1-2), 125–136 (1991). DOI 10.1017/S0308210500028365. URL http://dx.doi.org.ezproxy.lib.utexas.edu/10.1017/S0308210500028365
17. Fraenkel, L.E.: An introduction to maximum principles and symmetry in elliptic problems, Cambridge Tracts in Mathematics, vol. 128. Cambridge University Press, Cambridge (2000). URL https://doi.org/10.1017/CBO9780511569203
18. Frank, R.L., Lieb, E.H.: A compactness lemma and its application to the existence of minimizers for the liquid drop model. SIAM J. Math. Anal. 47(6), 4436–4450 (2015)
19. Gamow, G.: Mass defect curve and nuclear constitution. Proc. R. Soc. Lond. A 126(803), 632–644 (1930). DOI 10.1098/rspa.1930.0032. URL http://rspa.royalsocietypublishing.org/content/126/803/632
20. He, Y., Li, H., Ma, H., Ge, J.: Compact embedded hypersurfaces with constant higher order anisotropic mean curvatures. Indiana Univ. Math. J. 58(2), 853–868 (2009). DOI 10.1512/iumj.2009.58.3515. URL http://dx.doi.org/10.1512/iumj.2009.58.3515
21. Julin, V.: Isoperimetric problem with a Coulomb repulsive term. Indiana Univ. Math. J. 63(1), 77–89 (2014). URL https://doi.org/10.1512/iumj.2014.63.5185


22. Knüpfer, H., Muratov, C.B.: On an isoperimetric problem with a competing nonlocal term I: The planar case. Comm. Pure Appl. Math. 66(7), 1129–1162 (2013). URL https://doi.org/10.1002/cpa.21451
23. Knüpfer, H., Muratov, C.B.: On an isoperimetric problem with a competing nonlocal term II: The general case. Comm. Pure Appl. Math. 67(12), 1974–1994 (2014). URL https://doi.org/10.1002/cpa.21479
24. Lu, G., Zhu, J.: An overdetermined problem in Riesz-potential and fractional Laplacian. Nonlinear Anal. 75(6), 3036–3048 (2012). URL https://doi.org/10.1016/j.na.2011.11.036
25. Lu, J., Otto, F.: Nonexistence of a minimizer for Thomas-Fermi-Dirac-von Weizsäcker model. Comm. Pure Appl. Math. 67(10), 1605–1617 (2014). URL https://doi.org/10.1002/cpa.21477
26. Mackie, F.D.: Anisotropic surface tension and the liquid drop model. Nuclear Physics A 245(1), 61–86 (1975). DOI https://doi.org/10.1016/0375-9474(75)90082-2. URL http://www.sciencedirect.com/science/article/pii/0375947475900822
27. Maggi, F.: Sets of finite perimeter and geometric variational problems, Cambridge Studies in Advanced Mathematics, vol. 135. Cambridge University Press, Cambridge (2012). URL https://doi.org/10.1017/CBO9781139108133. An introduction to geometric measure theory
28. Reichel, W.: Characterization of balls by Riesz-potentials. Ann. Mat. Pura Appl. (4) 188(2), 235–245 (2009). URL https://doi.org/10.1007/s10231-008-0073-6
29. Ros, A.: Compact hypersurfaces with constant higher order mean curvatures. Rev. Mat. Iberoamericana 3(3-4), 447–453 (1987)
30. Taylor, J.E.: Existence and structure of solutions to a class of nonelliptic variational problems. In: Symposia Mathematica, Vol. XIV (Convegno di Teoria Geometrica dell’Integrazione e Varietà Minimali, INDAM, Roma, Maggio 1973), pp. 499–508. Academic Press, London (1974)
31. Taylor, J.E.: Unique structure of solutions to a class of nonelliptic variational problems. In: Differential geometry (Proc. Sympos. Pure. Math., Vol. XXVII, Stanford Univ., Stanford, Calif., 1973), Part 1, pp. 419–427. Amer. Math. Soc., Providence, R.I. (1975)
32. Wulff, G.: Zur frage der geschwindigkeit des wachsturms und der auflösung der kristallflächen. Z. Kristallogr. 34, 449–530 (1901)

Liouville-type theorems for nonlinear elliptic and parabolic problems Philippe Souplet

1 Introduction
  1.1 Motivation and classical results: Fujita, Gidas-Spruck, Liouville
  1.2 Equations vs. inequalities – a first method: rescaled test-functions
2 Liouville-type theorems for the nonlinear heat equation
  2.1 Results and conjectures
  2.2 Radial case: proof based on zero-number
  2.3 Nonradial case: proof based on similarity variables and energy estimates
3 Applications of parabolic Liouville theorems
  3.1 Results
  3.2 Sketch of proof of Theorem 3.1(i) (initial-final blow-up estimate in Rn)
4 Elliptic systems
  4.1 Elliptic systems I: Lane-Emden
  4.2 Elliptic systems II: positive self-interaction
  4.3 Elliptic systems III: negative self-interaction
5 Liouville for parabolic systems
  5.1 Low values of p
  5.2 Gradient structure-homogeneous case
  5.3 Gross-Pitaevskii case
  5.4 Lotka-Volterra case
References

Abstract We give a survey of Liouville-type theorems and their applications for various classes of semilinear elliptic and parabolic equations and systems.

Université Sorbonne Paris Nord, CNRS UMR 7539, Laboratoire Analyse, Géométrie et Applications, 93430 Villetaneuse, France. Email: [email protected]

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_21


1 Introduction

1.1 Motivation and classical results: Fujita, Gidas-Spruck, Liouville

The past few decades have seen intensive development of Liouville-type nonexistence theorems for elliptic and parabolic problems (equations and systems). At the same time, these have emerged as a fundamental tool for many applications to the qualitative properties of solutions of these problems. The aim of these notes is to summarize some of the main results and their applications. We shall also emphasize a number of methods for the derivation of Liouville-type theorems (sometimes with only a sketch of proof, though). In view of the huge existing literature and the large variety of problems treated, we stress that no attempt at exhaustiveness is made. We refer to, e.g., [31], [32] for further references.

Throughout this article, p is a real number with p > 1. Consider the semilinear parabolic equation

$$u_t - \Delta u = u^p. \tag{1}$$

The following two results are classical and fundamental. The first one is essentially due to Fujita [10], except for the critical case (see [15], [41], [31] and the references therein). The so-called Fujita exponent is defined by $p_F = 1 + 2/n$.

Theorem 1.1 Equation (1) does not admit any positive global classical solution in $\mathbb{R}^n \times (0, \infty)$ if and only if $p \le p_F$.

Theorem 1.1 remains valid even for distributional solutions (see e.g. [31]). The second result, which concerns the corresponding stationary equation

$$-\Delta u = u^p, \tag{2}$$

is the celebrated elliptic Liouville-type theorem of Gidas and Spruck [12] (see also [3], [5]). We recall that the Sobolev exponent is given by

$$p_S := \begin{cases} \infty, & \text{if } n \le 2, \\ (n+2)/(n-2), & \text{if } n > 2. \end{cases}$$

Theorem 1.2 Equation (2) does not admit any positive classical solutions in $\mathbb{R}^n$ if and only if $p < p_S$.

Extensions and applications of both results have received considerable attention in the last 30 years. Although a natural question, parabolic Liouville-type theorems for equation (1) have not been as intensively studied until recently and are still not fully understood. More precisely, the question is the following:

Liouville-type theorems


If one now considers positive (classical) solutions of $u_t - \Delta u = u^p$ that are global for both positive and negative time, i.e. solutions on the whole space $\mathbb{R}^{n+1} = \mathbb{R}^n \times \mathbb{R}$, can one prove nonexistence for a larger range of p's than in the Fujita problem? The exponent $p_S$ is a natural candidate for the dividing line between existence and nonexistence. On the other hand, as for Fujita-type and elliptic Liouville-type results, it is also useful to consider the same question on a half-space. As we shall see in Section 3, such results have interesting applications in the study of a priori estimates and blow-up singularities of solutions.

1.2 Equations vs. inequalities – a first method: rescaled test-functions

The Fujita result remains true for supersolutions (see e.g. [21], [31]), namely:

Theorem 1.3 The inequality

$$u_t - \Delta u \ge u^p, \qquad x \in \mathbb{R}^n,\ t > 0,$$

does not admit any positive classical solutions if and only if $p \le p_F$.

In this respect it can be considered as the parabolic analogue of the following well-known elliptic property, due to Gidas [11]. To this end we introduce the so-called Serrin exponent:

$$p_{sg} := \begin{cases} \infty, & \text{if } n \le 2, \\ n/(n-2), & \text{if } n > 2, \end{cases}$$

which is critical for the existence of radial singular solutions of the form $c\,r^{-2/(p-1)}$.

Theorem 1.4 The inequality

$$-\Delta u \ge u^p, \qquad x \in \mathbb{R}^n,$$

does not admit any positive classical solutions if and only if $p \le p_{sg}$.

Both the Fujita and the Gidas result, namely Theorems 1.3 and 1.4, can (nowadays) be proved by a rather simple technique of rescaled test-functions (see e.g. [21], [31]). Namely, one tests the equation with functions of the form

$$\phi(x/R) \quad \text{or} \quad \psi(t/R^2)\,\phi(x/R),$$

where φ, ψ are suitable compactly supported smooth functions. Then, after integration by parts and use of Hölder's inequality, one obtains that

$$\int_{\mathbb{R}^n} u^p\,dx = 0 \quad \text{or} \quad \int_0^\infty\!\!\int_{\mathbb{R}^n} u^p\,dx\,dt = 0$$

by letting R → ∞ (the critical case $p = p_{sg}$ or $p = p_F$ requires a slightly more delicate additional argument).

The full Gidas-Spruck theorem is considerably more difficult (in the complementary range $(p_{sg}, p_S)$). It can be proved either by Bochner formula and hard integral estimates (original proof of [12], see also [3]) or by Kelvin transform and moving planes [5]. See also [36] and [4] for alternative proofs based on moving spheres. We shall see that the parabolic Liouville case is equally or even more delicate.

2 Liouville-type theorems for the nonlinear heat equation

2.1 Results and conjectures

Let us first consider the case of radial solutions, for which we have the following result in the optimal range [26].

Theorem 2.1 Let $1 < p < p_S$. Then the equation

$$u_t - \Delta u = u^p, \qquad x \in \mathbb{R}^n,\ t \in \mathbb{R}, \tag{3}$$

has no positive, radial, bounded classical solution.

Theorem 2.1 is optimal in view of the existence of positive radial stationary solutions for n ≥ 3 and $p \ge p_S$. Moreover, the boundedness assumption can be removed (see [28] and cf. Section 3 below). It is very likely that Theorem 2.1 should hold without the radial symmetry assumption, but this has not been proved so far. However, if n ≤ 2, or under the stronger restriction $p < p_B$ if n ≥ 3, where

$$p_B := \frac{n(n+2)}{(n-1)^2},$$

we have the following Liouville-type theorem in the general (nonradial) case. We note that $p_F < \frac{n}{n-2} < p_B < p_S$ (for n ≥ 3). The first case is from [29] and the second case is a consequence of [2].

Theorem 2.2 Let p > 1 and assume either n ≤ 2 or $p < p_B$. Then equation (3) has no positive solution.

The proofs of Theorem 2.1 and of each case of Theorem 2.2 are completely different:
• radial case: intersection-comparison with steady states;
• case n ≤ 2: similarity variables and rescaled energy arguments. This technique actually works for all $p < n/(n-2)_+$ (which is less than $p_B$ if n ≥ 3);
• case $p < p_B$: Bochner formula and hard integral estimates.

The last two techniques can be modified to apply to more general problems, including certain classes of parabolic systems. We shall now give the first two proofs. As for the third proof, an application of it to certain parabolic systems will be sketched in Section 5.3.
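The ordering of the four critical exponents quoted in Section 2.1 can be checked with exact rational arithmetic. The following is only an illustrative sketch: the function names `p_F`, `p_sg`, `p_B` and `p_S` are labels chosen here to mirror the notation of the text, and the range of dimensions is an arbitrary choice.

```python
from fractions import Fraction
import math


def p_F(n):
    """Fujita exponent 1 + 2/n."""
    return 1 + Fraction(2, n)


def p_sg(n):
    """Serrin exponent: infinity for n <= 2, n/(n-2) otherwise."""
    return math.inf if n <= 2 else Fraction(n, n - 2)


def p_B(n):
    """Exponent n(n+2)/(n-1)^2 from Section 2.1 (only used here for n >= 3)."""
    return Fraction(n * (n + 2), (n - 1) ** 2)


def p_S(n):
    """Sobolev exponent: infinity for n <= 2, (n+2)/(n-2) otherwise."""
    return math.inf if n <= 2 else Fraction(n + 2, n - 2)


def ordering_holds(n):
    """Check p_F < n/(n-2) < p_B < p_S for a given dimension n >= 3."""
    return p_F(n) < p_sg(n) < p_B(n) < p_S(n)


if __name__ == "__main__":
    for n in range(3, 8):  # illustrative range of dimensions
        print(n, p_F(n), p_sg(n), p_B(n), p_S(n), ordering_holds(n))
```

Exact fractions avoid any doubt about rounding when two exponents are close, as happens for large n.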

2.2 Radial case: proof based on zero-number

For the proof of Theorem 2.1, we need some simple preliminary observations concerning radial steady states. Let $\psi_1$ be the solution of the equation

$$\psi'' + \frac{n-1}{r}\,\psi' + |\psi|^{p-1}\psi = 0, \qquad r > 0,$$

satisfying $\psi(0) = 1$, $\psi'(0) = 0$. Obviously $\psi_1''(0) < 0$. It is known that the solution is defined on some interval and that it changes sign due to $p < p_S$ (cf. [12]). We denote by $r_1 > 0$ its first zero. By uniqueness for the initial-value problem, $\psi_1'(r_1) < 0$. We thus have

$$\psi_1(r) > 0 \ \text{in } [0, r_1) \quad \text{and} \quad \psi_1(r_1) = 0 > \psi_1'(r_1).$$

Clearly, $\psi_\alpha(r) := \alpha\,\psi_1(\alpha^{(p-1)/2} r)$ is the solution of the same equation with $\psi(0) = \alpha$, $\psi'(0) = 0$, and with first positive zero $r_\alpha = \alpha^{-(p-1)/2} r_1$. As an elementary consequence of the properties of $\psi_1$ we obtain the following.

Lemma 2.3 Given any m > 0, we have

$$\lim_{\alpha \to \infty} \sup\{\psi_\alpha'(r) : r \in [0, r_\alpha] \text{ is such that } \psi_\alpha(r) \le m\} = -\infty.$$
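The scaling relation for the first zero of the radial steady states can be observed numerically. The sketch below is not part of the original argument: it integrates the radial equation with a hand-written fourth-order Runge-Kutta scheme and compares the first zero for ψ(0) = α with the rescaled first zero for ψ(0) = 1. The function name `first_zero`, the step size, and the sample values n = 3, p = 3 (subcritical, since the Sobolev exponent is 5 for n = 3) are assumptions made for illustration.

```python
def first_zero(alpha, n=3, p=3.0, h=1e-3, r_max=50.0):
    """First zero of the solution of
    psi'' + (n-1)/r * psi' + |psi|^(p-1) * psi = 0, psi(0)=alpha, psi'(0)=0,
    computed with classical RK4 and linear interpolation at the sign change."""

    def accel(r, psi, dpsi):
        source = abs(psi) ** (p - 1) * psi
        if r == 0.0:
            # Limit of the equation as r -> 0: n * psi''(0) = -|psi|^(p-1) psi.
            return -source / n
        return -(n - 1) / r * dpsi - source

    r, psi, dpsi = 0.0, float(alpha), 0.0
    while r < r_max:
        k1y, k1v = dpsi, accel(r, psi, dpsi)
        k2y = dpsi + 0.5 * h * k1v
        k2v = accel(r + 0.5 * h, psi + 0.5 * h * k1y, k2y)
        k3y = dpsi + 0.5 * h * k2v
        k3v = accel(r + 0.5 * h, psi + 0.5 * h * k2y, k3y)
        k4y = dpsi + h * k3v
        k4v = accel(r + h, psi + h * k3y, k4y)
        psi_new = psi + h / 6.0 * (k1y + 2 * k2y + 2 * k3y + k4y)
        dpsi_new = dpsi + h / 6.0 * (k1v + 2 * k2v + 2 * k3v + k4v)
        if psi_new <= 0.0:
            # Linear interpolation between the last positive and first nonpositive value.
            return r + h * psi / (psi - psi_new)
        r, psi, dpsi = r + h, psi_new, dpsi_new
    raise ValueError("no zero found before r_max")


if __name__ == "__main__":
    n, p, alpha = 3, 3.0, 4.0
    r1 = first_zero(1.0, n, p)
    r_alpha = first_zero(alpha, n, p)
    # The text predicts r_alpha = alpha^(-(p-1)/2) * r_1.
    print(r1, r_alpha, alpha ** (-(p - 1) / 2) * r1)
```

For supercritical p no zero exists, in which case this sketch raises `ValueError` once `r_max` is reached.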

We shall use the well-known properties of the zero-number of the difference of two solutions, in particular the nonincreasing property (see e.g. [31]).

Proof of Theorem 2.1. The proof is by contradiction. Assume that u is a positive, bounded classical solution of (3), u(x,t) = U(r,t), where r = |x|. By the boundedness assumption and parabolic estimates, U and $U_r$ are bounded on [0, ∞) × R. It follows from Lemma 2.3 that if α is sufficiently large then $U(\cdot,t) - \psi_\alpha$ has exactly one zero in $[0, r_\alpha]$ for any t and the zero is simple. We next claim that

$$z_\alpha(t) := z_{[0,r_\alpha]}\big(U(\cdot,t) - \psi_\alpha\big) \ge 1, \qquad t \le 0,\ \alpha > 0, \tag{4}$$

where $z_{[0,r_\alpha]}(w)$ denotes the zero number of the function w in the interval $[0, r_\alpha]$. Indeed, if not, then $U(\cdot,t_0) > \psi_\alpha$ in $[0, r_\alpha]$ for some $t_0$. We know (see e.g. [31]) that each solution of the Dirichlet problem

$$\left.\begin{aligned} \bar u_t - \Delta \bar u &= \bar u^p, & |x| < r_\alpha,\ t > t_0, \\ \bar u &= 0, & |x| = r_\alpha,\ t > t_0, \\ \bar u(x,t_0) &= U_0(|x|), & |x| < r_\alpha, \end{aligned}\right\}$$

blows up in finite time whenever $U_0 > \psi_\alpha$ in $[0, r_\alpha)$. Choosing the initial function $U_0$ between $\psi_\alpha$ and $U(\cdot,t_0)$ we conclude, by comparison, that $\bar u$ and u both blow up in finite time, in contradiction to the global existence assumption on u. This proves the claim.

Set

$$\alpha_0 := \inf\{\beta > 0 : z_\alpha(t) = 1 \text{ for all } t \le 0 \text{ and } \alpha \ge \beta\}.$$

In view of the above remark on large α, we have $\alpha_0 < \infty$. Also $\alpha_0 > 0$. Indeed, for small α > 0 we have $\psi_\alpha(0) < U(0,t)$ for t = 0 and for t ≈ 0. By the properties of the zero number, we can choose t ≈ 0, t < 0, such that $\psi_\alpha - U(\cdot,t)$ has only simple zeros and then, by (4), $z_\alpha(t) \ge 2$.

By definition of $\alpha_0$ (and (4)), there are sequences $\alpha_k \to \alpha_0^-$ and $t_k \le 0$ such that

$$z_{[0,r_{\alpha_k}]}\big(U(\cdot,t_k) - \psi_{\alpha_k}\big) \ge 2, \qquad k = 1, 2, \dots.$$

By the nonincreasing property of the zero number, we get

$$z_{[0,r_{\alpha_k}]}\big(U(\cdot,t_k + t) - \psi_{\alpha_k}\big) \ge 2, \qquad t \le 0,\ k = 1, 2, \dots. \tag{5}$$

This in particular allows us to assume, choosing different $t_k$ if necessary, that $t_k \to -\infty$. By the boundedness assumption and parabolic estimates, passing to a subsequence, we may further assume that

$$u(x, t_k + t) \to v(x,t), \qquad x \in \mathbb{R}^n,\ t \in \mathbb{R},$$

with convergence in $C^{2,1}_{loc}(\mathbb{R}^n \times \mathbb{R})$. Clearly then, there is δ > 0 such that for each fixed t,

$$U(\cdot, t_k + t) - \psi_{\alpha_k} \to V(\cdot,t) - \psi_{\alpha_0}$$

in $C^1[0, r_{\alpha_0} + \delta]$, where v(x,t) = V(|x|,t). This and (5) imply that for each t ≤ 0, $V(\cdot,t) - \psi_{\alpha_0}$ has at least two zeros or a multiple zero in $[0, r_{\alpha_0})$. By the properties of the zero number, we may choose t < 0 so that $V(\cdot,t) - \psi_{\alpha_0}$ has only simple zeros (and hence at least two of them). Since $U(\cdot,t_k + t) - \psi_{\alpha_0}$ is close to $V(\cdot,t) - \psi_{\alpha_0}$ in $C^1[0, r_{\alpha_0}]$ if k is large, it has at least two simple zeros in $[0, r_{\alpha_0})$ as well. But then, for $\alpha > \alpha_0$, $\alpha \approx \alpha_0$, the function $U(\cdot,t_k + t) - \psi_\alpha$ has at least two zeros in $[0, r_\alpha)$, contradicting the definition of $\alpha_0$. We have thus shown that the assumption $u \not\equiv 0$ leads to a contradiction, which proves the theorem.




2.3 Nonradial case: proof based on similarity variables and energy estimates

We will now prove Theorem 2.2 for all $p < n/(n-2)_+$ in the case of bounded solutions. We shall see in Section 3.1 that the boundedness assumption can be removed as a consequence of a general principle based on a “rescaling-doubling” procedure. The proof consists of four steps:
(i) First rescaling by similarity variables along a sequence of final times T = k;
(ii) Energy estimates;
(iii) Second rescaling according to the maximum points;
(iv) Contradiction with the nonexistence of steady states.

Proof of Theorem 2.2 for $p < n/(n-2)_+$ in the case of bounded solutions. Assume on the contrary that there exists a positive bounded solution u of (3). Replacing u by $\tilde u(x,t) := \lambda^{2/(p-1)} u(\lambda x, \lambda^2 t)$ with $\lambda = (\sup u)^{-(p-1)/2}$, we may assume

$$u(x,t) \le 1 \qquad \text{for all } x \in \mathbb{R}^n,\ t \in \mathbb{R}.$$

Denote $c_0 := u(0,0)$. For k = 1, 2, ..., we rescale equation (1) by similarity variables about T = k and a = 0 by setting β = 1/(p − 1), s = −log(k − t) for t < k and

$$w_k(y,s) := (k-t)^{\beta}\, u\big(y\sqrt{k-t},\, t\big) = e^{-\beta s}\, u\big(e^{-s/2} y,\, k - e^{-s}\big).$$

By direct computation, we see that $w = w_k$ satisfies

$$w_s - \Delta w + \frac{1}{2}\, y \cdot \nabla w = w^p - \beta w, \qquad y \in \mathbb{R}^n,\ s \in \mathbb{R}. \tag{6}$$

Then, setting also $s_k := -\log k$, we have $w_k(0, s_k) = k^{\beta} c_0$ and

$$\|w_k(\cdot, s)\|_\infty \le e^{2\beta} k^{\beta} \qquad \text{for } s \in [s_k - 2, \infty). \tag{7}$$

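Define the weighted energy functional

$$E(w) := \int_{\mathbb{R}^n} \Big( \frac{1}{2}|\nabla w|^2 + \frac{\beta}{2} w^2 - \frac{1}{p+1}|w|^{p+1} \Big)\rho\,dy, \qquad \rho(y) := e^{-|y|^2/4},$$

and set $E_k(s) := E(w_k(s))$. By direct computation, we have

$$\frac{d}{ds} E\big(w(s)\big) = -\int_{\mathbb{R}^n} w_s^2\,\rho\,dy \le 0 \tag{8}$$

and

$$\frac{1}{2}\frac{d}{ds} \int_{\mathbb{R}^n} w^2 \rho\,dy = -2E\big(w(s)\big) + \frac{p-1}{p+1}\int_{\mathbb{R}^n} |w|^{p+1}\rho\,dy \ge -2E\big(w(s)\big) + c\Big(\int_{\mathbb{R}^n} w^2 \rho\,dy\Big)^{(p+1)/2}. \tag{9}$$

This implies

$$E_k(s) \ge 0, \qquad s \in \mathbb{R} \tag{10}$$

(since otherwise $E_k(s)$ would be negative for all large s and $\int_{\mathbb{R}^n} w_k^2 \rho\,dy$ would blow up in finite time). Multiplying (6) with $w = w_k$ by ρ, integrating over $y \in \mathbb{R}^n$ and using Jensen's inequality yields

$$\frac{d}{ds}\int_{\mathbb{R}^n} w_k(y,s)\rho(y)\,dy + \beta \int_{\mathbb{R}^n} w_k(y,s)\rho(y)\,dy = \int_{\mathbb{R}^n} w_k^p(y,s)\rho(y)\,dy \ge C_{n,p}\Big(\int_{\mathbb{R}^n} w_k(y,s)\rho(y)\,dy\Big)^{p},$$

where $C_{n,p} := (4\pi)^{-n(p-1)/2}$. It follows that

$$\int_{\mathbb{R}^n} w_k(y,s)\rho(y)\,dy \le \tilde C_{n,p} \tag{11}$$

(since otherwise $\int_{\mathbb{R}^n} w\rho\,dy$ would blow up in finite time), hence

$$\int_{\sigma}^{s_k}\!\!\int_{\mathbb{R}^n} w_k^p(y,s)\rho(y)\,dy\,ds \le \tilde C_{n,p}\big(1 + \beta(s_k - \sigma)\big), \qquad \sigma < s_k, \tag{12}$$

where $\tilde C_{n,p} = (\beta/C_{n,p})^{\beta}$. Now (7), (8), (9), (11) and (12) guarantee

$$\begin{aligned} 2E_k(s_k - 1) &\le 2\int_{s_k-2}^{s_k-1} E_k(s)\,ds \le 2\int_{s_k-2}^{s_k} E_k(s)\,ds \\ &\le \frac{1}{2}\int_{\mathbb{R}^n} w_k^2(y, s_k - 2)\rho(y)\,dy + \frac{p-1}{p+1}\int_{s_k-2}^{s_k}\!\!\int_{\mathbb{R}^n} w_k^{p+1}(y,s)\rho(y)\,dy\,ds \\ &\le e^{2\beta}k^{\beta}\Big(\int_{\mathbb{R}^n} w_k(y, s_k - 2)\rho(y)\,dy + \int_{s_k-2}^{s_k}\!\!\int_{\mathbb{R}^n} w_k^{p}(y,s)\rho(y)\,dy\,ds\Big) \\ &\le 2C(n,p)\,k^{\beta}, \end{aligned}$$

where $C(n,p) := e^{2\beta}\tilde C_{n,p}(1 + \beta)$, hence $E_k(s_k - 1) \le C(n,p)k^{\beta}$. This estimate, (8) and (10) guarantee

$$\int_{s_k-1}^{s_k}\!\!\int_{\mathbb{R}^n} \Big|\frac{\partial w_k}{\partial s}(y,s)\Big|^2 \rho(y)\,dy\,ds = E_k(s_k - 1) - E_k(s_k) \le C(n,p)\,k^{\beta}. \tag{13}$$

Next denote $\lambda_k := k^{-1/2}$ and set

$$v_k(z,\tau) := \lambda_k^{2/(p-1)}\, w_k(\lambda_k z,\ \lambda_k^2 \tau + s_k), \qquad z \in \mathbb{R}^n,\ -k \le \tau \le 0.$$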

Then $0 < v_k \le e^{2\beta}$, $v_k(0,0) = c_0$,

$$\frac{\partial v_k}{\partial \tau} - \Delta v_k - v_k^p = -\lambda_k^2\Big(\frac{1}{2}\, z \cdot \nabla v_k + \beta v_k\Big)$$

and, denoting α := −n + 2 + 4/(p − 1) and using (13), we also have

 sk 

2

2

∂v

∂w

k

k (y, s) dy ds (z, τ ) dz d τ = λkα

∂τ sk −1 |y| 0 and assume that y ∈ D satisfies M(y) dist(y, Γ ) > 2k. Then there exists x ∈ D such that M(x) dist(x, Γ ) > 2k, and

$$M(x) \ge M(y), \qquad M(z) \le 2M(x) \quad \text{for all } z \in D \cap B_E\big(x,\ k\,M^{-1}(x)\big).$$

The proof of the doubling lemma (see [27]) is by contradiction and induction (in the spirit of the proof of Baire's lemma).

Sketch of proof of Theorem 3.1(i). Denote X = (x,t), Y = (y,s) and consider the parabolic distance $d_P(X,Y) = |x-y| + |t-s|^{1/2}$. The result will follow from a more general estimate for solutions u on domains $D \subset \mathbb{R}^{n+1}$:

$$u(x,t) \le C(n,p)\, d_P^{-2/(p-1)}\big((x,t), \partial D\big), \qquad (x,t) \in D. \tag{15}$$

Indeed, choosing $D = (0,T) \times B_R$, (15) will imply

$$u(x,t) \le C(n,p)\Big( t^{-1/(p-1)} + (T-t)^{-1/(p-1)} + (R - |x|)^{-2/(p-1)} \Big),$$

hence the desired estimate by letting R → ∞.

Assume (15) fails. Then there exist sequences $D_k$, $u_k$, $Y_k \in D_k$ such that the functions

$$M_k := u_k^{(p-1)/2}$$

satisfy

$$M_k(Y_k) > 2k\, d_P^{-1}(Y_k, \partial D_k).$$

By the Doubling Lemma with $E = \mathbb{R}^{n+1}$, applied with $\Sigma = \Sigma_k = \overline{D}_k$, $D = D_k$ and $\Gamma = \partial D_k$, there exists $X_k = (x_k, t_k) \in D_k$ such that $M_k(X_k) > 2k\, d_P^{-1}(X_k, \partial D_k)$ and

$$M_k(X) \le 2M_k(X_k) \quad \text{in } \big\{X;\ d_P(X, X_k) \le k\, M_k^{-1}(X_k)\big\}, \tag{16}$$

where $k\, M_k^{-1}(X_k) \le \frac{1}{2}\, d_P(X_k, \partial D_k)$. Now set $\lambda_k = M_k^{-1}(X_k)$ and rescale $u_k$ as

$$v_k(y,s) := \lambda_k^{2/(p-1)}\, u_k(x_k + \lambda_k y,\ t_k + \lambda_k^2 s),$$

which solves the same equation with $v_k(0,0) = 1$. Moreover, (16) implies

$$v_k^{(p-1)/2}(y,s) \le 2 \qquad \text{for } |y| + \sqrt{|s|} \le k.$$

Local parabolic estimates guarantee that (up to a subsequence) $v_k$ converges to a nontrivial bounded solution v of (1) on $\mathbb{R}^{n+1}$, contradicting the assumed Liouville property (14).

Remark 3.2 The Dirichlet case can be treated by a modification of the above argument provided we also have the Liouville property in the half-space Rn+ × R. The latter (for a given p > 1) is a consequence of the Liouville theorem in Rn × R and a moving planes argument (see [28] for details).

4 Elliptic systems

Many Liouville-type results are available for elliptic systems. We shall present some of them and illustrate different methods.

4.1 Elliptic systems I: Lane-Emden

Let us consider the Lane-Emden system:

$$\begin{cases} -\Delta u = v^p, & x \in \mathbb{R}^n, \\ -\Delta v = u^q, & x \in \mathbb{R}^n, \end{cases} \tag{17}$$

where p, q > 0. The so-called Sobolev hyperbola is defined by

$$\frac{1}{p+1} + \frac{1}{q+1} = \frac{n-2}{n}.$$

The following result [20] shows that the Sobolev hyperbola is the sharp dividing line for the existence of positive solutions in the radial case.

Theorem 4.1 System (17) admits positive radial solutions if and only if

$$\frac{1}{p+1} + \frac{1}{q+1} \le \frac{n-2}{n}.$$

It is conjectured that the Liouville property should be true without radial restriction. So far it is known only in dimensions n ≤ 4 ([38]):

Theorem 4.2 Assume

$$\frac{1}{p+1} + \frac{1}{q+1} > \frac{n-2}{n}.$$

If n ≤ 4, then (17) admits no nontrivial nonnegative classical solution.
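To see on which side of the Sobolev hyperbola a given pair (p, q) lies, one can evaluate the defining expression exactly. The following is a small sketch with hypothetical helper names (`hyperbola_gap`, `classify`); the labels "subcritical", "critical" and "supercritical" are shorthand introduced here for the three cases >, = and < in the conditions of Theorems 4.1 and 4.2.

```python
from fractions import Fraction


def hyperbola_gap(p, q, n):
    """Return 1/(p+1) + 1/(q+1) - (n-2)/n as an exact fraction."""
    p, q = Fraction(p), Fraction(q)
    return 1 / (p + 1) + 1 / (q + 1) - Fraction(n - 2, n)


def classify(p, q, n):
    """Position of (p, q) relative to the Sobolev hyperbola in dimension n."""
    gap = hyperbola_gap(p, q, n)
    if gap > 0:
        return "subcritical"    # hypothesis of Theorem 4.2
    if gap == 0:
        return "critical"       # on the hyperbola
    return "supercritical"      # strictly inside the existence range of Theorem 4.1


if __name__ == "__main__":
    # For p = q the hyperbola reduces to p = (n+2)/(n-2), the Sobolev exponent.
    print(classify(3, 3, 3), classify(5, 5, 3), classify(7, 7, 3))
```

Passing integers or `Fraction` values keeps the comparison with the hyperbola exact; floating-point inputs are converted to their exact binary value, which may misclassify points lying precisely on the hyperbola.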


Remark 4.1 (Previous and other results) (i) The Liouville property was proved earlier in [37] for n = 3 and polynomially bounded solutions. (ii) The assumption of a polynomial bound for n = 3 was removed in [27] (as a consequence of [37] and of a doubling argument). (iii) For n ≥ 5, only partial results are available. See for instance [17] (biharmonic case p = 1, q < (n + 4)/(n − 4)), and also [9], [36], [4], [38], [16].

Ideas of proof of Theorem 4.2. By a doubling argument, it is enough to consider bounded solutions. The proof is done in four steps:

Step 1. Basic a priori bounds. Denote by α = 2(p + 1)/(pq − 1), β = 2(q + 1)/(pq − 1) the scaling exponents of system (17). By the rescaled test-functions method (cf. Section 1.2), we obtain

$$\int_{B_R} u^q \le C R^{n - q\alpha} \quad \text{and} \quad \int_{B_R} v^p \le C R^{n - p\beta}, \qquad R > 0.$$

Step 2. Maximum principle argument. Assume p ≥ q without loss of generality. Then, by a suitable maximum principle argument, one can show that

$$v^{p+1} \le \frac{p+1}{q+1}\, u^{q+1}, \qquad x \in \mathbb{R}^n.$$

Step 3. Pohozaev-type identity. Let us write u(x) = u(r, θ) in spherical coordinates. By a Pohozaev-type multiplier argument, one obtains the identity

$$\begin{aligned} \Big(\frac{n}{p+1} - a_1\Big)\int_{B_R} v^{p+1} + \Big(\frac{n}{q+1} - a_2\Big)\int_{B_R} u^{q+1} &= R^n \int_{S^{n-1}} \Big(\frac{v^{p+1}(R,\theta)}{p+1} + \frac{u^{q+1}(R,\theta)}{q+1}\Big)\,d\theta \\ &\quad + R^{n-1}\int_{S^{n-1}} \big(a_1 u_r v + a_2 u v_r\big)(R,\theta)\,d\theta \\ &\quad + R^n \int_{S^{n-1}} \big(u_r v_r - R^{-2}\nabla_\theta u \cdot \nabla_\theta v\big)(R,\theta)\,d\theta \end{aligned}$$

for any $a_1, a_2 \in \mathbb{R}$ with $a_1 + a_2 = n - 2$. Moreover, one can choose $a_1, a_2$ such that $\frac{n}{p+1} - a_1 > 0$ and $\frac{n}{q+1} - a_2 > 0$ whenever (p, q) is below the Sobolev hyperbola. From this identity, defining the volume and surface terms

$$F(R) := \int_{B_R} u^{q+1}, \qquad G_1(R) = R^n \int_{S^{n-1}} u^{q+1}(R,\theta)\,d\theta, \qquad G_2(R) = R^n \int_{S^{n-1}} \Big(|D_x u| + \frac{u}{R}\Big)\Big(|D_x v| + \frac{v}{R}\Big)\,d\theta,$$

we have $F(R) \le C G_1(R) + C G_2(R)$.

Step 4. Feedback argument. The idea is to estimate the surface terms by combining:
• the basic a priori estimates above;
• Sobolev imbeddings and interpolation inequalities on $S^{n-1}$;
• elliptic estimates in $B_R$;
• averaging in r and a measure argument.

In this way, one can prove that

$$F(R) \le C G_1(R) + C G_2(R) \le C R^{-a} F^{b}(4R) \qquad \text{along some sequence } R = R_i \to \infty,$$

for some powers a, b, which satisfy a > 0 and b < 1 whenever the pair (p, q) is below the Sobolev hyperbola and satisfies an additional condition (which is always true if n ≤ 4). Taking a suitable subsequence and using the boundedness of u, it follows that u ≡ 0, hence v ≡ 0.

Remark 4.2 A heuristic explanation of the dimension restriction n ≤ 4 (say for p = q) can be given as follows. First recall that, due to the standard elliptic theory, bootstrap/interpolation from an $L^r$ estimate is possible provided $r > r_c := d(p-1)/2$, where d is the underlying space dimension. Here our basic a priori estimate (cf. Step 1) is in $L^p$ (on n-dimensional balls). But by means of the Pohozaev-type identity, this estimate can be “projected” onto the unit sphere, whose dimension is d = n − 1. This allows for a crucial gain, since

$$p > (n-1)(p-1)/2 \iff p < (n-1)/(n-3) \ \big(= p_{sg}(n-1)\big)$$

and

$$p < (n+2)/(n-2) \le (n-1)/(n-3) \iff n \le 4.$$
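The arithmetic behind the dimension restriction in Remark 4.2 can be checked directly: the Sobolev exponent in dimension n is compared with the Serrin exponent in dimension n − 1. This is only an illustrative sketch; the helper names are chosen here and the range of dimensions tested is arbitrary.

```python
from fractions import Fraction
import math


def sobolev_exponent(n):
    """(n+2)/(n-2) for n > 2, infinity otherwise."""
    return math.inf if n <= 2 else Fraction(n + 2, n - 2)


def serrin_exponent(d):
    """d/(d-2) for d > 2, infinity otherwise."""
    return math.inf if d <= 2 else Fraction(d, d - 2)


def gain_covers_subcritical_range(n):
    """True when every p below the Sobolev exponent in dimension n is also
    below the Serrin exponent in dimension n - 1 (the sphere's dimension)."""
    return sobolev_exponent(n) <= serrin_exponent(n - 1)


if __name__ == "__main__":
    for n in range(3, 9):
        print(n, sobolev_exponent(n), serrin_exponent(n - 1),
              gain_covers_subcritical_range(n))
```

For n = 4 the two exponents coincide (both equal 3), which is why the restriction n ≤ 4 is sharp for this heuristic.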

4.2 Elliptic systems II: positive self-interaction

We now turn to the following class of Schrödinger-type systems:

$$-\Delta u_i = \sum_{j=1}^{m} \beta_{ij}\, u_i^{q}\, u_j^{q+1}, \tag{18}$$

where $B = (\beta_{ij})$ is a real m × m symmetric matrix with positive diagonal entries, m ≥ 2, q > 0. We denote the total degree by p := 2q + 1. We begin with the cooperative case, with the following result in the optimal range [36]:

Theorem 4.3 Assume $\beta_{ii} > 0$, $\beta_{ij} \ge 0$ and $p < p_S$. Then (18) has no positive classical solution.

Method of proof: moving spheres.

We now consider the case where some off-diagonal coefficients may be negative. The following matrix property plays an important role.

Definition. B is strictly copositive if

$$\sum_{1 \le i,j \le m} \beta_{ij}\, z_i z_j > 0 \qquad \text{for all } z \in [0,\infty)^m,\ z \ne 0.$$
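For m = 2 strict copositivity can be tested by hand. The sketch below is an illustration and not part of the source: it uses the standard closed-form criterion for symmetric 2 × 2 matrices (positive diagonal and $\beta_{12} > -\sqrt{\beta_{11}\beta_{22}}$), which is a known fact assumed here rather than stated in the text, and cross-checks it by sampling the quadratic form on the segment z = (t, 1 − t). Sampling is enough in principle because the form is homogeneous of degree 2, but a finite grid can only give evidence, not a proof. Both function names are hypothetical.

```python
import math


def strictly_copositive_2x2(b11, b12, b22):
    """Closed-form test for a symmetric 2x2 matrix [[b11, b12], [b12, b22]]."""
    return b11 > 0 and b22 > 0 and b12 > -math.sqrt(b11 * b22)


def sampled_min_form(b11, b12, b22, samples=1001):
    """Minimum of the quadratic form over z = (t, 1 - t), t on a uniform grid
    of [0, 1]. A nonpositive value disproves strict copositivity; a positive
    value is only numerical evidence for it."""
    best = math.inf
    for i in range(samples):
        t = i / (samples - 1)
        value = b11 * t * t + 2.0 * b12 * t * (1.0 - t) + b22 * (1.0 - t) ** 2
        best = min(best, value)
    return best


if __name__ == "__main__":
    for entries in [(1.0, -0.5, 1.0), (1.0, -1.0, 1.0), (1.0, -2.0, 1.0)]:
        print(entries, strictly_copositive_2x2(*entries), sampled_min_form(*entries))
```

The middle example, with $\beta_{12} = -1$, is the borderline case: the form equals $(z_1 - z_2)^2$, which vanishes at $z_1 = z_2$, so the matrix is copositive but not strictly so.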

We have the following necessary condition [40] for the Liouville property to hold. Theorem 4.4 Assume p < pS . If B is not strictly copositive, then (18) has a nontrivial nonnegative bounded solution. Method of proof: Construction of a periodic solution by variational techniques.

The following result (cf. [35], [39]) shows that the copositivity condition is (necessary and) sufficient under a suitable assumption on p.

Theorem 4.5 Let B be strictly copositive. Assume in addition that either

$$p < p_S, \qquad n \le 4, \qquad m = 2$$

or

$$p < n/(n-2)_+, \qquad m \ge 3.$$

Then (18) has no positive bounded classical solution.

Ideas of proof. It is based on modifications of the ideas in the proof of Theorem 4.2. For m ≥ 3, the above ideas are combined with a device from [40], which uses a test-function of the form $u_i^{-q}$.

Remark 4.3 (i) The problem remains open for m, n ≥ 3 in the range $n/(n-2) \le p < p_S$. (ii) The boundedness assumption can be partially relaxed. (iii) Earlier results were obtained in [7], [40].

4.3 Elliptic systems III: negative self-interaction We now consider the system  −Δ u = uq vm [avr − cur ],

x ∈ Rn

−Δ v = vq um [bur − dvr ],

x ∈ Rn

(19)


where $m, q \ge 0$, $r > 0$, $a, b, c, d \ge 0$, with total degree $p := q + m + r$. Such systems enter in models of problems with negative self-interaction, and are thus in a sense opposite to the case studied in Section 4.2. The typical cases are the following:

• Schrödinger: $m = 0$, $r = q + 1$
\[ -\Delta u_i = \sum_{j=1}^{2} \beta_{ij}\, u_i^{q} u_j^{q+1} \]

• Lotka-Volterra: $m = 0$, $q = r = 1$
\[ \begin{cases} -\Delta u = u(av - cu), & x \in \mathbb{R}^n, \\ -\Delta v = v(bu - dv), & x \in \mathbb{R}^n \end{cases} \]

• Reversible chemical reactions: $m = q = r = 1$
\[ \begin{cases} -\Delta u = uv(av - cu), & x \in \mathbb{R}^n, \\ -\Delta v = uv(bu - dv), & x \in \mathbb{R}^n \end{cases} \]
These are reactions of the form
\[ A + 2B \ \underset{k_2}{\overset{k_1}{\rightleftarrows}} \ 2A + B. \]

We here consider the approach based on the reduction to a scalar Liouville-type theorem, by showing the proportionality of components (or synchronisation). The following result is due to [23].

Theorem 4.6 Assume
\[ r \ge |q - m|, \quad ab > cd \quad \text{and} \quad q \le n/(n-2)_+. \]

(i) Then any positive bounded solution of (19) satisfies $u/v = \mathrm{Const}$.
(ii) If also $p < p_S$, then (19) has no positive bounded solution.

Remark 4.4 (i) Theorem 4.6(i) applies to (some) critical and supercritical cases. Also the boundedness assumption can be partially relaxed. (ii) One can show that the proportionality constant is unique. (iii) The condition $q \le n/(n-2)_+$ is optimal (cf. [34], [32]). On the other hand, it can be replaced by $m \le 2/(n-2)_+$ if $c, d > 0$. (iv) Other related results showing proportionality of components of various elliptic systems can be found in [18], [34], [1], [6], [8], [22]. See Section 5.4 for a result of this type for parabolic systems.

Sketch of proof of Theorem 4.6. Step 1. Key "dissipativity" property. We show that there exists a unique constant $K > 0$ (independent of the solution) such that


\[ \forall u, v > 0, \quad [f(u,v) - K g(u,v)]\,(u - Kv) \le 0. \]
Moreover, we have $aK^r > c$.

Step 2. Auxiliary functions. Set $W = |u - Kv| \ge 0$, $Z = \min(u, Kv) > 0$. One can show that $(W, Z)$ is a weak solution of the auxiliary system
\[ \begin{cases} \Delta W \ge 0, \\ -\Delta Z \ge c\,W^{\alpha} Z^{q} \end{cases} \quad \text{with } \alpha = \max(m + r, 1). \]

Step 3. Extension of Gidas' Liouville theorem for inequalities. One can prove the following:

Lemma 4.7 Let $0 < q \le n/(n-2)_+$ and $V \in C(\mathbb{R}^n)$, $V \ge 0$, satisfy
\[ \liminf_{R \to \infty} R^{-n} \int_{B_{2R} \setminus B_R} V(x)\,dx > 0. \]
If $U \ge 0$ and
\[ -\Delta U \ge V(x)\,U^{q}, \quad x \in \mathbb{R}^n, \]
then $U \equiv 0$.

Step 4. Contradiction argument to prove (i). Assume $W \not\equiv 0$. Since $\Delta W \ge 0$, it is well known that the average $\tilde W(R)$ of $W$ on the sphere of radius $R$ is nondecreasing in $R$. Consequently,
\[ \frac{1}{|B_R|} \int_{B_R} W(x)\,dx \le \frac{n}{R} \int_0^R \tilde W(r)\,dr \le \frac{2n}{R} \int_{R/2}^{R} \tilde W(r)\,dr \le C(n)\,R^{-n} \int_{B_{2R} \setminus B_R} W(x)\,dx. \]
Since $W \not\equiv 0$, it follows easily from the mean-value inequality that
\[ \liminf_{R \to \infty} R^{-n} \int_{B_{2R} \setminus B_R} W(x)\,dx > 0, \]
hence
\[ \liminf_{R \to \infty} R^{-n} \int_{B_{2R} \setminus B_R} W^{\alpha}(x)\,dx > 0 \]
by Jensen's inequality. Since $-\Delta Z \ge c\,W^{\alpha} Z^{q}$, it suffices to apply Lemma 4.7 with $V = c\,W^{\alpha}$.

Step 5. Proof of (ii). It suffices to note that $v = Ku$ and $A := aK^r - c > 0$ imply $-\Delta u = A u^{p}$.


5 Liouville for parabolic systems

We consider parabolic systems that can be written in the general vector form
\[ \partial_t U - \Delta U = F(U), \quad x \in \mathbb{R}^n,\ t \in \mathbb{R}, \tag{20} \]
where $U = (u_1, \dots, u_m)$, $F = (F_1, \dots, F_m)$. As mentioned in the previous section, many results are known in the elliptic case. In comparison, only a few results are available in the parabolic case. We will review some of them in the next subsections.

5.1 Low values of p

A first basic approach is to rely on Fujita-type results (nonexistence of global solutions in $(0, \infty) \times \mathbb{R}^n$), which of course guarantee the Liouville property, but usually in a quite nonoptimal way in terms of exponent range.

Proposition 5.1 Assume that $F$ is $p$-coercive:
\[ \exists\, \xi > 0, \quad \xi \cdot F(U) \ge c\,|U|^{p}, \]
with
\[ 1 < p \le p_F := \frac{n+2}{n}. \]
Then system (20) has no nontrivial entire solutions $U \ge 0$.

Idea of proof. Apply the classical scalar Fujita result Theorem 1.1 to $z := \xi \cdot U$.

5.2 Gradient structure-homogeneous case

The following result is due to [29].

Theorem 5.2 Let $G \in C^{2+\alpha}$ for some $\alpha > 0$ and $G(U) > G(0)$ for all $U \in [0, \infty)^m \setminus \{0\}$. Assume $F = \nabla G$, $F$ is $p$-coercive, $F$ is $p$-homogeneous, and $1 < p < n/(n-2)_+$. Then system (20) has no nontrivial entire solutions $U \ge 0$.


Idea of proof. Similar to the proof of Theorem 2.2 for p < n/(n − 2)+ , based on a combination of similarity variables, weighted energy and rescaling.

Remark 5.1 Theorem 5.2 is true in the full range 1 < p < pS if U is radial. The proof is based on the 1d Liouville result, combined with doubling and energy arguments, so as to reduce the parabolic Liouville property to an elliptic one (see [29] and cf. also [33]). Note that zero-number is not available for systems. On the other hand, partial related results were previously obtained in [24] for n = 1 or radial solutions.

5.3 Gross-Pitaevskii case

We consider system (20) with nonlinearities of the form
\[ f_i(U) = u_i^{r} \sum_{j=1}^{m} \beta_{ij}\, u_j^{r+1}, \]
where $\beta = (\beta_{ij})$ is a symmetric matrix and $r > 0$. This system enjoys a gradient structure and is $p$-homogeneous with $p = 2r + 1$. The classical cubic case corresponds to $r = 1$:
\[ f_i(U) = u_i \sum_{j=1}^{m} \beta_{ij}\, u_j^{2}. \]

In the case of nonnegative coefficients, the following result from [25] improves the range of $p$ with respect to Theorem 5.2 for $n \ge 3$.

Theorem 5.3 Assume
\[ \beta_{ii} > 0, \quad \beta_{ij} \ge 0, \quad 1 < p < p_B := \frac{n(n+2)}{(n-1)^2}. \]
Then system (20) has no positive (component-wise) entire solutions. In particular this is true for $p = n = 3$.

Sketch of proof of Theorem 5.3. It is based on modifications of ideas from [2] in the scalar case, which was a parabolic modification of the elliptic proof from [12] (also [3]).

Step 1. Basic functionals and 1-parameter family of inequalities (no PDE involved!). Let
\[ I(u) = \int \frac{|\nabla u|^4}{u^2}\,\varphi, \quad J(u) = \int \frac{|\nabla u|^2}{u}\,(-\Delta u)\,\varphi, \quad K(u) = \int (\Delta u)^2\,\varphi, \]






where $\int \equiv \int_Q dx\,dt$, $Q = B_1 \times (-1, 1)$, $\varphi \in C_0^\infty(Q)$. We have the following lemma, where "L.O.T." means that the total number of derivatives of $u$ is less than 4, e.g. $|\nabla u|^2 \Delta \varphi, \cdots$.

Lemma 5.4 Let $0 < u \in C^{1,2}(Q)$ (real valued), $0 \le \varphi \in C_0^\infty(Q)$ and $\alpha \in \mathbb{R}$. Then we have
\[ \alpha J(u) - K(u) + A(\alpha) I(u) \le \text{L.O.T.}, \quad \text{where } A(\alpha) = \frac{n}{n+2}\,\alpha \left( 1 - \frac{(n-1)^2}{n(n+2)}\,\alpha \right). \]

Sketch of proof of Lemma 5.4. It is based on the following three ingredients:
(i) the Bochner formula $\frac{1}{2} \Delta |\nabla v|^2 = \nabla v \cdot \nabla(\Delta v) + |D^2 v|^2$;
(ii) testing with $\varphi v^m$ and integration by parts;
(iii) the substitution $v = u^k$ (for suitable choices of $m$, $k$ in terms of $\alpha$).
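The role of the coefficient $A(\alpha)$ in Lemma 5.4 can be made concrete with a short computation. The sketch below is illustrative only (the function names are hypothetical and $n \ge 2$ is assumed so that $(n-1)^2 \ne 0$). It evaluates $A(\alpha)$ exactly and picks a value of $\alpha$ strictly between $p$ and $p_B = n(n+2)/(n-1)^2$ whenever such a value exists.

```python
from fractions import Fraction

def p_B(n):
    """Exponent p_B = n(n + 2)/(n - 1)^2, for n >= 2."""
    return Fraction(n * (n + 2), (n - 1) ** 2)

def A(alpha, n):
    """A(alpha) = n/(n+2) * alpha * (1 - (n-1)^2/(n(n+2)) * alpha)."""
    alpha = Fraction(alpha)
    return Fraction(n, n + 2) * alpha * (1 - alpha / p_B(n))

def admissible_alpha(p, n):
    """Midpoint of (p, p_B) if the interval is nonempty, else None."""
    p = Fraction(p)
    return (p + p_B(n)) / 2 if p < p_B(n) else None
```

For $n = 3$ one has $p_B = 15/4$, so $A(3) > 0$ while $A(4) < 0$; the quadratic $A$ vanishes exactly at $\alpha = 0$ and $\alpha = p_B$.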

Step 2. Transformation of J and K for solutions of (20). We let
\[ I := \sum_i I(u_i), \quad J := \sum_i J(u_i), \quad K := \sum_i K(u_i), \quad L := \int \sum_i (f_i(U))^2\,\varphi. \]

Lemma 5.5 Let $U > 0$ be a solution of (20) in $Q$. Then
\[ K = L + \text{L.O.T.}, \quad L \le pJ + \text{L.O.T.} \]

Ideas of proof of Lemma 5.5. It is done in two steps:
(i) First write $(\Delta u_i)^2 = (f_i(U) - \partial_t u_i)^2$ and transform the $\partial_t$ terms to L.O.T. by using a localized energy.
(ii) Then integrate by parts $\int |\nabla u_i|^2\, u_i^{r-1} u_j^{r+1}\,\varphi$.

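Step 3. Conclusion of sketch of proof of Theorem 5.3. Combining Lemmas 5.4 and 5.5, we obtain:
\[ \alpha J - K + A(\alpha) I \le \text{L.O.T.}, \quad K = L + \text{L.O.T.}, \quad L \le pJ + \text{L.O.T.}, \]
with
\[ A(\alpha) = \frac{n}{n+2}\,\alpha \left( 1 - \frac{(n-1)^2}{n(n+2)}\,\alpha \right). \]
It then follows that
\[ \underbrace{\Bigl( \frac{\alpha}{p} - 1 \Bigr)}_{>0} L + \underbrace{A(\alpha)}_{>0}\, I \le \text{L.O.T.} \quad \text{if } p < \alpha < p_B. \]
[...] If $U > 0$ is an entire solution, we then rescale as:
\[ U_\lambda(x, t) = \lambda^{2/(p-1)}\, U(\lambda x, \lambda^2 t). \]
Applying (21) to $U_\lambda$ and letting $\lambda \to \infty$, we get
\[ \int_{\mathbb{R}^{n+1}} |U|^{2p} = 0: \]
a contradiction.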
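To compare the exponent ranges that appear in Proposition 5.1, Theorem 5.2 and Theorem 5.3, the following sketch tabulates them for $n \ge 3$. It is an illustration under stated assumptions: the function names are hypothetical, and $p_S$ is taken to be the Sobolev exponent $(n+2)/(n-2)$, which is not restated in this section.

```python
from fractions import Fraction

def fujita_exponent(n):
    """p_F = (n + 2)/n, as in Proposition 5.1."""
    return Fraction(n + 2, n)

def subcritical_bound(n):
    """n/(n - 2), the bound in Theorem 5.2 (for n >= 3)."""
    return Fraction(n, n - 2)

def bidaut_veron_exponent(n):
    """p_B = n(n + 2)/(n - 1)^2, as in Theorem 5.3."""
    return Fraction(n * (n + 2), (n - 1) ** 2)

def sobolev_exponent(n):
    """Assumed value p_S = (n + 2)/(n - 2) (for n >= 3)."""
    return Fraction(n + 2, n - 2)

# One row (n, p_F, n/(n-2), p_B, p_S) per dimension.
table = [
    (n, fujita_exponent(n), subcritical_bound(n),
     bidaut_veron_exponent(n), sobolev_exponent(n))
    for n in range(3, 12)
]
```

In every row of `table` the four exponents are strictly increasing, which matches the statement that Theorem 5.3 improves the range of Theorem 5.2 for $n \ge 3$ while staying below $p_S$. For $n = 3$ one finds $p_B = 15/4 > 3$, so the cubic case $p = 3$ is covered.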



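5.4 Lotka-Volterra case

We consider the system
\[ \begin{cases} u_t - \Delta u = u^{q}\,[a v^{r} - c u^{r}], & x \in \mathbb{R}^n,\ t \in \mathbb{R}, \\ v_t - \Delta v = v^{q}\,[b u^{r} - d v^{r}], & x \in \mathbb{R}^n,\ t \in \mathbb{R}, \end{cases} \tag{22} \]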

with $q \ge 0$, $r > 0$, $a, b, c, d \ge 0$. We denote the total degree by $p := q + r$. Note that, unlike (20), system (22) has no variational structure in general. The following result is due to [30].

Theorem 5.6 Let $q, a, b, c, d > 0$ with $q + r > 1$. Assume
\[ ab > cd \quad \text{and} \quad r \ge q. \]
(i) Then any positive solution of (22) satisfies $u/v = \mathrm{Const}$.
(ii) If also $n = 2$ or $p < p_B$, then (22) has no positive (component-wise) solution.

Method of proof: It is a parabolic modification of the proof of Theorem 4.6, based on suitable maximum principle arguments.



Acknowledgements These notes are based on a series of lectures given at MATRIX, Creswick, Australia, in November 2018. The author thanks this institution for the hospitality, as well as the University of Sydney.

References

1. Th. Bartsch, Bifurcation in a multicomponent system of nonlinear Schrödinger equations, J. Fixed Point Theory Appl. 13 (2013) 37–50.


2. M.-F. Bidaut-Véron, Initial blow-up for the solutions of a semilinear parabolic equation with source term, Equations aux dérivées partielles et applications, articles dédiés à Jacques-Louis Lions, Gauthier-Villars, Paris (1998) 189–198.
3. M.-F. Bidaut-Véron and L. Véron, Nonlinear elliptic equations on compact Riemannian manifolds and asymptotics of Emden equations, Invent. Math. 106 (1991) 489–539.
4. J. Busca and R. Manásevich, A Liouville-type theorem for Lane-Emden system, Indiana Univ. Math. J. 51 (2002) 37–51.
5. W. Chen and C. Li, Classification of solutions of some nonlinear elliptic equations, Duke Math. J. 63 (1991) 615–622.
6. L. D’Ambrosio, A new critical curve for a class of quasilinear elliptic systems, Nonlinear Anal. 78 (2013) 62–78.
7. E.N. Dancer, J. Wei and T. Weth, A priori bounds versus multiple existence of positive solutions for a nonlinear Schrödinger system, Ann. Inst. H. Poincaré Anal. Non Linéaire 27 (2010) 953–969.
8. A. Farina, Symmetry of components, Liouville-type theorems and classification results for some nonlinear elliptic systems, Discrete Contin. Dyn. Syst. 35 (2015) 5869–5877.
9. D.G. de Figueiredo and P. Felmer, A Liouville-type theorem for elliptic systems, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 21 (1994) 387–397.
10. H. Fujita, On the blowing up of solutions of the Cauchy problem for $u_t = \Delta u + u^{1+\alpha}$, J. Fac. Sci. Univ. Tokyo Sec. IA Math. 13 (1966) 109–124.
11. B. Gidas, Symmetry properties and isolated singularities of positive solutions of nonlinear elliptic equations, Nonlinear partial differential equations in engineering and applied science, Dekker, New York, Lecture Notes in Pure and Appl. Math. 54 (1980) 255–273.
12. B. Gidas and J. Spruck, Global and local behavior of positive solutions of nonlinear elliptic equations, Comm. Pure Appl. Math. 34 (1981) 525–598.
13. B. Gidas and J. Spruck, A priori bounds for positive solutions of nonlinear elliptic equations, Comm. Partial Differential Equations 6 (1981) 883–901.
14. Y. Giga, A bound for global solutions of semilinear heat equations, Comm. Math. Phys. 103 (1986) 415–421.
15. K. Hayakawa, On nonexistence of global solutions of some semilinear parabolic differential equations, Proc. Japan Acad. 49 (1973) 503–505.
16. K. Li and Z. Zhang, On the Lane-Emden conjecture, Preprint arXiv 1807.06751.
17. C.-S. Lin, A classification of solutions of a conformally invariant fourth order equation in $\mathbb{R}^n$, Comment. Math. Helv. 73 (1998) 206–231.
18. Y. Lou, Necessary and sufficient condition for the existence of positive solutions of certain cooperative system, Nonlinear Anal. 26 (1996) 1079–1095.
19. F. Merle and H. Zaag, Optimal estimates for blowup rate and behavior for nonlinear heat equations, Comm. Pure Appl. Math. 51 (1998) 139–196.
20. E. Mitidieri, Nonexistence of positive solutions of semilinear elliptic systems in $\mathbb{R}^n$, Differential Integral Equations 9 (1996) 465–479.
21. E. Mitidieri and S.I. Pohozaev, A priori estimates and blow-up of solutions of nonlinear partial differential equations and inequalities, Proc. Steklov Inst. Math. 234 (2001) 1–362.
22. A. Montaru and B. Sirakov, Stationary states of reaction-diffusion and Schrödinger systems with inhomogeneous or controlled diffusion, SIAM J. Math. Anal. 48 (2016) 2561–2587.
23. A. Montaru, B. Sirakov and Ph. Souplet, Proportionality of components, Liouville theorems and a priori estimates for noncooperative elliptic systems, Arch. Rational Mech. Anal. 213 (2014) 129–169.
24. Q.H. Phan, Optimal Liouville-type theorems for a parabolic system, Discrete Contin. Dyn. Syst. 35 (2015) 399–409.
25. Q.H. Phan and Ph. Souplet, A Liouville-type theorem for the 3-dimensional parabolic Gross-Pitaevskii and related systems, Math. Ann. 366 (2016) 1561–1585.
26. P. Poláčik and P. Quittner, A Liouville-type theorem and the decay of radial solutions of a semilinear heat equation, Nonlinear Anal. 64 (2006) 1679–1689.


27. P. Poláčik, P. Quittner and Ph. Souplet, Singularity and decay estimates in superlinear problems via Liouville-type theorems. Part I: elliptic equations and systems, Duke Math. J. 139 (2007) 555–579.
28. P. Poláčik, P. Quittner and Ph. Souplet, Singularity and decay estimates in superlinear problems via Liouville-type theorems. Part II: parabolic equations, Indiana Univ. Math. J. 56 (2007) 879–908.
29. P. Quittner, Liouville theorems for scaling invariant superlinear parabolic problems with gradient structure, Math. Ann. 364 (2016) 269–292.
30. P. Quittner, Liouville theorems, universal estimates and periodic solutions for cooperative parabolic Lotka-Volterra systems, J. Differential Equations 260 (2016) 3524–3537.
31. P. Quittner and Ph. Souplet, Superlinear parabolic problems. Blow-up, global existence and steady states, Birkhäuser Advanced Texts: Basel Textbooks, Birkhäuser Verlag, Basel, 2007.
32. P. Quittner and Ph. Souplet, Superlinear parabolic problems. Blow-up, global existence and steady states, Second edition, Birkhäuser Advanced Texts: Basel Textbooks, Birkhäuser/Springer, Cham, 2019.
33. P. Quittner and Ph. Souplet, Parabolic Liouville-type theorems via their elliptic counterparts, Discrete Contin. Dyn. Syst., Special Issue (2011) 1206–1213.
34. P. Quittner and Ph. Souplet, Symmetry of components for semilinear elliptic systems, SIAM J. Math. Anal. 44 (2012) 2545–2559.
35. P. Quittner and Ph. Souplet, Optimal Liouville-type theorems for noncooperative elliptic Schrödinger systems and applications, Comm. Math. Phys. 311 (2012) 1–19.
36. W. Reichel and H. Zou, Non-existence results for semilinear cooperative elliptic systems via moving spheres, J. Differential Equations 161 (2000) 219–243.
37. J. Serrin and H. Zou, Non-existence of positive solutions of Lane-Emden systems, Differential Integral Equations 9 (1996) 635–653.
38. Ph. Souplet, The proof of the Lane-Emden conjecture in four space dimensions, Adv. Math. 221 (2009) 1409–1427.
39. Ph. Souplet, Liouville-type theorems for elliptic Schrödinger systems associated with copositive matrices, Netw. Heterog. Media 7 (2012) 967–988.
40. H. Tavares, S. Terracini, G. Verzini and T. Weth, Existence and nonexistence of entire solutions for non-cooperative cubic elliptic systems, Comm. Partial Differential Equations 36 (2011) 1988–2010.
41. F.B. Weissler, Existence and non-existence of global solutions for a semilinear heat equation, Israel J. Math. 38 (1981) 29–40.

Part II

Other Contributed Articles

Chapter 4

Algebraic Geometry, Approximation and Optimisation

Schur functions for approximation problems Nadezda Sukhorukova, Julien Ugon and David Yost

Abstract In this paper we propose a new approach to least squares approximation problems. This approach is based on partitioning and Schur functions. The nature of this approach is combinatorial, while most existing approaches are based on algebra and algebraic geometry. This problem has several practical applications. One of them is curve clustering. We use this application to illustrate the results.

1 Introduction

In this paper we formulate a specific least squares approximation problem and provide a signal processing application where this problem is used. The main technical difficulty for this problem is to solve linear systems with the same system matrix and different right-hand sides. One simple approach that can be proposed here is to invert the system matrix and multiply the updated right-hand side by this inverse at each iteration. In general, it is not very efficient to solve linear systems through computing matrix inverses, but in this particular application it is very beneficial. One technical difficulty here is to know in advance whether the system matrix

Nadezda Sukhorukova, Faculty of Science, Engineering and Technology, Swinburne University of Technology, PO Box 218, Hawthorn, Victoria, Australia and Centre for Informatics and Applied Optimization, Federation University Australia
Julien Ugon, Faculty of Science, Engineering and Built Environment, Deakin University, 221 Burwood Highway, Burwood, Victoria 3125, Australia and Centre for Informatics and Applied Optimization, Federation University Australia
David Yost, Centre for Informatics and Applied Optimization, Federation University Australia

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_22


is invertible or not. Similar problems appear in Chebyshev (uniform) approximation problems as well. In this paper we suggest a new approach for dealing with this kind of system. This approach is based on Schur functions, a well-established technique that is used to describe partitioning [2]. The very nature of these functions is combinatorial. Based on our previous experience [4], the characterisation of the necessary and sufficient optimality conditions for multivariate Chebyshev approximation is also combinatorial and therefore Schur functions are a very natural tool to work with these problems.

This paper is organised as follows. In section 2 we introduce a signal processing application that relies on approximation and optimisation. In section 3 we provide a mathematical formulation of the signal processing problem and discuss how it can be simplified. In section 4 we introduce an innovative approach for solving the problem. This approach is based on Schur functions. Finally, in section 5 we provide future research directions.

2 Signal clustering

In signal processing, there is a need for constructing signal prototypes. Signal prototypes are summary curves that may replace the whole group of signal segments, where the signals are believed to be similar to each other. Signal prototypes may be used for characterising the structure of the signal segments and also for reducing the amount of information to be stored. Any signal group prototype should be an accurate approximation for each member of the group. On top of this, it is desirable that the process of recomputing group prototypes, when new group members are available, is not computationally expensive.

In this paper we suggest a model based on k-means and least squares approximation. Similar models are proposed in [3]. This is a convex optimisation problem. There are several advantages of this model. First of all, it provides an accurate approximation to the group of signals. Second, the solution of this problem can be obtained by solving a linear system, which can be done efficiently. Finally, the proposed approach allows one to compute prototype updates without recomputing from scratch.

3 Mathematical formulation

3.1 Prototype construction

Assume that there is a group of $l$ signals $S_1(t), \dots, S_l(t)$, whose values are measured at discrete time moments $t_1, \dots, t_N$, $t_i \in [a, b]$, $i = 1, \dots, N$. We suggest constructing the prototype as a polynomial $P_n(X, t) = \sum_{i=0}^{n} x_i t^i$ of degree $n$, whose least squares deviation from each member of the group on $[a, b]$ is minimal. That is, one has to solve the following optimisation problem:
\[ \text{minimise } F(X) = \sum_{j=1}^{l} \sum_{i=1}^{N} \bigl( S_j(t_i) - P_n(X, t_i) \bigr)^2, \tag{1} \]
where $X = (x_0, \dots, x_n) \in \mathbb{R}^{n+1}$; the components $x_k$, $k = 0, \dots, n$, are the polynomial parameters and also the decision variables. Each signal is a column vector $S^j = (S_j(t_1), \dots, S_j(t_N))^T$, $j = 1, \dots, l$. Problem (1) can be formulated in the following matrix form:
\[ \text{minimise } F(X) = \| Y - BX \|_2^2, \tag{2} \]
where $X = (x_0, \dots, x_n) \in \mathbb{R}^{n+1}$ are the decision variables (same as in (1)); the vector
\[ Y = \begin{pmatrix} S^1 \\ S^2 \\ \vdots \\ S^l \end{pmatrix} \in \mathbb{R}^{Nl} \]
stacks the signals, and the matrix $B$ contains repeated matrix blocks, namely,
\[ B = \begin{pmatrix} B_0 \\ B_0 \\ \vdots \\ B_0 \end{pmatrix}, \quad \text{where} \quad B_0 = \begin{pmatrix} 1 & t_1 & t_1^2 & \dots & t_1^n \\ 1 & t_2 & t_2^2 & \dots & t_2^n \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & t_N & t_N^2 & \dots & t_N^n \end{pmatrix}. \]
This least squares problem can be solved using a system of normal equations:
\[ B^T B X = B^T Y. \tag{3} \]

Taking into account the structure of the system matrix in (3), the problem can be significantly simplified:
\[ l\, B_0^T B_0 X = B_0^T \sum_{k=1}^{l} S^k. \tag{4} \]
Therefore, instead of solving (3), one can solve
\[ B_0^T B_0 X = B_0^T\, \frac{\sum_{k=1}^{l} S^k}{l} = B_0^T \bar S, \tag{5} \]
where $\bar S$ is the average of all $l$ signals of the group (centroid).
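The reduction to a system with the centroid on the right-hand side can be tried on a toy example. The following is a minimal sketch in plain Python with exact rational arithmetic, not the authors' implementation: all function names are hypothetical, and the small Gauss-Jordan solver is included only to keep the example self-contained.

```python
from fractions import Fraction

def vandermonde(times, n):
    """Rows (1, t, t^2, ..., t^n), one per time moment: the block B_0."""
    return [[Fraction(t) ** k for k in range(n + 1)] for t in times]

def centroid(signals):
    """Pointwise average of the signals (each a list of N samples)."""
    count = len(signals)
    return [sum(samples) / count for samples in zip(*signals)]

def solve(matrix, rhs):
    """Gauss-Jordan elimination; raises StopIteration if the matrix is singular."""
    size = len(matrix)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def prototype(times, signals, n):
    """Coefficients x_0, ..., x_n solving B_0^T B_0 X = B_0^T S_bar."""
    block = vandermonde(times, n)
    s_bar = centroid([[Fraction(v) for v in s] for s in signals])
    columns = list(zip(*block))  # columns of B_0
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]
    rhs = [sum(a * b for a, b in zip(ci, s_bar)) for ci in columns]
    return solve(gram, rhs)

# Two signals sampled at t = 0, 1, 2, 3: S_1(t) = 1 + 2t and S_2(t) = 3.
coefficients = prototype([0, 1, 2, 3], [[1, 3, 5, 7], [3, 3, 3, 3]], 1)
```

Here the centroid is the line $2 + t$, so the degree-one prototype recovers it exactly. Only the $(n+1) \times (n+1)$ system is solved, regardless of the number of signals $l$.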

3.2 Prototype update

Suppose that a signal group prototype has been constructed. Assume now that we need to update our group of signals: some new signals have to be included, while some others are to be excluded. To update the prototype, one needs to update the centroid and solve (5) with the updated right-hand side, while the system matrix $B_0^T B_0$ remains the same. If only a few signals are moving in and out of the group, then the updated centroid can be calculated without recomputing from scratch. Assume that $l_a$ signals are moving into the group (signals $S_a^1(t), \dots, S_a^{l_a}(t)$), while $l_r$ are moving out (signals $S_r^1(t), \dots, S_r^{l_r}(t)$). Then the centroid can be recalculated as follows:
\[ \bar S_{new}(t) = \frac{l\, \bar S_{old}(t) + \sum_{k=1}^{l_a} S_a^k(t) - \sum_{k=1}^{l_r} S_r^k(t)}{l - l_r + l_a}. \]

Since the same system has to be solved repeatedly with different right-hand sides, one approach is to invert matrix $B_0^T B_0$, which is an $(n + 1) \times (n + 1)$ matrix. In most cases, $n$ is much smaller than $N$ or $l$ and therefore this approach is quite attractive, if we can guarantee that matrix $B_0^T B_0$ is invertible. In the next section we discuss the verification of this property.
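The incremental centroid update of section 3.2 can be sketched as follows. This is an illustration only (the function name and argument layout are hypothetical): the old centroid is rescaled by the old group size, the incoming signals are added, the outgoing signals are subtracted, and the result is divided by the new group size.

```python
from fractions import Fraction

def update_centroid(old_centroid, old_size, added, removed):
    """Return the centroid after adding and removing signals.

    old_centroid: list of N sample values of the current centroid
    old_size:     number l of signals currently in the group
    added:        list of signals (lists of N samples) joining the group
    removed:      list of signals (lists of N samples) leaving the group
    """
    new_size = old_size - len(removed) + len(added)
    if new_size <= 0:
        raise ValueError("the updated group must contain at least one signal")
    updated = []
    for i, mean in enumerate(old_centroid):
        total = old_size * Fraction(mean)                 # l * S_old(t_i)
        total += sum(Fraction(s[i]) for s in added)       # signals moving in
        total -= sum(Fraction(s[i]) for s in removed)     # signals moving out
        updated.append(total / new_size)
    return updated

# Group {(1, 3), (3, 3), (5, 9)} has centroid (3, 5); drop (1, 3), add two signals.
new_centroid = update_centroid([3, 5], 3, added=[[2, 0], [6, 4]], removed=[[1, 3]])
```

The cost of the update depends only on the number of signals that move, not on the size of the group, and the matrix $B_0^T B_0$ is not touched.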

4 Schur functions and matrix inverse

4.1 Vandermonde and generalised Vandermonde matrices

Consider matrix $B_0^T B_0$. In general, matrix $B_0$ can be defined as follows:
\[ B_0 = \begin{pmatrix} g_1(t_1) & g_2(t_1) & g_3(t_1) & \dots & g_{n+1}(t_1) \\ g_1(t_2) & g_2(t_2) & g_3(t_2) & \dots & g_{n+1}(t_2) \\ \vdots & \vdots & \vdots & & \vdots \\ g_1(t_N) & g_2(t_N) & g_3(t_N) & \dots & g_{n+1}(t_N) \end{pmatrix}, \]


where $g_i$, $i = 1, \dots, n + 1$, are basis functions. In section 3 we were discussing polynomial approximation and therefore the components of matrix $B_0^T B_0$ are monomials that are evaluated at different time moments. Recall that $n + 1$ [...] $0$, $i = 1, \dots, n + 1$, then the system is Chebyshev. Note that this statement can be proven using a logarithmic transformation [1]. We believe, however, that our approach is also applicable to more general settings.

There are many studies on Schur polynomials and many efficient ways for computing them. This approach can be used, for example, if one needs to know if $B_0^T B_0$ is invertible. If the matrix is invertible, one can develop a very fast and efficient algorithm for curve cluster prototype updates. If the matrix is singular, one can use the singular-value decomposition for constructing the prototype updates. This decomposition can be computed once, since $B_0^T B_0$ remains unchanged when the cluster membership is updated.

5 Discussions and future research directions

There are many studies on how to compute Schur functions. We are particularly interested in the extension of this approach to Chebyshev (uniform) approximation and multivariate approximation. This is a very promising approach for dealing with this type of problem since, as our previous studies suggested [4], the corresponding optimality conditions are very combinatorial in their nature and therefore Schur functions are a very natural tool for studying this kind of system. We are also planning to conduct a thorough numerical study of the signal processing application we are discussing in this paper.

Acknowledgements This paper was inspired by the discussions during a recent MATRIX program "Approximation Optimisation and Algebraic Geometry" that took place in February 2018. We are thankful to the MATRIX organisers, support team and participants for a terrific research atmosphere and productive discussions. This study was supported by the Australian Research Council Discovery Project DP18010060 "Solving hard Chebyshev approximation problems through nonsmooth analysis".


References

1. Samuel Karlin and William Studden, Tchebycheff systems, with applications in analysis and statistics, Interscience Publishers, New York, 1966 (English).
2. I. G. Macdonald, Symmetric functions and Hall polynomials, Clarendon Press, Oxford University Press, Oxford, New York, 1995.
3. H. Späth, Cluster analysis algorithms for data reduction and classification of objects, Ellis Horwood Limited, Chichester, 1980.
4. Nadezda Sukhorukova, Julien Ugon, and David Yost, Chebyshev multivariate polynomial approximation: alternance interpretation, 2016 MATRIX Annals (David R. Wood, Jan de Gier, Cheryl Praeger, and Terence Tao, eds.), MATRIX Book Series, vol. 1, Springer, 2018, pp. 177–182.

Chapter 5

Dynamics, Foliations, and Geometry In Dimension 3

Research announcement: Partially hyperbolic diffeomorphisms homotopic to the identity on 3-manifolds

Thomas Barthelmé, Sergio R. Fenley, Steven Frankel and Rafael Potrie

Abstract We announce some results towards the classification of partially hyperbolic diffeomorphisms on 3-manifolds, and outline the proofs in the case when the diffeomorphism is dynamically coherent. Detailed proofs will appear later.

1 Introduction

A diffeomorphism $f$ of a 3-manifold $M$ is partially hyperbolic if it preserves a splitting of the tangent bundle $TM$ into three 1-dimensional sub-bundles
\[ TM = E^s \oplus E^c \oplus E^u, \]
where the stable bundle $E^s$ is eventually contracted, the unstable bundle $E^u$ is eventually expanded, and the central bundle $E^c$ is distorted less than the stable and unstable bundles at each point. That is, one has $\|Df^n|_{E^s(x)}\| < 1$, $\|Df^n|_{E^u(x)}\| > 1$, and $\|Df^n|_{E^s(x)}\|$ [...] $0$ such that $d(x, \tilde f(x)) < K$ for every $x \in \tilde M$.
(L2) $\tilde f$ commutes with every deck transformation (which we identify with $\pi_1(M) \subset \mathrm{Isom}(\tilde M)$).

Remark 1. Such a lift can always be obtained by lifting a homotopy from the identity to $f$. Notice however that the choice of $\tilde f$ might not be unique (this will be important for Seifert manifolds). Whenever we write $\tilde f$ we will be assuming that $\tilde f$ is a lift that verifies both properties. This lift is fixed throughout.

In this announcement, we will further assume that the foliations $W^{cs}$ and $W^{cu}$ are $f$-minimal. This means that $M$ is the only set that is closed, saturated by the foliation, $f$-invariant and non-empty. The difference with the usual notion of minimality of foliations is that we require the set to be $f$-invariant. This hypothesis simplifies several arguments but is not needed in certain cases (for instance when the manifold is Seifert or hyperbolic, although it requires some additional non-trivial arguments). Notice that this hypothesis is always verified in many cases that are important from a dynamical standpoint, e.g. when $f$ is transitive or volume preserving.

Our main goal is to show that every leaf of both foliations $\tilde W^{cs}$ and $\tilde W^{cu}$ is fixed by $\tilde f$ and the same holds for the connected components of their intersections (i.e. the leaves of $\tilde W^{c}$). Once this is obtained, it is not difficult to show that $f$ should be leaf conjugate to a (topological) Anosov flow (very similar arguments already appear in [BW]).

Notice that invariance of $\tilde W^{\sigma}$ means that if $L$ is a leaf of $\tilde W^{\sigma}$ then so is $\tilde f(L)$. Showing that leaves are fixed means that for every $L \in \tilde W^{\sigma}$ one has that $\tilde f(L) = L$.

2.1 Dichotomy for foliations

A foliation $\mathcal{F}$ on $M$ is said to be R-covered if the leaf space $\tilde M / \tilde{\mathcal{F}}$ of the lifted foliation in the universal cover is homeomorphic to $\mathbb{R}$. In general if $\mathcal{F}$ is Reebless (for example if it does not have compact leaves), then $\tilde M / \tilde{\mathcal{F}}$ is a simply connected one dimensional manifold, but usually it is not Hausdorff [Nov, Ba3]. The foliation $\mathcal{F}$ is called uniform if every pair of leaves of $\tilde{\mathcal{F}}$ in $\tilde M$ is a bounded Hausdorff distance apart [Th, Ca1, Fen4]. The bound obviously depends on the particular pair. Assuming that the foliations $W^{cs}$ and $W^{cu}$ are $f$-minimal in $M$ we first show that:

Proposition 2.1 Either every leaf of $\tilde W^{cs}$ is fixed by $\tilde f$ or the foliation $W^{cs}$ is R-covered and uniform, and $\tilde f$ acts as a translation on the leaf space of $\tilde W^{cs}$. The same dichotomy holds for $W^{cu}$.


Proof (Sketch). Since the 2-dimensional foliations do not have compact leaves, they are taut. In particular both connected components of the complement of a leaf $L$ in $\tilde M$ contain arbitrarily large balls. This implies that the image of a leaf must be nested with itself, i.e. for a fixed transverse orientation, the positive half-space determined by one leaf contains the positive half-space determined by the other. This way, if a leaf $L$ is not fixed by $\tilde f$, one can consider the set $V \subset \tilde M$ defined by
\[ V := \bigcup_{n} \tilde f^{\,n}(L \cup U), \]
where $U$ is the region 'between' $L$ and $\tilde f(L)$. The set $V$ can be shown to be open and $\tilde f$-invariant. Using that $\tilde f$ commutes with deck transformations and that the image of a leaf is nested with itself one can show that the boundary leaves of $V$ are also invariant under $\tilde f$ and therefore the set $V$ verifies that for every deck transformation $\gamma \in \pi_1(M)$ one has that either $\gamma V = V$ or $\gamma V \cap V = \emptyset$. By $f$-minimality we obtain that $V = \tilde M$ and this implies the second possibility. Additional work is needed to show that $W^{cs}$ is R-covered.

Given this proposition there are three possibilities:
1. $\tilde f$ fixes every leaf of both $\tilde W^{cs}$ and $\tilde W^{cu}$, referred to as the doubly invariant case;
2. $\tilde f$ fixes no leaves of either foliation, henceforth called the double translation case; and
3. $\tilde f$ fixes every leaf of one of the foliations, but no leaf of the other foliation, henceforth called the mixed case.
Our goal is to rule out 2. and 3.

2.2 No mixed behavior

We can show (this will be expanded upon later):

Proposition 2.2 If $M$ is hyperbolic or Seifert and $\tilde f$ fixes a leaf of $\tilde W^{cs}$ then it fixes every leaf of both $\tilde W^{cs}$ and $\tilde W^{cu}$.

2.3 Double translation

In order to be leaf conjugate to the time one map of a (topological) Anosov flow one needs to exclude the possibility that either of the foliations $\tilde W^{cs}$ or $\tilde W^{cu}$ is translated by $\tilde f$. The proof of this is very different in the Seifert and the hyperbolic case (and we do not know how to make it work for more general manifolds). In the hyperbolic case it depends crucially on dynamical coherence.


In the Seifert case we can (after considering a finite iterate) choose a different lift such that all the leaves of one of the foliations in $\tilde M$ are fixed by that new lift $\tilde f$. This is enough to exclude this possibility (for this specific lift).

In the hyperbolic manifold case the proof is much more involved and uses the existence of a transverse pseudo-Anosov flow to the R-covered foliation (either $W^{cs}$ or $W^{cu}$). This forces a particular dynamics on periodic center leaves. Using both translations it is possible to find a contradiction (see Proposition 5.2 and Proposition 5.3).

2.4 Double invariance

Once we know that both foliations are fixed by $\tilde f$, the next step is to show that connected components of the intersections between leaves of $\tilde W^{cs}$ and $\tilde W^{cu}$ (i.e. center leaves, the leaves of $\tilde W^{c}$) are also fixed by $\tilde f$. In turn, after some more or less standard considerations (see also [BW]), this yields the desired statement that $f$ is leaf conjugate to the time one map of a topological Anosov flow in Theorems 1.1 and 1.2.

The key point of this stage is to show that the set of fixed leaves of $\tilde W^{c}$ is open and closed in $\tilde M$. From this, if the set of fixed centers was to be empty, we can apply Proposition 3.4 below to obtain a contradiction. Showing that the set of fixed center leaves is open is not so complicated, but closedness is a bit more delicate.

2.5 Important property

As will be explained later, the proof that the mixed case or the double translation case cannot happen under certain situations is achieved as follows: we analyze the structure forced by the hypothesis of the mixed or double translation situation and we prove that, in a center leaf that is periodic under $f$, both rays have to be (say) contracting, and at the same time one of the rays has to be expanding. So the analysis of the action of $f$ on rays of periodic center leaves is crucial to our strategy.

3 A key general proposition

In this section we analyze the case where one assumes that one of the foliations, (say) $\widetilde{W}^{cs}$, is leafwise fixed by $\tilde f$. A symmetric analysis holds for $\widetilde{W}^{cu}$.

3.1 Consequences of fixed center-stable leaves

The first relatively simple but powerful consequence of having cs-leaves fixed was already noted in [BW] (see also [HP2]):

Lemma 3.1 The lift $\tilde f$ has no fixed (or periodic) points.

This is fairly simple. Suppose that $x$ in a leaf $L$ of $\widetilde{W}^{cs}$ is fixed by $\tilde f$ and consider the unstable leaf $u(x)$ of $x$. The intersection of an unstable (one dimensional) leaf in $\widetilde{M}$ with a center stable (two dimensional) leaf is at most a single point [Nov]. Since both $u(x)$ and any center stable leaf are fixed by $\tilde f$, every point in $u(x)$ is fixed by $\tilde f$. This contradicts the fact that iteration by $\tilde f$ pushes points in $u(x)$ 'away' from $x$. It follows that no fixed or periodic points of $\tilde f$ can exist.

Using this, and the fact that $\tilde f$ contracts the one dimensional stable leaves, one deduces that the action of $\tilde f$ on the space of stable leaves in a fixed leaf of $\widetilde{W}^{cs}$ is free (i.e. it has no fixed points). Similarly, since the stable foliation (in $M$) is by lines (it contains no circles) one also knows that the action of every deck transformation in the space of stable leaves is also free. Putting this together with the fact that $\tilde f$ commutes with deck transformations and using the theory of axes for actions on non-Hausdorff, simply connected one manifolds (see e.g. [Ba3, Fen2, Fen5]), one deduces the following very important property:

Proposition 3.2 Every leaf of $W^{cs}$ is a cylinder, a plane, or a Möbius band.

Note also that by a result of Rosenberg [Ros] not every leaf (in $M$) can be a plane. Hence, by $f$-minimality, we get that cylinder and Möbius leaves are dense in $M$.
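The argument for Lemma 3.1 can be written out as a short chain of equalities. What follows is only a sketch under the assumptions stated in the text (every center stable leaf is fixed by $\tilde f$, and an unstable leaf meets a center stable leaf in at most one point); the notation $L_y$ for the center stable leaf through $y$ and $d_u$ for the distance along unstable leaves is introduced here for illustration.

```latex
% Suppose \tilde f(x) = x. Then \tilde f(u(x)) = u(\tilde f(x)) = u(x).
% For y \in u(x), let L_y be the leaf of \widetilde{W}^{cs} through y, so u(x) \cap L_y = \{y\}.
\[
  \tilde f(y) \in \tilde f\bigl(u(x)\bigr) \cap \tilde f(L_y) = u(x) \cap L_y = \{y\}
  \quad\Longrightarrow\quad \tilde f(y) = y \ \text{ for all } y \in u(x).
\]
% Unstable leaves are expanded: for y \neq x in u(x),
\[
  d_u\bigl(\tilde f^{\,n}(x), \tilde f^{\,n}(y)\bigr) \xrightarrow[n \to \infty]{} \infty ,
\]
% whereas the left-hand side equals d_u(x, y) for every n if both points are fixed: a contradiction.
% For a periodic point, apply the same argument to an iterate of \tilde f, which also fixes every leaf.
```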

3.2 Gromov hyperbolic leaves

For foliations by surfaces on 3-manifolds one has the following important result:

Theorem 3.3 ([Sul, Gro]) Let $\mathcal{F}$ be a codimension one foliation with no compact leaves on a closed 3-manifold $M$. Then, either there exists a transverse invariant measure, or the leaves of $\mathcal{F}$ are Gromov hyperbolic.

Sullivan [Sul] proved that leaves satisfy a linear isoperimetric inequality. Later, Gromov [Gro, section 6.8] proved that this implies Gromov hyperbolicity of the leaves. This result also follows from Candel's uniformization theorem [Can].

In our setting (either $M$ is hyperbolic or Seifert or the foliations are $f$-minimal), using partial hyperbolicity, we can show that the foliation cannot admit a transverse invariant measure (in the case of all $\widetilde{W}^{cs}$ leaves fixed by $\tilde f$).

3.3 Coarse contraction and a key proposition

We start by defining a property that will be helpful in our study. We say that a center leaf $c \in W^{c}$ is coarsely contracting if:

• it is homeomorphic to $\mathbb{R}$ (i.e. not a circle),
• it is periodic by $f$ (i.e. there exists $k$ such that $f^k(c) = c$),
• it has a bounded interval $I$ containing all the fixed points of $f^k$,
• every point $y \in c \setminus I$ converges to $I$ under forward iteration of $f^k$.
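As a purely illustrative toy model (not taken from the text), the four conditions can be checked by hand for an explicit map of the line. Here $g$ stands in for $f^k$ restricted to a center leaf $c$ identified with $\mathbb{R}$; the map $g$ and the interval $I = [-1, 1]$ are assumptions chosen only for the illustration, and $g$ is merely a piecewise linear homeomorphism rather than the restriction of a diffeomorphism.

```latex
% Toy model: g plays the role of f^k restricted to a center leaf c \cong \mathbb{R}.
\[
  g(y) =
  \begin{cases}
    y & \text{if } |y| \le 1,\\[2pt]
    \tfrac{1}{2}\bigl(y + \operatorname{sgn}(y)\bigr) & \text{if } |y| > 1.
  \end{cases}
\]
% Fixed points: \operatorname{Fix}(g) = [-1, 1] = I, a bounded interval.
% For |y| > 1 one has |g(y)| - 1 = (|y| - 1)/2, hence
\[
  \operatorname{dist}\bigl(g^{n}(y), I\bigr) = 2^{-n}\,\operatorname{dist}(y, I)
  \xrightarrow[n \to \infty]{} 0 ,
\]
% so every point outside I converges to I under forward iteration of g.
```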

We say that a center leaf is coarsely expanding if it is coarsely contracting for $f^{-1}$. Here is the first result concerning dynamics on periodic center leaves. In this result we do not assume dynamical coherence.

Proposition 3.4 Let $f : M \to M$ be a partially hyperbolic diffeomorphism (not necessarily dynamically coherent) such that $\tilde f$ fixes every leaf of $\widetilde{W}^{cs}$ and does not fix any leaf of $\widetilde{W}^{c}$. Assume moreover that the foliations $W^{cs}$ and $W^{cu}$ are $f$-minimal. Then, every periodic center leaf $c$ of $f$ is coarsely contracting. Moreover, there is at least one coarsely contracting periodic center leaf.

To prove this Proposition we use some properties of deck transformations of $\widetilde{M}$ fixing a leaf of $\widetilde{W}^{cs}$. By Theorem 3.3, these leaves are Gromov hyperbolic, hence isometries restricted to the leaves are hyperbolic [Gro]. This hyperbolic behavior will be crucial in our analysis.

Addendum 3.5 Suppose the hypotheses of Proposition 3.4 are satisfied. Suppose moreover that $f$ is dynamically coherent. Then the dynamical behavior described in Proposition 3.4 is impossible. In other words, if $\tilde f$ fixes every leaf of $\widetilde{W}^{cs}$ then it has to fix a leaf of $\widetilde{W}^{c}$.

So far, we have to assume dynamical coherence to get the Addendum in this generality. We can easily prove this result without the assumption for Seifert manifolds, and, with a lot more work, for hyperbolic manifolds. It is not yet clear to us whether the assumption is really needed in the general case. The proof of the proposition above is quite involved but we can sum up the main idea as follows:

Proof (Sketch of the proof). Up to a finite cover and iterates we may assume that there are no Möbius band leaves in $W^{cs}$. Since $\tilde f$ has no periodic points, it follows that every periodic point of $f$ has to be in a cylinder leaf. Take a cylinder leaf and its lift $L$ to $\widetilde{M}$, which is stabilized by a deck transformation $\gamma$. Since the action of $\tilde f$ is free on the stable foliation in $L$, there is an axis for the action on the stable leaf space in $L$.

The first thing to notice is that a graph transform argument shows that a center leaf in $L$ cannot intersect both a leaf $s$ of $\widetilde{W}^{s}$ and a translate $\gamma^k s$ for some $k \neq 0$, as that would produce a fixed center leaf for $\tilde f$, contradicting the hypothesis.

Since every leaf of $\widetilde{W}^{cs}$ is fixed by $\tilde f$, one can show that $\tilde f$ is a bounded distance from the identity in the induced metric on $L$. This and the previous remark allow us to obtain a structure on the leaf $L$ where, essentially, $L$ is covered by bands of bounded width between a stable leaf $s$ and its translate by $\gamma$. Notice that since $L/\gamma$ is an annulus with a hyperbolic metric, the width of $L/\gamma$ itself goes to infinity as one escapes into the ends of $L/\gamma$. Moreover, every center leaf gets trapped in such a 'band' (i.e. the translates $\gamma^k s$ bound bands which fill up $L$).

There are an iterate $\ell > 0$ and $k \in \mathbb{Z} \setminus \{0\}$ such that $h = \gamma^k \circ \tilde f^{\ell}$ fixes each band (in particular, fixes $s$). Notice that $h$ is still a partially hyperbolic diffeomorphism and has a fixed point $x \in s$. From Candel's theorem, we can assume that $\gamma$ acts on $L$ as a hyperbolic isometry. Thus, we can show that there are points in $s$ which are mapped by $h$ arbitrarily far (i.e. for every $R > 0$ there are points $z \in s$ on both sides of $s \setminus \{x\}$ such that $d_L(h(z), z) > R$). This in turn yields that all points in between $s$ and $\gamma s$ are mapped in a trapping way and provides the coarse contraction on centers.

When $f$ is dynamically coherent, this behavior is impossible, and this provides the addendum (the behavior is very similar to that of the examples in [HHU] and a similar argument shows that this cannot happen).

4 Seifert Manifolds

In this section, let $f : M \to M$ be a dynamically coherent partially hyperbolic diffeomorphism with a lift $\tilde f$ as described before. We denote by $c$ a generator of the center of $\pi_1(M)$ which corresponds to the fibers of the Seifert fibration.

4.1 Horizontality

It was shown in [HaPS] that in the setting of this section both $W^{cs}$ and $W^{cu}$ are horizontal (i.e. leaves are uniformly transverse to the Seifert fibers after isotopy). This is relevant to show that both foliations must be minimal (and therefore one can apply Theorem 3.3).

4.2 Changing the lifts

Since the fundamental group of a Seifert manifold has a non-trivial center, a trick that we can use to simplify our analysis considerably is to choose our lift $\tilde f$ well:

Proposition 4.1 Let $\tilde f : \widetilde{M} \to \widetilde{M}$ be a lift of $f$ at bounded distance from the identity and commuting with deck transformations. Then, there exist $\ell > 0$, $k \in \mathbb{Z}$ such that $c^k \circ \tilde f^{\ell}$ is a lift of $f^{\ell}$ which is at bounded distance from the identity, commutes with deck transformations, and fixes a leaf of $\widetilde{W}^{cs}$.

4.3 Putting information together

Using Proposition 4.1 we can choose an iterate $f^k$ of $f$ which admits two lifts $\tilde f_1$ and $\tilde f_2$, one of which fixes all leaves of $\widetilde{W}^{cs}$ and the other fixes every leaf of $\widetilde{W}^{cu}$. (Notice that we could directly apply Addendum 3.5, but we prefer to explain this slightly longer argument, which is generalizable to the non dynamically coherent setting.)

Assuming that the lifts do not coincide (i.e. we are in the 'mixed behavior case') one gets a contradiction by applying Proposition 3.4 to both lifts. Indeed, the proposition implies that all periodic center leaves must be both coarsely contracted and coarsely expanded by $f^k$, and since the proposition also ensures the existence of at least one periodic center leaf, we get a contradiction. This gives a lift $\tilde f$ of $f^k$ which leafwise fixes every leaf of both $\widetilde{W}^{cs}$ and $\widetilde{W}^{cu}$.

Once this is obtained, we argue as in subsection 2.4: it is possible to show that the set of fixed center leaves is either everything or empty, and in the latter case one can again apply Proposition 3.4 to both foliations to get a contradiction. This shows that every center leaf is fixed by $\tilde f$ and this is enough to complete the proof of Theorem 1.1.

5 Hyperbolic Manifolds

In this section we explain the main tools that need to be added to work out the case of hyperbolic 3-manifolds and $f$ dynamically coherent (Theorem 1.2). Some of the arguments can be carried out in more generality (e.g. without assuming dynamical coherence) but others use dynamical coherence in a crucial way, as we will explain below.

5.1 Uniform foliations and transverse pseudo-Anosov flows

Following [Th] (see also [Ca2, Fen4]) we say that a foliation $\mathcal{F}$ of a 3-manifold $M$ is $\mathbb{R}$-covered and uniform if the following two properties hold:

• The leaf space $\mathcal{L} := \widetilde{M}/\widetilde{\mathcal{F}}$ is homeomorphic to $\mathbb{R}$ and,
• for every pair of leaves $L, L' \in \widetilde{\mathcal{F}}$, there exists $K > 0$ such that the Hausdorff distance between $L$ and $L'$ is less than $K$.

When $M$ is obtained as the suspension of a pseudo-Anosov diffeomorphism $\varphi$ of a surface $S$ (i.e. $M = S \times [0,1]/_{(x,0)\sim(\varphi(x),1)}$) it is clear that the foliation by fibers $S \times \{t\}$ is $\mathbb{R}$-covered and uniform, and admits a transverse pseudo-Anosov flow. This is an instance of a much more general result dealing with general uniform foliations in hyperbolic 3-manifolds:

Theorem 5.1 (Thurston, Calegari, Fenley [Th, Ca1, Fen4]) If $\mathcal{F}$ is a transversely orientable, $\mathbb{R}$-covered and uniform foliation in a hyperbolic 3-manifold $M$ then it admits a regulating transverse pseudo-Anosov flow $\Phi$.

By regulating we mean that every orbit of $\widetilde{\Phi}$ in $\widetilde{M}$ intersects every leaf of $\widetilde{\mathcal{F}}$. Being transverse just says that orbits of $\Phi$ are transverse to $\mathcal{F}$.

In our proof, we use this result in an essential way to eliminate the double translation case in hyperbolic 3-manifolds. We do it by comparing the dynamics of $f$ with that of the pseudo-Anosov flow $\Phi$.

5.2 Forcing a particular type of dynamics on periodic center leaves

Proposition 5.2 Let $f : M \to M$ be a dynamically coherent partially hyperbolic diffeomorphism of a hyperbolic manifold $M$ such that $\tilde f$ acts as a translation on $\widetilde{W}^{cs}$. Then, there is a periodic center leaf which is coarsely expanding.

Proof (Sketch of the proof of Proposition 5.2). Recall that, according to our dichotomy result (Proposition 2.1), if $\tilde f$ acts as a translation on $\widetilde{W}^{cs}$ then $W^{cs}$ is $\mathbb{R}$-covered and uniform. Let $\Phi^{cs}$ be the regulating transverse pseudo-Anosov flow given by Theorem 5.1. Consider $\gamma$ a periodic orbit of $\Phi^{cs}$ and write $\gamma$ again for an associated deck transformation.

The first step in the proof consists in showing that there exist $\ell > 0$ and $k \in \mathbb{Z} \setminus \{0\}$ such that $h = \gamma^k \circ \tilde f^{\ell}$ fixes a leaf $L \in \widetilde{W}^{cs}$. This is shown using that $\tilde f$ is a bounded distance from $\widetilde{\Phi}^{cs}$, understood in the following way: flowing along $\widetilde{\Phi}^{cs}$ defines a homeomorphism between the leaf $L$ and $\tilde f(L)$, and this homeomorphism is a bounded distance from the map $\tilde f|_L$ from $L$ to $\tilde f(L)$. After some involved arguments we obtain a compact $\tilde f/\gamma$-invariant subset in $\widetilde{M}/\gamma$. Then, using recurrence and partial hyperbolicity, we get the desired periodic center stable leaf.

Once this is obtained, we use the Lefschetz fixed point theorem to compare the indices of fixed points of $h$ in $L$ with the corresponding first return map of the flow $\widetilde{\Phi}^{cs}$. This forces the existence of at least one fixed center leaf whose index is negative, which produces the desired coarsely expanding leaf (the transverse behavior is contracting because $L$ is a center-stable leaf).

5.3 Obstructions to dynamical coherence, no double translation

Proposition 5.3 Let $f : M \to M$ be a dynamically coherent partially hyperbolic diffeomorphism of a hyperbolic 3-manifold $M$ such that $\tilde f$ acts as a translation on $\widetilde{W}^{cs}$. Then $f$ cannot have any coarsely contracting center leaf.

If $f$ is dynamically coherent, then putting Proposition 5.2 applied to $\widetilde{W}^{cu}$ together with Proposition 5.3 applied to $\widetilde{W}^{cs}$ yields that the double translation case cannot happen. Unfortunately, our proof of this result uses dynamical coherence in a crucial way, as we will see in the sketch below.

Proof (Sketch of the proof of Proposition 5.3). Let $L$ be a center stable leaf of $\widetilde{W}^{cs}$ fixed by $h = \gamma \circ \tilde f$ for some deck transformation $\gamma$ (in the terminology of the previous proposition, $\gamma$ stands for $\gamma^k$). Assume that $L$ contains a coarsely contracting fixed center leaf $c$.

This proof requires a finer study of the dynamics forced by $\widetilde{\Phi}^{cs}$ on $L$. We separate this study into two cases, determined by whether or not $\gamma$ corresponds to a periodic orbit of $\Phi^{cs}$. Both are very similar, so we only sketch the case where $\gamma$ does correspond to a periodic orbit.

In this case, we start by showing that the contracting rays of the center leaf $c$ must accumulate (in $\partial_\infty L$, the boundary at infinity of the leaf $L$) on points which are repelling for the action of $\tau_{cs}$ on the boundary, where $\tau_{cs} : L \to L$ is the map obtained by composing the holonomy along $\widetilde{\Phi}^{cs}$-orbits from $L$ to $\gamma^{-1} L$ with $\gamma$. Similarly, any stable manifold that is periodic under $h$ also accumulates only on repelling points of $\tau_{cs}$ on $\partial_\infty L$. Notice that $h$ and $\tau_{cs}$ are homeomorphisms of $L$ a bounded distance from each other, so they induce the same action on $\partial_\infty L$.

An index counting argument shows that it is impossible to compensate the (positive) index contributed by the coarsely contracting center leaf with other coarsely expanding centers (because there are only finitely many contracting points at infinity) unless some center leaves merge, contradicting dynamical coherence.
Notice that this type of merging for non-dynamically coherent diffeomorphisms actually appears in examples, as in [BGHP, Section 5].

5.4 No mixed behavior

The impossibility of having mixed behavior is proven, for dynamically coherent diffeomorphisms, by Addendum 3.5. As we previously mentioned, we are also able to eliminate mixed behavior on hyperbolic manifolds even in the non-dynamically coherent case. However, our argument is very specific to hyperbolic manifolds (since we use the existence of a transverse pseudo-Anosov flow) and considerably more delicate than in the dynamically coherent case. At best, this argument could be extended to manifolds with at least one atoroidal piece (see below).

5.5 Doubly invariant case

This follows exactly as in the Seifert case (cf. subsection 4.3).

6 Extensions and limits of our arguments

6.1 Beyond dynamical coherence

When $f$ is not necessarily dynamically coherent, one can use, instead of foliations, the branching foliations introduced in [BI] (see also [HP3, Section 4]). After a substantial amount of preparation, and suitable reinterpretation of objects (like leaf spaces), most of the arguments of the dynamically coherent situation extend to the non dynamically coherent setting. Some properties require different and involved arguments, notably showing that in the hyperbolic and Seifert case the branching foliations are $f$-minimal, and we sometimes need additional hypotheses.

One step that we are so far unable to complete is to remove dynamical coherence from the assumptions in Proposition 5.3, that is, when $M$ is hyperbolic. Indeed, the type of configuration that we obtain using our arguments turns out to be very similar to what actually happens in the non-dynamically coherent examples constructed in [BGHP] in some Seifert manifolds. Therefore, it is unclear whether this situation in hyperbolic manifolds can really be ruled out.

Notice that once we can prove double invariance, dynamical coherence follows after the fact: once we have shown that all branching leaves, as well as the connected components of their intersections, are fixed, we can deduce that the branching foliations are true foliations, i.e. the partially hyperbolic diffeomorphism is dynamically coherent. Since all our arguments can be extended to the non dynamically coherent case when $M$ is Seifert fibered, we obtain Theorem 1.1.

6.2 Double translation in hyperbolic manifolds

One particularly obvious gap so far is our inability to either prove or disprove the existence of a double translation example in hyperbolic manifolds (such examples would necessarily be non dynamically coherent). We previously explained why our method has failed so far, but to understand the intricacy of this problem, the reader can meditate on the following example.

Let $\phi^t$ be a (smooth) Anosov flow on a hyperbolic manifold $M$, such that its (say) weak stable (2-dimensional) foliation is $\mathbb{R}$-covered (such flows are called $\mathbb{R}$-covered Anosov flows). Since the flow is $\mathbb{R}$-covered, there exists a map $\eta : M \to M$ that conjugates $\phi^t$ with its inverse $\phi^{-t}$ (see, for instance, [BartFe2, Proposition 2.4] for a description of $\eta$). In particular, $\eta$ preserves the weak stable and weak unstable foliations of $\phi^t$, and acts as a translation on both leaf spaces.

If $\eta$ was $C^1$, then it would be easy to show that, for a time $T_0$ big enough, the map $f = \phi^{T_0} \circ \eta^2$ would be a partially hyperbolic diffeomorphism. Moreover, one can easily deduce from known facts about Anosov flows in dimension 3 that $f$ could not be leaf conjugate to a time-1 map of an Anosov flow. This is essentially because the only Anosov flows that can be transverse to an $\mathbb{R}$-covered foliation are orbit equivalent to suspensions. Since $f$ preserves the weak stable and weak unstable foliations of $\phi^t$, these foliations must be the center stable and center unstable foliations. In particular, $f$ would be dynamically coherent and act as a double translation, in contradiction with Theorem 1.2. It follows that $\eta$ cannot be $C^1$ (actually, Barbot [Ba4, Proposition 6.6] proved that, in general, the map $\eta$ is $C^1$ if and only if the flow $\phi^t$ is a lift of a geodesic flow).

A natural question is: does there exist a $C^1$-map $h$, $C^0$-close to $\eta$, such that $h$ sends the strong unstable (resp. stable) leaves of $\phi^t$ to curves transverse to the weak stable (resp. unstable) foliation of $\phi^t$? If the answer is yes, then, for big enough $T_0$, the map $\phi^{T_0} \circ h$ will be a partially hyperbolic diffeomorphism (see e.g. [BGHP, Section 2]), homotopic to the identity, and probably not the time-1 map of an Anosov flow, so acting as a double translation (hence not dynamically coherent).
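One elementary computation attached to this example can be made explicit. It is only a sketch, and it assumes that "conjugates" is read literally as $\eta \circ \phi^t = \phi^{-t} \circ \eta$ for every $t$ (this reading is an interpretation of the sentence above, not a statement taken from [BartFe2]). Under that assumption the square $\eta^2$ commutes with the flow, so the hypothetical map $f = \phi^{T_0} \circ \eta^2$ does too:

```latex
% Assumption (interpretation of the text): \eta \circ \phi^t = \phi^{-t} \circ \eta for all t.
\[
  \eta^2 \circ \phi^t = \eta \circ \phi^{-t} \circ \eta = \phi^{t} \circ \eta^2
  \qquad \text{for all } t \in \mathbb{R}.
\]
% Hence f = \phi^{T_0} \circ \eta^2 commutes with every \phi^t, and its iterates are
\[
  f^{n} = \phi^{nT_0} \circ \eta^{2n} \qquad (n \in \mathbb{Z}).
\]
```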

6.3 More general manifolds

Seifert and hyperbolic 3-manifolds are a large part of the family of irreducible 3-manifolds (which are the only ones that can admit partially hyperbolic diffeomorphisms [Nov, BI]). However, the case of general irreducible 3-manifolds, even under the assumption of being homotopic to the identity, still requires further work (though, as we have mentioned, our results also provide some progress in this general case).

Several arguments, starting with the dichotomy, require $f$-minimality of the (branching) foliations (which can be obtained in the contexts of Theorems 1.1 and 1.2). Even assuming minimality, some arguments do not carry over directly in general. Notably, we do not know how to rule out the double translation case in general.

We remark, though, that it is reasonable to expect an analogue of Theorem 5.1 in the context of manifolds whose JSJ decomposition contains at least one atoroidal piece. Indeed, it is believed (see, for instance, [Ca1, Remark 5.3.17]) that a regulating flow that behaves like a pseudo-Anosov flow inside the atoroidal piece does exist, extending Theorem 5.1 to that setting. This, together with additional work, might be enough to extend Propositions 5.2 and 5.3 to this setting.

The remaining case is when $M$ does not have atoroidal pieces, but $M$ is not Seifert, that is, when $M$ is a graph manifold. In that case there is no "pseudo-Anosov"-like flow in any piece, nor does the fundamental group admit a center, which prevents us from using our Seifert trick. Hence, to analyze this case, one will need new ideas.

Acknowledgements

S. Fenley was partially supported by a grant from the Simons foundation. S. Frankel was supported by the National Science Foundation under Grant No. DMS-1611768. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. R. Potrie was partially supported by CSIC 618, FCVF-2017-111 and FCE-1-2017-1-135352. He also acknowledges the hospitality of Yale University and Laboratoire Math. d'Orsay (funded by CNRS and IFUM) during part of the preparation of this work.

References

[Ba1] T. Barbot, Caractérisation des flots d'Anosov en dimension 3 par leurs feuilletages faibles. Ergodic Theory Dynam. Systems 15 (1995), no. 2, 247–270.
[Ba2] T. Barbot, Flots d'Anosov sur les variétés graphées au sens de Waldhausen. Ann. Inst. Fourier (Grenoble) 46 (1996), no. 5, 1451–1517.
[Ba3] T. Barbot, Actions de groupes sur les 1-variétés non séparées et feuilletages de codimension un. Ann. Fac. Sci. Toulouse Math. (6) 7 (1998), no. 4, 559–597.
[Ba4] T. Barbot, De l'hyperbolique au globalement hyperbolique. Habilitation à Diriger des Recherches. Université Claude Bernard - Lyon I, 2005.
[BaFe1] T. Barbot, S. Fenley, Pseudo-Anosov flows in toroidal manifolds. Geom. Topol. 17 (2013), 1877–1954.
[BaFe2] T. Barbot, S. Fenley, Free Seifert pieces of Pseudo-Anosov flows, Preprint arXiv:1512.06341.
[BartFe1] T. Barthelmé, S. Fenley, Knot theory of R-covered Anosov flows: homotopy versus isotopy of closed orbits. J. Topol. 7 (2014), no. 3, 677–696.
[BartFe2] T. Barthelmé, S. Fenley, Counting periodic orbits of Anosov flows in free homotopy classes. Comm. Math. Helv. 92 (2017), no. 4, 641–714.
[BDV] C. Bonatti, L. Diaz, M. Viana, Dynamics beyond uniform hyperbolicity. A global geometric and probabilistic perspective. Encyclopaedia of Mathematical Sciences, 102. Mathematical Physics, III. Springer-Verlag, Berlin, 2005. xviii+384 pp. ISBN: 3-540-22066-6
[Nov] S.P. Novikov, Topology of foliations, Trans. Moscow Math. Soc. 14 (1963), 268–305.
[BPP] C. Bonatti, K. Parwani, R. Potrie, Anomalous partially hyperbolic diffeomorphisms I: dynamically coherent examples, Ann. Sci. Éc. Norm. Supér. (4) 49 (2016), no. 6, 1387–1402.
[BGP] C. Bonatti, A. Gogolev, R. Potrie, Anomalous partially hyperbolic diffeomorphisms II: stably ergodic examples. Invent. Math. 206 (2016), no. 3, 801–836.
[BGHP] C. Bonatti, A. Gogolev, A. Hammerlindl, R. Potrie, Anomalous partially hyperbolic diffeomorphisms III: abundance and incoherence, Preprint arXiv:1706.04962.
[BW] C. Bonatti and A. Wilkinson, Transitive partially hyperbolic diffeomorphisms on 3-manifolds, Topology 44 (2005), no. 3, 475–508.
[BZ] C. Bonatti and J. Zhang, Transverse foliations on the torus $T^2$ and partially hyperbolic diffeomorphisms on 3-manifolds, Comm. Math. Helv. 92 (2017), no. 3, 513–550.
[Bru] M. Brunella, Expansive flows on Seifert manifolds and on torus bundles. Bol. Soc. Brasil. Mat. (N.S.) 24 (1993), no. 1, 89–104.
[BI] D. Burago, S. Ivanov, Partially hyperbolic diffeomorphisms of 3-manifolds with abelian fundamental groups. J. Mod. Dyn. 2 (2008), no. 4, 541–580.
[Ca1] D. Calegari, The geometry of R-covered foliations. Geom. Topol. 4 (2000), 457–515.
[Ca2] D. Calegari, Foliations and the geometry of 3-manifolds. Oxford Mathematical Monographs. Oxford University Press, Oxford, (2007). xiv+363 pp. ISBN: 978-0-19-857008-0
[Can] A. Candel, Uniformization of surface laminations, Ann. Sci. École Norm. Sup. 26 (1993), 489–516.
[CHHU] P. Carrasco, F. Rodriguez Hertz, J. Rodriguez Hertz, R. Ures, Partially hyperbolic dynamics in dimension 3, Ergodic Theory Dynam. Systems 38 (2018), no. 8, 2801–2837.
[CrP] S. Crovisier, R. Potrie, Introduction to partially hyperbolic dynamics, Lecture notes for a minicourse at ICTP, available on the webpages of the authors.
[DPU] L. J. Diaz, E. R. Pujals, R. Ures, Partial hyperbolicity and robust transitivity, Acta Math. 183 (1999), no. 1, 1–43.
[Fen1] S. Fenley, Anosov flows in 3-manifolds. Ann. of Math. (2) 139 (1994), no. 1, 79–115.
[Fen2] S. Fenley, Limit sets of foliations in hyperbolic 3-manifolds. Topology 37 (1998), no. 4, 875–894.
[Fen3] S. Fenley, The structure of branching in Anosov flows of 3-manifolds, Comm. Math. Helv. 73 (1998), 259–297.
[Fen4] S. Fenley, Foliations, topology and geometry of 3-manifolds: R-covered foliations and transverse pseudo-Anosov flows. Comment. Math. Helv. 77 (2002), no. 3, 415–490.
[Fen5] S. Fenley, Pseudo-Anosov flows and incompressible tori, Geom. Ded. 99 (2003), 61–102.
[Fra] S. Frankel, Coarse hyperbolicity and closed orbits for quasigeodesic flows, Ann. of Math. (2) 188 (2018), no. 1, 1–48.
[Gh] E. Ghys, Flots d'Anosov sur les 3-variétés fibrées en cercles. Ergodic Theory Dynam. Systems 4 (1984), no. 1, 67–80.
[Gro] M. Gromov, Hyperbolic groups, in Essays in Group Theory, Springer-Verlag, 1987, pp. 75–263.
[HP1] A. Hammerlindl, R. Potrie, Pointwise partial hyperbolicity in three-dimensional nilmanifolds. J. Lond. Math. Soc. (2) 89 (2014), no. 3, 853–875.
[HP2] A. Hammerlindl, R. Potrie, Classification of partially hyperbolic diffeomorphisms in 3-manifolds with solvable fundamental group. J. Topol. 8 (2015), no. 3, 842–870.
[HP3] A. Hammerlindl, R. Potrie, Partial hyperbolicity and classification: a survey, Ergodic Theory Dynam. Systems 38 (2018), no. 2, 401–443.
[HaPS] A. Hammerlindl, R. Potrie, M. Shannon, Seifert manifolds admitting partially hyperbolic diffeomorphisms, J. Mod. Dyn. 12 (2018), 193–222.
[HHU] F. Rodriguez Hertz, J. Rodriguez Hertz, R. Ures, A non-dynamically coherent example on $T^3$. Ann. Inst. H. Poincaré Anal. Non Linéaire 33 (2016), no. 4, 1023–1032.
[Par] K. Parwani, On 3-manifolds that support partially hyperbolic diffeomorphisms. Nonlinearity 23 (2010), no. 3, 589–606.
[Pot] R. Potrie, Robust dynamics, invariant geometric structures and topological classification, Proceedings of the International Congress of Mathematicians, Volume 2 (2018), 2057–2080.
[Ros] H. Rosenberg, Foliations by planes, Topology 7 (1968), 131–138.
[Sul] D. Sullivan, Cycles for the dynamical study of foliated manifolds and complex manifolds, Invent. Math. 36 (1976), 225–255.
[Th] W. Thurston, Three-manifolds, Foliations and Circles, I, Preprint arXiv:math/9712268.
[Ur] R. Ures, personal communication.
[Wi] A. Wilkinson, Conservative partially hyperbolic dynamics. Proceedings of the International Congress of Mathematicians. Volume III, 1816–1836, Hindustan Book Agency, New Delhi, 2010.

Some remarks on projective Anosov flows in hyperbolic 3-manifolds

Christian Bonatti, Jonathan Bowden, and Rafael Potrie

Abstract We explore some constructions of projectively Anosov flows on hyperbolic 3-manifolds that may lead to new ways to construct pairs of transverse taut contact forms and foliations.

Christian Bonatti
CNRS - IMB. UMR 5584. Université de Bourgogne, 21004 Dijon, France

Jonathan Bowden
School of Mathematics, Monash University, 9 Rainforest Walk, VIC 3800 Australia

Rafael Potrie
CMAT, Facultad de Ciencias, Universidad de la República, Uruguay

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_24

1 Introduction

This (informal) note reports on some discussions that took place at the MATRIX program 'Dynamics, foliations and the geometry of 3-manifolds' during September 2018. We thought that some outcomes of the discussion could be relevant, and they seem to open several questions that we intend to pursue in the future. The discussions$^{1}$ revealed strong connections between the work of participants coming from very different fields, so we thought it could be a good idea to advertise them. We warn the reader that the results and examples presented here need to be expanded and revised carefully; we hope to do so in the near future.

$^{1}$ Other people also participated in some of the discussions; we thank in particular Sergio Fenley for his comments.

There is a strong link between pairs of negative and positive contact structures and projectively Anosov flows. This was first noticed by Mitsumatsu [Mit, Mit1] (see also [Asa] and references therein), who in particular used this, as well as the Eliashberg-Thurston [ET] approximation theorem, to show that any 3-manifold admits a projectively Anosov flow given by the vector field of intersection of the two contact planes. Being a contact structure is an open condition, and it follows from a result of Arroyo-Rodriguez Hertz ([ARH]) that generic projectively Anosov flows are hyperbolic (Axiom A with strong transversality). Recently, it has been shown that even $C^0$-foliations can be approximated by contact structures (Bowden, Kazez-Roberts [Bow, KR]).

It is natural to ask when these flows can be modified by isotopy to become Anosov flows. See e.g. the work of Gourmelon-Potrie ([GP]), where the same problem is considered in the case of projectively Anosov diffeomorphisms of surfaces. Of course, not every projectively Anosov flow can be deformed into an Anosov flow (e.g. the construction above gives projectively Anosov flows in the sphere $S^3$, which does not admit Anosov flows) but, for example, if the 3-manifold is hyperbolic, this could in principle be the case. In fact, Thurston asked [Thu] if a hyperbolic 3-manifold admitting three transverse taut foliations should admit an Anosov flow.

Taut foliations give rise (by approximation) to tight contact structures. When one has a pair of transverse positive and negative contact forms, there is a notion of tautness (due to Colin-Firmo [CF]) which is adapted to the setting we are in. Tautness should be an essential hypothesis on the positive and negative contact structures giving rise to projectively Anosov flows that can be deformed into Anosov flows.

In this note we provide some examples in the negative direction, showing that certain pairs of transverse taut foliations cannot be approximated by contact structures giving rise to projectively Anosov flows that can be deformed into Anosov (or pseudo-Anosov) flows within projectively Anosov flows. The key point is the use of an adaptation of the hyperbolic plugs introduced in the paper of Beguin-Bonatti-Yu [BBY] to construct such examples. We end the note by raising several questions and future directions.

2 Projectively Anosov plugs In this section we introduce projectively hyperbolic plugs a` l`a Beguin-Bonatti-Yu [BBY] and explain some of their basic properties in certain situations. We will then construct some solid torus plugs. These will be attracting plugs (i.e. the vector field points inward) and will have an associated foliation (it could have branching) in the boundary, associated to the weak stable direction. In the following sections we will glue them with some repelling plugs to obtain the examples we want to present. As explained in the introduction, we want to have plugs with Morse-Smale dynamics and for which the (branching) foliations have no Reeb-components. The specific ones we construct are adapted to the examples we will present here, but

Some remarks on projective Anosov flows in hyperbolic 3-manifolds


clearly, it makes sense both to construct more of them and to make a theoretical study of which such plugs are possible and of the obstructions to their existence (see footnote 2).

2.1 Definition of projectively hyperbolic plug

An important concept is that of a projectively hyperbolic Reebless plug, i.e. one for which the pA dynamics has no compact leaves in the branching foliation. We will work under the simplifying assumption that our plugs are either attracting or repelling. This immensely simplifies much of the technical work of [BBY], but one can of course expect to extend some of these definitions to more general settings. Similarly, we will make some assumptions on the dynamics in the maximal invariant subset of the plug which are convenient for our purposes, but of course one can imagine extending these definitions to other settings. An attracting projectively hyperbolic Reebless plug (V, X) will be a compact (not necessarily connected) 3-manifold V with boundary ∂V and a vector field X pointing inwards on ∂V so that:

• The maximal invariant set $\Lambda = \bigcap_{t>0} X_t(V)$ is projectively hyperbolic, i.e. the differential $DX_t$ preserves two continuous invariant two-dimensional bundles E and F defined on Λ so that for every vector v not in E the angle between $DX_t v$ and F decreases exponentially (see [Asa, ARH]).
• There are no compact boundaryless invariant surfaces in Λ.
• The dynamics in Λ is Axiom A.

Remark 1. In the case that the plug contains no attracting periodic orbits, the assumptions imply that the maximal invariant set is a lamination by C^1-surfaces.

If (V, X) is an attracting projectively hyperbolic Reebless plug, it provides extra information on how the E direction intersects the boundary ∂V of V. Notice that the bundles E and F are only defined on the maximal invariant set. However, using cone-fields, one sees that E extends in a unique way to the whole plug as an invariant bundle which contains the direction of the vector field X. In particular E is transverse to the boundary of the plug, and induces by intersection a (non-singular) 1-dimensional bundle on ∂V. As a consequence (if the manifold is oriented) each boundary component is a torus T^2.

What we want the plug to verify:

• The maximal invariant set is projectively hyperbolic and Axiom A. In particular, a projectively hyperbolic tight plug on the solid torus has Morse-Smale dynamics.
• Other than that, definitions are the same as in [BBY].

Footnote 2: An easy obstruction is that the foliation induced on the boundary of an attracting plug cannot be free of Reeb annuli, as this would allow a contractible loop transverse to the weak stable foliation, giving rise to a Reeb component in the solid torus.


Christian Bonatti, Jonathan Bowden, and Rafael Potrie

One should state a result analogous to [BBY], which in the attracting/repelling setting goes back to Franks and Williams:

Theorem 1. Let (V, X) be an attracting projectively hyperbolic plug with boundary components $\partial_{in} V = S_1 \cup \dots \cup S_k$ and (W, Y) a repelling projectively hyperbolic plug with boundary components $\partial_{out} W = \hat S_1 \cup \dots \cup \hat S_k$. Assume there are diffeomorphisms $\varphi_i : S_i \to \hat S_i$ so that the foliation induced on $S_i$ by X is mapped by $\varphi_i$ transverse to the foliation induced on $\hat S_i$ by Y. Then one can put a differentiable structure on $M = V \sqcup W / \{\varphi_i\}$ so that the flows of X and Y glue well and induce a projectively Anosov flow on M.

Proof (Idea of the proof). The fact that the gluing can be made into a smooth manifold and induces a flow is standard (see [BBY] for details). Also, in this setting, showing that the resulting flow is projectively Anosov is rather easy, as a cone-field criterion suffices (when the pieces are not attracting and repelling this becomes more subtle, see [BBY]).

2.2 A plug with six Reeb components Consider a vector field X in the disk D as in figure 1.

Fig. 1 A vector field in the disk (the figure labels regions by the sign of ϕ)

To get a vector field in the solid torus, take the product D × S^1 and consider the vector field

$$Y(x, y, \theta) = X(x, y) + \varphi \, \frac{\partial}{\partial \theta},$$

where ϕ is a smooth bump function, also as in figure 1. Transversally (in coordinates (y, θ) of the plane x = 0) the flow looks exactly as in figure 2. Since the contraction in the direction ∂/∂x is stronger than in the direction ∂/∂y, this flow is projectively hyperbolic.

Fig. 2 The flow in the x = 0 plane
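The role of the stronger contraction can be made explicit in a linear model. This is only a sketch under an assumption that is not in the text: near a sink the planar field is taken to be $X = -a\,x\,\partial/\partial x - b\,y\,\partial/\partial y$ with hypothetical constants $a > b > 0$, and the flow direction is ignored. In this model the x-axis plays the role of E and the y-axis the role of F.

```latex
\[
DX_t = \begin{pmatrix} e^{-at} & 0 \\ 0 & e^{-bt} \end{pmatrix},
\qquad
DX_t \begin{pmatrix} v_x \\ v_y \end{pmatrix}
  = \begin{pmatrix} e^{-at} v_x \\ e^{-bt} v_y \end{pmatrix},
\]
\[
\tan \angle\bigl( DX_t v ,\, \partial/\partial y \bigr)
  = \frac{|v_x|}{|v_y|}\, e^{-(a-b)t} \longrightarrow 0
  \quad (t \to \infty), \qquad \text{whenever } v_y \neq 0 .
\]
```

So every vector outside the x-axis is pushed exponentially fast towards the y-direction, at rate $a - b$, which is the kind of domination required in the definition of subsection 2.1.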

The maximal invariant set in the solid torus is a band with three periodic orbits: two attracting ones flowing in one direction and one saddle in the middle (flowing in the opposite direction). This produces two Reeb annuli of orbits that, when pushed along the weak stable direction, produce four Reeb annuli in the entering torus. The upper and lower attracting parts of the attracting sinks produce two more Reeb annuli for the weak stable foliation intersected with the entering torus. In total, the weak stable foliation intersected with the entering torus has six Reeb annuli, all oriented in the same direction.

Remark 2. This can be easily extended to create projectively hyperbolic plugs with 6 + 4n (with n ≥ 0) Reeb annuli in the entering torus oriented in the same direction. Other variations are also possible (see section 6 for further discussions).


2.3 A plug compatible with an incoherent repeller

Here we present a projectively hyperbolic Reebless plug on the solid torus that can be glued to the examples in Section 9 of [BBY] (see in particular Theorem 1.10 in [BBY] and the proof of Lemma 9.8 with its Figure 17). The notion of incoherent repeller (or attractor) was introduced by Christy [Ch] (see [BBY, Subsection 1.4.2]) and corresponds to a certain configuration induced in the boundary of a neighborhood of a hyperbolic repeller that forbids the existence of Birkhoff sections. The configuration is depicted in figure 3. For simplicity we just construct one specific example whose entry and exit tori look as in figure 3. Notice that the attracting solid torus with that entrance foliation can be made similarly to the zipped tori constructed in [BBY, Example 7.13].

Fig. 3 Solid lines correspond to the incoherent repeller; dashed lines represent the foliation in the attracting solid torus.

3 Filling a pseudo-Anosov flow

Here we show how to construct a projectively Anosov flow in a hyperbolic 3-manifold by making a DA-construction in a suspension pseudo-Anosov flow and including the plug constructed in subsection 2.2. First, notice that there exists a pseudo-Anosov homeomorphism of a genus two surface admitting a unique singular point, which by necessity needs to be a 6-prong saddle (see for example [FM], in particular figure 11.6 and the criteria in Theorem 14.4).


Gluing such a plug with the one constructed in subsection 2.2 according to the conditions of Theorem 1 gives the desired projectively Anosov flow.

Now we provide some arguments for the tautness of the foliations induced by the bundles of the projectively Anosov splitting (a different approach is presented in section 5). The arguments are not symmetric for the E and F directions, particularly because, by construction, the bundle E contains a uniformly contracting subbundle, and therefore it is uniquely integrable and its integral surfaces are planes or cylinders (depending on whether they contain a closed orbit or not). This implies that the foliation tangent to E is taut.

For F one needs to be a bit more careful. We will only show a weaker form of tautness, namely that through every surface tangent to F there is a closed transversal intersecting the surface, but certainly one should be able to push the arguments to get a stronger version (for the subtleties with the definitions, we refer the reader to [KR2]). First notice that F contains an expanding direction, and is therefore uniquely integrable everywhere except at the attracting points (cf. Figure 4) in the maximal invariant set of the solid torus attracting plug (which is an invariant annulus). This implies that if there is a surface tangent to F which is closed, it has to be a torus and intersect at least one of the two attracting periodic orbits (see footnote 3). Notice that in any case, no surface tangent to F can be completely contained in the solid torus, so it accumulates somewhere in the repeller. This implies that the surface has some recurrence, and this allows one to construct a closed transversal through it. This completes the sketch of the proof.

4 Filling incoherent repellers

This follows by gluing the plug in subsection 2.3 with an example given by Theorem 1.10 of [BBY] (see in particular Figure 17 in the proof of Lemma 9.8 of [BBY]). Similarly to the previous section, applying Theorem 1, one obtains a projectively Anosov flow. We now argue for the tautness of the bundles. Tautness of E is simpler, as again it contains a uniformly contracting subbundle. The F direction can also be handled similarly to the case of the previous example, using the fact that every surface tangent to F will accumulate in the repeller, which is an essential lamination.

It remains to show that this can be done in a hyperbolic 3-manifold. We sketch an argument showing that this should be possible (but details should be checked more carefully). The example is made with some atoroidal pieces and some Seifert pieces of the form pair of pants times S^1 so that the flow is 'horizontal'. As the attractor is transitive, there is a periodic orbit which 'fills' every Seifert piece. Now, doing Dehn-Goodman surgery along the periodic orbit carefully, one should be able to obtain a hyperbolic 3-manifold (in particular, most surgeries in the atoroidal parts will remain atoroidal and any Dehn surgery along a filling curve in the Seifert part will make the resulting manifold atoroidal; notice also that the periodic orbit should traverse all the tori of the JSJ decomposition and thus kill them after Dehn surgery, see Foulon-Hasselblatt [FH, Appendix] for similar arguments). Notice that Dehn-Goodman surgery does not affect the bundles much outside the closed orbit where it is performed, and therefore tautness remains unchanged by this procedure. The bundles need not be orientable for this construction, but this can be achieved by taking a finite cover (obviously, being a hyperbolic manifold is stable under taking finite covers).

Footnote 3: In this case, as the manifold is hyperbolic, this implies that it must bound a solid torus; therefore, there should be a Reeb component in the attracting plug, which does not have one by assumption.

Fig. 4 The F direction is only defined in the maximal invariant set before gluing. After gluing, it is shown in dotted lines how it may look in a transversal; in particular, it may have (and indeed has) merging points in the attracting periodic orbits.

5 A foliation/contact interpretation of the examples

The example given in section 3 by filling a DA blow-up of a suspension pseudo-Anosov flow has the interesting property that it gives a pair of transverse foliations that are without compact leaves, and are hence taut. Moreover, it gives a template for building these purely in terms of filling the blown-up stable and unstable laminations of the suspension flow by configurations of monkey saddles.


Blow-up of laminations

Blowing up the singular leaves of the stable/unstable laminations gives a pair of transverse laminations Λ^s, Λ^u. These correspond to the stable respectively unstable laminations obtained after doing attracting resp. repelling DA blow-ups at all singular orbits.

Filling by monkey saddles

The complements of the laminations Λ^s, Λ^u consist of ideal polygon bundles, where the number of sides of the polygon corresponds to the number of prongs of the corresponding singular orbit. These can be filled by monkey saddle foliations consisting of simply connected leaves, assuming that the number of sides is even. However, if this is done in the standard way the resulting foliation will not even have trivial Euler class and cannot be associated to any flow with dominated splitting. We assume that the number of sides is 6 as in the examples above. Then we add 3 cylindrical leaves whose ends connect different ends of a complementary bundle region. Taking the complement of these leaves, we obtain 4 complementary ideal polygon regions, two of which have 2 sides and two of which have 4 sides. Filling each of these with monkey saddles in such a way that each of the leaves is stable, in that it has attracting resp. repelling holonomy, we obtain the foliation from the example in section 3.
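As a quick consistency check on the side count, one can assume (this bookkeeping is ours and is not spelled out in the text) that each added cylindrical leaf appears as one side of each of the two regions adjacent to it. Three cuts of one region also give 3 + 1 = 4 regions, and the sides add up as follows.

```latex
\[
\underbrace{6}_{\text{sides of the original polygon}}
  \;+\; \underbrace{2 \cdot 3}_{\text{two sides per splitting leaf}}
  \;=\; 12 \;=\; 2 + 2 + 4 + 4 .
\]
```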

Pairs of complementary foliations

As the flow given in Section 3 has no repelling closed orbits and the dynamics are Axiom A, the stable invariant plane field E is actually integrable. The corresponding foliation F^u is obtained by the splitting and filling operation above applied to the lamination Λ^u, without closed leaves. It is not hard to see that there is a branching foliation (cf. [BI]) tangent to the unstable invariant plane field F that branches precisely at the two attracting orbits (recall figure 4). Pulling these leaves apart we thus obtain a complementary foliation, one of whose leaves corresponds to the invariant annulus described in Figure 2. One has an analogous construction of F^s and of its complementary foliation. Both of these foliations are without compact leaves due to the dynamics of the flow: any compact leaf would have to be a torus, and there are no invariant tori. These foliations can then be approximated by contact structures which are then transverse and universally tight (cf. [Bow], [KR]), giving a tight dominated splitting. It is not clear what this new flow has to do with the one given in Section 3; however, under some genericity assumption they ought to be semi-conjugate.


Fig. 5 Splitting up a complementary region. The splitting leaves are shown in green.

Remark 3. By stacking these regions on top of each other it is easy to do this when all prongs are of order 6 + 4n and all foliations are orientable. In this way we obtain orientable foliations without compact leaves that have trivial Euler class. This construction will work for any pseudo-Anosov flow satisfying the assumption on prongs.

6 Questions and future directions

Clearly, the examples presented here are just a sample of what can be done using these techniques and are far from exhaustive. In particular, some questions that should be addressed, in addition to completing several of the arguments sketched above, are the following:

• Is it possible to construct attracting plugs with Morse-Smale dynamics and without closed invariant surfaces so that the boundary behaviour is arbitrary (see footnote 4)? What are the possible obstructions? Can these be characterised?
• Can one obtain general criteria for gluing projectively hyperbolic Reebless plugs so that they remain Reebless? Taut?
• Is it possible to characterise those pairs of transverse contact structures that can be deformed into Anosov flows?
• In the partially hyperbolic setting there is some additional information on the contact structures. Is it true that if a hyperbolic 3-manifold admits a partially hyperbolic diffeomorphism, then it admits an Anosov flow? See [BFFP] for some progress in this direction.

Footnote 4: As explained above, a simple obstruction is that the foliation in the boundary has to have at least one Reeb annulus, but in principle we do not know of other obstructions.


Acknowledgements R.P. was partially supported by CSIC 618, FCVF-2017-111 and FCE-12017-1-135352

References

[ARH] A. Arroyo and F. Rodriguez Hertz, Homoclinic bifurcations and uniform hyperbolicity for three-dimensional flows. Ann. Inst. H. Poincaré Anal. Non Linéaire 20 (2003), no. 5, 805–841.
[Asa] M. Asaoka, Regular projectively Anosov flows on three-dimensional manifolds. Ann. Inst. Fourier (Grenoble) 60 (2010), no. 5, 1649–1684.
[BFFP] T. Barthelme, S. Fenley, S. Frankel, R. Potrie, Partially hyperbolic diffeomorphisms isotopic to the identity: Research Announcement, this volume.
[BBY] F. Beguin, C. Bonatti and B. Yu, Building Anosov flows on 3-manifolds. Geom. Topol. 21 (2017), no. 3, 1837–1930.
[Bow] J. Bowden, Approximating C^0-foliations by contact structures. Geom. Funct. Anal. 26 (2016), no. 5, 1255–1296.
[BI] D. Burago, S. Ivanov, Partially hyperbolic diffeomorphisms of 3-manifolds with abelian fundamental groups. J. Mod. Dyn. 2 (2008), no. 4, 541–580.
[Ch] J. Christy, Anosov flows on three manifolds, PhD thesis, University of California, Berkeley (1984).
[CF] V. Colin and S. Firmo, Paires de structures de contact sur les variétés de dimension trois. Algebr. Geom. Topol. 11 (2011), no. 5, 2627–2653.
[ET] Y. M. Eliashberg and W. P. Thurston, Confoliations, University Lecture Series, vol. 13, American Mathematical Society, Providence, RI, 1998, x+66 pages.
[FM] B. Farb, D. Margalit, A primer on mapping class groups. Princeton Mathematical Series, 49. Princeton University Press, Princeton, NJ, 2012. xiv+472 pp.
[FH] P. Foulon and B. Hasselblatt, Contact Anosov flows on hyperbolic 3-manifolds. Geom. Topol. 17 (2013), no. 2, 1225–1252.
[GP] N. Gourmelon and R. Potrie, Projectively Anosov diffeomorphisms on surfaces, in preparation.
[KR] W.H. Kazez and R. Roberts, C^0 approximations of foliations. Geom. Topol. 21 (2017), no. 6, 3601–3657.
[KR2] W.H. Kazez and R. Roberts, Taut foliations. arXiv:1605.02007v1.
[Mit] Y. Mitsumatsu, Anosov flows and non-Stein symplectic manifolds. Ann. Inst. Fourier (Grenoble) 45 (1995), no. 5, 1407–1421.
[Mit1] Y. Mitsumatsu, Foliations and contact structures on 3-manifolds, in Foliations: geometry and dynamics (Warsaw, 2000), World Sci. Publ., River Edge, NJ, 2002, p. 75–125.
[Thu] W. Thurston, Three manifolds, foliations and circles, arXiv:math/9712268.

Notes on Global Product Structure

Andy Hammerlindl

Abstract These notes present a careful explanation of parts of a proof showing that if an Anosov diffeomorphism has global product structure and the universal cover has polynomial growth of volume, then the Anosov diffeomorphism is topologically conjugate to an infranilmanifold automorphism. The original proof published by Brin and Manning relied on an incorrectly stated result of Auslander about the properties of maps on infranilmanifolds. The current notes show how the proof can be adapted to avoid relying on this incorrect result.

1 Introduction

These notes give a careful explanation of parts of the proof of the following result.

Theorem 1.1 (Brin-Manning) If an Anosov diffeomorphism f has Global Product Structure, and the universal cover has polynomial growth of volume, then f is topologically conjugate to an infranilmanifold automorphism.

The original proof of Brin and Manning [3] relies on an incorrect statement of Auslander regarding infranilmanifold automorphisms, whereas these current notes avoid using this incorrect statement. These notes explain in detail only the first part of the proof of the theorem, which is the construction of a semi-conjugacy. They do not include the proof that the semi-conjugacy is injective and therefore a true conjugacy, since the original arguments for this step hold without modification. I have ordered the steps of the proof so that as much is proved as possible before introducing infranilmanifolds.

Andy Hammerlindl
School of Mathematics, Monash University, Victoria 3800, Australia
e-mail: [email protected]

© Springer Nature Switzerland AG 2020 J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_25

My original motivation in preparing these notes was to convince myself of the correctness of the proof. Almost all of the following repeats arguments already given by Brin, Manning, Franks, Dekimpe, and others [3, 4, 5, 6].

2 The proof

Lemma 2.1 If f : M → M has Global Product Structure, then for every ε > 0, there is k ≥ 1 such that $f^{-k}(B_\varepsilon(x)) \cap B_\varepsilon(y) \neq \emptyset$ for all x, y ∈ M. Here, $B_\varepsilon(x) := \{ y \in M : d(x, y) < \varepsilon \}$.

Proof. For the lifted foliations $W^u$ and $W^s$ on the universal cover $\tilde M$, the intersection $[\tilde x, \tilde y] = W^s(\tilde x) \cap W^u(\tilde y)$ depends continuously on $\tilde x, \tilde y \in \tilde M$. As $W^u$ and $W^s$ are tangent to continuous subbundles $E^u$ and $E^s$, the distances $d_s(\tilde x, [\tilde x, \tilde y])$ and $d_u(\tilde y, [\tilde x, \tilde y])$ are continuous as well and are bounded for $(\tilde x, \tilde y) \in K \times K$, where K is a compact fundamental domain of the covering $\tilde M \to M$. Projecting down, there is R > 0 such that for all x, y ∈ M, there is $z \in W^s(x) \cap W^u(y) \subset M$ with $d_s(x, z) < R$ and $d_u(y, z) < R$. One can then find n ≥ 1, independent of x and y, such that $d_s(f^n(x), f^n(z)) < \varepsilon$ and $d_u(f^{-n}(y), f^{-n}(z)) < \varepsilon$. This is enough to prove the lemma with k = 2n. □



Corollary 2.2 With f, ε, k as in the last lemma, for any periodic point $f^m(x) = x$, there is a closed ε-pseudo orbit $x, f x, \dots, f^{m-1} x, y, f y, \dots, f^{k-1} y, x$.

Proof. Take $y \in B_\varepsilon(x) \cap f^{-k}(B_\varepsilon(x)) \neq \emptyset$. □



Lemma 2.3 If f has Global Product Structure, there is k such that $\#\operatorname{Fix}(f^m) \le \#\operatorname{Fix}(f^{m+k})$ for all m ≥ 1.

Proof. f is expansive; there is δ > 0 such that if $d(f^n(x), f^n(x')) < \delta$ for all $n \in \mathbb{Z}$, then $x = x'$. It also has a periodic shadowing property; there is ε > 0 such that every closed ε-pseudo orbit $x_0, x_1, \dots, x_i = x_0$ is $\tfrac{1}{3}\delta$-shadowed by a true orbit $x = f^i(x)$. Suppose $x = f^m(x)$. By the previous corollary, there is an ε-pseudo orbit $x, f x, \dots, f^{m-1} x, y, f y, \dots, f^{k-1} y, x$, and this is $\tfrac{1}{3}\delta$-shadowed by some $z = f^{m+k}(z)$. In particular, $d(f^i(x), f^i(z)) < \tfrac{1}{3}\delta$ for 0 ≤ i < m. Say by the same process that $x' = f^m(x')$ leads to a point $z' = f^{m+k}(z')$.


If $z = z'$, then $d(f^i(x), f^i(x')) < \tfrac{2}{3}\delta$ for 0 ≤ i < m. Since x and x' both have period m, the same bound holds for all $i \in \mathbb{Z}$, and therefore $x = x'$. □

Remark 1. Brin and Manning use $\tfrac{1}{3}\delta$, but it seems that $\tfrac{1}{2}\delta$ would suffice.

Lemma 2.3 will later be used to establish hyperbolicity of a Lie group automorphism. For this, we will also need an elementary result about complex numbers.

Lemma 2.4 Suppose that $\lambda_1, \dots, \lambda_n \in \mathbb{C}$ are such that no $\lambda_i$ is a root of unity. Define

$$a_m := \prod_{i=1}^{n} |1 - \lambda_i^m| .$$

If there is k ≥ 1 such that $a_m \le a_{m+k}$ for all m ≥ 1, then $|\lambda_i| \neq 1$ for all i.

For completeness, we include a proof at the end of the paper.

Lemma 2.5 If A is a hyperbolic automorphism of a nilpotent Lie group L, then $\alpha_b : L \to L$, $x \mapsto A(x) \cdot b$ has a fixed point for any b ∈ L.

Proof. By the Anosov closing lemma, if g : M → M is Anosov, there are constants δ, ε > 0 such that if d(g(x), x) < ε, then there is y = g(y) with d(x, y) < δ. (This is just shadowing of a constant pseudo orbit.) The values of δ and ε depend on bounds on the angle between $E^u$ and $E^s$ and bounds related to expansion and contraction given in the definition of an Anosov diffeomorphism. The closing lemma is proven locally, so it holds for M non-compact so long as $E^u$ and $E^s$ are uniformly continuous and these bounds hold uniformly on M.

To prove the lemma, equip L with a metric such that d(x · y, x · z) = d(y, z) for all x, y, z ∈ L. The closing lemma holds for all $\alpha_b$, with δ and ε independent of b. If $\alpha_b$ has a fixed point x, then $d(\alpha_{b'}(x), x) < \varepsilon$ for all $b' \in B_\varepsilon(b)$, so every such $\alpha_{b'}$ has a fixed point. Since $\alpha_1 = A$ has a fixed point, and $B_\varepsilon(B_{n\varepsilon}(1)) = B_{(n+1)\varepsilon}(1)$, we can show that any $\alpha_b$ has a fixed point.

Alternatively, one can prove this result using induction on the nilpotency class of the Lie group and the fact that the result is true for the abelian Lie group $\mathbb{R}^n$. □

Assumption 2.6 For the rest of this note, assume that f : M → M is an Anosov diffeomorphism with Global Product Structure and the universal cover $\tilde M$ has polynomial growth of volume.

Lemma 2.7 $\tilde M$ is homeomorphic to $\mathbb{R}^n$ for some n.

Proof. By Global Product Structure, $\tilde M$ is homeomorphic to the direct product of an unstable leaf with a stable leaf. As each of these is homeomorphic to a Euclidean space, so is the direct product. □

Lemma 2.8 $\pi_1(M)$ is torsion free.


Proof. Suppose not. Then there would be a deck transformation $\gamma : \tilde M \to \tilde M$ of finite period. As $\tilde M$ is homeomorphic to $\mathbb{R}^n$, this gives a fixed-point free homeomorphism of $\mathbb{R}^n$ which is periodic. This is ruled out by a classic result of P. A. Smith. (This is repeating an argument given by Franks [6]. Franks cites [2] as a reference.) □

Lemma 2.9 The maximal normal nilpotent subgroup N of $\pi_1(M)$ has finite index.

Proof. By Gromov, $\pi_1(M)$ has a nilpotent subgroup H of finite index [7]. Then, there is a subgroup K ≤ H of finite index and normal (in $\pi_1(M)$). As it is a subgroup of H, it is nilpotent. The Hirsch-Plotkin radical N of $\pi_1(M)$ contains K and is therefore of finite index. Hence, it is the desired maximal subgroup. □

Corollary 2.10 N is characteristic; it is preserved by every automorphism.

Lift f : M → M to a map $\tilde f : \tilde M \to \tilde M$. Viewing $\pi_1(M)$ as the group of all deck transformations, the choice of lift defines a group automorphism $f_*$ of $\pi_1(M)$ by $\tilde f(\gamma(x)) = f_*(\gamma)(\tilde f(x))$ for all $x \in \tilde M$ and $\gamma \in \pi_1(M)$. By the above corollary, $f_*(N) = N$.

We use the following results on nilmanifolds as given by Malcev [9].

Theorem 2.11 (Malcev) If N is a torsion-free nilpotent finitely-generated group, there is
• a connected, simply connected, nilpotent Lie group L,
• a subgroup Λ < L, and
• a group isomorphism T : Λ → N,
such that L/Λ is a compact manifold (a nilmanifold). Every group automorphism of Λ extends to a unique Lie group automorphism of L. Any such L is unique up to Lie group isomorphism. In general, Λ and T are not unique.

Lemma 2.12 There is a nilpotent Lie group L, a subgroup Λ, an isomorphism T : Λ → N, and a Lie group automorphism A : L → L such that
• A(Λ) = Λ, and
• $A(x) = T^{-1} f_* T(x)$ for x ∈ Λ.

Proof. This is the theorem of Malcev applied to our specific situation. □
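For readers who want a concrete pair (L, Λ) in mind, the standard textbook example is the real Heisenberg group with its integer lattice. The sketch below is only an illustration and is not taken from the notes: the triple `(a, b, c)` is our encoding of the upper triangular matrix with rows (1, a, c), (0, 1, b), (0, 0, 1), and the helper names are hypothetical. It checks on a few integer samples that commutators are central, which is the two-step nilpotency of the group.

```python
from itertools import product

def mul(g, h):
    """Product in the Heisenberg group; (a, b, c) stands for the matrix
    [[1, a, c], [0, 1, b], [0, 0, 1]]."""
    a, b, c = g
    x, y, z = h
    return (a + x, b + y, c + z + a * y)

def inv(g):
    """Inverse element, so that mul(g, inv(g)) is the identity (0, 0, 0)."""
    a, b, c = g
    return (-a, -b, a * b - c)

def commutator(g, h):
    """The group commutator g h g^{-1} h^{-1}."""
    return mul(mul(g, h), mul(inv(g), inv(h)))

IDENTITY = (0, 0, 0)
# A small box of lattice points (integer entries), used as sample elements.
SAMPLES = list(product(range(-1, 2), repeat=3))

for g, h in product(SAMPLES, repeat=2):
    comm = commutator(g, h)
    # Commutators lie in the centre {(0, 0, c)} ...
    assert comm[:2] == (0, 0)
    # ... and the centre commutes with everything, so the group is 2-step nilpotent.
    assert all(commutator(comm, k) == IDENTITY for k in SAMPLES)
```

Integer triples are closed under `mul` and `inv`, so they form a subgroup Λ of the real group L given by the same formulas with real entries.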



Lemma 2.13 A is hyperbolic.

Remark 2. As the choice of T in lemma 2.12 was arbitrary, this will show that A is hyperbolic for any choice of T.

Note that we do not assume $E^u$ is orientable on M or that f preserves an orientation of $E^u$ if it exists. However, in order to prove lemma 2.13, we will temporarily make an additional assumption on orientation. Recall that any bundle over a simply connected manifold is orientable, as there is no double cover to lift to.


Assumption 2.14 For now, assume that on the universal cover, $\tilde f$ preserves the orientation of $E^u$.

If need be, replace f and A with $f^2$ and $A^2$. As A is hyperbolic if and only if $A^2$ is hyperbolic, this assumption can be made freely for the purposes of proving lemma 2.13.

Viewing $\gamma \in \pi_1(M)$ as a diffeomorphism $\tilde M \to \tilde M$ whose derivative maps the lifted unstable bundle $E^u$ to itself, define a group homomorphism $\theta : \pi_1(M) \to \{+1, -1\}$ by whether γ preserves or reverses the orientation of $E^u$. Then $N^+ := \ker(\theta) \cap N$ defines a finite-index normal nilpotent subgroup of $\pi_1(M)$. By assumption 2.14, $f_*$ restricts to an automorphism of $N^+$.

Define a manifold $\hat M$ as the quotient $\tilde M / N^+$. That is, $x, y \in \tilde M$ are identified if there is $\gamma \in N^+ \subset \pi_1(M)$ such that γ(x) = y. Since γ(x) = y implies $f_*(\gamma)(\tilde f(x)) = \tilde f(y)$, $\tilde f$ quotients to a map $\hat f : \hat M \to \hat M$. Further, $\pi_1(\hat M)$ is isomorphic to $N^+$ and $\tilde f$ can be viewed as a lift of $\hat f$ to $\tilde M$. As such, the induced group automorphism $\hat f_*$ of $\pi_1(\hat M)$ can be identified with the restriction of $f_*$ to $N^+$.

Lemma 2.15 There is a subgroup $\Lambda^+ \subset \Lambda$ such that
• $T(\Lambda^+) = N^+$,
• $A(\Lambda^+) = \Lambda^+$,
• $A(x) = T^{-1} \hat f_* T(x)$ for $x \in \Lambda^+$.

Proof. This follows from lemma 2.12 and the fact that $f_*(N^+) = N^+$ (under assumption 2.14). □

Let $\hat g : L/\Lambda^+ \to L/\Lambda^+$ be the quotient of A to the nilmanifold $L/\Lambda^+$.

Lemma 2.16 There is a homotopy equivalence $\hat h : \hat M \to L/\Lambda^+$ with lift $\tilde h : \tilde M \to L$ such that the maps induced by $\hat f$, $\hat g$, and $\hat h$ on the fundamental groups satisfy $\hat h_* \hat f_* = \hat g_* \hat h_*$.

Proof. By the construction of $\hat g$, there is an isomorphism between $\pi_1(\hat M)$ and $\pi_1(L/\Lambda^+)$ that conjugates $\hat f_*$ and $\hat g_*$. Since $\tilde M$ is homeomorphic to $\mathbb{R}^n$ and $L/\Lambda^+$ is a nilmanifold, both spaces are of type K(π, 1), and the lemma follows from standard results in algebraic topology. □

Remark 3. Many treatments of Eilenberg-MacLane spaces assume that the fundamental group is defined with regard to a pointed space $(X, x_0)$.
This complicates matters, as we have not yet proved that $\hat f$ has a fixed point, and so it may not be a based map. However, one can isotope $\hat f$ to a map $\hat f_1$ which does have a fixed point, prove the lemma for $\hat f_1$, and then use the isotopy to prove it for $\hat f$.

Lemma 2.17 Let L(·) denote the Lefschetz number of a diffeomorphism of a manifold. Then,

$$\#\operatorname{Fix}(\hat f^m) = |L(\hat f^m)| = |L(\hat g^m)| = \prod_{i=1}^{n} |1 - \lambda_i^m|$$

where $\lambda_i$ are the eigenvalues of A.
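The last expression in Lemma 2.17 can be checked by hand in the simplest abelian case, where L = R^2, the lattice is Z^2 and the nilmanifold is the 2-torus. The following sketch is an illustration only: the matrix `CAT_MAP` and all function names are our own choices, not objects from the notes. It compares |det(I − A^m)|, which equals the product of |1 − λ_i^m| over the eigenvalues, with a brute-force count of fixed points of the induced torus map.

```python
from itertools import product

def mat_mul(a, b):
    """Multiply two 2x2 integer matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def mat_pow(a, m):
    """Return a**m for m >= 1 by repeated multiplication."""
    result = a
    for _ in range(m - 1):
        result = mat_mul(result, a)
    return result

def lefschetz_count(a, m):
    """|det(I - A^m)|, i.e. the product of |1 - lambda_i^m| over the eigenvalues of A."""
    p = mat_pow(a, m)
    return abs((1 - p[0][0]) * (1 - p[1][1]) - p[0][1] * p[1][0])

def brute_force_fixed_points(a, m):
    """Count fixed points of the torus map induced by A^m by enumeration.

    Every solution of (A^m - I) x in Z^2 has coordinates in (1/N) Z with
    N = |det(A^m - I)|, so it is enough to test the points (i/N, j/N).
    Assumes N != 0, which holds when A is hyperbolic.
    """
    p = mat_pow(a, m)
    n = lefschetz_count(a, m)
    count = 0
    for i, j in product(range(n), repeat=2):
        u = (p[0][0] - 1) * i + p[0][1] * j
        v = p[1][0] * i + (p[1][1] - 1) * j
        if u % n == 0 and v % n == 0:
            count += 1
    return count

# Hypothetical example: a hyperbolic automorphism of the 2-torus.
CAT_MAP = ((2, 1), (1, 1))

for m in range(1, 5):
    print(m, lefschetz_count(CAT_MAP, m), brute_force_fixed_points(CAT_MAP, m))
```

For this matrix the counts also increase with m, in line with the kind of monotonicity that Lemma 2.3 provides.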


Proof. The left-most equality is a standard result for Anosov diffeomorphisms which preserve the orientation of $E^u$. (See [11].) By the previous lemma, the maps induced by $\hat f$ and $\hat g$ on the homology groups are conjugate. Therefore, their traces are the same, and the resulting Lefschetz numbers are the same. The last equality is proved in [10]. □

Proof (of lemma 2.13). None of the $\lambda_i$ in lemma 2.17 can be a root of unity, for then some iterate of $\hat f$ would be an Anosov map without periodic points. The result then follows as a combination of lemmas 2.3, 2.4, and 2.17. □

End of assumption 2.14. As we have proved lemma 2.13, we no longer need the assumption.

Lemma 2.18 f has a fixed point.

Proof. Suppose that instead of using $N^+$ and $\Lambda^+$, we had used N and Λ to define maps $\breve f : \tilde M / N \to \tilde M / N$ and $\breve g : L/\Lambda \to L/\Lambda$. As $\breve f$ may not preserve an orientation of $E^u$, we do not compare $\#\operatorname{Fix}(\breve f^m)$ to $|L(\breve f^m)|$. However, the other two equalities given by lemma 2.17 hold in this case, and using m = 1 and that A is hyperbolic, we get that $L(\breve f)$ is non-zero. Then $\breve f$ has a fixed point and this projects to a fixed point for f. □

Let Aff(L) denote the affine transformations of L, those functions of the form $x \mapsto b \cdot \beta(x)$, where β is an automorphism of the Lie group and b ∈ L.

Theorem 2.19 (Auslander-Schenkman) If Γ is a torsion-free finitely-generated group with nilpotent Hirsch-Plotkin radical N of finite index, then Γ can be viewed as a subgroup of Aff(L), where L is the Lie group given by theorem 2.11 corresponding to N. Moreover, L/Γ is a compact manifold.

This follows from section 2 of [1].

Remark 4. There is a subtle issue here. When regarded as a subgroup of Γ, N consists of affine maps. These maps are of the form L → L, $x \mapsto a \cdot x$ for some a ∈ L. If N is regarded as a subgroup of L, its elements are no longer maps, but simply elements a ∈ L.
These two distinct interpretations may have contributed to Auslander providing an incorrect proof for the extension of an automorphism of Γ. To try to avoid confusion, we will not regard N as a subgroup of L, and instead use the symbol Λ ⊂ L and say that they are identified by an isomorphism T : Λ → N.

We now follow the proof given in the paper of Lee and Raymond [8], but with regard to our specific situation. By the previous theorem, $\pi_1(M)$ can be identified with a subgroup Γ of Aff(L) and $f_*$ then defines an automorphism ψ : Γ → Γ for which ψ(N) = N. This restriction to N defines an automorphism $T^{-1} \psi T$ on Λ which extends to A : L → L by the result of Malcev. Continuing on, the proof of Lee and Raymond shows that ψ is conjugation by an element of Aff(L). They give a formula in the proof: using their notation, ψ is conjugation by $(b, \mu(b^{-1}) A)$ [8, page 75]. This can be re-written as $\psi(\gamma) = \alpha \gamma \alpha^{-1}$ where $\alpha(x) = A(x) \cdot b$.


Lemma 2.20 α is hyperbolic. Proof. The above construction of A using the results of Malcev is exactly the same as in lemma 2.12, and therefore A is hyperbolic by lemma 2.13.  Corollary 2.21 α has a fixed point. Proof. This is a specific case of lemma 2.5.
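In the abelian case L = R^2, written additively, the map α becomes x ↦ A x + b and its fixed point can be computed directly, since hyperbolicity means that 1 is not an eigenvalue of A. The sketch below is a toy check under that assumption; the matrix `A`, the vector `b` and the function names are hypothetical choices of ours. It also verifies that translating by the fixed point conjugates α to A.

```python
from fractions import Fraction

# Hypothetical data: a hyperbolic matrix and an arbitrary translation part.
A = ((Fraction(2), Fraction(1)), (Fraction(1), Fraction(1)))
b = (Fraction(3), Fraction(-5))

def apply(mat, x):
    """Matrix-vector product for a 2x2 matrix."""
    return (mat[0][0] * x[0] + mat[0][1] * x[1],
            mat[1][0] * x[0] + mat[1][1] * x[1])

def alpha(x):
    """Additive analogue of x -> A(x) . b, namely x -> A x + b."""
    y = apply(A, x)
    return (y[0] + b[0], y[1] + b[1])

def fixed_point():
    """Solve (I - A) x = b by Cramer's rule; det(I - A) != 0 since A is hyperbolic."""
    m00, m01 = 1 - A[0][0], -A[0][1]
    m10, m11 = -A[1][0], 1 - A[1][1]
    det = m00 * m11 - m01 * m10
    return ((b[0] * m11 - m01 * b[1]) / det,
            (m00 * b[1] - m10 * b[0]) / det)

def conjugated(x):
    """beta A beta^{-1} with beta(x) = x + x0: translate back, apply A, translate forward."""
    x0 = fixed_point()
    y = apply(A, (x[0] - x0[0], x[1] - x0[1]))
    return (y[0] + x0[0], y[1] + x0[1])

x0 = fixed_point()
assert alpha(x0) == x0
for x in [(Fraction(0), Fraction(0)), (Fraction(1, 2), Fraction(7)), (Fraction(-4), Fraction(9, 5))]:
    assert alpha(x) == conjugated(x)
```

Exact rational arithmetic is used so that the equalities are checked without rounding error.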



If x0 ∈ L is a fixed point of α , then α (x) = A(x · x0−1 ) · x0 , so α = β Aβ −1 , where β (x) = x · x0 . Note that β ∈ Aff(L). For γ ∈ Γ , the formula ψ (γ ) = αγα −1 expands to ψ (γ ) = β Aβ −1 γβ A−1 β −1 ⇒ β −1 ψ (γ )β A = Aβ −1 γβ . Define Γ¯ = {β −1 γβ : γ ∈ Γ } and ψ¯ : Γ¯ → Γ¯ by ψ¯ (β −1 γβ ) = β −1 ψ (γ )β , so that the above formula can be rewritten simply as

\[ \bar\psi(\bar\gamma) A = A \bar\gamma. \]

The manifold $P = L/\bar\Gamma$ is a compact manifold, and the Lie group automorphism A quotients down to an Anosov diffeomorphism g : P → P. Identifying $\pi_1(P)$ with $\bar\Gamma$, there is a commutative diagram

\[
\begin{array}{ccccccc}
\pi_1(P) & \longrightarrow & \bar\Gamma & \longrightarrow & \Gamma & \longrightarrow & \pi_1(M) \\
\downarrow{\scriptstyle g_*} & & \downarrow{\scriptstyle \bar\psi} & & \downarrow{\scriptstyle \psi} & & \downarrow{\scriptstyle f_*} \\
\pi_1(P) & \longrightarrow & \bar\Gamma & \longrightarrow & \Gamma & \longrightarrow & \pi_1(M)
\end{array}
\]

where all arrows are isomorphisms and the top and bottom rows are the same. Thus, $f_*$ and $g_*$ are conjugate. As M and P are K(π, 1) and f and g have fixed points, we can apply the results of Franks to find a semi-conjugacy h : M → P such that hf = gh [6]. Beyond this point, the proof given by Brin and Manning is fairly easy to follow (see also [5]), and there are no subtleties with respect to infranilness.

3 Proof of lemma 2.4

We now proceed to prove lemma 2.4. First, consider only the eigenvalues which lie on the unit circle.

Assumption 3.1 Assume for the next two lemmas that $\lambda_1, \dots, \lambda_n \in \mathbb{C}$ are not roots of unity, and $|\lambda_i| = 1$ for all $i$. The $\lambda_i$ need not be distinct.

Lemma 3.2 If $\{n_j\}$ is a subsequence of $\mathbb{N}$ such that $\lambda_1^{n_j} \to 1$, then there is $p \in \mathbb{N}$ such that $\lambda_1^{p n_j} \to 1$ and $\lambda_i^{p n_j} \not\to \lambda_i$ for all $i$.

Andy Hammerlindl

Proof. Clearly, $(\lambda_1^{n_j})^p \to 1$ for any $p$. Suppose for some $i$ and distinct $p, q$, that $\lambda_i^{p n_j} \to \lambda_i$ and $\lambda_i^{q n_j} \to \lambda_i$. Then $(\lambda_i^{p n_j})^q \to \lambda_i^q$ and $(\lambda_i^{q n_j})^p \to \lambda_i^p$, and so $\lambda_i^p = \lambda_i^q$. This contradicts the fact that $\lambda_i$ is not a root of unity. Therefore, for all but finitely many $p$, the lemma is satisfied.

Lemma 3.3 There is a subsequence $\{m_j\}$ of $\mathbb{N}$ such that for all $i$, $L_i = \lim_{j\to\infty} \lambda_i^{m_j}$ exists and $L_i \neq 1$. Further, $\lim_{j\to\infty} \lambda_1^{m_j+1} = 1$.

Proof. As $\lambda_1$ is not a root of unity, there is a subsequence $n_j$ such that $\lambda_1^{n_j} \to 1$. Since each sequence $\lambda_i^{n_j}$ lies in a compact subset of $\mathbb{C}$, by replacing $n_j$ with a further subsequence, we may assume $\lambda_i^{n_j}$ converges for all $i$. Choosing $p$ as in the previous lemma, $m_j = p n_j - 1$ is the desired subsequence in the statement of this lemma.

We now consider eigenvalues of any modulus.

Lemma 3.4 Suppose that $\lambda_1, \dots, \lambda_n \in \mathbb{C}$ are such that no $\lambda_i$ is a root of unity. Further suppose $|\lambda_1| = 1$. Define

\[ a_m := \prod_{i=1}^{n} |1 - \lambda_i^m|. \]

Then there is a subsequence $\{a_{m_j}\}$ such that

\[ \frac{a_{m_j+1}}{a_{m_j}} \to 0. \]

Proof. If $|\lambda_i| < 1$, then

\[ \frac{|1 - \lambda_i^{m+1}|}{|1 - \lambda_i^m|} \to 1. \]

If $|\lambda_i| > 1$, then

\[ \frac{|1 - \lambda_i^{m+1}|}{|1 - \lambda_i^m|} \to |\lambda_i|. \]

By the previous lemma, there is $\{m_j\}$ such that

\[ \lim_{j\to\infty} \frac{|1 - \lambda_i^{m_j+1}|}{|1 - \lambda_i^{m_j}|} \]

exists for those $i$ with $|\lambda_i| = 1$, and for $i = 1$ in particular, the limit is zero. Then,

\[ \lim_{j\to\infty} \frac{a_{m_j+1}}{a_{m_j}} = \prod_i \lim_{j\to\infty} \frac{|1 - \lambda_i^{m_j+1}|}{|1 - \lambda_i^{m_j}|} = 0. \]

Proof (of lemma 2.4). If we define $\tilde a_m$ similarly to $a_m$, but using $\tilde\lambda_i = \lambda_i^k$ in place of $\lambda_i$, then $\tilde a_m = a_{km}$ and the hypothesis $a_m \le a_{m+k}$ for all $m$ implies $\tilde a_m \le \tilde a_{m+1}$ for all $m$. Therefore, it is enough to prove the lemma in the case $k = 1$. Suppose that $|\lambda_i| = 1$ for some $i$. Without loss of generality, $i = 1$. By the previous lemma, there is a sequence of terms $a_{m_j+1}/a_{m_j} \ge 1$ which tends to zero, a contradiction.
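The mechanism behind Lemma 3.4 can be observed numerically. The following Python sketch is an illustration only and plays no role in the proof: the eigenvalues (one on the unit circle with angle 2π√2, together with 2 and 1/2), the search range, and the function names are all assumptions made for this example. It evaluates a_m and searches for an index m at which the ratio a_{m+1}/a_m is small, as the lemma says must happen along some subsequence.

```python
import cmath
import math


def a(m, lams):
    """Return a_m = prod_i |1 - lam_i^m| for the given eigenvalues."""
    prod = 1.0
    for lam in lams:
        prod *= abs(1 - lam ** m)
    return prod


def smallest_ratio(lams, max_m):
    """Return (m, a_{m+1}/a_m) minimising the ratio over 1 <= m <= max_m."""
    best_m, best_ratio = None, math.inf
    for m in range(1, max_m + 1):
        ratio = a(m + 1, lams) / a(m, lams)
        if ratio < best_ratio:
            best_m, best_ratio = m, ratio
    return best_m, best_ratio


# Illustrative eigenvalues (assumed for this sketch): none is a root of
# unity, and the first one has modulus 1.
LAMS = [cmath.exp(2j * math.pi * math.sqrt(2)), 2.0, 0.5]

best_m, best_ratio = smallest_ratio(LAMS, 1000)
print(best_m, best_ratio)
```

The small ratios occur when m + 1 is such that the unit-circle eigenvalue raised to that power lands close to 1, which is the role played by the subsequence m_j = p n_j − 1 in Lemma 3.3.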


References

1. Auslander, L., Schenkman, E.: Free groups, Hirsch-Plotkin radicals, and applications to geometry. Proc. Amer. Math. Soc. 16, 784–788 (1965)
2. Borel, A.: Seminar on transformation groups. Annals of Mathematics Studies, No. 46. Princeton University Press, Princeton, N.J. (1960)
3. Brin, M., Manning, A.: Anosov diffeomorphisms with pinched spectrum. Dynamical Systems and Turbulence, Warwick 1980, pp. 48–53 (1981)
4. Dekimpe, K.: What an infra-nilmanifold endomorphism really should be. preprint arXiv:1008.4500 (2010)
5. Franks, J.: Anosov diffeomorphisms on tori. Transactions of the American Mathematical Society 145, 117–124 (1969)
6. Franks, J.: Anosov diffeomorphisms. Global Analysis: Proceedings of the Symposia in Pure Mathematics 14, 61–93 (1970)
7. Gromov, M.: Groups of polynomial growth and expanding maps. Publications Mathématiques de l'IHÉS 53(1), 53–78 (1981)
8. Lee, K.B., Raymond, F.: Rigidity of almost crystallographic groups. In: Combinatorial methods in topology and algebraic geometry (Rochester, N.Y., 1982), Contemp. Math., vol. 44, pp. 73–78. Amer. Math. Soc., Providence, RI (1985). DOI 10.1090/conm/044/813102. URL http://dx.doi.org/10.1090/conm/044/813102
9. Malcev, A.I.: On a class of homogeneous spaces. Amer. Math. Soc. Translation 1951(39), 33 (1951)
10. Manning, A.: Anosov diffeomorphisms on nilmanifolds. Proc. Amer. Math. Soc. 38, 423–426 (1973)
11. Manning, A.: There are no new Anosov diffeomorphisms on tori. Amer. J. Math. 96(3), 422–42 (1974)

Chapter 6

Geometric and Categorical Representation Theory

The Greenberg Functor is Site Cocontinuous

Geoff Vooys

Abstract In this paper we show that it is possible to define a topology on the category of formal schemes over a ring of p-adic integers such that the left adjoint of the Greenberg Transform is a site cocontinuous functor when we equip the category of schemes over the residue field with the étale topology. We show furthermore that this topology allows us to give an isomorphism between the corresponding fundamental groups, and use this isomorphism to show that it is possible to geometrize the quasicharacters of a p-adic torus by a local system on a formal scheme over the ring of p-adic integers.

Geoff Vooys, Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada.

© Springer Nature Switzerland AG 2020. J. de Gier et al. (eds.), 2018 MATRIX Annals, MATRIX Book Series 3, https://doi.org/10.1007/978-3-030-38230-8_26

1 Introduction

The Greenberg Transform, and its left adjoint (often called the Greenberg Functor), are two functors that are ubiquitous in arithmetic algebraic geometry. The Greenberg Transform itself first appeared in a proto-form in Serge Lang's thesis [18] and was further explored and formally introduced by Marvin J. Greenberg in [12] and [13], where Greenberg showed that the transform that bears his name admits a left adjoint, as well as how to use the pair of functors to study torsors and cohomology. More recently, these functors have appeared in the book [6] by Bosch, Lütkebohmert, and Raynaud on Néron Models; in the work of Buium to study p-jet spaces (cf. [7]); in the work of Nicaise and Sebag to study motivic properties of (formal) schemes (cf. [20] and [21]); in the work of Cunningham and Roe in [11] to geometrize quasicharacters of p-adic tori; in the work of Yu in the study of smooth models as they are used in Bruhat-Tits theory (cf. [23]); in the work of Bertapelle and Tong to study the Picard group and the pro-algebraic structures of Serre (cf. [2]); and in the work of Bhatt and Scholze in defining and proving the representability of the positive p-adic loop group functor (cf. Proposition 9.2 of [3]).

While the Greenberg functor and the Greenberg Transform have given myriad tools with which to study schemes and their arithmetic properties, especially by relating the mixed-characteristic case of formal schemes over Spec R to schemes over the residue field Spec k, the Greenberg Transform was written in a pre-Grothendieck language, which made applying the rich theory around it more difficult. It was this issue that led Bertapelle and González-Avilés to revisit and recast the Greenberg Transform into the scheme (and formal scheme)-theoretic language of modern algebraic geometry in [1]. This significantly helped in the study of the Greenberg Transform, as it established many site-theoretic and geometric properties of the functor itself, and studied the Greenberg Transform in great detail. However, in contrast to the in-depth study and development of the Greenberg Transform, the theory surrounding the Greenberg functor is less developed.

In this paper, we work with and study the Greenberg functor and consider an application to local systems. In particular, we show the Greenberg functor is site-cocontinuous for certain Grothendieck topologies on Sch/Spec k and FSch/Spec R, where k is the residue field of a complete integral extension R/Z_p and FSch/Spec R is the category of formal schemes over Spec R. More precisely, we will define topologies on the category FSch/Spec R for which the Greenberg functor h : Sch/Spec k → FSch/Spec R becomes cover-reflecting for the étale and fppf topologies. This means that if {ϕ_i : X_i → X | i ∈ I} is a collection of morphisms over X in Sch/Spec k, and if {Fϕ_i : FX_i → FX | i ∈ I} generates a covering sieve over FX, then the set {ϕ_i : X_i → X | i ∈ I} is a cover over X. This topology, which we will call K for the moment (cf. Definition 2 for an explicit description), allows us to show that there is a canonical isomorphism of fundamental groups

\[ \pi_1^{\text{Ét}}(X, x) \cong \pi_1^{K}(hX, hx) \]

for any scheme X over Spec k. In particular, from this group isomorphism, we show that there is an isomorphism of the category of étale local systems over a k-scheme X with the category of local systems in the K topology over hX.

Theorem 1.1 (cf. Theorem 5.1) For any group scheme G over Spec k with geometric point g of G, if h is the Greenberg functor, then there is an isomorphism of categories

\[ \operatorname{Rep}(\pi_1^{\text{Ét}}(G, g)) \cong \operatorname{Rep}(\pi_1^{K}(hG, hg)). \]

In particular, this gives an isomorphism of categories

\[ \operatorname{Loc}_{\text{Ét}}(G) \cong \operatorname{Loc}_{K}(hG). \]


Our study of the Greenberg functor is motivated in two directions:

1. The first motivation comes from wishing to better understand the Greenberg Transform and a desire to use it in arithmetic algebraic geometry, since to understand a right adjoint functor is to understand its left adjoint, and dually. Furthermore, a deeper understanding of this left adjoint and how it affects Grothendieck topologies on its domain will lead to a comparison between Grothendieck topologies on schemes over fields of prime characteristic and Grothendieck topologies on categories of formal schemes over p-adically complete rings.
2. The second motivation comes from [11], in which the authors begin with a torus T over a p-adic field F, take its Néron model N_T over its ring of integers O_F, and then take the Greenberg Transform of N_T, to define a scheme X over Spec k for which there is an identification of groups $T(F) \cong N_T(\mathcal{O}_F) \cong X(k)$. By then taking the category of étale local systems over X and using the Trace of Frobenius, they arrive at the following: Any quasicharacter of T(F) is geometrized as the Trace of Frobenius of some étale local system L. This causes one to ask whether it is possible to construct a category of local systems in some topology over a formal scheme over Spec O_F for which this new local system geometrizes the quasicharacter $\chi : T(F) \to \mathbb{C}^*$ directly.

In an attempt to address both of the considerations above, we are led to study a class of functors, which we call geometrically adhesive (cf. Definition 1); Corollary 2 shows that the Greenberg functor is geometrically adhesive. With this notion, we build a topology which allows the transfer of sites from categories of schemes to categories of formal schemes in the sense that the fundamental groups are preserved by the transfer. This is Theorem 4.1, which is the main theorem of the paper:

Theorem 1.2 (cf. Theorem 4.1) Let F : C → D be a geometrically adhesive functor and assume that there is a Grothendieck topology J on C for which $\operatorname{Shv}(C, J)_{lcf}$ is a Galois category. Then there is a topology K on D and a functor $F^* : \operatorname{Shv}(D, K)_{lcf} \to \operatorname{Shv}(C, J)_{lcf}$ such that if $F^*$ is fully faithful, there is an isomorphism of fundamental groups $\pi_1^J(X, x) \cong \pi_1^K(FX, Fx)$ for all objects X of C and geometric points x of X.

As an application, we show that the pullback functor $h^* : \operatorname{Shv}(\mathrm{FSch}/\operatorname{Spec} R, J) \to \operatorname{Shv}(\mathrm{Sch}/\operatorname{Spec} k, J)$ is fully faithful (cf. Lemma 5). Furthermore, we derive:


Corollary 1. If J is a topology on Sch/Spec k for which Shv(Sch/Spec k, J) is a Galois category and if h is the Greenberg functor, then for any k-scheme X there is an isomorphism of fundamental groups

\[ \pi_1^J(X, x) \cong \pi_1^K(hX, hx). \]

2 An Introduction to Geometrically Adhesive Functors, with Motivation

We begin by recalling the full-level Greenberg Transform for a complete field extension K/Q_p; cf. [1] for a modern account and [12] and [13] for the introduction of the functors. Begin by assuming that R/Z_p is an absolutely unramified extension ring of Z_p with residue field k/F_p and fraction field K/Q_p. Furthermore, if n ∈ N, define the scheme

\[ S_n := \operatorname{Spec} \frac{R}{p^{n+1}} \]

so that

\[ \operatorname{Spec} R \cong \operatorname{Spec}\Big(\varprojlim \frac{R}{p^{n+1}}\Big) \cong \varinjlim \operatorname{Spec} \frac{R}{p^{n+1}} = \varinjlim S_n, \]

where the limit is calculated in the category Cring of commutative rings with identity and the colimit is calculated in the category AffSch of affine schemes. Then the full-level Greenberg Transform is a functor Gr : FSch/Spec R → Sch/Spec k, where FSch/Spec R is the category of formal schemes over the scheme Spec R, and it has the left adjoint h : Sch/Spec k → FSch/Spec R, i.e., we have an adjunction diagram

\[ h : \mathrm{Sch}/\operatorname{Spec} k \rightleftarrows \mathrm{FSch}/\operatorname{Spec} R : \mathrm{Gr}, \qquad h \dashv \mathrm{Gr}. \]

In order to give a particularly concrete description of h, we now assume that the ramification degree e = [K : Q_p]/[k : F_p] = 1, i.e., K is unramified over Q_p. The left adjoint h then can be seen as the colimit of the functors $h_n : \mathrm{Sch}/\operatorname{Spec} k \to \mathrm{Sch}/S_n \to \mathrm{FSch}/\operatorname{Spec} R$ where each $h_n$ is defined by, for each scheme $X = (|X|, \mathcal{O}_X)$ in Sch/Spec k,

\[ h_n|X| := |X| \quad\text{and}\quad h_n \mathcal{O}_X := W_{n+1}\mathcal{O}_X, \]

where $W_n \mathcal{O}_X$ is the sheaf of length n Witt Vectors in $\mathcal{O}_X$; note that the action of $W_{n+1}\mathcal{O}_X$ is given by, for each open set U ⊆ |X|,

\[ (W_{n+1}\mathcal{O}_X)(U) = W_{n+1}(\mathcal{O}_X(U)), \]

where $W_{n+1}(\mathcal{O}_X(U))$ is the ring of length n + 1 Witt Vectors with coefficients in $\mathcal{O}_X(U)$.

It is worth observing that naïvely these functors do not admit a colimit. However, there is a natural map $S_m \to S_n$ whenever m ≤ n, coming from the correspondence $\mathrm{Sch}(S_m, S_n) \cong \mathrm{Cring}(R/p^{n+1}, R/p^{m+1})$; in this way a scheme over $S_m$ is a scheme over $S_n$ by simply post-composing with the map $S_m \to S_n$, which itself allows us to regard each functor $h_m : \mathrm{Sch}/\operatorname{Spec} k \to \mathrm{Sch}/S_m$ as a functor $h_m : \mathrm{Sch}/\operatorname{Spec} k \to \mathrm{Sch}/S_n$. Through this process, we derive a natural transformation $h_n \to h_{n+1}$ for all n ∈ N. Taking the colimit of these functors $h_n$, together with all necessary embeddings of categories $\mathrm{Sch}/S_n \to \mathrm{FSch}/\operatorname{Spec} R$, gives the desired colimit h : Sch/Spec k → FSch/Spec R.

From these definitions a routine calculation implies that for each scheme $X = (|X|, \mathcal{O}_X)$ in Sch/Spec k, we have that $h|X| = |X|$ and $h\mathcal{O}_X = W\mathcal{O}_X$, where $W\mathcal{O}_X$ is the sheaf of Witt Vectors on $\mathcal{O}_X$. A standard gluing argument then allows one to show that if $\{U_i \mid i \in I\}$ is an open gluing of a scheme X in Sch/Spec k, then

\[ h\Big(\bigcup_{i \in I} U_i\Big) = hX = \bigcup_{i \in I} hU_i; \]

such a geometric condition is very powerful and is easier to work with than the mystical and difficult to understand Greenberg Transform itself. In particular, we call such a functor geometrically adhesive because it preserves all possible gluings!

Remark 1. The union written above is meant simply as a shorthand for a specific colimit. Explicitly, for locally ringed spaces U and V, we write

\[ U \cup V := U \coprod_{X} V \]

as the gluing pushout of U and V along their intersection X → U and X → V. Similarly, we write U ∩ V to denote the pullback of U and V along the morphisms U → U ∪ V and V → U ∪ V. In particular, the diagram

\[
\begin{array}{ccc}
U \cap V & \longrightarrow & U \\
\downarrow & & \downarrow \\
V & \longrightarrow & U \cup V
\end{array}
\]

in LRS is simultaneously a pushout and pullback diagram.

Remark 2. If the algebraic field extension F/Q_p has nontrivial ramification, say as in the quadratic extension $\mathbb{Q}_p(\sqrt{p})$, then there is an analogous story to the one told above that may be used to construct the Greenberg Transform and its left adjoint h. It is essentially the same construction, save that now we take products of the $W_n\mathcal{O}_X$ sheaves to build up the Eisenstein equation of ramification termwise, and then build up our Witt Vector structure with these ramifications in mind. Explicit details may be found in [1], but will not be particularly relevant for this paper, save for in providing intuition for the explicit case to which we wish to apply our results.

We will now codify some notation useful as we proceed in the section. In particular, let C and D be subcategories of the category LRS of locally ringed spaces or slice categories of subcategories of LRS.

Definition 1. A functor F : C → D is said to be geometrically adhesive if, whenever $U = (|U|, \mathcal{O}_U) \in \operatorname{Ob} C/X$ has a collection of open subobjects $\{V_j \mid j \in J\}$ with

\[ \bigcup_{j \in J} V_j = V, \]

then

\[ F\Big(\bigcup_{j \in J} V_j\Big) = \bigcup_{j \in J} F V_j. \]

Remark 3. For the reader familiar with the work of Lack, Cockett, et al. (cf. [8], [9], [10], amongst others), one may notice the similarity of a join restriction functor between join restriction categories, or perhaps more readily with adhesive functors between adhesive categories (cf. [17], for instance), and our given definition of a geometrically adhesive functor above. There are similarities, especially in the fact that these are all getting towards some sort of manifold-type construction and intuition, but we have given the definition above for the following reasons:

• The definition of a geometrically adhesive functor is intimately tied to the sheaves that come equipped with a locally ringed space, and geometrically adhesive functors are built to bring this perspective to mind and to task.
• We are not giving any extra attention to restriction or join properties of the categories C or D, if nontrivial versions of said structures even exist, save for potentially in incidental circumstances.
• The definition of join restriction functors is tied to the abstract differential geometry of Cartesian differential categories. Because we are focusing only on the algebro-geometric aspects induced by these functors, we provide an algebraic dual to the more analytic perspective afforded by join restriction functors.
• These are different from the adhesive functors of Lack (save for perhaps morally) in the following way: A geometrically adhesive functor only asks to preserve gluings of open subobjects between arbitrary categories of locally ringed spaces, while adhesive functors in the sense of Lack are functors between adhesive categories that preserve pushouts against arbitrary monomorphisms. In particular, Proposition 3 shows that in general these are different notions.
• Proposition 3 shows, in addition to differentiating adhesive functors from geometrically adhesive functors, that geometrically adhesive functors are not necessarily cocontinuous.

Remark 4. Not all functors are geometrically adhesive. For instance, consider the functor Γ : Sch → Sch given by $\Gamma(X) := \operatorname{Spec}(\mathcal{O}_X(X))$. This functor is not geometrically adhesive because it destroys nonaffine schemes and their affine gluings. In particular, recall that X is affine if and only if $\Gamma(X) \cong X$; indeed, $\Gamma(\operatorname{Spec} A) = \operatorname{Spec}(\mathcal{O}_A(|\operatorname{Spec} A|)) = \operatorname{Spec} A$. To see that Γ is not geometrically adhesive, consider the scheme $\mathbb{P}^1_{\mathbb{Z}}$ with the gluing

\[
\begin{array}{ccccc}
 & & \mathbb{P}^1_{\mathbb{Z}} & & \\
 & \nearrow & & \nwarrow & \\
\mathbb{A}^1_{\mathbb{Z}} & & & & \mathbb{A}^1_{\mathbb{Z}} \\
\uparrow & & & & \uparrow \\
\operatorname{Spec} \mathbb{Z}[x, x^{-1}] & & \xleftrightarrow{\ \cong\ } & & \operatorname{Spec} \mathbb{Z}[y, y^{-1}]
\end{array}
\]

where the bottom isomorphism is induced by the Cring morphism $\varphi : \mathbb{Z}[x, x^{-1}] \to \mathbb{Z}[y, y^{-1}]$ given by $x \mapsto y^{-1}$. Then we find that

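\[ \Gamma(\mathbb{P}^1_{\mathbb{Z}}) = \operatorname{Spec}(\mathcal{O}_{\mathbb{P}^1_{\mathbb{Z}}}(|\mathbb{P}^1_{\mathbb{Z}}|)) = \operatorname{Spec} \mathbb{Z} \]

so taking Γ of the whole diagram gives the commuting diagram

\[
\begin{array}{ccccc}
 & & \operatorname{Spec} \mathbb{Z} & & \\
 & \nearrow{\scriptstyle \exists!} & & \nwarrow{\scriptstyle \exists!} & \\
\mathbb{A}^1_{\mathbb{Z}} & & & & \mathbb{A}^1_{\mathbb{Z}} \\
\uparrow & & & & \uparrow \\
\operatorname{Spec} \mathbb{Z}[x, x^{-1}] & & \xleftrightarrow{\ \cong\ } & & \operatorname{Spec} \mathbb{Z}[y, y^{-1}]
\end{array}
\]

which is not a gluing diagram because, amongst other reasons, $\mathbb{A}^1_{\mathbb{Z}}$ is not a subscheme of Spec Z.

Example 1. Let D be a subcategory of LRS such that Spec 0 ∈ Ob D. Then the functor F : C → D defined by F U := Spec 0 is geometrically adhesive.

Proposition 1. Adhesive functors need not reflect isomorphisms.

Proof. Let F be the functor given in the above example and let C := Sch. Consider the schemes $U_i, U_j := \mathbb{A}^1_{\mathbb{Z}}$ with gluing

\[ U_i \cup U_j = \tilde U := \mathbb{P}^1_{\mathbb{Z}} \]

and define U := Spec Z. We then calculate that

\[ F(U) = F(\operatorname{Spec} \mathbb{Z}) = \operatorname{Spec} 0 = F(\mathbb{A}^1_{\mathbb{Z}} \cup \mathbb{A}^1_{\mathbb{Z}}) = F(\mathbb{P}^1_{\mathbb{Z}}) = F(\tilde U), \]

but, of course, $U = \operatorname{Spec} \mathbb{Z} \not\cong \mathbb{P}^1_{\mathbb{Z}} = \tilde U$.

Proposition 2. Adhesive functors need not be faithful.

Proof. The "affinization" functor F above does the job for sufficiently chosen categories C of locally ringed spaces. In particular, take C = Sch.

Proposition 3. Geometrically adhesive functors need not be cocontinuous. In particular, geometrically adhesive functors need not preserve gluings along closed subobjects.

Proof. As above, the proof is by example. Consider the ring

\[ A := \{(a, f) \mid a \in \mathbb{Z}_p,\ f \in \mathbb{Z}_p[x],\ a \equiv f(0) \bmod p\} = \mathbb{Z}_p \times_{\mathbb{F}_p} \mathbb{Z}_p[x] \]

and define the set of objects

\[ C_0 := \{\operatorname{Spec} A,\ \operatorname{Spec} \mathbb{Z}_p,\ \operatorname{Spec} \mathbb{F}_p,\ \mathbb{A}^1_{\mathbb{Z}_p},\ \mathbb{P}^1_{\mathbb{Z}_p},\ \operatorname{Spec} \mathbb{Z}_p[x, x^{-1}]\} \]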

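with morphism set

\[ C_1 := \{f \mid \exists\, X, Y \in C_0 .\ f \in \mathrm{Sch}(X, Y)\} \]

and take C to be the category $(C_0, C_1)$. Note that the only object with nontrivial open subobjects is $\mathbb{P}^1_{\mathbb{Z}_p}$, which is the gluing of two copies of $\mathbb{A}^1_{\mathbb{Z}_p}$ along the isomorphism $\operatorname{Spec} \mathbb{Z}_p[x, x^{-1}] \to \operatorname{Spec} \mathbb{Z}_p[x, x^{-1}]$ given by $x \mapsto x^{-1}$. Furthermore, observe that the object Spec A is a gluing along a closed subscheme by an argument of [22], i.e., Spec A is the pushout

\[
\begin{array}{ccc}
\operatorname{Spec} \mathbb{F}_p & \xrightarrow{\ s\ } & \operatorname{Spec} \mathbb{Z}_p \\
{\scriptstyle (p,x)}\downarrow & & \downarrow{\scriptstyle \iota_1} \\
\mathbb{A}^1_{\mathbb{Z}_p} & \xrightarrow{\ \iota_2\ } & \operatorname{Spec} A
\end{array}
\]

where the closed immersion $\operatorname{Spec} \mathbb{F}_p \to \operatorname{Spec} \mathbb{Z}_p$ picks out the special fibre and the closed immersion $\operatorname{Spec} \mathbb{F}_p \to \mathbb{A}^1_{\mathbb{Z}_p}$ picks out the closed point (p, x). Define a functor F : C → Sch as follows: If $X \in C_0$, define

\[
F X := \begin{cases} \operatorname{Spec} \mathbb{Z}_p & \text{if } X = \operatorname{Spec} \mathbb{F}_p; \\ X & \text{else} \end{cases}
\]

and if $f \in C_1$, define

\[
F(f) := \begin{cases}
f & \text{if } \operatorname{Dom} f \neq \operatorname{Spec} \mathbb{F}_p; \\
\mathrm{id}_{\operatorname{Spec} \mathbb{Z}_p} & \text{if } f = \mathrm{id}_{\operatorname{Spec} \mathbb{F}_p}; \\
\varphi & \text{if } \operatorname{Dom} f = \operatorname{Spec} \mathbb{F}_p,\ f = \varphi \circ s,\ \varphi : \operatorname{Spec} \mathbb{Z}_p \to X,\ X \neq \operatorname{Spec} \mathbb{F}_p;
\end{cases}
\]

this fully defines the functor because for any $X \in C_0$, if $X \neq \operatorname{Spec} \mathbb{F}_p$ then any map $f : \operatorname{Spec} \mathbb{F}_p \to X$ factors through the special fibre of $\operatorname{Spec} \mathbb{Z}_p$, i.e., there exists a morphism $\varphi : \operatorname{Spec} \mathbb{Z}_p \to X$ making the triangle

\[ f = \varphi \circ s : \operatorname{Spec} \mathbb{F}_p \xrightarrow{\ s\ } \operatorname{Spec} \mathbb{Z}_p \xrightarrow{\ \varphi\ } X \]

commute in Sch. In particular, with this the verification that F is a functor is trivial and omitted. The functor F preserves the gluing of $\mathbb{P}^1_{\mathbb{Z}_p}$ by construction, and since this is the only nontrivial open subobject gluing, this shows that F is geometrically adhesive. On the other hand, F sends the pushout diagram defining Spec A to the diagram, where $x : \operatorname{Spec} \mathbb{Z}_p \to \mathbb{A}^1_{\mathbb{Z}_p}$ is the spectrum of the map $\mathbb{Z}_p[x] \to \mathbb{Z}_p$ corresponding to evaluation at x = 0,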

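\[
\begin{array}{ccc}
\operatorname{Spec} \mathbb{Z}_p & = & \operatorname{Spec} \mathbb{Z}_p \\
{\scriptstyle x}\downarrow & & \downarrow{\scriptstyle \iota_1} \\
\mathbb{A}^1_{\mathbb{Z}_p} & \xrightarrow{\ \iota_2\ } & \operatorname{Spec} A
\end{array}
\]

which is evidently not a pushout in Sch; explicitly, the pushout of $\operatorname{Spec} \mathbb{Z}_p \xleftarrow{=} \operatorname{Spec} \mathbb{Z}_p \xrightarrow{x} \mathbb{A}^1_{\mathbb{Z}_p}$ is Spec B, where B is the ring

\[ B = \{(a, f) \mid a \in \mathbb{Z}_p,\ f \in \mathbb{Z}_p[x],\ f(0) = a\} = \mathbb{Z}_p \times_{\mathbb{Z}_p} \mathbb{Z}_p[x] \cong \mathbb{Z}_p[x]. \]

But then F does not preserve the closed gluing of Spec A and does not preserve the pushout diagram. This proves that geometrically adhesive functors need not be cocontinuous or preserve gluings along closed subobjects.

Proposition 4. Adhesive functors need not be full.

Proof. Consider the functor h extended to all schemes X via the same assignment, i.e., via $hX = (|hX|, h\mathcal{O}_X)$ where $|hX| = |X|$ and the sheaf $h\mathcal{O}_X$ is defined by, for all U ⊆ |X| open, $h\mathcal{O}_X(U) = W(\mathcal{O}_X(U))$, where W is the p-typical Witt vector functor à la Borger; see [4] and [5] for details. A routine calculation (when we identify FSch/Spec R as the ind-closure of Sch/Spec R and then give all affine schemes the trivial filtration) shows that

\[ \mathrm{FSch}(\operatorname{Spec} \mathbb{Z}_p, \operatorname{Spec} \mathbb{F}_p) \cong \mathrm{Cring}(\mathbb{F}_p, \mathbb{Z}_p) = \emptyset. \]

However, since there is a unique continuous map τ : {η, s} → {∗}, it is the case that we could have a sheaf morphism $\tau^\sharp : h\mathcal{O}_{\mathbb{F}_p} \to \tau_* h\mathcal{O}_{\mathbb{Z}_p}$. Explicitly we note that

\[
h\mathcal{O}_{\mathbb{Z}_p}(U) = W(\mathcal{O}_{\mathbb{Z}_p}(U)) = \begin{cases}
\mathbb{Z}_p \oplus \mathfrak{a} & \text{if } U = |\operatorname{Spec} \mathbb{Z}_p|; \\
\mathbb{Q}_p^{\mathbb{N}} & \text{if } U = \{\eta\}; \\
0 & \text{if } U = \emptyset
\end{cases}
\]

where $\mathfrak{a} = \prod_{n \ge 1} p^n \mathbb{Z}_p$ and the multiplication in $\mathbb{Z}_p \oplus \mathfrak{a}$ is by the rule

\[ (a, x)(b, y) = (ab,\ ay + bx + xy). \]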
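The multiplication rule (a, x)(b, y) = (ab, ay + bx + xy) can be sanity-checked mechanically. The sketch below is illustrative only: ordinary Python integers stand in for elements of Z_p and for a single component of the second summand (an assumption made purely so that the arithmetic is exact), and the function names are hypothetical. It checks on sample values that (1, 0) acts as the identity, that the product is associative, and that a ↦ (a, 0) is multiplicative.

```python
from itertools import product


def mul(u, v):
    """Multiply pairs by the rule (a, x)(b, y) = (ab, ay + bx + xy)."""
    a, x = u
    b, y = v
    return (a * b, a * y + b * x + x * y)


def phi(a):
    """The map a |-> (a, 0)."""
    return (a, 0)


ONE = (1, 0)
# Sample pairs with small integer entries (integers are a stand-in here).
samples = [(a, x) for a, x in product(range(-3, 4), repeat=2)]

identity_ok = all(mul(ONE, u) == u == mul(u, ONE) for u in samples)
assoc_ok = all(
    mul(mul(u, v), w) == mul(u, mul(v, w))
    for u in samples[::5]
    for v in samples[::7]
    for w in samples[::3]
)
phi_ok = all(
    phi(a * b) == mul(phi(a), phi(b))
    for a in range(-5, 6)
    for b in range(-5, 6)
)
print(identity_ok, assoc_ok, phi_ok)
```

One way to see why the rule is associative: the assignment (a, x) ↦ (a, a + x) turns it into componentwise multiplication, since ab + ay + bx + xy = (a + x)(b + y).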

The morphism $\varphi : \mathbb{Z}_p \to \mathbb{Z}_p \oplus \mathfrak{a}$ given by $a \mapsto (a, 0)$ is a ring homomorphism. On the other hand we see that for all $U \subseteq |\operatorname{Spec} \mathbb{F}_p|$ open,

\[
h\mathcal{O}_{\mathbb{F}_p}(U) = W(\mathcal{O}_{\mathbb{F}_p}(U)) = \begin{cases} \mathbb{Z}_p & \text{if } U = \{*\}; \\ 0 & \text{if } U = \emptyset. \end{cases}
\]

Then we can define a sheaf morphism $\tau^\sharp : h\mathcal{O}_{\mathbb{F}_p} \to \tau_* h\mathcal{O}_{\mathbb{Z}_p}$ because such a morphism of locally ringed spaces only sees global sections and the diagram

\[
\begin{array}{ccc}
\mathbb{Z}_p & \xrightarrow{\ \varphi\ } & \mathbb{Z}_p \oplus \mathfrak{a} \\
{\scriptstyle \exists!}\downarrow & & \downarrow{\scriptstyle \exists!} \\
0 & \xrightarrow{\ \exists!\ } & 0
\end{array}
\]

commutes; moreover, it is not hard to see that this is the only such sheaf map between $h\mathcal{O}_{\mathbb{F}_p}$ and $\tau_* h\mathcal{O}_{\mathbb{Z}_p}$. Thus we find that

\[ \mathrm{FSch}(h \operatorname{Spec} \mathbb{Z}_p, h \operatorname{Spec} \mathbb{F}_p) = \{(\tau, \tau^\sharp)\} \neq \emptyset = \mathrm{Sch}(\operatorname{Spec} \mathbb{Z}_p, \operatorname{Spec} \mathbb{F}_p). \]

This shows that h is not full when extended to a functor h : Sch → FSch.

We now move from these definitions and remarks to give some properties of geometrically adhesive functors. In particular we will show that if U is an open subobject of V in C (that is, there is an open immersion i : U → V) and if F : C → D is geometrically adhesive, then F U is an open subobject of F V (more explicitly, the map F i : F U → F V is an open immersion).

Proposition 5. Let i : U → V be an open immersion in C and assume that F : C → D is geometrically adhesive. Then F i : F U → F V is an open immersion.

Proof. Begin by recalling that the fact that F is geometrically adhesive implies that if $\{V_i \to V \mid i \in I\}$ is an open covering of V, then

\[ F V = F\Big(\bigcup_{i \in I} V_i\Big) = \bigcup_{i \in I} F V_i. \]

Thus it follows that each $F V_i$ remains a subobject of F V, and hence we conclude that $|F V'| \subseteq |F V|$ whenever $V'$ is an open subobject of V. Therefore we have that F U is a subobject of F V. Finally, to see that F U is an open subobject of F V, we note that since F is a functor between (slice) categories of locally ringed spaces, F takes sheaves to sheaves and hence is at least Zariski continuous. Now consider the map i : U → V and observe that we can deduce the following from the functoriality of F and the pushforward/pullback adjunction $i^{-1} \dashv i_* : \operatorname{Shv}(V) \to \operatorname{Shv}(U)$. First, observe that since i is an open immersion, there is an isomorphism of U-sheaves $i^\flat : i^{-1}\mathcal{O}_V \cong \mathcal{O}_U$; pushing this through the adjunction gives rise to the sheaf morphism

\[ i^\sharp : \mathcal{O}_V \to i_* \mathcal{O}_U. \]

Applying the functor F on sheaves then gives the sheaf morphism $F(i^\sharp) : F\mathcal{O}_V \to (F i)_* F\mathcal{O}_U$, which is equivalent by the functoriality of F to the sheaf morphism $F(i)^\sharp : \mathcal{O}_{FV} \to F(i)_* \mathcal{O}_{FU}$. Passing this through the adjunction $F(i)^{-1} \dashv F(i)_*$ sends the morphism $F(i)^\sharp$ to the map $F(i)^\flat : F(i)^{-1}\mathcal{O}_{FV} \to \mathcal{O}_{FU}$, which is equivalent, again by the functoriality of F, to the map $F(i^\flat) : (F i)^{-1} F\mathcal{O}_V \to F\mathcal{O}_U$. Now, because $i^\flat$ is an isomorphism it follows that $F(i^\flat) = F(i)^\flat$ is as well. Therefore $(F i)^{-1}\mathcal{O}_{FV} \to \mathcal{O}_{FU}$ is an isomorphism of F U-sheaves, and hence the map F i : F U → F V is an open immersion. This completes the proof of the proposition.

We now proceed to show that geometrically adhesive functors preserve pullbacks defined by open gluings, i.e., that such functors preserve intersections. This will be necessary because it will allow us to show that geometrically adhesive functors preserve pullbacks along covers induced by pretopologies. As we proceed with this, we assume the following:

1. The category C has the property that if U and V are objects of C, then the union space U ∪ V and intersection space U ∩ V exist in C. In this way we can regard any two objects as open subobjects of some larger object, and we can view their intersection as an open subspace of U, V, and U ∪ V.

Theorem 2.1 If U and V are objects in C and F : C → D is geometrically adhesive, then F(U ∩ V) = F U ∩ F V.

Proof. Let U and V be objects of C and assume that there are open immersions U, V → W, for some W ∈ Ob C. However, since U and V are glued along their intersection, it suffices to prove the theorem for the locally ringed space

\[ W := U \coprod_{U \cap V} V = U \cup V, \]

where the pushout $U \coprod_{U \cap V} V$ is the gluing of U and V along the subspace U ∩ V; note also that in this definition, since U and V are open in W, U ∩ V is open and the immersion U ∩ V → W factors as

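\[ U \cap V \xrightarrow{\ i_{U \cap V, U}\ } U \xrightarrow{\ i_{U,W}\ } W \qquad\text{and}\qquad U \cap V \xrightarrow{\ i_{U \cap V, V}\ } V \xrightarrow{\ i_{V,W}\ } W, \]

where each of the i morphisms is an open immersion. We will first show that the canonical map F(U ∩ V) → F U ∩ F V is an open immersion. Begin with the observations that

\[ F W = F(U \cup V) = F U \cup F V = F U \coprod_{F U \cap F V} F V \]

and that U ∩ V is the pullback

\[
\begin{array}{ccc}
U \cap V & \xrightarrow{\ i_{U \cap V, V}\ } & V \\
{\scriptstyle i_{U \cap V, U}}\downarrow & & \downarrow{\scriptstyle i_{V,W}} \\
U & \xrightarrow{\ i_{U,W}\ } & W
\end{array}
\]

in C. Thus we get that F U ∩ F V is the pullback

\[
\begin{array}{ccc}
F U \cap F V & \xrightarrow{\ i_{F U \cap F V, F V}\ } & F V \\
{\scriptstyle i_{F U \cap F V, F U}}\downarrow & & \downarrow{\scriptstyle i_{F V, F W}} \\
F U & \xrightarrow{\ i_{F U, F W}\ } & F W
\end{array}
\]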

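and so applying F gives rise to a unique morphism θ : F(U ∩ V) → F U ∩ F V making the diagram

[Diagram: the pullback square of $i_{FU,FW}$ and $i_{FV,FW}$ with vertex $FU \cap FV$, together with the morphisms $F i_{U \cap V, U} : F(U \cap V) \to FU$ and $F i_{U \cap V, V} : F(U \cap V) \to FV$ and the induced $\exists!\,\theta : F(U \cap V) \to FU \cap FV$.]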

commute in D/Y. Moreover, since F preserves open immersions by Proposition 5, the morphisms $F i_{U \cap V, V}$ and $F i_{U \cap V, U}$ are both open immersions in D/Y. Thus, from the equations

\[ i_{FU \cap FV, FU} \circ \theta = F i_{U \cap V, U} \qquad\text{and}\qquad i_{FU \cap FV, FV} \circ \theta = F i_{U \cap V, V} \]

it follows that θ is both monic and open, and hence an open immersion. We will now show that

\[ F U \coprod_{F U \cap F V} F V = F U \cup F V = F W = F U \coprod_{F(U \cap V)} F V, \]

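as this will imply that F(U ∩ V) ≅ F U ∩ F V; the fact that F(U ∩ V) is an open subobject of F U ∩ F V will then complete the proof. To do this assume that there is an object S of D/Y and that there are morphisms ϕ : F U → S and ψ : F V → S such that the rectangle

\[
\begin{array}{ccc}
F U & \xrightarrow{\ \varphi\ } & S \\
{\scriptstyle i_{FU \cap FV, FU}}\uparrow & & \uparrow{\scriptstyle \psi} \\
F U \cap F V & \xrightarrow{\ i_{FU \cap FV, FV}\ } & F V
\end{array}
\]

commutes. Then there exists a unique morphism ρ : F W → S making the diagram

[Diagram: the pushout square formed by $i_{FU \cap FV, FU}$, $i_{FU \cap FV, FV}$, $i_{FU,FW}$ and $i_{FV,FW}$, together with $\varphi : FU \to S$, $\psi : FV \to S$ and the induced $\exists!\,\rho : FW \to S$.]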

commutes. Now observe that the open immersion $i_{F(U \cap V), FU \cap FV} : F(U \cap V) \to FU \cap FV$ makes the diagram

[Diagram: the previous diagram with $\varphi$, $\psi$ and $\rho : FW \to S$, extended by $F(U \cap V) \to FU \cap FV$ and by the composites $i_{F(U \cap V), FU} : F(U \cap V) \to FU$ and $i_{F(U \cap V), FV} : F(U \cap V) \to FV$.]

commute, where the arrow F(U ∩ V) → F U ∩ F V is the open immersion $i_{F(U \cap V), FU \cap FV}$; thus, using that

\[ i_{F(U \cap V), FU} = i_{FU \cap FV, FU} \circ i_{F(U \cap V), FU \cap FV} \qquad\text{and}\qquad i_{F(U \cap V), FV} = i_{FU \cap FV, FV} \circ i_{F(U \cap V), FU \cap FV} \]

as open immersions, we have that the diagram

[Diagram: the square formed by $i_{F(U \cap V), FU}$, $i_{F(U \cap V), FV}$, $i_{FU,FW}$ and $i_{FV,FW}$, together with $\varphi : FU \to S$, $\psi : FV \to S$ and $\rho : FW \to S$.]

commutes as well. Thus, to show that $F W = F U \coprod_{F(U \cap V)} F V$ as well, we need only show that ρ is the unique such map making this diagram commute. With this in mind, assume that there exists a morphism σ : F W → S making the diagram

[Diagram: the square formed by $i_{F(U \cap V), FU}$, $i_{F(U \cap V), FV}$, $i_{FU,FW}$ and $i_{FV,FW}$, together with $\varphi : FU \to S$, $\psi : FV \to S$ and $\sigma : FW \to S$.]

commute as well. But then we derive that the diagram

[Diagram: the square formed by $i_{FU,FW}$ and $i_{FV,FW}$ over both $FU \cap FV$ and $F(U \cap V)$, with the immersions $i_{FU \cap FV, FU}$, $i_{FU \cap FV, FV}$, $i_{F(U \cap V), FU}$, $i_{F(U \cap V), FV}$, together with $\varphi : FU \to S$, $\psi : FV \to S$ and the parallel pair $\rho, \sigma : FW \to S$.]

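commutes with the equations $\sigma \circ i_{FU,FW} = \varphi$ and $\sigma \circ i_{FV,FW} = \psi$ holding by assumption. But then since $\varphi \circ i_{FU \cap FV, FU} = \psi \circ i_{FU \cap FV, FV}$ it follows that the diagram

[Diagram: the pushout square formed by $i_{FU \cap FV, FU}$, $i_{FU \cap FV, FV}$, $i_{FU,FW}$ and $i_{FV,FW}$, together with $\varphi : FU \to S$, $\psi : FV \to S$ and the parallel pair $\rho, \sigma : FW \to S$.]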

commutes as well. Thus the universal property of F W as the pushout $F U \coprod_{F U \cap F V} F V$ gives that ρ = σ and so there is a unique morphism ρ : F W → S making the diagram

[Diagram: the square formed by $i_{F(U \cap V), FU}$, $i_{F(U \cap V), FV}$, $i_{FU,FW}$ and $i_{FV,FW}$, together with $\varphi : FU \to S$, $\psi : FV \to S$ and $\exists!\,\rho : FW \to S$.]

commute. This shows that if P is the pushout

\[ P := F U \coprod_{F(U \cap V)} F V \]

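then F W is a subobject of P. To prove that P is a subobject of F W, assume that there exist morphisms ϕ : F U → S and ψ : F V → S making the diagram

\[
\begin{array}{ccc}
F U & \xrightarrow{\ \varphi\ } & S \\
{\scriptstyle i_{F(U \cap V), FU}}\uparrow & & \uparrow{\scriptstyle \psi} \\
F(U \cap V) & \xrightarrow{\ i_{F(U \cap V), FV}\ } & F V
\end{array}
\]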

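commute. Now let ρ : P → S be the unique map out of the pushout making

\[
\begin{tikzcd}[column sep=large]
FU \arrow[r, "\iota_1"] \arrow[rr, bend left=30, "\varphi"] & P \arrow[r, dashed, "\exists! \rho"] & S \\
F(U \cap V) \arrow[u, "i_{F(U \cap V),FU}"] \arrow[r, "i_{F(U \cap V),FV}"'] & FV \arrow[u, "\iota_2"'] \arrow[ur, bend right=20, "\psi"'] &
\end{tikzcd}
\]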

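commute. Factorizing the morphisms i_{F(U∩V),FU} and i_{F(U∩V),FV} through F U ∩ F V then allows us to produce the commuting diagram:

\[
\begin{tikzcd}[column sep=large]
FU \arrow[rr, "\iota_1"] \arrow[rrr, bend left=35, "\varphi"] & & P \arrow[r, "\rho"] & S \\
& FU \cap FV \arrow[ul, "i_{FU \cap FV,FU}"'] \arrow[dr, "i_{FU \cap FV,FV}"] & & \\
F(U \cap V) \arrow[uu, "i_{F(U \cap V),FU}"] \arrow[ur] \arrow[rr, "i_{F(U \cap V),FV}"'] & & FV \arrow[uu, "\iota_2"'] \arrow[uur, bend right=25, "\psi"'] &
\end{tikzcd}
\]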

Observe that because P is the pushout of F U and F V over the open subobject F(U ∩ V), ι_1 and ι_2 are both open immersions as well. Furthermore, since F(U ∩ V) is an open subobject of F U ∩ F V and since the morphisms i_{FU∩FV,FU} and i_{FU∩FV,FV} are both open immersions as well, we have that ι_1 ◦ i_{FU∩FV,FU} = ι_2 ◦ i_{FU∩FV,FV}.


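We then compute that

\[
\varphi \circ i_{FU \cap FV,FU} = \rho \circ \iota_1 \circ i_{FU \cap FV,FU} = \rho \circ \iota_2 \circ i_{FU \cap FV,FV} = \psi \circ i_{FU \cap FV,FV}
\]

which shows that the diagram

\[
\begin{tikzcd}[column sep=large]
FU \arrow[r, "\iota_1"] \arrow[rr, bend left=30, "\varphi"] & P \arrow[r, "\rho"] & S \\
FU \cap FV \arrow[u, "i_{FU \cap FV,FU}"] \arrow[r, "i_{FU \cap FV,FV}"'] & FV \arrow[u, "\iota_2"'] \arrow[ur, bend right=20, "\psi"'] &
\end{tikzcd}
\]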

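commutes. To see that this does so universally through ρ, assume that there exists a σ : P → S such that

\[
\begin{tikzcd}[column sep=large]
FU \arrow[r, "\iota_1"] \arrow[rr, bend left=30, "\varphi"] & P \arrow[r, "\sigma"] & S \\
FU \cap FV \arrow[u, "i_{FU \cap FV,FU}"] \arrow[r, "i_{FU \cap FV,FV}"'] & FV \arrow[u, "\iota_2"'] \arrow[ur, bend right=20, "\psi"'] &
\end{tikzcd}
\]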

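commutes. However, consider now the diagram

\[
\begin{tikzcd}[column sep=large]
FU \arrow[rr, "\iota_1"] \arrow[rrr, bend left=35, "\varphi"] & & P \arrow[r, "\sigma"] & S \\
& FU \cap FV \arrow[ul, "i_{FU \cap FV,FU}"'] \arrow[dr, "i_{FU \cap FV,FV}"] & & \\
F(U \cap V) \arrow[uu, "i_{F(U \cap V),FU}"] \arrow[ur] \arrow[rr, "i_{F(U \cap V),FV}"'] & & FV \arrow[uu, "\iota_2"'] \arrow[uur, bend right=25, "\psi"'] &
\end{tikzcd}
\]

and note that this composite implies that

\[
\begin{tikzcd}[column sep=large]
FU \arrow[r, "\iota_1"] \arrow[rr, bend left=30, "\varphi"] & P \arrow[r, "\sigma"] & S \\
F(U \cap V) \arrow[u, "i_{F(U \cap V),FU}"] \arrow[r, "i_{F(U \cap V),FV}"'] & FV \arrow[u, "\iota_2"'] \arrow[ur, bend right=20, "\psi"'] &
\end{tikzcd}
\]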

commutes. Using the universal property of the pushout P then gives us that σ = ρ and hence shows that P is a subobject of F W by a factorization of the universal property.

Now that we have shown that F W is a subobject of F U ⊔_{F(U∩V)} F V and F U ⊔_{F(U∩V)} F V is a subobject of F W, it follows that

\[
FU \sqcup_{FU \cap FV} FV = FU \cup FV = FW = FU \sqcup_{F(U \cap V)} FV
\]

and so we derive from this that F U ∩ F V ≅ F(U ∩ V); however, since the map i_{F(U∩V),FU∩FV} : F(U ∩ V) → F U ∩ F V is an open immersion, it follows that F U ∩ F V = F(U ∩ V) and we are done.

Proposition 6. Let R be a ring object in LRS. Then the functor h_R : 𝒞 → LRS defined by, for a locally ringed space X = (|X|, O_X), |h_R X| = |X| and, for opens U ⊆ |X|, O_{h_R X}(U) := 𝒞(U, R) is geometrically adhesive.

Proof. The verification that this satisfies the topological side of the definition is trivial, while the sheaf-theoretic side follows from a standard gluing argument.

Corollary 2. If Gr : FSch/Spec R → Sch/Spec k is the Greenberg transform, then the left adjoint (h ⊣ Gr_R) : Sch/Spec k → FSch/Spec R is geometrically adhesive.

Proposition 7. If F : 𝒞 → 𝒟 and G : 𝒟 → 𝒜 are geometrically adhesive, then G ◦ F : 𝒞 → 𝒜 is geometrically adhesive.

Proof. Immediate from the calculation

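\[
(G \circ F)\Bigl(\bigcup_{i \in I} U_i\Bigr)
= G\Bigl(F\Bigl(\bigcup_{i \in I} U_i\Bigr)\Bigr)
= G\Bigl(\bigcup_{i \in I} F U_i\Bigr)
= \bigcup_{i \in I} G(F U_i)
= \bigcup_{i \in I} (G \circ F)(U_i).
\]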

3 The Adhesive Site

From the definition of geometrically adhesive functors and Proposition 5, we see that geometrically adhesive functors have good behaviour with respect to open gluings of schemes and open immersions of schemes. We would now like to define a topology on the codomain of the functor F : 𝒞 → 𝒟 so that the gluing condition

\[
F\Bigl(\bigcup_{i \in I} U_i\Bigr) \cong \bigcup_{i \in I} F U_i
\]

becomes instead an intrinsic property of the functor preserving coverages that are allowed to occur with respect to whatever topology on 𝒞 we have at hand. In particular, we will show that under the construction of this topology, the Greenberg functor h is site-cocontinuous (cf. Corollary 3).

As we proceed, we recall briefly the notion of a sieve on a category. A sieve S on an object U of 𝒞 is a subfunctor of the representable functor on U, i.e., a monomorphism S → 𝒞(−, U) in the presheaf topos [𝒞^op, Set]. Results from classical Grothendieck topos theory (cf. [14]) show that a Grothendieck topology is completely and unambiguously determined by its covering sieves. Thus we will work with sieves on the highest level possible, and specialize to working with coverages in the sense of Grothendieck pretopologies when we need to work with explicit covers.

Begin by letting C := {ϕ_i : U_i → U | i ∈ I} be a set of covering morphisms on an object U in 𝒞. We then define the sieve generated by C to be the functor (C) : 𝒞^op → Set where first we define a set (C) := {ϕ_i ◦ ψ | ϕ_i ∈ C, Dom ϕ_i = Codom ψ} and then define the action of (C) on any object X of 𝒞 via (C)(X) := {θ ∈ (C) | Dom θ = X}, while the action of (C) on morphisms is simply by precomposition. It is readily checked that (C) defines a sieve on U. In a similar vein, if ℭ is a collection of sieves we wish to make into covering sieves on a category, we write ⟨C | C ∈ ℭ⟩


for the Grothendieck topology generated by the sieves in this collection, i.e., for the minimal Grothendieck topology J on 𝒞 which contains all of these sieves as covering sieves. We will use both of these notions to define a topology on 𝒟 in terms of a given topology on 𝒞.

Definition 2. Let J be a fixed Grothendieck topology on a category of locally ringed spaces 𝒞 and let F : 𝒞 → 𝒟 be a geometrically adhesive functor where 𝒟 is also a category of locally ringed spaces. We then generate a topology A^J_F on 𝒟 by taking

A^J_F := ⟨(F C) | C is a J-covering sieve⟩

where, if C is a sieve on U ∈ Ob 𝒞, F C := {F ϕ : F V → F U | ϕ ∈ C}. The topology A^J_F is then called the J-F-geometrically adhesive site on 𝒟.

Remark 5. If the functor F and the site J are clear from context, we will simply refer to the site A^J_F as the geometrically adhesive site on 𝒟 instead of the J-F-geometrically adhesive site on 𝒟.

Lemma 1. If F : 𝒞 → 𝒟 is a geometrically adhesive functor and if (𝒞, J) is a site, then the functor F : (𝒞, J) → (𝒟, A^J_F) is cover reflecting.

Proof. This is immediate from construction. Since the topology A^J_F is minimally generated by F J, a subfunctor S → 𝒟(−, F U) is an A^J_F-covering sieve on F U if and only if there exists an R ∈ J(U) such that F R ⊆ S. However, this is what it means to be cover reflecting.

Corollary 3. The Greenberg functor h : Sch/Spec k → FSch/Spec R is site-cocontinuous.

Proof. From the discussion around the definition of what it means to be cover reflecting in [16], a functor of sites is cocontinuous if and only if it is cover reflecting.

The cover reflecting property of the A^J_F topology is a very practical one, as it allows us to prove an intuitive result: if a pretopology τ generates the topology J on 𝒞, then F τ generates A^J_F. However, we cannot in general say that there is a pretopology generated by F τ on 𝒟, as 𝒟 need not admit pullbacks.

Proposition 8. Let F : 𝒞 → 𝒟 be geometrically adhesive and assume that the site (𝒞, J) is generated by the pretopology τ and that 𝒟 admits pullbacks. Then the pretopology τ′ = ⟨F τ⟩ on 𝒟 generates A^J_F.

Proof. Observe that if 𝒟(F U, V) = ∅ = 𝒟(V, F U), then there is nothing to say, as the only covering sieves of V are trivial. Moreover, by the Cover


Reflecting Property it suffices to show that if R ∈ A^J_F(F U) is a covering sieve, then there exists a cover C ∈ J(U) for which F C ⊆ R, as we can refine along covers formed from F U_i's to collections that come from functorial images of J-covers through the functor F. To prove this let ρ be the maximal pretopology on 𝒟 which generates A^J_F and let {ψ_i : V_i → F U | i ∈ I} ∈ ρ(F U) be given such that {V_i → F U | i ∈ I} ⊆ R; such a cover exists because ρ generates A^J_F. Now, since A^J_F is cover reflecting, there exists a collection of morphisms {F ϕ_j : F U_j → F U | j ∈ J} which refines {ψ_i : V_i → F U | i ∈ I}. However, we also have that {F ϕ_j : F U_j → F U | j ∈ J} = F{ϕ_j : U_j → U | j ∈ J} and C := {ϕ_j : U_j → U | j ∈ J} ∈ τ(U). But then it follows that if S is the J-covering sieve generated by this cover, (F S) ⊆ R. Since the cover F C ∈ τ′(F U), we get that F C ⊆ (F S) ⊆ R and so it follows that τ′ generates A^J_F.

Remark 6. Note that the above proof does not show that the pretopology σ = ⟨F τ⟩ is the maximal pretopology generating A^J_F. I do not think that this will happen, in general, but this remains open.

Many of the results we will now give are presented in this section for the purpose of having tools to compute the fundamental groups of the sheaf toposes Shv(𝒟, A^J_F) so that we can have comparisons of the nature, for locally ringed spaces U ∈ Ob 𝒞 and a geometrically adhesive functor F : 𝒞 → 𝒟,

\[
\pi_1^{J}(U) \xrightarrow{\ ?\ } \pi_1^{A^J_F}(FU), \qquad \pi_1^{A^J_F}(FU) \xrightarrow{\ ?\ } \pi_1^{J}(U)
\]

at least in the case in which both U and F U are connected. We will use these in particular to study the adjunction h ⊣ Gr_R : Sch/Spec k → FSch/Spec R and to hopefully understand why, at least in the case in which J is the étale topology on Sch/Spec k, passing through the Greenberg Transform allows us to geometrize quasi-characters and see things over R that simply do not come from the étale site over Spec R; for details see [11] on the motivation for this idea.
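The sieve (C) generated by a cover, which the refinement argument in the proof of Proposition 8 relies on, can be unwound explicitly. The following is a minimal LaTeX sketch, assuming composites are written right to left; the morphism χ : Y → X is a hypothetical name introduced only for this illustration.

```latex
% Sketch: the sieve (C) generated by a cover C = {phi_i : U_i -> U | i in I}.
% Value of (C) on an object X of the category:
\[
  (C)(X) = \{\, \varphi_i \circ \psi \mid i \in I,\ \psi \colon X \to U_i \,\}
  \subseteq \mathcal{C}(X, U).
\]
% Closure under precomposition, for a hypothetical chi : Y -> X
% and an element theta = phi_i . psi of (C)(X):
\[
  (C)(\chi)(\theta) = \theta \circ \chi
  = \varphi_i \circ (\psi \circ \chi) \in (C)(Y).
\]
```

The second display is the only thing that needs checking to see that (C) is a subfunctor of 𝒞(−, U), which is the sense in which it is a sieve on U.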
In order to discuss how to move these results over, we need to prove two key lemmas which will allow us to reduce proving that a presheaf P : 𝒟^op → Set is an A^J_F-sheaf to checking that it satisfies the sheaf axiom on images of J-covers. This will then allow us to prove Lemma 3 below, which itself is essential for the proof of Theorem 3.1. As we proceed, we will assume the following: The categories 𝒞 and 𝒟 have pullbacks, and if J is a topology on 𝒞 generated by pretopology τ, then ρ = ⟨F τ⟩ is the pretopology on 𝒟 generated by τ. Note that by Proposition 8 ρ generates A^J_F, so to decide whether presheaves P on 𝒟 are A^J_F-sheaves it suffices to check on ρ-coverages by a standard result of site theory (cf. Proposition 1 of [19]).

Before proving the lemma below, we make one observation. Note that since A^J_F is generated by F J, any nontrivial cover of an object V of 𝒟 can be refined by some cover of the form {g_i : F U_i → V | i ∈ I}. Thus, when one considers the sheaf condition for a presheaf P on 𝒟, if V is an object of 𝒟 with 𝒟(F U, V) ≠ ∅ for some U in 𝒞, it suffices to consider refinements of {V_j → V | j ∈ J} of the form {F U_i → V | i ∈ I}.

Lemma 2. Let V be an object of 𝒟 such that there exists a cover D := {h_j : V_j → V | j ∈ J} ∈ ρ(V) for which there is a refinement {g_i : F U_i → V | i ∈ I} that makes {F U_i → F X | i ∈ I} the functorial image of a cover C := {U_i → X | i ∈ I}, where

\[
X := \bigcup_{i \in I} U_i.
\]

Then a presheaf P satisfies the sheaf axiom with respect to the cover D if and only if P satisfies the sheaf axiom with respect to F(C).

Proof. Let us begin by fixing some notation. The map

\[
e_V \colon P(V) \to \prod_{j \in J} P(V_j)
\]

is the pairing map e_V = ⟨P(h_j)⟩_{j∈J} (and analogously e_{FX} = ⟨P(ι_{FU_i})⟩_{i∈I}), while since {g_i : F U_i → V | i ∈ I} refines D, for each j ∈ J there exists an i ∈ I such that there is a factorization

\[
\begin{tikzcd}
FU_i \arrow[rr, "g_i"] \arrow[dr, "\varphi_{ij}"'] & & V \\
& V_j \arrow[ur, "h_j"'] &
\end{tikzcd}
\]

in 𝒟. There then is a map, for any presheaf P on 𝒟, α : ∏_{j∈J} P(V_j) → ∏_{i∈I} P(F U_i) which is given as follows: For each j ∈ J, find all i ∈ I for which there are factorizations as above, say indexed by the set I_j, and then construct the map

\[
\begin{tikzcd}[column sep=large]
\prod_{j \in J} P(V_j) \arrow[r, "\alpha_j"] \arrow[d, "\pi_j"'] & \prod_{i \in I_j} P(FU_i) \\
P(V_j) \arrow[ur, "\langle P(\varphi_{ij}) \rangle_{i \in I_j}"'] &
\end{tikzcd}
\]

in Set; the map α then takes the form α = ⟨α_j⟩_{j∈J}. In this same way we can get a map β : ∏_{j,j′∈J} P(V_j ×_V V_{j′}) → ∏_{i,i′∈I} P(F U_i ×_{FX} F U_{i′}) by taking β = ⟨P(ϕ_{ij} × ϕ_{i′j′})⟩.


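⟹: This direction is clear by performing a refinement argument and lifting any map f : Y → ∏_{i∈I} P(F U_i) along the P ϕ_{ij}, iterating through all the j ∈ J, and then using that this lift factors through both the equalizer e_V and the unique map P(g) : P V → P(F X).

⟸: We now assume that P satisfies the sheaf axiom with respect to the cover F(C). Now consider the commuting diagram

\[
\begin{tikzcd}
P(V) \arrow[r, "e_V"] \arrow[d, "P(g)"'] & \prod_{j \in J} P(V_j) \arrow[r, shift left] \arrow[r, shift right] \arrow[d, "\alpha"] & \prod_{j,j' \in J} P(V_j \times_V V_{j'}) \arrow[d, "\beta"] \\
P(FX) \arrow[r, "e_{FX}"'] & \prod_{i \in I} P(FU_i) \arrow[r, shift left] \arrow[r, shift right] & \prod_{i,i' \in I} P(FU_i \cap FU_{i'})
\end{tikzcd}
\]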

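where the maps α, β, e_V, and e_{FX} are defined as above, and e_{FX} is an equalizer by assumption of P satisfying the sheaf axiom on F C. Now suppose that there exists some set Z and a morphism f : Z → ∏_{j∈J} P(V_j) such that the diagram

\[
Z \xrightarrow{\ f\ } \prod_{j \in J} P(V_j) \rightrightarrows \prod_{j,j' \in J} P(V_j \times_V V_{j'})
\]

commutes in Set. It then follows by construction that the diagram

\[
Z \xrightarrow{\ \alpha \circ f\ } \prod_{i \in I} P(FU_i) \rightrightarrows \prod_{i,i' \in I} P(FU_i \times_{FX} FU_{i'})
\]

commutes in Set, so by the universal property of an equalizer there exists a unique function k : Z → P(F X) making the diagram

\[
\begin{tikzcd}
P(FX) \arrow[r, "e_{FX}"] & \prod_{i \in I} P(FU_i) \arrow[r, shift left] \arrow[r, shift right] & \prod_{i,i' \in I} P(FU_i \times_{FX} FU_{i'}) \\
Z \arrow[u, dashed, "\exists! k"] \arrow[ur, "\alpha \circ f"'] & &
\end{tikzcd}
\]

commute. This implies that in particular, the diagram

\[
\begin{tikzcd}
P(FX) \arrow[r, "e_{FX}"] & \prod_{i \in I} P(FU_i) \arrow[r, shift left] \arrow[r, shift right] & \prod_{i,i' \in I} P(FU_i \times_{FX} FU_{i'}) \\
Z \arrow[u, dashed, "\exists! k"] \arrow[r, "f"'] & \prod_{j \in J} P(V_j) \arrow[u, "\alpha"'] \arrow[r, shift left] \arrow[r, shift right] & \prod_{j,j' \in J} P(V_j \times_V V_{j'}) \arrow[u, "\beta"']
\end{tikzcd}
\]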

commutes. We now claim that k factors through P(V) and P(g). To see this note that since

\[
\alpha \circ f = e_{FX} \circ k = \langle P(\iota_{FU_i}) \rangle_{i \in I} \circ k = \langle P(\iota_{FU_i}) \circ k \rangle_{i \in I}
\]

it suffices to argue on the F U_i by virtue of the fact that F X = ⋃_{i∈I} F U_i and a colimit is determined by the maps out of its colimiting objects. Now, since for each j ∈ J we can find an i ∈ I such that g_i = h_j ◦ ϕ_{ij}, we need only make local gluing arguments. In particular, fix some s ∈ P(F X) and note that e_{FX}(s) = (s_i)_{i∈I} produces an element whose image in the product of pullback sections ∏_{i,i′∈I} P(F U_i ×_{FX} F U_{i′}) is the same under P applied to either pullback map; call this image

\[
t = P(\pi_1^{i,i'})(s_i)_{i \in I} = P(\pi_2^{i,i'})(s_i)_{i \in I} = (s_{i,i'})_{i,i' \in I}.
\]

However, from the refinement condition on covers and the commutativity of the diagram, we can find some section (v_j)_{j∈J} ∈ ∏_{j∈J} P(V_j) for which α(v_j)_{j∈J} = (s_i) and hence find some t′ = (v_{j,j′})_{j,j′∈J} ∈ ∏_{j,j′∈J} P(V_j ×_V V_{j′}) which maps through β to t. Explicitly,

\[
\beta(t') = \beta(v_{j,j'}) = (s_{i,i'})_{i,i' \in I} = t.
\]

This in turn allows us to conclude that

\[
P(\pi_1^{jj'})(v_j) = P(\pi_2^{jj'})(v_j)
\]

by applying β and then using the commutativity of the diagram. However, writing s ∈ P(F X) as s = k(z) for some z ∈ Z gives that

\[
\alpha(v_j)_{j \in J} = (\alpha \circ f)(z) = (e_{FX} \circ k)(z)
\]

and hence the section (s_i)_{i∈I} = e_{FX}(s) comes simultaneously from a uniquely given section over F X and a section on each of the V_j. Thus we can find some v ∈ P(V) for which P(g)(v) = s and e_V(v) = (v_j)_{j∈J}, by the geometry of the V_j and F X, and the choice of these v determines a function γ : Z → P(V). That is, we have a factorization

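\[
\begin{tikzcd}
Z \arrow[dr, "\gamma"] \arrow[drr, bend left, "k"] \arrow[ddr, bend right, "f"'] & & \\
& P(V) \arrow[r, "P(g)"] \arrow[d, "e_V"'] & P(FX) \arrow[d, "e_{FX}"] \\
& \prod_{j \in J} P(V_j) \arrow[r, "\alpha"'] & \prod_{i \in I} P(FU_i)
\end{tikzcd}
\]

which makes the diagram

\[
\begin{tikzcd}
P(V) \arrow[r, "e_V"] & \prod_{j \in J} P(V_j) \arrow[r, shift left] \arrow[r, shift right] & \prod_{j,j' \in J} P(V_j \times_V V_{j'}) \\
Z \arrow[u, dashed, "\exists \gamma"] \arrow[ur, "f"'] & &
\end{tikzcd}
\]

commute. However, this map γ can easily be seen to be unique as follows: If there exists a morphism δ : Z → P(V) giving the same factorization, the fact that e_{FX} is an equalizer gives that P(g) ◦ γ = k = P(g) ◦ δ; consequently, from the uniqueness of k and the fact that P(g) is determined based on the gluing data of the F U_i, it follows that γ = δ. However, this implies that the diagram

\[
\begin{tikzcd}
P(V) \arrow[r, "e_V"] & \prod_{j \in J} P(V_j) \arrow[r, shift left] \arrow[r, shift right] & \prod_{j,j' \in J} P(V_j \times_V V_{j'}) \\
Z \arrow[u, dashed, "\exists!"] \arrow[ur, "f"'] & &
\end{tikzcd}
\]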

commutes and so e_V is an equalizer, as was to be shown.

Remark 7. The above lemma shows that if V is an object of 𝒟 with V ≇ F X for all objects X of 𝒞, then we can characterize the A^J_F-covering sieves of V as follows: Assume that the J-cover {V_j → V | j ∈ J} of V is refined by {F X_i → V | i ∈ I} and assume that X is the gluing of the X_i so that F X is the gluing of the F X_i. Let ρ : F X → V be the canonical map. Then, for any covering sieve S containing the cover {V_j → V | j ∈ J}, we must be able to find a J-cover R of X which factors through covers on each of the X_i, and S must contain the set {ρ ◦ F ϕ | ϕ ∈ R}.

Corollary 4. Let ℱ : 𝒟^op → Set be a presheaf. Then ℱ is an A^J_F-sheaf if and only if for all U ∈ Ob 𝒞, the diagram

\[
\mathcal{F}(FU) \longrightarrow \prod_{i \in I} \mathcal{F}(FU_i) \rightrightarrows \prod_{i,j \in I} \mathcal{F}(FU_i \cap FU_j)
\]

is an equalizer for all J-covers {U_i → U | i ∈ I} of U.

Proof. For objects of 𝒟 that either receive morphisms from, or have morphisms into, objects F X, for X ∈ 𝒞_0, this follows from the above lemma. The remaining case follows from observing that if V ∈ 𝒟_0 with the relations 𝒟(V, F X) = ∅ = 𝒟(F X, V) for all X ∈ 𝒞_0, then the only A^J_F-covers on V are trivial. This in turn implies that any presheaf satisfies the sheaf axiom over V; combining this with the prior lemma gives the corollary.

Lemma 3. Let F : 𝒞 → 𝒟 be geometrically adhesive and let J be a Grothendieck topology on 𝒞. Then the functor F^* : Shv(𝒟, A^J_F) → [𝒞^op, Set] given by F^*(ℱ) := ℱ ◦ F factors as:

\[
\begin{tikzcd}
\mathrm{Shv}(\mathcal{D}, A^J_F) \arrow[rr, "F^*"] \arrow[dr, "F^*"'] & & {[\mathcal{C}^{\mathrm{op}}, \mathbf{Set}]} \\
& \mathrm{Shv}(\mathcal{C}, J) \arrow[ur, hook] &
\end{tikzcd}
\]

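Proof. Begin by observing that by Corollary 4 it suffices to prove that a presheaf on 𝒟 is in fact an A^J_F-sheaf by using the functorial images of J-covers. Let ℱ be any A^J_F-sheaf and let {ϕ_i : U_i → U | i ∈ I} be a J-covering in a pretopology τ generating J. Then consider the following diagram, where the equality between the second and third rows follows from the identity F(U_i ∩ U_j) = F U_i ∩ F U_j shown earlier:

\[
\begin{tikzcd}[column sep=huge]
(F^*\mathcal{F})(U) \arrow[r, "\langle (F^*\mathcal{F})(\varphi_i) \rangle_{i \in I}"] \arrow[d, equal] & \prod_{i \in I} (F^*\mathcal{F})(U_i) \arrow[r, shift left] \arrow[r, shift right] \arrow[d, equal] & \prod_{i,j \in I} (F^*\mathcal{F})(U_i \cap U_j) \arrow[d, equal] \\
\mathcal{F}(FU) \arrow[r, "\langle \mathcal{F}(F\varphi_i) \rangle_{i \in I}"] \arrow[d, equal] & \prod_{i \in I} \mathcal{F}(FU_i) \arrow[r, shift left] \arrow[r, shift right] \arrow[d, equal] & \prod_{i,j \in I} \mathcal{F}(F(U_i \cap U_j)) \arrow[d, equal] \\
\mathcal{F}(FU) \arrow[r, "\langle \mathcal{F}(F\varphi_i) \rangle_{i \in I}"] & \prod_{i \in I} \mathcal{F}(FU_i) \arrow[r, shift left] \arrow[r, shift right] & \prod_{i,j \in I} \mathcal{F}(F(U_i) \cap F(U_j))
\end{tikzcd}
\]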

Since A^J_F is generated by the pretopology ⟨F τ⟩ defined by the J-covers (cf. Proposition 8 above), it follows that {F ϕ_i : F U_i → F U | i ∈ I} generates and refines an A^J_F-cover; using that ℱ is an A^J_F-sheaf implies that the bottom row in the diagram is an equalizer and hence that the top is as well. This proves the lemma.
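Spelled out on elements, the equalizer condition used in the proof of Lemma 3 reads as follows. This is only a sketch: the names p^{ij}_1 and p^{ij}_2 for the two inclusions of U_i ∩ U_j are assumptions made for this illustration and do not appear in the text.

```latex
% Sketch: the sheaf axiom for F^*\mathcal{F} = \mathcal{F} \circ F on a J-cover
% \{\varphi_i : U_i \to U \mid i \in I\}. The inclusions
% p^{ij}_1 : U_i \cap U_j \to U_i and p^{ij}_2 : U_i \cap U_j \to U_j
% are hypothetical names introduced for this example.
\[
  (s_i)_{i \in I} \in \prod_{i \in I} \mathcal{F}(FU_i)
  \quad\text{with}\quad
  \mathcal{F}(Fp^{ij}_1)(s_i) = \mathcal{F}(Fp^{ij}_2)(s_j)
  \ \text{in}\
  \mathcal{F}\bigl(F(U_i \cap U_j)\bigr) = \mathcal{F}(FU_i \cap FU_j)
  \ \text{for all } i, j \in I
\]
\[
  \Longrightarrow\quad
  \exists!\, s \in \mathcal{F}(FU)
  \ \text{such that}\
  \mathcal{F}(F\varphi_i)(s) = s_i \ \text{for all } i \in I.
\]
```

Read this way, the top row of the diagram is an equalizer exactly because a compatible family for F^*ℱ on the cover of U is the same data as a compatible family for ℱ on the image cover of F U.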


Theorem 3.1 The functor F : 𝒞 → 𝒟 induces an essential geometric morphism, perversely also named F, F : Shv(𝒞, J) → Shv(𝒟, A^J_F).

Proof. We will prove that the functor F^* preserves all small¹ colimits, as from here an appeal to Freyd's Adjoint Functor Theorem will show that F^* has a right adjoint. To do this, let {ℱ_i | i ∈ I} be a family of A^J_F-sheaves and let ℱ be the colimit

\[
\mathcal{F} := \varinjlim \mathcal{F}_i
\]

with colimit morphisms α_i : ℱ_i → ℱ. Now consider that to prove that F^* preserves small colimits, we must show that

\[
F^*\bigl(\varinjlim \mathcal{F}_i\bigr) = \varinjlim \bigl(F^* \mathcal{F}_i\bigr);
\]

unwrapping this definition shows that we must prove that for all U ∈ Ob 𝒞,

\[
\bigl(\varinjlim \mathcal{F}_i\bigr)(FU) \overset{?}{=} \varinjlim \bigl(\mathcal{F}_i(FU)\bigr).
\]

To do this, we first observe that since Shv(𝒞, J) is a subtopos of the presheaf topos [𝒞^op, Set], it suffices to compute whether or not the proposed equality holds by evaluating each natural transformation. However, consider that for each sheaf ℱ_i, there is an induced natural transformation F^*ℱ_i → F^*ℱ given from the horizontal composition in the pasting diagram below, where r : 𝒟 → 𝒟^op is the formal reflection of 𝒟 to its opposite category:

\[
\begin{tikzcd}[column sep=large]
\mathcal{C} \arrow[r, "F"] & \mathcal{D} \arrow[r, "r"] & \mathcal{D}^{\mathrm{op}} \arrow[r, bend left=40, "\mathcal{F}_i", ""{name=A, below}] \arrow[r, bend right=40, "\mathcal{F}"', ""{name=B}] & \mathbf{Set}
\arrow[Rightarrow, from=A, to=B, "\alpha_i"]
\end{tikzcd}
\]

Since the sheaves F^*ℱ_i and F^*ℱ act on 𝒞 through the above diagram, we find that in the comparison diagrams only α_i varies. Thus, taking the colimit we find that

\[
\varinjlim (\mathcal{F}_i \circ r \circ F) = \bigl(\varinjlim \mathcal{F}_i\bigr) \circ r \circ F = \mathcal{F} \circ r \circ F
\]

so we have that

\[
\varinjlim \bigl(\mathcal{F}_i(FU)\bigr) = \mathcal{F}(FU) = \bigl(\varinjlim \mathcal{F}_i\bigr)(FU)
\]

1 Here the word “small” means that we assume we’re working in some Grothendieck Universe V where all our “small” objects are of size α ≤ κ, where κ is some strongly inaccessible cardinal. This is the last comment we will make on this subject, and the reader who does not care may simply say that anything that is “small” is as large as some set that is not a proper class.


which proves that F^* preserves all small colimits. Thus, by Freyd's Adjoint Functor Theorem, F^* has a right adjoint F_*. Thus there is an adjunction

\[
F^* \colon \mathrm{Shv}(\mathcal{D}_{/Y}, A^J_F) \rightleftarrows \mathrm{Shv}(\mathcal{C}_{/X}, J) \colon F_*, \qquad F^* \dashv F_*,
\]

which proves the first half of the essential geometric morphism. We will now be done if we can exhibit the existence of a left adjoint to F^*, i.e., if we can prove that there is a functor F_! : Shv(𝒞, J) → Shv(𝒟, A^J_F) such that there is an adjunction:

\[
F_! \colon \mathrm{Shv}(\mathcal{C}, J) \rightleftarrows \mathrm{Shv}(\mathcal{D}, A^J_F) \colon F^*, \qquad F_! \dashv F^*.
\]

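However, we can show that if

\[
\mathcal{F} = \varprojlim_{i \in I} \mathcal{F}_i
\]

with projections ρ_i : ℱ → ℱ_i, then we can calculate the limit in Shv(𝒞, J) as in the pasting diagram

\[
\begin{tikzcd}[column sep=large]
\mathcal{C} \arrow[r, "F"] & \mathcal{D}_{/Y} \arrow[r, "r"] & \mathcal{D}^{\mathrm{op}} \arrow[r, bend left=40, "\mathcal{F}", ""{name=A, below}] \arrow[r, bend right=40, "\mathcal{F}_i"', ""{name=B}] & \mathbf{Set}
\arrow[Rightarrow, from=A, to=B, "\rho_i"]
\end{tikzcd}
\]

just like the prior case. Dualizing the argument from here shows that F^* preserves all small limits and hence, by Freyd's Adjoint Functor Theorem, that there is an adjunction

\[
F_! \colon \mathrm{Shv}(\mathcal{C}, J) \rightleftarrows \mathrm{Shv}(\mathcal{D}, A^J_F) \colon F^*, \qquad F_! \dashv F^*,
\]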

and hence we have the triple adjunction F_! ⊣ F^* ⊣ F_* describing the essential geometric morphism F : Shv(𝒞, J) → Shv(𝒟, A^J_F).

Remark 8. Note that Theorem 3.1 cannot be deduced from [16] because, amongst other reasons, the functor F does not preserve terminal objects in general. For instance, if F = h : Sch/Spec 𝔽_p → FSch/Spec ℤ_p, then F(Spec 𝔽_p) = Spf ℤ_p ≠ Spec ℤ_p.
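For orientation, the triple adjunction can be unwound into two natural bijections of hom-sets. The display below is a sketch in LaTeX; the sheaf names 𝒢 (a J-sheaf on 𝒞) and ℱ (an A^J_F-sheaf on 𝒟) are chosen here only for illustration.

```latex
% Sketch: the adjunctions F_! -| F^* -| F_* as natural bijections.
% Here \mathcal{G} is an object of Shv(C, J) and \mathcal{F} is an
% object of Shv(D, A^J_F); both names are assumptions of this example.
\begin{align*}
  \mathrm{Shv}(\mathcal{D}, A^J_F)\bigl(F_!\,\mathcal{G},\ \mathcal{F}\bigr)
    &\cong \mathrm{Shv}(\mathcal{C}, J)\bigl(\mathcal{G},\ F^*\mathcal{F}\bigr), \\
  \mathrm{Shv}(\mathcal{C}, J)\bigl(F^*\mathcal{F},\ \mathcal{G}\bigr)
    &\cong \mathrm{Shv}(\mathcal{D}, A^J_F)\bigl(\mathcal{F},\ F_*\mathcal{G}\bigr).
\end{align*}
```

The first line says that F^* is a right adjoint, so it preserves limits; the second says that F^* is a left adjoint, so it preserves colimits. These are the two preservation properties the proof of Theorem 3.1 appeals to.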


Lemma 4. If p is a point of Shv(𝒞_{/X}, J) then the composite geometric morphism F ◦ p is a point of Shv(𝒟_{/Y}, A^J_F).

Proof. Recall that the 2-category Topos of toposes has for morphisms geometric morphisms f : ℰ → ℱ. Then since a point of Shv(𝒞_{/X}, J) is a geometric morphism p : Set → Shv(𝒞_{/X}, J) and F : Shv(𝒞_{/X}, J) → Shv(𝒟_{/Y}, A^J_F) is an essential geometric morphism by Theorem 3.1, we have that the diagram

\[
\begin{tikzcd}
& \mathrm{Shv}(\mathcal{C}_{/X}, J) \arrow[dr, "F"] & \\
\mathbf{Set} \arrow[ur, "p"] \arrow[rr, "F \circ p"'] & & \mathrm{Shv}(\mathcal{D}_{/Y}, A^J_F)
\end{tikzcd}
\]

commutes in Topos. Thus F ◦ p is a point of Shv(𝒟_{/Y}, A^J_F).

Corollary 5. The topos Shv(FSch/Spf ℤ_p, A^{Ét}_h) has a point.

Proof. It is well-known that the étale topos Shv(Sch/Spec 𝔽_p, Ét) has points. Thus so does the sheaf topos Shv(FSch/Spec ℤ_p, A^{Ét}_h) by the above Lemma.
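The composite point in Lemma 4 can be described through its inverse and direct image functors. The LaTeX sketch below records the standard formulas for a composite of geometric morphisms; it is a reading aid and assumes only that p has inverse image p^* and direct image p_*.

```latex
% Sketch: inverse and direct image parts of the composite point F . p,
% assuming p = (p^* -| p_*) : Set -> Shv(C_{/X}, J).
\begin{align*}
  (F \circ p)^* &= p^* \circ F^* \colon
    \mathrm{Shv}(\mathcal{D}_{/Y}, A^J_F) \to \mathbf{Set}, \\
  (F \circ p)_* &= F_* \circ p_* \colon
    \mathbf{Set} \to \mathrm{Shv}(\mathcal{D}_{/Y}, A^J_F).
\end{align*}
% p^* and F^* are left adjoints preserving finite limits, hence so is
% their composite, and p^* F^* -| F_* p_* by composing the adjunctions.
```

In particular the stalk functor of F ◦ p at an A^J_F-sheaf ℱ is the stalk of F^*ℱ at p.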

We now give some basic results about whether or not a geometrically adhesive functor induces a connected morphism of sheaf toposes. A key assumption that we will make here involves essential surjectivity of the geometrically adhesive functor F; this assumption may be relaxed to a condition like that given in the statement of Lemma 2, but one does need to make some assumptions on the functor F and the categories 𝒞 and 𝒟 in order for F^* to be fully faithful. A counter-example is given below that discusses the failure of F^* to be faithful more precisely, but for the connectivity results below, it simply suffices to prove that F is essentially surjective.

Proposition 9. Let F : 𝒞_{/X} → 𝒟_{/Y} be a full and essentially surjective geometrically adhesive functor. Then the functor F^* in the essential geometric morphism F : Shv(𝒞_{/X}, J) → Shv(𝒟_{/Y}, A^J_F) is full.

Proof. Recall that since F is full, for all U, U′ ∈ Ob 𝒞_{/X} we have that the map 𝒞_{/X}(U, U′) → 𝒟_{/Y}(F U, F U′) given by ϕ ↦ F ϕ is epic in Set. Now let ℱ, 𝒢 ∈ Ob Shv(𝒟_{/Y}, A^J_F) and consider a natural transformation α : F^*ℱ → F^*𝒢. Then for all ϕ : U → U′ in 𝒞_{/X} the diagram

\[
\begin{tikzcd}[column sep=large]
(F^*\mathcal{F})(U) \arrow[r, "\alpha_U"] & (F^*\mathcal{G})(U) \\
(F^*\mathcal{F})(U') \arrow[u, "(F^*\mathcal{F})(\varphi)"] \arrow[r, "\alpha_{U'}"'] & (F^*\mathcal{G})(U') \arrow[u, "(F^*\mathcal{G})(\varphi)"']
\end{tikzcd}
\]

which is equivalent to

\[
\begin{tikzcd}[column sep=large]
\mathcal{F}(FU) \arrow[r, "\alpha_U"] & \mathcal{G}(FU) \\
\mathcal{F}(FU') \arrow[u, "\mathcal{F}(F\varphi)"] \arrow[r, "\alpha_{U'}"'] & \mathcal{G}(FU') \arrow[u, "\mathcal{G}(F\varphi)"']
\end{tikzcd}
\]

commutes in Set. To prove that F^* is full, we simply must find a lift of α, i.e., a natural transformation β : ℱ → 𝒢 such that F^*β = α. To do this, we will construct β in two steps: First, define β_{FU} : ℱ(F U) → 𝒢(F U) by setting β_{FU} := α_U. Now assume that V ∈ Ob 𝒟_{/Y} and find a U ∈ Ob 𝒞_{/X} such that V ≅ F U; this is possible by the essential surjectivity of F. Let ψ : F U → V be any fixed isomorphism between V and F U and define β_V by the rule

β_V := 𝒢(ψ^{-1}) ◦ α_U ◦ ℱψ.

Now let ϕ̃ : V → V′ be a morphism in 𝒟_{/Y}, where V′ ≅ F U′ for some U′ ∈ Ob 𝒞_{/X} through ψ′ : F U′ → V′, and consider the commuting diagram

\[
\begin{tikzcd}[column sep=large]
\mathcal{F}V \arrow[r, "\mathcal{F}\psi"] & \mathcal{F}(FU) \\
\mathcal{F}V' \arrow[u, "\mathcal{F}\tilde{\varphi}"] \arrow[r, "\mathcal{F}\psi'"'] & \mathcal{F}(FU') \arrow[u, dashed, "\exists \lambda"']
\end{tikzcd}
\]

in Set. Note that λ exists because, since ℱψ′ is an isomorphism, a direct calculation with the morphism λ := ℱψ ◦ ℱϕ̃ ◦ ℱ(ψ′)^{-1} shows that λ ◦ ℱψ′ = ℱψ ◦ ℱϕ̃. However, by the fullness of F, there exists a morphism θ : U → U′ in 𝒞_{/X} such that

λ = ℱ(Fθ) = ℱ((ψ′)^{-1} ◦ ϕ̃ ◦ ψ).

This shows that the diagram

\[
\begin{tikzcd}[column sep=large]
\mathcal{F}V \arrow[r, "\mathcal{F}\psi"] & \mathcal{F}(FU) \\
\mathcal{F}V' \arrow[u, "\mathcal{F}\tilde{\varphi}"] \arrow[r, "\mathcal{F}\psi'"'] & \mathcal{F}(FU') \arrow[u, "{\lambda = \mathcal{F}(F\theta)}"']
\end{tikzcd}
\]


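commutes in Set. Dually, we have that the diagram

\[
\begin{tikzcd}[column sep=large]
\mathcal{G}(FU) \arrow[r, "\mathcal{G}(\psi^{-1})"] & \mathcal{G}(V) \\
\mathcal{G}(FU') \arrow[u, "\mathcal{G}(F\theta)"] \arrow[r, "\mathcal{G}(\psi')^{-1}"'] & \mathcal{G}V' \arrow[u, "\mathcal{G}\tilde{\varphi}"']
\end{tikzcd}
\]

commutes in Set; thus, since the diagram

\[
\begin{tikzcd}[column sep=large]
\mathcal{F}(FU) \arrow[r, "\alpha_U"] & \mathcal{G}(FU) \\
\mathcal{F}(FU') \arrow[u, "\mathcal{F}(F\theta)"] \arrow[r, "\alpha_{U'}"'] & \mathcal{G}(FU') \arrow[u, "\mathcal{G}(F\theta)"']
\end{tikzcd}
\]

commutes by assumption, it follows that every cell in the diagram

\[
\begin{tikzcd}[column sep=large]
\mathcal{F}V \arrow[r, "\mathcal{F}\psi"] & \mathcal{F}(FU) \arrow[r, "\alpha_U"] & \mathcal{G}(FU) \arrow[r, "\mathcal{G}(\psi^{-1})"] & \mathcal{G}V \\
\mathcal{F}V' \arrow[u, "\mathcal{F}\tilde{\varphi}"] \arrow[r, "\mathcal{F}(\psi')"'] & \mathcal{F}(FU') \arrow[u, "\mathcal{F}(F\theta)"] \arrow[r, "\alpha_{U'}"'] & \mathcal{G}(FU') \arrow[u, "\mathcal{G}(F\theta)"] \arrow[r, "\mathcal{G}(\psi')^{-1}"'] & \mathcal{G}V' \arrow[u, "\mathcal{G}\tilde{\varphi}"']
\end{tikzcd}
\]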

commutes and hence the outer rectangle commutes. But since we have by definition that β_V = 𝒢(ψ^{-1}) ◦ α_U ◦ ℱψ, it follows that the outer rectangle contracts to

\[
\begin{tikzcd}[column sep=large]
\mathcal{F}V \arrow[r, "\beta_V"] & \mathcal{G}V \\
\mathcal{F}V' \arrow[u, "\mathcal{F}\tilde{\varphi}"] \arrow[r, "\beta_{V'}"'] & \mathcal{G}V' \arrow[u, "\mathcal{G}\tilde{\varphi}"']
\end{tikzcd}
\]

which proves that β : ℱ → 𝒢 is a natural transformation. Then by construction we can show that F^*β = α, which proves that F^* is full, as was desired.

Remark 9. Lemma 2 and Corollary 4 together suggest that in order to have the faithfulness of F^* that may be suggested by Lemma 3, it is necessary to have the functor F : 𝒞 → 𝒟 satisfy the condition that

∀ V ∈ 𝒟_0. ∃ X ∈ 𝒞_0. (𝒟(V, F X) ≠ ∅) ∨ (𝒟(F X, V) ≠ ∅).

This may be made explicit by the following counter-example: Let 𝒞 = Sch/Spec 𝔽_p and let 𝒟 = FSch/Spec ℤ_p ∪ {Spec ℤ_ℓ}, where gcd(ℓ, p) = 1 and with the hom-set 𝒟(Spec ℤ_ℓ, Spec ℤ_ℓ) = {id_{Spec ℤ_ℓ}}. Then


define a functor h : 𝒞 → 𝒟 by h(X) = h X for all X ∈ 𝒞_0 and observe that h is geometrically adhesive. However, F^* = h^* is not faithful as a functor of sheaf categories, because one can assign any set to Spec ℤ_ℓ in a presheaf which is an A^J_h-sheaf over the FSch/Spec ℤ_p component and still get an A^J_h-sheaf. In particular, take the sheaves 𝒮 and 𝒯 to be defined by

\[
\mathcal{T}(U) =
\begin{cases}
\{*\} & \text{if } U \in \mathbf{FSch}_{/\operatorname{Spec} \mathbb{Z}_p}; \\
\{0, 1\} & \text{else};
\end{cases}
\]

and 𝒮(U) = {∗} for all U ∈ 𝒟. Then Shv(𝒟, A^J_h)(𝒮, 𝒯) = {0, 1}, where the labels come from which element the natural transformation picks out on the map {∗} → {0, 1} (so the sections over Spec ℤ_ℓ). However, since the pullback h^* only sees sheaves that arise as taking h(−) of 𝔽_p-schemes, it is straightforward to calculate that h^*𝒮(X) = {∗} = h^*𝒯(X) for all schemes X in Sch/Spec 𝔽_p (together with the only possible morphisms). In particular, from this construction it follows that

Shv(𝒞, J)(h^*𝒮, h^*𝒯) = {id_{h^*𝒮}} ≇ {0, 1} = Shv(𝒟, A^J_h)(𝒮, 𝒯),

which shows that h^* is not faithful.

Proposition 10. If F : 𝒞_{/X} → 𝒟_{/Y} is a geometrically adhesive, full, and essentially surjective functor, then F^* is fully faithful.

Proof. We have already seen from Proposition 9 that F^* is full; we therefore only need to show that F^* is faithful. To do this, assume that there are natural transformations β, γ : ℱ → 𝒢 such that F^*β = α = F^*γ for some α : F^*ℱ → F^*𝒢. Since F^*β = α = F^*γ, we have that γ_{FU} = α_U = β_{FU} for all U ∈ Ob 𝒞_{/X}. Now for each V ∈ Ob 𝒟_{/Y}, find a U ∈ Ob 𝒞_{/X} and an isomorphism ψ : F U → V. Then since β and γ are natural transformations, we have that the diagrams

\[
\begin{tikzcd}[column sep=large]
\mathcal{F}V \arrow[r, "\beta_V"] \arrow[d, "\mathcal{F}\psi"'] & \mathcal{G}V \arrow[d, "\mathcal{G}\psi"] \\
\mathcal{F}(FU) \arrow[r, "\beta_{FU}"'] & \mathcal{G}(FU)
\end{tikzcd}
\]

and

\[
\begin{tikzcd}[column sep=large]
\mathcal{F}V \arrow[r, "\gamma_V"] \arrow[d, "\mathcal{F}\psi"'] & \mathcal{G}V \arrow[d, "\mathcal{G}\psi"] \\
\mathcal{F}(FU) \arrow[r, "\gamma_{FU}"'] & \mathcal{G}(FU)
\end{tikzcd}
\]

both commute. But then we calculate that

\[
\beta_V = \mathrm{id}_{\mathcal{G}V} \circ \beta_V
= (\mathcal{G}\psi)^{-1} \circ \mathcal{G}\psi \circ \beta_V
= (\mathcal{G}\psi)^{-1} \circ \beta_{FU} \circ \mathcal{F}\psi
= (\mathcal{G}\psi)^{-1} \circ \gamma_{FU} \circ \mathcal{F}\psi
= (\mathcal{G}\psi)^{-1} \circ \mathcal{G}\psi \circ \gamma_V
= \mathrm{id}_{\mathcal{G}V} \circ \gamma_V
= \gamma_V
\]

so it follows that β = γ and hence F^* is fully faithful.

We would now like to show that there is an analogous version of the above proposition that holds even in the case that F is fully faithful but not essentially surjective. It will tell us that in the case in which the functor F is fully faithful, we can infer both that F^* reflects isomorphisms and that F^* is fully faithful. It uses the condition stated explicitly in Lemma 2: That for every object V of Codom F there exists an object X of Dom F for which 𝒟(V, F X) ≠ ∅ or 𝒟(F X, V) ≠ ∅.

Proposition 11. Assume that F : 𝒞 → 𝒟 is a fully faithful, geometrically adhesive functor and that for all objects V of 𝒟 there exists an object X of 𝒞 for which 𝒟(F X, V) ≠ ∅ or 𝒟(V, F X) ≠ ∅. Then F^* is isomorphism reflecting.

Proof. Let ℱ and 𝒢 be A^J_F-sheaves on 𝒟 such that F^*ℱ ≅ F^*𝒢 as J-sheaves on 𝒞. Then for all objects U of 𝒞 and for all morphisms ϕ of 𝒞 we get: From the isomorphism F^*ℱ(U) ≅ F^*𝒢(U), we have that

\[
\mathcal{F}(FU) \cong \mathcal{G}(FU);
\]

and from the isomorphism F^*ℱ(ϕ) ≅ F^*𝒢(ϕ) we also have

\[
\mathcal{F}(F\varphi) \cong \mathcal{G}(F\varphi).
\]

Furthermore, because both F^*ℱ and F^*𝒢 are J-sheaves, it follows that for all J-sieves S ∈ J(U), the diagram

\[
(F^*\mathcal{F})(U) \longrightarrow \prod_{f \in S} (F^*\mathcal{F})(\operatorname{Dom} f) \rightrightarrows \prod_{\substack{f \in S,\ g \in \mathcal{C}_1 \\ f \circ g \in \mathcal{C}_1}} (F^*\mathcal{F})(\operatorname{Dom} g)
\]

is an equalizer diagram, where the f ∈ S are sieving morphisms; similarly, the same diagram with ℱ replaced by 𝒢 is an equalizer as well. Moreover, since the definition of the A^J_F-topology shows that the A^J_F-covering sieves are generated by the sets F(S), applying Corollary 4 and using the hypotheses in the proposition to avoid technical malfunctions that occur off the connected component² generated by the essential image of F allows us to check the isomorphism types of ℱ and 𝒢 along the F S's. However, applying the definition of the functor F^* shows that the diagram above is equivalent to the diagram

\[
\mathcal{F}(FU) \longrightarrow \prod_{f \in S} \mathcal{F}(F \operatorname{Dom} f) \rightrightarrows \prod_{\substack{f \in S,\ g \in \mathcal{C}_1 \\ f \circ g \in \mathcal{C}_1}} \mathcal{F}(F \operatorname{Dom} g)
\]

Using the full faithfulness of F together with the isomorphisms F Dom f ≅ Dom(F f) then shows that for every set F S, ℱ and 𝒢 are isomorphic on the desired covers. This implies that ℱ ≅ 𝒢 and completes the proof of the proposition.

We now need a lonely arithmetic geometric result for use later in this paper.

Lemma 5. Let R/ℤ_p be an integral extension with residue field k and let h ⊣ Gr : Sch/Spec k → FSch/Spec R be the Greenberg adjunction. Then the pullback functor h^* : Shv(FSch/Spec R, A^J_h) → Shv(Sch/Spec k, J) is fully faithful.

Proof. This follows from the fact that the topology A^J_h is lifted from J, from the fact that h is a left adjoint so there is always the canonical map ε_X : h(Gr(X)) → X for any formal scheme X in FSch/Spec R, and from the structure of rings of Witt vectors and algebras of Witt vectors over rings of Witt vectors.
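The canonical map ε_X used in the proof of Lemma 5 is the counit of the adjunction h ⊣ Gr. As a reminder of what this entails, the LaTeX sketch below records the unit, the counit and the triangle identities of a general adjunction, specialised to h and Gr; the name η for the unit and the variable Y for a k-scheme are assumptions of this illustration, not notation from the text.

```latex
% Sketch: unit and counit of the Greenberg adjunction h -| Gr.
% Y is a hypothetical scheme over Spec k, X a formal scheme over Spec R.
\[
  \eta_Y \colon Y \to \mathrm{Gr}(h(Y)),
  \qquad
  \varepsilon_X \colon h(\mathrm{Gr}(X)) \to X.
\]
% Triangle identities, valid for any adjunction:
\[
  \varepsilon_{h(Y)} \circ h(\eta_Y) = \mathrm{id}_{h(Y)},
  \qquad
  \mathrm{Gr}(\varepsilon_X) \circ \eta_{\mathrm{Gr}(X)} = \mathrm{id}_{\mathrm{Gr}(X)}.
\]
```

These identities hold for every adjunction; how they combine with the Witt vector structure to give full faithfulness of h^* is specific to the argument sketched in the proof and is not reproduced here.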

2 By which we mean the connected components of the category as a graph.

4 The Adhesive Fundamental Group

As a point of notation, if ℰ is a Grothendieck topos, then we will write ℰ_lcf for the full subtopos of locally constant, locally finite objects in ℰ. We will be making a study of these categories based on the restriction of the essential geometric morphism F : Shv(𝒞, J) → Shv(𝒟, A^J_F) induced by a geometrically adhesive functor F : 𝒞 → 𝒟. Throughout this section we make the following assumptions:

A1. F : 𝒞 → 𝒟 is geometrically adhesive and the pullback F^* is fully faithful;
A2. Shv(𝒞, J)_lcf is a Galois category with fibre functor Fi : Shv(𝒞, J)_lcf → FinSet;
A3. For every V ∈ 𝒟_0, there exists an X ∈ 𝒞_0 for which 𝒟(V, F X) ≠ ∅ or 𝒟(F X, V) ≠ ∅.

We will use this setup to study the fundamental group on Shv(𝒟, A^J_F) with the provision that there is a fundamental group of Shv(𝒞, J)_lcf with which we can work. This will culminate with us showing that in some situations (and in particular in the case of the Greenberg Transform) the fundamental groups coincide. However, before we do that we must show that we can still use the techniques afforded by the essential geometric morphism which are, fittingly, essential to us.

Proposition 12. The functor F^* : Shv(𝒟, A^J_F) → Shv(𝒞, J) restricts to a functor F^* : Shv(𝒟, A^J_F)_lcf → Shv(𝒞, J)_lcf.

Proof. Immediate from the fact that F^* is exact and hence preserves all limits and colimits.

Proposition 13. There is an essential geometric morphism F : Shv(𝒞, J)_lcf → Shv(𝒟, A^J_F) with pullback F^* given as above.

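Proof. The proof of this proposition is formal, and follows in the same way as Theorem 3.1. That is, if $\{\mathscr{F}_i \mid i \in I\}$ is a finite family of functors with colimit $\mathscr{F}$ and colimit maps $\alpha_i : \mathscr{F}_i \to \mathscr{F}$, consider the following family of 2-cells:
\[
  F^{*}\alpha_i : F^{*}\mathscr{F}_i \Longrightarrow F^{*}\mathscr{F} : \mathscr{C}^{\mathrm{op}} \to \mathbf{Set}.
\]
Observing that the functor $F^{*}\mathscr{F}_i = \mathscr{F}_i(F(-)) : \mathscr{C}^{\mathrm{op}} \to \mathbf{Set}$, we can rewrite the 2-cell as
\[
  \alpha_i : \mathscr{F}_i(F(-)) \Longrightarrow \mathscr{F}(F(-)) : \mathscr{C}^{\mathrm{op}} \to \mathbf{Set},
\]
which factors as
\[
  \mathscr{C}^{\mathrm{op}}
  \;\underset{F^{\mathrm{op}}}{\overset{F^{\mathrm{op}}}{\rightrightarrows}}\;
  \mathscr{D}^{\mathrm{op}}
  \;\underset{\mathscr{F}}{\overset{\mathscr{F}_i}{\rightrightarrows}}\;
  \mathbf{Set},
  \qquad \text{with 2-cells } \mathrm{id}_{F^{\mathrm{op}}} \text{ and } \alpha_i,
\]
and further simplifies to the diagram:
\[
  \mathscr{C}^{\mathrm{op}} \xrightarrow{\;F^{\mathrm{op}}\;} \mathscr{D}^{\mathrm{op}}
  \;\underset{\mathscr{F}}{\overset{\mathscr{F}_i}{\rightrightarrows}}\;
  \mathbf{Set},
  \qquad \text{with 2-cell } \alpha_i : \mathscr{F}_i \Rightarrow \mathscr{F}.
\]
From this it follows that, upon taking the colimit of the diagram, the colimiting aspect is calculated solely in the $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})$ 2-cell. Thus it follows that
\[
  \varinjlim_{i \in I} F^{*}\mathscr{F}_i = F^{*}\mathscr{F};
\]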

note that this is because $\mathbf{Shv}(\mathscr{C}, J)_{\mathrm{lcf}}$ and $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})_{\mathrm{lcf}}$ are the categories of locally constant, locally finite objects and morphisms between them. The fact that $F^{*}$ preserves limits is shown mutatis mutandis, and the fact that this implies $F^{*}$ admits both left and right adjoints comes from Freyd's Adjoint Functor Theorem and the fact that the codomain of $F^{*}$ is a topos. Finally, the fact that $F^{*}\mathscr{F}$ remains locally constant and locally finite comes from the fact that $F^{*}$ preserves all finite limits and finite colimits.

Lemma 6. Let $\operatorname{Fi} : \mathbf{Shv}(\mathscr{C}, J)_{\mathrm{lcf}} \to \mathbf{FinSet}$ be a functor. Then:
1. If $\operatorname{Fi}$ is exact, so is $\operatorname{Fi} \circ F^{*} : \mathbf{Shv}(\mathscr{D}, A^{J}_{F}) \to \mathbf{FinSet}$.
2. If $\operatorname{Fi}$ reflects isomorphisms, so does $\operatorname{Fi} \circ F^{*}$.
3. If $\operatorname{Fi}$ is pro-representable, so is $\operatorname{Fi} \circ F^{*}$.

Proof. Claim (1) follows immediately from the fact that $F^{*}$ is exact because it has a left and right adjoint, and from the fact that the composite of exact functors is exact. For part (2) we assume that there are sheaves $\mathscr{F}, \mathscr{G} \in \operatorname{Ob} \mathbf{Shv}(\mathscr{D}, A^{J}_{F})$ such that $(\operatorname{Fi} \circ F^{*})(\mathscr{F}) \cong (\operatorname{Fi} \circ F^{*})(\mathscr{G})$. In this case, because $\operatorname{Fi}$ reflects isomorphisms, we have that $F^{*}\mathscr{F} \cong F^{*}\mathscr{G}$, and so this reduces to the fact that $F^{*}$ reflects isomorphisms. Because we have assumed property (A1) above, the result follows from the fact that $F^{*}$ reflects isomorphisms (being fully faithful) and the fact that the composite of two isomorphism reflecting functors is again isomorphism reflecting. To prove (3), assume that $\operatorname{Fi}$ is pro-representable and let $\{A_i \mid i \in I\}$ be the filtered inverse system of objects such that
\[
  \operatorname{Fi}(\mathscr{F}) \cong \varinjlim_{i \in I} \mathbf{Shv}(\mathscr{C}, J)(A_i, \mathscr{F})
\]
for all $\mathscr{F} \in \operatorname{Ob} \mathbf{Shv}(\mathscr{C}, J)$. Now fix a $\mathscr{G} \in \operatorname{Ob} \mathbf{Shv}(\mathscr{D}, A^{J}_{F})$ and consider that for each $i \in I$,
\[
  \mathbf{Shv}(\mathscr{C}, J)(A_i, F^{*}\mathscr{G}) \cong \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}A_i, \mathscr{G})
\]
so
\[
  (\operatorname{Fi} \circ F^{*})(\mathscr{G}) = \operatorname{Fi}(F^{*}\mathscr{G})
  \cong \varinjlim_{i \in I} \mathbf{Shv}(\mathscr{C}, J)(A_i, F^{*}\mathscr{G})
  \cong \varinjlim_{i \in I} \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}A_i, \mathscr{G}).
\]


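Because $\{A_i \mid i \in I\}$ is a filtered inverse system, the category $I$ is filtered and so $\{F_{!}A_i \mid i \in I\}$ is a filtered inverse system by the covariance of $F_{!}$. This proves the lemma.

Note that the proof of the pro-representability of the functor $\operatorname{Fi} \circ F^{*}$ used in an essential way the adjunction $F_{!} \dashv F^{*}$. Using this again in the same way we derive the lemma below:

Lemma 7. If $\mathscr{F} \in \operatorname{Ob} \mathbf{Shv}(\mathscr{D}, A^{J}_{F})_{\mathrm{lcf}}$ and if $A$ is a normal object for $\mathbf{Shv}(\mathscr{C}, J)_{\mathrm{lcf}}$ such that
\[
  \mathbf{Shv}(\mathscr{C}, J)(A, F^{*}\mathscr{F}) \cong \operatorname{Fi}(F^{*}\mathscr{F})
\]
then
\[
  \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}A, \mathscr{F}) \cong (\operatorname{Fi} \circ F^{*})(\mathscr{F}).
\]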
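Lemma 7 is stated without a written proof. The following is a sketch of how it would follow from the adjunction $F_{!} \dashv F^{*}$ together with the hypothesis of the lemma; the justifications attached to each step are annotations added here, not the author's wording.

```latex
\begin{align*}
\mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}A, \mathscr{F})
  &\cong \mathbf{Shv}(\mathscr{C}, J)(A, F^{*}\mathscr{F})
     && \text{(adjunction $F_{!} \dashv F^{*}$)} \\
  &\cong \operatorname{Fi}(F^{*}\mathscr{F})
     && \text{(hypothesis on the normal object $A$)} \\
  &= (\operatorname{Fi} \circ F^{*})(\mathscr{F})
     && \text{(definition of the composite functor)}
\end{align*}
```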

Lemma 8. Assume that $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})_{\mathrm{lcf}}$ admits an exact isomorphism reflecting functor $I : \mathbf{Shv}(\mathscr{D}, A^{J}_{F}) \to \mathbf{FinSet}$. Then if an object $A$ in $\mathbf{Shv}(\mathscr{C}, J)$ is non-initial and indecomposable, so is $F_{!}A$.

Proof. Assume that $A \in \operatorname{Ob} \mathbf{Shv}(\mathscr{C}, J)$ is indecomposable and find a nontrivial decomposition
\[
  A \cong \coprod_{i=1}^{n} A_i .
\]
Then if we apply $F_{!}$ to the above coproduct we get that
\[
  F_{!}A = \coprod_{i=1}^{n} F_{!}A_i ,
\]
and this will be nontrivial if we can show at least one $F_{!}A_i \not\cong F_{!}A$ or $F_{!}A \not\cong \bot$.

We will first show that at least one $F_{!}A_i \not\cong \bot$. Consider that since the decomposition of $A$ is nontrivial, there exists at least one object in the composite, call it $A_k$, such that $A_k \not\cong A$ and $A_k \not\cong \bot$. Because $A_k \not\cong \bot$, for each object $U$ of $\mathscr{C}$, $A_k(U) \neq \emptyset$, as the initial sheaf $\bot$ is uniquely defined by the equation $\bot(V) = \emptyset$ for all objects $V$ of $\mathscr{C}$. But then we observe that $\mathbf{Shv}(\mathscr{C}, J)(A_k, \bot) = \emptyset$, as no nonempty set can map to the empty set and if a sheaf is empty anywhere it is empty everywhere. Now assume that there is a map $\varphi : F_{!}A_k \to F_{!}\bot$; however, using that $F_{!}$ is a left adjoint shows that $F_{!}\bot = \bot$ and so there is a map $F_{!}A_k \to \bot$. Now observe that since $F^{*}$ is left exact we have that


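\[
  \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}A_k, \bot)
  = \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}A_k, F_{!}\bot)
  \cong \mathbf{Shv}(\mathscr{C}, J)(A_k, F^{*}F_{!}\bot)
  = \mathbf{Shv}(\mathscr{C}, J)(A_k, \bot) = \emptyset .
\]
Therefore $F_{!}A_k \not\cong \bot$, as $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}A_k, F_{!}\bot) = \emptyset$.

We will now show that $F_{!}A_k \not\cong F_{!}A$. To do this, assume for the purpose of deriving a contradiction that $F_{!}A_k \cong F_{!}A$, and note that from this assumption and from
\[
  F_{!}A \cong \coprod_{i=1}^{n} F_{!}A_i
\]
we have that
\[
  F_{!}A_k \cong \coprod_{i=1}^{n} F_{!}A_i
  = F_{!}A_k \amalg \left( \coprod_{i=1,\, i \neq k}^{n} F_{!}A_i \right).
\]
Now apply $I$ to get that
\[
  I(F_{!}A_k)
  = I(F_{!}A_k) \amalg \coprod_{i=1,\, i \neq k}^{n} I(F_{!}A_i)
  = I(F_{!}A_k) \sqcup \bigsqcup_{i=1,\, i \neq k}^{n} I(F_{!}A_i).
\]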

Now since the above equality of finite sets is a disjoint union it follows that $I(F_{!}A_i) = \emptyset$ for each $i \neq k$. However, because $I$ is exact, $I(\bot) = \emptyset$, and since $I$ reflects isomorphisms, this happens exactly when each $F_{!}A_i \cong \bot$. On the other hand, if $A_k \not\cong A$ and $A$ is indecomposable, by the above argument it follows that there is an $A_j$ for which $A_j$ is also nontrivial. Proceeding with the same line of thought, using that $F_{!}A_j \not\cong \bot$ and using that just above we have shown that $F_{!}A_j \cong \bot$, we have that $I(F_{!}A_j) \cong I(F_{!}A) \cong I(F_{!}A_k)$ while simultaneously we have $I(F_{!}A_j) \sqcup I(F_{!}A_k) = I(F_{!}A)$. This is evidently false, and so it must be the case that $F_{!}A_k \not\cong F_{!}A$. This proves that $F_{!}A$ is indecomposable.

Remark 10. Note that the above lemma can be false (or at least this proof fails) if we relax the assumption so that the exact isomorphism reflecting functor $I$ factors through $\mathbf{Set}$ and not $\mathbf{FinSet}$, as infinite sets can be isomorphic to some of their proper subsets.

Remark 11. It is worth remarking that the proof we gave above used the Law of the Excluded Middle. I could not find a way that did not make use of the Principle of Contraposition or Proof by Contradiction. Perhaps this should not be bothersome, however, because of the Boolean nature of Galois categories.

Corollary 6. For every sheaf $\mathscr{F}$ in $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})$, there is an indecomposable sheaf $\mathscr{G}$ in $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})$ such that $(\operatorname{Fi} \circ F^{*})(\mathscr{F}) \cong \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(\mathscr{G}, \mathscr{F})$.

Proof. Begin by observing that since each functor $F^{*}\mathscr{F}$ is a $J$-sheaf on $\mathscr{C}$, there is a normal object $N$ in $\mathbf{Shv}(\mathscr{C}, J)$ such that $\operatorname{Fi}(F^{*}\mathscr{F}) \cong \mathbf{Shv}(\mathscr{C}, J)(N, F^{*}\mathscr{F})$ by Proposition 8.46 of [15]. Thus by Lemmas 7 and 8 we have that $F_{!}N$ is indecomposable and that $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}N, \mathscr{F}) \cong (\operatorname{Fi} \circ F^{*})(\mathscr{F})$, which concludes the proof.

Lemma 9. If $N$ is normal in $\mathbf{Shv}(\mathscr{C}, J)_{\mathrm{lcf}}$ then $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}N, F_{!}N)$ is a group.

Proof. Because $N$ is normal, it is indecomposable and $N \not\cong \bot$. Thus by Lemma 8 we have that $F_{!}N$ is indecomposable. Now let $\varphi : F_{!}N \to F_{!}N$ be an endomorphism of $F_{!}N$ and consider the $(\mathcal{E}, \mathcal{M})$-factorization of $\varphi$:
\[
  F_{!}N \xrightarrow{\;\varepsilon\;} X \xrightarrow{\;\mu\;} F_{!}N,
  \qquad \varphi = \mu \circ \varepsilon,
\]
where $X$ is a subobject of $F_{!}N$, $\varepsilon$ is a regular epic, and $\mu$ is monic. Then since $\varphi$ cannot factor through $\bot$, $X \not\cong \bot$ and hence $X$ is a nonzero subobject of $F_{!}N$; by the indecomposability of $F_{!}N$ this implies that $X = F_{!}N$. Therefore, by considering that $(\operatorname{Fi} \circ F^{*})(\varepsilon) : (\operatorname{Fi} \circ F^{*})(F_{!}N) \to (\operatorname{Fi} \circ F^{*})(F_{!}N)$ is an epimorphism from a finite set to itself, it follows that $\varepsilon$ is monic and hence an automorphism of $F_{!}N$. Finally, since $\mu$ is monic, the map $(\operatorname{Fi} \circ F^{*})(\mu) : (\operatorname{Fi} \circ F^{*})(F_{!}N) \to (\operatorname{Fi} \circ F^{*})(F_{!}N)$ is a monomorphism from a finite set to itself. This implies that $\mu$ is then epic and hence an automorphism as well; from here it follows that $\varphi = \mu \circ \varepsilon$ is also an automorphism and we are done.
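The finiteness used in the proofs of Lemma 8 and Lemma 9 comes down to elementary counting. The following sketch records the two facts involved; the symbols $S$, $T$, $f$ and the examples on $\mathbb{N}$ are introduced here for illustration and do not appear in the text.

```latex
% (1) Cardinality count behind Lemma 8: for finite sets S and T,
\[
  S \cong S \sqcup T
  \;\Longrightarrow\; |S| = |S| + |T|
  \;\Longrightarrow\; |T| = 0
  \;\Longrightarrow\; T = \emptyset .
\]
% (2) Pigeonhole principle behind Lemma 9: for a finite set S and a map f : S -> S,
\[
  f \text{ surjective} \iff f \text{ injective} \iff f \text{ bijective}.
\]
% Both statements fail for infinite sets, which is the point of Remark 10:
\[
  \mathbb{N} \cong \mathbb{N} \sqcup \mathbb{N},
  \qquad
  n \mapsto n + 1 \text{ is injective but not surjective on } \mathbb{N}.
\]
```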


Corollary 7. If $N$ is a normal object in $\mathbf{Shv}(\mathscr{C}, J)_{\mathrm{lcf}}$, then $F_{!}N$ is normal in $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})_{\mathrm{lcf}}$.

Proof. Begin by observing that $\mathbf{Shv}(\mathscr{C}, J)(N, N) \cong \operatorname{Fi}(N)$, so we derive that
\[
  \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}N, F_{!}N)
  \cong \mathbf{Shv}(\mathscr{C}, J)(N, F^{*}(F_{!}N))
  \cong \operatorname{Fi}(F^{*}(F_{!}N))
  \cong (\operatorname{Fi} \circ F^{*})(F_{!}N)
\]
because $N$ is the normal object representing $\operatorname{Fi}(F^{*}F_{!}N)$. Finally, by Lemma 9 it follows that $\mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}N, F_{!}N)$ is a group and hence that $F_{!}N$ is normal.

We now proceed to show the last ingredient in our proof of the equivalence of fundamental groups: we must know that the automorphism group of normal objects $N$ in $\mathbf{Shv}(\mathscr{C}, J)_{\mathrm{lcf}}$ is isomorphic to the automorphism group of $F_{!}N$.

Lemma 10. If $N$ is a normal object in $\mathbf{Shv}(\mathscr{C}, J)$ then there is an isomorphism of groups
\[
  \mathbf{Shv}(\mathscr{C}, J)(N, N) \cong \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}N, F_{!}N).
\]

Proof. Begin by observing that Assumption (A1) gives that $F^{*}$ is fully faithful; consequently the counit $\varepsilon : F_{!} \circ F^{*} \to \mathrm{id}_{\mathbf{Shv}(\mathscr{D}, A^{J}_{F})_{\mathrm{lcf}}}$ is an isomorphism and $F_{!}$ is essentially surjective. Then by the property of adjoint functors, there is a natural isomorphism $\theta_{F_{!}N}$ of hom-sets
\[
  \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}N, F_{!}N)
  \xrightarrow{\;\theta_{F_{!}N}\;}
  \mathbf{Shv}(\mathscr{C}, J)(N, (F^{*} \circ F_{!})N);
\]
moreover, from the fact that $F^{*}$ is fully faithful, this isomorphism factors as
\[
  \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}N, F_{!}N)
  \xrightarrow{\;F^{*}\;}
  \mathbf{Shv}(\mathscr{C}, J)((F^{*} \circ F_{!})N, (F^{*} \circ F_{!})N)
  \xrightarrow{\;\eta^{*}\;}
  \mathbf{Shv}(\mathscr{C}, J)(N, (F^{*} \circ F_{!})N),
  \qquad \theta_{F_{!}N} = \eta^{*} \circ F^{*},
\]
where the map $\eta^{*}$ is precomposition by the unit of adjunction. Now, the fact that $F^{*}$ is a conservative exact functor between toposes of locally constant, locally finite sheaves implies that there is a Beck-Chevalley condition for $(F_{!}, F^{*})$ (cf. page 179 of [19]). This induces a natural map $(F^{*} \circ F_{!})N \to N$ as an algebra morphism for the monad induced by $F_{!} \dashv F^{*}$, and hence gives a further isomorphism
\[
  \mathbf{Shv}(\mathscr{C}, J)(N, (F^{*} \circ F_{!})N) \cong \mathbf{Shv}(\mathscr{C}, J)(N, N).
\]


Composing these all gives the desired isomorphism
\[
  \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}N, F_{!}N) \cong \mathbf{Shv}(\mathscr{C}, J)(N, N).
\]

We can now state and prove the main theorem of the paper, and then show the geometrization result as a corollary. We reiterate the assumptions made throughout this section here for the sake of clarity.

Theorem 4.1 Let $F : \mathscr{C} \to \mathscr{D}$ be a geometrically adhesive functor, suppose that $F^{*}$ is fully faithful, let $X$ be an object in $(\mathscr{C}, J)$, and let $x$ be a point of $X$ making $(\mathbf{Shv}(\mathscr{C}_{/X}, J_{/X})_{\mathrm{lcf}}, \operatorname{Fi}_x)$ into a Galois category. Moreover, suppose that for all $V \in \mathscr{D}$ there exists an object $X$ of $\mathscr{C}$ such that $\mathscr{D}(FX, V) \neq \emptyset$ or $\mathscr{D}(V, FX) \neq \emptyset$. Then there is an isomorphism of fundamental groups
\[
  \pi_1^{J}(X, x) \cong \pi_1^{A^{J}_{F}}(FX, Fx).
\]

Proof. Applying the various results of this section shows that there is a well-defined profinite fundamental group of $\mathbf{Shv}(\mathscr{D}_{/FX}, (A^{J}_{F})_{/FX})_{\mathrm{lcf}}$ at $FX$ and $Fx$ whose normal objects are all of the form $F_{!}N$, where $N$ is a normal object of $\mathbf{Shv}(\mathscr{C}_{/X}, J_{/X})_{\mathrm{lcf}}$. This implies that the $F_{!}N$ give the correct cofinal system to limit against. Taking then the isomorphism $\mathbf{Shv}(\mathscr{C}, J)(N, N) \cong \mathbf{Shv}(\mathscr{D}, A^{J}_{F})(F_{!}N, F_{!}N)$ for all normal objects $N$, we get that
\begin{align*}
  \pi_1^{A^{J}_{F}}(FX, Fx)
  &\cong \varprojlim \mathbf{Shv}(\mathscr{D}_{/X}, (A^{J}_{F})_{/X})(F_{!}N_i, F_{!}N_i) \\
  &\cong \varprojlim \mathbf{Shv}(\mathscr{C}_{/X}, J_{/X})(N_i, N_i) \\
  &\cong \pi_1^{J}(X, x).
\end{align*}

Corollary 8. If $G$ is a group scheme over $\mathbf{Sch}_{/\operatorname{Spec} k}$ and if $J$ is any topology on $\mathbf{Sch}_{/\operatorname{Spec} k}$ for which $\mathbf{Shv}(\mathbf{Sch}_{/\operatorname{Spec} k}, J)_{\mathrm{lcf}}$ is a Galois category, then there is an isomorphism of fundamental groups
\[
  \pi_1^{\text{\'Et}}(G, x) \cong \pi_1^{A^{J}_{h}}(hG, hx).
\]
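The last step in the proof of Theorem 4.1 passes from levelwise isomorphisms to an isomorphism of inverse limits. A sketch of that step follows, under the assumption (not spelled out in the text) that the isomorphisms of Lemma 10 are compatible with the transition maps of the cofinal system of normal objects $N_i$. The names $\lambda_i$, $p_{ij}$ and $q_{ij}$ are introduced here for illustration only, and $\operatorname{Aut}(N)$ abbreviates the hom-set of endomorphisms of $N$, which is a group for the objects in question by normality and Lemma 9.

```latex
% Assumed notation: for i <= j in the index set,
%   p_{ij} : Aut(N_j) -> Aut(N_i)  and  q_{ij} : Aut(F_! N_j) -> Aut(F_! N_i)
% are the transition maps, and \lambda_i is the isomorphism of Lemma 10.
\[
  \lambda_i : \operatorname{Aut}(N_i) \xrightarrow{\;\cong\;} \operatorname{Aut}(F_{!}N_i),
  \qquad
  \lambda_i \circ p_{ij} = q_{ij} \circ \lambda_j .
\]
% Compatible levelwise isomorphisms induce an isomorphism of inverse limits:
\[
  \varprojlim_{i} \operatorname{Aut}(N_i)
  \;\cong\;
  \varprojlim_{i} \operatorname{Aut}(F_{!}N_i).
\]
```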

5 Applications to Local Systems

Here we would like to present an important corollary of Theorem 4.1 above. In particular, it says that we can geometrize quasicharacters of $p$-adic group schemes by using $p$-adic formal schemes and the $A^{J}_{F}$ topology. To see this, we let $F$ be a $p$-adic field and let $G$ be a group scheme over $F$ with geometric point $x$. Then, as is well-known to representation theorists, étale local systems of $G$ arise as $\ell$-adic representations of the fundamental group, i.e., as representations $\pi_1(G, x) \to \mathrm{GL}(V)$ where $V$ is a vector space over $\mathbb{Q}_\ell$, for some integer prime $\ell$ coprime to $p$. Thus, if one wishes to geometrize representations of a connected, reductive group $G$ over a $p$-adic field $F$ with residue field $k$, it suffices to consider $A^{J}_{F}$-local systems on schemes of characteristic zero. This is a new result, as it allows us to (in theory, although at this point not in practice) translate some of the results in geometric representation theory to a characteristic zero analogue in formal schemes over the trait $\operatorname{Spec} \mathcal{O}_F$ of integers of $F$.

Theorem 5.1 If $G$ is a group scheme over $\operatorname{Spec} k$ with geometric point $x$, then there is an isomorphism of categories
\[
  \mathbf{Rep}\left(\pi_1^{\text{\'Et}}(G, x)\right) \cong \mathbf{Rep}\left(\pi_1^{A^{\text{\'Et}}_{h}}(hG, hx)\right).
\]
In particular, upon restricting to the categories of admissible irreducible representations, we obtain an isomorphism of categories
\[
  \mathbf{AdRep}\left(\pi_1^{\text{\'Et}}(G, x)\right) \cong \mathbf{AdRep}\left(\pi_1^{A^{\text{\'Et}}_{h}}(hG, hx)\right).
\]

This theorem is simply a restatement of the fact that if $G \cong H$ as $p$-adic groups, then their categories of representations (and their categories of admissible representations) are isomorphic as well.

To see how to apply this to quasicharacters of $p$-adic tori, we follow [11]. Let $F$ be a $p$-adic field and let $\mathcal{O}_F$ be the ring of integers of $F$. Then if $T$ is a torus over $F$, it admits a Néron model $N_T$ which is locally of finite type as a smooth, commutative group scheme over $\mathcal{O}_F$; cf. [6] for details. Moreover, $N_T(\mathcal{O}_F) = T(F)$ and $\operatorname{Gr}(N_T)(k) = N_T(\mathcal{O}_F) = T(F)$. In [11], the authors showed that the Trace of Frobenius gives a natural transformation from the category of quasicharacter sheaves on $N_T$ to continuous representations $N_T(\mathcal{O}_F) \to \mathbb{Q}_\ell^{*}$ which is surjective, i.e., the group homomorphism
\[
  \operatorname{Trace}(\operatorname{Frob}) : \mathrm{QCS}_{/\mathrm{iso}}(N_T) \to \mathbf{Top}(\mathbf{Grp})(N_T(\mathcal{O}_F), \mathbb{Q}_\ell^{*})
\]
is surjective. Now, since quasicharacter sheaves on $N_T$ arise as certain local systems on $\operatorname{Gr}(N_T)$, which in turn are representations of $\pi_1^{\text{\'Et}}(\operatorname{Gr}(N_T), x)$, we can use the functoriality of $h$ and the adhesive site $A^{\text{\'Et}}_{h}$ to lift these local systems to corresponding representations of the $h$-Ét fundamental group together with the corresponding functorial lift of Frobenius. Mimicking Sections 4.5 and 4.7 of [11] then allows one to prove that the adhesive site is rich enough to geometrize quasicharacters of the torus $T$.
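The remark that Theorem 5.1 restates a general fact about isomorphic groups can be unpacked as follows. This is a sketch for groups $G \cong H$ via a hypothetical isomorphism $\varphi$, a name introduced here for illustration; continuity and admissibility conditions are assumed to be transported along $\varphi$ and are not checked.

```latex
% Assumed: \varphi : G -> H is an isomorphism of (topological) groups.
\[
  \varphi^{*} : \mathbf{Rep}(H) \to \mathbf{Rep}(G), \qquad
  \bigl(\rho : H \to \mathrm{GL}(V)\bigr) \longmapsto
  \bigl(\rho \circ \varphi : G \to \mathrm{GL}(V)\bigr),
\]
% with inverse functor given by precomposition with \varphi^{-1}:
\[
  (\varphi^{-1})^{*} \circ \varphi^{*} = \mathrm{id}_{\mathbf{Rep}(H)}, \qquad
  \varphi^{*} \circ (\varphi^{-1})^{*} = \mathrm{id}_{\mathbf{Rep}(G)} .
\]
% Both functors act as the identity on intertwining linear maps V -> W.
```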


Acknowledgements

I would like to give deep thanks to my PhD supervisor, Clifton Cunningham, for both giving me the problem motivating this paper, and for all his help in guiding me through the preparation of this article. I would also like to thank the organizers of the MATRIX program Geometric and Categorical Representation Theory for the kind invitation to present the work in this paper, as well as the MATRIX institute itself for the hospitality shown during the program.


References

[1] Alessandra Bertapelle and Cristian D. González-Avilés, The Greenberg functor revisited, Eur. J. Math. 4 (2018), no. 4, 1340–1389. MR3866700
[2] Alessandra Bertapelle and Jilong Tong, On torsors under elliptic curves and Serre's pro-algebraic structures, Math. Z. 277 (2014), no. 1-2, 91–147. MR3205765
[3] Bhargav Bhatt and Peter Scholze, Projectivity of the Witt vector affine Grassmannian, Invent. Math. 209 (2017), no. 2, 329–423. MR3674218
[4] James Borger, The basic geometry of Witt vectors, I: The affine case, Algebra Number Theory 5 (2011), no. 2, 231–285. MR2833791
[5] James Borger, The basic geometry of Witt vectors. II: Spaces, Math. Ann. 351 (2011), no. 4, 877–933. MR2854117
[6] Siegfried Bosch, Werner Lütkebohmert, and Michel Raynaud, Néron models, Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)], vol. 21, Springer-Verlag, Berlin, 1990. MR1045822
[7] Alexandru Buium, Geometry of p-jets, Duke Math. J. 82 (1996), no. 2, 349–367. MR1387233
[8] J. R. B. Cockett and G. S. H. Cruttwell, Differential structure, tangent structure, and SDG, Appl. Categ. Structures 22 (2014), no. 2, 331–417. MR3192082
[9] J. R. B. Cockett, G. S. H. Cruttwell, and J. D. Gallagher, Differential restriction categories, Theory Appl. Categ. 25 (2011), No. 21, 537–613. MR2861119
[10] Robin Cockett and Stephen Lack, Restriction categories. III. Colimits, partial limits and extensivity, Math. Structures Comput. Sci. 17 (2007), no. 4, 775–817. MR2347616
[11] Clifton Cunningham and David Roe, From the function-sheaf dictionary to quasicharacters of p-adic tori, J. Inst. Math. Jussieu 17 (2018), no. 1, 1–37. MR3742553
[12] Marvin J. Greenberg, Schemata over local rings, Ann. of Math. (2) 73 (1961), 624–648. MR0126449
[13] Marvin J. Greenberg, Schemata over local rings. II, Ann. of Math. (2) 78 (1963), 256–266. MR0156855
[14] Alexander Grothendieck, Revêtements étales et groupe fondamental (SGA 1), Lecture notes in mathematics, vol. 224, Springer-Verlag, 1971.
[15] Peter T. Johnstone, Topos theory, Dover Edition, Academic Press, Inc., New York, 1977. Dover Publications, Inc. 2014 republication.
[16] Peter T. Johnstone, Sketches of an elephant: a topos theory compendium. Vol. 2, Oxford Logic Guides, vol. 44, The Clarendon Press, Oxford University Press, Oxford, 2002. MR2063092
[17] Stephen Lack and Pawel Sobociński, Adhesive and quasiadhesive categories, Theor. Inform. Appl. 39 (2005), no. 3, 511–545. MR2157046
[18] Serge Lang, On quasi algebraic closure, Ann. of Math. (2) 55 (1952), 373–390. MR0046388
[19] Saunders Mac Lane and Ieke Moerdijk, Sheaves in geometry and logic, Universitext, Springer-Verlag, New York, 1994. A first introduction to topos theory, Corrected reprint of the 1992 edition. MR1300636
[20] Johannes Nicaise and Julien Sebag, Motivic Serre invariants and Weil restriction, J. Algebra 319 (2008), no. 4, 1585–1610. MR2383059
[21] Johannes Nicaise and Julien Sebag, Motivic invariants of rigid varieties, and applications to complex singularities, Motivic integration and its interactions with model theory and nonarchimedean geometry, 2011, pp. 244–304.
[22] Karl Schwede, Gluing schemes and a scheme without closed points, Recent progress in arithmetic and algebraic geometry, 2005, pp. 157–172. MR2182775
[23] Jiu-Kang Yu, Smooth models associated to concave functions in Bruhat-Tits theory, Autour des schémas en groupes. Vol. III, 2015, pp. 227–258. MR3525846