Computing and Combinatorics: 26th International Conference, COCOON 2020, Atlanta, GA, USA, August 29–31, 2020, Proceedings [1st ed.] 9783030581497, 9783030581503

This book constitutes the proceedings of the 26th International Conference on Computing and Combinatorics, COCOON 2020,


English Pages XIV, 678 [691] Year 2020


Table of contents :
Front Matter ....Pages i-xiv
Subspace Approximation with Outliers (Amit Deshpande, Rameshwar Pratap)....Pages 1-13
Linear-Time Algorithms for Eliminating Claws in Graphs (Flavia Bonomo-Braberman, Julliano R. Nascimento, Fabiano S. Oliveira, Uéverton S. Souza, Jayme L. Szwarcfiter)....Pages 14-26
A New Lower Bound for the Eternal Vertex Cover Number of Graphs (Jasine Babu, Veena Prabhakaran)....Pages 27-39
Bounded-Degree Spanners in the Presence of Polygonal Obstacles (André van Renssen, Gladys Wong)....Pages 40-51
End-Vertices of AT-free Bigraphs (Jan Gorzny, Jing Huang)....Pages 52-63
Approaching Optimal Duplicate Detection in a Sliding Window (Rémi Géraud-Stewart, Marius Lombard-Platet, David Naccache)....Pages 64-84
Computational Complexity Characterization of Protecting Elections from Bribery (Lin Chen, Ahmed Sunny, Lei Xu, Shouhuai Xu, Zhimin Gao, Yang Lu, Weidong Shi, Nolan Shah)....Pages 85-97
Coding with Noiseless Feedback over the Z-Channel (Christian Deppe, Vladimir Lebedev, Georg Maringer, Nikita Polyanskii)....Pages 98-109
Path-Monotonic Upward Drawings of Graphs (Seok-Hee Hong, Hiroshi Nagamochi)....Pages 110-122
Seamless Interpolation Between Contraction Hierarchies and Hub Labels for Fast and Space-Efficient Shortest Path Queries in Road Networks (Stefan Funke)....Pages 123-135
Visibility Polygon Queries Among Dynamic Polygonal Obstacles in Plane (Sanjana Agrawal, R. Inkulu)....Pages 136-148
How Hard Is Completeness Reasoning for Conjunctive Queries? (Xianmin Liu, Jianzhong Li, Yingshu Li)....Pages 149-161
Imbalance Parameterized by Twin Cover Revisited (Neeldhara Misra, Harshil Mittal)....Pages 162-173
Local Routing in a Tree Metric 1-Spanner (Milutin Brankovic, Joachim Gudmundsson, André van Renssen)....Pages 174-185
Deep Specification Mining with Attention (Zhi Cao, Nan Zhang)....Pages 186-197
Constructing Independent Spanning Trees in Alternating Group Networks (Jie-Fu Huang, Sun-Yuan Hsieh)....Pages 198-209
W[1]-Hardness of the k-Center Problem Parameterized by the Skeleton Dimension (Johannes Blum)....Pages 210-221
An Optimal Lower Bound for Hierarchical Universal Solutions for TSP on the Plane (Patrick Eades, Julián Mestre)....Pages 222-233
Quantum Speedup for the Minimum Steiner Tree Problem (Masayuki Miyamoto, Masakazu Iwamura, Koichi Kise, François Le Gall)....Pages 234-245
Access Structure Hiding Secret Sharing from Novel Set Systems and Vector Families (Vipin Singh Sehrawat, Yvo Desmedt)....Pages 246-261
Approximation Algorithms for Car-Sharing Problems (Kelin Luo, Frits C. R. Spieksma)....Pages 262-273
Realization Problems on Reachability Sequences (Matthew Dippel, Ravi Sundaram, Akshar Varma)....Pages 274-286
Power of Decision Trees with Monotone Queries (Prashanth Amireddy, Sai Jayasurya, Jayalal Sarma)....Pages 287-298
Computing a Maximum Clique in Geometric Superclasses of Disk Graphs (Nicolas Grelier)....Pages 299-310
Shortest Watchman Tours in Simple Polygons Under Rotated Monotone Visibility (Bengt J. Nilsson, David Orden, Leonidas Palios, Carlos Seara, Paweł Żyliński)....Pages 311-323
Tight Approximation for the Minimum Bottleneck Generalized Matching Problem (Julián Mestre, Nicolás E. Stier Moses)....Pages 324-334
Graph Classes and Approximability of the Happy Set Problem (Yuichi Asahiro, Hiroshi Eto, Tesshu Hanaka, Guohui Lin, Eiji Miyano, Ippei Terabaru)....Pages 335-346
A Simple Primal-Dual Approximation Algorithm for 2-Edge-Connected Spanning Subgraphs (Stephan Beyer, Markus Chimani, Joachim Spoerhase)....Pages 347-359
Uniqueness of DP-Nash Subgraphs and D-sets in Weighted Graphs of Netflix Games (Gregory Gutin, Philip R. Neary, Anders Yeo)....Pages 360-371
On the Enumeration of Minimal Non-pairwise Compatibility Graphs (Naveed Ahmed Azam, Aleksandar Shurbevski, Hiroshi Nagamochi)....Pages 372-383
Constructing Tree Decompositions of Graphs with Bounded Gonality (Hans L. Bodlaender, Josse van Dobben de Bruyn, Dion Gijswijt, Harry Smit)....Pages 384-396
Election Control Through Social Influence with Unknown Preferences (Mohammad Abouei Mehrizi, Federico Corò, Emilio Cruciani, Gianlorenzo D’Angelo)....Pages 397-410
k-Critical Graphs in \(P_5\)-Free Graphs (Kathie Cameron, Jan Goedgebeur, Shenwei Huang, Yongtang Shi)....Pages 411-422
New Symmetry-less ILP Formulation for the Classical One Dimensional Bin-Packing Problem (Khadija Hadj Salem, Yann Kieffer)....Pages 423-434
On the Area Requirements of Planar Greedy Drawings of Triconnected Planar Graphs (Giordano Da Lozzo, Anthony D’Angelo, Fabrizio Frati)....Pages 435-447
On the Restricted 1-Steiner Tree Problem (Prosenjit Bose, Anthony D’Angelo, Stephane Durocher)....Pages 448-459
Computational Complexity of Synchronization Under Regular Commutative Constraints (Stefan Hoffmann)....Pages 460-471
Approximation Algorithms for General Cluster Routing Problem (Xiaoyan Zhang, Donglei Du, Gregory Gutin, Qiaoxia Ming, Jian Sun)....Pages 472-483
Hardness of Sparse Sets and Minimal Circuit Size Problem (Bin Fu)....Pages 484-495
Succinct Monotone Circuit Certification: Planarity and Parameterized Complexity (Mateus Rodrigues Alves, Mateus de Oliveira Oliveira, Janio Carlos Nascimento Silva, Uéverton dos Santos Souza)....Pages 496-507
On Measures of Space over Real and Complex Numbers (Om Prakash, B. V. Raghavendra Rao)....Pages 508-519
Parallelized Maximization of Nonsubmodular Function Subject to a Cardinality Constraint (Hongxiang Zhang, Dachuan Xu, Longkun Guo, Jingjing Tan)....Pages 520-531
An Improved Bregman k-means++ Algorithm via Local Search (Xiaoyun Tian, Dachuan Xu, Longkun Guo, Dan Wu)....Pages 532-541
Approximating Maximum Acyclic Matchings by Greedy and Local Search Strategies (Julien Baste, Maximilian Fürst, Dieter Rautenbach)....Pages 542-553
On the Complexity of Directed Intersection Representation of DAGs (Andrea Caucchiolo, Ferdinando Cicalese)....Pages 554-565
On the Mystery of Negations in Circuits: Structure vs Power (Prashanth Amireddy, Sai Jayasurya, Jayalal Sarma)....Pages 566-577
Even Better Fixed-Parameter Algorithms for Bicluster Editing (Manuel Lafond)....Pages 578-590
Approximate Set Union via Approximate Randomization (Bin Fu, Pengfei Gu, Yuming Zhao)....Pages 591-602
A Non-Extendibility Certificate for Submodularity and Applications (Umang Bhaskar, Gunjan Kumar)....Pages 603-614
Parameterized Complexity of Maximum Edge Colorable Subgraph (Akanksha Agrawal, Madhumita Kundu, Abhishek Sahu, Saket Saurabh, Prafullkumar Tale)....Pages 615-626
Approximation Algorithms for the Lower-Bounded k-Median and Its Generalizations (Lu Han, Chunlin Hao, Chenchen Wu, Zhenning Zhang)....Pages 627-639
A Survey for Conditional Diagnosability of Alternating Group Networks (Nai-Wen Chang, Sun-Yuan Hsieh)....Pages 640-651
Fixed Parameter Tractability of Graph Deletion Problems over Data Streams (Arijit Bishnu, Arijit Ghosh, Sudeshna Kolay, Gopinath Mishra, Saket Saurabh)....Pages 652-663
Mixing of Markov Chains for Independent Sets on Chordal Graphs with Bounded Separators (Ivona Bezáková, Wenbo Sun)....Pages 664-676
Back Matter ....Pages 677-678

LNCS 12273

Donghyun Kim R. N. Uma Zhipeng Cai Dong Hoon Lee (Eds.)

Computing and Combinatorics 26th International Conference, COCOON 2020 Atlanta, GA, USA, August 29–31, 2020 Proceedings

Lecture Notes in Computer Science

Founding Editors
Gerhard Goos, Karlsruhe Institute of Technology, Karlsruhe, Germany
Juris Hartmanis, Cornell University, Ithaca, NY, USA

Editorial Board Members
Elisa Bertino, Purdue University, West Lafayette, IN, USA
Wen Gao, Peking University, Beijing, China
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Gerhard Woeginger, RWTH Aachen, Aachen, Germany
Moti Yung, Columbia University, New York, NY, USA


More information about this series at http://www.springer.com/series/7407


Editors
Donghyun Kim, Georgia State University, Atlanta, USA
R. N. Uma, Department of Mathematics and Physics, North Carolina Central University, Durham, NC, USA
Zhipeng Cai, Computer Science, Georgia State University, Atlanta, GA, USA
Dong Hoon Lee, Korea University, Seoul, South Korea

ISSN 0302-9743
ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-030-58149-7
ISBN 978-3-030-58150-3 (eBook)
https://doi.org/10.1007/978-3-030-58150-3

LNCS Sublibrary: SL1 – Theoretical Computer Science and General Issues

© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

The 26th International Computing and Combinatorics Conference (COCOON 2020) was held during August 29–31, 2020. This year, COCOON 2020 was organized as a fully online conference. COCOON 2020 provided an excellent venue for researchers working in the area of algorithms, theory of computation, computational complexity, and combinatorics related to computing.

The technical program of the conference included 54 regular papers selected by the Program Committee from 126 full submissions received in response to the call for papers. Each submission was carefully reviewed by Program Committee members and/or external reviewers. Some of the papers were selected for publication in special issues of Theoretical Computer Science and the Journal of Combinatorial Optimization. It is expected that the journal version of the papers will appear in a more complete form.

We thank everyone who made this meeting possible: the authors for submitting papers, the Program Committee members, and external reviewers for volunteering their time to review conference papers. We also appreciate the financial sponsorship from Springer for the Best Paper Award. We would also like to extend special thanks to the chairs and conference Organizing Committee for their work in making COCOON 2020 a successful event.

August 2020

Donghyun Kim R. N. Uma Zhipeng Cai Dong Hoon Lee

Organization

Steering Committee Members
Lian Li (Chair), Hefei University of Technology, China
Zhipeng Cai, Georgia State University, USA
Xiaoming Sun, Chinese Academy of Sciences, China
My Thai, University of Florida, USA
Cong Tian, Xidian University, China
Jianping Yin, Dongguan University of Technology, China
Guochuan Zhang, Zhejiang University, China
Zhao Zhang, Zhejiang Normal University, China

General Chairs
Zhipeng Cai, Georgia State University, USA
Dong Hoon Lee, Korea University, South Korea

Program Chairs
Donghyun Kim, Georgia State University, USA
R. N. Uma, North Carolina Central University, USA

Publication Chairs
Yan Huang, Kennesaw State University, USA
Yingshu Li, Georgia State University, USA

Web/Submission System Chairs
Yongmin Ju, Georgia State University, USA
Yi Liang, Georgia State University, USA
Vannel Zeufack, Kennesaw State University, USA

Financial Chairs
Yongmin Ju, Georgia State University, USA
Daehee Seo, Sangmyung University, South Korea

Local Organization Chairs
Jeongsu Park, Korea University, South Korea
Daehee Seo, Sangmyung University, South Korea

Program Committee
Hans-Joachim Boeckenhauer, ETH Zurich, Switzerland
Zhi-Zhong Chen, Tokyo Denki University, Japan
Rajesh Chitnis, University of Birmingham, UK
Ovidiu Daescu, The University of Texas at Dallas, USA
Bhaskar Dasgupta, University of Illinois at Chicago, USA
Thang Dinh, Virginia Commonwealth University, USA
Stanley Fung, University of Leicester, UK
Xiaofeng Gao, Shanghai Jiao Tong University, China
Anirban Ghosh, University of North Florida, USA
Raffaele Giancarlo, Universita di Palermo, Italy
Yan Huang, Kennesaw State University, USA
Michael Khachay, Krasovsky Institute of Mathematics and Mechanics, Russia
Sung-Sik Kwon, North Carolina Central University, USA
Daniel Lee, Simon Fraser University, Canada
Joong-Lyul Lee, The University of North Carolina at Pembroke, USA
Shuai Cheng Li, City University of Hong Kong, Hong Kong
Wei Li, Georgia State University, USA
Xianyue Li, Lanzhou University, China
Bei Liu, Northwest University, USA
Bodo Manthey, University of Twente, The Netherlands
Manki Min, Louisiana Tech University, USA
Rashika Mishra, Lenovo, USA
N. S. Narayanaswamy, IIT Madras, India
Nam Nguyen, Towson University, USA
Pavel Skums, Georgia State University, USA
Xiaoming Sun, Institute of Computing Technology, Chinese Academy of Sciences, China
Ryuhei Uehara, Japan Advanced Institute of Science and Technology, Japan
Jianxin Wang, Central South University, China
Wei Wang, Xi’an Jiaotong University, China
Gerhard J. Woeginger, RWTH Aachen University, Germany
Wen Xu, Texas Woman’s University, USA
Boting Yang, University of Regina, Canada
Chee Yap, New York University, USA
Peng Zhang, Shandong University, China
Zhao Zhang, Zhejiang Normal University, China
Yi Zhu, Hawaii Pacific University, USA
Yuqing Zhu, California State University Los Angeles, USA
Martin Ziegler, KAIST, South Korea


Subspace Approximation with Outliers

Amit Deshpande (Microsoft Research, Bengaluru, India) and Rameshwar Pratap (IIT Mandi, Mandi, India)

Abstract. The subspace approximation problem with outliers, for given \(n\) points in \(d\) dimensions \(x_1, x_2, \ldots, x_n \in \mathbb{R}^d\), an integer \(1 \le k \le d\), and an outlier parameter \(0 \le \alpha \le 1\), is to find a \(k\)-dimensional linear subspace of \(\mathbb{R}^d\) that minimizes the sum of squared distances to its nearest \((1-\alpha)n\) points. More generally, the \(\ell_p\) subspace approximation problem with outliers minimizes the sum of \(p\)-th powers of distances instead of the sum of squared distances. Even the case of \(p = 2\) or robust PCA is non-trivial, and previous work requires additional assumptions on the input or generative models for it. Any multiplicative approximation algorithm for the subspace approximation problem with outliers must solve the robust subspace recovery problem, a special case in which the \((1-\alpha)n\) inliers in the optimal solution are promised to lie exactly on a \(k\)-dimensional linear subspace. However, robust subspace recovery is Small Set Expansion (SSE)-hard, and known algorithmic results for robust subspace recovery require strong assumptions on the input, e.g., any \(d\) outliers must be linearly independent. In this paper, we show how to extend dimension reduction techniques and bi-criteria approximations based on sampling and coresets to the problem of subspace approximation with outliers. To get around the SSE-hardness of robust subspace recovery, we assume that the squared distance error of the optimal \(k\)-dimensional subspace summed over the optimal \((1-\alpha)n\) inliers is at least \(\delta\) times its squared-error summed over all \(n\) points, for some \(0 < \delta \le 1-\alpha\). Under this assumption, we give an efficient algorithm to find a weak coreset or a subset of \(\mathrm{poly}(k/\epsilon) \log(1/\delta) \log\log(1/\delta)\) points whose span contains a \(k\)-dimensional subspace that gives a multiplicative \((1+\epsilon)\)-approximation to the optimal solution. The running time of our algorithm is linear in \(n\) and \(d\). Interestingly, our results hold even when the fraction of outliers \(\alpha\) is large, as long as the obvious condition \(0 < \delta \le 1-\alpha\) is satisfied. We show similar results for subspace approximation with \(\ell_p\) error or more general M-estimator loss functions, and also give an additive approximation for the affine subspace approximation problem.

The full version of the paper is available at https://arxiv.org/pdf/2006.16573.pdf.
© Springer Nature Switzerland AG 2020
D. Kim et al. (Eds.): COCOON 2020, LNCS 12273, pp. 1–13, 2020.
https://doi.org/10.1007/978-3-030-58150-3_1

1 Introduction

Finding low-dimensional representations of large, high-dimensional input data is an important first step for several problems in computational geometry, data mining, machine learning, and statistics. For given input points \(x_1, x_2, \ldots, x_n \in \mathbb{R}^d\), a positive integer \(1 \le k \le d\) and \(1 \le p < \infty\), the \(\ell_p\) subspace approximation problem asks to find a \(k\)-dimensional linear subspace \(V\) of \(\mathbb{R}^d\) that essentially minimizes the sum of \(p\)-th powers of the distances of all the points to the subspace \(V\), or to be precise, it minimizes the \(\ell_p\) error

\[ \left( \sum_{i=1}^{n} d(x_i, V)^p \right)^{1/p} \quad \text{or equivalently} \quad \sum_{i=1}^{n} d(x_i, V)^p. \]

For \(p = 2\), the optimal subspace is spanned by the top \(k\) right singular vectors of the matrix \(X \in \mathbb{R}^{n \times d}\) formed by \(x_1, x_2, \ldots, x_n\) as its rows. The optimal solution for \(p = 2\) can be computed efficiently by the Singular Value Decomposition (SVD) in time \(O(\min\{nd^2, n^2 d\})\). Liberty’s deterministic matrix sketching [15] and subsequent work [9] provide a faster, deterministic algorithm that runs in \(O(nd \cdot \mathrm{poly}(k/\epsilon))\) time and gives a multiplicative \((1+\epsilon)\)-approximation to the optimum. There is also a long line of work on randomized algorithms [17,19] that sample a subset of points and output a subspace from their span, giving a multiplicative \((1+\epsilon)\)-approximation in running time \(O(\mathrm{nnz}(X)) + (n+d)\,\mathrm{poly}(k/\epsilon)\), where \(\mathrm{nnz}(X)\) is the number of non-zero entries in \(X\). These are especially useful on sparse data.

For \(p \ne 2\), unlike the \(p = 2\) case, we do not know any simple description of the optimal subspace. For any \(p \ge 1\), Shyamalkumar and Varadarajan [18] give a \((1+\epsilon)\)-approximation algorithm that runs in time \(O(nd \cdot \exp((k/\epsilon)^{O(p)}))\). Building upon this, Deshpande and Varadarajan [4] give a bi-criteria \((1+\epsilon)\)-approximation by finding a subset of \(s = (k/\epsilon)^{O(p)}\) points in time \(O(nd \cdot \mathrm{poly}(k/\epsilon))\) such that their \(s\)-dimensional linear span gives a \((1+\epsilon)\)-approximation to the optimal \(k\)-dimensional subspace. The subset they find is basically a weak coreset, and projecting onto its span also gives a dimension-reduction result for subspace approximation. Feldman et al. [6] improve the running time to \(nd \cdot \mathrm{poly}(k/\epsilon) + (n+d) \cdot \exp(\mathrm{poly}(k/\epsilon))\) for \(p = 1\). Feldman and Langberg [5] extend this result to achieve a running time of \(nd \cdot \mathrm{poly}(k/\epsilon) + \exp((k/\epsilon)^{O(p)})\) for any \(p \ge 1\). Clarkson and Woodruff [3] improve this running time to \(O(\mathrm{nnz}(X) + (n+d) \cdot \mathrm{poly}(k/\epsilon) + \exp(\mathrm{poly}(k/\epsilon)))\) for any \(p \in [1, 2)\). The case \(p \in [1, 2)\), especially \(p = 1\), is important because the \(\ell_1\) error (i.e., the sum of distances) is more robust to outliers than the \(\ell_2\) error (i.e., the sum of squared distances).
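The p = 2 case described above can be illustrated with a minimal NumPy sketch. The function names `best_subspace_p2` and `sum_squared_distances` and the random test data are assumptions made for illustration; they are not taken from the paper. The sketch computes the top k right singular vectors and checks that the resulting error equals the sum of squares of the discarded singular values.

```python
import numpy as np

def best_subspace_p2(X, k):
    """Orthonormal basis (d x k) of an optimal k-dimensional subspace for p = 2."""
    # Rows of Vt are the right singular vectors of X,
    # sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def sum_squared_distances(X, B):
    """Sum of squared distances from the rows of X to span(B); B is orthonormal."""
    residual = X - (X @ B) @ B.T  # part of each row orthogonal to span(B)
    return float(np.sum(residual ** 2))

# Illustrative data (made up): 50 points in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
k = 2
B = best_subspace_p2(X, k)

# The optimal error equals the sum of squares of the discarded singular values.
singular_values = np.linalg.svd(X, compute_uv=False)
tail_energy = float(np.sum(singular_values[k:] ** 2))
print(sum_squared_distances(X, B), tail_energy)
```

This uses a full SVD, so it corresponds to the exact baseline mentioned in the text, not to the sketching or sampling algorithms cited there.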
We consider the following variant of \(\ell_p\) subspace approximation in the presence of outliers. Given points \(x_1, x_2, \ldots, x_n \in \mathbb{R}^d\), an integer \(1 \le k \le d\), \(1 \le p < \infty\), and an outlier parameter \(0 \le \alpha \le 1\), find a \(k\)-dimensional linear subspace \(V\) that minimizes the sum of \(p\)-th powers of distances of the \((1-\alpha)n\) points nearest to it. In other words, let \(N_\alpha(V) \subseteq [n]\) consist of the indices of the nearest \((1-\alpha)n\) points to \(V\) among \(x_1, x_2, \ldots, x_n\). We want to minimize

\[ \sum_{i \in N_\alpha(V)} d(x_i, V)^p. \]
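As a hedged illustration of the objective just defined, the following sketch evaluates the sum of p-th powers of distances over the (1 − α)n points nearest to a given subspace. The helper names, the rounding of (1 − α)n to an integer, and the toy data are assumptions for illustration only. It evaluates the objective for a fixed subspace and does not search for the minimizer.

```python
import numpy as np

def distances_to_subspace(X, B):
    """Euclidean distance from each row of X to span(B); B is d x k, orthonormal."""
    residual = X - (X @ B) @ B.T
    return np.linalg.norm(residual, axis=1)

def outlier_objective(X, B, alpha, p=2):
    """Sum of p-th powers of distances over the (1 - alpha) * n nearest points."""
    n = X.shape[0]
    m = int(round((1 - alpha) * n))  # number of inliers; rounding is an assumption
    nearest = np.sort(distances_to_subspace(X, B))[:m]
    return float(np.sum(nearest ** p))

# Toy data (made up): three points on the x-axis and one far-away outlier.
X = np.array([[1.0, 0.0], [2.0, 0.0], [-3.0, 0.0], [0.0, 5.0]])
B = np.array([[1.0], [0.0]])  # the x-axis as a 1-dimensional subspace

print(outlier_objective(X, B, alpha=0.25))      # the outlier is ignored
print(outlier_objective(X, B, alpha=0.0))       # all points are counted
print(outlier_objective(X, B, alpha=0.0, p=1))  # sum of distances instead of squares
```

With α = 0.25 the single off-axis point is excluded, so the x-axis has zero error, which is the robust subspace recovery situation on a tiny instance.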


The robust subspace recovery problem is a special case in which the optimal error for the subspace approximation problem with outliers is promised to be zero, that is, the optimal subspace \(V\) is promised to go through some \((1-\alpha)n\) points among \(x_1, x_2, \ldots, x_n\). Thus, any multiplicative approximation must also have zero error and recover the optimal subspace. Khachiyan [11] proved that it is NP-hard to find a \((d-1)\)-dimensional subspace that contains at least \((1-\epsilon)(1-1/d)n\) points. Hardt and Moitra [10] study robust subspace recovery and define an \((\epsilon, \delta)\)-Gap-Inlier problem of distinguishing between these two cases: (a) there exists a subspace of dimension \(\delta n\) containing \((1-\epsilon)\delta n\) points and (b) every subspace of dimension \(\delta n\) contains at most \(\delta n\) points. They show a polynomial time reduction from the \((\epsilon, \delta)\)-Gap-Small-Subset-Expansion problem to the \((\epsilon, \delta)\)-Gap-Inlier problem. For more on the Small Set Expansion conjecture and its connections to Unique Games, please see [16]. Under a strong assumption on the data (that requires any \(d\) or fewer outliers to be linearly independent), Hardt and Moitra give an efficient algorithm for finding a \(k\)-dimensional subspace containing \((1-k/d)n\) points. This naturally leaves open the question of finding other more reasonable approximations to the subspace approximation problem with outliers.

In recent independent work, Bhaskara and Kumar (see Theorem 12 in [1]) showed that if the \((\epsilon, \delta)\)-Gap-Small-Subset-Expansion problem is NP-hard, then there exists an instance of subspace approximation with outliers where the optimal inliers lie on a \(k\)-dimensional subspace but it is NP-hard to find even a subspace of dimension \(O(k/\sqrt{\epsilon})\) that contains all but \((1+\delta/4)\) times more points than the optimal number of outliers. This showed that even bi-criteria approximation for subspace recovery is a challenging problem. We compare and contrast our results with the result of Bhaskara and Kumar [1]. Their algorithm discards more outliers than the optimal solution, whereas ours does not discard any extra outliers. Also, their bi-criteria approximation depends on the “rank-\(k\) condition number”, which is a somewhat stronger assumption than ours.

The problem of clustering using points and lines in the presence of outliers has been studied in special cases of \(k\)-median and \(k\)-means clustering [2,13], and points and line clustering [7]. Krishnaswamy et al. [13] give a constant factor approximation for \(k\)-median and \(k\)-means clustering with outliers, whereas Feldman and Schulman give \((1+\epsilon)\)-approximations for \(k\)-median with outliers and \(k\)-line median with outliers that run in time linear in \(n\) and \(d\). Another recent line of research on robust regression considers data coming from an underlying distribution where a fraction of it is arbitrarily corrupted [12,14]. The problem we study is different as we do not assume any generative model for the input.

2 Our Contributions

– We assume that the $\ell_p$ error of the optimal subspace summed over the optimal $(1-\alpha)n$ inliers is at least $\delta$ times its total $\ell_p$ error summed over all $n$ points, for some $\delta > 0$. Under this assumption, we give an algorithm to efficiently find a subset of $\mathrm{poly}(pk/\epsilon) \cdot \log(1/\delta) \log\log(1/\delta)$ points from $x_1, x_2, \dots, x_n$ such that the span of this subset contains a $k$-dimensional linear subspace whose $\ell_p$ error over its nearest $(1-\alpha)n$ points is within $(1+\epsilon)$ of the optimum. The running time of our sampling-based algorithm is linear in $n$ and $d$. Note that even for $\delta$ as small as $1/\mathrm{poly}(n)$, our algorithm outputs a fairly small subset of $\mathrm{poly}(pk/\epsilon) \cdot \log n \log\log n$ points.

– Alternatively, the entire span of the above subset is a linear subspace of dimension $\mathrm{poly}(pk/\epsilon) \cdot \log(1/\delta) \log\log(1/\delta)$ that gives a bi-criteria multiplicative $(1+\epsilon)$-approximation to the optimal $k$-dimensional solution to the $\ell_p$ subspace approximation problem with outliers. Interestingly, this holds even when the fraction of outliers $\alpha$ is large, as long as the obvious condition $0 < \delta \le 1-\alpha$ is satisfied.

– Our assumption that the $\ell_p$ error of the optimal subspace summed over the optimal $(1-\alpha)n$ inliers is at least $\delta$ times its total $\ell_p$ error summed over all $n$ points, for some $\delta > 0$, is more reasonable and realistic than the assumptions used in previous work on subspace approximation with outliers. Without this assumption, our problem (even its special case of subspace recovery) is known to be Small Set Expansion (SSE)-hard [10].

– The technical contribution of our work is in showing that the sampling-based weak coreset constructions and dimension reduction results for the subspace approximation problem without outliers [4] also extend to its robust version for data with outliers. If we know the inlier-outlier partition of the data, then the result of [4] can easily be extended to the outlier version of the problem. However, if we do not know such a partition, then a brute-force approach has to go over all $\binom{n}{(1-\alpha)n}$ subsets and pick the best solution. This is certainly not efficient, as the number of such subsets is exponential in $n$. Further, on inputs that satisfy our assumption (stated above), it is easy to see that solving the subspace approximation problem without outliers gives a multiplicative $1/\delta$-approximation to the subspace approximation problem with outliers. Our contribution lies in showing that this approximation guarantee can be improved significantly in a small number of additional sampling steps.

– We show immediate extensions of our results to more general M-estimator loss functions as previously considered by [3].

– We show that our multiplicative approximation for the linear subspace approximation problem under $\ell_2$ error implies an additive approximation for the affine subspace approximation problem under $\ell_2$ error. The running time of this algorithm is also linear in $n$ and $d$.
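The $1/\delta$-approximation claim in the contributions above follows from a short chain of inequalities, sketched here for completeness. The notation is partly our own: $V^*$ is the optimal $k$-dimensional subspace with outliers, $I$ is its set of $(1-\alpha)n$ inliers, $\hat{V}$ (a symbol introduced only for this sketch) is an optimal $k$-dimensional subspace for the problem without outliers, and $N_\alpha(\hat{V})$ is the set of the $(1-\alpha)n$ points nearest to $\hat{V}$.

\[
\sum_{i \in N_\alpha(\hat{V})} d(x_i, \hat{V})^p
\;\le\; \sum_{i=1}^{n} d(x_i, \hat{V})^p
\;\le\; \sum_{i=1}^{n} d(x_i, V^*)^p
\;\le\; \frac{1}{\delta} \sum_{i \in I} d(x_i, V^*)^p .
\]

The first inequality drops nonnegative terms, the second uses the optimality of $\hat{V}$ over all $n$ points, and the third is exactly the assumption that the inlier error of $V^*$ is at least $\delta$ times its total error.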

3 Warm-Up: Least Squared Error Line Approximation with Outliers

As a warm-up towards the main proof, we first consider the case $k = 1$ and $p = 2$, that is, for a given $0 < \alpha < 1$, we want to find the best line that minimizes the sum of squared distances summed over its nearest $(1-\alpha)n$ points. Let $x_1, x_2, \dots, x_n \in \mathbb{R}^d$ be the given points and let $l^*$ be the optimal line. Let $I \subseteq [n]$ consist of the indices of the nearest $(1-\alpha)n$ points to $l^*$ among $x_1, x_2, \dots, x_n$. Our algorithm iteratively builds a subset $S \subseteq [n]$, starting from $S = \emptyset$. In each step it samples, with replacement, $\mathrm{poly}(k/\epsilon)$ i.i.d. points, where each point $x_i$ is picked with probability proportional to its squared distance to the span of the current subset, $d(x_i, \mathrm{span}(S))^2$. We abuse notation and write $\mathrm{span}(S)$ for the linear subspace spanned by $\{x_i : i \in S\}$. The sampled points are added to $S$, and the sampling step is repeated $\mathrm{poly}(k/\epsilon)$ times.

3.1 Additive Approximation

We are looking for a small subset $S \subseteq [n]$ of size $\mathrm{poly}(1/\epsilon)$ that contains a close additive approximation to the optimal subspace over the optimal inliers, that is, for the projection of $l^*$ onto $\mathrm{span}(S)$, denoted by $P_S(l^*)$,

\[
\sum_{i \in I} d(x_i, P_S(l^*))^2 \;\le\; \sum_{i \in I} d(x_i, l^*)^2 + \epsilon \sum_{i=1}^{n} \|x_i\|^2 . \tag{1}
\]

This immediately implies that there exists a line $l_S$ in $\mathrm{span}(S)$ such that

\[
\sum_{i \in N_\alpha(l_S)} d(x_i, l_S)^2 \;\le\; \sum_{i \in I} d(x_i, l^*)^2 + \epsilon \sum_{i=1}^{n} \|x_i\|^2 ,
\]

where $N_\alpha(l_S) \subseteq [n]$ consists of the indices of the nearest $(1-\alpha)n$ points from $x_1, x_2, \dots, x_n$ to $l_S$. Given any subset $S \subseteq [n]$, define the set of bad points as the subset of inliers $I$ whose error w.r.t. $P_S(l^*)$ is somewhat larger than their error w.r.t. the optimal line $l^*$, that is, $B(S) = \{i \in I : d(x_i, P_S(l^*))^2 > (1 + \epsilon/2)\, d(x_i, l^*)^2\}$, and good points as $G(S) = I \setminus B(S)$. The following lemma shows that sampling points with probability proportional to their squared lengths $\|x_i\|^2$ picks a bad point from $B(S)$ with probability at least $\epsilon/2$. Its proof is deferred to the full version.

Lemma 1. If $S \subseteq [n]$ does not satisfy (1), then $\sum_{i \in B(S)} \|x_i\|^2 \ge \frac{\epsilon}{2} \sum_{i=1}^{n} \|x_i\|^2$.

Below we show that a bad point sampled by squared-length sampling can be used to get another line closer to the optimal solution by a multiplicative factor, and that this step can be repeated. A proof of the following theorem is deferred to the full version.

Theorem 1. For any given $x_1, x_2, \dots, x_n \in \mathbb{R}^d$, let $S$ be an i.i.d. sample of $O\left((1/\epsilon^2) \log(1/\epsilon)\right)$ points picked by squared-length sampling. Let $I \subseteq [n]$ be the set of optimal $(1-\alpha)n$ inliers and $l^*$ be the optimal line that minimizes their squared distance. Then

\[
\sum_{i \in I} d(x_i, P_S(l^*))^2 \;\le\; \sum_{i \in I} d(x_i, l^*)^2 + \epsilon \sum_{i=1}^{n} \|x_i\|^2 ,
\]

with a constant probability.
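The objective used throughout, the error of a candidate line or subspace summed over its nearest $(1-\alpha)n$ points (the index set $N_\alpha$), can be evaluated directly. The following is a minimal sketch, assuming NumPy, points stored as rows of a matrix, and a candidate subspace given by orthonormal columns; the function name `trimmed_error` and the rounding of $(1-\alpha)n$ to an integer are illustrative assumptions, not part of the paper.

```python
import numpy as np

def trimmed_error(X, basis, alpha, p=2):
    """Error of a linear subspace summed over its nearest (1 - alpha) * n points.

    X     : (n, d) array, one point per row.
    basis : (d, k) array with orthonormal columns spanning the candidate subspace.
    alpha : outlier fraction.
    p     : exponent applied to the Euclidean distances.

    Returns (error, inlier_indices). Rounding (1 - alpha) * n to the nearest
    integer is an assumption of this sketch.
    """
    n = X.shape[0]
    # Residual of each point after orthogonal projection onto the subspace.
    residual = X - (X @ basis) @ basis.T
    dist_p = np.linalg.norm(residual, axis=1) ** p
    m = int(round((1 - alpha) * n))
    # Indices of the m points closest to the subspace.
    nearest = np.argsort(dist_p)[:m]
    return dist_p[nearest].sum(), nearest
```

For example, with three points on the first coordinate axis and one far away from it, calling `trimmed_error(X, basis, alpha=0.25)` for that axis ignores the far point and reports zero error.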

3.2 Multiplicative Approximation

To turn this into a multiplicative $(1+\epsilon)$ guarantee, we need to use this adaptively by treating the projections of $x_1, x_2, \dots, x_n$ orthogonal to $\mathrm{span}(S)$ as our new points, and repeating the squared-length sampling on these new points. Here is the modified statement of the additive approximation that we need.

Theorem 2. For any given points $x_1, x_2, \dots, x_n \in \mathbb{R}^d$ and any initial subset $S_0$, let $S$ be an i.i.d. sample of $O\left((1/\epsilon^2) \log(1/\epsilon)\right)$ points sampled with probability proportional to $d(x_i, \mathrm{span}(S_0))^2$. Let $I \subseteq [n]$ be the set of optimal $(1-\alpha)n$ inliers and $l^*$ be the optimal line that minimizes their squared distance. Then, with a constant probability, we have

\[
\sum_{i \in I} d(x_i, P_{S \cup S_0}(l^*))^2 \;\le\; \sum_{i \in I} d(x_i, l^*)^2 + \epsilon \sum_{i=1}^{n} d(x_i, \mathrm{span}(S_0))^2 .
\]

Proof. Similar to the proof of Theorem 1 but using the projections of the $x_i$'s orthogonal to $\mathrm{span}(S_0)$ as the point set. In particular, we apply Lemma 1 to the projections of the $x_i$'s orthogonal to $\mathrm{span}(S_0)$ instead of the $x_i$'s.

Note that Theorem 1 is a special case with $S_0 = \emptyset$. Repeating this squared-distance sampling adaptively for multiple rounds brings the additive approximation error down exponentially in the number of rounds. We defer a proof of the following theorem to the full version.

Theorem 3. For any given points $x_1, x_2, \dots, x_n \in \mathbb{R}^d$, any initial subset $S_0$ and positive integer $T$, let $S_t$ be an i.i.d. sample of $O\left((1/\epsilon^2) \log(1/\epsilon) \log T\right)$ points sampled with probability proportional to $d(x_i, \mathrm{span}(S_{t-1}))^2$, for $1 \le t \le T$. Let $I \subseteq [n]$ be the set of optimal $(1-\alpha)n$ inliers and $l^*$ be the optimal line that minimizes their squared distance. Then, with a constant probability,

\[
\sum_{i \in I} d(x_i, P_{S_0 \cup S_1 \cup \dots \cup S_T}(l^*))^2 \;\le\; (1+\epsilon) \sum_{i \in I} d(x_i, l^*)^2 + \epsilon^T \sum_{i=1}^{n} d(x_i, \mathrm{span}(S_0))^2 .
\]
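The multi-round sampling in Theorem 3 can be sketched as follows. This is only an illustration under stated assumptions: it uses NumPy, starts from an empty initial subset, measures distances to the span of all points sampled so far (as in the warm-up description of the algorithm), and takes the number of rounds and the per-round sample size as plain parameters rather than the $O(\cdot)$ bounds of the theorem. The function name `adaptive_sample` and the numerical tolerances are our own choices.

```python
import numpy as np

def adaptive_sample(X, rounds, per_round, p=2, rng=None):
    """Adaptive sampling of row indices of X.

    In each round, index i is drawn (i.i.d., with replacement) with probability
    proportional to d(x_i, span(S))^p, where S is the set of indices sampled so far.
    Returns the list of sampled indices and an orthonormal basis of their span.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    chosen = []
    basis = np.zeros((d, 0))                      # span of the empty set
    scale = (np.linalg.norm(X, axis=1) ** p).sum()
    for _ in range(rounds):
        # Distance of every point to the span of the current sample.
        residual = X - (X @ basis) @ basis.T
        weights = np.linalg.norm(residual, axis=1) ** p
        total = weights.sum()
        if total <= 1e-12 * scale:                # everything already (numerically) in the span
            break
        idx = rng.choice(n, size=per_round, replace=True, p=weights / total)
        chosen.extend(idx.tolist())
        # Orthonormal basis of span{x_i : i in chosen} via the SVD.
        _, s, vt = np.linalg.svd(X[chosen], full_matrices=False)
        rank = int((s > 1e-10 * s[0]).sum())
        basis = vt[:rank].T
    return chosen, basis
```

Every sampled point has positive weight and therefore lies outside the current span, so each round increases the dimension of the span by at least one until the residual mass vanishes. The sketch returns only the sampled subset; finding the best line or subspace inside its span is a separate step not shown here.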

Now assume that the optimal inlier error for $l^*$ is at least $\delta$ times its error over the entire data, that is, $\sum_{i \in I} d(x_i, l^*)^2 \ge \delta \sum_{i=1}^{n} d(x_i, l^*)^2$. In that case, we can show a much stronger multiplicative $(1+\epsilon)$-approximation instead of an additive one, using an analysis similar to that of [8]. We defer its proof to the full version of the paper.

Theorem 4. For any given points $x_1, x_2, \dots, x_n \in \mathbb{R}^d$, let $I \subseteq [n]$ be the set of optimal $(1-\alpha)n$ inliers and $l^*$ be the optimal line that minimizes their squared distance. Suppose $\sum_{i \in I} d(x_i, l^*)^2 \ge \delta \sum_{i=1}^{n} d(x_i, l^*)^2$. For any $0 < \epsilon < 1$, we can efficiently find a subset $S$ of size $O\left((1/\epsilon^2) \log(1/\epsilon) \log(1/\delta) \log\log(1/\delta)\right)$ s.t.

\[
\sum_{i \in I} d(x_i, P_S(l^*))^2 \;\le\; (1+\epsilon) \sum_{i \in I} d(x_i, l^*)^2 ,
\]

with a constant probability.

4 $\ell_p$ Subspace Approximation with Outliers

Given an instance of $k$-dimensional subspace approximation with outliers as points $x_1, x_2, \dots, x_n \in \mathbb{R}^d$, a positive integer $1 \le k \le d$, a real number $p \ge 1$, and an outlier parameter $0 < \alpha < 1$, let $V^*$ be the optimal $k$-dimensional linear subspace that minimizes the $\ell_p$ error summed over its nearest $(1-\alpha)n$ points from $x_1, x_2, \dots, x_n$. Let $I \subseteq [n]$ denote the subset of indices of these nearest $(1-\alpha)n$ points to $V^*$. In other words, $I = N_\alpha(V^*)$ consists of the indices of the optimal inliers. Given any subset $S \subseteq [n]$, let $l_S$ be the line or direction in $\mathrm{span}(S)$ that makes the smallest angle with $V^*$, and define the subspace $W_S$ as the rotation of $V^*$ along this angle so as to contain $l_S$. To be precise, let $l^*$ be the projection of $l_S$ onto $V^*$ and let $W^*$ be the orthogonal complement of $l^*$ in $V^*$. Observe that $W_S$ is the $k$-dimensional linear subspace spanned by $l_S$ and $W^*$. We say that $S$ contains a line $l_S$ useful for additive approximation if

\[
\sum_{i \in I} d(x_i, W_S)^p \;\le\; \sum_{i \in I} d(x_i, V^*)^p + \epsilon \sum_{i=1}^{n} \|x_i\|^p . \tag{2}
\]
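The construction of $W_S$ from $V^*$ and $l_S$ described above can be made concrete with a few lines of linear algebra. The sketch below assumes NumPy, an orthonormal basis for $V^*$ given as matrix columns, and a direction $l_S$ that is not orthogonal to $V^*$ (otherwise its projection $l^*$ is zero and the rotation is undefined). The function name `rotate_towards` is hypothetical, and the step of finding $l_S$ inside $\mathrm{span}(S)$ is not shown.

```python
import numpy as np

def rotate_towards(V_star, l_S):
    """Orthonormal basis of W_S = span(l_S, W*), where W* is the orthogonal
    complement, inside V*, of the projection l* of l_S onto V*.

    V_star : (d, k) array with orthonormal columns spanning V*.
    l_S    : (d,) nonzero direction vector, assumed not orthogonal to V*.
    """
    l_S = l_S / np.linalg.norm(l_S)
    # Coordinates (in the basis V_star) of the projection of l_S onto V*.
    coords = V_star.T @ l_S
    norm = np.linalg.norm(coords)
    if norm < 1e-12:
        raise ValueError("l_S is orthogonal to V*; the projection l* is zero")
    # Full SVD of the k x 1 coordinate vector: the first left singular vector is
    # +/- coords / norm, the remaining k - 1 span its orthogonal complement in R^k.
    U, _, _ = np.linalg.svd((coords / norm).reshape(-1, 1), full_matrices=True)
    W_star = V_star @ U[:, 1:]                    # (d, k - 1) basis of W*
    # l_S is orthogonal to W*, so stacking gives an orthonormal basis of W_S.
    return np.column_stack([l_S, W_star])
```

Because $l_S$ splits into a component along $l^*$ and a component orthogonal to $V^*$, it is automatically orthogonal to $W^*$, which is why no further orthogonalization is needed in the last step.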

Define the set of bad points as the subset of inliers $I$ whose error w.r.t. $W_S$ is somewhat larger than their error w.r.t. the optimal subspace $V^*$, that is, $B(S) = \{i \in I : d(x_i, W_S)^p > (1 + \epsilon/2)\, d(x_i, V^*)^p\}$, and good points as $G(S) = I \setminus B(S)$. The following lemma shows that sampling the $i$-th point with probability proportional to $\|x_i\|^p$ picks a bad point $i \in B(S)$ with probability at least $\epsilon/2$. A proof of the following lemma is deferred to the full version of the paper.

Lemma 2. If $S \subseteq [n]$ does not satisfy (2), then $\sum_{i \in B(S)} \|x_i\|^p \ge \frac{\epsilon}{2} \sum_{i=1}^{n} \|x_i\|^p$.

4.1 Additive Approximation: One Dimension at a Time

Below we show that a bad point $i \in B(S)$ can be used to improve $W_S$, or in other words, $\mathrm{span}(S \cup \{i\})$ contains a line $l_{S \cup \{i\}}$ that is much closer to $V^*$ than $l_S$. We defer a proof of the following theorem to the full version of the paper.

Theorem 5. For any given points $x_1, x_2, \dots, x_n \in \mathbb{R}^d$, let $S$ be an i.i.d. sample of $O\left((p^2/\epsilon^2) \log(1/\epsilon)\right)$ points picked with probabilities proportional to $\|x_i\|^p$. Let $I \subseteq [n]$ be the set of optimal $(1-\alpha)n$ inliers and $V^*$ be the optimal subspace that minimizes the $\ell_p$ error over the inliers. Also let $W_S$ be defined as in the beginning of Sect. 4. Then, with a constant probability, we have

\[
\sum_{i \in I} d(x_i, W_S)^p \;\le\; \sum_{i \in I} d(x_i, V^*)^p + \epsilon \sum_{i=1}^{n} \|x_i\|^p .
\]

Note that we can start with any given initial subspace $S_0$ and prove a similar result for sampling points with probability proportional to $d(x_i, \mathrm{span}(S_0))^p$.

Theorem 6. For any given points $x_1, x_2, \dots, x_n \in \mathbb{R}^d$ and an initial subspace $S_0$, let $S$ be an i.i.d. sample of $O\left((p^2/\epsilon^2) \log(1/\epsilon)\right)$ points picked with probabilities proportional to $d(x_i, \mathrm{span}(S_0))^p$. Let $I \subseteq [n]$ be the set of optimal $(1-\alpha)n$ inliers and $V^*$ be the optimal subspace that minimizes the $\ell_p$ error over the inliers. Also let $W_S$ be defined as in the beginning of Sect. 4. Then, with a constant probability, we have

\[
\sum_{i \in I} d(x_i, W_{S \cup S_0})^p \;\le\; \sum_{i \in I} d(x_i, V^*)^p + \epsilon \sum_{i=1}^{n} d(x_i, \mathrm{span}(S_0))^p .
\]

Proof. Similar to the proof of Theorem 5 above.

Once we have a line $l_S$ that is close to $V^*$, we can project orthogonal to it and repeat the sampling again. The caveat is that we do not know $l_S$. One can get around this by projecting all the points $x_1, x_2, \dots, x_n$ to $\mathrm{span}(S)$ of the current sample $S$, and repeating.

Theorem 7. For any given points $x_1, x_2, \dots, x_n \in \mathbb{R}^d$, let $S = S_1 \cup S_2 \cup \dots \cup S_k$ be a sample of $\tilde{O}\left(p^2 k^2/\epsilon^2\right)$ points picked as follows: $S_1$ is an i.i.d. sample of $O\left((p^2 k/\epsilon^2) \log(k/\epsilon)\right)$ points picked with probability proportional to $\|x_i\|^p$, $S_2$ is an i.i.d. sample of $O\left((p^2 k/\epsilon^2) \log(k/\epsilon)\right)$ points picked with probability proportional to $d(x_i, \mathrm{span}(S_1))^p$, and so on. Let $I \subseteq [n]$ be the set of optimal $(1-\alpha)n$ inliers and $V^*$ be the optimal subspace that minimizes the $\ell_p$ error over the inliers. Then, with a constant probability, $\mathrm{span}(S)$ contains a $k$-dimensional subspace $V_S$ such that

\[
\sum_{i \in I} d(x_i, V_S)^p \;\le\; \sum_{i \in I} d(x_i, V^*)^p + \epsilon \sum_{i=1}^{n} \|x_i\|^p .
\]

Proof. Similar to the proof of Theorem