PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON FOUNDATIONS OF COMPUTER SCIENCE
Editors Hamid R. Arabnia Fernando G. Tinetti
CSCE’17, July 17-20, 2017, Las Vegas, Nevada, USA (americancse.org)
© CSREA Press
This volume contains papers presented at The 2017 International Conference on Foundations of Computer Science (FCS'17). Their inclusion in this publication does not necessarily constitute endorsements by editors or by the publisher.
Copyright and Reprint Permission Copying without a fee is permitted provided that the copies are not made or distributed for direct commercial advantage, and credit to source is given. Abstracting is permitted with credit to the source. Please contact the publisher for other copying, reprint, or republication permission.
© Copyright 2017 CSREA Press ISBN: 1-60132-456-1 Printed in the United States of America
Foreword

It gives us great pleasure to introduce this collection of papers to be presented at the 2017 International Conference on Foundations of Computer Science (FCS’17), July 17-20, 2017, at Monte Carlo Resort, Las Vegas, USA. An important mission of the World Congress in Computer Science, Computer Engineering, and Applied Computing, CSCE (a federated congress with which this conference is affiliated) includes "Providing a unique platform for a diverse community of constituents composed of scholars, researchers, developers, educators, and practitioners. The Congress makes a concerted effort to reach out to participants affiliated with diverse entities (such as: universities, institutions, corporations, government agencies, and research centers/labs) from all over the world. The congress also attempts to connect participants from institutions that have teaching as their main mission with those who are affiliated with institutions that have research as their main mission. The congress uses a quota system to achieve its institution and geography diversity objectives." By any definition of diversity, this congress is among the most diverse scientific meetings in the USA. We are proud to report that this federated congress had authors and participants from 64 different nations, representing a variety of personal and scientific experiences that arise from differences in culture and values. As can be seen below, the program committee of this conference, as well as the program committees of all other tracks of the federated congress, are as diverse as its authors and participants.

The program committee would like to thank all those who submitted papers for consideration. About 65% of the submissions were from outside the United States. Each submitted paper was peer-reviewed by two experts in the field for originality, significance, clarity, impact, and soundness. In cases of contradictory recommendations, a member of the conference program committee was charged to make the final decision; often, this involved seeking help from additional referees. In addition, papers whose authors included a member of the conference program committee were evaluated using the double-blind review process. One exception to the above evaluation process was for papers that were submitted directly to chairs/organizers of pre-approved sessions/workshops; in these cases, the chairs/organizers were responsible for the evaluation of such submissions. The overall paper acceptance rate for regular papers was 19%; 8% of the remaining papers were accepted as poster papers (at the time of this writing, we had not yet received the acceptance rate for a couple of individual tracks).

We are very grateful to the many colleagues who offered their services in organizing the conference. In particular, we would like to thank the members of the Program Committee of FCS’17, members of the congress Steering Committee, and members of the committees of federated congress tracks that have topics within the scope of FCS. Many individuals listed below will be requested after the conference to provide their expertise and services for selecting papers for publication (extended versions) in journal special issues as well as for publication in a set of research books (to be prepared for publishers including: Springer, Elsevier, BMC journals, and others).
• Prof. Hamid R. Arabnia (Congress Steering Committee); Graduate Program Director (PhD, MS, MAMS); The University of Georgia, USA; Editor-in-Chief, Journal of Supercomputing (Springer); Fellow, Center of Excellence in Terrorism, Resilience, Intelligence & Organized Crime Research (CENTRIC)
• Prof. Juan Jose Martinez Castillo; Director, The Acantelys Alan Turing Nikola Tesla Research Group and GIPEB, Universidad Nacional Abierta, Venezuela
• Prof. Kevin Daimi (Congress Steering Committee); Director, Computer Science and Software Engineering Programs, Department of Mathematics, Computer Science and Software Engineering, University of Detroit Mercy, Detroit, Michigan, USA
• Prof. Zhangisina Gulnur Davletzhanovna; Vice-Rector of Science, Central-Asian University, Almaty, Republic of Kazakhstan; Vice President of the International Academy of Informatization, Almaty, Republic of Kazakhstan
• Prof. Leonidas Deligiannidis (Congress Steering Committee); Department of Computer Information Systems, Wentworth Institute of Technology, Boston, Massachusetts, USA; Visiting Professor, MIT, USA
• Prof. Tai-hoon Kim; School of Information and Computing Science, University of Tasmania, Australia
• Prof. Dr. Guoming Lai; Computer Science and Technology, Sun Yat-Sen University, Guangzhou, P. R. China
• Dr. Vitus S. W. Lam; Senior IT Manager, Information Technology Services, The University of Hong Kong, Kennedy Town, Hong Kong; Chartered Member of The British Computer Society, UK; Former Vice Chairman of the British Computer Society (Hong Kong Section); Chartered Engineer & Fellow of the Institution of Analysts and Programmers
• Prof. Dr., Eng. Robert Ehimen Okonigene (Congress Steering Committee); Department of Electrical & Electronics Engineering, Faculty of Engineering and Tech., Ambrose Alli University, Edo State, Nigeria
• Prof. Igor Schagaev; Director of ITACS Ltd, United Kingdom (formerly a Professor at London Metropolitan University, London, UK)
• Chiranjibi Sitaula; Head, Department of Computer Science and IT, Ambition College, Kathmandu, Nepal
• Ashu M. G. Solo (Publicity); Fellow of British Computer Society; Principal/R&D Engineer, Maverick Technologies America Inc.
• Dr. Tse Guan Tan; Faculty of Creative Technology and Heritage, Universiti Malaysia Kelantan, Kelantan, Malaysia
• Prof. Fernando G. Tinetti (Congress Steering Committee); School of CS, Universidad Nacional de La Plata, La Plata, Argentina; Co-editor, Journal of Computer Science and Technology (JCS&T)
• Varun Vohra; Certified Information Security Manager (CISM); Certified Information Systems Auditor (CISA); Associate Director (IT Audit), Merck, New Jersey, USA
• Prof. Layne T. Watson (Congress Steering Committee); Fellow of IEEE; Fellow of The National Institute of Aerospace; Professor of Computer Science, Mathematics, and Aerospace and Ocean Engineering, Virginia Polytechnic Institute & State University, Blacksburg, Virginia, USA
• Prof. Jane You (Congress Steering Committee); Associate Head, Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong
We would like to extend our appreciation to the referees and the members of the program committees of individual sessions, tracks, and workshops; their names do not appear in this document but are listed on the web sites of the individual tracks. As sponsors-at-large, partners, and/or organizers, each of the following (separated by semicolons) provided help for at least one track of the Congress: Computer Science Research, Education, and Applications Press (CSREA); US Chapter of World Academy of Science; American Council on Science & Education & Federated Research Council (http://www.americancse.org/); HoIP, Health Without Boundaries, Healthcare over Internet Protocol, UK (http://www.hoip.eu); HoIP Telecom, UK (http://www.hoip-telecom.co.uk); and WABT, Human Health Medicine, UNESCO NGOs, Paris, France (http://www.thewabt.com/). In addition, a number of university faculty members and their staff (names appear on the cover of the set of proceedings), several publishers of computer science and computer engineering books and journals, chapters and/or task forces of computer science associations/organizations from 3 regions, and developers of high-performance machines and systems provided significant help in organizing the conference as well as providing some resources. We are grateful to them all.

We express our gratitude to the keynote, invited, individual conference/track, and tutorial speakers; the list of speakers appears on the conference web site. We would also like to thank the following: UCMSS (Universal Conference Management Systems & Support, California, USA) for managing all aspects of the conference; Dr. Tim Field of APC for coordinating and managing the printing of the proceedings; and the staff of Monte Carlo Resort (Convention department) at Las Vegas for the professional service they provided. Last but not least, we would like to thank the Co-Editors of FCS’17: Prof. Hamid R. Arabnia and Prof. Fernando G. Tinetti.

We present the proceedings of FCS’17.
Steering Committee, 2017 http://americancse.org/
Contents

SESSION: GRAPH AND NETWORK BASED ALGORITHMS
  Critical Graphs for the Minimum-Vertex-Cover Problem (Andreas Jakoby, Naveen Kumar Goswami, Eik List, Stefan Lucks) ..... 3
  Some Hardness Results for Distance Dominating Set (Yong Zhang) ..... 10

SESSION: SOFTWARE SYSTEMS AND RELATED ISSUES
  User-level Deterministic Replay via Accurate Non-deterministic Event Capture (Jeongtaek Lim, Hosang Yoon, Hyunmin Yoon, Yoomee Ko, Minsoo Ryu) ..... 15
  Analysis and Evaluation of Locks Designed for NUMA System (Joohwan Hong, Seokyong Jung, Kihyun Yun, Minsoo Ryu) ..... 19

SESSION: NOVEL ALGORITHMS
  A New Algorithm for Tiered Binary Search (Ahmed Tarek) ..... 25
  Algorithms for the Majority Problem (Rajarshi Tarafdar, Yijie Han) ..... 32
  Investigating the Benefits of Parallel Processing for Binary Search (Paul Mullins, Gennifer Elise Farrell, C. Ronald Baldwin) ..... 38
SESSION: GRAPH AND NETWORK BASED ALGORITHMS
Chair(s): TBA
Critical Graphs for the Minimum-Vertex-Cover Problem
Andreas Jakoby, Naveen Kumar Goswami, Eik List, Stefan Lucks
Faculty of Media, Bauhaus-Universität Weimar, Bauhausstr. 11, D-99423 Weimar, Germany
@uni-weimar.de
Abstract— In the context of the chromatic-number problem, a critical graph is an instance where the deletion of any element would decrease the graph’s chromatic number. Such instances have been shown to be interesting objects of study for deepening the understanding of the optimization problem. This work introduces critical graphs in the context of the Minimum-Vertex-Cover problem. We demonstrate their potential for the generation of larger graphs with hidden, a-priori-known solutions. Firstly, we propose a parametrized graph-generation process which preserves the knowledge of the minimum cover. Secondly, we conduct a systematic search for small critical graphs. Thirdly, we illustrate the applicability for benchmarking purposes by reporting on a series of experiments using the state-of-the-art heuristic solver NuMVC.

Keywords: critical graphs, minimum vertex cover, graph generation, benchmark generator
1. Introduction

The Minimum Vertex Cover Problem. A vertex cover C for a given graph G = (V, E) is a subset of vertices C ⊆ V such that every edge in E is incident to at least one vertex in C. A minimum vertex cover (MVC) is a vertex cover of the smallest possible size. The task of finding a minimum vertex cover in a given graph is a classical NP-hard optimization problem [3], and its decision version is one of the 21 original NP-complete problems listed by Karp [6]. While the construction of a maximal matching yields a trivial 2-approximation algorithm (see the sketch below), it is NP-hard to approximate MVC within any factor smaller than 1.3606 unless P = NP [2], and, assuming the Unique Games Conjecture, within any constant factor smaller than 2; though, one can achieve an approximation ratio of 2 − o(1) [5]. The MVC problem is strongly related to at least three further NP-hard problems. Finding a minimum vertex cover is equivalent to finding a Maximum Independent Set (MIS), i.e., a subset of vertices wherein no pair of vertices shares an edge. An MIS problem instance can in turn be transformed into an instance of the Maximum-size Clique (MC) problem; moreover, there is a straightforward reduction of a (binary) Constraint Satisfaction Problem (CSP) to an MIS problem. In practice, the MVC problem plays an important role in network security, industrial machine assignment, and facility location [7], [8], [16]. Furthermore, MVC algorithms can be used for solving MIS problems, e.g., in the analysis of social networks, pattern recognition, and the alignment of protein sequences in bioinformatics [8], [10].
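For concreteness, the following is a minimal Java sketch (our own illustrative code, not from the paper) of the trivial 2-approximation mentioned above: scanning the edges and taking both endpoints of every edge that is not yet covered selects a maximal matching, and the endpoints of the matched edges form a cover of at most twice the minimum size.

import java.util.*;

class VertexCoverApprox {
    // Returns a vertex cover of size at most twice the minimum.
    static Set<Integer> twoApproxCover(int[][] edges) {
        Set<Integer> cover = new HashSet<>();
        for (int[] e : edges) {
            int u = e[0], v = e[1];
            // Edge not yet covered: add both endpoints. The chosen edges
            // are pairwise disjoint, i.e., they form a maximal matching.
            if (!cover.contains(u) && !cover.contains(v)) {
                cover.add(u);
                cover.add(v);
            }
        }
        return cover;
    }

    public static void main(String[] args) {
        int[][] c5 = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 0}}; // 5-cycle, MVC size 3
        System.out.println(twoApproxCover(c5)); // a cover of size 4 <= 2 * 3
    }
}

Any minimum cover must contain at least one endpoint of each matched edge, which is where the factor of two comes from.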
Critical Graphs. Graphs are called critical if they are minimal with regard to a certain measure. More precisely, an edge of a given graph G is called a critical element iff its deletion would decrease the measure; G is then called edge-critical (or simply critical) iff every edge is a critical element. The concept of critical graphs is mostly used in works on the chromatic number. However, it can also be of significant interest for the MVC problem, where we call a graph critical iff the deletion of any edge would decrease the size of the minimal cover. The insights from studying such graphs could help to deepen our understanding of the complexity of the Vertex Cover problem or to find more efficient solvers.

Randomized Graph Generation. Critical graphs can further serve as a basis for constructing larger graphs. Since small critical instances possess an easily determined cover size, a parametrized graph-generation process that preserves the criticality could create large instances while maintaining the knowledge about the solution. One application for such graphs could be, e.g., the dedicated generation of particularly hard instances for benchmarking purposes. Following a series of previous graph-generation models [9], [13], the idea of generating such graphs for the minimum-vertex-cover problem was introduced by Xu and Li [14], [15] and revisited in [12]. During the past decade, Xu’s BHOSLIB suite [11] has established itself as a valuable benchmark suite for the evaluation of MVC, MIS, MC, and CSP solvers.

Contribution. This work studies critical graphs for the Minimum Vertex Cover problem. First, we propose a (not necessarily efficiently implementable) graph-generation process which can create all possible graphs while preserving the knowledge of the minimum cover size. To implement this process efficiently, we restrict it to a certain set of extensions that enlarge a critical graph while maintaining the criticality. Second, we systematically search for small critical instances. As a useful observation, we show that, if all critical graphs for the MVC problem were known, our restricted process could efficiently generate all possible graphs. Third, we illustrate the applicability of instances generated by our process for benchmarking. We report on a series of experiments with the state-of-the-art heuristic solver NuMVC [1] on exemplary instances that were generated by our randomized process.

Outline. In the following, Section 2 defines critical graphs for the Vertex-Cover problem. Section 3 describes our approach for generating hard random graphs from critical graphs. Section 4 studies circulant graphs as a source of critical instances. Section 5 presents the extensions we used. Section 6 details the results of our experiments, and Section 7 concludes.
2. Critical Graphs for the Minimum-Vertex-Cover Problem

Definition 2.1: A connected graph G = (V, E) is edge-uncritical (uncritical hereafter) according to an optimization problem P on graphs iff there exists an edge e ∈ E such that every solution for P on G′ = (V, E \ {e}) is a solution for P on G. A connected graph is edge-critical (critical hereafter) according to an optimization problem P iff it is not uncritical.

For the vertex-cover problem, this implies:

Observation 2.2: A connected graph G = (V, E) is critical according to the vertex-cover problem (in short: VC-critical) iff deleting any edge reduces the minimum cover size.

Several simple critical graphs exist.

Theorem 2.3: Cliques and cycles of odd length are VC-critical.

Analyzing graphs with a perfect matching, one can show that cycles of even length are uncritical. Now, we can easily prove the following observations:

Observation 2.4: Let G = (V, E) be a connected undirected graph that has a perfect matching E′. The minimal size of a vertex cover is at least |V|/2. Moreover, if the minimal size of a vertex cover is exactly |V|/2, then either E = E′, i.e., G is its own perfect matching, or |E| > |E′|, and then G is VC-uncritical.

Since there exists a perfect matching for cycles of even length, Observation 2.4 implies:

Corollary 2.5: Cycles of even length 2k with k ≥ 2 are VC-uncritical.

Observation 2.6: Let G = (V, E) be a connected undirected graph and let U ⊂ V be a minimum vertex cover for G. If there exists an edge e ∈ E such that both endpoints of e are in the cover U, then either G is uncritical or there exists a minimum vertex cover U′ for G such that only one of the endpoints of e is in the cover U′.

If, for a graph G = (V, E), there is a minimum cover U ⊆ V such that no edge {u, v} ∈ E has both u, v ∈ U, then the graph is bipartite. In that case, the minimum cover is one side of the bipartite decomposition of the vertex set. Furthermore, one can show that VC-critical graphs have to be 2-connected.

Theorem 2.7: Let G = (V, E) be a graph with an articulation vertex u; then G is VC-uncritical.

Another useful observation is the following:

Observation 2.8: Let G = (V, E) be a VC-critical graph. Then, for every vertex u ∈ V, there exists a minimum vertex cover C of G with u ∈ C.
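To experiment with these notions on small instances, VC-criticality can be tested directly from Observation 2.2 by brute force. The following sketch (our own code, exponential in n and thus only usable for very small graphs) computes the minimum cover size over all vertex subsets and re-checks it after deleting each edge.

import java.util.*;

class VcCritical {
    // Minimum vertex cover size by exhaustive search over vertex subsets.
    static int mvcSize(int n, List<int[]> edges) {
        for (int k = 0; k <= n; k++)
            for (int mask = 0; mask < (1 << n); mask++)
                if (Integer.bitCount(mask) == k && covers(mask, edges)) return k;
        return n;
    }

    static boolean covers(int mask, List<int[]> edges) {
        for (int[] e : edges)
            if ((mask & (1 << e[0])) == 0 && (mask & (1 << e[1])) == 0) return false;
        return true;
    }

    // VC-critical iff deleting any single edge reduces the cover size (Observation 2.2).
    static boolean isCritical(int n, List<int[]> edges) {
        int c = mvcSize(n, edges);
        for (int i = 0; i < edges.size(); i++) {
            List<int[]> rest = new ArrayList<>(edges);
            rest.remove(i);
            if (mvcSize(n, rest) == c) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<int[]> c5 = Arrays.asList(new int[]{0, 1}, new int[]{1, 2},
                new int[]{2, 3}, new int[]{3, 4}, new int[]{4, 0});
        System.out.println(isCritical(5, c5)); // true: odd cycles are VC-critical (Theorem 2.3)
    }
}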
3. Generating Larger Graphs from Critical Instances

One relevant application of critical graphs is the construction of larger graphs. For this purpose, we need a (randomized) generation process which (1) allows us to construct all possible graphs, and (2) preserves the knowledge of a hidden solution, i.e., of the minimum vertex cover. This section presents such a generation process.

Definition 3.1: Let B = {B1, B2, ...} be a set of graphs where each graph is given by a triple Bi = (U, V, E) with two disjoint sets of vertices U and V such that U gives a minimum vertex cover for the graph (U ∪ V, E). Define the following random processes:
• Given B and ℓ, m, n ∈ ℕ for the vertex cover size ℓ, an upper bound m for the number of edges, and an upper bound n for the number of vertices, G¹_{B,ℓ,m,n} is a random variable that uniformly at random gives a collection S1, ..., Sk of elements of B (some elements of B may repeat) with Si = (Ui, Vi, Ei) such that

|U1 ∪ ... ∪ Uk| = ℓ,  |V1 ∪ ... ∪ Vk| + ℓ ≤ n,  and  |E1 ∪ ... ∪ Ek| ≤ m.

Let C(S1, ..., Sk) = (U1 ∪ ... ∪ Uk, V1 ∪ ... ∪ Vk, E1 ∪ ... ∪ Ek).
• Given a triple B = (U, V, E) with |V| + |U| ≤ n, G²_n(B) is the triple (U, V ∪ V′, E) where V′ denotes a set of n − |U| − |V| new vertices (V′ ∩ (U ∪ V) = ∅).
• Given a triple B = (U, V, E) with |E| ≤ m, let G³_m(B) be a random variable that uniformly at random adds m − |E| new edges E′ ⊂ U × (U ∪ V) to B.

Since none of the defined processes reduces the cover size, we can conclude:

Theorem 3.2: Let B = {B1, B2, ...} be a set of graphs, where each graph is given by a triple Bi = (U, V, E) with two disjoint sets of vertices U and V such that U gives a minimum vertex cover for (U ∪ V, E). Then, for every random graph (U′, V′, E′) in the range of G³_m(G²_n(C(G¹_{B,ℓ,m,n}))), U′ is a minimum vertex cover for (U′ ∪ V′, E′) of size ℓ, and the graph (U′ ∪ V′, E′) has n vertices and m edges.

Thus, the three processes do not affect our a-priori knowledge of the minimum cover size. It remains to show that any graph can be constructed by the three processes. This observation follows by analyzing the reverse processes:

Theorem 3.3: Let B = {B1, B2, ...} with Bi = (U, V, E) be the set of all critical graphs Gi = (U ∪ V, E), where U and V are disjoint and U gives a minimum vertex cover for the graph Gi. Then the range of G³_m(G²_n(C(G¹_{B,ℓ,m,n}))) is the set of all graphs with n vertices, m edges, and minimum vertex cover size ℓ.
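The padding steps G² and G³ are simple to implement once a base graph with a known cover is fixed. Below is a minimal sketch under our own data layout (the block-selection step G¹ is omitted): the cover is U = {0, ..., cov−1}, the padded vertices are implicit, and every added edge has at least one endpoint in U, which by Theorem 3.2 leaves the minimum cover size unchanged.

import java.util.*;

class GraphGen {
    // G^2 and G^3 of Definition 3.1 (sketch). Vertices 0..cov-1 form the
    // cover U; cov..n-1 are the remaining (possibly padded) vertices.
    // Assumes m does not exceed the number of admissible edges and n < 10^6.
    static Set<Long> extend(int cov, Set<Long> edges, int n, int m, Random rnd) {
        Set<Long> e = new HashSet<>(edges);
        while (e.size() < m) {            // G^3: add m - |E| new random edges
            int u = rnd.nextInt(cov);     // one endpoint inside the cover U
            int w = rnd.nextInt(n);       // other endpoint anywhere in U ∪ V ∪ V'
            if (u != w) e.add(edgeKey(u, w));
        }
        return e;
    }

    static long edgeKey(int a, int b) {   // order-independent edge encoding
        return (long) Math.min(a, b) * 1_000_000L + Math.max(a, b);
    }
}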
4. Circulant Graphs

We will now discuss graph classes that can be seen as a generalization of cycles and as extensions of cliques.

Definition 4.1: A circulant graph CirculantGraph(n, L) with n ∈ ℕ and L ⊆ {1, ..., ⌈n/2⌉} is an undirected graph with n vertices v_0, ..., v_{n−1} where each vertex v_i is adjacent to both vertices v_{(i+j) mod n} and v_{(i−j) mod n} for all j ∈ L.
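A direct transcription of Definition 4.1 into adjacency sets might look as follows (the representation and names are ours):

import java.util.*;

class Circulant {
    // Adjacency sets of CirculantGraph(n, L): vertex i is adjacent to
    // (i + j) mod n and (i - j) mod n for every j in L.
    static List<Set<Integer>> build(int n, int[] L) {
        List<Set<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new TreeSet<>());
        for (int i = 0; i < n; i++)
            for (int j : L) {
                adj.get(i).add(((i + j) % n + n) % n);
                adj.get(i).add(((i - j) % n + n) % n);
            }
        return adj;
    }
}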
Fig. 1: Visualization of critical circulant graphs of degree four (occurrences over vertices n and interval j).

Fig. 2: The critical graph C_{15,3} on the vertices u_0, ..., u_14.
To determine critical graphs, we analyzed circulant graphs of degree four, i.e., CirculantGraph(n, L) with n ∈ [2, 80], L = {1, j}, and j ∈ [2, 20]. We identified 121 critical graphs of degree four in this range, whose occurrences are visualized in Figure 1. Furthermore, we implemented a search for all critical circulant graphs of degree six, i.e., for CirculantGraph(n, L) with |L| = 3, n ∈ [4, 60], L = {1, i, j}, and i, j ∈ [2, 20]. We determined in total 427 critical graphs within this range. Due to space limitations, they are given in the full version [4].

To conclude this section, we present a general rule that determines for a subset of circulant graphs whether they are critical or not. One can see that Theorem 2.3 is a consequence of the following result.

Definition 4.2: For n, d_h ∈ ℕ, define the undirected graph C_{n,d_h} = (V_{n,d_h}, E_{n,d_h}) by V_{n,d_h} = {u_0, ..., u_{n−1}} and E_{n,d_h} = ⋃_{j=0}^{n−1} {{u_j, v} | v ∈ N⁺_{j,n,d_h} ∪ N⁻_{j,n,d_h}}, where for all i ∈ {0, ..., n−1}, it holds that N⁺_{i,n,d_h} = {u_{(i+1) mod n}, ..., u_{(i+d_h) mod n}} and N⁻_{i,n,d_h} = {u_{(i−1) mod n}, ..., u_{(i−d_h) mod n}}. Thus, C_{n,d_h} = CirculantGraph(n, L) where L = [1, d_h]. Figure 2 illustrates the graph C_{15,3} as an example.

Lemma 4.3: The minimum vertex cover of each graph C_{n,d_h} has size n − ⌈(n − d_h)/(d_h + 1)⌉.

Theorem 4.4: C_{n,d_h} is a connected critical graph iff either n ≤ 2·d_h + 1 or n − d_h is a multiple of d_h + 1.
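As a quick plausibility check of Lemma 4.3 and Theorem 4.4, both the closed-form cover size and the criticality condition can be evaluated directly; the sketch below (our own naming) reproduces, for instance, that C_{15,3} from Figure 2 has cover size 15 − ⌈12/4⌉ = 12 and is critical because 12 is a multiple of 4.

class CnDh {
    // Cover size of C_{n,d} by Lemma 4.3: n - ceil((n - d) / (d + 1)).
    static int coverSize(int n, int d) {
        return n - (int) Math.ceil((double) (n - d) / (d + 1));
    }

    // Criticality condition of Theorem 4.4.
    static boolean isCritical(int n, int d) {
        return n <= 2 * d + 1 || (n - d) % (d + 1) == 0;
    }

    public static void main(String[] args) {
        System.out.println(coverSize(15, 3));  // 12
        System.out.println(isCritical(15, 3)); // true
    }
}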
5. Extensions for the Efficient Generation of Graphs

To obtain a variety of critical graphs, one needs efficient generation methods and patterns. This section introduces two methods which can efficiently produce critical graphs by extending cycles of odd length: they are called the parallel extension (see Figure 3a) and the chain extension (see Figure 3b) hereafter. Let us consider the parallel extension
first. Before that, we need a useful definition of what we call a VC-Overlap.

Definition 5.1: Let G = (V, E) be an undirected connected graph and let U ⊆ V be a subset of vertices of G. U is called a VC-Overlap iff for every minimum vertex cover C of G, it holds that U ⊈ C. If, additionally, for every vertex u ∈ U there exists a minimum vertex cover C_u of G such that u ∉ C_u and U \ {u} ⊆ C_u, then U is a 1-VC-Overlap. Let u be a new vertex and U ⊆ V. Then, define the extension of G according to u and U as Γ(G, U, u) = (V ∪ {u}, E ∪ {{u, v} | v ∈ U}).

Observation 5.2: Given a graph G = (V, E), a VC-Overlap U ⊆ V, and a new vertex u, the size of a minimum vertex cover of Γ(G, U, u) is the size of a minimum vertex cover of G plus one.

Observation 5.3: Given a VC-critical graph G = (V, E), a VC-Overlap U ⊆ V, and a new vertex u: if U is not a 1-VC-Overlap, then Γ(G, U, u) is VC-uncritical.

Theorem 5.4: Given a graph G = (V, E), a 1-VC-Overlap U ⊆ V, and a new vertex u: if G is VC-critical, then Γ(G, U, u) is VC-critical.

Note that, if U is given by a vertex of G plus its neighborhood, then U is a VC-Overlap. However, Theorem 5.4 does not provide us with an efficient method for constructing critical graphs. For this purpose, the following definition simplifies the extension Γ(G, U, u).

Definition 5.5 (Parallel Extension): Let G = (V, E) be an undirected connected graph and let v ∈ V be a vertex of G. Then, define G \ v = (V \ {v}, E \ {{v, u} | u ∈ V}) to be the graph G without v. For an undirected graph G = (V, E) and a vertex v ∈ V, let N_G(v) = {u ∈ V | {v, u} ∈ E} denote the neighborhood of v in G.
Fig. 3: Two critical graphs, a) and b), generated from a cycle of five vertices.
Let e = {u, v} ∈ E be an edge of G. Then, we call u and v neighbor-equivalent iff N_G(v) \ {u} = N_G(u) \ {v}. Moreover, let G be a family of undirected graphs; then define the set of parallel extensions Γ_par(G) of G to be the set of graphs that can be generated from any graph G = (V, E) ∈ G by adding a new node u ∉ V and new edges E′ such that for at least one vertex v ∈ V, it holds that E′ = {{u, v′} | v′ ∈ N_G(v) ∪ {v}}. We call u the parallel extension of v. Define Γ⁰_par(G) = G and Γᵏ_par(G) = Γ^{k−1}_par(Γ_par(G)). We write

Γ*_par(G) = ⋃_{k ∈ {0,...,∞}} Γᵏ_par(G)
for the transitive closure of G. If G = {G} consists of a single graph G, we usually write Γ_par(G) and Γ*_par(G) instead of Γ_par({G}) and Γ*_par({G}). A minimum vertex cover for any graph G ∈ Γ_par(G′) can easily be determined from a minimum vertex cover of G′.

Theorem 5.6: Let G = (V, E) be an undirected connected graph and let e = {u, v} ∈ E be such that u and v are neighbor-equivalent. Let U be a minimum vertex cover for G \ u. Then U ∪ {u} is a minimum vertex cover for G.

Corollary 5.7: Let G be a critical graph with minimum vertex cover size m. Then the minimum vertex cover size of each graph in Γ_par(G) is m + 1. More generally, the minimum vertex cover size of each graph in Γᵏ_par(G) is m + k.

We show that a parallel extension of any critical graph gives a new critical graph.

Theorem 5.8: If G is a connected critical graph, then all graphs in Γ_par(G) are connected and critical.

Corollary 5.9: Let G be a critical graph. Then all graphs in Γ*_par(G) are critical.

Let C denote the family of all cliques and let OC denote the family of all cycles of odd length. For ease of notation, we assume that a graph consisting of only a single node is in C and in OC, and that a graph consisting of only two connected nodes is in C. Then, we have:

Observation 5.10: C ⊂ Γ*_par(OC) and C = Γ*_par(G₁) where G₁ = ({v}, ∅). Moreover, if G₂ = ({u, v}, {{u, v}}) and G₃ denotes the cycle of length three, then C \ {G₁} = Γ*_par(G₂) and C \ {G₁, G₂} = Γ*_par(G₃).
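In code, a parallel extension is a simple graph operation: duplicate a chosen vertex v together with its closed neighborhood. A minimal sketch (the adjacency-set representation is ours):

import java.util.*;

class ParallelExtension {
    // Adds a new vertex u adjacent to v and to every neighbor of v, so
    // that u and v become neighbor-equivalent (Definition 5.5).
    static List<Set<Integer>> extend(List<Set<Integer>> adj, int v) {
        List<Set<Integer>> g = new ArrayList<>();
        for (Set<Integer> s : adj) g.add(new TreeSet<>(s));
        int u = g.size();
        g.add(new TreeSet<>());
        for (int w : adj.get(v)) {        // copy v's neighborhood to u
            g.get(u).add(w);
            g.get(w).add(u);
        }
        g.get(u).add(v);                  // and connect u to v itself
        g.get(v).add(u);
        return g;
    }
}

By Theorem 5.8 and Corollary 5.7, applying this operation to a critical graph with known cover size m yields a critical graph with cover size m + 1.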
Thus, simple edges and cycles of length three are interesting candidates with which to start a generation process for critical graphs. Based on the following observation, one can show that the relation of neighbor-equivalence can be used to define equivalence classes of vertices.

Observation 5.11: Given an undirected connected graph G = (V, E), assume that u is neighbor-equivalent to v and to w; then v is neighbor-equivalent to w.

The analysis of the equivalence classes leads to the observation that the sets of graphs which can be generated by parallel extensions from two cycles of different odd lengths i, j ≥ 3 are always distinct. Thus, we look for a second type of extension that helps us to generate cycles of odd length. Consider the critical graph of Figure 3.b), which cannot be generated by parallel extensions of a cycle. This graph yields a second type of extension.

Definition 5.12 (Chain Extension): Let G = (V, E) be an undirected connected graph and let e = {u, v} ∈ E be an edge of G. Let x, y ∉ V denote two new nodes. Then the graph G′ = (V ∪ {x, y}, {{u, x}, {x, y}, {y, v}} ∪ E \ {e}) is called a chain extension of G. Let G be a family of undirected graphs; then define Γ_chain(G) to be the set of graphs that can be generated from any graph G = (V, E) ∈ G by one chain extension. Let Γ*_chain(G) denote the transitive closure of G according to Γ_chain. If we apply the extension k times, then the set of resulting graphs is denoted by Γᵏ_chain(G). If G consists of a single graph G, then we use the notions Γ_chain(G) and Γ*_chain(G).

Chain extensions do not always result in critical graphs when the original graph is critical. Consider, e.g., a graph that consists of a single edge: extending this edge leads to a chain of four vertices, and therefore to an uncritical graph. To avoid this problem, we now focus on extensions of edges whose endpoints occur together in a single minimum vertex cover.

Definition 5.13: An edge e = {u, v} of a graph G fulfills the double-cover condition if there exists a minimum vertex cover U of G with u, v ∈ U. A graph G fulfills the double-cover condition if every edge of G fulfills it.
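A chain extension is equally easy to implement; the following sketch (our representation) replaces an edge {u, v} by a path through two new vertices x and y. By Theorem 5.14 below, criticality is only guaranteed to be preserved when the extended edge fulfills the double-cover condition.

import java.util.*;

class ChainExtension {
    // Replaces the edge {u, v} by the path u - x - y - v with two new
    // vertices x and y (Definition 5.12).
    static List<Set<Integer>> extend(List<Set<Integer>> adj, int u, int v) {
        List<Set<Integer>> g = new ArrayList<>();
        for (Set<Integer> s : adj) g.add(new TreeSet<>(s));
        g.get(u).remove(v);               // delete the original edge
        g.get(v).remove(u);
        int x = g.size(), y = x + 1;      // two fresh vertices
        g.add(new TreeSet<>());
        g.add(new TreeSet<>());
        g.get(u).add(x); g.get(x).add(u);
        g.get(x).add(y); g.get(y).add(x);
        g.get(y).add(v); g.get(v).add(y);
        return g;
    }
}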
Theorem 5.14: Let G be a family of critical undirected graphs. If G neither contains the graph with a single node nor the graph with a single edge, and if each graph of G fulfills the double-cover condition, then all graphs of Γ_chain(G) and Γ_par(G) are critical and fulfill the double-cover condition.

Corollary 5.15: Let G be a family of critical undirected graphs without the single-node graph and without the single-edge graph that fulfills the double-cover condition; then any graph of Γ*_chain(G) is critical.

So, the parallel and the chain extension preserve the property that a graph is uncritical or critical. One can also show that the inverse extensions preserve this property as well. The proofs for these observations can be found in the full version of this work [4].

Definition 5.16: Let G be a family of undirected graphs; then define Γ_chain,par(G) = Γ_chain(G) ∪ Γ_par(G). Let Γ*_chain,par(G) denote the transitive closure of G according to Γ_chain,par. If we apply the extension k times, then the set of resulting graphs is denoted by Γᵏ_chain,par(G). If G consists of a single graph G, we use the notions Γ_chain,par(G) and Γ*_chain,par(G).

There are connected critical graphs that cannot be generated by chain and parallel extensions from a cycle of length three. An example is illustrated in Figure 4.

Fig. 4: A critical graph that cannot be generated by chain and parallel extensions from a cycle of length three.

For generating our benchmark graphs, we focus on the set of base graphs B that can be generated by parallel and chain extensions from a cycle of length three. To efficiently generate such graphs, we used several bounds on the number of edges according to the number of vertices and the size of a minimum vertex cover. For a given graph G = (V, E), we write n(G) = |V| for the number of vertices of G, m(G) = |E| for the number of edges of G, c(G) for the size of the minimum vertex cover of G, and c̄(G) = n(G) − c(G) for the number of vertices not in the minimum cover. We establish some general bounds for m(G) according to n(G) and c(G). Since there are no edges between two vertices that are not in the cover, we can conclude:

Observation 5.17: For every graph G, it holds that m(G) ≤ c(G)·n(G) − c(G)·(c(G) + 1)/2.

Given two vectors ū = (u_1, ..., u_t) and v̄ = (v_1, ..., v_t), we call ū lexicographically smaller than v̄, denoted by ū ≤_lex v̄, if either both vectors are equal or there exists an index i ∈ {1, ..., t} such that u_i < v_i and u_j = v_j for all j ∈ {1, ..., i−1}.

Lemma 5.18 (Lower-Bound Lemma): Let G = (V, E) be a graph generated according to Definition 3.1, where B is given by the graphs in Γ*_chain,par(K₃) ∪ {K₁, K₂}. Let (α_{n(G)}, α_{n(G)−1}, ..., α_1) ∈ ℕ^{n(G)} denote the lexicographically minimal vector that fulfills the following two equations:

n(G) = Σ_{i=1}^{n(G)} i·α_i  and  c(G) = Σ_{i=1}^{n(G)} (i−1)·α_i.

Then, we have

m(G) ≥ Σ_{i=1}^{n(G)} (i·(i−1)/2)·α_i.

We conclude this section with the observation that the vectors ᾱ = (α_n, ..., α_1) fulfill

n = Σ_{i=1}^{n} i·α_i  and  c = Σ_{i=1}^{n} (i−1)·α_i.

One can show that if, for ᾱ = (α_n, ..., α_1), there are two indices i, j with α_i, α_j > 0 and i ∉ {j, j+1}, then the vector ᾱ_{i,j} = (α′_n, ..., α′_1) with

α′_k = α_k for k ∉ {i, i−1, j+1, j},  α′_k = α_k − 1 for k ∈ {i, j},  α′_k = α_k + 1 for k ∈ {i−1, j+1}

is also a solution of the two equations on c and n. On the other hand, if i > j + 1, then

Σ_{k=1}^{n} (k·(k−1)/2)·α′_k < Σ_{k=1}^{n} (k·(k−1)/2)·α_k,

i.e., the exchange strictly decreases the edge bound.
A New Algorithm for Tiered Binary Search
Ahmed Tarek

L1: Search pool with n given elements.
L2: List containing m different keys.
k_i: The i-th key in the list L2, i = 1, 2, ..., m.
r: The list-size-to-key-size ratio, r = n/m; for n > m, n/m > 1.
t_i: Index position of the i-th key k_i in the search pool L1. Here 1 ≤ t_i ≤ n and 1 ≤ i ≤ m.
left: Index position of the leftmost element in the search pool L1.
right: Index position of the rightmost element in the search pool L1.
l: The length of a list (search pool) or a search space.
p: The number of partitioning keys for the generalized tiered search.

Definition 1 (Keygap, g): The number of elements between two successive keys k_i and k_{i+1} in the search pool L1 is known as the Keygap; here i = 1, 2, ..., (m − 1). For instance, the number of elements between the keys k_j and k_{j+1} is t_{j+1} − t_j − 1, so the Keygap between these two keys is g_j = t_{j+1} − t_j − 1 elements. For uniformly distributed keys in the search pool L1, the Keygap is g ≈ (n − m)/m = n/m − 1 throughout the search pool.

Definition 2 (Inter-Key Space Elements, I): The elements between all successive pairs of keys are collectively known as the Inter-Key Space Elements (IKSE), denoted by I. Therefore, I = Σ_{j=1}^{m−1} (t_{j+1} − t_j − 1). If a key is not identified inside the larger list L1, then I measures the number of elements between the first preceding key existing within the search pool and the next succeeding key existing within the search pool. If none of the m keys exists within L1, then I = 0. If only one key, say the j-th key k_j, exists in L1, then I = n − t_j. If only the first key k_1 and the last key k_m exist within L1, then I = t_m − t_1 − 1. If only the h-th and the s-th keys are in L1 with 1 ≤ h < s ≤ m, then I = t_s − t_h − 1. I for other key index combinations may be determined analogously.

Definition 3 (Search Space Improvement Factor, SSIF): The ratio of the computational search space C_O explored when the keys are searched individually inside the search pool arr[], one key at a time, to the search space C_T explored by the proposed tiered binary search algorithm. Therefore, SSIF = C_O / C_T. In general, SSIF = C_O / C_T >> 1; the higher the ratio SSIF, the greater the efficiency achieved by the proposed tiered binary search algorithm.
3. Formal Foundation

With the multiple-key binary search strategy, instead of bluntly looking for a single key inside a larger list, all keys are searched for simultaneously. This strategy significantly reduces the computational overhead of individually executing the search algorithm with only one key element. Multi-key tiered binary search provides a very high yield on sorted list elements, sortedness being a requirement for binary search. With the proposed search algorithm, there are two sets. One set, S1, is the search pool (the larger list) of distinct elements to which the proposed algorithm is applied. The other set, S2, contains the distinct keys that are to be mapped onto the set S1. Using set-theoretic notation, |S1| = n, |S2| = m, and n >> m. If f is the mapping function from S2 to S1, then {y = f(x) = x | y ∈ S1 & x ∈ S2}. Therefore, the mapping function f is an identity function, I_x, which maps each key element from the domain of keys to the identical element in the co-domain of the search pool, L1. The key index domain for the mapping is Z⁺ ∪ {0} for a Java or C++ implementation. The element index co-domain for the mapping is Z⁺ ∪ {0} ∪ {−1} (with −1 indicating an absent key) for a programming language implementation. Assuming all key elements from S2 may be mapped onto corresponding elements in the set S1, the
Predicate Logic Model for the mapping becomes: ∀x∃y(x ∈ S2 → (y ∈ S1 ∧ (y = x))). Therefore, the negation of the logical statement is: ∃x∀y(x ∈ S2 ∧ (y ∉ S1 ∨ (y ≠ x))). Again, if x = k_j and y = Item_{t_j}, then ∀j((j ≥ 0 ∧ j ≤ n ∧ (x = k_j)) → ((y = Item_{t_j}) ∧ (y = x))). With distinct keys and list elements, the proposed algorithm essentially performs a one-to-one mapping from the set S2 to the set S1, provided that all key elements in S2 exist in S1. The mapping is not onto, as |S1| = n >> |S2| = m. Since both the keys and the list elements are distinct (by assumption), no two keys map onto the same search pool element. For the same reason, no key maps onto two or more different search pool elements. As the algorithm simply performs a one-to-one mapping that is not onto, the problem of collision due to hashing does not arise with the proposed tiered binary search model. The following is the complete predicate logic model that takes these constraints into account: ∀x∀y∀z∀k((((z = f(x)) ∧ (y = f(x))) → (z = y)) ∧ (((y = f(x)) ∧ (y = f(k))) → (x = k))). Since the multiple-element binary search mapping is not bijective, the inverse mapping function does not exist. Furthermore, assuming all keys from S2 exist in L1, which is represented by the set S1, S2 becomes a proper subset of S1; in set-theoretic notation, S2 ⊂ S1. In that event, S2 is a member of the power set Ψ(S1); stated formally, S2 ∈ Ψ(S1). If some keys from S2 are not present in L1, then S2 ⊄ S1. In that event, there is a third set S3, a subset of both S2 and S1, that lies at the set intersection of S2 and S1. Therefore, S3 = S2 ∩ S1, and (S3 ⊂ S2) ∧ (S3 ⊂ S1). The relation R between the sets S2 and S1 is an equality relation, which is a subset of the Cartesian product S2 × S1. Hence, S2 R S1 is such that {(x, y) ∈ R | x ∈ S2 ∧ y ∈ S1 ∧ (y = x)}. Also, R ⊆ S2 × S1. Here, R is a binary relation between the sets. This relation R between S2 and S1 is reflexive, symmetric, and transitive; therefore, R is an equivalence relation. Initially, the proposed algorithm searches through the entire search pool L1 containing n different elements to identify the middle key index position, k_{⌊m/2⌋}. As |S1| = n, there are n possible one-to-one mappings from S2 to S1. Once k_{⌊m/2⌋} is identified at location t_{⌊m/2⌋}, the smaller keys k_1 through k_{⌊m/2⌋−1} are searched for only within the subset of S1 containing the first (t_{⌊m/2⌋} − 1) search pool elements. The larger keys k_{⌊m/2⌋+1} through k_m are explored only within the remaining (n − t_{⌊m/2⌋}) search pool elements. Hence, due to subset reduction, the key search space is reduced and eventually optimized. Beginning with the n-th root finding algorithm, a variation of pure binary search which gradually reduces the search space to converge to the closest n-th root, the proposed tiered binary search algorithm is then presented as another variation of binary search.
4. Proposed nth Root Finding Algorithm

Following is the n-th root finding algorithm, which is very efficient and works for any non-negative real number.

Algorithm nthRoot
Purpose: This algorithm computes the n-th root of a non-negative real number. The supplied parameters are: a constant PRECISION = 0.0000001, a non-negative real number x from the keyboard, and the value of a positive integer n, where n ≥ 2. So n ∈ Z⁺, the set of positive integers {1, 2, 3, ...}. The algorithm computes the n-th root of the supplied non-negative real number x.
Require: n ∈ Z⁺, n ≥ 2, and x ≥ 0.
Ensure: The correct value of the n-th root of the supplied real number x ≥ 0 is computed.

Input non-negative real number x from keyboard. {May prompt the user to input x with x ≥ 0.}
if x < 1.0 then
  a = 1.0; b = x {invariant: b ≤ root ≤ a}
  while (a − b) > PRECISION do
    midPoint = (a + b)/2.0
    if midPoint × midPoint × midPoint × ... (n times) < x then
      b = midPoint
    else
      a = midPoint
    end if
  end while
  Display nth root = midPoint
else
  a = 1.0; b = x {invariant: a ≤ root ≤ b}
  while (b − a) > PRECISION do
    midPoint = (a + b)/2.0
    if midPoint × midPoint × midPoint × ... (n times) < x then
      a = midPoint
    else
      b = midPoint
    end if
  end while
  Display nth root = midPoint
end if
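A compact Java transcription of the above pseudocode (a sketch: the two symmetric branches are merged, and the repeated multiplication is written with Math.pow):

class NthRoot {
    static final double PRECISION = 0.0000001;

    // Bisection for the n-th root of a non-negative real x, n >= 2.
    static double nthRoot(double x, int n) {
        double a = 1.0, b = x;
        while (Math.abs(b - a) > PRECISION) {
            double mid = (a + b) / 2.0;
            boolean tooSmall = Math.pow(mid, n) < x;
            if (x < 1.0) {                     // invariant: b <= root <= a
                if (tooSmall) b = mid; else a = mid;
            } else {                           // invariant: a <= root <= b
                if (tooSmall) a = mid; else b = mid;
            }
        }
        return (a + b) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(nthRoot(27.0, 3)); // ~3.0
        System.out.println(nthRoot(0.25, 2)); // ~0.5
    }
}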
4.1 nth Root Finding Algorithm Analysis

Here, a modified binary search technique is used to assign an estimate of the n-th root of a non-negative real number x to another variable, root. There are four different cases associated with the computational scenario.
• Case 1: x = 0.0. In this case, the n-th root of x is 0.0.
• Case 2: x > 0.0 and x < 1.0. Following are the steps to compute the n-th root:
  – Step 1: Since 0 < x < 1.0, the n-th root lies between x and 1.0.
  – Step 2: Declare two real variables a and b. Initialize a = 1.0 and b = x. So x < root < 1.0; this is true because, for x < 1.0 and x > 0.0, ⁿ√x > x.
  – Step 3: Fine-tune the variables a and b, making them closer and closer to each other by narrowing the root search space, while ensuring that the root lies between them. The invariant for the search is b ≤ root ≤ a.
• Case 3: x = 1.0. In this case, ⁿ√x = 1.0.
• Case 4: x > 1.0. The n-th root of x must lie between 1.0 and x itself. Following are the steps involved in the computation:
  – Step 1: Declare two variables a and b.
  – Step 2: Assign a = 1.0 and b = x.
  – Step 3: The n-th root must lie between a and b. Fine-tune a and b so that they get closer and closer to each other and converge to the actual n-th root. The invariant for the computation is a ≤ root ≤ b.
5. Proposed Tiered Binary Search Algorithm

The following is the proposed tiered binary search algorithm with m keys, where m ≥ 2. This version uses only one partitioning key, which is the middle key in the keys[] array.

Algorithm TieredBinarySearchP1mkey
Purpose: This algorithm performs m-key tiered binary search with only one middle key. The supplied parameters are: the search pool arr[] with n elements, in which the keys are to be identified, and an array keys[] of m different keys. Assumption: n >> m. The algorithm returns the array keyLocations[] (in Java) with the computed index positions of the m different keys. If a key does not exist, the corresponding position in the array keyLocations[] contains a −1.
Require: Both arrays arr[] and keys[] must be sorted. For this algorithm, both arr[] and keys[] are sorted in ascending order.
Ensure: Proper key index positions are identified.
int[] keyLocations = new int[keys.length] {Number of elements in keyLocations[] = m = the number of keys in keys[].}
startIndex = 0
endIndex = (arr.length − 1)
keyFirst = 0; keyLast = (keys.length − 1)
middleKey = (keyFirst + keyLast) / 2
keyLocations[middleKey] = RecursiveBinarySearch(arr, startIndex, endIndex, keys[middleKey]) {Determine the middleKey key location in the search pool, arr[].}
for i = 0 to (middleKey − 1) do
  keyLocations[i] = RecursiveBinarySearch(arr, startIndex, (keyLocations[middleKey] − 1), keys[i]) {Determine the key index locations in the search pool arr[] for all keys smaller than keys[middleKey].}
  startIndex = (keyLocations[i] + 1) {Shift the start position to the index next to the current key location.}
end for
{Next, the algorithm performs a balanced binary search for the keys that are larger than keys[middleKey]. Beginning with the last key index location, keyLast, the value of i is decreased by 1 at each iteration.}
for i = keyLast downto (middleKey + 1) do
  keyLocations[i] = RecursiveBinarySearch(arr, (keyLocations[middleKey] + 1), endIndex, keys[i])
  endIndex = (keyLocations[i] − 1) {Shift the end position to the index immediately before the current key location.}
end for
return keyLocations {Return the Java array keyLocations[] containing the m key index positions to the calling program.}
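A runnable Java rendering of this one-partitioning-key algorithm might look as follows (a sketch that assumes every key occurs in arr[], since the index shifting relies on the positions found; the recursive helper is replaced by an iterative one):

import java.util.*;

class TieredBinarySearch {
    static int binarySearch(int[] arr, int lo, int hi, int key) {
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (arr[mid] == key) return mid;
            if (arr[mid] < key) lo = mid + 1; else hi = mid - 1;
        }
        return -1; // key absent
    }

    // One-partitioning-key tiered search; arr and keys sorted ascending.
    static int[] search(int[] arr, int[] keys) {
        int m = keys.length;
        int[] loc = new int[m];
        int middleKey = (m - 1) / 2;
        loc[middleKey] = binarySearch(arr, 0, arr.length - 1, keys[middleKey]);
        int start = 0;
        for (int i = 0; i < middleKey; i++) {       // keys below the partition
            loc[i] = binarySearch(arr, start, loc[middleKey] - 1, keys[i]);
            start = loc[i] + 1;                     // shrink from the left
        }
        int end = arr.length - 1;
        for (int i = m - 1; i > middleKey; i--) {   // keys above the partition
            loc[i] = binarySearch(arr, loc[middleKey] + 1, end, keys[i]);
            end = loc[i] - 1;                       // shrink from the right
        }
        return loc;
    }

    public static void main(String[] args) {
        int[] arr = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19};
        int[] keys = {3, 7, 13, 17};
        System.out.println(Arrays.toString(search(arr, keys))); // [1, 3, 6, 8]
    }
}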
A more generalized version, discussed in the following, uses k partitioning keys, where k ≥ 2.

Algorithm TieredBinarySearchPkmkey
Purpose: This algorithm performs m-key tiered binary search with k partitioning keys. The supplied parameters are: the search pool array arr[] with n elements, in which the keys are to be identified, and an array keys[] of m different keys. Assumption: n >> m. Also supplied is an integer k ≥ 2, which determines the number of partitioning keys. The algorithm returns the array keyLocations[] (in Java) with the identified index positions of the m different keys. If a key does not exist, the corresponding position in the array keyLocations[] contains a −1.
Require: Both arrays arr[] and keys[] must be sorted. For this algorithm, both arr[] and keys[] are sorted in ascending order.
Ensure: Proper key index positions are identified.
int[] keyLocations = new int[keys.length] {Number of elements in keyLocations[] = m = the number of keys in keys[].}
int[] keyPartitions = new int[k] {The array holding the partitioning keys.}
int[] partitionIndices = new int[k] {partitionIndices[] holds the indices of the partitioning keys.}
startIndex = 0
endIndex = (arr.length − 1)
keyFirst = 0; keyLast = (keys.length − 1)
for i = 0 to (k − 1) do
  keyPartitions[i] = keys[int((keyLast / (k + 1)) × (i + 1))] {Determine each partitioning key in the keys[] array and load it into the corresponding index location in the keyPartitions[] array.}
  partitionIndices[i] = int((keyLast / (k + 1)) × (i + 1)) {Hold the indices of the partitioning keys in the partitionIndices[] array.}
  keyLocations[partitionIndices[i]] = RecursiveBinarySearch(arr, startIndex, endIndex, keyPartitions[i]) {Determine the location of keyPartitions[i] in the larger array, arr[].}
  startIndex = keyLocations[partitionIndices[i]] + 1 {Shift the start position in the search pool past the location just found, for the next partitioning key.}
end for
startIndex = 0 {The beginning index for key search in the search pool array, arr[].}
for i = 0 to (k − 1) do
  if i == 0 then
    indexLeft = 0 {indexLeft is the left index of the current sub-interval inside the key array, keys[].}
  else
    indexLeft = partitionIndices[i − 1] + 1
  end if
  if i < k − 1 then
    indexRight = partitionIndices[i] − 1 {indexRight is the right index of the current sub-interval in the key array, keys[].}
  else
    indexRight = (keys.length − 1) {For the last interval, the end index is the length of keys[] − 1.}
  end if
  indexMiddle = int((indexLeft + indexRight)/2) {indexMiddle is the middle index of the current sub-interval in the key array, keys[].}
  keyLocations[indexMiddle] = RecursiveBinarySearch(arr, startIndex, keyLocations[partitionIndices[i]] − 1, keys[indexMiddle]) {Determine the key index location for the key at indexMiddle inside the search pool, arr[].}
  {Next, apply binary search to the keys within the range indexLeft to (indexMiddle − 1).}
  for j = indexLeft to (indexMiddle − 1) do
    keyLocations[j] = RecursiveBinarySearch(arr, startIndex, (keyLocations[indexMiddle] − 1), keys[j]) {Determine the key index locations in the search pool arr[] for all keys smaller than keys[indexMiddle].}
    startIndex = (keyLocations[j] + 1) {Shift the start index one position to the right of the current key location, keyLocations[j]. This strategy significantly reduces the key index search space.}
  end for
  {Next, the algorithm performs a balanced binary search for the keys that are larger than keys[indexMiddle].}
  startIndex = keyLocations[indexMiddle] + 1 {Adjust the beginning index for key search inside the search pool arr[] to one position right of keyLocations[indexMiddle].}
  if i == (k − 1) then
    endIndex = (arr.length − 1) {For the last interval, set endIndex to the length of arr[] − 1.}
  else
    endIndex = keyLocations[partitionIndices[i]] − 1 {Adjust the ending index of the search interval inside the search pool arr[] to one position left of the partitioning key’s location.}
  end if
  for j = indexRight downto (indexMiddle + 1) do
    keyLocations[j] = RecursiveBinarySearch(arr, startIndex, endIndex, keys[j])
    endIndex = (keyLocations[j] − 1) {Shift the end position in the search pool arr[] to the index immediately before the current key location.}
  end for
  startIndex = keyLocations[partitionIndices[i]] + 1 {Adjust the starting index for key search inside the search pool arr[] to one position right of the end of the current search interval.}
end for
return keyLocations {Return the Java array keyLocations[] containing the m key index positions to the calling program.}
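The placement of the k partitioning keys reduces to the index formula keys[(keyLast / (k + 1)) × (i + 1)]; the small illustration below (our own phrasing, reading the int() cast as integer division applied first) shows how the key array is split into k + 1 sub-ranges.

class PartitionIndices {
    // Indices of the k partitioning keys inside keys[0..keyLast].
    static int[] partitionIndices(int keyLast, int k) {
        int[] p = new int[k];
        for (int i = 0; i < k; i++)
            p[i] = (keyLast / (k + 1)) * (i + 1); // integer division, as in the pseudocode
        return p;
    }

    public static void main(String[] args) {
        // m = 1000 keys (keyLast = 999) and k = 3 partitioning keys:
        // indices 249, 498, 747 split the key array into four sub-ranges.
        for (int i : partitionIndices(999, 3)) System.out.print(i + " ");
    }
}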
5.1 Algorithm Analysis

The proposed multiple-element tiered binary search algorithm performs a balanced, nested, range-based binary search to optimize the key search space and hence improve search efficiency. This is essentially a list-to-list mapping, where the smaller list has m elements and the larger list, the search pool, has n elements; here m, n ∈ Z⁺. Nested tiered binary search provides better time and space efficiency. The proposed algorithm also balances the computation by performing a balanced binary search. In the simpler one-partitioning-key version, the algorithm finds the middle element in the key array, middle_key_index = ⌊m/2⌋. Next, the algorithm performs binary search to find the exact key index location for k_{middle_key_index} inside the search pool, arr[n]. Suppose k_{middle_key_index} is identified at index position p inside arr[n]. If k_{middle_key_index} does not exist inside arr[n], the proposed algorithm, following the binary search strategy, records the last index before left_index > right_index occurs. Next, the algorithm confines the search for the key elements at indices 0 through (middle_key_index − 1) to the sub-array arr[0] through arr[p − 1], and confines the search for the key elements at indices (middle_key_index + 1) through (m − 1) to the sub-array arr[p + 1] through arr[n − 1] (as indexing begins at 0).

Example 4 (Search Space Optimization): Consider a list L1 containing 10⁶ elements and a list L2 of keys with 10³ elements. The middle key of L2 lies at index 499 (assuming indexing begins at 0 and goes up to 999). The middle key is searched in the space of 10⁶ elements, so the number of binary search steps is c × log₂(10⁶) = c × log₁₀(10⁶) × log₂(10). Here log₂(10) is a constant; denote log₂(10) by ĉ and c × ĉ by C. Then the number of steps to find the middle partitioning key is 6C. Without partitioning, applying basic binary search for all 10³ keys in L2, the cost would be 1000 × 6C = 6000C, where C is the constant for binary search. With only one partitioning middle key, assume for simplicity that the keys are uniformly distributed inside the search pool L1. Then the middle key is identified halfway through L1, at index 499999. So the search for the keys at indices 0 through 498 is confined to the elements at indices 0 through 499998. The first element is searched through a space of 499,999 elements ≈ 500,000. The second element is searched over a space of ≈ 499,000 elements, and so on. As the list is uniformly distributed, the last element in the first half is searched through a space of approximately 2000 elements. So the total search space size is ≈ 500000 + 499000 + 498000 + ... + 2000 ≈ 1000 × (499 × 500 / 2) ≈ 124,750,000 elements. The binary search cost is then c × log₂(124750000) = C × log₁₀(124750000) ≈ 8.1C. Since this is a balanced binary search, the algorithm performs another 8.1C binary search steps for the keys at indices 500 through 999, so the total cost is ≈ (8.1C + 8.1C) = 16.2C. Under this assumption, the cost is reduced by a factor of 6000C / 16.2C ≈ 370.4; that is, the search space improvement factor is SSIF = 370.4. Hence the effective search space may be reduced significantly further through the enhanced k-partitioning-key version of the proposed multiple-key tiered binary search algorithm.
6. Search Variations

As articulated before, the n-th root finding algorithm is motivational and foundational to the proposed tiered binary search in gradually narrowing down the search space. With different combinations of orderings of the elements in L1 and L2, four optimum search strategies are possible as variations of the proposed algorithm. These are described below; in all cases, both L1 and L2 are sorted.

Combination 1: Both L1 and L2 are sorted in ascending order. This combination is the most common, and is used throughout this paper. Assume that the middle key, k_middle, is identified at index t_middle with a single partitioning key. The keys smaller than this partitioning key are then searched only within the sub-array arr[0] through arr[t_middle − 1]. Once a key k_i is identified at index position t_i, the leftmost index for this sub-list is shifted right to t_i + 1; the rightmost index remains at its original position. For the balanced binary search, keys larger than k_middle are searched within the sub-array arr[t_middle + 1] through arr[n − 1], beginning with the largest key, k_{m−1}. Once it is identified at index t_{m−1}, the rightmost index for this sub-list is shifted left to t_{m−1} − 1. Hence, both the left and right index positions are gradually shifted towards the partitioning index, narrowing the converging search space, and the overall search space is optimized.

Combination 2: L1 is in ascending order and L2 is in descending order, so the keys in L2 are organized from higher to lower. Assume that the middle key, k_middle, is identified at index t_middle with a single partitioning key. The keys smaller than the partitioning key, located in the right half of the key array, are searched only within the sub-array arr[0] through arr[t_middle − 1], beginning with the smallest key, which is the last key, k_{m−1}. Once a key k_i is identified at index position t_i, the rightmost index for this sub-list is shifted left to t_i − 1; the leftmost index remains at its original position. As the proposed algorithm performs a balanced binary search, the keys larger than k_middle, which are in the left half of the list, are searched only within the sub-array arr[t_middle + 1] through arr[n − 1], beginning with the largest key, k_0. Once it is identified at index location t_0, the leftmost index for this sub-list is shifted right to t_0 + 1. Hence, both the left and the right index positions are gradually narrowed down towards the partitioning index, squeezing the key search space.
Combination 3: Both L1 and L2 are in descending order. This combination works very similarly to Combination 1. Assume that the middle key, k_middle, is identified at index t_middle with a single partitioning key. The keys larger than this partitioning key are searched only within the sub-array arr[0] through arr[t_middle − 1]. Once a key k_i is identified at index position t_i, the leftmost index for this sub-list is shifted right to t_i + 1; the rightmost index remains at its original position. As the proposed algorithm performs a balanced binary search, the keys smaller than k_middle are searched only within the sub-array arr[t_middle + 1] through arr[n − 1], beginning with the smallest key, k_{m−1}. Once it is identified at index location t_{m−1}, the rightmost index for this sub-list is shifted left to t_{m−1} − 1. Hence, both left and right index positions are gradually shifted towards the partitioning index, optimizing the key search space.

Combination 4: L1 is in descending order and L2 is in ascending order, so the keys in L2 are organized from lower to higher. Assume that the middle key, k_middle, is identified at index t_middle with a single partitioning key. The keys smaller than the partitioning key, located in the left half of the key array, are searched only within the sub-array arr[t_middle + 1] through arr[n − 1], beginning with the smallest key, which is the first key, k_0. Once a key k_i is identified at index position t_i, the rightmost index for this sub-list is shifted left to t_i − 1; the leftmost index remains at its original position. As the proposed algorithm performs a balanced binary search, the keys larger than k_middle, which are located in the right half of the key array, are searched only within the sub-array arr[0] through arr[t_middle − 1], beginning with the largest key, k_{m−1}. Once it is identified at index location t_{m−1}, the leftmost index for this sub-list is shifted right to t_{m−1} + 1. Hence, both left and right index positions are gradually shifted, narrowing down the overall key search space. The overall search space is thus optimized by shifting the boundaries towards the middle key index position; this is the beauty of the proposed computational algorithm.
7. Conclusion and Future Research

This paper proposes a new algorithm for a range-based, tiered, multiple-element binary search strategy, and shows the effectiveness of the proposed algorithm in search space reduction. The paper also provides a formal foundation for the proposed algorithm, and examples are incorporated to illustrate the concepts introduced. In the future, to demonstrate the overall computational effectiveness of the proposed algorithm, the following computation time measurements will be carried out:
• Both the Search Pool (the larger array) and the Keys are Sorted in Ascending Order.
• Both the Search Pool and the Keys are Sorted in Descending Order.
• The Search Pool is Sorted in Ascending Order and the Keys are Sorted in Descending Order.
• The Search Pool is Sorted in Descending Order and the Keys are Sorted in Ascending Order.
Computation times for the above four combinations will be compared to each other through graphical as well as tabular means. As far as the computational mechanics are concerned, the following will be implemented for algorithmic computation:
• Beginning with the index position calculated for the middle key in the search pool, the left half of the search pool will be searched in ascending order, and the right half of the search pool will be searched in descending order of keys.
• Beginning with the index position computed for the middle key in the search pool, the left half of the search pool will be searched in descending order, and the right half of the search pool will be searched in ascending order of keys.
• Beginning with the index position calculated for the middle key in the search pool, both halves of the search pool will be searched in descending order of keys.
• Beginning with the index position calculated for the middle key in the search pool, both halves of the search pool will be searched in ascending order of keys.
Binary search is popular due to its logarithmic efficiency and its broad applicability. One of the most important items of future work will be to identify a significant application of the proposed tiered binary search algorithm. The algorithm will be implemented for the different key and list order combinations outlined above; the related algorithmic performances will be evaluated and compared to each other, and other performance issues will also be considered in detail.
References
[1] A. Tarek, "A Logarithmic Algorithm for Multi-Key Binary Search," in Proc. SCI'04, 2004, vol. II, pp. 222-227.
[2] A. Tarek, "A New Approach for Multiple Element Binary Search in Database Applications," NAUN International Journal of Computers, vol. 1, issue 4, pp. 269-279, 2007.
[3] A. Tarek, "Multi-key Binary Search and the Related Performance," in Proc. WSEAS American Conference on Applied Mathematics, Harvard University, Cambridge, 2008, pp. 104-109.
[4] A. Tarek, "A New Algorithm for Multiple Key Interpolation Search in Uniform List of Numbers," Recent Advances in Applied Mathematics and Computational and Information Sciences, vol. II, 2009.
[5] A. Tarek, "A New Algorithm for Multiple Key Linear Interpolation with the Formal Foundation," in Proc. CSC'14, 2014, paper CSC2003.
Algorithms for the Majority Problem
Rajarshi Tarafdar and Yijie Han
School of Computing and Engineering, University of Missouri at Kansas City, Kansas City, MO 64110
[email protected], [email protected]
Abstract - The main idea of this paper is to give solutions to the majority problem in two versions: counting whether the majority element occurs more than half of the total number of elements in the input set, and whether it occurs at least half of the total number of elements. In the model we use, elements cannot be used to index into an array and there is no order on the input elements; thus the outcome of the comparison of two elements can only be equal or not equal, never greater than or smaller than. The focus of the paper is to propose algorithms for these problems and analyze their time complexity. For both versions we show O(n) time algorithms. These results can be compared with the case in which elements can be ordered.
Keywords: Algorithm, Majority, Complexity
1. Introduction
The majority problem [3] is to find the element whose number of occurrences is at least half the number of elements in the input set; the element found is called the majority element. The majority problem is an interesting problem in the field of algorithms [1][2][3][4]. We study two versions of this problem. The first is the more-than-half version, in which the majority element occurs more than half the number of times of the cardinality of the input set. The other is the at-least-half version, in which the majority element occurs at least half the number of times of the cardinality of the input set; in this case there might exist two majority elements.
The first version has been studied in [4], where an O(n) time algorithm is presented. In studying the majority problem we assume a model in which the input elements cannot be ordered and cannot be used to index into an array. The result of the comparison between two elements can only be equal or not equal; that is, the comparison will not report one element as "larger" or "smaller" than the other.
Example 1 [3]: Majority with more than half the number of times of the cardinality of the input set. James stares at the pile of papers in front of him. His class has just finished voting in the election of a class representative. All pupils have written the name of their preferred candidate on a piece of paper, and James has volunteered to count the votes and determine the result of the election. Prior to the election the class agreed that a candidate should become class representative only if more than half the class voted for him or her. If none of the candidates wins the absolute majority of the votes, the election will have to be repeated. James's task is now to find out whether any candidate has received more than half of all the votes.
Example 2: Majority with at least half the number of times of the cardinality of the input set. In the previous case James counts the votes to check whether any candidate has received more than half of the total. But there may be circumstances in which one or two candidates received exactly half of the total number of votes, and a candidate must be declared the winner on receiving half of the total number of votes. What will be the approach that James
will follow to elect the class representative who received only half the total number of the votes?
2. Approaches
For the more-than-half version, the input elements are paired up: when the two elements of a pair are equal, one of them is kept, and when they are not equal both are discarded. This reduces the number of elements to at most half while preserving the majority element, and the procedure recurses on the surviving elements; a final counting pass verifies the candidate. For the at-least-half version, each candidate is verified against the threshold (n+n%2)/2, and up to two majority elements may be returned.

3. Algorithms

Procedure Find Majority 1 (A[0..n-1]) /* First version, more than half case. */
Input: Array A[0..n-1] of n objects.
Output: Majority element a if it exists. Otherwise report no majority element.
{
a = Find(A[0..n-1]);
if (a == NO-MAJORITY-ELEMENT) then return NO-MAJORITY-ELEMENT;
count = 0;
for (i = 0; i < n; i++) { if (A[i] == a) count++; }
if (count > n/2) return a; else return NO-MAJORITY-ELEMENT;
}

Sub Find(A[0..n-1])
{
if (n == 1) return A[0];
if (n == 2)
{
if (A[0] == A[1]) then return A[0];
else return NO-MAJORITY-ELEMENT;
}
if (n == 3)
{
if (A[0] == A[1] || A[0] == A[2]) return A[0];
else if (A[1] == A[2]) return A[1];
else return NO-MAJORITY-ELEMENT;
}
if (n%2 == 0) then n1 = n; else n1 = n - 1;
j = 0;
for (i = 0; i < n1; i = i + 2)
{
if (A[i] == A[i+1]) { B[j] = A[i]; j++; }
}
return Find(B[0..j-1]);
}

Procedure Find Majority 2 (A[0..n-1]) /* Second version, at least half case. */
Input: Array A[0..n-1] of n objects.
Output: Majority element pair (a, b). If a and/or b is nil then it is not a majority element.
{
/* a, b, c and d are the candidate elements and counta, countb, countc, countd are their numbers of occurrences in A; e and f hold the majority elements to be returned, and count records how many have been found so far. */
count = 0; e = nil; f = nil;
if (counta >= (n+n%2)/2)
{
if (count == 0) { e = a; count++; }
else f = a;
}
if (countb >= (n+n%2)/2)
{
if (count == 0) { e = b; count++; }
else f = b;
}
if (countc >= (n+n%2)/2)
{
if (count == 0) { e = c; count++; }
else f = c;
}
if (countd >= (n+n%2)/2)
{
if (count == 0) { e = d; count++; }
else f = d;
}
return (e, f);
}

4. Further Improvement

Algorithm Find Majority 3(A)
Input: set A of n elements.
Output: Majority element (a, b). If a and/or b is nil then it is not a majority element.

Main
{
(a, b) = sub Find(A);
counta = 0; countb = 0;
for (i = 0; i < n; i++) { if (A[i] == a) counta++; if (A[i] == b) countb++; }
if (counta >= n/2 && countb >= n/2) return (a, b);
if (counta >= n/2) return (a, nil);
if (countb >= n/2) return (b, nil);
return (nil, nil);
}

sub Find
{
if (n%2 == 1) call Majority 1(A); /* for odd n the at-least-half and more-than-half thresholds coincide */
else if (n%4 == 2) Group every 4 elements in a group, with the last 2 elements forming another group;
else /* n%4 == 0 */ Group every 4 elements in a group;
for every group of 4 elements a, b, c, d do
{
if (a == b && a == c && a == d)
{
Discard any two elements and put the other two elements into set S;
}
else if (three elements (say a, b, c) are equal and d is different)
{
Discard b, c, d and put a into S;
}
else if (there are two pairs and the elements within each pair are equal (say a == b && c == d))
{
Discard b and d and put a and c in S;
}
else if (there is a pair of equal elements and the other two elements are not equal (say a == b && a != c && a != d && c != d))
{
Discard b, c, d and put a in S;
}
else /* no two elements are equal among the 4 elements */
{
Discard all 4 elements;
}
}
for the group of two elements a and b do
{
if (a == b) { Discard a and put b in S; }
else { Discard both a and b; }
}
Call Find Majority 3(S);
}

Theorem 1: Majority 3 finds the majority element(s) in O(n) time.
Proof: The correctness of Algorithm Majority 3 follows from the facts shown below. At each recursion level the total number of elements is reduced to at most half, so it suffices to check that each grouping rule preserves the majority element(s).
Consider a group of 4 equal elements {a, a, a, a}. The algorithm keeps two of the elements (here a) and discards the remaining two. Because each recursion level reduces the total number of elements to at most half, if a is a majority element it will remain a majority element.
If the group contains 3 equal elements, as in {a, a, a, b}, we keep one of the equal elements and discard the remaining elements. Because a != b, throwing away the pair (a, b) will not affect the selection of the majority element; of the two a's left we keep one, and because the number of elements is reduced to at most half, if a is a majority element it will remain a majority element.
If the group consists of two pairs of equal elements, as in {a, a, b, b}, we discard one a and one b and keep the remaining two. Because the recursion reduces the set to at most half the number of elements, if a and/or b are majority elements they are kept.
If the group contains exactly two equal elements, as in {a, a, b, c}, we discard one a together with b and c, and keep one of the equal elements (here a). Removing b and c will not affect the counting of the majority because b != c, and we keep one a because the number of elements is reduced to at most half; therefore, if a is a majority element it will be kept.
If the group consists of 4 different elements, as in {a, b, c, d}, we discard all of them: we pair a and b with a != b and throw them away, and we pair c and d with c != d and throw them away. These actions will not affect the counting of the majority.
Time Complexity: The recurrence for this algorithm is T(n) = T(n/2) + O(n); therefore, the time complexity is O(n).
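To make the procedures concrete, the following is a minimal Java sketch of the pairing step behind Majority 1 and the grouping step behind Majority 3, under the stated model in which only equality tests are available. The class and method names are invented here, and the handling of the unpaired element for odd n in find is this sketch's own choice rather than the paper's:

import java.util.ArrayList;
import java.util.List;

public class MajoritySketch {

    // More-than-half majority of a, or null if none exists.
    static <T> T findMajority1(List<T> a) {
        T cand = find(a);
        if (cand == null) return null;
        return occurrences(a, cand) > a.size() / 2 ? cand : null; // verification pass
    }

    // Number of occurrences of x in a, using equality tests only.
    static <T> int occurrences(List<T> a, T x) {
        int c = 0;
        for (T y : a) if (y.equals(x)) c++;
        return c;
    }

    // Pairing step: keep one element of each equal pair and recurse on the
    // survivors; T(n) = T(n/2) + O(n) = O(n).
    static <T> T find(List<T> a) {
        int n = a.size();
        if (n == 0) return null;
        if (n == 1) return a.get(0);
        if (n % 2 == 1 && occurrences(a, a.get(n - 1)) > n / 2) return a.get(n - 1);
        List<T> b = new ArrayList<>();
        for (int i = 0; i + 1 < n; i += 2)
            if (a.get(i).equals(a.get(i + 1))) b.add(a.get(i));
        return find(b);
    }

    // One grouping step of Majority 3: reduce groups of four (plus a trailing
    // pair when n % 4 == 2) to the surviving set S of at most half the size.
    static <T> List<T> reduce(List<T> a) {
        int n = a.size();
        List<T> s = new ArrayList<>();
        int limit = n - (n % 4 == 2 ? 2 : 0);
        for (int i = 0; i + 3 < limit; i += 4) {
            // Tally multiplicities within the group using equality tests only.
            List<T> reps = new ArrayList<>();
            List<Integer> cnt = new ArrayList<>();
            for (int k = i; k < i + 4; k++) {
                T e = a.get(k);
                int idx = -1;
                for (int r = 0; r < reps.size(); r++) if (reps.get(r).equals(e)) idx = r;
                if (idx < 0) { reps.add(e); cnt.add(1); } else cnt.set(idx, cnt.get(idx) + 1);
            }
            for (int r = 0; r < reps.size(); r++) {
                if (cnt.get(r) == 4) { s.add(reps.get(r)); s.add(reps.get(r)); } // keep two
                else if (cnt.get(r) >= 2) s.add(reps.get(r)); // keep one per pair or triple
                // singletons are discarded
            }
        }
        if (n % 4 == 2 && a.get(n - 2).equals(a.get(n - 1))) s.add(a.get(n - 1)); // trailing pair
        return s;
    }
}

Each call to reduce implements exactly the five group cases analyzed in the proof of Theorem 1.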
5. Conclusions
In this paper we have studied the majority problem. For both the more-than-half case and the at-least-half case we provided optimal algorithms. We note that the model used for the study of these algorithms is interesting, and we plan to do further research on the majority problem in the near future.
References
[1] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, Third Edition, MIT Press, 2009.
[2] S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani, Algorithms, McGraw-Hill Education, 2006.
[3] B. Vöcking, H. Alt, M. Dietzfelbinger, R. Reischuk, C. Scheideler, H. Vollmer, and D. Wagner, Algorithms Unplugged, Springer, 2011.
[4] R. S. Boyer and J S. Moore, "MJRTY - A Fast Majority Vote Algorithm," in R. S. Boyer (ed.), Automated Reasoning: Essays in Honor of Woody Bledsoe, Automated Reasoning Series, Dordrecht, The Netherlands: Kluwer Academic Publishers, 1991.
Investigating the Benefits of Parallel Processing for Binary Search
Paul Mullins, Ph.D. (Advisor), Gennifer Elise Farrell, and C. Ronald Baldwin
Department of Computer Science, Slippery Rock University, Slippery Rock, Pennsylvania, USA
Abstract - This research explored possible benefits of parallel programming for the binary search, specifically how dividing a binary search among threads affects the efficiency of the algorithm. A new search, named the Congruent Binary Search (CBS), was written with the goal of avoiding the pitfalls of the standard parallel binary search. It was hypothesized that the CBS would run in O((log2 N)/p), where N is the number of elements in the search and p is the number of processors. Virtually no speed increase of the CBS was found when compared to the standard serial and parallelized binary searches, and the research failed to find strong evidence that the CBS offers a performance boost over other binary searches. The binary search is not a good candidate for parallel programming.
Keywords: Parallel programming, binary search, undergraduate research project

1 Introduction
This undergraduate research project asks whether the binary search can benefit from parallel programming, and in particular whether any speedup factor is equivalent to the number of threads used. Current implementations of the parallel binary search tend to split an array into small arrays many times over; an example is included in "Efficient Parallel Binary Search on Sorted Arrays" by Danny Z. Chen [1], which discusses an algorithm that uses two arrays A and B with n numbers and m numbers. Though this is a valid approach, there are some key points where an alternative may be more desirable, most notably the increase in space complexity caused by splitting data into multiple arrays. One desirable alternative searches in place and does not require the overhead of splitting data into multiple arrays. The Congruent Binary Search was written as such an alternative: it needs only a single array for storage, its implementation is much simpler by nature, and in theory it should have a significant speed boost. Furthermore, this experiment is important because there is currently little related research established. According to many on Stack Overflow, the community is currently divided on how a parallel binary search should be done, or whether it should be done at all [2]. This inconsistency in the programming community gives ample reason for testing.
For the experiment, the Congruent Binary Search is created and tested against its serial version and the seemingly most common algorithm for parallel binary search. The execution times are compared and the speedup factor is calculated, using the serial binary search's execution time on the same data sets. It is hypothesized that the Congruent Binary Search will improve execution time; specifically, that the execution time of the Congruent Binary Search will be O((log2 N)/p), where p is the number of processors. The array is split by even and odd indexes. Splitting the array in half instead would be counterproductive, because on the first iteration of a serial binary search one of those halves would be thrown out.
2 Report
First, two classes are created for testing. The first class has an array of size one hundred million; each index in the array contains an integer with an assigned value equal to its address, an easy way to obtain a large sorted array. Objects are then created which are initialized with the value being searched for in this array, along with methods that perform the serial binary search and the Congruent Binary Search. The Congruent Binary Search utilizes one array and two threads: one thread searches only even indexes for the value and the other searches only odd indexes. The second class contains two arrays, each of size fifty million, initialized the same way as the first class; in the standard parallel binary search one thread searches the first array and the other searches the second. After class creation, a test application was created that runs each search with the same number of elements, with all algorithms searching for the same value. This is done sequentially in a for loop so each search can be executed and timed many times over. Each run was saved to a file specified before execution begins, and each individual execution of each search was recorded with its type, time in nanoseconds, and number of threads. After all iterations were complete, the average was taken for each type of search. To find the speedup factor, the average execution time of each search was used: the serial algorithm's time was divided by the Congruent Binary Search's time, and the alternative parallel algorithm's average execution time was likewise divided by that of the Congruent Binary Search. Because the Congruent Binary Search is an enhancement of
the two established algorithms, Amdahl’s law is used to calculate speedup of the Congruent Binary Search compared to both established algorithms. Thus, by formula (1), speedup is found.
Speedup = (average execution time of the established algorithm) / (average execution time of the Congruent Binary Search)    (1)

In case the thread scheduling by the operating system was affecting the results, each test was also run individually; the results were approximately the same. Additionally, all three algorithms were run at once with one iteration and the results were inconsistent: the very first execution is slower than expected, which can be explained by the Java Virtual Machine warm-up. The first iteration is not useful, but it is negligible. The many trials executed assure a more precise answer, by the law of large numbers; that is why execution times are recorded for a million trials.

3 Data

Tables (i) and (ii) record execution times for the two parallel algorithms that are compared. Table (iii) compares the Congruent Binary Search to the serial binary search. The serial binary search is approximately five times faster, although this is not depicted in the table data.

(i) Average Found Through Looping Trials

Trials     Search           Average Execution Time    Search Val
Million    Parallel (new)   62455 ns                  1000
Million    Parallel (old)   62491 ns                  1000

Table i: Difference: 36 ns in favor of the even/odd array split. Speedup = 62491/62455 = 1.0006 (virtually no speed increase).

(ii) Average Found Through Single Iterations (Two Threads)

Trial   Congruent Binary Search   Standard Parallel Binary Search   Search Val
1       165702 ns                 143980 ns                         1000
2       140568 ns                 143050 ns                         1000
3       148635 ns                 163530 ns                         1000
4       153290 ns                 149256 ns                         1000
5       168495 ns                 159496 ns                         1000
6       154220 ns                 160737 ns                         1000
7       143981 ns                 160116 ns                         1000
8       152980 ns                 153290 ns                         1000
9       157014 ns                 148324 ns                         1000
10      142739 ns                 150807 ns                         1000

Table ii: Difference: 496.2 ns in favor of the even/odd split. Speedup = 153258.6 ns / 152762.4 ns = 1.0032 (virtually no speed increase).

To calculate time complexity, the relationship between each thread's iterations is used to hypothesize the Congruent Binary Search's true execution time. A value at one end of the array is chosen, aiming to set up a worst-case scenario; thus, the value 0 stored at index 0 is searched for. After analyzing all the data, it was concluded that the binary search is not a good candidate for a parallel implementation.

(iii) Iterations Per Thread

N      Thread 1   Thread 2   Serial
2      0          1          1
10     1          2          2
13     2          2          2
15     2          3          3
29     3          3          3
30     3          4          4
61     4          4          4
75     4          5          5
150    5          6          6

Table iii: This data table lists the number of iterations for each thread, where N is the length of the array being searched.

4 Congruent Binary Search for Evens and Odds

One thread handles even indexes and one thread handles odd indexes. For the even thread, 'left' is set to the first index, and for the odd thread to (first + 1); 'right' is set to the last even index for the even thread and the last odd index for the odd thread. Then the midpoint is calculated. In the even thread the midpoint is checked to be even, and in the odd thread it is checked to be odd; this is done via the modulus operator. If needed, the value of the midpoint is changed so that it remains within its proper range. For example, it is possible to find an odd midpoint between two even indexes; this is not allowed, because the midpoint must always be located within the even thread. If the midpoint is not even, it is shifted one space to the left.
Either 'left' or 'right' is then assigned a new index, depending on how the value at the midpoint index compares to the value being searched for. With the new 'left' and 'right' points, the midpoint calculation is repeated, cutting the search space in half each iteration. The iteration stops either when 'left' is greater than 'right' (the value was not found within the thread) or when the value at the midpoint equals the value being searched for (the value was found).
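As a minimal Java sketch of the even-index thread just described (the identifier names are illustrative and not taken from the paper's figures; the odd-index thread is symmetric, starting from index 1):

public class CongruentSketch {

    // Binary search over the even indexes of a sorted array; returns the
    // index of target, or -1 if it is not found within this thread's indexes.
    static int searchEvenIndexes(int[] arr, int target) {
        int n = arr.length;
        if (n == 0) return -1;
        int left = 0;                               // first even index
        int right = (n % 2 == 1) ? n - 1 : n - 2;   // last even index
        while (left <= right) {
            int mid = (left + right) / 2;
            if (mid % 2 != 0) mid--;                // shift one space left to stay even
            if (arr[mid] == target) return mid;     // value found
            if (arr[mid] < target) left = mid + 2;  // bounds move in increments of two
            else right = mid - 2;
        }
        return -1;                                  // left > right: not in this thread
    }
}

In the two-thread search, each thread runs one such routine over its own congruence class, and the thread that finds the value reports its index.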
The above process is shown in the code in Figures 1 and 2.

Figure 1: The code for the section of the Congruent Binary Search that handles the thread of even indexes.

Figure 2: The code for the section of the Congruent Binary Search that handles the thread of odd indexes.

4.1 Generalizing the Congruent Binary Search
Currently, established research has only been performed on an array divided into two threads. The Congruent Binary Search can easily split an array into any even number of threads. It can be generalized to an odd number of divisions, but such divisions are not desirable for the experiment at hand, because each thread would then contain both even and odd indexes. This causes two problems. First, the Congruent Binary Search works on threads of just even or just odd indexes; accommodating a thread with both even and odd indexes would require a different algorithm. Second, even if the Congruent Binary Search could work within one of these threads, new code would have to be written to accommodate the fact that the midpoint can now be a decimal (this would be done by adding a call to the floor method). For splitting the indexes into p threads, where p is an even natural number, let each thread be a subset of one of the congruency classes 0, 1, ..., p-1 (mod p). Let each thread be the only subset of an individual congruency class; do not allow more than one thread to be a subset of the same congruency class.
For each thread, set Left to the first element of the congruency class of which it is a subset; the indexes 0, 1, ..., p-1 will thus be Left for the respective threads. Set Right to the last (i.e., largest) index in each thread. Within each thread the algorithm will iterate through either even or odd indexes; because the array is divided among an even number of threads, there will never be a thread that contains both even and odd indexes. It is possible that the midpoint between the Left and Right indexes does not belong to the thread's congruency class: for example, the midpoint between two even indexes may be odd, and the midpoint between two odd indexes may be even. The general form for finding the midpoint, and determining whether its value must be changed so as to remain within the thread in question, is as follows. Let 'a' be the first address in the respective thread and let 'remainder' = 'midpoint' mod p. If ('midpoint' mod p != a), then midpoint = midpoint - (remainder - a); else the midpoint remains the same. When midpoint mod p is equal to a, the midpoint is located within the thread in question and its value does not need to be adjusted. Following the form of the Congruent Binary Search outlined above, either 'left' or 'right' is assigned a new index, depending on how the value at the midpoint index compares to the value being searched for. An array divided into p threads will need Left and Right values that change in increments of p: in the two-threaded even/odd search, Left and Right change in increments of two, while for an array divided into four threads they change in increments of four. The search terminates under the same conditions as the two-thread form: the iteration stops either when 'left' is greater than 'right' or when the value at the midpoint equals the value being searched for.
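In code form, the general midpoint adjustment described above might look as follows (a sketch; the names are illustrative):

public class MidpointSketch {

    // Pull a raw midpoint back into the congruency class a (mod p),
    // following the general form given in the text.
    static int adjustMidpoint(int midpoint, int p, int a) {
        int remainder = midpoint % p;   // class the raw midpoint fell into
        if (remainder != a) {
            midpoint = midpoint - (remainder - a);  // shift into class a
        }
        return midpoint;
    }
}

For example, with p = 4 and a = 1, a raw midpoint of 10 (remainder 2) is adjusted to 10 - (2 - 1) = 9, which lies in the thread's congruency class.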
5 Conclusions
The hypothesis that the Congruent Binary Search algorithm could offer a performance boost to the binary search was not supported. In fact, any speedup was negligible, because the same execution time was also achieved with the prior method. Some possibilities for error can be accounted for by the fact that Java is not particularly good for parallel programming: all Java threads run within the Java Virtual Machine, and it is not possible to see what the operating system is doing in terms of handling the threads. The Java threads may have had such low priority that the operating
system interrupted them often for other system tasks that have higher priority in the system queue. To get more accurate results, a different environment may be better: an ideal environment is a parallel programming model with explicit parallelism, where the programmer has more control over memory management as well as task management. Additionally, the research fails to find evidence that the Congruent Binary Search offers a performance boost when compared to the serial binary search and the standard parallel binary search. In fact, the serial binary search outperformed both parallel algorithms. This could be because of underlying low-level processes that are unseen. However, it is important to remember that in some cases serial execution is the most performant, especially for smaller tasks; with such tasks the multithreading overhead resulting from parallelization becomes more noticeable. The theoretical time complexity of the Congruent Binary Search algorithm was hypothesized to be O((log2 N)/p). This contradicts the observed behavior of approximately log2(N/2) + k, where k is a constant for thread overhead. This is because even when an array is split by even and odd indexes, halving the array saves only approximately one iteration compared to an array of double the size. As an example, for an array of 100,000,000 elements the serial binary search needs about log2(100,000,000), roughly 27 iterations, while each thread of the Congruent Binary Search needs about log2(50,000,000), roughly 26. Thus, the two threads are still doing roughly twenty-six iterations apiece despite half the number of elements in each search. This can be seen in table (iii), which lists the iterations for each thread for several different values of N, as well as the number of iterations for the serial version. Therefore, if there were no overhead for using threads, the parallel time complexity would be essentially the same as the serial time complexity, O(log2 N). Unfortunately, the overhead of threads can be particularly large and inefficient in some languages; the actual time complexity is calculated to be log2(N/2) + k, where k is the overhead for a single thread. The idea of splitting accesses by even and odd indexes may have some potential in algorithms that are linear or quadratic in time, because of their steeper curves than those of logarithmic algorithms. The Congruent Binary Search algorithm is parallel but not performant; it is concluded that the binary search is a bad candidate for parallel programming. For future work, there are currently a few avenues of investigation for the potential effectiveness of the Congruent Binary Search algorithm. Research is being conducted on the effects of changes to the Congruent Binary Search algorithm, utilizing its generalized form: splitting the array into a larger number of threads, and adding a 'flag' that allows communication between all threads. Also, current research is exploring adding counter variables to the code to
compare each algorithm's number of instructions. This offers a way of comparing efficiency in terms of instructions executed rather than elapsed time.
6 References
[1] D. Z. Chen, "Efficient Parallel Binary Search on Sorted Arrays," Computer Science Technical Reports, Paper 11, 1990. Available: http://docs.lib.purdue.edu/cstech/11 [Accessed: 20-Feb-2017].
[2] "Parallel Binary Search," 2012. [Online]. Available: http://stackoverflow.com/questions/8423873/parallel-binary-search [Accessed: 20-Feb-2017].
Author Index
Baldwin, C. Ronald - 38
Farrell, Gennifer Elise - 38
Goswami, Naveen Kumar - 3
Han, Yijie - 32
Hong, Joohwan - 19
Jakoby, Andreas - 3
Jung, Seokyong - 19
Ko, Yoomee - 15
Lim, Jeongtaek - 15
List, Eik - 3
Lucks, Stefan - 3
Mullins, Paul - 38
Ryu, Minsoo - 15, 19
Tarafdar, Rajarshi - 32
Tarek, Ahmed - 25
Yoon, Hosang - 15
Yoon, Hyunmin - 15
Yun, Kihyun - 19
Zhang, Yong - 10