Simple Yet Effective Algorithms for Constraint Satisfaction and Related Problems

by Hoong Chuin Lau

Dept. of Computer Science, Tokyo Institute of Technology 2-12-1 Ookayama, Meguro-ku, Tokyo 152, Japan

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Engineering at the Tokyo Institute of Technology, August 28, 1996


Abstract

Constraint-based reasoning, which originated in the computer vision research of the 1970s, is now a central topic of growing importance in many disciplines including artificial intelligence (AI), computer science, robotics, operations research (OR), management technology, logic programming and others. This is witnessed by recent international workshops and symposia where constraint processing is contributing exciting new directions in computational linguistics, concurrent and distributed computing, database systems, graphical interfaces, combinatorial optimization, and geographical information systems.

A central problem in constraint-based reasoning is the constraint satisfaction problem (CSP): we are given a set of variables, a discrete and finite domain for each variable, and a set of constraints. Each constraint is defined over some subset of the variables and limits the combinations of values that these variables can take simultaneously. The goal is to obtain an assignment that satisfies either all constraints or (failing which) a set of constraints such that a certain linear objective function is maximized.

While a vast amount of experimental results on solving CSP has been presented by AI researchers (see, e.g., the Constraint Satisfaction sections in recent AAAI, IJCAI and ECAI conference proceedings), little has been done on the theoretical analysis of the algorithms, and still less on the study of the problem's inherent computational complexity. Hence, we see the need to provide a formal treatment of the CSP, primarily in terms of the design and analysis of efficient algorithms, and the investigation of its inherent computational intractability. In this thesis, we are mainly concerned with the design and analysis of simple yet provably effective algorithms to tackle various CSP and related problems. The main topics of discussion are as follows.

1. We study the computational complexity of, and exact algorithms for, restricted forms of the standard CSP.

2. We consider random CSP instances and study the behavior of several types of local search algorithms. We show conditions under which local search performs well with high probability, both theoretically and experimentally.

3. We next consider CSP optimization problems, namely, finding an assignment which maximizes a certain linear objective function. For this, we first propose and analyse the worst-case behavior of local search algorithms in finding approximate solutions. Next, we propose and analyse a new approach based on the randomized rounding of mathematical programs. Our theoretical results improve several known results in the approximation of NP-hard problems.

4. In the final part of this thesis, we present efficient algorithms to solve a special case of the CSP, the manpower scheduling problem, which has direct real-world applications.

This research is motivated by potential applications in scheduling. With implementation in mind, we mainly employ practical tools and techniques from computer science, operations research and combinatorics. The subject material presented is thus of interest to both theoretical computer scientists and CSP practitioners. It is hoped that this thesis bridges theoretical results and the needs of real-world applications.


Acknowledgments

Many people have helped me bring this thesis to fruition. First among them is Osamu Watanabe. As my advisor, he made himself frequently available to discuss and verify technical details of my research, clarify difficult concepts in complexity theory, and provide challenging new ideas on how to obtain tighter results. I am indebted to Magnus Halldorsson for introducing me to the rich research area of approximation algorithms for NP-hard problems. Through many email discussions, I learnt much about algorithmic techniques and presentation styles from him. I would also like to thank the members of the examination committee, comprising Masakazu Kojima (my co-advisor, who got me interested in semidefinite programming), Shuichi Ueno, Masayuki Numao and Takuya Katayama, for carefully reading and commenting on early drafts of this work.

Thanks to my group at the Knowledge Processing Research Lab of Fujitsu Laboratories (especially Hirotaka Hara, Nobuhiro Yugami and Yuiko Ohta), which provided a good environment for me to perform most of my experiments as well as discuss practical aspects of theoretical results. Thanks to the researchers within and outside Japan who answered my queries and provided pointers for my research, including David Williamson, Madhu Sudan, Luca Trevisan, Richard Wallace, Avrim Blum, Toshihide Ibaraki, Peter van Beek, Peter Jeavons, Carlos Domingo, and Tatsuie Tsukiji. I specially thank Katsuki Fujisawa for letting me use his SDPA solver for solving semidefinite programs. I also thank the National Computer Board of Singapore for offering me a postgraduate scholarship which enabled me to study in Japan and enjoy Japanese culture.

Finally, I must thank my family: my wife, for providing a loving home in Japan; my parents, sister and brother in Singapore, for constantly feeding me with encouraging words. Most of all, I thank the Almighty God, who makes all things possible.

Contents

1 Introduction
  1.1 Problem Definition
  1.2 CSP Related Problems
  1.3 Key Concepts
  1.4 Overview of Results

2 Exact Algorithms
  2.1 Related Work
    2.1.1 0/1/All Networks
    2.1.2 Row-Convex Networks
    2.1.3 q-Tight Networks
    2.1.4 Functional Networks
    2.1.5 Networks with Small Domain Sizes
  2.2 Our Contributions
    2.2.1 NP-Completeness of Grid RC-CSP
    2.2.2 NP-Completeness of Non-Grid RC-CSP

3 Monte Carlo Local Search Algorithms
  3.1 Related Work
    3.1.1 Koutsoupias and Papadimitriou's Algorithm
    3.1.2 Gu's Algorithm
    3.1.3 Selman, Levesque and Mitchell's GSAT Algorithm
  3.2 Our Contributions
    3.2.1 Local Search Algorithm LS1
    3.2.2 Iterative Local Search Algorithm LS2
    3.2.3 Experimental Results

4 Local Search Approximation Algorithms
  4.1 Related Work
    4.1.1 MAX CUT and MAX SAT
    4.1.2 MAX-CSP and W-CSP
    4.1.3 Alimonti's Algorithm
    4.1.4 Khanna et al.'s Algorithm
  4.2 Our Contributions
    4.2.1 Simple Local Search
    4.2.2 Greedy Approach
    4.2.3 Modified Local Search
    4.2.4 Satisfiable Instances

5 Randomized Approximation Algorithms
  5.1 Related Work
    5.1.1 Method of Conditional Probabilities
    5.1.2 Rounding Linear Program
    5.1.3 Rounding Semidefinite Program
  5.2 Our Contributions
    5.2.1 Non-Approximability of W-CSP
    5.2.2 Linear Time s-Approximation Algorithm
    5.2.3 Randomized Rounding of Linear Program
    5.2.4 Randomized Rounding of Semidefinite Program
    5.2.5 Simple Rounding
    5.2.6 Computational Experience

6 Special CSP: Manpower Scheduling Problem
  6.1 Related Work
  6.2 Our Contributions
    6.2.1 NP-Completeness of CSAP
    6.2.2 Monotonic Exact CSAP
    6.2.3 Monotonic Slack CSAP
    6.2.4 Arbitrary Slack CSAP
    6.2.5 Arbitrary Exact CSAP

7 Conclusion

Bibliography

List of Figures

2.2  Example of reduction from 3-SAT to grid RC-CSP.
2.3  Example of reduction from 3-SAT to non-grid RC-CSP.
3.1  Experimental performance of LS1.
4.1  Illustration of non-approximability.
6.2  Example of reduction from 3-SAT to CSAP(D).
6.3  Demand and shift change matrices used in reduction.
6.4  Fixed-length workstretches and its representation.
6.5  Examples of dominance.
6.6  Illustration of algorithm M.
6.7  Construction of canonical schedule.
6.8  Shape of backtrack tree.
6.9  Illustration of algorithm F.
6.10 Tableau-shaped schedules.
6.11 Example of network for tableau-shaped schedule.

List of Tables

1.1  Organization of thesis.
3.1  Experimental performance of algorithm LS2.
5.1  Transformation of constraint relations into functions.
5.2  Computational results: comparing local search with randomized rounding.
6.1  Experimental performance of algorithm BT.

Chapter 1

Introduction

Constraint-based reasoning involves representing relationships between objects by constraints such as equalities and inequalities. A central problem in constraint-based reasoning is the following search problem, called the Constraint Satisfaction Problem (CSP). We are given a set of variables, a discrete and finite domain for each variable, and a set of constraints. Each constraint is defined over some subset of the variables and limits the combinations of values that these variables can take simultaneously. The goal is to obtain an assignment that satisfies either all constraints or (failing which) a set of constraints such that a certain linear objective function is maximized.

CSP is a generalization of several key combinatorial problems including the Satisfiability, Graph Coloring and N-Queens Problems. It also has direct applications in a number of real-world problems: some traditional examples are scene labelling, puzzle-solving and temporal reasoning (see [103] for a survey), and more recent examples are resource scheduling [66] and topological inference [45]. Computationally, CSP is NP-complete, with only very special cases being polynomially solvable, such as when the underlying constraint graph is a tree.

This thesis is devoted to the design and analysis of a suite of simple yet effective algorithms for various CSP and related problems. By simple, we mean that the algorithms are easy to understand and implement, which includes having low time and space complexities; by effective, we mean that the quality of their solutions is provably good in either the average or the worst case. The scope of this thesis is as follows.

Decision and Search Problems

The decision problem of CSP asks whether there exists a solution given the set of constraints. Given that the general decision problem is NP-complete, it is natural to find subclasses of CSP which are polynomial-time solvable (or tractable, in computer science terminology). It is also natural to show the NP-completeness of restricted decision problems. The first part of this thesis pursues these directions further. In particular, we will investigate subclasses of CSP which are polynomial-time solvable, or otherwise show that they are NP-complete.

Random Instances of the Search Problem

A number of approaches have been developed for solving CSP exactly. All of these algorithms involve some kind of backtracking, combined with constraint propagation and local consistency checking, and were common in the 1980s. Unfortunately, they take exponential time in the worst case and are often not amenable to solving large-scale CSP in practice. Recently, there has been much ongoing research into the design of simple algorithms that solve random instances of NP-hard problems with high probability, both theoretically and experimentally. Such algorithms are commonly known as Monte Carlo algorithms. In this thesis, we will investigate and formally prove the effectiveness of simple local-search-based Monte Carlo algorithms for solving random CSP instances.


Optimization Problems

The main part of this thesis will be devoted to the study of CSP optimization problems. Here, the goal is to obtain an assignment which maximizes either the number of satisfied constraints or, where constraints have weights attached to them, the weighted sum of satisfied constraints. These problems arise in scheduling, where our task is to assign resources to jobs under a set of constraints, some of which are more important than others. Most often, instances are over-constrained and no solution exists that satisfies all constraints. Thus, our goal is to find an assignment which maximizes the satisfaction of the constraints.

Over the past few years, substantial new progress has been made in the study of the approximation of NP-hard optimization problems. Papadimitriou and Yannakakis [88] defined the complexity class MAX SNP, which characterizes a problem's approximability simply from its syntactic definition. They proved that for every MAX SNP-complete problem, there exists a constant c > 0 within which the problem can be approximated in polynomial time. On the other hand, Arora et al. [7] proved that for every MAX SNP-complete problem, there exists a constant d > 0 within which the problem cannot be approximated in polynomial time unless P=NP. This means that certain optimization problems are not only hard to solve optimally, but even hard to approximate well in the worst case. Hence, designing polynomial-time approximation algorithms to close the gap between the constants c and d is a key concern. We will propose and analyse the worst-case performance of deterministic algorithms (mainly local search), as well as randomized algorithms based on the rounding of mathematical programs.

With the rapid increase in the speed of computing and the growing need for efficiency in scheduling, it becomes increasingly important to explore ways of obtaining better schedules at some extra computational cost, short of going all the way towards the usually futile attempt of finding a guaranteed optimal schedule. Our work is aimed at achieving this goal. Besides being theoretically interesting, knowledge of the worst-case performance gives us some peace of mind: our algorithm will never perform embarrassingly poorly. This is an important factor to consider when scheduling critical resources.

Special Case: Manpower Scheduling Problem

Finally, we turn our attention to a special case of CSP which has direct application: the manpower scheduling problem (MSP). The scheduling of manpower resources (or rostering) is a critical management function in service organizations that operate round-the-clock. Examples are the scheduling of nurses in hospitals, ground crews at airports, and operators in telephone companies. Essentially, MSP is concerned with the assignment of workers, or teams of workers, to shifts in order to meet time-varying demands while satisfying a set of labour constraints imposed by the management, the labour union and the government. MSP is an active research topic in OR, and a good survey of OR approaches for solving MSP is given by Glover and McMillan [42]. Unfortunately, standard OR techniques such as branch and bound fail rather miserably because they are too slow to cope with frequently changing demands. In this thesis, we will study the computational complexity of several variants of the problem and propose efficient algorithms to solve them.

This thesis is organized as shown in Table 1.1.

Main Topics                     Algorithms                              Chapter
Decision and Search Problems    Exact Algorithms                        2
Random Instances                Monte Carlo Local Search                3
Optimization Problems           Local Search Approximation Algorithms   4
                                Randomized Approximation Algorithms     5
Manpower Scheduling Problems    Exact Algorithms                        6

Table 1.1: Organization of thesis


1.1 Problem Definition

Let V = {1, ..., n} be a set of variables. Each variable i ∈ V has a domain D(i) which contains a finite and discrete set of assignable values. A constraint on a set of variables is a relation over the Cartesian product of their domains; this relation contains the tuples of values that these variables can take simultaneously. We also say that such values are consistent with one another. An assignment is a mapping of each variable to a value in its respective domain, and it satisfies a given constraint iff the assigned values of the respective variables are related by that relation. The Constraint Satisfaction Problem (CSP) is the following NP search problem:

INSTANCE: Set V of variables, set D of domains and collection E of constraints.
SOLUTION: An assignment s.t. all constraints in E are satisfied.

The decision problem asks whether there exists a solution, i.e. an assignment satisfying all constraints. An instance is called satisfiable iff it has a solution.
Example: The Latin Square Puzzle

Latin square puzzles consist of a matrix and a set of objects to be arranged on the matrix. The objects each have a set of attributes, and there are constraints on which objects can be placed next to each other based on their attributes. One simple Latin square puzzle consists of a 4 × 4 matrix and 16 objects, each object having one of four possible colors (Red, Blue, Yellow, Green) and one of four possible shapes (Circle, Triangle, Oval, Square). The problem is to arrange the objects (repetition allowed) in the matrix such that each row, each column and each of the two diagonals contain objects of unique colors and shapes.

The Latin square puzzle reduces to a CSP instance as follows. Let C = {R, B, Y, G} and S = {C, T, O, S}. The elements of the matrix can be thought of as the set V of 16 variables, each variable having a domain K = C × S. Let T ⊆ V^4 consist of the 10 4-tuples which form the rows, columns and diagonals of the matrix. For each t ∈ T, the constraint defined on t is the relation on K^4 of the form

{((c1, s1), (c2, s2), (c3, s3), (c4, s4)) : ci ≠ cj and si ≠ sj for all 1 ≤ i, j ≤ 4, i ≠ j}.

The collection E of constraints is the set of these 10 constraints.
Arity

A constraint is said to be t-ary (or to have arity t) if it is defined on t variables. An instance is t-ary (or has arity t) iff all its constraints have arity t or less. In this thesis, we will deal mainly with binary instances, unless otherwise specified.
Constraint Graph/Network

A CSP instance can be represented by a constraint (hyper)graph or network G = (V, E), where V and E represent the variables and the constraints respectively. Where there is no ambiguity, we shall use the terms 'node' (or 'vertex') and 'edge' interchangeably with the terms 'variable' and 'constraint'. We adopt the conventional notations used in graph theory: let n, m and Δ denote the number of nodes, the number of edges and the maximum degree of a given graph, and let m_i denote the number of edges adjacent to node i.
Optimization Problems

For technical simplicity, when discussing optimization problems, we will assume that all domains have fixed size k and are equal to the set K = {1, ..., k}. We are careful in making this assumption so that all results presented still hold for instances having unequal domain sizes, provided that the maximum domain size is k.
We introduce two CSP optimization problems: the Maximum Constraint Satisfaction Problem (MAX-CSP) and the Weighted Constraint Satisfaction Problem (W-CSP). MAX-CSP is defined formally as follows:

INSTANCE: Set V of variables, collection E of constraints and integer k.
OUTPUT: Assignment σ : V → K s.t. the number of satisfied constraints in E is maximized.

W-CSP is the weighted version, with a weight function w : E → ℕ. The output is an assignment maximizing the weighted sum of satisfied constraints (or simply the weight).
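For very small instances, the MAX-CSP objective can be evaluated and even optimized by exhaustive search. The sketch below (our own illustration, with our own names; not an algorithm from this thesis) represents each binary constraint as a scope (i, j) plus its set of allowed value pairs.

```python
from itertools import product

def num_satisfied(assignment, constraints):
    """Count satisfied binary constraints; each constraint is
    ((i, j), allowed), where `allowed` is a set of value pairs."""
    return sum(1 for (i, j), allowed in constraints
               if (assignment[i], assignment[j]) in allowed)

def max_csp_brute_force(n, k, constraints):
    """Return an assignment over domain K = {0, ..., k-1} maximizing
    the number of satisfied constraints (time exponential in n)."""
    return max(product(range(k), repeat=n),
               key=lambda a: num_satisfied(a, constraints))
```

As an example, a triangle of NOT-EQUAL constraints with k = 2 is over-constrained: the optimum satisfies only 2 of the 3 constraints.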

Consistency

An important property of CSP is the consistency of its constraints. A CSP instance is r-consistent iff, given any instantiation of any r − 1 variables satisfying all the constraints among them, there exists an instantiation of any rth variable such that the r values taken together satisfy all the constraints among the r variables. It is strongly r-consistent iff it is j-consistent for all j ≤ r. Strong 2-consistency and 3-consistency are commonly known as arc- and path-consistency respectively. Given a satisfiable CSP instance x, it is possible to establish consistency, i.e. to construct an instance y which has a given level of consistency such that an assignment satisfies x iff it satisfies y. This process involves systematically removing values from the variable domains and tightening the constraint relations accordingly. Arc-consistency and path-consistency are interesting properties because they can be established in O(nk^2) and O(n^3 k^3) time respectively [30]. If a solution can be obtained deterministically by assigning values in some linear ordering without backtracking, we say that the instance is globally-consistent. Clearly, a strongly n-consistent instance is globally-consistent, since any consistent instantiation of a subset of variables can be extended to a solution without backtracking. Unfortunately, establishing strong n-consistency takes exponential time.

We generalize the notion of arc-consistency as follows. A constraint is said to be arc(r)-consistent if, no matter what value one of its variables has, there are at least r different values that can be assigned to the other variable which satisfy the constraint. Hence, when r = 1, we get the standard definition of arc-consistency. Two special types of constraints are of interest. The EQUAL constraint specifies that the values of each pair are equal; for example, for k = 3, the EQUAL constraint is the relation {(1, 1), (2, 2), (3, 3)}. The NOT-EQUAL constraint likewise specifies that the values of each pair are not equal. Note that the EQUAL constraint is arc-consistent while the NOT-EQUAL constraint is arc(k − 1)-consistent.

Henceforth, the following notations will be used:

CSP(k)     instances of CSP with domain size k
CSP(k, r)  instances of CSP with domain size k which are arc(r)-consistent
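Arc-consistency can be established by the classical AC-3 propagation scheme (not presented in this chapter; the sketch below is one standard realization for binary instances, with names of our own).

```python
from collections import deque

def revise(domains, xi, xj, allowed):
    """Remove values of xi that have no support in xj under `allowed`
    (the set of consistent (value-of-xi, value-of-xj) pairs)."""
    removed = False
    for v in list(domains[xi]):
        if not any((v, w) in allowed for w in domains[xj]):
            domains[xi].discard(v)
            removed = True
    return removed

def ac3(domains, constraints):
    """Prune domains in place until the instance is arc-consistent.

    `constraints` maps a directed pair (xi, xj) to its allowed pairs;
    both directions of each binary constraint should be present.
    Returns False iff some domain is wiped out (no solution exists).
    """
    queue = deque(constraints)
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, xi, xj, constraints[(xi, xj)]):
            if not domains[xi]:
                return False
            # Re-examine arcs pointing into xi.
            for (a, b) in constraints:
                if b == xi and a != xj:
                    queue.append((a, b))
    return True
```

For example, with a NOT-EQUAL constraint between x0 and x1, domains {0} and {0, 1}, propagation prunes the value 0 from x1's domain.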

1.2 CSP Related Problems

We present a list of well-known problems and discuss how they relate to CSP. Further details about these problems can be found in Garey and Johnson's book [39] or the online compendium of Crescenzi and Kann [27]. Our intention is to demonstrate that the CSP encompasses a wide variety of interesting and realistic problems, even for small domain sizes and special types of constraints.

1. k-Satisfiability Problem (k-SAT):

INSTANCE: Set U of boolean variables {U1, U2, ..., Un} and a k-CNF formula F = {C1, C2, ..., Cm} over U.
QUESTION: Is there a truth assignment for U such that F is satisfiable?


Comment: k-SAT is NP-complete even for k = 3, but satisfying assignments for 2-SAT instances can be found in polynomial time by the Davis-Putnam procedure. Every k-ary CSP(2) instance can be reduced to a k-SAT instance. Hence, binary CSP(2) is also polynomial-time solvable.

2. Graph k-Coloring Problem (k-COLOR):

INSTANCE: Graph G = (V, E), positive integer k ≤ |V|.
QUESTION: Is G k-colorable, i.e. does there exist a function f : V → {1, ..., k} such that f(u) ≠ f(v) for all (u, v) ∈ E?

Comment: k-COLOR is equivalent to CSP(k) with all constraints being NOT-EQUAL constraints. Binary CSP(3) is NP-complete since 3-COLOR is NP-complete.

3. Maximum k-Satisfiability Problem (MAX k-SAT):

INSTANCE: Set U of boolean variables, a set C of disjunctive clauses of literals of length at most k, and a weight function w : C → ℕ.
SOLUTION: A truth assignment for U such that the sum of the weights of the satisfied clauses is maximized.

Comment: When there is no restriction on the length of clauses, the problem is known as MAX SAT. MAX 2-SAT is both NP-complete and MAX SNP-complete. MAX k-SAT is a special case of k-ary W-CSP(2) where every constraint has exactly one inconsistent tuple (the one that makes the clause false).

4. Maximum k-Cut Problem (MAX k-CUT):

INSTANCE: Graph G = (V, E), a weight function w : E → ℕ, and 2 ≤ k ≤ |V|.
SOLUTION: A partition of V into k disjoint sets V1, V2, ..., Vk such that the sum of the weights of the edges between the disjoint sets is maximized.

Comment: The case k = 2 is known simply as MAX CUT. MAX CUT is both NP-complete and MAX SNP-complete. MAX k-CUT is equivalent to W-CSP(k) where all constraints are NOT-EQUAL constraints.

5. Quadratic Integer Program (QIP):

INSTANCE: Integer n, an n × n integer matrix a, an n-vector b, an m × n matrix c and an m-vector d.
SOLUTION: An integer n-vector x which maximizes

    Σ_{i,j=1}^{n} a_{ij} x_i x_j + Σ_{i=1}^{n} b_i x_i

subject to Σ_{j=1}^{n} c_{ij} x_j = d_i for all 1 ≤ i ≤ m.

Comment: As we will see, a W-CSP instance can be formulated as a QIP.

6. Job-shop Scheduling Problems. There are many variants of job-shop scheduling problems. Essentially, we are given m machines and n non-preemptible jobs, and each job j has a processing time p_j, a release time r_j and a due date d_j. The goal is to assign start times to the jobs so that every machine handles at most one job at any one time. This scheduling constraint is often called a disjunctive constraint and can be expressed elegantly as a collection of binary CSP constraints as follows: for all jobs j1 and j2 executing on the same machine,

    σ_{j1} − σ_{j2} ≥ p_{j2}   or   σ_{j2} − σ_{j1} ≥ p_{j1},

where σ_j refers to the start time of job j, having domain values in the range [r_j .. d_j − p_j + 1].
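The disjunctive constraint above can be sketched directly as a predicate on pairs of start times (a small illustration with our own names): two jobs on the same machine are compatible iff one finishes before the other starts.

```python
def disjoint(start1, p1, start2, p2):
    """The binary disjunctive constraint: job 1 finishes before job 2
    starts, or job 2 finishes before job 1 starts."""
    return start2 - start1 >= p1 or start1 - start2 >= p2

def machine_feasible(jobs):
    """`jobs` lists the (start_time, processing_time) pairs scheduled
    on one machine; feasible iff every pair of jobs is disjoint."""
    return all(disjoint(s1, p1, s2, p2)
               for idx, (s1, p1) in enumerate(jobs)
               for (s2, p2) in jobs[idx + 1:])
```

For example, jobs starting at times 0, 3 and 5 with processing times 3, 2 and 1 are feasible on one machine, while two jobs at times 0 and 2 with processing times 3 and 2 overlap.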

1.3 Key Concepts

We explain several key concepts used in this thesis.

Approximation Algorithms

Let A be an algorithm for a maximization problem P. We say that A approximates P within ε (0 < ε ≤ 1) iff, for all input instances y of P, the ratio A(y)/OPT(y) is at least ε, where A(y) and OPT(y) denote the objective value of the solution returned by A and the optimal objective value of y respectively. The quantity ε is commonly known as the performance guarantee or approximation ratio for P. The ratio is absolute if the denominator is the maximum possible objective value instead of OPT(y). In the case of W-CSP, for example, the maximum possible objective value is the sum of the edge weights, although the optimal value can be much smaller. Hence, the absolute ratio is always a lower bound on (and therefore a better bound than) the performance guarantee. Observe that the ratio is close to 1 when the assignment is close to an optimal solution. Conversely, we say that A cannot approximate P within ε iff there exists an input for which the above condition does not hold. We say that P has a polynomial time approximation scheme (PTAS) iff for every fixed ε > 0, there exists an algorithm which approximates P within 1 − ε, and the time taken is polynomial in 1/ε and the size of the input.

Local Search

Commonly known in the AI community as iterative repair, local search is a general algorithmic paradigm in which a current solution is iteratively improved by making "local" changes. It is an appealing paradigm because it is conceptually simple and efficiently implementable. It encompasses a wide spectrum of well-known algorithms, ranging from the Lin-Kernighan 2-Opt algorithm for the Travelling Salesman Problem to sophisticated algorithms such as simulated annealing, genetic algorithms, tabu search and the simplex method for solving linear programs.

Local search is widely used to solve optimization problems. An optimization problem has a set of feasible assignments and an objective function that assigns a numerical value to every assignment. The goal is to find an optimal assignment, i.e. one that has the maximum objective value in the case of a maximization problem. To define a local search algorithm, one imposes a neighborhood structure on the assignments. The neighborhood is often based on Hamming distances: given two assignments, their Hamming distance is the number of variables whose assigned values differ, and two assignments are h-neighbors of each other iff their Hamming distance is between 1 and h. The local search algorithm starts from some initial assignment (generated randomly or by some procedure) and keeps moving to a neighbor with a higher objective value until it reaches a local optimum, i.e. an assignment whose objective value is higher than or equal to that of all of its neighbors. It then outputs that assignment.
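The generic scheme just described, specialized to MAX-CSP with 1-neighbors (single-variable changes), can be sketched as follows. This is our own deterministic first-improvement illustration, not one of the algorithms analysed in this thesis; names and the constraint representation are assumptions.

```python
def objective(assignment, constraints):
    """Number of satisfied binary constraints; each constraint is a
    ((i, j), allowed-pairs) tuple."""
    return sum(1 for (i, j), allowed in constraints
               if (assignment[i], assignment[j]) in allowed)

def local_search(n, k, constraints, init=None):
    """Hill-climb over 1-neighbors until a local optimum is reached."""
    a = list(init) if init is not None else [0] * n
    best = objective(a, constraints)
    improved = True
    while improved:
        improved = False
        for i in range(n):          # try changing one variable at a time
            for v in range(k):
                if v == a[i]:
                    continue
                old = a[i]
                a[i] = v
                score = objective(a, constraints)
                if score > best:    # keep the move only if it improves
                    best = score
                    improved = True
                else:
                    a[i] = old
    return a, best
```

Starting from the all-zeros assignment on a triangle of NOT-EQUAL constraints with k = 2 (0 constraints satisfied), the search climbs to a local optimum satisfying 2 of the 3 constraints, which is also the global optimum here.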

Randomized Rounding

Randomization has proved to be a powerful technique for finding approximate solutions to combinatorial optimization problems. An interesting and efficient algorithmic paradigm is that of randomized rounding, due to Raghavan and Thompson [93]. The key idea is to formulate a given optimization problem as an integer program and then find an approximate solution by solving a polynomial-time solvable convex mathematical program such as a linear program. The linear program must constitute a "relaxation" of the problem under consideration, i.e. all integer solutions are feasible for the linear program and have the same value as they do in the integer program. One easy way to achieve this is to drop the integrality conditions on the variables. Given the optimal fractional solution of the linear program, the question is how to find a good integer solution. Traditionally, one rounds the variables to the nearest integers. Randomized rounding is a technique which treats the values of the fractional solution as a probability distribution and obtains an integer solution by sampling from this distribution.
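The rounding step can be sketched as follows. This is our own MAX SAT-flavoured illustration (names and the clause representation are assumptions); the LP-solving step is taken as given, i.e. we assume the fractional solution has already been computed.

```python
import random

def randomized_round(fractional, seed=None):
    """Interpret each fractional value in [0, 1] as the probability of
    setting the corresponding 0/1 variable to 1, and sample a solution."""
    rng = random.Random(seed)
    return [1 if rng.random() < y else 0 for y in fractional]

def expected_satisfied(fractional, clauses):
    """Expected number of satisfied clauses under independent rounding.

    A clause is a list of (variable_index, sign) literals, with
    sign = +1 for a positive literal and -1 for a negated one.
    """
    total = 0.0
    for clause in clauses:
        p_all_false = 1.0
        for i, sign in clause:
            p_lit_true = fractional[i] if sign > 0 else 1 - fractional[i]
            p_all_false *= 1 - p_lit_true
        total += 1 - p_all_false
    return total
```

For example, the clause (x0 ∨ ¬x1) under the fractional solution (0.5, 0.5) is satisfied with probability 1 − 0.5 · 0.5 = 0.75; values of 1.0 or 0.0 round deterministically.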


Raghavan and Thompson presented a technique, based on basic probability theory, which ensures that the values chosen under the distribution do in fact yield a solution near the expectation, thus giving good approximate solutions to the integer program.

Semidefinite Programming A semidefinite program is the problem of optimizing a linear function of a symmetric matrix subject to linear equality constraints and the constraint that the matrix be positive semidefinite. Semidefinite programming is a generalization of linear programming and a special case of convex programming. There have been active recent developments in semidefinite optimization and its application to a wide range of practical problems in fields such as combinatorial optimization, engineering design, matrix inequalities in systems and control theory, and matrix completion problems. The simplex method for linear programming can be generalized to semidefinite programs. Using interior point methods, one can show that semidefinite programming is solvable in polynomial time under some realistic assumptions; for details, see the survey by Alizadeh [2]. In practice, there are solvers which yield solutions quickly for reasonably large semidefinite programs (e.g. [38]). In this thesis, we design approximation algorithms for W-CSP using what is known as randomized rounding of a semidefinite program relaxation. The idea is to represent W-CSP as a quadratic integer program and solve a corresponding instance of semidefinite programming. The solution returned by the semidefinite program is then rounded to a valid assignment by randomized rounding. This approach yields a randomized algorithm. To convert it into a deterministic algorithm, we apply the method of conditional probabilities, a well-known method in combinatorics which is clearly explained in the text of Alon and Spencer [5].

Experimentation Platform All our experiments were coded in ANSI C and conducted on a SUN SparcStation 10 workstation (40 MHz). For experiments involving the generation of random numbers, the standard UNIX long random() function is used, initialized with a random seed which depends on the time of day.

1.4 Overview of Results

This thesis is written as a series of self-contained chapters. In each chapter, a summary of related results is given first, followed by a technical description of our contributions. All logarithms written log are base 2; ln denotes the natural logarithm. In Chapter 2, we study the computational complexity of a restricted class of CSP. We prove that CSP remains NP-complete even for grid-graphs whose constraints are row-convex in one direction and taken from a set closed under composition and intersection. This provides a theoretical limit to van Beek's positive result [106]. Part of this work has appeared in [72]. In Chapter 3, we consider satisfiable CSP instances drawn randomly from a distribution parameterized by the edge probability q and the consistency probability ρ (i.e. the probability that a constraint is satisfied by a random assignment). Part of this chapter has appeared in [72, 73]. We give the following results:

• If q = Ω(ln k · log n/((1 − ρ)n)) and ρ ≤ 1/k, then almost every instance has only one solution. Furthermore, if the Hamming distance between the initial assignment and the solution is less than n/2, then our local search algorithm finds the solution with high probability. Our analysis is a refinement of [82] and an improvement of their result. This result is motivated by applications in rescheduling [87]. In a manufacturing plant, for example, unanticipated events such as machine breakdowns and deadline amendments often cause parts of the existing schedule to become infeasible. When that happens, it is


more practical to 'patch' the existing schedule than to create a new one. The new schedule should resemble the existing schedule as much as possible.

• For q = O(log n/n) and 0.43 ≤ ρ ≤ 1, a solution can be found with high probability by local search using a polynomial number of iterations. This result is motivated by recent work of Selman et al. [99]. There, they proposed an iterative local search algorithm, GSAT, and showed empirically that it works well for random satisfiable 3CNF formulas. We adopt their idea and formalize its behavior on CSP.

In Chapter 4, we consider efficient heuristics for approximating MAX-CSP and related problems. Part of this chapter has appeared in [68, 51]. We give the following results:

• We consider the basic local search heuristic: starting with an arbitrary assignment to the variables, iteratively improve the solution by changing the value of a single variable. This approach attains an absolute ratio of r/k for MAX-CSP(k, r). This is tight for this heuristic, both as an absolute and as a relative ratio.

• We then consider a slightly more advanced version of local search, based on a little-known lemma of Lovász for partitioning graphs [76]. We give an efficient implementation of this lemma and generalize it. We apply it to MAX k-CUT and obtain an absolute ratio of (k − 1)/k · (1 + 1/(2Δ + k − 1)) in linear time, where Δ denotes the maximum degree. Unfortunately, we can show that this approach cannot improve the ratio of r/k for MAX-CSP(k, r).

• Using the same lemma of Lovász, we obtain improved (relative) approximation ratios for a host of problems. In particular, we obtain a ratio of 3/(Δ + 2) for the weighted independent set problem, 1/⌈(Δ + 1)/3⌉ for weighted hereditary subgraph problems, and 3(Δ + 2)/4 for 3-COLOR.

In Chapter 5, we consider randomized rounding algorithms for approximating W-CSP. Most of this chapter has appeared in [74, 70].
We give the following results:

• We prove that for every ε > 0, there exists a domain size k depending on ε such that W-CSP cannot be approximated within ε unless P = NP. In general, W-CSP has no O(log n) approximation ratio, even if the underlying constraint graph is bipartite, unless EXP = NEXP.

• By representing W-CSP as a linear integer program and applying randomized rounding to its linear program relaxation, we obtain a ratio of 1/k. This improves a result by Khanna et al. by a factor of 2 over satisfiable instances.

• By representing W-CSP as a quadratic integer program and applying a simple randomized rounding to its semidefinite program relaxation, we obtain a 0.408-approximation for k ≥ 3. This ratio is almost best possible in the sense that, for any rounding scheme chosen, there exists an instance whose solution has expected value no more than 0.5 times the optimal value. We also consider the hyperplane rounding method of Goemans and Williamson for the case k = 2 and obtain a ratio of 0.634, which can be improved to 0.878 for some special cases.

In Chapter 6, we consider the manpower scheduling problem with shift change constraints (CSAP). We prove the NP-completeness of CSAP and solve several variants of it by polynomial-time greedy algorithms. We show that the NP-hard version can be reduced to finding a minimum-cost flow in a fixed-charge network, which can be efficiently solved by commercially available software. Most of this chapter has appeared in [71, 67, 69].


Chapter 2

Exact Algorithms

This chapter is concerned with efficient algorithms for solving the CSP exactly, i.e. seeking a solution that satisfies all constraints. The traditional approach is to perform enumerative search. To reduce run-time, constraint propagation techniques are used to simplify the problem instance dynamically, and many bounding procedures have been proposed to improve the backtracking process. Good surveys of these approaches can be found in [56, 103]; more recent work on the design of backtracking algorithms can be found in [12, 15, 41]. The disadvantage of enumerative approaches is that they suffer from combinatorial explosion. A natural question to ask is which subclasses of CSP can be solved in polynomial time. Recently, it has been shown that under certain restrictions, one can perform preprocessing which ensures that backtracking is either unnecessary or bounded in some way. These restrictions fall into two broad categories: restricting the topology of the constraint network ([30, 34, 118]) and restricting the properties of the constraints ([26, 28, 106]). Efforts to restrict the network topology have made little progress. This is to be expected, since graph-coloring, a special case of CSP, is NP-hard even under restricted topologies [39]. On the other hand, some interesting results have been reported on restricting the properties of constraints. In this chapter, we survey these results. Our contribution is to take one such tractable subclass and show that it becomes NP-hard when the restriction is slightly relaxed.

2.1 Related Work

In this section, we discuss related work on the characterization of tractable CSP based on the properties of constraints. First, we give some definitions. A binary constraint between two variables x and y, denoted R(x, y), can be represented by a boolean matrix M(x, y) with |D(x)| rows and |D(y)| columns such that for all u ∈ D(x) and v ∈ D(y), M[u, v] = 1 iff (u, v) ∈ R(x, y). Constraints are undirected in general, but several of their properties are directional. A binary constraint can also be viewed as a bipartite graph where D(x) and D(y) form the bipartition and the edges represent the consistent value pairs. A binary constraint R(x, y) is 0/1/all (or implicational) iff every value in D(x) is consistent with either zero, one or all of the values in D(y), and vice versa. It is q-tight iff every value in D(x) is consistent with at most q values in D(y), and vice versa. It is row-convex from x to y iff there exists an ordering of the values in D(y) such that every value in D(x) is consistent only with consecutive values of D(y); in other words, in each row of the matrix representation M(x, y), no two ones are separated by a zero. It is bidirectionally row-convex if row-convexity holds in both directions. Clearly, every row-convex constraint is


0/1/all. It is functional from x to y iff every value in D(x) is consistent with at most one value in D(y). A constraint of arity t is 0/1/all (resp. q-tight, row-convex) iff for all instantiations of any subset of t − 2 of its variables, the relation between the remaining 2 variables is 0/1/all (resp. q-tight, bidirectionally row-convex). A CSP instance is 0/1/all (resp. q-tight, row-convex, functional) iff all its constraints are 0/1/all (resp. q-tight, row-convex, functional).

A set of (non-binary) constraints R_1, …, R_{r−1} over the variable sets V_1, …, V_{r−1} is relationally r-consistent relative to a variable x iff for any consistent instantiation of the variables in (V_1 ∪ … ∪ V_{r−1}) − {x}, there exists a value for x such that all r − 1 constraints are simultaneously satisfied. A network is relationally r-consistent iff every set of r − 1 constraints is relationally r-consistent. Note that in this definition the relations, rather than the variables, are the primitive entities. Unfortunately, for r ≥ 3, establishing relational r-consistency takes exponential time.

For establishing arc- and path-consistency, several constraint operations are frequently used. The binary operations composition and intersection of two constraints correspond to the multiplication and intersection of their matrices. The unary operation transposition corresponds to the transposition of the matrix. The unary operation restriction, given two value sets P and Q, corresponds to deleting the rows of the matrix not in P and the columns not in Q. The unary operation permutation, given two permutations π1 and π2, corresponds to permuting the rows of the matrix according to π1 and the columns according to π2. The reader may consult [79] for the details of these operations.

2.1.1 0/1/All Networks

Cooper et al. [26] showed that for any binary 0/1/all network, a solution can be obtained in O(nmk + mk²) time if one exists. Kirousis [62] extended this to t-ary 0/1/all networks and showed that a solution can be obtained in O(nmk²t²) time. He also showed that his algorithm can be executed in O(log³(nk)) time with O((mk²t² + n³k³)/log(nk)) processors on an EREW PRAM. In the following, we discuss Cooper et al.'s algorithm OA:

procedure OA:
    establish arc-consistency;
    S := V;
    repeat until S is empty:
        select a variable i from S;
        CONSISTENT := false;
        for u ∈ D(i) until CONSISTENT do
            assign i to u;
            Propagate(S, i, u, T, CONSISTENT);
        endfor
        if CONSISTENT then S := S − T else output fail;
    endrepeat

A variable l is said to be uniquely determined by an instantiation i → u iff there is a unique value v ∈ D(l) such that (u, v) satisfies the constraint between i and l. Let S be the current set of variables which have not been assigned. First establish arc-consistency, so that the constraints are 1/all (instead of 0/1/all). Iteratively assign a variable i in S a value u ∈ D(i). If no such value can be found, then the instance has no solution; otherwise, assign i provisionally to u and propagate this information (i.e. recursively and provisionally assign all variables which become uniquely determined). If any inconsistency occurs during the propagation (i.e. some variable must be assigned two distinct values), undo all provisional instantiations and re-assign i to the next value in D(i); else make all provisional instantiations permanent and


update S accordingly.

Theorem 2.1. ([26]) OA solves 0/1/all CSP in O(nmk + mk²) time.

Proof. Each instantiation of i to a value u either uniquely determines the value of another variable l or is consistent with all values of l, since the constraints are 1/all. Thus, if the propagation is successful, we can permanently assign the provisionally set variables and ensure that (1) all constraints between assigned variables are satisfied; and (2) any constraint connecting a permanently assigned variable to an unassigned variable will be satisfied regardless of what value is subsequently assigned to the latter. This establishes the correctness of OA; the time complexity can be verified by inspection. □

0/1/all CSP is, in a certain sense, the largest class of tractable CSP which can be obtained by restricting the semantics of the constraints. Let CSP(S) denote the class of CSP decision instances in which all the constraints are elements of the set S. Clearly, if S is the set of 0/1/all constraints, then CSP(S) is tractable. One can verify that the set of 0/1/all constraints is closed under permutation and restriction. If we deviate from this set even slightly, then CSP(S) becomes NP-complete. By a reduction from the graph-coloring problem, Cooper et al. proved the following:

Theorem 2.2. ([26]) Let S be a set of constraints closed under permutation and restriction which does not consist entirely of 0/1/all constraints. Then CSP(S) is NP-complete.

2.1.2 Row-Convex Networks

Van Beek [106] showed that for any binary path-consistent network, if there exists an indexing of the variables and an ordering of the domain values such that the constraints are row-convex from the low- to the high-indexed variables, then a solution can be obtained without backtracking, if one exists. Subsequently, van Beek and Dechter [108] generalized the result to any t-ary network which is strongly 2(t − 1) + 1-consistent. We outline van Beek's result below.

Lemma 2.3. Let F be a finite collection of row-convex boolean vectors of equal length such that every pair of vectors has a non-zero entry in common. Then all vectors in F have a non-zero entry in common.

Theorem 2.4. ([106]) Given a path-consistent binary network, if there exists an indexing of the variables and an ordering of the domain values such that the constraints are row-convex from the low- to the high-indexed variables, then the network is globally consistent.

Proof. This can be shown by induction. The network is 2- and 3-consistent by definition. Assume that the network is (r − 1)-consistent for some r ≤ n. Consider any set of r − 1 variables i_1 < i_2 < … < i_{r−1} which have been consistently instantiated to u_1, …, u_{r−1} respectively. To show that the network is r-consistent, we must show that for any variable i_r > i_{r−1}, there exists a value u_r ∈ D(i_r) such that (u_j, u_r) ∈ R(i_j, i_r) for all j = 1, …, r − 1. For each j, row u_j of the matrix M(i_j, i_r) is a vector that specifies the allowed values for i_r. Since all these vectors are row-convex, by Lemma 2.3 it suffices to show that any two of them have a non-zero entry in common in order to conclude that they all have a non-zero entry in common. But path-consistency guarantees exactly this. Hence the network is r-consistent. □

This theorem leads to a constructive algorithm RC that runs in O(n²k) time, as follows (assuming all domains are equal to the set {1, …, k}):

procedure RC:
    for i = 1, …, n do
        X := [1, …, 1];    % the all-ones vector of length k
        for j = 1, …, i − 1 do
            X := X ∧ (row u_j of M(j, i));
        endfor
        if X = [0, …, 0] then output fail;
        assign i to a value u_i such that X[u_i] = 1;
    endfor

Van Beek's theorem is practical in the sense that, given a path-consistent network, one can test whether all the constraints are row-convex using a theorem of Booth and Lueker [20], which states that row-convexity of a p × q matrix can be tested in O(pq) time. Having ascertained row-convexity, algorithm RC can be applied to find a solution in O(n²k) time. Van Beek's theorem also leads to an interesting result:

Corollary 2.5. ([106]) Let S be a set of row-convex constraints closed under composition, intersection and transposition. Then CSP(S) can be solved in O(n³k³) time.

Proof. First establish path-consistency, which takes O(n³k³) time. It is known that path-consistency can be computed by a series of compositions, intersections and transpositions of the given constraints. Since these constraints are drawn from a set which is closed under those operations, the resulting path-consistent network has only row-convex constraints. By algorithm RC, a solution can then be obtained in O(n²k) time. □

Van Beek's result requires that the given network be bidirectionally row-convex. We will prove that if there exists an ordering of the variables and domain values such that the constraints are row-convex from the smaller to the larger variables, and are drawn from a set which is closed under composition and intersection but not transposition, then the problem becomes NP-complete.

2.1.3 q-Tight Networks

Van Beek and Dechter [107] showed that for any binary q-tight network, strong (q + 2)-consistency induces global consistency. For t-ary networks, relational (q + 2)-consistency induces global consistency. In the following, we outline their work for the binary case.

Lemma 2.6. Let F be a finite collection of equal-length boolean vectors such that each vector has at most q non-zero entries and every set of at most q + 1 vectors has a non-zero entry in common. Then all vectors in F have a non-zero entry in common.

Theorem 2.7. ([107]) Given a binary network whose constraints are q-tight, strong (q + 2)-consistency induces global consistency.

Proof. We must show that a q-tight network which is strongly (q + 2)-consistent is r-consistent for any q + 2 < r ≤ n. This can be shown by induction. Assume the network is (r − 1)-consistent. Consider any set of r − 1 variables i_1, …, i_{r−1} which have been consistently instantiated to u_1, …, u_{r−1} respectively. To show that the network is r-consistent, we must show that for any variable i_r, there exists a value u_r ∈ D(i_r) such that (u_j, u_r) ∈ R(i_j, i_r) for all j = 1, …, r − 1. For each j, row u_j of the matrix M(i_j, i_r) is a vector that specifies the allowed values for i_r. Since all these vectors are q-tight, by Lemma 2.6 it suffices to show that every set of at most q + 1 of them has a non-zero entry in common in order to conclude that they all have a non-zero entry in common. But strong (q + 2)-consistency guarantees exactly this. Hence the network is r-consistent. □


It is tempting to think that we can solve q-tight CSP in polynomial time by enforcing strong (q + 2)-consistency. However, it is known that such an approach does not work in general, since enforcing r-consistency (r > 3) may introduce constraints of higher arity into the network, so the network need not become globally consistent. Moreover, the worst-case time and space complexity is O(Σ_{j=1}^{r} C(n, j) k^j), which is exponential in r [25].

2.1.4 Functional Networks

David [28] showed that for any functional network, if there exists a consistent instantiation of a root set, then pivot-consistency induces global consistency. A root set of a directed graph is a subset of nodes such that all remaining nodes are descendants of some element of the root set. Computing a minimum root set of a graph takes O(n + m) time by depth-first search. Pivot-consistency is a weaker form of path-consistency which can be established in O(n²k²) time [28]. However, the bottleneck in David's result lies in the consistent instantiation of the root set, which takes O(k^ρ) time, where ρ is the size of the root set. Hence, David's result is meaningful only for networks with small minimum root sets. It remains an open question which types of graphs have small minimum root sets.

2.1.5 Networks with Small Domain Sizes

When the domain size k is small, the following results are useful:

1. Dechter [29] proved that any t-ary network that is strongly k(t − 1) + 1-consistent is globally consistent. In particular, for binary networks, strong (k + 1)-consistency is sufficient.

2. Schaefer [96] showed that there are only three polynomially solvable classes of networks over the domain {0, 1}, namely Horn clauses, 2-SAT and linear equations modulo 2. All CSPs over the domain {0, 1} that do not fit into one of these three categories are NP-complete.

2.2 Our Contributions

We saw in the previous section that row-convexity is an important property for tractability when the constraints are closed under composition, intersection and transposition. Our contribution is a negative result for a slightly wider class of CSP: if the constraints are row-convex in one direction and taken from a set which is closed under composition and intersection, but not necessarily under transposition, then CSP remains NP-hard even if (1) the network is a grid and k ≥ 4, or (2) the network has degree 3 and k ≥ 3, where k is the maximum domain size. Hence, we provide a theoretical limit to van Beek's positive result, by considering both the network topology and the properties of the constraints. We define the decision problem RC-CSP as follows:

INSTANCE: A set V of variables, a collection E of constraints and a domain size k, such that for all (i, l) ∈ E, R(i, l) is row-convex from i to l and taken from a set closed under composition and intersection.
QUESTION: Is there a solution V → {1, …, k}?

Note that RC-CSP does not include graph-coloring instances, because the NOT-EQUAL constraint is not row-convex. A p × q grid is a grid network composed of p rows and q columns of nodes connected by vertical and horizontal edges.

2.2.1 NP-Completeness of Grid RC-CSP

We show the NP-completeness of RC-CSP by a many-one reduction from 3-SAT.

Theorem 2.8. RC-CSP is NP-complete, even for grid networks with k = 4.

Proof. Let F be an instance of 3-SAT with n Boolean variables {U_1, …, U_n} and m clauses. Assume that for each clause, if the literal U_i occurs in the clause, then ¬U_i does not occur, and vice versa. Construct an n × m grid such that the nodes in row i represent the variable U_i and the nodes in column j represent the clause C_j. Let the node in row i and column j be called x_{i,j}. Call a node x_{i,j} active if U_i or ¬U_i occurs in C_j, and passive otherwise. We introduce 7 distinct values {0, θ, 1, a, c, b, d}. Call the subset {θ, 1, c, d} the accepting set. Except in the first and last rows, active nodes have domain {0, θ, 1} while passive nodes have domain {a, b, c, d}. In the first row, active nodes have domain {0, 1} while passive nodes have domain {a, b}. In the last row, active nodes have domain {θ, 1} while passive nodes have domain {c, d}. The value 1 represents the truth value True, while both 0 and θ represent False. The values a and c are used to convey 0 or θ, while b and d convey 1.

Two main issues need to be addressed. First, the values of active nodes must be consistently conveyed along the rows. This is essential so that every Boolean variable carries exactly one truth value throughout all clauses; this condition is enforced by the horizontal constraints. Second, every clause must be satisfied, i.e. contain at least one true literal. This is why we need two values to represent False: an active node is assignable to 0 only if no active node in the same column and a smaller-numbered row is assigned 1, and similarly a passive node is assignable to a or b only if no active node in the same column and a smaller-numbered row is assigned 1. By this approach, enforcing the satisfiability of clause C_j is equivalent to enforcing that the last node of column j takes a value in the accepting set; this condition is enforced by the vertical constraints. We now discuss how the vertical and horizontal constraints are constructed.

For all i and j < j′, write x = x_{i,j} and y = x_{i,j′}, and let Δ(x, y) denote the sign change from clause C_j to C_{j′}:

    Δ(x, y) =  (+)       if either U_i or ¬U_i occurs in both clauses C_j and C_{j′};
               (−)       if U_i occurs in one clause and ¬U_i in the other;
               Δ(x, z)   if y is passive and there exists an active node z right of y;
               (+)       otherwise.

Define 10 distinct constraint relations, named by identifiers as follows (H stands for horizontal, V for vertical, A for active and P for passive):

1. HAA+ = {(0,0), (0,θ), (θ,0), (θ,θ), (1,1)},
2. HAA− = {(0,1), (θ,1), (1,0), (1,θ)},
3. HAP+ = {(0,a), (0,c), (θ,a), (θ,c), (1,b), (1,d)},
4. HAP− = {(0,b), (0,d), (θ,b), (θ,d), (1,a), (1,c)},
5. HPA = {(a,0), (a,θ), (b,1), (c,0), (c,θ), (d,1)},
6. HPP = {(a,a), (a,c), (b,b), (b,d), (c,a), (c,c), (d,b), (d,d)},
7. VAA = {(0,0), (0,1), (θ,θ), (θ,1), (1,θ), (1,1)},
8. VAP = {(0,a), (0,b), (θ,c), (θ,d), (1,c), (1,d)},
9. VPA = {(a,0), (a,1), (b,0), (b,1), (c,θ), (c,1), (d,θ), (d,1)},
10. VPP = {(a,a), (a,b), (b,a), (b,b), (c,c), (c,d), (d,c), (d,d)}.

The constraints for the RC-CSP instance are defined as follows:


Vertical constraints (for all y below x):

    x        y        R(x, y)
    active   active   VAA
    active   passive  VAP
    passive  active   VPA
    passive  passive  VPP

Horizontal constraints (for all y right of x):

    x        y        Δ(x, y)      R(x, y)
    active   active   (+)          HAA+
    active   active   (−)          HAA−
    active   passive  (+)          HAP+
    active   passive  (−)          HAP−
    passive  active   (+) or (−)   HPA
    passive  passive  (+) or (−)   HPP

Order the variables by increasing row number and, within each row, by increasing column number. Order the domain values as (0, θ, 1, a, c, b, d). The horizontal constraints are row-convex. The vertical constraints are not row-convex, but we can replace them by equivalent row-convex constraints. For example, if R(x, y) = VPA, introduce an intermediate node z with domain {α, β} and replace R(x, y) by R(x, z) = {(a,α), (b,α), (c,β), (d,β)} and R(y, z) = {(0,α), (θ,β), (1,α), (1,β)}, which are row-convex. The other vertical constraints can be replaced in a similar way. One can verify by inspection that the new set remains closed under composition and intersection. Note, however, that the vertical constraints are not closed under transposition. Clearly, the above reduction can be done in polynomial time. We claim that F has a satisfying assignment if and only if the constructed RC-CSP instance has a solution.

Suppose we have a solution for the RC-CSP instance. For every i, let j_i be the first clause in which either the literal U_i or ¬U_i occurs. If U_i occurs, then set t(U_i) to True when x_{i,j_i} = 1, and to False when x_{i,j_i} = 0 or θ. If ¬U_i occurs instead, do the reverse. Every Boolean variable U_i is consistently set, by observing that all active nodes in row i are set to the same truth value if their signs agree and to opposite values otherwise. Consider two consecutive active nodes x and y in row i, and suppose x has been assigned 1 (similar reasoning applies if x has been assigned 0 or θ). Then we have the following cases:

1. if y is right of x and Δ(x, y) = (+), then by HAA+, y will be assigned 1;
2. if y is right of x and Δ(x, y) = (−), then by HAA−, y will be assigned 0 or θ;
3. if x and y are separated by passive node(s) and Δ(x, y) = (+), then by HAP+, HPP and HPA, y will be assigned 1; and
4. if x and y are separated by passive node(s) and Δ(x, y) = (−), then by HAP−, HPP and HPA, y will be assigned 0 or θ (see, for example, row U5 of Figure 2.1).

The resulting truth assignment t satisfies F because every clause contains at least one true literal. Consider any column j. For any value in {0, 1, a, b} assigned to the top node x_{1,j}, the bottom node x_{n,j} will be constrained by the vertical constraints to take a value in the accepting set iff at least one active node in column j takes the value 1. Therefore, C_j contains at least one true literal.

Conversely, suppose we have a satisfying assignment t. Define a solution for the RC-CSP instance as follows. Assign the active nodes from top to bottom, left to right. For each t(U_i) = True, assign node x_{i,j} to 1 for all j such that U_i occurs in C_j, and to 0 or θ for all j such that ¬U_i occurs in C_j (depending on whether a 1 has already been assigned to a node in the same column in a previous row). Do the reverse for each t(U_i) = False. Then assign the passive nodes from top to bottom, left to right. Clearly, such an assignment is possible. Since each clause has at least one true literal, the nodes in the last row will be assigned values in the accepting set. □

Example. Figure 2.1 gives an example of the reduction from 3-SAT to grid RC-CSP. In this example, (a) is the RC-CSP instance constructed from the 3-SAT instance

U = {U1, U2, U3, U4, U5}, F = {(U1, U2, U5), (U2, U3, U4), (U1, U3, U4), (U1, U3, U5)}.

Active nodes are unshaded while passive nodes are shaded. Signs above the edges represent the sign changes. From the solution (b), the 3-SAT satisfying assignment is {True, False, True, False, True}.

Figure 2.1: Example of reduction from 3-SAT to grid RC-CSP. In the solution (b), rows U1 through U5 carry the values (1, a, 0, 1), (θ, 1, b, d), (c, θ, 0, 1), (d, 1, 1, d) and (1, c, c, θ) across columns C1 through C4.

Figure 2.2: Example of reduction from 3-SAT to non-grid RC-CSP. X, Y and Z are the active nodes of column 1; C1 and C2 are its count nodes.

2.2.2 NP-Completeness of Non-Grid RC-CSP

We now consider the case where the constraint network is not a grid.

Theorem 2.9. RC-CSP is NP-complete, even for networks with degree 3 and k = 3.

Proof. Again, let F be an instance of 3-SAT with n variables and m clauses. We now construct n chains of m nodes each, where each chain corresponds to one Boolean variable. Instead of vertical constraints, we attach two count nodes to the active nodes of each column, as shown in Figure 2.2. In the figure, X, Y and Z are the active nodes of column 1 and the black nodes represent the count nodes; the count nodes of the other columns are omitted for simplicity. These count nodes indicate whether at least one of the nodes in the corresponding column has been set to 1. Here, we need only three distinct values: 0, 1 and θ. With the help of count nodes, we can make every solution correspond to a valid truth assignment more easily than before. Define the domains of both active and passive nodes to be {0, 1} and the horizontal constraint types as follows:


H_AA^+ = H_AP^+ = H_PA = H_PP = {(0, 0), (1, 1)}, and
H_AA^− = H_AP^− = {(0, 1), (1, 0)}.

The key is the use of count nodes to ensure that every solution corresponds to a satisfying truth assignment. Consider any column. Let X, Y and Z be the active nodes, and C1 and C2 be the count nodes. We want to impose the condition that C2 can be assigned a value iff at least one of X, Y or Z is assigned 1. This can be done by defining the domains of C1 and C2 to be {0, 1, θ} and {1, θ} respectively, and the constraints as follows:

R(X, C1) = {(0, 0), (0, 1), (1, 1), (1, θ)},
R(Y, C1) = {(0, 0), (0, θ), (1, 1)},
R(Z, C2) = {(0, θ), (1, 1), (1, θ)},
R(C1, C2) = {(0, 1), (1, 1), (1, θ), (θ, θ)}.

It is clear from Figure 2.2 that the degree of the network is 3. □
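The gadget above can be checked mechanically. The following sketch (with the domain value θ written as 't'; the variable and function names are ours) enumerates all assignments to X, Y and Z and verifies that a consistent extension to C1 and C2 exists iff at least one of X, Y, Z is 1:

```python
from itertools import product

# Count-node gadget of Theorem 2.9; the pair sets are copied from the text,
# with the domain value theta written as 't'.
dom_c1 = {0, 1, 't'}
dom_c2 = {1, 't'}
rel = {
    ('X', 'C1'): {(0, 0), (0, 1), (1, 1), (1, 't')},
    ('Y', 'C1'): {(0, 0), (0, 't'), (1, 1)},
    ('Z', 'C2'): {(0, 't'), (1, 1), (1, 't')},
    ('C1', 'C2'): {(0, 1), (1, 1), (1, 't'), ('t', 't')},
}

def extendable(x, y, z):
    """True iff (X, Y, Z) = (x, y, z) extends to a consistent (C1, C2)."""
    return any((x, c1) in rel[('X', 'C1')] and (y, c1) in rel[('Y', 'C1')]
               and (z, c2) in rel[('Z', 'C2')] and (c1, c2) in rel[('C1', 'C2')]
               for c1 in dom_c1 for c2 in dom_c2)

# C2 can be assigned a value iff at least one of X, Y, Z is assigned 1:
for x, y, z in product((0, 1), repeat=3):
    assert extendable(x, y, z) == bool(x or y or z)
```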


Chapter 3

Monte Carlo Local Search Algorithms

This chapter is concerned with the performance of local search for solving random satisfiable instances of CSP. Local search involves generating an initial assignment by some means (e.g. randomly) and improving it iteratively until it reaches a local optimum. The advantage of local search is that it is fast and can hence be used to solve large-scale problems. The disadvantage is that it does not guarantee finding a solution even if one exists. Recently, several empirical and theoretical results on applying local search to solve NP-hard problems have emerged. These theoretical results show that, under certain assumptions, local search converges to a solution with high probability. In this chapter, we will review some of those works. Our contributions are in applying local search to random satisfiable CSP instances and showing that it converges rather nicely in one iteration under certain assumptions. We also show that by iteratively applying local search a polynomial number of times, we find a solution with high probability.

3.1 Related Work

In this section, we review several recent empirical and theoretical results on applying simple local search to solve the satisfiability, graph-coloring and constraint satisfaction problems.

Satisfiability Problem. Koutsoupias and Papadimitriou [64], Gu [46] and Selman et al. [99] gave local search algorithms which behave very well on average (see details below). Those results however acknowledge the fact that there is a (possibly exponentially small) region of instances for which local search behaves badly. To get around that, certain conditions have to be imposed. For instance, in [64], it was proved that, for random satisfiable 3CNF Boolean formulas, if the initial assignment agrees with the satisfying assignment in more than half the number of variables, then local search based on flipping succeeds with high probability.

3.1.1 Koutsoupias and Papadimitriou's Algorithm

Koutsoupias and Papadimitriou showed that the following local search algorithm finds a satisfying assignment with high probability when a 3SAT (i.e. 3CNF) formula F with n variables and m = Ω(n²) clauses is given uniformly at random from the set of satisfiable formulas. Note that some clauses may appear more than once in the formula, and some variable may appear more than once in one clause. Thus, the total number of distinct clauses is N = 8n³. In their algorithm, the neighborhood of an assignment is the set of assignments at Hamming distance 1.


procedure KP-LS:
    generate an assignment σ for n variables randomly;
    while true do
        if σ satisfies F output σ;
        search for a neighbor σ' which satisfies more clauses;
        if no such σ' exists output fail;
        set σ to σ';
    endwhile.
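A minimal executable sketch of this greedy flip search (the clause representation is ours: a clause is a list of nonzero integers, where literal v means x_v is true and −v means x_v is false):

```python
import random

def kp_ls(clauses, n, rng=random.Random(0)):
    """Greedy flip search in the spirit of KP-LS: move to any neighbor (at
    Hamming distance 1) that satisfies more clauses; fail at a local optimum."""
    def num_sat(a):
        return sum(any((lit > 0) == a[abs(lit)] for lit in c) for c in clauses)
    assign = {v: rng.random() < 0.5 for v in range(1, n + 1)}
    best = num_sat(assign)
    while best < len(clauses):
        improved = False
        for v in range(1, n + 1):           # neighbors are at Hamming distance 1
            assign[v] = not assign[v]
            s = num_sat(assign)
            if s > best:
                best, improved = s, True
                break                        # move to the improving neighbor
            assign[v] = not assign[v]        # undo the flip
        if not improved:
            return None                      # no better neighbor: report fail
    return assign
```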

Consider any assignment σ̂, and let it be fixed. Consider only formulas that are satisfied by σ̂. Thus, σ̂ can be regarded as a satisfying assignment. For any assignment σ, σ is good (w.r.t. σ̂) if d(σ, σ̂) (i.e., the Hamming distance between σ and σ̂) is at most n(1/2 + ε), where 0 < ε < 1/2 is some constant; σ is called bad otherwise. By Chernoff bounds, they proved that:

Lemma 3.1. The probability that the randomly chosen σ is bad is at most e^{−2ε²n}.

For any σ, σ' is one step closer to σ̂ if d(σ, σ') = 1 and d(σ', σ̂) = d(σ, σ̂) − 1. For any formula F, and any σ and σ' such that σ' is one step closer to σ̂, σ → σ' is a good direction w.r.t. F if the number of clauses satisfied increases by changing from σ to σ'. By Chernoff bounds, the following is proved:

Lemma 3.2. Let σ and σ' be any two assignments such that σ is good and σ' is one step closer to σ̂. Then for some constant c2 > 0, we have Pr_F{ σ → σ' is a bad direction | F(σ̂) = 1 } ≤ 2e^{−c2 m/n}.

So, the probability that σ, starting from a good assignment, will not discover σ̂ is at most n2^n e^{−c2 m/n}, since there are at most n2^{n−1} possible flippings (the number of edges of the n-hypercube). Combining the two lemmas, they arrive at the following theorem:

Theorem 3.3 ([64]). If the input formula F of n variables and m = dn² clauses (for sufficiently large d) is given randomly from the set of satisfiable 3CNF formulas, then there exists a constant c > 0 such that the algorithm KP-LS finds a satisfying assignment with probability ≥ 1 − e^{−cn}.

3.1.2 Gu's Algorithm

Gu considered the average time complexity of a local-cum-exhaustive search algorithm for solving k-SAT over random satisfiable k-CNF formulas. Let X = {x1, ..., xn}. A randomly generated k-CNF F is a CNF formula with independently generated clauses: in each clause, pick k variables from X uniformly at random, then negate each of them with probability 1/2. In their algorithm, each variable is flipped in sequence if the number of satisfied clauses is increased.

procedure Gu-LS:
    for t = 1 to poly(n) do
        generate an assignment σ randomly;
        for i = 1 to n do
            if σ satisfies F output σ;
            let σ' be σ with variable i flipped;
            if σ' satisfies more clauses then set σ to σ';
        endfor
    endfor
    exhaustively find a satisfying assignment and output it.


Basically, a polynomial number (poly(n)) of independent local search iterations is performed until a satisfying assignment is found or the time limit has been reached. In the latter case, an exhaustive search is carried out. The time complexity of the outer loop is O(poly(n)·mk), while the exhaustive search takes O(2^n) time. The key is in showing that with probability at least 1 − e^{−n}, the local search loop succeeds in finding a satisfying assignment (if one exists), and thus only very rarely is it necessary to invoke exhaustive search. This is shown in the following lemma:

Lemma 3.4. Let F be a random satisfiable k-CNF. The probability that the objective function is reduced to 0 in one local search iteration is bounded below by an explicit function of k, m and n.

From the lemma, with k ≤ log n − log log n − c and m/n < 2^{k−2}/k, they derive that the probability that local search finds a solution within poly(n) iterations is at least 1 − e^{−n}. Thus, the average time complexity of Gu-LS is

(1 − e^{−n})·O(poly(n)·mk) + e^{−n}·O(2^n) = O(poly(n)·m),

which proves the following:

Theorem 3.5. Let F be a random satisfiable k-CNF formula with n variables and m clauses. The average time complexity of Gu-LS is O(poly(n)·m) for k ≤ log n − log log n − c and m/n < 2^{k−2}/k, where c is a constant.

The paper claims that the same performance holds for unsatisfiable formulas. However, not enough evidence is provided to support that claim.

3.1.3 Selman, Levesque and Mitchell's GSAT Algorithm

Selman, Levesque and Mitchell [99] gave the following practical procedure, GSAT, for solving the SAT problem in general. They also successfully applied it to solve other related problems such as graph-coloring and the N-queens problem. The idea is again that of iterative local search. The distinguishing feature is that each local search terminates after a pre-determined (MAXFLIPS) number of flips.

procedure GSAT:
    for t = 1 to T do
        generate an assignment σ randomly;
        for j = 1 to MAXFLIPS do
            if σ satisfies F output σ;
            let i be a variable s.t. flipping i gives the largest increase in number of satisfied clauses;
            set σ to σ with i flipped;
        endfor
    endfor
    output fail.
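A sketch of GSAT in the same style (clauses as lists of signed integers; parameter names are ours), including the random tie-breaking and the sideways/downhill moves discussed in the text:

```python
import random

def gsat(clauses, n, max_flips, max_tries, rng=random.Random(0)):
    """A sketch of GSAT: repeated best-flip search with a flip budget.
    clauses: lists of nonzero ints (v means x_v true, -v means x_v false)."""
    def num_sat(a):
        return sum(any((lit > 0) == a[abs(lit)] for lit in c) for c in clauses)
    for _ in range(max_tries):
        assign = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(max_flips):
            if num_sat(assign) == len(clauses):
                return assign
            # score every possible flip; sideways and downhill moves are
            # allowed, and ties between equally good flips are broken at random
            scores = []
            for v in range(1, n + 1):
                assign[v] = not assign[v]
                scores.append((num_sat(assign), v))
                assign[v] = not assign[v]
            best = max(s for s, _ in scores)
            pick = rng.choice([v for s, v in scores if s == best])
            assign[pick] = not assign[pick]
    return None
```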

The value of MAXFLIPS is about ten times the number of variables, while the value of T depends on the amount of time one wants to spend in looking for a solution before giving


up. Given a choice between equally good flips, GSAT picks one at random. If no flip increases the objective value, then a variable is flipped which does not change the value or (failing that) decreases the value least. In other words, the local search allows sideways and downhill moves. In fact, it has been shown that GSAT's performance degrades greatly without such moves. There is no theoretical analysis given in the paper, but several papers have been written to evaluate GSAT experimentally ([40, 98]). For example, for "hard" random formulas (i.e. 3CNF formulas generated using a uniform distribution such that the number of clauses is approximately 4.3 times the number of variables), GSAT is able to solve instances with several hundred variables within an hour while the standard Davis-Putnam procedure fails.

Graph Coloring Problem. It is well-known that dense random k-colorable (fixed k) graphs are easy to color, but sparse ones are not (e.g. [104]). This prompted Blum and Spencer [18] to ask if one can design an efficient coloring algorithm that works with high probability when the edge probability is at least polylog(n)/n. Recently, Alon and Kahale [4] proved that random 3-colorable graphs can be colored efficiently with high probability when the edge probability is at least polylog(n)/n. In another development, Biro, Hujter and Tuza [17] considered the complexity of graph-coloring given that some nodes are precolored such that the precoloring can be extended to a full coloring using the minimum number of colors. They showed that the problem is polynomially solvable on interval graphs when every color is used at most once in the precoloring, and becomes NP-hard when the colors can be used twice.

Constraint Satisfaction Problem. Minton et al. [82] considered CSP instances on regular graphs. Under the assumption that instances have unique solutions and constraints are consistent with a fixed probability, they proved that local search performs well when the Hamming distance between the initial assignment and the solution is small. The Hamming distance refers to the number of variables with different assigned values. Part of our contributions (below) is in the same vein as their analysis.

3.2 Our Contributions

Let R-CSP denote a set of distributions on satisfiable CSP instances. An element of R-CSP is a specific distribution parameterized by the edge probability 0 ≤ q ≤ 1 and the consistency probability 0 ≤ λ ≤ 1. A CSP instance of n variables is drawn from this distribution by the following procedure:

1. Generate a random graph G of n nodes such that an edge (i.e. constraint) exists between any two variables with probability q.

2. Generate a random assignment σ̂. For each edge (i1, i2) in G, construct the constraint relation as follows. Insert the value pair (σ̂_{i1}, σ̂_{i2}) with probability 1 and each of the other k² − 1 pairs with probability λ.

This generation procedure has been used by others and an online implementation can be found in [105]. Let m denote the number of edges in G. Let p = 1 − λ be the inconsistency probability. For brevity, an instance generated by the above method will be called an R-CSP instance. Clearly, every R-CSP instance is satisfiable since it has at least one solution σ̂. Furthermore, it has the following property:

Fact 1. For all constraints R(i1, i2) and all value pairs (j1, j2),

Pr[(j1, j2) ∈ R(i1, i2)] = 1, if j1 = σ̂_{i1} and j2 = σ̂_{i2}; λ, otherwise.
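The two-step generation procedure above can be sketched as follows (function and variable names are ours; a constraint is stored as the set of allowed value pairs of an edge (i1, i2) with i1 < i2):

```python
import random

def gen_rcsp(n, k, q, lam, rng=random.Random(0)):
    """Sample an R-CSP instance. Returns (hidden solution, constraints)."""
    hidden = [rng.randrange(k) for _ in range(n)]        # the solution sigma-hat
    constraints = {}
    for i1 in range(n):
        for i2 in range(i1 + 1, n):
            if rng.random() < q:                          # edge with probability q
                allowed = {(hidden[i1], hidden[i2])}      # solution pair: prob 1
                for j1 in range(k):
                    for j2 in range(k):
                        if (j1, j2) != (hidden[i1], hidden[i2]) and rng.random() < lam:
                            allowed.add((j1, j2))         # other pairs: prob lambda
                constraints[(i1, i2)] = allowed
    return hidden, constraints

def satisfies(assign, constraints):
    return all((assign[i1], assign[i2]) in allowed
               for (i1, i2), allowed in constraints.items())
```

By construction, the hidden assignment always satisfies every generated instance.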


The expected probability that (j1, j2) is a consistent pair is

α = (1 − 1/k²)·λ + 1/k².

We ask the following questions:

1. Suppose we have an initial assignment which is sufficiently close to an arbitrary solution; how likely is local search to converge to a solution?

2. Suppose we apply local search iteratively for a polynomial number of times; what proportion of constraints can we satisfy simultaneously with high probability?

Our contributions are summarized as follows.

1. If q = Ω(log n/((1 − λ)n)) and λ ≤ (ln k)/k, then almost every instance has only one solution. Furthermore, if the initial assignment is 'good', i.e. the Hamming distance between the initial assignment and the solution is less than half the number of variables, then our local search algorithm finds the solution with high probability. Our analysis is a refinement of [82] and an improvement of their result. We also provide a simple method to obtain a good initial assignment with high probability.

2. For q = O(log n/n), an assignment which satisfies at least a fraction min(α + 1.17√(α(1 − α)), 1) of the set of constraints can be obtained with high probability after a polynomial number of local search iterations. This fraction is significantly higher than α, which a random assignment is expected to satisfy. This leads us to conclude that for q = O(log n/n) and 0.43 ≤ λ ≤ 1, a solution can be found with high probability in polynomial time.

Our analysis relies heavily on the use of two forms of Chernoff bounds (see, for example, [49]), given as follows:

Proposition 3.6. Let X1, ..., Xn be a sequence of n independent Bernoulli trials (coin flips), each with probability of success E[Xi] = p. Let X = X1 + ··· + Xn be a random variable indicating the number of successes, so E[X] = E[X1] + ··· + E[Xn] = np. Then for 0 ≤ δ ≤ 1:

Pr[X ≥ (1 + δ)np] ≤ e^{−δ²pn/3}
Pr[X ≤ (1 − δ)np] ≤ e^{−δ²pn/2}

In this chapter, the phrase 'almost surely' means: with probability approaching 1 when n is sufficiently large.
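As a quick numerical sanity check of these two bounds (the parameters below are chosen arbitrarily), the exact binomial tails can be compared against them:

```python
import math

def binom_pmf(n, p, i):
    return math.comb(n, i) * p**i * (1 - p)**(n - i)

def upper_tail(n, p, t):   # Pr[X >= t] for X ~ B(n, p)
    return sum(binom_pmf(n, p, i) for i in range(t, n + 1))

def lower_tail(n, p, t):   # Pr[X <= t] for X ~ B(n, p)
    return sum(binom_pmf(n, p, i) for i in range(0, t + 1))

n, p, delta = 100, 0.5, 0.2
mean = n * p
assert upper_tail(n, p, math.ceil((1 + delta) * mean)) <= math.exp(-delta**2 * mean / 3)
assert lower_tail(n, p, math.floor((1 - delta) * mean)) <= math.exp(-delta**2 * mean / 2)
```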

3.2.1 Local Search Algorithm LS1

Given an assignment σ, let Δ(σ, i, j) denote the assignment obtained by replacing the assigned value of variable i (i.e. σ_i) with another value j. The logic of our local search algorithm LS1 is simple. Begin with a random assignment and iterate until we reach a solution or all variables have been reset. Let σ denote the current assignment and S denote the set of variables which have not been reset. Define a potential function c : S × K → N by

c(i, j) = #{v ∈ V − {i} | (j, σ_v) ∉ R(i, v)}.

That is, c(i, j) counts the number of variables which conflict with the variable i under the assignment Δ(σ, i, j). Among all variables in S, the algorithm selects a variable-value pair (i, j)


such that the potential function is minimized and resets σ_i to the value j. LS1 is codified as follows:

procedure LS1:
    set S = V;
    generate a random initial assignment σ;
    while S ≠ ∅ and σ is not a solution do
        select (i, j) such that i ∈ S, j ≠ σ_i and c(i, j) is minimized;
        set σ = Δ(σ, i, j) and S = S − {i};
    endwhile
    output σ;
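A runnable sketch of LS1 (names are ours; constraints are given as a map from an edge (i1, i2), i1 < i2, to its set of allowed value pairs):

```python
import random

def ls1(n, k, constraints, rng=random.Random(0)):
    """A sketch of LS1 with the min-conflict heuristic: repeatedly reset the
    not-yet-reset variable-value pair of minimum potential c(i, j)."""
    def satisfied(a):
        return all((a[i1], a[i2]) in allowed
                   for (i1, i2), allowed in constraints.items())
    def conflicts(a, i, j):
        # the potential c(i, j): neighbours conflicting with i taking value j
        cnt = 0
        for (x, y), allowed in constraints.items():
            if x == i and (j, a[y]) not in allowed:
                cnt += 1
            elif y == i and (a[x], j) not in allowed:
                cnt += 1
        return cnt
    assign = [rng.randrange(k) for _ in range(n)]
    unreset = set(range(n))                  # S: variables not yet reset
    while unreset and not satisfied(assign):
        i, j = min(((i, j) for i in unreset for j in range(k) if j != assign[i]),
                   key=lambda pair: conflicts(assign, pair[0], pair[1]))
        assign[i] = j                         # each variable is reset at most once
        unreset.discard(i)
    return assign
```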

LS1 is similar to Minton et al.'s algorithm [82] with the following exceptions:

1. Instead of arbitrarily picking a variable in conflict, we pick a variable-value pair which has the minimum number of conflicts, ties broken arbitrarily. We call this the min-conflict heuristic.

2. The algorithm terminates after O(n) iterations, since we do not allow variables to be reset more than once.

Hence, the worst-case time complexity of LS1 in terms of the number of comparisons is O(n²k). We proceed to analyze the performance of LS1. It is similar in flavour to the analysis given by Minton et al. [82].

Lemma 3.7. Let x be an instance of R-CSP with n variables, domain size k, edge probability q ≥ log n/n, and consistency probability 0 ≤ λ ≤ (ln k)/k. Then x has one solution almost surely.

Proof. The instance x has at least one solution, σ̂, since it is satisfiable. Consider any assignment σ ≠ σ̂. The probability that σ is a solution is α^m (see Fact 1) because it must satisfy all constraints. Thus, the probability that there exists a solution other than σ̂ is no more than (k^n − 1)α^m ≤ k^n α^m. Since q ≥ log n/n, m is expected to be at least (n/2)(log n − 1). Substituting α and the bounds for m and λ, we get

k^n α^m ≤ k^n ((1 − 1/k²)(ln k)/k + 1/k²)^{(n/2)(log n − 1)},

which tends to 0 for sufficiently large n, since the base of the second factor is a constant smaller than 1 while its exponent grows faster than n. □

From now on, consider an R-CSP instance which contains exactly one solution σ̂. Observe that for each variable, only one value is correct while the other k − 1 values are wrong. A variable-value pair (i, j) is correct if j is the correct value for i. Our aim is to show that, if the initial assignment is 'good', then LS1 will pick a correct (i, j) pair almost surely, thus bringing us one step closer to the solution. Let D denote the Hamming distance between the initial assignment and σ̂. Let the local search iterations be numbered D, D − 1, ..., 1. Define a binomial random variable X_d[i, j] for the value of the potential function c(i, j) at iteration d, given that we have chosen correct pairs from iterations D to d + 1. Let B(N, γ) denote the number of successes in N independent Bernoulli trials when the probability of success in each trial is γ.

Lemma 3.8. For all i ∈ S and j ∈ K, if (i, j) is correct then X_d[i, j] = B(d, qp), else X_d[i, j] = B(n − 1, qp).


Proof. Let σ be the current assignment at the beginning of iteration d. In σ, d variables have wrong values. If (i, j) is a correct pair, only these d variables can conflict with i under the assignment Δ(σ, i, j). Conversely, if (i, j) is wrong, then all n − 1 variables can potentially conflict with i. Any variable i' will conflict with i iff there is an edge between i and i' and the value pair (j, σ_{i'}) is inconsistent. The probability that an edge exists between i and i' is q. By Fact 1, the probability that (j, σ_{i'}) is inconsistent is p. □

Let (i_min, j_min) denote the variable-value pair chosen by the min-conflict heuristic at iteration d. Let N = ⌈(n − 1)/2⌉. Let E_d denote the event that (i_min, j_min) is a wrong pair.

Lemma 3.9. For all d ≤ N, Pr[E_d] ≤ (2e^{−c(n−1)qp})^d, for some constant c.

Proof. Since (i_min, j_min) is wrong, by the min-conflict heuristic, there exist d correct pairs examined at iteration d whose potential function values are no less than that of (i_min, j_min) (otherwise, one of those pairs would have been picked instead of the wrong pair). Therefore, the probability that E_d occurs is at most the probability that X_d[i_min, j_min] ≤ X_d[i, j] for all d correct pairs (i, j). Since the d + 1 random variables are pairwise independent,

Pr[E_d] ≤ (Pr[X_d[i_min, j_min] ≤ X_d[i, j]])^d
       ≤ (Pr[B(n − 1, qp) ≤ B(d, qp)])^d   (by Lemma 3.8)
       ≤ (Pr[B(n − 1, qp) ≤ (3/4)(n − 1)qp] + Pr[B(d, qp) ≥ (3/4)(n − 1)qp])^d
       ≤ (2e^{−c(n−1)qp})^d   (by Proposition 3.6)

□

Putting Lemmas 3.7 and 3.9 together, we get the following result:

Theorem 3.10. Let x be an instance of R-CSP with n variables, domain size k, consistency probability 0 ≤ λ ≤ (ln k)/k and edge probability q ≥ c0 log n/((1 − λ)n) for some constant c0. Then LS1 returns a solution for x almost surely in O(n²k) time if the initial assignment is good.

Proof. By Lemma 3.7, x has exactly one solution almost surely. Let D ≤ N be the Hamming distance between the initial assignment and the solution. The probability that LS1 fails to return the solution is the probability that at least one event E_d occurs along the way, for some 2 ≤ d ≤ D. Notice that E_1 cannot occur because the algorithm always correctly resets the last wrongly-set variable. Thus,

Pr[LS1 fails] = Pr[E_D] + Pr[E_{D−1} | ¬E_D] + ··· + Pr[E_2 | ∧_{d=3}^{D} ¬E_d]
             ≤ Pr[E_D] + Pr[E_{D−1}]/(1 − Pr[E_D]) + ··· + Pr[E_2]/(1 − Σ_{t=3}^{D} Pr[E_t])
             ≤ Σ_{d=2}^{D} e_d/(1 − Σ_{t=d+1}^{D} e_t)   (by Lemma 3.9)

where e_d is an upper bound on Pr[E_d]. In particular, by an appropriate choice of the constant c0 for q, e_2 = O(1/n). Now consider the sequence E = (e_2, ..., e_N). By geometric progression, this sequence has the interesting property that

Σ_{t=3}^{N} e_t ≤ e_2,

from which it follows that

e_d/(1 − Σ_{t=d+1}^{D} e_t) ≤ e_d/(1 − e_2),   ∀ 2 ≤ d ≤ D.


Therefore,

Pr[LS1 fails] ≤ (1/(1 − e_2)) Σ_{d=2}^{D} e_d ≤ 2e_2/(1 − e_2),

which tends to 0 for sufficiently large n. □

Finally, we address the problem of how to generate an initial assignment which is 'good'. Unfortunately, such an assignment is no longer a random assignment, and thus the above analysis does not apply directly. Consider the following procedure to decide on a value for each variable i. Call a value j clearly impossible if there is some variable i' adjacent to i such that no value pair (j, j') is in R(i, i') for any j'. That is, i cannot possibly take value j because it never leads to a solution. The procedure is simply to randomly pick a value for i that is not clearly impossible. Let m_i denote the number of constraints incident on i. The expected probability that a value j is not clearly impossible is 1 if j = σ̂_i, and γ = (1 − (1 − α)^k)^{m_i} otherwise. Substituting the bounds on m_i and α,

γ ≤ (1 − (1 − (ln k)/k)^k)^{log n} ≤ (1 − 2/(3k))^{log n}.

For sufficiently large n, we have γ < 1/(k − 1). Hence, the expected number of values that are not clearly impossible for i is 1 + (k − 1)γ < 2, implying that the probability of setting a variable wrongly is less than 1/2. Define a random variable Z which counts the number of variables correctly set by our procedure. Clearly, Z = B(n, 1/2 + ε) for some ε > 0. By the Chernoff bound, the initial assignment is 'good' almost surely.

1 . Hence, the expected number of values that is not For suciently large n, we have < k01 clearly impossible for i is 1 + (k 0 1) < 2, implying that the probability of setting a variable wrongly is less than 12 . De ne a random variable Z which counts the number of variables correctly set by our procedure. Clearly, Z = B (n; 21 + ) for some  > 0. By Cherno bound, the initial assignment is 'good' almost surely. 3.2.2

Iterative Local Search Algorithm LS2

The previous section shows that LS1 performs well when  is reasonably small. When  becomes large, our analysis of LS1 collapses because we can no longer assume that instances have unique solutions. Experimentally, it is also known that hill-climbing local search algorithm performs poorly for certain R-CSP instances [117]. In this section, we turn our attention to iterative local search, i.e. performing local search iteratively with independent random initial assignments. We will analyse its performance. We consider the following iterative local search algorithm LS2. Let LS denote any local search which begins with a random assignment and performs hill-climbing (i.e. no downhill moves) to increase the number of satis ed constraints until it arrives at a local optimum. Note that LS1 is not an example of LS. Execute LS for a maximum of T = O(nk+1 ln n) times, where n is the number of variables and k is the domain size. procedure LS2: for t = 1 to T do call LS to obtain a local optimal assignment if  is -near exit; endfor output the best local optimal assignment;

;
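A sketch of LS2 with a simple steepest-ascent routine standing in for LS (representation and names are ours; β-near means at least β·m satisfied constraints):

```python
import random

def steepest_ascent(assign, n, k, constraints):
    """Hill-climbing LS: keep reassigning single variables while the number of
    satisfied constraints strictly increases (no downhill moves)."""
    def sat_count(a):
        return sum((a[i1], a[i2]) in allowed
                   for (i1, i2), allowed in constraints.items())
    best = sat_count(assign)
    improved = True
    while improved:                      # stop at a local optimum
        improved = False
        for i in range(n):
            keep = assign[i]
            for j in range(k):
                assign[i] = j
                s = sat_count(assign)
                if s > best:
                    best, keep, improved = s, j, True
            assign[i] = keep
    return assign, best

def ls2(n, k, constraints, beta, T, rng=random.Random(0)):
    """Iterate LS from independent random starts; stop early at a beta-near
    assignment (one satisfying at least beta * m constraints)."""
    m = len(constraints)
    best_assign, best_sat = None, -1
    for _ in range(T):
        assign, s = steepest_ascent([rng.randrange(k) for _ in range(n)],
                                    n, k, constraints)
        if s > best_sat:
            best_assign, best_sat = assign, s
        if s >= beta * m:
            break
    return best_assign, best_sat
```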

Consider an R-CSP instance generated with edge probability q and consistency probability λ. We say that an assignment is β-near iff it satisfies at least βm constraints (recall that m is the total number of constraints). Let σ* be the output computed by LS2. Given 0 ≤ β ≤ 1, what is the


probability that σ* is β-near? Clearly, if β ≤ α (recall that α is the expected consistency probability), the probability is close to 1, since a random assignment would satisfy αm constraints. If β > α, however, it turns out that the probability is still close to 1 for a significantly large range of β. The reason (which we will show) is that the assignment obtained by one local search iteration is β-near with not-so-small probability. Hence, with enough independent random initial assignments, we can expect to find a β-near assignment almost surely. We need the following theorem from analytic inequalities, which lower-bounds the tail of a standard normal distribution:

Proposition 3.11 ([83]). For all x > 0,

∫_x^∞ e^{−z²/2} dz > (1/2)(√(x² + 4) − x)·e^{−x²/2}.
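The inequality is easy to confirm numerically at a few sample points, using the identity ∫_x^∞ e^{−z²/2} dz = √(π/2)·erfc(x/√2):

```python
import math

def tail(x):
    # integral over [x, inf) of e^{-z^2/2} dz, via the complementary error function
    return math.sqrt(math.pi / 2) * math.erfc(x / math.sqrt(2))

def lower_bound(x):
    return 0.5 * (math.sqrt(x * x + 4) - x) * math.exp(-x * x / 2)

for x in (0.1, 0.5, 1.0, 2.0, 3.0):
    assert tail(x) > lower_bound(x)
```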

Define τ(β) to be the probability that one invocation of LS returns a β-near assignment. Let

c0 = (β − α)/√(α(1 − α))   and   a0 = (1/2)(√(c0²qnk + 4) − c0√(qnk)).

Lemma 3.12. For sufficiently large n, τ(β) ≥ (a0/√(2π))·e^{−c0²qnk/2}.

Proof. Let σ be the assignment obtained by LS. The probability that σ is not β-near is the probability that a randomly picked variable i is incident on fewer than βm_i satisfied constraints (m_i is the degree of i). For counting purposes only, re-label all domain values so that each domain has a distinct set of values. Since σ is locally optimal, the total number of consistent value pairs occurring in the incident constraints, which is a binomial random variable B(m_i k, α), must be less than βm_i k. Therefore,

1 − τ(β) = Pr[B(m_i k, α) < βm_i k].

Since the instance is generated with parameters q and λ ≤ α, this implies that

τ(β) ≥ Pr[B(qnk, α) ≥ βqnk].

By the Central Limit Theorem, B(qnk, α) can be approximated closely by a Gaussian random variable Z with mean αqnk and variance α(1 − α)qnk. Thus,

τ(β) ≥ Pr[Z ≥ βqnk] = (1/√(2π)) ∫_{c0√(qnk)}^∞ e^{−z²/2} dz,

and the lemma follows by Proposition 3.11. □

Lemma 3.13. For sufficiently large n, if q = O(log n/n) and c0 ≤ √(2 ln 2), then τ(β) ≥ (1/√(2π))·n^{−(k+1)}.

Proof. For sufficiently large n,

a0 = (1/2)(√(c0²qnk + 4) − c0√(qnk)) ≥ n^{−1}.

Since q = O(log n/n) and c0 ≤ √(2 ln 2), by Lemma 3.12,

τ(β) ≥ (a0/√(2π))·e^{−c0²qnk/2} ≥ (1/√(2π))·n^{−1}·2^{−(log n)·c0²k/2} ≥ (1/√(2π))·n^{−(k+1)}.

□


Theorem 3.14. Let x be an instance of R-CSP with n variables, fixed domain size k, edge probability O(log n/n), and consistency probability λ. For 0 ≤ β ≤ min(α + 1.17√(α(1 − α)), 1), LS2 almost surely returns a β-near assignment after O(n^{k+1} ln n) iterations.

Proof. By Lemma 3.13, after cn^{k+1} ln n (for a suitable constant c) independent local search iterations, the probability of finding a β-near assignment is at least

1 − (1 − (1/√(2π))·n^{−(k+1)})^{cn^{k+1} ln n} ≥ 1 − e^{−ln n} = 1 − 1/n,

which tends to 1 for sufficiently large n. □

While the above theorem gives a good approximation result, our main concern is to find an exact solution. Setting the upper bound of β to be 1, we get:

Corollary 3.15. For 0.43 ≤ λ ≤ 1 and edge probability O(log n/n), LS2 almost surely returns a solution of an R-CSP instance in polynomial time.

Links to Phase Transition. Corollary 3.15 establishes a range of the consistency probability over which iterative local search is effective. It has been shown in recent works on phase transition (e.g. [92, 100]) that, for fixed n, k and q, there exists a critical value of λ around which a backtracking-based algorithm will take much search effort. This occurs even when restricted to satisfiable instances. Our result shows that if the critical value falls within our range, then it may be worthwhile to consider local search. For example, for the case of n = 20, k = 10 and q = 0.3, it was reported in [92] that the critical value of λ exceeds 0.43. Hence, by our result, iterative local search returns a solution with high probability. In fact, as our experiments (below) show, the number of iterations taken to return a solution is typically very small.

3.2.3 Experimental Results

In this section, we give experimental results. For all cases, we fix n = 200 and k = 10. Recall that q is the edge probability and λ is the consistency probability. These experiments show that, even for small n, the performance of LS1 and LS2 is consistent with Theorems 3.10 and 3.14 on generated R-CSP instances.

Performance of LS1. Preliminary experiments show that, for edge probability log n/((1 − λ)n), LS1 always returns a solution, even if the initial assignment is not 'good'. We then experiment on and report harder (i.e. sparser) graphs which have edge probabilities log n/(2(1 − λ)n) and log n/n. For each consistency probability λ = 0.1 to 1, we generate 100 R-CSP instances. We compare the probabilities of failure of LS1 under initial assignments of different Hamming distances from the solution. Assignments of different Hamming distances are obtained simply by perturbing the values of the solution randomly. Figure 3.1 illustrates the outcome of the experiments: it plots the probability that LS1 fails against the consistency probability, for a random initial assignment and for initial assignments at Hamming distances n/4, log n and (n − 1)/2 from the solution. From the graphs, we observe that if the initial assignment is generated randomly, then the performance of LS1 is bad for a significantly large range of λ. On the contrary, if the initial assignment is close to a solution, LS1 almost surely returns a solution.

Figure 3.1: Experimental performance of LS1 for q = log n/(2(1 − λ)n) and q = log n/n.

Performance of LS2. We next investigate the performance of LS2. The local search routine we used is the simple steepest ascent procedure. Since the outcome for the boundary case (λ = 0.43 and q = log n/n) is somewhat trivial, we also report harder instances with λ = 0.36 and q = 4/n, which were reported to be the hardest CSP instances for local search in [117]. For each β = 0.50 to 1.00, we generate 100 R-CSP instances and obtain the percentage of successful instances (i.e. instances for which LS2 returns a β-near assignment in less than 100 iterations) and the mean number of iterations over the successful instances. The outcome is presented in Table 3.1. We observe that it is easy to get locally optimal assignments which are within 95% optimal. However, this is not an encouraging observation, because in hard instances of CSPs it is precisely the presence of many locally optimal assignments that often fools a local search procedure into moving towards local optimality. Beyond that limit, it becomes increasingly hard for LS2 to obtain an optimal solution (satisfying all constraints) on instances with low edge and consistency probabilities.

      β    (i) % successful   (i) # iterations   (ii) % successful   (ii) # iterations
    0.50        100                 1.00               100                  1.00
    0.70        100                 1.00               100                  1.00
    0.90        100                 1.05               100                  1.14
    0.95        100                 2.08               100                  6.11
    0.98        100                 2.11                72                 33.00
    0.99        100                 2.11                38                 48.84
    1.00        100                 2.19                 0                   --

Table 3.1: Experimental performance of LS2 for (i) q = log n/n, λ = 0.43, (ii) q = 4/n, λ = 0.36.


Chapter 4

Local Search Approximation Algorithms

In this chapter, we consider the optimization versions of CSP and present worst-case analyses of local search algorithms. Local search is a common technique for deriving efficient approximation algorithms for optimization problems. A good survey of work done in this area can be found in [115]. We will first discuss known results in the approximation of related problems. Our contribution is in obtaining tight and improved approximation bounds for MAX-CSP, W-CSP and other related problems by local search.

4.1 Related Work

In this section, we will review key complexity and algorithmic results for several NP optimization problems related to CSP.

4.1.1 MAX CUT and MAX SAT

NP-hardness. MAX CUT is NP-hard even if the graph is unweighted. However, by matching theory, Hadlock [48] showed that for (weighted) planar graphs it can be solved in polynomial time. Recently, Barahona [11] tightened the result to show that for all graphs which are not contractible to a 5-clique¹, MAX CUT can be formulated as a polynomial-sized linear program.

Approximation. A key result in MAX CUT and MAX SAT approximation is the following:

Theorem 4.1 ([88]). Unweighted MAX CUT and MAX 2-SAT are MAX SNP-complete.

Corollary 4.2. Arc-consistent MAX-CSP(2) is MAX SNP-complete.

Proof. The hardness can be shown by a simple reduction from unweighted MAX CUT, which is known to be MAX SNP-complete [88]. Let graph G be an arbitrary instance of unweighted MAX CUT. Construct an arc-consistent instance of MAX-CSP(2) with the same constraint graph G such that all constraints are NOT-EQUAL constraints. Clearly, the reduction is cost-preserving. Furthermore, MAX-CSP(2) is in the MAX SNP class. □

¹ A graph is said to be contractible to a graph H if H can be obtained by a sequence of operations in which a pair of adjacent vertices is merged and all other adjacencies between vertices are preserved, and multiple edges are replaced by single edges.


This means that unless P=NP, there exists a positive constant c such that MAX-CSP(2) cannot be approximated within c. In fact, this negative result can be tightened in several ways. Petrank [89] showed that unweighted MAX k-CUT is MAX SNP-hard even for k-colorable graphs. This means that the class of satisfiable instances of arc-consistent MAX-CSP is also MAX SNP-hard.

Seeking approximation algorithms for MAX CUT has a long history. The first algorithm is due to Sahni and Gonzales [95]. Their algorithm repeatedly chooses, for each node i, the partition that maximizes the weight of the cut of nodes 1 to i. This achieves an approximation ratio of 1/2. A local search which moves a node from one partition to the other whenever this increases the cut achieves the same ratio. Since 1976, several approximation results have been obtained. For the unweighted case, absolute ratios of 1/2 + 1/(2n) [50] and 1/2 + (n−1)/(4m) [91] are known. For the weighted case, a ratio of (k−1)/k [109] has been obtained for MAX k-CUT. Recently, Goemans and Williamson surprised many by presenting a randomized algorithm based on semidefinite programming relaxation which achieves a ratio of 0.878... Their algorithm can be derandomized with a slight loss in the performance guarantee. This line of research will be discussed in Chapter 5.

The approximation of MAX SAT dates back to 1974, when Johnson [55] presented a (1 − 1/2^k)-approximation algorithm for instances in which each clause has at least k literals. His algorithm repeatedly chooses the truth value for each variable that maximizes a certain linear function of the yet-unsatisfied clauses. Johnson's algorithm has a probabilistic interpretation which lends itself to improvement by randomized techniques. Yannakakis [116] recently exploited this interpretation and improved the ratio to 3/4. This and further improvements via randomization will be discussed in Chapter 5.
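The flip-based local search for MAX CUT just mentioned can be sketched as follows. This is a minimal illustration under our own encoding (node count plus a list of edge pairs), not code from the cited references:

```python
import random

def maxcut_flip_ls(n, edges, seed=0):
    """Local search for unweighted MAX CUT: move a single node across
    the partition whenever this increases the cut. At a local optimum,
    every node has at least half of its incident edges cut, so the cut
    contains at least half of all edges."""
    rng = random.Random(seed)
    side = [rng.randrange(2) for _ in range(n)]
    improved = True
    while improved:
        improved = False
        for v in range(n):
            # edges kept inside vs. across the partition at node v
            same = sum(1 for (a, b) in edges
                       if v in (a, b) and side[a] == side[b])
            cut = sum(1 for (a, b) in edges
                      if v in (a, b) and side[a] != side[b])
            if same > cut:          # flipping v strictly increases the cut
                side[v] = 1 - side[v]
                improved = True
    return side
```

Summing the local-optimality condition over all nodes gives 2·(cut edges) ≥ 2·(uncut edges), which is exactly the ratio-1/2 argument referred to in the text.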
Local Optimality

A well-known result about the hardness of finding a locally optimal solution is given by Schäffer and Yannakakis:

Theorem 4.3. ([97]) MAX CUT is PLS-complete.

The FLIP neighborhood of a cut consists of the cuts obtained by moving a single node from one partition to the other. This theorem means that there are MAX CUT instances for which local search under the FLIP neighborhood requires exponential time, even to find a locally optimal solution. Nevertheless, when restricted to cubic graphs, Poljak [90] shows that local search based on flipping finds a local optimum in O(n²) steps. Since we can express every MAX CUT instance as an arc-consistent W-CSP(2) instance, it follows that there are arc-consistent W-CSP(2) instances for which local search under the FLIP neighborhood requires exponential time to find a local optimum.

4.1.2 MAX-CSP and W-CSP

Due to the lack of nice structure in the problem, surprisingly little is known about the approximation of CSP in general, and the known results are weak. We give a summary of known results and then discuss the local search-based algorithms in detail in the following sections:

1. Freuder and Wallace [35, 36] gave the first formal definition of MAX-CSP (which they call Partial CSP or PCSP). They considered the case where the underlying constraint network is a tree and showed that it can be solved in O(nk²) time in two phases. First, a reverse breadth-first search of the tree beginning at the leaves establishes a set of assignable values for each node such that the subtree rooted at that node has an optimal assignment. By the principle of optimality (an optimal assignment is also optimal on all subtrees), a forward breadth-first search can then be used to obtain an optimal assignment. For the general PCSP, they proposed a general framework based on branch-and-bound and some enhancements [113, 112]. Heuristic algorithms which yield good near-optimal solutions have also been investigated, such as [114], the connectionist architecture GENET [103] and guided local search [110].

2. Berman and Schnitger [16], while studying the maximum independent set problem, considered a special case of MAX-CSP(2) where each constraint is a conjunction of at most t(n) literals, where t(n) ≤ log₂ n. They showed that unless MAX 2-SAT has a randomized PTAS, their CSP cannot be approximated within O(1/(1+ε)^{c·t(n)}), where c is a positive constant and 0 < ε < 1. By generalizing Johnson's algorithm, they obtained a (1/2^{t(n)})-approximation algorithm.

3. Alimonti [1] considered yet another special case of MAX-CSP(2), the maximum generalized satisfiability problem (MAX GSAT(B)), where each constraint is a disjunctive clause and each disjunct is a conjunction of at most B literals. They proved that MAX GSAT(B) cannot be approximated within a constant factor by local search with respect to the h-neighborhood for constant h. This means that MAX-CSP(2) cannot be approximated within a constant factor by constant-h-neighborhood local search. They then obtained a (1/2^B)-approximation by a local search algorithm which uses a relaxed notion of neighborhood and objective function (see details below).

4. Khanna et al. [59] considered a version of the constraint satisfaction problem which, by our definition, is equivalent to W-CSP(2) of fixed arity t. Their problem is a generalization of Berman and Schnitger's as well as Alimonti's problems. They showed that, for any fixed arity t, W-CSP(2) is approximable within a factor of 1/2^t by local search with a non-oblivious objective function (see details below). Thus they achieve a ratio of 1/4 for the case of W-CSP(2). The underlying technique is similar to Alimonti's.

5. Amaldi and Kann [6] considered the approximation of finding maximum feasible subsystems of linear systems (MAX FLS). Their problem may be seen as MAX-CSP over real domains. They showed that MAX FLS is MAX SNP-hard even if (1) the domains are 2-valued (specifically {−1, +1}); and (2) the constraints are binary and restricted to the form a₁x₁ + a₂x₂ {=, ≥} c with a₁, a₂ ∈ {−1, 0, 1}. They have thus identified a subclass of MAX-CSP(2) which is MAX SNP-hard. By a greedy algorithm, they demonstrated a ratio of 1/2 for MAX FLS when the constraints are inequalities, i.e. when there is no constraint of the form a₁x₁ + a₂x₂ = c.

6. Poljak and Turzik [91] showed that, given any disjoint 2-partition of the edges of a connected graph into E₁ and E₂ (E₁ ∪ E₂ = E), there is a 2-partition of the nodes such that the number of edges in E₁ between the partitions plus the number of edges in E₂ within the partitions is at least m/2 + (n−1)/4. Such a partition can be found in O(n³) time by repeatedly finding articulation nodes within the graph. This result can be used to approximate MAX CUT of connected graphs within 1/2 + (n−1)/(4m) in O(n³) time. It is also interesting because it can be used directly to approximate MAX-CSP(2,1) within an absolute ratio of 1/2 + (n−1)/(4m) in O(n³) time, as follows. Each 1-consistent constraint can be satisfied either when its end points are assigned different values or when they are assigned the same value; classify these constraints under E₁ and E₂ respectively. Now find a 2-partition of the nodes into V₁ and V₂ using the above algorithm, and assign all nodes in V₁ the value 1 and all nodes in V₂ the value 2. The number of satisfied edges under this assignment is at least m/2 + (n−1)/4.

7. Hochbaum et al. [54] showed that integer programs with only two variables per inequality can be approximated within 1/2.

4.1.3 Alimonti's Algorithm

Let the set of Boolean variables be Y = {y₁, ..., yₙ}. The hardness of approximating MAX GSAT(B) is given by the following result:

Theorem 4.4. ([1]) For each B > 1, for any constant h > 0 and for any constant 0 ≤ c ≤ 1, there exists an instance x of MAX GSAT(B) and an assignment T such that the number of clauses satisfied by T is less than c times the optimal value of x, and T is a local optimum with respect to the h-neighborhood.

Proof. Without loss of generality, let T be the truth assignment of all 1's. Let S be the set of all possible conjunctions of exactly B literals from the literal set {y₁, ..., y_q} (q to be determined later). Clearly, |S| = C(q, B) and every literal occurs exactly C(q−1, B−1) times in S, where C(·,·) denotes the binomial coefficient. Let S̄ be the set of conjunctions of exactly B literals of the form (r̄ ∧ s₁ ∧ ... ∧ s_{B−1}) such that r ∈ {y₁, ..., y_q}, each sᵢ ∈ {y_{q+1}, ..., y_n}, and all sᵢ's are distinct within and among the clauses; clearly |S̄| = ⌊(n−q)/(B−1)⌋. Now construct the instance x to have the set of clauses S ∪ S̄. Clearly, S is the set of clauses satisfied by T. To make a local improvement from T, we must flip at least one variable yᵢ, 1 ≤ i ≤ q, but that makes C(q−1, B−1) clauses in S unsatisfied, implying that we need to flip at least H = (C(q−1, B−1) + 1)(B−1) variables y_j, q+1 ≤ j ≤ n. Hence, to make T locally optimal with respect to the h-neighborhood, simply set H > h, which establishes a value for q. The next step is to bound |S| to be less than c times the optimal value. Knowing that the optimal value is at least 2^{−B} times the number of clauses (|S| + |S̄|), it suffices to make

    C(q, B) < c · 2^{−B} · [ C(q, B) + ⌊(n−q)/(B−1)⌋ ],

which lower-bounds n in terms of q, B and c. □

The above result says that standard local search is unable to guarantee a constant-factor approximation. By relaxing the neighborhood structure, Alimonti presented the following local search algorithm for approximating MAX GSAT(2) within 1/4. Let S₀, S₁ and S₂ denote the sets of clauses with 0, 1 and 2 false literals under the current assignment T; thus S₀ is the set of satisfied clauses. Let P(i,j) and N(i,j) denote the number of clauses in set S_j (0 ≤ j ≤ 2) containing the literals yᵢ and ȳᵢ respectively. Let m be the number of clauses of the given instance. Let the neighborhood of T be those assignments obtained by flipping either one or all values of T; call this neighborhood N_T. Consider the following algorithm:

procedure AlimontiLS:
    generate an assignment T;
    while T satisfies fewer than m/4 clauses do
        if 2|S₀| < |S₁| then
            find a variable yᵢ such that P(i,0) < N(i,1);
            flip the value of yᵢ in T;
        else
            flip all values in T;
    endwhile;
    output T;
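A runnable sketch of AlimontiLS, restricted for concreteness to instances whose clauses are single conjunctions of at most two literals (the function name and the encoding of literals as signed integers are ours; the general disjunction-of-conjunctions case follows the same counting):

```python
import random

def alimonti_ls(n, clauses, seed=0):
    """AlimontiLS sketch for conjunctive clauses: each clause is a tuple
    of <= 2 literals (v means y_v, -v means not y_v, variables 1..n) and
    is satisfied iff all its literals are true. Returns an assignment
    satisfying at least m/4 clauses."""
    rng = random.Random(seed)
    T = {v: rng.choice([False, True]) for v in range(1, n + 1)}
    m = len(clauses)

    def lit_true(lit):
        return T[abs(lit)] if lit > 0 else not T[abs(lit)]

    def false_count(cl):
        return sum(not lit_true(l) for l in cl)

    while True:
        S = [[cl for cl in clauses if false_count(cl) == j] for j in range(3)]
        if len(S[0]) * 4 >= m:          # T already satisfies >= m/4 clauses
            return T
        if 2 * len(S[0]) < len(S[1]):
            # some single flip must increase the number of satisfied clauses
            for v in range(1, n + 1):
                loss = sum(1 for cl in S[0] if any(abs(l) == v for l in cl))
                gain = sum(1 for cl in S[1]
                           if any(abs(l) == v and not lit_true(l) for l in cl))
                if gain > loss:
                    T[v] = not T[v]
                    break
        else:                            # |S0| < |S2|: flipping everything helps
            for v in T:
                T[v] = not T[v]
```

The single-flip branch always finds an improving variable because the clauses of S₀ are counted at most 2|S₀| times in total while each clause of S₁ contributes exactly once, mirroring the argument of Lemma 4.6 below.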

The approximation ratio is obtained directly from the following observation and lemma:

Observation 4.5. If |S₀| < m/4 then either 2|S₀| < |S₁| or |S₀| < |S₂|.


Lemma 4.6. If T is locally optimal w.r.t. N_T, then T satisfies at least m/4 clauses.

Proof. Suppose T satisfies fewer than m/4 clauses; assume w.l.o.g. that T is the all-ones assignment, so that the true literals are the positive ones. By Observation 4.5, one of the two cases must occur. In the first case, since each clause in S₀ contains at most two (true) literals while each clause in S₁ contains exactly one false literal, Σᵢ P(i,0) ≤ 2|S₀| < |S₁| = Σᵢ N(i,1); hence there exists a variable yᵢ with P(i,0) < N(i,1), and flipping yᵢ increases the number of satisfied clauses. In the second case, |S₀| < |S₂|, and flipping all values of T turns every clause of S₂ into a satisfied clause, which again increases the number of satisfied clauses. In either case T is not locally optimal w.r.t. N_T, a contradiction. □

For B > 2, Alimonti applied a weighted objective function to derive the same result. This approach is also adopted by Khanna et al. to approximate a more general problem, which we discuss now.

4.1.4 Khanna et al.'s Algorithm

The basic idea is to represent each constraint by a disjunction of conjunctions of at most t literals. Each conjunct has precisely one satisfying assignment, which represents one tuple of the relation. Each monomial is associated with a weight, namely the weight of the corresponding constraint. Given an assignment T, let S_i (1 ≤ i ≤ t) now denote the set of monomials with i true literals and let W(S_i) denote the sum of weights of the monomials in S_i. Then, the objective function is a weighted function of the form

    F = Σ_{i=1}^{t} c_i · W(S_i),  where  c_i − c_{i−1} = (1/t) · Σ_{j=0}^{t−i} C(t, j)  and  c₀ = 0,

with C(t, j) denoting the binomial coefficient. The following is claimed:

Theorem 4.7. ([59]) W-CSP(2) of fixed arity t can be approximated within 1/2^t using local search with the 1-neighborhood and objective function F.

Here, we prove the special case in which each constraint is a disjunction of 2 literals (i.e. MAX 2-SAT). The generalization to fixed arity can be found in their paper [59].

Theorem 4.8. ([59]) MAX 2-SAT can be approximated within 3/4 using local search with the 1-neighborhood and objective function F = (3/2)W(S₁) + 2W(S₂).

Proof. Without loss of generality, let T be the locally optimal truth assignment of all ones. Let P(i,j) and N(i,j) respectively denote the total weight of clauses in S_j containing the literal yᵢ and the literal ȳᵢ. By local optimality, each variable i must obey the following inequality (the change in F caused by flipping yᵢ is non-positive):

    −(1/2)P(i,2) − (3/2)P(i,1) + (1/2)N(i,1) + (3/2)N(i,0) ≤ 0.

Summing the inequalities over all i and observing that Σᵢ P(i,2) = 2W(S₂), Σᵢ N(i,0) = 2W(S₀) and Σᵢ P(i,1) = Σᵢ N(i,1) = W(S₁), we get the inequality

    W(S₂) + W(S₁) ≥ 3W(S₀),

implying that W(S₀) ≤ (1/4)(W(S₀) + W(S₁) + W(S₂)). Thus, this algorithm ensures an approximation ratio of 3/4. □
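The non-oblivious local search of Theorem 4.8 can be sketched as follows. The encoding is ours (clauses as weighted 2-literal disjunctions); the point is that we hill-climb on F = 1.5·W(S₁) + 2·W(S₂) rather than on the satisfied weight itself:

```python
import random

def nonoblivious_max2sat(n, clauses, seed=0):
    """Non-oblivious 1-flip local search for MAX 2-SAT.
    clauses: list of (lit_a, lit_b, weight); literal v > 0 means y_v,
    v < 0 means not y_v. Any local optimum of F satisfies at least
    3/4 of the total clause weight."""
    rng = random.Random(seed)
    T = [None] + [rng.choice([False, True]) for _ in range(n)]

    def true_lits(cl):
        a, b, _ = cl
        return sum((T[abs(l)] == (l > 0)) for l in (a, b))

    def F():
        w = [0.0, 0.0, 0.0]            # weight in S0, S1, S2
        for cl in clauses:
            w[true_lits(cl)] += cl[2]
        return 1.5 * w[1] + 2.0 * w[2]

    improved = True
    while improved:
        improved = False
        base = F()
        for v in range(1, n + 1):
            T[v] = not T[v]            # try flipping y_v
            if F() > base + 1e-12:
                improved = True
                break
            T[v] = not T[v]            # undo the flip
    return T
```

With integer weights, F increases by at least 1/2 per accepted flip, so the search is polynomially bounded, matching the analysis in the text.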

4.2 Our Contributions

The performance of the algorithms to be presented will necessarily be in terms of two measures: the maximum domain size k and the arc-consistency r; Δ denotes the maximum degree of the constraint graph. Our main contributions are as follows:

1. We consider the basic local search heuristic: starting with an arbitrary assignment to the variables, iteratively improve the solution by changing the value of a single variable. This simple approach attains an absolute ratio of r/k for MAX-CSP and r/k − ε for W-CSP. This is tight for this heuristic, both as an absolute and as a relative ratio, while we attain slightly better ratios when certain relations between r, k and Δ are satisfied.

2. We consider a slightly more advanced version of local search, based on a little-known lemma of Lovász for partitioning graphs [76]. We give an efficient implementation of this lemma, generalize its statement, and extend it to CSP as well. We apply it to MAX k-CUT and obtain an absolute ratio of ((k−1)/k) · (1 + 1/(2Δ + k − 1)) in linear time. For small Δ, this ratio is better than the recent result of Frieze and Jerrum [37], who obtained a (relative) ratio of (k−1)/k + 2 ln k / k². Unfortunately, we can show that this approach cannot improve the ratio of r/k for MAX-CSP, except for special cases.

3. Using Lovász's lemma, we obtain improved (relative) approximation ratios for a host of problems. In particular, we obtain an improved ratio of 3/(Δ+2) for the weighted independent set problem, 1/⌈(Δ+1)/3⌉ for weighted hereditary subgraph problems, and 3(Δ+2)/4 for 3-COLOR. Karger, Motwani and Sudan [58] have recently obtained a Δ^{1−Ω(1)} log n approximation for 3-COLOR. The advantage of our approach, however, is speed, simplicity, and better bounds for all constant (or slightly superconstant) values of Δ.

4.2.1 Simple Local Search

We consider the local search algorithm LS with respect to the 1-neighborhood and an objective function f, where f(σ) counts the number of constraints satisfied by σ: start with an arbitrary initial assignment, and iteratively move to a neighbor with a higher objective value until a local optimum is reached. Since the objective values are monotonically increasing, a naive implementation of LS takes at most O(mnk) time. By careful implementation, we can reduce the complexity of LS. One such implementation maintains an n × k lookup table containing the objective function values of all neighbors of the current assignment, together with a list of k pointers to the maximum element of each column. Hence, finding a neighbor with a higher objective value takes O(k) time, while updating the table in each local-improvement step takes O(Δk) time. For special cases such as MAX CUT where all constraints are NOT-EQUAL constraints, each local improvement affects at most 2Δ entries of the lookup table, and hence the time complexity of LS can be reduced to O(mΔ + nk), the second term being the time to initialize the table.

MAX-CSP

The following lemma is the key to our analysis:

Lemma 4.9. (Local Lemma). Let x be an instance of MAX-CSP(k, r) and σ be any locally optimal solution computed by LS. Then, for every variable i, the number of constraints incident to i satisfied by σ is at least ⌈r·m_i/k⌉.


Proof. Consider any variable i, and let i₁, i₂, ..., i_{m_i} be the variables adjacent to i. Since each constraint is arc(r)-consistent, for each t there are at least r distinct values J_t = (j_{t,1}, ..., j_{t,r}) such that (σ(i_t), j_{t,l}) ∈ R(i_t, i) for 1 ≤ l ≤ r. Let J be the m_i × r matrix whose row t is J_t:

    J = ( j_{1,1}    j_{1,2}    ...  j_{1,r}   )
        ( j_{2,1}    j_{2,2}    ...  j_{2,r}   )
        (   ...        ...      ...    ...     )
        ( j_{m_i,1}  j_{m_i,2}  ...  j_{m_i,r} )

By the pigeon-hole principle, there exists a value j ∈ K which occurs in at least ⌈r·m_i/k⌉ different rows of J. If this bound does not hold for σ(i), then re-assigning i to j would increase the number of satisfied constraints, which is a contradiction. □

Theorem 4.10. LS approximates MAX-CSP(k, r) within an absolute ratio of r/k.

Proof. Summing over the vertices, a locally optimal solution satisfies at least

    (1/2) Σ_{i∈V} ⌈m_i·r/k⌉ ≥ ⌈(r/2k) Σ_{i∈V} m_i⌉ = ⌈rm/k⌉

constraints, by the Local Lemma. □

For constraint networks which are regular, the approximation bound can be slightly improved:

Corollary 4.11. LS approximates MAX-CSP(k, r) within an absolute ratio of r/k + (k−s)/(kΔ) on Δ-regular networks such that rΔ ≡ s (mod k) and s ≥ 1.

Proof. For each variable i, the number of satisfied incident constraints is at least ⌈rΔ/k⌉. Hence the total number of satisfied constraints is

    (1/2) ⌈rΔ/k⌉ n = (1/2) n · (rΔ − s + k)/k = ( r/k + (k−s)/(kΔ) ) m

and the corollary follows. □

The Local Lemma exhibits the nice property that every vertex has at least a guaranteed fraction of its incident constraints satisfied. This suggests that the locally optimal solution can be used to partition the given graph into subgraphs of small degrees. In particular, for MAX CUT and graph coloring, this partitioning strategy enables us to obtain good approximations, as we now show.
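The basic LS heuristic analyzed above can be sketched as follows for binary MAX-CSP. This is the naive implementation (the objective is recomputed from scratch in each step, not the table-based variant of the text), and the encoding of constraints as allowed-pair sets is ours:

```python
import random

def ls_maxcsp(n, k, constraints, seed=0):
    """1-neighborhood local search for binary MAX-CSP.
    constraints: list of (i, j, R) with 0 <= i, j < n and R a set of
    allowed value pairs (u, v) over the domain {0, ..., k-1}. Returns
    an assignment that no single-variable change can improve."""
    rng = random.Random(seed)
    sigma = [rng.randrange(k) for _ in range(n)]

    def satisfied():
        return sum((sigma[i], sigma[j]) in R for (i, j, R) in constraints)

    improved = True
    while improved:
        improved = False
        base = satisfied()
        for i in range(n):
            old = sigma[i]
            for u in range(k):
                sigma[i] = u               # try re-assigning variable i to u
                if satisfied() > base:
                    improved = True
                    break
                sigma[i] = old             # no gain: restore
            if improved:
                break
    return sigma
```

On NOT-EQUAL constraints (which are arc(k−1)-consistent), Theorem 4.10 guarantees that the returned assignment satisfies at least ⌈(k−1)m/k⌉ constraints.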

Bounded-Degree Maximum Cut





Vitányi [109] proved that MAX k-CUT can be approximated within 1 − 1/k in O(mnk) time. By Theorem 4.10, we can obtain the same result. Furthermore, if the degree of the given graph is bounded, this result can be improved.

Lemma 4.12. Given a graph of m edges and maximum degree Δ, O(mΔ + nk) time suffices to construct a k-partition such that the number of cut-edges is at least m − (1/2) Σ_{i=1}^{n} ⌊m_i/k⌋.

Proof. Using LS, solve an instance of MAX-CSP(k, k−1) with underlying graph G and all NOT-EQUAL constraints. By careful implementation, this special case takes O(mΔ + nk) time. Now partition G into subgraphs G₁, ..., G_k where each subgraph contains the nodes assigned the same value. Let the indegree of a node refer to the number of nodes adjacent to it within its subgraph. By the Local Lemma, every node i has an indegree of at most ⌊m_i/k⌋. Hence, the total number of edges within all subgraphs is at most (1/2) Σ_{i=1}^{n} ⌊m_i/k⌋ and the lemma follows. □

For k = 2, we can construct a 2-partition which contains at least m/2 + n/4 edges for odd-degree graphs (i.e. graphs in which all nodes have odd degrees). This is the best possible because there exists an odd-degree graph, namely a clique on an even number of nodes, whose maximum cut is exactly m/2 + n/4 edges. Our ratio for unweighted MAX 2-CUT on odd-degree graphs is hence 1/2 + n/(4m) and is absolute. This improves the ratio of Poljak and Turzik [91] by a marginal additive factor of 1/(4m). Poljak and Turzik's algorithm runs in O(n³) time, while our algorithm requires O(mΔ + nk) time. Unfortunately, if we relax the odd-degree restriction, we can do no better than a ratio of 1/2 using local search, as the negative result (Theorem 4.19) will show. For k > 2, we have the following:

Corollary 4.13. O(mΔ + nk) time suffices to approximate unweighted MAX k-CUT within (d − ⌊Δ/k⌋)/d, where d is the average degree. In particular, if Δ ≤ 3k − 1, then the ratio is (d−2)/d.

Proof. The total number of edges within all subgraphs is at most (1/2)⌊Δ/k⌋n, which implies that the total number of cut edges is at least dn/2 − (1/2)⌊Δ/k⌋n. Thus, the performance ratio is at least

    ( dn/2 − (n/2)⌊Δ/k⌋ ) / (dn/2) = (d − ⌊Δ/k⌋)/d.

In particular, when Δ ≤ 3k − 1, each subgraph has indegrees at most 2, and therefore we get a ratio of (d−2)/d. □

Corollary 4.13 is stronger than Vitányi's result when d > Δ − s, where Δ ≡ s (mod k); in particular, when 2k ≤ Δ ≤ 3k − 1. This means that when the average degree is close to the maximum degree, we obtain improved approximation bounds.

Bounded-Degree Graph Coloring

The following result on bounded-degree graph coloring is due to Brooks:

Theorem 4.14. ([21]) Let G be a graph with maximum degree Δ ≥ 3. If G does not contain a complete subgraph on Δ + 1 nodes, then G can be colored with at most Δ colors in polynomial time.

By Theorem 4.10, Brooks' result can be generalized as follows:

Corollary 4.15. Let G be a graph with maximum degree Δ ≥ 3. If G does not contain a complete subgraph on t (4 ≤ t ≤ Δ + 1) nodes, then G can be colored with at most (t − 1)⌈(Δ+1)/t⌉ colors in O(mΔ) time.

The proof is essentially the same as that for MAX k-CUT: partition G into k = ⌈(Δ+1)/t⌉ subgraphs so that every subgraph has a maximum indegree of at most ⌊Δ/k⌋ ≤ t − 1. Hence, t − 1 colors are sufficient to color each subgraph by Brooks' Theorem, and for the entire graph G we need at most (t − 1)k = (t − 1)⌈(Δ+1)/t⌉ colors.

By swapping nodes between subgraphs carefully using a slightly modified objective function, one can achieve a precise bound of (Δ + 1) − ⌊(Δ+1)/t⌋ with the same time complexity [52].
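The partitioning step behind Corollary 4.15 can be illustrated as follows. Since Brooks' algorithm itself is nontrivial, this sketch greedy-colors each class instead, so it needs t rather than t − 1 colors per class, giving the slightly weaker bound t·⌈(Δ+1)/t⌉; all names are ours:

```python
import random

def partition_color(adj, delta, t, seed=0):
    """(1) Local-search partition into k = ceil((delta+1)/t) classes so
    that every node has at most t-1 neighbors in its own class; then
    (2) greedy coloring inside each class, with class c using colors
    {c*t, ..., c*t + t - 1}. Total colors used: at most t*k."""
    n = len(adj)
    k = -(-(delta + 1) // t)              # ceil((delta+1)/t)
    rng = random.Random(seed)
    part = [rng.randrange(k) for _ in range(n)]

    changed = True
    while changed:                         # move nodes to sparser classes
        changed = False
        for v in range(n):
            counts = [0] * k
            for u in adj[v]:
                counts[part[u]] += 1
            best = min(range(k), key=lambda c: counts[c])
            if counts[best] < counts[part[v]]:
                part[v] = best
                changed = True

    # at a local optimum: indegree <= deg(v)/k <= delta*t/(delta+1) < t
    color = [None] * n
    for v in range(n):
        used = {color[u] for u in adj[v] if part[u] == part[v]}
        color[v] = next(c for c in range(part[v] * t, (part[v] + 1) * t)
                        if c not in used)
    return color
```

Cross-class neighbors automatically receive different colors because the color ranges of distinct classes are disjoint, so only same-class neighbors need to be avoided.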


W-CSP

Finally we extend our analysis to the weighted case, W-CSP. By extending the Local Lemma, one can show that for W-CSP(k, r), LS always returns a solution whose weight is at least r/k of the optimal. Unfortunately, there is no guarantee that LS terminates in polynomial time. Since each iteration improves the objective value by at least 1 unit, in the worst case LS may take time polynomial in the sum of edge weights W, which may be exponential in n and m. In fact, W-CSP belongs to a class of PLS-hard problems, implying that even finding a locally optimal solution may take exponential time in the worst case. To ensure polynomiality independent of W, we modify LS so that each improvement must be a constant factor of W, and we consider partially local optimal solutions. We show that, using a polynomial number of iterations, we can obtain a solution of almost the same quality as that guaranteed by a local optimum.

We use the following notation. W_i denotes the sum of weights of the edges incident to variable i. For any assignment σ, W_i^σ denotes the sum of weights of the edges incident to variable i that are satisfied by σ, and W_{i,j}^σ denotes the sum of weights of satisfied edges incident to i when the current value of i is replaced by the value j. Let w̄ = W/m denote the average edge weight. To simplify the discussion, we first prove a restricted version of the main theorem and then relax the restriction. More precisely, we first consider instances in which the weight of each edge is at least some factor τ (0 ≤ τ ≤ 1) of the average edge weight. Let the collection of such instances be denoted W-CSP(k, r, τ). Let WLS be the local search algorithm which iteratively seeks a neighbor that improves the weight of the current assignment by at least δ = ετw̄ units.

Lemma 4.16. (Weighted Local Lemma). Let x be an instance of W-CSP(k, r, τ) and σ be any local optimum computed by WLS as described above. Then, for all variables i, W_i^σ ≥ (r/k − ε)W_i.

Proof. Consider any variable i in x and suppose the lemma does not hold. Then W_i^σ < (r/k − ε)W_i ...

For all ε > 0, there exists a constant k depending on ε such that W-CSP(k) cannot be approximated within ε unless P=NP.

Two-Prover One-Round Interactive Proofs

In a two-prover proof system, two provers P₁ and P₂ try to convince a probabilistic polynomial-time verifier V that a common input x of size n belongs to a language L. V sends messages s and t respectively to P₁ and P₂ according to a distribution π which is a polynomial-time computable function of the input x and a random string r. The provers return answers P₁(s) and P₂(t) respectively without communicating with each other.

Definition 5.2. A language L has a two-prover one-round interactive proof system with parameters ε, f₁, f₂ (abbreviated IP(ε, f₁, f₂)) if, in one round of communication:

1. ∀x ∈ L, ∃P₁, P₂ : Pr[(V, P₁, P₂) accepts x] = 1;
2. ∀x ∉ L, ∀P₁, P₂ : Pr[(V, P₁, P₂) accepts x] < ε;
3. V uses O(f₁) random bits; and
4. the answer size is O(f₂).

Several results relating language classes to interactive proofs have appeared recently. In particular, we need the following:

1. ([32]) All languages in NEXP have IP(2^{−n}, poly(n), poly(n));


2. ([33, 7, 94]) For all L ∈ NP and for every constant ε > 0, there exists a constant σ depending on ε such that L has IP(ε, σ log n, σ).

A two-prover one-round proof system can be modelled as a problem on a two-player game G. Let S and T be the sets of possible messages; hence the sizes of S and T are O(2^{O(f₁)}). A pair of messages (s, t) ∈ S × T is chosen at random according to the probability distribution π and sent to the respective players. A strategy of a player is a function from messages to answers. Let U and W be the sets of answers returned by the two players respectively, whose sizes are O(2^{O(f₂)}). The objective is to choose strategies P₁ and P₂ which maximize the probability over π that V(s, t, P₁(s), P₂(t)) accepts x. Let the value of the game, denoted ω(G), be the success probability of the players' optimal strategy in the game G.

Proof of Non-Approximability

We can formulate the problem of finding the optimal strategy for the game G as a W-CSP instance with a bipartite constraint graph as follows. The set of nodes in the constraint graph is

    V = {x_s : s ∈ S} ∪ {y_t : t ∈ T}.

The edges are given by

    E = {(x_s, y_t) : π(s, t) ≠ 0}.

The domain of each x_s (resp., y_t) is U (resp., W). For each edge (x_s, y_t) ∈ E, the corresponding relation contains exactly those pairs (u, w) such that V(s, t, u, w) accepts x. Finally, define the weight of the constraint (x_s, y_t) ∈ E as the number of random strings on x which generate the query pair (s, t) (i.e. the value π(s, t) scaled up to an integer). Since each variable must be assigned exactly one value, the assignments of the variables {x_s} and {y_t} encode strategies for the two players respectively. By definition, the scaled optimum value of this W-CSP instance (i.e. the optimal value divided by the total number of random strings) is exactly ω(G), and hence equals the accepting probability of the proof system.

Theorem 5.3. If there is a polynomial time algorithm that approximates W-CSP within an O(polylog(n)) multiplicative factor, then EXP = NEXP.

Proof. Consider an arbitrary language L in NEXP and an input x. There is a two-prover one-round proof system such that the acceptance probability reflects membership of x in L. We can construct a W-CSP instance y with n′ = O(2^{poly(n)}) variables whose optimal value is the acceptance probability. Suppose there is a polynomial time algorithm which approximates y within some c/polylog(n′) factor. Then, if x ∈ L, the value returned is at least c/polylog(n′) = 1/poly(n), and if x ∉ L, the optimal value is less than 2^{−n}. Hence, applying the polynomial time approximation algorithm to y would give an exponential time decision procedure for L, implying EXP = NEXP. □

In fact, the same proof can be used to prove a stronger theorem: there exists a constant t > 0 such that W-CSP on n variables cannot be approximated within a factor of 1/2^{(log n)^t}, unless EXP = NEXP. Next, we consider the non-approximability of W-CSP with fixed domain size k. We derive the following theorem:

Theorem 5.4. For all ε > 0, there exists a constant k depending on ε such that W-CSP(k) cannot be approximated within ε unless P=NP.


Proof. Consider an arbitrary language L in NP and an input x. Fix any constant ε. Then there exists a constant σ such that the W-CSP instance constructed has O(2^{O(σ log n)}) = poly(n) variables and each variable has constant domain size k = O(2^σ). Suppose there is a polynomial time algorithm which approximates y within ε. Then we again have a gap in the acceptance probability which enables us to determine membership of x in polynomial time, implying P=NP. □
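The game-to-W-CSP reduction used in this subsection can be sketched as follows. This is a toy encoding with our own names; `accepts` stands in for the verifier V, and `pi` maps each question pair to the (integer) number of random strings generating it:

```python
def game_to_wcsp(S, T, U, W, pi, accepts):
    """Build the bipartite W-CSP instance encoding a two-prover one-round
    game: one variable per question (domain = that prover's answer set),
    one constraint per question pair with positive probability, relation =
    the accepted answer pairs, and weight pi(s, t)."""
    variables = [("P1", s) for s in S] + [("P2", t) for t in T]
    constraints = []
    for (s, t), weight in pi.items():
        if weight == 0:
            continue
        R = {(u, w) for u in U for w in W if accepts(s, t, u, w)}
        constraints.append((("P1", s), ("P2", t), R, weight))
    return variables, constraints
```

An assignment to the variables is exactly a pair of deterministic strategies, and the satisfied weight divided by the total weight is the game's acceptance probability, as argued in the text.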

5.2.2 Linear Time s-Approximation Algorithm

In this section, we use the method of conditional probabilities to derive a linear-time greedy algorithm which derandomizes the randomized rounding schemes proposed in the subsequent section. We also prove an initial 1/k²-approximation for W-CSP(k) using this greedy algorithm.

Consider an instance of W-CSP(k) with n variables. Suppose we are given an n × k matrix Π = (p_{i,u}) such that p_{i,u} ∈ [0, 1] for all i, u and Σ_{u=1}^{k} p_{i,u} = 1 for all 1 ≤ i ≤ n. If we assign each variable i independently to value u with probability p_{i,u}, we obtain a probabilistic assignment whose expected weight is given by

    Ŵ = Σ_{j∈E} w_j · Pr[constraint j is satisfied] = Σ_{j∈E} w_j ( Σ_{u,v∈K} c_j(u, v) · p_{j₁,u} · p_{j₂,v} ),

where j₁ and j₂ denote the two variables connected by constraint j. Hence, there must exist an assignment whose weight is at least Ŵ. The method of conditional probabilities specifies that such an assignment can be found deterministically by computing certain conditional probabilities. The following greedy algorithm performs the task: assign variables 1 to n iteratively. At the beginning of iteration i, let W̃ denote the expected weight of the partial assignment in which variables 1, ..., i−1 are fixed and variables i, ..., n are assigned according to the distribution Π. Let W̃_u denote the expected weight of that partial assignment with variable i fixed to the value u. Assign variable i to the value v maximizing W̃_v. From the law of conditional probabilities, we know

    W̃ = Σ_{u=1}^{k} W̃_u · p_{i,u}.

Since we always pick v such that W̃_v is maximized, W̃ is non-decreasing over all iterations, and the complete assignment has weight no less than the initial expected weight, which is Ŵ.

Therefore, to obtain assignments of large weight, the key is to choose the probability distribution matrix Π so that the expected weight is as large as possible. In the following, we consider the most naive probability distribution: the uniform random assignment, i.e. p_{i,u} = 1/k for all i and u. By linearity of expectation (the expected value of a sum of random variables equals the sum of their expected values), the expected weight of the random assignment is given by

    Ŵ = Σ_{j∈E} w_j · s_j ≥ s · Σ_{j∈E} w_j,

where s_j is the fraction of satisfying value pairs of constraint j and s is the strength of the instance. That is, the expected weight is at least s times the total edge weight, implying that W-CSP(k) and S-MAX k-CUT can be approximated within absolute ratio s. Since each constraint contains at least one value pair, this gives an absolute approximation ratio of 1/k².


Time Complexity

The greedy algorithm runs in linear time because the conditional probabilities can be computed efficiently. More precisely, W̃_u can be derived from W̃ as follows. Maintain a vector r where r_j stores the probability that constraint j is satisfied given that variables 1, ..., i−1 are fixed and the remaining variables are assigned randomly. Then W̃_u is just W̃ offset by the change in the satisfaction probabilities of the constraints incident to variable i:

    W̃_u = W̃ + Σ_{j incident to i} w_j (r′_j − r_j),

where r′_j is the new probability that constraint j is satisfied. Letting l be the second variable connected by j, r′_j is computed as follows: if l < i (i.e. l has been assigned) then set r′_j to 1 if (σ_l, u) ∈ R_j and 0 otherwise; else set r′_j to the fraction #{v ∈ K : (u, v) ∈ R_j}/k.

Clearly, the computation of each W̃_u takes O(m_i k) time, where m_i is the number of constraints incident to variable i. Hence, the total time needed is O(Σ_i m_i k²) = O((n + m)k²), which is linear in the size of the input.
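The greedy algorithm of this section, specialized to the uniform distribution p_{i,u} = 1/k, can be sketched as follows (the encoding of binary constraints as allowed-pair sets with weights is ours):

```python
def greedy_wcsp(n, k, constraints):
    """Method of conditional probabilities under the uniform distribution:
    fix variables 0..n-1 one at a time, always choosing the value that
    maximizes the conditional expected satisfied weight. constraints:
    list of (i, j, R, w) with i != j. The returned assignment has weight
    at least the initial expectation sum(w_j * |R_j| / k^2)."""
    sigma = [None] * n
    # r[c] = current probability that constraint c is satisfied
    r = [len(R) / (k * k) for (_, _, R, _) in constraints]
    incident = [[] for _ in range(n)]
    for c, (i, j, _, _) in enumerate(constraints):
        incident[i].append(c)
        incident[j].append(c)

    def prob_if(c, var, u):
        # probability that constraint c is satisfied if var is fixed to u
        i, j, R, _ = constraints[c]
        other = j if var == i else i
        if sigma[other] is not None:
            pair = (u, sigma[other]) if var == i else (sigma[other], u)
            return 1.0 if pair in R else 0.0
        if var == i:
            return sum((u, v) in R for v in range(k)) / k
        return sum((v, u) in R for v in range(k)) / k

    for i in range(n):
        best_u, best_gain = 0, float("-inf")
        for u in range(k):
            gain = sum(constraints[c][3] * (prob_if(c, i, u) - r[c])
                       for c in incident[i])
            if gain > best_gain:
                best_u, best_gain = u, gain
        sigma[i] = best_u
        for c in incident[i]:
            r[c] = prob_if(c, i, best_u)
    return sigma
```

Since the average of the k candidate gains at each step is zero, the best gain is non-negative, which is exactly why the conditional expectation never decreases.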

5.2.3 Randomized Rounding of Linear Program

The problem with GREEDY is that its ratio depends on the strength of the given instance, which can be very small. In this section, we will obtain ratios which are independent of (or only weakly dependent on) the strength. We will consider randomized rounding of a linear program and analyze its performance guarantee. Basically, our algorithm is as follows: first model a W-CSP instance as an integer program, then solve its linear programming (LP) relaxation, and finally apply randomized rounding to the solution obtained. For every variable i ∈ V, define k Boolean variables x_{i,1}, ..., x_{i,k} such that i is assigned to u in the W-CSP instance iff x_{i,u} is assigned to 1. A W-CSP instance can be formulated by the following integer linear program, where j₁ and j₂ denote the two variables connected by constraint j:

(IP): maximize  Σ_{j∈E} w_j ( Σ_{u,v∈K} c_j(u,v) · z_{j,u,v} )
      subject to
        z_{j,u,v} ≤ x_{j₁,u}        for j ∈ E and u, v ∈ K    (I1)
        z_{j,u,v} ≤ x_{j₂,v}        for j ∈ E and u, v ∈ K    (I2)
        Σ_{u∈K} x_{i,u} = 1         for i ∈ V                 (I3)
        x_{i,u} ∈ {0, 1}            for i ∈ V and u ∈ K       (I4)
        0 ≤ z_{j,u,v} ≤ 1           for j ∈ E and u, v ∈ K    (I5)

Inequalities (I1) and (I2) ensure that z_{j,u,v} is 1 only if x_{j₁,u} and x_{j₂,v} are both 1. Constraint (I3) ensures that each W-CSP variable is assigned exactly one value. Since the edge weights are positive and we are maximizing a linear function of z, the inner sum of the objective function is 1 if constraint j is satisfied and 0 otherwise. Given (IP), solve the corresponding linear program (LP) obtained by relaxing the integrality constraints (I4). Let (x*, z*) denote the optimal solution obtained. We propose the following rounding scheme: assign variable i to u with probability (1/2)(x*_{i,u} + 1/k), for all i ∈ V and u ∈ K.

CHAPTER 5. RANDOMIZED APPROXIMATION ALGORITHMS

This is a valid scheme since the sum of probabilities for each variable is exactly 1 by constraint (I3).

Claim 2. The expected weight of this probabilistic assignment is at least (1/k) · OPT(IP).

Proof. The expected weight of the probabilistic assignment is given by

    Ŵ = Σ_{j∈E} w_j Σ_{u,v∈K} c_j(u,v) · (1/2)(x*_{j₁,u} + 1/k) · (1/2)(x*_{j₂,v} + 1/k)
      ≥ Σ_{j∈E} w_j Σ_{u,v∈K} c_j(u,v) · (1/4)(z*_{j,u,v} + 1/k)²,

where the inequality follows from (I1) and (I2). By simple calculus, one can derive that the minimum value of the function

    f(z) = (z + 1/k)² / (4z)

in the interval [0, 1] is 1/k, attained at z = 1/k. Hence, the expected weight

    Ŵ ≥ (1/k) Σ_{j∈E} w_j Σ_{u,v∈K} c_j(u,v) · z*_{j,u,v} = (1/k) · OPT(LP) ≥ (1/k) · OPT(IP),

which completes the proof. □
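Both the rounding scheme and the calculus fact above can be spot-checked numerically; a small sketch, where `x_star` stands for a hypothetical optimal LP solution satisfying (I3):

```python
def rounding_probs(x_star, k):
    """Map LP values x*_{i,u} to rounding probabilities (1/2)(x*_{i,u} + 1/k);
    each row then sums to 1 because the x*-row sums to 1 by (I3)."""
    return [[0.5 * (x + 1.0 / k) for x in row] for row in x_star]

def f(z, k):
    """The function (z + 1/k)^2 / (4z) from the proof of Claim 2."""
    return (z + 1.0 / k) ** 2 / (4.0 * z)

k = 4
probs = rounding_probs([[1.0, 0.0, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]], k)
assert all(abs(sum(row) - 1.0) < 1e-12 for row in probs)

# f attains its minimum 1/k at z = 1/k on (0, 1]:
grid = [f(i / 1000.0, k) for i in range(1, 1001)]
assert min(grid) >= 1.0 / k - 1e-9 and abs(f(1.0 / k, k) - 1.0 / k) < 1e-12
```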

The above analysis is tight. In fact, the above rounding scheme is best possible with respect to the (IP) formulation, as can be shown by considering a W-CSP instance in which all constraints are full relations (i.e. contain all possible value pairs). Then, the optimal solution is the sum of weights W. On the other hand, a feasible solution of the linear program in which all variables are equal to 1/k has objective value kW.

The above formulation and rounding scheme can be extended in a straightforward manner to handle instances of arbitrary arity t. In this case, we assign variable i to value u with probability (1/t)(x*_{i,u} + (t − 1)/k). The rounding step can be derandomized in linear time using the greedy method proposed in the previous section. Hence:

Theorem 5.5. For any fixed t, W-CSP(k) of arity t can be approximated within an absolute ratio of 1/k^{t−1}.

Trevisan pointed out to us that this ratio is almost the best that we can hope for, because one can show by interactive proof techniques that, for all k ≥ 2, it is NP-hard to approximate W-CSP(k) (of arity 2) to within (1/k)^c, for some constant c > 0.

5.2.4 Randomized Rounding of Semidefinite Program

In this section, we consider approximation of W-CSP by semidefinite programming. The strategy is to represent W-CSP as a QIP and apply randomized rounding to its semidefinite programming relaxation.


5.2.5 Simple Rounding

Consider an instance of W-CSP(k). Formulate a corresponding quadratic integer program (Q) as follows. For every variable i ∈ V, define k decision variables x_{i,1}, ..., x_{i,k} ∈ {−1, +1} such that i is assigned to u in the W-CSP instance iff x_{i,u} is assigned to +1 in (Q).

(Q): maximize  Σ_{j∈E} w_j f_j(x)
     subject to
       x₀ Σ_{u∈K} x_{i,u} = −(k − 2)    for i ∈ V                (I6)
       x_{i,u} ∈ {−1, +1}               for i ∈ V and u ∈ K
       x₀ = +1

In the above formulation, f_j(x) = (1/4) Σ_{u,v} c_j(u,v)(1 + x₀x_{j₁,u})(1 + x₀x_{j₂,v}) encodes the satisfaction of constraint j, and hence the objective function gives the weight of the assignment. Constraint (I6) ensures that every W-CSP variable gets assigned exactly one value. The reason for introducing a dummy variable x₀ is so that all terms occurring in the formulation are quadratic, which is necessary for the subsequent semidefinite programming relaxation. The essential idea of the semidefinite programming relaxation is to coalesce each quadratic term x_i x_j into a matrix variable y_{i,j}. Let Y denote the (kn + 1) × (kn + 1) matrix comprising these matrix variables. The resulting relaxed problem (P) is the following:

(P): maximize  Σ_{j∈E} w_j F_j(Y)
     subject to
       Σ_{u∈K} y_{0,iu} = −(k − 2)     for i ∈ V
       y_{iu,iu} = 1                   for i ∈ V and u ∈ K      (I7)
       y_{0,0} = 1
       Y symmetric positive semidefinite.

Here, F_j(Y) = (1/4) Σ_{u,v} c_j(u,v)(1 + y_{j₁u,j₂v} + y_{0,j₁u} + y_{0,j₂v}). This semidefinite program can be solved in polynomial time within an additive factor (see [2]). By a well-known theorem in linear algebra, a t × t matrix Y is symmetric positive semidefinite iff there exists a full row-rank r × t matrix X (r ≤ t) such that Y = XᵀX (see, for example, [65]). One such matrix X can be obtained in O(t³) time by an incomplete Cholesky decomposition. Since Y has all 1's on its diagonal (by constraint (I7)), the decomposed matrix X corresponds precisely to a list of t unit vectors X₁, ..., X_t, namely the t columns of X. Furthermore, these vectors have the nice property that the inner product X_c · X_{c'} = y_{c,c'}. Henceforth, for simplicity, we will regard the program (P) as returning a set of vectors X (instead of the matrix Y) as the solution.

We propose the following randomized approximation algorithm for the case of k = 2 (which can also be used for k = 3, as shown later):

1. (Relaxation) Solve the semidefinite program (P) to optimality (within an additive factor) and obtain an optimal set of vectors X*.
2. (Rounding) Construct an assignment for the W-CSP instance as follows: for each i, assign variable i to value u with probability 1 − arccos(X₀* · X*_{i,u})/π.
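The k = 2 rounding can be illustrated with hand-made unit vectors standing in for an actual SDP solution (a sketch):

```python
import math

def assign_prob(x0, xiu):
    """Probability of assigning the value whose vector is xiu: 1 - arccos(x0.xiu)/pi."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x0, xiu))))  # guard round-off
    return 1.0 - math.acos(dot) / math.pi

# For k = 2, constraint (I6) forces X0.Xi1 = -X0.Xi2, so the two angles sum
# to pi and the two assignment probabilities sum to exactly 1:
x0 = (1.0, 0.0)
t = 0.3
xi1 = (math.cos(t), math.sin(t))
xi2 = (-math.cos(t), -math.sin(t))
assert abs(assign_prob(x0, xi1) + assign_prob(x0, xi2) - 1.0) < 1e-9
```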


The Rounding step has the following intuitive meaning: the smaller the angle between X*_{i,u} and X₀*, the higher the probability that i is assigned to u. Since the vector assignment is constrained by the equation X₀* · X*_{i,1} + X₀* · X*_{i,2} = 0 for all i, the angle between X₀* and X*_{i,1} and the angle between X₀* and X*_{i,2} sum to 180 degrees (i.e. π). Thus, the probabilities of assigning i to 1 and to 2 sum to exactly 1, implying that the assignment obtained is valid. Furthermore, the variable x₀ is always assigned +1. Before proving the performance guarantee, we present a technical lemma.

Lemma 5.6. For all unit vectors a, b and c,  b · c ≤ cos(arccos(a · b) − arccos(a · c)).

Proof. The vectors a, b and c span a unit 3-D sphere. Since the vectors have unit length, the angle between two of them (denoted θ(·,·)) equals the spherical distance between their endpoints on the sphere. Using the triangle inequality on spherical distances (see, for example, [85], pages 346–347), we get

    θ(b, c) ≥ | θ(a, b) − θ(a, c) |.

Since the cosine function is monotonically decreasing in the range [0, π], the lemma follows. □

Claim 3. The expected weight of this probabilistic assignment is at least 0.408 · OPT(Q).

Proof. The expected weight of the probabilistic assignment is given by

    Ŵ = Σ_{j∈E} w_j Σ_{u,v∈K} c_j(u,v) [1 − arccos(X₀* · X*_{j₁,u})/π] · [1 − arccos(X₀* · X*_{j₂,v})/π].

Let p = arccos(X₀* · X*_{j₁,u})/π and q = arccos(X₀* · X*_{j₂,v})/π. One can show by graph plotting that

    (1 − p)(1 − q) ≥ 0.102 [cos(π(p − q)) + cos(πp) + cos(πq) + 1]   in the range 0 ≤ p, q ≤ 1.

Since cos(πp) = X₀* · X*_{j₁,u} and cos(πq) = X₀* · X*_{j₂,v}, Lemma 5.6 implies that the right-hand side is at least

    0.102 [ X*_{j₁,u} · X*_{j₂,v} + X₀* · X*_{j₁,u} + X₀* · X*_{j₂,v} + 1 ].

Hence,

    Ŵ ≥ (0.408/4) Σ_{j∈E} w_j Σ_{u,v∈K} c_j(u,v) [ X*_{j₁,u} · X*_{j₂,v} + X₀* · X*_{j₁,u} + X₀* · X*_{j₂,v} + 1 ] = 0.408 · OPT(P) ≥ 0.408 · OPT(Q),

which completes the proof. □

For the case of k = 3, the technical difficulty is in ensuring that the probabilities of assigning a variable to the three values sum to exactly 1. Fortunately, by introducing additional valid equations, it is possible to enforce this condition, which we now explain. Call two vectors X_c and X_{c'} opposite if X_c = −X_{c'}. The following lemma provides the trick.

Lemma 5.7. Given 4 unit vectors a, b, c, d, if

    a · b + a · c + a · d = −1    (5.1)
    b · a + b · c + b · d = −1    (5.2)
    c · a + c · b + c · d = −1    (5.3)
    d · a + d · b + d · c = −1    (5.4)

then a, b, c and d must form two pairs of opposite vectors.


Proof. Computing (1/2)[(5.3) + (5.4) − (5.1) − (5.2)] gives

    a · b = c · d.

Similarly, one can show that a · c = b · d and a · d = b · c. This means that the four vectors form either two pairs of opposite vectors or two pairs of equal vectors. Suppose we have the latter case, and w.l.o.g. suppose a = b and c = d. Then, by (5.1), a · c = a · d = −1, implying that we still have two pairs of opposite vectors, (a, c) and (b, d). □

Thus, for k = 3, we add the following set of 4n valid equations into (Q). For all i:

    x₀ (x_{i,1} + x_{i,2} + x_{i,3}) = −1
    x_{i,1} (x₀ + x_{i,2} + x_{i,3}) = −1
    x_{i,2} (x₀ + x_{i,1} + x_{i,3}) = −1
    x_{i,3} (x₀ + x_{i,1} + x_{i,2}) = −1

By Lemma 5.7, the corresponding relaxed problem will return a set of vectors with the property that, for each i, there exists at least one vector X̃ ∈ {X_{i,1}, X_{i,2}, X_{i,3}} which is opposite to X₀, while the remaining two are opposite to each other. Noting that 1 − arccos(X₀ · X̃)/π = 0, the probabilities of assigning i to the other two values sum to exactly 1. Thus, we have reduced the case of k = 3 to the case of k = 2. The following result follows after derandomization via the method of conditional probabilities:

Theorem 5.8. W-CSP(k) and SMAX k-CUT (k ≤ 3) can be approximated within 0.408.
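Returning to the inequality (1 − p)(1 − q) ≥ c·[cos(π(p − q)) + cos(πp) + cos(πq) + 1] used in the proof of Claim 3, the plotting step can be spot-checked on a grid. One caveat for such a check: as p = q → 1 both sides vanish and their ratio tends to 1/π² ≈ 0.1013, so the sketch below verifies the bound with the slightly smaller safe constant c = 0.101:

```python
import math

def bracket(p, q):
    """cos(pi(p-q)) + cos(pi*p) + cos(pi*q) + 1, the bracketed term of the proof."""
    return (math.cos(math.pi * (p - q)) + math.cos(math.pi * p)
            + math.cos(math.pi * q) + 1.0)

c = 0.101
ok = all((1 - i / 200.0) * (1 - j / 200.0) + 1e-12 >= c * bracket(i / 200.0, j / 200.0)
         for i in range(201) for j in range(201))
assert ok
```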

Note that this ratio is an improvement over the linear programming bound of 0.333.

Larger Domain Size

Note that the above rounding works only for the cases k ≤ 3. For domain size greater than 3, we use the following rounding strategy: assign variable i to value u with probability (1 + X₀* · X*_{i,u})/2.
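A toy numeric check of this normalization (the inner products below are illustrative stand-ins for an SDP solution satisfying Σ_u X₀ · X_{i,u} = −(k − 2)):

```python
k = 4
# hypothetical inner products X0 . Xiu for one variable; they sum to -(k - 2)
dots = [0.5, -0.8, -0.9, -0.8]
assert abs(sum(dots) + (k - 2)) < 1e-12

probs = [(1.0 + d) / 2.0 for d in dots]
# sum of probabilities = (k + sum(dots)) / 2 = (k - (k - 2)) / 2 = 1
assert abs(sum(probs) - 1.0) < 1e-12
```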

This rounding scheme works for all values of k, since the sum of the probabilities for each variable i is

    Σ_{u=1}^{k} (1 + X₀* · X*_{i,u})/2 = 1.

Unfortunately, it is easy to show that we cannot expect a ratio better than 1/k regardless of the rounding scheme. This can be shown by considering a W-CSP instance in which all constraints are full relations. Then, the optimal solution is W, while a feasible solution of (P) in which all vectors X_{i,u} are equal and X₀ · X_{i,u} = −(k − 2)/k has objective value kW.

Limits of Simple Rounding

We will show that the above result is almost best possible. We prove that, given the above formulation, regardless of the randomized rounding scheme chosen in Step 2 of the algorithm, there exists a W-CSP(2) instance such that the expected weight of the solution is no more than 0.5 times the optimal weight. Let S be the set consisting of the two constraint relations {(1,1),(2,2)} and {(1,2),(2,1)}. Define W-CSP_S to be the collection of W-CSP(2) instances whose constraints are drawn from the set S.


Lemma 5.9. Let X̂ be the set of vectors {X̂₀} ∪ {X̂_{i,u} : i ∈ V, u ∈ K} such that all the X̂_{i,u}'s are equal and orthogonal to X̂₀. Then, X̂ is an optimal solution of the relaxation problem (P) associated with any instance of W-CSP_S.

Proof. Consider the relaxation problem (P) of an arbitrary instance of W-CSP_S. For any feasible solution X, the objective value is

    (1/4) Σ_{j∈E} w_j Σ_{u,v∈{1,2}} c_j(u,v) [ 1 + X₀ · X_{j₁,u} + X₀ · X_{j₂,v} + X_{j₁,u} · X_{j₂,v} ] ≤ Σ_{j∈E} w_j,

since X₀ · X_{i,1} = −X₀ · X_{i,2} for all i. On the other hand, X̂ is a feasible solution of (P) whose objective value is always equal to Σ_j w_j. □

Lemma 5.10. Let {p_{i,u}} represent a fixed probability distribution. Then, there exists an instance in W-CSP_S such that the expected weight of the assignment is no more than 0.5 times the optimal weight.

Proof. Construct the following W-CSP_S instance. Let the constraint graph be a simple chain connecting n variables. For each constraint j connecting variables i and l, let u_max (resp. u_min) be the value in {1, 2} such that p_{i,u} is the larger (resp. smaller) quantity, ties broken arbitrarily. Let v_max and v_min be defined similarly for p_{l,v}. Define the constraint relation of j to be {(u_max, v_min), (u_min, v_max)}, which is an element of S. Now, one can verify by simple arithmetic that the expected weight of the solution is at most 0.5 times the sum of the weights. We know, however, that there exists an assignment which satisfies all constraints simultaneously. □
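The adversarial chain of the proof can be sketched as follows (p is a hypothetical fixed rounding distribution; for each edge the relation {(u_max, v_min), (u_min, v_max)} is chosen against it):

```python
def expected_ratio(p):
    """Expected satisfied fraction on the adversarial chain, where p[i] gives
    the probabilities (p_{i,1}, p_{i,2}) of assigning variable i to 1 and 2."""
    total = 0.0
    m = len(p) - 1                           # number of chain edges
    for i in range(m):
        a1, a2 = p[i]
        b1, b2 = p[i + 1]
        amax, amin = max(a1, a2), min(a1, a2)
        bmax, bmin = max(b1, b2), min(b1, b2)
        total += amax * bmin + amin * bmax   # prob. the chosen relation is satisfied
    return total / m

# Whatever the distribution, the expected fraction never exceeds 0.5,
# although the constructed instance is fully satisfiable:
assert expected_ratio([(0.5, 0.5), (0.7, 0.3), (0.9, 0.1)]) <= 0.5
assert expected_ratio([(1.0, 0.0), (1.0, 0.0)]) <= 0.5
```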

Using the above two lemmas, we arrive at the following negative result:

Theorem 5.11. Using the above semidefinite programming formulation, W-CSP(2) cannot be approximated to within better than 0.5, regardless of the rounding scheme.

Proof. Given a randomized rounding scheme, let {p_{i,u}} be the fixed probability distribution associated with the fixed set of vectors X̂. By Lemma 5.10, we can construct at least one instance I in W-CSP_S for which the probabilistic assignment has expected weight no more than 0.5 times the optimal, and by Lemma 5.9, X̂ is an optimal solution of the corresponding relaxation problem (P) of I. The theorem follows if we suppose that X̂ is returned by the Relaxation step of the algorithm. □

Rounding Via Hyperplane Partitioning

Can we obtain better ratios using another formulation? It turns out that, for the case of k = 2, we can improve the ratio by extending the approach of Goemans and Williamson (for approximating MAX SAT) [43]. First, we require two lemmas from [43]. Since we are interested in the case k = 2, we can view W-CSP as a generalization of the MAX 2SAT problem. Recall that an input instance of MAX 2SAT is a set of Boolean clauses which are disjunctions of two literals drawn from a given set of Boolean variables {x₁, ..., x_n}. Goemans and Williamson [43] formulated MAX 2SAT as a special case of QIP as follows. For each x_i, introduce a variable y_i ∈ {−1, +1} in the QIP. Introduce an additional variable y₀ ∈ {−1, +1}. The variable x_i is set to true iff y_i = y₀. Define v(C) to be 1 iff the formula C is true. Then, v(x_i) = (1 + y₀y_i)/2 and v(x̄_i) = (1 − y₀y_i)/2. Thus, each clause C_j can be mapped into a quadratic polynomial of the form

    v(C_j) = (1/4) [ (1 ± y₀y_{j₁}) + (1 ± y₀y_{j₂}) + (1 ± y_{j₁}y_{j₂}) ].


For example, in the case of C_j = x_i ∨ x_l,

    v(C_j) = 1 − ((1 − y₀y_i)/2) · ((1 − y₀y_l)/2) = (1/4) [ (1 + y₀y_i) + (1 + y₀y_l) + (1 − y_i y_l) ].

Hence, the instance may be formulated by an objective function which comprises non-negative linear combinations of 1 ± y_i y_l for i, l ∈ {0, 1, ..., n}.

Model a given instance of W-CSP by the following QIP. Each variable has domain {−1, +1}. In this way, we can directly use x_i ∈ {−1, +1} to indicate the value assigned to variable i. Introduce an additional variable x₀. The variable i is assigned +1 iff x_i = x₀ (similar to Goemans and Williamson's approach).
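The clause encoding can be verified exhaustively over the eight ±1 assignments (a small sketch):

```python
from itertools import product

def v_clause(y0, yi, yl):
    """The quadratic polynomial for C = x_i OR x_l in the encoding above."""
    return 0.25 * ((1 + y0 * yi) + (1 + y0 * yl) + (1 - yi * yl))

for y0, yi, yl in product((-1, 1), repeat=3):
    truth = (yi == y0) or (yl == y0)     # x_i is true iff y_i = y_0
    assert v_clause(y0, yi, yl) == (1.0 if truth else 0.0)
```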

maximize  Σ_{j∈E} w_j f_j(x)

[...]

4. if z_min > TS, output fail;
5. obtain a feasible schedule σ from f_min(N) (see Lemma 6.17 below);
6. output σ.

Figure 6.8 gives an illustration of algorithm F. In this example, K = 3, I = 3, J = 2 and the shift change is monotonic. (a) shows a demand matrix and (b) shows a show-up schedule. From (a) and (b), we derive DD = [1,3,2], DS = [2,3,2] and DSL = [1,0,0]. The network constructed is as shown in (c). Numbers in nodes represent the supply/demand quantities. To simplify the drawing, the edges from layer 3 to 1 are not drawn. (d) shows a minimum-cost flow of (c). Numbers above edges represent flow values. (e) shows a feasible schedule which is derived from (d). We will discuss how the fixed-cost network N is constructed (Step 2) and then prove the correctness of our algorithm by showing that any minimum-cost flow induces a feasible schedule provided that its cost does not exceed the total supply TS.
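The derived quantities of this example can be recomputed as follows (a sketch assuming DD_i is the column sum of the demand matrix for day i and DSL = DS − DD, consistent with the values quoted above):

```python
demand = [[0, 1, 1],    # demand of shift 1 on days 1..3
          [1, 2, 1]]    # demand of shift 2
DS = [2, 3, 2]          # workers showing up per day (from the show-up schedule)

DD = [sum(col) for col in zip(*demand)]   # daily demand
DSL = [s - d for s, d in zip(DS, DD)]     # daily slack
TD, S, TS = sum(DD), sum(DSL), sum(DS)

assert DD == [1, 3, 2] and DSL == [1, 0, 0]
assert TS == TD + S     # the identity TS = TD + S used in Lemma 6.18
```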

Network Construction

A fixed-cost network is constructed as follows:

1. For each workstretch k, create a W-node W[k] whose supply quantity is |ω_k|.

2. For 1 ≤ j ≤ J and 1 ≤ i ≤ I, create D_{j,i} D-nodes D[j,i,1], D[j,i,2], ..., D[j,i,p] (where p = D_{j,i}). Each D-node has demand quantity 1.

6.2. OUR CONTRIBUTIONS

Figure 6.8: Illustration of algorithm F. (a) the demand matrix; (b) the show-up schedule; (c) the constructed layered network, with supply/demand quantities in the nodes; (d) a minimum-cost flow of (c), with flow values on the edges; (e) the feasible schedule derived from (d).

3. For 1 ≤ j ≤ J and 1 ≤ i ≤ I, create DSL_i *-nodes *[j,i,1], *[j,i,2], ..., *[j,i,q] (where q = DSL_i). Each *-node has demand quantity 0.

4. For 1 ≤ i ≤ I, create one slack node SL[i] whose demand quantity is DSL_i. Slack nodes act as sinks which absorb the flows out of *-nodes.

The nodes are arranged into a cyclic layered network, where layer i (0 ≤ i ≤ I) contains:

1. W-nodes representing workstretches that start on day i + 1;
2. D-nodes and *-nodes associated with day i (i.e. nodes whose second index is i); and
3. the slack node associated with day i − 1, i.e. SL[i − 1].

Edges represent permissible shift changes. Since the first (i.e. leftmost) slot of a workstretch can be assigned any shift, each W-node in layer i is connected to all D-nodes and *-nodes in layer i + 1. D-nodes and *-nodes in adjacent layers are connected if the corresponding shift change is permitted. All *-nodes in layer i are connected to the slack node in layer i + 1. All edges are assigned unit fixed costs, except the edges from *-nodes to slack nodes, which are assigned zero fixed costs. All variable costs are set to zero.

Proof of Correctness

Let f(N) denote a flow of the given network N. A node in N is said to be in-active if at least one of its incoming edges has a positive flow value. Similarly, a node is out-active if at least one of its outgoing edges has a positive flow value. If neither condition occurs, then the node is said to be dead. Ignoring all dead nodes, it is clear that f(N) is a directed layered graph such that:

1. all W-nodes, D-nodes and slack nodes are in f(N);
2. the number of *-nodes in f(N) is at least S, since the sum of the demand quantities of the slack nodes is S and each *-node can supply at most one unit of flow to its slack node; and

CHAPTER 6. SPECIAL CSP: MANPOWER SCHEDULING PROBLEM

3. z (the cost of f(N)) equals the total number of edges in f(N) which have fixed costs.

In a flow, we say that a fork occurs at a node u if u has more than one outgoing edge. Similarly, a join occurs at a node u if u has more than one incoming edge. A path is disjoint if none of the nodes along the path has a fork or a join.

Definition 6.15. A disjoint path flow is a flow such that: 1. the edges of the flow can be partitioned into two sets: (a) disjoint paths connecting the W-nodes, D-nodes and *-nodes, and (b) edges from *-nodes to slack nodes; 2. every *-node supplies exactly one unit to its slack node.

Definition 6.16. A column of a schedule in matrix representation is proper if every slot in that column has been assigned a shift and the column assignment satisfies the demand for that day. Likewise, a partial schedule is i-proper if columns 1 to i of the schedule are proper. Hence, a schedule is feasible if and only if it is I-proper.

Let Π = (I, J, K, D, U, ω) be an instance of AS-CSAP, and let N denote the network constructed by Step 2 of algorithm F. Then, the following lemma holds:

Lemma 6.17. Every disjoint path flow of N induces a feasible schedule for Π iff Π has a feasible schedule.

Proof. Let f(N) denote any disjoint path flow of N. Then, f(N) contains exactly K disjoint paths, since there are K W-nodes. Each disjoint path in f(N) must originate from a W-node and end in a D-node or *-node. Consider the disjoint path which begins with an arbitrary W-node W[k]:

    (W[k], D[j₁, i, p₁], D[j₂, i+1, p₂], ..., D[j_r, i+r−1, p_r]),

where r = |ω_k|, the supply quantity of W[k]. This represents a permissible shift assignment of workstretch k, in which shift j_t is assigned to the slot of workstretch k on day i + t − 1, for 1 ≤ t ≤ r. Similarly, given a feasible schedule for Π, it is easy to construct a disjoint path flow for N by associating each workstretch assignment with one disjoint path. □

Let f_min(N) and z_min denote a minimum-cost flow of N and its cost, respectively. Then the following two lemmas hold.

Lemma 6.18. z_min ≥ TS.

Proof. f_min(N) contains TD D-nodes and at least S *-nodes. Every such node has at least one incoming edge, and each such edge has a unit fixed cost. Since TS = TD + S by definition, z_min ≥ TS. □

Lemma 6.19. z_min = TS iff f_min(N) is a disjoint path flow.

Proof. If f_min(N) is a disjoint path flow, then clearly z_min = TS. Now, suppose z_min = TS. Then, f_min(N) must have TD D-nodes and exactly S *-nodes, and every such node has exactly one incoming edge. Therefore,

1. every *-node sends one unit of flow to its respective slack node;
2. joins cannot exist in f_min(N);


3. forks cannot exist in f_min(N). This may be shown by contradiction. We scan f_min(N) from layer 0 and suppose a fork first occurs between layers i − 1 and i. Since layers 0, ..., i − 1 are fork- and join-free, we can construct an (i−1)-proper schedule. Thus, the number of out-active nodes in layer i − 1 equals the number of in-active nodes in layer i, and so a join must occur between layers i − 1 and i, giving rise to a contradiction. □

Theorem 6.20. Algorithm F finds a feasible schedule iff one exists.

Proof. If the algorithm returns a schedule, then the schedule is clearly feasible. Conversely, suppose a feasible schedule exists. By Lemma 6.17, there exists a disjoint path flow for the network constructed by Step 2 of F, and such a flow has cost TS; together with Lemmas 6.18 and 6.19, this implies that z_min = TS and that any minimum-cost flow is a disjoint path flow. Hence Step 3 of F always returns a disjoint path flow, and Step 5 produces the corresponding feasible schedule. □

6.2.5 Arbitrary Exact CSAP

We now consider the arbitrary exact CSAP (AE-CSAP). Further, we restrict workstretches to be non-cyclic and to have either equal start days or equal end days. This includes the case in which all workers have the same off day(s). If we rearrange the workstretches in non-decreasing order of their lengths, i.e. |ω_k| ≤ |ω_{k+1}| for k = 1, ..., K − 1, then the schedule is shaped like a tableau, as shown in Figure 6.9 (a) or (b), with each row of the tableau representing one workstretch. We show that AE-CSAP is polynomially solvable.

Figure 6.9: Tableau-shaped schedules. Rows are workstretches and columns are the days M, T, W, H, F, S, U; in (a) all workstretches share the same start day, in (b) the same end day.

A node-disjoint path cover, or simply path cover, of a directed graph is a collection of node-disjoint paths (possibly of zero length) which covers all nodes of the graph. Given a directed graph N, a path cover always exists. An optimal path cover is one which has the least number of paths, and its size is denoted by ψ(N). The problem of finding an optimal path cover for an n-node directed acyclic graph is solvable in O(n^{2.5}) time [19]. Using the notion of path cover, we propose the following algorithm P to solve AE-CSAP:
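The optimal path-cover step can be sketched via the classical reduction to bipartite matching (a sketch, not the O(n^{2.5}) algorithm of [19]): an n-node DAG has an optimal path cover of size n − M, where M is a maximum matching between out-copies and in-copies of the nodes.

```python
def min_path_cover_size(n, edges):
    """psi(N) for a DAG on nodes 0..n-1: n minus a maximum bipartite matching
    between edge tails (left side) and edge heads (right side)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    match_to = [-1] * n          # match_to[v] = tail currently matched to head v

    def augment(u, seen):
        # Hungarian-style augmenting path search from tail u.
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if match_to[v] == -1 or augment(match_to[v], seen):
                    match_to[v] = u
                    return True
        return False

    matching = sum(1 for u in range(n) if augment(u, set()))
    return n - matching

# A layered DAG that decomposes into two node-disjoint paths:
assert min_path_cover_size(6, [(0, 1), (1, 2), (3, 4), (4, 5)]) == 2
```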

procedure P(I, J, K, D, U, ω):
1. construct the corresponding network N using Step 2 of algorithm F;
2. apply any known algorithm to find an optimal path cover Γ for N;
3. if ψ(N) > K, output fail;
4. obtain a feasible schedule σ from Γ;
5. output σ.

We prove that algorithm P is correct by showing that an optimal path cover always consists of K disjoint paths and that it induces a feasible schedule.


Let Π = (I, J, K, D, U, ω) be an instance of AE-CSAP and let N be the layered network constructed by Step 2 of algorithm F. Since all workstretches have equal start days, we remove all W-nodes from N. Clearly, layer 1 of the network contains K nodes. Since demand is exact, there are neither *-nodes nor slack nodes in N. The resulting network is as shown in Figure 6.10: (a) shows the given tableau-shaped schedule, (b) shows the network constructed from (a), and (c) shows an optimal path cover corresponding to (b).

Let Γ = (π₁, π₂, ..., π_K) be a path cover of N of size K, and let |π_k| denote the number of nodes on path π_k. W.l.o.g., suppose |π_k| ≥ |π_{k+1}| for k = 1, ..., K − 1. Then, the following lemma holds:

Lemma 6.21. |π_k| = |ω_k|, for k = 1, ..., K.

Proof. Consider the sequences {|π₁|, ..., |π_K|} and {|ω₁|, ..., |ω_K|}, both arranged in non-increasing order. Suppose k is the smallest index such that |π_k| ≠ |ω_k|. If |π_k| > |ω_k|, then we may conclude that DD_i > DS_i for i = |ω_k| + 1 to |π_k|, which is a contradiction because demand is exact. Similarly, if |π_k| < |ω_k|, then DD_i < DS_i for i = |π_k| + 1 to |ω_k|, a contradiction also. Thus, no such k exists. □

Theorem 6.22. Π has a feasible schedule iff ψ(N) = K.

Proof. Suppose ψ(N) = K and Γ is an optimal path cover of N. We construct a feasible schedule σ as follows. For each 1 ≤ k ≤ K, the path π_k, composed of a sequence of nodes (D[j₁, 1, p₁], D[j₂, 2, p₂], ..., D[j_r, r, p_r]) with r = |π_k|, represents a permissible shift assignment of workstretch k such that shift j_t is assigned to the slot of workstretch k on day t, for 1 ≤ t ≤ |π_k|. The assignment is always possible by Lemma 6.21. Since Γ covers all nodes in N and all workstretches are assigned, σ is feasible. Conversely, given a feasible schedule σ, we apply the reverse procedure to obtain a path cover of size K. This must be optimal since ψ(N) ≥ K, because a path in N consists of at most one node from each layer and there are K nodes in layer 1. □

Figure 6.10: Example of network constructed for tableau-shaped schedule. (a) the given tableau-shaped schedule; (b) the network constructed from (a); (c) an optimal path cover of (b).

Chapter 7

Conclusion

This thesis has discussed several different approaches to solving the constraint satisfaction problem and related problems. In all of them, the unifying theme has been that the algorithms must be simple, i.e. have low time and space complexities, yet effective, i.e. have provably good performance. Several major research topics arise from this work.

Probabilistic Analysis of Local Search

We have not addressed how local search performs probabilistically when applied to random CSP instances generated from distributions other than ours. Nor have we analyzed its performance on instances which have more than one solution. Even when local search succeeds with high probability, the probability is not high enough to guarantee that the algorithm runs in polynomial time on average over our distribution of CSP instances.

Improved Approximations

Since the writing of this thesis, we have improved the ratio for W-CSP(2) from 0.634 to 0.714; this result will be published in a journal paper. While we have achieved constant-factor approximation for W-CSP with fixed domain sizes, we foresee possible improvements in the performance guarantee using better mathematical programming formulations. In particular, we ask whether W-CSP can be approximated within a factor better than 1/k. We also conjecture that there is a polynomial-time approximation scheme for dense instances of W-CSP, after studying the work of Arora et al. in the coming FOCS96 conference. Experimentally, we have compared our algorithms only with somewhat naive approaches; it remains to compare them with more sophisticated approaches such as tabu search and simulated annealing. It is also an interesting theoretical venture to obtain non-approximability results for fixed domain sizes, in the vein of Bellare et al. [13].

Practical Derandomization

We proposed a randomized algorithm which rounds fractional solutions to integral solutions based on the partitioning of hyperplanes. The algorithm can be derandomized using the method proposed in [80]. However, their method involves the computation of sequences of nested integrals, and performing these operations with a small enough error seems hard to do with low time complexity. Furthermore, in implementing their algorithm, a host of precision issues will inevitably crop up. Thus, from the practical perspective, the hyperplane rounding method seems inefficient compared with the simple rounding method. Derandomization is an issue also recognized by other researchers of randomized algorithms; one broad area of research is to find practical derandomizations.

Real World Applications

CSP is able to formalize a large variety of combinatorial problems of practical importance, especially scheduling and resource allocation problems. However, we must emphasize that much work is needed to find good CSP formulations of given problems. We also foresee the use of CSP as a general problem solver in the real world. In our experience, contemporary CSP solvers such as the ILOG Solver tend to be slow and often exhibit unpredictable behaviour. It is an important research and development project to develop stable and fast CSP solvers using the algorithms proposed in this thesis.

Bibliography

[1] Paola Alimonti. New local search approximation techniques for maximum generalized satisfiability problems. In Proc. Italian Conf. on Algorithms and Complexity (CIAC), pages 40–53. Springer Verlag Lect. Notes Comp. Sci. (778), 1994.

[2] F. Alizadeh. Interior point methods in semidefinite programming with applications to combinatorial optimization. SIAM J. Optimiz., 5(1):13–51, 1995.

[3] N. Alon, R. A. Duke, H. Lefmann, V. Rodl, and R. Yuster. The algorithmic aspects of the regularity lemma. In Proc. 33rd IEEE Symp. on Found. of Comp. Sci., pages 473–480, 1992.

[4] Noga Alon and Nabil Kahale. A spectral technique for coloring random 3-colorable graphs. In Proc. 26th ACM Symp. on Theory of Computing, pages 346–355, 1994.

[5] Noga Alon and Joel Spencer. The Probabilistic Method. Wiley Interscience Ser. Disc. Math. and Optimiz., 1992.

[6] Edoardo Amaldi and Viggo Kann. On the approximability of finding maximum feasible subsystems of linear systems. In Proc. Symp. Theo. Aspects of Comp. Sci. (STACS), pages 521–532. Springer Verlag Lect. Notes Comp. Sci. (775), 1994.

[7] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and hardness of approximation problems. In Proc. 33rd IEEE Symp. on Found. of Comp. Sci., pages 2–13, 1992.

[8] Sanjeev Arora, David Karger, and Marek Karpinski. Polynomial time approximation schemes for dense instances of NP-hard problems. In Proc. 27th ACM Symp. on Theory of Computing, pages 284–293, 1995.

[9] K. R. Baker and M. J. Magazine. Workforce scheduling with cyclic demands and day-off constraints. Mgmt. Sc., 24:161–167, 1977.

[10] N. Balakrishnan and R. T. Wong. A network model for the rotating workforce scheduling problem. Networks, 20:25–32, 1990.

[11] Francisco Barahona. On cuts and matchings in planar graphs. Math. Prog., 60:53–68, 1993.

[12] Roberto J. Bayardo and Daniel P. Miranker. An optimal backtrack algorithm for tree-structured constraint satisfaction problems. Artif. Intell., 71:159–181, 1994.

[13] M. Bellare, O. Goldreich, and M. Sudan. Free bits, PCP and non-approximability – towards tight results (v.2). Technical Report No. TR95-024, ECCC, 1995. Extended abstract in FOCS95.

[14] M. Bellare and M. Sudan. Improved non-approximability results. In Proc. 26th ACM Symp. on Theory of Computing, pages 294–304, 1994.

[15] Hachemi Bennaceur and Gerard Plateau. An exact algorithm for the constraint satisfaction problem: Application to logical inference. Inf. Process. Lett., 48(3):151–158, 1993.

[16] P. Berman and G. Schnitger. On the complexity of approximating the independent set problem. Infor. & Comput., 96:77–94, 1992.

[17] M. Biro, M. Hujter, and Z. Tuza. Precoloring extension I: Interval graphs. Discrete Math., 100:267–279, 1992.

[18] Avrim Blum and Joel Spencer. Coloring random and semi-random k-colorable graphs. J. Algorithms, 19:204–234, 1995.

[19] F. T. Boesch and J. F. Gimpel. Covering the points of a digraph with point-disjoint paths and its application to code generation. J. Assoc. Comput. Mach., 24(2):192–198, 1977.

[20] K. S. Booth and G. S. Lueker. Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms. J. Comput. Sys. Sci., 13:335–379, 1976.

[21] R. L. Brooks. On colouring the nodes of a network. In Proc. Cambridge Philos. Soc., volume 37, pages 194–197, 1941.

[22] R. N. Burns and G. J. Koop. A modular approach to optimal multiple-shift manpower scheduling. Ops. Res., 35(1):100–110, 1987.

[23] P. Carraresi and G. Gallo. A multi-level bottleneck assignment approach to the bus drivers' rostering problem. Euro. J. Ops. Res., 16:163–173, 1984.

[24] Paul Catlin. A bound on the chromatic number of a graph. Discrete Math., 22:81–83, 1978.

[25] Martin Cooper. An optimal k-consistency algorithm. Artif. Intell., 41:89–95, 1989.

[26] Martin Cooper, David Cohen, and Peter Jeavons. Characterizing tractable constraints. Artif. Intell., 65:347–361, 1994.

[27] P. Crescenzi and V. Kann. A compendium of NP optimization problems. Dynamic on-line survey available at nada.kth.se, Jan. 1995.

[28] Philippe David. Using pivot consistency to decompose and solve functional CSPs. J. Artif. Intell. Research, 2:447–474, 1995.

[29] Rina Dechter. From local to global consistency.

J. AI

Artif. Intell., 55:87{107, 1992.

[30] Rina Dechter and Judea Pearl. Network-based heuristics for constraint satisfaction. Intell., 34:1{38, 1988. [31] P. Erd os and L. Selfridge. On a combinatorial game. 301, 1989.

Artif.

J. Comb. Theory Series A, 14:298{

[32] Uriel Feige and L aszlo Lovasz. Two-prover one-round proof systems: Their power and their problems. In Proc. 24th ACM Symp. on Theory of Computing, pages 733{744, 1992. [33] L. Fortnow, J. Rompel, and M. Sipser. On the power of multiprover interactive protocols. In Proc. Third Annual Conf. on Structures in Complexity Theory, pages 156{161, 1988.

BIBLIOGRAPHY

93

[34] Eugene C. Freuder. A sucient condition for backtrack-free search. Mach., 29(1):24{32, 1982.

J. Assoc. Comput.

[35] Eugene C. Freuder. Partial constraint satisfaction. In Proc. Int'l Joint Conf. Artif. Intell. (IJCAI-89), pages 278{283, Detriot, MI, 1989. [36] Eugene C. Freuder and Richard J. Wallace. Partial Constraint Satisfaction. 58(1-3):21{70, 1992.

,

Artif. Intell.

[37] Alan Frieze and Mark Jerrum. Improved approximation algorithms for MAX k-CUT and MAX BISECTION. In Proc. 4th IPCO Conf. Springer Verlag Lect. Notes Comp. Sci. (920), 1995. [38] Katsuki Fujisawa and Masakazu Kojima. Sdpa (semide nite programming algorithm) user's manual. Technical Report B-308, Dept. Information Science, Tokyo Inst. of Technology, 1995. Online implementation available at ftp.is.titech.ac.jp under directory /pub/OpsRes/software. [39] M. R. Garey and D. S. Johnson. Computers and Intractability. A Guide to NP-Completeness. W. H. Freeman and Co, New York, 1979. [40] Ian P. Gent and Toby Walsh. An empirical analysis of search in GSAT. 1:47{59, 1993. [41] Matthew L. Ginsberg. Dynamic backtracking.

J. AI Research

the Theory of

,

J. AI Research

, 1:25{46, 1993.

[42] F. Glover and C. McMillan. The general employee scheduling prolem: An integration of MS and AI. Computers Ops. Res., 13(5):563{573, 1986. [43] Michel X. Goemans and David P. Williamson. Approximation algorithms for MAX CUT and MAX 2SAT. In Proc. 26th ACM Symp. on Theory of Computing, pages 422{431, 1994. Full version in J. ACM, 42:1115-1145 . [44] Michel X. Goemans and David P. Williamson. New 3/4 approximation algorithms for the maximum satis ability problem. SIAM J. Disc. Math, 7(4):656{666, 1994. [45] Michelangelo Grigni, Dimitris Papadias, and Christos Papadimitriou. Topologoical inference. In Proc. Int'l Joint Conf. Artif. Intell. (IJCAI-95), pages 901{906, 1995. [46] Jun Gu. Local search for satis ability problem. 23(4):1108{1129, 1993.

,

IEEE Trans. Syst., Man and Cyber.

[47] G. M. Guisewite and P. M. Pardalos. Minimum concave-cost network ow problems: Applications, complexity and algorithms. Annals Ops. Res., 25:75{100, 1990. [48] F. O. Hadlock. Finding a maximum cut of a planar graph in polynomial time. Comput., 4:221{225, 1975.

SIAM J.

[49] Torben Hagerup and Christine Rub. A guided tour of Cherno bounds. Inf. Process. Lett., 33:305{308, 1989. [50] David J. Haglin and Shankar M. Venkatesan. Approximation and intractability results for the maximum cut problem and its variants. IEEE Trans. Computers, 40(1):110{113, 1991.

94

BIBLIOGRAPHY

[51] M. Halldorsson and H. C. Lau. Low-degree graph partitioning via local search with applications to constraint satisfaction, max cut, and 3-coloring. Submitted to J. Graph Algorithms and Applications, 1996. ~ Halldorsson, 1994. Personal communication. [52] Magnus M. [53] D. S. Hochbaum. Ecient bounds for the stable set, vertex cover, and set packing problems. Disc. Applied Math, 6:243{254, 1983. [54] D. S. Hochbaum, N. Megiddo, J. Naor, and A. Tamir. Tight bounds and 2-approximation algorithms for integer programs with two variables per inequality. Math. Prog., 62:69{83, 1993. [55] David S. Johnson. Approximation algorithms for combinatorial problems. J. Comput. Sys. Sci., 9:256{278, 1974. [56] Laveen Kanar and Vipin Kumar, editors. Search in Arti cial Intelligence. Springer Verlag, 1988. [57] V. Kann, S. Khanna, J. Lagergren, and A. Panconesi. On the hardness of approximating MAX k-CUT and its dual. Technical Report TRITA-NA-P9505, Dept. Numerical Analysis and Computing Science, Royal Inst. of Technology, Stockholm, 1995. To appear in Proc. ISTCS 1996. [58] David Karger, Rajeev Motwani, and Madhu Sudan. Approximate graph coloring by semide nite programming. In Proc. 35th IEEE Symp. on Found. of Comp. Sci., pages 2{13, 1994. [59] S. Khanna, R. Motwani, M. Sudan, and U. Vazirani. On syntactic versus computational views of approximability. In Proc. 35th IEEE Symp. on Found. of Comp. Sci., pages 819{830, 1994. [60] C. M. Khoong and H. C. Lau. ROMAN: An integrated approach to manpower planning and scheduling. In O. Balci, R. Sharda, and S. Zenios, editors, CS & OR: New Development in Their Interfaces, pages 383{396. Pergammon Press, 1992. [61] C. M. Khoong, H. C. Lau, and L. W. Chew. Automated manpower rostering: Techniques and experience. Int. Trans. Opl. Res., 1(3):353{361, 1994. [62] Lefteris M. Kirousis. Fast parallel constraint satisfaction. In Proc. Int'l Coll. Automata, Lang. and Programming (ICALP). Springer Verlag Lect. Notes Comp. Sci., 1993. 
Full version appears in Artif. Intell., 64:147-160. [63] G. Koop. Multiple shift workforce lower bounds. Mgmt. Sc., 34(10):1221{1230, 1988. [64] Elias Koutsoupias and Christos H. Papadimitriou. On the greedy algorithm for satis ability. Inf. Process. Lett., 43:53{55, 1992. [65] P. Lancaster and M Tismenetsky. The Theory of Matrices. Academic Press, Orlando, FL, 1985. [66] H. C. Lau. Preference-based scheduling via constraint satisfaction. In Proc. Int'l. Conf. on Optimiz. Techniques and Applications (ICOTA'92), pages 546{554, 1992.

BIBLIOGRAPHY

95

[67] H. C. Lau. Manpower scheduling with shift change constraints. In Proc. 5th Int'l. Symp. Algorithms and Computation (ISAAC), pages 616{624. Springer Verlag Lect. Notes Comp. Sci. (834), 1994. Journal version in Trans. Infor. Proc. Soc. of Japan, 36(5), 1995, 12711279. [68] H. C. Lau. Approximation of constraint satisfaction via local search. In Proc. 4th Wrksp. on Algorithms and Data Structures (WADS), pages 461{472. Springer Verlag Lect. Notes Comp. Sci. (955), 1995. [69] H. C. Lau. Combinatorial approaches for hard problems in manpower scheduling. J. Ops. Res. Soc. of Japan, 39(1):88{98, 1996. [70] H. C. Lau. A new approach for weighted constraint satisfaction: Theoretical and computational results. In Proc. 2nd Int'l Conf. Principles and Practice of Constraint Programming (CP). Springer Verlag Lect. Notes Comp. Sci., 1996. To appear. [71] H. C. Lau. On the complexity of manpower shift scheduling. Computers Ops. Res., 23(1):93{102, 1996. [72] H. C. Lau. Probabilistic analysis of local search and NP-completeness result for constraint satisfaction. In Proc. 2nd Int'l Conf. on Computing and Combinatorics (COCOON). Springer Verlag Lect. Notes Comp. Sci., 1996. To appear. [73] H. C. Lau. Probabilistic and analysis of local search for random instances of constraint satisfaction. In Proc. 12th European Conf. on Arti cial Intelligence (ECAI), 1996. To appear. [74] H. C. Lau and O. Watanabe. Randomized approximation of the constraint satisfaction problem. In Proc. Fifth Scandinavian Wrksp. on Algorithm Theory (SWAT). Springer Verlag Lect. Notes Comp. Sci., 1996. To appear. [75] N. Linial. Locality in distributed graph algorithms. SIAM Journal on Computing, 21:193{ 201, 1992. [76] L. Lovasz. On decomposition of graphs. Studia Sci. Math. Hungar., 1:237{238, 1966. [77] L. Lovasz. Three short proofs in graph theory. J. Combin. Theory Ser. B, 19:269{271, 1975. [78] C. Lund and M. Yannakakis. On the hardness of approximating minimization problems. In Proc. 25th ACM Symp. 
on Theory of Computing, pages 286{293, 1993. [79] Alan K. MacWorth. Consistency in networks of relations. Artif. Intell., 8:99{118, 1977. [80] S. Mahajan and H. Ramesh. Derandomizing semide nite programming based approximation algorithms. In Proc. 36th IEEE Symp. on Found. of Comp. Sci., pages 162{168, 1995. [81] S. Martello and P. Toth. A heuristic approach to the bus driver scheduling problem. Euro. J. Ops. Res., 24(1):106{117, 1986. [82] Steve Minton, Mark D. Johnson, Andrew B. Philips, and Philip Laird. Minimizing con icts: a heuristic repair method for constraint satisfaction and scheduling problems. Artif. Intell., 58:161{205, 1992. [83] D. S. Mitrinovic. Analytic Inequalities. Springer Verlag, Heidelberg, 1970.

96

BIBLIOGRAPHY

[84] J. G. Morris and M. J. Showalter. Simple approaches to shift, day o and tour scheduling problems. Mgmt. Sc., 29(8):942{950, 1983. [85] Barrett O'Neill. Elementary Di erential Geometry. Academic Press, New York, 1966. [86] Takao Ono, Tomio Hirata, and Takao Asano. An approximation algorithm for MAX 3SAT. In Proc. Symp. on Algorithms and Computation (ISAAC). Springer Verlag Lect. Notes Comp. Sci., 1995. To appear. [87] P. S. Ow, S. F. Smith, and A. Thireiz. Reactive plan revision. In Proc. Nat'l Conf. on Artif. Intell. (AAAI-88), pages 77{82, St Paul, MN, 1988. [88] Christos H. Papadimitriou and Mihalis Yannakakis. Optimization, approximation, and complexity classes. J. Comput. Sys. Sci., 43:425{440, 1991. [89] E. Petrank. The hardness of approximation: Gap location. Comput. Complexity, 4:133{ 157, 1994. [90] Svatopluk Poljak. Integer linear programs and local search for max-cut. SIAM J. Comput., 24(4):822{839, 1995. [91] Svatopluk Poljak and Daniel Turzik. A polynomial time algorithm for constructing a large bipartite subgraph, with an application to a satis ability problem. Can. J. Math, 34(3):519{524, 1982. [92] Patrick Prosser. Binary constraint satisfaction problems, some are harder than others. In Proc. ECAI, pages 95{98. John Wiley and Sons Ltd, 1994. [93] P. Raghavan and C. D. Thompson. Randomized rounding: A technique for provably good algorithms and algorithmic proofs. Combinatorica, 7(4):365{374, 1987. [94] Ran Raz. A parallel repetition theorem. In Proc. 27th ACM Symp. on Theory of Computing, pages 447{456, 1995. [95] S. Sahni and T. Gonzalez. P-complete approximation problems. J. Assoc. Comput. Mach., 23:555{565, 1976. [96] T. J. Schaefer. The complexity of satis ability problems. In Proc. 10th ACM Symp. on Theory of Computing, pages 216{226, 1993. [97] Alejandro A. Scha er and Mihalis Yannakakis. Simple local search problems that are hard to solve. SIAM J. Comput., 20(1):56{87, 1991. [98] Bart Selman and Henry A. Kautz. 
An empirical study of greedy local search for satis ability testing. In Proc. Nat'l Conf. on Artif. Intell. (AAAI-93), pages 46{51, 1993. [99] Bart Selman, Hector Levesque, and David Mitchell. A new method for solving hard satis ability problems. In Proc. AAAI, pages 440{446, St Jose, CA, 1992. [100] Barbara Smith. Phase transition and the mushy region in constraint satisfaction problems. In Proc. ECAI, pages 100{104. John Wiley and Sons Ltd, 1994. [101] Mario Szegedy and Sundar Vishwanathan. Locality based graph coloring. In Proc. 25th ACM Symp. on Theory of Computing, pages 201{207, 1993. [102] J. Tien and A. Kamiyama. On manpower scheduling algorithms. SIAM Review, 24(3):275{ 287, 1982.

BIBLIOGRAPHY [103] Edward Tsang.

97 . Academic Press, 1993.

Foundations of Constraint Satisfaction

[104] Jonathan S. Turner. Almost all k-colorable graphs are easy to color. 9:63{82, 1988.

,

J. Algorithms

[105] Peter van Beek. On-line C-programs available at ftp.cs.ualberta.ca under directory /pub/ai/csp. [106] Peter van Beek. On the minimality and decomposability of constraint networks. In Nat'l Conf. on Artif. Intell. (AAAI-92), pages 447{452, St Jose, CA, 1992.

Proc.

[107] Peter van Beek and Rina Dechter. Constraint tightness vs global consistency. In Proc. Int'l Conf. Principles of Knowledge Representation and Reasoning, pages 572{582, 1994. [108] Peter van Beek and Rina Dechter. On the minimality and global consistency of row-convex constraint networks. J. Assoc. Comput. Mach., 42(3):543{561, 1995. [109] Paul Vitanyi. How well can a graph be n-colored?

, 34:69{80, 1981.

Discrete Math.

[110] Chris Voudouris and Edward Tsang. Partial constraint satisfaction problems and guided local search. Technical Report No. CSM-250, Dept. Computer Science, University of Essex, 1995. [111] Richard J. Wallace, 1995. Personal communication. [112] Richard J. Wallace. Directed arc consistency preprocessing as a strategy for maximal constraint satisfaction. In M. Meyer, editor, Constraint Processing, pages 121{138. Springer Verlag Lect. Notes Comp. Sci. (923), 1995. [113] Richard J. Wallace and Eugene C. Freuder. Conjunctive width heuristics for maximal constraint satisfaction. In Proc. Nat'l Conf. on Artif. Intell. (AAAI-93), pages 762{768, 1993. [114] Richard J. Wallace and Eugene C. Freuder. Heuristic methods for over-constrained constraint satisfaction problems. In CP95 Wrksp. on Over-constrained Systems, 1995. [115] Mihalis Yannakakis. The analysis of local search problems and their heuristics. In Proc. Symp. Theo. Aspects of Comp. Sci. (STACS), pages 298{311. Springer Verlag Lect. Notes Comp. Sci., 1990. [116] Mihalis Yannakakis. On the approximation of maximum satis ability. 17:475{502, 1994.

,

J. Algorithms

[117] Nobuhiro Yugami, Yuiko Ohta, and Hirotaka Hara. Improving repair-based constraint satisfaction methods by value propagation. In Proc. AAAI, Seattle, WA, 1994. [118] Ramin Zabih. Some applications of graph bandwidth to constraint satisfaction problems. In Proc. Nat'l Conf. on Artif. Intell. (AAAI-90), pages 46{51, Boston, MA, 1990.