The 15th International Conference on Principles and Practice of Constraint Programming Doctoral Program Proceedings

September 20 - 24 2009


Welcome to the proceedings of the 2009 Constraint Programming Doctoral Program, held in conjunction with the 15th annual Constraint Programming conference in Lisbon, Portugal. The doctoral program is open to PhD students in all areas related to constraint programming. All participants present work either within the doctoral program or in the main Constraint Programming Conference. The papers in these proceedings are those which were submitted directly to the doctoral program. They cover a wide variety of work, either completed or in progress, being undertaken by the current generation of PhD students. We also list all students who have papers accepted into the main conference. The Doctoral Program would not be possible without the support of many people and organisations. In particular, we would like to thank the program committee members and the sponsors of the Constraint Programming conference. We hope you have an enjoyable time in Portugal and a productive conference.

Karen Petrie and Olivia Smith
Doctoral Program Chairs, 2009

The Program Committee consists of:
Peter Nightingale (University of St Andrews)
Chris Jefferson (University of St Andrews)
Neil Yorke-Smith (SRI International)
Zeynep Kiziltan (University of Bologna)
Sebastian Brand (University of Melbourne)
Hubie Chen (Universitat Pompeu Fabra)
Claude-Guy Quimper (University of Waterloo)
Standa Zivny (University of Oxford)
Justin Pearson (Uppsala University)
Roland Yap (National University of Singapore)


Students with papers accepted into the main conference:
Carleton Coffrin (Brown University)
Alberto Delgado (IT University of Copenhagen)
Aurelie Favier (INRA, Centre de recherches de Toulouse)
Mohammad Fazel-Zarandi (University of Toronto)
Andy Grayland (University of St Andrews, Scotland)
Serdar Kadioglu (Brown University)
Ronan Le Bras (Centre de recherche sur les transports)
Nina Naroditskaya (University of New South Wales and NICTA, Sydney, Australia)
Aurelien Rizk (INRIA)
Elaine Sonderegger (University of Connecticut)
David Stynes (Cork Constraint Computation Centre, University College Cork)
Kevin Tierney (Brown University)
Mohamed Wahbi (LIRMM/CNRS, U. of Montpellier 2, France and LIMIARF/FSR, U. of Mohammed V Agdal, Morocco)
Justin Yip (Brown University)
Alessandro Zanarini (Ecole Polytechnique de Montreal)
Standa Zivny (University of Oxford)


Table of Contents

A Filtering Technique for Non-normalized CSPs. Marlene Arangu, Miguel A. Salido and Federico Barber ... 1
Continuous Search in Constraint Programming: An Initial Investigation. Alejandro Arbelaez and Youssef Hamadi ... 7
Constraint Based Languages for Biological Reactions. Marco Bottalico and Stefano Bistarelli ... 13
Capturing fair computations on Concurrent Constraint Languages. Paola Campli and Stefano Bistarelli ... 19
Finding Stable Solutions in Constraint Satisfaction Problems. Laura Climent, Miguel A. Salido and Federico Barber ... 25
Preliminary studies of BnB-ADOPT related with Soft Arc Consistency. Patricia Gutierrez and Pedro Meseguer ... 31
An automaton Constraint for Local Search. Jun He, Pierre Flener and Justin Pearson ... 37
Research Overview: Improved Boolean Satisfiability Techniques for Haplotype Inference. Eric Hsu and Sheila McIlraith ... 43
Foundations of Symmetry Breaking Revisited. Tim Januschowski, Barbara Smith and Marc van Dongen ... 48
Energetic Edge-Finder For Cumulative Resource. Roger Kameugne and Laure Pauline Fotso ... 54
Dominion – A constraint solver generator. Lars Kotthoff, Ian Miguel and Ian Gent ... 64
On learning CSP specifications. Matthieu Lopez and Arnaud Lallouet ... 70
Propagating equalities and disequalities. Neil Charles Armour Moore, Ian Gent and Ian Miguel ... 76
Tractable Benchmarks. Justyna Petke and Peter Jeavons ... 82
A simple efficient exact algorithm based on independent set for Maxclique problem. Zhe Quan and Chu Min Li ... 88
Exploring Local Acyclicity within Russian Doll Search. Margarita Razgon and Gregory M. Provan ... 94
Optimal Solutions for Conversational Recommender Systems Based on Comparative Preferences and Linear Inequalities. Walid Trabelsi, Nic Wilson and Derek Bridge ... 100
A multithreaded solving algorithm for QCSP+. Jeremie Vautard and Arnaud Lallouet ... 106


A Filtering Technique for Non-normalized CSPs
Marlene Arangú (Student), Miguel A. Salido and Federico Barber (Supervisors)
Instituto de Automática e Informática Industrial, Universidad Politécnica de Valencia, Valencia, Spain

Abstract. In this work we present a new algorithm, called 2-C3OP, that achieves 2-consistency in binary and non-normalized CSPs. This algorithm is a reformulation of the 2-C3 algorithm and it performs the constraint checks bidirectionally using inference. The evaluation section shows that 2-C3OP achieves 2-consistency like 2-C3 but is 40% faster, and it is able to prune more of the search space than AC3. Furthermore, 2-C3OP performs fewer constraint checks than both AC3 and 2-C3.

1 Introduction

Proposing efficient algorithms for enforcing arc-consistency (which involves two variables) has always been considered a central question in the constraint reasoning community. Thus, there are many arc-consistency algorithms such as AC1, AC2, and AC3 [13]; AC4 [14]; AC5 [15, 12]; AC6 [5]; AC7 [6]; AC8 [8]; AC2001/AC3.1 [7]; and more. However, AC3 and AC4 are the most often used [3]. Algorithms that perform arc-consistency have focused their improvements on time complexity and space complexity. The main improvements have been achieved by changing the way of propagation, appending new structures, performing bidirectional search, changing the support search, improving the propagation, etc. However, little work has been done on developing algorithms to achieve 2-consistency in binary CSPs (binary: all constraints involve only two variables). The concept of consistency was generalized to k-consistency by [11], and an optimal k-consistency algorithm for the labeling problem was proposed by [9]. Nevertheless, it is only based on normalized CSPs. If the constraints in k-consistency have two variables (k = 2), then we talk about 2-consistency. Also, much work on arc-consistency made the simplifying assumption that CSPs are binary and normalized (two different constraints do not involve exactly the same variables), because these notions are much simpler and new concepts are easier to present. But [16] points out a strange effect of associating arc-consistency with binary normalized CSPs: the confusion between the notions of arc-consistency and 2-consistency (2-consistency guarantees that any instantiation of a value to a variable can be consistently extended to any second variable). On binary CSPs, 2-consistency is at least as strong as arc-consistency. When the CSP is binary and normalized, arc-consistency and 2-consistency perform the same pruning. A non-normalized CSP can be transformed into a normalized one by using the intersection of valid tuples [2], but this can be a very hard task in problems with large domains.

We will focus our attention on binary and non-normalized CSPs. Figure 1 (left) shows a binary CSP with two variables X1 and X2, D1 = D2 = {1, 2, 3}, and two constraints R12: X1 ≤ X2 and R'12: X1 ≠ X2, presented in [16]. It can be observed that this CSP is arc-consistent due to the fact that every value of every variable has a support for constraints R12 and R'12. In this case, arc-consistency does not prune any value of the domains of variables X1 and X2. However, as the authors say in [16], this CSP is not 2-consistent, because the instantiation X1 = 3 cannot be extended to X2 and the instantiation X2 = 1 cannot be extended to X1. Thus, Figure 1 (right) presents the resulting CSP filtered by arc-consistency and by 2-consistency. It can be observed that 2-consistency is at least as strong as arc-consistency.

Fig. 1. Example of Binary CSP.

By following standard notations and definitions in the literature [4], [3], [10], we summarize the basic definitions that we will use in the rest of the paper. A Constraint Satisfaction Problem (CSP) is a triple P = ⟨X, D, R⟩ where X is a finite set of variables {X1, X2, ..., Xn}; D is a set of domains D = {D1, D2, ..., Dn} such that for each variable Xi ∈ X there is a finite set of values that variable Xi can take; and R is a finite set of constraints R = {R1, R2, ..., Rm} which restrict the values that the variables can simultaneously take. We denote Rij ≡ (Rij, 1) ∨ (Rij, 3)¹ as the direct constraint defined over the variables Xi and Xj, and Rji ≡ (Rij, 2) as the same constraint in the inverse direction over the variables Xi and Xj (inverse constraint). A block of constraints Cij is a set of binary constraints that involve the same variables Xi and Xj. An instantiation is a pair (Xi = a) that represents an assignment of the value a to the variable Xi, where a is in Di. A constraint Rij is satisfied if the instantiation Xi = a, Xj = b holds in Rij. Constraint symmetry: if the value b ∈ Dj supports a value a ∈ Di, then a supports b as well. Arc-consistency: a value a ∈ Di is arc-consistent relative to Xj iff there exists a value b ∈ Dj such that (Xi, a) and (Xj, b) satisfy the constraint Rij ((Xi = a, Xj = b) ∈ Rij). A variable Xi is arc-consistent relative to Xj iff all values in Di are arc-consistent. A CSP is arc-consistent iff all the variables are arc-consistent, i.e., all the constraints Rij and Rji are arc-consistent. (Note: here we are talking about full arc-consistency.)

¹ More information regarding the direct constraint is presented in the next section.

2-consistency: a value a ∈ Di is 2-consistent relative to Xj iff there exists a value b ∈ Dj such that (Xi, a) and (Xj, b) satisfy all the constraints Rᵏij (∀k : (Xi = a, Xj = b) ∈ Rᵏij). A variable Xi is 2-consistent relative to Xj iff all values in Di are 2-consistent. A CSP is 2-consistent iff all the variables are 2-consistent, i.e., any instantiation of a value to a variable can be consistently extended to any second variable.
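To make the difference concrete, here is a minimal Python check (not part of the paper) on the CSP of Figure 1, where D1 = D2 = {1, 2, 3} and the block C12 contains X1 ≤ X2 and X1 ≠ X2:

    D1 = D2 = {1, 2, 3}
    block = [lambda a, b: a <= b, lambda a, b: a != b]   # the block C12

    def arc_consistent(a, constraint):
        # a has a support for this single constraint taken in isolation
        return any(constraint(a, b) for b in D2)

    def two_consistent(a):
        # a can be extended to some b that satisfies *all* constraints of the block
        return any(all(c(a, b) for c in block) for b in D2)

    for a in D1:
        print(a, [arc_consistent(a, c) for c in block], two_consistent(a))
    # Every value passes each constraint separately (arc-consistency),
    # but X1 = 3 has no joint support, so 2-consistency prunes it.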

2 Algorithm 2-C3OP

We present a new coarse-grained algorithm called 2-C3OP that achieves 2-consistency in binary and non-normalized CSPs (see Algorithm 1). This algorithm deals with blocks of constraints as 2-C3 [1] does, but it only requires keeping half of the blocks of constraints in Q and calls two procedures: Revise and AddQ. 2-C3OP performs bidirectional checks (as AC7 does) and performs inference to avoid unnecessary checks. This is done by using structures that are shared by all the constraints:
– suppInv: a vector whose size is the maximum size of all domains (maxD); it stores the value of Xi that supports each value of Xj.
– minSupp: an integer variable that stores the first value b ∈ Dj that supports any a ∈ Di. By using minSupp, inference is carried out to avoid unnecessary checks, because all b ∈ Dj with b < minSupp are pruned without any check once suppInv is evaluated.
– t: an integer parameter that takes values from {1, 2, 3}. This value is used to guide the Revise procedure (direct or inverse order, bidirectional search; see below).
Initially the 2-C3OP procedure stores in the queue Q the constraint blocks (Cij, t) with t = 1. Then, a simple loop is performed to select and revise the blocks of constraints stored in Q until no change occurs (Q is empty) or the domain of a variable becomes empty. The first case ensures that every value of every domain is 2-consistent; in the second case the problem is reported as inconsistent. The AddQ procedure adds tuples to be evaluated again. The added tuples depend on the tuples stored in Q and on the variable that changed. Depending on the value of t generated by AddQ, the Revise procedure will guide the search. If t = 1 the search is fully bidirectional (direct and inverse); if t = 2 the search is inverse, and it is also direct if and only if some pruning is carried out; finally, if t = 3 the search is direct, and it is also inverse if and only if some pruning is carried out. The Revise procedure (see Algorithm 2) requires two internal variables, changei and changej. They are initialized to zero and are used to remember which domains were pruned. For instance, if Di was pruned then changei = 1, and if Dj was pruned then changej = 2; if both Di and Dj were pruned then change = 3 (because change = changei + changej). During the loop of steps 3-9, each value in Di is checked².

² If t = 2, the inverse operator is used.

Algorithm 1: Procedure 2-C3OP
Input: A CSP, P = ⟨X, D, R⟩
Result: true and P' (which is 2-consistent), or false and P' (which is 2-inconsistent because some domain remains empty)
begin
 1  for every i, j do
 2      Cij = ∅
 3  for every arc Rij ∈ R do
 4      Cij ← Cij ∪ Rij
 5  for every set Cij do
 6      Q ← Q ∪ {(Cij, t) : t = 1}
 7  for each d ∈ Dmax do
 8      suppInv[d] = 0
 9  while Q ≠ ∅ do
10      select and delete (Cij, t) from queue Q, with t ∈ {1, 2, 3}
11      change = Revise((Cij, t))
12      if change > 0 then
13          if change ≥ 1 ∧ change ≤ 3 then
14              Q ← Q ∪ AddQ(change, (Cij, t))
15          else
16              return false   /* empty domain */
17  return true
end

Algorithm 2: Procedure Revise
Input: A CSP P' defined by two variables X = (Xi, Xj), domains Di and Dj, a tuple (Cij, t) and the vector suppInv.
Result: Di such that Xi is 2-consistent relative to Xj, Dj such that Xj is 2-consistent relative to Xi, and the integer variable change
begin
 1  changei = 0; changej = 0
 2  minSupp = dummyvalue
 3  for each a ∈ Di do
 4      if ∄ b ∈ Dj such that (Xi = a, Xj = b) ∈ (Cij, t) then
 5          remove a from Di
 6          changei = 1
 7      else
 8          suppInv[b] = a
 9          if minSupp = dummyvalue then minSupp = b
10  if ([(t = 2 ∨ t = 3) ∧ changei = 1] ∨ t = 1) then
11      for each b ∈ Dj do
12          if b < minSupp then
13              remove b from Dj
14              changej = 2
15          else
16              if suppInv[b] > 0 then
17                  suppInv[b] = 0
18              else
19                  if ∄ a ∈ Di such that (Xi = a, Xj = b) ∈ (Cij, t) then
20                      remove b from Dj
21                      changej = 2
22  change = changei + changej
23  return change
end

If the value b ∈ Dj supports the value a ∈ Di, then suppInv[b] = a, due to the fact that this relies on the symmetry of the constraint (the support is bidirectional). Furthermore, the first value b ∈ Dj that supports a value in Di is stored in minSupp. The second part of Algorithm 2 depends on the values of t and changei. If t = 2 or t = 3 and changei = 0, then Cij does not need to be checked, because the constraint did not generate any pruning in the previous loop. However, if t = 1 then Cij requires a full bidirectional evaluation; if t = 2 or t = 3 and changei = 1, then Cij also requires a full bidirectional evaluation. In both cases, the processing of suppInv can be done in three different ways: 1) the values b ∈ Dj with b < minSupp are pruned without performing any check, and the variable changej is updated to changej = 2 to indicate that a change occurred in the second loop; 2) the values b with suppInv[b] > 0 are not checked, because they are already known to be supported in Di (they are reset to 0 for further use of this vector); 3) the values b with b > minSupp and suppInv[b] = 0 are checked until a support a is found, and are removed otherwise. In the last case changej is set to 2 to indicate that a change occurred in the second loop.
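For illustration, the following Python sketch mirrors the Revise step described above; it is not the authors' C implementation, and it simplifies the bookkeeping by using a per-call dictionary instead of the shared suppInv vector (so no reset to 0 is needed):

    def revise(Di, Dj, block, t):
        """Di, Dj: sorted lists of values; block: list of binary predicates over (a, b)."""
        change_i, change_j = 0, 0
        supp_inv = {}          # supp_inv[b] = a: b in Dj already supports some a in Di
        min_supp = None        # first support found in Dj

        # Steps 3-9: prune Di, recording inverse supports as a by-product.
        for a in list(Di):
            b = next((v for v in Dj if all(c(a, v) for c in block)), None)
            if b is None:
                Di.remove(a)
                change_i = 1
            else:
                supp_inv[b] = a
                if min_supp is None:
                    min_supp = b

        # Steps 10-21: prune Dj, reusing the recorded information.
        if t == 1 or (t in (2, 3) and change_i == 1):
            for b in list(Dj):
                if min_supp is not None and b < min_supp:
                    Dj.remove(b)               # pruned by the minSupp inference, no checks
                    change_j = 2
                elif b in supp_inv:
                    continue                   # already known to be supported
                elif not any(all(c(a, b) for c in block) for a in Di):
                    Dj.remove(b)
                    change_j = 2
        return change_i + change_j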

3 Experimental Results

The experiments were performed on random instances characterized by the 4-tuple ⟨n, d, m, c⟩, where n is the number of variables, d the domain size, m the total number of binary constraints and c the number of constraints in each block. Constraints were implicitly encoded. We evaluated 50 test cases for each type of problem. Performance was measured in terms of the number of values pruned, constraint checks and computation time. All algorithms were implemented in C on a PC with 1 GB RAM and a 3.0 GHz Pentium IV processor.

Table 1. Number of prunings and constraint checks using AC3, 2-C3 and 2-C3OP on 100% non-normalized problems ⟨n, 20, 800, 2⟩.

          AC3 (arc-consistency)      2-C3 (2-consistency)      2-C3OP (2-consistency)
   n    prunings  time   checks     prunings  time   checks        time    checks
  50       50       9    109001       140      20     66774         10     58778
  70       70       9    109401       140      20     64816         13     56816
  90       90      11    109801       140      20     63296         13     55296
 110      110      14    110201       150      20     62801         13     54801
 130      130      11    110601       170      23     63001         13     55001
 150      150      12    111001       180      24     62616         16     54616

Table 1 shows that the number of prunings was lower (by 53%) with AC3 than with both 2-C3 and 2-C3OP in all cases. This is due to the fact that the problems start with the same domain size and they have constraints with the operators =, ≠, ≤, or ≥, so AC3 does not prune any value by analyzing the constraints individually. However,

2-C3 and 2-C3OP analyze the blocks of constraints, which mix these operators, and are therefore able to prune more of the search space. Furthermore, 2-C3OP performed fewer constraint checks than AC3 and 2-C3, by 50% and 13% respectively. Finally, 2-C3OP was 40% faster than 2-C3.

4 Conclusions and Further Work

Filtering techniques are widely used to prune the search space of CSPs. In this paper, we have presented the 2-C3OP algorithm, a reformulation of the 2-C3 algorithm, which achieves 2-consistency in binary and non-normalized CSPs using bidirectionality and inference. The evaluation section shows that 2-C3OP achieves 2-consistency and is thus able to prune more of the search space than AC3. Furthermore, 2-C3OP performs fewer constraint checks than both AC3 and 2-C3, and its computation time is close to that of AC3 and better than that of 2-C3 by 40%. In further work, we will focus our attention on improved versions of 2-C3OP that avoid redundant checks by using the last value support (as in AC2001/AC3.1 [7]). Furthermore, our goal is to apply these filtering techniques to distributed CSPs.

References
1. Marlene Arangú, Miguel Salido, and Federico Barber. 2-C3: From arc-consistency to 2-consistency. In SARA 2009: The Eighth Symposium on Abstraction, Reformulation and Approximation. To appear, 2009.
2. Marlene Arangú, Miguel Salido, and Federico Barber. Extending arc-consistency algorithms for non-normalized CSPs. In AI-2009: Twenty-ninth SGAI International Conference on Artificial Intelligence. To appear, 2009.
3. Roman Barták. Theory and practice of constraint propagation. In J. Figwer, editor, Proceedings of the 3rd Workshop on Constraint Programming in Decision and Control, 2001.
4. Christian Bessiere. Constraint propagation. Technical report, CNRS/University of Montpellier, 2006.
5. Christian Bessiere and Marie Cordier. Arc-consistency and arc-consistency again. In Proc. of AAAI'93, pages 108–113, Washington, USA, July 1993.
6. Christian Bessiere, E. C. Freuder, and J. C. Régin. Using constraint metaknowledge to reduce arc consistency computation. Artificial Intelligence, 107:125–148, 1999.
7. Christian Bessiere, J. C. Régin, Roland Yap, and Yuanling Zhang. An optimal coarse-grained arc-consistency algorithm. Artificial Intelligence, 165:165–185, 2005.
8. Assef Chmeiss and Philippe Jegou. Efficient path-consistency propagation. International Journal on Artificial Intelligence Tools, 7:121–142, 1998.
9. Martin Cooper. An optimal k-consistency algorithm. Artificial Intelligence, 41:89–95, 1994.
10. R. Dechter. Constraint Processing. Morgan Kaufmann, 2003.
11. Eugene Freuder. Synthesizing constraint expressions. Communications of the ACM, 21:958–966, 1978.
12. P. Van Hentenryck, Y. Deville, and C. M. Teng. A generic arc-consistency algorithm and its specializations. Artificial Intelligence, 57:291–321, 1992.
13. A. K. Mackworth. Consistency in networks of relations. Artificial Intelligence, 8:99–118, 1977.
14. R. Mohr and T. C. Henderson. Arc and path consistency revisited. Artificial Intelligence, 28:225–233, 1986.
15. M. Perlin. Arc consistency for factorable relations. Artificial Intelligence, 53:329–342, 1992.
16. F. Rossi, P. Van Beek, and T. Walsh. Handbook of Constraint Programming. Elsevier, 2006.


Continuous Search in Constraint Programming: An Initial Investigation
Alejandro Arbelaez¹ (student) and Youssef Hamadi¹,²
¹ Microsoft-INRIA joint-lab, Orsay, France, [email protected]
² Microsoft Research, Cambridge, United Kingdom, [email protected]

Abstract. In this work, we present the concept of Continuous Search, the objective of which is to allow any user to eventually get top performance from their constraint solver. Unlike previous approaches (see [9] for a recent survey), Continuous Search does not require a large set of representative instances to be available for proper training and parameter learning. It only assumes that once the solver runs in a real situation (often called production mode), instances will come over time and allow for proper offline continuous training. The objective is therefore not to instantly provide good parameters for top performance, but to take advantage of the real situation to train in the background and improve the performance of the system in an incremental manner.

1 Introduction

In Constraint Programming, properly crafting a constraint model which captures all the constraints of a particular problem is often not enough to ensure acceptable runtime performance. One way to improve efficiency is to use well-known tricks like redundant and channeling constraints, or to be aware that the constraint solver has a particular global constraint that can do part of the job more efficiently. The problem with these improvements (or tricks) is that they are far from obvious. Indeed, they do not change the solution space of the original model, and for a normal user (with a classical mathematical background) it is difficult to understand why adding redundancy helps. Because of that, normal users are often left with the tedious task of tuning the search parameters of their constraint solver, and this, again, is both time consuming and not necessarily straightforward. Indeed, even if tuning is conceptually simple (try different parameters, pick the best), it requires a set of representative instances in order to work properly. This might be obvious for a constraint programmer, but not for a normal user, who could train on instances far different from the ones faced by his application. In this work, we present the concept of Continuous Search (CS), the objective of which is to allow any user to eventually get top performance from their constraint solver. Unlike previous approaches (see [9] for a recent survey), Continuous Search does not require a large set of representative

instances to be available in order to properly train and learn parameters. It only assumes that once the solver runs in a real situation (often called production mode), instances will come over time, allowing for proper offline continuous training. The objective is therefore not to instantly provide good parameters for top performance, but to take advantage of the real situation to train in the background and improve the performance of the system in an incremental manner. The Continuous Search paradigm uses an online learning algorithm to update a prediction function which matches instance features to the most efficient set of parameters for a given instance (e.g. variable/value selection algorithms). Since CS can start without offline training, this prediction function might be initially undefined. If this is the case, an instance is solved by running the solver with its default parameters. Once the instance is solved and the solution is given back to the application¹, we start our Continuous Search training. At that point, the instance is reused and the goal is to refine the strategy used to tackle it. This is done by a specific repair-like algorithm which perturbs the strategy to find a new strategy able to solve the problem more efficiently. If such a strategy is found, its parameters are stored with the instance features and can therefore be reused to solve similar instances more efficiently. Technically, this means that there are two different search efforts: the one done to solve the real problem, and the one related to the long-term improvement of the constraint solver. We believe that this extra use of computational resources is realistic, since today's systems (especially production ones) are almost always on. Moreover, this has to be balanced against the huge computational cost of offline training [10]. Last, this late adaptation is the only way to face the 'real' instances and even to adapt to changes in the modeling or to the arrival of a completely new class of problems. The paper is organized as follows. Background material is presented in Section 2. Section 3 introduces the continuous search paradigm. Section 4 presents experimental results. Finally, before our general conclusion, Section 5 presents related work.
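As a purely illustrative sketch of these two search efforts (all names here, such as solve, features and model, are hypothetical placeholders rather than the authors' Gecode-based implementation), the cycle could be organised as follows in Python:

    def continuous_search(instances, heuristics, default, solve, features, model):
        """instances: iterable of CSPs; solve(inst, heuristic) -> runtime;
        features(inst) -> feature vector; model: online classifier."""
        for inst in instances:
            # Production mode: use the learned predictor if any,
            # otherwise fall back to the default heuristic.
            chosen = model.predict(features(inst)) or default
            solve(inst, chosen)                 # result is returned to the application
            # Background training: replay the solved instance with the other
            # candidates and turn the comparisons into online training examples.
            baseline = solve(inst, default)
            for cand in heuristics:
                label = +1 if solve(inst, cand) < baseline else -1
                model.update(features(inst), cand, label)
        return model

The point of the sketch is only the control flow: the instance is answered first, and the extra runs that generate training examples happen afterwards, off the critical path.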

2 Background

In this section, we briefly introduce definitions and notations used hereafter.

2.1 Constraint Satisfaction Problems

Definition 1. A Constraint Satisfaction Problem (CSP) is a triple (X, D, C) where:
– X = {X1, X2, ..., Xn} represents a set of n variables.
– D = {D1, D2, ..., Dn} represents the set of associated domains, i.e., possible values for the variables.
– C = {C1, C2, ..., Cm} represents a finite set of constraints.

¹ Since our standpoint is real settings, we consider the full application stack, where solvers are not isolated pieces of software called from a command line but are critically embedded in large and complex applications.

Each constraint Ci is associated with a set of variables vars(Ci) and is used to restrict the combinations of values between these variables. Similarly, the degree deg(Xi) of a variable is the number of constraints associated with Xi, and dom(Xi) corresponds to the current domain of Xi. Solving a CSP involves finding a solution, i.e., an assignment of values to variables such that all constraints are satisfied. If a solution exists the problem is stated as satisfiable, and unsatisfiable otherwise. In this paper, we consider four well-known variable selection heuristics: min-dom selects the variable with the smallest domain [4]; wdeg [2] selects the variable which is involved in the most failed constraints; dom/wdeg [2] selects the variable which minimizes the ratio dom/wdeg; and impacts [7] aims to select the variable-value pair that maximizes the reduction of the remaining search space.

2.2 Support Vector Machines

Among the prominent Machine Learning (ML) algorithms are Support Vector Machines (SVM) [3]. This algorithm is widely used for binary classification due to its statistical learning properties: it determines the separating hyperplane with maximum distance (or margin) to the closest examples (the so-called support vectors) of the training set. Learning a high-quality model depends on the quality of the training set. On the one hand, the description of the examples must make it possible to discriminate between positive and negative examples; on the other hand, the available examples must make it possible to accurately localize the frontier between the two classes.

3 Continuous Search in Constraint Programming

The goal of this paper is to take advantage of real-world situations, as shown in Figure 1, where instances are presented one at a time. Therefore, in Continuous Search settings we consider two different phases: exploitation (or solving time) and exploration (or learning time). The former tries to solve new instances using the acquired knowledge, and the latter is focused on improving the classification accuracy by re-using previously seen instances.

Fig. 1. Continuous Search scenario: instances I0, I1, ..., Ik arrive over time, and solving time alternates with learning time.

3.1 Online Learning

To deal with the continuous search scenario, we follow the same approach as in [1], where the authors propose to use a supervised Machine Learning algorithm to select the most appropriate heuristic at different states (so-called checkpoints) of the search tree. To this end, we characterize CSP instances by means of features

(i.e., general information common to all CSPs). These features are the input of an online SVM algorithm which selects the best heuristic within the checkpoint window. The feature set is divided into two main categories, static and dynamic. The former is intended to distinguish instances from each other (e.g., number of variables, constraints, etc.), while the latter is used to monitor the progress of the search process (e.g., max. number of failures, variables' weights, etc.). For more details about these features see [1]. Our choice of an online SVM algorithm is motivated by the fact that it does not need to re-train the classifier once a new example arrives. Note that training a classical SVM (so-called batch learning) requires solving an optimization problem, which is not an ideal situation in Continuous Search.
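As a toy illustration of a single checkpoint decision (this is not the paper's code; the per-heuristic classifiers, the feature lists and the random tie-breaking are assumptions made only for this example, the last of which matches the tie-breaking discussed in Section 4):

    import random

    def choose_heuristic(static_feats, dynamic_feats, classifiers, default):
        """classifiers: dict mapping heuristic -> callable(features) -> decision value."""
        x = static_feats + dynamic_feats          # concatenated feature vector
        winners = [h for h, clf in classifiers.items() if clf(x) > 0]
        return random.choice(winners) if winners else default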

4 Experiments

This section describes the experimental validation of the proposed approach. In these experiments we included a collection of 100 nurse-scheduling problems from the MiniZinc repository (available at http://www.g12.cs.mu.oz.au/minizinc/download.html).

4.1 Experimental setting

The learning algorithm used in the experimental validation of the proposed approach is a Support Vector Machine with a Gaussian kernel; we used the libSVM implementation. All our CSP heuristics (see Section 2.1) are home-made implementations integrated in the Gecode-2.1.1 constraint solver.

Table 1. Candidate heuristics; the default heuristic is the first one.
ID  Variable sel  Value sel      ID  Variable sel  Value sel
 1  dom/wdeg      min             5  dom/deg       min
 2  dom/wdeg      max             6  dom/deg       max
 3  wdeg          min             7  min-dom       min
 4  wdeg          max             8  impacts       —

Two CSP adaptive strategies have been experimented with, considering respectively the first 4 and all 8 strategies in Table 1. In all cases, the default heuristic is the first one: dom/wdeg for variable selection and min-value for value selection. The exploration examples are generated by adding minor perturbations to the default execution of the heuristics (i.e., executing the default heuristic at each checkpoint). Thus, during the learning time, we re-ran the last seen instance, replacing the default heuristic by another candidate in exactly one checkpoint; this process is repeated for each heuristic in Table 1 for a limited number of checkpoints (10 in this paper). Currently, we are working on a more informative way of selecting the exploration points, considering the distance of unlabeled examples to the decision boundary. All experiments were performed on an 8-machine cluster running Linux Mandriva 2009; all machines are 64-bit with two quad-core 2.33 GHz processors and 8 GB of RAM. A time out of 10 minutes was used for each experiment.

Figure 2 shows that the dynamic approach is able to solve more instances than its competitors (i.e., the default strategy and a random heuristic selection). However, the performance goes down as the number of candidate heuristics increases. The main explanation for this phenomenon is that we are not using any sophisticated strategy for breaking ties (i.e., if several heuristics are predicted to outperform the default one, we pick one at random). We are currently studying different approaches to breaking ties, such as selecting the best algorithm using the decision value, exploiting the fact that examples that are far from the classification boundary are more likely to be correctly predicted.

[Figure 2: two plots for flatzinc nsp-14, with 4 heuristics (left) and 8 heuristics (right); run-time (s) versus number of solved instances for the def, dyn and random strategies.]

Fig. 2. Nurse Scheduling-14 (nsp-14). Note that this data shows the performance of the continuous search approach with a particular ordering of the problem instances.

5 Related work

In this section, we describe some related work that has been proposed to integrate Machine Learning algorithms into CSP solving and related areas such as SAT and Quantified Boolean Formulas (QBF). SATzilla [10] is a well-known SAT portfolio solver which is built upon a set of features; in general terms, SATzilla includes two kinds of features: basic features, such as the number of variables, number of propagators, etc., and local-search features, which actually probe the search space in order to estimate the difficulty of each problem instance. The goal of SATzilla is to learn a runtime predictor using a simple linear regression model. CPHydra [6], one of the best constraint solvers in the latest CSP competition⁴, is a portfolio approach based on case-based reasoning. Broadly speaking, CPHydra maintains a database with all solved instances (so-called cases). Later on, once a new instance arrives, a set C of similar cases is computed and the heuristic that is able to solve the majority of instances in C is selected. The main drawback of this portfolio approach is that, due to the high complexity of selecting the best solver, it is limited to a small number of solvers (in competition settings fewer than 6 solvers were used).

⁴ http://www.cril.univ-artois.fr/CPAI08/

Our work is related to [8] in that they also apply machine learning techniques to perform an on-line combination of heuristics within search-tree procedures. Their paper proposes to use a multinomial logistic regression method in order to maximize the probability of predicting the right heuristic at different states of the search procedure. Unfortunately, this work requires a large number of training instances to generalize well to the target distribution of problems, and it does not fulfill all the requirements of the Continuous Search setting.

6 Conclusion

This paper has presented an introduction to Continuous Search, a new concept that includes an online algorithm to adaptively tune a constraint solver. At different states of the search, the instance features are combined with dynamic information collected by the search engine to dynamically adapt the search strategy of a well-known CP solver, in order to solve the current instance more efficiently. The results in this paper show that the online approach to continuous search outperforms a very good default heuristic for solving CSPs.

7 Acknowledgements

We would like to thank the anonymous reviewers and Michele Sebag for helpful discussions on the integration of Machine Learning and Constraint Programming.

References
1. A. Arbelaez, Y. Hamadi, and M. Sebag. Online heuristic selection in constraint programming. In Symposium on Combinatorial Search (SoCS), 2009.
2. F. Boussemart, F. Hemery, C. Lecoutre, and L. Sais. Boosting systematic search by weighting constraints. In ECAI'04, 2004.
3. N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and other kernel-based learning methods. Cambridge University Press, 2000.
4. R. M. Haralick and G. L. Elliott. Increasing tree search efficiency for constraint satisfaction problems. Artificial Intelligence, 1980.
5. F. Hutter and Y. Hamadi. Parameter adjustment based on performance prediction: Towards an instance-aware problem solver. Technical Report MSR-TR-2005-125, Microsoft Research, Cambridge, UK, January 2005.
6. E. O'Mahony, E. Hebrard, A. Holland, C. Nugent, and B. O'Sullivan. Using case-based reasoning in an algorithm portfolio for constraint solving. In AICS'08, 2008.
7. P. Refalo. Impact-based search strategies for constraint programming. In CP'04, 2004.
8. H. Samulowitz and R. Memisevic. Learning to solve QBF. In AAAI'07, 2007.
9. K. Smith-Miles. Cross-disciplinary perspectives on meta-learning for algorithm selection. ACM Comput. Surv., 41(1), 2008.
10. L. Xu, F. Hutter, H. H. Hoos, and K. Leyton-Brown. SATzilla-07: The design and analysis of an algorithm portfolio for SAT. In CP'07, 2007.

Constraint Based Languages for Biological Reactions
Marco Bottalico¹ (student) and Stefano Bistarelli¹,²,³ (supervisor)
¹ Università G. d'Annunzio, Pescara, Italy, [bottalic,bista]@sci.unich.it
² Istituto di Informatica e Telematica (CNR), Pisa, Italy, [stefano.bistarelli]@iit.cnr.it
³ Dipartimento di Matematica e Informatica, Università di Perugia, Italy, [email protected]

Abstract. In this paper, we study the modeling of biochemical reactions using concurrent constraint programming idioms. In particular, we will consider stochastic Concurrent Constraint Programming (sCCP), the Hybrid Concurrent Constraint Programming language (Hcc) and the Biochemical Abstract Machine (BIOCHAM). Keywords: Biochemical Reactions, (Stochastic and Hybrid) Concurrent Constraint Programming, BIOCHAM.

1 Introduction

Systems biology is a science integrating experimental activity and mathematical modeling; it studies the dynamical behavior of biological systems. While current genome projects provide a huge amount of data on genes and proteins, a lot of research is still necessary to understand how the different parts of a biological system interact in order to perform complex biological functions. Mathematical and computational techniques are central in this approach to biology, as they provide the capability of formally describing living systems and studying their properties. A variety of formalisms for modeling biological systems has been proposed in the literature. In [1], the author distinguishes three basic approaches (discrete, stochastic and continuous), plus various combinations of them. Discrete models are based on discrete variables and discrete state changes; continuous models are based on differential equations that typically model biochemical reactions; finally, in the stochastic approach, probabilities may appear explicitly in random variables and random numbers, or implicitly as in kinetic laws. In the latter approach we have a simplified representation of processes and an integration of stochastic noise, in order to get more realistic models. The need to capture both discrete and continuous phenomena motivates the study of hybrid dynamical systems [5]. The goal of this paper is to present different kinds of languages for modeling biochemical reactions, in order to compare them and to use their respective features in different ways.

2 Background on Biochemical Reactions and Blood Coagulation

In this paper we want to examine biochemical reactions: chemical reactions involving mainly proteins. In a cell there are many different proteins; hence, the number of reactions that can take place can be very high. All the interactions that take place in a cell can be used to create a diagram, obtaining a biochemical reaction network. In the following we will examine one of the thirteen enzymatic reactions of blood coagulation [12], in its generic form, through the Michaelis-Menten kinetics:

    XI + XIIa ⇌ XI:XIIa ⇀ XI + XIa     (rates k1 and k−1 for the reversible step, k2 for the irreversible step)

where XI is the enzyme (E) that binds the substrate (S) = XIIa to form an enzyme-substrate complex (ES) = XI:XIIa. Afterwards we have the formation of the product (P) = XIa and the release of the unchanged enzyme (E) = XI, ready for a new reaction. We are interested in blood coagulation [12]. It is part of an important host defense mechanism termed hemostasis (the cessation of blood loss from a damaged vessel). Blood clotting is a very delicately balanced system; when hemostatic functions fail, hemorrhage or thromboembolic phenomena result. The chemical reactions that constitute the whole process can be seen as a decomposition of many kinds of enzymatic reactions, involving reactants, products, enzymes, substrates, stoichiometric coefficients, proteins, inhibitors and chemical accelerators. Under the Michaelis-Menten hypotheses [4], the most important equations are:

KM = (k−1 + k2)/k1, the Michaelis constant. It measures the affinity of the enzyme for the substrate: if KM is small there is a high affinity, and vice versa.

VMAX = V0 = k2[E0]. This is the maximum rate, which would be achieved when all of the enzyme molecules have substrate bound (Hp1). [E0] is the starting quantity of the enzyme E; k2 is also called kcat.

d[P]/dt = VMAX [S] / (KM + [S]). This final equation is usually called the "Michaelis-Menten equation". It gives the speed of the formation of the product. When the amount of product P is small, this is a good approximation, and the equations can now be integrated:

    E:  d[E]/dt  = (k2 + k−1)[ES] − k1[E][S]
    S:  d[S]/dt  = k−1[ES] − k1[E][S]
    ES: d[ES]/dt = k1[E][S] − (k−1 + k2)[ES]
    P:  d[P]/dt  = k2[ES]
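As a quick numerical illustration of these four equations (not part of the paper), here is a forward-Euler integration in Python, using the rate constants and initial amounts of the BIOCHAM example in Section 5 (k1 = 1, k−1 = 0.1, k2 = 0.01, E0 = 100, S0 = 10) purely as assumed values:

    k1, km1, k2 = 1.0, 0.1, 0.01
    E, S, ES, P = 100.0, 10.0, 0.0, 0.0
    dt, steps = 0.001, 100000            # integrate up to t = 100

    for _ in range(steps):
        dE  = (k2 + km1) * ES - k1 * E * S
        dS  = km1 * ES - k1 * E * S
        dES = k1 * E * S - (km1 + k2) * ES
        dP  = k2 * ES
        E, S, ES, P = E + dE * dt, S + dS * dt, ES + dES * dt, P + dP * dt

    print(E, S, ES, P)   # S is consumed, ES peaks and decays, P accumulates

A stiff solver (or a smaller dt) would be preferable for larger rate constants; the explicit scheme is used here only to keep the example short.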

3 Stochastic Concurrent Constraint Programming

sCCP [2] is obtained by adding a stochastic duration to the instructions interacting with the constraint store C, i.e. ask and tell. The most important feature added in sCCP is the continuous random variable T associated with each instruction: it represents the time needed to perform the corresponding operation on the store. T is exponentially distributed, and its probability density function is f(τ) = λe^(−λτ), where λ is a positive real number (the rate of the exponential random variable) representing the expected frequency per unit of time. The duration of an ask or a tell can depend on the state of the store at the moment of execution. The main difference between sCCP and classical cc is the presence of two actions with temporal duration, ask and tell, identified by a rate function λ: tellλ(c) and askλ(c), following the probability law above. It means that the reaction occurs in a stochastic time T with f(τ) = λe^(−λτ), whose mean is 1/λ; i.e., tell∞ is an instantaneous execution while tell0 never occurs. The other constructs are the same as in CCP, except for the variables, which in CCP are rigid, in the sense that, whenever they are instantiated, they keep that value forever. Time-varying variables (called stream variables) can be easily modeled in sCCP as growing lists with an unbounded tail: X = [a1, ..., an | T]. When the quantity changes, we simply need to add the new value, say b, at the end of the list by replacing the old tail variable with a list containing b and a new tail variable: T = [b | T′]. When we need to know the current value of the variable X, we extract from the list the value immediately preceding the unbounded tail. Stream variables are denoted with the assignment $=. We model the biochemical equation [4] in sCCP with the following recursively defined agent:

    react(XIIa, XIa, KM, V0) :−
        ask_rMM(KM, V0, XIIa) (XIIa > 0).
        (tell∞(XIIa $= XIIa − 1) || tell∞(XIa $= XIa + 1)).
        react(XIIa, XIa, KM, V0)

where the rate λ of the ask is computed by the Michaelis-Menten kinetics: rMM(KM, V0, XIIa) = (V0 × XIIa) / (XIIa + KM). Roughly, the program inserts in the store the current values of the variables KM, V0, XIIa; it checks the value of the factor XIIa; then, with an immediate effect, it updates the values of the factors XIIa (reagent) and XIa (product) with the new values. Subsequently it executes a new instance of the program.
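A hypothetical Gillespie-style reading of this agent (again not part of the paper; the numerical values of KM and V0 below are arbitrary) samples the exponential delay of the ask with rate rMM and then applies the two instantaneous tells:

    import random

    def r_mm(km, v0, xiia):
        return v0 * xiia / (xiia + km)

    def simulate(xiia, xia, km, v0, horizon):
        t = 0.0
        while xiia > 0:                                   # guard of the ask
            t += random.expovariate(r_mm(km, v0, xiia))   # duration T, mean 1/lambda
            if t > horizon:
                break
            xiia -= 1                        # tell_inf(XIIa $= XIIa - 1)
            xia  += 1                        # tell_inf(XIa  $= XIa  + 1)
        return xiia, xia

    print(simulate(xiia=100, xia=0, km=2.0, v0=1.0, horizon=50.0))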

4 Hybrid cc

Hybrid concurrent constraint programming (Hybrid cc [8]) is a powerful framework for modeling, analyzing and simulating hybrid systems, i.e., systems that exhibit both discrete and continuous change. It is an extension of

Timed Default cc [11] over continuous time. One of the major difficulties in the original cc framework is that cc programs can detect only the presence of information, not its absence [1]. Default cc extends cc with a negative ask combinator (if a else A), which runs the program A in the absence of the constraint a. Our Hybrid cc model of the reaction is:

    e = 10, s = 5, es = 0.01, p = 0,
    always {
        k1 = 1, km1 = 0.1, k2 = 0.01,
        cont(e), cont(s),
        if (e >= 0.000000001) then {
            e'  = ((km1 + k2) * es) - (k1 * e * s),
            s'  = (km1 * es) - (k1 * e * s),
            es' = (k1 * e * s) - ((km1 + k2) * es),
            p'  = k2 * es
        } else {
            e' = 0, s' = 0, es' = 0, p' = 0
        }
    },

In the first row we have the initial conditions with the quantities of Enzyme, Substrate, Enzyme-Substrate complex and Product, and then the rates of the three reactions: k1 = 1 for the first one, k−1 = 0.1 for the second one and k2 = 0.01 for the third reaction. With the syntax cont(·) we assert that the quantities vary continuously. Subsequently we check the current values of e and s and then we can start the reaction; the && and = operators have the usual meaning of "and" and "equal". We can easily obtain the quantities of e, s, es, p at the next time instant. If the "if" condition does not hold, we keep the previous amounts of the factors.

5 Biochemical Abstract Machine

The Biochemical Abstract Machine (BIOCHAM [6]) is a software environment for modeling complex cell processes, making simulations (i.e. in silico experiments), formalizing the biological properties of the system known from real experiments, checking them, and using them as a specification when refining a model. BIOCHAM is based on two aspects: the analysis and simulation of boolean, kinetic and stochastic models, and the expression of biological properties in temporal logic. For kinetic models, BIOCHAM can search for appropriate parameter values in order to reproduce a specific behavior observed in experiments and formalized in temporal logic. We can use the Michaelis-Menten kinetics to represent the first enzymatic reaction of blood coagulation. The reaction is modeled in the BIOCHAM [3] language as follows:

    (k1*[E]*[S],km1*[ES]) for E + S <=> ES.
    k2*[ES] for ES => E + P.
    parameter(k1,1).
    parameter(km1,0.1).
    parameter(k2,0.01).
    present(E,100).
    present(S,10).
    absent(ES).
    absent(P).


There are two different syntax operators used to model the different kinds of reaction: <=> and =>. The first one models the reversible reaction involved in the formation of ES, since this reaction can be reversed; the second one models the irreversible reaction which produces the factor P. The for operator shows for which substances the reaction is performed: the first for is for the E + S ⇌ E:S reaction (with rates k1 and k−1), the second one is for the E:S ⇀ E + P reaction (with rate k2). In BIOCHAM, the result of this simulation generates a graph and a table, in relation to time and nanomolar variation. In the graph (Fig. 1) we have the plot of four curves, referring to the four substances involved in the reaction: the initial decrease of the Enzyme and of the Substrate (violet and red curves respectively), the formation of the Enzyme-Substrate complex (green curve) and finally the Product formation (sky blue line).

Fig. 1. BIOCHAM simulation of the reaction: concentrations of E, S, ES and P over time.

6 Related and future works

The most important related features are presented in [10, 7]. In [10] the authors suggest modeling biomolecular processes, i.e. protein networks, by using the pi-calculus, while [7] shows that there are two formalisms for mathematically describing the time behavior of a spatially homogeneous chemical system: the deterministic approach and the stochastic one. The first regards the time evolution as a continuous and predictable process which is governed by a set of ordinary differential equations (the "reaction-rate equations"), while the second regards the time evolution as a kind of random-walk process which is governed by a single differential-difference equation (the "master equation"). From the application point of view, the examined languages allow the biologist to model biological systems in a high-level and declarative way, using different kinds of applications and language constructs that capture directly a variety of biological phenomena. We are also interested in the non-deterministic timed concurrent constraint calculus (ntcc [9]), because it is a concurrent constraint language that includes timed processes with a graphical presentation, which we could use to describe our biochemical reactions. In this paper we have modeled only one of the thirteen kinetic reactions used in blood coagulation; our future work is to extend the coagulation chain using all the reactions, with the goal of comparing the formation of fibrin with previous results in the scientific literature. The second step is the in silico study of a drug which decreases the hepatic secretion of four clotting factors, and we want to analyze the derived outcomes.

References
1. Alexander Bockmayr and Arnaud Courtois. Using hybrid concurrent constraint programming to model dynamic biological systems. In ICLP, pages 85–99, 2002.
2. Luca Bortolussi and Alberto Policriti. Modeling biological systems in stochastic concurrent constraint programming. Constraints, 13(1-2):66–90, 2008.
3. Nathalie Chabrier-Rivier, François Fages, and Sylvain Soliman. The biochemical abstract machine BIOCHAM. In Proc. CMSB, pages 172–191, 2004.
4. Thomas Devlin. Textbook of Biochemistry with Clinical Correlations. McGraw Hill Book Co., 2001.
5. A. J. van der Schaft and J. M. Schumacher. An Introduction to Hybrid Dynamical Systems. Springer-Verlag, 1999.
6. François Fages. Temporal logic constraints in the biochemical abstract machine BIOCHAM. In Patricia M. Hill, editor, LOPSTR, volume 3901 of Lecture Notes in Computer Science, pages 1–5. Springer, 2005.
7. Daniel T. Gillespie. Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry, pages 2340–2352, 1977.
8. Vineet Gupta, Radha Jagadeesan, and Vijay A. Saraswat. Computing with continuous change. Sci. Comput. Program., 30(1-2):3–49, 1998.
9. Julián Gutiérrez, Jorge A. Pérez, Camilo Rueda, and Frank D. Valencia. Timed concurrent constraint programming for analysing biological systems. Electr. Notes Theor. Comput. Sci., 171(2):117–137, 2007.
10. Aviv Regev, William Silverman, and Ehud Y. Shapiro. Representation and simulation of biochemical processes using the pi-calculus process algebra. In Proc. Pacific Symposium on Biocomputing, pages 459–470, 2001.
11. Vijay A. Saraswat, Radha Jagadeesan, and Vineet Gupta. Timed default concurrent constraint programming. J. Symb. Comput., 22(5/6):475–520, 1996.
12. Williams. Williams Hematology. McGraw Hill Book Co., 2006.


Capturing fair computations on Concurrent Constraint Languages
Paola Campli ([email protected]), Supervisor: Stefano Bistarelli ([email protected])
University G. D'Annunzio, Pescara, Italy

Abstract. The study of concurrency and nondeterminism in programming languages is often related to fairness. Fairness is a feature that should be included in contexts where there are repeated choices among alternatives and it is desirable that no alternative is ignored forever or postponed infinitely often. The aim of my research is to guarantee equitable computations in Concurrent Constraint languages. This paper presents an extension of the language related to the parallelism operator, using as a selection criterion a metric provided by a fair carpool scheduling algorithm. The new parallelism operator (∥m) is able to deal with a finite number m of agents and guarantees a fair prosecution of the computation, assuring that no enabled agent waits forever without being executed.

Introduction

This paper contains an extension of Concurrent Constraint Programming (CCP) that guarantees a fair criterion of selection between parallel agents and restricts or removes possible unwanted behaviors of a program. This document is structured as follows: in the first section the common definitions of fairness are briefly presented; in section two the Concurrent Constraint language is described; in section three a fair cc version is presented, obtained through an extension of the parallelism operator based on the idea proposed by a fair carpooling algorithm.

1 Fairness in programming languages

Some of the most common notions of fairness in computer science are given by Nissim Francez in [1]: weak fairness requires that if an action (or agent) is continuously enabled (so it can almost always proceed) then it must eventually do so, while strong fairness requires that if an action (or agent) can proceed infinitely often then it must eventually proceed (be executed). In order to guarantee the above fairness properties in the cc language, we add a condition (or guard) that enables the agents with a lower cost to succeed.


2 Concurrent Constraint Programming

The concurrent constraint (cc) programming paradigm [4] concerns the behavior of a set of concurrent agents with a shared store, which is a conjunction of constraints (relations among a specified set of variables). Each computation step possibly adds new constraints to the store. Thus information is monotonically added until all agents have evolved. The final store is a refinement of the initial one and is the result of the computation. The concurrent agents communicate with the shared store by either checking whether it entails a given constraint (ask operation) or adding a new constraint to it (tell operation). The syntax of a cc program is shown in Table 1: P is the class of programs, F is the class of sequences of procedure declarations (or clauses), A is the class of agents, c ranges over constraints, and x is a tuple of variables. The + combinator expresses nondeterminism. We also assume that, in p(x) :: A, vars(A) ⊆ x, where vars(A) is the set of all variables occurring free in agent A. In a program P = F.A, A is the initial agent, to be executed in the context of the set of declarations F. The intuitive behavior of the agents is as follows: agent "success" succeeds in one step;

Table 1. cc syntax
P ::= F.A
F ::= p(x) :: A | F.F
A ::= success | fail | tell(c) → A | E | A ∥ A | ∃x A | p(x)
E ::= ask(c) → A | E + E

agent "fail" fails in one step; agent "ask(c) → A" checks whether constraint c is entailed by the current store and then, if so, behaves like agent A; if c is inconsistent with the current store, it fails, and otherwise it suspends until c is either entailed by the current store or is inconsistent with it; agent "ask(c1) → A1 + ask(c2) → A2" may behave either like A1 or like A2 if both c1 and c2 are entailed by the current store, it behaves like Ai if ci only is entailed, it suspends if both c1 and c2 are consistent with but not entailed by the current store, and it behaves like "ask(c1) → A1" whenever "ask(c2) → A2" fails (and vice versa); agent A1 ∥ A2 behaves like A1 and A2 executing in parallel; agent ∃x A behaves like agent A, except that the variables in x are local to A; p(x) is a call of procedure p. Here is a brief description of (some of) the transition rules:

tell:              ⟨tell(c) → A, σ⟩ → ⟨A, σ ⊗ c⟩

ask:               σ ⊢ c
                   ─────────────────────────────
                   ⟨ask(c) → A, σ⟩ → ⟨A, σ⟩

parallelism (1):   ⟨A1, σ⟩ → ⟨A1′, σ′⟩
                   ─────────────────────────────────────
                   ⟨A1 ∥ A2, σ⟩ → ⟨A1′ ∥ A2, σ′⟩

parallelism (2):   ⟨A1, σ⟩ → ⟨success, σ′⟩
                   ─────────────────────────────────────
                   ⟨A1 ∥ A2, σ⟩ → ⟨A2, σ′⟩

nondeterminism:    ⟨E1, σ⟩ → ⟨A1, σ′⟩
                   ─────────────────────────────────────
                   ⟨E1 + E2, σ⟩ → ⟨A1, σ′⟩

hidden variables:  ⟨A[y/x], σ⟩ → ⟨A1, σ′⟩
                   ─────────────────────────────────────
                   ⟨∃x A, σ⟩ → ⟨A1, σ′⟩

procedure call:    ⟨p(y), σ⟩ → ⟨A[y/x], σ⟩   where p(x) :: A

Notice that ∥ and + are commutative and associative operators.
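As a toy reading of the ask/tell intuition (this is not the formal semantics above: constraints are reduced to ground facts, entailment to membership, and inconsistency to an explicit list of forbidden pairs), one could write:

    class Store:
        def __init__(self, conflicts=()):
            self.facts = set()
            self.conflicts = set(conflicts)      # pairs of mutually inconsistent facts

        def inconsistent(self, c):
            return any((c, f) in self.conflicts or (f, c) in self.conflicts
                       for f in self.facts)

        def tell(self, c):
            if self.inconsistent(c):
                return "fail"                    # the resulting store would be inconsistent
            self.facts.add(c)                    # monotonic addition of information
            return "ok"

        def ask(self, c):
            if c in self.facts:
                return "entailed"                # the asking agent may proceed
            if self.inconsistent(c):
                return "fail"
            return "suspend"                     # wait for more information

    s = Store(conflicts={("x=1", "x=2")})
    print(s.tell("x=1"), s.ask("x=1"), s.ask("x=2"), s.ask("y=3"))
    # -> ok entailed fail suspend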

3 Fair Concurrent Constraint Programming

To obtain a fair version of the parallel execution, we modify the parallel operator ∥ to use a quantitative metric that provides a more accurate way to establish which of the agents can succeed. The metric we use is the one proposed by a fair carpool scheduling algorithm [2], which is described in more detail below.

3.1 The fair carpool scheduling algorithm

Carpooling consists in sharing a car among a driver and one or more passengers to divide the costs of the trip. We want a scheduling algorithm that will be perceived as fair by all the members, so as to encourage their continued participation. Let U be a value that represents the total cost of a trip. It is convenient (but not necessary) to take U to be the least common multiple of [1, 2, ..., m], where m is the largest number of people who ever ride together at a time in the carpool. Since on a given day the participants can be fewer than m, we define n as the number of participants in the carpool on that day (n ∈ [1, ..., m]). Each day we calculate the passengers' and driver's scores. On the first day each member has score zero; on the following days the driver increases his score by αn = U(n − 1)/n, while the remaining n − 1 passengers decrease their score by βn = U/n. As proved in [2], this algorithm is fair.
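For concreteness, a small Python sketch of the score update (the helper names are ours, not from [2]). Taking U = lcm(1, ..., m) keeps αn and βn integral, and the sum of all scores is conserved:

    from math import lcm

    def carpool_day(scores, riders, driver, U):
        n = len(riders)
        scores[driver] += U * (n - 1) // n           # alpha_n for the driver
        for p in riders:
            if p != driver:
                scores[p] -= U // n                  # beta_n for each passenger
        return scores

    m = 3
    U = lcm(*range(1, m + 1))                        # U = 6 for m = 3
    scores = {"a": 0, "b": 0, "c": 0}
    carpool_day(scores, ["a", "b", "c"], driver="a", U=U)
    carpool_day(scores, ["a", "b"], driver="b", U=U)
    print(scores, sum(scores.values()))              # the total stays 0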

3.2 Applying the carpooling algorithm over CC

A problem we encounter in the cc semantics is that only two parallel agents are represented in the parallelism rule; therefore it is not possible to express fairness with more than 2 agents, since we need to associate a value (cost or preference) to each agent. In this section the syntax and the semantics of the language are extended with the new operator ∥m in order to model the parallel execution of m agents (with m finite). Moreover we include an array k (with m elements) that allows us to keep track of the actions performed by each agent during the computational steps (we use the notation ⟨Ai, k[i], σ⟩ instead of ⟨Ai, σ⟩). In this way we can associate a value k[i] to each agent Ai. The syntax is modified as in Table 2.

Table 2. Fair version of the CC syntax

P ::= F.A
F ::= p(x) :: A | F.F
A ::= success | fail | tell(c) → A | E | ∥m (A1, . . . , Am) | ∃x A | p(x)
E ::= ask(c) → A | E + E

With reference to the carpooling algorithm, we consider the participants in the carpool on a given day as the n enabled agents Al1, . . . , Aln; we represent the driver by the agent Ali, while the passengers are the remaining agents Alj (j ∈ [1, . . . , n], j ≠ i). Agent Ali increases its score by αn, while the n − 1 passengers decrease their score by βn. Initially all the m elements of the array k are equal to 0. We add to the new transition rule the set of values associated with each agent by the array k; we also insert in the precondition of the rule a guard (k[li] ≤ k[lj]) to compare the agents' values and to establish which of the agents can evolve or succeed.

In the initial phase, the enabled agent Ali evolves into A′li with value αn. The other enabled agents instead assume the value −βn. In the following steps we update the values of the array k to obtain k′ by increasing/decreasing the previous values of k by αn/βn. We obtain the transition rule in Table 3. Since ∥m is an associative and commutative operator, we can omit the rewriting of the second rule of parallelism (1). We use the same criterion for the parallelism (2) rule (Table 4). If agent Ali succeeds, it is deleted by the rule together with the corresponding element k[li]. Consequently, the value m of the operator ∥m is continuously

Table 3. Carpooling fairness – parallelism (1) rule

If Al1, . . . , Aln are the enabled agents, k[li] ≤ k[lj] for all j = 1, . . . , n, and ⟨Ali, σ⟩ → ⟨A′li, σ′⟩, then

⟨∥m (A1, . . . , Ali, . . . , Am), k, σ⟩ → ⟨∥m (A1, . . . , A′li, . . . , Am), k′, σ′⟩

where
k′[x] = k[x] + αn if x = li,
k′[x] = k[x] − βn if x = lj with j ≠ i,
k′[x] = k[x] otherwise (x = 1, . . . , m).

Table 4. Carpooling fairness – parallelism (2) rule

If Al1, . . . , Aln are the enabled agents, k[li] ≤ k[lj] for all j = 1, . . . , n, and ⟨Ali, σ⟩ → ⟨success, σ′⟩, then

⟨∥m (A1, . . . , Ali, . . . , Am), k, σ⟩ → ⟨∥m−1 (A1, . . . , Ali−1, Ali+1, . . . , Am), k′, σ′⟩

where k′[x] = k[x] if x ≠ li, and k′[x] = null if x = li.

decreased by 1 every time an agent succeeds. We therefore obtain a final computation with a single agent and a single element in the array, which reinstates the original rule: ⟨∥1 (A1), k[1], σ⟩ → ⟨A1, σ⟩.
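The guard and the score updates of Tables 3 and 4 can be mirrored outside any CC machinery, as in the sketch below: among the enabled agents, one with minimal score is scheduled, its score is increased by αn, the other enabled agents' scores are decreased by βn, and agents that succeed are dropped together with their score. The agent representation and the function names are ours and only illustrative.

```python
def fair_step(k, enabled, U):
    """One scheduling step driven by the carpool scores.

    k       : dict agent name -> current score
    enabled : names of the agents that can currently make a transition
    U       : cost unit of the carpool scheme
    Returns the name of the agent chosen to move.
    """
    n = len(enabled)
    alpha = U * (n - 1) / n
    beta = U / n
    # guard k[li] <= k[lj]: pick an enabled agent whose score is minimal
    chosen = min(enabled, key=lambda name: k[name])
    for name in enabled:
        if name == chosen:
            k[name] += alpha
        else:
            k[name] -= beta
    return chosen

def drop_on_success(agents, k, name):
    """Rule of Table 4: a successful agent is removed together with its score."""
    del agents[name]
    del k[name]
```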

Acknowledgement. We wish to thank Paolo Baldan for his help in discussing and improving the contents of the paper.

4 Future Work

The existing definitions of fairness in the literature (strong fairness, weak fairness, unconditional fairness, equifairness) have led to the development of various fairness-like constructs, which vary according to the computational model considered. The limitation we observe in such notions is that they can be used to model only infinite computations. In fact, when dealing with finite computations, it is not possible for an agent to be continuously or infinitely often enabled; for this reason we need a quantitative and more accurate notion of fairness. As proved in [3], it is not possible to write an expression which captures all the fair behaviours of a system in a programming language with a primitive for finitary choice, because expressing fairness requires unbounded nondeterminism.

The aim of our research is to provide new notions of fairness suitable for finite computations and to guarantee equitable computations in the Concurrent Constraint language and its extensions. Moreover, we plan to provide quantitative valuations by using soft constraints; we will use a semiring structure to measure how fair the service is. To do this we plan to use the same indices that are used to measure inequality in economics, such as the Gini coefficient, which indicates the inequality of income or wealth.

References

1. Nissim Francez. FAIRNESS (Texts and Monographs in Computer Science). Springer Verlag, 1986. ISBN 0387962352.
2. Ronald Fagin and John H. Williams. A Fair Carpool Scheduling Algorithm. International Business Machines Corporation, 1983, pages 133–139.
3. Abha Moitra and Prakash Panangaden. Finitary choice cannot express fairness: A metric space technique. 1986.
4. Vijay A. Saraswat and Martin Rinard. Concurrent constraint programming. POPL 1990.


Finding Stable Solutions in Constraint Satisfaction Problems

Laura Climent, Miguel A. Salido and Federico Barber
Instituto de Automática e Informática Industrial
Universidad Politécnica de Valencia, Valencia, Spain

Abstract. Constraint programming is a successful technology for solving combinatorial problems modeled as constraint satisfaction problems (CSPs). An important extension of constraint technology involves problems that undergo changes that may invalidate the current solution. These problems are called Dynamic Constraint Satisfaction Problems (DynCSPs). Many works on dynamic problems have sought methods for finding new solutions. In this paper, we focus our attention on finding stable solutions, which are able to absorb changes. To this end, each constraint maintains a label that measures its degree of variability. Thus, we restrict the original search space by using these labels, and solutions inside the restricted search space remain valid under changes.

1 Introduction

Nowadays many real problems can be modeled as constraint satisfaction problems and are solved using constraint programming techniques. Much effort has been spent on increasing the efficiency of constraint satisfaction algorithms: filtering, learning and distributed techniques, improved backtracking, use of efficient representations and heuristics, etc. This effort has resulted in the design of constraint reasoning tools which have been used to solve numerous real problems. However, all these techniques assume that the set of variables and constraints which compose the CSP is completely known and fixed. This is a strong limitation when dealing with real situations where the CSP under consideration may evolve. The Dynamic Constraint Satisfaction Problem [2] is an extension of a static CSP that models the addition and retraction of constraints and hence is more appropriate for handling dynamic real-world problems. It is indeed easy to see that all the possible changes to a CSP (constraint or domain modifications, variable additions or removals) can be expressed in terms of constraint additions or removals [4]. Several techniques have been proposed to solve Dynamic CSPs, including: searching for stable solutions that are valid after small problem changes [5]; searching for a new solution that minimizes the number of changes from the original solution [3]; or reusing the original solution to produce a new solution [4].
– Searching for stable solutions that are valid after small problem changes has the objective of exploring methods for finding solutions that are more likely to remain valid after changes, because this fact means that these are stable solutions [5]. In

order to develop this method, it is necessary to find all the solutions if we want to obtain the most stable one according to an objective function.
– Searching for a new solution that minimizes the number of changes from the original solution considers stability in terms of solutions that are close to the original one in case of changes [3]. Nevertheless, with this method we do not obtain stable solutions.
– Reusing the original solution to produce a new solution consists of producing the new solution by local changes on the previous one [4]. In order to develop this technique, we must explore the search space.

In this paper we focus our attention on the first set of techniques: searching for stable solutions that are valid after problem changes. Thus, we assume that the user knows the degree of stability of each constraint and that constraints can be labeled to identify this stability. In this way the constrainedness of the CSP can be altered by modifying/restricting the dynamic constraints.

1.1 Definitions

Following standard notations and definitions in the literature [1], we summarize the basic definitions that we will use in the rest of the paper.

Definition 1. A Constraint Satisfaction Problem (CSP) is a triple P = ⟨X, D, C⟩ where X is a finite set of variables {x1, x2, ..., xn}; D is a set of domains D = D1 × D2 × ... × Dn such that for each variable xi ∈ X there is a finite set of values that the variable can take; and C is a finite set of constraints C = {C1, C2, ..., Cm} which restrict the values that the variables can simultaneously take.

Definition 2. An Instantiation is a pair (xi = a) that represents the assignment of the value a to the variable xi, with a ∈ Di.

Definition 3. To Check a Constraint: once all variables involved in a constraint are instantiated, the constraint can be evaluated/checked to analyze whether these assignments of values to variables satisfy the constraint or not.

Definition 4. The Solution Space (S) is the portion of the search space that satisfies all constraints. The instantiations of all variables that satisfy all constraints are contained in the solution space. If all constraints are convex, the solution space is a convex hyperpolyhedron, which is included in the Cartesian product of the variable domains.
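A toy encoding of Definitions 1–4, with constraints given by a scope and a predicate; the representation is ours and is only meant to make the definitions concrete:

```python
from itertools import product

# Definition 1: a CSP as variables, domains and constraints.
X = ["x1", "x2"]
D = {"x1": {1, 2, 3}, "x2": {1, 2, 3}}
C = [(("x1", "x2"), lambda a, b: a < b)]   # (scope, relation)

def check(constraint, assignment):
    """Definition 3: evaluate a constraint once its scope is instantiated."""
    scope, rel = constraint
    return rel(*(assignment[v] for v in scope))

# Definition 4: the solution space as the set of full instantiations
# (Definition 2) that satisfy every constraint.
solutions = [dict(zip(X, values))
             for values in product(*(D[v] for v in X))
             if all(check(c, dict(zip(X, values))) for c in C)]
print(solutions)   # [{'x1': 1, 'x2': 2}, {'x1': 1, 'x2': 3}, {'x1': 2, 'x2': 3}]
```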

2 Finding Stable Solutions

Our aim is to find solutions to problems that remain valid under changes. Many real problems, such as planning and scheduling problems, can be modeled as CSPs. However, many of the constraints are set knowing that they are likely to be modified. For instance, in scheduling problems, if a worker starts working at time T1begin and his first task must be finished within 50 minutes, then the constraint is (T1end − T1begin < 50). However, the worker could arrive late to work due to many factors. Thus, we can restrict this constraint to (T1end − T1begin < 40). If something goes wrong, the task can still be delayed by some additional minutes. Thus, we can label this constraint as a dynamic constraint of level 2 (for instance). If the problem is feasible with the more restricted constraints, the

obtained solutions will be more stable than the solutions of the original problem due to the fact that if the worker arrives late (

variable i is NC if every value is NC, and a problem instance is NC if every variable is NC; 2. value (i, v) is arc consistent (AC) with respect to cost function Cij if it is NC and there exists another value w ∈ Dj such that Cij(v, w) = ⊥. Value w is called a support for v. Variable i is AC if all its values are AC, and a problem instance is AC if every variable is AC [3]. Values ⊤ and ⊥ represent the first unacceptable cost (any cost lower than ⊤ is acceptable) and no cost (with the usual addition for cost aggregation; as search progresses ⊤ is the global cost of the best solution found so far and ⊥ is 0). This definition can be translated into the distributed context: variables are agents, and each agent keeps the unary and binary constraints in which it is involved (so every agent knows the domain of every other agent it is constrained with). Node consistency remains unchanged, enforced by the owner agent, and arc consistency also remains unchanged, enforced by either of the two agents involved in the constraint. To apply this idea to DCOPs, we assume that the distributed instance is initially WAC (otherwise, it can be made WAC by preprocessing). If value a is unconditionally deleted from Di, it might happen that value a ∈ Di was the only support of a value b ∈ Dj. For this reason, after the notification of a deletion, directional WAC has to be enforced on cost function Cij from i to j (observe that enforcing in the other direction will cause no change). This has to be done in both agents i and j, to assure that both maintain the same representation of Cij. In addition, agent j may pass binary costs to unary costs, which might result in some value b ∈ Dj becoming node inconsistent. In that case, b should be deleted from Dj and its deletion should be propagated. The idea of propagating unconditionally deleted values can be included in BnB-ADOPT, producing a new algorithm and exploiting the idea that a constraint Cij is


known by both agents i and j. This approach would require some minor changes with respect to BnB-ADOPT:
1. the domain of every variable constrained with self has to be represented in self;
2. a new message type, DEL, is required. When self deletes value a in D(self), it sends a DEL message to every agent constrained with it. When self receives a DEL message, it registers that the message value has been deleted from the domain of the sender, and it enforces soft arc consistency on the constraint between self and the sender. If, as a result of this enforcing, some value is deleted in D(self), it is propagated as above.

Including the propagation of unconditionally deleted values does not change the semantics of the original BnB-ADOPT messages. In addition, ⊤ should be included in VALUE messages: from the second value tried in the root, ⊤ is a finite value which is propagated downwards, informing the other agents of the lowest unacceptable cost. It is worth noting that this approach keeps the good BnB-ADOPT properties, namely correctness, completeness and termination: since we are eliminating values which are either suboptimal (values at the root) or not WAC, their removal will not cause any solution to be missed. If the value assigned to self is found not to be WAC, this is recorded, the value is removed when possible and another value is tried for self. Any value removal is propagated to the agents constrained with self. Connecting BnB-ADOPT with the simplest form of soft arc consistency (WAC) is expected to be beneficial for communication. More complex forms of soft arc consistency remain to be investigated [2], [4]. The use of soft arc consistency in a distributed context requires a careful analysis, to avoid that the extra communication effort needed to enforce it surpasses the expected benefits in search space reduction. Preliminary experimental results indicate an improvement in performance with respect to the BnB-ADOPT algorithm.
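A rough sketch of the propagation just described: after a value is unconditionally removed from Di, soft arc consistency is re-enforced on Cij in the direction i → j by projecting binary costs onto the unary costs of j, and any value of j whose cost reaches ⊤ is deleted and would be propagated in turn via DEL messages. The data structures and names are ours and simplify the actual algorithm (for instance, the contribution of the zero-arity cost is ignored).

```python
TOP = 1000  # stands for the first unacceptable cost (the ⊤ of the text)

def enforce_dir_wac(Di, Dj, Cij, Cj):
    """Project binary costs of Cij onto unary costs Cj (direction i -> j).

    Di, Dj : current domains (sets of values)
    Cij    : dict (a, b) -> binary cost (missing entries mean cost 0)
    Cj     : dict b -> unary cost of variable j
    Returns the set of values of j that became node-inconsistent.
    """
    deleted = set()
    for b in list(Dj):
        # smallest cost of b with any remaining value of i
        min_cost = min(Cij.get((a, b), 0) for a in Di) if Di else TOP
        if min_cost > 0:
            Cj[b] = Cj.get(b, 0) + min_cost
            for a in Di:                       # keep the cost function equivalent
                Cij[(a, b)] = Cij.get((a, b), 0) - min_cost
        if Cj.get(b, 0) >= TOP:                # b is no longer node consistent
            Dj.discard(b)
            deleted.add(b)
    return deleted   # these deletions would be notified in DEL messages
```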

References

1. Dechter, R. Constraint Processing. Morgan Kaufmann, 2003.
2. de Givry, S., Heras, F., Larrosa, J., Zytnicki, M. Existential arc consistency: getting closer to full arc consistency in weighted CSPs. Proc. of IJCAI-05, 2005.
3. Larrosa, J. Node and arc consistency in weighted CSP. Proc. of AAAI-02, 2002.
4. Larrosa, J., Schiex, T. In the quest of the best form of local consistency for Weighted CSP. Proc. of IJCAI-03, 2003.
5. Modi, P. J., Shen, W. M., Tambe, M., Yokoo, M. Adopt: asynchronous distributed constraint optimization with quality guarantees. Artificial Intelligence, 161, 149–180, 2005.
6. Petcu, A., Faltings, B. A scalable method for multiagent constraint optimization. Proc. of IJCAI-05, 266–271, 2005.
7. Silaghi, M., Yokoo, M. Nogood-based Asynchronous Distributed Optimization (ADOPT-ng). Proc. of AAMAS-06, 2006.
8. Yeoh, W., Felner, A., Koenig, S. BnB-ADOPT: An Asynchronous Branch-and-Bound DCOP Algorithm. Proc. of AAMAS-08, 591–598, 2008.


An automaton Constraint for Local Search

Jun He (student), Pierre Flener, and Justin Pearson
Department of Information Technology
Uppsala University, Box 337, SE – 751 05 Uppsala, Sweden
[email protected]

Abstract. In this paper we explore the idea of using deterministic finite automata (DFAs) to implement constraints for local search (this is already a successful technique in global search). We show how it is possible to automatically and incrementally maintain violation counts from a specification of a DFA. We show the practicality of the idea on a real-life scheduling example.

1 Introduction

When a high-level constraint programming (CP) language lacks a (possibly global) constraint that would allow the formulation of a particular model of a combinatorial problem, then the modeller traditionally has the choice of (1) switching to another CP language that has all the required constraints, (2) formulating a different model that does not require any lacking constraints, or (3) implementing the lacking constraint in the low-level implementation language of the chosen CP language. This paper addresses the core question of facilitating the third option, and as a side effect makes the first option unnecessary. The user-level extensibility of CP languages has been an important goal for over a decade. In the traditional global search approach to CP (namely heuristic-based tree search interleaved with propagation), higher-level abstractions for describing new constraints include indexicals [15], (possibly enriched) deterministic finite automata (DFAs) via the automaton [2] and regular [10] generic constraints, and multi-valued decision diagrams (MDDs) via the mdd [5] generic constraint. In the more recent local search approach to CP (called constraint-based local search, CBLS, in [12]), higher-level abstractions for describing new constraints include invariants [8], a subset of first-order logic with arithmetic via combinators [14] and differentiable invariants [13]; and existential monadic second-order logic for constraints on set variables [1]. In the former approach, a generic but efficient propagation algorithm achieves a suitable level of local consistency by processing the higher-level description of the new constraint, while in the latter approach, a generic but incremental violation maintenance algorithm is proposed. In this paper, we revisit the description of new constraints via DFAs, already successfully tried within the global search approach to CP, and show that it can also be successfully used within the local search approach. The rest of this paper is organised as follows. In Section 2, we present our algorithm for incrementally

maintaining both the violation degree of a constraint described by a DFA, and the violation degrees of each decision variable of that constraint. In Section 3, we present experimental results establishing the practicality of our results. Finally, in Section 4, we summarise this work and discuss related as well as future work.

Figure 1. An automaton for a simple work scheduling constraint

2 Incremental Violation Maintenance with DFAs

Our method for implementing a constraint described by a DFA has two major components: a pre-processing to set up useful data structures, including the computation of the violations of the constraint and each of its decision variables for the initial assignment, and the incremental maintenance of these violations when an actual move is made. Our running example is the following, for a simple work scheduling constraint. There are values for two work shifts, day (d) and evening (e), as well as a value for enjoying a day off (x). Work shifts are subject to the following three conditions: one must take at least one day off before a change of work shift; one cannot work for more than two days in a row; and one cannot have more than two days off in a row. A DFA for this constraint is given in Figure 1. The start state 1 is marked by a transition entering from nowhere, and the success states 5, 6 are marked by double circles. Missing transitions, say from state 2 upon reading a value e, are assumed to go to an implicit failure state, with a self-looping transition for every value (so that no success state is reachable from it). The pre-processing is similar to the pre-processing in [10], namely unrolling the DFA for a given length n of the sequence V = V1, . . . , Vn of decision variables.

Definition 1 (Layered Graph). Given a DFA with m states, the layered graph over n variables is a graph with m · (n + 1) nodes. Each of the n + 1 vertical layers has a node for each of the m DFA states. The node for the start state of the DFA in layer 1 is labelled as the start node. There is an arc labelled w from node f in layer i to node t in layer i + 1 if and only if there is a transition labelled w from f to t in the DFA. A node in layer n + 1 is labelled as a success node if it corresponds to a success state in the DFA.

The layered graph is further processed by removing all nodes and paths that do not lead to a success node. Next, the resulting graph, seen as a DFA (or ordered MDD), is minimised (or reduced) [6], as the size of the graph influences the run time of algorithms. For instance, the unrolled version for n = 6 decision

variables of the DFA in Figure 1 is given in Figure 2 (which is best viewed in colour).

Figure 2. The unrolled automaton of Figure 1

The purple number by each node is the number of paths from that node to a success node in the last layer. Next, we compute the violations of the automaton constraint and each of its decision variables for the initial assignment. Toward this, we show how an assignment is transformed into a sequence of disconnected segments in the layered graph G, where each segment is a connected sequence of arcs in G. First, if the current assignment is a solution, then there exists a path P in the layered graph from the start node in the first layer through arcs labelled by V1, V2, . . . , Vn and finally to a success node in the last layer. In this case, we get only one segment, which is P. Second, if the constraint is unsatisfied, then there is some index i such that the path from the start node in the first layer through arcs labelled with V1, V2, . . . , Vi−1 ends at a node Si in layer i that has no out-going arc labelled with value Vi. The sequence of arcs labelled with V1, V2, . . . , Vi−1 is the first segment. From the node Si in layer i, we select one of Si's successor nodes and start the next segment. Ideally we want to select a node that corrects the path in the best possible way. This is done by selecting the next node randomly from the set of nodes in the next layer that are reachable from the current node, according to a distribution where nodes are weighted according to the number of paths from them to a success node in the last layer. Repeating this procedure until we arrive at the last layer, we will finally get several disjoint segments. For instance, in Figure 2, with the assignment V = ⟨x, e, d, e, x, d⟩, the first segment will be ⟨x, e⟩. Next, V3 is a violated variable because there is no arc labelled with d that connects node 4 in layer 3 with any node in layer 4. Node 4 in layer 3 has two out-going arcs that are connected with nodes 3 and 5 in layer 4. In layer 4, there are 4 paths from node 3 compared to 2 paths from node 5, so node 3 is chosen with probability 4/(4 + 2), and we assume it is chosen. From node 3 in layer 4, we will get the second segment ⟨e, x, d⟩, which stops at success node 5 in the last layer.
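The unrolling of Definition 1, together with the pruning and the path counts used for the weighted choice, can be sketched in a few lines. The code below is ours; in particular, the demonstration DFA is a small stand-in (no two d's in a row), not a transcription of Figure 1.

```python
from collections import defaultdict

def unroll(delta, success, n):
    """Definition 1: build the layered graph of a DFA over n variables.

    delta   : dict (state, value) -> state (missing entries go to failure)
    success : set of success states of the DFA
    n       : number of decision variables
    Returns (arcs, counts): arcs[(layer, state)] maps a value to the next
    state, keeping only arcs whose target can still reach a success node in
    layer n+1; counts[(layer, state)] is the number of such paths.
    """
    counts = {(n + 1, q): 1 for q in success}
    arcs = defaultdict(dict)
    for layer in range(n, 0, -1):          # backwards pruning and counting
        for (q, v), q2 in delta.items():
            if (layer + 1, q2) in counts:
                arcs[(layer, q)][v] = q2
                counts[(layer, q)] = counts.get((layer, q), 0) + counts[(layer + 1, q2)]
    return arcs, counts

# Stand-in DFA (not Figure 1): alphabet {d, x}, no two d's in a row,
# start state 1, every state is a success state.
delta = {(1, "d"): 2, (1, "x"): 1, (2, "x"): 1}
arcs, counts = unroll(delta, success={1, 2}, n=4)
print(counts[(1, 1)])   # 8 accepted sequences of length 4 from the start node
```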

Definition 2 (Violation of an Automaton Constraint and its Variables). Given an assignment yielding the segments σ1, . . . , σℓ of the sequence of n decision variables Vi of an automaton constraint c:
– The violation Violation(c) of c is n − (|σ1| + · · · + |σℓ|).
– The violation Violation(Vi) of decision variable Vi of c is 0 if there exists a segment index j in 1, . . . , ℓ such that i ∈ σj, and 1 otherwise.

It can easily be seen that the violation of an automaton constraint is the sum of the violations of its decision variables, and that it is never an underestimate of the minimal Hamming distance between the current assignment and any solution assignment. Local search proceeds from a given assignment by checking a number of neighbours of that assignment and picking a neighbour that ideally reduces the violation: the exact heuristics are often problem dependent. But in order to make local search computationally efficient, the violations of the constraint and its variables have to be computed in an incremental fashion whenever a variable changes value. When a variable changes value, only segments after that variable need to be changed. Initialisation and each incremental update only take a linear amount (in n) of work (algorithm and proof omitted for space reasons).
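Building on the unrolling sketch above, the following (non-incremental) sketch shows how an assignment is cut into segments and how Definition 2 turns them into violation counts. The function names are ours; segments are stored as variable indices rather than value labels, and the weighted successor choice assumes the current node has at least one surviving successor.

```python
import random

def segments(arcs, counts, start, assignment):
    """Cut an assignment into segments by walking the layered graph.

    A variable whose value has no arc from the current node is violated, and
    the walk jumps to a successor node chosen with probability proportional
    to its number of paths to a success node.
    """
    segs, current = [], []
    node = (1, start)
    for i, value in enumerate(assignment, start=1):
        if value in arcs[node]:                      # arc exists: extend segment
            node = (i + 1, arcs[node][value])
            current.append(i)
        else:                                        # variable i is violated
            if current:
                segs.append(current)
                current = []
            succs = list(arcs[node].values())
            weights = [counts[(i + 1, q)] for q in succs]
            node = (i + 1, random.choices(succs, weights=weights)[0])
    if current:
        segs.append(current)
    return segs

def violations(segs, n):
    """Definition 2: violation of the constraint and of each variable."""
    covered = {i for seg in segs for i in seg}
    return n - len(covered), [0 if i in covered else 1 for i in range(1, n + 1)]

# On the stand-in DFA of the previous sketch:
segs = segments(arcs, counts, start=1, assignment=list("ddxx"))
print(violations(segs, n=4))   # (1, [0, 1, 0, 0]): the second d is the violation
```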

3 Experiments

Many industries and services need to function around the clock. Rotating schedules are a popular way of guaranteeing a maximum of equity to the involved work teams (see [7]). In our example in Figure 3(a), there are day (d), evening (e), and night (n) shifts of work, as well as days off (x). Each team works at most one shift per day. The scheduling horizon has as many weeks as there are teams. In the first week, team i is assigned to the schedule in row i. For each subsequent week, each team moves down to the next row, while the team on the last row moves up to the first row. The daily workload may be uniform: in Figure 3(a) each day has exactly one team on-duty for each work shift, and hence two teams entirely off-duty; assuming the work shifts average 8h, each employee will work 7 · 3 · 8 = 168h over the five-week cycle, or 33.6h per week. Daily workload, whether uniform or not, can be enforced by global cardinality (gcc) constraints on the columns. Further, any number of consecutive workdays must be between two and seven, and any change in work shift can only occur after two to seven days off. This can be enforced by a pattern(X, {(d, x), (e, x), (n, x), (x, d), (x, e), (x, n)}) constraint [4] and a circular stretch(X, [d, e, n, x], [2, 2, 2, 2], [7, 7, 7, 7]) constraint [9] on the table flattened row-wise into a sequence X. Our model posts the two required pattern and stretch constraints on the row-wise flattened matrix of decision variables, by actually using the conjunction automaton of the pattern and stretch DFAs, as this gives better run times than the decomposition. The gcc constraints on the columns of the matrix are kept invariant: the first assignment is chosen so as to satisfy them, and then only swap moves (between distinct values) inside a column are considered. As a meta-heuristic, we use tabu search with a restarting mechanism.

     Mon Tue Wed Thu Fri Sat Sun
  1   x   x   x   d   d   d   d
  2   x   x   e   e   e   x   x
  3   d   d   d   x   x   e   e
  4   e   e   x   x   n   n   n
  5   n   n   n   n   x   x   x
(a) Classical rotating schedule

  instance          total time   opt time   iterations
  1d, 1e, 1n, 1x        18           2          14
  1d, 1e, 1n, 2x        26           2           7
  2d, 1e, 1n, 2x        26           2           9
  2d, 2e, 1n, 2x        33           5          17
  2d, 2e, 2n, 2x        42           6          20
  2d, 2e, 2n, 3x        43           7          21
  3d, 2e, 2n, 3x        52           8          25
(b) Experimental results on rotating schedules

Figure 3. (a) A five-week rotating schedule with uniform workload. (b) Average run times (total and optimisation, in milliseconds) and numbers of iterations to the first solutions of rotating schedules

The search has been implemented in Comet, using the violations from the algorithm to guide the search by selecting violated variables to be changed at each iteration. In addition to the classical instance in Figure 3(a), here denoted 1d, 1e, 1n, 2x, we ran experiments over other instances with uniform daily workload, namely those over 4 to 10 weeks where the weekly workload was between 33h and 42h. Figure 3(b) gives the average run times (in milliseconds) and numbers of iterations over 50 runs to the first solution of each of these instances. All runs were made under Comet (revision 2.0 Beta) on an Intel 2.4 GHz Linux machine with 512 MB memory. Although much more experimentation is required, these initial results show that even on instances with 70 decision variables (3d, 2e, 2n, 3x) it is possible to find solutions very quickly.

4 Conclusion

In summary, we have shown that the idea of describing novel constraints by automata can be successfully imported from classical (global search) constraint programming to constraint-based local search. The only related work we are (now) aware of is a Comet implementation [11] of the regular constraint [10], based on the ideas for the propagator of the soft regular constraint [16]. The difference is that they estimate the violation change compared to the nearest solution (in terms of Hamming distance from the current assignment), whereas we estimate it compared to one randomly picked solution. Experiments are needed to compare the two approaches. We do not claim that automata boost the expressive power of Comet, which already has a very powerful means of achieving extensibility of its modelling language, namely differentiable invariants [13]. Indeed, re-consider the unrolled automaton in Figure 2: its paths from source to success nodes correspond to an extensional definition of the constraint, and can be read off as the nested and/or formula (V1 = d ∧ ((V2 = x ∧ . . .) ∨ (V2 = d ∧ . . .))) ∨ (V1 = x ∧ . . .) ∨ (V1 = e ∧ . . .), which can be posted as a differentiable invariant (and is more compact than the disjunctive normal form obtained by the disjunction over all such paths). Our first experiments (omitted here for space reasons) indicate that our incremental

algorithms for automata exploit the problem structure better than those for differentiable invariants, but a deeper comparison is needed. Future work includes a variety of extensions: it is possible to extend the implementation to handle non-deterministic finite automata, which is interesting since they are often smaller than the equivalent DFAs; further, incorporating reification should be investigated, and finally extensions such as the counters used in [2].

Acknowledgements. The authors are supported by grant 2007-6445 of the Swedish Research Council (VR), and Jun He is also supported by grant 2008611010 of the China Scholarship Council and the National University of Defence Technology of China. Many thanks to Magnus Ågren (SICS) for some useful discussions on this work, and to the anonymous referees, especially for pointing out the existence of [11, 16].

References

1. M. Ågren, P. Flener, and J. Pearson. Generic incremental algorithms for local search. Constraints, 12(3):293–324, September 2007.
2. N. Beldiceanu, M. Carlsson, and T. Petit. Deriving filtering algorithms from constraint checkers. Proc. of CP'04, LNCS 3258:107–122. Springer, 2004.
3. N. Beldiceanu, M. Carlsson, and J.-X. Rampon. Global constraint catalogue: Past, present, and future. Constraints, 12(1):21–62, 2007. Dynamic on-line version at www.emn.fr/x-info/sdemasse/gccat.
4. S. Bourdais, P. Galinier, and G. Pesant. HIBISCUS: A CP application to staff scheduling in health care. Proc. of CP'03, LNCS 2833:153–167. Springer, 2003.
5. K.C.K. Cheng and R.H.C. Yap. Maintaining generalized arc consistency on ad hoc r-ary constraints. Proc. of CP'08, LNCS 5202:509–523. Springer, 2008.
6. M.Z. Lagerkvist. Techniques for Efficient Constraint Propagation. Licentiate Thesis, KTH – The Royal Institute of Technology, Stockholm, Sweden, November 2008.
7. G. Laporte. The art and science of designing rotating schedules. Journal of the Operational Research Society, 50(10):1011–1017, October 1999.
8. L. Michel and P. Van Hentenryck. Localizer: A modeling language for local search. Proc. of CP'97, LNCS 1330:237–251. Springer, 1997.
9. G. Pesant. A filtering algorithm for the stretch constraint. Proc. of CP'01, LNCS 2239:183–195. Springer, 2001.
10. G. Pesant. A regular language membership constraint for finite sequences of variables. Proc. of CP'04, LNCS 3258:482–495. Springer, 2004.
11. B. Pralong. Implémentation de la contrainte Regular en Comet. Master Thesis, École Polytechnique de Montréal, Canada, 2007.
12. P. Van Hentenryck and L. Michel. Constraint-Based Local Search. MIT Press, 2005.
13. P. Van Hentenryck and L. Michel. Differentiable invariants. Proc. of CP'06, LNCS 4204:604–619. Springer, 2006.
14. P. Van Hentenryck, L. Michel, and L. Liu. Constraint-based combinators for local search. Proc. of CP'04, LNCS 3258:47–61. Springer, 2004.
15. P. Van Hentenryck, V. Saraswat, and Y. Deville. Design, implementation, and evaluation of the constraint language cc(FD). Journal of Logic Programming, 37(1–3):293–316, 1998. Unpublished manuscript Constraint Processing in cc(FD), 1991.
16. W.-J. van Hoeve, G. Pesant, and L.-M. Rousseau. On global warming: Flow-based soft global constraints. Journal of Heuristics, 12(4–5):347–373, September 2006.



Research Overview: Improved Boolean Satisfiability Techniques for Haplotype Inference

Student: Eric I. Hsu    Supervisor: Sheila A. McIlraith
Department of Computer Science, University of Toronto
{eihsu,sheila}@cs.toronto.edu, http://www.cs.toronto.edu

Abstract. Here the authors overview an ongoing effort to extend satisfiability-based methods for haplotype inference by pure parsimony (HIPP). This genome analysis task, first formulated as a Boolean satisfiability problem by Lynce and Marques-Silva [1], has been performed successfully by modern SAT solvers. However, it is not as widely used as some better-publicized statistical tools, such as PHASE [2]. This paper presents the authors' assessment of the current situation, and a preliminary statement of intention concerning their aims in this area. Namely, the situation suggests three categories of improvements for making HIPP more widely used within the biological community: 1) the ability to handle larger problems; 2) more detailed empirical understanding of the accuracy of the "pure parsimony" criterion; and 3) additional criteria and methods for improving on this level of accuracy.

1 Background

As detailed in a recent overview paper [3], the haplotype inference problem is defined over a population of individuals represented by their respective genotypes. Each genotype can be viewed as a sequence of nucleotide pairs, where the two values of each pair are split across the individual's two chromosomes as inherited from their two parents. Much of the genetic variation between individuals consists of point mutations at known sites within this sequence, known as single nucleotide polymorphisms (SNPs). Thus, a genotype can be represented (with loss of information) as a sequence of nucleotide pairs at successive SNP sites. In particular, at each such site, an individual might have two copies of the "minor allele" – in this case, the presumably mutated or at least rarer of the two possible nucleotide values. At this site the individual is then homozygous (minor). Similarly, if a SNP site is realized on each chromosome by the nucleotide value that is most common for the species, then the site is homozygous (major) for this individual. The third possibility is that one of the individual's chromosomes has the major allele at a given site, while the other chromosome has the minor allele at that site – then the genotype is heterozygous at the site in question. Notationally, the first case can be represented by the character '0', the second by '1', and the third, heterozygous case, by '2'. So, the sequence "0102" indicates an individual with two minor alleles at both the first and the


third SNP sites measured by a particular study, and two major alleles at the second site. At the fourth site, we know that one chromosome exhibits the major allele and the other exhibits the minor. Thus, an individual's genotype is well-defined by the sequences of its two constituent chromosomes; these two individually inherited sequences are called haplotypes. If we overload '0' and '1' to indicate a single minor or single major allele in a particular haplotype, then we can denote that the genotype "0102" arises from the two haplotypes "0100" and "0101". Conversely, though, a genotype with more than one heterozygous site can be explained by multiple pairs of haplotypes, as the major and minor alleles at all heterozygous sites in the genotype can be permuted across corresponding sites in the two haplotypes. More concretely, the genotype "02122" could be realized by the pairs "00100"/"01111", "00101"/"01110", "00110"/"01101", or "01100"/"00111". (Naturally, the term "pair" is used colloquially here, and does not signify any sense of ordering within the haplotype sets of size two that arise in this domain.) Accordingly, there are 2^(k−1) candidate haplotype pairs for explaining a genotype with k heterozygous sites. In practice, this distinction is made relevant by the lack of any practical experimental method to measure an individual's two haplotypes instead of its genotype; in other words, the machinery can identify sites at which the two haplotypes have opposing values, but cannot tell which values are grouped together on which chromosome. The goal of haplotype inference is to guess the most likely haplotypes that generated a given set of genotypes.
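A small sketch of the 2^(k−1) expansion just described, using the 0/1/2 encoding of the text; the function name is ours and the ordering of the output is arbitrary:

```python
from itertools import product

def candidate_pairs(genotype):
    """All unordered haplotype pairs explaining a 0/1/2 genotype string."""
    het = [i for i, g in enumerate(genotype) if g == "2"]
    pairs = set()
    # enumerate all 2^k completions of the heterozygous sites and
    # deduplicate the unordered pairs, leaving 2^(k-1) of them
    for bits in product("01", repeat=len(het)):
        h1 = list(genotype)
        h2 = list(genotype)
        for i, b in zip(het, bits):
            h1[i] = b
            h2[i] = "1" if b == "0" else "0"
        pairs.add(frozenset(("".join(h1), "".join(h2))))
    return pairs

print(sorted(tuple(sorted(p)) for p in candidate_pairs("02122")))
# 4 unordered pairs, e.g. ('00100', '01111'), ('00101', '01110'), ...
```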

2 Underlying Biological Principles

How can one answer to the haplotype inference problem be preferred to any other? Because the ultimate goal is to accurately predict haplotypes appearing in a particular subject (i.e. human) population, haplotype inference frameworks must apply standards that seek to model the types of phenomena that actually drove the true state of affairs in the evolution of the subject species' genome. In other words, human haplotypes are not drawn uniformly from the space of all possible pairs that could explain human genotypes. Rather, under the coalescent model of evolution there should be only a small number of human haplotypes that were recombined and mutated to produce any of the genotypes within a given haplotype inference problem instance. Thus, early researchers used greedy methods to try to minimize the set of answers to a haplotype inference problem [4], while later systems integrated models based on "perfect phylogeny" [5] or other hierarchical organizations of answer haplotypes [6]. The most widely used techniques at this point in time integrate empirical statistical measures of likely haplotypes [7–9, 2, 10], obtained by analyzing the population in question or by consulting outside sources [11]. Such approaches may not require that the inferred set of answer haplotypes can be arranged into a particular evolutionary structure, but they all require the haplotype set to be maximally likely according to a particular statistical model that has been fit to the problem and/or outside frequency data. On the other hand, the "pure parsimony" methodology seeks to capture such phenomena implicitly by asking directly for the smallest possible set of haplotypes that as a whole can explain a given population of genotypes [12]. Methods that are based on


this criterion thus perform "HIPP", indicating haplotype inference by pure parsimony. This pure parsimony principle has been achieved optimally and efficiently by employing a variety of satisfiability-based techniques on small to moderately-sized data sets of about 200 sites and 100 individuals [1, 13–15]. However, aside from a preliminary and limited evaluation of pure parsimony, which did not include such satisfiability-based techniques [16], the overall accuracy and general feasibility of HIPP do not seem to be well understood within the biological community. That is, satisfiability-based HIPP methods must demonstrate two forms of feasibility in order to achieve wider adoption within the biological community: the empirical accuracy of the pure parsimony principle itself, and the efficiency and scalability of SAT methods for achieving this parsimony criterion. While SAT-based techniques can achieve optimal parsimony on reasonably large data sets, they still cannot be applied to the massive collections characteristic of more popular biological applications that may require on the order of half a million sites. Addressing the issue of scalability will not only enable the assessment of model accuracy and solver efficiency, but can additionally lead to more accurate results. This is because multiple minimum haplotype sets can explain the same population of genotypes [3], but some of them can be safely considered more likely a priori. For instance, solution sets whose members are more similar to each other are more realistic with respect to evolutionary theory [2, 10], and one must certainly be preferred a posteriori with respect to the truth – in the real world there was a single haplotype set that produced a set of genotypes (and this set may or may not be minimum). So while the initial goal is evaluation of the HIPP principle and HIPP solvers, and the prerequisite goal is improved solver scalability, in pursuing these we would like to synergistically attain the overarching goal of making the most accurate predictions possible, as opposed to merely minimal ones. We propose to do this by making finer-grained use of biological principles in designing SAT-based methodologies for HIPP. One prominent example of such phenomena would be "linkage disequilibrium", or correlation between sites within a sequence [17]. The sequential and mostly non-random nature of genotypes and haplotypes is at the core of many of the optimizations and models that underlie statistical alternatives to HIPP; finding a way to exploit them within a discrete reasoning framework would make HIPP competitive in efficiency and accuracy. Three proposals for doing so are outlined in the next section.
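The feasibility check underneath the pure parsimony objective is easy to state: a set H of haplotypes explains a set of genotypes if every genotype arises from some pair in H. The brute-force sketch below (ours, exponential, purely illustrative) only states that check; the SAT and pseudo-Boolean encodings of [1, 13–15] are what make searching for a smallest such H practical.

```python
from itertools import combinations_with_replacement

def explains(h1, h2, genotype):
    """True if the haplotype pair (h1, h2) yields the 0/1/2 genotype."""
    for a, b, g in zip(h1, h2, genotype):
        if (a == b and a != g) or (a != b and g != "2"):
            return False
    return True

def is_explaining_set(haplotypes, genotypes):
    """True if every genotype is explained by some pair from the set."""
    return all(any(explains(h1, h2, g)
                   for h1, h2 in combinations_with_replacement(haplotypes, 2))
               for g in genotypes)

# HIPP asks for a smallest such set; here only the feasibility test is shown.
print(is_explaining_set(["0100", "0101"], ["0102"]))   # True
```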

3 Proposals for Extending the Satisfiability-Based HIPP Framework

In this section we propose three general types of improvements to the HIPP framework that can make it competitive with the current methods of choice within the biological community.
– Exploiting sequential structure to improve scalability. The best-performing (and most widely used) statistical systems [7, 8, 2] for haplotype inference utilize explicit or implicit variants of the "partition ligation" scheme of Niu et al. [18]. The basic idea is to escape the combinatorial explosion of considering the entire space


of possible haplotypes for a given sequence, and instead break the sequence into blocks. Each such block is small enough to be solved efficiently and to high accuracy; the blocks are then recombined heuristically through a polynomial-time merging scheme. With the merging scheme comes a loss of optimality; in the case of HIPP we may not get the smallest possible explaining haplotype set by means of this scheme. However, the insight of the partition ligation scheme is that evolution does not create haplotypes uniformly at random over all possible explanations for the human genotype, and in practice the loss in optimality has proved negligible in comparison to the gains in accuracy for statistical methods [18, 7, 2]. Beyond being biologically justifiable, block partitioning can actually produce more accurate overall results, because each block can be solved to higher standards of likelihood using more computationally expensive methods. For instance, a statistical approach based on MCMC sampling can choose to perform vastly more iterations on a more complex model within the confines of a small block, while realistically hoping to retain much of the benefit during the merging process [2]. Applying this framework to SAT-based approaches can provide similar gains, whether using parsimony alone or integrating statistical information in solving blocks (a block-splitting sketch is given after this list).
– Assess the accuracy of HIPP on large-scale data. When HIPP is able to handle larger data sets, it will be possible to directly compare its accuracy with other, more popular approaches. This will allow an assessment of the model's strengths and weaknesses and inform any attempts to improve this accuracy. At this point, SAT methods have been highly successful at solving HIPP, but it remains to be seen whether the resulting answers are themselves successful at modeling the human genome – as mentioned previously here and elsewhere [3], there can be many minimum sets that all qualify as HIPP solutions, while some of these are much more empirically likely than others. Characterizing which types of these solutions usually turn out to be more accurate will go a long way towards improving the parsimony model's fit to real populations.
– Exploiting linkage disequilibrium to improve accuracy. In the same spirit as the first two points, there are important correlations between various regions of the vast majority of a species' haplotypes, due to the recombination and mutation processes that drive evolution. To compete with statistically-oriented tools, HIPP can be extended to encompass the same sort of empirical information concerning such correlations [19, 20]. This may entail a weighted SAT or a MAXSAT formulation that favors solutions that adhere to stronger correlations that have been observed from the same sources of data as used by the competitor techniques.
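The block-splitting sketch referred to in the first item is below. It only covers the trivial partition step and leaves the block solver and the ligation heuristic abstract, since that is where the real design choices lie; the block width and all names are ours and are not taken from [18].

```python
def split_into_blocks(genotypes, width):
    """Cut each 0/1/2 genotype string into consecutive blocks of equal width."""
    n_sites = len(genotypes[0])
    return [[g[start:start + width] for g in genotypes]
            for start in range(0, n_sites, width)]

def partition_ligation(genotypes, width, solve_block, merge):
    """Skeleton of partition ligation: solve blocks, then merge left to right.

    solve_block : block of genotypes -> haplotype phasing of that block
    merge       : (partial result, next block result) -> combined result
    Both are placeholders for a HIPP block solver and a ligation heuristic.
    """
    blocks = split_into_blocks(genotypes, width)
    result = solve_block(blocks[0])
    for block in blocks[1:]:
        result = merge(result, solve_block(block))
    return result
```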

4 Conclusion

The authors have begun to implement block decomposition within the SAT-based HIPP framework, but this description of research is decidedly preliminary and strictly for expository purposes. Once the system is able to handle large problems, the next step will be to study its accuracy, especially in terms of adding additional solution criteria to pure parsimony. The concluding step will be to use information derived from individual problem instances and/or reference data to actually achieve such criteria. It would be a


great opportunity to be able to discuss such plans with others who are working on this problem and related areas!

References

1. Lynce, I., Marques-Silva, J.: Efficient haplotype inference with boolean satisfiability. In: Proc. of 21st National Conference on Artificial Intelligence (AAAI '06), Boston, MA (2006)
2. Stephens, M., Scheet, P.: Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. American Journal of Human Genetics 76 (2005) 449–462
3. Lynce, I., Graça, A., Marques-Silva, J., Oliveira, A.L.: Haplotype inference with boolean constraint solving: An overview. In: Proc. of 20th IEEE Int'l Conf. on Tools with Artificial Intelligence (ICTAI '08), Dayton, OH (2008)
4. Clark, A.G.: Inference of haplotypes from PCR-amplified samples of diploid populations. Molecular Biology and Evolution 7(2) (1990) 111–122
5. Kimmel, G., Shamir, R.: GERBIL: Genotype resolution and block identification using likelihood. PNAS 102(1) (2005) 158–162
6. Xing, E.P., Sohn, K.A., Jordan, M.I., Teh, Y.W.: Bayesian multi-population haplotype inference via a hierarchical dirichlet process mixture. (2006)
7. Marchini, J., Cutler, D., Patterson, N., Stephens, M., Eskin, E., Halperin, E., Lin, S., Qin, Z.S., Munro, H.M., Abecasis, G.R., Donnelly, P.: A comparison of phasing algorithms for trios and unrelated individuals. American Journal of Human Genetics 78 (2006) 437–450
8. Browning, S.R.: Missing data imputation and haplotype phase inference for genome-wide association studies. Human Genetics 124 (2008) 439–450
9. Salem, R.M., Wessel, J., Schork, N.J.: A comprehensive literature review of haplotyping software and methods for use with unrelated individuals. Human Genomics 2
10. Scheet, P., Stephens, M.: A fast and flexible statistical model for large-scale population genotype data: Applications to inferring missing genotypes and haplotypic phase. American Journal of Human Genetics 78 (2006) 629–644
11. The International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature 449 (2007) 851–862
12. Gusfield, D.: Haplotype inference by pure parsimony. In: Proc. of 14th Symp. on Combinatorial Pattern Matching (CPM '03), Morelia, Mexico (2003) 144–155
13. Graça, A., Marques-Silva, J., Lynce, I., Oliveira, A.L.: Efficient haplotype inference with pseudo-boolean optimization. In: Proc. of 2nd Int'l Conf. on Algebraic Biology (AB '07), Linz, Austria (2007) 125–139
14. Erdem, E., Türe, F.: Efficient haplotype inference with answer set programming. In: Proc. of 23rd National Conference on A.I. (AAAI '08), Chicago, IL (2008) 436–441
15. Lynce, I., Marques-Silva, J., Prestwich, S.: Boosting haplotype inference with local search. Constraints 13(1-2) (2008) 155–179
16. Wang, L., Xu, Y.: Haplotype inference by maximum parsimony. Bioinformatics 19(14) (2003) 1773–1780
17. Slatkin, M.: Linkage disequilibrium – understanding the evolutionary past and mapping the medical future. Nature Reviews Genetics 9 (2008) 477–485
18. Niu, T., Qin, Z.S., Xu, X., Liu, J.S.: Bayesian haplotype inference for multiple linked single nucleotide polymorphisms. American Journal of Human Genetics 70(1) (2002) 157–169
19. Excoffier, L., Slatkin, M.: Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Molecular Biology and Evolution 12(5) (1995) 921–927
20. Kuhner, M.K.: LAMARC 2.0: Maximum likelihood and Bayesian estimation of molecular haplotype frequencies in a diploid population. Bioinformatics 22(6) (2006) 768–770


Foundations of Symmetry Breaking Revisited

Tim Januschowski (student),1,2⋆ Barbara M. Smith,3 and M.R.C. van Dongen2
1 Cork Constraint Computation Centre
2 Computer Science Department, University College Cork, Cork, Ireland
3 School of Computing, University of Leeds, UK
[email protected]
⋆ Tim Januschowski is supported by the EMBARK initiative of the Irish Research Council for Science, Engineering and Technology.

Abstract. Puget’s article “On the Satisfiability of Symmetrical Constraint Satisfaction Problems” [6] is widely acknowledged to be one of the founding papers in symmetry breaking (see for example [3, 7]). To date, Puget’s definition of a valid reduction seems to be the only common ground that all authors on static symmetry breaking constraints agree on. His original definition of a valid reduction is for a restricted form of symmetries. We extend Puget’s definition to the recent, more general definition of symmetries by Cohen et al. [1]. In our extension, we require a valid reduction to be a tighter constraint satisfaction problem instead of one with more constraints. We re-formulate Puget’s central theorems on valid reductions for our definition and present a stronger result on the existence of valid reductions.

1 Introduction

Symmetries are a common property of Constraint Satisfaction Problems (csps). In practice, symmetries are often a key to an effective solution of symmetric csps: only if we properly exploit the symmetries may we solve such csps within a reasonable time. The classic approach to dealing with symmetries is the addition of so-called symmetry breaking constraints before search. This is the most commonly used symmetry breaking approach in practice and it receives considerable attention. We consider the theoretical foundations of the addition of symmetry breaking constraints in the present work, following Puget [6]. Puget introduces his symmetry breaking constraint [6] as a constraint whose addition to a given csp leads to a related csp that he calls a valid reduction. Valid reductions are defined without dependence on any specific symmetry breaking constraint, however. This is why they seem to be the only common ground that authors on symmetry breaking constraints agree on, e.g. [2, 5]. These constraints differ in so many aspects that valid reductions may offer the only framework to theoretically compare the different symmetry breaking constraints. Furthermore, valid reductions offer a means to study the theoretical effects of the addition of symmetry breaking constraints. This is the reason why an investigation of valid reductions is important for the current research on symmetry breaking constraints. We see valid reductions as a first step towards a general theory of static symmetry breaking.

We informally revisit Puget's definition of a valid reduction for a restricted form of symmetry. However, the recent definition of symmetries by Cohen et al. [1] is much wider. We carry Puget's definition of a valid reduction over to this definition of symmetry and require a valid reduction to be a tighter csp than the original csp. Next we provide a simple example that shows that adding Puget's constraints may not change the csp. This underlines the importance of our change of emphasis in the definition. We present a valid reduction that has more symmetries than the original csp. We identify the cases in which we can find a valid reduction that differs from the original csp and we show stronger versions of Puget's central theorems. Due to space restrictions, we have omitted all proofs. We show an example of a symmetric csp whose symmetries cannot be eliminated.

2 Preliminary Definitions

We define a csp P as the triple (X, D, C), where X is the set of variables of P, where D(x) is the domain of x ∈ X, and where C is the set of constraints of P. Since we are interested in theoretical results, we mainly consider constraints in extensional form. We say that the tuples in the relation of a constraint c are allowed by c. All other tuples are forbidden by c. We also call a forbidden tuple a no-good. For ease of presentation, we want to exclude csps that have variables with empty domains. We further assume that in the domain of a variable no values exist that are forbidden by a unary constraint. We call a (variable, value) assignment a literal. A solution of P is an assignment of values to all variables that is allowed by all constraints. If a solution exists, we say that P is satisfiable, otherwise P is unsatisfiable. A hypergraph G is a tuple (V, E) with V or V(G) the set of nodes and E or E(G) the set of hyperedges. For V̂ ⊆ V, we denote the subgraph of G that is induced by V̂ by G[V̂]. The microstructure complement (msc) of a csp (X, D, C) is the hypergraph G, where V(G) = ⋃_{x∈X} {x} × D(x) and {(x1, t1), (x2, t2), . . . , (xk, tk)} is in E(G) if and only if:
– a constraint in C with scope {x1, x2, . . . , xk}, k > 1, exists that forbids the tuple ⟨t1, . . . , tk⟩, or
– k = 2, x1 = x2 and t1 ≠ t2.

For binary csps, this graph is the complement of the microstructure [4]. We want to define the msc as a loop-free hypergraph, which explains why k > 1. More specifically, we assume that literals forbidden by unary constraints do not appear as nodes in the msc. A stable set in a hypergraph G is a set of nodes V̂ such that G[V̂] does not contain any hyperedge. A csp with n variables is satisfiable if and only if its msc contains a stable set with n nodes. A graph automorphism of a hypergraph G is a permutation of V(G) that preserves adjacency. We denote the group of

50

automorphisms of G by aut(G). The orbit of a set of nodes V̂ in V(G) is the set of all images of V̂ under elements of aut(G). We denote it by orbit(V̂). From now on, we shall write graph instead of hypergraph. The following definition, by Cohen et al. [1], is the current state of the art for symmetry definitions.

Definition 1 (Constraint Symmetry [1]). Let G be the msc of a csp P. A symmetry or constraint symmetry of P is an element of aut(G).

Puget [6] considers a strict subclass of the symmetries in Def. 1. Having defined what the symmetries of a csp are, we proceed by defining a symmetric csp. Clearly, the identity is an automorphism of any csp. It is natural to exclude csps whose only automorphism is the identity from being called symmetric csps. For technical reasons we also exclude csps the domains of which are all singletons.

Definition 2 (Symmetric CSP). A csp having at least one variable with domain size strictly larger than 1 is symmetric if it admits more than one symmetry.

Note that an empty csp is not symmetric according to this definition. We now have all the ingredients to extend Puget's definition of a valid reduction.

3 Valid Reductions

Symmetries partition the set of solutions into orbits. The addition of a symmetry breaking constraint always produces a csp with at least one representative of each orbit of solutions: an orbit representative. Puget defines a valid reduction as a csp with more constraints than the original csp such that for every orbit of solutions (under Puget's restricted symmetries) at least one orbit representative remains. The following is our definition.

Definition 3 (Valid Reduction). Let P = (X, D, C) be a csp with n variables and msc G. Let P̃ = (X, D̃, C̃) be a csp with the same variables as P and with ∅ ≠ D̃(x) ⊆ D(x) for all x ∈ X. We call P̃ a valid reduction of P if the msc G̃ of P̃ fulfils the following conditions:
1. a) V(G̃) ⊆ V(G) and b) E(G[V(G̃)]) ⊆ E(G̃), and
2. for every maximum stable set S of size n in G, at least one element of orbit(S, aut(G)) exists in G̃.

According to the definition, we can produce a valid reduction by adding hyperedges to the msc of the csp, which is equivalent to adding a constraint of arity 2 or higher to the csp, and by removing nodes from the msc, which is equivalent to adding a unary constraint. Definition 3 is an extension of Puget's original definition since we allow for more general symmetries to define the orbits. We stress that we may add unary constraints to a csp (Condition 1a). Puget implicitly allows unary constraints in his definition, though "at the core of [his] method" is "adding ordering constraints" [6]. We disallow unary constraints that remove all literals corresponding

Though this may reduce symmetries, it may make it too difficult to relate a valid reduction to the original csp and to reconstruct solutions. Further work will address this. In Def. 3, we replace Puget's requirement of a higher number of constraints by the requirement of fewer stable sets (partial or full solutions) in the msc. We ensure fewer stable sets in our definition by requiring the msc of a valid reduction either to have fewer nodes or to have more edges on the set of nodes that the valid reduction and the original csp have in common (Condition 1b). We shall say that a valid reduction with fewer stable sets in its msc than the original msc is tighter. A higher number of constraints in Puget's sense could mean a mere repetition of an already present constraint. Valid reductions are defined in terms of removing symmetric solutions rather than eliminating symmetries. However, by removing symmetric solutions, we may also eliminate symmetries. The existence of orbit representatives ensures that we can reconstruct all solutions of the original csp by applying the symmetries to the orbit representatives in the original csp. Puget proved a weaker version of the following theorem.

Theorem 4. Any csp P is satisfiable if and only if any valid reduction of P is satisfiable.

Besides Theorem 4, Puget's other central result is the existence of a valid reduction for symmetric csps. He proves that, for a csp with a symmetry φ that maps a variable x to a different variable y, adding the constraint x ≤ y always produces a valid reduction. The following example shows that adding this constraint may not produce a tighter valid reduction. Consider a csp P with 2 variables x and y, each with domain {1, 2}, and one extensional constraint that only allows {(x, 2), (y, 2)} and forbids all other tuples. This csp has a symmetry (also in Puget's restricted terminology) that swaps the variables x and y. Adding x ≤ y does not change the msc, but raises the number of constraints of P. Hence, producing a tighter valid reduction with Puget's constraints is not always possible. We believe that inventing the symmetry breaking constraint x ≤ y is a more important achievement than the statement of the existence theorem for valid reductions. Otherwise, Puget could have proved the existence theorem by introducing a constraint that is a copy of an already existing constraint. Let us give another example that shows that adding ordering constraints may result in a valid reduction that has more symmetries than the original csp.

Example 5. Consider a csp with two variables x and y, each with domain {1, 2, 3}. The constraint on x and y allows {(x, 1), (y, 3)}, {(x, 3), (y, 1)} and {(x, 3), (y, 3)}. The microstructure of this csp, with symmetry group of size 4, is depicted in Fig. 1a. Consider the symmetry ψ which swaps (x, 2) with (y, 2) and is the identity on all other literals. Symmetry ψ cannot be eliminated through the addition of an ordering constraint, e.g., y ≤ x. We show the microstructure of a valid reduction in Fig. 1b. The symmetry group of the valid reduction has size 12.

Example 5 shows that adding ordering constraints may not always eliminate symmetries. However, removing nodes would eliminate the symmetries of Ex. 5.

[Figure 1: two microstructures on the literals (x, 1), (x, 2), (x, 3), (y, 1), (y, 2), (y, 3); edges omitted here.]

(a) The microstructure of Ex. 5.

(b) Adding y ≤ x to the csp in Ex. 5 produces the above valid reduction with a larger symmetry group than the original csp.

Fig. 1: The microstructure of the csp in Ex. 5 and one of its valid reductions.

In Def. 3, we do not require a valid reduction to be strictly tighter. Any csp, symmetric or not, is a valid reduction of itself. This generalises Puget's original existence theorem for valid reductions. When we reason about the existence of valid reductions, we must therefore be interested in showing the existence of a valid reduction that is strictly tighter than the original, symmetric csp. We call such a valid reduction proper. The following theorem answers the question of which symmetric csps admit a proper valid reduction.

Theorem 6. A symmetric csp P has a proper valid reduction if and only if it has a partial solution that is not part of a solution, or it has a symmetry φ that maps a solution S to a solution φ(S) such that S ≠ φ(S). If P has such a symmetry, then a proper valid reduction can be obtained by adding a unary constraint.

Theorem 6 singles out the symmetries that map a solution to a different solution as the important ones for valid reductions. However, before starting to prove the satisfiability of a csp, we may only know that symmetries exist, e.g., through automatic detection in preprocessing. We cannot assume knowing whether the symmetries map a solution to a different solution. In Fig. 2, we depict the microstructure of a symmetric csp that does not admit a proper valid reduction. Apart from the identity, only one automorphism exists. This automorphism swaps the grey-coloured literals (z, 3) and (y, 2), and is the identity on all other literals. Every partial solution is part of a solution whose orbit consists only of itself. Hence, we cannot remove a node or an edge. This means that no proper valid reduction of this csp exists. An example of a non-symmetric csp without a proper valid reduction is the empty csp.
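Operationally, the condition of Theorem 6 is easy to test once the solutions and a symmetry are in hand (finding them is, of course, the hard part). The following minimal Python sketch (ours) checks it for the csp of Example 5, using the symmetry ψ from the example and, for contrast, the variable swap x ↔ y.

# Literals are (variable, value) pairs; a symmetry is given as a permutation of literals.
def maps_solution_elsewhere(solutions, phi):
    """True iff phi maps some solution to a different solution."""
    sols = [frozenset(s) for s in solutions]
    return any(frozenset(phi.get(l, l) for l in s) != s for s in sols)

# Example 5: solutions {(x,1),(y,3)}, {(x,3),(y,1)}, {(x,3),(y,3)};
# psi swaps (x,2) and (y,2) and fixes every other literal.
sols = [{("x", 1), ("y", 3)}, {("x", 3), ("y", 1)}, {("x", 3), ("y", 3)}]
psi = {("x", 2): ("y", 2), ("y", 2): ("x", 2)}
swap_xy = {("x", v): ("y", v) for v in (1, 2, 3)}
swap_xy.update({("y", v): ("x", v) for v in (1, 2, 3)})
print(maps_solution_elsewhere(sols, psi))      # False: psi fixes every solution
print(maps_solution_elsewhere(sols, swap_xy))  # True: the x/y swap moves {(x,1),(y,3)}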

4

Conclusion

We have presented Puget’s fundamental work [6] on symmetry breaking in light of the recent definition of symmetry [1]. We have re-formulated his central results and strengthened them. Our results suggest that symmetry breaking based on the idea of valid reductions may, in theory, only supply additional information to a problem if symmetric solutions exist. This is somewhat disappointing, since the existence of symmetric solutions can be decided only by solving the csp.

[Figure 2: a microstructure on the literals (x, 1), (x, 2), (y, 1), (y, 2), (z, 1), (z, 2), (z, 3); edges omitted here.]

Fig. 2: The microstructure of a symmetric csp which does not admit a proper valid reduction. The only non-trivial symmetry of this csp only swaps the grey-coloured nodes (z, 3) and (y, 2). We can neither remove a node nor remove an edge.

Valid reductions are among the requirements for a constraint to be considered as a symmetry breaking constraint. Hence, they offer a possibility to analyse the theoretical power of symmetry breaking constraints. While valid reductions allow for constraints that are far from our intuition about symmetry breaking constraints, valid reductions can still give us an idea about the theoretical limits of symmetry breaking constraints. In further work, we will consider valid reductions with further properties. We will analyse and compare the very diverse symmetry breaking constraints using valid reductions.

References
1. D. Cohen, P. Jeavons, C. Jefferson, K. Petrie, and B. Smith. Symmetry definitions for constraint satisfaction problems. Constraints, 11, 2006.
2. J. Crawford, M. Ginsberg, E. Luks, and A. Roy. Symmetry-breaking predicates for search problems. In KR'96: Principles of Knowledge Representation and Reasoning, pages 148–159. Morgan Kaufmann, 1996.
3. P. Flener, J. Pearson, M. Sellmann, and P. Van Hentenryck. Static and dynamic structural symmetry breaking. In Principles and Practice of Constraint Programming – CP 2006, volume 4204 of Lecture Notes in Computer Science, pages 695–699, 2006.
4. E. C. Freuder. Eliminating interchangeable values in constraint satisfaction problems. In Proceedings AAAI'91, pages 227–233, 1991.
5. V. Kaibel and M. E. Pfetsch. Packing and partitioning orbitopes. Mathematical Programming, 114(1):1–36, 2008.
6. J.-F. Puget. On the satisfiability of symmetrical constrained satisfaction problems. In Methodologies for Intelligent Systems, volume 689 of Lecture Notes in Computer Science, pages 350–361, London, UK, 1993. Springer-Verlag.
7. F. Rossi, P. van Beek, and T. Walsh, editors. Handbook of Constraint Programming. Elsevier, 2006.


Energetic Edge-Finder For Cumulative Resource
Roger Kameugne¹ (student) and Laure Pauline Fotso²
¹ University of Yaounde 1, Department of Mathematics, P.O. Box 812, Yaounde, Cameroon. [email protected] or [email protected]
² University of Yaounde 1, Department of Computer Science, P.O. Box 812, Yaounde, Cameroon. [email protected]

Abstract. Edge-finding, not-first/not-last, sweeping and overload checking are well-known propagation rules used to prune the start and end times of tasks which have to be processed without interruption on a cumulative resource. In this paper, we present a new propagation rule, called the energetic edge-finder, which can achieve additional pruning. The new algorithm is organized in two phases. The first phase precomputes the innermost maximization of the edge-finder specification and identifies the characteristics of the corresponding task interval. The second phase, based on an energetic reasoning condition, uses this precomputation to apply the actual updates. The overall complexity of our algorithm is O(n²), since the running time of each phase is O(n²).

1

Introduction

As suggested in [MV05], for a given pair (i, Θ), where i is a task and Θ a set of tasks, if it can be proved that in all feasible schedules task i ends after the completion time of all tasks of Θ, then the rest-based update of edge-finding is valid. In this paper, we present a two-phase algorithm, a hybridization of edge-finding and energetic reasoning, which can achieve additional pruning. The first phase precomputes the innermost maximization of the edge-finder specification and identifies the characteristics of the corresponding task interval. It is based on our edge-finder presented in [KF09b]. Indeed, in [KF09b], we propose an iterative O(n²) implementation of edge-finding which reaches the same fixpoint as the rule after several runs. This idea was already used for other rules [TL00,Vi04,KF09a]. The second phase, based on an energetic reasoning condition, uses the precomputation to apply the actual updates. The overall complexity of our algorithm is O(n²), since the running time of each phase is O(n²). The rest of the paper is organized as follows. Section 2 presents the notations used in the paper. In Section 3, we motivate the approach on an example. Section 4 specifies the energetic edge-finder rule and a relaxation of this new rule. Section 5 presents an O(n²) implementation of the relaxed energetic edge-finder rule. Section 6 concludes the paper.


2

Notations

An instance of a CuSP consists of a set T of tasks to be performed on a resource of capacity C. Each task i must be executed (without interruption) over pi time units between an earliest start time ri (release date) and a latest end time di (due date). Moreover, it requires a constant amount ci of the resource. K = {ci, i ∈ T} denotes the set of distinct capacity requirements of the tasks and k = |K|. Throughout the paper, we assume that ri + pi ≤ di and ci ≤ C, otherwise the problem has no solution. We also assume that all data are integer. A solution of a CuSP is a schedule that assigns a starting date si to each task i so that:
1. ∀i ∈ T : ri ≤ si ∧ si + pi ≤ di;
2. ∀t : Σ_{i ∈ T, si ≤ t < si + pi} ci ≤ C.

ei = ci · pi denotes the energy of task i. We extend the notation from tasks to sets of tasks by setting

rΩ = min_{j∈Ω} rj,  dΩ = max_{j∈Ω} dj,  eΩ = Σ_{j∈Ω} ej

where Ω is a set of tasks. By convention, when Ω is the empty set, rΩ = +∞, dΩ = −∞ and eΩ = 0. CuSP is a sub-problem of the Resource Constrained Project Scheduling Problem (RCPSP), where precedence constraints are relaxed and a single resource is considered at a time. The CuSP is an NP-complete problem [BL99].

Definition 1 (Task Intervals). Let i, k ∈ T (possibly the same task). The task interval Ωi,k is the set of tasks Ωi,k = {j ∈ T | ri ≤ rj ∧ dj ≤ dk}.

Let i be a task and [t1, t2] be a time interval with t1 < t2. The "left-shift/right-shift" required energy consumption of i over [t1, t2], noted WSh, is ci times the minimum of the three following durations:
• t2 − t1, the length of the interval;
• p⁺i(t1) = max(0, pi − max(0, t1 − ri)), the number of time units during which i executes after time t1 if i is left-shifted, i.e., scheduled as soon as possible;
• p⁻i(t2) = max(0, pi − max(0, di − t2)), the number of time units during which i executes before time t2 if i is right-shifted, i.e., scheduled as late as possible.

This leads to WSh(i, t1, t2) = ci · min(t2 − t1, p⁺i(t1), p⁻i(t2)). The overall required energy consumption over an interval [t1, t2] is defined as WSh(t1, t2) = Σ_{i∈T} WSh(i, t1, t2). This required energy consumption was first defined in [ELH,Lo].

Other varieties of required energy consumption are available: we have, for example, the Fully Elastic required energy, noted WFE, and the Partial Elastic required energy, noted WPE [BL99,B02]. It is obvious that if there is a feasible schedule, then

∀t1, ∀t2 ≥ t1 :  SSh(t1, t2) = C(t2 − t1) − WSh(t1, t2) ≥ 0.   (1)

We present only the algorithm updating start times, because the algorithm updating end times is symmetrical.

3

Motivation

For more visibility, let us analyze the example given in Table 1.

Example 1. Consider the CuSP instance of Table 1, where five tasks share a resource of capacity C = 3.

Table 1. An instance of CuSP

  task   r    d    p    c
  a      1   10    4    1
  b      3    6    1    2
  c      3    6    1    2
  d      3    5    1    3
  e      5    7    2    1

This instance satisfies the necessary condition of existence of a feasible schedule given by relation (1). Using the edge-finding, extended edge-finding, not-first/not-last and sweeping propagation rules on this instance, no time bound is adjusted. But the hybrid rule described below allows the release date of task a to be updated to 4. Indeed, ∆+(a, 3, 6) = 1 and rest(Θ, ca) = 1 with Θ = {b, c, d}, where ∆+(a, t1, t2) = WSh(t1, t2) − WSh(a, t1, t2) + ca · p⁺a(t1) − C · (t2 − t1) and rest(Θ, ca) = eΘ − (C − ca)(dΘ − rΘ). Therefore, it follows that ra = rΘ + ⌈rest(Θ, ca)/ca⌉ = 4, since da > dΘ = 6 and ∆+(a, 3, 6) > 0.
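The numbers above can be reproduced with the following short Python sketch (ours, written only to check the example); it computes WSh, ∆+ and rest for the instance of Table 1.

from math import ceil

C = 3
tasks = {               # r, d, p, c as in Table 1
    "a": (1, 10, 4, 1),
    "b": (3,  6, 1, 2),
    "c": (3,  6, 1, 2),
    "d": (3,  5, 1, 3),
    "e": (5,  7, 2, 1),
}

def p_plus(r, d, p, t1):  return max(0, p - max(0, t1 - r))
def p_minus(r, d, p, t2): return max(0, p - max(0, d - t2))
def w_sh_one(r, d, p, c, t1, t2):
    return c * min(t2 - t1, p_plus(r, d, p, t1), p_minus(r, d, p, t2))
def w_sh(t1, t2):
    return sum(w_sh_one(*tasks[i], t1, t2) for i in tasks)

def delta_plus(i, t1, t2):
    r, d, p, c = tasks[i]
    return w_sh(t1, t2) - w_sh_one(r, d, p, c, t1, t2) + c * p_plus(r, d, p, t1) - C * (t2 - t1)

def rest(theta, ci):
    r_t = min(tasks[j][0] for j in theta)
    d_t = max(tasks[j][1] for j in theta)
    e_t = sum(tasks[j][2] * tasks[j][3] for j in theta)
    return e_t - (C - ci) * (d_t - r_t), r_t

ca = tasks["a"][3]
rst, r_theta = rest({"b", "c", "d"}, ca)
print(delta_plus("a", 3, 6), rst)            # 1 1
print(r_theta + ceil(rst / ca))              # 4, the new release date of task a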

4

Energetic edge-finder

In this section, after a specification of the energetic edge-finder rule, we focus on a relaxation of it for which we can provide a quadratic algorithm.

4.1

Specification of the energetic edge-finder algorithm.

As for energetic reasoning, the energetic edge-finder requires a CuSP instance satisfying the necessary condition of existence of a feasible schedule of relation (1). We have the following definition.

Definition 2 (Energetic edge-finding algorithm). The energetic edge-finding algorithm receives as input a CuSP instance satisfying the necessary condition of relation (1). It produces as output a vector ⟨ELB′(1), ..., ELB′(n)⟩ where

ELB′(i) = max(ri, ELB(i))

and

ELB(i) = max { ⌈rΘ + rest(Θ, ci)/ci⌉ : Θ ⊆ T, t1 < t2, ∆+(i, t1, t2) > 0, rest(Θ, ci) > 0, dΘ ≤ t2, di > t2 }

with

∆+(i, t1, t2) = WSh(t1, t2) − WSh(i, t1, t2) + ci · p⁺i(t1) − C · (t2 − t1)

and

rest(Θ, ci) = eΘ − (C − ci)(dΘ − rΘ) if Θ ≠ ∅, and 0 otherwise.

From our O(n²) implementation of edge-finding [KF09b], we can derive a precomputation phase running in O(n²). In spite of our efforts, we were unable to exhibit a quadratic algorithm to compute all the adjustments over the O(n²) relevant time intervals.

4.2

Specification of the relaxed energetic edge-finder algorithm

To provide a quadratic algorithm, we have considered a relaxation of the rule. For a given task i, we pay attention only to time intervals [t1, t2] satisfying t1 = rΘ, t2 = dΘ and rest(Θ, ci) > 0. Therefore, we have the following specification of the relaxed version of the energetic edge-finder.

Definition 3 (Relaxed energetic edge-finding algorithm). The relaxed energetic edge-finding algorithm receives as input a CuSP instance satisfying the necessary condition of relation (1). It produces as output a vector ⟨ELB1′(1), ..., ELB1′(n)⟩ where

ELB1′(i) = max(ri, ELB1(i))

and

ELB1(i) = max { ⌈rΘ + rest(Θ, ci)/ci⌉ : Θ ⊆ T, ∆+(i, rΘ, dΘ) > 0, di > dΘ, rest(Θ, ci) > 0 }

with

∆+(i, rΘ, dΘ) = WSh(rΘ, dΘ) − WSh(i, rΘ, dΘ) + ci · p⁺i(rΘ) − C · (dΘ − rΘ)

and

rest(Θ, ci) = eΘ − (C − ci)(dΘ − rΘ) if Θ ≠ ∅, and 0 otherwise.

5

Relaxed energetic edge-finder algorithm

5.1

Precomputation

We call PreComp the adapted version of our O(n²) edge-finding algorithm presented in [KF09b], which computes the innermost maximization of the edge-finder specification and identifies the characteristics of the corresponding task interval. It returns an array g of length 3n, where n is the number of tasks. For a given index i with 1 ≤ i ≤ n, g(i) is the value used to update the release date of task X(i), where X is an array of tasks sorted in non-decreasing order of release dates, as required by the algorithm. g(n + i) and g(2n + i) are respectively the release date and the due date of the task interval used to compute g(i). For more detail, see the appendix and [KF09b].

5.2

Relaxed energetic edge-finder algorithm

Using the precomputation phase PreComp, we can derive a second phase for the relaxed form of the energetic edge-finder. This second phase runs in O(n²) and performs additional adjustments.

Require: X is an array of tasks sorted by non-decreasing release dates;
Require: Y is an array of tasks sorted by non-decreasing due dates;
Ensure: ELB(x) are computed for all 1 ≤ x ≤ n.
1: g := PreComp();
2: for i := 1 to n do
3:   ELB(i) := rX(i);
4: end for
5: for i := 1 to n do
6:   if g(i) > ELB(i) then
7:     t1 := g(n + i), t2 := g(2n + i);
8:     if t1 < t2 then
9:       W := 0;
10:      for x := 1 to n do
11:        if x ≠ X(i) then
12:          W := W + WSh(x, t1, t2);
13:        end if
14:      end for
15:      if W + cX(i) · p⁺X(i)(t1) > C(t2 − t1) then
16:        ELB(i) := max(ELB(i), g(i));
17:      end if
18:    end if
19:  end if
20: end for

Algorithm 1: CompEEF: Energetic edge-finder algorithm in O(n²) time and O(n) space.

It is possible to increase the power of the algorithm without changing its complexity. Line 16 of Algorithm 1 can be replaced by the following line.

16': ELB(i) := max(ELB(i), g(i), t2 + ⌈(W + cX(i) · p⁺X(i)(t1) − C(t2 − t1)) / cX(i)⌉ − pX(i))

It is possible to compute the complete energetic edge-finder with the partial elastic required energy in O(n² log(k)). Indeed, with the partial elastic required energy, the time points t1 and t2 correspond to release dates and due dates of some tasks. We can therefore derive an O(n² log(k)) algorithm for the energetic edge-finder, since Baptiste's algorithm runs in O(n² log(k)) [BL99,B02].
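For checking purposes, the relaxed rule of Definition 3 can also be applied by brute force. The Python sketch below (ours, deliberately naive) enumerates the task intervals Ω_{j,k} instead of all subsets Θ — an assumption on our part, in the spirit of the dominance properties recalled in the appendix — and is therefore only a reference oracle, not the authors' O(n²) two-phase algorithm.

from math import ceil

def relaxed_energetic_ef(tasks, C):
    """Brute force over task intervals; tasks: dict name -> (r, d, p, c)."""
    def p_plus(i, t1):
        r, d, p, c = tasks[i]; return max(0, p - max(0, t1 - r))
    def p_minus(i, t2):
        r, d, p, c = tasks[i]; return max(0, p - max(0, d - t2))
    def w_sh_one(i, t1, t2):
        c = tasks[i][3]; return c * min(t2 - t1, p_plus(i, t1), p_minus(i, t2))
    def w_sh(t1, t2):
        return sum(w_sh_one(i, t1, t2) for i in tasks)

    new_r = {i: tasks[i][0] for i in tasks}
    names = list(tasks)
    for j in names:                      # task intervals Omega_{j,k}
        for k in names:
            omega = [l for l in names
                     if tasks[j][0] <= tasks[l][0] and tasks[l][1] <= tasks[k][1]]
            if not omega:
                continue
            r_o = min(tasks[l][0] for l in omega)
            d_o = max(tasks[l][1] for l in omega)
            e_o = sum(tasks[l][2] * tasks[l][3] for l in omega)
            if r_o >= d_o:
                continue
            for i in names:
                ci = tasks[i][3]
                rest = e_o - (C - ci) * (d_o - r_o)
                delta = (w_sh(r_o, d_o) - w_sh_one(i, r_o, d_o)
                         + ci * p_plus(i, r_o) - C * (d_o - r_o))
                if tasks[i][1] > d_o and rest > 0 and delta > 0:
                    new_r[i] = max(new_r[i], r_o + ceil(rest / ci))
    return new_r

# On the instance of Table 1 this yields the adjustment r_a = 4 found in Section 3.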

6

Conclusion

Edge-finding, not-first/not-last, sweeping, overload checking and energetic reasoning are well-known propagation rules used to prune the start and end times of tasks which have to be processed without interruption on a cumulative resource. In this paper, we have presented a new propagation rule, called the energetic edge-finder, which can achieve additional pruning. The new algorithm is organized in two phases. The first phase precomputes the innermost maximization of the edge-finder specification and identifies the characteristics of the corresponding task interval. The second phase, based on an energetic reasoning condition, uses the precomputation to apply the actual updates. The overall complexity of our algorithm is O(n²), since the running time of each phase is O(n²).


References
[BL99] Ph. Baptiste, C. Le Pape, and W. Nuijten. Satisfiability Tests and Time-Bound Adjustments for Cumulative Scheduling Problems. Annals of Operations Research, 92:305–333, 1999.
[TL00] P. Torres and P. Lopez. On Not-First/Not-Last Conditions in Disjunctive Scheduling. European Journal of Operational Research, 127(2):332–343, 2000.
[Vi04] P. Vilim. O(n log n) Filtering Algorithms for Unary Resource Constraint. In Proceedings of CP-AI-OR, 2004.
[KF09b] R. Kameugne, L. P. Fotso, and Y. Ngo-Kateu. Cumulative Edge-Finding Algorithm in O(n²). Submitted for publication to INFORMS Journal on Computing, 2009.
[KF09a] R. Kameugne, L. P. Fotso, and E. Kouakam. Cumulative Not-First/Not-Last Algorithms in O(n³). Submitted for publication to Journal of Scheduling, 2009.
[MV05] L. Mercier and P. Van Hentenryck. Edge Finding for Cumulative Scheduling. INFORMS Journal on Computing, 20(1):143–153, 2008.
[B02] Ph. Baptiste. Résultats de Complexité et Programmation par Contraintes pour l'Ordonnancement. HDR thesis, Compiègne University of Technology, 2002.
[ELH] P. Esquirol, P. Lopez, and M.-J. Huguet. Ordonnancement de la production, chapitre Propagation de contraintes en ordonnancement. Hermès Science Publications, Paris, 2001.
[Lo] P. Lopez. Approche énergétique pour l'ordonnancement de tâches sous contraintes de temps et de ressources. Thèse de doctorat, Université Paul Sabatier, Toulouse, 1991.


7

Appendix

Proposition 1 summarizes the main dominance properties of the edge-finding rule.

Proposition 1. Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T. If the edge-finding rule applied to task i with the pair (Ω, Θ) allows the earliest start time of i to be updated, then there exist four tasks j, k, j′, k′ such that rj ≤ rj′ < dk′ ≤ dk < di ∧ rj ≤ ri and the edge-finding rule applied to task i with the pair (Ωj,k, Ωj′,k′) allows the same update of the earliest start time of task i.

Proof. See the proofs of Propositions 2, 3, 4 and 6 of [MV05].

Proposition 2. Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T such that Θ ⊆ Ω. If the edge-finding rule applied to task i with the pair (Ω, Θ) allows the earliest start time of i to be updated, then there exist two tasks k and jmax1 such that:
1. dk = dΩ ∧ rjmax1 ≤ ri;
2. for every j ∈ T, if rj ≤ ri then
   eΩjmax1,k + C · rjmax1 ≥ eΩj,k + C · rj    (2)
3. the pair (Ωjmax1,k, Θ) allows the earliest start time of i to be updated.

Proof. See [KF09b].

Definition 4 (Pair of interest). Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T such that Θ ⊆ Ω. A pair (Ω, Θ) is a pair of interest to task i if and only if

α(Ω, i) ∧ di > dΩ ∧ rest(Θ, ci) > 0 ∧ rΘ + rest(Θ, ci)/ci > ri.

Definition 5. Let i, k be two tasks of an E-feasible CuSP. jmax is a task such that ri < rjmax and for every task j with ri < rj,

eΩj,k / (dk − rj) ≤ eΩjmax,k / (dk − rjmax).

Proposition 3. Let i be a task of an E-feasible CuSP and Ω, Θ be two subsets of T such that Θ ⊆ Ω. If the edge-finding rule applied to task i with the pair (Ω, Θ) allows the earliest start time of i to be updated, then
1. if ri < rΘ, there exists a task jmax such that ri < rjmax and the pair (Ω, Ωjmax,k) is a pair of interest to task i, where dΘ = dk;
2. if rΘ ≤ ri, the pair (Ω, Ω) or (Θ, Θ) is a pair of interest to task i.

Proof. See [KF09b].

According to Proposition 3, after detecting all edge-finding conditions, we may sometimes perform only a weak adjustment. This approach can be used iteratively, until no adjustment is found (i.e., the fixpoint is reached).

This idea was already used for the disjunctive and cumulative Not-First/Not-Last rules [TL00,Vi04,KF09a]. We have the following precomputation phase.

Require: X is an array of tasks sorted by non-decreasing release dates;
Require: Y is an array of tasks sorted by non-decreasing due dates;
Ensure: g(x) are computed for all 1 ≤ x ≤ 3n.
1: for x := 1 to 3n do
2:   g(x) := −∞;
3: end for
4: for y := 1 to n do
5:   W := 0, maxW := 0, maxEst := −∞;
6:   for x := n down to 1 do
7:     if dX(x) ≤ dY(y) then
8:       W := W + eX(x);
9:       if W / (dY(y) − rX(x)) > maxW / (dY(y) − maxEst) then
10:        maxW := W, maxEst := rX(x);
11:      end if
12:    else
13:      restW := maxW − (C − cX(x))(dY(y) − maxEst);
14:      a := if (restW > 0) then maxEst + ⌈restW/cX(x)⌉ else −∞;
15:      if a > max(g(x), rX(x)) then
16:        g(x) := max(g(x), a), g(n + x) := maxEst, g(2n + x) := dY(y);
17:      end if
18:    end if
19:    W(x) := W;
20:  end for
21:  minSL := −∞, maxEst1 := dY(y);
22:  for x := 1 to n do
23:    if W(x) − C(dY(y) − rX(x)) > minSL then
24:      maxEst1 := rX(x), minSL := W(x) − C(dY(y) − maxEst1);
25:    end if
26:    if dX(x) > dY(y) then
27:      restW := minSL + cX(x)(dY(y) − maxEst1);
28:      b := if (maxEst1 ≤ dY(y)) ∧ (restW > 0) then maxEst1 + ⌈restW/cX(x)⌉ else −∞;
29:      if b > max(g(x), rX(x)) then
30:        g(x) := max(g(x), b), g(n + x) := maxEst1, g(2n + x) := dY(y);
31:      end if
32:    end if
33:  end for
34: end for

Algorithm 2: PreComp: Edge-finding precomputation algorithm in O(n²) time and O(n) space.

Theorem 1. Algorithm 2 iteratively computes the innermost maximization of the edge-finding specification and identifies the characteristics of the corresponding task intervals. It runs in O(n²) time and uses O(n) space.

Proof. Correctness is a direct consequence of Proposition 3. The preprocessing time for sorting the arrays X and Y is O(n log(n)). The first loop (lines 1–3) runs in O(n). The second loop (lines 4–34) contains two inner loops (lines 6–20 and 22–33), each running in O(n); over the n iterations of the outer loop they contribute O(n²). Therefore the overall complexity of Algorithm 2 is O(n²).

In Algorithm 2,
• the first main loop (lines 1–3) initializes the array g;
• the second main loop (lines 4–34) iterates over all due dates of tasks in non-decreasing order;
• the first inner loop (lines 6–20) iterates over all release dates in non-increasing order and identifies the task jmax of Definition 5; lines 15–17 identify and update the characteristics of the potential task interval that can be used;
• the second inner loop (lines 22–33) iterates over all release dates in non-decreasing order and identifies the task jmax1 of Proposition 2; lines 29–31 identify and update the characteristics of the potential task interval that can be used.


Dominion – A constraint solver generator
Lars Kotthoff (student), supervised by Ian Miguel and Ian Gent
{larsko,ianm,ipg}@cs.st-andrews.ac.uk
University of St Andrews

Abstract This paper proposes a design for a system to generate constraint solvers that are specialised for specific problem models. It describes the design in detail and gives preliminary experimental results showing the feasibility and effectiveness of the approach.

1

Introduction

Currently, applying constraint technology to a large, complex problem requires significant manual tuning by an expert. Such experts are rare. The central aim of this project is to improve the scalability of constraint technology, while simultaneously removing its reliance on manual tuning by an expert. We propose a novel, elegant means to achieve this – a constraint solver synthesiser, which generates a constraint solver specialised to a given problem. Constraints research has mostly focused on the incremental improvement of general-purpose solvers so far. The closest point of comparison is currently the G12 project [1], which aims to combine existing general constraint solvers and solvers from related fields into a hybrid. There are previous efforts at generating specialised constraint solvers in the literature, e.g. [2]; we aim to use state-of-the-art constraint solver technology employing a broad range of different techniques. Synthesising a constraint solver has two key benefits. First, it will enable a fine-grained optimisation not possible for a general solver, allowing the solving of much larger, more difficult problems. Second, it will open up many new research possibilities. There are many techniques in the literature that, although effective in a limited number of cases, are not suitable for general use. Hence, they are omitted from current general solvers and remain relatively undeveloped. Among these are for example conflict recording [3], backjumping [4], singleton arc consistency [5], and neighbourhood inverse consistency [6]. The synthesiser will select such techniques as they are appropriate for an input problem. Additionally, it can also vary basic design decisions, which can have a significant impact on performance [7]. The system we are proposing in this paper, Dominion, implements a design that is capable of achieving said goals effectively and efficiently. The design decisions we have made are based on our experience with Minion [9] and other constraint programming systems. The remainder of this paper is structured as follows. In the next section, we describe the design of Dominion and which challenges it addresses in particular. We then present the current partial implementation of the proposed system and give experimental results obtained with it. We conclude by proposing directions for future work.


[Figure 1: diagram with the components problem model, Analyser, Solver specification, Generator, Specialised solver and Solution: the problem model feeds the Analyser, which produces the Solver specification consumed by the Generator, which produces the Specialised solver, which outputs the Solution; a dotted arrow links the problem model to the Specialised solver.]

Figure 1. Components and flow of information in Dominion. The part above the dashed line is the actual Dominion system. The dotted arrow from the problem model to the specialised solver designates that either the model is encoded entirely in the solver such that no further information is required to solve the problem, or the solver requires further input such as problem parameters.

2

Design of a synthesiser for specialised constraint solvers

The design of Dominion distinguishes two main parts. The analyser analyses the problem model and produces a solver specification that describes what components the specialised solver needs to have and which algorithms and data structures to use. The generator takes the solver specification and generates a solver that conforms to it. The flow of information is illustrated in Figure 1. Both the analyser and the generator optimise the solver. While the analyser performs the high-level optimisations that depend on the structure of the problem model, the generator performs low-level optimisations which depend on the implementation of the solver. Those two parts are independent and linked by the solver specification, which is completely agnostic of the format of the problem model and the implementation of the specialised solver. There can be different front ends for both the analyser and the generator to handle problems specified in a variety of formats and specialise solvers in a number of different ways, e.g. based on existing building blocks or synthesised from scratch. 2.1

The analyser

The analyser operates on the model of a constraint problem class or instance. It determines the constraints, variables, and associated domains required to solve the problem and reasons about the algorithms and data structures the specialised solver should use. It makes high-level design decisions, such as whether to use trailing or copying for backtracking memory. It also decides what propagation algorithms to use for specific constraints and what level of consistency to enforce. The output of the analyser is a solver specification that describes all the design decisions made. It does not necessarily fix all design decisions – it may use default values – if the analyser is unable to specialise a particular part of the solver for a particular problem model. In general terms, the requirements for the solver specification are that it (a) describes a solver which is able to find solutions to the analysed problem


model and (b) describes optimisations which will make this solver perform better than a general solver. The notion of better performance includes run time as well as other resources such as memory. It is furthermore possible to optimise with respect to a particular resource; for example a solver which uses less memory at the expense of run time for embedded systems with little memory can be specified. The solver specification may include a representation of the original problem model such that a specialised solver which encodes the problem can be produced – the generated solver does not require any input when run or only values for the parameters of a problem class. It may furthermore modify the original model in a limited way; for example split variables which were defined as one type into several new types. It does not, however, optimise it like for example Tailor [8]. The analyser may read a partial solver specification along with the model of the problem to be analysed to still allow fine-tuning by human experts while not requiring it. This also allows for running the analyser incrementally, refining the solver specification based on analysis and decisions made in earlier steps. The analyser creates a constraint optimisation model of the problem of specialising a constraint solver. The decision variables are the design decisions to be made and the values in their domains are the options which are available for their implementation. The constraints encode which parts are required to solve the problem and how they interact. For example, the constraints could require the presence of an integer variable type and an equals constraint which is able to handle integer variables. A solution to this constraint problem is a solver specification that describes a solver which is able to solve the problem described in the original model. The weight attached to each solution describes the performance of the specialised solver and could be based on static measures of performance as well as dynamic ones; e.g. predefined numbers describing the performance of a specific algorithm and experimental results from probing a specific implementation. This metamodel enables the use of constraint programming techniques for generating the specialised solver and ensures that a solver specification can be created efficiently even for large metamodels. The result of running the analyser phase of the system is a solver specification which specifies a solver tailored to the analysed problem model. 2.2

The generator

The generator reads the solver specification produced by the analyser and constructs a specialised constraint solver accordingly. It may modify an existing solver, or synthesise one from scratch. The generated solver has to conform to the solver specification, but beyond that, no restrictions are imposed. In particular, the generator does not guarantee that the generated specialised solver will have better performance than a general solver, or indeed be able to solve constraint problems at all – this is encoded in the solver specification. In addition to the high-level design decisions fixed in the solver specification, the generator can perform low-level optimisations which are specific to the implementation of the specialised solver.

It could for example decide to represent domains with a data type of smaller range than the default one to save space. The scope of the generator is not limited to generating the source code which implements the specialised solver, but also includes the system to build it. The result of running the generator phase of the system is a specialised solver which conforms to the solver specification.
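To make the split between the two phases concrete, here is a deliberately simplified Python sketch (our illustration only — Dominion's actual solver specification format, component names and file names are not given in this paper and are hypothetical here): a tiny "solver specification" listing the required constraints and variable types, and a "generator" that selects only the matching source components of an assumed solver code base.

# A toy solver specification, as the analyser might emit it (format is hypothetical).
solver_spec = {
    "variable_types": ["bool", "discrete"],       # only the types the model needs
    "constraints":    ["alldiff", "table"],       # only the constraints the model needs
    "backtracking":   "copying",                  # a high-level design decision
}

# A hypothetical catalogue of components available in the full, general solver.
catalogue = {
    "variable_types": {"bool": "BoolVar.cpp", "discrete": "DiscreteVar.cpp",
                       "bound": "BoundVar.cpp"},
    "constraints":    {"alldiff": "AllDiff.cpp", "table": "Table.cpp",
                       "element": "Element.cpp", "leq": "Leq.cpp"},
}

def generate(spec, catalogue):
    """Select the source files a specialised solver would be built from."""
    files = [catalogue["variable_types"][t] for t in spec["variable_types"]]
    files += [catalogue["constraints"][c] for c in spec["constraints"]]
    files.append({"copying": "CopyMemory.cpp", "trailing": "TrailMemory.cpp"}[spec["backtracking"]])
    return sorted(files)

print(generate(solver_spec, catalogue))
# Everything not listed (e.g. Element.cpp, BoundVar.cpp) is simply left out of the build.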

3

Preliminary implementation and experimental results

We have started implementing the design proposed above in a system which operates on top of Minion [9]. The analyser reads Minion input files and writes a solver specification which describes the constraints and the variable types which are required to solve the problem. It does not currently create a metamodel of the problem. The generator modifies Minion to support only those constraints and variable types. It furthermore does some additional low-level optimisations by removing infrastructure code which is not required for the specialised solver. The current implementation of Dominion sits between the existing Tailor and Minion projects – it takes Minion problem files, which may have been generated by Tailor, as input, and generates a specialised Minion solver. The generated solver is specialised for models of problem instances from the problem class the analysed instance belongs to. The models have to be the same with respect to the constraints and variable types used. Experimental results for models from four different problem classes are shown in Figure 2. The graph only compares the CPU time Minion and the specialised solver took to solve the problem; it does not take into account the overhead of running Dominion – analysing the problem model, generating the solver, and compiling it, which was in the order of a few minutes for all of the benchmarks. The problem classes Balanced Incomplete Block Design, Golomb Ruler, nQueens, and Social Golfers were chosen because they use a range of different constraints and variable types. Hence the optimisations Dominion can perform are different for each of these problem classes. This is reflected in the experimental results by different performance improvements for different classes. Figure 2 illustrates two key points. The first point is that even a quite basic implementation of Dominion which does only a few optimisations can yield significant performance improvements over standard Minion. The second point is that the performance improvement does not only depend on the problem class, but also on the instance, even if no additional optimisations beyond the class level were performed. For both the Balanced Incomplete Block Design and the Social Golfers problem classes the largest instances yield significantly higher improvements than smaller ones. At this stage of the implementation, our aim is to show that a specialised solver can perform better than a general one. We believe that Figure 2 conclusively shows that. As the problem models become larger and take longer to solve, the improvement in terms of absolute run time difference becomes larger as well. Hence the more or less constant overhead of running Dominion is amortised for larger and more difficult problem models, which are our main focus. Generating

[Figure 2: scatter plot. x axis: CPU time [seconds] of standard Minion, on a logarithmic scale from about 1 to 10000; y axis: CPU time of the specialised solver relative to Minion, from about 0.7 to 1.0; legend: BIBD (v,k,lambda), Golomb Ruler (ticks), n-Queens (queens), Social Golfers (weeks,groups,players); data points are labelled with the instance parameters.]

Figure 2. Preliminary experimental results for models of instances of four problem classes. The x axis shows the time standard Minion took to solve the respective instance. The labels of the data points show the parameters of the problem instance, which are given in parentheses in the legend. The times were obtained using a development version of Minion which corresponds to release 0.8.1 and Dominion-generated specialised solvers based on the same version of Minion. Symbols below the solid line designate problem instances where the Dominion-generated solver was faster than Minion. The points above the line are not statistically significant; they are random noise. The dashed line designates the median for all problem instances.

a specialised solver for problem classes and instances is always going to entail a certain overhead, making the approach infeasible for small and quick-to-solve problems.

4

Conclusion and future work

We have described the design of Dominion, a solver generator, and demonstrated its feasibility by providing a preliminary implementation. We have furthermore demonstrated the feasibility and effectiveness of the general approach of generating specialised constraint solvers for problem models by running experiments


with Minion and Dominion-generated solvers and obtaining results which show significant performance improvements. These results do not take the overhead of running Dominion into account, but we are confident that for large problem models there will be an overall performance improvement despite the overhead. Based on our experiences with Dominion, we propose that the next step should be the generation of specialised variable types for the model of a problem instance. Dominion will extend Minion and create variable types of the sort “Integer domain ranging from 10 to 22”. This not only allows us to choose different representations for variables based on the domain, but also to simplify and speed up services provided by the variable, such as checking the bounds of the domain or checking whether a particular value is in the domain. The implementation of specialised variable types requires generating solvers for models of problem instances because the analysed problem model is essentially rewritten. The instance the solver was specialised for will be encoded in it and no further input will be required to solve the problem. We expect this optimisation to provide an additional improvement in performance which is more consistent across different problem classes, i.e. we expect significant improvements for all problem models and not just some. We are also planning on continuing to specify the details of Dominion and implementing it.

5

Acknowledgements

The authors thank Chris Jefferson for extensive help with the internals of Minion and the anonymous reviewers for their feedback. Lars Kotthoff is supported by a SICSA studentship.

References
1. Stuckey, P.J., de la Banda, M.J.G., Maher, M.J., Marriott, K., Slaney, J.K., Somogyi, Z., Wallace, M., Walsh, T.: The G12 project: Mapping solver independent models to efficient solutions. In: ICLP 2005. 9–13
2. Minton, S.: Automatically configuring constraint satisfaction programs: A case study. Constraints 1 (1996) 7–43
3. Katsirelos, G., Bacchus, F.: Generalized nogoods in CSPs. In: AAAI 2005. 390–396
4. Prosser, P.: Hybrid algorithms for the constraint satisfaction problem. Computational Intelligence 9(3) (1993) 268–299
5. Bessière, C., Debruyne, R.: Theoretical analysis of singleton arc consistency and its extensions. Artificial Intelligence 172(1) (2008) 29–41
6. Freuder, E.C., Elfe, C.D.: Neighborhood inverse consistency preprocessing. In: AAAI 1996. 202–208
7. Kotthoff, L.: Constraint solvers: An empirical evaluation of design decisions. CIRCA preprint (2009) http://www-circa.mcs.st-and.ac.uk/Preprints/solver-design.pdf
8. Rendl, A., Gent, I.P., Miguel, I.: Tailoring solver-independent constraint models: A case study with Essence' and Minion. In: SARA 2007. 184–199
9. Gent, I.P., Jefferson, C., Miguel, I.: MINION: A fast scalable constraint solver. In: ECAI 2006. 98–102


On learning CSP specifications Matthieu Lopez?1 and Arnaud Lallouet??2 1

2

1

Universit´e d’Orl´eans — LIFO BP6759, F-45067 Orl´eans [email protected] Universit´e de Caen-Basse Normandie — GREYC BP 5186 - 14032 Caen [email protected]

Introduction

Constraint Satisfaction Problems (CSP) concept enables to model a wide range of decision problems, from arithmetic puzzles to scheduling problems. Even if the goal of constraint programming (CP) is to provide a simple way to formulate problems, experience shows that a fair expertise is required to perform the modeling task[Fre97]. For this reason, machine learning of CSP models has been seriously studied in the literature. In this paper, we consider two objects: a model which corresponds to a CSP and a specification. Our objective is to propose a way to automatically specify a problem by machine learning with instances of different size. Model learning, also named constraints network learning, involves obtaining a set of constraints from a set of examples of the model (solutions and no-solutions of the model). A model is a tuple (V, D, C) where V is a set of variables, D a set of domains for the variables and C a set of constraints. The couple (V, D) defines the viewpoint of the model. In particular, the viewpoint (at least variables) is similar for all examples and they are all of the same size. CONACQ[BCKO06], based on version space, is an example of algorithm aiming to learn a CSP model. From solutions and no-solutions of a problem where the variables of the CSP are given, and a set of possible constraints are given, it tests, for each constraint, if the constraint is necessary to the model. A specification of a problem is a formalization more abstract than a model. Consider for example the n-queens problem. This problem requires putting n queens on a n×n chessboard such that no queen attacks another queen. There exists models for the 8-queens, 12-queens, etc, whereas a specification aims to formalize directly the n-queens problem. We say that a model represents an instance of a problem. To obtain a model from a specification, additional data are needed, like the number of variables and their domains. This paper deals with specification learning rather than model learning. To our mind, a specification is more natural for users who do not know CP because providing examples of instances already solved is an easier task than providing examples for the current instance to be solved. Let imagine that we want to obtain a school timetable for this year. In the past years, we have proceeded manually. Thus, we have already a set of solutions for past instances of this problem but they may be of different size because of the number of groups, teachers or rooms available. In this paper, we propose a language, a subset of first order logic, allowing to describe a subset of problems which could be handled in CP. Let’s take as example the graph coloring problem. Given a set S of vertices of a graph G, and a set C of colors, the problem is to find a coloration of each vertex such that when two ? ??

the student the supervisor

vertices are adjacent, they do not have the same color. We can specify the problem in the following way, where adj describes the relation of neighborhood and color defines the color of a vertex:

∀X, Y ∈ S, ∀A, B ∈ C : adj(X, Y) ∧ color(X, A) ∧ color(Y, B) → A ≠ B

∧

∀X ∈ S, ∀A, B ∈ C : color(X, A) ∧ color(X, B) → A = B
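Read operationally, these two rules are just universally quantified implications over the instance data. The following small Python sketch (ours, not part of the paper) checks them against a candidate coloring of an assumed concrete graph, which is exactly how a solution (positive example) of the specification can be recognised.

from itertools import product

S = {1, 2, 3}                            # vertices of a small assumed instance
Cols = {"r", "g"}                        # colors
adj = {(1, 2), (2, 1), (2, 3), (3, 2)}   # extension of the adj predicate
color = {(1, "r"), (2, "g"), (3, "r")}   # extension of the color predicate

def rule1():   # adjacent vertices get different colors
    return all(not ((x, y) in adj and (x, a) in color and (y, b) in color) or a != b
               for x, y in product(S, S) for a, b in product(Cols, Cols))

def rule2():   # each vertex gets at most one color
    return all(not ((x, a) in color and (x, b) in color) or a == b
               for x in S for a, b in product(Cols, Cols))

print(rule1() and rule2())               # True: this coloring is a positive example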

We will detail this language in a first part. Next, we present how to obtain a naive model from a specification written in our language. After a formalization of inductive logic programming (ILP), we show how our learning task can be expressed like a standard ILP problem. Finally, we present limits of state-of-art learning algorithms.

2

Specification language

Since a few years, the number of high-level specification languages has highly increased ([FHJ+ 08, MNR+ 08, NSB+ 07a, AA06, PF03, Hni03]). Observing these languages, we notice that their principal interest is to work with collections of variables for which the size is unknown and for which their domain is not specified. Then this abstraction is solved with a data file wherein is given the missing informations to create an instance of the problem (translation from n-queens to 8-queens). However, these languages have been written to facilitate the modeling for human users and so, their complexity is such that they can not be used for target in a learning algorithm. In our language, a specification is a conjunction of rules. These rules describe the way a constraint must be posted in an instance of the problem. The vocabulary of our language is composed of a finite set of symbols of predicates with fixed arity, a set of domain composed of constants and a set of variables (Warning: these variables and these domains are not the ones of the CSP). Predicates are split in two categories. Some are called description predicates and they are used to describe observation in our problem (e.g. in n-queens problem, to describe positions of queens). Other are named constraint predicates and represent a constraint type (e.g. : =, 6=, bef ore, . . .). An atom is an expression of the form P (t1 , . . . , tk ), where P is a k-arity predicate and t1 , . . . , tk are terms, i.e. variables or constants. We call description atom an atom whose predicate is a description predicate and in a similar way, a constraint atom an atom whose predicate is a constraint. The syntax of a rule is: rule ::= ∀ variables : body → head

variables ::= vs ∈ DOMAIN | variables, variables vs ::= VARIABLE | vs, vs

body ::= DESC ATOM | CONSTRAINT ATOM | body ∧ body

head ::= CONSTRAINT ATOM | head

2.1

From specification to model

In order to translate a specification to a constraint network, we need other informations not present in the specification. In the example of graph coloring, the specification describes a representation of any well colored graph. To color a graph, we need the set of vertices and the set of colors. We also need to know the neighborhood of the vertices. Thus, to translate a specification, we must provide different informations like the different sets corresponding to domains of variables in the

specification 72 and “extensions” of certain predicates. The problem will be to find the “extension” of other predicates. In the following, we denote by E the set of predicates whose extension is given by the user and V the set of predicates whose extension must be searched. Rather than translating the specification in a low-level language, we prefer to translate, for sake of clarity, to a model in MiniZinc[NSB+ 07b], language which can be further translated in low-level language. Let substL=[X|v,...] (A) the operator which substitutes in an atom A all occurrences of variables X in L with the associated value v. For each predicate P : K1 × . . .×Kk in V, we must create the following CSP variables: array[K1 , . . . , Kk ] of boolean : P; Next, each rule R : A1 . . . Al → H in our specification can be translated as: constraint forall(v1 in K1 )( ... forall(vi in Kk )( subst[X1 |v1 ,...,Xi |vi ] (A1 )/\ . . . /\subst[X1 |v1 ,...,Xi |vi ] (Al ) -> subst[X1 |v1 ,...,Xi |vi ] (H) )...) Example Let’s take the graph coloring example. Generated variables would be: array[S, C] of boolean : Color; and an example of constraints: constraint forall(v1 in S)(forall(v2 in S)( forall(v3 in C)(forall(v4 in C)( adj(v1 , v2 )/\Color(v1 , v3 )/\Color(v2 , v4 ) -> v3 6= v4 )) )); Remark If we have more informations about predicates, like color is a function, we can easily obtain a better model with another choice for the viewpoint (creating one variable by vertex with colors as domain).
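The translation just described is mechanical. As an illustration (ours, independent of the MiniZinc output above), the sketch below grounds one specification rule over given sets, producing one concrete implication per combination of values of the quantified variables — essentially what the generated nested forall does.

from itertools import product

S = [1, 2, 3]
Cols = ["r", "g"]

# Ground the rule: forall X,Y in S, A,B in Cols:
#   adj(X,Y) /\ color(X,A) /\ color(Y,B) -> A != B
ground_rules = [
    {"body": [("adj", x, y), ("color", x, a), ("color", y, b)], "head": ("neq", a, b)}
    for x, y in product(S, S)
    for a, b in product(Cols, Cols)
]
print(len(ground_rules))   # |S|^2 * |Cols|^2 = 36 ground implications for this instance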

3

Inductive logic programming

Inductive logic programming (ILP)[Mug95] is a field at the crossroad of logic programming and machine learning. Given sets of positive examples E + and negative examples E − of a concept C, a background knowledge B and a description language L, the goal is to find a definition H of the concept C, described with respect to L in the form of a logic program. H must cover all examples from E + and must reject examples from E − . We will define the form of H and B, and then the coverage notion and the nature of concept C in a basic kind of ILP without negation and recursion. A clause(resp. query) is a disjunction(resp. conjunction) of literals, i.e. atoms or negation of atoms. A Horn clause is a clause containing only one positive literal. Its form is: ∃X1 , X2 , . . . : H ← B1 ∧ B2 ∧ . . . ∧ Bn where Xi is a variable, the literal H is called the head, and the conjunction of literals B1 ∧ B2 ∧ . . . ∧ Bn is called the body. The hypothesis H is a disjunction of Horn clauses. In a similar way, B is a set of Horn clauses describing observations about examples and new predicates defined intentionally. We say that H covers an example e if H ∪ B |= e. The concept C corresponds to an undefined predicate. Thus the problem consists in learning a definition H for C such that examples are correctly covered.

4

Specification rule learning

In this section, we will present our learning problem as an ILP problem like previously defined. We recall a specification in our language is a conjuction of rules such

that:

73

(∀X11 , . . . : A11 ∧ . . . ∧ A1k → A1k+1 ) ∧ . . . ∧ (∀X1n , . . . : An1 ∧ . . . ∧ Anl → Anl+1 ) The rules can be written as disjunctions: (∀X11 , . . . : ¬A11 ∨ . . . ∨ ¬A1k ∨ A1k+1 ) ∧ . . . ∧ (∀X1n , . . . : ¬An1 ∨ . . . ∨ ¬Anl ∨ Anl+1 ) The first step consists in setting what concept we aim to learn. Let spec(pb), the target concept where pb is a key to an example. Previously, we have seen hypotheses and so the definition of concept is in the form of a set of clauses. But our specification is in the form of conjunction of queries. However if we search the inverse concept[Lae02], we can easily get back to the framework previously described. Indeed, inverting connectors ∨/∧ and quantifiers ∀/∃ of a definition of concept, we obtain a definition for the inverse concept. Let negSpec(pb) the negated concept. Its definition would be: ∃X11 , . . . : negSpec(pb) ← A11 ∧ . . . ∧ A1k ∧ A1k+1 ... ∃X1n , . . . : negSpec(pb) ← An1 ∧ . . . ∧ Anl ∧ Anl+1

Passing to the negated concept and thus searching hypotheses describing the inverted problem, our problem of learning specification can be defined in this way: given a set of solutions G of our problem and a set of no-solutions N G, a background knowledge B = O ∪ I, where O is a set of observations about examples (the color of vertices for example) and I a set of constraint predicates given intentionally, the problem is to find a definition H for negSpec such that H covers all examples of N G and rejects all examples of G with respect to B.

5

Difficulties to learn and perspectives

ILP problems can be viewed as search problems. A language L sets the possible hypotheses which can be built, for our problem a conjunction of observation and constraint atoms. Thus, states of our search space are possible hypotheses with respect to L. Defined in this way, our search space is clearly infinite. To limit the search space, we can set, for example, the number of atoms or variables in hypotheses. Furthermore the search space can be structured with a partial order. Because of lack of space, we do not detail this part, but the lector can read [LD93]. The idea is to structure the space in a lattice where hypotheses are ordered by a relation of generality. A majority of state-of-the-art algorithms are based on this lattice. To move in this lattice, we need two operators, one to generalize a hypothesis and another to specialize a hypothesis. These operators enable to travel in the search space, either in a complete way, enumerating all possible hypotheses, or with incomplete methods using search strategies like hill-climbing or beam search. To our problem, the complexity of coverage test make prohibitive the use of complete methods. Thus, we have been interested in incomplete methods. Two principal ways have been explored in rule learning. The first one consists in starting with the rule always true (>) and to specialize it step by step, i.e. to add new atoms. It is the reason why a specialization operator is required to choose the best atom to add to our hypothesis. To define it, we need a criterion, generally based on the coverage of examples by the new hypothesis. But for our rule, we did not manage to find such a criterion. Indeed, if we observe our rules, for example coloring graph problem, adding col(X, C) or any other atom to the rule adj(X, Y ) do not change

the coverage 74 of the rule which covers all positive and negative examples. Indeed, we observe that for all studied specifications, the suppression of one literal in rules of specification makes unusable the coverage criterion. The second method consists in starting from the other extremity of the lattice, emphi.e. the most specific hypothesis. [Mug95] proposes to set the most specific clause from an example extended with the background knowledge and where all constants are replaced by variables. This extension is configured by a depth level, limiting the creation of new variables. If this clause is correctly extended, it rejects all negative examples, and covers at least the positive example which has been used to build it. With the generalization operator, we can search a hypothesis covering more positive examples. The principal flaw of this method is to use the coverage test with hypotheses of large size and thus is unusable in practice. Indeed, in certain problems, we have clauses of more than 4000 atoms. We are now building an alternative method aiming to avoid extremities of the lattice which are unusable. The idea is to find, with low cost, an hypothesis located in an area of the lattice where we can move in a relevant way, i.e. traveling across the different states of our search space in order to launch a local search.

Bibliography


[AA06] H. Ahriz and I. Arana. Specifying constraint problems with z. Technical report, The Robert Gordon University, 2006. [BCKO06] Christian Bessi`ere, Remi Coletta, Fr´ed´eric Koriche, and Barry O’Sullivan. Acquiring constraint networks using a sat-based version space algorithm. In AAAI, 2006. [FHJ+ 08] Alan Frisch, Warwick Harvey, Chris Jefferson, Bernadette Mart´ınezHern´andez, and Ian Miguel. Essence : A constraint language for specifying combinatorial problems. Constraints, 13(3):268–306, 2008. [Fre97] Eugene C. Freuder. In pursuit of the holy grail. Constraints, 2(1):57–61, 1997. [Hni03] B. Hnich. Thesis: Function variables for constraint programming. AI Commun., 16(2):131–132, 2003. [Lae02] Wim Van Laer. From Propositional to First Order Logic in Machine Learning and Data Mining. PhD thesis, Katholieke Universiteit Leuven, June 2002. [LD93] Nada Lavrac and Saso Dzeroski. Inductive Logic Programming: Techniques and Applications. Routledge, New York, NY, 10001, 1993. [MNR+ 08] Kim Marriott, Nicholas Nethercote, Reza Rafeh, Peter J. Stuckey, Maria Garcia de la Banda, and Mark Wallace. The design of the zinc modelling language. Constraints, 13(3):229–267, 2008. [Mug95] S. Muggleton. Inverse entailment and progol. New Generation Computing, Special issue on Inductive Logic Programming, 13(3-4):245–286, 1995. [NSB+ 07a] N. Nethercote, P. J. Stuckey, R. Becket, S. Brand, G. J. Duck, and G. Tack. Minizinc: Towards a standard cp modelling language. In Christian Bessi`ere, editor, Thirteenth International Conference on Principles and Practice of Constraint Programming, Lecture Notes in Computer Science, Providence, RI, USA, sep 2007. Springer-Verlag. [NSB+ 07b] Nicholas Nethercote, Peter J. Stuckey, Ralph Becket, Sebastian Brand, Gregory J. Duck, and Guido Tack. Minizinc: Towards a standard cp modelling language. In CP, pages 529–543, 2007. [PF03] M. ˚ A gren P. Flener, J. Pearson. The syntax, semantics, and type system of esra. Research report, ASTRA, April 2003.


Propagating equalities and disequalities

Neil C.A. Moore, supervised by Ian Gent and Ian Miguel
School of Computer Science, University of St Andrews, Scotland

1 Introduction

This paper describes an idea to improve propagation in a constraint solver. The most common frameworks for propagation are based on AC5 [5], where propagators can accept bound-changed, assignment and value-removal events. The idea is to extend this to allow constraints to generate and process equality and disequality events as well: e.g., whenever r = true, the constraint r → x = y can generate an equality event meaning that from now on in any solution x = y. Every constraint in the minion solver [3] can produce such events and benefit in some circumstances from receiving them. In this paper I will give an example where the idea works successfully, describe the additional propagation that various propagators in minion can achieve, and describe implementation issues.

2 Example

Consider the following CSP: variables x1, . . . , xn, y1, . . . , yn each with domain {1, . . . , n} and constraints x1 = y1, xn ≠ yn and ∀i, xi = yi ⇔ xi+1 = yi+1.

Chris Jefferson has proved [6] that with any static variable ordering, it takes time exponential in n for backtracking search and propagation to prove that no solution exists. However, by propagating equalities and disequalities the following will occur:

1. x1 = y1 will generate the event x1 = y1
2. x1 = y1 ⇔ x2 = y2 will receive the event and produce x2 = y2
3. . . .
4. xn−1 = yn−1 ⇔ xn = yn will receive xn−1 = yn−1 and produce xn = yn
5. the constraint xn ≠ yn will receive the event xn = yn and thus fail

As I show in Section 4 this can be implemented in polynomial time, and hence a significant speedup is achievable on this type of instance. I have implemented this idea and achieved exponential speedups as expected. I hope to be able to present practical results on this and other more realistic instances during my doctoral programme talk.
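The chain of events above can be simulated in a few lines. The following is a minimal sketch in Python, not Minion code; the representation of constraints and events is an assumption made only for this illustration.

from collections import deque

def propagate_chain(n):
    # Simulate the event chain: x1 = y1 seeds an equality event; each
    # constraint xi = yi <=> x(i+1) = y(i+1) forwards it; xn != yn fails.
    queue = deque()
    equal = {1: True}          # position i -> True means the event xi = yi is known
    queue.append(1)
    while queue:
        i = queue.popleft()
        if i < n and (i + 1) not in equal:
            equal[i + 1] = equal[i]       # the biconditional forwards the event
            queue.append(i + 1)
        if i == n and equal[i]:
            return True                   # constraint xn != yn receives xn = yn: fail
    return False

print(propagate_chain(10))                # True, after O(n) events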

3 The technique

I propose to allow propagators to generate and receive events of the form x = y and x ≠ y, where x and y are variables. Propagators are not able to detect such events just by inspecting the solver state, because they may be true without being entailed by the store. For example, in a CSP containing x = y with dom(x) = dom(y) = {1, 2, 3}, the propagator can remove no values: x = y is true in any solution but not entailed by the constraint store. Conversely, the AC5-style events can always be detected by propagators inspecting the domain state. As a first example, I exhibit a rule for the constraint abs (defined x = |y|):

Theorem 1. In constraint x = |y|, if x = y then we can infer that y ≥ 0.

Proof. y = x = |y| =⇒ y = |y| =⇒ y ≥ 0.

The standard propagator for x = |y| will prune so that x ≥ 0 anyway, and this fact combined with the event x = y itself subsumes y ≥ 0. However I will shortly show that the contrapositive of the rule is useful. As it happens, the converse of the rule is also true:

Theorem 2. In constraint x = |y|, if x ≠ y then we can infer that y < 0.

Proof. |y| = x ≠ y =⇒ |y| ≠ y =⇒ y < 0.
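As an illustration, the pruning licensed by Theorems 1 and 2 can be written as a small sketch over explicit domains. This is plain Python, not Minion code, and the event encoding is an assumption made only for this example.

def abs_on_event(event, dom_y):
    # event is 'x==y' or 'x!=y'; returns the pruned domain of y for x = |y|.
    if event == 'x==y':                   # Theorem 1: infer y >= 0
        return {v for v in dom_y if v >= 0}
    if event == 'x!=y':                   # Theorem 2: infer y < 0
        return {v for v in dom_y if v < 0}
    return dom_y

print(abs_on_event('x==y', {-2, -1, 0, 3}))   # {0, 3}
print(abs_on_event('x!=y', {-2, -1, 0, 3}))   # {-2, -1}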

These theorems show that abs can benefit from receiving (dis-)equality events, allowing it to prune all negative (respectively, positive) values from the domain of y. Can it also produce such events? I now exhibit a simple meta-theorem to show that any constraint that can use (dis-)equalities can also produce them for other constraints to use:

Theorem 3. In constraint C, if event allows us to infer condition, then ¬condition allows us to infer ¬event.

Proof. Omitted.

For example, a rule to produce events from Theorem 1 can be obtained:

Corollary 1. In constraint x = |y|, if y < 0 then we can infer x ≠ y.

In this case the assignment now has x3 > y3 and x4 > y4. The propagator maintains these values α and β as domains are narrowed. If α > β the constraint fails. If α + 1 = β the propagator enforces xα < yα. If α + 1 < β then the propagator enforces xα ≤ yα. It is quite easy to see how to incorporate (dis-)equality information into the algorithm: if we find out that xα = yα then α can be incremented. Once α + 1 = β, produce the event xα ≠ yα.
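The update just described can be sketched as follows. This is a toy model of the α/β bookkeeping under the assumptions stated above, not the Minion lex propagator itself.

def lex_update(alpha, beta, known_equal):
    # known_equal maps an index i to True once the event x_i = y_i is received.
    while known_equal.get(alpha, False):
        alpha += 1                        # receiving x_alpha = y_alpha advances alpha
    produced = alpha if alpha + 1 == beta else None
    return alpha, produced                # produced: index whose x_i != y_i event to emit

print(lex_update(2, 5, {2: True, 3: True}))   # (4, 4): emit the event x4 != y4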

3.3 Table constraint

Roughly speaking, the table constraint propagator in minion works by finding a valid supporting tuple for each variable/value pair. To take account of (dis-)equality information the criterion for validity is changed. Before, it was that each variable/value pair in the tuple is in its respective domain. After, tuples must also conform to any and all known (dis-)equalities. For example, the tuple (x = 1, y = 1, z = 2) is disallowed if we know x ≠ y and/or if we know y = z. Conversely, if all tuples whose components are in their respective domains are such that a particular pair of components are always equal or unequal, then the corresponding event can be generated.
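The modified validity test amounts to a straightforward filter over tuples. Below is a minimal sketch; the explicit domains and (dis-)equality sets are plain Python data chosen only for illustration.

def tuple_valid(tup, domains, equalities, disequalities):
    # tup: dict variable -> value; (dis)equalities: sets of variable pairs.
    if any(tup[v] not in domains[v] for v in tup):
        return False
    if any(tup[a] != tup[b] for (a, b) in equalities):
        return False
    if any(tup[a] == tup[b] for (a, b) in disequalities):
        return False
    return True

doms = {'x': {1, 2}, 'y': {1, 2}, 'z': {2}}
print(tuple_valid({'x': 1, 'y': 1, 'z': 2}, doms, set(), {('x', 'y')}))   # False: x != y known
print(tuple_valid({'x': 1, 'y': 1, 'z': 2}, doms, {('y', 'z')}, set()))   # False: y = z known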

4 Implementation

(Dis-)equalities can be handled in a similar way to other propagation events like v ← 1 or v ↚ 3. Note that there are a quadratic number of different (dis-)equality events possible, the same as for (dis-)assignment and bound-changed events. Constraints should have the following facilities available to them:

– Set up static triggers on chosen (dis-)equality events (if the (dis-)equality is already true, it should trigger immediately after being set up).
– Generate (dis-)equality events for chosen pairs of variables.
– Check if an equality is true, false or unknown.

(Dis-)equalities should be removed after backtrack. If x = y and x ≠ y are both generated the solver should fail and backtrack. If x = y and y = z are generated, the solver should ensure x = z is generated.

(Dis-)equalities need only be generated explicitly by propagators and when they are not obvious from the variable domains. For example, even if x = 1 and y = 1, the event x = y may not result. This property is not essential, and it would be a good idea to try generating all events. The advantages are as follows:

– Events that are obvious from the variable domains should be picked up in the normal course of constraint propagation. It is better for (dis-)equality propagation to be orthogonal to normal propagation so it may be easily switched off.
– It avoids introducing vast numbers of events for problems with small domains where equalities are likely, for example in a boolean problem. Also, no matter the domains, as more variables are assigned more spurious equalities and disequalities are produced.

The disadvantage is that this will lead to code duplication: a propagator might like to use disequality as a trigger, so that it will only propagate when it receives such an event. However using this framework it cannot assume that it will get the (dis-)equality trigger and hence it must place other triggers. It would be sensible at some point to evaluate whether generating all possible (dis-)equalities is advantageous.

Implied events. If events x = y and y = z are both produced, then x = z is implied. The solver should ensure that these implied events are created, because the variable y may not be known to a propagator with a trigger on x = z. Implied disequality events also exist. Standard algorithms using the union-find data structure can be exploited to solve these problems, see [7].
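A minimal sketch of such a union-find-based store follows. Backtracking and the re-canonicalisation of recorded disequalities after merges are omitted, and the class name and layout are assumptions made for illustration.

class EqualityStore:
    def __init__(self):
        self.parent = {}                  # union-find forest over variables
        self.diseq = set()                # recorded disequalities (pairs of representatives)

    def find(self, v):
        self.parent.setdefault(v, v)
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]   # path halving
            v = self.parent[v]
        return v

    def add_equal(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if (rx, ry) in self.diseq or (ry, rx) in self.diseq:
            raise ValueError('x = y and x != y both generated: fail and backtrack')
        self.parent[rx] = ry              # merging the classes makes implied equalities explicit

    def add_disequal(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            raise ValueError('x != y contradicts a known equality: fail and backtrack')
        self.diseq.add((rx, ry))

s = EqualityStore()
s.add_equal('x', 'y')
s.add_equal('y', 'z')
print(s.find('x') == s.find('z'))         # True: the implied event x = z is available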

When to enable. Potentially, enabling (dis-)equality propagation could be damaging to propagation speed, if the dynamic characteristics of the problem are not suitable. By using a similar technique to [9] the solver can detect cases when (dis-)equality propagation will be unsuccessful. Details are omitted due to space considerations.

5 Previous work

This idea is so simple that it is inevitable that similar work will exist. The aim of the work is to be simple and effective; hence I do not seek to compete with more general ideas. Furthermore, I have tailored the idea to work in a propagation solver, so it is not coincident with similar ideas in other types of automated search and reasoning. I would be interested to hear of any other related work. Hägglund [4] has worked on allowing arbitrary constraints to be put in the store. This clearly generalises (dis-)equality propagation. My work is on a special case of this idea chosen to be efficient and more common in practice. Constraint handling rule (CHR) solvers [2] can easily incorporate (dis-)equalities into their rules. However, CHR rules are not suitable for replacing constraint propagators for reasons of efficiency. Some satisfiability modulo theories (SMT) solvers use the theory of equalities with uninterpreted functions [7]. Although (dis-)equality propagation applies to this theory and is exploited, the theory is limited in its expressivity compared to what's available in a CSP solver. Some CSP solvers are able to unify variables, meaning that once they are detected to be equal they become the same variable; Eclipse is an example of such a solver. This allows equality propagation to be implemented, but does not include disequalities. Theorem 3 shows that both are necessary to achieve full propagation.

References

1. Alan M. Frisch, Brahim Hnich, Zeynep Kiziltan, Ian Miguel, and Toby Walsh. Propagation algorithms for lexicographic ordering constraints. Artificial Intelligence, 170(10):803–834, 2006.
2. Thom Frühwirth. Theory and practice of Constraint Handling Rules. Journal of Logic Programming, Special Issue on Constraint Logic Programming, 37(1-3):95–138, 1998.
3. Ian P. Gent, Christopher Jefferson, and Ian Miguel. Minion: A fast scalable constraint solver. In ECAI, pages 98–102, 2006.
4. Bjorn Hagglund. A framework for designing constraint stores. Master's thesis, Linkoping University, 2007.
5. Pascal Van Hentenryck, Yves Deville, and Choh-Man Teng. A generic arc-consistency algorithm and its specializations. Artificial Intelligence, 57(2-3):291–321, 1992.
6. Chris Jefferson, June 2009. Personal correspondence.
7. R. Nieuwenhuis and A. Oliveras. Decision Procedures for SAT, SAT Modulo Theories and Beyond. The BarcelogicTools. (Invited Paper). In G. Sutcliffe and A. Voronkov, editors, 12th International Conference on Logic for Programming, Artificial Intelligence and Reasoning, LPAR'05, volume 3835 of Lecture Notes in Computer Science, pages 23–46. Springer, 2005.
8. Jean-Charles Régin. Generalized arc consistency for global cardinality constraint. In AAAI/IAAI, Vol. 1, pages 209–215, 1996.
9. Christian Schulte and Peter J. Stuckey. Dynamic analysis of bounds versus domain propagation. In Maria Garcia de la Banda and Enrico Pontelli, editors, Twenty Fourth International Conference on Logic Programming, volume 5366 of Lecture Notes in Computer Science, pages 332–346, Udine, Italy, December 2008. Springer-Verlag.


Tractable Benchmarks

Student: Justyna Petke, Supervisor: Peter Jeavons
Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford, UK
e-mail: [email protected]

Abstract. The general constraint satisfaction problem for variables with finite domains is known to be NP-complete, but many different conditions have been identified which are sufficient to ensure that classes of instances satisfying those conditions are tractable, that is, solvable in polynomial time. Results about tractability have generally been presented in theoretical terms, with little discussion of how these results impact on practical constraint-solving techniques. In this paper we investigate the performance of several standard constraint solvers on benchmark instances that are designed to satisfy various different conditions that ensure tractability. We show that in certain cases some existing solvers are able to automatically take advantage of the problem features which ensure tractability, and hence solve the corresponding instances very efficiently. However, we also show that in many cases the existing pre-processing techniques and solvers are unable to solve efficiently the families of instances of tractable problems that we generate. We therefore suggest that such families of instances may provide useful benchmarks for improving pre-processing and solving techniques. They may also provide good candidates for global constraints that can take advantage of efficient special-purpose algorithms.

1 Introduction

Software tools for solving finite-domain constraint problems are now freely available from several groups around the world. Examples include the Gecode system [2], the G12 [1], and the Minion constraint solvers [10]. One way to drive performance improvements in constraint solvers is to develop challenging benchmark instances. This approach can also help to drive improvements in the robustness and flexibility of constraint-solving software. One obvious source of benchmark instances is from practical applications such as scheduling and manufacturing process organisation. Another source is combinatorial problems such as puzzles and games. The G12 MiniZinc suite includes several examples. In this paper we suggest another important source of useful benchmarks which has not yet been systematically explored: the theoretical study of constraint satisfaction. From the very beginning of the study of constraint programming there has been a strand of research which has focused on identifying features of constraint problems which make them tractable to solve [6, 8] and


this research has gathered pace recently with the discovery of some deep connections between constraint problems and algebra [3, 4], logic [7], and graph and hypergraph theory [5, 11]. This research has focused on two main ways in which imposing restrictions on a constraint problem can ensure that it can be tractably solved. The first of these is to restrict the forms of constraint which are allowed; these are sometimes known as constraint language restrictions. The second standard approach to identifying restrictions on constraint problems which ensure tractability has been to consider restrictions on the way in which the constraints overlap; these are sometimes referred to as structural restrictions. In this paper we begin the process of translating from theoretical results in the literature to concrete families of instances of constraint problems. We obtain several families which are known to be efficiently solvable by simple algorithms, but which cause great difficulties for some existing constraint solvers. We argue that such families of instances provide a useful addition to benchmark suites derived from other sources, and can provide a fruitful challenge to the developers of general-purpose solvers.

2 Definitions

In the theoretical literature the (finite-domain) constraint satisfaction problem (CSP) is typically defined as follows:

Definition 2.1. An instance of the constraint satisfaction problem is specified by a triple (V, D, C), where
– V is a finite set of variables;
– D is a finite set of values (this set is called the domain);
– C is a finite set of constraints. Each constraint in C is a pair (Ri, Si) where
  • Si is an ordered list of ki variables, called the constraint scope;
  • Ri is a relation over D of arity ki, called the constraint relation.

Two proposed standard higher-level languages for specifying constraint problems in practice are Zinc [14] and Essence [9]. However, both of these languages are considered too abstract and too general to be used directly as the input language for current constraint solvers, so they both have more restricted subsets which are more suitable for solver input: MiniZinc and Essence′. There exists a software translation tool, called Tailor [15], which converts from Essence′ specifications to the input language for the Minion solver (or Gecode). Another software translation tool, distributed with the G12/MiniZinc software [1], converts from MiniZinc to a more restricted language known as FlatZinc, which serves as the input language for the G12 solver; FlatZinc input is also accepted by Gecode. The FznTini solver, developed by Huang, transforms a FlatZinc file into DIMACS CNF format and then uses a Boolean Satisfiability (SAT) solver, called TiniSAT, to solve the resulting SAT problem [12].

Definition 2.2. A class of CSP instances will be called tractable if there exists an algorithm which finds a solution to all instances in that class, or reports that there are no solutions, whose time complexity is polynomial in the size of the instance specification.


3 Max-closed constraints

One of the first non-trivial classes of tractable constraint types described in the literature is the class of max-closed constraints introduced in [13].

Definition 3.1 ([13]). A constraint (R, S) with relation R of arity r over an ordered domain D is said to be max-closed if for all tuples (d1, . . . , dr), (d′1, . . . , d′r) ∈ R we have (max(d1, d′1), . . . , max(dr, d′r)) ∈ R.
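For intuition, max-closure of an explicitly listed relation can be checked by brute force. The following sketch is illustrative only; the benchmark constraints are given as inequalities rather than explicit relations.

from itertools import product

def is_max_closed(relation):
    return all(tuple(map(max, t1, t2)) in relation
               for t1, t2 in product(relation, repeat=2))

print(is_max_closed({(1, 2), (2, 1), (2, 2)}))   # True: closed under componentwise max
print(is_max_closed({(1, 2), (2, 1)}))           # False: (2, 2) is missing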

In particular, one useful form of max-closed constraint is an inequality of the form a1 X1 + a2 X2 + · · · + ar−1 Xr−1 ≥ ar Xr + c, where the Xi are variables, c is a constant, and the ai's are non-negative constants [13]. To generate solvable max-closed CSP instances, we selected a random assignment to all of the variables, and then generated random inequalities of the form above, keeping only those that were satisfied by this fixed assignment. This ensured that the system of inequalities had at least one solution. To generate unsolvable max-closed CSP instances, we generated the inequalities without imposing this condition; if the resulting set was solvable, another set was generated. As with all of the results presented in this paper, we solved the generated problem instances using Gecode (version 1.3), G12 (version 0.8.1), FznTini, and Minion (version 0.8RC1), and the times given are elapsed times on a Lenovo 3000 N200 laptop with an Intel Core 2 Duo processor running at 1.66GHz and 2GB of RAM. These timings exclude the time required to translate the input from MiniZinc to FlatZinc (for input to Gecode, G12 and FznTini) or from Essence′ to Minion input format. (In the special case of FznTini, times include the additional time required to translate from FlatZinc to DIMACS CNF format.) Average times over three runs with different generated instances were taken, but the variability was found in all cases to be quite small. Predictably, FznTini performs very poorly on these inequalities, which it has to translate into (large) sets of clauses. Standard CSP solvers should do well on these instances, because the efficient algorithm for solving max-closed instances is based on achieving arc-consistency, and all standard constraint solvers do this by default. Our results confirm that the standard CSP solvers do indeed all perform well on these instances, although the G12 solver was noticeably less efficient than the other two.
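The generation scheme for solvable instances can be sketched as follows. Parameter values and the data layout are illustrative assumptions, not the generator actually used in the experiments.

import random

def generate_solvable(n_vars=10, dom=10, n_constraints=30, arity=3):
    # Pick a witness assignment, then keep only inequalities it satisfies,
    # so the resulting max-closed instance is guaranteed to be solvable.
    witness = [random.randrange(dom) for _ in range(n_vars)]
    constraints = []
    while len(constraints) < n_constraints:
        xs = random.sample(range(n_vars), arity)        # X1..X(r-1) and Xr
        coeffs = [random.randint(0, 3) for _ in xs]     # non-negative coefficients
        c = random.randint(-5, 5)
        lhs = sum(a * witness[x] for a, x in zip(coeffs[:-1], xs[:-1]))
        rhs = coeffs[-1] * witness[xs[-1]] + c
        if lhs >= rhs:                                  # keep only satisfied inequalities
            constraints.append((xs, coeffs, c))
    return witness, constraints

witness, cons = generate_solvable()
print(len(cons), "max-closed inequalities satisfied by", witness)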

4 0/1/all constraints

Our second example of a language-based restriction ensuring tractability involves the 0/1/all constraints introduced and shown to be tractable in [6].

Definition 4.1 ([6]). Let x1 and x2 be variables. Let A be a subset of possible values for x1 and B be a subset of possible values for x2.
– A complete binary constraint is a constraint R(x1, x2) of the form A × B.
– A permutation constraint is a constraint R(x1, x2) which is equal to {(x, π(x)) | x ∈ A} for some bijection π : A → B.
– A two-fan constraint is a constraint R(x1, x2) where there exists v ∈ A and w ∈ B with R(x1, x2) = ({v} × B) ∪ (A × {w}).


A 0/1/all constraint is either a complete constraint, a permutation constraint, or a two-fan constraint. What is particularly interesting about this form of constraint, for our purposes, is that the efficient algorithm for 0/1/all constraints is based on achieving pathconsistency [6], which is not implemented in standard constraint solvers. We wrote a generator for satisfiable CSP instances with 0/1/all constraints of various kinds on n variables. To ensure satisfiability we first generate a random assignment and then add only those 0/1/all constraints that satisfy the initial assignment. We generated instances for various choices of the parameter n and domain size m, and solved these using Gecode, G12, FznTini, and Minion. All the solvers performed very well, especially Gecode. We also generated unsatisfiable instances with 0/1/all constraints on just a small number of variables, leaving all other variables unconstrained. FznTini’s results were inconclusive as the instances were unsatisfiable [12]. The G12 solver reported ‘no solutions’ in 0.2 seconds, but Minion and Gecode could not solve this problem within 15 min. When the domain size was decreased to two, none of the solvers could verify that this simple unsatisfiable instance had no solutions within 15 minutes. The problem seems to be the fixed default variable ordering. None of the standard solvers focus the search on the few variables that are restricted; having no constraint between two variables is treated in the same way as having a complete constraint. Once the unsatisfiable instances were embedded in satisfiable instances, the performance of all the solvers was as good as before. These results suggest that standard CSP solvers can handle random collections of 0/1/all constraints very effectively, even without specialised algorithms. However, they appear to be poor at focusing search on more highly constrained regions, which is thought to be one of the strengths of the current generation of SAT-solvers. This suggests an obvious target for improvement in adapting the variable ordering to the specific features of the input instance.
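For concreteness, the three forms in Definition 4.1 can be recognised by brute force on an explicitly listed binary relation. The following sketch is illustrative and is not the generator used in the experiments.

def is_complete(R, A, B):
    return R == {(a, b) for a in A for b in B}

def is_permutation(R, A, B):
    firsts = [a for a, _ in R]
    seconds = [b for _, b in R]
    return (len(R) == len(A) == len(B)
            and set(firsts) == set(A) and set(seconds) == set(B)
            and len(set(firsts)) == len(R) and len(set(seconds)) == len(R))

def is_two_fan(R, A, B):
    return any(R == {(v, b) for b in B} | {(a, w) for a in A}
               for v in A for w in B)

A, B = {1, 2, 3}, {1, 2, 3}
print(is_permutation({(1, 2), (2, 3), (3, 1)}, A, B))              # True
print(is_two_fan({(1, 1), (1, 2), (1, 3), (2, 1), (3, 1)}, A, B))  # True (v = 1, w = 1)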

5 Bounded-width structures

For our final example, we consider classes of CSP instances which are tractable because of the way that the constraint scopes are chosen. In other words, we consider structural restrictions. For any CSP instance, the scopes of all the constraints can be viewed as the hyperedges of an associated hypergraph whose vertices are the variables. This hypergraph is called the structure of the CSP instance. One very simple condition which is sufficient to ensure tractability is to require the structure to have a tree decomposition [11], with some fixed bound on the maximum number of vertices in any node of the tree. Such structures are said to have bounded width. However, the efficient algorithm for CSP instances with bounded width structures is based on choosing an appropriate variable ordering, and imposing a level of consistency proportional to the width [7]. None of the standard CSP solvers incorporate such algorithms, so it is not at all evident whether they can solve bounded width instances efficiently. 4


To investigate this question we wrote a generator for a family of specific CSP instances with a very simple bounded-width structure. The instances we generate are specified by two parameters, w and m. They have (mw + 1) ∗ w variables arranged in groups of size w, each with domain {0, ..., m}. We impose a constraint of arity 2w on each pair of successive groups, requiring that the sum of the values assigned to the first of these two groups should be larger than the sum of the values assigned to the second. This ensures that a solution exists and satisfies the following conditions: the difference between the sum of values assigned to each successive group is 1, and the sum of the values assigned to the last group is zero. It turns out that the runtimes of Gecode and Minion grow rapidly with problem size. The runtimes for the G12 solver do not increase so fast for these specific instances, but if we reverse the inequalities, then they do increase in the same way, although in this case Gecode and Minion perform much better. Somewhat surprisingly, FznTini seems to be able to solve all of these instances fairly efficiently, even though they contain arithmetic inequalities which have to be translated into fairly large sets of clauses. It is concluded that an important opportunity to improve the performance of CSP solvers would be in finding an efficient way of taking advantage of instance structure by adapting the variable ordering or other aspects of the search process to the particular instance. Moreover, as the ordering can be set in the input file, the question arises as to whether those adjustments could be automatically identified by the translators as part of the pre-processing.
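The family just described can be written down schematically as follows. The output format is a plain Python structure chosen only for illustration, not the MiniZinc/Essence′ models used in the experiments.

def bounded_width_instance(w, m):
    # (m*w + 1) groups of w variables, domain {0,...,m}; each pair of
    # successive groups is constrained so the first group's sum exceeds the second's.
    n_groups = m * w + 1
    variables = [[f"x_{g}_{i}" for i in range(w)] for g in range(n_groups)]
    domain = list(range(m + 1))
    constraints = [("sum_greater", variables[g], variables[g + 1])
                   for g in range(n_groups - 1)]
    return variables, domain, constraints

vs, dom, cs = bounded_width_instance(w=3, m=2)
print(len(vs) * len(vs[0]), "variables,", len(cs), "constraints")   # 21 variables, 6 constraints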

6 Conclusions

We believe that the results presented in this paper have established that the various ideas about different forms of tractable CSP instances presented in the theoretical literature can provide a fruitful source of inspiration for the design of challenging benchmarks. The initial applications of these ideas, presented in the previous sections, have already identified significant differences between solvers in their ability to exploit salient features of the problem instances they are given. There are a number of technical difficulties to overcome in developing useful benchmark instances. First of all, unlike SAT solvers, there is no standard input language for CSP solvers. The translation from MiniZinc to FlatZinc, or from Essence0 to Minion, can sometimes obscure the nature of an essentially simple problem, and hence badly affect the efficiency of the solution. On the other hand, the cost of standardisation is a loss of expressive power. We have seen in Section 3 that translating simple forms of constraints such as linear inequalities into CNF may be very inefficient, and may lose the important features of the constraints which guarantee tractability. We suggest that a better awareness of the factors of a problem specification that can ensure tractability could lead to better translation software, which ensures that such features are preserved. Even when they have been successfully captured in an appropriate specification language, and input to a constraint solver, it can be the case that theoretically tractable instances may still be solved very inefficiently in practice. We have seen in Sections 4, and especially in Section 5, that when the tractability is 5


due to a property that requires a higher level of consistency than arc-consistency to exploit, instances may be very challenging for standard solvers. Finding effective automatic ways to improve the variable orderings and value orderings used by a solver according to specific relevant features of the input instance seems a promising first step which has not been sufficiently pursued. Summing up, in order to improve the performance of constraint solvers we need effective benchmarks which can explore that performance over a range of different problem types with different characteristics. One way to systematically develop such benchmarks is to use the insights from the theoretical study of constraint satisfaction. Benchmarks derived in this way can be simple enough to analyse in detail, and yet challenging enough to reveal specific weaknesses in solver techniques. This paper has begun to explore the potential of this approach.

References

1. G12/MiniZinc constraint solver. Software available at http://www.g12.cs.mu.oz.au/minizinc/download.html.
2. Gecode constraint solver. Software available at http://www.gecode.org/.
3. A. Bulatov, A. Krokhin, and P. Jeavons. Classifying the complexity of constraints using finite algebras. SIAM Journal on Computing, 34(3):720–742, 2005.
4. A. Bulatov and M. Valeriote. Recent results on the algebraic approach to the CSP. In Complexity of Constraints, volume 5250 of Lecture Notes in Computer Science, pages 68–92. Springer, 2008.
5. D. Cohen, P. Jeavons, and M. Gyssens. A unified theory of structural tractability for constraint satisfaction problems. Journal of Computer and System Sciences, 74:721–743, 2007.
6. M.C. Cooper, D.A. Cohen, and P.G. Jeavons. Characterising tractable constraints. Artificial Intelligence, 65:347–361, 1994.
7. V. Dalmau, Ph. Kolaitis, and M. Vardi. Constraint satisfaction, bounded treewidth, and finite-variable logics. In Proceedings of CP'02, volume 2470 of Lecture Notes in Computer Science, pages 310–326. Springer-Verlag, 2002.
8. E.C. Freuder. A sufficient condition for backtrack-bounded search. Journal of the ACM, 32:755–761, 1985.
9. A. Frisch, W. Harvey, C. Jefferson, B. Martínez-Hernández, and I. Miguel. The essence of ESSENCE: A constraint language for specifying combinatorial problems. In Proceedings of IJCAI'05, pages 73–88, 2005.
10. I. Gent, C. Jefferson, and I. Miguel. Minion: A fast scalable constraint solver. In Proceedings of ECAI 2006, pages 98–102. IOS Press, 2006. Software available at http://minion.sourceforge.net/.
11. M. Grohe. The structure of tractable constraint satisfaction problems. In MFCS 2006, volume 4162 of Lecture Notes in Computer Science, pages 58–72. Springer, 2006.
12. J. Huang. Universal Booleanization of constraint models. In Proceedings of CP'08, volume 5202 of Lecture Notes in Computer Science, pages 144–158. Springer, 2008.
13. P.G. Jeavons and M.C. Cooper. Tractable constraints on ordered domains. Artificial Intelligence, 79(2):327–339, 1995.
14. N. Nethercote, P. Stuckey, R. Becket, S. Brand, G. Duck, and G. Tack. MiniZinc: Towards a standard modelling language. In Proceedings of CP'07, volume 4741 of Lecture Notes in Computer Science, pages 529–543. Springer, 2007.
15. Andrea Rendl. TAILOR – tailoring Essence′ constraint models to constraint solvers. Software available at http://www.cs.st-andrews.ac.uk/~andrea/tailor/.


A simple efficient exact algorithm based on independent set for Maxclique problem

Chu Min Li and Zhe Quan (student)
MIS, EA4290, Université de Picardie Jules Verne, 33 rue St. Leu, 80039 Amiens, France

Abstract. In this paper we propose a simple and efficient exact algorithm for the famous Maxclique problem called iMaxClique, using a new and powerful upper bound based on a partition of a graph into independent subsets. Experimental results show that iMaxClique is very fast on DIMACS Maxclique benchmarks and solves six instances remaining open in [9].

Keywords: Branch-and-Bound, Independent set, Maxclique

1 Introduction

Consider an undirected graph G = (V, E), where V is a set of n vertices {v1, v2, ..., vn} and E is a set of m edges. Edge (vi, vj) with i ≠ j is said to connect vertices vi and vj. The graph is undirected because we do not distinguish (vi, vj) from (vj, vi). A clique of G is a subset C of V such that every two vertices in C are connected by an edge in E. The maximum clique problem (Maxclique for short) consists in finding a clique of G of the largest cardinality. Maxclique is known to be NP-hard and is very important in many real-world applications. There are mainly two types of algorithms to solve the Maxclique problem: heuristic algorithms including greedy construction algorithms and stochastic local search algorithms (see e.g. [8,1,3,5,6,10]), and exact algorithms including branch and bound algorithms (see e.g. [4,9]). Heuristic algorithms are able to solve large and hard Maxclique problems but are unable to claim the optimality of their solutions. Exact algorithms may fail to solve large and hard Maxclique problems, but are sure of the optimality of the solutions they are able to find. An independent set of a graph is a subset I of V such that no two vertices in I are connected by an edge in E. The maximum independent set problem consists in finding an independent set of G of the largest cardinality. Maxclique is closely related to the maximum independent set problem, because a clique of G is an independent set of its complementary graph Ḡ, and vice versa, where Ḡ is defined to be (V, Ē) with Ē = {(vi, vj) | (vi, vj) ∈ V × V, i ≠ j and (vi, vj) ∉ E}. In this paper, we propose a simple and efficient branch and bound algorithm for Maxclique. We call it iMaxClique because it uses a new upper bound based on independent set partition of G. In section 2, we present iMaxClique and the new upper bound. In section 3, we compare iMaxClique with the best state-of-the-art exact algorithms [4,9] to our knowledge on the widely used DIMACS Maxclique benchmark¹. The experimental results show that iMaxClique is significantly faster and is able to close 6 of the 14 problems remaining open in [4] and [9]. We conclude in section 4.

available at http://cs.hbg.psu.edu/txn131/clique.html


Algorithm 1: iMaxClique(G, C, LB), a branch and bound algorithm for Maxclique
Input: A graph G = (V, E), a clique C, and the cardinality LB of the largest clique found so far
Output: A maximum clique of G
begin
  if |V| = 0 then return C;
  UB ← overestimation(G) + |C|;
  if LB ≥ UB then return ∅;
  select the vertex v of minimum degree from G;
  C1 ← iMaxClique(Gv, C ∪ {v}, LB);
  if |C1| > LB then LB ← |C1|;
  C2 ← iMaxClique(G\v, C, LB);
  if |C1| ≥ |C2| then return C1;
  else return C2;
end

2 A Branch and Bound Maxclique Solver

A clique C of a graph G = (V, E) is maximal if no other vertex of G can be added into C to form a larger clique. C is maximum if no clique of larger cardinality exists in G. G can have several maximum cliques. Let V′ be a subset of V; the subgraph G′ of G induced by V′ is defined as G′ = (V′, E′), where E′ = {(vi, vj) ∈ E | vi, vj ∈ V′}. Given a vertex v of G, the set of neighbor vertices of v is denoted by N(v) = {v′ | (v, v′) ∈ E}. The cardinality |N(v)| of N(v) is called the degree of v. Gv denotes the subgraph induced by N(v), and G\v denotes the subgraph induced by V\{v}. G\v is obtained by removing v and all edges connecting v from G.

2.1 A Basic branch and bound Maxclique algorithm

The space of all possible cliques of a graph G can be represented by a binary search tree. A basic branch and bound algorithm for Maxclique compares all these cliques and outputs a maximum clique, by exploring the search tree in a depth-first manner. Algorithm 1 shows the pseudo-code of this algorithm, inspired from the branch and bound algorithm MaxSatz for the MaxSAT problem proposed in [7]. At every node, the algorithm works with a graph G = (V, E), a clique C of the initial input graph under construction, and the cardinality of the largest clique found so far (called lower bound LB). At the root, G is the input graph, C is the empty set ∅, and LB is 0. At the other nodes, G is a subgraph of the initial input graph, C is disjoint from G but every vertex of G is connected to every vertex of C in the initial input graph. In other words, adding any clique of G into C forms a clique of the initial input graph. The aim of the algorithm is to add a clique of the maximum cardinality of G into C. If G is empty, then a maximal clique C of the initial input graph is found and returned. Otherwise the algorithm compares LB with the sum of the cardinality of C and an overestimation of the maximum cardinality of the cliques in G. The sum is called upper bound (UB). If LB

i (i.e. for a subproblem smaller than Zi). For variables vi, . . . , vk−1 we compute the sum of directed arc-inconsistency (DAC) counts, i.e. evaluate the lower bound according to the mdac algorithm [14, 5]. In particular, for each (vj, val) such that i ≤ j < k − 1 and val ∈ D(vj), we compute dac_{vj,val}, the number of variables vr (r > j) whose current domains conflict with (vj, val). Then we set DM(vj, val, P) = Conf(vj, val, P) + dac_{vj,val}, MinDM(vj, P) = min_{val ∈ D(vj)} DM(vj, val, P), and SumDM = Σ_{j=i}^{k−1} MinDM(vj, P). Then LB(rds-mdac) = F + Connect + SumDM + Y is a valid lower bound considered by our algorithm rds-mdac [12].
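The SumDM term defined above can be computed directly from the conflict and DAC counts. The following sketch assumes a simple dictionary layout for Conf and dac, chosen only for illustration.

def sum_dm(unassigned_vars, domains, conf, dac):
    # SumDM = sum over unassigned v of MinDM(v, P)
    #       = sum over v of min over values of (Conf(v, val, P) + dac[v, val]).
    total = 0
    for v in unassigned_vars:
        total += min(conf[(v, val)] + dac[(v, val)] for val in domains[v])
    return total

domains = {'v1': [0, 1], 'v2': [0, 1]}
conf = {('v1', 0): 2, ('v1', 1): 1, ('v2', 0): 0, ('v2', 1): 3}
dac  = {('v1', 0): 0, ('v1', 1): 2, ('v2', 0): 1, ('v2', 1): 0}
print(sum_dm(['v1', 'v2'], domains, conf, dac))   # 2 + 1 = 3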

3 Exploring local acyclicity within RDS

Consider again the last paragraph of Section 2, and let dacrvj ,val be defined analogously to dacvj ,val with the only difference that only arc-inconsistent variables vr (r ≥ k) are counted. Let Z 0 be a wcsp induced by current domains of variables vi , . . . , vk−1 (i.e. weights of conflicts between them are as in an original wcsp). Let DM (v, val, P ) = Conf (v, val, P ) + dacrv,val be the initial weight of (v, val) for each assignment in Z 0 . Denote by C an arbitrary LB on the optimal solution weight of Z 0 (e.g., it can be computed by [8, 2, 7, 3]). Then it can be shown that LB = F + Connect + C + Y is a valid LB on the optimal weight of an extension of P . This method is a generic technique of computing a lower bound in the rds framework. One special case of this method is rds-mdac [12]. We now present another special case. Let k be the largest possible number such that variables vi , . . . vk−1 induce an acyclic constraint subgraph of the constraint graph of Z, i.e the constraint graph of Z 0 is acyclic. Let C be the optimal weight of a solution of Z 0 , that can be computed efficiently. We call the algorithm, which computes LB(rds-tree) = F + Connect + C + Y in this way, rds-tree. If Z 0 involves all the unassigned variables, we update U B ← min(U B, F + C) and backtrack immediately. LB(rds-tree) is generally larger then LB(rds-mdac), because rds-mdac provides only approximation on the weight of Z 0 , while rds-tree gives us the exact optimal weight of Z 0 . It is not hard to show that the complexity of one iteration of rds-tree is O(ed2 ) (as of rds-mdac), where e is the number of constraints between the unassigned variables and d is the maximal domain size. It is also worth mentioning that due to the fixed order of variables explored by rds-tree (as well as by other rds-based algorithms), the range of variables being included to Z 0 can be identified at the preprocessing stage. The kind of local acyclicity explored by rds-tree depends on the existence of sufficiently large induced acyclic subgraph of the constraint graph of the given wcsp. Another interesting type of local acyclicity occurs when the constraint graph has many disjoint acyclic subgraphs. This type of local acyclicity


is explored in our algorithm, rds-cascade, described below. At the preprocessing stage, rds-cascade partitions all the variables into subsets T1 , . . . , Tt (i.e. a cascade of sub-cns), so that each subset induces an acyclic subgraph. The variables are statically ordered is such a way that each subset includes consecutive variables according to this order. Let dacfv,val be defined analogously to dacv,val , with the only difference being that for each v ∈ Tm , only variables that belong to Tm+1 , . . . Tt are considered. Let Zm be the wcsp induced by the variables of Tm , where for each assignment (v, val) of v ∈ Zm , DM (v, val, P ) = Conf (v, val, P ) + dacfv,val is considered as the initial weight of this assignment. Let W Tm be the weight of an optimal solution Pt of Zm (which can be computed efficiently). Then LB(rds-cascade) = F + m=1 W Tm is a valid lower bound on the optimal weight of extension of P . Note that the part of rds does not participate in computing the lower bound. However, rds-cascade applies rds as a filtering procedure, i.e., when for the given value (v, val) we are going to decide whether or not to remove this value from the current domain of v, (v, val) is temporarily added to P and then the lower bound is computed using rds. The complexity of rds-cascade is O(ed2 ), where e is a number of constraints between the unassigned variables and d is the maximal domain size. Indeed, given up up the cascade T1 , . . . , Tt , let e = (e1 + e2 + · · · + et ) + (eup 1 + e2 + · · · + et−1 ), where em (m ∈ {1, . . . t}) is a number of constraints between variables of Tm only and eup m (m ∈ {1, . . . t − 1}) is a number of constraints between variables of Tm and the following Tm+1 , . . . , Tt . Computing initial weights of values in each 2 2 Tm takes O(eup i ∗ d ), computing the weight W Tm of each Tm takes O(em ∗ d ), analogously to rds-tree. Hence the resulting complexity is O((e1 + e2 + · · · + up up 2 2 et ) ∗ d2 + (eup 1 + e2 + · · · + et−1 ) ∗ d ) = O(ed ).
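The optimal weight of an acyclic (tree-structured) sub-wcsp, the quantity C used by rds-tree and the W Tm values used by rds-cascade, can be obtained by a min-sum pass over the tree. The following is a minimal sketch with illustrative data structures, not the authors' implementation.

def tree_wcsp_opt(tree, root, domains, unary, binary):
    # tree: child lists; unary[(v, a)]: weight of v = a; binary[(parent, child)][(a, b)].
    def cost(v, a):                       # best weight of the subtree of v, given v = a
        total = unary.get((v, a), 0)
        for ch in tree.get(v, []):
            total += min(binary.get((v, ch), {}).get((a, b), 0) + cost(ch, b)
                         for b in domains[ch])
        return total
    return min(cost(root, a) for a in domains[root])

tree = {'x': ['y', 'z']}
domains = {'x': [0, 1], 'y': [0, 1], 'z': [0, 1]}
unary = {('x', 0): 1, ('y', 1): 2}
binary = {('x', 'y'): {(0, 0): 0, (0, 1): 1, (1, 0): 3, (1, 1): 0},
          ('x', 'z'): {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 4}}
print(tree_wcsp_opt(tree, 'x', domains, unary, binary))   # 1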

4 Experiments

We now compare rds-tree and rds-cascade to rds [13] and rds-mdac [12]. For all the tested algorithms, we order the variables in decreasing order of their degrees in the constraint graph. We use the static value ordering heuristic, first choosing the value that the variable had in the optimal assignment found on the previous subproblem, then choosing the first available value [13]. All the algorithms are implemented in C++, and the experiments are performed on a 2.66GHz Intel Xeon CPU with 4.4GB RAM using the GNU g++ compiler. We measure the search efforts in terms of the number of backtracks and the CPUtime. We run these algorithms on two benchmarks: Binary Random Max-CSPs and Radio Link Frequency Assignment Problems (RLFAP). For Binary Random Max-CSPs, we generate cns given four parameters [11]: (1) the number of variables n, (2) the domain size dom, (3) density p1 (the probability of a constraint between two given variables), (4) tightness p2 (the probability of a conflict between two values of the given pair of variables). We fix three parameters (n, dom, p1 ∈ [0.9, 0.5, 0.1]), and vary p2 over the range [0.65, 1]. For every tuple of parameters, we impose a time limit of 200 seconds,


and report results as the average of 50 instances. Due to lack of space, we describe empirical results, omitting their graphic presentation. rds-mdac outperforms rds in the number of backtracks for all the considered values of p1 and over the whole range of p2 . The rate of improvement in backtracks by rds-mdac over rds is increased with decreasing of p1 : 2.5 times for p1 = 0.9, 3 times for p1 = 0.5, 6 times for p1 = 0.1. The improvement in the CPU-time by rds-mdac compared with rds is achieved for middle and low values of density p1 over the whole range of p2 : a reduction is up to 20% for p1 = 0.5 and up to 5 times for p1 = 0.1. The comparison of the number of backtracks between rds-mdac, rds-tree, and rds-cascade is as follows. rds-tree outperforms rds-mdac for middle and low values of density p1 over the whole range of p2 , achieving a reduction of up to 15% for p1 = 0.5 and up to 20 times for p1 = 0.1. rds-cascade outperforms both rds-mdac and rds-tree for all the considered values of p1 over the whole range of p2 . rds-cascade outperforms rds-tree by 70% on average for p1 = 0.9 and p1 = 0.5, and up to 12 times for p1 = 0.1. The comparison of the CPU-time is as follows. For high values of density p1 : both rds-tree and rds-cascade improves rds-mdac by 30% on average over the whole range of p2 , but only rds-cascade reduces the CPU-time of rds by 15% for high values of p2 starting from p2 = 0.93. For middle and low values of density p1 : rds-tree outperforms rds-mdac by up to 30% for p1 = 0.5 and 6 times on average for p1 = 0.1, and rds-cascade outperforms rds-tree by up to 35% on average for p1 = 0.5 and up to 7 times for p1 = 0.1. Thus, for Binary Random Max-CSPs, the “winner” is rds-cascade, followed by rds-tree (the second best), rds-mdac (the third) and rds (the fourth). We also compare the average LB for one iteration in algorithms rds-mdac and rds-cascade, without backtracking. We choose a quarter of the variables randomly to be a current partial solution P , assign variables in P randomly, then compute LB by corresponding formulas of rds-mdac and rdscascade. The experiments empirically show that LB(rds-cascade) is higher than LB(rds-mdac) by up to 50% for all the considered values of density (0.9, 0.5, 0.1), and over the whole range of tightness. The task of RLFAP [1] is to assign frequency (values) to each of the radio links (variables) in the given communication network in such a way that the links may operate together without noticeable interference (unary and binary constraints). For benchmarking we use five CELAR 6 sub-instances (SUB0, SUB1, SUB2, SUB3, SUB4) [1]. The empirical results are as follows. rds-mdac outperforms rds significantly in search efforts for all sub-instances. The best result by rdsmdac over rds is achieved for SUB4: the improvement of 84 times in backtracks and 28 times in the CPU-time. rds-tree outperforms rds-mdac in the search efforts for some sub-instances (SUB1, SUB3, SUB4), achieving a reduction of up 8 times in backtracks and up to 6 times in the CPU-time. rds-cascade outperforms rds-mdac in the search efforts for all sub-instances, reducing backtracks up to 10 times and the CPU-time up to 8 times. The best performance in the


search efforts among all the algorithms is achieved by rds-cascade (for SUB0, SUB2, SUB4) or by rds-tree (for SUB1, SUB3).

5 Conclusion

In this paper we present two new algorithms, rds-tree and rds-cascade, which use local acyclicity for improving the rds, when computing LB. rds-tree considers the “small” subproblems with unassigned variables only, as rds-mdac [12], but replaces mdac evaluation of other unassigned variables by exact computation of optimal weights of acyclic sub-cn on these variables. rds-cascade divides all unassigned variables into clusters that have acyclic constraint subgraphs, computes the optimal weight of each cluster in the polynomial time, and sums up the obtained optimal weights of clusters. The empirical evaluation shows that rds-tree and rds-cascade outperform rds-mdac in all search efforts, pointing towards the conclusion that exact computation of the optimal weight of even part of variables increases significantly the LB evaluation.

References 1. B.Cabon and S. de Givry and L. Lobjois and T. Schiex and J. P. Warners. Radio Link Frequency Assignment. Constraints, 4(1):79–89, 1994. 2. M. C. Cooper and T. Schiex Arc consistency for soft constraints. Artif. Intell., 154(1-2):199–227, 2004. 3. S. de Givry, F. Heras, M. Zytnicki, and J. Larrosa. Existential Arc Consistency: Getting closer to full arc consistency in weighted CSPs. In IJCAI, pages 84–89, 2005. 4. J. Larrosa and R. Dechter. Boosting Search with Variable Elimination in Constraint Optimization and Constraint Satisfaction Problems. Constraints, 8(3):303– 326, 2003. 5. J. Larrosa and P. Meseguer. Exploiting the Use of DAC in MAX-CSP. In CP, pages 308–322, 1996. 6. J. Larrosa, P. Meseguer, and T. Schiex. Maintaining Reversible DAC for Max-CSP. Artif. Intell., 107(1):149–163, 1999. 7. J. Larrosa and T. Schiex. In the quest of the best form of local consistency for Weighted CSP. In IJCAI, pages 239–244, 2003. 8. J. Larrosa and T. Schiex. Solving weighted CSP by Maintaining Arc Consistency. Artif. Intell., 159(1-2):1–26, 2004. 9. P. Meseguer and M. S´ anchez. Specializing Russian Doll Search. In CP, pages 464–478, 2001. 10. P. Meseguer, M. S´ anchez, and G. Verfaillie. Opportunistic Specialization in Russian Doll Search. In CP, pages 264–279, 2002. 11. P. Prosser. An empirical study of phase transitions in Binary Constraint Satisfaction Problems. Artif. Intell., 81(1-2):81–109, 1996. 12. M. Razgon and G. M. Provan. Adding Flexibility to Russian Doll Search. In ICTAI, Vol. 1, pages 163–171, 2008. 13. G. Verfaillie, M. Lemaˆıtre, and T. Schiex. Russian Doll Search for Solving Constraint Optimization Problems. In AAAI/IAAI, Vol. 1, pages 181–187, 1996. 14. R. J. Wallace. Directed Arc Consistency Preprocessing. In Constraint Processing, Selected Papers, pages 121–137, 1995.


Optimal Solutions for Conversational Recommender Systems Based on Comparative Preferences and Linear Inequalities

Walid Trabelsi1 (student), Nic Wilson1 and Derek Bridge2
1 Cork Constraint Computation Centre, 2 Department of Computer Science, University College Cork, Ireland
[email protected], [email protected], [email protected]

1 Introduction

Recommenders are automated tools to deliver selective information that matches personal preferences. These tools have become increasingly important to help users find what they need from massive amount of media, data and services currently flooding the Internet. In (Bridge and Ricci, 07) [1] a conversational product recommender system is described in which a user repeatedly reconfigures and revises then requests a query until she finds the product she wants (see Section 2). A key component of such a system is the pruning of queries that are judged to be less interesting to the user than other queries. The approach of [1] assumes that the user’s preferences are determined by a sum of weights, which leads to pruning based on deduction of linear inequalities (see Section 3). We have developed a new approach for modelling the user’s preferences, based on comparative preferences statements (see Section 4), which can be used as an alternative method for pruning less interesting queries. Such an approach also allows the possibility of reasoning with a broad range of other input statements, such as conditional preferences, and tradeoffs.

2 Information Recommendation

In this section we describe the recommender system set-up, based on [1]. A given use case is characterized by a set of products. Products are defined through a set V of n variables representing the features possibly included in a given product. Each variable Fi is defined on a discrete and finite domain denoted by D(Fi). In the following we assume every variable has a domain D(Fi) = {fi, f̄i}, where fi represents that the feature is present and f̄i represents that the feature is absent. Thus a product is represented by an assignment to V. For example, if V = {F1, F2}, then the assignment f1f̄2 refers to a product containing the feature F1 but not F2. The set of assignments to V is written as D(V). The user and the system engage in a dialogue. At each point in the dialogue there is a current query. The user has to choose a new query, which will become


the next current query. The system will generate a set of queries for the user to choose between. There are three stages to producing this set of queries:

1. The set of queries which are close, in a particular sense, to the current query are generated; these are called the Candidates.
2. Queries which are unsatisfiable are eliminated; the remaining queries are called the Satisfiables.
3. Queries which are judged to be dominated by (i.e., worse than) another of the satisfiable queries are eliminated, leaving the Undominated ones.

We consider two approaches for this last stage; the first one is based on deduction with linear inequalities (see Section 3); the second one uses a new approach for reasoning with comparative preferences (Section 4). The user then chooses some element of the Undominated to be the new Current Query. This choice of one query above the other possible choices is used to induce information about the user's preferences; this information is used in Stage 3 for eliminating queries at future points in the dialogue.

2.1 Generating the Candidates

The candidate queries are chosen, based on the method in [1], to be the set of queries which are obtained from the current query by one of three moves: Add concerns adding one feature to the current query, Switch consists of replacing a feature by another one and Trade considers replacing one feature by two other features. For instance, if Add(Fi ) is selected then advisor is aware of the user’s desire to add the feature Fi not yet present. Choosing Switch(Fi , Fj ) means the user is trying to replace the feature Fi by the feature Fj previously absent. By a Trade(Fi , Fk , Fl ) action the user wants to replace the feature Fi by the features Fk and Fl previously absent.

2.2 Checking Satisfiability

The satisfiable queries are those for which there exists a product which has all the features present in the query (it may also have other features not present in the query). With the list of products being stored explicitly in a database, this can be done by checking through the products explicitly. For configurable products, where the set of products is represented as a set of solutions of a Constraint Satisfaction Problem, satisfiability of a query can be checked by determining if the CSP has solutions containing all the features in the query (which can be checked by checking satisfiability of an augmented CSP). There can still be a large number of queries for the user to consider. It is desirable to prune queries which we believe that the user will consider worse (less of interest) than another of the queries.


3 Eliminating Queries Using Deduction of Linear Inequalities

As described above, we need a method to prune dominated queries, i.e., queries which we believe will be less preferred by the user to others. We consider two different approaches for doing this. The first one, which is a slightly amended version of the approach used in [1], is based on deduction of linear inequalities (denoted by LP). It is assumed that the user assigns, for each feature Fi , a certain amount of utility wi representing the value of that feature being included in the final product. Therefore every user has a profile represented by a weight vector denoted by w = (w1 , .., wn ). The approach assumes the system does not know the values of wi it only reasons on them as unknown variables. Each move performed by the user is considered as feedback and makes the advisor induce some user preferences between the product features. These preferences will be defined as linear inequalities on the user’s weights. For instance, the system might infer that one selected feature seems to be more interesting, for the user, than the features not yet present. For example, suppose that V = {F1 , F2 , F3 }, and the current query contains no features (q = {∅}). The user chooses new query {F1 } based on the move Add(F1 ), hence preferring this move to Add(F2 ) and Add(F3 ), suggesting that feature 1 is preferred to feature 2, and to feature 3. As a consequence the advisor induces the constraints w1 ≥ w2 and w1 ≥ w3 . The structure storing the set of these inequalities is called the User Model, which in this case equals {w1 ≥ w2 , w1 ≥ w3 }. The User Model is updated and enriched by new inequalities as soon as the user performs moves. The advisor can now exploit the User Model to eliminate queries in Satisfiables, leaving the set of optimal queries: all solutions which are not dominated by another one. For query α let W (α) be the sum of weights of features in α. With this approach, α dominates β if the set of linear inequalities contained in the User Model imply the constraint W (α) ≥ W (β). A query β in Satisfiables is eliminated if there exists some query α in Satisfiables which strictly dominates β, i.e., such that α dominates β but β does not dominate α.

4 Eliminating Queries Using Reasoning with Comparative Preferences

We describe here a different approach for defining dominance among queries, based on an approach for reasoning with comparative preferences (denoted by CPR).

4.1 Comparative preferences

Comparative preferences are concerned with the relative preference of alternatives, in particular, statements implying that one alternative is preferred to another. Such statements can be relatively easy to reliably elicit: often it is easier to judge that one alternative is preferred to another than it is to allocate particular grades of preference to the alternatives. We describe a language of comparative preference statements, based on [2]. Though it is rather a simple language, it is more expressive than the perhaps best known languages for reasoning with comparative preferences: CP-nets [3], TCP-nets [4], and the formalism defined in [5].

A language of comparative preference statements. Let P, Q and T be subsets of V. Let p be an assignment to P, and let q be an assignment to Q. The language includes the statement p ≥ q || T. The meaning of this statement is that complete assignment α is preferred to β if α extends p, β extends q, and α and β agree on the set of variables T. Given a set of comparative preference statements Γ, and two complete assignments α and β, we need to determine if Γ implies that α is preferred to β. With the standard kind of inference, this problem of determining the preference of an outcome α over another outcome β has been shown to be PSPACE-complete [6], even for the case of CP-nets. In our application, efficient deduction is very important in order for the system to have a quick response time. The standard kind of inference is also rather conservative. In [2], a less conservative kind of inference is defined, which allows deduction in polynomial time; we use this deductive method in our approach here.
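For illustration, the direct reading of a single statement p ≥ q || T given above can be coded in a few lines. This sketch covers only that reading, not the polynomial-time inference method of [2], and the data layout is an assumption made for the example.

def statement_applies(p, q, T, alpha, beta):
    # p, q, alpha, beta: dicts variable -> value; T: set of variables.
    extends_p = all(alpha[v] == val for v, val in p.items())
    extends_q = all(beta[v] == val for v, val in q.items())
    agree_on_T = all(alpha[v] == beta[v] for v in T)
    return extends_p and extends_q and agree_on_T

alpha = {'F1': 1, 'F2': 0, 'F3': 1}
beta  = {'F1': 0, 'F2': 1, 'F3': 1}
# A statement preferring (F1 present, F2 absent) to (F1 absent, F2 present), all else (F3) equal:
print(statement_applies({'F1': 1, 'F2': 0}, {'F1': 0, 'F2': 1}, {'F3'}, alpha, beta))  # True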

4.2 Expressing induced preference using comparative preference statements

In our setting it is always at least as desirable for a product to have a feature as not. This can be represented by comparative preference statements of the following forms: fi ≥ fi ||V − {Fi }.

This is saying that a query α which contains feature fi is at least as preferred as a query β which is identical to α except for not containing fi . Consider the situation where the user has chosen to add feature i instead of feature j, which with the linear inequalities approach, is represented by the constraint wi ≥ wj . There are a number of different options for generating preference statements for this situation. 1. Let q be the current query, let q i be the current query q with the feature fi added. A rather conservative approach is to just model the preference of feature i over feature j by the preference statement: q i ≥ q j ||∅, which just expresses a preference for q i over q j . 2. fi ≥ fi ||V − {Fi , Fj } is saying that the presence or not of the feature Fi is more important than the choice of Fj . Thus, whatever the state of the feature Fj in the query the user will prefer Fi to be present in the query so that to be included in the final product.


3. fi ¬fj ≥ ¬fi fj || V − {Fi, Fj}; this is saying that the simultaneous presence of the feature Fi and absence of the feature Fj is more interesting than the simultaneous absence of the feature Fi and presence of the feature Fj (all else being equal). For instance, if the user has to choose between two products having the same features except for Fi and Fj, where the first contains Fi but not Fj and the second contains Fj but not Fi, she will prefer the first one.
From this list we can easily appreciate the flexibility of comparative preferences for representing the user's preferences.
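As a sketch of how the advisor could generate these statements (our own illustration, using the same (p, q, T) triple representation as in the earlier sketch, with queries interpreted as complete true/false assignments over V), the three options for a move that adds Fi rather than Fj might be built as follows:

    # Hypothetical construction of the three candidate statements induced by a
    # user choosing Add(Fi) over Add(Fj); each statement is a triple (p, q, T).

    def induced_statements(current_query, i, j, V):
        """current_query: set of features already in the query; i, j: feature names."""
        as_assignment = lambda s: {f: (f in s) for f in V}
        q_i = as_assignment(current_query | {i})   # query with Fi added
        q_j = as_assignment(current_query | {j})   # query with Fj added

        option1 = (q_i, q_j, set())                            # q^i >= q^j || {}
        option2 = ({i: True}, {i: False}, V - {i, j})          # fi >= not-fi || V - {Fi, Fj}
        option3 = ({i: True, j: False}, {i: False, j: True},   # fi not-fj >= not-fi fj || ...
                   V - {i, j})
        return option1, option2, option3

    # Example: V = {F1, F2, F3}, empty current query, Add(F1) chosen over Add(F2).
    print(induced_statements(set(), "F1", "F2", {"F1", "F2", "F3"}))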

4.3 Preliminary Experimentation and Results

The preliminary experimentation involved computing the pruning rate of the new approach (CPR) and of the one based on deduction of linear inequalities (LP). Tests consist of engaging simulated users in a dialogue with the two types of advisors (CPR and LP) and recording, at each step of the dialogue, the number of queries pruned. We performed several experiments: each one involves 100 simulated users interacting with an advisor adopting either the LP or the CPR approach to help select optimal queries, by pruning dominated queries. The products used during the tests belong to three databases of hotels. We tested the performance of the preference forms described above. These tests reveal that the CPR approach prunes much more than the LP approach. In the following table, we report the results (expressed as percentages of available queries eliminated) of three experiments using the preference forms 1, 2 and 3 for the CPR approach. (Obviously the LP approach's performance is not influenced by the preference form adopted; the variation in the bottom row is due to random variation.)

         Form 1   Form 2   Form 3
  CPR     87.62    91.23    89.09
  LP      72.91    74.05    71.66

The above results show that the advice given by the CPR approach to the user is more focused than the advice given by the LP approach. For example, when applying the first preference form, 87.62% of the available queries are eliminated by the CPR approach, compared with only 72.91% by the LP approach. This allows the user to focus only on the most interesting queries according to her preferences.

5 Discussion

Both the linear inequalities and the comparative preferences approaches have been implemented, and we are currently in the process of comparing them experimentally, based on three databases of hotels as products. The advice should contain confirmed feasible and non-dominated queries, so that real users are satisfied as soon as possible, since customers very quickly become annoyed and/or bored [7]. We have enhanced the reliability and efficiency of information recommendation by providing an efficient comparative preference representation and inference engine, in order to improve user preference elicitation and to perform efficient inference that helps the user navigate to a suitable product. A potential advantage of the comparative preferences approach is that it also allows the user to make explicit preference statements which are context-dependent. We are planning in the future to address the related problem of finding optimal solutions of a Constraint Satisfaction Problem with respect to a set of comparative preference statements. This will involve incorporating our comparative preferences deduction mechanism in a branch-and-bound search, and developing specialist pruning mechanisms.

References
1. Bridge, D., Ricci, F.: Supporting product selection with query editing recommendations. In: Proceedings of ACM Recommender Systems 2007. (2007)
2. Wilson, N.: An efficient deduction mechanism for expressive comparative preference languages. In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-09). (2009)
3. Boutilier, C., Brafman, R.I., Domshlak, C., Hoos, H., Poole, D.: CP-nets: A tool for reasoning with conditional ceteris paribus preference statements. Journal of Artificial Intelligence Research 21 (2004) 135–191
4. Brafman, R., Domshlak, C., Shimony, E.: On graphical modeling of preference and importance. Journal of Artificial Intelligence Research 25 (2006) 389–424
5. Wilson, N.: Extending CP-nets with stronger conditional preference statements. In: Proceedings of AAAI-04. (2004) 735–741
6. Goldsmith, J., Lang, J., Truszczyński, M., Wilson, N.: The computational complexity of dominance and consistency in CP-nets. In: Proceedings of IJCAI-05. (2005) 144–149
7. Raskutti, B., Zukerman, I.: Generating queries and replies during information-seeking interactions. International Journal of Human Computer Studies 47 (1997) 689–734


A multithreaded solving algorithm for QCSP+

Jérémie Vautard (student)¹ and Arnaud Lallouet²

¹ Université d'Orléans — LIFO, BP6759, F-45067 Orléans, [email protected]
² Université de Caen-Basse Normandie — GREYC, BP 5186, 14032 Caen, [email protected]

Abstract. This paper presents some ideas about multithreading QCSP solving procedures. We introduce a first draft of a multithreaded algorithm for solving QCSP+ and sketch some directions for future work on the parallel solving of quantified problems.

1 Introduction

Quantified constraint satisfaction problems (QCSP) have been studied for several years, and many search procedures [7, 5], consistency definitions [6, 4] and propagation algorithms [3, 1] have been proposed to solve them. However, while several attempts have been made to build parallel solvers for CSPs, we have not found any parallel approach for solving quantified problems. This is what we attempt in this paper. First, we propose a sketch of a fairly general framework for solving QCSP+ problems (introduced in [2]) using a multithreaded approach. Then, we discuss several directions that can be explored in this domain.

2 The QCSP+ framework

2.1 Formalism

Variables, constraints and CSPs. Let V be a set of variables. Each v ∈ V has a domain Dv. For a given W ⊆ V, we denote by DW the set of tuples on W, i.e. the Cartesian product of the domains of all the variables of W. A constraint c is a pair (W, T), W being a set of variables and T ⊆ DW a set of tuples. The constraint is satisfied for the values of the variables of W which form a tuple of T. If T = ∅, the constraint is said to be empty and can never be satisfied. On the other hand, a constraint such that T = DW is full and will be satisfied whatever values its variables take. W and T are also denoted by var(c) and sol(c). A CSP is a set C of constraints. We denote by var(C) the set of its variables, i.e. ∪c∈C var(c), and by sol(C) the set of its solutions, i.e. the set of tuples on var(C) satisfying all the constraints of C. The empty CSP (denoted ⊤) is true, whereas a CSP containing an empty constraint is false and denoted ⊥.
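A direct transcription of these definitions (a sketch of ours, representing a tuple on W as a dictionary from variables to values) could look as follows:

    # Minimal sketch: a constraint is a pair (W, T); a CSP is a set of
    # constraints; an assignment satisfies the CSP iff it satisfies every
    # constraint.

    class Constraint:
        def __init__(self, variables, tuples):
            self.variables = frozenset(variables)                   # W = var(c)
            self.tuples = {frozenset(t.items()) for t in tuples}    # T = sol(c)

        def satisfied_by(self, assignment):
            """assignment: dict covering at least var(c)."""
            restriction = frozenset((v, assignment[v]) for v in self.variables)
            return restriction in self.tuples

    def csp_satisfied(constraints, assignment):
        return all(c.satisfied_by(assignment) for c in constraints)

    # Example: the constraint x != y over domains {0, 1}.
    neq = Constraint({"x", "y"}, [{"x": 0, "y": 1}, {"x": 1, "y": 0}])
    assert csp_satisfied([neq], {"x": 0, "y": 1})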

Quantified problems. A quantified set of variables (or qset) is a pair (q, W) where q ∈ {∀, ∃} is a quantifier and W a set of variables. We call a prefix a sequence of qsets [(q0, W0), ..., (qn−1, Wn−1)] such that (i ≠ j) → (Wi ∩ Wj = ∅). We denote var(P) = W0 ∪ · · · ∪ Wn−1. A QCSP is a pair (P, G) where P is a prefix and G, also called the goal, is a CSP such that var(G) ⊆ var(P). A restricted quantified set of variables or rqset is a triple (q, W, C) where (q, W) is a qset and C a CSP. A QCSP+ is a pair Q = (P, G) where P is a prefix of rqsets such that ∀i, var(Ci) ⊆ W0 ∪ · · · ∪ Wi. Moreover, var(G) ⊆ var(P) still holds.

Solution. A QCSP (P, G) where P = [(∃, W0), (∀, W1), ..., (∃, Wn)] represents the following logic formula:

∃W0 ∈ DW0 ∀W1 ∈ DW1 ... ∃Wn G

Thus, a solution of a quantified problem can no longer be a simple assignment of all the variables: in fact, the goal has to be satisfied for all values the universally quantified variables may take. Intuitively, such a problem can be seen as a "game" where an existential player tries to satisfy all the constraints of G while a universal player aims at violating one of them, each player assigning the variables in turn, in the order defined by the prefix. The solution must represent the strategy that the existential player should adopt to be sure that, whatever its opponent does, the goal will always be satisfied. This strategy can be represented as a family of Skolem functions that give a value to an existential variable as a function of the preceding universal ones, or by the set of every possible scenario (i.e. total assignment of the variables) of the strategy. In this paper, we use the latter representation and organize the set of scenarios in a tree: a root node represents the whole problem, then, inductively:
– if the next qset (qi, Wi) is universal, the current node gets as many sons as there are tuples in DWi. Each son is tagged with one of these tuples;
– if the next qset is existential, the current node gets one unique son, tagged by an element of DWi.
Thus, each complete branch of this tree corresponds to a total assignment of the variables of the problem. If every branch of such a tree corresponds to an assignment satisfying G, then it is indeed a solution of the problem. QCSP+ restricts the "moves" of each player to assignments that satisfy the CSP of the rqsets. The logic formula represented is:

∃W0 C0 ∧ (∀W1 C1 → (∃W2 ... G))

In this case, the notion of solution is the same, except that the restrictions have to be taken into account: for universal rqsets (∀, Wi, Ci), the current node's sons correspond to the solutions of Ci. For existential rqsets (∃, Wj, Cj), the son must be tagged by an element such that the partial branch corresponds to an assignment that satisfies Cj.
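As an illustration of this tree representation (our own sketch), a strategy can be checked by walking every branch and testing the goal on the resulting total assignment:

    # Illustrative sketch of a strategy tree: each node carries the tuple it is
    # tagged with; the tree is a solution iff every complete branch yields an
    # assignment satisfying the goal G.

    class Node:
        def __init__(self, tag, children=()):
            self.tag = tag                    # dict: the tuple chosen at this node
            self.children = list(children)

    def is_winning(node, goal_satisfied, partial=None):
        assignment = dict(partial or {})
        assignment.update(node.tag)
        if not node.children:                 # end of a complete branch
            return goal_satisfied(assignment)
        return all(is_winning(child, goal_satisfied, assignment)
                   for child in node.children)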

2.2 A basic solving procedure

One simple way to solve quantified problems is to adapt the classical backtracking algorithm for CSPs (a sketch of this recursion is given after the list):
– First, perform a propagation algorithm on the problem. If an inconsistency is detected, return false.
– Pick the leftmost quantified set of variables and enumerate the possible values of its variables, dividing the problem into as many subproblems.
  • In the universal case, solve all the subproblems. If one of them is false, return false. Otherwise, group all the corresponding substrategies and return them.
  • In the existential case, solve the subproblems until one of them does not return false. If such a subproblem exists, create a node containing the values that led to the corresponding subproblem, attach the substrategy returned by the subproblem and return the whole. Otherwise, return false.
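A bare-bones recursive sketch of this procedure (ours; propagation is omitted, rqset restrictions and the goal are given as predicates over partial assignments, and a strategy is returned as a list of (tuple, substrategy) pairs) might look like this:

    # Sketch of the backtracking procedure described above (no propagation).
    from itertools import product

    def tuples_on(variables):
        """All assignments over a dict {variable: domain}."""
        names = list(variables)
        for values in product(*(variables[v] for v in names)):
            yield dict(zip(names, values))

    def solve(prefix, goal, partial=None):
        partial = partial or {}
        if not prefix:
            return [] if goal(partial) else False
        (quantifier, variables, restriction), rest = prefix[0], prefix[1:]
        strategy = []
        for t in tuples_on(variables):
            extended = {**partial, **t}
            if not restriction(extended):
                continue                      # move forbidden by the rqset's CSP
            sub = solve(rest, goal, extended)
            if quantifier == 'forall':
                if sub is False:
                    return False              # one failing branch falsifies the node
                strategy.append((t, sub))
            else:                             # 'exists': the first winning move suffices
                if sub is not False:
                    return [(t, sub)]
        return strategy if quantifier == 'forall' else False

    # Example: exists x in {0,1,2} . forall y in {0,1} . x != y  (true, with x = 2)
    prefix = [('exists', {'x': [0, 1, 2]}, lambda a: True),
              ('forall', {'y': [0, 1]}, lambda a: True)]
    print(solve(prefix, lambda a: a['x'] != a['y']))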

3 Multithreaded solving: a first attempt

The multithreaded solving method we propose is based on a central data structure called the manager, which manages several (single-threaded) solvers called workers: a partial strategy is maintained, in which some nodes do not yet have a substrategy attached. Each of these nodes corresponds to a task that a worker can solve. Formally, a task consists of a pair (Q, τ) where Q is a QCSP+ and τ the partial assignment of the variables of Q corresponding to the branch of the partial strategy where the task is attached. Once a worker has finished solving a task, it returns its result (either a sub-strategy or false), which is taken into account by the manager to update the partial strategy. Then, the worker receives another task to solve. Whenever the to-do task queue empties, the manager sends a signal to one or several workers to stop their current task and send a partial result. Such a result consists of a partial sub-strategy containing "unfinished" nodes, each of which is another pending task. Once the whole problem is solved (i.e. there are no to-do tasks left and no working threads), each worker thread is killed, and the result can be returned. Here is a list of the procedures and signals used in this framework:

Workers. Each worker is a thread having a very simple main loop. This loop fetches a task from the Manager and tries to solve it by calling an internal (single-threaded) Solve method. The Manager can also answer with a WAIT pseudo-task, which will cause the worker to sleep until a task becomes available (by calling a special method of the Manager), or with a STOP pseudo-task, which will kill the thread. Once the solver finishes, its result is sent to the Manager. This main loop is described in Figure 1.

A worker is able to catch two signals called Terminate and Send partial. Both indicate that the search procedure should stop, but the former means that the task has become useless, while the latter calls for returning a partial result, along with a list of remaining "sub-tasks". The Solve method can inherit from any original search procedure, but must be modified in order to take the signals into account. Figure 2 shows an adaptation of a basic QCSP+ solving procedure which can be interrupted by these signals. Finally, a worker provides some methods so that other processes know which subproblem it is working on.

Procedure Main loop
  task = Manager.fetchWork()
  if task == STOP then
    Exit
  else if task == WAIT then
    Manager.wait()
  else
    result = Solve(task)
    Manager.returnWork(result)
  end if
end loop

Fig. 1. The worker main loop

Manager. The manager is an object that contains and builds the winning strategy of the problem. During search, this winning strategy is incomplete and some nodes are replaced by tasks remaining to be solved. We call ToDo this list of remaining tasks. The Manager is also aware of the list Current Workers of the workers currently solving a task, and maintains a list of sleeping workers that should be woken when tasks become available for solving. Unless said otherwise, the Manager's methods are called by a worker. The fetchWork method withdraws a task from the ToDo list and returns it. If ToDo is empty, then it sends the signal Send partial to one worker from Current Workers and returns WAIT. If Current Workers is also empty, the search has ended and therefore STOP is returned. The returnWork method attaches the returned sub-strategy and cuts the branches that are no longer necessary (for example, all the brothers of a complete substrategy of an existential node, or directly the father node of a universal substrategy if one of its subproblems has been found to be inconsistent). Each worker that was solving a node on a cut branch is sent the Terminate signal, and the workers contained in the sleeping list are awoken. Finally, the wait method records the calling thread in the Sleeping list and puts it in sleeping mode, until it is woken by the previous method.


Procedure Solve_u([(∀, W, C)|P'], G)
  STR := ∅
  SC = Set of solutions of C
  while SC ≠ ∅ do
    choose t ∈ SC ; SC = SC − t
    if Signal Terminate then
      return STOP
    end if
    CURSTR := Solve((P', G)[W ← t])
    if CURSTR = Fail then
      return Fail
    else
      STR := STR ∪ CURSTR
    end if
    if Signal Send Partial then
      return PartialResult(STR, SC)
    end if
  end while
  return STR

Procedure Solve_e([(∃, W, C)|P'], G)
  SC = Set of solutions of C
  while SC ≠ ∅ do
    choose t ∈ SC ; SC = SC − t
    if Signal Terminate then
      return STOP
    end if
    CURSTR := Solve((P', G)[W ← t])
    if CURSTR ≠ Fail then
      if Signal Send Partial then
        return PartialResult(tree(t, CURSTR), SC)
      else
        return tree(t, CURSTR)
      end if
    end if
  end while
  return Fail

Fig. 2. Search procedure.

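For concreteness, a minimal Python threading skeleton of the fetchWork / returnWork / wait protocol might look as follows; this is our own sketch, with the strategy bookkeeping, branch cutting and signal delivery reduced to comments, and solve_task standing in for any single-threaded QCSP+ solver.

    # Simplified sketch of the Manager/worker protocol described above.
    import threading

    STOP, WAIT = "STOP", "WAIT"

    class Manager:
        def __init__(self, initial_task):
            self.todo = [initial_task]           # pending tasks
            self.current_workers = set()         # workers currently busy
            self.lock = threading.Lock()
            self.wakeup = threading.Condition(self.lock)

        def fetch_work(self, worker):
            with self.lock:
                self.current_workers.discard(worker)
                if self.todo:
                    self.current_workers.add(worker)
                    return self.todo.pop()
                if self.current_workers:
                    # Here the real Manager would send Send partial to a busy
                    # worker so that new tasks eventually appear in self.todo.
                    return WAIT
                self.wakeup.notify_all()         # nothing left: search is over
                return STOP

        def return_work(self, worker, task, result):
            with self.lock:
                # Attach `result` under task's node in the partial strategy,
                # cut branches made useless (sending Terminate to the workers
                # concerned) and push newly created pending tasks onto todo.
                self.current_workers.discard(worker)
                self.wakeup.notify_all()         # sleeping workers may retry

        def wait(self):
            with self.lock:
                # Timeout guards against a notification arriving just before
                # the worker starts waiting.
                self.wakeup.wait(timeout=0.1)

    def solve_task(task):
        return False                             # placeholder for a real solver

    def worker_loop(manager):
        me = threading.current_thread()
        while True:
            task = manager.fetch_work(me)
            if task == STOP:
                return
            if task == WAIT:
                manager.wait()
                continue
            manager.return_work(me, task, solve_task(task))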

4 Directions for future work

Search heuristics. The time taken by a search procedure to solve a CSP greatly depends on the heuristics used to choose which subproblem should be explored first. Unfortunately, most parallel approaches tend to be incompatible with these heuristics, thus ruining solving performance. QCSP (and QCSP+) reduce the alternatives, as a solver has to follow the order of the prefix, and the impact of the heuristics used to perform these choices remains unclear, because they were not originally tailored for QCSPs. In 2008, Verger and Bessière presented in [8] a promising heuristic for QCSP+ that accelerates solving by several orders of magnitude on some problems. Parallelizing the search might, as for CSPs, dilute the benefit of these heuristics.

Task priority. In the presented parallel approach, each task could be given a priority, in order to minimize pointless subproblem solving. The way this priority is calculated will most likely affect the solving time: in fact, solving one given subproblem might be pointless or not according to the result of another task, and the priority given to the tasks should take that into account. For example, it sounds reasonable to give top priority to leftmost universal nodes, as every branch from a universal node must be verified in any case, a single failure cutting the whole subproblem. After that, solving rightmost pending existential tasks should help finish building complete sub-strategies (a possible priority key along these lines is sketched at the end of this section).

Several kinds of workers. The workers run independently from each other, and their communications with the Manager are not specific to a particular search procedure. Thus, using the algorithm of Figure 2 in the workers is not mandatory. Any procedure able to return the sub-strategy of a problem is a priori appropriate. However, algorithms that cannot return partial work and generate remaining tasks should not be used, as they will tend to prematurely empty the ToDo list, bringing other workers into sleep mode.
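One possible concretisation of the priority rule above (ours, not the paper's; the quantifier and depth attributes of a task are assumed) is a sort key that puts universal tasks first, shallowest universal nodes before deeper ones, and deepest existential tasks before shallower ones:

    # Hypothetical priority key for pending tasks.
    def task_priority(task):
        if task.quantifier == 'forall':
            return (0, task.depth)      # universal: leftmost (shallowest) first
        return (1, -task.depth)         # existential: rightmost (deepest) first

    # The Manager would then pick, e.g.:  next_task = min(todo, key=task_priority)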

5 Conclusion

Solving quantified constraint satisfaction problems in a parallel way is new, and even this contribution is far from complete. Thus, while it looks interesting, the approach still needs to prove its worth. We presented here a quite simple and general framework that has to be implemented and tested against traditional solvers before definitive conclusions can be drawn.

References
1. Marco Benedetti, Arnaud Lallouet, and Jérémie Vautard. Reusing CSP propagators for QCSPs. In Francisco Azevedo, Pedro Barahona, François Fages, and Francesca Rossi, editors, CSCLP, volume 4651 of Lecture Notes in Computer Science, pages 63–77. Springer, 2006.
2. Marco Benedetti, Arnaud Lallouet, and Jérémie Vautard. QCSP made practical by virtue of restricted quantification. In Manuela M. Veloso, editor, IJCAI, pages 38–43, 2007.
3. Lucas Bordeaux, Marco Cadoli, and Toni Mancini. CSP properties for quantified constraints: Definitions and complexity. In Manuela M. Veloso and Subbarao Kambhampati, editors, AAAI, pages 360–365. AAAI Press / The MIT Press, 2005.
4. Lucas Bordeaux and Eric Monfroy. Beyond NP: Arc-consistency for quantified constraints. In Pascal Van Hentenryck, editor, CP, volume 2470 of Lecture Notes in Computer Science, pages 371–386. Springer, 2002.
5. Ian P. Gent, Peter Nightingale, and Kostas Stergiou. QCSP-Solve: A solver for quantified constraint satisfaction problems. In Leslie Pack Kaelbling and Alessandro Saffiotti, editors, IJCAI, pages 138–143. Professional Book Center, 2005.
6. Peter Nightingale. Consistency for quantified constraint satisfaction problems. In Peter van Beek, editor, CP, volume 3709 of Lecture Notes in Computer Science, pages 792–796. Springer, 2005.
7. Guillaume Verger and Christian Bessière. BlockSolve: a bottom-up approach for solving quantified CSPs. In Proceedings of CP'06, pages 635–649, Nantes, France, 2006.
8. Guillaume Verger and Christian Bessière. Guiding search in QCSP+ with back-propagation. In Proceedings of CP'08, pages 175–189, Sydney, Australia, 2008.