Solving the Rectangular assignment problem and applications J. Bijsterbosch and A. Volgenant Operations Research Group, Faculty of Economics and Econometrics, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands
Abstract
The rectangular assignment problem is a generalization of the linear assignment problem (LAP): one wants to assign a number of persons to a smaller number of jobs, minimizing the total corresponding costs. Applications are, e.g., in the fields of object recognition and scheduling. Further, we show how it can be used to solve variants of the LAP, such as the k-cardinality LAP and the LAP with outsourcing, by transformation. We introduce a transformation to solve the machine replacement LAP. We describe improvements of a LAP algorithm for the rectangular problem, in general and slightly adapted for these variants, based on the structure of the corresponding cost matrices. For these problem instances we compared the best special codes from the literature to our codes, which are more general and easier to implement. The improvements lead to more efficient and robust codes, making them competitive on all problem instances and often showing much shorter computing times.
Keywords: Linear assignment; k-cardinality linear assignment; Linear assignment problem with outsourcing; Machine replacement.
1. Introduction
The linear assignment problem (LAP), also denoted as the weighted bipartite matching problem, is well known and intensively studied. It is defined on two equally sized node sets N1 and N2, and a set of feasible assignments A = {(i, j) | i∈N1, j∈N2}, where each arc (i, j)∈A has a cost cij. The objective is to assign each node in N1 to exactly one node in N2, such that the total cost is minimized. For convenience we refer to nodes in N1 as persons and in N2 as jobs; we denote |N1| as n1 and |N2| as n2. Allowing n1 ≠ n2, say n1 < n2, is a generalization of the LAP known as the rectangular LAP (RLAP). With the binary variables xij = 1 if person i is assigned to job j and 0 otherwise, and the corresponding costs cij, the RLAP can be stated as:

(RLAP)  min Σ_{i∈N1} Σ_{j∈N2} cij xij
        subject to
          Σ_{j∈N2} xij = 1,    i∈N1
          Σ_{i∈N1} xij ≤ 1,    j∈N2
          xij ∈ {0, 1},        (i, j)∈A
and the corresponding dual problem, with the dual variables or node potentials ui and vj, as:

(DRLAP)  max Σ_{i∈N1} ui + Σ_{j∈N2} vj
         subject to
           cij − ui − vj ≥ 0,   (i, j)∈A
           vj ≤ 0,              j∈N2
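As a concrete illustration (not part of the paper's codes), a small RLAP instance can be solved directly with SciPy, whose linear_sum_assignment routine accepts rectangular cost matrices; the data below are made up.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical RLAP instance: n1 = 3 persons (rows), n2 = 5 jobs (columns).
C = np.array([
    [4, 1, 3, 2, 9],
    [2, 0, 5, 8, 4],
    [3, 2, 2, 7, 6],
])

# Every person is assigned to exactly one job; n2 - n1 jobs stay free.
rows, cols = linear_sum_assignment(C)
total = int(C[rows, cols].sum())
print(list(zip(rows.tolist(), cols.tolist())), total)  # total cost 4
```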
It is well known that the RLAP can be solved by extending it to a (square) LAP by adding dummy variables; see, e.g., [4]. The transformation increases the computational time by increasing the size of the problem, and it introduces degeneracy: each optimal solution of the original problem instance corresponds in general to (n2 − n1)! optimal solutions of the extended instance. Hence LAP algorithms have been modified for rectangularity; see, e.g., [2] or [15]. The (R)LAP is a relaxation of problems such as the traveling salesman problem, the quadratic assignment problem and vehicle routing. Application fields are, e.g., (1) scheduling and (2) object recognition.
1. Scheduling: [12] formulates the single machine problem with unit processing times as an RLAP that minimizes the sum of the earliness and tardiness costs. [13] considers parallel machine scheduling to minimize the completion time (i.e., the time at which all jobs are ready), e.g., of programs on computer processors. Let n jobs be processed on one of m machines, with pij the time to process job i on machine j. The completion time of all the jobs
assigned to machine j is found by multiplying the processing time of the last job by 1, of the second last job by 2, etc., and then summing these values. One can assign a job i in nm ways: to one of the machines and to one of the kth-to-last positions on that machine, with cost cij = kpij. Thus the problem is an RLAP of size n × nm.
2. Object recognition: We mention two applications:
Handwriting recognition: In (Chinese) handwriting, each character can be viewed as a set of line segments, used in a similarity measure between an unknown input character and a reference character, to find an optimal matching among their sets of line segments, say Γ1 and Γ2. By comparing the input character to all reference characters, it is recognized as the one with the highest similarity (or lowest cost) if this similarity exceeds a given threshold, and rejected otherwise. The problem of finding an optimal match reduces to a LAP, see [8], with the cost of assigning line segments i∈Γ1 and j∈Γ2 depending, e.g., on their lengths and their angles. The problem is an RLAP if |Γ1| ≠ |Γ2|.
Multi-object tracking, in air traffic control or radar tracking, links targets with tracks. One can partition sensor observations into tracks in some optimal way, to accurately estimate the real tracks. Two sensors located at different sites each provide a line an object must lie on; the intersection of the two lines determines the location of the object in 3-dimensional space. To locate the objects in time, each sensor provides a set of lines. Taking measurements at distinct times, the new sensor measurements, the targets, must be matched with the predicted positions of the existing tracks. Due to false alarms, missed detections and inaccuracy, some measurements cannot be matched to the (say n2) targets, see [1]. With say n1 (< n2) remaining measurements, one can solve the problem as an RLAP of size n1 × n2. The following LAP variations can be solved by transforming them to an RLAP:
• k-LAP, i.e., the k-cardinality LAP, see [5], and
• LAPout, i.e., the LAP with outsourcing, see [14].
We introduce a transformation to solve
• k-LAPrep, i.e., the problem of replacing (at most) k machines, see [3].
In [15] LAPJV has been adapted to solve the RLAP; we denote this algorithm as RJV: rectangular LAPJV. It is based on the easy-to-implement algorithm LAPJV, see [9], which according to [6] and [17] is one of the fastest to solve dense LAP instances. We study the RLAP, exploiting the properties of these applications to develop codes that perform well. We describe several implementations and compare their results to (special) known codes for these applications. Section 2 focuses on how to improve the RJV algorithm and gives computational results,
just as sections 3, 4 and 5, which discuss adapted RJV codes for the k-LAP, LAPout and k-LAPrep. Finally, we draw conclusions.
2. Implementation improvements
We treat and discuss improvements of RJV, the rectangular variant of LAPJV. Computational results are given to show their efficiency. The first step of LAPJV, the standard reduction of the columns of the cost matrix, is deleted in RJV, as no longer each job has to be done. Thus persons are still free, and the second step of RJV, reduction transfer, is void, as it is only valid for assigned persons. The augmenting row reduction, the last step of the initialization, can be applied in RJV as long as the node potentials v are initially 0, because after terminating RJV the v-values are non-positive and the
v-values of the free jobs are still 0. Thus the RLAP has the following optimality conditions (with the reduced costs defined as wij = cij − ui − vj): an assignment X is optimal if there exist node potentials (u, v) with

  wij = 0,   (i, j)∈X,
  wij ≥ 0,   (i, j)∈A,
  vj ≤ 0,    j∈N2,
  vj = 0,    j∈{k | k∈N2, (i, k)∉X} (the free jobs).
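These conditions are easy to verify mechanically. The following sketch is our own illustration (the helper name is hypothetical, not from the paper): it checks the four conditions for a given assignment and set of node potentials.

```python
import numpy as np

def is_optimal_rlap(C, assign, u, v, tol=1e-9):
    """Verify the RLAP optimality conditions for a person -> job map
    `assign` and node potentials (u, v): the reduced costs
    w_ij = c_ij - u_i - v_j are nonnegative on all arcs and zero on
    assigned arcs, v_j <= 0 everywhere, and v_j = 0 on free jobs."""
    n1, n2 = C.shape
    W = C - u[:, None] - v[None, :]
    assigned_jobs = set(assign.values())
    return (
        (W >= -tol).all()
        and all(abs(W[i, j]) <= tol for i, j in assign.items())
        and (v <= tol).all()
        and all(abs(v[j]) <= tol for j in range(n2) if j not in assigned_jobs)
    )

# persons 0 and 1 assigned to jobs 0 and 1; job 2 stays free
C = np.array([[1.0, 2.0, 5.0],
              [2.0, 1.0, 5.0]])
u = np.array([1.0, 1.0])
v = np.zeros(3)
print(is_optimal_rlap(C, {0: 0, 1: 1}, u, v))  # True
```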
The augmentation phase finds alternating shortest paths from free persons to free jobs. The optimality conditions remain satisfied during this step, as the v-values of the free jobs remain 0 and the v-values of the assigned jobs never increase. We first discuss three implementations to improve the algorithm's performance.
• Partitioning the job sets (see appendix A for the code)
We partition the set N2(i) of jobs in N2 that can be assigned to person i into the sets P(i) of assigned jobs and Q(i) of free jobs. By storing Q(i) as a minimum heap, its root is the job with the lowest cost when assigned to person i. It is sufficient to create the heap when a person is selected and Q(i) is not yet created, and to update the heap only if its root job is assigned. It turned out to be inefficient to create all heaps at the start of the algorithm and to update them whenever an additional job is assigned. It is needless to maintain P(i) for each single person, as it is the same for all persons; therefore, an array of length n2 stores all the jobs. The implementation of the partitioning in the augmenting row reduction step appeared to be fruitless.
• β-best implementation (see appendix B for the code)
The β-best implementation can bring down the number of scans of N2(i), see [7]. The two best jobs for a person i can remain the same in two successive iterations of i in the augmenting row
reduction step. While N2(i) is scanned to find the lowest and second lowest net values, we can determine the β smallest net values (with β ≥ 3 some integer). We store the largest of these, b(i), as well as the β − 1 jobs of N2(i) with the β − 1 smallest net values in a set, say H(i). Consider now a subsequent iteration where the algorithm has to find the lowest and second lowest net values based on an updated set of node potentials v; if cij − vj ≤ b(i) holds for at least two of the stored jobs j∈H(i), the lowest two net values are still among the stored jobs, because the values vj decrease monotonically during the algorithm, implying that cij − vj ≥ b(i) for j∈N2(i) \ H(i). Thus, one can first search for the smallest two net values among the stored jobs whose current net values are at most b(i). The β smallest net values are redetermined by a full scan of N2(i) only if all except possibly one of the stored jobs have current net values exceeding b(i). In the implementation we only compute b(i) and H(i) when needed, as they may change otherwise.
• Examine the free persons in reverse order
If many persons are still free after the augmenting row reduction phase, the instance at hand can be considered hard. Then the row minimum can occur more often at the same job for several persons. Reversing the order in which the free persons are examined may be better, as these persons may decrease the v-values of the specific jobs more substantially, making them less attractive for the subsequent persons. These three implementations appear to give significantly better results, with β = 6 independent of the problem size. The next two ideas appeared to be fruitless.
• Examine the n1 cheapest jobs
For each free person i in an RLAP we only have to examine the n1 cheapest jobs of N2(i). The proof is straightforward and therefore omitted. It appeared that finding the n1 cheapest jobs is too time-expensive.
• Examine assigned jobs last, and assign free jobs earlier in case of ties
The two phases of the RJV algorithm allow us to check the "difficulty" of the problem after the augmenting row reduction procedure and act accordingly. The number of free persons, say
f, is checked; if f ≤ 0.8*n1, the instance at hand is considered easy and solved by the existing augmentation procedure; otherwise the procedure AUGMENTATION_P, augmentation with partitioning of the job set (appendix A), is used. In addition, in this new procedure the free persons are examined in backward order. The rectangular LAPJV with the described improvements (summarized in figure 1) is referred to as RJVI. We compare RJVI on the following randomly generated problem classes and a benchmark class, all with dense cost matrices and created by the random generator of [10]. The classes are:
Figure 1. The algorithm RJVI.

  Function RJVI;
  Begin
    for cnt := 1 to 2 do AUGROWRED_6BEST;  {the value 2 appears to be an empirically good choice}
    If f ≤ 0.8*n1 then AUGMENTATION
    Else AUGMENTATION_P;
    Compute optimal value Optimum;
    RJVI := Optimum
  End.
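The β-best bookkeeping can be sketched as follows. This is our own simplified illustration (the function name and data are hypothetical, not the paper's Pascal code), but it shows the reuse test: a full rescan of N2(i) is needed only when fewer than two stored jobs still beat the old bound b(i).

```python
BETA = 6  # beta = 6 proved a good empirical choice, independent of problem size

def beta_best_scan(c_row, v, H=None, b=None):
    """One row scan with beta-best reuse (illustrative sketch).  c_row
    holds the costs of person i, v the current job potentials (which only
    decrease during the algorithm).  H is the stored set of the beta-1
    best jobs from an earlier full scan, b the beta-th smallest net value
    found then.  Returns the two best (net value, job) pairs and the
    updated (H, b)."""
    if H is not None:
        # Net values c_row[j] - v[j] can only have grown since v decreases
        # monotonically, so jobs outside H still have net values >= b.  If
        # at least two stored jobs still beat b, the two overall best jobs
        # must be among them and the full scan can be skipped.
        nets = sorted((c_row[j] - v[j], j) for j in H)
        if sum(1 for net, _ in nets if net <= b) >= 2:
            return nets[0], nets[1], H, b
    # Full scan: recompute the beta smallest net values over all jobs.
    nets = sorted((c_row[j] - v[j], j) for j in range(len(c_row)))
    H = [j for _, j in nets[:BETA - 1]]
    b = nets[BETA - 1][0] if len(nets) >= BETA else float("inf")
    return nets[0], nets[1], H, b

# first call does a full scan; the second reuses H and b
c_row = [5, 3, 8, 2, 9, 4, 7]
v = [0.0] * 7
best1, best2, H, b = beta_best_scan(c_row, v)
v[4] -= 2.0                     # potentials only ever decrease
best1, best2, H, b = beta_best_scan(c_row, v, H, b)
print(best1, best2)
```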
Uniform random: This class is a common one to test LAP algorithms; the entries cij of the cost matrix are integers uniformly drawn from the interval [1, R], with R∈{10^2, 10^3, 10^6}.
Geometric: In this class, see [6], two sets A and B are generated, each with n1 + n2 integer values, uniformly drawn from a given interval. We set cij as the truncated Euclidean distance between two points, i.e., cij = trunc(√(aij² + bij²) + 1), with aij = ai − a_{n1+j}, bij = bi − b_{n1+j} (i∈N1, j∈N2).
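As an illustration (our own sketch, not the paper's generator of [10]; the function name is hypothetical and we assume the trunc(√(aij² + bij²) + 1) reading of the formula), such instances can be generated as:

```python
import numpy as np

def geometric_instance(n1, n2, R, rng=None):
    """Geometric RLAP instance as described in the text (illustrative
    sketch): draw n1 + n2 integer values for a and for b uniformly from
    [1, R]; the cost is c_ij = trunc(sqrt(a_ij^2 + b_ij^2) + 1) with
    a_ij = a_i - a_{n1+j} and b_ij = b_i - b_{n1+j}."""
    rng = rng or np.random.default_rng(0)
    a = rng.integers(1, R + 1, size=n1 + n2)
    b = rng.integers(1, R + 1, size=n1 + n2)
    aij = a[:n1][:, None] - a[n1:][None, :]
    bij = b[:n1][:, None] - b[n1:][None, :]
    return np.trunc(np.sqrt(aij ** 2 + bij ** 2) + 1).astype(int)

C = geometric_instance(3, 5, 100)
print(C.shape)  # (3, 5)
```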
Machol-Wien: Machol and Wien ([11]) defined a famous benchmark class of difficult (deterministic) instances by cij = i·j + 1 (i∈N1, j∈N2).
Randomized Machol-Wien: Dell'Amico and Toth ([6]) introduced this variant of the previous class, with cij an integer value uniformly generated in [1, i·j + 1] (i∈N1, j∈N2).
The tables to come report the average CPU times (in milliseconds), neglecting the times to generate the cost matrices, since they are equal for all the algorithms. The improvement factor φ gives the improvement of the new results relative to the old ones (in %). The RLAP algorithms were coded in Delphi 6 and run on a personal computer (AMD 1 GHz processor with 384 MB RAM, with Windows XP as operating system). They have been tested with the sizes n2∈{1000, 1500, 2000} and n1∈{0.25, 0.50, 0.75, 0.90, 0.99}*n2. We solved 10 random instances for each pair (n1, n2), except for the deterministic Machol-Wien class. We compare RJVI with RJV on rectangular instances and with LAPJV on square instances. We left out the results of the naive transformation into a square LAP (with n2 − n1 dummy persons with 0-cost arcs to all jobs of N2): it is about 2 times (n1 ≈ n2) up to about 30 times slower than RJVI, we think because reduction transfer fails on the dummy rows and more persons (n1 is larger) have to be assigned. The results in Table 1 show that, for the uniform random and the geometric class, the running times of RJVI often do not grow with larger cost ranges, in contrast to RJV. Instances with the largest range are even solved the fastest for n1 ≤ 0.9 n2, and always faster than the
Table 1. Results for the random RLAP classes; RJV and RJVI times in msec. * Starred results (n1 = n2 = 1000, 1500 and 2000) obtained by LAPJV.

Uniform:
                  Range [1, 10^2]          Range [1, 10^3]          Range [1, 10^6]
  n2      n1     RJV   RJVI  φ in %      RJV   RJVI  φ in %      RJV   RJVI  φ in %
 1000     250      5      4   20.0         5      5    0.0         5      5    0.0
          500     14     13    7.1        12     12    0.0        13     12    7.7
          750     27     23   14.8        23     20   13.0        23     17   26.1
          900     39     35   10.3        40     32   20.0        46     22   52.2
          990     80     75    6.3       110    105    4.5       177     70   60.5
         1000   *128    103   19.5      *145    160  –10.3      *193    134   30.6
       average                13.0                     4.5                    29.5
 1500     375     14     13    7.1        13     13    0.0        12     12    0.0
          750     38     33   13.2        27     26    3.7        27     26    3.7
         1125     68     59   13.2        51     46    9.8        57     39   31.6
         1350     94     75   20.2        85     71   16.5       111     52   53.2
         1485    127    111   12.6       265    271   –2.3       448    205   54.2
         1500   *266    160   39.8      *370    392   –5.9      *511    405   20.7
       average                17.7                     3.6                    27.2
 2000     500     27     25    7.4        22     22    0.0        23     22    4.3
         1000     77     62   19.5        48     45    6.3        50     43   14.0
         1500    135    109   19.3        89     82    7.9       101     69   31.7
         1800    179    142   20.7       155    136   12.3       200     91   54.5
         1980    222    182   18.0       502    441   12.2       768    403   47.5
         2000   *461    228   50.5      *761    756    0.7      *997    922    7.5
       average                22.6                     6.5                    26.6
 average uniform instances    17.8                     4.9                    27.8

Geometric:
 1000     250      5      5    0.0         4      4    0.0         4      4    0.0
          500     13     13    0.0        14     13    7.1        17     12   29.4
          750     35     32    8.6        48     36   25.0        98     28   71.4
          900     88     84    4.5       121     99   18.2       357     85   76.2
          990    304    295    3.0       442    409    7.5      1148    392   65.9
         1000   *467    412   11.8      *488    438   10.2     *1260    465   63.1
       average                 4.6                    11.3                    51.0
 1500     375     12     13   –8.3        13     13    0.0        13     13    0.0
          750     33     30    9.1        34     28   17.6        41     27   34.1
         1125     81     72   11.1        99     72   27.3       196     59   69.9
         1350    215    211    1.9       297    252   15.2       960    213   77.8
         1485    870    863    0.8      1146   1082    5.6      2636   1041   60.5
         1500  *1159   1084    6.5     *1636   1467   10.3     *3009   1419   52.8
       average                 3.5                    12.7                    49.2
 2000     500     22     23   –4.5        22     23   –4.5        23     23    0.0
         1000     54     57   –5.6        67     49   26.9        68     49   27.9
         1500    145    128   11.7       175    135   22.9       414    113   72.7
         1800    377    356    5.6       529    449   15.1      1574    408   74.1
         1980   1702   1609    5.5      1906   1785    6.3      4487   1863   58.5
         2000  *2483   2257    9.1     *3424   3232    5.6     *6091   3084   49.4
       average                 3.6                    12.0                    47.1
 average geometric instances   3.9                    12.0                    49.1
Table 2. Results for the Machol-Wien RLAP classes; RJV and RJVI times in msec. * Starred results (n1 = n2 = 1000, 1500 and 2000) obtained by LAPJV.

                   Randomized MW               Machol-Wien
  n2      n1     RJV    RJVI  φ in %      RJV     RJVI  φ in %
 1000     250      9       6   33.3        845      144   83.0
          500     22      15   31.8       3148     1011   67.9
          750     56      29   48.2       6379     2818   55.8
          900    106      54   49.1       8586     4467   48.0
          990    259     166   35.9       9958     5623   43.5
         1000   *387     238   38.5      *9969     5642   43.4
       average                 39.5                       56.9
 1500     375     19      14   26.3       2883      545   81.1
          750     55      32   41.8      10490     3074   70.7
         1125    133      68   48.9      21209     8703   59.0
         1350    254     112   55.9      28530    13910   51.2
         1485    657     431   34.4      32851    17715   46.1
         1500  *1013     639   36.9     *33192    17705   46.7
       average                 40.7                       59.1
 2000     500     36      26   27.8       6799     1167   82.8
         1000    100      60   40.0      24654     6649   73.0
         1500    246     115   53.3      49996    19242   61.5
         1800    454     193   57.5      66685    30949   53.6
         1980   1288     822   36.2      77641    39856   48.7
         2000  *2135    1410   34.0     *80174    40427   49.6
       average                 41.4                       61.5
middle cost range instances. We attribute the improvement for the geometric instances, and the range [1, 10^6] in particular, to the β-best implementation and to their structure, with more jobs likely competing for a smaller number of about equally desirable persons. The algorithm RJVI is more robust and often performs much better than RJV on all tested rectangular instances. On square instances, RJVI is even up to about 50% faster on average than LAPJV, with a loss of at most 10% only for the uniform random class with range [1, 10^3] and n ≤ 1500. The randomized Machol-Wien instances (Table 2) show about the same gain up to size 2000. RJVI is faster for the randomized and the deterministic cases, and it even appears to be the best one (transforming the computing times for the used computers) in the randomized n = 1000 case in the tests of [6]. The next sections focus on the problems associated with the transformation applications.
3. The k-cardinality LAP
The k-LAP, with k a given positive integer (k ≤ n1), is the problem of assigning only k persons to k jobs, minimizing the total cost; i.e., the k-LAP is a LAP with the extra constraint Σ_{(i,j)∈A} xij = k. Assume 1 ≤ k ≤ n1 ≤ n2 and, without loss of generality, cij > 0. For ease of presentation, we rearrange the rows of the cost matrix such that the last k rows have the smallest row minima ρi. Denote by I the n1 − k persons with the largest and by FREE the k persons with the smallest row minima. The (refined) transformation of [16] enables solving the k-LAP as an equivalent RLAP: define m = n2 and add n1 − k dummy jobs to N2, with costs (see figure 2)

  cij = ∞  for i = 1, ..., n1 − k and j = m + 1, ..., m + n1 − k with i ≠ j − m,
  cij = 0  otherwise.
Now we can initially assign all persons in I to the dummy jobs with zero costs. As the original row minima equal the second lowest row values in the transformed matrix, we can set the node potential of each dummy job j to v(j) = −ρi, the negative of the corresponding original row minimum. The transformed matrix has a structure that we exploit for an advanced transformation, which aims to enable more initial assignments. We first examine the subproblem of the k persons in FREE. Clearly, their assignments have to be cheaper than assigning one of the first n1 − k persons. This can be achieved by taking into account the smallest row minimum of these persons, denoted by ρ+. Assignments at a net value higher than ρ+ are avoided by adding k dummy jobs with costs cij = ρ+ for all k persons. An initial assignment for the k-LAP can now be found by performing the procedure AUGROWRED_kBEST (see appendix B). We can do this more efficiently by adding a single dummy job with cost c_{i,n2+1} = ρ+ for each i, and leaving free any person that would be assigned to this dummy job.
Consequently, the value v_{n2+1} of this dummy job must remain 0 all along.

Figure 2. Rectangular cost matrix of the k-LAP after the transformation.

                 m original columns        n1 − k dummy columns
  [ c_{1,1}     ...  c_{1,m}       |   0   ∞  ...  ∞ ]
  [ c_{2,1}     ...  c_{2,m}       |   ∞   0  ...  ∞ ]   } n1 − k rows
  [   ...                          |        ...      ]
  [ c_{n1−k,1}  ...  c_{n1−k,m}    |   ∞   ∞  ...  0 ]
  [   ...                          |   0   0  ...  0 ]   } k rows
  [ c_{n1,1}    ...  c_{n1,m}      |   0   0  ...  0 ]

After this preprocessing,
we delete the dummy jobs and solve the remaining instance by the above refined transformation approach. Since H( ) and b( ) are already known for all k persons in FREE, we can continue to use them; however, to assure correctness, b(i) must be set to ρ+ if b(i) > ρ+. In [5], special k-LAP algorithms have been developed based on min-cost flow or successive shortest path techniques. Their preprocessing determines so-called fixed persons and fixed jobs that must be assigned in an optimal solution of the k-LAP. Experiments showed that this preprocessing failed to give improvements in the transformation approach. To compare fairly with known results, the largest considered cost range is [1, 10^5] and n1 = n2 = n. Ten random instances were generated and solved for n∈{500, 1000, 1500} and k∈{0.20, 0.40, 0.60, 0.80, 0.90, 0.99}*n (truncated). We first tested algorithm kRJV, the refined transformation of [16] solved with RJV. We introduce kRJVI (figure 3) as the version of RJV that combines the advanced transformation with preprocessing, the β-best implementation, partitioning, and exploitation of the sparsity of the cost matrix. Two implementation details of this code are due to (1) the preprocessing and (2) the transformation: (1) it appeared best to do the k-best augmenting procedure only once instead of twice; (2) with f the number of free persons, the ratio f / k determines the augmentation procedure to use; it appeared best to solve the remaining problem by the existing augmentation procedure if fewer than 0.1*k persons (instead of 0.8*n in kRJV) are still free after the augmenting row reduction phase. We exploit the sparsity of the cost matrix (figure 2) in kRJVI to improve the performance, especially if k/n ≪ 1. It is stored whether it is sufficient for a certain person to examine all or only one dummy job. Since assigning a person to a dummy job costs 0, it is sufficient to examine the corresponding potentials v.
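The refined transformation can be sketched as follows. This is our own illustration (the function name is hypothetical, and SciPy stands in for the RJV codes as the RLAP solver), checked against brute force on a tiny instance. It relies on cij > 0, which guarantees that every dummy job is used in an optimum, so exactly k persons end up on real jobs.

```python
import itertools
import numpy as np
from scipy.optimize import linear_sum_assignment

def k_lap_via_rlap(C, k):
    """Solve the k-cardinality LAP via the rectangular transformation
    sketched in the text (illustrative).  Append n1 - k dummy jobs:
    person t of the n1 - k largest-row-minima persons gets cost 0 on
    'its own' dummy and infinity on the others; the k FREE persons get
    cost 0 on every dummy."""
    n1, n2 = C.shape
    order = np.argsort(-C.min(axis=1))        # largest row minima first
    Cs = C[order]
    INF = int(C.sum()) + 1                    # finite stand-in for infinity
    D = np.full((n1, n1 - k), INF, dtype=C.dtype)
    for t in range(n1 - k):
        D[t, t] = 0                           # person t's own dummy job
    D[n1 - k:, :] = 0                         # FREE persons
    rows, cols = linear_sum_assignment(np.hstack([Cs, D]))
    # keep assignments to real jobs only, mapped back to original persons
    sol = {int(order[i]): int(j) for i, j in zip(rows, cols) if j < n2}
    return sol, sum(int(C[i, j]) for i, j in sol.items())

# tiny check against brute force
C = np.array([[9, 2, 7, 4],
              [6, 8, 3, 7],
              [5, 9, 8, 1]])
k = 2
brute = min(
    sum(int(C[i, j]) for i, j in zip(P, J))
    for P in itertools.combinations(range(C.shape[0]), k)
    for J in itertools.permutations(range(C.shape[1]), k)
)
sol, val = k_lap_via_rlap(C, k)
print(val == brute, len(sol) == k)
```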
Figure 3. The algorithm kRJVI.

  Function kRJVI;
  Begin
    ADVANCED_TRANSFORMATION;
    AUGROWRED_6BEST_s;
    If f ≤ 0.1*k then AUGMENTATION_s
    Else AUGMENTATION_P_s;
    Compute optimal value Optimum;
    kRJVI := Optimum
  End.

So fewer elements have to be scanned, which often compensates for the related extra work. The algorithm kRJVI is often 2 – 4 times faster than kRJV for the uniform random instances (Table 3). The instances with k = 0.8*n, n1 = 1500 and range [1, 10^3] appear to be
Table 3. The results for the random k-LAP instances; kRJV and kRJVI times in msec.

                      Range [1, 10^2]            Range [1, 10^3]            Range [1, 10^5]
    k      n1 & n2   kRJV  kRJVI  φ in %      kRJV  kRJVI  φ in %      kRJV  kRJVI  φ in %
 0.2*n1       500      22      2    90.9        22      6    72.7        21      7    66.7
             1000      75     21    72.0        85     28    67.1        87     23    73.6
             1500     169     46    72.8       195     55    71.8       205     60    70.7
 0.4*n1       500      20      7    65.0        25     13    48.0        26     12    53.8
             1000      72     29    59.7        88     38    56.8       133     52    60.9
             1500     156     69    55.8       402    249    38.1       404    127    68.6
 0.6*n1       500      25     10    60.0        31     19    38.7        35     23    34.3
             1000      73     40    45.2       259    151    41.7       250    127    49.2
             1500     162    105    35.2       316    153    51.6       828    332    59.9
 0.8*n1       500      43     19    55.8        34     26    23.5        47     36    23.4
             1000     102     54    47.1       238    164    31.1       333    263    21.0
             1500     172    123    28.5       727    508    30.1      1047    804    23.2
 0.9*n1       500      32     22    31.3        33     25    24.2        44     38    13.6
             1000     185     62    66.5       183    169     7.7       301    296     1.7
             1500     271    139    48.7       522    376    28.0       930    864     7.1
 0.99*n1      500      25     26     4.0        36     33     8.3        41     34    17.1
             1000     128    102    20.3       159    145     8.8       260    236     9.2
             1500     257    171    33.5       389    353     9.3       699    656     6.2
 Average                            49.1                     36.5                     36.7
harder, maybe because more row minima are equal. Compared to the RLAP, the problem is generally harder to solve on larger cost ranges; possibly the β-best implementation is then less effective. We make a rough comparison with the special algorithm of [5]. For the uniform random class they concluded that k-LAP instances are much harder to solve than LAP instances, observing that their special algorithm is about a factor 5 slower than LAPJV for k ≥ 0.9*n and cost range [1, 10^5]. kRJVI gives a (much) smaller factor: using our fastest algorithms for square LAP instances (see Table 1), LAPJV on range [1, 10^3] and RJVI on ranges [1, 10^2] and [1, 10^5], it follows that the k-LAP instances are almost always easier to solve on the range [1, 10^2] and indeed harder on the other two ranges. However, the factors are much smaller: from 1.15 to 1.65 for k = 0.8*n and 0.9*n, and about 1 for k = 0.99*n. The results of kRJV and kRJVI on the Machol-Wien classes (Table 4) vary strongly. We think the randomized class is hard to solve because less than 2.5% of the k persons to be assigned are assigned in the augmenting row reduction phase, in contrast to at least 96% in the RLAP. The best results of the randomized class (k = 0.2*n and k = 0.99*n) are remarkably close. We make a more refined comparison of kRJVI to the special algorithm of [5]. The results
Table 4. The results for the Machol-Wien k-LAP instances; kRJV and kRJVI times in msec.

                        Randomized                    Standard
    k      n1 & n2    kRJV   kRJVI  φ in %      kRJV    kRJVI  φ in %
 0.2*n1       500      179      58   67.6        550      246   55.3
             1000     1458     404   72.3       4219     1331   68.5
             1500     4722    1098   76.7      14039     3715   73.5
 0.4*n1       500      382     100   73.8        970      490   49.5
             1000     2792     846   69.7       7590     2963   61.0
             1500     9469    2752   70.9      25131     8531   66.1
 0.6*n1       500      414     129   68.8       1275      682   46.5
             1000     3460    1140   67.1       9751     4564   53.2
             1500    11677    3553   69.6      32688    13555   58.5
 0.8*n1       500      265     108   59.2       1361      837   38.5
             1000     2789    1050   62.4      10694     5717   46.5
             1500     9533    3255   65.9      35578    17416   51.0
 0.9*n1       500      205      93   54.6       1277      852   33.3
             1000     1966     740   62.4      10440     6020   42.3
             1500     6493    2631   59.5      35087    18540   47.2
 0.99*n1      500      104      96    7.7       1189      850   28.5
             1000      745     571   23.4      10059     6076   39.6
             1500     1845    1237   33.0      33445    18782   43.8
 Average                             59.1                       50.2
of the Machol-Wien instances (which are exactly the same instances) are obtained about 9 times faster than the results of [16], confirmed by averaging the results of the refined transformation for the uniform class with n = 500. He obtained his results on an AMD K6 333 MHz processor, indicating a ratio of about 3 to the personal computer used in our research; he assumed that his results were obtained about a factor 18 faster than the special algorithm. We think that the additional speed-up is due to the programming language (Delphi 6 versus Turbo Pascal) and to the allocation of the data structures (static versus more dynamic pointers). Thus, to compare the solution times of the algorithms fairly, although roughly, we divided all computing times of the special algorithm by a factor of 162 (= 9 × 18) and rounded them to integers. Table 5 shows this comparison between the special algorithm and kRJVI for n = 500. On these problem instances, kRJVI is much faster for almost all uniform random instances. The rather large ratio for k = 0.2*n and range [1, 10^2] may be due to the small computation times, making them less precise. For the Machol-Wien instances the special algorithm is faster if k ≤ 0.6*n and kRJVI is faster if k > 0.6*n. We finally note that kRJVI is easier to implement than the special algorithm.

Table 5. The entries are ratios of the times of kRJVI and the algorithm of Dell'Amico and Martello [5]; a ratio > 1 indicates that kRJVI is faster (n1 = n2 = 500).

                       Uniform class                  Machol-Wien
    k       [1, 10^2]  [1, 10^3]  [1, 10^5]    Randomized  Standard
 0.2*n1        4.00       1.17       3.14         0.48        0.19
 0.4*n1        1.14       2.31       3.00         0.55        0.35
 0.6*n1        0.80       2.16       2.48         0.65        0.77
 0.8*n1        2.11       2.68       3.19         1.88        1.43
 0.9*n1        1.95       3.60       3.86         2.17        1.99
 0.99*n1       2.00       3.90       6.14         1.63        2.77
4. The LAP with outsourcing
To allow for the alternative of sourcing or contracting out internal jobs to external machines (or persons) and processing external jobs on internal machines, Vander Wiel and Sahinidis ([14]) considered the assignment problem with external interactions (APEX). They formulated it as a LAP with a single unrestricted variable and developed a special primal-dual algorithm. The mathematical model maintains the problem's special structure and its size:

(APEX)  min Σ_{(i,j)∈A} cij xij
        subject to
          Σ_{j∈N2(i)} xij = 1,   i∈N1
          Σ_{i∈N1(j)} xij = 1,   j∈N2
          xij ≥ 0, (i, j)∈A \ {(i°, j°)};   x_{i°j°} ∈ Z
where i° represents all external persons to which jobs can be assigned and j° represents all external jobs to which persons can be assigned. Each internal person can either be assigned to an available internal job or outsourced to an external job. Similarly, each internal job can either be assigned to an available internal person or outsourced to an external person. We assume that |N1(j°)| = |N2(i°)| = n1 − 1. Hence, in any feasible solution of APEX, arc (i°, j°) has a flow from j° to i° that is one less than the number of persons (or jobs) that are assigned externally. If no person is assigned externally, there is a unit flow from i° to j°. In general the flow on arc (i°, j°) is implicitly bounded by the other constraints: −(n1 − 2) ≤ x_{i°j°} ≤ 1. We can solve the APEX by standard LAP algorithms while keeping the special structure of APEX. We replace person i° by n1 − 1 persons and job j° by n2 − 1 jobs, and duplicate all
Figure 4. The cost matrix of the LAPout transformation: the internal costs cij are bordered by n2 − 1 duplicated external-job columns (with the outsourcing costs ci0 and ∞ elsewhere), n1 − 1 duplicated external-person rows (with the insourcing costs c0j and ∞ elsewhere), and a lower-right block with c00 on its diagonal and ∞ elsewhere.
the associated arcs. We refer to the resulting model as LAPout, the LAP with outsourcing. In the related cost matrix (Figure 4) the external persons and jobs have 0 as index. An optimal solution of LAPout is related to an optimal solution of APEX: set x_{i°j°} = x_{i°j°} − (n1 − 2). The optimal values zA of APEX and zL of LAPout are related by zA = zL − (n1 − 2)c00. A drawback of LAPout is that it increases the problem size. To overcome this, Vander Wiel and Sahinidis ([14]) reformulated APEX as the full assignment problem (FAP), which can be solved by any standard LAP algorithm. While FAP does not increase the problem size, it ruins the (special) structure of the cost matrix, especially the sparsity:

(FAP)  min Σ_{i∈N1\i°} Σ_{j∈N2\j°} min{cij, ci0 − c00 + c0j} xij
       subject to
         Σ_{j∈N2(i)\j°} xij = 1,   i∈N1\i°
         Σ_{i∈N1(j)\i°} xij = 1,   j∈N2\j°
         xij ≥ 0,                  i∈N1\i°, j∈N2\j°
If xij = 1 in an optimal solution of FAP, then it is also in an optimal solution of APEX if cij = min{cij, ci0 – c00 + c0j}; otherwise this solution contains xij° = xi°j° = xi°j = 1. The optimal values zA and zF of APEX and FAP are related by zA = zF + c00. Alternatively, we can reduce the size of the cost matrix of LAPout by transforming it into the matrix of a problem denoted as LAPout’, while maintaining the special structure as follows: Subtract first c00 from the last n1 – 1 rows, then c0j – c00 from the first n2 – 1 columns and finally ci0 from the first n1 – 1 rows. As the costs in the last n1 – 1 rows are 0 or ∞, it makes no difference to assign these external persons to an internal job or to an external one; thus it is sufficient to solve an RLAP on the LAPout’ matrix (figure 5). The following theorem shows the similarity between APEX and LAPout’:
Figure 5. The rectangular cost matrix of LAPout after transformation (LAPout'): the internal block has entries cij − c0j − ci0 + c00, bordered by the columns for the external jobs, with 0 on the diagonal and ∞ elsewhere.

The following theorem shows the similarity between APEX and LAPout'.
Theorem 1. An optimal solution of the RLAP solved on the LAPout' matrix is also
optimal for the APEX problem and vice versa. Proof. First note the similarity between LAPout’ and FAP. If xij = 1 in an optimal solution of
LAPout’ then cij ≤ 0; otherwise xij° = 1, as that improves the criterion value. Clearly, the opposite, if xij° = 1 then cij > 0 also holds. Thus, if xij = 1 in an optimal solution of LAPout’, then cij ≤ ci0 – c00 + c0j ; so it is also in an optimal solution of APEX, like in FAP. If on the other hand xij° = 1 in an optimal solution of LAPout’, then for any unassigned internal job j, cij > ci0 – c00 + c0j ; otherwise outsourcing person i would not be optimal. Thus, xij° = xi°j° = xi°j = 1 in an optimal solution of APEX. The optimal value zF of FAP and zL’ of LAPout’ are related: zF follows by adding up zL’ and ci0 – c00 + c0j ∀(i, j )∈X, with X an optimal solution of FAP; so
zA = zL’ + c00 − (n1 − 1)c00 + ∑i∈N1\{i°} ci0 + ∑j∈N2\{j°} c0j = zL’ + ∑i∈N1\{i°} ci0 + ∑j∈N2\{j°} c0j − (n1 − 2)c00.
□
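The subtractions used in the LAPout’ transformation rest on a standard invariance: subtracting a constant from an entire row or column of an assignment cost matrix shifts the cost of every feasible assignment by that same constant, so the optimal assignment itself is unchanged. A minimal brute-force sketch of this fact (the matrix values are made up, and lap_bruteforce is only an illustrative enumerator, not the RJV code):

```python
from itertools import permutations

def lap_bruteforce(c):
    """Return (best_cost, best_perm) for a square LAP by enumeration."""
    n = len(c)
    return min(
        (sum(c[i][p[i]] for i in range(n)), p)
        for p in permutations(range(n))
    )

c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]

# Subtract constants as in the LAPout' transformation:
# 1 from all of row 0, then 2 from all of column 2.
c2 = [row[:] for row in c]
for j in range(3):
    c2[0][j] -= 1          # row subtraction
for i in range(3):
    c2[i][2] -= 2          # column subtraction

z1, p1 = lap_bruteforce(c)
z2, p2 = lap_bruteforce(c2)
assert p1 == p2            # same optimal assignment
assert z1 - z2 == 1 + 2    # criterion shifted by the subtracted constants
```

Enumeration is only feasible for tiny instances; it serves here purely to confirm the invariance that justifies the transformation.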
The structure of the LAPout’ matrix (Figure 5) allows less preprocessing than in the case of kLAP; it even appeared that preprocessing LAPout’ gave no better computational results. In [14] the special algorithm has been compared, among others, against FAP solved by LAPJV, mainly on uniform random test instances with external arc c00 = 0. We focus on the influence of different values of this arc (c00 ∈ [0, 2R]) on the computing times, comparing the transformation LAPout’ to FAP for dense instances. Ten uniform random instances were generated and solved for each combination of n = 500, 750 or 1000 and cost range [1, R] with R ∈ {10^2, 10^3, 10^5}; the external arc c00 takes the values 0 and R. The times (msec) are averages over the ten instances. Table 6 gives the computational results. FAP is solved by LAPJV as well as by RJVI, denoted as FAP and FAPI respectively; RJVoutI is LAPout’ solved by RJVI. To reduce the number of elements to be scanned and stored, we exploit the structure of LAPout’. Because there is only one relevant external job for each person, a person is outsourced if the cost of the selected (cheapest) assignment is non-negative. As in
Table 6. Computing times (msec) for LAP with outsourcing, c00 = 0 or R.

                         c00 = 0                      c00 = R
n      Range        FAP   FAPI  RJVoutI      FAP    FAPI  RJVoutI
500    [1, 10^2]     30     27      333       95     219       65
       [1, 10^3]     25     33      531      152     329      100
       [1, 10^5]     36     28      533      207     356       80
750    [1, 10^2]     92     76      905      341     673      217
       [1, 10^3]     89     92     1807      570    1018      360
       [1, 10^5]    105     95     1960      816    1143      311
1000   [1, 10^2]    148    131     1637      797    1458      499
       [1, 10^3]    173    178     4215     1420    2178      869
       [1, 10^5]    215    208     4750     1875    2608      770
the kRJVI algorithm, the sparsity is exploited in both augmentation procedures. The LAPout’ transformation is clearly less efficient than the FAP algorithms for instances with c00 = 0; the opposite holds for instances with c00 = R, where FAPI performs poorly. In FAP we set cij = min{cij, ci0 + c0j − c00}, so for large c00-values the second term often dominates. FAPI starts at once with augmenting row reduction; as a result, for each row the first and the second row minimum often occur at the same jobs, j1 = arg min j∈N2 {c0j} and j2 = arg min j∈N2\{j1} {c0j}. Therefore reduction transfer often fails, and many persons must be assigned in the more time-consuming augmentation phase. FAP, however, begins with column reduction, in which the node potentials v are given values; as a result, the first and the second row minima less often occur at the same jobs in the augmenting row reduction step. After the LAPout’ transformation, cij := cij − ci0 − c0j + c00. Row minima are likely to occur at similar jobs when c00 is small; in that case reduction transfer does take place, but only by small values. When c00 is larger, the external job can more often be the
Figure 6. Computing times (msec) for LAP with outsourcing, n = 750 and 0 ≤ c00 ≤ 200.
row minimum, making it easier to assign the remaining persons. The results of FAP, FAPI and RJVoutI (Figure 6) with c00 = 0, 20, …, 200 and cost range [1, 10^2] show that FAPI is the fastest code if c00 ≤ 0.65*R, and RJVoutI otherwise. Similar results occur for other values of n and other cost ranges. We conclude that the two algorithms complement each other.
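The row-minima argument above can be made concrete: after the FAP reduction cij := min{cij, ci0 + c0j − c00}, one can count how many rows attain their minimum at j1 = arg min c0j. The sketch below is illustrative only (instance size, seed and helper names are ours, not from the paper's codes); it shows that once c00 exceeds the cost range, every row minimum collapses onto j1, which is exactly the situation in which reduction transfer fails:

```python
import random

def fap_matrix(c, ci0, c0j, c00):
    """Apply the FAP reduction cij := min(cij, ci0 + c0j - c00)."""
    n = len(c)
    return [[min(c[i][j], ci0[i] + c0j[j] - c00) for j in range(n)]
            for i in range(n)]

def frac_rows_min_at_j1(f, c0j):
    """Fraction of rows whose minimum sits at j1 = argmin_j c0j."""
    j1 = min(range(len(c0j)), key=lambda j: c0j[j])
    n = len(f)
    hits = sum(1 for i in range(n)
               if min(range(n), key=lambda j: f[i][j]) == j1)
    return hits / n

random.seed(1)
n, R = 30, 100
c   = [[random.randint(1, R) for _ in range(n)] for _ in range(n)]
ci0 = [random.randint(1, R) for _ in range(n)]
c0j = [random.randint(1, R) for _ in range(n)]

frac_small = frac_rows_min_at_j1(fap_matrix(c, ci0, c0j, 0), c0j)
frac_large = frac_rows_min_at_j1(fap_matrix(c, ci0, c0j, 2 * R), c0j)
# With c00 = 2R the external term ci0 + c0j - c00 is never positive, so it
# dominates every internal cost cij >= 1 and all row minima land on j1.
assert frac_large == 1.0
assert frac_large >= frac_small
```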
6. The (k-)LAP for replacement
Suppose the manager of a plant has the funds to replace k of the n2 machines by newer ones. He wants to assign tasks to machines optimally, at maximal production or, equivalently, at minimal production cost. In [3] this problem is denoted as the replacing k machines linear assignment problem (k-LAPrep), given that new machines are more efficient than older ones; there it is assumed that exactly k machines are replaced, even if this gives worse results. We formulate the more general case, omitting the latter restriction and allowing old machines to be more or less efficient than new ones, and we develop a transformation that enables solving it as an RLAP. Let the (production) cost of processing task i be cij on machine j and dij on a replaced machine j. Assume without loss of generality 1 ≤ k ≤ n1 ≤ n2 and positive costs. Define m2 = n2 and add m2 machines to N2 at cost dij; thus we have n2 = 2*m2 and ci,m2+j = dij, i∈N1, j = 1, …, m2, and we can formulate k-LAPrep as an RLAP with one extra constraint:
∑i∈N1 ∑j = m2+1, …, n2 xij ≤ k.
We transform k-LAPrep into an RLAP denoted as k-LAPrep’: defining m1 = n1, we add m2 − k dummy nodes to N1, at cost
cij = 0 for i = m1 + 1, …, n1 and j = m2 + 1, …, n2, and cij = ∞ otherwise,
with n1 = m1 + m2 − k. Theorem 2 is proved similarly to the simple k-LAP transformation in [16].
Theorem 2. The problems k-LAPrep’ and k-LAPrep are equivalent.
Proof. A solution of k-LAPrep can be extended to a solution of k-LAPrep’ with the same criterion value by adding m2 − k (zero cost) slack assignments. There are many equivalent solutions for k-LAPrep’; thus many solutions of k-LAPrep’ correspond to one solution of k-LAPrep, to be found by deleting the slack assignments of the k-LAPrep’ solution. Let S* be an optimal solution of k-LAPrep with criterion value z(S*) = z*; as shown, it can be extended to a feasible solution Se of k-LAPrep’ with the same criterion value z’(Se) = z*.
        m2 columns                   m2 − k columns               k columns

c11     c12     …  c1,m2       d11     d12     …  d1,m2−k      d1,m2−k+1  …  d1,m2
c21     c22     …  c2,m2       d21     d22     …  d2,m2−k      d2,m2−k+1  …  d2,m2
 ⋮       ⋮      ⋱    ⋮          ⋮       ⋮      ⋱     ⋮             ⋮      ⋱    ⋮
cm1,1   cm1,2   …  cm1,m2      dm1,1   dm1,2   …  dm1,m2−k     dm1,m2−k+1 …  dm1,m2
 ∞       ∞      …    ∞          0       ∞      …     ∞             0      …    0
 ∞       ∞      …    ∞          ∞       0      …     ∞             0      …    0
 ⋮       ⋮      ⋱    ⋮          ⋮       ⋮      ⋱     ⋮             ⋮      ⋱    ⋮
 ∞       ∞      …    ∞          ∞       ∞      …     0             0      …    0

(m1 real-task rows followed by m2 − k dummy rows)

Figure 7. The cost matrix of k-LAPrep after transformation (k-LAPrep’).
Clearly Se is also an optimal solution of k-LAPrep’. Suppose S’ is an optimal solution of k-LAPrep’ with criterion value z’(S’) = z’. Then S’ contains m2 − k slack assignments of 0 cost, because all non-slack c-values are positive. Therefore, S’ contains at most k, say h (h ≤ k), non-slack assignments to new machines (i.e., i = 1, ..., m1 and j = m2 + 1, ..., n2) and at least m1 − h non-slack assignments to old machines (i.e., i = 1, ..., m1 and j = 1, ..., m2). Deleting the slack assignments of S’ gives a feasible solution, say S”, for k-LAPrep with z(S”) = z’(S’). Now z’(S’) > z* = z’(Se) would contradict the optimality of S’, i.e., z’(S’) = z’(Se). Thus S” is an optimal solution of k-LAPrep, i.e., z(S”) = z(S*), and the problems k-LAPrep’ and k-LAPrep are equivalent.
□
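Theorem 2 can be checked numerically on a toy instance: build the k-LAPrep’ matrix of Figure 7 (with a large finite value standing in for ∞), solve it as an RLAP by plain enumeration, and compare with direct enumeration over which machines to replace. All names, sizes and cost values below are illustrative; the actual codes use the far more efficient RJV algorithm:

```python
from itertools import combinations, permutations

INF = 10**6  # stands in for the infinite costs in k-LAPrep'

def rlap_bruteforce(cost):
    """Min-cost assignment of every row to a distinct column (rows <= cols)."""
    n1, n2 = len(cost), len(cost[0])
    return min(sum(cost[i][p[i]] for i in range(n1))
               for p in permutations(range(n2), n1))

def klaprep_direct(c, d, k):
    """Replace at most k machines; assign each task to a distinct machine."""
    m1, m2 = len(c), len(c[0])
    best = None
    for r in range(k + 1):
        for repl in combinations(range(m2), r):
            cost = [[d[i][j] if j in repl else c[i][j] for j in range(m2)]
                    for i in range(m1)]
            z = rlap_bruteforce(cost)
            best = z if best is None else min(best, z)
    return best

def klaprep_transformed(c, d, k):
    """Build the k-LAPrep' matrix of Figure 7 and solve it as an RLAP."""
    m1, m2 = len(c), len(c[0])
    rows = [c[i] + d[i] for i in range(m1)]           # real tasks
    for t in range(m2 - k):                           # m2 - k dummy tasks
        dummy = [INF] * m2                            # old machines: forbidden
        dummy += [0 if j == t else INF for j in range(m2 - k)]
        dummy += [0] * k                              # last k new machines: free
        rows.append(dummy)
    return rlap_bruteforce(rows)

c = [[7, 5, 9], [6, 8, 4], [5, 6, 7]]   # old-machine costs (made up)
d = [[3, 2, 6], [4, 1, 2], [2, 3, 5]]   # new-machine costs (made up)
k = 1
assert klaprep_direct(c, d, k) == klaprep_transformed(c, d, k)
```

Enumeration is only feasible for such tiny sizes; it serves purely to confirm that both formulations share the same optimal value.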
It is efficient, see [16], to avoid degeneracy in the set of feasible solutions of a problem by replacing the zero elements of k-LAPrep’ by ∞ in the rows i = m1 + 1, ..., n1 for the columns j = m2 + 1, …, n2 − k with j ≠ i + m2 − m1, resulting in the matrix of Figure 7. This cost matrix does not necessarily include or exclude any assignments. Yet it is easy to apply preprocessing: we can assign each dummy task i (i = m1 + 1, ..., n1) to one of the first m2 − k new machines at 0 cost, i.e., assign dummy task i to machine j = m2 + i − m1, and solve the remaining problem by an adapted RJV algorithm (denoted as RLAPrepI). By exploiting the sparsity of the cost matrix instead of actually transforming that matrix, it is sufficient to check whether the chosen task is a real or a dummy one. For each dummy, say d, one only has to scan the node potentials of the jobs j∈{ j | j = d + m2 − m1 or j = n2 − k + 1, …, n2}. Table 7 gives computational results for the realistic case that new machines produce more cheaply than old ones, although the production costs on new and old machines can be arbitrary. We generated uniform random production costs with cij for processing task i on old
                Range [1, 10^2]           Range [1, 10^3]           Range [1, 10^6]
k        n      kLrep  kLrepI  φ in %     kLrep  kLrepI  φ in %     kLrep  kLrepI  φ in %
0.01*n   500      180      84    53.3       352     173    50.9       682     394    42.2
         750      532     197    63.0       863     399    53.8      2006    1228    38.8
         1000     855     217    74.6      1655     678    59.0      4489    2503    44.2
0.1*n    500      375      61    83.7       599     227    62.1      1460     876    40.0
         750     1131      95    91.6      1949     588    69.8      4931    2927    40.6
         1000    2513     164    93.5      4426     999    77.4     11806    7131    39.6
0.25*n   500      446      61    86.3       726     289    60.2      1570    1047    33.3
         750     1517     128    91.6      2435     713    70.7      5329    3591    32.6
         1000    3662     223    93.9      5617    1221    78.3     12366    8408    32.0
0.5*n    500      211      30    85.8       509     214    58.0       983     754    23.3
         750      772      69    91.1      1570     395    74.8      3290    2475    24.8
         1000    2003     104    94.8      3197     379    88.1      7628    5788    24.1
0.75*n   500       23      14    39.1        97      47    51.5       226     159    29.6
         750      125      89    28.8       215     102    52.6       781     568    27.3
         1000     360     145    59.7       331     174    47.4      1794    1290    28.1
0.9*n    500       17      12    29.4        22      18    18.2        20       9    55.0
         750       41      32    22.0        46      37    19.6        50      30    40.0
         1000     106     110    –3.8        79      61    22.8        90      50    44.4
0.99*n   500       11      13   –18.2        18      16    11.1        18      11    38.9
         750       35      30    14.3        37      33    10.8        40      23    42.5
         1000      90      85     5.6        65      52    20.0        74      43    41.9
Average                          56.2                      50.3                      36.4

Table 7. Computing times (in msec) for kLrep(I) with more efficient new machines (L short for LAP).
machine j (i = 1, …, n1, j = 1, …, m2) and with costs in range [1, cij] for processing task i on a new (replaced) machine j (i∈N1, j∈N2). We generated 10 instances for each value of n1 = n2 = n and cost range [1, R], R∈{10^2, 10^3, 10^6}, with the number of new as well as old machines n∈{500, 750, 1000} and k∈{0.01, 0.1, 0.25, 0.50, 0.75, 0.90, 0.99}*n. In [3] computational results have been given only for instances up to size 40, as the logic constraint approach fails for larger sizes. Since we know of no other (special) code for k-LAPrep, we demonstrate the effect of the improvements by comparing two codes. The code denoted by kRJVrep transforms k-LAPrep to k-LAPrep’ and solves the remaining problem with RJV. In the transformed matrix n = 2*m2; the numbers of machines and of tasks have been chosen equal in the considered problem instances, i.e., m2 = m1. Although the new machines process tasks more cheaply than the old ones (cij > dij), it is not always best to replace all k machines. It appears that up to k = 0.75*n almost all the k
machines are actually replaced, and that for higher values of k the replacement percentage drops to a minimum of 72%. The average improvement factor φ of kRJVrepI is 47.6%, and the results are less sensitive to hard instances, i.e., φ is larger than average for most of the hardest instances. Averaged over the computing times, kRJVrepI is more than 3 times faster than kRJVrep. Instances with k = 0.25*n are on average the hardest to solve.
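The sparsity shortcut used by RLAPrepI above (scanning, for a dummy task d, only the column d + m2 − m1 and the last k columns) can be sketched as a small helper; the function name and the 1-based indexing are our own illustration:

```python
def finite_columns(d, m1, m2, k):
    """Columns with finite cost in dummy row d (1-based; d = m1+1, ..., m1+m2-k)."""
    n2 = 2 * m2  # total number of columns after doubling the machine set
    return [d + m2 - m1] + list(range(n2 - k + 1, n2 + 1))

# Example: m1 = m2 = 5, k = 2, so dummy tasks are rows 6, 7, 8 of k-LAPrep'.
print(finite_columns(6, 5, 5, 2))  # [6, 9, 10]
```

Only these few job potentials need to be read per dummy row, instead of all n2 of them, which is what makes the dummy rows essentially free in the shortest-path scans.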
7. Summary and conclusions
We considered the rectangular LAP (RLAP), i.e., the LAP in which the number of persons differs from the number of jobs, and improved an existing algorithm to solve it more efficiently, resulting in the algorithm RJVI. As RJVI can detect hard problem instances during execution, it can solve such instances faster by partitioning the job set into assigned and free jobs. The algorithm RJVI also uses a β-best implementation, which appears to increase robustness over various cost ranges. It is able to solve both rectangular and square LAP instances efficiently. We have described and implemented preprocessing that exploits the structure of the cost matrices to further improve the performance of RJVI on the transformation applications kLAP, LAPout and kLAPrep. The special structures of the cost matrices allow additional improvements, which enable solving these applications efficiently using general and easy to implement codes. The computational results showed that one can efficiently solve the three considered variants of the LAP by applying suitable transformations, and that our codes are often much faster than existing special algorithms. We conclude that solving the considered LAP variants by means of transformations to an RLAP is flexible and efficient.
References
1. Bertsekas DP, Castañon DA. “A forward reverse auction algorithm for asymmetric assignment problems,” Computational Optimization and Applications, vol. 1, pp. 277–297, 1992.
2. Bertsekas DP, Castañon DA, Tsaknakis H. “Reverse auction and the solution of asymmetric assignment problems,” SIAM Journal on Optimization, vol. 3, pp. 268–299, 1993.
3. Caseau Y, Laburthe F. “Solving weighted matching problems with constraints,” Constraints, an International Journal, vol. 5, pp. 141–160, 2000.
4. Cheng FH, Hsu WH, Chen CA. “Fuzzy approach to solve the recognition problem of handwritten Chinese characters,” Pattern Recognition, vol. 22, pp. 133–141, 1989.
5. Dell’Amico M, Martello S. “The k-cardinality assignment problem,” Discrete Applied Mathematics, vol. 76, pp. 103–121, 1997.
6. Dell’Amico M, Toth P. “Algorithms and codes for dense assignment problems: the state of the art,” Discrete Applied Mathematics, vol. 100, pp. 17–48, 2000.
7. Goldberg AV, Kennedy JR. “An efficient cost scaling algorithm for the assignment problem,” Mathematical Programming, vol. 71, pp. 153–177, 1995.
8. Hsieh AJ, Fan KC, Fan TI. “Bipartite weighted matching for on-line handwritten Chinese character recognition,” Pattern Recognition, vol. 28, pp. 143–151, 1995.
9. Jonker R, Volgenant A. “A shortest augmenting path algorithm for dense and sparse linear assignment problems,” Computing, vol. 38, pp. 325–340, 1987.
10. Knuth DE. Seminumerical Algorithms. The Art of Computer Programming 2, Reading, MA, Addison-Wesley, 1981.
11. Machol RE, Wien M. “A hard assignment problem,” Operations Research, vol. 24, pp. 190–192, 1976.
12. Mosheiov G, Yovel U. “Minimizing weighted earliness-tardiness and due-date cost with unit processing-time jobs,” European Journal of Operational Research, vol. 172, pp. 528–544, 2006.
13. Pinedo M. Scheduling Theory, Algorithms, and Systems (2nd ed.). Upper Saddle River, NJ, Prentice Hall, 2002.
14. Vander Wiel RJ, Sahinidis NV. “The assignment problem with external interactions,” Networks, vol. 30, pp. 171–185, 1997.
15. Volgenant A. “Linear and semi-assignment problems, a core oriented approach,” Computers & Operations Research, vol. 10, pp. 917–932, 1996.
16. Volgenant A. “Solving the k-cardinality assignment problem by transformation,” European Journal of Operational Research, vol. 157, pp. 322–331, 2004.
17. Zaki HA. “A comparison of two algorithms for the assignment problem,” Computational Optimization and Applications, vol. 4, pp. 23–45, 1995.
Appendix A. The augmentation step with N2(i) partitioned into P(i) and Q(i).

Procedure AUGMENTATION_P;
Begin
  for all free i* do
  Begin
    INITIALIZE;
    repeat
      if SCAN = {} then
      Begin
        SELECT_OBJECTS;
        if d[j]