Design of Experiments in BDD Variable Ordering: Lessons Learned

Justin E. Harlow III (1) and Franc Brglez (2)

(1) Dept. of Electrical & Computer Engineering, Duke University, Durham, NC 27708
(2) CBL (Collaborative Benchmarking Lab), Dept. of Comp. Science, Box 7550, NC State U., Raleigh, NC 27695
http://www.cbl.ncsu.edu/

Abstract: Applying the Design of Experiments methodology to the evaluation of BDD variable ordering algorithms has yielded a number of conclusive results. The methodology relies on the recently introduced equivalence classes of functionally perturbed circuits that maintain logic invariance, or are within {1, 2, ...}-minterms of the original reference circuit function, also maintaining entropy-invariance. For some of the current variable ordering algorithms and tools, the negative results include: (1) statistically significant sensitivity to naming of variables, (2) confirmation that a number of variable ordering algorithms are statistically equivalent to a random variable order assignment, and (3) observation of a statistically anomalous variable ordering behavior of a well-known benchmark circuit isomorphic class when analyzed under single and multiple outputs. On the positive side, the methodology supports a statistically significant merit evaluation of any newly introduced variable ordering algorithm, including the one briefly introduced in this paper.

Keywords: design of experiments, circuit mutants, equivalence class, benchmarking.

I. Introduction

Design of Experiments and Experimental Design are well-established disciplines in sciences and manufacturing processes; keyword entries to popular search engines on the Web return up to 11,259 and 32,086 hits, respectively. However, the application of the Design of Experiments (DoE) methodology to performance evaluation of algorithms in CAD of VLSI circuits is recent [1, 2]. A design of experiments in CAD of VLSI circuits requires a supply of equivalence class circuits, just as experiments in biomedicine require a class of subjects with well-controlled properties (same sex, same age, same weight, same cholesterol level, etc.).

Two kinds of equivalence class circuits have been introduced to date: Class W mutants, based on controlled perturbations of circuit wiring characteristics [3], and Class F mutants, based on controlled perturbations of circuit functional characteristics [4]. Typically, Class W mutants are applied to evaluate layout optimization algorithms, while Class F mutants are applied to evaluate logic optimization algorithms. Both classes have a number of applications elsewhere.

This paper deals exclusively with the isomorphic and logically equivalent class, Class WD [3]. Specifically, we present (numbered in the order of sections): (2) motivation, (3) context for DoE with BDD variable ordering algorithms, (4) experiments in asymptotic performance, (5) outline of an experimental variable ordering algorithm, (6) preliminary results. These experiments complement and extend the experiments with BDD variable ordering algorithms based on Class F mutants, introduced in [4].

II. Motivation

Variable ordering for BDD construction has been widely studied, and a number of algorithms have been published which aim to achieve near-optimal orders at reasonable cost [5, 6, 7]. Nevertheless, a great deal of variability of results can be observed under widely used BDD packages, even on very simple circuits. Figure 1 illustrates BDD sizes achieved on the logically equivalent C499 and C1355 combinational benchmark circuits under three BDD packages. For these experiments, we used two structural variants of the benchmark circuits:

(1) Original: Translated into blif format [8] from the original isc format [9].
(2) Reference: A re-synthesized version of the original circuit, based on 2-input gates, with inverters removed except the ones driving the primary outputs (in blif format). It also serves as the reference circuit representation used to generate all mutant classes reported in [3] and [4].

In Figure 1, bars A, B, C, and D give the BDD sizes achieved for the two variants of both circuits, with a fixed,

This research has been supported by contracts from the Semiconductor Research Corporation (94-DJ-553), SEMATECH (94-DJ-800), DARPA/ARO (P-3316-EL/DAAH04-94-G-2080), and (DAAG55-97-1-0345), and a grant from Semiconductor Research Corporation.

[Figure 1: bar chart of BDD size (vertical axis, 0 to 80,000 nodes) for bars A through P, grouped by circuit variant: C499-orig, C499-ref, C1355-orig, C1355-ref.]

Bar legend:
A = C499-orig-best    B = C499-ref-best    C = C1355-orig-best   D = C1355-ref-best
E = C499-orig-CAL     F = C499-orig-CMU    G = C1355-ref-CU      H = C1355-ref-CAL
I = C499-orig-CU      J = C1355-orig-CAL   K = C499-ref-CAL      L = C1355-orig-CU
M = C499-ref-CU       N = C1355-orig-CMU   O = C499-ref-CMU      P = C1355-ref-CMU

Fig. 1. BDD variability in logically equivalent circuits.

[Figure 2, flow diagram: Design of Experiments to Compare Graph-Based Algorithms. Circuits of class ANY = SAME{#_nodes, #_edges, functionality perturbations, ...} are passed to Algorithm_i, ..., Algorithm_k (Algorithm_i == Treatment_i). Each resulting set of circuits goes to EVAL, which reports a cost index: crossing number, logic area & level, layout area, BDD size, ...]

Treatment  Initial Ordering  Dynamic Ordering
0          Natural           None
1          VIS               None
2          VIS               Sift during construction
3          VIS               Sift during construction; Sift once after construction
4          VIS               Sift during construction; Sift twice after construction
5          VIS               Sift during construction; Sift-converge after construction
6          Natural           Sift during construction
7          Natural           Sift during construction; Sift once after construction
8          Natural           Sift during construction; Sift twice after construction
9          Natural           Sift during construction; Sift-converge after construction
10         BBVO              None

Legend:
Natural  variable order given by the circuit netlist
VIS      static variable order generated by VIS1.1 or VIS1.2
BBVO     variable order inferred by sampling the circuit function

Fig. 2. Design of experiments with circuit equivalence classes to compare the sizes of ROBDDs under distinctive variable ordering algorithms (treatments).

near-optimal variable ordering given by Somenzi [10]. As expected, all BDDs were of identical size and were verified to be logically equivalent. The remaining bars in the figure show the BDD sizes achieved using dynamic variable ordering in VIS [11], with the CAL [12], CU [13], and CMU [14] packages. The figure illustrates several important points:

(1) All BDD packages make use of structural analysis to calculate the initial variable ordering. The algorithms can easily be led to poor orderings by the particular circuit realization being processed. For instance, the CAL package produced a BDD with 26,790 nodes for the original (E) realization of C499, and 39,166 for the reference (K) version.
(2) Comparisons of the performance of different BDD packages based on isolated benchmarks can be misleading. For instance, the CMU package achieved a near-ideal size of 29,606 nodes for the original C499 circuit, but 70,198 nodes for the reference C1355 circuit; the CU package achieved 32,546 nodes for the C1355 reference, but 41,386 for the C499 original. Depending on which of these logically equivalent circuits is chosen as a benchmark, either package can be said to perform "better."
(3) Each BDD package achieved results close to the known good ordering in at least one case, but wide variability in performance was observed for all packages under the four distinct structural realizations used in the experiment. The worst-case spread was 8,840 nodes for CU, 12,376 for CAL, and 40,592 for CMU.

These observations illustrate some but not all of the problems with the current benchmarking procedures for comparing the performance of BDD variable ordering algorithms. At best, the results as reported in Figure 1 are inconclusive. At worst, we gained no insights of any statistical significance as to the merits of the algorithms under observation.
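The worst-case spreads quoted in point (3) can be checked against the node counts quoted in points (1) and (2). The short sketch below assumes that, for each package, the two sizes quoted in the text are the smallest and largest of its four results; the dictionary layout is only illustrative.

```python
# BDD sizes quoted in the text, keyed by package and circuit realization.
# Assumption: the two values quoted per package are its extremes in Figure 1.
quoted_sizes = {
    "CAL": {"C499-orig": 26790, "C499-ref": 39166},
    "CMU": {"C499-orig": 29606, "C1355-ref": 70198},
    "CU": {"C1355-ref": 32546, "C499-orig": 41386},
}

# Worst-case spread = largest size minus smallest size for each package.
spread = {
    package: max(sizes.values()) - min(sizes.values())
    for package, sizes in quoted_sizes.items()
}

for package, nodes in sorted(spread.items(), key=lambda item: item[1]):
    print(f"{package}: {nodes:,} nodes")
```

Under that assumption the differences agree with the spreads reported for CU, CAL, and CMU.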

III. DoE: Treatments and Surprises

NP-hard problems, such as optimal variable ordering for BDDs, are solved by devising a polynomial-time heuristic, often with no guarantee whatsoever of the quality of the solution. Changing the starting point for the problem instance can induce unpredictable variability of results when experiments are repeated. Two of the fundamental principles of experimental design are randomization and replication. We adopt these principles for the experimental evaluation of algorithms by (1) creating an equivalence circuit class and (2) repeating the experiments for each member in the class. The basic abstractions for such experiments include [2]:

1. an equivalence class of experimental subjects, eligible for a treatment;
2. application of a specific treatment;
3. statistical evaluation of treatment effectiveness.

Here, a treatment is synonymous with an algorithm, and an equivalence class of experimental subjects is synonymous with a circuit mutant class. Figure 2 illustrates these abstractions in a generic flow. The cost index, minimized in the context of variable ordering for BDDs, is the BDD size.

The table in Figure 2 itemizes a total of 11 distinctive treatments (BDD variable ordering algorithms) that are applied to a number of equivalence class circuits. 'Treatment 0' is based only on the 'natural' order of variables in the circuit netlist; no algorithm is applied to change the variable order when constructing the BDD. This treatment parallels the 'placebo treatment' in biomedicine. Treatments 1-5 start with the static variable ordering in VIS [11], followed by a number of 'sift' operations. Treatments 6-9 start with the 'natural' order of variables in the circuit netlist, followed by a number of 'sift' operations. Treatment 10 is based on the Black Box Variable Ordering (BBVO) algorithm briefly outlined in a section of this paper. Any number of additional treatments may be added to this list in future experiments.
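The three abstractions (a class of subjects, a treatment, a statistical evaluation) can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the toy function, the names `bdd_size`, `placebo`, `sift_once`, and `evaluate`, and the exhaustive BDD construction, which is feasible only for a handful of variables. The `sift_once` routine is a simplified stand-in for sifting, not the implementation used in VIS.

```python
import random
import statistics

def bdd_size(f, order):
    """Internal-node count of the reduced ordered BDD of f under `order`.

    f takes a dict {variable: 0 or 1} and returns 0 or 1. The construction
    enumerates all assignments, so it is only suitable for toy functions.
    """
    unique = {}  # (var, low, high) -> node id; ids 0 and 1 are the terminals

    def build(level, assignment):
        if level == len(order):
            return int(f(assignment))
        var = order[level]
        low = build(level + 1, {**assignment, var: 0})
        high = build(level + 1, {**assignment, var: 1})
        if low == high:                 # both branches agree: no node needed
            return low
        key = (var, low, high)
        if key not in unique:           # share identical sub-BDDs
            unique[key] = len(unique) + 2
        return unique[key]

    build(0, {})
    return len(unique)

def toy_function(a):
    """f = x1*y1 + x2*y2 + x3*y3, a standard order-sensitive example."""
    return (a["x1"] & a["y1"]) | (a["x2"] & a["y2"]) | (a["x3"] & a["y3"])

VARIABLES = ["x1", "y1", "x2", "y2", "x3", "y3"]

def placebo(f, order):
    """Treatment 0 analogue: keep the order the subject arrives with."""
    return list(order)

def sift_once(f, order):
    """Simplified sifting: move each variable to its best position, once."""
    order = list(order)
    for var in list(order):
        rest = [v for v in order if v != var]
        candidates = [rest[:i] + [var] + rest[i:] for i in range(len(order))]
        order = min(candidates, key=lambda o: bdd_size(f, o))
    return order

def evaluate(treatment, subjects, f):
    """Apply one treatment to every subject; return (mean, StD) of BDD size."""
    sizes = [bdd_size(f, treatment(f, order)) for order in subjects]
    return statistics.mean(sizes), statistics.stdev(sizes)

# Equivalence class: 100 subjects that differ only in their initial order.
rng = random.Random(0)
subjects = [rng.sample(VARIABLES, len(VARIABLES)) for _ in range(100)]
for name, treatment in [("placebo", placebo), ("sift_once", sift_once)]:
    mean, std = evaluate(treatment, subjects, toy_function)
    print(f"{name:10s} mean={mean:.1f} StD={std:.1f}")
```

Because every subject represents the same function, any difference between the two reported distributions is attributable to the treatment rather than to the choice of benchmark instance, which is the point of the experimental design.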

            Circuit Class ALU4r-in-WD    Circuit Class ALU4r-rn-WD
Treatment   Sample Mean   Sample StD     Sample Mean   Sample StD
trtm00      1,236.0       169.9          1,236.0       169.9
trtm01      1,206.0       0.0            1,268.1       159.4
trtm02      1,206.0       0.0            1,268.1       159.4
trtm03      598.0         0.0            707.9         88.6
trtm04      577.0         0.0            637.4         65.7
trtm05      576.0         0.0            631.3         58.7
trtm06      1,236.0       168.9          1,236.0       168.9
trtm07      710.3         74.3           710.3         74.3
trtm08      643.6         63.7           643.6         63.7
trtm09      631.3         62.9           631.3         62.9

ALU4r-in-WD is a class of one hundred 4-bit isomorphic and logically equivalent ALU circuits with identical names in each of the netlists, where I/O and interior nodes are ordered randomly.
ALU4r-rn-WD is a class of one hundred 4-bit isomorphic and logically equivalent ALU circuits with randomly assigned names in each of the netlists, where I/O and interior nodes are ordered randomly.
Both classes are isomorphic and logically equivalent.

[Figure 3, histograms: BDD size (nodes, 0 to 2000) on the horizontal axis and a vertical scale of 0 to 100, shown for Class ALU4r-in-WD and Class ALU4r-rn-WD. One pair of panels shows Treatments 1, 2, and 5; the other pair shows Treatments 0, 6, and 9.]

Fig. 3. BDD size (node) statistics under different variable ordering strategies, denoted as 'treatments'. Some treatments of the ALU4r-in-WD class produce statistics that can be grossly misleading.
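As a rough check on the Figure 3 table, treatments can be compared using only the tabulated means and standard deviations. The sketch below uses Welch's unpaired two-sample t statistic, which is a substitute chosen here because the per-circuit data needed for a paired test are not reproduced in the table. The critical value of about 1.97 (two-sided, 95%, roughly 200 degrees of freedom) and all names in the code are assumptions of this sketch.

```python
import math

# (mean, StD) of BDD size for class ALU4r-rn-WD, copied from the Figure 3 table.
RN_WD = {
    "trtm00": (1236.0, 169.9),
    "trtm01": (1268.1, 159.4),
    "trtm03": (707.9, 88.6),
    "trtm06": (1236.0, 168.9),
}
N = 100        # circuits per class
T_CRIT = 1.97  # approximate two-sided 95% critical value for ~200 d.o.f.

def welch_t(a, b, n=N):
    """Welch's t statistic from two (mean, StD) pairs with n samples each."""
    (mean_a, std_a), (mean_b, std_b) = a, b
    return (mean_a - mean_b) / math.sqrt(std_a**2 / n + std_b**2 / n)

for name in ("trtm01", "trtm03", "trtm06"):
    t = welch_t(RN_WD[name], RN_WD["trtm00"])
    verdict = "indistinguishable from" if abs(t) < T_CRIT else "different from"
    print(f"{name}: t = {t:+.2f}, {verdict} trtm00 at the 95% level")
```

With these summary statistics, Treatments 1 and 6 fall inside the critical value relative to Treatment 0, while Treatment 3 falls far outside it.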

Circuit Equivalence Classes. The question arises as to what constitutes an appropriate equivalence class for BDD experiments. In this paper, we demonstrate dramatically that the isomorphic and logically equivalent class, Class WD, created as a special case of Class W mutants with no wiring perturbations [3], is an important class, provided that it retains two properties: (1) the order of all I/Os and interior nodes in the netlist is random relative to all other members in the class, and (2) the names of all I/Os and interior nodes in the netlist are assigned randomly relative to all other members in the class (see footnote 1). Arguments for BDD experiments with Class F mutants, based on controlled perturbations of circuit functional characteristics, are made in [4].

Experiments with classes of ALU4r. We conducted a total of 2000 experiments and applied 10 treatments (Treatments 0-9) to each of two isomorphic classes of the circuit ALU4r (see footnote 2), with 100 circuits in each class:

ALU4r-in-WD is a class of one hundred 4-bit isomorphic and logically equivalent ALU circuits with identical names in each of the netlists, where I/O and interior nodes are ordered randomly.
ALU4r-rn-WD is a class of one hundred 4-bit isomorphic and logically equivalent ALU circuits with randomly assigned names in each of the netlists, where I/O and interior nodes are ordered randomly.

Both classes are isomorphic and logically equivalent, but clearly only the class ALU4r-rn-WD maintains properties (1) and (2) as stipulated above. So what's in a name? A lot, it turns out. The summary of 2000 experiments is shown in Figure 3:

As to Treatment 0, the circuit classes ALU4r-in-WD and ALU4r-rn-WD are clearly equivalent, as expected.
As to Treatments 1-5, we have a contradiction, since the circuit classes ALU4r-in-WD and ALU4r-rn-WD do not appear equivalent. However, the problem is not with the classes; it is with how Treatments 1-5 are applied. Since all circuits in the ALU4r-in-WD class have the same I/O names, the internal variable re-ordering by the VIS static algorithm maintains, internally, the same initial variable order for all circuits in this class. Hence, the standard deviation for the ALU4r-in-WD class remains at 0, while we observe a near-normal distribution for Treatments 1-5 when applied to the class ALU4r-rn-WD.
As to Treatments 6-9, the circuit classes ALU4r-in-WD and ALU4r-rn-WD are clearly equivalent once we replace the VIS static order with the 'natural order'.

Footnote 1: The node renaming treats I/Os in a manner that preserves the original I/O names as postfixes, so the original I/O names can be restored to facilitate pairwise logic verification.
Footnote 2: This version is based on a 4-bit ALU circuit textbook schematic, independently entered by J. Calhoun [15], re-entered by D. Ghosh in 1998, and verified against each other. The "benchmark alu4" version in [8] differs from this version in three minterms, verified by J. Harlow in 1998.

Lessons Learned. The introduction of circuit equivalence classes, with attributes such as random reordering and renaming of circuit nodes, is essential to a good experimental design. Without such classes, we could not have exposed the true performance of algorithms under Treatments 1-5. Furthermore, the availability of circuit equivalence classes and the current experimental design also reveals unexpected equivalences among the treatments themselves: according to a pairwise t-test, (Treatments 1 and 0), (Treatments 2 and 0), and (Treatments 6 and 0) are equivalent at better than 95% confidence level. In other words, for the isomorphic and logically equivalent class of ALU4r, random variable order is just as effective as the following algorithms in VIS: static variable order-

[Figure: histograms of BDD size for circuit class ex_carry15; the recovered panels show Treatment 6 and Treatment 8, with the best order (36 nodes) marked.]

Circuit class ex_carry15 belongs to the family of parameterized n-bit carry circuits (n = 7, 15, 31, 63), devised to test the asymptotic performance of BDD ordering algorithms. The best order, inferred from the bit-slice circuit structure, results in a BDD size of 36 nodes. Other circuit families used in this evaluation include the family of n-bit mux circuits (warp7, warp15, warp31, warp63). The histograms in this figure highlight the distributions of BDD sizes reported by a number of ordering heuristics described in Figure 2. Notably, none achieves the optimum value of 36 nodes for all instances of isomorphic and logically equivalent circuits, each randomly re-ordered and re-named. In addition, the most expensive heuristic (Treatment 9) performs worse than Treatment 8 on some circuits. Asymptotic behavior of the Treatment 8 heuristic applied to these circuit classes is shown in Figure 5.
