ERROR TOLERANCE IN CNNs. APPLICATION TO THE DESIGN OF ROBUST CNNs.

Mancia Anguita, Francisco J. Pelayo, F. Javier Fernández, Antonio F. Díaz.
Departamento de Arquitectura y Tecnología de Computadores, Facultad de Ciencias, Universidad de Granada, 18071-Granada, Spain. Fax: +3458243230, Email: [email protected]

Abstract: This paper deals with obtaining robust parameter configurations for DT-CNNs and for a class of CT-CNNs (here called CT-CNNs with Discrete Configurations, DC-CT-CNNs) in the presence of additive and multiplicative implementation errors. Expressions that characterize the tolerance to both multiplicative and additive errors caused by circuit inaccuracies in VLSI implementations of DT-CNNs and DC-CT-CNNs are first deduced. Using those expressions, robust parameter configurations are obtained, through a design process based on local rules, as the solution of a single linear programming problem. The process is applied to generate robust configurations for some tasks. The tolerance to errors of these configurations has been corroborated by simulations. The differences in parameter values and tolerance to errors between the robust configuration obtained for a particular task in DT-CNNs and the one obtained in DC-CT-CNNs are given.

1.- INTRODUCTION

The Cellular Neural Network model proposed by L. O. Chua and L. Yang [1] has been widely used for image processing tasks [2]. The time evolution of the state of a cell (neuron or pixel) c in an NxM-cell CNN is described by the differential equation [1]:

$$\tau \frac{d x_c(t)}{dt} = -x_c(t) + \sum_{n} A_{n-c}\, y_n(t) + \sum_{n} B_{n-c}\, u_n + I \tag{1}$$

$$1 \le n, c \le N \cdot M; \quad n \in N_R(c); \quad |x_n(0)| \le 1; \quad |u_n| \le 1; \quad y_n = f(x_n) = \tfrac{1}{2}\left(|x_n + 1| - |x_n - 1|\right); \quad |y_n| \le 1$$

where n denotes a generic cell belonging to the neighbourhood of cell c, $N_R(c)$, with radius equal to R (for example, $N_1(c)$ is the set of 3x3 cells centred on c, $N_1(c) = \{c-N-1,\ c-N,\ c-N+1,\ c-1,\ c,\ c+1,\ c+N-1,\ c+N,\ c+N+1\}$). $x_c$ is the state of cell c, $y_n$ and $u_n$ are the output and the input, respectively, of cell n, I is an offset term, and the matrices A and B are called the feedback and control templates, respectively. Some authors have proposed [3] a discrete-time (DT) version of the CNN, obtained by applying the Euler integration algorithm to discretise the cell state equation. The state of a cell c in an NxM-cell DT-CNN is described by [3]:


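$$y_c(t+1) = g\big(x_c(t+1)\big) = g\Big(\sum_{n} A_{n-c}\, y_n(t) + \sum_{n} B_{n-c}\, u_n + I\Big) \tag{2}$$

$$1 \le n, c \le N \cdot M; \quad n \in N_R(c); \quad |u_n| \le 1; \quad g(x_c) = \begin{cases} +1 & \text{for } x_c > 0 \\ -1 & \text{for } x_c < 0 \end{cases}$$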
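As an illustration of eq. (2), the following is a minimal Python sketch of one synchronous DT-CNN update for a radius-1 (3x3) neighbourhood. The function name `dtcnn_step`, the list-of-lists image layout, the use of a single fixed boundary value for cells outside the array, and the choice to keep the previous output when the state is exactly 0 (where g is not defined in eq. (2)) are all assumptions made for this example, not part of the original formulation.

```python
def dtcnn_step(y, u, A, B, I, boundary=-1):
    """One synchronous update of eq. (2) with 3x3 templates A and B.

    y, u : lists of rows holding outputs and inputs in {-1, +1}
    A, B : 3x3 templates; A[1 + di][1 + dj] weights the neighbour at (i + di, j + dj)
    """
    rows, cols = len(y), len(y[0])

    def at(grid, i, j):
        # Assumption: cells outside the array take a fixed boundary value.
        return grid[i][j] if 0 <= i < rows and 0 <= j < cols else boundary

    new_y = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            x = I  # offset term
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    x += A[di + 1][dj + 1] * at(y, i + di, j + dj)  # feedback
                    x += B[di + 1][dj + 1] * at(u, i + di, j + dj)  # control
            # Assumption: g(0) is undefined in eq. (2); keep the old output there.
            new_y[i][j] = 1 if x > 0 else -1 if x < 0 else y[i][j]
    return new_y
```

Calling `dtcnn_step` repeatedly until the output stops changing gives the final network output for a given template set and initial state.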

Depending on the values of the CNN parameters (the cloning template components, A and B, and the offset term I) and on the initial cell states, the CNN is configured to perform a given processing task on its inputs. CNN parameter configurations have been obtained mainly by trial-and-error functional simulations [4,5,6,7], by translating templates traditionally employed in image processing [1], or by using systematic design processes [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. The systematic design processes require precise knowledge of the task: some of them need it to fix a representative set of input-output patterns that retains all the task characteristics and is used to train the network; others need it to codify all these task characteristics as a set of local rules or conditions common to all cells. Here, a design based on local rules is chosen because it is less expensive in terms of computation time and always confirms whether or not a configuration verifying the rules exists. Moreover, the rule-based design can be used to find parameter configurations for DT-CNNs and some CT-CNNs with tolerance to errors in the VLSI implementation as an objective.

In Sect. 2 two kinds of CT-CNN configurations are defined, discrete and continuous, for which the tolerance to errors due to the circuit implementation must be obtained in different ways. In the same section the circuit error dependency for DT-CNNs and for CT-CNNs with a discrete configuration (DC-CT-CNNs) is deduced. Sect. 3 uses the results of Sect. 2 to obtain robust parameter configurations for both DT-CNNs and DC-CT-CNNs as the solution of a single linear programming problem. As an example, robust parameter configurations for certain tasks (shadow creation in one direction, CCD and border detection in one direction) have been generated.
The tolerable error obtained with these parameter configurations is illustrated with simulations of both DT-CNNs and CT-CNNs. Finally, Sect. 4 presents some conclusions.

The previous works in [11], [14], [15], [16] and [17] also deal with obtaining robust CNN configurations. In this work:
(1) We illustrate that not all the configurations for CT-CNNs, especially robust configurations, can be obtained with a design process based on local rules defined in terms of limited cell output; this is because for some configurations the worst-case error conditions for a cell do not occur when its neighbours are limited (we call these configurations continuous, CC).
(2) The differences in parameter values and tolerance to errors between the robust configuration for solving a particular task in DT-CNNs and the one for solving it in DC-CT-CNNs are given.
(3) The robustness of a CNN is corroborated by simulations that take into account the probability that a cell breaks a rule when the circuit inaccuracies are generated by a normal distribution, since this probability may be deduced for DT-CNNs and DC-CT-CNNs. These simulations are also used to distinguish between CC-CT-CNNs and DC-CT-CNNs.
(4) The design of a robust DT-CNN or a robust DC-CT-CNN configuration is reduced to solving a single linear programming problem (in [16] various approaches used to obtain parameter configurations are referenced).
(5) It is illustrated with CCD (connected component detection) that robust configurations can be designed with local rules not only for the CT-CNNs that [17] calls monotonic.
(6) It is shown that if the loss term is dropped from the rules in the CT-CNN design process, as is considered in [16], the configuration obtained is less robust than the one obtained taking the loss term into account.
(7) In the design process based on local rules defined in terms of limited cell output, it has been taken into account that all the adding terms (including errors) of the cell state equation affect the change in the limited output of the cells. This must be considered in the design process when fixing the inequalities that codify the rules to be observed and (if robustness is an objective) the values to be optimized (in the examples of CNN design in [10], [14] and [15] this is not taken into account).

2.- TOLERANCE TO CIRCUIT INACCURACIES IN CNNs.

The circuit implementation of a CNN introduces errors into the cell state computation that may lead the CNN to a wrong final output. These errors depend principally on the offset (additive) and signal-level dependent (multiplicative) errors introduced by the circuits in the adding terms of the cell state equation (see eq. (2) for DT-CNNs and eq. (1) for CT-CNNs). Errors in the time constant affect the cell evolution speed and may be seen as a common error in all the adding terms. Note that the error in each weighted term depends not only on the errors introduced by the multiplier circuit that computes the weighted term but also on the error in its inputs.

The tolerance to errors can be easily deduced in CNNs that use a hard output nonlinearity, such as the DT-CNN (eq. (2)). This is because the number of value combinations of the neighbour cell outputs (y) and inputs (u) that make a cell take a value of +1 or -1 is finite, since each cell output and input can only take two possible values: +1 or -1. The errors in a DT-CNN cell that may be tolerated (i.e. errors with which the network still reaches the desired output) can then be deduced (Sect. 2.1). In the same way, it is easy to deduce the tolerable errors for those CT-CNNs (eq. (1)) in which the worst-case error conditions for any cell occur when its neighbour cell outputs are limited, since in this case the number of worst-case error conditions is finite (Sect. 2.2).

2.1.- Error tolerance in DT-CNNs.


The tolerance to errors caused by the circuit implementation of a DT-CNN programmed to solve a given task is deduced from the finite number of conditions or rules that every pixel (cell) in the network must observe in order to solve the task. These local rules define a task by specifying the input (u) and output (y) values of the neighbours of any cell c that make it take a value of +1 or -1. For example, to generate a shadow to the right of each object in an image stored as the CNN initial state, the local rules in Box 1 must be observed by every pixel c.

Box 1. Rules for shadow generation to the right.

(Rule 1) If a pixel with a value of -1 has a pixel to the left with value +1 ($y_c = -1$, $y_{c-1} = +1$), it must take the value +1.
(Rule 2) If a pixel with a value of -1 has a pixel to the left with value -1 ($y_c = -1$, $y_{c-1} = -1$), it must maintain the value -1.
(Rule 3) If a pixel has a value of +1 ($y_c = +1$, $y_{c-1} = -1$ or $y_c = +1$, $y_{c-1} = +1$), it must maintain the value +1.

The conditions to be verified in the state equation of any cell c in a DT-CNN in order to observe a rule take the form (see eq. (2) and [9]):

$$\begin{aligned}
x_c(t+1) &= \sum_{n} A_{n-c}\, y_n(t) + \sum_{n} B_{n-c}\, u_n + I < 0 && \text{when } y_c \text{ must take or maintain } -1 \\
x_c(t+1) &= \sum_{n} A_{n-c}\, y_n(t) + \sum_{n} B_{n-c}\, u_n + I > 0 && \text{when } y_c \text{ must take or maintain } +1
\end{aligned} \qquad y, u \in \{-1, +1\} \tag{3}$$

The local rules can be codified in a DT-CNN with a set of inequalities, common to all cells, of the form shown in (3). The right-hand side of the strict inequalities has been fixed at 0 because DT-CNNs use a hard nonlinearity to generate the cell output (y) from the cell state (x) [9]. An inequality is not necessary to define a task if it is implied by another, more restrictive one. The rules in Box 1, to be observed by every pixel in order to generate a shadow to the right, are codified in a DT-CNN with the inequalities in Box 2.

Box 2. Shadow creation to the right. Inequalities codifying the rules in Box 1 for a DT-CNN.

{Rule 1} $A_{-1} - A_0 + I > 0$
{Rule 2} $-A_{-1} - A_0 + I < 0$
{Rule 3} $A_{-1} + A_0 + I > 0$, $-A_{-1} + A_0 + I > 0$
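The step from the rules in Box 1 to the inequalities in Box 2 can be sketched mechanically: each rule fixes the outputs of the left neighbour and of the cell itself, and the required next output fixes the sign of the inequality. The short Python sketch below is only illustrative; the tuple encoding of the rules and the names `RULES` and `inequality` are assumptions made for this example.

```python
# Each entry: (y_{c-1}, y_c, required next y_c), read off from Box 1.
RULES = [
    (+1, -1, +1),  # Rule 1: left neighbour is +1, cell is -1 -> cell becomes +1
    (-1, -1, -1),  # Rule 2: left neighbour is -1, cell is -1 -> cell stays -1
    (+1, +1, +1),  # Rule 3: cell is +1 -> cell stays +1
    (-1, +1, +1),  # Rule 3, other neighbour value
]


def inequality(y_left, y_c, target):
    """Write the eq. (3) condition for one rule as text (B is zero for this task)."""
    relation = ">" if target > 0 else "<"
    return f"{y_left:+d}*A[-1] {y_c:+d}*A[0] + I {relation} 0"


for rule in RULES:
    print(inequality(*rule))
```

The four printed lines correspond, in order, to Rule 1, Rule 2 and the two inequalities of Rule 3 in Box 2.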

In all inequalities for a particular task (see Box 2 for shadow creation) all the parameters used by the CNN to solve the task must appear, because all of them affect the cell evolution at any moment. Parameter configurations for shadow creation that store the input image as the CNN initial state have been presented, for example, in [10] and [18] (Box 3). The configuration in [18] verifies all the inequalities in Box 2 (see Box 4), so it can be used for shadow creation in a DT-CNN applying the process described by the rules in Box 1. The configuration in [10] was deduced for CT-CNNs by a design process based on local rules; it cannot be applied to shadow creation with a DT-CNN (observe in Box 4 that it does not fulfil Rule 1).

Box 3. Parameter configurations for shadow generation.

[10]:
$$A = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad I = 0.5, \quad x_b = -1, \quad u_b = \text{don't care}$$

[18]:
$$A = \begin{pmatrix} 0 & 0 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad I = 2, \quad x_b = -1, \quad u_b = \text{don't care}$$

Box 4. Verification of the inequalities in Box 2 by the configurations in Box 3.

Inequalities for shadow creation    | [10]                          | [18]
{Rule 1}  A_{-1} - A_0 + I > 0      | 1 - 2 + 0.5 = -0.5 < 0 (Fail) | 2 - 2 + 2 = 2 > 0
{Rule 2} -A_{-1} - A_0 + I < 0      | -1 - 2 + 0.5 = -2.5 < 0       | -2 - 2 + 2 = -2 < 0
{Rule 3}  A_{-1} + A_0 + I > 0      | 1 + 2 + 0.5 = 3.5 > 0         | 2 + 2 + 2 = 6 > 0
         -A_{-1} + A_0 + I > 0      | -1 + 2 + 0.5 = 1.5 > 0        | -2 + 2 + 2 = 2 > 0
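The check in Box 4 can be reproduced with a few lines of Python. This is a minimal sketch: the function names `shadow_margins` and `failed_rules` and the argument names are assumptions made for this example, and only the three non-zero parameters of the shadow templates (the left and centre feedback weights and the offset) are passed in.

```python
def shadow_margins(a_left, a_centre, offset):
    """Left-hand sides of the Box 2 inequalities with the sign each must have."""
    return [
        ("Rule 1", a_left - a_centre + offset, +1),
        ("Rule 2", -a_left - a_centre + offset, -1),
        ("Rule 3a", a_left + a_centre + offset, +1),
        ("Rule 3b", -a_left + a_centre + offset, +1),
    ]


def failed_rules(a_left, a_centre, offset):
    """Names of the inequalities whose left-hand side has the wrong sign (or is 0)."""
    return [name
            for name, lhs, sign in shadow_margins(a_left, a_centre, offset)
            if lhs * sign <= 0]


print(failed_rules(1, 2, 0.5))  # configuration from [10] -> ['Rule 1']
print(failed_rules(2, 2, 2))    # configuration from [18] -> []
```

The smallest absolute left-hand side returned by `shadow_margins(2, 2, 2)` is 2, which is the margin the configuration in [18] leaves before any inequality changes sign.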

If during the network evolution a rule (an inequality of a rule) never fails, a correct final CNN output is guaranteed. A rule fails in a cell c during the network evolution if, at any moment, it is true in c that (see eq. (3)):

$$\begin{aligned}
\Big(\sum_{n} A_{n-c}\, y_n(t) + \sum_{n} B_{n-c}\, u_n + I + \varepsilon\Big) &> 0 && \text{when } y_c \text{ must take or maintain } -1 \quad (y_c(t+1) = +1) \\
\Big(\sum_{n} A_{n-c}\, y_n(t) + \sum_{n} B_{n-c}\, u_n + I + \varepsilon\Big) &< 0 && \text{when } y_c \text{ must take or maintain } +1 \quad (y_c(t+1) = -1)
\end{aligned} \tag{4}$$

This occurs due to additive ($\varepsilon_A$) and/or multiplicative ($\varepsilon_M$) errors, represented in (4) by $\varepsilon$ ($\varepsilon = \varepsilon_A + \varepsilon_M$), principally those added by the weighted terms and by the circuit implementing the offset term (the adding terms). The deviation of the error $\varepsilon$, $\sigma_\varepsilon$, that causes a wrong CNN output depends on the lowest absolute value of the left side of the inequalities codifying the rules (eqs. (3) and (4)). Calling this lowest absolute value a (as can be observed in Box 4, a = 2 for the configuration in [18]), a correct CNN output is obtained if (see eqs. (3) and (4))

$$\sigma_\varepsilon < a \tag{5}$$

Assuming a maximum additive deviation of $\sigma_{WCA}$ (Worst-Case Additive deviation) for each weighted term and for the offset term, the absolute value of the additive error $\varepsilon_A$ in a pixel can reach a maximum of:

$$|\varepsilon_A|_{\text{maximum}} = (\text{number of parameters})\,\sigma_{WCA} \tag{6}$$

A cell reaches this maximum value when all its adding terms have a positive additive error of $+\sigma_{WCA}$ or when all have a negative additive error of $-\sigma_{WCA}$. Supposing a normal distribution for the additive errors, with a standard deviation of $\sigma_A$ for each adding term, the standard deviation of the additive error $\varepsilon_A$, $\sigma_{\varepsilon_A}$, is obtained by:

$$\sigma_{\varepsilon_A} = \sqrt{\text{number of parameters}}\;\sigma_A \tag{7}$$
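To make the difference between eqs. (6) and (7) concrete, the following Python sketch evaluates both for a given number of adding terms. The function name `additive_error_bounds` is an assumption, and so are the numeric values: three adding terms (as in the shadow inequalities of Box 2, which involve two feedback weights and the offset) and a per-term deviation of 0.5 are chosen purely for illustration.

```python
import math


def additive_error_bounds(n_terms, sigma_wca, sigma_a):
    """Worst-case additive error (eq. (6)) and its standard deviation (eq. (7))."""
    worst_case = n_terms * sigma_wca     # all terms deviate in the same direction
    std = math.sqrt(n_terms) * sigma_a   # independent normal errors add in variance
    return worst_case, std


# Illustrative values only: 3 adding terms, per-term deviation 0.5.
worst_case, std = additive_error_bounds(3, 0.5, 0.5)
margin = 2  # value of a for the configuration in [18]
print(worst_case < margin, std < margin)
```

The worst-case figure grows linearly with the number of adding terms, while the standard deviation grows only with its square root, so the statistical estimate is the less pessimistic of the two.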

Furthermore, with a maximum multiplicative deviation of $\sigma_{WCM}$ (Worst-Case Multiplicative deviation) for each adding term, the absolute value of the multiplicative error $\varepsilon_M$ in a pixel can reach a maximum of:

$$|\varepsilon_M|_{\text{maximum}} = \Big(\sum_{n} \big(|A_{n-c}| + |B_{n-c}|\big) + |I|\Big)\,\sigma_{WCM} \tag{8}$$

To deduce this expression it has been taken into account that the cell output (y) and input (u) take values of +1 or -1 and that the maximum error is obtained when all adding terms have the greatest multiplicative errors ($|A_{n-c}|\,\sigma_{WCM}$, $|B_{n-c}|\,\sigma_{WCM}$, $|I|\,\sigma_{WCM}$), either all positive or all negative. Supposing a normal distribution for the multiplicative errors, with a standard deviation of $\sigma_M$ for each adding term, the standard deviation of the multiplicative error $\varepsilon_M$, $\sigma_{\varepsilon_M}$, is obtained by:

$$\sigma_{\varepsilon_M} = \sqrt{\sum_{n} \big(A_{n-c}^2 + B_{n-c}^2\big) + I^2}\;\cdot\;\sigma_M \tag{9}$$
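Eqs. (8) and (9) depend on the parameter values themselves, so they can be evaluated directly from a template set. The Python sketch below does this under the same assumption used in the text (outputs and inputs equal to +1 or -1); the function name `multiplicative_error_bounds` and the relative deviation of 0.1 are assumptions chosen only for illustration.

```python
import math


def multiplicative_error_bounds(A, B, I, sigma_wcm, sigma_m):
    """Worst-case multiplicative error (eq. (8)) and its standard deviation (eq. (9))."""
    weights = [w for row in A for w in row] + [w for row in B for w in row]
    worst_case = (sum(abs(w) for w in weights) + abs(I)) * sigma_wcm
    std = math.sqrt(sum(w * w for w in weights) + I * I) * sigma_m
    return worst_case, std


# Shadow configuration from [18]; the 0.1 deviations are illustrative values.
A18 = [[0, 0, 0], [2, 2, 0], [0, 0, 0]]
B18 = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
worst_case, std = multiplicative_error_bounds(A18, B18, 2, 0.1, 0.1)
print(worst_case, std)
```

Because both expressions scale with the magnitude of the weights, simply multiplying every parameter by a constant enlarges the multiplicative error by the same factor.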

Without multiplicative errors (then $\varepsilon = \varepsilon_A$): (1) A correct CNN output is obtained if the maximum deviation for additive errors, $\sigma_{WCA}$, in the adding terms is (see eqs. (5) and (6)): $\sigma_\varepsilon = (\text{number of parameters})\,\sigma_{WCA} < a \;\Rightarrow\; \sigma_{WCA}$