On High-Quality, Low Energy BIST Preparation at RT-Level

On High-Quality, Low Energy BIST Preparation at RT-Level M. B. Santos, I.C. Teixeira and J. P. Teixeira

S. Manich, L. Balado and J. Figueras

IST / INESC-id, R. Alves Redol, 9, 1000-029 Lisboa, Portugal [email protected]

Univ. Politecnica de Catalunya (UPC) Barcelona, Spain [email protected]

Abstract

The purpose of this paper is to present a methodology for high-quality LE/LP BIST preparation at RT-level. High-quality BIST is ascertained through likely physical Defects Coverage (DC) metrics. Low-energy BIST is achieved by generating a short, loosely deterministic test sequence. The methodology for LE/LP BIST preparation is cost-effective and useful for complex designs, as it is carried out at RTL. RT-level test generation is carried out through the definition of a reduced set of masks, forcing a limited subset of "care" bits. A novel RTL testability metric, IFMB, recently proposed [SaTe01], is used to ascertain the quality of the generated test pattern. IFMB stands for Implicit Functionality and Multiple Branch coverage: it evaluates, at RTL, the exercise of the Implicit Functionality (IF) of operators and the Multiple Branch (MB) coverage of conditional constructs. The energy metric used is the Weighted Switching Activity (WSA) [GeWu99]. This metric takes into account the current associated with the switching charge of the parasitic capacitance of each node, computed by a proprietary tool, lobs [GoTe99].

The purpose of this paper is to discuss how a recently proposed RT (Register Transfer) Level test preparation methodology can be reused to drive innovative Low-Energy (LE) / Low-Power (LP) BIST solutions for digital SOC (System on a Chip) embedded cores. RTL test generation is carried out through the definition of a reduced set of masks, forcing few "care" bits and leading to a high correlation between multiple detection of RTL faults and single detection of likely physical defects. LE/LP BIST sessions are defined as short test sequences leading to high values of the RT-level IFMB metric and of low-level Defects Coverage (DC). The Weighted Switching Activity (WSA) of the BIST sessions, with and without mask forcing, is computed. It is shown that, by forcing vectors with the RTL masks, short BIST sessions with low energy and with a comparable (or lower) average power consumption, as compared to pseudo-random testing, are derived. The usefulness of the methodology is ascertained using the VeriDOS simulation environment and modules of the CMUDSP and TORCH ITC'99 benchmark circuits.

1. Introduction

Product complexity, performance and quality requirements are ever increasing, while power, cost and time-to-market budgets are ever shrinking. This trend puts heavy pressure on design productivity and quality, and drives the design process towards higher levels of abstraction and towards HDLs (Hardware Description Languages). Low-power design and design reuse techniques are currently in use, as well as IP (Intellectual Property) based methods [XXX]. As a consequence, embedded core reuse also calls for core test reuse, and for RTL (Register Transfer Level) test planning and preparation. Moreover, energy and power requirements are becoming very relevant in electronic design. In fact, low-energy operation is needed to extend battery lifetime in portable equipment [ ]. Low average power is needed to constrain the temperature of electronic devices under operation [ ]. Low maximum power is also needed to avoid hot spots and electromigration, which limit device reliability [ ]. Low-Energy (LE) / Low-Power (LP) requirements for the normal operation mode should go together with LE/LP requirements in test mode [Nico00]. Test resource partitioning makes BIST (Built-In Self Test) an attractive solution, provided that high Test Effectiveness (TE) can be obtained. TE is measured as the ability of the test pattern to uncover likely defects [WaWi95]. High-TE LE/LP BIST solutions are the main focus of this paper.

The paper is organized as follows. In section 2, a review of previous work is made. Section 3 describes the high-level (IFMB) and low-level (DC) test quality metrics, emphasizing their correlation. Section 4 introduces the proposed RT-level methodology, together with two hardware solutions for test vector customization. In section 5, LE/LP BIST solutions are discussed, using ITC'99 benchmarks. Finally, section 6 summarizes the conclusions.

2. Previous Work

Low-cost BIST solutions require low-area TPGs (Test Pattern Generators), typically pseudo-random TPGs. Random pattern resistant faults require that some degree of test determinism be considered. Different approaches have been proposed for random pattern resistant fault detection in digital circuits. These approaches basically perform logic-level LSA (Line Stuck-at) fault simulation with (pseudo-)random vectors, in order to identify hard-to-detect faults, which are afterwards detected using weighted random pattern generation [ScCa75] [Wund85] [NeKi97] or deterministic [HeWu95] approaches. However, high LSA fault coverage does not guarantee high Defects Coverage, DC [SoGo96]. Moreover, hard accessibility of parts of the structural description is expected to result from the synthesis of functional parts that are seldom exercised; this information can, however, be obtained at RTL with low-cost fault simulation. At-speed BIST power consumption can be reduced by means of: (1) vector selection [HeWu95] [CoVi99] [MaSa99] [GiPr99], reducing the number of vectors applied; (2) test pattern generation targeted at low-power BIST [ZhRo99] [CoVi00]; and/or (3) circuit activity reduction during shift in the chain of a test-per-scan architecture [GeWu99] [WaGu99].

In a previous paper [SaTe00], the authors have shown that tests generated at RTL can rewardingly be reused in a production environment to improve the coverage of physical defects. In fact, random pattern-resistant faults, which require prohibitively large numbers of equiprobable patterns or multiple weight sets [WaFo89], can be detected with significantly shorter test lengths if the test is derived using RTL information, dramatically reducing the energy required for the BIST session. In a subsequent paper [SaBa01], the authors have provided evidence that multiple detection of hard-to-detect RTL faults leads to the detection of random pattern-resistant realistic faults at logic level, namely hard-to-detect bridging and open defects. Moreover, a significant reduction in test length was shown to be possible, with an associated reduction in the energy required for the BIST session.

3. Test Quality Metrics

RT- and layout-level test quality metrics are defined, as the RT level is used for TPG, and the layout level for test effectiveness evaluation. All metrics are computed using the mixed-level fault simulator VeriDOS [SaTe99/1,2].

3.1 Implicit Functionality Multiple Branch (IFMB) Metrics

The IFMB metric is defined to exploit the high correlation between multiple detection of RTL faults and single detection of likely physical defects. Consider a digital system, characterized by an RTL behavioral description D, and a set of N_F RTL faults. For a given test pattern, T = {T_1, T_2, ..., T_N}, the IFMB metric is defined as [SaTe01]

IFMB = (N_LSA / N_F) x FC_LSA + (N_IF / N_F) x FC_IF + (N_MB / N_F) x FC_MB(n)    (1)

where N_LSA, N_IF, N_MB and N_F represent the number of RTL LSA, implicit functionality, conditional construct and global faults, respectively (N_F = N_LSA + N_IF + N_MB). Hence, three RTL fault classes are considered. Each individual fault coverage in IFMB is evaluated as the percentage of faults in each class that are single-detected (FC_LSA, FC_IF) or n-detected (FC_MB) by the test pattern T. IFMB is thus computed as the weighted sum of three contributions: (1) single RTL LSA fault coverage (FC_LSA), (2) single Implicit Functionality (IF) fault coverage (FC_IF) and (3) Multiple Branch (MB) coverage of conditional construct faults (FC_MB(n)). The multiplicity of branch coverage is n. The first RTL fault class has been considered in the work of previous authors [2-8]. The two additional RTL fault classes, and the introduction of n-detection associated with MB, have been proposed in [SaTe01] and incorporated in the IFMB testability metric. Here, the concept of weighted RTL fault coverage is used in a different way than in [ThAgr00]: the RTL fault list is partitioned in three classes, all listed faults are assumed equally probable, and the weighting takes into account the relative incidence of each fault class in the overall fault list. The inclusion of faults that fully evaluate the implicit functionality of operators also differs from the model proposed in [ThAgr99], where faults are sampled from a mixed structural/RT-level operator description.

One of the shortcomings of the first RTL fault class (RTL LSA) is that it only considers the input, internal and output variables explicit in the RTL behavioral description. However, the structural implementation of a given functionality usually produces implicit variables, associated with the normal breakdown of the functionality. In order to increase the correlation between RTL fault coverage and DC, some key implicit variables are identified at RT-level, and LSA faults at each of their bits are added to the fault list. The usefulness of such modeling has been demonstrated in [SaTe01] for relational and arithmetic (adder) operators. Nevertheless, the principle can be applied to other functional modules.

As far as MB coverage is concerned, we represent conditional constructs in a graph where a node represents one conditional construct, a branch connects unconditional execution segments, and a path is a combination of branches. High TE requires multiple branch coverage. The proposed fault model for conditional constructs inhibits and forces the execution of each CASE possibility and forces each IF/ELSE condition to both possibilities. Two testability metrics are defined for each branch.

Branch Controllability:

CO_i = CO_i(b_i) = { n_ai / n,  if n_ai < n
                   { 1,         if n_ai >= n    (2)

where n_ai is the number of activations of branch b_i.

Branch Detectability:

DO_i = DO_i(b_i) = { n_di / n,  if n_di < n
                   { 1,         if n_di >= n    (3)

where n_di is the number of times the non-execution of branch b_i was detected. These metrics, in the interval [0, 1], reflect the fraction of success in achieving n fault activations and n fault observations, respectively. The contribution of conditional constructs to the global IFMB metric is then defined as the MB coverage:

FC_MB(n) = SUM_i DO_i(n) / N_MB    (4)

3.2 Defects Coverage (DC) Metrics

Using defect-oriented fault models, each listed fault (e.g., a bridging defect) is weighted by its probability of occurrence, w_i. This leads to the definition of logic-level DC (Defects Coverage) as

DC = ( SUM_{j=1..N_d} w_j ) / ( SUM_{i=1..N} w_i )    (5)

where N is the total number of layout-level faults and N_d is the number of detected faults, with associated probabilities w_j.

4. BIST RTL TPG Methodology

Test Pattern Generation is carried out at RTL, defining partially specified test vectors (test cubes) which drive the system under test into the functionality visited in a limited region of the input space. We refer to this functionality as the "dark corners" [SaTe01/1]. The BIST strategy is thus to customize Pseudo-Random (PR) test vectors (generated on-chip with, e.g., an LFSR) with these partially specified test vectors, referred to as "masks". Usually, the number of masks, n_m, is limited, and the number of constrained positional bits, m_i in mask i, is much smaller than the input word length, m. A merit factor ψ_i = m_i / m is defined for each mask. The case studies used as test vehicles are modules of the CMUDSP [CMUDSP] and TORCH [TORCH] ITC'99 benchmark circuits. As an example, Table 1 shows the limited effort needed to customize PR patterns for the agu control module (AGUctr) from CMUDSP, and the co-processor 0 (cp0_ctr) and the multiply or adder booth (moa_booth) modules from TORCH.

Module      n_m (# masks)   n (# PIs)   n x n_m   SUM m_i (tot. # fixed bits)
PCU_ctr.    6               347         2082      241
AGU_ctr.    14              35          490       217
Cp0_ctr.    3               28          84        16
MOA Ppsum   3               272         816       270

Table 1 – Mask customization for ITC'99 benchmark modules

The TPG process is carried out in such a way that, after mask generation, as described in [SaTe01], the test pattern T = {T_1, T_2, ..., T_N} is built of N = N_0(PR) + SUM_i N_i(mask i) vectors, in which N_0 are pseudo-random vectors and, for each mask i, N_i vectors are generated by filling the unconstrained positional bits with the 0/1 values generated by the LFSR. The methodology allows good RTL-based estimates of the BIST session length required to reach high DC values. Experiments were carried out using different strategies to introduce the mask vectors, e.g., by applying a few PR vectors, then a mask-1 vector, again a few PR vectors, then a mask-2 vector, and so on; however, such a strategy leads to higher energy/power consumption. At present, work is under way to derive optimum values for N_i, as the number of customized vectors needed to obtain gains in DC depends on ψ_i and on the complexity of the functionality of the dark corner. A simple example, depicted in Fig. 1, shows the gains in DC achieved with the application of one specific mask (mascara1) after N_0 = 1000 PR vectors, and the saturation in DC(N) after N_1 = 500 vectors. In section 5, after some considerations on energy/power consumption, additional experiments are reported.

[Fig. 1 plot: DC[%] (0-100) versus #Vectors (0-1500), for the CUT k=16 (mascara1). Inset, the CUT functionality: y1 = f(x2) WHEN x1 = "mascara1", with f(x2) = ((((x2_0 OR x2_1) AND (x2_2 AND x2_3)) AND ((x2_4 XOR x2_5) OR (x2_6 AND x2_7))) XOR (((x2_8 OR x2_9) AND (x2_10 AND x2_11)) AND ((x2_12 XOR x2_13) OR (x2_14 AND x2_15)))) & "mascara1" XOR x2.]

Fig. 1 – Defects Coverage (DC) versus N for the CUT k=16, when a specific mask (mascara1) is applied after 1000 pseudo-random vectors.
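The mask-based customization of PR vectors described above can be sketched in software as follows. This is a minimal behavioral model, not the paper's hardware solution: the 16-bit LFSR polynomial, the seed and the mask values are illustrative assumptions.

```python
# Sketch of on-chip PR vector customization with RTL-generated masks.
# The LFSR taps (x^16 + x^14 + x^13 + x^11 + 1, a maximal-length polynomial)
# and the mask values are illustrative, not taken from the paper.

def lfsr16(state: int) -> int:
    """One step of a 16-bit Fibonacci LFSR."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return ((state >> 1) | (bit << 15)) & 0xFFFF

def apply_mask(vector: int, care_bits: int, care_values: int) -> int:
    """Force the 'care' positions to the mask values; free bits keep the PR value."""
    return (vector & ~care_bits) | (care_values & care_bits)

def bist_session(seed: int, n0: int, masks, n_per_mask: int):
    """N_0 plain PR vectors, then, for each mask i, N_i mask-customized PR vectors."""
    state, session = seed, []
    for _ in range(n0):                    # unmasked pseudo-random phase
        session.append(state)
        state = lfsr16(state)
    for care_bits, care_values in masks:   # mask-customized phase
        for _ in range(n_per_mask):
            session.append(apply_mask(state, care_bits, care_values))
            state = lfsr16(state)
    return session
```

For instance, `bist_session(0xACE1, 1000, [(0x00FF, 0x00AA)], 500)` models the Fig. 1 experiment: 1000 PR vectors followed by 500 vectors whose low byte is forced to the mask value while the remaining bits stay pseudo-random.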

BIST quality is frequently evaluated using the LSA fault model. However, as referred, more accurate fault models are used in this paper. The simulation environment uses a commercial design system and DOTLab, a proprietary set of defect-oriented (DO) tools, including lobs (the proprietary defect extractor) and VeriDOS, which performs mixed-level (behavioral/structural) fault simulation, using VHDL or Verilog behavioral descriptions and Verilog structural descriptions [SaTe99/1]. This simulation tool uses an extension of the biased-voting model for bridging faults, as described in [SaTe99/2]. Hence, gate-level Verilog fault models for BRI (bridging) and LOP (Line Open) defects, both for interconnection and cell faults, are resident in the VeriDOS tool, for CMOS physical implementations [SaTe99/2]. VeriDOS generates RTL fault lists, according to the RTL fault models defined in [SaTe00], performs mixed RT/logic level fault simulation and WSA (Weighted Switching Activity) computation (the metric used for energy estimation), and computes the RTL (IFMB) and layout-level (DC) quality metrics. In order to perform on-chip PR vector customization, using the RTL-generated masks, additional test hardware, i.e., silicon area (and power consumption), is required. Two structural solutions for LFSR bit masking have been proposed (and evaluated) in [SaTe01/2], taking advantage of the reduced values of ψ_i in the masks. Their implementation is automated through a dedicated tool.
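For illustration, the probability-weighted DC metric of eq. (5) can be sketched as follows; the fault weights in the example are invented for the sake of the demonstration.

```python
def defects_coverage(weights, detected) -> float:
    """Eq. (5): DC = sum of weights of detected faults / sum of all fault weights.

    weights[i] is the occurrence probability w_i of layout-level fault i;
    detected[i] flags whether the test pattern detected that fault.
    """
    covered = sum(w for w, hit in zip(weights, detected) if hit)
    return covered / sum(weights)

# Toy fault list: likely defects weigh more than unlikely ones (values illustrative).
w = [0.4, 0.3, 0.2, 0.1]
dc = defects_coverage(w, [True, True, False, True])   # covers 0.8 of total weight
```

Note that missing the single most likely fault costs more DC than missing several unlikely ones, which is precisely why DC is a sharper quality measure than unweighted LSA fault coverage.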

5. LE / LP BIST Solutions

As referred, energy and power consumption are evaluated through the Weighted Switching Activity (WSA) metric.

Let G be the set of nodes of a circuit excited by two input vectors, V_I and V_F. The initial vector V_I presets the internal nodes to an initial state at time t_0. At the next time t_1, the input vector is changed to V_F; as a consequence, the internal nodes reach a new stable state. During the transient, nodes may toggle a number of times, depending on the structure of the circuit. If N_i is the total number of switches of node i, we define the set of values taken during this transient as {g_i(0), ..., g_i(j), ..., g_i(N_i)}. Accordingly, WSA is defined as

WSA(V_I, V_F) = SUM_{i in G} F_i x SUM_{j=1..N_i} ( g_i(j-1) XOR g_i(j) )    (6)

In this expression, F_i is the node weight and represents its influence on the overall energy. This weight is related to the technology parameters through the expression

E_i = (1/2) x F_i x c_0 x V_DD^2    (7)

Parameter E_i is the energy consumption of node i in a single transition, c_0 is the equivalent capacitance of a minimum-size inverter and V_DD the power supply voltage. From this expression it is clear that F_i may be interpreted as the weight of a gate normalized to a minimum-size inverter. At logic level, this weight is usually approximated by the fanout of the gate. Finally, consider a set of vectors V = {V_0, ..., V_n} applied sequentially to the inputs of the circuit, at a speed compatible with the total delay of the circuit and in order of vector index. The total energy consumption, E, can then be estimated as

E = SUM_{i=1..n} WSA(V_{i-1}, V_i) x (1/2) x c_0 x V_DD^2 = (1/2) x WSA x c_0 x V_DD^2    (8)

where WSA is the total weighted switching activity. The average power consumption, P_ave, can be computed from P_ave = E / t_tot, where t_tot is the total time period of the BIST session. The maximum power consumption, P_max, is obtained from P_max = max{ (SUM E_j(t_i) - SUM E_j(t_{i-1})) / t_ck }, where t_ck is the clock period.

Although the WSA metric is widely accepted for fast energy estimation, large errors may occur in certain situations, due to some strong approximations. It must be kept in mind that (1) the voltage swing of the nodes during switching is assumed equal to the power supply voltage, (2) the internal capacitances of the gates are lumped into the output load capacitance, (3) Miller capacitance effects are also approximated by the load capacitance, and (4) other effects, like short-circuit and leakage currents, are assumed not significant compared to the switching of the parasitic capacitors; in addition, the delay model approximation may be another cause of large errors, due to its strong impact on glitching.

Consider the example depicted in Fig. 2, which shows the energy consumption profile of the AGU_control module, obtained during a testing experiment [SaTe01/2] in which PR vectors are nested with masks. The X-axis of the plot corresponds to the index of the test vector; vectors are applied sequentially to the circuit, in order of their index. The Y-axis corresponds to the total charge Q switched from the beginning of the test. This charge is proportional to the energy consumption if we consider that Q = WSA x c_0 x V_DD. Note that, according to (8), the total energy consumption is then computed as

E = (1/2) x Q x V_DD    (9)

[Fig. 2 plot: switched charge (x1E-6 C, 0-0.90) versus # vectors (0-10000); the profile reaches DC = 92%, with the 'Normal vectors' and 'Masked vectors' segments marked.]

Figure 2 - Energy consumption of the AGU_control module during a test session. Two types of test vectors are used: PR (Normal) and PR with masks (Masked) [SaTe01/1].
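A coarse software sketch of eqs. (6)-(8) follows. It assumes that per-node toggle traces and the node weights F_i (the fan-out approximation) are available from logic simulation; all values in the example are illustrative, whereas in the paper these quantities are produced by the VeriDOS/lobs tools.

```python
# Sketch of the WSA / energy computation of eqs. (6)-(8); inputs are illustrative.

def wsa_pair(weights, traces) -> float:
    """Eq. (6): WSA(V_I, V_F) = sum over nodes of F_i times the node's toggle count.

    traces[i] is the value sequence {g_i(0), ..., g_i(N_i)} that node i goes
    through while the circuit settles after the V_I -> V_F input change.
    """
    return sum(
        f * sum(a ^ b for a, b in zip(tr, tr[1:]))
        for f, tr in zip(weights, traces)
    )

def session_energy(wsa_total: float, c0: float, vdd: float) -> float:
    """Eq. (8): E = 1/2 * WSA * c0 * VDD^2."""
    return 0.5 * wsa_total * c0 * vdd ** 2

# One 2-vector transition on a 3-node circuit: node 0 glitches (two toggles),
# node 1 toggles once, node 2 stays stable.
F = [2.0, 1.0, 3.0]                 # fan-out weights, illustrative
g = [[0, 1, 0], [0, 1], [1, 1]]     # per-node value sequences during the transient
activity = wsa_pair(F, g)           # 2*2 + 1*1 + 3*0 = 5.0
```

The glitch on node 0 illustrates why WSA depends on the delay model: a zero-delay simulation would report a single toggle there and underestimate the energy.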

The LE goal corresponds to minimizing E; the LP goal is achieved with minimum P_ave. Minimizing test application time is achieved with the minimum N (test length) compatible with a target DC value. How can an adequate LE/LP BIST solution be derived for a given module under test? Which are the key parameters to ascertain the quality of the solution? The key features of the proposed approach can be better understood with the help of Fig. 3.

[Fig. 3 sketch: E ~ WSA versus N, showing two E(N) curves: (1) reaching DC_1 > DC_0 and (2) reaching DC_0, with energies E_1 and E_2 over an interval ∆t; the slope dE/dN gives P_ave and the maximum slope (dE/dN)_max gives P_max.]

Fig. 3 – Energy and power consumption as a function of test length (N, number of test vectors)

Using a PR test pattern typically leads to an almost linear E(N) dependence (shown as (2) in Fig. 3). In reality, the linear E(N) dependence is built out of a staircase dependence, each step accounting for the WSA of a 2-vector transition (Ei
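Assuming a cumulative energy profile E(N) sampled once per applied vector, the staircase view above suggests a simple way to estimate P_ave and P_max from the per-vector energy steps; the profile values and the function name below are illustrative, not from the paper.

```python
# Sketch: average and peak power from a cumulative energy profile E(N),
# one sample per applied vector, with clock period t_ck. Values illustrative.

def power_from_profile(energy, t_ck: float):
    """P_ave = total energy / total time; P_max = largest per-cycle step / t_ck."""
    steps = [b - a for a, b in zip(energy, energy[1:])]  # per-vector energy steps
    p_ave = (energy[-1] - energy[0]) / (t_ck * len(steps))
    p_max = max(steps) / t_ck
    return p_ave, p_max

# Cumulative energy (J) after each of 4 vectors, with a 10 ns clock.
p_ave, p_max = power_from_profile([0.0, 1e-9, 3e-9, 4e-9, 8e-9], 10e-9)
```

With this toy profile, the session averages 0.2 W while the worst single transition draws 0.4 W, the kind of gap between P_ave and P_max that Fig. 3 illustrates.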