2011 29th IEEE VLSI Test Symposium

Programmable Extended SEC-DED Codes for Memory Errors

Valentin Gherman, Samuel Evain, Fabrice Auzanneau, Yannick Bonhomme
CEA, LIST, Embedded Systems Reliability Laboratory, Point Courrier 94, 91191 Gif-sur-Yvette CEDEX, FRANCE
[email protected]

978-1-61284-656-9/11/$26.00 ©2011 IEEE

Abstract: Redundant memory columns are an essential ingredient of memory design for yield and reliability. They are used either as spare columns for the replacement of completely defective regular columns or to store check-bits for error detection and correction codes. Column replacement can also mask isolated malfunctioning storage cells. Unfortunately, the number of columns with defective storage cells that can be masked in this way cannot exceed the number of spare columns, which is usually quite low. Here, we propose a way to increase the capacity of masking memory columns with isolated defective storage cells using spare memory columns. For this purpose, single error correction and double error detection (SEC-DED) codes already available for the protection against soft errors are extended such that all double-bit errors which affect a fixed subset of bit positions in the code words can be corrected. The cardinality of this subset is significantly higher than the number of spare columns. A bit-swapper is employed to map the bit positions that are protected by the extended SEC-DED code against double-bit errors to the memory columns with defective storage cells. In this way, single-bit soft errors affecting any bit position can be corrected simultaneously with single-bit hard errors induced by any subset of memory columns. The bit-swapper can be dynamically reconfigured based on status information that designates the memory columns with defective storage cells. This facilitates the integration into built-in self-repair (BISR) schemes.

Keywords: yield; memory repair; BISR; error correction

I. INTRODUCTION

Manufacturing and wear-out induced defects are identified as major threats to the yield and the reliability of memories produced with advanced scaled-CMOS technologies [2][5]. In parallel, soft-error rates at chip and system levels remain essentially unchanged or increase, as is the case with SRAM memories [1]. Memory protection against soft errors is usually ensured with the help of single error correction and double error detection (SEC-DED) codes [4][9]. In such cases, redundant memory columns are necessary to store the check-bits of the SEC-DED codes. Additional redundant memory columns, called spare columns, are required to replace completely malfunctioning regular columns affected by manufacturing or wear-out induced defects [10][13][17][19]. In memory units with a large number of banks, the majority of banks will not have completely defective columns and the available spare columns can be used to mask isolated defective storage cells [7]. In this way, one can reduce the pressure on other memory repair strategies that can handle defective storage cells, such as the employment of spare words [17][19].

The number of single-bit hard errors that can be masked with memory column replacement grows linearly with the number of bits in each memory word that can be stored in spare columns [10][13][17][19], that is, with the number of spare columns that can intersect a memory word. Recently, programmable restricted error correction codes have been introduced to enable the correction of an exponential number of single-bit hard errors with respect to the number of spare columns per memory word [7]. Unfortunately, this method is not appropriate for a memory that is already protected by an error detection and correction code.

In this paper, a memory protection scheme is proposed based on the extension of a systematic SEC-DED code with a number of check-bits equal to the number of spare columns available in a memory bank. This extension enables the correction of all double-bit errors that affect at least one bit position from a fixed subset of bit positions in the extended code words. Any double-bit error in which these bit positions are not involved remains detectable. The cardinality of the subset of better protected bit positions is significantly higher than the number of spare columns per memory word. Any single-bit hard error affecting these bit positions can be corrected simultaneously with any single-bit soft error. The requirement is that at most one of the storage cells that hold the bits of each code word is defective.

With the proposed extended SEC-DED (E-SEC-DED) code, a single spare column per memory word is enough to mask out two different single-bit hard errors instead of only one, as is the case with column replacement techniques [10][13][17][19]. Furthermore, spare memory columns with defective storage cells can still be used to mask out defective storage cells in regular memory columns as long as each memory word contains at most one defective storage cell. The proposed E-SEC-DED code has a hierarchical structure [8] and can be easily reduced to the original SEC-DED code or to another E-SEC-DED code with fewer supplementary check-bits.

A bit-swapper is proposed to rearrange the bits of the E-SEC-DED code words before they are stored, such that the memory columns with defective storage cells receive bits protected against any double-bit error which might affect them. The bit-swapper can be dynamically programmed based on test result information that indicates the columns with defective storage cells in the accessed memory bank. This further improves the defect masking capacity [16] and facilitates the integration into a memory built-in self-repair (BISR) scheme. In the resulting memory repair scheme, the spare memory columns can be used either for column replacement or to store additional check-bits. A similar approach was also proposed in [6] where, besides column replacement, the redundant columns are employed to reduce the miscorrection probability of multiple-bit soft errors.

A brief review of systematic linear block codes is given in Section II. The E-SEC-DED codes are introduced in Section III. A bit-swapper design and a way to program it are shown in Section IV. Section V presents an adaptation of the proposed scheme for memories that are insensitive to soft errors. The achievements of the paper are summarized in Section VI.

an error has occurred at the ith bit position. This error can be corrected with the expression below:

$$V_i = V'_i \oplus \mathit{BitFlip}_i; \quad 0 \le i < n$$

In the case of SEC-DED codes with even parity, the occurrence of a double-bit error is indicated if the following expression becomes true:

$$\bigoplus_{i=0}^{n-1} V_i = 0 \qquad (3)$$

II. SYSTEMATIC LINEAR BLOCK CODES

The data protected with linear block codes is organized in code words of length n that contain k data-bits and r=n-k check-bits [12][14]. A binary matrix H, also called parity-check matrix, can be defined such that each code word V fulfills the relation below [4]:

$$\bigoplus_{i=0}^{n-1} \left( H_j^i \wedge V_i \right) = 0$$
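As an illustration, the parity-check relation and the correction rule V_i = V'_i ⊕ BitFlip_i can be exercised on a small code. The sketch below assumes a hypothetical (8,4) systematic code whose parity-check matrix has only odd-weight columns; the matrix, the function names and the status strings are illustrative choices and are not taken from the paper. It further assumes that BitFlip_i is set when the syndrome equals column i of H, and that a double-bit error is flagged when the syndrome is non-zero while the overall parity of the read word is still even.

```python
from functools import reduce
from operator import xor

# Hypothetical parity-check matrix: H[j][i] is row j, column i.
# Columns 0..3 hold data-bits, columns 4..7 hold check-bits (identity part).
H = [
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 1],
]
R = len(H)      # number of check-bits r
N = len(H[0])   # code word length n
K = N - R       # number of data-bits k


def encode(data):
    """Append r check-bits so that every row of H XORs to 0 over the code word."""
    checks = [reduce(xor, (H[j][i] & data[i] for i in range(K)), 0)
              for j in range(R)]
    return list(data) + checks


def syndrome(word):
    """Evaluate the parity-check relation on a read word; all zeros for a code word."""
    return [reduce(xor, (H[j][i] & word[i] for i in range(N)), 0)
            for j in range(R)]


def decode(word):
    """Return (word, status), status being 'clean', 'corrected' or 'double'."""
    s = syndrome(word)
    if not any(s):
        return list(word), "clean"
    if reduce(xor, word, 0) == 0:
        # Non-zero syndrome with even overall parity: assumed double-bit error.
        return list(word), "double"
    # BitFlip_i is 1 only where the syndrome equals column i of H.
    bit_flip = [int(all(H[j][i] == s[j] for j in range(R))) for i in range(N)]
    return [v ^ f for v, f in zip(word, bit_flip)], "corrected"


code_word = encode([1, 0, 1, 1])
read_word = list(code_word)
read_word[2] ^= 1               # inject a single-bit error
print(decode(read_word))
```

Because every column of this particular H has odd weight, the XOR of all rows is the all-ones vector, so each code word has even overall parity and the parity test on the read word needs no extra check-bit.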

Upon a read operation of a memory word V' previously stored as a code word V, syndrome bits Sj (0≤j