Automatic Generation of Fault Tolerant VHDL

Automatic Generation of Fault Tolerant VHDL Designs in RTL
Luis Entrena, Celia López, Emilio Olías
Electronic Technology Area. Universidad Carlos III de Madrid
Avda. de la Universidad, 30. 28911 Leganés (Madrid) Spain
{entrena, celia, olias}@ing.uc3m.es

Abstract

Fault Tolerance (F-T) is an important issue in electronic devices. Detecting and even correcting internal faults during normal operation makes it possible to use these circuits in critical applications. F-T has been taken into account for many years in the design process of these applications, but it has not benefited from the latest advances in automatic CAD tools that optimise the design process. Inserting fault-tolerant structures into a circuit has therefore been treated as a heavy task external to the automatic design process. To improve productivity and reduce development time, an automatic CAD tool that helps in the development of fault-tolerant circuits is needed. In this paper we propose a new tool for the automatic insertion of fault-tolerant structures into a synthesizable HDL description of the design. With this tool, F-T can be included in any design process with little extra cost or development time: the tool automatically produces a fault-tolerant design according to user specifications, also described in an HDL, which can be simulated and synthesised with commercial tools. Examples are shown to demonstrate the capabilities of this approach.

1. Introduction

A Fault Tolerant system is one that can continue to correctly perform its specified tasks in the presence of hardware failures. Fault Tolerance (F-T) is the attribute that enables a system to achieve fault-tolerant operation [1]. In applications where an error in system performance is unacceptable, F-T is mandatory. In the development of a critical application, designers have to consider the possibility of including F-T structures in the circuit. Nowadays these techniques have to be inserted in the design descriptions manually, because no CAD tools are available for this task. The relatively small number of Fault Tolerant applications has not made the development of tools specific to the design of fault-tolerant circuits attractive to CAD vendors. The insertion of Fault Tolerant techniques requires large design efforts that negatively affect design productivity and time-to-market.

The design of a fault tolerant circuit is usually accomplished by introducing fault-tolerant structures in a previously designed circuit. Within the current HDL-based methodologies, the insertion of fault-tolerant structures is usually performed by manually modifying the HDL code in order to insert hardware redundancy, information redundancy or time redundancy at critical points in the circuit. The modified, fault-tolerant design then follows a similar design flow consisting of automatic logic synthesis and place and route.

In this paper an automatic tool for developing Fault Tolerant Digital Integrated Circuits at the RT abstraction level with VHDL is presented. This tool will help designers increase the design productivity of fault tolerant circuits. First, FT techniques are inserted automatically into the RT-level design description, and a VHDL description is given back. Secondly, automatic validation of the resulting VHDL code allows the level of FT achieved in the design to be evaluated. By iterating between the insertion of fault-tolerant structures and the evaluation of the fault tolerance achieved, a much deeper exploration of the design space is enabled early in the design cycle. Finally, the proposed tool will provide synthesis guidelines to existing CAD tools for automatic synthesis, in order to maintain the FT of the description. The proposed techniques are being developed in the AMATISTA project along with a fault injection and simulation tool.

The remainder of the paper is organised as follows. Section 2 presents an introduction to FT techniques. Section 3 describes the general scheme of the fault-tolerance insertion tool. Section 4 shows the results obtained with hardware and information redundancy insertion. Finally, Section 5 presents some conclusions.

2. Introduction to Fault Tolerance Techniques

F-T techniques rely on the concept of redundancy. Redundancy is the addition of resources, time or information beyond what is needed for normal system operation. Redundancy can take several forms [1]:

1. Hardware redundancy is the addition of extra hardware, in order to detect or tolerate faults.
2. Information redundancy is the addition of extra information to implement a given function.
3. Time redundancy uses additional time to perform system functions.

2.1 Hardware redundancy

There are three basic forms of hardware redundancy: passive, active and hybrid. Passive techniques use the concept of fault masking to hide the occurrence of faults and prevent them from resulting in errors. Active techniques use fault detection, fault location and fault recovery to achieve FT: they detect the existence of a fault and perform some action to return the circuit to correct operation within a satisfactory length of time. Hybrid techniques combine the features of the previous approaches. Fault masking is used in hybrid systems to prevent erroneous results from being generated, while fault detection, location and recovery are used to improve FT by removing faulty hardware and replacing it with spares (redundant extra elements in the circuit that are not needed until another element becomes faulty).

Passive hardware redundancy uses voting mechanisms to mask the occurrence of faults. Most passive approaches are developed around the concept of majority voting. For example, Triple Modular Redundancy (TMR) triplicates the hardware modules and performs a majority vote to determine the output of the circuit. If one of the modules becomes faulty, the two remaining fault-free modules mask the results of the failed element when the vote is performed. More generally, N-Modular Redundancy applies the same concept with N replicas of the hardware modules.

Active hardware redundancy techniques, as stated above, attempt to achieve FT by fault detection, fault location and fault recovery. No fault masking is achieved. Consequently, active approaches are most common in applications that can tolerate temporary erroneous results, as long as the system is able to reconfigure and regain its operational status within a satisfactory length of time. For example, Duplication with Comparison is a typical technique for achieving FT with active hardware redundancy. The basic concept of this technique is to develop two identical pieces of hardware that perform the same computations in parallel, and to compare the results of those computations. In the event of a disagreement, an error signal is generated. In its most basic form, duplication can only detect the existence of faults, not tolerate them, because there is no method for determining which of the two modules is faulty.

Another technique for active hardware redundancy is Standby Sparing, where each module has a fault detection capability and only one module is operational at a particular time. If the on-line module becomes faulty, it is removed from operation and replaced with a spare module [1].
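The majority vote used in TMR can be sketched in VHDL as follows. This is only an illustrative sketch, not a component taken from the tool described in this paper: the entity name tmr_voter, the generic WIDTH and the port names are assumptions chosen for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical bitwise 2-out-of-3 majority voter for TMR.
-- Names and the default width are assumptions made for this sketch.
entity tmr_voter is
  generic (WIDTH : positive := 8);
  port (
    a, b, c : in  std_logic_vector(WIDTH - 1 downto 0);  -- outputs of the three replicas
    y       : out std_logic_vector(WIDTH - 1 downto 0)   -- voted result
  );
end entity tmr_voter;

architecture rtl of tmr_voter is
begin
  -- Each output bit takes the value present on at least two of the three inputs,
  -- so a fault confined to one replica is masked.
  y <= (a and b) or (a and c) or (b and c);
end architecture rtl;
```

Note that the voter itself is a single point of failure in this basic form; whether and how the voter is protected is a separate design decision.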

2.2 Information redundancy

Information redundancy is the addition of redundant information to data to allow fault detection or fault masking. An error detecting code is a specific representation that allows errors introduced into a code word to be detected, while an error correcting code is a specific representation that allows errors introduced into a code word to be corrected [3], [4], [5].

Some of the most commonly used error detecting codes are parity codes and duplication or dual-rail codes. Single-bit parity codes require the addition of an extra bit to a binary word such that the resulting code word has either an even number of 1s or an odd number of 1s. Consequently, a single-bit error can be detected by checking the number of 1s in the code word. Duplication codes are based on the concept of completely duplicating the original information to form the code word. A variation on this basic approach is complemented duplication (Dual-Rail code). Another type of error detecting code is the unordered code, which is capable of detecting multiple errors. The most interesting unordered codes are the m-out-of-n code and the Berger code.

On the other hand, Arithmetic Codes are very useful when it is desired to check arithmetic operations such as addition, multiplication and division. Examples of arithmetic codes are AN Codes, Residue Codes, Cyclic Codes and Hamming Error-Correcting Codes [6], [7]. Finally, the most widely used error correcting codes are the Hamming Error Correcting Codes.
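The single-bit even parity code described above can be illustrated with a small VHDL package. This is a minimal sketch under stated assumptions: the package name parity_sketch and the function names xor_all, encode_even and parity_error are hypothetical and do not come from the tool or its library, and the choice of placing the parity bit in the most significant position is arbitrary.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package illustrating single-bit even parity.
package parity_sketch is
  -- XOR of all bits of v: '1' when v contains an odd number of 1s.
  function xor_all(v : std_logic_vector) return std_logic;
  -- Code word = parity bit (as the leftmost bit) followed by the data word.
  function encode_even(d : std_logic_vector) return std_logic_vector;
  -- '1' when the code word violates even parity.
  function parity_error(cw : std_logic_vector) return std_logic;
end package parity_sketch;

package body parity_sketch is
  function xor_all(v : std_logic_vector) return std_logic is
    variable r : std_logic := '0';
  begin
    for i in v'range loop
      r := r xor v(i);
    end loop;
    return r;
  end function xor_all;

  function encode_even(d : std_logic_vector) return std_logic_vector is
  begin
    -- The added bit makes the total number of 1s in the code word even.
    return xor_all(d) & d;
  end function encode_even;

  function parity_error(cw : std_logic_vector) return std_logic is
  begin
    -- A valid even-parity code word XORs to '0'; any single-bit flip gives '1'.
    return xor_all(cw);
  end function parity_error;
end package body parity_sketch;
```

For instance, the data word "1011" contains three 1s, so encode_even prepends a '1' and the code word "11011" contains four. Flipping any single bit of that code word makes parity_error return '1', while a double-bit error goes undetected, which is the known limitation of single-bit parity.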

3. Automatic modification of HDL code

The main objective of the proposed fault-tolerance insertion (FTI) tool is to generate a fault tolerant VHDL design description. The designer provides an original VHDL design description and some guidelines about the type of FT techniques to be used and their location in the design. The FTI tool processes the original VHDL descriptions [8], so a unified format for dealing with these descriptions is needed. There are several Intermediate Formats that represent the VHDL description formally by means of a database, which can be accessed and processed through a Procedural Interface. Within the AMATISTA project, the FTL Systems Intermediate Format (AIRE/CE® [9], [10], [11]) and the TAURI® tool have been selected. Given its intended application, the FTI tool works only with synthesisable descriptions, IEEE 1076, [12].

[Figure 1: block diagram. Block labels: User interface; FT insertion tool; Fault Tolerance Insertion Procedures; VHDL input file; VHDL output file; WORK; FT Library; To other tools.]

Figure 1. Scheme of the Fault-Tolerant Insertion tool

The fault tolerant components to be included in the original VHDL descriptions are already described and stored in a special library called the FT library. These components come from previous research on FT. The designer does not need to develop the elements of this library, only to use them. In this way, users do not need to be experts in developing FT techniques, only in including them in their VHDL designs. Examples of these components are encoders, decoders, checkers, majority voters, etc. [13]. Self-checking designs of particular modules (e.g., typical data path components) may also be included. These components are inserted in the design as required by the FT techniques to be applied. Designers should be familiar with the hardware elements to be modified or substituted when applying FT techniques (basically memory elements, operators, etc.) and with the possibilities of modification (such as single elements, input cones, etc.). These aspects will be detailed in the next section.
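As an illustration of the kind of component such a library might hold, the following sketch shows a checker for Duplication with Comparison. It is an assumption made for this example only: the entity name dwc_checker, its generic and its ports are hypothetical and are not taken from the actual FT library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical FT-library checker: compares the outputs of two replicas
-- of the same hardware and flags any disagreement.
entity dwc_checker is
  generic (WIDTH : positive := 8);  -- assumed data width
  port (
    a, b : in  std_logic_vector(WIDTH - 1 downto 0);  -- outputs of the two replicas
    err  : out std_logic                              -- '1' when the replicas disagree
  );
end entity dwc_checker;

architecture rtl of dwc_checker is
begin
  -- Detection only: the checker cannot tell which replica is faulty.
  err <= '0' when a = b else '1';
end architecture rtl;
```

A parameterised component of this sort could be instantiated wherever a replicated element is inserted, with WIDTH set to the width of the duplicated data.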

The application of the FT Insertion Tool follows the scheme shown in Figure 1. In this scheme, the initial HDL design is first analysed and elaborated. Through a user interface, the user may apply fault-tolerance techniques at selected points in the design in a step-by-step fashion. At each step, a transformation is specified by selecting a target (a piece of hardware and its associated code) and a technique to be applied to the target. The descriptions of the components needed for the various fault-tolerance techniques are available in a component library named FT-library. The final fault-tolerant design can then be synthesized or downloaded to a new VHDL file, which will be the input for simulation or synthesis with other tools.

It is therefore possible to download a modified VHDL version of the fault-tolerant design. This option provides several advantages:

• It allows performing behavioural simulation of the modified design, in order to validate the correct behaviour of the design after the modifications performed. This saves simulation time, because behavioural simulation is far more efficient than gate level simulation.
• The user can recognize the modifications performed in the code, which gives higher confidence in the modification process. Manual modifications are still possible, if needed. While learning the tool, the user may compare the results obtained with the automatic and the manual modification processes.
• The proposed approach makes automatic insertion of fault-tolerant structures compatible with any design environment supporting an HDL input. Considering that one of the main reasons for the lack of CAD tools in the area of FT has been the relatively small number of users, this is an interesting advantage, as it can reach the entire design community independently of the particular design environment used.

The proposed approach allows full use of the synthesis capabilities provided by commercial tools. Components that have been designed manually according to specific techniques can be incorporated in the FT-library and inserted under the general insertion mechanisms of the tool.

4. Automatic insertion of hardware and information redundancy

As stated previously, adding some sort of redundancy is the way to increase the FT of a circuit. In the FTI Tool, hardware and information redundancy techniques are applied, according to the user's specifications, with the purpose of detecting or tolerating faults. Although the two methods are applied quite differently, the initial conditions, some pre-processing tasks and the application methodology are common to both techniques. This section explains all these aspects and shows examples of VHDL code and the inferred hardware.

The initial conditions, already stated in the previous section, are the use of an intermediate format for VHDL descriptions (TAURI® [9]) and the restriction to synthesisable VHDL descriptions. The pre-processing tasks generate information about the type and number of clocks in the description, the fan-in and fan-out trees, the location of inferred flip-flops, sequential and combinational logic, etc. This preliminary analysis can be considered a simplified elaboration for synthesis.

Once this pre-processing is done, users can specify the FT technique to be inserted in the description. Designers should provide three main input data to the tool: the element in the design to be modified, the FT technique to be inserted and the fault recovery actions. These data are detailed in the following paragraphs.

• Element in the design to be modified
The reference to the element can be the data itself (a VHDL object: signal, port) or the assignment statement (a VHDL statement, identified by its label or line number). Depending on the technique to be applied, this selection may imply many changes in the original design. With hardware redundancy, as shown below, it may be necessary to select not only a single element to replicate, but also its input cone. With information redundancy, on the other hand, the selection of elements is more rigid: the designer can select only single elements at which to insert encoders, decoders, etc. Modifying a whole module to deal with encoded data should be done carefully because of feedback paths and operations.

• Fault tolerance technique to be inserted
Here we should distinguish between Hardware Redundancy and Information Redundancy.

Hardware Redundancy generally involves the replication of VHDL declarations and statements, as well as new declarations or statements for the replicated elements. The final value (itself fault tolerant) must also be generated from the different outputs of the replicated elements. This is done by a function or a component that "decides" the resulting value, e.g. a voter mechanism for the TMR technique. Element selection can be done in two ways: single selection (e.g. a flip-flop to triplicate in some NMR technique) or input-cone selection, where the input logic to the element is also selected. For example, in Code 1, whose inferred hardware is shown in Figure 2, the input cone for the data entering the Result register consists of all logic elements back to the previous registers, while single element selection refers only to the output of the adder block. These differences are shown in Figure 3.

Information Redundancy involves the modification of data types and operators in order to process encoded information. Therefore, encoders, decoders and checkers are inserted in the original VHDL description. The encoding of a portion of the dataflow can be obtained through a composition of the four basic operations:

• Insert an encoder at a selected point,
• Insert a decoder/checker at a selected point,
• Substitute an existing operator by the equivalent operator that operates on coded data,
• Extend the size of selected data to accommodate extra bits required by the redundant code used.

architecture ORIG of DEMO1 is
  signal A_Reg : std_logic_vector(7 downto 0);
  signal B_Reg : std_logic_vector(7 downto 0);
begin
  process(Clk, Reset)
  begin
    if Reset = '0' then
      Result <= (others => '0');
      A_Reg <= (others => '0');
      B_Reg <= (others => '0');
    elsif Clk'event and Clk = '1' then
      if Enable = '1' then
        A_Reg