A Constructive Neural Network Approach for Simulation and Identification of Nonlinear Dynamic Systems

Jin-Song Pei, Member, ASCE, Assistant Professor, School of Civil Engineering and Environmental Science, University of Oklahoma, 202 W. Boyd, Room 327E, Norman, OK 73019-1024, Phone: (405) 325-4272, Fax: (405) 325-4217, E-mail: [email protected]

Andrew W. Smyth, Member, ASCE, Associate Professor, Dept. of Civil Engineering and Engineering Mechanics, Columbia University, 610 S.W. Mudd Bldg., New York, NY 10027-6699, Phone: (212) 854-3369, Fax: (212) 854-6267, E-mail: [email protected]

Joseph P. Wright, Principal, Applied Science Division, Weidlinger Associates Inc., 375 Hudson Street, 12th Floor, New York, NY 10014-3656, Phone: (212) 367-3084, Fax: (212) 367-3030, E-mail: [email protected]

Abstract

Neural networks are well-known universal approximators in function approximation but have been treated more or less as "black boxes" in engineering practice. In system identification for structural health monitoring, as in many other fields, neural networks have been adopted as powerful tools for problems that challenge conventional approaches. In particular, they have been treated as a superior fitting tool for any given input-output data set; however, the critical steps of initial network architecture selection (network type, number of layers and hidden nodes) and interpretation of the final trained parameters (weights and biases) are not thoroughly explored or understood, especially in relation to the underlying physics of the system to be modeled. This typifies a dilemma associated with black-box models that are highly adaptive but neither transparent nor parametric. Challenged by this controversial nature of neural networks and driven by the need to apply them to damage detection, the authors have proposed a novel constructive approach to initial network architecture design.
In this study, the focus is on modeling the nonlinear hysteretic behavior typically observed in structural joints subjected to extreme dynamic loads. Algebraic and geometric analysis is carried out to reveal the "mechanisms" of how neural networks work, which in turn serves as practical guidance for designing neural networks. Numerical simulations are presented to demonstrate the efficiency and engineered features of this approach. A training example is provided

to show that this approach enables neural networks to carry some "meaning" (either physical or phenomenological) while remaining flexible and powerful in system identification.

Introduction

Multilayer feedforward neural networks have been studied and applied extensively to the modeling of nonlinear dynamic systems in many fields over recent decades (Rumelhart et al. 1986; Lippmann 1987; Cybenko 1989; Hornik et al. 1989; Hagan et al. 1995; Haykin 1998; Sandberg et al. 2001). Applications of neural networks, especially multilayer feedforward neural networks, in structural engineering and engineering mechanics are as varied as in other fields. They include modeling material behavior (Ghaboussi and Wu 1998), detection of structural damage (Masri et al. 1999; Masri et al. 2000; Sohn et al. 2003), and online tracking of nonlinear restoring forces (Kosmatopoulos et al. 2001).

The use of neural networks has been a somewhat controversial approach for modeling physical processes. While the "mechanisms" by which neural networks can successfully model a variety of real processes have yet to be fully established, many successful implementations and applications are available in different fields, especially for problems that challenge conventional approaches. Thus neural networks remain powerful but somewhat mysterious "black boxes". While there is a prevailing attitude that, because of the effectiveness of neural networks in applications, researchers should tolerate these less transparent aspects, which differ from those of traditional computational modeling tools, many researchers hesitate to adopt, or even reject, this powerful computational tool. Their major concern is whether neural networks are reliable, given that their effectiveness is not fully understood (just as the brain is not fully understood).
A lack of a clearly presented relationship between this soft computing tool and conventional hard computing tools may also contribute to the limited acceptance of neural network approaches in general. When multilayer feedforward neural networks are applied to engineering problems, networks are normally trained just to obtain the best fit for a given input-output training data set. Various subjective design issues have been encountered and identified, though analytic solutions are scarce and empirical approaches are normally presented as solutions. The main features, as well as problems, of commonly seen neural network approaches can be summarized as follows:

Initial Design

The initial design of the network architecture usually poses a major challenge. The physical law governing the problem which the neural network is used to solve is not used to guide the formulation of the network architecture, and the mathematical aspects of the problem do not suggest how to set it up. The initial design is therefore an issue of neither physics nor mathematics; it is normally decided based on empirical experience with neural networks or simply by trial and error. For example, there is no solid rule for deciding the number of hidden nodes in a multilayer feedforward neural network.

Training Procedure

The training procedure is straightforward for the commonly used backpropagation implementation. The weights and biases in all the layers are left to self-adapt from their initial values so as to minimize the performance index according to a training algorithm. This self-adaptation gives the neural network its power of learning and problem solving. Various numerical schemes have been developed to speed up training (especially to address the slow convergence of backpropagation), to arrange proper initialization, and to avoid local minima. Empirical approaches are also available to address practical concerns such as overfitting.

Results Interpretation

The final results, i.e., the trained weights and biases, usually have little or no physical interpretation, just like the network architecture itself. This is not desirable for real-world problems. The neural network approach applied in system identification for structural health monitoring, for example, is classified as a nonparametric approach because it is hard to relate identified weights and biases to physical quantities. Multilayer feedforward neural networks also exhibit a commonly encountered nonuniqueness: different initial values of the weights and biases lead to different trained results. This can place researchers in a dilemma, especially when they have to choose a single solution or compare the solutions corresponding to different systems. Returning to the example above, this nonuniqueness makes the neural network approach difficult to apply to damage detection; variations in trained weights and biases obtained from different training data sets can hardly be related to changes in the system (Masri et al. 2000).
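The batch self-adaptation described under Training Procedure can be sketched as a minimal gradient-descent loop for a one-hidden-layer logistic-sigmoid network. This is a simplified stand-in for a full backpropagation implementation; the target function, node count and learning rate are illustrative assumptions, not values from the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_batch(p, g, n_h=6, lr=0.5, epochs=3000, seed=0):
    """Batch gradient descent for z(p) = sum_j w2[j]*S(w1[j]*p - b[j]).

    All weights and biases self-adapt from random initial values; the
    parameters are updated only after the whole training set is seen."""
    rng = np.random.default_rng(seed)
    w1, b, w2 = (rng.normal(size=n_h) for _ in range(3))
    n, hist = len(p), []
    for _ in range(epochs):
        a = sigmoid(np.outer(p, w1) - b)          # hidden activations, (n, n_h)
        e = a @ w2 - g                            # output error
        hist.append(np.mean(e ** 2))              # MSE performance index
        d = (e[:, None] * w2) * a * (1.0 - a)     # backpropagated sensitivities
        w2 -= lr * (a.T @ e) / n                  # batch updates
        w1 -= lr * (d * p[:, None]).sum(axis=0) / n
        b  += lr * d.sum(axis=0) / n              # note: dL/db = -sum(d)/n
    return w1, b, w2, hist

p = np.linspace(-2.0, 2.0, 41)
g = np.tanh(2.0 * p)                              # illustrative smooth target
w1, b, w2, hist = train_batch(p, g)               # MSE decreases over the epochs
```

The random initialization in this sketch is exactly what produces the nonuniqueness discussed next: different seeds yield different trained parameter sets of comparable fit quality.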
Overall, in-depth studies are needed on the first and third aspects above, namely initial network architecture design and result interpretation, especially when compared with the considerable body of literature on improving the training process of multilayer feedforward neural networks. These under-addressed issues are practical as well as fundamental and cannot be avoided. To bridge the gap, the authors explore the geometric capabilities of linear sums of sigmoidal functions and then demonstrate how to form the architecture of a multilayer feedforward neural network based on the mathematical formulation and/or geometric features of the problem rather than on empirical guesswork. The initial weights and biases decided accordingly will be close to having geometric and/or mathematical and/or phenomenological "meaning", and the final weights and biases trained using established training algorithms will thus be close to having a parametric or phenomenological meaning. All of these investigations are carried out by taking the force-state mapping problem (Masri and Caughey 1979; O'Donnell and Crawley 1985; Al-Hadid and Wright 1989; Benedettini et al. 1995) as an example.
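Since the force-state mapping problem recurs throughout, a brief sketch may help: for a SDOF system m·ẍ + r(x, ẋ) = f(t) with known mass, measured excitation and measured response, the restoring force follows directly as r = f − m·ẍ and is paired with the states (x, ẋ). The linear oscillator, parameter values and Runge-Kutta integrator below are illustrative assumptions, not from the paper:

```python
import numpy as np

m, c, k = 1.0, 0.1, 4.0                       # illustrative mass, damping, stiffness
f = lambda t: np.sin(0.5 * t)                 # known excitation

def accel(t, x, v):
    """Stand-in for 'measured' acceleration of m*x'' + c*x' + k*x = f(t)."""
    return (f(t) - c * v - k * x) / m

# integrate the equation of motion (4th-order Runge-Kutta) to get the response
dt, n = 0.01, 5000
t = np.arange(n) * dt
x, v = np.zeros(n), np.zeros(n)
for i in range(n - 1):
    k1x, k1v = v[i], accel(t[i], x[i], v[i])
    k2x, k2v = v[i] + 0.5*dt*k1v, accel(t[i] + 0.5*dt, x[i] + 0.5*dt*k1x, v[i] + 0.5*dt*k1v)
    k3x, k3v = v[i] + 0.5*dt*k2v, accel(t[i] + 0.5*dt, x[i] + 0.5*dt*k2x, v[i] + 0.5*dt*k2v)
    k4x, k4v = v[i] + dt*k3v, accel(t[i] + dt, x[i] + dt*k3x, v[i] + dt*k3v)
    x[i+1] = x[i] + dt * (k1x + 2*k2x + 2*k3x + k4x) / 6.0
    v[i+1] = v[i] + dt * (k1v + 2*k2v + 2*k3v + k4v) / 6.0

# force-state mapping: restoring force from mass, excitation and acceleration
r = f(t) - m * accel(t, x, v)                 # (x, v, r) triples form the data set
```

By construction the recovered r equals c·ẋ + k·x here; for a real structure r is unknown, and the (x, ẋ, r) triples become the data to which a surface is fitted.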


Universal Approximator Theorem

Early work exploring how neural networks work can be found in pattern classification (Lippmann 1987) and its review in many textbooks (Hagan et al. 1995; Gurney 1997; Haykin 1998). In these works, the interpretation of the network weights and biases (especially their geometric interpretation) and the relation of the design of multilayer feedforward perceptrons to the nature of the problem (i.e., classifying a pattern space into different decision hyperplanes) have provided researchers with insight into how neural networks work. There is also some literature on how recurrent neural networks work with regard to nonlinear dynamic system modeling (Nelles 2000; Gupta et al. 2003). Unfortunately, this kind of insightful exploration is not commonly found in applications of the most widely adopted neural networks, the multilayer feedforward neural networks, for function approximation.

The universal approximator theorem (Cybenko 1989; Hornik et al. 1989) states that a finite linear sum of continuous sigmoidal functions can approximate any continuous function to any desired degree of accuracy. For a given continuous scalar target function g(p), where the vector p has dimension n_p, there is a scalar output function z(p) in linear summation form,

    z(p) = Σ_{j=1}^{n_h} w_{2,j} S(w_{1,j} p − b_j)    (1)

such that |z(p) − g(p)| < ε for all p, where ε is an arbitrarily small number. Here S is any continuous sigmoidal function, i.e., S(x) → 1 as x → +∞ and S(x) → 0 as x → −∞, and w_{1,j} (a vector of dimension n_p), w_{2,j} (a scalar) and b_j (another scalar) denote weights and biases (j = 1, ..., n_h). Note that n_h is arbitrary in the theorem. Application of this theorem to neural networks is straightforward: a feedforward network with one hidden layer and an arbitrary number of continuous sigmoidal nodes is a universal approximator, provided that no constraints are placed on the number of nodes n_h in Eq. (1) or on the size of the weights. Various types of sigmoidal function S can be chosen a priori. Among the most popular candidates, the logistic sigmoidal function of a single scalar variable x is defined as S(x) = 1/(1 + e^(−x)). By introducing a new variable/input p through x = wp − b, where w is the weight and b the bias, another function h can be defined as h(p) = 1/(1 + e^(−(wp−b))).

After being assigned initial values, all the weights and biases w_{1,j}, w_{2,j} and b_j in a multilayer feedforward neural network can be trained using a training data set composed of p and g(p) pairs. However, no guidance is given on how to choose the number of hidden nodes n_h, which varies from problem to problem and, of course, also depends on the accuracy required. Therefore, although it settles the question of feasibility (existence), the theorem is not constructive with regard to the network set-up (Lippmann 1989). Publications on corresponding constructive set-up approaches do exist; summaries of some of these works can be found in the literature (Lehtokangas 1999; Yam and Chow 2001; Ma and Khorasani 2004). As previously discussed, designing the hidden layer(s), i.e., deciding the number of hidden nodes n_h and

the values of the initial weights and biases, remains a practical challenge due to the complex and subjective nature of this issue.

Proposed Neural Network Approach under Force-State Mapping

To meet the above-mentioned challenge, a new approach (Pei 2001; Pei and Smyth 2004b; Pei and Smyth 2004a; Pei et al. 2004) has been tested on the modeling of nonlinear hysteretic restoring forces, an important class of problems in mechanics and many engineering applications. Efforts are made in the initial network architecture design to 1) obtain a more rational starting point for neural network training, i.e., by determining how many hidden nodes are needed to approximate certain types of nonlinear restoring force functions in a multilayer feedforward neural network, and 2) obtain a more meaningful set of final trained weights and biases by defining initial values for the weights and biases. The force-state mapping formulation (Masri and Caughey 1979) has been adopted; the limitations and usefulness of this formulation in modeling nonlinear hysteretic restoring forces, especially memory-associated effects, have been discussed in detail (Worden and Tomlinson 2001; Masri et al. 2004). In principle, fitting a restoring force surface of a SDOF system in a state space can be carried out using a neural network with one hidden layer, as in other function approximation problems, defined by the following mathematical expression:

    r(x, ẋ) ≈ r̂(p1, p2) = Σ_{j=1}^{n_h} w_{2,j} h_j(p1, p2)    (2)

where

    h_j(p1, p2) = 1 / (1 + e^(−(w_{11,j} p1 + w_{12,j} p2 − b_j)))    (3)

where r and r̂ stand for the exact and approximated restoring forces, while p1 and p2 refer to the scaled displacement x and velocity ẋ, respectively, for a SDOF system. The scaling issue has been discussed elsewhere (Pei and Smyth 2004a). A feedforward neural network with more than one hidden layer may also be required for better computational efficiency and, perhaps more importantly, to yield better insight into the modeling behavior.

A constructive approach has been presented (Pei 2001; Pei et al. 2004) for approximating polynomials and mapping polynomial fitting into multilayer feedforward neural networks. Based on a finite linear sum of hyperbolic sigmoidal functions and their Taylor series expansions, the numbers of hidden nodes as well as initial values for the weights and biases are derived to satisfy a certain degree of accuracy. In another study based on a different methodology (Mhaskar 1995; Meade et al. 1996; Meade 2003), derived neural networks were also sought to map polynomials without training. Approximating polynomials is studied first because of its important role in the force-state mapping literature, i.e., Chebyshev and ordinary polynomial fitting (Masri and Caughey 1979; O'Donnell and Crawley 1985; Al-Hadid and Wright 1989; Benedettini et al. 1995). The effectiveness and flexibility of multilayer feedforward neural networks, however, far surpass this minimal proficiency in approximating polynomial-type nonlinearities. In addition to the rigorous algebraic approach used in mapping polynomials, a qualitative geometric analysis has been carried out to study the capabilities of linear sums of sigmoidal functions specifically for the force-state mapping problem (Pei 2001; Pei and Smyth 2004b). The geometric features, in terms of 2-D contours of various restoring force surfaces, are examined first using analytic, simulated and experimental results. Strategies are then developed to mimic these features in a transparent manner by focusing on the number of hidden layers and hidden nodes, as well as the needed values of the weights and biases. It is shown that, as long as the initial weights and biases are selected properly, a sum of a very small number of sigmoidal functions can achieve, in an efficient, flexible and unified manner, what the above-mentioned conventional numerical schemes generally cannot.

Application of Proposed Approach in Simulation and Identification of Nonlinear Hysteretic Restoring Forces

Some simplified typical features of nonlinear hysteretic restoring force surfaces for SDOF systems are simulated and presented in Fig. 1, where single-hidden-layer neural network architectures are adopted. In each case, a linear sum of sigmoidal functions with a selected number of terms and specified values of weights and biases (see Row 1) is used to "form" a restoring force surface (see Row 2 for the 2-D contours of the restoring force surfaces obtained). A synthetic swept-sine excitation force is then applied to excite the SDOF system with the defined restoring force surface. By solving the equation of motion numerically, a discrete response time history in terms of the system states (displacement and velocity) populates the restoring force surface to create a trajectory (see Row 3). The trajectory is then projected to produce restoring force versus displacement plots and phase plots, as in Rows 4 and 5, respectively.
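This simulation procedure can be sketched in a few lines: a one-hidden-layer network of the form of Eqs. (2) and (3) defines the restoring force surface, and the SDOF equation of motion is integrated under a swept-sine excitation. The three hidden nodes below are hand-picked to mimic a Coulomb-friction-like prototype (a sharp sigmoid in velocity, a softening sigmoid in displacement, and a constant node cancelling the offsets); these values, the excitation and the integrator are purely illustrative, not those used for Fig. 1:

```python
import numpy as np

def S(x):
    return 1.0 / (1.0 + np.exp(-x))

# hidden-node weights on (p1, p2) and output weights, per Eqs. (2)-(3)
w11 = np.array([0.0, 1.0, 0.0])    # weights on displacement p1
w12 = np.array([20.0, 0.0, 0.0])   # weights on velocity p2 (sharp: friction-like)
b   = np.array([0.0, 0.0, 0.0])
w2  = np.array([2.0, 4.0, -6.0])   # third node is constant and cancels offsets

def r_hat(p1, p2):
    """Restoring force surface formed by the linear sum of sigmoids."""
    return w2 @ S(w11 * p1 + w12 * p2 - b)

# swept-sine excitation driving x'' + r_hat(x, x') = f(t), unit mass assumed
dt = 0.01
ts = np.arange(0.0, 60.0, dt)
x, v, traj = 0.0, 0.0, []
for t in ts:
    v += (np.sin(0.05 * t**2) - r_hat(x, v)) * dt   # semi-implicit Euler step
    x += v * dt
    traj.append((x, v, r_hat(x, v)))
traj = np.asarray(traj)            # trajectory populating the surface
```

Projecting `traj` onto the (displacement, restoring force) plane yields the kind of hysteresis loop shown in Row 4 of Fig. 1 for the friction-type prototype.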
This numerical simulation procedure follows what has typically been done (Smyth et al. 2002; Masri et al. 2004). The simulated plots, especially the restoring force versus displacement plots, are compared with common simplified nonlinear hysteretic phenomena to illustrate the capability of multilayer feedforward neural networks. Note that none of the neural networks used here for the simulation were obtained through training. Instead, their numbers of hidden nodes and the values of the weights and biases were decided based on the above-mentioned algebraic and geometric study, so as to introduce some of the physical, mathematical and geometric features of the nonlinear surfaces through human judgement rather than leaving these issues entirely to data sets. The "meaning" of the weights and biases can be appreciated from a parametric study run on each case (Pei 2001; Pei and Smyth 2004a) based on the same architecture but with various parameter values. Also note that this simulation task is carried out on a qualitative basis, with a focus on mimicking typical simplified features rather than fitting any specific individual case. Through this exercise, the power and efficiency of the multilayer feedforward neural network over the conventional fitting approaches (either polynomial or non-polynomial) can be observed. In essence, multilayer feedforward neural networks with a small number of hidden layers and hidden nodes are able to capture a wide range of nonlinear hysteretic behaviors given a properly designed architecture and properly selected

[Figure 1 appears here. For each of the five prototype nonlinearities, (a) Duffing, (b) Softening, (c) Saturation, (d) Coulomb and (e) Clearance, its rows show the network architecture, the 2-D contours of the restoring force surface, the response trajectory, the restoring force versus displacement plot, and the phase plot.]

FIG. 1. Neural network architectures with one hidden layer as prototypes for various typical nonlinearities.

values for the weights and biases. This finding is not merely a validation of the feasibility studies (Cybenko 1989; Hornik et al. 1989), which explore what multilayer feedforward neural networks can do; rather, it suggests schematically how to design multilayer feedforward neural networks to approximate a nonlinear function, i.e., a nonlinear restoring force in this context. This exercise is therefore equally important in giving insight into how to design the network architecture and select weights and biases, even though the results here are qualitative. The typical nonlinearities studied herein serve only as examples for use in an identification task; it is suggested to collect more prototypes like these, study the influence of the values of the weights and biases on the surface profile, form the corresponding strategies based on any of the mathematical, physical and geometric features, and store all of them in a library used to provide guidance on how (in terms of architecture design) and where (in terms of initial values of weights and biases) to start neural network training in mapping nonlinear restoring forces. With such a library, it is proposed to use a pre-processing stage to seek guidance on the initial neural network design (perhaps with some additional iterative trials as well), which aims at grasping the global characteristics of the restoring force surface. The training then starts from a clearly defined initial condition that is a product of physical insight and engineering judgement, and is used to fine-tune the surface to reflect localized and/or delicate features with or without increasing the size of the neural network.

Exercising human judgement based on observation of the geometric characteristics of the problem, and then applying strategies following pre-formed general ideas derived from models rather than data, are the defining features of this initial neural network design. This philosophy differs fundamentally from the way neural networks are commonly applied because it closely relates the initial neural network design to the nature of the problem to be modeled. The nonuniqueness of identified models in the neural network approach is well known; that is, given different initial points (corresponding to different sets of initial weights and biases), the final trained results differ even for the same set of training data. The proposed methodology leads to an initial point placed close to a location with some type of physical, geometric or mathematical meaning. Starting from such an initial design, the training of a multilayer feedforward neural network is then used to fine-tune this initial point according to the minimization of some performance index rather than conducting a wide-ranging random search. In other words, the accuracy with which one fits the training data is treated as at least as important as the search for meaningful final trained results; thus, it is expected that, by using an initialization based on a physical interpretation, the final trained results will be closer to having some meaning than with other initialization schemes. This is the key assumption made for the identification problems in this study.

A training example is presented in Fig. 2. In Fig. 2(a), a simulated quadratic velocity damping data set is formed using r(x, ẋ) = 0.04ẋ + 0.04ẋ²·sign(ẋ) + x, excited by a swept-sine excitation f = sin(0.01t² + 0.01t) with a uniform time step t = 0, 0.1, ..., 200.
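The data set just described can be generated in outline as follows; the restoring force and excitation are those given in the text, while unit mass, zero initial conditions and semi-implicit Euler integration are assumptions of this sketch:

```python
import numpy as np

def r_exact(x, v):
    """Quadratic velocity damping plus linear stiffness, as in the text."""
    return 0.04 * v + 0.04 * v**2 * np.sign(v) + x

dt = 0.1
t = np.arange(2001) * dt                     # uniform time step t = 0, 0.1, ..., 200
f = np.sin(0.01 * t**2 + 0.01 * t)           # swept-sine excitation

# integrate x'' + r(x, x') = f(t) and record (displacement, velocity, force) triples
x, v, rows = 0.0, 0.0, []
for fi in f:
    rows.append((x, v, r_exact(x, v)))
    v += (fi - r_exact(x, v)) * dt           # semi-implicit Euler step
    x += v * dt

data = np.asarray(rows)                      # training data for the surface fit
```

The columns of `data` are the (p1, p2, r) samples that the pre-processing stage interpolates into a data-based restoring force surface.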
In the proposed pre-processing stage, the original data set is organized into pairs of restoring force versus displacement and velocity. Numerical interpolation is then carried out to form a data-based (non-analytical) restoring force surface. Since 2-D contour lines characterize a 3-D surface, the contour features of the interpolated surface (see Fig. 2(b)) are examined. Based on these features, Figs. 2(c) to (e) give a step-by-step illustration of how to approximate this problem by visual assessment and simple trial and error. The prototypes in Fig. 1 and the strategies for handling arbitrary parallel contours and some localized features (Pei 2001; Pei and Smyth 2004b) are adopted to decide the neural network architecture and the initial values of the weights and biases. Fig. 2(f) shows the derived neural network architecture used for this fitting problem. Fig. 2(g) shows the simulation results using the surface in Fig. 2(e) and the same swept-sine excitation as above. Fig. 2(h) shows the training process, where four cases are considered separately to validate the idea of including the proposed pre-processing stage. The batch training mode is adopted, meaning the weights and biases are updated only after the entire training set has been applied to the neural network. Throughout the training, the Levenberg-Marquardt algorithm (Demuth and Beale 1998) is adopted because it is considered the most efficient, in terms of convergence rate, among the algorithms used to expedite backpropagation. Cases 1 and 2 both adopt the derived neural network architecture in Fig. 2(f) but with different initial values of the weights and biases: Case 1 is based on the popular Nguyen-Widrow layer initialization method (Nguyen and Widrow 1990) and Case 2 on the proposed pre-processing. In

Cases 3 and 4, it is investigated whether a neural network can perform well using only one hidden layer, with the number of hidden nodes chosen in accordance with the number suggested by the derived two-hidden-layer architecture. While the major benefit of introducing a pre-processing stage is that it provides a clear sense of what to do and how to do it in the network design, the pre-processing stage can also be justified by its improved mean-square-error (MSE) performance. The weights and biases derived from the pre-processing stage (see Case 2 in Fig. 2(h)) not only give the smallest initial MSE among the four cases but also lead to the best performance after about 300 epochs. This indicates the computational merit of the pre-processing stage, although many more examples should be investigated to validate this conclusion. Comparison of the values of the weights and biases between Case 1 and Case 2 (Pei and Smyth 2004a) shows that most of the trained values are close to their corresponding initial values. Since the initial values derived from the pre-processing stage have geometric interpretations, the corresponding trained values are thus also close to having geometric interpretations. This is one of the primary goals of this study.

Discussion and Conclusion

This study has sought to take a step toward a neural network approach with enough physical/mathematical/phenomenological insight to be classified as meaningful, yet still highly adaptive for future development. Compared with existing constructive neural network approaches (Lehtokangas 1999; Yam and Chow 2001; Ma and Khorasani 2004), this study is more specific in providing strategies for important types of nonlinear functions in engineering mechanics, more comprehensive in not being limited to approximating polynomial nonlinearities (Meade et al. 1996), and thus more appropriate for realistic applications, especially in modeling nonlinear hysteretic phenomena.
As a novel approach to neural network training, the proposed pre-processing stage nonetheless faces many challenges (Pei and Smyth 2004a). In addition, a semi-automated procedure could be implemented in this pre-processing stage by classifying the types of nonlinearities to be modeled and providing a programmed strategy for the initial neural network design of each type. Other issues, including the normalization of input variables (i.e., scaling), the distributions of initial values of weights and biases, and the fixity of neural network weights and biases during training, have been identified (Pei 2001; Pei and Smyth 2004a) and are under further investigation. The force-state mapping formulation has been revisited in a novel neural network setting in this study; however, it was selected only as a vehicle in the search for an engineering neural network approach to function approximation. For the SDOF restoring force approximation in Eqs. (2) and (3), the problem is to fit a function of two variables, so that visualization of the problem in a 3-D space is possible. Further study of higher-dimensional function fitting along the same lines can be carried out to directly consider memory effects.


[Figure 2 appears here: (a) exact restoring force versus displacement plot; (b) contour of the exact restoring force surface; (c) plot of f(p2) = h1(p2) − h2(p2) + 1, where w = 1, b = −4 and w = 1, b = 4; (d) contour of u = p1 − f(p2); (e) contour of z*(u) = 50h(u) − 25, where w = 0.075, b = 0; (f) derived neural network; (g) approximated restoring force versus displacement plot based on the restoring surface in (e); (h) time history of the performance index of four neural networks. Case 1: two-hidden-layer architecture with Nguyen-Widrow layer initialization; Case 2: two-hidden-layer architecture with the initialization proposed in this study; Case 3: one-hidden-layer (six hidden nodes) architecture with Nguyen-Widrow layer initialization; Case 4: one-hidden-layer (three hidden nodes) architecture with Nguyen-Widrow layer initialization.]

FIG. 2. A training example starting from the proposed pre-processing.

Acknowledgments

This study was supported in part by the National Science Foundation under SGER CMS-0332350 for the first author and CAREER Award CMS-0134333 for the second author.


REFERENCES

Al-Hadid, M. and Wright, J. (1989). "Developments in the force-state mapping technique for non-linear systems and the extension to the location of non-linear elements in a lumped-parameter system." Mechanical Systems and Signal Processing, 3(3), 269–290.
Benedettini, F., Capecchi, D., and Vestroni, F. (1995). "Identification of hysteretic oscillators under earthquake loading by nonparametric models." ASCE Journal of Engineering Mechanics, 121(5), 606–612.
Cybenko, G. (1989). "Approximation by superpositions of a sigmoidal function." Mathematics of Control, Signals, and Systems, 2, 303–314.
Demuth, H. and Beale, M. (1998). Neural Network Toolbox for Use with MATLAB. The MathWorks, Inc.
Ghaboussi, J. and Wu, X. (1998). "Soft computing with neural networks for engineering applications: Fundamental issues and adaptive approaches." Structural Engineering and Mechanics, 6(8), 955–969.
Gupta, M., Jin, L., and Homma, N. (2003). Static and Dynamic Neural Networks: From Fundamentals to Advanced Theory. Wiley-Interscience.
Gurney, K. (1997). An Introduction to Neural Networks. UCL Press.
Hagan, M., Demuth, H., and Beale, M. (1995). Neural Network Design. PWS Publishing Company.
Haykin, S. (1998). Neural Networks: A Comprehensive Foundation, 2nd ed. Prentice Hall.
Hornik, K., Stinchcombe, M., and White, H. (1989). "Multilayer feedforward networks are universal approximators." Neural Networks, 2, 359–366.
Kosmatopoulos, E., Smyth, A., Masri, S., and Chassiakos, A. (2001). "Robust adaptive neural estimation of restoring forces in nonlinear structures." ASME Journal of Applied Mechanics.
Lehtokangas, M. (1999). "Fast initialization for cascade-correlation learning." IEEE Transactions on Neural Networks, 10(2), 410–414.
Lippmann, R. (1987). "An introduction to computing with neural nets." IEEE ASSP Magazine, 4–22.
Lippmann, R. (1989). "Pattern classification using neural networks." IEEE Communications Magazine, 47–64.
Ma, L. and Khorasani, K. (2004). "New training strategies for constructive neural networks with application to regression problems." Neural Networks, 17, 589–609.
Masri, S., Caffrey, J., Caughey, T., Smyth, A., and Chassiakos, A. (2004). "Identification of the state equation in complex non-linear systems." International Journal of Non-Linear Mechanics, 39, 1111–1127.
Masri, S. and Caughey, T. (1979). "A nonparametric identification technique for nonlinear dynamic problems." Journal of Applied Mechanics, 46, 433–447.
Masri, S., Smyth, A., Chassiakos, A., Caughey, T., and Hunter, N. (2000). "Application of neural networks for detection of changes in nonlinear systems." ASCE Journal of Engineering Mechanics, 126(7), 666–676.
Masri, S., Smyth, A., Chassiakos, A., Nakamura, M., and Caughey, T. (1999). "Training neural networks by adaptive random search techniques." ASCE Journal of Engineering Mechanics, 125(2), 123–132.
Meade, A. (2003). "Regularization of a programmed recurrent artificial neural network." Journal of Guidance, Control, and Dynamics.
Meade, A., Lind, G., and Zeldin, B. (1996). "Feedforward artificial neural network initialization by mathematical models." International Journal of Neural Systems.
Mhaskar, H. (1995). "Neural networks for optimal approximation of smooth and analytic functions." Neural Computation, 8, 164–177.
Nelles, O. (2000). Nonlinear System Identification: From Classical Approaches to Neural Networks and Fuzzy Models. Springer-Verlag.
Nguyen, D. and Widrow, B. (1990). "Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights." Proceedings of the IJCNN, Vol. III, 21–26.
O'Donnell, K. and Crawley, E. (1985). "Identification of nonlinear system parameters in space structure joints using the force-state mapping technique." Report SSL#16-85, MIT Space Systems Laboratory.
Pei, J.-S. (2001). "Parametric and nonparametric identification of nonlinear systems." Ph.D. dissertation, Columbia University.
Pei, J.-S. and Smyth, A. (2004a). "A new approach to design multilayer feedforward neural network architecture in modeling nonlinear restoring forces: Part II - Applications." ASCE Journal of Engineering Mechanics, under review.
Pei, J.-S. and Smyth, A. (2004b). "A new approach to designing multilayer feedforward neural network architecture for modeling nonlinear restoring forces: Part I - Formulation." ASCE Journal of Engineering Mechanics, under review.
Pei, J.-S., Wright, J., and Smyth, A. (2004). "Mapping polynomial fitting into feedforward neural networks for modeling nonlinear dynamic systems and beyond." Computer Methods in Applied Mechanics and Engineering, under review.
Rumelhart, D., McClelland, J., and the PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations. MIT Press.
Sandberg, I., Lo, J., Fancourt, C., Principe, J., Katagiri, S., and Haykin, S. (2001). Nonlinear Dynamical Systems: Feedforward Neural Network Perspectives. Wiley-Interscience.
Smyth, A., Kosmatopoulos, E., Masri, S., Chassiakos, A., and Caughey, T. (2002). "Development of adaptive modeling techniques for nonlinear hysteretic systems." International Journal of Non-Linear Mechanics, 37, 1435–1451.
Sohn, H., Worden, K., and Farrar, C. (2003). "Statistical damage classification under changing environmental and operational conditions." Journal of Intelligent Material Systems and Structures, 13(9), 561–574.
Worden, K. and Tomlinson, G. (2001). Nonlinearity in Structural Dynamics: Detection, Identification and Modelling. Institute of Physics Publishing.
Yam, J. and Chow, T. (2001). "Feedforward networks training speed enhancement by optimal initialization of the synaptic coefficients." IEEE Transactions on Neural Networks, 12(2), 430–434.