ISC2 vol1 part2 - Dicea-Geotechnical Section

Proceedings ISC-2 on Geotechnical and Geophysical Site Characterization, Viana da Fonseca & Mayne (eds.) © 2004 Millpress, Rotterdam, ISBN 90 5966 009 9

Stratigraphic profiling by cluster analysis and fuzzy soil classification from mechanical cone penetration tests J. Facciorusso & M. Uzielli Department of Civil Engineering, University of Florence, Italy

Keywords: geologic uncertainty, stratigraphy, cone penetration testing, clustering, fuzzy

ABSTRACT: Cone penetration testing has gained increasing popularity in geotechnical site characterization due to its speed and economy, and to the good quality of its data in terms of precision, accuracy, repeatability and continuity of measurement when compared to other in-situ tests. In the present paper, cluster analysis and fuzzy soil classification from mechanical cone penetration tests are applied for stratigraphic delineation in the harbor area of the southern Italian town of Gioia Tauro. The main features of the clustering and fuzzy algorithms adopted are described. The results of stratigraphic profiling by cluster analysis and fuzzy classification for a number of soundings are shown and compared; the applicability of the adopted site characterization techniques is assessed through comparison with adjacent borehole logs and standard penetration tests.

1 INTRODUCTION

Cone penetration testing is increasingly employed in geotechnical site characterization due to the precision, accuracy, and repeatability of its output data. Data deriving from mechanical cone penetration testing, however, are affected by larger uncertainties than those from electrical cone penetration and piezocone testing; furthermore, the data sampling interval is greater and the distance between cone-tip resistance and sleeve friction measurements is larger. Nonetheless, in Italy, a considerable number of the databases currently used for important geotechnical analyses, such as liquefaction susceptibility evaluation, consist of data from mechanical cone tests. Thus, it appears advisable to apply different techniques and compare their results to obtain reliable stratigraphic profiling from such data.

In the geotechnical literature, results of cone penetration tests have been interpreted to delineate stratigraphic profiles in one or more of the following ways: 1) visual examination of raw data; 2) empirical soil classification charts; 3) statistical methods; 4) fuzzy soil classification techniques; and 5) neural networks. Here, the applicability of cluster analysis and fuzzy soil classification to mechanical cone penetration test data for stratigraphic profiling is evaluated for the harbor area of the town of Gioia Tauro, in southern Italy; this study also supported the liquefaction risk analysis of the area. A description of the basic features of the algorithms employed is provided, as well as an overview and an assessment of the main results.

2 GEOLOGICAL FEATURES AND SOURCE DATA

The harbor area of Gioia Tauro, in the southern Italian region of Calabria, is located in a flat plain that originates from a depression elongated in a N-S direction and filled by continental sediments of Quaternary age. The plain primarily comprises granular saturated soils in its surficial layers (up to a depth ranging from 50 m to 70 m from ground level), overlying a layer of compacted clays and silty clays of considerable thickness (500 m or more). The bedrock is located at variable depths of 500-600 m (Facciorusso and Vannucchi, 2003). Figure 1 shows a representative cross-section from borehole log data and the locations of CPT and SPT tests. As may be observed, the first 20 meters of the cohesionless deposit (represented by a dashed area), which is the maximum depth commonly investigated for liquefaction risk analyses, include a thick layer of made ground, overlying, with increasing depth:
− Soil A: coarse to medium loose aeolian sands, with a thickness varying between 3 and 5 m;

− Soil B: coarse and medium to coarse sands with polygenic gravels or sandy polygenic alluvial gravels, with a thickness of about 10 m;
− Soil C: medium and fine to medium dense sands, having a thickness ranging from 30 m to 70 m, including a sequence of lenses and thin layers of sands, gravelly sands and fine silty sands; the top of layer C is found at depths ranging from 7 to 19 m from ground surface.

While borehole logs frequently indicate the presence of heterogeneous layers in terms of composition, these are, nevertheless, indicated as single stratigraphic units.

Figure 1 – Representative cross-section of soil stratigraphy in the Gioia Tauro harbor area, with borehole (indicated as S), CPT and SPT (represented by a dashed line) locations. Legend: debris (made ground), Soil A, Soil B, Soil C, Soil D, Soil E.

The water table level is estimated at 2.3 m above sea level, corresponding to depths varying from 0.0 m to 3.2 m below the ground surface. The reference water table is the highest measured at the site (with daily fluctuations of 0.35-0.4 m). Extensive geological and geotechnical surveys (CPT and SPT tests, geotechnical boreholes, laboratory tests, etc.) have been performed in the past to characterize the area, which hosts one of the most important trade port junctions of southern Europe. The results of 6 boreholes, 25 mechanical CPT profiles and 19 SPT profiles were selected for the present study. The maximum investigated depth ranges from 40 m to 91.3 m for boreholes, from 20.5 m to 39.5 m for SPT tests, and from 20.5 m to 72 m for CPT tests. The main characteristics of the CPT tests, such as investigated depth and spatial distance from other in situ investigations, are summarized in Table 1. CPT results in the upper 20 m, in terms of cone tip resistance, qc, and friction ratio, Rf, are plotted in Figure 2. The qc values vary around an average value of 24 MPa, with a maximum value of 44 MPa; the Rf values generally fall below 2%, though a peak of 7% is reached.


Figure 2 – Comprehensive plots of cone tip resistance, qc, and friction ratio, Rf, for the 23 investigated soundings, limited to 20 m below ground level

3 CLUSTER ANALYSIS

In general terms, cluster analysis is the art of finding groups in data showing a certain degree of similarity in their mathematical description. The delineation of stratigraphies based on visual inspection of CPT profiles is a complex, subjective procedure, relying on expert judgement. Previous investigations (Hegazy 1998; Hegazy and Mayne 2002) have shown that the application of clustering to piezocone CPT results may allow for the objective detection of inherent correlations between data, and for the consequent assessment of the stratigraphic profile. Thus, in the context of the less accurate and reliable mechanical CPT testing, an attempt is made herein to identify the presence and location of primary layers (arbitrarily defined as those having a thickness greater than 1 m), secondary layers (having a thickness between 0.5 m and 1 m), lenses, transition layers, soil mixtures, and other features, as defined by Hegazy and Mayne (2002), while excluding all data outliers deriving from accidental or systematic errors, as shown in Figure 3.

The clustering method adopted herein consists of the following succession of operations:
1) the data are arranged into n objects (i.e. each set of measurements performed at each depth investigated during CPT testing) and p attributes (i.e. soil properties directly measured or derived during CPT testing) in an n×p matrix (in which each row represents all the considered properties at the corresponding depth);
2) two or more attributes are chosen to identify the structure present in the data and, if necessary, standardized to avoid their dependence on the choice of measurement units;
3) a set of proximity (distance) values between all possible pairs of objects is stored in an n×n matrix;
4) the objects are grouped in clusters on the basis of mutual distance.
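Operations 1) and 3) above can be illustrated with a minimal Python sketch. It assumes a distance of Minkowski type of order q (q = 2 gives the Euclidean distance); the function names and the three sample objects are hypothetical and are not taken from the study.

```python
def minkowski(a, b, q=2.0):
    """Distance of order q between two objects a and b (q = 2 is Euclidean)."""
    return sum(abs(u - v) ** q for u, v in zip(a, b)) ** (1.0 / q)


def distance_matrix(objects, q=2.0):
    """Symmetric n x n matrix of pairwise distances between the n objects."""
    n = len(objects)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = minkowski(objects[i], objects[j], q)
    return dist


# Hypothetical n x p matrix (n = 3 depths, p = 2 attributes per depth).
objects = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dist = distance_matrix(objects)
```

Storing the full n×n matrix is convenient for short soundings; for very long profiles a condensed (upper-triangular) storage would reduce memory use.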

Clustering can be performed by various algorithms which differ from each other in terms of: (a) the input data (e.g. type and number of variables, standardization method, etc.); (b) the mathematical function adopted to represent the distances between objects (e.g. Euclidean, cosine, etc.); (c) the number of clusters (fixed in “partitioning methods”, variable in “hierarchical methods”); (d) the procedure of grouping objects to generate clusters (merging objects by means of “agglomerative techniques” or splitting them by means of “divisive techniques”); and (e) the definition of the type of distance (e.g. minimum, maximum, average, etc.) between clusters (e.g. Kaufman and Rousseeuw, 1990). In the present study, the selection of the most suitable algorithm to perform clustering analysis was based on the nature of the data (continuous variables), their spatial distribution (generally irregular and widely spaced) and the past, still limited, experiences in the field of geotechnics (Młynarek and Lunne, 1987). A hierarchical agglomerative method was adopted, whereby clusters are merged using the average distance criterion. The algorithm, referred to by the acronym HAMAD, is shown schematically in Figure 4 and detailed hereinafter. The Minkowski distance, selected as the proximity parameter, is given, between the i-th and the j-th object, by:

$$d(i,j) = \left[ \sum_{k=1}^{p} \left( x_{ik} - x_{jk} \right)^{q} \right]^{1/q} \tag{1}$$

where i and j vary between 1 and the number of objects, n; p is the number of variables considered and q is set equal to 2.

Figure 3 – Main definitions for soil categories used in the clustering procedure of mechanical CPT data

Figure 4 – Flow chart of the HAMAD algorithm

The HAMAD algorithm was applied to the normalized cone tip resistance, Qc, and friction ratio, FR:

$$Q_c = \left( \frac{q_c - \sigma_{v0}}{p_a} \right) \left( \frac{p_a}{\sigma'_{v0}} \right)^{N} \tag{2}$$

$$F_R = \frac{f_s}{q_c - \sigma_{v0}} \cdot 100 \tag{3}$$

where pa = 0.1 MPa, σv0 and σ’v0 are the vertical total and effective stresses, respectively, and N is an exponent depending on soil type (Robertson, 1990), ranging from 0.5 (for sands) to 1 (for clays). Qc and FR were selected in place of qc and fs because normalization removes the influence of depth. To avoid dependence on the choice of measurement units, which may have a strong effect on the results of clustering, the two reference variables were subsequently standardized into the unitless input variables to the HAMAD algorithm, X1’ and X2’, using a modified Z-score method:

$$X'_i = \frac{X_i - m_X}{s_X} \tag{4}$$

where Xi is the variable to be standardized (X1 = Qc, X2 = FR) and

$$m_X = \frac{1}{n} \sum_{i=1}^{n} X_i \tag{5}$$

$$s_X = \frac{1}{n} \sum_{i=1}^{n} \left| X_i - m_X \right| \tag{6}$$

are the average and mean absolute deviation, respectively, of the sets of measurements to be standardized (i.e. X1 and X2). Kaufman and Rousseeuw (1990) suggested that using sX instead of the standard deviation (the latter being commonly used in the standard z-score method) allows for a more robust identification and treatment of outliers.

The Minkowski distance was calculated between pairs of standardized objects (X1i, X2i) and (X1j, X2j), measured at depths i and j, respectively, by means of Eq. 1. Prior to implementing the HAMAD algorithm, each of the n objects initially constituted a single cluster (with the number of clusters, Nc, initially equal to the number of objects, n). At the first step of the HAMAD algorithm, the two closest clusters were merged to form a new cluster. In each following step, the distance between all clusters was recalculated considering the average of the Minkowski distances between all pairs of objects in the two clusters (Figure 5); the two clusters now closest were merged, and Nc decreased by 1. While the procedure could be iteratively repeated until all data are merged into a single cluster (Nc = 1), the algorithm was stopped for Nc = Ncmax. The threshold value, Ncmax, was defined arbitrarily in terms of the following conditions: 1) the derivative, KD, of the Minkowski distance function, D(Nc), represented by the minimum distance between clusters at each step, was definitively less than 0.1 (Figure 6b); 2) the correlation coefficient, ρc, approached a value of 1 (Figure 6c) or was characterized, for Ncmax, by a significant variation (i.e. maximum relative peaks). The correlation coefficient, ρc, is calculated between two adjacent cluster configurations (corresponding to two consecutive steps), as defined by Neter et al. (1990):

$$\rho_c = \frac{\sum_{i=1}^{n} \left( x_i^{(j)} - m^{(j)} \right) \left( x_i^{(j+1)} - m^{(j+1)} \right)}{\sqrt{\sum_{i=1}^{n} \left( x_i^{(j)} - m^{(j)} \right)^2 \sum_{i=1}^{n} \left( x_i^{(j+1)} - m^{(j+1)} \right)^2}} \tag{7}$$

where xi(j) is the cluster rank to which the i-th object belongs at the j-th step and m(j) is a weighted average of the number of objects included in each cluster at the j-th step.

Figure 5 – Example of average distance clustering visual scheme

Figure 6 – Minkowski distance function, D(Nc), its derivative, KD(Nc), and correlation coefficient, ρ(Nc), for the selection of Ncmax (Ncmax = 20 in the example shown)

All cluster configurations were then analyzed, and the resulting subdivision of data into clusters with depth (conceptually related to a possible soil stratigraphy) was interpreted visually on the basis of the criteria and definitions established by Hegazy (1998) (see, for example, Figure 3). The minimum number of clusters, Ncf, able to provide a reliable soil stratigraphy is defined as the value of Nc corresponding to the last step at which no primary layers (with thickness greater than 1.0 m) were formed as a result of HAMAD agglomeration.
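The normalization, standardization and average-distance merging described in this section can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the sample readings are invented, and the stopping value `n_stop` is supplied by hand rather than derived from the KD and ρc criteria used to select Ncmax.

```python
def normalize(qc, fs, sv0, sv0_eff, N=0.5, pa=0.1):
    """Normalized tip resistance Qc and friction ratio FR (stresses in MPa).
    N = 0.5 is assumed here (sands); the text gives a range of 0.5 to 1."""
    Qc = ((qc - sv0) / pa) * (pa / sv0_eff) ** N
    FR = fs / (qc - sv0) * 100.0
    return Qc, FR


def standardize(values):
    """Modified Z-score: centre on the mean, scale by the mean absolute deviation."""
    n = len(values)
    m = sum(values) / n
    s = sum(abs(v - m) for v in values) / n
    return [(v - m) / s for v in values]


def dist(a, b, q=2.0):
    """Minkowski distance of order q between two objects."""
    return sum(abs(u - v) ** q for u, v in zip(a, b)) ** (1.0 / q)


def average_linkage(objects, n_stop):
    """Start from one cluster per object and repeatedly merge the two clusters
    with the smallest average pairwise distance until n_stop clusters remain.
    Returns a list of clusters, each a sorted list of object indices."""
    clusters = [[i] for i in range(len(objects))]
    while len(clusters) > n_stop:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [dist(objects[i], objects[j])
                         for i in clusters[a] for j in clusters[b]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Hypothetical, already normalized readings (Qc, FR) at six consecutive depths.
Qc = [50.0, 52.0, 51.0, 120.0, 118.0, 122.0]
FR = [1.0, 1.1, 0.9, 0.4, 0.5, 0.45]
objects = list(zip(standardize(Qc), standardize(FR)))
clusters = average_linkage(objects, n_stop=2)
```

The naive search recomputes all average distances at every step, which is adequate for a single sounding with a few hundred readings; the interpretation of each cluster as a layer, lens or outlier remains a separate, visual step.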

4 FUZZY CLASSIFICATION

Zhang and Tumay (1999) proposed a possibilistic fuzzy approach with the objective of addressing the observed uncertainty in the correlation between soil composition and the mechanical response to penetration in existing CPT-based classification charts, as investigated in previous works (Zhang and Tumay 1996a; Zhang and Tumay 1996b). Zhang (1996a) observed two basic tendencies in existing soil behavior classification charts, appearing as two almost orthogonal curve shapes: soil type changes in one direction, and in situ soil state (OCR, sensitivity, age, cementation, liquidity index, K0, etc.) in the other. The soil chart simplification proposed by Zhang and Tumay consists of the derivation of two independent indices (the soil classification index, U, and the in-situ state index, V) representing the two primary tendencies in the soil behavior classification chart by Douglas and Olsen (1981), shown in Figure 7. This is achieved through the empirical superposition of a curvilinear orthogonal coordinate system along the principal tendencies in the original chart and, subsequently, the transformation of the curvilinear coordinate system into a Cartesian coordinate system by conformal mapping (Zhang and Tumay 1996a), as shown in Figure 8. Given qc and Rf from the Douglas and Olsen (1981) semi-logarithmic chart (Rf in % and qc in tsf), the intermediate variables x, y, and u(x,y) are calculated from the following relations:

$$x = 0.1539 R_f + 0.8870 \log(q_c) - 3.35 \tag{8}$$

$$y = -0.2957 R_f + 0.4617 \log(q_c) - 0.37 \tag{9}$$

$$u = -\frac{(a_1 x - a_2 y + b_1)(c_1 x - c_2 y + d_1)}{(c_1 x - c_2 y + d_1)^2 + (c_2 x + c_1 y + d_2)^2} + \frac{(a_2 x + a_1 y + b_2)(c_2 x + c_1 y + d_2)}{(c_1 x - c_2 y + d_1)^2 + (c_2 x + c_1 y + d_2)^2} \tag{10}$$

Figure 7 – Douglas and Olsen soil behavior classification chart (1981)

Figure 8 – Transformed Douglas and Olsen chart with superimposed U-V plane and boundary curves of soil classification criteria (Zhang 1994)

Finally, the soil classification index U = −u is defined, while the in-situ state index, V, is not used in the fuzzy classification procedure. In a subsequent paper, Zhang and Tumay (1999) introduced three fuzzy soil types: highly probable sandy soil (HPS), highly probable mixed soil (HPM) and highly probable clayey soil (HPC). The membership functions of the three fuzzy soil types, given in Eq. 11, Eq. 12 and Eq. 13, and shown in Figure 9, are functions of the soil classification index U:

$$\mu_s = \begin{cases} 1.0 & U > 2.6575 \\ \exp\left[ -\dfrac{1}{2} \left( \dfrac{U - 2.6575}{0.834586} \right)^2 \right] & U \le 2.6575 \end{cases} \tag{11}$$
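As an illustration, the intermediate variables of Eq. 8 and Eq. 9 and the sandy-soil membership function of Eq. 11 can be evaluated with the short Python sketch below. The function names are hypothetical, and the logarithm is assumed to be base 10, consistent with a semi-logarithmic chart; the text does not state the base explicitly. The index U is taken as an input, since Eq. 10 depends on coefficients not listed here.

```python
import math


def chart_coordinates(qc_tsf, rf_percent):
    """Intermediate variables x and y of Eq. 8 and Eq. 9.
    qc in tsf, Rf in %; a base-10 logarithm is assumed."""
    log_qc = math.log10(qc_tsf)
    x = 0.1539 * rf_percent + 0.8870 * log_qc - 3.35
    y = -0.2957 * rf_percent + 0.4617 * log_qc - 0.37
    return x, y


def mu_sandy(U):
    """Membership in the 'highly probable sandy soil' (HPS) type, Eq. 11."""
    if U > 2.6575:
        return 1.0
    return math.exp(-0.5 * ((U - 2.6575) / 0.834586) ** 2)


# Hypothetical reading: qc = 100 tsf, Rf = 1 %.
x, y = chart_coordinates(100.0, 1.0)
```

The membership value equals 1 for any U above 2.6575 and decays as a Gaussian-shaped curve below that value, so a sounding point with a low U receives only a small degree of membership in the sandy type.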

ª 1 § U − 1.35 · 2 º ¸ » −∞