SUR UNE NOUVELLE MÉTHODE DE POINTS PILOTES EN PROBLÈME INVERSE EN HYDROGÉOLOGIE, ENGENDRANT UN ENSEMBLE DE SIMULATIONS CONDITIONNELLES DE CHAMPS DE TRANSMISSIVITÉS (A New Pilot Point Inverse Method in Hydrogeology, Generating an Ensemble of Conditionally Simulated Transmissivity Fields)

THESIS presented at

the École Nationale Supérieure des Mines de Paris by

Marsh Lavenue for the degree of

Docteur en

Hydrologie et Hydrogéologie Quantitatives (Quantitative Hydrology and Hydrogeology)

Defended on 24 November 1998 before a jury composed of:

G. de MARSILY, President
J. CARRERA, Rapporteur
A. GALLI, Rapporteur
J. WILSON, Examiner
E. LEDOUX, Examiner
M. MARIETTA, Examiner

Table of Contents

ACKNOWLEDGEMENTS

RÉSUMÉ

1. INTRODUCTION
   1.1 The Objectives of this Research

2. THE INVERSE PROBLEM OF GROUNDWATER FLOW
   2.1 Direct Methods of Solving the Inverse Problem
      2.1.1 Numerical Techniques
      2.1.2 Graphical Techniques
      2.1.3 The Transition from Direct Methods into Indirect Methods
   2.2 Indirect Methods of Solving the Inverse Problem
      2.2.1 General Differences in the Indirect Inverse Methods
         2.2.1.1 Deterministic Versus Bayesian
         2.2.1.2 Nonlinear Assumption Versus Linear Assumption
         2.2.1.3 Adjoint Derivatives Versus Direct Derivatives
      2.2.2 Non-Linear Methods
      2.2.3 Linear Geostatistical Methods
      2.2.4 Quasi-Linear Geostatistical Methods
         2.2.4.1 Kitanidis' Quasi-Linear Method
         2.2.4.2 The Co-Conditional Method
         2.2.4.3 Simulated Annealing
   2.3 The Scope of this Research

3. THE GRASP-INV CODE - VERSION 1.0
   3.1 GRASP-INV Methodology: An Overview
      3.1.1 Conditional Simulations
      3.1.2 Solving the Groundwater Flow Equation
      3.1.3 The Objective Function
      3.1.4 Adjoint Sensitivity Analysis
      3.1.5 Locating Pilot Points
      3.1.6 Optimization of Pilot Point Transmissivities
         3.1.6.1 Determining the Direction Vector
         3.1.6.2 Determining the Step Length
         3.1.6.3 Constraints on Pilot Point Transmissivity Values
      3.1.7 Earlier Inverse Algorithms: Similarities and Differences
   3.2 Applications
      3.2.1 1992 Culebra Regional Flow Model
         3.2.1.1 Site Description and Review of Past Modeling Studies
         3.2.1.2 Model Input
         3.2.1.3 Model Results
         3.2.1.4 Model Conclusions
      3.2.2 GXG Test Problems
         3.2.2.1 Overview of the GXG Test Problems
         3.2.2.2 Description of the Four Test Problems
         3.2.2.3 GXG Test Problem Results
         3.2.2.4 GXG Test Problem Conclusions
   3.3 Improvements to GRASP-INV v1.0

4. THE GRASP-INV CODE, V2.0, THEORY AND APPLICATIONS
   PAPER 1: A New Inverse Methodology for Aquifers with Fractured and Unfractured Domains
   PAPER 2: Three-Dimensional Interference-Test Interpretation in a Fractured/Unfractured Aquifer Using the Pilot Point Inverse Method

5. CONCLUSIONS

6. REFERENCES

APPENDICES
   Appendix A: Application of a coupled adjoint sensitivity and kriging approach to calibrate a groundwater flow model, by Marsh Lavenue and John F. Pickens
   Appendix B: Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 1. Theory and computational experiments, by B.S. RamaRao, A.M. Lavenue, G. de Marsily and M.G. Marietta
   Appendix C: Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 2. Application, by A.M. Lavenue, B.S. RamaRao, G. de Marsily and M.G. Marietta
   Appendix D: A comparison of seven geostatistically-based inverse approaches to estimate transmissivities for modeling advective transport by groundwater flow, by D.A. Zimmerman, G. de Marsily, C.A. Gotway, M.G. Marietta, C.L. Axness, R. Beauheim, R. Bras, J. Carrera, G. Dagan, P.B. Davies, D.P. Gallegos, A. Galli, J. Gomez-Hernandez, S.M. Gorelick, P. Grinrod, A.L. Gutjahr, P.K. Kitanidis, A.M. Lavenue, D. McLaughlin, S.P. Neuman, B.S. RamaRao, C. Ravenne and Y. Rubin


ACKNOWLEDGEMENTS

First of all, I wish to dedicate this work to my two sons, Justin and Collin. Their patience and sacrifice have helped produce this dissertation. In addition, I dedicate this to my deceased parents, Bill and Margaret, my stepfather Joe Newell, my brothers Lashar and Alan, and my sister Myra.

I would also like to acknowledge the financial support of Sandia National Laboratories, specifically Dr. Melvin Marietta, without whom this work would have been impossible.

I would like to thank my advisor, Dr. Ghislain de Marsily, for his support, advice and guidance. Working on this degree with him has been one of the most rewarding experiences of my life. I would also like to thank my close friend and fellow INTERA employee, Dr. Banda RamaRao. RamaRao's technical guidance and instruction have brought me to this point and enhanced my technical career for over ten years. Thanks, Ram.

RÉSUMÉ

Since the early 1960s, the inverse problem of groundwater flow has received considerable attention. This dissertation first presents a literature review of previous work on the inverse problem and of the techniques used to solve it, and concludes with the presentation of a new inverse technique employing the Pilot Point method. The theory of this new technique, embedded in the code GRASP-INV v2.0, is described, and two applications are presented separately, in the form of published journal articles. The dissertation is organized into five main chapters developing the subjects mentioned above. A brief summary of each part is given below.

Chapter 1. A short introduction to the general problem of groundwater modeling and a brief description of the methods hydrogeologists have used in their attempts to solve the groundwater problem: analytical, electrical and numerical models.

Chapter 2. A description of the inverse problem and of the difficulties in finding a solution to it. Here, the history of the development of solutions to the inverse problem is examined in depth. The discussion begins with Nelson's early work (Nelson, 1960) on a direct numerical method for solving the inverse problem. It describes the limitations Nelson encountered with this approach, which eventually led him to develop the direct graphical method. The discussion continues with a chronological review of the linear, non-linear and quasi-linear indirect methods, beginning with the non-linear indirect method of Neuman and Yakowitz in 1979 and ending with the quasi-linear indirect technique of Yeh et al. (1996). The contribution of the research presented in this study closes the chapter.

Chapter 3. A description of the theory and applications of the first version of the inverse code GRASP-INV. This first version contained improvements to the Pilot Point technique relative to the original method presented by de Marsily (1978). Our improvements included the capability to locate pilot points optimally, and the capability to generate, and subsequently calibrate to steady-state or transient flow conditions, an ensemble of conditionally simulated transmissivity fields. GRASP-INV is illustrated with two applications. One comes from the regional flow system of the WIPP project (nuclear waste disposal in salt) in southeastern New Mexico. GRASP-INV was used in the calculations to calibrate regional-scale flow models in the performance assessment (PA) of the WIPP program. The objective of this modeling was to calibrate, to steady-state and transient flow conditions, an ensemble of conditionally simulated transmissivity fields, and then to predict groundwater travel times from the center of the WIPP site to its southern boundary. The second application described in Chapter 3 comes from the two-year comparative study of linear and non-linear indirect inverse methods carried out by the Geostatistics Expert Group (GXG). The GXG, convened by Sandia National Laboratories, conducted a series of tests of seven different inverse methods, among them GRASP-INV. The GXG concluded that the appropriate selection of the Log(T) variogram, together with the ability to simulate it geostatistically, had a significant effect on the accuracy and precision of the travel-time predictions computed by an inverse method. Chapter 3 ends by explaining which modifications would have to be made to GRASP-INV to eliminate the weaknesses identified by the GXG study.

Chapter 4. An explanation of the theory behind two of the most important improvements made to the model, and a presentation of the modified model, GRASP-INV version 2. The modifications are discussed and used in two different models, which are described in detail in two papers. The first paper concerns the theory and application of the most significant enhancement of GRASP-INV: a new geostatistical "front end". This new front end uses a two-step geostatistical process to generate conditional simulations of transmissivity fields. First, it performs a categorical indicator simulation to obtain the spatial distribution of indicators representing fractured and unfractured media. Then, the spatial variability within each category is generated using the variograms associated with each category and the Sequential Gaussian Simulation technique. GRASP-INV2 was used to generate one hundred transmissivity fields, simulated conditionally on the measured transmissivities and on the data describing the occurrence of fracturing. GRASP-INV2 then calibrated these fields to a considerable number of steady-state and transient hydraulic-head data. The results are compared with a previous study. The second paper in Chapter 4 reports the theory and application of GRASP-INV2 to four three-dimensional pumping tests in a fractured dolomite. The pumping tests, carried out before a convergent-flow tracer test, used sinusoidal pumping rates. The positions of the fractured portions of the dolomite were simulated geostatistically by assigning separate indicator categories to the fractured and unfractured portions of the aquifer. One hundred three-dimensional conductivity fields were simulated and then calibrated to the first three sinusoidal pumping tests. The fourth pumping test was used to validate the inverse results.

Chapter 5. This chapter summarizes the conclusions of the research and makes recommendations for its future development.

The research presented in this dissertation has led to the development of the inverse code GRASP-INV, which contains numerous improvements to the Pilot Point method used to solve the inverse problem of groundwater flow. Each modification was carefully tested and then used in applications of the Pilot Point method to automatically calibrated groundwater flow models at the local and regional scale. The following improvements were made to the GRASP-INV code over the past five years:

- the capability to locate pilot points optimally;
- the capability to generate, and then calibrate to steady-state or transient flow conditions, an ensemble of conditionally simulated transmissivity fields;
- the capability to generate conditional simulations of multi-category transmissivity fields, such as those encountered in fractured and unfractured domains, using parametric and non-parametric geostatistical techniques, and to automatically calibrate these fields to steady-state and transient flow conditions;
- the capability to solve truly three-dimensional inverse problems with the Pilot Point method.

Work on the characterization of the WIPP site (a project for nuclear waste disposal in salt) in southeastern New Mexico provided an exhaustive hydrogeologic data set of transmissivities and of steady-state and transient hydraulic heads, obtained from a large number of local- and regional-scale pumping tests. In 1992 and 1996, GRASP-INV was used in the performance assessment (PA) calculations of the WIPP program to calibrate a regional-scale flow model of the WIPP site. The objective of this modeling was to calibrate, to steady-state and transient conditions, an ensemble of conditionally simulated transmissivity fields, and then to predict the groundwater travel time from the center of the WIPP site to its southern boundary. In 1992, the ensemble of transmissivity fields was generated with the codes TUBA and AKRIP and the residual-patching method. The results of the 1992 study showed that groundwater travel times were sensitive to the position of the boundary between the fractured and unfractured domains of the aquifer, the Culebra dolomite. The results also identified the need for a more robust geostatistical simulation program to generate the fractured and unfractured domains of the Culebra transmissivity field. This need was confirmed by the comparative study of the Geostatistics Expert Group (GXG), which examined linear and non-linear indirect inverse methods over a two-year period. The GXG, convened by Sandia National Laboratories, conducted a series of tests of seven different inverse methods and concluded that the appropriate choice of the variogram of the Log(T) field, and the ability to simulate it geostatistically, had a significant impact on the accuracy and precision of the travel-time predictions of an inverse method. In addition, the GXG study showed that non-stationarity of the "true" transmissivity field, or the presence of "anomalies" such as high-permeability fractured zones, was better handled by the non-linear methods than by the linear ones.

Thus, in 1995, a new geostatistical "front end" was added to GRASP-INV in preparation for the 1996 WIPP PA. The new front end uses a two-step geostatistical procedure to generate conditional simulations of transmissivity fields. The model first performs a categorical indicator simulation to obtain a spatial distribution of indicators representing the fractured and unfractured media. Then, the spatial variability within each of these categories is generated using the variograms associated with each category and the Sequential Gaussian Simulation technique. This allows the GRASP-INV2 code to optimize "independently" the properties within the fractured and unfractured domains while calibrating to steady-state or transient hydraulic-head data. In the 1996 WIPP PA, one hundred Culebra transmissivity fields were simulated conditionally on the measured transmissivities and on the data describing the occurrence of fracturing. These fields were then calibrated to numerous sets of steady-state and transient hydraulic heads. The ability of GRASP-INV to optimize separately the properties of the zones associated with diagenetically altered (i.e., higher) and unaltered (i.e., lower) transmissivity in the Culebra improved its ability to obtain a good agreement between observed and calculated steady-state and transient hydraulic heads. The one hundred transmissivity fields accounted for the effects of variable aquifer elevation and variable fluid density on the flow fields, and were used to compute groundwater travel times to the WIPP site boundary. The transmissivity fields generated in the 1996 study had a much higher variability than those resulting from the 1992 PA. This variability was due to the simulation of the uncertain position of the boundary between the fractured and unfractured zones of the aquifer. The (geostatistical) decoupling of the high- and low-transmissivity zones produced a better-defined boundary between the lower-transmissivity (unfractured) and higher-transmissivity (fractured) portions of the aquifer. This eliminated the mixing (i.e., averaging) of transmissivities in the fractured and unfractured zones, which in the 1992 study had created a zone of "intermediate" transmissivity between them. The overall range of transmissivity uncertainty still present in the calibrated fields affected the uncertainty of the groundwater travel times. A cumulative probability distribution of travel times obtained in the 1996 PA study gave groundwater travel times considerably shorter than those of 1992. It was concluded that the elimination of the "intermediate" transmissivity zone defined above was the main factor responsible for the reduction of the groundwater travel time to the southern boundary of the WIPP site.

In 1997 and 1998, GRASP-INV2 was modified to solve three-dimensional groundwater-flow inverse problems, and the code was used to interpret a three-dimensional pumping test in the Culebra dolomite. The pumping test, part of the 1996 H-19 tracer experiment, used a sinusoidal pumping rate. In the Culebra aquifer, two vertical sections were isolated and pumping was conducted in the upper and lower sections, while transducers monitored the pressure responses in the upper and lower sections of six neighboring observation wells (within 40 m). The packers were set vertically so as to isolate the strongly fractured lower part of the Culebra dolomite from the unfractured upper part. One hundred three-dimensional transmissivity fields were simulated and then calibrated to three sinusoidal pumping tests conducted at well H-19b0. A fourth three-dimensional pumping test was conducted at well H-19b4 to validate the inverse results. The calibrated fields describe the variation of the position of the boundary between the fractured and unfractured parts of the upper Culebra. In general, the unfractured part of the aquifer lies in the western and northern zones of the upper Culebra. The lower Culebra is fractured, and its transmissivities are several orders of magnitude higher than those of the upper Culebra. It was concluded that the calibrated transmissivity fields reproduce very satisfactorily the drawdowns observed in the upper and lower Culebra at the H-19 hydropad. Moreover, the pumping-induced drawdowns predicted at H-19b4, which constitute a validation test, agree closely with the actual drawdowns measured in the field. Furthermore, the ensemble mean of the log10 transmissivity over the three-dimensional domain is very close to a previous value of the log10 transmissivity at the scale of the H-19 hydropad, interpreted from a single pumping test at that site. The advective travel times from the wells surrounding the H-19 hydropad, due to pumping in the central well H-19b0, compare favorably with those of the center of mass observed on the tracer breakthrough curves in the fractured parts of the lower Culebra. However, the advective travel times were much longer than the tracer arrival times in the unfractured part of the upper Culebra. This implies that there is a discrepancy between the parameters needed to calibrate the hydraulic response of the upper aquifer and those that govern transport in that same aquifer.

The research and conclusions presented in this dissertation have led to the following recommendations for future work:

1. A direct representation of the vertical permeability field should be introduced into the simulation and optimization processes for three-dimensional applications. This recommendation is motivated by the fact that, at present, the vertical permeability is defined through an anisotropy ratio applied to the horizontal permeability of a grid block. Thus, the vertical permeability is modified only through a modification of the horizontal permeability. The horizontal and vertical permeability fields could be "decoupled" by adding a separate set of "vertical" pilot points that would change only the vertical permeability of a grid block. Computing the sensitivity of the performance measure to the addition of vertical pilot points would then be a simple modification of the Pilot Point method.

2. A direct representation of the storage coefficient in the optimization process would be useful to the Pilot Point method for transient applications. At present, the storage coefficient is a fixed parameter, unchanged by the optimization program. GRASP-INV2 can compute the sensitivity of a transient objective function to the storage coefficient of a grid block. However, this capability must be included as part of the parameterization of the problem in order to obtain a satisfactory solution to the inverse problem.

3. The optimization of boundary conditions for steady-state or transient applications would be an important contribution to the Pilot Point method. The optimization of the parameters of Carter-Tracey-type boundaries would be particularly useful for three-dimensional well-test analyses.

4. A modification of the objective function to incorporate a measure of single-porosity transport data would increase the reliability of the inverse solutions obtained with the Pilot Point method. The objective function could include the arrival time of the center of mass at a given point in space, or represent a concentration in a well at a given time. This would allow the simultaneous optimization of hydraulic and transport performance measures.
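The pilot-point mechanism at the heart of the method summarized above, in which an optimizer adjusts the transmissivity assigned to a new point and kriging propagates that change into the surrounding field, can be sketched in one dimension. This is a hypothetical NumPy illustration: the well coordinates, log10(T) values and exponential covariance model are invented, and GRASP-INV itself operates on conditionally simulated 2-D and 3-D fields rather than a single kriged field.

```python
import numpy as np

def simple_krige(x_data, z_data, x_grid, corr_len=15.0, sill=1.0, mean=-5.0):
    """Simple kriging of log10(T) with an exponential covariance (illustrative)."""
    cov = lambda a, b: sill * np.exp(-np.abs(a[:, None] - b[None, :]) / corr_len)
    C = cov(x_data, x_data) + 1e-9 * np.eye(len(x_data))  # tiny nugget for stability
    w = np.linalg.solve(C, cov(x_data, x_grid))           # kriging weights
    return mean + w.T @ (z_data - mean)

# Measured log10(T) at three hypothetical wells
x_obs = np.array([5.0, 40.0, 80.0])
z_obs = np.array([-5.2, -4.1, -6.0])
x_grid = np.linspace(0.0, 100.0, 101)

base = simple_krige(x_obs, z_obs, x_grid)

# Add a pilot point at x = 60. In the inverse code an optimizer would choose
# its value to reduce the head misfit; here it is simply set one log-cycle
# higher than the local estimate.
x_pp = np.append(x_obs, 60.0)
z_pp = np.append(z_obs, -3.0)
updated = simple_krige(x_pp, z_pp, x_grid)
```

Because kriging honors the data exactly, the update leaves the measured wells untouched and perturbs the field mainly around the pilot point, which is why pilot points can modify a transmissivity field without destroying its conditioning to the measurements.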


EXECUTIVE SUMMARY

The inverse problem of groundwater flow has received a significant amount of attention since the early 1960s. This dissertation presents a review of the inverse problem and the techniques used to solve it, and concludes with a presentation of a new inverse technique that employs the Pilot Point method. The theory of the new technique, embedded in the code GRASP-INV v2.0, is reviewed, and two applications of the new technique are presented in the form of separate journal articles. The dissertation is organized into five major sections containing the details mentioned above. A brief summary of each section is presented below.

Section 1 presents a short introduction to the broad problem of groundwater modeling and a brief description of the ways in which hydrogeologists have tried to solve the groundwater problem using analytical models, electrical models and numerical models.

Section 2 describes the inverse problem and the difficulties in determining its solution. Here, an in-depth look is taken at the historical development of solutions to the inverse problem. The discussion begins with Nelson's early work (Nelson, 1960) on a direct numerical method of solving the inverse problem. It reviews the limitations Nelson found with this approach, which eventually led him to develop the direct graphical method. A chronological march through the indirect non-linear, linear and quasi-linear methods then follows: the discussion commences with Neuman and Yakowitz's 1979 non-linear indirect method and concludes with the quasi-linear indirect technique of Yeh et al. (1996). The contribution of the research presented in this study concludes Section 2.

Section 3 contains the theory and application of the first version of our inverse code, GRASP-INV. This first version included improvements to the Pilot Point technique relative to the original Pilot Point method presented by de Marsily (1978). Our improvements included the capability to locate pilot points optimally and the capability to generate, and subsequently calibrate to steady-state or transient flow conditions, an ensemble of conditionally simulated transmissivity fields. GRASP-INV is demonstrated with two applications. One is taken from the regional flow system at the Waste Isolation Pilot Plant (WIPP) in southeastern New Mexico. GRASP-INV was used in the WIPP program's performance assessment (PA) calculations to calibrate regional-scale flow models at WIPP. The objective of the modeling was to calibrate an ensemble of conditionally simulated transmissivity fields to steady-state and transient conditions and to subsequently predict groundwater travel times from the center of the WIPP facility area to the southern WIPP site boundary. The second application contained in Section 3 is from the Geostatistics Expert Group's (GXG) two-year comparative study of linear and non-linear indirect inverse methods. The GXG, convened by Sandia National Laboratories, conducted a series of tests on seven different inverse methods, one of which was GRASP-INV. The GXG concluded that the proper selection of the semivariogram of the Log(T) field, and the ability to geostatistically simulate it, had a significant impact on the accuracy and precision of travel-time predictions made by an inverse method. Section 3 concludes with an explanation of the GRASP-INV modifications needed to address its weaknesses as identified in the GXG study.
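The adjoint sensitivity analysis used in GRASP-INV to guide pilot-point placement rests on a standard identity: for a discretized flow system A(p) h = b and a head-misfit measure J, a single extra linear solve yields dJ/dp for every parameter at once. The sketch below demonstrates the identity on a generic three-node system with invented matrices; it is not the GRASP-INV implementation, only the underlying linear-algebra idea.

```python
import numpy as np

def forward(p, A0, Bs, b):
    """Assemble A(p) = A0 + sum_k p[k] * Bs[k] and solve A(p) h = b."""
    A = A0 + sum(pk * Bk for pk, Bk in zip(p, Bs))
    return A, np.linalg.solve(A, b)

def objective(p, A0, Bs, b, d):
    """J(p) = 0.5 * ||h(p) - d||^2, a head-misfit performance measure."""
    _, h = forward(p, A0, Bs, b)
    return 0.5 * np.sum((h - d) ** 2)

def adjoint_gradient(p, A0, Bs, b, d):
    """dJ/dp_k = -lambda^T (dA/dp_k) h, using one adjoint solve A^T lambda = h - d."""
    A, h = forward(p, A0, Bs, b)
    lam = np.linalg.solve(A.T, h - d)               # adjoint state
    return np.array([-(lam @ (Bk @ h)) for Bk in Bs])

# Tiny three-node system with two transmissivity-like parameters (all invented)
A0 = 4.0 * np.eye(3)
B1 = np.array([[1.0, -1.0, 0.0], [-1.0, 1.0, 0.0], [0.0, 0.0, 0.0]])
B2 = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, -1.0], [0.0, -1.0, 1.0]])
b = np.array([1.0, 0.0, -1.0])
d = np.array([0.2, 0.0, -0.3])
p = np.array([0.5, 1.5])

grad = adjoint_gradient(p, A0, [B1, B2], b, d)
```

The cost of the gradient is one forward solve plus one adjoint solve, independent of the number of parameters, which is what makes ranking many candidate pilot-point locations affordable.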


Section 4 presents the modified GRASP-INV model, version 2, and explains the theory behind two major improvements to the code. These improvements are discussed and applied in two separate models that are described in detail in two papers.

The first paper describes the theory and application of the most significant enhancement to GRASP-INV, a new geostatistical 'front end'. The new front end uses a two-step geostatistical procedure to generate conditional simulations of transmissivity fields. Categorical indicator simulation is first performed to obtain the spatial distribution of indicators representing fractured or unfractured media. The spatial variability within each of these categories is then 'filled in' using the associated semivariogram models and the Sequential Gaussian Simulation technique. GRASP-INV2 is applied to generate one hundred transmissivity fields, conditionally simulated using the measured transmissivities and the data describing the occurrence of fracturing. GRASP-INV2 then calibrated these fields to an extensive set of steady-state and transient heads. The results are compared to an earlier study.

The second paper in Section 4 contains the theory and application of GRASP-INV2 to four three-dimensional pumping tests in a fractured dolomite. The pumping tests, conducted as a precursor to a convergent-flow tracer test, employed sinusoidal pumping rates. The occurrence of the fractured sections of the dolomite was geostatistically simulated by assigning separate indicator categories to the fractured and unfractured portions of the aquifer. One hundred three-dimensional conductivity fields were simulated and subsequently calibrated to the first three sinusoidal pumping tests. The fourth pumping test was used to validate the inverse results.

Section 5 reviews the conclusions of this research and presents recommendations for future research.
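The two-step front end described above (a categorical simulation of fractured versus unfractured media, followed by category-specific statistics within each domain) can be caricatured in a few lines. The sketch below substitutes a thresholded latent Gaussian field for true sequential indicator simulation, omits conditioning to data entirely, and uses invented means, variances and correlation lengths; it only illustrates the category-then-fill idea, not the GRASP-INV2 algorithms.

```python
import numpy as np

def gaussian_field(n, corr_len, rng):
    """Unconditional 1-D correlated Gaussian field via covariance factorization."""
    x = np.arange(n)
    C = np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)  # exponential covariance
    L = np.linalg.cholesky(C + 1e-10 * np.eye(n))
    return L @ rng.standard_normal(n)

rng = np.random.default_rng(42)

# Step 1: categorical field (fractured vs unfractured) from a thresholded
# latent Gaussian, a crude stand-in for categorical indicator simulation.
fractured = gaussian_field(200, corr_len=15.0, rng=rng) > 0.0

# Step 2: fill each category with its own log10(T) mean and variability,
# mimicking per-category semivariogram models (numbers are invented).
log10T = np.where(
    fractured,
    -4.0 + 1.0 * gaussian_field(200, 10.0, rng),   # fractured: higher, more variable T
    -6.5 + 0.5 * gaussian_field(200, 10.0, rng),   # unfractured: lower, smoother T
)
```

Keeping the two categories statistically separate is what later lets the calibration adjust fractured and unfractured properties independently, instead of averaging them into an artificial intermediate-transmissivity zone.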


1. INTRODUCTION

Groundwater is a limited natural resource often impacted by mining, oil and gas exploration and production, industrial processes, and waste disposal. For example, open-pit mines have been known to contaminate nearby groundwater supplies as a result of the leaching process used in some mines. Solid- or hazardous-waste landfills have leached chemical constituents into nearby groundwater due to faulty liners or faulty landfill design. Radioactive waste isolated in geologic repositories could slowly leach radioactive constituents with long half-lives into groundwater, contaminating groundwater supplies over thousands of years. These and other impacts on groundwater quality have required groundwater hydrologists to develop tools to characterize these impacts as well as to predict potential impacts over long periods of time. These tools have evolved over the years from crude electric analog models of the groundwater system to the sophisticated computer models used today. Regardless of the approach, though, the common objective is to determine groundwater flow directions and velocities for ultimate use in chemical transport equations.

The mathematical equation expressing groundwater flow over a finite domain is given by:

∇ ⋅ [K∇h] + q = Ss (∂h/∂t)    (1)

subject to an initial condition of h = h0 at t = t0 over a domain D, where Ss is specific storage, t is time, h is hydraulic head, K is hydraulic conductivity, and q represents sinks or sources (e.g., pumping or injection wells) within the model domain D. The solution of Equation 1 requires the specification of each of the above parameters as well as conditions along the boundary of domain D in terms of either prescribed hydraulic head or prescribed groundwater flux.

Many early researchers developed analytical or closed-form solutions of the governing groundwater flow equation to assist hydrologists in understanding their particular hydrogeologic problems. These analytical models only treated isotropic and homogeneous properties within infinite domains. They were later augmented by analog electrical models, which could be tailored to a specific aquifer by placing a network of electric resistors and capacitors in series to represent the conductive and storative properties of an aquifer. These electrical networks could handle anisotropic, heterogeneous properties and were designed to translate the scientist's conceptual understanding of the flow processes occurring at the site of interest. This conceptual model, developed from data measured at the site, could include estimates of flow direction,
areas and rates of recharge and/or discharge, hydrostratigraphy, and system boundary delineation. Once the conceptual model was translated to an electric network, a scientist would apply a current to the network, and the electric potential could be measured at points around it. This potential was analogous to the hydraulic potential of the site being modeled. The analog model could then be used as an experimental tool to confirm a conceptual model or to develop a more detailed understanding of the groundwater flow processes occurring at the site. With the advent of computers and the greater flexibility that numerical models provide to hydrogeologists, numerical models eventually replaced analog models. Like analog models, numerical models require the specification of numerous system parameters such as porosity, permeability, storativity, and boundary conditions. However, numerical models provide much more flexibility in varying these properties spatially across the model domain. Once the system parameters are specified, the model may be run to determine the pressures (or hydraulic heads) across the model domain as well as the groundwater flux entering and exiting the model boundaries. In an ideal situation, i.e., where the system parameters are exactly known across the model domain, the calculated pressures of a numerical model would agree exactly with the measured pressures. This situation, however, never exists, due to errors in the field measurements and errors related to the numerical representation of the governing groundwater flow equation. The modeler is then left to reconcile the differences between the model-calculated results and the field measurements. This reconciliation, referred to as the inverse problem, requires either directly determining the system-parameter values by inverting the governing flow equation, or iteratively improving the model fit by modifying the initial system parameters and re-running the model to 'observe' any improvement in the model results.

1.1 The Objectives of this Research

Since the early 1960s, research into inverse methods has led to the development of numerous parameter-estimation techniques. These techniques range from analytical expressions, whereby the hydraulic-head field, or potential field, is used to estimate permeability across the model domain by directly calculating conductivity from streamline analysis, to numerical techniques employing sophisticated optimization algorithms. A review of past and present techniques is presented in the following sections to provide a framework within which the contributions of the current research may be understood. Regardless of the approach taken, these techniques have one objective in common, namely the solution of the inverse problem of groundwater flow. However, almost without exception, the inverse techniques developed to date do not provide a direct measure of the uncertainty associated with their inverse solutions. Uncertainty in the inverse solution (i.e., the transmissivity field) is investigated through Monte Carlo simulations, in which an ensemble of transmissivity fields is generated and used to determine distributions of a selected parameter. In some cases, the variability in the log-transmissivity field is the focus of the uncertainty analysis but, more often, the uncertainty about a secondary variable dependent on transmissivity is of interest, e.g., groundwater head, groundwater travel time, or solute concentration at a selected boundary. Typically, the ensemble of transmissivity fields is

generated from the final optimal transmissivity field, Y, and the covariance of its estimation error, Q (obtained from the inverse solution). The limitation of this approach is that the full range of uncertainty within the transmissivity field is probably not investigated. The conditional simulations all originate from the 'optimal' estimate of transmissivity (a single transmissivity field produced from the inverse solution), with variations around the 'local' optimal estimate dictated by the post-calibration error covariance matrix (or conditional confidence interval). Thus, the ensemble of transmissivity fields is generated through a local uncertainty analysis. The research presented in this dissertation approaches the question of uncertainty in a more 'global' fashion by first producing an ensemble of transmissivity fields conditioned to the measured transmissivity data and its covariance (i.e., conditional simulations). Each of the conditional simulations is then separately calibrated to the set of steady-state and/or transient-state hydraulic-head data. Once calibrated, the ensemble of transmissivities may be used to investigate the impact that uncertainty in the transmissivity field has upon other dependent variables such as groundwater head or groundwater travel time. The solution of the inverse problem for an ensemble of transmissivity fields requires an efficient, effective, automated methodology to ensure that each transmissivity field is sufficiently calibrated to the observed head data. Toward this end, this research presents the development and application of an automated inverse method employing the Pilot Point Technique (de Marsily, 1978). The resulting numerical model, referred to as GRASP-INV (GroundwateR Adjoint Sensitivity Program - INVerse), employs the Pilot Point Technique to solve the indirect, non-linear groundwater flow inverse problem for a set of conditionally simulated transmissivity fields.
GRASP-INV couples, for the first time, parametric (Gaussian) and nonparametric (indicator) sequential simulation for the generation of the ensemble of conditionally simulated transmissivity fields with the Pilot Point Technique. In addition, this research extends the Pilot Point Technique to three dimensions. GRASP-INV's development, capabilities, and application to multi-dimensional groundwater flow inverse problems are presented in the following sections.


2. THE INVERSE PROBLEM OF GROUNDWATER FLOW

As mentioned in the Introduction, scientists interested in the prediction of groundwater flow over space and time are ultimately forced to reconcile the differences between what has been measured in the field (at their particular site of interest) and what their predictive tool (an analog, analytical, or numerical model) suggests. This reconciliation begins with some very basic questions concerning the particular problem. For example: What is the relative number of data versus unknown model parameters? What is the magnitude, if any, of the error in the measured data? What is the relationship between the model parameters? These are some of the questions the earliest researchers in the inverse problem began to address. The answers to these questions indicate whether the modeler has a unique answer to the reconciliation problem. Consider again the governing equation for groundwater flow presented in Equation 1, but modified for steady-state flow and assuming no sinks or sources:

∇ ⋅ [K∇h] = 0    (2)
By integrating (2) over a finite cross-sectional area (i.e., x, z), we obtain Darcy's Law, which relates the specific discharge, u (i.e., the volume of flow per unit time per unit cross-sectional flow area), to the hydraulic-head gradient, ∇h, and hydraulic conductivity, K:

u = −K∇h    (3)
As presented in McLaughlin and Townley (1995), Darcy's Law may be rewritten in terms of the hydraulic head as:

H = F(K) + ε    (4)
where H is a vector of measured heads, K is the vector of conductivity over x, y, z, F is the forward operator which maps K to H, and ε is a measurement-error vector representing the quality of the measurements of H. As McLaughlin and Townley show, the inverse problem consists of attempting to identify an operator, G(H), which maps the measurement vector to an estimate of K:

K = G(H) = G(F(K) + ε)    (5)
Since most models are used for predictive purposes, the aim is to have the estimate of K as close as possible to the true value(s) of K, as this improves the predictive capability of the model. If there are no measurement errors ε, then the functional F is generally invertible and the obvious choice for G is F⁻¹ (McLaughlin and Townley, 1995). There are, however, several problems with direct inversion of F. First, if the hydraulic-head measurements contain error, which is most often the case, the functional F is not invertible and the estimation of K cannot be made perfectly. This may be understood further by recalling Equation 3, where K is related to the derivative of H. Thus, any errors in H are magnified once the derivative is taken, significantly impacting the estimate of K. Secondly, some problems in groundwater flow are not invertible regardless of the presence of measurement errors. An example of this problem is the equation for one-dimensional steady-state groundwater flow:

∂/∂x [K(x) ∂h(x)/∂x] = 0,    x ∈ [0, L]    (6)

subject to the boundary conditions:
h(0) = h0    (7)

h(L) = hL    (8)
The solution of Equation 6 subject to Equations 7 and 8 is:

h(x) = h0 + (hL − h0) x/L    (9)

which illustrates that the head field varies linearly from one end (x = 0) to the other (x = L) and is independent of conductivity. Thus, Equation 9 is not invertible to obtain K even though, as assumed above, there are no errors associated with h(x). The above conditions illustrate two characteristics of some inverse problems: ill-posedness and instability. Ill-posedness results from the condition where there is not enough information to obtain a unique solution to the inverse problem, i.e., for a given h(x) there is not a unique K(x). Instability is manifested by wide fluctuations in the inverse-problem solution, resulting in nonconvergence or convergence to meaningless property values, e.g., negative transmissivities. Instability in the inverse problem is caused by the sensitivity of the transmissivity estimates to errors in the head measurements.
Both of these conditions may be overcome. Ill-posedness may be overcome by adding more information to the solution of the inverse problem. As an example, consider the one-dimensional steady-state flow example described above. As McLaughlin and Townley (1995) illustrate, the solution to Equation 6 may be properly posed by replacing the downstream boundary condition with information concerning the flux exiting the boundary:

K(x) ∂h(x)/∂x = Q    at x = L    (10)
The solution for the head field is then:

h(x) = h0 + Q ∫_{0}^{x} exp[−α(ξ)] dξ    (11)

where α(x) = ln K(x). Assuming Q is nonzero, Equation 11 may be inverted to obtain an expression for α(x):

α(x) = −ln[(1/Q) ∂h(x)/∂x]    (12)

Thus, by providing more information to an ill-posed inverse problem, it may be possible to properly pose it and obtain a unique solution. There are numerous methods available to solve the groundwater inverse problem. In general, all the methods may be grouped into one of two categories: Direct and Indirect methods. The following sections describe the Direct and Indirect techniques and give a historical overview of the progression of researchers, results, and approaches that has led to today's inverse-problem solution techniques.
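The well-posed example of Equations 10 through 12 can be checked numerically. The sketch below (the conductivity profile, flux, and grid are hypothetical, chosen only for illustration) generates h(x) from a known K(x) and boundary flux Q, then recovers K(x) = Q/(∂h/∂x) by finite differences:

```python
import numpy as np

# Hypothetical smooth conductivity profile (illustrative only)
x = np.linspace(0.0, 1.0, 201)
dx = x[1] - x[0]
K_true = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)
Q, h0 = 2.0, 10.0

# Forward problem: K dh/dx = Q  =>  h(x) = h0 + Q * integral of 1/K from 0 to x
# (cumulative trapezoid rule for the integral)
invK = 1.0 / K_true
h = h0 + Q * np.concatenate(([0.0], np.cumsum(0.5 * (invK[:-1] + invK[1:]) * dx)))

# Inverse problem: recover K = Q / (dh/dx) via centered differences
K_est = Q / np.gradient(h, x)

err = np.max(np.abs(K_est[1:-1] - K_true[1:-1]) / K_true[1:-1])
print(f"max interior relative error: {err:.2e}")
```

With a known, nonzero flux the inversion is unique and, for noise-free heads, accurate to the discretization error of the differentiation.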


2.1 Direct Methods of Solving the Inverse Problem

2.1.1 Numerical Techniques
Research on the Direct Method of solving the inverse problem spans from 1960 to approximately the mid-1980s. The Direct Method originates from a viewpoint in which the head field, as well as its first and second derivatives, is considered known. This reduces the groundwater flow equation from a second-order partial differential equation (PDE) to a first-order PDE in which conductivity is the unknown. Solving the inverse problem by the Direct Method, described below, is conducted in a fashion similar in spirit to Equation 5 above. Any error in the solution of the inverse problem, referred to as equation error, results from errors in the head field, from the representation of its derivatives, or from incorrect model assumptions. Nelson (1960) formalized the Direct Method of solving the inverse problem as a Cauchy problem. He presented an analytic solution to the two-dimensional flow inverse problem whereby he added additional information to properly pose his solution. Nelson showed that the second-order partial differential equation (2), expanded as:

∇ ⋅ [K∇h] = (∂h/∂x)(∂K/∂x) + (∂h/∂y)(∂K/∂y) + (∂h/∂z)(∂K/∂z) + K(∂²h/∂x² + ∂²h/∂y² + ∂²h/∂z²) = 0    (13)
could be reduced to the system of ordinary differential equations:

dx/(∂h/∂x) = dy/(∂h/∂y) = dz/(∂h/∂z) = −dK / [K(∂²h/∂x² + ∂²h/∂y² + ∂²h/∂z²)]    (14)
Nelson showed that dividing Equation 14 through by K, and recognizing that the resulting first three denominators were simply the velocity components in the x, y, and z directions, respectively, gives:

dx/vx = dy/vy = dz/vz = −dK / [K²(∂²h/∂x² + ∂²h/∂y² + ∂²h/∂z²)]    (15)

He also noted that the first three terms of Equation 15 are simply the equations of the streamlines of groundwater flow and that, if applied as a boundary condition to Equation 13, they would provide a unique solution for K. Upon manipulation and substitution of the above equations, one arrives at the following expression for K:
ln(K/K0) = − ∫_{h0}^{h} (∇²h / |∇h|²) dh    (16)

where K0 and h0 are the conductivity and head (potential) at some starting point x0, y0, z0 within a streamline, and K is the conductivity at any point x, y, z with which the upper limit of the integral, h, is associated. In a practical sense, the determination of the conductivity field involves first determining the streamlines for steady-state flow (i.e., the first three terms of Equation 15 solved simultaneously). One then evaluates the conductivity distribution from some starting point x0, y0, z0 within a streamline using Equation 16 to yield the conductivity distribution within that streamline. Clearly, this requires knowledge of at least one conductivity value at a point within each streamline (Figure 2-1), in addition to a twice-differentiable function describing the steady-state hydraulic-head distribution throughout the aquifer. In an experimental attempt to evaluate the above equation, Nelson (1961) constructed five electrical analog models simulating two-dimensional saturated flow through a heterogeneous resistance field. He constructed the heterogeneous resistance field by cutting numerous 5/16-inch diameter holes in the conducting paper making up the electrical model. A potential difference was applied to two opposing boundaries. A square grid was laid out over the conducting paper to facilitate voltage measurement. After measuring the voltage field, a polynomial was fit to the measured voltage surface and, through numerical differentiation, the first- and second-order derivatives of the potential

Figure 2-1. Streamlines and transmissivity measurements in a steady-state flow system (from Emsellem and de Marsily, 1971)


field were calculated. These values, when input to a finite-difference approximation of Equation 16, yielded the spatial distribution of resistance across the conducting paper. Figure 2-2 illustrates a comparison between measured and computed resistance along a representative streamline for one of the analog models. Nelson noted that the large differences between the measured and calculated resistance are a result of discrepancies in the numerical differentiation of the potential field. Figure 2-3 illustrates the graphic presented by Nelson to support this claim. Note the small differences in the potential field caused by large differences in the resistance field. This provides one of the first examples of the difficulty with the direct approach to the inverse problem; namely, errors in the head field (or potential field) have a dramatic effect upon the estimates of permeability (or resistance). In the mid-1960s, Nelson formalized his approach in a Fortran computer code, GENORO (Figure 2-4). Over a seven-year period, he applied his technique to numerous modeling studies (Nelson and Cearlock, 1967; Nelson, 1968). In Nelson and Cearlock (1967), the authors applied the GENORO code to a shallow groundwater system at the Hanford Site in Richland, Washington. Figure 2-5 illustrates the observed head field at Hanford. A polynomial containing 62 terms was fit to the observed head field (Figure 2-5). Once fit, the polynomial was differentiated to provide the needed input for Equation 16 within each streamline. Figure 2-6 contains the streamlines across the model domain. Starting points (i.e., x0, y0, z0 of Equation 16) within each streamline, along with the associated starting permeability, K0, and starting head value, h0, were specified along the vertical dashed line shown in Figure 2-6.
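Nelson's observation, that small errors in the potential field produce large errors in the estimated property field, is easy to reproduce in one dimension, where the derivative-based estimate is K = −u/(dh/dx). In the sketch below (all numerical values are hypothetical), heads spanning 5 m are perturbed by only 1 cm of noise, yet the differentiation amplifies that error by roughly two orders of magnitude in the conductivity estimate:

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0.0, 100.0, 101)           # positions (hypothetical units: m)
K_true = 20.0
u = 1.0                                    # specific discharge, u = -K dh/dx
h = 20.0 - (u / K_true) * x                # exact linear head field, slope -0.05

# Small measurement noise: 1 cm on heads that drop 5 m (0.2% of the drop)
h_noisy = h + rng.normal(scale=0.01, size=h.size)

K_clean = -u / np.gradient(h, x)           # derivative-based estimate, exact data
K_noisy = -u / np.gradient(h_noisy, x)     # same estimate, noisy data

rel_err = np.median(np.abs(K_noisy - K_true)) / K_true
print(f"median relative error in K: {rel_err:.1%}")
```

The clean data return K exactly, while the 0.2% head noise typically produces conductivity errors of several percent to tens of percent, because differencing nearly equal head values divides the noise by a small gradient.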

Figure 2-2. Comparison of actual and computed resistance (From Nelson, 1961)


Figure 2-3. Sensitivity of electric potential to errors in resistance (From Nelson, 1961)

Figure 2-4. Data and computer operations required for Nelson’s direct inverse method (From Nelson, 1968)


Figure 2-5. Comparison of the fitted potential distribution (dashed contours) with the measured potential distribution over part of the Hanford site (From Nelson and Cearlock, 1967)

Figure 2-6. The streamlines or flow paths generated from the fitted potential distribution at the Hanford site (From Nelson and Cearlock, 1967)


Using the above information, GENORO produced the permeability distribution shown in Figure 2-7. The permeability field contained very large variations over the modeled region. However, the magnitudes were well within the range of practicality. A verification of the permeability distribution was obtained by using the forward two-dimensional flow equation with boundary conditions obtained from the measured head surface and the permeability field resulting from the solution of the inverse problem. The calculated and measured head fields (Figure 2-8) had good agreement particularly near the river. In 1972, Scarascia and Ponzini proposed solving the direct inverse problem formulated by Nelson in a slightly different way. They suggested that if hydraulic head data from at least two independent states of groundwater flow were available (i.e., two steady-state flow fields or one steady-state flow field and one transient flow field), then only one measured transmissivity value was needed to solve the inverse problem (Scarascia and Ponzini, 1972). This removed the need to have a transmissivity value within each streamtube before the inverse problem could be solved. A condition that must be satisfied to employ this approach is that the hydraulic gradients of the two flow fields must not be parallel anywhere in the model domain. This implies that each streamtube within the first flow field intersects many of the streamtubes associated with the second flow field. One then uses the single measured transmissivity value to solve for the transmissivities along the streamtube within which it is located in flow field one. The resulting transmissivities within this streamtube are then used as the ‘measured’ transmissivity value for the streamtubes of the second flow field and vice versa until the transmissivities in each streamtube for both flow fields have been determined. Other researchers also considered this possibility, e.g., Frind and Pinder (1973), Sagar et al. 
(1975), Ponzini and Lozej (1982), Ponzini et al. (1989), and Ginn and Cushman (1990). In 1995, Giudici and others presented an approximation of the Scarascia and Ponzini (1972) approach. They developed an expression for transmissivity at finite-difference nodes based upon the finite-difference equation for groundwater flow between adjacent grid blocks, the two sets of hydraulic heads from the two flow fields, and the single measured transmissivity value. An approximation of the first and second derivatives of head is made through first- and second-order finite differences. The expression they developed is:

T(m+1, n+1) = exp(−a) [ T(m, n) + b (exp(a) − 1)/a ]    (17)

where T(m,n) is the transmissivity at the center of node m,n, and a and b are scalars computed from expressions containing the first- and second-order finite differences of the two head fields. To guard against instability of the inverse solution when a and b are small, Giudici et al. (1995) substitute the first-order expansion of (exp(a) − 1)/a, i.e., 1 + a/2, into Equation 17. Giudici et al. (1995) compare their inverse approach to two synthetic test problems with limited success. Another recent application of this approach is given in Gonzalez et al. (1997).
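A single application of the Equation 17 update, with the small-a safeguard, can be sketched as follows. In the actual method the scalars a and b are computed from finite differences of the two head fields; here they are simply taken as given illustrative values:

```python
import math

def giudici_update(T_mn, a, b):
    """One application of Equation 17, T(m+1,n+1) = exp(-a) [T(m,n) + b (e^a - 1)/a],
    guarding the (e^a - 1)/a factor against round-off when a is small."""
    if abs(a) > 1e-8:
        f = math.expm1(a) / a        # expm1 avoids cancellation in e^a - 1
    else:
        f = 1.0 + a / 2.0            # first-order expansion cited by Giudici et al.
    return math.exp(-a) * (T_mn + b * f)

# Illustrative values only (a, b would come from the two head fields)
print(giudici_update(100.0, 0.2, 5.0))
print(giudici_update(100.0, 1e-12, 5.0))   # small-a branch, ~ 100 + 5 = 105
```

The two branches agree smoothly across the threshold, which is the point of the substitution: the naive expression 0/0 is undefined at a = 0, while its limit, and its first-order expansion, is 1.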



Figure 2-7. The permeability distribution of the Hanford Site obtained by Nelson’s direct inverse method (From Nelson and Cearlock, 1967)

Figure 2-8. Differences between the actual heads and the heads computed using the permeability field from Nelson’s direct inverse technique (From Nelson and Cearlock, 1967)


2.1.2 Graphical Techniques
The difficulty of obtaining second-order derivatives of the head field in some situations, i.e., low hydraulic-gradient areas, led Nelson to develop a linearized approximation of the Direct Method. This approximation, referred to as the graphical inverse technique, is simple in concept and relatively easy to implement. Refer back to Darcy's Law (Equation 3) presented above. Since the specific discharge relates the discharge, Q, to a cross-sectional area, A, a simple modification to Darcy's Law may be made by taking into account the earlier considerations of flow within a streamtube, namely:

u = Q/A = −K∇h  ⇒  Q = −K(bW) Δh/L    (18)
where b is the thickness of the aquifer, W is the width of a streamtube, and ∇h is represented as Δh/L, where L is the length of the streamtube element (Figure 2-9). Since steady-state flow within a streamtube (defined as the flow region between two adjacent streamlines) is constant, Nelson developed the following relationship for flow within adjacent sections of a streamtube (Figure 2-9):

T1 W1 (Δh1/L1) ≅ T2 W2 (Δh2/L2)    (19)
where T = Kb. A proper flow net provides the basis for the above equation. Nelson (1961) notes that this implies construction of a flow net in which:

1. The incremental difference in potential (Δh) between successive equipotentials is a constant.

2. Streamlines are everywhere orthogonal to equipotentials for an isotropic, heterogeneous medium.

3. The spacing between streamlines is such that the same quantity of fluid passes between each pair of streamlines making up the flow system.

Thus, Equation 19 may be reduced to

T2 = T1 (W1/L1)(L2/W2)    (20)

where the only unknown in Equation 20 is T2.

Figure 2-9. Characteristics of a streamtube element

Nelson used this expression in his 1961 paper to try to improve the estimates of resistance along a streamline of the electrical analog model discussed earlier. The results from the direct numerical approach (Figure 2-2) were improved upon after implementing the graphical approach. Figure 2-10 illustrates the resistance field obtained with the graphical technique. Comparing the resistance of Figure 2-10 (along the row y = 6) to Figure 2-2 indicates an improved fit between the measured and calculated resistance. These results led to more research on the graphical method as developed by Nelson (see Nelson, 1968; Hunt and Wilson, 1974; Day and Hunt, 1977; Hawkins and Stephens, 1983; and Rice and Gorelick, 1985). The two most recent applications of the graphical method are examples where other techniques were combined with the graphical method to obtain additional information or to add constraints to the inverse solution. For example, Hawkins and Stephens (1983) employed kriging to obtain an initial estimate of the transmissivity distribution, and the standard deviation of that estimate, for a groundwater model of the Animas Valley in southwestern New Mexico (Figure 2-11). Steady-state head estimates obtained by kriging were then used to construct the flow net (Figure 2-12). The starting points for the execution of Equation 20 (used to re-estimate the transmissivity field) were selected from the locations within each streamtube where the standard deviation of the transmissivity estimate was the lowest (Figure 2-13). The resulting transmissivity distribution (Figure 2-14) was then used as input to a finite-difference model to verify that the transmissivities could reproduce the observed steady-state flow field (Figure 2-15). A transient calibration step, accomplished by modifying storativity through trial and error, provided a calibrated model that, when validated against a four-year pumping event, showed good agreement between the observed and predicted water levels (Figure 2-16).
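Marching Equation 20 element by element down a streamtube can be sketched in a few lines. The streamtube geometry below is hypothetical, chosen only to show how a single known transmissivity propagates through the flow net:

```python
def propagate_transmissivity(T_start, widths, lengths):
    """March Equation 20 down a streamtube:
    T2 = T1 * (W1 / L1) * (L2 / W2), element by element."""
    T = [T_start]
    for i in range(1, len(widths)):
        W1, L1 = widths[i - 1], lengths[i - 1]
        W2, L2 = widths[i], lengths[i]
        T.append(T[-1] * (W1 / L1) * (L2 / W2))
    return T

# Hypothetical streamtube geometry: widths between adjacent streamlines and
# lengths between successive equipotentials (values illustrative only)
widths = [10.0, 8.0, 12.0]
lengths = [5.0, 5.0, 4.0]
print(propagate_transmissivity(50.0, widths, lengths))  # [50.0, 62.5, 33.33...]
```

Because each new T depends only on geometry read from the flow net and the previous element's T, one measured transmissivity per streamtube suffices, which is exactly the data requirement Nelson's method imposes.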


Figure 2-10. Resistance field determined using the graphical inverse method (From Nelson, 1961)


Figure 2-11. Model domain of the Animas Valley in southwestern New Mexico (From Hawkins and Stephens, 1983)

Figure 2-12. Steady-state flow net of the Animas Valley Model (From Hawkins and Stephens, 1983)


Figure 2-13. Kriging error in log transmissivity, ln T (From Hawkins and Stephens, 1983)

Figure 2-14. Transmissivity distribution after calibration (From Hawkins and Stephens, 1983)


Figure 2-15. Model predicted steady-state water-level elevations (From Hawkins and Stephens, 1983)

Figure 2-16. Measured and calculated transient heads (From Hawkins and Stephens, 1983)

Rice and Gorelick (1985) employed the graphical method to construct groundwater models at the Nevada Test Site, near Las Vegas, Nevada; the Hanford Site in Richland, Washington; and the Rocky Mountain Arsenal in Denver, Colorado. Extending techniques presented in Rice (1983), the authors applied the Monte Carlo approach to assess the sensitivity of the graphical method's inverse solution to errors in hydraulic heads and to errors in flow-net discretization, i.e., the number and dimensions of streamtube elements. This exercise was one of the last unique applications of the graphical method to solving the inverse problem. The technique has been used less and less over the last 10 years due to the maturation of the indirect inverse techniques and geostatistical methods (discussed below), and to the increased computational speed of the computers upon which these indirect inverse methods may be employed.


2.1.3 The Transition from Direct Methods into Indirect Methods
Years after Nelson presented the graphical method of solving the inverse problem (Nelson, 1960), researchers began to present new 'hybrid' Direct methods which attempted to improve the stability of the Direct Method in the presence of errors in the head field. Head errors were corrected through a mass-balance analysis in hopes of reducing unstable transmissivity estimates (Kleinecke, 1971). Kleinecke minimized the sum (over all the model nodes) of the absolute value of the maximum mass-balance errors at each node for every time step. Unfortunately, Kleinecke did not obtain any improvement in the stability of the inverse problem, as approximately half of his transmissivity estimates were zero. Emsellem and de Marsily (1971) overcame the problem of instability by dividing the aquifer into a small number of homogeneous, isotropic transmissivity zones. Their technique determined the minimum number of zones (interpreted also as the largest size of zones) with which the mass-balance error was minimized and the transmissivity estimates remained stable. Separate inverse solutions were computed with an increasing size of zones until there was no longer any change in the mass-balance error. This approach, in essence, determined the smallest number of unknowns providing an acceptable mass-balance error. Emsellem and de Marsily provided the first glimpse of the future of inverse modeling, namely, constraining the transmissivity estimates through a 'structural' model associated with the transmissivity field. Emsellem and de Marsily (1971) were the first to modify the correlative properties of the transmissivity field through the solution of the inverse problem: since the spatial correlation length of the transmissivity field within a zone is infinite when transmissivity is treated as constant, modifying the size of the zone modifies the correlation length.
Emsellem and de Marsily verified their technique with a previously constructed electric analog model of an alluvial aquifer of the Rhine north of Strasbourg, France. Figure 2-17 shows a comparison of the measured and computed heads using the transmissivities determined by their inverse method. Figure 2-18 illustrates the resulting permeability field. The authors note that the boundary fluxes and the transmissivity estimates were very similar to those of the analog model; however, the geometry of the transmissivity zones was slightly different. In 1973, Neuman (1973) presented a new inverse method that employed linear programming to obtain an inverse solution by minimizing the mass-balance errors in a flow model. In this article, Neuman introduced the concept of a plausibility criterion. A plausibility criterion is a measure of the deviation of the estimated parameter, e.g., transmissivity, from 'prior information'. Here, prior information refers to previous estimates of transmissivity at specific locations from pumping tests. The plausibility criterion serves two functions:


Figure 2-17. Comparison of actual and computed heads (thin lines) of an alluvial aquifer near the Rhine River (From Emsellem and de Marsily, 1971)

Figure 2-18. Permeability field determined from the inverse solution (From Emsellem and de Marsily, 1971)

it constrains the transmissivity estimates from deviating too far from the prior information, and it reduces oscillations in the transmissivity field. In Neuman's formulation, a relative weight between the mass-balance error (or error criterion) and the plausibility criterion is specified subjectively. In the mid-1970s, research into the effects of spatial variability in hydraulic conductivity upon the head field began to provide additional algorithmic tools for inverse models. The work of Freeze (1975), Gelhar (1976), and Bakr (1976) provides some of the early examples. The field of geostatistics, developed by Matheron in the early 1960s (Matheron, 1962, 1963), began to be applied to water-resource problems to aid in the estimation of the conductivity field. The work of Delhomme and Delfiner (1973) and Delhomme (1978) was among the first to illustrate the utility of geostatistics in providing spatial estimates of conductivity over model domains. In 1979, Delhomme coupled geostatistics and groundwater flow modeling by presenting a study in which he employed conditional simulation to generate an ensemble of transmissivity fields from measured transmissivity values and the associated variogram. Since each conditional simulation is a plausible version of the unknown transmissivity field, Delhomme was able to investigate the impact of the uncertainty of the transmissivity field upon the hydraulic head field. The main point of Delhomme's research was to address the question: 'To what extent does the knowledge of the log T data lessen the uncertainty not only about the log-transmissivity field but also about the head values?' Delhomme's study illustrated the benefit of including all the possible information on measured transmissivity values (i.e., the mean and covariance or variogram of the log T values).
He concluded that the estimation of the transmissivity field could be improved by conditioning the transmissivity field to the measured head values through a comprehensive stochastic approach to the inverse problem (Delhomme, 1979).


Another important shift in inverse-problem research in the mid-1970s resulted from changing the minimization criterion of the inverse from an equation-error focus (i.e., minimizing the mass-balance error) to an output-error focus (described in the next section). Cooley (1977, 1979) presented an inverse model based upon a nonlinear regression algorithm for steady-state flow conditions. Cooley did not use a smoothing criterion (i.e., prior information) for the parameter field but, instead, divided the aquifer into small zones with uniform properties. He considered the solution optimal when the differences between the measured and calculated heads had zero mean and a variance consistent with the measured head-data variance. His method worked well for a small number of parameters. During this same period, several researchers at the Massachusetts Institute of Technology proposed extended Kalman filtering to solve transient-flow inverse problems (McLaughlin, 1975; Wilson et al., 1978). These contributions illustrated the importance of treating noise in the data statistically, as well as the benefit of incorporating transient-flow data into the inverse-problem solution.


2.2 Indirect Methods of Solving the Inverse Problem

Research by hydrogeologists into the Indirect Method has ranged from the mid-1970s to today. Yeh (1986) presented a review of some of the significant contributions to the Indirect Method of the inverse problem. The Indirect Method of solving the inverse problem originated from the need to address the problems associated with the Direct Method approach. The Indirect Method generally proceeds as illustrated in Figure 2-19. An initial guess of the parameter field (e.g., transmissivity, porosity, storativity) and boundary conditions is selected. The hydraulic heads are then calculated over the domain of interest through Darcy’s Law. The resulting calculated head field is compared to the measured heads at the measurement locations. The differences between the calculated and measured heads, usually expressed as a weighted least-squares sum, are then reduced through changes to the initial parameters until the differences are similar to the measurement errors of the head field. Thus, neither the first nor second derivatives of the head field are needed for the inverse solution. Furthermore, errors in the measured heads simply provide a metric by which the calculated heads of the inverse model are considered consistent.
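The loop just described can be sketched on a deliberately tiny example. The 1-D aquifer below (fixed head h0 at x = 0, uniform flux q, one unknown transmissivity T, heads from Darcy’s law h(x) = h0 − qx/T) and every numerical value in it are hypothetical illustrations of the output-error pattern, not a model from this thesis:

```python
import numpy as np

# Minimal sketch of the indirect (output-error) loop: guess the parameter,
# run the forward model, compare calculated and measured heads, and adjust
# the parameter until the misfit is small. All values are hypothetical.
def heads(T, x, h0=100.0, q=2.0):
    # Darcy's law for 1-D steady flow: h(x) = h0 - q*x/T
    return h0 - q * x / T

x_obs = np.array([100.0, 300.0, 600.0])     # observation locations
h_obs = heads(50.0, x_obs)                  # "measured" heads (T_true = 50)

def J(s):
    # weighted least-squares head misfit, parameterized by s = ln T
    r = heads(np.exp(s), x_obs) - h_obs
    return float(r @ r)

s = np.log(10.0)                            # poor initial guess of T
for _ in range(50):                         # iterative adjustment (Newton on ln T)
    d = 1e-5
    g = (J(s + d) - J(s - d)) / (2 * d)     # finite-difference gradient
    h2 = (J(s + d) - 2 * J(s) + J(s - d)) / d**2
    s -= g / h2                             # Newton step

T_est = np.exp(s)
```

Starting far from the value that generated the synthetic heads, the iterative adjustment of ln T should drive the head misfit toward zero and recover the transmissivity.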

Figure 2-19. General approach used in the indirect inverse method (modified from de Marsily et al., 1984)

Section 2.2.1 below presents a brief overview of some of the differences in the approaches Indirect Methods use to solve the inverse problem. The development of the Indirect Inverse Methods and some examples of their application to various groundwater modeling problems are then presented in Sections 2.2.2 through 2.2.4.


2.2.1 General Differences in the Indirect Inverse Methods

There are numerous ways in which the Indirect Method of solving the inverse problem has been posed. However, the main differences among the techniques stem from: 1) the assumptions imposed upon the parameter field, 2) the relationship between the parameter field and the hydraulic head field, and 3) the technique used to modify the initial parameter field to reach the inverse solution.

2.2.1.1 Deterministic Versus Bayesian

The assumption(s) imposed upon the parameter field refer(s) to the conceptualization the modeler has of the unknown parameter field. There are two distinct interpretations of parameter estimates and their uncertainties. The first viewpoint, referred to as deterministic, states that there is one real parameter field: it is constant and non-random but simply unknown. The uncertainty associated with the parameter field is due to our inability to measure the property accurately. This leads to an error of estimation which may be considered random with a Gaussian statistical distribution of mean µ and covariance Cα1α2. The Fisher model of uncertainty is associated with this viewpoint. Several of the nonlinear methods discussed below employ the deterministic approach to solve the inverse problem. The second viewpoint, referred to as Bayesian, states that the unknowns in the parameter field are believed to be random with a prior mean and covariance, i.e., the parameter field is one realization of a random field with a known constant mean and a prescribed covariance. Most of the inverse methods used today employ the Bayesian philosophy. Geostatistics may be used with either the deterministic or the Bayesian viewpoint; the difference lies in the meaning of the kriged estimate. Conceptually, a Fisher model of uncertainty would state that a kriged estimate of an unknown parameter varies about the nonrandom true value with a variability described by the kriging covariance. The Bayesian model, however, would state that the random true values vary about the kriged estimate with a variability described by the kriging covariance. McLaughlin and Townley (1995) discuss these two viewpoints and provide rigorous details of their implications upon the inverse problem formulation. However, they also state that under certain assumptions, both viewpoints ultimately provide similar results, thereby reducing the true differences between the methods to a matter of semantics.

2.2.1.2 Nonlinear Assumption Versus Linear Assumption

The second area of difference between indirect methods lies in the relationship between the parameter field and the hydraulic head field. On this point, the indirect methods may be divided into two categories, linear and nonlinear methods. In the linear methods, a model for the statistical spatial variability of the transmissivity field is proposed. A differential equation of groundwater flow is used to relate the spatial variability of the heads to the spatial variability of transmissivity. In this equation, the head and transmissivity fields are decomposed into a mean field and a distribution of perturbations with a mean of 0.0. Once a linear relationship is established between the head and transmissivity perturbations, the linear method then uses the observed head and transmissivity data to estimate the unknown geostatistical parameters in the


transmissivity field through a Gauss-Newton optimization routine. The resulting distribution of transmissivities across the model domain is then obtained through co-kriging. The nonlinear methods do not linearize the flow equation to obtain the inverse solution. Instead, a non-linear inverse method iteratively runs the flow equation, checks the agreement between the measured and calculated heads, modifies the parameter field to improve the agreement and then repeats the process. The iterations end once the differences between the measured and calculated heads reach a prescribed minimum. The measure of the differences is typically expressed either through a weighted least-squares function or a likelihood function. Derivatives of the objective function with respect to transmissivity (and/or other system parameters) guide changes in the transmissivity field. The changes to the transmissivity field may involve a geostatistical model describing the spatial variability of the field or may simply be conducted within multiple single-value transmissivity zones. Carrera et al. (1994) presented Figure 2-20 to illustrate the differences between the linear and non-linear methods. Assuming both methods use the same spatial-variability model to represent the transmissivity field and assuming error-free head measurements without recharge, both methods provide the same initial estimate of the transmissivity field. This is referred to as the prior estimate of transmissivity (shown on Figure 2-20 as E(y)). Conditioning to the head data, hm, leads to changes in the prior estimate of T. Linear methods project this change through one set of first-order derivatives between head and transmissivity taken about the prior estimate of T. Non-linear methods also project the appropriate change of the transmissivity field while conditioning to heads; however, the derivatives used in the projection are updated iteratively.
Thus, non-linear optimization is accomplished through a sequence of piecewise linear optimization processes.
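The distinction can be made concrete with a one-parameter toy problem, in the spirit of Figure 2-20 but entirely hypothetical: a single head measurement depends nonlinearly on y = ln T, the linear method projects the head misfit through one derivative taken at the prior E(y), and the non-linear method re-evaluates the derivative at each iterate.

```python
import numpy as np

# Hypothetical 1-D flow relation: head as a nonlinear function of y = ln T.
# All numbers are invented for illustration.
def h(y):
    return 100.0 - 1200.0 * np.exp(-y)

def dh_dy(y):
    return 1200.0 * np.exp(-y)

y_true = np.log(50.0)
h_m = h(y_true)                              # error-free head "measurement"
y0 = np.log(10.0)                            # prior estimate E(y)

# linear method: one first-order projection about the prior
y_lin = y0 + (h_m - h(y0)) / dh_dy(y0)

# non-linear method: the derivative is updated at every iteration
y_nl = y0
for _ in range(25):
    y_nl = y_nl + (h_m - h(y_nl)) / dh_dy(y_nl)
```

The single linearized projection lands well short of the true transmissivity, while the iterated (piecewise linear) updates converge to it.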

Figure 2-20. Illustration of the difference between linear, yL, and non-linear, yN, estimation for the case of error-free head measurements (from Carrera et al., 1994).


2.2.1.3 Adjoint Derivatives Versus Direct Derivatives

The third area of difference between indirect methods relates to the optimization process employed to obtain the inverse solution. The optimization process requires the determination of derivatives, with respect to the model system parameters, of either the hydraulic head or a weighted least-squares (WLS) objective function, J. The derivatives of head with respect to the model parameters, referred to as the Jacobian, are needed in the Gauss-Newton search algorithms used in linear indirect methods and some non-linear indirect methods. Other indirect methods employ quasi-Newton methods or conjugate-gradient methods requiring only the derivative of a WLS objective function. As noted in Carrera et al. (1994), there are two families of methods for computing derivatives or gradients: direct derivatives and the adjoint-state method. Each of these approaches has its strengths and weaknesses depending on the size of the inverse model (i.e., the number of grid blocks and time steps) and the number of measured head data. A comparison between the approaches may be found in Zou et al. (1993). Computing direct derivatives consists of taking derivatives of the state equations (forward equation), resulting in a linear system of equations where the unknowns are the columns of the Jacobian matrix. Alternatively, the adjoint approach (Chavent, 1971, 1975) is an extremely efficient way of computing derivatives of the objective function, J (a scalar), with respect to the system parameters, α, i.e., permeability, storativity, boundary conditions. Once the adjoint state is calculated, the gradient of J with respect to any parameter α may be easily calculated using the same adjoint state function.
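The two families can be contrasted on a small hypothetical linear system A(T) h = Q with J = ½‖h − h*‖². The conductance assembly A(T) = Σᵢ Tᵢ Bᵢ and all matrices below are invented for illustration; the point is that the direct approach costs one linear solve per parameter while the adjoint approach costs a single extra solve, and both yield the same gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
N, I = 5, 3                                   # nodes, parameters
C = [rng.normal(size=(N, N)) for _ in range(I)]
B = [c @ c.T + np.eye(N) for c in C]          # dA/dT_i, SPD by construction
T = np.array([1.0, 2.0, 0.5])
A = sum(t * b for t, b in zip(T, B))          # conductance matrix A(T)
Q = rng.normal(size=N)
h = np.linalg.solve(A, Q)                     # forward solve
h_star = rng.normal(size=N)                   # "measured" heads
r = h - h_star                                # residual; J = 0.5 * r @ r

# direct derivatives: dh/dT_i solves A (dh/dT_i) = -B_i h  (I solves)
grad_direct = np.array(
    [r @ np.linalg.solve(A, -(b @ h)) for b in B])

# adjoint method: one solve of A^T lam = r, then dJ/dT_i = -lam^T B_i h
lam = np.linalg.solve(A.T, r)
grad_adjoint = np.array([-(lam @ (b @ h)) for b in B])
```

For many parameters and few objective functions the adjoint route is the cheaper of the two, which is the efficiency argument made in the text.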

2.2.2 Non-Linear Methods

Non-linear methods for solving the inverse flow problem began development in the petroleum industry in the late 1960s (Jahns, 1966; Coats et al., 1970; Thomas et al., 1972; Chen et al., 1974; Chavent, 1975; Yoon and Yeh, 1976). These early papers attempted to determine the permeability distribution of hypothetical petroleum reservoirs. However, the non-linearity of the multiphase flow equation inhibited progress of non-linear methods in the petroleum industry. Hydrogeological researchers had more success applying non-linear methods to groundwater flow problems (Yeh, 1986). Examples of some applications using these methods will be presented in an attempt to give the reader an appreciation of the theoretical and practical aspects of the Indirect Method. The starting point for the discussion is a 1979 paper by Neuman and Yakowitz. This paper was chosen as the starting point because it is one of the last non-linear inverse techniques that did not directly incorporate geostatistics into the solution, yet was successfully applied to a real two-dimensional field problem. Neuman and Yakowitz’s 1979 paper presented an Indirect Method which moved the earlier steady-state method of Neuman (1973, 1976) into a statistical framework by modifying the error criterion to one based upon head residuals. Their technique also recognized that in many practical situations, the hydrologist would have more information on the head field than the transmissivity field. That is, the uncertainty of the head field is considered less than the


uncertainty of the transmissivity field. Thus, if the transmissivity field estimated from expert judgement did not reproduce the observed head field, the estimates of transmissivity could be changed to improve the fit to the observed heads. This was an attempt to address the earlier concerns of Delhomme (1978) by incorporating head information into the inverse solution. Their method adjusted the transmissivity field through nonlinear programming (Newton-Raphson) coupled with a statistical analysis of the residual errors arising from the inverse model. Neuman and Yakowitz’s inverse technique employed a finite element numerical model of steady-state flow in a two-dimensional system. The basic approach of their technique is described below. Consider the groundwater flow equation in matrix form:

A(T) h = Q    (21)

where T is an I-dimensional vector which represents the true geometric-mean values of transmissivity over I subregions of a model area, A(T) is the N×N conductance matrix, N is the number of nodes in the model grid, h is an N-dimensional vector representing true head values at the nodes and Q is an N-dimensional vector which represents actual net flow rates into or out of the aquifer. Neuman and Yakowitz assumed that h and T could be estimated from available field data with kriging. They expressed these estimates as:

h* = h + ε    (22)

T* = T + ν    (23)

where ε and ν are the errors of estimation for the head and transmissivity estimates respectively. Note the inherent deterministic viewpoint of the transmissivity field here: namely, that there is one transmissivity field in reality and the errors in the model are due to the inability to accurately measure the field everywhere. Neuman and Yakowitz also assumed that the error of the estimate of Q, Q*, could be neglected. By substituting Equation 22 into Equation 21 and Q* for Q, their stochastic finite element model was obtained:

A(T) h* = Q* + A(T) ε    (24)

or

h* = A(T)^{-1} Q* + ε    (25)

after multiplying through by the inverse of A(T). Neuman and Yakowitz state that Equation 25 may be viewed as a nonlinear regression of h* against Q* in terms of the model parameters T. It is also assumed that ε and ν are uncorrelated with each other, have an expectation of zero (i.e., the estimates are unbiased), and have covariances of σ_H^2 V_H and σ_T^2 V_T for head and transmissivity respectively (σ_T^2 and σ_H^2 are scalars).


As Neuman and Yakowitz (1979) discuss, the estimation of T is completed by minimizing the generalized least-squares criterion

J(T̂) = J_H(T̂) + λ J_T(T̂)    (26)

with respect to T̂, where T̂ is an estimate of T, λ is a non-negative coefficient and J_H and J_T are defined as

J_H(T̂) = (h* − ĥ)^T V_H^{-1} (h* − ĥ)    (27)

J_T(T̂) = (T* − T̂)^T V_T^{-1} (T* − T̂)    (28)

where ĥ, an estimate of h, is given by A(T̂)^{-1} Q* (see Equation 25). Neuman and Yakowitz (1979) proposed a nonlinear programming technique to minimize J(T̂) in Equation 26, valid when ε of Equation 22 is small. If A(T̂)^{-1} were linear in T̂, the optimum value of λ (Equation 26) would equal σ_H^2/σ_T^2. If σ_H^2 or σ_T^2 are unknown, λ_opt must be estimated by minimizing J(T̂) (see Equation 26) with respect to T̂ for different values of λ. Since a different set of residual errors (i.e., ĥ − h* and T̂ − T*) is associated with each λ, one may analyze the manner in which statistical measures of the residual errors vary with λ. λ_opt may be chosen such that the mean sum of squares of the head residuals, s² (calculated by dividing the post-inverse J_H(T̂) by N, the number of measured head data used in calculating J_H(T̂)), is close to the variance of the observed head measurements. Figure 2-21 illustrates the approach proposed by Neuman and Yakowitz to identify λ_opt. Various statistics of the residual errors (z_H) vary with λ (presented as J_T^{1/2}) for the hypothetical problem presented in Neuman and Yakowitz (1979). Statistics illustrated on Figure 2-21 include the average residual z̄_H, coefficient of variation C_v, nearest-neighbor autocorrelation R_ac, sum of squared residuals J_H, and the probability, P, that z_H is normal with mean zero and variance s². Neuman and Yakowitz’s method of comparative residual analysis suggests the optimum solution for the inverse problem lies between J_T^{1/2} = 4.0E+04 and 4.4E+04 ft²/d because z̄_H and J_H reach minima in this region, C_v approaches a plateau and P is very close to unity.


Figure 2-21. Variation of various statistics with J_T^{1/2} for the sample problem with non-uniform and uncorrelated T* (from Neuman and Yakowitz, 1979)

Once λ_opt has been chosen, the associated estimate of T, T̂_opt, is the optimal inverse solution. As with the least-squares approach, if the model assumptions are correct, then the mean of the estimation error e_T = T̂_opt − T is equal to zero and T̂_opt is unbiased. The first-order approximation of the covariance of e_T is given by

E[e_T e_T^T] = σ_H^2 (S^T V_H^{-1} S + λ_opt V_T^{-1})^{-1}    (29)

where S is a sensitivity matrix referred to as the Jacobian, defined as S_ni = ∂h_n/∂T_i at λ = λ_opt. The error of estimating h by ĥ, e_H = ĥ − h, also has a mean of zero and a first-order approximation of its covariance given by

E[e_H e_H^T] = S E[e_T e_T^T] S^T    (30)
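Equation 30 can be checked by brute force: if e_T has covariance C_T and the head error is propagated linearly as e_H = S e_T, the sample covariance of many such draws should approach S C_T S^T. The Jacobian S and covariance C_T below are arbitrary hypothetical values:

```python
import numpy as np

# Monte Carlo check of first-order error propagation: cov(S e_T) = S C_T S^T.
rng = np.random.default_rng(3)
S = rng.normal(size=(4, 2))                   # hypothetical Jacobian dh/dT
A = rng.normal(size=(2, 2))
C_T = A @ A.T + 0.1 * np.eye(2)               # a valid covariance for e_T
L = np.linalg.cholesky(C_T)

e_T = (L @ rng.normal(size=(2, 400000))).T    # samples of e_T ~ N(0, C_T)
e_H = e_T @ S.T                               # propagated head errors
C_H_mc = np.cov(e_H, rowvar=False)            # sample covariance of e_H
C_H = S @ C_T @ S.T                           # first-order formula
```

With enough samples the Monte Carlo covariance converges on the analytic expression, which is why the formula is useful for judging head uncertainty without sampling.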

To illustrate the application of their technique, Neuman and Yakowitz (1979) applied it to a hypothetical two-dimensional flow problem. Another application of this technique is presented in Neuman et al. (1980). Here, their indirect method was applied to actual field data taken from the Cortaro Basin in southern Arizona to determine the transmissivity distribution over a two-dimensional finite-element model domain capable of reproducing the measured steady-state water levels. The transmissivities were initially assigned to each finite element by linear interpolation from a hand-contoured map of measured transmissivities. Even though the final transmissivities obtained from the solution of Neuman’s indirect technique had large variances, they were successfully used to reproduce 25 years of recorded water-level changes in the Cortaro Basin between 1940 and 1965 caused by temporal variation in recharge from pumping and streams.

De Marsily (1978) presented an alternative Indirect Method at about the same time as Neuman and Yakowitz’s work. His indirect method was the first to couple the inverse problem with several ‘state-of-the-art’ techniques, namely kriging with logarithms and optimization with adjoint sensitivities. De Marsily called his technique the Pilot Point Approach because of the way in which he parameterized the solution of the inverse problem. In the Pilot Point Approach, the Bayesian viewpoint of the parameter is adopted. The log of transmissivity is used to transform the statistical distribution of transmissivity (considered to be logarithmic; Law, 1944) to Gaussian. The unknown log transmissivities are viewed as variables of a MultiGaussian random function (RF). Log transmissivity at a point in space is considered a Gaussian random variable (RV) which may or may not have a distribution which is statistically dependent on, or correlated to, its neighboring transmissivities. A transmissivity measurement at location (x,y,z) is conceptually considered a sample from the RV distribution at point (x,y,z). As presented by Matheron (1962), kriging provides the best unbiased estimate of the RV at any point in space given observed data and the correlation between the data. In de Marsily’s work, the RV was log transmissivity and the estimation points were the grid-block centroids of his finite-difference model. He employed ordinary kriging to estimate the initial log transmissivity field. Ordinary kriging (OK) provides estimates of an RV at any point (or area) in space through the use of measurements of the RV already available and their covariance (for more details, see Section 4). The OK estimator is given by:

Z_m^*(u) = Σ_{β=1}^{n} ν_{β,m} Z(u_β)    (31)

where Z_m^*(u) is the log of the transmissivity estimate at the centroid of grid block m (located at u = (x1, x2, x3)), Z(u_β) is the log of the measured transmissivity data at point u_β and ν_{β,m} is the kriging weight assigned to measurement location β given the estimation point u. The covariance of the RV affects the magnitude of the kriging weights through the solution of the following system of equations used to determine ν_{β,m}:

Σ_{β=1}^{n} ν_{β,m} C(u_β − u_α) + μ(u) = C(u − u_α),   α = 1, ..., n
Σ_{β=1}^{n} ν_{β,m} = 1    (32)

where the ν_{β,m}’s are the OK weights, C(u_β − u_α) is the covariance of the RV between measurement points β and α, and μ(u) is the Lagrange parameter associated with the constraint in the second expression in (32), designed to ‘filter’ out the local mean value to preserve stationarity (see Section 4). The solution of the inverse problem using the Pilot Point method relies upon an expanded form of Equation 31:

Z_m^*(u) = Σ_{β=1}^{n} ν_{β,m} Z(u_β) + Σ_{p=1}^{N} ν_{p,m} Z(u_p)    (33)

where Z(u_p) is the log transmissivity at a pilot point, ν_{p,m} is its associated kriging weight and N is the total number of pilot points. The location and number of the pilot points were subjectively chosen by de Marsily. He suggested that the number of pilot points be less than the number of observed transmissivity values and that the pilot points be placed in areas with a high hydraulic-head gradient. As illustrated in Equation 32, the kriging weights ν_{β,m} (and thus ν_{p,m}) do not depend on Z(u). Thus, they may be solved for prior to assigning transmissivities to the pilot points. De Marsily’s Pilot Point Approach thus consists of estimating Z(u_p) so as to minimize the weighted least-squares objective function

J_H(Z) = (h* − ĥ)^T V_h^{-1} (h* − ĥ)    (34)

which is the same as Equation 27 with the exception that the minimization is with respect to Z(u_p), the pilot-point log transmissivities. The derivatives required to guide the optimal selection of pilot-point transmissivities are computed with the adjoint technique (see Section 4) and the kriging equations (Equation 32). The overall equation solved by the Pilot Point Technique may be presented as follows. Let Z_m represent the log transmissivity value assigned to grid block m. Using the chain rule, the sensitivity of the weighted least-squares objective function, J, to the log transmissivity value assigned to each pilot point may be expressed by:

dJ/dZ_p = Σ_{m=1}^{M} (∂J/∂Z_m)(∂Z_m/∂Z_p)    (35)

where M is the total number of grid blocks in the flow model. The second term on the right-hand side of Equation 35 may be reduced by taking the derivative of Equation 33 with respect to the pilot-point log transmissivity. Therefore,

∂Z_m^*(u_m)/∂Z_p(u_p) = ν_{m,p}    (36)

Substituting the right-hand side of Equation 36 into Equation 35 and simplifying yields

dJ/dZ_p = 2.3 Σ_{m=1}^{M} ν_{m,p} k_m (dJ/dk_m)    (37)

where k_m is the permeability of grid block m. The sensitivity coefficient, dJ/dk_m, is obtained by adjoint sensitivity analysis and used to determine the optimal values of the pilot-point log transmissivities by modifying the initial guess of the pilot-point transmissivities (taken as the kriged estimate at u_p obtained from the measured data). De Marsily states that constraints could be applied to ensure that the pilot-point transmissivities lie within ±2 standard deviations of the kriged estimate. De Marsily’s pilot point technique was the first to solve many of the problems inherent in the earlier inverse techniques. By incorporating geostatistics and logarithms of transmissivity, he ensured a smooth transmissivity field even in the presence of significant errors in the head field. In addition, he stabilized his inverse technique by 1) adding prior information in the form of initial kriged estimates at the pilot point locations, 2) properly posing the inverse problem by reducing the number of unknowns to a small number of pilot points, 3) implementing the transmissivity changes in the neighborhood of a pilot point in accord with the variogram, and 4) utilizing efficient adjoint sensitivity techniques to obtain the necessary derivatives for the optimization routine. De Marsily’s pilot point technique was first published in English in 1982 at a NATO Advanced Study Institute Conference held in Newark, Delaware (de Marsily, 1982). One of the first applications of the Pilot Point technique published in English was presented in de Marsily et al. (1984). Here the Pilot Point technique was used to obtain the permeability field which reproduced interference tests in a well field. De Marsily used a two-dimensional finite-difference model (Figure 2-22) to model steady-state and transient gas flow. As presented above, he utilized kriging to obtain the initial transmissivity at each of the grid blocks and at the pilot point locations (Figure 2-22). The final transmissivity field obtained by the Pilot Point method successfully matched the transient interference tests (Figure 2-23). A sensitivity analysis was conducted to determine the impact of recharge upon the final transmissivity field (Figure 2-24). Even though this 1984 paper by de Marsily et al. is referenced as the original pilot point paper, his technique had been in print since 1978 in France. More details on developments in the Pilot Point technique are presented below.
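The chain rule of Equation 37 can be checked numerically. In the sketch below the grid size, the kriging weights ν_{m,p}, the fixed contribution of the measured data and the toy objective J are all hypothetical stand-ins (in a real model dJ/dk_m would come from the adjoint analysis); the check compares the chain-rule gradient with finite differences of J with respect to the pilot-point values.

```python
import numpy as np

rng = np.random.default_rng(1)
M, P = 6, 2                                   # grid blocks, pilot points
nu = rng.uniform(0.0, 1.0, (M, P))            # hypothetical kriging weights
c = rng.uniform(-1.0, 1.0, M)                 # fixed part from measured data
k_target = rng.uniform(0.5, 2.0, M)
Z_p = np.array([0.5, -0.3])                   # pilot-point log10 T values

def J_of_Zp(Zp):
    Z = nu @ Zp + c                           # kriged log10 T per block (cf. Eq. 33)
    k = 10.0 ** Z                             # block permeability
    return np.sum((k - k_target) ** 2)        # toy objective standing in for J

# chain rule of Eq. 37: dJ/dZ_p = ln(10) * sum_m nu_{m,p} * k_m * dJ/dk_m
# (ln 10 is the 2.3 in Eq. 37, since Z is log10 T)
k = 10.0 ** (nu @ Z_p + c)
dJ_dk = 2.0 * (k - k_target)                  # analytic here; adjoint in practice
grad = np.log(10.0) * (nu.T @ (k * dJ_dk))

# verify against central finite differences of J w.r.t. the pilot points
fd = np.array([(J_of_Zp(Z_p + d) - J_of_Zp(Z_p - d)) / 2e-6
               for d in 1e-6 * np.eye(P)])
```

The gradient with respect to a few pilot points is thus assembled from the block-level sensitivities and the precomputed kriging weights, which is what keeps the pilot-point parameterization cheap.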

Figure 2-22. Observation wells (O) and pilot-point locations (X) in central part of the grid (from de Marsily et al., 1984)


Figure 2-23. Fitting of the inverse model (from de Marsily et al., 1984)

Figure 2-24. Transmissivity (in Darcy-meters) for the central part of the grid. Fitting without recharge shown on left; fitting with recharge shown on right (from de Marsily et al., 1984).

In 1979, de Marsily spent a year of sabbatical at the University of Arizona in Tucson working directly with Neuman. Shortly thereafter, Neuman modified the approach he presented in 1979 to include logarithms of transmissivity, the adjoint sensitivity method, and a Fletcher-Reeves conjugate-gradient optimization routine which used Newton’s method to determine the optimum change in the parameters (Neuman, 1980). Over the next few years, Neuman’s modified inverse method was applied to several problems. Binsariti (1980) applied Neuman’s


modified inverse method to the Cortaro Basin in southern Arizona, the same site as presented in Neuman et al. (1980). Binsariti conducted unconditional simulation of the aquifer using the semi-variogram of log transmissivity, conditional simulation on the basis of kriging as per Delhomme (1979), and conditional simulation using the results of the inverse solution (Equation 30). He demonstrated that the reduction in variance of the head values was greatest when conditioning to both the measured heads and transmissivities. Other applications of Neuman’s inverse technique may be found in Clifton and Neuman (1982), Fennessy (1982), and Jacobson (1985). In 1986, Carrera and Neuman published a new non-linear method in a three-part paper (Carrera and Neuman, 1986a,b,c) which is considered by many to be one of the significant contributions to solving the inverse problem. Carrera and Neuman presented a method for estimating aquifer model parameters under steady-state and/or transient conditions. The parameters included values and directions of principal hydraulic conductivities, specific storage, interior recharge or leakage rates, boundary heads and/or fluxes, head-dependent recharge or leakage coefficients and parameters controlling the error structure of the data. They posed the inverse estimation problem in a non-linear maximum likelihood (ML) framework through the use of a log-likelihood objective function which included measured heads as well as prior information on the model parameters. They relied on adjoint-state finite element theory and a combination of conjugate-gradient algorithms to minimize the non-linear ML estimation criterion. Carrera and Neuman assume a vector β comprising unknown model parameters, α (e.g., directions of principal hydraulic conductivities, specific storage, interior recharge or leakage rates, boundary heads and/or fluxes, head-dependent recharge or leakage coefficients), and statistical parameters controlling the error structure of the data.
The objective is to estimate β given measurements of head, h_ob, and prior estimates of the model parameters, α*. The maximum likelihood estimate of β given z* = (h_ob, α*) maximizes the probability density of z* if β were true. The likelihood function may be written as:

L(β|z*) = f(z*|β) = [(2π)^N |C_z|]^{-1/2} exp[−½ (ẑ − z*)^T C_z^{-1} (ẑ − z*)]    (38)

where C_z is a block-diagonal matrix composed of C_h and C_α, which are initial guesses of the covariances (σ_h^2 V_h and σ_α^2 V_α) for the heads and model parameters respectively. As in the earlier approach presented by Neuman (Equations 26 to 28), Carrera and Neuman set the weighting ratio λ equal to σ_h^2/σ_α^2. Their method automatically maximizes the above expression with respect to the model parameters, σ_h^2 and σ_α^2. Other statistical parameters, such as λ, are determined by trial and error. In practice, maximizing Equation 38 is equivalent to minimizing the expression −2 ln L. They show that after dropping terms depending only on statistical parameters, the following set of equations is obtained:

J = J_h + λ J_α    (39)

where

J_h(α̂) = (h* − ĥ)^T V_h^{-1} (h* − ĥ)    (40)

J_α(α̂) = (α* − α̂)^T V_α^{-1} (α* − α̂)    (41)
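The step from Equation 38 to Equations 39-41 can be verified numerically: with a block-diagonal C_z, the quantity −2 ln L differs from (J_h + λ J_α)/σ_h², with λ = σ_h²/σ_α², only by a constant. The dimensions and values below are hypothetical:

```python
import numpy as np

# Check that -2 ln L (Eq. 38) equals a constant plus (Jh + lam*Ja)/sig_h2.
rng = np.random.default_rng(2)
nh, na = 4, 2
sig_h2, sig_a2 = 0.5, 2.0
V_h = np.eye(nh)                              # assumed unit correlation matrices
V_a = np.eye(na)
C_z = np.block([[sig_h2 * V_h, np.zeros((nh, na))],
                [np.zeros((na, nh)), sig_a2 * V_a]])

z_star = rng.normal(size=nh + na)             # measurements (h_ob, alpha*)
z_hat = rng.normal(size=nh + na)              # model-predicted counterpart

def neg2lnL(zh):
    d = zh - z_star
    N = len(d)
    return (N * np.log(2 * np.pi) + np.log(np.linalg.det(C_z))
            + d @ np.linalg.solve(C_z, d))

# weighted least-squares pieces of Eqs. 39-41
rh = z_hat[:nh] - z_star[:nh]
ra = z_hat[nh:] - z_star[nh:]
Jh = rh @ np.linalg.solve(V_h, rh)
Ja = ra @ np.linalg.solve(V_a, ra)
lam = sig_h2 / sig_a2
const = (nh + na) * np.log(2 * np.pi) + np.log(np.linalg.det(C_z))
```

Because the constant does not depend on the model parameters, minimizing J_h + λ J_α is equivalent to maximizing the likelihood, which is the point of the derivation above.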

Note the similarities of these expressions to those presented by Neuman and Yakowitz (1979) in Equations 26, 27 and 28 above. The main difference is the inclusion of additional model parameters in the optimization process. The parameter vector is iteratively updated by a vector d_i = α_i − α_{i−1} given by H d_i = −g_i, where H is the first-order approximation of the Hessian of J and g_i is its gradient (Carrera and Neuman, 1986a). Both are computed using the parameters at iteration i, α_i. Once convergence is achieved, Carrera and Neuman suggest an analysis of the estimation errors using Equations 29 and 30 presented earlier. Carrera and Neuman’s papers discussed issues such as optimal parameterization, identifiability, uniqueness and stability of the inverse solution. They also provided details of the adjoint-state finite element and conjugate-gradient minimization algorithms. They also demonstrated various capabilities of their inverse model, INVERT, using synthetic and real world data. One application they presented resulted in the identification of the conductivity field of alluvium deposits along the Colorado River. The objective of this study was to quantify groundwater return flow to the river (Figure 2-25). The study was carried out in cooperation with the U.S. Geological Survey and the U.S. Bureau of Reclamation. INVERT automatically calibrated the conductivity of two cross-sectional models along the Colorado River (Figures 2-26a and 2-26b). Figure 2-27 illustrates the calibrated model’s fit to the heads measured during a flood event at piezometers installed in cross sections downstream of the Laguna Dam (Figures 2-25 and 2-26a). The calculated and measured heads match well. A validation test conducted using another flood event also resulted in a good match between model-calculated and measured heads (Figure 2-28). Carrera and Neuman provided numerous conclusions with their study.
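The update rule H d_i = −g_i can be sketched with a Gauss-Newton approximation H = SᵀS of the Hessian on a hypothetical exponential forward model. All values below are invented; this is the generic scheme, not Carrera and Neuman’s INVERT code:

```python
import numpy as np

# Gauss-Newton iteration: at each step solve H d = -g with H = S^T S.
# Toy forward model h_i(alpha) = exp(g_i . alpha); values are hypothetical.
G = np.array([[1., 0.], [0., 1.], [1., 1.]])
alpha_true = np.array([0.3, -0.2])
h_star = np.exp(G @ alpha_true)               # error-free "measured" heads

alpha = np.zeros(2)                           # initial parameter guess
for _ in range(15):
    h = np.exp(G @ alpha)                     # forward model
    r = h - h_star
    S = h[:, None] * G                        # Jacobian dh/dalpha
    g = S.T @ r                               # gradient of J = 0.5*||r||^2
    H = S.T @ S                               # Gauss-Newton Hessian approximation
    alpha = alpha + np.linalg.solve(H, -g)    # update step H d = -g
```

For this small-residual problem the iterates converge rapidly to the parameters that generated the data, which is the behavior the conjugate-gradient and Gauss-Newton machinery in the text is designed to deliver at scale.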
They investigated the relative worth of steady-state versus transient-state data and concluded that the greatest reduction in parameter-estimation error occurred when steady-state and transient-state calibration were conducted simultaneously. They attribute this to the significant difference between the flow fields within the aquifer under these conditions. They found that transient data had only a slight advantage over steady-state data; in fact, doubling the quantity of steady-state head data without changing the flow regime results in only a minor improvement in the quality of the estimates. Carrera and Neuman (1986a, b, c) also demonstrated the worth of prior information upon parameter estimates. They showed that in some cases, excluding prior information leads to parameter-estimation errors which are larger than those associated with the prior information. In their synthetic example, they showed that prior information mainly benefited log-transmissivity estimates since the calculated hydraulic heads are relatively insensitive to transmissivity (Carrera and Neuman, 1986c). Also, they highlighted the potential benefits of including prior information in monitoring-well network design and presented the technique of model structure identification for the optimum zonation pattern of the transmissivity field. They concluded that inadequate representation of the model structure may be far more detrimental to the solution of the inverse problem than noise in the data (Carrera and Neuman, 1986c). Since 1986, Carrera has continued to improve and enhance the INVERT code (Carrera, 1994). He has published numerous papers on the inverse problem and the application of his technique to groundwater flow and transport modeling problems (Carrera, 1988; Carrera et al., 1990; Carrera and Glorioso, 1991; Carrera et al., 1993; Carrera and Medina, 1994; Medina and Carrera, 1996). In the late 1980’s and early 1990’s, research of geostatistical methods by Ahmed and de Marsily (1987) and research of the inverse problem by Ahmed (1987) and Certes and de Marsily (1991) resulted in an improved Pilot-Point method which is relevant to the research presented in this thesis. In Certes and de Marsily, a Pilot Point method for a finite-difference grid with local mesh refinement was developed. They applied their code to several synthetic and real-world problems. They demonstrated that the adjoint derivatives are more accurate when taken relative to the discretized flow equation rather than by discretizing the adjoint derivative of the continuous flow equation. They also investigated the character of the pilot point values during the optimization process.

Figure 2-25. Location map of cross sections 8 and 16 (from Carrera and Neuman, 1986c)


Figure 2-26a. Geology of cross section 16 and piezometer arrangement (from Carrera and Neuman, 1986c)

Figure 2-26b. Portion of finite element grid nearest to river for cross section 16 with threefold vertical exaggeration. Distances are in feet (from Carrera and Neuman, 1986c)


Figure 2-27. Computed (solid curve) and measured (dots) heads versus time at cross section 16. Flood of August 1973 (from Carrera and Neuman, 1986c)

Figure 2-28. Computed (solid curve) and measured (dots) heads versus time at cross section 16 for flood of March 1975. Parameters from calibration against August 1973 flood (from Carrera and Neuman, 1986c)


In their 1991 paper, Certes and de Marsily applied the Pilot-Point method to the synthetic example presented in Carrera and Neuman (1986c), Figures 2-29 through 2-31. Their results are reproduced in Tables 2-1 and 2-2. Cases A, B and C correspond to variations in the starting values assigned to the pilot points. In case A, the pilot points are located at the observed transmissivity locations (Figure 2-29) and assigned values consistent with the 'true' values. In cases B and C, the pilot points are located at the same points in the model domain but assigned constant values (51.8 m2/day for case B and 5.18 m2/day for case C). As illustrated in Tables 2-1 and 2-2, Certes and de Marsily's results show good global agreement with the observed heads; on average, the difference between the observed and calculated heads is 0.2 m. The resulting transmissivity fields vary in their agreement with the 'true' field (Figures 2-32 and 2-33 for cases A and B, respectively). Certes and de Marsily found that, in general, adding conditioning data (e.g., transient heads or transient drawdowns over 1000 days) to the solution of the inverse problem enabled a more accurate solution. They also compared their method to a previously hand-calibrated model of an aquifer near Dijon (Figure 2-34). They again used several configurations of pilot points (referred to as Test A and Test D) and assigned different starting values to them. Figure 2-35 illustrates the transient hydrographs for the three observation wells in the Dijon model. They found that their inverse transmissivities for case D compared very closely to the results of the previous manual calibration (Figure 2-36). Certes and de Marsily concluded that while the zonation approach of Carrera and Neuman provided more accurate results on the synthetic problem than the pilot-point technique, the pilot-point technique used no a priori information and still yielded a consistent solution. They also stated that they favor incorporating prior information into the solution of the inverse problem through expert judgement rather than directly through the objective function.
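The iterative pilot-point strategy shared by these papers (perturb transmissivity at a few added points, re-interpolate the full field, re-run the flow model, and repeat until the computed heads match the observations) can be sketched with a toy example. Everything below is hypothetical and greatly simplified: inverse-distance weighting stands in for kriging, a crude one-dimensional head-accumulation rule stands in for the groundwater flow simulator, and the largest head residual stands in for a formal pilot-point placement criterion.

```python
import math

# Toy pilot-point calibration loop (hypothetical 1-D problem).
XS = list(range(11))                                   # node coordinates
TRUE_LOGT = [math.sin(x / 3.0) for x in XS]            # 'true' log-transmissivity
MEASURED = {0: TRUE_LOGT[0], 10: TRUE_LOGT[10]}        # sparse logT measurements

def interpolate(data, x, power=2.0):
    """Inverse-distance interpolation of logT (a stand-in for kriging)."""
    if x in data:
        return data[x]
    num = den = 0.0
    for xd, v in data.items():
        w = 1.0 / abs(x - xd) ** power
        num += w * v
        den += w
    return num / den

def forward_heads(logt):
    """Crude forward model: head increments scale with 1/T between nodes."""
    heads = [0.0]
    for i in range(1, len(logt)):
        heads.append(heads[-1] + 1.0 / math.exp(0.5 * (logt[i] + logt[i - 1])))
    return heads

def head_misfit(data, obs):
    logt = [interpolate(data, x) for x in XS]
    resid = [h - o for h, o in zip(forward_heads(logt), obs)]
    return math.sqrt(sum(r * r for r in resid) / len(resid)), resid

obs = forward_heads(TRUE_LOGT)                         # synthetic observed heads
data = dict(MEASURED)                                  # measurements + pilot points
initial_rms, _ = head_misfit(data, obs)

for _ in range(50):                                    # calibration iterations
    rms, resid = head_misfit(data, obs)
    if rms < 0.01:
        break
    # Place/adjust a pilot point at the node with the largest head residual
    # (a crude proxy for a sensitivity-based placement criterion) and nudge
    # logT there: computed heads too high imply T is too low.
    i = max((k for k in range(len(XS)) if XS[k] not in MEASURED),
            key=lambda k: abs(resid[k]))
    data[XS[i]] = interpolate(data, XS[i]) + (0.2 if resid[i] > 0 else -0.2)

final_rms, _ = head_misfit(data, obs)
```

With these stand-ins the loop reduces the head misfit, but it is only meant to show the structure of the algorithm, not the numerical behavior of the published methods on a real model.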

Figure 2-29. Synthetic example characteristics (from Certes and de Marsily, 1991).


Figure 2-30. Computed reference steady state (from Certes and de Marsily, 1991)

Figure 2-31. Computed true transient state; the initial drawdown is zero (from Certes and de Marsily, 1991)


Table 2-1. Comparison of the various tests conducted to the zoned solution on the basis of arithmetic averaging of transmissivity in each zone. Transmissivities: average per zone (m2/day) (from Certes and de Marsily, 1991).

                  ---------------- Case A ----------------    ---------------- Case B ----------------    - Case C -
Zone  Zoned       Init   S      TH100  TH1000 TD100  TD1000   Init   S      TH100  TH1000 TD100  TD1000   Init   S
      solution
1     150.        159.   151.   159.   154.   170.   152.     51.8   147.   147.   155.   154.   151.     5.18   153.
2     150.        120.   126.   116.   28.    121.   117.     51.8   172.   158.   106.   104.   111.     5.18   118.
3     150.        129.   140.   142.   144.   177.   138.     51.8   206.   191.   133.   149.   147.     5.18   142.
4     50.0        48.3   57.6   52.5   46.5   9.51   72.2     51.8   53.0   9.19   85.2   183.   74.7     5.18   51.8
5     50.0        58.2   46.7   41.5   48.0   61.3   42.7     51.8   41.4   42.0   62.0   91.1   33.5     5.18   53.6
6     50.0        50.7   54.1   54.3   53.7   59.2   64.0     51.8   82.1   55.9   56.4   52.8   76.5     5.18   44.8
7     15.0        17.6   18.6   17.3   16.0   13.0   26.1     51.8   23.2   5.44   14.8   34.8   34.7     5.18   19.4
8     15.0        18.6   18.9   16.2   16.8   39.1   18.0     51.8   19.3   19.7   17.1   22.7   18.4     5.18   17.5
9     5.00        6.20   6.90   6.19   5.97   6.52   6.00     51.8   6.93   5.88   5.98   14.9   6.17     5.18   5.76

Table 2-2. Optimization results of the various tests conducted. The head fit corresponds to the average fit at the 18 observation wells (from Certes and de Marsily, 1991).

              Objective function (x10^8 m^2)    Head-level average fit (m)    Number of
Case  Test    Init     Optim    Ratio           Init     Optim                Iter.   Simul.
A     S       45.      7e-8     637             1.7      7e-5                 174     204
A     TH100   8.3      .058     143             2.3      .19                  115     123
A     TH1000  78.      .828     94              3.9      .23                  136     164
A     TD100   7.2      .030     240             2.1      .14                  80      96
A     TD1000  76.      .534     142             3.9      .19                  198     249
B     S       6425.    2.1      3119            20.3     .36                  270     369
B     TH100   380.     6.6      57              15.6     2.07                 155     207
B     TH1000  6135.    1.4      4478            19.9     .29                  292     416
B     TD100   21.      .064     328             3.6      .20                  69      103
B     TD1000  555.     .661     839             6.0      .21                  428     562
C     S       2.e6     362.     5525            17.4     1.05                 47      62

Figure 2-32. Transmissivity Fields and Calculated Heads from Inverse Solution of Case A (from Certes and de Marsily, 1991)


Figure 2-33. Transmissivity Fields and Calculated Heads from Inverse Solution of Case B (from Certes and de Marsily, 1991)


Figure 2-34. Dijon aquifer: discretization, boundary conditions and head observation points (from Certes and de Marsily, 1991)

Figure 2-35. Dijon aquifer calibration and validation results (1980-1984) at the three observation wells (head levels) and the monitored spring (flow) (from Certes and de Marsily, 1991)


Figure 2-36. Dijon aquifer transmissivity contour maps: kriged optimum for tests A and D, and zoned from manual calibration (from Certes and de Marsily, 1991)

In 1992, Lavenue and Pickens (see Appendix A) presented a new version of the pilot-point method applied to a regional aquifer model in the vicinity of the Waste Isolation Pilot Plant (WIPP) radioactive-waste repository site (Figure 2-37). In this study, the position of the pilot points was optimized for the first time; all previous pilot-point studies had selected the pilot-point positions arbitrarily. Lavenue and Pickens estimated the initial transmissivity field used in their study by kriging 60 transmissivity values derived from pump tests (Figure 2-38). The groundwater flow model used the kriged transmissivity field to calculate the model heads. Upon comparing the model-calculated heads to the measured heads, an objective function was obtained and adjoint sensitivities were employed to determine the optimum location for a pilot point (Figures 2-39 and 2-40). The pilot-point location was selected from the highest sensitivity derivatives of the groundwater flow equation with respect to a set of potential pilot-point locations (Equation 37). Once located, the pilot-point transmissivity value was assigned by expert judgment. The process was then repeated once the pilot point was added to the database of observed values used in kriging (Figure 2-41). When the differences between the model-calculated and measured heads were reduced below a prescribed head-uncertainty value, the iterations ended. Lavenue and Pickens illustrated the ease with which the pilot-point method may be used in calibrating steady-state and transient-state flow models. They successfully matched the steady-state flow model (Table 2-3), which was then used as input to the transient flow model. The transient-head database used by Lavenue and Pickens contained over 100,000 head measurements from 30 boreholes spanning a 10-year characterization period.
The process used in transient calibration was similar to that used in steady-state calibration. The main difference was the technique used to determine the objective function and the derivatives used in the adjoint-state calculation. The objective function was calculated by summing the head differences at selected boreholes over a given period of time (i.e., over the period of a pump test and recovery). Once calculated, transient adjoint-state derivatives identified the most sensitive location for a pilot point, given a set of potential pilot-point locations to choose from (Figures 2-42 and 2-43). After the model was calibrated to a given pump test (Figure 2-44), calibration focused upon another test identified as a calibration event. The calibrated transmissivity field (Figure 2-45) produced by Lavenue and Pickens was eventually used in transport calculations for the 1990 Performance Assessment to determine the safety of WIPP as a repository.

Figure 2-37. Location of the Waste Isolation Pilot Plant (WIPP) site in southeastern New Mexico (from Lavenue and Pickens, 1992)
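The placement step can be illustrated with finite-difference sensitivities standing in for the adjoint-state derivatives: perturb the log-transmissivity at each candidate pilot-point location, observe the change in the head-misfit objective function, and select the location with the largest absolute derivative. The toy forward model and all values below are hypothetical, not those of the WIPP model.

```python
import math

N = 8
TRUE_LOGT = [1.0 + 0.8 * math.sin(x) for x in range(N)]   # 'true' field (toy)

def heads(logt):
    # toy forward model: the head drop across cell i is 1/exp(logT_i)
    h, out = 0.0, [0.0]
    for v in logt[1:]:
        h += 1.0 / math.exp(v)
        out.append(h)
    return out

def objective(logt, observed):
    # sum of squared head differences (the calibration objective function)
    return sum((a - b) ** 2 for a, b in zip(heads(logt), observed))

observed = heads(TRUE_LOGT)          # synthetic 'measured' heads
current = [1.0] * N                  # uncalibrated uniform logT field
base = objective(current, observed)

# dJ/dlogT at each candidate location by one-sided finite difference
# (a stand-in for the adjoint-state derivative)
eps = 1e-4
sensitivity = []
for i in range(N):
    trial = list(current)
    trial[i] += eps
    sensitivity.append((objective(trial, observed) - base) / eps)

# the candidate with the largest |dJ/dlogT| receives the next pilot point
best = max(range(N), key=lambda i: abs(sensitivity[i]))
```

In this toy model the upstream cell (index 0) has zero sensitivity because no head depends on it, which mimics how candidate locations outside the influence of the observation wells are never selected.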

Figure 2-38. Initial kriged transmissivity field (from Lavenue and Pickens, 1992)


Figure 2-39. Differences between the calculated and the observed fresh-water heads for the initial simulation of steady state conditions (from Lavenue and Pickens, 1992)

Figure 2-40. Normalized sensitivities of northern borehole pressure differences to changes in transmissivities at potential pilot point locations (from Lavenue and Pickens, 1992)


Figure 2-41. Flow chart representing adjoint sensitivity and kriging approach to model calibration (from Lavenue and Pickens, 1992)

Figure 2-42. Calculated and observed transient freshwater heads at H-11, DOE-1, and H-15 prior to calibration to the H-11 long-term pumping test (from Lavenue and Pickens, 1992)


Table 2-3. Differences Between Calculated and Observed Freshwater Heads During Calibration to the Steady State Heads and After Transient Calibration (from Lavenue and Pickens, 1992)

Borehole   Initial Head      Head Differences (m)        Head Differences (m)
           Differences (m)   after Steady-State Cal.     after Transient-State Cal.
H-1        -0.60             -1.30                       -0.95
H-2         0.41              1.20                        1.77
H-3         4.08             -2.12                       -0.33
H-4         3.92             -0.99                        1.62
H-5        -3.08              1.08                       -1.20
H-6        -4.78              0.07                        0.23
H-7        -2.26             -1.67                       -1.69
H-9         0.96              0.86                        0.78
H-10       -5.98             -1.63                       -2.07
H-11        7.81             -0.22                       -0.40
H-12        1.23             -0.23                       -0.44
H-14        5.65              0.90                       -0.30
H-15        8.88              1.57                       -0.33
H-17        8.36              1.52                        1.81
H-18       -4.92             -1.16                       -0.90
P-14       -3.12             -1.31                       -1.02
P-15       -0.66             -0.77                        0.93
P-17        5.09             -1.37                       -1.23
W-12       -4.07             -0.05                        0.10
W-13       -5.49             -0.79                       -0.70
W-18       -2.38              0.81                        0.05
W-25       -3.09             -0.02                        0.07
W-26       -1.29             -0.98                       -0.98
W-27        0.92              0.45                        0.38
W-28        2.31              1.09                        0.56
W-30       -2.08             -0.56                       -0.66
CB-1        5.16             -1.15                        0.01
DOE-1       7.49             -0.40                       -0.70
DOE-2      -4.98             -0.27                       -0.06
D-268       0.37              0.89                        1.40
USGS-1     -0.53              0.21                        0.18
USGS-4     -0.53              0.21                        0.18
Average     3.52              0.87                        0.75

Figure 2-43. Normalized sensitivities of H-15 and DOE-1 transient pressure performance measure to changes in transmissivities at potential pilot point locations (from Lavenue and Pickens, 1992)

Figure 2-44. Calculated and observed transient freshwater heads at H-11, H-15, and DOE-1 after calibrating to the H-11 long-term pumping test (from Lavenue and Pickens, 1992)


Figure 2-45. Transient calibrated log10 transmissivities (from Lavenue and Pickens, 1992)

Gomez-Hernandez and his fellow researchers at the University of Valencia, Spain, developed a linearized Pilot-Point technique that differs from that of de Marsily, Certes, or Lavenue (Sahuquillo et al., 1992). Their technique is extremely efficient owing to several simplifying assumptions they make concerning the inverse solution. Figure 2-46 illustrates this technique. A transmissivity field is generated and conditioned to the measured transmissivity values. The groundwater flow equation is solved, and it is then determined whether the calculated heads lie within an acceptable error range of the measured heads. If the calculated heads do not agree adequately with the measured heads, the transmissivity field is changed by modifying the transmissivity at 'pilot points', which they refer to as 'master locations', selected at various locations within the model grid. Gomez-Hernandez and others select the pilot-point locations before solving the inverse problem, as do de Marsily and Certes. The solution of the inverse problem using Gomez-Hernandez' approach is significantly faster than that of Lavenue and Pickens or Certes and de Marsily because of two major differences. First, Gomez-Hernandez does not update the sensitivity derivatives as frequently as the other pilot-point approaches. He attempts to solve the inverse problem using the sensitivity derivatives obtained around the initial objective function; only if convergence is not achieved is an updated set of sensitivities obtained to facilitate the optimization process (Figure 2-46). In contrast, the approaches of Certes and de Marsily and of Lavenue and Pickens solve the groundwater flow equation, determine the updated objective function, and re-solve for the associated sensitivity derivatives with each pass of the inverse solution (see Figure 2-41).
The second major difference in Gomez-Hernandez's method is that he assumes the conductance between two finite-difference grid blocks is expressed by the geometric mean (as opposed to the harmonic mean) of the associated grid-block transmissivities. This significantly speeds up the calculation of the derivatives necessary for the inverse solution.
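The two interblock-conductance rules can be compared directly. In log space the geometric mean is simply the arithmetic average of the two block log-transmissivities, ln G = (ln t1 + ln t2)/2, so its derivative with respect to either log-transmissivity is a constant 1/2; that is the property that cheapens the sensitivity calculations. A minimal sketch (illustrative values only):

```python
import math

def harmonic_mean(t1, t2):
    # conventional finite-difference interblock transmissivity
    return 2.0 * t1 * t2 / (t1 + t2)

def geometric_mean(t1, t2):
    # Gomez-Hernandez's simplifying assumption
    return math.sqrt(t1 * t2)

# For equal block values the two rules coincide, but for contrasting
# values the harmonic mean is dominated by the smaller transmissivity
# while the geometric mean is not:
h = harmonic_mean(1.0, 100.0)    # ~1.98
g = geometric_mean(1.0, 100.0)   # 10.0
```

The harmonic mean reflects the physics of series flow (the low-transmissivity block controls the interblock flux), which is why the geometric-mean rule is a deliberate approximation traded for computational speed.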


Gomez-Hernandez and his fellow researchers applied this technique to the WIPP regional aquifer model published earlier by Lavenue and Pickens (1992) and Lavenue and RamaRao (1992). They discovered that by making the above simplifying assumptions, the inverse solution for a large set of calibrated transmissivity fields could be efficiently obtained. Then, uncertainty in other quantities depending on the calibrated model, such as groundwater travel time, could be investigated (Sahuquillo et al., 1992). Other researchers have also contributed their versions of codes that solve the indirect groundwater inverse problem for flow and/or transport. For example, Grindrod and Impey (1991) published an indirect inverse technique which generates fractal simulations of the transmissivity field and then automatically modifies the model boundary conditions in order to calibrate the model. Other methods or applications may be found in Cooley, 1982, 1983; Townley and Wilson, 1985; Loaiciga and Marino, 1987; Peck et al., 1988; Van Geer et al., 1991; Hill, 1992; Sun and Yeh, 1992; Sun, 1994; Xiang et al., 1993; Christiansen et al., 1995; Eppstein and Dougherty, 1996; Poeter and Hill, 1997; and others.

Figure 2-46. Inverse problem solution routine used by Sahuquillo et al. (1992)

2.2.3 Linear Geostatistical Methods

The first linear geostatistical method, presented by Kitanidis and Vomvoris (1983), formulated the linear inverse problem for one-dimensional groundwater flow within the framework of standard statistical inference. In their approach, the inverse problem is solved in two stages. The first stage, comprising the heart of the inverse procedure, consists of structural analysis, whereby the parameter fields (transmissivity and head) are characterized by inferring a model (mean, covariance, cross-covariance) from the measured head and transmissivity data. The second stage uses the covariance and cross-covariance models identified in stage one and the measured head and transmissivity data to estimate the transmissivity field. This step employs the co-kriging algorithm to obtain the best unbiased linear estimate consistent with the measured head and transmissivity data. Heads over the discretized model domain are then computed with the transmissivity field determined through the inverse, the prescribed boundary conditions, and the groundwater flow equation.

In stage one, exploratory data analysis provides an indication of the structure of the model, e.g., the covariances of head and transmissivity as well as the cross-covariance of head and transmissivity. In order to obtain a straightforward expression for the cross-covariance between head and transmissivity, Kitanidis and Vomvoris separated the groundwater flow equation into a deterministic portion and a stochastic portion. This involved separating the head field and the log-transmissivity field into two parts, a mean field and a perturbation field consisting of small variations around the mean. They set:

    \phi = H + h                                                                            (42)

and

    Y = F + f                                                                               (43)

where \phi is the hydraulic head, H is the expected head value (mean or drift), h is the zero-mean head perturbation, Y is log transmissivity, F is the expected log-transmissivity value, and f is the zero-mean log-transmissivity perturbation. By substituting these expressions (and recalling that Y = ln T) into the groundwater flow equation, one obtains:

    \frac{\partial f}{\partial x}\left(\frac{\partial H}{\partial x} + \frac{\partial h}{\partial x}\right) + \frac{\partial^2 H}{\partial x^2} + \frac{\partial^2 h}{\partial x^2} = 0     (44)

Note that if the expectation of Equation 44 is taken, one obtains:

    \frac{\partial^2 H}{\partial x^2} = 0                                                   (45)

One may solve for the mean head field (Equation 45) over a discretized domain by using appropriate boundary conditions and a constant transmissivity. If the variations in the transmissivity field are considered small (i.e., var f < 1.0), then the second-order terms of Equation 44 (products of the perturbation terms) may be removed. Subtracting the mean head field (Equation 45) from the resulting expression gives the stochastic equation used in Kitanidis' and Vomvoris' work:

    \frac{\partial^2 h}{\partial x^2} = -\frac{\partial f}{\partial x}\frac{\partial H}{\partial x}                                         (46)

Written in finite-difference form, Equation 46 becomes:

    h_{i-1} + h_{i+1} - 2h_i = -\frac{\Delta x}{2}\left(f_{i+1} - f_{i-1}\right)\frac{\partial H}{\partial x}                               (47)

where \Delta x is the evenly spaced finite-difference grid dimension.
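Equation 47 couples the head perturbations to the log-transmissivity perturbations through a linear tridiagonal system, so h can be recovered from a given f field with a single tridiagonal (Thomas-algorithm) solve. The sketch below uses hypothetical values throughout: unit grid spacing, unit mean gradient, and the perturbation h fixed to zero at both fixed-head boundaries.

```python
import math

def solve_tridiagonal(a, b, c, d):
    """Thomas algorithm: a = sub-, b = main-, c = super-diagonal, d = rhs."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Hypothetical 1-D setup: 11 nodes, unit spacing and mean gradient,
# a smooth small log-transmissivity perturbation f.
dx, dHdx = 1.0, 1.0
f = [0.1 * math.sin(2.0 * math.pi * i / 10.0) for i in range(11)]  # f_0 .. f_10

# Interior nodes i = 1..9 with h_0 = h_10 = 0.  Each row of the system is
# Equation 47: h_{i-1} - 2 h_i + h_{i+1} = -(dx/2) (f_{i+1} - f_{i-1}) dH/dx.
rhs = [-(dx / 2.0) * (f[i + 1] - f[i - 1]) * dHdx for i in range(1, 10)]
n = len(rhs)
h = solve_tridiagonal([1.0] * n, [-2.0] * n, [1.0] * n, rhs)
```

This linear map from f to h is exactly what makes the cross-covariance between head and log-transmissivity tractable in the linearized framework.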

Kitanidis and Vomvoris develop a covariance matrix of the measurement vector (containing both measured heads and transmissivities) in terms of the variogram of log transmissivity (variance or sill and correlation range). They then use a maximum likelihood technique to determine the variance and range of the log-transmissivity perturbations consistent with the observed data set (which they modified by removing the respective mean fields from the head and transmissivity data). Their linear method utilizes the Gauss-Newton method for maximum likelihood (ML) parameter estimation to obtain the coefficients of the covariances and cross-covariances. The log-likelihood function used to determine the covariance and cross-covariance terms is:

    L(z|\theta) = -\ln p(z|\theta) = \frac{N}{2}\ln(2\pi) + \frac{1}{2}\ln|Q| + \frac{1}{2}(z - \mu)^T Q^{-1} (z - \mu)                     (48)

where z is a vector of N available measurements, \mu is the mean of the measurements (head and transmissivity, respectively), and Q is the matrix of covariance and cross-covariance terms for head and transmissivity. The form of \mu and Q is determined in the first step of structural analysis (i.e., whether the mean is constant or contains a linear trend, and whether the covariance model is exponential or spherical). The vector of parameters expressing \mu and Q is determined by minimizing the negative log-likelihood function presented in Equation 48. The derivative of Equation 48 with respect to a scalar parameter \theta_j is:

    \frac{\partial L}{\partial \theta_j} = \frac{1}{2} Tr\left(Q^{-1}\frac{\partial Q}{\partial \theta_j}\right) - \left(\frac{\partial \mu}{\partial \theta_j}\right)^T Q^{-1}(z - \mu) - \frac{1}{2}(z - \mu)^T Q^{-1}\frac{\partial Q}{\partial \theta_j} Q^{-1}(z - \mu)     (49)

where \theta is determined as the set of \theta values at which the vector of derivatives is zero. For the solution, an iterative gradient-based method for minimizing the negative log-likelihood function is employed. The iterative equation is \theta_{j+1} = \theta_j - \rho_j R_j g_j, where \theta_j is the vector of parameters at the jth iteration, \rho_j is a scalar step-size parameter which is usually fixed at \rho_j = 1 or is determined through a line search, and g_j is the gradient vector at iteration j. R_j is the unit matrix for a steepest-descent approach to optimization or, if Newton's method is preferred, R_j is the inverse of the Hessian containing second-order derivatives of Q and \mu with respect to the parameters \theta (Kitanidis and Vomvoris, 1983).

Having obtained the covariance and cross-covariance terms in \theta, Kitanidis and Vomvoris use the co-kriging equations to solve for the linear weights applied to the observed log-transmissivity and head measurements. These weights are used in the following co-kriging expression to estimate the log transmissivity Y given the modified data set:

    \hat{Y} = \sum_{i=1}^{m} \lambda_i Y_i + \sum_{j=1}^{n} \mu_j (\phi_j - H_j)                                                            (50)

where Y_i are the measured log transmissivities at m measurement points, \phi_j are the measured head values at n locations, H_j is the mean of the hydraulic head field at measurement point j, and \lambda_i and \mu_j are the kriging weights assigned to the measured-transmissivity and measured-head locations given the estimation point. The covariance of the transmissivity, the heads and their

cross-covariance affects the magnitude of the kriging weights through the solution of the following system of co-kriging equations:

    \sum_{i=1}^{m} \lambda_i Q_{YY}(i,k) + \sum_{j=1}^{n} \mu_j Q_{Y\phi}(k,j) - \nu = Cov(Y, Y_k),   k = 1, ..., m                         (51)

    \sum_{i=1}^{m} \lambda_i Q_{Y\phi}(l,i) + \sum_{j=1}^{n} \mu_j Q_{\phi\phi}(j,l) = Cov(Y, \phi_l),   l = 1, ..., n                      (52)

    \sum_{i=1}^{m} \lambda_i = 1                                                                                                           (53)

where the Q's are the covariance and cross-covariance terms among the head and transmissivity measurement locations, Cov is the covariance between the measurement locations and the estimation point, and \nu is the Lagrange parameter associated with the constraint in Equation 53, designed to 'filter' out the local mean log-transmissivity value to preserve stationarity. Once the transmissivity field is estimated, the results may be used in a groundwater flow equation to solve for head over the model domain.

Kitanidis and Lane (1985) present a detailed explanation of the maximum likelihood method used by Kitanidis and Vomvoris and compare several variations of the solution technique. For example, Kitanidis and Lane compared the ML approach with the Restricted Maximum Likelihood (RML) technique, in which the drift coefficients are not specified as optimization parameters. Table 2-4 shows the results Kitanidis and Lane presented for a simple example in which the sill (θ1), the range (θ2), and the mean (θ3) were sought. It shows that the ML method yields biased estimates of the sill, range and mean if the drift coefficients (i.e., the mean) are included in the optimization problem as unknown parameters. Conversely, the RML method provides virtually unbiased estimates of the sill and range, but with larger variances than the ML method. Kitanidis and Lane state that this is due to a flatter log-likelihood function in the RML case. They also present two figures (Figures 2-47 and 2-48) that illustrate the search paths of the Gauss-Newton method versus Newton's method toward the optimum sill and range for two different exponential-variogram problems. They concluded that the Gauss-Newton method is more efficient than Newton's method for ML and RML optimization.

Table 2-4. Mean and variance of parameter estimates θ1, θ2, and θ3 obtained from 30 replicates through maximum likelihood (ML) and restricted maximum likelihood (RML) estimation (from Kitanidis and Lane, 1985).

Parameter   Actual value   ML mean   ML variance   RML mean   RML variance
θ1          4              2.88      3.66          4.04       12.90
θ2          5              3.53      9.36          4.97       31.94
θ3          10             10.16     1.30          --         --
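The ML bias documented in Table 2-4 has an elementary analogue that can be reproduced directly. The sketch below is hypothetical and far simpler than a variogram fit: it uses independent Gaussian data, i.e. Q = θ1·I in Equation 48. Maximizing the likelihood with the mean treated as a free parameter yields the biased N-divisor variance estimate, while the restricted likelihood corrects the divisor for the estimated drift coefficient, giving the unbiased N - 1 divisor.

```python
import math
import random

# Monte Carlo comparison of ML and RML variance estimates for independent
# Gaussian data with unknown mean (hypothetical setup).
random.seed(1)
TRUE_VAR, N, REPLICATES = 4.0, 10, 3000

ml_est, rml_est = [], []
for _ in range(REPLICATES):
    z = [random.gauss(10.0, math.sqrt(TRUE_VAR)) for _ in range(N)]
    mean = sum(z) / N                      # estimated drift coefficient
    ss = sum((v - mean) ** 2 for v in z)
    ml_est.append(ss / N)                  # ML: biased, E = (N-1)/N * TRUE_VAR
    rml_est.append(ss / (N - 1))           # RML: unbiased

ml_mean = sum(ml_est) / REPLICATES
rml_mean = sum(rml_est) / REPLICATES
```

The same mechanism, estimating the drift jointly with the covariance parameters, is what biases the ML sill and range estimates in the spatially correlated case of Table 2-4.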

Figure 2-47. Contour map of negative log-likelihood function and the path of Gauss-Newton and Newton iterations. Starting point (1.0, 0.1). (from Kitanidis and Lane, 1985)

Figure 2-48. Contour map of negative log-likelihood function and the path of Gauss-Newton iterations. Starting point (2.0, 0.2). (from Kitanidis and Lane, 1985)


Hoeksema and Kitanidis (1984) extended the linear approach of Kitanidis and Vomvoris to two dimensions, resulting in expanded versions of Equations 44, 45 and 46:

    \frac{\partial f}{\partial x}\left(\frac{\partial H}{\partial x} + \frac{\partial h}{\partial x}\right) + \frac{\partial^2 H}{\partial x^2} + \frac{\partial^2 h}{\partial x^2} + \frac{\partial f}{\partial y}\left(\frac{\partial H}{\partial y} + \frac{\partial h}{\partial y}\right) + \frac{\partial^2 H}{\partial y^2} + \frac{\partial^2 h}{\partial y^2} = 0     (54)

    \frac{\partial^2 H}{\partial x^2} + \frac{\partial^2 H}{\partial y^2} = 0                                                               (55)

    \frac{\partial^2 h}{\partial x^2} + \frac{\partial^2 h}{\partial y^2} = -\frac{\partial f}{\partial x}\frac{\partial H}{\partial x} - \frac{\partial f}{\partial y}\frac{\partial H}{\partial y}     (56)
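Equation 56 is a Poisson equation for the head perturbation h, so given an f field and the mean gradient it can be solved with any standard elliptic solver. A minimal sketch using Jacobi iteration on a small grid (all inputs are hypothetical; h = 0 on the boundary):

```python
import math

nx = ny = 12
dx = 1.0
dHdx, dHdy = 1.0, 0.0                      # uniform mean gradient in x
f = [[0.2 * math.sin(i) * math.cos(j) for j in range(ny)] for i in range(nx)]

# Right-hand side of Equation 56: -(df/dx dH/dx + df/dy dH/dy),
# with central differences for the f derivatives.
rhs = [[0.0] * ny for _ in range(nx)]
for i in range(1, nx - 1):
    for j in range(1, ny - 1):
        dfdx = (f[i + 1][j] - f[i - 1][j]) / (2.0 * dx)
        dfdy = (f[i][j + 1] - f[i][j - 1]) / (2.0 * dx)
        rhs[i][j] = -(dfdx * dHdx + dfdy * dHdy)

# Jacobi iteration for the discrete Laplacian: laplacian(h) = rhs
h = [[0.0] * ny for _ in range(nx)]
for _ in range(2000):
    new = [[0.0] * ny for _ in range(nx)]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            new[i][j] = 0.25 * (h[i + 1][j] + h[i - 1][j]
                                + h[i][j + 1] + h[i][j - 1]
                                - dx * dx * rhs[i][j])
    h = new
```

As in the one-dimensional case, the linearity of this map from f to h is what allows the head/log-transmissivity covariances and cross-covariances to be written in closed form.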

Hoeksema and Kitanidis solve the inverse problem in the same fashion as Kitanidis and Vomvoris (1983). They illustrated their technique with a synthetic data set and applied it to the regional Jordan aquifer in Iowa. In the synthetic problem, Hoeksema and Kitanidis generated a 'real' set of head and transmissivity measurements over a 300 km by 300 km two-dimensional discretized domain (Figure 2-49). They generated a transmissivity field (Figure 2-50) and assigned a set of constant-head boundary conditions to obtain the 'real' head field. The mean head field used in the analysis is illustrated in Figure 2-51. They compared the transmissivity fields and the resulting head fields from the inverse solution to the 'measured' values using both a medium and a finely discretized grid. In addition, Hoeksema and Kitanidis used simple kriging to estimate the transmissivity field and the associated head field as an additional basis for comparison. The results of this synthetic problem are illustrated in Figures 2-52 through 2-57.

Figure 2-49. Fine discretization of test aquifer (from Hoeksema and Kitanidis, 1984)


Figure 2-50. Contour map of generated log-transmissivities for test aquifer (values generated with θ1 = 0.4, θ2 = -0.01/km) (from Hoeksema and Kitanidis, 1984)

Figure 2-51. Expected head contours (m) for test case (from Hoeksema and Kitanidis, 1984)


Figure 2-52. Contour map of predicted log-transmissivities using medium discretization (run 1). Parameter estimates are θ1 = 0.470, θ2 = -0.0517/km (from Hoeksema and Kitanidis, 1984)

Figure 2-53. Comparison of actual and predicted head contours for medium discretization (run 1). Actual head contours (m) are shown with dashed lines (from Hoeksema and Kitanidis, 1984)


Figure 2-54. Contour map of predicted log-transmissivities using medium discretization (run 2). Parameter estimates are θ1 = 1.42, θ2 = -0.0104/km (from Hoeksema and Kitanidis, 1984).

Figure 2-55. Comparison of actual and predicted head contours for medium discretization (run 2). Actual head contours (m) are shown with dashed lines (from Hoeksema and Kitanidis, 1984)


Figure 2-56. Contour map of predicted log-transmissivities based on measurements of transmissivity alone (run 3). Parameter estimates are θ1 = 1.46, θ2 = -0.00555/km (from Hoeksema and Kitanidis, 1984)

Figure 2-57. Comparison of actual and predicted head contours for case in which transmissivity field is estimated based on measurements of transmissivity alone, run 3. Actual head contours (m) are shown with dashed lines (from Hoeksema and Kitanidis, 1984)


The regional model Hoeksema and Kitanidis used in applying the geostatistical linear inverse method to the Jordan aquifer is illustrated in Figure 2-58 (grid-block dimension equals 20 miles). The measured data set consisted of 31 steady-state head values and 56 transmissivity measurements. They attempted several inverse solutions, with the best (from a head-residual standpoint) coming only after recharge was added to the solution of the mean head field and after two questionable measured heads were removed from the data set. In addition to solving for the nugget and correlation parameter (i.e., the slope) of a linear variogram of log transmissivity, Hoeksema and Kitanidis also determined the variance of the measured head data. The final solution is illustrated in Figure 2-59. The results of their study illustrated the usefulness of including head measurements in the estimation of the transmissivity field and also showed how critical the accuracy of these measurements is to the final solution. Another linear method worth referencing is that of Gutjahr and Wilson (1989), who published a linear inverse method that solves two-dimensional, steady-state groundwater problems with a Fast Fourier Transform (FFT) technique for field generation. The log-transmissivity field and the mean-removed head field are considered statistically homogeneous (Zimmerman et al., 1997). An iterative co-kriging procedure is implemented to condition the inverse problem to transmissivity and head measurements. The extreme speed at which this technique provides results enables the user to investigate several variations of the inverse-problem solution. References to this technique may be found in Robin et al. (1993), Gutjahr et al. (1994) and Zimmerman et al. (1996).

Figure 2-58. Discretization of Jordan aquifer and measurements of head (ft) and transmissivity (ft2/d). Constant head boundary is shown with dashed lines (from Hoeksema and Kitanidis, 1984)


Figure 2-59. Predicted transmissivity (ft2/d) contour map for Jordan aquifer. Leakage is accounted for only in the calculation of the average head field. Estimated parameters are θ1 = 0.256, θ2 = -0.00538, and θ3 = 427 ft2 (from Hoeksema and Kitanidis, 1984)

Harvey and Gorelick (1995) enhanced the Hoeksema and Kitanidis approach by adding solute arrival times to the inverse solution. Starting with Hoeksema’s and Kitanidis’ method, they formulated the log-transmissivity cross-covariance terms to a linearized transport equation in the same way as Hoeksema and Kitanidis’ formulation with heads. They expressed solute arrival times as a sum of a mean arrival time and a zero-mean perturbation arrival time. As with the log-transmissivity and heads, second-order terms of perturbation products were removed from the transport equation. Figure 2-60 illustrates the process used in their linearized approach. A maximum likelihood method is used to obtain the mean, covariance and cross-covariance terms of the logtransmissivity field using the measured head, log-transmissivity, and arrival time data. A cokriging estimate of the grid block log-transmissivity values is made in Step 2, conditioned to the measured log-transmissivity and either measured heads or measured arrival times. Using the co-kriged transmissivity field as the mean, an updated head field and an updated set of arrival times are calculated and subsequently subtracted from the measured data to obtain the perturbations. The covariance and cross-covariances of the perturbation fields (logtransmissivity, heads and arrival time) are then updated. The results from Step 2 are then used as input to Step 3, in order to improve the ‘prior’ estimate, and the process is repeated. If heads and log-transmissivities were used as conditioning data in Step 2, then arrival times and log-transmissivities are used as conditioning data in Step 3. The final result is a logtransmissivity field and the associated covariance and cross-covariance terms conditioned to the direct transmissivity measurements, the measured heads, and a set of arrival times (generally expressed as quantiles).


Figure 2-60. Diagram of the estimation procedure of Harvey and Gorelick (1995) Harvey and Gorelick (1995) demonstrate their linearized method on a synthetic test problem. Figure 2-61a illustrates the true log-transmissivity field and the measurement locations. The left and right edges of the domain are no-flow boundaries and the two sides are constant head boundaries (H = 1.0 on top and 0.0 on bottom). The aquifer is 160 m long, 104m wide and 10 m thick with an effective porosity of 0.3. The longitudinal and transverse block scale dispersivities are 1.7 and 0.7 m respectively. The model domain is composed of 40 x 26 evenly spaced, square grid blocks (4m). The log-transmissivity field is normally distributed with a mean of -8.14, variance of 1.5 and has an exponential covariance with a correlation length of 24m. Figures 2-61b through 2-61e illustrate the results of the inverse solution for various degrees of conditioning. Figure 2-61b shows poor results if only the sampled log-transmissivities and the variogram obtained from these data is used in the kriging equations. The authors note that the variogram’s range from the sampled data was essentially white noise. Figure 2-61c illustrates the improvement in the results if arrival times are used in conjunction with direct logtransmissivity measurements. Figure 2-61d illustrates the result if heads instead of arrival times are used with log-transmissivity data. Figure 2-61e contains the result if all the data are used in the inverse problem simultaneously. Harvey and Gorelick state that the best result is obtained if a sequential approach is used (as illustrated in Figure 2-60). Figure 2-61f contains the logtransmissivity field solution of the sequential approach where heads and log-transmissivities were used as conditioning data first followed by arrival times. 
Harvey and Gorelick concluded that using arrival time quantiles greatly enhanced the linearized inverse method’s ability to reproduce the true transmissivity field. They state that the quantiles contain information about the conductivity field not provided by head or direct conductivity measurements (Harvey and Gorelick, 1995). They, in fact, found very little benefit from the observed log-transmissivity data. They also concluded that sequential estimation provides a better inverse solution by enabling knowledge of the transmissivity field obtained from one data set (e.g., heads) to be used to better incorporate information from another data set (e.g., arrival times). This is due to the improved ‘prior’ estimate used in the second conditioning step.


Figure 2-61. (a) True log conductivity field with the “grid” well configuration indicated by dots. The cross marks the location of tracer injection. (b)-(e) Simultaneous estimates of the log conductivity field from different data types. (f) A sequential estimate (from Harvey and Gorelick, 1995)


Dagan (1985) presented a solution to the linear inverse problem in the spirit of Kitanidis and Vomvoris (1983). The primary difference between his technique and theirs is that the head and head-log T covariances, as well as the transmissivity and head estimates over a finite space, are obtained with an analytical expression in lieu of the numerical approach used by Kitanidis and Vomvoris. Dagan assumes no trend in the data, no vertical recharge, that the average head gradient is constant, and that the domain is infinite. Dagan's approach was further extended and generalized by Rubin and Dagan (1987a). The main feature of Rubin's approach is that closed-form, simple analytic expressions for the head variogram and for the cross-variogram of head and transmissivity are derived in terms of an exponential variogram of transmissivity. Rubin derived his expressions by assuming a first-order approximation of the steady-state flow equation in an unbounded aquifer in terms of an unknown vector of parameters θ. His method is capable of estimating recharge and the uncertainty surrounding the parameters θ. In their two-part series of papers (Rubin and Dagan, 1987a,b), they present the theory behind their method and an application of the method to a real-world case, the Avra Valley, also treated by Clifton and Neuman (1982). In their method, Rubin and Dagan use the maximum likelihood method in a similar vein as Kitanidis and Vomvoris, namely to determine the coefficients of the geostatistical model parameters given the observed head and transmissivity data. They also use the Fisher information matrix to estimate the lower bound of the covariance matrix, referred to as the Cramer-Rao bound. In the application of their technique to the Avra Valley, they investigated the differences between the geostatistical parameters obtained by conditioning first to only transmissivity (referred to as mode 1), then only to head (referred to as mode 2), and finally to both (mode 3).
Table 2-5 contains a comparison of these results. Note the variation in the estimates θ1 (the mean), θ2 (the nugget), θ3 (the correlated residual variance), and θ4 (the integral scale or range). When both measured transmissivity and head are used in the solution of the inverse problem, the reduction of the uncertainty of the estimates of these parameters is significant (see Σ in Table 2-5). In addition, the difference in the estimates of the transmissivity field is observable (Figures 2-62, 2-63). A histogram of the variance of the log-transmissivity estimation errors (Figure 2-64) illustrates the reduction in estimation errors when head is included in the inverse solution. The main limitation of the method proposed by Rubin and Dagan is that the log-transmissivity field is required to have a variance less than 1.0. This, in essence, limits the application of the method to geologic systems where the transmissivity is relatively smooth or contains a linear trend. In addition, the assumption of an unbounded aquifer, adopted to simplify the analytic expression, requires that the aquifer extent be much larger than the log-transmissivity integral scale. Rubin and Dagan state that in many cases, boundaries and boundary conditions are poorly defined. Thus, considering the local aquifer as a part of a much larger regional system may not be a difficult extrapolation (Rubin and Dagan, 1987a).

Since first presenting their linear technique in 1987, later named the Semi-Analytical Technique, Rubin and Dagan have applied their approach to several groundwater flow and transport modeling problems. Examples of applications published in the literature include Dagan and Rubin (1988), Rubin (1991a,b), Rubin and Journel (1991), Rubin and Dagan (1992), Rubin et al. (1992), Copty et al. (1993), Roth et al. (1996), and Roth et al. (1997).
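The exponential covariance underlying Rubin and Dagan's closed-form expressions is simple to state. A minimal sketch follows; the sill and integral scale used here are placeholder values, not their Avra Valley estimates:

```python
import numpy as np

# Exponential log-transmissivity covariance C_Y(h) = sigma2 * exp(-h / I),
# the model whose parameters (theta_3 = sigma2, theta_4 = I) Rubin and Dagan
# estimate by maximum likelihood. sigma2 and integral_scale are placeholders.
def exp_cov(h, sigma2=0.3, integral_scale=1.2):
    return sigma2 * np.exp(-np.asarray(h, dtype=float) / integral_scale)

lags = np.array([0.0, 1.2, 6.0])
print(exp_cov(lags))   # sill at h = 0; sill/e at one integral scale
```

Their semi-analytic head variogram and head-log T cross-variogram are then closed-form functions of these same two parameters, which is what makes the maximum likelihood fit tractable.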


Table 2-5. Comparison of Rubin and Dagan (1987b) inverse model results to Clifton and Neuman (1982) results for θi (i = 1, 2, 3, 4) (from Rubin and Dagan, 1987b). The table lists, for the South and North Avra Valley, the estimates θ1 (mean), θ2 (nugget), θ3 (residual variance), and θ4 (range, in miles), together with their estimation variances Σ11 through Σ44, for Clifton and Neuman [1982] and for modes 1, 2, and 3. [The numerical entries were scrambled during extraction and are not reproduced; see Rubin and Dagan (1987b).]

Estimates by Clifton and Neuman [1982] are based on Y data only. NA, not available. *By geometric mean. †Based on linear integral scale for semispherical variogram.

Figure 2-62. Contour map of transmissivity (ft2/day) obtained by mode 1, South Avra Valley (from Rubin and Dagan, 1987b)


Figure 2-63. Contour map of transmissivity (ft2/day) obtained by mode 3, South Avra Valley (from Rubin and Dagan, 1987b)

Figure 2-64. Histogram of variance of estimation of Y in South Avra Valley: (a) conditional variance (b) generalized conditional variance (from Rubin and Dagan, 1987b)


2.2.4 Quasi-Linear Geostatistical Methods

2.2.4.1 Kitanidis' Quasi-Linear Method

The main disadvantage of the linear methods is the requirement that the log-transmissivity field perturbations be small. As mentioned above, if the variations in the transmissivity field are considered small (i.e., var f < 1.0), then the second-order terms of Equation 54 (products of the perturbation terms) may be removed. This restriction facilitates the linearization of the head and log-transmissivity field perturbations. Recently, several researchers have presented techniques to loosen this restriction upon the log-transmissivity field variance. In 1995, Kitanidis presented an approach which circumvents the limitation on the log-transmissivity field variance, a technique he refers to as the quasi-linear method. The quasi-linear method is an extension of the linearized technique described in Kitanidis and Vomvoris (1983) and Hoeksema and Kitanidis (1985). However, the linearization of the relationship between head and transmissivity is performed in a local domain around each measurement location as opposed to around the prior mean of the head field. Kitanidis formulates the inverse problem through the following probability density function:

   p(z | θ, r) = ∫ p(z | β, θ, r) dβ ∝ |R|^(-1/2) |Q|^(-1/2) |X^T Q^(-1) X|^(-1/2) ∫ I(s) ds     (57)

where the integrand is

   I(s) = exp{ -(1/2) [ (z - h(s,r))^T R^(-1) (z - h(s,r)) + s^T G s ] }     (58)

and z is a one-dimensional measurement vector (containing the measured head and transmissivity values), θ is the set of parameters to be estimated (range and sill of the variogram) for the covariance matrix Q(θ), β is a one-dimensional vector of the unknown drift coefficients (mean and trend expression), s is a one-dimensional vector of discretized transmissivity values (for the finite-difference grid blocks), r is the model boundary conditions, h(s,r) is the response vector containing the calculated transmissivity and head values at the measurement locations, X is a known two-dimensional matrix of drift terms, R is the covariance of the measurement errors, and G is a weighting matrix dependent on Q^(-1). As noted by Kitanidis, one may apply restricted maximum likelihood to find the values of the parameters θ that maximize the expression in Equation 57. However, the computational effort of solving the multiple integral can be significant, since s is often a very large vector and since there are no analytical expressions readily available. He overcomes this dilemma by performing the integration through a linearization about the point where the integrand reaches its maximum value. The integrand is a peaked function, with most of the value of the integral coming from the area around the peak, referred to as the local area. Kitanidis points out that the integrand is peaked when ║Q║ is small and/or ║R║ is small and/or there are many observations. If this is the case, the local head values h may be approximated by a first-order Taylor series about the


peak. Kitanidis offers the following example where h(s) = s² and the integrand is a known expression (Figure 2-65). He illustrates that a first-order approximation of h(s) around the peak of the integrand (occurring at s = 1.05) is accurate in the local area.
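This toy illustration is easy to reproduce numerically. In the sketch below, the constants z, R, and G are assumed for illustration only (they are not the values used in the paper, so the peak lands near, not at, s = 1.05):

```python
import numpy as np

# Toy integrand in the spirit of Kitanidis' 1-D example with h(s) = s**2.
# z, R, G are assumed illustrative scalars, not the paper's values.
z, R, G = 1.2, 0.1, 0.5
h = lambda s: s**2
I = lambda s, hs: np.exp(-0.5 * ((z - hs)**2 / R + G * s**2))

s = np.linspace(0.0, 2.0, 2001)
s_peak = s[np.argmax(I(s, h(s)))]          # locate the integrand's peak

# First-order Taylor linearization of h about the peak:
# h(s) ~ h(s*) + 2 s* (s - s*)
h_lin = h(s_peak) + 2.0 * s_peak * (s - s_peak)

# The linearized integrand is accurate in the local area around the peak:
near = np.abs(s - s_peak) < 0.05
err = np.max(np.abs(I(s[near], h(s[near])) - I(s[near], h_lin[near])))
print(s_peak, err)
```

The maximum discrepancy between the exact and linearized integrands stays small near the peak, which is exactly the property the quasi-linear method exploits when it integrates only over the local area.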

Figure 2-65. (a) Plot of h(s) (solid line) and its linearized approximation (dashed line). (b) Plot of I(s) (solid line) and its linearized approximation (dashed line) (from Kitanidis, 1995).

The first step in Kitanidis' quasi-linear method is to find the peak of the integrand. A guess is made for the initial values of the geostatistical structural parameters θ (range and sill of the variogram) and sm (where sm = Xβ). Then a Gauss-Newton method is used to obtain an updated vector of values of sm which maximizes Equation 58. Once sm is found, the geostatistical structural parameters are estimated in the second step. The h(s,r) function is linearized around sm, and an updated estimate of the structural parameters θ and of the covariance matrix V (containing cross-covariance terms) is made through restricted maximum likelihood estimation. These two steps are repeated until the structural parameters converge.

In the third step of the quasi-linear method, an estimate of the spatial function s (log-transmissivity values at the grid blocks) given z (the log-transmissivity and head measurements) is made using θ and V obtained in Steps 1 and 2 above. Then a groundwater flow model is employed to calculate the head field given s and r. In addition, conditional simulations may be generated from the parameters identified in the inverse solution. Each of these conditional simulations would be a realization from the ensemble of random transmissivity fields.

Kitanidis demonstrates the quasi-linear method on a one-dimensional steady-state flow problem. He generates a one-dimensional transmissivity field with a variation of over two orders of magnitude over a relatively short distance, i.e., a variance of log T higher than 1.0. The head field is then solved after assuming a constant-head boundary upgradient and a fixed flux rate downgradient. Table 2-6 lists the head measurements and locations in dimensionless quantities.
Kitanidis initially attempted to obtain the range and sill of an exponential variogram but determined that his results were too positively correlated. He then switched to a linear variogram to express the variability of the transmissivity field. His quasi-linear method determined a slope of the linear variogram of 12.36 (dimensionless), which is extremely high given the field's dimensionless length of 1.0. The quasi-linear method results are illustrated in Figure 2-66. The estimated log conductivity reproduces the observations and is close to the actual log conductivity in the vicinity of the observations. The 95% confidence interval illustrated in Figure 2-66 reveals the areas of highest uncertainty. The head field produced from the estimated transmissivity field reproduces the measured heads well (Figure 2-67).

Table 2-6. Head measurements used in synthetic problem (from Kitanidis, 1995).

    X       φ
    0.25    0.301
    0.30    0.299
    0.55    0.293
    0.60    0.292
    0.70    0.254
    0.75    0.228
    0.80    0.198
    0.85    0.181

The main difference between the quasi-linear approach and Kitanidis' earlier linearized method is that the linearized approach skips the first step described above. The linearized approach assumes a prior mean, begins with Step 2, and proceeds to Step 3 above. It is therefore less accurate than the quasi-linear approach when the variance of the log-transmissivity field is large, because the linearization is taken about the prior mean estimate. In a high-variance field, the prior mean estimate could be quite different from the actual values (Kitanidis, 1995). The essential requirement for the quasi-linear method to produce meaningful results is that the local area must be small enough, and the measurement function h must be sufficiently smooth, for the measurement function to be accurately represented by the linearized form in the local area (Kitanidis, 1995). To have a small local area, the integrand must be a peaked function. As mentioned above, this occurs when ║Q║ is small and/or ║R║ is small and/or when there are many observations.

2.2.4.2 The Co-Conditional Method

Yeh et al. (1996) presented their version of a quasi-linear method, referred to as the co-conditional method, an extension of classical co-kriging (see Eqs. 70-71 above). In Yeh's method, co-kriging is performed after determining the covariance and cross-covariance between log-transmissivity and heads. Yeh and others (1996) obtain these values through classical semi-variogram and cross-variogram analysis, in lieu of the maximum likelihood approach used by Hoeksema and Kitanidis (1985).


Figure 2-66.

Log conductivity: actual (solid line), estimated (dashed line), approximate 95% confidence interval (dotted lines), and observations (open circles) (from Kitanidis, 1995).

Figure 2-67. Head: actual (solid line), calculated using best estimates of conductivity (dashed line), and observations (open circles) (from Kitanidis, 1995)


Once the co-kriged log-transmissivity field is obtained, a flow model is used to calculate the heads at the model grid blocks. Then the co-kriged transmissivity estimate is updated using the following expression:

   Yc^(r+1)(x0) = Yc^(r)(x0) + Σ_{j=1..nh} ω_{j0}^(r) [ φ*(xj) - φ^(r)(xj) ]     (59)

where r is the iteration index, Yc^(r)(x0) is the co-kriged log-transmissivity estimate at location x0 and iteration r, nh is the total number of measured head values, φ*(xj) is the measured head at location xj, φ^(r)(xj) is the calculated head at location xj using the co-kriged log-transmissivity values from iteration r, and ω_{j0}^(r) is the optimized weight assigned to the difference between the measured and calculated heads at measurement location j relative to the location x0 at iteration r. Yeh and others (1996) show that their successive linear estimator is unbiased. Similar to the kriging equations, Yeh and others ensure minimal variance for the estimator by minimizing their mean square error criterion with respect to ω. This results in the following expression for the solution of ω:

   Σ_{j=1..nh} ω_{j0}^(r) ε_hh^(r)(xj, xi) = ε_yh^(r)(x0, xi),   i = 1, ..., nh     (60)

where ε_hh(xj, xi) is the covariance between the heads located at xj and xi, and ε_yh(x0, xi) is the cross-covariance between the log-transmissivity at x0 and the measured head at xi. The solution of Equation 60 requires knowledge of ε_hh and ε_yh, which are updated at each iteration through a first-order analysis of the head field and the use of adjoint-state sensitivities. ε_hh and ε_yh are calculated by

   ε_hh^(r)(xi, xj) = J^(r) ε_yy^(r)(xl, xm) J^T(r)     (61)

   ε_yh^(r)(xi, xj) = J^(r) ε_yy^(r)(xl, xm)     (62)

where i and j = 1 to nh, l and m = 1 to N (the total number of grid blocks), and J is the adjoint sensitivity matrix ( ∂h/∂(ln T) ) of dimensions nh x N. ε_yy is obtained through the co-kriging equations for iteration r = 0. For r > 0, ε_yy is evaluated according to

   ε_yy^(r+1)(x0, xj) = ε_yy^(r)(x0, xj) - Σ_{i=1..nh} ω_{i0}^(r) ε_yh^(r)(xi, xj)     (63)

Once the covariance and cross-covariance terms from the adjoint sensitivities are updated (Equations 61 through 63), the updated weights ω are determined (Equation 60) and used in Equation 59 to obtain an updated log-transmissivity value at each of the grid blocks. Each new updated transmissivity field yields a new head field, and the process is repeated until the difference between the variances of the estimated log-transmissivity fields of two successive iterations is below a minimum threshold.
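One pass of the Equation 59-60 update can be sketched with small assumed matrices (two head observations, three grid blocks). All numbers below are illustrative, not from Yeh et al. (1996):

```python
import numpy as np

# One update of the successive linear estimator (Eqs. 59-60), toy quantities.
e_hh = np.array([[0.30, 0.10],
                 [0.10, 0.25]])       # head covariance at the 2 obs (Eq. 61)
e_yh = np.array([[0.12, 0.05],
                 [0.08, 0.02],
                 [0.03, 0.07]])       # log-T/head cross-covariance (Eq. 62)

# Eq. 60: for each grid block x0, solve e_hh w = e_yh(x0, .) for the weights.
W = np.linalg.solve(e_hh, e_yh.T).T   # shape (3 blocks, 2 obs)

head_obs  = np.array([10.2, 9.7])     # measured heads (toy)
head_calc = np.array([10.5, 9.4])     # heads simulated from the current field

Y = np.array([-8.3, -8.0, -8.6])      # current co-kriged log-T estimates
Y_new = Y + W @ (head_obs - head_calc)   # Eq. 59 update
print(Y_new)
```

In the full method, a flow model would recompute head_calc from Y_new, the adjoint sensitivities would refresh e_hh and e_yh (Eqs. 61-63), and the loop would repeat until the field variance stabilizes.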


Yeh and others (1996) compare their method to classical co-kriging on three different synthetic problems. The first two problems are referred to as deterministic test problems because the head field is completely known over the 21 x 11 grid. There are 11 known transmissivity values which span the vertical extent of the western model boundary. There is a discharge well located in grid block (5,8). The variance of the log-transmissivity field was 0.38 for Case 1 and 3.01 for Case 2. Figures 2-68 and 2-69 illustrate the results of Cases 1 and 2. In both cases, the co-conditional method reproduces the true field, whereas classical co-kriging does an adequate job in Case 1 but a poor job in the high-variance Case 2. In Case 3, referred to as a stochastic test problem, the same grid is used as in Cases 1 and 2. However, only 30 head measurements and 4 transmissivity measurements are available. A pumping well is located at grid block (9,6). Figure 2-70 illustrates the true log-transmissivity field (Var = 2.96), the true head field, and the measurement locations. The results of the co-kriging and co-conditional methods are also illustrated in Figure 2-70. As demonstrated in Figures 2-68 through 2-70, Yeh's co-conditional method does a much better job at reproducing the true fields than classical co-kriging. Yeh's conclusion from his study is that the linear assumption between the head and log-transmissivity perturbations employed in the classical linear methods tends to smooth the log-transmissivity field too much. His method, as well as Kitanidis' method, indicates the direction that linear methods will be taking in the future.

Figure 2-68. (a) The true ln T field, (b) the cokriged ln T field, and (c) the estimated ln T field by our iterative approach for case 1 (from Yeh et al., 1996).


Figure 2-69. (a) The true ln T field, (b) the cokriged ln T field, and (c) the estimated ln T field by our iterative approach for case 2 (from Yeh et al., 1996).

Figure 2-70. Illustrations of the true ln T and head perturbation fields and those by the classical cokriging method and our approach for case 3 (from Yeh et al., 1996).


2.2.4.3 Simulated Annealing

Simulated annealing is a technique used to solve optimization problems through stochastic relaxation, or perturbation. An initial image, e.g., a transmissivity field, is generated by sampling from a univariate distribution (PDF) and placing the sampled values across a specified domain. This initial image may be conditioned to any measured data by first relocating the conditioning data to the nearest grid nodes. The remaining nodal values are then randomly drawn from the PDF. The essential feature of the annealing, or relaxation, process is the perturbation of the initial image by swapping the values in pairs of nodal locations and then accepting or rejecting the swap (Deutsch and Journel, 1992). The swapping of nodal values is analogous to the metallurgical process of annealing, where a metallic alloy is heated without leaving the solid phase so that molecules may move relative to one another, thus reordering themselves into a low-energy state. The decision to accept or reject a swap, referred to as the decision rule, is based upon an objective function, or energy function, which quantifies the difference between one or more desired spatial characteristics and those of the current image. Two-point statistics (i.e., the semi-variogram) or multiple-point statistics (e.g., indicator connectivity) are examples of spatial characteristics that may be matched in the optimization process. As Deutsch and Journel (1992) discuss, a swap is accepted if the objective function is lowered. However, not all swaps which raise the objective function are rejected. The probability that a swap will be accepted is given by the Boltzmann distribution:

   P{accept} = 1,                              if O_new ≤ O_old
             = exp[ -(O_new - O_old)/t ],      otherwise     (64)

where t is analogous to the temperature in annealing. The higher the temperature, the greater the probability of accepting an increase in the objective function. Accepting an occasional increase in the objective function deters the optimization routine from becoming trapped in a sub-optimal solution. During the swapping process, the user specifies the annealing schedule, i.e., the rapidity of the temperature decrease with time. If the temperature is decreased too fast, the solution may not converge. If the temperature is decreased too slowly, the solution may take a very long time to converge. In order to make the optimization process as efficient as possible, the objective function is updated locally in lieu of global recalculation. In the case of an objective function consisting of a specified semi-variogram, only the difference between the swapped pair's (zi and zj) contributions to the semi-variogram is calculated, not the entire semi-variogram. That is:

   γ*_new(h) = γ*(h) + (1 / 2N(h)) [ (z - zj)² - (z - zi)² ]     (65)
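The decision rule of Equation 64 is only a few lines of code. In this sketch, `accept_swap` is a hypothetical helper name, and the `rng` argument is injected purely so the behaviour can be checked deterministically:

```python
import math
import random

def accept_swap(O_new, O_old, t, rng=random.random):
    """Boltzmann decision rule (Eq. 64): always accept an improvement;
    accept a worsening swap with probability exp(-(O_new - O_old)/t)."""
    if O_new <= O_old:
        return True
    return rng() < math.exp(-(O_new - O_old) / t)

# At high temperature, worsening swaps are usually accepted; at low
# temperature, they are almost always rejected.
assert accept_swap(1.0, 2.0, t=1.0)                       # improvement
assert accept_swap(2.0, 1.0, t=10.0, rng=lambda: 0.5)     # exp(-0.1) > 0.5
assert not accept_swap(2.0, 1.0, t=0.1, rng=lambda: 0.5)  # exp(-10) < 0.5
```

Lowering t over the course of the run, per the user-specified annealing schedule, gradually turns this stochastic search into a pure descent.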

There have not been many applications of simulated annealing to hydrogeologic problems other than the generation of conditional simulations of parameter fields. Deutsch and Journel (1991) published a report describing the application of simulated annealing to petroleum reservoir simulation. In their report, they compared the conditional simulation capabilities (i.e., the


ability to generate multiple realizations which reproduce the semi-variogram) of simulated annealing, sequential Gaussian simulation (sGs), and sequential indicator simulation (sIs). They found that the semi-variograms of the permeability fields produced by simulated annealing were much closer to the specified semi-variogram than those produced by sGs and sIs. They also demonstrated that the user's flexibility in specifying the components of the objective function gives simulated annealing an advantage over other simulation techniques. Deutsch and Journel added an expression to the objective function that empirically related the effective permeability derived from a well test to a power average of the simulated permeability values. Thus, using simulated annealing, Deutsch and Journel were not only able to match the observed univariate and bivariate statistics, but they were also able to match the effective permeability derived from a local well test.


2.3 The Contributions of this Research

The previous section presented the groundwater inverse problem, various problems associated with its solution, and a detailed chronological overview of the progress groundwater hydrologists have made toward its solution over the last 35 years. Clearly, the myriad tools developed to solve the inverse problem since 1960 generate transmissivity fields that reproduce the measured hydraulic heads. The question then becomes, 'What is the uncertainty associated with this solution?' As mentioned in Section 1.1, almost without exception, the inverse techniques described in the previous section answer this question the same way. Uncertainty in the inverse solution (i.e., the transmissivity field) is investigated through Monte Carlo simulations, in which an ensemble of transmissivity fields is generated and used to determine distributions of a selected parameter. In some cases, the variability in the log-transmissivity field is the focus of the uncertainty analysis, but more often the uncertainty about a secondary variable dependent on transmissivity is of interest, e.g., groundwater head, groundwater travel time, or solute concentration at a selected boundary.

Typically, the ensemble of transmissivity fields is generated from the final optimal transmissivity field, Y, and the covariance of the error of estimation, Q (obtained from the inverse solution). Cholesky decomposition of the covariance matrix Q produces a lower triangular matrix L where L L^T = Q. Conditional simulations of the log-transmissivity field are then produced by:

   Ycs,i = Y + L Ai     (66)

where i is the index of the conditional simulation and Ai is a vector of standard normal random numbers. Y conditions the simulations (or ensemble) to the log-transmissivity inverse solution, and the addition of the L Ai term produces small variations about the mean grid-block values Y. The result is thus a set of log-transmissivity fields which all contain the general pattern of the log-transmissivity field determined from the inverse solution, but each of which contains a unique random variation consistent with the covariance.

The limitation of this approach is that the full range of uncertainty within the transmissivity field is probably not investigated. The conditional simulations all originate from the 'optimal' estimate of transmissivity (a single transmissivity field produced from the inverse solution), with variations around this 'local' optimal estimate dictated by the post-calibration covariance matrix (or conditional confidence interval). Thus, the ensemble of transmissivity fields is generated through a local uncertainty analysis.

In 1991, our research began on an enhanced version of the Pilot Point Technique that would approach the question of uncertainty in a more 'global' fashion. Our research resulted from suggestions from an expert panel of stochastic hydrogeologists convened by Sandia National Laboratories (SNL) in Albuquerque, New Mexico, USA. SNL is in charge of characterization and performance assessment activities for the Waste Isolation Pilot Plant Site (WIPP), the United States' first geologic repository for the isolation of radioactive waste. The expert panel, referred to as the Geostatistics Expert Group or GXG, was convened by SNL in order to review the work presented in 1992 by Lavenue and Pickens (Appendix A).
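The generation step of Equation 66 is straightforward to implement. In this sketch, the covariance Q and estimate Y are small toy stand-ins for the post-calibration quantities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy post-calibration covariance Q over 3 grid blocks and optimal estimate Y
# (values are illustrative only).
Q = np.array([[0.50, 0.20, 0.05],
              [0.20, 0.40, 0.10],
              [0.05, 0.10, 0.30]])
Y = np.array([-8.1, -8.4, -7.9])

L = np.linalg.cholesky(Q)          # lower triangular, L @ L.T == Q

# Eq. 66: one conditional simulation per standard normal vector A_i
sims = np.array([Y + L @ rng.standard_normal(3) for _ in range(5000)])

print(sims.mean(axis=0))           # close to Y
print(np.cov(sims.T))              # close to Q
```

Because each realization is Y plus a correlated zero-mean perturbation, the ensemble mean recovers the inverse solution and the sample covariance recovers Q, which is exactly the 'local' character of this uncertainty analysis.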


In addition, their mission was to suggest techniques to assess the uncertainty within the single calibrated kriged-transmissivity field presented in Lavenue and Pickens' work. The GXG suggested investigating uncertainty in a more direct way by first producing an ensemble of transmissivity fields conditioned to the measured transmissivity data and its covariance (i.e., conditional simulations). Each of the conditional simulations is then separately calibrated to the set of steady-state and/or transient-state hydraulic-head data. Once calibrated, the ensemble of transmissivity fields can be used to investigate the impact that uncertainty in the transmissivity field has upon other dependent variables, such as groundwater head or groundwater travel time (Figure 2-71).

The GXG recommendation led to the contributions of the research presented here, namely the development and application of an automated Pilot Point Technique based upon the approach used in Lavenue and Pickens (1992). The resulting numerical model, referred to as GRASP-INV (GroundwateR Adjoint Sensitivity Program INVerse), employs the Pilot Point Technique to solve the indirect, non-linear groundwater flow inverse problem for a set of conditionally simulated transmissivity fields. This enables the direct investigation of uncertainty in model results. GRASP-INV couples parametric (Gaussian) and non-parametric (indicator) sequential simulation for the generation of the ensemble of conditionally simulated transmissivity fields with the Pilot Point Technique for the first time. In addition, GRASP-INV extends the Pilot Point Technique to three dimensions. GRASP-INV's development, capabilities, and application to multi-dimensional groundwater flow inverse problems are presented in the following sections.
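The GXG recommendation amounts to a two-stage loop: simulate, then calibrate, then summarize the ensemble. The sketch below shows only the control flow; `conditional_simulation` and `calibrate_to_heads` are hypothetical stand-ins, not GRASP-INV routines:

```python
import numpy as np

rng = np.random.default_rng(1)

def conditional_simulation():
    """Stand-in: draw a random log-T 'field' (here just 10 values)."""
    return rng.normal(-8.0, 1.0, size=10)

def calibrate_to_heads(logT):
    """Stand-in for pilot-point calibration: nudge the field toward a
    target that, in the real method, comes from matching measured heads."""
    return logT + 0.25 * (-8.0 - logT)

# Stage 1 + Stage 2: generate and calibrate each member of the ensemble.
ensemble = [calibrate_to_heads(conditional_simulation()) for _ in range(100)]

# The calibrated ensemble then supports direct uncertainty statements about
# any dependent quantity; the field mean is used here as a trivial example.
means = np.array([f.mean() for f in ensemble])
print(means.mean(), means.std())
```

In the actual workflow, each ensemble member is a full 2-D (or 3-D) transmissivity field, calibration is the pilot-point inverse solution, and the summarized quantity is typically groundwater head or travel time rather than the field mean.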

Figure 2-71. Schematic Illustration of Stochastic Simulation (from Gotway and Rutherford, 1993).


3. THE GRASP-INV INVERSE CODE

As discussed in the previous section, our development of an enhanced version of the Pilot Point Technique began in 1991. Our research was initiated after a March 1991 expert-panel meeting in Albuquerque, New Mexico, which was convened by Sandia National Laboratories (SNL). SNL is in charge of characterization and performance assessment activities for the Waste Isolation Pilot Plant Site (WIPP), the United States' first geologic repository for the isolation of radioactive waste. The expert panel, referred to as the Geostatistics Expert Group or GXG (Table 3-1), was convened by SNL in order to review the work presented by Lavenue and Pickens (1992; see Appendix A). In addition, their mission was to suggest techniques to assess the uncertainty within the single calibrated kriged-transmissivity field presented in Lavenue and Pickens' work. Their suggestions led to the development and application of GRASP-INV v1.0 to the Culebra dolomite flow field.

Our research between 1991 and 1994, described below in Section 3.1, culminated in the development of the code GRASP-INV (GroundwateR Adjoint Sensitivity Program INVerse), version 1.0 (v1.0). GRASP-INV assesses uncertainty in the inverse problem solution (i.e., the log-transmissivity field) by first producing an ensemble of transmissivity fields conditioned to the measured transmissivity data and its covariance (i.e., conditional simulations) and then calibrating each field to the set of steady-state and/or transient-state hydraulic-head data. GRASP-INV employs an automated Pilot Point Technique, extended from the approach used in Lavenue and Pickens (1992) and de Marsily (1978), to solve the indirect, non-linear groundwater flow inverse problem for a set of conditionally simulated transmissivity fields. In this section, the theory of the GRASP-INV code (v1.0) is presented briefly. The details behind the method are presented in an attached paper (RamaRao et al., 1995; see Attachment B).
GRASP-INV (v1.0) was applied to a WIPP-site regional groundwater flow model (see Lavenue et al. (1995) in Appendix C) as well as to an inverse-problem comparison study developed by the GXG. The GXG comparison study included four test problems designed to compare seven different inverse approaches for identifying aquifer transmissivity (see Zimmerman et al. (1998) in Appendix D). Since details of these applications are contained in Appendices C and D, only brief reviews are given in Section 3.2 below.


Table 3-1. Geostatistics Expert Group Participants

G. de Marsily (Chair), Universite Paris VI, Paris, FRANCE
D. A. Zimmerman, GRAM, Inc., Albuquerque, NM
C. A. Gotway, Center for Disease Control, Atlanta, GA
R. L. Bras, MIT, Cambridge, MA
M. G. Marietta, C. L. Axness, R. L. Beauheim, P. B. Davies, D. Gallegos, Sandia National Laboratories, Albuquerque, NM
G. Dagan, Tel Aviv University, Tel Aviv, ISRAEL
J. Carrera, Universitat Politechnica de Catalunya, Barcelona, SPAIN
A. Galli, C. Ravenne, Ecole de Mines de Paris, Fontainebleau, FRANCE
J. Gomez-Hernandez, Universidad Politechnica de Valencia, Valencia, SPAIN
S. M. Gorelick, Stanford University, Stanford, CA
P. Grinrod, Quantisci Ltd., Oxfordshire, UK
P. K. Kitanidis, Stanford University, Stanford, CA
A. Gutjahr, New Mexico Institute of Mining & Technology, Socorro, NM
A. M. Lavenue, B. S. RamaRao, Duke Engineering, Austin, TX
D. McLaughlin, MIT, Cambridge, MA
S. P. Neuman, University of Arizona, Tucson, AZ
Y. Rubin, University of California, Berkeley, Berkeley, CA


3.1 GRASP-INV Methodology: An Overview

The solution of the groundwater inverse problem by the GRASP-INV code, and the direct determination of the uncertainty of the inverse solution, requires the generation of a large number of random transmissivity fields, each of which is in close agreement with all of the measured data within the Culebra Dolomite. The collected data comprise (1) transmissivity measurements and (2) pressure measurements (both steady and transient state). Calibration of the initial random transmissivity fields to the measured data (i.e., the inverse solution) is achieved in stages, as described in Figure 3-1.

Figure 3-1. Flow Chart of GRASP-INV

First, unconditional simulations of the Culebra log-transmissivity field are generated. These are random fields having the same statistical moments (mean and variance) and the same spatial correlation structure as identified from the log-transmissivity measurements. These fields need not, however, initially match the measured transmissivities at the measurement locations. The fields are then "conditioned" so that they honor exactly the measured Culebra transmissivities at the WIPP borehole locations. The resulting field may be referred to as a "conditional simulation" of the Culebra log-transmissivity field. The conditional simulations are then further "conditioned" such that the pressures computed by the groundwater flow model (both steady and transient state) agree closely with the measured pressures, in a least-squares sense. When calibration is completed, one obtains a transmissivity field that is in conformity with all of the data within the Culebra Dolomite and may therefore be regarded as a plausible version of the true distribution of log transmissivity.

In GRASP-INV, model calibration is approached indirectly. An objective function is defined as the weighted sum of the squared deviations between the model-computed pressures and the observed pressures, with the summation extended over the spatial and temporal domain where


pressure measurements are taken. The classical formulation of calibration then requires the minimization of the objective function, subject to the constraints of the groundwater flow equations in the steady and transient state. This approach is implemented by iteratively adjusting the transmissivity distribution until the objective function is reduced to a prescribed minimum value.

Pilot points are the optimization parameters adopted in GRASP-INV. A pilot point is a synthetic transmissivity data point that is added to the existing measured transmissivity data set during the course of calibration. A pilot point is defined by its spatial location and by the transmissivity value assigned to it. After a pilot point is added to the transmissivity data set, the augmented data set is used to obtain kriged or conditionally simulated (CS) log-transmissivity fields for the subsequent calibration iteration. With the addition of a pilot point, the log-transmissivity distribution in the neighborhood of the pilot point is modified, with the dominant modifications occurring closest to the pilot-point location (Figure 3-2). The modifications in the different grid blocks are determined by the kriging weights and are not uniform (as they would be in the zonation approach). Conceptually, a pilot point may be viewed as a simple way to effect realistic modifications of transmissivity in the region of the model surrounding the pilot-point location. Coupled kriging and adjoint sensitivity analysis is used first to locate the pilot point, and optimization algorithms are then used to assign the optimal log transmissivity to the pilot point.

Figure 3-2. Spatial Influence of a Pilot Point Upon Model Grid-Block Transmissivities


3.1.1 Conditional Simulations

In a modeling study in which only one calibrated field, or inverse solution, is to be produced, kriging is the best estimation routine one could use to produce an initial estimate of the grid-block transmissivities. Kriging provides an optimal estimate (i.e., minimum variance) of the log transmissivity at a point, i.e., the mean value. However, in an attempt to reproduce the natural variability of transmissivity fields, a simulation of the transmissivity field must be produced. Simulated transmissivity values reproduce the fluctuation patterns in the transmissivity field and are therefore useful for resolving the residual uncertainty not represented by kriging. Figure 3-3, taken from Journel and Huijbregts (1978), illustrates the relationship between the true, kriged, and simulated fields.

Figure 3-3. Relationships between conditional and unconditional simulation

GRASP-INV uses a technique referred to as 'residual sewing' to produce conditional simulations of log transmissivity. Residual sewing requires the generation of an unconditional simulation and two kriging steps in order to obtain a transmissivity field conditioned to the observed transmissivity data. An unconditional simulation of a transmissivity field produces a random field with the same statistical moments (mean and variance) and the same spatial correlation structure as indicated by the measured transmissivities in the field. An unconditionally simulated transmissivity field is said to be isomorphic with the true field and is independent of the true field. The following methods have been used earlier in groundwater hydrology for generating unconditional simulations:

• Nearest neighbor method (Smith and Schwartz, 1981; Smith and Freeze, 1979)

• Matrix decomposition (Wilson, 1979; Neuman, 1984)

• Multidimensional spectral analysis (Shinozuka and Jan, 1972; Mejia and Rodriguez-Iturbe, 1974)

• Turning bands method (Matheron, 1971, 1973; Mantoglou and Wilson, 1982; Zimmerman and Wilson, 1990)

• Sequential simulation methods (Gomez-Hernandez and Srivastava, 1990; Deutsch and Journel, 1992)
In GRASP-INV v1.0, the Turning Bands Method and the TUBA code (Zimmerman and Wilson, 1990) were used in the unconditional simulation process (see Attachment B). GRASP-INV v1.0 employs the generalized kriging code, AKRIP, to perform the required kriging calculations to condition to the measured data. Details concerning the residual sewing procedure may be found in Appendix B. Once the conditional simulation of the transmissivity field is produced, it is then mapped to the flow model grid blocks and used in the initial flow model calculations, along with selected boundary conditions, to obtain an initial estimate of the goodness-of-fit between the calculated and observed heads.
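The residual-sewing idea can be illustrated with a minimal sketch: krige the measured data, krige the unconditional simulation's values sampled at the same locations, and "sew" the simulated residual onto the kriged field. The 1-D grid, exponential covariance, and all numeric values below are illustrative assumptions, not the TUBA/AKRIP implementation or Culebra data.

```python
import numpy as np

# Hedged sketch of residual sewing: CS(x) = krige(data)(x) + [US(x) - krige(US at data sites)(x)].
def cov(h, sill=1.0, a=10.0):
    """Assumed exponential covariance model."""
    return sill * np.exp(-np.abs(h) / a)

def simple_krige(x_data, y_data, x_grid):
    """Simple-kriging estimate (zero mean assumed) at each grid point."""
    C = cov(x_data[:, None] - x_data[None, :])
    c0 = cov(x_grid[:, None] - x_data[None, :])
    w = np.linalg.solve(C, c0.T)          # kriging weights, one column per grid point
    return w.T @ y_data

rng = np.random.default_rng(0)
x_grid = np.arange(50.0)
x_obs = np.array([5.0, 20.0, 41.0])       # measurement locations (grid nodes)
y_obs = np.array([-0.5, 1.2, 0.3])        # measured log10 T (illustrative values)

# 1) Unconditional simulation with the target covariance (Cholesky factorization).
L = np.linalg.cholesky(cov(x_grid[:, None] - x_grid[None, :]) + 1e-10 * np.eye(50))
y_us = L @ rng.standard_normal(50)

# 2) Krige the measured data, and krige the unconditional values sampled at the data sites.
y_k = simple_krige(x_obs, y_obs, x_grid)
y_us_k = simple_krige(x_obs, y_us[x_obs.astype(int)], x_grid)

# 3) Sew the simulated residual onto the kriged field.
y_cs = y_k + (y_us - y_us_k)

# The conditional simulation honors the data exactly at the observation sites.
assert np.allclose(y_cs[x_obs.astype(int)], y_obs, atol=1e-6)
```

Because kriging interpolates exactly at the data locations, the residual vanishes there, so the sewn field honors the measurements while retaining the unconditional field's variability away from them.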

3.1.2 Solving the Groundwater Flow Equation

The groundwater flow model used in GRASP-INV is SWIFT II (Sandia Waste Isolation, Flow, and Transport code). SWIFT II is a fully transient, three-dimensional, finite-difference code that solves the coupled equations for single-phase flow and transport in porous and fractured geologic media. The SWIFT II code is supported by comprehensive documentation and extensive testing. The theory and implementation of SWIFT II are given by Reeves et al. [1986a], and the data input guide is given by Reeves et al. [1986b]. Finley and Reeves [1981] and Ward et al. [1984] present the verification-validation tests for the code. The transient flow equation solved by SWIFT II is given by



\[
\frac{\partial (\phi \rho)}{\partial t} - \nabla \cdot \left[ \frac{\rho k}{\mu} \left( \nabla p + \rho g \nabla z \right) \right] + q = 0 \qquad (1)
\]

where k = k(x) is the permeability tensor, p = p(x, t) is pressure, z is the vertical coordinate (considered positive upward), ρ = ρ(x) is fluid density, q represents flux sources or sinks, g is the gravitational constant, µ is fluid viscosity, φ is rock porosity, x is the position vector, and t is time. Discretized, (1) becomes a matrix equation of the form

\[
[A]\{p\}^{n} = [B]\{p\}^{n-1} + \{f\}^{n} \qquad (2)
\]

where, for the fully implicit scheme of time integration in SWIFT II, [A] = [C] + [S]/∆t_n, [B] = [S]/∆t_n, [C] is the conductance matrix, [S] is the storativity matrix, {f} is the load vector, ∆t_n = t_n − t_{n−1}, t is time, n is the time level (e.g., 1, 2, ..., L), and L is the maximum time level of the simulation.
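The fully implicit scheme of Equation 2 can be sketched on a small system. This is not SWIFT II: the 1-D conductance matrix, uniform transmissivity and storativity, and the injection term are all illustrative assumptions standing in for the 3-D assembly.

```python
import numpy as np

# Hedged sketch of Eq. (2): [A]{p}^n = [B]{p}^{n-1} + {f}^n,
# with [A] = [C] + [S]/dt and [B] = [S]/dt, on a 1-D finite-difference grid.
n = 20
T = 1e-4          # uniform transmissivity (m^2/s), assumed
S = 1e-4          # storativity, assumed
dx = 100.0        # grid spacing (m)
dt = 3600.0       # time-step size (s)

# Conductance matrix [C]: standard 1-D Laplacian with zero-pressure ends implied.
C = np.zeros((n, n))
for i in range(n):
    C[i, i] = 2 * T / dx**2
    if i > 0:
        C[i, i - 1] = -T / dx**2
    if i < n - 1:
        C[i, i + 1] = -T / dx**2

Smat = (S / dt) * np.eye(n)   # storativity matrix divided by the time step
A = C + Smat
B = Smat

p = np.zeros(n)               # initial pressures
f = np.zeros(n)
f[0] = 1e-6                   # constant injection at the first node (source term)

for step in range(100):       # march the implicit scheme forward in time
    p = np.linalg.solve(A, B @ p + f)

# The fully implicit scheme is unconditionally stable; pressure declines away
# from the injection node toward the far boundary.
assert np.all(np.isfinite(p)) and p[0] > p[-1]
```

In SWIFT II the same linear system is assembled in three dimensions with spatially varying properties; only the matrix structure differs, not the time-stepping logic.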

3.1.3 The Objective Function

The objective function that is minimized during calibration is a weighted least-squares error criterion that measures model fit. The model-fit criterion is the weighted sum of the squared deviations between the computed and measured pressures, taken over all points in the spatial and temporal domains where pressure measurements have been made. For a purely steady-state simulation, the objective function (also called the performance measure) is given by:

\[
J_s(u) = \sum_{i=1}^{n} w_i \left( p_i - p_{ob,i} \right)^2 \qquad (3)
\]

where:
J_s(u) = objective function for steady state,
n = number of boreholes,
i = suffix for the borehole,
p_i = calculated pressure,
p_ob,i = observed pressure, and
w_i = weight assigned to the borehole.

For a transient simulation, similarly,

\[
J_t(u) = \sum_{t=t_1}^{t_2} \sum_{i=1}^{n} w_{i,t} \left( p_{i,t} - p_{ob,i,t} \right)^2 \qquad (4)
\]

where:
J_t(u) = objective function for transient state,
t_1 = the beginning of the time window,
t_2 = the end of the time window, and
w_{i,t} = weight assigned to a selected borehole at a given time t.

The transient performance measure may consist of short transient events during which a response is observed at only a single location, or long-term events during which responses are observed at several locations. In cases where the flow system is initially at steady state and transient stresses are then imposed upon the steady-state flow field, calibration to the steady-state conditions is undertaken first, followed by transient calibration. It is necessary to ensure that the fit between calculated and observed pressures is improved during transient calibration without degrading the fit from the steady-state calibration. Experience has shown that this requires the contributions from the steady state and the transient state to the combined performance measure to be approximately equal. Since transient performance measures are generally much larger than steady-state performance measures (because values are summed over the time window), an additional factor f is used to ensure that the steady-state


performance measure and the transient performance measure are approximately equal in the combined performance measure J(u):

\[
J(u) = f J_s(u) + J_t(u) \qquad (5)
\]

where:
J(u) = combined steady and transient objective function, and
f = weight factor for the steady-state objective function.

Also,

\[
f J_s(u) \approx J_t(u) \qquad (6)
\]

so that

\[
f \approx \frac{J_t(u)}{J_s(u)} \qquad (7)
\]
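Equations 3 through 7 can be sketched directly. The pressure arrays and weights below are randomly generated placeholders, not calibration data; the point is only how the factor f balances the two contributions.

```python
import numpy as np

# Hedged sketch of Eqs. (3)-(7): weighted least-squares objectives and the
# balancing factor f. All numbers are illustrative.
def J_steady(p_calc, p_obs, w):
    """Eq. (3): weighted squared deviations over boreholes."""
    return np.sum(w * (p_calc - p_obs) ** 2)

def J_transient(p_calc, p_obs, w):
    """Eq. (4): arrays of shape (n_times, n_boreholes), summed over the time window."""
    return np.sum(w * (p_calc - p_obs) ** 2)

rng = np.random.default_rng(1)
ps_calc, ps_obs = rng.normal(size=5), rng.normal(size=5)          # 5 boreholes
pt_calc, pt_obs = rng.normal(size=(50, 5)), rng.normal(size=(50, 5))  # 50 time levels

Js = J_steady(ps_calc, ps_obs, np.ones(5))
Jt = J_transient(pt_calc, pt_obs, np.ones((50, 5)))

f = Jt / Js                     # Eq. (7): make the two contributions comparable
J = f * Js + Jt                 # Eq. (5): combined performance measure

assert np.isclose(f * Js, Jt)   # Eq. (6) holds by construction
```

Because the transient sum runs over the whole time window, Jt is typically an order of magnitude or more larger than Js; f rescales the steady-state term so neither dominates the combined measure.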

Note the difference between Neuman's objective function, i.e., J(T) = J_H(T) + λ J_T(T), Equation (26) in Section 2, and Equation 4 above. Neuman added the second term on the right-hand side of his expression, referred to as the plausibility criterion, for two reasons: first, it constrains the transmissivity estimates from deviating too far from prior information, and second, it reduces oscillations in the transmissivity-field solution. The plausibility criterion is not necessary in the Pilot Point Technique because kriging is used both in the estimation of the initial transmissivity field and in the modification of the transmissivity field during optimization.

3.1.4 Adjoint Sensitivity Analysis

GRASP-INV computes measures (e.g., weighted least-squares errors) of a groundwater system's pressures or heads at one or several locations. It then calculates the sensitivities of these measures to system parameters (e.g., permeabilities and prescribed pressure values at the boundaries). The computed measures are referred to as "performance measures" and may include weighted spatial sums of groundwater pressures at selected locations or weighted squared deviations between computed and observed pressures at selected locations (or boreholes). The sensitivities are computed by the adjoint method (Chavent, 1971) and are derivatives of the performance measures with respect to the parameters of the modeled system, taken about the assumed parameter values. The system parameters available for use with GRASP-INV are (1) log10 transmissivity assigned to a pilot point, (2) grid-block permeabilities or transmissivities, (3) prescribed pressure values at the boundaries, (4) recharge, and (5) source/sink rates. In the application to be used by WIPP PA, weighted sums of the squared differences between calculated and observed groundwater pressures at selected boreholes were the chosen performance measures, and the transmissivities assigned to pilot points were the chosen system parameters to be identified during model calibration.

GRASP-INV presumes either steady-state or transient-state saturated groundwater flow conditions and directly uses the results of the groundwater flow simulation obtained from the SWIFT II subroutine. The theory and verification for the steady-state flow adjoint sensitivity


equations employed in GRASP-INV are presented by Wilson et al. (1986), while those for the transient-flow sensitivity equations are presented by RamaRao and Reeves (1990). A brief presentation of the sensitivity equations solved by GRASP-INV during this study is given below. A conventional approach to the evaluation of sensitivity coefficients is defined by the expression

\[
J = f(\alpha, p) \qquad (8)
\]

where J is a performance measure and α is a vector of sensitivity parameters. Let α_1 be the parameter for which a sensitivity coefficient is sought. Then

\[
\frac{dJ}{d\alpha_1} = \frac{\partial J}{\partial \alpha_1} + \frac{\partial J}{\partial p} \frac{\partial p}{\partial \alpha_1} \qquad (9)
\]

The first term on the right-hand side of (9) represents the sensitivity resulting from the explicit dependence of J on α_1 and is called the direct effect. The second term represents an indirect effect due to the implicit dependence of J on α_1 through the system pressures, p(α). While the computation of the direct effect is a trivial step, that of the indirect effect involves the evaluation of the state sensitivities, ∂p(x, t)/∂α_1. State sensitivities may be calculated by the "parameter-perturbation approach" (Yeh, 1986) or by solution of the partial differential equation for state sensitivity (Sykes et al., 1985; Yeh, 1986). However, these approaches require the state sensitivities to be recomputed whenever a new parameter is considered. In a numerical model with a large number of grid blocks/elements and different system parameters, this represents an enormous computational effort, of the same order as the multiple-simulation approach to parameter sensitivity.

The adjoint sensitivity approach circumvents the need to compute state sensitivities. This is done by expressing the performance measure as the sum of two distinct terms, one containing exclusively the partial variations with respect to the pressure function and the second containing the partial variations with respect to α_1 (RamaRao and Reeves, 1990). Both terms include a function referred to as the adjoint state. The adjoint state is computed such that it greatly facilitates the evaluation of the second term on the right-hand side of (9). The adjoint state vector λ is obtained by solving the following equation:

\[
[A]^{T} \{\lambda\}^{n-1} = [B]^{T} \{\lambda\}^{n} + \left\{ \frac{\partial J}{\partial \{p\}^{n}} \right\}^{T} \qquad (10)
\]

where T denotes the transpose of the matrix, A and B are the same matrices used in the primary problem (i.e., the pressure solution) solved by SWIFT II (Equation 2), and J is the performance measure (e.g., the cumulative sum of the weighted squared pressure deviations between calculated and observed pressures). Equation 10 is solved backwards in time, from n = L to n = 1, with

\[
\{\lambda\}^{L} = 0 \qquad (11)
\]

If α_i is a generic sensitivity parameter in grid block i, the sensitivity coefficient dJ/dα_i follows from the solution of (10) using the following expression:

\[
\frac{dJ}{d\alpha_i} = \frac{\partial J}{\partial \alpha_i} + \sum_{n=1}^{L} \{\lambda^{n}\}^{T} \left[ \frac{\partial [A]}{\partial \alpha_i} \{p\}^{n} - \frac{\partial [B]}{\partial \alpha_i} \{p\}^{n-1} - \frac{\partial \{f^{n}\}}{\partial \alpha_i} \right] \qquad (12)
\]

The fact that there are no state-sensitivity terms in the above expression leads to one important feature of the adjoint method, namely, the separation of the relatively time-intensive calculation of the adjoint state vector λ in (10) from the relatively non-time-intensive calculation of the sensitivity derivative (12). In general, this separation permits the calculation of sensitivity derivatives for all of the system parameters using the same adjoint state vector {λ}, a major advantage over the perturbation approach. Adjoint sensitivity analysis thus provides an extremely efficient algorithm for computing sensitivity coefficients between a given objective function J and a large number of parameters (permeabilities in thousands of grid blocks, as is the case here). In the applications contained in this document, Equation 12 is evaluated with α_i = K_i, the permeability in the grid block.
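The economy of the adjoint approach can be demonstrated on a steady-state analogue of Equations 8 through 12: one adjoint solve yields the derivative of J with respect to every parameter, which is then verified against finite differences. The 3x3 system and its parameter dependence are invented for illustration and are not the SWIFT II equations.

```python
import numpy as np

# Hedged sketch of steady-state adjoint sensitivities: solve C^T lambda = dJ/dp once,
# then dJ/dalpha_i = lambda^T (df/dalpha_i - dC/dalpha_i p) for every parameter.
rng = np.random.default_rng(2)
n = 3
f = rng.normal(size=n)          # fixed load vector (illustrative)
p_obs = rng.normal(size=n)      # "observed" pressures (illustrative)
w = np.ones(n)

def conductance(alpha):
    """Symmetric positive-definite matrix depending linearly on alpha (invented)."""
    C = np.diag(2.0 + alpha)
    C += -0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
    return C

def J_of(alpha):
    p = np.linalg.solve(conductance(alpha), f)
    return np.sum(w * (p - p_obs) ** 2)

alpha = np.array([1.0, 0.5, 2.0])
C = conductance(alpha)
p = np.linalg.solve(C, f)

# Adjoint solve: C^T lambda = dJ/dp, where dJ/dp = 2 w (p - p_obs).
lam = np.linalg.solve(C.T, 2 * w * (p - p_obs))

# Here df/dalpha_i = 0 and dC/dalpha_i is a single 1 on the diagonal,
# so dJ/dalpha_i = -lam[i] * p[i] for all parameters at once.
grad_adjoint = -lam * p

# Central finite-difference check: one pair of forward solves per parameter.
eps = 1e-6
grad_fd = np.array([
    (J_of(alpha + eps * np.eye(n)[i]) - J_of(alpha - eps * np.eye(n)[i])) / (2 * eps)
    for i in range(n)
])

assert np.allclose(grad_adjoint, grad_fd, rtol=1e-4, atol=1e-8)
```

The finite-difference check requires a new flow solve per parameter, whereas the adjoint route needs only the one backward solve regardless of the number of parameters; this is the scaling advantage the text describes for thousands of grid-block permeabilities.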

3.1.5 Locating Pilot Points

De Marsily (1978) pioneered the concept of pilot points as parameters of calibration. He assigned their locations based on empirical considerations. In GRASP-INV, Lavenue and Pickens' (1992) approach to locating pilot points is followed. Pilot points are placed at grid-block centers where their potential for reducing the objective function is highest. This potential is quantified by the sensitivity coefficients (dJ/dY_p) of the objective function J with respect to Y_p, the logarithm (to base 10) of the pilot-point transmissivity. A large number of candidate pilot points are considered (as specified by the user), usually the centroids of all the grid blocks in the flow-model grid. Each potential pilot point is initially described by an x, y, z location (grid-block center) and assigned the initial transmissivity value at that location in the model domain. Coupled adjoint sensitivity analysis and kriging is used to compute the required derivatives; the procedure is documented in RamaRao and Reeves (1990). Within a user-specified region, GRASP-INV calculates the absolute value of dJ/dY_p (see below) for these grid blocks. It then ranks these sensitivities and places a pilot point in the grid block with the highest absolute sensitivity value. GRASP-INV then sends this new pilot-point location to PAREST to optimize the pilot point's transmissivity value.

Let P be a pilot point added to a set of N observed transmissivity values within a particular category. Let T_p be the transmissivity assigned to pilot point P. Kriging is done using Y_p, where

\[
Y_p = \log_{10} T_p \qquad (13)
\]

The kriged estimate (Y*) at the centroid of a grid block m is given by

\[
Y^{*}_{m} = \sum_{k=1}^{N} \gamma_{m,k} Y_k + \gamma_{m,p} Y_p \qquad (14)
\]

where k is the subscript for an observation point, p is the subscript for the pilot point, γ_{m,k} is the kriging weight between the interpolation point m and data point k, and γ_{m,p} is the kriging weight between interpolation point m and pilot point p.
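The role of the pilot-point weight γ_{m,p} can be made concrete with a small kriging sketch: the estimate at a grid block is linear in the pilot-point value, and the coefficient is exactly the kriging weight. Locations, covariance model, and values below are illustrative assumptions, not the AKRIP configuration.

```python
import numpy as np

# Hedged sketch of Eq. (14): a pilot point enters the kriged estimate through
# its kriging weight, and that weight is d(Y*_m)/d(Y_p).
def cov(h, sill=1.0, a=15.0):
    """Assumed exponential covariance model."""
    return sill * np.exp(-np.abs(h) / a)

def krige_weights(x_data, x0):
    """Simple-kriging weights for estimating at location x0."""
    C = cov(x_data[:, None] - x_data[None, :])
    c0 = cov(x0 - x_data)
    return np.linalg.solve(C, c0)

x_obs = np.array([0.0, 30.0])         # measured-transmissivity locations
y_obs = np.array([-1.0, 0.5])         # measured log10 T (illustrative)
x_pilot = 12.0                        # pilot-point location, fixed by the search
x_m = 10.0                            # grid-block centroid of interest

x_all = np.append(x_obs, x_pilot)
w = krige_weights(x_all, x_m)         # [gamma_{m,1}, gamma_{m,2}, gamma_{m,p}]

def Y_star(y_p):
    """Eq. (14): kriged estimate as a function of the pilot-point value."""
    return w @ np.append(y_obs, y_p)

# Perturbing the pilot-point value shifts the estimate by gamma_{m,p} per unit.
gamma_mp = w[-1]
assert np.isclose(Y_star(1.0) - Y_star(0.0), gamma_mp)
```

This linearity is what makes the chain-rule derivation in the following equations so simple: the derivative of each grid-block value with respect to the pilot point is just a precomputed kriging weight.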

When a pilot-point transmissivity is perturbed, the kriged transmissivities, and hence the conditionally simulated (CS) values in the neighboring grid blocks, are altered, causing the objective function J to change. If a grid block does not belong to the local 'neighborhood' used in the nearest-neighbor kriging routine, its CS value will not be affected by the addition of a pilot point. Let Y*_m represent the CS value assigned to grid block m. Using the chain rule,

\[
\frac{dJ}{dY_p} = \sum_{m=1}^{M} \frac{\partial J}{\partial Y^{*}_{m}} \frac{\partial Y^{*}_{m}}{\partial Y_p} \qquad (15)
\]

where M is the total number of grid blocks in the flow model, and

\[
\frac{dY^{*}_{m}}{dY_p} = \gamma_{m,p} \qquad (16)
\]

where γ_{m,p} is the linear weight between a pilot point and the finite-difference grid-block centroid. This result is valid for a CS field also, because the kriging error is independent of the kriged values. Thus,

\[
\frac{dJ}{dY_p} = \sum_{m=1}^{M} \frac{dJ}{dY^{*}_{m}} \, \gamma_{m,p} \qquad (17)
\]

With

\[
Y^{*}_{m} = \log_{10} (T^{*}_{m}) \qquad (18)
\]

\[
T^{*}_{m} = K_m \frac{\rho_m \, g \, b_m}{\mu_m} \qquad (19)
\]

it follows that

\[
\frac{dJ}{dY^{*}_{m}} = \ln(10) \, K_m \frac{dJ}{dK_m} \qquad (20)
\]

where T* is the CS transmissivity, K is the CS permeability, ρ is fluid density, µ is fluid viscosity, g is the acceleration due to gravity, b is grid-block thickness, and the subscript m denotes the grid block.

Combining Equations 17 and 20 yields

\[
\frac{dJ}{dY_p} = \ln(10) \sum_{m=1}^{M} \gamma_{m,p} \, K_m \frac{dJ}{dK_m} \qquad (21)
\]

The sensitivity coefficient dJ/dK_m, the derivative of the objective function with respect to the permeability in grid block m, is obtained by adjoint sensitivity analysis.
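Equation 21 is a simple weighted sum once the adjoint sensitivities are in hand. The arrays below (weights, permeabilities, adjoint sensitivities) are invented for illustration.

```python
import numpy as np

# Hedged sketch of Eq. (21): combine adjoint grid-block sensitivities dJ/dK_m
# with the kriging weights gamma_{m,p} to get the pilot-point sensitivity dJ/dY_p.
M = 6
gamma = np.array([0.40, 0.25, 0.15, 0.10, 0.05, 0.0])        # kriging weights, pilot -> blocks
K = np.array([1e-6, 5e-6, 2e-6, 1e-5, 3e-6, 8e-6])           # CS grid-block permeabilities
dJ_dK = np.array([2.0e5, -1.0e5, 4.0e4, 0.0, 1.0e4, 5.0e4])  # adjoint sensitivities (invented)

dJ_dYp = np.log(10.0) * np.sum(gamma * K * dJ_dK)            # Eq. (21)

# Blocks outside the kriging neighborhood (gamma = 0) contribute nothing,
# so the last block has no influence regardless of its dJ/dK.
assert np.isclose(dJ_dYp, np.log(10.0) * np.sum((gamma * K * dJ_dK)[:-1]))
```

In GRASP-INV this sum is evaluated for every candidate grid block, and the pilot point is placed where |dJ/dY_p| is largest.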

3.1.6 Optimization of Pilot Point Transmissivities

GRASP-INV contains a series of optimization codes to assign transmissivities to selected pilot-point locations. The optimization codes are contained in the subroutine PAREST. Optimization is essentially conducted as a two-step process: first determine the direction in which to adjust the parameter's initial value; once the direction is chosen, determine the optimal change, or 'step length,' in that direction. The pilot-point transmissivities are the parameters that are adjusted for calibration. However, in the mathematical implementation, the logarithms (to base 10) of the transmissivities (and not the transmissivities themselves) are treated as parameters. The calibration parameters are given by

\[
Y_p = \log_{10} T_p \qquad (22)
\]

where T_p is the transmissivity at a pilot point (the suffix p denotes pilot point). The transmissivities at pilot points are assigned by an unconstrained optimization algorithm and a subsequent imposition of constraints. The optimization algorithm chosen here belongs to a class of iterative search algorithms. It involves a repeated application of the following equation until convergence is achieved:

\[
Y^{i+1} = Y^{i} + \beta_i \, d_i \qquad (23)
\]

where i is the iteration index, d_i is the direction vector, β_i is the step length (a scalar), and Y^i is the vector of parameters to be optimized (i.e., logarithms of pilot-point transmissivities to base 10). Details may be found in Appendix B.

3.1.6.1 Determining the Direction Vector: d_i

Three options for the computation of the direction vector d_i are considered: the algorithms due to (1) Fletcher-Reeves, (2) Broyden, and (3) Davidon-Fletcher-Powell (Luenberger, 1973; Gill et al., 1981; Carrera and Neuman, 1986a; Certes, 1990). These methods are well known in the classical literature and are not described here. The details with respect to the pilot-point methodology are given in Appendix B.
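One of these options, the Fletcher-Reeves conjugate-gradient direction, can be sketched on a toy quadratic objective with an exact line search. The 2x2 quadratic is invented for illustration; in GRASP-INV the gradient comes from the adjoint/kriging machinery described above.

```python
import numpy as np

# Hedged sketch of the Fletcher-Reeves direction update combined with the
# iterative search of Eq. (23), on an invented quadratic J(Y) = 0.5 Y^T H Y.
def fletcher_reeves_direction(g_new, g_old, d_old):
    """d_{i+1} = -g_{i+1} + (|g_{i+1}|^2 / |g_i|^2) d_i."""
    beta_fr = (g_new @ g_new) / (g_old @ g_old)
    return -g_new + beta_fr * d_old

H = np.array([[4.0, 1.0],
              [1.0, 3.0]])            # positive-definite Hessian, minimum at the origin
grad = lambda Y: H @ Y

Y = np.array([2.0, -1.5])             # starting parameter vector
g = grad(Y)
d = -g                                # first direction: steepest descent

for _ in range(2):                    # CG minimizes a 2-D quadratic in 2 steps
    beta_step = -(g @ d) / (d @ H @ d)   # exact line search for a quadratic
    Y = Y + beta_step * d                # Eq. (23)
    g_new = grad(Y)
    d = fletcher_reeves_direction(g_new, g, d)
    g = g_new

# Conjugate gradients reach the minimizer of an n-dimensional quadratic in n steps.
assert np.linalg.norm(Y) < 1e-8
```

For a general (non-quadratic) objective, the line search is approximate rather than closed-form, which is what the step-length procedure in the next subsection addresses.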

3.1.6.2 Determining the Step Length: β_i

The step length β_i (a scalar) is determined by

\[
\min_{\beta_i} J(Y^{i+1}) = J(Y^{i} + \beta_i \, d_i) \qquad (24)
\]

Thus, β_i is obtained by solving

\[
\frac{\partial J(Y^{i+1})}{\partial \beta_i} = 0 \qquad (25)
\]

The solution of Equation 25 follows from Carrera and Neuman (1986a) and Neuman (1980). The details with respect to pilot points are presented in Appendix B and are not repeated here.

3.1.6.3 Constraints on Pilot Point Transmissivity Values

It is possible that the optimization algorithms may dictate large changes in the transmissivities assigned to pilot points in order to bring about a reduction in the objective function. Such large recommended changes may be viewed as undesirable for several reasons. At any point in the field, one can obtain a kriged estimate of log transmissivity and its variance (kriging variance). One may construct a confidence interval (assuming a normal distribution of kriging errors) for the log transmissivity. It is reasonable to expect the calibrated value to lie within the confidence band, and a constraint may be imposed to achieve this. There also may be situations where the confidence band is large. A large change in a pilot-point transmissivity value, even if contained within the confidence band, can cause a large change in the spatial correlation structure of the transmissivity field. One objective in calibration can then be to limit the maximum change to a specified value so that the geostatistical structure of the transmissivity field is not altered significantly. Consider the kth parameter, whose value is Y_k (the kth element in the vector of parameters, Y). Then,

\[
\Delta Y_{k,i} = (Y_{k,i+1} - Y_{k,i}) = \beta_i \, d_{k,i} \qquad (26)
\]

where i is an iteration index.

Constraint 1: The parameter value should lie within the confidence band:

\[
Y_k - m \sigma_y \le Y_{k,i+1} \le Y_k + m \sigma_y \qquad (27)
\]

where Y_k is the kriged value at the location of pilot point k, σ_y is the kriging standard deviation at the same location, and m is the multiplier of the standard deviation, which gives the semi-width of the confidence band. GRASP-INV uses a 95% confidence band, obtained from the geostatistics routines during the simulation of the transmissivity field. The 95% confidence-interval values are sent to PAREST as grid-block minimum and maximum values and are therefore used as constraints during pilot-point transmissivity optimization.

Constraint 2: The change in any parameter must be limited to ∆Y_max:

\[
\left| \Delta Y_{k,i} \right| \le \Delta Y_{max} \qquad (28)
\]

After the optimization, these constraints are imposed for each parameter. If a constraint becomes active, the computed optimal step length is reduced; however, the direction is preserved. Additional details are presented in Appendix B.
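The constraint logic of Equations 26 through 28 amounts to shrinking the scalar step length, with the direction preserved, until every parameter respects its band and the maximum-change limit. The halving rule and all numeric values below are illustrative assumptions, not the PAREST implementation.

```python
import numpy as np

# Hedged sketch of Eqs. (26)-(28): reduce the step length (keep the direction)
# until the confidence-band and maximum-change constraints are satisfied.
def constrained_step(Y, d, beta, Y_kriged, sigma, m=1.96, dY_max=1.0):
    """Return (Y_new, beta) with beta shrunk so all constraints hold."""
    lo = Y_kriged - m * sigma          # Eq. (27), lower edge of the confidence band
    hi = Y_kriged + m * sigma          # Eq. (27), upper edge
    for _ in range(50):
        Y_new = Y + beta * d           # Eq. (26): delta Y = beta * d
        ok_band = np.all((Y_new >= lo) & (Y_new <= hi))
        ok_max = np.all(np.abs(beta * d) <= dY_max)   # Eq. (28)
        if ok_band and ok_max:
            return Y_new, beta
        beta *= 0.5                    # constraint active: reduce step, preserve direction
    return Y, 0.0                      # give up: no admissible step found

Y = np.array([-5.0, -4.0])             # current log10 pilot-point transmissivities
d = np.array([3.0, -0.5])              # optimizer's direction vector
Y_kriged = np.array([-5.2, -4.1])      # kriged values at the pilot-point locations
sigma = np.array([0.5, 0.8])           # kriging standard deviations

Y_new, beta = constrained_step(Y, d, beta=1.0, Y_kriged=Y_kriged, sigma=sigma)

assert np.all(np.abs(Y_new - Y) <= 1.0 + 1e-12)                 # Eq. (28) honored
assert np.all(np.abs(Y_new - Y_kriged) <= 1.96 * sigma + 1e-12) # Eq. (27) honored
```

Simple step halving is only one way to reduce β while preserving d; the essential point is that the direction vector from the optimizer is never altered, only the distance travelled along it.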

3.1.7 Earlier Inverse Algorithms: Similarities and Differences

The inverse algorithm used in GRASP-INV shares some similarities with earlier inverse algorithms (de Marsily et al., 1984; Carrera and Neuman, 1986a) yet maintains essential and substantial differences from them. It is useful to appreciate both the similarities and the differences. In the present algorithm, if we suspend the automatic pilot-point selection process in the code and instead proceed from given pilot-point locations, the algorithm is very similar to that of de Marsily et al. (1984). Considering the problem as one of optimizing the magnitudes of parameters (pilot-point transmissivities) at given locations, the algorithm is similar to that of Carrera and Neuman (1986a). The essential difference between GRASP-INV and the earlier methods, however, is that the choice of parameter locations, made subjectively in the above-cited references, is rendered totally objective in GRASP-INV. This eliminates the need to consider alternative choices of zonation, as in Carrera and Neuman (1986a), and alternative choices of the pilot-point configuration, as in de Marsily et al. (1984).

Another distinguishing feature of GRASP-INV is the multi-stage approach used for locating pilot points. For example, if the final calibration includes 30 pilot points, only one or two pilot points are identified in the first calibration loop. Using these pilot points to modify the CS transmissivity field, another calibration loop is then performed and another set of one or two pilot points is obtained. Only one or two pilot points are selected and optimized at a time because the optimization algorithms work very well with only a few parameters. This process is repeated for numerous loops until the weighted least-squares objective function is reduced below a pre-defined threshold.


3.2 Applications

3.2.1 1992 Culebra Regional Flow Model

This section briefly reviews the major points of the application paper contained in Appendix C (Lavenue et al., 1995). In 1992, GRASP-INV v1.0 was used to solve the groundwater inverse problem and determine the uncertainty in the transmissivity field of a regional aquifer near a site considered for radioactive waste isolation. The Waste Isolation Pilot Plant (WIPP) near Carlsbad, New Mexico (Figure 3-4), is a research and development project of the United States Department of Energy (DOE). The WIPP is designed to be the first mined geologic repository to demonstrate the safe disposal of transuranic (TRU) radioactive wastes generated by DOE defense programs since 1970.

Figure 3-4. Location of the Waste Isolation Pilot Plant (WIPP) Site in Southeastern New Mexico (from Lavenue et al., 1995)

Before disposing of radioactive waste at the WIPP, the DOE must have a reasonable expectation that the WIPP will comply with the quantitative requirements of Subpart B of the United States Environmental Protection Agency's (EPA) Environmental Standards for the Management and Disposal of Spent Nuclear Fuel, High-Level and Transuranic Radioactive Wastes (40 CFR Part 191, EPA, 1985). Comparing the long-term performance of the WIPP disposal system with the quantitative requirements of 40 CFR Part 191 will help determine whether the disposal system will provide safe disposal of radionuclides. Performance assessment as defined in the Containment Requirements of Subpart B of 40 CFR Part 191 is an analysis that identifies the processes and events that might affect the disposal system, examines the effects of these processes and events on the performance of the disposal


system, and estimates the cumulative releases of radionuclides, considering the associated uncertainties, caused by all significant processes and events (191.12(q)).

Major sources of data for WIPP performance-assessment calculations result from site-characterization activities, which began at the WIPP site in 1976. Since 1983, when full construction of the facility was started, site-characterization activities had the objectives of updating or refining the overall conceptual models of the geologic, hydrologic, and structural behavior of the WIPP site and providing data adequate for use in the WIPP performance assessment (Lappin, 1988). As the WIPP Project moved toward a compliance determination, the objective of site-characterization efforts was to reduce uncertainty in the conceptual models. Uncertainty and sensitivity analysis for the total disposal system is the task of the WIPP Performance Assessment (PA) Department at Sandia National Laboratories.

Because some uncertainty about the parameters controlling groundwater flow and transport will always remain, the 1992 WIPP PA calculations employed Monte Carlo techniques to provide estimates of radionuclide concentrations at the accessible-environment boundary (WIPP PA Department, 1992). This approach required that cumulative distribution functions be selected for numerous imprecisely known input parameters. For example, local-scale multiphase codes that simulate the interaction of waste-generated gas and brine within the repository and the Salado Formation (Figure 3-5) required input parameters such as residual saturation, threshold pressure, undisturbed pore pressure, and porosity. Examples of input parameters needed to simulate far-field flow and transport through the Culebra Dolomite, considered to be the principal pathway for offsite transport, included transmissivity, dispersivity, and porosity.
The reports by the WIPP PA Department (1992) describe the input parameters used in the PA calculations, the codes used during the calculations, and the relationships between the parameters and codes.

Numerous modeling studies over the last 15 years have focused on characterizing the hydrogeology of the Culebra Dolomite. In general, these studies attempted to characterize the Culebra transmissivity field by iteratively reducing the differences between the calculated and observed heads within a single groundwater numerical model. The head differences were reduced by modifying the transmissivity field either by intuition or through the use of numerical algorithms such as kriging. While these studies improved the understanding of the relationship between the transmissivity and flow fields within the Culebra, they did not provide a metric for quantifying the uncertainty within the transmissivity field. The application of GRASP-INV provided a means to quantify this uncertainty and assess the spatial variability within the field.

The GRASP-INV code generated and subsequently calibrated conditionally simulated (CS) transmissivity fields of the Culebra dolomite. Because each CS field has similar broad features but distinctly different small-scale variations, the GRASP-INV code produced numerous, equally probable transmissivity fields calibrated to the observed head data. The unique features present within each calibrated field were related to the uncertainty of the transmissivity field. The WIPP PA Department incorporated this uncertainty into the 1992 Monte Carlo analysis by partially ordering a set of equally probable transmissivity fields by travel time to the accessible environment, and then drawing one field for each system calculation by sampling a uniformly distributed index variable.
Because a Latin Hypercube Sampling technique was used and the number of fields in the set was equal to the number of imprecisely known parameters, each field was drawn once in the 1992 PA calculations. Although not required for a compliance assessment with 40 CFR 191, Subpart B, travel time is a good intermediate performance measure and provides some physical interpretation of the index variable for sensitivity and uncertainty analysis.

Figure 3-5. Generalized Geologic Cross Section (from Lavenue et al., 1995)
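The index-variable sampling described above can be sketched in a few lines; this is an illustration of the selection scheme rather than the PA code itself, and the travel times and random seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical travel times (yr) for an ensemble of 70 calibrated fields.
travel_times = rng.uniform(10_000.0, 30_000.0, size=70)

# Partially order the fields by travel time to the accessible environment.
order = np.argsort(travel_times)

# Latin Hypercube Sample of a uniform index variable: one stratum per field,
# so each field is drawn exactly once, as in the 1992 PA calculations.
n = len(travel_times)
strata = (np.arange(n) + rng.uniform(size=n)) / n   # one point in each 1/n stratum
rng.shuffle(strata)                                 # random pairing with the runs
indices = np.floor(strata * n).astype(int)
selected_fields = order[indices]                    # one field per system calculation
```

Because each of the n strata contributes exactly one index, every field appears exactly once across the n system calculations.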

3.2.1.1 Site Description and Review of Past Modeling Studies

Over the past 12 years, a significant effort has been directed toward field investigations at the WIPP site. These investigations have been instrumental in providing estimates of the variability of the hydrogeologic properties within the Culebra Dolomite, such as transmissivity and storativity. Numerous boreholes in and immediately surrounding the WIPP-site area have been drilled and tested within the Culebra in support of these investigations.

The Culebra aquifer, which dips toward the southeast, has spatially varying characteristics across the WIPP-site area. For instance, transmissivity increases and formation-fluid density decreases from east to west (Figures 3-6 and 3-7). There is no apparent trend to the storativity data obtained from the tests within the Culebra.

The transmissivity database for the Culebra Dolomite is derived from numerous hydraulic tests performed at the WIPP site. Values have been obtained from drill-stem tests (DSTs), slug tests, and local- and regional-scale pumping or interference tests (Beauheim, 1986, 1987a, 1987b, 1987c, 1989, 1991). Transmissivity values interpreted from these tests extend over a range of seven orders of magnitude. The large range in the transmissivities results from the variation in the fractured nature of the Culebra.

A map of the undisturbed freshwater heads within the Culebra is illustrated in Figure 3-8. The freshwater heads reveal a predominantly southerly flow direction across the WIPP site. The heads southeast of the WIPP-site area indicate a more westerly flow direction.

Figure 3-6. Initial CS log10 Transmissivity Field #2 (from Lavenue et al., 1995)


Figure 3-7. Culebra Formation Fluid-Density Values at the WIPP-Area Boreholes (from Lavenue et al., 1995)


Figure 3-8. Culebra Freshwater-Head Contour Surface (from Lavenue et al., 1995)


Since the early 1980s, the Culebra has been a focus of numerical modeling activities. Both regional and local-scale models have been constructed over the years due to changes in the conceptual model and in the definitions of the parameter-value distributions. These changes have occurred as a result of the continuing field investigations and the subsequent expansion of the hydrogeologic database. More details may be found in Appendix C.

3.2.1.2 Model Input

The finite-difference grid used in this modeling study to generate 70 CS fields was selected to facilitate the successful reproduction of both steady-state and transient heads. The grid consists of 50 x 57 x 1 (x, y, z) grid blocks, with a finer discretization in the central portion of the model. The vertical dimension of the grid is taken from the thickness of the Culebra Dolomite in the WIPP area; the mean thickness of 7.7 m was calculated from the available data and was assumed suitable for the vertical model dimension in this study.

The locations of the model boundaries were chosen to take advantage of a nearby groundwater divide and to minimize the effect that the boundaries may have on the transient modeling results for three long-term pumping tests. One section of the northwestern boundary was considered a no-flow boundary due to a groundwater divide believed to exist along the western model boundary. The Culebra was considered confined above and below by low-permeability beds of anhydrite, halite, and siltstone. Vertical flux was not considered in the model because of the existence of these low-permeability anhydrites.

Lavenue et al. (1995) used the AKRIP code to perform the kriging calculations needed for the conditional simulations (see Appendix C). The coefficients of the linear generalized covariance function (GCF) were determined by an automatic iterative procedure in which the GCF is fitted to local neighborhoods defined by subsets of the observed transmissivity data. The neighborhood was defined by the ten nearest observed data points surrounding a particular grid block in the model area. Lavenue et al. (1995) determined that a zero-order GCF best fit the observed Culebra transmissivity data according to the following relation:

K(s) = −2.3 × 10⁻⁴ s     (29)

where K(s) is the generalized covariance and s is the distance between an observed data point and the center of the estimation area. More details concerning the selection of this GCF and the production of the conditional simulations may be found in Appendices B and C.
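The kriging estimate implied by this zero-order GCF can be sketched as follows; the observation coordinates and log10 T values below are hypothetical, and AKRIP's moving-neighborhood search (ten nearest points) is omitted for brevity.

```python
import numpy as np

def gcf(h):
    """Zero-order linear generalized covariance, K(s) = -2.3e-4 * s (Eq. 29)."""
    return -2.3e-4 * h

def krige(obs_xy, obs_z, target_xy):
    """Kriging estimate with a constant (order-zero) drift and the linear GCF.
    A minimal sketch of the estimate AKRIP computes, not the AKRIP code itself."""
    n = len(obs_z)
    d = np.linalg.norm(obs_xy[:, None, :] - obs_xy[None, :, :], axis=2)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = gcf(d)
    A[:n, n] = 1.0   # unbiasedness constraint for the constant drift
    A[n, :n] = 1.0
    b = np.zeros(n + 1)
    b[:n] = gcf(np.linalg.norm(obs_xy - target_xy, axis=1))
    b[n] = 1.0
    w = np.linalg.solve(A, b)   # kriging weights plus Lagrange multiplier
    return float(w[:n] @ obs_z)

# Hypothetical log10 T observations (coordinates in metres)
xy = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 1000.0], [1500.0, 1500.0]])
z = np.array([-5.8, -5.2, -6.1, -4.9])
est = krige(xy, z, np.array([500.0, 500.0]))
```

Because the weights sum to one and the system is exact at the data points, the estimate interpolates the observations and reverts toward a local mean away from them.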

3.2.1.3 Model Results

Steady-state calibration was first conducted with 70 transmissivity fields. Once steady-state calibration was complete, the transmissivity fields were calibrated to transient-state conditions, and the characteristics of each calibrated field were examined. Each of the calibrated CS transmissivity fields has a different spatial distribution of transmissivities. In some cases, the high-transmissivity zone is a broad feature that extends from the DOE-1 borehole in the east WIPP-site area to the H-14 borehole west of H-3 (Figure 3-9). In other cases, the high-transmissivity zone has a narrow, tortuous, and in some instances discontinuous nature.

The groundwater travel time and travel path from a point within the Culebra coincident with the centroid of the waste panels to the southern WIPP-site boundary was calculated for each of the calibrated CS fields. A common technique of expressing travel-time distributions is through a cumulative distribution function (cdf), which represents the probability of occurrence for various travel times. For instance, the travel-time cdf determined from the calibrated fields (Figure 3-10) indicates that 90% of the travel times were longer than 12,000 yr, 50% of the travel times were longer than 18,000 yr, and 10% of the travel times were longer than 27,000 yr.

The travel paths that correspond to the travel times contained in the cdf are illustrated in Figure 3-11. Most of the travel paths follow a southeasterly direction until reaching the DOE-1 vicinity, at which point the paths travel directly south to the WIPP-site boundary. A few paths travel directly south from the starting point, while several others have an east-southeasterly direction prior to moving south toward the WIPP-site boundary.

Upon inspection, the primary factor that affects travel times is the distance that the particle must travel within a low-transmissivity region between the drop point and the southern WIPP-site boundary. In some realizations, the CS field has a low-transmissivity region of -6.0 to -7.0 (log10 T, T in m2/s) which extends southward from the WIPP-19, WIPP-21, and WIPP-22 boreholes to the H-1 borehole. The width and length of this low-transmissivity feature vary widely. In other realizations, this lower-transmissivity feature is confined to the immediate vicinity of the WIPP wells and the transmissivities in the vicinity of the H-1 borehole lie between -5.0 and -6.0 log10 T. In these realizations, the travel times are shorter.
The secondary factor affecting the travel time is whether the particle intersects higher transmissivities (-4.0 to -5.0 log10 T) before exiting the southern WIPP-site boundary. In most of the realizations, the particles do eventually intersect a region of higher transmissivities. In some cases, the high-transmissivity region may begin adjacent to the H-3 borehole while, in others, the high-transmissivity region begins in the vicinity of the H-11 and DOE-1 boreholes.
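Reading exceedance statistics such as those quoted above off an empirical travel-time cdf can be sketched in a few lines; the travel times below are hypothetical.

```python
import numpy as np

# Hypothetical travel times (yr) from an ensemble of calibrated CS fields.
tt = np.array([11_500.0, 14_000.0, 16_500.0, 18_000.0,
               21_000.0, 25_000.0, 29_000.0])

def exceeded_by(times, p):
    """Travel time exceeded by a fraction p of the realizations, read off the
    empirical cdf (a fraction p of the travel times are longer than this)."""
    return float(np.quantile(times, 1.0 - p))

t90 = exceeded_by(tt, 0.90)   # 90% of travel times are longer than this
t50 = exceeded_by(tt, 0.50)   # the median travel time
t10 = exceeded_by(tt, 0.10)
```

The three values correspond to reading the cdf at the 0.10, 0.50, and 0.90 cumulative-probability levels, as done for Figure 3-10.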

3.2.1.4 Model Conclusions

In an earlier study (Lavenue et al., 1990), a wide high-transmissivity zone was assigned to match the observed pressures during the H-11 pumping test. This high-transmissivity feature was not unique in orientation, width, or transmissivity magnitude. Given the uncertainty in the high-transmissivity feature, a large number of different representations of the high-transmissivity zone could be possible. The existence of this zone provided a fast groundwater pathway from the center of the WIPP site to the accessible environment 5 km away. Since the actual transmissivity zone could be significantly different from the representation in the 1990 model domain, the uncertainty of the groundwater travel time was unknown.

Lavenue et al. (1995) employed the GRASP-INV code to investigate the variation in the high-transmissivity zone and its impact upon groundwater travel time. They solved the inverse problem for 70 transmissivity fields against the exhaustive database of measured heads taken within a regional aquifer, the Culebra dolomite, and assessed the plausible variations in this high-transmissivity zone. The uncertainty associated with the Culebra transmissivity field, as expressed through the kriging estimate's (µ) standard error (σ), provided one way to assess the possible spatial variability in this region through the analysis of numerous realizations. The distribution of possible values at a given point within which a CS value should lie is expressed by µ ± 3σ. By generating and subsequently calibrating numerous transmissivity fields with values within the µ ± 3σ distribution, the range of plausible fields and the spatial variability associated with these fields were determined.
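The µ ± 3σ band can be illustrated with a minimal sketch; the kriged estimates, standard errors, and seed below are hypothetical, and the spatial correlation among simulated residuals (which a true conditional simulator honors) is ignored here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical kriged log10 T estimates (mu) and kriging standard errors (sigma)
# at three grid blocks.
mu = np.array([-5.5, -4.8, -6.2])
sigma = np.array([0.4, 0.7, 0.3])

# Each conditional simulation perturbs the kriged estimate by a random residual
# scaled by the kriging standard error, so essentially all simulated values fall
# within the mu +/- 3*sigma band described in the text.
n_fields = 70
cs = mu + sigma * rng.standard_normal((n_fields, mu.size))

frac_inside = float(np.mean(np.abs(cs - mu) <= 3.0 * sigma))
```

For Gaussian residuals, roughly 99.7% of simulated values lie inside the band, which is why µ ± 3σ serves as a practical envelope for the plausible fields.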

Figure 3-9. Calibrated CS log10 Transmissivities (from Lavenue et al., 1995)


Figure 3-10. Cumulative distribution function (cdf) of travel times determined from the transient-calibrated fields (TCDF) and the cdf determined from the steady-state calibrated fields (SCDF) (from Lavenue et al., 1995)

Figure 3-11. Travel paths corresponding to the travel times contained in the cumulative distribution function (cdf) (from Lavenue et al., 1995)


Once the calibrated fields were produced, groundwater travel times from a point within the Culebra, coincident with the center of the waste panels, to the southern WIPP-site boundary were calculated. From this distribution of travel times, the most important spatial features controlling groundwater flow were determined. The GRASP-INV results identified the importance of understanding the location of the lower transmissivity region in the vicinity of the H-1 borehole. The uncertainty of the transmissivities in this region was shown to affect the overall travel time distribution significantly.

3.2.2 GXG Test Problems

After the 1992 WIPP Performance Assessment calculations were complete, the GXG questioned how GRASP-INV v1.0 would compare to other inverse codes available today. The GXG developed a list of seven inverse techniques, three non-linear and four linear, which could be used in a comparison exercise. The objective of the comparison was to determine which of the inverse techniques was better suited for making probabilistic forecasts of the potential transport of solutes in a highly variable aquifer with highly uncertain hydrogeologic properties. The seven inverse methods selected for the comparison were:

(1) Fast Fourier Transform (FF),
(2) Fractal Simulation (FS),
(3) Linearized Cokriging (LC),
(4) Linearized Semi-analytical (LS),
(5) Maximum Likelihood (ML),
(6) Pilot Point, i.e., GRASP-INV v1.0 (PP), and
(7) Sequential Self-calibration (SS).

These seven methods are described in Zimmerman et al. (1998) and were compared on four synthetic data sets developed by a GXG subcommittee. These data sets, referred to as the GXG Test Problems, each had specific features meeting (or not) classical assumptions about stationarity, amenability to a geostatistical description, etc. The selected methods were to estimate the transmissivity field from measurements of transmissivity and head and produce an ensemble of simulated transmissivity fields conditioned on all the available data on transmissivity and head. These simulated transmissivity fields should reflect the uncertainty in the transmissivity (T) estimate after calibration and would be the input T-fields in the Monte Carlo simulations of flow through the system. There should be as many different T-fields as Monte Carlo simulations (about 100), all considered as having an equal probability of occurrence.

The GXG compared the results of the inverse methods using the advective groundwater travel time (GWTT) of a conservative tracer as a surrogate for the more complex solute-transport problem. The participants generated probability density functions (pdf's) of GWTT using the results of their respective inverse solutions. The GXG then evaluated each inverse approach on its ability to reflect the uncertainty in aquifer parameters adequately as described by the conditional GWTT pdf's. The GXG was interested in determining how the estimate of the conditional pdf's of GWTT could be affected by either the differences in the principles and coding of the inverse methods or the manner in which a given method was applied by the person who ran it. One of the inverse approaches directly produced the conditional pdf, while all the others used intermediate steps, i.e., calibrated transmissivity fields as input to Monte Carlo simulations of groundwater flow. Thus, the comparison focused primarily on the inverse approaches' ability to predict the GWTT pdf's, and secondarily on the calculated transmissivity maps and other ancillary measures of 'performance'. Most of the following section is an excerpt from Zimmerman et al. (1998), contained in Appendix D.

3.2.2.1 Overview of the GXG Test Problems

The approaches that were compared came from the published literature and were suggested by the members of the GXG, who, in most cases, applied their methods themselves or had direct contact with the users of their codes. After considering which approaches would be deemed appropriate for this application and determining the availability of those who could perform the work, seven inverse methods were selected for comparison by the GXG. These seven methods are by no means an exhaustive sampling of all the methods that have been published in the literature. Among the most prominent "absences" are the approaches proposed by Cooley [1977, 1979, 1982, 1983], Townley and Wilson [1985], and Sun and Yeh [1992], who unfortunately could not participate.

After an initial test of some of these approaches on the real data set, the GXG formulated a series of four 'Test Problems'. In these test problems, the 'true' system was entirely known and was used for objective comparison of the different approaches. The four test problems were developed in secrecy from the participants who would receive the data and run the inverse models. Four different T-fields were generated. Synthetic hydraulic head data were obtained by solving the two-dimensional flow equations with prescribed boundary conditions using these synthetic T-fields. A limited number of observations of head and transmissivity obtained from the exhaustive (synthetic) data sets would then be provided to the participants. Additionally, particle-tracking calculations were performed to compute advective travel times and travel paths of a conservative solute for the synthetic data sets. Particles were released at a number of locations and the "true" groundwater travel times were calculated, but not given to the participants.
For each test problem, the participants would analyze the sampled T and head data (about 40 observations of each) and use their inverse procedure to generate the ensemble of conditional transmissivity fields and corresponding head fields (in general between 50 and 100), which were given to the GXG coordinator. The coordinator then calculated the travel times and travel paths for the same release points as those in the "true" field, using the same particle-tracking code as the one used for the true field, but using the T values, the grid size, and the boundary conditions specified by the participants as a result of their efforts. A comparison was then made between the participants' results and the true travel time(s).

In the real world, it is clear that parameters other than transmissivity are variable and uncertain in the system. For example, porosity, aquifer thickness, dispersivity, sorptive properties, etc., are all variable, and the GXG made suggestions on how to incorporate these uncertainties into the PA. However, for the present study, only the transmissivity is involved, and all other parameters are given uniform values.
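The advective particle tracking used to compute the "true" travel times can be sketched with a simple forward-Euler tracker; the velocity field below is hypothetical and uniform, and production trackers such as MODPATH use Pollock's semi-analytical scheme rather than this naive integration.

```python
import numpy as np

def track_particle(vx, vy, start, dx, dt, max_steps=10_000):
    """Minimal advective particle tracker (forward Euler) on a uniform grid.
    vx, vy: cell-centered velocity components (m/s); dx: cell size (m).
    Returns the travel path and cumulative travel time until domain exit."""
    ny, nx = vx.shape
    x, y = start
    path = [(x, y)]
    t = 0.0
    for _ in range(max_steps):
        i, j = int(y // dx), int(x // dx)
        if not (0 <= i < ny and 0 <= j < nx):
            break  # particle left the model domain
        x += vx[i, j] * dt
        y += vy[i, j] * dt
        t += dt
        path.append((x, y))
    return np.array(path), t

# Uniform eastward flow at 1e-6 m/s on a 10 x 10 grid of 100 m cells
vx = np.full((10, 10), 1e-6)
vy = np.zeros((10, 10))
path, t = track_particle(vx, vy, start=(50.0, 450.0), dx=100.0, dt=1e7)
```

Releasing many particles from a "waste panel" area and collecting the exit times from trackers like this is what builds the GWTT distributions compared in the study.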


3.2.2.2 Description of the Four Test Problems

Since a complete description of the GXG Test Problems may be found in Appendix D, only a general overview of each test problem is presented here. The Test Problems (TPs) were developed as a series of independent synthetic data sets intended to span the range of possible conceptual models of the Culebra transmissivity distribution at the WIPP site. Estimates of transmissivity at 41 boreholes at the actual WIPP site have been obtained through slug tests, local pumping tests, and three regional-scale pumping tests lasting from one to three months (Beauheim, 1991). The T-values obtained from these tests span seven orders of magnitude. The test problems were developed as steady-state systems and the sample data were limited to, at most, 41 observations of head and transmissivity (at the same locations). The spatial distribution of the boreholes (i.e., density, pattern) in the TPs was kept similar to that present at the WIPP. Three large-scale pumping tests were also simulated in TPs 3 and 4.

In the real world, these data are all subject to measurement errors. However, none was considered in these calculations because the GXG made a conscious decision not to include measurement errors in the test problems. The objective of the comparison was not to assess the robustness of an approach to the magnitude of measurement errors, but, for a given set of data, to determine the residual uncertainty on the transport properties of the domain as evaluated by each approach. Adding a measurement error would only increase this uncertainty and decrease the ability to distinguish between the approaches. The true or synthetic transmissivity and head fields were generated on a very fine grid (20 to 40 m cells) and the participants were given the grid values at the sampled locations. These dense synthetic data sets comprised from one million to three million nodes.
Thus, the small-scale variability of the synthetic log(T) fields can be viewed as measurement error relative to a 'measured value' that would have been obtained by averaging over a larger area, such as an actual pumping test would have produced. The size of the area over which the observation data are distributed is 20 km x 20 km for TPs 1, 2, and 3, and approximately 30 km x 30 km for TP 4. This is of similar scale to the area where data are available at the actual WIPP site.

Boundary conditions for flow in the vicinity of the WIPP site are not well constrained, as the modeled area in the PA calculations is just a small portion of a large regional system. For all test problems, Dirichlet boundary conditions (different for each test problem) were developed for calculating the 'true' (i.e., synthetic) heads by generating a stationary random field and adding it to a trend surface. In the TPs, the boundary conditions were not defined for the participants. Given the 41 head measurements in the domain, the participants were asked to select the boundary conditions they felt appropriate.

In Test Problems 1 and 2, the synthetic T-fields were generated as unconditional random fields (Table 3-2) using the two-dimensional random field generator TUBA (Zimmerman and Wilson, 1990; Mantoglou and Wilson, 1982; Matheron, 1973). In Test Problems 3 and 4, the initial field was also generated using TUBA, but additional discrete modifications were made to each of the unconditional fields to achieve certain desired hydrogeologic characteristics.


Table 3-2. Log10 T (m2/s) field exhaustive data set and sample data statistics (from Zimmerman et al., 1998)

                            Exhaustive Data    Sample Data      No. of observations
TP No  Covariance model       µ      σ2          µ      σ2       Head    log10(T)
1      Exponential          -5.84   1.56       -5.30   1.84       32       41
2      Exponential          -1.26   2.14       -0.52   2.39       32       41
3      Telis                -5.64   1.38       -5.70   1.82       41       41
4      Bessel               -5.32   1.93       -5.32   1.89       41       41

TP 1 was the simplest conceptual model of the real WIPP-site transmissivity distribution. It was developed using a model of the Culebra transmissivities that was based on a geostatistical analysis of the real WIPP-site data. The Log T field (T in m2/s) was modeled as an isotropic process having a mean of -5.5, a variance of 1.5, and an exponential covariance structure with correlation length λ = 3905 m, close to the value of the real WIPP site. A map of the synthetic Log T field with the location of the observation points is shown in Figure 3-12. The mean and variance of the exhaustive Log T data for the inner region are -5.84 and 1.56, respectively (Table 3-2). The sample data consisted of 41 transmissivity and 32 head measurements taken from the exhaustive synthetic data set.

TP 2 was generated specifically to examine how well the linearized techniques could handle high-variance cases. The model of spatial variability is identical to TP 1; only the mean and variance of Log T were changed. In fact, the pattern of spatial variability remains exactly the same except that the field is rotated counterclockwise by 90 degrees. The mean of Log T was increased to -1.26, resulting in faster travel times, and the variance was increased to 2.14 (Table 3-2). The boundary values remained the same, albeit rotated by 90 degrees. The same number and a similar configuration of observation data as for TP 1 were provided to the participants. The sample Log T data have a mean of -0.52 and a variance of 2.39. The Log T field and a different set of observation points are shown in Figure 3-12.

TP 3 was a complex conceptual model developed by the GXG in conjunction with the Geohydrology Department at SNL, which has studied the WIPP site for more than a decade. The intent of TP 3 was to incorporate some of the more complex geohydrologic characteristics of the WIPP site that were not explicitly represented in the first two test problems (Table 3-3).
In TP 3, well-log geology provided to the participants indicated the geologic nature of the measured transmissivity data. Some transmissivity measurements were located in fractured media and some were located in ‘intact’ geologic media.


Figure 3-12. Squares in each TP represent an assumed waste disposal area. The flow lines originating from these squares display the flow direction from the disposal area to the boundaries (from Zimmerman et al., 1998)


Table 3-3. Additional characteristics of each test problem data set (from Zimmerman et al., 1998)

TP No  True field corr. length  Recharge included?  Transient pumping?  Well-log "geology"?
1      2808 m                   No                  No                  No
2      2808 m                   No                  No                  No
3      425 m                    Yes                 Yes                 Yes
4      2063 m                   Yes                 Yes                 No

Several high-transmissivity fracture zones approximately 1 to 3 km apart have been inferred from pumping tests in the northwest and southeast areas of the WIPP site, while in other areas of the site, aquifer tests conducted at several wells have resulted in very low transmissivity values. Similar features were therefore used in the Log T distribution of TP 3 (Figure 3-12). Vertical recharge was applied uniformly over the northwestern portion of the TP 3 model domain; the recharge rate was 6.5 x 10-9 m3/s. The recharge distributed over this region accounts for approximately ten percent of the regional flow through the system. Such recharge could be inferred by the participants from the observed heads in this area, which showed a localized piezometric mound, but no information on recharge was given to the participants.

In addition to the steady-state hydraulic head data and transmissivity values, transient information was provided to the participants in the form of three independent aquifer tests. Pumping from the aquifer was simulated numerically in three different wells (one at a time), and the drawdown data in the surrounding wells and the pumping rates were given to the participants. These tests were loosely modeled after the H-3, H-11, and WIPP-13 large-scale pumping tests conducted at the WIPP site (Beauheim, 1991).

TP 4 is a complex, non-stationary conceptual model of the transmissivity distribution reflecting large-scale connectivity of fracture zones that have been shown to exist in some areas of the Culebra. The final Log T field had a mean of -5.32 and a variance of 1.93 (Table 3-2); the field is shown in Figure 3-12 along with the 41 observation points. The field was generated on a 1025 x 1025 grid with 40 m grid blocks. The Log T mean and variance of the 41 sample observations are -5.32 and 1.89, respectively.
Areal recharge was applied to the southern portion of the field where the transmissivity is generally somewhat higher than average; this also helped to direct the flow through the high-T channels without causing any mounding. The recharge (leakage) was applied non-uniformly, being highly correlated with the transmissivity distribution in this area of the field. Because it occurs at the margin of the field, no observation points are located within this region. The recharge amounted to approximately 6% of the regional flow moving through the system, but no recharge information was given to the participants. As in TP 3, three independent numerical pumping tests were performed in TP 4 to provide transient information for those techniques that could use it. A detailed description of the three pumping tests and conventional analyses of the results were given to the participants.


3.2.2.3 GXG Test Problem Results

The comparison between the different inverse methods' results is based on a total of ten quantitative evaluation measures. Several measures were used to reflect the quality of the prediction of travel times and travel paths of conservative solutes migrating in the aquifer for a distance of 5 km. Some measures were based upon a single set of particles (10 to 30), referred to as fixed wells, released randomly over the model domain and tracked over a 5 km distance. Three measures, referred to as the Random Well Measures, were based upon a comparison of the true and predicted CDFs of groundwater travel time from a 'waste panel' area. The true and predicted CDFs were generated from 100 particles released in a waste-panel area and tracked over a 5 km distance. Other measures include the differences between the calibrated and true transmissivity fields, the differences between the calculated and true head fields, and the differences between the 'true' semi-variogram and the semi-variogram produced from the calibrated transmissivity fields. Appendix D presents a detailed description of the evaluation measures and the statistical analysis of the evaluation measures for all the inverse methods.

Figures 3-13 through 3-16 illustrate the pilot point (PP) inverse method's (i.e., GRASP-INV v1.0) transmissivity fields for each of the four test problems. These figures contain the mean log-transmissivity field, a single calibrated CS transmissivity field, and the groundwater travel time CDF. The travel-time CDFs on each of the figures contain a thick black line corresponding to the 'true' CDF and two thin lines corresponding to the 0.025 and 0.975 quantiles of the predicted CDF (from the inverse solution's ensemble of calibrated transmissivity fields). They also include a dashed line corresponding to the mean CDF. Comparing Figures 3-13 through 3-16 to the corresponding images contained in Figure 3-12 reveals that the PP method does not match the true T fields very well.
The differences between the ensemble of transmissivity fields and the true transmissivity-field values were used as an evaluation measure. Table 3-4 lists the rank of each method's goodness of fit to the true transmissivity field (Column 8, Log-T Error). The PP method's transmissivity fit was the worst of the seven methods for Test Problem 1 and the best of the seven methods for TP 3. However, comparing the true transmissivity field (Figure 3-12) with the transmissivity fields shown in Figures 3-13 and 3-15, there remains a significant discrepancy between the two. Zimmerman et al. (1998) point out that this could be due to the non-uniqueness of the problem, i.e., there are several transmissivity-field solutions that would match the steady-state and transient-state head data.

Several evaluation measures were used to reflect the accuracy, spread, and conservatism of the predicted travel-time CDFs (Figure 3-17). In addition, the error associated with the travel path was also considered in the evaluation measures (Figure 3-18). The rank scores of these measures are listed in columns 1 through 7 of Table 3-4. Zimmerman et al. (1998) provide a detailed description of these measures (Appendix D). In TP 1 and TP 2, the PP method's CDFs captured the median travel time (i.e., the 0.50 quantile) reasonably well, but the spread between the 0.025 and 0.975 quantiles was quite broad. In addition, the minimum and maximum travel times were far from the true CDF minimum and maximum (Figures 3-13 through 3-16). The GWTT CDFs for TP 3 and TP 4 were very different in nature from the results of TPs 1 and 2. In TP 3 and TP 4, the true CDFs fell to the left of the entire predicted CDF (Figures 3-15 and 3-16), with a much smaller spread in the CDFs.
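The aggregation behind Table 3-4 is a plain average of each method's rank scores across measures and test problems, with the lowest average rank indicating the best performance. As a sketch, using the Fixed Release Points GWTT Error ranks (TPs 1-4) for three of the methods:

```python
import numpy as np

# GWTT Error ranks (fixed release points) for TPs 1-4, from Table 3-4.
ranks = {
    "LC": [5.0, 4.5, 6.0, 7.0],
    "PP": [7.0, 3.0, 3.5, 1.0],
    "SS": [1.0, 1.0, 1.0, 2.0],
}

# Average rank per method; the lowest average rank is the best performer.
avg = {method: float(np.mean(r)) for method, r in ranks.items()}
best = min(avg, key=avg.get)
```

For this column the averages reproduce the table entries (e.g., PP averages 3.625, reported as 3.63), and SS emerges as the best of these three methods on that measure.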


Figure 3-13. Test Problem 1 - PP Method Results: mean log10T field (top), randomly selected single realization overlain with bounding pathlines for that realization (middle), and the GWTT CDF curves from all realizations (bottom). Thick line is true CDF, dashed line is mean CDF, other curves bound the inner 95% of all conditional CDFs (from Zimmerman et al., 1998)


Figure 3-14. Test Problem 2 - PP Method Results: mean log10T field (top), randomly selected single realization overlain with bounding pathlines for that realization (middle), and the GWTT CDF curves from all realizations (bottom). Thick line is true CDF, dashed line is mean CDF, other curves bound the inner 95% of all conditional CDFs (from Zimmerman et al., 1998)


Figure 3-15. Test Problem 3 - PP Method Results: mean log10T field (top), randomly selected single realization overlain with bounding pathlines for that realization (middle), and the GWTT CDF curves from all realizations (bottom). Thick line is true CDF, dashed line is mean CDF, other curves bound the inner 95% of all conditional CDFs (from Zimmerman et al., 1998)


Figure 3-16. Test Problem 4 - PP Method Results: mean log10T field (top), randomly selected single realization overlain with bounding pathlines for that realization (middle), and the GWTT CDF curves from all realizations (bottom). Thick line is true CDF, dashed line is mean CDF, other curves bound the inner 95% of all conditional CDFs (from Zimmerman et al., 1998)


Table 3-4. Rank-transformed evaluation measure scores. (The "raw" Boot and Sprd scores were combined into the "NSC measure" as described in Appendix C. The lower the rank, the better the performance; from Zimmerman et al., 1998)

Columns 1-5 (Fixed Release Points): GWTT Error, GWTT Cnsv, GWTT NSC, PATH Error, PATH NSC. Columns 6-7 (Random Well): GWTT Error, GWTT NSC. Columns 8-10 (Field Variables): Log(T) Error, Head Error, Variogram.

Method  TP     1     2     3     4     5     6     7     8     9    10   Ave. Score
FF      1     6.0   2.5   6.5   4.0   6.0   5.0   6.5   4.0   2.0   3.0    4.55
FF      2     4.5   3.0   7.0   3.5   5.0   5.0   7.0   3.0   3.0   7.0    4.80
FF      3     7.0   5.0   6.0   5.0   6.0   7.0   2.0   4.0   2.0   4.0    4.80
FF      4     6.0   6.0   5.0   6.0   6.5   7.0   6.0   5.0   4.0   7.0    5.85
FF      Avg   5.88  4.13  6.13  4.63  5.88  6.00  5.38  3.75  2.75  5.25   5.00
FS      1     4.0   1.0   3.0   6.0   2.0   2.0   2.5   3.0   5.0   6.0    3.45
FS      2     6.0   6.0   4.0   7.0   3.0   4.0   3.5   6.0   6.0   6.0    5.15
FS      3     3.5   7.0   2.0   6.0   4.0   5.0   7.0   6.0   5.0   7.0    5.25
FS      4     3.0   5.0   2.5   5.0   4.0   6.0   7.0   6.0   6.0   5.0    4.95
FS      Avg   4.13  4.75  2.88  6.00  3.25  4.25  5.00  5.25  5.50  6.00   4.70
LC      1     5.0   7.0   5.0   5.0   5.0   1.0   4.5   1.0   1.0   1.0    3.55
LC      2     4.5   2.0   6.0   2.0   1.0   2.0   6.0   1.0   2.0   5.0    3.15
LC      3     6.0   3.0   4.0   7.0   7.0   3.0   6.0   5.0   6.0   5.0    5.20
LC      4     7.0   4.0   7.0   7.0   6.5   5.0   4.0   4.0   3.0   2.0    4.95
LC      Avg   5.63  4.00  5.50  5.25  4.88  2.75  5.13  2.75  3.00  3.25   4.21
LS      1     2.5   4.0   6.5   2.0   4.0   7.0   1.0   N/A   N/A   7.0    4.25
LS      2     7.0   1.0   5.0   3.5   7.0   6.0   2.0   N/A   N/A   2.0    4.19
LS      3     2.0   4.0   7.0   1.0   4.0   6.0   1.0   N/A   N/A   2.0    3.38
LS      4     4.0   1.0   2.5   1.0   2.0   3.0   1.0   N/A   N/A   4.0    2.31
LS      Avg   3.88  2.50  5.25  1.88  4.25  5.50  1.25  N/A   N/A   3.75   3.53
ML      1     2.5   2.5   2.0   3.0   1.0   4.0   4.5   5.0   6.0   5.0    3.55
ML      2     2.0   5.0   1.0   6.0   4.0   3.0   5.0   5.0   5.0   3.0    3.90
ML      3     5.0   1.0   3.0   3.0   2.0   2.0   3.5   3.0   3.0   6.0    3.15
ML      4     5.0   7.0   6.0   2.0   5.0   2.0   3.0   2.0   2.0   1.0    3.50
ML      Avg   3.63  3.88  3.00  3.50  3.00  2.75  4.00  3.75  4.00  3.75   3.53
PP      1     7.0   6.0   4.0   7.0   3.0   6.0   2.5   6.0   4.0   4.0    4.95
PP      2     3.0   4.0   2.0   5.0   2.0   7.0   3.5   4.0   4.0   1.0    3.55
PP      3     3.5   6.0   5.0   2.0   1.0   1.0   3.5   1.0   1.0   3.0    2.70
PP      4     1.0   2.0   1.0   4.0   3.0   4.0   5.0   3.0   5.0   6.0    3.40
PP      Avg   3.63  4.50  3.00  4.50  2.25  4.50  3.63  3.50  3.50  3.50   3.65
SS      1     1.0   5.0   1.0   1.0   7.0   3.0   6.5   2.0   3.0   2.0    3.15
SS      2     1.0   7.0   3.0   1.0   6.0   1.0   1.0   2.0   1.0   4.0    2.70
SS      3     1.0   2.0   1.0   4.0   4.0   4.0   5.0   2.0   4.0   1.0    2.80
SS      4     2.0   3.0   4.0   3.0   1.0   1.0   2.0   1.0   1.0   3.0    2.10
SS      Avg   1.25  4.25  2.50  2.25  4.50  2.25  3.63  1.75  2.25  2.50   2.70

Figure 3-17. GWTT CDF evaluation measures used in the Random Well case. Thin solid lines are the 0.025th and 0.975th percentile CDFs, the thick dashed line is the median CDF, and the thick solid line is the "true" CDF (from Zimmerman et al., 1998)


Figure 3-18. Fixed release points and pathlines on the true log(T)-fields. Pathlines extend a radial distance of 5 km (from Zimmerman et al., 1998)


The evaluation measure that appears to give the best indication of an inverse method's robustness is the goodness of fit between the semi-variogram produced by the ensemble of transmissivity fields and the true semi-variogram. Figures 3-19 and 3-20 illustrate the comparison between the semi-variograms produced by the various methods for TPs 3 and 4. The rank measures are tabulated in Table 3-4, Column 10. Generally speaking, the methods that performed the best overall also performed well on the semi-variogram evaluation measure. Figure 3-21 supports this conclusion: the scatterplot of the 'Average Rank of Evaluation Measures' versus the 'Average Rank of the Semivariogram Measure' shows that the average total score (shown as stars) is closely correlated with the semivariogram's goodness of fit. The reason for this is straightforward. A poor representation of the semi-variogram leads to poor estimates of transmissivities. This, in turn, leads to poor estimates of travel times and travel paths, which affects most of the evaluation measures and thereby increases the average total score. Figures 3-22 and 3-23 illustrate the difference in the PP method's results on TP 1 when the 'true' variogram is used in place of the linear variogram inferred from the data and used in the PP inverse. A significant portion of the true CDF falls within the 95% confidence interval of the predicted travel times when the true variogram is used.
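The semi-variogram comparisons above rest on computing an experimental semi-variogram from the simulated log10(T) fields. A minimal sketch of the classical (Matheron) estimator is given below for illustration; it is not the routine used in the GXG exercise, and the fixed lag/tolerance scheme is a simplification:

```python
import numpy as np

def experimental_semivariogram(coords, values, lags, tol):
    """Matheron estimator: gamma(h) = sum (z_i - z_j)^2 / (2 N(h))
    over data pairs whose separation distance falls within tol of each lag."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    gamma = np.zeros(len(lags))
    counts = np.zeros(len(lags), dtype=int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = np.linalg.norm(coords[i] - coords[j])
            for k, lag in enumerate(lags):
                if abs(h - lag) <= tol:
                    gamma[k] += (values[i] - values[j]) ** 2
                    counts[k] += 1
    ok = counts > 0
    gamma[ok] /= 2.0 * counts[ok]
    return gamma, counts
```

For an ensemble, the estimator would be applied to each realization and the resulting curves compared against the true model, as in Figures 3-19 and 3-20.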


Figure 3-19. Average semivariograms for Test Problem No. 3 (from Zimmerman et al., 1998)

Figure 3-20. Average semivariograms for Test Problem No. 4 (from Zimmerman et al., 1998)


Figure 3-21. The sensitivity of the method’s performance to the estimation of the covariance structure of the log(T) field; shows how errors are correlated with the quality of the semivariogram estimates (from Zimmerman et al., 1998)


Figure 3-22. Waste panel CDF for method PP in Test Problem 1 using linear variogram (from Zimmerman et al., 1998)

Figure 3-23. Waste panel CDF for method PP in Test Problem 1 using true variogram (from Zimmerman et al., 1998)


3.2.2.4 GXG Test Problem Conclusions

The overall comparison of the seven methods is graphically illustrated in Figure 3-24. The method that finished with the lowest overall ranking (i.e., the best score) is the Sequential Self-Calibration (SS) method developed by Jaime Gomez-Hernandez (Journal of Hydrology Reference Here). Figure 3-25 illustrates the SS results in TP 4. The differences between the true and predicted transmissivity fields and travel-time CDFs are small. The SS method parameterizes the inverse problem in a manner similar to the PP method (Appendix D). However, the SS method employs sequential simulation routines to develop the conditionally simulated transmissivity field, as opposed to TUBA and residual sewing using AKRIP as in the PP method. The PP method performed poorly in TP 1 but improved markedly in TPs 2, 3 and 4; the poor TP 1 performance was due to a poor selection of the variogram and model grid. The other two methods which performed well were the Maximum Likelihood (ML) technique developed by Jesus Carrera and the Linear Semi-Analytical (LS) method developed by Yoram Rubin.

Given the objective of the exercise, namely, to determine which of the inverse techniques was better suited for making probabilistic forecasts of the potential transport of solutes in a highly variable aquifer with highly uncertain hydrogeologic properties, Zimmerman et al. (1998) concluded that four of the methods, LS (semi-linear), ML (non-linear), PP (non-linear), and SS (non-linear), were approximately equivalent for the specific problems considered. In addition, it was concluded that:

1) The magnitude of the variance of the transmissivity fields, which went as high as ten times the generally accepted range for linearized approaches, was not a problem for the linearized methods when applied to stationary fields (i.e., their inverse solutions and travel-time predictions were as accurate as those of the nonlinear methods).

2) Non-stationarity of the "true" transmissivity field, or the presence of "anomalies" such as high-permeability fracture zones, was, however, more of a problem for the linearized methods.

3) The proper selection of the semivariogram of the Log(T) field (or the ability of the method to optimize this variogram iteratively) had a significant impact on the accuracy and precision of the travel-time predictions.

4) Use of additional transient information from pumping tests did not result in major changes in the outcome.

5) While the methods differ in their underlying theory, and the codes developed to implement the theories were limited to varying degrees, the most important factor for achieving a successful solution was the time and experience devoted by the user of the method.


Figure 3-24. Comparison of method performance across test problems; the lower the average rank, the better the performance. Methods are indicated by the two-character abbreviation (from Zimmerman et al., 1998)


Figure 3-25. Test Problem 4 - SS method Results: mean log10(T) field (top), randomly selected single realization overlain with bounding pathlines for that realization (middle), and the GWTT CDF curves from all realizations (bottom). Thick line is true CDF, dashed line is mean CDF, other curves bound the inner 95% of all conditional CDFs (from Zimmerman et al., 1998)


3.3 Improvements to GRASP-INV v1.0

Based upon the GXG test problem conclusions and the performance of the PP method on the GXG test problems, GRASP-INV v1.0 underwent several modifications designed to improve the robustness of the code's predictive capability. Parametric (Gaussian) and non-parametric (indicator) sequential conditional simulation routines replaced the turning-bands and residual-sewing routines embedded in version 1.0. The new geostatistical simulation routine which replaced TUBA and AKRIP is called CONSIM II. CONSIM II is a subroutine designed for the geostatistical simulation of heterogeneous geologic media and related spatial random variables. It creates one-, two- or three-dimensional simulated fields of spatially correlated random variables which may be conditioned to measured values. CONSIM II also produces estimated fields based on the measured values via kriging. The first version of the program, written by Jaime Gomez-Hernandez, borrowed heavily from GSLIB, the well-known library of geostatistical programs published by Deutsch and Journel (1992).

CONSIM II uses a two-step approach to simulate geologic media (Figure 3-26). The first step is to simulate lithology or structure within a formation as discrete categories using Indicator Categorical Simulation (iCs). The second step simulates a continuous variable for the property of interest of each category, e.g., permeability for each rock type. The continuous variable is simulated parametrically by Sequential Gaussian Simulation (sGs). If observed values of the variable of interest are available, the simulations will reproduce the observations at their locations while providing alternative, equally likely realizations for the unmeasured regions of the field. CONSIM II may be used to simulate a variety of geologic media; examples include:

• The permeabilities of both sand and shale layers within a single formation.
• The transmissivities of both fractured and massive units within a limestone aquifer.
• Facies changes and the associated material properties for an alluvial or aeolian deposit.
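The two-step approach can be sketched schematically. In the sketch below the category proportions and per-category log10(T) statistics are hypothetical, and the spatially correlated iCs and sGs draws of CONSIM II are replaced by independent draws purely for brevity:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-category log10(T) statistics; stand-ins for the
# variogram-driven iCs/sGs draws used by CONSIM II.
categories = {
    "fractured":     {"p": 0.3, "mean": -4.0, "std": 0.8},
    "non_fractured": {"p": 0.7, "mean": -6.0, "std": 0.5},
}

def simulate_field(nx, ny):
    """Step 1: draw a discrete category for each cell; Step 2: 'fill in'
    a continuous log10(T) value from that category's distribution.
    Spatial correlation and data conditioning are omitted here."""
    names = list(categories)
    probs = [categories[c]["p"] for c in names]
    cat_idx = rng.choice(len(names), size=(ny, nx), p=probs)
    logt = np.empty((ny, nx))
    for k, c in enumerate(names):
        mask = cat_idx == k
        logt[mask] = rng.normal(categories[c]["mean"],
                                categories[c]["std"], mask.sum())
    return cat_idx, logt
```

In CONSIM II the analogous draws honor indicator and normal-score variograms and condition on measured data; the sketch only shows the category-then-continuous layering of the simulation.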

The Gauss-quadrature routine used in GRASP-INV v1.0 to obtain average grid-block transmissivities from transmissivity point values was also replaced with an upscaling routine. Simulation is now performed on a grid much finer than the flow model's finite-difference grid. Once a field is generated, the flow model grid is superimposed upon the geostatistical simulation grid, and average transmissivity values are calculated for each flow-model grid block by analyzing the simulation grid point values falling within each grid block. The grid-block transmissivity values are then sent to the SWIFT II subroutine to determine groundwater pressures and velocities. Adjoint sensitivities of the objective function are then calculated to determine the optimal pilot point location and transmissivity value. Once a pilot point's x, y, z location is selected and its transmissivity is optimized, the transmissivity field is modified by determining the influence of the pilot point upon the surrounding grid-block transmissivity values. The modified transmissivity field is then sent back to SWIFT II, and the process repeats until the objective function is reduced to a specified minimum or until a selected maximum number of pilot points have been added.
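The block-averaging step might look like the following, assuming the fine simulation grid nests evenly within the flow-model grid. The arithmetic mean of log10(T) (i.e., the geometric mean of T) is used as the upscaling rule here; this is one common choice and not necessarily the exact rule implemented in GRASP-INV v2.0:

```python
import numpy as np

def upscale_log_t(fine_logt, block_rows, block_cols):
    """Average the fine-grid log10(T) points falling within each
    flow-model grid block. Averaging in log space corresponds to
    the geometric mean of T."""
    ny, nx = fine_logt.shape
    assert ny % block_rows == 0 and nx % block_cols == 0
    blocks = fine_logt.reshape(ny // block_rows, block_rows,
                               nx // block_cols, block_cols)
    return blocks.mean(axis=(1, 3))
```

For example, a 100 x 100 simulation grid with 4 x 4 blocks yields a 25 x 25 field of grid-block values ready to pass to the flow simulator.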


Figure 3-26. CONSIM II Subroutine of GRASP-INV v2.0

Section 4.0 contains two papers that describe the theory and application of GRASP-INV v2.0. The first paper discusses the application of GRASP-INV v2.0 to the final set of WIPP calculations and compares the results to the 1992 results obtained with GRASP-INV v1.0. The second paper develops the theory behind the extension of the Pilot Point Technique to three dimensions. In addition, the code is used to calibrate a three-dimensional pumping test performed in the Culebra dolomite in 1996.


4. THE GRASP-INV CODE V2.0, THEORY AND APPLICATIONS

As mentioned in the previous section, the GRASP-INV v1.0 code was extensively modified after the GXG test problem results. The general process used in GRASP-INV v2.0 is essentially the same as in v1.0. The major difference in functionality stems from two significant modifications. First, the new geostatistical simulation routine, CONSIM II, replaced the geostatistical routines embedded in GRASP-INV v1.0, i.e., TUBA and AKRIP. CONSIM II is a subroutine designed for the geostatistical simulation of heterogeneous geologic media and related spatial random variables. It creates one-, two- or three-dimensional simulated fields of spatially correlated random variables which may be conditioned to measured values. CONSIM II also produces estimated fields based on the measured values via kriging. CONSIM II uses a two-step approach to simulating geologic media. The first step is to simulate lithology or structure within a formation as discrete categories using Indicator Categorical Simulation (iCs). The second step simulates a continuous variable for the property of interest of each category, e.g., permeability for each rock type. The continuous variable is simulated parametrically by Sequential Gaussian Simulation (sGs). If observed values of the variable of interest are available, the simulations will reproduce the observations at their locations while providing alternative, equally likely realizations for the unmeasured regions of the field.

The second major difference in functionality relates to the modifications to the code that were necessary to extend the pilot point method to three dimensions. For example, modifications were needed to couple the horizontal and vertical conductivity sensitivity derivatives of an objective function to the location of a pilot point. Other changes included extending a pilot point's influence to three dimensions.
The two papers contained in this section describe the theory and application of GRASP-INV v2.0 to the two-dimensional problem of the Culebra regional flow model used to generate the final set of transmissivity fields for the 1996 WIPP Site Compliance Document calculations. A multi-categorical sequential simulation approach was employed to separate the higher-transmissivity, fractured areas of the Culebra from the lower-transmissivity, non-fractured areas. GRASP-INV v2.0 can optimize the properties within these two categories, i.e., fractured and non-fractured, 'independently'. That is, pilot


points added to a grid block of one category do not affect the transmissivities in another category. This enabled the optimization of the high-transmissivity channel properties at the WIPP site without affecting the properties of the lower-transmissivity area in the central portion of the WIPP site. Details are contained in the first paper.

The second paper describes the theory and application of GRASP-INV v2.0 to a three-dimensional pump test in the Culebra dolomite. The pump test, conducted as part of the 1996 H-19 tracer test, employed a sinusoidal pumping rate. The Culebra was isolated into two vertical sections, and pumping was conducted in the upper and lower sections with transducers monitoring the pressure responses in the upper and lower sections of six nearby (< 40 m) monitoring wells. The packers were set vertically in an attempt to isolate the lower, highly fractured portion of the Culebra dolomite from the upper, non-fractured portion. The responses to pumping in the upper and lower sections varied significantly among the monitoring wells. Generally, the responses measured by transducers in the lower section were much greater than those measured in the upper Culebra, owing to the fractured nature of the lower portion of the Culebra. The second paper describes the test in detail and presents the GRASP-INV v2.0 inverse solution which matches the observed responses.


Paper 1:

A New Inverse Methodology for Aquifers with Fractured and Unfractured Domains

by

Marsh Lavenue1, Banda S. RamaRao2, Dennis Longsine2, Greg Ruskauff1 and Ghislain de Marsily3

1) INTERA Incorporated Boulder, CO 2) INTERA Incorporated Austin, TX 3) University of Paris VI Paris, FRANCE

February 24, 1999

Submitted to Water Resources Research for Publication


Abstract

A capability to generate conditional simulations of multi-category transmissivity fields, such as those found in fractured and unfractured domains, and to automatically calibrate these fields has been developed. A two-step geostatistical procedure is used to generate the conditional simulations. Categorical indicator simulation is first performed to obtain the spatial distribution of indicators representing fractured or unfractured media. The spatial variability within each of these categories is subsequently 'filled in' using the associated semi-variogram models and the Sequential Gaussian Simulation technique. An indirect inverse method, referred to as the Pilot Point Technique, has been coupled to the geostatistical routine to solve the inverse problem for steady-state and transient-state head data. The new methodology is then applied to automatically calibrate a model of a regional aquifer in the vicinity of the Waste Isolation Pilot Plant in southeastern New Mexico. One hundred transmissivity fields were conditionally simulated using the measured transmissivities and the data describing the occurrence of fracturing. These fields were subsequently calibrated to an extensive set of steady-state and transient-state heads. The final results are compared to those of an earlier modeling study.


Paper 1 – A New Inverse Methodology for Aquifers with Fractured and Unfractured Domains

LIST OF FIGURES

1   Flow chart of GRASP-INV
2   Spatial influence of a pilot point upon model grid-block transmissivities
3   WIPP site location
4   Example Indicator Probability Calculation
5   Geologic column representative of WIPP area
6   Equivalent freshwater elevations for the Culebra at Well H-1
7a  Culebra freshwater heads at the WIPP-area boreholes
7b  Culebra freshwater heads' uncertainties
8   Culebra log10 transmissivities at the WIPP-area boreholes
9a  1996 Culebra model boundaries
9b  1996 Culebra model finite-difference grid
10  Histogram of the Culebra transmissivity data
11  Quantile normal plot of log transmissivity
12  Indicator variogram for the Culebra transmissivity
13  Variograms for the normal scores of the Culebra transmissivities that are (a) higher and (b) lower than the median transmissivity
14  Construction of an initial conditional simulation
15  Average transient calibrated transmissivity field
16  Average transient calibrated transmissivity field near the WIPP
17  Transient calibrated transmissivity field no. 40 near the WIPP
18  Transient calibrated transmissivity field no. 69 near the WIPP
19  Transient calibrated transmissivity field no. 77 near the WIPP
20  (a) Scatterplot of calculated steady-state heads versus measured heads and (b) histogram of differences between calculated and measured heads
21  Calculated and measured hydrographs from 1981 to 1990 at WIPP wells H-1, H-3, and H-6
22  Calculated and measured hydrographs from 1981 to 1990 at WIPP wells H-11, H-15, and H-17
23  Calculated and measured hydrographs from 1981 to 1990 at WIPP wells WIPP-12, WIPP-13, WIPP-18, and WIPP-19
24  Calculated and measured hydrographs from 1981 to 1990 at WIPP wells WIPP-30, DOE-1, and DOE-2
25  Location of the H-19 Pump Test
26  Calculated and measured hydrographs from 1995 to April 1996 (during H-19 pumping) at WIPP wells H-1, H-3, and H-15
27  Calculated and measured hydrographs from 1995 to April 1996 (during H-19 pumping) at WIPP wells WQSP-4 and WQSP-5
28  Difference between 1992 and 1996 ensemble mean transmissivity fields
29  Groundwater travel time cumulative distribution function from 1992 study
30  Groundwater travel time cumulative distribution function from 1996 study

LIST OF TABLES

1   Culebra Undisturbed Head Values and Uncertainties
2   Pilot Points Added for each Transient Event



INTRODUCTION

Scientists interested in predicting groundwater flow over space and time typically construct a numerical model of their particular site of interest. Prior to any predictive calculations, the numerical model must be able to recreate the historical record of measured hydraulic head data. Reconciling the differences between the historical hydraulic head data and the calculated hydraulic head data is the process of model calibration. Input parameters such as transmissivity, storativity and boundary conditions are adjusted until the differences between the measured and calculated heads fall below the uncertainty associated with the measured head data. Since 1960, numerous methods have been developed to facilitate the model calibration process, also known as the inverse problem. There are trial-and-error procedures, in which the hydrogeologist adjusts the model parameters by intuition in the hope of reducing the head differences mentioned above. There are also automated inverse techniques that remove most of the guesswork involved in model calibration. In general, all automated techniques may be grouped into one of two categories: Direct and Indirect methods.

Direct Methods of Solving the Inverse Problem

Research into the Direct Method has ranged from 1960 to approximately the mid-1980s. Yeh (1986) presented a review of some of the significant contributions hydrogeological researchers had made to the Direct Method of solving the inverse problem. The Direct Method stems from a viewpoint in which the head field, as well as its first and second derivatives, is considered known. This reduces the groundwater flow equation from a second-order partial differential equation (PDE) to a first-order PDE in which conductivity is the only unknown. Any error in the solution of the inverse problem, referred to as equation error, results from errors in the head field, the representation of its derivatives, or from incorrect model assumptions. Nelson (1960) formalized the Direct Method of solving the inverse problem as a Cauchy problem. He presented an analytic solution to the two-dimensional flow inverse problem whereby he added additional information to properly pose his solution. In 1972, Scarascia and Ponzini proposed solving the direct inverse problem formulated by Nelson in a


slightly different way. They suggested that if hydraulic head data from at least two independent states of groundwater flow were available (i.e., two steady-state flow fields, or one steady-state flow field and one transient flow field), then only one measured transmissivity value was needed to solve the inverse problem (Scarascia and Ponzini, 1972). Other researchers also considered this possibility, e.g., Sagar et al. (1975), Ponzini and Lozej (1982), Ponzini et al. (1989), and Ginn and Cushman (1990). In 1995, Giudici and others presented an approximation of the Scarascia and Ponzini (1972) approach. They developed an expression for transmissivity at finite-difference nodes based upon the finite-difference equation for groundwater flow between adjacent grid blocks, the two sets of hydraulic heads from the two flow fields, and the single measured transmissivity value.

Indirect Methods of Solving the Inverse Problem

The Indirect Method of solving the inverse problem grew out of a need to address the problems associated with the Direct Method, namely, that errors in the head measurements led to unstable transmissivity estimates due to the amplification of the head errors in the first- and second-order head derivatives. The petroleum industry had already made some progress in employing indirect techniques in reservoir modeling problems (Jacquard and Jain, 1965; Jahns, 1966; Thomas et al., 1972; and Chavent, 1975). The Indirect Method generally proceeds as follows. An estimate of the initial parameter field (e.g., transmissivity, porosity, storativity, and boundary conditions) is made. The hydraulic heads are then calculated over the domain of interest through Darcy's Law and the mass-balance equation. The resulting calculated head field is compared to the measured heads at the measurement locations. The differences between the calculated and measured heads, usually expressed as a weighted least-squares sum, are then reduced through changes to the initial parameters until the differences are comparable to the measurement errors of the head field. Thus, neither the first nor the second derivatives of the head field are needed for the inverse solution. Furthermore, the errors in the measured heads provide a tolerance within which the calculated heads of the inverse model are considered acceptable.
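The loop just described can be sketched generically. The forward model below is a toy stand-in for a flow simulator, and the finite-difference gradient step is a placeholder for the adjoint sensitivities used by codes such as GRASP-INV:

```python
import numpy as np

def indirect_inverse(forward_model, h_obs, weights, theta0,
                     step=0.1, tol=1e-6, max_iter=200):
    """Generic indirect-method loop: run the forward model, form the
    weighted least-squares objective on the head residuals, nudge the
    parameters downhill, and repeat until the misfit is small."""
    theta = np.asarray(theta0, dtype=float).copy()
    obj = np.inf
    for _ in range(max_iter):
        resid = forward_model(theta) - h_obs
        obj = float(np.sum(weights * resid ** 2))
        if obj < tol:
            break
        # Finite-difference gradient of the objective, a stand-in for
        # adjoint sensitivities.
        grad = np.zeros_like(theta)
        eps = 1e-6
        for i in range(theta.size):
            tp = theta.copy()
            tp[i] += eps
            rp = forward_model(tp) - h_obs
            grad[i] = (np.sum(weights * rp ** 2) - obj) / eps
        theta -= step * grad
    return theta, obj
```

With a linear toy model h(theta) = 2*theta and a single observation, the loop converges to the parameter that reproduces the observed head.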



Research by hydrogeologists into the Indirect Method has ranged from the mid-1970s to today. Yeh (1986) presented a review of some of the significant contributions hydrogeological researchers had made to the Indirect Method of solving the inverse problem. There are numerous ways in which the Indirect Method has been posed. However, the main differences among the techniques stem from: 1) the assumptions imposed upon the parameter field, 2) the relationship between the parameter field and the hydraulic head field, and 3) the technique used to modify the initial parameter field to reach the inverse solution. Indirect methods may be divided into two categories, linear and nonlinear methods.

In the linear methods, a model for the statistical spatial variability of the transmissivity field is proposed. A differential equation of groundwater flow is used to relate the spatial variability of the heads to the spatial variability of transmissivity. In this equation, the head and transmissivity fields are decomposed into a mean field and a distribution of perturbations with a mean of zero. Once a linear relationship is established between the head and transmissivity perturbations, the linear method uses the observed head and transmissivity data to estimate the unknown geostatistical parameters of the transmissivity field through a Gauss-Newton optimization routine. The resulting distribution of transmissivities across the model domain is then obtained through co-kriging. The main disadvantage of the linear methods is the requirement that the log-transmissivity field perturbations be small. If the variations in the transmissivity field are small (i.e., var f < 1.0), then the second-order terms (products of the perturbation terms) can be neglected. This restriction facilitates the linearization of the head and log-transmissivity field perturbations.

Recently, several researchers have presented techniques to loosen this restriction upon the log-transmissivity field variance (Kitanidis, 1995; Yeh et al., 1996). In 1995, Kitanidis presented an approach that circumvents the limitation on the log-transmissivity field variance, which he refers to as the quasi-linear method. The quasi-linear method is an extension of the linearized technique described in Kitanidis and Vomvoris (1983) and Hoeksema and Kitanidis (1985). However, the linearization of the relationship between head and transmissivity is performed in a local domain around each measurement location, as opposed to around the prior mean of the head field. Yeh et al. (1996) presented their version of a quasi-linear method, referred to as the co-conditional method, an extension of classical co-kriging.


The nonlinear methods do not linearize the flow equation to obtain the inverse solution. A nonlinear inverse method iteratively solves the flow equation, checks the agreement between the measured and calculated heads, modifies the parameter field to improve the agreement, and then repeats the process. The iterations end once the differences between the measured and calculated heads reach a prescribed minimum. The measure of the differences is typically expressed either through a weighted least-squares function or a likelihood function. Derivatives of the objective function with respect to the transmissivity (and/or other system parameters) guide changes in the transmissivity field. The changes to the transmissivity field may involve a geostatistical model describing the spatial variability of the field or may simply be conducted within multiple single-valued transmissivity zones.

Uncertainty in the Inverse Solution

Without exception, the inverse techniques developed until recently assessed the uncertainty of their solutions in basically the same way. Uncertainty in the inverse solution (i.e., the transmissivity field) was investigated through Monte Carlo simulations in which an ensemble of transmissivity fields is generated and used to determine distributions of a selected parameter. In some cases, the variability in the log-transmissivity field was the focus of the uncertainty analysis but, more often, the uncertainty about a secondary variable that depends on transmissivity was of interest, e.g., groundwater head, groundwater travel times, solute concentration at a selected boundary, etc. The ensemble of transmissivity fields is generated from the final optimal transmissivity field Y and its covariance Q (obtained from the inverse solution). Cholesky decomposition of the covariance matrix Q produces a lower triangular matrix C where CC^T = Q. Conditional simulations of the log-transmissivity field are then produced by:

    Y_cs,i = Y + C A_i                                                  (1)

where i is the number of the conditional simulation and A_i is a vector of standard normal random numbers. Y conditions the simulations (or ensemble) to the log-transmissivity inverse solution, and the addition of the C A_i term produces small variations about the mean grid-block values Y. Thus, the result is a set of log-transmissivity fields which all contain the general pattern of the log-transmissivity field determined from the inverse solution, but each containing a unique random variation consistent with the covariance. The limitation in
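Equation (1) translates directly into code. The sketch below uses an illustrative exponential covariance on a small one-dimensional grid; the mean field and covariance are placeholders for the Y and Q of an actual inverse solution:

```python
import numpy as np

def conditional_simulations(y_mean, Q, n_sims, seed=0):
    """Draw Y_cs,i = Y + C A_i, where C is the lower-triangular
    Cholesky factor of the covariance Q (C C^T = Q) and A_i is a
    vector of standard normal deviates."""
    rng = np.random.default_rng(seed)
    C = np.linalg.cholesky(Q)
    n = len(y_mean)
    return [y_mean + C @ rng.standard_normal(n) for _ in range(n_sims)]

# Illustrative exponential covariance on a 25-node 1-D grid.
x = np.linspace(0.0, 10.0, 25)
Q = 0.5 * np.exp(-np.abs(x[:, None] - x[None, :]) / 3.0)
fields = conditional_simulations(np.full(25, -5.0), Q, n_sims=10)
```

Each realization shares the mean field but carries its own spatially correlated perturbation; ensemble statistics (heads, travel times, etc.) are then computed across the realizations.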


this approach is that the full range of uncertainty within the transmissivity field is probably not investigated. The conditional simulations all originate from the ‘optimal’ estimate of transmissivity (the transmissivity field produced from the inverse solution) with variations around this optimal dictated by the post-calibration covariance matrix (or conditional confidence interval). Thus, the ensemble of transmissivity fields is generated through a local uncertainty analysis. In 1995, RamaRao et al. presented an enhanced version of the Pilot Point Technique that approached the question of uncertainty in a more ‘global’ fashion. RamaRao et al. (1995) first produce an ensemble of transmissivity fields conditioned to the measured transmissivity data and its covariance (i.e., conditional simulations) and then calibrates each field to the set of steady-state and/or transient-state hydraulic head data. The resulting numerical model presented in RamaRao et al. (1995), referred to as GRASP-INV (GroundwateR Adjoint Sensitivity Program - INVerse), employs the Pilot-Point Technique to solve the indirect, nonlinear groundwater flow inverse problem for a set of conditionally simulated transmissivity fields. The general process used in the GRASP-INV code to generate a single calibrated transmissivity field is illustrated by the flow chart in Figure 1. First, an objective function is defined, usually a weighted-least-squares-error between computed and measured steady-state and/or transient pressures. Calibration implies reducing the objective function (to a prescribed minimum value), and is achieved by iteratively adjusting the transmissivity field. In GRASP-INV, adjustments to the transmissivity field are made using pilot points. A pilot point is a synthetic transmissivity data point (having magnitude and location) that is added to an existing measured transmissivity set during the course of calibration (Figure 2). 
With the addition of a pilot point, the transmissivity values in the neighborhood of the pilot point get modified, with the dominant modifications occurring closer to the pilot point location. Conceptually, GRASP-INV uses pilot points to effect realistic modifications of transmissivity in a large region of the model.

In a companion paper to RamaRao et al. (1995), Lavenue et al. (1995) employed GRASP-INV to solve the groundwater inverse problem and determine the uncertainty in the transmissivity field of a regional aquifer near a site considered for radioactive waste isolation. The Waste Isolation Pilot Plant (WIPP) near Carlsbad, New Mexico (Figure 3), is a research and development project of the United States Department of Energy (DOE). The WIPP is designed to be the first mined geologic repository to demonstrate the safe disposal of transuranic (TRU) radioactive wastes generated by DOE defense programs since 1970. The GRASP-INV code generated and subsequently calibrated conditionally simulated (CS) transmissivity fields of the Culebra dolomite, a regional aquifer in the WIPP-site vicinity considered the most likely offsite transport pathway. Because each CS field has similar broad features but distinctly different small-scale variations, the GRASP-INV code produced numerous, equally probable, transmissivity fields calibrated to the observed head data. The special features present within each calibrated field collectively span the range of uncertainty of the transmissivity field.

The WIPP PA Department incorporated uncertainty in the transmissivity field into the 1992 Performance Assessment (PA) Monte Carlo analysis by partially ordering a set of equally probable transmissivity fields by travel time to the accessible environment, and then drawing one field for each system calculation by sampling a uniformly distributed index variable. Because a Latin Hypercube Sampling technique was used and the number of fields in the set was equal to the number of imprecisely known parameters, each field was drawn once in the 1992 PA calculations. Although not required for a compliance assessment with 40 CFR 191, Subpart B, travel time is a good intermediate performance measure and provides some physical interpretation of the index variable for sensitivity and uncertainty analysis.

Present Study’s Contribution

One of the conclusions presented in Lavenue et al. (1995) was that the location of the boundary between the high transmissivities (i.e., fractured) and low transmissivities (i.e., unfractured) in the central WIPP-site area had an important effect upon the groundwater travel time to the WIPP-site boundary. Over the last two years, the GRASP-INV code was modified to better define the boundary between the central WIPP-site area's fractured and unfractured regions through the solution of the inverse problem. GRASP-INV V2 contains parametric (Gaussian) and non-parametric (indicator) sequential simulation capabilities for the generation of the ensemble of conditionally simulated transmissivity fields, which are then automatically calibrated by the Pilot Point Technique.

In this study, the spatial variability in the Culebra transmissivities as well as the boundary between the fractured and unfractured areas were directly simulated using a two-step approach. The first step simulated the fractured and unfractured area boundaries through a discrete categorical simulation using Indicator Categorical Simulation. This required the selection of a cutoff transmissivity value: all the measured transmissivities below this value were categorized as an unfractured indicator and all those above were categorized as a fractured indicator. The second step then simulated the spatial variability of transmissivity within each category parametrically by Sequential Gaussian Simulation. The resulting CS transmissivity fields were then calibrated using pilot points.

A unique feature employed in the optimization process stems from the independent optimization of the fractured and unfractured parts of the aquifer. That is, adjustments in the magnitude of transmissivity in the fractured portion of the aquifer do not affect the transmissivities of the unfractured areas; such a capability to treat multiple categories in aquifers is not present in earlier Pilot Point inverse methods and is developed now in GRASP-INV Version 2 (V2). The theory of the GRASP-INV V2 code is presented in this study. The results of the recent application of the GRASP-INV V2 code to the 1996 WIPP PA calculations are also presented and compared to the earlier results discussed in Lavenue et al. (1995).

THEORY

Constructing the Conditional Simulations of Transmissivity

In a modeling study in which only one calibrated field, or inverse solution, is to be produced, kriging is the best estimation routine one could use to produce an initial estimate of the gridblock transmissivities. Kriging provides an optimal estimate of the transmissivity at a point, i.e., the mean value. However, if uncertainty of the inverse solution is of interest, then multiple simulations must be generated. These simulations must reproduce the natural variability of the transmissivity field. Simulated transmissivity values reproduce the fluctuation patterns in the transmissivity field and are therefore useful to resolve the residual uncertainty not resolved by kriging. There are several approaches that may be used to produce conditional simulations of log-transmissivity:

• Nearest neighbor method (Smith and Schwartz, 1981; Smith and Freeze, 1979)

• Matrix decomposition (Wilson, 1979; Neuman, 1984)

• Multidimensional spectral analysis (Shinozuka and Jan, 1972; Mejia and Rodriguez-Iturbe, 1974)

• Turning bands method (Matheron, 1971, 1973; Mantoglou and Wilson, 1982; Zimmerman and Wilson, 1990)

• Sequential simulation methods (Gomez-Hernandez and Srivastava, 1990; Deutsch and Journel, 1992)

Sequential simulation is a straightforward method of generating realizations of a multivariate Gaussian field. Deutsch and Journel (1992) provide an excellent description of the technique. Thus, only the major concepts of the Sequential Simulation process used in GRASP-INV are presented below. Most of the text describing the Sequential Simulation procedure below draws directly from Deutsch and Journel (1992).

Consider log10 transmissivity (log10 T) assigned to a specified area as a random variable (RV), Zi. Each transmissivity is simulated sequentially according to its conditional cumulative distribution function (ccdf), fully characterized through a simple kriging system of equations. The conditioning data consist of all original log10 T data and all previously simulated log10 T values found within a neighborhood of the location being simulated.

The Sequential Simulation procedure may be summarized as follows. Consider the joint distribution of N random variables Zi with N very large, e.g., log10 T assigned to the nodes of an evenly spaced grid. Consider the conditioning of these N RVs by a set of n data of any type, symbolized by the notation |(n). The corresponding N-variate ccdf is denoted:

F(N)(z1, ..., zN | (n)) = Prob{Zi ≤ zi, i = 1, ..., N | (n)}    (2)

The successive application of Eq. 2, the conditional probability relation, shows that drawing an N-variate sample from the ccdf (Eq. 2) can be done in N successive steps, each involving a univariate ccdf with increasing levels of conditioning. That is:

• Draw a value z1(l) from the univariate ccdf of Z1 given the original data (n). The value z1(l) is now considered as a conditioning datum for all subsequent drawings; thus, the information set (n) is updated to (n + 1) = (n) ∪ {Z1 = z1(l)}.

• Draw a value z2(l) from the univariate ccdf of Z2 given the updated data set (n + 1), then update the information set to (n + 2) = (n + 1) ∪ {Z2 = z2(l)}.

• Sequentially consider all N RVs Zi.

The set {zi(l), i = 1, ..., N} represents a simulated joint realization of the N dependent RVs Zi. If another realization is needed, {zi(l′), i = 1, ..., N}, the entire sequential drawing process is repeated.
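The chain-rule decomposition above can be sketched for the simplest possible case, a pair of correlated standard Gaussian RVs. The correlation value and all names below are illustrative assumptions, not part of GRASP-INV:

```python
import math
import random

def sequential_draw(rho, rng):
    """One joint realization (z1, z2) of a bivariate standard Gaussian
    with correlation rho, drawn by the sequential principle: z1 from its
    marginal, then z2 from its univariate ccdf given {Z1 = z1}."""
    z1 = rng.gauss(0.0, 1.0)                 # draw from ccdf of Z1 given (n) = {}
    cond_mean = rho * z1                     # conditional mean given the updated set
    cond_var = 1.0 - rho ** 2                # conditional variance (independent of z1)
    z2 = rng.gauss(cond_mean, math.sqrt(cond_var))
    return z1, z2

rng = random.Random(42)
pairs = [sequential_draw(0.7, rng) for _ in range(100000)]
```

Drawing many such realizations reproduces the target joint distribution: the sample correlation of the pairs converges to the imposed value of 0.7.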

Sequential simulation requires the determination of N univariate ccdfs, more precisely:

Prob{Z1 ≤ z1 | (n)}
Prob{Z2 ≤ z2 | (n + 1)}
Prob{Z3 ≤ z3 | (n + 2)}    (3)
...
Prob{ZN ≤ zN | (n + N − 1)}

The sequential simulation principle is independent of the algorithm or model used to establish the sequence of univariate ccdfs (Eq. 3). In sequential Gaussian simulation (sGs), all ccdfs (Eq. 3) are assumed Gaussian and their means and variances are given by a series of N simple kriging systems. In indicator categorical simulation (iCs), the ccdfs are obtained directly by indicator kriging. Both of these algorithms are described below.

Indicator Categorical Simulation

GRASP-INV begins by assuming that the observed data T(u) may be divided into mutually exclusive categories sk, k = 1, ..., cat, where cat is the number of categories:

i_k(u) = 1 if T(u) ∈ s_k; 0 otherwise    (4)

Each T(u) is assigned to only one category sk. At any location u we can define the probability of T(u) falling into category sk as

p_k(u) = Prob{T(u) ∈ s_k}    (5)

By kriging the indicator of category sk, we can estimate the probability of category sk at a grid-block location,

[i_k(u | u_α)]* = Prob{T(u) ∈ s_k | s(u_α)} = p*_k(u | α)    (6)

by the relation:

Prob{I_k(u) = 1 | (n)} = P_k + Σ_{α=1}^{n} λ_α [I_k(u_α) − P_k]    (7)

where Pk is the relative frequency of category k in the n observed T(u) data. Figure 4 illustrates an example calculation of a probability of a categorical indicator for a three-point problem. Sequential indicator simulation requires Equation 7 to be repeated for the k = 1, 2, ..., cat categories. A check (and/or correction) is also performed to ensure that the sum of the indicator probabilities at each location totals 1.0. Simulation is then conducted by considering p*k(u) as an empirical cumulative distribution function, Fsk(sk), of P{s(u) ∈ sk}, generating a random number uniformly distributed on [0, 1], and inverting Fsk(sk) to get a simulated sk as S*k. The simulated value, S*k, is then added to the conditioning data and the process is repeated at a new random location u.

Once the categorical indicator is simulated at each grid block, the transmissivity within each category is simulated using the sGs algorithm. The spatial variability within each category is independent of other categories due to the assignment of separate variograms to each category. Thus, at the boundary between two different category zones, the transmissivity may be discontinuous. This allows for great flexibility in simulating hydrogeologic environments that may be discontinuous over an area due to fracturing or depositional environment.

Sequential Gaussian simulation within GRASP-INV works with normal scores of the original log10 T data. The conditioning transmissivity data used in the simulations are first transformed to their normal scores, variogram analysis and simulation are then performed in the normal space, and the results (i.e., kriging results or simulation results) are transformed back to obtain the log10 T simulated value.
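The drawing step just described can be sketched as follows. The category names and probability values are illustrative; the re-normalization mimics the order-relation check mentioned above:

```python
import random

def draw_category(probs, rng):
    """Draw one category by inverting the local cdf built from
    indicator-kriged probabilities (cf. Eq. 7). `probs` maps category
    name -> kriged probability; values are re-normalized to sum to 1.0."""
    total = sum(probs.values())
    u = rng.random()                      # uniform random number on [0, 1)
    cum = 0.0
    for cat in sorted(probs):             # fixed ordering defines the empirical cdf
        cum += probs[cat] / total
        if u <= cum:
            return cat
    return max(sorted(probs))             # guard against floating-point round-off

rng = random.Random(7)
counts = {"fractured": 0, "unfractured": 0}
for _ in range(100000):
    counts[draw_category({"fractured": 0.3, "unfractured": 0.7}, rng)] += 1
```

Over many draws, the simulated category frequencies reproduce the kriged probabilities, which is the property the sequential indicator algorithm relies on.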

Sequential Gaussian Simulation

The conditional simulation (CS) of a continuous variable T(u) modeled by a Gaussian-related stationary Random Function T(u) proceeds as follows:

1. Determine the univariate cdf Fz(T) representative of the entire study area and not only of the T-sample data available.

2. Using the cdf Fz(T), perform the normal score transform of the log10 T data into normal score Y-data with a standard normal cdf.

3. If a multivariate Gaussian RF model can be adopted for the Y-variable, proceed with sequential simulation, i.e.:

• Define a random path that visits each node of the grid (not necessarily regular) once. At each node u, retain a specified number of neighboring conditioning data including both original Y-data and previously simulated grid node Y-values.

• Use simple kriging with the normal score variogram model to determine the parameters (mean and variance) of the ccdf of the RF Y(u) at location u.

• Draw a simulated value Y(1)(u) from that ccdf.

• Add the simulated value Y(1)(u) to the data set.

• Proceed to the next node, and loop until all nodes are simulated.

4. Backtransform the simulated normal values {Y(1)(u), u ∈ A} into simulated values for the original variable {T(1)(u) = ϕ-1(Y(1)(u)), u ∈ A}. Within-class interpolations and tail extrapolations are usually necessary.
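Step 2, the normal score transform, can be sketched with the standard rank-based construction. The example data values are arbitrary; ties and declustering weights are ignored in this sketch:

```python
from statistics import NormalDist

def normal_scores(data):
    """Normal-score transform (step 2 above): replace each datum by the
    standard-normal quantile of its midpoint cumulative frequency."""
    n = len(data)
    order = sorted(range(n), key=lambda i: data[i])
    nd = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order):
        p = (rank + 0.5) / n              # midpoint plotting position in (0, 1)
        scores[i] = nd.inv_cdf(p)         # standard-normal quantile
    return scores

y = normal_scores([-5.2, -6.1, -4.8, -5.6, -4.3])   # e.g., log10 T data
```

The transform is rank-preserving and produces a symmetric standard-normal sample: the median datum maps to 0, and the backtransform (step 4) inverts the same rank mapping.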

Simple Kriging

Kriging elaborates on the basic linear regression algorithm and corresponding estimator:

[Z*_SK(u) − m(u)] = Σ_{α=1}^{n} λ_α(u) [Z(u_α) − m(u_α)]    (8)

where Z(u) is the random variable (RV) model at location u, the u_α's are the n data locations, m(u) = E{Z(u)} is the location-dependent expected value of the RV Z(u), and Z*_SK(u) is the linear regression estimator, also called the "simple kriging" (SK) estimator. The SK weights λ_α(u) are given by the system of normal equations written in their more general non-stationary form as follows:

Σ_{β=1}^{n} λ_β(u) C(u_β, u_α) = C(u, u_α),  α = 1, ..., n    (9)

The SK algorithm requires prior knowledge of the (n + 1) means m(u), m(u_α), α = 1, ..., n, and the (n + 1) by (n + 1) covariance matrix [C(u_α, u_β), α, β = 0, 1, ..., n] with u_0 = u. In most practical situations, inference of these means and covariance values requires a prior hypothesis of stationarity of the random function Z(u). If the RF Z(u) is stationary with constant mean m and covariance function C(h) = C(u, u + h), ∀u, the SK estimator reduces to its stationary version:

Z*_SK(u) = Σ_{α=1}^{n} λ_α(u) Z(u_α) + [1 − Σ_{α=1}^{n} λ_α(u)] m    (10)

with the traditional stationary SK system:

Σ_{β=1}^{n} λ_β(u) C(u_β − u_α) = C(u − u_α),  α = 1, ..., n    (11)

Stationary SK does not adapt to local trends in the data since it relies on the mean value m, assumed known and constant throughout the area. Consequently, SK is rarely used directly for mapping the z-values. Instead, it is the more robust ordinary kriging (OK) algorithm which is used.
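For a two-datum configuration, the SK system (Eq. 11) can be solved in closed form, which makes both the exactness property and the fallback to the mean m easy to verify. The exponential covariance model and all parameter values here are assumptions for illustration:

```python
import math

def cov(h, sill=1.0, a=10.0):
    """Assumed isotropic exponential covariance model C(h)."""
    return sill * math.exp(-3.0 * h / a)

def simple_krige_2pt(x0, x, z, m):
    """Stationary SK (Eqs. 10-11) with two data: the 2x2 normal system
    is solved directly for the weights lambda_1, lambda_2."""
    c0 = cov(0.0)
    c12 = cov(abs(x[0] - x[1]))
    b1, b2 = cov(abs(x0 - x[0])), cov(abs(x0 - x[1]))
    det = c0 * c0 - c12 * c12
    l1 = (c0 * b1 - c12 * b2) / det
    l2 = (c0 * b2 - c12 * b1) / det
    est = m + l1 * (z[0] - m) + l2 * (z[1] - m)
    return est, l1, l2
```

At a data location the weights become (1, 0) and the datum is honored exactly; far beyond the covariance range the weights vanish and the estimate collapses to the assumed mean m, which is precisely the behavior criticized in the text above.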

Ordinary Kriging

Ordinary kriging (OK) filters the mean from the SK estimator by requiring that the kriging weights sum to one. This results in the following ordinary kriging (OK) estimator:

Z*_OK(u) = Σ_{α=1}^{n} ν_α(u) Z(u_α)    (12)

and the stationary OK system:

Σ_{β=1}^{n} ν_β(u) C(u_β − u_α) + µ(u) = C(u − u_α),  α = 1, ..., n
Σ_{β=1}^{n} ν_β(u) = 1    (13)

where the ν_α(u)'s are the OK weights and µ(u) is the Lagrange parameter associated with the constraint in the second expression in (13). Comparing expressions (11) and (13), note that the SK weights are different from the OK weights. It can be shown that ordinary kriging amounts to re-estimating, at each new location u, the mean m as used in the SK expression. Since ordinary kriging is most often applied within moving search neighborhoods, i.e., using different data sets for different locations u, the implicit re-estimated mean, denoted m*(u), depends on the location u. Thus, the OK estimator (12) is, in fact, a simple kriging estimator of type (10) where the constant mean value m is replaced by the location-dependent estimate m*(u):

Z*_OK(u) = Σ_{α=1}^{n} ν_α(u) Z(u_α)
         = Σ_{α=1}^{n} λ_α(u) Z(u_α) + [1 − Σ_{α=1}^{n} λ_α(u)] m*(u)    (14)

Hence, ordinary kriging as applied within moving neighborhoods is already a non-stationary algorithm, in the sense that it corresponds to a non-stationary RF model with varying mean but stationary covariance.
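The OK system (Eq. 13) likewise admits a closed form for two data, since the unit-sum constraint reduces the augmented system to one equation for the weight difference. The exponential covariance model and data values below are illustrative assumptions:

```python
import math

def cov(h, sill=1.0, a=10.0):
    """Assumed isotropic exponential covariance model C(h)."""
    return sill * math.exp(-3.0 * h / a)

def ordinary_krige_2pt(x0, x, z):
    """Stationary OK (Eqs. 12-13) with two data: subtracting the two
    normal equations eliminates the Lagrange parameter mu and, with the
    unit-sum constraint, yields nu_1 and nu_2 in closed form."""
    c0 = cov(0.0)
    c12 = cov(abs(x[0] - x[1]))
    b1, b2 = cov(abs(x0 - x[0])), cov(abs(x0 - x[1]))
    diff = (b1 - b2) / (c0 - c12)            # nu_1 - nu_2
    nu1, nu2 = (1.0 + diff) / 2.0, (1.0 - diff) / 2.0
    mu = b1 - nu1 * c0 - nu2 * c12           # Lagrange parameter from the first row
    return nu1 * z[0] + nu2 * z[1], nu1, nu2, mu
```

Unlike SK, no prior mean appears: at the midpoint between the two data the weights are 0.5 each and the estimate is their average, i.e., the locally re-estimated mean m*(u) that the text describes.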

Implementation Considerations

Strict application of the sequential simulation principle calls for the determination of more and more complex ccdfs, in the sense that the size of the conditioning data set increases from (n) to (n + N − 1). In practice, the argument is that the closer data screen the influence of more remote data; therefore, only the closest data are retained to condition any of the N ccdfs. Since the number of previously simulated values may become overwhelming as i progresses from 1 to N >> n, one may want to give special attention to the original data (n) even if they are more remote.

The neighborhood limitation of the conditioning data entails that statistical properties of the (N + n) set of RVs will be reproduced only up to the maximum distance found in the neighborhood. For example, the search must extend at least as far as the distance to which the variogram is to be reproduced; this requires extensive conditioning as the sequence progresses from 1 to N.

Gomez-Hernandez and Cassiraga (1994) have suggested that sequential algorithms such as sGs can fail to adequately reproduce the long-range spatial correlations of some fields; this problem may be particularly pronounced for covariance function models with zonal anisotropy. This can occur when the random sequence of simulated locations fails to populate the field at long distances early enough in the sequence, so that the simulated values do not reflect the effects of the long-distance correlations. To circumvent this, GRASP-INV uses a multigrid approach, a type of stratified random sampling, to force the sequential simulation to visit widely separated locations early in the sequence. The multigrid approach involves setting up an initial coarse grid, each point of which is visited randomly in the sequential simulation. Subsequent finer grids are superimposed and simulated until the grid spacing is reduced to the desired resolution. Because the multigrid approach requires the sequential simulation algorithms to visit widely-spaced locations first, it helps the simulations retain the long-distance spatial correlation structure specified as input.

Theory does not specify the sequence in which the N nodes should be simulated. Practice has shown, however, that it is better to consider a random sequence. Indeed, if the N nodes are visited row-wise, any departure from rigorous theory may entail a corresponding spread of artifacts along rows.
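The multigrid visiting order can be sketched for a one-dimensional line of nodes; the grid size, number of levels, and power-of-two strides are illustrative assumptions, not GRASP-INV's actual grid logic:

```python
import random

def multigrid_path(n, levels, rng):
    """Visiting order for a 1-D line of n nodes: the coarsest grid
    (stride 2**(levels-1)) is fully visited first in random order, then
    each successively finer grid, so widely separated locations are
    simulated early in the sequence."""
    visited, path = set(), []
    for lev in range(levels - 1, -1, -1):
        stride = 2 ** lev
        stage = [i for i in range(0, n, stride) if i not in visited]
        rng.shuffle(stage)                 # random sequence within each level
        path.extend(stage)
        visited.update(stage)
    return path

path = multigrid_path(16, 3, random.Random(1))
```

Every node is visited exactly once, but the first few visits are guaranteed to be the widely separated coarse-grid nodes, which is what preserves the long-range correlation structure.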

Upscaling to the Flow Model Grid

Once the simulation (i.e., generation) of the transmissivity field is complete at a fine scale, information concerning the transmissivity field is upscaled to the flow-model finite difference grid. The information assigned to each grid block from the underlying finer geostatistical simulation grid includes:

1. The mean of the simulated log transmissivity points;
2. A lower bound for the simulated log transmissivity points;
3. An upper bound for the simulated log transmissivity points;
4. The mean category for the simulated points;
5. A flag that can prohibit the block from being selected as a pilot point.

GRASP-INV generates items 1 through 4 for every block on the flow model grid by averaging these values for the simulation points falling within each grid block. After the averaging is done, the simulation grid is effectively discarded and not used further.

To find transmissivities and a category for a grid block, the SWIFT II finite difference grid is 'overlain' by the simulation grid. One requirement is that each finite difference grid block must contain at least one simulated point. The simulated transmissivity values from each contained point are averaged to find a value for the grid block. There are two different approaches, depending on the type of simulation grid. If the simulation grid is irregular, the geometric mean of the simulated transmissivities is calculated. If the simulation grid is regular, an analog to resistance computations for an electrical circuit is used.

For a regular grid, the first assumption is that each simulation point has equal weight, i.e., comprises an equal volume within the SWIFT II grid block. A principal direction is selected and the SWIFT II grid block is partitioned into a sequence of slices, where each slice is perpendicular to that direction. In two dimensions, a slice is a row or column of simulated points. In three dimensions, a slice is comprised of points in a plane. Fluid entering the block must traverse each slice. If there is more than one simulated point within a slice, the fluid has more than one path through the slice. This is analogous to resistance in parallel in an electrical circuit. Resistors in parallel have a total resistance found by the inverse of the sum of the inverses of the individual resistors. Since transmissivity is inversely proportional to resistance, the transmissivity for a slice is found from the sum of the transmissivities for the simulated points within the slice.
Each slice must also be traversed in sequence. This is analogous to resistance in series. Resistors in series have a total resistance equal to the sum of the individual resistors. Again applying the inverse relation between resistance and transmissivity, the average transmissivity across all slices is the inverse of the sum of the inverses of the individual slice transmissivities. CONSIM II finds the directional transmissivity for each SWIFT II grid block in the x and y directions (Tx and Ty) using this electrical analogue. The equivalent horizontal transmissivity is then the geometric mean, Th = (Tx Ty)½.
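The series-parallel averaging described above can be sketched as follows. This is a simplified reading of the scheme: within a slice the points are averaged arithmetically (parallel paths of equal volume) and the slices are combined harmonically (series), so that the result carries the units of a transmissivity; the actual CONSIM II bookkeeping may differ in detail:

```python
import math

def directional_T(points, axis=0):
    """Series-parallel ('electrical') average of a regular 2-D array of
    simulated point transmissivities within one grid block. Equal
    volumes per point are assumed."""
    slices = points if axis == 0 else [list(col) for col in zip(*points)]
    inv_sum = 0.0
    for sl in slices:
        t_parallel = sum(sl) / len(sl)     # parallel paths within the slice
        inv_sum += 1.0 / t_parallel        # slices traversed in series
    return len(slices) / inv_sum           # harmonic mean across slices

def horizontal_T(points):
    """Equivalent horizontal transmissivity Th = (Tx * Ty)**0.5."""
    return math.sqrt(directional_T(points, 0) * directional_T(points, 1))
```

A homogeneous block reproduces its own value, and two slices of 1.0 and 3.0 in series give the harmonic mean 1.5 rather than the arithmetic mean 2.0, reflecting that the low-transmissivity slice controls flow across the block.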

CONSIM II uses the same approach, depending on the type of simulation grid, to find a minimum and maximum value for transmissivity in each block. Instead of simulated values, it substitutes the minimum and maximum bounds for each simulated point during the averaging process. This produces a minimum and maximum log transmissivity for the SWIFT II grid block, which are used in the optimization routines to constrain modifications to the transmissivity field. Note that the bounds were originally symmetric when simulation was being performed on the normal scores. Symmetry is generally lost during back transformation to the original data, and can be further diminished during the averaging processes described above.

In the GRASP-INV scheme, the upscaled transmissivities are used as an initial estimate for grid block transmissivities to initialize the calibration process. At each step of the calibration, a pilot point (or points) is placed in the grid block with the highest sensitivity and its transmissivity is modified. The nearby grid blocks that have the same category as the grid block containing the pilot point are also modified.

Pilot point locations are selected from the set of SWIFT II grid block centers. If the location of a conditional data point coincides with a SWIFT II grid block center, the grid block is removed from consideration as a pilot point location. This honors the observed transmissivity data within the flow model's transmissivity field.

Solving the Groundwater Flow Equation

The groundwater flow model used in GRASP-INV is SWIFT II (Sandia Waste Isolation, Flow, and Transport code). SWIFT II is a fully transient, three-dimensional, finite difference code that solves the coupled equations for single-phase flow and transport in porous and fractured geologic media, where the mass per unit volume, ρ, is a function of the concentration of the transported constituents. The SWIFT II code is supported by comprehensive documentation and extensive testing. The theory and implementation of SWIFT II are given by Reeves et al. [1986a] and the data input guide is given by Reeves et al. [1986b]. Finley and Reeves [1981] and Ward et al. [1984] present the verification-validation tests for the code. The transient flow equation solved by SWIFT II is given by

∂(φρ)/∂t − ∇ · [(ρk/µ)(∇p + ρg∇z)] + q = 0    (15)

where k = k(x) is the permeability tensor, p = p(x, t) is pressure, z is the vertical coordinate (considered positive upward), ρ = ρ(x) is fluid density, q represents fluid sources or sinks, g is the gravitational constant, µ is fluid viscosity, φ is rock porosity, x is the position vector, and t is time. Discretized, (15) becomes a matrix equation of the form

[A]{p}^n = [B]{p}^{n−1} + {f}^n    (16)

where, for the fully implicit scheme of time integration in SWIFT II, [A] = [C] + [S]/∆t_n and [B] = [S]/∆t_n, [C] is the conductance matrix, [S] is the storativity matrix, {f} is the load vector, ∆t_n = t_n − t_{n−1}, t is time, and n is the time level (n = 1, 2, ..., L, where L is the maximum time level of the simulation).
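One fully implicit step of Eq. 16 can be sketched for a two-cell model. The conductance and storativity values are illustrative, and the 2x2 direct solve stands in for the banded solver a real flow code would use:

```python
def solve2(A, b):
    """Direct 2x2 linear solve (Cramer's rule)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - b[1] * A[0][1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def implicit_step(p_old, C, s_over_dt, f):
    """One step of Eq. 16: [A]{p}^n = [B]{p}^(n-1) + {f}^n with
    [A] = [C] + [S]/dt and [B] = [S]/dt, [S] taken as diagonal."""
    A = [[C[0][0] + s_over_dt[0], C[0][1]],
         [C[1][0], C[1][1] + s_over_dt[1]]]
    rhs = [s_over_dt[0] * p_old[0] + f[0],
           s_over_dt[1] * p_old[1] + f[1]]
    return solve2(A, rhs)

# Two cells joined by a unit conductance, no sources: the pressures
# relax toward a common value while storage-weighted mass is conserved.
p_new = implicit_step([10.0, 0.0], [[1.0, -1.0], [-1.0, 1.0]], [1.0, 1.0], [0.0, 0.0])
```

With equal storativities the mean pressure is preserved exactly, and the implicit scheme is unconditionally stable, which is why the fully implicit form is used for the transient flow solution.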

The Objective Function

Once initial model parameters are assigned to the finite-difference flow model, an initial flow simulation is performed to obtain the calculated pressures across the model domain. The process of reducing the differences between the calculated and measured heads is illustrated in Figure 1. An objective function is calculated and minimized during calibration. It is a weighted sum of the squared deviations between the computed and measured pressures taken over all points in the spatial and temporal domains where pressure measurements have been made. For a purely steady-state simulation, the objective function (also called the performance measure) is given by:

J_s(u) = Σ_{i=1}^{n} w_i (p_i − p_ob,i)²    (17)

where:

J_s(u) = objective function for steady state,
n = number of boreholes,
i = suffix for the borehole,
p_i = calculated pressure,
p_ob,i = observed pressure, and
w_i = weight assigned to the borehole.

For a transient simulation, similarly,

J_t(u) = Σ_{t=t1}^{t2} Σ_{i=1}^{n} w_{i,t} (p_{i,t} − p_{ob,i,t})²    (18)

where:

J_t(u) = objective function for transient state,
t1 = the beginning of the time window,
t2 = the end of the time window, and
w_{i,t} = weight assigned to selected borehole for a given time t.

The transient performance measure may consist of short transient events during which a response is only observed at a single location, or long-term events during which responses are observed at several locations. In cases where the flow system is initially at steady state and transient stresses are then imposed upon the steady-state flow field, calibration to the steady-state conditions is undertaken first, followed by transient calibration. It is necessary to ensure that the fit between calculated and observed pressures be improved during transient calibration without degrading the fit obtained in the steady-state calibration. From experience, it has been found that this requires the contributions from the steady state and the transient state to the combined performance measure to be approximately equal. Since transient performance measures are generally much larger than steady-state performance measures (because values are summed over the time window), an additional factor f is used to ensure that the steady-state and transient performance measures are approximately equal in the combined performance measure J(u):

J(u) = f J_s(u) + J_t(u)    (19)

where:

J(u) = combined steady and transient objective function, and
f = weight factor for steady state objective function.

Also,

f J_s(u) ≈ J_t(u)    (20)

f ≈ J_t(u) / J_s(u)    (21)
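Equations 17 and 19-21 amount to a few lines of arithmetic; the following sketch (with illustrative weights and pressures) makes the balancing role of f concrete:

```python
def steady_objective(w, p_calc, p_obs):
    """Weighted least-squares steady-state objective of Eq. 17."""
    return sum(wi * (pc - po) ** 2 for wi, pc, po in zip(w, p_calc, p_obs))

def combined_objective(j_s, j_t):
    """Eqs. 19-21: choose f = J_t / J_s so the steady-state and
    transient contributions to J(u) are approximately equal."""
    f = j_t / j_s if j_s > 0.0 else 1.0
    return f * j_s + j_t, f

# Two boreholes with residuals of +1 and -1 and weights 1 and 2:
j_s = steady_objective([1.0, 2.0], [101.0, 98.0], [100.0, 99.0])   # 1 + 2 = 3
j, f = combined_objective(j_s, 12.0)                                # f = 4, J = 24
```

After scaling, the steady-state term f·J_s and the transient term J_t each contribute half of the combined measure, so neither dominates the calibration.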

Note the difference between the objective function in Equation 18 and the objective function generally employed in other inverse techniques, i.e., J(T) = J_H(T) + λ J_T(T) (Neuman and Yakowitz, 1979; Neuman, 1980; Carrera and Neuman, 1986). Other inverse procedures add the second term on the right-hand side of this expression, referred to as the plausibility criterion, for two reasons: first, it constrains the transmissivity estimates from deviating too far from prior information; second, it reduces oscillations in the transmissivity field solution. The plausibility criterion is not necessary in the Pilot Point Technique because kriging is used both in the estimation of the initial transmissivity field and in the modification of the transmissivity field during optimization.

Adjoint Sensitivity Analysis

GRASP-INV computes measures (e.g., weighted least-square errors) of a groundwater system's pressures or heads at one or several locations. It then calculates the sensitivities of these measures to system parameters (e.g., permeabilities and prescribed pressure values at the boundaries). The sensitivities are computed by the adjoint method (Chavent, 1971) and are derivatives of the performance measures with respect to the parameters of the modeled system, taken about the assumed parameter values. GRASP-INV presumes either steady-state or transient-state saturated groundwater flow conditions and directly uses the results of the groundwater flow simulation obtained from the SWIFT II subroutine. The theory and verification for the steady-state flow adjoint sensitivity equations employed in GRASP-INV are presented by Wilson et al. (1986), while those for the transient flow sensitivity equations are presented by RamaRao and Reeves (1990). A brief presentation of the sensitivity equations solved by GRASP-INV during this study is given below.

A conventional approach to the evaluation of sensitivity coefficients is defined by the expression

J = f(α, p)    (22)

where J is a performance measure and α is a vector of sensitivity parameters. Let α1 be the parameter for which a sensitivity coefficient is sought. Then

dJ/dα1 = ∂J/∂α1 + (∂J/∂p)(∂p/∂α1)    (23)

The first term on the right-hand side of (23) represents the sensitivity resulting from the explicit dependence of J on α1 and is called the direct effect. The second term represents an indirect effect due to the implicit dependence of J on α1 through the system pressures, p(α). While the computation of the direct effect is a trivial step, that of the indirect effect involves the evaluation of the state sensitivities, ∂p(x, t)/∂α1. State sensitivities may be calculated by the "parameter-perturbation approach" (Yeh, 1986) or by solution of the partial differential equation for state sensitivity (Sykes et al., 1985; Yeh, 1986). However, these approaches require the state sensitivities to be recomputed whenever a new parameter is considered. In a numerical model with a large number of grid blocks/elements and different system parameters, this represents an enormous computational effort of the same order as the multiple-simulation approach to parameter sensitivity.

The adjoint sensitivity approach circumvents the need to compute state sensitivities. This is done by expressing the performance measure as the sum of two distinct terms, one containing exclusively the partial variations with respect to the pressure function and the second containing partial variations with respect to α1 (RamaRao and Reeves, 1990). Both terms include a function referred to as the adjoint state. The adjoint state is computed such that it greatly facilitates the evaluation of the second term on the right-hand side of (23). The adjoint state vector λ is obtained by solving the following equation:

[A]^T {λ}^(n-1) = [B]^T {λ}^n + {∂J/∂p^n}^T    (24)

where T denotes the transpose of the matrix, A and B are the same matrices used in the primary problem (i.e., the pressure solution) solved by SWIFT II, and J is the performance measure (e.g., the cumulative sum of weighted squared pressure deviations between calculated and observed pressures). Equation 24 is solved backwards in time, from n = L to n = 1, with

{λ}^L = 0    (25)

If αi is a generic sensitivity parameter in grid block i, the sensitivity coefficient dJ/dαi follows from the solution of (24) using the following expression:

dJ/dαi = ∂J/∂αi + Σ(n=1..L) ({λ}^n)^T [ (∂[A]/∂αi)·{p}^n − (∂[B]/∂αi)·{p}^(n-1) − ∂{f^n}/∂αi ]    (26)

The fact that there are no state-sensitivity terms in the above expression leads to one important feature of the adjoint method, namely, the separation of the relatively time-intensive calculation of the adjoint state vector λ in (24) from the relatively non-time-intensive calculation of the sensitivity derivative (26). In general, this separation permits the calculation of sensitivity derivatives for all of the system parameters using the same adjoint state vector {λ}, a major advantage over the perturbation approach. Adjoint sensitivity analysis thus provides an extremely efficient algorithm for computing sensitivity coefficients between a given objective function J and a large number of parameters (permeabilities in thousands of grid blocks, as is the case here). In the applications contained in this document, Equation 26 is evaluated with αi = Ki, the permeability in the grid block.
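The separation described above can be sketched schematically. The sketch below is a minimal illustration, not GRASP-INV code: `adjoint_states` performs one backward-in-time solve of Equation 24 (with the terminal condition of Equation 25), and `sensitivity_coefficient` then reuses those adjoint states to evaluate Equation 26 for any number of parameters. The names mirror the symbols in Equations 24 through 26, and small dense linear algebra stands in for the sparse solves SWIFT II would perform.

```python
import numpy as np

def adjoint_states(A, B, dJ_dp):
    """One backward-in-time sweep of Eq. 24, [A]^T lam^(n-1) = [B]^T lam^n
    + (dJ/dp^n)^T, with lam^L = 0 (Eq. 25).  dJ_dp[n-1] holds dJ/dp^n."""
    L = len(dJ_dp)
    lam = [np.zeros(A.shape[0]) for _ in range(L + 1)]   # lam[L] = 0
    for n in range(L, 0, -1):
        rhs = B.T @ lam[n] + dJ_dp[n - 1]
        lam[n - 1] = np.linalg.solve(A.T, rhs)
    return lam

def sensitivity_coefficient(dJ_dalpha, lam, p, dA, dB, df):
    """Eq. 26 for one parameter alpha_i, reusing the same adjoint states;
    p[n] is the pressure vector at time step n (p[0] is the initial state)."""
    L = len(p) - 1
    total = dJ_dalpha
    for n in range(1, L + 1):
        total += lam[n] @ (dA @ p[n] - dB @ p[n - 1] - df)
    return total
```

Because `lam` is computed once, looping `sensitivity_coefficient` over thousands of grid-block permeabilities costs only a few matrix-vector products per parameter per time step, which is the advantage claimed above for the adjoint approach over the perturbation approach.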

Locating Pilot Points

De Marsily (1978) pioneered the concept of pilot points as parameters of calibration. He assigned their locations based on empirical considerations. In GRASP-INV, LaVenue and Pickens' (1992) approach to locating pilot points is followed. Pilot points are placed at grid-block centers where their potential for reducing the objective function is highest. This potential is quantified by the sensitivity coefficients (dJ/dYp) of the objective function J with respect to Yp, the logarithm (to base 10) of pilot-point transmissivity. A large number of candidate pilot points are considered (as specified by the user), usually the centroids of all the grid blocks in the flow-model grid. Each candidate pilot point is initially described by an x,y,z location (grid-block center) and a category type. The variogram for each category represented by the candidate pilot points and the number of neighboring grid blocks with the same category type are considered in the sensitivity equations. Coupled adjoint sensitivity analysis and kriging are used to compute the required derivatives; the procedure is documented in RamaRao and Reeves (1990). Within a user-specified region, GRASP-INV calculates the absolute value of dJ/dYp (see below) for these grid blocks. It then ranks these sensitivities and places a pilot point in the grid block with the highest absolute sensitivity value. GRASP-INV then sends this new pilot-point location to PAREST to optimize the pilot point's transmissivity value.

Let P be a pilot point added to a set of N observed transmissivity values within a particular category. Let Tp be the transmissivity assigned to pilot point P. Kriging is done using Yp, where

Yp = log10 Tp    (27)

The kriged estimate (Y*) at the centroid of a grid block m for this category is given by

Y*m = Σ(k=1..N) γm,k Yk + γm,p Yp    (28)

where k is the subscript for an observation point, p is the subscript for the pilot point, γm,k is the kriging weight between the interpolation point m and data point k, and γm,p is the kriging weight between interpolation point m and pilot point p. When a pilot-point transmissivity is perturbed, the kriged transmissivities, and hence the conditionally simulated (CS) values, in the neighboring grid blocks having the same category as the pilot point are altered, causing the objective function J to change. If a neighboring grid block belongs to another category, its CS value will not be affected by the addition of a nearby pilot point belonging to another category. Let Y*m represent the CS value assigned to grid block m. Using the chain rule,

dJ/dYp = Σ(m=1..M) (∂J/∂Ym)(∂Ym/∂Yp)    (29)

where M is the total number of grid blocks in the flow model, and

dYm/dYp = γm,p    (30)

where γm,p is the kriging weight between the pilot point and the finite-difference grid-block centroid. This result is valid for a CS field also, because the kriging error is independent of the kriged values. Thus

dJ/dYp = Σ(m=1..M) (dJ/dY*m) · γm,p    (31)

Y*m = log10(T*m)    (32)

T*m = Km ρm g bm / µm    (33)

dJ/dYm = ln(10) Km (dJ/dKm)    (34)

where T* is the CS transmissivity, K is the CS permeability, ρ is fluid density, µ is fluid viscosity, g is the acceleration due to gravity, b is grid-block thickness, and m is the subscript denoting the grid block. Combining Equations 31 and 34 yields

dJ/dYp = ln(10) Σ(m=1..M) γm,p Km (dJ/dKm)    (35)

The sensitivity coefficient, dJ/dKm of the objective function with respect to the permeability in a grid block m, is obtained by adjoint sensitivity analysis.
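As a concrete illustration, Equation 35 and the ranking step described above can be sketched in a few lines. This is a schematic, not the GRASP-INV implementation: each candidate grid block supplies its kriging weights γm,p, grid-block permeabilities Km, and adjoint-derived sensitivities dJ/dKm, and the candidate with the largest absolute dJ/dYp is selected as the next pilot-point location.

```python
import numpy as np

def dJ_dYp(gamma_mp, K, dJ_dK):
    """Eq. 35: dJ/dYp = ln(10) * sum_m gamma_{m,p} * K_m * (dJ/dK_m)."""
    return np.log(10.0) * float(np.sum(np.asarray(gamma_mp) *
                                       np.asarray(K) * np.asarray(dJ_dK)))

def select_pilot_point(candidates):
    """Return the candidate grid-block index with the largest |dJ/dYp|;
    `candidates` maps a grid-block index to (gamma_mp, K, dJ_dK) arrays."""
    return max(candidates, key=lambda m: abs(dJ_dYp(*candidates[m])))
```

Note that the kriging weights restrict each candidate's influence to grid blocks of its own category, consistent with the categorical approach described above.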

Optimization of Pilot Point Transmissivities

GRASP-INV contains a series of optimization codes, contained in the subroutine PAREST, to assign transmissivities to the selected pilot-point locations. Optimization is essentially conducted as a two-step process: given the parameter to be optimized, first determine in which direction to adjust its initial value (i.e., increase or decrease); once the direction is chosen, determine the optimal change, or 'step length', in this direction.

The pilot-point transmissivities are the parameters that are adjusted for calibration. However, in the mathematical implementation, the logarithms (to base 10) of the transmissivities (and not the transmissivities themselves) are treated as parameters. The calibration parameters are given by

Yp = log10 Tp    (36)

where Tp is the transmissivity at a pilot point (the suffix p denotes a pilot point). The transmissivities at pilot points are assigned by an unconstrained optimization algorithm followed by an imposition of constraints. The optimization algorithm chosen here belongs to a class of iterative search algorithms. It involves a repeated application of the following equation until convergence is achieved:

Y^(i+1) = Y^i + β^i · d^i    (37)

where i is the iteration index, di is the direction vector, βi is the step length (a scalar), and Yi is the vector of parameters to be optimized (i.e., logarithms of pilot-point transmissivities to base 10).

Determining the Direction Vector: d^i

Three options for the computation of the direction vector d^i are considered: the algorithms due to (1) Fletcher-Reeves, (2) Broyden, and (3) Davidon-Fletcher-Powell (Luenberger, 1973; Gill et al., 1981; Carrera and Neuman, 1986; Certes, 1992). These methods are well known in the classical literature and are not described here.

Determining the Step Length: β^i

The step length β^i (a scalar) is determined by

J(Y^(i+1)) = min over β^i of J(Y^i + β^i d^i)    (38)

Thus, β^i is obtained by solving

∂J(Y^(i+1))/∂β^i = 0    (39)

The solution of Equation 39 follows from Carrera and Neuman (1986) and Neuman (1980).
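The iterative scheme of Equations 37 through 39 can be sketched with one of the direction choices listed above. The code below is an illustrative toy, not PAREST: it pairs a Fletcher-Reeves conjugate-gradient direction with a golden-section line search that numerically approximates the β^i satisfying Equation 39, whereas the document solves Equation 39 following Carrera and Neuman (1986). The quadratic test objective stands in for the head-misfit function J.

```python
import numpy as np

def golden_step(J, Y, d, beta_max=1.0, tol=1e-8):
    """Approximate the beta minimizing J(Y + beta*d) on [0, beta_max] (Eq. 38)."""
    g = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = 0.0, beta_max
    while b - a > tol:
        c1, c2 = b - g * (b - a), a + g * (b - a)
        if J(Y + c1 * d) < J(Y + c2 * d):
            b = c2
        else:
            a = c1
    return 0.5 * (a + b)

def fletcher_reeves(J, gradJ, Y0, iters=20):
    """Repeated application of Y_{i+1} = Y_i + beta_i * d_i (Eq. 37)."""
    Y = np.asarray(Y0, dtype=float)
    g = gradJ(Y)
    d = -g
    for _ in range(iters):
        beta = golden_step(J, Y, d)
        Y = Y + beta * d
        g_new = gradJ(Y)
        d = -g_new + (g_new @ g_new) / (g @ g + 1e-30) * d  # FR update
        g = g_new
    return Y
```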

Constraints on Pilot Point Transmissivity Values

During the calibration process, the optimization algorithms may dictate large changes in the transmissivities assigned to pilot points to reduce the objective function. Large changes may be undesirable for several reasons. GRASP-INV constrains the pilot-point transmissivity values with the minimum and maximum values of transmissivity for each grid block, determined while upscaling transmissivities from the geostatistical simulation grid to the flow-model grid.
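A minimal sketch of this constraint step, with hypothetical bound values of the kind recorded during upscaling: the optimized log10 pilot-point transmissivity is simply clamped to the bounds of its host grid block.

```python
def constrain_Yp(Yp_opt, Y_min, Y_max):
    """Clamp an optimized log10 pilot-point transmissivity to the
    [min, max] bounds recorded for the host grid block during upscaling."""
    return min(max(Yp_opt, Y_min), Y_max)
```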

MODEL APPLICATION

WIPP Site Geology

The WIPP site lies within the geologic region known as the Delaware Basin. The upper seven formations in the vicinity of the WIPP site are illustrated in Figure 5. The repository horizon lies within the bedded salt of the Salado. The Rustler Formation consists of beds of halite, siltstone, anhydrite, and dolomite. It is divided into five separate members based on lithology. The Culebra, one of these five members, has been identified through extensive field site-characterization efforts as the most transmissive, laterally continuous hydrogeologic unit above the Salado. It is considered to be the principal pathway for offsite radionuclide transport in the subsurface, should an accidental breach of the repository occur, e.g., by an intrusion borehole (WIPP PA Department, 1992).

Based upon observations of outcrops, core, and detailed shaft mapping, the Culebra can be characterized, at least locally, as a fractured medium at the WIPP site. As the amount of fracturing and development of secondary porosity increases, the Culebra transmissivity generally increases. The occurrence of enhanced-transmissivity zones due to fracturing has an important effect on groundwater velocities. As shown in Lavenue and RamaRao (1992), the travel time from the intrusion-borehole location to the WIPP-site boundary is directly related to the transmissivity along the groundwater travel path; the higher the transmissivity, the lower the travel time to the southern boundary. Since the fractured portions of the Culebra have the highest measured transmissivity values, determining the locations of these fractured, or enhanced, transmissivity regions within the WIPP-site boundary is important from a groundwater travel-time perspective.

Culebra Hydrologic Data

Over the past 16 years, a significant effort has been directed toward field investigations at the WIPP site. Numerous boreholes in and immediately surrounding the WIPP-site area have been drilled and tested within the Culebra in support of these investigations. From these boreholes, estimates of hydrogeologic parameters such as formation elevation, fluid density, transmissivity, storativity, and porosity have been obtained. In addition, an exhaustive set of water-level measurements for hydraulically undisturbed conditions as well as hydraulically disturbed conditions (that is, transient hydraulic tests) has been recorded. The field investigations have been instrumental in providing estimates of the variability of the hydrogeologic properties within the Culebra.

Data from the observation-well network in the Culebra were evaluated in Cauffman et al. (1990) to characterize the hydraulic conditions in the Culebra. Appendix G of Cauffman et al. (1990) presents the hydrographs plotted as equivalent freshwater head versus time. The freshwater-head data were calculated from either depth-to-water or downhole-pressure-transducer measurements. The procedure used and the information necessary to calculate the freshwater heads are presented in Appendix G of Cauffman et al. (1990). An example of the hydrograph for well H-1 is shown in Figure 6. Cauffman et al. (1990) estimated the undisturbed hydraulic conditions and the transient responses to construction of the shafts and regional-scale pumping tests in the Culebra from these hydrographs. In addition, they presented the uncertainty associated with each selected undisturbed head, which was calculated by summing the measurement errors of the parameters used to calculate freshwater head (for example, the accuracy of the water-level measuring device, the accuracy of the ground-surface elevation survey, and the uncertainty of the borehole fluid-column density).

However, the uncertainties associated with the selected heads did not account for unexplained trends in the hydrographs. Most of the hydrographs display either a rising or declining trend prior to 1982. For example, in Figure 6, a 3.4-m rising trend occurs between 1977 and the middle of 1981. Unlike the water-level fluctuations due to the transient pumping and shaft-excavation events that start in 1982, the origin of the pre-1982 trends is not well understood. The trends are not caused by any known pumping event but are probably caused by a transient condition moving through the system. For this study, the undisturbed heads were selected so that these trends could be included directly in the uncertainty associated with each head value. The WIPP-borehole hydrographs were reviewed in order to select a representative 'undisturbed' head prior to the shaft excavations for those boreholes near the shafts. In essence, the heads were selected so that the head value used in the 1996 performance assessment could be considered a mean head with a Gaussian-distributed error equal to the total head-uncertainty value. Weights were calculated from the uncertainties and assigned to the head values during steady-state model calibration in order to weight the more certain heads higher than the less certain heads. Table 1 summarizes the selected undisturbed (steady-state) head values, their uncertainties, and the associated weights assigned to the steady-state heads during model calibration. These head values are displayed in Figure 7a. Figure 7b illustrates the uncertainties listed in Table 1. The asymmetrical nature of the uncertainties is due to the inclusion of the uncertainty in the trends, as described above.
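The weighting scheme described above is commonly implemented as inverse-variance weighting. The sketch below assumes that choice (the text does not give the exact formula) and shows how such weights would enter a weighted least-squares head misfit.

```python
import numpy as np

def head_weights(sigma_h):
    """Inverse-variance weights: heads with small uncertainty count more."""
    return 1.0 / np.asarray(sigma_h, dtype=float) ** 2

def weighted_head_misfit(h_calc, h_obs, w):
    """Weighted least-squares objective over the observation wells."""
    r = np.asarray(h_calc, dtype=float) - np.asarray(h_obs, dtype=float)
    return float(np.sum(w * r ** 2))
```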

Culebra Transmissivity, Storativity, and Porosity Data

The transmissivity database for the Culebra is derived from numerous hydraulic tests performed at the WIPP site (Figure 8). Values have been obtained from drill-stem tests (DSTs), slug tests, and local- and regional-scale pumping or interference tests (Beauheim 1986, 1987a, 1987b, 1987c, 1989, 1996; Beauheim et al. 1991; Cooper 1962; Cooper and Glanzman 1971). These data are summarized in data package WPO 35406. Transmissivity values interpreted from these tests extend over a range of seven orders of magnitude (Figure 8). The uncertainty of the transmissivity data has been estimated to be ±0.3 log10 m2/s. This value is used in GRASP-INV to assign limits on the permissible changes to the transmissivity field during model calibration.

The scarcity of storativity, or storage-coefficient, data eliminated the possibility of spatially varying storativity in the model domain. Storativity ranges from a maximum of 2.6 x 10-3 at H-7, west of the WIPP-site area (where the Culebra may be unconfined), to a minimum of 1.1 x 10-5 in the center of the WIPP site. The storativity data that were obtained from the tests within the Culebra were therefore used to determine a representative storativity (1.0 x 10-5) for the entire area. Kelley and Saulnier (1990) estimated a mean porosity of the Culebra of 0.153, with a standard deviation of 0.053. The lowest porosity was 0.079, and the highest approximately 0.28. The average value used in the WIPP database is 0.16.

Model Construction

Spatial Discretization

The model boundaries and orientation used in this study are essentially the same as those used by Lavenue and RamaRao (1992) for the 1992 PA Culebra calculations (Figure 9a). The locations of the model boundaries were chosen to take advantage of a groundwater divide located in the Nash Draw region west of the WIPP site. Another factor was to minimize the effect that the boundaries may have on the transient modeling results for the long-term pumping tests at the H-3, WIPP-13, H-11, and H-19 locations (Figure 9a). The finite-difference grid used for this analysis was selected to facilitate the successful reproduction of both steady-state and transient heads (Figure 9b). The grid consists of 108 x 100 x 1 (x,y,z) grid blocks and is finer in the central portion of the model in the vicinity of H-3, H-11, WIPP-13, and the shafts. This grid is also denser than that used by Lavenue and RamaRao (1992). Grid-block dimensions range from 100 meters near the center of the site to 800 meters at the model boundary.

The vertical dimension of the grid is taken from the thickness of the Culebra in the WIPP area. The mean thickness of 7.75 meters was calculated from the available data and was assumed suitable for the vertical model dimension in this study. The thickness of the Culebra ranges from about 6 to 11 m, while the transmissivity ranges over 5 orders of magnitude. Thus any error introduced by assuming a constant representative thickness is overshadowed by the inherent variability in transmissivity. The Culebra is considered confined above and below by low-permeability beds of anhydrite, halite, and siltstone. Vertical flux is not considered in the model because the existence of these low-permeability anhydrites indicates that flow would be confined. In addition, any leakage into the Culebra would have a negligible impact upon the estimation of the transmissivity fields because the fields are calibrated to a much larger set of stresses (i.e., the pump tests in the Culebra). The infinitesimal effect of any leakage occurring during these tests is not seen in the responses at the observation wells (that is, the large drawdowns associated with the transient tests "swamp" any effect the leakage would have upon the heads). Thus, for the purpose of identifying the transmissivity values in the Culebra, the lack of vertical flux has no impact upon the results, and the numerical model used in this analysis assumes a two-dimensional flow system.

Boundary Conditions

A kriging analysis of regional freshwater heads was done to estimate the boundary conditions for the model. The fundamental assumption when applying geostatistics is that the univariate and bivariate (between-point) probability laws are independent of absolute location. Generally, data that show specific deterministic influences (trends and inhomogeneities) do not satisfy this requirement. A common approach to handling this is to fit an ordinary least-squares trend surface to the data (Isaaks and Srivastava, 1989), perform the geostatistical analysis on the detrended data, which the detrending has rendered random, and then combine the trend and the random results. The observed steady-state heads were detrended by bilinear regression. The fitted trend surface was H(x,y) = 912.409 − 0.6938x + 1.1326y + 0.0104xy, a simple first-order trend. This first-order function was selected because the regional potentiometric surface showed a bilinear trend; higher-order functions were not indicated by the data. Goodness-of-fit information (mean residual and standard deviation) indicated that the trend surface is adequate to represent the potentiometric surface. Once the trend was removed from the observed head data, a variogram and kriging analysis of the head residuals led to estimates of the head residuals along the model boundaries, which were then added to the trend surface to obtain the boundary heads.
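The detrending step can be sketched as an ordinary least-squares fit of the bilinear surface H(x, y) = a + bx + cy + dxy; the residuals returned are what the variogram and kriging analysis would then operate on. This is an illustrative reconstruction of the procedure, not the original code.

```python
import numpy as np

def fit_bilinear_trend(x, y, h):
    """Least-squares bilinear trend surface; returns (coefficients, residuals)."""
    x, y, h = (np.asarray(v, dtype=float) for v in (x, y, h))
    G = np.column_stack([np.ones_like(x), x, y, x * y])
    coef, *_ = np.linalg.lstsq(G, h, rcond=None)
    return coef, h - G @ coef
```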

Temporal Discretization

Numerous transient events extending over a fifteen-year period were simulated during the calibration of the transmissivity fields to the measured transient heads. The minimum time step used in the transient simulation was 24 hours during periods of pumping and either 4 days or 8 days during periods of inactivity (i.e., marching in time from the end of one transient event to the beginning of another). Once a transient event began, the time steps were generally increased by a factor of 2. This increase was considered acceptable because of the long periods of time (i.e., months) over which the transient tests were performed in the Culebra.

Model Initial Kriged-Transmissivity Field and Its Uncertainty

A histogram and a probability plot of the logarithm of Culebra transmissivity are presented in Figures 10 and 11, respectively. The transmissivities range over five orders of magnitude, excluding the unusually low transmissivity value at P-18 (-10.1 log10 m2/s). The histogram illustrates an interesting aspect of the Culebra transmissivity values, namely that the transmissivity distribution appears approximately bimodal, with the split at a transmissivity value of -5.9 log10 m2/s, which is approximately the median, as shown on Figure 10. This bimodal distribution separates areas of increased transmissivity due to fracturing and dissolution from areas where the Culebra is intact. As previously mentioned, the transmissivities generally increase from east to west, where the halite removal from the Rustler Formation is greatest. One of the conclusions presented in Lavenue and RamaRao (1992) was that the location of the boundary between the high and low transmissivities in the area between the H-1 and H-11 boreholes had an important effect upon the groundwater travel time to the WIPP-site boundary.

Data that show marked inhomogeneity should not be grouped together for geostatistical analysis and simulation (Isaaks and Srivastava, 1989). Thus it was decided to geostatistically simulate the high transmissivities separately from the low transmissivities. The value presented above, -5.9 log10 m2/s, was chosen as the cutoff between the high and low values because (1) the histogram appears to support this value as the cutoff and (2) boreholes that have exhibited dual-porosity behavior (Beauheim et al., 1987c) fall into the high-transmissivity category using this cutoff. The correlation of the high and low transmissivities was considered on a categorical basis by performing an indicator transform (Isaaks and Srivastava, 1989). The isotropic semivariogram that describes the correlation of the categories is shown in Figure 12.
A spherical variogram with a range of 2 kilometers and a sill of 0.25 fitted the categorical raw variogram well. It should be remembered that the γ(h) value on the y-axis of Figure 12 represents the probability of changing from one category to the next. While it appears that a sill of about 0.30 might be appropriate, the selected sill of 0.25 is based on theoretical considerations. The maximum theoretical sill is defined as p(1−p), where p is the probability of being in a specific category. Because the median data value is used to separate the categories, p = 0.5 and the maximum theoretical sill is 0.25 (see Deutsch and Journel, 1992).

A directional-semivariogram approach was investigated while analyzing the transmissivity data. It would be possible to fit the east-west trend of the transmissivity data through a directional variogram. However, a more physically based approach was to separate the fractured portions of the Culebra transmissivity data (high values) from the non-fractured (low) transmissivities and conduct a semivariogram analysis on these two populations separately. Once separated, there were not enough transmissivity data within each distribution to obtain reliable directional variograms. Thus, only isotropic variograms were used for each category. Figures 13(a) and 13(b) illustrate the isotropic normal-score variograms for the high transmissivities and the low transmissivities, respectively. The most noticeable difference between the two variograms is their respective correlation lengths. The spherical variogram fitted to the high-transmissivity normal scores has a range of 5.9 km, a nugget of 0.05 (m2/s)2, and a sill of 1.00 (m2/s)2. The spherical variogram fitted to the low-transmissivity normal scores has a range of 2.1 km, a nugget of 0.08 (m2/s)2, and a sill of 1.00 (m2/s)2. Thus, the correlation length associated with the high transmissivities is almost three times longer than that associated with the low transmissivities.

The geostatistical simulation procedure consisted of first generating a categorical simulation that determined the gross distribution of high and low transmissivity. Each category was then filled in using continuous-variable simulation with the semivariogram determined for the associated category. Only nearest neighbors belonging to the same category were used in determining transmissivity.
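The spherical model used for all of these fits can be written compactly. The sketch below implements the standard spherical semivariogram with nugget, sill, and range, and reproduces the fitted values quoted above when evaluated at the range.

```python
import numpy as np

def spherical_variogram(h, nugget, sill, a):
    """Standard spherical model: gamma(h) = nugget + (sill - nugget) *
    (1.5*(h/a) - 0.5*(h/a)**3) for h < a, and gamma(h) = sill for h >= a."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    gam = nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)
    return np.where(h == 0.0, 0.0, gam)   # gamma(0) = 0 by definition
```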
To develop the initial transmissivity values for the model grid blocks in each realization, the two categories' transmissivity fields were combined into a single transmissivity field (Figure 14). All the geostatistical simulations were conducted on an evenly spaced (100 m x 100 m) grid containing 299 nodes in the y direction and 219 in the x direction. Once the conditional simulation was complete, the geostatistical simulation grid was superimposed upon the flow model's finite-difference grid, and the transmissivities were scaled up to the groundwater model's finite-difference grid using an analog to electrical-resistance computations. The upscaled transmissivities were then used as initial input to the flow model.
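One simple version of such a resistance-analog calculation is sketched below, under the assumption (not spelled out in the text) that fine cells along the flow direction combine like resistors in series (harmonic mean) and the resulting rows combine in parallel (arithmetic mean):

```python
import numpy as np

def upscale_T(fine_T):
    """Resistance-analog upscaling of a patch of fine-grid transmissivities
    to a single coarse value for flow along the rows: harmonic mean within
    each row (series), arithmetic mean across rows (parallel)."""
    fine_T = np.asarray(fine_T, dtype=float)
    row_series = fine_T.shape[1] / np.sum(1.0 / fine_T, axis=1)
    return float(np.mean(row_series))
```

For a uniform patch this returns the common value; in general the result lies between the harmonic and arithmetic means of the fine values, as expected for an effective transmissivity.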


Transient Events Simulated in Model

Transient events ranging over a fifteen-year period were used in the model to calibrate the transmissivity fields. These events included three regional pumping tests and the hydraulic stresses associated with four shaft excavations. These events are essentially the same as those used in the 1992 Culebra model; however, the pumping events that occurred in 1995 and 1996, e.g., pumping at H-19, H-11, and WQSP-2, were added to the simulated events. The hydraulic responses associated with the H-19 tracer test were not used in calibration; instead, they were simulated and then independently compared with the observations for model validation.

MODEL CALIBRATION

The transmissivities were calibrated to steady-state conditions before calibration to the regional-scale pumping tests was started. Pilot points were added and optimized one by one during steady-state calibration until either a minimum threshold weighted steady-state objective-function value, Jmin, was achieved or a maximum of fifty pilot points had been added. Jmin was set equivalent to an average head error of approximately 0.15 m. One pilot point was located and its transmissivity value optimized during each loop of GRASP-INV V2. As specified, up to 50 loops of the code were permitted to match the steady-state heads. Once the minimum steady-state objective function was reached or the 50 loops had been completed, the transient calibration was initiated. An average of 35 loops was required for the 100 fields that were calibrated.

An objective-function goal could also have been set for the transient analysis. However, it was decided to add the same number of pilot points (19) for each transient calibration in order to generate roughly the same level of parameterization for each transmissivity field. As during steady-state calibration, one pilot point was located and its transmissivity value optimized during each of the 19 transient loops of GRASP-INV V2. Table 2 lists the number of pilot points added for each transient pumping event simulated during transient calibration.

Ensemble Mean Transmissivities

As described earlier, the conditionally simulated (CS) transmissivity fields were generated using categorical (indicator) simulation followed by sequential Gaussian simulation of transmissivity for each category. The observed transmissivity data were divided into two categories (high transmissivity and low transmissivity) and used to obtain the categorical and continuous-variable (that is, transmissivity) values for the model grid blocks. One hundred CS transmissivity fields were generated and subsequently calibrated to the observed steady-state and transient-state freshwater-head data using the approach detailed earlier. Clifton and Neuman (1982) found that about 300 realizations were sufficient to establish a reasonable level of uncertainty for Monte Carlo simulations. De Marsily (1986) and Peck et al. (1988) found that only a few hundred runs are sufficient to characterize a mean response. Peck et al. (1988) also note that the number of runs depends on the variability of the parameters and the sensitivity of the system, and that no general rule can be given for the requisite number of Monte Carlo simulations. Use of Latin Hypercube Sampling (LHS) will reduce the number of runs while providing the same quality of results (Peck et al., 1988). Because WIPP PA uses LHS, 100 transmissivity fields are considered adequate to characterize the mean response and the uncertainty in the Culebra flow field.

Once calibrated, the 100 CS transmissivity fields were analyzed to determine the quality of the fit to the observed heads and to investigate the variability of the transmissivity fields. As in Lavenue and RamaRao (1992), an ensemble mean was calculated over all the realizations to determine the average transmissivity value at each grid block. The resulting ensemble transmissivity field (Figure 15) has features that are very similar to those of the 1992 ensemble mean transmissivity field. Outside the WIPP-site area, the high transmissivities of the Nash Draw area re-enter the model domain south of the WIPP site near the H-7 borehole. The high-transmissivity zone within the WIPP site extends northward from the P-17 borehole, where it lies narrowly between the P-17 and H-17 boreholes.
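The ensemble-mean calculation is a grid-block-wise average over the calibrated realizations. The sketch below assumes the averaging is done on the log10 transmissivity fields (the text does not state the averaging space):

```python
import numpy as np

def ensemble_mean(fields):
    """Grid-block-wise mean over an ensemble of calibrated fields
    (one 2-D array of log10 transmissivity per realization)."""
    return np.mean(np.stack([np.asarray(f, dtype=float) for f in fields]),
                   axis=0)
```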
Entering the controlled area from the south (Figure 16), the high-transmissivity zone widens significantly, extending westward to the H-3 and H-19 boreholes and eastward beyond the H-11 and DOE-1 boreholes. There are a number of geologic reasons for the occurrence of the high-transmissivity zone; these are reviewed in detail in Holt and Powers (1988). Physical evidence of the high-transmissivity zone stems from the poor core recovery and the fractured nature of recovered core at the DOE-1 and H-11 boreholes. Additional evidence stems from the rapid and significant drawdown that was measured at H-15 due to pumping at H-11. The drawdown indicates the presence of a high-transmissivity zone extending northward from the DOE-1 and H-11 area. The extent of this high-transmissivity zone is important from a performance-assessment standpoint because it greatly affects the travel time from an intrusion-borehole release location near the H-1 borehole to the southern WIPP boundary. Thus, the proper delineation of the high-transmissivity zone, as was conducted during steady-state and transient calibration for each field, is essential. The ensemble transmissivity field (Figure 15) shows a high-transmissivity zone in the vicinity of DOE-1 and H-11, as suggested by the physical evidence.

Figures 17 through 19 are examples of three calibrated fields. These fields are numbers 40, 69, and 77, respectively, and they were chosen to illustrate different characteristics. For example, the transmissivities for each of these fields in the vicinity of the H-1 borehole are low. However, in field 40 the conditional simulation placed a very high-transmissivity zone between H-1 and H-3, whereas in field 69 a much lower transmissivity lies between H-1 and H-3. This variability is due to the uncertainty in the location of high-transmissivity zones within the WIPP-site boundary, and it illustrates the necessity of simulating high- and low-transmissivity categories across the model domain. As observed in these three figures, the higher transmissivities are connected in a much more tortuous fashion than previously determined in the 1992 study. The finer model grid, coupled with the model grid blocks being specified with categorical indicators and separately optimized, enables the code to produce transmissivity fields that may have distinct contrasts in transmissivity between neighboring grid blocks. Nonetheless, each of the fields has the following common features, consistent with the physical evidence:

• An east-to-west trend of increasing transmissivities
• A local area of high transmissivities in the H-11 and DOE-1 area
• A northward extension of the high transmissivities in the H-7 area connecting with the H-11 and DOE-1 areas
• An area of low transmissivities in the vicinity of the shafts (related to the geology, not to the presence of the shafts)

Ensemble Steady-State and Transient Calibration

The differences between the calculated and observed steady-state heads were determined in order to summarize the fit of each realization to the steady-state data. A scatterplot of the ensemble of calculated heads versus observed heads is illustrated in Figure 20a. While there are a couple of locations where the range of calculated heads is greater than the uncertainties associated with the measured heads, the calculated heads in general agree well with the observed steady-state heads and their uncertainties (Figure 7b). This is further illustrated by the histogram of the mean differences of the heads (Figure 20b). As shown, most of the mean differences between the calculated and observed heads fall between -4 m and 2 m. The simulation with the worst steady-state head fit has a mean head difference between -4 m and -6 m. This worst-fit realization illustrates a situation in which GRASP-INV could not achieve the minimum steady-state objective function within 50 calibration loops. Although GRASP-INV could theoretically bring the head field into agreement with the observed data by adding more than the allowed 50 pilot points to reduce head differences, the tradeoff would be a loss in the predictive capability of the field. Thus, for the purpose of performance assessment, restricting the calibration procedure to 50 steps appears to be suitable despite occasional differences between the head field and the observed data.

The ensemble-mean transient heads were also calculated and compared to the measured transient heads. Figures 21 through 24 depict the hydrographs for the time period 1981 through 1990. The calculated heads match the measured heads well for the effects of the regional-scale pumping tests at H-3, WIPP-13, and H-11. As in previous modeling studies of the Culebra (Lavenue et al. 1990; Lavenue and RamaRao 1992), the drawdown of the boreholes responding to pumping at H-11 is correct, but the drawdown recovery time is too slow. This discrepancy indicates that the storativity used for this part of the model may be too high. In addition, the drawdowns associated with the shaft construction are underpredicted by the model, which is probably caused by the way in which the shaft effects are simulated.
Because of the paucity of flow-volume data from the shafts, only shaft pressures could be specified for most of the simulation time period. This condition leads to a problem when the transmissivity at the shaft location varies from one realization to the next: a relatively low transmissivity value in the shaft area reduces the area affected by drawdown due to atmospheric pressure in the shaft.
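The ensemble fit summary described above (a mean calculated-minus-observed head per realization, then a histogram of those means in 2 m bins) can be sketched as follows. The well names, head values, and realizations below are illustrative assumptions, not the WIPP data set:

```python
import statistics

# Hypothetical observed steady-state heads (m) at three wells
# (illustrative values only, not the WIPP measurements).
observed = {"W-a": 921.6, "W-b": 914.8, "W-c": 912.4}

# Calculated heads for three hypothetical calibrated realizations.
realizations = [
    {"W-a": 920.9, "W-b": 915.5, "W-c": 911.8},
    {"W-a": 923.0, "W-b": 913.9, "W-c": 912.6},
    {"W-a": 918.2, "W-b": 912.1, "W-c": 909.9},
]

def mean_difference(calc, obs):
    """Mean of (calculated - observed) head over all wells."""
    return statistics.mean(calc[w] - obs[w] for w in obs)

mean_diffs = [mean_difference(r, observed) for r in realizations]

# Histogram of the per-realization mean differences in 2 m bins,
# spanning the -6 m to +2 m range discussed in the text.
bins = [(-6, -4), (-4, -2), (-2, 0), (0, 2)]
counts = {b: sum(1 for d in mean_diffs if b[0] <= d < b[1]) for b in bins}
```

A realization whose mean difference lands in the outermost bin would correspond to the worst-fit case described above, where the objective function could not be minimized within the allowed calibration loops.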

MODEL VALIDATION

The observed and average calculated transient heads resulting from the H-19 pumping tests (Figure 25) that occurred in 1995 and 1996 are illustrated in Figures 26 and 27. The calculated and observed drawdowns during the H-19 tracer test (December 1995 through March 1996) agree well. Recall that the H-19 tracer-test data were not included in the calibration data set. The good visual agreement between calculated and observed drawdowns (both in magnitude and temporal trend) indicates that the transmissivities in the southern portion of the WIPP site are represented well by the model. This validation test indicates that the distribution of transmissivity as constructed by GRASP-INV is consistent with the true (but uncertain) distribution of aquifer transmissivity.

COMPARISON WITH EARLIER RESULTS

In an attempt to assess the impact of separately modeling the fractured and unfractured portions of the Culebra aquifer, the 1992 model results (Lavenue et al., 1995) were compared with the results of this study. Figure 28 illustrates the difference between the ensemble-mean transmissivities in the central WIPP-site area. The transmissivities of the 1996 study are lower in the area surrounding the WIPP shafts and higher in the high-transmissivity zone near the H-11 borehole. This is expected due to the separate optimization of the high- and low-transmissivity categories: high-transmissivity pilot points added in the high-transmissivity, fractured category do not affect the transmissivity within the low-transmissivity, unfractured category. The high-transmissivity pilot points were added in the fractured category in order to match the observed drawdowns during the H-3 and H-11 pumping tests.

Groundwater travel times from a point in the Culebra, coincident with the centroid of the WIPP waste panels, to the southern WIPP-site boundary are another useful comparative metric. In Lavenue et al. (1995), a travel-time CDF produced by tracking a particle through each calibrated transmissivity field (Figure 29) resulted in a minimum travel time of 9,000 yrs, a median of 18,000 yrs, and a maximum of 38,000 yrs. These calculations considered only a single porosity of 16%, a conservative particle, and unfractured media. A travel-time CDF produced using the 1996 calibrated transmissivity fields is illustrated in Figure 30. The groundwater travel times are significantly lower in the 1996 results: the minimum travel time is 1,500 yrs, the median is 7,000 yrs, and the maximum is 28,000 yrs.

Lavenue et al. (1995) determined that the greater the distance the groundwater had to travel to reach the high-transmissivity zone, the greater the travel time. The averaging (i.e., kriging) among the high- and low-transmissivity values (measured and pilot points) in the 1992 study generated a greater distance to the high-transmissivity zone by lengthening the distance groundwater had to travel through an 'intermediate' transmissivity zone. This intermediate zone consists of transmissivities ranging from -6.0 to -8.0 log10 T (m2/s). By simulating and optimizing the transmissivities by high- and low-transmissivity categories, the intermediate zone was essentially removed and replaced by a sharp boundary. Thus, the distance to the high-transmissivity category (and therefore to the high-transmissivity zone) has been reduced. In conclusion, decoupling the influence of the high-transmissivity zone (and the associated pilot points) upon the low-transmissivity zone has had a significant impact upon the travel time.
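The travel-time comparison above rests on an empirical CDF assembled from one particle-tracking travel time per calibrated field. A minimal sketch, where the sample values are illustrative (only the minimum, median, and maximum echo the 1996 figures quoted in the text) and `nearest_rank_quantile` is a hypothetical helper:

```python
def empirical_cdf(samples):
    """Sorted samples paired with empirical non-exceedance probabilities."""
    s = sorted(samples)
    return s, [(i + 1) / len(s) for i in range(len(s))]

def nearest_rank_quantile(samples, q):
    """Simple nearest-rank empirical quantile."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, int(round(q * len(s))) - 1))
    return s[k]

# Hypothetical ensemble travel times (years), one per calibrated field;
# chosen so min/median/max echo the 1996 values quoted above.
times = [1500, 2800, 4200, 7000, 9100, 12000, 19000, 28000]

t_sorted, cdf = empirical_cdf(times)
t_min = t_sorted[0]
t_med = nearest_rank_quantile(times, 0.5)
t_max = t_sorted[-1]
```

With 100 calibrated fields, as in the study, the same construction yields the CDFs of Figures 29 and 30.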


CONCLUSIONS

A capability to generate conditional simulations of multi-category transmissivity fields, such as those found in fractured and unfractured domains, and to automatically calibrate these fields has been developed. A two-step geostatistical procedure is used to generate the conditional simulations. Categorical indicator simulation is first performed to obtain the spatial distribution of indicators representing fractured or unfractured media. The spatial variability within each of these categories is then 'filled in' using the associated semi-variogram models and the Sequential Gaussian Simulation technique. An indirect inverse method, referred to as the Pilot Point Technique, has been coupled to the geostatistical routine to solve the inverse problem for steady-state and transient-state head data.

The new methodology, embedded in the GRASP-INV code, has been used to automatically calibrate a model of a regional aquifer in the vicinity of the Waste Isolation Pilot Plant in southeastern New Mexico. The aquifer is a dolomite that is fractured in some areas of the region and unfractured in others. One hundred transmissivity fields were conditionally simulated using the measured transmissivities and the data describing the occurrence of fracturing. These fields were subsequently calibrated to an extensive set of steady-state and transient-state heads. The ability of the GRASP-INV code to separately optimize the properties of the areas associated with diagenetically altered (that is, higher) transmissivity and unaltered (that is, lower) transmissivity in the Culebra improved the capability of the model to obtain good agreement between the observed and calculated steady-state and transient heads. The 100 transmissivity fields incorporate the effects of variable elevation and variable fluid density upon the flow fields and were used to calculate groundwater travel times to the WIPP-site boundary.
The transmissivity fields generated in this study have a much higher variability than those produced in the 1992 study. This variability is due to a finer grid in the WIPP-site area and to the simulation of the uncertain locations of high- and low-transmissivity zones within the model domain. The transmissivities of the 1996 study are lower in the area immediately surrounding the WIPP shafts but higher in the region between the shafts and the H-1 borehole. This was expected due to the separate optimization of the high- and low-transmissivity categories: high-transmissivity pilot points added to the high-transmissivity, fractured category did not influence the transmissivity within the low-transmissivity, unfractured category. This eliminated the development of the 'intermediate' transmissivity zone (values ranging from -6.0 to -8.0 log10 T (m2/s)) produced in the 1992 study by the averaging of the high and low transmissivity values from measurements and pilot points.

The global range of uncertainty in the transmissivity remaining in the calibrated fields affected the uncertainty in the groundwater travel times. The travel-time CDF produced in this study yielded significantly lower groundwater travel times, from the waste-panel area to the southern WIPP-site boundary, relative to the 1992 travel times. The decoupling of the high- and low-transmissivity zones produced a sharper boundary between the lower-transmissivity (i.e., unfractured) and higher-transmissivity (i.e., fractured) parts of the aquifer. This eliminated the 'intermediate' transmissivity zone mentioned above, resulting in reduced groundwater travel times to the southern WIPP-site boundary.
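The two-step generation procedure summarized in these conclusions can be sketched in deliberately simplified form. Independent draws stand in here for the sequential, variogram-based indicator and Gaussian simulations, and the category probability and log10 T distribution parameters are illustrative assumptions, not the study's values:

```python
import random

random.seed(42)

N = 20  # grid blocks along an illustrative 1-D transect

# Step 1: categorical indicator field (simplified to independent draws;
# the actual method uses sequential indicator simulation conditioned on
# fracture observations and an indicator variogram).
P_FRACTURED = 0.4  # assumed category proportion
categories = ["fractured" if random.random() < P_FRACTURED else "unfractured"
              for _ in range(N)]

# Step 2: fill in log10 T within each category from its own distribution
# (illustrative mean/std per category; the actual method uses sequential
# Gaussian simulation with per-category semi-variograms).
PARAMS = {"fractured": (-4.5, 0.5), "unfractured": (-7.0, 0.5)}
log10_T = [random.gauss(*PARAMS[c]) for c in categories]
```

Because the two categories are simulated, and later optimized, separately, neighboring blocks drawn from different categories can differ by orders of magnitude in transmissivity, reproducing the sharp fractured/unfractured contrasts described above rather than an averaged 'intermediate' zone.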


REFERENCES

Beauheim, R.L. 1986. Hydraulic-Test Interpretations for Well DOE-2 at the Waste Isolation Pilot Plant (WIPP) Site. SAND86-1364. Albuquerque, NM: Sandia National Laboratories.

Beauheim, R.L. 1987a. Analysis of Pumping Tests of the Culebra Dolomite Conducted at the H-3 Hydropad at the Waste Isolation Pilot Plant (WIPP) Site. SAND86-2311. Albuquerque, NM: Sandia National Laboratories.

Beauheim, R.L. 1987b. Interpretations of the WIPP-13 Multipad Pumping Test of the Culebra Dolomite at the Waste Isolation Pilot Plant (WIPP) Site. SAND87-2456. Albuquerque, NM: Sandia National Laboratories.

Beauheim, R.L. 1987c. Interpretations of Single-Well Hydraulic Tests Conducted At and Near the Waste Isolation Pilot Plant (WIPP) Site, 1983-1987. SAND87-0039. Albuquerque, NM: Sandia National Laboratories.

Beauheim, R.L. 1989. Interpretation of H-11b4 Hydraulic Tests and the H-11 Multipad Pumping Test of the Culebra Dolomite at the Waste Isolation Pilot Plant (WIPP) Site. SAND89-0536. Albuquerque, NM: Sandia National Laboratories.

Beauheim, R.L. 1996. “Culebra H-19 Hydraulic Test Analyses.” Albuquerque, NM: Sandia National Laboratories. (Copy on file in the Sandia WIPP Central Files, Sandia National Laboratories, Albuquerque, NM as WPO 38401.)

Beauheim, R.L., T.F. Dale, and J.F. Pickens. 1991. Interpretations of Single-Well Hydraulic Tests of the Rustler Formation Conducted in the Vicinity of the Waste Isolation Pilot Plant Site, 1988-1989. SAND89-0869. Albuquerque, NM: Sandia National Laboratories.

Carrera, J., and S.P. Neuman. 1986. Estimation of aquifer parameters under transient and steady state conditions: 2. Uniqueness, stability, and solution algorithms, Water Resources Research, 22(2), 211-227.


Cauffman, T.L., A.M. Lavenue, and J.P. McCord. 1990. Ground-Water Flow Modeling of the Culebra Dolomite. Volume II: Data Base. SAND89-7068/2. Albuquerque, NM: Sandia National Laboratories.

Chavent, G.,1971. Analyse Fonctionnelle et Identification Des Coefficients Repartis Dans les Equations Aux Derivees Partielles. These d’Etat en Mathematiques, Paris VI.

Chavent, G., M. Dupuy, and P. Lemonnier, 1975. History Matching by Use of Optimal Control Theory. Soc. Petroleum Eng. Jour. 15(1): 74-86.

Clifton, P.M., and S.P. Neuman, 1982. Effects of kriging and inverse modeling on conditional simulation of the Avra Valley aquifer in southern Arizona, Water Resour. Res., 18(4), 1215-1234.

Cooper, J.B., 1962. Ground-Water Investigations of the Project Gnome Area, Eddy and Lea Counties, New Mexico. U.S. Geological Survey TEI-802, Open-File Report, 67 p., 17 Figures.

Cooper, J.B. and V.M. Glanzman, 1971. Geohydrology of Project Gnome Site, Eddy County,New Mexico. U.S. Geological Survey, Professional Paper 712-A, 24 p.

Deutsch, C. V. and Journel, A. G., 1992. GSLIB, Geostatistical Software Library and User’s Guide. Oxford University Press, New York.

de Marsily, G., 1978. De L’Identification Des Systemes Hydrogeologiques. These, Univ. Paris VI.

de Marsily, G., 1986. Quantitative Hydrogeology: Groundwater Hydrology for Engineers. Orlando, FL: Academic Press, Inc.

Finley, N.C. and M. Reeves, 1981. SWIFT Self-Teaching Curriculum: Illustrative Problems to Supplement the User's Manual for the Sandia Waste-Isolation Flow and Transport Model (SWIFT). NUREG/CR-1968; SAND81-0410. Albuquerque, NM: Sandia National Laboratories.


Gill, P.E., W. Murray, and M.H. Wright. 1981. Practical Optimization. New York, NY: Academic Press, Inc.

Ginn, T.R., and J.H. Cushman, 1990. Inverse methods for subsurface flow: A critical review of stochastic techniques, Stochast. Hydrol. Hydraul., 4(1), 1-26.

Giudici, M., G. Morossi, G. Parravicini and G. Ponzini, 1995. A new method for the identification of distributed transmissivities, Water Resour. Res. 31(8), 1969-1988.

Gomez-Hernandez, J.J. and E.F. Cassiraga, 1995. Theory and Practice of Sequential Simulation. In Armstrong, M. and P. Dowd (Eds.), Geostatistical Simulations, pp. 111121. Kluwer, Dordrecht.

Gomez-Hernandez, J.J. and R.M. Srivastava, 1990. ISIM3D: An ANSI-C Three Dimensional Multiple Indicator Conditional Simulation Program. Computer and Geosciences, 16(4): 395-440.

Hoeksema, R.J., and P.K. Kitanidis, 1984. An application of the geostatistical approach to the inverse problem in two-dimensional groundwater modeling, Water Resour. Res., 20(7), 1003-1020.

Holt, R.M., and D.W. Powers, 1988. Facies Variability and Post-Depositional Alteration Within the Rustler Formation in the Vicinity of the Waste Isolation Pilot Plant, Southeastern New Mexico. DOE-WIPP-88-04. Carlsbad, NM: Westinghouse Electric Corporation.

Isaaks, E. and R. Srivastava, 1989. An Introduction to Applied Geostatistics. Oxford University Press, New York, NY.

Jacquard, P. and C. Jain, 1965. Permeability Distribution from Field Pressure Data. Soc. Pet. Eng. Jour. 281-294; Trans. AIME 234.

Jahns, H. O., 1966. A Rapid Method for Obtaining a Two-Dimensional Reservoir Description from Well Pressure Response Data. Soc. Pet Eng. Jour. 315-327.


Kelley, V.A. and G.J. Saulnier, 1990. Core Analysis for Selected Samples From the Culebra Dolomite at the Waste Isolation Pilot Plant (WIPP) Site. SAND90-7011. Albuquerque, NM: Sandia National Laboratories.

Kitanidis, P.K. 1995. Quasi-linear geostatistical theory for inversing. Water Resour. Res., 31(10), 2411-2419.

Kitanidis, P.K., and E.G. Vomvoris, 1983. A geostatistical approach to the inverse problem in groundwater modeling (steady state) and one-dimensional simulations, Water Resour. Res., 19(3), 677-690.

Lavenue, A.M., Cauffman, T.L., and J.F. Pickens, 1990. Ground-Water Flow Modeling of the Culebra Dolomite Volume I: Model Calibration, SAND89-7068/1. Albuquerque, NM: Sandia National Laboratories.

Lavenue, A.M., and B.S. RamaRao. 1992. A Modeling Approach to Address Spatial Variability within the Culebra Dolomite Transmissivity Field. SAND92-7306. Albuquerque, NM: Sandia National Laboratories.

Lavenue, A.M., B.S. RamaRao, G. de Marsily, and M.G. Marietta., 1995. Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields. 2. Application, Water Resour. Res., 31(3), 495-516.

Luenberger, D.G. 1973. Introduction to Linear and Nonlinear Programming. Reading, MA: Addison-Wesley Publishing Co.

Matheron, G. 1971. The Theory of Regionalized Variables and its Applications. Paris, France: École Nationale Supérieure des Mines.

Matheron, G. 1973. The intrinsic random functions and their applications, Advances in Applied Probability, 5(3), 439-468.


Mantoglou, A., and J.L. Wilson. 1982. The turning bands method for simulation of random fields using line generation by a spectral method, Water Resour. Res., 18(5), 1379-1394.

Mejia, J.M., and I. Rodríguez-Iturbe. 1974. On the synthesis of random field sampling from the spectrum: An application to the generation of hydrologic spatial processes, Water Resour. Res., 10(4), 705-711.

Nelson, R.W., 1960. In-place measurement of permeability in heterogeneous media, 1. Theory of a proposed method, Jour. of Geophysical Res., 65(6), 1753-1760.

Neuman, S.P., 1980. A Statistical Approach to the Inverse Problem of Aquifer Hydrology, 3. Improved Solution Method and Added Perspective. Water Resour. Res., 16(2): 331-346.

Neuman, S.P., 1984. Role of geostatistics in subsurface hydrology. In G. Verly, M. David, A.G. Journel, and A. Marechal (Eds.), Geostatistics for Natural Resources Characterization, Proc. NATO-ASI, Part 1, pp. 787-816. Reidel, Dordrecht, The Netherlands.

Peck, A., S. Gorelick, G. de Marsily, S. Foster, and V. Kovalevsky, 1988. Consequences of Spatial Variability in Aquifer Properties and Data Limitations for Groundwater Modelling Practice. IAHS Publication No. 175. International Association of Hydrological Sciences, Wallingford, Oxfordshire, U.K.

Ponzini, G. and A. Lozej, 1982. Identification of aquifer transmissivities: The comparison model method, Water Resour. Res., 18(3), 597-622.

Ponzini, G., G. Crosta, and M. Giudici, 1989. Identification of thermal conductivities by temperature gradient profiles: One-dimensional steady flow. Geophysics, 54, 643-653.

RamaRao, B.S., A.M. LaVenue, G. de Marsily, and M.G. Marietta. 1995. Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields. 1. Theory and computational experiments, Water Resour. Res., 31(3), 475-493.

RamaRao, B.S., and M. Reeves. 1990. Theory and Verification for the GRASP II Code for Adjoint-Sensitivity Analysis of Steady-State and Transient Ground-Water Flow. SAND89-7143. Albuquerque, NM: Sandia National Laboratories.

Reeves, M., D.S. Ward, N.D. Johns, and R.M. Cranwell. 1986a. Theory and Implementation for SWIFT II, The Sandia Waste-Isolation Flow and Transport Model for Fractured Media, Release 4.84. SAND83-1159; NUREG/CR-3328. Albuquerque, NM: Sandia National Laboratories.

Reeves, M., D.S. Ward, N.D. Johns, and R.M. Cranwell. 1986b. Data Input Guide for SWIFT II, The Sandia Waste-Isolation Flow and Transport Model for Fractured Media, Release 4.84. SAND83-0242; NUREG/CR-3162. Albuquerque, NM: Sandia National Laboratories.

Scarascia, S., and G. Ponzini, 1972. An approximate solution for the inverse problem in hydraulics, Energ. Elet., 49, 518-531.

Shinozuka, M., and C.-M. Jan. 1972. Digital simulation of random processes and its applications, Journal of Sound and Vibration, 25(1), 111-128.

Smith, L., and R.A. Freeze. 1979. Stochastic analysis of steady state groundwater flow in a bounded domain: 2. Two-dimensional simulations, Water Resour. Res., 15(6), 1543-1559.

Smith, L., and F.W. Schwartz. 1981. Mass transport. 2. Analysis of uncertainty in prediction, Water Resour. Res., 17(2), 351-369.

Sykes, J.F., J.L. Wilson, and R.W. Andrews, 1985. Sensitivity analysis for steady-state groundwater flow using adjoint operators, Water Resour. Res., 21(3), 359-371.


Thomas, L.K., L.J. Jellums, and G.M. Rehets, 1972. A nonlinear automatic history matching technique for reservoir simulation models, Soc. Pet. Eng. Jour., 12(6), 508-514.

Ward, D.S., M. Reeves and L.E. Duda, 1984. Verification and Field Comparison of the Sandia Waste-Isolation Flow and Transport Model (SWIFT). NUREG/CR-3316; SAND83-1154. Albuquerque, NM: Sandia National Laboratories.

Wilson, J.L. 1979. The synthetic generation of areal averages of a random field, Socorro Workshop on Stochastic Methods in Subsurface Hydrology, Socorro, NM, April 26-27, 1979. (Copy on file at the Waste Management and Transportation Library, Sandia National Laboratories, Albuquerque, NM.)

Wilson, J.L., B.S. RamaRao, and J.A. McNeish. 1986. GRASP: A Computer Code to Perform Post-SWENT Adjoint Sensitivity Analysis of Steady-State Ground-Water Flow. BMI/ONWI-625. Columbus, OH: Prepared for Office of Nuclear Waste Isolation, Battelle Memorial Institute.

WIPP PA (Performance Assessment) Department. 1992. Preliminary Performance Assessment for the Waste Isolation Pilot Plant, December, 1992 - Volume 2: Technical Basis. SAND92-0700/2. Albuquerque, NM: Sandia National Laboratories.

Yeh, T.-C.J., Jin, M., and S. Hanna. 1996. An iterative stochastic inverse method: Conditional effective transmissivity and hydraulic head fields, Water Resour. Res., 32(1), 85-92.

Yeh, W.W.G., 1986. Review of parameter identification procedures in groundwater hydrology: The inverse problem, Water Resour. Res., 22 (1), 95-108.

Zimmerman, D.A., and J.L. Wilson. 1990. Description of and User's Manual for TUBA: A Computer Code for Generating Two-Dimensional Random Fields via the Turning Bands Method. Albuquerque, NM: SeaSoft Scientific and Engineering Analysis Software. (Copy on file at the Waste Management and Transportation Library, Sandia National Laboratories, Albuquerque, NM.)


Table 1. Culebra Undisturbed Head Values and Uncertainties

Well     Undisturbed  Range of    Overall Head Uncertainty due    Overall Head  Steady-State
         Head* (m)    Trends (m)  to Measurement Error (m)        Variance      Head Weight
H-1      921.6 (D)    3.4         ±2.0                            1.5           0.7
H-2      924.8 (I)    1.6         +1.8/-0.1                       0.3           3.3
H-3      914.8 (D)    4.6         ±1.9                            2.0           0.5
H-4      911.4 (D)    2.8         ±0.6                            0.4           2.5
H-5      934.2 (I)    0.3         ±1.4                            0.3           3.3
H-6      932.0 (D)    1.2         ±1.0                            0.3           3.3
H-7      912.7 (S)    1.1         +0.5/-0.1                       0.1           10.0
H-9      906.4 (D)    3.5         +1.2/-0.1                       0.6           1.7
H-10     921.3 (S)    1.0         ±2.2                            0.8           **NA
H-11     912.4 (D)    2.4         +3.0/-1.0                       1.1           0.9
H-12     913.5 (S)    1.0         +1.2/-1.2                       0.3           3.3
H-14     916.9 (D)    0.5         +3.9/-0.1                       0.6           1.7
H-15     916.1 (D)    0.4         +4.2/-0.1                       0.6           1.7
H-17     911.0        0.0         ±0.9                            0.3           3.3
H-18     932.4        0.0         +2.5/-1.1                       0.4           2.5
DOE-1    914.3        0.0         +4.3/-2.2                       0.8           1.3
DOE-2    934.7 (S)    2.7         ±1.5                            0.9           1.1
P-14     926.9 (S)    1.9         ±0.9                            0.6           1.7
P-15     917.8 (I)    2.6         ±0.8                            0.5           2.0
P-17     909.3 (D)    4.6         ±0.7                            1.0           1.0
W-12     933.6 (S)    1.0         +2.2/-0.1                       0.3           3.3
W-13     933.7 (D)    0.8         +1.5/-1.3                       0.4           2.5
W-18     930.5 (D)    0.8         +3.0/-1.2                       0.7           1.4
W-25     928.7 (S)    1.4         ±1.0                            0.3           3.3
W-26     918.5 (D)    2.0         +0.4/-0.1                       0.2           5.0
W-27     938.1        0.0         ±0.7                            0.1           **NA
W-28     937.5 (I)    0.8         +0.9/-1.2                       0.2           5.0
W-30     934.1 (D)    2.0         +0.9/-1.3                       0.5           2.0
CB-1     911.1 (S)    0.6         ±0.7                            0.1           10.0
USGS-1   909.8 (S)    1.6         +0.4/-0.1                       0.1           10.0
D-268    915.2        0.0         +0.4/-0.1                       0.3           3.3
AEC7     932.0        0.0         ±0.8                            0.3           3.3

Residual effects in the data (meters), reported for a subset of wells: 2.0, 3.2, 2.9, 1.4, 2.2, 0.9, 0.3, 1.8.

* Including the trend in the uncertainty of the heads required the 1996 head to be increased (I) or decreased (D) from the 1990 value to accommodate the trend (either rising or falling), or to remain the same (S) where there was no trend. For example, the rising trend of 3.4 m for well H-1 meant that the 1996 head value was lower than the 1990 value.
** Outside Model Domain
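The steady-state head weights in Table 1 appear to be the reciprocals of the overall head variances (e.g., a variance of 0.3 gives a weight of 3.3). This relationship is inferred from the tabulated values, not stated in the text, and `head_weight` is a hypothetical helper name:

```python
def head_weight(variance):
    """Hypothesized rule inferred from Table 1: weight = 1/variance,
    rounded to one decimal place. Inferred from the tabulated values,
    not a rule stated in the text."""
    return round(1.0 / variance, 1)

# Spot checks against (variance, weight) pairs read from Table 1.
PAIRS = [(1.5, 0.7), (0.3, 3.3), (2.0, 0.5), (0.1, 10.0), (0.6, 1.7), (0.7, 1.4)]
```

Nearly every non-NA row in the table is consistent with this rule (rounding conventions can differ at exact ties such as 1/0.8 = 1.25), and it matches the usual least-squares practice of weighting each head residual by the inverse of its error variance.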


Table 2. Pilot Points Added for Each Transient Event

Event                                        Number of Pilot Points
Early shaft construction, 1981-83            2
H-3 pumping                                  3
WIPP-13 pumping                              2
H-11 (both 1989 and 1996 regional tests)     4
Air Intake Shaft construction                2
H-19 pre-tracer pumping                      4
WQSP-2 pumping                               2

Paper 2:

Three-Dimensional Interference-Test Interpretation in a Fractured/Unfractured Aquifer Using the Pilot Point Inverse Method

by

Marsh Lavenue
INTERA Incorporated
Boulder, CO

and

Ghislain de Marsily
University of Paris VI
Paris, FRANCE

February 24, 1999

Submitted to Water Resources Research for Publication

WATER RESOURCES RESEARCH, VOL. 37, NO. 11, PAGES 2659–2675, NOVEMBER 2001

Three-dimensional interference test interpretation in a fractured aquifer using the pilot point inverse method

Marsh Lavenue
INTERA Inc., Boulder, Colorado, USA

Ghislain de Marsily
Laboratoire de Géologie Appliquée, University of Paris VI, Paris, France

Abstract. A geostatistical conditional simulation routine that employs both parametric and nonparametric geostatistics is coupled to the pilot point inverse method to simulate the spatial distribution of conductivities in a dolomitic aquifer which is fractured in some areas and unfractured in others. An inversion is then conducted to obtain the conductivities within the fractured and nonfractured parts of the aquifer to match interference data from a series of three-dimensional pumping tests for an ensemble of 100 conditional simulations. The calibrated and predicted drawdowns match the observed drawdowns well. The geometric mean transmissivity calculated from the calibrated realizations in this study also compared very well with the transmissivity interpreted from a single-well pumping test. The results from this study indicate that conditioning the conductivity fields to the geologic facies data, in this case fractured-unfractured categorical data, as well as to ample transient hydraulic data, can lead to robust groundwater flow models which can adequately predict the response to other hydraulic interference tests.

1. Introduction

Simulation of highly heterogeneous aquifer systems by hydrogeologists and engineers has become more mainstream because of the sophisticated geostatistical methods available today. Hydrogeologists have long understood that proper incorporation of heterogeneity into models makes the difference between "getting it" and "getting it right." This point was emphasized in a recent comparison of inverse modeling techniques under "known" conditions in which several teams attempted to reproduce "reality" from a limited set of observations [Zimmerman et al., 1998]. In general, Zimmerman et al. found that the inverse methods able to incorporate the most geologic information into the model through geostatistics performed better than those that could not.

In Zimmerman et al.'s study there were no errors associated with the transmissivity or head measurements given to the participants. In reality, errors in estimated aquifer parameters do exist because of our inability to measure aquifer parameters at the field scale directly or with great certainty. In addition, effective properties are usually estimated using simple conceptual models and their analytic solutions. The analytic expressions used in well test analysis are extremely simplistic in the degree of variability they are able to address, resulting in mean aquifer parameter values. For example, Theis' [1935] solution assumes homogeneity and two-dimensional confined flow with a constant pumping rate. Other analytic solutions given by Jacob [1946], Hantush and Jacob [1955], Hantush [1960, 1966], and others address issues such as semiconfined or leaky conditions. More recent contributions by Barker [1988], Butler [1988], and Butler and Liu [1991, 1993] have provided analytic solutions that increase the variability of the aquifer by dividing the aquifer into several regions of homogeneous properties. Yet, notwithstanding these recent improvements, analytical models provide little information on the heterogeneity that exists.

Numerical models provide the necessary flexibility to assess heterogeneity. Deriving aquifer parameters from well tests using numerical models originated in 1968, when Pinder and Bredehoeft [1968] presented an axisymmetric numerical model designed for well test interpretation. Since that time, there have been numerous numerical modeling efforts focused on well test interpretation and the impact that heterogeneity has upon the uncertainty of the interpreted results [e.g., Cooley, 1971; Lachassagne et al., 1989; Herweijer, 1996; Meier et al., 1998]. Applications to actual field test data have led to the development of inverse methods to automatically interpret well test drawdowns. Examples are given by Carrera and Neuman [1986], Lebbe and De Breuck [1995], and Meier et al. [1997]. To date, these inverse techniques have been somewhat hampered by the complexity of the geologic and hydrogeologic properties they could feasibly address in the inverse parameterization.

This paper presents a new technique that may be used for multidimensional aquifer test interpretation in highly heterogeneous settings. It provides improvements in the details we can incorporate into numerical models used to interpret aquifer tests by representing spatial variability through facies simulation followed by inversion. It is yet another step toward improving our ability to (1) simulate highly heterogeneous formations, (2) determine uncertainties in aquifer parameters, and (3) predict future states of the simulated system with more confidence. In this new method, the pilot point inverse technique is extended to solve three-dimensional, steady state, or transient state flow problems.

Copyright 2001 by the American Geophysical Union. Paper number 2000WR900289. 0043-1397/01/2000WR900289$09.00
The new inverse technique originates from de Marsily's [1978] pilot point method as extended by RamaRao et al. [1995] and Lavenue [1998]. A unique feature of this new approach is the incorporation of a geostatistical "front end" comprised of parametric (Gaussian) and nonparametric (indicator) sequential simulation to generate an ensemble of conditionally simulated (CS) transmissivity (or conductivity) fields. The initial conductivity fields are subsequently input to a groundwater flow model and used in a pilot point inverse procedure to match hydraulic head data.

The geostatistical simulation process requires two steps. The first step is to simulate lithology or facies within a formation as discrete categories using indicator categorical simulation [see McKenna and Poeter, 1995]. The second step "fills in" the spatial variability within each facies type by sequential Gaussian simulation. If observed values of the variable of interest, e.g., conductivity, are available, the geostatistical simulations will honor both the categorical and continuous data, i.e., reproduce the observations at their locations while providing alternative, equally likely conductivity realizations for the unmeasured regions of the field. If an ensemble of realizations is generated and subsequently calibrated to observed heads using the pilot point method, one obtains an indication of the uncertainty remaining in the model's predictions. This method's current limitation is that it focuses upon conductivity as the principal parameter of interest; it could be extended to include other aquifer parameters, however.

Here we discuss the theory behind the new technique and its application to determine the heterogeneity in a three-dimensional fractured dolomitic aquifer from a series of three-dimensional pumping tests conducted at the Waste Isolation Pilot Plant (WIPP) site in New Mexico.

1.1. Earlier Work

De Marsily’s [1978] original pilot point inverse method provided an alternative indirect inverse method. His indirect method was the first to couple the inverse problem with several “state-of-the-art” techniques, namely kriging with logarithms and optimization with adjoint sensitivities. De Marsily called his technique the pilot point approach because of the way in which he parameterized the solution of the inverse problem. In the pilot point approach the Bayesian viewpoint of the parameter is adopted. The log of transmissivity is used to transform the statistical distribution of transmissivity (considered to be logarithmic; Law [1944]) to Gaussian. The unknown log transmissivities are viewed as variables of a multi-Gaussian random function (RF). Log transmissivity at a point in space is considered a Gaussian random variable (RV) which may or may not have a distribution which is statistically dependent on, or correlated to, its neighboring transmissivities. A transmissivity measurement at location ( x, y, z) is conceptually considered as a sample from the RV distribution at point ( x, y, z). As presented by Matheron [1962], kriging provides the best unbiased estimate of the RV at any point in space given observed data and the correlation between the data. In de Marsily’s work, the RV was log transmissivity, and the estimation points were the grid block centroids of his finite difference model. He employed ordinary kriging to estimate the initial log transmissivity field. The solution of the inverse problem using the pilot point method relies upon an expanded form of the kriging equation

\[ Z_m^*(u) = \sum_{\beta=1}^{n} \nu_{\beta,m}\, Z(u_\beta) + \sum_{p=1}^{N} \nu_{p,m}\, Z(u_p), \qquad (1) \]

where Z(u_β) is an observed log transmissivity with kriging weight ν_{β,m}, n is the number of observations, Z(u_p) is the log transmissivity at a pilot point, ν_{p,m} is its associated kriging weight, and N is the total number of pilot points. The number, location, and value of the pilot points comprise the parameterization issues in determining a solution to the inverse problem. De Marsily [1978] suggested that the number of pilot points should be less than the number of observed transmissivity values and that the pilot points should be placed in areas with a high hydraulic head gradient. De Marsily's pilot point approach thus consists of estimating Z(u_p) so as to minimize a weighted least squares objective function

\[ J_H(\hat{Z}) = (h^* - \hat{h})^T V_h^{-1} (h^* - \hat{h}), \qquad (2) \]
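As a concrete (toy) illustration of equations (1) and (2), the sketch below evaluates the expanded kriging estimate and the weighted least squares objective. All numerical values and the function names are invented for illustration; in practice the weights come from solving the ordinary kriging system.

```python
import numpy as np

def pilot_point_estimate(nu_data, z_data, nu_pilot, z_pilot):
    """Expanded kriging estimate of equation (1): the log transmissivity
    estimate at a grid block is a weighted sum of the n observed values
    plus the N pilot point values."""
    return float(np.dot(nu_data, z_data) + np.dot(nu_pilot, z_pilot))

def objective_J(h_obs, h_calc, V_h):
    """Weighted least squares objective of equation (2):
    J_H = (h* - h_hat)^T V_h^{-1} (h* - h_hat)."""
    r = np.asarray(h_obs, float) - np.asarray(h_calc, float)
    return float(r @ np.linalg.solve(np.asarray(V_h, float), r))

# Hypothetical numbers: two observed log-T values and one pilot point.
z_est = pilot_point_estimate([0.6, 0.3], [-5.0, -6.0], [0.1], [-4.0])
# Two head residuals with uncorrelated variance 0.25 m^2.
J = objective_J([10.0, 12.0], [9.5, 12.5], np.eye(2) * 0.25)
```

Because the pilot point values enter (1) linearly, perturbing Z(u_p) changes every kriged grid block value through its weight ν_{p,m}, which is what keeps the sensitivity calculation of equation (3) inexpensive.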

where h* is the set of head measurements, V_h is the head covariance matrix, and ĥ is the set of calculated heads at the measurement locations. The overall equation solved by the pilot point technique may be summarized as follows: let Z_m represent the log transmissivity value assigned to grid block m. Using the chain rule, the sensitivity of the weighted least squares objective function J to the log transmissivity value assigned to each pilot point may be expressed by

\[ \frac{dJ}{dZ_p} = \sum_{m=1}^{M} \frac{\partial J}{\partial Z_m}\, \frac{\partial Z_m}{\partial Z_p}, \qquad (3) \]

where M is the total number of grid blocks in the flow model. The derivatives required to guide the optimal selection of pilot point transmissivities are computed with the adjoint technique and the kriging equations (equation (1)). The second term on the right-hand side of (3) may be reduced by taking the derivative of (1) with respect to pilot point transmissivity. De Marsily states that constraints to the pilot point transmissivities could be applied to ensure that the pilot point transmissivities lie within ±2 standard deviations of the kriged estimate. De Marsily's [1978] pilot point technique was the first to solve many of the problems inherent in the earlier inverse techniques. By incorporating geostatistics and logarithms of transmissivity he ensured a smooth transmissivity field even in the presence of significant errors in the head field. In addition, he stabilized his inverse technique by (1) adding prior information in the form of initial kriged estimates at the pilot point locations, (2) properly posing the inverse problem by reducing the number of unknowns to a small number of pilot points, (3) implementing the transmissivity changes in the neighborhood of a pilot point in accord with the variogram, and (4) utilizing efficient adjoint sensitivity techniques to obtain the necessary derivatives for the optimization routine. Over the last 20 years the pilot point method has been applied to steady state and transient state groundwater flow problems [de Marsily, 1984; de Marsily et al., 1984; Ahmed and de Marsily, 1987; Ahmed, 1987]. The pilot point method has also undergone modifications and improvements since its inception. Certes and de Marsily [1991] adapted the pilot point method to a finite difference model with local mesh refinement. Lavenue and Pickens [1992] presented a new version of the pilot point method in which the locations of pilot points were optimized for the first time.
They demonstrated their technique on a regional aquifer model in the vicinity of the Waste Isolation Pilot Plant (WIPP) radioactive waste repository site. J. Gomez-Hernandez and his fellow researchers at the Technical University of Valencia, Spain, developed a linearized pilot point technique [Sahuquillo et al., 1992]. Their technique was extremely efficient because (1) the sensitivity derivatives are not updated as frequently as in the other pilot point approaches and (2) the conductance between two finite difference grid blocks is assumed to be expressed by the geometric mean (as opposed to the harmonic mean) of the associated grid block transmissivities, thereby greatly simplifying the calculation of objective function sensitivity to pilot point transmissivities. RamaRao et al. [1995] and Lavenue et al. [1995] presented a two-part paper that incorporated the pilot point method into a direct uncertainty analysis methodology whereby an ensemble of transmissivity fields was generated through conditional simulation and subsequently calibrated using the pilot point method. The approach, embedded in a finite difference model called Groundwater Adjoint Sensitivity Program-Inverse (GRASP-INV), was subsequently evaluated in a comparative exercise of inverse methods by Zimmerman et al. [1998]. Here GRASP-INV compared favorably with other nonlinear inverse methods. However, one conclusion that resulted from the exercise was the importance of proper incorporation of geology into the finite difference model, specifically through geostatistics. Thus, in 1998, M. Lavenue and others developed an enhanced version of the pilot point method to improve the geostatistics capability within GRASP-INV and to expand its functionality to three dimensions. This paper describes this enhanced pilot point method. During the time of this study, independent new work similar in concept to this study was conducted by Gomez-Hernandez and Franssen [1999] and Franssen et al. [1999a], who used a pilot point method to calibrate steady state flow within fractured three-dimensional media. In addition, Franssen et al. [1999b] present a study of joint simulation of transmissivity and storativity fields conditional to two-dimensional steady state and transient state hydraulic heads.

1.2. Study Site: Culebra Dolomite

The Culebra dolomite is one of five members of the Rustler Formation within the Delaware Basin in southeastern New Mexico. The Culebra is a laminated to thinly bedded argillaceous dolomite with abundant open and gypsum-filled fractures. As the amount of connected, open fracturing and secondary porosity increases, the Culebra transmissivity generally increases. Over the past 16 years a significant effort has been directed toward field investigations and regional flow and transport modeling within the Culebra. The focus of these activities was to determine whether the Waste Isolation Pilot Plant was suitable for a radioactive waste repository. Numerous boreholes in and immediately surrounding the WIPP site area have been drilled and tested within the Culebra in support of these investigations. The field investigations have been instrumental in providing estimates of the variability of the hydrogeologic properties within the Culebra. Five tracer tests were conducted in the Culebra over the last 15 years to determine transport parameters such as matrix block length, fracture and matrix porosity, and horizontal anisotropy for interpretative dual-porosity models. Each of these tracer tests was performed over the entire thickness of the Culebra (generally between 7 and 8 m) and therefore provided no indication of vertical heterogeneity within the Culebra. Observations by Holt and Powers [1988] in one of the WIPP shafts, an area of generally lower conductivity, indicate that fracturing within the Culebra is not uniform. They also found that most of the flow within the Culebra in the area of the shafts actually occurs within a distinct, fractured lower section of the formation. Other cores taken at the site also indicate that flow within the Culebra occurs primarily through discrete fractures or fracture zones occurring in the lower half of the Culebra. In the highest-transmissivity area of the Culebra, to the west of the WIPP site boundary, fracturing may be found throughout the entire vertical section. Thus the occurrence of and boundary between the fractured and nonfractured sections (vertically and horizontally) of the Culebra is highly variable.

Figure 1. H-19 hydropad monitoring well locations at the WIPP site.

1.3. H-19 Hydropad

In 1995 and 1996, Sandia National Laboratories installed seven boreholes in close proximity (<50 m), referred to as the H-19 hydropad (Figure 1; note that the dashed lines are projections of the well length to the surface), and executed a series of hydraulic and tracer tests in order to investigate the vertical heterogeneity of the Culebra and its impact upon transport processes [Meigs and Beauheim, 2001; Haggerty et al., 2001; McKenna et al., 2001]. During the construction of these boreholes, cores were taken, and hydrophysical logging was employed to identify fracture orientations, fracture density, and permeable units over the entire thickness of the Culebra. Hydrophysical logging in the Culebra and core taken from the H-19b0 borehole indicate that the upper 2.5 m of the Culebra have a very low conductivity relative to the rest of the Culebra. Core measurements and fluid logging results obtained from the H-19b2 and H-19b4 boreholes also support this conclusion. Numerous hydraulic pumping tests were conducted at the H-19 hydropad between the summer of 1995 and the spring of 1996. Packers were placed in all seven H-19 boreholes to divide the Culebra into an upper 2- to 3-m section and a lower 4- to 5-m section for a series of sinusoidal pumping tests. The H-19b0 borehole was first pumped in the upper zone and then pumped in the lower zone. The pump was then installed into the upper and lower zones of the H-19b4 borehole. Pumping in each interval was initiated at a constant rate to establish a steady state flow field and then switched to a sinusoidal pumping rate using a computer-controlled system. A sinusoidal pumping rate was implemented because the amplitude and time lag of pressure responses in observation wells provide additional information on storativity [Beauheim et al., 1995].

Beauheim et al. [1995] presented the results of the sinusoidal pumping tests in the Culebra and provided a qualitative interpretation of the hydraulic characteristics of the Culebra which led to the interference well responses. They found that the amplitudes in the lower zone in all observation wells (due to sinusoidal pumping) were greater than in the upper zone, regardless of whether the upper or lower zone was pumped at H-19b0. More specifically, the upper zone amplitudes in the wells to the east and south of H-19b0 (H-19b2, H-19b3, and H-19b7) were 70-90% of the lower zone amplitudes, while the upper zone amplitudes in the wells to the west and north of H-19b0 (H-19b4, H-19b5, and H-19b6) were 25% of the lower zone amplitudes. They provided Figures 2 and 3 to support this conclusion. Note the distinct difference in the amplitudes of the upper and lower zones of H-19b5 due to pumping in the upper zone of H-19b0 (Figure 2). The amplitude in the lower zone of H-19b2 (Figure 3), because of pumping in the lower zone of H-19b0, is only slightly larger than the response in the upper zone, indicating strong vertical conductivity at this location. Beauheim and Ruskauff [1998] analyzed one of the H-19 pumping tests, which resulted in a single hydropad transmissivity value of -5.2 log10 m²/s. While Beauheim et al. [1995] were able to draw some conclusions concerning the hydrogeology of the Culebra from the H-19 pumping tests, they were not able to provide estimates of the spatial distribution of conductivities that would reproduce the three-dimensional interference data. Section 2 describes the theory and application of GRASP-INV2 to the three-dimensional H-19 sinusoidal pumping test data to obtain the spatial distribution of conductivities at the intrahydropad scale.

Figure 2. Pressure response in monitoring well H-19b5.

2. Theory

Lavenue [1998] provides an in-depth discussion of the theory of GRASP-INV2, the inverse method used in this study. The process used is depicted in Figure 4, where the various steps used in the construction and calibration of a conditionally simulated field are shown. In sections 2.1-2.5 a general overview of this process is given.

Figure 3. Pressure response in monitoring well H-19b2.
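The loop of Figure 4 can be summarized, at the cost of great simplification, by the following sketch. The real steps (CONSIM II conditional simulation, SWIFT II flow solves, and the adjoint-based pilot point optimization described in sections 2.1-2.5) are replaced by caller-supplied stand-ins; the names, the scalar "field," and the tolerance are illustrative assumptions, not part of the original code.

```python
# Schematic of the Figure 4 calibration loop: simulate, evaluate the
# misfit, and add/optimize pilot points one at a time until the fit is
# acceptable or the pilot point budget is exhausted.
def calibrate(field, objective, add_pilot_point, max_pilot_points=10, tol=1e-3):
    J = objective(field)                 # initial flow simulation + misfit
    for _ in range(max_pilot_points):
        if J < tol:                      # calibration criterion met
            break
        field = add_pilot_point(field)   # place and optimize one pilot point
        J = objective(field)             # re-simulate with the updated field
    return field, J

# Toy stand-ins: the "field" is one number, the truth is 3.0, and each
# pilot point removes half of the remaining misfit.
field, J = calibrate(0.0,
                     lambda f: (f - 3.0) ** 2,
                     lambda f: f + 0.5 * (3.0 - f))
```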

2.1. Constructing the Conditional Simulations of Conductivity

CONSIM II is a subroutine within GRASP-INV2 designed for the geostatistical simulation of heterogeneous geologic media and related spatial random variables. It creates one-, two-, or three-dimensional simulated fields of spatially correlated random variables which may be conditioned to measured values. CONSIM II also produces estimated fields based on the measured values via kriging. The first version of CONSIM II, designed by this study's lead author and developed by J. Gomez-Hernandez, borrowed heavily from GSLIB, the well-known library of geostatistical programs published by Deutsch and Journel [1992]. This study employed a subsequent version of CONSIM II slightly modified for incorporation into the GRASP-INV2 inverse model. CONSIM II uses a two-step approach to simulate geologic media. The first step is to simulate lithology within a formation as discrete categories using indicator categorical simulation (iCs). The second step simulates a continuous variable for the property of interest, e.g., conductivity, within each category. Both categorical and continuous data can be used as the conditioning data. The continuous variable is simulated parametrically by sequential Gaussian simulation (sGs). If observed values of the variable of interest are available, the simulations will reproduce the observations at their locations while providing alternative, equally likely realizations for the unmeasured regions of the field. CONSIM II may be used to simulate a variety of geologic media; examples include (1) the permeabilities of both sand and shale layers within a single formation, (2) the transmissivities of both fractured and matrix units within a limestone aquifer, and (3) facies changes and the associated material properties for an alluvial or aeolian deposit.

Sequential simulation is performed on a grid much finer than the flow model finite difference grid. Once a field is simulated, the flow model grid is superimposed upon the geostatistical simulation grid, and average transmissivity (or conductivity) values are calculated for each flow model grid block. This is done by analyzing the simulation grid point values falling within each grid block. Typically, a geometric mean is computed, and a categorical type is determined from the majority of the category points falling within the grid block. Once the simulation (i.e., generation) of the conductivity field is complete, information concerning the conductivity field is passed on to the flow model finite difference grid (Figure 4). In three-dimensional applications the upscaled conductivities are converted to transmissivities and used as an initial estimate for grid block transmissivities to initialize the calibration process.
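A minimal sketch of the upscaling step just described, assuming the fine simulation-grid points falling inside one flow-model grid block have already been collected (the function name and data are illustrative; the paper does not spell out tie-breaking rules for the category vote):

```python
import numpy as np
from collections import Counter

def upscale_block(fine_K, fine_categories):
    """For the simulation-grid points inside one flow-model grid block,
    take the geometric mean of the conductivities and the majority vote
    of the categories, as described in section 2.1."""
    logK = np.log10(np.asarray(fine_K, float))
    K_block = 10.0 ** logK.mean()                        # geometric mean
    cat_block = Counter(fine_categories).most_common(1)[0][0]
    return K_block, cat_block

# Three fine-grid points (invented values, m/s) inside one grid block.
K, cat = upscale_block([1e-4, 1e-6, 1e-5],
                       ["fractured", "fractured", "unfractured"])
```

In a three-dimensional application, K for each block would then be multiplied by the layer thickness to obtain the initial grid block transmissivity.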

2.2. Solving the Groundwater Flow Equation

The groundwater flow model used in GRASP-INV2 is the Sandia Waste Isolation, Flow, and Transport code (SWIFT II) [Reeves et al., 1986]. SWIFT II is a fully transient, three-dimensional, finite difference code that solves the coupled equations for single-phase flow and transport in porous and fractured geologic media. In this study, only the single-porosity transient flow equation was implemented. SWIFT II permits the specification of constant head or specified-flux boundary conditions typical for numerical models. A third boundary condition type available in SWIFT II is the dynamic flux Carter-Tracy boundary condition. A Carter-Tracy boundary condition minimizes model boundary effects for transient states by embedding the simulation model within a regional aquifer. This has the effect of placing the heterogeneous three-dimensional model within an infinite aquifer with a specified conductivity and storativity. Thus, as a cone of depression reaches the model boundary, it will continue to propagate radially outward at a rate governed by the diffusivity of the infinite Carter-Tracy aquifer. This boundary condition type originates from the petroleum industry and is discussed by Carter and Tracy [1960] and Reeves et al. [1986].

Figure 4. GRASP-INV2 flowchart.

2.3. Objective Function

Once initial model parameters are assigned to the finite difference flow model, an initial flow simulation is performed to obtain the calculated pressures across the model domain. The process of reducing the differences between the calculated and measured heads is illustrated in Figure 4. A least squares (LS) objective function is calculated and minimized during calibration. In this application the LS is a sum of the squared deviations between the computed and measured drawdowns taken over all points in space and time where measurements have been made. For transient simulation the objective function is

\[ J(d) = \sum_{t=t_1}^{t_2} \sum_{i=1}^{n} \left( d_{i,t} - d_{\mathrm{ob}\,i,t} \right)^2, \qquad (4) \]

where J(d) is the objective function for transient state drawdown, n is the number of boreholes, i is the suffix for the borehole, d is the calculated drawdown, d_ob is the observed drawdown, t_1 is the beginning of the time window, and t_2 is the end of the time window.

The time window selected for calibration should be set long enough for drawdown and recovery to be achieved at each of the interference wells.
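Equation (4) amounts to the following computation, shown here with a toy data layout (a dictionary mapping each time step to the per-borehole drawdowns); all drawdown values are invented:

```python
def transient_objective(d_calc, d_obs, t_window):
    """Least squares objective of equation (4): sum, over time steps in
    the calibration window and over the observation boreholes, of the
    squared calculated-minus-observed drawdown."""
    t1, t2 = t_window
    return sum((dc - do) ** 2
               for t in d_obs if t1 <= t <= t2
               for dc, do in zip(d_calc[t], d_obs[t]))

# Invented drawdowns (m) for two boreholes at three time steps.
d_obs  = {0: [0.10, 0.05], 1: [0.20, 0.12], 2: [0.15, 0.08]}
d_calc = {0: [0.12, 0.05], 1: [0.18, 0.10], 2: [0.15, 0.11]}
J = transient_objective(d_calc, d_obs, (1, 2))   # window excludes t = 0
```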

2.4. Locating Pilot Points

Plate 1. Calibrated conductivity (log10 m/s) fields at the H-19 hydropad for three realizations. (left) Upper Culebra. (right) Lower Culebra.

As discussed in section 1.1, de Marsily [1978] pioneered the concept of pilot points as parameters of calibration. He assigned their locations on the basis of his intuition. In GRASP-INV2, pilot points are placed at grid block center locations where their potential for reducing the objective function is the highest. This potential is quantified by the sensitivity coefficients (dJ/dY_p) of the objective function J with respect to Y_p, the logarithm (to base 10) of pilot point conductivity. This sensitivity may be expressed as

\[ \frac{dJ}{dY_p} = \sum_{m=1}^{M} \frac{\partial J}{\partial Y_{Hm}}\, \frac{\partial Y_{Hm}}{\partial Y_p} + \sum_{m=1}^{M} \frac{\partial J}{\partial Y_{Vm}}\, \frac{\partial Y_{Vm}}{\partial Y_p}, \qquad (5) \]

where Y_Hm represents the horizontal conditionally simulated (CS) conductivity value assigned to grid block m, Y_Vm is the vertical CS conductivity assigned to grid block m, and M is the total number of grid blocks in the flow model.

The sensitivities are calculated for a large number of grid blocks by considering their centroids as potential pilot point locations. The sensitivities are then ranked from highest to lowest, and the grid block with the highest sensitivity value that does not already contain a pilot point at the centroid is chosen as the location for the next pilot point. The pilot point may then be described by an (x, y, z) location (grid block center), and it assumes the category of the grid block it is placed in.

Figure 5. Pumping rates for hydraulic tests at the H-19 hydropad: pumping rate of upper Culebra test at H-19b0 (top left), pumping rate of first lower Culebra test at H-19b0 (bottom left), pumping rate of second lower Culebra test at H-19b0 (top right), and pumping rate of lower Culebra test at H-19b4 (bottom right).

2.5. Optimizing Pilot Point Conductivities and Updating Grid Block Values

The sensitivity of the objective function to changes in horizontal and vertical conductivity due to a pilot point is assessed through (5). Once a pilot point location is selected and a category is assigned to the pilot point, GRASP-INV2 then optimizes the pilot point conductivity to minimize the objective function J. GRASP-INV2 contains several optimization routines used to assign conductivity to a selected pilot point location. In our study, a steepest descent, gradient-based optimization routine was employed. Details are given by RamaRao et al. [1995]. Once obtained, the pilot point conductivity is used along with its (x, y, z) location and category type to modify the surrounding grid block conductivities. This process is detailed below.

Plate 2. Ensemble mean conductivity (log10 m/s) field for the upper Culebra and lower Culebra (top left and top right plots, respectively) and difference between the 0.02 and 0.97 percentiles for the upper Culebra and lower Culebra (bottom left and bottom right plots, respectively).

Let K_p be the optimized conductivity assigned to pilot point P, and let Y_p = log10 K_p. If a surrounding grid block m is of the same category as the grid block containing the pilot point, then this grid block will be adjusted by ΔY_m, expressed by

\[ \Delta Y_m = \gamma_{m,p}\, \Delta Y_p, \qquad (6) \]

where p is the subscript for "pilot point," γ_{m,p} is the kriging weight between the centroid of grid block m and pilot point p, and ΔY_p is the difference between the conductivity of the grid block containing the pilot point (before the pilot point is added) and the optimized pilot point conductivity. If a neighboring grid block belongs to another category, its conductivity value will not be adjusted.
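The two mechanics of sections 2.4 and 2.5 — picking the most sensitive unoccupied grid block, then propagating the optimized change via equation (6) within the pilot point's category — can be sketched as follows. The function names, toy values, and the sign convention for ΔY_p (optimized minus prior value) are assumptions for illustration.

```python
import numpy as np

def place_pilot_point(sensitivities, occupied):
    """Section 2.4: rank grid blocks by the magnitude of dJ/dY_p and pick
    the most sensitive block that does not already host a pilot point."""
    order = np.argsort(-np.abs(np.asarray(sensitivities, float)))
    for m in order:
        if int(m) not in occupied:
            return int(m)
    raise ValueError("every grid block already contains a pilot point")

def update_grid(Y, categories, p, Y_p_opt, weights):
    """Equation (6): shift each grid block of the pilot point's category
    by its kriging weight times the optimized change at the pilot point,
    Delta Y_m = gamma_{m,p} * Delta Y_p; other categories are untouched."""
    Y = np.asarray(Y, float).copy()
    dY_p = Y_p_opt - Y[p]
    for m in range(len(Y)):
        if categories[m] == categories[p]:
            Y[m] += weights[m] * dY_p
    return Y

# Toy 3-block model: block 0 already has a pilot point.
p = place_pilot_point([0.1, -0.9, 0.4], occupied={0})
Y_new = update_grid([-5.0, -6.0, -5.5], ["f", "f", "u"],
                    p, Y_p_opt=-5.0, weights=[0.3, 1.0, 0.2])
```

The kriging weight at the pilot point's own block is 1, so that block takes the optimized value exactly, while same-category neighbors move in proportion to their variogram-derived weights.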

3. Model Application

3.1. Model Construction

Details of the model construction and calibration are given by Lavenue [1998]. Thus only the highlights of the model and its calibration are discussed here.

3.1.1. Spatial discretization. As discussed in section 1.2, the Culebra may be divided into at least two distinct hydrogeologic units. The lower unit is fractured and transmits groundwater much more readily than the upper unit. The upper, less permeable unit is considered intact dolomite in some places with filled fractures that do not contribute to groundwater flow. In other places, the upper layer behaves very similarly to the lower Culebra. A 50 × 50 × 2 numerical grid constructed for this study contained 1 m × 1 m (x and y) grid blocks. Only two vertical layers were used in the model in order to be consistent with the pumping tests' packer intervals over which the pressure data were measured and to be consistent with the hydrogeologic information of the Culebra described above. The upper layer of the model, representing the less permeable upper portion of the Culebra, was 2.6 m thick. The lower layer, representing the more permeable unit, had a thickness of 5.2 m. The Culebra is considered to be confined above and below by low-permeability beds of anhydrite, halite, and mudstone. For the purpose of identifying the conductivity values in the Culebra, vertical flux from confining beds is assumed to have no impact upon the results.

Table 1. Final Conductivities Assigned to the H-19 Hydropad Boreholes and Their Associated Normal Scores (K_V = 0.1 K_H)

                     Upper Culebra                            Lower Culebra
  Borehole   Category      Log10 K_H, m/s   Normal Score   Category    Log10 K_H, m/s   Normal Score
  H-19b0     fractured     -3.85            0.7479         fractured   -3.11            1.6906
  H-19b2     fractured     -5.40            0.4728         fractured   -6.26            -0.4728
  H-19b3     fractured     -6.65            -1.0968        fractured   -7.06            -1.6906
  H-19b4     unfractured   -8.79            0.0000         fractured   -4.12            1.0968
  H-19b5     unfractured   -9.01            -0.9674        fractured   -6.09            0.0000
  H-19b6     unfractured   -7.98            0.9674         fractured   -6.17            -0.2299
  H-19b7     fractured     -6.35            -0.7479        fractured   -5.81            0.2299

3.1.2. Boundary conditions. Carter-Tracy boundary conditions were assigned to the model boundaries. The conductivity assigned to the Carter-Tracy aquifer was -5.2 log10 m/s for both model layers. This initial value originated from a transmissivity value interpreted from the hydropad single-well pumping test by Beauheim and Ruskauff [1998]. The specific storage assigned to both layers of the Carter-Tracy aquifer was the same, 1.3 × 10⁻⁶ m⁻¹ (which results in a storativity of 1.0 × 10⁻⁵ for the entire Culebra).

Table 2. Final Conductivities Assigned to the Carter-Tracy Aquifer

  Carter-Tracy Boundary   Log10 K, m/s
  Upper layer
    north                 -6.4
    south                 -6.4
    east                  -6.4
    west                  -6.4
  Lower layer
    north                 -5.4
    south                 -6.4
    east                  -5.4
    west                  -5.4

3.1.3. Initial conductivity field simulation. Given the paucity of hydrogeologic data at the H-19 hydropad (geologic cores from the upper and lower Culebra at seven locations and only one single-well pumping test transmissivity value), basic parametric data needed for geostatistical simulation had to be determined through initial sensitivity analysis or assumed. For example, conductivity ranges for the fractured and nonfractured Culebra were assumed from conductivity ranges observed in other boreholes at the site with similar characteristics. The range for the fractured Culebra at the H-19 borehole was assumed to be -2.0 log10 m/s to -7.0 log10 m/s with a mean of -4.5 log10 m/s. The range for the unfractured conductivity was -7.0 log10 m/s to -10.0 log10 m/s with a mean of -8.5 log10 m/s. In this study, we geostatistically simulated the high conductivities associated with the fractured parts of the Culebra separately from the low conductivities associated with the less fractured Culebra. The geostatistical simulation procedure consisted of first generating a conditional categorical indicator simulation (iCs) using the fractured-unfractured core information of the upper and lower Culebra at the H-19 boreholes. This resulted in the determination of fractured and unfractured (the categories) portions of the aquifer model. The spatial variability of conductivities within each category was then determined through separate unconditional simulations using sGs and the ranges and means associated with the categories discussed above.

Because of the lack of data, semivariograms for both the H-19 hydropad indicator categories and the conductivities had to be assumed. Semivariograms require a sill and a range to be specified. In categorical simulation the maximum theoretical sill is defined as p(1 - p), where p is the probability of occurrence of an indicator, inferred from the site data [see Deutsch and Journel, 1992]. The values of p assigned to the fractured and unfractured categorical variables were determined from the cores and the observed responses to pumping at H-19b0. The unfractured category was assigned to three of the seven measurement locations in the upper Culebra (i.e., H-19b4, H-19b5, and H-19b6). The fractured category was assigned to the remaining four boreholes in the upper Culebra and to all seven of the boreholes in the lower Culebra.
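The sill calculation just defined (p(1 - p), with p inferred from the 14 cored intervals) and an exponential semivariogram model can be sketched as below. The practical-range convention γ(h) = sill × (1 - exp(-3h/a)), as in GSLIB, is an assumption about the parameterization used here.

```python
import math

def indicator_sill(n_in_category, n_total):
    """Maximum theoretical sill of a two-category indicator
    semivariogram, p(1 - p), with p inferred from data proportions."""
    p = n_in_category / n_total
    return p * (1.0 - p)

def exp_semivariogram(h, sill, a):
    """Exponential semivariogram model with practical range a:
    gamma(h) = sill * (1 - exp(-3h/a))  (GSLIB-style convention)."""
    return sill * (1.0 - math.exp(-3.0 * h / a))

# 3 of the 14 cored intervals (upper and lower Culebra at 7 boreholes)
# are unfractured, so p is about 0.21 and the sill p(1 - p) is 0.17.
sill = indicator_sill(3, 14)
gamma_10m = exp_semivariogram(10.0, sill, a=20.0)   # categorical range 20 m
```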
Thus the probability of the Culebra being unfractured in the H-19 hydropad area is 21% while the probability of the Culebra being fractured is 79%. Using either of these values in the expression p(1 - p) yields 0.17, the sill used in this study for the categorical semivariogram. A range of 20 m was assumed to be acceptable as the range of the categorical semivariogram. This value was chosen to represent the strong correlation of the fractured and unfractured areas of the H-19 hydropad. For example, the entire lower Culebra is fractured, and only boreholes to the north and west of the central H-19b0 borehole in the upper Culebra are unfractured. A value of 20 m for the range produced fields that reflected these general observations.

Sequential Gaussian simulation (sGs) works with data that are normally distributed with a zero mean and unit variance. Typically, a transform is made on the original data prior to sGs by subtracting the mean from each data point and dividing the difference by the standard deviation (see Table 1). Once sGs is complete, a back transform of the simulated normal scores is made by multiplying each simulated value by the original data's standard deviation and adding the original data's mean. sGs also requires a semivariogram to be used in the simulation process. The exponential semivariograms assigned to the fractured (higher-conductivity) and unfractured (lower-conductivity) portions of the aquifer had assumed ranges of 5 and 15 m, respectively. These range values were assumed acceptable because the conductivity within the unfractured Culebra conceptually is correlated over longer distances than the conductivity within the fractured Culebra. The variograms used for the categorical and Gaussian simulations were assigned a 0.05 vertical-to-horizontal anisotropy in order to produce realizations that were horizontally oriented. Thus the upper Culebra model layer had very little geostatistical influence upon the lower Culebra model layer.

Table 3. Final Calibrated Specific Storage Used in the Model

  Model Layer   Specific Storage of Low-Conductivity   Specific Storage of High-Conductivity
                (Unfractured) Category                 (Fractured) Category
  Upper layer   6.9 × 10⁻⁶                             1.35 × 10⁻⁵
  Lower layer   NA (not applicable)                    1.35 × 10⁻⁵

Figure 6. Observed (dashed lines) and ensemble mean calculated (solid lines) responses in the upper Culebra due to pumping in the upper H-19b0 borehole.

Figure 7. Observed (dashed lines) and ensemble mean calculated (solid lines) responses in the lower Culebra due to pumping in the upper H-19b0 borehole.

Figure 8. Validation data set: observed (dashed lines) and ensemble mean calculated (solid lines) responses in the upper Culebra due to pumping in the lower H-19b4 borehole.

Figure 9. Validation data set: observed (dashed lines) and ensemble mean calculated (solid lines) responses in the lower Culebra due to pumping in the lower H-19b4 borehole.

3.2. Model Calibration

As mentioned in section 1.3, numerous hydraulic pumping tests were conducted at the H-19 hydropad between the summer of 1995 and the spring of 1996. Several short single-well pumping tests were conducted followed by a series of sinusoidal pumping tests in the upper and lower zones of the H-19b0 and H-19b4 boreholes. Pumping in each interval was initiated at a constant rate to establish a steady state flow field and then switched to a sinusoidal pumping rate using a computercontrolled system. Drawdowns from three sinusoidal pumping tests at the H-19b0 borehole were used in model calibration through the objective function defined in (4). Each sinusoidal pumping rate had a period of 72 min. The first test was conducted in the upper Culebra at a pumping rate fluctuating between 0.035 and 0.06 L/s (Figure 5). The second pumping test was also conducted at H-19b0 but in the lower zone of the Culebra. The sinusoidal pumping rate fluctuated between 0.03 and 0.06 L/s (Figure 5). The third pumping test, also conducted in the lower zone of H-19b0, had the greatest amplitude in pumping rate, fluctuating between 0.03 and 0.175 L/s (Figure 5). The fourth test illustrated in Figure 5 was used as a validation test and will be discussed further below. Each of the seven H-19 hydropad boreholes was monitored in the upper and lower zones of the Culebra during the pumping tests. These interference test data comprised the observedpressure data set for the calibration of the three-dimensional conductivity fields. Model calibration focused on reducing the differences between the observed and calculated sinusoidal drawdown amplitudes and peak amplitude times. The model parameters adjusted were conductivity, storativity, and CarterTracy aquifer conductivity. Since the current version of
GRASP-INV2 only optimizes conductivity, the storativity and Carter-Tracy aquifer conductivity parameters were adjusted first by trial and error. Numerous unsuccessful attempts at calibration were made using a single value of storativity for the model. Only after separate storativity values were assigned to grid blocks with fractured and unfractured categories was GRASP-INV2 able to calibrate the conductivity field. Table 1 lists the final conductivities assigned to the H-19 hydropad boreholes. These conductivities remained fixed during model calibration. However, the conductivities between these borehole locations fluctuated between realizations as dictated by the initial conditional simulation and as modified by the pilot points added during inversion. The Carter-Tracy aquifer boundary conditions as well as the storativity assigned to the fractured and unfractured grid blocks were held constant during inversion. Tables 2 and 3 list the conductivities of the Carter-Tracy aquifer and the final storativity values assigned to the fractured and unfractured Culebra in the model layers, respectively. A total of 10 pilot points were added one by one during the calibration process (Figure 4). A single LS objective function composed of the sum of the squared differences between the observed and calculated drawdowns for each of the three H-19b0 pumping tests was used in calibration. The time window for each test was set to include the last cycle of the sinusoidal drawdown curves at the observation wells for each of the three tests. Once the calibration of the three-dimensional conductivity fields was complete, a fourth pumping test was subsequently simulated to validate the results of the calibration. A pumping test in the lower zone of H-19b4 (Figure 1) was set aside as a validation data set. The pumping rate fluctuated between 0.035 and 0.175 L/s with a period of 1.2 hours (lower right graph of Figure 5). 
The results of the calibration and the comparison of the predicted drawdowns in response to pumping at H-19b4, i.e., the validation test, are discussed in section 4.
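Schematically, the least-squares objective function described above sums squared drawdown residuals over all observation points of the three tests, each restricted to its time window (the last cycle of the sinusoidal drawdown curve). The sketch below uses hypothetical data structures and drawdown values; it illustrates the form of the misfit only and is not the GRASP-INV2 implementation.

```python
def ls_objective(tests):
    """Sum of squared drawdown residuals over all tests and observation
    points. Each entry of `tests` holds parallel lists of observed and
    calculated drawdowns restricted to that test's time window."""
    total = 0.0
    for test in tests:
        for obs, calc in zip(test["observed"], test["calculated"]):
            total += (obs - calc) ** 2
    return total

# Hypothetical drawdowns (m) at observation wells for two tests
tests = [
    {"observed": [0.12, 0.18, 0.15], "calculated": [0.10, 0.19, 0.16]},
    {"observed": [0.30, 0.28], "calculated": [0.29, 0.30]},
]
misfit = ls_objective(tests)
```

During inversion, pilot points are added and their conductivities adjusted so as to drive this single composite misfit down across all three tests simultaneously.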

4. Discussion of Calibration Results

One hundred conductivity fields were calibrated to the observed interference data from three sinusoidal pumping tests at H-19b0. Figures 6 and 7 contain comparisons between some of the observed drawdowns and the calculated ensemble mean sinusoidal drawdowns in the upper and lower Culebra for the sinusoidal pumping tests conducted in the upper zone of H-19b0. While only the ensemble mean drawdowns are illustrated, the drawdowns for the individual realizations do not deviate much from the ensemble drawdowns shown. As shown, the calculated drawdowns match the observed drawdowns well. The calculated responses in the upper and lower Culebra contain the same overall amplitudes and peak amplitude times as the observed drawdowns. Plate 1 illustrates 3 of the 100 calibrated fields. The conductivity of the upper Culebra has a lower overall magnitude and a wider range of values than that of the lower Culebra. This is predominantly due to the presence of the unfractured portion of the upper Culebra. The boundary between the fractured and the unfractured Culebra in the upper section is clearly visible between the H-19b0 borehole and the H-19b4, H-19b5, and H-19b6 boreholes to the north and west. The unfractured section is made distinct by the extremely low conductivities associated with it. Plate 2 contains the ensemble mean conductivity field of the
100 calibrated conductivity fields for the upper and lower Culebra. The ensemble mean field for the upper Culebra depicts the average location of the unfractured, low-conductivity Culebra lying in the northern and western portions of the H-19 hydropad. The log conductivities in the southern and eastern portions of the upper fractured Culebra range from −4.5 log10 m/s to −6.0 log10 m/s, while the log conductivities in the unfractured portion range from −7.0 log10 m/s to −8.5 log10 m/s. The log ensemble mean conductivities in the lower Culebra (Plate 2) range from −4.0 log10 m/s to −5.0 log10 m/s, a much tighter range than in the upper Culebra. As a comparison to the transmissivity interpreted from the H-19 single-well pumping test by Beauheim and Ruskauff [1998], a geometric mean transmissivity was calculated from the products of the grid block conductivities and their associated grid block thicknesses across all 100 realizations. A log geometric mean transmissivity of −5.4 log10 m²/s was obtained, which compares favorably to Beauheim and Ruskauff's interpreted transmissivity value of −5.2 log10 m²/s. Plate 2 also contains two plots that provide an indication of the variation of the conductivity fields across the 100 realizations. The conductivities for each grid block for all 100 realizations were sorted, and the differences between the 2% grid block values and the 97% grid block values were computed and plotted in Plate 2. Given that only 100 realizations were calibrated, the 2% and 97% values were chosen to represent the variation across the grid block conductivity distribution. The variation in conductivities in the lower Culebra is significantly lower than the variation in the upper Culebra. The most significant variation across the realizations occurs in the upper Culebra because of the uncertainty in the location of the boundary between the fractured and unfractured portions of the Culebra.
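The two ensemble statistics above reduce to simple arithmetic: a geometric mean of transmissivity T = K·b (the arithmetic mean of log10 T), and a 2%-97% spread of each grid block's conductivity distribution. The sketch below uses hypothetical conductivity, thickness, and ensemble values and only illustrates that arithmetic, not the code used in the study.

```python
from math import log10

def log_geometric_mean_T(conductivities, thicknesses):
    """Log10 geometric mean transmissivity: T = K * b per grid block,
    and the geometric mean is the arithmetic mean of log10(T)."""
    logs = [log10(k * b) for k, b in zip(conductivities, thicknesses)]
    return sum(logs) / len(logs)

def spread_2_97(values):
    """Difference between the 97% and 2% values of one grid block's
    conductivity across the sorted ensemble of realizations."""
    s = sorted(values)
    n = len(s)
    return s[int(0.97 * n)] - s[int(0.02 * n)]

# Hypothetical grid-block conductivities (m/s) and thicknesses (m)
k = [1e-5, 5e-6, 2e-6, 8e-7]
b = [2.0, 2.0, 2.0, 2.0]
mean_log_T = log_geometric_mean_T(k, b)

# Hypothetical 100-realization ensemble for a single grid block
ensemble = [i * 0.01 for i in range(100)]
variation = spread_2_97(ensemble)
```

Averaging in log space is what makes this a geometric (rather than arithmetic) mean, which is the conventional effective value for lognormally distributed transmissivity.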
The 100 calibrated realizations were used to predict drawdowns in response to pumping at H-19b4, the validation data set discussed above. Of particular interest is the location of the H-19b4 borehole (Figure 1) relative to the other boreholes. H-19b4 is located away from the other boreholes and on the fringe of the area represented by the interference data calibration set. The predicted and measured drawdowns are illustrated in Figures 8 and 9. The predicted drawdowns represent the actual drawdowns measured in the field well. The amplitudes and peak amplitude times of the calculated sinusoidal interference data for both the upper and lower Culebra responses approximate those observed. While only the mean response is illustrated in Figures 8 and 9, the range of the calibrated fields' predictions is not large; that is, each of the 100 calibrated fields reproduces the observed drawdowns well. (Note that the calculated drawdown at the pumping well H-19b4 is larger than observed because of a skin factor used in SWIFT II. The skin factor is a resistance term around the borehole that is used as a fitting parameter for borehole pressures or drawdowns. The initial skin factor was set to the initial mean conductivity and remained fixed across the 100 realizations.)

5. Conclusions

This paper describes the theory and application of the pilot point inverse method to a set of geologic facies data and a hydraulic data set taken from a series of three-dimensional pumping tests. One hundred three-dimensional conductivity
fields were simulated and subsequently calibrated to three sinusoidal pumping tests conducted in the Culebra dolomite located at the WIPP site. The final set of fields reproduces the drawdowns observed in the upper and lower Culebra at the H-19 hydropad very well. The calibrated fields depict the variation in the location of the boundary between the fractured and unfractured portions of the upper Culebra. In general, the unfractured part of the Culebra resides in the western and northern areas of the upper Culebra. A fourth sinusoidal pumping test conducted at the H-19b4 borehole was used as a validation data set to assess the predictability of the calibrated realizations. The predicted drawdowns in response to pumping at H-19b4 adequately represent the actual drawdowns measured in the field. The amplitudes and peak amplitude times of the calculated sinusoidal interference data for both the upper and lower Culebra responses approximate those observed. The transmissivity interpreted from the H-19 single-well pumping test compared very well with the geometric mean transmissivity calculated from the calibrated realizations in this study. This method has the added benefit of providing spatially distributed conductivities as well as indications of changes in media or lithologies (e.g., fractured and unfractured portions of an aquifer). The results from this study indicate that conditioning the conductivity fields to the geologic facies data, in this case fractured-unfractured categorical data, as well as to ample transient hydraulic data can result in groundwater flow models with robust hydraulic predictive capability. A current follow-on study is using these calibrated fields to assess whether the conditioning of these fields to the geologic and hydraulic data leads to robust predictions of tracer breakthrough.

Acknowledgments.
The authors would like to acknowledge the contributions of many individuals who reviewed, discussed, and/or otherwise contributed to this work, namely Banda S. RamaRao, Toya Jones, and Tim Dale of Duke Engineering and Services in Austin, Texas; Jaime Gomez-Hernandez of the Universidad Politécnica de Valencia; Mel Marietta, Richard Beauheim, Lucy Meigs, and Sean McKenna of Sandia National Laboratories in Albuquerque, New Mexico; Emmanuel Ledoux and Alain Galli of the Ecole des Mines de Paris; John Wilson of the New Mexico Institute of Mining and Technology; Jesus Carrera of the Universidad Politécnica de Catalunya; and Greg Ruskauff of Waterstone Environmental Hydrology and Engineering in Boulder, Colorado.

References

Ahmed, S., Estimation des transmissivites des aquiferes par methodes geostatistiques et resolution indirecte du probleme inverse, Ph.D. thesis, Paris Sch. of Mines, Fontainebleau, France, 1987.
Ahmed, S., and G. de Marsily, Comparison of geostatistical methods for estimating transmissivity using data on transmissivity and specific capacity, Water Resour. Res., 23(9), 1717–1737, 1987.
Barker, J. A., A generalized radial flow model for hydraulic tests in fractured rock, Water Resour. Res., 24(10), 1796–1804, 1988.
Beauheim, R. L., and G. J. Ruskauff, Analysis of hydraulic tests of the Culebra and Magenta dolomites and Dewey Lake Redbeds conducted at the Waste Isolation Pilot Plant Site, Rep. SAND98-0049, Sandia Natl. Lab., Albuquerque, N. M., 1998.
Beauheim, R. L., L. C. Meigs, and M. B. Kloska, Evaluation of conceptual models of flow and transport through a fractured dolomite, 1, Hydraulic testing, Eos Trans. AGU, 76(46), Fall Meet. Suppl., F251, 1995.
Butler, J. J., Jr., Pumping tests in non-uniform aquifers: The radially symmetric case, J. Hydrol., 101, 15–30, 1988.
Butler, J. J., Jr., and W. Z. Liu, Pumping tests in non-uniform aquifers: The linear strip case, J. Hydrol., 128, 69–99, 1991.

Butler, J. J., Jr., and W. Z. Liu, Pumping tests in non-uniform aquifers: The radially asymmetric case, Water Resour. Res., 29(2), 259–269, 1993.
Carrera, J., and S. P. Neuman, Estimation of aquifer parameters under transient and steady state conditions, 1, Maximum likelihood method incorporating prior information, Water Resour. Res., 22(2), 199–210, 1986.
Carter, R. D., and C. W. Tracy, An improved method for calculating water influx, Trans. Soc. Pet. Eng., 219, 415–417, 1960.
Certes, C., and G. de Marsily, Application of the pilot point method to the identification of aquifer transmissivities, Adv. Water Resour., 14(5), 284–300, 1991.
Cooley, R. L., A finite difference method for unsteady flow in variably saturated porous media: Application to a single pumping well, Water Resour. Res., 7(6), 1607–1625, 1971.
de Marsily, G., De l'identification des systemes hydrogeologiques, these, Univ. of Paris VI, 1978.
de Marsily, G., Spatial variability of properties in porous media: A stochastic approach, in Fundamentals of Transport Phenomena in Porous Media, NATO ASI Ser., Ser. E, vol. 82, edited by J. Bear and M. Corapcioglu, pp. 719–769, Martinus Nijhoff, Zoetermeer, Netherlands, 1984.
de Marsily, G., G. Lavedan, M. Boucher, and G. Fasanio, Interpretation of interference tests in a well field using geostatistical techniques to fit the permeability distribution in a reservoir model, in Geostatistics for Natural Resources Characterization, NATO ASI Ser., Ser. C, vol. 122, part 2, edited by G. Verly et al., pp. 831–849, Norwell, Mass., 1984.
Deutsch, C. V., and A. G. Journel, GSLIB Geostatistical Software Library and User's Guide, Oxford Univ. Press, New York, 1992.
Franssen, H.-J. H., E. F. Cassiraga, J. J. Gomez-Hernandez, A. Sahuquillo, and J. E. Capilla, Inverse modeling of groundwater flow in a 3D fractured media, in GeoENV II, Geostatistics for Environmental Applications, edited by J. J. Gomez-Hernandez et al., pp. 283–294, Kluwer Acad., Norwell, Mass., 1999a.
Franssen, H.-J. H., J. J. Gomez-Hernandez, J. E. Capilla, and A. Sahuquillo, Joint simulation of transmissivity and storativity field conditional to steady state and transient hydraulic head data, Adv. Water Resour., 23(1), 1–13, 1999b.
Gomez-Hernandez, J. J., and H.-J. H. Franssen, True block scale stochastic continuum modeling: Model assessment and first conditional model, Aspo Hard Rock Lab. Tech. Doc., ITD-99-18, 35 pp., Swed. Nucl. Fuel and Waste Manage. Co., Figeholm, 1999.
Haggerty, R., S. W. Fleming, L. C. Meigs, and S. A. McKenna, Tracer tests in a fractured dolomite, 2, Analysis of mass transfer in single-well injection-withdrawal tests, Water Resour. Res., 37(5), 1129–1142, 2001.
Hantush, M. S., Modification of the theory of leaky aquifers, J. Geophys. Res., 65, 3713–3725, 1960.
Hantush, M. S., Analysis of data from pumping tests in anisotropic aquifers, J. Geophys. Res., 71, 421–426, 1966.
Hantush, M. S., and C. E. Jacob, Non-steady radial flow in an infinite leaky aquifer, Eos Trans. AGU, 36, 95–100, 1955.
Herweijer, J. C., Constraining uncertainty of groundwater flow and transport models using pumping tests, in Calibration and Reliability in Groundwater Modeling, IAHS Publ., 237, 473–482, 1996.
Holt, R. M., and D. W. Powers, Facies variability and post-depositional alteration within the Rustler Formation in the vicinity of the Waste Isolation Pilot Plant, southeastern New Mexico, Rep. DOE-WIPP88-04, Westinghouse Electr. Corp., Carlsbad, N. M., 1988.
Jacob, C. E., Radial flow in a leaky artesian aquifer, Eos Trans. AGU, 27, 198–205, 1946.
Lachassagne, P., E. Ledoux, and G. de Marsily, Evaluation of hydrogeological parameters in heterogeneous porous media, in Groundwater Management: Quantity and Quality, IAHS Publ., 188, 3–18, 1989.
Lavenue, A. M., Sur une nouvelle methode de point de pilotes en probleme inverse en hydrogeologie: Engendrant un ensemble de simulation conditionelles de champs de transmissivites, Ph.D. dissertation, Ecole Natl. Super. des Mines de Paris, Nov. 1998.
Lavenue, A. M., and J. F. Pickens, Application of a coupled adjoint-sensitivity and kriging approach to calibrate a groundwater flow model, Water Resour. Res., 28(6), 1543–1569, 1992.
Lavenue, A. M., B. S. RamaRao, G. de Marsily, and M. G. Marietta, Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 2, Application, Water Resour. Res., 31(3), 495–516, 1995.
Law, J., A statistical approach to the interstitial heterogeneity of sand reservoirs, Trans. Am. Inst. Min. Metall. Pet. Eng., 155, 202–222, 1944.
Lebbe, L., and W. De Breuck, Validation of an inverse numerical model for interpretation of pumping tests and a study of the factors influencing accuracy of results, J. Hydrol., 172, 61–85, 1995.
Matheron, G., Traite de Geostatistique Appliquee, vol. 1, 334 pp., Technip, Paris, 1962.
McKenna, S. A., and E. P. Poeter, Field example of data fusion in site characterization, Water Resour. Res., 31(12), 3229–3240, 1995.
McKenna, S. A., L. C. Meigs, and R. Haggerty, Tracer tests in a fractured dolomite, 3, Double porosity, multiple-rate mass transfer processes in convergent flow tracer tests, Water Resour. Res., 37(5), 1143–1154, 2001.
Meier, P. M., J. Carrera, A. Medina, and L. Vives, Inverse geostatistical modeling of ground water flow within a shear-zone in granite, in Proceedings of IAMG 1997: The Third Annual Conference of the International Association for Mathematical Geology, vol. 2, edited by V. Pawlowsky-Glahn, pp. 755–776, Int. Cent. of Numer. Methods in Eng., Barcelona, Spain, 1997.
Meier, P. M., J. Carrera, and X. Sanchez-Vila, An evaluation of Jacob's method for the interpretation of pumping tests in heterogeneous formations, Water Resour. Res., 34(5), 1011–1025, 1998.
Meigs, L. C., and R. L. Beauheim, Tracer tests in a fractured dolomite, 1, Experimental design and observed tracer recoveries, Water Resour. Res., 37(5), 1113–1128, 2001.
Pinder, G. F., and J. D. Bredehoeft, Application of digital computers for aquifer evaluation, Water Resour. Res., 4(5), 1069–1093, 1968.
RamaRao, B. S., A. M. LaVenue, G. de Marsily, and M. G. Marietta, Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 1, Theory and computational experiments, Water Resour. Res., 31(3), 475–493, 1995.
Reeves, M., D. S. Ward, N. D. Johns, and R. M. Cranwell, Theory and implementation for SWIFT II, The Sandia Waste-Isolation Flow and Transport Model for fractured media, release 4.84, Rep. SAND83-1159, NUREG/CR-3328, Sandia Natl. Lab., Albuquerque, N. M., 1986.
Sahuquillo, A., J. E. Capilla, J. J. Gomez-Hernandez, and J. Andreu, Conditional simulation of transmissivity fields honouring piezometric data, in Fluid Flow Modeling, edited by W. R. Blain and E. Cabrera, pp. 201–212, Comput. Mech., Billerica, Mass., 1992.
Theis, C. V., The relation between the lowering of the piezometric surface and the rate and duration of discharge of a well using groundwater storage, Eos Trans. AGU, 16, 519–524, 1935.
Zimmerman, D. A., et al., Comparison of seven geostatistically based inverse approaches to estimate transmissivities for modeling advective transport by groundwater flow, Water Resour. Res., 34(6), 1373–1413, 1998.

G. de Marsily, Laboratoire de Geologie Appliquee, Universite Paris VI, 4 Place Jussieu, 75230 Paris Cedex 05, France. ([email protected])
M. Lavenue, INTERA Inc., 936 Poplar Place, Boulder, CO 80304, USA. ([email protected])

(Received August 5, 1999; revised August 3, 2000; accepted September 12, 2000.)

5.0 CONCLUSIONS

The research presented in this dissertation has led to the development of the GRASP-INV inverse code, which contains numerous improvements to the Pilot Point Technique of solving the inverse problem for groundwater flow. Each of these improvements was thoroughly tested and subsequently used in applications of the Pilot Point Technique to automatically calibrate local- and regional-scale groundwater flow models. Over the last five years, the major improvements embedded in GRASP-INV include:

• the capability to locate pilot points optimally,

• the capability to generate and subsequently calibrate an ensemble of conditionally simulated transmissivity fields to steady-state or transient-state flow conditions,

• the capability to generate conditional simulations of multi-category transmissivity fields, such as found in fractured and unfractured domains, using parametric and non-parametric geostatistical techniques and to automatically calibrate these fields to steady-state or transient-state flow conditions, and

• the capability to solve fully three-dimensional inverse problems with the Pilot Point Technique.

Site characterization activities at the Waste Isolation Pilot Plant (WIPP) in southeastern New Mexico have provided an exhaustive hydrogeologic data set of transmissivities and steady-state and transient-state heads resulting from numerous local-scale and regional-scale pumping tests. In 1992 and 1996, GRASP-INV was used in the WIPP program's performance assessment (PA) calculations to calibrate a regional-scale flow model at WIPP. The objective of the modeling was to calibrate an ensemble of conditionally simulated transmissivity fields to steady-state and transient-state conditions and to subsequently predict groundwater travel times from the center of the WIPP-facility area to the southern WIPP-site boundary.


In 1992, the ensemble of transmissivity fields was generated using the TUBA and AKRIP codes and the residual sewing method. The results of the 1992 study showed that groundwater travel times were sensitive to the location of the boundary between the fractured and unfractured areas of the aquifer, the Culebra dolomite. These results also identified the need for a more robust geostatistical simulation routine in the generation of the Culebra transmissivity field's fractured and unfractured domains. This conclusion was also confirmed by the Geostatistics Expert Group's (GXG) two-year comparative study of linear and non-linear indirect inverse methods. The GXG, convened by Sandia National Laboratories, conducted a series of tests on seven different inverse methods and concluded that the proper selection of the semivariogram of the Log(T) field, and the ability to geostatistically simulate it, had a significant impact on the accuracy and precision of travel time predictions made by an inverse method. In addition, the GXG study showed that non-stationarity of the “true” transmissivity field, or the presence of “anomalies” such as high-permeability fracture zones, was handled better by non-linear methods than by the linearized methods. Thus, in 1995, a new geostatistical ‘front end’ was added to GRASP-INV in preparation for the 1996 PA at WIPP. The new front end uses a two-step geostatistical procedure to generate conditional simulations of transmissivity fields. Categorical indicator simulation is first performed to obtain the spatial distribution of indicators representing fractured or unfractured media. Then, the spatial variability within each of these categories is subsequently ‘filled in’ using the associated semivariogram models and the sequential Gaussian simulation technique. This allows GRASP-INV2 to optimize the properties within the fractured and unfractured domains ‘independently’ while matching steady-state or transient-state head data.
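The two-step idea can be caricatured as follows. This sketch draws a category per cell and then fills in a log conductivity from that category's own distribution; it deliberately omits the semivariograms, the sequential spatial simulation, and the conditioning to borehole data that the real indicator and sequential Gaussian simulators perform, and all parameter values are hypothetical.

```python
import random

def simulate_field(n_cells, p_fractured, category_stats, seed=0):
    """Two-step sketch: (1) draw a fractured/unfractured category for
    each cell, (2) fill in log10 conductivity from that category's own
    distribution. Spatial correlation and conditioning are omitted."""
    rng = random.Random(seed)
    field = []
    for _ in range(n_cells):
        cat = "fractured" if rng.random() < p_fractured else "unfractured"
        mean, std = category_stats[cat]
        field.append((cat, rng.gauss(mean, std)))
    return field

# Hypothetical per-category means and standard deviations of log10 K (m/s)
stats = {"fractured": (-4.5, 0.5), "unfractured": (-7.5, 0.5)}
field = simulate_field(1000, p_fractured=0.8, category_stats=stats, seed=42)
```

Because each category keeps its own distribution, the fractured and unfractured populations remain geostatistically decoupled, which is what produces the sharp boundary between them instead of a blended intermediate zone.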
In the WIPP 1996 PA, one hundred Culebra transmissivity fields were conditionally simulated using the measured transmissivities and the data describing the occurrence of fracturing. These fields were subsequently calibrated to an extensive set of steady-state and transient-state heads. The ability of the GRASP-INV code to separately optimize the properties of the areas associated with diagenetically altered (that is, higher) transmissivity and unaltered (that is, lower) transmissivity in the Culebra improved the capability of the model to obtain good agreement between the observed and calculated steady-state and transient heads. The 100 transmissivity fields incorporated the effects of variable elevation and variable fluid density upon the flow fields and were used to calculate groundwater travel times to the WIPP-site boundary. The transmissivity fields generated in the 1996 study had a much higher variability than those produced in the 1992 PA study. This variability was due to the simulation of the uncertain location of the boundary between the fractured and unfractured areas of the aquifer. The geostatistical decoupling of the high transmissivity zone and the low transmissivity zone produced a sharper boundary between the lower transmissivity (i.e., unfractured) and higher transmissivity (i.e., fractured) parts of the aquifer. This eliminated the blending (i.e., averaging) of the fractured and unfractured transmissivities which had resulted in the development of an ‘intermediate’ transmissivity zone between the fractured and unfractured areas in the 1992 study. The global range of uncertainty in the transmissivity remaining in the calibrated fields impacted the uncertainty in the groundwater travel times. A travel time CDF produced in

the 1996 PA study resulted in significantly lower groundwater travel times relative to the 1992 groundwater travel times. It was concluded that the elimination of the ‘intermediate’ transmissivity zone, mentioned above, was the main factor in reducing groundwater travel times to the southern WIPP-site boundary. In 1997 and 1998, modifications to GRASP-INV2 to solve three-dimensional groundwater flow inverse problems were implemented and applied to a three-dimensional pump test in the Culebra dolomite. The pump test, conducted as part of the 1996 H-19 tracer test, employed a sinusoidal pumping rate. The Culebra was isolated into two vertical sections, and pumping was conducted in the upper and lower sections with transducers monitoring the pressure responses in the upper and lower sections of six nearby (<40 m) monitoring wells. The packers were vertically set in an attempt to isolate the lower, highly fractured portion of the Culebra dolomite from the upper, non-fractured portion. One hundred three-dimensional transmissivity fields were simulated and subsequently calibrated to three sinusoidal pumping tests conducted at the H-19b0 borehole. A fourth three-dimensional pumping test conducted at the H-19b4 borehole was used to validate the inverse results. The calibrated fields depict the variation in the location of the boundary between the fractured and unfractured portions of the upper Culebra. In general, the unfractured part of the Culebra resides in the western and northern areas of the upper Culebra. The lower Culebra is fractured and has transmissivities several orders of magnitude higher than the upper Culebra. It was concluded that the calibrated transmissivity fields reproduce the drawdowns observed in the upper and lower Culebra at the H-19 hydropad very well. Also, the predicted drawdowns due to pumping at H-19b4, the validation test, agree very closely with the actual drawdowns measured in the field.
In addition, the ensemble mean log10 transmissivity over the three-dimensional domain agrees closely with an earlier hydropad-scale log10 transmissivity value interpreted from a single-well test at the H-19 hydropad. Advective travel times from the surrounding H-19 hydropad boreholes due to pumping at the central H-19b0 borehole compared favorably with the observed centers of mass of the tracer-breakthrough curves in the fractured portions of the lower Culebra. However, the advective travel times were much higher than the tracer arrival times in the upper, unfractured Culebra. This implies that a discrepancy exists between the parameters needed to match the hydraulic response of the upper aquifer and those consistent with transport in the upper aquifer. The research and conclusions documented in this dissertation have led to the following recommendations for future research: 1. A direct representation of the vertical permeability field during optimization in three-dimensional applications is needed. This recommendation originates from the fact that, currently, vertical permeability is determined from an anisotropy ratio applied to the horizontal permeability in a grid block. Thus, the vertical permeability is only modified through a modification to the horizontal permeability. The vertical and horizontal permeability fields could be ‘decoupled’ by adding a separate set of “vertical” pilot points that only modify vertical permeability in a grid block. Computing the sensitivity of the performance measure to the addition of vertical pilot points would then be a straightforward modification to the Pilot-Point Technique.


2. A direct representation of storativity in the optimization process would benefit the Pilot Point Technique in transient-state applications. Currently, storativity is a fixed, unaltered parameter in the optimization routine. GRASP-INV2 has the capability to solve for the sensitivity of a transient-state objective function to grid block storativity. However, this capability needs to be included as part of the parameterization of the inverse problem to facilitate a successful inverse solution.

3. Optimizing boundary conditions for steady-state or transient-state applications would be a contribution to the Pilot Point Technique. Of particular use for three-dimensional well-test analysis would be the optimization of the Carter-Tracy aquifer boundary parameters.

4. Modification of the objective function to incorporate some measure of single-porosity transport would increase the reliability of the inverse solutions produced by the Pilot Point Technique. The objective function could include a center-of-mass arrival time at a point in space or a concentration at a borehole location at a particular time. This would allow the simultaneous optimization of hydraulic and transport performance measures.


6.0 REFERENCES

Ahmed, S., Estimation des transmissivites des aquiferes par methodes geostatistiques et resolution indirecte du probleme inverse, Ph.D. thesis, Paris School of Mines, Fontainebleau, France, 1987.
Ahmed, S., and G. de Marsily, Comparison of geostatistical methods for estimating transmissivity using data on transmissivity and specific capacity, Water Resour. Res., 23(9), 1717-1737, 1987.
Bakr, A.A., Stochastic analysis of the effects of spatial variations in hydraulic conductivity on groundwater flow, PhD dissertation, New Mexico Institute of Mining and Technology, Socorro, New Mexico, 1976.
Beauheim, R.L., Hydraulic-Test Interpretations for Well DOE-2 at the Waste Isolation Pilot Plant (WIPP) Site, SAND86-1364, Sandia National Laboratories, Albuquerque, NM, 1986.
Beauheim, R.L., Analysis of Pumping Tests of the Culebra Dolomite Conducted at the H-3 Hydropad at the Waste Isolation Pilot Plant (WIPP) Site, SAND86-2311, Sandia National Laboratories, Albuquerque, NM, 1987a.
Beauheim, R.L., Interpretations of the WIPP-13 Multipad Pumping Test of the Culebra Dolomite at the Waste Isolation Pilot Plant (WIPP) Site, SAND87-2456, Sandia National Laboratories, Albuquerque, NM, 1987b.
Beauheim, R.L., Interpretations of Single-Well Hydraulic Tests Conducted At and Near the Waste Isolation Pilot Plant (WIPP) Site, 1983-1987, SAND87-0039, Sandia National Laboratories, Albuquerque, NM, 1987c.
Beauheim, R.L., Interpretation of H-11b4 Hydraulic Tests and the H-11 Multipad Pumping Test of the Culebra Dolomite at the Waste Isolation Pilot Plant (WIPP) Site, SAND89-0536, Sandia National Laboratories, Albuquerque, NM, 1989.


Beauheim, R.L., Identification of spatial variability and heterogeneity of the Culebra Dolomite at the Waste Isolation Pilot Plant site, in Proceedings: NEA Workshop on Heterogeneity of Groundwater Flow and Site Evaluation, Paris, France, 22-24 October 1990, pp. 131-142, OECD/NEA, Paris, France, 1991.
Binsariti, A.A., Statistical analysis and stochastic modeling of the Cortaro aquifer in southern Arizona, PhD dissertation, Dept. of Hydrology and Water Resources, Univ. of Arizona, Tucson, Arizona, 1980.
Carrera, J., State of the art of the inverse problem applied to the flow and solute transport equations, in Groundwater Flow and Quality Modelling, NATO ASI Ser., vol. 224, pp. 549-585, Kluwer, Boston, MA, 1988.
Carrera, J., INVERT-4: A Fortran program for solving the groundwater flow inverse problem, User's guide, CIMNE Technical Report, 160 pp., 1994.
Carrera, J., and L. Glorioso, On geostatistical formulations of the groundwater flow inverse problem, Adv. Water Resour., 14(5), 273-283, 1991.
Carrera, J., and A. Medina, An improved form of adjoint-state equations for transient problems, in Computational Methods in Water Resources, pp. 199-206, Kluwer, Boston, MA, 1994.
Carrera, J., A. Medina, C. Axness, and T. Zimmerman, Formulations and computational issues of the inversion of random fields, Proceedings of UNESCO Conference, Paris, November 1994.
Carrera, J., F. Navarrina, L. Vives, J. Heredia, and A. Medina, Computational aspects of the inverse problem, Comp. Meth. in Wat. Resour., 513-523, 1990.
Carrera, J., and S.P. Neuman, Estimation of aquifer parameters under transient and steady state conditions, 1, Maximum likelihood method incorporating prior information, Water Resour. Res., 22(2), 199-210, 1986a.
Carrera, J., and S.P. Neuman, Estimation of aquifer parameters under transient and steady state conditions, 2, Uniqueness, stability, and solution algorithms, Water Resour. Res., 22(2), 211-227, 1986b.
Carrera, J., and S.P. Neuman, Estimation of aquifer parameters under transient and steady state conditions, 3, Application to synthetic and field data, Water Resour. Res., 22(2), 228-242, 1986c.
Carrera, J., A. Medina, and G. Galarza, Groundwater inverse problem: Discussion on geostatistical formulations and validation, Hydrogeologie, 4, 313-324, 1993.
Cauffman, T.L., A.M. LaVenue, and J.P. McCord, Ground-water flow modeling of the Culebra Dolomite, Volume II: Data base, SAND89-7068/2, Sandia National Laboratories, Albuquerque, NM, 1990.


Certes, C., Analyse et resolution du probleme d'identification de parametres spatialement repartis dans les modeles d'ecoulement souterrain: Mise en oeuvre de la methode des mailles pilotes, These de Doctorat, Ecole des Mines de Paris, May 1990.
Certes, C., and G. de Marsily, Application of the pilot point method to the identification of aquifer transmissivities, Adv. Water Resour., 14(5), 284-300, 1991.
Chavent, G., Analyse Fonctionnelle et Identification des Coefficients Repartis dans les Equations aux Derivees Partielles, These d'Etat en Mathematiques, Paris VI, 1971.
Chavent, G., History matching by use of optimal control theory, Soc. Petroleum Eng. Jour., 15(1), 74-86, 1975.
Chen, W.H., G.R. Gavalas, J.G. Seinfeld, and M.L. Wasserman, A new algorithm for automatic history matching, Trans. AIME, 257, 593-608, 1974.
Christiansen, H., M.C. Hill, D. Rosbjerg, and K.H. Jensen, Three-dimensional inverse modeling using heads and concentrations at a Danish landfill, in Models for Assessing and Monitoring Groundwater Quality (B.J. Wagner and T. Illangasekare, eds.), Proceedings of the IAHS-IUGG XXI General Assembly, Boulder, CO, pp. 167-175, 1995.
Clifton, P.M., and S.P. Neuman, Effects of kriging and inverse modeling on conditional simulation of the Avra Valley aquifer in southern Arizona, Water Resour. Res., 18(4), 1215-1234, 1982.
Coats, K.H., J.R. Dempsey, and J.H. Henderson, A new technique for determining reservoir description from field performance data, Soc. Pet. Eng. Jour., 10(1), 66-74, 1970.
Cooley, R.L., A method of estimating parameters and assessing reliability for models of steady state groundwater flow: 1. Theory and numerical properties, Water Resour. Res., 13(2), 318-324, 1977.
Cooley, R.L., A method of estimating parameters and assessing reliability for models of steady state groundwater flow: 2. Application of statistical analysis, Water Resour. Res., 15(3), 603-617, 1979.
Cooley, R.L., Incorporation of prior information on parameters into nonlinear regression groundwater flow models: 1. Theory, Water Resour. Res., 18(4), 965-976, 1982.
Cooley, R.L., Incorporation of prior information on parameters into nonlinear regression groundwater flow models: 2. Applications, Water Resour. Res., 19(3), 662-676, 1983.
Copty, N., Y. Rubin, and G. Mavko, Geophysical-hydrological identification of field permeabilities through Bayesian updating, Water Resour. Res., 29(8), 2813-2825, 1993.
Dagan, G., Stochastic modeling of groundwater flow by unconditional and conditional probabilities: The inverse problem, Water Resour. Res., 21(1), 65-72, 1985.


Dagan, G., and Y. Rubin, Stochastic identification of recharge, transmissivity and storativity in aquifer transient flow: A quasi-steady approach, Water Resour. Res., 24(10), 1698-1710, 1988.
Day, M.C., and B.W. Hunt, Groundwater transmissivities in north Canterbury, New Zealand, New Zealand Journal of Hydrology, 16(2), 158-163, 1977.
Delhomme, J.P., and P. Delfiner, Application du krigeage a l'optimisation d'une campagne pluviometrique en zone aride, in Proceedings of the Symposium on the Design of Water Resources Projects with Inadequate Data, (2), pp. 191-210, UNESCO, Madrid, Spain, 1973.
Delhomme, J.P., Application de la theorie des variables regionalisees dans les sciences de l'eau, These, Univ. Paris VI, 1978.
Delhomme, J.P., Spatial variability and uncertainty in groundwater flow parameters: A geostatistical approach, Water Resour. Res., 15(2), 269-280, 1979.
Deutsch, C.V., and A.G. Journel, The application of simulated annealing to stochastic reservoir modeling, in Report 4, Stanford Center for Reservoir Forecasting, Stanford, CA, May 1991.
Deutsch, C.V., and A.G. Journel, GSLIB: Geostatistical Software Library and User's Guide, Oxford University Press, New York, 1992.
Emsellem, Y., and G. de Marsily, An automatic solution for the inverse problem, Water Resour. Res., 7(5), 1264-1283, 1971.
Eppstein, M.J., and D.E. Dougherty, Simultaneous estimation of transmissivity values and zonation, Water Resour. Res., 32(11), 3321-3336, 1996.
Fennessy, P.J., Geostatistical analysis and stochastic modeling of the Tajo Basin Aquifer, Spain, M.S. thesis, Dept. of Hydrol. and Water Resour., Univ. of Arizona, Tucson, Arizona, 1982.
Finley, N.C., and M. Reeves, SWIFT Self-Teaching Curriculum: Illustrative Problems to Supplement the User's Manual for the Sandia Waste-Isolation Flow and Transport Model (SWIFT), NUREG/CR-1968, SAND81-0410, Sandia National Laboratories, Albuquerque, NM, 1981.
Freeze, R.A., A stochastic-conceptual analysis of one-dimensional groundwater flow in nonuniform homogeneous media, Water Resour. Res., 11(5), 725-741, 1975.
Frind, E.O., and G.F. Pinder, Galerkin solution of the inverse problem for aquifer transmissivity, Water Resour. Res., 9(5), 1397-1410, 1973.
Gelhar, L.W., Effects of hydraulic conductivity variation on groundwater flows, paper presented at 2nd International Symposium on Stochastic Hydraulics, Int. Assoc. for Hydraul. Res., Lund, Sweden, 1976.
Gill, P.E., W. Murray, and M.H. Wright, Practical Optimization, Academic Press, New York, NY, 1981.


Ginn, T.R., and J.H. Cushman, Inverse methods for subsurface flow: A critical review of stochastic techniques, Stochast. Hydrol. Hydraul., 4(1), 1-26, 1990.
Giudici, M., G. Morossi, G. Parravicini, and G. Ponzini, A new method for the identification of distributed transmissivities, Water Resour. Res., 31(8), 1969-1988, 1995.
Gomez-Hernandez, J.J., and R.M. Srivastava, ISIM3D: An ANSI-C three-dimensional multiple indicator conditional simulation program, Computers and Geosciences, 16(4), 395-440, 1990.
Gonzalez, R.V., M. Giudici, G. Ponzini, and G. Parravicini, The differential system method for the identification of transmissivity and storativity, Transport in Porous Media, 26, 339-371, 1997.
Gotway, C.A., and B.M. Rutherford, Stochastic simulation for imaging spatial uncertainty: Comparison and evaluation of available algorithms, in Proceedings of the Workshop on Geostatistical Simulation, May 27-28, 1993, Fontainebleau, France, 1993.
Grindrod, P., and M.D. Impey, Fractal field simulations of tracer migration within the WIPP Culebra Dolomite, Report, Intera Information Technologies, Dec. 1991.
Gutjahr, A.L., and J.L. Wilson, Co-kriging for stochastic flow models, Transport in Porous Media, 4(6), 585-598, 1989.
Gutjahr, A., B. Bullard, S. Hatch, and L. Hughson, Joint conditional simulations and the spectral method approach for flow modeling, Stoch. Hydrol. Hydraul., 8(1), 79-108, 1994.
Harvey, C.F., and S.M. Gorelick, Mapping hydraulic conductivity: Sequential conditioning with measurements of solute arrival time, hydraulic head and local conductivity, Water Resour. Res., 31(7), 1615-1626, 1995.
Hawkins, D.B., and D.B. Stephens, Ground-water modeling in a southwestern alluvial basin, Groundwater, 21(6), 733-739, 1983.
Hill, M., A computer program (MODFLOWP) for estimating parameters of a transient, three-dimensional groundwater flow model using nonlinear regression, USGS Open-File Report 91-484, 358 pp., 1992.
Hoeksema, R.J., and P.K. Kitanidis, An application of the geostatistical approach to the inverse problem in two-dimensional groundwater modeling, Water Resour. Res., 20(7), 1003-1020, 1984.
Hunt, B.W., and D.J. Wilson, Graphical calculation of aquifer transmissivities in northern Canterbury, New Zealand, New Zealand Journal of Hydrology, 13(2), 66-80, 1974.
Jacobson, E., A statistical parameter estimation method using singular value decomposition with application to Avra Valley in southern Arizona, Ph.D. dissertation, Univ. of Arizona, Tucson, 1985.
Jahns, H.O., A rapid method for obtaining a two-dimensional reservoir description from well pressure response data, Soc. Pet. Eng. Jour., 315-327, 1966.


Journel, A.G., and Ch.J. Huijbregts, Mining Geostatistics, Academic Press, London, U.K., and New York, 1978.
Kitanidis, P.K., Quasi-linear geostatistical theory for inversing, Water Resour. Res., 31(10), 2411-2419, 1995.
Kitanidis, P.K., and R.W. Lane, Maximum likelihood parameter estimation of hydrologic spatial processes by the Gauss-Newton method, J. Hydrol., 79(1-2), 53-71, 1985.
Kitanidis, P.K., and E.G. Vomvoris, A geostatistical approach to the inverse problem in groundwater modeling (steady state) and one-dimensional simulations, Water Resour. Res., 19(3), 677-690, 1983.
Kleinecke, D., Use of linear programming for estimating geohydrologic parameters of groundwater basins, Water Resour. Res., 7(2), 367-374, 1971.
Lappin, A.R., Summary of site-characterization studies conducted from 1983 through 1987 at the Waste Isolation Pilot Plant (WIPP) site, southeastern New Mexico, SAND88-0157, Sandia National Laboratories, Albuquerque, NM, 1988.
Lavenue, A.M., T.L. Cauffman, and J.F. Pickens, Ground-Water Flow Modeling of the Culebra Dolomite, Volume I: Model Calibration, SAND89-7068/1, Sandia National Laboratories, Albuquerque, NM, 1990.
Lavenue, A.M., and J.F. Pickens, Application of a coupled adjoint-sensitivity and kriging approach to calibrate a groundwater flow model, Water Resour. Res., 28(6), 1543-1569, 1992.
Lavenue, A.M., and B.S. RamaRao, A modeling approach to address spatial variability within the Culebra Dolomite transmissivity field, SAND92-7306, Sandia National Laboratories, Albuquerque, NM, 1992.
Lavenue, A.M., B.S. RamaRao, G. de Marsily, and M.G. Marietta, Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields: 2. Application, Water Resour. Res., 31(3), 495-516, 1995.
Law, J., A statistical approach to the interstitial heterogeneity of sand reservoirs, Trans. Am. Inst. Min. Metall. Pet. Eng., 155, 202-222, 1944.
Loaiciga, H.A., and M.A. Marino, The inverse problem for confined aquifer flow: Identification and estimation with extensions, Water Resour. Res., 23(1), 92-104, 1987.
Luenberger, D.G., Introduction to Linear and Nonlinear Programming, Addison-Wesley Publishing Co., Reading, MA, 1973.
Mantoglou, A., and J.L. Wilson, The turning bands method for simulation of random fields using line generation by a spectral method, Water Resour. Res., 18(5), 1379-1394, 1982.
Marsily, G. de, De l'identification des systemes hydrogeologiques (Tome 1), Doctoral thesis, L'Universite Pierre et Marie Curie - Paris VI, pp. 58-130, 1978.


Marsily, G. de, Spatial variability of properties in porous media: A stochastic approach, NATO Advanced Study Institute, Proceedings of Mechanics of Fluids in Porous Media, Newark, DE, July 18-28, 1982.
Marsily, G. de, G. Lavedan, M. Boucher, and G. Fasanino, Interpretation of interference tests in a well field using geostatistical techniques to fit the permeability distribution in a reservoir model, in Geostatistics for Natural Resources Characterization, 2nd NATO Advanced Study Institute, South Lake Tahoe, CA, September 6-17, 1983, G. Verly, M. David, A.G. Journel, and A. Marechal (eds.), D. Reidel, Hingham, MA, Pt. 2, 831-849, 1984.
Matheron, G., Traite de Geostatistique Appliquee, vol. 1, 334 pp., Technip, Paris, 1962.
Matheron, G., Principles of geostatistics, Econ. Geol., 58, 1246-1266, 1963.
Matheron, G., The Theory of Regionalized Variables and its Applications, Ecole Nationale Superieure des Mines de Paris, Paris, France, 1971.
Matheron, G., The intrinsic random functions and their applications, Adv. Appl. Prob., 5(3), 439-468, 1973.
McLaughlin, D., Investigation of alternative procedures of estimating groundwater basin parameters, final report to OWRT, Water Resour. Eng., Walnut Creek, CA, 1975.
McLaughlin, D., and L.R. Townley, A reassessment of the groundwater inverse problem, Water Resour. Res., 32(5), 1131-1161, 1996.
Medina, A., and J. Carrera, Coupled estimation of flow and solute transport parameters, Water Resour. Res., 32(10), 3063-3076, 1996.
Mejia, J.M., and I. Rodríguez-Iturbe, On the synthesis of random field sampling from the spectrum: An application to the generation of hydrologic spatial processes, Water Resour. Res., 10(4), 705-711, 1974.
Nelson, R.W., In-place measurement of permeability in heterogeneous media: 1. Theory of a proposed method, Jour. of Geophysical Res., 65(6), 1753-1760, 1960.
Nelson, R.W., In-place measurement of permeability in heterogeneous media: 2. Experimental and computational considerations, Jour. of Geophysical Res., 66, 2469-2478, 1961.
Nelson, R.W., In-place determination of permeability distribution of heterogeneous porous media through analysis of energy dissipation, Soc. Pet. Eng. Jour., (3), 33-42, 1968.
Nelson, R.W., and D.B. Cearlock, Analysis and predictive methods for ground water flow in large heterogeneous systems, in Proceedings of the National Symposium on Ground-Water Hydrology, American Water Resources Association, San Francisco, CA, November 1967.


Neuman, S.P., Calibration of distributed parameter groundwater flow models viewed as a multiple-objective decision process under uncertainty, Water Resour. Res., 9(4), 1006-1021, 1973.
Neuman, S.P., The inverse problem of groundwater hydrology, in Proceedings of IBM International Seminar on Regional Groundwater Hydrology and Modeling, pp. 210-249, IBM, Venice, Italy, 1976.
Neuman, S.P., A statistical approach to the inverse problem of aquifer hydrology: 3. Improved solution method and added perspective, Water Resour. Res., 16(2), 331-346, 1980.
Neuman, S.P., Role of geostatistics in subsurface hydrology, in Geostatistics for Natural Resources Characterization (G. Verly, M. David, A.G. Journel, and A. Marechal, eds.), Proc. NATO-ASI, Part 1, pp. 787-816, Reidel, Dordrecht, The Netherlands, 1984.
Neuman, S.P., and S. Yakowitz, A statistical approach to the inverse problem of aquifer hydrology: 1. Theory, Water Resour. Res., 15(4), 845-860, 1979.
Neuman, S.P., G.E. Fogg, and E.A. Jacobson, A statistical approach to the inverse problem of aquifer hydrology: 2. Case study, Water Resour. Res., 16, 33-58, 1980.
Peck, A., S. Gorelick, G. de Marsily, S. Foster, and V. Kovalevsky, Consequences of Spatial Variability in Aquifer Properties and Data Limitations for Groundwater Modelling Practice, IAHS Publication 175, 272 pp., 1988.
Poeter, E.P., and M.C. Hill, Inverse models: A necessary next step in ground-water modeling, Groundwater, 35(2), 250-260, 1997.
Ponzini, G., and A. Lozej, Identification of aquifer transmissivities: The comparison model method, Water Resour. Res., 18(3), 597-622, 1982.
Ponzini, G., G. Crosta, and M. Giudici, Identification of thermal conductivities by temperature gradient profiles: One-dimensional steady flow, Geophysics, 54, 643-653, 1989.
RamaRao, B.S., A.M. LaVenue, G. de Marsily, and M.G. Marietta, Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields: 1. Theory and computational experiments, Water Resour. Res., 31(3), 475-493, 1995.
RamaRao, B.S., and M. Reeves, Theory and Verification for the GRASP II Code for Adjoint-Sensitivity Analysis of Steady-State and Transient Ground-Water Flow, SAND89-7143, Sandia National Laboratories, Albuquerque, NM, 1990.
Reeves, M., D.S. Ward, N.D. Johns, and R.M. Cranwell, Theory and Implementation for SWIFT II, The Sandia Waste-Isolation Flow and Transport Model for Fractured Media, Release 4.84, SAND83-1159, NUREG/CR-3328, Sandia National Laboratories, Albuquerque, NM, 1986a.
Reeves, M., D.S. Ward, N.D. Johns, and R.M. Cranwell, Data Input Guide for SWIFT II, The Sandia Waste-Isolation Flow and Transport Model for Fractured Media, Release 4.84, SAND83-0242, NUREG/CR-3162, Sandia National Laboratories, Albuquerque, NM, 1986b.


Rice, W.A., Error and uncertainty analysis of a direct inverse technique with application to the Hanford site, M.S. thesis, University of Washington, Seattle, Washington, 1983.
Rice, W.A., and S.M. Gorelick, Geologic inference from "flow net" transmissivity determination: Three case studies, Water Resour. Bull., 21(6), 919-929, 1985.
Robin, M.J.L., A.L. Gutjahr, E.A. Sudicky, and J.L. Wilson, Cross-correlated random field generation with the direct Fourier transform method, Water Resour. Res., 29(7), 2385-2397, 1993.
Roth, C., C. de Fouquet, J.P. Chiles, and G. Matheron, Geostatistics applied to hydrogeology's inverse problem: Taking boundary conditions into account, in Proceedings of the 5th International Geostatistics Congress, Wollongong, Sept. 23-27, 1996.
Roth, C., J.P. Chiles, and C. de Fouquet, Combining geostatistics and numerical flow simulators to affront hydrogeology's inverse problem, submitted to Water Resour. Res., 1997.
Rubin, Y., Prediction of tracer plume migration in disordered porous media by the method of conditional probabilities, Water Resour. Res., 27(6), 1291-1308, 1991a.
Rubin, Y., Transport in heterogeneous porous media: Prediction and uncertainty, Water Resour. Res., 27(7), 1723-1738, 1991b.
Rubin, Y., and G. Dagan, Stochastic identification of transmissivity and effective recharge in steady groundwater flow: 1. Theory, Water Resour. Res., 23(7), 1185-1192, 1987a.
Rubin, Y., and G. Dagan, Stochastic identification of transmissivity and effective recharge in steady groundwater flow: 2. Case study, Water Resour. Res., 23(7), 1193-1200, 1987b.
Rubin, Y., and G. Dagan, Conditional estimation of solute travel time in heterogeneous formations: Impact of transmissivity measurements, Water Resour. Res., 28(4), 1033-1040, 1992.
Rubin, Y., and A.J. Journel, Simulation of non-Gaussian space random functions for modeling transport in groundwater, Water Resour. Res., 27(7), 1711-1721, 1991.
Rubin, Y., G. Mavko, and J. Harris, Mapping permeability in heterogeneous aquifers using hydrologic and seismic data, Water Resour. Res., 28(7), 1809-1816, 1992.
Sagar, B., S. Yakowitz, and L. Duckstein, A direct method for the identification of the parameters of dynamic nonhomogeneous aquifers, Water Resour. Res., 11(4), 563-570, 1975.
Sahuquillo, A., J. Capilla, J.J. Gomez-Hernandez, and J. Andreu, Conditional simulation of transmissivity fields honouring piezometric data, in Fluid Flow Modelling (W.R. Blain and E. Cabrera, eds.), Computational Mechanics Publications, Elsevier Applied Science, pp. 201-212, 1992.
Scarascia, S., and G. Ponzini, An approximate solution for the inverse problem in hydraulics, Energ. Elet., 49, 518-531, 1972.


Shinozuka, M., and C.-M. Jan, Digital simulation of random processes and its applications, Journal of Sound and Vibration, 25(1), 111-128, 1972.
Smith, L., and R.A. Freeze, Stochastic analysis of steady state groundwater flow in a bounded domain: 2. Two-dimensional simulations, Water Resour. Res., 15(6), 1543-1559, 1979.
Smith, L., and F.W. Schwartz, Mass transport: 2. Analysis of uncertainty in prediction, Water Resour. Res., 17(2), 351-369, 1981.
Sun, N.Z., Inverse Problems in Groundwater Modeling, Kluwer Acad., Boston, MA, 337 pp., 1994.
Sun, N.Z., and W.W.G. Yeh, A stochastic inverse solution for transient groundwater flow: Parameter identification and reliability analysis, Water Resour. Res., 28(12), 3269-3280, 1992.
Sykes, J.F., J.L. Wilson, and R.W. Andrews, Sensitivity analysis for steady-state groundwater flow using adjoint operators, Water Resour. Res., 21(3), 359-371, 1985.
Thomas, L.K., L.J. Hellums, and G.M. Reheis, A nonlinear automatic history matching technique for reservoir simulation models, Soc. Pet. Eng. Jour., 12(6), 508-514, 1972.
Townley, L.R., and J.L. Wilson, Computationally efficient algorithms for parameter estimation and uncertainty propagation in numerical models of groundwater flow, Water Resour. Res., 21(12), 1851-1860, 1985.
U.S. EPA (Environmental Protection Agency), 40 CFR 191: Environmental standards for the management and disposal of spent nuclear fuel, high-level and transuranic radioactive wastes; Final rule, Fed. Reg., 50(82), 38066-38089, 1985.
Van Geer, F.C., C.B.M. Te Stroet, and Z. Yangxiao, Using Kalman filtering to improve and quantify the uncertainty of numerical groundwater simulations: 1. The role of system noise and its calibration, Water Resour. Res., 27(8), 1987-1994, 1991.
Ward, D.S., M. Reeves, and L.E. Duda, Verification and Field Comparison of the Sandia Waste-Isolation Flow and Transport Model (SWIFT), NUREG/CR-3316, SAND83-1154, Sandia National Laboratories, Albuquerque, NM, 1984.
Wilson, J.L., The synthetic generation of areal averages of a random field, Socorro Workshop on Stochastic Methods in Subsurface Hydrology, Socorro, NM, April 26-27, 1979. (Copy on file at the Waste Management and Transportation Library, Sandia National Laboratories, Albuquerque, NM.)
Wilson, J.L., P.K. Kitanidis, and M. Dettinger, State and parameter estimation in groundwater models, in Applications of Kalman Filter to Hydrology, Hydraulics, and Water Resources (C-L. Chiu, ed.), pp. 657-679, University of Pittsburgh, Pittsburgh, PA, 1978.
Wilson, J.L., B.S. RamaRao, and J.A. McNeish, GRASP: A Computer Code to Perform Post-SWENT Adjoint Sensitivity Analysis of Steady-State Ground-Water Flow, BMI/ONWI-625, Office of Nuclear Waste Isolation, Battelle Memorial Institute, Columbus, OH, 1986.


WIPP PA, Preliminary performance assessment for the Waste Isolation Pilot Plant, December 1992, Volumes 1-4: Third comparison with 40 CFR Part 191, Subpart B, SAND92-0700/1, Sandia National Laboratories, Albuquerque, NM, 1992.
Xiang, Y., J.F. Sykes, and N.R. Thomson, A composite L1 parameter estimator for model fitting in groundwater flow and solute transport simulation, Water Resour. Res., 29(6), 1661-1673, 1993.
Yeh, T.-C.J., M. Jin, and S. Hanna, An iterative stochastic inverse method: Conditional effective transmissivity and hydraulic head fields, Water Resour. Res., 32(1), 85-92, 1996.
Yeh, W.W.G., Review of parameter identification procedures in groundwater hydrology: The inverse problem, Water Resour. Res., 22(1), 95-108, 1986.
Yoon, Y.S., and W.W.-G. Yeh, Parameter identification in an inhomogeneous medium with the finite element method, Soc. Pet. Eng. Jour., 217-226, 1976.
Zimmerman, D.A., and J.L. Wilson, Description of and user's manual for TUBA: A computer code for generating two-dimensional random fields via the turning bands method, GRAM, Inc., Albuquerque, NM, 1990.
Zimmerman, D.A., C.L. Axness, G. de Marsily, M.G. Marietta, and C.A. Gotway, Some results from a comparison study of geostatistically-based inverse techniques, presented at the Workshop on Parameter Identification and Inverse Methods in Hydrology, Geology and Geophysics, Karlsruhe, Germany, April 10-13, 1995, in Parameter Identification and Inverse Problems in Hydrology, Geology and Ecology, J. Gottlieb and P. DuChateau (eds.), Kluwer Academic Publishers, Dordrecht, The Netherlands, 1996.
Zimmerman, D.A., G. de Marsily, C.A. Gotway, M.G. Marietta, C.L. Axness, R. Beauheim, R. Bras, J. Carrera, G. Dagan, P.B. Davies, D.P. Gallegos, A. Galli, J. Gomez-Hernandez, S.M. Gorelick, P. Grindrod, A.L. Gutjahr, P.K. Kitanidis, A.M. Lavenue, D. McLaughlin, S.P. Neuman, B.S. RamaRao, C. Ravenne, and Y. Rubin, Comparison of seven geostatistically-based inverse approaches to estimate transmissivities for modeling advective transport by groundwater flow, Water Resour. Res., 34(6), 1373-1413, 1998.
Zou, X., I.M. Navon, M. Berger, K.H. Phua, T. Schlick, and F.X. Le Dimet, Numerical experience with limited-memory quasi-Newton and truncated Newton methods, SIAM J. Opt., 3(3), 582-608, 1993.


APPENDICES __________________________________

Appendix A

Paper:

Application of a coupled adjoint sensitivity and kriging approach to calibrate a groundwater flow model

By

Marsh Lavenue and John F. Pickens

Published in Water Resources Research, June, 1992

Appendix B

Paper:

Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 1. Theory and computational experiments

By

B.S. RamaRao, A.M. Lavenue, G. de Marsily and M.G. Marietta

Published in Water Resources Research, March, 1995

Appendix C

Paper:

Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 2. Application

By

A.M. Lavenue, B.S. RamaRao, G. de Marsily and M.G. Marietta

Published in Water Resources Research, March, 1995

Appendix D

Paper:

A comparison of seven geostatistically-based inverse approaches to estimate transmissivities for modeling advective transport by groundwater flow

By

D.A. Zimmerman, G. de Marsily, C.A. Gotway, M.G. Marietta, C.L. Axness, R. Beauheim, R. Bras, J. Carrera, G. Dagan, P.B. Davies, D.P. Gallegos, A. Galli, J. Gomez-Hernandez, S.M. Gorelick, P. Grindrod, A.L. Gutjahr, P.K. Kitanidis, A.M. Lavenue, D. McLaughlin, S.P. Neuman, B.S. RamaRao, C. Ravenne, and Y. Rubin

Published in Water Resources Research, June, 1998

WATER RESOURCES RESEARCH, VOL. 34, NO. 6, PAGES 1373–1413, JUNE 1998

A comparison of seven geostatistically based inverse approaches to estimate transmissivities for modeling advective transport by groundwater flow

D. A. Zimmerman,1 G. de Marsily,2 C. A. Gotway,3 M. G. Marietta,4 C. L. Axness,4 R. L. Beauheim,4 R. L. Bras,5 J. Carrera,6 G. Dagan,7 P. B. Davies,4 D. P. Gallegos,4 A. Galli,8 J. Gómez-Hernández,9 P. Grindrod,10 A. L. Gutjahr,11 P. K. Kitanidis,12 A. M. Lavenue,13 D. McLaughlin,5 S. P. Neuman,14 B. S. RamaRao,13 C. Ravenne,15 and Y. Rubin16

Abstract. This paper describes the first major attempt to compare seven different inverse approaches for identifying aquifer transmissivity. The ultimate objective was to determine which of several geostatistical inverse techniques is better suited for making probabilistic forecasts of the potential transport of solutes in an aquifer where spatial variability and uncertainty in hydrogeologic properties are significant. Seven geostatistical methods (fast Fourier transform (FF), fractal simulation (FS), linearized cokriging (LC), linearized semianalytical (LS), maximum likelihood (ML), pilot point (PP), and sequential self-calibration (SS)) were compared on four synthetic data sets. Each data set had specific features meeting (or not) classical assumptions about stationarity, amenability to a geostatistical description, etc. The comparison of the outcome of the methods is based on the prediction of travel times and travel paths taken by conservative solutes migrating in the aquifer for a distance of 5 km. Four of the methods, LS, ML, PP, and SS, were identified as being approximately equivalent for the specific problems considered. The magnitude of the variance of the transmissivity fields, which went as high as 10 times the generally accepted range for linearized approaches, was not a problem for the linearized methods when applied to stationary fields; that is, their inverse solutions and travel time predictions were as accurate as those of the nonlinear methods.
Nonstationarity of the “true” transmissivity field, or the presence of “anomalies” such as high-permeability fracture zones was, however, more of a problem for the linearized methods. The importance of the proper selection of the semivariogram of the log10 (T) field (or the ability of the method to optimize this variogram iteratively) was found to have a significant impact on the accuracy and precision of the travel time predictions. Use of additional transient information from pumping tests did not result in major changes in the outcome. While the methods differ in their underlying theory, and the codes developed to implement the theories were limited to varying degrees, the most important factor for achieving a successful solution was the time and experience devoted by the user of the method.

1 GRAM, Inc., Albuquerque, New Mexico.
2 Université Paris IV, Paris, France.
3 Centers for Disease Control and Prevention, Atlanta, Georgia.
4 Sandia National Laboratories, Albuquerque, New Mexico.
5 Massachusetts Institute of Technology, Cambridge.
6 Universitat Politècnica de Catalunya, Barcelona, Spain.
7 Tel Aviv University, Tel Aviv, Israel.
8 Ecole des Mines de Paris, Fontainebleau, France.
9 Universidad Politécnica de Valencia, Valencia, Spain.
10 QuantiSci, Ltd., Henley-on-Thames, England, United Kingdom.
11 New Mexico Institute of Mining and Technology, Socorro.
12 Stanford University, Stanford, California.
13 Duke Engineering and Services, Inc., Austin, Texas.
14 University of Arizona, Tucson.
15 Institut Français du Pétrole, Rueil-Malmaison, France.
16 University of California, Berkeley.

Copyright 1998 by the American Geophysical Union. Paper number 98WR00003. 0043-1397/98/98WR-00003$09.00

1. Introduction

1.1. Background

For many practical problems of groundwater hydrology, such as aquifer development, contaminated aquifer remediation, or performance assessment of planned waste disposal projects, it is no longer enough to determine the "best estimate" of the distribution in space of the aquifer parameters. A measure of the uncertainty associated with this estimation is also needed. Geostatistical techniques are ideally suited to filling this role. Basically, geostatistics fits a "structural model" to the data, reflecting their spatial variability. Then, both "best estimates" (by kriging) and the variance of the estimation error can be developed. Geostatistical techniques can also produce "conditional simulations" that honor the data at measurement points and, through multiple realizations, display the uncertainty in the spatial distribution of the parameters. These conditional simulations can then be used in a Monte Carlo analysis (e.g., as input to groundwater flow and transport models) to display the uncertainty in the final outcome of the study (flow rates, concentrations, travel times, etc.). In some cases the probability distribution function (pdf) of the final outcome can be directly predicted analytically from the "structural" characteristics of the data. Nongeostatistical approaches such as weighted least squares optimization followed by sensitivity studies to assess parameter uncertainty have also been used. Examples of such simulations have been given by Delhomme


[1979], Dagan [1985, 1989], Rubin and Dagan [1987, 1992], Rubin [1991a, b], Rubin and Journel [1991], Desbarats and Srivastava [1991], Robin et al. [1993], Gutjahr et al. [1994], Harvey and Gorelick [1995], and Koltermann and Gorelick [1996], among others. In groundwater hydrology, data may come from at least four sources: (1) transmissivity (or permeability) measurements; (2) hydraulic head measurements; (3) tracer concentrations in wells from tracer tests; and (4) geologic information on the nature and characteristics of the formation. The incorporation of geologic information is generally made by zoning the parameter field or by including a trend or a piecewise-varying identification of the structural model. For example, the inclusion of geophysical information has been presented by Rubin et al. [1992], Copty et al. [1993], and Hyndman et al. [1994]. Because it is difficult to incorporate these types of information in an “inverse” approach simultaneously, it has been common engineering practice to calibrate models by “trial and error,” sometimes using sensitivity analyses and optimization subroutines to accelerate the fitting [e.g., Dettinger and Wilson, 1981; Peck et al., 1988]. Such approaches are, however, limited to producing a “best estimate” and can only assess a residual uncertainty (i.e., an estimate of the confidence interval of each parameter after calibration) by a postcalibration sensitivity study. This approach is insufficient to characterize the uncertainty after calibration. Therefore a large number of geostatistically-based inverse techniques have been developed for handling both head and transmissivity data. 
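The geostatistical workflow just described (fit a structural model, krige a "best estimate" with its error variance, then draw conditional simulations that honor the data) can be sketched in one dimension. All specifics below are illustrative assumptions, not values from the paper: a unit-sill exponential covariance with range 10, three hypothetical measurement locations, and simple kriging with a zero mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed structural model: unit-sill exponential covariance, range 10.
def cov(h):
    return np.exp(-np.abs(h) / 10.0)

x_data = np.array([3.0, 12.0, 27.0])   # hypothetical measurement locations
z_data = np.array([-0.4, 0.9, 0.2])    # hypothetical measured values
x_grid = np.arange(31.0)               # estimation grid

# Simple kriging (zero-mean assumption): best estimate and error variance.
C_dd = cov(x_data[:, None] - x_data[None, :])
C_gd = cov(x_grid[:, None] - x_data[None, :])
w = np.linalg.solve(C_dd, C_gd.T)              # kriging weights, (n_data, n_grid)
z_krig = w.T @ z_data                          # kriged "best estimate"
var_krig = 1.0 - np.sum(C_gd * w.T, axis=1)    # estimation-error variance

# Conditional simulation: unconditional draw plus a kriged correction,
# so every realization honors the data at the measurement points.
C_gg = cov(x_grid[:, None] - x_grid[None, :])
L = np.linalg.cholesky(C_gg + 1e-9 * np.eye(x_grid.size))
z_uncond = L @ rng.standard_normal(x_grid.size)
z_at_data = np.interp(x_data, x_grid, z_uncond)
z_cond = z_uncond + w.T @ (z_data - z_at_data)
```

Repeating the last four lines with fresh random draws yields an ensemble of realizations; feeding such an ensemble through a flow and transport model is the Monte Carlo step described above.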
In general, these techniques follow these steps: (1) Calibrate a “structural model” of the spatial variability using either the transmissivity data only or the transmissivity and the head data; (2) determine the cross covariance between the transmissivity and the head; (3) use an optimization procedure to estimate the transmissivity based on autocovariances and cross covariances. Alternatively, the estimation can be replaced by simulations of alternative realizations of the transmissivity fields. A good review and comparison of a number of these approaches has been very recently prepared by McLaughlin and Townley [1996], who not only described the approaches but also presented them in a unified framework, with a discussion of their respective theoretical merits. Prior to that, several surveys had been presented by Kuiper [1986], Yeh [1986], Carrera and Neuman [1986a], Carrera [1988], Ginn and Cushman [1990], Keidser and Rosbjerg [1991], Ahmed and de Marsily [1993], and Sun [1994], among others. Although we believe that this comparison is to date the largest effort undertaken to evaluate inverse approaches objectively, it is worth mentioning here the comparison of four inverse techniques (the pilot point, pure zoning, a combination of zoning and kriging, and a version of linear cokriging) published by Keidser and Rosbjerg [1991] on four different data sets that included both hydraulic and contaminant data. They compared the precision and robustness of the approaches and concluded that pure zonation (without any geostatistical assumptions) was superior to the other approaches when data are scarce or when measurement errors exist. Pilot point performed best for reproducing large-scale heterogeneities, the combination of zoning and kriging was robust and flexible, and linear cokriging was found to be very sensitive to the reliability of the T data. Pure zoning did not perform well in the case of fairly complex aquifers. 
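The estimation step (3) of this generic workflow can be illustrated with a bare-bones simple-cokriging sketch. Everything here is a hypothetical stand-in: a random positive-definite matrix plays the role of the joint auto- and cross-covariances of the log transmissivity data, head data, and estimation target that a real method would derive from the structural model and the flow equation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 5 log-T data, 4 head data, 1 estimation target.
n_y, n_h = 5, 4
n = n_y + n_h + 1

# Any B @ B.T is a valid covariance matrix; it stands in for the auto-
# and cross-covariances a real method derives from the structural model.
B = rng.standard_normal((n, n))
C = B @ B.T

C_dd = C[:-1, :-1]        # (log-T data + head data) covariance block
c_d0 = C[:-1, -1]         # data-to-target cross-covariance
sigma2_0 = C[-1, -1]      # prior variance at the target

weights = np.linalg.solve(C_dd, c_d0)    # simple-cokriging weights

residuals = rng.standard_normal(n - 1)   # zero-mean data residuals
y_hat = weights @ residuals              # cokriged estimate at the target
var_hat = sigma2_0 - weights @ c_d0      # cokriging (estimation) variance
```

The cokriging variance is the Schur complement of the data block, so it can never exceed the prior variance; that inequality is a quick sanity check on any implementation.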
Rubin and Dagan [1987] used the data on the Avra Valley presented by Clifton and Neuman [1982] in an inverse method different from that of these authors (a linear semianalytical method for the former, a maximum likelihood estimate using zoning for the latter). They concluded that reasonably similar results had been obtained by the two approaches. Carrera and Glorioso [1991] also compared linear and nonlinear approaches and obtained similar conclusions, except for large variances of ln(T), for large head measurement errors, or in the presence of sink/source terms. Under any of these conditions they found that nonlinear approaches performed much better than the linear ones. The reader is also referred to special issues of Advances in Water Resources (volume 14, numbers 2 and 5, 1991), in which a large number of inverse approaches have been presented and analysed. Finally, it should be noted that this paper represents the completion of the in-progress comparison study presented by Zimmerman et al. [1996].

1.2. Motivation for This Study

A comparison of inverse approaches was undertaken by Sandia National Laboratories (SNL) in conjunction with the performance assessment (PA) of the Waste Isolation Pilot Plant (WIPP) site. The WIPP is a U.S. Department of Energy (DOE) facility currently being evaluated to assess its suitability for isolating transuranic wastes generated by the defense programs in the United States. It should be noted that this work was not performed in accordance with the SNL WIPP quality assurance (QA) program and that none of these results are to be referenced for any work performed under the SNL WIPP QA program. The proposed repository is located within the bedded salt of the Salado Formation at a depth of about 650 m. A description of the WIPP site and of the first application of an inverse technique to this site is given by Lappin [1988], LaVenue and Pickens [1992], and LaVenue et al. [1995]. The Culebra Dolomite, a 7-m-thick member of the 120-m-thick Rustler Formation located at a depth of about 250 m, has been characterized as the most transmissive, laterally continuous hydrogeologic unit above the repository and is considered a potentially important transport pathway for off-site radionuclide migration within the subsurface. This transport could occur if, in the future, a well drilled for exploration purposes created an artificial connection between the waste storage rooms and the Culebra, allowing radionuclides to leak into the Culebra. Such a scenario is part of a probabilistic PA that the U.S. Environmental Protection Agency (EPA) requires DOE to perform to demonstrate compliance of the repository system with regulations governing disposal of radioactive wastes [EPA, 1985; Sandia National Laboratories, 1992]. The data base for modeling the Culebra is available from Cauffman et al. [1990].
Because the EPA regulation is probabilistic, the PA must adequately reflect the variability and uncertainty within all factors that contribute to the simulation of the repository performance for isolating wastes. For performance assessment of a nuclear waste repository a hydrologist must provide not just a transmissivity field or a series of transmissivity fields but the probability density function (pdf) of the outcome of the flow simulation (the travel time). Thus the inverse problem serves only as the means for estimating this pdf, conditioned on the available data. The current probabilistic approach to PA [Sandia National Laboratories, 1991] accommodates parameter correlations, including spatial correlations, and conditioning on sample data. For a contaminant transport problem such as radionuclide migration in the Culebra at the WIPP, the focus is on adequately


Table 1. The Seven Inverse Methods Compared

  Inverse Method               Symbol   First Author          Affiliation
  Fast Fourier transform       FF       A. Gutjahr            New Mexico Institute of Mining and Technology
  Fractal simulation           FS       P. Grindrod           QuantiSci, United Kingdom
  Linearized cokriging         LC       P. Kitanidis          Stanford University
  Linearized semianalytical    LS       Y. Rubin              University of California, Berkeley
  Maximum likelihood           ML       J. Carrera            Universitat Politècnica de Catalunya, Spain
  Pilot point method           PP       B. S. RamaRao         Duke Engineering and Services, Inc.
  Sequential self-calibration  SS       J. Gómez-Hernández    Universidad Politècnica de Valencia, Spain

characterizing the hydraulic properties of the medium and their uncertainty. However, the real quantity of interest is the conditional pdf of the PA outcome. Thus solving the inverse problem is not an objective per se but is just a means to generate adequate intermediate parameter fields to be used in the PA simulations.

1.3. Objectives

In this study we compare seven inverse approaches, outline their differences, and discuss their potential strengths and weaknesses. The results point to areas of research that may be useful for improving the inverse techniques. This paper addresses the following issues by comparing the different inverse approaches on three different test problems: (1) How different are the inverse techniques considered in this paper? (2) How effective are they for solving practical problems? (3) How dependent are they on the various assumptions that are made to derive the algorithm, for example, statistical homogeneity, Gaussian distributions, and small magnitude of the log transmissivity (log10(T)) variance? The problem sets are artificial in order to be able to compare the approaches to one another and also with a synthetic "truth." Each test was also designed to be comparable with that of an actual site and to address the validity of the underlying assumptions inherent in the different approaches. We have chosen to consider the advective groundwater travel time (GWTT) of a conservative tracer as a surrogate for the more complex solute transport problem. We will therefore generate pdf's of GWTT (as the PA outcome) and evaluate inverse approaches on their ability to reflect the uncertainty in aquifer parameters adequately as described by these conditional GWTT pdf's. Our objective is to reveal how the estimate of the conditional pdf's of GWTT can be affected by either the differences in the principles and coding of the inverse methods or the manner in which a given method was applied by the person who ran it.

1.4. Geostatistical Approaches To Be Compared

Seven inverse methods were selected for comparison. The selected methods were to estimate the transmissivity field from measurements of transmissivity and head and produce an ensemble of simulated transmissivity fields conditioned on all the available data on transmissivity and head. These simulated transmissivity fields should reflect the uncertainty in the transmissivity estimate after calibration and would be the input T fields in the Monte Carlo simulations of flow through the system. There should be as many different T fields as Monte Carlo simulations (about 100), all considered as having an equal probability of occurrence. The geostatistical inverse approaches are listed in alphabetical order in Table 1. In Appendix B we give a short summary of the description of each method, with references to the major publications where the methods were presented and applied. For clarity and brevity we will refer to the methods by their two-letter symbols (see Table 1). These seven methods are by no means an exhaustive sampling of all the methods that have been published in the literature. Among the most prominent "absences" are the approaches proposed by Cooley [1977, 1979, 1982, 1983], Townley and Wilson [1985], and Sun and Yeh [1992], who unfortunately could not participate. The seven approaches can be categorized as being either linearized or nonlinear. While the groundwater flow equation for confined aquifers is always linear for the head, this same equation is nonlinear for the relation of T to head. The linearized approaches are generally based upon simplifying assumptions about the flow field (e.g., a uniform hydraulic head gradient, a small ln(T) variance, etc.) that lead to a linearized relation between T and head using a perturbation expansion of the head and transmissivity fields. This equation can then be solved analytically or numerically. The nonlinear approaches have no such restrictions placed on them and can, in principle, handle more complex flow fields or larger ln(T) variances. Methods FF, LC, and LS fall into the linearized category, while methods FS, ML, PP, and SS fall into the nonlinear one. The LS method is able to calculate the GWTT cumulative distribution functions (CDFs) directly, so this method did not produce transmissivity fields. T fields could have been produced by this method, but these fields would then not have been linked to a particular travel path or travel time and so were not calculated.
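The perturbation linearization invoked by the linearized methods can be sketched generically as follows; this is the standard first-order expansion, not the particular derivation used by any one approach. Writing $Y = \ln T$, steady confined flow without sources satisfies

```latex
% Steady confined flow, no sources, with Y = ln T:
\nabla\cdot\left(e^{Y}\,\nabla h\right)=0
\quad\Longleftrightarrow\quad
\nabla^{2}h+\nabla Y\cdot\nabla h=0 .
% Expand Y = <Y> + Y' and h = \bar{h} + h' (constant mean <Y>),
% and drop products of perturbations:
\nabla^{2}\bar{h}=0 ,
\qquad
\nabla^{2}h'=-\,\nabla Y'\cdot\nabla\bar{h} .
```

The last equation is linear in the log transmissivity perturbation $Y'$, which is what permits analytical or spectral evaluation of the head/log-transmissivity auto- and cross covariances under a small-variance assumption.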

1.5. Overview of the Test Problem Exercise

The test problem exercise was conceived and performed by a group of participants referred to by SNL as the Geostatistical Expert Group (GXG). A listing of the participants is given in Appendix A. Four test problems were developed in secrecy from the participants who would receive the data and run the inverse models. The test problems were designed to be "WIPP-like," meaning that the hydrogeologic characteristics and the complexity of the problems, as well as the type of data and their spatial distribution, should be relatively similar to that of the WIPP site. The synthetic transmissivity fields should also have properties similar to those observed at WIPP or believed to exist at the WIPP on the basis of inference from geological and hydrological data. Four different T fields were generated. Synthetic hydraulic head data were obtained by solving the two-dimensional flow equations with prescribed boundary conditions using these synthetic T fields. A limited number of observations of head and transmissivity obtained from the exhaustive (synthetic) data sets would then be provided to the participants. Additionally, particle-tracking calculations were performed to compute advective travel times and travel paths


of a conservative solute for the synthetic data sets. Particles were released in a number of locations and the “true” groundwater travel times were calculated but not given to the participants. For each test problem the participants would analyze the sampled T and head data (about 40 observations of each) and use their inverse procedure to generate the ensemble of conditional transmissivity fields and corresponding head fields (in general between 50 and 100) that were given to the GXG coordinator. The coordinator would then calculate the travel times and travel paths for the same release points as those in the “true” field, using the same particle-tracking code as the one used for the true field but using the T values, the grid size, and the boundary conditions specified by the participants as a result of their efforts. Throughout this paper the term “GWTT” is defined as the time it takes for a particle to reach a radial distance of 5 km from the release point. The calculated GWTTs taken across all realizations produced by a method were used to construct a GWTT CDF which was compared to the “true GWTT.” This is referred to as the “fixed well approach,” described in more detail below. In a second set of analyses (the “random well approach,” also described in detail below) the GWTTs from an ensemble of release points contained within a localized area were used to construct the “true GWTT CDF,” which was then compared with the calculated GWTT CDFs for the same release points from each of the methods. In the case of the linearized semianalytical method, only a particle travel time CDF was requested because this method does not require generation of a transmissivity field to estimate this CDF. In the real world it is clear that parameters other than transmissivity are variable and uncertain in the system. 
For example, porosity, aquifer thickness, dispersivity, sorptive properties, etc., are all variable, and the GXG made suggestions on how to incorporate these uncertainties into the PA. However, for the present intercomparison, only the transmissivity is involved, and all other parameters are given uniform values.
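The empirical GWTT CDF described above (one travel time per conditionally simulated field, accumulated into a distribution) can be sketched as follows. The lognormal ensemble is a hypothetical stand-in for the travel times a method would actually produce; only the empirical-CDF mechanics mirror the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ensemble: one GWTT (in years) per conditionally
# simulated transmissivity field, for a single release point.
gwtt = rng.lognormal(mean=8.0, sigma=0.5, size=100)

# Empirical CDF: sort the ensemble; F at the i-th order statistic is i/n.
t_sorted = np.sort(gwtt)
F = np.arange(1, t_sorted.size + 1) / t_sorted.size

def gwtt_cdf(query_time: float) -> float:
    """P(GWTT <= query_time) under the empirical distribution."""
    count = np.searchsorted(t_sorted, query_time, side="right")
    return float(count) / t_sorted.size
```

Comparing such a CDF against the "true" one (computed from the exhaustive field) is exactly the fixed-well evaluation described below.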

2. Description of the Four Test Problems

The test problems (TPs) were developed as a series of independent synthetic data sets that were intended to span the range of possible conceptual models of the Culebra transmissivity distribution at the WIPP site. Estimates of transmissivity at 41 boreholes at the actual WIPP site have been obtained through slug tests, local pumping tests, and three regional-scale pumping tests lasting from 1 to 3 months [Beauheim, 1991]. The T values obtained from these tests span 7 orders of magnitude. Analyses of these data indicate that it is likely that the spatial distribution of heterogeneity is not random, but made of specific zones of high and low values. Transmissivity is strongly impacted by the presence or absence of open fractures. Large-scale pumping tests indeed suggest that narrow, relatively conductive fracture zones are possible in some areas. Whether these fractures form a connected network or are isolated from each other by low-transmissivity zones is not clear. In other areas local well tests have indicated the existence of rather low-permeability zones. There are lithologic indicators of high or low transmissivities, such as the presence or absence of gypsum filling in fractures, although these indicators are not strict. An attempt has been made in the test problems to represent the presence or absence of such features. Although the PA calculations to date have assumed a perfectly confined, two-dimensional flow system for the Culebra, there may be vertical flow into or out of the Culebra. Vertical leakage is therefore reflected in some of the test cases. There is also a known salinity gradient in the Culebra which was not considered in the test problems, as most inverse approaches assume constant density. Hydraulic heads obtained prior to the WIPP site characterization activities, when the system was in a quasi-steady state condition, were available at 32 locations. Transmissivity estimates were available at 41 locations. Thus the test problems were developed as steady state systems, and the sample data were limited to, at most, 41 observations of head and transmissivity (at the same locations). The spatial distribution of the boreholes (i.e., density, pattern) in the TPs was kept similar to that present at the WIPP. Three large-scale pumping tests were also simulated in TPs 3 and 4. In the real world these data are all subject to measurement errors. However, none was considered in these calculations because the objective of the comparison was not to assess the robustness of an approach to the magnitude of measurement errors, but for a given set of data, to determine the residual uncertainty on the transport properties of the domain as evaluated by each approach. Adding a measurement error would only increase this uncertainty and decrease the ability to distinguish between the approaches. Another reason is that the synthetic data were generated on a very small grid (20–40 m) and the participants were given the grid values at the sampled locations. Thus the small-scale variability of the synthetic log10(T) fields can be viewed as measurement error, compared to a "measured value" which could have been provided by averaging over a larger domain such as that which an actual pumping test would have produced. Boundary conditions for flow in the vicinity of the WIPP site are not well constrained.
Thus the boundary conditions were not defined for the participants. Given the 41 head measurements in the domain, they were asked to select the boundary conditions they felt appropriate. In test problems 1 and 2, the synthetic T fields were generated as unconditional random fields using the two-dimensional random field generator TUBA (Zimmerman and Wilson [1990]; see also work by Mantoglou and Wilson [1982] and Matheron [1973]). In test problems 3 and 4 the initial field was also generated using TUBA, but additional discrete modifications were made to each. For all test problems, Dirichlet boundary conditions (different for each test problem) were developed for calculating the synthetic heads by generating a stationary random field and adding that to a trend surface. These dense synthetic data sets comprised from one to three million nodes. In all test cases a uniform mesh was used and the true head-field solution was obtained via a multigrid solver (finite difference method) provided by Pacific Northwest Laboratory (see acknowledgments). The size of the area over which the observation data are distributed is 20 km × 20 km for TPs 1, 2, and 3 and approximately 30 km × 30 km for TP 4. This is of similar scale to the area where data are available at the actual WIPP site. The exact correlation structure of each synthetic data set was determined via semivariogram analysis using GSLIB routines [Deutsch and Journel, 1992]. Over 3600 randomly located sample points were used in the computation of the exhaustive data set semivariograms in order to obtain enough pairs for stable semivariogram estimates from a single realization (the "true" field). An exponential semivariogram model was then fit to each empirical semivariogram via nonlinear regression in order to report a correlation length parameter and a variance. Exponential semivariogram models have been fit to the WIPP site log10(T) data, and an exponential semivariogram model was used to generate the log10(T) fields for TPs 1 and 2. The main features of each test problem, including means, variances, correlation lengths, etc., are summarized in Tables 2a and 2b.
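The semivariogram-fitting step can be sketched as below. The sample coordinates and values are hypothetical (a mild spatial trend plus noise, so the semivariogram rises with lag), and the exponential model γ(h) = sill·(1 − exp(−h/a)) fit by nonlinear least squares mirrors, but does not reproduce, the GSLIB-based procedure used for the test problems.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)

# Hypothetical scattered log10(T) samples on a 20 km x 20 km domain:
# a gentle east-west/north-south trend plus noise.
xy = rng.uniform(0.0, 20_000.0, size=(400, 2))
z = -5.5 + xy @ np.array([1.0e-4, 5.0e-5]) + rng.normal(0.0, 0.3, size=400)

# Empirical semivariogram: bin half squared differences by separation.
d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
sq = 0.5 * (z[:, None] - z[None, :]) ** 2
iu = np.triu_indices(400, k=1)            # each pair counted once
d, sq = d[iu], sq[iu]

edges = np.linspace(0.0, 10_000.0, 11)
lag = 0.5 * (edges[:-1] + edges[1:])
gamma = np.array([sq[(d >= lo) & (d < hi)].mean()
                  for lo, hi in zip(edges[:-1], edges[1:])])

# Exponential semivariogram model fit by nonlinear regression.
def exp_model(h, sill, a):
    return sill * (1.0 - np.exp(-h / a))

(sill, a), _ = curve_fit(exp_model, lag, gamma,
                         p0=[1.5, 3000.0], bounds=(1e-6, np.inf))
```

The fitted `a` plays the role of the reported correlation length parameter and `sill` that of the variance.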

2.1. Test Problem 1

TP 1 was the simplest conceptual model. It was developed using a model of the Culebra transmissivities that was based on a geostatistical analysis of the real WIPP site data. The log10(T) field (T in m²/s) was modeled as an isotropic process having a mean of -5.5, a variance of 1.5, and an exponential covariance structure with correlation length λ = 3905 m, close to the values of the real WIPP site. A map of the synthetic log10(T) field with the location of the observation points is shown in Figure 1. A large regional field (40 km × 40 km) was generated on a 1793 × 1793 grid (over 3.2 million unknowns), with each grid block being 22.5 m on a side. However, the observation data were located in the central 20 km × 20 km area, which is the portion of the field that is shown in Figure 1. The mean and variance of the exhaustive log10(T) data for the inner region are -5.84 and 1.56, respectively. The sample data consisted of 41 transmissivity and 32 head measurements taken from the exhaustive synthetic data set. Boundary conditions were generated using a combination of a linear trend surface and spatially correlated noise. The trend surface, given by Z = 890 + 0.36X + 1.28Y, with X and Y in km, was derived from an analysis of the WIPP site data to provide a similar head gradient. An anisotropic Gaussian process with zero nugget, a sill of 50 m², and ranges of 5 and 15 km in the north-south and east-west directions, respectively, was used to model the hydraulic head variability for generating the boundary values.
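The boundary-value construction for TP 1 (trend surface plus spatially correlated noise) can be sketched along a single model edge. The node spacing is hypothetical, and the squared-exponential kernel below is one plausible reading of the "anisotropic Gaussian process" named in the text, sampled by Cholesky factorization with a small jitter for numerical stability.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical nodes along the western edge of the model (X = 0 km).
x = np.zeros(81)
y = np.linspace(0.0, 20.0, 81)          # Y in km

# Trend surface from the text: Z = 890 + 0.36 X + 1.28 Y (X, Y in km).
trend = 890.0 + 0.36 * x + 1.28 * y

# Anisotropic squared-exponential covariance: sill 50 m^2, zero nugget,
# ranges 15 km (east-west) and 5 km (north-south).
dx = (x[:, None] - x[None, :]) / 15.0
dy = (y[:, None] - y[None, :]) / 5.0
C = 50.0 * np.exp(-(dx**2 + dy**2))

# Sample correlated noise and add it to the trend.
L = np.linalg.cholesky(C + 1e-6 * 50.0 * np.eye(81))
heads = trend + L @ rng.standard_normal(81)
```

Repeating this along each edge (with the appropriate X, Y coordinates) gives Dirichlet boundary heads that honor both the regional gradient and the prescribed spatial variability.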


Table 2b. Additional Characteristics of Each Test Problem Data Set

  TP   True Field Correlation Length, m   Recharge Included?   Transient Pumping?   Well Log "Geology"?
  1    2808                               no                   no                   no
  2    2808                               no                   no                   no
  3    425                                yes                  yes                  yes
  4    2063                               yes                  yes                  no

2.2. Test Problem 2

The second test problem data set was generated specifically to examine how well the linearized techniques could handle high-variance cases. The model of spatial variability of TP 2 is identical to TP 1; only the mean and variance of log10(T) were changed. In fact, the pattern of spatial variability remains exactly the same except that the field is rotated counterclockwise by 90°. The mean of log10(T) was increased to -1.26, resulting in faster travel times, and the log10(T) variance was increased to 2.14. The boundary values remained the same, albeit rotated by 90°. The same number and similar configuration of observation data as for TP 1 were provided to the participants. The sample log10(T) data have a mean of -0.52 and a variance of 2.39. The log10(T) field and observation points are shown in Figure 2.

2.3. Test Problem 3

The intent of test problem 3 was to incorporate some of the more complex geohydrologic characteristics of the WIPP site. Several high-transmissivity fracture zones approximately 1-3 km apart have been inferred from pumping tests in the northwest and southeast areas of the WIPP site; in other areas of the site, aquifer tests conducted at several wells have resulted in very low transmissivity values. The transmissivity field of TP 3 represents a possible nonstationary conceptual model of the WIPP site transmissivity distribution that includes, within a background medium of variable transmissivity, disconnected high-transmissivity "channels" that represent fracture zones, local low-transmissivity subregions representing tight zones, and a large low-transmissivity zone in the southwest corner of the field. Information on "the type of geology encountered in each borehole" was provided to the participants (descriptors such as "porous," "fractured," and "tightly cemented" were used to relate to the "background," the high-T "channels," or the low-T subregions, respectively). Such information would, of course, be available at any real site. The log10(T) distribution is shown in Figure 3. The map is

Table 2a. Log10(T) Field Exhaustive Data Set and Sample Data Statistics

                              Exhaustive Data     Sample Data       Observations
  TP   Covariance Model       Mean      σ²        Mean      σ²      Head   log10(T)
  1    exponential            -5.84     1.56      -5.30     1.84    32     41
  2    exponential            -1.26     2.14      -0.52     2.39    32     41
  3    Telis                  -5.64     1.38      -5.70     1.82    41     41
  4    Bessel                 -5.32     1.93      -5.32     1.89    41     41

  T in m²/s.

Figure 1. Test problem 1 true log10(T) field (20 km × 20 km). Squares are assumed waste disposal areas. The flow lines originating from these squares display the flow direction from the disposal area to the boundaries. The six gray shades are log10(T) intervals of 10^-7–10^-6, 10^-6–10^-5, 10^-5–10^-4, 10^-4–10^-3, and >10^-3 (lightest).


Figure 2. Test problem 2 true log10(T) field (20 km × 20 km). Squares are assumed waste disposal areas. The flow lines originating from these squares display the flow direction from the disposal area to the boundaries. The six gray shades are log10(T) intervals of 10^-7–10^-6, 10^-6–10^-5, 10^-5–10^-4, 10^-4–10^-3, and 10^-3–10^-1 (lightest).

best thought of as several "geologic overlays." The underlying "geostatistical background field" (the T field without the low-T subregions and high-T fractures) was generated as a stationary field having a log10(T) mean and variance of -5.5 and 0.8, respectively, and an anisotropic Telis covariance with correlation lengths of 1.025 km and 0.512 km in the east-west and north-south directions, respectively. The high-T "channels" were generated with a log10(T) mean of -2.5, a variance of 0.1, and an exponential covariance with a correlation length of 6667 m. This correlation structure applies only along narrow zones described as "fractures," as shown in Figure 3. The low-T zones, shown as dark, nonuniform ovoid regions, had a mean log10(T) of -7.5, a variance of 0.5, and an isotropic exponential covariance model with a correlation length of 3417 m. In the lower left-hand corner of the field is an area in which there is a trend of decreasing transmissivity toward the corner of the field. After overlaying the low-T subregions and linear features, a 3 × 3 moving window block filter (averaging on the log10(T)) was passed over the field to help smooth the transition between these zones in order to avoid potential numerical-convergence problems. This did not significantly reduce the variance of the final log10(T) field, which had a mean and variance of -5.64 and 1.38, respectively. The mean and variance of the sample log10(T) data are -5.70 and 1.82, respectively (for the 41 sample points). Vertical recharge was applied uniformly over the northwestern portion of the model domain; the recharge rate was 6.5 × 10^-9 m³/s. The recharge distributed over this region accounts for approximately 10% of the regional flow through the system. Such recharge could be inferred by the participants from the observed heads in this area, which showed a localized piezometric mound, but no information on recharge was given to the participants.
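The 3 × 3 moving-window step described above can be sketched as below. The field is a hypothetical stand-in (a spatially correlated background with an abrupt low-T patch overlaid); the point is that a 3 × 3 average applied to log10(T) smooths the zone transitions while leaving the overall variance nearly intact, as reported for the test problem.

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(4)

# Hypothetical correlated background: white noise smoothed to induce
# spatial correlation, then rescaled to std 0.9 about a mean of -5.5.
base = uniform_filter(rng.standard_normal((200, 200)), size=25)
logT = -5.5 + 0.9 * base / base.std()

# Overlay a sharp low-transmissivity patch (stand-in for a tight zone).
logT[80:120, 80:120] -= 2.0

# 3 x 3 moving-window average on log10(T) smooths zone transitions.
smoothed = uniform_filter(logT, size=3, mode="nearest")
```

Because the background is already correlated over scales much larger than the window, the filter removes only the sharpest transitions and barely touches the field-wide variance.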

Boundary conditions were generated in a similar fashion as those for TPs 1 and 2, using a combination of a linear trend surface and spatially correlated noise. The trend surface was based on an analysis of the WIPP site data, but with the x direction of the trend reversed. The trend model is given by Z = 890 + 0.36(41 - X) + 1.28Y, where X and Y are given in km. An anisotropic exponential covariance model having zero nugget, a sill of 50 m², and X and Y correlation lengths of 15 and 5 km, respectively, was used to model the head spatial variability for generating the boundary values. In addition to the steady state hydraulic head data and transmissivity values, transient information was provided to the participants in the form of three independent aquifer tests. Pumping from the aquifer was simulated numerically in three different wells (one at a time), and drawdown data in the surrounding wells and the pumping rates were given to the participants. A uniform storativity value of 5 × 10^-6 was assigned to the system (but not made known to the participants). These tests were loosely modeled after the H-3, H-11, and WIPP-13 large-scale pumping tests conducted at the WIPP site [Beauheim, 1991]. Details of the three aquifer tests, including drawdowns at each observation well and estimates of transmissivity and storativity based on conventional well-test analysis, were given to the participants.

2.4. Test Problem 4

TP 4 is a complex, nonstationary conceptual model of the transmissivity distribution reflecting the large-scale connectivity of fracture zones (in contrast to TP 3) that have been shown to exist in some areas of the Culebra. The features of the conceptual model included the following: (1) well-connected high-T channels, (2) a variation in transmissivity of 5-6 orders of magnitude, (3) a small trend in log10(T) (1-2 orders of magnitude for T across the entire field), (4) a local recharge area correlated with the high-T zones, and (5) some high-T zones that

Figure 3. Test problem 3 true log10(T) field (20 km × 20 km). Squares are assumed waste disposal areas. The flow lines originating from these squares display the flow direction from the disposal area to the boundaries. The six gray shades are log10(T) intervals of 10^-7–10^-6, 10^-6–10^-5, 10^-5–10^-4, 10^-4–10^-3, and >10^-3 (lightest).


are well-identified while others are missed by the observation wells. The field was generated as follows: Initially, an unconditioned field having an anisotropic Bessel covariance structure was generated with correlation lengths of 2.05 and 1.025 km in the east-west and north-south directions, respectively. Through a series of repeated kriging exercises, a network of connected high-T channels was developed. Each time the kriging was performed, a number of "fake conditioning points" was added to develop the high-T channels iteratively in the kriged "true field." The final conditionally simulated field was generated via the classical method of conditional simulation described by Journel and Huijbregts [1978], being the sum of the kriged true field and the perturbations resulting from the difference between the unconditioned field and the kriged unconditioned field. The final log10(T) field had a mean of -5.32 and a variance of 1.93; the field is shown in Figure 4 along with the 41 observation points. The field was generated on a 1025 × 1025 grid with 40-m grid blocks. The log10(T) mean and variance of the 41 sample observations are -5.32 and 1.89, respectively. Boundary values were obtained by generating a random head field using a generalized isotropic covariance C(h) = h^5 using the TUBA code. The field so generated had a southwest-to-northeast trend diagonally across the field and through the high-T channels. This field was scaled to provide a head difference of 94 m along that diagonal (~58 km). Areal recharge was applied to the southern portion of the field where the transmissivity is generally somewhat higher than average; this also helped to direct the flow through the high-T channels without causing any mounding. The recharge (leakage) was applied nonuniformly, being highly correlated with the transmissivity distribution in this area of the field. Because it occurs at the margin of the field, no observation points are located within this region.
The recharge amounted to approximately 6% of the regional flow moving through the system, but no recharge information was given to the participants. As in TP 3, three independent numerical pumping tests were performed to provide transient information for those techniques that could use it. A detailed description of the three tests and conventional analyses of the results were given to the participants.
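The classical conditional simulation recipe cited from Journel and Huijbregts can be sketched in one dimension as follows; the grid, covariance model, and data values are all hypothetical. The conditioned field is the kriged data plus the difference between an unconditional realization and that realization kriged from its own values at the data points.

```python
import numpy as np

rng = np.random.default_rng(5)

x = np.linspace(0.0, 10.0, 101)          # 1-D grid (arbitrary units)
obs = np.array([5, 30, 55, 80, 95])      # hypothetical data locations

def cov(a, b):
    # Exponential covariance with an assumed correlation length of 2.0.
    return np.exp(-np.abs(a[:, None] - b[None, :]) / 2.0)

# Simple-kriging weights mapping data values to the whole grid.
W = cov(x, x[obs]) @ np.linalg.inv(cov(x[obs], x[obs]))

# Unconditional realization via Cholesky factorization (small jitter).
L = np.linalg.cholesky(cov(x, x) + 1e-10 * np.eye(x.size))
y_unc = L @ rng.standard_normal(x.size)

y_data = rng.standard_normal(obs.size)   # stand-in "true" data values

# Conditioned field = kriged data + (unconditional - kriged unconditional).
y_cond = W @ y_data + (y_unc - W @ y_unc[obs])
```

At the data locations the kriging weights reproduce the data exactly, so the simulated perturbation vanishes there and the field honors the observations, as required of the conditioned ensembles in this study.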

3. Qualitative Results

In this section we present the results of the groundwater travel time (GWTT) distributions and the transmissivity maps produced by the different approaches, along with a statistical analysis of all the evaluation measures.

3.1. Comparison of GWTT CDFs

It is assumed that radionuclides can reach the aquifer when at some future time an exploratory well is drilled through the repository. Because this hypothetical future drilling location is unknown, it is reasonable to treat the unknown location as a random variable within the repository (also referred to as “the waste panel area”). The objective in this analysis is to compute GWTT CDFs for both the true fields and the fields produced by the inverse methods and to compare them. These CDFs represent the uncertainty in GWTT resulting from an intrusion borehole whose location is unknown but which lies somewhere within the waste panel area. What we want to investigate is if


Figure 4. Test problem 1 true log (T) field (30 km × 30 km). Squares are assumed waste disposal areas. The flow lines originating from these squares display the flow direction from the disposal area to the boundaries. The six gray shades are log10 (T) intervals of 10^−7–10^−6, 10^−6–10^−5, 10^−5–10^−4, 10^−4–10^−3, and >10^−3 (lightest).

the CDFs produced by each approach, for each test problem, are reasonably close to the true CDFs. Hereinafter, the designations “true field,” “true travel time,” and “true GWTT CDF” refer to quantities computed using the exhaustive (synthetic) data set. To construct the true GWTT CDF, a hypothetical repository of similar scale to the waste panel area at the real WIPP site (1.1 km × 1.1 km) is located within the study area. The repository is located in the zone where the density of the observation data is greatest. Particle tracking is performed for each of 100 particles, distributed uniformly over the waste panel area, out to a radial distance of 5 km. These GWTTs are then used to construct the “true CDF.” To construct the GWTT CDFs for each of the approaches, the procedure was similar, but the uncertainty will of course be larger, since knowledge of the true transmissivity distribution is not perfect. The transmissivity fields used to calculate the travel times are those derived via the inverse procedures. These CDFs, however, will be conditioned on the available data. That is, the velocity fields were computed by solving the forward problem where the boundary conditions, source terms, and discretization were assigned individually by each participant (i.e., they were different for each inverse approach). The GWTT CDFs for each approach were constructed as follows: For each of the 100 release points within the waste panel area, a GWTT CDF was constructed from the ensemble of GWTTs obtained across all realizations. Hence 100 GWTT CDFs are obtained, each CDF being conditional on a particular release point location. The mean CDF for the entire waste panel area was computed as

mean CDF = ∫_(waste panel) CDF(x) f(x) dx    (1)
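With a uniform borehole pdf f(x), equation (1) reduces to an equal-weight average of the 100 per-release-point CDFs. A minimal sketch, with hypothetical GWTT samples standing in for the ensemble of realizations:

```python
import numpy as np

def empirical_cdf(samples, t_grid):
    # Fraction of simulated travel times <= each value of the common time grid
    return np.searchsorted(np.sort(samples), t_grid, side="right") / len(samples)

rng = np.random.default_rng(0)
n_points, n_real = 100, 50                                     # release points x realizations
gwtt = 10.0 ** rng.normal(4.0, 0.5, size=(n_points, n_real))   # hypothetical GWTTs (years)

t_grid = np.logspace(2.0, 7.0, 200)                            # common travel-time grid (years)
cdfs = np.array([empirical_cdf(row, t_grid) for row in gwtt])

# Equation (1): uniform f(x) over the waste panel -> equal weights
mean_cdf = cdfs.mean(axis=0)
```

Each row of `cdfs` is one of the 100 conditional CDFs; the bounding envelope described below would be obtained from their 2.5% and 97.5% quantiles at each travel time value.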


ZIMMERMAN ET AL.: COMPARISON OF INVERSE APPROACHES

Figure 5a. Test problem 1 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

where f(x) is the probability density function for the borehole location (which was treated as uniform; hence the release points are equally weighted). Figures 1–4 show the position of the repository for each TP, with envelopes of all the true-field particle paths originating from the edges of the waste panel area. Both the true GWTT CDFs (thick line) for each TP and the median GWTT CDF for each method (dashed line) are plotted at the bottom of Figures 5a–8c. The GWTT CDFs for method LS (which did not need to produce T fields) are shown in Figure 9. In addition, on these plots a bounding envelope containing the inner 95% of the CDF curves at each travel time value was constructed. These GWTT_0.025 and GWTT_0.975 bounding curves reflect the degree of variability in GWTT within the repository area from realization to realization. This type of analysis allows for the fact that, conceptually, the distributions of properties could be identical and yet the underlying T fields different. This is because an identical GWTT may be obtained for very different random well locations in the simulated and true fields. The test only compares the distributions and does not consider whether the short or long GWTTs originate from the same locations.

3.2. Comparison of the Transmissivity Fields and Their Semivariograms

As an additional means of comparison, some of the T fields produced by each approach for each TP are shown in Figures 5a–8c. In each figure we have chosen to show both the average of the simulated log10 (T) fields (50–100 simulations) and one individual realization, selected at random. Both maps are “embedded” in the true T fields in order to reveal the area the participants decided to model, the grid orientation, and the level of discretization they used. Also shown in these figures


Figure 5b. Test problem 1 mean log10 (T) field (top), randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

are the envelopes of the travel paths from the edges of the waste panel area. Six gray shades are used; each represents an order-of-magnitude change in the value of transmissivity. The ability of an inverse method to correctly reproduce the correlation structure of the T field in the realizations was considered an important feature for predicting contaminant transport and spreading. Therefore semivariogram estimates of the simulated log10 (T) fields were computed for each realization of a method, using the GAMV2M routine from the GSLIB software package [Deutsch and Journel, 1992]. On the order of 600–1000 randomly placed sampling points were used in the estimation of each semivariogram. Then the average semivariogram was computed across the ensemble of realizations for each TP. For method LS the participant directly gave the parameters of the exponential variogram he had selected. For the other approaches, estimates of the parameters of an

exponential semivariogram model fit to each of the average empirical semivariograms (one for each TP) were made via nonlinear regression. The same analysis was performed on each of the true log10 (T) field realizations; approximately 3600 sample values were used for the semivariogram estimates in each of the true-field exhaustive data sets.

3.3. Qualitative Comparison Observations

First, a visual comparison of the mean CDFs with the true CDF reveals large differences among the approaches. Second, no one particular approach is obviously superior to all others, for all TPs. A third observation is that the mean CDF of each method spans a broader range than the true one: The uncertainty linked to the position of the intrusion borehole is significantly increased when additional uncertainty is introduced by only


Figure 5c. Test problem 1 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, and other curves bound the inner 95% of all conditional CDFs.

incomplete knowledge of the parameters. To give some perspective to these conditional CDFs, we have crudely estimated what the unconditional CDFs would have been if we had not used any inverse method and had directly sampled uncertain parameters in “traditional” Monte Carlo simulations. For this, we assume that we know only, for each TP, the pdf of T from the 41-sample T data and the average head gradient from the head observation data, and we take the same porosity as in all our calculations (16%). For simplicity, we assume that the unconditional T field is uniform over the whole domain and that its value is sampled from a lognormal distribution defined by the mean and variance of the 41 log10 (T) sample data. These four unconditional CDFs are shown in Figure 10. It is clear from this figure that conditioning drastically reduced the uncertainty, which otherwise would have spanned an interval several orders of magnitude larger than the true one.
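The “traditional” unconditional Monte Carlo estimate described here can be sketched as follows. The aquifer thickness is an illustrative assumption (it is not stated in this passage), and the TP 1 gradient and sample statistics are used for concreteness:

```python
import numpy as np

rng = np.random.default_rng(42)

mean_log10_T, var_log10_T = -5.32, 1.89   # statistics of the 41 log10(T) data (TP 1)
porosity = 0.16
gradient = 94.0 / 58_000.0                # average head gradient along the diagonal (m/m)
distance = 5_000.0                        # 5-km travel distance (m)
thickness = 8.0                           # assumed aquifer thickness (m), illustrative

# One uniform T per Monte Carlo trial, sampled from the lognormal fit to the data
T = 10.0 ** rng.normal(mean_log10_T, np.sqrt(var_log10_T), size=10_000)  # m^2/s

# Pore velocity v = (T / b) * i / n, and GWTT = L / v, converted to years
velocity = (T / thickness) * gradient / porosity
gwtt_years = distance / velocity / (365.25 * 24 * 3600.0)

# Sorting gwtt_years against (1..N)/N gives an unconditional CDF as in Figure 10
unconditional_gwtt_sorted = np.sort(gwtt_years)
```

Because a single lognormal T value controls each trial, the resulting GWTT range spans many orders of magnitude, which is exactly the contrast with the conditional CDFs noted in the text.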

Fourth, all approaches do relatively well for TPs 1 and 2; their mean CDFs are reasonably close to the true ones, the error is small (less than half an order of magnitude), and the overall uncertainty range is relatively small. The results for TP 1 are generally conservative, and the uncertainties are generally higher in TP 2. For TPs 3 and 4, however, the results are in general rather poor. The error can reach several orders of magnitude, and the methods are in general systematically biased: the predicted GWTT is generally longer than the true one. We can now examine these results in more detail to reveal some differences among the approaches. Remember that three of the approaches are “linearized” (LC, FF, and LS) and therefore should be sensitive to the magnitude of the variance of the log10 (T) fields. The only difference between TPs 1 and 2 is an increase in the log10 (T) variance (from 1.56 to 2.14, or, for


Figure 6a. Test problem 2 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

ln(T), from 8.25 to 11.32). It is clear, when comparing Figures 5a–9, that the increase in variance did not affect the performance of any of the methods, either linearized or nonlinear. The magnitude of the variance of log10 (T) is thus apparently not a critical issue. Linearization is generally assumed valid for ln(T) variances on the order of 1; the ln(T) variances in these test problems are well beyond this range. When we look at TPs 3 and 4, it becomes clear that some of the nonlinear approaches systematically perform better than the linearized ones: SS, ML, and PP have mean CDFs substantially closer to the true CDFs than FF and LC. FS, although nonlinear, does not do as well. Let us now turn to the bounding curves of the waste panel CDFs. These curves reveal how different the CDFs can be, for a given approach, from simulation to simulation. If the bounding curves are very near the mean CDF, it means that the T

fields are relatively well known from the available data, by calibration, and that the residual uncertainty is small. This would be desirable only if the mean CDF were very close to the true one. Otherwise, the method can be said to be “overconfident.” In PA, overconfidence can be regarded as an unacceptable “sin.” This is because the decision on whether or not to license a waste repository, that is, to declare it “safe” with regard to isolating the wastes, would then be based on overly optimistic predictions of the repository’s performance. If application of the inverse methods were to result in overconfidence, then their use in PA should be questioned. Another test for evaluating PA methodology was conducted in the United Kingdom [Mackay, 1993] to see how the “VANDAL” PA approach, developed by Her Majesty’s Inspectorate of Pollution, would perform in a synthetic case, as the amount of information made available to the modeler


Figure 6b. Test problem 2 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

would increase through reconnaissance. Although calibration was done by hand, this exercise clearly showed that the methodology employed in this case resulted in “overconfidence.” The decision on whether or not to declare the repository safe may be affected by the degree of overconfidence associated with an approach. If an approach produces results indicating the site will adequately contain the waste, but the approach is deemed to result in too much confidence, this could sway the decision maker to reject the application. In our case the distances between bounding curves predicted by the methods are small for TPs 1 and 2 and wider for TPs 3 and 4. This is satisfactory, as it shows that the methods account for more uncertainty in the more complex TPs. If we look further, we see that method SS is overconfident for TP 1, but not so for the other TPs. Method LS produces almost systematically the largest range, and LC the smallest. The range for

PP does not vary a great deal between TPs. The range for ML is, in general, the most appropriate over all the TPs; PP and SS come next.

Figure 6c. Test problem 2 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

We now briefly examine the T fields (Figures 5a–8c). For TPs 1 and 2, visually, the major high-T zones seem to be reasonably well captured by PP, SS, and ML, in that order, and a little less so by the others. For TP 4 it is evident that SS looks closer to the true field than the others. It is interesting to note that for TP 3, the participant realized that the log10 (T) sample had a bimodal distribution (because of the different geology of the “features” in the aquifer) and decided that the multi-Gaussian assumption would not be appropriate for this distribution. He therefore used the indicator kriging approach, optional in his code, with two populations, to account for this bimodal distribution. One population had high-T values; since the transient head data indicated that this population should be well connected in the −45° direction, he decided to use an anisotropic indicator variogram for that population, consistent with this observation. These high-T “features” in the −45° direction can easily be seen in Figure 7c in the given realization, and because of conditioning, some of these features are included in all realizations, so that they appear in the mean T field. The fact that method SS did not rank first in the global ranking for TP 3 may reflect that the “features” in the true T field were slightly more complex than accounted for by the anisotropic variogram. The other approaches did not really identify the disconnected high-T channels. The darker zone at the lower left corner (low-T zone) was captured by SS, LC, FS, and, although not exactly in place, by PP. For TP 4 it is interesting to see that

all approaches were more or less able to identify the connected high-T “channel” present across the domain. One issue of interest in comparing inverse approaches is parameterization [e.g., McLaughlin and Townley, 1996]. The way each method parameterizes its T field is described in Appendix B. We tried in several ways to relate parameterization to the present results but did not find any real clues. One reason is that most methods parameterized the T field with a relatively similar number of unknowns (on the order of 50) and used geostatistics to interpolate the values; furthermore, they constrained the unknowns to predefined ranges, so that in the end the specifics of the parameterization of each approach did not seem to make a large difference. We will return to this issue in the discussion, together with that of uniqueness.


Figure 7a. Test problem 3 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

Examination of the semivariograms (Figures 11–14) shows large differences among the approaches. Furthermore, it is clear that these differences are not systematic. To illustrate what can be learned from these figures, we will look, for instance, at the results of method SS. In TP 1, SS underestimates the variability: the sill of its variogram is approximately 2/3 that of the true field. As a result, the CDF of SS for TP 1 is “overconfident,” as we have seen. In TP 2, SS has the variogram closest to the true-field semivariogram, and, as a result, SS does very well on the CDF and its bounds. In TP 3, SS does the best job at short distances, even if it overestimates the sill. Since for this problem the short-scale spatial variability dominates the T fields (because of the presence of the channels), SS again does very well. In TP 4, SS (as well as LC, ML, and PP) is very close to the true-field semivariogram and also produces accurate flow results. In TP 1, PP used a

generalized covariance model with no sill (see insets of Figures 11–14). After the completion of TP 1 the PP approach was rerun using an exponential covariance model whose sill matches the true-field sill.

Figure 7b. Test problem 3 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

To summarize the initial findings so far: (1) There are significant differences between the methods and the way each approach is implemented (e.g., grid discretization and orientation). (2) The use of any of the inverse methods to condition the CDFs of GWTT on transmissivity and head data drastically reduces the uncertainty in these GWTTs, compared with the unconditional CDFs. (3) For “simple” (classical geostatistical) problems, all the approaches do a reasonably good job; the errors in GWTT are within half an order of magnitude, and in general, with only a few exceptions, the inverse methods do not build “overconfidence.” (4) For “complex” cases the nonlinear approaches do better in general. The LS method is an exception in this respect, as it does better than the other linearized ones, for reasons that we will investigate later. There is, however, a tendency for all approaches to be overconfident and to overestimate the GWTT, meaning that they do not err on the side of greater safety. (5) The magnitude of the variance of ln(T), up to 11 in our case, does not seem to be a problem, even for the linearized approaches. However, nonstationarity (and departure from a true “geostatistical” distribution) is obviously more difficult to handle for the linearized approaches than for the nonlinear ones. (6) Unconnected channels are poorly identified by most inverse methods; an exception is, however, the SS method using the multiple-population approach. If the presence and average direction of such channels can be identified by external data (in this case, the transient head response to the aquifer tests), then this approach can be geared to generate such channels in the selected direction(s). An alternative,

which was attempted by the participant of method FF, is to introduce such features “by hand.” In both cases, if the features (or positions) of these channels have been correctly identified, this will of course improve the results. (7) A good selection of the variogram of the true T field seems to improve the results of the inverse method. This first series of findings was purposely based on qualitative “subjective” judgments, without any attempt to quantify the results. In the next section we build a number of objective evaluation measures and analyze their results statistically.
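The semivariogram comparison of section 3.2 (empirical estimation at randomly placed sample points, followed by fitting an exponential model via nonlinear regression) can be sketched as follows. The synthetic field, the lag binning, and the simple grid search standing in for the regression are all illustrative choices, not the GAMV2M implementation itself:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample a hypothetical log10(T) field at ~600 random points in a 30-km square
n = 600
xy = rng.uniform(0.0, 30_000.0, size=(n, 2))
z = np.sin(xy[:, 0] / 5_000.0) + 0.3 * rng.normal(size=n)  # spatially correlated stand-in

# All point pairs: separation distances and squared value differences
iu = np.triu_indices(n, k=1)
h = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(-1))[iu]
dz2 = ((z[:, None] - z[None, :]) ** 2)[iu]

# Empirical semivariogram: gamma(h) = 0.5 * mean squared difference per lag bin
bins = np.linspace(0.0, 15_000.0, 16)
which = np.digitize(h, bins)
lags = 0.5 * (bins[:-1] + bins[1:])
gamma = np.array([0.5 * dz2[which == k].mean() for k in range(1, len(bins))])

# Fit gamma(h) = sill * (1 - exp(-3h/a)): grid search over the range a,
# closed-form least squares for the sill (stands in for nonlinear regression)
best = (np.inf, 0.0, 0.0)
for a in np.linspace(500.0, 20_000.0, 200):
    basis = 1.0 - np.exp(-3.0 * lags / a)
    sill = (basis @ gamma) / (basis @ basis)
    sse = float(((gamma - sill * basis) ** 2).sum())
    if sse < best[0]:
        best = (sse, sill, a)
_, sill, fitted_range = best
```

Averaging such `gamma` estimates over the ensemble of realizations, then fitting the model to the average, reproduces the workflow described for each TP.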

4. Quantitative Comparisons

The GXG decided that the comparison of the methods should be primarily based on quantifiable measures that can be directly related to the ability of the model to predict transport.
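Both quantities evaluated below (GWTT and path orientation) come from particle tracking out to a 5-km radial distance. A minimal sketch using forward Euler stepping; the time step, the stepping scheme, and the smooth analytic velocity field are purely illustrative stand-ins for the Darcy velocities of a calibrated model:

```python
import numpy as np

def track_particle(x0, velocity, dt=50.0, r_max=5_000.0, max_steps=100_000):
    """Forward Euler tracking until the path is r_max from its release point.

    velocity(x) returns the pore-velocity vector in m/yr; dt is in years.
    Returns the travel time (years) and the path as an (n, 2) array.
    """
    x0 = np.asarray(x0, dtype=float)
    x, t, path = x0.copy(), 0.0, [x0.copy()]
    for _ in range(max_steps):
        x = x + velocity(x) * dt
        t += dt
        path.append(x.copy())
        if np.linalg.norm(x - x0) >= r_max:
            break
    return t, np.array(path)

def crossing_angle(path):
    """Path orientation (degrees) from release point to the crossing point (cf. EM 4)."""
    d = path[-1] - path[0]
    return float(np.degrees(np.arctan2(d[1], d[0])))

# Illustrative velocity field (m/yr) with a gentle transverse variation
v = lambda x: np.array([5.0 + 0.5 * np.sin(x[1] / 2_000.0), 1.0])
gwtt, path = track_particle([0.0, 0.0], v)
angle = crossing_angle(path)
```

Repeating this for every realization and release point yields the GWTT and path-angle distributions that the evaluation measures below compare against the true field.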


Figure 7c. Test problem 3 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

The advective groundwater travel time of a conservative solute in the aquifer was selected as the most significant outcome of a calibrated model on which the evaluation of the methods should be based. This is a performance measure related to the ability of a waste disposal site to isolate waste, yet it is simpler than prediction of solute concentration. In addition, it was decided that GWTT alone was insufficient and that the groundwater flow paths should also be examined. It was reasoned that calculations resulting in accurate GWTT but very inaccurate groundwater flow paths would probably not be defendable, even though there is no regulatory requirement pertaining specifically to contaminant migration paths. To this end, particle-tracking calculations were performed; particles were released at a number of selected locations, and the travel path and groundwater travel time to reach a radial distance of 5 km from the release point were calculated. In addition, the

orientation of the flow path from the release point to the crossing point at the 5-km radial boundary was determined. For brevity, the name PATH will be used to refer to the particle pathline analyses. Ten quantitative evaluation measures were tested and applied to the results of the test problems. These measures will be described under three headings: the fixed well approach, the random well approach, and the field variables measures.

4.1. The Fixed Well Approach

A selection of 10 –30 particle release points was used for each test problem. The points were randomly located but more or less uniformly distributed over the total area of the studied field (as opposed to just within the waste panel area). The flow lines originating from these points were calculated by particle tracking (Figure 15). For each release point the distribution of


Figure 8a. Test problem 4 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

GWTTs estimated from the ensemble of simulated fields is compared with the GWTT value from the true field. The analysis thus involves comparing a distribution of GWTTs to a single value (the true-field GWTT), contrary to the random well approach (described below), where we compare two GWTT distributions. The path lines predicted by each approach are also compared with those of the true fields (path line calculations are denoted “PATH”). Combining all four test problems, 88 particle paths were analyzed for each approach for the fixed well release points case. Given that there are two CDFs for each release point (one for GWTT, one for PATH) and seven approaches, this results in more than 1200 CDFs. Consequently, only a few samples are shown here (Figure 16). The aim of this analysis is also to evaluate the inverse approaches on their ability to predict advective transport, but this

time the locations of the release points are distributed over the whole domain. Because the analysis involves several independent measures and numerous release points, it will be possible to conduct statistical tests to assess differences in performance. Five evaluation measures were defined within the framework of the fixed well approach, the details of which are given in Appendix C.

Evaluation measure (EM) 1: The GWTT error (denoted “Error” in the tables) compares the median of the simulated GWTTs with the true GWTT.

EM 2: The GWTT “degree of caution” (denoted “Caut”) measures the propensity of an approach (if any) to underestimate rather than overestimate the GWTT. This is a PA-specific concept where it is considered better to err on the side of greater safety and protection of public health, which equates to predicting faster travel times. In the parlance of PA terminology, such predictions are referred to as “conservative” predictions.

Figure 8b. Test problem 4 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

EM 3a: The GWTT spread (denoted “Sprd”) measures the width of the GWTT CDF produced by a method, i.e., the uncertainty that an approach associates with its GWTT predictions.

EM 3b: The GWTT robustness (denoted “Boot”) measures the number of times the true GWTT falls within the inner 95% of the simulated GWTT distribution (the bootstrap test). The Sprd and Boot measures must be considered simultaneously. For instance, an approach which predicts a small Sprd (a small uncertainty) but which fails the bootstrap test clearly underestimates the uncertainty. Similarly, an approach resulting in a large Sprd and a good Boot measure may be overpredicting the uncertainty. The goal is to have the smallest Sprd with a good Boot. Together the Sprd and Boot measures reflect whether an approach is “self-consistent,” that is, whether it over- or underpredicts the uncertainty. For ranking purposes a single index grouping the two, called the normalized self-consistency measure (NSC), was also constructed (see Appendix C).

EM 4: The PATH error quantifies the absolute deviation (in degrees) between the median path direction angle and the direction of the true path. The orientation of the path is defined by the angle from the release point to the point where the path crosses a circle of radius 5 km (centered at the release point).

EM 5a: The PATH spread measure quantifies the spread in the distribution of path angles.

EM 5b: The PATH bootstrap (robustness) measure, as for GWTT, measures the robustness of the path line calculations.

It should be noted that all these measures are computed for


Figure 8c. Test problem 4 mean log10 (T) field, randomly selected single realization overlain with bounding pathlines for that realization, and the GWTT CDF curves from all realizations. Thick line is true CDF, dashed line is mean CDF, and other curves bound the inner 95% of all conditional CDFs.

each release point and are averaged over all release points, where the averaging is made with nonuniform weights. Additionally, each release point was given a weight reflecting how close this point was to the true observation data. It was reasoned that points surrounded by measurements should be better predicted by the methods than those far away from any measurements, so they are assigned relatively higher weights. The way in which these weights were derived is explained in Appendix C.

4.2. The Random Well Approach

This approach involves a comparison of the estimated GWTT CDF with the true GWTT CDF. The CDFs to be compared are the ones generated from the 100 release points within the waste panel area shown in Figures 5a–8c. Three evaluation measures have been defined. Figure 17 and Appendix C clarify how these measures are constructed.

EM 6: The GWTT error is a measure of the disparity between the median GWTT CDF and the true CDF.

EM 7a: The GWTT spread measures the area between the 95% bounding envelopes of all waste panel GWTT CDFs.

EM 7b: The GWTT robustness measure quantifies what proportion of the true CDF is contained within the 95% bounding envelope.

4.3. The Field Variables

The final three evaluation measures (detailed in Appendix C) compare the simulated T fields and head fields with the true fields, and the semivariograms of the log10 (T) fields.
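The log10 (T) and head error measures defined in this subsection both reduce to comparing an ensemble of simulated fields with the single true field. One plausible sketch is given below; the exact error definitions are in Appendix C of the article, so the RMS form used here is an assumption, as are the synthetic fields:

```python
import numpy as np

def field_error(ensemble, truth):
    """Average RMS difference between simulated fields and the true field.

    ensemble: (n_realizations, ny, nx); truth: (ny, nx).
    This RMS form is an assumed stand-in for the Appendix C definition.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    truth = np.asarray(truth, dtype=float)
    rms_per_realization = np.sqrt(((ensemble - truth) ** 2).mean(axis=(1, 2)))
    return float(rms_per_realization.mean())

rng = np.random.default_rng(7)
true_logT = rng.normal(-5.3, 1.4, size=(32, 32))            # stand-in true field
sims = true_logT + rng.normal(0.0, 0.5, size=(60, 32, 32))  # 60 synthetic realizations
em8_like = field_error(sims, true_logT)                     # close to the 0.5 noise level
```

The same function applied to simulated versus true head fields would give the head error analogue.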


Figure 9. Waste panel GWTT CDFs for the linearized semianalytical (LS) method.

EM 8: The log10 (T) error measures the difference between the ensemble of T fields and the true T field.

EM 9: The head error similarly measures the difference between the ensemble of head fields and the true-field head solution.

EM 10: The semivariogram error measures the difference between the average of the semivariograms of the simulated log10 (T) fields and that of the true one (Table 3).

4.4. Analysis of the Results

The 10 evaluation measures described in the previous section were computed for each of the approaches in each test problem, and the “raw” (untransformed) evaluation scores are listed in Table 4. For several of the measures, the scores vary only within the range [0, 1]. For the other measures, however, the scores are bounded only by zero and can range beyond one. Because these “raw” measures are not all computed in the same units, they cannot be meaningfully averaged and are not directly comparable. Transformation of the evaluation measure scores to consistent units was performed in two ways: by converting them to standardized variables and via rank transformation. Statistical analyses were performed on both the standardized and the rank-transformed variables. However, analyses of the rank-transformed variables were considered more powerful from a statistical viewpoint, and therefore only the rank-transformed results are presented here. For method LS, which did not produce transmissivity and head fields, the comparison was done excluding the head and log10 (T) field error measures.

Figure 10. Unconditional waste panel GWTT CDFs from the random homogeneous (RH) case. Note that the ranges of GWTT in these plots span several orders of magnitude more than those from the inverse methods (Figures 5a–8c).

Figure 11. Average semivariograms for test problem 1.

Figure 13. Average semivariograms for test problem 3.

4.4.1. Evaluation measures overview and ranking. The rank-transformed scores for the 10 evaluation measures for each test problem and method are listed in Table 5. The average rank across all measures in a test problem is shown in the far right column, and the average score for each measure across all four test problems is shown down the columns. It was of interest to determine whether the two sets of GWTT analyses (the random well and the fixed well approaches) would lead to different conclusions. To compare these measures, we averaged the five evaluation measures for the fixed well approach and the two evaluation measures for the random well approach from Table 5, for each approach over all test problems. Table 6 shows that on average, similar rankings are obtained for both analysis approaches, except for LC, which does slightly better in the random well case. The comparison of distributions with the single-valued “truth” used in the fixed well approach, although involving more evaluation measures, is thus a relatively robust indicator of the performance of an approach and compares favorably with the better founded comparison of distributions. This result also shows that the relative performance of the methods does not change depending on whether the analysis is conducted in the vicinity of the highest data density

(random well case) or throughout the region with more sparse data (fixed well case). This may be expected because of the kriging variance-based weighting used in the fixed well case. Thus far we have not paid much attention to the correct orientation of the flow lines predicted by each method. On average, it appears that method LS performs somewhat better than any other method at predicting the correct path. The spread of the PATH CDFs for this approach is also, in general, wider than for the others, as was the case for GWTT. It is also worth mentioning that LS scores best for Caut, as a result of a conscious decision by the participant to “fine tune” his approach to meet this criterion. The head and log10 (T) errors are highly correlated; the values listed in Table 4 are plotted in Figure 18 to show this correlation. The closer the T fields are to the true T field, the smaller, in general (save for TP 3), are the head errors. Linear regression performed on the results from TPs 1, 2, and 4 (dashed line in Figure 18) has a coefficient of determination of 0.70. In TP 3 the average magnitude of the head error (for all approaches) is approximately half an order of magnitude larger than for the other TPs (see Table 4). The exception is method PP, which does much better on head than any other approach. Note, however, that its performance in the log10 (T) error is not that much better than SS or ML. One likely reason is that the flexibility of this approach in optimally choosing the parameterization (optimal selection of the location

Figure 12. Average semivariograms for test problem 2.

Figure 14. Average semivariograms for test problem 4.


ZIMMERMAN ET AL.: COMPARISON OF INVERSE APPROACHES

Figure 15. Fixed release points and pathlines on the true log (T) fields. Pathlines extend a radial distance of 5 km.

of the PP) makes it possible to fit the head better than other approaches. This may be especially true for complex flow systems, but perhaps at the expense of producing a T field which corresponds poorly to the true T field (see Figures 7a–7c, where it is clear that the PP T fields have a more continuous design pattern (zones of high and low Ts) than the ML and SS methods). This exemplifies the necessity of prescribing, in one form or another, a “plausibility” criterion on the T field in an inverse solution. As shown by Carrera and Neuman [1986a, b], the minimization of the head differences alone is insufficient.

4.4.2. Statistical analysis of the evaluation measures.

The results presented in Table 5 can be used in statistical analyses to indicate whether the performances of the seven methods (as quantified through the evaluation measures) are significantly different. In these analyses it is assumed that each evaluation measure (appropriately transformed to ranks or standardized) is an independent measure of the performance of the approach. A two-factor analysis of variance (ANOVA) was used to analyze the performance measure information. This approach considered the two factors, test problem and method, and their interaction to be potentially significant sources of variation in method performance. The validity of statistical tests obtained through the ANOVA on ranked data depends on the assumption that the performance values of each approach are independent measures of the same quantity which have a constant variance. Thus we are assuming that each evaluation measure is an independent measure of the performance of the method, where each measure quantifies performance in a different way. Because the performance values are clearly not independent and the assumption of constant variance may be questionable, the results from the ANOVA tests were used only as an indicator that substantial differences may exist and to suggest a general ordering of the methods, rather than to declare the results “statistically significant.” The null hypothesis for the two-way ANOVA is that there is no difference among the approaches or the test problems or the combination of the two. The computed p value is the probability that an F statistic greater than the one observed


Figure 16. Examples of some GWTT CDFs for the fixed release points case (all plots are release point 16, TP 3). Vertical line is the true GWTT.

would be obtained if the null hypothesis were true. The p value for the method/test problem interaction was 0.0079, indicating a strong interaction between the approach and test problem, that is, that the performance of an approach tends to differ depending on the test problem. The nature of this interaction is illustrated in Figure 19, which is a plot of the average evaluation scores across the 10 measures in each test problem, for each approach. From this figure we can see that while the performance of some of the methods is relatively consistent across all test problems (such as SS and FF), the performance of other methods (such as LS and PP) tends to depend on the

particular test problem considered. In fact, the performance of the LS method, and perhaps also that of the PP method, tends to improve across test problems. For method LS, the most likely explanation is that the participant changed the method of integration and could thus use finer time steps. For method PP the explanation could be that the participants took more care in the application of the method (e.g., selection of the semivariogram), or that the approach is more suited to more complex test problems, or both. Apart from the dependence of method performance on the test problem, Figure 19 also indicates that the performance of


Figure 17. GWTT CDF evaluation measures used in the random well case. Thin solid lines are the 0.025 and 0.975 quantile CDFs, the thick dashed line is the median CDF, and the thick solid line is the “true” CDF.

the SS method is, three times out of four, superior to that of the other approaches. For certain test problem/method comparisons (such as the comparison of SS and FF on TP 4), the difference in performance is substantial; for other test problem/method comparisons (such as the SS and PP methods on TP 3), the difference is extremely small. Given this result and the test problem/method interaction, it does not appear that there is sufficient evidence to conclude that the performance of one approach was consistently and significantly superior to that of all other approaches. These observations were reinforced by conducting a one-way ANOVA on the average of the rank-transformed evaluation measure scores taken across all 10 measures for each test problem. This gives four measures of performance for each approach, one for each test problem. Because the test problems were constructed, sampled, and analyzed independently of each other, the overall average performance measure scores should also be independent. The one-way ANOVA was used to test the hypothesis that the average performance of all the

Table 3. Parameters of the Exponential Semivariogram Model Fit to the Average Semivariogram Across All Realizations

Test Problem 1
Method   σ²      λ, m    R²       RMSE   Cut, 1000 m   Jσ²    Jλ     Jγ     Rank
True     1.66     2808   0.9949   0.00   12            ···    ···    ···    ···
FF       0.79     4074   0.9976   0.72    9            0.34   0.31   0.32   3
FS       2.41     5503   0.9978   0.26   12            0.31   0.49   0.45   6
LC       0.29     2142   0.9947   1.13   15            0.45   0.19   0.26   1
LS       2.99     6900   ···      ···    ···           0.44   0.59   0.55   7
ML       1.26     4730   0.9913   0.47   15            0.19   0.41   0.35   5
PP       1.52     3767   0.9998   1.52    9            0.60   0.25   0.34   4
SS       1.28     3995   0.9975   0.41   10            0.19   0.30   0.27   2

Test Problem 2
True     1.66     2808   0.9949   0.00   12            ···    ···    ···    ···
FF       2.37    12576   0.9989   0.49   10            0.30   0.78   0.66   7
FS       4.59     4395   0.9994   1.40    7            0.64   0.36   0.43   6
LC       0.54     3617   0.9971   0.94   12            0.40   0.22   0.27   3
LS       2.99     4000   ···      ···    ···           0.44   0.30   0.34   5
ML       2.32     3776   0.9993   0.49   15            0.28   0.26   0.26   2
PP       2.34     2525   0.9993   0.66   15            0.29   0.09   0.14   1
SS       1.92     4687   0.9978   0.13    9            0.14   0.40   0.33   4

Test Problem 3
True     1.35      425   0.9986   0.00   12            ···    ···    ···    ···
FF       5.09     1134   0.9726   3.56   15            0.73   0.63   0.66   4
FS       4.29     2429   0.9978   2.43   15            0.69   0.83   0.79   7
LC       2.91     3099   0.9983   0.93    9            0.54   0.86   0.78   6
LS       0.75     2700   ···      ···    ···           0.31   0.84   0.71   5
ML       1.41     1605   0.9992   0.25    9            0.04   0.74   0.56   2
PP       0.74     1562   0.9990   0.67   10            0.31   0.73   0.62   3
SS       1.76      384   0.9981   0.43    9            0.23   0.09   0.12   1

Test Problem 4
True     2.18     2063   0.9927   0.00   15            ···    ···    ···    ···
FF       6.28     3320   0.9986   2.56    9            0.65   0.38   0.45   7
FS       3.98     2715   0.9968   1.15    9            0.45   0.24   0.29   5
LC       2.12     2128   0.9967   0.23   15            0.03   0.03   0.03   1
LS       2.24     1600   ···      ···    ···           0.03   0.18   0.14   2
ML       1.97     1046   0.9842   0.32   15            0.09   0.33   0.27   4
PP       2.99     3044   0.9993   0.35    9            0.27   0.32   0.31   6
SS       2.09     1497   0.9979   0.18   15            0.04   0.22   0.18   3

Parameters for the LS method were provided by the participant. RMSE is calculated between the average semivariogram and the true-field semivariogram using equally spaced observation points and equal weights. “Cut” is the limiting distance used for the curve fitting and the RMSE calculations; beyond cut (which was chosen subjectively) the semivariogram estimates become erratic and are likely to be very unreliable. Jλ and Jσ² are the evaluation measure scores for the correlation length and sill, respectively. Rank is based on the overall correlation structure score, Jγ = (3·Jλ + Jσ²)/4.
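The exponential model fitted in Table 3, γ(h) = σ²(1 − e^(−h/λ)), and the composite structure score from the footnote can be sketched in Python; the Jλ and Jσ² inputs below are transcribed from the test problem 1 rows of Table 3:

```python
import math

def exp_semivariogram(h, sill, corr_len):
    """Exponential model fitted to the average semivariograms in Table 3."""
    return sill * (1.0 - math.exp(-h / corr_len))

def structure_score(j_lambda, j_sigma2):
    """Overall correlation structure score from the Table 3 footnote."""
    return (3.0 * j_lambda + j_sigma2) / 4.0

# (J_lambda, J_sigma2) pairs for test problem 1, transcribed from Table 3.
tp1 = {"FF": (0.31, 0.34), "FS": (0.49, 0.31), "LC": (0.19, 0.45),
       "LS": (0.59, 0.44), "ML": (0.41, 0.19), "PP": (0.25, 0.60),
       "SS": (0.30, 0.19)}

j_gamma = {m: structure_score(jl, js) for m, (jl, js) in tp1.items()}
ranking = sorted(j_gamma, key=j_gamma.get)  # best (LC) first, worst (LS) last
```

The resulting order, LC, SS, FF, PP, ML, FS, LS, matches the Rank column of the test problem 1 block above.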


Table 4. “Raw” Evaluation Measure Scores, Test Problems 1–4

              ------------------ Fixed Release Points ------------------   ----- Random Well -----   ------ Field Variable ------
Method  Npts  GWTT   GWTT   GWTT   GWTT   PATH   PATH   PATH      GWTT    GWTT   GWTT    Log(T)  Head   Semi-
              Error  Cnsv   Boot   Sprd   Error  Boot   Sprd      Error   Sprd   Boot    Error   Error  variogram

Test Problem 1
FF      10    0.24   0.25   0.26   1.19   0.28   0.16    38.8      5.73   0.38   0.97    0.543   2.41   0.32
FS      10    0.20   0.12   0.05   2.11   0.42   0.05    61.0      3.15   0.34   0.73    0.507   3.94   0.45
LC       9    0.23   0.46   0.06   1.20   0.32   0.06    37.7      1.20   0.16   0.82    0.334   2.28   0.26
LS       4    0.17   0.27   0.21   1.69   0.20   0.05   179.0      8.19   1.87   0.29    N/A     N/A    0.55
ML      10    0.17   0.25   0.05   1.35   0.23   0.05    48.9      4.49   0.44   0.74    0.739   4.19   0.35
PP      10    0.39   0.45   0.05   3.38   0.50   0.05    65.3      5.94   0.53   0.64    0.853   2.56   0.34
SS       8    0.15   0.40   0.05   0.90   0.17   0.21    29.0      3.93   0.11   0.99    0.483   2.51   0.27

Test Problem 2
FF      20    0.25   0.12   0.19   1.42   0.29   0.16    48.5      6.34   0.61   0.89    0.691   2.59   0.66
FS      22    0.27   0.25   0.05   2.71   0.51   0.05   106.2      3.99   0.71   0.42    1.169   5.99   0.43
LC      22    0.25   0.11   0.19   1.35   0.24   0.43    39.0      3.02   0.19   0.93    0.559   2.43   0.27
LS      11    0.45   0.09   0.14   1.62   0.29   0.05    49.2      6.37   1.41   0.29    N/A     N/A    0.34
ML      23    0.15   0.18   0.01   1.81   0.37   0.05   112.6      3.14   0.26   0.59    0.856   3.58   0.26
PP      23    0.16   0.14   0.01   2.15   0.35   0.04    59.3      7.07   0.53   0.57    0.708   3.01   0.14
SS      22    0.13   0.32   0.14   1.03   0.21   0.28    32.2      1.80   0.56   0.00    0.584   2.35   0.33

Test Problem 3
FF      27    0.50   0.57   0.42   2.19   0.80   0.53    44.0     14.03   2.54   0.58    1.202   11.5   0.66
FS      27    0.33   0.63   0.06   3.27   0.86   0.34   106.7     10.66   0.74   1.00    1.366   16.4   0.79
LC      19    0.37   0.45   0.39   2.14   1.36   0.94    44.6      7.01   0.36   1.00    1.321   19.3   0.78
LS      22    0.25   0.47   0.52   1.65   0.45   0.47    43.5     11.39   2.63   0.00    N/A     N/A    0.71
ML      27    0.35   0.15   0.14   2.22   0.75   0.30    84.2      6.43   1.11   0.91    0.907   12.8   0.56
PP      27    0.33   0.58   0.42   1.45   0.55   0.22    66.8      6.13   0.38   0.96    0.807    6.7   0.62
SS      21    0.18   0.34   0.00   1.91   0.79   0.45    58.7      7.19   0.75   0.99    0.859   15.7   0.12

Test Problem 4
FF      27    0.39   0.52   0.22   3.15   0.38   0.26    76.8      9.04   0.79   1.00    1.731   5.57   0.45
FS      27    0.31   0.48   0.06   3.01   0.37   0.05   110.9      8.84   0.94   1.00    1.768   6.33   0.29
LC      24    0.44   0.45   0.39   2.41   0.83   0.17   131.4      8.37   0.42   1.00    1.344   4.25   0.03
LS      23    0.32   0.29   0.14   2.27   0.24   0.01   138.0      4.72   1.78   0.01    N/A     N/A    0.14
ML      26    0.36   0.58   0.39   1.77   0.26   0.07    46.3      2.87   1.09   0.12    1.032   4.20   0.27
PP      27    0.26   0.33   0.03   2.66   0.31   0.05    79.1      6.03   0.50   1.00    1.194   6.09   0.31
SS      22    0.28   0.36   0.28   2.01   0.29   0.00    63.5      2.66   1.20   0.07    0.914   3.19   0.18

All measures were constructed such that the target value is zero. Npts is the number of release points used for the fixed release points GWTT and PATH analyses.
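The head-error versus log10 (T)-error relation quoted for Figure 18 (a coefficient of determination of 0.70 for TPs 1, 2, and 4) can be reproduced from the Table 4 columns; a minimal pure-Python sketch (LS is excluded because it produced no T or head fields):

```python
# Log10(T) and head errors for TPs 1, 2, and 4 (methods FF, FS, LC, ML, PP, SS),
# transcribed from Table 4.
logt = [0.543, 0.507, 0.334, 0.739, 0.853, 0.483,   # TP 1
        0.691, 1.169, 0.559, 0.856, 0.708, 0.584,   # TP 2
        1.731, 1.768, 1.344, 1.032, 1.194, 0.914]   # TP 4
head = [2.41, 3.94, 2.28, 4.19, 2.56, 2.51,
        2.59, 5.99, 2.43, 3.58, 3.01, 2.35,
        5.57, 6.33, 4.25, 4.20, 6.09, 3.19]

def r_squared(x, y):
    """Coefficient of determination of the simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

r2 = r_squared(logt, head)  # about 0.70, as quoted for Figure 18
```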

approaches is the same. The F test from the ANOVA indicated significant differences (p value = 0.0055); therefore Fisher’s least significant difference (LSD) pairwise comparison procedure [Steel and Torrie, 1980] was used to determine which approaches differed. The results show a great deal of overlap in the performance of the approaches. This is expected because of averaging across test problems when significant interaction is present. Although no single approach performed significantly better than all other methods in all cases, we can roughly delineate three performance groups. The SS approach had the best overall performance, although its overall average performance may not be substantially better than that of a middle group comprising the ML, LS, and PP methods. The performance of the SS method may be significantly superior to all other approaches for a particular test problem, as indicated in Figures 19 and 20. Method SS performs significantly better than the third group containing the FS, LC, and FF methods. While there is a strong similarity between the performance of the FF and the LC approaches, the results of the ANOVA do not indicate a clear differentiation in performance between the linearized and nonlinear methods. This is in part because the results for the LS method were more similar to those of the nonlinear approaches, and method LS did not perform as poorly as the other linearized approaches. In addition to significance testing via ANOVA procedures,

cluster analyses were performed using the average, across the four test problems, of each of the evaluation measure scores (except for the head and log10 (T) error measures because method LS did not produce head and log (T) fields). Cluster analysis is a statistical procedure for partitioning multivariate data into groups based on some measure of similarity. The correlation between the performance vectors was used as the measure of similarity, so that two approaches are deemed similar if the correlation between their performance vectors is high. Clustering began with each method as a separate cluster and was allowed to continue until all methods were combined into one cluster. Amalgamation of the approaches into clusters was performed using unweighted pair-group averaging [Johnson and Wichern, 1982]. The results are shown in Figure 21, where the clustering is stopped when there are two clusters remaining. These results appear to distinguish the behavior of the linear and nonlinear approaches, as the two remaining clusters fall into those categories.
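The two procedures described above, the one-way ANOVA on per-test-problem average ranks and the correlation-based cluster analysis, can be sketched in pure Python. The input vectors are the per-test-problem average rank scores transcribed from Table 5; the published cluster analysis used the averaged evaluation measure vectors instead, so the clustering input here is illustrative only, and the simple merge loop stands in for a full UPGMA dendrogram:

```python
# Per-test-problem average rank scores (TPs 1-4), transcribed from Table 5.
scores = {
    "FF": [4.55, 4.80, 4.80, 5.85],
    "FS": [3.45, 5.15, 5.25, 4.95],
    "LC": [3.55, 3.15, 5.20, 4.95],
    "LS": [4.25, 4.19, 3.38, 2.31],
    "ML": [3.55, 3.90, 3.15, 3.50],
    "PP": [4.95, 3.55, 2.70, 3.40],
    "SS": [3.15, 2.70, 2.80, 2.10],
}

def one_way_anova_f(groups):
    """F statistic for the null hypothesis that all group means are equal."""
    obs = [v for g in groups for v in g]
    grand = sum(obs) / len(obs)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (len(obs) - len(groups))
    return ms_between / ms_within

# About 4.3 on (6, 21) degrees of freedom, consistent with p < 0.01.
f_stat = one_way_anova_f(list(scores.values()))

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def cluster_two_groups(vectors):
    """Agglomerate by unweighted pair-group averaging of the
    1 - Pearson correlation distance until two clusters remain."""
    clusters = [[name] for name in vectors]
    while len(clusters) > 2:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(1.0 - pearson(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters

two_groups = cluster_two_groups(scores)
```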

5. Discussion of Results

Before drawing conclusions from this comparison, the reasons for some of the results observed should be clarified. In the following sections we discuss issues that are generic to inverse modeling, issues related to the assumptions used in the modeling, issues related to the characteristics of the test problem data sets, the comparison exercise itself, and issues which are approach specific.

Table 5. Rank-Transformed Evaluation Measure Scores

                ------ Fixed Release Points ------     - Random Well -    - Field Variable -
Method  TP     GWTT   GWTT   GWTT   PATH   PATH      GWTT   GWTT      Log(T)  Head   Vario-  Average
               Error  Cnsv   NSC    Error  NSC       Error  NSC       Error   Error  gram    Score
FF      1      6.0    2.5    6.5    4.0    6.0       5.0    6.5       4.0     2.0    3.0     4.55
FF      2      4.5    3.0    7.0    3.5    5.0       5.0    7.0       3.0     3.0    7.0     4.80
FF      3      7.0    5.0    6.0    5.0    6.0       7.0    2.0       4.0     2.0    4.0     4.80
FF      4      6.0    6.0    5.0    6.0    6.5       7.0    6.0       5.0     4.0    7.0     5.85
FF      Avg    5.88   4.13   6.13   4.63   5.88      6.00   5.38      3.75    2.75   5.25    5.00
FS      1      4.0    1.0    3.0    6.0    2.0       2.0    2.5       3.0     5.0    6.0     3.45
FS      2      6.0    6.0    4.0    7.0    3.0       4.0    3.5       6.0     6.0    6.0     5.15
FS      3      3.5    7.0    2.0    6.0    4.0       5.0    7.0       6.0     5.0    7.0     5.25
FS      4      3.0    5.0    2.5    5.0    4.0       6.0    7.0       6.0     6.0    5.0     4.95
FS      Avg    4.13   4.75   2.88   6.00   3.25      4.25   5.00      5.25    5.50   6.00    4.70
LC      1      5.0    7.0    5.0    5.0    5.0       1.0    4.5       1.0     1.0    1.0     3.55
LC      2      4.5    2.0    6.0    2.0    1.0       2.0    6.0       1.0     2.0    5.0     3.15
LC      3      6.0    3.0    4.0    7.0    7.0       3.0    6.0       5.0     6.0    5.0     5.20
LC      4      7.0    4.0    7.0    7.0    6.5       5.0    4.0       4.0     3.0    2.0     4.95
LC      Avg    5.63   4.00   5.50   5.25   4.88      2.75   5.13      2.75    3.00   3.25    4.21
LS      1      2.5    4.0    6.5    2.0    4.0       7.0    1.0       N/A     N/A    7.0     4.25
LS      2      7.0    1.0    5.0    3.5    7.0       6.0    2.0       N/A     N/A    2.0     4.19
LS      3      2.0    4.0    7.0    1.0    4.0       6.0    1.0       N/A     N/A    2.0     3.38
LS      4      4.0    1.0    2.5    1.0    2.0       3.0    1.0       N/A     N/A    4.0     2.31
LS      Avg    3.88   2.50   5.25   1.88   4.25      5.50   1.25      N/A     N/A    3.75    3.53
ML      1      2.5    2.5    2.0    3.0    1.0       4.0    4.5       5.0     6.0    5.0     3.55
ML      2      2.0    5.0    1.0    6.0    4.0       3.0    5.0       5.0     5.0    3.0     3.90
ML      3      5.0    1.0    3.0    3.0    2.0       2.0    3.5       3.0     3.0    6.0     3.15
ML      4      5.0    7.0    6.0    2.0    5.0       2.0    3.0       2.0     2.0    1.0     3.50
ML      Avg    3.63   3.88   3.00   3.50   3.00      2.75   4.00      3.75    4.00   3.75    3.53
PP      1      7.0    6.0    4.0    7.0    3.0       6.0    2.5       6.0     4.0    4.0     4.95
PP      2      3.0    4.0    2.0    5.0    2.0       7.0    3.5       4.0     4.0    1.0     3.55
PP      3      3.5    6.0    5.0    2.0    1.0       1.0    3.5       1.0     1.0    3.0     2.70
PP      4      1.0    2.0    1.0    4.0    3.0       4.0    5.0       3.0     5.0    6.0     3.40
PP      Avg    3.63   4.50   3.00   4.50   2.25      4.50   3.63      3.50    3.50   3.50    3.65
SS      1      1.0    5.0    1.0    1.0    7.0       3.0    6.5       2.0     3.0    2.0     3.15
SS      2      1.0    7.0    3.0    1.0    6.0       1.0    1.0       2.0     1.0    4.0     2.70
SS      3      1.0    2.0    1.0    4.0    4.0       4.0    5.0       2.0     4.0    1.0     2.80
SS      4      2.0    3.0    4.0    3.0    1.0       1.0    2.0       1.0     1.0    3.0     2.10
SS      Avg    1.25   4.25   2.50   2.25   4.50      2.25   3.63      1.75    2.25   2.50    2.70

The “raw” Boot and Sprd scores were combined into the “NSC measure” as described in Appendix C. The lower the rank, the better the performance.

5.1. Uniqueness and Ill-posedness

The issue of uniqueness of the inverse solution is discussed by McLaughlin and Townley [1996], who describe conditions that must be met for an inverse problem to be well posed. Dietrich and Newsam [1990] show that the problem of estimating transmissivity from steady state head measurements is ill posed unless the flow system is forced by a known recharge or pumpage which is sufficiently large to produce closed head contours over the region of interest. Although ill-posedness can be mitigated to some extent when head measurements are augmented by transmissivity measurements, as in the WIPP test problems, it is still possible that the resulting problems do not have unique solutions. That is, many different transmissivity fields may yield equally good fits to the available measurements. Some of these may be fortuitously closer to the “true” transmissivity field than others, but all are equally consistent with the data presented in the TPs. Clearly, this complicates the process of comparing different inverse approaches, but this is the reality facing a modeler at any site. Although it might have seemed reasonable to base an inverse comparison on TPs that were well posed, a conscious decision was made to model the TPs after the real WIPP problem, which is probably ill posed in the sense that it does not have a unique solution. This decision forced each participant to deal with the issue of ill-posedness in their own way, generally by constraining the set of possible transmissivity solutions. In this study, the constraints were conveyed primarily by the transmissivity parametrization, which specifies how transmissivity values must vary over space [McLaughlin and Townley, 1996]. Parametrizations were implemented by specifying a particular transmissivity variogram, a particular spatial block scheme, and/or a particular set of pilot points, depending on the approach used (see Appendix B). A properly designed parametrization should transform the original ill-posed problem into a well-posed problem with a unique solution. The need to deal with ill-posedness was thus one of the intrinsic features of the comparison.

Since the different inverse approaches constrained the original problem in different ways, it could be argued that these approaches ultimately solved different problems. This even led one participant to claim that since there was no proven unique “truth,” and since any one solution of the inverse problem obtained by any method could equally well have been the “truth” (provided its values of transmissivity and head at the measurement points were sufficiently close to the sample data), the best simulation technique should be the one having the widest spectrum encompassing all the possible “truths” produced by each approach. In performance assessment, widening the spread of the possible outcomes of simulations is termed “risk dilution,” as it may diminish the probability of the high-consequence region. This criterion was not employed, and the GXG considered that there was only one “truth” and that one of the aims of the comparison was to determine if the approaches could come close to that one “truth” and no other one.

Table 6. Comparison of Fixed Well Versus Random Well Evaluation Measure Scores

Inverse Method                   FF    FS    LC    LS    ML    PP    SS
Fixed well average               5.3   4.2   5.1   3.6   3.4   3.6   3.0
Random well case average         5.7   4.6   3.9   3.4   3.4   4.1   2.9
Absolute value of difference     0.4   0.4   1.2   0.2   0.0   0.5   0.1
As an example, the relatively good fit on the head error obtained by method PP on TP 3, although its T field was not superior to those from other approaches, can be taken as an indication of a potential nonuniqueness of the solution of that problem. This example shows, however, that the type of parametrization chosen by method PP was not close enough to the optimum to unravel the type of T distribution of the “true” T field.

5.2. Choice of the Underlying Covariance Structure

For any geostatistical method a very important step is to determine the statistical structure of the field and to select the semivariogram of the random T field that is to be simulated. The seven approaches can be classified into two groups: those that use only the T data to select the semivariogram and those that use both T and head data simultaneously in the initial statistical inference step to develop the semivariogram of the T field. Only methods LS and LC strictly fall into the second category, using the maximum likelihood approach for this inference (see method descriptions in Appendix B). It was expected that the use of both head and T data would give these techniques an advantage over the other approaches, since incorrect assumptions about the semivariogram can have important consequences, as will be illustrated below. It is interesting to see that method LC ranked second for the semivariogram measure over all TPs, on average, even though the code did not allow for the selection of any variogram model other than exponential, which is a code limitation, not a method limitation. Method LS ranked fourth on this measure. However, the average score may be biased because LS ranked seventh on this measure for TP 1, perhaps because of insufficient initial attention of the participant to the importance of this selection. One advantage of the LS method (not used in the present exercise)


Figure 18. Correlation of head errors with errors in the log (T) field.

is that it could be extended to use the transient head data to infer the semivariogram [Dagan and Rubin, 1988]. The average of the semivariogram measures for methods LC and LS over all TPs is lower (better) than the average of all other approaches excluding SS (in which the head information was used to guide the selection of the semivariogram model “manually”). Thus there is some evidence to support the contention that approaches that use both T and head data will in general do better in selecting the semivariogram model than those using just the T data. This is probably particularly true when there are many more head data than T data. The importance of the selection of the semivariogram for the performance of an approach is illustrated by Figure 22, which shows the dependence of the average performance of each approach over all four TPs as a function of the average rank of the semivariogram measure. It is interesting to note that method SS, which in general ranks first across all measures, also ranks first in semivariogram selection (see Table 5). Method SS does not infer the semivariogram from both T and h data; it uses only T data. However, according to the participant, a great deal of attention was given to the fitting of the semivariogram model to the sample T data (careful analysis of the data, declustering, testing for multiple populations, elimination of outliers). This excellent exploratory data analysis to select the semivariogram (which is not method-specific but participant-specific) is most likely one of the reasons for the success of the SS method. In addition, method SS also has the ability to recalibrate itself during the optimization, that is, to modify the semivariogram. When the number of data is larger, the role of the selection of the semivariogram may not be as critical, because


Figure 19. Comparison of method performance across test problems; the lower the average rank, the better the performance. Methods are indicated by the two-character abbreviation.

the conditioning on these data becomes dominant in structuring the T field. Additional evidence of the importance of the choice of semivariogram model is provided in a section below.

5.3. The Multi-Gaussian Assumption

One of the methods, SS, has the advantage of being directly able to use the geostatistical “indicator” approach [Journel and Huijbregts, 1978] and to permit any form of T distribution, not just the multi-Gaussian one. The small sample size (41 data points), however, does not make it easy, in general, to detect the underlying type of transmissivity distribution. Only for TP 3, as we have seen, was this feature used in the exercise. In this case the method is able to generate values with a bimodal distribution and specific spatial correlation patterns for each population. The difference in results between method SS and the other methods is, however, small, and the multi-Gaussian assumption was not too erroneous. In real cases, however, it may happen that the underlying distribution of T is not lognormal or displays connectivity patterns at extreme threshold values inconsistent with a multi-Gaussian distribution. Therefore the ability of method SS to handle these characteristics would be quite valuable. This intercomparison exercise therefore might not have been sufficient to adequately evaluate the usefulness of this capability in an inverse method. It can be stated, however, that the errors caused by making an erroneous choice will generally decrease as the number of conditioning data increases. Method ML can also use the indicator approach, and method PP was later adapted to include this capacity as a result of this comparison exercise.
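The indicator coding underlying this capability can be sketched as follows; the data values and thresholds below are hypothetical illustrations of a bimodal log10 (T) sample, not values from the exercise:

```python
def indicator_transform(values, thresholds):
    """Code each datum as a vector of 0/1 indicators, I(x; z) = 1 if value <= z.
    Variograms of these indicators can capture connectivity of extreme values
    that a single multi-Gaussian variogram cannot represent."""
    return [[1 if v <= z else 0 for z in thresholds] for v in values]

# Hypothetical log10(T) data with a two-population (bimodal) character.
log10_t = [-6.8, -6.5, -6.6, -4.1, -4.3, -6.7, -4.2]
cutoffs = [-6.0, -5.0]  # illustrative thresholds separating the two modes
codes = indicator_transform(log10_t, cutoffs)
```

A low-T datum such as −6.8 codes to [1, 1] and a high-T datum such as −4.1 to [0, 0], so separate spatial correlation models can be inferred for each population.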


5.4. The Assumption of Stationarity

TP 1 and TP 2 are, by construction, true stationary fields. TP 3 and TP 4 are not, as there are distinct local features and trends. Linearized methods assume the existence of a constant mean, random fluctuation around that mean, and uniform flow on the average. Nonlinear methods do not depend on such assumptions. Table 4 and Figure 19 clearly show that this had a significant effect on some of the linearized methods: LC in particular, as well as FF, shows a systematic decrease in performance from TP 1 and TP 2 to TP 3 and TP 4. The case of LS is quite interesting; the participant was able to detect from the sample data that the field did not look stationary and decided to apply a piecewise-linear approximation. He divided the domain into several subareas and assumed different means for each area. This made the results of LS for TP 3 and TP 4 much better than the results of the other linearized methods. Again, the skill of the participant in “tailoring” the method to the particular features of the problem can improve the results significantly. The issue of nonstationarity is thus thought to be the primary reason why the cluster analysis (Figure 21) clearly makes a distinction between the linearized and the nonlinear methods.

5.5. Effect of Introducing Recharge

In TP 3 and TP 4, recharge was introduced into the synthetic data sets, comprising 10% and 6%, respectively, of

Figure 21. Cluster analysis tree diagram. Linkage distance is 1 minus the Pearson correlation coefficient.

the total flow through the system. This information was not communicated to the participants, but it could have been inferred from the sample data for TP 3, which clearly showed a mound. In TP 4 the existence of recharge was not as evident from the sample data. Method PP was the only method to include recharge, which amounted to about 12% of the PP model system flux in TP 3. This may partially explain the better performance of method PP in TP 3.

5.6. Effect of Grid Discretization

It is well known that particle tracking is very sensitive to grid size and to time steps. In order to minimize this effect, the same particle-tracking code and time-stepping scheme were used by the coordinator for all the T fields provided by the participants. The grid used, however, was the one provided by the participants. The degree to which the various discretization schemes affected the GWTT calculations was, unfortunately, not assessed.
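Travel times are accumulated cell by cell, which is why the discretization matters; a minimal sketch of semianalytical, Pollock-style tracking along one row of cells (the face positions and velocities below are hypothetical, and the coordinator's actual code is not specified in the text):

```python
import math

def cell_transit_time(x, x1, x2, v1, v2):
    """Pollock-style exit time for a particle at x in a cell [x1, x2] whose
    x velocity varies linearly from v1 (at face x1) to v2 (at face x2), v > 0."""
    a = (v2 - v1) / (x2 - x1)          # velocity gradient within the cell
    vx = v1 + a * (x - x1)             # velocity at the particle position
    if abs(a) < 1e-12:                 # uniform velocity within the cell
        return (x2 - x) / vx
    return math.log(v2 / vx) / a       # analytic integral of dx / v(x)

def travel_time(release_x, faces, face_velocities):
    """Total travel time along one row of cells; faces[i] and faces[i + 1]
    bound cell i, with the corresponding face velocities."""
    t, x = 0.0, release_x
    for i in range(len(faces) - 1):
        if x >= faces[i + 1]:          # particle released downstream of cell i
            continue
        t += cell_transit_time(x, faces[i], faces[i + 1],
                               face_velocities[i], face_velocities[i + 1])
        x = faces[i + 1]               # particle advances to the next face
    return t
```

With a uniform velocity of 2.0 over [0, 10] split into two cells, the sketch returns the expected time of 5.0; refining or coarsening the faces changes the per-cell interpolation and hence the integrated time, which is the grid sensitivity discussed above.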

Figure 20. Results from the LSD pairwise comparison. (a) Means with the same letter are not significantly different; α = 0.05, critical value of T = 2.08, least significant difference = 1.12. (b) Results from the LSD pairwise comparison. “X” indicates the means are not significantly different, and “O” indicates the difference in the means is statistically significant at the α = 0.05 level.

Figure 22. The sensitivity of the methods’ performance to the estimation of the covariance structure of the log (T) field. The figure shows how errors are correlated with the quality of the semivariogram estimates.


5.7. The Magnitude of the Log10 (T) Variance

In the so-called linearized methods (FF, LC, and LS), the development of the inverse equations is based on the perturbation method, which assumes that the ln(T) field has a “small” variance. In the literature [e.g., Dagan, 1989] and depending on the problem at hand, it is generally assumed that such a linearization is valid for ln(T) variances smaller than 1. However, in some cases larger variances do not jeopardize linearized methods, while in other cases, variances larger than 0.1 could not be adequately handled by linearization [e.g., Roth et al., 1996]. The variances of the true (synthetic) log10 (T) fields ranged from 1.38 to 2.14 across all four TPs. This corresponds to ln(T) variances in the range of 7.30 to 11.32, far in excess of the variance typically considered valid for the linearized approach. We have shown, however, when qualitatively comparing the results of TP 1 and TP 2, where the only difference was the variance of the log10 (T) field, that the linearized methods did not have any difficulty with such large variances. This is also clear from a comparison of the average rank scores over all measures for each TP (see Figure 19). Thus, it seems that given the type of inverse problems dealt with in this comparison, and given the objectives of the exercise, the magnitude of the variance is not an important issue. This conclusion is most likely linked to the effect of conditioning, as the small variance assumption was initially formulated for unconditional cases. This is particularly true in this exercise, where the average distance between measurement points (in the central area) is much shorter than the correlation length of the log10 (T) fields in all test problems. If this had not been the case, the effect of conditioning might have been much smaller. 
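The conversion between log10 (T) and ln (T) variances used above is a scale-factor identity, Var[ln T] = (ln 10)² Var[log10 T]; a one-line check of the quoted range:

```python
import math

def log10_var_to_ln_var(var_log10):
    """Var[ln T] = (ln 10)^2 * Var[log10 T], since ln T = ln(10) * log10 T."""
    return (math.log(10.0) ** 2) * var_log10

# The true-field log10(T) variances of 1.38 and 2.14 quoted above map to
# ln(T) variances of roughly 7.3 and 11.3.
lo = log10_var_to_ln_var(1.38)
hi = log10_var_to_ln_var(2.14)
```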
Indeed, Neuman and Orr [1993] compared linear and nonlinear stochastic approximations of effective hydraulic conductivity in three-dimensional (3-D) infinite domains and concluded that nonlinearity becomes critical for ln(K) variances in excess of 2. Similar results were obtained by Paleologos et al. [1996] for bounded domains and by Hsu et al. [1996] for transport problems, for variances of ln(K) on the order of 1 to 2.

5.8. Connectivity of the High-T Zones

The major difference between TP 3 and TP 4 is that the high-T zones are discontinuous for TP 3 and connected for TP 4. The discontinuous high-T zones were very difficult for all methods in TP 3, and most raw evaluation measures show poor results, particularly for the head error. The ranking order for the top four methods in TP 3 (based on average scores across all evaluation measures) is PP, SS, ML, and LS. One may interpret this ranking on the basis of the intrinsic features of the methods. In the PP method the structure of the T field results from the selection of the “pilot points,” where the transmissivity value is calibrated by the optimization algorithm and further used to krige (or simulate) the whole T field. But contrary to all the other methods, the location of the pilot points is also one of the unknowns of the problem. An optimum selection of the location of the pilot points is made iteratively, prior to optimizing the T value (see Appendix B). By contrast, SS selects a priori the position of the “master locations,” which are to some extent equivalent to the pilot points, and ML also selects the zoning of the field a priori. For TP 3 method SS was able to generate unconnected high-T channels using the transient head data to determine the major direction of the channels. Other participants tried to prescribe channels “by hand,” without a great deal of success. The PP T fields, while matching the heads better than the other methods in TP 3, do not match the T field values very well; we have already pointed out that this may reflect a nonuniqueness problem. The PP method may, perhaps, more easily detect local anomalies if they cannot be introduced a priori from geological or external knowledge. By contrast, in the case of continuous high-T zones (TP 4), all methods performed reasonably well. The calculated T fields more or less correctly show a continuous high-T flow path, in general correctly located. Continuous high-T zones are thus more easily detected by inverse methods than discontinuous ones, at least in steady state and for problems similar to the ones examined.

5.9. Importance of Transient Data

It is difficult to see any significant differences between the two methods that could directly use the transient data in the formulation of the inverse problem (ML and PP) and the other methods. In TP 4, ML and PP obtain very similar results, but SS and LS, which do not directly use the transient information, perform significantly better. Similar results were reported by Gonzalez et al. [1997], where both stationary and transient data were shown to improve stability but did not lead to a better solution. In order to better understand the reason for this outcome, ML and PP were asked to rerun TPs 3 and 4 using only the steady state data and discarding the transient information. The outcome showed that both methods produced results very similar to those obtained with the transient information, particularly for TP 3. This is in contrast to the real WIPP site data, where the PP methodology was used in a preliminary PA [LaVenue et al., 1995] and where the use of the transient information proved to make a significant change in the outcome of the inverse calculations. The reason the additional transient information from the pumping tests did not result in major changes in the outcome is believed to be the limited areal influence of these tests. However, close examination and analysis of the TPs in the areas affected by the pumping was not performed. Thus we see that the evaluation measures did not provide enough information about the value of the transient information, and we will therefore not be able to draw any conclusions on the value of transient information from this exercise.

5.10. Effect of Code Limitations

The methods that were compared were not all at the same stage of development. In particular, method LC was developed in 1983 for solving a specific problem and had not been significantly updated since. The available code could not handle more than about 1600 grid blocks, which forced the use of a coarse grid to represent a small domain, giving perhaps too much importance to the head boundary conditions and limiting its ability to reflect the desired correlation behavior adequately. The performance of this method in these test problems is therefore hindered by this constraint, which is specific to the code, not to the method itself.

5.11. Motivation of the Participants

The participants learned about the effectiveness of their method during the course of the comparison, as the “true” field and some preliminary evaluation measures were computed and released to the participants after each TP had been run, prior to starting the next one. Apart from method LC, which was run by D. Gallegos and C. Axness and not by the code developer, all other participants either ran their own

ZIMMERMAN ET AL.: COMPARISON OF INVERSE APPROACHES


codes or were directly involved in the supervision of the exercise. Some participants treated the exercise as a competition and felt peer pressure to become “the winner,” while others treated the exercise as more of a learning experience. In some cases the participants made some improvements to the codes to accommodate difficulties encountered during the tests. It is clear, for instance, that method PP did very poorly on TP 1, for at least three reasons that were later understood: the grid was too coarse; the domain was too small, which gave too much importance to the boundary conditions; and the selected variogram was estimated without enough care. The choice of a linear variogram, without a sill, resulted in a too-large variance in the simulated fields. A more careful analysis of the sample data, as was performed by the participant of the SS method, could have shown that an exponential variogram would have been a better choice. This was verified by rerunning method PP with an exponential variogram without changing the grid, which produced better evaluation measure scores than the linear variogram case. A comparison of the waste panel CDFs produced by method PP using both the linear and the exponential semivariogram models (Figure 23) shows significant improvement with the exponential model; the envelope of CDF curves for the exponential case more completely covers the true CDF, while the CDF for the linear semivariogram case covers a much broader range and has a very long tail (see also Figure 11). Therefore the results of the comparison reflect not only the intrinsic quality of a method but also the skill and experience of the team that ran it and occasionally improved it; these two effects cannot be easily distinguished.

5.12. Case of the Linearized Semianalytical Method

Among the seven methods that were compared, six use numerical techniques that discretize the domain and transform the problem into discrete grid blocks and solve the flow equation by finite differences or finite element techniques. The seventh method, LS, is “semianalytical” and solves the GWTT problem directly but without discretization, without defining boundaries (therefore without the need to specify boundary conditions), and without generating block values. The method directly calculates the movement of particles in the velocity field, which is conditioned on the T and head data via the geostatistical model. This method, although linear, produced very good results and compares favorably with the three nonlinear methods. As noted earlier, the method was used in a piecewise-linear fashion to account for the nonstationarity of the TP 3 and TP 4 fields. One of the major advantages of the LS method is its computational efficiency; it does not need the large computer resources required by many of the other methods. It should be noted that the method could be extended to produce simulated values of transmissivities on a grid, which could then be used as input for a numerical solver of the flow and transport equations. This method could also be extended to produce concentrations directly. It has been extended to 3-D to include transient data and uniform recharge. During the course of this comparison, however, resources were not available to evaluate the adequacy of the discretized transmissivities that the LS method could produce and to compare them with the transmissivities produced by the other methods (note that in Tables 4 and 5, the T and head-field evaluation measures are not available for method LS).

Figure 23. Waste panel CDF for method PP in test problem 1: (a) linear semivariogram model and (b) exponential semivariogram model.

5.13. The Case of the Fractal Simulation Method

The FS method performed relatively well for TP 1, but much less so for the other cases, which had either a larger ln(T) variance or nonstationary fields. It seems that this is due to the principles of the method. First of all, the FS method is not really an inverse algorithm (see Appendix B). Once a T field has been simulated (with a fractal underlying semivariogram and conditioning on the T data only), the conditioning to the head data is not done by altering the simulated T field but by optimizing the head boundary condition values (which, as specified earlier, are left to each participant to decide). If no constraints are applied to these head values, the results can be physically meaningless. If constraints such as continuity or ranges are added to these head values, the fitting of the head may be poor (as each T field is fixed). This is especially true for the more “complex” fields of TP 3 and TP 4, thus leading to a poor global performance. Therefore this method seems limited to stationary fields with rather small ln(T) variances. This method generally produced the largest GWTT spread (see Table 4), which means that in general, it overestimates the uncertainty. However, a reasonable fit of the sample T data could be obtained with a fractal semivariogram, at least for TPs 1, 2, and 4, even if the underlying semivariogram of the synthetic field was not fractal.


5.14. The Case of the Maximum Likelihood Method on TP 4

TP 4 was non–multi-Gaussian but was interpreted as multi-Gaussian by the participant for the ML method. This is acceptable when the effect that this error has on head data is small, which is the case with steady state data. However, the effect of continuous channels is much more severe when transient head data are used. Transient head data are indeed able to identify high-T channels better than steady state head data. Since such channels cannot be generated in a stationary multi-Gaussian field, the only option left available to method ML was to increase the T away from the channels. Consequently, the resulting T fields have a higher mean than the true field. This is probably the reason for the poor performance of method ML on TP 4. It also explains why the T fields simulated with only the steady state head data were better than those simulated with both the steady state and transient data. This explanation is also consistent with the relatively better performance of method ML on TP 3, where the participant artificially increased T along “guessed” channels, based on a manual interpretation of the transient data.

5.15. The “Robustness” of a Method With Respect to the Type of Heterogeneity

It has been shown that some methods perform better for a given type of heterogeneity, while they would perform less well for another. In practice, it may be difficult to know in advance which type of heterogeneity is dominant for a given aquifer. At the WIPP site, for instance, it is not yet clear if the high-T zones in the aquifer are discontinuous or connected. It is therefore of interest to detect if there are methods that perform well on the average but may occasionally produce very poor results. The ANOVA and cluster analyses have shown that four methods seem to have a similar behavior: LS, ML, PP, and SS. The three others, FF, FS, and LC, fall into a second, less desirable category, probably because of their difficulty in dealing with nonstationary fields.
Among the first four methods, PP behaved rather poorly for TP 1, but this was thought to be linked to insufficient discretization and poor selection of the variogram from the sample data. For the more complex TP 3 and TP 4, which are also the more realistic “WIPP-like” cases, the average scores (lower being better) are SS = 2.5, LS = 2.8, PP = 3.1, and ML = 3.3. Thus method SS appears to be the most robust, followed by LS, PP, and ML.

6. Conclusions

6.1. Importance of the Topologic Structure of the T Field

The results of the comparison exercise demonstrate the following:

1. In developing the inverse model, the greatest attention should be given to the selection of the semivariogram to be used in the inversion, as this appears to be very significant in achieving success. This selection must be made from the ensemble of transmissivity data available from the site, with careful declustering and checking of the distribution of the data and elimination of outliers. Using both the T and the heads in calculating the semivariogram can improve this selection significantly, particularly if the number of T data is small compared to the number of head data.

2. Identifying the proper parametrization (topologic/geometric structure of the T field) can be more important than estimating parameter values. It has been shown that the calibration of the model (head matching) can be very good even with a T field that is not very representative of reality. Flexibility in this parametrization is thus an important factor for an inverse model.

3. The issue of the level of discretization did not receive sufficient attention. Many participants would agree that using a fine grid can be important but not necessarily the dominant factor. Method SS, for instance, used a coarse grid and did very well. On the other hand, the choice of the grid is closely linked to the issue of upscaling, to the size of the domain which a measured value (e.g., a pumping test) represents, and to the degree to which the assumed correlation structure can be represented. Assigning the “measured” values to a given grid size in their mesh may have been a source of bias for some participants, but this could not be determined from the results.

4. Neglecting to consider recharge when it exists in the real problem does not seem to be of major importance, if this recharge remains on the order of 10% or less of the total flux through the aquifer system. But including recharge when it indeed is present in the real system seems to improve the calibration of the model, even when its quantity and distribution are unknown.
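Conclusion 1 above rests on a careful experimental semivariogram of the T data. A minimal sketch of the classical Matheron estimator is given below; the function name, binning, and toy data are ours and purely illustrative, not part of the exercise:

```python
import numpy as np

def experimental_semivariogram(coords, values, bin_edges):
    """Classical Matheron estimator: gamma(h) = half the mean squared
    increment of all data pairs whose separation falls in each distance bin.
    coords: (n, 2) array of locations; values: (n,) array, e.g. log10(T)."""
    n = len(values)
    dist, sq_inc = [], []
    for i in range(n):
        for j in range(i + 1, n):
            dist.append(np.linalg.norm(coords[i] - coords[j]))
            sq_inc.append((values[i] - values[j]) ** 2)
    dist, sq_inc = np.array(dist), np.array(sq_inc)
    gamma = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (dist >= lo) & (dist < hi)
        gamma.append(0.5 * sq_inc[mask].mean() if mask.any() else np.nan)
    return np.array(gamma)

# Toy usage with synthetic data (illustrative only):
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(50, 2))
vals = np.sin(coords[:, 0]) + 0.1 * rng.standard_normal(50)
gamma = experimental_semivariogram(coords, vals, np.linspace(0.0, 5.0, 6))
```

A model semivariogram (exponential, spherical, linear, etc.) would then be fitted to `gamma`, with the declustering and outlier checks recommended above done beforehand.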

6.2. Improvements in the Inverse Methodologies

The results presented herein clearly show that there is much room for improvement in the inverse methodology. It is disturbing to see that the available methods still do not adequately assess the uncertainty of the prediction. The clear message of this exercise comes, we believe, from the results of TP 3. In this case the design committee tried to create an aquifer that was realistic in its complexity and not constructed to be “geostatistical,” that is, not a realization of a stationary random function with a multi-Gaussian log(T) distribution with a simple semivariogram. In the past, researchers have perhaps focused too much on validating their inverse methods on too-simplistic synthetic T fields. What can be recommended, on the basis of the present study, is the following:

1. Gaussian geostatistically based inverse methods have a tendency to generate parameter fields with circular (or ellipsoidal) heterogeneities. This is due to a basic principle of Gaussian geostatistics, which assumes that the correlation of the parameter values in space is a regular function of the distance, with or without anisotropy, valid for all classes of transmissivities. In the case where the heterogeneity of the aquifer is made of linear features (such as faults, channels, etc.) of varying orientation embedded in a different matrix, the multi-Gaussian geostatistical approach is probably inadequate. The indicator approach is then a better choice, as it can use different variograms for each class of transmissivities. If some variograms are taken as very anisotropic, then some channels or fractures can be represented with the orientation prescribed by the anisotropy of the variogram. This approach was taken by Tsang [1996], among others.
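The indicator coding that underlies this approach can be sketched in a few lines: each datum is turned into a vector of 0/1 indicators, one per threshold, and each indicator variable then gets its own (possibly strongly anisotropic) variogram. The function and threshold values below are illustrative assumptions, not from the study:

```python
import numpy as np

def indicator_transform(values, thresholds):
    """Code each datum as a vector of 0/1 indicators, one per threshold:
    I_k(x) = 1 if value <= threshold_k, else 0.  Each indicator variable can
    then be modeled with its own variogram, which is how channels of a given
    T class can be given a preferred orientation via variogram anisotropy."""
    v = np.asarray(values)[:, None]
    t = np.asarray(thresholds)[None, :]
    return (v <= t).astype(int)

# Three T data and two class thresholds (illustrative values, in m^2/s):
I = indicator_transform([1e-6, 5e-5, 2e-3], [1e-5, 1e-4])
# Each column of I is one indicator variable, ready for its own variogram.
```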
Inverse methods based on conditional expectations, maximum a posteriori probability, maximum likelihood, minimum variance, or variants will not be able to produce natural features leading to discontinuities (such as fractures, paleochannels, dissolution channels, etc.). These have to be incorporated explicitly in the model. Nonparametric geostatistics are more flexible in this respect. One alternative could be to generate such structures randomly, like the Boolean objects used in the oil industry, while introducing fine-scale variability within each object or estimating the T through inverse procedures for each class of object.

ZIMMERMAN ET AL.: COMPARISON OF INVERSE APPROACHES

This is similar to the approach taken by McKenna and Poeter [1995]. Fields that do not reproduce head data can then be eliminated. They could also possibly be introduced via some optimization procedure, provided uncertainty remains included.

2. For those cases where a geostatistical description of the heterogeneity is adequate, it is clear that the ability to use both the T and head data to identify the underlying log(T) semivariogram is very desirable, since it has been shown in this test that it improves the statistical inference. This is one of the best features of the LC and LS methods, but it could be added to others. The maximum likelihood inference part of the LS method, for instance, could be used as a front end to any other inverse method. As an example of how this recommendation could be implemented slightly differently in an existing inverse method, let us take the case of the PP method. The initial developer of this method [de Marsily, 1978; de Marsily et al., 1984] felt it to be a deficiency of the method that, after calibration, the variogram of the calibrated field, calculated by using the measured T values and the calibrated T values at the pilot points, could differ from the initially prescribed one, based only on the measured T data. The validity of the PP method was often questioned because of this potential “deficiency.” Given the results of this comparison, it is clear that the evolution of the variogram before and after calibration in some way reflects the conditioning by the head data. One could therefore adopt a strategy where the PP inverse would be run once, to obtain additional T values at the pilot points, to better infer the semivariogram, and then be rerun (or iterated) with the new semivariogram. Such a strategy is already embedded in method SS.

3. Allowing for simultaneous calibration of the T field and the boundary conditions (in cases where they are not well defined), as is done in most methods, is certainly an important feature.
But it is also necessary to impose reasonable (physically plausible) constraints on these boundary conditions during calibration.

4. If linearized methods are to be used, the main issue seems to be the stationarity of the field rather than the magnitude of the ln(T) variance. It would therefore be desirable to develop methods that could detect nonstationarities and optimally select zoning with piecewise stationary properties.

5. Using transient head data is in general a significant improvement to a method. Methods ML, PP, and LS were able to use transient data; method SS was extended to do so during the course of this exercise.

6.3. Effort Applied Toward Solution of the Inverse Problem

At this stage of the development of the inverse methodologies, it is not advisable to use them as robust “black boxes.” The experience and skill of the modeler and the time and effort spent on the modeling of the problem have been shown to be essential components of success. Some participants consciously decided to use their methods with minimal intervention, to see precisely what the outcome would be. Their methods, in general, performed less well than the methods for which a substantial effort was applied. In this respect, method LS, which does not have a large number of options or parameter values to select, such as discretization, boundary conditions, time steps, etc., would most likely produce more reproducible results if used by different modelers.

6.4. Design of Intercomparison Studies

This intercomparison has highlighted some difficulties that may be of interest for those who want to perform a similar exercise. Among these are the following:

1. A design subcommittee separate from the participants is very desirable. If the objective is really to evaluate methodologies and not at the same time improve them, a series of tests should be given without the outcome of the first test being available before the next is run. The design subcommittee should not limit the synthetic fields to “classical” fields, but should try to imagine (based on geological knowledge and experience) what real fields might look like and incorporate them into the “true field” exhaustive data set.

2. The set of evaluation measures on which the methods will be compared needs to be specified from the start. This is not easy. In order to do so, the objectives of the comparison must be fully developed and stated explicitly from the outset, and the measures must be designed to achieve those objectives. In this exercise, the initially agreed upon set of measures proved to be inadequate, and the final set was only decided when all the tests had been made. We do not believe that this has created biases in the results presented in this paper, but it certainly created a lot of discussion and confusion, as the methods could be adjusted to satisfy (or not) a given criterion.

6.5. Selection of Appropriate Inverse Approaches for PA

Four approaches have been identified as being approximately equivalent for use in performance assessment at sites such as WIPP; these methods are LS, ML, PP, and SS. With such methods, the uncertainty is very clearly reduced by the conditioning compared with unconditional simulations respecting only the pdf of the measured parameter. The outcome of the simulations (in our case the CDF of the advective travel time) is reasonably similar among these methods. It should be noted that these approaches do not give identical results: the T fields and the predicted uncertainty, as given by the spread of the CDF, are significantly different among the methods. These differences stem from the differences between the techniques (e.g., parametrization, assumptions on stationarity) and thus reflect a fundamental uncertainty associated with the inverse problem because of its nonuniqueness. Each method is “conditioned” by its own assumptions to make the problem well posed, and the differences between the methods display the importance of these assumptions. The total uncertainty could therefore be better described by the results of the ensemble of several methods, as any single method in general tends to underestimate the uncertainty.

This study has not addressed the question of whether these differences between methods could lead to different conclusions, in terms of performance assessment, when contaminant transport is simulated and not just travel time. This would depend on how close the outcome of the simulations is to the performance target, but we believe that it would not be vastly different for the examples reported here. It should be emphasized that the four approaches which were found to be approximately equivalent are not just “methods,” but are at the same time codes and a manifestation of the manner in which each method was applied, reflecting the time, effort, and experience of the modeling team that worked on the problems.
Those three factors are unequivocally embedded in this comparison. The other methods involved in this comparison were found to have either too stringent assumptions (e.g., stationarity), coding constraints, or insufficient time and effort devoted by the modeling team to produce results of the same level as the previous ones, particularly for the more realistic TPs.

Appendix A: Geostatistical Expert Group Participants

(1) C. L. Axness, Sandia National Laboratories, Albuquerque, New Mexico; (2) R. L. Beauheim (TP design committee member), Sandia National Laboratories, Albuquerque, New Mexico; (3) R. L. Bras, Massachusetts Institute of Technology, Cambridge; (4) J. Carrera, Universitat Politècnica de Catalunya, Barcelona, Spain; (5) G. Dagan, Tel Aviv University, Tel Aviv, Israel; (6) P. B. Davis (TP design committee member), Sandia National Laboratories, Albuquerque, New Mexico; (7) G. de Marsily (TP design committee member), Université Paris VI, Paris, France; (8) D. P. Gallegos, Sandia National Laboratories, Albuquerque, New Mexico; (9) A. Galli, Ecole des Mines de Paris, Fontainebleau, France; (10) J. Gómez-Hernández, Universidad Politécnica de Valencia, Valencia, Spain; (11) S. M. Gorelick (TP design committee member), Stanford University, Stanford, California; (12) C. A. Gotway (TP design committee member), University of Nebraska, Lincoln; (13) P. Grindrod, QuantiSci Ltd., Henley-on-Thames, England, United Kingdom; (14) A. L. Gutjahr, New Mexico Institute of Mining and Technology, Socorro; (15) P. K. Kitanidis, Stanford University, Stanford, California; (16) A. M. LaVenue, Duke Engineering and Services Inc., Austin, Texas; (17) M. G. Marietta (TP design committee member), Sandia National Laboratories, Albuquerque, New Mexico; (18) D. McLaughlin, Massachusetts Institute of Technology, Cambridge; (19) S. P. Neuman, University of Arizona, Tucson; (20) B. S. RamaRao, Duke Engineering and Services, Inc., Austin, Texas; (21) C. Ravenne, Institut Français du Pétrole, Rueil-Malmaison, France; (22) Y. Rubin, University of California, Berkeley; and (23) D. A. Zimmerman (TP design committee member), GRAM, Inc., Albuquerque, New Mexico. G. de Marsily was chairman of the GXG. D. Zimmerman was the GXG coordinator.
(He created the test problem data sets, distributed them to the participants, collected from them their results after calibration, performed the GWTT calculations, and conducted the comparative analyses.)

Appendix B: Brief Description of Each Inverse Method

B1. The Fast Fourier Transform Method (FF)

This technique was developed by A. Gutjahr at New Mexico Institute of Mining and Technology, Socorro [Gutjahr and Wilson, 1989; Robin et al., 1993; Gutjahr et al., 1994]. The method is implemented in the code CSIMFFT. This code solves 2-D, steady state groundwater flow problems with a fast Fourier transform technique for field generation. The log transmissivity field and the mean-removed head field were considered to be statistically homogeneous for this exercise. The newest version of the code is able to consider a ln(T) trend. An iterative cokriging procedure is implemented to condition on transmissivity and head field measurements. The FFT technique is very efficient and is capable of generating many realizations with modest computing resources over times on the order of minutes. The procedure used in these tests could not handle recharge or time-dependent data.
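The spectral field-generation step at the heart of this family of methods can be sketched as follows. This is a generic illustration of FFT-based generation of a stationary Gaussian random field (white noise filtered in the spectral domain), not the CSIMFFT implementation; the Gaussian-shaped filter and the function name are our assumptions:

```python
import numpy as np

def gaussian_field_fft(n, corr_len, seed=0):
    """One realization of a stationary Gaussian random field on an n x n grid,
    obtained by filtering white noise in the spectral domain.  The Gaussian-
    shaped amplitude filter below is illustrative only; the actual spectral
    model used by CSIMFFT may differ."""
    rng = np.random.default_rng(seed)
    k = np.fft.fftfreq(n)                       # grid wavenumbers (cycles/cell)
    k2 = k[:, None] ** 2 + k[None, :] ** 2
    amp = np.exp(-((2.0 * np.pi * corr_len) ** 2) * k2 / 4.0)  # spectral filter
    noise = rng.standard_normal((n, n))
    field = np.real(np.fft.ifft2(np.fft.fft2(noise) * amp))
    return (field - field.mean()) / field.std()  # zero mean, unit variance

field = gaussian_field_fft(64, corr_len=8.0)
```

Because the filtering is done with FFTs, the cost per realization is O(n² log n), which is why the technique can produce many realizations in minutes, as noted above.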

B2. The Linearized Semianalytical Method (LS)

This technique is based on the conceptual and analytical tools developed by G. Dagan at Tel Aviv University and by Y. Rubin at Tel Aviv University and at the University of California, Berkeley [Dagan, 1985; Rubin and Dagan, 1987, 1992; Dagan and Rubin, 1988; Rubin, 1991a, b]. The procedure comprises two stages: first, the solution of the inverse problem and, second, the solution of the transport problem. The solution of the inverse problem is achieved by adopting a stationary log transmissivity structure of an analytical form (e.g., exponential) that is fully characterized by a few unknown parameters, by using a first-order, linearized solution for the head field to obtain analytical expressions for the head-ln(T) cross covariance and the head covariance, and by identifying the unknown parameters (mean head gradient, log transmissivity mean, variance, and integral scale) with the aid of measurements. This is done by a maximum-likelihood procedure applied concomitantly to both transmissivity and head measurements. The head and transmissivity fields can subsequently be generated at any point by conditioning on measurements (through cokriging). The method does not imply a lognormal distribution of transmissivity, though it is supposedly better suited to such distributions. The solution of the transport problem is carried out by particle tracking. At each time step and along the trajectory of each particle, the velocity is generated directly by conditioning (cokriging) on head and transmissivity measurements, using first-order analytical solutions for the velocity-log transmissivity and velocity-head cross covariances. To account for trends in log transmissivity, which may be responsible for nonstationarity and large variances present when the entire domain is regarded as a single unit, the method was applied over separate subdomains in the final test problem. The method does not require numerical solutions of the flow equations and is free of discretization errors.
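The transport stage described above can be illustrated with a bare-bones particle tracker. Here the velocity is an arbitrary stand-in callable; in the LS method it would instead be cokriged from the head and transmissivity data at each step. Function and variable names are ours:

```python
import numpy as np

def track_particle(velocity, x0, dt, n_steps):
    """Advect one particle through a 2-D velocity field with explicit Euler
    steps.  `velocity` is any callable x -> v(x) returning a 2-vector; in the
    LS method this would be the data-conditioned (cokriged) velocity."""
    path = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        path.append(path[-1] + dt * np.asarray(velocity(path[-1])))
    return np.array(path)  # shape (n_steps + 1, 2)

# Toy velocity field with a uniform x-component and a sheared y-component
# (an illustrative stand-in, not a conditioned field):
v = lambda x: np.array([1.0, 0.1 * x[0]])
path = track_particle(v, x0=(0.0, 0.0), dt=0.1, n_steps=100)
```

A travel-time CDF such as those compared in this exercise is obtained by repeating this tracking for many particles over many conditioned realizations.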
The numerical computations pertain to the maximum-likelihood stage, to conditioning by cokriging, and to particle tracking. The main limitation is the first-order approximation, implying that conditioning on measurements extends the range of validity of the method. The method can be easily applied to 3-D simulations, and a 3-D code is available.

B3. The Linearized Cokriging Method (LC)

This technique was developed by P. Kitanidis, R. Hoeksema, E. Vomvoris, and R. Bras and is implemented in the GEOINVS code [Kitanidis and Vomvoris, 1983; Hoeksema and Kitanidis, 1984; Kitanidis and Lane, 1985]. The technique differs from the other linear techniques because it implements maximum-likelihood estimation of the structural parameters associated with the log transmissivity covariance based on both T and h data. The GEOINVS code implementation of this methodology is limited by the fact that an N × N matrix (where N is the number of interior nodes in the flow model) is inverted directly, so that calculation is restricted in the present version of the code to grids on the order of 40 × 40 for simulation on present-day workstations. An improved numerical implementation of this code is in development. An advantage of this technique is that it is very simple to implement and has not suffered from convergence problems.

B4. The Fractal Simulation Method (FS)

The self-affine fractal technique was developed by P. Grindrod and M. D. Impey of Intera Information Technologies (now QuantiSci) [Grindrod and Impey, 1991]. This technique depends upon the assumption that the spatial variability of the log transmissivity may be represented in the form

Γ_ψ(h) ≡ ⟨|ψ(x + h) − ψ(x)|²⟩ ∝ h^(2p)

where ψ is the log transmissivity, h is the separation between two points in the field, p is the Hurst coefficient [Mandelbrot, 1983], and ⟨ ⟩ denotes the ensemble average over all realizations. The method proceeds by calculating the experimental semivariogram of the log transmissivity data and then fitting the fractal scaling law to the data. The proportionality constant a and the Hurst coefficient p are chosen to best fit the data using maximum likelihood estimation. A set of fractal fields is then generated using the fast Fourier transform method with randomly generated phase and amplitude coefficients. The conditioning of these fields to the transmissivity data is accomplished through a linear superposition of the unconditioned fields, where the difference between the variance of the final field and the observed data is minimized. The full flow equation is then solved using the T fields generated above. For each realization, the set of head measurements is in effect “fit” by calibrating the heads at the boundary; the head data do not affect the individual T fields.
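The fitting of the fractal scaling law Γ(h) = a·h^(2p) can be sketched with a simple log-log least squares fit. The published method uses maximum-likelihood estimation; the least squares substitute and the names below are our illustrative assumptions:

```python
import numpy as np

def fit_fractal_scaling(h, gamma):
    """Fit gamma(h) = a * h**(2p) by least squares in log-log space,
    where taking logs gives the line: log(gamma) = log(a) + 2p * log(h).
    Returns (a, p); p is the Hurst coefficient."""
    slope, log_a = np.polyfit(np.log(h), np.log(gamma), 1)
    return np.exp(log_a), slope / 2.0

# Exact power-law data recovers the parameters (a = 0.5, p = 0.3):
h = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
gamma = 0.5 * h ** (2 * 0.3)
a, p = fit_fractal_scaling(h, gamma)
```

In practice `h` and `gamma` would be the lag distances and values of the experimental semivariogram of the log transmissivity data, and the fitted (a, p) would parametrize the spectral generation of the fractal fields.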

B5. The Pilot Point Method (PP)

This technique was developed by B. S. RamaRao, A. M. LaVenue, and G. de Marsily [RamaRao et al., 1995; LaVenue et al., 1995]. It begins by estimating the variogram using the T data and then generates unconditional simulations of the transmissivity field with this variogram using the turning bands method. These transmissivities are then conditioned to honor measured transmissivities by the addition of a simulated kriging error to the field kriged from the measured data. An automated iterative calibration follows, in which an objective function defined by a weighted sum of the squared deviations between the computed and the observed pressures over points in the spatial and temporal domains is minimized. Pilot points are synthetic transmissivity data points and are used as parameters of calibration. During calibration, pilot points are added to the measured transmissivity database to produce a revised conditional simulation. Coupled adjoint-sensitivity analysis and kriging are used to locate pilot points optimally, where their potential for reducing the objective function is the highest. Gradient search methods, subject to constraints, are used to derive optimal transmissivities at the pilot points. The pilot points are added to the transmissivity database for purposes of kriging, but the simulated kriging error to be added for conditional simulation is based on the measured transmissivities only and thus remains the same across all iterations. At the end of an iteration, a revised transmissivity field and the corresponding pressure field are obtained. The test for convergence of iterations is based primarily on a prescribed minimum value for the objective function and a prescribed maximum number of pilot points. Each conditionally simulated transmissivity field is calibrated separately.
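The objective function minimized during this calibration is, in essence, a weighted least squares measure of head mismatch over all observation points in space and time. A minimal sketch (the function and array names are illustrative, not from the PP code):

```python
import numpy as np

def head_objective(h_computed, h_observed, weights):
    """Weighted sum of squared deviations between computed and observed
    heads/pressures over all spatial and temporal observation points, the
    quantity driven down by the pilot point calibration."""
    r = np.asarray(h_computed) - np.asarray(h_observed)
    return float(np.sum(np.asarray(weights) * r ** 2))

# Three observation points with unequal weights (illustrative values):
J = head_objective([10.2, 11.0, 9.7], [10.0, 11.5, 9.7], [1.0, 0.5, 2.0])
```

At each iteration the adjoint sensitivities of J to candidate pilot point transmissivities indicate where a new pilot point would reduce J the most, and a gradient search then adjusts the pilot point T values.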

B6. The Maximum-Likelihood Method (ML)

This technique, implemented in the INVERT code, is a very general nonlinear technique that estimates the aquifer parameters (transmissivity, recharge, storage, leakage coefficients, prescribed boundary heads, or flow rates) using prior estimates of their values along with transient or steady state head measurements. It was developed by Carrera and Neuman [1986a, b]. Parameter estimation is performed using maximum-likelihood theory, for which several optimization methods are available. The nonlinear flow equation is solved by the finite element method using a fully implicit lumped time integration. The flow domain can be 1-D, 2-D, 2-D radial, or quasi-3-D, where 1-D linear string elements may be used to represent vertical flow, fractures, well-bore effects, etc. The INVERT code minimizes an objective function consisting of an error component associated with the measured head data and a weighted error component associated with the prior estimates of other hydrologic parameters. The weighting is a parameter that is varied manually from simulation to simulation; the objective function is minimized for several values of this weighting parameter. The INVERT code offers a number of gradient and Gauss-Newton methods for minimizing the objective function. In conjunction with this exercise, INVERT was used to estimate aquifer parameters simultaneously from three transient pumping tests using prior INVERT block transmissivity estimates computed from steady state transmissivity and head data. Some of the advantages of the INVERT implementation of the ML technique are that it is a fast, powerful, well-documented code that is used extensively and is actively undergoing development. A more up-to-date description of the code's geostatistical formulation is given by Carrera et al. [1993]. When the exercise was started, the optimization algorithm's CPU time was highly sensitive to the number of blocks (pixels) over which T is estimated. As a result of this exercise, a new optimization method, whose CPU cost is virtually independent of the number of blocks, was developed [Carrera and Medina, 1994]. However, the zones for the TPs had already been prepared with cost-reduction constraints in mind. That is, small zones were used where data were abundant, and large zones were used elsewhere.
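The composite objective (head misfit plus a manually weighted prior-misfit term) can be illustrated with a deliberately tiny sketch. The one-parameter forward model h_sim = p·x, the data, and all numbers below are invented for illustration; INVERT itself minimizes this kind of functional over many parameters with gradient and Gauss-Newton methods rather than the brute-force grid search used here.

```python
import numpy as np

def objective(p, x, h_obs, sig_h, p_prior, sig_p, mu):
    """Weighted head misfit plus the manually weighted prior-misfit term."""
    h_sim = p * x                                  # toy forward model (invented)
    return (np.sum(((h_obs - h_sim) / sig_h) ** 2)
            + mu * ((p - p_prior) / sig_p) ** 2)

x = np.array([1.0, 2.0, 3.0])                      # observation "locations"
h_obs = 2.0 * x                                    # heads consistent with p = 2
p_prior, sig_h, sig_p = 3.0, 0.1, 1.0              # prior estimate says p = 3

grid = np.linspace(0.0, 5.0, 5001)
best = {mu: grid[np.argmin([objective(p, x, h_obs, sig_h, p_prior, sig_p, mu)
                            for p in grid])]
        for mu in (0.0, 1.0e6)}                    # weight varied "manually"
```

With zero weight the estimate honors the head data (p ≈ 2); with a very large weight it collapses onto the prior (p ≈ 3), which is exactly the trade-off the manually varied weighting parameter controls.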
To simulate small-scale variability at each block, the following algorithm was used. First, starting from the simulations for each finite element block, assign the simulated values as measurements to 2 × 2 Gauss points of each finite element grid block. Second, generate a simulation conditioned on these points on any desired grid (e.g., on a very fine, regularly spaced grid).

B7. The Sequential Self-Calibration (SS)

This method was developed by the Department of Hydraulic and Environmental Engineering [Sahuquillo et al., 1992; Gómez-Hernández et al., 1997; Capilla et al., 1997]. SS is able to accommodate both multi-Gaussian and non-multi-Gaussian random function models using the indicator kriging approach. The indicator kriging approach can be seen as a superset of the multi-Gaussian approach: if the data are suitable to be modeled by a multi-Gaussian random function, the indicator approach will produce the same results. However, it can handle the histogram of the data as is, that is, normal, lognormal, or otherwise, and it can inject spatial patterns into certain transmissivity classes that could not be reproduced with multi-Gaussian models. The non-multi-Gaussian model can be used to introduce multiple populations of transmissivities or fracture-like features in the simulations. The decision of which model to use is made after careful examination of the data. The transmissivity data are then kriged, and the kriging standard deviation is calculated at each grid block location. A grid oriented in the mean flow direction is constructed, and a seed transmissivity field, according to the random function model chosen and conditional to the available transmissivity data, is


ZIMMERMAN ET AL.: COMPARISON OF INVERSE APPROACHES

generated. The next step is the computation of a transmissivity perturbation field such that forward simulation of flow in the seed field plus the perturbation reproduces the head data. Determination of the perturbation field is done by optimization. The perturbation field is parameterized by a few values at selected master locations; the perturbation of the remaining cells is obtained by kriging interpolation of the master-location values. The set of master locations always includes the transmissivity measurement locations (at which the perturbation is constrained by the transmissivity measurement error), and the transmissivity perturbation at the master cells never falls outside the interval of the kriging estimate plus or minus three kriging standard deviations.
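The perturbation step can be sketched in one dimension. The covariance model, master locations, and perturbation values below are all invented; simple kriging stands in for whatever kriging variant an SS implementation uses, and the clip enforces the plus-or-minus-three-standard-deviations constraint described above.

```python
import numpy as np

def cov(h, L=2.0):
    """Exponential covariance, unit sill (an assumed model)."""
    return np.exp(-np.abs(h) / L)

def spread_perturbation(x_master, dp_master, x_cells, sigk, nsig=3.0):
    """Krige master-point perturbations onto every cell, then clip each cell
    to +/- nsig kriging standard deviations (the SS-style constraint)."""
    C = cov(x_master[:, None] - x_master[None, :])
    c = cov(x_cells[:, None] - x_master[None, :])
    dp = c @ np.linalg.solve(C, dp_master)       # zero-mean simple kriging
    return np.clip(dp, -nsig * sigk, nsig * sigk)

x_cells = np.linspace(0.0, 10.0, 41)
x_master = np.array([1.0, 5.0, 9.0])             # master locations (invented)
dp_master = np.array([0.4, -0.2, 0.1])           # optimized perturbations (invented)
sigk = np.full(x_cells.size, 0.5)                # kriging std. dev. per cell (invented)

dp = spread_perturbation(x_master, dp_master, x_cells, sigk)
```

Kriging interpolation reproduces the optimized values exactly at the master cells, so the optimizer only ever manipulates the few master-location parameters while the whole grid is updated smoothly.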

Appendix C: Detailed Description of Evaluation Measures

Three classes of analyses were used to compare and rank the inverse methods: (1) GWTT analyses, (2) PATH analyses, and (3) field variable analyses. Within each class there are several "evaluation measures" that were used to quantify and characterize the performance of the methods. In the end, 10 evaluation measures were used to characterize the performance of the methods. The final set of evaluation measures was arrived at through iteration and consensus of the GXG and the test problem participants. The measures of method performance are not all necessarily independent. For the GWTT and PATH analyses, the measures were designed to quantify such performance characteristics as (1) the error between the estimated CDFs and the true value or true CDF, (2) the magnitude of the spread in the distributions, (3) the robustness and self-consistency of the method, and (4) for GWTT only, the bias toward cautious or noncautious estimates (i.e., underpredicting rather than overpredicting the GWTT). A method is considered robust if the true GWTT or PATH falls within the calculated range of the CDF. For example, over several release points, one would expect that sometimes the true GWTT is greater than the calculated median, sometimes less than the median, and that, on average, the true GWTT falls somewhere within the full range of the calculated GWTTs. If this is not the case, we can have the following situations:

1. There can be a systematic bias, for example, the true GWTT always or nearly always being greater (or less) than the median GWTT. If a bias exists and the tendency is to overestimate the true GWTT, then this should be noted, as the method will be considered "noncautious." In a PA context, producing cautious estimates is more desirable than producing noncautious ones because the error is on the side of greater protection of public health.
The degree of bias is assessed by two numbers: the magnitude of the error and a measure of the degree of caution in the estimates.

2. Independent of any bias, the calculated range of GWTT can be much too wide or much too narrow. In the former case the true GWTT generally falls within only a limited portion of the estimated GWTT CDF; a method which consistently does this is said to overpredict the uncertainty. In the latter case the true GWTT does not, in general, fall within the calculated range; such a method is said to underpredict the uncertainty. If the method neither overpredicts nor underpredicts the uncertainty, it is considered to be "self-consistent."

To evaluate how well a method performs, we assess the magnitude of the error between the estimated CDF and the

true value, quantify the degree of spread in the distribution, and perform a bootstrap confidence interval test to evaluate the robustness and self-consistency of the method. In addition, for GWTT, we compute a measure which quantifies the degree of conservatism in the estimates. For reasons described later, the robustness and spread measures must be evaluated jointly and are thus combined into a single measure, referred to as the "normalized self-consistency" measure. We refer to these calculations as performance measure calculations, which are computed for the GWTT and PATH CDFs at each release point. The evaluation measures are then computed by averaging the performance measures across all release points in each test problem. Two approaches were used to evaluate the GWTT performance of the methods: the fixed well and random well cases described in the text. The first compares the distribution of simulated GWTT values from each release point to the single, known value; the second involves a comparison of the estimated distribution for particles released from within a designated area to the true distribution of GWTTs for that area. Hereinafter, the designations "true field," "true travel time," or "true distribution" refer to quantities computed using the exhaustive (synthetic) data set based on the reference model. All of the evaluation measures were constructed such that the target value is zero; that is, the closer the measure value is to zero, the better the performance. The reason for this was to provide a consistent target value for each evaluation measure and to aid in transforming the computed measures into consistent units (e.g., ranks) for use in averaging across the measures. The measures are described below.

C1. GWTT Analyses: Fixed Well Approach

Because some of the release points were located close to observation points and others were not, the performance measures computed at each release point were weighted accordingly (i.e., release points placed in close proximity to observation points were assigned larger weights). Thus, in the formulations that follow, the evaluation measures are presented as a weighted average of the performance measures over the number of release points. If a method did not produce a GWTT for a given release point (e.g., the particle reached the edge of the model domain at a distance less than 5 km), then this release point was not considered, and the weights of the remaining release points were adjusted accordingly such that they still summed to one. In other words, no penalty is applied for skipping a release point. In Table 4 the number of release points used by each method in each test problem is given. The rationale and formulations involved in the weighting of the release points are as follows.

C1.1. Weighting of release points. Each true groundwater flow path was discretized into as many points as there were grid cells intercepted by the path in the true-solution model (in general, on the order of 300 points). At each of these points, kriging was performed using all observation locations in the field, both transmissivity and head data, without distinction. A linear variogram model, given by γ(ξ) = ξ, was used for the kriging. The mixture of head and T data in the kriging was inconsequential, as only the kriging variances were kept; the kriged estimates were discarded. The arithmetic average of the kriging variances was computed for all the points falling along the path. The weighting factor assigned to each release point was then calculated as being proportional to the inverse


of this average kriging variance, normalized so that the sum of these weights equals one. This weighting approach was selected because the kriging variance is only a function of the location of the data points and not of the actual measured values. Generally speaking, this variance increases when the point to be estimated becomes further away from the observation locations. The weights, being inversely proportional to the kriging variance, represent a “measure of the distance” between the particle path and the observation points. Mixing together the T and head data (i.e., using the information regarding their locations) represents the simplest assumption on the worth of the data, and using a linear variogram does not give a limited range of influence to any observation location (a variogram with a sill would have made all points beyond the range appear to be at the same distance). The weights were constructed as

W_i = s Σ_{j=1}^{m} (1 / σ²_{K_j})    (2)

where σ²_{K_j} is the kriging variance in cell j of the true field, m is the number of cells intercepted by particle path i, and s is a scale factor chosen such that

Σ_{i=1}^{nrp} W_i = 1.0    (3)

where nrp is the number of release points.

C1.2. GWTT CDF error measure. The GWTT error measure quantifies the discrepancy between the median GWTT and the true GWTT, in log10 space and in absolute value. For each TP and method, it is given by

GWTT error = Σ_{i=1}^{nrp} W_i |log(median GWTT)_i − log(true GWTT)_i| / [ (1/M) Σ_{j=1}^{M} [log(GWTT_0.975) − log(GWTT_0.025)]_{ij} ]    (4)

where nrp is the number of release points, M is the number of methods being compared, and GWTT_x is the xth quantile of the log(GWTT) CDF. The evaluation measure was constructed so that the same error measure value will be computed regardless of whether the median and true GWTT values are, respectively, 1000 and 2000, 2000 and 1000, or 10,000 and 20,000. That is, the error measure is independent of where the true travel time falls on the timescale and of whether or not it is to the left or the right of the median GWTT. This allows the evaluation measures to be averaged over the release points (for each TP, each of which has a different true GWTT) and does not discriminate between underestimation and overestimation of the GWTT. The denominator represents an average of the spread in the distributions across all methods for each TP. This normalization was chosen so that these error-measure values could be compared across release points and test problems where the reference GWTT can differ substantially. It also provides a constant divisor, common to all methods, that facilitates an objective comparison among them.

C1.3. GWTT CDF spread measure. The GWTT spread measure is given by

GWTT spread = Σ_{i=1}^{nrp} W_i [log(GWTT_0.975)_i − log(GWTT_0.025)_i]    (5)

where nrp is the number of release points and W_i are the release point weights, as before.

C1.4. GWTT CDF robustness measure. As noted previously, the bootstrap test is a measure of the robustness of the method and, together with the spread measure, provides an indication of the self-consistency of the method. The GWTT bootstrap measure is defined as

GWTT bootstrap = |0.95 − nci/nrp| / 0.95    (6)

where nci is the number of times the true GWTT or PATH fell within the 0.025 and 0.975 quantiles of the GWTT or PATH CDFs and nrp is the number of release points (CDFs).

C1.5. GWTT CDF degree of caution measure. The GWTT degree of caution measure was considered important from a PA standpoint. It is a binary measure given by

GWTT degree of caution = Σ_{i=1}^{nrp} W_i NC_i    (7)

NC_i = 1 if Q_true < 0.20; NC_i = 0 otherwise

where Q_true is the quantile of the estimated GWTT CDF corresponding to the true GWTT. If the CDF for a release point is noncautious (80% of the CDF exceeds the true GWTT), score a 1 (undesirable); if it is not noncautious, score a zero (desirable). The measure was formulated and the description was phrased in this manner to point out a subtle distinction: caution by itself is not particularly desirable, but being noncautious is definitely undesirable. If the weighted average of the NC scores is close to 1, the method generally produces noncautious solutions. When associated with GWTT error, it is a measure of the bias, but centered on the 0.20 quantile, not on the median. The selection of the 0.20 quantile is arbitrary, but reflects the belief that a method that overpredicts the GWTT 80% of the time is noncautious.

C1.6. PATH CDF error measure. The PATH error measure quantifies the absolute deviation between the median path direction angle and the direction of the true path. The orientation of the path is defined by the angle α (in degrees, between −180° and +180°, 0° being east) from the release point to the point where the path crosses a circle of radius 5 km (centered at the release point). As with GWTT, these errors are normalized by the average spread in the path direction errors across all methods in order to make the magnitude of this measure comparable across test problems (the variance in the spread of the PATH CDFs will vary with the test problem due to the different types of hydrogeologic features and different degrees of heterogeneity represented in the different test problems). The PATH error measure is given for each TP by

PATH error = Σ_{i=1}^{nrp} W_i (median θ_i) / [ (1/M) Σ_{j=1}^{M} [θ_0.975 − θ_0.025]_{ij} ],    0 ≤ θ ≤ 180    (8)



where θ is the magnitude of the particle path direction error in degrees, θ_x is the xth quantile of the CDF of path direction errors, and M is the number of methods being compared; θ_i = |α_i − α_t|, where α_i is the estimated particle path direction and α_t is the true path direction.

C1.7. PATH CDF spread measure. The path spread measure was formulated in terms of the actual angular orientation α rather than the magnitude of the error θ, as the latter depends on where the true α falls. Using θ for comparing the spread in the distributions could be misleading. For example, two distributions, each spanning 90°, will have different θ-based spread measures if the true direction falls at the median of one distribution and the 0.01 quantile of the other distribution. This measure is designed to be independent of the error in the path direction.

PATH spread = Σ_{i=1}^{nrp} W_i [α_0.975 − α_0.025]_i,    −180 ≤ α ≤ +180    (9)

where α_x is the xth quantile of the CDF of angular groundwater flow path directions.

C1.8. PATH CDF robustness measure. The PATH CDF robustness measure was constructed identically to that for the GWTT CDFs:

PATH bootstrap = |0.95 − nci/nrp| / 0.95    (10)

where nci refers to the number of times the true path direction falls within the 0.025 and 0.975 quantiles of the CDF of path directions.

C1.9. The "normalized self-consistency" measures. After careful examination of the evaluation measure scores, it was determined that the spread and bootstrap measures cannot be judged independently of each other. A distribution with a narrow spread appears, at the outset, desirable because it indicates little uncertainty. However, if the true GWTT rarely falls within the CDF bounds (a failure of the bootstrap test), the performance is deemed unsatisfactory. Conversely, if a method consistently satisfies the bootstrap interval bounds but also consistently produces CDFs with a very large spread, this too is undesirable. From a PA viewpoint it is better to have a robust method whose range of estimates nearly always contains the true value than to produce narrow distributions which fail to capture the true value a significant portion of the time. This way of thinking was encoded by combining the spread and bootstrap measures into a single "normalized self-consistency" evaluation measure, denoted NSC and formulated as

NSC = (3 · Boot + Sprd) / 4    (11)

In this formula the bootstrap score (Boot) and the spread measure score (Sprd) are first converted to consistent units via rank transformation. If the range of the transformed Boot and Sprd measures were [0, 1] (the actual rank-transform range is [1, 7]), then CDFs that were spikes always landing on the true value would yield NSC = 0, broad CDFs never containing the true value would lead to NSC = 1, and the overestimation and underestimation of uncertainty cases would lead to NSC = 0.25 and NSC = 0.75,

respectively. This measure was constructed for both the GWTT and PATH analyses. The 3-to-1 weighting of the two measures was determined by successive trials of different weights, along with comparisons and discussions of the merits of the CDFs produced in this test problem exercise.
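The fixed-well spread and robustness measures, eqs. (5) and (6), can be sketched directly from per-release-point samples of simulated GWTT. The lognormal samples, weights, and "true" values below are invented for illustration only.

```python
import numpy as np

def bootstrap_measure(samples, true_vals):
    """Eq. (6)/(10): |0.95 - nci/nrp| / 0.95, where nci counts true values
    falling inside the central 95% interval of each release-point CDF."""
    nrp = len(samples)
    nci = sum(1 for s, t in zip(samples, true_vals)
              if np.quantile(s, 0.025) <= t <= np.quantile(s, 0.975))
    return abs(0.95 - nci / nrp) / 0.95

def spread_measure(samples, weights):
    """Eq. (5): weighted width of the central 95% interval in log10 space."""
    return sum(w * (np.log10(np.quantile(s, 0.975)) - np.log10(np.quantile(s, 0.025)))
               for s, w in zip(samples, weights))

rng = np.random.default_rng(1)
samples = [10 ** rng.normal(3.0, 0.2, 500) for _ in range(4)]  # GWTTs per release point
weights = [0.25] * 4                                           # W_i summing to one
true_vals = [1000.0, 1000.0, 1000.0, 1.0e9]                    # last falls outside its CDF

boot = bootstrap_measure(samples, true_vals)
sprd = spread_measure(samples, weights)
```

With three of four true values captured, the bootstrap measure is |0.95 − 0.75|/0.95; in a full scoring run Boot and Sprd would then be rank-transformed and combined by eq. (11).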

C2. GWTT Analyses: Random Well Approach

For each TP and each method, in addition to computing the mean GWTT CDFs by averaging over the CDFs derived from each of the simulated flow fields, a bounding envelope containing the inner 95% of the CDF curves at each travel time value was constructed. These GWTT_0.025 and GWTT_0.975 bounding curves reflect the degree of variability in GWTT within the repository area from realization to realization. To distinguish these results from the fixed-release-point CDFs, we will refer to these CDFs as "the waste panel CDFs." Comparison of pathlines was not deemed necessary for the random well case because, as noted earlier, the regulations do not mandate treatment of where the contamination occurs, only how much reaches the accessible environment. Also, the repository is located in the area of greatest data density, so there is much less uncertainty associated with capturing the general flow directions. The evaluation measures attempt to quantify the deviation from the true CDF, the spread among the CDF estimates, and the robustness of the methods, as follows (see Figure 17). For the GWTT error, the area between the mean CDF and the true CDF was compared among the methods. Similarly, for the GWTT spread measure, the area between the two bounding envelopes was used to rank the methods. This area represents the degree of uncertainty in the estimate of the waste panel CDF. The GWTT robustness measure was computed by determining the proportion of the true CDF that lies within the bounding envelope of CDF curves and subtracting that from 1.0. As in the fixed-release-point case, the spread and bootstrap measures are, after conversion to rank values, combined into a single normalized self-consistency (NSC) measure as NSC = (3 · Boot + Sprd)/4.
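These area-based measures can be sketched on synthetic CDFs evaluated on a common log10(GWTT) grid. The piecewise-linear CDFs below are invented; trapezoid-rule areas stand in for the areas between curves described above.

```python
import numpy as np

def area(y, x):
    """Trapezoid-rule area under curve y(x), used for CDF-difference areas."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

t = np.linspace(2.0, 4.0, 201)                      # common log10(GWTT) grid
true_cdf = np.clip(t - 2.5, 0.0, 1.0)               # synthetic "true" waste panel CDF
sims = [np.clip(t - 2.5 + d, 0.0, 1.0)              # one CDF per realization
        for d in np.linspace(-0.2, 0.2, 9)]

mean_cdf = np.mean(sims, axis=0)
lo, hi = np.quantile(sims, [0.025, 0.975], axis=0)  # inner-95% bounding envelope

gwtt_error = area(np.abs(mean_cdf - true_cdf), t)   # area between mean and true CDF
gwtt_spread = area(hi - lo, t)                      # area between the two envelopes
gwtt_robust = 1.0 - np.mean((lo <= true_cdf) & (true_cdf <= hi))
```

Here the true CDF sits at the center of the simulated family, so the robustness measure is zero while the error and spread areas remain small but positive.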

C3. Field Variable Analyses

Inverse methods produce simulations of the entire transmissivity field conditioned on transmissivity and head data at a few observation points. T and head are related through the governing flow equation. The log10 (T) error measure was intended to quantify, in some average sense, how well the realizations reproduced the true transmissivity field. Because solute transport is a function of both log10 (T) variability and head gradients, a similar global measure of deviation from the true head field was also used as a performance measure. Because these measures are of such a global nature (a single scalar to quantify the degree of correspondence between true and estimated spatially variable quantities), they serve more as indicators of method performance than as measures which can be compared to two decimal places. This is one of the reasons all of the measures were converted to rank values.

C3.1. Head and log10 (T) error measures. The global measure of error was computed as a weighted average of the absolute differences between the true field values and the values from the grid blocks of the participant's model, averaged across all realizations produced by the method. It is computed as


head or log10 (T) error = [1/(NR · NG)] Σ_{i=1}^{NR} Σ_{j=1}^{NG} W_j |TRUE_j − MTHD_ij|    (12)

where NR is the number of realizations produced by the method, NG is the number of grid blocks in the participant's grid, W_j is a weight associated with grid block j, TRUE_j is the value of log10 (T) or head from the true solution for the participant's grid block j, and MTHD_ij is the value of log10 (T) or head from the inverse solution of a method in grid block j of simulation i. Because the true solution was solved on a much finer grid than any of the participants' models, the true value, TRUE_j, corresponding to a grid block of the participant's model is given by an area-weighted average of the true-grid block values contained within the participant's grid block j,

TRUE_j = [ Σ_{k=1}^{NT} A_k · TRUE_k ] / [ Σ_{k=1}^{NT} A_k ]    (13)

where NT is the number of true-grid blocks having any portion overlapping with grid block j of the participant's model and A_k is the area of true-grid block k which overlaps with the participant's grid block j. The weights, W_j in (12), were developed to account for the proximity of the observation points to the grid block where the error is being evaluated, in a similar way to that of the weights for the particle pathlines in the fixed release points case. The weights, identical for head and log10 (T), were computed as

W_j = Max(σ_K) − σ_K(j)    (14)

where σ_K(j) is the kriging standard deviation for cell j, using a linear semivariogram model and all data, both head and T, and Max(σ_K) is the maximum kriging standard deviation over all meshes j. There was no constraint that the sum of the weights be equal to unity.

C3.2. T-field correlation structure measure. Semivariogram estimates of the simulated log10 (T) fields were computed for each realization of a method using the GSLIB GAMV2M routine [Deutsch and Journel, 1992]. On the order of 600 to 1000 randomly placed sampling points were used in the estimation of each semivariogram. The average semivariogram was then computed across the ensemble of realizations for each TP and each method. Finally, estimates of the parameters of an exponential semivariogram model fit to each of the average empirical semivariograms were made via nonlinear regression. The same analysis was performed on the single true log10 (T) field realization for each test problem. The regression estimates of the sill, σ², and the correlation length parameter λ led to two geostatistical performance measures characterizing the correlation structure of the ensemble of fields produced by each method. The sill and correlation length evaluation measures, denoted J_σ² and J_λ respectively, were constructed identically as

J_σ² = 1 − σ²_T / (σ²_T + |σ²_i − σ²_T|)    (15)

J_λ = 1 − λ_T / (λ_T + |λ_i − λ_T|)    (16)

where subscript T denotes the true field value and subscript i denotes the value from method i. Just as the GWTT spread and bootstrap measures cannot be interpreted independently, neither can these two measures. It was decided that for PA purposes, it is more important to capture the correlation scale of the log10 (T) process than to match the variance in the distribution of true log10 (T) field values. This is particularly evident in TP 3, where the high-T channels lead to a very short correlation length. As a result of this thinking, the J_σ² and J_λ evaluation measures were combined into a single geostatistical structure evaluation measure, J_γ, defined as

J_γ = (3 · J_λ + J_σ²) / 4    (17)

The ability of an inverse method to reproduce correctly the correlation structure of the T field in the realizations was considered an important feature for predicting contaminant transport and spreading.
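Given fitted sill and correlation-length values, the three structure measures above are direct to compute; the numerical values below are invented purely for illustration.

```python
def j_sigma2(s_true, s_method):
    """Eq. (15): zero when the method's sill matches the true sill."""
    return 1.0 - s_true / (s_true + abs(s_method - s_true))

def j_lambda(l_true, l_method):
    """Eq. (16): zero when the correlation lengths match."""
    return 1.0 - l_true / (l_true + abs(l_method - l_true))

def j_gamma(l_true, l_method, s_true, s_method):
    """Eq. (17): 3-to-1 weighting favoring the correlation length."""
    return (3.0 * j_lambda(l_true, l_method) + j_sigma2(s_true, s_method)) / 4.0

perfect = j_gamma(2.0, 2.0, 1.0, 1.0)   # exact reproduction scores 0.0
biased = j_gamma(2.0, 6.0, 1.0, 1.0)    # a 3x length error pushes the score toward 1
```

Both component measures are bounded in [0, 1) and approach 1 as the mismatch grows, which is what allows them to be averaged and rank-transformed alongside the other evaluation measures.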

Acknowledgments. The test problem coordinator and first author would like to express gratitude for the help and constructive input provided by all the participants, design committee members, and others who attended the GXG meetings, and in particular to the following: Charlie Cole and Harlan Foote of Pacific Northwest Laboratory, who provided the multigrid solver used to obtain the solutions to the "true" fields; Anil Mishra of New Mexico Tech for helping with the numerical pumping in TPs 3 and 4; Randy Roberts and Paul Domski of Duke Engineering and Services, Inc. for their analyses of the aquifer test data; Hamilton Link of the University of Oregon who developed numerous codes used to carry out the comparison analyses; and Sean McKenna and Erik Webb of Sandia National Laboratories, Albuquerque, New Mexico for their extremely helpful review comments. Part of this paper was written while the second author, who also chaired the GXG, was spending a sabbatical at Stanford University in the Department of Geological and Environmental Science. The support of Stanford University is greatly appreciated. This work was funded by Sandia National Laboratories in the framework of research programs to assist in the development of the PA methodology for evaluation of the WIPP site on behalf of the Department of Energy.

References

Ahmed, S., and G. de Marsily, Cokriged estimation of aquifer transmissivity as an indirect solution of the inverse problem: A practical approach, Water Resour. Res., 29(2), 521–530, 1993.
Beauheim, R. L., Identification of spatial variability and heterogeneity of the Culebra Dolomite at the Waste Isolation Pilot Plant site, in Proceedings: NEA Workshop on Heterogeneity of Groundwater Flow and Site Evaluation, Paris, France, 22–24 October 1990, pp. 131–142, Nucl. Energy Agency, Org. for Econ. Coop. Dev., Paris, 1991.
Capilla, J. E., J. J. Gómez-Hernández, and A. Sahuquillo, Stochastic simulation of transmissivity fields conditional to both transmissivity and piezometric data, 2, Demonstration on a synthetic aquifer, J. Hydrol., 203, 175–188, 1997.
Carrera, J., State of the art of the inverse problem applied to the flow and solute transport equations, in Groundwater Flow and Quality Modelling, NATO ASI Ser., vol. 224, pp. 549–585, Kluwer, Norwell, Mass., 1988.
Carrera, J., and L. Glorioso, On geostatistical formulations of the groundwater flow inverse problem, Adv. Water Resour., 14(5), 273–283, 1991.
Carrera, J., and A. Medina, An improved form of adjoint-state equations for transient problems, in Computational Methods in Water Resources X, pp. 199–206, Kluwer, Norwell, Mass., 1994.
Carrera, J., and S. P. Neuman, Estimation of aquifer parameters under transient and steady state conditions, 1, Maximum likelihood method incorporating prior information, Water Resour. Res., 22(2), 199–210, 1986a.
Carrera, J., and S. P. Neuman, Estimation of aquifer parameters under



transient and steady state conditions, 2, Uniqueness, stability, and solution algorithms, Water Resour. Res., 22(2), 211–227, 1986b.
Carrera, J., A. Medina, and G. Galarza, Groundwater inverse problem: Discussion on geostatistical formulations and validation, Hydrogéologie, 4, 313–324, 1993.
Cauffman, T. L., A. M. LaVenue, and J. P. McCord, Ground-water flow modeling of the Culebra Dolomite, vol. II, Data base, SAND89-7068/2, Sandia Natl. Lab., Albuquerque, N. M., 1990.
Clifton, P. M., and S. P. Neuman, Effects of kriging and inverse modeling on conditional simulation of the Avra Valley aquifer in southern Arizona, Water Resour. Res., 18(4), 1215–1234, 1982.
Cooley, R. L., A method of estimating parameters and assessing reliability for models of steady state groundwater flow, 1, Theory and numerical properties, Water Resour. Res., 13(2), 318–324, 1977.
Cooley, R. L., A method of estimating parameters and assessing reliability for models of steady state groundwater flow, 2, Application of statistical analysis, Water Resour. Res., 15(3), 603–617, 1979.
Cooley, R. L., Incorporation of prior information on parameters into nonlinear regression groundwater flow models, 1, Theory, Water Resour. Res., 18(4), 965–976, 1982.
Cooley, R. L., Incorporation of prior information on parameters into nonlinear regression groundwater flow models, 2, Applications, Water Resour. Res., 19(3), 662–676, 1983.
Copty, N., Y. Rubin, and G. Mavko, Geophysical-hydrological identification of field permeabilities through Bayesian updating, Water Resour. Res., 29(8), 2813–2825, 1993.
Dagan, G., Stochastic modeling of groundwater flow by unconditional and conditional probabilities: The inverse problem, Water Resour. Res., 21(1), 65–72, 1985.
Dagan, G., Flow and Transport in Porous Formations, 465 pp., Springer-Verlag, New York, 1989.
Dagan, G., and Y. Rubin, Stochastic identification of recharge, transmissivity and storativity in aquifer transient flow: A quasi-steady approach, Water Resour.
Res., 24(10), 1698–1710, 1988.
Delhomme, J. P., Spatial variability and uncertainty in groundwater flow parameters: A geostatistical approach, Water Resour. Res., 15(2), 269–280, 1979.
de Marsily, G., De l'identification des systèmes hydrogéologiques (tome 1), Ph.D. thesis, pp. 58–130, L'Univ. Pierre et Marie Curie–Paris VI, Paris, 1978.
de Marsily, G., G. Lavedan, M. Boucher, and G. Fasanino, Interpretation of interference tests in a well field using geostatistical techniques to fit the permeability distribution in a reservoir model, in Geostatistics for Natural Resources Characterization, 2nd NATO Advanced Study Institute, South Lake Tahoe, CA, September 6–17, 1983, part 2, edited by G. Verly et al., pp. 831–849, D. Reidel, Norwell, Mass., 1984.
Desbarats, A. J., and R. M. Srivastava, Geostatistical simulation of groundwater flow parameters in a simulated aquifer, Water Resour. Res., 27(5), 687–698, 1991.
Dettinger, M. D., and J. L. Wilson, First order analysis of uncertainty in numerical models of groundwater flow, 1, Mathematical development, Water Resour. Res., 17(1), 149–161, 1981.
Deutsch, C. V., and A. G. Journel, GSLIB: Geostatistical Software Library and User's Guide, Oxford Univ. Press, New York, 1992.
Dietrich, C. R., and G. N. Newsam, Sufficient conditions for identifying transmissivity in a confined aquifer, Inverse Prob., 6(3), L21–L28, 1990.
Ginn, T. R., and J. H. Cushman, Inverse methods for subsurface flow: A critical review of stochastic techniques, Stochastic Hydrol. Hydraul., 4(1), 1–26, 1990.
Gómez-Hernández, J. J., and A. G. Journel, Joint sequential simulation of multi-Gaussian fields, in Geostatistics Troia '92, vol. 1, edited by A. Soares, pp. 85–94, Kluwer Acad., Norwell, Mass., 1993.
Gómez-Hernández, J. J., A. Sahuquillo, and J. E. Capilla, Stochastic simulation of transmissivity fields conditional to both transmissivity and piezometric data, 1, Theory, J. Hydrol., 203, 162–174, 1997.
Gonzalez, R. V., M. Giudici, G.
Ponzini, and G. Parravicini, The differential system method for the identification of transmissivity and storativity, Transp. Porous Media, 26, 339 –371, 1997. Grindrod, P., and M. D. Impey, Fractal field simulations of tracer migration within the WIPP Culebra Dolomite, Intera Inf. Technol., Henley-upon-Thames, U. K., Dec. 1991. Gutjahr, A. L., and J. R. Wilson, Co-kriging for stochastic flow models, Transp. Porous Media, 4(6), 585–598, 1989. Gutjahr, A., B. Bullard, S. Hatch, and L. Hughson, Joint conditional
simulations and the spectral method approach for flow modeling, Stochastic Hydrol. Hydraul., 8(1), 79–108, 1994.
Harvey, C. F., and S. M. Gorelick, Mapping hydraulic conductivity: Sequential conditioning with measurements of solute arrival time, hydraulic head and local conductivity, Water Resour. Res., 31(7), 1615–1626, 1995.
Hoeksema, R. J., and P. K. Kitanidis, An application of the geostatistical approach to the inverse problem in two-dimensional groundwater modeling, Water Resour. Res., 20(7), 1003–1020, 1984.
Hsu, K., D. Zhang, and S. P. Neuman, Higher-order effects on flow and transport in randomly heterogeneous porous media, Water Resour. Res., 32(3), 571–582, 1996.
Hyndman, D. W., J. M. Harris, and S. M. Gorelick, Coupled seismic and tracer test inversion for aquifer property characterization, Water Resour. Res., 30(7), 1965–1977, 1994.
Johnson, R. A., and D. W. Wichern, Applied Multivariate Statistical Analysis, 594 pp., Prentice-Hall, Englewood Cliffs, N. J., 1982.
Journel, A. G., and C. J. Huijbregts, Mining Geostatistics, Academic, San Diego, Calif., 1978.
Keidser, A., and D. Rosbjerg, A comparison of four inverse approaches to groundwater flow and transport parameter identification, Water Resour. Res., 27(9), 2219–2232, 1991.
Kitanidis, P. K., and R. W. Lane, Maximum likelihood parameter estimation of hydrologic spatial processes by the Gauss-Newton method, J. Hydrol., 79(1–2), 53–71, 1985.
Kitanidis, P. K., and E. G. Vomvoris, A geostatistical approach to the inverse problem in groundwater modeling (steady state) and one-dimensional simulations, Water Resour. Res., 19(3), 677–690, 1983.
Koltermann, C. E., and S. M. Gorelick, Heterogeneity in sedimentary deposits: A review of structure-imitating, process-imitating, and descriptive approaches, Water Resour. Res., 32(9), 2617–2658, 1996.
Kuiper, L. K., A comparison of several methods for the solution of the inverse problem in two-dimensional steady state groundwater flow modeling, Water Resour. Res., 22(5), 705–714, 1986.
Lappin, A. R., Summary of site-characterization studies conducted from 1983 through 1987 at the Waste Isolation Pilot Plant (WIPP) site, southeastern New Mexico, SAND88-0157, Sandia Natl. Lab., Albuquerque, N. M., 1988.
LaVenue, A. M., and J. F. Pickens, Application of a coupled adjoint-sensitivity and kriging approach to calibrate a groundwater flow model, Water Resour. Res., 28(6), 1543–1569, 1992.
LaVenue, A. M., B. S. RamaRao, G. de Marsily, and M. G. Marietta, Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 2, Application, Water Resour. Res., 31(3), 495–516, 1995.
Mackay, R., A study of the effect of the extent of site investigation on the estimation of radiological performance: Overview, DoE/HMIP/RR/93.053, 28 pp., UK Dep. of the Environ., Her Majesty's Insp. of Pollut., London, 1993.
Mandelbrot, B. B., The Fractal Geometry of Nature, 468 pp., W. H. Freeman, New York, 1983.
Mantoglou, A., and J. L. Wilson, The turning bands method for simulation of random fields using line generation by a spectral method, Water Resour. Res., 18(5), 1379–1394, 1982.
Matheron, G., The intrinsic random functions and their applications, Adv. Appl. Prob., 5(3), 439–468, 1973.
McKenna, S. A., and E. P. Poeter, Field example of data fusion in site characterization, Water Resour. Res., 31(12), 3229–3240, 1995.
McLaughlin, D., and L. R. Townley, A reassessment of the groundwater inverse problem, Water Resour. Res., 32(5), 1131–1161, 1996.
Neuman, S. P., and S. Orr, Prediction of steady state flow in nonuniform geologic media by conditional moments: Exact nonlocal formalism, effective conductivities, and weak approximation, Water Resour. Res., 29(2), 341–364, 1993.
Paleologos, E. K., S. P. Neuman, and D. Tartakovsky, Effective hydraulic conductivity of bounded, strongly heterogeneous porous media, Water Resour. Res., 32(5), 1333–1341, 1996.
Peck, A., S. Gorelick, G. de Marsily, S. Foster, and V. Kovalevsky, Consequences of spatial variability in aquifer properties and data limitations for groundwater modelling practice, IAHS Publ. 175, 272 pp., 1988.
RamaRao, B. S., A. M. LaVenue, G. de Marsily, and M. G. Marietta, Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 1, Theory and computational experiments, Water Resour. Res., 31(3), 475–493, 1995.
Robin, M. J. L., A. L. Gutjahr, E. A. Sudicky, and J. L. Wilson,
Cross-correlated random field generation with the direct Fourier transform method, Water Resour. Res., 29(7), 2385–2397, 1993.
Roth, C., J. P. Chilès, and C. de Fouquet, Adapting geostatistical transmissivity simulations to finite difference flow simulators, Water Resour. Res., 32(10), 3237–3242, 1996.
Rubin, Y., Prediction of tracer plume migration in disordered porous media by the method of conditional probabilities, Water Resour. Res., 27(6), 1291–1308, 1991a.
Rubin, Y., Transport in heterogeneous porous media: Prediction and uncertainty, Water Resour. Res., 27(7), 1723–1738, 1991b.
Rubin, Y., and G. Dagan, Stochastic identification of transmissivity and effective recharge in steady groundwater flow, 1, Theory, Water Resour. Res., 23(7), 1185–1192, 1987.
Rubin, Y., and G. Dagan, Conditional estimation of solute travel time in heterogeneous formations: Impact of transmissivity measurements, Water Resour. Res., 28(4), 1033–1040, 1992.
Rubin, Y., and A. J. Journel, Simulation of non-Gaussian space random functions for modeling transport in groundwater, Water Resour. Res., 27(7), 1711–1721, 1991.
Rubin, Y., G. Mavko, and J. Harris, Mapping permeability in heterogeneous aquifers using hydrologic and seismic data, Water Resour. Res., 28(7), 1809–1816, 1992.
Sahuquillo, A., J. E. Capilla, J. J. Gómez-Hernández, and J. Andreu, Conditional simulation of transmissivity fields honoring piezometric data, in Hydraulic Engineering Software IV, Fluid Flow Modeling, vol. 2, edited by Blain and Cabrera, pp. 201–214, Elsevier Sci., New York, 1992.
Sandia National Laboratories, Preliminary comparison with 40 CFR Part 191, Subpart B for the Waste Isolation Pilot Plant, December 1991, vol. 1, Methodology and results, SAND91-0893/1, Albuquerque, N. M., 1991.
Sandia National Laboratories, Preliminary performance assessment for the Waste Isolation Pilot Plant, December 1992, vol.
1, Third comparison with 40 CFR 191, Subpart B, SAND92-0700/1, Albuquerque, N. M., 1992.
Steel, R. G. D., and J. H. Torrie, Principles and Procedures of Statistics: A Biometrical Approach, 2nd ed., 633 pp., McGraw-Hill, New York, 1980.
Sun, N. Z., Inverse Problems in Groundwater Modeling, 337 pp., Kluwer Acad., Norwell, Mass., 1994.
Sun, N. Z., and W. W. G. Yeh, A stochastic inverse solution for transient groundwater flow: Parameter identification and reliability analysis, Water Resour. Res., 28(12), 3269–3280, 1992.
Townley, L. R., and J. L. Wilson, Computationally efficient algorithms for parameter estimation and uncertainty propagation in numerical models of groundwater flow, Water Resour. Res., 21(12), 1851–1860, 1985.
Tsang, Y. W., Stochastic continuum hydrological model of Äspö for the SITE-94 performance assessment project, Rep. SKI-R-96-9, 80 pp., Swed. Nucl. Power Insp., Stockholm, 1996.
U.S. Environmental Protection Agency, 40 CFR 191: Environmental standards for the management and disposal of spent nuclear fuel, high-level and transuranic radioactive wastes: Final rule, Fed. Regist., 50(82), 38,066–38,089, 1985.
Yeh, W. W. G., Review of parameter identification procedures in groundwater hydrology: The inverse problem, Water Resour. Res., 22(1), 95–108, 1986.
Zimmerman, D. A., and J. L. Wilson, Description of and user's manual for TUBA: A computer code for generating two-dimensional random fields via the turning bands method, GRAM, Inc., Albuquerque, N. M., 1990.
Zimmerman, D. A., C. L. Axness, G. de Marsily, M. G. Marietta, and C. A. Gotway, Some results from a comparison study of geostatistically-based inverse techniques, in Parameter Identification and Inverse Problems in Hydrology, Geology and Ecology, edited by J. Gottlieb and P. DuChateau, Kluwer Acad., Norwell, Mass., 1996.

C. L. Axness, R. Beauheim, P. B. Davies, D. P. Gallegos, and M. G. Marietta, Sandia National Laboratories, P.O. Box 5800, Albuquerque, NM 87185-5800. (e-mail: [email protected]; [email protected]; [email protected]; [email protected])
R. L. Bras, Department of Civil Engineering, Massachusetts Institute of Technology, Water Resources and Environmental Engineering Division, Room 48-311, Cambridge, MA 02139. (e-mail: [email protected])
J. Carrera, Universitat Politècnica de Catalunya, E.T.S.I. Caminos, Jordi Girona 31, E-08034 Barcelona, Spain. (e-mail: [email protected])
G. Dagan, Department of Fluid Mechanics and Heat Transfer, Tel Aviv University, P.O. Box 39040, Ramat Aviv, Tel Aviv 69978, Israel. (e-mail: [email protected])
G. de Marsily, Laboratoire de Géologie Appliquée, Université Paris VI, 4 place Jussieu, 75230 Paris Cedex 05, France. (e-mail: [email protected])
A. Galli, Centre de Géostatistique, Ecole des Mines de Paris, 35 rue St. Honoré, 77035 Fontainebleau, France.
J. J. Gómez-Hernández, Departamento de Ingeniería Hidráulica y Medio Ambiente, Universidad Politécnica de Valencia, Camino de Vera, S/N, 46071 Valencia, Spain. (e-mail: [email protected])
C. A. Gotway, National Center for Environmental Health, Centers for Disease Control and Prevention, MS F42, 1600 Clifton Rd. NE, Atlanta, GA 30333. (e-mail: [email protected])
P. Grindrod, Quantisci Ltd., Chiltern House, 45 Station Road, Henley-on-Thames, Oxfordshire, RG9 1AT, UK. (e-mail: peterg@quantisci.co.uk)
A. Gutjahr, Department of Mathematics, New Mexico Institute of Mining and Technology, Socorro, NM 87801. (e-mail: agutjahr@nmt.edu)
P. K. Kitanidis, Department of Civil Engineering, Stanford University, Terman Engineering Center, Stanford, CA 94305-4020. (e-mail: [email protected])
A. M. LaVenue and B. S. RamaRao, Duke Engineering and Services, Inc., 9111 Research Blvd., Austin, TX 78758. (e-mail: [email protected]; [email protected])
D. McLaughlin, Massachusetts Institute of Technology, Room 48-209, Cambridge, MA 02139. (e-mail: [email protected])
S. P. Neuman, College of Engineering and Mines, Department of Hydrology and Water Resources, University of Arizona, Tucson, AZ 85721. (e-mail: [email protected])
C. Ravenne, Institut Français du Pétrole, Rueil-Malmaison, France.
Y. Rubin, Department of Civil Engineering, University of California, Berkeley, CA 94720. (e-mail: [email protected])
D. A. Zimmerman, GRAM, Inc., 8500 Menaul Blvd. NE, B-335, Albuquerque, NM 87112. (e-mail: [email protected])

(Received May 14, 1997; revised December 22, 1997; accepted December 29, 1997.)