Book reviews

Journal of Classification 4:111-114 (1987)


Frederick Mosteller and David L. Wallace, Applied Bayesian and Classical Inference: The Case of the Federalist Papers, New York: Springer-Verlag, 1984, pp. xxxvii + 302.

Reviewer's Address: W.J. Krzanowski, Department of Applied Statistics, University of Reading, Whiteknights, Reading RG6 2AN, U.K.

This is the second edition of Inference and Disputed Authorship: The Federalist (1964, Addison-Wesley Publishing Co.). Apart from a few typographical changes, however, the only difference in the second edition is the replacement of the original Table of Contents by an Analytical Table of Contents (in which a description of the content of each section or subsection is given in addition to its title), and the inclusion of an extra chapter reviewing statistical authorship studies in the period 1969-1984. The first edition is surely well-enough known for a detailed description of the current contents to be superfluous; in any case, given the minimal textual changes, readers may obtain such a description from any of the original reviews (e.g., Tiao 1967). The following is therefore just a brief summary.

The Federalist papers were written under a single pseudonym ("Publius") in 1787-1788. Alexander Hamilton, John Jay and James Madison shared the task of writing them, and the precise authorship of each paper has been the subject of much subsequent study. Of the 77 papers originally published in newspapers, it has generally been agreed among historians that five were written by Jay, 43 by Hamilton, 14 by Madison and three by Hamilton and Madison jointly. The authorship of the remaining 12 has long been in dispute between Hamilton and Madison, and it is this question which Mosteller and Wallace aim to settle statistically. They treat the investigation as a case study in applied inference, with particular emphasis on Bayesian methods.

The scene is set in Chapter 1. The materials selected for the task are words and their distributions, and this topic is discussed extensively in Chapter 2. Chapters 3 and 4 provide the centerpiece of the book: Chapter 3 is the description of the main Bayesian study from the practical point of view, while Chapter 4 provides full theoretical bases for all the methods used in Chapter 3.


The main thrust in these two chapters is towards the calculation of log odds in favor of each author for each disputed paper. Subsidiary studies occupy the next three chapters. Classical linear discriminant analysis is used in Chapter 5, a robust simplified Bayesian analysis in Chapter 6, and an approach based on categorizing variables in Chapter 7. Other studies are reviewed in Chapter 8 and conclusions are presented in Chapter 9. The evidence from the main study is strongly in Madison's favor as author of the disputed papers, and support for this conclusion is provided by all the subsidiary studies. As already mentioned, the new Chapter 10 reviews further authorship studies that have appeared since the first edition.

This book makes for fascinating reading at a number of different levels: as a statistical detective story it grips attention early and holds it consistently (despite unraveling the "mystery" less than half-way through!); as a detailed case study in applied inference, with particular emphasis on Bayesian methodology, it remains unrivaled; and as a historical document in its own right it provides some interesting early glimpses of methodological ideas which have been developed and refined in the intervening years. The first two of these aspects are to some extent connected, and contribute to give the book its unique flavor among statistical texts. The impression of a unified whole as opposed to a series of unconnected chapters is maintained throughout, although the applied statistician would perhaps be wise to defer the theoretical Chapter 4 until the rest of the book has been absorbed.

The feature which stands out most strongly, in fact, is the emphasis on application; theoretical development is not shirked where necessary, but it is always the practice that drives the theory and not vice versa. Choice of distribution for word frequencies provides a simple example of this philosophy. The natural distribution that the practitioner would consider is the Poisson, but this turns out not to provide adequate fits to all the selected words. The negative binomial comes close to doing so. Rather than just accept this and go ahead, however, the negative binomial is reparameterized so that one of the parameters represents "non-Poissonness." Such reparameterization requires the recasting of many of the properties of the negative binomial in new terms. However, results can now be interpreted with reference to the Poisson, and hence in language closer to the practitioner's.
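The reparameterization can be sketched in a few lines of code. This is an editorial illustration rather than Mosteller and Wallace's own notation: the names mu (mean word rate) and delta ("non-Poissonness"), the choice variance = mu(1 + delta), and the word rates shown are all assumptions invented here for concreteness. Setting delta = 0 recovers the Poisson, so a fitted delta reads directly as a departure from Poissonness, and per-word contributions to the log odds simply add up when words are treated as independent, as in the main study discussed below.

```python
import math

def nb_logpmf(x, mu, delta):
    """Log-pmf of a negative binomial written in terms of its mean (mu)
    and a 'non-Poissonness' parameter (delta), so that the variance is
    mu * (1 + delta).  delta -> 0 recovers the Poisson distribution."""
    if delta <= 0:                      # Poisson limit
        return x * math.log(mu) - mu - math.lgamma(x + 1)
    r = mu / delta                      # standard negative binomial size
    p = 1.0 / (1.0 + delta)             # success probability
    return (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
            + r * math.log(p) + x * math.log(1.0 - p))

def log_odds(counts, rates_hamilton, rates_madison, delta=0.5):
    """Sum per-word log odds (Madison over Hamilton) for one paper,
    treating words as independent."""
    total = 0.0
    for word, x in counts.items():
        total += (nb_logpmf(x, rates_madison[word], delta)
                  - nb_logpmf(x, rates_hamilton[word], delta))
    return total

# Invented counts and rates for two marker words ("upon", "whilst").
counts = {"upon": 0, "whilst": 2}
rates_h = {"upon": 3.0, "whilst": 0.1}   # hypothetical Hamilton-like rates
rates_m = {"upon": 0.2, "whilst": 0.5}   # hypothetical Madison-like rates
print(round(log_odds(counts, rates_h, rates_m), 2))
```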


Perhaps the most obvious manifestation of the applied approach comes in the prodigious effort devoted to all aspects of checking assumptions, assessment of reasonableness of conclusions, validation of results, and correction for extraneous factors. Careful attention to detail is manifest, but nowhere becomes pedantic; the issues tackled are all important ones, and the approach is refreshingly pragmatic. Consider, for example, the issue of correlations between different word rates. Initial theoretical and empirical study suggested that such correlations should be negligible, and accordingly log odds were obtained in the main study assuming independence between words. Subsequent investigations were nonetheless conducted to assess how much the log odds based on independence would differ from the log odds based on a model incorporating modest dependence, and a small reduction in discrimination was thereby deduced.

Another constant source of worry was the inevitable presence of a regression effect from papers of known authorship (the training sets) to ones of unknown authorship (the allocation set), and a selection effect due to picking the "best" words for discriminating the training sets. Both of these effects exaggerate the apparent discrimination achieved for the training sets, and need to be allowed for when considering the allocation set. In the Bayesian study, observed log odds were compared with expected ones for subsets of word groups, thereby allowing adjustments to be made for the regression effect. Theoretical considerations showed that the negative binomial assumption automatically compensated for the selection effect. In the linear discriminant study, the papers of known authorship were split into a calibrating set plus a training set, and the regression effects were studied and corrected for on the calibrating set before application was made to the allocation set.

No approximations were ever introduced without considerable supporting evidence for their reasonableness. For example, the final log odds in the Bayesian study required a four-dimensional numerical integration to be evaluated for the negative binomial distribution, and this integration was avoided by an approximation using the mode of the appropriate posterior distribution. Extensive investigation was therefore made into the consequences of using the posterior mode as if it were exact, and this required the development of some general asymptotic theory of posterior densities. As a final example, the validity of making strong probabilistic predictions from limited test material was explored through a logarithmic penalty study.
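The posterior-mode device just described is, in present-day terms, essentially a Laplace-type approximation: an intractable integral is replaced by a Gaussian approximation centered at the mode. The one-dimensional toy below is only an illustration of that idea (the study's own integrals were four-dimensional and involved negative binomial likelihoods); the function h and the integration range are invented for the example.

```python
import math

def laplace_approx(h, theta_hat, eps=1e-4):
    """Approximate the integral of exp(h(theta)) d(theta) by a Gaussian
    centered at the mode theta_hat (curvature by finite differences)."""
    h2 = (h(theta_hat + eps) - 2 * h(theta_hat) + h(theta_hat - eps)) / eps**2
    return math.exp(h(theta_hat)) * math.sqrt(2 * math.pi / -h2)

def quadrature(h, lo, hi, n=20000):
    """Crude trapezoidal evaluation of the same integral, for comparison."""
    step = (hi - lo) / n
    ys = [math.exp(h(lo + i * step)) for i in range(n + 1)]
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

# Toy log-posterior: a Gamma(5, 2)-shaped kernel in theta > 0.
h = lambda t: 4 * math.log(t) - 2 * t
mode = 2.0                                   # argmax of h: 4/t - 2 = 0
print(laplace_approx(h, mode), quadrature(h, 1e-6, 40.0))
```

The two numbers agree closely, which is the point of the device: the mode-based value can stand in for the full integration at a fraction of the computational cost.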


The above gives an impression of the thoroughness of the study, but what about the techniques that were used? This is where the historical aspects show up; we must remember that the study was conducted a quarter of a century ago, and much recent methodology or computer power was unknown then. Here and there we get tantalizing glimpses of embryonic ideas that have now become familiar features of the landscape: jackknifing and cross-validation are briefly on the scene; a few sentences at the end of Chapter 5 anticipate the predictive versus estimative debate of fifteen years later (Aitchison, Habbema and Kay 1977; Moran and Murphy 1979); the main study of Chapter 3 is an early contribution to the line of Bayesian predictive discrimination; and the confidence intervals for the likelihood ratio in Chapter 5 tackle the same concerns as those taken up by Critchley and Ford (1985).

While reading the book, however, one constantly wonders how modern techniques would deal with the problem. Would logistic discrimination or Bayesian predictive discrimination be better for assessing the log odds than the method adopted? How would full cross-validation compare with the calibration + training assessment of regression and selection effects? What about log-linear fitting for the categorized variable study of Chapter 7? Surely modern computing power would cope easily with the four-dimensional integration that was skated over in the main study? Inevitably, all these questions are linked to computer power, but this prompts the further thought that with such power now available the temptation often is to conduct large-scale comparisons of a dazzling array of complex techniques at a superficial and shallow level. Thorough investigation using simpler techniques but providing conclusive results is eminently more satisfying and valuable. The authors took this latter course originally and have, I think, been wise to stick to it here by restricting their updating to the final review of more recent authorship studies.

A final comment on timeliness. Publication of the book coincides with media interest in recent statistical identification of a "lost" Shakespearean poem, and also with the routine use of computer optical character recognition systems of the kind anticipated in the preface to the first edition over twenty years ago. The book should therefore be of direct and current interest to those involved in authorship studies. (Here at Reading, such studies have proved to be a fertile area for undergraduate statistics projects also.) It goes without saying that statisticians (both pure and applied) will learn much from this work, and furthermore it is written in a pleasant and readable style. In conclusion, it can be recommended heartily to anyone who appreciates a numerate argument and an elegant turn of phrase.

Wojtek Krzanowski

University of Reading, England

References

AITCHISON, J., HABBEMA, J.D.F., and KAY, J.W. (1977), "A Critical Comparison of Two Methods of Statistical Discrimination," Applied Statistics, 26, 15-25.

CRITCHLEY, F., and FORD, I. (1985), "Interval Estimation in Discrimination: The Multivariate Normal Equal Covariance Case," Biometrika, 72, 109-116.

MORAN, M.A., and MURPHY, B.J. (1979), "A Closer Look at Two Alternative Methods of Statistical Discrimination," Applied Statistics, 28, 223-232.

TIAO, G.C. (1967), Review of Inference and Disputed Authorship: The Federalist, Journal of the American Statistical Association, 62, 306-309.

Journal of Classification 4:115-117 (1987)

Mike James, Classification Algorithms, London: Collins, 1985, pp. 211.

Reviewer's Address: Glenn W. Milligan, Faculty of Management Sciences, 301 Hagerty Hall, The Ohio State University, Columbus, Ohio 43210, USA.

The text written by James is designed as an introduction to the field of classification. However, the term "classification" is used in a fairly restrictive sense. That is, the book deals exclusively with the problem of classifying individuals when the groups are known a priori. As such, it is not consistent with the more liberal interpretation of the term taken by this journal.

The book consists of 10 chapters and three appendices. The book begins with a discussion of classification rules, including Bayes' (Chapters 1-2). Linear and quadratic discrimination based on the standard normality assumptions are presented in Chapters 3-5. The problem of estimating error rates is reviewed as a method of evaluating classification rule performance (Chapter 6). Chapters 7 and 8 deal with feature selection. In Chapter 7, canonical correlation is presented as a way to reduce the number of variables used in a classification rule. Stepwise discriminant analysis is proposed in Chapter 8 as a method for reducing the number of variables which must be collected in the first place. Chapter 9 discusses classification when using categorical data or nonparametric methods. The last chapter, titled "Artificial Intelligence and Pattern Recognition," attempts to provide a review of these fields, particularly as they apply to the classification problem. However, the chapter seems to add little to the overall discussion.

Included in several chapters are listings of BASIC programs which are designed to perform some of the procedures included in the text. The Microsoft V5 version of BASIC is used and the author claims that the programs will run on almost any CP/M machine. In several instances, the author attempts to indicate those programming changes that might be required for other equipment. Rather than keying in the programs directly, the author provides an address to write for information about software availability. It appears that the programs are to be obtained as part of a larger system called SAM, Statistical Analysis for Microcomputers. The specific programs listed in the book include a data entry program, linear and quadratic discriminant analysis, canonical analysis, stepwise discriminant analysis, plus various options or extensions to these routines. Additional material includes a nearest neighbor classifying program, a data generator (Appendix 2), and a listing of Fisher's Iris data (Appendix 3).


The iris data set is used in the display of program output at several places in the text.

The author has written the text at a fairly low level for individuals with limited statistical backgrounds. It is indicated on page 5 that the reader should have some familiarity with probability and statistics, and some knowledge of matrix algebra. No other advanced knowledge is specified, such as regression or a general introduction to multivariate methods. Some remedial help is offered in the text, including an introduction to probability (pp. 8-10), and a review of matrix algebra (Appendix 1). However, the material is incomplete. For example, the review of matrix algebra presents the concept of matrix inversion (pp. 193-194), but no computational example is given and no method is shown to obtain an inverse. From this base, the author then attempts to motivate the concepts of quadratic forms, eigenvalues, and eigenvectors.

Several other deficiencies or inadequacies can be identified in the text. First, although the author gives a reference page for the notation used in the text, some symbols are nonstandard and can cause confusion. For example, the symbols M and m are used to represent sample sizes, while N and n are used to specify the number of variables. Second, too few computational examples are presented and no practice problems are given in the text for student study or class homework assignments. Third, although a few statements are made concerning computational accuracy in the programs, no specific claim is made in the text as to the accuracy of the programs written in BASIC. Fourth, some errors appear in the text. For example, on the bottom of page 75, either the unusual confidence level of 55% is not specified, or the critical score has been left out of the confidence interval formula. Similarly, at the top of page 124, a numerical equation is given which would appear to be based on the output on page 123, which represents the topic currently under discussion. However, the necessary values are obtained from output on pages 113-114. Fifth, the logic of statistical instruction is out of sequence at times. On page 98, while attempting to motivate the concept of canonical correlation, the author takes a digression for those who are not familiar with one-way ANOVA. On the other hand, the author introduces the kernel (or Parzen) estimate in a section beginning on page 162 without the double asterisk notation used in the book to warn the reader that the material is advanced. Finally, the index for this 200-plus page book is inadequate and remarkably fits on only one page!

A different aspect of the text has to do with its treatment of discriminant analysis. No assumption is made that the reader is familiar with this topic and it is covered in the text. However, the coverage is divided among several chapters and it is not clear that the reader will develop an overall understanding of the technique. This is unfortunate given the fact that an individual familiar with the common classification rules also should have a good foundation in discriminant analysis.


I feel that it is naive to believe that an unsophisticated reader will benefit greatly from the presentation given by the text alone. There is little that a well-trained reader would not have already studied. At best, I would recommend the book as a supplemental text to be used in a multivariate course when the main text is weak on classification or when the instructor would like to emphasize this topic.

Ohio State University

Glenn W. Milligan

Journal of Classification 4:118-122 (1987)

Joel H. Levine, Levine's Atlas of Corporate Interlocks, Volumes I and II, Hanover, NH: Worldnet, 1984, Volume I -- 70 pages + 3 appendices, Volume II -- 379 pages (includes 59 maps and 43 color plates), $495.

Reviewers' Addresses: Stanley Wasserman, Department of Psychology and Department of Statistics, University of Illinois, 603 E. Daniel Street, Champaign, Illinois 61820, U.S.A., and Joseph Galaskiewicz, Department of Sociology, University of Minnesota, 267 19th Avenue South, Minneapolis, Minnesota 55455, U.S.A.

These two volumes by Joel Levine present an in-depth description of board interlocks among corporations around the world. The volumes contribute to the scaling and classification literature as good examples of how centroid scaling (Part I of Volume I) and frequency reconstructive scaling (Part II) can be used to describe the patterns of corporate interlocks among roughly 500 publicly-held corporations active in the late 1970's. An interlock exists if two corporations share a director. As Levine states, such links are international and join these major corporations into a single network. This monograph is intended primarily for political sociologists, business strategists, and those interested in applications of multidimensional scaling, and contains brief introductions to the techniques employed to study the corporate network.

The volumes are quite distinct. Volume I, as mentioned above, presents a brief, non-technical introduction to "pick any" centroid scaling (CS) and frequency reconstructive scaling (FRS) -- techniques that Levine has pioneered (see below) but which are not widely used. Also included in this first volume are descriptions of the data, the results of the two multidimensional scalings of the data, and a brief substantive discussion of the analyses. Part I also contains three valuable appendices that include a listing of the corporations in the study (Appendix A), the directors in the study (Appendix B), and the direct interlocks that each corporation has with the others, together with the shared directors who provide the links (Appendix C).

Volume II is simply a 379 page table of all 15,000 directors in the study and the boards that they serve on. For each director and for all the boards on which he/she sits, the table also gives the directors and boards to which he/she is directly and indirectly interlocked. As Levine states, the table presents "the networks by which directors have indirect access to target corporations through the agency of other corporate directors."


The table also contains the path(s) that link directors, although these paths are given only for links once-removed. For example, Lee Iacocca (page 163), CEO of Chrysler Corp., is on the board of Chrysler and thus is linked to Bank America Corp. through Najeeb Halaby, also on Chrysler's board, who is King Hussein's (of Jordan) father-in-law. We are not told who else is on the board of Bank America that Iacocca would then have second-order indirect access to.

Students of applied scaling will find the description and application of CS and FRS to these corporate interlocks most interesting. Levine described these techniques in his 1979 Psychometrika paper (CS) and his 1972 Behavioral Science paper (FRS). Levine's (1972a) well-known work on corporate interlocking and the resultant "sphere of influence" is an important theoretical antecedent of this monograph -- in fact, the monograph should be viewed as a culmination of this work. Levine, who in this monograph presents a strategy to map social distances into geometric space (and hence, classify actors and corporations based on geographic proximity in the space), views himself as a "social cartographer." He has produced an Atlas of Corporate Interlocks which, instead of locating towns, continents, and seas in a physical plane, maps the locations of corporations in a geometric space based on social distances defined by their interlocking directorates.

Centroid scaling is a technique that is applicable to two-mode, two-way data where the row actors are allowed to choose any number of column actors; i.e., binary, rectangular sociomatrices. CS attempts to draw a social map containing both sets of actors, which in the present application are both directors and corporations. Each director and corporation is placed in the space containing the map in such a way that each point is the center of the elements to which it is linked. The maps resemble sociograms but on a much larger scale. We like to think of it as an algorithm for constructing sociograms for such rectangular networks. The first criterion to be satisfied by the solution is that the position of each row actor (a corporation) must be proportional to the average of the positions of the column actors to which it is linked. The second criterion is that the same centroid property must hold for all column actors (the directors). Levine (1979) shows that the positions satisfying these two criteria are the eigenvectors of a matrix derived from the normed binary data array. The monograph gives a non-mathematical overview of CS which should provide the novice with a good understanding of how the technique works.
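The centroid idea can be conveyed with a small numerical sketch. What follows is not Levine's (1979) published algorithm, only a minimal eigenvector computation on an invented corporation-by-director incidence matrix; the device of scaling rows and columns by their marginal totals is a common one and is assumed here for illustration. It places each corporation at (a rescaling of) the centroid of its directors' positions, and each director at the centroid of his or her boards.

```python
import numpy as np

# Invented 4 corporations x 5 directors incidence matrix
# (entry 1 if the director sits on the corporation's board).
B = np.array([[1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 0, 1, 1, 0],
              [0, 0, 0, 1, 1]], dtype=float)

r = B.sum(axis=1)              # board sizes (row totals)
c = B.sum(axis=0)              # number of boards per director (column totals)

# Corporation-to-corporation averaging operator: over directors, then boards.
M = np.diag(1 / r) @ B @ np.diag(1 / c) @ B.T

vals, vecs = np.linalg.eig(M)
order = np.argsort(-vals.real)
# Skip the trivial constant eigenvector; keep the next two dimensions.
corp_pos = vecs.real[:, order[1:3]]
# Each director is placed at the centroid of the boards he or she sits on.
dir_pos = np.diag(1 / c) @ B.T @ corp_pos

print(np.round(corp_pos, 2))
print(np.round(dir_pos, 2))
```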


Levine presents a global picture of the CS solution (which pictures 442 corporations -- the remainder were excluded for a number of substantive reasons) and then focuses on the solution which produces a classification of the 442 into five regions. These regions are described and pictured in greater detail. The regions are best viewed as national clusters of German, French, Swiss, and Dutch corporations and a fifth, more dense and complicated cluster of firms from English-speaking countries (U.S., U.K., Canada, and South Africa). It is interesting that none of these clusters is in the center of the others; rather, there is a pattern with five vertices (one for each region) and ten edges (all regions are linked). Levine discusses the nature of these important inter-region interlocks. Levine then describes each of the clusters in more detail and "blows up" the map of each cluster to study the relative proximities among corporations and directors.

Levine's discussion and use of frequency reconstructive scaling is similarly non-technical. The main difference between CS and FRS is that the latter uses the number of directors that two firms have in common and the additional criterion that two firms should be closer together on the map if they have more directors in common. This proportionality between numbers of interlocks and organizational distance is coupled with the assumption of a city-block metric, so that FRS is more similar to standard multidimensional scaling techniques.

We are not really sure what FRS is. Levine states in footnote 8 that FRS is a "reinterpretation of the technically identical model" for two-way social mobility tables given in Levine (1972b). However, a study of this model still left us in the dark over how to apply it to rectangular nonbinary network data. Further technical information on FRS would have been appreciated.

Levine confines himself to a two-dimensional FRS solution, so that the results of this scaling are easier to interpret than those of the multidimensional CS analysis. The majority of Levine's efforts are directed at the description of sixteen regional maps that are direct (untransformed) subsets of the complete FRS solution. Each of these maps is expertly drawn using color so that blue lines indicate connections between a director and a corporation within the region and yellow lines indicate connections between directors and firms outside the region. Within-region corporations in the regional maps are labeled in large italics and corporations outside the region in small italics. The artistry found in these maps is quite important since the maps themselves are quite complicated. We found that Levine's presentations of these FRS plates greatly helped our understanding of the FRS solution. We suspect that these plates (there are more than forty in color) contributed to the hefty price of these volumes (more on this later) and certainly prevent one from reproducing them with photocopiers!

Overall, the FRS solution is hard to interpret. Most of the first volume is devoted to this task. The network is quite packed, and although the solution can be viewed by examining the sixteen regional clusters (once again, the classification is based primarily on geography), the clusters themselves have centers that are complex and tangled. The simplicity of the CS solution is missing here. Even within a region, such as the English-speaking center, one cannot assess the network without splitting the region up into much smaller subregions.


Levine guides the reader through these maps, but it is impossible to sum up the structure of this solution (except for saying that regionalism is important) in few words. The size of the data array really makes a complete analysis impossible, unless one uses only analytic tools designed to produce simple solutions. However, such simple solutions usually miss interesting interactions present in the data.

We recommend that those interested in global corporate interlocking consult Levine (1972a). This earlier study uses multidimensional scaling (specifically, nonmetric unfolding methods) to analyze a much smaller corporate interlock network of only 70 firms (based on 14 major banks headquartered in New York, Pittsburgh, or Chicago). The figures in this paper, which also demonstrate a regional classification, are interpretable.

Looking at this monograph as organizational or political sociologists, we found that the effort was disappointing. To be blunt, we never learned why the network appeared as it did (i.e., with firms being classified along regional lines) and if the structure of the network had any impact on the corporations, the global economy, or the general public. The monograph simply lacks any sociological theorizing. The presentation of the cartography is certainly elegant, and a reader solely interested in how firms in the global economy interact through the sharing of directors cannot find a better guide. But other organizational sociologists have mapped such interorganizational systems, and scholars in this area demand to know whether the uncovered structure has any impact, and if so, on what arena.

If we look at Levine's Atlas as an exercise in social network analysis, the exposition of the application of two multidimensional scaling techniques to a rich data set was enlightening. The primary advantage of the techniques is that one can interchange the network's vertices (the corporations) and links (the directors). A rectangular network such as this one can thus be viewed as a network of directors linked together by their memberships on corporate boards or a network of corporations that are tied by the directors they share. Both CS and FRS allow both types of actors to be positioned in the geometric space. However, one wonders whether other scaling or network modeling techniques might find simpler, easier to understand underlying social structures. Nonetheless, Levine is to be praised for his data gathering skills. Richer data sets do not exist.

It is difficult for us to recommend that interested readers go out and examine the volumes, unless the readers have a specific substantive interest in interlocking directorates or are trying to find someone who is well connected to sit on their corporate board. We suspect, given the inflated price of the volumes, that Levine and Worldnet are marketing this monograph for the business community. Relatively poor academics are unlikely to spend $495 on one volume of data analyses and one volume containing a data table. The monograph is certainly written for the former audience, and its historical perspective of the global economy probably would appeal to chairs of corporate boards of directors.


We would like to encourage the author to include the data in machine-readable form when selling the monograph to academics or libraries. Much could be gained if other methodologists were able to examine these data.

Stanley Wasserman
Joseph Galaskiewicz

University of Illinois
University of Minnesota

References

LEVINE, J.H. (1972a), "The Sphere of Influence," American Sociological Review, 37, 14-27.

LEVINE, J.H. (1972b), "A Two-Parameter Model of Interaction in Father-Son Status Mobility," Behavioral Science, 17, 455-465.

LEVINE, J.H. (1979), "Joint-Space Analysis of 'Pick-any' Data: Analysis of Choices from an Unconstrained Set of Alternatives," Psychometrika, 44, 85-92.

Journal of Classification 4:123-124 (1987)

Frans N. Stokman, Rolf Ziegler, and John Scott, Networks of Corporate Power, Cambridge, England: Polity Press, 1985, pp. 304.

Reviewer's Address: Joel H. Levine, Mathematical Social Sciences, 6104 Dartmouth College, Hanover, New Hampshire 03755, U.S.A.

It can't work, but it does: Combine 20 working scholars from 11 nations, coordinate their research methods and their concepts over a period of 6 years, and edit the whole into an integrated and readable volume. It cannot work, but it does. In Networks of Corporate Power, Stokman, Ziegler and Scott have organized and edited an important and coherent work: Important as an analysis of interlocking directorates, important as a comparative study of corporate/state systems in ten nations, and important as a landmark in the application of network analytical methods.

Networks of Corporate Power presents analyses of corporate interlocks in ten nations: Austria, West Germany, the Netherlands, Switzerland, Belgium, Finland, France, Italy, Great Britain, and the United States. All authors coordinated the sizes of their samples, approximately 275-300, and committed themselves to a common toolkit of quantitative methods. The result is a curious inversion of what might have been anticipated from the overt emphasis on common methods: Methodology per se does not dominate the book, apparently because there was no need for individual authors to introduce, or to justify, the methods of their separate studies.

The result is, in effect, an ethnology: Ten inside views of the ways that analysts think about the corporate systems of their respective nations. Within their common methods the authors have molded themselves to that which their experience indicated was significant: The major or minor role of the state, the variously dominant or ancillary contributions of sectors like banking and agriculture, the current residue of the distinct historical origins of these separate national systems. The combined studies are an antidote and challenge to those of us who, though we may know better, tend to view the world of real and potential corporate systems as variants of the one system within which we personally learned the economic facts of life.

There are two gaps, one methodological and one theoretical, at which one needs more than is provided, although the fault lies more in the state of the art than in the state of this book. One gap is the absence of a complete specification of the method.


While the authors have incorporated their methods in a specific computer program, GRADAP, it would have been appropriate to include a short but precise technical appendix that would facilitate exact reproduction of their methods without recourse to other publications. The methodology is, if anything, understated in the book. In a field for which creation of "new" methods is something of an intramural sport, this understatement could not have been better calculated to create interest in the methods. For some time to come the book is likely to be a touchstone: Argue with its methods, argue with its assumptions, argue with its interpretations, but do not ignore it. And there will, of course, be room for such argument.

One part of the methodological gap is the complement of the methods' flexibility. Where the authors have been free to pick and choose from the common tool kit of computer outputs, the result leaves some ambiguity about the methods not chosen by particular authors. For example, the first two national chapters, the chapters for Austria and Germany, feature figures showing "Overlapping Spheres of Influence," but exact corresponding figures are not replicated in other chapters. Presumably the concept or the specific technique was not useful for these other nations. But to eliminate the possibility that authors may have selected methods whose results conformed to a priori considerations, it would have been good to show the pitfalls of the methods not used by the respective authors.

The other gap is a problem for us all, not just for these authors. It is the disjuncture between the language of theory and the language of methodology. The language of "class cohesion," "resource dependence," "coordination," and "control" and the language of "centrality," "interlock," and "largest connected component" seem to bypass one another. The editors handle this problem well: What passes for "theory" in our current discourse is neatly outlined and segregated in a chapter of its own, with references in other chapters. It is a wise separation, avoiding attempts to look respectably "theoretical" in chapters where the authors have nontheoretical but important agendas. The authors have a great deal to say and did not let the embarrassing state of contemporary theory get in their way.

Networks of Corporate Power does its job well. Their work is good enough to bring one era of empirical research to an end, confronting us all with the next layer of deeper theoretical problems.

Dartmouth College

Joel H. Levine

Journal of Classification 4:125-128 (1987)

Moshe Ben-Akiva and Steven R. Lerman, Discrete Choice Analysis: Theory and Application to Travel Demand, Cambridge, MA: MIT Press, 1985, pp. xx + 390.

Reviewer's Address: Elke U. Weber, Department of Psychology, University of Illinois, 603 E. Daniel Street, Champaign, Illinois 61820, U.S.A.

The methods of discrete choice analysis and their applications to travel demand modeling are the two (approximately equally represented) topics of this well-written and clearly organized monograph. Meant as a professional reference source as well as a graduate level textbook (presumably for the area of transportation studies), the book will fill both of these roles well. A minor drawback for its use as a textbook is the absence of exercises. The exclusive use of travel demand studies to exemplify analytic techniques or advantages and shortcomings of particular procedures will probably discourage use of the book as a text in courses on discrete choice models in other disciplines. Problems in transportation studies have motivated many of the theoretical developments discussed in the book. These problems, however, are not unique to this substantive area. Researchers and students of discrete choice in any area could benefit from the book's logical and transparent organization, clear description of problems and issues, and concise exposition of solutions.

The book's basic approach to modeling individual choice is in the tradition of random utility models first proposed by Thurstone (1927). (The book gives due credit, but it will pain the psychologist readers to see Thurstone's name consistently misspelled.) This class of models accounts for observed inconsistencies in individual choice behavior by treating the utility of outcomes as a random variable or, equivalently, as the sum of a systematic and a random component. Specific random utility models differ in their assumptions about the joint probability distribution of the set of random utility components for the choice alternatives. These alternative assumptions and the resulting probabilistic choice models are developed in detail (Chapters 4 and 5). The other major conceptual approach, which accounts for choice inconsistency with a probabilistic decision rule, namely constant utility models as exemplified by Luce (1959), is briefly discussed but seen as less compatible with economic consumer theory than the random utility approach. Random utility models, of course, adopt the principle of utility maximization as their decision rule.


Another major concern of the book is the relationship between individual choice probabilities (disaggregate models) and aggregate choice probabilities. There are systematic developments of ways to aggregate individual choice probabilities to predict group choices (Chapter 6) as well as of ways to disaggregate group choices to predict the choice probabilities of subpopulations (Chapter 7).

A brief description of Chapter 6 will show the clear organization and progression of topics that structure each chapter and the book as a whole. Each chapter starts with a preview of topics and ends with a summary of its major results. Chapter 6 on "Aggregate Forecasting Techniques" first describes the problems involved in aggregating across individuals (Section 6.1), then (in Section 6.2) provides a taxonomy of five general types of aggregation techniques. (They are shown to reduce the problems of aggregating forecasts across individuals by making simplifying assumptions about the choice model, the population, or both.) This is followed by a detailed description of each approach in Section 6.3. Finally, Section 6.4 provides a comparison of these techniques using both theoretical and empirical evidence to suggest guidelines for choosing the procedures most appropriate for particular situations.

The book strikes a nice balance between the presentation of theoretical and metatheoretical issues (e.g., Section 7.2, "The Art of Model Building"), the derivation and exposition of analytic techniques, and discussions of feasibility and problems encountered in empirical applications. Befitting a book written for practitioners, the text emphasizes efficiency with respect to data requirements as well as computational feasibility and cost as important criteria for selecting analytic techniques (in addition to statistical criteria such as the efficiency of parameter estimates). Many of the examples are travel demand studies conducted by the two authors and their students and collaborators that have previously appeared only as unpublished dissertations, working papers, or technical reports. There are also frequent references to published studies for additional examples of model applications. The reference list is comprehensive (as far as this reviewer, who has no expertise in transportation studies, can tell) and accurate. The relevance of a particular reference (e.g., original source of an idea or technique, background reading, further development of a technique beyond the scope of the book, or example) is always clear.

The major choice model developed by the book is the multinomial logit model, i.e., the random utility model resulting from the assumption that the disturbances (random components of the utilities of the choice alternatives) are independent and identically Gumbel distributed.
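For readers outside transportation studies it may help to state the resulting choice probabilities explicitly: under the i.i.d. Gumbel assumption, the probability of choosing alternative i is exp(V_i) / sum_j exp(V_j), where V_i is the systematic utility. The sketch below, with invented utilities for three travel modes, computes these probabilities in closed form and checks them by simulating utility maximization under Gumbel noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical systematic utilities for three modes: car, bus, walk.
V = np.array([1.2, 0.4, -0.3])

# Closed-form multinomial logit probabilities.
p_logit = np.exp(V) / np.exp(V).sum()

# Monte Carlo check: utility-maximizing choices under i.i.d. Gumbel noise.
draws = 200_000
eps = rng.gumbel(size=(draws, V.size))
choices = np.argmax(V + eps, axis=1)
p_sim = np.bincount(choices, minlength=V.size) / draws

print(np.round(p_logit, 3))  # closed form
print(np.round(p_sim, 3))    # simulation; should agree closely
```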


Other models less extensively covered are a generalization of multinomial logit, i.e., the generalized extreme value (GEV) model (McFadden 1978), and the multinomial probit model with its assumption of disturbances that are independent and identically Gaussian distributed. The latter assumption may be more justifiable than the assumption of a Gumbel distribution via the Central Limit Theorem by viewing the disturbances as the sum of a larger number of unobserved but independent components. However, the probit model is shown to be computationally difficult in comparison to the "probit-like" logit model for the multinomial case.

Multidimensional choice sets, i.e., choices where the set of feasible alternatives are combinations of underlying choice dimensions (e.g., a particular choice alternative may consist of a possible shopping destination and a method for getting there), get separate treatment in Chapter 10. The joint logit model is introduced as a straightforward application of the multinomial logit model to multidimensional choice sets with independent utilities. When the utilities of the multidimensional alternatives are not independent (as may be the case when the elements of the choice set have separate unobserved components), a generalization of the logit model, termed nested logit, is shown to apply as long as the correlations between utilities have a particular structure.

Parameter estimation for all models is done mainly by maximum likelihood techniques, even though occasionally least squares methods are also developed. Chapter 2 provides a selective review of relevant results from the literature on parameter estimation. The authors emphasize the interdependencies between different stages in the modeling enterprise (i.e., selection of a sampling strategy, choice of a model, and parameter estimation). Chapter 8 provides a review of conventional sampling designs and considers their effectiveness for discrete choice studies. An interesting result derived in the chapter (but due to Daganzo, 1980) is that the optimal sampling strategy for even simple problems cannot be determined without prior knowledge of the model parameters to be estimated. This leads to the recommendation to conduct two-stage studies with an initial small randomly drawn sample to provide preliminary parameter estimates, which are then used to determine the optimal sampling strategy for the larger second-stage sample.

The final chapter (Chapter 12, "Models of Travel Demand: Future Directions") is the authors' attempt to point other researchers in the field toward directions where additional research is necessary and/or promising. Current and future research is classified into four categories: behavioral theory, measurement, model structure, and estimation. Progress in the field, which is reported by this monograph, has been mainly in the areas of model structure and estimation. Progress in measurement, the category under which the authors subsume all aspects of data collection including sampling theory, is considered uneven. Travel demand studies have historically relied on revealed preferences, but the authors see the potential of stated preferences to provide vastly more and richer information from each respondent. Thus, a major perceived deficiency is the absence of an explicit theory of how stated preferences map into actual behavior. Research with a focus on behavioral theory has been sparse.


The authors (probably correctly) predict that the assumption of utility maximization will stay with us despite behavioral evidence to the contrary because of the mathematical tractability it brings to choice models. Instead, objections to utility maximization as a behavioral theory have been, and will continue to be, dealt with by modifying other assumptions of the neoclassical economic model. Work on extensions of the model is recommended in several directions. The two most promising may be (1) developments that incorporate more sophisticated assumptions about available information and information acquisition into discrete choice models and (2) the use of intermediate constructs that stand between the physical attributes of choice alternatives and evaluation of their utility.

In summary, this book gives a comprehensive, accurate, and well-written account of the current state of discrete choice models as relevant for transportation studies. Researchers in this area and areas with similar discrete choice problem structures will find it a valuable reference.

Elke U. Weber

University of Illinois

References

DAGANZO, C. (1980), "Optimal Sampling Strategies for Statistical Models with Discrete Dependent Variables," Transportation Science, 14.

LUCE, R.D. (1959), Individual Choice Behavior: A Theoretical Analysis, New York: Wiley.

MCFADDEN, D. (1978), "Modelling the Choice of Residential Location," in Spatial Interaction Theory and Residential Location, eds. A. Karlquist et al., Amsterdam: North Holland.

Journal of Classification 4:129-131 (1987)

Kenneth J. Arrow and Hervé Raynaud, Social Choice and Multicriterion Decision-Making, Cambridge, MA: MIT Press, 1986, pp. 127.

Reviewer's Address: Peter C. Fishburn, Room 2C-354, AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, USA.

Suppose the alternatives in a large finite set have been ranked best-to-worst against each of a large number n of criteria to produce a sequence of n criterion rankings. Suppose also that these ordinal rankings of the alternatives are the only intracriterion evaluation data available and that the criteria have been selected to be relatively independent of one another and of approximately equal importance, so that there is no importance ranking of the criteria themselves. Given these assumptions, the problem addressed in Social Choice and Multicriterion Decision-Making is to determine a good procedure (or class of procedures) that forms a holistic best-to-worst ranking of the alternatives on the basis of the n criterion rankings. The procedure is to be general in the sense that it can be formulated in a simple algorithmic form that is routinely applicable to any sequence of criterion rankings.

The multicriterion ranking problem is ubiquitous. Examples arise when a governmental funding agency or an industrial firm wishes to rank order a large number of proposals or projects, or a university or employer wants an ordering of a pool of applicants. One of the authors of the monograph has extensive consulting experience with the problem that has motivated the development of the procedures they advocate. Their procedures are presented as alternatives to other multicriterion aggregation methods, including the widely-used Electre methods developed in France by Bernard Roy and others. An appendix briefly outlines and criticizes one of the Electre methods, but there is little analysis of other methods in the book.

The "Social Choice" of the title highlights the fact that the description of the problem given above is identical to a standard problem in social choice theory when 'criterion' is changed to 'voter'. The problem in its social-choice guise has been widely investigated by Arrow (Social Choice and Individual Values, Wiley, 1951) and many others. A prime lesson of their work, beginning with Arrow's celebrated impossibility theorem (a few desirable conditions for aggregating voter preference rankings into a social ranking are mutually incompatible), is that every seemingly reasonable aggregation method is flawed in some way or another.


Comparisons of voting or social ranking procedures for 'large' sets of candidates thus reduce to issues of practicality, cost, understanding of and acceptance by the electorate, and comparisons of their good and bad properties and consequences. The recent book by Steven Brams and myself (Approval Voting, Birkhäuser, 1983) is a case in point.

Arrow and Raynaud follow a similar practice-oriented approach in their monograph. The first of its three parts explains their formulation of the multicriterion ranking problem and cites difficulties in aggregation raised by the research in social choice theory. The second and longest part discusses in some detail conditions under which simple-majority comparisons between alternatives (i beats j if more criteria rank i above j than rank j above i) are transitive and therefore provide an ordering of alternatives that might be adopted as the output of the aggregation procedure. They also take care to point out the differences between the multicriterion problem in its industrial versus electoral contexts and explain how this affects their choice of conditions to impose on the aggregation procedure in the industrial context. The second part concludes with theoretical and practical arguments which suggest that the majority-comparison method will often fail to yield a simple-majority ranking due to the presence of majority cycles (i beats j, j beats k, k beats i).

The third part of the book presents their procedures after they introduce their aggregation conditions, show that these exclude well-known procedures associated with the names of Borda and Kemeny, and discuss very similar procedures due to G. Köhler. The Köhler and Arrow-Raynaud methods are based on the matrix [a_ij], where a_ij is the number of criteria that rank i above j, and their aggregated rankings coincide with the majority ranking when majorities are transitive. (A variety of other methods that duplicate the majority ranking under transitivity can be inferred from the choice functions defined in my "Condorcet Social Choice Functions," SIAM Journal on Applied Mathematics, 1977.) Although other set partitioning procedures are discussed in the book, the primary holistic ranking methods of Köhler and Arrow-Raynaud are sequential procedures that select one alternative at a time (ties broken arbitrarily) to construct the holistic ranking top down (best to worst) or bottom up (worst to best). After each selection, the row and column in [a_ij] of the selected alternative are deleted and the procedure continues with the reduced matrix. Hence previous selections do not explicitly affect the next selection.

An indication of their methods can be given by noting the choice criterion at each step in their 'primal algorithms'. Köhler's primal algorithm goes top down. Its selection at the next step is an i that maximizes the minimum over remaining j ≠ i of the a_ij. It is a conservative maximin procedure that is 'prudent' (a desirable property) in Arrow-Raynaud's terms. The primal algorithm of Arrow and Raynaud is similar except that it goes bottom up and, in doing so, reverses the max and min roles. Its selection at the next step is an i that minimizes the maximum over remaining j ≠ i of the a_ij. It too is prudent and can differ from Köhler's ranking when majorities are not transitive.
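Both primal algorithms are simple enough to state in a few lines. The sketch below works from an invented matrix a[i, j] of the number of criteria ranking i above j; ties are broken arbitrarily (here, by index), which is the kind of arbitrary tie-breaking the book itself allows.

```python
import numpy as np

def kohler(a):
    """Köhler's primal algorithm: build the ranking top down, at each step
    selecting the remaining alternative whose worst pairwise support
    (min over remaining j != i of a[i, j]) is largest (maximin)."""
    remaining, ranking = list(range(len(a))), []
    while remaining:
        if len(remaining) == 1:
            ranking.append(remaining.pop())
            break
        best = max(remaining,
                   key=lambda i: min(a[i, j] for j in remaining if j != i))
        ranking.append(best)
        remaining.remove(best)
    return ranking                       # best ... worst

def arrow_raynaud(a):
    """Arrow-Raynaud primal algorithm: build the ranking bottom up, at each
    step selecting the alternative whose best pairwise support
    (max over remaining j != i of a[i, j]) is smallest (minimax)."""
    remaining, reverse = list(range(len(a))), []
    while remaining:
        if len(remaining) == 1:
            reverse.append(remaining.pop())
            break
        worst = min(remaining,
                    key=lambda i: max(a[i, j] for j in remaining if j != i))
        reverse.append(worst)
        remaining.remove(worst)
    return reverse[::-1]                 # best ... worst

# Invented counts for 4 alternatives and 9 criteria (a[i,j] + a[j,i] = 9),
# chosen so that the simple-majority relation contains a cycle.
a = np.array([[0, 5, 6, 4],
              [4, 0, 5, 6],
              [3, 4, 0, 7],
              [5, 3, 2, 0]])
print(kohler(a), arrow_raynaud(a))
```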


Although the monograph has defects, it is quite readable, short, and is recommended to all who are interested in the theory or practice of multicriterion ranking or choice. The middle part is mostly tangential to the main thesis and can be safely ignored by most readers except to note that majority cycles can play havoc with naive attempts to rank things by simple-majority comparisons.

The monograph could have benefitted from more care by its authors and from a thorough editorial review. Its early portions make assertions about the mind and the brain that, with a few exceptions, strike me as amateur psychologizing. Some terms are undefined ('semiorder', p. 10, used differently than Luce's by-now-standard meaning) and others are defined tardily ('linear ranking', p. 18, defined on p. 31). The index omits some key terms (e.g., 'prudence') and there are several misprints, none very serious.

One is also left with the feeling that more could have been done by way of comparison to other ranking methods, including versions of Electre. Perhaps we shall not have to wait long for this.

AT&T Bell Laboratories, Murray Hill, NJ

Peter C. Fishburn

Journal of Classification 4:132-134 (1987)

L.A. Abbott, F.A. Bisby, and D.J. Rogers, Taxonomic Analysis in Biology: Computers, Models and Databases, New York: Columbia University Press, 1985, pp. 336.

Reviewer's Address: Theodore J. Crovello, Department of Biology, The University of Notre Dame, South Bend, Indiana 46556, U.S.A.

Throughout recorded history and before, people have classified the objects in their environment as well as each other. Biologists have produced formal classifications for centuries, and for this reason alone the present text might interest the general readership of this journal, regardless of their own field.

The authors do not provide a historical overview of biological taxonomy, but they do describe their view of the procedures of "standard" taxonomy (taxonomy without computers), which constitutes the first part of the book. They build on its basic concepts to describe three theoretical models of taxonomy (Part II), which then serves as a natural introduction to computer-assisted taxonomic analysis (Part III). A final section (Part IV) addresses computer-assisted database management.

A review of the "standard" approaches of biology helps us to understand the practical and theoretical origins of procedures that we otherwise tend to take too much for granted. Another reason is to provide nonbiologists with a productive context in which to "think laterally," to use Edward De Bono's well-known term. In the present case, the reader discovers (or recalls) that taxonomy is not just classification. And even within classification itself, specialized procedures have been developed for special purposes. Cladistics is a good example. Its special goal of producing a cladogram that reflects hereditary relationships seems to most practitioners to be served best by the use of specific methods that can be seen in one way as biased classification procedures, as are many phenetic methods. Such procedures may be the best available for the stated purpose. The important point is to understand that many classification methods used throughout all disciplines have been borrowed from biology, often without questioning their philosophic underpinnings. The present text can remind us of the origins of many such methods and serve as a context for us to evaluate the pros and cons of their direct use in other disciplines. If much of quantitative classification had not begun in biology, how would the current methodology differ in your field?


More specifically, in Part I the claim is made that Standard Taxonomy should be considered as an information system. This view is substantiated by examples of how various biologists use taxonomy and its products as information sources.

Part II establishes the theoretical bases of taxonomy by dividing the process into two stages: aspects of the structure of taxonomic data bases, and models and algorithms for producing classifications. Only thirteen pages are devoted to aspects of structure, so its treatment is quite superficial. Models and algorithms receive considerably more attention (46 pages). They define model as "a generalized, simplified representation of the complex system or set of data under study." Algorithms are "procedures for problem solving," and the authors caution that there has been confusion between models and algorithms, especially since the same algorithm can be applied to different models. The focus is on three models to create a taxonomic hierarchy: the geometric model, the graph theoretical model, and the information theory model. The well-known methods that complement hierarchic procedures are found in the section on geometric models, and include principal components and coordinates analyses, discriminant functions, and multidimensional scaling. Graph and set theoretic methods include minimum spanning trees but also dendrograms. Examples from biology are given throughout these sections, and Part II of the book ends with a short prose description of many of the major clustering methods.

Part III focuses on practical aspects of computer-assisted analysis and covers the following topics in separate chapters: character analysis; phenetic classification; diagrams of variation pattern; identification; and phylogeny and cladistics. While all of these are short and descriptive chapters, they again might remind us to ask what the basic questions are that we have not addressed deeply enough in our own field. For example, how do we choose characters in our own work, and how much of our focus is on steps beyond the character-by-object Basic Data Matrix rather than on the reliability of that matrix itself?

The final part of the book addresses computer-assisted database management. It is a very practical but important section, containing an introduction to basic concepts and examples of several biological databases.

The book is described by the publisher as requiring no previous knowledge of taxonomy. It also requires no previous knowledge of quantitative methods. As such, regular readers of the Journal of Classification should not expect to find new methods or detailed treatment of established ones. But as I said earlier, they might find a context to consider fundamental aspects of classification in their own fields, including the pros and cons of methods adopted from other fields, of viewing databases as models, and the use of specific procedures when a particular biased insight into a data matrix is desired.


In their final chapter, the authors note some possible future developments in taxonomy. They mention expert systems as one possibility, and we can expect database program package developers to explore ways in which programs can learn about each particular database and each user. Expert systems already exist that diagnose (identify) human diseases, identify geographic regions with a high probability of containing oil, and so on. Unfortunately, this process of determining to which preestablished class an object belongs is called classification instead of being referred to, more correctly, as identification. Waterman (1986) provides a readable, quite thorough introduction to expert systems, including a summary (arranged in a nonhierarchical classification!) of most publicly known expert systems.

A future edition should pay some attention to the work of cognitive psychologists on classification, concept learning and categorization (e.g., see Medin and Smith 1984). The authors also might explore the role of classification in biogeography, which can be seen as classification in which the objects are areas (OGUs, Operational Geographic Units; see Crovello 1981). The extra challenge of geography-based classification is to accommodate a frequent additional constraint: to create the best hierarchic classification that also has geographic contiguity. That is, the OGUs that emerge in the same cluster in character space should also be close together in geographic space (a schematic sketch of such a constrained clustering follows below).
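To make that contiguity constraint concrete, here is a schematic sketch (my own, not taken from the text or from Crovello 1981) of single-linkage agglomeration in which two clusters of OGUs may merge only if at least one pair of their members is geographically adjacent. The distance and adjacency matrices are invented for the example.

```python
import numpy as np

def constrained_single_linkage(dist, adjacent, n_clusters):
    """Merge the closest clusters, but only across geographically adjacent OGUs."""
    clusters = [{i} for i in range(dist.shape[0])]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # contiguity check: some member pair must be neighbours
                if not any(adjacent[i, j] for i in clusters[a] for j in clusters[b]):
                    continue
                d = min(dist[i, j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best is None:                       # no contiguous merge remains
            break
        _, a, b = best
        clusters[a] |= clusters[b]
        del clusters[b]
    return clusters

# Four OGUs: character-space distances and a chain of geographic neighbours.
dist = np.array([[0, 1, 4, 5],
                 [1, 0, 3, 4],
                 [4, 3, 0, 2],
                 [5, 4, 2, 0]], dtype=float)
adjacent = np.array([[0, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [0, 0, 1, 0]], dtype=bool)
print(constrained_single_linkage(dist, adjacent, 2))   # -> [{0, 1}, {2, 3}]
```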

Theodore J. Crovello

University of Notre Dame

References

CROVELLO, T.J. (1981), "Quantitative Biogeography: An Overview," Taxon, 30, 563-575.

MEDIN, D.L., and SMITH, E.E. (1984), "Concepts and Concept Formation," Annual Review of Psychology, 35, 113-138.

WATERMAN, D.A. (1986), A Guide to Expert Systems, Reading, Massachusetts: Addison-Wesley.

Journal of Classification 4:135-138 (1987)

L. Legendre and P. Legendre, Numerical Ecology, Amsterdam: Elsevier Scientific, 1983, pp. xvi + 449; L. Legendre et P. Legendre, Écologie numérique, 2e édition. 1. Le traitement multiple des données écologiques. 2. La structure des données écologiques, Paris: Masson, 1984, pp. xx + 260, viii + 335.

This book resulted from an expanded translation and revision of the first edition of Écologie numérique (Legendre and Legendre 1979); the second edition of the French version is almost identical to the English version. This is a textbook "directed to practical ecologists," and it is regarded by its authors "as a practical handbook as well as a reference book." The authors state in the Foreword: "With the advent of computer packages, scientists may now have access to the most sophisticated treatments, without having to write their own programs," and that it is "thus essential that the principles underlying the various numerical methods and the extent to which the latter can be used be clearly established and that guidelines be set for the interpretation of the results generated by the computer." They thus also consider their book "as a practical guidebook based on the most accessible packages."

The book contains eleven chapters (12 in the French edition), a bibliography, a few tables, a French-English vocabulary of numerical ecology, and a subject index. It is didactic and is tailored to the nonmathematically trained ecologist. It is also applicable to taxonomists, or to anyone who wishes to analyze diversity and to find groupings by means of computer-assisted methods; this is especially true of the first nine chapters.

Chapter 1, entitled "Complex Ecological Data Sets," introduces the subject of numerical data and its nature, how to organize data for computer analysis, computer packages, and descriptors. The chapter stresses that the most widely available computer packages are BMDP (Dixon et al. 1981) and SPSS (Nie et al. 1975). Unfortunately SAS (1982) was omitted, not only here but throughout the English version. This has been rectified in the French version, where Table 1.III lists the various programs of BMDP, SAS and SPSS relevant to the various chapters and sections of the book. This chapter also emphasizes that the most suitable language is that of matrix algebra.

Reviewer's Address: Dr. Bernard R. Baum, Section Head, Vascular Plants Section, Biosystematics Research Centre, Agriculture Canada, Saunders Bldg., Central Experimental Farm, Ottawa, Ontario, Canada K1A 0C6.


Chapter 2 is entirely devoted to matrix algebra, and the subject is clearly explained to the uninitiated, with step-by-step examples. Dimensional analysis is the subject of Chapter 3; its bases are equations. The authors found few existing applications in ecology, and to stress the usefulness of the approach they provide no fewer than eight ecological applications to help the reader visualize its potential.

The difference between the English and French editions lies in the treatment of the different kinds of variables. The English version treats multidimensional qualitative data in Chapter 4 and quantitative data in Chapter 5. The French version instead, perhaps for didactic reasons, treats quantitative data first in Chapter 4, then devotes a separate Chapter 5 to semi-quantitative data, followed by qualitative data in Chapter 6. Most important is the summary in Table 5.1 of the French version, an important guide to the correct application of the various methods depending on whether the variables are quantitative, semi-quantitative, qualitative, or a mixture of these. This also ends Part 1 of the French edition. Up to this point the book fulfills a long-awaited need; there may be no equivalent reference textbook useful to both ecologists and taxonomists.

Chapter 6 (7 of the French version) does not, by and large, provide anything different from other textbooks of numerical taxonomy such as Sneath and Sokal (1973). This chapter, like the previous ones, is fully detailed with explanations. The most important section is the last one, entitled "Choice of a Coefficient," including its Tables 6.3 and 6.4 (7.III and 7.IV of the French version), which are not found in other textbooks (one simple form of a coefficient for mixed data is sketched below). Although the book by Jardine and Sibson (1971) is cited by the authors, the information radius is not mentioned in Table 6.4, perhaps because it had not yet been used in ecology.

Cluster analysis is the subject of Chapter 7. As in the other chapters, a number of widely used packages, such as BMDP, NT-SYS and CLUSTAN, are mentioned; unfortunately SAS again is not. The illustrations of how the different methods work, the examples of ecological applications, and the synoptic summary of clustering methods (Table 7.9) all make this chapter a very useful guide. Aside from this, there is nothing in this chapter that has not already been summarized in other textbooks, such as Sneath and Sokal (1973). Furthermore, I disagree with the authors' statement on p. 219 that, contrary to taxonomy, there is no ecological theory which predicts the existence of discontinuities, namely the reproductive barriers. Taxonomy covers a vast field, including bacteria, where continuities abound. Even among vascular plants continuities are pervasive in spite of reproductive barriers. What the authors meant by "taxonomy" is probably the biological species concept (for a critique see for instance Davis and Heywood 1963, Sokal and Crovello 1970).
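For readers outside ecology, the following sketch shows one simple form such a coefficient for mixed data can take: a Gower-type similarity in which quantitative variables contribute range-scaled differences and qualitative variables simple matches. It is a standard construction given here only for orientation; it is not necessarily the coefficient the authors recommend, and the variables and values are invented.

```python
def gower_similarity(x, y, kinds, ranges):
    """kinds[k] is 'quant' or 'qual'; ranges[k] is the range of quantitative variable k."""
    parts = []
    for k, kind in enumerate(kinds):
        if kind == "quant":
            parts.append(1.0 - abs(x[k] - y[k]) / ranges[k])    # range-scaled difference
        else:
            parts.append(1.0 if x[k] == y[k] else 0.0)           # qualitative match
    return sum(parts) / len(parts)

# Two sites described by depth (m), pH, and substrate type (hypothetical descriptors).
a = (12.0, 6.8, "sand")
b = (30.0, 7.4, "sand")
print(round(gower_similarity(a, b, ("quant", "quant", "qual"), (50.0, 3.0, None)), 3))  # 0.813
```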


Chapter 8 describes ordination in reduced space, a major goal in ecology. Many of these methods are also familiar to other users, and this is perhaps one of the best textbooks on the subject. Section 8.4 lucidly describes correspondence analysis and is illustrated by a computational example.

Chapter 9 is entitled "Structure Analysis." Here the authors stress the desirability of using clustering and ordination together, in complement, in order to reveal structure. They then devote a section to discriminant analysis. In fact they describe canonical discriminant analysis and leave out classificatory discriminant analysis entirely. The former is an ordination method whereas the latter is a classification method (see the definitions in Pimentel 1979); the distinction is illustrated in the small sketch at the end of this review. After dealing with canonical correlation in a separate section, the authors provide a very useful Table 9.3, whose aim is to guide the user in the interpretation of ordinations and clusterings in relation to a group of descriptors.

Chapter 10 describes ecological series, and Chapter 11 deals with Markov processes and the Leslie matrix. Both chapters, like the previous ones, are enriched with ecological applications.

After the eleven chapters and four tables, the book ends with a "French-English Vocabulary of Numerical Ecology" and an index. The bilingual vocabulary "contains the technical terms which are not found in ordinary French-English dictionaries." It is arranged in nine themes corresponding to the sequence of the first nine chapters, and it is identical in the French and English editions. Its primary aim is to help Francophone scientists read English papers, so its presence in the English edition is questionable. Furthermore, in the English edition one would expect the terms to be given in English first with their French equivalents following, rather than in the same order as in the French edition. This is, however, a very minor point.

In conclusion, it must be said that this book has very few shortcomings. It is a book to be widely used, not only by ecologists but by taxonomists of all kinds. It caters to the non-mathematically oriented researcher and fulfills its aim of being a guide to the understanding of a number of classificatory methods and to the proper application of the computer packages that implement them.
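To illustrate that distinction between the two uses of discriminant analysis, the small sketch below, my own example, independent of the book and using scikit-learn on synthetic data, fits one linear discriminant model and uses it both ways: projecting the objects onto the canonical axes is the ordination use, while assigning a new object to a group is the classificatory use.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, (20, 4)) for m in (0, 2, 4)])   # three synthetic groups
groups = np.repeat([0, 1, 2], 20)

lda = LinearDiscriminantAnalysis(n_components=2).fit(X, groups)
scores = lda.transform(X)            # canonical variates: an ordination of the objects
new_obs = rng.normal(2, 1.0, (1, 4))
print(scores[:3])                    # coordinates on the two canonical axes
print(lda.predict(new_obs))          # classificatory use: to which group is it assigned?
```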

Biosystematics Research Centre

Bernard R. Baum

References

DAVIS, P.H., and HEYWOOD, V.H. (1963), Principles of Angiosperm Taxonomy, Edinburgh: Oliver and Boyd.

DIXON, W.J., BROWN, M.B., ENGELMAN, L., FRANE, J.W., HILL, M.A., JENNRICH, R.I., and TOPOREK, J.D. (1981), BMDP Statistical Software 1981, Berkeley: University of California Press.


JARDINE, N., and SIBSON, R. (1971), Mathematical Taxonomy, London: Wiley.

LEGENDRE, L., and LEGENDRE, P. (1979), Écologie numérique, 1ère édition, Paris: Masson.

NIE, N.H., HULL, C.H., JENKINS, J.G., STEINBRENNER, K., and BENT, D.H. (1975), SPSS: Statistical Package for the Social Sciences, 2nd edition, New York: McGraw-Hill.

PIMENTEL, R.A. (1979), Morphometrics: The Multivariate Analysis of Biological Data, Dubuque, Iowa: Kendall/Hunt.

SAS (1982), SAS User's Guide: Statistics, 1982 edition, Cary, NC: SAS Institute.

SNEATH, P.H.A., and SOKAL, R.R. (1973), Numerical Taxonomy: The Principles and Practice of Numerical Classification, San Francisco: Freeman.

SOKAL, R.R., and CROVELLO, T.J. (1970), "The Biological Species Concept: A Critical Evaluation," American Naturalist, 104, 127-153.

Journal of Classification 4:139-141 (1987)

H. Späth, Cluster Dissection and Analysis: Theory, FORTRAN Programs, Examples, Chichester, England: Ellis Horwood Ltd., 1986, pp. 226.

This book presents a treatment of partitioning methods of cluster analysis. The author defines a partitioning clustering criterion as "an objective function D which associates with each partition a non-negative real number, and thus allows a comparison between the partitions to be made." The "optimal partition" is then the one which maximizes or minimizes D. Späth claims that the major reason for focusing upon such methods is that they seem most appropriate for large data sets.

The book is divided into three major sections. Section I contains a theoretical exposition of partitioning cluster methods which utilize some type of objective function. The first five chapters of this section appear to be organized in terms of the specific type of objective function that is optimized for use with metric data: minimum variance (Chapter 2; a toy version of this criterion is sketched below), minimum determinant (Chapter 3), the criterion of adaptive distances (Chapter 4), and a summary of a variety of other criteria for use with matrix data (Chapter 5). Chapter 6 presents the use of the L1 criterion for metric, binary, and ordinal data. Chapter 7 discusses criteria and methods that are utilized when given or computed distances are used instead of profile data. Finally, Chapter 8 describes Späth's contributions to clusterwise linear regression, where one simultaneously seeks to partition the observations and solve for regression coefficients in those cases where the number of observations is much greater than the number of explanatory variables.

Section II of the book presents two chapters (Chapters 9 and 10) which contain the numerical details of the algorithms previously discussed, as well as FORTRAN IV "machine independent" subroutines for selected technical computations. Chapter 9 presents such details for the minimum variance and minimum determinant criteria, as well as the criterion of adaptive distances. Chapter 10 provides implementations for the L1 criterion with different data types, for criteria utilizing distances, and for clusterwise regression analysis.

Finally, Section III presents two chapters containing the main programs for these procedures, with several empirical examples. Several data sets are provided for each main program, as well as the corresponding output, which is presented graphically. There is also an Appendix describing a magnetic tape, written in standard format and containing all the programs quoted in the text, which may be obtained by writing to the publisher. Hints for implementation on microcomputers are also given there.

Reviewer's Address: Wayne DeSarbo, Marketing Department, Wharton School, University of Pennsylvania, Philadelphia, PA 19104, USA.
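To give nonusers a flavor of the partitioning framework, the sketch below, which is my own and not one of Späth's FORTRAN routines, minimizes the within-cluster sum of squares (the minimum variance criterion D) by a naive exchange heuristic: each object is reassigned to whichever cluster most reduces D, and passes continue until no move helps. Data and names are invented, and a serious implementation would update D incrementally rather than recompute it at every step.

```python
import numpy as np

def within_ss(X, labels, k):
    """Total within-cluster sum of squares: the minimum variance criterion D."""
    return sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in range(k) if np.any(labels == c))

def exchange_partition(X, k, n_passes=20, seed=0):
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(X))           # random initial partition
    for _ in range(n_passes):
        improved = False
        for i in range(len(X)):
            current = within_ss(X, labels, k)
            best_c, best_val = labels[i], current
            for c in range(k):                          # try every reassignment of object i
                labels[i] = c
                val = within_ss(X, labels, k)
                if val < best_val:
                    best_c, best_val = c, val
            labels[i] = best_c
            improved = improved or best_val < current
        if not improved:                                # local optimum of D reached
            break
    return labels

# Three invented, well-separated clouds of points in the plane.
X = np.vstack([np.random.default_rng(s).normal(m, 0.3, (15, 2))
               for s, m in enumerate((0.0, 3.0, 6.0))])
print(exchange_partition(X, k=3))
```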

Evaluation

Späth presents a number of seemingly innovative procedures for partitioning cluster analysis. Since many of the methods discussed are based on exchange algorithms, these procedures appear quite useful for large data sets. The inclusion of FORTRAN IV code with examples is commendable. Similarly, at the end of each chapter there are a number of problems that can be worked through. In Section I these problems tend to be more theoretical in nature, while in Sections II and III they are mostly empirical, requiring a computer.

However, there are also many problems with the text. One is that the book is very narrow in focus, examining only partitioning methods. According to the author, he purposely excludes hierarchical methods, methods for density estimation, methods based on graph theory, methods relying on a given ordering of distances, etc. This narrow focus would make it difficult to assign the book as a required text for a class in clustering or classification; it is much narrower in scope than Späth's earlier (1980) book. Another problem concerns the author's writing style. Each chapter is inadequately motivated; there is no apparent organization of the first eight chapters, and one goes from technique to technique with no idea why or what is to follow. Finally, there is no attempt by the author to position the contribution of this book in relation to the massive literature which already exists in this area. Few citations are given in the body of the text of Section I, so it is not clear to the reader which of the reported techniques are original and which already exist elsewhere in the literature. Späth does a very poor job of summarizing the work in the area. For example, mathematical programming approaches to cluster analysis (cf. Arthanari and Dodge 1981), including the work of Vinod (1969), Jensen (1969), Jarvinen, Rajala, and Sinvervo (1972), Littschwager and Wang (1978), Lefkovitch (1980), Mulvey and Crowder (1979), Klastorin and Watts (1981), etc., are not discussed. Neither is the vast literature on K-means approaches (cf. MacQueen 1967, and Hartigan 1975).

The bottom line is that this book is definitely worth purchasing for researchers and practitioners who do research in the area. However, I would not recommend it for use as a course textbook. I do hope that Wiley & Sons will do a better job of distributing this book than they did with Späth's (1980) book, for which inadequate inventories were kept.

University of Pennsylvania

Wayne DeSarbo

References

ARTHANARI, T.S., and DODGE, Y. (1981), Mathematical Programming in Statistics, New York: Wiley.

HARTIGAN, J.A. (1975), Clustering Algorithms, New York: Wiley.

JARVINEN, P., RAJALA, J., and SINVERVO, H. (1972), "A Branch-and-Bound Algorithm for Seeking the p-Median," Operations Research, 20, 173-178.

JENSEN, R.E. (1969), "A Dynamic Programming Algorithm for Cluster Analysis," Operations Research, 17, 1034-1057.

KLASTORIN, T.D., and WATTS, C.A. (1981), "The Determination of Alternative Hospital Classifications," Health Services Research, 16, 205-220.

LEFKOVITCH, L.P. (1980), "Conditional Clustering," Biometrics, 36, 43-58.

LITTSCHWAGER, J.M., and WANG, C. (1978), "Integer Programming Solution of a Classification Problem," Management Science, 24, 151-165.

MACQUEEN, J. (1967), "Some Methods for Classification and Analysis of Multivariate Observations," in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, L.M. LeCam and J. Neyman, eds., Berkeley: University of California Press, 281-297.

MULVEY, J.M., and CROWDER, H.P. (1979), "Cluster Analysis: An Application of Lagrangian Relaxation," Management Science, 25, 329-341.

SPÄTH, H. (1980), Cluster Analysis Algorithms, Chichester, England: Ellis Horwood.

VINOD, H.D. (1969), "Integer Programming and the Theory of Groups," Journal of the American Statistical Association, 64, 506-519.