
Quality and Quantity, 15 (1981) 179-202 179 Elsevier Scientific Publishing Company, Amsterdam - Printed in The Netherlands

COMBINATORIAL PROCEDURES FOR STRUCTURING INTERNAL MIGRATION AND OTHER TRANSACTION FLOWS

PAUL B. SLATER

Community and Organization Research Institute, University of California, Santa Barbara, California 93106, U.S.A.

Introduction

Gower (1977) has recently discussed multivariate methods for “the analysis of square non-symmetric matrices D whose rows and columns are classified by the same elements”. He indicates that such a data structure is not uncommon, and that it arises in diverse situations. In the social sciences, such matrices are typically termed transaction flow tables. Instances of them are input-output, internal migration, occupational mobility, trip distribution, and journal citation matrices. Gower has focused primarily on the application of multidimensional unfolding and canonical analysis of skew symmetry to tables of this nature. Though of substantial interest, these methods do not directly yield groupings of the analytical units. In contrast, the present author has developed several procedures for clustering using asymmetric matrices. The most widely applied of them has been a two-stage algorithm, IPFPHC (Slater, 1976a, b).

Methods for Structuring Transaction Flows

1. IPFPHC: HIERARCHICAL CLUSTERING OF DOUBLY-STANDARDIZED TABLES

In the first stage, an n × n table of recorded flows is adjusted by the iterative proportional fitting procedure (IPFP) - or, as it is otherwise variously termed, the biproportional, RAS, Furness, Fratar, or growth factor method - to have uniform row and column sums. The entries of the adjusted table are then maximum entropy estimates of the flows that would occur in an idealized situation in which no size differences between the n units with respect to the total amounts of movement into and out of each of them existed. The IPFP leaves standard measures of association, the cross-product ratios $f_{ij} f_{hk} / (f_{hj} f_{ik})$ ($i \neq h$, $j \neq k$), invariant. In this sense, it does not distort the interaction structure of the observed flows (Mosteller, 1968).

In the second stage of the procedure, a hierarchical clustering (HC) method - the directed graph (digraph) analogue of single linkage clustering (Hubert, 1973) - is implemented, employing the adjusted table as an asymmetric dissimilarity matrix. A series of directed graphs on n vertices is obtained through the sequential insertion of links that correspond to the entries of the table ordered by decreasing magnitude. The strong components - sets of mutually reachable vertices - of any one of these digraphs partition the n vertices into non-overlapping groups. Initially, the digraph contains no links and thus has n strong components. These n groups then merge hierarchically as more and more links are inserted, until all vertices can be reached from one another.

Dendrograms, or rooted tree diagrams, in which this clustering process is displayed, can be used to ascertain functional collections of units - those with relatively strong intragroup, and weak intergroup ties. The significance of these sets can be informally gauged by the magnitudes of the differences between the thresholds - values of “critical” links whose insertion results in the merging of strong components - at which they are formed from smaller groups and those at which they are consolidated to form larger clusters. These pairs of thresholds are the limits on the ranges of existence of groups. If these ranges are comparatively large, the groups are well-defined. One type, in particular, of transaction flow - internal migration - has been examined with the use of IPFPHC.

0033-5177/81/0000-0000/$02.50 © 1981 Elsevier Scientific Publishing Company
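The two stages of IPFPHC described above can be sketched in Python. This is only an illustration under stated assumptions, not the author's program: the function names `ipfp` and `strong_component_merges`, the convergence tolerance, and the small flow table are hypothetical, and the strong components are tracked with a simple reachability update that is practical only for small n.

```python
def ipfp(flows, target=1.0, tol=1e-10, max_iter=10000):
    """Alternately rescale rows and columns until all sums are close to `target`."""
    n = len(flows)
    table = [[float(x) for x in row] for row in flows]
    for _ in range(max_iter):
        for i in range(n):                      # row step
            s = sum(table[i])
            if s > 0:
                table[i] = [x * target / s for x in table[i]]
        for j in range(n):                      # column step
            s = sum(table[i][j] for i in range(n))
            if s > 0:
                for i in range(n):
                    table[i][j] *= target / s
        # columns are now exact; stop once the rows are also close enough
        if max(abs(sum(row) - target) for row in table) < tol * target:
            break
    return table


def strong_component_merges(table):
    """Insert links in decreasing order of magnitude and record each threshold
    at which the partition into strong components changes."""
    n = len(table)
    links = sorted(((table[i][j], i, j)
                    for i in range(n) for j in range(n) if i != j), reverse=True)
    reach = [[i == j for j in range(n)] for i in range(n)]  # reach[a][b]: path a -> b

    def components():
        return frozenset(
            frozenset(j for j in range(n) if reach[i][j] and reach[j][i])
            for i in range(n))

    merges, current = [], components()
    for value, i, j in links:
        # new link i -> j: whatever reaches i now reaches whatever j reaches
        for a in range(n):
            if reach[a][i]:
                for b in range(n):
                    if reach[j][b]:
                        reach[a][b] = True
        new = components()
        if new != current:
            merges.append((value, sorted(sorted(c) for c in new)))
            current = new
        if len(current) == 1:
            break
    return merges


# Hypothetical 4 x 4 flow table with an empty diagonal.
flows = [[0, 50, 10, 5],
         [40, 0, 20, 5],
         [10, 30, 0, 15],
         [5, 5, 25, 0]]
adjusted = ipfp(flows, target=1000.0)
for threshold, groups in strong_component_merges(adjusted):
    print(round(threshold, 1), groups)
```

Each printed threshold is the value of a "critical" link; the gap between the threshold at which a group forms and the one at which it is absorbed into a larger group corresponds to its range of existence.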
The results obtained have been generally well received by those intimately acquainted with migration phenomena in the various countries studied. A frequently expressed reaction is that the tree diagrams are highly successful in drawing attention to salient migration features. (The growing importance of migration as a component of population change was emphasized by S. Goldstein (1976) in his Presidential Address to the Population Association of America. He asserted that, “Most, if not all, of the great social problems confronting both the more and the less developed countries today probably have a migration component” (Goldstein, 1976, pp. 424-425).)

One attractive feature of the dendrograms obtained with the use of IPFPHC is that cosmopolitan or central areas - those that send and receive migrants to and from a wide range of locations - are isolated either singly or in groups with other cosmopolitan units, at lower thresholds than provincial areas - those having narrow migration bases. MacKinnon and Skarke (1977, p. 109) have noted the ability of IPFP to distinguish between these two types of areas - those with national and those with local identities. They suggest that “rather broad incentives and disincentives would have to be imposed in order to retard the growth of these [national] areas, whereas some finer policy instruments (of an origin-destination-specific type) would perhaps be more effective with those [local] regions with a more limited network of connection.”

Examples of areas that have been shown to be highly national in character in their respective countries are:

the Paris Region - in an analysis based on 21 planning regions; and, in a study employing 90 departments, three Parisian areas - Seine and Seine-et-Oise clustered as a unit, and Seine-et-Marne (“the extraordinary concentration of French life in Paris, which has resulted from the long tradition of administrative centralisation” is discussed by Hall (1966, p. 81)) - and two contiguous Mediterranean departments - Var (site of Toulon), and Alpes-Maritimes (location of Nice). (The convention of listing areas in order of decreasing centrality is followed here and below. Thus, c[Seine and Seine-et-Oise] > c[Seine-et-Marne] > c[Var] > c[Alpes-Maritimes], where c denotes centrality.);

Nivelles, Louvain, Brussels-Capital, and Hal-Vilvorde - the constituent arrondissements of the province of Brabant;

Barcelona - which is industrial and a more powerful attraction for migrants than the service center and capital of Madrid (Richardson, 1975, p. 65) - and Vizcaya (site of Bilbao, a heavily industrialized port);

London and two groups of four SMLA’s (Standard Metropolitan Labour Areas) on the western periphery of London - a “Western Home Counties” cluster of Reading, Slough, High Wycombe, and Oxford described by Johnson et al. (1974, p. 73), and a suburban London group formed by Aldershot, Guildford, Walton and Weybridge, and Woking;

the city and oblast of Moscow as a unit, the East Siberian areas of Krasnoyarsk Kray and the Tuva ASSR joined together, and Krasnodar Kray and Rostov Oblast as a North Caucasus pair;

the centrally located Polish city and province of Lodz, as a unit;

Bucharest, and the Black Sea provinces of Constanta and Tulcea paired together;

Veliko Turnovo and Gabrovo paired together, Sofia City and Sofia Province;

Värmland and Örebro as a pair, Stockholm and Södermanland;

Lombardia - which contains the major economic center of Milan - and the three, relatively underdeveloped, strongly clustered northeastern regions of Trentino Alto-Adige, Friuli-Venezia Giulia and Veneto (18-region analysis), and, in a 94-province study, Sardinia, the contiguous Lombardian provinces of Milano and Varese, Vercelli and Novara as a Piemonte unit, the tourist-oriented Lombardian provinces of Como and Sondrio joined together, the autonomous region of Val d’Aosta, Genova (Liguria), Torino (Piemonte), and the two paired provinces - Bolzano-Bozen and Trento - that form Trentino Alto-Adige (Rome is not highly national in terms of migration characteristics, having most of its interaction with other central provinces);

Istanbul, Erzincan, and Ankara;

the province of Rizal - the urban heart of the Philippines and site of many residential suburbs - and the major city of Manila;

five consolidated states (Yucatan, Campeche, Chiapas, Tabasco, and Oaxaca) and a territory (Quintana Roo) of southeastern Mexico;

two northern Brazilian states (Acre and Amazonas) joined together with two northern territories (Roraima and Rondonia) that were formerly part of Amazonas, as well as a group formed by four southeastern states (Minas Gerais, Espirito Santo, Rio de Janeiro, and Guanabara);

four united Patagonian provinces (Rio Negro, Neuquen, Chubut, and Santa Cruz y Tierra del Fuego) of Argentina, in addition to a pair comprised of the federal capital and Greater Buenos Aires;

and in an analysis of 1965-70 migration between the 510 State Economic Areas, Alaska, the two Hawaiian SEA’s as a group, the three southeastern Florida SEA’s together, the Chicago, New York, Norfolk, and San Bernardino-Riverside SMSA’s, Los Angeles, and the District of Columbia.

Ironically, remote areas, such as Alaska, Hawaii, Patagonia, northern Brazil, and northeastern Italy, which are far from central in a locational sense, frequently function centrally in a migration context. The occurrence of this phenomenon is plausible, since the range of distance-effects between such remote areas and all other units is relatively small. The difference in distance-deterrence in choosing to move between either Alaska and New York or Alaska and California would be, for example, much smaller than that associated with migration between New York and New Jersey or New York and Colorado. (Alaska and Hawaii also have disproportionately large numbers of military personnel, who come from and go to diverse locations.) “Persons ready to move to a considerable distance from their former residence are indifferent to whether the new residence is, say, rather distant from their former home, or very distant from it” (Bachi, 1961).

TABLE I
Number of Migration Units versus Threshold of Isolation of Highly Central Areas

Central area               Total number of migration units   Threshold of isolation a
Lombardia                  18                                93.3
Paris                      21                                86.1
Bucharest                  40                                66.8
Barcelona                  52                                63.8
Rizal                      55                                67.6
Istanbul                   67                                56.1
Moscow City and Oblast     72                                49.6
Seine and Seine-et-Oise    90                                50.5
Milan                      94                                37.1
London                     100                               35.6
Alaska                     510                               16.8

a Row and column sums equal 1000 in each analysis.

Table I shows how thresholds of isolation of cosmopolitan areas vary - holding all row and column sums at 1000 - in analyses with differing numbers of subdivisions. (Adjustment of n × n tables to have row and column sums equal to $n^{1/2}$ × constant appears to be an effective means of ensuring inter-country comparability in the case of centrality.)

The ability of IPFPHC to capture noteworthy centrality effects can be attributed to the non-negativity of entries and the uniformity of row and column sums of the IPFP-adjusted matrix. If alternative functional distance measures, which can assume negative values - such as relative acceptance coefficients - were employed, as suggested by Holmes (1977), it would become difficult to distinguish between cosmopolitan and provincial areas. The non-negativity of the IPFP scores is also of fundamental importance for the graph-theoretical model utilized in the hierarchical clustering stage of the algorithm.

The relevance and meaningfulness of well-defined groups constructed with the use of IPFPHC is highlighted when they are found to conform, at least in part, with traditional regionalizations. (Its success rate, in this regard, is much higher than would be anticipated on the basis of chance alone.) When some of these regions are also islands, the perspicacity of the procedure in detecting the impact of geographic barriers upon human movement is underscored.
Correspondences have been found, in either a zero-diagonal or original-diagonal analysis, or both, for:

Shikoku and Kyushu - both are islands - and Chugoku and Tohoku (Japan);

Sardinia and Sicily - both are islands - Trentino Alto-Adige, Friuli-Venezia Giulia, Umbria, Molise, Basilicata, and Calabria (Italy);

North and South Islands (New Zealand);

South Wales, Essex, North Midlands, and “Western Home Counties” (England and Wales);

Antwerpen, Liege, Limburg, and Oost-Vlaanderen (Belgium);

Alsace, Basse-Normandie, Bretagne, Franche-Comté, Haute-Normandie, Nord, and Lorraine (France);

Galicia, Aragon, Extremadura, and the Canary Islands (Spain) (Richardson (1975, ch. 4) documents and discusses 21 alternative regionalizations that have been proposed for the provinces of Spain.);

the Czech and Slovak Republics (Czechoslovakia) (“Despite the fact that the Czech lands are much more developed economically than Slovakia, there is relatively little movement of population between the two” (Compton, 1976, p. 193));

Norte Grande and Norte Chico, both of which two-province regions unite to form a particularly strong four-member group, and Núcleo Central, Los Lagos, and Los Canales (Chile) (J. Bahr, who supplied the Chilean data, and W. Golte have employed factor analytical and distance grouping methods to develop regional schemes for Chile (Bahr and Golte, 1974));

Mesopotamia, Cuyo, and Patagonia (Argentina);

the South and the Central-West (Brazil);

New England (United States college migration study);

and State Economic Areas were combined by IPFPHC to form the states of: Nevada (2 constituent SEA’s), New Mexico (4), Colorado (9), Arizona (4), Oregon (6), Idaho (4), Utah (5), Arkansas (10), South Dakota (5), North Dakota (5), Vermont (2), New Hampshire (3), Montana (4), Wyoming (2), Connecticut (5), and Maine (5).

SEA’s also formed well-defined migration regions identifiable as the East, the Northwest, Southern Louisiana, Northeast Oklahoma, Southern Illinois, Missouri and Eastern Kansas, and Eastern North Carolina. Close, but not exact, correspondence with other traditional regions has frequently been found with the use of IPFPHC. Some instances are: Campania and Marche (Italy); Hainaut and West-Vlaanderen (Belgium); the Northwest (RSFSR); Dunántúl (Hungary); Norte, Pacifico Norte, and Golfo de Mexico (Mexico); the Northeast and Southeast (Brazil); Bio-Bio (Chile); and Bicol (Philippines). Table II provides the ranges of existence for a selection of well-defined traditional regions. Low initial thresholds indicate central clusters, such as Sardinia and Hawaii. Los Lagos, Limburg, and Nord, on the other hand, are provincial in their inter-regional interactions. High thresholds of dissolution characterize strongly-bound regions, such as Limburg, the Canary Islands, Nord and Hawaii. Kyushu and the Russian Far East have relatively weak internal cohesion. If little or no agreement occurs, the relevance of the accepted scheme for analyzing human movements is called into question. The Italian regions of Lazio and Veneto, for example, have little correspondence to the functional migration regions found with IPFPHC. (Lazio is a hybrid region created in the 1920’s by uniting portions of Umbria, Abruzzi, and Campania with what was formerly the province of Rome. The northern provinces of Lazio have more interaction with central Italy, while its southern provinces have stronger ties with the South.) The states of

TABLE II
Ranges of Existence of Selected Well-Defined Regions a

Region                                   No. of constituent units   Total no. of migration units   Range of existence   Difference
Slovak Republic                          3                          10                             131-257              126
North Island                             7                          13                             75-150               75
South Island                             6                          13                             75-135               60
Los Lagos                                3                          25                             176-245              69
Norte Grande and Norte Chico             4                          25                             96-234               138
Limburg                                  3                          43                             151-316              165
Shikoku                                  4                          46                             87-169               82
Kyushu                                   7                          46                             66-111               45
New England (college migration study)    6                          51                             87-124               37
Galicia                                  4                          52                             120-201              81
Canary Islands                           2                          52                             77-408               331
European Turkey                          3                          67                             89-185               96
Russian Far East                         10                         72                             54-67                13
Franche-Comté                            4                          90                             138-174              36
Bretagne                                 4                          90                             92-138               46
Alsace                                   2                          90                             138-271              133
Nord                                     2                          90                             156-295              139
Sardinia                                 3                          94                             25-228               203
Sicily                                   9                          94                             60-101               41
Calabria                                 3                          94                             71-103               32
Molise                                   2                          94                             118-219              101
Trentino Alto-Adige                      2                          94                             73-241               168
South Wales                              5                          100                            60-162               102
Eastern United States                    185                        510                            56-60                4
Northeast Oklahoma                       1                          510                            72-101               29
Arkansas                                 10                         510                            71-75                4
New Hampshire and Vermont                5                          510                            61-107               46
Hawaii                                   2                          510                            20-284               264

a All row and column sums equal 1000. The countries to which these regions belong are given in the text.

Ohio, West Virginia, Iowa, Massachusetts, New Jersey, Maryland, North Carolina, and Texas would be poor choices for migration regions because of their relatively strong out-of-state, and weak in-state ties, and/or their lack of internal cohesion. West Virginia has State Economic Areas, for instance, that have greater ties with Ohio, Maryland, and Virginia SEA’s than with other West Virginia areas. Texas SEA’s, though not strongly clustering with out-of-state areas, break off from one another at low thresholds. Contrastingly, many instances have arisen of well-defined clusters

Fig. 1. Hierarchical Clustering Based on Migration Linkages. (Dendrogram; axis: threshold.)

having little correspondence with traditional regions. The three provinces - Kirklareli, Tekirdag, and Edirne - of European Turkey comprise such an example.

An application of IPFPHC to a 51 × 51 table (U.S. Bureau of the Census, 1973a, Table 44) - the ij-entry of which is the number of people estimated by the Census Bureau on the basis of a 15% sample to have lived in state i in 1965 and state j in 1970 - is presented in Fig. 1 (ZIP Code abbreviations for the states are employed in the figures and subsequent tables. The flow table was adjusted by IPFP to be doubly-stochastic - i.e., to have all row and column sums equal to 1). New Hampshire and Vermont emerged as the most strongly bound pair of states, since the migrants from either one of these states have a great affinity to move to the other. Other strongly united pairs, as well as larger groups such as New England, can be discerned in the tree diagram. The low thresholds at which Hawaii, Michigan, Florida, California, and Alaska are isolated (separated from all other states) indicate that they have moderately-sized migration ties with many other states, rather than strong ties with just a few.

A robustified metric multidimensional scaling procedure was applied to the reciprocals (measures of dissimilarity) of the entries of the doubly-stochastic table. (Trimeans - i.e., 0.25 [lower quartile + upper quartile] + 0.5 [median] - rather than means of partial changes were employed in the trilateration algorithm, developed by W.R. Tobler (Golledge and Rushton, 1972, pp. 14-17; Slater and Winchester, 1978), to render the results less sensitive (more robust) to exceptional values. No symmetrization of the dissimilarity matrix was necessary. Negative logarithms, rather than reciprocals, could have been employed to generate the dissimilarities.) The results are presented in Fig. 2. Distances between the states in this diagram are estimates of the dissimilarities. The major discrepancies (stresses) between these predictions and the corresponding reciprocals are presented in Table III. Thus, the “migration distance” from Rhode Island to Wyoming is the most severely underpredicted (only 13 people were estimated by the Census Bureau to have moved from Rhode Island to Wyoming). Contrastingly, Vermont should be substantially closer to North Dakota on the basis of the estimated flow of 591 people. (Some of the large negative residuals in Table III, since they relate to pairs of northern states, may be attributable to the opening of new winter-sports complexes.) The broad migration bases of Hawaii, Michigan, Florida, and California - already exhibited in the dendrogram - are confirmed by their central positions in Fig. 2 (dominant features of the figure are the long East-West

Fig. 2. Robust Metric Multidimensional Scaling Based on Migration Linkages.

TABLE III
Largest Residuals from Robust Scaling (Fig. 2) of Migration Distances

Positive                           Negative
Link       Residual   Flow         Link       Residual   Flow
RI → WY    1160       13           VT → ND    -211       591
WY → NH    1159       24           VT → SD    -252       155
ND → VT    1112       17           VT → WY    -251       114
WY → VT    940        13           NH → ND    -197       208
NV → DE    862        14           OR → NH    -190       195
ID → DE    830        21           ID → ME    -185       261
NB → VT    801        34           ME → MT    -184       370
ND → NH    155        49           DE → ID    -176       200
TN → WY    563        89           ID → CT    -169       169
ND → DC    508        44           MT → WV    -166       552

expanse, and the Deep South protuberance). Though Kruskal (1977) has emphasized that “multidimensional scaling and clustering are sensitive to complementary aspects of the data, the large dissimilarities versus the small ones”, the scaling presented here has a degree of success in substantiating small clusters, such as Vermont and New Hampshire, North Dakota and South Dakota, and North Carolina and South Carolina. However, the dyads formed by Mississippi and Louisiana, Indiana and Kentucky, and West Virginia and Ohio are widely split by the scaling.

The robust metric scaling procedure can also be used to appraise the hierarchical fit (Fig. 1) to the doubly-stochastic table. The ij-fitted value is the lowest threshold at which i and j are united. The scaling procedure was applied to the difference between the ij-fitted value and the ij-entry of the doubly-stochastic table (these are the negatives of the conventional residuals - most of which assume negative values themselves due to the fact that the cluster structure is determined by the largest entries (smallest dissimilarities) in the doubly-stochastic table). The results are presented in Fig. 3. Geographic clustering is evident, indicating that systematic effects remain after the fitting of the hierarchy. It would appear that South Dakota and Nebraska, for example, do not have a much weaker affinity than indicated in Fig. 1, since they are close to one another in Fig. 3. The largest residuals from the scaling are listed in Table IV. Thus, the large distance, 0.068, between the District of Columbia and Maryland in Fig. 3 is misleading and does not, in fact, indicate that the former unit is deceptively close to the latter in Fig. 1. (The largest entry, 0.368, in the doubly-stochastic table corresponds to the movement of 79,053 people from the District of Columbia to Maryland. The entry for the movement of 13,102 people from Maryland to the District was 0.139 - the threshold at which these geographic units joined in Fig. 1, since they formed a strong component at that level. Note that 0.368 = 0.139 - (0.068 - 0.297). In other words, the District of Columbia → Maryland link equals the fitted hierarchical value minus the negative residual, which itself equals the fitted scaled value plus the residual from the scaling (Table IV).) Hawaii, Michigan, Florida, and California retain their centrality even after removal of the hierarchical effects.

The ratio of the midspread (interquartile distance) of the residuals from the hierarchy to the midspread of the entries in the doubly-stochastic table could be used as a measure of goodness of fit. Similarly, the ratio of the midspread of the residuals from the scaling to the midspread of the reciprocals of the doubly-stochastic entries could be used to appraise the scaling. Such a course would serve as a robust alternative to the conventional use of a sum-of-squares stress measure.

Fig. 3. Robust Metric Multidimensional Scaling of Negative Residuals from Hierarchical Clustering (Fig. 1).

TABLE IV
Largest Residuals from Robust Scaling (Fig. 3) of Negative Residuals from Hierarchical Clustering

Positive               Negative
Link       Residual    Link       Residual
OR → VT    0.088       DC → MD    -0.297
WA → UT    0.087       NV → UT    -0.113
UT → OR    0.086       UT → NV    -0.105
UT → WA    0.078       WI → MN    -0.104
LA → TN    0.077       KY → IN    -0.103
TN → LA    0.075       KS → NB    -0.100
WA → WY    0.075       CA → NV    -0.094
VT → ME    0.074       MN → WI    -0.091
ME → VT    0.072       NB → KS    -0.091
OR → WY    0.071       MO → IA    -0.090
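The robust summaries used in this section are simple to compute. The following Python sketch shows the trimean, the midspread, and the proposed midspread ratio, and rechecks the District of Columbia → Maryland decomposition quoted above. The function names are hypothetical, and the quartile convention (the "inclusive" method of the standard library) is an assumption, since the text does not say which convention was used.

```python
import statistics


def quartiles(values):
    """Lower quartile, median and upper quartile (inclusive convention assumed)."""
    return statistics.quantiles(values, n=4, method="inclusive")


def trimean(values):
    """0.25 [lower quartile + upper quartile] + 0.5 [median]."""
    q1, q2, q3 = quartiles(values)
    return 0.25 * (q1 + q3) + 0.5 * q2


def midspread(values):
    """Interquartile distance."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


def midspread_ratio(residuals, entries):
    """Robust goodness-of-fit measure: smaller values indicate a closer fit."""
    return midspread(residuals) / midspread(entries)


# One exceptional value moves the mean a long way but leaves the trimean alone.
sample = [1, 2, 3, 4, 100]
print(statistics.mean(sample), trimean(sample))

# DC -> MD link = fitted hierarchical value - (fitted scaled value + scaling residual)
fitted_hierarchical = 0.139
fitted_scaled = 0.068
scaling_residual = -0.297
link = fitted_hierarchical - (fitted_scaled + scaling_residual)
print(round(link, 3))
```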

2. MULTI-TERMINAL NETWORK ANALYSES OF UNSTANDARDIZED FLOW TABLES

Two other procedures for examining transaction flows have been developed and employed for analysis, though to a lesser extent than IPFPHC. In one, the max-flow min-cut algorithm (Nijenhuis and Wilf, 1975, ch. 18) is applied n(n - 1) times (once for each possible pair of sources and sinks) to the original, unadjusted n × n table - regarding it as the capacity matrix of a network - to ascertain whether or not any nodal groups at all exist (Slater, 1976c). Nodal groups, considered as

TABLE V
In-Migration Nodal Regions

Nodal region                                                                                                                      In-migration to region   In-migration to node
1. MD *, DC                                                                                                                       473,875                  476,068
2. (AZ, CA, NV, OR, WA) *, MT, ID, UT                                                                                             2,212,685                2,253,305
3. (MD, DC, VA, WV, NC, SC, GA, FL, AL, MS) *, ME, NH, VT, MA, RI, CT, NY, NJ, PA, OH                                             2,407,507                2,565,635
4. (OH, IN, IL, MI, WI, MN, IA, KY) *, ND, SD, WV                                                                                 1,913,366                1,930,014
5. (CO, NM, AZ, NV, CA, OR, WA, HI) *, ND, SD, MT, ID, WY, UT                                                                     2,411,383                2,515,388
6. (VT, MA, RI, CT, NY) *, ME, NH                                                                                                 1,066,805                1,067,824
7. (OH, IN, MI, VA, WV, NC, GA, KY, TN, AL) *, ME, NH, RI, CT, NY, NJ, PA, IL, WI, MN, IA, DE, MD, SC, MS, LA                     2,185,063                2,938,983

* Node.

TABLE VI
Out-Migration Nodal Regions

Nodal region                                                                                                                      Out-migration from region   Out-migration from node
1. (OR, CA) *, WA                                                                                                                 1,455,523                   1,461,760
2. (AZ, CA) *, ID, UT, NV, OR, WA                                                                                                 1,504,404                   1,505,500
3. (IN, IL, MI, WI, MN, IA, MS, ND, SD, NE, MT, ID, WY, CO, NM, AZ, UT, NV, WA, OR, CA) *, KS, AR, LA, OK, TX, AK, HI             2,407,507                   3,069,396
4. (MO, ND, SD, NE, KS, OK, TX, MT, ID, WY, CO, NM, AZ, UT, OR) *, AR, NV, WA, CA, HI, AK                                         2,185,063                   2,240,182
5. (MA, RI, NY, NJ) *, NH, VT                                                                                                     1,780,610                   1,793,238

* Node.

units, have less movement out from (into) them than do their nodes - individually distinguished members of the groups. (The author has conducted an in-depth study of the most detailed interindustry flow table available for the United States - a 1967, 495 × 495 matrix - with this method: 200 production and 42 consumption complexes were identified (Slater, 1978).)

An extension of this “multi-terminal” procedure to include “supernodes” formed by aggregating contiguous areas has been developed and utilized in examining the 1965-70 interstate migration table. This approach can be adopted if the use of individual nodes as sources and sinks reveals few or no non-trivial groups - those that have at least two members. To conduct this analysis, a table of the 109 contiguous pairs of states was compiled (Hawaii was regarded as contiguous to California and Alaska to Washington). A connected graph having 51 vertices and 109 edges was conceptualized. A random-number generator was used to delete edges of this graph with a uniform probability at each iteration of the procedure (this probability was varied between 0.35 and 0.5 between iterations). A FORTRAN subroutine, SPANFO (Nijenhuis and Wilf, 1975, ch. 14), was used to compute the connected components of the resultant graph. If m components were found, m(m - 1) network flow problems with artificial sources and sinks were solved. Approximately 50 iterations of this procedure were performed. The average number of connected components after deletion of edges was roughly 10. A much richer variety of results was obtained (Tables V and VI) than when the candidates for nodes were only the units themselves.
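The basic multi-terminal step, with individual units as sources and sinks, can be sketched in Python as follows. This is a minimal illustration rather than the author's FORTRAN implementation: the shortest-augmenting-path (Edmonds-Karp) variant of the max-flow min-cut algorithm is swapped in for the routine cited above, the names `max_flow_min_cut` and `out_migration_nodal_groups` and the 3 × 3 table are hypothetical, and the test "cut smaller than the node's own total outflow" is one reading of the definition of a nodal group given in the text. The supernode extension would replace single sources and sinks by artificial ones attached to each connected component.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Return (maximum flow, set of nodes on the source side of a minimum cut)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0
    while True:
        # breadth-first search for an augmenting path in the residual network
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break                      # no path left: the flow is maximal
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
    # nodes still reachable from the source form the source side of the cut
    source_side = {v for v in range(n) if parent[v] != -1}
    return flow, source_side


def out_migration_nodal_groups(capacity):
    """Groups whose total outflow is smaller than that of their node alone."""
    n = len(capacity)
    groups = {}
    for s in range(n):
        out_from_node = sum(capacity[s][v] for v in range(n) if v != s)
        for t in range(n):
            if s == t:
                continue
            cut, side = max_flow_min_cut(capacity, s, t)
            if len(side) > 1 and cut < out_from_node:
                groups[(s, frozenset(side))] = cut
    return groups


# Hypothetical raw flow table regarded as a capacity matrix.
table = [[0, 10, 1],
         [2, 0, 1],
         [1, 1, 0]]
for (node, members), out_flow in out_migration_nodal_groups(table).items():
    print("node", node, "group", sorted(members), "out-migration from group", out_flow)
```

Applying the same functions to the transposed table gives the in-migration counterpart.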

192 Table v. Region 1 reflects the flight to the suburbs of Maryland from

the highly urbanized District of Columbia. Maryland has a more extensive border with Washington, DC, than does the city’s other neighbor, Virginia. Region 2 shows a movement from the western states inland to three mountain states and relatively little movement out from these mountain states. The third region resulted from the same cut as the third region of Table VI (both source and sink regions are listed, since they are approximately equal in size and are each of substantial interest). It corroborates the well-known phenomenon of movement from the northeast to the southeast states, with their warmer climates and less-expensive labor. The fourth group reflects movement from three of the less prosperous states into the industrialized states of the Midwest. Migration to the West Coast states by former residents of Northern Mountain and Great Plains states appears in region 5. The only state Maine borders is New Hampshire, while the latter state is contiguous to Vermont and Massachusetts, both members of the sixth region’s supemode. Contiguity is a major factor in explaining migration between areas. It is thus reasonable that few people would migrate to Maine and New Hampshire from outside the sixth region and that migrants from these states would travel predominantly to neighboring states. The seventh region tabulated shows movement from both sides of an East Central group into that group and relatively little into the side states from outside the region. Table I/I. Northward movements on the Pacific Coast can be seen in the first two regions of Table VI. The third region reveals that West South Central states, together with Alaska and Hawaii, receive the bulk of migrants from western states. Movement within western states to the Far West appears in region 4. Migration from industrialized northeastern states to. the more rural New England states of New Hampshire and Vermont is manifest in the fifth region. Multi-commodity flow algorithms (Frank and Frisch, 1971, ch. 
3) could be applied to transaction flow tables to generate partitions of the units into three or more groups. The dual equality of the maximum flow and the minimum cut, upon which the max-flow mincut procedure - which dichotomizes the nodes - is based, would not hold, however. The attractive property of groups having less flow associated with them than with their individually distinguished members would, therefore, be lost. IPFPHC and the multi-terminal procedure are highly complementary approaches, in that differences in row and column sums are removed in

193 the former, while they play crucial roles in the latter. Similarities, however, arise if the multi-terminal method is applied to the doublystandardized table, rather than to the matrix of raw flows. The IPFPadjusted table can for this purpose be regarded as the capacity matrix of a pseudosymmetric network (Frank and Frisch, 1971, p. 141) - one in which the total incapacity of each node equals its total out-capacity (these amounts can vary between nodes). Flow patterns in a pseudosymmetric network are much simpler than those in a general network, since there are at most (n - 1) distinct flows in the former instance, while there can be up to (N + 2) (n - I)/2 in the latter case (Frank and Frisch, 1971, corollary 5.2.1). Since the flows vijii> in a pseudosymmetric network satisfy the conditions for an ultrametric,fii > min[fik, fkj] for all k, they can be used for hierarchical clustering (Hartigan, 1975, p. 160). However, the conditions for a group, or LS (LuccioSumi) set, to appear that is non-trivial are exceptionally demanding. (“Intuitively, an LS set is a subset of nodes that are more strongly connected to each other than to nodes in the complementary set” (Lawler, 1973, p. 276).) The weight of the cut separating an LS set from its complement must be less than the weight of the cut separating any of its subsets. The weight of the cut separating a nodalgroup (as defined by the author) from its complement is less than the weight of the cut separating at least oae of its nodes. Dendrograms derived. from applying the max-flow min-cut algorithm to doublystandardized tables would only on exceptional occasions reveal non-trivial groups. Such an instance arose in examining a trip distribution matrix, previously analyzed by Masser and Brown (1975), for the Merseyside (Greater Liverpool) conurbation. 
So little interaction occurs across the Mersey River that the 29 superzones in the analysis were split into two LS sets, one comprised of the 8 areas on the Wirral peninsula, and the other of the 21 remaining areas across the water barrier.

3. THE USE OF A LINEAR ASSIGNMENT ALGORITHM TO FIND CYCLIC STRUCTURES

In yet a third approach to structuring transaction flows, the linear assignment problem (Christofides, 1975, sec. 12.4) - maximizing the sum of entries no two of which lie in the same row or column - can be solved, for either the original or IPFP-adjusted table (plausible results appear to be generated in both instances). The assignment made can be expressed as a permutation of the units of the table. Any permutation has a representation as a collection of unit-disjoint, directed circuits - i.e., cycles. The cycles - strong components with a minimal

TABLE VII
Optimal Assignment Based on Raw Migration Flows a

 1. (CT, MA, NH, ME)
 2. (VT, RI)
 3. (NJ, PA, NY)
 4. (WV, OH, MI, FL, GA, AL, TN, MS, LA, TX, CA, OR, WA, ID, UT, NV, AK, HI, DE)
 5. (IN, KY)
 6. (IL, WI)
 7. (MN, ND)
 8. (IA, NE)
 9. (MO, KS)
10. (SD, MT)
11. (MD, VA, NC, SC, DC)
12. (AR, OK)
13. (WY, CO)
14. (NM, AZ)

a Weight of assignment = 1,873,943.

TABLE VIII
Optimal Assignment Based on Doubly-Stochastic Flows a

 1. (ME, CT)
 2. (NH, VT)
 3. (MA, RI)
 4. (NY, NJ)
 5. (PA, DE)
 6. (OH, WV)
 7. (IN, KY)
 8. (IL, WI)
 9. (HI, CA, NV, AZ, NM, TX, OK, AR, TN, MI, FL, VA)
10. (ND, MN, SD)
11. (IA, NE)
12. (MO, KS)
13. (MD, DC)
14. (NC, SC)
15. (GA, AL)
16. (MS, LA)
17. (MT, WY)
18. (ID, UT)
19. (CO, AK)
20. (WA, OR)

a Weight of assignment = 6.79.

TABLE IX
Optimal Assignment Based on Raw Flows Between Contiguous States a

 1. (ME, NH)
 2. (VT, MA)
 3. (RI, CT)
 4. (NY, NJ)
 5. (PA, DE)
 6. (MI, OH, WV, KY, IN)
 7. (IL, WI)
 8. (MN, ND)
 9. (IA, NE)
10. (MO, KS)
11. (SD, MT)
12. (DC, MD, VA)
13. (NC, SC)
14. (GA, FL)
15. (AL, MS, LA, TX, OK, AR, TN)
16. (ID, OR)
17. (WY, CO)
18. (NM, AZ)
19. (UT, NV)
20. (WA, AK)
21. (CA, HI)

a Weight of assignment = 1,588,586.

TABLE X
Optimal Assignment Based on Doubly-Stochastic Flows Between Contiguous States a

 1. (ME, NH)
 2. (VT, MA)
 3. (RI, CT)
 4. (NY, NJ)
 5. (PA, DE)
 6. (OH, MI, IN, KY, TN, VA, WV)
 7. (IL, WI)
 8. (MN, SD)
 9. (IA, NE)
10. (MO, KS)
11. (ND, MT)
12. (MD, DC)
13. (NC, SC)
14. (GA, AL, FL)
15. (MS, LA)
16. (AR, TX, OK)
17. (ID, UT, NV, OR)
18. (WY, CO)
19. (NM, AZ)
20. (WA, AK)
21. (CA, HI)

a Weight of assignment = 6.18.
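The way an optimal assignment is read off as a set of cycles, as in Tables VII-X, can be sketched as follows. The 5 X 5 flow table and the unit labels A-E are hypothetical, and the brute-force search over permutations is only an illustrative stand-in for a polynomially-bounded assignment algorithm; it is practical only for very small tables.

```python
from itertools import permutations

# Hypothetical flow table: FLOWS[i][j] is the flow from unit i to unit j.
UNITS = ["A", "B", "C", "D", "E"]
FLOWS = [
    [0, 90, 5, 3, 2],
    [80, 0, 4, 6, 1],
    [3, 5, 0, 70, 10],
    [2, 4, 15, 0, 60],
    [6, 2, 50, 8, 0],
]


def best_assignment(flows, allowed=None):
    """Return (permutation, weight) maximizing the sum of flows[i][perm[i]].

    `allowed`, if given, is a matrix of booleans; only entries marked True
    may be selected (for example, pairs of contiguous units).
    """
    n = len(flows)
    best, best_weight = None, float("-inf")
    for perm in permutations(range(n)):
        if allowed is not None and not all(allowed[i][perm[i]] for i in range(n)):
            continue
        weight = sum(flows[i][perm[i]] for i in range(n))
        if weight > best_weight:
            best, best_weight = perm, weight
    return best, best_weight


def cycles_of(perm):
    """Decompose a permutation into its unit-disjoint directed cycles."""
    seen, cycles = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle, node = [], start
        while node not in seen:
            seen.add(node)
            cycle.append(node)
            node = perm[node]
        cycles.append(cycle)
    return cycles


perm, weight = best_assignment(FLOWS)
groups = [[UNITS[i] for i in cycle] for cycle in cycles_of(perm)]
print(weight, groups)
```

Passing an `allowed` matrix restricts the search to permitted pairs, in the spirit of the contiguity constraint behind Tables IX and X. For tables of realistic size, a library routine such as SciPy's `scipy.optimize.linear_sum_assignment` would be the more sensible choice.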

number of links - can be regarded as groups. Although precise significance tests are not generally, at present, available for cluster analyses, a ready one is at hand when the linear assignment algorithm is employed in this fashion. (Results generated with the use of quadratic assignment algorithms, which have been utilized to optimally order input-output matrices (Blin, 1973), could similarly be tested.) This stems from the fact that the distribution of the number of cycles in random permutations is completely known (Knuth, 1974, p. 176). For instance, the expected number of cycles in a random permutation of 51 items is 4.519 - i.e., $\sum_{i=1}^{51} 1/i$ - with a standard deviation of 1.701 - i.e., $[\sum_{i=1}^{51} (1/i) - \sum_{i=1}^{51} (1/i^2)]^{1/2}$.

To illustrate the application of the assignment algorithm to transaction flows, the 51 X 51 table of 1965-70 migration flows between the 50 states and the District of Columbia was studied. Results are presented in Tables VII-X by listing the cycles of the assignments made. Table VII presents the solution to the assignment problem for the table of recorded flows. Table VIII shows the solution after the flows were standardized by the IPFP to have row and column sums of 1. The recorded numbers of cycles in Tables VII and VIII are highly significant, indicative of a non-random structure of flows between the 51 units (since flows tend to be greater between neighboring units, non-randomness would be anticipated). The cycles found can be considered as regions. Several discontiguities occur in them, such as Michigan → Florida (Table VII) and Maine ↔ Connecticut (Table VIII). If regions composed of contiguous areas are required, the assignment algorithm can be constrained to select only entries corresponding to adjacent pairs. Results obtained by doing this

are shown in Tables IX and X. (California was treated as the only contiguous state to Hawaii, and Washington the only one to Alaska.) The weights of the assignments have been reduced from 1,873,943 to 1,588,586 and from 6.79 to 6.18 by imposing these constraints. Eight noncontiguous assignments in the raw flow analysis were eliminated. These were Texas to California (135,852 migrants), Michigan to Florida (67,813), Maine to Connecticut (10,736), South Carolina to the District of Columbia (3,141), Rhode Island to Vermont (1,045), Nevada to Alaska (926), Vermont to Rhode Island (646), and Hawaii to Delaware (337). (Since California had to be paired with Hawaii, the strong California to Oregon tie (104,330) was lost too.) The analogous pairs in the doubly-stochastic analyses were Connecticut to Maine (0.126), Maine to Connecticut (0.121), Michigan to Florida (0.049), Virginia to Hawaii (0.038), Tennessee to Michigan (0.037), Colorado to Alaska (0.029), and Florida to Virginia (0.027). Though Maine and Connecticut are not contiguous, there is relatively little distance between them. The strong Michigan to Florida link might be explicable on the basis of large numbers of retirees, possibly from the automotive trades, moving to Florida.

A maximin assignment procedure (see Christofides, 1975, p. 386) could be applied sequentially to a doubly-stochastic matrix to yield a decomposition of it into a convex combination of permutation matrices - square matrices with a single 1 in each row and column and zeros elsewhere. (That such a decomposition exists is a famous theorem independently established by J. von Neumann and G. Birkhoff (Berge, 1963, p. 182).) The first term or few terms of the decomposition could be used to approximate the doubly-stochastic matrix, much in the manner of a principal-components analysis. The cycles of the permutation matrices used for the approximation could, as discussed above, be regarded as clusters and tested for significance.
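The significance check on cycle counts can be reproduced numerically. The sketch below computes the mean and standard deviation of the number of cycles in a uniformly random permutation of n items from the two harmonic sums quoted earlier. The function name and the z-score summary are illustrative additions rather than part of the original analysis, and the counts 14 and 20 are simply the numbers of cycles listed in Tables VII and VIII. The z-score is only a rough gauge, since the exact distribution of the cycle count is not evaluated here.

```python
from math import sqrt


def cycle_count_moments(n):
    """Mean and standard deviation of the number of cycles in a random
    permutation of n items: mean = sum(1/i), variance = sum(1/i) - sum(1/i^2)."""
    harmonic = sum(1.0 / i for i in range(1, n + 1))
    harmonic_sq = sum(1.0 / i ** 2 for i in range(1, n + 1))
    return harmonic, sqrt(harmonic - harmonic_sq)


mean, sd = cycle_count_moments(51)

# Numbers of cycles read off Tables VII and VIII (unconstrained assignments).
observed = {"Table VII": 14, "Table VIII": 20}
z_scores = {name: (count - mean) / sd for name, count in observed.items()}

print(round(mean, 3), round(sd, 3))
for name, z in z_scores.items():
    print(name, round(z, 2))
```

The constrained assignments of Tables IX and X are left out on purpose: once only contiguous pairs may be selected, the permutations are no longer drawn uniformly from all 51! possibilities, so these moments would not be the appropriate reference.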
If A is an n X n doubly-stochastic matrix, then $A^m$, where m is a positive integer, is also doubly-stochastic. If A has all positive entries - or, more generally, is primitive (Seneta, 1973, Part 1, p. 1) - $A^m$ converges, as $m \to \infty$, to $J_n$, the uniform doubly-stochastic matrix, all the entries of which equal 1/n. Raising a doubly-stochastic matrix to a positive integral power is, thus, a procedure for smoothing it. The ij-entry of $A^m$ is the interaction between i and j occurring over paths of length m (Blin and Murphy, 1974, p. 438). $A^m$ can be treated as a dissimilarity matrix for strong-component hierarchical clustering, in the same manner as A. For m > 1, the resultant dendrograms should show weaker clustering than for m = 1.

The three procedures discussed in detail above all employ good -

i.e., polynomially-bounded - algorithms. This allows them to be applied to large-order tables. On the other hand, flow-structuring methods which would require the solution of the travelling salesman, maximum cut, minimal feedback edge-set, or many other interesting graph-theoretical problems for which no polynomially-bounded algorithms are believed to exist (Deo, 1974, p. 315), could not be effectively utilized for this purpose.

4. OTHER FLOW-STRUCTURING PROCEDURES

The possibilities of applying other graph-theoretical and network procedures to transaction flow tables have been investigated. None, at present, seems to be of outstanding value in gaining insight into flow patterns. Shortest-path algorithms, although they can be used to define potentials (Berge and Ghouila-Houri, 1965, sec. 9.5), do not seem particularly useful. In flow tables, large entries - not the small ones used to construct shortest paths - correspond to strong relationships. Longest paths - which might be of greater interest - can, in general, only be found in acyclic networks (Christofides, 1975, sec. 6.1).

The ability of IPFPHC to distinguish areas by their degrees of centrality indicates that it has some relationship to a large class of “problems of finding the ‘best’ location of facilities in networks or graphs,” in particular, minimax location problems, such as determining the “absolute center” or “absolute p-centers” (Christofides, 1975, ch. 5). IPFPHC implicitly decides whether one unit or a group of units is most central. Therefore, it would seem to add an additional dimension to the standard algorithms for the location of centers.

Signal-flow graphs - which are used to show cause-and-effect relationships between variables in simultaneous, linear equations and to solve those equations - have been applied to transaction flows by Lamarche (1975). He computed measures (“gains”) of direct and indirect effects between units (cf. Brown and Horton (1970), in which the mean first passage time of Markov chain analysis is employed for this purpose). Lamarche did not perform any summarization of his measures, as, for example, by clustering them. Love (1972) utilized electrical network theory to derive grouping indices which include the direct and indirect effects of multiple relationships.

5. ANALYSIS OF THREE-WAY MIGRATION TABLES

In addition to the 51 X 51 1965-70 interstate migration table analyzed above, a 9 X 9 X 9 place-of-birth 1965-70 table has also been published (U.S. Bureau of the Census, 1973b, Table 8). To study it using methods analogous to those developed for two-way tables, it was restructured to form an 81 X 81 matrix. The rows and columns of this table were labelled with all 81 ordered pairs of regions - (New England, New England), (New England, Middle Atlantic), (East North Central, Pacific), etc. An ij-entry was zero if the second member of the ith pair - say (Mountain, South Atlantic) - was different from the first member of the jth pair - say (West North Central, East South Central). If the second member of the ith pair - (Middle Atlantic, West South Central), for example - is the same as the first member of the jth pair - (West South Central, Pacific), for example - the ij-entry would, in this case, be the number of people born in the Middle Atlantic region who lived in West South Central states in 1965 and Pacific ones in 1970. If the assignment problem is solved for this 81 X 81 table, a very simple, highly significant structure is found (its value - the sum of 81

TABLE XI
Source Groups

Source group c                                          Flow from node    Flow from group
 1. MA → NE *, NE → NE                                       583,776 a         443,201 b
 2. ESC → ENC *, ENC → ENC                                 1,863,197         1,359,729
 3. MA → SA *, SA → SA                                     1,414,258           956,890
 4. ENC → SA *, SA → SA                                      945,202           904,157
 5. ESC → SA *, SA → SA                                      963,739           871,694
 6. ENC → Mt *, Mt → Mt                                      459,626           433,252
 7. WNC → Mt *, Mt → Mt                                      799,679           459,039
 8. WSC → Mt *, Mt → Mt                                      500,811           453,885
 9. MA → Pac *, Pac → Pac                                    937,648           558,074
10. ENC → Pac *, Pac → Pac                                 1,632,244           629,309
11. WNC → Pac *, Pac → Pac                                 2,090,525           603,144
12. WSC → Pac *, Pac → WSC, Pac → Pac                      1,649,377           596,763
13. Mt → Pac *, Mt → Mt, Pac → Mt, Pac → Pac               1,117,050           523,059

* Node.
a This is the number of people born in the Middle Atlantic states who resided in New England in 1965 and in any of the nine regions in 1970.
b This is equal to 583,776 (see preceding footnote) plus 325,144 (the number of people who were born in New England who lived there in 1965, and in one of the other eight regions in 1970) minus 465,713 (the number of people born in the Middle Atlantic who lived in New England in both 1965 and 1970).
c MA = Middle Atlantic; NE = New England; ESC = East South Central; ENC = East North Central; SA = South Atlantic; Mt = Mountain; WNC = West North Central; WSC = West South Central; Pac = Pacific.

TABLE XII
Sink Groups

Each sink group a is listed by its node (marked *), followed by the regions with which the node's region is paired in the remaining members of the group.

1. NE → NE *; paired with MA, ENC, WNC, SA, ESC, WSC, Mt
2. WNC → WNC *; paired with NE, MA, ENC, SA, ESC, WSC, Mt, Pac
3. ESC → ESC *; paired with NE, MA, ENC, WNC, SA, WSC, Mt
4. WSC → WSC *; paired with NE, MA, ENC, WNC, SA, ESC, Mt, Pac
5. Mt → Mt *; paired with NE, MA, ENC, WNC, SA, ESC, WSC, Pac
6. ENC → ENC *; paired with NE, MA, WNC, SA, ESC, WSC, Mt, Pac
7. MA → MA *; paired with NE, ENC, WNC, SA, ESC, WSC, Mt, Pac

Flow to node    Flow to group   (six value pairs are legible for the seven groups; the first pair belongs to group 1)
     850,594          468,044
   1,461,962          599,662
     878,510          483,186
   1,735,167          806,020
   2,009,291          853,645
   4,945,045        1,183,143

* Node.
a See footnote c, Table XI.
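The restructuring of the three-way table into a square table indexed by ordered pairs of regions, described at the start of this section, can be sketched as follows. The function name, the three region labels and the counts are hypothetical; the published table has nine regions, which gives the 81 X 81 matrix.

```python
from itertools import product


def pair_table(three_way, labels):
    """Restructure counts[birth][res65][res70] into a square table whose rows
    are pairs (birth, res65) and whose columns are pairs (res65, res70).

    An entry is zero unless the second member of the row pair equals the
    first member of the column pair.
    """
    n = len(labels)
    pairs = list(product(range(n), repeat=2))
    position = {pair: k for k, pair in enumerate(pairs)}
    table = [[0] * len(pairs) for _ in pairs]
    for birth, res65 in pairs:
        for res70 in range(n):
            row = position[(birth, res65)]
            col = position[(res65, res70)]
            table[row][col] = three_way[birth][res65][res70]
    names = [f"({labels[a]}, {labels[b]})" for a, b in pairs]
    return names, table


LABELS = ["NE", "MA", "Pac"]
# Hypothetical counts, indexed as COUNTS[birth][residence 1965][residence 1970].
COUNTS = [
    [[100 * b + 10 * r + s + 1 for s in range(3)] for r in range(3)]
    for b in range(3)
]

names, table = pair_table(COUNTS, LABELS)
row = names.index("(NE, MA)")
col = names.index("(MA, Pac)")
print(len(table), table[row][col])
```

Every cell of the three-way table lands in exactly one cell of the square table, so the two have the same grand total, and each row of the square table has at most as many non-zero entries as there are regions.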

selected entries - is 126,852,740). Any pair the members of which are the same - such as (Middle Atlantic, Middle Atlantic) - is matched with itself. All other pairs - (Pacific, East North Central), for example - are matched with their inverses - (East North Central, Pacific), in this instance. The interpretation of this result is clear. There are two major components in multi-stage migration. One is the inertial tendency of people who have remained in an area for one period to continue to reside there, while the other is the tendency for people who have moved from one area to another, and are relocating again, to move back to where they came from.

If the multi-terminal network flow procedure is applied to the 81 X 81 table, 13 source groups (Table XI) and 7 sink groups (Table XII) are found (diagonal entries - the numbers of people who lived in the region in which they were born, in both 1965 and 1970 - do not affect these results). The properties of a source group can be illustrated using group 13 of Table XI. The figure 1,117,050 is the number of people born in the Mountain region who lived in the Pacific in 1965 and in any of the nine regions in

1970. This is the total flow emanating from the node Mountain → Pacific. The comparable figures for the non-nodal members of the group are: Mountain → Mountain, 334,698; Pacific → Mountain, 345,651; and Pacific → Pacific, 440,337. The sum of these four numbers is 2,237,736. To obtain 523,059 - the number leaving group 13 as a whole - we must subtract movement within the group: 76,537 (Mountain → Pacific → Mountain); 1,003,339 (Mountain → Pacific → Pacific); 239,604 (Pacific → Mountain → Mountain); and 130,072 (Pacific → Pacific → Mountain). This group shows that there is a strong tendency for people who were born in the 13 states of the Pacific and Mountain regions and lived in one of those states in 1965, to remain living in them in 1970. The other source groups show that the number of people who moved from a certain region of birth, A, to another region B in which they lived in 1965 and 1970, was greater than the number of people born in B who lived in B in 1965 and in one of the eight regions other than B in 1970 (see footnote b of Table XI). Region A can be regarded as a major supplier of immigrants to region B.

The seven sink groups fall into two quite similar categories (1 and 3; 2, 4, 5, 6, and 7). Each group has for its node a pair of identical regions that corresponds to those people who lived in the same region both in 1965 and 1970. For example, 850,594 is the number of people born in one of the eight regions besides New England who resided in New England both in 1965 and 1970 (group 1). There is no sink group with either South Atlantic → South Atlantic or Pacific → Pacific as its node; these two regions have been the most attractive ones for immigrants. The Pacific region does not enter groups 1 or 3, but is found in the other five groups. To illustrate the structure of groups 2 and 4-7, take 2 as an example. Designate the West North Central region by A, and the other eight regions by B (for groups 1 and 3, remove the Pacific region from B).
Then the number born in any of the nine regions and living in B in 1965 and A in 1970 is less than the number born in B and living in A in both 1965 and 1970. Paraphrasing, the number of people moving into A from B between 1965 and 1970 is less than the number who moved into A from B before 1965 and remained there in 1970.

Summary

Several combinatorially-based procedures for structuring two-way, as well as three-way, transaction flow matrices have been discussed and

illustrated. The inter-relationships between the procedures are of some interest. For example, if the multi-terminal method is applied to a doubly-standardized table, it yields a hierarchical clustering. The assignment algorithm yields cycles - the simplest form of strong components. Strong components are considered as clusters in the IPFPHC procedure. The assignment problem - applied to the logarithms of the observed flows - can be viewed as the solution, for the case n = 1, of the problem of determining the non-negative integral matrix all of the row and column sums of which equal n, and which best estimates, in a maximum entropy sense, the table of observed flows. Solutions of the problem for n = 1, 2, 3, ... would yield a series of increasingly complex tables, all of which could be studied with the strong-component analogue of single-linkage clustering. Significance tests for IPFPHC could be developed by studying the distribution of clusters found in random doubly-stochastic tables. These are convex combinations of permutation matrices.

References

Bachi, R. (1961). Some Methods for the Study of Geographical Distributions of Internal Migrations. Department of Statistics, Hebrew University.
Bähr, J. and Golte, W. (1974). “Population distribution and economic-geographical regions of Chile,” Geoforum 17: 25-42.
Berge, C. (1963). Topological Spaces. New York: Macmillan.
Berge, C. and Ghouila-Houri, A. (1965). Programming, Games and Transportation Networks. London: Methuen.
Blin, J.M. (1973). “A further procedure for ordering an input-output matrix: some empirical evidence,” Economics of Planning 13: 121-129.
Blin, J.M. and Murphy, F. (1974). “On measuring economic interrelatedness,” Review of Economic Studies 41: 437-440.
Brown, L.A. and Horton, F.E. (1970). “Functional distance: an operational approach,” Geographical Analysis 43: 380-402.
Christofides, N. (1975). Graph Theory: An Algorithmic Approach. New York: Academic Press.
Compton, P. (1976). “Migration in Eastern Europe,” pp. 168-215 in J. Salt and H. Clout, eds., Migration in Post-War Europe. London: Oxford University Press.
Deo, N. (1974). Graph Theory with Applications to Engineering and Computer Science. Englewood Cliffs: Prentice-Hall.
Frank, H. and Frisch, I.T. (1971). Communications, Transmissions, and Transportation Networks. Reading, MA: Addison-Wesley.
Goldstein, S. (1976). “Facets of redistribution: research challenges and opportunities,” Demography 13: 423-434.
Golledge, R.G. and Rushton, G. (1972). Multidimensional Scaling: Review and Geographical Applications. Washington, DC: Association of American Geographers.
Gower, J.C. (1977). “The analysis of asymmetry and orthogonality,” pp. 109-123 in J.R. Barra, F. Brodeau, G. Romier and B. Van Cutsem, eds., Recent Developments in Statistics. Amsterdam: North-Holland.
Hall, P.G. (1966). The World Cities. New York: McGraw-Hill.
Hartigan, J. (1975). Clustering Algorithms. New York: Wiley.
Holmes, J.H. (1977). “Hierarchical regionalization by iterative proportional fitting procedures: a comment,” IEEE Transactions on Systems, Man, and Cybernetics 7: 474-477.
Hubert, L.J. (1973). “Min and max hierarchical clustering using asymmetric similarity measures,” Psychometrika 38: 63-72.
Johnson, J.H., Salt, J. and Wood, P.A. (1974). Housing and the Migration of Labour in England and Wales. Westmead: Saxon House.
Knuth, D.E. (1974). The Art of Computer Programming, Volume 1: Fundamental Algorithms. Reading: Addison-Wesley.
Kruskal, J.B. (1977). “The relationship between multidimensional scaling and clustering,” pp. 17-44 in J. Van Ryzin, ed., Classification and Clustering. New York: Academic Press.
Lamarche, R. (1975). “Analyse et stimulation topologiques en geographie,” Cahiers de Geographie de Quebec 19: 189-207.
Lawler, E.L. (1973). “Cutsets and partitions of hypergraphs,” Networks 3: 275-285.
Love, S.F. (1972). “A new methodology for the hierarchical grouping of related elements of a problem,” IEEE Transactions on Systems, Man, and Cybernetics 2: 23-29.
MacKinnon, R.D. and Skarke, A.M. (1977). “Exploratory analyses of interregional migration tables: an Austrian example,” Regional Studies 11: 99-111.
Masser, I. and Brown, P.J.B. (1975). “Hierarchical aggregation procedures for interaction data,” Environment and Planning A 7: 509-523.
Mosteller, F. (1968). “Association and estimation in contingency tables,” Journal of the American Statistical Association 63: 1-28.
Nijenhuis, A. and Wilf, H.S. (1975). Combinatorial Algorithms. New York: Academic Press.
Richardson, H. (1975). Regional Development Policy and Planning in Spain. Westmead: Saxon House.
Seneta, E. (1973). Non-negative Matrices. New York: Wiley.
Slater, P.B. (1976a). “A hierarchical regionalization of Japanese prefectures using 1972 interprefectural migration flows,” Regional Studies 10: 123-132.
Slater, P.B. (1976b). “Hierarchical internal migration regions of France,” IEEE Transactions on Systems, Man, and Cybernetics 6: 321-324.
Slater, P.B. (1976c). “A multiterminal network flow analysis of an unadjusted Spanish interprovincial migration table,” Environment and Planning A 8: 875-878.
Slater, P.B. (1978). “A network flow analysis of the most detailed 1967 U.S. interindustry flow table,” Empirical Economics 3: 44-70.
Slater, P.B. and Winchester, H.L.M. (1978). “Clustering and scaling of transaction flow tables: a French interdepartmental example,” IEEE Transactions on Systems, Man, and Cybernetics 8: 635-640.
U.S. Bureau of the Census (1973a). Mobility for States and the Nation. Subject Report PC(2)-2B.
U.S. Bureau of the Census (1973b). Lifetime and Recent Migration. Subject Report PC(2)-2D.