The Geography of Female Labor Force Participation and ... - NYU Stern

The Geography of Female Labor Force Participation and the Diffusion of Information

Alessandra Fogli, New York University, Minneapolis Fed, CEPR
Stefania Marcassa, University of Minnesota, Minneapolis Fed
Laura Veldkamp, New York University∗

July 2, 2007 - Preliminary and Incomplete Draft -

Abstract Many theories have been proposed to explain the dramatic rise in women’s labor force participation over the last century, but few have studied its geographic pattern. This paper documents the geographic patterns in female labor force participation and shows that a model of information diffusion can explain their evolution over time. Using data from 1930-2000 for over 3000 counties and techniques developed by geographers for measuring the spatial dependence of social phenomena, we show that spatial effects matter. The inter-county variation in labor force participation is well explained by the impact of distance-weighted geographic units, even after controlling for demographic and economic factors. Our explanation for this spatial dependence is that participation decisions are affected by women’s uncertainty about the effect of employment on their children. Information about the effect of maternal employment is spread not through a centralized news channel, but in a decentralized way, person-to-person, diffusing slowly through space and time.

Keywords: Female labor force participation, spatial diffusion, contagion, geography, counties. JEL Nos.: J21, N32, R12, Z13.



Corresponding author: [email protected], Research Department Federal Reserve Bank of Minneapolis, 90 Hennepin Ave. Minneapolis, MN 55401, tel: (612) 204-5174. We thank conference participants at the Midwest Macroeconomics meetings and the 2007 SED conferences for helpful comments and suggestions. Laura Veldkamp thanks Princeton University for their hospitality and financial support through the Kenen fellowship. Keywords: economic geography, female labor force participation, information diffusion. JEL codes: R1, E2, J21, N32.


The rise of female labor force participation is one of the greatest economic transformations of the twentieth century. The transformation happened over the course of many decades and unevenly across the U.S. While many theories have explained the rise in participation over time, its geography remains unexplored. Geographic analysis provides an important piece of evidence in the debate about what caused women to join the labor force, because whatever caused participation to rise over time should also explain its spatial diffusion.

Although the fact that participation is higher in some regions than others is readily apparent, section 1 shows that variation in social and economic conditions cannot fully explain the concentration of labor force participation. What captures this unexplained component is the participation of nearby counties. Geographic effects matter; they explain 20-30% of the inter-county variation in labor force participation rates. Furthermore, the strength of these spatial effects has been growing. The pattern of growth in spatial correlation mimics the increase in participation rates: Both have an S-shaped time-series, with the steepest increases in the 1970’s and 80’s.

An obvious explanation for local similarities in choices is a local externality: for example, peer pressure to make similar choices or thick market externalities in child care. But positive externalities alone would not predict the enormous heterogeneity in the speed at which female labor force participation rose across the U.S. Without information frictions or adjustment frictions whose strength varies across regions, innovations and new information should cause all regions to shift from one coordinated equilibrium to another, in unison.

To explain the gradual change in labor force participation over the century and the variation in the rate of change across the country, section 2 proposes a model of local information transmission.
Agents learn about the effects of maternal employment on children’s outcomes. A key feature of the model is that the information comes from observing the outcomes of nearby children of employed women. In regions with higher initial employment, women have a higher probability of knowing other women who are employed and observing an informative signal. These regions learn faster and their participation rates climb earlier on. In regions where participation is low, few women know other employed women from whom they can learn about the effects of maternal employment. The lack of information slows learning and retards the increase in participation. Thus spatial correlation in participation choices comes from an information externality: Nearby women make similar decisions because they have observed similar information from the women around them.

This paper contributes to a growing literature on economic geography. Topa (2001) is the most closely related paper. He uses a model where neighbors exchange information about job openings to explain concentration of urban unemployment and estimates it using census data from Chicago. Our econometric techniques for measuring spatial correlation are similar to those used by Conley and Topa (2002) and Patacchini and Zenou (2007). Benabou (1996) argues that geographic concentration of high- and low-skill workers is privately beneficial because of positive spillovers in education, but has a social cost which comes from decreasing returns to a given skill level. Fuchs-Schündeln and Izem (2007) use geographic location as an instrument to identify sources of labor productivity. Our model differs from the previous work in its application to female labor force participation and its focus on the dynamics of spatial patterns.

Section 3 calibrates our model to 1940 labor market conditions and shows that it can replicate the evolution of labor force participation over time. First, the model predicts the right magnitude and shape of the increase in the aggregate participation rate. Second, it matches the fact that increases in participation rates happen faster in regions near high-participation focal points, typically large urban centers. Of course, this is what the model was designed to do. Third, the model matches a more subtle feature of the spatial data: Spatial correlation is weak at first, rises more quickly in the 1970’s and 80’s and levels off between 1990 and 2000.
Why explore an information-based theory of labor force participation? Existing theories based on technological innovation, falling child care costs, the less physical nature of jobs and increases in women’s wages have had success in explaining features of participation.[1] Fogli and Veldkamp (2007) show that adding learning to these forces allows them to reconcile a broader set of facts: participation dynamics as well as labor supply elasticity, reported beliefs, and cross-sectional participation differences due to ethnicity, wealth, ability, marital status and motherhood. We extend that framework by introducing spatial location and a signal distribution that is geographically concentrated. Because localized learning creates both positive externalities to employment and an information friction that keeps the positive externalities from spreading instantly across regions, it can explain what other theories do not: the spatial correlation in participation rates.

[1] See Greenwood, Seshadri, and Yorukoglu (2001), Goldin and Katz (2002), and Albanesi and Olivetti (2006) on technologies, Attanasio, Low, and Sanchez-Marcos (2006) and Del Boca and Vuri (2007) on child care costs, and Goldin (1990), Jones, Manuelli, and McGrattan (2003) on nature of jobs.

The results in the paper suggest that large-scale economic change arises not because of previously proposed changes in the macro environment, but rather because of the information people glean from observing their neighbors. The fact that information needs to pass from person to person makes aggregate learning slow. This helps to explain why economic change takes place so slowly, even though Bayesian learning generally converges quickly. Big transformations may be triggered by small shocks that cause a few people to behave differently. Their actions create new information that others observe. The resulting changes in others’ choices are observed by yet more people. As this process snowballs, social and economic change ensues.

1 The Geographic Facts

We begin by illustrating the geographic pattern of female labor force participation in each decade from 1940-1990. Of course, geographic patterns could arise because of regional social or economic differences. Our two measures of spatial correlation control for a variety of economic and demographic differences across counties and find significant spatial correlation of the residuals.

1.1 The Distribution of Participation Rates

Our data come from “Historical, Demographic, Economic, and Social Data: The United States, 1790-2000” produced by the Inter-university Consortium for Political and Social Research, ICPSR (series 2896). Our sample consists of 3092 counties, identified by their FIPS code. The maps in figures 1 and 2 illustrate the geographical differences in female civilian labor force participation rates. Counties with darker shades of grey have higher participation rates.

Figure 1: Female labor force participation across the U.S. (1940-70).

Figure 2: Female labor force participation across the U.S. (1980-90).

The highest percentage of female LFP (above 30%) can be observed in the Eastern states of the U.S., including Virginia, New York, Massachusetts, and Georgia. The high participation rates in the Piedmont region (the mid-Atlantic and south east foothills of the Appalachians) arose because of the concentration of textile jobs, which were thought to be appropriate work for women in the 1940’s.[2] The Piedmont region is also characterized by the lowest median years of female schooling. Median family income in 1950 (data not available for 1940) is highest in the Western part of the country, where the median years of school for females is above 9 and the percentage of white population is over 88%. In contrast, Tennessee and Kentucky have the lowest participation rates (below 5%). This is related to the prevalence of coal mining jobs, from which women were excluded.

[2] See Holmes and Stevens (2004).

1.2 Measuring Spatial Dependence with the Potential Labor Force Index

As the previous discussion highlights, much of the spatial concentration in female labor force participation comes from economic and social differences across regions. Yet, spatial effects operate above and beyond what can be explained by these economic and social differences. We measure this excess spatial correlation by estimating a linear system where participation rates depend on a variety of socio-economic control variables and on the “potential labor force index,”

$$LFP_{it} = \beta_1 + \beta_2\,\text{controls}_{it} + \beta_3\,\text{Potential LFP} + \epsilon_{it} \tag{1}$$

Control variables                      Potential LFP coefficient (β3)
none                                   0.047 (0.012)
demographics                           0.016 (0.008)
demographics, sectors & occupations    0.017 (0.007)

Table 1: The effect of surrounding counties’ participation, controlling for socio-economic differences. H0 : β3 = 0. Standard errors in parentheses. The demographic control variables are: White population, native white population, females, urban population, median family income 1950, median years of school for males 25+, median years of school for females 25+, fertility, married. The sector control variables are: mining, farm, manufacturing, transport, retail, business and other. The occupation control variables are: semiprofessionals, farmers, managers, clerks, crafts, operatives, domestics, non-domestic services, wage farm laborers, family farm laborers, non-farm laborers and other.

Proposed by Bowles (1976), this index is a distance-weighted sum of the participation rates of all other counties.[3] For county i,

$$\text{Potential LFP}_i \equiv \sum_{j \neq i} \frac{LFP_j}{\text{distance}_{ij}} \tag{2}$$

where the sum runs over the 3091 other counties j.
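As an illustration of equation (2), the following is a minimal sketch of the index computation, assuming participation rates and pairwise distances are already available as NumPy arrays. The function name `potential_lfp` and the three-county toy data are hypothetical and are not taken from the paper’s data set.

```python
import numpy as np

def potential_lfp(lfp, dist):
    """Distance-weighted sum of the participation rates of all other counties.

    lfp  : (n,) array of county participation rates
    dist : (n, n) array of pairwise distances; the diagonal is ignored
    """
    lfp = np.asarray(lfp, dtype=float)
    dist = np.asarray(dist, dtype=float)
    # Weight 1/distance for every pair i != j, and zero weight on a county itself.
    weights = np.zeros_like(dist)
    off_diag = ~np.eye(len(lfp), dtype=bool)
    weights[off_diag] = 1.0 / dist[off_diag]
    return weights @ lfp

# Toy example: three hypothetical counties spaced 10 miles apart on a line.
lfp = np.array([0.30, 0.20, 0.10])
dist = np.array([[0.0, 10.0, 20.0],
                 [10.0, 0.0, 10.0],
                 [20.0, 10.0, 0.0]])
index = potential_lfp(lfp, dist)
```

The resulting vector could then enter regression (1) as one additional regressor alongside the control variables.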

The null hypothesis of no excess spatial correlation is H0 : β3 = 0. Table 1 reports the coefficients on the potential labor force index for specifications including more and fewer control variables. The main result is that the index has a positive and highly significant effect, meaning that there is excess spatial correlation, beyond what our social and economic control variables can explain. The finding is robust across all specifications that we tried. Appendix A reports additional specifications as well as the coefficients and standard errors on each of the control variables. The magnitude of the excess spatial correlation is not trivial: A one standard deviation increase in the index implies a one point increase in female labor force participation. The index accounts for 20-30% of inter-county variation, depending on the specification. These results suggest that the geographic concentration in participation rates is not driven completely by factors that jointly determine nearby counties’ participation rates, but that there is an externality in effect. Labor force participation is having a direct effect on participation in nearby counties.

[3] Geographic distances are measured as the highway distance between county centroids, by the CTA Transportation Network (http://www-cta.ornl.gov/transnet/SkimTree.htm).

1.3 Moran’s I Statistic

The second measure of spatial dependence, Moran’s I (Moran 1950), is a measure commonly used in fields such as geography, sociology and epidemiology. It involves estimating

$$LFP_{it} = \beta_1 + \beta_2\,\text{controls}_{it} + \epsilon_{it},$$

and then testing for autocorrelation in the residuals z for locations within a given distance d:

$$I = N(h)\,\frac{\sum\sum z_i z_{i+d}}{\sum z_{i+d}^2}. \tag{3}$$
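A minimal sketch of this kind of test is given below. It uses the standard textbook form of Moran’s I with binary weights (1 for pairs of counties within the chosen radius, 0 otherwise), which is an assumption here and may differ in normalization from equation (3). The function name `morans_i` and the four-county toy data are hypothetical.

```python
import numpy as np

def morans_i(resid, dist, radius):
    """Moran's I of regression residuals, using pairs within `radius`.

    resid  : (n,) array of residuals
    dist   : (n, n) array of pairwise distances
    radius : pairs with 0 < distance <= radius get weight 1, all others 0
    """
    z = np.asarray(resid, dtype=float)
    z = z - z.mean()                      # work with demeaned residuals
    dist = np.asarray(dist, dtype=float)
    w = ((dist > 0) & (dist <= radius)).astype(float)
    s0 = w.sum()                          # total weight over all ordered pairs
    if s0 == 0:
        raise ValueError("no pairs of locations within the radius")
    return (len(z) / s0) * (z @ w @ z) / (z @ z)

# Toy example: four hypothetical counties spaced 10 miles apart on a line.
positions = np.array([0.0, 10.0, 20.0, 30.0])
dist = np.abs(positions[:, None] - positions[None, :])
clustered = morans_i([1.0, 1.0, -1.0, -1.0], dist, radius=10.0)
alternating = morans_i([1.0, -1.0, 1.0, -1.0], dist, radius=10.0)
```

In the toy data, residuals that are similar among neighbors give a positive statistic, while residuals that alternate in sign between neighbors give a negative one. Repeating the calculation for several radii would trace out the statistic as a function of distance.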

Figure 3: Excess spatial correlation in female participation at varying distances (1940-2000). [Plot: spatial correlation (Moran I) against distance in miles, 0 to 100, with one line per decade from 1940 to 2000.]

Excess spatial correlation is highly significant. Of all distances up to 100 miles and in all decades, the highest standard error on any of the Moran’s I statistics is 0.03, an order of magnitude smaller than its mean. Correspondingly, p-values on every estimate are below 0.001. Spatial correlation falls monotonically in distance: counties further away have lower correlations. This tells us that the effect is a local one. It also contradicts the frictionless coordination hypothesis. In such a model, regions would exhibit excess correlation. But because all counties would coordinate simultaneously on changes in circumstances, their Moran’s I statistic should be flat in distance.

Spatial correlation consistently falls precipitously between 0 and 40 miles away and then flattens out, declining very slowly thereafter. This suggests that a distance less than 40 miles is the radius of direct social spillover. Distances beyond 40 miles may still exhibit positive spatial correlation because of secondary interactions. One person interacts with someone 40 miles away who subsequently contacts someone 40 miles further away. But the lower-probability chain of events required for this secondary transmission makes the spatial correlation significantly lower and flatter in distance.

Figure 4: Excess spatial correlation from 1940-2000, measured at 20 and 40 miles. [Plot: Moran I statistic against year, 1940 to 2000, for a 20 mile radius and a 40 mile radius.]

One of the more subtle features of spatial correlation is its S-shaped increase over time. Figure 4 takes two vertical slices of figure 3, one at 20 miles and one at 40 miles and plots each over time. Spatial correlation is clearly increasing and has a distinct S-shape: It rises slowly in 1940-50, fastest in 1970-80, and flattens out in 1990-2000. Interestingly, this is precisely the same shape and same timing as the rise in female labor force participation itself (?). The striking similarity is one more piece of evidence that the rise in participation rates and the change in spatial correlation are intertwined.

2 A Model of Spatial Information Diffusion

To explain spatial correlation and geographic heterogeneity of labor force participation rates, we propose a model of women who are learning about the effect of maternal employment on children. Initially, most women stay at home to nurture their children because they are averse to risking their children’s welfare by working. Over time, they learn about how much maternal employment compromises the welfare of their children by observing a few nearby children whose mothers were employed. This learning happens slowly because the children have to grow up and realize their potential wage before information about the effect of their mother’s employment is revealed. Thus, each generation is learning about the actions of the generation before. In addition, many women do not participate early on. Seeing their children’s outcomes carries no information about the effects of maternal employment. Therefore, learning is slow initially because most women see no informative signal from which they can learn. The variation in growth rates of labor force participation comes from initial heterogeneity in participation rates of nearby counties. The more women participate initially, the more information their employment generates for nearby women and the faster women in the region learn and join the labor force.

2.1 The model

We assume a discrete infinite horizon, t = 1, 2, ..., and we consider an overlapping generations economy made up of a large finite number of agents living for two periods. Each agent is nurtured in the first period and consumes and has one child in the second period of her life. Preferences of an individual in family i born at time t − 1 depend on their consumption c_{it} and the potential wage of their child w_{i,t+1}.[4]

$$U = \frac{c_{it}^{1-\gamma}}{1-\gamma} + \beta\,\frac{w_{i,t+1}^{1-\gamma}}{1-\gamma}, \qquad \gamma > 1 \tag{4}$$

The budget constraint of the individual from family i born at time t − 1 is

$$c_{it} = n_{it} w_{it} + \omega_{it} \tag{5}$$

where ω_{it} is an endowment which could represent a spouse’s income and n_{it} ∈ {0, 1} is the discrete labor force participation choice. If the agent works in the labor force, n_{it} = 1.

[4] Using utility over the future potential wage, rather than recursive utility, shuts down an experimentation motive where mothers participate in order to create information that their descendants can observe. Such a motive makes the problem both intractable and unrealistic. Most parents do not gamble with their children’s future just to observe what happens.

The key feature of the model is that an individual’s earning potential is determined by a combination of endowed ability and nurturing, which cannot be perfectly disentangled. Endowed ability is an unobserved normal random variable a_{i,t} ∼ N(µ_a, σ_a^2). If a mother stays home with her child, the child’s full natural ability is achieved. If the mother joins the labor force, some unknown amount θ of the child’s ability will be lost. Wages depend exponentially on ability:

$$w_{i,t} = \exp(a_{i,t} - n_{i,t-1}\theta) \tag{6}$$

Initial conditions in cities. The set of signals that agents observe in the model’s first period depends on the number of nearby employed women in the previous period. Therefore, the model requires an initial period-0 participation rate. In the 1940 data, most centers of high female labor force participation are large cities. Therefore, we call any region in the model with a high initial participation rate (in the top 1% of counties) a city. The probability that individual i participates in the initial period 0 is L_c in cities and a lower probability L_r in non-urban (rural) regions.

Information Sets. The constant θ determines the importance of nurture and is not known when making labor supply decisions. Young agents inherit their prior beliefs about θ from their parents’ beliefs. In the first generation, initial beliefs are θ_{i,0} ∼ N(µ_0, σ_0^2). Each subsequent generation updates these beliefs by observing w’s. But their signals are only informative about the effect of maternal employment on wages if a mother actually worked. Note from equation (6) that if n_{i,t−1} = 0, then w_{i,t} only reflects innate ability and contains no information about θ.

Each agent knows whether she was nurtured n_{i,t−1} and observes her potential wage w_{i,t} at the beginning of time t. We refer to w as the potential wage because it is observed, regardless of whether the agent chooses to work.[5] In addition, she observes both potential earnings and parental employment decisions for J − 1 peers. The set of family indices for the outcomes observed by agent i is J_i. Ability a is never observed so that θ can never be perfectly inferred from observed wages.

[5] This assumption could be relaxed. If w_{i,t} were only observed once agent (i,t) decided to work, then an informative signal about θ would only be observed if both n_{i,t} = 1 and n_{i,t−1} = 1. Since this condition is satisfied less frequently, such a model would make fewer signals observed and make learning slower.

Spatial location matters in the model because it determines the composition of the signals in each agent’s information set. Each agent i has a location on a two-dimensional map with indices (x_i, y_i). Signals are drawn uniformly from the set of agents within a distance d in each direction: J_i ∼ unif{[x_i − d, x_i + d] × [y_i − d, y_i + d]}^{J−1}.

Agents use the information in observed potential wages to update their prior, according to Bayes’ law. Bayesian updating with J signals is equivalent to running a regression of children’s potential wages on parents’ labor choices and then forming a linear combination of the estimated weight on labor choices θ̂ and the prior belief µ_t.[6] At the end of each period t, the regression agents run to form their signal is

$$\mu_a - W = N\theta + \varepsilon_i$$

where W and N are J × 1 vectors {log w_{j,t}}_{j∈J_i} and {n_{j,t−1}}_{j∈J_i}. Let n̄_{i,t} be the sum of the labor decisions for the set of families that (i, t) observes: n̄_{i,t} = Σ_{j∈J_i} n_{j,t−1}. The resulting estimated coefficient θ̂ is normally distributed with mean and variance

$$\hat{\mu}_{i,t} = \frac{\sum_{j \in J_i} (\mu_a - \log w_{j,t})\, n_{j,t-1}}{\bar{n}_{i,t}}, \qquad \hat{\sigma}_{i,t}^2 = \frac{\sigma_a^2}{\bar{n}_{i,t}}.$$

[6] The fact that another woman’s mother chose to work a generation earlier is potentially an additional signal because it reveals something about her beliefs. But the information content of this signal is very low because the outside observer does not know whether this person worked because they were highly able, very poor, less uncertain or had low expectations for the value of θ. Since these observations contain much more noise than wage signals, and the binary nature of the working decision makes updating much more complicated, we approximate beliefs by ignoring this small effect. Including it would only reinforce our results because women near high-participation centers would get even more information relative to those in low-participation areas. But it would speed up the aggregate rate of learning.

Posterior beliefs about the value of nurturing are normally distributed θ ∼ N(µ_{i,t}, σ_{i,t}^2). The posterior mean is a linear combination of the estimated coefficient and the prior beliefs, where each component’s weight is its relative precision:

$$\mu_{i,t} = \frac{\hat{\sigma}_{i,t}^2}{\sigma_{i,t-1}^2 + \hat{\sigma}_{i,t}^2}\,\mu_{i,t-1} + \frac{\sigma_{i,t-1}^2}{\sigma_{i,t-1}^2 + \hat{\sigma}_{i,t}^2}\,\hat{\mu}_{i,t} \tag{7}$$

The posterior precision (inverse of the variance) is the sum of the prior precision and the signal precision. Thus posterior variance is

$$\sigma_{i,t}^2 = \left(\sigma_{i,t-1}^{-2} + \hat{\sigma}_{i,t}^{-2}\right)^{-1}. \tag{8}$$
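The updating step in equations (7) and (8) can be sketched as follows. Because equation (6) lowers the log wage by θ when the mother worked, the sketch takes µ_a − log w as each informative signal about θ. The function name `update_beliefs` and the three observed outcomes are hypothetical; the parameter values are the ones reported in table 2.

```python
import numpy as np

def update_beliefs(mu_prior, var_prior, log_wages, mothers_worked, mu_a, var_a):
    """One generation of Bayesian updating about theta.

    log_wages      : observed log potential wages of the J sampled families
    mothers_worked : 1 if the mother of that family worked, else 0
    Returns the posterior mean and variance of beliefs about theta.
    """
    n = np.asarray(mothers_worked, dtype=float)
    log_w = np.asarray(log_wages, dtype=float)
    n_bar = n.sum()
    if n_bar == 0:
        # No observed mother worked: wages carry no information about theta.
        return mu_prior, var_prior
    theta_hat = np.sum((mu_a - log_w) * n) / n_bar   # signal mean
    var_hat = var_a / n_bar                          # signal variance
    weight_prior = var_hat / (var_prior + var_hat)   # relative precision of the prior
    mu_post = weight_prior * mu_prior + (1.0 - weight_prior) * theta_hat
    var_post = 1.0 / (1.0 / var_prior + 1.0 / var_hat)
    return mu_post, var_post

# Table 2 values: mu_a = -0.88, sigma_a = 0.57, mu_0 = 0.04, sigma_0 = 1.38.
# The three observed outcomes are made up for illustration.
mu_post, var_post = update_beliefs(
    mu_prior=0.04, var_prior=1.38**2,
    log_wages=[-0.9, -1.0, -0.5], mothers_worked=[1, 1, 0],
    mu_a=-0.88, var_a=0.57**2,
)
```

With a diffuse prior, almost all of the weight falls on the observed signals, and an agent who sees no employed mothers keeps her prior unchanged. This is the channel through which low-participation regions learn slowly.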

Definition of equilibrium. An equilibrium is a sequence of wages, distributions that characterize beliefs about θ, work and consumption choices, for each individual i in each generation t such that:
1. Taking beliefs and wages as given, consumption and labor decisions maximize expected utility (4) subject to the budget constraint (5).
2. Wages of agents born in period t − 1 are consistent with the labor choice of the parents, as in (6).
3. An agent i born at date t − 1 chooses consumption and labor at date t. That optimization is conditioned on beliefs µ_{i,t}, σ_{i,t}.
4. Priors µ_{i,t−1}, σ_{i,t−1} are equal to the posterior beliefs of the parent, born at t − 1. Priors are updated using observed wage outcomes J_{i,t}, according to Bayes’ law (7).
5. Distributions of elements J_{i,t} are consistent with distribution of optimal labor choices n_{i,(t−1)} and each agent’s spatial location.

2.2 Solving the model

Substituting the budget constraint (5) and the law of motion for wages (6) into expected utility (4) produces the following optimization problem for agent i born at date t − 1:

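$$\max_{n_{it} \in \{0,1\}} \; \frac{(n_{it} w_{it} + \omega_{it})^{1-\gamma}}{1-\gamma} + \beta\, E_{a_{i,t+1},\theta}\left[\frac{\exp\big((a_{i,t+1} - n_{i,t}\theta)(1-\gamma)\big)}{1-\gamma}\right]. \tag{9}$$

Taking the expectation over the unknown ability a and the importance of nurture θ delivers expected utilities from each choice. If a woman stays out of the labor force, her expected utility is

$$EUO_{it} = \frac{\omega_{it}^{1-\gamma}}{1-\gamma} + \frac{\beta}{1-\gamma}\exp\left(\mu_a(1-\gamma) + \frac{1}{2}\sigma_a^2(1-\gamma)^2\right). \tag{10}$$

If she participates in the labor force, her expected utility is

$$EUW_{it} = \frac{(w_{it} + \omega_{it})^{1-\gamma}}{1-\gamma} + \frac{\beta}{1-\gamma}\exp\left((\mu_a - \mu_{i,t})(1-\gamma) + \frac{1}{2}(\sigma_a^2 + \sigma_{i,t}^2)(1-\gamma)^2\right). \tag{11}$$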

The optimal policy is to join the labor force when the expected utility from employment is greater than the expected utility from staying home (EU Wit > EU Oit ). Define Nit ≡ EU Wit − EU Oit to be the expected net benefit of labor force participation, conditional on information (µi,t , σi,t ).
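The participation rule can be illustrated with a short sketch that evaluates the two expected utilities in equations (10) and (11) and compares them. The function names are hypothetical, and the discount weight β = 1 is an assumption made only for this example, since the text does not report a value; the other numbers are taken from table 2.

```python
import math

def eu_stay_home(omega, mu_a, var_a, beta, gamma):
    """Expected utility of staying out of the labor force, equation (10)."""
    g = 1.0 - gamma
    return omega**g / g + (beta / g) * math.exp(mu_a * g + 0.5 * var_a * g**2)

def eu_work(wage, omega, mu_a, var_a, mu_theta, var_theta, beta, gamma):
    """Expected utility of participating, equation (11)."""
    g = 1.0 - gamma
    return ((wage + omega)**g / g
            + (beta / g) * math.exp((mu_a - mu_theta) * g
                                    + 0.5 * (var_a + var_theta) * g**2))

def participates(wage, omega, mu_a, var_a, mu_theta, var_theta,
                 beta=1.0, gamma=2.0):
    """True when the expected net benefit N = EUW - EUO is positive.

    beta = 1.0 is an illustrative assumption, not a calibrated value.
    """
    return (eu_work(wage, omega, mu_a, var_a, mu_theta, var_theta, beta, gamma)
            > eu_stay_home(omega, mu_a, var_a, beta, gamma))

# A woman with the mean log wage and an endowment of 1 (table 2 values).
MU_A, VAR_A = -0.88, 0.57**2
wage = math.exp(MU_A)
uncertain = participates(wage, 1.0, MU_A, VAR_A, mu_theta=0.04, var_theta=1.38**2)
informed = participates(wage, 1.0, MU_A, VAR_A, mu_theta=0.04, var_theta=0.1**2)
```

In this toy case both women hold the same mean belief about θ. The one with the diffuse 1940 prior stays home, while the one whose uncertainty has fallen to a standard deviation of 0.1 (a made-up value) chooses to work, which shows that reduced uncertainty alone can change the decision.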

3 Comparing Model Predictions to the Data

3.1 Calibration

Our calibrated parameters are summarized in table 2. Our strategy is to choose parameters of the earnings and endowment distributions in our model to match the empirical distributions of annual labor income of full-time employed, married women with children under age 5 and their husbands. We match the moments for 1940, the earliest year for which we have the wage distribution data.

Parameter                    Symbol   Value    Calibration target
mean log ability             µa       -0.88    women’s earnings distribution
mean log urban ability       µaC      -0.32?   urban wage premium
std log ability              σa       0.57     women’s earnings distribution
mean log endowment           µω       -0.28    average endowment = 1
std log endowment            σω       0.75     men’s earnings distribution
outcomes observed            J        3        Prob(n_{i,t} = n_{i,t−1}) 1970-2000
radius of interaction        d        2        1940 Moran’s I statistic ≥ 0.4
prior mean θ                 µ0       0.04     unbiased beliefs
prior std θ                  σ0       1.38     1940 LFP
true value of nurture        θ        0.04     children’s test scores (Bernal and Keane 2006)
intertemporal substitution   γ        2        commonly used

Table 2: Parameter values for the simulated model and the calibration targets.

Since we interpret women’s endowment ω as being the earnings of their husbands, we use a log-normal distribution, ln(ω) ∼ N(µ_ω, σ_ω^2), which is frequently used to describe earnings. We normalize the average endowment (not in logs) to 1 and use σ_ω to match the dispersion of 1940 annual log earnings of husbands with children under 5 at home. For women’s ability, µ_a and σ_a, we match the censored distribution for earnings of working women in the first period of the model to the censored distribution of earnings in the 1940 data. These parameters imply that full-time employed women in our sample earn 81% of their husbands’ annual earnings, on average.[7]

[7] A wage gap where women earn 81% of their husbands’ income is higher than most estimates. This is due to two factors. First, we do not require husbands to be full-time workers because we want to capture the reality that women’s endowments can be high or low. Second, poor women are more likely to be employed. By examining husbands of employed women, we are selecting poorer husbands.

Without such direct observable counterparts for our information variables, we need to infer them from participation data. Initial beliefs are assumed to be the same for all women and unbiased, implying µ_0 = θ. The alternative, a theory driven by initially biased beliefs, is difficult to rationalize. The bias would have to be present in every country; otherwise, female labor force participation would start out high and decrease in some countries. Uncertainty σ_0 is chosen to match women’s 1940 labor force participation. The number of signals observed each period J matches the level of Moran’s I in 1940 at distance d. d comes from the point at which Moran’s I drops precipitously. In the data, that distance is 40 miles. On average, 7.94 other counties have centroids within 40 miles. Therefore, we set the radius of social interaction in the model to be 2 counties so that agents learn from other women in an area that comprises approximately 7.94 regions.
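The normalization of the average endowment can be checked with a few lines of arithmetic. The sketch below relies only on the standard formula for the mean of a log-normal variable, E[ω] = exp(µ_ω + σ_ω^2 / 2); the variable names are hypothetical.

```python
import math

sigma_omega = 0.75                       # std of log endowment, from table 2

# Setting exp(mu + sigma^2 / 2) = 1 pins down the mean of the log endowment.
mu_omega = -0.5 * sigma_omega**2         # -0.28125, reported as -0.28 in table 2

mean_endowment = math.exp(mu_omega + 0.5 * sigma_omega**2)   # equals 1.0
```

So the table 2 entry for the mean log endowment is implied by the dispersion parameter and the normalization, and is not a separately estimated number.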


The true value of nurture θ comes from evidence on the effect of maternal employment on Peabody Individual Achievement Test scores of children and the correlation between the childhood test scores and educational attainment (Bernal and Keane 2006), combined with estimates of the college wage premium (Goldin and Katz 1999). One year of full time maternal employment plus informal day care reduces test scores by roughly 3.4%. Maternal employment until age five translates into a 4% drop in children’s future annual wages. Appendix B provides more detail on the estimation of all the calibration targets and discusses sample selection issues. The model also needs an initial distribution of signals, which are observations of wages and parental working decisions, in period 1 (1940). Our calibration determines the initial wages, but not the maternal participation decisions. To generate participation outcomes, we simulate a 1930 period where we fix the urban and rural labor supplies equal to their true 1930 values and generate a set of signals from those outcomes. Section C explores alternative calibrations. One such alternative model has more realistic timing conventions: Families are staggered and a new generation begins every 20 years. The results of that model are not qualitatively different from the 10-year generation model.

3.2 Simulation results: The distribution of participation

We simulate a square country with 625 regions and 45 agents per region.[8] There are 33 cities, which we match with the 33 counties with participation rates in the top 1%. Their locations in the model are randomly (uniformly) assigned. The maps in figure 5 show the simulated labor force participation rates in each region. As with the maps of U.S. data (figures 1 and 2), a darker color indicates a higher participation rate. In the model, there are no economic or demographic differences between regions, only information differences. Therefore, these results are more directly comparable to the residuals of a regression of participation rates on economic and demographic variables.

[8] Of course, the model can be extended to include more regions and more agents, without changing the results except to make them more accurate. We choose the number of agents and regions based solely on computation time. This simulation takes 15 minutes to run.

Figure 5: Simulated participation rates for each decade in each region. Darker colors indicate a higher participation rate. [Maps: one panel per decade, 1940 to 2000, with a color scale from 0 to 1.]

The dark squares are cities. Cities have a higher participation rate in 1940 because urban residents have higher average abilities and because cities start with higher levels of participation in 1930. The information generated by the 1930 generation of working women made labor force participation less risky and more attractive to women in the following decade. High participation rates spread out from around cities because women living near cities draw some of their signals from urban women who are more likely to be employed and provide information about the effects of maternal employment.

The result for aggregate participation rates is a rise that resembles the data (figure 6). In fact, not only can a simple reduction in uncertainty about the consequences of maternal employment cause participation rates to rise, it slightly over-predicts the increase in the data. This is not to say that uncertainty should be the only force causing participation to rise. If there were other concurrent changes such as the emergence of the pill and household appliances, we would need less initial uncertainty to match the low participation rates in 1940 and information diffusion would then account for less of the rise. The point here is that information diffusion is capable of generating large increases in participation, whose shape is consistent with the data.
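The simulated geography described in this subsection might be set up along the following lines. This is only a sketch under stated assumptions: the 25 × 25 layout (the text says only that the country is square with 625 regions), the clipping of neighborhoods at the border, the random seed, and the function name `sample_peer_regions` are all illustrative choices and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)     # arbitrary seed, for reproducibility only

SIDE = 25                          # assumed layout: 25 x 25 = 625 regions
AGENTS_PER_REGION = 45
N_CITIES = 33
D = 2                              # radius of interaction, in regions
J = 3                              # outcomes observed (own plus J - 1 peers)

# Assign city status to 33 regions chosen uniformly at random.
city_ids = rng.choice(SIDE * SIDE, size=N_CITIES, replace=False)
is_city = np.zeros(SIDE * SIDE, dtype=bool)
is_city[city_ids] = True
is_city = is_city.reshape(SIDE, SIDE)

def sample_peer_regions(x, y, n_peers=J - 1):
    """Draw the regions of an agent's peers uniformly from the square
    [x - D, x + D] x [y - D, y + D], clipped at the map border (an assumption)."""
    xs = rng.integers(max(x - D, 0), min(x + D, SIDE - 1) + 1, size=n_peers)
    ys = rng.integers(max(y - D, 0), min(y + D, SIDE - 1) + 1, size=n_peers)
    return list(zip(xs.tolist(), ys.tolist()))

corner_peers = sample_peer_regions(0, 0)
center_peers = sample_peer_regions(12, 12)
```

Because every region holds the same number of agents, drawing a region uniformly and then an agent within it is one simple way to approximate a uniform draw over nearby agents. A full simulation would add the initial participation probabilities, the wage draws, and the belief updating for each generation.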

[Figure 6 plot: aggregate participation rate (vertical axis, 0 to 0.7) by decade, 1930 to 2010.]

Figure 6: Aggregate participation rate for each decade in the model and data.

3.3 Estimating Moran’s I in the model

There are three salient features of the Moran’s I statistic in the data. First, it is decreasing in distance. Second, it decreases less with distance over time (flattening out in figure 3). Third, within a 20-mile radius, it rises and then falls over time (figure 7). The model replicates all three features.

[Figure 7 plot: Moran’s I within 20 miles (vertical axis, 0.4 to 1) by decade, 1940 to 2000. Series: Model, Data.]

Figure 7: Moran’s I measure of spatial correlation in a 20 mile radius: model and data.

The model misses some features of Moran’s I as well. The model produces less spatial correlation than the data, particularly at the beginning of the sample. As a result, Moran’s I at 20 miles is a concave function of time in the model and an S-shape in the data. The reason for these failings is that the initial condition for the model in 1930 is zero spatial correlation. Cities are randomly placed and are unrelated to the participation rates in the surrounding areas. Spatial correlation comes from the similarity in the information observed in nearby regions. For these information sets to become similar, it takes a few periods of learning. Thus, spatial correlation starts very low and grows quickly in the first few years. It never quite reaches the level in the data. More nuanced starting conditions with some initial spatial correlation could help to correct this discrepancy.
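To make the statistic concrete, the sketch below computes Moran’s I on a small grid of regions. It uses the standard definition, I = (N / W) * sum_ij w_ij z_i z_j / sum_i z_i^2, where z_i is the deviation of region i from the mean and W is the sum of all weights. The binary distance-band weights (w_ij = 1 if region j lies within the radius of region i, 0 otherwise) are an assumption made for illustration and may differ from the weighting used in the paper; the function name `morans_i` and the toy grids are likewise hypothetical.

```python
import numpy as np


def morans_i(values, coords, radius):
    """Moran's I with binary distance-band weights.

    w_ij = 1 if 0 < distance(i, j) <= radius, and 0 otherwise.
    """
    x = np.asarray(values, dtype=float)
    pts = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between region centers.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = ((dist > 0) & (dist <= radius)).astype(float)
    z = x - x.mean()                      # deviations from the mean
    cross = (w * np.outer(z, z)).sum()    # weighted cross-products
    return (len(x) / w.sum()) * cross / (z ** 2).sum()


# Toy 6 x 6 country with unit spacing between region centers.
side = 6
rows, cols = np.indices((side, side))
coords = np.column_stack([rows.ravel(), cols.ravel()])

checkerboard = ((rows + cols) % 2).ravel()  # neighbors always differ
gradient = cols.ravel()                     # values rise smoothly west to east

i_checker = morans_i(checkerboard, coords, radius=1.0)
i_gradient = morans_i(gradient, coords, radius=1.0)
```

The checkerboard pattern yields strongly negative spatial correlation and the smooth gradient yields positive spatial correlation. Repeating the calculation for a sequence of radii traces out how the statistic varies with distance.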

4 Conclusion

Local transmission of information about the effects of maternal employment can reconcile spatial correlation and the geographic diversity in participation rates. Information externalities can explain what market or preference externalities cannot because information creates both a complementarity, which explains local similarity, and a friction, which ensures that regions remain heterogeneous. The model can replicate the evolution of geographic effects over time. The results suggest that personal economic decisions are based more on observations of neighbors than on aggregate information.

One way to interpret this model is that it explains the process of social change. Interpreting the model in such a broad way suggests a direction for future research. One role of social norms is to coordinate actions. People who make decisions that differ from the norm are ostracized. Coupling such a desire for coordination with local information diffusion could yield additional insights. Local coordination motives would rationalize the choice to learn about local outcomes instead of aggregate statistics. In addition, local information would make local actions more similar, reinforcing coordination. Both effects would slow down changes in beliefs and explain the persistence of social norms.

A Data and Estimation

Table 1: Summary Statistics 1940

Variable                                Mean        Std. Dev.   Obs     Min        Max
Female labor force participation rate   13.631      5.306       3092    3.0401     37.3905
Potential Labor Force Index             82.588      21.563      3093    10.6814    166.8427
Dwelling with radio %                   69.36935    19.91309    3091    13.2       98.5
Interaction                             5622.223    2060.576    3091

Demographics:
White population %                      0.886       0.179       3089    .1444      1
Native white population %               0.961       0.047       3089    .6575      1
Females %                               0.487       0.02        3093    .1957      .5494
Marriages %                             0.014       0.038       3015    0          .7999
Fertility %                             0.026       0.008       3089    0          .1135
Urban population %                      232.946     254.987     3089    0          1
Median family income 1950               2289.303    814.101     3032    0          5489
Fertility                               1135.71     3629.666    3089    0          .1135
Median years of school / males 25+      7.741       2.023       3093    0          12.3
Median years of school / females 25+    8.303       2.063       3093    0          12.2

Industrial sectors:
Constructions %                         0.042       0.024       3089
Mining %                                0.03        0.076       3089
Farm %                                  0.417       0.212       3089
Manufacturing %                         0.123       0.121       3089
Transport %                             0.049       0.035       3089
Retail %                                0.12        0.047       3089
Business %                              0.124       0.049       3089
Other %                                 0.094       0.052       3089

Occupations:
Professionals %                         0.173       0.068       3092
Semiprofessionals %                     0.005       0.007       3092
Farmers %                               0.045       0.042       3092
Managers %                              0.056       0.034       3092
Clerks %                                0.193       0.081       3092
Crafts %                                0.005       0.005       3092
Operatives %                            0.102       0.114       3092
Domestics %                             0.217       0.083       3092
Non-domestic services %                 0.116       0.052       3092
Wage farm laborers %                    0.017       0.041       3092
Family farm laborers %                  0.045       0.084       3092
Non-farm laborers %                     0.008       0.012       3092
Other %                                 0.017       0.012       3092

Figure 8: Summary statistics for data used in section 1.

Table 2 - Panel A: Dep = Female Labor Force Participation Rate 1940

Variable                              (i)          (ii)         (iii)        (iv)         (v)          (vi)
Potential Labor Force Index           .0466**      .0383**      .0162*       .0273**      .0167**      -.0524**
                                      (.0123)      (.0109)      (.0082)      (.0078)      (.0065)      (.0129)
Dwellings with radio %                                                                                 -.0269*
                                                                                                       (.0157)
Interaction                                                                                            .0009**
                                                                                                       (.0001)
Demographics:
White population %                                 -7.4673**    -11.2575**   -11.6387**   -7.8058**    -8.8244*
                                                   (.6011)      (.47822)     (.4549)      (.4347)      (.4468)
Native white population %                          -38.5121**   -4.5236*     -4.7392**    -4.7508**    -5.1999**
                                                   (2.4687)     (1.9622)     (1.8192)     (1.4842)     (1.4601)
Females %                                          121.5833**   38.1814**    24.4100**    20.2475**    18.6605**
                                                   (4.5077)     (3.9835)     (4.2392)     (3.5554)     (3.5005)
Urban population %                                              .0093**      .0051**      .0038**      .0038**
                                                                (.0003)      (.0003)      (.0003)      (.0003)
Median family income 1950                                       .0026**      .0017**      .0018**      .0013**
                                                                (.0001)      (.0001)      (.0001)      (.0001)
Median years of school / males 25+                              -.2472†      .3410*       .0522        -.1405
                                                                (.1649)      (.1571)      (.1267)      (.1280)
Median years of school / females 25+                            .2866**      -.3355*      -.0137       .1621
                                                                (.1666)      (.1561)      (.1259)      (.1276)
Fertility                                                       -44.4448**   -45.2258**   -47.7152     -34.7002**
                                                                (8.3097)     (7.6259)     (6.0991)     (6.1506)
Marriages %                                                     -.0134       -.6694       -.6455       -.5820
                                                                (1.5297)     (1.4121)     (1.1168)     (1.0982)
Industrial sectors                                                           X            X            X
Occupations                                                                               X            X
Intercept                             9.7793**     -5.0858**    .8091*       .9035        -10.2062     -5.1880†
                                      (1.0211)     (3.2402)     (2.4237)     (3.5032)     (3.2037)     (3.2684)

Number of obs                         3092         3088         2957         2957         2957         2957
Adjusted R-squared                    0.2788       0.4872       0.7452       0.7870       0.8673       0.8717

Note: ** significant at 1%; * significant at 5%; † significant at 10%. State fixed effects in all specifications.

Figure 9: Parameter estimates for potential labor force participation and all control variables.

B Calibration Details

Calibration targets are based on the sample of women aged 25-54 with their own child younger than 5 living in the household. We restrict the sample to white individuals who do not live in institutions or on farms and do not work in agriculture.

Abilities. The distribution of women’s abilities is constructed so that their wages in the model match the distribution of women’s wages in the 1940 census data. $\sigma_a = 0.57$ is the standard deviation of log ability and $\mu_a = \ln(\text{earnings gap}) - \sigma_a^2/2$ is the mean of log ability. These parameters are chosen to match the initial ratio of average earnings of working women to average earnings of all husbands (0.8 in the data) and the standard deviation of log earnings of employed women (0.53 in the data).

Selection effects in the model. The distribution of observed wages in the data needs to be matched with the distribution of wages for employed women in the model. Employed women are not a representative sample. They are disproportionately high-skill women. The calibration deals with this issue by matching the truncated distribution of wages in the data to the same truncated sample in the model. In other words, we use the model to back out how much selection bias there is.

Endowment distribution. Data come from the census. We use wages in 1940 (the first available year). From this, we construct two pools of matched data: one contains only married women; the other contains their husbands. The log endowment is normal. For these two sets of wage data, we take the log of wages over the previous year. For husbands, mean(log incwage husb) = 7.043089 and std(log incwage husb) = .7348059. Therefore, we set $\sigma_\omega = 0.73$. We choose the mean log endowment $\mu_\omega = -\sigma_\omega^2/2$ such that the mean endowment is normalized to 1.

Initial labor force participation. We need a period 0 participation rate that determines the period 1 wage distribution of women and a period 1 participation rate to start the simulation. Period zero FLFP = 3%, period one FLFP = 6%. These values come from the 1930 and 1940 censuses for married, white women aged 25-45, not living on farms or in institutions, with a child younger than 5 living in the household.

True value of nurture. To calibrate the θ parameter, we use micro evidence on the effect of maternal employment on the future earnings of children. Our evidence on the effect of maternal employment comes from the National Longitudinal Survey of Youth (NLSY), in particular the Peabody Picture Vocabulary Test (PPVT) at age 4 and the Peabody Individual Achievement Test (PIAT) for math and reading recognition scores measured at ages 5 and 6. One year of full-time maternal employment plus informal day care reduces test scores by roughly 3.4% (Bernal and Keane 2006). If a mother works from one year after birth until age six, these five years of employment translate into a score reduction of 17%. The childhood test scores are significantly correlated with educational attainment at 18. A 1% increase in the math test score at age 6 is associated with .019 years of additional schooling. A 1% increase in the reading test score at age 6 is associated with .025 additional school years. Therefore, five years of maternal employment translates into between 0.32 (17*.019) and 0.42 (17*.025) fewer years of school. The final step is to multiply the change in educational attainment by the returns to a college education. We use the returns to a year of college from 1940 to 1995 from Goldin and Katz (1999). Their estimates are the composition-adjusted log weekly wage for full-time/full-year, non-agricultural, white males. Those estimates are 0.1, 0.077, 0.091, 0.099, 0.089, 0.124, and 0.129 for the years 1939, 1949, 1959, 1969, 1979, 1989, and 1995. The average return to a year of college is 10%. Since maternal employment reduces education by 0.32-0.42 years, the expected loss in terms of foregone yearly log earnings is about 4%, or θ = 0.04.
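The arithmetic behind θ can be retraced step by step. The sketch below uses only the figures quoted in the paragraph above. The variable names are hypothetical, and summarizing the 0.32-0.42 range by its midpoint is an assumption made here to arrive at a single number.

```python
# Figures quoted in the text.
score_drop_per_year = 3.4           # % test-score reduction per year of maternal employment
years_employed = 5                  # from one year after birth until age six
schooling_per_pct_math = 0.019      # years of schooling per 1% of math score
schooling_per_pct_reading = 0.025   # years of schooling per 1% of reading score
college_returns = [0.1, 0.077, 0.091, 0.099, 0.089, 0.124, 0.129]  # 1939-1995

# Step 1: cumulative test-score reduction (about 17%).
score_drop = score_drop_per_year * years_employed

# Step 2: implied loss in years of schooling
# (about 0.323 and 0.425; reported as 0.32 and 0.42 in the text).
school_loss_low = score_drop * schooling_per_pct_math
school_loss_high = score_drop * schooling_per_pct_reading

# Step 3: average return to a year of college (about 10%).
avg_return = sum(college_returns) / len(college_returns)

# Step 4: foregone log earnings. Taking the midpoint is an assumption.
theta_low = school_loss_low * avg_return
theta_high = school_loss_high * avg_return
theta_mid = 0.5 * (theta_low + theta_high)

print(f"score drop: {score_drop:.1f}%")
print(f"schooling loss: {school_loss_low:.3f} to {school_loss_high:.3f} years")
print(f"average return: {avg_return:.3f}")
print(f"theta: {theta_low:.3f} to {theta_high:.3f}, midpoint {theta_mid:.3f}")
```

The range of roughly 0.03 to 0.04 is consistent with the calibrated value θ = 0.04 after rounding.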
Number of signals. The number of signals observed each period, J, is chosen to match the 42% probability of a woman making a different labor participation decision than her mother. We use the 1970-2000 GSS data because matched mother-daughter data are only available for those years. This probability comes from GSS data on married women with children under six who report that they either work full-time or are homemakers, and who reported whether their mother worked for at least one year after having children and before they were six years of age. The rationale for matching this moment is that a woman who observes many signals will have posterior beliefs potentially quite different from her mother’s and will have a higher chance of switching outcomes.9 A seminal paper in the sociology literature (Marsden 1987) estimates that the average American has three other people with whom he/she discusses important issues. Since J includes the woman’s observation of her own wage, this implies J = 4.

9 This calibration strategy uses the counter-factual assumption that mothers’ and daughters’ abilities are uncorrelated. If we relax this assumption, we would need more signals (higher J) to match $\text{Prob}(n_{i,t} = n_{i,t-1})$. At the same time, each correlated signal would reveal less independent information. The net effect on the rate of learning is ambiguous.
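The lognormal normalizations used for abilities and endowments earlier in this appendix rest on the fact that a lognormal variable with log-mean µ and log-standard-deviation σ has mean exp(µ + σ²/2). The sketch below checks both normalizations. The function name is hypothetical, and the value assigned to `earnings_gap` is a placeholder chosen purely for illustration, since the text does not report the calibrated earnings gap itself.

```python
import math


def lognormal_mean(mu, sigma):
    """Mean of a lognormal variable whose log has mean mu and std sigma."""
    return math.exp(mu + sigma ** 2 / 2)


# Endowments: sigma_omega from the text; mu_omega set so the mean is 1.
sigma_omega = 0.73
mu_omega = -sigma_omega ** 2 / 2
mean_endowment = lognormal_mean(mu_omega, sigma_omega)

# Abilities: sigma_a from the text; the earnings gap value is hypothetical.
sigma_a = 0.57
earnings_gap = 0.5  # placeholder, NOT a number from the paper
mu_a = math.log(earnings_gap) - sigma_a ** 2 / 2
mean_ability = lognormal_mean(mu_a, sigma_a)

print(f"mean endowment: {mean_endowment:.6f}")  # normalized to 1
print(f"mean ability:   {mean_ability:.6f}")    # equals the earnings gap
```

Subtracting σ²/2 from the log-mean therefore pins down the level of the distribution (1 for endowments, the earnings gap for abilities) independently of the dispersion parameter.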

C Sensitivity Analysis for Calibrated Model

We explore the sensitivity of model outcomes to changes in four key sets of calibrated parameters, and to changes in model timing. Results still need to be filled in.

C.1 Changing Parameters

Model                 Parameter value   Starting Moran’s I   Ending Moran’s I   Ending female LFP   Ending coefficient of variation
Few signals           J = 2
More signals          J = 4
High urban ability    µaC = 0.73
Low urban ability     µaC = 0.24
Larger radius         d = 0.06
Smaller radius        d = 0.02
Benchmark

Table 3: Sensitivity of spatial correlation and participation rates to parameter changes.

Number of signals. Since the number of signals is not something we can directly observe, it is important that our results not be too sensitive to it. Increasing the number of signals to 4 or decreasing it to 2 either speeds up or slows down learning. This can be seen in the steeper or flatter S-shapes in panel (a). Table 3 shows that the number of signals has mild effects on initial endowments and wages. These initial effects arise because more signals make agents’ beliefs initially more different. Differences in beliefs, rather than differences in endowments or abilities, then become a more important determinant of the participation decision.

Mean of urban ability / wages. We do three exercises to explore the role of the ability distribution. The first two are straightforward: move the mean of the ability distribution (exp(µa)) 50% up and down. The results are labeled high ability and low ability. When women have higher ability, on average, they are more likely to join the labor force and earn higher wages when they work. The effect on participation rates is large, telling us that our results are sensitive to our ability distribution. However, it is comforting that wages are even more sensitive to the mean of ability. This tells us that a given wage distribution provides precise information about what the right ability distribution is.

Time-varying ability distribution. The third ability-distribution exercise explores what role empirical changes in wages play in determining participation. This is the same benchmark model, but with a change in the calibration strategy. Instead of assuming that the distributions of endowments and abilities of women are fixed over time, we calibrate a distribution of endowments and of abilities for each decade. The endowment distribution mean and variance come from the census data in each decade on married men’s wages. The distribution of female abilities is calibrated so that wages for working women in the model have the same mean and standard deviation as in the data, for each decade.

Radius of social interaction. We explore two alternative values for the radius of social interaction: one that is 100% higher and one that is 50% lower than our initial estimate.

C.2 Changing Model Timing

The model is designed to explain the labor force participation decisions of women with children under 5 years of age. The majority of these women in the census data are between the ages of 25 and 35, with an average age of 32. Whether women return to the labor force afterwards is not something our theory speaks to, nor is it relevant for the participation rates of our subgroup. This is part of the reason why we look at 10-year periods. What our timing assumptions miss is that it takes about 20 years between when a child is born and when she enters the labor force. Therefore, the decisions of mothers determine the information that others observe 20 years, not 10 years, later. This longer lag in the learning process slows down cultural change and allows us to add more signals to the model, while still matching the data. To have a more realistic information lag, we explore two alternative timing conventions.

Twenty-year periods. We acknowledge that it takes twenty years for a child to grow up and for her potential wage to be observed. If we made the model have twenty-year instead of ten-year periods, without changing the number of signals, the results would be literally the ones we reported, with a re-labeling of years. But the longer periods allow us to increase the information flow that women observe, to address the criticism that information is too restricted. We suppose that they observe three signals per decade, or six signals every 20-year period. The resulting path loses some of the S-shape in labor force participation because the observations are too far apart to capture the nuances of how participation increased.

Staggered dynasties. This model is one where a child grows up for 25 years and realizes her potential wage at 25. At the same time, the woman marries and starts having children. She is a married woman with a child under 5 years of age until age 35, when she drops out of our sample. If all families started having children at the same time, then in every 25 years there would be 15 in which there are no women with small children. Therefore, we stagger families so that every year an equal number of children are born. The parameters are all equal to our benchmark parameters, except that the number of signals observed every 25 years is 8. Those 8 signals come from the wage and maternal employment decisions of women from the current and last 10 cohorts. The labor force participation rates measure participation of the 10 birth-year cohorts that are between 25 and 35 years old. The evolution of participation is illustrated in FILL IN GRAPH.

One interesting feature of these results is that they have 25-year cycles. The women begin learning and participating more in 1940. The women who choose their participation 25 years later, in 1965, have observations that are more informative because the mothers of the women they observe worked with higher probability. That makes learning speed up in 1965 and participation increase at a higher rate for the next few years. Again, in 1990, learning speeds up as women draw more informative signals. While our census data is sampled too infrequently to know whether such echo effects arise, one might think of the WWII era as a shock to the participation rate of women. Twenty to 25 years later came the cultural revolution of the 1960’s and a distinct increase in the rate of labor force participation. This model gives us a way to think about how these baby-boom echo effects might arise.

Finally, this model has a number of features that help to slow the increase in participation. One feature is a longer childhood. That means that information that is generated by a woman participating today will not be revealed for 25 years. A second feature is that participation rates include not only the current cohort, but also 10 years of older cohorts who made their participation decisions with less information and are therefore less likely to participate. A third feature is that signals are drawn from both current and 10 years of past cohorts. The potential wage and maternal employment decisions of an older woman are less likely to be informative. These three features allow us to more than double the rate of information flow without making women learn the truth immediately. What we learn from this is that more realistic modeling of the timing of childbirth and the introduction of overlapping cohorts help us add more persistence to the learning model.

References

Albanesi, S., and C. Olivetti (2006): “Home Production, Market Production and the Gender Wage Gap: Incentives and Expectations,” Working Paper.
Amador, M., and P.-O. Weill (2006): “Learning from Private and Public Observations of Others’ Actions,” Working Paper.
Attanasio, O., H. Low, and V. Sanchez-Marcos (2006): “Explaining Changes in Female Labour Supply in a Life-Cycle Model,” Working Paper.
Benabou, R. (1996): “Equity and Efficiency in Human Capital Investment: The Local Connection,” Review of Economic Studies, 63, 237–264.
Bernal, R., and M. Keane (2006): “Child Care Choices and Children’s Cognitive Achievement: The Case of Single Mothers,” Northwestern University, Working Paper.
Bowles, G. (1976): “Potential Change in Labor Force in the 1970-80 Decade for Metropolitan and Nonmetropolitan Counties in the United States,” Phylon, 37(3), 263–269.
Conley, T., and G. Topa (2002): “Socio-Economic Distance and Spatial Patterns in Unemployment,” Journal of Applied Econometrics, 17, 303–327.
Del Boca, D., and D. Vuri (2007): “The Mismatch between labor supply and child care,” Journal of Population Economics, 4.
Fogli, A., and L. Veldkamp (2007): “Nature or Nurture? Learning and Female Labor Force Dynamics,” Minneapolis Fed Working Paper.
Fuchs-Schundeln, N., and R. Izem (2007): “Explaining the Low Labor Productivity in East Germany - A Spatial Analysis,” Working Paper.
Goldin, C. (1990): Understanding the Gender Gap. Oxford University Press.
Goldin, C., and L. Katz (1999): “The Returns to Skill in the United States across the Twentieth Century,” NBER Working Paper # 7126.
Goldin, C., and L. Katz (2002): “The Power of the Pill: Oral Contraceptives and Women’s Career and Marriage Decisions,” Journal of Political Economy, 100, 730–770.
Greenwood, J., A. Seshadri, and M. Yorukoglu (2001): “Engines of Liberation,” Economie d’avant gard, research Report 2, University of Rochester.
Holmes, T., and J. Stevens (2004): “Spatial Distribution of Economics Activities in North America,” in Handbook of Regional and Urban Economics, ed. by J. V. Henderson, and J. F. Thisse.
Jones, L., R. Manuelli, and E. McGrattan (2003): “Why Are Married Women Working So Much?,” Research Department Staff Report 317, Federal Reserve Bank of Minneapolis.
Marsden, P. (1987): “Core Discussion Networks of Americans,” American Sociological Review, 52, 122–131.
Moran, P. (1950): “Notes on continuous stochastic phenomena,” Biometrika, 37, 17–23.
Patacchini, E., and Y. Zenou (2007): “Spatial dependence in local unemployment rates.”
Topa, G. (2001): “Social Interactions, Local Spillovers and Unemployment,” Review of Economic Studies, 68, 261–295.