Inequality in pupils' test scores

Inequality in pupils’ test scores: how much do family, sibling type and neighborhood matter?

Cheti Nicoletti† Birgitta Rabe‡

†University of York, UK ‡Institute for Social and Economic Research, University of Essex, UK

Abstract We explore the relative influence of family and neighborhood on pupils’ test scores and how this varies by sibling type. Using English register data we find that the neighborhood explains at most 10-15% of the variance in pupils’ test scores, whereas the variance explained by family is between 44 and 54% at the end of primary school and between 47 and 61% at the end of compulsory schooling. The family influence is significantly higher for identical twins. It is also higher for fraternal twins than for non-twin siblings brought up at different times and therefore experiencing varying family circumstances.

INTRODUCTION

There is a long-standing interest in the relative influence of family and neighborhood background on socio-economic outcomes such as income and education. Over many years, social scientists have used sibling correlations in socio-economic outcomes to measure the importance of family background, where any sibling resemblance indicates that family background matters. However, the sibling correlation measures both family and community influences, as siblings share both the family and the neighborhood context they grow up in. Solon et al. (2000) use a variance decomposition approach to put bounds on the possible magnitude of family and neighborhood effects, estimating correlations between siblings and between unrelated neighbors. This can be used to put an upper and a lower bound on the ‘pure’ family influence, i.e. on the proportion of variation in the outcome explained by shared family factors.

There are, however, a number of family background factors that are not shared by siblings. Examples are factors that differ between the siblings because they are brought up at different times, because parents allocate their resources unequally between the children in the family, or because siblings do not share all genetic traits (unless they are identical twins). These unshared family factors could be the key to explaining why inequalities in socio-economic outcomes are fairly large even within families. They can be captured by comparing sibling types, defined for example by the extent to which siblings share environments (Solon 1999).

In this paper we exploit English register data with an exceptionally large sample size to explore the relative importance of family, sibling type and neighborhood on pupils’ test scores. In particular we look at, and compare, educational outcomes at the end of primary school, at age 11, and at the end of compulsory schooling, usually at age 16. We add to the
previous literature in various ways. To the best of our knowledge we are the first to apply the variance decomposition into family and neighborhood influences (Solon et al. 2000) to exam results which arguably are a finer measure of educational outcomes than the years of education used in other papers. We are able to explore how these influences evolve with pupil age. Furthermore, we can differentiate the results by sibling type. Specifically we look at sibling differences in educational outcomes by sibling gender combination and genetic similarity, comparing monozygotic (MZ) twins with dizygotic (DZ) twins and non-twin siblings. We develop a method to identify MZ and DZ twin correlations separately although the twin types are not defined in our data. Finally, we use multilevel models to estimate the variance components, and we show that the lower bound on the family influence has a straightforward interpretation; it measures the influence of what we call the relative family influence. This arises from a family having characteristics which differ from the average of other families living in the same neighborhood. We find a sibling correlation of 54% for eleven and 61% for sixteen year old pupils which measures the total effect of family and community background. The shared neighborhood can account for at most 10-15% of the variance in pupils’ test scores, whereas the influence of the shared family background is estimated to be between 44 and 54% at age eleven and between 47 and 61% at age sixteen. Living in an urban neighborhood has a slightly larger influence than living in a rural one, possibly because the contact among neighboring children is higher in densely populated areas. Differentiating by sibling type, we find that the sibling correlations are higher between same gender siblings than between siblings of differing gender, and they are higher at age sixteen than at age eleven. 
Monozygotic (identical) twins’ correlations in exam scores are between 0.26 and 0.36 points higher than those of dizygotic (fraternal) twins, which presumably is caused at least in part by the identical genes of monozygotic twins. We also find that sibling correlations in test scores of non-twin siblings are lower than those of DZ twins, highlighting the role of varying family circumstances for siblings of the same family born at different points in time.

I. BACKGROUND

Sibling correlations in socio-economic outcomes are summary measures of the importance of shared family and community background in explaining the outcome in question (Jencks 1979). Attempts to disentangle the relative importance of family and neighborhood are complicated by the fact that they are strongly correlated, as children who grow up in communities with schools, peers and role models that lead to favorable adult outcomes also tend to live in families with favorable characteristics (Björklund and Jäntti 2009). Solon et al. (2000) use a variance decomposition approach to put bounds on the possible magnitude of family and neighborhood effects, estimating correlations between siblings and between unrelated neighbors. The neighbor correlation captures pure neighborhood factors, but also family traits that are likely positively correlated within the neighborhood because of sorting mechanisms. The neighborhood correlation in children’s outcomes is therefore an upper bound on the proportion of variation in the children’s outcome explained by neighborhood factors. The difference between the sibling and neighbor correlation provides a lower bound on the importance of the family influence. The advantage of this approach to estimating neighborhood and family influence is that it does not require observing any of the family and neighborhood characteristics that may explain children’s outcomes. Therefore it avoids issues of measurement error, omission and arbitrary selection of family and neighborhood characteristics.1

There are numerous channels through which neighborhoods could affect the educational outcomes of the pupils living in them. These could include social interactions
between members of the community such as schoolmates and friends living in the same neighborhood (“peer effects”) or influences of adults that either live or work in the neighborhood and serve as role models. They could, on the other hand, include physical characteristics of the neighborhood such as safety, recreational facilities and the like. Our comparison of the neighborhood influence of communities with differing population density (urban and rural neighborhoods) is supportive of the social interaction channel. Previous papers have shown that any sibling resemblance in a variety of outcomes arises much more from growing up in the same family than from growing up in the same neighborhood. Specifically, Solon et al. (2000) estimate the sibling correlation in years of education to be 0.5 in the US, of which the shared neighborhood contributes at most around 0.1, leaving a lower bound of the pure family influence of approximately 0.4. Likewise, based on Swedish data Björklund et al. (2010) find the brother correlation in IQ measures to be 0.5. Two papers using data for Sweden and Norway respectively find sibling correlations in years of education of 0.4 and neighbor correlations of around 0.1 (Lindahl 2011; Raaum et al. 2006). Several papers looking at income and earnings as an outcome find sibling correlations ranging from 0.2 (Raaum et al. 2006; Lindahl 2011) up to 0.4/0.5 (Solon et al. 1991; Mazumder 2008; Björklund et al. 2009) and neighbor correlations ranging from 0.003 (Lindahl 2011, for girls) to 0.2 (Page and Solon 2003a, boys). Any variations in the correlation estimated in these studies can result from country differences, but they are also likely to reflect the type of outcome measurement (continuous variables always having higher correlations than categorical or binary ones) and the amount of measurement error, for example in measuring ‘permanent’ earnings (see Björklund et al. 2002). 
In this paper we focus on exam scores as an outcome because they are associated with many later life outcomes such as further education, earnings, unemployment and teenage pregnancy and hence measure pupils’ opportunities. Unlike many of these later life outcomes – earnings are
an example – test scores are observed for the whole population and thus avoid issues of sample selection. Similarly to the common practice of averaging earnings across several years to reduce potential measurement errors, we consider a composite measure which combines three tests, which are English, Mathematics and Science. There is an extensive literature which spans several disciplines from psychology, sociology and behavioral genetics to economics that looks at which specific within-family background factors are responsible for sibling differences in outcomes. Starting with genetics, some researchers have compared family members with varying degrees of genetic relatedness, such as MZ and DZ twins, full and half siblings to parcel out genetic and environmental influences (e.g. Taubman 1976; Guo and Wang 2002; Björklund et al. 2005, Rabe-Hesketh et al. 2007). The papers generally find that similarity in outcomes increases with genetic relatedness, but disentangling the influences of nature and nurture in detail requires quite strong assumptions. For example, behavioral genetics models assume that siblings share 50% of their genes, although when allowing for assortative mating among parents they would likely share a larger proportion. Moreover, these models use the so-called equal environment assumption which postulates that the shared environment component is the same for MZ twins, DZ twins and for siblings. We investigate differences in sibling correlations between MZ and DZ twins and non-twin siblings without aiming to decompose these differences into genetic and environmental components. To explain differences between siblings, researchers have looked at the effects on socio-economic outcomes of family composition, of which birth order has perhaps been most widely researched. 
Birth order has been shown to affect the allocation of scarce parental resources such as time inputs between siblings, with the first-born usually receiving the largest share and subsequent children receiving less and less (Price 2008). There is a growing
body of literature showing that consequently outcomes such as educational attainments decline with family size and further down the birth order (e.g. Black et al. 2005). Other aspects of family composition have been less often researched, mainly because survey data do not contain sufficient sample sizes to distinguish between subgroups. Björklund et al. (2004) study the combined effect of sibship size and gender combination on earnings, and they find that sister correlations in long-run earnings are lower than the respective brother correlations. It cannot be ruled out, however, that these differences are a result of issues with the measurement of female long-run earning. Most papers using sibling correlations produce separate estimates by gender, so comparing sister pair with brother pair correlations. While our data do not allow us to investigate birth order effects, we are able to distinguish the gender composition of siblings, including brother-sister pair correlations. The literature on intergenerational mobility has also focused on how family effects on children’s outcomes change across the child’s age. It is generally found that the effect of parent’s socio-economic status increases with age. In England, for example, Feinstein (2003) and Goodman et al. (2011) find that the gap in cognitive abilities and educational attainments between poor and rich children widens continually during pre-school and primary school. This widening of the gap seems to slow down during secondary school but by the age of 16 the socio-economic gradient is still large (see Goodman et al. 2011; Ermisch and Del Bono 2011). We add to this literature by assessing how both family and neighborhood influences change when looking at educational outcomes at the end of primary school (age eleven) and of compulsory schooling (age sixteen). 
However, one caveat to such comparisons is that any changes observed between Key Stage 2 and Key Stage 4 could reflect changes in the neighborhood and family influence, but they could also be caused by changes in the sorting of families into neighborhoods or differences in the way outcomes are measured.

II. MODEL AND ECONOMETRIC METHODS

Variance decomposition approach

Let $y_{cfs}$ denote our outcome of interest, test scores, for sibling s in family f in neighborhood c, and let us assume the following model:

(1) $y_{cfs} = \alpha' X_{cf} + \beta' Z_c + u_{cfs},$

where $X_{cf}$ and $Z_c$ are vectors of all family and neighborhood characteristics relevant to explain $y_{cfs}$, $\alpha$ and $\beta$ are the corresponding vectors of coefficients, and $u_{cfs}$ is an error term independent of family and neighborhood characteristics and independently and identically distributed (i.i.d.) with mean zero and variance $\sigma_u^2$. As shown by Solon et al. (2000), the correlations in educational attainment between siblings and between neighboring children,

$\mathrm{CorrS} = \mathrm{Cov}(y_{cfs}, y_{cfs'})/V(y_{cfs}) = [V(\beta' Z_c) + V(\alpha' X_{cf}) + 2\,\mathrm{Cov}(\alpha' X_{cf}, \beta' Z_c)]/V(y_{cfs})$

and

$\mathrm{CorrN} = \mathrm{Cov}(y_{cfs}, y_{cf's'})/V(y_{cfs}) = [V(\beta' Z_c) + \mathrm{Cov}(\alpha' X_{cf}, \alpha' X_{cf'}) + 2\,\mathrm{Cov}(\alpha' X_{cf}, \beta' Z_c)]/V(y_{cfs}),$

provide upper bounds on the proportion of variance of $y_{cfs}$ explained by shared family background and neighborhood characteristics, $V(\alpha' X_{cf})/V(y_{cfs})$ and $V(\beta' Z_c)/V(y_{cfs})$, which we call family and neighborhood influence. This is because we can assume that $\mathrm{Cov}(\alpha' X_{cf}, \beta' Z_c)$ and $\mathrm{Cov}(\alpha' X_{cf}, \alpha' X_{cf'})$, which represent the sorting of families into neighborhoods, are both positive. Furthermore, by subtracting the correlation between children living in the same neighborhood from the sibling correlation, we can compute a lower bound on the variance explained by the ‘pure’ family influence, $V(\alpha' X_{cf})/V(y_{cfs})$:

$\mathrm{CorrS} - \mathrm{CorrN} = [V(\alpha' X_{cf}) - \mathrm{Cov}(\alpha' X_{cf}, \alpha' X_{cf'})]/V(y_{cfs}).$
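The bounding logic above can be illustrated with a minimal Python sketch. The function name `solon_bounds` is hypothetical. The sibling correlation of 0.54 is the age-eleven figure reported in the introduction; the neighbor correlation of 0.10 is simply the lower end of the 10-15% range quoted in the abstract and is used here for illustration only.

```python
def solon_bounds(corr_s: float, corr_n: float) -> dict:
    """Bounds implied by a sibling correlation (corr_s) and a neighbor correlation (corr_n).

    corr_n is an upper bound on the neighborhood influence, corr_s is an upper
    bound on the family influence, and corr_s - corr_n is a lower bound on the
    family influence (assuming the sorting covariances are non-negative).
    """
    if not (0.0 <= corr_n <= corr_s <= 1.0):
        raise ValueError("expected 0 <= corr_n <= corr_s <= 1")
    return {
        "neighborhood_upper": corr_n,
        "family_upper": corr_s,
        "family_lower": corr_s - corr_n,
    }


# Illustrative inputs only (see the lead-in for where they come from).
bounds = solon_bounds(corr_s=0.54, corr_n=0.10)
print(bounds)
```

With these inputs the family influence is bracketed between roughly 0.44 and 0.54, which matches the age-eleven range quoted in the abstract.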

It is possible to produce tighter bounds on the neighborhood influence. Altonji (1988) proposes a two-step procedure (see Solon et al. 2000). Let us assume that we can observe a subset of all the relevant family characteristics and let us partition $X_{cf}$ into two sub-vectors of observed and unobserved characteristics, $X_{1,cf}$ and $X_{2,cf}$. The first step consists in regressing $y_{cfs}$ on the subset of observed family characteristics $X_{1,cf}$ and dummy variables for each neighborhood. The second step uses the family effect predicted from the first step to estimate the covariance between pupils living in the same neighborhood, $\mathrm{Cov}(\hat{\alpha}_1' X_{1,cf}, \hat{\alpha}_1' X_{1,cf'})$ (where $\hat{\alpha}_1$ is the estimated coefficient of $X_{1,cf}$). This is a lower bound on $\mathrm{Cov}(\alpha' X_{cf}, \alpha' X_{cf'})$ because only a subset of the predictors are used for its estimation. Therefore $[\mathrm{CorrN} - \mathrm{Cov}(\hat{\alpha}_1' X_{1,cf}, \hat{\alpha}_1' X_{1,cf'})/V(y_{cfs})]$ provides a tighter upper bound for the variance explained by the neighborhood influence. We call these adjusted upper bounds. They are used to provide tighter lower bounds on the family influence.2

Estimation Method

Following Guo and Wang (2002), Mazumder (2008) and Lindahl (2011) we estimate the intraclass correlations CorrS and CorrN by using mixed models (multilevel models). To estimate the sibling correlation, Lindahl (2011) adopts the following type of mixed model:

(2) $y_{cfs} = \gamma_0 + u_{cf} + u_{cfs}$

where $\gamma_0$ is the overall mean of $y_{cfs}$, $u_{cf}$ is a family random component i.i.d. as normal with mean zero and variance $\sigma_f^2$, $u_{cfs}$ is a child-specific error term normally i.i.d. with mean zero and variance $\sigma_u^2$, and $u_{cf}$ and $u_{cfs}$ are mutually independent. The error terms $u_{cfs}$ in models (1) and (2) are identical. Moreover, since siblings living in the same family share the same neighborhood, the random family component $u_{cf}$ in model (2) captures both neighborhood and family effects and is identical to $(\beta' Z_c + \alpha' X_{cf})$ in model (1). The sibling correlation $\mathrm{CorrS} = V(\beta' Z_c + \alpha' X_{cf})/V(y_{cfs})$ is equal to $\sigma_f^2/(\sigma_f^2 + \sigma_u^2)$ and can be estimated consistently by restricted maximum likelihood of model (2) as suggested by Mazumder (2008). Similarly, Lindahl (2011) estimates the correlation between pupils living in the same neighborhood by considering the following mixed model:

(3) $y_{cfs} = \beta_0 + \varepsilon_c + \varepsilon_{cfs}$

where $\beta_0$ is the overall mean of $y_{cfs}$, $\varepsilon_c$ is a neighborhood random component i.i.d. as normal with mean zero and variance $\sigma_c^2$, $\varepsilon_{cfs}$ is a pupil-specific error term normally i.i.d. with mean zero and variance $\sigma_\varepsilon^2$, and $\varepsilon_c$ and $\varepsilon_{cfs}$ are mutually independent. The neighborhood random component $\varepsilon_c$ in model (3) differs from the neighborhood influence $(\beta' Z_c)$ in model (1) except when there is no sorting of families into neighborhoods, i.e. if $\mathrm{Cov}(\alpha' X_{cf}, \alpha' X_{cf'}) = \mathrm{Cov}(\alpha' X_{cf}, \beta' Z_c) = 0$. More generally, in the presence of sorting, $\varepsilon_c$ captures $(\beta' Z_c)$ as well as the variation of $(\alpha' X_{cf})$ across neighborhoods. If we assume that this variation across neighborhoods is equal to the variation of $(\alpha' \bar{X}_c)$, where $\bar{X}_c$ is the average of the family characteristics in neighborhood c, then $\varepsilon_c = \beta' Z_c + \alpha' \bar{X}_c$, $\varepsilon_{cfs} = \varepsilon_{cf} + u_{cfs}$ and $\varepsilon_{cf} = \alpha'(X_{cf} - \bar{X}_c)$, where $\beta' Z_c$, $\alpha' X_{cf}$ and $u_{cfs}$ are the neighborhood and family effects and error term as in model (1). Given the assumptions imposed by model (1), it is easy to prove that the independence conditions between $\varepsilon_c$ and $\varepsilon_{cfs}$ and between $\varepsilon_{cfs}$ and $\varepsilon_{cf's'}$ imposed by model (3) are satisfied if the following two assumptions hold:

A1. The deviation of family characteristics from the neighborhood mean, $(X_{cf} - \bar{X}_c)$, is independent of the neighborhood characteristics $Z_c$.

A2. There is independence between $(X_{cf} - \bar{X}_c)$ for two unrelated children living in the same neighborhood.

The random family component $\varepsilon_{cf} = \alpha'(X_{cf} - \bar{X}_c)$ captures the effect of deviations of family characteristics of family f in neighborhood c from the average family characteristics in that neighborhood. So it measures that part of the family influence arising from a family having characteristics which differ from the average of others living in the same neighborhood. We call this effect the relative family effect. Under assumptions A1 and A2 the correlation between unrelated children living in the same neighborhood becomes

$\mathrm{CorrN} = [V(\beta' Z_c) + V(\alpha' \bar{X}_c) + 2\,\mathrm{Cov}(\alpha' \bar{X}_c, \beta' Z_c)]/V(y_{cfs}) = V(\beta' Z_c + \alpha' \bar{X}_c)/V(y_{cfs}),$

and it is equal to the ratio $\sigma_c^2/(\sigma_c^2 + \sigma_\varepsilon^2)$, which can be estimated by restricted maximum likelihood of model (3).3 Furthermore, the difference between CorrS and CorrN,

$\mathrm{CorrS} - \mathrm{CorrN} = V[\alpha'(X_{cf} - \bar{X}_c)]/V(y_{cfs}),$

not only provides a lower bound for the influence of family characteristics, but can also be interpreted as that part of the variance $V(y_{cfs})$ explained by the relative family effect.

Heterogeneity of the neighborhood and family variance components

In models (2) and (3) we have assumed that the random components have the same constant variance for all pupils. In our empirical application we relax this restrictive assumption. We
extend model (3) and allow both the neighborhood component $\varepsilon_c$ and the residual error $\varepsilon_{cfs}$ to have variances which change between pupils living in urban and rural neighborhoods. In other words model (3) becomes:

(4) $y_{cfs} = \beta_0 + \varepsilon_{R,c}\, d_{R,cfs} + \varepsilon_{U,c}\, d_{U,cfs} + \varepsilon_{cfs}$

where $\beta_0$ is the overall mean, $d_{R,cfs}$ and $d_{U,cfs}$ are dummy variables taking value 1 if the neighborhood c is rural and urban respectively, $\varepsilon_{R,c}$ and $\varepsilon_{U,c}$ are neighborhood components mutually independent and i.i.d. as normal with mean zero and variances $\sigma_R^2$ and $\sigma_U^2$ respectively, and $\varepsilon_{cfs}$ is independent of the neighborhood random component and independently normally distributed with mean zero and variance $\sigma_{\varepsilon U}^2$ for urban neighborhoods and $\sigma_{\varepsilon R}^2$ for rural neighborhoods. Given this new model the correlation between two pupils living in the same rural neighborhood is $\mathrm{CorrRN} = \sigma_R^2/(\sigma_R^2 + \sigma_{\varepsilon R}^2)$, whereas the correlation between two pupils living in the same urban neighborhood is $\mathrm{CorrUN} = \sigma_U^2/(\sigma_U^2 + \sigma_{\varepsilon U}^2)$. Similarly we extend model (2) to allow the variance of the family component $u_{cf}$ to vary between types of siblings, i.e. we consider the following model:

(5) $y_{cfs} = \gamma_0 + \sum_{k=1}^{K} \varepsilon_{k,cf}\, d_{k,cfs} + u_{cfs}$

where $\gamma_0$ is the overall mean of $y_{cfs}$, $d_{k,cfs}$ are K dummy variables for different typologies of siblings, $\varepsilon_{k,cf}$ is a family component i.i.d. as normal with mean zero and variance $\sigma_k^2$, and $u_{cfs}$ is independent of the family random components and independently normally distributed with mean zero and a variance that changes by typology of sibling. To be more specific we use two specifications of model (5). The first specification allows the sibling correlation to vary between pairs of sisters, brothers and mixed sex siblings. The second specification extends the list of possible different types of sibling to twins of different gender, twin brothers, twin sisters, non-twin siblings of different gender, non-twin brothers and non-twin sisters. This last specification allows us to estimate the correlations between monozygotic and dizygotic twins as explained below.

Identification of correlations for dizygotic and monozygotic twins

Based on results estimated using model (5) we can derive twin correlations separately for dizygotic (DZ) and monozygotic (MZ) twins. Here we show how this is possible even in situations where we cannot distinguish MZ and DZ twins, as is the case in our application. To derive these correlations we need the variance of the random family component and of the error term separately by twin type and gender.

The identification of the variance of the family component for DZ twins of different gender, $\sigma^2_{DZ,FM}$, is straightforward because there are no MZ twins of different sex. Therefore the variance for mixed gender DZ twins is directly estimated using model (5). We now use an assumption to compute the corresponding variances for DZ twin brothers, $\sigma^2_{DZ,MM}$, and sisters, $\sigma^2_{DZ,FF}$: we assume that gender differences in the variance of the family component in non-twin siblings are a good approximation of the corresponding difference in variance between DZ twins. Therefore we can compute the variance for DZ twin brothers (sisters) as the sum of the variance of mixed sex DZ twins and the gap in the variance between non-twin brothers (sisters) and non-twin siblings of different sex.

The computation of the variance of the family component for MZ twin brothers, $\sigma^2_{MZ,MM}$, and MZ twin sisters, $\sigma^2_{MZ,FF}$, is slightly more complicated because we are able to identify twin sisters and twin brothers but we cannot distinguish between MZ and DZ twins. To compute $\sigma^2_{MZ,MM}$ and $\sigma^2_{MZ,FF}$ we exploit the fact that

(6) $\sigma^2_{T,MM} = \sigma^2_{MZ,MM}\, p_{MZ,MM} + \sigma^2_{DZ,MM}\, p_{DZ,MM},$

where $\sigma^2_{T,MM}$ is the variance of the family component for all twin brothers (including MZ and DZ twins), and $p_{MZ,MM}$ and $p_{DZ,MM}$ are the proportions of twin brothers who are MZ and DZ twins ($p_{MZ,MM} + p_{DZ,MM} = 1$). In equation (6) we implicitly assume that the means of the family components for MZ and DZ twins are identical. This implies that the variance of the mean of the family component for MZ and DZ is zero. We have already shown how to estimate $\sigma^2_{DZ,MM}$ and $\sigma^2_{T,MM}$ directly using model (5). $p_{MZ,MM}$ and $p_{DZ,MM}$ can be estimated making use of the fact that, because of roughly equal probabilities of conceiving girls or boys, empirically 50% of DZ twins are different gender twins, 25% are DZ twin sisters and the remaining 25% are DZ twin brothers. Therefore the total number of DZ twin brothers and sisters can each be estimated to be half the number of different gender DZ twins, and the number of MZ twin brothers is the difference between the total number of male twins and the number of DZ brothers. Deriving the corresponding proportions of MZ and DZ twin brothers, $p_{MZ,MM}$ and $p_{DZ,MM}$, and replacing the computed values for $\sigma^2_{DZ,MM}$ and $\sigma^2_{T,MM}$ in equation (6), ultimately we can derive the variance of the random family components for MZ brothers, $\sigma^2_{MZ,MM}$. Similarly we can estimate the corresponding variance for MZ sisters, $\sigma^2_{MZ,FF}$.
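The identification steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: all function names are hypothetical, and the variance values and the male twin-pair count are made-up numbers chosen only to make the arithmetic easy to follow. The pair counts passed to `split_twin_pairs` are the age-eleven counts the paper reports in its data section.

```python
from typing import Tuple


def split_twin_pairs(total_pairs: int, mixed_pairs: int) -> Tuple[int, int]:
    """Return (dz_pairs, mz_pairs), assuming half of all DZ pairs are mixed-gender."""
    dz_pairs = 2 * mixed_pairs
    return dz_pairs, total_pairs - dz_pairs


def mz_share_same_sex(same_sex_pairs: int, mixed_pairs: int) -> float:
    """Share of MZ pairs among twin brothers (or sisters).

    DZ brothers (sisters) are estimated as half the number of mixed-gender pairs.
    """
    dz_same_sex = mixed_pairs / 2
    return (same_sex_pairs - dz_same_sex) / same_sex_pairs


def dz_same_sex_variance(var_dz_mixed: float, var_nontwin_same: float,
                         var_nontwin_mixed: float) -> float:
    """Mixed-gender DZ variance plus the gender gap observed among non-twin siblings."""
    return var_dz_mixed + (var_nontwin_same - var_nontwin_mixed)


def mz_variance(var_all_twins: float, var_dz: float, p_mz: float) -> float:
    """Solve equation (6) for the MZ variance, with p_dz = 1 - p_mz."""
    if not 0.0 < p_mz <= 1.0:
        raise ValueError("p_mz must lie in (0, 1]")
    return (var_all_twins - var_dz * (1.0 - p_mz)) / p_mz


# Age-eleven pair counts reported in the data section of the paper.
dz_pairs, mz_pairs = split_twin_pairs(total_pairs=17542, mixed_pairs=5591)

# Hypothetical inputs: 4,000 male twin pairs, 4,000 mixed-gender pairs,
# and made-up family-component variances.
p_mz_mm = mz_share_same_sex(same_sex_pairs=4000, mixed_pairs=4000)
var_dz_mm = dz_same_sex_variance(var_dz_mixed=0.50, var_nontwin_same=0.52,
                                 var_nontwin_mixed=0.48)
var_mz_mm = mz_variance(var_all_twins=0.70, var_dz=var_dz_mm, p_mz=p_mz_mm)
print(dz_pairs, mz_pairs, p_mz_mm, round(var_dz_mm, 4), round(var_mz_mm, 4))
```

Because equation (6) is a simple mixture of two variances, it is only valid under the equal-means assumption stated above; if the MZ and DZ family components had different means, an extra between-group term would be needed.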

Following the same line of reasoning we can derive the variance of the error component ucfs separately for MZ and DZ twins by gender combination. Finally we can compute the correlation separately for all twin types as ratios between the family component variance and the sum of the family component and error variances.
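The ratio used in this final step is the same intraclass correlation that underlies models (2) to (5). The sketch below simulates sibling pairs from model (2) and recovers the correlation with a one-way ANOVA (method-of-moments) estimator. This estimator is a deliberately simple stand-in chosen to keep the example dependency-free; it is not the restricted maximum likelihood procedure used in the paper. The function names and the variance values (0.54 and 0.46) are assumptions made for illustration.

```python
import random


def simulate_sibling_pairs(n_families, var_family, var_child, seed=0):
    """Draw two siblings per family from y = u_f + u_fs (overall mean set to zero)."""
    rng = random.Random(seed)
    sd_f, sd_u = var_family ** 0.5, var_child ** 0.5
    pairs = []
    for _ in range(n_families):
        u_f = rng.gauss(0.0, sd_f)  # shared family (and neighborhood) component
        pairs.append((u_f + rng.gauss(0.0, sd_u), u_f + rng.gauss(0.0, sd_u)))
    return pairs


def anova_icc_pairs(pairs):
    """One-way ANOVA estimate of the intraclass correlation for groups of size two."""
    n = len(pairs)
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    # Between-family mean square: group size (2) times squared deviation of family means.
    msb = sum(2 * ((a + b) / 2 - grand_mean) ** 2 for a, b in pairs) / (n - 1)
    # Within-family mean square: for a pair, the within sum of squares is (a - b)^2 / 2.
    msw = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    var_family_hat = (msb - msw) / 2  # E[msb] = var_child + 2 * var_family
    return var_family_hat / (var_family_hat + msw)


# True correlation is 0.54 / (0.54 + 0.46) = 0.54 by construction.
pairs = simulate_sibling_pairs(n_families=20000, var_family=0.54, var_child=0.46)
icc_hat = anova_icc_pairs(pairs)
print(round(icc_hat, 3))
```

With balanced groups such as sibling pairs, the ANOVA estimator and REML typically give very similar answers; REML becomes more attractive when group sizes vary or when variances differ by sibling type, as in model (5).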

III. DATA

The empirical analysis is based on the National Pupil Database (NPD), which is available from the English Department for Education and has been widely used for education research.
The NPD is a longitudinal register dataset for all children in state schools in England, thus covering roughly 93% of pupils in England. It combines pupil level attainment data with pupil characteristics as they progress through primary and secondary school. Pupil characteristics are collected in annual school censuses and include, for example, age, gender, ethnicity, the pupil’s language group and a low-income marker. Pupil level outcome data during compulsory schooling include Foundation Stage Profiles as assessed by teachers at age 5 as well as National Curriculum assessments typically taken at ages 7, 11, 14 and 16 that comprise a mixture of teacher-led and test-based assessment depending on the age of the pupils.

The advantage of using the NPD for our analysis is that it is a census and as such contains the population of all pupils in state schools. It allows us to identify the whole set of siblings and neighboring children of the relevant age groups in state schools and thus has a very large sample size. This makes it possible to study sibling and neighbor correlations for various sub-groups, and to assess how the relative importance of family and neighborhood on educational outcomes differs over time.

Sibling and twin definition

The NPD includes address data, released under special conditions, which allows us to match siblings in the data set. The first year that full address details were collected in the NPD across all pupil cohorts was 2007. Siblings are therefore defined as pupils in state schools aged 4-16 and living together at the same address in January 2007. Siblings that are not school-age, those in independent schools and those living at different addresses in January 2007 are excluded from our sibling definition. Step and half siblings are included if they live at the same address, and we are not able to distinguish them from biological siblings.

We define as living at the same address those pupils with identical postcodes and house number/house name, as well as flat and block number where applicable. Extensive data cleaning was necessary to extract information on house number or name, flat and block number, as data on these items was not always entered in the dedicated fields, and occasionally one field contains information relating to two items, e.g. ‘Flat 2, Merton House’. Special attention was given to the cleaning and extraction of flat and block information as we assume that a higher proportion of disadvantaged pupils live in flats than houses. The matching of siblings was carried out using 1) postcode and house number/name for addresses with no flat or block number; 2) postcode, house number/name and flat number for addresses without block number; 3) postcode, house number/name, flat and block number; 4) postcode, flat and block number where house number/name was missing. Of the 7.246 million pupil files with address information contained in the 2007 school census, only 4,158 cases had insufficient address information to produce a match using these criteria, and 1,212 cases were dropped where more than ten siblings were identified at an address, and it is possible that they were falsely identified as siblings (false positives).4 We define as twins any pair of siblings – living at the same address - that have the same month and year of birth. There is the possibility that this twin definition includes unrelated same-aged children living at the same address, and we drop 203 such pairs of children because they have different ethnicity.5 The twins defined in this way include both MZ and DZ twins which we are not able to identify separately. In the previous section we described how we derive MZ and DZ twin correlations. At age eleven (age sixteen) we have 5,591 (5,321) different gender twin pairs out of a total of 17,542 (16,753) twin pairs in our sample. 
Because of the assumption that half of DZ twin pairs will be different gender and half will be same gender, we infer that we have a total of 11,182 (10,642) DZ twin pairs in the sample. This means that there are 6,360 (6,111) MZ twin pairs which is 36.3% (36.5%) of the twin sample. Data from the Office for National Statistics on multiple births in England and Wales (see Rasbash et al. 2010) can be used to derive MZ rates in a similar way. The MZ rate for births in 1991 – the cohort that took their Key Stage 4 exams in 2007 – is 36.5%, so our calculations are consistent with external data.

Neighborhood definition

We define a pupil’s neighborhood in terms of where he or she lived at the time of the 2007 school census. In many cases, however, the family may have lived elsewhere before or indeed after 2007. We therefore assume that the 2007 neighborhood is a good proxy for longer-run neighborhood environment. Previous research has shown that even when families move, the neighborhoods they move to are usually similar to the ones they move from (Kunz et al. 2003). Rabe and Taylor (2010) show that this is particularly true of school-aged children in Britain.

We define neighborhoods at the level of Lower Layer Super Output Areas (LSOAs). There are 32,482 LSOAs in England which were constructed using measures of proximity (to give a reasonably compact shape) and social homogeneity (type of dwelling and type of tenure, to encourage areas of similar social background). Each LSOA has constant boundaries and a mean population of 1,500 and a minimum of 1,000 individuals. LSOAs are primarily a statistical geography and thus far from being a perfect definition of a neighborhood, but they do allow fine-grained area analysis at the very local level. To investigate how the estimates vary when using a wider definition of a pupil’s neighborhood we also perform the estimates at the level of Middle Layer Super Output Areas (MSOAs). These are built from groups of contiguous LSOAs and comprise a minimum population of 5,000 and a mean of 7,200.

Outcome and observed background

The outcomes of interest are test results at two different stages of a pupil’s school career, at the end of primary school (Key Stage 2) and at the end of compulsory schooling (Key Stage 4). In year 6, usually at age 11, pupils take National Curriculum tests in the three core subjects of English, Mathematics and Science. At the end of compulsory schooling, usually at age 16, pupils enter General Certificate of Secondary Education (GCSE) or equivalent vocational or occupational exams. GCSEs are not compulsory, but are by far the most common qualification. Pupils decide which GCSE courses to take, and because English, Mathematics and Science are compulsory study subjects, virtually all students take GCSE examinations in these topics, plus others of their choice, with a total of ten different subjects normally taken. In addition to GCSE examinations, a pupil’s final grade may also incorporate coursework elements. In this paper we focus on the GCSE results in the core subjects English, Mathematics and Science which makes the outcome directly comparable to the Key Stage 2 results, and we would argue that they are closer to measuring the ability of a pupil than scores obtained in other subjects. 6 We perform estimates on alternative outcome measures as part of our sensitivity checks. We focus on GCSEs (Key Stage 4) because they mark the first major branching point in a young person's educational career, and lower levels of GCSE attainment are likely to have a longer term impact on experiences in the adult labor market. Key Stage 2 National Curriculum tests form a good point of comparison at a younger age because they are taken at the end of primary school and as such equally mark a turning point in the pupil’s school career. Moreover, schools give most attention to these tests rather than those taken at other times because schools are likely to be judged by parents on the outcomes and they play a prominent role in setting up so-called school league tables. 
Finally, Key Stage 2 and 4 exams are marked externally and contain fewer teacher assessments and therefore arguably contain less measurement error than Key Stage 1 and 3 exams.

In the Key Stage 2 exams, pupils can usually attain a maximum of 36 points in each subject, but teachers will provide opportunities for very bright pupils to test to higher levels. The points are then transformed into levels of achievement which are reported back to pupils and parents. We use as an outcome measure the average points achieved across the three core subjects English, Mathematics and Science. In Key Stage 4 pupils receive a grade for each GCSE course, where pass grades include A*, A, B, C, D, E, F, G. We use a scoring system developed by the Qualifications and Curriculum Authority to transform these grades into a continuous point score. A pass grade G receives 16 points, and 6 points are added for each unit improvement from grade G. The total point score is the sum of the points obtained in English, Mathematics and Science. Students who do not pass any GCSE receive a score of zero. We refer to this point score as the Key Stage 4 score. In the UK, like in many other countries, girls consistently outperform boys in their educational outcomes (e.g. Burgess et al. 2004). To make test scores comparable between boys and girls, we purge them of the gender differences. We regress standardized test scores on gender and use the residuals from these regressions as outcome variables. The NPD annual school census allows identification of a number of family background variables which we use to tighten the upper bound on the neighborhood effect. These include binary variables coding whether or not a pupil is of white British ethnicity and whether or not the first language spoken at home is English. Moreover, we can identify whether or not a pupil is eligible for free school meals (FSM). FSM eligibility is linked to parents’ receipt of means-tested benefits such as income support and income-based jobseeker’s allowance and has been used in many studies as a low-income marker. 
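The construction of the Key Stage 4 score and the gender adjustment described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the treatment of a single failed or missing subject as zero points is an assumption made here for illustration (the text only states that pupils who pass no GCSE receive a score of zero). With one binary regressor, the residual from regressing the standardized score on gender is simply the standardized score minus the mean for the pupil's gender, which is what the sketch computes.

```python
from statistics import mean, pstdev

# Pass grades ordered from lowest to highest, as listed in the text.
PASS_GRADES = ["G", "F", "E", "D", "C", "B", "A", "A*"]


def gcse_points(grade):
    """Points for one GCSE grade: 16 for a G, plus 6 for each grade above G."""
    if grade not in PASS_GRADES:
        return 0  # assumption: a fail or missing grade contributes zero points
    return 16 + 6 * PASS_GRADES.index(grade)


def key_stage4_score(english, maths, science):
    """Sum of the points obtained in the three core subjects."""
    return sum(gcse_points(g) for g in (english, maths, science))


def purge_gender(scores, is_male):
    """Standardize scores and return residuals from a regression on gender.

    With an intercept and a single gender dummy, the OLS residual equals the
    standardized score minus the mean standardized score of that gender.
    """
    mu, sd = mean(scores), pstdev(scores)
    z = [(s - mu) / sd for s in scores]
    group_mean = {
        g: mean([zi for zi, m in zip(z, is_male) if m == g]) for g in set(is_male)
    }
    return [zi - group_mean[m] for zi, m in zip(z, is_male)]
```

For example, a pupil with grades C, B and A in the three core subjects would obtain 40 + 46 + 52 = 138 points under this scoring.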
Finally, we use as a family background variable the number of all siblings in the state school system in 2007 and its square. This is an approximation to the true number of siblings as it is derived

from our matching of pupils at the same address in 2007 and only includes school-age siblings who are in state schools at that point in time. We are also able to merge geographically coded data into the data set, using LSOA identifiers. We restrict this to an indicator of whether a neighborhood lies in a rural or urban area, where urban is defined as settlements with a population over 10,000.

Estimation sample

For our analysis we select two samples from the National Pupil Database. The first, which we call the ‘full sample’, is used to estimate the neighbor correlations and therefore includes all pupils (singletons and siblings) who took Key Stage 4 exams in 2007 or in one of the two following currently available years (2008, 2009), totaling 1.698 million English pupils. For these pupils we also have their exam results at Key Stage 2. We exclude Key Stage 4 years before 2007 because we would not be able to trace and match pupils who left school after their GCSE exams before 2007 in the 2007 address data. The sample we select from the NPD thus includes only neighboring children (and siblings) that are closely spaced, i.e. one or two years apart in the school year and up to just under three years apart in age. We remove pupils with duplicate data entries, triplets, quadruplets and different-ethnicity “twins” as discussed above. We also remove from the dataset all pupils with missing data on any of the background variables, which leads to a reduction of 3.2%. There are missing and zero-value cases for Key Stage 2 and 4 scores, 4.4% and 9.5% respectively.7 To avoid unnecessarily reducing the sample size, we retain pupils with missing or zero information on Key Stage 2 when analyzing Key Stage 4 and vice versa.

The second sample, which we call the ‘siblings sample’, is used to estimate sibling correlations and therefore concentrates on siblings only, now excluding singleton pupils. The resulting sample includes 327,499 siblings. The estimation of the correlation between all

possible distinct pairs of siblings within each household has the sibling pair as its unit of analysis. Therefore we expand the dataset to include all sibling pair combinations within each household, producing 345,806 child-pair observations. In the vast majority of cases there are only two siblings living in the same household and taking GCSE exams in the 2007-2009 period: two-sibling households make up 96% of all households. In some parts of the analysis we further partition this sample into a sample of twins and a sample of non-twin siblings.

Table 1 describes the main characteristics of the full and siblings samples. In both samples girls achieve higher mean exam scores at Key Stages 2 and 4 than boys do. For the full sample the table shows that half of the pupils are male and 2% of pupils are twins. On average there are 1.9 school-age children in every household with at least one pupil taking Key Stage 4 exams over the period 2007-2009. 83% of the pupils in the sample are of white British ethnicity, and 91% speak English as their first language. 12% of pupils in the full sample are eligible for Free School Meals and 82% of them live in a neighborhood located in an urban area. For each neighborhood we observe on average about 54 pupils taking Key Stage 4 exams and 57 taking Key Stage 2 exams. This is a considerably larger sample size than those used in previous studies.

The characteristics of the pupils in the sibling sample differ from the full sample in that, as expected, the number of school-age children per household (2.8) as well as the proportion of twins (10%) is higher. Moreover, the proportion of pupils of white British origin and of those speaking English as their first language is slightly lower in the sibling sample, whereas the proportion of children who are eligible for Free School Meals is higher at 16%.
There is also a slight variation in the mean test scores between the two samples, with pupils in the sibling sample attaining lower scores on average than those in the full sample.
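The expansion of the siblings sample to sibling-pair observations described above can be sketched as follows. This is only an illustration: the record layout (a household identifier and a pupil identifier) and the function name are hypothetical, and pairs are treated as unordered, which is an assumption about how the pairs are counted. A household with n siblings then contributes n(n-1)/2 pairs.

```python
from collections import defaultdict
from itertools import combinations


def sibling_pairs(pupils):
    """Expand (household_id, pupil_id) records into all distinct sibling pairs.

    Returns a list of (household_id, pupil_a, pupil_b) tuples, one per
    unordered pair of pupils sharing a household.
    """
    by_household = defaultdict(list)
    for household_id, pupil_id in pupils:
        by_household[household_id].append(pupil_id)

    pairs = []
    for household_id, members in by_household.items():
        if len(members) < 2:
            continue  # singleton pupils do not enter the siblings sample
        for a, b in combinations(members, 2):
            pairs.append((household_id, a, b))
    return pairs
```

A two-sibling household yields one pair and a three-sibling household yields three, so larger households carry more weight in pair-level estimates.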


IV. RESULTS

Estimates of upper and lower bounds on the neighborhood and family influence

Table 2 presents neighbor and sibling correlations in Key Stage 2 and 4 exam scores. The upper panel shows the correlations between neighboring pupils, which were estimated using the full sample so that three cohorts of pupils are included for each Key Stage. By concentrating on only a few cohorts we base the neighbor correlations on pupils who have experienced similar neighborhood environments. The neighbor correlation can be interpreted as the proportion of the variation in educational outcomes explained by the neighborhood influence, and this will be an upper bound because it is inflated by neighbors’ similarity in family background. The overall neighborhood influence (model 3) is estimated to be at most 0.102 for pupils at the end of primary school and 0.145 at the end of compulsory schooling. Adjusting the neighbor correlations using observed family characteristics as suggested by Altonji (1988) tightens this upper bound by very little, probably because the set of observed covariates is quite limited (see notes to Table 2). The neighborhood correlations are modest and comparable in size to those obtained by Solon et al. (2000) and by Raaum et al. (2006) for years of schooling.

When defining neighborhoods based on larger geographies (Middle Layer Super Output Areas), the neighbor correlations decrease to 0.076 at Key Stage 2 and 0.112 at Key Stage 4 (not displayed). This indicates that peer interactions take place within relatively confined areas and are therefore better captured at LSOA level. It is of course possible that the relevant radius within which pupils interact is smaller still and that a higher neighbor correlation would emerge were we able to identify these areas in our data.


We let the importance of neighborhood factors in explaining pupils’ outcomes vary by whether the neighborhood is located in a rural or an urban area (see model 4). Location may affect outcomes, say, because the physical proximity of neighbors in urban areas compared to rural ones may intensify the influence of the neighborhood factors, perhaps by increasing interaction within the neighborhood peer group. The results of these estimates are displayed in the first panel of Table 2. Indeed, the upper bound on the proportion of variation in the pupil’s outcome explained by rural neighborhood characteristics is approximately 20-30% lower than the one observed for urban neighborhoods, and this is true both for exams taken in primary and secondary school and for adjusted and unadjusted correlations. We conduct nonlinear Wald tests of the equality of neighborhood correlations for urban and rural neighborhoods (CorrUN=CorrRN), which we calculate using the delta method. The results are displayed in the bottom panel of the table and show that equality is rejected at standard levels of significance.

Secondary school pupils in England often commute outside of their neighborhood boundaries to attend different schools, whereas this is less often the case for primary school pupils, who usually attend the local school within the so-called catchment area of residence. This would suggest that neighboring pupils’ test scores should be more correlated at the younger than at the older age because of the difference in shared school background. We find that the opposite is the case. The adjusted neighborhood correlation rises by 50% between ages eleven and sixteen, albeit from a low initial level. A possible explanation is that at age sixteen pupils are often allowed to interact independently with neighboring teenagers and to engage in what is known as ‘hanging out’ on the streets, whereas this is less common for eleven year olds.
This would suggest that peer group interaction drives the neighborhood influence on educational outcomes.
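The nonlinear Wald test of CorrUN=CorrRN mentioned above can be sketched in a simplified form. A correlation of the form a/(a+b), where a is the neighborhood variance component and b the remaining variance, is a nonlinear function of the estimated components, so its standard error is obtained by the delta method. The sketch below assumes, for illustration only, that the urban and rural correlation estimates are independent; in a jointly estimated model their covariance would also enter. All names and inputs are hypothetical.

```python
import math


def corr_and_se(var_cluster, var_resid, cov):
    """Correlation a / (a + b) and its delta-method standard error.

    cov is the 2x2 sampling covariance matrix of (var_cluster, var_resid).
    """
    total = var_cluster + var_resid
    rho = var_cluster / total
    # Gradient of rho with respect to (var_cluster, var_resid).
    g = (var_resid / total ** 2, -var_cluster / total ** 2)
    var_rho = (
        g[0] ** 2 * cov[0][0]
        + 2 * g[0] * g[1] * cov[0][1]
        + g[1] ** 2 * cov[1][1]
    )
    return rho, math.sqrt(var_rho)


def wald_equal_corr(rho1, se1, rho2, se2):
    """Wald statistic and p-value for H0: rho1 == rho2.

    Assumes the two estimates are independent (an illustrative simplification).
    """
    w = (rho1 - rho2) ** 2 / (se1 ** 2 + se2 ** 2)
    p_value = math.erfc(math.sqrt(w / 2))  # upper tail of chi-square, 1 df
    return w, p_value
```

Under the null hypothesis the statistic is asymptotically chi-square distributed with one degree of freedom, so a large value leads to rejection of equality.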


The second panel of Table 2 shows the upper bound of the shared family influence on test scores, i.e. the sibling correlation, estimated using the sibling sample (see model 2). The correlation between siblings in exam results is 0.54 at Key Stage 2 and 0.61 at Key Stage 4. We have also estimated sibling correlations separately by subject and find these to be marginally higher in Science and Maths than in English (not displayed). These correlations are higher than those obtained by Björklund et al. (2010) for IQ amongst brothers in Sweden which range from 0.47 to 0.51 depending on cohort of birth. Solon et al. (2000) estimate a correlation of 0.51 for years of education using US data, and the sibling correlations found for Norway and Sweden by Raaum et al. (2006) and Lindahl (2011) respectively for years of schooling are again lower at around 0.42. Both comparisons seem to indicate a relatively smaller role for family background in the Nordic welfare states than in the US and UK which are liberal welfare states. It is possible that the sibling correlations estimated for the UK may be higher than those estimated for other countries because our sample contains fairly closely spaced siblings who would be expected to be more similar than siblings spaced further apart because they experience more similar environments. Björklund et al. (2010) distinguish their results by age difference, comparing brothers born up to 5 years apart with those born 6-11 years apart. The correlations of the closer spaced brothers are 0.02-0.05 points higher than those of the distantly spaced brothers, so the effect of age spacing seems to be modest. Later in this paper we present small differences between DZ twin and non-twin sibling correlations (Table 3), i.e. between siblings born at the same time and those born up to three years apart, so we do not expect the impact of sample age proximity to be large. 
Since we are comparing upper-bound estimates rather than point estimates of the neighborhood and family influence, we have to emphasize that the changes observed between

Key Stage 2 and Key Stage 4 could reflect changes in the neighborhood and family influence, but could also be explained by changes in the sorting of families into neighborhoods. Studies of residential mobility have shown that families in Britain sort into neighborhoods in the years leading up to primary school age of their children, and mobility after that is low (see Rabe and Taylor 2010). Therefore the comparison of correlations between the two Key Stages should mainly reflect changes in the influences rather than sorting. The third panel of Table 2 shows the lower bound on the family influence, which is given by the difference between the upper bound and the adjusted correlation between pupils living in the same neighborhood. The range of possible values for the family influence on Key Stage 2 scores is (0.44, 0.54) and on Key Stage 4 scores it is (0.47, 0.61). The two intervals overlap in part so it is not possible to draw strong conclusions on whether family influence increases or decreases from age 11 to age 16. If anything, the family influence seems to increase, which is in line with previous research (Ermisch and Del Bono 2011). Note however that the Key Stage 4 scores may contain elements of teacher assessments as coursework may enter them, whereas Key Stage 2 scores are entirely based on tests. We do not know what bias this is introducing in our measurement of the outcomes, if any, so that comparisons must be made with caution. The second and third panels of Table 2 also display the upper and lower bounds on the family influence separately for different gender combinations. The results show higher correlations for same gender than for mixed gender siblings both at the end of primary school and of compulsory schooling. There are also differences between sister-pair and brother-pair correlations. 
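The bounds on the family influence reported above follow from simple arithmetic on the Table 2 entries: the upper bound is the sibling correlation and the lower bound subtracts the adjusted neighbor correlation. A short sketch, with a hypothetical function name, reproduces the two intervals:

```python
def family_influence_bounds(sibling_corr, adjusted_neighbor_corr):
    """Return (lower, upper) bounds on the shared family influence."""
    upper = sibling_corr
    lower = sibling_corr - adjusted_neighbor_corr
    return lower, upper


# Sibling correlations and adjusted neighbor correlations as reported in the text.
ks2 = family_influence_bounds(0.54, 0.096)
ks4 = family_influence_bounds(0.61, 0.143)
print([round(x, 2) for x in ks2])  # [0.44, 0.54]
print([round(x, 2) for x in ks4])  # [0.47, 0.61]
```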
Wald tests displayed in the bottom panel show that equality of the correlations between sisters and between brothers (CorrSFF=CorrSMM), between sisters and mixed sex siblings (CorrSFF=CorrSFM), and between brothers and mixed sex siblings (CorrSMM=CorrSFM)

can be rejected at standard levels of significance. Note that the outcome measures we use in the analysis are purged of the mean gender differences in school outcomes between girls and boys, so that the lower mixed gender correlations are not a reflection of such differences.

While we can provide only bound estimates for the family influence, we are able to provide point estimates for the relative family influence, i.e. for the share of the total variance in pupils’ test scores which is explained by deviations of families’ characteristics from the average in their neighborhood. We have shown in section 2 that under some minor assumptions this relative family influence is given by the lower bound of the family influence. How much the characteristics of a family differ from those of other families living in the same neighborhood appears to be a very important factor in explaining pupils’ outcomes: it explains about 44% of the variance in the Key Stage 2 and about 47% of the variance in the Key Stage 4 exam score. These results suggest a high importance of differences in family characteristics within a neighborhood and less importance of differences between neighborhoods in explaining pupils’ educational outcomes.

Since sibling and neighborhood correlations are given by the ratio between the covariance and variance of the test scores, changes in the correlation across different types of siblings or neighborhoods could be related to changes in the variance across subgroups. This does not seem to be the case for the neighborhood correlations: we find that it is the variation in the covariance that drives the variation in the correlation between pupils living in rural and urban areas (not displayed). The results are somewhat different for the sibling correlations that follow.
We find again that the changes in the correlation between MZ, DZ and non-twin pairs are driven by changes in the covariance, but differences in the correlation between brother and sister pairs in some cases have the opposite sign to the differences in the corresponding covariances. This is not a major concern, given

that differences in the correlation between brother and sister pairs are quite small, but for clarity we report both sibling correlations and covariances in Table 3 below.

Estimates of upper bounds on the family influence by sibling type

Up to now we have focused on the sibling correlation as a measure of the influence of shared family and neighborhood background on educational outcomes. In addition to the family and neighborhood factors shared by siblings there are also non-shared family and neighborhood influences that affect siblings differently. Non-twin siblings and dizygotic twins, for example, have only half of their genes in common.8 Moreover, even if they share the same family events (such as family disruption and income shocks), non-twin siblings experience these events at different points in their lives, so the events may affect them differently. Furthermore, parents can treat their children differently. To assess the magnitude of such differentiating influences, we compare the correlation in test scores of MZ twins, who have identical genes, with those of DZ twins and non-twin siblings, who share only half of their genes. We caution that differences in twin correlations between MZ and DZ twins may originate from non-genetic factors, for example because MZ twins could be affected by an interaction among themselves that has no counterpart in the whole population. Differences between DZ twin and non-twin sibling correlations can be interpreted as the effect of growing up at the same or at a different time and hence being exposed to an environment, including parental behavior, which is more or less similar.

We begin by estimating correlations in exam scores between twins and non-twin siblings using model (5), which allows the family random component to vary across different typologies of siblings. We do not report the lower bounds on the family influence, which can be easily computed by subtracting from each sibling correlation the adjusted neighborhood

influence (0.10 and 0.14 for Key Stages 2 and 4 respectively). The first panel in Table 3 shows twin correlations for all twins in the sample, not distinguishing between MZ and DZ twins at this point. The covariance is also reported for the reasons given above. The mixed gender twins are DZ twins by definition and have a considerably lower correlation in test scores than same gender twins. Our twin brother correlations can be compared to the recent results of Björklund et al. (2010) who, based on Swedish data, look at twin brother correlations in IQ. Their correlations in IQ are around 0.65 while ours are higher at roughly 0.75. This seems to indicate that family background matters more in the UK than in the Nordic welfare state.

The second and third panels in Table 3 show correlations for MZ and DZ twins at the end of primary school and at the end of compulsory schooling, derived using the method described in section 2. As expected, the correlations for MZ brothers and for MZ sisters are considerably higher than they are for DZ twins. We estimate the correlation of Key Stage 2 scores to be 0.91 (0.87) for MZ brothers (sisters), and the correlation in Key Stage 4 scores to be 0.92 (0.91) for MZ brothers (sisters). These are remarkably high correlations. When estimated on the alternative outcome at Key Stage 4, which includes all subjects chosen by pupils instead of concentrating only on the core subjects English, Mathematics and Science, the correlation is significantly lower for MZ twins (not displayed). This could suggest that our outcome variable comes quite close to measuring innate ability, whereas the all-subject score is a reflection of choices taken by MZ twins that may be more variable. Exposure to external influences over time and personality development, as well as deliberate differential treatment by parents, may lead to greater between-twin differentiation in choices than indicated by measures of innate ability.


Comparing DZ twins with non-twin siblings (panels 3 and 4 of Table 3), we find that DZ twin correlations are 7-8% higher than those of their non-twin siblings at Key Stage 2 and 4% at Key Stage 4. As they share the same proportion of genes (50%), these differences can only be explained by differing family and neighborhood environmental factors. In contrast to DZ twins, siblings born at different times are exposed to different family and community environments to the extent that these environments change over time.

V. SENSITIVITY ANALYSIS

Comparing simple sibling-pair correlations with correlations based on multilevel models. There are two main advantages of using multilevel models to estimate sibling correlations. The first is that we can consistently estimate correlations for unbalanced data (i.e. clustered data with clusters of different sizes). This is particularly relevant for the neighborhood correlation and less relevant in this application for the sibling correlation, because in our sample we observe mainly two-sibling families. The second is that we can produce formal tests for the equality of correlations between different typologies of siblings. This comes at the cost of imposing a normality assumption. Simple sibling-pair correlations do not impose this normality assumption and produce consistent estimates of the sibling correlation, at least when considering only two-sibling families. For this reason we compare the correlations for different typologies of siblings estimated by using simple sibling-pair correlations and the multilevel model (5) when considering only two-sibling families. We obtain sibling-pair correlations which are almost identical to the ones estimated using the mixed model.

Joint modeling of neighborhood and family effects. Throughout our analysis we estimate separate models to identify the neighborhood and family effects. To check whether this could potentially bias our results, we also estimate a two-level model where pupils are

clustered within families and families are clustered within neighborhoods. By using this two-level model we can compute the neighborhood correlation and the sibling correlation net of the effect of the neighborhood directly, which should be comparable to the lower bound on the family influence. We find sibling correlations which are only slightly lower than the lower bounds on the family influence estimated using models (2) and (3) (see Table 3). The correlations for pupils living in the same neighborhood are again very close to the ones estimated using model (3).

Alternative measures of educational outcomes at Key Stage 4. We use two alternative measures of educational outcomes at Key Stage 4: (1) the sum of the scores obtained in any of the GCSE subjects taken and (2) the sum of the eight best GCSE exam scores plus the scores obtained in English and Mathematics. Neighborhood and sibling correlations decrease by about 20% and 30% respectively when we use measure (1) rather than the sum of the scores obtained in English, Mathematics and Science. When using measure (2) we observe a less sharp reduction of the neighborhood and sibling correlations, of about 15% and 5% respectively. Apart from English, Mathematics and Science, the GCSE subjects taken by a pupil reflect their personal choice. Therefore the alternative outcomes are less comparable across pupils, and this explains the lower observed correlations.

Using a subsample of the two oldest siblings for each family. Our sample of all possible sibling pairs within each family could lead to an over-representation of large families. For this reason we check the extent to which our results change when focusing only on the two oldest siblings from each family. We find sibling correlations in test scores at Key Stages 2 and 4 of 0.56 and 0.61 respectively, which are just slightly above the correlations estimated using the full siblings sample.
This comparison is to some extent equivalent to the comparison of different weighting schemes as implemented by Solon et al. (2000). More

precisely, when considering our sample of all possible sibling pairs we use household weights which are increasing in the size of the household, whereas when considering the subsample of the two oldest siblings we assign equal weights to all households. The difference in our results between the two samples is similar in magnitude to the differences between results obtained by Solon et al. (2000) using their preferred weighting schemes.

Using the same sample at KS2 and KS4. We check whether our results are sensitive to the way we define the sample by retaining only those pupils in the sample that have non-missing outcome measures at both Key Stages 2 and 4. The results (not displayed) are similar to those presented in the paper.

All the sensitivity checks are available from the authors on request.
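The simple sibling-pair correlation used in the first sensitivity check can be illustrated for two-sibling families as below. The sketch enters each pair in both orders so that the result does not depend on which sibling is listed first; this double-entry convention and the function name are assumptions made for illustration, not a description of the exact computation in the paper.

```python
from statistics import mean


def sibling_pair_correlation(pairs):
    """Pearson correlation over sibling pairs, with each pair entered twice.

    pairs: list of (score_a, score_b), one tuple per two-sibling family.
    """
    xs = [a for a, b in pairs] + [b for a, b in pairs]
    ys = [b for a, b in pairs] + [a for a, b in pairs]
    mx = mean(xs)  # equals mean(ys) because both lists hold the same values
    cov = sum((x - mx) * (y - mx) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var
```

Because no distributional assumption is involved, this estimate offers a check on the normality assumption underlying the mixed model.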

VI. CONCLUSIONS

Our study confirms earlier research showing that growing up in the same family is much more important for explaining similarity in educational outcomes than growing up in the same neighborhood. Specifically, the sibling correlation, measuring the importance both of the shared family and neighborhood background, is computed to be 0.54 at age eleven and 0.61 at age sixteen. The proportion of variation in pupils’ exam scores explained by neighborhood factors is at most 0.10 and 0.14 for pupils aged eleven and sixteen respectively, and is higher in urban than in rural areas. In addition to an upper bound on the family effect we are able to derive a lower bound for the family effect which has a straightforward interpretation as the relative family effect. This relative family effect measures that part of the family effect arising from a family having characteristics which differ from those of others living in the same neighborhood. The estimates show that deviations of a family’s characteristics from


observed neighborhood mean family characteristics account for 44% of the deviation in pupils’ Key Stage 2 and 47% in pupils’ Key Stage 4 test scores. We go beyond previous research by exploring differences in sibling correlations by sibling type. Looking at sibling gender combinations we generally find that different gender siblings are less similar than sister and brother pairs respectively. By imposing a few minor assumptions we are able to distinguish MZ and DZ twin correlations although the twin types are not identified in our data. Not surprisingly, the correlations in educational outcomes are highest for MZ twins who have identical genes. They are about 0.9 for both MZ twin brothers and sisters at both Key Stages. DZ twins have a correlation in test scores around 0.6 both at the end of primary school and at the end of compulsory schooling. Differences between MZ and DZ twin correlations will arguably be mainly explained by differences in the proportion of genes shared by MZ twins (100%) and DZ twins (50%). Differences in the correlation between DZ twins and non-twin siblings growing up at different times (about 0.05 points at Key Stage 2 and less at Key Stage 4) should be mainly caused by differences in the family and neighborhood environment.


References ALTONJI, J. G. (1988). The effects of family background and school characteristics on education and labor market outcomes. Mimeograph: Northwestern University. BJÖRKLUND, A., ERIKSSON, K. H. and JÄNTTI, M. (2010). IQ and family background: Are associations strong or weak? The B.E. Journal of Economic Analysis and Policy (Contributions), 10(1), Article 2. ___, ___, ___, RAAUM O. and ÖSTERBACKA, E (2002). Brother correlations in earnings in Denmark, Finland, Norway and Sweden compared to the United States. Journal of Population Economics, 15, 757-772. ___ and JÄNTTI, M. (2009). Intergenerational income mobility and the role of family background. In W. Salverda, B. Nolan, and T. M. Smeeding (eds.), Oxford Handbook of Economic Inequality, Chapter 20. Oxford: Oxford University Press. ___, ___ and LINDQUIST, M. (2009). Family background and income and during the rise of the welfare state: Brother correlations in income for Swedish men born 1932-1968. Journal of Public Economics, 93, 671-680. ___, ___, RAAUM, O., ÖSTERBACKA, E. and ERIKSSON, T. (2004). Family structure and labour market success: The influence of siblings and birth order on the earnings of young adults in Finland, Norway and Sweden. In Miles Corak (ed.) Generational Income Mobility in North America and Europe. Cambridge: Cambridge University Press, 207-225. ___, ___ and SOLON, G. (2005). Influences of nature and nurture on earning variation: A report on a study of various sibling types in Sweden. In S. Bowles, H. Gintis, M. Osborne Groves (eds.), Unequal chances: Family background and economic success. Princeton: Princeton University Press.

33

BLACK, S., DEVEREUX, P.J. and, SALVANES, K.G. (2005). The more the merrier? The effect of family size and birth order on children’s education. Quarterly Journal of Economics, 120, 669-700. BURGESS, S., MCCONNELL, B., PROPPER, C. and WILSON, D. (2004). Girls rock, boys roll: An analysis of the age 14-16 gender gap in English schools. Scottish Journal of Political Economy, 51, 209-229. ERMISCH, J. and DEL BONO, E. (2012). Inequality in achievements during adolescence. In John Ermisch, Markus Jäntii, Timothy Smeeding and James Wilson (eds.), Inequality from childhood to adulthood: A cross-national perspective on the transmission of advantage. New York: Russell Sage, forthcoming. FEINSTEIN, L. (2003). Inequality in the early cognitive development of British children in the 1970 cohort. Economica, 79, 73-97. GOODMAN, A., GREGG, P. and WASHBROOK, E. (2011). Children’s educational attainment and the aspirations, attitudes and behaviours of parents and children through childhood in the UK. Longitudinal and Life Course Studies, 2, 1-18. GUO, G. and WANG, J. (2002). The mixed or multilevel model for behavior genetic analysis. Behavior Genetics, 32, 37-49. JENCKS, C. S., BARTLETT, S., CORCORAN, M., CROUSE, J., EAGLESFIELD, D., JACKSON, G., et al. (1979). Who gets ahead? The determinants of economic success in America. New York: Basic Books. KUNZ, J., PAGE, M.E. and SOLON, G. (2003). Are point-in-Time measures of neighborhood characteristics useful proxies for children’s long-run neighborhood environment? Economics Letters,79, 231–37.

34

LINDAHL, L. (2011). “A comparison of family and neighbourhood effects on grades, test scores, educational attainment and income – evidence from Sweden. Journal of Economic Inequality, 9, 207-226. MAZUMDER, B. (2008). Sibling similarities and economic inequality in the U.S. Journal of Population Economics, 21, 685-701. PAGE, M. E. and SOLON, G. (2003a). Correlations between brothers and neighboring boys in their adult earnings: The importance of being urban. Journal of Labor Economics, 21, 831–55. ___ and ___ (2003b). Correlations between sisters and neighboring girls in their subsequent income as adults. Journal of Applied Econometrics, 18, 545-562. PRICE, J. (2008). Parent-child quality time: Does birth order matter? Journal of Human Resources, 43, 240–265. RAAUM, O., SALVANES, K.-G. and SØRENSEN, E. (2003). The impact of a primary school reform on educational stratification: A Norwegian study of neighbour and school mate correlations. Swedish Economic Policy Review, 10, 143-169. ___, ___ and ___ (2006). The neighborhood is not what it used to be. Economic Journal, 116, 200-222. RABE, B. and TAYLOR, M. (2010). Residential mobility, quality of neighbourhood and life course events. Journal of the Royal Statistical Society, Series A (Statistics in Society), 173, 531-555. RABE-HESKETH, S., SKRONDAL, A. and GJESSING, H.K. (2008). Biometrical modeling of twin and family data using standard mixed model software. Biometrics, 64, 280-288. RASBASH, J., LECKIE, G., PILLINGER, R. and JENKINS, J. (2010). Children’s educational progress: partitioning family, school and area effects. Journal of the Royal Statistical Society Series A, 172, 657-682. 35

SOLON, G., CORCORAN, M., GORDON, R. and LAREN, D. (1991). A longitudinal analysis of sibling correlations in economic status. Journal of Human Resources, 26, 509-534.

SOLON, G. (1999). Intergenerational Mobility in the Labor Market. In O. Ashenfelter and D. Card (eds.) Handbook of Labor Economics, vol. 3, 1761-1800. Amsterdam: Elsevier.

SOLON, G., PAGE, M.E. and DUNCAN, G.J. (2000). Correlations between neighboring children in their subsequent educational attainment. Review of Economics and Statistics, 82, 383-392.

TAUBMAN, P. (1976). The determinants of earnings: Genetics, family, and other environments. A study of white male twins. American Economic Review, 66, 858-870.


Table 1: Sample description

                                                Full sample               Siblings sample
                                                Mean      Std. deviation  Mean      Std. deviation
Key stage 2 score, girls                        27.30     4.11            26.97     4.31
Key stage 2 score, boys                         26.94     4.35            26.72     4.48
Key stage 4 score, girls                        119.05    28.07           117.83    28.85
Key stage 4 score, boys                         115.67    28.49           114.91    29.05
Male                                            0.51                      0.51
twins                                           0.02                      0.10
number of school-age siblings in state schools  1.90      0.94            2.78      1.02
white British                                   0.83                      0.81
first language English                          0.91                      0.88
free school meal eligible                       0.12                      0.16
urban                                           0.82                      0.81
Pupils per neighborhood, KS2                    56.89     17.62           14.37     7.33
Pupils per neighborhood, KS4                    53.57     16.61           13.60     6.98
Number of observations                          1,698,373                 345,806

Notes: National Pupil Database, 2007-2009: Pupils taking GCSE or equivalent exams in 2007-2009. Non-missing cases of Key Stage 2 score 1,641,612/332,780; non-missing cases of Key Stage 4 score 1,554,861/315,217 (full/siblings sample).


Table 2: Sibling and neighbor correlations at Key Stages 2 and 4

                                              Key Stage 2           Key Stage 4
                                              Correlation  (SE)     Correlation  (SE)

Neighbor correlation (upper bound on the neighbourhood influence)
Neighbors, model (3)                          0.102        (0.001)  0.145        (0.001)
Neighbors, adjusted, model (3)                0.096        (0.001)  0.143        (0.001)
Neighbors, urban, model (4)                   0.107        (0.001)  0.151        (0.001)
Neighbors, urban adjusted, model (4)          0.099        (0.001)  0.148        (0.001)
Neighbors, rural, model (4)                   0.078        (0.002)  0.119        (0.002)
Neighbors, rural adjusted, model (4)          0.077        (0.002)  0.118        (0.002)
N (neighbors)                                 1,641,612             1,554,861

Sibling correlation (upper bound on the family influence)
Siblings, model (2)                           0.540        (0.002)  0.610        (0.002)
Brothers, model (5)                           0.544        (0.003)  0.618        (0.003)
Sisters, model (5)                            0.566        (0.003)  0.642        (0.003)
Mixed gender siblings, model (5)              0.520        (0.003)  0.581        (0.002)
N (sibling pairs)                             332,780               315,217

Difference between sibling and neighbor correlation (lower bound on the family influence)
Relative family effect
Siblings, models (2) and (3)                  0.444        (0.002)  0.467        (0.002)
Brothers, models (5) and (3)                  0.448        (0.003)  0.475        (0.003)
Sisters, models (5) and (3)                   0.470        (0.003)  0.499        (0.003)
Mixed gender siblings, models (5) and (3)     0.424        (0.003)  0.438        (0.002)

Non-linear Wald tests
                                              Coefficient  p-value  Coefficient  p-value
Ho urbanicity: CorrUN-CorrRN=0, model (3)     0.037        0.000    0.034        0.000
Ho gender: CorrSFF-CorrSMM=0, model (5)       0.023        0.004    0.025        0.004
Ho gender: CorrSFM-CorrSFF=0, model (5)       -0.046       0.004    -0.061       0.004
Ho gender: CorrSFM-CorrSMM=0, model (5)       -0.023       0.004    -0.036       0.004

Notes: National Pupil Database, 2007-2009: Pupils taking GCSE exams in 2007-2009. Neighbor estimates at the level of Lower Layer Super Output Areas (LSOA), adjusted for: white British ethnicity, first language English, low income group, number of school-age siblings in state schools and its square. Relative family effect derived using adjusted neighbour correlation. Standard errors and Wald tests calculated using the delta method.
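The bounds on the family influence reported in Table 2 can be checked with simple arithmetic: the sibling correlation is the upper bound, and the sibling correlation minus the adjusted neighbor correlation is the lower bound. The short Python sketch below recomputes the "relative family effect" rows from the published correlations. The variable and function names are illustrative assumptions rather than anything used in the original analysis, and the delta-method standard errors are not reproduced.

```python
# Correlations copied from Table 2, as (Key Stage 2, Key Stage 4) pairs.
sibling_corr = {
    "Siblings": (0.540, 0.610),
    "Brothers": (0.544, 0.618),
    "Sisters": (0.566, 0.642),
    "Mixed gender siblings": (0.520, 0.581),
}
# "Neighbors, adjusted, model (3)" row of Table 2.
adjusted_neighbor_corr = (0.096, 0.143)


def family_bounds(sibling, neighbor):
    """Return (lower, upper) bounds on the family influence.

    Hypothetical helper: upper bound = sibling correlation,
    lower bound = sibling correlation minus adjusted neighbor correlation.
    """
    return round(sibling - neighbor, 3), sibling


bounds = {
    name: tuple(
        family_bounds(s, n) for s, n in zip(corrs, adjusted_neighbor_corr)
    )
    for name, corrs in sibling_corr.items()
}

for name, (ks2, ks4) in bounds.items():
    print(
        f"{name}: KS2 {ks2[0]:.3f} to {ks2[1]:.3f}, "
        f"KS4 {ks4[0]:.3f} to {ks4[1]:.3f}"
    )
```

For all siblings this gives a range of 0.444 to 0.540 at Key Stage 2 and 0.467 to 0.610 at Key Stage 4, matching the table.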


Table 3: Correlations in test scores by sibling type at Key Stages 2 and 4

                          Key Stage 2 score           Key Stage 4 score
                          Corr    (SE)      Cov       Corr    (SE)      Cov

All twins
Twin brothers             0.745   (0.006)   0.846     0.776   (0.005)   0.815
Twin sisters              0.742   (0.006)   0.742     0.787   (0.005)   0.782
Mixed gender twins        0.555   (0.009)   0.580     0.600   (0.009)   0.607

Monozygotic twins
MZ brothers               0.914   (0.012)   1.085     0.921   (0.012)   0.995
MZ sisters                0.874   (0.012)   0.868     0.908   (0.010)   0.887

Dizygotic twins
DZ brothers               0.550   (0.010)   0.595     0.615   (0.010)   0.626
DZ sisters                0.578   (0.011)   0.584     0.641   (0.010)   0.652
DZ mixed gender twins     0.555   (0.009)   0.580     0.600   (0.009)   0.607

Non-twin siblings
Brothers                  0.513   (0.004)   0.570     0.593   (0.004)   0.630
Sisters                   0.538   (0.004)   0.559     0.618   (0.003)   0.656
Mixed gender              0.516   (0.003)   0.555     0.578   (0.003)   0.612

Non-linear Wald tests
                          Coefficient   p-value       Coefficient   p-value
Ho: CorrMZFF-CorrMZMM=0   0.040         0.013         0.013         0.012
Ho: CorrNTFF-CorrNTMM=0   0.025         0.005         0.025         0.005
Ho: CorrNTFF-CorrNTFM=0   0.022         0.005         0.040         0.004
Ho: CorrNTMM-CorrNTFM=0   0.003         0.005         0.015         0.004

Notes: National Pupil Database, 2007-2009: Pupils taking GCSE or equivalent exams in 2007-2009. Standard errors and Wald tests calculated using the delta method. Wald tests for equality of correlations between DZ twins of different gender combinations not shown as the differences are by assumption identical to those of non-twin siblings, see section 2.


ACKNOWLEDGEMENTS

This work was supported by the Economic and Social Research Council (ESRC) through the Research Centre on Micro-Social Change (MiSoC) (award no. RES-518-28-001). We thank the Department for Education for making available data from the National Pupil Database. We are grateful to the editor and three anonymous referees for useful comments and suggestions.

NOTES

1. This is only true to the extent that the observed neighborhood is a good proxy for the lifetime neighborhood. According to Kunz et al. (2003) this is often satisfied.

2. Another method that has been used to produce a tighter upper bound on the neighborhood influence is to estimate the correlation between pupils living in the same neighborhood net of observed family characteristics, CorrN,NET, by adding observed family characteristics into a mixed model (see model 3). However, this procedure does not necessarily produce a lower bound on the family influence. We therefore use the procedure suggested by Altonji (1988).

3. Notice that the restricted maximum likelihood estimate of σ_c² represents the covariance between all pairs of children living in the same neighborhood, including pairs of siblings. Therefore σ_c² could in part capture the family effect. This implies that σ_c²/(σ_c² + σ_ν²) is probably an upper bound on V(β'Z_c)/V(y_cfs) which is less tight than Cov(y_cfs, y_cf's')/V(y_cfs) computed using only pairs of unrelated children living in the same neighborhood. Solon et al. (2000) compute the neighborhood correlation by excluding pairs of siblings. More precisely, they compute the covariance between y_cfs for all possible pairs of unrelated children within each neighborhood and then combine these within-neighborhood covariances using different weighting methods to take account of the unbalanced structure of the data. When the data are not extremely unbalanced these different weighting methods produce similar results (see Solon et al. 2000 and Raaum et al. 2006). However, this result does not hold in general (for example, data used to estimate school mates’ correlations typically have a lot of variation in the number of children across schools) and it is difficult to choose among different weighting methods which are, after all, arbitrary. For this reason we prefer to adopt mixed models for the estimation of our correlations.

4. Even after extensive data cleaning there will be false positives in the data. These are, for example, pupils living at the same house number within a postcode but on different streets; pupils living in different flats/blocks at the same house number where this could not be identified; pupils living in boarding houses. Likewise, we expect to have false negatives, i.e. cases where siblings live at the same address but we have not been able to identify this. This may occur through data input errors (typos), omissions or entering more than one item of address information in a field where the correct information could not be extracted. However, in the vast majority of cases the address information and hence the matching of siblings was unambiguous.

5. To investigate the quality of our sibling and twin definition we have checked the extent to which they share first language spoken at home, Free School Meal (FSM) status and ethnicity. Neither FSM status nor first language spoken at home is time invariant, so that they can be expected to vary between siblings born at different times, and even in some twin pairs the parents may register FSM only for the twin who likes to take school dinners. Ethnicity may vary between half siblings. Results show that 99% of twins share, respectively, FSM status and first language spoken at home. Among non-twin siblings 97% share ethnicity and first language spoken at home and 92% share FSM status. We take this as an indication that our siblings are well matched.

6. There is an ongoing debate between policy-makers and researchers in the UK about which score best measures educational attainment. Standard measures are the sum of the points obtained in each GCSE or equivalent exam; the sum of the eight best exams; and more recently, the sum of the eight best exams plus the points achieved in English and Mathematics. In our sensitivity analysis, section 6, we use the first and the third measure for comparison.

7. The larger number of missing outcomes in Key Stage 4 is partly a result of concentrating on core GCSE subjects as an outcome, which excludes pupils choosing to take vocational or occupational exams.

8. In cases where parents mate assortatively, the proportion of shared genes may be higher than 50%.
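As a rough illustration of the intraclass correlation discussed in note 3, the Python sketch below simulates pupils nested in neighborhoods and recovers the ratio σ_c²/(σ_c² + σ_ν²) from the data. It makes several assumptions that are not in the paper: the data are simulated, every neighborhood has the same number of pupils, and the variance components are estimated with a one-way ANOVA (method of moments) estimator rather than the restricted maximum likelihood mixed models used for the reported estimates. All function names and parameter values are hypothetical.

```python
import random
from statistics import fmean


def simulate(n_groups, group_size, var_between, var_within, seed=0):
    """Simulate scores as a shared neighborhood effect plus individual noise."""
    rng = random.Random(seed)
    groups = []
    for _ in range(n_groups):
        shared = rng.gauss(0.0, var_between ** 0.5)
        groups.append(
            [shared + rng.gauss(0.0, var_within ** 0.5) for _ in range(group_size)]
        )
    return groups


def anova_icc(groups):
    """One-way ANOVA estimate of the intraclass correlation (balanced groups only)."""
    k = len(groups)
    n = len(groups[0])
    means = [fmean(g) for g in groups]
    grand = fmean(means)  # equals the overall mean because groups are balanced
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    ) / (k * (n - 1))
    # Moment estimate of the between-group variance, truncated at zero.
    var_between = max((ms_between - ms_within) / n, 0.0)
    return var_between / (var_between + ms_within)


# Assumed variance components giving a true intraclass correlation of 0.1.
groups = simulate(n_groups=2000, group_size=20, var_between=0.1, var_within=0.9)
icc = anova_icc(groups)
print(f"estimated intraclass correlation: {icc:.3f}")
```

With balanced groups the ANOVA and mixed-model estimates are typically close. The argument in note 3 concerns unbalanced data, where the choice of weights in pairwise methods matters and a mixed model avoids having to choose them.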
