DEMOGRAPHIC RESEARCH
VOLUME 31, ARTICLE 48, PAGES 1431–1454
PUBLISHED 12 DECEMBER 2014
http://www.demographic-research.org/Volumes/Vol31/48/
DOI: 10.4054/DemRes.2014.31.48

Research Article

The declining effect of sibling size on children’s education in Costa Rica

Jing Li
William H. Dow
Luis Rosero-Bixby

© 2014 Jing Li, William H. Dow & Luis Rosero-Bixby. This open-access work is published under the terms of the Creative Commons Attribution NonCommercial License 2.0 Germany, which permits use, reproduction & distribution in any medium for non-commercial purposes, provided the original author(s) and source are given credit. See http://creativecommons.org/licenses/by-nc/2.0/de/

Table of Contents
1 Introduction
2 Fertility transition and education attainment in Costa Rica
3 Data and methods
3.1 Data
3.2 Measures
3.3 Analytic strategy
4 Results
5 Discussion and conclusion
References

Abstract

BACKGROUND
Costa Rica experienced a dramatic fertility decline in the 1960s and 1970s. The same period saw substantial improvement in children’s educational attainment in Costa Rica. This correlation is consistent with household-level quantity-quality tradeoffs, but prior research on quantity-quality tradeoff magnitudes is mixed, and little research has estimated quantity-quality tradeoff behaviors in Latin America.

OBJECTIVE
This study explores one dimension of the potential demographic dividend from the fertility decline: the extent to which it was accompanied by quantity-quality tradeoffs leading to higher educational attainment.
Specifically, we provide the first estimate of quantity-quality tradeoffs in Costa Rica, analyzing the increase in secondary school attendance among Costa Rican children as the number of siblings decreases. Furthermore, we advance the literature by exploring how that tradeoff has changed over time.

METHODS
We use 1984 and 2000 Costa Rican census data as well as survey data from the Costa Rican Longevity and Healthy Aging Study (CRELES). To address endogenous family size, the analysis uses an instrumental variable strategy based on the gender of the first two children to identify the causal relationship between number of siblings and children’s education.

RESULTS
We find that, among our earlier cohorts, having fewer siblings is associated with a significantly higher probability of having attended at least one year of secondary school, particularly among girls. The effect is stronger after we account for the endogeneity of number of children born by the mother. For birth cohorts after 1980 this relationship largely disappears.

CONCLUSIONS
This study provides strong evidence for a declining quantity-quality (Q-Q) tradeoff in Costa Rica. This result suggests one potential explanation for the heterogeneous findings in prior studies elsewhere, but more work will be required to understand why such tradeoffs might vary across time and context.

Author affiliations: Jing Li, University of California at Berkeley, U.S.A. (e-mail: jingli86@berkeley.edu); William H. Dow, University of California at Berkeley, U.S.A.; Luis Rosero-Bixby, University of California at Berkeley, U.S.A. and University of Costa Rica.

1. Introduction

There is a large literature examining the relationship between family size and children’s educational outcomes in various contexts.
Many of these studies draw upon the quantity-quality (Q-Q) trade-off model (Becker and Lewis 1973), which suggests that a decreased number of children in the family permits more resources to be allocated to each child, which in turn increases child quality. Early empirical research supporting this theory generally finds a negative relationship between sibship size and children’s educational attainment across different countries and cultural contexts (Blake 1981; Hanushek 1992; Knodel and Wongsith 1991; Rosenzweig and Wolpin 1980).

However, more recent literature on this topic has yielded mixed results. Studies have pointed out that since parents jointly determine both child quantity and quality, i.e., how many children to have and how much to invest in them, these two variables are both affected by unobserved parental preferences and other family characteristics. As a result, an association between family size and children’s education outcomes does not establish a causal relationship (Angrist et al. 2010). Accordingly, some studies that use more sophisticated instrumental variable (IV) approaches to isolate exogenous variation in family size have found little to no relationship between family size and children’s education (Angrist et al. 2010; Black et al. 2005; Caceres 2004). For instance, Black et al. (2005) use twin births as an instrument to examine the effects of family size on children’s education in Norway and find the effect to be negligible. This is also the case when they control for birth order. Angrist et al. (2010) also use twin births as well as sibling gender composition as instruments for family size in Israel, and find no evidence of a Q-Q trade-off.
Perhaps unsurprisingly, studies that find no evidence of such a tradeoff almost all use data from developed countries with more comprehensive welfare systems, whereas the Q-Q tradeoff could be more prominent in developing countries where social resources for education are more limited (Li et al. 2008). Consistent with this argument, recent studies that use both data from developing countries and the IV approach tend to confirm the negative relationship between sibling size and children’s schooling. Utilizing the cultural preference for sons in South Korea, Lee (2008) instruments sibling size by sex of the first child and finds an adverse effect of sibling size on per-child investment in education. Jensen (2005) also uses the sex of the first two births as instruments for number of siblings in India and shows that the number of siblings helps explain gender inequality in education, as girls tend to have more siblings than boys because of son preference. In addition, Li et al. (2008) instrument family size by twin birth in examining the effect of family size on education attainment in China. Again, they find a negative effect of family size on children’s education, which is more evident in rural China where there is a poor public education system.

While previous studies failed to reach consensus on the nature or magnitude of the Q-Q tradeoff, few studies have investigated this issue in the Latin American context. Our study aims to fill this gap by examining the relationship between family size and children’s education in Costa Rica. There are at least three justifications for the significance of our endeavor in investigating the Q-Q tradeoff in Costa Rica.
First, Costa Rica as a developing country represents a distinct cultural and social context compared to other developing countries previously examined in this literature, many of which are in Asia (e.g., Jensen 2005; Lee 2008; Li et al. 2008). To our knowledge, only two other studies have explicitly examined the effect of family size on children’s education in the Latin American context. Using data from Peru, Patrinos and Psacharopoulos (1997) find sibling size to be a significant predictor of age-grade distortion and employment among children. Unfortunately, there is no attempt to account for the endogeneity of family size, which renders their results less convincing. More recently, Marteleto and de Souza (2012) examine the causal effect of family size on children’s education in Brazil over a 30-year period. Using a twin-birth instrumental strategy, they find an effect of family size on education that is not uniform throughout a period of significant social, economic, and demographic change. Their study highlights the need for more research on this important issue in Latin America in general, as well as on the potential heterogeneity in the effect of family size on education.

Second, the uniqueness of Costa Rica also lies in its rapid fertility decline before 1980, which is unprecedented in the developing world. Smaller family size has been postulated as a potential mechanism through which fertility decline brings about a demographic dividend (Lee and Mason 2010). However, no study has attempted to examine or quantify this relationship in Costa Rica.

Third, we have at our disposal unique datasets that allow us to investigate the Q-Q tradeoff at different time points during and after the fertility transition in Costa Rica.
Heterogeneity in Q-Q tradeoff is a concept that has been repeatedly alluded to in previous literature but is not well understood, despite the rich set of evidence produced on this topic. Several studies have documented changing relationships between family size and children’s education across time and cohorts in various contexts. For instance, using data from Indonesia and miscarriage as the instrument, Maralani (2008) shows that the association between family size and children’s education differs depending on whether the area is urban or rural and on the cohort examined: in urban areas there is a positive association between family size and education for older cohorts but a negative association for younger cohorts, whereas no significant association is found for any cohorts in the rural area. Further, Sudha (1997) examines differential association between family size and education across different ethnic groups in Malaysia and concludes that social conditions and policies external to families play an important role in determining the nature of the relationship examined. In addition, Eloundou-Enyegue and Williams (2006) also confirm the heterogeneous relationship between sibsize and schooling in Sub-Saharan African settings as a result of differences in socioeconomic context by ruling out several other competing explanations. Given the meaningful differences in demographic composition in terms of family size at different time points of fertility transition, one may expect Q-Q tradeoff to play out differently across cohorts and time. This is indeed what we find. Using an instrumental variable approach, we find consistent results across data sources showing that number of siblings has a sizable impact on secondary education attainment among the first and second births occurring in the family before 1980, especially for girls. By contrast, the effect largely disappears for cohorts born after 1980. 
The finding that the magnitude of the effect of sibling size on education differs by time period in our particular setting provides another explanation for the seemingly incongruent findings in the existing literature on Q-Q tradeoff.

2. Fertility transition and education attainment in Costa Rica

Costa Rica experienced one of the earliest and fastest fertility transitions in the developing world. The total fertility rate (TFR) fell from 7.3 to 3.7 births per woman between 1960 and 1976 (Figure 1). Only the fertility shifts in Singapore and Taiwan were faster than that observed in Costa Rica in this period (Coale 1983). After this extraordinary period, TFR continued declining, albeit at a slower pace, reaching below-replacement levels in 2001. The most recent estimate shows a TFR of 1.83 in 2010, which is lower than that in the United States (1.9 births), and in Latin America ranks after Cuba, which has a TFR of 1.7 births (PRB 2012).

Figure 1: Total fertility rate, cohort, and sibship size, and proportion with some secondary education, Costa Rican cohorts born 1950‒1996
Cohort size (births) and TFR source: Web page of the Central American Population Center, http://ccp.ucr.ac.cr/observa (accessed on September 3, 2013).
Proportion with secondary education source: 2011 census, population aged 14 and older by birth year, micro-data available at http://censos.ccp.ucr.ac.cr (accessed on September 3, 2013).
Sibship size: estimated with census data as explained in text.

Period TFR is just a theoretical demographic construct that one should not expect to be immediately related to education attainment. Two mechanisms are postulated as transmission belts from fertility decline to children’s education: (1) cohort size and (2) family size, or more precisely sibship size, which is family size from the perspective of children (Lam and Marteleto 2008).
Reduced cohort size would allow public education programs to broaden coverage and to improve educational quality by reducing the number of students per classroom. In a broader sense, the reduced size of younger cohorts alters the age structure of the population, including its dependency ratio, which may have multiple effects on the economy (Kelley and Schmidt 2005). Reduced family size would allow parents to allocate a relatively larger amount of resources to the education of each child. Higher investment in human capital resulting from fertility decline is considered an important component of the so-called demographic dividend (Lee and Mason 2010).

As shown in Figure 1, cohort and sibship size trends do not correspond exactly to TFR trends. The fertility decline ended the baby boom that was taking place in Costa Rica in the 1950s, and resulted in a ‘baby bust’ from 1963 to 1973. From 1975 to 1985 a second baby boom took place as an echo of the boom of the 1950s, because of the rapid growth of the population of reproductive age. In turn, sibship size[4] is substantially larger than the TFR in the birth year of each cohort. It starts to decline for children born several years before the fall in TFR, and it declines at a more regular pace than TFR.

Elementary education (six grades) has been mandatory and free by law in Costa Rica since 1869 (Salazar 2003). More than 90% of children complete elementary school. Studies have shown that the key factor in school attainment in Costa Rica is the dropout rate at the first year of secondary school, i.e., at ages 13 to 15 years (Programa Estado de la Nación 2005). The proportion with some secondary education is thus an important indicator of education attainment in this country.
This proportion, according to census data, has increased from 40% for those born in the 1950s to 80% for those born in the 1990s (Figure 1). This progress to some extent mirrors the curve showing the decline in sibship size, except for cohorts born in the 1960s. The educational progress of the 1960s cohorts was interrupted by the severe economic crisis that occurred in Costa Rica in the early 1980s, which made families put their children to work and reduced government expenditure on education.

Understanding the factors that allow adolescents to continue in school after sixth grade is central to policies aimed at improving education in Costa Rica. Several studies of this topic have singled out, in addition to socioeconomic constraints, low motivation to attend school derived from lack of family support, often coupled with low education of parents and deficiencies in the educational system (Programa Estado de la Nación 2005). No study in this area, however, discusses large family size as an obstacle to educational attainment or singles out the fertility transition as a contributing factor to the improvement in the level of education in younger cohorts. We intend to address this gap by assessing the effect of Costa Rica’s rapid fertility transition on the improvement in children’s education, focusing on the sibship effect. Documenting the contribution of fertility decline to improvement in education is important both for understanding this particular mechanism of the demographic dividend and for sound education policymaking.

[4] Sibship size in Figure 1 is an estimate for cohorts at the age of 14, which is the central age when families make the decision about dropping out of secondary school. We estimated this indicator with census data using the mean and variance of surviving children of women aged 41 years for children born in 1950 to 1972 and aged 40 years for the cohorts born in 1973 to 1997. We obtained yearly estimates using the following approximate relationship adapted from Lam and Marteleto (2013), which, in turn, is based on an identity proposed by Preston (1976): Sc(a) = Sw(a+m) + [Vw(a+m) / Sw(a+m)], where Sc is family size from the child's perspective, Sw is family size from the woman's perspective, Vw is the variance in Sw, a is the age of children, and m is the mean fertility age: 27 years in 1950−1972 and 26 years in 1973−1997. Data source of Sw and Vw: 1973 census for cohorts 1950−58, 1984 census for 1959−70, 2000 census for 1971−85, and 2011 census for 1986−97, online microdata at http://censos.ccp.ucr.ac.cr

3. Data and methods

3.1 Data

We use multiple data sources to examine the relationship between sibling number and children’s education in Costa Rica. The first dataset comes from the Costa Rican Longevity and Healthy Aging Study (CRELES, or Costa Rica Estudio de Longevidad y Envejecimiento Saludable). It is a set of nationally representative longitudinal surveys of health and life course experiences of older Costa Ricans (Berkeley Population Center 2012). The CRELES data contain detailed demographic information on the elderly as well as their spouses and children, and thus are well suited to the purpose of this study. CRELES currently consists of two birth cohorts (pre-1945 and 1945−1955) and multiple waves in each cohort. In order to ensure adequate sample size and to cover the entire period of fertility decline we use data from the first two waves of the pre-1945 cohort, collected in 2005 and 2007 respectively, combined with data from the first wave of the 1945−1955 cohort, which was collected in 2010.
Instead of pooling the two waves of pre-1945 data we draw different variables from each wave, since the waves collect different information.

We derive our analytical sample of mothers and children mainly by applying the following restrictions to the CRELES data. First, we keep only families with at least two children in which all children have the same biological mother, who is identified in the survey. This restriction is required by the instrumental variable approach we adopt. We use the gender composition of the mother’s first two births to instrument for number of siblings; therefore we need to be able to identify the sex of the two eldest children in the family. Second, we require that mothers were born between 1930 and 1960. This age-range limit ensures that the mothers in the sample gave birth during the period of fertility decline. Third, the key variables, including education, age, etc. of mothers and children, need to be non-missing. Fourth, we restrict the sample to children who are of age 14 or above. The restriction allows us to examine the effect on secondary school attendance for age-appropriate children, since children normally start secondary school at age 13 in Costa Rica. Finally, we keep only the first and second births after applying the previous restrictions, as in theory they constitute the relevant treatment and control groups to which the IV estimates apply. We explain this point in more detail in Section 3.3. After applying the above restrictions, the final analytical sample contains 3,785 children of 1,423 mothers.

One concern with the CRELES sample is that it may not provide sufficient power to detect the relationship we propose. Although the sample appears to contain a relatively large number of children, it has fewer than 1,500 mothers (families), which is not a particularly large sample size.
This may be an issue, as our key independent variable is sibling size, which only varies by family. This is a significant limitation, given that previous studies examining the Q-Q tradeoff usually have a large sample size at their disposal, which allows for precise estimation of coefficients of interest.

We therefore perform our statistical analysis on an additional dataset derived from a 10% random sample of the 1984 Costa Rica census data. Census data were collected by the National Institute of Statistics and Censuses of Costa Rica and are available from the University of Minnesota Population Center’s (2014) IPUMS international web dissemination system. However, a major limitation of census data for the purpose of this study is that they only identify mothers of those children who lived in the same household as their mothers at the time of census. Hence we further restrict our sample to only families in which all children were living in the same household as their mother. We also require that children in the analytical sample be between age 14 and 20. The upper age limit helps minimize the possible bias from requiring that all children live in the same household as their mother, since older children were more likely to live away from their parents. After applying all restrictions, our analytical sample from the 1984 census contains 5,868 children of 3,779 mothers.[5]

The restriction that all children live in the same household as their mother could yield biased results if the effect of sibling size operates differently on children who stay in the household before age 20 versus those who leave home before reaching 20. In the results section we demonstrate that this restriction does not yield significant bias by comparing the coefficient estimates from census data to those from the CRELES data.
However, another direct consequence for our analytical sample of this data limitation and resulting age limitation is that we are unable to examine outcomes that occur after age 14, such as college attendance, total number of years of education, or marital outcomes. While we understand that this is a limitation of our data and study, it is a tradeoff we have to make in order to make use of the multiple data sources at hand to examine possible heterogeneity of Q-Q tradeoff, a point we immediately turn to below.

[5] The mean number of children per mother in the census sample is lower than in the CRELES sample because of the restriction that all children live in the same household as the mother.

In order to explore the potentially heterogeneous effect of sibling size on children’s education across time periods, we bring in a third dataset from a 10% sample of the 2000 Costa Rica census. The data structure and variables are identical to the 1984 census, and we apply the same restrictions as for the 1984 census to yield the 2000 analytical sample. The final 2000 sample contains 14,718 children and 10,051 mothers. This sample size is over twice the size of the 1984 sample, mainly for two reasons: first, the total number of observations in the 10% census sample before any restrictions in 2000 is about 1.6 times that of the 1984 census; second, an improvement in educational attainment over time means that more children under age 20 are living with families while in school.

3.2 Measures

In this section we focus on the measures of key variables using the CRELES data. The measures using census data are similar.

Our dependent variable is whether the child has attended at least one year of secondary school. The CRELES survey asks the elderly respondent about the highest level of schooling and the number of years at last level of schooling of each of her children.
We code the dependent variable as 1 if the child has attended at least one year of secondary school, and 0 otherwise. We use secondary school attendance as the dependent variable instead of other possible measures of education because continuing education after completing elementary school represents a key decision made by Costa Rican families when children enter their teens. In addition, more than 40% of all children in both samples did not attend any secondary school, generating sufficient variation to examine the effects of fertility on children’s education.

Our key independent variable of interest is the number of siblings the child has. The CRELES survey asks all elderly respondents how many living children they have. We subtract 1 from this number to obtain the number of siblings for each child, after matching the reported number of living children with the total number of children present in the survey data for each family.

We control for several family-level characteristics that potentially confound the effect of the number of siblings on children’s secondary school attendance, including gender, birth order (whether the child is the second birth), mother’s years of schooling, mother’s age at first birth, an index of family wealth, and whether the child has any sisters in the family. Butcher and Case (1994) find that women’s education choices are systematically affected by whether they were raised with any sisters. Since we use the gender composition of the first two born children as instruments for mothers’ fertility, controlling for whether the child has any sisters allows us to account for one potential mechanism by which the exclusion restriction could be violated. We illustrate this point in more detail in the next section.
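The variable construction described in this section can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the `Child` record, its field names, and the `build_measures` helper are hypothetical, and the wealth index, mother's schooling, and mother's age at first birth are omitted. The two instrument indicators follow the gender composition of the first two births used in the analysis.

```python
from dataclasses import dataclass


@dataclass
class Child:
    sex: str              # "M" or "F"
    birth_order: int      # 1 = first birth
    secondary_years: int  # hypothetical field: years of secondary school attended


def build_measures(children, living_children):
    """Return one record per first- or second-born child of a single mother."""
    ordered = sorted(children, key=lambda c: c.birth_order)
    first_two = [c.sex for c in ordered[:2]]
    records = []
    for child in ordered[:2]:  # the analysis keeps first and second births only
        others = [o for o in ordered if o is not child]
        records.append({
            # 1 if the child attended at least one year of secondary school
            "any_secondary": int(child.secondary_years >= 1),
            # reported living children minus the child itself
            "n_siblings": living_children - 1,
            "female": int(child.sex == "F"),
            "second_birth": int(child.birth_order == 2),
            "any_sisters": int(any(o.sex == "F" for o in others)),
            # instruments: gender composition of the first two births
            "first_two_boys": int(first_two == ["M", "M"]),
            "first_two_girls": int(first_two == ["F", "F"]),
        })
    return records


# Hypothetical family: two daughters followed by a son.
family = [Child("F", 1, 3), Child("F", 2, 0), Child("M", 3, 5)]
for record in build_measures(family, living_children=3):
    print(record)
```

Because the number of siblings and the two instrument indicators are identical for every child of the same mother, they vary only across families, which is why the number of mothers rather than the number of children drives statistical power.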
We also include dummies of child age and the canton where the parent has lived the longest[6], which absorb any year-specific and region-invariant as well as any canton-specific and time-invariant confounders. Therefore we use only within-canton and within-cohort variation to identify the effect of fertility on education.

3.3 Analytic strategy

Our main analytical strategy is a two-stage least squares (2SLS) regression model. We use the gender composition of the mother’s first two births as instruments for estimating the effect of having an additional sibling on the probability of the child completing at least one year of secondary school. Gender composition has been used as the instrument in previous studies (Jensen 2005; Lee 2008; Qian 2004) to examine the effect of sibling size on children’s education, since it is plausibly exogenously assigned and cannot be easily manipulated. The explanatory power of these instruments depends on the extent to which gender composition of the elder children alters fertility decisions because of parental preference for having a boy or a girl. As shown by several surveys, starting with the 1976 World Fertility Survey, Costa Ricans have no gender preferences regarding their children, with the exception of a preference for having balanced families with children of both sexes (DGEC and WFS 1978).

A priori, there is no obvious theory as to the number of children on which gender composition should be computed. However, an examination of Figure 2 indicates that the most common number of children per family is three, and this is the case for both subsamples of families in our data: children born before 1980 and children born after 1980 (graphs not shown). Thus, the key fertility decision for most families is either the choice between having two or three children or the choice between having three or four children.
This gives us two options for the instruments, computing the gender composition for either the first two births or the first three births. Separate analyses comparing the two sets of instruments suggest that the first-stage F-statistics are much weaker with the instrument of gender composition of the first three births. In addition, the endogeneity in the three-birth instrument may be more of an issue if selection on gender composition occurs between the second and the third birth, thereby justifying our choice of using gender composition of the first two births in this case. We operationalize gender composition as a vector of two binary variables, with the first one indicating whether the first two births in the family are both boys and the second one indicating whether the first two births are both girls.

[6] Ideally we would like to control for dummies of the child’s birth canton, but it is not directly available in the survey. In the analysis using census data we control for dummies of the child’s canton of residence five years earlier, i.e., in 1979 and 1995 respectively.

Figure 2: Mean secondary school attendance and frequency of families by number of siblings, CRELES sample

Using gender composition as instruments in studying Q-Q tradeoff is subject to several limitations, according to existing literature (e.g., Aslund and Grönqvist 2010; Schultz 2007). There are two main critiques of this instrument: first, the gender of a child can be affected by mother’s nutrition, as lower nutrition intake is found to be associated with a lower probability of male births (Mathews et al. 2008; Almond and Mazumder 2011). Second, gender composition of siblings may affect child outcomes through mechanisms other than sibling size (Aslund and Grönqvist 2010).

Malnutrition due to fasting in Ramadan, as examined in Almond and Mazumder (2011), is not a concern in the Costa Rican context.
More generally, maternal malnutrition because of poverty or lack of food is a non-issue in Costa Rica (Mata 1983). Moreover, the small effect size found in Mathews et al. (2008) suggests that any confounding effect is unlikely to be significant. Unfortunately, the data required to directly rule out any significant association between maternal malnutrition and child gender are extremely difficult to come by, as demonstrated in Mathews et al. (2008). We therefore perform an indirect validity test by examining the gender of the first and second born by family wealth quintile using census data, as one would expect maternal nutrition status to be correlated with family wealth. We do not find the gender of the first two births to be significantly different by family wealth quintile, using either the 1984 or the 2000 census.

On the other hand, one might be more concerned about alternative mechanisms through which sibling gender composition affects children’s education other than the mother’s fertility decision. As mentioned before, one such mechanism may be the reference group effect proposed by Butcher and Case (1994). They find that women raised with only brothers have received on average significantly more education than women raised with any sisters, controlling for household size. By contrast, being raised with any sisters does not have any significant impact on men’s education. Their findings are consistent with the reference group model, which suggests that the presence of a second daughter in the household changes the reference group for the first, as parents with only one daughter may measure her achievement on the same scale as their sons’ achievement and may provide her with an equal share of the household’s educational resources as her brothers (Butcher and Case 1994).
Thus gender composition could affect girls’ education through parental expectations of daughters and the allocation of family resources. In Costa Rica, however, this mechanism may work differently, given that overall educational attainment is higher for females than males. Hence, having any sisters may actually be beneficial to children’s education in this setting, although the magnitude or direction of the effect is more of an empirical question. We hence control for whether the child has any sisters in the 2SLS model. In other words, our model essentially utilizes the effect of gender composition of the first two births on sibling number conditional on whether the child has any sisters.

The exclusion restriction could also be violated, e.g., if mothers’ marital status is affected by the gender of the first birth. Using CPS data, Dahl and Moretti (2008) find that women with first-born daughters are less likely to marry and are more likely to be divorced conditional on being ever married. Number of children is also significantly higher in families with a first-born girl. We are not able to find any such effects using either CRELES or census data from Costa Rica, one reason for which may be that our sample size is significantly smaller than that in Dahl and Moretti (2008), who work with over 1 million individual observations. Nevertheless, the null finding assuages concerns that such effects would significantly bias our results.

As discussed previously, we restrict the final analytical sample to children with at least one sibling in order to apply the instruments described above.
Since gender composition of the first two births is most likely to affect the mother’s decision to have a third birth, we estimate the 2SLS models only on first and second births, which is the relevant group of children affected by the instrument; children of third or higher birth order might not have been born had the gender composition of the first two births been different. We estimate the models separately for boys and girls in addition to the full sample in order to capture any systematic differences in the estimated relationship by gender. We report the Durbin-Wu-Hausman test statistic for each model, which is statistically significant if the 2SLS estimate is significantly different from the OLS estimate.

4. Results

In this section we first present graphical evidence and descriptive statistics using the analytical sample from the CRELES survey data. We then show the regression results using both CRELES and census data.

Figure 2 depicts the relationship between the number of siblings and the probability of attending secondary school by gender, with a superimposed histogram of the frequency of families by sibship size. There is a negative and linear relationship between the number of siblings and the mean probability of attending secondary school, which is similar for boys and girls. At each sibship size, girls appear to be slightly more likely than boys to have attended secondary school. The relationship becomes noisier once the number of siblings exceeds 10, because observations become sparse.

Table 1 shows descriptive statistics for children from the CRELES and census samples. Children in the CRELES and 1984 census samples appear similar in most characteristics. Over 60% of all children in both samples have attended at least one year of secondary school, and girls have a slight advantage over boys in secondary school attendance.
The majority of children have at least one sister, although the proportion is somewhat higher in the CRELES sample than the census sample. On average, mothers have about six years of schooling. Mean mothers’ age at first birth is about 21 to 22 in all samples. On the other hand, children in the census sample have on average about 1.5 fewer siblings than the CRELES sample, which could be explained by the additional restriction that all children have to live in the household in the census data. Another difference between the CRELES and census samples is that the census samples tend to have a lower probability of the first two births being female. Again, this is primarily due to the restriction that all children need to be living with their mothers at the time of the survey, as elder daughters are more likely to marry and leave the household.

Table 1: Descriptive statistics of analysis samples

| | CRELES sample | 1984 census sample | 2000 census sample |
|---|---|---|---|
| Family characteristics | | | |
| Number of siblings | 3.437 (2.176) | 2.978 (1.627) | 2.296 (1.241) |
| More than one sibling | 0.839 | 0.831 | 0.717 |
| More than two siblings | 0.579 | 0.538 | 0.347 |
| First two girls | 0.239 | 0.218 | 0.233 |
| First two boys | 0.252 | 0.292 | 0.278 |
| Wealth index | | 4.075 (2.020) | 5.652 (1.321) |
| Child characteristics | | | |
| Any secondary school | 0.630 | 0.635 | 0.684 |
| Female | 0.494 | 0.460 | 0.476 |
| First birth | 0.501 | 0.644 | 0.682 |
| Any sisters | 0.828 | 0.792 | 0.717 |
| Year of birth | 1971.4 (8.380) | 1967.7 (1.849) | 1983.6 (1.877) |
| Age at survey | 37.83 (7.564) | 16.31 (1.849) | 16.36 (1.877) |
| Mother characteristics | | | |
| Mother’s years of schooling | 6.165 (4.166) | 6.010 (3.704) | 8.475 (3.901) |
| Mother’s age at first birth | 22.27 (4.698) | 21.45 (4.365) | 21.63 (4.178) |
| N | 3785 | 5868 | 14718 |

Notes: Mean of each variable, with standard deviation in parentheses.
The first validity check of our instruments is the strength of their impact on sibling size, as well as the implied parental preference on children’s gender in Costa Rica. Before presenting the first stage regression results at the child level, we first show results of OLS regression analysis at the family level, which predicts the probability of having an nth birth conditional on having n-1 births (n = 3, 4, 5) using the gender composition of the first two births. Table 2 presents the results using the 1984 census; the 2000 census yields very similar results. Compared with having two boys as the first two births, having a girl at either the first or the second birth significantly decreases the probability of having a third birth by 5−6 percentage points. Although the first two births both being female has a significant positive interaction effect on having a third birth, it is not enough to offset the combined negative effect of having a girl at the first and at the second birth. Families whose two eldest children are male are therefore more likely to go on to have a third birth than families whose two eldest children are female, and more likely still than families in which only the first or only the second birth is female. On the other hand, having a girl at the second birth continues to negatively influence the probability of having a fourth birth, while none of the gender indicators has any impact on having a fifth birth. Taken together, these results suggest that Costa Rican families in our sample have a general preference for having a balance of genders among children, with some preference for having girls.
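To make the comparison above concrete, the following sketch simply adds up the third-birth coefficients reported in Table 2 (1984 census) for each gender composition of the first two births, relative to two boys. The function name and layout are illustrative and not part of the original analysis; only the three coefficients are taken from the table.

```python
# Third-birth coefficients from Table 2 (1984 census), column (1).
# The reference category is a family whose first two births are boys.
FIRST_GIRL = -0.0553
SECOND_GIRL = -0.0631
FIRST_GIRL_X_SECOND_GIRL = 0.1005


def third_birth_shift(first_is_girl: bool, second_is_girl: bool) -> float:
    """Change in Pr(third birth) relative to boy-boy families (illustrative helper)."""
    shift = 0.0
    if first_is_girl:
        shift += FIRST_GIRL
    if second_is_girl:
        shift += SECOND_GIRL
    if first_is_girl and second_is_girl:
        shift += FIRST_GIRL_X_SECOND_GIRL
    return shift


compositions = {
    "boy-boy": (False, False),
    "girl-boy": (True, False),
    "boy-girl": (False, True),
    "girl-girl": (True, True),
}

for label, (first, second) in compositions.items():
    print(f"{label:>9}: {third_birth_shift(first, second):+.4f}")
```

With these coefficients the girl-girl shift is about -0.018, which lies between the boy-boy reference (0) and the two mixed compositions (about -0.055 and -0.063), matching the ordering described in the text.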
Table 2: The effect of gender composition of the first two children on progression of births (1984 census)

| | (1) Third Child | (2) Fourth Child | (3) Fifth Child |
|---|---|---|---|
| First girl | -0.0553*** | -0.0113 | -0.0052 |
| | (0.0207) | (0.0208) | (0.0289) |
| Second girl | -0.0631**** | -0.0662** | 0.0306 |
| | (0.0156) | (0.0251) | (0.0359) |
| First girl × second girl | 0.1005**** | 0.0309 | -0.0068 |
| | (0.0260) | (0.0309) | (0.0411) |
| Mean of Dep Var | 0.79 | 0.60 | 0.55 |
| Number of Observations | 4,454 | 3,517 | 2,097 |
| R squared | 0.08 | 0.12 | 0.14 |

Notes: Standard errors adjusted for clustering at canton level. Each observation is at the household level. Other controls include mother’s education, mother’s age at first birth, family wealth and canton fixed effects. * p < 0.1, ** p < 0.05, *** p < 0.01, **** p < 0.001.

Table 3 presents the first-stage models of the 2SLS analyses, which are similar to the analyses in Table 2, except that the regressions are estimated at the child level with additional covariates. These are essentially weighted versions of the mother-level regression, where the observation for each woman is weighted by the number of children she has.⁷ Columns 1 to 3 report the coefficient estimates using CRELES data. The effect of the gender composition of the first two births on the number of siblings appears to be very similar across samples. In all cases, having two eldest brothers is associated with approximately 0.7 additional siblings on average, whereas the first two births in the family being female is associated with 0.5−0.6 fewer siblings. All coefficients are statistically significant at conventional levels. The first-stage partial F-statistics on the instruments are greater than 10 in all but one case, indicating that we do not have a weak instruments problem. Even in Column (3), the partial F-statistic is still very close to 10.

7 We are thankful to the reviewer for highlighting this point.
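The weighting argument can be checked numerically. The sketch below uses simulated families; all variable names and parameter values are hypothetical. As a simplification it includes only mother-level regressors, so it omits the child-level controls and the restriction to first and second births used in Table 3. Under those assumptions, OLS on child-level rows reproduces mother-level least squares weighted by the number of children.

```python
import numpy as np

rng = np.random.default_rng(0)
n_families = 500

# Simulated gender composition of the first two births (hypothetical data).
first_girl = rng.integers(0, 2, n_families)
second_girl = rng.integers(0, 2, n_families)
two_girls = first_girl * second_girl
two_boys = (1 - first_girl) * (1 - second_girl)

# Births beyond the first two; the Poisson rates are illustrative assumptions.
extra_births = rng.poisson(1.0 - 0.5 * two_girls + 0.7 * two_boys)
n_children = 2 + extra_births
n_siblings = (n_children - 1).astype(float)  # siblings of any one child

# Mother-level design matrix: intercept plus the two instrument dummies.
X = np.column_stack([np.ones(n_families), two_girls, two_boys])


def ols(design, y):
    """Least-squares coefficients via numpy.linalg.lstsq."""
    return np.linalg.lstsq(design, y, rcond=None)[0]


# (a) Child-level regression: one row per child, so each mother's row
#     is repeated once for every child she has.
X_child = np.repeat(X, n_children, axis=0)
y_child = np.repeat(n_siblings, n_children)
beta_child = ols(X_child, y_child)

# (b) Mother-level weighted least squares with weights = number of children,
#     implemented by scaling each row by the square root of its weight.
root_w = np.sqrt(n_children)
beta_weighted = ols(X * root_w[:, None], n_siblings * root_w)

print("child-level OLS:    ", beta_child)
print("mother-level WLS:   ", beta_weighted)
```

The two coefficient vectors coincide because the child-level sum of squared residuals equals the mother-level sum with each mother's squared residual multiplied by her number of children.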
Table 3: First stage model of gender composition on number of siblings

| | CRELES All (1) | CRELES Boys (2) | CRELES Girls (3) | 1984 census All (4) | 1984 census Boys (5) | 1984 census Girls (6) | 2000 census All (7) | 2000 census Boys (8) | 2000 census Girls (9) |
|---|---|---|---|---|---|---|---|---|---|
| First two girls | -0.5583**** | | -0.4669*** | -0.5127**** | | -0.5322**** | -0.5748**** | | -0.5752**** |
| | (0.1299) | | (0.1222) | (0.0537) | | (0.0537) | (0.0402) | | (0.0413) |
| First two boys | 0.6925**** | 0.6073**** | | 0.7174**** | 0.7149**** | | 0.7179**** | 0.7165**** | |
| | (0.1037) | (0.0929) | | (0.0708) | (0.0731) | | (0.0495) | (0.0495) | |
| First stage partial F | 36.9 | 38.6 | 9.4 | 103.8 | 95.6 | 98.1 | 341.1 | 209.2 | 193.5 |
| Mean of Dep Var | 3.44 | 3.45 | 3.42 | 2.98 | 3.00 | 2.95 | 2.30 | 2.31 | 2.28 |
| N | 3,785 | 1,915 | 1,869 | 5,868 | 3,170 | 2,698 | 14,718 | 7,716 | 7,002 |
| R squared | 0.36 | 0.49 | 0.43 | 0.35 | 0.35 | 0.37 | 0.31 | 0.31 | 0.32 |

Notes: Standard errors adjusted for clustering at canton level. Each observation is at the child level. Controls include having any sister, mother’s years of education, mother’s age at first birth, childbirth year and canton fixed effects, birth order, family wealth index. * p < 0.1, ** p < 0.05, *** p < 0.01, **** p < 0.001.

Before reporting the OLS and second stage regression results, we present in Table 4 the reduced form relationship using census data. Specifically, we show probabilities of attending secondary school by gender and birth order (1st or 2nd) as a function of the gender of the other sibling in the first two births. These are essentially reduced form results analogous to the Wald estimator in the binary instrument case. Several observations emerge. In the 1984 census the gender of the other sibling in the first two births appears to have a nontrivial impact on secondary school attendance of girls, with the effect being strongly significant for second-born girls and close to statistically significant for first-born girls. On the other hand, these effects are much weaker for boys.
In addition, the gender effect disappears almost completely in the 2000 census. These descriptive results provide a more intuitive interpretation and preview of the 2SLS regression results described below.

Table 4: Probability of attending secondary school by gender composition in first and second births

1984 census

| | 1st Birth Boy | 1st Birth Girl | P-value of diff in prob. secondary school |
|---|---|---|---|
| 2nd birth boy | 0.50 | 0.54 | 0.088 |
| 2nd birth girl | 0.53 | 0.60 | 0.009 |

| | 2nd Birth Boy | 2nd Birth Girl | P-value of diff in prob. secondary school |
|---|---|---|---|
| 1st birth boy | 0.59 | 0.60 | 0.579 |
| 1st birth girl | 0.64 | 0.68 | 0.108 |

2000 census

| | 1st Birth Boy | 1st Birth Girl | P-value of diff in prob. secondary school |
|---|---|---|---|
| 2nd birth boy | 0.52 | 0.52 | 0.931 |
| 2nd birth girl | 0.60 | 0.62 | 0.219 |

| | 2nd Birth Boy | 2nd Birth Girl | P-value of diff in prob. secondary school |
|---|---|---|---|
| 1st birth boy | 0.62 | 0.62 | 0.874 |
| 1st birth girl | 0.69 | 0.69 | 0.762 |

The OLS and the second stage of the 2SLS regression results are reported in Table 5. The first row contains OLS estimates of the effect of sibling number on the probability of attending secondary school. In all cases the estimates are highly significant and fall consistently in the range of -0.03 to -0.05. The effects seem to be slightly stronger for girls, although the confidence intervals of the estimates by gender overlap. In addition, the OLS estimates show no evidence of heterogeneity of treatment effects across time, as the estimates using the 2000 census are qualitatively similar to those using the 1984 census. Nevertheless, OLS estimates are subject to omitted variable bias because the fertility decisions of mothers are likely to be endogenous to unobserved confounders that also affect children’s education. We now turn to the 2SLS estimates in Row 2, which appear dramatically different from the OLS estimates.
Moreover, the 2SLS estimates are fairly consistent for both the CRELES and 1984 census samples: in the full model, having an additional sibling decreases the probability of having attended at least one year of secondary school by almost 10 percentage points in the CRELES sample and over 7 percentage points in the 1984 census. In addition, there is large heterogeneity in treatment effect by gender: while the coefficient estimates for boys are barely significant or different from the OLS estimates, for girls the point estimates are almost four times larger and are statistically significant. By contrast, none of the coefficient estimates is significant at the 5% level using the 2000 census. The endogeneity tests confirm that the 2SLS estimates are significantly different from the OLS estimates for girls, but not for boys. These results suggest that bias in the OLS estimates masks important heterogeneity of treatment effects by gender and across time.

Table 5: OLS and 2SLS estimates of the effect of sibling number on prob. of attending secondary school

| | CRELES All (1) | CRELES Boys (2) | CRELES Girls (3) | 1984 census All (4) | 1984 census Boys (5) | 1984 census Girls (6) | 2000 census All (7) | 2000 census Boys (8) | 2000 census Girls (9) |
|---|---|---|---|---|---|---|---|---|---|
| OLS | -0.0411**** | -0.0353**** | -0.0466**** | -0.0382**** | -0.0322**** | -0.0469**** | -0.0326**** | -0.0288**** | -0.0374**** |
| | (0.0050) | (0.0063) | (0.0071) | (0.0041) | (0.0055) | (0.0065) | (0.0043) | (0.0051) | (0.0051) |
| 2SLS | -0.1026*** | -0.0669 | -0.1825** | -0.0733**** | -0.0424* | -0.1633**** | -0.0259* | -0.0260 | -0.0244 |
| | (0.0316) | (0.0495) | (0.0763) | (0.0182) | (0.0257) | (0.0359) | (0.0149) | (0.0180) | (0.0225) |
| Endogeneity Test P-value | 0.0497 | 0.6177 | 0.0141 | 0.0542 | 0.6865 | 0.0007 | 0.6359 | 0.8659 | 0.5632 |
| Mean of Dep Var | 0.63 | 0.61 | 0.65 | 0.63 | 0.61 | 0.66 | 0.68 | 0.64 | 0.73 |
| N | 3,781 | 1,915 | 1,866 | 5,868 | 3,170 | 2,698 | 14,718 | 7,716 | 7,002 |
| R squared | 0.30 | 0.34 | 0.33 | 0.38 | 0.39 | 0.39 | 0.23 | 0.24 | 0.22 |

Notes: Standard errors adjusted for clustering at canton level. Each observation is at the child level. Controls are the same as in Table 2. * p < 0.1, ** p < 0.05, *** p < 0.01, **** p < 0.001.

One potential concern with the validity of our 2SLS results is that having an oldest sister may have an independent effect on younger siblings’ education beyond the effect of total sibling size or the generic “any sister” effect, which may render our instruments invalid. While we cannot rule this out directly, there is no evidence of such an effect in the 2000 census according to Table 4 (secondary school attendance of the second birth does not differ by gender of the first birth), which helps alleviate the concern that this effect may systematically bias our results.

The fact that we find 2SLS effects that are larger than OLS (for girls in the CRELES and 1984 census cohorts) may seem unexpected, given that some hypothesized confounders are likely to bias the OLS estimates upward. However, it is not unreasonable that those families whose fertility is affected by gender composition may be families with weaker preferences for girls’ education, which could cause the instrumental variables estimates to be larger than the OLS estimates. Clearly, more research is needed in order to investigate the specific mechanism through which sibling size affects female education in Costa Rica and similar contexts.

5. Discussion and conclusion

This study uses multiple data sources to examine heterogeneity in the relationship between sibling size and secondary education attainment in the Latin American country of Costa Rica, on which there is little in the literature. It is especially meaningful to examine heterogeneity in the sibling size-education relationship in this context, as the relationship may play out differently during the rapid fertility decline that marked a significant part of the study period than in the period afterwards.
Large fertility declines such as that experienced by Costa Rica have the potential to result in substantial demographic dividends, but these are not automatic. We do not explore in this paper the educational supply policies that could help realize educational dividends, nor the underlying drivers of fertility decline, but we do explore the extent to which households are likely to have chosen higher educational attainment for their children as they trade off quantity for quality. In the Latin American case of Costa Rica, the quantity-quality tradeoff appears quite robust in earlier periods, suggesting the potential for strong demographic dividends. OLS tends to underestimate the effect of fertility on education for girls due to confounders. The fact that the OLS and first stage estimates are qualitatively similar between the CRELES and 1984 census samples is reassuring and strengthens the validity of the 2SLS estimates using census data.

Moreover, our analyses uncover important differential effects of sibling number on children’s education across time: while there is strong evidence of a Q-Q tradeoff among children born before 1980, the effects almost disappear afterwards. A plausible explanation is that economic development in recent decades, combined with the decline in fertility, has improved the financial situation of most families in Costa Rica, such that an additional child within relatively small families no longer significantly affects the resources invested in children’s education. Another, somewhat related, explanation for the heterogeneous sibling size-education relationship has to do with spillover effects. Specifically, gains in education attributed to the fertility decline among girls during the fertility transition period may have changed the perceptions or resource-allocation decisions of families with more children, who would otherwise have invested less in their education.
Given the many economic and socio-demographic changes over this period, it is difficult to pin down the exact mechanisms that caused the decline in Q-Q tradeoffs. Future work that further explores interactions within and across countries depending on household economic pressures, education costs, fertility, gender preference, and other social attitudes would be valuable. This may help to better interpret the heterogeneous findings in the broader literature on Q-Q tradeoffs in other times and settings.

Our finding is an important contribution to the broader literature on the Q-Q tradeoff, as it suggests that treatment effects of sibling size can legitimately differ across samples and time periods. If different time periods in the same context of Costa Rica can yield distinct estimates of the sibling size-education relationship, then it is highly plausible that studies on different countries with distinct cultures and stages of economic development would yield completely different results. To some extent, results from our study provide a convincing explanation for the previously incongruent findings in the literature.

Interestingly, our findings seem to differ from a number of previous studies that similarly explore the heterogeneous relationship between family size and children’s education, albeit in alternative contexts (e.g., Eloundou-Enyegue and Williams 2006; Maralani 2008; Marteleto and de Souza 2012). While those studies generally find the relationship to be strengthening over time, our analysis suggests the opposite in Costa Rica. This contrast in findings further highlights the importance of carefully examining the heterogeneous relationship between family size and schooling in different contexts and at different stages of socio-economic development, as well as of using more recent data for this purpose.
In order to better understand the magnitude of our estimates, we carry out a further exercise in which we extrapolate our coefficient estimates to simulate how they would translate into macro-level education change under the simplifying assumption of no other accompanying changes in education determinants. The proportion of teenagers with secondary education increased from 1984 to 2000 in our census analytical samples by 2.4 and 6.5 percentage points among boys and girls respectively. Our results thus explain more than 100% of the observed educational improvement based on the coefficients estimated with the 1984 census data, or 75% and 25% of the educational improvement of boys and girls, respectively, based on the 2000 census estimates. If we take an average of the 1984 and 2000 estimates, the inter-census variation in education explained by sibship reduction is close to 100% for both boys and girls. Of course, these extrapolations omit many other macro-level trends, and investigation of this as a causal explanation of changing education attainment must be left to future work, but these simulations highlight the potential importance of further exploring this mechanism.

To conclude, in Costa Rica from 1960 to 1980, a period marked by substantial fertility decline, we find strong evidence for a Q-Q tradeoff, especially among girls. Specifically, an additional sibling decreased the probability of attending secondary school by about 16 percentage points for girls and 4 percentage points for boys. The effect largely disappeared after 1980 for both boys and girls, suggesting that resource constraints from an additional number of siblings are no longer a prevailing issue in Costa Rica. The differential pathways through which sibling size affects education among boys versus girls constitute an important topic for future research.

References

Almond, D. and Mazumder, B. (2011).
Health capital and the prenatal environment: the effect of Ramadan observance during pregnancy. American Economic Journal: Applied Economics 3(4): 56. doi:10.1257/app.3.4.56.

Angrist, J., Lavy, V., and Schlosser, A. (2010). Multiple experiments for the causal link between the quantity and quality of children. Journal of Labor Economics 28(4): 773−824. doi:10.1086/653830.

Åslund, O. and Grönqvist, H. (2010). Family size and child outcomes: Is there really no trade-off? Labour Economics 17(1): 130−139. doi:10.1016/j.labeco.2009.05.003.

Becker, G.S. and Lewis, H.G. (1973). On the interaction between quantity and quality of children. Journal of Political Economy 81(2): S279−S288. doi:10.1086/260166.

Berkeley Population Center (2012). CRELES: Costa Rican Longevity and Healthy Aging Study [electronic resource]. Berkeley: Berkeley Population Center. http://www.creles.berkeley.edu/index.html.

Black, S.E., Devereux, P.J., and Salvanes, K.G. (2005). The more the merrier? The effect of family size and birth order on children's education. The Quarterly Journal of Economics 120(2): 669−700. doi:10.1093/qje/120.2.669.

Blake, J. (1981). Family size and the quality of children. Demography 18(4): 421−422. doi:10.2307/2060941.

Butcher, K.F. and Case, A. (1994). The effect of sibling sex composition on women's education and earnings. The Quarterly Journal of Economics 109(3): 531−563. doi:10.2307/2118413.

Caceres, J. (2004). Impact of family size on investment in child quality: Multiple births as natural experiment [unpublished manuscript]. Baltimore: University of Maryland.

Coale, A.J. (1983). Recent trends in fertility in less developed countries. Science 221(4613): 828−832. doi:10.1126/science.6879179.

Dahl, G.B. and Moretti, E. (2008). The demand for sons. The Review of Economic Studies 75(4): 1085−1120. doi:10.1111/j.1467-937X.2008.00514.x.

DGEC and WFS (1978). Encuesta Nacional de Fecundidad 1976. San José, Costa Rica: Dirección General de Estadística y Censos (DGEC) and World Fertility Survey (WFS).

Eloundou-Enyegue, P.M. and Williams, L.B. (2006). The effects of family size on child schooling in sub-Saharan settings: A reassessment. Demography 43(1): 25−52. doi:10.1353/dem.2006.0002.

Hanushek, E.A. (1992). The trade-off between child quantity and quality. Journal of Political Economy: 84−117. doi:10.1086/261808.

Jensen, R. (2005). Equal treatment, unequal outcomes? Generating sex inequality through fertility behavior [unpublished manuscript]. Harvard University.

Kelley, A.C. and Schmidt, R.M. (2005). Evolution of recent economic-demographic modeling: a synthesis. Journal of Population Economics 18(2): 275−300. doi:10.1007/s00148-005-0222-9.

Knodel, J. and Wongsith, M. (1991). Family size and children's education in Thailand: Evidence from a national sample. Demography 28(1): 119−131. doi:10.2307/2061339.

Lam, D. and Marteleto, L. (2008). Stages of the demographic transition from a child's perspective: Family size, cohort size, and children's resources. Population and Development Review 34(2): 225−252. doi:10.1111/j.1728-4457.2008.00218.x.

Lam, D. and Marteleto, L. (2013). Family size of children and women during the demographic transition. Paper presented at the XXVII IUSSP International Population Conference, Busan, Korea, August 26-31 2013.

Lee, J. (2008). Sibling size and investment in children's education: an Asian instrument. Journal of Population Economics 21(4): 855−875. doi:10.1007/s00148-006-0124-5.

Lee, R. and Mason, A. (2010). Fertility, human capital, and economic growth over the demographic transition. European Journal of Population 26(2): 159−182. doi:10.1007/s10680-009-9186-x.

Li, H., Zhang, J., and Zhu, Y. (2008). The quantity-quality trade-off of children in a developing country: Identification using Chinese twins. Demography 45(1): 223−243. doi:10.1353/dem.2008.0006.

Maralani, V. (2008). The changing relationship between family size and educational attainment over the course of socioeconomic development: Evidence from Indonesia. Demography 45(3): 693−717. doi:10.1353/dem.0.0013.

Marteleto, L.J. and de Souza, L.R. (2012). The changing impact of family size on adolescents' schooling: Assessing the exogenous variation in fertility using twins in Brazil. Demography 49(4): 1453−1477. doi:10.1007/s13524-012-0118-8.

Mata, L. (1983). The evolution of diarrheal diseases and malnutrition in Costa Rica: the role of interventions. Assignment Children 61/63: 195−224. doi:10.1007/978-1-4615-9284-6_12.

Mathews, F., Johnson, P.J., and Neil, A. (2008). You are what your mother eats: evidence for maternal preconception diet influencing foetal sex in humans. Proceedings of the Royal Society B: Biological Sciences 275(1643): 1661−1668. doi:10.1098/rspb.2008.0105.

Minnesota Population Center (2014). Integrated Public Use Microdata Series, International: Version 6.3 [Machine-readable database]. Minneapolis: University of Minnesota.

Qian, N. (2004). Quantity-quality and the one child policy: The positive effect of family size on school enrollment in China. [PhD thesis] Cambridge: Massachusetts Institute of Technology, Department of Economics.

Patrinos, H.A. and Psacharopoulos, G. (1997). Family size, schooling and child labor in Peru–An empirical analysis. Journal of Population Economics 10(4): 387−405. doi:10.1007/s001480050050.

Preston, S.H. (1976). Family sizes of children and family sizes of women. Demography 13(1): 105−114. doi:10.2307/2060423.

PRB (2012). World Population Data Sheet. Washington DC: Population Reference Bureau (PRB) publications.

Programa Estado de la Nación (2005). Primer Informe Estado de la Educación. San Jose, Costa Rica: Programa Estado de la Nación.

Rosenzweig, M.R. and Wolpin, K.I. (1980). Testing the quantity-quality fertility model: the use of twins as a natural experiment. Econometrica: Journal of the Econometric Society 48(1): 227−240. doi:10.2307/1912026.

Salazar, J.M. (2003). Historia de la Educación Costarricense. San Jose, Costa Rica: EUNED.

Schultz, T.P. (2007). Population policies, fertility, women's human capital, and child quality. In: Schultz, T.P. and Strauss, J. (eds.). Handbook of Development Economics 4: 3249−3303.

Sudha, S. (1997). Family size, sex composition and children's education: Ethnic differentials over development in Peninsular Malaysia. Population Studies 51(2): 139−151. doi:10.1080/0032472031000149876.