CHAPTER 8

Intergenerational Mobility: Measurement Error


REFERENCE: Where is the Land of Opportunity? The Geography of Intergenerational Mobility in the United States
REFERENCE: Recent Developments in Intergenerational Mobility

In this chapter we discuss the literature on intergenerational mobility as reviewed in Black and Devereux (2011), and the important recent contribution of Chetty et al. (2014).

To what extent does our parents' economic situation determine our own economic chances? And to what extent is "equality of opportunity" a reality? These seemingly well-posed questions are conceptually trickier than they might first appear. To make this point, let us consider a number of different objects, each of which might be considered a measure of intergenerational mobility.

  1. Predictability of \((\log)\) child income in a given year (or a few years) using \((\log)\) parent income in a given year (or a few years):

    $$ E[Y_{c,s}|Y_{p,t}] $$

    This conditional expectation describes the average \((\log)\) of child income in a given year, or at a given age, among all those children whose parents received an income of \(Y_{p,t}\) in a given (earlier) year.

    $$ \frac{\text{Cov}(Y_{p,t},Y_{c,s})}{\text{Var}(Y_{p,t})} $$

    This fraction gives the slope of a linear approximation to the conditional expectation in the previous display. This approximation is called the best linear predictor. If income is measured in \(\log\) terms, this slope describes the percentage increase in average child income for a 1% increase in parent income. This slope is the most common measure of intergenerational mobility. Another variant of this measure uses ranks in the national income distribution, instead of levels or logs.

  2. Predictability of \((\log)\) child lifetime income using \((\log)\) parent lifetime income:

    $$ E[\overline{Y}_c|\overline{Y}_p] $$

    This conditional expectation describes the average \((\log)\) of child lifetime income (this is also called "permanent income") among all those children whose parents received a lifetime income of \(\overline{Y}_p\).

    $$ \frac{\text{Cov}(\overline{Y}_p, \overline{Y}_c)}{\text{Var}(\overline{Y}_p)} $$

    As before, this fraction gives the slope of a linear approximation to the conditional expectation in the previous display.

    Measured income varies significantly over time, because of the life cycle of earnings, because of transitory shocks, and because of measurement error. Arguably we might be more interested in the relationship between the lifetime incomes of parents and children, that is, in the relationship between long-run average incomes. Lifetime income is in general more strongly related between parents and children than short-run income. We will discuss why in sections 1 and 2 below. The relationship between lifetime incomes is often considered the actual object of interest of mobility studies. Lifetime incomes are hard to observe, however, which is why short-term incomes are studied more often.

  3. Predictability using additional variables: But why stop there? Is it not equally relevant how other factors such as parent education, location of residence, etc. predict child outcomes? Philosophers such as Rawls argue that features such as these, determined at birth and out of our control, are "morally arbitrary" — they should not determine our chances in life. More generally, we might be interested in knowing to what extent life outcomes are predictable at birth. The more predictive factors we consider, the better we will be able to predict child outcomes. This motivates consideration of objects such as the following:

    $$ E[\overline{Y}_c|\overline{Y}_p, X_p, W_p] $$

    This conditional expectation describes the average \((\log)\) of child lifetime income, among all those children whose parents received a lifetime income of \(\overline{Y}_p\), who had education level \(X_p\), location of residence \(W_p\), etc.

    $$ \text{Var}((\overline{Y}_p, X_p, W_p))^{-1} \cdot \text{Cov}((\overline{Y}_p, X_p, W_p), \overline{Y}_c). $$

    As before, this vector gives the slopes of a linear approximation to the conditional expectation shown above.

  4. The causal effect of parent lifetime income:

    $$ \overline{Y}_c = g(\overline{Y}_p, \epsilon). $$

    The structural function \(g\) describes the causal effect of parent income on child income. Structural functions are an alternative, equivalent notation for the potential outcomes which we learned about in an earlier chapter. If parent income is changed, it is assumed that the function \(g\) and the set of unobserved factors \(\epsilon\) do not change. Not all correlations are causal. Child and parent income might be statistically related because education is transmitted across generations, for instance, without there being a causal effect of parent income. The causal effect of parent income on the income of their children is the subject of a more recent literature, usually using instrumental variables; we will discuss this in section 4. We might, for instance, care about this causal effect if we are interested in the effect of redistributive taxation on the next generation.

  5. The causal effect of additional variables:

    $$ \overline{Y}_c = h(\overline{Y}_p, X_p, W_p,\epsilon') $$

    The structural function \(h\) describes the causal effect of parent income, education, location of residence, etc., on child income. As before, other factors \(X_p, W_p\) might have a causal effect, in addition to the effect of parental income \(\overline{Y}_p\). We might for instance be interested in the effect of current educational policy on future generations.
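
The predictive slopes in items 1 to 3 can be illustrated on simulated data. The following is a minimal Python sketch under stated assumptions, not a method taken from the literature discussed here: the data-generating process, the helper names (`cov`, `blp_slope`, `simulate_log_incomes`), and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equally long lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def blp_slope(parent, child):
    """Slope of the best linear predictor: Cov(parent, child) / Var(parent)."""
    return cov(parent, child) / cov(parent, parent)


def simulate_log_incomes(n, slope, seed=0):
    """Hypothetical data: child log income is linear in parent log income
    plus independent noise (an assumption made for this sketch only)."""
    rng = random.Random(seed)
    parent = [rng.gauss(10.0, 1.0) for _ in range(n)]
    child = [6.0 + slope * y + rng.gauss(0.0, 0.8) for y in parent]
    return parent, child


parent, child = simulate_log_incomes(n=50_000, slope=0.4)
estimated_slope = blp_slope(parent, child)
print(f"estimated slope: {estimated_slope:.3f}")
```

With a large simulated sample, the estimated slope should be close to the value 0.4 used to generate the data. This slope is purely predictive; nothing in the calculation distinguishes it from the causal objects in items 4 and 5.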


1. Classical measurement error and transitory shocks

As mentioned, much of the literature on intergenerational income mobility is interested in objects of the form

(1)
$$ \begin{equation} \beta:=\frac{\text{Cov}(\overline{Y}_p, \overline{Y}_c)}{\text{Var}(\overline{Y}_p)}, \end{equation} $$

describing the predicted percentage increase in child lifetime income for a 1% increase in parent lifetime income. A key concern is that this slope differs from the slope we would get from a regression of short-run child income on short-run parental income. To see why, suppose that

(2)
$$ \begin{align} Y_{p,t} =\overline{Y}_p +\epsilon_{p,t} \nonumber\\ Y_{c,s} =\overline{Y}_c +\epsilon_{c,s}, \end{align} $$

where


(3)
$$ \begin{align} \text{Cov}(\overline{Y}_p,\epsilon_{p,t}) = \text{Cov}(\overline{Y}_p,\epsilon_{c,s}) &=0 \nonumber \\ \text{Cov}(\overline{Y}_c,\epsilon_{c,s}) = \text{Cov}(\overline{Y}_c,\epsilon_{p,t}) &=0 \nonumber \\ \text{Cov}(\epsilon_{p,t},\epsilon_{c,s}) &=0. \label{eq:classicalME} \end{align} $$

These equations say that, for both parents and children, income in a given year is equal to permanent income plus a shock \(\epsilon\) of mean \(0\). The shocks might either be due to transitory fluctuations in actual income, or due to measurement error. These shocks are assumed to be uncorrelated with permanent incomes, and with each other. This assumption holds true for what is called "classical measurement error."

Suppose we estimate the slope of a regression of short-run child income on short-run parent income,

(4)
$$ \begin{equation} \gamma := \frac{\text{Cov}(Y_{p,t},Y_{c,s})}{\text{Var}(Y_{p,t})}. \end{equation} $$

How are the parameters \(\beta\) and \(\gamma\) related to each other? Let us start by looking at the covariance in the numerator of \(\gamma\): $$ \begin{align*} & \quad\text{Cov}(Y_{p,t},Y_{c,s}) \\ &= \text{Cov}(\overline{Y}_p,\overline{Y}_c) + \text{Cov}(\overline{Y}_p,\epsilon_{c,s}) + \text{Cov}(\epsilon_{p,t},\overline{Y}_c) + \text{Cov}(\epsilon_{p,t},\epsilon_{c,s})\\ &= \text{Cov}(\overline{Y}_p,\overline{Y}_c). \end{align*} $$

All except the first of the covariance terms in the second line of this display are zero by the assumptions we imposed on the measurement errors, that is by equation (3). So we get the same covariance in the numerator for both \(\beta\) and \(\gamma\). What about the denominator?

$$ \text{Var}(Y_{p,t}) =\text{Var}(\overline{Y}_p) + 2\cdot \text{Cov}(\epsilon_{p,t},\overline{Y}_p) + \text{Var}(\epsilon_{p,t})= \text{Var}(\overline{Y}_p) + \text{Var}(\epsilon_{p,t}), $$

and therefore

(5)
$$ \begin{equation} \gamma = \frac{\text{Var}(\overline{Y}_p)}{\text{Var}(\overline{Y}_p) + \text{Var}(\epsilon_{p,t})} \cdot \beta. \end{equation} $$

The short-run coefficient \(\gamma\) is smaller than the long-run coefficient \(\beta\) by a factor which depends on the variance of measurement error (transitory shocks) relative to the variance of lifetime income. This phenomenon is known as attenuation bias.

Note that, for this type of measurement error, only the error on the parent side matters, while the variance of \(\epsilon_{c,s}\) does not show up in the formula for the attenuation bias!
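
Equation (5) and the remark about child-side error can be checked numerically. The sketch below is a hypothetical simulation, assuming normally distributed permanent incomes and shocks that satisfy equation (3); the function names and all parameter values are assumptions made for illustration. With \(\text{Var}(\overline{Y}_p) = \text{Var}(\epsilon_{p,t}) = 1\) and \(\beta = 0.5\), equation (5) implies \(\gamma = 0.25\).

```python
import random


def cov(xs, ys):
    """Sample covariance of two equally long lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate(n, beta, sd_eps_p, sd_eps_c, seed=0):
    """Draw permanent incomes and add classical errors on both sides."""
    rng = random.Random(seed)
    perm_p = [rng.gauss(0.0, 1.0) for _ in range(n)]          # Var = 1
    perm_c = [beta * y + rng.gauss(0.0, 0.5) for y in perm_p]
    obs_p = [y + rng.gauss(0.0, sd_eps_p) for y in perm_p]    # Y_{p,t}
    obs_c = [y + rng.gauss(0.0, sd_eps_c) for y in perm_c]    # Y_{c,s}
    return perm_p, perm_c, obs_p, obs_c


def slope(x, y):
    """Regression slope of y on x: Cov(x, y) / Var(x)."""
    return cov(x, y) / cov(x, x)


n, beta = 100_000, 0.5
perm_p, perm_c, obs_p, obs_c = simulate(n, beta, sd_eps_p=1.0, sd_eps_c=0.5)
beta_hat = slope(perm_p, perm_c)        # long-run slope
gamma_low_noise = slope(obs_p, obs_c)   # short-run slope, little child noise

# Same parent-side error, much noisier child incomes.
_, _, obs_p2, obs_c2 = simulate(n, beta, sd_eps_p=1.0, sd_eps_c=2.0, seed=1)
gamma_high_noise = slope(obs_p2, obs_c2)

print(beta_hat, gamma_low_noise, gamma_high_noise)
```

In this setup both short-run slopes should land near 0.25, half of the long-run slope, regardless of how noisy the child-side measurement is. Child-side noise does make the estimate less precise, but it does not shift it systematically.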


2. Non-classical measurement error and the lifetime profile of earnings

In the last section we assumed that measurement error for both parents and children is "classical," that is, it has mean \(0\) and is uncorrelated with actual lifetime earnings. That assumption might not be correct, especially when earnings are measured early in life (say, before age 35). This is often the case for children, who are not yet old enough in the available datasets for us to observe their income later in life.

The reason is that the profile of earnings over a lifetime is quite different depending on the qualifications required for a particular occupation. The annual earnings of those with higher lifetime earnings tend to rise more steeply with experience relative to those with lower lifetime earnings.

For illustration, consider the following model: maintain the same assumptions as in the previous section, except that

(6)
$$ \begin{equation} Y_{c,s} =\overline{Y}_c \cdot (1 + \alpha \cdot( s - \overline{s})) +\epsilon_{c,s}. \label{eq:MEnonclassical} \end{equation} $$

This equation says that the earnings of children rise over time in a way that is positively related to lifetime earnings, by a factor which is determined by the parameter \(\alpha > 0\). The earnings of children are on average equal to lifetime earnings at age \(\overline{s}\).

Under equation (6), we get

$$ \begin{align*} &\quad\text{Cov}(Y_{p,t},Y_{c,s})\\ &= \text{Cov}(\overline{Y}_p,\overline{Y}_c \cdot (1 + \alpha \cdot( s - \overline{s}))) + \text{Cov}(\overline{Y}_p,\epsilon_{c,s})\\ &\qquad + \text{Cov}(\epsilon_{p,t},\overline{Y}_c\cdot (1 + \alpha \cdot( s - \overline{s}))) + \text{Cov}(\epsilon_{p,t},\epsilon_{c,s})\\ &= (1 + \alpha \cdot( s - \overline{s})) \cdot \text{Cov}(\overline{Y}_p,\overline{Y}_c), \end{align*} $$

so that


(7)
$$ \begin{equation*} \gamma = (1 + \alpha \cdot( s - \overline{s})) \cdot \frac{\text{Var}(\overline{Y}_p)}{\text{Var}(\overline{Y}_p) + \text{Var}(\epsilon_{p,t})} \cdot \beta. \end{equation*} $$

Suppose, for simplicity, that measurement error for parents' income was not an issue, i.e. \(\text{Var}(\epsilon_{p,t})=0\). In that case

$$ \begin{equation} \gamma = (1 + \alpha \cdot( s - \overline{s})) \cdot \beta. \end{equation} $$

If we observe children at age \(\overline{s}\), there is no problem. If we observe them at an age \(s\) younger than \(\overline{s}\), however, there is a downward bias in \(\gamma\) relative to \(\beta\), since \((1 + \alpha \cdot( s - \overline{s})) <1\) in this case.
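
The life-cycle bias can likewise be illustrated by simulation. The following Python sketch assumes the model in equation (6) with no parent-side error; the values \(\alpha = 0.02\), \(\overline{s} = 40\), \(\beta = 0.5\), and the function name `short_run_slope` are hypothetical choices for illustration, not estimates from any data. Under these assumptions, observing children at age 30 gives a factor of \(1 + 0.02 \cdot (30 - 40) = 0.8\), so \(\gamma = 0.4\).

```python
import random


def cov(xs, ys):
    """Sample covariance of two equally long lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def short_run_slope(age, n=50_000, beta=0.5, alpha=0.02, ref_age=40, seed=0):
    """Slope of child income observed at `age` on parent lifetime income,
    with child income generated as in equation (6)."""
    rng = random.Random(seed)
    lifetime_p = [rng.gauss(10.0, 1.0) for _ in range(n)]
    lifetime_c = [5.0 + beta * y + rng.gauss(0.0, 0.5) for y in lifetime_p]
    scale = 1.0 + alpha * (age - ref_age)   # equals 1 at the reference age
    observed_c = [scale * y + rng.gauss(0.0, 0.3) for y in lifetime_c]
    return cov(lifetime_p, observed_c) / cov(lifetime_p, lifetime_p)


slope_at_30 = short_run_slope(age=30)   # children observed early in life
slope_at_40 = short_run_slope(age=40)   # children observed at the reference age
print(slope_at_30, slope_at_40)
```

In this sketch the slope at age 40 should be close to \(\beta = 0.5\), while the slope at age 30 should be close to 0.4.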


3. Remedies

It is important to recognize that the reasons for the downward bias are different between sections 1 and 2, so that the remedies in either case are different, as well. In the case of classical measurement error, as in section 1, the problem is that we over-estimate the inequality of lifetime incomes on the parent side. Since part of the estimated inequality is simply due to measurement errors and transitory shocks, this part of inequality is not inherited by children, and we conclude erroneously that inequality of incomes is transmitted to a lesser extent than it actually is. In section 2, the problem is related to the fact that we under-estimate the inequality of the lifetime incomes of children. This again implies that regressions of short-term incomes suggest that parental inequality of incomes is transmitted to a lesser extent than it actually is.

There are a number of remedies for the problem of classical measurement error on the parent side:

  1. Use better data: When we are interested in earnings, administrative data (for instance from the IRS, or from social security administrations in other countries) tend to be more reliable than self-reported earnings in surveys, that is, they have measurement error with smaller variance.

    This strategy of course only takes care of measurement error, but not of transitory shocks to actual earnings.

  2. Average earnings over several years: Suppose for illustration that shocks are uncorrelated across years and have constant variance. Then \(\frac{1}{k}\sum_{t=t_0}^{t_0+k-1} Y_{p,t} = \overline{Y}_p + \frac{1}{k}\sum_{t=t_0}^{t_0+k-1} \epsilon_{p,t},\)

    and

    \(\text{Var}\left (\frac{1}{k}\sum_{t=t_0}^{t_0+k-1} \epsilon_{p,t}\right ) = \frac{1}{k^2}\sum_{t=t_0}^{t_0+k-1} \text{Var}(\epsilon_{p,t}) = \frac{1}{k} \text{Var}(\epsilon_{p,t_0}). \)

    Averaging earnings over \(k\) years thus reduces the variance of measurement error by a factor \(1/k\), and correspondingly reduces the attenuation bias from a factor of \(1/\left (1+ \text{Var}(\epsilon_{p,t}) / \text{Var}(\overline{Y}_p)\right )\) to a factor of


    (8)
    $$ \begin{equation} \frac{1}{1+ \tfrac{1}{k} \tfrac{\text{Var}(\epsilon_{p,t})}{\text{Var}(\overline{Y}_p)}}. \end{equation} $$

  3. Assessing the reliability of the data:

    Suppose we have two measurements \(Y_{p,t_1}\) and \(Y_{p,t_2}\) of parental income with independent measurement errors of equal variance. Then the correlation of these two variables is equal to

    (9)
    $$ \begin{equation} \text{corr}(Y_{p,t_1}, Y_{p,t_2}) = \frac{ \text{Cov}(Y_{p,t_1},Y_{p,t_2})}{\sqrt{ \text{Var}(Y_{p,t_1})\cdot \text{Var}(Y_{p,t_2})}} = \frac{ \text{Var}(\overline{Y}_p)}{ \text{Var}(\overline{Y}_p) + \text{Var}(\epsilon_{p,t})}. \end{equation} $$

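    But this factor is exactly the same as the one describing the attenuation bias from \(\beta\) to \(\gamma\) in equation (5). The correlation between two such measurements therefore estimates the attenuation factor, and dividing the short-run slope \(\gamma\) by this correlation recovers \(\beta\).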
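
Remedies 2 and 3 can be sketched together in a short simulation. The Python code below is an illustrative assumption-laden example, not a procedure from the cited papers: it assumes shocks that are independent across years with constant variance, child income observed without additional error, and hypothetical parameter values (\(\beta = 0.5\), \(\text{Var}(\overline{Y}_p) = \text{Var}(\epsilon_{p,t}) = 1\)). The helper names are likewise hypothetical.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equally long lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def corr(xs, ys):
    """Sample correlation coefficient."""
    return cov(xs, ys) / (cov(xs, xs) * cov(ys, ys)) ** 0.5


def simulate_panel(n, years, beta, sd_eps, seed=0):
    """Parent income observed in several years, plus one child outcome."""
    rng = random.Random(seed)
    perm_p = [rng.gauss(0.0, 1.0) for _ in range(n)]
    child = [beta * y + rng.gauss(0.0, 0.5) for y in perm_p]
    panel = [[y + rng.gauss(0.0, sd_eps) for _ in range(years)] for y in perm_p]
    return panel, child


def slope_on_k_year_average(panel, child, k):
    """Regress child income on parent income averaged over the first k years."""
    averaged = [sum(row[:k]) / k for row in panel]
    return cov(averaged, child) / cov(averaged, averaged)


panel, child = simulate_panel(n=50_000, years=5, beta=0.5, sd_eps=1.0)

slope_1 = slope_on_k_year_average(panel, child, k=1)   # single-year income
slope_5 = slope_on_k_year_average(panel, child, k=5)   # five-year average

# Remedy 3: the correlation of two single-year measurements estimates the
# attenuation factor, so dividing by it rescales the single-year slope.
reliability = corr([row[0] for row in panel], [row[1] for row in panel])
corrected = slope_1 / reliability

print(slope_1, slope_5, reliability, corrected)
```

Under these assumptions, formula (8) predicts a slope of about \(0.5 / (1 + 1) = 0.25\) for a single year and about \(0.5 / (1 + 1/5) \approx 0.42\) for a five-year average, while the reliability-corrected slope should be close to \(\beta = 0.5\). If shocks are serially correlated, as they often are in practice, averaging and the simple reliability correction would both remove less of the bias than this sketch suggests.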

The situation is more complicated for non-classical measurement error, for instance of the form discussed in section 2. The main remedy for measurement error of this kind is to use child income measured at a later point in life, when the dispersion of annual earnings more closely resembles the dispersion of lifetime earnings. Another remedy would be to "move the goalpost," and to focus on other outcomes that are determined earlier in life. The leading example would be educational attainment, which is usually well determined by the time children have reached their late 20s.


4. The causal effect of parental income; instruments

So far, we considered regressions of short-term incomes (the first of the objects introduced at the beginning of this chapter), regressions of lifetime incomes (the second of the objects), and their relationship. We shall now turn to the causal effect of parental income (the fourth object we introduced), and how it relates to regressions of (lifetime) income.



REFERENCES

Black, S. and Devereux, P. (2011). Recent developments in intergenerational mobility. Handbook of Labor Economics, 4: 1487–1541.

Chetty, R., Hendren, N., Kline, P., and Saez, E. (2014). Where is the land of opportunity? The geography of intergenerational mobility in the United States. Quarterly Journal of Economics, 129(4): 1553–1623.