
Settlement Patterns and the Geographic Mobility of Recent Migrants to New Zealand

Regression Analysis

Empirical Model

We employ a discrete choice model to analyse the initial location of recent migrants, as well as the location of earlier migrants (i.e. the resettlement of recent migrants). Following the same approach as Jaeger (2007), we estimate McFadden's choice model (sometimes called a conditional logit model), in which each individual chooses to locate in one of 58 LMAs based on the characteristics of each LMA, some of which may be individual-specific (McFadden 1973; Greene 2003, section 7.3). It is assumed that individuals have an additive stochastic utility function of the form:

Uij = Z'jδ + X'ijβ + αj + eij, (1)

where individual i is faced with J choices and Zj is a vector of LMA characteristics, Xij is a vector of LMA characteristics interacted with individual characteristics or LMA characteristics that are specific to individuals (such as the same region of birth migrant density in each LMA) and αj are LMA fixed effects.

Further assuming that individuals choose to locate in the LMA that maximises their expected utility and that the stochastic error term, eij, is iid Weibull (the type I extreme value distribution, in McFadden's terminology), this model can be estimated as a conditional logit model (McFadden 1973). The probability that individual i locates in LMA j is then:

Pr(yi = j) = exp(Z'jδ + X'ijβ + αj) / Σk=1..58 exp(Z'kδ + X'ikβ + αk), (2)
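
As a minimal numerical sketch of the choice probability in a conditional logit model, the snippet below turns the deterministic part of utility (Z'jδ + X'ijβ + αj) into choice probabilities. The function name and the three-alternative utilities are hypothetical illustrations, not values from the study, which uses 58 LMAs.

```python
import math

def choice_probabilities(utilities):
    """Conditional logit probabilities for one individual.

    `utilities` holds the deterministic utility of each alternative
    (hypothetical values here; in the paper there are 58 LMAs).
    """
    # Subtract the maximum before exponentiating for numerical stability;
    # this leaves the probabilities unchanged.
    u_max = max(utilities)
    exp_u = [math.exp(u - u_max) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# Hypothetical utilities for three alternatives.
probs = choice_probabilities([0.2, 1.0, -0.5])

# With equal utilities across 58 LMAs, every LMA has probability 1/58.
uniform = choice_probabilities([0.0] * 58)
```

The alternative with the highest deterministic utility receives the highest probability, and the probabilities sum to one across the choice set.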

where yi is individual i's location choice out of the choice set of 58 LMAs. To estimate this model, we create 58 observations for each individual (one for each LMA) with characteristics specific to a particular LMA recorded in each observation, as well as a variable indicating the LMA in which each individual chooses to locate. It is worth noting that all individual specific characteristics that do not vary over the choice set are conditioned out of this model. Thus, for example, it is not possible to estimate whether gender is associated with living in a particular LMA, but it is possible to examine whether women are more responsive than men to local migrant networks when choosing a LMA.
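
The data layout described above can be sketched as follows: one row per individual-LMA pair, with a flag marking the chosen LMA. The sketch also illustrates why characteristics that do not vary over the choice set drop out: adding the same constant to the utility of every alternative leaves the choice probabilities unchanged. All names and values are hypothetical, and three LMAs stand in for the 58 used in the paper.

```python
import math

def expand_to_long(individuals, lma_ids):
    """Create one row per (individual, LMA) pair with a `chosen` flag."""
    rows = []
    for person in individuals:
        for lma in lma_ids:
            rows.append({
                "person_id": person["person_id"],
                "lma": lma,
                "chosen": 1 if person["lma_chosen"] == lma else 0,
            })
    return rows

def logit_probs(utilities):
    """Conditional logit probabilities from deterministic utilities."""
    exp_u = [math.exp(u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# Hypothetical inputs: two migrants and three LMAs.
people = [{"person_id": 1, "lma_chosen": "A"},
          {"person_id": 2, "lma_chosen": "C"}]
long_rows = expand_to_long(people, ["A", "B", "C"])

# An individual-specific term (e.g. a gender effect) adds the same constant
# to every alternative, so it cancels out of the probabilities.
base = [0.3, -0.1, 0.8]
shifted = [u + 1.5 for u in base]
same = all(abs(p - q) < 1e-12
           for p, q in zip(logit_probs(base), logit_probs(shifted)))
```

An interaction between an individual characteristic and an LMA characteristic does vary across rows for the same person, which is why such terms remain estimable.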

Because we have data from two censuses, we are able to include LMA fixed effects in each of our regression models. These fixed effects control for time-invariant characteristics of each LMA, such as whether it is a gateway LMA (Auckland, South Auckland and Christchurch), has a more desirable climate or has better amenities. Thus, the relationship between locational choice and the covariates in the model is identified by the within-LMA change in these characteristics between the 1996 and 2001 censuses. Including LMA fixed effects is especially important for identifying network effects, because areas with fixed characteristics that attract migrants are mechanically going to have denser networks, making networks appear to attract migrants when perhaps they do not.

Where do Recent Migrants Settle?

We first use McFadden's choice model to examine the initial location decision of recent migrants. Table 6 reports the results from estimating three specifications of this model. Each specification includes as covariates all of the variables presented in Table 5: i) the proportion of migrants from an individual's region of birth in each LMA five years ago; ii) the proportion of each LMA's population that is foreign-born five years ago; iii) the employment rate in each LMA five years ago; iv) the mean log income of full-time wage and salary workers in each LMA five years ago; v) the log mean house price in each LMA five years ago; and vi) the log population of each LMA five years ago. What varies across specifications is the population group that is used to define each variable. We do this because we have no a priori information or theory that tells us how recent migrants get their information about local areas.

The most readily available information is likely that which refers to the entire population of a LMA (e.g. what are overall employment opportunities like in Wellington). Thus, in the first specification all covariates besides the first measure of migrant networks are defined as being specific to each LMA (i.e. defined over the entire LMA population). However, if migrant networks are important for finding employment and are stratified by region of birth, recent migrants may not be attracted to a local labour market because of the overall economic conditions there, but due to how well past migrants from the same region are doing. Thus, in the second specification, labour market characteristics are defined as being specific to individuals from particular birth regions. For example, if a recent migrant is born in Australia, the employment rate in each LMA is measured for that individual as being the employment rate among all Australian-born individuals in that LMA five years ago. Another possibility is that recent migrants are drawn to areas that have good economic opportunities for individuals with similar 'skills'. Thus, in the third specification, all covariates besides local house prices are defined as being specific to an individual's skill-group, delineated by their age and qualifications (25 skill-groups based on the categories tabulated in Table 1 plus a missing qualifications group are distinguished). For example, if a recent migrant is 32 and has school qualifications, the employment rate in each LMA is measured for that individual as being the employment rate among all individuals aged between 30 and 34 with school qualifications in that LMA five years ago. We also assume in this specification that migrant networks are skill-group specific.

In each specification, we pool data from the 1996 and 2001 censuses and estimate the regression model on an approximately 10% random sample of recent migrants for computational reasons (note that even this results in 694,260 individual*LMA observations).[18] For all covariates, we present marginal effects evaluated at the average selection probability (1/58) and standard errors for these effects. As shown in Jaeger (2007), these are calculated by multiplying the coefficients and standard errors from the conditional logit model by (1/58)(1 − 1/58) ≈ 0.0169. Overall, we have no reason to prefer the results from a particular specification; thus we focus on the commonalities and differences between the specifications to establish our overall findings.
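
The scaling factor can be checked directly. In a conditional logit, the marginal effect of an alternative's own covariate on the probability of choosing that alternative is the coefficient multiplied by p(1 − p); evaluating at p = 1/58 gives a factor of about 0.0169. The coefficient and standard error below are hypothetical values chosen only to illustrate the conversion.

```python
N_LMAS = 58  # size of the choice set in the paper

def marginal_effect_scale(p):
    """Factor p * (1 - p) that converts a logit coefficient to a marginal effect."""
    return p * (1.0 - p)

p_avg = 1.0 / N_LMAS
scale = marginal_effect_scale(p_avg)

# Hypothetical conditional logit coefficient and standard error.
coef, se = 6.0, 0.5
marginal_effect = coef * scale
marginal_se = se * scale
```

Because the same constant multiplies both the coefficient and its standard error, the ratio of the two (and hence the test statistic) is unchanged by the conversion.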

Starting with the first specification, the results reported in column (1) are interpreted as follows: i) a 10 percentage point increase in the proportion of immigrants from a recent migrant's region of birth five years ago in a particular LMA (say from 5% to 15%) is associated with a 1.1 percentage point increase in the likelihood of that migrant living in that LMA; ii) a 10 percentage point increase in the proportion of a particular LMA's population that is foreign-born five years ago is associated with a 1.2 percentage point decrease in the likelihood of a recent migrant living in that LMA; iii) a 10 percent increase in the population five years ago in a particular LMA is associated with a 0.3 percentage point increase in the likelihood of a recent migrant living in that LMA; and iv) there is no significant relationship between the employment rate, average income of full-time wage and salary workers or mean house price five years ago in particular LMAs and the likelihood of recent migrants living in those LMAs.

In interpreting the size of these effects, it is useful to note that if a recent migrant chooses in which LMA to live by randomly drawing a name out of a hat, they will have a 1.7% chance of living in any particular LMA, whereas the average recent migrant lives in a LMA containing 18% of their same-region population. The coefficient of 0.105 in the first column of Table 6 implies that recent migrants are approximately twice as likely to choose to live in a LMA with 18% of their same-region population than in a randomly chosen LMA (with 1.7% of their same-region population). In contrast, they are approximately 90% less likely to live in a LMA that has the percent foreign-born population for the average recent migrant (26-28%) than one that has the average percent foreign-born population across all LMAs (13%). Further, recent migrants are approximately 3.7 times more likely to live in a LMA that has the log population for the average recent migrant (11.40-11.53) than one that has the average population across all LMAs (9.08-9.15).
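
One way to arrive at the "approximately twice as likely" figure is a linear extrapolation of the marginal effect from the baseline probability of 1/58. This is an assumption about how the comparison was computed, not a calculation stated in the text, and the variable names are illustrative.

```python
baseline_prob = 1.0 / 58       # random choice among 58 LMAs (about 1.7%)
marginal_effect = 0.105        # column (1) of Table 6, as quoted in the text

share_avg_migrant = 0.18       # same-region share faced by the average recent migrant
share_random_lma = 0.017       # same-region share in a randomly chosen LMA

# Linear extrapolation (an assumption): the change in probability is the
# marginal effect times the change in the same-region share.
delta_prob = marginal_effect * (share_avg_migrant - share_random_lma)
relative_likelihood = (baseline_prob + delta_prob) / baseline_prob
```

Under these assumptions the implied probability roughly doubles, which matches the magnitude described in the text. The coefficients behind the foreign-born and population comparisons are not quoted in the text, so they are not reproduced here.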

Turning to the second specification, the estimated relationship between migrant networks and settlement decisions is unaffected by changing how local labour market characteristics are defined. Contrary to what might be expected, it appears that recent migrants are actually settling in LMAs where past compatriots are doing badly in the labour market. For example, we find that a 10 percentage point increase in the employment rate five years ago among past migrants from the same region of birth as a particular recent migrant in a particular LMA is associated with a 0.3 percentage point decrease in the likelihood of that recent migrant living in that LMA. We also find a negative relationship between the average income of full-time wage and salary workers among past migrants from the same region of birth as a particular recent migrant in a particular LMA and the likelihood that a recent migrant settles in that LMA, but the estimated marginal effect is very small in magnitude. Examining the third specification, we find evidence that recent migrants are attracted to areas with greater foreign-born and overall populations of similarly skilled individuals, but again that labour market outcomes for similarly skilled individuals have little impact on the LMA in which recent migrants choose to settle.

Overall, we find consistent evidence that the density of migrant networks has a large impact on where recent migrants choose to settle. In particular, migrants are more likely to settle in LMAs in which a larger proportion of the previous immigrant population from their same region of birth is living, but this does not hold for networks defined by both region of birth and skill-group. On the other hand, once we control for the strength of birth region migrant networks, our results indicate that recent migrants are less likely to settle in LMAs with a proportionally greater foreign-born population, but are more likely to settle in areas with a greater foreign-born population of similarly skilled individuals. We also find consistent evidence that recent migrants are more likely to settle in larger population LMAs. We find no evidence that recent migrants choose to settle in LMAs with better labour market outcomes for either the general population, previous migrants from the same region of birth or individuals with the same skill-level.[19]

The Geographic Mobility of Earlier Migrants

We next use McFadden's choice model to examine the (re)location decisions of earlier migrants. Table 7 reports the results from estimating three specifications of this model. These specifications are identical to those estimated in Table 6 for recent migrants, with one additional control variable added to each specification. This is an indicator variable for whether a particular LMA is the same LMA in which an earlier migrant lived in the previous census. If an individual reports being overseas at the time of the previous census or has a missing previous address, the same LMA indicator is coded as zero in all 58 LMAs. This variable allows there to be hysteresis in locational choice: once located in a particular LMA, individuals are likely to remain in that area. Again, in each specification, we pool data from the 1996 and 2001 censuses, estimate the regression model on a 10% random sample of earlier migrants (resulting in 488,244 individual*LMA observations) and present marginal effects evaluated at the average selection probability (1/58) and standard errors for these effects.
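
The coding rule for the same-LMA indicator can be sketched as below. The function name, the use of None for an overseas or missing previous address, and the three-LMA list are all hypothetical choices made for illustration.

```python
def same_lma_indicator(previous_lma, lma_ids):
    """Return {lma: 0/1} flagging the LMA lived in at the previous census.

    `previous_lma` is None when the person was overseas or the previous
    address is missing, in which case every LMA is coded zero.
    """
    return {lma: int(previous_lma is not None and lma == previous_lma)
            for lma in lma_ids}

lmas = ["Auckland", "Wellington", "Christchurch"]  # illustrative subset of the 58 LMAs
stayer = same_lma_indicator("Wellington", lmas)
unknown = same_lma_indicator(None, lmas)
```

For a person with a known previous LMA, exactly one alternative is flagged; for a person with no usable previous address, the indicator contributes nothing to any alternative's utility.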

The results from the first specification are interpreted as follows: i) a 10 percentage point increase in the proportion of immigrants from an earlier migrant's region of birth five years ago in a particular LMA is associated with a 0.9 percentage point increase in the likelihood of that earlier migrant living in that LMA; ii) a 10 percentage point increase in the proportion of a particular LMA's population that is foreign-born five years ago is associated with a 1.4 percentage point decrease in the likelihood of an earlier migrant living in that LMA; iii) a 10 percentage point increase in the employment rate five years ago in a particular LMA is associated with a 1.6 percentage point increase in the likelihood of that earlier migrant living in that LMA; iv) living in a particular LMA five years ago makes it 7.6 percentage points more likely that an earlier migrant will still be living in that LMA; and v) there is no significant relationship between the average income of full-time wage and salary workers, overall population or mean house price five years ago in particular LMAs and the likelihood of earlier migrants living in those LMAs.

In the next specification, we examine region of birth specific labour market characteristics. Contrary to what we found in the first specification, we now find evidence that earlier migrants are actually settling in LMAs where past compatriots are doing badly in the labour market, although the magnitudes of these effects are very small. Turning to the third specification, where we examine the impact of skill-group specific covariates, we now find that earlier migrants are less likely to live in areas with past compatriots in the same skill group (but more likely to live in LMAs with greater foreign-born and overall populations of similarly skilled individuals), which may indicate that these individuals are viewed as potential competitors in the labour market.

Overall, as with recent migrants, we find consistent evidence that the density of migrant networks has a large impact on where earlier migrants choose to settle. We find the same overall pattern as with recent migrants; earlier migrants are more likely to settle in LMAs in which a larger proportion of the previous immigrant population from their same region of birth live and are less likely to settle in LMAs with a proportionally greater foreign-born population. In contrast, they are not more likely to settle in LMAs with a large proportion of people from the same region of birth and skill-group, but are more likely to settle in areas with a greater foreign-born population of similarly skilled individuals. Compared to those for recent migrants, the magnitudes of these effects are generally smaller for region of birth networks, but larger for foreign-born population networks. We also find that earlier migrants choose to settle in LMAs with better labour market outcomes for the general population, but not in LMAs with better labour market outcomes for previous migrants from the same region of birth or for individuals with the same skill-level. This is the first indication that local labour market conditions may have an impact on where migrants settle and provides suggestive evidence that local labour market conditions become a more important determinant of where migrants live the longer they are in New Zealand.

Additional Results

The results in Tables 6 and 7 constrain the estimated impact of migrant networks and LMA characteristics on settlement decisions to be the same across individuals and over time. In Table 8, we present results from three specifications where we relax these assumptions in particular ways.[20] In the first two columns, we allow the impact of migrant networks and LMA characteristics on settlement decisions of recent and earlier migrants to differ in 1996 and 2001. This is done by interacting a dummy variable for whether an observation is from the 2001 census with each of these variables. Otherwise, these models are identical to those estimated in the first specification of Tables 6 and 7 - that is, covariates besides the first migrant network variable are population specific. Only the impact of migrant networks on the settlement decisions of recent and earlier migrants is found to vary over time. For both migrant groups, migrant networks have a larger effect on settlement decisions in 1996 than in 2001, and while differences are statistically significant, they are not large in magnitude, with migrant networks still having important effects on settlement decisions in both years.
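
The interaction approach described above can be sketched as follows. The function, the column names and the covariate values are hypothetical; the sketch only shows how a 2001 dummy is multiplied into each covariate so that its coefficient measures the difference in the effect between the two census years.

```python
def add_year_interactions(row, covariates, year_key="census_year"):
    """Return a copy of `row` with each covariate interacted with a 2001 dummy."""
    out = dict(row)
    d2001 = 1 if row[year_key] == 2001 else 0
    out["d2001"] = d2001
    for name in covariates:
        out[name + "_x_2001"] = row[name] * d2001
    return out

covs = ["network_share", "employment_rate"]
row_1996 = add_year_interactions(
    {"census_year": 1996, "network_share": 0.10, "employment_rate": 0.60}, covs)
row_2001 = add_year_interactions(
    {"census_year": 2001, "network_share": 0.12, "employment_rate": 0.65}, covs)
```

The 2001 dummy on its own is the same for every LMA a given individual faces, so, like other characteristics that do not vary over the choice set, it would be conditioned out; only the interaction terms carry information. The same construction applies to the indicators for English-speaking background and university degree.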

The third and fourth columns report the results from an alternative specification where we interact all covariates with an indicator variable for whether each migrant was born in a region where English is generally spoken.[21] Perhaps surprisingly, we find that migrant networks have a larger impact on the settlement decisions of recent migrants from English-speaking backgrounds (ESB) than those from non-ESB regions. There is also some evidence that higher employment rates do attract recent migrants from non-ESB regions; a 10 percentage point increase in the employment rate five years ago in a particular LMA is associated with a 0.8 percentage point increase in the likelihood of that recent migrant from a non-ESB region living in that LMA, but this is significant only at the 10% level. On the other hand, there is no evidence that higher employment rates attract recent migrants from ESB regions (the interaction term is negative, significant, and nearly the same size as the positive effect for non-ESB recent migrants). Other interesting findings are that recent migrants from non-ESB regions are attracted to LMAs with lower house prices, while house prices have no impact on the settlement decisions of recent migrants from ESB regions and that the size of the LMA population matters less to the settlement decisions of ESB migrants than to those of non-ESB migrants. For earlier migrants, we find limited differences between ESB and non-ESB migrants in the impact of migrant networks and LMA characteristics on settlement decisions, but earlier migrants from ESB regions appear less mobile.

The fifth and sixth columns report the results from a final specification where we interact all covariates with an indicator variable for whether each migrant has a university degree. For recent migrants, we find that the highly educated are more attracted to LMAs with higher average wages five years ago than the less educated, but that, overall, average wages have an insignificant impact on the settlement decisions of both university graduates and other recent migrants. We also find that recent migrants with university degrees are attracted to LMAs with lower house prices (the combined main effect and interaction term are significantly different from zero), while house prices have no impact on the settlement decisions of the less educated. For earlier migrants, migrant networks have a smaller, but still important, impact on the resettlement decisions of highly educated migrants compared to other migrants. Consistent with other findings in the literature, earlier migrants with university degrees are also less likely to remain in the same LMA over time.


[18] In the second (third) specification, individual*LMA observations are dropped if the particular LMA does not have any individuals from the same region of birth (skill-group) living in it five years ago. This is equivalent to assuming that these particular LMAs are not in the choice set for particular individuals.

[19] We test the robustness of our findings to excluding LMA fixed effects, adding covariates measuring the change in each population characteristic between five years ago and the current census (excluding recent migrants) and examining settlement decisions among individuals residing in LMAs only with a working-age population greater than 10,000. We do not find evidence in any of these specifications that recent migrants are settling in LMAs with better labour market outcomes, controlling for other characteristics.

[20] We also estimate an additional specification where impacts are allowed to vary by the gender of the migrant. We find no significant differences in the impact of migrant networks and LMA characteristics on the settlement decisions of men and women so we do not present these results.

[21] Winkelmann and Winkelmann (1998) developed a list of countries from which most migrants to New Zealand can speak English well based on individual responses to a question in the census about spoken languages. We use this list to identify which of the 15 regions in our data send primarily English speaking migrants to New Zealand. These regions are: Australia; UK and Ireland; Western Europe; Northern Europe; North America; and Africa (from which most migrants to New Zealand are English speakers from South Africa and Zimbabwe).