This study examines the relationships between age, education or gender dissimilarity and movement from one workplace to another, examining different dissimilarity measures and asymmetries in these relationships.
Large-scale employer–employee register data from Finland were used to estimate discrete time duration models for the probability of job-to-job exits from plants. The alternative dissimilarity measures were the Euclidean distances for age and education and the shares of opposite gender, age and education groups.
When the Euclidean distance is used as the dissimilarity measure, age dissimilarity is negatively related to workplace exits; however, age dissimilarity is positively related to exits for young women. Educational dissimilarity, meanwhile, is positively related to exits. When the share of opposite groups is used, the results for age and educational dissimilarity depend on how the opposite age and educational groups are defined. The share of women is positively related to the probability of job change among men, but for women, the share of men negatively affects exits.
Identification relied on the assumption that unobservable individual characteristics can be sufficiently approximated using within-individual averages of the variables.
Researchers should conduct extensive sensitivity analyses and allow for asymmetries in workplace relational demography research.
Only a few previous studies used large-scale datasets to estimate the effects of dissimilarities on turnover, and those studies did not systematically compare different methods of measuring dissimilarities.
1. Introduction
The potential benefits of workforce diversity for firms and society, which include improved problem-solving and innovation, are receiving increasing attention (Page, 2007). This is against a backdrop of general trends leading to more diverse workforces in firms in many countries. For instance, population aging and rising statutory pension ages are increasing the share of older employees, and growing female labor force participation is increasing the share of women in the workplace. At the same time, job polarization is increasing the demand for employees at both tails of the skill distribution. The growing employee diversity resulting from these trends implies that many employees now work with colleagues who are different from them in terms of observable characteristics, such as age, gender, or education.
In that context, employees’ feelings about their position in the workplace may affect their incentive to change jobs or retire from the labor market. We empirically examined the connection between workforce dissimilarity and job-to-job turnover, concentrating on age, education, and gender dissimilarities and on how their effects vary by gender. While the diversity of employees (e.g. variance in an employee characteristic such as age) measures the structure of the workforce at the group (firm, plant, work group) level, the dissimilarity of employees, or relational demography, is an individual-level concept that measures how different a person is, in terms of a characteristic such as age, from others in the workplace. Much of the previous research on this topic was in the field of human resource management (HRM), where hypotheses on the connection between workforce dissimilarity and job changes are based on sociological and psychological arguments (e.g. Williams and O’Reilly, 1998; Riordan, 2000; Joshi et al., 2011; Guillaume et al., 2012). An economic approach, meanwhile, holds that dissimilarity is a work characteristic that positively or negatively affects the utility an employee derives from the job and thereby their incentive to search for another job. The factors contributing to the utility from a job can, of course, have the same basis as is considered in HRM research, so the approaches need not be inconsistent.
Our study contributes to the literature in several ways. Previous empirical research on demographic dissimilarities and job changes mostly used small, non-representative samples, while our data set is more extensive than most used previously. In this study, we used data on individuals matched to their employer plants to study how workforce dissimilarity is related to job changes. Since the data set covers the whole working-age population and all plants, it is both very large and representative of the whole private sector of the economy. Furthermore, the previous literature seldom discussed causality. Addressing that, we used a discrete time duration model for job changes and the Mundlak approach, where the individual effects are a function of the within-individual averages of the explanatory variables. Although imperfect, this constituted an attempt to move closer to producing causal estimates of the effects of workforce dissimilarities. Additionally, previous studies typically used only one type of dissimilarity measure, but we compared the results of alternative measures. The dissimilarity measures we used were the Euclidean distance, the square root of the average squared age (or education) difference from all other employees, and the shares of opposite age, education, or gender groups. We also studied the heterogeneity of the results by age group, education level, and gender. Finally, research results on workforce dissimilarities and employee job changes are available so far only from a few countries, and our study is the first to use Finnish data.
We found clear differences in the relationship between workforce dissimilarity and job changes when we examined dissimilarity in terms of age, education, and gender, with age dissimilarity mostly negatively and educational dissimilarity positively related to job changes. This supports the idea of complementarity between employees of different ages and a preference to work with colleagues with the same education level. However, in some subgroups, the results may differ. We also found that the commonly used approaches to measure dissimilarities, the D-index and the interaction approach, can produce different results. Lastly, there was gender heterogeneity in the results; most notably, men’s job changes were positively related to the share of women, whereas women’s job changes were negatively related to the share of men. Our results lead us to call for extensive sensitivity analyses to be conducted in the future and to recommend that researchers allow for asymmetries in workplace relational demography research.
The structure of this paper is as follows. In Section 2, we discuss the arguments in the HRM literature and in labor economics that are used to explain the relationship between workforce dissimilarity and job changes. Section 3 introduces dissimilarity measures and, in Section 4, we review the previous empirical research on workforce dissimilarity and turnover. Section 5 introduces our data and the econometric methods applied. Section 6 then presents the results, and Section 7 concludes this paper.
2. Theoretical background
There are three main hypotheses in the HRM literature on the effects of workforce dissimilarity on job changes (e.g. Riordan, 2000; Williams and O’Reilly, 1998). First, according to the similarity attraction hypothesis, people like to work with co-workers who are similar to them in terms of gender, age, education, etc. Therefore, people are more likely to move away from a workplace where they are different from others. Second, self-categorization and social identity theories posit that individuals form their social identity by categorizing themselves in terms of their characteristics. They also categorize others, and if the demography of the workplace is consistent with the person’s social identity, there will be better social integration and lower turnover. Third, according to the tokenism hypothesis, a small minority that is dissimilar from the others is highly visible. This can take the form, for instance, of a token female, a token older employee, or a token foreigner. Because of their visibility, these individuals receive pressure from the majority, which raises their likelihood of quitting. The first two arguments have remarkably similar implications, and most empirical studies using these alternative theories adopt the same kinds of variables (Riordan, 2000); differentiating between the hypotheses is therefore difficult.
We can also approach the relationship between workforce dissimilarity and job changes from the economic point of view. Workforce dissimilarity is a workplace characteristic that affects the utility the employees obtain from the job. This, in turn, affects their likelihood of searching for a new job and thereby the job turnover rate. Let us first consider economic hypotheses for the impacts of workplace diversity on the utility from a job. The economic literature has not discussed dissimilarity as such, but the effects of diversity have received attention, and if a workplace is diverse with respect to a certain characteristic, such as age, there are necessarily dissimilarities. Alesina and La Ferrara (2005) propose that diversity effects are related to utility, strategic behavior, and the production function. Utility is impacted, for example, when there is a taste for discrimination among some employees (Becker, 1957), which may lead employees to feel isolated, especially those in the discriminated minority. This argument resembles HRM arguments. (Note that the “utility channel” here refers to the utility of those who, for example, discriminate against the minority. All the channels affect the utility of the individuals who are different from others.) Secondly, being different from others can affect the optimal behavior of employees, for example, in teamwork. Not being able to work in a team as productively as others increases peer pressure and lowers the utility from the job (e.g. Hamilton et al., 2012). Wage disparities can also have an effect, as an employee may feel that their relatively low wage is unfair and thus perceive the job as having low utility. Alternatively, being below the workplace average wage may signal they have good promotion prospects, which may improve how they perceive the utility they obtain from the job (e.g. Clark et al., 2009; Pfeifer and Schneck, 2012). 
Finally, in the production process there may be complementarities between employees of different skills or information sets (e.g. Lazear, 1999). Being different can in such circumstances benefit employees since they have valuable skills to contribute that others do not; however, when such complementarities are missing, an employee may feel undervalued in the workplace, as their work input can easily be substituted for by others. All the above arguments can be used to explain the effects of various dissimilarities, whereby age or gender can form sources of discrimination, workers who are different from others in terms of age, education, or gender can face peer pressure, and workers of different ages or education levels may be complements or substitutes in production.
The theory of on-the-job search can explain how workplace characteristics, like dissimilarity, affect job changes (e.g. Manning, 2003). An employee may take another job if the wage offer is higher than their current wage or exit the labor market if the alternative (value of leisure, pension, etc.) is better than their wage. The job separation rate is therefore a negative function of the wage. This framework can be extended by including dissimilarity in the utility function of the employees. In the case of perfectly competitive labor markets, wages will be changed to compensate for workforce dissimilarity. Workforce dissimilarity considered as a disamenity will lead to higher wages, and dissimilarity as an amenity to lower wages. Accordingly, in this case, workforce dissimilarity will not have a relationship with observed job changes. However, when there are frictions in the labor market, and wages are not changed sufficiently, the job turnover rate will come to be a function of both wages and workforce dissimilarity. This kind of approach was used in an analysis of working conditions by Gronberg and Reed (1994) (see also Manning, 2003).
In sum, there are opposing arguments related to the nature of the relationship between workforce dissimilarity and job changes. Similarity attraction, social integration, tokenism, discrimination, peer pressure, and lack of complementarities all predict a positive relationship, while the existence of complementarities in production and promotion prospects predict a negative relationship.
3. Measures of dissimilarity
For continuous variables, such as age, the most commonly used dissimilarity measure is the Euclidean distance, or D-index (e.g. Harrison and Klein, 2007). This measures the difference in person i’s demographic characteristics from those of the other employees in the same workplace. As an example, let us consider age dissimilarity in a workplace with N employees. Leaving out the workplace and time subscripts for simplicity, for employee i, we take the age differences from all employees in the same workplace in the same time period, square these differences, and take the square root of the average squared difference. (Naturally, when k = i, the difference is zero.) The age-related D-index for individual i is

$$D_i = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\mathrm{age}_i - \mathrm{age}_k\right)^2}$$

where the sum runs over all N employees in the workplace, including i.
Similar measures are used for other continuously measured characteristics, such as years of education. However, the D-index has been criticized, for example, because it is symmetric above and below the average and collapses several components to a single index (see, e.g. Riordan and Wayne, 2008). Some researchers prefer to use diversity (variance) and isolation (difference from the workplace average) as separate variables (e.g. Leonard and Levine, 2006). Yet, even this is problematic, since it imposes isolation above and below the average to have effects of different signs if only one isolation term is used.
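As an illustration of the D-index and of the isolation measure mentioned above, the following is a minimal Python sketch. It assumes the average is taken over all N employees in the workplace (with the own term contributing zero), as the verbal description suggests; the function names and the example ages are hypothetical and are not taken from our data.

```python
import math

def d_index(values, i):
    """Euclidean distance (D-index) of employee i from all employees
    in the same workplace: root of the average squared difference.
    Assumption: the average is over all N employees, including i."""
    n = len(values)
    squared_diffs = [(values[i] - v) ** 2 for v in values]  # term for k = i is zero
    return math.sqrt(sum(squared_diffs) / n)

def isolation(values, i):
    """Signed difference of employee i from the workplace average."""
    return values[i] - sum(values) / len(values)

# Hypothetical plant with three employees
ages = [30, 40, 50]
d_young = d_index(ages, 0)   # employee 10 years below the mean
d_old = d_index(ages, 2)     # employee 10 years above the mean
```

In this example, `d_young` and `d_old` are equal although `isolation` has opposite signs for the two employees, which illustrates the symmetry of the D-index above and below the average.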
For binary and discrete characteristics, the shares of workers in different demographic groups are commonly used as dissimilarity measures. This is the interaction approach (Riordan and Wayne, 2008), where an indicator of an individual’s demographic characteristic is interacted with a variable that describes the demographics of the workplace. This is straightforward for binary variables, such as gender, where, for example, an indicator for females is interacted with the share of women in the workplace (which is one minus the share of the opposite group, men). However, for variables such as education levels or age groups, forming the opposite group is not as easy. Consider, for example, three age groups: “young,” “middle-aged,” and “old.” For “young,” the opposite age group is those with a higher age, the sum of “middle-aged” and “old.” Similarly, for “old,” the opposite group is the sum of “young” and “middle-aged.” However, for “middle-aged,” the opposite group consists of both those who are younger and those who are older, basically assuming symmetry. Sometimes, the existence of several groups is managed, for example, by using the share of workers in the employee’s own demographic group as the dissimilarity variable (e.g. Hirsch et al., 2020). However, this implicitly assumes symmetry. Therefore, it may be necessary to use several interaction terms (e.g. Leonard and Levine, 2006). Another disadvantage of the interaction approach is that if the underlying data are continuous, like age or education years, using only a small number of age or educational groups will lead to a loss of information.
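The difference between a single "opposite group" share and separate shares of younger and older co-workers can be sketched as follows. This is only an illustration: the age cut-offs follow the three groups used later in the paper (≤29, 30–49, ≥50), shares are assumed to be computed over all employees in the plant, and the function names are hypothetical.

```python
from collections import Counter

AGE_GROUPS = ["young", "middle", "old"]  # ordered from youngest to oldest

def age_group(age):
    """Map age in years to one of three groups (<=29, 30-49, >=50)."""
    if age <= 29:
        return "young"
    if age <= 49:
        return "middle"
    return "old"

def group_shares(ages):
    """Share of each age group among all employees of a plant."""
    counts = Counter(age_group(a) for a in ages)
    return {g: counts[g] / len(ages) for g in AGE_GROUPS}

def opposite_shares(own_group, shares):
    """Shares of younger and older groups relative to the own group.
    'opposite' is their sum, i.e. the symmetric measure."""
    idx = AGE_GROUPS.index(own_group)
    younger = sum(shares[g] for g in AGE_GROUPS[:idx])
    older = sum(shares[g] for g in AGE_GROUPS[idx + 1:])
    return {"younger": younger, "older": older, "opposite": younger + older}

# Hypothetical plant with five employees
shares = group_shares([25, 28, 35, 45, 55])
middle = opposite_shares("middle", shares)
```

For a middle-aged employee, using only `middle["opposite"]` forces the shares of younger and older co-workers to have the same coefficient, whereas entering `middle["younger"]` and `middle["older"]` as separate interaction terms allows asymmetric effects.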
Sometimes, the D-index is applied to binary characteristics in a way that essentially leads to the interaction approach. For example, if an indicator is equal to one for employees who are of the opposite gender to person i and equal to zero for those of the same gender, the D-index is the square root of the share of opposite-gender employees in the workplace. A downside of using the D-index in this way for discrete characteristics is that it may produce artificially asymmetrical effects for minority and majority groups when the share of the minority is small (Tonidandel et al., 2008).
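The equivalence stated above can be written out in a short derivation. The notation is introduced here for illustration only: x_k is an indicator equal to one if employee k is of the opposite gender to person i (so that x_i = 0), s_i is the share of opposite-gender employees in the workplace, and the D-index is assumed to average over all N employees.

```latex
D_i = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(x_i - x_k\right)^2}
    = \sqrt{\frac{1}{N}\sum_{k=1}^{N} x_k^2}
    = \sqrt{\frac{1}{N}\sum_{k=1}^{N} x_k}
    = \sqrt{s_i}
```

The second equality uses x_i = 0 and the third uses x_k^2 = x_k for a binary indicator. Because the square root is concave, the index is not linear in the share: in a workplace where, for example, 4% of employees belong to the minority, the index is √0.96 ≈ 0.98 for a minority member but √0.04 = 0.2 for a majority member.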
Moreover, both the D-index and the interaction approach have been criticized for requiring almost complete information on the employees in the work unit. An alternative approach that works with incomplete data is the use of subjective perceptions of dissimilarity (e.g. on a scale from one to seven); however, this requires a special survey of a sample of employees.
4. Previous empirical research on workforce dissimilarity and job turnover
The empirical relationship between workforce dissimilarity and various employee outcomes has been examined widely in the HRM literature (for surveys, see, e.g. Williams and O’Reilly, 1998; Riordan, 2000; Joshi et al., 2011; Guillaume et al., 2012), but much less in industrial relations or labor economics (see, however, Leonard and Levine, 2006; Ilmakunnas and Ilmakunnas, 2011; Kurtulus, 2011; Hirsch et al., 2020, for applications). The outcomes studied have included employee turnover, but also organizational citizenship, performance, communication, and earnings, and the dissimilarities that have been examined have included age, tenure, education, gender, and ethnic dissimilarity.
Most studies of workforce dissimilarity have used small samples of special groups, such as executives. Let us take the studies on age dissimilarity and turnover as an example. The samples in the studies were as follows: 599 top managers in 31 US firms (Wagner et al., 1984); 356 executives in 50 US firms (O’Reilly et al., 1989); 939 executives in 93 US management teams (Jackson et al., 1991); 1700 individuals in 151 US work units (Tsui et al., 1992); 220 executives in 40 Japanese firms (Wiersema and Bird, 1993); 356 executives in 50 Dutch firms (Godthelp and Glunk, 2003); 66 executives in five Dutch newspaper publishers (Boone et al., 2004); 175 coaches in US college track-and-field teams (Cunningham, 2007); 449 hair stylists in 112 Taiwanese salons (Liao et al., 2008); 328 individuals in 42 work groups in four US financial service organizations (Riordan and Wayne, 2008); and 134 faculty members in a Dutch university (Bogaert et al., 2009). Beyond these, similar samples have been used in numerous studies that have examined the relationship of turnover or other outcomes with dissimilarities related to characteristics such as tenure, education, gender, and race.
Only a few studies on employee turnover and workforce dissimilarity have used larger data sets. Among these, Elvira and Cohen examined the gender composition and turnover in a single US firm with c. 10,000 employees in 10 business units and across 20 job levels. Leonard and Levine (2006) studied age, gender, and racial dissimilarities and turnover using data on 70,000 employees in 700 workplaces of a single US firm. Other kinds of large data sets are those of national register data on firms and their employees. Hirsch et al. (2020) used German data on 2.6 million workers at 1780 workplaces to explain turnover by gender, nationality, education, age, and tenure dissimilarity. Moreover, Bygren (2004) used information on 170,000 employees in 1928 workplaces, and Bygren (2010) on 720,000 employees in 1890 workplaces in Sweden to study gender and ethnic dissimilarities and turnover.
The results support the view that at least some dissimilarities are related to higher employee exit rates, but, in general, the evidence is mixed. The results on a particular type of dissimilarity often vary across studies, and within studies that include many kinds of dissimilarities, the results vary from one type to another. To illustrate the diversity in the results, let us reconsider age dissimilarity and turnover as an example. Wagner et al. (1984), Tsui et al. (1992), Godthelp and Glunk (2003), Boone et al. (2004), and Riordan and Wayne (2008) found a positive relationship between age dissimilarity and turnover (in some of these studies, this was a negative relationship between age dissimilarity and the likelihood of staying, and in some, it was between age similarity and turnover). O’Reilly et al. (1989), meanwhile, found that age dissimilarity was negatively related to employee exit rates, while Jackson et al. (1991), Wiersema and Bird (1993), and Bogaert et al. (2009) found the relationship was not significant. Elsewhere, in Liao et al.’s (2008) study, both actual (“surface level”) and perceived age dissimilarity had a weak relationship with turnover, while Cunningham (2007) found that perceived age dissimilarity was related to dissimilarity in characteristics (“deep-level” dissimilarity), which in turn was positively related to turnover intentions. Moreover, Leonard and Levine (2006) found that age isolation (absolute difference from mean age) was negatively related to turnover, but the interaction of age isolation and the employee’s age had a positive relationship with turnover; this meant that for older workers, their isolation was likely to increase turnover. Lastly, Hirsch et al. (2020) found that the share of same-aged co-workers was negatively related to exits (and, therefore, that the share of co-workers from other age groups was positively related to exits); however, their result could reverse for very isolated workers (those in the first decile of the co-worker share distribution).
5. Data and models
Our data set is the FOLK data set of Statistics Finland, which covers the whole working-age population of Finland. Using person identifiers, we can track individuals over time. FOLK includes information on individuals’ characteristics, such as age, education, and family status. Each individual is linked to their employer at the end of the year with plant and firm identifiers. The information on the employer originates from the Business Register, which contains annual information on plant and firm characteristics. The link to the employer is missing mainly for self-employed and public-sector employees. We used the FOLK data combined with Business Register data for the years 2000–2019.
We treated plants as the work units, that is, the level at which dissimilarity was measured. Using the terminology of Guillaume et al. (2012), these are “pseudo” work units, as not all employees necessarily interact strongly.
We identified job exits by comparing plant identifiers at the ends of two consecutive years. Therefore, we excluded persons without plant identifiers (except to identify transitions out of the labor market). We also excluded unclear cases, such as ones with a firm identifier but no plant identifier. Firm identifiers alone do not identify employee exits, because they can change, for example, when a plant is sold to another firm or there are other reorganizations in a firm’s structure, or they can stay the same when a person changes from one plant to another within the same firm. We included in the analysis the following cases: (1) stayer: the plant identifier is the same in years t − 1 and t and the firm identifier is the same or has changed; (2) job-to-job exit: the plant identifier changes from t − 1 to t and the firm identifier is the same or has changed; (3) other exit: the plant and firm identifiers exist in t − 1, but both are missing in t. We allowed for one-year gaps in employment spells. If a worker was employed in a particular plant in year t − 1, then not in year t, but again in year t + 1, we did not consider this an exit. The case “other exit” consisted of transitions to unemployment, retirement, or otherwise out of the labor force.
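The classification rules above can be summarized in a short Python sketch. It is a simplified reading of the rules and not the code used for the estimations: the function name, the tuple layout, and the label "excluded" are hypothetical, and the one-year gap rule is assumed to apply whenever the worker is back in the same plant at the end of year t + 1.

```python
def classify_transition(prev, now, nxt=None):
    """Classify the transition between the ends of years t-1 and t.

    prev, now, nxt: (plant_id, firm_id) tuples observed at the ends of
    years t-1, t and t+1; an identifier is None when it is missing.
    """
    prev_plant, prev_firm = prev
    now_plant, now_firm = now
    if prev_plant is None or prev_firm is None:
        return "excluded"            # no usable employer link in t-1
    if now_plant == prev_plant:
        return "stayer"              # firm id may or may not have changed
    if nxt is not None and nxt[0] == prev_plant:
        return "stayer"              # one-year gap: back in the plant in t+1
    if now_plant is None and now_firm is None:
        return "other exit"          # unemployment, retirement, out of labor force
    if now_plant is None:
        return "excluded"            # unclear: firm id but no plant id
    return "job-to-job exit"         # plant id has changed

# Hypothetical identifiers
same_plant_new_firm = classify_transition(("p1", "f1"), ("p1", "f2"))
new_plant_same_firm = classify_transition(("p1", "f1"), ("p2", "f1"))
```

Here `same_plant_new_firm` is a stayer and `new_plant_same_firm` is a job-to-job exit, which shows why the classification is based on plant identifiers rather than firm identifiers.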
The reason for exit was not known; however, job-to-job exit is most likely a voluntary move to a new job, and transition to unemployment an involuntary one. Job-to-job exits can be “forced” by an employer through the threat of future dismissal; still, it is likely that voluntary transitions dominate and reflect the preferences of the employees. Another factor to consider is that there was likely some misclassification within the dataset. For instance, job-to-job transitions were based on year-to-year comparisons, which could hide short periods of unemployment.
Most small-sample studies on workforce dissimilarity and job changes do not discuss whether employee exits are voluntary quits or employer-initiated layoffs, or whether the exit results in a new job or moving out of employment. Our use of job-to-job transitions corresponded with the practice in previous studies that used register data. For instance, Bygren (2004, 2010) and Hirsch et al. (2020) studied job-to-job transitions, without distinguishing between voluntary employee exits and layoffs. Furthermore, the data set of Leonard and Levine (2006) contained information on the reason for exiting, but in their principal analyses, they treated all exits as voluntary and did not distinguish between job-to-job and job-to-unemployment transitions; furthermore, in a robustness analysis that used voluntary quits, their conclusions were the same as in the analysis with all exits.
We started with the D-index as the measure of dissimilarity for the continuous variables age and education years (based on standard degree times). We had complete data on the employees in the work units, so we could calculate this index. As shown in Figure 1, for most of the observations in our sample, the Euclidean distance for age was between 5 and 25. For education, the distribution was bimodal, with many workplaces where educational dissimilarity is zero, and another peak at around two to three.
To address symmetry, we also examined the relationship between job exits and dissimilarity, allowing those above and below the mean age or education years to have different coefficients. Chattopadhyay (1999), among others, has emphasized another kind of asymmetry: the possibility of asymmetric effects across employee groups. We therefore examined the results by age and education groups and gender.
For gender dissimilarity, we used the share of employees of the opposite gender (gender was based on the variable “sex” in FOLK, which is the biological sex at birth). As an alternative to the D-index, we used the interaction approach for age and education by aggregating these variables into discrete groups and considering both symmetric and asymmetric effects. We used the age groups ≤29, 30–49, and ≥50 years, and education was divided into primary, secondary, and tertiary education. Table 1 shows the average shares of own and opposite groups for each of the discrete variables: age, education, and gender. The rows “All” show the overall averages of plant-level shares. There is a tendency to work in plants where one’s own age is overrepresented, as shown by the average shares of the own-age group, which are higher than the corresponding overall shares. This is also the case with the educational groups. Moreover, the age and education distributions are skewed. Those in the youngest age group typically work in plants where the share of the oldest group is low, and vice versa. The same happens with education. Especially low is the share of the lowest educated in typical workplaces of the highest educated. The workplaces are also segregated by gender. Women typically work in plants where the share of women is 64%, although the average plant-level share of women in the total data is 39%. Correspondingly, men typically work in plants where the share of men is high.
The explanatory variables included personal characteristics: age, education years (in models with the D-index), indicators for age groups and education levels (in models with the interaction approach), an indicator for females, marital status (indicator for being married or cohabiting), and the number of children under the age of seven. It is likely that job mobility varies with age, so controlling for age is important. Moreover, education affects opportunities in the labor market and likely affects job-quitting behavior. Gender and family have also been found in previous studies to be related to mobility in the labor market. We included the log of real income (annual earnings from tax registers, deflated with the consumer price index), which we expected to be negatively related to job changes.
The plant characteristics included indicators for plant size (six size groups: 0–9, 10–19, 20–49, 50–99, 100–499, and 500+ employees), indicators for plant age groups (five groups based on starting year), relative employment change (change divided by two-year average employment), log of productivity (log of real sales per employee), and indicators for industries (13 industries), as well as indicators for plants belonging to exporting, importing, foreign-owned, and publicly owned firms. It is likely that job change behavior is different in plants of different sizes, for example, because promotion opportunities are different. Furthermore, young plants are likely to have higher job turnover rates since they are more volatile and because the process of building the organization may involve (both voluntary and involuntary) turnover among the personnel. The relationship between plant growth and job changes is not clear a priori. On the one hand, plants that are downsizing may have more voluntary job changes, but on the other hand, plants that grow fast often have many short-tenure employees who may be more likely than others to quit. Low-productivity plants may have more layoffs and other job changes, whereas high-productivity plants can be more profitable and hence have more stable employment relationships. The employment relationships may be more volatile in plants belonging to firms engaged in foreign trade or under foreign ownership than in other plants, since foreign business cycles affect them more. On the other hand, exporting firms may be attractive workplaces because of higher wages or better promotion prospects, which may reduce turnover.
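The relative employment change can be illustrated with a small Python sketch. It assumes that the two-year average is the mean of employment in years t − 1 and t; the function name and the employment figures are hypothetical.

```python
def relative_employment_change(emp_prev, emp_now):
    """Employment change from t-1 to t divided by the average
    employment of the two years (assumed definition)."""
    average = (emp_prev + emp_now) / 2
    if average == 0:
        raise ValueError("employment is zero in both years")
    return (emp_now - emp_prev) / average

growth = relative_employment_change(100, 120)   # expanding plant
decline = relative_employment_change(120, 100)  # downsizing plant
```

Under this definition, the measure is symmetric for expansions and contractions of the same absolute size and is bounded between −2 and 2, so very small plants do not produce extreme growth rates.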
We also controlled for plant-level employee characteristics: average education years and average age. They are part of the corresponding dissimilarity indexes but may have independent relationships with turnover. (Leaving them out results in higher (in absolute value) coefficients for the dissimilarity indexes but does not change their signs.) We also controlled for time, since job change behavior likely has cyclical variation.
Using individual-level panel data, we estimated discrete time duration models (Allison, 1982; Jenkins, 1995, 2005; Cameron and Trivedi, 2005), using in our estimations a flow sample of new employment spells that started after the year 2000, which we followed until 2019. The dependent variable was a binary indicator for job-to-job transitions, which was equal to zero in periods without transitions and equal to one in the year when a job transition happened. After that, the spell ended. Spells that had not ended by 2019 were right-censored. We compared the job changers to the stayers, leaving out exits to other destinations, essentially treating them as censored observations. Transitions to unemployment, retirement, and out of the labor force therefore also led to the end of a spell, although the transition indicator stayed at zero. In the analysis, we used a logit model and a complementary log–log model, which corresponds to the discrete time proportional hazard model. All job exits were included in an additional multinomial logit model, with three competing risks: stay, job-to-job transition, and other transition (unemployment, retirement, or otherwise out of labor force as the destination); however, we concentrated on the results for the job-to-job transitions. A person could have several separate spells. We, therefore, calculated standard errors, allowing for clustering within individuals. All the estimated models included a third-degree polynomial of spell duration.
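The construction of the estimation sample and the two hazard specifications can be sketched in Python as follows. This is a stylized illustration under our reading of the description above, not the estimation code: the function names and outcome labels are hypothetical, and the final observation of a spell ending in an "other" exit is assumed to be dropped so that the spell is censored one period earlier.

```python
import math

def expand_spell(duration, outcome):
    """Expand one employment spell into person-year rows.

    duration: number of years the spell is observed (1, 2, ...).
    outcome:  "job-to-job", "other" or "censored" in the last year.
    Each row holds the cubic duration polynomial and the binary
    dependent variable y (1 only in the year of a job-to-job exit).
    """
    if outcome not in {"job-to-job", "other", "censored"}:
        raise ValueError("unknown outcome: " + outcome)
    # Assumption: the year of an "other" exit is left out, so the
    # spell is treated as censored in the preceding year.
    last = duration - 1 if outcome == "other" else duration
    rows = []
    for t in range(1, last + 1):
        y = 1 if (outcome == "job-to-job" and t == duration) else 0
        rows.append({"t": t, "t2": t ** 2, "t3": t ** 3, "y": y})
    return rows

def logit_hazard(xb):
    """Exit probability under the logit specification."""
    return 1.0 / (1.0 + math.exp(-xb))

def cloglog_hazard(xb):
    """Exit probability under the complementary log-log specification."""
    return 1.0 - math.exp(-math.exp(xb))

rows = expand_spell(3, "job-to-job")
```

In this sketch, a three-year spell ending in a job change yields the outcomes 0, 0, 1, and a one-period spell ending in an "other" exit yields no rows at all. The linear index `xb` stands for the duration polynomial and the explanatory variables multiplied by their coefficients.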
All plant-related characteristics in the models had lagged values, including the dissimilarity measures. When there was a job transition during year t, the person was linked to a new plant. Therefore, we explained the job transition with the year t − 1 characteristics of the old plant. We dropped some observations due to missing values for certain explanatory variables.
Due to lagging of some of the variables, the sample used in the estimations was from the period of 2002–2019. The data set used in the multinomial logit estimations had 10.6 million person-year observations of 1.7 million separate individuals in over 230,000 separate plants (see Table 2). The number of separate spells was over two million. The shares of stayers, job-to-job transitions, and other transitions were 0.74, 0.18, and 0.08, respectively. In the logit and complementary log–log estimations, 843,000 other exits were excluded, and the corresponding spells were treated as censored. Due to this censoring, one-period spells were left out completely. The number of spells dropped by 226,000, and the number of separate individuals dropped by 147,000. The resulting number of person-year observations was 9.7 million. In this smaller sample, the share of job-to-job transitions was 0.19.
Descriptive statistics for the variables are given in Table 3. The descriptive statistics are for the logit estimations. In the multinomial logit estimation with the larger sample, the means of the variables did not differ much from those in Table 3.
We must be cautious in interpreting the estimates as causal ones. There is the possibility that workers self-select into plants with certain kinds of demographic structures. For example, if a person purposefully chooses a plant with an average age much above their own or with a certain kind of gender distribution, and subsequently wants to stay there, the unobservables affecting the job change probability will be correlated with age or gender dissimilarity. There are no policy changes or exogenous events that will change the demographic structure. Therefore, we must rely on methods that assume selection is based on observables. Causality is very seldom discussed in the dissimilarity literature. The exceptions are Bygren (2010), Leonard and Levine (2006), and Hirsch et al. (2020), who controlled for time-invariant workplace unobservables. Bygren (2010) and Leonard and Levine (2006) used linear probability models with workplace fixed effects. In their robustness analysis, Leonard and Levine (2006) also estimated a conditional logit model and a continuous time Cox duration model with workplace fixed effects. Meanwhile, Hirsch et al. (2020) used a Cox model with workplace fixed effects. In these studies, the identification of dissimilarity effects, therefore, relied on variation within workplaces.
In contrast to these studies, we used the Mundlak (1978) approach to account for time-invariant unobservable employee characteristics that may be correlated with exit behavior. These unobservables are assumed to be a function of within-individual averages of the variables over time and a stochastic term that is uncorrelated with the remaining error of the equation. These averages are additional variables in the model. The models that we used do not correspond exactly to the unobserved effects model, but they can be used as an approximation (Wooldridge, 2010, p. 620). In principle, we could have used an individual fixed effects logit model. However, the estimates would have been based on only the spells that ended in an employee exit, resulting in a much smaller effective sample. In an individual fixed effects model (or its approximation), the identification of the dissimilarity effect is based on the variation within individuals over time. In this regard, it is reasonable to assume that individuals rarely affect the demographic structure of the workplace during their employment spell. As argued by Hirsch et al. (2020), when applying for a new job, applicants cannot easily observe the structure of the existing workforce and, therefore, are not likely to base their decision on whether to take a job on that structure.
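The mechanics of the Mundlak device can be sketched in a few lines: each person-year row is augmented with the within-person mean of selected variables, and these means then enter the model as additional regressors. The function and variable names below are hypothetical, and the sketch covers only the construction of the averages, not the estimation.

```python
from collections import defaultdict


def add_within_person_means(rows, variables):
    """Append `<variable>_mean` (the person's average over all of their
    rows) to every person-year row."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        pid = row["person_id"]
        counts[pid] += 1
        for var in variables:
            sums[pid][var] += row[var]
    augmented = []
    for row in rows:
        pid = row["person_id"]
        new_row = dict(row)
        for var in variables:
            new_row[f"{var}_mean"] = sums[pid][var] / counts[pid]
        augmented.append(new_row)
    return augmented


# Hypothetical person-year rows with a plant-level average as a regressor.
panel = [
    {"person_id": 1, "year": 2010, "plant_avg_age": 40.0},
    {"person_id": 1, "year": 2011, "plant_avg_age": 42.0},
    {"person_id": 2, "year": 2010, "plant_avg_age": 35.0},
]
panel_with_means = add_within_person_means(panel, ["plant_avg_age"])
```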
In the empirical analysis, we first examined the effect of the estimation method in the models with the D-index for dissimilarity, then analyzed the asymmetries and heterogeneity of the results using the D-index, and finally, used the interaction approach as an alternative method of measuring dissimilarity and analyzing its heterogeneous effects.
6. Results
We started by analyzing the sensitivity of the results to the estimation method, using the D-index for age and education dissimilarity and the share of the opposite gender group as the dissimilarity measures and controlling for a wide set of individual and plant characteristics. (As noted above, the share of the opposite gender is essentially the square of the gender D-index.) We estimated three kinds of models: logit, complementary log–log, and multinomial logit, to which we applied the Mundlak (1978) approach. To avoid problems in the interpretation of the coefficients of the average terms, we did not use within-person averages of the D-indexes or the employee shares in demographic groups as variables. Note that there were two kinds of averages in the models. First, the average age and education years were calculated for each plant in each year over all employees in the plant in a particular year. In the Mundlak approach, averages are also taken of variables over time for each employee. Accordingly, in the second kind of average, for a particular individual, we took averages of plant average ages and education years over the time the person was part of the data.
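The link between the gender D-index and the opposite-gender share can be checked numerically. The sketch below assumes the common Euclidean form of the D-index, D_i = sqrt((1/n) Σ_j (x_i − x_j)²), with n the number of employees in the plant; the exact definition used in the study is given in an earlier section and may differ, for example in the normalization. The function name and the example plant are hypothetical.

```python
import math


def d_index(values, i):
    """Euclidean dissimilarity of employee i from all plant employees
    (assumed form: square root of the mean squared difference)."""
    n = len(values)
    return math.sqrt(sum((values[i] - x) ** 2 for x in values) / n)


# Hypothetical plant with two women (1) and three men (0).
gender = [1, 1, 0, 0, 0]
d_woman = d_index(gender, 0)   # a woman facing a male share of 3/5
d_man = d_index(gender, 2)     # a man facing a female share of 2/5
```

With a binary variable, each squared difference is 1 for an opposite-gender colleague and 0 otherwise, so the squared D-index equals the opposite-gender share under this normalization.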
Table 4 shows the coefficients of the three dissimilarity measures. All three models give qualitatively similar results, although there are differences in the values of the coefficients. Age dissimilarity is negatively and education dissimilarity positively related to job changes. These results suggest that employees prefer to work with others who are of different ages but whose education is similar to their own. The age result can be interpreted as supporting the view that workers of different ages are complementary, whereas the education result is consistent with similarity attraction. For gender dissimilarity, the share of the opposite gender is positively related to job changes; again, this is consistent with similarity attraction.
The last three columns of Table 4 show the results with the Mundlak approach. For all three estimation methods, the Mundlak approach gives coefficients of age and education dissimilarity that are lower in absolute value than in the corresponding estimations without the means of the explanatory variables. In the case of gender dissimilarity, the difference is small in complementary log–log estimation, and in the other estimations, the coefficients are lower with the Mundlak approach. We take as our starting point for further analysis the complementary log–log model with the Mundlak approach, as it corresponds to the proportional hazards model and models the individual effects.
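The correspondence between the complementary log–log model and proportional hazards can be illustrated numerically. In the sketch below, the coefficient 0.030 is the educational dissimilarity coefficient quoted in the text; the baseline index values and the size of the covariate change are hypothetical.

```python
import math


def logit_hazard(index):
    """Discrete time hazard under the logit link."""
    return 1.0 / (1.0 + math.exp(-index))


def cloglog_hazard(index):
    """Discrete time hazard under the complementary log-log link."""
    return 1.0 - math.exp(-math.exp(index))


def integrated_hazard(hazard):
    """Underlying continuous time hazard integrated over one period."""
    return -math.log(1.0 - hazard)


def cloglog_hazard_ratio(baseline_index, beta, delta_x):
    """Ratio of integrated hazards after a covariate change of delta_x."""
    shifted = integrated_hazard(cloglog_hazard(baseline_index + beta * delta_x))
    base = integrated_hazard(cloglog_hazard(baseline_index))
    return shifted / base


# The ratio equals exp(beta * delta_x) whatever the baseline is.
ratio_low_baseline = cloglog_hazard_ratio(-2.0, 0.030, 2.0)
ratio_high_baseline = cloglog_hazard_ratio(-1.0, 0.030, 2.0)
```

Under the logit link the analogous proportionality holds for the odds of exit rather than for the underlying hazard, which is one reason the two sets of coefficients differ in magnitude.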
The results can be interpreted in terms of compensating differentials (cf. Gronberg and Reed, 1994; Manning, 2003). The coefficient of the log of income is −0.229 (see Table 7). This implies that the compensating differential for educational dissimilarity (in the complementary log–log model) is 0.030/0.229 = 0.131, or 13.1%, and the (negative) compensating differential for age dissimilarity is −0.020/0.229 = −0.087, or −8.7%. Using the model to estimate the compensating differentials is theoretically appealing, but these values are perhaps too large to be plausible.
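The arithmetic behind these figures can be reproduced directly. One way to read the calculation is that a one-unit increase in the dissimilarity index leaves the exit index unchanged if log income changes by the dissimilarity coefficient divided by the absolute value of the income coefficient. The function name is hypothetical; the coefficients are those quoted in the text.

```python
def compensating_differential(dissimilarity_coef, log_income_coef):
    """Change in log income that offsets a one-unit rise in dissimilarity,
    holding the linear index of the exit model constant."""
    return dissimilarity_coef / abs(log_income_coef)


LOG_INCOME_COEF = -0.229  # coefficient of log income quoted in the text

education_differential = compensating_differential(0.030, LOG_INCOME_COEF)
age_differential = compensating_differential(-0.020, LOG_INCOME_COEF)
```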
In the model where dissimilarity is measured with the D-index, we allow for heterogeneous effects in three ways. First, we estimate the models with age group indicators interacted with the age dissimilarity D-index, and education level indicators interacted with the education dissimilarity D-index. Second, we estimate models where the relationship between the D-index and job changes is different for those above and below the workplace average age, and those above and below the average education years. Third, we also conduct estimations separately for men and women. We analyze heterogeneities in the effects of gender dissimilarity later when we use the interaction approach for all dissimilarities.
It is possible that job change behavior in response to dissimilarity is different between men and women. The results with the preferred Mundlak complementary log–log approach are shown in Table 5. In Panel A of the table, the first column repeats the results for all employees from Table 4. The second and third columns of panel A show the results for men and women, respectively. For age dissimilarity, the coefficient is positive for women and negative for men. Although the results for men can be interpreted to reflect age-based complementarities, women are more likely than men to prefer same-age co-workers. For educational dissimilarity, the coefficient is positive for both genders, indicating similarity attraction, but the effect is greater for men.
Next, we analyze the effects of the dissimilarities by age and education groups. We use three age groups, under 30, 30 to 49, and 50 or above, and interact the age group indicators with the D-index for age dissimilarity, and interact the indicators for three education levels (primary, secondary, tertiary) with the D-index for educational dissimilarity (Panel B of Table 5).
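The group-specific D-index terms can be constructed as in the following sketch. The age cut-offs follow the three groups named in the text; the function and variable names are hypothetical.

```python
AGE_GROUPS = ("under_30", "30_49", "50_plus")


def age_group(age):
    """Assign an age in years to one of the three age groups."""
    if age < 30:
        return "under_30"
    if age < 50:
        return "30_49"
    return "50_plus"


def age_dissimilarity_terms(age, d_index_age):
    """Interact the age-group indicators with the age D-index.

    Only the term of the employee's own group is non-zero, so each group
    gets its own coefficient on age dissimilarity.
    """
    own = age_group(age)
    return {
        f"d_age_x_{group}": (d_index_age if group == own else 0.0)
        for group in AGE_GROUPS
    }


terms_young = age_dissimilarity_terms(age=27, d_index_age=12.5)
```

The education terms would be built in the same way from the three education-level indicators and the education D-index.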
For all employees, the coefficients for the age dissimilarity terms are negative, except in the youngest age group, where the coefficient is close to zero. For the older groups, the coefficient is highest in absolute value in the oldest group. This stronger age-dissimilarity-related decrease in job changes reflects older employees’ general tendency to have a lower turnover, as opportunities for job changes decrease. The results for men and women are otherwise qualitatively similar to those for all employees, though for women in the youngest age group, the coefficient is positive, and therefore age dissimilarity is related to a higher job change probability. This shows that the results for all women in Panel A are driven by the youngest group. Similarity attraction is particularly strong in this group, but the results may also indicate that young women face worse promotion prospects and are therefore more likely to change jobs. Meanwhile, the educational dissimilarity coefficients are positive and decrease with the level of education both for men and women. This gives evidence of similarity attraction, which is stronger at lower education levels.
To address another type of heterogeneity, we estimated the model allowing those above and below the average age or average education level to have different coefficients. Panel C of Table 5 shows that there is some asymmetry. For age dissimilarity, the coefficients are higher in absolute value for those above the average age in the case of all employees and men. For women below the average age, the effect is positive, while for those above the average age, it is close to zero and insignificant. These results are consistent with those in Panel B: the effects are stronger for older employees, who are likely to be over the average age, and the results for young women differ from the others. In the case of educational dissimilarity, the relationship is stronger for those below the average education level in the case of all employees and women. For men, the relationship is symmetrical above and below the average education level. Again, these results are mostly consistent with those in Panel B. Overall, these heterogeneity analyses support the view that there is similarity attraction based on education and that there may be age-based complementarities, except for young women.
In addition to the estimates reported in Table 5, we estimated the model scaling the age dissimilarity D-index by plant average age, and correspondingly the education dissimilarity index by plant average education years. This takes into account the possibility that what matters is relative rather than absolute dissimilarity. The results (not reported in the table) were consistent with those in Panel A of Table 5, although the coefficients were naturally higher in absolute values, as the scales of the relative dissimilarity variables were lower.
Next, we adopted the interaction approach to analyze the effects of dissimilarities. First, we estimated models where the dissimilarity variable is the indicator for an employee group interacted with the share of the opposite group (or 1 – share of their own group). For example, co-workers in age groups above and below one’s own are assumed to have the same effect. We used the three age groups, the three educational levels, and the genders. We did not include the shares of all these groups as separate variables besides the interaction terms, which aids the interpretation of the results. For example, including both the indicator for females interacted with the female share and the female share as separate variables would not straightforwardly give the effect of the female share on males and the effect of the male share on females. Instead, we used the female indicator interacted with the male share and the male indicator interacted with the female share. We proceeded similarly with the age and education groups. Second, we allowed for asymmetry by separating those in younger or older groups than one’s own and similarly for the education levels. Naturally, for the youngest age group, all others are older, and for the oldest group, all others are younger, and similarly for the education levels.
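The first (symmetric) version of the interaction approach can be sketched as follows: each employee gets one non-zero term, equal to one minus the plant share of their own group. The function name, group labels, and shares are hypothetical.

```python
def opposite_share_terms(own_group, group_shares):
    """Indicator for each group interacted with the share of all other groups.

    `group_shares` maps each group to its share of plant employment.
    """
    return {
        f"{group}_x_other_share": (1.0 - share if group == own_group else 0.0)
        for group, share in group_shares.items()
    }


# Hypothetical plant in which a quarter of the employees are women.
gender_shares = {"female": 0.25, "male": 0.75}
terms_woman = opposite_share_terms("female", gender_shares)
terms_man = opposite_share_terms("male", gender_shares)
```

Because each term is zero for everyone outside the group, its coefficient can be read directly as the relationship between the opposite-group share and exits for that group.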
Panel A of Table 6 shows that in the case where we use the opposite-group shares, the results of age and education dissimilarity are mostly consistent with those obtained with the D-index approach. Since the scales of the variables are different, the magnitudes of the coefficients cannot, however, be directly compared. Age dissimilarities for the youngest and oldest age groups are negatively related to job changes. An exception is age dissimilarity for the group aged 30–49 years, where the share of other groups is positively related to job changes. Another difference from the results with the age-differentiated D-index is that now the relationship between age dissimilarity and employee exits is negative, even for young women. The results support complementarity between the youngest and others and between the oldest and others, but interestingly, these relationships are not symmetrical, and the results for the middle-aged group support similarity attraction. For education, all the coefficients are positive, and they are lowest for those with the highest education level, in the same way as with the D-index approach. Again, the positive relationship between educational dissimilarity and employee exits can be interpreted as supporting similarity attraction.
The results for the share of the opposite gender, given in Table 4, are positive for all employees, but this hides a significant gender difference: for women, the share of men is negatively related to job changes, while for men, the share of women is positively related. This means that men prefer to stay in male-dominated workplaces. For women, the negative relationship could reflect occupational segregation: female-dominated workplaces may be in occupations with low pay and poor promotion prospects, whereas for women in male-dominated occupations, the pay and prospects may be better.
The results of the interaction model where the direction of dissimilarity matters are shown in Panel B of Table 6. Both the share of younger age groups and the share of older age groups have a negative relationship with job changes, and this holds for both genders. For education, the results are asymmetrical: the share of co-workers with a lower education level has a positive association with job changes, but the share of co-workers with a higher education level has a negative association. While there seems to be symmetry between being younger or older than others, and asymmetry in the results on educational dissimilarity, the results may also reflect an aggregation of different age or education groups. For example, when analyzing the impact of the share of older employees, the groups aged ≤29 (for whom 30–49 and ≥50 are older age groups) and 30–49 (for whom ≥50 is an older group) are combined. Allowing all three age group shares to have different effects on each other and all three educational groups to have different effects on each other would increase the number of estimated parameters and make the interpretation of the results difficult. Although the interaction approach has advantages compared to the D-index approach, our results show that the way the “opposite” groups are defined can have a great impact on the results. Moreover, forming discrete groups from a continuous variable such as age is inevitably arbitrary. Therefore, a sensitivity analysis is essential.
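The directional shares, and the aggregation they involve, can be made concrete with a short sketch. The function name, group labels, and shares are hypothetical.

```python
ORDERED_AGE_GROUPS = ["under_30", "30_49", "50_plus"]


def directional_shares(own_group, group_shares, ordered_groups):
    """Return (share of lower-ranked groups, share of higher-ranked groups)
    relative to the employee's own group."""
    position = ordered_groups.index(own_group)
    lower = sum((group_shares[g] for g in ordered_groups[:position]), 0.0)
    higher = sum((group_shares[g] for g in ordered_groups[position + 1:]), 0.0)
    return lower, higher


# Hypothetical plant age structure.
age_shares = {"under_30": 0.25, "30_49": 0.5, "50_plus": 0.25}

younger_for_young, older_for_young = directional_shares(
    "under_30", age_shares, ORDERED_AGE_GROUPS)
younger_for_middle, older_for_middle = directional_shares(
    "30_49", age_shares, ORDERED_AGE_GROUPS)
```

In this example the "share of older employees" is 0.75 for the youngest group (both other groups combined) but 0.25 for the middle group (the oldest group only), which shows how a single coefficient on that variable pools rather different comparisons.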
There is a large body of literature on dissimilarities and their effects on different outcomes, including job turnover. Since the data sets used are typically small and involve special groups, it is difficult to compare our results to those. However, there are a few studies that have used large data sets similar to ours. Our results on the negative effects of age dissimilarity are contrary to those found by Hirsch et al. (2020), although for some workers, they also obtained negative results. The negative effect we found is also contrary to most of the literature that uses smaller data sets (see Section 3). A possible explanation for the difference is that we have used plants as work units. In more narrowly defined work units, e.g. teams, age dissimilarity may lead to employee exits, but they may be transitions to other teams within the same plant. At the plant level, there may still be a negative relationship between age dissimilarity and job changes that involve moving away from the plant. However, in previous research, age effects were typically symmetrical. Hirsch et al. (2020) used the share of own-age workers as the age dissimilarity variable, which implies symmetrical effects of the shares of all other age groups. In contrast, when taking the interaction approach whereby we considered that the effects varied by age group, we obtained positive age effects for some age groups and negative for others.
Meanwhile, our results on education dissimilarity are consistent with the findings of Hirsch et al. (2020), who found that the share of same-education employees was negatively related (and, therefore, the share of employees with a different education level positively related) to exits. As for the gender dissimilarity effect, the positive relationship we found between the female share and male employee exits is consistent with the findings of Hirsch et al. (2020) and Bygren (2004), and the negative relationship we found between the share of males and female job changes is consistent with the findings of Bygren (2004, 2010) and Elvira and Cohen (2001).
As already mentioned, drawing strong causal interpretations from the results is difficult. However, the preferred Mundlak approach, which models the fixed effects using the within-person means of the variables, produced results that were consistent with those of the other estimation methods. This gives an indication that the problem with self-selection of employees to workplaces may not be serious. In future work, it would be interesting to examine whether events such as large-scale downsizing or mergers lead to sudden large changes in the workforce’s age or educational structure and the workers’ positions within that structure, and how this is reflected in job changes.
The results show that the effects of dissimilarity are precisely estimated, but they are small. Therefore, other factors are decisive in affecting individuals’ job transitions. A possible explanation for the small effects is that there are frictions in the labor market that make it difficult to change jobs even when the current workplace is suboptimal in terms of its demographic characteristics. This is similar to the issue of monopsony power due to frictions (e.g. Manning, 2003). In this regard, it is likely that turnover intentions are more responsive to dissimilarities than actual turnover.
We will briefly comment on the results for the control variables. Table 7 shows the estimation results of the Mundlak complementary log–log estimation with the D-index for dissimilarities. The job transitions are positively related to age, which is unexpected as older employees have fewer opportunities to change jobs, especially those close to retirement. However, this is a linear age effect, and a nonlinear specification might show lower exits among the oldest employees. Plant average age is, in fact, negatively related to employee exits. Meanwhile, exits are positively related to education, which shows that education brings more opportunities in the labor market. Income has a negative coefficient, as expected. The table reports the coefficients, but the corresponding marginal effect of log income is −0.0378. Given that the mean job-to-job separation rate is 0.195, this implies a wage elasticity of separations of approximately −0.2. Females and individuals with children have a lower likelihood of job-to-job transitions, and it seems that having a family forms a cost to job transitions. On the other hand, being married is positively related to exits. Men have been found to have a “marriage premium” in earnings, and a married person’s higher job transition probability may reflect this, as a job change is one way of earning a higher wage. The model estimated separately by gender (not reported in the table) gives a positive coefficient for married men but a negative one for married women. Furthermore, high productivity and past increases in employment have negative relationships with job changes. This supports the idea that highly productive and high-growth plants are attractive workplaces, for example, because of their better pay. Foreign ownership and exporting are negatively related to job changes. These results are contrary to the expectation that internationalized firms will have more volatile employment but may reflect better wages or promotion prospects. 
The likelihood of a job change is highest if the plant is very small (under 10 employees (reference group) or 10–19 employees), which may reflect the low career prospects and higher employment volatility in small firms. Job changes are inversely related to plant age. Since plant age and the average employee age are correlated, there are fewer voluntary employee exits in old plants.
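The wage elasticity of separations mentioned in the discussion of the control variables presumably follows from dividing the marginal effect of log income by the mean job-to-job separation rate. Written out with the figures quoted in the text (the symbols are introduced here for illustration only):

```latex
\varepsilon_{\text{sep}}
  \approx \frac{\partial P(\text{exit}) / \partial \ln w}{\bar{P}(\text{exit})}
  = \frac{-0.0378}{0.195}
  \approx -0.19
```

This rounds to the value of approximately −0.2 reported above.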
7. Conclusions
We have examined the relationship between demographic dissimilarities in the workforce and the probability of job-to-job changes using a large sample of individuals that covers the whole working-age population of Finland, along with information on their employer plants. The results are heterogeneous with respect to the types of dissimilarities and the measurements of dissimilarities, and they vary by gender. This demonstrates why sensitivity analyses are needed in relational demographic research.
The results on the effects of age dissimilarity are dependent on the way the dissimilarity is measured. Using the Euclidean D-index for age results in the index being negatively related to job-to-job transitions, and the effect is stronger for older employees. However, the relationship between age dissimilarity and job changes is positive for young women. Using employee age groups leads to the result that there are asymmetries between the age groups. While the job changes of the youngest (≤29 years) and the oldest (≥50 years) are negatively related to the share of the other age groups, for the middle-aged (30–49 years), the result is the opposite: their job changes are positively related to the share of the others. Meanwhile, for educational dissimilarity, the results are more consistent across different methods of measuring dissimilarity: both the D-index and shares of other educational groups are positively related to job changes. Then, for gender dissimilarity, there is significant asymmetry: the share of women is positively related to men’s job changes, but the share of men is negatively related to women’s job changes.
Both the D-index and the interaction approach have advantages and disadvantages. The fact that the results vary according to the way dissimilarities are measured and how asymmetries are allowed for underscores that relational demography studies should use several measures and conduct sensitivity analyses.
Our results have alternative interpretations. For instance, workforce dissimilarity may affect the utility obtained from the job and therefore may be directly valued. This can be related to complementarities in the production process. For example, employees who are in a different age group, and therefore dissimilar, may possess complementary skills or knowledge. In this way, the relationship between the youngest vs. others and the oldest vs. others can be interpreted to show complementarity. Meanwhile, the result for the middle-aged group is more puzzling. It seems that similarity attraction is stronger among this group, or there are complementarities that employees in this group do not realize. Beyond this, the result for educational dissimilarity is consistent with the view that employees prefer to work with co-workers who have a similar education level, or that there is discrimination against those with a different education level. The results can also be interpreted to imply that people do not have complementarities with those with a very different education level, or at least that similarity attraction outweighs complementarities in this case.
The positive relationship between the female share and male employee exits supports the similarity attraction view, but interestingly, this does not hold for female employee exits in regard to the male share. Complementarities cannot explain this difference, since the existence of complementarities (or lack of them) between men and women would affect both genders similarly. A possible explanation is that women may have better promotion prospects in typically male workplaces, or at least do not face discrimination.
Excessive turnover of employees is costly for firms, as replacing them leads to hiring and training costs. Moreover, in the process of turnover, valuable firm-specific knowledge may be lost. On the one hand, dissimilarity-related turnover can reflect frictions that prevent wages from being adjusted sufficiently to compensate for the disamenities that being different from others entails. On the other hand, similarity attraction is a form of discrimination, and this is not something that should be compensated for. Since workplace-level demographic structures are not easily affected by public policies, a general policy recommendation is to encourage firms to create a work atmosphere where everyone sees the potential benefits and complementarities that exist in a diverse workplace.
Funding: The author is grateful to the HSE Support Foundation for funding.

