This study examines the impact of macroeconomic shocks on the formal labor market in Brazil, segmented by workers’ education levels.
We estimate a factor-augmented vector autoregression (FAVAR) model identified via heteroskedasticity based on the two-step method of Bernanke et al. (2005), along with the identification approach proposed by Brunnermeier et al. (2021).
Various types of macroeconomic shocks, such as those related to monetary policy and expectations, are identified. Our empirical results support the theory of heterogeneous agents for the Brazilian formal labor market over the business cycle, showing that the impacts of these shocks on more educated workers were smaller than on other groups. Additionally, the findings suggest that firms adjust primarily through hiring rather than separations and reveal a pro-cyclical pattern in turnover.
Unlike previous studies, this paper applies a FAVAR model identified via heteroskedasticity to analyze the effects of macroeconomic shocks on the formal labor market in Brazil, offering additional evidence on how these effects vary across workers with different education levels.
1. Introduction
The labor market is intrinsically linked to socioeconomic issues such as social mobility and has undergone profound transformations in the last five decades, mainly due to structural changes influenced by technological advancements (Acemoglu & Autor, 2011).
In a short- and medium-term context, researchers have employed structural vector autoregression (SVAR) models to analyze the effects of macroeconomic shocks on the labor market in both aggregated (Gordon & Leeper, 1994; Bernanke et al., 2005; Mumtaz & Theodoridis, 2020) and disaggregated manners (Zavodny & Zha, 2000; Thorbecke, 2001; Chaudhuri, 2020; Dolado, Motyovszki, & Pappa, 2021). These studies provide important insights in an academic context and within the scope of public policy. However, the disaggregated studies reveal that analysis of aggregated data may obscure relevant information, as they highlight the specific vulnerabilities of different groups in the business cycle. For example, they present evidence that while monetary policy shocks generally have an overall negative impact on the labor market [1], the effects are heterogeneous for different groups, particularly in terms of ethnicity and educational level of workers.
Our study contributes to this literature in two ways.
First, we analyze the effects of macroeconomic shocks on the Brazilian labor market, disaggregated by workers’ educational level, by combining established methodologies that allow us to identify a FAVAR model using heteroskedasticity. This approach uses the two-step method described by Bernanke et al. (2005), combined with the identification methodology proposed by Brunnermeier et al. (2021).
Second, while previous studies stratify the labor market into two educational levels (with and without higher education), our study expands this to nine distinct levels, using data from the General Register of Employed and Unemployed (CAGED) [2].
Our main results reveal heterogeneity in the impacts of various types of shocks on the formal labor market, showing that more educated workers are less affected than less-educated groups. A contractionary monetary policy shock and a pessimism shock regarding economic activity lead to adverse scenarios, disproportionately affecting certain groups of less-educated workers. Following these shocks, we observe an increase in the average salary. Furthermore, exchange rate shocks have significant impacts on less-educated workers only in the short term, while a positive supply shock benefits the labor market overall. Finally, our analysis reveals a pro-cyclical pattern in turnover and indicates that firms adjust primarily through hiring rather than separations.
CAGED is an important indicator of the formal labor market in Brazil and is widely used by economic agents to analyze employment dynamics. However, Brazil has a significant contingent of informal workers. This more flexible segment has a high proportion of individuals with lower levels of education and, although precarious, functions as a sort of safety net for poorer and less-educated workers, especially during periods of high unemployment in the formal sector (Costa, 2010).
Therefore, our results should be interpreted strictly within the context of formal employment, as macroeconomic shocks may have distinct effects on the overall labor market due to the important interaction between formal and informal labor sectors.
The article is organized as follows: Section 2 presents a brief literature review on the impact of macroeconomic shocks on the labor market. Section 3 describes the methodology used in our study. Section 4 details the data and the main model selection method. Section 5 presents the results, and Section 6 offers concluding remarks.
2. Literature review
Articles that analyze the impacts of economic fluctuations on the labor market mostly use aggregated observations, which may hide relevant information given the evidence that certain groups exhibit above-average vulnerability during different phases of the business cycle. Regarding employees’ skills, for instance, this vulnerability arises because (1) less skilled workers have lower bargaining power (Dumont, Rayp, & Willemé, 2012); (2) hiring and firing costs are higher for more skilled workers (Blatter, Muehlemann, & Schenker, 2012); (3) it is easier to replace workers who perform manual or routine tasks, which require lower qualifications (Zens, Bock, & Zorner, 2020); and (4) search and matching frictions are asymmetric across skilled and unskilled workers (Zago, 2020).
Considering these aspects, economists are interested in analyzing the impact of macroeconomic fluctuations on the labor market in a more stratified manner using vector autoregression (VAR) models. For instance, studies such as Zavodny and Zha (2000), Thorbecke (2001) and Carpenter and Rodgers (2004) have shown that after a contractionary monetary policy shock, the employment rates of non-white populations are more negatively affected than those of the average population. Zavodny and Zha (2000) partly attribute these disparities to differences in average years of schooling.
Using a different empirical approach, Zens et al. (2020) examine the impact of monetary policy shocks on 32 occupations using a factor-augmented vector autoregression (FAVAR) model and data from the Current Population Survey (CPS). They find that unemployment increases across most occupations, but heterogeneously. For example, jobs related to abstract tasks, mainly high-skill jobs, are significantly less affected than other groups. Chaudhuri (2020) and Dolado et al. (2021) also use CPS data, but they divide workers into those with and without a college degree. Chaudhuri (2020) finds that unemployment increases more among workers without higher education following a contractionary monetary policy shock, while Dolado et al. (2021) note a larger decrease in unemployment among those with higher education after a monetary expansion.
Few studies analyze the heterogeneous impact of macroeconomic shocks on the Brazilian labor market. One such study is Cavalcanti and Moreira (2015), which analyzes the impact of a restrictive monetary policy shock and exchange rate shocks. They observe a negative effect on aggregate employment after a contractionary monetary policy shock, whereas the exchange rate shock does not show a significant impact. With disaggregated data, however, they observe a change in worker allocation by level of education, with a higher probability of layoff in the less educated group after these types of shocks.
Table 1 summarizes the main results of these studies, providing information on the sample period, type of shocks, the stratified groups of workers analyzed and identification methodology used in the VAR models.
Cavalcanti and Moreira (2015) is one of the studies most similar to ours, although we differ by using a different database (CAGED) with a longer sample period. Moreover, we disaggregate workers into nine levels of educational attainment and identify more types of macroeconomic shocks through a factor-augmented VAR (FAVAR) model identified via heteroskedasticity, using the two-step FAVAR method introduced by Bernanke et al. (2005), with the identification methodology proposed by Brunnermeier et al. (2021).
The FAVAR model is appropriate for studies aiming to analyze the impact of macroeconomic shocks on disaggregated variables, as it has fewer limitations regarding the number of variables in the model (Bernanke et al., 2005). The heteroskedasticity approach eliminates almost all uncertainty about the structural form compared with the sign restrictions identification used by Cavalcanti and Moreira (2015) and Zens et al. (2020), as the matrix of simultaneous relationships between the variables is unique [3]. The next section presents and describes this methodology.
3. Methodology
In the literature, studies that identify FAVAR models predominantly use the zero and sign restrictions approach. However, our study proposes a methodological innovation by identifying a FAVAR via heteroskedasticity, based on the two-step method by Bernanke et al. (2005), along with the identification methodology proposed by Brunnermeier et al. (2021). This type of identification presents a more agnostic approach compared to others, as it only utilizes information from the variation in the covariance matrices of the residuals over time to identify exogenous shocks (see Brunnermeier et al. (2021) for a detailed description of this identification approach).
3.1 Model structure
In the two-step procedure presented by Bernanke et al. (2005), the latent factors, Ft, are first estimated using principal component analysis (PCA) before the estimation of the FAVAR model. Since these factors are treated as fixed variables at each time t, this approach significantly reduces computational complexity [4].
Factor estimation (first step): We extract a set of latent factors, Ft, from a high-dimensional dataset Xt using principal component analysis (PCA). These factors summarize the common variation in Xt and are treated as predetermined in subsequent steps.
FAVAR model estimation (second step): The estimation procedure follows a Bayesian approach, as in Koop and Korobilis (2010) [5], but structural shocks are identified via heteroskedasticity.
With the estimated latent factors Ft (from the first step) and a set of observed macroeconomic variables Yt, described as observable factors [6], we separately estimate:
The transition equation (VAR): A vector autoregressive model that describes the dynamic interactions between Ft and Yt, where endogeneity is explicitly addressed and, in our case, structural shocks are identified using a heteroskedasticity-based approach (following Brunnermeier et al. (2021)).
The measurement equation: A regression equation linking the observed variables Xt to Ft and Yt. This step allows us to analyze how the structural shocks identified in the transition equation affect Xt. The assumption that Xt is linearly related to Ft and Yt is based on the standard framework of the dynamic factor model (DFM) [7], which is the standard specification of FAVAR models (Koop & Korobilis, 2010).
These equations are presented in detail below:
3.1.1 Transition equation
Equation (1) presents our transition equation in structural form, based on Brunnermeier et al. (2021):
Let Yt be the N × 1 vector of observable factors and Ft a K × 1 vector of latent factors [8] estimated by principal components from the information in h observed variables (see Stock and Watson (2016)). Let n = N + K be the total number of variables in the transition equation, A0 an n × n matrix of simultaneous relationships between the variables, Aj (j = 1, …, p, where p is the maximum number of lags) the n × n matrices of coefficients at each lag j and C an n × 1 vector of constants. Let ϵt be an n × 1 vector of structural shocks that are uncorrelated over time. λi,m(t) is the i-th diagonal element of the structural shock covariance matrix Λm(t). ζi,t is a random parameter with an inverse gamma distribution that turns the error distribution into a Student-t, as proposed by Brunnermeier et al. (2021).
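Written out from these definitions, the transition equation takes a form along the following lines. This is a reconstruction under stated assumptions rather than a quotation of Equation (1): the label $A_j$ for the lag coefficient matrices, the stacked vector $Z_t$ and the ordering of $Y_t$ and $F_t$ within it are notational assumptions made here for illustration.

```latex
A_0 Z_t = \sum_{j=1}^{p} A_j Z_{t-j} + C + \epsilon_t,
\qquad
Z_t = \begin{pmatrix} Y_t \\ F_t \end{pmatrix},
\qquad
\epsilon_{i,t} \mid \zeta_{i,t} \sim \mathcal{N}\!\left(0,\; \lambda_{i,m(t)}\,\zeta_{i,t}\right),
\quad i = 1, \dots, n .
```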
It is assumed that the coefficients in A0 and in the lag coefficient matrices are fixed throughout the period, as in the standard SVAR approach, but the residual covariance matrix from the reduced form, Σm(t), can vary over time across exogenously specified regimes ι = 1, …, M. The function m(t) → ι indicates the regime corresponding to a given date.
As shown by Brunnermeier et al. (2021), if the variances of the structural shocks change across regimes in a way that is unique and non-proportional for each shock, identification via heteroskedasticity is possible [9].
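The logic of this identification argument can be illustrated numerically. The sketch below is not the Bayesian estimator of Brunnermeier et al. (2021); it assumes only two regimes, normalizes the structural variances in the first regime to one and uses made-up values for A0 and the variances. Under these assumptions each regime covariance is Σr = A0⁻¹ Λr (A0⁻¹)′, so the eigenvectors of Σ1⁻¹Σ2 are the rows of A0 (up to scale and sign) and the eigenvalues are the relative variances, which must differ across shocks.

```python
import numpy as np


def identify_a0(sigma1, sigma2):
    """Recover A0 (rows ordered by variance ratio) from two regime covariances.

    Assumes Sigma_r = inv(A0) @ Lambda_r @ inv(A0).T with Lambda_1 = I and
    distinct diagonal entries in Lambda_2.
    """
    ratios, vecs = np.linalg.eig(np.linalg.solve(sigma1, sigma2))
    ratios, vecs = np.real(ratios), np.real(vecs)
    order = np.argsort(ratios)
    ratios, vecs = ratios[order], vecs[:, order]
    # Rescale each eigenvector v so that v' Sigma_1 v = 1 (regime-1 variances = 1).
    scale = np.sqrt(np.einsum("ji,jk,ki->i", vecs, sigma1, vecs))
    rows = (vecs / scale).T
    # Sign convention: the largest entry (in absolute value) of each row is positive.
    idx = np.argmax(np.abs(rows), axis=1)
    signs = np.sign(rows[np.arange(rows.shape[0]), idx])
    return rows * signs[:, None], ratios


# Illustrative (hypothetical) structural matrix and regime-2 variances.
a0_true = np.array([[1.0, -0.5, 0.0],
                    [0.3, 1.0, 0.2],
                    [-0.4, 0.1, 1.0]])
impact = np.linalg.inv(a0_true)
sigma1 = impact @ impact.T                            # Lambda_1 = I
sigma2 = impact @ np.diag([0.5, 2.0, 4.0]) @ impact.T

a0_hat, variance_ratios = identify_a0(sigma1, sigma2)
```

If two shocks had the same variance ratio, the corresponding eigenvectors would no longer be unique, which is the numerical counterpart of the requirement that the variances change non-proportionally.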
3.1.2 Measurement equation
Let Xt be a vector with h stationary variables, used to estimate the factors Ft. The measurement equation assumes that the information in Xt is related to the factors Ft and the observable factors Yt by Equation (2):
Let Φf [10] be the h × K matrix of factor loadings that multiplies the factors Ft, ΦY an h × N matrix that multiplies the observable factors Yt and et an h × 1 vector of uncorrelated errors with mean 0.
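In the notation just defined, the measurement equation has the standard linear form sketched below. The symbol $R$ for the error covariance matrix is taken from the parameter vector θM used in Section 3.2; the display itself is a reconstruction from the surrounding definitions rather than a quotation of Equation (2).

```latex
X_t = \Phi^{f} F_t + \Phi^{Y} Y_t + e_t,
\qquad
\mathbb{E}[e_t] = 0,
\qquad
\mathbb{E}[e_t e_t'] = R .
```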
3.2 Priors and sampling parameters
From the Bayesian perspective, the parameters of Equation (1), the transition equation, θTr, and of Equation (2), the measurement equation, θM = (Φf, ΦY, R), are treated as random variables and sampled based on Brunnermeier et al. (2021) [11] and Bernanke et al. (2005), respectively, as described in the following subsections. The present study follows the two-step procedure proposed by Bernanke et al. (2005). That is, the FAVAR model is estimated after the latent factors [12], which are then treated as fixed at each time point t. The clear advantage of the two-step approach is computational simplicity. However, it does not exploit the structure of the transition equation when estimating the factors (Bernanke et al., 2005; Koop & Korobilis, 2010).
3.2.1 Sampling the parameters θTr from the transition equation
Given Ft, we estimate Equation (1) following the heteroskedasticity identification approach proposed by Brunnermeier et al. (2021).
The posterior distribution is proportional to the likelihood function multiplied by the prior. To draw samples from the posterior distribution of the parameters of Equation (1), θTr, we use a special case of the Metropolis-Hastings algorithm known as Gibbs sampling, as detailed in Brunnermeier et al. (2021).
3.2.2 Sampling the parameters θM from the measurement equation
Let X̃T = (X1, …, XT) denote the values of Xt between period 1 and period T, and F̃T = (F1, …, FT) the corresponding values of the factors.
Assuming that the errors et from Equation (2), the measurement equation, are uncorrelated (the error covariance matrix, R, is diagonal), the system can be estimated equation by equation via OLS (ordinary least squares), as it is not a SUR (seemingly unrelated regressions) context. Let Φ = [Φf ΦY], let Xi denote the column of observations on the i-th variable in Xt and let Φi denote the corresponding row of the matrix Φ. The joint prior density for Rii and Φi is assumed to be inverse-gamma normal, with the same hyperparameters as those used by Lima, Martinez, and Cerqueira (2018):
Assuming the joint prior of the previous equation, we obtain the conditional posterior density function as demonstrated by Bauwens, Lubrano, and Richard (2000).
In this expression, Rii enters through its previous draw, Φ̂i is the OLS estimate of Φi from the measurement equation and êi are the corresponding residuals.
Given Rii and using the joint prior density function shown earlier, we obtain P, the conditional posterior density function [13]:
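To make the equation-by-equation sampling concrete, the sketch below draws the loadings and error variance of a single measurement equation from a textbook normal inverse-gamma posterior. It is an illustration under stated assumptions, not the exact sampler of the paper: the prior mean of zero, the identity prior precision and the values `a0 = 3.0` and `b0 = 0.001` are placeholders rather than the hyperparameters of Lima, Martinez, and Cerqueira (2018), and the function name and simulated data are hypothetical.

```python
import numpy as np


def draw_measurement_row(x, z, a0=3.0, b0=0.001, n_draws=2000, seed=0):
    """Draw (Phi_i, R_ii) for one equation x = z @ Phi_i + e.

    Illustrative prior: R_ii ~ IG(a0, b0), Phi_i | R_ii ~ N(0, R_ii * I).
    """
    rng = np.random.default_rng(seed)
    t, k = z.shape
    m_post = np.eye(k) + z.T @ z                   # posterior precision, up to R_ii
    phi_post = np.linalg.solve(m_post, z.T @ x)    # posterior mean of Phi_i
    a_post = a0 + t / 2.0
    b_post = b0 + 0.5 * (x @ x - phi_post @ m_post @ phi_post)
    # If g ~ Gamma(a, 1), then b / g ~ Inverse-Gamma(a, b).
    r_draws = b_post / rng.gamma(a_post, 1.0, size=n_draws)
    # Phi_i | R_ii ~ N(phi_post, R_ii * inv(m_post)).
    chol = np.linalg.cholesky(np.linalg.inv(m_post))
    noise = rng.standard_normal((n_draws, k)) @ chol.T
    phi_draws = phi_post + np.sqrt(r_draws)[:, None] * noise
    return phi_draws, r_draws


# Simulated example: two regressors, true loadings (1, -2), error variance 0.25.
rng = np.random.default_rng(1)
z = rng.standard_normal((500, 2))
x = z @ np.array([1.0, -2.0]) + 0.5 * rng.standard_normal(500)
phi_draws, r_draws = draw_measurement_row(x, z)
```

Because R is diagonal, the same function can be applied to each of the h series in turn, with `z` holding the factors Ft and the observable factors Yt.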
3.3 Summary of the procedure and obtaining the impulse response functions
As previously mentioned, latent factors are estimated before the FAVAR model and are treated as fixed variables at each time point. Given the factors Ft, the parameters of the transition equation θTr are drawn (as presented in subsection 3.2.1) [14], and from A0 and the lag coefficient matrices, the macroeconomic shocks and their impulse response functions are obtained.
The draws of the parameters of the measurement equation θM (as presented in subsection 3.2.2) are conducted separately [15]. With these values stored, we can construct the impulse response functions. For each draw of Φ in Equation (5), the impact of the macroeconomic shocks on Xt is calculated from the variation in Yt and Ft caused by the shocks obtained from the transition equation (as described in the previous paragraph). After a large number of iterations, this enables the construction of the uncertainty regions of the impulse response functions for the variables Xt.
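The mapping from transition-equation responses to responses of Xt can be sketched as follows. The array shapes, the function name and the random placeholder inputs are assumptions made for illustration; in practice the inputs would be the stored posterior draws, and the columns of Φ must be ordered in the same way as the variables in the transition equation. The 16th and 84th percentiles are used here because they delimit a 68% band.

```python
import numpy as np


def irf_bands_for_x(state_irfs, phi_draws, probs=(16, 50, 84)):
    """Map IRFs of the transition variables into IRFs of X_t and summarize them.

    state_irfs: (draws, horizons, n) responses of the n transition variables
                to one structural shock
    phi_draws:  (draws, h, n) loadings Phi = [Phi_f, Phi_Y], same column order
    returns:    (len(probs), horizons, h) pointwise percentiles across draws
    """
    # For every draw d: x_irf[t, i] = sum_n state_irf[t, n] * Phi[i, n].
    x_irfs = np.einsum("dtn,dhn->dth", state_irfs, phi_draws)
    return np.percentile(x_irfs, probs, axis=0)


# Placeholder inputs with hypothetical dimensions.
rng = np.random.default_rng(0)
n_draws, horizons, n, h = 200, 42, 10, 30
state_irfs = rng.standard_normal((n_draws, horizons, n))
phi_draws = rng.standard_normal((n_draws, h, n))
bands = irf_bands_for_x(state_irfs, phi_draws)   # lower, median, upper
```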
4. Estimation
4.1 Data
The main model estimation uses monthly data from January 2000 to December 2019, since one of our main labor market time series, the CAGED, underwent methodological changes from 2020 onward [16]. Throughout this period, the inflation-targeting monetary policy regime remained in place. The variables used are presented in the next sections.
4.1.1 Observable factors Yt
Table 2 displays the eight observable factors, Yt, used in the FAVAR. The selection of these variables is based on the literature that employs VAR models to analyze macroeconomic shocks. The variable GDPfgv [17] is obtained already seasonally adjusted, while the variables IPCA and M1 are seasonally adjusted using the X13-ARIMA method from the RJDemetra package in R. The lag length p of the main model is set to 10, as the series are monthly.
4.1.2 Observable variables Xt
The variables in Xt are described in Table A1 of Appendix A. The analysis of the labor market segmented by education level is based on data from the CAGED [18], which records the monthly hiring and separations of workers employed under the CLT (Consolidation of Labor Laws) throughout Brazil. In this study, only involuntary separations are used. The Xt variables are seasonally adjusted using the X13-ARIMA method from the RJDemetra package in R. All these series are expressed in natural logarithms and differenced once to achieve stationarity (verified by the KPSS [19] and ADF [20] tests).
4.1.3 Latent factors Ft
The latent factors, Ft, are obtained from the monthly information of 30 variables in Xt (see Table A1 in Appendix A).
They are extracted through principal component analysis (PCA), a non-parametric alternative, as proposed by Stock and Watson (2016). PCA is a statistical method that captures the variability of dozens of variables over time by forming linear combinations of the non-orthogonal variables in Xt, yielding a reduced set of mutually orthogonal synthetic variables called factors [21].
For our factor selection, we follow Stock and Watson (2016) by regressing selected variables of interest in Xt on the estimated factors. We then compute the average R2 for the 18 disaggregated CAGED series, the key variables in our analysis, and find that with two factors this average is approximately 50%. Adding a third factor increases the average R2 to around 60%, but at the cost of greater model complexity. Given that the model already incorporates eight observable factors (Yt), we prioritize parsimony and retain only two factors.
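The factor-selection criterion described above can be sketched in a few lines. The example uses a simulated panel, not the CAGED data, and the function names, the panel dimensions and the choice of the first 18 columns as the "series of interest" are assumptions made for illustration only.

```python
import numpy as np


def pca_factors(x, k):
    """First k principal-component factors of a (T x h) panel of stationary series."""
    z = (x - x.mean(axis=0)) / x.std(axis=0)      # standardize each series
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :k] * s[:k]                        # mutually orthogonal factors


def average_r2(x, factors):
    """Average R^2 from regressing each column of x on a constant and the factors."""
    design = np.column_stack([np.ones(x.shape[0]), factors])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    resid = x - design @ coef
    tss = ((x - x.mean(axis=0)) ** 2).sum(axis=0)
    return float(np.mean(1.0 - (resid ** 2).sum(axis=0) / tss))


# Simulated panel: 30 series driven by three common factors plus noise.
rng = np.random.default_rng(0)
n_obs, n_series = 239, 30
common = rng.standard_normal((n_obs, 3))
panel = common @ rng.standard_normal((3, n_series)) + rng.standard_normal((n_obs, n_series))

# Average fit for the first 18 series as the number of factors grows.
r2_by_k = {k: average_r2(panel[:, :18], pca_factors(panel, k)) for k in (1, 2, 3)}
```

Since the factors for k + 1 nest those for k, the average R2 can only rise as factors are added; the decision is therefore a trade-off between this gain and the additional parameters in the transition equation.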
4.1.4 Transformations of the data
Observable factors Yt are in levels, and we transform the latent factors Ft into level variables by cumulating them over time [22] to ensure consistency with the observable factors. The Johansen cointegration test [23] indicates five cointegrating relationships among these variables under the trace statistic and one under the maximum eigenvalue statistic, at the 1% significance level.
Given that both observable and latent factors are cointegrated, there are linear combinations of these variables that are stationary. Consequently, the measurement equation can incorporate both stationary (Xt) and non-stationary variables (Yt and Ft). Applying unit root tests (ADF and KPSS) to the residuals estimated from the measurement equation provides evidence of their stationarity. This suggests that the regressions are not spurious (Hamilton, 1994, p. 557) [24].
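The cumulation of the latent factors described in this subsection amounts to a running sum over time. A minimal sketch, using placeholder random numbers in place of the estimated factors:

```python
import numpy as np

rng = np.random.default_rng(0)
factors = rng.standard_normal((239, 2))        # placeholder stationary factors
factor_levels = np.cumsum(factors, axis=0)     # running sum over time: "level" factors

# Differencing the cumulated series returns the original factors from t = 2 onward.
recovered = np.diff(factor_levels, axis=0)
```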
4.2 Regime choices for identification of equation (1), the transition equation
As discussed in the previous sections, to identify A0 in the transition equation (1) via heteroskedasticity, the reduced-form residual covariance matrices, Σm(t), must vary across regimes. These regimes are specified exogenously in the model. The choice of regimes is based on the study by Giudici and Lima (2024) [25], due to the similarity of the variables and data sample used in the VAR with the current study. Table 3 presents the dates for the three regimes.
We observe a larger marginal data density (MDD) [26] for the model with three regimes and structural breaks only in the covariance matrix than for a model without structural breaks or one with regime changes in all parameters (A0 and the lag coefficients). We also assume that the errors follow a Student-t distribution with 15.6 degrees of freedom, estimated using the residuals from a Gaussian model, since this specification showed a higher MDD than the Gaussian one, although the impulse response functions for both models are very similar. We found no gain in changing the hyperparameters of the prior distributions used by Brunnermeier et al. (2021) [27].
4.3 The main model
In the main model, therefore, the transition equation (Equation (1)) includes ten variables (Yt and Ft), sets the lag length p to 10, allows only the covariance matrix of the residuals to vary across the three regimes and assumes that the errors follow a Student-t distribution. The measurement equation (Equation (2)) describes the contemporaneous relationships between Xt and both Yt and Ft and is estimated as shown in Section 3.2.2, without structural breaks.
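To give a sense of the dimensions this specification implies, the sketch below builds the lagged regressor matrix of a reduced-form VAR with ten variables and ten lags. It assumes 240 monthly observations (January 2000 to December 2019) and uses random placeholder data; the effective sample in the estimation may differ slightly, for instance because differencing drops the first observation. The function name is hypothetical.

```python
import numpy as np


def var_design(data, p):
    """Return (y, x): the left-hand side and a constant plus p lags of every variable."""
    t, _ = data.shape
    lags = [data[p - j : t - j] for j in range(1, p + 1)]   # lag j aligned with data[p:]
    x = np.column_stack([np.ones(t - p)] + lags)
    return data[p:], x


rng = np.random.default_rng(0)
data = rng.standard_normal((240, 10))     # placeholder for the stacked (Yt, Ft)
y, x = var_design(data, p=10)
coefficients_per_equation = x.shape[1]    # 1 constant + 10 variables * 10 lags
```

With 101 coefficients per equation and 230 usable observations, the prior distributions carry a meaningful share of the information, which is one reason a Bayesian treatment is natural here.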
With the structural shocks identified in the transition equation, we derive the impulse response functions of these shocks on Xt using the measurement equation. The next section presents the main results.
5. Results
5.1 Variances of the structural shocks across regimes
The selection of regimes effectively captures notable variations in the residuals’ variances from the transition equation. Figure B1 in Appendix B illustrates the marginal posteriors of the conditional variances for each shock across the three regimes. Several shocks exhibit unique and non-proportional variations, indicating the potential for identification through heteroskedasticity [28]. For example, the commodities price shock displays the highest conditional variance during the second regime, which encompasses the period of the commodities market boom and the sharp decline in prices during the 2008 global financial crisis. Furthermore, the highest conditional variances for the expectation (Swap 180) and monetary policy (Selic) shocks are observed during the first regime, reflecting the 2001 energy crisis, the turbulent 2002 elections and the initial year (2003) of the new administration. See Table C1 in Appendix C for the dates and magnitudes of the four largest shocks for each type.
5.2 Impulse response functions
In identification via heteroskedasticity, shocks are interpreted based on their known characteristics in the economic literature, as the estimation merely separates them. Figure 1 presents the impulse response functions for the five types of shocks that we have identified and are interested in evaluating. These were obtained from the identification of the structural equation (1), the transition equation. The shape of the impulse response functions represents an average across regimes. The time horizon (horizontal axis) spans 42 months, and the uncertainty bands, shown in blue, represent 68% confidence intervals. The columns correspond to the shocks as identified by the authors, while the rows show their impact on the observed variables Yt (the factors do not have an economic interpretation).
The IRFs exhibit patterns similar to those in studies identifying macroeconomic shocks in Brazil, such as Giudici and Lima (2024), namely: (1) a restrictive monetary policy shock (a sharp, immediate increase in the Selic rate) leads to immediate currency appreciation [29], declines in activity and monetary aggregates and a delayed decrease in inflation; (2) a currency depreciation shock leads to an increase in inflation, a delayed negative response in economic activity and a contractionary response from the monetary authority; (3) a supply shock increases activity, appreciates the currency and reduces inflation; (4) a positive commodity price shock leads to currency appreciation, growth in economic activity (during the first year after the shock) and a negative response in inflation one year after the shock (currency appreciation could serve as an anti-inflationary force after this type of shock); and (5) the last shock is interpreted as in Giudici and Lima (2024), i.e. as a “pessimism” shock, particularly related to future expectations about economic activity. This interpretation rests on the strong immediate impact on Swap 180, the significant decline in economic activity and the observation in Table C1 (Appendix C) that July 2001 and June 2002 were the two periods with the highest values for this type of shock, corresponding to the energy crisis and the 2002 presidential election, respectively.
5.3 Impulse response functions of net hiring of workers by level of education
We apply a transformation to the CAGED admissions variables in Xt. Specifically, instead of including admissions and separations separately, we replace admissions with net employment (admissions minus separations) in our estimation of the measurement equation, while keeping separations as an independent variable [30].
Figure 2 presents the impulse response functions of the FAVAR model concerning the impact of macroeconomic shocks on formal labor net hiring disaggregated by workers’ education levels (rows). The disaggregation is presented in ascending order of educational attainment, ranging from illiterates (illit, first row) to workers with higher education (compHE, last row). Descriptions of each variable are provided in Table A1 in Appendix A. The time horizon (horizontal axis) spans 42 months, and the error bands indicate a 68% confidence interval (blue).
The impacts of a contractionary monetary policy shock on net hiring are shown in the first column of Figure 2. As expected, a negative effect on employment is observed, marginally statistically significant throughout most of the analyzed time horizon for all groups of workers (uncertainty bands below the 0 axis), except for the more educated groups with incomplete and complete higher education. For these latter groups, the magnitude of the impact is much smaller, less than half that of the other groups over the medium term. Additionally, the three groups with the lowest levels of education are the most affected by this shock.
The second column of Figure 2 illustrates the impact on net hiring following a currency depreciation shock. A marginally statistically significant increase in net hiring is observed only among less educated groups during the first seven months after the shock. This suggests that the short-term expansionary effects of a currency shock disproportionately benefit workers from these groups. This outcome could be partially explained by the fact that the country’s export basket is heavily concentrated in low-tech, labor-intensive goods (Bourscheidt & Silva, 2021). Because the model is linear, the effects of these shocks are symmetric: a currency appreciation would likely harm the less educated in the short term, while the more educated would remain relatively unaffected.
The third column illustrates the impact of a positive supply shock. A significant impact on net hiring is observed across all groups, with greater persistence and a more pronounced effect noted particularly among the less educated. The fourth column displays the responses to a positive shock in commodity prices. This shock does not have a significant effect on any of the groups presented. However, for the less educated groups, there is a higher probability of a decrease rather than an increase in employment. Benguria, Saffie, and Urzúa (2018) observe a similar pattern using data from the RAIS (Annual Social Information Report). They note that the increase in the cost effect outweighs the wealth effect in firms employing a high proportion of less skilled workers after a rise in commodity prices, leading to a higher layoff rate for the less educated in the job market.
The final column of Figure 2 displays the responses to a shock in Swap 180, interpreted as a pessimistic expectations shock about the economy. This type of shock causes a significant negative impact on the formal labor market, as expected due to the sharp decline in activity, as shown in Figure 1. It is observed that the more educated (with incomplete and complete higher education) are less impacted compared to other groups, as indicated by the shock magnitude (around 0.25% at the median) and the uncertainty bands around the 0 axis during most of the time horizon. This type of shock exhibits greater persistence and magnitude of impact for workers with low education levels. These empirical results align with the theoretical expectation that more educated individuals are less vulnerable to business cycle fluctuations.
5.4 Impulse response functions on aggregate variables
Figure 3 illustrates the impact of various shocks on aggregated labor market variables as well as those related to activity and price levels, detailed in Table A1 in Appendix A. In this figure, the responses of the variables are represented in rows, while the respective shocks are in columns.
Following a contractionary monetary policy shock (depicted in the first column of Figure 3), an adverse scenario is observed in the labor market, characterized by reductions in working hours, employment rates and hires from CAGED. Interestingly, there is an increase in average salary, suggesting a heterogeneous impact on workers, as shown in Figure 2. Additionally, consumer price indices show a delayed decrease, while variables related to economic activity, like capacity utilization and production of automobiles, exhibit negative responses.
The impact of a currency depreciation shock is shown in the second column of Figure 3. The decrease in average salary is marginally statistically significant, consistent with the short-term increase in hires of less educated workers shown in Figure 2. The overall impact on employment turns negative after several months, potentially because of the contractionary monetary policy response, which leads to a decline in GDP, as illustrated in Figure 1. The impacts on price indices are positive and significant, as expected, and this shock has the largest impact on prices among the identified shocks.
Regarding the supply shock, positive movements are observed in labor market variables and economic activity, along with a decrease in price indices for tradable goods. Conversely, a positive shock in commodity prices yields ambiguous results: while there is a small increase in working hours and employment indices, involuntary separations from CAGED also increase. Also, note the higher probability of an increase in average salary, consistent with the findings in Benguria et al. (2018). Price indices decrease, which is unexpected given the literature but can be partially explained by the currency appreciation that follows this type of shock in Brazil [31].
The last column of Figure 3 presents the results of a “pessimism” shock regarding economic activity. The responses in the variables indicate a contraction in the labor market and overall economic activity, with this shock having the most substantial negative impact on these variables. Also, note the increase in the average salary, which aligns with the heterogeneous impact of this shock as shown in Figure 2. Additionally, a reduction in worker turnover is observed following this type of shock, characterized by decreases in both hires and separations from CAGED, although the decline in hires is more pronounced and significant.
Across the analyzed shocks, turnover is procyclical: in periods of declining employment and economic activity, such as after restrictive monetary shocks or pessimistic expectations about the economy, turnover falls. Additionally, firms primarily adjust through changes in hiring rather than separations. These observations are aligned with previous studies in the Brazilian literature, such as Corseuil, Foguel, Gonzaga, and Ribeiro (2014) and Nunes, Menezes-Filho, and Komatsu (2016).
5.5 Robustness to alternative strategies
To assess the robustness of our results, we compare them with an alternative Bayesian FAVAR model based on Koop and Korobilis (2010) [32], which follows the two-step estimation approach of Bernanke et al. (2005) but uses a Bayesian procedure and identifies shocks via Cholesky decomposition. We find that while Cholesky decomposition successfully identifies the expectation shock (Swap 180 variable), it fails to provide clear identification for monetary and supply shocks in our context. This suggests that recursive zero restrictions may not be appropriate for our data, as they limit the ability to isolate structural macroeconomic shocks. Notably, the IRFs for the expectation shock in the Koop and Korobilis model are similar to those obtained under the heteroskedasticity-based approach, indicating that this shock has a smaller impact on more educated workers than on other groups.
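The recursive zero restrictions behind Cholesky identification can be illustrated with a minimal sketch. It is not the code used in this study: the covariance matrix `SIGMA` and the helper names are hypothetical, chosen only to show that the impact matrix is lower triangular and that it changes with the ordering of the variables.

```python
import math

def cholesky(s):
    """Lower-triangular L with L L' = s, for a symmetric positive definite s."""
    n = len(s)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s[i][i] - acc)
            else:
                low[i][j] = (s[i][j] - acc) / low[j][j]
    return low

def reorder(s, order):
    """Covariance matrix of the same variables listed in a different order."""
    return [[s[i][j] for j in order] for i in order]

# Hypothetical reduced-form residual covariance (illustrative numbers only).
SIGMA = [[1.0, 0.5, 0.2],
         [0.5, 2.0, 0.3],
         [0.2, 0.3, 1.5]]

# Impact matrix under the ordering (0, 1, 2): variable 0 reacts on impact
# only to its own shock, variable 1 to shocks 0 and 1, and so on.
impact = cholesky(SIGMA)

# Same data with the ordering reversed: the zero restrictions now fall on
# different responses, so the implied shocks differ.
impact_rev = cholesky(reorder(SIGMA, [2, 1, 0]))

print(impact[0][0], impact_rev[2][2])  # own-shock impact of variable 0 under each ordering
```

Because the zeros are imposed by the ordering rather than estimated, a shock that affects several variables contemporaneously cannot be recovered this way, which is one way to read the weak identification reported above.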
Additionally, we explore an alternative identification approach using sign restrictions. We attempt to replicate these shocks using the signs from our results; however, this method produces IRFs with considerably wider confidence intervals, making inference more challenging. This outcome is expected, as the matrix of contemporaneous relationships between variables is unique in the heteroskedasticity-based approach, whereas in the sign restrictions approach it is not (Dieppe, van Roye, & Legrand, 2016) [33].
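The uniqueness of the contemporaneous matrix under heteroskedasticity can be illustrated with a small two-variable sketch. It assumes that the residual covariance in regime i takes the form Σi = A0⁻¹ Λi (A0⁻¹)′ with Λi diagonal, as in Brunnermeier et al. (2021); the matrix `A0`, the regime variances and all function names are hypothetical values chosen for illustration, not estimates from this study.

```python
import math

def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]

def inv2(x):
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    return [[x[1][1] / det, -x[0][1] / det],
            [-x[1][0] / det, x[0][0] / det]]

def regime_cov(a0, variances):
    """Sigma_i = A0^{-1} diag(variances) (A0^{-1})' (assumed model form)."""
    a0_inv = inv2(a0)
    d = [[variances[0], 0.0], [0.0, variances[1]]]
    return matmul(matmul(a0_inv, d), transpose(a0_inv))

def eig2(m):
    """Eigenvalues (descending) and eigenvectors of a 2x2 matrix.

    Assumes real, distinct eigenvalues and m[0][1] != 0.
    """
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(tr * tr - 4.0 * det)
    values = [(tr + root) / 2.0, (tr - root) / 2.0]
    vectors = [(m[0][1], v - m[0][0]) for v in values]
    return values, vectors

# Hypothetical structural matrix and regime-specific shock variances.
A0 = [[1.0, 0.5], [-0.4, 1.0]]
SIGMA_K = regime_cov(A0, [1.0, 1.0])
SIGMA_J = regime_cov(A0, [2.0, 0.5])

# Sigma_k^{-1} Sigma_j = A0' (Lambda_k^{-1} Lambda_j) (A0')^{-1}: its
# eigenvectors are the rows of A0, up to scale.
values, vectors = eig2(matmul(inv2(SIGMA_K), SIGMA_J))

# Normalise the i-th eigenvector so that its i-th element is 1. In this
# example the descending eigenvalue order happens to match the row order
# of A0; in general the labelling of shocks needs additional information.
rows = [[x / v[i] for x in v] for i, v in enumerate(vectors)]
print(values, rows)
```

With distinct relative variances across regimes, the rows of A0 are pinned down by the data alone. If the two relative variances coincided, any rotation of the rows would fit equally well, which is the situation that sign restrictions face by construction.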
6. Final remarks
This study analyzes the impact of macroeconomic shocks on the formal labor market disaggregated by nine levels of workers’ education, using data from the CAGED database. For this purpose, we combined established methodologies that allowed us to identify a FAVAR model using heteroskedasticity, based on the two-step method of Bernanke et al. (2005), and the identification approach proposed by Brunnermeier et al. (2021).
The main results show evidence of heterogeneity in the impacts of shocks on the labor market, highlighting that more educated workers are less impacted than other groups. Both a contractionary monetary policy shock and a pessimism shock concerning future economic activity lead to a decline in formal employment, with the most significant effects observed among less educated groups. An increase in average salary is observed following these two types of shocks. Exchange rate shocks have significant impacts only in the short term on less educated workers, and a positive supply shock has a positive impact on the labor market overall. Furthermore, the analysis reveals a pro-cyclical pattern in turnover. It is notable that the adjustment mechanism of firms mainly occurs through hiring rather than separations.
6.1 Possible limitations of this study include
Limitation of data scope: CAGED data is limited to the formal sector of the economy. Despite this, its use is preferable due to its larger sample size compared to other sources on the Brazilian labor market;
Methodological concerns: The two-step method treats the factors as fixed; developing a one-step method would help assess the robustness of the results;
Model specifications: The inclusion of level variables in the FAVAR model’s measurement equation may lead to spurious regressions. Nonetheless, evidence of cointegration among these variables and stationarity in unit root tests on the residuals suggests that the regressions are not spurious;
Statistical variability: There is limited variability in the covariance matrices of the residuals. Yet, evidence indicates that the variation between regimes is sufficient for identification; and
Model stability: The assumption of a constant A0 warrants further analysis.
Notes
Section 1
This effect can be explained by wage rigidity due to union power, search and matching frictions and the negotiation process between firms and employees.
It is an administrative register of the Ministry of Labor and Social Security that encompasses information on the number of hirings and separations of employees under the CLT (the Brazilian Labor Code).
Section 2
Identification via sign restrictions forms several linear combinations of the columns of the matrix of simultaneous relationships between variables at each step of the algorithm. It retains those combinations for which the impacts on the variables (observed in the impulse response functions) have the signs chosen by the authors to identify the shocks (qualitative beliefs about the likely shape of impulse responses to structural shocks).
Section 3
This procedure has a lower computational cost because the factors are not treated as random variables, so the Kalman filter is not needed to simulate them jointly with the other model parameters at each iteration.
The Bayesian FAVAR code used as a reference is available at: https://sites.google.com/site/garykoop/home/computer-code-2.
In Bernanke et al. (2005), Yt is described as observable factors because it represents macroeconomic variables that are directly included in the state vector alongside latent or unobservable factors Ft. Although Yt is not a factor in the same sense as Ft (which is estimated), it is treated as part of the factor space since it captures key macroeconomic dynamics and is considered an integral component of the state vector.
DFMs can be estimated using state-space methods (via Kalman filter) or through principal component analysis (PCA), as done in this study. PCA provides a computationally efficient way to extract common economic dynamics from large datasets and is widely used in empirical FAVAR applications (see Bernanke et al., 2005, Koop and Korobilis, 2010).
Note that if K = 0, i.e. if factors were not used in Equation (1), this would be a structural VAR as in Brunnermeier et al. (2021).
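For example, let Σk and Σj represent distinct covariance matrices for all k ≠ j, and let Λk and Λj be the corresponding diagonal matrices of structural shock variances. Multiplying the inverse of Σk by Σj yields $\Sigma_k^{-1}\Sigma_j = A_0' \Lambda_k^{-1}\Lambda_j (A_0')^{-1}$. This operation has the form of an eigendecomposition, in which the diagonal elements of $\Lambda_k^{-1}\Lambda_j$ are the eigenvalues and the columns of $A_0'$ are the eigenvectors. If the diagonal elements of $\Lambda_k^{-1}\Lambda_j$ are unique and $\Sigma_k^{-1}\Sigma_j$ is known, then matrix A0 can be identified (Lanne, Lutkepohl, & Maciejowska, 2010; Brunnermeier et al., 2021).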
Since we estimate the factors using PCA, the estimated latent factors may have factor loadings that differ from those in the original observation equation due to PCA normalization and orthogonality constraints as well as the inclusion of observable factors Yt in the measurement equation. However, this does not affect the underlying factor structure. Moreover, the estimated factor loadings are a linear transformation of the true ones, so the estimated factors span the same space as Ft and still capture the system’s common variation structure.
The R software code used for this type of identification is available at: http://www.princeton.edu/sims
Estimated by principal components following Stock and Watson (2016).
Bernanke et al. (2005) fix the first K rows of Φ; however, such a restriction is not necessary in the present study because the factors are fixed at each time point t.
After a burn-in of 100,000 simulations, every fourth draw is saved until 100,000 draws are stored. A total of 500,000 simulations are therefore executed: the first 100,000 are discarded and 100,000 of the remaining 400,000 are saved.
θTr is obtained in R using the code of Brunnermeier et al. (2021), and θM in Matlab based on Bernanke et al. (2005).
Section 4
As a result, economists have noted that current data no longer aligns with the historical series.
Used as a domestic activity level.
Obtained from the Ministry of Labor and Employment (MTE).
Kwiatkowski–Phillips–Schmidt–Shin.
Augmented Dickey–Fuller.
Both the static and exact model forms are used, as presented by Stock and Watson (2016), meaning that the latent factors appear only contemporaneously in the measurement equation (Equation (2)) and E(ϵitϵjι) = 0 for all t, ι if i ≠ j. The factors are identified by the restriction (Φf)(Φf)′/K = I.
Indeed, the latent factors are stationary, as they are estimated from dozens of variables made stationary by differencing (see Table A1 and Subsection 4.1.2). However, we transform them into level variables by cumulating them.
Performed in the R software using the ca.jo command from the urca package.
When independent variables are cointegrated, they share a long-term relationship despite possible short-term variations. Cointegration ensures that the regression residuals are stationary (I(0)), preventing spurious regression results despite the presence of non-stationary regressors. This formulation is consistent with error correction models (ECM), in which short-term dynamics are modeled through first differences while maintaining the long-term equilibrium relationship between variables in levels.
We combine their first two regimes to create a more persistent one since the correction for heteroskedasticity is more effective when volatilities mainly change between persistent regimes (Sims, 2020).
Calculated using the code provided by Brunnermeier et al. (2021). However, it is sensitive to specific assumptions, which may introduce bias, especially when comparing models of different complexity levels.
μ1 = 3, μ3 = 0.5, μ5 = 1 and μ6 = 5. Identifying the hyperparameters using a grid search procedure from the BEAR toolbox software proved to be challenging, as we could not retain the combination that maximizes the likelihood.
Section 5
The exchange rate shock is an exception, showing low variability in the variance of residuals. However, if the premise of independently t-distributed shocks holds true, identification can be achieved without the necessity to define any regimes at all (Sims, 2020).
Currency appreciation in the short term can be explained by the increase in the real interest rate (assuming certain price rigidities), resulting in capital inflow into the country due to higher profitability of assets linked to the Selic rate.
This provides a more direct measure of labor market dynamics. Since this is a linear transformation that neither adds nor removes information, only reorganizes it through a linear recombination of existing variables, the estimated factor space remains unchanged. As a result, the factors continue to capture the underlying economic dynamics of the labor market while maintaining consistency with the original dataset.
An increase in commodity prices improves the terms of trade, resulting in a positive impact on the trade balance and leading to currency appreciation.
The FAVAR code is referenced in note 5.
The sign restrictions approach iteratively applies linear combinations to the columns of the matrix, capturing the contemporaneous relationships between variables. At each step of the algorithm, it searches for a structural form that generates impulse response functions consistent with the sign constraints specified by the authors.
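The sign restrictions search described in the preceding note can be sketched for a two-variable system. This is a minimal illustration and not the procedure used in this study: the covariance matrix `SIGMA`, the restriction that the first shock raises the first variable and lowers the second on impact, and all function names are hypothetical.

```python
import math
import random

def cholesky2(s):
    """Lower-triangular factor of a 2x2 symmetric positive definite matrix."""
    l11 = math.sqrt(s[0][0])
    l21 = s[1][0] / l11
    l22 = math.sqrt(s[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

def rotation(theta):
    """2x2 rotation matrix, one family of orthogonal matrices."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def accepted_impacts(sigma, n_draws=2000, seed=0):
    """Keep every candidate impact matrix that satisfies the sign pattern."""
    rng = random.Random(seed)
    base = cholesky2(sigma)
    kept = []
    for _ in range(n_draws):
        # Each candidate reproduces sigma exactly: (PQ)(PQ)' = PP' = sigma.
        cand = matmul(base, rotation(rng.uniform(0.0, 2.0 * math.pi)))
        # Hypothetical restriction: shock 0 moves variable 0 up and
        # variable 1 down on impact.
        if cand[0][0] > 0.0 and cand[1][0] < 0.0:
            kept.append(cand)
    return kept

# Hypothetical reduced-form residual covariance (illustrative numbers only).
SIGMA = [[1.0, 0.3], [0.3, 1.0]]
kept = accepted_impacts(SIGMA)
own_impact = [c[0][0] for c in kept]
print(len(kept), min(own_impact), max(own_impact))
```

Every retained matrix fits the data equally well, so the spread between the smallest and largest accepted response reflects identification uncertainty rather than sampling error. This is consistent with the wider intervals reported for this approach in Section 5.5.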
I acknowledge the support of a CAPES doctoral scholarship during my PhD, which resulted in this paper.
The supplementary material for this article can be found online.




