The Korean equity market is characterized by extreme concentration, with the top five stocks accounting for approximately 25–45 percent of total market capitalization. This study examines whether such concentration structurally distorts the Fama and French (1993) three-factor model (FF3) and assesses the impact of excluding large-cap stocks on coefficient estimates and explanatory power. I construct the original FF3 model along with three alternative versions, namely FF3_E1, FF3_E3 and FF3_E5, which sequentially exclude the top one, three and five firms by market capitalization. Using one-to-one correlations of estimated alphas and betas, the sum and difference of squared coefficients, and adjusted R2, I evaluate the robustness of the factor structure. The results show that even after excluding large caps, the FF3 model maintains highly stable coefficient estimates and slightly improved explanatory power, affirming its structural robustness. However, the downward shift in the benchmark return due to large-cap exclusion induces a systematic rise in estimated alphas. These findings confirm FF3’s validity in concentrated markets while highlighting the need for careful alpha interpretation.
1. Introduction
Asset pricing models have evolved within a long-standing research tradition aimed at explaining the distribution and structure of returns in financial markets. Among these, the three-factor model (FF3) proposed by Fama and French (1993) has become the most widely used empirical framework. Comprising the market factor (RM_RF), the size factor (SMB) and the value factor (HML), this model goes beyond merely extending the CAPM and demonstrates strong explanatory power for various anomalies observed in real markets. In particular, FF3 offers a significant advantage in return analysis for individual assets or portfolios, as it is capable of capturing the structural characteristics of not only equities but entire asset classes.
Before evaluating the price of any asset, it is necessary to first ensure that the benchmark used for that evaluation is robust. In markets where few firms dominate index composition and movements, the reliability of factor-based benchmarks such as FF3 cannot be assumed. If the benchmark contains structural bias, any subsequent performance evaluation or pricing exercise will inevitably inherit that bias. This is because expansions or contractions in the weight of a few large-cap stocks can influence the behavior of the pricing model itself. This issue is particularly important in markets like Korea, where the extreme concentration of a few large-cap stocks can alter the statistical properties of the factors. Accordingly, the primary aim of this study is to examine the impact of large-cap stocks on factor-based pricing models, and to do so by removing dominant large-cap stocks from the benchmark to analyze changes in estimation, stability and interpretive validity.
In applying the FF3 model, one important structural constraint is that its factors are constructed from value-weighted portfolios. When a market’s composition is heavily unbalanced, especially when a small number of large caps dominate, both the factor returns and the regression outcomes for individual assets can be systematically affected. In the Korean equity market, this concentration is partly a consequence of the KOSPI’s market capitalization-weighted structure, under which “a large and widely held company has greater impact on the KOSPI performance than a small company” (IMF Special Data Dissemination Standard) [1]. The composition of the KOSPI 200 further amplifies this imbalance: the index consists of the country’s top 200 publicly traded companies, collectively representing roughly 70% of the total market capitalization of the Korea Exchange (Investopedia, 2025). Such concentration shapes investment behavior and can influence coefficient estimation and regression stability. In other words, if market measures become overly sensitive to the returns of a few large caps, the resulting risk premium estimates may be structurally distorted.
In this situation, the coefficients estimated by the FF3 model, such as the sensitivity to the market factor (beta on RM_RF) or the responsiveness to the size and value factors (betas on SMB and HML), can be distorted relative to their true values. Furthermore, estimated alphas for individual stocks may be structurally influenced by large caps, and the model’s explanatory power (adjusted R2) may be underestimated or overestimated.
This study starts from this concern and examines how exclusion of the largest market-cap stocks affects the coefficient structure of the FF3 model. To do this, I built the original FF3 model alongside three alternative versions that progressively exclude the top market-cap stocks (FF3_E1, FF3_E3, FF3_E5) and compared their regression coefficients, explanatory power and structural similarity. In particular, I sought to quantify the robustness of the FF3 structure by analyzing one-to-one correlations between coefficients, the magnitude of coefficient shifts and differences in the sum of squared coefficients.
Empirically, the factor-mimicking returns change little: the time-series correlations between the baseline FF3 factors and the corresponding series from the exclusion models (FF3_E1, FF3_E3, FF3_E5) are at least 0.97 for RM_RF and remain above 0.90 for SMB and HML. The same stability appears in the estimates: across assets, cross-specification correlations between the original and exclusion-model coefficient vectors exceed 0.97 for alphas and for betas on RM_RF, SMB and HML. The main exception is the level of alpha: the average estimated alpha rises by about 1.46% per year after removing the five largest stocks by market capitalization. This is a leveling effect, because excluding high-return large caps lowers the benchmark’s average return and mechanically raises intercepts for the remaining assets. It is therefore a spurious effect induced by replacing the benchmark while the measurement target is unchanged; it does not represent an attainable or implementable return. Adjusted R² increases slightly when the largest caps are excluded; the change is economically negligible yet statistically significant. Moreover, the incremental R² is positively related to the sum of squared coefficient shifts and to changes in the coefficient sum of squares, indicating that large-cap presence acts as a mild hindrance to model fit.
Based on these findings, I empirically demonstrate that the FF3 structure remains highly robust even after excluding certain large-cap stocks. However, it is also confirmed that the reduction in average returns resulting from the exclusion process produces a relative increase in the alphas of the affected assets.
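The mechanical nature of this alpha shift can be seen from the standard OLS intercept identity. The following is only a sketch in notation introduced here for illustration: bars denote sample means over the estimation window, $F_k$ runs over the three factors (RM_RF, SMB, HML), and primes denote quantities from an exclusion model.

```latex
\hat{\alpha}_i = \bar{r}_i - \sum_{k=1}^{3} \hat{\beta}_{i,k}\,\bar{F}_k ,
\qquad
\hat{\alpha}_i' = \bar{r}_i - \sum_{k=1}^{3} \hat{\beta}_{i,k}'\,\bar{F}_k'
\quad\Longrightarrow\quad
\hat{\alpha}_i' - \hat{\alpha}_i
  = \sum_{k=1}^{3} \left( \hat{\beta}_{i,k}\,\bar{F}_k - \hat{\beta}_{i,k}'\,\bar{F}_k' \right)
```

Because the asset's mean excess return $\bar{r}_i$ is the same in both regressions, any change in the intercept comes entirely from changes in the factor means and loadings; when the loadings are nearly unchanged, a lower benchmark premium translates directly into a higher estimated alpha.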
This paper provides the following scholarly contributions. By empirically demonstrating the structural stability of the FF3 model, it offers quantitative evidence of the model’s empirical validity. By quantitatively analyzing the impact of a large-cap–dominated market structure on coefficient estimation, it specifies the cautionary points and interpretive limitations when applying the FF3 model. By comparing model versions, it shows that the FF3 structure maintains consistent explanatory power despite large-cap concentration, thereby highlighting both the practical applicability and external robustness of the FF3 model along with its limitations.
The organization of this study is as follows. Section 2 reviews the background of the FF3 model and summarizes the key discussions in the existing literature. Section 3 describes the research methodology and the data. Section 4 presents the empirical analysis results. Section 5 offers further discussion based on those findings. Finally, Section 6 concludes the study.
2. Literature review
Fama and French (1993) pointed out that the single-factor CAPM could not fully explain the cross-section of stock returns and proposed the FF3 model, which adds size (SMB) and value (HML) factors to the market factor. They analyzed the US stock market from 1963 to 1990, noting that small-market-cap stocks and high book-to-market (B/M) value stocks tended to earn higher average returns. The SMB factor builds on the “small-firm effect” documented by Banz (1981), who empirically showed that stocks with lower market capitalizations systematically delivered higher excess returns. The HML factor draws on research by Stattman (1980) and Rosenberg et al. (1985), which demonstrated that stocks with high book-to-market ratios generated long-term excess returns.
Carhart (1997) proposed a four-factor model by adding a one-year-return-based momentum factor (PR1yr) to address the FF3 limitation of failing to capture the momentum effect, whereby stocks with high past returns continue to earn high returns. The strong empirical evidence on momentum prompted even Fama, an early skeptic of its persistence, to routinely include and publish the momentum factor in the Kenneth French Data Library, where it remains available today. Subsequently, Fama and French (2015) expanded the FF3’s explanatory power by adding the profitability factor (RMW) and the investment factor (CMA), resulting in the five-factor model (FF5). However, these extended models continue to face debate in practical application due to factor overlap and interpretive complexity. Beyond the US, Griffin (2002) found that the FF3 model retains a certain degree of explanatory power in other markets, although the relevance of additional factors depends on each market’s structural characteristics. These international findings also invite attention to structural features that may influence factor estimates, one of which is the concentration of market capitalization in a few dominant firms.
The rising dominance of a few mega-cap firms in equity markets warrants closer examination. As of end-2024, the ten largest stocks comprised about 19.5% of the MSCI All Country World Index (ACWI). This is not an anomaly confined to one market, but a recurring global structural pattern. Examples include Samsung Electronics (Korea), TSMC (Taiwan), Nokia (Finland), LVMH (France) and SAP (Germany), whose dominance in their domestic markets extends beyond representation and has the potential to influence widely used value-weighted pricing models. Gabaix (2011) formalizes this mechanism in his granular hypothesis, showing that when firm size follows a power-law distribution, idiosyncratic shocks to a handful of mega-cap firms can propagate to aggregate economic variables and common risk factors, thereby shaping both factor construction and estimated risk premia. However, empirical studies directly examining the impact of such mega-cap concentration on the estimation and performance of pricing models remain scarce.
While the impact of large-cap concentration on pricing models has rarely been studied, the opposite case, removing small-cap stocks, has been tested. Horowitz et al. (2000) show that removing microcap stocks materially alters the SMB factor in the US. More recent evidence by Bartram et al. (2021), who analyze the performance and role of microcaps as an asset class, and industry analysis by MSCI (2021) on the impact of including microcaps in benchmark indexes in Japan and the US, similarly highlight that factor construction can be sensitive to the inclusion or exclusion of particular market segments. While not the main background, this naturally raises the question of whether removing large-cap firms, the opposite end of the size spectrum, would yield meaningful changes in factor composition and explanatory power. Relatedly, Choi and Lee (2023) show that Korea’s market structure, characterized by high concentration in chaebols and export-driven industries, amplifies connectedness and systemic risk, underscoring the broader implications of dominance and concentration for financial stability. This complements the present study by highlighting how concentration can matter not only for systemic risk but also for benchmark-based factor models.
Meanwhile, in tests of factor models in the Korean market, Rugwiro and Choi (2019) report that FF3 delivers higher explanatory power than the Carhart four-factor model (FF3 plus momentum) or FF5, and that adding momentum or other factors can raise concerns about overfitting. Chae and Kang (2019) apply FF3, Carhart four-factor and FF5 models to Korean equities, evaluate their empirical performance and find no substantial differences across the models. Consistent with these findings, Kim and Sohn (2010) and Kim and Cho (2010) have successfully applied FF3 to Korean equities without momentum, partly because prior work (Chui et al., 2010) finds that momentum effects are weak or insignificant in some Asian markets.
Taken together, these studies suggest that FF3 serves as a robust baseline model with both structural stability and empirical validity in explaining the Korean stock market. Accordingly, this study focuses on FF3 rather than its common extensions such as the Carhart four-factor model or FF5, in order to maintain comparability with prior Korean evidence and to isolate the structural impact of large-cap concentration on the core FF3 factors. It should be emphasized that the choice of FF3 here is not to test its superiority over alternative models, but rather to select an appropriate and stable framework for conducting large-cap exclusion experiments.
3. Research methodology and data
3.1 Research design
This study constructs the FF3 model (Fama and French, 1993) for the Korean equity market and compares how factor composition and regression estimates change depending on whether the largest market-cap stocks are excluded.
In particular, I sequentially exclude the top 1, top 3 and top 5 stocks by market capitalization to create alternative FF3 models and quantitatively analyze changes in factor values, regression coefficients and explanatory power relative to the original FF3 model.
The Korean equity market is characterized by an excessive concentration of market capitalization in a small number of top stocks, and this concentration may have a tangible impact on factor construction. Figure 1 shows the annual trends in the share of total market capitalization held by the top 1, top 3 and top 5 stocks. Although the exact share of the top 5 stocks varies by year, they represent approximately 25–45% of total market capitalization, underscoring the need for an empirical examination of factor representativeness and stability.
Figure 1. Trends in the market share of top-capitalization stocks. [Line chart: percentage share of total market capitalization (vertical axis, 0–50%) held by the top 1, top 3 and top 5 stocks, by year (horizontal axis).] Note(s): This figure shows the time series of the proportion of total market capitalization accounted for by the top 1, top 3 and top 5 stocks in the KOSPI and KOSDAQ markets. The values for each year are calculated based on market capitalization as of the end of June of the corresponding year. Source(s): Author’s own work
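The concentration measure plotted in Figure 1 can be sketched as follows. This is a minimal illustration, not the author's code: the function name `top_k_share` and the market-cap values are hypothetical, and the input is assumed to be a cross-section of end-of-June market capitalizations for one year.

```python
import pandas as pd

def top_k_share(caps: pd.Series, k: int) -> float:
    """Share of total market capitalization held by the k largest stocks."""
    return caps.nlargest(k).sum() / caps.sum()

# Hypothetical end-of-June market caps (arbitrary units) for a single year.
caps = pd.Series(
    {"A": 400.0, "B": 150.0, "C": 100.0, "D": 80.0, "E": 70.0, "F": 200.0}
)

# Shares for the top 1, top 3 and top 5 stocks, as in the three lines of the figure.
shares = {k: top_k_share(caps, k) for k in (1, 3, 5)}
print(shares)
```

Repeating the calculation on each year's end-of-June cross-section would give one point per year for each of the three series.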
3.2 Data
This study computes daily returns using dividend-adjusted closing prices and market capitalizations for ordinary common shares listed on the KOSPI and KOSDAQ from 2001 to 2022. Market capitalization is measured as outstanding shares times the closing price. Price and capitalization data are sourced from DataGuide, and book equity for computing book value is from Kisvalue.
The sample comprises 1,940 firms that were traded for at least three consecutive years during the sample period. The three-year requirement is adopted to ensure stable pooled regressions and reliable estimation of factor exposures, while reducing noise from very short-lived listings. To define the investable universe, the following securities are excluded: preferred shares, ETFs, ETNs, REITs and SPACs. Financial sector firms are included in the main regressions and in the construction of the market factor (RM_RF), but excluded from the construction of the size (SMB) and value (HML) factors to avoid distortions stemming from sector-specific balance sheet structures and regulations. The risk-free rate is proxied by the 1-day call rate from the Bank of Korea, converted to a daily series.
3.3 Construction of the three factors
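This study closely follows the methodology of Fama and French (1993) to generate daily factors. The model comprises a market factor (RM_RF), a size factor (SMB) and a value factor (HML), and is estimated for each asset as follows:

$$R_{i,t} - R_{f,t} = \alpha_i + \beta_{i,M}\,(R_{M,t} - R_{f,t}) + \beta_{i,S}\,SMB_t + \beta_{i,H}\,HML_t + \varepsilon_{i,t}$$

Here, $R_{i,t}$ is the daily return on asset i using dividend-adjusted close-to-close prices; $R_{f,t}$ is the daily risk-free rate; $R_{M,t}$ is the value-weighted market return on the investable universe; $SMB_t$ and $HML_t$ are value-weighted factor-mimicking returns constructed following standard Fama–French methodology; and $\varepsilon_{i,t}$ is the error term.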
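The per-asset time-series regression described above can be sketched with ordinary least squares. This is an illustrative sketch under stated assumptions rather than the study's implementation: the function name `ff3_regression`, the simulated factor series and the "true" betas are all hypothetical, and alpha is annualized by multiplying the daily intercept by 252, as in the table notes.

```python
import numpy as np

def ff3_regression(excess_ret, factors):
    """OLS of a stock's daily excess return on [RM_RF, SMB, HML].

    Returns (annualized alpha, array of three betas, adjusted R-squared).
    """
    y = np.asarray(excess_ret, dtype=float)
    F = np.asarray(factors, dtype=float)            # shape (T, 3)
    X = np.column_stack([np.ones(len(y)), F])       # intercept + three factors
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    n, p = X.shape                                  # p counts the intercept
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    return coef[0] * 252, coef[1:], adj_r2

# Simulated data (hypothetical): 2,000 trading days of factor returns and one stock.
rng = np.random.default_rng(0)
T = 2000
factors = rng.normal(0.0003, 0.01, size=(T, 3))
true_betas = np.array([1.0, 0.8, 0.1])
excess_ret = factors @ true_betas + rng.normal(0.0, 0.02, size=T)

alpha_ann, betas, adj_r2 = ff3_regression(excess_ret, factors)
print(alpha_ann, betas, adj_r2)
```

Running the same function for every stock, once per factor set (FF3, FF3_E1, FF3_E3, FF3_E5), would produce the cross-sectional distributions of alphas, betas and adjusted R² that the tables summarize.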
First, the market factor (RM_RF) is defined as the return on the market portfolio minus the risk-free rate. The market return is computed by weighting each stock’s daily return by its prior-day market-cap share. The risk-free rate is the 1-day call rate from the Bank of Korea, converted to a daily series. Subtracting the risk-free rate from the market return yields the market factor.
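A minimal sketch of this calculation, assuming wide data frames of daily returns and market capitalizations (dates in rows, stocks in columns) and an already-converted daily risk-free series. The function name, the toy data and the handling of missing returns are assumptions made for illustration.

```python
import pandas as pd

def market_excess_return(returns, mktcap, rf_daily):
    """Value-weighted market return (prior-day cap weights) minus the risk-free rate."""
    lagged_cap = mktcap.shift(1)                     # prior-day market caps
    # Assumption: a stock with no return on a given day gets no weight that day.
    lagged_cap = lagged_cap.where(returns.notna())
    weights = lagged_cap.div(lagged_cap.sum(axis=1), axis=0)   # rows sum to one
    rm = (weights * returns).sum(axis=1, min_count=1)
    return (rm - rf_daily).dropna()                  # first day has no lagged weights

# Toy data (hypothetical): two stocks, three business days.
dates = pd.date_range("2020-01-02", periods=3, freq="B")
returns = pd.DataFrame({"A": [0.00, 0.10, 0.00], "B": [0.00, -0.10, 0.10]}, index=dates)
mktcap = pd.DataFrame({"A": [300.0, 330.0, 330.0], "B": [100.0, 90.0, 99.0]}, index=dates)
rf_daily = pd.Series(0.0001, index=dates)

rm_rf = market_excess_return(returns, mktcap, rf_daily)
print(rm_rf)
```

On the second day the prior-day weights are 0.75 and 0.25, so the market return is 0.75 × 0.10 + 0.25 × (−0.10) = 0.05 before subtracting the risk-free rate.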
The SMB and the HML are constructed each year at the end of June for all stocks excluding financials. Stocks are first split into two groups at the median market cap, then classified into three groups (high, medium and low) based on the prior period’s book-to-market ratio. These two classifications create six value-weighted portfolios. The SMB factor equals the average daily return of the three small-stock portfolios minus the average daily return of the three large-stock portfolios. The HML factor equals the average daily return of the two high book-to-market portfolios minus the average daily return of the two low book-to-market portfolios.
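Given the six value-weighted portfolio returns, the two factor definitions above reduce to simple averages and differences. The sketch below assumes the six daily return series are already available; the column labels (SL, SM, SH, BL, BM, BH, for small/big crossed with low/medium/high book-to-market) and the numbers are hypothetical.

```python
import pandas as pd

def smb_hml(p: pd.DataFrame) -> pd.DataFrame:
    """SMB and HML from six size-B/M portfolio returns.

    Columns: SL, SM, SH (small caps) and BL, BM, BH (large caps),
    where L/M/H denote low/medium/high book-to-market.
    """
    smb = p[["SL", "SM", "SH"]].mean(axis=1) - p[["BL", "BM", "BH"]].mean(axis=1)
    hml = p[["SH", "BH"]].mean(axis=1) - p[["SL", "BL"]].mean(axis=1)
    return pd.DataFrame({"SMB": smb, "HML": hml})

# One day of hypothetical portfolio returns.
six = pd.DataFrame(
    {"SL": [0.01], "SM": [0.02], "SH": [0.03], "BL": [0.00], "BM": [0.01], "BH": [0.02]}
)
factors = smb_hml(six)
print(factors)
```

In this toy row the small portfolios average 0.02 and the large ones 0.01, so SMB is 0.01; the high book-to-market portfolios average 0.025 and the low ones 0.005, so HML is 0.02.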
This study constructs four versions of the three-factor model: the original FF3 including all KOSPI and KOSDAQ stocks, FF3_E1 excluding the top-1 market-cap firm, FF3_E3 excluding the top-3 firms and FF3_E5 excluding the top-5 firms. Market-cap rankings are determined each June to align with the SMB and HML formation dates, and weights are renormalized in the factor calculations so that they always sum to one. Financial stocks are excluded from the ranking process to maintain consistency across all models.
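The exclusion step can be sketched as dropping the k largest stocks by end-of-June capitalization and renormalizing the remaining weights. This is an illustrative sketch only: the function name and data are hypothetical, and the ranking universe is assumed to have been filtered beforehand (for example, for financial stocks).

```python
import pandas as pd

def exclusion_weights(june_cap: pd.Series, lagged_cap: pd.Series, k: int) -> pd.Series:
    """Value weights after excluding the k largest stocks by end-of-June market cap.

    june_cap:   market caps on the ranking date (end of June).
    lagged_cap: prior-day market caps used as portfolio weights.
    """
    excluded = june_cap.nlargest(k).index            # top-k names, fixed until next June
    kept = lagged_cap.drop(index=excluded)
    return kept / kept.sum()                         # renormalize so weights sum to one

# Hypothetical market caps; for simplicity the ranking and weighting caps coincide.
caps = pd.Series({"A": 500.0, "B": 200.0, "C": 150.0, "D": 100.0, "E": 50.0})

w_e1 = exclusion_weights(caps, caps, 1)   # analogue of FF3_E1
w_e3 = exclusion_weights(caps, caps, 3)   # analogue of FF3_E3
print(w_e1, w_e3, sep="\n")
```

Setting k to 0 reproduces the original weights, so the same function can serve the baseline and all three exclusion variants.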
3.4 Analysis methods
This study proceeds in the following steps.
First, using the FF3 model, I run regressions for each stock to estimate factor sensitivities (betas), alphas and explanatory power (adjusted R2), and examine their distributions.
Next, I construct the alternative models FF3_E1, FF3_E3, and FF3_E5 and analyze the correlations between their factors and those of the original FF3.
I then compare, for each stock, the correlation coefficients and differences between the alphas and betas estimated under FF3 and under each alternative model.
Finally, I perform regressions of the change in explanatory power (ΔAdj. R²) on measures of coefficient change (the sum of squared beta shifts, ∑(Δβ)², and the change in the sum of squared betas, Δ∑β²) to quantitatively identify the drivers of explanatory power changes in the alternative models.
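The two coefficient-change measures in the final step, and the cross-sectional regression on them, can be sketched as follows. All names and numbers here are hypothetical, and the relation between the change in adjusted R² and the two measures is fabricated purely so that the example runs; it does not reproduce the study's estimates.

```python
import numpy as np

def coefficient_change_measures(beta_base, beta_alt):
    """Per-stock measures from (n_stocks, 3) arrays of betas on RM_RF, SMB, HML."""
    shift = beta_alt - beta_base
    sum_sq_shift = np.sum(shift ** 2, axis=1)                   # sum of squared beta shifts
    change_sum_sq = np.sum(beta_alt ** 2, axis=1) - np.sum(beta_base ** 2, axis=1)
    return sum_sq_shift, change_sum_sq                          # second: change in sum of squared betas

def ols_coefficients(y, X):
    """OLS coefficients (intercept first) of y on the columns of X."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

# Worked example for one stock with hypothetical betas.
one_shift, one_change = coefficient_change_measures(
    np.array([[1.0, 0.8, 0.1]]), np.array([[1.1, 0.7, 0.1]])
)

# Synthetic cross-section of 200 stocks (made-up betas and a made-up linear relation).
rng = np.random.default_rng(0)
beta_base = rng.normal([1.0, 0.8, 0.1], 0.3, size=(200, 3))
beta_alt = beta_base + rng.normal(0.0, 0.05, size=(200, 3))
x1, x2 = coefficient_change_measures(beta_base, beta_alt)
d_adj_r2 = 0.001 + 0.05 * x1 + 0.02 * x2      # fabricated for illustration only
coef = ols_coefficients(d_adj_r2, np.column_stack([x1, x2]))
print(one_shift, one_change, coef)
```

For the single-stock example, the betas move by +0.1 and −0.1 on the first two factors, so the sum of squared shifts is 0.02, while the sum of squared betas rises from 1.65 to 1.71, a change of 0.06.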
4. Empirical results
4.1 Analysis of FF3 regression results
Table 1 summarizes the regression results for the FF3 model applied to the entire market. Panel A displays the distributions of estimated alphas and factor sensitivities (betas), while Panel B shows the distributions of t-statistics for each coefficient. The sample comprises 1,940 stocks, including only those with at least three years of trading history.
Table 1. Regression results from the FF3 model
| Variable | N. obs | Mean | Std. dev | Skew | Kurt | P10 | P50 | P90 |
|---|---|---|---|---|---|---|---|---|
| Panel A. Distribution of estimated alpha and betas | ||||||||
| α | 1,940 | −0.008 | 0.159 | −0.427 | 5.720 | −0.172 | −0.004 | 0.151 |
| RM_RF | 1,940 | 1.018 | 0.283 | −0.264 | −0.145 | 0.644 | 1.031 | 1.376 |
| SMB | 1,940 | 0.820 | 0.372 | −0.373 | 0.059 | 0.310 | 0.861 | 1.262 |
| HML | 1,940 | 0.112 | 0.372 | −1.197 | 3.006 | −0.313 | 0.165 | 0.515 |
| Adj. R2 | 1,940 | 0.162 | 0.081 | 0.767 | 1.037 | 0.065 | 0.152 | 0.269 |
| Panel B. Distribution of t-values for alpha and betas | ||||||||
| α | 1,940 | −0.204 | 7.244 | −43.165 | 1888.450 | −1.118 | −0.024 | 1.004 |
| RM_RF | 1,940 | 21.117 | 9.935 | 0.827 | 1.157 | 9.521 | 19.846 | 33.686 |
| SMB | 1,940 | 10.849 | 5.358 | −0.299 | 0.490 | 4.177 | 10.949 | 17.719 |
| HML | 1,940 | 1.796 | 4.587 | −0.150 | 1.766 | −3.348 | 1.806 | 7.372 |
Note(s): This table summarizes the regression results from the FF3 model estimated on the full market sample. Panel A presents the distribution of estimated alphas and factor loadings (betas), while Panel B shows the distribution of their corresponding t-values. The analysis includes 1,940 stocks listed on the KOSPI and KOSDAQ markets in Korea between 2001 and 2022, each with at least three years of trading history. Alpha values are annualized by multiplying daily estimates by 252.
The average estimated beta on the market factor (RM_RF) in Panel A is 1.018, close to the theoretical value of 1. The median beta is 1.031, so the distribution is roughly symmetric. The average estimated beta on SMB is 0.820 and the median is 0.861. Combined with the positive mean return of the SMB factor reported in Table 2 [2], this implies that small-cap stocks tend to earn higher returns than large caps. In contrast, the average estimated beta on HML is 0.112 and the 10th percentile is −0.313, indicating a pronounced negative tail. The skewness of the HML betas is −1.197 and their kurtosis is 3.006, reflecting a more left-skewed and heavier-tailed distribution than for the market and size betas.
Table 2. Summary statistics and correlations: FF3 vs. FF3_E1, E3 and E5
| Panel A. Summary statistics of realized factor returns (× 1,000) | | | |
|---|---|---|---|
| Factor | Model | Mean | Std. Dev |
| RM_RF | FF3 | 0.243 | 13.11 |
| | FF3_E1 | 0.198 | 13.07 |
| | FF3_E3 | 0.233 | 13.13 |
| | FF3_E5 | 0.256 | 13.26 |
| SMB | FF3 | 0.345 | 8.70 |
| | FF3_E1 | 0.390 | 8.40 |
| | FF3_E3 | 0.333 | 8.15 |
| | FF3_E5 | 0.285 | 7.95 |
| HML | FF3 | 0.351 | 7.46 |
| | FF3_E1 | 0.372 | 7.38 |
| | FF3_E3 | 0.408 | 7.21 |
| | FF3_E5 | 0.430 | 6.87 |
| Panel B. Correlations between FF3 factors and alternative models | |||
|---|---|---|---|
| Factor | FF3 vs. FF3_E1 | FF3 vs. FF3_E3 | FF3 vs. FF3_E5 |
| RM_RF | 0.982 | 0.975 | 0.970 |
| SMB | 0.981 | 0.945 | 0.927 |
| HML | 0.978 | 0.977 | 0.904 |
Note(s): This table compares the original FF3 model with its alternative versions: FF3_E1, FF3_E3 and FF3_E5. FF3_E1 is constructed by excluding the single largest stock by market capitalization, while FF3_E3 and FF3_E5 exclude the top 1–3 and top 1–5 stocks, respectively. Market-cap rankings are determined as of the end of June each year and applied consistently from July of that year through June of the following year. Panel A reports summary statistics for each factor, and Panel B shows the correlations between the original FF3 factors (RM_RF, SMB, HML) and the corresponding factors in each alternative model. All p-values for the correlation coefficients in Panel B are below 0.001 and thus omitted.
Alpha is annualized by multiplying the daily estimates by 252 for ease of interpretation. The average alpha is −0.008 and the median is −0.004, showing no notable abnormal returns. The average adjusted R2 is 0.162 and the median is 0.152, indicating that the three factors explain roughly 16% of the daily return variation of a typical stock. A skewness of 0.767 and kurtosis of 1.037 for adjusted R2 indicate a right-skewed distribution, in which explanatory power is noticeably higher for a subset of stocks.
Panel B reports the distribution of t-statistics for each regression coefficient. The average t-statistic for the market factor is 21.117, with the 10th percentile at 9.521, indicating strong statistical significance for most stocks. The SMB’s average t-statistic is 10.849 and its median is 10.949, reflecting consistently high significance. In contrast, the HML’s average t-statistic is 1.796, with a 10th percentile of −3.348, indicating that some stocks exhibit significantly negative loadings. Moreover, a notable portion of stocks show no meaningful sensitivity to HML at all. The t-statistic distribution for alpha has a mean of −0.204 and a median of −0.024, and both the 10th and 90th percentiles (−1.118 and 1.004) fall within ranges that are not statistically significant.
4.2 Factor correlations between FF3 and the alternative models
Table 2 summarizes the comparison between the original FF3 model and its alternative specifications that sequentially exclude the top market-cap stocks (FF3_E1, FF3_E3 and FF3_E5). Panel A reports the mean and standard deviation of each factor’s daily return, while Panel B presents the pairwise correlations between each FF3 factor and its counterparts in the alternative models.
In Panel A, the mean of RM_RF in the FF3 model is 0.243. Moving to FF3_E1, FF3_E3 and FF3_E5, the values shift to 0.198, 0.233 and 0.256, respectively, showing a decrease followed by an increase. This indicates that RM_RF does not exhibit a clear directional trend. The standard deviation remains relatively stable, ranging from 13.07 to 13.26, with no meaningful variation.
For SMB, the mean changes from 0.345 in FF3 to 0.285 in FF3_E5, first increasing slightly, then decreasing. The standard deviation shows a consistent decline from 8.70 to 7.95. In the case of HML, the mean steadily increases from 0.351 to 0.430, while the standard deviation consistently decreases.
These results suggest that constructing alternative models leads to changes in the values and distributions of the measured factors. However, such changes are a result of removing certain return-generating stocks and reallocating their weights to the remaining stocks. They do not, by themselves, constitute sufficient evidence of significant structural differences between the FF3 model and its alternatives. Therefore, Panel B examines the correlations between each corresponding factor across the models.
As shown in Panel B, the market factor remains highly correlated with FF3 in all alternative models: 0.982 for FF3_E1, 0.975 for FF3_E3 and 0.970 for FF3_E5. This indicates that excluding up to the top five stocks by market capitalization does not materially undermine the factor that represents overall market movement.
The SMB shows a similar pattern to the market factor, but its correlation weakens more rapidly, from 0.981 in FF3_E1 to 0.927 in FF3_E5. This suggests that excluding the largest stocks leads to a reweighting that can alter the relative positions or weights of some small caps, depending on whether the largest stocks are included.
The HML shows a somewhat larger drop in correlation, decreasing from 0.978 in FF3_E1 to 0.904 in FF3_E5. This likely reflects the exclusion of large value stocks, which can influence the construction of HML more than SMB due to differences in portfolio grouping: SMB is based on six size–B/M portfolios, whereas HML uses only four, making it more sensitive to composition shifts.
Even so, the overall FF3 factor structure remains broadly similar across models. All correlations remain above 0.9, and in FF3_E1 and FF3_E3, they exceed 0.94. While some differences emerge, the evidence does not suggest that the factor structure is materially disrupted.
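The Panel B comparison amounts to a Pearson correlation between two aligned daily factor series. The sketch below uses simulated data to show why removing one heavily weighted stock can leave the correlation high when the remaining portfolio still tracks the common market component; the 20% weight, the volatilities and all names are hypothetical and are not calibrated to the Korean data.

```python
import numpy as np

def factor_correlation(base, alt):
    """Pearson correlation between two aligned daily factor series."""
    return float(np.corrcoef(base, alt)[0, 1])

# Simulated returns (hypothetical parameters).
rng = np.random.default_rng(1)
common = rng.normal(0.0, 0.01, 1000)                 # shared market component
big = common + rng.normal(0.0, 0.015, 1000)          # one dominant stock
rest = common + rng.normal(0.0, 0.002, 1000)         # diversified remainder

base = 0.2 * big + 0.8 * rest                        # market return with the dominant stock
alt = rest                                           # market return after excluding it

corr = factor_correlation(base, alt)
print(round(corr, 3))
```

The dominant stock's idiosyncratic noise is the only component that the two series do not share, so the correlation falls below one only to the extent that this noise matters at the portfolio level.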
4.3 Analysis of regression results for alternative FF3 models
Table 3 summarizes the regression results for the three alternative models FF3_E1, FF3_E3 and FF3_E5, which sequentially exclude the top market-capitalization stocks. Panel A reports the distribution of estimated alphas, factor sensitivities (betas) and adjusted R2 for each model, while Panel B shows the distribution of t-statistics for these coefficients. In Panel A, alpha is annualized by multiplying the daily estimates by 252, consistent with Table 1.
Table 3. Summary of regression results for FF3 alternative models
| Variable | Model | Mean | Std. dev | Skew | Kurt | P10 | P50 | P90 |
|---|---|---|---|---|---|---|---|---|
| Panel A. Distribution of estimated alpha and betas | ||||||||
| α | FF3_E1 | 0.002 | 0.159 | −0.313 | 5.251 | −0.164 | 0.005 | 0.162 |
| | FF3_E3 | 0.005 | 0.158 | −0.307 | 5.451 | −0.159 | 0.005 | 0.168 |
| | FF3_E5 | 0.007 | 0.158 | −0.165 | 4.850 | −0.159 | 0.005 | 0.170 |
| RM_RF | FF3_E1 | 1.017 | 0.279 | −0.282 | −0.134 | 0.650 | 1.032 | 1.367 |
| | FF3_E3 | 1.019 | 0.278 | −0.305 | −0.108 | 0.654 | 1.035 | 1.366 |
| | FF3_E5 | 1.021 | 0.279 | −0.317 | −0.129 | 0.654 | 1.040 | 1.365 |
| SMB | FF3_E1 | 0.773 | 0.383 | −0.440 | 0.105 | 0.241 | 0.822 | 1.223 |
| | FF3_E3 | 0.765 | 0.398 | −0.441 | 0.041 | 0.209 | 0.815 | 1.238 |
| | FF3_E5 | 0.760 | 0.416 | −0.408 | −0.009 | 0.169 | 0.811 | 1.250 |
| HML | FF3_E1 | 0.086 | 0.362 | −1.195 | 2.938 | −0.332 | 0.141 | 0.480 |
| | FF3_E3 | 0.072 | 0.376 | −1.067 | 2.430 | −0.374 | 0.127 | 0.489 |
| | FF3_E5 | 0.076 | 0.402 | −1.038 | 2.323 | −0.413 | 0.127 | 0.517 |
| Adjusted R2 | FF3_E1 | 0.163 | 0.081 | 0.737 | 0.882 | 0.065 | 0.154 | 0.271 |
| | FF3_E3 | 0.164 | 0.082 | 0.752 | 0.963 | 0.065 | 0.154 | 0.271 |
| | FF3_E5 | 0.164 | 0.082 | 0.767 | 1.034 | 0.065 | 0.154 | 0.272 |
| Panel B. Distribution of t-values for alpha and betas | ||||||||
| α | FF3_E1 | −0.132 | 7.226 | −43.145 | 1887.310 | −1.033 | 0.033 | 1.091 |
| | FF3_E3 | −0.118 | 7.232 | −43.150 | 1887.600 | −1.024 | 0.030 | 1.114 |
| | FF3_E5 | −0.113 | 7.180 | −43.127 | 1886.230 | −1.040 | 0.036 | 1.135 |
| RM_RF | FF3_E1 | 21.227 | 9.945 | 0.817 | 1.131 | 9.637 | 19.999 | 33.914 |
| | FF3_E3 | 21.312 | 9.946 | 0.797 | 1.104 | 9.602 | 20.121 | 33.998 |
| | FF3_E5 | 21.580 | 9.963 | 0.790 | 1.088 | 9.795 | 20.385 | 34.262 |
| SMB | FF3_E1 | 9.779 | 5.289 | −0.349 | 0.304 | 3.154 | 9.926 | 16.440 |
| | FF3_E3 | 9.342 | 5.309 | −0.401 | 0.334 | 2.759 | 9.589 | 15.958 |
| | FF3_E5 | 8.951 | 5.315 | −0.420 | 0.308 | 2.221 | 9.218 | 15.536 |
| HML | FF3_E1 | 1.495 | 4.413 | −0.148 | 2.115 | −3.480 | 1.452 | 6.644 |
| | FF3_E3 | 1.342 | 4.497 | −0.118 | 1.526 | −3.756 | 1.250 | 6.764 |
| | FF3_E5 | 1.333 | 4.584 | −0.227 | 1.205 | −3.983 | 1.305 | 6.989 |
Note(s): This table summarizes the regression results for the alternative FF3 models (FF3_E1, FF3_E3, FF3_E5), each constructed by sequentially excluding top market-cap stocks. Panel A presents the distribution of estimated alphas and factor loadings (betas), while Panel B reports the distribution of their corresponding t-values. The analysis covers 1,940 stocks listed on the KOSPI and KOSDAQ markets in Korea between 2001 and 2022, each with a minimum of three years of trading history. Alpha values are annualized by multiplying daily estimates by 252
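The summary columns of Table 3 can be reproduced from a vector of stock-level estimates with a few lines of pandas. This is a sketch, not the author's code: it assumes that the reported kurtosis is excess kurtosis (normal distribution = 0), which the negative values for RM_RF suggest, and the input values are made up.

```python
import pandas as pd

def summarize(estimates: pd.Series) -> pd.Series:
    """Cross-sectional summary of one coefficient across stocks."""
    return pd.Series({
        "Mean": estimates.mean(),
        "Std. dev": estimates.std(),
        "Skew": estimates.skew(),
        "Kurt": estimates.kurt(),          # excess kurtosis, normal = 0 (assumption)
        "P10": estimates.quantile(0.10),
        "P50": estimates.quantile(0.50),
        "P90": estimates.quantile(0.90),
    })

# Hypothetical RM_RF betas for five stocks.
betas = pd.Series([0.5, 0.8, 1.0, 1.2, 1.5])
stats = summarize(betas)
print(stats)
```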
The average alpha is 0.002 for FF3_E1, 0.005 for FF3_E3 and 0.007 for FF3_E5, representing a notable shift from −0.008 in Table 1. In FF3_E5 the change amounts to 1.46%. Although a 1.46% change in annualized return may appear economically meaningful, the t-values for alpha in Panel B indicate that this increase stems from the adjusted benchmark rather than genuine outperformance. Excluding large caps alters the return profile of the reconstructed market portfolio; because the new benchmark’s return is lower, the measured returns of individual stocks rise by comparison. It is therefore more appropriate to interpret this as a relative shift than a reflection of true abnormal performance.
Turning to betas, the average beta on RM_RF is 1.017 in FF3_E1, 1.019 in FF3_E3 and 1.021 in FF3_E5, with a standard deviation of about 0.279, indicating stable dispersion despite excluding the largest stocks. The t-statistics remain consistently high, with a mean above 21 and a 10th percentile above 9.6 in all three models.
SMB’s average beta declines slightly from 0.773 in FF3_E1 to 0.765 in FF3_E3 and 0.760 in FF3_E5, while its standard deviation increases from 0.383 to 0.398 and 0.416, respectively. These changes may reflect shifts in the relative position of small-cap stocks after reweighting. In Panel B, SMB’s mean t-statistic declines modestly, from 9.779 in FF3_E1 to 8.951 in FF3_E5, while remaining well above conventional significance levels.
For HML, the average beta is 0.086 in FF3_E1, 0.072 in FF3_E3 and 0.076 in FF3_E5. The 10th percentile decreases from −0.332 to −0.374 and −0.413 across the models. All three specifications exhibit skewness below −1 and kurtosis above 2, indicating heavier tails than RM_RF and SMB. The t-statistics for HML remain broadly unchanged.
The adjusted R2 averages around 0.164 with a standard deviation of roughly 0.082 across all models, suggesting that excluding the largest stocks has a limited impact on explanatory power. Overall, while excluding large caps alters the distribution of alpha more than beta, the fundamental structure of the FF3 model remains intact.
As a robustness check, I also exclude stocks whose market capitalization exceeds 1%, 3%, or 5% of the total market in each year, as well as an extreme case where the smallest set of largest stocks cumulatively accounts for 50% of total market capitalization. The results, reported in Appendix A3 (Table A1), remain qualitatively consistent with the baseline, confirming that the main conclusions are not sensitive to the specific cutoff threshold.
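The two kinds of cutoff used in this robustness check can be sketched as follows. The function names and the market-capitalization figures are hypothetical, and the selection would be applied separately to each year's cross-section.

```python
import pandas as pd

def exclude_by_share(mcap: pd.Series, threshold: float) -> pd.Index:
    """Stocks whose share of total market cap exceeds `threshold`."""
    share = mcap / mcap.sum()
    return share[share > threshold].index

def exclude_until_cumulative(mcap: pd.Series, target: float = 0.5) -> pd.Index:
    """Smallest set of largest stocks whose cumulative share reaches `target`."""
    share = (mcap / mcap.sum()).sort_values(ascending=False)
    cum = share.cumsum()
    # Count the stocks still below the target, then add the one that crosses it.
    n = min(int((cum < target).sum()) + 1, len(share))
    return share.index[:n]

# Made-up market caps for one year (total = 1,000).
mcap = pd.Series({"A": 400, "B": 200, "C": 150, "D": 100, "E": 80, "F": 40, "G": 30})
over_5pct = exclude_by_share(mcap, 0.05)
half_market = exclude_until_cumulative(mcap, 0.5)
print(list(over_5pct), list(half_market))
```

In this toy example the 5% rule removes five stocks, whereas the cumulative 50% rule removes only the two largest, which shows that the two rules can differ considerably in how many names they drop.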
4.4 Analysis of estimation differences between the FF3 model and alternative models
Table 4 summarizes the differences in regression estimates between the FF3 model and the alternative models that exclude the top market-cap stocks (FF3_E1, FF3_E3 and FF3_E5) and reports statistics on the raw differences.
Table 4. Differences in estimated coefficients between FF3 and alternative models
| Variable | Model | Mean | Std. dev | Std. err |
|---|---|---|---|---|
| α | FF3_E1 | 9.747 | 11.735 | 0.266 |
| | FF3_E3 | 12.886 | 16.875 | 0.383 |
| | FF3_E5 | 14.556 | 25.345 | 0.575 |
| RM_RF | FF3_E1 | −0.626 | 21.777 | 0.494 |
| | FF3_E3 | 1.279 | 30.159 | 0.685 |
| | FF3_E5 | 3.702 | 34.258 | 0.778 |
| SMB | FF3_E1 | −46.656 | 41.977 | 0.953 |
| | FF3_E3 | −54.962 | 63.931 | 1.451 |
| | FF3_E5 | −59.817 | 83.266 | 1.890 |
| HML | FF3_E1 | −25.431 | 54.943 | 1.247 |
| | FF3_E3 | −39.469 | 79.702 | 1.810 |
| | FF3_E5 | −35.760 | 90.753 | 2.060 |
| Adjusted R2 | FF3_E1 | 1.397 | 6.939 | 0.158 |
| | FF3_E3 | 1.700 | 8.092 | 0.184 |
| | FF3_E5 | 1.857 | 9.334 | 0.212 |
Note(s): This table reports the differences in estimated coefficients between the baseline FF3 and the exclusion models (FF3_E1, FF3_E3, FF3_E5). Reported are the mean, standard deviation and standard error of the differences in alpha, factor loadings and adjusted R2. Values are scaled by 1,000, and alphas are annualized using 252 trading days.
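The standard-error column of Table 4 can be sanity-checked from the standard deviations. The sketch below assumes the standard error is the standard deviation divided by the square root of the number of stocks, with N = 1,940 taken from the note to Table 3; under that assumption it reproduces the reported alpha standard errors to three decimals.

```python
import math

N_STOCKS = 1940  # sample size from the note to Table 3 (assumed to apply to Table 4)

def standard_error(std_dev: float, n: int = N_STOCKS) -> float:
    """Standard error of a mean difference across n stocks."""
    return std_dev / math.sqrt(n)

def t_statistic(mean: float, std_err: float) -> float:
    """t-statistic for the null hypothesis of a zero mean difference."""
    return mean / std_err

# Alpha rows of Table 4 (x 1,000, annualized): model -> (mean, std. dev)
alpha_rows = {
    "FF3_E1": (9.747, 11.735),
    "FF3_E3": (12.886, 16.875),
    "FF3_E5": (14.556, 25.345),
}
for model, (mean, sd) in alpha_rows.items():
    se = standard_error(sd)
    print(model, round(se, 3), round(t_statistic(mean, se), 1))
```

The same two functions can be applied to the other rows, for example to see that the FF3_E1 change in the RM_RF loading (−0.626 with a standard error of 0.494) is small relative to its sampling error.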
First, the average difference in annualized alpha rises from approximately 9.747 (× 1,000) in FF3_E1 to 14.556 in FF3_E5, with standard errors of only 0.266 to 0.575 on the same scale. Statistically, this constitutes a significant change. As noted earlier, however, this difference reflects a relative increase due to the changed benchmark and should not be interpreted as genuine excess returns. This upward shift in alpha estimates can be more accurately understood as a leveling effect, which arises from a decline in the benchmark return after excluding large-cap stocks.
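I calculate the leveling effect, defined as the change in benchmark return induced by excluding large caps, as

$$
\text{Leveling effect} = 252 \times \sum_{i=1}^{3} \left( \bar{\beta}_i^{\mathrm{FF3}} \, \bar{F}_i^{\mathrm{FF3}} - \bar{\beta}_i^{\mathrm{Alt}} \, \bar{F}_i^{\mathrm{Alt}} \right)
$$

where:
$\bar{F}_i^{\mathrm{FF3}}$: Average daily return of factor i in the original FF3 model.
$\bar{F}_i^{\mathrm{Alt}}$: Average daily return of factor i in the alternative model after excluding large-cap stocks.
$\bar{\beta}_i^{\mathrm{FF3}}$: Average estimated factor loading on factor i in the original FF3 model.
$\bar{\beta}_i^{\mathrm{Alt}}$: Average estimated factor loading on factor i in the alternative model.
i = 1, 2, 3: Refers to the market factor (RM_RF), SMB and HML, respectively.
252: Number of trading days in a year, used to annualize the daily effect.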
Using the average beta estimates from Table 1 (for the original FF3) and Table 3 (for each alternative), together with the average factor returns from Table 2 Panel A, I obtain leveling effects of 0.86% for FF3_E1, 1.19% for FF3_E3 and 1.46% for FF3_E5. These values reflect the difference in benchmark returns across model specifications and help explain the relative increase in alphas under the alternative models.
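The arithmetic behind the leveling effect can be sketched in a few lines. The sketch assumes the effect is the annualized difference between the loading-weighted average factor returns of the original and the alternative model, as the definitions above suggest; the loadings and factor returns used here are hypothetical, since Tables 1 and 2 are not reproduced in this section.

```python
import numpy as np

def leveling_effect(beta_orig, f_orig, beta_alt, f_alt, days=252):
    """Annualized drop in the benchmark return when moving to the alternative model.

    beta_*: average loadings on (RM_RF, SMB, HML); f_*: average daily factor returns.
    """
    beta_orig, f_orig = np.asarray(beta_orig, float), np.asarray(f_orig, float)
    beta_alt, f_alt = np.asarray(beta_alt, float), np.asarray(f_alt, float)
    benchmark_orig = beta_orig @ f_orig   # daily benchmark return, original FF3
    benchmark_alt = beta_alt @ f_alt      # daily benchmark return, alternative
    return days * (benchmark_orig - benchmark_alt)

# Hypothetical inputs: only the market factor's average return changes.
effect = leveling_effect(
    beta_orig=[1.0, 0.8, 0.1], f_orig=[0.00040, 0.0002, 0.0001],
    beta_alt=[1.0, 0.8, 0.1], f_alt=[0.00035, 0.0002, 0.0001],
)
print(f"{effect:.4f}")  # annualized, in decimal form
```

A positive value means the alternative benchmark earns less than the original one, so the alphas measured against it rise by roughly the same amount.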
Meanwhile, the change in the market factor loading in FF3_E5 reaches 3.7 (× 1,000), which is not substantial given a baseline coefficient of about one. The standard error of 0.778 on the same scale produces a large t-statistic, but the change does not represent a practically meaningful shift.
SMB reacts more sensitively than the other factors. In FF3_E5 the average change is −59.8 (× 1,000). Even in FF3_E1 the change is already −46.7, suggesting that this factor is structurally influenced by the presence of the largest market-cap stocks.
HML shows a similar pattern, though with smaller magnitudes than SMB. In FF3_E3 and FF3_E5 the average changes are −39.5 and −35.8, respectively. While less pronounced than for SMB, these results indicate that the inclusion of large value stocks significantly affects this factor’s construction.
Adjusted R2 increases by 1.857 (× 1,000) in FF3_E5, or roughly 0.19 percentage points, indicating a slight improvement in explanatory power. Given the modest size of this gain, the FF3_E5 regression does not reflect a fundamental shift but rather a partial correction for distortions caused by large-cap concentration. Robustness tests under alternative exclusion cutoffs (Appendix A4, Table A2) yield qualitatively similar results, confirming that these observed shifts are not sensitive to the specific cutoff thresholds.
4.5 Correlations of estimates between the FF3 model and alternative models
Table 5 summarizes the correlations of regression estimates between the original FF3 model and the alternative models excluding the top market-cap stocks (FF3_E1, FF3_E3 and FF3_E5). For each estimate (alpha, RM_RF, SMB, HML and adjusted R2), the one-to-one correlation with the FF3 model is reported.
Table 5. Correlations between estimates from FF3 and alternative models
| Variable | FF3_E1 | FF3_E3 | FF3_E5 |
|---|---|---|---|
| α | 0.997 | 0.994 | 0.987 |
| RM_RF | 0.997 | 0.994 | 0.993 |
| SMB | 0.994 | 0.988 | 0.984 |
| HML | 0.989 | 0.977 | 0.975 |
| Adj. R2 | 0.996 | 0.995 | 0.994 |
Note(s): This table reports the pairwise correlations between the coefficients estimated from the original FF3 model and those from the alternative models (FF3_E1, FF3_E3, FF3_E5), which exclude the top market-cap stocks. The variables include α, RM_RF, SMB, HML and adjusted R2. Correlations are calculated based on stock-level regression results. Higher correlation values indicate greater structural similarity between the original and alternative models
All correlations in Table 5 exceed 0.97, indicating that the FF3 factor structure is largely preserved in the alternative models. In particular, alpha, RM_RF and SMB maintain correlations above 0.98 even in FF3_E5, confirming that excluding the top stocks has minimal impact on the estimation of these coefficients.
However, the HML factor shows a relatively larger decline in correlation, dropping from 0.989 in FF3_E1 to 0.977 in FF3_E3 and 0.975 in FF3_E5. This result is consistent with the earlier analyses of HML skewness, kurtosis and coefficient differences, implying that HML is structurally more sensitive.
The correlation for adjusted R2 remains between 0.994 and 0.996, indicating that explanatory power is virtually unchanged regardless of whether large-cap stocks are excluded.
In conclusion, the coefficients estimated under the original FF3 model remain highly correlated with those in the alternative models, supporting the robustness of the overall factor structure and of individual stock regression results to the presence of large-cap stocks.
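A one-to-one correlation of this kind can be computed column by column across stocks. The sketch below uses hypothetical estimates for four stocks and assumes a Pearson correlation; it also illustrates why a uniform upward shift in alpha, such as the benchmark-driven shift discussed in Section 4.4, leaves the correlation untouched.

```python
import pandas as pd

def estimate_correlations(base: pd.DataFrame, alt: pd.DataFrame) -> pd.Series:
    """Pearson correlation of each estimate between two models, across stocks."""
    common = base.index.intersection(alt.index)   # stocks present in both models
    return base.loc[common].corrwith(alt.loc[common])

# Made-up stock-level estimates under FF3 and under an exclusion model.
base = pd.DataFrame(
    {"alpha": [-0.02, 0.00, 0.01, 0.03], "RM_RF": [0.90, 1.00, 1.10, 1.30]},
    index=["s1", "s2", "s3", "s4"],
)
alt = pd.DataFrame(
    {"alpha": base["alpha"] + 0.01,               # pure level shift in alpha
     "RM_RF": [0.92, 0.99, 1.12, 1.28]},
    index=base.index,
)
corr = estimate_correlations(base, alt)
print(corr)
```

High correlations therefore speak to the stability of the cross-sectional ordering of estimates, not to the stability of their level, which is why the alpha shift has to be assessed separately.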
4.6 Relationship between changes in regression coefficients and explanatory power adjusted R2
Table 6 presents an analysis of how the changes between the alpha and beta estimates of the FF3 model and the corresponding coefficients in the alternative models relate to the increase in explanatory power (adjusted R2) of the regression model. Panel A shows regression results that use the sum of squared changes in each coefficient, ∑(Δβ)², as the sole explanatory variable. Panel B shows regression results that use the difference in sums of squared coefficients as the explanatory variable [3]. The details of the regression specification are as follows:
Table 6. Relationship between changes in regression coefficients and explanatory power (Adj. R2)
| Base model | Compared model | Variable | Estimate | t-value | R2 |
|---|---|---|---|---|---|
| Panel A. Effect of the sum of squared changes in regression coefficients | | | | | |
| FF3 | FF3_E1 | Sum of squared coefficient changes | 0.0207 | 1.94 | 0.0014 |
| FF3 | FF3_E3 | Sum of squared coefficient changes | 0.0350 | 5.6 | 0.0154 |
| FF3 | FF3_E5 | Sum of squared coefficient changes | 0.0572 | 9.83 | 0.047 |
| Panel B. Effect of the difference in the sum of squared regression coefficients | | | | | |
| FF3 | FF3_E1 | Difference in sums of squared coefficients | 0.00983 | 6.52 | 0.021 |
| FF3 | FF3_E3 | Difference in sums of squared coefficients | 0.00557 | 4.05 | 0.0079 |
| FF3 | FF3_E5 | Difference in sums of squared coefficients | 0.00577 | 4.35 | 0.0092 |
Note(s): This table presents regression analyses examining how the relative magnitude (vector sum) of changes in α and β coefficients, estimated from the original FF3 model and its alternatives, affects the model’s explanatory power (Adj. R2). Panel A shows how the sum of squared coefficient changes, $\sum_j (\Delta\beta_j)^2$, relates to the increase in Adj. R2. Panel B analyzes the impact of the difference in total squared coefficient magnitudes, $\Delta\left(\sum_j \beta_j^2\right)$, on changes in Adj. R2. Here, j refers to the betas on RM_RF, SMB and HML, respectively.
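Regression model
• Panel A: $\Delta \text{Adj. } R^2_i = a + b \sum_j (\Delta\beta_{j,i})^2 + \varepsilon_i$
• Panel B: $\Delta \text{Adj. } R^2_i = a + b \, \Delta\left(\sum_j \beta_{j,i}^2\right) + \varepsilon_i$
Here Δ denotes the alternative-model value minus the FF3 value for stock i, consistent with the differences reported in Table 4.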
According to the results in Panel A, the larger the coefficient changes, the greater the increase in adjusted R2. For FF3_E5 the coefficient is 0.0572, the t-value is 9.83 and the R2 is 0.047, indicating that the increase in explanatory power in the alternative models is structurally linked to coefficient changes.
Panel B relates changes in explanatory power to the difference in sums of squared coefficients rather than to the squared coefficient changes themselves. In FF3_E1 the coefficient is 0.00983 with a t-value of 6.52, well above the Panel A t-value of 1.94. In FF3_E5 the coefficient is 0.00577 with a t-value of 4.35 and remains statistically significant. This suggests that changes in the entire coefficient structure, as measured by vector magnitude, are also reflected in explanatory power. Comparing Panels A and B, the R2 is higher in Panel B for FF3_E1 (0.021 versus 0.0014), whereas for FF3_E3 and FF3_E5 it is higher in Panel A.
These results support the idea that structural changes in the estimates are partly reflected in changes in adjusted R2, but the magnitude of these changes remains limited and the FF3 factor structure is basically preserved in the alternative models. Importantly, the regression slopes in Table 6 provide a quantitative measure of how much of the variation in ΔAdj.R2 can be explained by coefficient shifts [4]. This extends the analysis beyond merely confirming the presence of a mechanical relationship, providing a quantitative assessment of its economic significance.
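The two regressions behind Table 6 can be sketched as a univariate OLS across stocks. The sketch assumes an intercept, classical standard errors and differences defined as alternative minus FF3; none of these details is confirmed by the text, and the inputs are simulated rather than taken from the study.

```python
import numpy as np

def delta_r2_regression(base_betas, alt_betas, base_adj_r2, alt_adj_r2,
                        use_difference=False):
    """Regress the change in adjusted R2 on a measure of coefficient change.

    base_betas, alt_betas: arrays of shape (N, 3) with loadings per stock.
    use_difference=False -> sum of squared changes (Panel A style);
    use_difference=True  -> difference in sums of squares (Panel B style).
    """
    base_betas = np.asarray(base_betas, float)
    alt_betas = np.asarray(alt_betas, float)
    if use_difference:
        x = (alt_betas ** 2).sum(axis=1) - (base_betas ** 2).sum(axis=1)
    else:
        x = ((alt_betas - base_betas) ** 2).sum(axis=1)
    y = np.asarray(alt_adj_r2, float) - np.asarray(base_adj_r2, float)
    X = np.column_stack([np.ones_like(x), x])      # intercept is an assumption
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - 2)
    se_slope = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    return {"slope": coef[1], "t_value": coef[1] / se_slope, "r2": r2}

# Simulated cross-section with a known slope of 0.03 (illustrative only).
rng = np.random.default_rng(1)
n = 500
base_betas = rng.normal([1.0, 0.8, 0.1], 0.3, size=(n, 3))
alt_betas = base_betas + rng.normal(0.0, 0.05, size=(n, 3))
x_true = ((alt_betas - base_betas) ** 2).sum(axis=1)
base_r2 = rng.uniform(0.05, 0.30, size=n)
alt_r2 = base_r2 + 0.03 * x_true + rng.normal(0.0, 1e-4, size=n)
panel_a = delta_r2_regression(base_betas, alt_betas, base_r2, alt_r2)
print(panel_a)
```

The slope is read as the change in adjusted R2 per unit of squared coefficient change, and the regression R2 as the share of the cross-sectional variation in ΔAdj. R2 that coefficient shifts account for.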
5. Discussion
This study examined the potential structural distortion of the FF3 model (Fama and French, 1993) caused by the largest market-cap stocks and analyzed how excluding those large caps affects coefficient estimates and explanatory power. The empirical results confirmed that the FF3 structure remains generally robust, maintaining very high levels of factor-structure similarity and stability of explanatory power in the alternative models. However, the shift in estimated alpha levels resulting from the changed benchmark after excluding large caps deserves careful attention.
These findings carry several theoretical and practical implications. First, from an efficient-frontier perspective, the fact that the FF3 model preserves a similar factor structure after excluding large caps, apart from relative shifts in alpha, suggests that common market risk factors are not driven by a few stocks but are evenly distributed across portfolios. This indicates that the FF3 model effectively captures systematic risk across the entire market.
Second, these results offer practical guidance on handling newly listed large-cap IPOs. Large IPOs often lack complete or reliable accounting data needed for book-to-market ratio calculations, making immediate inclusion in the FF3 model challenging. The present study shows that even if mega-caps are temporarily excluded, the structural stability of the FF3 remains intact. This implies that practitioners may not need to treat newly listed large caps separately to preserve the validity of the FF3 betas.
Third, even with high-noise daily returns, FF3 shows high explanatory power and coefficient stability that are comparable to the exclusion specifications. This indicates that FF3 performs reliably at a high frequency, which supports its continued use as a benchmarking tool in empirical finance.
Finally, given the small incremental gains observed in this daily-frequency setting and the mixed evidence on momentum and FF5 in Korea, it may be more appropriate, for the specific question studied here, to retain FF3 as a parsimonious baseline and to focus on addressing its known limitations. A comprehensive evaluation of FF5 and FF6 for Korea is left for future research.
Notably, the robustness of the FF3 structure, even after excluding the top five stocks (and, in robustness checks, even after excluding stocks accounting for 50% of total market capitalization), can be explained by the construction of its factors. Since SMB returns are primarily generated by small firms (Banz, 1981; Fama and French, 1993; Bartram et al., 2021), the exclusion of the largest stocks has little direct impact on this factor. For HML, Samsung Electronics is classified as a growth stock for roughly half of the sample period and as a mid-B/M stock for the remainder, while the other excluded large caps are on average in the mid-B/M category; under the factor construction method, these characteristics mitigate the impact of their exclusion on HML. Regarding RM_RF, assuming it proxies the true underlying factor F, even if a proportion w of the market (exceeding 50%) is excluded, the remaining 1−w retains sufficient covariance structure to approximate F. In this sense, the excluded stocks are not unique sources of F but rather share the same factor exposure pattern as the remaining universe. This structural property, combined with proportional reweighting of the remaining constituents, helps preserve high correlations in RM_RF betas across specifications.
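The RM_RF argument can be illustrated with a toy simulation. It assumes a single common factor F to which every stock has unit exposure, hypothetical weights in which the top five stocks hold 40% of the market, and made-up volatilities; it is a stylized illustration, not a replication of the Korean data.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, TOP = 2000, 200, 5

# Hypothetical value weights: top five stocks hold 40% of the market in total.
weights = np.concatenate([
    np.full(TOP, 0.40 / TOP),
    np.full(N - TOP, 0.60 / (N - TOP)),
])

factor = rng.normal(0.0, 0.01, size=T)        # common factor F (daily)
idio = rng.normal(0.0, 0.02, size=(T, N))     # stock-specific noise
returns = factor[:, None] + idio              # unit exposure to F for every stock

market_full = returns @ weights               # value-weighted market, all stocks
w_rest = weights[TOP:] / weights[TOP:].sum()  # proportional reweighting
market_ex = returns[:, TOP:] @ w_rest         # market without the top five

corr_markets = np.corrcoef(market_full, market_ex)[0, 1]
corr_factor = np.corrcoef(market_ex, factor)[0, 1]
print(f"full vs ex-top-5 market: {corr_markets:.3f}")
print(f"ex-top-5 market vs F:    {corr_factor:.3f}")
```

Under these assumptions the reweighted remainder still tracks F closely, because the excluded stocks contribute idiosyncratic noise rather than a separate source of common variation.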
6. Conclusion
This study investigated the potential distortion of coefficient estimates in the FF3 model (Fama and French, 1993) caused by the largest market-cap stocks, and analyzed structural changes in regression coefficients and explanatory power by constructing alternative models that exclude the top one, three and five stocks (FF3_E1, FF3_E3, FF3_E5).
In the Korean equity market, the top five stocks account for approximately 25–45% of total market capitalization, raising concerns that a small number of firms might drive the market structure and undermine the stability of empirical models. Despite this, the findings demonstrate that excluding these large-cap stocks does not compromise the fundamental robustness of the FF3 structure.
Empirical results show that one-to-one correlations of factors between the original FF3 model and each alternative model exceed 0.90 in all cases. Even in FF3_E5, the correlations of alpha and of the betas on RM_RF, SMB and HML with their FF3 counterparts remain above 0.97, indicating that coefficient structures remain stable regardless of large-cap inclusion. However, caution is warranted when interpreting shifts in alpha levels.
In terms of explanatory power (adjusted R2), the models excluding top stocks exhibit slight but statistically significant increases, suggesting that coefficient estimates may depend in part on specific stocks, although the economic magnitude of these gains is minor. The positive relationship between increases in adjusted R2 and both the sum of squared coefficient changes and the difference in sums of squared coefficients confirms that, while the FF3 structure is generally robust, it does exhibit slight structural sensitivity.
In conclusion, even after excluding the largest market-cap stocks, the FF3 model retains its core empirical validity in terms of both coefficient structure and explanatory power. Despite potential large-cap concentration effects, the FF3 framework remains a useful tool for empirical analysis in finance, provided that users remain mindful of the impact on estimated alpha.
While the empirical results demonstrate the robustness of the FF3 model in the highly concentrated Korean equity market, this study has certain limitations. The analysis is confined to a single market, and the structural stability documented here may not necessarily generalize to markets with different institutional settings, trading behaviors, or concentration patterns. Future research could extend the framework to cross-country comparisons, particularly in other emerging markets or developed markets with high large-cap concentration. Examples include Finland, where Nokia historically dominated market capitalization; Taiwan, where TSMC currently accounts for a substantial share of the equity market; and France, where luxury conglomerate LVMH carries significant market weight. Investigating such cases would assess the external validity of the findings and help determine whether the observed resilience of the FF3 factor structure is a Korea-specific phenomenon or a more universal property of value-weighted factor models.
Notes
1. IMF, Special Data Dissemination Standard: Korea – Stock Price Index, https://dsbb.imf.org/sdds/dqaf-base/country/KOR/category/SPI00
2. See Table 2 for summary statistics of factor returns.
3. The relationship between the sum of squared coefficient changes ∑(Δβ)² and the difference in sums of squared coefficients is explained separately in Appendix A1. Additionally, considering the units of R2, taking the square root of the sums of squares is inappropriate.
4. See Appendix A2 for the relationship between the sum of squared coefficients and explanatory power.
The supplementary material for this article can be found online.

