This paper aims to explore the application of the Maximum Entropy (MaxEnt) principle in geotechnical engineering for uncertainty modeling and reliability analysis. It focuses on reviewing robust methods for estimating quantile functions and probability distributions of geotechnical variables using fractional moments and probability-weighted moments, addressing challenges such as data scarcity and overfitting.
The study uses a comprehensive review of MaxEnt-based methods, numerical simulations and case studies. Novel techniques, such as fractional probability-weighted moments and partial maximum entropy, are introduced and validated through real-world applications such as slope stability and soil property characterization. Computational tools and programming languages facilitating MaxEnt implementation are also discussed.
The proposed MaxEnt-based methods demonstrate superior accuracy and efficiency in estimating quantile functions and probability distributions, particularly in tail regions. Case studies show significant improvements over traditional empirical distributions, offering better fits to sample data while avoiding overfitting or underfitting uncertainties in geotechnical applications.
This review paper provides a comprehensive synthesis of existing literature on the application of the Maximum Entropy (MaxEnt) principle in geotechnical engineering, filling a gap in consolidating knowledge on its theoretical foundations, advancements and practical implementations. By systematically reviewing recent developments, it offers a detailed guide for researchers and practitioners, including an extensive list of computational tools and programming platforms (e.g. Python, R, MATLAB, Julia, STAN, Gurobi) to facilitate MaxEnt implementation. Additionally, the inclusion of a real-world case study demonstrates the practical utility of MaxEnt in modeling uncertainty, serving as a benchmark for future research and applications in geotechnical projects.
1. Introduction
The field of geotechnical engineering, particularly in soil mechanics, has witnessed a growing interest in the reliability analysis as an extension of deterministic analysis (Zhang et al., 2023; Christian, 2004). This is because of the inherent variability of soils, which makes them amenable to comprehensive probability, statistics and reliability treatment (Jiang et al., 2022; Kalantari et al., 2023). The initial and fundamental stages for this purpose are the modeling and quantification of uncertainties in random variables, as they serve as the basis for subsequent evaluations of reliability and risk (Deng and Singh, 2023; Rana et al., 2023; Look, 2022). The modeling of uncertainties in data commonly involves the utilization of probability curves, such as the probability density function (PDF), cumulative distribution function (CDF) and quantile function (QF). However, a significant challenge arises when the specific distribution function and its parameters are unknown and not readily available, necessitating estimation based on the available data (Zheng et al., 2021; Ching et al., 2021). In cases where the distribution function governing the data is known, various approaches can be used to estimate its parameters. However, in the majority of instances, the shape and type of the distribution function remain unknown (Keaton and Ponnaboyina, 2014; Zhang et al., 2014; Gong et al., 2019). The selection of an appropriate distribution function is a critical decision that profoundly impacts the output of the reliability model. One crucial factor to consider when selecting the distribution function is the trade-off between bias and variance errors. If the chosen model is simplistic, characterized by only a few free parameters, it may be incapable of aligning with the data, resulting in a substantial bias error. Conversely, a highly complex model can give rise to a high variance error (Krimil et al., 2023; Templeman, 1992).
The determination of the distribution function is theoretically based on the probabilistic characteristics of variables, specifically the statistical moments of the data (Uzielli, 2008; Bozorgzadeh and Bathurst, 2022; Wu et al., 2020). The inclusion of more statistical moments in this process yields more information about the distribution function, particularly regarding the behavior of its tail. However, this also introduces higher variance in the results (Lanzante et al., 2019; Loucks and van Beek, 2017). To strike a balance between bias and variance, it has been suggested in the literature that using the first four moments generally provides a suitable compromise (Fenton and Griffiths, 2008; Mehta et al., 2019). Various models have been proposed to extract information about the probability distribution function from these four statistical moments, such as the Pearson family of distributions (Singh, 1998; Pearson, 1895), the saddle point approximation (Gatto, 2019; Demaso et al., 2011; Daniels, 1954), the Gram–Charlier and Edgeworth expansions (Wallace, 1958; Lin and Zhang, 2022) and the Hermite model (Winterstein, 1988; Xu and Ding, 2021; Tong et al., 2022). Efforts have been made to address the inherent issues associated with these methods. For example, Low (2013) introduced the Shifted Generalized Lognormal distribution (SGLD), which covers a wide range of skewness–kurtosis values for unimodal densities and includes several theoretical distributions as special or limiting cases. Subsequently, SGLD has served as a foundational model for the development of new methods in various studies. Another significant contribution in this field is the four-moment normal transformation proposed by Zhao et al. (2018), which addresses the problem of handling basic random variables whose probability distributions are unknown.
Instead of excluding variables with unknown distributions, the four-moment normal transformation utilized statistical moments (mean, standard deviation, skewness and kurtosis) to transform these variables. The authors derived a comprehensive expression of the four-moment normal transformation, considering various cases with different combinations of skewness and kurtosis. Importantly, they confirmed the monotonicity of each case, ensuring consistent applicability. This work achieved the first successful complete monotonic expression of the four-moment normal transformation, providing a valuable tool for structural reliability practitioners.
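The four statistical moments referred to above can be computed directly from a sample. The following is a minimal sketch, assuming population (biased) estimators and raw (non-excess) kurtosis; the function name `first_four_moments` and the cohesion values are hypothetical and serve only as an illustration.

```python
import numpy as np

def first_four_moments(samples):
    """Return mean, standard deviation, skewness and kurtosis of a 1-D sample.

    Population (biased) estimators are used; kurtosis is the raw value,
    so a Gaussian sample gives a kurtosis close to 3.
    """
    x = np.asarray(samples, dtype=float)
    mean = x.mean()
    centered = x - mean
    variance = np.mean(centered**2)
    std = np.sqrt(variance)
    skewness = np.mean(centered**3) / std**3
    kurtosis = np.mean(centered**4) / variance**2
    return mean, std, skewness, kurtosis

# Hypothetical cohesion measurements in kPa (illustrative values only).
cohesion = [18.2, 21.5, 19.8, 25.1, 17.4, 22.9, 30.3, 20.6]
mean, std, skew, kurt = first_four_moments(cohesion)
print(f"mean={mean:.2f} std={std:.2f} skewness={skew:.2f} kurtosis={kurt:.2f}")
```

Bias-corrected estimators may be preferable for small geotechnical samples; the biased forms are used here only because they are the shortest to state.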
As an alternative solution, the principle of maximum entropy (MaxEnt) has been widely used in this field (Khosravi Tanak et al., 2024; Chen et al., 2021). Any probability distribution function has an associated entropy (Wang, 2008; Abe, 2003). Consequently, the optimal choice for the distribution function is the one that yields the highest entropy value (Levine, 1998; Cover and Thomas, 2005). This gives rise to an optimization problem subject to constraints, which require the moments of the candidate distribution to match the moments estimated from the available data. While this method yields a distribution with minimal bias, it also has drawbacks that impede practical implementation (Handley and Millea, 2019). To estimate the tail of the MaxEnt distribution function, a large number of moments must be used, yet only a limited amount of data is usually available (Alibrandi and Mosalam, 2018; Skilling, 1991). As the number of moments (M) increases, the MaxEnt moment problem becomes ill-conditioned, resulting in an oscillating behavior of the tail (Peterson et al., 2013). Furthermore, when four moments are used simultaneously in the MaxEnt optimization constraint, the resulting distribution can only exhibit a narrower-than-Gaussian tail behavior (Xiaodong, 2018). These limitations have gradually reduced the appeal and the use of MaxEnt in engineering applications (Zhang and Pandey, 2013).
In recent years, there has been notable advancement in addressing the limitations of MaxEnt in geotechnical engineering projects (Wang and Goh, 2022). These advancements encompass improvements in specific statistical aspects, such as the incorporation of fractional moments and kernel density representation (Novi Inverardi and Tagliani, 2003; Xie et al., 2021), as well as the utilization of artificial intelligence techniques such as intelligent optimization approaches to allow good modeling of the distribution tails from limited data, with low bias and variance errors. Consequently, academia has witnessed an increase in the application of entropy principles in geotechnical engineering projects. To monitor the progress in this research domain, this state-of-the-art review paper elucidates the recent developments in the formulation of entropy-based models. Furthermore, it argues how these advancements could substantially contribute to the enhanced utilization of entropy in uncertainty modeling, reliability analysis and risk management. Additionally, the paper delves into various applications of the MaxEnt principle in geotechnical engineering, particularly in the context of uncertainty modeling. To cover these objectives, the remainder of the paper is structured as follows. Section 2 provides a literature review on the MaxEnt method, encompassing both univariate and multivariate distribution functions. Section 3 presents a comprehensive review of the diverse applications of these methods in various geotechnical projects. Section 4 enumerates various tools and computing platforms that facilitate the utilization of the MaxEnt principle, with a particular emphasis placed on the most commonly used ones. Finally, Section 5 concludes the paper by offering a concise summary and engaging in a discussion of the findings.
2. Formulation of the maximum entropy distribution
2.1 Univariate maximum entropy distribution
The maximum entropy (MaxEnt) principle, proposed by Jaynes (1957), offers a method to determine the least biased probability distribution consistent with the available information. According to this principle, among all distributions that satisfy the known constraints, the one with maximum entropy is the preferred representation. Jaynes also introduced a non-parametric probability density function (PDF) model. However, despite its initial appeal, the practical implementation of MaxEnt in engineering applications has faced several limitations, as follows:
Limited usability of sample moments: To accurately characterize the tail of the MaxEnt distribution, a large number of moments (M) are required. Unfortunately, only a limited number of sample moments can be used. This limitation arises due to the need for a vast amount of data to accurately represent the distribution’s tail behavior.
Ill-conditioned MaxEnt moment problem: As the number of moments (M) increases, the MaxEnt moment problem becomes ill-conditioned (Li and Zhang, 2011). This means that finding a solution becomes increasingly challenging. The ill-conditioned nature of the problem hampers the practical implementation of MaxEnt, as it introduces uncertainties and inaccuracies in the determination of the probability distribution.
Oscillatory behavior in the tail: The tail of the MaxEnt distribution exhibits an oscillatory behavior as the number of moments increases. This behavior further complicates the accurate representation of the distribution, making it less reliable for engineering applications. The oscillatory nature of the tail introduces additional challenges in capturing the true characteristics of the distribution.
Limited representation of tail behaviors: When considering only four moments, the MaxEnt distribution can only represent narrower-than-Gaussian tail behavior (Winterstein and Kashef, 2000). This limitation restricts the versatility of MaxEnt in capturing a wide range of tail behaviors. In engineering applications, where diverse tail behaviors may be encountered, this limitation reduces the applicability of MaxEnt.
The limitations discussed above have gradually diminished the appeal of the maximum entropy (MaxEnt) principle in engineering applications. The need for a large number of moments, the ill-conditioning of the moment problem, the oscillatory behavior in the tail and the limited representation of tail behaviors have hindered the practical implementation of MaxEnt. As a result, alternative methods and models have gained more prominence in engineering applications.
In recent years, the idea of using fractional moments has renewed interest in MaxEnt. Several studies have addressed how to implement the MaxEnt distribution using fractional moments and have evaluated it using dimensionality reduction techniques (Zhang and Pandey, 2013; Xu, 2016). Fractional moments have also been used in other studies along with kernel density representation (Alibrandi and Mosalam, 2018). The main argument for preferring fractional moments is that they are subject to less sample variability than integer moments. The fractional orders do not need to be fixed in advance but can be optimized, and with a small number of moments (for example, two), the probability distribution can be accurately determined, including its tail. However, further problems have also been reported. For example, fractional moments are not generally real-valued for random variables that can take negative values, so the approach is restricted to positive variables. Another problem relates to the optimization, which is difficult to solve efficiently because the objective function is non-continuous and non-convex. Zhang and Pandey (2013) as well as Xu (2016) used the simplex search algorithm for optimization; however, the success of the search is highly dependent on the initial guess.
The genesis of the PDF model for the classical MaxEnt distribution can be traced back to Shannon’s entropy theory and Jaynes’ MaxEnt principles (Kapur and Kesavan, 1992). These principles define the entropy measure as follows:
$$H[p_X] = -\int p_X(x)\,\ln p_X(x)\,dx \tag{1}$$
where the domain of integration is commonly taken to be the real domain (−∞, ∞). It is now assumed that the density is normalized, $\int p_X(x)\,dx = 1$, and that the distribution being analyzed is subject to a fixed number, M, of integer moments, i.e.:
$$\int x^{\alpha_m}\,p_X(x)\,dx = \mu_m \tag{2}$$
where m = 1, 2, …, M, αm = m, μm = E[X^αm] represents the raw statistical moment of order αm, and E[.] denotes the expectation operator. In accordance with the MaxEnt principle, the optimal choice among all distributions capable of satisfying the moment constraints is the distribution with maximum entropy. By using the Euler–Lagrange equation, the MaxEnt probability density function, pX(x), can be derived in the following manner:
$$p_X(x) = \exp\left(-\nu_0 - \sum_{m=1}^{M} \nu_m x^{\alpha_m}\right) \tag{3}$$
where ν = [ν1, ν2, …, νM] represents the Lagrange multipliers, and ν0 denotes the normalization constant:
$$\nu_0 = \ln\left[\int \exp\left(-\sum_{m=1}^{M} \nu_m x^{\alpha_m}\right) dx\right] \tag{4}$$
Previous research has demonstrated that the constrained optimization problem can be reformulated as an unconstrained optimization problem, with the objective function to be minimized as follows (Kapur and Kesavan, 1992):
$$Q(\nu) = \nu_0 + \sum_{m=1}^{M} \nu_m \mu_m \tag{5}$$
It is worth mentioning that the minimization of the Kullback–Leibler distance, commonly known as Kullback’s Minimum Cross-Entropy principle (MinxEnt), leads to the same equation as equation (5).
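The unconstrained formulation of equation (5) can be illustrated numerically. The sketch below assumes the common exponential form p(x) = exp(−ν0 − Σ νm x^m) with integer orders, replaces the integral by a Riemann sum on a truncated uniform grid, and uses a quasi-Newton (BFGS) search; the function name, the grid limits and the choice of optimizer are assumptions made for illustration, not part of the reviewed methods.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def fit_maxent_integer_moments(mu, grid):
    """Minimise Q(nu) = nu0(nu) + sum_m nu_m * mu_m for integer moments.

    mu   : target raw moments [mu_1, ..., mu_M]
    grid : uniformly spaced points standing in for the integration domain
    """
    mu = np.asarray(mu, dtype=float)
    dx = grid[1] - grid[0]
    # Column m-1 holds x**m, the moment function of order m.
    powers = np.stack([grid**m for m in range(1, mu.size + 1)], axis=1)

    def objective(nu):
        log_kernel = -powers @ nu
        # nu0 = ln of the integral of exp(-sum nu_m x^m), by a Riemann sum.
        nu0 = logsumexp(log_kernel) + np.log(dx)
        density = np.exp(log_kernel - nu0)
        # Gradient of Q: target moments minus moments of the current model.
        gradient = mu - (powers * density[:, None]).sum(axis=0) * dx
        return nu0 + nu @ mu, gradient

    result = minimize(objective, np.zeros(mu.size), jac=True, method="BFGS")
    nu = result.x
    nu0 = logsumexp(-powers @ nu) + np.log(dx)
    return nu0, nu

# Mean 0 and second raw moment 1: the MaxEnt solution is the standard normal,
# for which nu = [0, 0.5] and nu0 = 0.5 * ln(2 * pi).
grid = np.linspace(-10.0, 10.0, 4001)
nu0, nu = fit_maxent_integer_moments([0.0, 1.0], grid)
print(nu0, nu)
```

With only the first two moments the problem is well conditioned; the ill-conditioning discussed earlier appears as M grows, when high powers of x dominate the grid sums.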
There exist conceptual distinctions between MaxEnt and MinxEnt. MaxEnt operates by maximizing Shannon’s entropy, which fundamentally quantifies uncertainty. The MaxEnt principle aims to maximize uncertainty regarding the undisclosed information, while using all the provided information. Conversely, the MinxEnt principle seeks to minimize the probabilistic disparity between the predicted distribution and the “true” distribution (Kapur and Kesavan, 1992). The Kullback–Leibler distance between the predicted distribution, pM(x), and the true distribution, p(x), can be expressed as:
$$D(p, p_M) = \int p(x)\,\ln\frac{p(x)}{p_M(x)}\,dx \tag{6}$$
As the Kullback–Leibler distance decreases, the approximating density function pM(x) approaches the true distribution p(x), indicating a closer resemblance between the two. Substituting the MaxEnt form of pM(x) gives:
$$D(p, p_M) = \int p(x)\,\ln p(x)\,dx - \int p(x)\,\ln p_M(x)\,dx = -H(p) + \nu_0 + \sum_{m=1}^{M} \nu_m \mu_m \tag{7}$$
The entropy of the true probability density function p(x), denoted as H(p), is a constant. As this constant does not depend on the variables α or ν, minimizing the Kullback–Leibler distance is equivalent to minimizing equation (5). In practical scenarios, the population moments are not known and, therefore, they are estimated using sample moments, that is:
$$\mu_m \approx \frac{1}{N}\sum_{n=1}^{N} X_n^{\alpha_m} \tag{8}$$
where Xn denotes the data points and N represents the sample size. In the field of structural reliability analysis, it is customary to employ four integer moments (M = 4), as documented by Li and Zhang (2011). For a given number of moments M, the PDF model specified in equation (3) encompasses unknown parameters ν = [ν1, ν2, …, νM]. These parameters can be determined by solving a linear system of equations, as proposed by Erdogmus et al. (2004).
where ν0 is defined in equation (4). By using integration by parts and assuming that pM(x) satisfies $\lim_{|x|\to\infty} x^{i+1} p_M(x) = 0$, it can be deduced that:
$$\mu_i = \sum_{j=1}^{M} \beta_{ij}\,\nu_j, \qquad i = 1, 2, \ldots, M \tag{10}$$
where βij is defined as $\beta_{ij} = \frac{j}{i+1}\,E_{p_M}\left[X^{i+j}\right]$, and E_pM represents the expectation operator over pM(x). Consequently, the parameter vector ν can be obtained by:
$$\nu = \beta^{-1}\mu \tag{11}$$
where the moment constraints represented by μ = [μ1, μ2, …, μM]^T in equation (2) are approximated using sample data, as indicated in equation (8). Moreover, the parameter β is estimated by employing the sample mean derived from the available data:
$$\hat{\beta}_{ij} = \frac{j}{i+1}\,\frac{1}{N}\sum_{n=1}^{N} X_n^{i+j} \tag{12}$$
The hybrid approach, as proposed by Chen et al. (2010), involves solving for ν using Newton’s method, with the linear solution serving as an initial estimate.
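The linear estimate used as a starting point can be sketched as follows. This is a minimal illustration, assuming integer orders αm = m, a density of the form exp(−ν0 − Σ νm x^m) and coefficients βij = j/(i + 1)·E[X^(i+j)] obtained from integration by parts; the function name is hypothetical and the subsequent Newton refinement is not shown.

```python
import numpy as np

def linear_maxent_estimate(samples, n_moments):
    """Initial Lagrange multipliers from the linear system mu = beta @ nu.

    Assumes integer orders alpha_m = m and a density of the form
    exp(-nu0 - sum_m nu_m x^m) whose tails decay fast enough for the
    boundary term of the integration by parts to vanish.
    """
    x = np.asarray(samples, dtype=float)
    orders = np.arange(1, n_moments + 1)
    mu = np.array([np.mean(x**i) for i in orders])
    beta = np.empty((n_moments, n_moments))
    for row, i in enumerate(orders):
        for col, j in enumerate(orders):
            # beta_ij = j / (i + 1) * E[X^(i + j)], with E[.] replaced by a sample mean.
            beta[row, col] = j / (i + 1) * np.mean(x ** (i + j))
    return np.linalg.solve(beta, mu)

rng = np.random.default_rng(0)
sample = rng.standard_normal(200_000)
nu_start = linear_maxent_estimate(sample, n_moments=2)
print(nu_start)  # close to [0, 0.5], the multipliers of a standard normal
```

Because the matrix involves sample moments up to order 2M, the estimate becomes noisy for small samples or large M, which is one reason it is used only as an initial guess.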
The classical MaxEnt distribution, which uses integer moments, is known to have several drawbacks. One of the main limitations is that four moments may not provide enough information for many applications. To address this issue, various researchers (Alibrandi and Mosalam, 2018; Zhang and Pandey, 2013; Novi Inverardi and Tagliani, 2003; Xu, 2016) have explored the use of fractional moments, leading to significant changes in the optimization problem. Firstly, the integration domain is now restricted to the positive domain (0, ∞) to ensure that the fractional moments obtained are real and meaningful. Secondly, while the M moment constraints are still defined by equation (2), the parameter αm is no longer limited to integer values and is instead defined on the real domain. This allows for a more flexible and accurate representation of the distribution. Furthermore, the decision vector, which represents the variables to be optimized, now includes ν and α, where α is a vector consisting of α1, α2, …, αM. The value of M plays a crucial role in determining the quality of the fitted distribution. Increasing M helps reduce bias errors, as it provides more parameters to capture the features of the distribution. However, it also introduces variance errors, as an excessively large M can lead to overfitting and reduced predictive capability on unseen data. To strike a balance between bias and variance, it is essential to include M as part of the optimization process. Therefore, the decision vector becomes [ν, α, M]. However, it is worth noting that incorporating fractional moments into the optimization problem significantly increases its complexity, making it a challenging task.
Maximum likelihood estimation (MLE) is a commonly used method for estimating model parameters. In the scenario where M is a constant, the likelihood function of equation (3) can be defined as:
$$L(\nu) = \prod_{n=1}^{N} p_X(X_n) = \prod_{n=1}^{N} \exp\left(-\nu_0 - \sum_{m=1}^{M} \nu_m X_n^{\alpha_m}\right) \tag{13}$$
Hence, the log-likelihood function can be expressed as:
$$\ln L(\nu) = -N\nu_0 - \sum_{n=1}^{N}\sum_{m=1}^{M} \nu_m X_n^{\alpha_m} \tag{14}$$
And the expected log-likelihood can be expressed as:
$$\frac{1}{N}\ln L(\nu) = -\nu_0 - \sum_{m=1}^{M} \nu_m \mu_m \tag{15}$$
The MLE maximizes the expected log-likelihood function defined in equation (15), which is equivalent to minimizing the objective function Q(ν, M) in equation (5). To account for model complexity, Novi Inverardi and Tagliani (2003) proposed the use of the Akaike information criterion (AIC) (Akaike, 1998), which offers a means of selecting among models with different numbers of parameters. The AIC value can be computed as follows:
$$\mathrm{AIC} = 2K - 2\ln\hat{L} \tag{16}$$
where K represents the count of estimated parameters, and $\hat{L}$ denotes the maximum likelihood function value, as defined in equation (13). The chosen model corresponds to the minimum AIC value, as expressed by the following equation:
$$\min\,\frac{\mathrm{AIC}}{2N} = \min\left[-\frac{1}{N}\ln\hat{L} + \frac{K}{N}\right] \tag{17}$$
where N represents the size of the sample. Novi Inverardi and Tagliani (2003) proposed the use of K = M. Consequently, the optimization problem pertaining to the MaxEnt approach with fractional moments can be formulated as follows:
$$\min_{\alpha,\,\nu,\,M}\; Q(\alpha, \nu, M) = \ln\left[\int_{0}^{\infty} \exp\left(-\sum_{m=1}^{M} \nu_m x^{\alpha_m}\right) dx\right] + \sum_{m=1}^{M} \nu_m \mu_m + \frac{M}{N} \tag{18}$$
In this scenario, M represents a positive integer and αm is a real value. The inclusion of the penalty term M/N ensures that the function Q(α, ν, M) increases as M grows or N decreases, thereby preventing overfitting of the model. Limited research has been conducted on approaches to solve this optimization problem. Zhang and Pandey (2013) applied a fixed value of M (M = 3) in their applications and employed a simplex search routine in MATLAB© (Lagarias et al., 1998) to simultaneously optimize α and ν. However, the simplex search method is susceptible to getting stuck in local minima and is highly reliant on initial values. Xu (2016) optimized Q(α, ν) by solving a linear system of equations, $\nu = \beta^{-1}\mu$, and used the simplex search method to improve ν. However, the method for updating and optimizing α was not specified.
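The penalized objective and a simplex search over the fractional order and its multiplier can be sketched as below. This is an illustration under stated assumptions rather than the procedure of any cited study: M is fixed at 1, the density is taken as proportional to exp(−ν1 x^α1) on (0, ∞), SciPy's Nelder–Mead routine stands in for the MATLAB simplex search, and the function names, starting point and parameter bounds are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

def penalised_objective(alpha, nu, samples):
    """Q(alpha, nu, M) = nu0 + sum_m nu_m * mu_m + M / N on the domain (0, inf)."""
    alpha = np.atleast_1d(np.asarray(alpha, dtype=float))
    nu = np.atleast_1d(np.asarray(nu, dtype=float))
    x = np.asarray(samples, dtype=float)
    # Sample fractional moments mu_m = mean(x ** alpha_m).
    mu = np.array([np.mean(x**a) for a in alpha])

    def kernel(t):
        return np.exp(-np.sum(nu * t**alpha))

    integral, _ = quad(kernel, 0.0, np.inf)
    if not np.isfinite(integral) or integral <= 0.0:
        return np.inf
    return np.log(integral) + nu @ mu + alpha.size / x.size

def fit_single_fractional_moment(samples, start=(0.7, 0.7)):
    """Simplex (Nelder-Mead) search over (alpha_1, nu_1) with M fixed at 1."""
    def cost(theta):
        alpha_1, nu_1 = theta
        # Reject exponents and multipliers for which the integral is not well behaved.
        if alpha_1 < 0.2 or nu_1 <= 0.0:
            return np.inf
        return penalised_objective(alpha_1, nu_1, samples)

    result = minimize(cost, start, method="Nelder-Mead")
    return result.x

# Synthetic positive data from a unit exponential, whose density exp(-x)
# belongs to the fitted family with alpha_1 = 1 and nu_1 = 1.
rng = np.random.default_rng(1)
data = rng.exponential(scale=1.0, size=20_000)
alpha_hat, nu_hat = fit_single_fractional_moment(data)
print(alpha_hat, nu_hat)
```

Repeating the search from several starting points and for several values of M, then keeping the smallest Q, is one simple way to reduce the sensitivity to initial values noted above.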
2.2 Multivariate maximum entropy distribution
The multivariate distribution possesses a wide range of applications and its comprehensive understanding can be found in Lai and Balakrishnan (2009), where the detailed forms, properties and applications of bivariate distributions are discussed. Estimations of multivariate distributions often rely on estimating the marginal distributions and their copula; however, the selection of an appropriate copula presents a challenge (Embrechts, 2009). The multivariate MaxEnt distribution is considered the least biased distribution. Prior to introducing the multivariate MaxEnt distribution, the fundamentals of the multivariate distribution are elucidated, as it is comparatively less familiar than the univariate distribution.
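Let X1 and X2 denote two random variables, and let g(x1, x2) represent their joint PDF. The joint CDF can be expressed as:
$$G(x_1, x_2) = P(X_1 \le x_1,\, X_2 \le x_2) = \int_{-\infty}^{x_1}\int_{-\infty}^{x_2} g(u, v)\,dv\,du$$
From which the joint PDF can be extracted as follows:
$$g(x_1, x_2) = \frac{\partial^2 G(x_1, x_2)}{\partial x_1\,\partial x_2}$$
The joint PDF must satisfy two criteria:
$$g(x_1, x_2) \ge 0, \qquad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x_1, x_2)\,dx_1\,dx_2 = 1$$
The marginal distributions can be expressed in terms of the joint distribution as follows:
$$G_1(x_1) = G(x_1, \infty), \qquad G_2(x_2) = G(\infty, x_2)$$
The marginal density of the random variables X1 and X2, respectively, can be expressed as:
$$g_1(x_1) = \int_{-\infty}^{\infty} g(x_1, x_2)\,dx_2, \qquad g_2(x_2) = \int_{-\infty}^{\infty} g(x_1, x_2)\,dx_1$$
The conditional distribution function of X1, given X2 = x2, can be expressed as:
$$G(x_1 \mid x_2) = \frac{1}{g_2(x_2)}\int_{-\infty}^{x_1} g(u, x_2)\,du$$
The conditional density function of X1, given X2 = x2, can be expressed as:
$$g(x_1 \mid x_2) = \frac{g(x_1, x_2)}{g_2(x_2)}$$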
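The mathematical definition of the expectation of the function f(X1, X2) is as follows:
$$E[f(X_1, X_2)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x_1, x_2)\,g(x_1, x_2)\,dx_1\,dx_2$$
The marginal mean is defined as follows:
$$\mu_{X_1} = E[X_1] = \int_{-\infty}^{\infty} x_1\,g_1(x_1)\,dx_1$$
And the marginal variance is defined as follows:
$$\sigma_{X_1}^2 = E\left[(X_1 - \mu_{X_1})^2\right] = \int_{-\infty}^{\infty} (x_1 - \mu_{X_1})^2\,g_1(x_1)\,dx_1$$
The same principle applies when calculating the mean ($\mu_{X_2}$) and variance ($\sigma_{X_2}^2$) for X2. The conditional expectation of a function φ(X1), given X2 = x2, is expressed as:
$$E[\varphi(X_1) \mid X_2 = x_2] = \int_{-\infty}^{\infty} \varphi(x_1)\,g(x_1 \mid x_2)\,dx_1$$
The expectation of X1 can be derived as:
$$E[X_1] = E\left[E[X_1 \mid X_2]\right] = \int_{-\infty}^{\infty} E[X_1 \mid X_2 = x_2]\,g_2(x_2)\,dx_2$$
This is commonly referred to as the law of iterated expectations. Covariance, on the other hand, characterizes the joint pattern of two random variables and is defined as:
$$\mathrm{cov}(X_1, X_2) = E\left[(X_1 - \mu_{X_1})(X_2 - \mu_{X_2})\right] = E[X_1 X_2] - \mu_{X_1}\mu_{X_2}$$
To mitigate the influence of varying scales among different pairs of random variables, the covariance is standardized into correlation. Correlation is defined as:
$$\rho_{X_1 X_2} = \frac{\mathrm{cov}(X_1, X_2)}{\sigma_{X_1}\,\sigma_{X_2}}$$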
The correlation coefficient quantifies the strength of a linear relationship between variables, but it does not capture non-linear relationships. If X1 and X2 are independent random variables, their covariance, cov(X1, X2), is zero. However, the converse is not necessarily true: a covariance of zero, cov(X1, X2) = 0, does not imply that X1 and X2 are independent.
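A small discrete example makes the last point concrete. In the sketch below, the five-point distribution is a hypothetical construction chosen so that the arithmetic is exact: X2 is a deterministic function of X1, yet the covariance vanishes.

```python
import numpy as np

# X1 is uniform on five symmetric points and X2 = X1**2 is fully determined by X1.
x1 = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
x2 = x1**2
prob = np.full(x1.size, 1.0 / x1.size)

mean_x1 = np.sum(prob * x1)
mean_x2 = np.sum(prob * x2)
covariance = np.sum(prob * (x1 - mean_x1) * (x2 - mean_x2))

# Dependence check: P(X2 = 4 | X1 = 2) differs from the marginal P(X2 = 4).
p_x2_is_4 = np.sum(prob[x2 == 4.0])
p_x2_is_4_given_x1_is_2 = 1.0  # X1 = 2 forces X2 = 4

print(covariance, p_x2_is_4, p_x2_is_4_given_x1_is_2)
```

The covariance is zero because the relationship is symmetric about the mean of X1, while the conditional and marginal probabilities of X2 differ, which is what dependence means.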
Now, let X be a random variable with a probability density function g(x), and let U = h(X), where h(x) is either strictly increasing or strictly decreasing. In this case, the probability density function of U can be expressed as:
$$p_U(u) = g\left(h^{-1}(u)\right)\,|J|$$
In which J denotes the Jacobian of transformation, expressed as follows:
$$J = \frac{d\,h^{-1}(u)}{du}$$
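The transformation method can be checked numerically on a familiar case. The sketch below assumes X is standard normal and U = exp(X), so that the inverse map is ln(u) and the Jacobian is 1/u; the helper name `transformed_pdf` is hypothetical, and the result is compared with SciPy's lognormal density.

```python
import numpy as np
from scipy.stats import lognorm, norm

def transformed_pdf(u, pdf_x, inverse, inverse_derivative):
    """PDF of U = h(X) for a strictly monotonic h, via the transformation method."""
    u = np.asarray(u, dtype=float)
    jacobian = inverse_derivative(u)
    return pdf_x(inverse(u)) * np.abs(jacobian)

# Example: X is standard normal and U = exp(X), so h^{-1}(u) = ln(u) and J = 1 / u.
u = np.linspace(0.1, 5.0, 50)
pdf_u = transformed_pdf(u, norm.pdf, np.log, lambda v: 1.0 / v)

# The result should coincide with the lognormal density with unit shape parameter.
print(np.max(np.abs(pdf_u - lognorm.pdf(u, s=1.0))))
```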
The aforementioned operation is commonly known as the transformation method. Moving forward, the entropy of a multivariate distribution can be defined as:
$$H[f_X] = -\int_{\Omega_1}\int_{\Omega_2}\cdots\int_{\Omega_d} f_X(\mathbf{x})\,\ln f_X(\mathbf{x})\,dx_1\,dx_2 \cdots dx_d$$
where Ω1, Ω2, …, Ωd denote the domains of the components of the random vector. Darbellay and Vajda (2000) presented various analytical expressions for the entropy of different distributions. Ahmed and Gokhale (1989) provided entropy expressions for certain multivariate distributions. The moment of the random variable with respect to the moment function fm(X) can be expressed as follows:
$$E[f_m(\mathbf{X})] = \int_{\Omega} f_m(\mathbf{x})\,f_X(\mathbf{x})\,d\mathbf{x} = \mu_m$$
where m = 1, 2, …, M and E[.] represents the expectation operator. In accordance with the MaxEnt principle (Jaynes, 1957), the maximum entropy distribution should be chosen from the set of all distributions that satisfy the given moment constraints.
By using the Euler–Lagrange equation, the PDF, fX(x), of the MaxEnt distribution can be derived as follows:
$$f_X(\mathbf{x}) = \exp\left(-\nu_0 - \sum_{m=1}^{M} \nu_m f_m(\mathbf{x})\right)$$
where ν = [ν1, ν2, …, νM] represents the Lagrange multipliers, and ν0 denotes the normalization constant, which can be expressed as follows:
$$\nu_0 = \ln\left[\int_{\Omega} \exp\left(-\sum_{m=1}^{M} \nu_m f_m(\mathbf{x})\right) d\mathbf{x}\right]$$
The constrained optimization problem can be reformulated as an unconstrained optimization problem, with the following objective function to be minimized:
$$Q(\nu) = \nu_0 + \sum_{m=1}^{M} \nu_m \mu_m$$
The value of μm can be determined by calculating sample moments:
$$\mu_m \approx \frac{1}{N}\sum_{n=1}^{N} f_m(\mathbf{x}_n)$$
where xn = (x1n, x2n, …, xDn) represents the nth realization of the D random variables in a data set of size N. The primary distinction among the current MaxEnt PDF models is the moment function, fm(X). Singh (2013) and Ebrahimi et al. (2008) used entropy theory to derive multivariate distributions, assuming that the set of moment constraints fm(X) is known. However, these moment constraints are typically unavailable when estimating the density function from a data set. Urzúa (1988) defined the moment constraints as:
$$f_m(\mathbf{X}) = X_1^{\alpha_1} X_2^{\alpha_2} \cdots X_D^{\alpha_D}$$
In which the constraint order is determined by the summation of the nonnegative integer D-tuple [α1, α2, …, αD], denoted as ∑αi. Various algorithms addressing the multidimensional moment-constrained maximum entropy problem were discussed in Abramov (2006), Abramov (2007), Abramov (2009) and Abramov (2010). These moment constraints follow the same format as defined in Urzúa (1988). Owing to the computational complexity of solving directly for the Lagrange multipliers, Ahooyi et al. (2014) substituted fm(X) with its truncated Taylor series expansion at the expectation of X, so that the parameter estimation process involves finding the optimal truncation order. Kouskoulas et al. (2004) approximated the multivariate probability density function (PDF) by estimating the 1-D conditional PDF along different dimensions. The 1-D conditional PDFs were approximated using a weighted sum of Legendre functions. Singh and Zhang (2018) modeled the multivariate distribution by estimating the univariate distributions using entropy theory and incorporating the dependence between variables using a copula.
In comparison to the univariate MaxEnt distribution, the multivariate MaxEnt distribution is less frequently discussed in the literature. The accuracy and efficiency of existing MaxEnt models for estimating the PDF from sample data have not been well established. Additionally, the multivariate MaxEnt distribution involves estimating a larger number of parameters than the univariate case, necessitating a more robust parameter estimation algorithm.
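The multivariate formulation can be illustrated for D = 2 with product-type moment functions X1^a1·X2^a2 up to total order 2. The sketch below is a hedged example, not one of the algorithms cited above: it assumes a density of the form exp(−ν0 − Σ νm fm(x)), replaces the integral by a sum over a truncated uniform grid and uses BFGS; the function name, grid and starting point are assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def fit_bivariate_maxent(mu, exponents, axis, start):
    """Minimise Q(nu) = nu0 + sum_m nu_m * mu_m for product-moment constraints.

    mu        : target moments E[X1^a1 * X2^a2], one per exponent pair
    exponents : list of nonnegative integer pairs (a1, a2)
    axis      : uniformly spaced 1-D grid used for both variables
    start     : initial Lagrange multipliers (must give an integrable density)
    """
    mu = np.asarray(mu, dtype=float)
    cell = (axis[1] - axis[0]) ** 2
    g1, g2 = np.meshgrid(axis, axis, indexing="ij")
    # One column per moment function f_m(x) = x1^a1 * x2^a2, evaluated on the grid.
    features = np.stack([(g1**a1 * g2**a2).ravel() for a1, a2 in exponents], axis=1)

    def objective(nu):
        log_kernel = -features @ nu
        nu0 = logsumexp(log_kernel) + np.log(cell)
        density = np.exp(log_kernel - nu0)
        # Gradient of Q: target moments minus moments of the current model.
        gradient = mu - (features * density[:, None]).sum(axis=0) * cell
        return nu0 + nu @ mu, gradient

    result = minimize(objective, np.asarray(start, dtype=float), jac=True, method="BFGS")
    return result.x

exponents = [(1, 0), (0, 1), (2, 0), (0, 2), (1, 1)]
rho = 0.5
# Moments of a zero-mean, unit-variance pair with correlation rho.
mu = [0.0, 0.0, 1.0, 1.0, rho]
# Start from independent standard normal margins.
nu = fit_bivariate_maxent(mu, exponents, np.linspace(-8.0, 8.0, 161),
                          start=[0.0, 0.0, 0.5, 0.5, 0.0])
print(nu)
```

With these second-order constraints the MaxEnt solution is the bivariate normal, so the multipliers can be checked against the inverse covariance matrix: ν for x1² and x2² equals 1/(2(1 − ρ²)) and ν for x1·x2 equals −ρ/(1 − ρ²). The grid cost grows exponentially with D, which illustrates why more robust estimation algorithms are needed in higher dimensions.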
3. Applications of maximum entropy in geotechnical projects
MaxEnt has found important and fundamental applications in civil engineering because of its high capabilities in uncertainty modeling. This tool is used in various fields such as numerical modeling (Song and Mignolet, 2019; Norouzi et al., 2019), structural health monitoring (Villalpando et al., 2016), structural design optimization (Templeman, 1992) and risk management (Guo et al., 2017). In this paper, the main focus will be on two important applications of this method in uncertainty modeling and site characterization as two prominent and interesting fields in geotechnical engineering, which are reviewed in the following subsections.
3.1 Uncertainty modeling
Deng and Pandey (2008a) introduced a method for estimating the quantile function of a random variable by using fractional probability-weighted moments (FPWMs) in conjunction with the principle of maximum entropy. By combining Monte Carlo simulations and optimization algorithms, the authors determined the fractional orders of the FPWMs that result in the best-fit quantile function. Numerical examples showed that the FPWM-based quantile function was more accurate than the one based on conventional integer-order PWMs. The paper thus extends and improves upon previous methods for quantile function estimation. However, further research is needed to evaluate the applicability and limitations of the method in different scenarios and with various types of data, and a comprehensive comparison with other existing methods would help in assessing its relative strengths and weaknesses.
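For orientation, a probability-weighted moment of order k is commonly defined as βk = E[X·F(X)^k], and the fractional version simply allows k to be a non-integer. The sketch below estimates such a moment from a sample with the plotting positions (i − 0.35)/N, which is one common choice and an assumption here, not necessarily the estimator used by the cited authors; the function name is hypothetical.

```python
import numpy as np

def fractional_pwm(samples, order):
    """Plotting-position estimate of beta_k = E[X * F(X)^k] for a real order k >= 0."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    # Plotting positions p_i = (i - 0.35) / n approximate F at the i-th order statistic.
    positions = (np.arange(1, n + 1) - 0.35) / n
    return np.mean(x * positions**order)

rng = np.random.default_rng(2)
sample = rng.uniform(0.0, 1.0, size=50_000)
for k in (0.5, 1.0, 2.5):
    # For a uniform(0, 1) variable, beta_k = 1 / (k + 2).
    print(k, fractional_pwm(sample, k), 1.0 / (k + 2.0))
```

Because the sample values enter only linearly, these moments are less sensitive to extreme observations than conventional moments of comparable order.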
Deng and Pandey (2008b) introduced a distribution-free method for estimating the quantile function of a non-negative random variable from censored samples using the principle of partial minimum cross-entropy. By interpreting partial PWMs (PPWMs) as partial moments of the quantile function, the method extends the minimum cross-entropy principle to the partial minimum cross-entropy principle, providing accurate estimates of extreme quantiles without the need to invert the distribution function. The paper addresses the challenge of estimating quantile functions from censored data and demonstrates the effectiveness of the method through numerical examples. A limitation is its focus solely on censored samples, without considering other types of data.
Deng and Pandey (2009a) proposed an estimation method for deriving the quantile function using FPWMs as constraints in the minimum cross-entropy principle. By combining Monte Carlo simulations and optimization algorithms, the authors estimated the FPWMs and subsequently derived the best-fit quantile function. The numerical examples presented in the paper demonstrated a significant improvement in accuracy compared to the conventional approach that uses integer-order PWMs, providing a more robust and efficient way to estimate the quantile function.
Deng and Pandey (2009b) introduced a novel distribution-free method for estimating the quantile function of a non-negative random variable. The method uses the principle of partial minimum cross-entropy and fractional PPWMs (FPPWMs) to handle censored observed data commonly found in engineering reliability and hydrology distribution analysis. By incorporating constraints specified in terms of FPPWMs, the proposed method improves the efficiency and accuracy of quantile estimation. Numerical results demonstrate that this new approach outperforms existing methods in terms of accuracy and efficiency. This paper provides a valuable contribution to the field of quantile estimation and offers a flexible and effective solution for handling censored data.
Deng and Pandey (2009b) introduced a novel distribution-free method for estimating the quantile function of a non-negative random variable using censored or incomplete data. The method was based on the principle of partial maximum entropy (MaxEnt) and used PPWMs as constraints. By extending the maximum entropy principle (MEP) constrained by PWMs to handle censored or incomplete data, the authors demonstrated the accuracy and efficiency of their proposed method through numerical results and practical examples. The paper highlighted the usefulness of the partial MaxEnt quantile function estimation method for censored samples, showcasing its potential for unbiased and efficient estimation of the quantile function in such scenarios. The results and discussion presented in the paper confirmed the effectiveness of the method in estimating the quantile function of non-negative random variables using censored samples.
Li et al. (2012) applied the MaxEnt principle to probabilistic slope stability analysis. They first determined the probabilistic characteristics of the random variables involved and formulated the performance function using the simplified Bishop method. They then computed the first four moments of the random variables and derived the moments of the performance function by the Taylor series method. From these moments, the MaxEnt method was used to obtain the probability density function (PDF) of the performance function, from which the probability of slope failure was assessed.
Deng (2019) presents a distribution-free approach for flood frequency analysis using the maximum entropy method and Akaike’s information criterion. The method does not rely on any classical distributions and provides universal probability curves. Comparative studies using historical flood data demonstrate that the proposed method accurately characterizes the probabilistic information of observed or transformed flood data. By minimizing the Kullback–Leibler entropy with Akaike’s Information Criterion, the most unbiased entropy distribution is determined. The paper provides a flowchart for the maximum entropy method and Akaike’s information criterion, allowing for a systematic determination of unknown parameters of the entropy distribution function. Overall, the results show that the method can reasonably and accurately determine the probabilistic information of flood data, providing a flexible and comprehensive analysis of flood frequency.
Wang and Goh (2022) proposed a methodology to predict the factor of safety (FoS) in slope stability problems using convolutional neural networks (CNNs) in conjunction with the MaxEnt method for estimating failure probability. Initially, they constructed a finite element model of the slope and generated a set of X random field samples. Subsequently, finite element analyses were conducted on these samples to determine the FoS values for each case. The X pairs of random fields and their corresponding FoS values were then divided into a training set (comprising 75% of the data) and a validation set (comprising 25% of the data). A CNN was trained and tested to serve as a surrogate model for the finite element model. Once the CNNs were adequately trained, they were used to predict FoS values for a large number of Monte Carlo samples, thereby eliminating the need for time-consuming finite element analyses. Additionally, the CNN-predicted FoS values were subjected to post-processing using the MaxEnt distribution with fractional moments (MaxEnt-FM) technique to fit a probability distribution to the data set. Based on the fitted distribution, the probability of failure (Pfailure) was calculated using the integral function in MATLAB.
Deng (2022) presents a method for probabilistic characterization of soil properties using the maximum entropy method from fractional moments of observed data. The proposed method is objective and unbiased, avoiding the assumptions of classical distributions. A case study on the undrained shear strength of soil in the Nipigon River landslide area in Ontario, Canada, is presented, comparing the maximum entropy distributions to other common distributions. The paper discusses issues such as overfitting/underfitting, minimum sample size, fractional orders, negative moments and limitations of the method. The application of the method to geotechnical reliability analysis is illustrated, including the calculation of the reliability index β using the chi-square statistic and obtaining the normal distribution. Overall, the paper provides a valuable contribution to the field of geotechnical engineering by offering a robust method for estimating probability distributions of soil properties.
Deng (2023) presents a novel method for estimating quantile functions of random variables using the MEP and FPWM. The paper introduces a new nondimensional analysis and uses the Akaike information criterion to determine the optimal order of FPWM in MEP analysis. The author demonstrates the application of this method in estimating optimal quantile functions for soil undrained shear strength and annual maximum daily discharge, showcasing its accuracy, unbiasedness and efficiency. Furthermore, the paper illustrates the use of the developed FPWM-based maximum entropy quantile functions (MEQFs) in reliability analysis in civil engineering, with a case study on the reliability analysis of a rock slope. The results show that the FPWM-based MEQFs provide a better fit to sample data, particularly in the tail regions, compared to common empirical distributions. Overall, this paper contributes to the field by offering a robust approach for estimating quantile functions and showcasing its advantages in fitting tail regions of the data.
Deng et al. (2021) present a probabilistic analysis of shear strength parameters for marble from the Jinping II hydropower project in China using laboratory triaxial compression tests. The authors propose using maximum entropy probability density functions to model the angle of internal friction and cohesive strength, arguing that these provide a more unbiased and optimal characterization compared to conventional normal and lognormal distributions. Through Kolmogorov–Smirnov testing, they demonstrate that the entropy distributions better fit the sample data while avoiding over- or under-fitting uncertainties. The study also reveals a strong negative correlation (−0.886) between the cohesive strength and angle of internal friction for the marble samples tested. This work contributes to more accurate probabilistic approaches in rock mechanics and underground excavation design.
3.2 Site characterization and data scarcity
Zhao et al. (2021) presented a novel method for determining efficient locations for Cone Penetration Testing (CPT) soundings in geotechnical site investigations. Their approach used a Bayesian compressive sensing framework informed by the principle of MaxEnt. This method aimed to maximize information gain regarding the spatial variability of soil properties within a 2D vertical cross-section. The study demonstrated that the proposed locations outperformed random selection in characterizing spatial variability across multilayer soil profiles. However, the study had some limitations. The proposed method was limited to 2D applications and assumed independence between soil property variations in different layers, which may not always be realistic. Additionally, computational challenges arose when dealing with large data sets, and sparse measurement data could introduce bias in kriging estimations. Furthermore, the accuracy of the method might have been impacted by the inherent non-stationarity of soil properties and the limited number of CPT soundings typically collected due to practical constraints.
Shi and Wang (2021) addressed the challenge of subsurface stratigraphic uncertainty in slope stability evaluation by proposing a data-driven, MaxEnt-based approach. Their paper highlighted the importance of accurate stratigraphy for slope characterization and emphasized the limitations of traditional methods due to limited site measurements. The strategy iteratively reduced uncertainty through entropy analysis, guiding borehole placement and improving spatial interpolation of subsurface features. However, limitations were identified: the methodology’s generalizability to diverse geological settings and its reliance on sparse measurements required further investigation. Additionally, the complexity of real-world slopes with heterogeneous conditions and budgetary constraints for extensive borehole sampling posed challenges for practical application. Future studies validating the approach across various geological scenarios would have strengthened its applicability.
Zhao and Wang (2019) investigated the challenge of efficient sampling location determination in geotechnical site characterization. Their paper proposed an objective and quantitative method that integrated information entropy and Bayesian compressive sampling. This method used Maximum Entropy (MaxEnt) to quantify uncertainty in subsurface information and guide the selection of optimal borehole placements. The authors demonstrated the effectiveness of their approach using real CPT data, achieving improvements in spatial interpolation and reducing uncertainty within the geological model. While the method offered a data-driven approach aligned with engineering experience, limitations were acknowledged. The assumption of homogeneous slopes and reliance on sparse measurements could have restricted generalizability. The paper called for further validation in diverse geological settings to ensure broader applicability.
4. Computational tools
The concept of Maximum Entropy has gained widespread application across various branches of science, leading to the development of numerous computational tools in the form of software, libraries and packages in different programming languages. Among the most commonly used programming languages for engineering applications are Python, R, MATLAB and Julia. Table 1 presents a comprehensive list of packages in these programming languages that facilitate the modeling of MaxEnt.
Table 1. A list of computational tools for MaxEnt modeling
| No. | Programming language/software | Package | Free/Paid | Access link |
|---|---|---|---|---|
| 1 | R | MaxLike | Free | Chandler et al. (2025) |
| 2 | R | dismo | Free | Hijmans et al. (2024) |
| 3 | R | MIAmaxent | Free | Vollering et al. (2026) |
| 4 | Python | Elapid | Free | Anderson (2026) |
| 5 | Python | lzhang10/maxent | Free | Zhang et al. (2026) |
| 6 | Python | PyMaxEnt | Free | Saad (2026) |
| 7 | Python | PyMC3 | Free | Abril-Pla et al. (2023) |
| 8 | Python | maxent-infer | Free | Barrett et al. (2022) |
| 9 | Python | QGIS maxent model plugin | Free | Matellanes and Soriano (2022) |
| 10 | MATLAB | Maximum entropy toolbox for MATLAB | Paid | Maoz and Schneidman (2017) |
| 11 | MATLAB | Maximum entropy code | Free | Cunha Jr (2020) |
| 12 | Julia | Maxent-Julia | Free | Ortiz-Bernardin and José (2026) |
| 13 | Julia | MaxEntropyGraphs.jl | Free | Clerck (2021) |
| 14 | STAN (standalone software) | | Free | Carpenter et al. (2017) |
| 15 | Gurobi (standalone software) | | Paid | Gurobi Optimizer (2024) |
In addition to these programming languages, two standalone software options, STAN and Gurobi, can be used to model MaxEnt. STAN is a free probabilistic programming platform, whereas Gurobi is a commercial optimization solver; both can be used to address constrained MaxEnt problems.

STAN is a versatile tool that can be used to implement various statistical models, including the MEP, for purposes such as uncertainty modeling. To use STAN for this purpose, one must first declare the measurements or observations in the data block, specify the constraints to be imposed on the probability distribution (such as moments or ranges) and define the unnormalized target density through its log probability density function (log-PDF). This function embodies the MEP: it takes the form that maximizes entropy while satisfying the specified constraints. Common constraints in uncertainty modeling include equality constraints (such as mean and variance) and inequality constraints (such as ranges of values). Lagrange multipliers can be used to incorporate these constraints within the log-PDF.

Gurobi does not have a direct implementation of the MEP for uncertainty modeling. However, users can include an entropy term in the objective function to be maximized. The specific form of the entropy term depends on the chosen probability distribution. For example, the differential entropy of a Gaussian distribution, $\tfrac{1}{2}\ln(2\pi e \sigma^2)$, increases with the variance $\sigma^2$, so for Gaussian variables maximizing entropy is equivalent to maximizing the variance permitted by the constraints. To incorporate this into the optimization problem, one formulates the main problem together with the entropy term in the objective function. Gurobi can handle this combined objective, which makes it a versatile tool for uncertainty modeling.
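To make the role of the Lagrange multiplier concrete, the following minimal Python sketch (plain Python, not STAN or Gurobi code) finds the maximum entropy probabilities on a finite set of values subject to a single mean constraint. It relies on the well-known result that the solution has the exponential form $p_i \propto \exp(-\lambda v_i)$. The function names, the bisection bracket and the six-valued example are assumptions made purely for illustration.

```python
import math


def exponential_form_pmf(values, lam):
    """Probabilities p_i proportional to exp(-lam * v_i), computed stably."""
    exponents = [-lam * v for v in values]
    top = max(exponents)  # subtract the largest exponent to avoid overflow
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def maxent_pmf_with_mean(values, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Maximum entropy pmf on `values` whose mean equals `target_mean`.

    The mean of the exponential-form pmf decreases monotonically as the
    multiplier lam increases, so lam can be located by bisection.
    The bracket [lo, hi] is an assumed search range for lam.
    """
    if not min(values) < target_mean < max(values):
        raise ValueError("target_mean must lie strictly inside the value range")

    def mean_for(lam):
        pmf = exponential_form_pmf(values, lam)
        return sum(p * v for p, v in zip(pmf, values))

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_for(mid) > target_mean:
            lo = mid  # mean still too large: a larger multiplier is needed
        else:
            hi = mid
    return exponential_form_pmf(values, 0.5 * (lo + hi))
```

For example, `maxent_pmf_with_mean([1, 2, 3, 4, 5, 6], 3.5)` returns the uniform distribution, because a mean constraint equal to the unconstrained mean adds no information, while a target mean of 4.5 shifts probability towards the larger values.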
When selecting between these options, users should consider compatibility with their preferred programming language or their willingness to learn a new one. Cost is another factor: most of the options listed are free and open-source, although some commercial software may offer additional features.
The following section presents MATLAB code developed by the authors for uncertainty modeling and uses it to solve a case study step by step, showcasing the practical application of MaxEnt modeling in a real-world scenario.
5. Case study using maximum entropy
5.1 Probability weighted moments
In conventional practice, MEP is commonly used to estimate the probability density function by imposing specific moment constraints. The density function is subsequently integrated to calculate the cumulative distribution function, which, in turn, requires inversion to determine the quantile associated with a particular probability value (Pandey, 2000). However, when dealing with small sample sizes, the estimation of higher order moments (order > 2) often suffers from significant sampling errors. Consequently, using the maximum entropy distribution derived from these imprecise moment estimates can result in inaccurate quantile values (Deng and Pandey, 2008a). The primary challenge in employing the MEP for quantile estimation has been the difficulty of obtaining precise moment estimates from small samples. This obstacle has negatively influenced the widespread application of the maximum entropy approach (Pandey, 2000).
Pandey (2000) introduced a method that calculates quantile functions directly from the MEP, using PWMs rather than ordinary moments as constraints. The notable advantage of PWMs over ordinary statistical moments is that their higher-order values can be estimated accurately even from small samples (Pandey, 2000).
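The probability-weighted moment of a random variable $X$ was defined by Greenwood et al. (1979) as follows:

$$M_{p,r,s} = E\left[X^{p} F^{r} (1-F)^{s}\right] = \int_0^1 \left[x(F)\right]^{p} F^{r} (1-F)^{s}\, dF$$

where $M_{p,r,s}$ is the integer-order PWM with orders $p$, $r$ and $s$, $E[\cdot]$ is the mathematical expectation and $F = F(x)$ is the probability of non-exceedance. The following two forms of PWMs are particularly simple and useful:

Type 1:

$$\alpha_s = M_{1,0,s} = \int_0^1 x(F)\,(1-F)^{s}\, dF$$

and Type 2:

$$\beta_r = M_{1,r,0} = \int_0^1 x(F)\,F^{r}\, dF$$

where $x(F)$ is the quantile function of $X$.

Unbiased estimates, denoted as $a_s$ and $b_r$, of $\alpha_s$ and $\beta_r$, respectively, can be obtained from an ordered random sample $x_1 \le x_2 \le \dots \le x_n$ of size $n$, as stated in Hosking (1986):

$$a_s = \frac{1}{n}\sum_{i=1}^{n} \frac{\binom{n-i}{s}}{\binom{n-1}{s}}\, x_i, \qquad b_r = \frac{1}{n}\sum_{i=1}^{n} \frac{\binom{i-1}{r}}{\binom{n-1}{r}}\, x_i$$

where $r$ and $s$ are non-negative integers.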
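As an illustration of how the sample PWMs can be computed, the sketch below implements the standard unbiased estimators, $a_k = \frac{1}{n}\sum_i \binom{n-i}{k} x_i / \binom{n-1}{k}$ and $b_k = \frac{1}{n}\sum_i \binom{i-1}{k} x_i / \binom{n-1}{k}$, for an ordered sample. The function name `sample_pwms` is hypothetical; it is not taken from the authors' MATLAB code.

```python
from math import comb


def sample_pwms(sample, order):
    """Unbiased sample PWMs a_k (Type 1) and b_k (Type 2) for k = 0..order.

    `sample` may be unordered; it is sorted internally so that x_1 <= ... <= x_n.
    """
    x = sorted(sample)
    n = len(x)
    if order > n - 1:
        raise ValueError("order must not exceed n - 1")
    a, b = [], []
    for k in range(order + 1):
        denom = n * comb(n - 1, k)
        # i runs from 1 to n, matching the usual statement of the estimators
        a.append(sum(comb(n - i, k) * xi for i, xi in enumerate(x, start=1)) / denom)
        b.append(sum(comb(i - 1, k) * xi for i, xi in enumerate(x, start=1)) / denom)
    return a, b
```

Both `a[0]` and `b[0]` equal the sample mean, and `a[1] + b[1]` also equals the sample mean because $(1-F) + F = 1$, which provides a quick consistency check on any implementation.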
5.2 Maximum entropy quantile functions with probability weighted moments as constraints
Pandey (2000) proposed a distribution-free procedure for calculating the quantile function of a non-negative random variable. The procedure applies the principle of maximum entropy subject to constraints defined in terms of probability-weighted moments estimated from observed data. The entropy of a quantile function can be written as:

$$H = -\int_0^1 x(u)\,\ln x(u)\, du \tag{47}$$

where $H$ is the entropy, $x(u)$ is the quantile function and $u$ is the probability of non-exceedance.

The available information is also presented as PWMs:

$$\int_0^1 u^{k}\, x(u)\, du = b_k, \qquad k = 0, 1, \dots, K \tag{48}$$

where $b_k$ is a sample estimate of the population PWM of order $k$, and $K$ is the highest order of PWM considered in the analysis. To maximize the entropy in equation (47), the method of Lagrange multipliers is used, and the Lagrangian function can be given by:

$$L = -\int_0^1 x(u)\,\ln x(u)\, du - \sum_{k=0}^{K} \lambda_k \left[\int_0^1 u^{k}\, x(u)\, du - b_k\right] \tag{49}$$

where $\lambda_k$ represents an unknown Lagrangian multiplier. To derive the quantile function, the entropy can be maximized using the usual condition:

$$\frac{\partial L}{\partial x(u)} = 0 \tag{50}$$

Substituting equation (49) into equation (50) and then simplifying yields the solution presented in equation (51):

$$x(u) = \exp\left(-1 - \sum_{k=0}^{K} \lambda_k u^{k}\right) \tag{51}$$

The Lagrangian multipliers can then be calculated by solving a system of nonlinear equations:

$$\int_0^1 u^{k} \exp\left(-1 - \sum_{j=0}^{K} \lambda_j u^{j}\right) du = b_k, \qquad k = 0, 1, \dots, K \tag{52}$$
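One possible way of solving for the multipliers is sketched below in Python. It assumes the quantile function has the form $x(u) = \exp(-1 - \sum_k \lambda_k u^k)$ with constraints $\int_0^1 u^k x(u)\,du = b_k$, and it minimizes the convex dual function $\Gamma(\lambda) = \int_0^1 x(u)\,du + \sum_k \lambda_k b_k$ by a damped Newton iteration. This is a common numerical strategy for such systems and is not necessarily the algorithm used in the authors' MATLAB code; all function names are hypothetical. The Hessian becomes ill-conditioned as the order grows, so the sketch is intended for low orders only.

```python
import math


def simpson(f, n=200):
    """Composite Simpson rule for the integral of f over [0, 1] (n even)."""
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


def meqf(lam):
    """Quantile function x(u) = exp(-1 - sum_k lam[k] * u**k)."""
    def x(u):
        expo = -1.0 - sum(l * u**k for k, l in enumerate(lam))
        return math.exp(min(expo, 700.0))  # clamp to avoid overflow
    return x


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_meqf(b, tol=1e-9, max_iter=100):
    """Multipliers such that the MEQF reproduces the PWMs b[0..K]."""
    K = len(b) - 1
    lam = [-1.0 - math.log(b[0])] + [0.0] * K  # start from x(u) = b_0

    def dual(l):
        return simpson(meqf(l)) + sum(li * bi for li, bi in zip(l, b))

    for _ in range(max_iter):
        x = meqf(lam)
        mom = [simpson(lambda u, p=p: u**p * x(u)) for p in range(2 * K + 1)]
        grad = [b[k] - mom[k] for k in range(K + 1)]  # gradient of the dual
        if max(abs(g) for g in grad) < tol * abs(b[0]):
            break
        hess = [[mom[j + k] for k in range(K + 1)] for j in range(K + 1)]
        step = solve_linear(hess, [-g for g in grad])  # Newton direction
        t, f0 = 1.0, dual(lam)
        while t > 1e-8 and dual([l + t * s for l, s in zip(lam, step)]) > f0:
            t *= 0.5  # backtrack until the dual does not increase
        lam = [l + t * s for l, s in zip(lam, step)]
    return lam
```

A typical use would pass the Type 2 sample PWMs, for instance `lam = fit_meqf([b0, b1, b2])`, and then evaluate `meqf(lam)(u)` at the probabilities of interest.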
5.3 Case study using maximum entropy
In this section an example of slope stability analysis is provided to demonstrate the application of MEQFs in conducting reliability analysis in geotechnical engineering.
The Sau Mau Ping rock slope (Figure 1) in Hong Kong was initially documented by Hoek and Bray (1974) and has since been investigated by several researchers (Deng, 2023; Low, 2007; Wang et al., 2013; Deng and Pandey, 2023). The slope is situated in a region characterized by high rainfall but low seismic activity. A minor slide in a nearby slope raised concern that a significant slide could occur at the Sau Mau Ping rock slope.
Figure 1. Sau Mau Ping rock slope
The slope adjacent to Sau Mau Ping Road was excavated in unweathered granite exhibiting exfoliation or sheet joints. These joints run parallel to the granite surface, and the spacing between successive joints increases with depth into the rock mass. Undermining of these sheet joints can trigger a rockslide (Hoek and Bray, 1974). The geometry of the slope and the physical properties of the rock mass are given in Table 2. For the reliability analysis in this example, the geotechnical properties of the sliding surface are treated as random variables, which are summarized in Table 3.
Table 2. Basic parameters of the slope geometry (Hoek and Bray, 1974)
| $H$ (m) | $\psi_f$ (degree) | $\psi_p$ (degree) | $\alpha$ (g) | $\gamma_r$ (kN/m³) | $\gamma_w$ (kN/m³) | $z$ (m) | $z_w$ (m) | $T$ (kN/m) | $\theta$ (degree) |
|---|---|---|---|---|---|---|---|---|---|
| 60 | 50 | 35 | 0.08 | 26 | 9.81 | 14 | 7 | 0 | 0 |
Table 3. Random variables of rock slope (Wang et al., 2023)
| Variable | c (kN/m²) | ϕ (degree) |
|---|---|---|
| Mean | 100 | 35 |
| Standard deviation | 20 | 5 |
| Distribution | Normal | Normal |
A data set of 50 cohesion samples was generated by Monte Carlo simulation from the distribution of cohesion given in Table 3, to evaluate how well the maximum entropy quantile function represents the samples (Table 4).
Using the data in Table 4, the PWMs are estimated with equation (45). Solving equation (51) with these PWMs then gives the MEQFs. Finally, the Akaike information criterion (AIC) is used to select the optimum order of the maximum entropy quantile function [equation (53)], which is used in the reliability analysis.
Table 5. Parameters for normal and lognormal distributions
| Distribution | Parameter | Value |
|---|---|---|
| Normal | Mean, $\mu$ (kPa) | 100 |
| Normal | Standard deviation, $\sigma$ (kPa) | 20 |
| Lognormal | Mean of $\ln c$, $\mu_{\ln}$ | 4.5856 |
| Lognormal | Standard deviation of $\ln c$, $\sigma_{\ln}$ | 0.1980 |
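The distribution of a random variable can also be determined by conventional methods from a sample of its values. First, a normal or lognormal distribution is assumed; then the method of maximum likelihood is used to calculate its parameters. With the parameters in Table 5, the normal and lognormal quantile functions are given by equation (54) and equation (55), respectively:

$$x(u) = \mu + \sigma\,\Phi^{-1}(u) \tag{54}$$

$$x(u) = \exp\left[\mu_{\ln} + \sigma_{\ln}\,\Phi^{-1}(u)\right] \tag{55}$$

where $\mu$ and $\sigma$ are the mean and standard deviation of the normal distribution, $\mu_{\ln}$ and $\sigma_{\ln}$ are the mean and standard deviation of the logarithm of the variable, $\Phi^{-1}(\cdot)$ is the inverse of the standard normal CDF and $u$ is the probability of non-exceedance.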
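For readers who wish to evaluate these two conventional quantile functions numerically, a minimal Python sketch is given below. It assumes the usual forms $x(u) = \mu + \sigma\,\Phi^{-1}(u)$ and $x(u) = \exp[\mu_{\ln} + \sigma_{\ln}\,\Phi^{-1}(u)]$ and takes the parameter values of Table 5 as defaults. The function names are illustrative, and `statistics.NormalDist.inv_cdf` from the Python standard library plays the role of the inverse standard normal CDF.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def normal_qf(u, mean=100.0, sd=20.0):
    """Normal quantile function x(u) = mean + sd * Phi^{-1}(u), 0 < u < 1."""
    return mean + sd * _STD_NORMAL.inv_cdf(u)


def lognormal_qf(u, mean_ln=4.5856, sd_ln=0.1980):
    """Lognormal quantile function x(u) = exp(mean_ln + sd_ln * Phi^{-1}(u))."""
    return math.exp(mean_ln + sd_ln * _STD_NORMAL.inv_cdf(u))
```

With the default parameters, the two medians are `normal_qf(0.5)`, which is 100 kPa, and `lognormal_qf(0.5)`, which is about 98.06 kPa; the lognormal curve lies above the normal one in the upper tail.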
Table 4. Illustrative sample (cohesion of rock discontinuities)
| No. | Value (kPa) | No. | Value (kPa) | No. | Value (kPa) | No. | Value (kPa) |
|---|---|---|---|---|---|---|---|
| 1 | 62.86 | 14 | 92.84 | 27 | 101.27 | 40 | 119.94 |
| 2 | 64.07 | 15 | 92.88 | 28 | 101.34 | 41 | 122.68 |
| 3 | 65.58 | 16 | 93.53 | 29 | 102.35 | 42 | 123.44 |
| 4 | 65.72 | 17 | 93.68 | 30 | 103.15 | 43 | 124.32 |
| 5 | 68.67 | 18 | 94.33 | 31 | 104.13 | 44 | 126.95 |
| 6 | 77.90 | 19 | 95.19 | 32 | 105.05 | 45 | 129.34 |
| 7 | 80.61 | 20 | 95.68 | 33 | 106.35 | 46 | 130.84 |
| 8 | 80.95 | 21 | 96.44 | 34 | 106.48 | 47 | 131.80 |
| 9 | 81.72 | 22 | 96.93 | 35 | 106.95 | 48 | 135.39 |
| 10 | 83.43 | 23 | 97.18 | 36 | 107.60 | 49 | 135.48 |
| 11 | 87.80 | 24 | 98.03 | 37 | 108.03 | 50 | 136.00 |
| 12 | 89.67 | 25 | 98.62 | 38 | 110.69 | ||
| 13 | 90.16 | 26 | 99.70 | 39 | 117.96 | | |
The quantile functions fitted by the maximum entropy, normal and lognormal models are shown in Figure 2.

Figure 2. Maximum entropy quantile function for the original data set. The probability plot shows the cohesion of rock discontinuities (kPa) against the probability of non-exceedance (0 to 1): the sample points rise from about 60 kPa to about 136 kPa, and the normal, lognormal and K = 4 (maximum entropy) fits increase across the probability range, with the lognormal curve rising most steeply near the highest probabilities.

The residual sum of squares (RSS) [equation (56)] is also used to compare the fits obtained from the different methods:

$$RSS = \sum_{i=1}^{n} \left(x_i - \hat{x}_i\right)^2 \tag{56}$$

where $x_i$ is the actual observed value for data point $i$ and $\hat{x}_i$ is the predicted value for data point $i$.
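The RSS comparison can be sketched in a few lines of Python. The text does not state which plotting positions were used to assign a non-exceedance probability to each ordered observation, so the Weibull positions $i/(n+1)$ used below are an assumption, as are the function names; the values reported in the paper may therefore not be reproduced exactly.

```python
def plotting_positions(n):
    """Weibull plotting positions i / (n + 1); an assumed choice."""
    return [i / (n + 1) for i in range(1, n + 1)]


def rss(observed, predicted):
    """Residual sum of squares between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def rss_of_fit(sample, quantile_function):
    """RSS of a fitted quantile function against the ordered sample."""
    x = sorted(sample)
    u = plotting_positions(len(x))
    return rss(x, [quantile_function(ui) for ui in u])
```

Any callable that maps a probability in (0, 1) to a value, such as a fitted normal, lognormal or maximum entropy quantile function, can be passed as `quantile_function`, so the same routine serves all three models.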
The RSS values for the normal, lognormal and optimal-order maximum entropy quantile functions are given in Table 6. The optimal-order maximum entropy quantile function yields by far the smallest RSS, indicating a closer fit to the sample than either the normal or the lognormal distribution.
Table 6. Comparison of the residual sum of squares (RSS)
| Distribution | Normal | Lognormal | Optimal order maximum entropy |
|---|---|---|---|
| RSS | 1049.43 | 2414.66 | 195.63 |
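In reliability analysis, the performance function for slopes is defined as:

$$g(\mathbf{X}) = FS(\mathbf{X}) - 1 \tag{57}$$

Equation (58) represents the two-dimensional limit equilibrium analysis, where the FoS is the ratio of the restoring forces to the disturbing forces. This equation is used as the first part of the limit state function for the reliability analysis in this example:

$$FS = \frac{cA + \left[W\left(\cos\psi_p - \alpha\sin\psi_p\right) - U - V\sin\psi_p + T\cos\theta\right]\tan\phi}{W\left(\sin\psi_p + \alpha\cos\psi_p\right) + V\cos\psi_p - T\sin\theta} \tag{58}$$

And:

$$A = \frac{H - z}{\sin\psi_p}, \qquad z = H\left(1 - \sqrt{\cot\psi_f \tan\psi_p}\right)$$

$$W = \frac{1}{2}\gamma_r H^{2}\left[\left(1 - \left(\frac{z}{H}\right)^{2}\right)\cot\psi_p - \cot\psi_f\right], \qquad U = \frac{1}{2}\gamma_w z_w A, \qquad V = \frac{1}{2}\gamma_w z_w^{2}$$

where $FS$ is the FoS of the slope sliding along the sheet joint; $W$ is the weight of the rock wedge resting on the failure surface (kN); $c$ is the cohesive strength along the sliding surface (kN/m²); $A$ is the base area of the wedge (m²); $\phi$ is the friction angle of the sliding surface (degree); $\psi_p$ is the angle of the failure surface from horizontal (degree); $\psi_f$ is the angle of the slope face from horizontal (degree); $\alpha$ is the horizontal earthquake acceleration (as a fraction of g); $U$ is the uplift force due to water pressure on the failure surface (kN); $V$ is the horizontal force due to water in the tension crack (kN); $T$ is the force applied by the anchor system (kN); $\theta$ is the inclination of the anchor, anti-clockwise from normal (degree); $z$ is the depth of the tension crack (m); $z_w$ is the depth of water in the tension crack (m); $\gamma_w$ is the unit weight of water (kN/m³); $\gamma_r$ is the unit weight of the granite rock (kN/m³); and $H$ is the height of the overall slope (m).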
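The limit equilibrium calculation can be illustrated with a short Python function. The sketch assumes the plane-failure expression for a slope with a water-filled tension crack as it is commonly written following Hoek and Bray (1974), with default arguments taken from Table 2; the function name and argument names are hypothetical, and forces are per metre of slope length.

```python
import math


def factor_of_safety(c, phi_deg, H=60.0, psi_f_deg=50.0, psi_p_deg=35.0,
                     alpha=0.08, gamma_r=26.0, gamma_w=9.81,
                     z=14.0, z_w=7.0, T=0.0, theta_deg=0.0):
    """Factor of safety for plane sliding with a water-filled tension crack.

    c is the cohesion (kN/m2), phi_deg the friction angle (degree) and
    alpha the horizontal earthquake acceleration as a fraction of g.
    """
    psi_p = math.radians(psi_p_deg)
    psi_f = math.radians(psi_f_deg)
    phi = math.radians(phi_deg)
    theta = math.radians(theta_deg)

    A = (H - z) / math.sin(psi_p)                      # base area of wedge
    W = 0.5 * gamma_r * H**2 * (
        (1.0 - (z / H) ** 2) / math.tan(psi_p) - 1.0 / math.tan(psi_f)
    )                                                  # weight of wedge
    U = 0.5 * gamma_w * z_w * A                        # uplift on failure surface
    V = 0.5 * gamma_w * z_w**2                         # water thrust in crack

    normal_force = (W * (math.cos(psi_p) - alpha * math.sin(psi_p))
                    - U - V * math.sin(psi_p) + T * math.cos(theta))
    resisting = c * A + normal_force * math.tan(phi)
    driving = (W * (math.sin(psi_p) + alpha * math.cos(psi_p))
               + V * math.cos(psi_p) - T * math.sin(theta))
    return resisting / driving


def performance(c, phi_deg):
    """Performance function g = FS - 1; failure corresponds to g < 0."""
    return factor_of_safety(c, phi_deg) - 1.0
```

At the mean values of Table 3 (c = 100 kN/m², ϕ = 35°), this sketch gives a factor of safety of about 1.22, so the performance function is positive at the mean point.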
The reliability analysis is conducted using a quantile-based first-order reliability method with the quantile functions obtained from maximum entropy [equation (53)], the normal distribution [equation (54)] and the lognormal distribution [equation (55)]; the results are provided in Table 7. A reference value of the reliability index is also calculated by using the MATLAB function norminv directly with the parameters of the normal distribution. This distribution is called the parent normal quantile function, because the random variable evaluated in this example is originally a normal random variable.
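The idea behind a quantile-based first-order reliability method can be sketched as follows. Each standard normal variable $z_i$ is mapped to a probability $u_i = \Phi(z_i)$ and then to a physical value through the quantile function of the corresponding variable, and the reliability index is the distance from the origin to the limit state surface in standard normal space. The Python sketch below uses the classical Hasofer-Lind and Rackwitz-Fiessler (HL-RF) iteration with finite-difference gradients; this is an assumed, generic implementation rather than the authors' MATLAB code, the names are hypothetical, and HL-RF may fail to converge for strongly nonlinear limit states. `NormalDist.inv_cdf` is the standard-library counterpart of norminv.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def form_beta(g, quantile_functions, tol=1e-6, max_iter=100, h=1e-6):
    """Reliability index by HL-RF iteration in standard normal space.

    g takes a list of physical values and returns the performance function;
    quantile_functions[i] maps a probability in (0, 1) to variable i.
    The mean point is assumed to lie in the safe region (g > 0).
    """
    n = len(quantile_functions)

    def g_of_z(z):
        x = [qf(_STD_NORMAL.cdf(zi)) for qf, zi in zip(quantile_functions, z)]
        return g(x)

    z = [0.0] * n
    for _ in range(max_iter):
        g0 = g_of_z(z)
        grad = []
        for i in range(n):  # central finite differences
            zp, zm = z[:], z[:]
            zp[i] += h
            zm[i] -= h
            grad.append((g_of_z(zp) - g_of_z(zm)) / (2.0 * h))
        norm2 = sum(d * d for d in grad)
        scale = (sum(d * zi for d, zi in zip(grad, z)) - g0) / norm2
        z_new = [scale * d for d in grad]  # HL-RF update
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(z_new, z)))
        z = z_new
        if shift < tol:
            break
    return math.sqrt(sum(zi * zi for zi in z))
```

Because the quantile functions enter only as callables, a fitted maximum entropy quantile function can be substituted for a normal or lognormal one without changing the rest of the procedure.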
Table 7. Reliability index (β) calculated from parent normal, normal, lognormal and maximum entropy quantile functions
| Distribution | Reliability index (β) |
|---|---|
| Parent normal QF | 1.6819 |
| Normal QF | 1.5520 |
| Lognormal QF | 1.7919 |
| Optimum order maximum entropy QF | 1.6217 |
Comparing the reliability indices in Table 7, the value obtained from the MEQF (1.6217) is the closest to the benchmark value from the parent normal quantile function (1.6819); the fitted normal and lognormal quantile functions give 1.5520 and 1.7919, respectively.
This closeness stems from the ability of the maximum entropy quantile function to represent the sample with higher precision than the normal and lognormal distributions.
The advantages of quantile functions obtained directly from PWMs can be described in several ways. First, in limited samples PWMs can be estimated more accurately than ordinary statistical moments of comparable order. When the available data are limited, ordinary higher-order moments may not be estimated reliably. PWMs address this limitation because they weight the ordered observations by powers of the non-exceedance probability rather than raising the observations themselves to higher powers, which allows higher-order PWMs to be estimated more precisely from small samples.
Furthermore, PWMs are resistant to the influence of outliers or extreme observations in the data set. This resilience is a consequence of their formulation as linear combinations of the observed data values. Unlike ordinary moments, which subject the data to operations such as squaring and cubing, PWMs maintain a more stable response to outliers. PWMs therefore offer a more reliable framework for statistical modeling, as they are less prone to distortions caused by extreme data points.
Taken together, these properties help explain why the optimal-order MEQF reproduces the sample, and hence the reference (parent normal) reliability index, more closely than the fitted normal and lognormal distributions.
5.4 Reliability analysis of slopes in the Nipigon River Landslide
A landslide took place in the vicinity of Nipigon, Ontario, Canada, during the early hours of April 23, 1990 (Figure 3). The landslide extended nearly 350 meters inland and had a maximum width of approximately 290 meters (Dodds et al., 1993).
Figure 3. Location of the landslide (Dodds et al., 1993). The location map shows the Nipigon River beside bedrock terrain and Lake Helen near Nipigon, with Highway 585 to the west, Highway 11 east of Lake Helen, the CNR line along the river and lake edge, a railway bridge, a pipeline crossing, the clear-cut area and failure zone near the river bend, and the limit of glaciolacustrine deposits.
In July 1991, three boreholes were drilled and used to collect undisturbed soil samples with a piston sampler. To gain insights into the soil layering and its undrained shear strength, electric piezocone and shear-vane tests were conducted. Additionally, pore pressure dissipation tests were carried out within clayey layers using the piezocone to estimate the coefficient of consolidation of the deposits in their natural state (Dodds et al., 1993). Based on the samples obtained from the boreholes, four main soil units were recognized: silty sand (upper and lower), firm clayey silt and soft to very soft clayey silt (Figure 4). The soil parameters for the different layers of the geological section, based on studies conducted by Dodds et al. (1993), are summarized in Table 8.
Figure 4. Geological section of the Nipigon River landslide (Fs = 1.08 for the failure surface, based on Dodds et al., 1993). The cross-section plots elevation (m) against distance (m) and shows the upper silty sand, firm clayey silt, soft to very soft clayey silt and lower silty sand layers, a curved slip surface and a water line rising slightly from left to right.
Table 8. Soil parameters (Dodds et al., 1993)
| Soil type | Unit weight (kN/m³) | Friction angle (degree) | Cohesion (kPa) |
|---|---|---|---|
| Silty sand (upper) | 17.6 | 30 | 53 |
| Clayey silt (firm) | 19 | 28 | 30 |
| Clayey silt (soft to very soft) | 18.2 | 28 | 0 |
| Silty sand (lower) | 18 | 30 | 10 |
The reliability analysis uses the cohesion of the upper silty sand layer as the random soil parameter, with the sample values listed in Table 9. These values were obtained from unconfined compression tests conducted by Barros et al. (2023).
Table 9. Undrained shear strength (kPa) of upper silty sand layer (Barros et al., 2023)
| No. | Undrained shear strength (kPa) | No. | Undrained shear strength (kPa) |
|---|---|---|---|
| 1 | 10.53 | 19 | 55.01 |
| 2 | 16.73 | 20 | 56.33 |
| 3 | 17.97 | 21 | 57.83 |
| 4 | 18.75 | 22 | 60.49 |
| 5 | 19.87 | 23 | 61.33 |
| 6 | 38.23 | 24 | 65.14 |
| 7 | 38.41 | 25 | 65.22 |
| 8 | 38.64 | 26 | 67.44 |
| 9 | 39.65 | 27 | 69.12 |
| 10 | 43.37 | 28 | 69.25 |
| 11 | 44.61 | 29 | 73.48 |
| 12 | 48.32 | 30 | 74.76 |
| 13 | 48.94 | 31 | 75.04 |
| 14 | 49.95 | 32 | 75.71 |
| 15 | 51.03 | 33 | 77.61 |
| 16 | 52.27 | 34 | 82.33 |
| 17 | 53.05 | 35 | 96.71 |
| 18 | 53.5 | | |
As in the previous example, the PWMs were computed using equation (45). Next, the MEQFs were determined by solving equation (51) with these PWMs. Finally, the optimal order of the MEQF was identified using the AIC method [equation (64)]; the resulting optimal-order MEQF is then applied in the reliability analysis:
Table 10 lists the residual sum of squares (RSS) for the normal, lognormal and optimal-order maximum entropy quantile functions. The optimal-order MEQF yields the lowest RSS (620.18, against 934.20 for the normal and 4895.84 for the lognormal) and therefore provides the closest fit to the data set (Figure 5).
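The PWM step that precedes the MEQF fit can be illustrated with a minimal sketch. Equation (45) is not reproduced in this section, so the code assumes the standard unbiased sample estimator of the probability-weighted moments β_r = E[X F(X)^r]; the function name `sample_pwms` and the constant `SU_KPA` are hypothetical, and the data are the Table 9 values.

```python
from math import comb

# Undrained shear strength sample (kPa) from Table 9 (Barros et al., 2023)
SU_KPA = [
    10.53, 16.73, 17.97, 18.75, 19.87, 38.23, 38.41, 38.64, 39.65,
    43.37, 44.61, 48.32, 48.94, 49.95, 51.03, 52.27, 53.05, 53.5,
    55.01, 56.33, 57.83, 60.49, 61.33, 65.14, 65.22, 67.44, 69.12,
    69.25, 73.48, 74.76, 75.04, 75.71, 77.61, 82.33, 96.71,
]


def sample_pwms(sample, max_order):
    """Return unbiased estimates b_0..b_max_order of beta_r = E[X F(X)^r].

    Assumed estimator (may differ from the paper's equation (45)):
        b_r = (1/n) * sum_j [C(j-1, r) / C(n-1, r)] * x_(j),
    where x_(1) <= ... <= x_(n) are the order statistics.
    """
    x = sorted(sample)
    n = len(x)
    if max_order >= n:
        raise ValueError("max_order must be smaller than the sample size")
    # enumerate() yields j - 1 for the j-th order statistic, so comb(j, r)
    # below is C(rank - 1, r); it is zero whenever rank - 1 < r.
    return [
        sum(comb(j, r) * xj for j, xj in enumerate(x)) / (n * comb(n - 1, r))
        for r in range(max_order + 1)
    ]


pwms = sample_pwms(SU_KPA, 3)
for order, value in enumerate(pwms):
    print(f"b_{order} = {value:.3f} kPa")
```

By construction, `b_0` equals the sample mean, and higher-order PWMs put progressively more weight on the largest observations, which is why they carry information about the upper tail.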
Comparison of the residual sum of squares (RSS)
| Distribution | Normal | Lognormal | Optimal order maximum entropy |
|---|---|---|---|
| RSS | 934.20 | 4895.84 | 620.18 |
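The RSS comparison can be sketched for the two parametric candidates as follows. This is an illustration under stated assumptions, not the authors' procedure: it assumes Weibull plotting positions i/(n + 1) and moment-based fitting of the normal and lognormal parameters, so it is not expected to reproduce the Table 10 values exactly. The names `plotting_positions` and `rss` are hypothetical, and the MEQF itself is omitted because its fitted coefficients are not given in this section.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev

# Undrained shear strength sample (kPa) from Table 9 (Barros et al., 2023)
SU_KPA = [
    10.53, 16.73, 17.97, 18.75, 19.87, 38.23, 38.41, 38.64, 39.65,
    43.37, 44.61, 48.32, 48.94, 49.95, 51.03, 52.27, 53.05, 53.5,
    55.01, 56.33, 57.83, 60.49, 61.33, 65.14, 65.22, 67.44, 69.12,
    69.25, 73.48, 74.76, 75.04, 75.71, 77.61, 82.33, 96.71,
]


def plotting_positions(n):
    """Weibull plotting positions i / (n + 1); an assumed choice."""
    return [i / (n + 1) for i in range(1, n + 1)]


def rss(sample, quantile_fn):
    """Residual sum of squares between order statistics and fitted quantiles."""
    x = sorted(sample)
    p = plotting_positions(len(x))
    return sum((xi - quantile_fn(pi)) ** 2 for xi, pi in zip(x, p))


# Normal QF fitted by the sample mean and standard deviation
normal = NormalDist(mean(SU_KPA), stdev(SU_KPA))

# Lognormal QF fitted by the mean and standard deviation of the log-data
logs = [log(v) for v in SU_KPA]
log_normal = NormalDist(mean(logs), stdev(logs))

rss_normal = rss(SU_KPA, normal.inv_cdf)
rss_lognormal = rss(SU_KPA, lambda p: exp(log_normal.inv_cdf(p)))

print(f"RSS normal:    {rss_normal:.2f}")
print(f"RSS lognormal: {rss_lognormal:.2f}")
```

Under these assumptions the lognormal fit is penalized mainly in the upper tail, where its quantiles grow much faster than the largest observations; this agrees qualitatively with the ranking reported in Table 10.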
Figure: Maximum entropy quantile function for the undrained shear strength of the upper silty sand layer from the Nipigon River landslide. The probability plot shows undrained shear strength (kPa) against probability of non-exceedance (0 to 1): sample points increasing from about 10 kPa to about 97 kPa, together with the normal, lognormal and K = 6 fitted curves. The normal and lognormal curves rise steeply near the highest probabilities.
To simplify the calculation of the FOS, which forms the first part of the limit state function [equation (57)], the explicit equation proposed by Low (1989) [equation (64)] is used. This equation was originally developed to calculate the FOS of an embankment on soft ground (Figure 6).
Figure: Embankment for the FOS calculation (Low, 1989). The diagram shows a sloped ground surface above a horizontal base, with the slope angle $\beta$ marked as $\cot\beta : 1$, the soil parameters $c_m$, $\phi_m$ and $\gamma$ inside the slope mass, a profile labelled $c_A$ on the right, the height $H$ and depth $D$, and a dashed trial limiting tangent crossing the lower section.
where $H$ is the height of the embankment, $c_A$ is the equivalent undrained shear strength in the foundation soil, $c_m$ is the cohesion of the embankment soil, $\phi_m$ is the friction angle of the embankment soil and $\gamma$ is the unit weight of the embankment soil (symbols as labelled in Figure 6). In addition, the stability number for the foundation soil, the stability number for the embankment soil and the associated coefficient can be calculated by the following equations:
The reliability analysis is performed with the quantile-based first-order reliability method for the quantile functions derived from the maximum entropy, normal and lognormal distributions, with results shown in Table 11. As in the previous example, the optimum-order MEQF gives the reliability index closest to that of the parent normal quantile function (β = 0.71501 against 0.7372), ahead of the fitted normal (0.7075) and lognormal (0.8454) quantile functions.
Reliability index (β) calculated from the parent normal, normal, lognormal and maximum entropy quantile functions for the Nipigon River landslide
| Distribution | Reliability index (β) |
|---|---|
| Parent normal QF | 0.7372 |
| Normal QF | 0.7075 |
| Lognormal QF | 0.8454 |
| Optimum order maximum entropy QF | 0.71501 |
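To give the reliability indices in Table 11 a more tangible scale, the sketch below converts each β into a nominal failure probability through the usual first-order relation P_f = Φ(−β). This relation is assumed here as the standard FORM approximation and is not quoted from this section; the names `BETA` and `failure_probability` are hypothetical.

```python
from statistics import NormalDist

# Reliability indices from Table 11
BETA = {
    "Parent normal QF": 0.7372,
    "Normal QF": 0.7075,
    "Lognormal QF": 0.8454,
    "Optimum order maximum entropy QF": 0.71501,
}


def failure_probability(beta):
    """First-order estimate P_f = Phi(-beta), Phi being the standard normal CDF."""
    return NormalDist().cdf(-beta)


pf = {name: failure_probability(beta) for name, beta in BETA.items()}

# Fitted model whose beta lies closest to the parent normal reference value
reference = BETA["Parent normal QF"]
closest = min(
    (name for name in BETA if name != "Parent normal QF"),
    key=lambda name: abs(BETA[name] - reference),
)

for name, value in pf.items():
    print(f"{name}: beta = {BETA[name]:.4f}, P_f = {value:.4f}")
print("Closest to parent:", closest)
```

Under this relation the nominal failure probabilities range from about 0.20 (lognormal QF) to about 0.24 (normal QF), with the parent normal QF at about 0.23. Such high values are consistent with a slope whose deterministic factor of safety is only 1.08.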
6. Discussion on challenges and future directions
The re-emergence of interest in the Maximum Entropy (MaxEnt) principle in geotechnical engineering indicates its potential for addressing core challenges in uncertainty quantification and reliability analysis. Traditional MaxEnt approaches have been shown to be limited in terms of data requirements, numerical stability in moment problems and tail behavior characterization. However, recent methodological developments, especially the incorporation of fractional moments, have greatly increased its practical applicability. Nonetheless, several significant challenges remain that must be resolved to realize the full potential of entropy-based methods in geotechnical applications.
The introduction of fractional moments marks a significant theoretical advancement, enabling more efficient distribution characterization by lowering moment requirements and improving statistical stability. However, this approach presents notable constraints, including mathematical incompatibility with negative-valued random variables and computational complexity in solving the associated non-convex optimization problems. Future research could explore hybrid methodologies that combine fractional moments with complementary approaches such as Bayesian inference frameworks and machine learning techniques. Such syntheses may offer more robust solutions to the optimization problems while maintaining theoretical rigor.
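The incompatibility with negative-valued variables can be seen directly from the definition of a sample fractional moment. The sketch below assumes the usual form (1/n) Σ x_i^α with a real, non-integer order α; the function name `fractional_moment` is hypothetical and the code is only an illustration of the constraint, not an implementation of any method reviewed here.

```python
def fractional_moment(sample, alpha):
    """Sample fractional moment (1/n) * sum(x_i ** alpha).

    Assumes real-valued data; for non-integer alpha the data must be
    non-negative, because x ** alpha is not a real number when x < 0.
    """
    if not sample:
        raise ValueError("sample must not be empty")
    if not float(alpha).is_integer() and any(x < 0 for x in sample):
        raise ValueError(
            "non-integer fractional moments require non-negative data"
        )
    return sum(x ** alpha for x in sample) / len(sample)


# A fractional order works for positive data ...
half_moment = fractional_moment([1.0, 4.0, 9.0], 0.5)  # (1 + 2 + 3) / 3

# ... but a negative value raised to a fractional power leaves the real line:
# in Python 3 this expression evaluates to a complex number.
root_of_negative = (-2.0) ** 0.5

print(half_moment)
print(type(root_of_negative).__name__)
```

In practice this is why variables that can take negative values (for example, a safety margin) generally need to be shifted or transformed before fractional moments are applied; the choice of transformation is a modeling decision outside the scope of this sketch.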
From a theoretical perspective, the extension of MaxEnt to multivariate distributions remains an important area for development. The existing literature treats univariate cases thoroughly; however, comprehensive frameworks for high-dimensional problems are still needed. Possible ways forward include the incorporation of copula theory or tensor decomposition methods, which could enable effective application to spatially correlated geotechnical parameters and coupled systems. Furthermore, the estimation of extreme values requires special attention, as geotechnical reliability often depends on accurate characterization of distribution tails. Enhanced approaches incorporating adaptive moment selection or physics-informed constraints could substantially improve probabilistic assessments of low-probability, high-consequence events.
The ongoing digital transformation in geotechnical engineering offers substantial opportunities for MaxEnt integration with emerging computational paradigms. Quantum computing architectures may offer solutions to the challenging optimization problems through quantum annealing algorithms. Implementation within digital twin frameworks could facilitate real-time updating of probabilistic models using continuous monitoring data. Furthermore, the integration of MaxEnt with explainable artificial intelligence techniques may bridge the gap between complex probabilistic models and practical engineering decision-making.
Several implementation challenges must be addressed to ensure robust application of MaxEnt methods. The fundamental trade-off between model complexity and generalizability requires careful consideration, especially for problems with limited data availability. Advanced computational strategies, including surrogate modeling and distributed computing architectures, may help overcome current limitations in computational efficiency. Furthermore, the development of standardized validation frameworks will be essential to ensure methodological transparency and reproducibility in practical applications.
The ongoing advancement of MaxEnt methods indicates a wider shift from deterministic to probabilistic approaches in geotechnical analysis. Future developments may enable autonomous probabilistic systems capable of adaptive model refinement, as well as more comprehensive treatment of emerging challenges such as climate change impacts on geotechnical systems. The establishment of collaborative research networks and open computational resources will be instrumental in advancing these objectives.
7. Conclusion
This review examined the diverse applications of the MaxEnt principle for uncertainty modeling in geotechnical engineering. The limitations of traditional methods solely reliant on fitting distributions to limited data were highlighted. MaxEnt was presented as a powerful alternative, incorporating additional constraints and maximizing information entropy. This approach ensures a more robust representation of uncertainty, particularly for the critical tails of distributions where data scarcity is prominent.
The theoretical framework of MaxEnt was explored, explaining its formulation for both single-variable and multi-variable scenarios. Various geotechnical studies that successfully implemented MaxEnt were then examined. These studies were categorized into three key areas:
Uncertainty Modeling as the Main Focus: Applications where MaxEnt serves as the primary tool for uncertainty quantification were explored.
Site Characterization and Data Scarcity: Methods by which MaxEnt can leverage limited data from geotechnical sites to extract maximum information, facilitating improved site characterization, were discussed.
Numerical Analyses for Geotechnical Problems: The integration of MaxEnt into numerical simulations for geotechnical engineering problems was investigated.
For each category, the reasoning behind the use of MaxEnt, its implementation strategies and the relevant engineering applications were detailed. This comprehensive analysis equips readers with a strong understanding of the diverse capabilities of MaxEnt in geotechnical engineering. Furthermore, a practical guide for researchers and practitioners interested in applying MaxEnt was provided. Various programming languages (R, Python, MATLAB, Julia) and their dedicated entropy packages were discussed, along with information on their accessibility (free or paid). Additionally, standalone software such as STAN and Gurobi was introduced, with an explanation of its functionality for implementing MaxEnt in geotechnical problems.
The Nipigon River Landslide case study offers a valuable context for evaluating the practical performance of the MaxEnt framework when applied to real geotechnical data, which is often limited in sample size, subject to inherent variability and affected by measurement noise. The findings revealed that the MaxEnt Quantile Function (MaxEnt QF) effectively captured the probabilistic characteristics of soil strength parameters and slope reliability, using a limited number of statistical moments. This approach outperformed conventional normal and lognormal models in representing skewness and tail behavior. Furthermore, the use of PWMs enhanced robustness in small-sample conditions, thereby mitigating sensitivity to outliers and data irregularities. This is a key advantage for site investigations where high-quality data are scarce. However, the case study also revealed practical challenges, such as the computational demands of entropy optimization and the necessity for careful constraint selection to prevent overfitting. These findings imply that although MaxEnt provides a theoretically robust and practically viable tool for uncertainty quantification, successful field application requires careful calibration and validation using diverse data sets.
The authors express their sincere gratitude to the Civil Engineering Department of Lakehead University for facilitating multiple visits to the Nipigon Slope, the case study of this paper. The department’s support in granting access to the site and providing project data was essential to the successful completion of this research. Furthermore, the authors acknowledge the use of chat.openai.com in proofreading and enhancing the manuscript’s clarity.

