Many problems in the natural sciences and engineering are also rife with sources of uncertainty. Computer experiments on computer simulations are the most common approach to studying problems in uncertainty quantification.[1] [2] [3] [4]
Combined occurrence and interaction of aleatoric and epistemic uncertainty: Aleatoric and epistemic uncertainty can also occur simultaneously in a single term, e.g. when experimental parameters exhibit aleatoric uncertainty and those experimental parameters are input to a computer simulation. If a surrogate model, e.g. a Gaussian process or a polynomial chaos expansion, is then learnt from computer experiments for the uncertainty quantification, this surrogate exhibits epistemic uncertainty that depends on, or interacts with, the aleatoric uncertainty of the experimental parameters.[4] Such an uncertainty can no longer be classified as solely aleatoric or epistemic, but is a more general inferential uncertainty.
In real-life applications, both kinds of uncertainty are present. Uncertainty quantification intends to express both types of uncertainty explicitly and separately. Quantifying the aleatoric uncertainties can be relatively straightforward, with traditional (frequentist) probability being the most basic form; techniques such as the Monte Carlo method are frequently used. A probability distribution can be represented by its moments (in the Gaussian case, the mean and covariance suffice, although, in general, even knowledge of all moments to arbitrarily high order does not specify the distribution function uniquely), or more recently, by techniques such as Karhunen–Loève and polynomial chaos expansions. To evaluate epistemic uncertainties, efforts are made to understand the (lack of) knowledge of the system, process or mechanism. Epistemic uncertainty is generally understood through the lens of Bayesian probability, where probabilities are interpreted as indicating how certain a rational person could be regarding a specific claim.
Mathematical perspective
In mathematics, uncertainty is often characterized in terms of a probability distribution. From that perspective, epistemic uncertainty means not being certain what the relevant probability distribution is, and aleatoric uncertainty means not being certain what a random sample drawn from a probability distribution will be.
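A minimal numerical illustration of this distinction (all numbers hypothetical): even if the distribution's parameters were known exactly, individual draws would still scatter (aleatoric uncertainty); with only a finite sample, the estimate of the mean is itself uncertain (epistemic uncertainty), and that uncertainty shrinks as more data arrive while the aleatoric spread does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" distribution: N(mu=3, sigma=2). In practice mu and
# sigma would be unknown (epistemic); individual draws vary regardless (aleatoric).
mu, sigma = 3.0, 2.0

# Aleatoric: draws scatter even when mu and sigma are known exactly.
draws = rng.normal(mu, sigma, size=10)

# Epistemic: the estimate of mu from n samples has standard error
# sigma / sqrt(n), which shrinks with more data; the aleatoric spread does not.
for n in (10, 1000):
    sample = rng.normal(mu, sigma, size=n)
    print(n, round(sample.mean(), 3), round(sigma / np.sqrt(n), 3))
```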
Types of problems
There are two major types of problems in uncertainty quantification: one is the forward propagation of uncertainty (where the various sources of uncertainty are propagated through the model to predict the overall uncertainty in the system response) and the other is the inverse assessment of model uncertainty and parameter uncertainty (where the model parameters are calibrated simultaneously using test data). There has been a proliferation of research on the former problem and a majority of uncertainty analysis techniques were developed for it. On the other hand, the latter problem is drawing increasing attention in the engineering design community, since uncertainty quantification of a model and the subsequent predictions of the true system response(s) are of great interest in designing robust systems.
Forward
See also: Propagation of uncertainty. Uncertainty propagation is the quantification of uncertainties in system output(s) propagated from uncertain inputs. It focuses on the influence on the outputs from the parametric variability listed in the sources of uncertainty. The targets of uncertainty propagation analysis can be:
- To evaluate low-order moments of the outputs, i.e. mean and variance.
- To evaluate the reliability of the outputs. This is especially useful in reliability engineering where outputs of a system are usually closely related to the performance of the system.
- To assess the complete probability distribution of the outputs. This is useful in the scenario of utility optimization where the complete distribution is used to calculate the utility.
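As a concrete sketch of the first target above, low-order moments can be estimated by plain Monte Carlo: sample the uncertain inputs, push each sample through the model, and take empirical moments of the outputs. The model `g` and the input distributions below are hypothetical stand-ins for a real simulation.

```python
import numpy as np

def g(x1, x2):
    # Hypothetical system response standing in for an expensive simulation.
    return x1**2 + np.sin(x2)

rng = np.random.default_rng(42)
n = 100_000

# Uncertain inputs: x1 ~ N(1, 0.1^2), x2 ~ U(0, pi)  (illustrative choices)
x1 = rng.normal(1.0, 0.1, n)
x2 = rng.uniform(0.0, np.pi, n)

# Forward propagation: low-order moments of the output.
y = g(x1, x2)
print(f"E[y] ≈ {y.mean():.3f}, Var[y] ≈ {y.var():.3f}")
```

For this toy model the moments are available in closed form (E[y] = 1.01 + 2/π), so the Monte Carlo estimate can be checked directly; for a black-box simulation only the sampling estimate is available.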
Inverse
See also: Inverse problem. Given some experimental measurements of a system and some computer simulation results from its mathematical model, inverse uncertainty quantification estimates the discrepancy between the experiment and the mathematical model (which is called bias correction), and estimates the values of unknown parameters in the model if there are any (which is called parameter calibration or simply calibration). Generally this is a much more difficult problem than forward uncertainty propagation; however it is of great importance since it is typically implemented in a model updating process. There are several scenarios in inverse uncertainty quantification:
Bias correction only
Bias correction quantifies the model inadequacy, i.e. the discrepancy between the experiment and the mathematical model. The general model updating formula for bias correction is:
y^e(x) = y^m(x) + \delta(x) + \varepsilon
where y^e(x) denotes the experimental measurements as a function of several input variables x, y^m(x) denotes the computer model (mathematical model) response, \delta(x) denotes the additive discrepancy function (aka bias function), and \varepsilon denotes the experimental uncertainty. The objective is to estimate the discrepancy function \delta(x); as a by-product, the resulting updated model is y^m(x) + \delta(x). A prediction confidence interval is provided with the updated model as the quantification of the uncertainty.
Parameter calibration only
Parameter calibration estimates the values of one or more unknown parameters in a mathematical model. The general model updating formulation for calibration is:
y^e(x) = y^m(x, \boldsymbol{\theta}^*) + \varepsilon
where y^m(x, \boldsymbol{\theta}) denotes the computer model response that depends on several unknown model parameters \boldsymbol{\theta}, and \boldsymbol{\theta}^* denotes the true values of the unknown parameters in the course of the experiments. The objective is either to estimate \boldsymbol{\theta}^*, or to come up with a probability distribution of \boldsymbol{\theta}^* that encompasses the best knowledge of the true parameter values.
Bias correction and parameter calibration
This scenario considers an inaccurate model with one or more unknown parameters; its model updating formulation combines bias correction and parameter calibration:
y^e(x) = y^m(x, \boldsymbol{\theta}^*) + \delta(x) + \varepsilon
It is the most comprehensive model updating formulation, which includes all possible sources of uncertainty, and it requires the most effort to solve.
Selective methodologies
Much research has been done to solve uncertainty quantification problems, though a majority of them deal with uncertainty propagation. During the past one to two decades, a number of approaches for inverse uncertainty quantification problems have also been developed and have proved to be useful for most small- to medium-scale problems.
Forward propagation
Existing uncertainty propagation approaches include probabilistic approaches and non-probabilistic approaches. There are basically six categories of probabilistic approaches for uncertainty propagation:[9]
- Simulation-based methods: Monte Carlo simulations, importance sampling, adaptive sampling, etc.
- General surrogate-based methods: In a non-intrusive approach, a surrogate model is learnt in order to replace the experiment or the simulation with a cheap and fast approximation. Surrogate-based methods can also be employed in a fully Bayesian fashion. [10] [4] [11] This approach has proven particularly powerful when the cost of sampling, e.g. of computationally expensive simulations, is prohibitively high.
- Local expansion-based methods: Taylor series, perturbation method, etc. These methods have advantages when dealing with relatively small input variability and outputs that don't express high nonlinearity. These linear or linearized methods are detailed in the article Uncertainty propagation.
- Functional expansion-based methods: Neumann expansion, orthogonal or Karhunen–Loève expansions (KLE), with polynomial chaos expansion (PCE) and wavelet expansions as special cases.
- Most probable point (MPP)-based methods: first-order reliability method (FORM) and second-order reliability method (SORM).
- Numerical integration-based methods: Full factorial numerical integration (FFNI) and dimension reduction (DR).
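The local expansion-based category above can be made concrete with a first-order (delta-method) sketch: linearize the model around the input mean and propagate the input covariance through the gradient. The function `f` and the numbers below are purely illustrative.

```python
import numpy as np

def f(x):
    # Illustrative nonlinear model of two inputs.
    return x[0] ** 2 * np.exp(x[1])

mu = np.array([1.0, 0.5])      # input means
cov = np.diag([0.01, 0.04])    # input covariance (independent inputs)

# First-order Taylor expansion: f(x) ≈ f(mu) + grad_f(mu) . (x - mu),
# hence E[f] ≈ f(mu) and Var[f] ≈ grad^T Cov grad.
eps = 1e-6
grad = np.array([
    (f(mu + [eps, 0]) - f(mu - [eps, 0])) / (2 * eps),  # central differences
    (f(mu + [0, eps]) - f(mu - [0, eps])) / (2 * eps),
])
mean_f = f(mu)
var_f = grad @ cov @ grad
print(mean_f, var_f)
```

As the list entry notes, this approximation is only trustworthy for small input variability and mildly nonlinear outputs; here the exact gradient (2·x₀·e^{x₁}, x₀²·e^{x₁}) confirms the finite-difference result.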
For non-probabilistic approaches, interval analysis,[12] Fuzzy theory, Possibility theory and evidence theory are among the most widely used.
The probabilistic approach is considered the most rigorous approach to uncertainty analysis in engineering design due to its consistency with the theory of decision analysis. Its cornerstone is the calculation of probability density functions for sampling statistics.[13] This can be performed rigorously for random variables that are obtainable as transformations of Gaussian variables, leading to exact confidence intervals.
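A small sketch of that last remark: if Y = exp(X) with X ~ N(μ, σ²), then, because exp() is monotone, quantiles transform exactly and a 95% interval for Y follows from the Gaussian interval endpoints with no sampling error (the values of μ and σ below are illustrative).

```python
import numpy as np

# X ~ N(mu, sigma^2), Y = exp(X): Y is lognormal, and since exp() is
# monotone, quantiles map through exactly -- no Monte Carlo needed.
mu, sigma = 0.0, 0.5
z = 1.959964  # 97.5% standard-normal quantile

lo, hi = np.exp(mu - z * sigma), np.exp(mu + z * sigma)
print(f"exact 95% interval for Y: [{lo:.3f}, {hi:.3f}]")
```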
Inverse uncertainty
Frequentist
In regression analysis and least squares problems, the standard error of parameter estimates is readily available, which can be expanded into a confidence interval.
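A minimal sketch of this statement for ordinary least squares: the parameter covariance is estimated as σ̂²(AᵀA)⁻¹, whose diagonal gives the standard errors and hence confidence intervals. The data below are synthetic and the normal quantile 1.96 is used for brevity (strictly, a Student-t quantile with n − 2 degrees of freedom applies).

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data from y = 2x - 1 plus noise (illustrative numbers).
x = np.linspace(0.0, 1.0, 40)
A = np.column_stack([x, np.ones_like(x)])        # design matrix [x, 1]
y = A @ np.array([2.0, -1.0]) + rng.normal(0, 0.1, x.size)

# Least squares fit and standard errors of the estimates.
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta
s2 = resid @ resid / (len(y) - A.shape[1])       # residual variance estimate
se = np.sqrt(s2 * np.diag(np.linalg.inv(A.T @ A)))

# Expand each standard error into an approximate 95% confidence interval.
for b, s in zip(beta, se):
    print(f"{b:.3f} ± {1.96 * s:.3f}")
```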
Bayesian
Several methodologies for inverse uncertainty quantification exist under the Bayesian framework. The most complicated direction is to aim at solving problems with both bias correction and parameter calibration. The challenges of such problems include not only the influences of model inadequacy and parameter uncertainty, but also the lack of data from both computer simulations and experiments. A common situation is that the input settings differ between experiments and simulations. Another common situation is that parameters derived from experiments are input to simulations. For computationally expensive simulations, a surrogate model, e.g. a Gaussian process or a polynomial chaos expansion, is then often necessary, defining an inverse problem of finding the surrogate model that best approximates the simulations.[4]
Modular approach
An approach to inverse uncertainty quantification is the modular Bayesian approach.[14] The modular Bayesian approach derives its name from its four-module procedure. In addition to the currently available data, a prior distribution of the unknown parameters should be assigned.
- Module 1: Gaussian process modeling for the computer model
To address the issue of a lack of simulation results, the computer model is replaced with a Gaussian process (GP) model
y^m(x, \boldsymbol{\theta}) \sim \mathcal{GP}\big(h^m(\cdot)^T \boldsymbol{\beta}^m,\; \sigma_m^2 R^m(\cdot, \cdot)\big)
where d is the dimension of the input variables and r is the dimension of the unknown parameters. While the regression functions h^m(\cdot) are pre-defined, \left\{\boldsymbol{\beta}^m, \sigma_m, \omega_k^m, k=1,\ldots,d+r\right\}, known as the hyperparameters of the GP model, need to be estimated via maximum likelihood estimation (MLE). This module can be considered as a generalized kriging method.
- Module 2: Gaussian process modeling for the discrepancy function
Similarly to the first module, the discrepancy function is replaced with a GP model
\delta(x) \sim \mathcal{GP}\big(h^\delta(\cdot)^T \boldsymbol{\beta}^\delta,\; \sigma_\delta^2 R^\delta(\cdot, \cdot)\big)
where h^\delta(\cdot) and R^\delta(\cdot, \cdot) are defined analogously to Module 1. Together with the prior distribution of the unknown parameters, and data from both computer models and experiments, one can derive the maximum likelihood estimates for \left\{\boldsymbol{\beta}^\delta, \sigma_\delta, \omega_k^\delta, k=1,\ldots,d\right\}. At the same time, the hyperparameter estimates from Module 1 get updated as well.
- Module 3: Posterior distribution of unknown parameters
Bayes' theorem is applied to calculate the posterior distribution of the unknown parameters:
p(\boldsymbol{\theta} \mid \text{data}, \boldsymbol{\varphi}) \propto p(\text{data} \mid \boldsymbol{\theta}, \boldsymbol{\varphi})\, p(\boldsymbol{\theta})
where \boldsymbol{\varphi} includes all the fixed hyperparameters from the previous modules.
- Module 4: Prediction of the experimental response and discrepancy function
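Module 1 in miniature: the sketch below fits a Gaussian process surrogate to a handful of runs of a cheap stand-in function, using a squared-exponential kernel with fixed, hand-picked hyperparameters instead of MLE, and a zero prior mean instead of the h(·)ᵀβ trend term, purely to keep the example short.

```python
import numpy as np

def simulator(x):
    # Cheap stand-in for an expensive computer model.
    return np.sin(3 * x) + x

# A few "simulation runs" as training data.
X = np.array([0.0, 0.3, 0.6, 0.9, 1.2])
y = simulator(X)

def k(a, b, ell=0.3, s2=1.0):
    # Squared-exponential covariance with hand-picked hyperparameters
    # (a real implementation would estimate these via MLE, as in Module 1).
    return s2 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

K = k(X, X) + 1e-8 * np.eye(len(X))   # jitter for numerical stability
Xs = np.linspace(0.0, 1.2, 7)         # prediction points
Ks = k(Xs, X)

# GP posterior mean and variance with zero prior mean.
alpha = np.linalg.solve(K, y)
mean = Ks @ alpha
var = np.diag(k(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T))
print(mean)
```

At the training points the surrogate interpolates the simulator almost exactly, and the posterior variance grows between and beyond them, which is the epistemic surrogate uncertainty discussed earlier.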
Full approach
The fully Bayesian approach requires that priors be assigned not only for the unknown parameters \boldsymbol{\theta} but also for the other hyperparameters \boldsymbol{\varphi}. It proceeds in the following steps:[15]
- Derive the posterior distribution p(\boldsymbol{\theta}, \boldsymbol{\varphi} \mid \text{data});
- Integrate \boldsymbol{\varphi} out and obtain p(\boldsymbol{\theta} \mid \text{data}). This single step accomplishes the calibration;
- Predict the experimental response and discrepancy function.
However, the approach has significant drawbacks:
- p(\boldsymbol{\theta}, \boldsymbol{\varphi} \mid \text{data}) is a highly intractable function of \boldsymbol{\varphi}, so the integration becomes very troublesome. Moreover, if priors for the other hyperparameters \boldsymbol{\varphi} are not carefully chosen, the complexity of the numerical integration increases even more.
- In the prediction stage, the prediction (which should at least include the expected value of the system responses) also requires numerical integration. Markov chain Monte Carlo (MCMC) is often used for this integration; however, it is computationally expensive.
The fully Bayesian approach requires a huge amount of calculations and may not yet be practical for dealing with the most complicated modelling situations.
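The MCMC integration mentioned above, in its simplest form: a random-walk Metropolis sampler for the posterior of a single calibration parameter θ under a Gaussian likelihood with known noise level and a flat prior. The model, data, and tuning constants are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

# Model y_m(x, theta) = theta * x; synthetic "experimental" data with
# true parameter 1.5 and measurement noise of standard deviation 0.1.
x = np.linspace(0.0, 1.0, 30)
y_e = 1.5 * x + rng.normal(0.0, 0.1, x.size)

def log_post(theta):
    # Gaussian likelihood with known noise sd 0.1, flat prior on theta.
    r = y_e - theta * x
    return -0.5 * np.sum(r**2) / 0.1**2

# Random-walk Metropolis sampling.
theta, samples = 0.0, []
lp = log_post(theta)
for _ in range(20_000):
    prop = theta + rng.normal(0.0, 0.1)       # symmetric proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject step
        theta, lp = prop, lp_prop
    samples.append(theta)

post = np.array(samples[5_000:])              # discard burn-in
print(post.mean(), post.std())
```

Even in this one-dimensional toy problem, 20,000 model evaluations are needed; with an expensive simulator in place of `theta * x`, this cost is exactly why surrogate models enter the picture.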
Known issues
The theories and methodologies for uncertainty propagation are much better established, compared with inverse uncertainty quantification. For the latter, several difficulties remain unsolved:
- Dimensionality issue: The computational cost increases dramatically with the dimensionality of the problem, i.e. the number of input variables and/or the number of unknown parameters.
- Identifiability issue:[16] Multiple combinations of unknown parameters and discrepancy function can yield the same experimental prediction. Hence different values of parameters cannot be distinguished/identified. This issue is circumvented in a Bayesian approach, where such combinations are averaged over.[4]
- Incomplete model response: Refers to a model not having a solution for some combinations of the input variables.[17] [18]
- Quantifying uncertainty in the input quantities: Crucial events missing in the available data or critical quantities unidentified to analysts due to, e.g., limitations in existing models.[19]
- Little consideration of the impact of choices made by analysts.[20]
See also
Notes and References
- Sacks, Jerome; Welch, William J.; Mitchell, Toby J.; Wynn, Henry P. (1989). "Design and Analysis of Computer Experiments". Statistical Science 4 (4): 409–423. doi:10.1214/ss/1177012413.
- Iman, Ronald L.; Helton, Jon C. (1988). "An Investigation of Uncertainty and Sensitivity Analysis Techniques for Computer Models". Risk Analysis 8 (1): 71–90. doi:10.1111/j.1539-6924.1988.tb01155.x.
- Walker, W.E.; Harremoës, P.; Rotmans, J.; van der Sluijs, J.P.; van Asselt, M.B.A.; Janssen, P.; Krayer von Krauss, M.P. (2003). "Defining Uncertainty: A Conceptual Basis for Uncertainty Management in Model-Based Decision Support". Integrated Assessment 4 (1): 5–17. doi:10.1076/iaij.4.1.5.16466.
- Ranftl, Sascha; von der Linden, Wolfgang (2021). "Bayesian Surrogate Analysis and Uncertainty Propagation". Physical Sciences Forum 3 (1): 6. doi:10.3390/psf2021003006. arXiv:2101.04038.
- Kennedy, Marc C.; O'Hagan, Anthony (2001). "Bayesian calibration of computer models". Journal of the Royal Statistical Society, Series B (Statistical Methodology) 63 (3): 425–464. doi:10.1111/1467-9868.00294.
- Der Kiureghian, Armen; Ditlevsen, Ove (2009). "Aleatory or epistemic? Does it matter?". Structural Safety 31 (2): 105–112. doi:10.1016/j.strusafe.2008.06.020.
- Matthies, Hermann G. (2007). "Quantifying Uncertainty: Modern Computational Representation of Probability and Applications". In Extreme Man-Made and Natural Hazards in Dynamics of Structures. NATO Security through Science Series, pp. 105–135. doi:10.1007/978-1-4020-5656-7_4. ISBN 978-1-4020-5654-3.
- Indrayan, Abhaya (2008). Medical Biostatistics, Second Edition. Chapman & Hall/CRC Press, pp. 8, 673.
- Lee, S. H.; Chen, W. (2008). "A comparative study of uncertainty propagation methods for black-box-type problems". Structural and Multidisciplinary Optimization 37 (3): 239–253. doi:10.1007/s00158-008-0234-7.
- Cardenas, I.C. (2019). "On the use of Bayesian networks as a meta-modeling approach to analyse uncertainties in slope stability analysis". Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards 13 (1): 53–65. doi:10.1080/17499518.2018.1498524.
- Ranftl, Sascha; Melito, Gian Marco; Badeli, Vahid; Reinbacher-Köstinger, Alice; Ellermann, Katrin; von der Linden, Wolfgang (2019). "Bayesian Uncertainty Quantification with Multi-Fidelity Data and Gaussian Processes for Impedance Cardiography of Aortic Dissection". Entropy 22 (1): 58. doi:10.3390/e22010058.
- Jaulin, L.; Kieffer, M.; Didrit, O.; Walter, E. (2001). Applied Interval Analysis. Springer. ISBN 1-85233-219-0.
- Arnaut, L. R. (2008). Measurement uncertainty in reverberation chambers – I. Sample statistics. Technical report TQE 2, 2nd ed., sec. 3.1. National Physical Laboratory.
- Kennedy, Marc C.; O'Hagan, Anthony (2000). Supplementary Details on Bayesian Calibration of Computer Models. Sheffield: University of Sheffield, pp. 1–13.
- Bayarri, M. J.; Berger, J. O.; Liu, F. (2009). "Modularization in Bayesian analysis, with emphasis on analysis of computer models". Bayesian Analysis 4 (1): 119–150. doi:10.1214/09-ba404.
- Arendt, Paul D.; Apley, Daniel W.; Chen, Wei; Lamb, David; Gorsich, David (2012). "Improving Identifiability in Model Calibration Using Multiple Responses". Journal of Mechanical Design 134 (10): 100909. doi:10.1115/1.4007573.
- Cardenas, I.C. (2019). "On the use of Bayesian networks as a meta-modeling approach to analyse uncertainties in slope stability analysis". Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards 13 (1): 53–65. doi:10.1080/17499518.2018.1498524.
- van den Eijnden, A.P.; Schweckendiek, T.; Hicks, M.A. (2021). "Metamodelling for geotechnical reliability analysis with noisy and incomplete models". Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards 16 (3): 518–535. doi:10.1080/17499518.2021.1952611.
- Cardenas, I.; Aven, T.; Flage, R. (2023). "Addressing challenges in uncertainty quantification. The case of geohazard assessments". Geoscientific Model Development 16 (6): 1601–1615. doi:10.5194/gmd-16-1601-2023.
- Cardenas, I.; Aven, T.; Flage, R. (2023). "Addressing challenges in uncertainty quantification. The case of geohazard assessments". Geoscientific Model Development 16 (6): 1601–1615. doi:10.5194/gmd-16-1601-2023.