In probability theory and statistics, a copula is a multivariate cumulative distribution function for which the marginal probability distribution of each variable is uniform on the interval [0, 1]. Copulas are used to describe and model the dependence (inter-correlation) between random variables.[1] Their name, introduced by applied mathematician Abe Sklar in 1959, comes from the Latin for "link" or "tie", similar but unrelated to grammatical copulas in linguistics. Copulas have been used widely in quantitative finance to model and minimize tail risk[2] and in portfolio-optimization applications.[3]
Sklar's theorem states that any multivariate joint distribution can be written in terms of univariate marginal distribution functions and a copula which describes the dependence structure between the variables.
Copulas are popular in high-dimensional statistical applications because they allow one to model and estimate the distribution of random vectors by estimating the marginals and the copula separately. Many parametric copula families are available, usually with parameters that control the strength of dependence. Some popular parametric copula models are outlined below.
Two-dimensional copulas are known in some other areas of mathematics under the names permutons and doubly-stochastic measures.
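Consider a random vector $(X_1,X_2,\dots,X_d)$. Suppose its marginals are continuous, i.e. the marginal cumulative distribution functions $F_i(x)=\Pr[X_i\leq x]$ are continuous functions. By applying the probability integral transform to each component, the random vector
$$(U_1,U_2,\dots,U_d)=\left(F_1(X_1),F_2(X_2),\dots,F_d(X_d)\right)$$
has marginals that are uniformly distributed on the interval [0, 1].
The copula of $(X_1,X_2,\dots,X_d)$ is defined as the joint cumulative distribution function of $(U_1,U_2,\dots,U_d)$:
$$C(u_1,u_2,\dots,u_d)=\Pr[U_1\leq u_1,U_2\leq u_2,\dots,U_d\leq u_d].$$
The copula C contains all information on the dependence structure between the components of $(X_1,X_2,\dots,X_d)$, whereas the marginal cumulative distribution functions $F_i$ contain all information on the marginal distributions of the $X_i$.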
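The reverse of these steps can be used to generate pseudo-random samples from general classes of multivariate probability distributions. That is, given a procedure to generate a sample $(U_1,U_2,\dots,U_d)$ from the copula, the required sample can be constructed as
$$(X_1,X_2,\dots,X_d)=\left(F_1^{-1}(U_1),F_2^{-1}(U_2),\dots,F_d^{-1}(U_d)\right).$$
The generalized inverses $F_i^{-1}$ are unproblematic almost surely, since the $F_i$ were assumed to be continuous. Furthermore, the above formula for the copula can be rewritten as
$$C(u_1,u_2,\dots,u_d)=\Pr[X_1\leq F_1^{-1}(u_1),X_2\leq F_2^{-1}(u_2),\dots,X_d\leq F_d^{-1}(u_d)].$$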
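The sampling recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the copula is taken to be a bivariate Gaussian copula with correlation `rho`, the marginals are taken to be exponential, and all function names (`sample_copula_pair`, `exponential_quantile`, `sample_joint`) are hypothetical helpers, not part of any library.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the probability integral transform


def sample_copula_pair(rho, rng):
    """Draw (U1, U2) from a bivariate Gaussian copula with correlation rho (assumed copula)."""
    z1 = rng.gauss(0.0, 1.0)
    # Build a second standard normal with correlation rho to z1.
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Applying the normal CDF makes each component uniform on [0, 1].
    return STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)


def exponential_quantile(u, rate):
    """Inverse CDF F^{-1}(u) of an exponential distribution (assumed marginal)."""
    u = min(u, 1.0 - 2.0 ** -53)  # guard against log(0) if u rounds to 1.0
    return -math.log(1.0 - u) / rate


def sample_joint(n, rho, rate1, rate2, seed=0):
    """Return n draws (X1, X2) = (F1^{-1}(U1), F2^{-1}(U2))."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        u1, u2 = sample_copula_pair(rho, rng)
        sample.append((exponential_quantile(u1, rate1), exponential_quantile(u2, rate2)))
    return sample


sample = sample_joint(20000, rho=0.7, rate1=1.0, rate2=2.0)
mean_x1 = sum(x1 for x1, _ in sample) / len(sample)
mean_x2 = sum(x2 for _, x2 in sample) / len(sample)
```

The sample means should be close to the exponential means 1/rate1 and 1/rate2 whatever the value of `rho`, because the copula only affects the dependence between the components and not their marginal distributions.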
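In probabilistic terms, $C:[0,1]^d\to[0,1]$ is a d-dimensional copula if C is a joint cumulative distribution function of a d-dimensional random vector on the unit cube $[0,1]^d$ with uniform marginals.
In analytic terms, $C:[0,1]^d\to[0,1]$ is a d-dimensional copula if
$C(u_1,\dots,u_{i-1},0,u_{i+1},\dots,u_d)=0$, i.e. the copula is zero if any one of the arguments is zero;
$C(1,\dots,1,u,1,\dots,1)=u$, i.e. the copula is equal to u if one argument is u and all others are 1;
C is d-non-decreasing, i.e. for each hyperrectangle $B=\prod_{i=1}^{d}[x_i,y_i]\subseteq[0,1]^d$ the C-volume of B is non-negative:
$$\int_B \mathrm{d}C(u)=\sum_{\mathbf{z}\in\prod_{i=1}^{d}\{x_i,y_i\}}(-1)^{N(\mathbf{z})}C(\mathbf{z})\geq 0,$$
where $N(\mathbf{z})=\#\{k:z_k=x_k\}$.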
For instance, in the bivariate case, $C:[0,1]\times[0,1]\to[0,1]$ is a bivariate copula if $C(0,u)=C(u,0)=0$, $C(1,u)=C(u,1)=u$ and
$$C(u_2,v_2)-C(u_2,v_1)-C(u_1,v_2)+C(u_1,v_1)\geq 0$$
for all $0\leq u_1\leq u_2\leq 1$ and $0\leq v_1\leq v_2\leq 1$.
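The three bivariate conditions can be checked numerically on a grid. The sketch below is a finite check rather than a proof, and the helper name `is_bivariate_copula_on_grid` and the test functions are illustrative assumptions. The last candidate, $uv[1+3(1-u)(1-v)]$, is chosen because it satisfies both boundary conditions but assigns negative volume to some rectangles.

```python
def is_bivariate_copula_on_grid(C, m=20, tol=1e-12):
    """Check the bivariate copula conditions at the grid points i/m (finite check only)."""
    grid = [i / m for i in range(m + 1)]
    for u in grid:
        # Zero whenever one argument is zero: C(0, u) = C(u, 0) = 0.
        if abs(C(0.0, u)) > tol or abs(C(u, 0.0)) > tol:
            return False
        # Uniform margins: C(1, u) = C(u, 1) = u.
        if abs(C(1.0, u) - u) > tol or abs(C(u, 1.0) - u) > tol:
            return False
    # Rectangle inequality on every grid cell; volumes are additive, so this
    # covers every rectangle whose corners are grid points.
    for i in range(m):
        for j in range(m):
            u1, u2 = grid[i], grid[i + 1]
            v1, v2 = grid[j], grid[j + 1]
            volume = C(u2, v2) - C(u2, v1) - C(u1, v2) + C(u1, v1)
            if volume < -tol:
                return False
    return True


def independence(u, v):
    return u * v


def comonotone(u, v):
    return min(u, v)


def countermonotone(u, v):
    return max(u + v - 1.0, 0.0)


def not_a_copula(u, v):
    # Correct boundary values, but negative volume near the corners (0, 1) and (1, 0).
    return u * v * (1.0 + 3.0 * (1.0 - u) * (1.0 - v))


results = {f.__name__: is_bivariate_copula_on_grid(f)
           for f in (independence, comonotone, countermonotone, not_a_copula)}
```

Passing the grid check does not establish that a function is a copula, but failing it is enough to rule one out.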
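Sklar's theorem, named after Abe Sklar, provides the theoretical foundation for the application of copulas. Sklar's theorem states that every multivariate cumulative distribution function
$$H(x_1,\dots,x_d)=\Pr[X_1\leq x_1,\dots,X_d\leq x_d]$$
of a random vector $(X_1,X_2,\dots,X_d)$ can be expressed in terms of its marginals $F_i(x_i)=\Pr[X_i\leq x_i]$ and a copula $C$:
$$H(x_1,\dots,x_d)=C\left(F_1(x_1),\dots,F_d(x_d)\right).$$
If the multivariate distribution has a density $h$, then it also holds that
$$h(x_1,\dots,x_d)=c(F_1(x_1),\dots,F_d(x_d))\cdot f_1(x_1)\cdots f_d(x_d),$$
where $c$ is the density of the copula and the $f_i$ are the marginal densities.
The theorem also states that, given $H$, the copula is unique on $\operatorname{Ran}(F_1)\times\cdots\times\operatorname{Ran}(F_d)$, the cartesian product of the ranges of the marginal cumulative distribution functions. This implies that the copula is unique if the marginals $F_i$ are continuous.
The converse is also true: given a copula $C:[0,1]^d\to[0,1]$ and marginals $F_i(x)$, the function $C\left(F_1(x_1),\dots,F_d(x_d)\right)$ defines a d-dimensional cumulative distribution function with marginal distributions $F_i(x)$.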
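The converse direction of Sklar's theorem can be illustrated numerically: plugging marginal distribution functions into a copula gives a joint distribution function with exactly those marginals. The sketch below assumes, purely for illustration, a Clayton copula with parameter 2 and exponential marginals with rates 1 and 0.5; the function names are hypothetical.

```python
import math


def clayton_copula(u, v, theta=2.0):
    """Bivariate Clayton copula for theta > 0 (illustrative choice of copula)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def exponential_cdf(x, rate):
    """Marginal CDF of an exponential distribution (assumed marginal)."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0


def joint_cdf(x, y):
    """H(x, y) = C(F1(x), F2(y)) with the assumed copula and marginals."""
    return clayton_copula(exponential_cdf(x, 1.0), exponential_cdf(y, 0.5))


# Letting one argument tend to infinity recovers the other marginal.
marginal_from_joint = joint_cdf(1.0, math.inf)
marginal_direct = exponential_cdf(1.0, 1.0)
```

Swapping in different marginals changes `joint_cdf` without touching the copula, which is the separation of marginals and dependence that the theorem describes.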
Copulas mainly work when time series are stationary[4] and continuous.[5] An important pre-processing step is therefore to check the time series for auto-correlation, trend and seasonality.
When time series are auto-correlated, they may produce a spurious dependence between sets of variables and result in an incorrect copula dependence structure.[6]
The Fréchet–Hoeffding theorem (after Maurice René Fréchet and Wassily Hoeffding[7]) states that for any copula $C:[0,1]^d\to[0,1]$ and any $(u_1,\dots,u_d)\in[0,1]^d$ the following bounds hold:
$$W(u_1,\dots,u_d)\leq C(u_1,\dots,u_d)\leq M(u_1,\dots,u_d).$$
The function W is called the lower Fréchet–Hoeffding bound and is defined as
$$W(u_1,\ldots,u_d)=\max\left\{1-d+\sum_{i=1}^{d}u_i,\,0\right\}.$$
The function M is called the upper Fréchet–Hoeffding bound and is defined as
$$M(u_1,\ldots,u_d)=\min\{u_1,\dots,u_d\}.$$
The upper bound is sharp: M is always a copula, and it corresponds to comonotone random variables.
The lower bound is point-wise sharp, in the sense that for fixed u there is a copula $\tilde{C}$ such that $\tilde{C}(u)=W(u)$.
In two dimensions, i.e. the bivariate case, the Fréchet–Hoeffding theorem states
$$\max\{u+v-1,0\}\leq C(u,v)\leq\min\{u,v\}.$$
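As a quick numerical illustration of the bounds, the following sketch evaluates W and M on a grid and confirms that the independence copula (the product of its arguments) lies between them. The function names and the grid resolution are assumptions made for this example.

```python
import itertools
import math


def lower_bound(u):
    """Lower Fréchet–Hoeffding bound W(u_1, ..., u_d)."""
    return max(1.0 - len(u) + sum(u), 0.0)


def upper_bound(u):
    """Upper Fréchet–Hoeffding bound M(u_1, ..., u_d)."""
    return min(u)


def independence_copula(u):
    """Product copula, used here as the copula under test."""
    return math.prod(u)


def bounds_hold(copula, d, m=10, tol=1e-12):
    """Check W <= C <= M at every point of the grid {0, 1/m, ..., 1}^d."""
    grid = [i / m for i in range(m + 1)]
    return all(
        lower_bound(u) - tol <= copula(u) <= upper_bound(u) + tol
        for u in itertools.product(grid, repeat=d)
    )


holds_in_2d = bounds_hold(independence_copula, d=2)
holds_in_3d = bounds_hold(independence_copula, d=3)
```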
Several families of copulas have been described.
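The Gaussian copula is a distribution over the unit hypercube $[0,1]^d$. It is constructed from a multivariate normal distribution over $\mathbb{R}^d$ by using the probability integral transform.
For a given correlation matrix $R\in[-1,1]^{d\times d}$, the Gaussian copula with parameter matrix $R$ can be written as
$$C_R^{\text{Gauss}}(u)=\Phi_R\left(\Phi^{-1}(u_1),\dots,\Phi^{-1}(u_d)\right),$$
where $\Phi^{-1}$ is the inverse cumulative distribution function of a standard normal and $\Phi_R$ is the joint cumulative distribution function of a multivariate normal distribution with mean vector zero and covariance matrix equal to the correlation matrix $R$. There is no simple closed-form expression for the copula function $C_R^{\text{Gauss}}(u)$, but its density can be written as
$$c_R^{\text{Gauss}}(u)=\frac{1}{\sqrt{\det R}}\exp\left(-\frac{1}{2}\begin{pmatrix}\Phi^{-1}(u_1)\\ \vdots\\ \Phi^{-1}(u_d)\end{pmatrix}^{T}\left(R^{-1}-I\right)\begin{pmatrix}\Phi^{-1}(u_1)\\ \vdots\\ \Phi^{-1}(u_d)\end{pmatrix}\right),$$
where $I$ is the identity matrix.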
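For two dimensions the density formula can be evaluated by hand, since $R$ has a single off-diagonal entry $\rho$, $\det R=1-\rho^2$, and the quadratic form reduces to $(\rho^2(x^2+y^2)-2\rho xy)/(1-\rho^2)$ with $x=\Phi^{-1}(u)$ and $y=\Phi^{-1}(v)$. The sketch below implements this special case with the Python standard library; the function name is an assumption, and the arguments must lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density for 0 < u, v < 1 and -1 < rho < 1."""
    std_normal = NormalDist()
    x = std_normal.inv_cdf(u)  # raises StatisticsError if u is 0 or 1
    y = std_normal.inv_cdf(v)
    det = 1.0 - rho * rho  # determinant of [[1, rho], [rho, 1]]
    # Quadratic form x^T (R^{-1} - I) x written out for d = 2.
    quad = (rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / det
    return math.exp(-0.5 * quad) / math.sqrt(det)


density_independent = gaussian_copula_density(0.3, 0.8, rho=0.0)
density_center = gaussian_copula_density(0.5, 0.5, rho=0.6)
```

With `rho=0.0` the matrix $R^{-1}-I$ vanishes and the density is identically 1, which is the density of the independence copula. At the center point both quantiles are zero, so the value is $1/\sqrt{1-\rho^2}$.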
Archimedean copulas are an associative class of copulas. Most common Archimedean copulas admit an explicit formula, something that is not possible, for instance, for the Gaussian copula. In practice, Archimedean copulas are popular because they allow modeling dependence in arbitrarily high dimensions with only one parameter, which governs the strength of dependence.
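A copula C is called Archimedean if it admits the representation[11]
$$C(u_1,\dots,u_d;\theta)=\psi^{[-1]}\left(\psi(u_1;\theta)+\cdots+\psi(u_d;\theta);\theta\right)$$
where $\psi:[0,1]\times\Theta\to[0,\infty)$ is a continuous, strictly decreasing and convex function such that $\psi(1;\theta)=0$, $\theta$ is a parameter within some parameter space $\Theta$, and $\psi$ is the so-called generator function. $\psi^{[-1]}$ is its pseudo-inverse, defined by
$$\psi^{[-1]}(t;\theta)=\left\{\begin{array}{ll}\psi^{-1}(t;\theta)&\text{if }0\leq t\leq\psi(0;\theta)\\ 0&\text{if }\psi(0;\theta)\leq t\leq\infty.\end{array}\right.$$
Moreover, the above formula for C yields a copula for $\psi^{-1}$ if and only if $\psi^{-1}$ is d-monotone on $[0,\infty)$. That is, if it is $d-2$ times differentiable and the derivatives satisfy
$$(-1)^k\psi^{-1,(k)}(t;\theta)\geq 0$$
for all $t\geq 0$ and $k=0,1,\dots,d-2$, and $(-1)^{d-2}\psi^{-1,(d-2)}(t;\theta)$ is nonincreasing and convex.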
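The Archimedean representation translates directly into code: sum the generator values and apply the inverse. The sketch below assumes the Clayton generator $\psi(t)=(t^{-\theta}-1)/\theta$ with $\theta>0$ as an illustrative choice. For that case $\psi(0)=\infty$, so the pseudo-inverse coincides with the ordinary inverse. The function names are hypothetical.

```python
def clayton_generator(t, theta):
    """Clayton generator psi(t) = (t^(-theta) - 1) / theta, for theta > 0 and 0 < t <= 1."""
    return (t ** -theta - 1.0) / theta


def clayton_generator_inverse(s, theta):
    """Inverse generator psi^{-1}(s) = (1 + theta * s)^(-1/theta), for s >= 0."""
    return (1.0 + theta * s) ** (-1.0 / theta)


def archimedean_copula(u, generator, generator_inverse, theta):
    """C(u_1, ..., u_d) = psi^{-1}(psi(u_1) + ... + psi(u_d)) for any dimension d."""
    return generator_inverse(sum(generator(u_i, theta) for u_i in u), theta)


theta = 2.0
value_2d = archimedean_copula((0.3, 0.7), clayton_generator, clayton_generator_inverse, theta)
closed_form_2d = (0.3 ** -theta + 0.7 ** -theta - 1.0) ** (-1.0 / theta)
value_3d_margin = archimedean_copula((0.4, 1.0, 1.0), clayton_generator, clayton_generator_inverse, theta)
```

Because $\psi(1)=0$, setting all but one argument to 1 returns that argument, which is the uniform-margin property of a copula.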
The following table highlights the most prominent bivariate Archimedean copulas, with their corresponding generators. Not all of them are completely monotone, i.e. d-monotone for all $d\in\mathbb{N}$, or d-monotone for certain $\theta\in\Theta$ only.
Name of copula | Bivariate copula $C_\theta(u,v)$ | Parameter $\theta$ | Generator $\psi_\theta(t)$ | Generator inverse $\psi_\theta^{-1}(t)$
---|---|---|---|---
Ali–Mikhail–Haq | $\frac{uv}{1-\theta(1-u)(1-v)}$ | $\theta\in[-1,1]$ | $\log\left[\frac{1-\theta(1-t)}{t}\right]$ | $\frac{1-\theta}{\exp(t)-\theta}$
Clayton[13] | $\left[\max\left\{u^{-\theta}+v^{-\theta}-1;0\right\}\right]^{-1/\theta}$ | $\theta\in[-1,\infty)\backslash\{0\}$ | $\frac{1}{\theta}\left(t^{-\theta}-1\right)$ | $\left(1+\theta t\right)^{-1/\theta}$
Frank | $-\frac{1}{\theta}\log\left[1+\frac{(\exp(-\theta u)-1)(\exp(-\theta v)-1)}{\exp(-\theta)-1}\right]$ | $\theta\in\mathbb{R}\backslash\{0\}$ | $-\log\left(\frac{\exp(-\theta t)-1}{\exp(-\theta)-1}\right)$ | $-\frac{1}{\theta}\log(1+\exp(-t)(\exp(-\theta)-1))$
Gumbel | $\exp\left[-\left((-\log(u))^{\theta}+(-\log(v))^{\theta}\right)^{1/\theta}\right]$ | $\theta\in[1,\infty)$ | $\left(-\log(t)\right)^{\theta}$ | $\exp\left(-t^{1/\theta}\right)$
Independence | $uv$ | | $-\log(t)$ | $\exp(-t)$
Joe | $1-\left[(1-u)^{\theta}+(1-v)^{\theta}-(1-u)^{\theta}(1-v)^{\theta}\right]^{1/\theta}$ | $\theta\in[1,\infty)$ | $-\log\left(1-(1-t)^{\theta}\right)$ | $1-\left(1-\exp(-t)\right)^{1/\theta}$

Expectation for copula models and Monte Carlo integration
In statistical applications, many problems can be formulated in the following way. One is interested in the expectation of a response function $g:\mathbb{R}^d\to\mathbb{R}$ applied to some random vector $(X_1,\dots,X_d)$. If we denote the cumulative distribution function of this random vector by $H$, the quantity of interest can be written as
$$\operatorname{E}\left[g(X_1,\dots,X_d)\right]=\int_{\mathbb{R}^d}g(x_1,\dots,x_d)\,\mathrm{d}H(x_1,\dots,x_d).$$
If $H$ is given by a copula model, i.e.
$$H(x_1,\dots,x_d)=C(F_1(x_1),\dots,F_d(x_d)),$$
this expectation can be rewritten as
$$\operatorname{E}\left[g(X_1,\dots,X_d)\right]=\int_{[0,1]^d}g(F_1^{-1}(u_1),\dots,F_d^{-1}(u_d))\,\mathrm{d}C(u_1,\dots,u_d).$$
In case the copula C is absolutely continuous, i.e. C has a density c, this equation can be written as
$$\operatorname{E}\left[g(X_1,\dots,X_d)\right]=\int_{[0,1]^d}g(F_1^{-1}(u_1),\dots,F_d^{-1}(u_d))\cdot c(u_1,\dots,u_d)\,\mathrm{d}u_1\cdots\mathrm{d}u_d,$$
and if each marginal distribution has the density $f_i$ it holds further that
$$\operatorname{E}\left[g(X_1,\dots,X_d)\right]=\int_{\mathbb{R}^d}g(x_1,\dots,x_d)\cdot c(F_1(x_1),\dots,F_d(x_d))\cdot f_1(x_1)\cdots f_d(x_d)\,\mathrm{d}x_1\cdots\mathrm{d}x_d.$$
If copula and marginals are known (or if they have been estimated), this expectation can be approximated through the following Monte Carlo algorithm:
1. Draw a sample $(U_1^k,\dots,U_d^k)\sim C$ $(k=1,\dots,n)$ of size n from the copula C.
2. By applying the inverse marginal cumulative distribution functions, produce a sample of $(X_1,\dots,X_d)$ by setting $(X_1^k,\dots,X_d^k)=(F_1^{-1}(U_1^k),\dots,F_d^{-1}(U_d^k))\sim H$ $(k=1,\dots,n)$.
3. Approximate $\operatorname{E}\left[g(X_1,\dots,X_d)\right]$ by its empirical value:
$$\operatorname{E}\left[g(X_1,\dots,X_d)\right]\approx\frac{1}{n}\sum_{k=1}^{n}g(X_1^k,\dots,X_d^k).$$
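The three steps of the Monte Carlo algorithm can be sketched as follows. The estimator itself is generic, while the concrete inputs are assumptions chosen for illustration: a bivariate Gaussian copula with correlation 0.7, exponential marginals with rates 1 and 2, and the response function $g(x_1,x_2)=x_1+x_2$, whose exact expectation is 1.5. All names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def monte_carlo_expectation(g, draw_from_copula, quantiles, n, seed=0):
    """Estimate E[g(X_1, ..., X_d)] for a copula model by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        u = draw_from_copula(rng)                      # step 1: sample from the copula
        x = [q(u_k) for q, u_k in zip(quantiles, u)]   # step 2: apply inverse marginal CDFs
        total += g(*x)
    return total / n                                   # step 3: empirical mean


def gaussian_copula_draw(rho):
    """Return a sampler for a bivariate Gaussian copula (assumed dependence model)."""
    std_normal = NormalDist()

    def draw(rng):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        return std_normal.cdf(z1), std_normal.cdf(z2)

    return draw


def exponential_quantile(rate):
    """Return the inverse CDF of an exponential distribution (assumed marginal)."""
    def quantile(u):
        u = min(u, 1.0 - 2.0 ** -53)  # guard against log(0)
        return -math.log(1.0 - u) / rate

    return quantile


estimate = monte_carlo_expectation(
    g=lambda x1, x2: x1 + x2,
    draw_from_copula=gaussian_copula_draw(0.7),
    quantiles=[exponential_quantile(1.0), exponential_quantile(2.0)],
    n=50000,
)
```

For a response function that is not additive, such as a product or a maximum, the estimate would depend on the copula as well as on the marginals.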
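Empirical copulas
When studying multivariate data, one might want to investigate the underlying copula. Suppose we have observations
$$(X_1^i,X_2^i,\dots,X_d^i),\quad i=1,\dots,n$$
from a random vector $(X_1,X_2,\dots,X_d)$ with continuous marginals. The corresponding "true" copula observations would be
$$(U_1^i,U_2^i,\dots,U_d^i)=\left(F_1(X_1^i),F_2(X_2^i),\dots,F_d(X_d^i)\right),\quad i=1,\dots,n.$$
However, the marginal distribution functions $F_i$ are usually not known. Therefore, one can construct pseudo copula observations by using the empirical distribution functions
$$F_k^n(x)=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}(X_k^i\leq x)$$
instead. The pseudo copula observations are then defined as
$$(\tilde{U}_1^i,\tilde{U}_2^i,\dots,\tilde{U}_d^i)=\left(F_1^n(X_1^i),F_2^n(X_2^i),\dots,F_d^n(X_d^i)\right),\quad i=1,\dots,n.$$
The corresponding empirical copula is then defined as
$$C^n(u_1,\dots,u_d)=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\left(\tilde{U}_1^i\leq u_1,\dots,\tilde{U}_d^i\leq u_d\right).$$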
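A direct, unoptimized transcription of these definitions into Python might look as follows. The function names and the four-point toy data set are assumptions made for illustration. The quadratic-time loops keep the code close to the formulas rather than aiming for efficiency.

```python
def pseudo_observations(data):
    """Map each observation to (F_1^n(X_1^i), ..., F_d^n(X_d^i)) using empirical CDFs."""
    n = len(data)
    columns = list(zip(*data))  # one tuple of n values per component
    pseudo_columns = []
    for column in columns:
        # Empirical CDF at x: fraction of observations in this component that are <= x.
        pseudo_columns.append([sum(1 for y in column if y <= x) / n for x in column])
    return list(zip(*pseudo_columns))


def empirical_copula(pseudo, u):
    """C^n(u): fraction of pseudo-observations that are componentwise <= u."""
    n = len(pseudo)
    count = sum(1 for row in pseudo if all(r_k <= u_k for r_k, u_k in zip(row, u)))
    return count / n


# Toy data: four bivariate observations without ties (hypothetical values).
data = [(1.0, 10.0), (2.0, 30.0), (3.0, 20.0), (4.0, 40.0)]
pseudo = pseudo_observations(data)
value = empirical_copula(pseudo, (0.5, 0.5))
```

Only the ordering of the observations within each component matters here, so any strictly increasing transformation of a component leaves the pseudo-observations unchanged.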
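The sample version of Spearman's rho:[15]
$$r=\frac{12}{n^2-1}\sum_{i=1}^{n}\sum_{j=1}^{n}\left[C^n\left(\frac{i}{n},\frac{j}{n}\right)-\frac{i}{n}\cdot\frac{j}{n}\right]$$
Applications
Quantitative finance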
In risk management, copulas are used to perform stress-tests and robustness checks that are especially important during "downside/crisis/panic regimes" where extreme downside events may occur (e.g., the global financial crisis of 2007–2008). The formula was also adapted for financial markets and was used to estimate the probability distribution of losses on pools of loans or bonds.
During a downside regime, a large number of investors who have held positions in riskier assets such as equities or real estate may seek refuge in 'safer' investments such as cash or bonds. This is also known as a flight-to-quality effect, and investors tend to exit their positions in riskier assets in large numbers in a short period of time. As a result, during downside regimes, correlations across equities are greater on the downside than on the upside, and this may have disastrous effects on the economy. For example, anecdotally, we often read financial news headlines reporting the loss of hundreds of millions of dollars on the stock exchange in a single day; however, we rarely read reports of positive stock market gains of the same magnitude and in the same short time frame.
Copulas aid in analyzing the effects of downside regimes by allowing the marginals and the dependence structure of a multivariate probability model to be modelled separately. For example, consider the stock exchange as a market consisting of a large number of traders, each operating with their own strategies to maximize profits. The individualistic behaviour of each trader can be described by modelling the marginals. However, as all traders operate on the same exchange, each trader's actions have an interaction effect with those of other traders. This interaction effect can be described by modelling the dependence structure. Therefore, copulas allow us to analyse the interaction effects, which are of particular interest during downside regimes as investors tend to herd in their trading behaviour and decisions.
(See also agent-based computational economics, where price is treated as an emergent phenomenon, resulting from the interaction of the various market participants, or agents.)
The users of the formula have been criticized for creating "evaluation cultures" that continued to use simple copulas despite the simple versions being acknowledged as inadequate for that purpose.[18][19] Previously, scalable copula models for large dimensions only allowed the modelling of elliptical dependence structures (i.e., Gaussian and Student-t copulas) that do not allow for correlation asymmetries, where correlations differ between the upside and downside regimes. However, the development of vine copulas[20] (also known as pair copulas) enables the flexible modelling of the dependence structure for portfolios of large dimensions. The Clayton canonical vine copula allows for the occurrence of extreme downside events and has been successfully applied in portfolio optimization and risk management applications. The model is able to reduce the effects of extreme downside correlations and produces improved statistical and economic performance compared to scalable elliptical dependence copulas such as the Gaussian and Student-t copula.
Other models developed for risk management applications are panic copulas that are glued with market estimates of the marginal distributions to analyze the effects of panic regimes on the portfolio profit and loss distribution. Panic copulas are created by Monte Carlo simulation, mixed with a re-weighting of the probability of each scenario.
As regards derivatives pricing, dependence modelling with copula functions is widely used in applications of financial risk assessment and actuarial analysis, for example in the pricing of collateralized debt obligations (CDOs). Some believe the methodology of applying the Gaussian copula to credit derivatives to be one of the reasons behind the global financial crisis of 2008–2009.[21]
Despite this perception, there are documented attempts within the financial industry, occurring before the crisis, to address the limitations of the Gaussian copula and of copula functions more generally, specifically the lack of dependence dynamics. The Gaussian copula is limited because it only allows for an elliptical dependence structure, as dependence is modeled using only the variance-covariance matrix. This methodology does not allow dependence to evolve, whereas financial markets exhibit asymmetric dependence, with correlations across assets increasing significantly during downturns compared to upturns. Therefore, modeling approaches using the Gaussian copula represent extreme events poorly.[22] There have been attempts to propose models rectifying some of the copula limitations.[23][24]
In addition to CDOs, copulas have been applied to other asset classes as a flexible tool for analyzing multi-asset derivative products. The first such application outside credit was to use a copula to construct a basket implied volatility surface,[25] taking into account the volatility smile of basket components. Copulas have since gained popularity in pricing and risk management[26] of options on multi-assets in the presence of a volatility smile, in equity, foreign exchange and fixed income derivatives.
Civil engineering
Recently, copula functions have been successfully applied to the database formulation for the reliability analysis of highway bridges, and to various multivariate simulation studies in civil engineering, reliability of wind and earthquake engineering,[27] and mechanical and offshore engineering.[28] Researchers are also trying these functions in the field of transportation to understand the interaction between the behaviors of individual drivers which, in totality, shapes traffic flow.
Reliability engineering
Copulas are being used for reliability analysis of complex systems of machine components with competing failure modes.
Warranty data analysis
Copulas are being used for warranty data analysis in which the tail dependence is analysed.
Turbulent combustion
Copulas are used in modelling turbulent partially premixed combustion, which is common in practical combustors.
Medicine
Copulas have many applications in the area of medicine.
Geodesy
The combination of singular spectrum analysis (SSA) and copula-based methods has been applied for the first time as a novel stochastic tool for Earth Orientation Parameters prediction.[41][42]
Hydrology research
Copulas have been used in both theoretical and applied analyses of hydroclimatic data. Theoretical studies adopted the copula-based methodology, for instance, to gain a better understanding of the dependence structures of temperature and precipitation in different parts of the world.[43][44] Applied studies adopted the copula-based methodology to examine, e.g., agricultural droughts[45] or joint effects of temperature and precipitation extremes on vegetation growth.[46]
Climate and weather research
Copulas have been extensively used in climate- and weather-related research.[47][48]
Solar irradiance variability
Copulas have been used to estimate the solar irradiance variability in spatial networks and temporally for single locations.[49][50]
Random vector generation
Large synthetic traces of vectors and stationary time series can be generated using the empirical copula while preserving the entire dependence structure of small datasets.[51] Such empirical traces are useful in various simulation-based performance studies.[52]
Ranking of electrical motors
Copulas have been used for quality ranking in the manufacturing of electronically commutated motors.[53]
Signal processing
Copulas are important because they represent a dependence structure without using marginal distributions. Copulas have been widely used in the field of finance, but their use in signal processing is relatively new. Copulas have been employed in the field of wireless communication for classifying radar signals, change detection in remote sensing applications, and EEG signal processing in medicine. This section presents a short mathematical derivation of the copula density function, followed by a list of copula density functions with the relevant signal processing applications.
Astronomy
Copulas have been used for determining the core radio luminosity function of active galactic nuclei (AGNs),[54] which cannot be achieved using traditional methods due to the difficulties in sample completeness.
Mathematical derivation of copula density function
For any two random variables X and Y, the continuous joint probability distribution function can be written as
$$F_{XY}(x,y)=\Pr\{X\leq x,\,Y\leq y\},$$
where $F_X(x)=\Pr\{X\leq x\}$ and $F_Y(y)=\Pr\{Y\leq y\}$ are the marginal cumulative distribution functions of the random variables X and Y, respectively.
Then the copula distribution function $C(u,v)$ can be defined using Sklar's theorem as
$$F_{XY}(x,y)=C(F_X(x),F_Y(y))\triangleq C(u,v),$$
where $u=F_X(x)$ and $v=F_Y(y)$ are the marginal distribution functions, $F_{XY}(x,y)$ is the joint distribution function, and $u,v\in(0,1)$.
Assuming $F_{XY}(\cdot,\cdot)$ is almost everywhere twice differentiable, we start from the relationship between the joint probability density function and the joint cumulative distribution function and its partial derivatives:
$$\begin{aligned}
f_{XY}(x,y)&=\frac{\partial^2F_{XY}(x,y)}{\partial x\,\partial y}\\
&=\frac{\partial^2C(F_X(x),F_Y(y))}{\partial x\,\partial y}\\
&=\frac{\partial^2C(u,v)}{\partial u\,\partial v}\cdot\frac{\partial F_X(x)}{\partial x}\cdot\frac{\partial F_Y(y)}{\partial y}\\
&=c(u,v)\,f_X(x)\,f_Y(y),\\
\frac{f_{XY}(x,y)}{f_X(x)\,f_Y(y)}&=c(u,v),
\end{aligned}$$
where $c(u,v)$ is the copula density function, and $f_X(x)$ and $f_Y(y)$ are the marginal probability density functions of X and Y, respectively.
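The identity $c(u,v)=\partial^2C(u,v)/\partial u\,\partial v$ used in the derivation can be checked numerically. The sketch below assumes a Clayton copula with $\theta=2$, for which differentiating the distribution function gives the closed-form density $(1+\theta)(uv)^{-\theta-1}(u^{-\theta}+v^{-\theta}-1)^{-(2\theta+1)/\theta}$, and compares it with a central finite-difference approximation. The exponential marginals in `joint_density` and all function names are illustrative assumptions.

```python
import math


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) for theta > 0 and 0 < u, v <= 1."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def clayton_density(u, v, theta):
    """Closed-form mixed partial derivative of the Clayton copula above."""
    s = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * s ** (-(2.0 * theta + 1.0) / theta)


def mixed_partial(C, u, v, h=1e-4):
    """Central finite-difference approximation of d^2 C / (du dv)."""
    return (C(u + h, v + h) - C(u + h, v - h) - C(u - h, v + h) + C(u - h, v - h)) / (4.0 * h * h)


def joint_density(x, y, theta=2.0, rate_x=1.0, rate_y=0.5):
    """f_XY(x, y) = c(F_X(x), F_Y(y)) * f_X(x) * f_Y(y) with exponential marginals (assumed)."""
    u = 1.0 - math.exp(-rate_x * x)
    v = 1.0 - math.exp(-rate_y * y)
    f_x = rate_x * math.exp(-rate_x * x)
    f_y = rate_y * math.exp(-rate_y * y)
    return clayton_density(u, v, theta) * f_x * f_y


theta = 2.0
exact = clayton_density(0.3, 0.6, theta)
approx = mixed_partial(lambda u, v: clayton_cdf(u, v, theta), 0.3, 0.6)
```

The two values should agree to several decimal places; the step size `h` trades truncation error against floating-point cancellation.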
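List of copula density functions and applications
Various bivariate copula density functions are important in the area of signal processing. Here $u=F_X(x)$ and $v=F_Y(y)$ are the marginal distribution functions, and $f_X(x)$ and $f_Y(y)$ are the marginal density functions.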
Further reading
Roger B. Nelsen (1999), "An Introduction to Copulas", Springer.
Piotr Jaworski, Fabrizio Durante, Wolfgang Karl Härdle, Tomasz Rychlik (eds.) (2010): "Copula Theory and Its Applications", Lecture Notes in Statistics, Springer.
Jan-Frederik Mai, Matthias Scherer (2012): Simulating Copulas (Stochastic Models, Sampling Algorithms and Applications). World Scientific.
Abe Sklar (1997): "Random variables, distribution functions, and copulas – a personal look backward and forward" in Rüschendorf, L., Schweizer, B. and Taylor, M. (eds) Distributions With Fixed Marginals & Related Topics (Lecture Notes – Monograph Series Number 28).
Alexander J. McNeil, Rüdiger Frey and Paul Embrechts (2005): "Quantitative Risk Management: Concepts, Techniques, and Tools", Princeton Series in Finance.