Alternating conditional expectations (ACE) is an algorithm to find the optimal transformations between the response variable and predictor variables in regression analysis.[1]
In statistics, nonlinear transformations of variables are commonly used in regression problems. Alternating conditional expectations (ACE) is one of the methods for finding the transformations that produce the best-fitting additive model. Knowledge of such transformations aids in the interpretation and understanding of the relationship between the response and the predictors.
ACE transforms the response variable $Y$ and its predictor variables $X_i$ so as to minimize the fraction of variance not explained.
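Let $Y, X_1, \dots, X_p$ be random variables, where $X_1, \dots, X_p$ are used to predict $Y$. Suppose $\theta(Y), \varphi_1(X_1), \dots, \varphi_p(X_p)$ are zero-mean functions. With these transformation functions, the fraction of variance of $\theta(Y)$ not explained is

$$e^2(\theta, \varphi_1, \dots, \varphi_p) = \frac{E\left[\left(\theta(Y) - \sum_{i=1}^{p} \varphi_i(X_i)\right)^2\right]}{E\left[\theta^2(Y)\right]}$$

ACE computes the transformations iteratively:

1. Holding $\varphi_1(X_1), \dots, \varphi_p(X_p)$ fixed, minimizing $e^2$ gives $\theta_1(Y) = E\left[\sum_{i=1}^{p} \varphi_i(X_i) \,\middle|\, Y\right]$.
2. Normalize $\theta_1(Y)$ to unit variance.
3. For each $k$, holding the other $\varphi_i(X_i)$ and $\theta(Y)$ fixed, minimizing $e^2$ gives $\tilde{\varphi}_k = E\left[\theta(Y) - \sum_{i \neq k} \varphi_i(X_i) \,\middle|\, X_k\right]$.
4. Iterate the three steps above until $e^2$ is within the error tolerance.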
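Why both updates take the form of conditional expectations can be seen from a short argument. The following is only a sketch: it assumes all transformations have finite variance, and the shorthands $S$ and $R_k$ are introduced here for convenience and are not part of the original notation. Write $S = \sum_{i=1}^{p} \varphi_i(X_i)$ and impose the normalization $E[\theta^2(Y)] = 1$, so that

$$e^2 = E\left[(\theta(Y) - S)^2\right] = 1 - 2\,E[\theta(Y)\,S] + E[S^2].$$

With the $\varphi_i$ fixed, minimizing $e^2$ over $\theta$ means maximizing $E[\theta(Y)\,S]$. By the tower property and the Cauchy-Schwarz inequality,

$$E[\theta(Y)\,S] = E\big[\theta(Y)\,E[S \mid Y]\big] \le \sqrt{E[\theta^2(Y)]}\,\sqrt{E\big[(E[S \mid Y])^2\big]},$$

with equality when $\theta(Y)$ is a positive multiple of $E[S \mid Y]$. This gives the conditional expectation followed by the rescaling to unit variance. For the predictor update, let $R_k = \theta(Y) - \sum_{i \neq k} \varphi_i(X_i)$. Since the denominator of $e^2$ does not depend on $\varphi_k$, only the numerator matters, and

$$E\left[(R_k - \varphi_k(X_k))^2\right] = E\left[(R_k - E[R_k \mid X_k])^2\right] + E\left[(E[R_k \mid X_k] - \varphi_k(X_k))^2\right],$$

which is smallest when $\varphi_k(X_k) = E[R_k \mid X_k]$.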
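The optimal transformation $\theta^*(Y), \varphi^*(X)$ for $p = 1$ satisfies

$$\rho^*(X, Y) = \rho^*(\theta^*, \varphi^*) = \max_{\theta, \varphi} \rho(\theta(Y), \varphi(X))$$

where $\rho$ is the Pearson correlation coefficient. $\rho^*(X, Y)$ is known as the maximal correlation between $X$ and $Y$.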
In the bivariate case, the ACE algorithm can also be regarded as a method for estimating the maximal correlation between two variables.
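When the joint distribution is known and finite, the conditional expectations are exact weighted averages, and the alternation can be written out directly. The Python sketch below is an illustration under stated assumptions, not a reference implementation: the function name `maximal_correlation`, the starting function $\theta(y_j) = j$, the fixed iteration count, and the example tables are all choices made here. Like any power-type iteration, it may miss the maximum if the starting function happens to be uncorrelated with the optimal one.

```python
import math


def maximal_correlation(joint, iters=200):
    """Estimate rho*(X, Y) for a finite joint distribution.

    joint[i][j] = P(X = x_i, Y = y_j). Assumes at least two Y values and
    positive total probability in every row and column.
    """
    nx, ny = len(joint), len(joint[0])
    px = [sum(joint[i]) for i in range(nx)]
    py = [sum(joint[i][j] for i in range(nx)) for j in range(ny)]

    # Arbitrary start: theta(y_j) = j, centred and scaled to unit variance.
    mean = sum(py[j] * j for j in range(ny))
    theta = [j - mean for j in range(ny)]
    sd = math.sqrt(sum(py[j] * theta[j] ** 2 for j in range(ny)))
    theta = [t / sd for t in theta]

    rho = 0.0
    for _ in range(iters):
        # phi(x_i) = E[theta(Y) | X = x_i], then rescale to unit variance.
        phi = [sum(joint[i][j] * theta[j] for j in range(ny)) / px[i]
               for i in range(nx)]
        var_phi = sum(px[i] * phi[i] ** 2 for i in range(nx))
        if var_phi < 1e-24:
            return 0.0  # theta(Y) carries no information about X
        phi = [v / math.sqrt(var_phi) for v in phi]

        # theta(y_j) = E[phi(X) | Y = y_j]; because phi has unit variance,
        # the standard deviation of this function equals corr(theta, phi).
        theta = [sum(joint[i][j] * phi[i] for i in range(nx)) / py[j]
                 for j in range(ny)]
        var_theta = sum(py[j] * theta[j] ** 2 for j in range(ny))
        if var_theta < 1e-24:
            return 0.0
        rho = math.sqrt(var_theta)
        theta = [t / rho for t in theta]
    return rho


# X uniform on {-1, 0, 1} and Y = X^2: Pearson correlation 0, full dependence.
third = 1.0 / 3.0
deterministic = [[0.0, third], [third, 0.0], [0.0, third]]
# Product of marginals (0.5, 0.5) and (0.3, 0.7): X and Y independent.
independent = [[0.15, 0.35], [0.15, 0.35]]

rho_det = maximal_correlation(deterministic)
rho_ind = maximal_correlation(independent)
```

In the first table the Pearson correlation of $X$ and $Y$ is zero even though $Y$ is a function of $X$, which is the kind of dependence the maximal correlation is meant to detect.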
The ACE algorithm was developed in the context of known distributions. In practice, data distributions are seldom known, and the conditional expectations must be estimated from data. The R language has a package, acepack, which implements the ACE algorithm.
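Independently of acepack, the idea of replacing each conditional expectation by an estimate from data can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the running-mean smoother over rank neighbours, the window size, the stopping rule, the single sweep over the predictors per iteration, and all function names. It does not reproduce the acepack interface or its smoother. The iteration starts from the standardized response and updates the predictor transformations first, since they start at zero.

```python
import math
import random


def running_mean(values, given, half_window):
    """Estimate E[values | given] by averaging over rank neighbours of `given`."""
    n = len(values)
    order = sorted(range(n), key=lambda j: given[j])
    # Prefix sums in sorted order make every window average O(1).
    prefix = [0.0]
    for j in order:
        prefix.append(prefix[-1] + values[j])
    smoothed = [0.0] * n
    for pos, j in enumerate(order):
        lo = max(0, pos - half_window)
        hi = min(n, pos + half_window + 1)
        smoothed[j] = (prefix[hi] - prefix[lo]) / (hi - lo)
    return smoothed


def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def standardize(v):
    # Assumes v is not constant, so the standard deviation is positive.
    c = center(v)
    sd = math.sqrt(sum(a * a for a in c) / len(c))
    return [a / sd for a in c]


def ace(xs, y, half_window=15, tol=1e-4, max_iter=50):
    """xs is a list of predictor columns; returns (theta, phis, e2)."""
    n, p = len(y), len(xs)
    theta = standardize(y)                 # start from the standardized response
    phis = [[0.0] * n for _ in range(p)]   # predictor transformations start at zero
    e2, e2_prev = float("inf"), float("inf")
    for _ in range(max_iter):
        # phi_k <- smoothed partial residual given X_k, kept at zero mean.
        for k in range(p):
            partial = [theta[j] - sum(phis[i][j] for i in range(p) if i != k)
                       for j in range(n)]
            phis[k] = center(running_mean(partial, xs[k], half_window))
        # theta <- smoothed sum of the phi_i given Y, rescaled to unit variance.
        total = [sum(phis[i][j] for i in range(p)) for j in range(n)]
        theta = standardize(running_mean(total, y, half_window))
        # With unit-variance theta, e2 is the mean squared residual.
        e2 = sum((t - s) ** 2 for t, s in zip(theta, total)) / n
        if abs(e2_prev - e2) < tol:
            break
        e2_prev = e2
    return theta, phis, e2


# Synthetic data: Y depends on X only through X^2, plus a little noise.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(300)]
y = [a * a + rng.gauss(0.0, 0.05) for a in x]
theta, phis, e2 = ace([x], y)
```

Plotting `phis[0]` against `x` and `theta` against `y` would show the estimated transformations. A running mean is used here only because it is short to write; a practical implementation would normally use a more careful smoother.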
The ACE algorithm provides a fully automated method for estimating optimal transformations in multiple regression. It also provides a method for estimating the maximal correlation between random variables. Since the iteration usually terminates after a limited number of runs, the time complexity of the algorithm is $O(np)$, where $n$ is the number of samples.
A strong advantage of the ACE procedure is the ability to incorporate variables of quite different types in terms of the set of values they can assume. The transformation functions $\theta(y), \varphi_i(x_i)$ assume values on the real line, but their arguments can assume values on any set.
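For an unordered categorical predictor, one natural estimate of the conditional expectation is the mean within each category, so the transformation amounts to assigning a real-valued score to every category. The following Python sketch illustrates that single update step; the function name and the toy data are hypothetical, and it is not a description of how any particular package handles categorical variables.

```python
from collections import defaultdict


def conditional_mean_by_category(values, categories):
    """Estimate E[values | category] by the mean of `values` in each category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, c in zip(values, categories):
        sums[c] += v
        counts[c] += 1
    # Every observation receives the score of its own category.
    return [sums[c] / counts[c] for c in categories]


# Toy data: current values of theta(y) and an unordered categorical predictor.
theta = [1.0, 3.0, 2.0, 6.0, 4.0]
colour = ["red", "blue", "red", "green", "blue"]
phi = conditional_mean_by_category(theta, colour)
```

Because the scores are ordinary real numbers, such a predictor can sit in the same additive model as continuous predictors whose transformations are estimated by smoothing.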
As a tool for data analysis, the ACE procedure provides graphical output to indicate a need for transformations as well as to guide in their choice. If a particular plot suggests a familiar functional form for a transformation, then the data can be pre-transformed using this functional form and the ACE algorithm can be rerun.
As with any regression procedure, a high degree of association between predictor variables can sometimes cause the individual transformation estimates to be highly variable, even though the complete model is reasonably stable. When this is suspected, running the algorithm on randomly selected subsets of the data, or on bootstrap samples, can assist in assessing the variability.