In statistics, the logit function is the quantile function associated with the standard logistic distribution. It has many uses in data analysis and machine learning, especially in data transformations.
It is the inverse of the standard logistic function
\sigma(x) = \frac{1}{1 + e^{-x}},
so the logit is defined as
\operatorname{logit} p = \sigma^{-1}(p) = \ln\frac{p}{1-p} \quad \text{for } p \in (0,1).
Because of this, the logit is also called the log-odds, since it equals the logarithm of the odds p/(1 − p), where p is a probability; it maps probability values in (0, 1) to the whole real line (−∞, +∞).
If p is a probability, then p/(1 − p) is the corresponding odds; the logit of the probability is the logarithm of the odds, i.e.:
\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \ln(p) - \ln(1-p) = -\ln\left(\frac{1}{p} - 1\right) = 2\operatorname{atanh}(2p-1).
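As a concrete check, here is a minimal Python sketch of these equivalent forms (the function name and the sample value p = 0.8 are only illustrative):

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))

# The equivalent forms above agree numerically, e.g. for p = 0.8:
p = 0.8
assert math.isclose(logit(p), math.log(p) - math.log(1 - p))
assert math.isclose(logit(p), -math.log(1 / p - 1))
assert math.isclose(logit(p), 2 * math.atanh(2 * p - 1))
print(logit(p))  # ≈ 1.3863
```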
The base of the logarithm function used is of little importance in the present article, as long as it is greater than 1, but the natural logarithm with base e is the one most often used. The choice of base corresponds to the choice of logarithmic unit for the value: base 2 corresponds to a shannon, base e to a “nat”, and base 10 to a hartley; these units are particularly used in information-theoretic interpretations. For each choice of base, the logit function takes values between negative and positive infinity.
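For illustration, a short Python sketch of the same odds expressed in each of these units (the sample value p = 0.8 is arbitrary):

```python
import math

p = 0.8
odds = p / (1 - p)          # 4.0
print(math.log(odds))       # natural log -> logit in nats,     ≈ 1.386
print(math.log2(odds))      # base 2      -> logit in shannons, = 2.0
print(math.log10(odds))     # base 10     -> logit in hartleys, ≈ 0.602
```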
The “logistic” function of any number \alpha is given by the inverse-logit:
\operatorname{logit}^{-1}(\alpha) = \operatorname{logistic}(\alpha) = \frac{1}{1 + \exp(-\alpha)} = \frac{\exp(\alpha)}{\exp(\alpha) + 1} = \frac{1 + \tanh\left(\frac{\alpha}{2}\right)}{2}.
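A minimal Python sketch of the inverse-logit round trip, using the same illustrative value as above:

```python
import math

def logistic(a: float) -> float:
    """Inverse of the logit: maps any real number back to (0, 1)."""
    return 1 / (1 + math.exp(-a))

# Round trip: logistic(logit(p)) recovers p.
p = 0.8
a = math.log(p / (1 - p))            # logit(p)
assert math.isclose(logistic(a), p)
# The equivalent forms agree as well.
assert math.isclose(logistic(a), math.exp(a) / (math.exp(a) + 1))
assert math.isclose(logistic(a), (1 + math.tanh(a / 2)) / 2)
```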
The difference between the logits of two probabilities is the logarithm of the odds ratio, thus providing a shorthand for writing the correct combination of odds ratios by simply adding and subtracting:
\ln(R) = \ln\left(\frac{p_1/(1-p_1)}{p_2/(1-p_2)}\right) = \ln\left(\frac{p_1}{1-p_1}\right) - \ln\left(\frac{p_2}{1-p_2}\right) = \operatorname{logit}(p_1) - \operatorname{logit}(p_2).
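A short Python sketch of this identity, with illustrative probabilities p1 = 0.9 and p2 = 0.6:

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1 - p))

p1, p2 = 0.9, 0.6                       # illustrative probabilities
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))
# The log of the odds ratio equals the difference of the logits.
assert math.isclose(math.log(odds_ratio), logit(p1) - logit(p2))
print(math.log(odds_ratio))             # ≈ 1.7918
```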
Several approaches have been explored to adapt linear regression methods to a domain where the output is a probability value in (0, 1) rather than any real number in (−∞, +∞). Many of these efforts focused on mapping the range (0, 1) to (−∞, +∞) and then running a linear regression on the transformed values, as sketched below.
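A minimal sketch of this transform-then-regress idea (the data and the use of NumPy's least-squares fit are purely illustrative, not any particular historical procedure):

```python
import numpy as np

# Hypothetical data: observed proportions y in (0, 1) for a predictor x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.12, 0.25, 0.50, 0.70, 0.88])

# Map (0, 1) to (-inf, +inf) with the logit, then fit ordinary least squares.
z = np.log(y / (1 - y))
slope, intercept = np.polyfit(x, z, 1)

# Predictions are mapped back to probabilities with the inverse logit.
p_hat = 1 / (1 + np.exp(-(slope * x + intercept)))
```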
In 1934, Chester Ittner Bliss used the cumulative normal distribution function to perform this mapping and called his model probit, an abbreviation for "probability unit". This is, however, computationally more expensive.[2]
In 1944, Joseph Berkson used the log of the odds and called this function the logit, an abbreviation for "logistic unit", by analogy with the probit.
Log odds was used extensively by Charles Sanders Peirce (late 19th century).[3] G. A. Barnard in 1949 coined the commonly used term log-odds;[4] the log-odds of an event is the logit of the probability of the event.[5] Barnard also coined the term lods as an abstract form of "log-odds", but suggested that "in practice the term 'odds' should normally be used, since this is more familiar in everyday life".
Closely related to the logit function (and logit model) are the probit function and probit model. The logit and probit both have domain (0, 1) and are quantile functions – i.e., inverses of the cumulative distribution function (CDF) of a probability distribution. In fact, the logit is the quantile function of the logistic distribution, while the probit is the quantile function of the normal distribution. The probit function is denoted \Phi^{-1}(x), where \Phi(x) is the CDF of the standard normal distribution:
\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt.
As shown in the graph on the right, the logit and probit functions are extremely similar when the probit function is scaled so that its slope at y = 0 matches the slope of the logit. As a result, probit models are sometimes used in place of logit models because for certain applications (e.g., in item response theory) the implementation is easier.[10]
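A quick numerical sketch of this similarity, assuming NumPy and SciPy are available (the probability grid is arbitrary; the slope-matching factor 4/√(2π) follows from the derivatives of the two functions at p = 1/2):

```python
import numpy as np
from scipy.stats import norm

p = np.linspace(0.05, 0.95, 91)
logit_vals = np.log(p / (1 - p))
probit_vals = norm.ppf(p)                 # probit: quantile function of N(0, 1)

# Scale the probit so its slope at p = 1/2 (sqrt(2*pi)) matches the logit's (4).
scale = 4 / np.sqrt(2 * np.pi)            # ≈ 1.596
diff = np.abs(logit_vals - scale * probit_vals)
print(diff.max())   # the two curves stay close over the central range of p
```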