Notation in probability and statistics explained
Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.
Probability theory
- Random variables are usually written in upper case Roman letters, such as $X$ or $Y$ and so on. Random variables, in this context, usually refer to something in words, such as "the height of a subject" for a continuous variable, or "the number of cars in the school car park" for a discrete variable, or "the colour of the next bicycle" for a categorical variable. They do not represent a single number or a single category. For instance, if $P(X = x)$ is written, then it represents the probability that a particular realisation of a random variable $X$ (e.g., height, number of cars, or bicycle colour) would be equal to a particular value or category $x$ (e.g., 1.735 m, 52, or purple). It is important that $X$ and $x$ are not confused into meaning the same thing: $X$ is an idea, $x$ is a value. Clearly they are related, but they do not have identical meanings.
- Particular realisations of a random variable are written in corresponding lower case letters. For example, $x_1, x_2, \ldots, x_n$ could be a sample corresponding to the random variable $X$. A cumulative probability is formally written $P(X \leq x)$ to distinguish the random variable from its realization.[1]
- The probability is sometimes written $\mathbb{P}$ to distinguish it from other functions and from a generic measure $P$, which avoids having to state "$P$ is a probability". $\mathbb{P}(X \in A)$ is short for $P(\{\omega \in \Omega : X(\omega) \in A\})$, where $\Omega$ is the sample space, $X$ is a random variable that is a function of $\omega$ (i.e., it depends upon $\omega$), and $\omega$ is some outcome of interest within the domain specified by $\Omega$ (say, a particular height, or a particular colour of a car). The notation $\Pr(A)$ is used alternatively.
- $P(A \cap B)$ or $P[B \cap A]$ indicates the probability that events $A$ and $B$ both occur. The joint probability distribution of random variables $X$ and $Y$ is denoted as $P(X, Y)$, while the joint probability mass function or probability density function is denoted as $f(x, y)$ and the joint cumulative distribution function as $F(x, y)$.
- $P(A \cup B)$ or $P[B \cup A]$ indicates the probability of either event $A$ or event $B$ occurring ("or" in this case means one or the other or both).
- σ-algebras are usually written with uppercase calligraphic letters (e.g. $\mathcal{F}$ for the set of sets on which we define the probability $P$).
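- Probability density functions (pdfs) and probability mass functions are denoted by lowercase letters, e.g. $f(x)$, or $f_X(x)$.
- Cumulative distribution functions (cdfs) are denoted by uppercase letters, e.g. $F(x)$, or $F_X(x)$.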
- Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative: $\overline{F}(x) = 1 - F(x)$, or denoted as $S(x)$.
- In particular, the pdf of the standard normal distribution is denoted by $\varphi(z)$, and its cdf by $\Phi(z)$.
- Some common operators:
- $\operatorname{cov}[X, Y]$: covariance of $X$ and $Y$
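- The statement that $X$ is independent of $Y$ is often written $X \perp Y$ or $X \perp\!\!\!\perp Y$, and the statement that $X$ is independent of $Y$ given $W$ is often written $X \perp\!\!\!\perp Y \mid W$ or $X \perp Y \mid W$.
- $P(A \mid B)$, the conditional probability, is the probability of $A$ given $B$.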
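The event notation in this section can be checked by brute-force enumeration on a small finite sample space. The following is a minimal sketch, assuming two fair six-sided dice as $\Omega$; the events `A` and `B`, the random variable `X` (the sum of the dice) and the helper `prob` are hypothetical names chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space Omega: all ordered outcomes of two fair dice (an assumed example).
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform probability measure on omega."""
    return Fraction(len(event & omega), len(omega))

def X(w):
    """Random variable: a function of the outcome w (here, the sum of the dice)."""
    return w[0] + w[1]

A = {w for w in omega if X(w) == 7}   # event {X = 7}
B = {w for w in omega if w[0] == 1}   # event "first die shows 1"

p_A = prob(A)
p_B = prob(B)
p_A_and_B = prob(A & B)               # P(A ∩ B): both events occur
p_A_or_B = prob(A | B)                # P(A ∪ B): one or the other or both
p_A_given_B = p_A_and_B / p_B         # P(A | B): conditional probability

# Cumulative probability F(4) = P(X <= 4) and survival function S(4) = 1 - F(4).
F_4 = prob({w for w in omega if X(w) <= 4})
S_4 = 1 - F_4
```

In this particular example $P(A \cap B) = P(A)\,P(B)$, so the two events happen to be independent, which is also why $P(A \mid B)$ equals $P(A)$.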
Statistics
- Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).[2]
- A tilde (~) denotes "has the probability distribution of".
- Placing a hat, or caret (also known as a circumflex), over a true parameter denotes an estimator of it, e.g., $\widehat{\theta}$ is an estimator for $\theta$.
- The arithmetic mean of a series of values $x_1, x_2, \ldots, x_n$ is often denoted by placing an "overbar" over the symbol, e.g. $\bar{x}$, pronounced "$x$ bar".
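- Some commonly used symbols for sample statistics are given below:
- the sample mean $\bar{x}$,
- the sample variance $s^2$,
- the sample standard deviation $s$,
- the sample correlation coefficient $r$,
- the sample cumulants $k_r$.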
- Some commonly used symbols for population parameters are given below:
- the population mean $\mu$,
- the population variance $\sigma^2$,
- the population standard deviation $\sigma$,
- the population correlation $\rho$,
- the population cumulants $\kappa_r$.
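As a small illustration of the difference between sample statistics and population parameters, the sketch below uses Python's standard `statistics` module on a made-up data set (the numbers are hypothetical). It relies on the fact that `statistics.variance` divides by $n - 1$, while `statistics.pvariance` divides by $n$.

```python
import statistics

# Hypothetical observed values x_1, ..., x_n (realisations, hence lower case).
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

x_bar = statistics.mean(x)        # sample mean, "x bar"
s2 = statistics.variance(x)       # sample variance s^2 (divisor n - 1)
s = statistics.stdev(x)           # sample standard deviation s

# If the same values were treated as the entire population, the divisor is n:
sigma2 = statistics.pvariance(x)  # population variance sigma^2
sigma = statistics.pstdev(x)      # population standard deviation sigma
```

For these values the mean is 5, the sum of squared deviations is 32, so $\sigma^2 = 32/8 = 4$ while $s^2 = 32/7$.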
- $x_{(k)}$ is used for the $k$-th order statistic, where $x_{(1)}$ is the sample minimum and $x_{(n)}$ is the sample maximum from a total sample size $n$.[3]
Critical values
The α-level upper critical value of a probability distribution is the value exceeded with probability $\alpha$, that is, the value $x_\alpha$ such that $F(x_\alpha) = 1 - \alpha$, where $F$ is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:
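- $\chi^2_{\alpha,\nu}$ or $\chi^2(\alpha, \nu)$ for the chi-squared distribution with $\nu$ degrees of freedom
- $F_{\alpha,\nu_1,\nu_2}$ or $F(\alpha, \nu_1, \nu_2)$ for the F-distribution with $\nu_1$ and $\nu_2$ degrees of freedom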
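The defining relation $F(x_\alpha) = 1 - \alpha$ can be checked numerically. The sketch below is one way to do this for the standard normal distribution with `statistics.NormalDist` from the Python standard library; the choice $\alpha = 0.05$ and the variable names are illustrative assumptions.

```python
from statistics import NormalDist

alpha = 0.05                               # illustrative level
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Upper critical value: the point exceeded with probability alpha,
# i.e. the (1 - alpha) quantile of the distribution.
z_alpha = std_normal.inv_cdf(1 - alpha)

# Check: the survival function 1 - F(z_alpha) recovers alpha.
upper_tail = 1 - std_normal.cdf(z_alpha)
```

The standard library offers no equivalent quantile function for the chi-squared or F distributions; third-party packages such as SciPy (`scipy.stats.chi2.ppf`, `scipy.stats.f.ppf`) are commonly used for those.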
Linear algebra
- Matrices are usually denoted by boldface capital letters, e.g. $\mathbf{A}$.
- Column vectors are usually denoted by boldface lowercase letters, e.g. $\mathbf{a}$.
- The transpose operator is denoted by either a superscript T (e.g. $\mathbf{A}^T$) or a prime symbol (e.g. $\mathbf{A}'$).
- A row vector is written as the transpose of a column vector, e.g. $\mathbf{a}^T$ or $\mathbf{a}'$.
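To make the transpose convention concrete, here is a minimal pure-Python sketch (no linear algebra library is assumed) that stores a matrix as a list of rows, so a column vector is a list of one-element rows. The helper `transpose` is a hypothetical name.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

# Column vector a (3 x 1): one entry per row.
a = [[1], [2], [3]]

# Row vector a^T (1 x 3), written as the transpose of the column vector.
a_T = transpose(a)

# Matrix A (2 x 3) and its transpose A^T (3 x 2).
A = [[1, 2, 3],
     [4, 5, 6]]
A_T = transpose(A)
```

Transposing twice returns the original matrix, which is a quick sanity check on the helper.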
Abbreviations
Common abbreviations include:
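- i.o.: infinitely often, i.e. $\{A_n \text{ i.o.}\} = \bigcap_{N} \bigcup_{n \geq N} A_n$
- ult.: ultimately, i.e. $\{A_n \text{ ult.}\} = \bigcup_{N} \bigcap_{n \geq N} A_n$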
Notes and References
- [1] "Calculating Probabilities from Cumulative Distribution Function". 2021-08-09. Retrieved 2024-02-26.
- [2] "Letters of the Greek Alphabet and Some of Their Statistical Uses". les.appstate.edu. 1999-02-13. Retrieved 2024-02-26.
- [3] "Order Statistics". colorado.edu. Retrieved 2024-02-26.