In mathematics and statistics, the arithmetic mean, arithmetic average, or just the mean or average (when the context is clear) is the sum of a collection of numbers divided by the count of numbers in the collection.[1] The collection is often a set of results from an experiment, an observational study, or a survey. The term "arithmetic mean" is preferred in some mathematics and statistics contexts because it helps distinguish it from other types of means, such as geometric and harmonic.
In addition to mathematics and statistics, the arithmetic mean is frequently used in economics, anthropology, history, and almost every academic field to some extent. For example, per capita income is the arithmetic average income of a nation's population.
While the arithmetic mean is often used to report central tendencies, it is not a robust statistic: it is greatly influenced by outliers (values much larger or smaller than most others). For skewed distributions, such as the distribution of income for which a few people's incomes are substantially higher than most people's, the arithmetic mean may not coincide with one's notion of "middle". In that case, robust statistics, such as the median, may provide a better description of central tendency.
The arithmetic mean of a set of observed data is equal to the sum of the numerical values of the observations divided by the total number of observations. Symbolically, for a data set consisting of the values $x_1,\dots,x_n$, the arithmetic mean is defined by the formula

$$\bar{x}=\frac{1}{n}\left(\sum_{i=1}^{n}x_i\right)=\frac{x_1+x_2+\dots+x_n}{n}.$$
(For an explanation of the summation operator, see summation.)
For example, if the monthly salaries of $10$ employees are $\{2500,2700,2400,2300,2550,2650,2750,2450,2600,2400\}$, then the arithmetic mean is

$$\frac{2500+2700+2400+2300+2550+2650+2750+2450+2600+2400}{10}=2530.$$
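The salary calculation can be checked with a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` is an illustrative choice, and the standard library's `statistics.fmean` is used only as a cross-check.

```python
from statistics import fmean

# Monthly salaries of the ten employees in the example.
salaries = [2500, 2700, 2400, 2300, 2550, 2650, 2750, 2450, 2600, 2400]


def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty collection is undefined")
    return sum(values) / len(values)


mean_salary = arithmetic_mean(salaries)
print(mean_salary)      # 2530.0
print(fmean(salaries))  # 2530.0, the standard-library equivalent
```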
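If the data set is a statistical population (that is, it consists of every possible observation and not just a subset of them), its mean is called the population mean and is denoted by the Greek letter $\mu$. If the data set is a statistical sample (a subset of the population), its mean is called the sample mean, which for a data set $X$ is denoted $\overline{X}$.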
The arithmetic mean can be similarly defined for vectors in multiple dimensions, not only scalar values; this is often referred to as a centroid. More generally, because the arithmetic mean is a convex combination (meaning its coefficients sum to $1$), it can be defined on a convex space, not only a vector space.
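For vectors, the mean is taken coordinate by coordinate. The following sketch assumes points are given as equal-length tuples of numbers; the function name `centroid` and the sample triangle are illustrative, not taken from the text.

```python
def centroid(points):
    """Component-wise arithmetic mean of equal-length coordinate tuples."""
    if not points:
        raise ValueError("the centroid of an empty collection is undefined")
    n = len(points)
    # zip(*points) groups the first coordinates together, then the second, and so on.
    return tuple(sum(coords) / n for coords in zip(*points))


# Hypothetical example: the three vertices of a triangle in the plane.
triangle = [(0.0, 0.0), (6.0, 0.0), (0.0, 3.0)]
print(centroid(triangle))  # (2.0, 1.0)
```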
The arithmetic mean has several properties that make it interesting, especially as a measure of central tendency. These include:
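If numbers $x_1,\dots,x_n$ have mean $\bar{x}$, then $(x_1-\bar{x})+\dots+(x_n-\bar{x})=0$. Since $x_i-\bar{x}$ is the signed distance from a given number to the mean, one way to interpret this property is that the numbers to the left of the mean are balanced by the numbers to the right. It can also be read as saying that the mean is translation-invariant: for any real number $a$, $\overline{x+a}=\bar{x}+a$.
If a single number is required as a "typical" value for a set of known numbers $x_1,\dots,x_n$, then the arithmetic mean does this best in the sense that it minimizes the sum of squared deviations from the typical value, that is, the sum of $(x_i-\bar{x})^2$.
The arithmetic mean is independent of the scale of the units of measurement, in the sense that $\operatorname{avg}(ca_1,\dots,ca_n)=c\cdot\operatorname{avg}(a_1,\dots,a_n)$.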
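These properties can be verified numerically on a small data set. The values below are arbitrary illustrative numbers, and the helper names (`avg`, `sum_of_squares`) are assumptions made for this sketch.

```python
import math

data = [2.0, 4.0, 4.0, 5.0, 10.0]  # arbitrary illustrative values


def avg(values):
    return sum(values) / len(values)


def sum_of_squares(center):
    """Sum of squared deviations of the data from a candidate typical value."""
    return sum((x - center) ** 2 for x in data)


mean = avg(data)

# 1. Deviations from the mean sum to zero.
residual_sum = sum(x - mean for x in data)

# 2. Shifting every value by a shifts the mean by a.
a = 7.0
shifted_mean = avg([x + a for x in data])

# 3. The mean gives a smaller sum of squared deviations than nearby candidates.
candidates = [mean - 1.0, mean - 0.1, mean + 0.1, mean + 1.0]
mean_is_best = all(sum_of_squares(mean) < sum_of_squares(c) for c in candidates)

# 4. Scaling every value by c scales the mean by c.
c = 3.0
scaled_mean = avg([c * x for x in data])

print(math.isclose(residual_sum, 0.0, abs_tol=1e-12))
print(math.isclose(shifted_mean, mean + a))
print(mean_is_best)
print(math.isclose(scaled_mean, c * mean))
```

Comparing against a handful of candidates is only a spot check, not a proof; the minimizing property follows from setting the derivative of the sum of squares to zero.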
See main article: Median.
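The arithmetic mean may be contrasted with the median. The median is defined such that no more than half the values are larger than it and no more than half are smaller. If the elements of the data increase arithmetically when placed in some order, then the median and the arithmetic average are equal. For example, consider the data sample $\{1,2,3,4\}$. The mean is $2.5$, as is the median. However, for a sample that cannot be arranged to increase arithmetically, such as $\{1,2,4,8,16\}$, the median and the arithmetic average can differ significantly. In this case, the arithmetic average is $6.2$, while the median is $4$. The average value can differ considerably from most values in the sample and can be larger or smaller than most of them.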
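Both samples can be checked with Python's standard `statistics` module. This is a small sketch; the variable names are illustrative.

```python
from statistics import mean, median

arithmetic_sample = [1, 2, 3, 4]      # values increase by a constant step
doubling_sample = [1, 2, 4, 8, 16]    # values do not increase arithmetically

print(mean(arithmetic_sample), median(arithmetic_sample))  # 2.5 2.5
print(mean(doubling_sample), median(doubling_sample))      # 6.2 4
```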
There are applications of this phenomenon in many fields. For example, since the 1980s, the median income in the United States has increased more slowly than the arithmetic average of income.[4]
See main article: Weighted average.
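A weighted average, or weighted mean, is an average in which some data points count more heavily than others in that they are given more weight in the calculation.[5] For example, the arithmetic mean of $3$ and $5$ is $\frac{3+5}{2}=4$, or equivalently $3\cdot\frac{1}{2}+5\cdot\frac{1}{2}=4$. In contrast, a weighted mean in which the first number receives, for example, twice as much weight as the second would be calculated as $3\cdot\frac{2}{3}+5\cdot\frac{1}{3}=\frac{11}{3}$. Here the weights, which necessarily sum to one, are $\frac{2}{3}$ and $\frac{1}{3}$, the former being twice the latter. The arithmetic mean can be interpreted as a special case of a weighted average in which all weights are equal to the same number ($\frac{1}{2}$ in the above example and $\frac{1}{n}$ in a situation with $n$ numbers being averaged).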
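The weighted calculation for 3 and 5 can be reproduced exactly with rational arithmetic. In this sketch the function name `weighted_mean` is an illustrative choice, and dividing by the total weight is an added convenience (an assumption beyond the text) so that weights need not already sum to one.

```python
from fractions import Fraction


def weighted_mean(values, weights):
    """Sum of value * weight, divided by the total weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


values = [3, 5]

# Equal weights reproduce the ordinary arithmetic mean.
equal = weighted_mean(values, [Fraction(1, 2), Fraction(1, 2)])
print(equal)  # 4

# The first value counts twice as much as the second.
unequal = weighted_mean(values, [Fraction(2, 3), Fraction(1, 3)])
print(unequal)  # 11/3
```

Because of the normalization step, the unnormalized weights `[2, 1]` give the same result as `[2/3, 1/3]`.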
If a numerical property, and any sample of data from it, can take on any value from a continuous range rather than, for example, only integers, then the probability of a number falling into some range of possible values can be described by integrating a continuous probability distribution across this range, even though the naive probability of a sample taking any one exact value out of infinitely many is zero. In this context, the analog of a weighted average, in which there are infinitely many possibilities for the precise value of the variable in each range, is called the mean of the probability distribution. The most widely encountered probability distribution is the normal distribution; it has the property that all measures of its central tendency, including not just the mean but also the median mentioned above and the mode (the three Ms[6]), are equal. This equality does not hold for other probability distributions, such as the log-normal distribution.
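The log-normal case can be illustrated numerically. The sketch below assumes the parameters $\mu=0$ and $\sigma=1$ (chosen only for illustration), approximates the mean as the integral of $x\,f(x)$ with a simple midpoint rule truncated at an assumed upper limit of 200, and compares it with the standard closed-form expressions for the mean, median, and mode of a log-normal distribution.

```python
import math

# Illustrative parameters (an assumption, not taken from the text).
MU, SIGMA = 0.0, 1.0


def lognormal_pdf(x):
    """Density of the log-normal distribution with parameters MU and SIGMA, for x > 0."""
    z = (math.log(x) - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (x * SIGMA * math.sqrt(2.0 * math.pi))


def mean_by_integration(pdf, lower, upper, steps=200_000):
    """Approximate the integral of x * pdf(x) over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width  # midpoint of the k-th subinterval
        total += x * pdf(x)
    return total * width


# The upper limit truncates a tail whose contribution is negligible here.
numeric_mean = mean_by_integration(lognormal_pdf, 0.0, 200.0)

closed_form_mean = math.exp(MU + SIGMA ** 2 / 2)
median = math.exp(MU)
mode = math.exp(MU - SIGMA ** 2)

print(numeric_mean, closed_form_mean)
print(mode < median < closed_form_mean)  # True: the three measures differ
```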
See main article: Mean of circular quantities.
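Particular care is needed when using cyclic data, such as phases or angles. Taking the arithmetic mean of 1° and 359° yields a result of 180°. This is incorrect for two reasons:
First, angle measurements are only defined up to an additive constant of 360° ($2\pi$ or $\tau$, if measuring in radians). Thus, the same two angles could just as well be called 1° and -1°, or 361° and 719°, and the choice changes the computed average.
Second, in this situation, 0° (or 360°) is geometrically a better average value: there is lower dispersion about it (the points are both 1° from it, and 179° from 180°, the putative average).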
In general application, such an oversight will lead to the average value artificially moving towards the middle of the numerical range. A solution to this problem is to use the optimization formulation (that is, define the mean as the central point: the point about which one has the lowest dispersion) and redefine the difference as a modular distance (i.e., the distance on the circle: so the modular distance between 1° and 359° is 2°, not 358°).
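The optimization formulation can be sketched with a brute-force search: try candidate angles on a grid and keep the one with the smallest sum of squared modular distances. The function names, the use of squared distance as the dispersion measure, and the grid resolution of 0.1° are all assumptions made for this illustration; practical implementations often use other techniques.

```python
def circular_distance(a, b):
    """Shortest distance in degrees between two angles on the circle."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def circular_mean(angles, steps=3600):
    """Grid search for the angle with the lowest sum of squared circular distances."""
    if not angles:
        raise ValueError("the mean of an empty collection is undefined")

    def dispersion(candidate):
        return sum(circular_distance(a, candidate) ** 2 for a in angles)

    # Candidates are 0, 360/steps, 2*360/steps, ... (0.1 degree apart by default).
    candidates = (360.0 * k / steps for k in range(steps))
    return min(candidates, key=dispersion)


angles = [1.0, 359.0]
print(sum(angles) / len(angles))  # 180.0, the naive arithmetic mean
print(circular_mean(angles))      # 0.0
```

The minimizer is not always unique (for example, for 0° and 180°), in which case this sketch simply returns the first candidate it finds.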
The arithmetic mean is often denoted by a bar (vinculum or macron), as in $\bar{x}$.
Some software (text processors, web browsers) may not display the "x̄" symbol correctly. For example, the HTML symbol "x̄" combines two codes: the base letter "x" plus a code for the line above (the combining macron &#772;, or ¯).[7]
In some document formats (such as PDF), the symbol may be replaced by a "¢" (cent) symbol when copied to a text processor such as Microsoft Word.
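The two-code composition of "x̄" can be inspected with Python's standard `unicodedata` and `html` modules. This is a small sketch; the variable names are illustrative.

```python
import html
import unicodedata

# "x" followed by U+0304, the combining macron.
x_bar = "x" + "\u0304"

print(len(x_bar))  # 2: one visible symbol, two code points
names = [unicodedata.name(ch) for ch in x_bar]
print(names)  # ['LATIN SMALL LETTER X', 'COMBINING MACRON']

# The HTML numeric character reference &#772; decodes to the same combining mark.
from_html = html.unescape("x&#772;")
print(from_html == x_bar)  # True
```

Software that handles the base letter and the combining mark separately may therefore render or copy the symbol incorrectly.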