In probability theory and statistics, the median of a data set x, sometimes written as x̃,[1] is a number describing the data set. This number has the property that it divides a set of observed values into two equal halves, so that half of the values are below it and half are above.
If there is a finite number of elements, the median is easy to find. The values need to be arranged in a list, from lowest to highest. If there is an odd number of values, n, the median is the one at position (n + 1)/2. For example, if there are 13 values, they can be arranged into two groups of 6, with the median in between, at position 7. With an even number of values, there is no single element that divides the list into two equal halves, so the median is defined as the mean of the two central elements.[2] With 14 observations, this would be the mean of the elements at positions 7 and 8, which is their sum divided by 2.
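The procedure above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function name `median` and the choice to raise an error for an empty list are assumptions made for the example. Note that Python lists count positions from 0, so position (n + 1)/2 in the text is index n // 2 in the code.

```python
def median(values):
    """Return the median of a non-empty, finite collection of numbers."""
    ordered = sorted(values)  # arrange the values from lowest to highest
    n = len(ordered)
    if n == 0:
        # Assumption for this sketch: an empty data set has no median.
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd count: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[middle]
    # Even count: mean of the two central elements.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median(range(1, 14)))  # 13 values: the element at position 7
print(median(range(1, 15)))  # 14 values: mean of positions 7 and 8
```

With the values 1 to 13 this prints 7, and with the values 1 to 14 it prints 7.5, matching the two examples in the text.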
Alternatively, the median of an even-sized list is sometimes defined as one of the two middle elements. The choice is either (a) always the smaller one, (b) always the larger one, or (c) one of the two picked at random. This alternative definition has two important advantages: it guarantees that the median is always a list element (for example, a list of integers will never have a fractional median), and it guarantees that the median exists for any ordinal-valued data. On the other hand, when choice (a) or (b) is taken, the median of a sample will be biased, which is an unwanted property of a statistical estimator. Definition (c) does not have that disadvantage, but it is more difficult to carry out. It also means that the same list of values does not have a single well-defined, deterministic median.
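The three choices (a), (b) and (c) can be sketched in Python as follows. The function names `middle_pair`, `median_lower`, `median_upper` and `median_random` are made up for this example. Python's standard `statistics` module offers `median_low` and `median_high`, which correspond to choices (a) and (b).

```python
import random


def middle_pair(values):
    """Return the two central elements of the sorted values.

    For an odd number of values both entries are the same element.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        # Assumption for this sketch: an empty data set has no median.
        raise ValueError("median of an empty data set is undefined")
    return ordered[(n - 1) // 2], ordered[n // 2]


def median_lower(values):
    """Choice (a): always the smaller of the two middle elements."""
    return middle_pair(values)[0]


def median_upper(values):
    """Choice (b): always the larger of the two middle elements."""
    return middle_pair(values)[1]


def median_random(values):
    """Choice (c): one of the two middle elements, picked at random."""
    return random.choice(middle_pair(values))


print(median_lower([1, 2, 3, 4]))        # 2
print(median_upper([1, 2, 3, 4]))        # 3
print(median_lower(["a", "b", "c", "d"]))  # b
```

Because no arithmetic is done on the values, these functions also work for data that can only be ordered, such as the letters in the last line. Calling `median_random` twice on the same list may give different results, which is the non-determinism mentioned above.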