# Pain severity scores of the 10 patients
x <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
# 50th and 75th quantiles using algorithm (type) 1
quantile(
  x, probs = c(0.5, 0.75),
  type = 1
)
50% 75%
2 3
October 20, 2024
The help page for the quantile() function in R describes the different algorithms used to calculate quantiles. However, I find the details very technical and hard to follow for casual readers.
Therefore, I have tried to simplify the algorithms, as discussed below.
R provides nine algorithms to calculate quantiles, which can be selected using the type argument:
Types \(1-3\) are used for discontinuous data, while types \(4-9\) are used for continuous data.
Types \(1\) and \(3\) can be used for objects of class “Date” and for ordered factors.
Type \(7\) is the default method in R.
Type \(6\) is used to get results similar to SPSS, Minitab, or GraphPad Prism.
Let \(n\) be the number of observations in the dataset.
Let \(p\) be a number between \(0\) and \(1\), where \((p \times 100)\%\) is the quantile to be calculated, denoted by \(Q_p\).
The first step is to calculate the product \(np\).
Next, calculate the \(j^{th}\) rank and \(Q_p\) as follows:
Algorithm | \(j\) | Condition | Quantile \((Q_p)\) |
---|---|---|---|
\(1\) | \(\lfloor np \rfloor\) | \(np = j\) | \(x_j\) |
\(1\) | \(\lfloor np \rfloor\) | \(np \ne j\) | \(x_{j+1}\) |
\(2\) | \(\lfloor np \rfloor\) | \(np = j\) | \(\frac{1}{2}(x_j + x_{j+1})\) |
\(2\) | \(\lfloor np \rfloor\) | \(np \ne j\) | \(x_{j+1}\) |
\(3\) | \(\lfloor np - \frac{1}{2} \rfloor\) | \(np = j + \frac{1}{2}\) and \(j\) even | \(x_j\) |
\(3\) | \(\lfloor np - \frac{1}{2} \rfloor\) | otherwise | \(x_{j+1}\) |
\(4\) | \(\lfloor np \rfloor\) | always | \(x_j + (x_{j+1} - x_j)(np - j)\) |
\(5\) | \(\lfloor np + \frac{1}{2} \rfloor\) | always | \(x_j + (x_{j+1} - x_j)(np - j + \frac{1}{2})\) |
\(6\) | \(\lfloor np + p \rfloor\) | always | \(x_j + (x_{j+1} - x_j)(np - j + p)\) |
\(7\) | \(\lfloor np - p + 1 \rfloor\) | always | \(x_j + (x_{j+1} - x_j)(np - j - p + 1)\) |
\(8\) | \(\lfloor np + \frac{(p+1)}{3} \rfloor\) | always | \(x_j + (x_{j+1} - x_j)(np - j + \frac{(p+1)}{3})\) |
\(9\) | \(\lfloor np + \frac{(2p+3)}{8} \rfloor\) | always | \(x_j + (x_{j+1} - x_j)(np - j + \frac{(2p+3)}{8})\) |
The symbol \(\lfloor x \rfloor\) reads as the floor of \(x\). This function returns the largest integer not greater than \(x\) (i.e., rounds down \(x\) to the nearest integer). For example, \(\lfloor 3.9 \rfloor = 3\).
\(x_j\) and \(x_{j+1}\) are the \(j^{th}\) and \((j+1)^{th}\) order statistics, respectively (i.e., the observations having the rank \(j\) and \(j+1\) in the ordered array of the observations).
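The table translates fairly directly into R. Below is a minimal sketch, not R's actual implementation: manual_quantile() is a hypothetical helper name of my own, it handles one probability at a time, and it leaves out the small floating-point tolerance that quantile() applies before taking the floor. Ranks below \(1\) or above \(n\) are clamped to the first and last observations, which is an assumption made here so the sketch never indexes outside the data.

```r
# Hypothetical helper (not part of base R): one quantile, computed from the table.
manual_quantile <- function(x, p, type = 7) {
  stopifnot(length(p) == 1, p >= 0, p <= 1, type %in% 1:9)
  x <- sort(x)
  n <- length(x)

  # Term added to np before taking the floor (column "j" of the table)
  offset <- switch(type,
    0, 0, -0.5,                                  # types 1-3
    0, 0.5, p, 1 - p, (p + 1) / 3, (2 * p + 3) / 8  # types 4-9
  )
  m <- n * p + offset
  j <- floor(m)
  g <- m - j  # fractional part; g == 0 means the "equal" condition holds

  # Weight given to x_{j+1} (0 returns x_j, 1 returns x_{j+1})
  h <- switch(type,
    if (g > 0) 1 else 0,                   # type 1
    if (g > 0) 1 else 0.5,                 # type 2
    if (g == 0 && j %% 2 == 0) 0 else 1,   # type 3
    g, g, g, g, g, g                       # types 4-9: linear interpolation
  )

  # Assumption: clamp ranks so that x_0 is treated as x_1 and x_{n+1} as x_n
  lower <- x[min(max(j, 1), n)]
  upper <- x[min(max(j + 1, 1), n)]
  lower + (upper - lower) * h
}

x <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
manual_quantile(x, p = 0.5, type = 2)        # 2.5
unname(quantile(x, probs = 0.5, type = 2))   # 2.5
```

Writing the offset and the weight as two separate switch() calls mirrors the two columns of the table, so each algorithm can be read off line by line.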
Let’s illustrate the above algorithms with some examples.
Example \(1\) (ordered factor): consider the following set of observations pertaining to the pain severity (\(1\): mild, \(2\): moderate, \(3\): severe) of \(10\) patients: \(1, 1, 1, 2, 2, 3, 3, 3, 3, 3\). The \(50^{th}\ (Q_{0.5})\) and \(75^{th}\ (Q_{0.75})\) quantiles can be calculated using algorithms \(1-3\) as follows:
Algorithm \(1\):
\(\small 50^{th}\) Quantile:
\(\small p = 0.5\), \(\small np = 10 \times 0.5 = 5\), and \(\small j = \lfloor np \rfloor = \lfloor 5 \rfloor = 5\)
Since \(\small np = j\), then \(\small Q_{0.5} = x_j = x_5\) (the observation having the rank \(\small 5\) in the ordered array), which has the value \(\small 2\)
\(\small 75^{th}\) Quantile:
\(\small p = 0.75\), \(\small np = 10 \times 0.75 = 7.5\), and \(\small j = \lfloor np \rfloor = \lfloor 7.5 \rfloor = 7\)
Since \(\small np \ne j\), then \(\small Q_{0.75} = x_{j+1} = x_8\) (the observation having the rank \(\small 8\) in the ordered array), which has the value \(\small 3\)
Check the result using the quantile() function in R:
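As a minimal sketch of this check, the ranks found above (\(5\) and \(8\)) can be looked up in the sorted data and compared with quantile() using type = 1. The variable names x_sorted, by_hand, and by_r are my own.

```r
x <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
x_sorted <- sort(x)

# Q_0.5 = x_5 (np = j) and Q_0.75 = x_8 (np != j, so take x_{j+1})
by_hand <- c(x_sorted[5], x_sorted[8])
by_r <- unname(quantile(x, probs = c(0.5, 0.75), type = 1))

by_hand  # 2 3
by_r     # 2 3
```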
Algorithm \(2\):
\(\small 50^{th}\) Quantile:
\(\small p = 0.5\), \(\small np = 10 \times 0.5 = 5\), and \(\small j = \lfloor np \rfloor = \lfloor 5 \rfloor = 5\)
Since \(\small np = j\), then \(\small Q_{0.5} = \frac{1}{2}(x_j + x_{j+1}) =\) \(\small \frac{1}{2}(x_5 + x_6) = \frac{1}{2}(2 + 3) = 2.5\)
\(\small 75^{th}\) Quantile:
\(\small p = 0.75\), \(\small np = 10 \times 0.75 = 7.5\), and \(\small j = \lfloor np \rfloor = \lfloor 7.5 \rfloor = 7\)
Since \(\small np \ne j\), then \(\small Q_{0.75} = x_{j+1} = x_8 = 3\)
Check the result using the quantile() function in R:
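A short sketch of this check, again with variable names of my own: the only difference from algorithm \(1\) is that the \(50^{th}\) quantile averages \(x_5\) and \(x_6\).

```r
x <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
x_sorted <- sort(x)

# Q_0.5: np = j = 5, so average x_5 and x_6; Q_0.75: np != j, so take x_8
by_hand <- c(mean(x_sorted[5:6]), x_sorted[8])
by_r <- unname(quantile(x, probs = c(0.5, 0.75), type = 2))

by_hand  # 2.5 3.0
by_r     # 2.5 3.0
```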
Algorithm \(3\):
\(\small 50^{th}\) Quantile:
\(\small p = 0.5\), \(\small np = 10 \times 0.5 = 5\), and \(\small j = \lfloor np - 0.5 \rfloor = \lfloor 5 - 0.5 \rfloor = 4\)
Since \(\small np \ne j + 0.5\), then \(\small Q_{0.5} = x_{j+1} = x_5 = 2\)
\(\small 75^{th}\) Quantile:
\(\small p = 0.75\), \(\small np = 10 \times 0.75 = 7.5\), and \(\small j = \lfloor np - 0.5 \rfloor = \lfloor 7.5 - 0.5 \rfloor = 7\)
Since \(\small np = j + 0.5\) but \(j\) is not even, then \(\small Q_{0.75} = x_{j+1} = x_8 = 3\)
Check the result using the quantile() function in R:
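The even/odd rule is the part that is easiest to get wrong, so here is a small sketch that spells it out for both quantiles at once. The names use_lower, by_hand, and by_r are my own, and the exact comparison n * p == j + 0.5 is safe here only because these particular values are represented exactly in floating point.

```r
x <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
x_sorted <- sort(x)
n <- length(x)
p <- c(0.5, 0.75)

j <- floor(n * p - 0.5)  # 4 7

# x_j is used only when np = j + 0.5 AND j is even; otherwise x_{j+1}
use_lower <- (n * p == j + 0.5) & (j %% 2 == 0)  # FALSE FALSE
by_hand <- ifelse(use_lower, x_sorted[j], x_sorted[j + 1])
by_r <- unname(quantile(x, probs = p, type = 3))

by_hand  # 2 3
by_r     # 2 3
```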
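Example \(2\) (continuous data): the calculations below use \(n = 8\) observations whose ordered values include \(x_2 = 10.4\), \(x_3 = 11.6\), \(x_6 = 14.7\), and \(x_7 = 15.4\). The \(25^{th}\ (Q_{0.25})\) and \(75^{th}\ (Q_{0.75})\) quantiles can be calculated using algorithms \(4-9\) as follows:
Algorithm \(4\):
\(\small 25^{th}\) Quantile: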
\(\small p = 0.25\), \(\small np = 8 \times 0.25 = 2\), and \(\small j = \lfloor np \rfloor = \lfloor 2 \rfloor = 2\)
\(\small Q_{0.25} = x_j + (x_{j+1} - x_j)(np - j) = x_2 + (x_3 - x_2)(2 - 2) = 10.4\)
\(\small 75^{th}\) Quantile:
\(\small p = 0.75\), \(\small np = 8 \times 0.75 = 6\), and \(\small j = \lfloor np \rfloor = \lfloor 6 \rfloor = 6\)
\(\small Q_{0.75} = x_j + (x_{j+1} - x_j)(np - j) = x_6 + (x_7 - x_6)(6 - 6) = 14.7\)
Check the result using the quantile() function in R:
Algorithm \(5\):
\(\small 25^{th}\) Quantile:
\(\small p = 0.25\), \(\small np = 8 \times 0.25 = 2\), and \(\small j = \lfloor np + 0.5 \rfloor = \lfloor 2.5 \rfloor = 2\)
\(\small Q_{0.25} = x_j + (x_{j+1} - x_j)(np - j + 0.5) = x_2 + (x_3 - x_2)(2 - 2 + 0.5) =\)
\(\small 10.4 + (11.6 - 10.4) \times 0.5 = 11\)
\(\small 75^{th}\) Quantile:
\(\small p = 0.75\), \(\small np = 8 \times 0.75 = 6\), and \(\small j = \lfloor np + 0.5 \rfloor = \lfloor 6.5 \rfloor = 6\)
\(\small Q_{0.75} = x_j + (x_{j+1} - x_j)(np - j + 0.5) = x_6 + (x_7 - x_6)(6 - 6 + 0.5) =\) \(\small 14.7 + (15.4 - 14.7) \times 0.5 = 15.05\)
Check the result using the quantile() function in R:
Algorithm \(6\):
\(\small 25^{th}\) Quantile:
\(\small p = 0.25\), \(\small np = 8 \times 0.25 = 2\), and \(\small j = \lfloor np + p \rfloor = \lfloor 2 + 0.25 \rfloor = 2\)
\(\small Q_{0.25} = x_j + (x_{j+1} - x_j)(np - j + p) = x_2 + (x_3 - x_2)(2 - 2 + 0.25) =\)
\(\small 10.4 + (11.6 - 10.4) \times 0.25 = 10.7\)
\(\small 75^{th}\) Quantile:
\(\small p = 0.75\), \(\small np = 8 \times 0.75 = 6\), and \(\small j = \lfloor np + p \rfloor = \lfloor 6 + 0.75 \rfloor = 6\)
\(\small Q_{0.75} = x_j + (x_{j+1} - x_j)(np - j + 0.75) = x_6 + (x_7 - x_6)(6 - 6 + 0.75) =\) \(\small 14.7 + (15.4 - 14.7) \times 0.75 = 15.225\)
Check the result using the quantile() function in R:
Algorithm \(7\):
\(\small 25^{th}\) Quantile:
\(\small p = 0.25\), \(\small np = 8 \times 0.25 = 2\), and \(\small j = \lfloor np - p + 1 \rfloor = \lfloor 2 - 0.25 + 1 \rfloor = 2\)
\(\small Q_{0.25} = x_j + (x_{j+1} - x_j)(np - j - p + 1) = x_2 + (x_3 - x_2)(2 - 2 - 0.25 + 1) =\)
\(\small 10.4 + (11.6 - 10.4) \times 0.75 = 11.3\)
\(\small 75^{th}\) Quantile:
\(\small p = 0.75\), \(\small np = 8 \times 0.75 = 6\), and \(\small j = \lfloor np - p + 1 \rfloor = \lfloor 6 - 0.75 + 1 \rfloor = 6\)
\(\small Q_{0.75} = x_j + (x_{j+1} - x_j)(np - j - 0.75 + 1) = x_6 + (x_7 - x_6)(6 - 6 - 0.75 + 1) =\) \(\small 14.7 + (15.4 - 14.7) \times 0.25 = 14.875\)
Check the result using the quantile() function in R:
Algorithm \(8\):
\(\small 25^{th}\) Quantile:
\(\small p = 0.25\), \(\small np = 8 \times 0.25 = 2\), and \(\small j = \lfloor np + \frac{p + 1}{3} \rfloor = \lfloor 2 + \frac{0.25 + 1}{3} \rfloor = 2\)
\(\small Q_{0.25} = x_j + (x_{j+1} - x_j)(np - j + \frac{p + 1}{3}) = x_2 + (x_3 - x_2)(2 - 2 + \frac{0.25 + 1}{3}) =\)
\(\small 10.4 + (11.6 - 10.4) \times 0.41667 = 10.9\)
\(\small 75^{th}\) Quantile:
\(\small p = 0.75\), \(\small np = 8 \times 0.75 = 6\), and \(\small j = \lfloor np + \frac{p + 1}{3} \rfloor = \lfloor 6 + \frac{0.75 + 1}{3} \rfloor = 6\)
\(\small Q_{0.75} = x_j + (x_{j+1} - x_j)(np - j + \frac{p + 1}{3}) = x_6 + (x_7 - x_6)(6 - 6 + \frac{0.75 + 1}{3}) =\) \(\small 14.7 + (15.4 - 14.7) \times 0.58333 = 15.10833\)
Check the result using the quantile() function in R:
Algorithm \(9\):
\(\small 25^{th}\) Quantile:
\(\small p = 0.25\), \(\small np = 8 \times 0.25 = 2\), and \(\small j = \lfloor np + \frac{2p + 3}{8} \rfloor = \lfloor 2 + \frac{2 \times 0.25 + 3}{8} \rfloor = 2\)
\(\small Q_{0.25} = x_j + (x_{j+1} - x_j)(np - j + \frac{2p + 3}{8}) = x_2 + (x_3 - x_2)(2 - 2 + \frac{2 \times 0.25 + 3}{8}) =\)
\(\small 10.4 + (11.6 - 10.4) \times 0.4375 = 10.925\)
\(\small 75^{th}\) Quantile:
\(\small p = 0.75\), \(\small np = 8 \times 0.75 = 6\), and \(\small j = \lfloor np + \frac{2p + 3}{8} \rfloor = \lfloor 6 + \frac{2 \times0.75 + 3}{8} \rfloor = 6\)
\(\small Q_{0.75} = x_j + (x_{j+1} - x_j)(np - j + \frac{2p + 3}{8}) = x_6 + (x_7 - x_6)(6 - 6 + \frac{2 \times 0.75 + 3}{8}) =\) \(\small 14.7 + (15.4 - 14.7) \times 0.5625 = 15.09375\)
Check the result using the quantile() function in R:
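The sketch below checks algorithms \(4-9\) in one pass. The by-hand part uses only the four order statistics quoted in the calculations above. To call quantile() a full vector of \(8\) values is needed, so the vector x is HYPOTHETICAL: positions \(2\), \(3\), \(6\), and \(7\) hold the values from the text, and the other four are placeholders I chose only to keep the vector sorted. Because \(j = 2\) and \(j = 6\) for every algorithm here, the placeholders should not affect the results.

```r
# Order statistics quoted in the worked calculations
x2 <- 10.4; x3 <- 11.6; x6 <- 14.7; x7 <- 15.4
p <- c(0.25, 0.75)

# Interpolation weights (np - j + offset) for p = 0.25 and p = 0.75
weights <- rbind(
  "4" = c(0, 0),
  "5" = c(0.5, 0.5),
  "6" = p,
  "7" = 1 - p,
  "8" = (p + 1) / 3,
  "9" = (2 * p + 3) / 8
)
by_hand <- cbind(
  "25%" = x2 + (x3 - x2) * weights[, 1],
  "75%" = x6 + (x7 - x6) * weights[, 2]
)

# HYPOTHETICAL data: only positions 2, 3, 6 and 7 come from the text;
# 9.0, 12.0, 13.0 and 16.0 are placeholders, not the original observations.
x <- c(9.0, 10.4, 11.6, 12.0, 13.0, 14.7, 15.4, 16.0)
by_r <- t(sapply(4:9, function(type) quantile(x, probs = p, type = type)))
rownames(by_r) <- 4:9

round(by_hand, 5)
round(by_r, 5)
```

Each row corresponds to one algorithm, so the two matrices can be compared row by row with the values worked out above (for example, \(11.3\) and \(14.875\) for algorithm \(7\)).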
quantile: Sample Quantiles. Retrieved October 20, 2024, from https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/quantile