Appendix B: Math for Introductory Statistics

This course does not require calculus or linear (matrix) algebra. That said, some math is unavoidable if you want to understand the concepts and calculations. This appendix reviews the basics for those who need a refresher.

B.1

B.1.1 Exponents

In the simplest form of exponentiation, we multiply together \(p\) copies of a number \(b\), where \(b\) is the “base” and \(p\) is the “power”: \(b^p\).

Example: let’s use \(b=2\) and \(p = \{ 1, 2, 3 \}\): \[\begin{align*} b^1 &= 2^1 = 2 \\ b^2 &= 2^2 = 2*2 = 4 &\text{squared} \\ b^3 &= 2^3 = 2*2*2 = 8 &\text{cubed} \end{align*}\]
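The same example can be reproduced in R, where the ^ operator performs exponentiation. This is a minimal sketch; the variable names b, p, and powers simply mirror the symbols in the text and are chosen here for illustration.

```r
# Base and powers from the example above
b <- 2
p <- 1:3

# ^ is vectorized, so this computes 2^1, 2^2 and 2^3 in one step
powers <- b^p
print(powers)  # 2 4 8
```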

B.1.2 Euler’s Number

Euler’s number, often represented simply as \(e\), is an important and commonly used number throughout mathematics and science. Euler’s number to the eighth decimal place is \(e = 2.71828183\).
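R has a built-in constant for pi but not for Euler’s number. One common way to obtain it is exp(1), since exp(k) returns \(e^k\). In the sketch below, the name e is just a variable chosen for illustration.

```r
# exp(k) returns e raised to the power k, so exp(1) is e itself
e <- exp(1)

# Show nine significant digits to match the value quoted above
print(e, digits = 9)  # 2.71828183
```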

B.1.3 Logarithms

Logarithms are also quite common, but many people find them confusing. Think of calculating a logarithm as the inverse of exponentiation. Suppose we expressed an exponentiation as \(b^y = x\). Given base \(b\) and power \(y\), we could calculate \(x\) as before.

With a logarithm, we are given \(b\) and \(x\) in the above exponentiation and we instead need to find \(y\). We express that calculation as \(y = \log_b x\) — i.e., the logarithm base \(b\) of \(x\). As before, \(y\) is the value that satisfies \(b^y = x\).

For example, suppose we wanted to calculate \(\log_{10} 100\). Here, we want to find the value \(y\) such that \(10^y = 100\). The answer is \(y=2\), since \(10^2 = 10*10 = 100\).

Other examples: Again, suppose the base is \(b=10\): \[\begin{align*} \log_{10} 10 &= 1 & \text{$10^1 = 10$}\\ \log_{10} 100 &= 2 & \text{$10^2 = 100$}\\ \log_{10} 1000 &= 3 & \text{$10^3 = 1000$}\\ \log_{10} 23 &= 1.3617... & \text{$10^{1.3617...} = 23$} \end{align*}\]
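These base-10 values can be checked in R with log10(). The sketch below also raises 10 to each result to confirm that the logarithm undoes the exponentiation. The names x_vals, log_vals, recovered, and same are chosen for illustration.

```r
# Values from the table above
x_vals <- c(10, 100, 1000, 23)

# Base-10 logarithm of each value
log_vals <- log10(x_vals)
print(log_vals)

# Raising 10 to each logarithm recovers the original values
recovered <- 10^log_vals
print(recovered)

# log() with an explicit base argument is equivalent to log10()
same <- log(100, base = 10)
print(same)
```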

Natural logarithms are a special case where the base is Euler’s number \(e \approx 2.71828183\). Natural logarithms can be represented as \(\log_e x\). However, they are more commonly written as \(\ln x\), where \(\ln\) denotes a logarithm with base \(e\). Here are a few examples of natural logarithms: \[\begin{align*} \ln e &= 1 & \text{$e^1 = e$}\\ \ln 7.389 &\approx 2 & \text{$e^2 \approx 7.389$}\\ \ln 23 &= 3.135... & \text{$e^{3.135...}=23$} \end{align*}\]
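In R, log() with no base argument computes the natural logarithm, so it plays the role of \(\ln\). The sketch below reproduces the three examples; the variable names are chosen for illustration.

```r
# log() with no base argument is the natural logarithm (base e)
ln_e  <- log(exp(1))  # ln e = 1
ln_e2 <- log(exp(2))  # ln(e^2) = 2, where e^2 is about 7.389
ln_23 <- log(23)      # about 3.135

print(c(ln_e, ln_e2, ln_23))
```

Note that this differs from some textbooks and calculators, where an unadorned "log" means base 10. In R, use log10() for base 10.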

B.1.4 Factorials

For any nonnegative integer \(x\), the factorial of \(x\) is written as \(x!\). For \(x=0\) and \(x=1\), \(x!=1\). For \(x\ge 2\), \(x! = x (x-1)!\).

Examples: \[\begin{align*} 0! &= 1\\ 1! &= 1\\ 2! &= 2 (1) = 2\\ 3! &= 3 (2) (1) = 6\\ 4! &= 4 (3) (2) (1) = 24\\ 5! &= 5 (4) (3) (2) (1) = 120 \end{align*}\]
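R provides factorial() for this calculation. The sketch below reproduces the table and checks the recursive rule \(x! = x (x-1)!\) for \(x=5\); the names x_vals, fact, and check are chosen for illustration.

```r
# factorial() is vectorized: it returns x! for each element
x_vals <- 0:5
fact <- factorial(x_vals)
print(fact)  # 1 1 2 6 24 120

# The recursive rule x! = x * (x-1)!, applied to x = 5
check <- 5 * factorial(4)
print(check)  # 120
```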

B.2.1 Summing the Elements of a Vector

Suppose we have a sample of six data points for the variable \(x\). Mathematically, we can represent that sample as the vector \[x = \{ x_1 \,\,\, x_2 \,\,\, x_3 \,\,\, x_4 \,\,\, x_5 \,\,\, x_6 \}\] where \(x_1\) is the first data point, \(x_2\) the second, and so on.

We can use the notation \(x_i\) to represent the \(i\)-th element (or data point) in \(x\). As a concrete example, suppose our vector takes the values \[\begin{align*} x &= \{ x_1 \,\,\, x_2 \,\,\, x_3 \,\,\, x_4 \,\,\, x_5 \,\,\, x_6 \}\\ &=\{ \,3 \,\,\,\,\,\, 1 \,\,\,\,\,\, 0 \,\,\,\,\,\, 1 \,\,\,\,\,\, 4 \,\,\,\,\,\, 2 \} \end{align*}\] Here, for example, \(x_1=3\) and \(x_5=4\).

Suppose we wanted to add all the elements of \(x\). The capital sigma notation is commonly used to represent summation. Using the notation \[\sum_{i=1}^6 x_i\] we are adding the elements of \(x\) from the first element \((i=1)\) to the sixth element \((i=6)\): \[\sum_{i=1}^6 x_i = x_1 + x_2 + x_3 + x_4 + x_5 + x_6\] Using the values from our example above, this is \[\sum_{i=1}^6 x_i = 3 + 1 + 0 + 1 + 4 + 2 = 11\] In statistics, it’s common to denote the number of elements in our sample by \(n\) (or \(N\)). Here, the sample size is \(n=6\).

Let’s look at one more example. Suppose we have a sample of data \[x = \{ -1 \,\,\, 0 \,\,\, 3 \,\, -\!2 \,\,\, 4 \}\] There are \(n=5\) observations. We can write the summation of all five observations as \[\begin{align*} \sum_{i=1}^n x_i &= \sum_{i=1}^5 x_i \\ &= \, x_1 + x_2 + x_3 + x_4 + x_5\\ &= -1 \, + \, 0 \, + \, 3 \, -2 \, + \, 4\\ &= \, 4 \end{align*}\]

Note that it is quite common to see the notation \(\sum x_i\), without lower or upper limits for the index \(i\). In such cases, it is assumed that we start at \(i=1\) and end at \(i=n\). So, writing \(\sum x_i\) is the same as writing \(\sum_{i=1}^n x_i\).

Summing the values of a vector is easy in R. In our first example, we had the vector

> x <- c(3, 1, 0, 1, 4, 2)
> x
[1] 3 1 0 1 4 2

To add all the elements of \(x\) — i.e., to calculate \(\sum_{i=1}^6 x_i\) — we use the sum() command:

> sum(x)
[1] 11
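The second sample from the previous section, with \(n=5\), can be summed the same way. The sketch below also writes the sum as an explicit loop over \(i = 1, \ldots, n\) to mirror the sigma notation. The names x2, n2, total, and running are chosen for illustration, and x2 is used so that x is not overwritten.

```r
# Second sample from the text
x2 <- c(-1, 0, 3, -2, 4)
n2 <- length(x2)   # number of observations, n = 5

# Built-in summation
total <- sum(x2)
print(total)  # 4

# The same sum written as a loop over i = 1, ..., n
running <- 0
for (i in seq_len(n2)) {
  running <- running + x2[i]
}
print(running)  # 4
```

In practice sum() is preferable. The loop is only there to show what the index \(i\) is doing.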

B.2.2 Summing the Squared Elements of a Vector

As before, suppose we have a sample of six data points \[\begin{align*} x &= \{ x_1 \,\,\, x_2 \,\,\, x_3 \,\,\, x_4 \,\,\, x_5 \,\,\, x_6 \}\\ &=\{ \,3 \,\,\,\,\,\, 1 \,\,\,\,\,\, 0 \,\,\,\,\,\, 1 \,\,\,\,\,\, 4 \,\,\,\,\,\, 2 \} \end{align*}\]

Now suppose we want to square each element of \(x\) and then add those squared values. We write this mathematically as \[\begin{align*} \sum_{i=1}^6 x_i^2 & = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 + x_6^2\\ &= 3^2 + 1^2 + 0^2 + 1^2 + 4^2 + 2^2 \\ &= \, 9\, +\, 1\, +\, 0\, + \, 1 \, + \, 16\, +\, 4\\ &=31 \end{align*}\]

Summing squared values is not difficult in R. Again, we have the sample

> x
[1] 3 1 0 1 4 2

We can square each element of \(x\)

> x^2
[1]  9  1  0  1 16  4

To sum the squared values of \(x\), simply wrap that in sum()

> sum(x^2)
[1] 31

R executes x^2 first and then sums over the vector of squared values.
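The order of operations matters here. As a brief sketch using the same sample, the sum of squares \(\sum x_i^2\) is not the same as the square of the sum \(\left(\sum x_i\right)^2\). The names sum_of_squares and square_of_sum are chosen for illustration.

```r
x <- c(3, 1, 0, 1, 4, 2)

# Square each element first, then add
sum_of_squares <- sum(x^2)
print(sum_of_squares)  # 31

# Add the elements first, then square the total (11^2)
square_of_sum <- sum(x)^2
print(square_of_sum)  # 121
```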

B.2.3 Sum of Squared Deviations from the Mean

When calculating the sample variance or sample standard deviation, a key part of the calculation is the sum of squared deviations from the mean \[\sum_{i=1}^n \left( x_i - \bar{x} \right)^2\]

If you’ve mastered the previous two summations, calculating this is relatively straightforward.

Let’s use the same sample

> x
[1] 3 1 0 1 4 2

In order to calculate the sum of the squared deviations from the mean, we need to first calculate the sample mean. We can do that in a couple of ways

> n <- length(x)
> n
[1] 6
> mx <- sum(x)/n
> mx
[1] 1.833

Or, we can use the built-in R command

> mx <- mean(x)
> mx
[1] 1.833

Again, our sample is

> x
[1] 3 1 0 1 4 2

We assigned the mean (approximately 1.833) to mx because every deviation is measured from it: we subtract the mean from each observation and then square the result. \[\begin{align*} \sum_{i=1}^n \left( x_i - \bar{x} \right)^2 &\approx \sum_{i=1}^6 \left( x_i - 1.833 \right)^2 \\ &= (3-1.833)^2 + (1-1.833)^2 +\cdots + (4-1.833)^2 + (2-1.833)^2\\ &\approx 10.833 \end{align*}\]

Let’s step through the components of this calculation in R. For each element of \(x\), the squared deviation from the mean is

> (x-mx)^2
[1] 1.36111 0.69444 3.36111 0.69444 4.69444 0.02778

Now we sum those squared deviations from the mean

> sum( (x-mx)^2 ) 
[1] 10.83
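To connect this back to the sample variance and sample standard deviation, here is a minimal sketch. It assumes the usual convention of dividing by \(n-1\), which is what R's var() and sd() use. The names ss, s2, and s are chosen for illustration.

```r
x <- c(3, 1, 0, 1, 4, 2)
n <- length(x)

# Sum of squared deviations from the mean
ss <- sum((x - mean(x))^2)

# Dividing by n - 1 gives the sample variance;
# its square root is the sample standard deviation
s2 <- ss / (n - 1)
s  <- sqrt(s2)

# Compare the hand calculations with R's built-in functions
print(c(s2, var(x)))
print(c(s, sd(x)))
```

Each print() call should show two matching values, confirming that the built-in functions are built on the same sum of squared deviations.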