As an introduction to sampling theory, consider the problem of estimating
the average IQ of students attending the University of Liverpool.
To test the entire group, or *population*, of students would take
too long. Instead, it is decided that tests should be handed out to a
*sample* of the student population. From the sample, results
regarding the population can be *statistically inferred*. The reliability
of the survey depends on whether the sample is properly chosen.

IQ scores range between 0 and 200. The set of all possible scores can be
represented by the sample space \(\Omega = \{0, 1, 2, ..., 200\}\).
Let the variable \(X(\omega) = \omega\) represent a particular outcome after
completing a test. Clearly \(X\) is a discrete random variable. An alphabetic
roll call of students is used to select a *systematic* sample. The list
is first split into groups of \(k\) students (where \(k\) is an integer greater than
1). If the population size, \(N\), is not a multiple of \(k\), the last group has
fewer than \(k\) students. Next a random integer, \(r\), between \(0\) and \(k - 1\) is
chosen. Students are included in the sample if their position in the roll call is
congruent to \(r\) modulo \(k\).
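The selection rule above can be sketched in a few lines of Python. The `systematic_sample` helper and the `student_…` names are illustrative, not part of the text:

```python
import random

def systematic_sample(roll_call, k):
    """Pick a random start r in [0, k - 1], then keep every k-th
    entry of the roll call, i.e. positions congruent to r modulo k."""
    r = random.randrange(k)  # random integer between 0 and k - 1
    return [student for i, student in enumerate(roll_call) if i % k == r]

# Example: a roll call of N = 10 students sampled with k = 3.
# Depending on r, the sample has either 3 or 4 students, because
# 10 is not a multiple of 3.
students = [f"student_{i}" for i in range(10)]
sample = systematic_sample(students, 3)
```

Successive sampled positions differ by exactly \(k\), which is what makes the sample *systematic* rather than fully random.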

Let the size of the sample be \(n\). On receipt of \(n\) tests, each student in the
sample is assigned a score, \(x_{i}\), in the range \(0\) to \(200\) where
\(x_{i}\) is the value of a random variable \(X_{i}\). The *sample mean* is a
random variable defined by

$$ \overline{X} = \frac{1}{n} \sum_{i = 1}^{n} X_{i} $$

whose value is

$$ \overline{x} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}. $$
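As a concrete check of the definition, the sample mean of a handful of hypothetical scores (the values below are made up for illustration):

```python
import statistics

# Hypothetical scores x_1, ..., x_n for a sample of n = 5 students
scores = [95, 110, 102, 130, 88]
n = len(scores)

# The sample mean as defined above: (1/n) * sum of the x_i
x_bar = sum(scores) / n  # → 105.0

# The standard-library helper computes the same quantity
assert x_bar == statistics.fmean(scores)
```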

\(X_{1}, ..., X_{n}\) are independent random variables whose distribution functions are the same as that of the population, which has mean \(\mu\) and variance \(\sigma^{2}\). By linearity of expectation, the expected value of the sample mean, \(\mathbf{E}(\overline{X})\), is the population mean, \(\mu\), because

$$ \begin{align*} \mathbf{E}(\overline{X}) &= \frac{1}{n}\left(\sum_{i=1}^{n}\mathbf{E}(X_{i})\right) \\ &= \frac{1}{n}(n\mu) \\ &= \mu. \end{align*} $$

Furthermore, \(X_{1}, ..., X_{n}\) have variance \(\sigma^{2}\) and so the variance of \(\overline{X}\), \(\text{Var}(\overline{X})\), is

$$ \begin{align*} \mathbf{E}\left((\overline{X} - \mu)^{2}\right) &= \text{Var}\left(\frac{1}{n}\sum_{i = 1}^{n} X_{i}\right) \\ &= \frac{1}{n^2}\sum_{i = 1}^{n}\text{Var}(X_{i}) \\ &= \frac{1}{n^2} n \sigma^{2} \\ &= \frac{\sigma^{2}}{n}. \end{align*} \tag{19} $$

As the sample size increases, the variation, or scatter, of the sample means tends to zero, so larger samples give increasingly reliable estimates of \(\mu\).
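A quick simulation illustrates the \(\sigma^{2}/n\) result in (19). The sketch below assumes a normal population with IQ-like parameters \(\mu = 100\), \(\sigma = 15\); these values are illustrative, since the text does not specify the population distribution:

```python
import random
import statistics

def var_of_sample_means(n, trials=20000, mu=100.0, sigma=15.0):
    """Draw `trials` samples of size n from a normal(mu, sigma)
    population and return the variance of the resulting sample means,
    which (19) predicts to be sigma^2 / n."""
    means = [
        statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pvariance(means)

random.seed(1)
v10 = var_of_sample_means(10)   # theory: 15^2 / 10 = 22.5
v40 = var_of_sample_means(40)   # theory: 15^2 / 40 = 5.625
```

Quadrupling the sample size cuts the variance of \(\overline{X}\) to roughly a quarter, in line with (19).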

From (5) the
random variable, \(S^{2}\), giving the *sample variance* is

$$ S^{2} = \frac{1}{n} \sum_{i = 1}^{n}(X_{i} - \overline{X})^{2}. $$

It turns out that the sample variance, \(S^{2}\), is not an unbiased estimator of the population variance, \(\sigma^{2}\): on average, \(S^{2}\) underestimates \(\sigma^{2}\) by a factor of \((n - 1)/n\), so that

$$ \mathbf{E}(S^{2}) = \frac{n - 1}{n} \sigma^{2}. \tag{20} $$

The proof of (20) is as follows. Consider the term \(X_{i} - \overline{X} = (X_{i} - \mu) - (\overline{X} - \mu)\). Then \((X_{i} - \overline{X})^{2} = (X_{i} - \mu)^{2} - 2(X_{i} - \mu)(\overline{X} - \mu) + (\overline{X} - \mu)^2\) and, since \(\sum_{i=1}^{n}(X_{i} - \mu) = n(\overline{X} - \mu)\),

$$
\begin{align*}
\sum_{i = 1}^{n}(X_{i} - \overline{X})^2
&= \sum_{i=1}^{n}(X_{i} - \mu)^{2} - 2(\overline{X} - \mu)\sum_{i=1}^{n}(X_{i} - \mu) + \sum_{i=1}^{n}(\overline{X} - \mu)^{2} \\
&= \sum_{i=1}^{n}(X_{i} - \mu)^{2} - 2n(\overline{X} - \mu)^{2} + n(\overline{X} - \mu)^{2} \\
&= \sum_{i=1}^{n}(X_{i} - \mu)^{2} - n(\overline{X} - \mu)^{2}.
\end{align*}
\tag{21}
$$

Taking the expectation of (21) and substituting (19) gives

$$
\begin{align*}
\mathbf{E}\left(\sum_{i=1}^{n}(X_{i} - \overline{X})^{2}\right)
&= \mathbf{E}\left(\sum_{i=1}^{n}(X_{i} - \mu)^{2}\right) - n\mathbf{E}\left((\overline{X} - \mu)^{2}\right) \\
&= n\sigma^{2} - n\left(\frac{\sigma^{2}}{n}\right) \\
&= (n - 1)\sigma^{2}
\end{align*}
$$

so that

$$ \mathbf{E}(S^{2}) = \frac{n-1}{n}\sigma^{2} $$

and

$$ \sigma^{2} = \frac{n}{n-1}\mathbf{E}(S^{2}). \tag{22} $$

Equation (22) is an important result: the population variance, \(\sigma^{2}\), equals the expected sample variance, \(\mathbf{E}(S^{2})\), multiplied by \(n/(n - 1)\). Equivalently, dividing the sum of squared deviations by \(n - 1\) instead of \(n\) yields an unbiased estimator of \(\sigma^{2}\).
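The bias and its correction can be checked empirically. The sketch below again assumes an illustrative normal population with \(\sigma = 15\); it averages the biased \(S^{2}\) over many samples of size \(n = 5\) and then applies the \(n/(n - 1)\) factor from (22):

```python
import random
import statistics

def biased_sample_variance(xs):
    """S^2 with divisor n, the biased sample variance defined above."""
    m = statistics.fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

random.seed(2)
n, mu, sigma = 5, 100.0, 15.0

# Average S^2 over many samples drawn from a population with
# variance sigma^2 = 225; (20) predicts (n - 1)/n * 225 = 180.
avg_s2 = statistics.fmean(
    biased_sample_variance([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(40000)
)

# Multiplying by n/(n - 1), as in (22), recovers sigma^2 ≈ 225.
corrected = avg_s2 * n / (n - 1)
```

This is why statistical libraries' variance routines offer a divisor of \(n - 1\) (for example, `statistics.variance` in the Python standard library uses it by default).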