39 Stats: Moment Arithmetic
Purpose: In a future exercise, we will need to be able to do some basic arithmetic with moments of a distribution. To prepare for this later exercise, we’ll do some practice now.
Reading: (None, this is the reading)
Topics: Moments, moment arithmetic, standardization
39.1 Moments
Moments are a particular kind of statistic. There is a general, mathematical definition of a moment, but we will only need to talk about two in this class.
We’ve already seen the mean; this is also called the expectation. For a random variable \(X\), the expectation is defined in terms of its pdf \(\rho(x)\) via
\[\mathbb{E}[X] = \int x \rho(x) dx.\]
We’ve also seen the standard deviation \(\sigma\). This is related to the variance \(\sigma^2\), which is defined for a random variable \(X\) in terms of the expectation
\[\sigma^2 \equiv \mathbb{V}[X] = \mathbb{E}[(X - \mathbb{E}[X])^2].\] For instance, a standard normal \(Z\) has
\[ \begin{aligned} \mathbb{E}[Z] &= 0 \\ \mathbb{V}[Z] &= 1 \end{aligned} \]
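These two moments can be checked directly from the integral definition. The following is an illustrative sketch in Python with NumPy (not part of the course's own test code): it approximates the moment integrals of the standard normal pdf with a Riemann sum over a wide grid.

```python
import numpy as np

# Approximate the integrals defining the first two moments of the
# standard normal by a Riemann sum over a wide, fine grid.
x = np.linspace(-10.0, 10.0, 200_001)
dx = x[1] - x[0]
rho = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)  # standard normal pdf

mean = np.sum(x * rho) * dx                   # E[Z] = ∫ x ρ(x) dx
var = np.sum((x - mean)**2 * rho) * dx        # V[Z] = ∫ (x - E[Z])² ρ(x) dx
print(mean, var)                              # close to 0 and 1
```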
For future exercises, we’ll need to learn how to do basic arithmetic with these two moments.
39.2 Moment Arithmetic
The following exercises will help you practice this basic arithmetic with the mean and variance.
39.3 Expectation
The expectation is linear; that is,
\[\mathbb{E}[aX + c] = a \mathbb{E}[X] + c.\]
We can use this fact to compute the mean of simply transformed random variables.
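Linearity is easy to verify by simulation. Here is an illustrative Python/NumPy sketch (the distribution \(\text{Uniform}(0, 1)\) and the constants \(a = 4, c = 1\) are chosen arbitrarily for the check, so as not to give away the exercise below):

```python
import numpy as np

rng = np.random.default_rng(0)

# X ~ Uniform(0, 1) has E[X] = 1/2, so E[4 X + 1] should be 4 * (1/2) + 1 = 3.
x = rng.uniform(0.0, 1.0, size=1_000_000)
lhs = np.mean(4 * x + 1)   # E[aX + c], estimated directly
rhs = 4 * np.mean(x) + 1   # a E[X] + c, estimated via linearity
print(lhs, rhs)            # both close to 3
```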
39.3.1 q1 Compute the mean of \(2 Z + 3\), where \(Z\) is a standard normal.
Use the following test to check your answer.
## [1] TRUE
## [1] "Nice!"
Since the expectation is linear, it also satisfies
\[\mathbb{E}[aX + bY] = a \mathbb{E}[X] + b \mathbb{E}[Y].\]
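The same kind of simulation check works for sums. A hedged Python/NumPy sketch (the two distributions and the constants \(a = 3, b = 2\) are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# X ~ Exponential(1) has E[X] = 1; Y ~ Normal(2, 1) has E[Y] = 2.
x = rng.exponential(1.0, size=500_000)
y = rng.normal(2.0, 1.0, size=500_000)

# E[3 X + 2 Y] should equal 3 * 1 + 2 * 2 = 7.
sum_mean = np.mean(3 * x + 2 * y)
print(sum_mean)   # close to 7
```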
39.4 Variance
Remember that the variance is the square of the standard deviation. Variance satisfies the property
\[\mathbb{V}[aX + c] = a^2 \mathbb{V}[X].\]
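Note that the additive constant \(c\) drops out entirely: shifting a random variable moves its mean but not its spread. An illustrative Python/NumPy check of this property (again with arbitrary constants, distinct from the exercise below):

```python
import numpy as np

rng = np.random.default_rng(2)

# X ~ Uniform(0, 1) has V[X] = 1/12. Shifting by c = 5 leaves the variance
# unchanged; scaling by a = 3 multiplies it by a² = 9.
x = rng.uniform(0.0, 1.0, size=1_000_000)
v_transformed = np.var(3 * x + 5)
print(v_transformed, 9 * np.var(x))   # both close to 9 / 12 = 0.75
```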
39.4.1 q3 Compute the variance of \(2 Z + 3\), where \(Z\) is a standard normal.
Use the following test to check your answer.
## [1] TRUE
## [1] "Well done!"
The variance of a sum of random variables is a bit more complicated:
\[\mathbb{V}[aX + bY] = a^2 \mathbb{V}[X] + b^2 \mathbb{V}[Y] + 2ab \text{Cov}[X, Y],\]
where \(\text{Cov}[X, Y]\) denotes the covariance of \(X\) and \(Y\). Covariance is closely related to correlation, which we discussed in e-stat03-descriptive. If two random variables \(X, Y\) are uncorrelated, then \(\text{Cov}[X, Y] = 0\).
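We can verify the full variance-of-a-sum identity numerically. The following Python/NumPy sketch builds a correlated pair by giving \(Y\) a piece of \(X\) (the construction and the constants are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Construct correlated X, Y by sharing a common normal term:
# V[X] = 1, V[Y] = 0.5² + 1 = 1.25, Cov[X, Y] = 0.5.
z1 = rng.normal(size=1_000_000)
z2 = rng.normal(size=1_000_000)
x = z1
y = 0.5 * z1 + z2

a, b = 2.0, 3.0
lhs = np.var(a * x + b * y)
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * np.cov(x, y)[0, 1]
print(lhs, rhs)   # both sides agree; true value is 4 + 11.25 + 6 = 21.25
```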
39.5 Standardization
The following two exercises illustrate two important transformations.
39.5.1 q5 Compute the mean and variance of \((X - 1) / 2\), where
\[\mathbb{E}[X] = 1, \quad \mathbb{V}[X] = 4.\]
Use the following test to check your answer.
## [1] TRUE
## [1] TRUE
## [1] "Well done!"
This process of centering (setting the mean to zero) and scaling a random variable is called standardization. For instance, if \(X\) is a normal random variable, then \((X - \mu) / \sigma = Z\) is a standard normal.
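Standardization is straightforward to see in simulation. A hedged Python/NumPy sketch (the parameters \(\mu = 10, \sigma = 3\) are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

# Standardize a Normal(10, 3) sample: center (subtract the mean),
# then scale (divide by the standard deviation).
x = rng.normal(10.0, 3.0, size=1_000_000)
z = (x - 10.0) / 3.0

print(np.mean(z), np.var(z))   # close to 0 and 1
```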
39.5.2 q6 Compute the mean and variance of \(1 + 2 Z\), where \(Z\) is a standard normal.
Use the following test to check your answer.
## [1] TRUE
## [1] TRUE
## [1] "Excellent!"
This example illustrates that we can create a normal with desired mean and standard deviation by transforming a standard normal \(\mu + \sigma Z = X\).
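Going the other direction works the same way. An illustrative Python/NumPy sketch of the transformation \(\mu + \sigma Z\), with arbitrary target parameters:

```python
import numpy as np

rng = np.random.default_rng(5)

# Build a Normal(mu, sigma) sample from a standard normal via mu + sigma * Z.
mu, sigma = -2.0, 0.5
z = rng.normal(size=1_000_000)
x = mu + sigma * z

print(np.mean(x), np.std(x))   # close to -2.0 and 0.5
```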
39.6 Standard Error
The variance satisfies the property
\[\mathbb{V}[aX + bY] = a^2 \mathbb{V}[X] + b^2 \mathbb{V}[Y] + 2ab \text{Cov}[X, Y],\]
where
\[\text{Cov}[X, Y] = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])]\]
is the covariance between \(X\) and \(Y\). If \(X, Y\) are independent, then the covariance between them is zero.
Using this expression, we can prove that the standard error of the sample mean \(\overline{X}\) is \(\sigma / \sqrt{n}\).
39.6.1 q7 (Bonus) Use the identity above to prove that
\[\mathbb{V}[\overline{X}] = \sigma^2 / n,\]
where \(\overline{X} = \frac{1}{n}\sum_{i=1}^n X_i\), \(\sigma^2 = \mathbb{V}[X]\), and the \(X_i\) are mutually independent.
The quantity
\[\sqrt{\mathbb{V}[\overline{X}]}\]
is called the standard error of the mean; more generally, the standard error for a statistic is the standard deviation of its sampling distribution. We'll return to this concept in e-stat06.
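The \(\sigma / \sqrt{n}\) formula can also be checked by simulation, independently of the algebraic proof asked for in q7. A hedged Python/NumPy sketch (the sample size \(n = 25\) and \(\sigma = 2\) are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)

# Draw 100,000 samples, each of n = 25 independent Normal(0, 2) observations,
# and look at the spread of their sample means. Theory: sigma / sqrt(n) = 0.4.
n, sigma = 25, 2.0
sample_means = rng.normal(0.0, sigma, size=(100_000, n)).mean(axis=1)
se_est = np.std(sample_means)
print(se_est)   # close to 0.4
```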