Standard Error
Sample Average | Frequency | Percent
1.00 | 1 | 0.8
1.33 | 3 | 2.4
1.67 | 6 | 4.8
2.00 | 10 | 8.0
2.33 | 15 | 12.0
2.67 | 18 | 14.4
3.00 | 19 | 15.2
3.33 | 18 | 14.4
3.67 | 15 | 12.0
4.00 | 10 | 8.0
4.33 | 6 | 4.8
4.67 | 3 | 2.4
5.00 | 1 | 0.8
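The table can be reproduced by brute force. The sketch below assumes the population is the five scores 1 through 5, sampled with replacement in groups of three. That assumption is not stated in this section, but it is consistent with the 125 samples and the frequencies shown. The variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed population (inferred from the table, not stated in the text).
population = [1, 2, 3, 4, 5]
sample_size = 3

# Every ordered sample of size 3 drawn with replacement: 5**3 = 125 samples.
samples = list(product(population, repeat=sample_size))

# Tally the sample means; Fraction keeps means such as 4/3 exact.
counts = Counter(Fraction(sum(s), sample_size) for s in samples)

print("Sample Average | Frequency | Percent")
for sample_mean in sorted(counts):
    freq = counts[sample_mean]
    pct = 100 * freq / len(samples)
    print(f"{float(sample_mean):.2f} | {freq} | {pct:.1f}")
```

Each row printed corresponds to one row of the table: the sample mean, how many of the 125 samples produce it, and that count as a percentage.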
Now, I present a graphic that compares the sampling distribution for a sample
size of two with the one for a sample size of three. The frequencies of the
means are presented on a percentage basis for easy comparison.

Here are the take-away points:
1. Fewer extremes: With the larger sample size (3), note that there are fewer extreme sample means. Look at the number of sample means equal to 5. This is an extreme and not very representative mean. The percentage is dramatically lower for the N=3 samples.
Thus, with larger samples, you don't get wacky means nearly as often.
2. Tighter distributions: Note that the standard deviation of all the sample means (the standard error) is smaller than with a sample size of 2. Its mean is again 3, but the standard deviation of this distribution of means is equal to .82.
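Both points can be checked numerically. This is a minimal sketch, again assuming the population is the scores 1 through 5 sampled with replacement; `sampling_distribution` is a hypothetical helper written for this illustration.

```python
from itertools import product
from statistics import mean, pstdev

# Assumed population (not stated in this section).
population = [1, 2, 3, 4, 5]

def sampling_distribution(n):
    """Return the mean of every ordered sample of size n drawn with replacement."""
    return [sum(s) / n for s in product(population, repeat=n)]

for n in (2, 3):
    means = sampling_distribution(n)
    # The standard error is the standard deviation of the full set of sample means.
    standard_error = pstdev(means)
    # Percentage of samples whose mean is the extreme value 5.
    pct_extreme = 100 * sum(1 for m in means if m == 5) / len(means)
    print(f"N={n}: mean of means = {mean(means):.2f}, "
          f"standard error = {standard_error:.2f}, "
          f"means equal to 5 = {pct_extreme:.1f}%")
```

Under this assumption, the N=3 run gives a standard error of about .82 and 0.8% of means equal to 5, matching the table, while the N=2 run gives a larger standard error and a larger share of extreme means.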
t-distributions and Leptokurtosis
Sampling distributions of the mean and those of some other statistics have a particular shape. They are bell shaped, like the normal curve, but less peaked and with fatter tails.
This particular shape is called leptokurtic (from leptokurtosis).
The fatness of the tails is controlled by a parameter called the degrees of freedom (df). The df are related to sample size. In our example, df = Nsample - 1. So, for example, with 3 subjects in the sample, df = 2.
The graphics above are actual frequency distributions. The t-distributions, however, are theoretical mathematical functions. Here are some examples comparing the t-distributions with df = 3 and df = 6. Note the fatter tails and the cutoff scores for 5% total extremes (2.5% in each tail). t-distributions have cutoffs like the z-score in Workshop 2.

Let's look at the tails of the distributions. We've marked the cutoffs for the 5% two-tailed level. Note that the smaller the df, the larger the value needed for the cutoff. This reflects our example above, where the smaller sample size gave more extreme sample means. Thus, with big samples it's hard to get weird means. Remember this for the workshop on Hypothesis Testing.
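To see how the cutoff grows as df shrinks, here is a rough numerical sketch that uses only the Python standard library. It integrates the standard t density with Simpson's rule and finds the 2.5% upper-tail cutoff by bisection. The function names (`t_pdf`, `t_cdf`, `t_critical`) are hypothetical helpers for this illustration, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x > 0: Simpson's rule on [0, x] plus 0.5 for the lower half."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(df, tail_area=0.025):
    """Cutoff c with P(T > c) = tail_area, found by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > tail_area:
            lo = mid  # too much area beyond mid: the cutoff lies further out
        else:
            hi = mid
    return (lo + hi) / 2

# 5% two-tailed level means 2.5% in each tail.
for df in (2, 3, 6):
    print(f"df = {df}: cutoff = +/- {t_critical(df):.3f}")
```

The printed cutoffs should land close to the usual tabled values (roughly 4.30 for df = 2, 3.18 for df = 3, and 2.45 for df = 6), so the smaller the df, the further out the cutoff sits.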