What makes the t-distribution flatter and wider compared to the standard normal distribution?

I'll try to give an intuitive explanation.

The t-statistic* has a numerator and a denominator. For example, the statistic in the one sample t-test is

$$\frac{\bar{x}-\mu_0}{s/\sqrt{n}}$$

*(there are several, but this discussion should hopefully be general enough to cover the ones you are asking about)

Under the assumptions, the numerator has a normal distribution with mean 0 and some unknown standard deviation.

Under the same set of assumptions, the denominator is an estimate of the standard deviation of the distribution of the numerator (the standard error of the statistic in the numerator). It is independent of the numerator. Its square is $\sigma^2_\text{numerator}$ times a chi-square random variable divided by its degrees of freedom (which is also the d.f. of the t-distribution).

When the degrees of freedom are small, the denominator tends to be fairly right-skew. It has a high chance of being less than its mean, and a relatively good chance of being quite small. At the same time, it also has some chance of being much, much larger than its mean.

Under the assumption of normality, the numerator and denominator are independent. So if we draw randomly from the distribution of this t-statistic we have a normal random number divided by a second randomly* chosen value from a right-skew distribution that's on average around 1.

* without regard to the normal term

Because it's in the denominator, the small values in the distribution of the denominator produce very large t-values; this right-skew makes the t-statistic heavy-tailed. At the same time, the right tail of the denominator's distribution produces t-values close to zero, which makes the t-distribution more sharply peaked than a normal with the same standard deviation as the t.

However, as the degrees of freedom become large, the distribution of the denominator becomes much more normal-looking and much more "tight" around its mean.

As such, the effect of dividing by the denominator on the shape of the distribution of the numerator reduces as the degrees of freedom increase.

Eventually - as Slutsky's theorem might suggest to us could happen - the effect of the denominator becomes more like dividing by a constant and the distribution of the t-statistic is very close to normal.
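This construction is easy to simulate directly (a sketch, assuming numpy is available): draw the normal numerator and the chi-square-based denominator independently and divide, then compare tail behaviour against the numerator alone.

```python
import numpy as np

# Simulate the construction described above: normal numerator divided by an
# independent sqrt(chi2_df / df) denominator gives a t_df variate.
# Exact frequencies below depend on the seed and number of simulations.
rng = np.random.default_rng(0)
n_sims, df = 200_000, 3

z = rng.standard_normal(n_sims)                # numerator: N(0, 1)
s = np.sqrt(rng.chisquare(df, n_sims) / df)    # denominator: right-skew, mean near 1
t = z / s                                      # simulated t with 3 d.f.

# Heavy tails: |t| > 3 happens far more often than |z| > 3
print(np.mean(np.abs(t) > 3))   # roughly 0.06 for t_3
print(np.mean(np.abs(z) > 3))   # roughly 0.003 for N(0,1)
```

The small-denominator draws are exactly what inflate the tails here; with, say, df = 30 instead of 3 the two frequencies come out much closer together.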

Considered in terms of the reciprocal of the denominator

whuber suggested in comments that it might be more illuminating to look at the reciprocal of the denominator. That is, we could write our t-statistics as numerator (normal) times reciprocal-of-denominator (right-skew).

For example, our one-sample-t statistic above would become:

$${\sqrt{n}(\bar{x}-\mu_0)}\cdot{1/s}$$

Now consider the population standard deviation of the original $X_i$, $\sigma_x$. We can multiply and divide by it, like so:

$${\sqrt{n}(\bar{x}-\mu_0)/\sigma_x}\cdot{\sigma_x/s}$$

The first term is standard normal. The second term (the square root of a scaled inverse-chi-squared random variable) then scales that standard normal by values that are either larger or smaller than 1, "spreading it out".

Under the assumption of normality, the two terms in the product are independent. So if we draw randomly from the distribution of this t-statistic we have a normal random number (the first term in the product) times a second randomly-chosen value (without regard to the normal term) from a right-skew distribution that's 'typically' around 1.

When the d.f. are large, the value tends to be very close to 1, but when the d.f. are small, it's quite skew and the spread is large, with the big right tail of this scaling factor making the tails of the t quite fat.
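We can look at the distribution of this scaling factor $\sigma_x/s$ directly (a sketch, assuming numpy is available); its square is an inverse-chi-squared variable scaled by the d.f., so we can simulate it from chi-square draws:

```python
import numpy as np

# The scaling factor sigma_x / s equals sqrt(df / chi2_df): right-skewed,
# 'typically' around 1, with occasional much larger values.
rng = np.random.default_rng(1)
df = 4
scale = np.sqrt(df / rng.chisquare(df, 200_000))   # sigma_x / s

print(np.median(scale))        # close to 1
print(np.mean(scale > 3))      # small but non-negligible chance of large values
print(np.mean(scale < 1/3))    # essentially no chance of values this small
```

Multiplying a standard normal by these draws stretches the occasional value far out into the tails while leaving the typical value roughly unchanged, which is exactly the fat-tail mechanism described above.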

Definition :

When samples are drawn from a normal population whose variance is known, the distribution of the sample mean is normal. When, however, the variance of the population is unknown, the distribution is not normal but Student-t, whose tails are longer. That means a sample mean standardized with an unknown population variance is more inclined to take extreme values. If you use the normal distribution for hypothesis testing instead of the t distribution, the probability of error becomes larger.

Formula :

Suppose we have a simple random sample of size $n$ drawn from a Normal population with mean $\mu$ and standard deviation $\sigma$. Let $\bar{x}$ denote the sample mean and $s$ the sample standard deviation. Then the quantity

$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}} \tag{1}$$

has a t distribution with $n-1$ degrees of freedom.
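As a quick sanity check (assuming numpy and scipy are available), the quantity above computed by hand matches the statistic scipy's one-sample t-test reports:

```python
import numpy as np
from scipy import stats

# Compute (xbar - mu) / (s / sqrt(n)) by hand and compare with scipy's
# one-sample t-test statistic on the same (arbitrary, simulated) data.
rng = np.random.default_rng(2)
x = rng.normal(loc=10, scale=2, size=12)
mu0 = 10

t_by_hand = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(len(x)))
t_scipy = stats.ttest_1samp(x, mu0).statistic

print(t_by_hand, t_scipy)   # identical up to floating point
```

Note the `ddof=1` in the standard deviation: $s$ uses the $n-1$ divisor, which is also where the degrees of freedom of the t distribution come from.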

Note that there is a different t distribution for each sample size; in other words, it is a class of distributions. When we speak of a specific t distribution, we have to specify the degrees of freedom. The degrees of freedom for this t statistic come from the sample standard deviation $s$ in the denominator of equation (1).

The t density curves are symmetric and bell-shaped like the normal distribution and have their peak at 0. However, the spread is greater than that of the standard normal distribution. This is because in formula (1) the denominator is $s$ rather than $\sigma$. Since $s$ is a random quantity that varies from sample to sample, there is extra variability in $t$, resulting in a larger spread.

The larger the degrees of freedom, the closer the t-density is to the normal density. This reflects the fact that the sample standard deviation $s$ approaches $\sigma$ for large sample size $n$.
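This convergence can be quantified (assuming scipy is available) by measuring the largest gap between the t density and the standard normal density as the degrees of freedom grow:

```python
import numpy as np
from scipy import stats

# Largest pointwise gap between the t density and the standard normal
# density over a grid; it shrinks toward zero as df increases.
grid = np.linspace(-4, 4, 801)
gaps = []
for df in (2, 10, 100):
    gaps.append(np.max(np.abs(stats.t.pdf(grid, df) - stats.norm.pdf(grid))))
print(gaps)   # strictly decreasing in df
```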

Properties :

The Student t distribution is different for different sample sizes.

The Student t distribution is generally bell-shaped, but with smaller sample sizes shows increased variability (flatter). In other words, the distribution is less peaked than a normal distribution and with thicker tails. As the sample size increases, the distribution approaches a normal distribution. For n > 30, the differences are negligible.

The mean is zero (much like the standard normal distribution).

The distribution is symmetrical about the mean.

The variance is greater than one, but approaches one from above as the sample size increases ($\sigma^2 = 1$ for the standard normal distribution).

The population standard deviation is unknown.

The population is essentially normal (unimodal and basically symmetric)
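Several of these properties are easy to check numerically (a sketch, assuming scipy is available). For df > 2 the t distribution has mean 0 and variance df/(df − 2), which is greater than 1 and tends to 1:

```python
from scipy import stats

# Mean and variance of the t distribution for several degrees of freedom.
# For df > 2 the variance is df / (df - 2): always > 1, shrinking toward 1.
results = {df: stats.t.stats(df, moments='mv') for df in (3, 10, 30, 1000)}
for df, (mean, var) in results.items():
    print(df, float(mean), float(var))
```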

Method :

The t distribution table

Illustrations :

The current rate for producing 5 amp fuses at Neary Electric Co. is 250 per hour. A new machine has been purchased and installed that, according to the supplier, will increase the production rate. A sample of 10 randomly selected hours from the last month revealed the mean hourly production on the new machine was 256, with a sample standard deviation of 6 per hour. At the .05 significance level can Neary conclude that the new machine is faster ?

Step 1: H0: μ ≤ 250; H1: μ > 250

Step 2: H0 is rejected if t > 1.833, df = 9

Step 3: t = (256 − 250)/(6/sqrt(10)) ≈ 3.16

Step 4: H0 is rejected. The new machine is faster.
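The steps above can be checked numerically (assuming numpy and scipy are available), including the one-tailed critical value quoted in Step 2:

```python
import numpy as np
from scipy import stats

# The Neary Electric example: one-sample, one-tailed t-test at alpha = 0.05.
n, xbar, s, mu0, alpha = 10, 256.0, 6.0, 250.0, 0.05

t_stat = (xbar - mu0) / (s / np.sqrt(n))
t_crit = stats.t.ppf(1 - alpha, df=n - 1)   # one-tailed critical value, df = 9

print(round(t_stat, 2))   # 3.16
print(round(t_crit, 3))   # 1.833
print(t_stat > t_crit)    # True, so H0 is rejected
```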

Applications :

It is often the case that one wants to calculate the size of sample needed to obtain a certain level of confidence in survey results. Unfortunately, this calculation requires prior knowledge of the population standard deviation ($\sigma$). Realistically, $\sigma$ is unknown. Often a preliminary sample will be conducted so that a reasonable estimate of this critical population parameter can be made. If such a preliminary sample is not made, but confidence intervals for the population mean are to be constructed using an unknown $\sigma$, then the Student t distribution can be used.

Applet :

For an illustration of t distributions, go to
//www.econtools.com/jevons/java/Graphics2D/tDist.html

Excel Function :

These functions can be accessed by clicking on Insert and then choosing Function from the drop-down menu.
The Excel function for a Student's t test on a given data set is:
TTEST(array1, array2, tails, type)
It returns the probability associated with a Student's t-test.

Is the t distribution wider than the normal distribution?

The main difference between using the t-distribution compared to the normal distribution when constructing confidence intervals is that critical values from the t-distribution will be larger, which leads to wider confidence intervals.
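For example (assuming scipy is available), comparing two-sided 95% critical values shows the t values always exceed the normal 1.96, most dramatically at small degrees of freedom:

```python
from scipy import stats

# Two-sided 95% critical values: t exceeds the normal value for every df,
# so t-based confidence intervals are wider, especially for small samples.
z_crit = stats.norm.ppf(0.975)
print(round(z_crit, 3))          # 1.96
for df in (5, 10, 30, 100):
    print(df, round(stats.t.ppf(0.975, df), 3))
```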

How does the t distribution compare to the standard normal distribution?

Like a standard normal distribution (or z-distribution), the t-distribution has a mean of zero. The normal distribution assumes that the population standard deviation is known. The t-distribution does not make this assumption. The t-distribution is defined by the degrees of freedom.

Why is the t distribution flatter than the z distribution?

The Student t distribution is generally flatter than the z distribution because the t curve has heavier tails than the normal curve. The heavier tails of the t distribution mean that it is more prone to producing values that fall far from the mean (in the tails).
