Dear Student, as discussed in the theory lecture, you can find here the basics of the theory of tests of statistical hypotheses. The reference for this content is "Exploring Statistics" by D. Raghavarao.
Statistical Hypothesis:
We may define a statistical hypothesis as an assumption regarding a parameter of the population distribution under study. To establish a claim, statisticians use its complement as the null hypothesis and then look for evidence in the sample data to reject that null hypothesis.
For example, we adopt the null hypothesis that the analytical method is not subject to systematic error. The term null is used to imply that there is no difference between the observed and known values other than that which can be attributed to random variation.
Assuming that this null hypothesis is true, statistical theory can be used to calculate the probability that the observed difference (or a greater one) between the sample mean, $\bar{x}$, and the true value, $\mu$, arises purely as a result of random errors. The lower the probability that the observed difference occurs by chance, the less likely it is that the null hypothesis is true. Usually the null hypothesis is rejected if the probability of such a difference occurring by chance is less than 1 in 20 (i.e. 0.05, or 5$\%$). In such a case the difference is said to be significant at the 0.05 (or 5$\%$) level.
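To make this calculation concrete, here is a minimal Python sketch (the measurement values and the true value of 100.0 are purely hypothetical) that computes the probability of a difference at least as large as the one observed, assuming the null hypothesis is true:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements from an analytical method; under H0 the
# method has no systematic error, so the true mean is mu_0
measurements = np.array([100.6, 99.8, 101.2, 100.9, 100.3, 101.1])
mu_0 = 100.0

x_bar = measurements.mean()
s = measurements.std(ddof=1)          # sample standard deviation
n = len(measurements)

# How many estimated standard errors x_bar lies from mu_0
t_obs = (x_bar - mu_0) / (s / np.sqrt(n))

# Probability, under H0, of a difference at least this large in either
# direction, arising purely from random error
p_value = 2 * stats.t.sf(abs(t_obs), df=n - 1)

print(f"t = {t_obs:.3f}, p = {p_value:.4f}")
print("significant at the 0.05 level" if p_value < 0.05 else "not significant")
```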
Types of Hypothesis
In general, when testing the mean, $\mu$, the three cases of formulating $H_{0}$ are
- $H_{0}:\mu \geq \mu_{0}, H_{1}:\mu < \mu_{0}$
- $H_{0}:\mu \leq \mu_{0}, H_{1}:\mu > \mu_{0}$
- $H_{0}:\mu = \mu_{0}, H_{1}:\mu \neq \mu_{0}$
Depending on the specific objectives and goals of a study one and only one of the three types of formulations of $H_{0}$ will be selected and the conclusions will be drawn accordingly.
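The choice of formulation changes only the direction in which evidence against $H_{0}$ is measured. As a sketch of how this plays out in practice (hypothetical data; the `alternative` keyword of `scipy.stats.ttest_1samp` is available in SciPy 1.6 and later):

```python
import numpy as np
from scipy import stats

sample = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4])  # hypothetical data
mu_0 = 10.0

# Case 1: H0: mu >= mu_0  vs  H1: mu < mu_0
print(stats.ttest_1samp(sample, mu_0, alternative='less'))

# Case 2: H0: mu <= mu_0  vs  H1: mu > mu_0
print(stats.ttest_1samp(sample, mu_0, alternative='greater'))

# Case 3: H0: mu = mu_0  vs  H1: mu != mu_0
print(stats.ttest_1samp(sample, mu_0, alternative='two-sided'))
```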
After formulating the null and alternative hypotheses, one computes the test statistic, which measures the difference between the data and what is expected under the null hypothesis. The test statistic is a random variable, while the computed quantity is a numerical value assumed by that random variable. In problems related to testing population means, the standard normal variable, $Z$, or the $t$-variable is usually used as the test statistic.
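A brief sketch of the two statistics (all numbers hypothetical): $Z$ is used when the population standard deviation $\sigma$ is known, and $T$ when $\sigma$ must be estimated from the sample.

```python
import numpy as np

sample = np.array([4.8, 5.1, 5.3, 4.9, 5.2, 5.0, 5.4, 4.7])  # hypothetical
mu_0 = 5.0
n = len(sample)
x_bar = sample.mean()

# Z statistic: population standard deviation sigma assumed known
sigma = 0.25
z = (x_bar - mu_0) / (sigma / np.sqrt(n))

# T statistic: sigma unknown, estimated by the sample standard deviation s
s = sample.std(ddof=1)
t = (x_bar - mu_0) / (s / np.sqrt(n))

print(f"z = {z:.3f} (standard normal under H0)")
print(f"t = {t:.3f} (t distribution with {n - 1} df under H0)")
```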
We assume the null hypothesis is true, and usually (but not always) wish to show that the alternative is actually true. After collecting sample data, we compute a test statistic which is used as evidence for or against the null hypothesis (which we assume is true when calculating the test statistic). The set of values of the test statistic that we believe provide sufficient evidence to reject the null hypothesis in favor of the alternative is called the rejection region. The probability that we could have obtained as strong or stronger evidence against the null hypothesis (when it is true) than what we observed from our sample data is called the observed significance level or $p$-value.
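The rejection-region and $p$-value approaches always lead to the same decision, since the test statistic falls in the rejection region exactly when the $p$-value is at most $\alpha$. A minimal sketch for a two-sided $Z$ test (the observed value 2.31 is hypothetical):

```python
from scipy import stats

alpha = 0.05
z_obs = 2.31      # hypothetical computed value of the Z test statistic

# Rejection region for H0: mu = mu_0 vs H1: mu != mu_0 is |Z| >= z_crit
z_crit = stats.norm.ppf(1 - alpha / 2)        # about 1.96 for alpha = 0.05

# p-value: probability, under H0, of evidence at least as strong as observed
p_value = 2 * stats.norm.sf(abs(z_obs))

print(f"rejection region: |Z| >= {z_crit:.3f}")
print(f"observed z = {z_obs}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```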
Types of Error in Statistical Hypothesis Testing
When testing a particular hypothesis we may commit an error in making the decision. Referring to the hypothesis being tested as hypothesis $H$, the possible decisions and their outcomes are:
| State of $H$ | Do not Reject $H$ | Reject $H$ |
| --- | --- | --- |
| $H$ is true | Correct Decision | Type I Error |
| $H$ is False | Type II Error | Correct Decision |
If the hypothesis $H$ is true and not rejected, or false and rejected, the decision is in either case correct. If hypothesis $H$ is true but rejected, it is rejected in error, and if hypothesis $H$ is false but not rejected, this too is an error. The first of these errors is called a Type I error, and the probability of committing it is designated by the Greek letter $\alpha$ (alpha); the second is called a Type II error, and the probability of committing it is designated by the Greek letter $\beta$ (beta). The probability of committing a Type I error is called the level of significance.
If an $H_{0}$ is rejected in favor of $H_{1}$ at level of significance $\alpha$, it is understood that if similar data are collected and analyzed many times, $H_{0}$ will be rejected in about $100\alpha\%$ of those analyses even when it is true.
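This long-run interpretation of $\alpha$ can be checked by simulation. A sketch (all parameter values hypothetical) that repeatedly samples with $H_{0}$ actually true and counts how often it is rejected:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, mu_0, sigma, n, trials = 0.05, 50.0, 5.0, 25, 100_000
z_crit = stats.norm.ppf(1 - alpha / 2)

# Draw many samples with the true mean equal to mu_0, so H0 is true
samples = rng.normal(mu_0, sigma, size=(trials, n))
z = (samples.mean(axis=1) - mu_0) / (sigma / np.sqrt(n))

# Each rejection here is a Type I error; the rate should be close to alpha
rate = np.mean(np.abs(z) >= z_crit)
print(f"empirical Type I error rate: {rate:.4f} (alpha = {alpha})")
```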
In a problem where a Type I error is serious and one rarely wishes to commit such an error, $\alpha$ will be taken to be small. With a small $\alpha$, $H_{0}$ will be rejected less often. Usually $\alpha$ is taken as 0.05 or 0.01. If an $H_{0}$ is rejected using $\alpha = 0.05$, the test statistic is said to be significant. If an $H_{0}$ is rejected using $\alpha = 0.01$, the test statistic is said to be highly significant. The test statistic is not significant if $H_{0}$ is not rejected at the 0.05 level.
Procedure for Testing a Hypothesis
- $H_{1}$: The alternative hypothesis is the claim we wish to establish.
- $H_{0}$: The null hypothesis is the negation of the claim.
The two types of error and their probabilities are
- Type I error: Rejection of $H_{0}$ when $H_{0}$ is true.
- Type II error: Non-rejection of $H_{0}$ when $H_{1}$ is true.
- $\alpha$ = probability of making a Type I error (also called the level of significance)
- $\beta$ = probability of making a Type II error
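Unlike $\alpha$, which is fixed in advance, $\beta$ depends on which particular alternative value of the parameter is true. A sketch (all numbers hypothetical) computing $\beta$ for a one-sided $Z$ test of $H_{0}: \mu \leq \mu_{0}$ against $H_{1}: \mu > \mu_{0}$:

```python
import numpy as np
from scipy import stats

alpha, mu_0, sigma, n = 0.05, 50.0, 5.0, 25
mu_1 = 52.0       # a particular alternative value of the mean

# Reject H0 when Z >= z_crit, i.e. when x_bar >= cutoff
z_crit = stats.norm.ppf(1 - alpha)
se = sigma / np.sqrt(n)
cutoff = mu_0 + z_crit * se

# beta = P(do not reject H0 | mu = mu_1) = P(x_bar < cutoff | mu = mu_1)
beta = stats.norm.cdf(cutoff, loc=mu_1, scale=se)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f} at mu = {mu_1}")
```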
- We formulate a null hypothesis and an appropriate alternative hypothesis which we accept when the null hypothesis must be rejected.
- We specify the probability of a Type I error; if possible, desired, or necessary, we may also specify the probability of a Type II error for particular alternatives.
- Based on the sampling distribution of an appropriate statistic, we construct a criterion for testing the null hypothesis against the given alternative.
- We calculate from the data the value of the statistic on which the decision is to be based.
- We decide whether to reject the null hypothesis or whether to fail to reject it.
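Putting the five steps together, here is a worked sketch for a two-sided $t$ test (the data and the value $\mu_{0} = 12.0$ are hypothetical):

```python
import numpy as np
from scipy import stats

# Step 1: formulate H0: mu = 12.0 vs H1: mu != 12.0
mu_0 = 12.0

# Step 2: specify the probability of a Type I error
alpha = 0.05

# Step 3: criterion based on the sampling distribution of T:
#         reject H0 if |t| >= t_crit with n - 1 degrees of freedom
sample = np.array([12.4, 12.1, 11.8, 12.6, 12.3, 12.5, 12.2, 12.7])
n = len(sample)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

# Step 4: calculate the value of the statistic from the data
t_obs = (sample.mean() - mu_0) / (sample.std(ddof=1) / np.sqrt(n))

# Step 5: decide whether to reject H0 or fail to reject it
if abs(t_obs) >= t_crit:
    print(f"t = {t_obs:.3f}, |t| >= {t_crit:.3f}: reject H0")
else:
    print(f"t = {t_obs:.3f}, |t| < {t_crit:.3f}: fail to reject H0")
```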