In the data analysis section of a research project, the researcher describes the data gathered during data collection. Data can be described both visually and statistically. Visual displays, such as graphs, histograms, tables, and plots, can reveal the distribution of the data as well as trends, anomalies, and outliers. This stage is a precursor to any statistical procedures a researcher might use to test the research hypotheses, which raises the question of why the researcher should not jump in and immediately begin testing hypotheses with statistical analysis. In this post we explain the importance of descriptive statistics for checking that the data meet the required assumptions before a parametric test is used.

__Assumptions: The Importance of Describing Data__

There are numerous advantages to describing data. One of the most important is determining whether the data meet the assumptions required for parametric statistical procedures. Parametric procedures include, but are not limited to, regression, correlation, ANOVA, and the t test. Parametric tests have different assumptions that the data must meet depending on which test is being considered. Most parametric tests require that the data meet the assumption of **normality**. **Normality** means a normal distribution of the data which, when graphed as frequencies, resembles a bell shape (as in the accompanying image). Other common assumptions that should be met, depending on the statistical procedure employed, include level of measurement, sample size, homogeneity of variance, independence, absence of outliers, and linearity (Field, 2005). It is important that the researcher understands the assumptions of any parametric statistical procedure being considered and determines whether they are met before employing the procedure in a research study. The researcher should not use parametric statistics if the data do not meet the assumptions, as their use would produce erroneous results. Fortunately, there are corresponding non-parametric tests that the researcher can use when the data do not meet the assumptions for parametric tests.
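As a minimal sketch of this workflow, the snippet below checks two common assumptions (normality via the Shapiro-Wilk test, homogeneity of variance via Levene's test) before choosing between a parametric t test and its non-parametric counterpart. The data are synthetic and the use of scipy/numpy is an assumption of this example, not a prescription.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical scores for two independent groups (assumed data, illustration only).
group_a = rng.normal(loc=50, scale=10, size=40)
group_b = rng.normal(loc=55, scale=10, size=40)

# Shapiro-Wilk tests the null hypothesis that a sample is normally distributed.
_, p_norm_a = stats.shapiro(group_a)
_, p_norm_b = stats.shapiro(group_b)

# Levene's test checks homogeneity of variance across the groups.
_, p_levene = stats.levene(group_a, group_b)

alpha = 0.05
if p_norm_a > alpha and p_norm_b > alpha and p_levene > alpha:
    # Assumptions hold: use the parametric independent-samples t test.
    stat, p_value = stats.ttest_ind(group_a, group_b)
    test_used = "t test"
else:
    # Assumptions violated: fall back to the non-parametric Mann-Whitney U test.
    stat, p_value = stats.mannwhitneyu(group_a, group_b)
    test_used = "Mann-Whitney U"

print(test_used, round(p_value, 4))
```

The decision rule here is deliberately simplified; in practice the researcher would also inspect plots and consider sample size before committing to either test.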

Non-parametric tests also have assumptions that the data must meet, but they are fewer and less rigid. An example of a parametric procedure for correlation is Pearson’s correlation coefficient (Pearson’s r), while a parallel non-parametric test for correlation is Spearman’s rank correlation coefficient (Spearman’s rho). An example of a causal-comparative parametric procedure is ANOVA, while a corresponding non-parametric causal-comparative test is Kruskal-Wallis. Given that non-parametric tests do not require as many assumptions to be met, some students ask why non-parametric tests are not always used. The reason is that parametric tests are more powerful than non-parametric tests and should be used when the assumptions are met: a parametric test is more likely than a non-parametric test to detect a true effect when one exists, and therefore to reject the null hypothesis. Researchers who are not certain which test is most appropriate are advised to conduct both the parametric and the non-parametric test. If the results agree, there is nothing more to worry about. If the result is statistically significant for the parametric test but non-significant for the non-parametric test, the researcher should take a closer look at whether the assumptions were met.

__Assumption of Normality__

Assumptions are evaluated both visually and statistically.
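Both kinds of evaluation can be sketched in a few lines. The example below prints a coarse text histogram as the visual check (in practice a plotted histogram or Q-Q plot would be used) and runs the Shapiro-Wilk test as the statistical check; the sample data are synthetic and numpy/scipy are assumed dependencies.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
scores = rng.normal(loc=100, scale=15, size=200)  # assumed sample data

# Visual check: a coarse text histogram of the frequency distribution.
counts, edges = np.histogram(scores, bins=10)
for count, left in zip(counts, edges):
    print(f"{left:6.1f} | {'#' * count}")

# Statistical check: Shapiro-Wilk test of the normality null hypothesis.
stat, p = stats.shapiro(scores)
print(f"Shapiro-Wilk W = {stat:.3f}, p = {p:.3f}")
# A p value above the chosen alpha (e.g. .05) means we fail to reject
# normality, which supports the assumption.
```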

As mentioned earlier, a normal distribution of the data is the most commonly required assumption for parametric statistical tests. The following explains how the assumption of normality can be described and tested. A normal distribution of data exhibits the characteristics of a bell-shaped curve, as shown below. In a perfect normal curve, the frequency distribution is symmetrical about the center; the mean, median, and mode are all equal; and the tails of the curve approach but never touch the x-axis. These are all preliminary indicators that a curve may represent a normal distribution, but there are additional factors to consider.
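One of these properties, that the mean and median coincide in a normal distribution, is easy to see empirically. The sketch below draws a large synthetic normal sample (numpy assumed available) and confirms that the two measures of center nearly agree.

```python
import numpy as np

rng = np.random.default_rng(1)
# Assumed large sample drawn from a standard normal distribution.
sample = rng.normal(loc=0.0, scale=1.0, size=100_000)

mean = sample.mean()
median = np.median(sample)

# In a perfect normal curve the mean and median are equal; in a large
# random sample they should be very close.
print(round(mean, 3), round(median, 3))
assert abs(mean - median) < 0.05
```

With real (finite, noisy) data the two values will rarely match exactly; a large gap between them is one early warning sign of skew.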

Statistical procedures used to test hypotheses also have unique assumptions about the scales on which the data are measured. Data can be measured on a nominal, ordinal, interval, or ratio scale. It is important to determine the measurement-scale assumption of any statistical procedure being considered before applying it to the data. For instance, an assumption of Pearson’s r is that the data be measured at the interval or ratio level. It is critical that researchers ensure the assumptions are met so that they can have confidence in the validity and reliability of their results.
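The Pearson/Spearman pairing mentioned above illustrates this: Pearson’s r operates on the raw interval-level values, while Spearman’s rho first converts the data to ranks, so ordinal measurement suffices. A minimal sketch, using assumed synthetic data and scipy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Assumed interval-level data with a roughly linear relationship.
x = rng.normal(size=50)
y = 2 * x + rng.normal(scale=0.5, size=50)

# Pearson's r assumes interval/ratio measurement (among other assumptions).
r, p_r = stats.pearsonr(x, y)

# Spearman's rho ranks the data first, so ordinal measurement is enough.
rho, p_rho = stats.spearmanr(x, y)

print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

When the data are genuinely interval-level and linear, the two coefficients will usually be close; a large discrepancy between them is itself a hint that an assumption (such as linearity or absence of outliers) may be violated.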
