
The topic is to identify specific character traits related to the occupation that may contribute to the underrepresentation of women in law enforcement. This is particularly relevant given the slow growth in the number of female officers in the field. My plan to draw on previous studies of police personality and on current issues affecting law enforcement is a good starting point for the research.

Describe the statistical methods you plan to incorporate in your study


Drawing on our text, briefly discuss what statistical methods you intend to use in your study.

Basic Statistics
Statistics are mathematical procedures for describing, synthesizing, and analyzing numerical data. The statistical procedures you choose will be determined by your research questions or hypotheses and the type of data you will collect. Thus, different research questions or hypotheses and different types of data require different statistical analyses. The statistical procedures you use should be described in detail in your Methodology Chapter under the heading Data Analysis and again in your Results Chapter. Data analysis is a very important component of any research study. No matter how well you conduct your study, inappropriate statistical analyses will likely result in inappropriate research conclusions.

Many statistical procedures are available to you. In this chapter, we describe those commonly used in the social and behavioral sciences. The focus is on how to apply these statistics to your dissertation or master's thesis. Contrary to popular opinion, you do not have to be a mathematician to use statistics. All you have to know is when to use an appropriate statistic to accomplish your research purposes, and a computer program will do the rest.

Generally, we recommend using a computer for data analysis, particularly if you have a large amount of data or if multiple analyses are to be performed. The most commonly used software for statistical analysis is the Statistical Package for the Social Sciences (SPSS). Another widely used statistical program is Statistical Analysis System (SAS). Both SPSS and SAS programs can perform all of the statistical procedures described in this chapter.

DESCRIPTIVE STATISTICS

Descriptive statistics are mathematical procedures for organizing and summarizing numerical data. The major types of descriptive statistics are (a) measures of central tendency, (b) measures of variability, (c) measures of relative position, and (d) measures of relationship. These are the types of statistics you will present at the beginning of your Results Chapter under the heading Descriptive Statistics in a quantitative dissertation or master's thesis (see Table 9.1, Chapter 9). Descriptive statistics, and even inferential statistics, are sometimes used in some types of qualitative studies as well (see Slater, 2001, referenced in Chapter 9).

Measures of Central Tendency

Measures of central tendency are indices that represent the typical or average score among a distribution of scores. Measures of central tendency can describe a set of numerical data with a single number. The three most frequently used measures of central tendency are the mean, median, and mode. The choice usually depends on two factors: the type of measurement scale used and the purpose of the research. There are four measurement scales: nominal, ordinal, interval, and ratio.

Nominal data classify persons or objects into two or more categories: sex (male or female), type of school (public or private), IQ (high, average, low), political party (Democrat or Republican), personality type (dominant or passive), race/ethnicity (African American, Asian, Hispanic, White). Ordinal data not only classify persons or objects but also rank them in terms of the degree to which they possess a characteristic of interest. In other words, ordinal data put participants in order from highest to lowest. For example, 15 doctoral cohort members might be ranked from 1 to 15 with respect to height. Percentile ranks are ordinal data. Most standardized tests, like the GRE, provide a raw score as well as a percentile rank from 0 to 100. Interval data have all of the characteristics of nominal and ordinal data, but in addition, they are based on predetermined equal intervals. Most tests used in social science research, such as achievement tests, aptitude tests, and intelligence tests, represent interval data. Ratio data are derived from scales that have an absolute zero and so enable relative comparisons to be made, such as length of school day, class size, age, speed, and dollars.

Mean, Median, and Mode

The mean is the arithmetic average of the scores. It is calculated by adding up all of the scores and dividing that total by the number of scores. The mean is the appropriate measure of central tendency when the data represent either an interval or ratio scale. Because most quantitative measurement in the social sciences uses an interval scale, the mean is the most frequently used measure of central tendency.

The median is the midpoint of a group of scores. For example, for the scores 10, 20, 30, 40, 50, the median is 30. If the number of scores is even, the median is the point halfway between the two middle scores. For example, for the scores 10, 20, 30, 40, 50, 60, the median is 35. The median is the appropriate measure of central tendency when the data represent an ordinal scale.

The mode is the score that occurs most frequently in a distribution of scores. The mode is seldom used because of the problems associated with it. For example, a set of scores may have two or more modes. Such a distribution of scores is referred to as bimodal. Another problem with the mode is that equal-sized samples randomly selected from the same target population can have different modes. Nevertheless, the mode is the appropriate measure of central tendency when the data represent a nominal scale.
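As a quick arithmetic check, all three measures of central tendency can be computed with Python's standard `statistics` module. The first two score sets come from the examples above; the set used for the mode is hypothetical, chosen to have a single most frequent score:

```python
import statistics

scores = [10, 20, 30, 40, 50]                    # odd-numbered set from the median example
mean = statistics.mean(scores)                   # (10+20+30+40+50) / 5 = 30
median = statistics.median(scores)               # middle score of the ordered set = 30
# with an even number of scores, the median falls halfway between the two middle scores
median_even = statistics.median([10, 20, 30, 40, 50, 60])  # halfway between 30 and 40 = 35
mode = statistics.mode([5, 5, 6, 7, 8, 8, 8])    # hypothetical set; 8 occurs most often
print(mean, median, median_even, mode)
```

Note that `statistics.mode` returns a single value; for the bimodal distributions discussed above, `statistics.multimode` would return all modes.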
Measures of Variability

Measures of variability show how spread out the distribution of scores is, or how much, on average, scores differ from the mean. The three most frequently used measures of variability are (a) standard deviation, (b) quartile deviation, and (c) range.

Standard Deviation

The standard deviation is the most frequently used measure of variability. It is the appropriate measure of variability when the data represent either an interval or ratio scale. Like the mean, the standard deviation is the most stable measure of variability, because it takes every score in the distribution into account. By knowing the mean and standard deviation of a set of scores, you will have a good idea of what your distribution of scores looks like. This will become much clearer in the discussion of the normal curve that follows.

Quartile Deviation

The quartile deviation is one half of the difference between the upper quartile and the lower quartile in a distribution of scores. The upper quartile of any distribution of scores is the 75th percentile, the point below which 75% of the scores fall. The lower quartile is the 25th percentile, the point below which 25% of the scores fall. The quartile deviation is calculated by subtracting the lower quartile from the upper quartile and then dividing the result by 2. If the quartile deviation is small, the scores are close together. If it is large, the scores are more spread out. The quartile deviation is the appropriate measure of variability when the data represent an ordinal scale.

Range

The range is the difference between the highest and the lowest score. For example, for the scores 5, 5, 6, 7, 8, 8, the range is 3. If the range is small, the scores are close together. If the range is large, the scores are more spread out. The range is the appropriate measure of variability when the data represent a nominal scale. The range, like the mode, is not a very stable measure of variability. However, it does give a quick, rough estimate of variability.
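A minimal sketch of all three variability measures with Python's standard library, applied to the six scores from the range example. Note that `statistics.quantiles` supports several quartile conventions; the default "exclusive" method is assumed here, so hand calculations using other conventions may differ slightly:

```python
import statistics

scores = [5, 5, 6, 7, 8, 8]             # scores from the range example above

sd = statistics.pstdev(scores)           # population standard deviation
# quartile deviation: half the distance between the upper and lower quartiles
q1, _, q3 = statistics.quantiles(scores, n=4)   # default 'exclusive' method
quartile_deviation = (q3 - q1) / 2
score_range = max(scores) - min(scores)  # highest minus lowest score

print(round(sd, 2), quartile_deviation, score_range)
```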

The Normal Curve

The normal curve (or bell-shaped curve) is a theoretical distribution in which the height of the curve indicates the percentage of cases under portions of the normal curve. As shown in the example in Figure 4.1, the test scores of the majority of individuals tend to cluster close to the mean. Fewer cases occur as we move farther from the mean. Specifically, about 68% of the sample will have scores within the range of plus or minus one standard deviation from the mean. Approximately 95% of the sample will have scores within the range of plus or minus two standard deviations from the mean. And more than 99% of the sample will have scores within the range of plus or minus three standard deviations from the mean. Many variables (e.g., height, weight, IQ scores, achievement test scores) yield a curve similar to the one shown in Figure 4.1, provided a large enough random sample is used.
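These benchmark percentages can be verified numerically with nothing more than the standard library's error function; this is a quick sketch of the theoretical normal curve, not a statistics-package feature:

```python
from math import erf, sqrt

def normal_cdf(z):
    """Proportion of a standard normal distribution falling below z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# proportion of cases within plus or minus 1, 2, and 3 SDs of the mean
for k in (1, 2, 3):
    pct = normal_cdf(k) - normal_cdf(-k)
    print(f"within {k} SD of the mean: {pct:.1%}")
```

Running this prints roughly 68.3%, 95.4%, and 99.7%, matching the figures cited above.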

Measures of Relative Position

Measures of relative position indicate how a person has performed in comparison to all other persons in the sample who have been measured on the same variable. This is called norm referencing. The two
most frequently used measures of relative position are percentile ranks and standard scores.

Percentile Ranks

A percentile rank indicates the percentage of scores that fall at or below a given score. Percentile ranks are appropriate measures of relative position when the data represent an ordinal scale. Percentile ranks are not used much in research studies. They are often used, however, in the public schools to report students' standardized test scores. They are also used in making decisions regarding acceptance to colleges and graduate schools. The SAT and ACT, used as criteria for admission to undergraduate colleges and universities, report both a score and the percentile rank associated with that score. The same is true for the GRE and MAT, used to admit students to graduate school. The percentile rank provides the decision maker with a measure of how well the applicant did in relation to all other test takers.
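The definition just given (percentage of scores at or below a given score) translates directly into code; this is a minimal sketch with hypothetical raw scores for ten test takers:

```python
def percentile_rank(scores, score):
    """Percentage of scores in the group that fall at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# hypothetical raw scores for 10 test takers
scores = [55, 60, 62, 65, 70, 72, 75, 80, 85, 90]
print(percentile_rank(scores, 75))   # 7 of 10 scores are at or below 75, so 70.0
```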

Standard Scores

A standard score indicates how far a given raw score is from the mean, in terms of standard deviation units. A standard score is the appropriate measure of relative position when the data represent an interval or ratio scale. The most commonly used standard scores are z scores and stanines.

z Scores. The z score is the most basic standard score, with a mean of 0 and a standard deviation of 1. Thus, a z score of +1 is at the 84th percentile for a normal distribution, −1 is at the 16th percentile, and −2 is at the 2nd percentile. Other standard scores are linear transformations from the z score, with arbitrarily selected means and standard deviations. That is, it is possible to choose any mean and any standard deviation. Most IQ tests, for example, use 100 as the mean and 15 or 16 as the standard deviation. The resulting IQ score is a standard score (the ratio IQ, mental age divided by chronological age times 100, is rarely used today).
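The z-score transformation and an IQ-style linear rescaling can be sketched with the standard library; the raw scores are hypothetical:

```python
import statistics

raw_scores = [70, 80, 90, 100, 110]      # hypothetical raw scores
mean = statistics.mean(raw_scores)
sd = statistics.pstdev(raw_scores)

# z score: how far each raw score is from the mean, in standard deviation units
z_scores = [(x - mean) / sd for x in raw_scores]

# linear transformation to an IQ-style scale with mean 100 and SD 15
iq_style = [100 + 15 * z for z in z_scores]
```

After the transformation the z scores have mean 0 and standard deviation 1, and the IQ-style scores have mean 100, which is exactly what makes standard scores comparable across tests.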

Stanines. Stanines are standard scores that divide a distribution of scores into nine parts. Stanines, like percentile ranks, are often reported in norms tables for standardized tests. Stanines are often used in the public schools as a basis for ability grouping, and they are used also as a criterion for selecting students for special programs. For example, on the one hand, a remediation program may select students who scored in the first and second stanine on a standardized reading test. On the other hand, a gifted program may select students who scored in the eighth and ninth stanine. Like percentile ranks, the use of stanines is a reporting device that is understandable to most lay people.

Measures of Relationship

The descriptive statistics presented thus far involve the description of a single variable. Measures of relationship indicate the degree to which two variables are related. Correlation is often used for this purpose. The two most frequently used correlational statistics are the product-moment correlation coefficient (r) and rank-difference correlation (rho). For other bivariate correlational techniques (i.e., relationship between two variables), such as the phi coefficient, biserial correlation, correlation ratio (eta), partial correlation, and others, see standard statistics textbooks.

Product-Moment Correlation Coefficient

The product-moment correlation coefficient (r), sometimes called the Pearson r, is the most appropriate measure of relationship when the data represent an interval or ratio scale. In the social sciences, most of the measures represent interval scales. For this reason, the Pearson r is the statistic most often used for determining relationship. For example, if we administer an intelligence test and an achievement test to the same group of students, we will have two sets of scores. The Pearson r would be the appropriate correlational statistic to use for determining the relationship between the students' scores on the two measures. The product-moment correlation can also be used to compute a correlation matrix in which subjects' scores on a large number of variables are correlated with each other (e.g., see Basham & Lunenburg, 1989; this study is discussed in greater detail in Chapter 6).
Rank-Difference Correlation (Rho)

Spearman's rho, as it is sometimes called, is used when the data are ranks rather than raw scores. For example, assume the principal and assistant principal have independently ranked the 15 teachers in their school from first, most effective, to 15th, least effective, and you want to assess how much their ranks agree. You would calculate Spearman's rho by putting the paired ranks into the Pearson r formula or by using a formula developed specifically for rho.

Spearman's rho is interpreted the same as the Pearson r. Like the Pearson product-moment coefficient of correlation, it ranges from −1.00 to +1.00. When each individual has the same rank on both variables, the rho correlation will be +1.00, and when their ranks on one variable are exactly the opposite of their ranks on the other variable, rho will be −1.00. If there is no relationship at all between the rankings, the rank correlation coefficient will be 0. If you have a computer or calculator program for the Pearson r, you can calculate Spearman's rho by putting the ranks into that program.
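Assuming SciPy is available, Spearman's rho for the two administrators' rankings can be sketched as follows; the rankings themselves are hypothetical:

```python
from scipy.stats import spearmanr

# hypothetical rankings of 15 teachers by the principal and the assistant principal
principal = list(range(1, 16))                            # ranks 1 through 15
assistant = [2, 1, 4, 3, 5, 7, 6, 8, 10, 9, 11, 13, 12, 15, 14]

rho, p_value = spearmanr(principal, assistant)            # largely agreeing ranks: rho near +1
print(f"Spearman's rho = {rho:.2f}")

# perfectly opposite rankings yield rho of exactly -1.00
rho_opposite, _ = spearmanr(principal, principal[::-1])
```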

INFERENTIAL STATISTICS

Inferential statistics deal with inferences about populations based on the results of samples (Gay, Mills, & Airasian, 2006). Most social science research deals with samples drawn from larger populations. Thus, inferential statistics are data analysis procedures for determining the likelihood that results obtained from a sample are the same results that would have been obtained for the entire population.

Example 4.1

To illustrate: You have randomly selected a sample of ninth grade students. Based on a pretest in mathematics and other criteria, you have created two closely matched samples of students. Group A will be taught mathematics using a computer, and Group B will be taught mathematics using a traditional method. You want to know if the mean scores for Group A are significantly different from the mean scores of Group B. Results of the posttest in mathematics reveal that Group A has a mean score of 93, and the mean score for Group B is 85.

You have to make a decision as to whether the difference between the two means represents a true, significant difference in the treatment (in this case, method of instruction) or simply sampling error. A true difference is one caused by the treatment (the independent variable) and not by chance (Gall, Gall, & Borg, 2007; Gay et al., 2006). Thus, an observed difference is either caused by the treatment, as stated in the research question (or research hypothesis), or is the result of chance, sampling error.

The Null Hypothesis

The research hypothesis typically states a difference or relationship in the expected direction. A null hypothesis states that no difference or relationship exists. The null hypothesis is preferred when applying statistical tests. You can never prove your hypothesis, only disprove it. Hypothesis testing is a process of disproving or rejecting, and the null hypothesis is best suited for this purpose (see Gall et al., 2007; Gay et al., 2006).

The initial step in hypothesis testing then is to establish a null hypothesis. For instance, the null hypothesis for our example can be stated as follows:

No significant difference exists between the mean mathematics scores of ninth grade students who receive computer mathematics instruction and ninth grade students who receive traditional mathematics instruction.

After formulating the null hypothesis, the researcher carries out a test of statistical significance to determine whether the null hypothesis can be rejected (i.e., whether there is a true or real difference between the groups). This test enables us to make statements of the type:

If the null hypothesis is correct, we would find this large a difference between sample means only once in a hundred experiments (p < .01). Because we have found this large a difference, the null hypothesis quite probably is false. Therefore, we will reject the null hypothesis and conclude that the difference between sample means reflects a true difference between population means. (Gall et al., p. 138)

Tests of Statistical Significance

Concepts underlying tests of statistical significance follow:

1. The purpose of using a test of statistical significance is to determine whether the difference between two scores (or between a sample characteristic and the population) is significant or the result of chance fluctuation or sampling error.

2. To determine statistical significance, the data are analyzed via a formula and a score is obtained: a t score, a chi-square score, or an F score.

3. You then determine the degrees of freedom (number of subjects minus one).

4. Then compare the calculated significance score with the table score in the appropriate probability table, using the calculated degrees of freedom. Select the .05 level of confidence. (This would be the minimum level you should accept.)

5. If the calculated significance score is greater than the one in the table (at the .05 level), then the difference in student scores is statistically significant at the .05 level of confidence and not the result of chance fluctuation. This means that the difference in student scores is so great that it could be obtained by chance only 5% of the time; 95% of the time, the difference between the student scores reflects the impact of the independent variable on the dependent variable. Hence, the null hypothesis is rejected.
6. If the calculated significance score is less than the one in the probability table (at the .05 level), then the difference in student scores is statistically nonsignificant at the .05 level and is the result of chance fluctuation. This means that the difference in the student scores is so small that the independent variable had no impact on the dependent variable. That is, 95% of the time the difference in student scores would be the result of chance fluctuation, and only 5% of the time could the independent variable have impacted the dependent variable. Hence, the null hypothesis is accepted.

Effect Size

One of the reasons that statistically significant p values can be misleading is that the value calculated is directly related to sample size. Thus, it is possible to have a very large sample, a very small difference or relationship, and still report it as significant. For example, a correlation of .44 will be statistically significant at p < .05 with a sample as small as 20, and a sample of 5,000 will allow a statistically significant .05 finding with a correlation of only .028, which is, practically speaking, no relationship at all. The American Psychological Association (2001) and several research journals now strongly recommend or require that investigators report appropriate indicators that illustrate the strength or magnitude of a difference or relationship along with measures of statistical significance. These effect magnitude measures, as they are called, are either measures of strength of association or effect size. Measures of association are used to estimate proportions of variance held in common, similar to the coefficient of determination. Effect size is more commonly used. It is typically reported in a generalized form as the ratio of the difference between the group means divided by the estimated standard deviation of the population.
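A minimal sketch of this generalized effect size (the standardized mean difference, often called Cohen's d) in Python. The two groups of scores are hypothetical, chosen so the group means match the 93-versus-85 example used earlier in this chapter:

```python
import statistics

# hypothetical posttest scores for two instruction groups
group_a = [93, 95, 90, 92, 96, 94, 91, 93]   # mean 93
group_b = [85, 88, 83, 86, 84, 87, 85, 82]   # mean 85

mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)

# pooled standard deviation as the estimate of the population SD
# (simple average of variances, which assumes equal group sizes)
sd_a, sd_b = statistics.stdev(group_a), statistics.stdev(group_b)
pooled_sd = ((sd_a**2 + sd_b**2) / 2) ** 0.5

cohens_d = (mean_a - mean_b) / pooled_sd     # difference in means, in SD units
```

With these (deliberately tight) scores the effect size is far above .80, so by Cohen's benchmarks it would be called a large effect.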
According to Cohen (1988), the effect size index then provides an indication of the practical or meaningful difference. Effect size indexes of about .20 are regarded as small effects, of about .50 as medium or moderate effects, and of .80 and above as large effects.

Statistical Analysis

Different statistical analyses are appropriate for different types of data. It is essential that you select the appropriate statistical test, because an incorrect test can result in incorrect research conclusions. Basically, there are two types of statistical tests: parametric and nonparametric. Your first decision when selecting a statistical test is to determine whether a parametric or nonparametric test is appropriate. The use of a parametric test requires that three assumptions be met: the variable measured is normally distributed in the population, the data represent an interval or ratio scale, and the selection of participants is independent. Most variables examined in social science research are normally distributed. Most measures used in social science research represent an interval scale. And the use of random sampling will fulfill the assumption of independent selection of participants. A nonparametric test is the appropriate statistical test to use when the three parametric assumptions are not met and when the data represent an ordinal or nominal scale. The following parametric statistical tests are described below: (a) t test; (b) analysis of variance, including post hoc procedures; (c) factorial analysis of variance; (d) analysis of covariance; and (e) multivariate analysis of variance. In the section following, several important nonparametric tests are discussed.

The t Test

In many research situations, a mean from one group is compared with a mean from another group to determine the probability that the corresponding population means are different. The most common statistical procedure for determining the level of significance when two means are compared is the t test.
The t test is a formula that generates a number, and this number is used to determine the probability level (p level) of rejecting the null hypothesis. Two different forms of the equation are used in the t test, one for independent samples and one for samples that are paired, or dependent. Independent samples are groups of participants that have no relationship to each other; the two samples have different participants in each group, and the participants are usually either assigned randomly from a common population or drawn from two different populations.

Example 4.2

If you are testing the difference between an experimental group and a control group mean in a posttest-only design, the independent samples t test would be the appropriate statistic. Comparing leadership styles of two groups of superintendents would also utilize an independent samples t test.

The second form of the t test can be referred to by several different names, including paired, dependent samples, correlated, or matched t test. This t test is used in situations in which the participants from the two groups are paired or matched in some way.

Example 4.3

A common example of this case is the same group of participants tested twice, as in a pretest-posttest study (e.g., see Pascarella & Lunenburg, 1988). Whether the same or different subjects are in each group, as long as a systematic relationship exists between the groups it is necessary to use the dependent samples t test to calculate the probability of rejecting the null hypothesis.
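Assuming SciPy is available, both forms of the t test can be sketched as follows; all of the scores here are hypothetical:

```python
from scipy.stats import ttest_ind, ttest_rel

# independent samples: different participants in each group
experimental = [93, 95, 90, 92, 96, 94]   # hypothetical posttest scores
control      = [85, 88, 83, 86, 84, 87]
t_ind, p_ind = ttest_ind(experimental, control)

# dependent samples: the same participants tested twice (pretest-posttest)
pretest  = [70, 65, 80, 75, 72, 78]       # hypothetical paired scores
posttest = [74, 70, 82, 80, 75, 83]
t_dep, p_dep = ttest_rel(posttest, pretest)   # df = number of pairs minus one

print(f"independent: t = {t_ind:.2f}, p = {p_ind:.4f}")
print(f"dependent:   t = {t_dep:.2f}, p = {p_dep:.4f}")
```

Notice that the pairing matters: `ttest_rel` uses the difference score within each pair, which is why a systematic relationship between the groups requires the dependent form.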

In the Pascarella and Lunenburg (1988) study, elementary school principals from two school districts received leadership training using Hersey and Blanchard's situational leadership framework. Pretests and posttests were administered to the principals and a sample of their teachers before and after training to determine the effects of training on principals' leadership effectiveness and style range. The study provided only partial support for Hersey and Blanchard's situational leadership theory. Using dependent samples t tests, principals were perceived as more effective three years after training than before training: t(15) = 6.46, p < .01 for principals' ratings, and t(59) = 3.73, p < .01 for teachers' ratings. However, no significant differences were found in principals' effectiveness immediately following training, nor in principals' leadership style range before and after training. Although the formulas and degrees of freedom are different for each form of t test, the interpretation and reporting of the results are the same; the df for the dependent t test is the number of pairs minus one. Thus, you need not worry about whether the correct formula has been used.

Example 4.4

The t test can be used for purposes other than comparing the means of two samples. For example, the t test is used when a researcher wants to show that a correlation coefficient is significantly different from 0 (no correlation). The mean of a group can be compared with a number rather than another mean, and it is possible to compare variances rather than means. Because there are so many uses for the t test, it is frequently encountered in reading social science research.

Example 4.5

A more concrete illustration of the t test is the following example. Suppose a researcher is interested in finding out whether there is a significant difference between high school boys and girls with respect to mathematics achievement.
The research question would be: Is there a difference in the mathematics achievement (the dependent variable) of boys compared with girls (the independent variable)? The null hypothesis would be: There is no difference between boys and girls in mathematics achievement. To test this hypothesis, the researcher would randomly select a sample of boys and girls from the population of all high school students. Let us say that the sample mean for boys' achievement is 540, and the sample mean for girls is 520. Because we assume the null hypothesis, that the population means are equal, we use the t test to show how often the difference of scores in the samples would occur if the population means are equal. If our degrees of freedom (total sample size minus 1) is 60 and the calculated t value is 1.29, we can see by referring to the t test table that the probability of attaining this difference in the sample means, for a two-tailed test, is .20, or 20 times out of 100. We accept the null hypothesis and say that there is no statistically significant difference between the mathematics achievement of high school boys and girls.

One-Way Analysis of Variance

If a study is conducted in which two or more sample means are compared on one independent variable, then to test the null hypothesis the researcher would employ a procedure called one-way analysis of variance (ANOVA). ANOVA is simply an extension of the t test. Rather than using multiple t tests to compare all possible pairs of means in a study of two or more groups, ANOVA allows you to test the differences between all groups and make more accurate probability statements than when using a series of separate t tests. It is called analysis of variance because the statistical formula uses the variances of the groups, not the means, to calculate a value that reflects the degree of difference in the means. Instead of a t statistic, ANOVA calculates an F statistic (or F ratio). The F is analogous to the t.
It is a three- or four-digit number that is used with a distribution of F table and the degrees of freedom to find the level of significance used to reject or not reject the null hypothesis. There are two degrees of freedom. The first is the number of groups in the study minus one, and the second is the total number of subjects minus the number of groups. These numbers follow the F in reporting the results of ANOVA. For example, in reporting F(4, 80) = 4.25, the degrees of freedom mean that five group means are being compared and 85 subjects are in the analysis. ANOVA addresses the question: Is there a significant difference between the population means? If the F value that is calculated is large enough, then the null hypothesis (meaning there is no difference among the groups) can be rejected with confidence that the researcher is correct in concluding that at least two means are different.

Example 4.6

Let us assume, for example, that a researcher is comparing the quality of school life of three groups of students in urban, suburban, and rural school districts (e.g., see Lunenburg & Schmidt, 1989). The researchers in the cited study selected a random sample from each group, administered a quality of school life (QSL) instrument, and calculated the means and variances of each group. The sample group means for QSL were: urban = 18, rural = 20, and suburban = 25. The null hypothesis that is tested, then, is that the population means of 18, 20, and 25 are equal, or, more correctly, that they differ only by sampling and measurement error. The calculated F in the study was 5.12, with p < .01. Lunenburg and Schmidt (1989) concluded that at least two of the means were different, and that this conclusion will be right 99 times out of 100. Many other variables were examined in the study referenced above. Results of the ANOVA indicated only that the means were different, not where the differences occurred.
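Assuming SciPy is available, a one-way ANOVA of this kind can be sketched as follows; the QSL scores are hypothetical, chosen so the group means match the 18, 20, and 25 reported above:

```python
from scipy.stats import f_oneway

# hypothetical QSL scores for six students in each district type
urban    = [16, 18, 17, 19, 20, 18]   # mean 18
rural    = [19, 21, 20, 18, 22, 20]   # mean 20
suburban = [24, 26, 25, 23, 27, 25]   # mean 25

f_stat, p_value = f_oneway(urban, rural, suburban)
# degrees of freedom: groups - 1 = 2, and total subjects - groups = 15
print(f"F(2, 15) = {f_stat:.2f}, p = {p_value:.4f}")
```

With these data the F statistic is large and p falls well below .01, so the null hypothesis of equal population means would be rejected; as the text notes next, the F alone does not say which of the group means differ.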
Post hoc procedures are necessary to determine exactly where the differences in mean scores occurred.

Post Hoc Procedures

When a researcher uses ANOVA to test the null hypothesis that three means are the same, the resulting statistically significant F ratio tells the researcher only that two or more of the means are different. Usually the researcher needs to employ further statistical tests that will indicate which means differ from each other. These tests are called post hoc comparisons. There are five common multiple comparison tests: Fisher's LSD, Duncan's new multiple range test, the Newman-Keuls multiple range test, Tukey's HSD, and the Scheffé test. Each test is used in the same way, but they differ in the ease with which a significant difference is obtained; that is, for some tests the means need to be farther apart than for other tests for the difference to be statistically significant. Tests that require a greater difference between the means are said to be conservative, while those that permit less difference are said to be liberal. The listing of the tests above is sequential, with Fisher's test considered most liberal and Scheffé's test most conservative. The two most common tests are the Tukey and the Scheffé, but different conclusions can be reached in a study depending on the multiple comparison technique employed.

Factorial Analysis of Variance

One-way ANOVA has been introduced as a procedure that is used with one independent variable and two or more levels identified by this variable. It is common, however, to have more than one independent variable in a study. In fact, it is often desirable to have several independent variables because the analysis will provide more information. For example, if a group of researchers investigates the relative effectiveness of three reading curricula, they would probably use a 1 × 3 ANOVA to test the null hypothesis that no difference in achievement exists between any of the three groups.
If the researchers were also interested in whether males or females achieved differently, gender would become a second independent variable. Now there are six groups, because for each reading group males and females are analyzed separately.

Example 4.7

In this hypothetical situation, two independent variables (curriculum and gender) are analyzed simultaneously with one dependent variable (achievement). The statistical procedure that would be used to analyze the results would be a two-way ANOVA (two-way because of the two independent variables). Because factor is another word for independent variable, factorial means more than one independent variable. Factorial ANOVA, then, is a generic term that means that two or more independent variables are analyzed simultaneously.
