Wk5 DQ – Data Analysis and Business Intelligence
Question:

Discuss the difference between correlation and causation
Discuss the purpose of multiple regression
Discuss the underlying assumptions of multiple regression and what can be done if the assumptions are not met.

Note:
1. Define the terms in your own words. Do not quote directly from the textbook.
2. Write at least two paragraphs.
3. Include information from the textbook and cite it as a reference.
4. Include at least one peer-reviewed article as a reference.
5. The textbook and the related PowerPoint slides are in the attachment.

Correlation and Linear Regression
Chapter 13
Copyright 2021 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.

In this chapter, we study the relationship between two interval- or ratio-level variables and develop numerical measures to express the relationship between two variables. We also develop an equation to express the relationship between variables. We examine both correlation analysis and regression analysis.

Learning Objectives
LO13-1 Explain the purpose of correlation analysis
LO13-2 Calculate a correlation coefficient to test and interpret the relationship between two variables
LO13-3 Apply regression analysis to estimate the linear relationship between two variables
LO13-4 Evaluate the significance of the slope of the regression equation
LO13-5 Evaluate a regression equation's ability to predict using the standard error of the estimate and the coefficient of determination
LO13-6 Calculate and interpret confidence and prediction intervals
LO13-7 Use a log function to transform a nonlinear relationship

What is Correlation Analysis?
Used to report the relationship between two variables

In addition to graphing techniques, we'll develop numerical measures to describe the relationships
Examples
Does the amount Healthtex spends per month on training its sales force affect its monthly sales?
Does the number of hours students study for an exam influence the exam score?

CORRELATION ANALYSIS A group of techniques to measure the relationship between two variables.

In all business fields, identifying and studying relationships between variables can provide information on ways to increase profits, methods to decrease costs, or variables to predict demand.

Scatter Diagram
A scatter diagram is a graphic tool used to portray the relationship between two variables
The independent variable is scaled on the X-axis and is the variable used as the predictor
The dependent variable is scaled on the Y-axis and is the variable being estimated
Graphing the data in a scatter diagram will make the relationship between sales calls and copier sales easier to see.

We often begin our study of the relationship between two variables with a scatter diagram. It gives us a visual representation of the relationship between the variables. For instance, a sales manager wants to know if there is a relationship between the number of sales calls made in a month and the number of copiers sold that month and begins the analysis with a random sample of 15 sales representatives. With this data, the number of sales calls is the independent variable and number of copiers sold is the dependent variable.

Scatter Diagram Example
North American Copier Sales sells copiers to businesses of all sizes throughout the United States and Canada. The new national sales manager is preparing for an upcoming sales meeting and would like to impress upon the sales representatives the importance of making an extra sales call each day. She takes a random sample of 15 sales representatives and gathers information on the number of sales calls made last month and the number of copiers sold. Develop a scatter diagram of the data.
Sales reps who make more calls tend to sell more copiers!

We develop a scatter diagram of the data. The first salesperson, Brian Virost, made 96 sales calls and sold 41 copiers; to plot this point, move along the horizontal axis to x = 96, then go vertically to y = 41 and place a dot at that intersection. Do this for all the sales data. It is perfectly reasonable for the manager to tell the salespeople that the more sales calls they make, the more copiers they can expect to sell. Note that while there does seem to be a positive relationship between the two variables, the points do not all fall on a line.

Correlation Coefficient

Characteristics of the correlation coefficient are:
The sample correlation coefficient is identified as r
It shows the direction and strength of the linear relationship between two interval- or ratio-scale variables
It ranges from -1.00 to +1.00
If it's 0, there is no linear association
A value near +1.00 indicates a direct or positive correlation
A value near -1.00 indicates a negative correlation
CORRELATION COEFFICIENT A measure of the strength of the linear relationship between two variables.

Both variables must be at least the interval scale of measurement to find the correlation coefficient. A value of -1 indicates perfect negative correlation and a value of +1 indicates perfect positive correlation.

Correlation Coefficient (2 of 2)
The following graphs summarize the strength and direction of the correlation coefficient

In the set of charts at the bottom of the slide, the first one indicates no correlation between the number of children (the independent variable) and income (the dependent variable). The middle chart shows a slightly negative correlation between price and quantity. The chart on the right shows a strong positive relationship between hours studied (the independent variable) and exam score (the dependent variable).

Correlation Coefficient, r
How is the correlation coefficient determined? We'll use North American Copier Sales as an example. We begin with a scatter diagram, but this time we'll draw a vertical line at the mean of the x-values (96 sales calls) and a horizontal line at the mean of the y-values (45 copiers).

Drawing lines through the center of the data establishes quadrants. These two variables are positively related: when the number of sales calls is above the mean, the number of copiers sold is also above the mean, and the points appear in quadrant I. When the number of sales calls is less than the mean, so is the number of copiers sold, and the points appear in quadrant III.

Correlation Coefficient, r, Continued
How is the correlation coefficient determined? Now we find the deviations from the mean number of sales calls and the mean number of copiers sold, then multiply them. The sum of their products is 6,672 and will be used in formula 13-1 to find r. We also need the standard deviations. The result, r = .865, indicates a strong, positive relationship.

The correlation coefficient is designated by the letter r and found with equation 13-1. We will use Excel to find the standard deviations of the two variables, x (sales calls) and y (copier sales) to use in the formula.
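
As a rough sketch of the calculation described in these notes, formula 13-1 can be written out in Python. The function name `pearson_r` and the five data pairs are illustrative assumptions, not the North American Copier Sales data; the structure (sum of the products of the deviations, divided by n - 1 times the two sample standard deviations) follows the description above.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Sample correlation: sum of deviation products / ((n - 1) * s_x * s_y)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sum_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sum_products / ((n - 1) * stdev(xs) * stdev(ys))

# Hypothetical data for illustration only (calls made, units sold).
calls = [20, 40, 60, 80, 100]
sold = [12, 21, 24, 38, 41]
r = pearson_r(calls, sold)
print(round(r, 3))
```

With the real data, the same function would stand in for the Excel step mentioned above.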

Correlation Coefficient Example

The Applewood Auto Group's marketing department believes younger buyers purchase vehicles on which lower profits are earned and older buyers purchase vehicles on which higher profits are earned. They would like to use this information as part of an upcoming advertising campaign to try to attract older buyers. Develop a scatter diagram and then determine the correlation coefficient. Would this be a useful advertising feature?
The scatter diagram suggests that a positive relationship does exist between age and profit, but it does not appear to be a strong relationship.
Next, calculate r, which is 0.262. The relationship is positive but weak. The data does not support a business decision to create an advertising campaign to attract older buyers!

We use Excel to calculate r; r is .262 and is much closer to zero than one. We would observe the relationship between the age of the buyer and the profit of their purchase is not strong.

Testing the Significance of r

Testing the Significance of r Example

The population in this example is all of the salespeople employed by the firm. This is a two-tailed test. We use Appendix B.5 for degrees of freedom n-2=15-2=13 and a level of significance of .05. Use formula 13-2; the result is 6.216. We reject the null hypothesis; there is correlation with respect to the number of sales calls made and the number of copiers sold in the population of salespeople.

Testing the Significance of r Example Continued
Step 5: Make decision; reject H0, t=6.216
Step 6: Interpret; there is correlation with respect to the number of sales calls made and the number of copiers sold in the population of salespeople.
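
The test statistic quoted above can be checked with a few lines of Python. This is a sketch that assumes formula 13-2 is the usual t statistic for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2); the function name is made up for illustration, and the critical value 2.160 is an assumption taken from a standard t table (two-tailed, .05 level, 13 degrees of freedom) rather than from the slides.

```python
import math

def t_for_correlation(r, n):
    """t statistic for H0: population correlation is 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t = t_for_correlation(0.865, 15)   # r and n from the copier example
print(round(t, 3))                 # 6.216, matching Step 5

# Assumed critical value: two-tailed test, .05 level, df = 13 (standard t table).
T_CRITICAL = 2.160
reject_h0 = abs(t) > T_CRITICAL
print(reject_h0)
```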

Testing the Significance of the Correlation Coefficient
In the Applewood Auto Group example, we found an r=0.262 which is positive, but rather weak. We test our conclusion by conducting a hypothesis test that the correlation is greater than 0.

This is a one-tailed (right-tailed) test. The degrees of freedom in this test are n - 2 = 180 - 2 = 178, but Appendix B.5 doesn't have 178, so we use 180, and the critical value is 1.653. We use formula 13-2 and conclude the sample correlation is too large to have come from a population with no correlation. The outcome of a marketing campaign directed to older buyers is uncertain.

Regression Analysis
In regression analysis, we estimate one variable based on another variable
The variable being estimated is the dependent variable
The variable used to make the estimate or predict the value is the independent variable
The relationship between the variables is linear
Both the independent and the dependent variables must be interval or ratio scale
REGRESSION EQUATION An equation that expresses the linear relationship between two variables.

The least squares criterion is used to determine the regression equation.

Least Squares Principle
In regression analysis, our objective is to use the data to position a line that best represents the relationship between two variables
The first approach is to use a scatter diagram to visually position the line

But this depends on judgement; we would prefer a method that results in a single, best regression line

The lines drawn in the chart on the right represent the judgement of four people. The method that results in a single, best regression line is called the least squares principle.

Least Squares Regression Line

To illustrate, the same data are plotted in the three charts below
LEAST SQUARES PRINCIPLE A mathematical procedure that uses the data to position a line with the objective of minimizing the sum of the squares of the vertical distances between the actual y values and the predicted values of y.

The line drawn in chart 13-9 is the best-fitting line and is drawn using the least squares method. It is the best fitting because the sum of the squares of the vertical deviations about it is at a minimum; the sum of the squares is 24. Charts 13-10 and 13-11 were drawn differently, and their sums of squares are 44 and 132, respectively.
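
The least squares idea can be illustrated with a short sketch. The five points and the three "by eye" lines below are hypothetical, not the data behind charts 13-9 to 13-11; the point is only that the least squares line has a smaller sum of squared vertical deviations than any other line tried.

```python
def sse(xs, ys, a, b):
    """Sum of squared vertical distances between each y and the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def least_squares(xs, ys):
    """Return (a, b) for the line that minimizes the sum of squared deviations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    return y_bar - b * x_bar, b

# Hypothetical points for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 4, 8, 9]
a, b = least_squares(xs, ys)
best = sse(xs, ys, a, b)

# Lines someone might draw by judgment, as (intercept, slope) pairs.
by_eye = [(1.0, 1.5), (0.0, 2.0), (2.0, 1.2)]
others = [sse(xs, ys, a0, b0) for a0, b0 in by_eye]
print(round(best, 2), [round(v, 2) for v in others])
```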

Least Squares Regression Line (2 of 2)

Least Squares Regression Line Example
Recall the example of North American Copier Sales. The sales manager gathered information on the number of sales calls made and the number of copiers sold. Use the least squares method to determine a linear equation to express the relationship between the two variables.
The first step is to find the slope of the least squares regression line, b

Next, find a

Then determine the regression line

So if a salesperson makes 100 calls, he or she can expect to sell 46.0432 copiers

The b value of .2608 indicates that for each additional sales call, the sales representative can expect to increase the number of copiers sold by about .2608. So 20 additional sales calls in a month will result in about five more copiers being sold.
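
The arithmetic behind these figures can be retraced with a small sketch. It takes the slope and the two means reported in the slides as given and assumes the intercept is found as a = y-bar - b * x-bar, which follows from the least squares line passing through the means of x and y. The function name is hypothetical.

```python
X_MEAN = 96      # mean number of sales calls (from the slides)
Y_MEAN = 45      # mean number of copiers sold (from the slides)
B = 0.2608       # slope reported in the slides

# The least squares line passes through (x-bar, y-bar), so a = y-bar - b * x-bar.
A = Y_MEAN - B * X_MEAN

def predicted_copiers(calls):
    """Estimated copiers sold for a given number of sales calls."""
    return A + B * calls

print(round(A, 4))                       # 19.9632
print(round(predicted_copiers(100), 4))  # 46.0432
print(round(20 * B, 3))                  # 5.216, about five more copiers
```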

Drawing the Regression Line

The line of regression is drawn on the scatter diagram. Estimated sales for all sales representatives are calculated using the formula we determined earlier and placed in the table. The regression line will always pass through the point defined by the means of x and y. Plus, there is no other line through the data where the sum of the squared deviations is smaller.

Regression Equation Slope Test

Regression Equation Slope Test Example

This is a one-tailed test. If we do not reject the null hypothesis, we conclude that the slope of the regression line could be zero. We use Excel to determine the needed regression statistics. We find the critical value in Appendix B.5 with n - 2 = 15 - 2 = 13 degrees of freedom and a level of significance of .05; it is 1.771. We reject the null hypothesis and conclude the slope of the line is greater than 0.

Regression Equation Slope Test Example (2 of 2)

Highlighted, b is .2606; the standard error is .0420
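
A minimal sketch of the slope test, assuming the test statistic is the usual t = (b - 0) / s_b and using only the numbers quoted in this example (b, its standard error, and the critical value 1.771):

```python
b = 0.2606          # slope from the Excel output
s_b = 0.0420        # standard error of the slope from the Excel output
T_CRITICAL = 1.771  # one-tailed, .05 level, df = 13 (from the notes)

t = (b - 0) / s_b               # H0: the population slope is 0
reject_h0 = t > T_CRITICAL      # right-tailed decision rule
print(round(t, 3), reject_h0)   # 6.205 True
```

Because t is far above the critical value, the sketch agrees with the conclusion that the slope is greater than 0.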

Evaluating a Regression Equation's Ability to Predict
Perfect prediction is practically impossible in almost all disciplines, including economics and business
The North American Copier Sales example showed a significant relationship between sales calls and copier sales; the equation is
Number of copiers sold = 19.9632 + .2608(Number of sales calls)
What if the number of sales calls is 84? We calculate the number of copiers sold as 41.8704, but the two employees who made 84 sales calls sold just 30 and 24
So, is the regression equation a good predictor?
We need a measure that will tell how inaccurate the estimate might be

The measure we'll use is the standard error of estimate, s_{y·x}. We find more information on the next slide.

The Standard Error of Estimate
The standard error of estimate measures the variation around the regression line

It is in the same units as the dependent variable
It is based on squared deviations from the regression line
Small values indicate that the points cluster closely about the regression line
It is computed using the following formula

STANDARD ERROR OF ESTIMATE A measure of the dispersion, or scatter, of the observed values around the line of regression for a given value of x.

The standard error of estimate is the same concept as the standard deviation in chapter 3. The standard deviation measures dispersion around the mean. The standard error of estimate measures dispersion around the regression line for a given value of x.

The Standard Error of Estimate Example
The standard error of estimate is 6.720
If the standard error of estimate is small, this indicates that the data are relatively close to the regression line and the regression equation can be used. If it is large, the data are widely scattered around the regression line and the regression equation will not provide a precise estimate of y.

The standard error of estimate can be calculated using statistical software like Excel.
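
Outside Excel, the same quantity can be computed directly. The sketch below assumes the usual definition, the square root of the sum of squared residuals divided by n - 2; the data and the fitted line (a = 0.5, b = 1.7) are hypothetical and are not the copier data that produce 6.720.

```python
import math

def standard_error_of_estimate(xs, ys, a, b):
    """sqrt(sum of squared residuals / (n - 2)) for the line y-hat = a + b*x."""
    residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(residual_ss / (len(xs) - 2))

# Hypothetical data and its least squares line (a = 0.5, b = 1.7).
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 4, 8, 9]
s_yx = standard_error_of_estimate(xs, ys, a=0.5, b=1.7)
print(round(s_yx, 3))   # in the same units as y
```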

Coefficient of Determination

It ranges from 0 to 1.0
It is the square of the correlation coefficient
It is found from the following formula

In the North American Copier Sales example, the correlation coefficient was .865; just square that, (.865)^2 = .748; this is the coefficient of determination
This means 74.8% of the variation in the number of copiers sold is explained by the variation in sales calls
COEFFICIENT OF DETERMINATION The proportion of the total variation in the dependent variable Y that is explained, or accounted for, by the variation in the independent variable X.

The coefficient of determination provides a more interpretable measure of a regression equation's ability to predict. It's easy to compute too; just square the correlation coefficient.
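
A short sketch of both routes to the coefficient of determination: squaring r, and computing the explained share of the total variation as 1 - SSE / SS total. The second part uses hypothetical data with its least squares line (a = 0.5, b = 1.7), so only the first printed value relates to the copier example.

```python
# Route 1: square the correlation coefficient from the copier example.
print(round(0.865 ** 2, 3))   # 0.748

# Route 2: share of total variation explained, on hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 4, 8, 9]
a, b = 0.5, 1.7               # least squares line for these points
y_bar = sum(ys) / len(ys)
ss_total = sum((y - y_bar) ** 2 for y in ys)
ss_error = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - ss_error / ss_total
print(round(r_squared, 3))
```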

Relationships among r, r^2, and s_{y·x}
Recall the standard error of estimate measures how close the actual values are to the regression line
When it is small, the two variables are closely related
The correlation coefficient measures the strength of the linear association between two variables
When points on the scatter diagram are close to the line, the correlation coefficient tends to be large
Therefore, the correlation coefficient and the standard error of estimate are inversely related

As the strength of a linear relationship between two variables increases, the correlation coefficient increases and the standard error of the estimate decreases.

Inference about Linear Regression
We can predict the number of copiers sold (y) for a selected value of number of sales calls made (x)
But first, let's review the regression assumptions of each of the distributions in the graph below

We'll now relate these assumptions to North American Copier Sales.

Constructing Confidence and Prediction Intervals
Use a confidence interval when the regression equation is used to predict the mean value of y for a given value of x
For instance, we would use a confidence interval to estimate the mean salary of all executives in the retail industry based on their years of experience

Use a prediction interval when the regression equation is used to predict an individual y for a given value of x
For instance, we would estimate the salary of a particular retail executive who has 20 years of experience

Two different predictions can be made for a selected value of the independent variable: a confidence interval and a prediction interval. In a confidence interval, the width of the interval is affected by the level of confidence, the size of the standard error of the estimate, and the size of the sample, as well as the value of the independent variable. The prediction interval is also based on the level of confidence, the size of the standard error of the estimate, the size of the sample, and the value of the independent variable. The difference between formulas 13-11 and 13-12 is the 1 under the radical. The prediction interval will be wider than the confidence interval.

Confidence Interval and Prediction Interval Example
We return to the North American Copier Sales example. Determine a 95% confidence interval for all sales representatives who make 50 calls, and determine a prediction interval for Sheila Baker, a west coast sales representative who made 50 sales calls.
The 95% confidence interval for all sales representatives is 27.3942 up to 38.6122.
The 95% prediction interval for Sheila Baker is 17.442 up to 48.5644 copiers.
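
The two intervals can be approximately reproduced with the sketch below. The regression equation, standard error of estimate, sample size, and mean come from the slides. Two inputs are assumptions that the slides do not show: the t value of 2.160 (standard t table, 95% confidence, 13 degrees of freedom) and the sum of squared deviations of x, taken here as 25,600 because that value is consistent with the intervals reported above. The function name is hypothetical.

```python
import math

A, B = 19.9632, 0.2608   # regression equation from the slides
S_YX = 6.720             # standard error of estimate from the slides
N, X_MEAN = 15, 96       # sample size and mean sales calls from the slides
T = 2.160                # assumed: t table, 95% confidence, df = 13
SS_X = 25_600            # assumed: sum of (x - x_mean)^2, not shown in the slides

def intervals(x):
    """Return (confidence interval, prediction interval) for a given x."""
    y_hat = A + B * x
    spread = 1 / N + (x - X_MEAN) ** 2 / SS_X
    ci_half = T * S_YX * math.sqrt(spread)
    pi_half = T * S_YX * math.sqrt(1 + spread)   # the extra 1 widens the interval
    return (y_hat - ci_half, y_hat + ci_half), (y_hat - pi_half, y_hat + pi_half)

ci, pi = intervals(50)
print([round(v, 4) for v in ci])   # mean of all reps who make 50 calls
print([round(v, 4) for v in pi])   # one rep, such as Sheila Baker
```

The only difference between the two half-widths is the 1 added under the square root, which is why the prediction interval is always the wider of the two.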

Transforming Data
Regression analysis and the correlation coefficient require the relationship between the variables to be linear
But what if the relationship is not linear?
If it is not linear, we can rescale one or both of the variables so the new relationship is linear
Common transformations include
Computing the log to the base 10 of y, Log(y)
Taking the square root
Taking the reciprocal
Squaring one or both variables
Caution: when you interpret a correlation coefficient or regression equation, keep in mind that the underlying relationship could be nonlinear

For example, instead of using the actual values of the dependent variable y, we would create a new dependent variable by transforming it.

Transforming Data Example
GroceryLand Supermarkets is a regional grocery chain located in the midwestern United States. The director of marketing wishes to study the effect of price on weekly sales of their two-liter private brand diet cola. The objectives of the study are
To determine whether there is a relationship between selling price and weekly sales. Is this relationship direct or indirect? Is it strong or weak?
To determine the effect of price increases or decreases on sales. Can we effectively forecast sales based on the price?
To begin, the company decides to price the two-liter diet cola from $0.50 to $2.00. To collect the data, a random sample of 20 stores is taken and then each store is randomly assigned a selling price.

There is a strong relationship between the two variables. The coefficient of determination is 88.9%. So 88.9% of the variation in Sales is accounted for by the variation in Price. But, a careful analysis of the scatter diagram reveals that the relationship may not be linear. That means we need to transform the data.

Transforming Data Example (2 of 3)

A strong, inverse relationship!

Transforming Data Example (3 of 3)
The director of marketing decides to transform the dependent variable, Sales, by taking the logarithm to the base 10 of each sales value. Note the new variable, Log-Sales, in the following analysis as it is used as the dependent variable with Price as the independent variable.

Clearly, as price increases, sales decrease. This relationship will be very helpful to GroceryLand when making pricing decisions for this product.
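
To show why the log transformation helps, the sketch below uses hypothetical prices and sales (not the GroceryLand data) in which sales fall by a constant percentage for each price step. The correlation between price and the raw sales is weaker than the correlation between price and log10(sales), because the transformed relationship is a straight line. The helper `pearson_r` is an illustrative name.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient from sums of deviations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: sales = 10 ** (3 - price), a curved (exponential) pattern.
prices = [0.50, 1.00, 1.50, 2.00]
sales = [10 ** (3 - p) for p in prices]
log_sales = [math.log10(s) for s in sales]   # the transformed dependent variable

r_raw = pearson_r(prices, sales)
r_log = pearson_r(prices, log_sales)
print(round(r_raw, 3), round(r_log, 3))
```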

Chapter 13 Practice Problems

Question 3
Bi-lo Appliance Super-Store has outlets in several large metropolitan areas in New England. The general sales manager aired a commercial for a digital camera on selected local TV stations prior to a sale starting on Saturday and ending Sunday. She obtained the information for Saturday–Sunday digital camera sales at the various outlets and paired it with the number of times the advertisement was shown on the local TV stations. The purpose is to find whether there is any relationship between the number of times the advertisement was aired and digital camera sales. The pairings are:

What is the dependent variable?
Draw a scatter diagram.
Determine the correlation coefficient.
Interpret these statistical measures.
LO13-2

Question 11
The Airline Passenger Association studied the relationship between the number of passengers on a particular flight and the cost of the flight. It seems logical that more passengers on the flight will result in more weight and more luggage, which in turn will result in higher fuel costs. For a sample of 15 flights, the correlation between the number of passengers and total fuel cost was .667. Is it reasonable to conclude that there is positive association in the population between the two variables? Use the .01 significance level.
LO13-2

Question 17
Bloomberg Intelligence listed 50 companies to watch in 2018 (www.bloomberg.com/features/companies-to-watch-2018). Twelve of the companies are listed here with their total assets and 12-month sales.

Let sales be the dependent variable and total assets the independent variable.
Draw a scatter diagram.
Compute the correlation coefficient.
Determine the regression equation.
For a company with $100 billion in assets, predict the 12-month sales.
LO13-3

Question 23
Refer to Exercise 17. The regression equation is ŷ = 1.85 + .08x, the sample size is 12, and the standard error of the slope is 0.03. Use the .05 significance level. Can we conclude that the slope of the regression line is different from zero?
LO13-4

Question 27
Bradford Electric Illuminating Company is studying the relationship between kilowatt-hours (thousands) used and the number of rooms in a private single-family residence. A random sample of 10 homes yielded the following:

Determine the standard error of estimate and the coefficient of determination. Interpret the coefficient of determination.
LO13-5

Question 33
Bradford Electric Illuminating Company is studying the relationship between kilowatt-hours (thousands) used and the number of rooms in a private single-family residence. A random sample of 10 homes yielded the following:

Determine the .95 confidence interval, in thousands of kilowatt-hours, for the mean of all six-room homes.
Determine the .95 prediction interval, in thousands of kilowatt-hours, for a particular six-room home.
LO13-6

Question 35
Using the following data with x as the independent variable and y as the dependent variable, answer the items.

Create a scatter diagram and describe the relationship between x and y.
Compute the correlation coefficient.
Transform the x variable by squaring each value, x^2.
Create a scatter diagram and describe the relationship between x^2 and y.
Compute the correlation coefficient between x^2 and y.
Compare the relationships between x and y, and x^2 and y.
Interpret your results.

LO13-7

Nonparametric Methods: Nominal Level Hypothesis Tests
Chapter 15

This chapter considers tests of hypothesis for nominal level data. Nonparametric hypothesis tests do not require the assumption that the population be normal. First we consider two mutually exclusive groups, then several mutually exclusive groups. We will use the chi-square distribution as a test statistic in this chapter too.

Learning Objectives
LO15-1 Test a hypothesis about a population proportion
LO15-2 Test a hypothesis about two population proportions
LO15-3 Test a hypothesis comparing an observed set of frequencies to an expected frequency distribution
LO15-4 Explain the limitations of using the chi-square statistic in goodness-of-fit tests
LO15-5 Test a hypothesis that an observed frequency distribution is normally distributed
LO15-6 Perform a chi-square test for independence on a contingency table

Test a Hypothesis of a Population Proportion
Recall that a proportion is the ratio of the number of successes to the number of observations

Examples
Historically, GM reports that 70% of leased vehicles are returned with less than 36,000 miles
