Probability and counting rules; Discrete probability distributions
All posts 100% original work. no plagiarism.
Recall the car data set you identified in Forum 2 excel. You will want to calculate the average for your data set. (Be sure you use the numbers without the supercar outlier) Once you have the average count how many of your data points fall below the average. You will take that number and divide it by 10. This will be your p or “success” in your problem. Once you have p, calculate q.
If you were to find another random sample of 10 cars based on the same data, what is the probability that exactly 4 of them will fall below the average? Make sure you interpret your results.
If you were to find another random sample of 10 cars based on the same data, what is the probability that fewer than 5 of them will fall below the average? Make sure you interpret your results.
If you were to find another random sample of 10 cars based on the same data, what is the probability that more than 6 of them will fall below the average? Make sure you interpret your results.
If you were to find another random sample of 10 cars based on the same data, what is the probability that at least 4 of them will fall below the average? Make sure you interpret your results.
I encourage you to review the Week 3 Binomial probabilities PDF at the bottom of the discussion. This will give you a step by step example to follow and show you how to find probabilities using Excel.
Sheet1
Type Year Make Model Price MPG(CITY) MPG (HIGHWAY) Weight
variable type: qualitative variable type: quantitative variable type: qualitative variable type: qualitative variable type: quantitative variable type: quantitative variable type: quantitative variable type: quantitative
SUV 2021 Mazda CX-30 $22,795 25 33 3232
compact crossover 2021 Toyota rav4prime $29,458 36 40 5,530
SUV 2021 Chrysler Voyager $28,730 19 28 4,330
minivan 2020 kia Sedona $28,720 18 24 6,085
minivan 2020 Dodge grand caravan $29,025 17 25 4,510
passenger wagon 2020 Ford transit connect $28,315 24 26 3,689
SUV 2020 Volkswagen Tigwan $25,965 22 29 3,847
SUV 2019 kia Sorento $28,110 22 29 3,810
SUV 2020 Honda Odyssey $32,110 19 28 4,593
SUV 2021 Hyundai Palisade $33,665 19 24 4,284
sports car 2020 Bugatti La Voiture $3,250,000 9 14 4,400
SUMMARY BEFORE ADDING OUTLIER
mean $28,689 $22 $29 $4,391
standard deviation 2977.7926966873 5.5467708324 4.8579831206 863.2415652643
median $28,725 $21 $28 $4,307
SUMMARY AFTER ADDING OUTLIER
mean $321,536 $21 $27 $4,392
standard deviation $971,266 $7 $6 $819
median $28,730 $19 $28 $4,330
DISCUSSION
From the initial analysis, the following can be established:
The average MPG stands at 22 in the city and at 14 on the highway. This defines the average of all the vehicles selected in the list. Therefore, this figure can be used as a substitute for all the MPG figures respectively. However, the MPG figure for the highway deviates from the central measure of tendency with a certain value. This measure of dispersion is determined by the standard deviation. In the MPG in the city, the standard deviation is given by 5.55. this figure is used in determining the range within which most of the data points are found, defined from the central point of tendency (Mean). This is given by 16.55 and 2755. Therefore, the city MPG for all the data points is within the given range at a confidence interval of 95%. the same concept can be applied in the subsequent means and standard deviation. The median point defines the central position of the data points, when they are arranged in a chronological manner.
After addition of the outlier [chosen one is Bugatti. The figures change tremendously. The figures chosen for descriptive statistics, hike up for the standard deviation, but reduces for the mean and the media. An increase in the measures of dispersion is attributed with a decrease in the values for measures of central tendency. This is because larger outliers are attributed to minimizing the measures of central tendency, and hence the deviation from these points is increased.
Without supercar
Without supercar
Mean 28689.3
Standard Error 941.6607321347
Median 28725
Mode ERROR:#N/A
Standard Deviation 2977.7926966873
Sample Variance 8867249.34444445
Kurtosis 1.2469446572
Skewness -0.3237289483
Range 10870
Minimum 22795
Maximum 33665
Sum 286893
Count 10
Confidence Level(95.0%) 2130.1845701243
With Supercar
With Supercar
Mean 321535.727272727
Standard Error 292847.665977757
Median 28730
Mode ERROR:#N/A
Standard Deviation 971265.828779546
Sample Variance 943357310154.818
Kurtosis 10.9997518649
Skewness 3.3165733432
Range 3227205
Minimum 22795
Maximum 3250000
Sum 3536893
Count 11
Confidence Level(95.0%) 652505.262278539 Poisson Probabilities
The Poisson probability distribution gives the probability of a number of events
occurring in a fixed interval of time or space if these events happen with a known
average rate and independently of the time since the last event.
You will need to make sure you are looking at the correct time or interval when
you calculate Poisson probabilities. For example, if you can type, on average, 50
words in 1 minute, but you want to calculate a probability for a time of 3 minutes,
then the new average will be 50*3 = 150 words. You would use 150 to calculate
the probability NOT 50. We will do more examples below.
You need to rewrite the probabilities in the less than or equal to form to use the
function in EXCEL. We will use Excel to find Poisson Probabilities. The
probabilities do need to be in the less than or equal to form to use Excel. This is
very important.
Here are some common Poisson Probabilities and how they would get re-written
to calculate in the less than or equal to form, to use Excel.
P( x = r)
P( x r)
P( x < r) = P(x r 1)
P(x > r) = 1 P(x r)
P( x r) same as 1 P(x r – 1)
Expected Value is or
r is the number of occurrences
To find Poisson Probabilities we will use the =POISSON.DIST( ) function.
Example:
A Cognitive scientific designed an experiment to measure x, the number of a
times a readers eye fixated on a single word before moving past that word in 1
minute. They found that a persons eye fixated on a word 3 times in 1 minute
before moving on to the next word.
1) Find the probability that a persons eye will fixated on a word exactly 5 times in
1 minute?
Because of the word exactly we want to find this probability P(x = 5). We will
use the POISSON.DIST() function to find this probability.
P(x = 5) = POISSON.DIST(5,3, FALSE)
In Excel make sure you hit the = sign first then start typing in POISSON.DIST(
From here make sure you include the left parenthesis then type in the x value,
then the mean, then either TRUE or FALSE. Then close the parenthesis ) and hit
Enter.
Type in a TRUE when you have a less than or equal to probability and type in a
FALSE when you have an equals probability. This example has an = sign so we
will use a FALSE.
There is a 10.08% probability that a persons eye will fixate on a word exactly 5
times before it moves on to the next word.
Note: When you hit Enter the answer will return as a decimal, .1008. You will
then need to convert it to a percent.
2) Find the probability that a persons eye will fixate on certain words fewer than
100 times in 30 minutes?
The interval we want to look at for this problem is 30 minutes NOT 1 minute.
Because of this, we need to find the new mean. If a persons eye fixates on a
word 3 times in 1 minute, then 3*30 = 90. Then a persons eye will fixate on
certain words 90 times in a 30-minute time period. The new average is 90. This is
the value we will use in the Excel function.
Because of the word fewer we will use the less than sign.
This is the probability we want to find, P(x < 100) This probability is in the less than form NOT the less than or equal to form so we need to rewrite this in the less than or equal to form. Remember: P( x < r) = P( x r 1) P( x < 100) = P(x 100 1) = P(x 99). Now that the probability is in the less than or equal to form we can use Excel. P(x 99) = POISSON.DIST(99,90,TRUE) In Excel make sure you hit the = sign first then start typing in POISSON.DIST(. From here make sure you include the left parenthesis then type in the x value, then the mean, then either TRUE or FALSE. Then close the parenthesis ) and hit Enter. Type in a TRUE when you have a less than or equal to probability and type in a FALSE when you have an equals probability. This example has an sign so we will use a TRUE. There is an 84.18% probability that a persons eye will fixate on certain words fewer than 100 times in a 30-minute period before it moves on to the next word. Note: When you hit Enter the answer will return as a decimal, .8418. You will then need to convert it to a percent. 3) Find the probability that a persons eye will fixated on certain words at least 95 times in 30 minutes? The time frame we are looking at is still 30 minutes so the mean will be 90. Because of the words at least we will use the greater than or equal to sign. This is the probability we want to find, P(x 95). This probability is in the greater than or equal to form NOT the less than or equal to form so we need to rewrite this in the less than or equal to form. Remember: P( x r) = 1 - P( x r - 1) P( x 95) = 1 - P(x 95 1) = 1 - P(x 94). Now that the probability is in the less than or equal to form we can use Excel. 1 - P(x 94) = 1- POISSON.DIST(94, 90,TRUE) In Excel make sure you hit the = sign first, then the 1 - and then, POISSON.DIST(. From here make sure you include the left parenthesis then type in the x value, then the mean, then either TRUE or FALSE. Then close the parenthesis ) and hit Enter. Type in a TRUE when you have a less than or equal to probability and type in a FALSE when you have an equals probability. This example has an sign so we will use a TRUE. There is a 31.28% probability that a persons eye will fixate on certain words at least 95 times in a 30-minute period before it moves on to the next word. Note: When you hit Enter the answer will return as a decimal, .3128. You will then need to convert it to a percent. 4) Find the probability that a persons eye will fixated on certain words more than 97 times in 30 minutes? The time frame we are looking at is still 30 minutes so the mean will be 90. Because of the word more we will use the greater than sign. This is the probability we want to find, P(x > 97).
This probability is in the greater than form NOT the less than or equal to form so
we need to rewrite this in the less than or equal to form.
Remember: P(x > r) = 1 P(x r)
P( x > 97) = 1 – P(x 97)= 1 – P(x 97). Now that the probability is in the less than
or equal to form we can use Excel.
1 – P(x 97) = 1- POISSON.DIST(97, 90,TRUE)
In Excel make sure you hit the = sign first, then the 1 – and then,
POISSON.DIST(. From here make sure you include the left parenthesis then type
in the x value, the n value, then the mean, then either TRUE or FALSE. Then close
the parenthesis ) and hit Enter.
Type in a TRUE when you have a less than or equal to probability and type in a
FALSE when you have an equals probability. This example has an sign so we
will use a TRUE.
There is a 21.26% probability that a persons eye will fixate on certain words more
than 97 times in a 30-minute period before it moves on to the next word.
Note: When you hit Enter the answer will return as a decimal, .2126. You will
then need to convert it to a percent. Binomial Probabilities
The Probability of an event can be expressed as a binomial probability if its
outcomes can be broken down into two probabilities, p, which is a success and a
q, which is a failure. Where p and q are complementary
p + q = 1, thus q = 1 – p
You need to rewrite the probabilities in the less than or equal to form to use the
function in EXCEL. We will use Excel to find Binomial Probabilities. The
probabilities do need to be in the less than or equal to form to use Excel. This is
very important.
Here are some common Binomial Probabilities and how they would get re-written
to calculate in the less than or equal to form, to use Excel.
P( x = j)
P( x j)
P( x < j) = P(x j 1)
P(x > j) = 1 P(x j)
P( x j) same as 1 P(x j – 1)
Expected Value = n*p
Standard deviation =
Recall, n denotes the sample size
To find Binomial Probabilities we will use the =BINOM.DIST( ) function.
Lets use our Car Price Data from Week 2 and calculate 4 different probabilities
Car Price:
Observation 1 $ 20,000
Observation 2 $ 25,000
Observation 3 $ 30,000
Observation 4 $ 31,000
Observation 5 $ 22,500
Observation 6 $ 25,000
Observation 7 $ 29,500
Observation 8 $ 24,000
Observation 9 $ 24,500
Observation 10 $ 25,000
1) Using our data, we found the average = $25,650.
Looking at our data we see that 7 out of 10 cars fall below the average. We will
call this a success if your price falls below the average. This means p = 7/10 = .70
and q = 1 – .70 = .30.
If you were to find another random sample of 10 cars based on the same data,
what is the probability that exactly 5 of them will fall below the average?
Because of the word exactly we want to find this probability P(x = 5). We will
use the BINOM.DIST() function to find this probability.
P(x = 5) = BINOM.DIST(5, 10, .70, FALSE)
In Excel make sure you hit the = sign first then start typing in BINOM.DIST(
From here make sure you include the left parenthesis then type in the x value,
the n value, the p value (the probability), then either TRUE or FALSE. Then close
the parenthesis ) and hit Enter.
Type in a TRUE when you have a less than or equal to probability and type in a
FALSE when you have an equals probability. This example has an = sign so we
will use a FALSE.
There is a 10.29% probability that exactly 5 of the cars will fall below the average.
Note: When you hit Enter the answer will return as a decimal, .1029. You will
then need to convert it to a percent.
2) If you were to find another random sample of 10 cars based on the same data,
what is the probability that fewer than 8 of them will fall below the average?
Because of the word fewer we will use the less than sign.
This is the probability we want to find, P(x < 8) This probability is in the less than form NOT the less than or equal to form so we need to rewrite this in the less than or equal to form. Remember: P( x < j) = P( x j 1) P( x < 8) = P(x 8 1) = P(x 7). Now that the probability is in the less than or equal to form we can use Excel. P(x 7) = BINOM.DIST(7,10,.70,TRUE) In Excel make sure you hit the = sign first then start typing in BINOM.DIST(. From here make sure you include the left parenthesis then type in the x value, the n value, the p value (the probability), then either TRUE or FALSE. Then close the parenthesis ) and hit Enter. Type in a TRUE when you have a less than or equal to probability and type in a FALSE when you have an equals probability. This example has an sign so we will use a TRUE. There is a 61.72% probability that fewer than 8 of the cars will fall below the average. Note: When you hit Enter the answer will return as a decimal, .6172. You will then need to convert it to a percent. 3) If you were to find another random sample of 10 cars based on the same data, what is the probability that at least 3 of them will fall below the average? Because of the words at least we will use the greater than or equal to sign. This is the probability we want to find, P(x 3). This probability is in the greater than or equal to form NOT the less than or equal to form so we need to rewrite this in the less than or equal to form. Remember: P( x j) = 1 - P( x j - 1) P( x 3) = 1 - P(x 3 1) = 1 - P(x 2). Now that the probability is in the less than or equal to form we can use Excel. 1 - P(x 2) = 1- BINOM.DIST(2,10,.70,TRUE) In Excel make sure you hit the = sign first, then the 1 - and then, BINOM.DIST(. From here make sure you include the left parenthesis then type in the x value, the n value, the p value (the probability), then either TRUE or FALSE. Then close the parenthesis ) and hit Enter. Type in a TRUE when you have a less than or equal to probability and type in a FALSE when you have an equals probability. This example has an sign so we will use a TRUE. There is a 99.84% probability that at least 3 of the cars will fall below the average. Note: When you hit Enter the answer will return as a decimal, .9984. You will then need to convert it to a percent. 4) If you were to find another random sample of 10 cars based on the same data, what is the probability that more than 5 of them will fall below the average? Because of the word more we will use the greater than sign. This is the probability we want to find, P(x > 5).
This probability is in the greater than form NOT the less than or equal to form so
we need to rewrite this in the less than or equal to form.
Remember: P(x > j) = 1 P(x j)
P( x > 5) = 1 – P(x 5)= 1 – P(x 5). Now that the probability is in the less than or
equal to form we can use Excel.
1 – P(x 5) = 1- BINOM.DIST(5,10,.70,TRUE)
In Excel make sure you hit the = sign first, then the 1 – and then, BINOM.DIST(.
From here make sure you include the left parenthesis then type in the x value,
the n value, the p value (the probability), then either TRUE or FALSE. Then close
the parenthesis ) and hit Enter.
Type in a TRUE when you have a less than or equal to probability and type in a
FALSE when you have an equals probability. This example has an sign so we
will use a TRUE.
There is a 84.97% probability that more than 5 of the cars will fall below the
average.
Note: When you hit Enter the answer will return as a decimal, .8497. You will
then need to convert it to a percent. Car Analysis
Vehicle Type/Class Year Make Model Price MPG (city) MPG (Highway) Horsepower
SUV 2021 Toyota 4Runner $ 45,012.00 16 19 270
SUV 2021 Chevrolet Tahoe $ 60,030.00 16 20 355
SUV 2020 Land Rover Range Rover $ 93,775.00 18 23 355
SUV 2021 Toyota Rav4 Prime $ 31,228.00 41 38 219
SUV 2021 Chevrolet Suburban $ 71,445.00 15 19 355
SUV 2021 Genesis GV80 $ 60,620.00 18 23 375
SUV 2021 Lexus UX $ 29,900.00 29 37 169
SUV 2020 Audi Q3 $ 29,688.00 19 27 228
SUV 2021 Kia Telluride $ 44,343.00 20 26 291
SUV 2020 Hyundai Palisade $ 32,995.00 19 26 291
Supercar 2021 McLaren F1 $ 19,800,000.00 18 24 600
Qualitative Quantitative Qualitative Qualitative Quantitative Quantitative Quantitative Quantitative
Average Price: $ 49,903.60 p=6/10=.60 q=1-.60=.40
p 0.6
q 0.4
P(x=4) 11.15% % of 10 Random Samples of Cars probability that exactly 4 of them will fall below the average
P(x<5) and P(x<=4) 16.62% % of 10 Random Samples of Cars probability that fewer than 5 of them will fall below the average P(x>6) = 1 – P(x<=6)= 1 - P(x<=6) 38.23% % of 10 Random Samples of Cars probability that more than 6 of them will fall below the average P(x>=4) = 1 – P(x<=4-1) = 1 - P(x<=3) 94.52% % of 10 Random Samples of Cars probability that at least 4 of them will fall below the average 10 Vehicles Column1 Mean 49903.6 Standard Error 6753.1541325352 Median 44677.5 Mode ERROR:#N/A Standard Deviation 21355.34844899 Sample Variance 456050907.377778 Kurtosis 0.35728503 Skewness 0.9811742596 Range 64087 Minimum 29688 Maximum 93775 Sum 499036 Count 10 11 Vehicles Column1 Mean 1845366.90909091 Standard Error 1795473.70004346 Median 45012 Mode ERROR:#N/A Standard Deviation 5954912.58399526 Sample Variance 35460983883025.1 Kurtosis 10.9996604096 Skewness 3.3165544515 Range 19770312 Minimum 29688 Maximum 19800000 Sum 20299036 Count 11