SPC tools and techniques– part 2 & Fundamentals of Statistics—Part 1
1. In the article The tools of Quality Part IV: Histograms (Quality Progress, September 1990, Vol. XXIII, No.9, pp.75-78) data are presented on the gain of 120 amplifiers. These data are reproduced below.
a) Construct a stem-and-leaf display step by step. Category Interval=1 (7,8,9,10,11 as stems, and decimal number as leaf) (3pts).
b) Comment on the shape of the display in a. Can you make an assumption that the data follows normal distribution? Why? (2pt)
c) Construct a stem-and-leaf display step by step. Category Interval=0.5 (7.0X<7.5, 7.5X<8, 8.0X<8.5, .) (1pts).
8.1
10.4
8.8
9.7
7.8
9.9
11.7
8.0
9.3
9.0
8.2
8.9
10.1
9.4
9.2
7.9
9.5
10.9
7.8
8.3
9.1
8.4
9.6
11.1
7.9
8.5
8.7
7.8
10.5
8.5
11.5
8.0
7.9
8.3
8.7
10.0
9.4
9.0
9.2
10.7
9.3
9.7
8.7
8.2
8.9
8.6
9.5
9.4
8.8
8.3
8.4
9.1
10.1
7.8
8.1
8.8
8.0
9.2
8.4
7.8
7.9
8.5
9.2
8.7
10.2
7.9
9.8
8.3
9.0
9.6
9.9
10.6
8.6
9.4
8.8
8.2
10.5
9.7
9.1
8.0
8.7
9.8
8.5
8.9
9.1
8.4
8.1
9.5
8.7
9.3
8.1
10.1
9.6
8.3
8.0
9.8
9.0
8.9
8.1
9.7
8.5
8.2
9.0
10.2
9.5
8.3
8.9
9.1
10.3
8.4
8.6
9.2
8.5
9.6
9.0
10.7
8.6
10.0
8.8
8.6
Please be noted: you are not expected to use Excel Histogram function in this problem. In the next assignment, this function must be used.
2. An experiment in chemistry looked at the effect of temperature on the solubility of salt in water. Below are data on the solubility of Potassium Chloride (KCL). Construct a scatter plot of the data and comment on the relationship between temperature and solubility using Microsoft Excel. (3pts)
Temp. C
0
10
20
30
40
50
60
70
80
90
100
Solubility
29.6
28.0
33.6
38.1
34.2
42.6
44.8
48.1
56.5
55.4
62.9
1. In the article The tools of Quality Part IV: Histograms (Quality Progress, September 1990, Vol. XXIII, No.9, pp.75-78) data are presented on the gain of 120 amplifiers. These data are reproduced below.
a) Construct a stem-and-leaf display step by step. Category Interval=1 (7,8,9,10,11 as stems, and decimal number as leaf) (3pts).
b) Comment on the shape of the display in a. Can you make an assumption that the data follows normal distribution? Why? (2pt)
c) Construct a stem-and-leaf display step by step. Category Interval=0.5 (7.0X<7.5, 7.5X<8, 8.0X<8.5, .) (1pts).
8.1
10.4
8.8
9.7
7.8
9.9
11.7
8.0
9.3
9.0
8.2
8.9
10.1
9.4
9.2
7.9
9.5
10.9
7.8
8.3
9.1
8.4
9.6
11.1
7.9
8.5
8.7
7.8
10.5
8.5
11.5
8.0
7.9
8.3
8.7
10.0
9.4
9.0
9.2
10.7
9.3
9.7
8.7
8.2
8.9
8.6
9.5
9.4
8.8
8.3
8.4
9.1
10.1
7.8
8.1
8.8
8.0
9.2
8.4
7.8
7.9
8.5
9.2
8.7
10.2
7.9
9.8
8.3
9.0
9.6
9.9
10.6
8.6
9.4
8.8
8.2
10.5
9.7
9.1
8.0
8.7
9.8
8.5
8.9
9.1
8.4
8.1
9.5
8.7
9.3
8.1
10.1
9.6
8.3
8.0
9.8
9.0
8.9
8.1
9.7
8.5
8.2
9.0
10.2
9.5
8.3
8.9
9.1
10.3
8.4
8.6
9.2
8.5
9.6
9.0
10.7
8.6
10.0
8.8
8.6
Please be noted: you are not expected to use Excel Histogram function in this problem. In the next assignment, this function must be used.
2. An experiment in chemistry looked at the effect of temperature on the solubility of salt in water. Below are data on the solubility of Potassium Chloride (KCL). Construct a scatter plot of the data and comment on the relationship between temperature and solubility using Microsoft Excel. (3pts)
Temp. C
0
10
20
30
40
50
60
70
80
90
100
Solubility
29.6
28.0
33.6
38.1
34.2
42.6
44.8
48.1
56.5
55.4
62.9 1
1
Lecture 7: SPC tools and
techniques part 2
Chapter 4
2
Magnificent seven quality tools
Process flow diagram
Pareto diagram
Cause and effect diagram
Check sheet
Scatter diagram
Histogram
Control charts (include run chart)
3
Check sheet for paint non-conformities
4
Check sheet for plastic mold nonconformities
5
Check sheet example pin
diameter
6
Check sheet example-- for car
door painting process
The Quality Toolbook from
http://syque.com/quality_tools/toolbook/Check/example.htm
2
7
Key points in using check sheet
Check sheet should be well designed,
easy to read, and clearly labeled.
Record only necessary information.
Dont attempt to collect data not
specifically related to the issues being
studied.
Keep it simple
8
Scatter diagram (scatterplot)
Display the relationship between two
variables
Gas mileage and speed
Cutting speed and tool life
Feed rate and surface roughness
9
Constructing scatter diagram
Two measurements on each unit
Plot using coordinate axes
Vertical: response
Horizontal: explanatory or predictor
Explanatory variable
R
e
s
p
o
n
s
e
v
a
ri
a
b
le
10
Interpretation-- scatter plot
Positive
Negative
No association
Curvilinear
11
Interpretation-- scatter plot
12
Scatter diagram example
3
13
Scatter diagram example
Fig. 22-17, cutting speed and tool life,
from Materials and Processes in Mfg, by DeGarmo, Black, & Kohser, 9th ed
14
Scatter diagram example
15
Scatter diagram examples
16
Histogram
Pictorial representation (summary) of a
set of data
measurement
fr
e
q
u
e
n
c
y
17
Interpret histograms
By graphic (qualitative) features
Symmetry
Mounted, flat
Skew
Right, left
By analytical (quantitative) features
Center: mean, medium, mode
Dispersion: range, standard deviation
18
symmetric
Skew- right
Skew- left
4
19
Histogram
20
Constructing a stem-leaf chart
(one type of histogram)
Order data from smallest to largest
Determine cell intervals equal width
Construct a frequency table
Draw bar chart
Vertical: frequency by %
Horizontal: cell intervals
21
How many intervals (cells,
categories) should be formed?
Number of cells is based on judgment
General rule of thumb is the number of
cells should be between 5 and 20
5 to 9 when observations are < 100
8 to 17 observations are between 100 &
500
15 to 20 when observations are > 500
22
Stem-and-leaf chart– octane ratings
The following data on motor octane
ratings are excerpted from an article in
Technometrics Vol.19 p425
93.3 91.8 92.3 90.4 90.1 93.0 88.7 89.9
89.8 89.6 87.4 88.4 88.9 91.2 89.3 94.4
92.7 91.8 91.6 90.4 91.1 92.6 89.3 90.6
91.1 90.4 89.3 89.7 90.3 91.6 90.5 93.7
92.7 92.2 92.2 91.2 91.0 92.2 90.0 90.7
23
Stem-and-leaf chart– octane ratings
86
87 4
88 4 9 7
89 8 6 3 7 3 3 9
90 4 4 4 1 3 5 0 6 7
91 1 8 8 6 2 1 0 2 6
92 7 7 2 3 2 6 2
93 3 0 7
94 4
95 24
Stem-and-leaf chart– octane ratings
86
87 4
88 4 7 9
89 3 3 3 6 7 8 9
90 0 1 3 4 4 4 5 6 7
91 0 1 1 2 2 3 6 8 8
92 2 2 2 3 6 7 7
93 0 3 7
94 4
95
5
25
Octane Rating frequency table
Class interval Freq. Freq.% Cumu. %
87.0Y<88.0 1 2.5 2.5 88.0Y<89.0 3 7.5 10 89.0Y<90.0 7 17.5 27.5 90.0Y<91.0 9 22.5 50 91.0Y<92.0 9 22.5 72.5 92.0Y<93.0 7 17.5 90 93.0Y<94.0 3 7.5 97.5 94.0Y<95.0 1 2.5 100 40 100 26 Histogramoctane rating Histogramoctane rating (created from Excel) 27 1 3 7 9 9 7 3 1 0 1 2 3 4 5 6 7 8 9 10 Frequency Frequency Linear (Frequency) 28 Cautions (limitations) Choice of intervals can affect the picture you get Histograms can hide trends over time or other patterns 29 Run chart A graphic representation of process performance data tracked over time 30 6 31 Run chart 32 Control chart A graphic display of the results of a process over time and against established control limits Control chart Run chart with control limits A statistical tool used to detect excessive process variability due to specific assignable causes that can be corrected. Two purposes To see if special causes are present To test or evaluate if a solution works 33 Control chart Assignment today InClass Practice 5 34 1 Lecture 8: Fundamentals of StatisticsPart 1 Chapter 5 Outline Statistics, type of data, data description Frequency distribution and histogram Measures of central tendency Measures of dispersion Population, sample, Normal curve Computer program Types of Data Variable Quality characteristics that are measurable, continuous. Example: length, weight Attribute Classified either conforming or nonconforming (yes/no, accept/reject) Go/NoGo gauge A Go/no go refers to an inspection tool used to check a workpiece against its allowed tolerances. Its name derives from its use: the gauge itself has two tests; the check involves the workpiece's having to pass one test (Go) and 'fail' the other (No Go). Go/NoGo gauge Lower gauge is a plain plug gauge of 12.60mm for 'Go' and 12.90mm as 'NoGo For checking tolerance between 12.60-12.90. For a washer with dia= 12.75, pass one and fail the other and a washer with dia= 12.5, fail at both end Describe Data Summarizing data: Graphical (describe picture) Plot or picture of frequency distribution Analytical (math calculations) Measures of central tendency (mean, mode, median) Measures of dispersion (range, std. deviation) http://www.answers.com/topic/go-no-go http://www.answers.com/topic/engineering-tolerance http://upload.wikimedia.org/wikipedia/commons/2/26/GaugePlugThreadGoNoGo.jpg 2 Descriptive statistics Mean = average() Median = median() Mode = Mode() Std. Dev. = STDEV() Variance = var() Count = count() Max = max() Min = min() Range = max()-min() Skewness = skew() Kurtosis = kurt() Sum =sum() Student scores 82.64, 84.47, 77.12, 79.08, 95.21, 81.84, 94.66, 80.50, 84.96, 89.18, 87.76 Mean=85.22 Median=84.47 Max= 95.21 Min=77.12 Describe Data 0 1 2 3 4 5 6 7 <75 <80 <85 <90 <95 <100 Frequency Frequency Frequency Distribution Ungrouped data: a listing of the individual observed values Grouped data: a lumping together (subgroups) of the observed values Frequency Distribution ungrouped data A table that includes the number of daily billing errors of a larger organization Example number of daily billing errors Organize the data (array) Tabulate the frequency of each value Different types of frequency distributions Histogram: consists of a set of rectangles that represent the frequency in each 3 Different types of frequency distributions Relative frequency distribution: proportion or fraction of the total Different types of frequency distributions Cumulative frequency Different types of frequency distributions Relative cumulative frequency Summarization of this billing example This example shows how to organize and sort the data by frequency, relative frequency, cumulative frequency, and relative cumulative frequency. We have practiced this tasks in Pareto chart Students should be able to chart these frequency data by Excel Frequency Distribution grouped data Sometimes the collected data is a large volume of data set. If we still use the previous method, the frequency distribution result may not give a good explanation. Frequency Distribution example of grouped data Example: steel shaft weight (kilograms) Simplify data by coded value: the weights are coded from 2.500Kg, a weight with a value of 31 is equivalent to 2.531Kg (2.500+0.031) 4 Steel shaft weight Total: 110 Highest: 75 Lowest: 31 Frequency Distribution grouped data Collect data and construct a tally sheet Determine the range Determine the cell (subgroup) interval Determine the cell midpoints Determine the cell boundaries Post the cell frequency 1. Data tabulation: from lowest to highest, a tally sheet A large number of categories 2. Determine the range Where R=range Xh= highest data Xl = lowest data 044.0531.2575.2 lh XXR How many cells (categories) should be formed grouping Number of cells is based on judgment General rule of thumb is the number of cells should be between 5 and 20 5 to 9 when observations are < 100 8 to 17 observations are between 100 & 500 15 to 20 when observations are > 500
Three issues in creating the
cells
Equal width of cell intervals
Cell midpoints
Cell boundaries
5
3. Determine the cell interval
width
Cell interval is the distance between adjacent
cell midpoints. Odd number is preferred
because of easy calculation
0057.0
)110log(322.31
044.0
log322.31
n
R
i
3. Determine the cell interval
length
All trial-and-error method, here h is the total
number of intervals needed
Assume i=0.003, then
Assume i=0.005, then
Assume i=0.007, then
9
005.0
044.0
i
R
h
15
003.0
044.0
i
R
h
6
007.0
044.0
i
R
h
4.Determine the cell
boundaries and midpoints
When constructing a histogram, its
important to remember two things:
Histogram must contain all of the data
One particular value cannot fit into two
different cells, which means cells cannot
overlap
4.Determine the cell
boundaries and midpoints
1st method– the simplest technique is
to choose the lowest value measured as
first midpoint, and so on
2nd method– lowest value will be the
lower boundary for first cell (then, the
first midpoint would be the lowest value
plus half of interval), and so on
2
i
XMP
ll
4.Determine the cell boundaries and
midpoints by 1st method
Lowest value as the first
midpoint so
first Midpoint =2.531
Since width=0.005, so first
cell boundaries:
Lower-B=2.531-0.005/2
= 2.5285
Upper-B=2.531+0.005/2
=2.5335
Midpoint Cell boundaries
2.531 2.5285 – 2.5335
2.536 2.5335 – 2.5385
2.541 2.5385 2.5435
2.546 2.5435 2.5485
2.551 2.5485 – 2.5535
2.556 2.5535 – 2.5585
2.561 2.5585 2.5635
2.566 2.5635 2.5685
2.571 2.5685 2.5735
2.576 2.5735 2.5785
4.Determine the cell boundaries and
midpoints by 2st method
Lowest value as the lower
boundary of the first cell so
Lower-B=2.531,
Adjust a little bit, then
Lower-B=2.530
First Midpoint =2.530+0.005/2
= 2.5325
Upper-B=2.530+0.005
=2.535
Midpoint Cell boundaries
2.5325 2.530 – 2.535
2.5375 2.535 – 2.540
2.5425 2.540 – 2.545
2.5475 2.545 2.550
2.5525 2.550 – 2.555
2.5575 2.555 – 2.560
2.5625 2.560 2.565
2.5675 2.565 2.570
2.5725 2.570 2.575
6
4.Determine the cell
boundaries and midpoints
Other midpoints: 2.533+0.005=2.538
2.538+0.005=2.543..
533.2
2
005.0
531.2
2
i
XMP
ll
4.Determine the cell
boundaries and midpoints
Boundaries are established so there is no question
as to the location of an observation.
2.530 2.535 2.540 2.575 2.545 2.550 2.555 2.560 2.565 2.570
6. Post the cell frequency Excel Histogram function
Histogram command
Input range entire data
Bin range using upper boundary to
separate the cells
Given these two parameters, Excel will
generate frequency column
Be careful: you must have Data
Analysis ToolPak loaded
Characteristics of Graphs
Symmetry or lack of symmetry of the
data (Distribution around the central
value, Skewness)
Number of peaks
Peakedness of the data
Platykurtic (flat shape)
Leptokurtic (clear peak)
Characteristics of frequency
distributions
7
Analysis of Histogram
Many cases are expected to have a normal
pattern (bell)
Graphical representation can help figure
out problems if there is any in the process.
If not normally distributed, then something
in the process may be out of control or the
data follows some other basic pattern
Differences due to location,
spread, and shape
39
Histogram Computing function
with Excel
Define the data range
Find out mean, median, mode, standard
deviation
Construct histogram
Name data or other names
Use functions to describe data
Define midpoint and U-boundary
Tool Data AnalysisHistogram
Input range data
Bin range U-boundary
Output range
Chart data midpoint and frequency columns
One useful website about histogram with
Excel
Procedures of using Excel to
make histogram
How many cells
or categories
Boundaries of
cells or categories
Excel histogram
function (data, Bin)
Chart
function
R=max(data)-min(data)
i=R/(1+3.322log(n))
# of cells = R/i
The very left cell,
L-bound
U-bound= L-bound + i
U-bound (n+1) = U-
bound (n) +i
Cell
midpoint
Midpoint=(L-bound + U-bound)/2
Midpoint=L-bound + i/2
Assignment today/this week
InClass 6: Histogram function in Excel
Homework 2
http://www.utexas.edu/its/training/handouts/excelgrade/
http://www.utexas.edu/its/training/handouts/excelgrade/