Separating the Signals from the Noise


Quality Digest Daily, October 3, 2013
Manuscript 260
Donald J. Wheeler

The second principle for understanding data is that while some data contain signals, all data contain noise. Therefore, before you can detect the signals you will have to filter out the noise. This act of filtration is the essence of all data analysis techniques. It is the foundation for our use of data and all the predictions we make based on those data. In this column we will look at the mechanism used by all modern data analysis techniques to filter out the noise.

Given a collection of data it is common to begin with the computation of some summary statistics for location and dispersion. Averages and medians are used to characterize location, while either the range statistic or the standard deviation statistic is used to characterize dispersion. This much is taught in every introductory class. However, what is usually not taught is that the structures within our data will often create alternate ways of computing these measures of dispersion. Understanding the roles of these different methods of computation is essential for anyone who wishes to analyze data.

Perhaps the most common type of structure for a data set is to have k subgroups of size n where the n values within each subgroup were collected under the same set of conditions. This structure is found in virtually all types of experimental data, and in most types of data coming from a production process. To illustrate the alternate ways of computing measures of dispersion we shall use a simple data set consisting of k = 3 subgroups of size n = 8 as shown in Figure 1.

Subgroup One:    4  5  4  8  4  5  3  7
Subgroup Two:    2  4  3  7  5  4  2  5
Subgroup Three:  3  6  6  4  5  4  6  6

Figure 1: Data Set One

Figure 2: Method One for Estimating Dispersion
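Before turning to the three methods, readers who want to follow the arithmetic can set up the data in a few lines of code. The following is a minimal sketch in Python (simply a convenient choice; the article itself assumes no particular software) that reproduces the subgroup averages and ranges used repeatedly below.

import statistics

# Data Set One: k = 3 subgroups of size n = 8 (the values of Figure 1)
data_one = [[4, 5, 4, 8, 4, 5, 3, 7],   # Subgroup One
            [2, 4, 3, 7, 5, 4, 2, 5],   # Subgroup Two
            [3, 6, 6, 4, 5, 4, 6, 6]]   # Subgroup Three

for i, subgroup in enumerate(data_one, start=1):
    average = statistics.mean(subgroup)
    subgroup_range = max(subgroup) - min(subgroup)
    print(f"Subgroup {i}: average = {average:.1f}, range = {subgroup_range}")
# subgroup averages 5.0, 4.0, 5.0 and subgroup ranges 5, 5, 3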

DATA SET ONE WITH METHOD ONE

The first method of computing a measure of dispersion is the method taught in introductory classes in statistics. All of the data from the k subgroups of size n are collected into one large group of size nk and a single dispersion statistic is found using all nk values. This dispersion statistic is then used to estimate a dispersion parameter such as the standard deviation for the distribution of X, SD(X).

As shown in Figure 3 the range of all 24 values is 6. The bias correction factor for ranges of 24 values is 3.895. Dividing 6 by 3.895 yields an unbiased estimate of the standard deviation of the distribution of X of 1.540. The global standard deviation statistic is 1.551. The bias correction factor for this statistic when it is based on 24 values is 0.9892. Dividing 1.551 by 0.9892 yields an unbiased estimate of the standard deviation of the distribution of X of 1.568.

Method One (all 24 values):
R = 6, s = 1.551
d2 = 3.895, c4 = 0.9892
Est. SD(X) = 1.540 (from the range), Est. SD(X) = 1.568 (from the standard deviation)

Figure 3: Method One with Data Set One

Since the original data are given to the nearest whole number, there is no practical difference between the two estimates of SD(X) shown in Figure 3. Whether we use the range or the standard deviation statistic will not substantially affect our analysis.
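The Method One arithmetic above can be sketched in a few lines of code (an illustration only, not part of the original article; the bias correction factors are the ones quoted in the text).

import statistics

data_one = [[4, 5, 4, 8, 4, 5, 3, 7],
            [2, 4, 3, 7, 5, 4, 2, 5],
            [3, 6, 6, 4, 5, 4, 6, 6]]

# Method One ignores the subgroups and pools all nk = 24 values
values = [v for subgroup in data_one for v in subgroup]

d2_24 = 3.895    # bias correction factor for the range of 24 values
c4_24 = 0.9892   # bias correction factor for s computed from 24 values

global_range = max(values) - min(values)     # 6
global_s = statistics.stdev(values)          # about 1.551

print("Est. SD(X) from the global range:", round(global_range / d2_24, 3))   # about 1.54
print("Est. SD(X) from the global s:    ", round(global_s / c4_24, 3))       # about 1.57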

DATA SET ONE WITH METHOD TWO

While Method One ignores the subgroups, Method Two respects the subgroup structure within the data. Here we calculate a dispersion statistic for each subgroup. These separate dispersion statistics are then averaged, and the average dispersion statistic is used to form an unbiased estimate for the standard deviation parameter of the distribution of X.

Figure 4: Method Two for Estimating Dispersion

Using Data Set One, we compute a dispersion statistic for each of the three subgroups. Since the subgroups are all the same size we can average the statistics prior to dividing by the common bias correction factor. As shown in Figure 5, the subgroup ranges are respectively 5, 5, and 3. The average range is 4.333 and the bias correction factor for ranges of eight data is 2.847. Dividing 4.333 by 2.847 we estimate the standard deviation for the distribution of X to be 1.522.

The subgroup standard deviation statistics are respectively 1.690, 1.690, and 1.195. The average standard deviation statistic is 1.525 and the bias correction factor is 0.9650. Dividing 1.525 by 0.9650 we estimate the standard deviation for the distribution of X to be 1.580.

Method Two (a dispersion statistic for each subgroup):
Subgroup One:   R1 = 5, s1 = 1.690
Subgroup Two:   R2 = 5, s2 = 1.690
Subgroup Three: R3 = 3, s3 = 1.195
d2 = 2.847, c4 = 0.9650
Average Range = 4.333, Average Std. Dev. = 1.525
Est. SD(X) = 1.522 (from the average range), Est. SD(X) = 1.580 (from the average standard deviation)

Figure 5: Method Two with Data Set One

As before, there is no practical difference between the two estimates shown in Figure 5. Nor is there any practical difference between the estimates in Figure 3 and those in Figure 5. The four estimates obtained using the two different measures of dispersion and the two different methods are all very similar.
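A sketch of the same Method Two arithmetic (illustrative only; the constants d2 = 2.847 and c4 = 0.9650 for subgroups of size eight are those quoted above):

import statistics

data_one = [[4, 5, 4, 8, 4, 5, 3, 7],
            [2, 4, 3, 7, 5, 4, 2, 5],
            [3, 6, 6, 4, 5, 4, 6, 6]]

d2_8 = 2.847     # bias correction for the range of a subgroup of 8 values
c4_8 = 0.9650    # bias correction for s computed from 8 values

# one dispersion statistic per subgroup, averaged across the k = 3 subgroups
average_range = statistics.mean(max(sg) - min(sg) for sg in data_one)    # 4.333
average_s = statistics.mean(statistics.stdev(sg) for sg in data_one)     # about 1.525

print("Est. SD(X) from the average range:", round(average_range / d2_8, 3))   # about 1.52
print("Est. SD(X) from the average s:    ", round(average_s / c4_8, 3))       # about 1.58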

DATA SET ONE WITH METHOD THREE

The third method will probably seem rather strange. It is certainly indirect. Instead of working with the individual values as the first two methods do, the third method works with the subgroup averages. These subgroup averages are used to obtain a dispersion statistic, and this dispersion statistic is then used to estimate the standard deviation parameter of the distribution of X.

Figure 6: Method Three for Estimating Dispersion

For Data Set One the subgroup averages are respectively 5.0, 4.0, and 5.0. The range of these three averages is 1.00. The bias correction factor for the range of three values is 1.693. Since each of these averages represents eight original data, we will have to multiply by the square root of 8 and divide by the bias correction factor to estimate the standard deviation parameter for the distribution of X. When we do this with the values above we obtain an estimate of SD(X) of 1.671.

The standard deviation statistic for the three subgroup averages is 0.5774. Dividing by the bias correction factor of 0.8862 and multiplying by the square root of 8 we obtain an unbiased estimate of the standard deviation of the distribution of X of 1.843.

Method Three (dispersion of the subgroup averages):
Range of the three averages = 1.000, Std. Dev. of the three averages = 0.5774
d2 = 1.693, c4 = 0.8862
Est. SD(X) = 1.671 (from the range), Est. SD(X) = 1.843 (from the standard deviation)

Figure 7: Method Three with Data Set One

Once again, there is no practical difference between using the range and using the standard deviation statistic. Here the two estimates are slightly larger than before, but not by any appreciable amount.
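The indirect Method Three computation, sketched in the same style (the factors d2 = 1.693 and c4 = 0.8862 for three values are those quoted above):

import math
import statistics

data_one = [[4, 5, 4, 8, 4, 5, 3, 7],
            [2, 4, 3, 7, 5, 4, 2, 5],
            [3, 6, 6, 4, 5, 4, 6, 6]]

n = 8            # subgroup size
d2_3 = 1.693     # bias correction for the range of k = 3 values
c4_3 = 0.8862    # bias correction for s computed from 3 values

averages = [statistics.mean(sg) for sg in data_one]    # 5.0, 4.0, 5.0

# the dispersion of the subgroup averages estimates SD(X)/sqrt(n),
# so multiply by sqrt(n) to get back to an estimate of SD(X)
range_based = (max(averages) - min(averages)) / d2_3 * math.sqrt(n)
sdev_based = statistics.stdev(averages) / c4_3 * math.sqrt(n)

print("Est. SD(X) from the range of the averages:", round(range_based, 3))   # about 1.67
print("Est. SD(X) from the s of the averages:    ", round(sdev_based, 3))    # about 1.84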

                  Range Based             Std. Dev. Based
                  Est. SD(X)    c.v.      Est. SD(X)    c.v.
Method One          1.540      18.1%        1.568      14.7%
Method Two          1.522      16.5%        1.580      15.6%
Method Three        1.671      50.6%        1.843      50.0%

Figure 8: Summary of Three Methods for Data Set One

As summarized in Figure 8, we have just obtained six unbiased estimates for the standard deviation parameter for the distribution of X using three different methods and two different statistics. These six values are listed along with their coefficients of variation (c.v.). The first four unbiased estimates are all quite similar because they all have similar coefficients of variation. The last two unbiased estimates are not as well determined as the first four because they have much larger coefficients of variation and therefore have more uncertainty attached.

Before we attempt to draw any lesson from this example we need to know that Data Set One has a very special property. When we place Data Set One on an average and range chart we end up with Figure 9. There we see no evidence of any differences between the three subgroups. Data Set One contains no signals. It is pure noise.

Average chart: UCL = 6.28, central line = 4.67, LCL = 3.05
Range chart:   UCL = 8.08, central line = 4.33, LCL = 0.59

Figure 9: Average and Range Chart for Data Set One

Therefore, at this point we can reasonably conclude that when the data are homogeneous and contain no signals the three methods will yield similar values for unbiased estimates of SD(X) regardless of whether we use the range or the standard deviation statistic.

DATA SET TWO

But what happens in the presence of signals? After all, the objective is to filter out the noise so we can detect any signals that may be present. To see how signals affect our estimates of SD(X) we shall modify Data Set One by inserting two signals. Specifically we shall shift subgroup two down by two units while we shift subgroup three up by four units. This will result in Data Set Two which is shown in Figure 10. As may be seen in the average and range chart in Figure 11, these changes have introduced two distinct signals.

Subgroup One:    4  5  4  8  4  5  3  7
Subgroup Two:    0  2  1  5  3  2  0  3
Subgroup Three:  7 10 10  8  9  8 10 10

Figure 10: Data Set Two

Average chart: UCL = 6.95, central line = 5.33, LCL = 3.71
Range chart:   UCL = 8.08, central line = 4.33, LCL = 0.59

Figure 11: Average and Range Chart for Data Set Two
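The limits in Figures 9 and 11 are ordinary average-and-range-chart limits. Here is a short sketch of that computation for Data Set Two (the scaling factors A2 = 0.373, D3 = 0.136, and D4 = 1.864 are the standard chart constants for subgroups of size eight; small differences from the figures are due to rounding):

import statistics

data_two = [[4, 5, 4, 8, 4, 5, 3, 7],
            [0, 2, 1, 5, 3, 2, 0, 3],
            [7, 10, 10, 8, 9, 8, 10, 10]]

A2, D3, D4 = 0.373, 0.136, 1.864     # chart constants for n = 8

grand_average = statistics.mean(statistics.mean(sg) for sg in data_two)   # 5.33
average_range = statistics.mean(max(sg) - min(sg) for sg in data_two)     # 4.33

print("Average chart limits:", round(grand_average - A2 * average_range, 2),
      "to", round(grand_average + A2 * average_range, 2))                 # compare Figure 11
print("Range chart limits:  ", round(D3 * average_range, 2),
      "to", round(D4 * average_range, 2))                                 # compare Figure 11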

METHOD ONE WITH DATA SET TWO

Method One uses all 24 values in Data Set Two to compute global measures of dispersion. As shown in Figure 12, the global range is 10.0 which results in an unbiased estimate of the standard deviation parameter of 2.567. The global standard deviation statistic is 3.279 which gives an unbiased estimate of the standard deviation parameter of 3.315.

Method One (all 24 values):
R = 10, s = 3.279
d2 = 3.895, c4 = 0.9892
Est. SD(X) = 2.567 (from the range), Est. SD(X) = 3.315 (from the standard deviation)

Figure 12: Method One with Data Set Two

These estimates of SD(X) are roughly twice the size of those found in Figure 3. Thus, the signals introduced by shifting the subgroup averages have inflated both of the Method One estimates by an appreciable amount.

METHOD TWO WITH DATA SET TWO

Using Method Two, we compute a dispersion statistic for each of the three subgroups. Since the subgroups are all the same size we can average the statistics prior to dividing by the common bias correction factor. As shown in Figure 13, the average range is 4.333 and the bias correction factor for ranges of eight data is 2.847. Dividing 4.333 by 2.847 we estimate the standard deviation for the distribution of X to be 1.522. The average standard deviation statistic is 1.525 and the bias correction factor is 0.9650. Dividing 1.525 by 0.9650 we estimate the standard deviation for the distribution of X to be 1.580.

Method Two (a dispersion statistic for each subgroup):
Subgroup One:   R1 = 5, s1 = 1.690
Subgroup Two:   R2 = 5, s2 = 1.690
Subgroup Three: R3 = 3, s3 = 1.195
d2 = 2.847, c4 = 0.9650
Average Range = 4.333, Average Std. Dev. = 1.525
Est. SD(X) = 1.522 (from the average range), Est. SD(X) = 1.580 (from the average standard deviation)

Figure 13: Method Two with Data Set Two

The Method Two estimates of SD(X) for Data Set Two are exactly the same as those obtained for Data Set One in Figure 5. Thus, the Method Two estimates are not affected by the signals introduced by shifting the subgroup averages.
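A sketch contrasting Method One and Method Two on Data Set Two (illustrative only, using the range-based estimates and the same bias correction factors as above) makes the inflation of the global estimate plain:

import statistics

data_two = [[4, 5, 4, 8, 4, 5, 3, 7],
            [0, 2, 1, 5, 3, 2, 0, 3],
            [7, 10, 10, 8, 9, 8, 10, 10]]

d2_24, d2_8 = 3.895, 2.847    # bias corrections for ranges of 24 and of 8 values

# Method One: one global range computed from all 24 values
pooled = [v for sg in data_two for v in sg]
method_one = (max(pooled) - min(pooled)) / d2_24                            # about 2.57

# Method Two: the average of the three within-subgroup ranges
method_two = statistics.mean(max(sg) - min(sg) for sg in data_two) / d2_8   # about 1.52

print("Method One estimate of SD(X):", round(method_one, 3))   # inflated by the two signals
print("Method Two estimate of SD(X):", round(method_two, 3))   # unchanged from Data Set One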

METHOD THREE WITH DATA SET TWO

For Data Set Two the subgroup averages are respectively 5.0, 2.0, and 9.0. The range of these three averages is 7.00. The bias correction factor for the range of three values is 1.693. Since each of these averages represents eight original data, we will have to multiply by the square root of 8 and divide by the bias correction factor to estimate the standard deviation parameter for the distribution of X. When we do this with the values above we obtain an estimate of SD(X) of 11.693. The standard deviation statistic for the three subgroup averages is 3.512. Dividing by the bias correction factor of 0.8862 and multiplying by the square root of 8 we obtain an unbiased estimate of the standard deviation of the distribution of X of 11.209.

Method Three (dispersion of the subgroup averages):
Range of the three averages = 7.000, Std. Dev. of the three averages = 3.512
d2 = 1.693, c4 = 0.8862
Est. SD(X) = 11.693 (from the range), Est. SD(X) = 11.209 (from the standard deviation)

Figure 14: Method Three with Data Set Two

These Method Three estimates of SD(X) are seven times larger than the values found in Figure 7. Thus, the signals introduced by shifting the subgroup averages have severely inflated both of the Method Three estimates.

When we summarize the results of the three methods with Data Set Two we get the table in Figure 15. We have obtained six unbiased estimates of SD(X) using three different methods and two different statistics, yet these six values differ by almost an order of magnitude!

                  Range Based             Std. Dev. Based
                  Est. SD(X)    c.v.      Est. SD(X)    c.v.
Method One          2.567      18.1%        3.315      14.7%
Method Two          1.522      16.5%        1.580      15.6%
Method Three       11.693      50.6%       11.209      50.0%

Figure 15: Summary of Three Methods for Data Set Two

The differences left to right in Figure 15 show the effects of using the different dispersion statistics. The differences top to bottom reveal the differences due to using the different methods. Clearly, the differences left to right pale in comparison with those top to bottom. The key to filtering out the noise so we can detect the signals does not depend upon whether we use the standard deviation statistic or the range, but rather upon which method we employ to compute that dispersion statistic.

Method One estimates of dispersion are commonly known as the Total Variation or the Overall Variation. Method One is used for description. It implicitly assumes that the data are globally homogeneous. When the data are not globally homogeneous this method will be inflated by the signals contained within the data and the value obtained will no longer estimate SD(X).

Figure 16: Total or Overall Variation

Method Two estimates of dispersion are commonly known as the Within-Subgroup Variation. Method Two is used for analysis. Whenever we seek to filter out the noise in order to detect signals we use Method Two to establish the filter. Method Two implicitly assumes that the data are homogeneous within the subgroups, but it places no requirement of homogeneity upon the different subgroups. Thus, even when the subgroups differ, Method Two will provide a useful estimate of SD(X).

Figure 17: Within-Subgroup Variation

Method Three estimates of dispersion are commonly known as the Between-Subgroup Variation. Method Three is used for comparison purposes. It assumes that the subgroup averages are globally homogeneous. When Method Three is computed it is generally compared with Method Two; the idea being that any signals present in the data will affect Method Three more than they affect Method Two. When the subgroups differ, Method Three will not provide an estimate of SD(X).

Figure 18: Between-Subgroup Variation

SEPARATING THE SIGNALS FROM THE NOISE

The essence of every statistical analysis is the separation of the signals from the noise. We want to find the signals so that we can use this knowledge constructively. We want to ignore the noise where there is nothing to be learned. To this end we begin by filtering out the noise. And for the past 100 years the standard technique for filtering out the noise has been Method Two!

To illustrate this point Figure 19 shows the average chart for Data Set Two with limits computed using each of the three methods. Only Method Two correctly identifies the two signals we deliberately buried in Data Set Two. So when it comes to filtering out the noise you have a choice between Method Two, Method Two, or Method Two. Any method is right as long as it is Method Two!

Method One is inappropriate for filtering out the noise because it gets inflated by the signals. Method One has always been wrong for analysis, and it will always be wrong. Trying to use Method One for analysis is so wrong that it has a name. It is known as Quetelet's Fallacy and it is the reason there was so little progress in statistical analysis in the Nineteenth Century.

Correct limits based on Method Two:     UCL = 6.95, central line = 5.33, LCL = 3.71
Incorrect limits based on Method One:   UCL = 8.85, central line = 5.33, LCL = 1.82
Incorrect limits based on Method Three: central line = 5.33, with limits far too wide to flag anything

Figure 19: Average Charts for Data Set Two

Method Three is completely inappropriate for filtering out the noise because it will be severely inflated in the presence of signals. If you use Method Three to filter out the noise you will have to wait a very long time before you detect a signal. So while there are analysis techniques that make use of the Method Three (Between-Subgroup) estimate of dispersion, they do so only in order to compare it with a Method Two (Within-Subgroup) estimate of dispersion.

Thus, the foundation of all modern data analysis techniques is the use of Method Two to filter out the noise. This is the foundation for the Analysis of Variance. This is the foundation for the Analysis of Means. And this is the foundation for Shewhart's process behavior charts. Ignore this foundation and you will undermine your whole analysis.

Many analysis techniques from the Nineteenth Century, such as Peirce's test for outliers, are built on the use of Method One to filter out the noise. As may be seen in Figure 19, this approach will let you occasionally detect a signal, but it will cause you to miss other signals. In fact, many techniques developed in the Twentieth Century also suffer from Quetelet's Fallacy. Among these are Grubbs' test for outliers, the Levey-Jennings control chart, and the Tukey control chart. Moreover, virtually every piece of statistical software available today allows the user to choose Method One for creating control charts and performing various other statistical tests. Nevertheless, this error on the part of naive programmers does not make it right or even acceptable to use Method One for analysis.
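To see numerically why only the Method Two limits flag both of the buried signals, one can convert each method's estimate of SD(X) into three-sigma limits for the subgroup averages and check the three averages of Data Set Two against them. The sketch below is an illustration only; it uses the standard-deviation-based estimates throughout, so its Method Two limits differ slightly from the range-based limits shown in Figure 19.

import math
import statistics

data_two = [[4, 5, 4, 8, 4, 5, 3, 7],
            [0, 2, 1, 5, 3, 2, 0, 3],
            [7, 10, 10, 8, 9, 8, 10, 10]]

n = 8
c4_24, c4_8, c4_3 = 0.9892, 0.9650, 0.8862   # bias corrections for s from 24, 8, and 3 values

averages = [statistics.mean(sg) for sg in data_two]      # 5.0, 2.0, 9.0
grand_average = statistics.mean(averages)                # 5.33
pooled = [v for sg in data_two for v in sg]

estimates = {
    "Method One  ": statistics.stdev(pooled) / c4_24,                                  # about 3.3
    "Method Two  ": statistics.mean(statistics.stdev(sg) for sg in data_two) / c4_8,   # about 1.6
    "Method Three": statistics.stdev(averages) / c4_3 * math.sqrt(n),                  # about 11.2
}

for name, sd_x in estimates.items():
    half_width = 3 * sd_x / math.sqrt(n)        # three-sigma limits for subgroup averages
    lower, upper = grand_average - half_width, grand_average + half_width
    outside = [a for a in averages if a < lower or a > upper]
    print(f"{name}: limits {lower:6.2f} to {upper:5.2f}, subgroup averages outside: {outside}")
# Method One flags only the high average, Method Three flags nothing,
# and only Method Two flags both of the buried signals.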

So while there are proper uses of Method One and Method Three, they are never appropriate for filtering out the noise. The only correct method for filtering out the noise is Method Two. Understanding this point is the beginning of competence for every data analyst.

You now know the difference between modern data analysis techniques and naive analysis techniques. Naive techniques use Method One or Method Three to filter out the noise. Today all sorts of new naive techniques are being created by those who know no better. Let the user beware. To help with this problem of identifying naive techniques Figure 20 contains a listing of 27 of the more commonly encountered within-subgroup estimators of both the standard deviation parameter and the variance parameter. There we see the hallmark of the within-subgroup approach: each estimator is based on either the average or the median of a collection of k within-subgroup measures of dispersion. Method One and Method Three each use a single measure of dispersion.

Now you know the importance of using the right method, and you know what the right method will look like in practice. While this may be more than you ever wanted to know about statistics, it is essential knowledge for all who seek to understand their data.

Figure 20: Some Within-Subgroup Estimators (biased and unbiased estimators of SD(X) and V(X) based on the average range, the median range, the average moving range, the median moving range, the average root mean square deviation, the median root mean square deviation, the average standard deviation, the median standard deviation, and the pooled variance)

This article is based on material found in Advanced Topics in Statistical Process Control, Second Edition, 2004, SPC Press. Used with permission.