- The Problem: What if it's not a bell curve?
- The Solution (and it's not the Central Limit Theorem or transformations)
- Nonnormal Sample Averages (x-bar chart)
- Nonnormal Range or Standard Deviation (R or s chart)
- Process capability indices for nonnormal distributions
- My papers on this subject
Statistical Process Control for Nonnormal Distributions
NEW (April 2010): "Statistical Process Control for Real-World
Applications" (Taylor and
Francis, available December 2010)
This book covers the material shown on this page in detail. In
addition, it includes:
- Improvements on the traditional R and s charts for the
normal distribution; calculation of exact quantiles of the sample range
and sample standard deviation allow the user to set control limits with
known risks (e.g. 0.00135 for Shewhart-equivalent risks)
- Distribution fitting for two and three parameter gamma and
Weibull functions
- Distribution fitting of left-censored distributions, for
applications in which there is a lower detection limit for the
instrument involved.
- Multiple attribute control charts that use exact control
limits for the binomial and Poisson distributions, thus rendering the
traditional p, np, c and u charts (which rely on the normal
approximation) obsolete. These are easily deployable to spreadsheets
with no programming beyond the spreadsheet functions themselves.
- Range charts for the gamma and Weibull distributions
- Confidence limits for the nonconforming fraction (and thus
the process performance index) for normal, Weibull, and tentatively
gamma distributions
- Effect of gage reproducibility and repeatability on control
charts and process performance indices
- A chapter on multivariate control charts
We plan to include either a user disk or a
Web site that contains an extensive set of Visual Basic for
Applications functions to support the content of the book; these can be
loaded into Microsoft Excel or any other spreadsheet that supports VBA.
Among these are:
- Quantile quantile plots for normal and nonnormal
distributions
- Chi square goodness of fit tests for normal and nonnormal
distributions
- Distribution fitting for the gamma and Weibull
distributions (including threshold parameters)
- Distribution fitting for left-censored Weibull and gamma
distributions, for applications in which there are lower detection
limits for a quality characteristic such as a trace impurity
- Cumulative distribution functions for the sample ranges of
the normal, gamma, and Weibull distributions. This technically makes
the traditional R chart (whose control limits rely on a normal
approximation) obsolete; the user can select control limits with an
exact Shewhart-equivalent risk of 0.00135 at each end, or anything else
he or she chooses.
- Confidence intervals for the nonconforming fraction (tail
area) of the normal, Weibull, and tentatively [1] gamma distributions,
thus allowing the user to quote confidence limits on the process
performance index.
[1] There is very little precedent for the
gamma distribution, although we successfully reproduced the results
from the only example of which we know (J. Lawless' book on reliability
statistics includes an example).
Levinson
Productivity Systems, P.C. offers consulting services for processes
whose quality characteristics do not follow the normal distribution. We
can set up control charts (X or x-bar for the gamma distribution, X for
the Weibull distribution, and R for both) with known and exact false
alarm rates, as well as tests of goodness of fit for the selected
distribution.
- The range charts do NOT rely on
normal approximations to the distribution of the actual range; they use
the actual cumulative distribution function of the range.
- We can also calculate point estimates of the process performance indices of these distributions, and confidence intervals for the performance index of a Weibull distribution.
- Little work has been done on confidence intervals for the survivor function of a gamma distribution, although we were able to reproduce the results from the only example of which we know.
It is fairly easy to set up and use
statistical process control
(SPC) charts for manufacturing processes that conform to the normal
(bell
curve, Gaussian) distribution. The equations are quite familiar, as are
those for the capability indices.
Shewhart control limits: the center line plus and minus three standard deviations of the plotted statistic.
Process capability indices:
- Cp = (USL - LSL) / (6 sigma)
- CPU = (USL - mean) / (3 sigma)
- CPL = (mean - LSL) / (3 sigma)
- Cpk = min(CPU, CPL)
CPU and CPL reflect the process's ability to meet the upper and lower specification limits (USL, LSL) respectively, and Cpk is their minimum. When Cp=2, we have a "Six Sigma" process, one that (if centered on the nominal) has six standard deviations between each specification limit and the mean. Such a process will have two parts per billion nonconforming (1 ppb at each end). Motorola assumes the possibility of a 1.5 sigma shift, which will result in 3.4 ppm nonconforming.
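The normal-theory arithmetic can be checked with a short Python sketch. It is only an illustration: the function names and the example process (sigma = 1 with specification limits at -6 and +6) are assumptions chosen to reproduce the Six Sigma figures quoted above, and the formulas are the standard textbook ones.

```python
from math import erfc, sqrt

def normal_upper_tail(z):
    """P(Z > z) for a standard normal deviate (erfc stays accurate far into the tail)."""
    return 0.5 * erfc(z / sqrt(2.0))

def capability(mean, sigma, lsl, usl):
    """Normal-theory capability indices and the nonconforming fraction they imply."""
    cp = (usl - lsl) / (6.0 * sigma)
    cpu = (usl - mean) / (3.0 * sigma)
    cpl = (mean - lsl) / (3.0 * sigma)
    cpk = min(cpu, cpl)
    # Each index times 3 is the number of standard deviations to that limit.
    nonconforming = normal_upper_tail(3.0 * cpu) + normal_upper_tail(3.0 * cpl)
    return {"Cp": cp, "CPU": cpu, "CPL": cpl, "Cpk": cpk,
            "nonconforming": nonconforming}

# Hypothetical example process: centered, then shifted by 1.5 sigma.
centered = capability(mean=0.0, sigma=1.0, lsl=-6.0, usl=6.0)
shifted = capability(mean=1.5, sigma=1.0, lsl=-6.0, usl=6.0)

print(f"centered: Cp={centered['Cp']:.2f}, "
      f"{centered['nonconforming'] * 1e9:.2f} ppb nonconforming")
print(f"shifted:  Cpk={shifted['Cpk']:.2f}, "
      f"{shifted['nonconforming'] * 1e6:.2f} ppm nonconforming")
```

The centered case should come out near 2 ppb and the shifted case near 3.4 ppm, but only because the normal distribution is assumed throughout.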
The Problem: What if it's not a bell curve?
Here is a control chart for a
process with 100 randomly-generated
numbers from the same distribution, i.e. a process that is in control.
(Neither the mean nor the standard deviation changed during the
simulation.)
The mean is, incidentally, 2, and the standard deviation is the square
root of 2. What's wrong with this picture?
[Figure: Shewhart control chart of the 100 simulated points]
Two points are out of control. The false alarm rate should be 0.135% at each control limit, or 0.27% per point, so with 100 points we expect 0.27 false alarms and there is about a 24% chance of getting at least one. Two are far less likely (about 3%).
Furthermore, there is a problem with the way the points scatter around the center line.
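The false alarm arithmetic can be made explicit with the binomial distribution. The sketch below assumes independent points and a normal process with a 0.0027 combined risk per point (0.00135 at each limit); the function name is hypothetical.

```python
from math import comb

def prob_false_alarms(k, n_points=100, risk_per_point=0.0027):
    """Binomial probability of exactly k false alarms among n_points in-control points."""
    return (comb(n_points, k)
            * risk_per_point ** k
            * (1.0 - risk_per_point) ** (n_points - k))

p_none = prob_false_alarms(0)
p_at_least_one = 1.0 - p_none
p_two_or_more = 1.0 - p_none - prob_false_alarms(1)

print(f"P(at least one false alarm) = {p_at_least_one:.3f}")
print(f"P(two or more false alarms) = {p_two_or_more:.3f}")
```

Under these assumptions, two or more false alarms in 100 points is roughly a 3% event, which is why the chart deserves a second look.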
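The answer is that the underlying distribution looks like this:
[Figure: probability density function of a gamma distribution with a long upper tail]
- alpha is the shape parameter
- gamma is the scale parameter (it enters as a rate: the larger gamma is, the smaller the mean and the variance)
- delta is the threshold parameter (in reliability statistics, the guarantee time)
With delta = 0, the mean is alpha/gamma and the variance is alpha/gamma^2. Here alpha = 2 and gamma = 1, so the mean is 2 and the standard deviation is the square root of 2.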
The long upper tail accounts for the two supposedly out of control points in the traditional Shewhart chart. The false alarm risk of the upper control limit is not 0.00135 but 0.014, more than 10 times what we expect.
The false alarm risk for the upper control limit is simply 1-F(UCL), where F(x) is the cumulative distribution function for this gamma distribution. Here's how to do this in MathCAD:
[Figure: MathCAD worksheet defining F(x), Q(p), and rnd_gamma(z)]
- F(x) = cumulative distribution function
- Q(p) = quantile function. A guess must be provided for x1.
- rnd_gamma(z) generates a random number from the indicated gamma distribution. z is a dummy variable. Note that rnd(1) returns a random number from the uniform distribution [0,1], i.e. a random quantile.
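For readers without MathCAD, the same three functions can be sketched in plain Python. This is an illustrative equivalent rather than a transcription of the worksheet: the function names, the series expansion used for F(x), and the bisection solver (which needs no starting guess) are assumptions. The parameters are alpha = 2, gamma = 1 (used as a rate), and delta = 0, which give the mean of 2 and standard deviation of sqrt(2) stated above.

```python
import random
from math import exp, lgamma, log

def gamma_cdf(x, alpha, rate, delta=0.0):
    """F(x): regularized lower incomplete gamma function via its power series."""
    z = rate * (x - delta)
    if z <= 0.0:
        return 0.0
    term = 1.0 / alpha
    total = term
    k = 1
    while term > 1e-17 * total:   # stop once further terms no longer matter
        term *= z / (alpha + k)
        total += term
        k += 1
    return min(1.0, total * exp(alpha * log(z) - z - lgamma(alpha)))

def gamma_quantile(p, alpha, rate, delta=0.0):
    """Q(p): solve F(x) = p by bisection."""
    lo, hi = delta, delta + 1.0
    while gamma_cdf(hi, alpha, rate, delta) < p:   # widen the bracket
        hi = delta + 2.0 * (hi - delta)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, alpha, rate, delta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rnd_gamma(alpha, rate, delta=0.0, rng=random):
    """Random gamma variate: the quantile of a uniform random number."""
    return gamma_quantile(rng.random(), alpha, rate, delta)

ALPHA, RATE = 2.0, 1.0
ucl = ALPHA / RATE + 3.0 * ALPHA ** 0.5 / RATE      # Shewhart UCL: mean + 3 sigma
false_alarm_risk = 1.0 - gamma_cdf(ucl, ALPHA, RATE)
print(f"Shewhart UCL = {ucl:.3f}, false alarm risk = {false_alarm_risk:.4f}")
```

The series converges quickly for the moderate arguments used here; a production implementation would more likely call a statistics library's gamma functions.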
The Solution (it's not the central limit theorem or transformations!)
The central limit theorem (CLT) says that, if we take a big enough sample, the sample averages will follow an approximately normal distribution no matter what the individual measurements do (provided their variance is finite). There are, however, several problems with relying on this, or on transformations:
- It might not be convenient or possible to take a large sample. Some measurements, like impurity levels (in chemicals) or particle counts (in semiconductor processing equipment), yield only individual measurements. This gets into the subject of the rational subgroup. Five impurity measurements from one chemical batch do not constitute five independent representations of the process! They do not reflect the between-batch variation; they reflect only the within-batch variation (plus any gage repeatability variation).
- Individual measurements,
not averages, are in or out of specification. We
cannot get accurate capability indices unless we use the underlying
statistical
distribution.
- Transformed data may behave sufficiently normally for SPC purposes, but transformations do not, in our experience, yield accurate process performance indices.
- We have an example in which
a Johnson transformation of a gamma distribution yields a beautiful
normal probability plot, but the estimated nonconforming fraction is
four times too large.
- We have another example (Levinson,
Stensney, Webb, and Glahn, 2001)
in which the square root transformation of a gamma distribution yields
data that are apparently normal, and the square root of the
particle count data might work quite well on a control chart with
normal control limits. The estimated nonconforming fraction is,
however, off by an order of magnitude.
Recall that the Shewhart control
chart yields a 0.135% false
alarm risk at each end. We can set Shewhart-equivalent limits for a
nonnormal
distribution:
- Upper control limit:
0.99865 quantile, for a 0.00135 false
alarm risk at the upper end
- Center line: 0.50
quantile or median. (The mean is
the median of the normal distribution.) This allows us to use the
Western
Electric Zone C test for runs of eight consecutive points above or
below
the center line.
- Calculation of the
appropriate quantiles for the Zone A and
Zone B tests is also possible.
- Lower control limit:
0.00135 quantile, for a 0.00135 false
alarm risk at the lower end.
While
the false alarm (Type I) risks will be the same as for a Shewhart
chart, the average run lengths will not. These must be computed from
the distribution itself. The fact that a shift in process "mean" could
be reflected by more than one parameter of a nonnormal distribution is
likely to make this a complicated exercise.
[Figure: control chart with control limits set from the gamma distribution]
This looks a lot better. The upper control limit, 8.9, is the 0.99865 quantile of this distribution. The median, 1.68, is a little below the mean.
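The Shewhart-equivalent limits, including the zone boundaries, can be sketched as follows. The approach is an assumption for illustration: each boundary is the gamma quantile at the same cumulative probability that the normal distribution assigns to -3, -2, -1, 0, 1, 2, and 3 sigma. The helper names are hypothetical, and the closed-form CDF used here is valid only for an integer shape parameter (alpha = 2 in this example, with gamma = 1 and delta = 0).

```python
from math import exp
from statistics import NormalDist

def erlang_cdf(x, shape, rate):
    """CDF of a gamma distribution with INTEGER shape (closed-form Poisson sum)."""
    z = rate * x
    if z <= 0.0:
        return 0.0
    term = total = 1.0
    for k in range(1, shape):
        term *= z / k
        total += term
    return 1.0 - exp(-z) * total

def erlang_quantile(p, shape, rate):
    """Quantile by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, shape, rate) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, shape, rate) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

SHAPE, RATE = 2, 1.0
labels = {-3: "LCL", -2: "zone A/B", -1: "zone B/C", 0: "center line",
          1: "zone C/B", 2: "zone B/A", 3: "UCL"}

limits = {}
for z in sorted(labels):
    p = NormalDist().cdf(z)          # e.g. about 0.00135 at z = -3, 0.5 at z = 0
    limits[z] = erlang_quantile(p, SHAPE, RATE)
    print(f"{labels[z]:>11}: p = {p:.5f}, x = {limits[z]:.3f}")
```

The center line and UCL produced this way should agree with the 1.68 and 8.9 quoted above. Note how unevenly the zones are spaced compared with a normal chart.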
Nonnormal Sample Averages (x-bar chart)
How should sample averages from nonnormal distributions be treated? If
the samples are big enough, the Central Limit Theorem applies. In
intermediate cases, it is still desirable to use the actual
distribution. We are aware of the following relationships to date:
Gamma Distribution

The average of n measurements from a gamma distribution with parameters
alpha and gamma follows a gamma distribution with parameters n*alpha
and n*gamma. Note that the variance is 1/n times the variance of an
individual measurement, which is as expected.
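As an illustration of this relationship, the sketch below sets x-bar chart limits for the example distribution (alpha = 2, gamma = 1, delta = 0) and compares them with normal-theory limits. The subgroup size n = 5 is a hypothetical choice, as are the helper names, and the closed-form CDF requires n*alpha to be an integer.

```python
from math import exp, sqrt

def erlang_cdf(x, shape, rate):
    """CDF of a gamma distribution with INTEGER shape (closed-form Poisson sum)."""
    z = rate * x
    if z <= 0.0:
        return 0.0
    term = total = 1.0
    for k in range(1, shape):
        term *= z / k
        total += term
    return 1.0 - exp(-z) * total

def erlang_quantile(p, shape, rate):
    """Quantile by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, shape, rate) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, shape, rate) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

ALPHA, RATE, N = 2, 1.0, 5             # N = 5 is an assumed subgroup size
shape_bar, rate_bar = N * ALPHA, N * RATE   # distribution of the sample average

mean = ALPHA / RATE
sigma_bar = sqrt(ALPHA) / RATE / sqrt(N)    # standard deviation of the average
normal_lcl, normal_ucl = mean - 3.0 * sigma_bar, mean + 3.0 * sigma_bar

exact_lcl = erlang_quantile(0.00135, shape_bar, rate_bar)
exact_cl = erlang_quantile(0.5, shape_bar, rate_bar)
exact_ucl = erlang_quantile(0.99865, shape_bar, rate_bar)
risk_at_normal_ucl = 1.0 - erlang_cdf(normal_ucl, shape_bar, rate_bar)

print(f"normal-theory limits: {normal_lcl:.3f} to {normal_ucl:.3f}")
print(f"gamma limits:         {exact_lcl:.3f} to {exact_ucl:.3f} (median {exact_cl:.3f})")
print(f"actual risk above the normal-theory UCL: {risk_at_normal_ucl:.4f}")
```

With only five measurements per subgroup, the averages are still noticeably skewed, so the normal-theory upper limit carries several times the intended 0.00135 risk.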
Nonnormal Ranges or Standard Deviations (R or s chart)
It is in fact possible to define range charts whose control limits have exact false alarm risks.
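One way to obtain such limits is the standard order-statistics result for the range of n independent measurements, F_R(r) = n * integral of f(x) [F(x+r) - F(x)]^(n-1) dx. The sketch below is an illustration only and not necessarily the method used in the book: it evaluates the integral with Simpson's rule for the example gamma distribution (alpha = 2, gamma = 1, delta = 0) and an assumed subgroup size of 4. The function names, integration range, and step count are hypothetical choices.

```python
from math import exp

def range_cdf(r, n, pdf, cdf, upper=40.0, steps=2000):
    """P(range <= r) for n independent draws, integrated by Simpson's rule.

    'upper' must be far enough into the tail that pdf(upper) is negligible,
    and 'steps' must be even.
    """
    if r <= 0.0:
        return 0.0
    h = upper / steps

    def g(x):
        return pdf(x) * (cdf(x + r) - cdf(x)) ** (n - 1)

    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return n * total * h / 3.0

def range_quantile(p, n, pdf, cdf, hi=40.0):
    """Quantile of the range by bisection on range_cdf."""
    lo = 0.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if range_cdf(mid, n, pdf, cdf) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gamma2_pdf(x):
    """Density of the example gamma distribution (shape 2, rate 1, threshold 0)."""
    return x * exp(-x) if x > 0.0 else 0.0

def gamma2_cdf(x):
    """CDF of the example gamma distribution."""
    return 1.0 - (1.0 + x) * exp(-x) if x > 0.0 else 0.0

n = 4   # assumed subgroup size
r_lcl = range_quantile(0.00135, n, gamma2_pdf, gamma2_cdf)
r_cl = range_quantile(0.5, n, gamma2_pdf, gamma2_cdf)
r_ucl = range_quantile(0.99865, n, gamma2_pdf, gamma2_cdf)
print(f"R chart: LCL = {r_lcl:.3f}, center (median) = {r_cl:.3f}, UCL = {r_ucl:.3f}")
```

Because range_cdf takes the density and CDF as arguments, the same sketch could be pointed at a normal or Weibull distribution by swapping those two functions.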
The Process Capability Index
The less people know
about how laws and sausages
are made, the better off they are. (Otto von Bismarck)
Does that Cpk index from your supplier (or your own production line) resemble a sausage? What you don't know can hurt you, by several orders of magnitude. We saw previously that the normal-theory calculation is rather straightforward, and this is how most commercially available SPC software does it.
What does the capability index
measure? It reflects
the number of standard deviations between the specification limit and
the
mean. And that number (the standard normal deviate), in turn,
should
reflect the nonconforming fraction: the proportion of units that will
be
outside the specification limit. This calculation is quite
straightforward
and well-established for the normal distribution.
In this example, suppose for
simplicity that the upper
specification limit is 6.243 (the mean plus three standard deviations,
the same as the UCL). This is not a capable process as the CPU is only
1. We have already seen that, if the distribution is normal, the
nonconforming
fraction above the USL will be 0.00135. We also saw that, for this
gamma
distribution, it's really 0.014, more than ten times as much!
(Furthermore,
since it's impossible to get less than zero from this distribution,
there
may well be no lower specification-- which is true for impurity levels
and particle counts. We don't care how few impurities are in
the
product! In this case, the Cp and CPL indices don't exist, and Cpk=CPU.)
A capability index report that
assumes a normal distribution
can be off by several orders of magnitude (in terms of reflecting the
nonconforming
fraction) when the underlying distribution is nonnormal.
We can report an equivalent
process capability index
for a nonnormal distribution. This is simply the capability index of a
normal distribution that would yield the same nonconforming fraction.
A nonconforming fraction of 0.0141 would result from a normal distribution whose USL was 2.195 standard normal deviates above its mean. This corresponds to a CPU of 0.732 (2.195 divided by 3). This process has an equivalent Cpk of 0.732, which is even worse than the 1.00 reported.
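The equivalent index can be reproduced in a few lines. This sketch assumes the example gamma distribution (alpha = 2, gamma = 1, delta = 0) and uses the closed-form upper tail area that holds for a shape parameter of exactly 2; the variable names are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

ALPHA, RATE = 2, 1.0
mean = ALPHA / RATE
sigma = sqrt(ALPHA) / RATE
usl = mean + 3.0 * sigma                      # 6.243, as in the example

# Normal-theory index: standard deviations between the mean and the USL, over 3.
cpu_normal = (usl - mean) / (3.0 * sigma)

# Actual tail area above the USL for a gamma distribution with shape 2:
# P(X > x) = (1 + rate*x) * exp(-rate*x)
nonconforming = (1.0 + RATE * usl) * exp(-RATE * usl)

# Equivalent index: the normal deviate with the same upper-tail area, over 3.
z_equiv = NormalDist().inv_cdf(1.0 - nonconforming)
cpu_equiv = z_equiv / 3.0

print(f"nonconforming fraction above USL = {nonconforming:.4f}")
print(f"normal-theory CPU = {cpu_normal:.3f}, equivalent CPU = {cpu_equiv:.3f}")
```

For other shape parameters the tail area would come from a general gamma CDF instead of the closed form, but the conversion to an equivalent index is unchanged.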
Update: The above approach is now described by the Automotive Industry
Action Group (AIAG). Reference: ANSI/ASQ B1-B3-1996, Guide
for Quality Control Charts/ Control Chart Method of Analyzing Data/ Control Chart Method for Controlling Quality
During Production, pages 142-143

Our papers on this subject:
- Levinson, W. "Watch Out for Nonnormal Distributions of Impurities," Chemical Engineering Progress, May 1997, pp. 70-76.
- Levinson, W. "Approximate Confidence Limits for Cpk and Confidence Limits for Non-Normal Process Capabilities," Quality Engineering, 9(4), 635-640 (1997).
- Levinson, William and Polny, Angela. "SPC for Tool Particle Counts," Semiconductor International, June 1999.
- Levinson, W. "SPC for Real-World Processes: What to do when the Normality Assumption Doesn't Work." Presented at the ASQ's Annual Quality Conference (2000) in Indianapolis.
- Levinson, W.A., Stensney, Frank, Webb, Raymond, and Glahn, Ronald. 2001. "SPC for Particle Counts," Semiconductor International, 10/01.