Levinson Productivity Systems, P.C.
William A. Levinson, P.E.  Principal
The Problem: What if it's not a bell curve?

The Solution (and it's not the Central Limit Theorem or transformations)

Nonnormal Sample Averages (x-bar chart)

Nonnormal Range or Standard Deviation (R or s chart)

Process capability indices for nonnormal distributions

My papers on this subject

Statistical Process Control for Nonnormal Distributions

NEW (April 2010): "Statistical Process Control for Real-World Applications" (Taylor and Francis, available December 2010)
This book covers the material shown on this page in detail. In addition, it includes:
  • Improvements on the traditional R and s charts for the normal distribution; calculation of exact quantiles of the sample range and sample standard deviation allows the user to set control limits with known risks (e.g. 0.00135 for Shewhart-equivalent risks)
  • Distribution fitting for two and three parameter gamma and Weibull functions
    • Distribution fitting of left-censored distributions, for applications in which there is a lower detection limit for the instrument involved.
  • Multiple attribute control charts that use exact control limits for the binomial and Poisson distributions, thus rendering the traditional p, np, c and u charts (which rely on the normal approximation) obsolete. These are easily deployable to spreadsheets with no programming beyond the spreadsheet functions themselves.
  • Range charts for the gamma and Weibull distributions
  • Confidence limits for the nonconforming fraction (and thus the process performance index) for normal, Weibull, and tentatively gamma distributions
  • Effect of gage reproducibility and repeatability on control charts and process performance indices
  • A chapter on multivariate control charts
We plan to include either a user disk or a Web site that contains an extensive set of Visual Basic for Applications functions to support the content of the book; these can be loaded into Microsoft Excel or any other spreadsheet that supports VBA. Among these are:
  • Quantile-quantile plots for normal and nonnormal distributions
  • Chi-square goodness of fit tests for normal and nonnormal distributions
  • Distribution fitting for the gamma and Weibull distributions (including threshold parameters)
    • Distribution fitting for left-censored Weibull and gamma distributions, for applications in which there are lower detection limits for a quality characteristic such as a trace impurity
  • Cumulative distribution functions for the sample ranges of the normal, gamma, and Weibull distributions. This technically makes the traditional R chart (whose control limits rely on a normal approximation) obsolete; the user can select control limits with an exact Shewhart-equivalent risk of 0.00135 at each end, or anything else he or she chooses.
  • Confidence intervals for the nonconforming fraction (tail area) of the normal, Weibull, and tentatively [1] gamma distributions, thus allowing the user to quote confidence limits on the process performance index.
[1] There is very little precedent for the gamma distribution, although we successfully reproduced the results from the only example of which we know (J. Lawless' book on reliability statistics includes an example).

Levinson Productivity Systems, P.C. offers consulting services for processes whose quality characteristics do not follow the normal distribution. We can set up control charts (X or x-bar for the gamma distribution, X for the Weibull distribution, and R for both) with known and exact false alarm rates, as well as tests of goodness of fit for the selected distribution.
  • The range charts do NOT rely on normal approximations to the distribution of the actual range; they use the actual cumulative distribution function of the range.
  • We can also calculate point estimates of the process performance indices of these distributions, and confidence intervals for the performance index of a Weibull distribution.
    • Little work has been done on confidence intervals for the survivor function of a gamma distribution, although we were able to reproduce the results from the only example of which we know.
It is fairly easy to set up and use statistical process control (SPC) charts for manufacturing processes that conform to the normal (bell curve, Gaussian) distribution. The equations are quite familiar, as are those for the capability indices.

Shewhart control limits
Process capability indices

CPU and CPL reflect the process' ability to meet the upper and lower specification limits (USL, LSL) respectively, and Cpk is their minimum. When Cp=2, we have a "Six Sigma" process, one that (if centered on the nominal) has six standard deviations between each specification limit and the mean. Such a process will have two parts per billion nonconforming (1 ppb at each end). Motorola assumes the possibility of a 1.5 sigma shift, which will result in 3.4 ppm nonconforming.
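The standard textbook forms of these limits and indices can be sketched in Python as follows. This is a minimal illustration, not the original equations from this page; the function names and the example numbers (a unit standard deviation with specification limits at plus and minus six) are assumptions chosen to reproduce the Six Sigma figures quoted above.

```python
import math

def shewhart_limits(mean, sigma, n=1):
    """Return (LCL, center line, UCL) as mean +/- 3*sigma/sqrt(n).

    n=1 gives limits for individuals (X chart); n>1 gives x-bar limits.
    """
    half_width = 3.0 * sigma / math.sqrt(n)
    return mean - half_width, mean, mean + half_width

def capability_indices(mean, sigma, lsl, usl):
    """Standard normal-theory capability indices."""
    cp = (usl - lsl) / (6.0 * sigma)
    cpu = (usl - mean) / (3.0 * sigma)
    cpl = (mean - lsl) / (3.0 * sigma)
    return {"Cp": cp, "CPU": cpu, "CPL": cpl, "Cpk": min(cpu, cpl)}

def normal_upper_tail(z):
    """P(Z > z) for a standard normal deviate, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def nonconforming_fraction(mean, sigma, lsl, usl):
    """Fraction outside [lsl, usl] if the process really is normal."""
    return (normal_upper_tail((usl - mean) / sigma)
            + normal_upper_tail((mean - lsl) / sigma))

# Hypothetical example: sigma = 1, specification limits at -6 and +6.
centered = capability_indices(0.0, 1.0, -6.0, 6.0)
ppb_centered = nonconforming_fraction(0.0, 1.0, -6.0, 6.0) * 1e9
ppm_shifted = nonconforming_fraction(1.5, 1.0, -6.0, 6.0) * 1e6

print(centered)
print(f"centered: {ppb_centered:.2f} ppb nonconforming")
print(f"1.5 sigma shift: {ppm_shifted:.2f} ppm nonconforming")
```

The centered case comes out at roughly 2 ppb and the shifted case at roughly 3.4 ppm, which matches the figures in the text. All of this holds only if the normality assumption is valid.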

The Problem: What if it's not a bell curve?

Here is a control chart for a process with 100 randomly-generated numbers from the same distribution, i.e. a process that is in control. (Neither the mean nor the standard deviation changed during the simulation.) The mean is, incidentally, 2, and the standard deviation is the square root of 2. What's wrong with this picture?

There are two points that are out of control. The false alarm rate should be 0.135% at each control limit so, with 100 points, there's only about a 24% chance of getting even one false alarm (0.27 false alarms are expected on average). Two are far less likely.
Furthermore, there is a problem with the way the points scatter around the center line.
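The false alarm arithmetic above is a simple binomial calculation. The short sketch below assumes a normal process with independent points and a 0.00135 risk at each limit; the variable names are illustrative.

```python
# False alarm arithmetic for an in-control normal process (assumed independent points).
risk_each_limit = 0.00135          # Shewhart risk at each control limit
points = 100

p_point = 2.0 * risk_each_limit    # chance that any one point is outside either limit
expected_false_alarms = points * p_point

# Binomial probabilities for the number of false alarms in 100 points.
p_none = (1.0 - p_point) ** points
p_exactly_one = points * p_point * (1.0 - p_point) ** (points - 1)
p_at_least_one = 1.0 - p_none
p_two_or_more = 1.0 - p_none - p_exactly_one

print(f"expected false alarms: {expected_false_alarms:.2f}")
print(f"P(at least one):       {p_at_least_one:.3f}")
print(f"P(two or more):        {p_two_or_more:.3f}")
```

Under these assumptions, two or more false alarms in 100 points has a probability of only about 3%, which is why two out-of-control points from an unchanged process are suspicious.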

The answer is that the underlying distribution looks like this:

  • alpha is the shape parameter
  • gamma is the scale parameter
  • delta is the threshold parameter (in reliability statistics, the guarantee time)
The mean is alpha/gamma and the variance is alpha/gamma^2 (taking delta = 0; a nonzero threshold adds delta to the mean). With alpha = 2 and gamma = 1, the mean is 2 and the standard deviation is the square root of 2.
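For reference, the three-parameter gamma density that is consistent with these parameter names and moments can be written as below. This is the usual textbook form and is an assumption about what the original figure showed; note that gamma enters as a rate (inverse scale), which is why the mean is alpha/gamma.

```latex
f(x) = \frac{\gamma^{\alpha}}{\Gamma(\alpha)}\,(x-\delta)^{\alpha-1}\,e^{-\gamma (x-\delta)},
\qquad x \ge \delta

E[X] = \delta + \frac{\alpha}{\gamma}, \qquad
\operatorname{Var}[X] = \frac{\alpha}{\gamma^{2}}

\text{With } \delta = 0:\quad
\frac{\alpha}{\gamma} = 2,\ \frac{\alpha}{\gamma^{2}} = 2
\;\Rightarrow\; \alpha = 2,\ \gamma = 1,
\quad f(x) = x\,e^{-x}, \quad F(x) = 1 - (1+x)\,e^{-x}
```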

The long upper tail accounts for the two supposedly out of control points in the traditional Shewhart chart. The false alarm risk of the upper control limit is not 0.00135 but 0.014, more than 10 times what we expect.

The false alarm risk for the upper control limit is simply 1-F(UCL), where F(x) is the cumulative distribution function for this gamma distribution. Here's how to do this in MathCAD:

F(x)= cumulative distribution
Q(p)= quantile function. A guess must be provided for x1.
rnd_gamma(z) generates a random number from the indicated gamma distribution. z is a dummy variable. Note that rnd(1) returns a random number from the uniform distribution [0,1], i.e. a random quantile.
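For readers without MathCAD, the same three functions can be sketched in Python using only the standard library. This is not a transcription of the original worksheet: the function names, the power-series evaluation of the cumulative distribution, and the bisection search (which needs a bracket rather than an initial guess) are all assumptions made for this sketch. The parameters alpha = 2, gamma = 1, delta = 0 are the ones implied by a mean of 2 and a standard deviation of the square root of 2.

```python
import math
import random

def gamma_cdf(x, alpha, rate, delta=0.0):
    """F(x): regularized lower incomplete gamma function, summed as a power series."""
    t = rate * (x - delta)
    if t <= 0.0:
        return 0.0
    term = 1.0 / alpha            # first series term
    total = term
    k = 1
    while term > 1e-16 * total:   # all terms are positive, so this terminates
        term *= t / (alpha + k)
        total += term
        k += 1
    return total * math.exp(alpha * math.log(t) - t - math.lgamma(alpha))

def gamma_quantile(p, alpha, rate, delta=0.0):
    """Q(p): invert F by bisection."""
    lo = delta
    hi = delta + (alpha + 1.0) / rate
    while gamma_cdf(hi, alpha, rate, delta) < p:   # expand until F(hi) >= p
        hi = delta + 2.0 * (hi - delta)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, alpha, rate, delta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rnd_gamma(rng, alpha, rate, delta=0.0):
    """Random gamma variate as the quantile of a uniform random number."""
    return gamma_quantile(rng.random(), alpha, rate, delta)

alpha, rate = 2.0, 1.0
ucl_shewhart = alpha / rate + 3.0 * math.sqrt(alpha) / rate   # mean + 3 sigma
false_alarm_risk = 1.0 - gamma_cdf(ucl_shewhart, alpha, rate)
ucl_exact = gamma_quantile(0.99865, alpha, rate)
median = gamma_quantile(0.5, alpha, rate)
sample = rnd_gamma(random.Random(1), alpha, rate)

print(f"Shewhart UCL {ucl_shewhart:.3f} has false alarm risk {false_alarm_risk:.4f}")
print(f"0.99865 quantile {ucl_exact:.2f}, median {median:.2f}")
```

The standard library's random.gammavariate(alpha, 1/rate) is a faster way to generate variates when delta = 0; the quantile-based generator is shown only because it mirrors the approach described above.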

The Solution (it's not the central limit theorem or transformations!)

The central limit theorem (CLT) says that, if we take a big enough sample, the sample averages will follow a normal distribution no matter what the individual measurements do. Neither it nor a transformation of the data, however, solves the problem:

  1. It might not be convenient or possible to take a large sample. Some measurements, like impurity levels (in chemicals) or particle counts (in semiconductor processing equipment), yield only individual measurements. This gets into the subject of the rational subgroup. Five impurity measurements from one chemical batch do not constitute five independent representations of the process! They do not reflect the between-batch variation; they reflect only the within-batch variation (plus any gage repeatability variation).
  2. Individual measurements, not averages, are in or out of specification. We cannot get accurate capability indices unless we use the underlying statistical distribution.
  3. Transformed data may behave sufficiently normally for SPC purposes but they do not, in our experience, yield accurate process performance indices.
    • We have an example in which a Johnson transformation of a gamma distribution yields a beautiful normal probability plot, but the estimated nonconforming fraction is four times too large.
    • We have another example (Levinson, Stensney, Webb, and Glahn, 2001) in which the square root transformation of a gamma distribution yields data that are apparently normal, and the square root of the particle count data might work quite well on a control chart with normal control limits. The estimated nonconforming fraction is, however, off by an order of magnitude.
Recall that the Shewhart control chart yields a 0.135% false alarm risk at each end. We can set Shewhart-equivalent limits for a nonnormal distribution:
  • Upper control limit: 0.99865 quantile, for a 0.00135 false alarm risk at the upper end
  • Center line: 0.50 quantile or median. (The mean is the median of the normal distribution.) This allows us to use the Western Electric Zone C test for runs of eight consecutive points above or below the center line.
    • Calculation of the appropriate quantiles for the Zone A and Zone B tests is also possible.
  • Lower control limit: 0.00135 quantile, for a 0.00135 false alarm risk at the lower end.
While the false alarm (Type I) risks will be the same as for a Shewhart chart, the average run lengths will not. These must be computed from the distribution itself. The fact that a shift in process "mean" could be reflected by more than one parameter of a nonnormal distribution is likely to make this a complicated exercise.

This looks a lot better. The upper control limit, 8.9, is the 0.99865 quantile of this distribution. The median, 1.68, is a little below the mean.
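The limits quoted above can be reproduced with a short quantile search. The sketch below assumes the gamma distribution with alpha = 2, gamma = 1, and delta = 0, whose cumulative distribution has the closed form 1 - (1 + x)exp(-x). The zone boundaries are one possible choice, assumed here for illustration: each is placed at the quantile whose cumulative probability equals that of the corresponding 1, 2, or 3 sigma line of a normal chart.

```python
import math

def gamma2_cdf(x):
    """CDF of the gamma distribution with shape 2, rate 1, threshold 0."""
    if x <= 0.0:
        return 0.0
    return 1.0 - (1.0 + x) * math.exp(-x)

def normal_cdf(z):
    """Standard normal cumulative probability."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def quantile(p, cdf, lo=0.0, hi=100.0):
    """Invert a continuous, increasing CDF on [lo, hi] by bisection."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Map each line of a normal chart (in sigma units) to the equal-probability quantile.
limits = {z: quantile(normal_cdf(z), gamma2_cdf) for z in (-3, -2, -1, 0, 1, 2, 3)}

labels = {-3: "LCL", -2: "Zone A/B lower", -1: "Zone B/C lower", 0: "Center (median)",
          1: "Zone B/C upper", 2: "Zone A/B upper", 3: "UCL"}
for z in sorted(limits):
    print(f"{labels[z]:>16}: {limits[z]:.3f}")
```

The zones are strongly asymmetric around the median, which is the expected behavior for a distribution with a long upper tail.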

Nonnormal Sample Averages (x-bar chart)
How should sample averages from nonnormal distributions be treated? If the samples are big enough, the Central Limit Theorem applies. In intermediate cases, it is still desirable to use the actual distribution. We are aware of the following relationships to date:

Gamma Distribution

The average of n measurements from a gamma distribution with parameters alpha and gamma follows a gamma distribution with parameters n*alpha and n*gamma. Note that the variance is 1/n times the variance of an individual measurement, which is as expected.
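This relationship can be checked by simulation. The sketch below assumes alpha = 2, gamma = 1 (as a rate), and subgroups of n = 4; because n*alpha = 8 is an integer, the distribution of the average has a closed-form tail (the Erlang case), which is used to find an exact upper control limit. The seed, the subgroup size, and the number of trials are arbitrary choices for illustration.

```python
import math
import random

alpha, rate, n = 2.0, 1.0, 4
rng = random.Random(12345)

def erlang_survival(x, shape, rate):
    """P(X > x) for a gamma distribution with integer shape (Erlang)."""
    t = rate * x
    term, total = 1.0, 1.0
    for k in range(1, shape):
        term *= t / k
        total += term
    return math.exp(-t) * total

# Claimed distribution of the subgroup average: gamma(n*alpha, n*gamma).
shape_bar, rate_bar = int(n * alpha), n * rate

# Exact UCL for the average: the 0.99865 quantile, found by bisection on the tail area.
lo, hi = 0.0, 50.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if erlang_survival(mid, shape_bar, rate_bar) > 0.00135:
        lo = mid
    else:
        hi = mid
ucl_xbar = 0.5 * (lo + hi)

# Simulate subgroup averages; random.gammavariate takes shape and SCALE (1/rate).
trials = 100_000
averages = [sum(rng.gammavariate(alpha, 1.0 / rate) for _ in range(n)) / n
            for _ in range(trials)]
mean_bar = sum(averages) / trials
var_bar = sum((a - mean_bar) ** 2 for a in averages) / (trials - 1)
exceed = sum(a > ucl_xbar for a in averages) / trials

print(f"mean of averages     {mean_bar:.3f} (theory {alpha / rate:.3f})")
print(f"variance of averages {var_bar:.3f} (theory {alpha / (n * rate ** 2):.3f})")
print(f"fraction above exact UCL {ucl_xbar:.3f}: {exceed:.5f} (target 0.00135)")
```

The normal-theory limit for the same chart would be 2 + 3 times the square root of 0.5, about 4.12, which is noticeably lower than the exact limit and would therefore produce more false alarms than intended.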

Nonnormal Ranges or Standard Deviations (R or s chart)
It is in fact possible to define range charts with exact false alarm limits.
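One way to obtain such limits, sketched here under stated assumptions rather than as the method used in our own software, is the standard order-statistics result for the range of n independent measurements: P(R <= r) = n * integral of f(x) [F(x + r) - F(x)]^(n-1) dx. The code below integrates this numerically with Simpson's rule for the gamma distribution with alpha = 2, gamma = 1, delta = 0 and a subgroup size of 4; the function names, integration cutoff, and step count are illustrative choices.

```python
import math

def range_cdf(r, n, pdf, cdf, upper, steps=2000):
    """P(range <= r) for n independent measurements, by Simpson's rule on [0, upper].

    Assumes the distribution's support starts at 0 and that `steps` is even.
    """
    if r <= 0.0:
        return 0.0
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * pdf(x) * (cdf(x + r) - cdf(x)) ** (n - 1)
    return n * total * h / 3.0

def range_quantile(p, n, pdf, cdf, upper):
    """Invert range_cdf by bisection."""
    lo, hi = 0.0, upper
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if range_cdf(mid, n, pdf, cdf, upper) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gamma2_pdf(x):
    """Density of the gamma distribution with shape 2, rate 1, threshold 0."""
    return x * math.exp(-x) if x > 0.0 else 0.0

def gamma2_cdf(x):
    """Cumulative distribution for the same gamma distribution."""
    return 1.0 - (1.0 + x) * math.exp(-x) if x > 0.0 else 0.0

n, upper = 4, 40.0   # subgroup size; the density is negligible beyond x = 40
lcl_r = range_quantile(0.00135, n, gamma2_pdf, gamma2_cdf, upper)
center_r = range_quantile(0.5, n, gamma2_pdf, gamma2_cdf, upper)
ucl_r = range_quantile(0.99865, n, gamma2_pdf, gamma2_cdf, upper)

print(f"R chart: LCL {lcl_r:.3f}, center (median) {center_r:.3f}, UCL {ucl_r:.3f}")
```

A quick sanity check on the formula: for two measurements from an exponential distribution with rate 1, the range is itself exponential, so range_cdf(r, 2, ...) should equal 1 - exp(-r).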

The Process Capability Index
The less people know about how laws and sausages are made, the better off they are. (Otto von Bismarck)
Does that Cpk index from your supplier (or your own production line) resemble a sausage? What you don't know can hurt you, by several orders of magnitude. We saw previously that the calculation is rather straightforward, and this is how most commercially-available SPC software does it.

What does the capability index measure? It reflects the number of standard deviations between the specification limit and the mean. And that number (the standard normal deviate), in turn, should reflect the nonconforming fraction: the proportion of units that will be outside the specification limit. This calculation is quite straightforward and well-established for the normal distribution.

In this example, suppose for simplicity that the upper specification limit is 6.243 (the mean plus three standard deviations, the same as the UCL). This is not a capable process as the CPU is only 1. We have already seen that, if the distribution is normal, the nonconforming fraction above the USL will be 0.00135. We also saw that, for this gamma distribution, it's really 0.014, more than ten times as much! (Furthermore, since it's impossible to get less than zero from this distribution, there may well be no lower specification, which is true for impurity levels and particle counts. We don't care how few impurities are in the product! In this case, the Cp and CPL indices don't exist, and Cpk=CPU.)

    A capability index report that assumes a normal distribution can be off by several orders of magnitude (in terms of reflecting the nonconforming fraction) when the underlying distribution is nonnormal.
We can report an equivalent process capability index for a nonnormal distribution. This is simply the capability index of a normal distribution that would yield the same nonconforming fraction.

A nonconforming fraction of 0.0141 would result from a normal distribution whose USL was 2.195 standard normal deviates above its mean. This corresponds to a CPU of 0.732. This process has an equivalent Cpk of 0.732, which is even worse than the 1.00 reported.
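The equivalent index can be computed in a few lines. The sketch below assumes the gamma distribution with alpha = 2, gamma = 1, delta = 0 and a USL at the mean plus three standard deviations; the variable names are illustrative.

```python
import math
from statistics import NormalDist

alpha, rate = 2.0, 1.0
mean = alpha / rate
sigma = math.sqrt(alpha) / rate
usl = mean + 3.0 * sigma                      # 6.243

# Index as conventionally reported, which implicitly assumes normality.
cpu_reported = (usl - mean) / (3.0 * sigma)   # 1.00
p_if_normal = NormalDist().cdf(-3.0)          # 0.00135

# Actual tail area of the gamma distribution (closed form for shape 2).
p_actual = (1.0 + rate * usl) * math.exp(-rate * usl)

# Equivalent index: the normal CPU that would give the same nonconforming fraction.
z_equivalent = NormalDist().inv_cdf(1.0 - p_actual)
cpu_equivalent = z_equivalent / 3.0

print(f"reported CPU   {cpu_reported:.2f} (implies {p_if_normal:.5f} nonconforming)")
print(f"actual fraction nonconforming {p_actual:.4f}")
print(f"equivalent CPU {cpu_equivalent:.3f} (z = {z_equivalent:.3f})")
```

The same recipe applies to any fitted distribution: compute the tail area beyond the specification limit from the fitted cumulative distribution, convert it to a standard normal deviate, and divide by three.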

Update: The above approach is now described by the Automotive Industry Action Group (AIAG). Reference: ANSI/ASQ B1-B3-1996, Guide for Quality Control Charts/ Control Chart Method of Analyzing Data/  Control Chart Method for Controlling Quality During Production, pages 142-143

Our papers on this subject:

  • Levinson, W. "Watch Out for Nonnormal Distributions of Impurities," Chemical Engineering Progress, May 1997, pp. 70-76.
  • Levinson, W. "Approximate Confidence Limits for Cpk and Confidence Limits for Non-Normal Process Capabilities," in Quality Engineering, 9(4), 635-640 (1997)
  • Levinson, William and Polny, Angela. "SPC for Tool Particle Counts," Semiconductor International, June 1999.
  • Levinson, "SPC for Real-World Processes: What to do when the Normality Assumption Doesn't Work." Presented at the ASQ's Annual Quality Conference (2000) in Indianapolis
  • Levinson, W.A., Stensney, Frank, Webb, Raymond, and Glahn, Ronald. 2001. "SPC for Particle Counts," Semiconductor International, 10/01
