
Where's the problem? Using Chi-Squared tests to identify where defects are occurring in a manufacturing line


When it comes to fault finding in a manufacturing or process engineering environment, it can sometimes be easy to know what is causing a particular defect. For example, a solder joint defect can be easily attributed to the soldering process, so you know to spend your efforts controlling that part of the process. But what do you do when defects have a less obvious culprit, such as scratches, chips or other forms of physical damage that can be caused by manual handling or by loading assemblies into tooling? This scenario happened to me once, and I decided to use a statistical tool known as the chi-squared test to identify where the problem was occurring in the manufacturing line. I'm going to share what happened in this article, and you can also download a copy of the tool I made in Excel from my GitHub page if you want to use it yourself.


The company I was working for at the time had a problem with certain PCB components appearing to be chipped. This was noticed after electrical testing, when test engineers were trying to determine why some devices failed, seemingly at random, while others did not. It was determined that the damage was happening somewhere in the manufacturing process, but we needed to know specifically where. This can be challenging when the process has approximately 30 different stages where handling damage can occur.

Example of a glass fuse mounted to a printed circuit board

The common instinct was that routing (PCB singulation/depanelling) was the most likely culprit. However, we were also aware that the test fixtures which clamped the devices in place for electrical testing appeared to apply quite a lot of force. Knowing this, I decided to set up my statistical experiment to determine whether either of these two process stages was causing the defects we'd witnessed.

Step 1: State the null and alternative hypotheses

Like any good statistician, we must first state what our null and alternative hypotheses are. That looks like this:

Stating the null and alternative hypothesis

What we are saying here is that for H0 there is no association between our variables. If our test fails to reject H0, we wouldn't be able to conclude that our routing stage or our test stage were the cause of our defects. Whereas if we can reject the null hypothesis in favour of the alternative hypothesis, then we can conclude that these processing steps have an influence on the components being damaged.
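Written out symbolically, and assuming the hypotheses take the usual form for a chi-squared test of association, they could be stated roughly as follows. The wording is a paraphrase for illustration, not a copy of the original figure:

```latex
\begin{aligned}
H_0 &: \text{component damage is independent of the process stage} \\
H_1 &: \text{component damage is associated with the process stage}
\end{aligned}
```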

Step 2: Populate the Observed Table

Our next step is to draw up a contingency table and populate it with the results from our experiment. This step is usually the longest. A contingency table is a table of count data separated by categories (the test won't work with any other type of data). On the rows we have the different categories that we want to test by, and in the columns we have our results categories. This can seem a little odd at first, because the way we set up the table means each row's total (added as a final column) must account for every item counted at that stage. For example, if we wanted to know whether 30 test results were higher than a given average, it would not be enough to have a single column labelled 'higher' with a value of 22; we would also need a column labelled 'not higher' with a value of 8, so that the total input equals the total output.

A typical batch size of PCB devices was around 120, but more than one batch was run during the experiment. In fact, a total of 6 batches were processed, so that's a total of 720 devices for this experiment. Let's go ahead and populate our Observed Table now. I've chosen three categories. 'Before Route' is before the PCB depanelling stage and accounts for any kind of handling damage that could occur to our specific component of interest. 'After Route' is an inspection stage immediately after PCB depanelling, which tells us how many devices were damaged at this stage. Lastly, 'After Test' tells us how many devices were observed to have cracks/chips after the testing stage (which also happens to be the final stage of assembly).

The completed Observed Table from our experiment. The total value after test is lower than 720 because 198 devices were scrapped after being found to be defective.
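As a minimal sketch of how such a table can be laid out in code, the Python below builds a small contingency table and its totals. The counts are hypothetical and chosen for easy arithmetic; they are not the data from the experiment, and the 'Not Damaged' column name is an assumption.

```python
# Hypothetical counts for illustration only; NOT the data from the experiment.
# Rows are process stages, columns are the result categories.
observed = {
    "Before Route": {"Damaged": 10, "Not Damaged": 90},
    "After Route": {"Damaged": 40, "Not Damaged": 60},
    "After Test": {"Damaged": 10, "Not Damaged": 90},
}
columns = ["Damaged", "Not Damaged"]

# Row totals: every device inspected at a stage is either damaged or not.
row_totals = {stage: sum(counts.values()) for stage, counts in observed.items()}

# Column totals: how many devices fell into each result category overall.
col_totals = {col: sum(observed[stage][col] for stage in observed) for col in columns}

grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(grand_total)
```

Keeping both result columns is what lets every row total account for all devices counted at that stage.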

Step 3: Calculate the Expected Results

The next step is to calculate our expected results. For each cell, this is done by multiplying the total of its row by the total of its column and dividing this number by the overall total. The reason for doing this is to calculate what the value would be if there were no association between the variables, which allows us to look at how the observed values deviate from it.

The completed Expected table
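The expected-value calculation can be sketched in a few lines of Python. The observed counts here are hypothetical placeholders, not the results of the experiment.

```python
# Hypothetical observed counts (rows: stages, columns: damaged / not damaged).
observed = [
    [10, 90],  # Before Route
    [40, 60],  # After Route
    [10, 90],  # After Test
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for row in expected:
    print(row)
```

Because every stage in this made-up table has the same row total, each row gets the same expected counts; with unequal row totals the expected values would differ from row to row.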

Step 4: The Expected > 5 Rule

This is quite a simple one. Basically, the Chi-Squared test is inaccurate if the expected values are too small. Like almost all statistical tests, you need enough data before you can trust the result. Therefore, in the calculator I created a simple > 5 rule, which checks that every value in the expected table is > 5. If any are not, we can stop our experiment right here and go gather some more data. Fortunately for us, our values are much higher, so let's proceed.

The sanity check of our statistical experiment
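A check like this is easy to express in code. The sketch below assumes the same strict "> 5" threshold described above; the function name and the example tables are hypothetical.

```python
def passes_expected_rule(expected, minimum=5):
    """Return True if every expected count is strictly greater than `minimum`."""
    return all(value > minimum for row in expected for value in row)


# Hypothetical expected tables for illustration.
large_enough = [[20.0, 80.0], [20.0, 80.0], [20.0, 80.0]]
too_small = [[2.5, 97.5], [20.0, 80.0], [20.0, 80.0]]

print(passes_expected_rule(large_enough))  # True
print(passes_expected_rule(too_small))     # False
```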

Step 5: Calculating Residuals

Our next step is to calculate our residuals. This is simply the difference between the data we observed in our experiment and the expected data if there were no association between our variables. The row and column totals of a residual table should always add up to zero. In my example below they do, except for one value, which annoyingly is a floating-point rounding error in Excel. For these purposes a difference of -5.64843E-14 can be considered negligible.

The completed Residual table
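The residual calculation, and the reason to compare totals against zero with a tolerance rather than exactly, can be sketched as follows. The observed and expected values are hypothetical, and the tolerance of 1e-9 is an arbitrary choice for this example.

```python
import math

# Hypothetical observed and expected tables (rows: stages).
observed = [[10, 90], [40, 60], [10, 90]]
expected = [[20.0, 80.0], [20.0, 80.0], [20.0, 80.0]]

# Residual = observed - expected, cell by cell.
residuals = [
    [o - e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

row_sums = [sum(row) for row in residuals]
col_sums = [sum(col) for col in zip(*residuals)]

# Floating-point arithmetic can leave tiny non-zero totals, so compare
# against zero with an absolute tolerance instead of using ==.
totals_are_zero = all(
    math.isclose(total, 0.0, abs_tol=1e-9) for total in row_sums + col_sums
)

print(residuals)
print(totals_are_zero)
# A leftover such as -5.64843E-14 is treated as zero by the same check.
print(math.isclose(-5.64843e-14, 0.0, abs_tol=1e-9))
```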

It's sometimes a good idea to create a residual plot of this data too, so you can explore the differences more. I've done that below. I've only included the three data points for our damaged column to make the plot easier to read.

Residual Plot

When interpreting a residual plot what we're looking for is deviation from 0. Basically the further away a point is from the zero line, the greater the difference from what we expected, and therefore the more significant that difference is. From our plot we can see that we expected around 121 fewer failures after routing, around 77 more failures before routing and around 45 more failures after the test stage.

Step 6: Calculating our Chi-Squared Contributions

We now want to be able to quantify how different our results are. Because the values in our residual table add up to zero, we need a better way to measure this difference, so we square each residual and divide the result by the corresponding expected value. This gives us non-negative values that we can add up.

The completed Chi-Squared table
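In code, each contribution is the squared residual divided by the expected count for that cell. The sketch below uses hypothetical observed and expected values, not the experiment's data.

```python
# Hypothetical observed and expected tables (rows: stages).
observed = [[10, 90], [40, 60], [10, 90]]
expected = [[20.0, 80.0], [20.0, 80.0], [20.0, 80.0]]

# Contribution of each cell = (observed - expected)^2 / expected.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

for row in contributions:
    print(row)
```

Squaring removes the sign, and dividing by the expected count scales each difference relative to how large the cell was expected to be.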

Step 7: Summing our values to get a Chi Squared test statistic

The next step is simply to sum all the values from our previous table into a single figure. We need this number because we will later compare it against a table of values to determine whether our result is statistically significant, and at which levels. For now we have this:

Summing our Chi Squared contributions
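Steps 3 to 7 together amount to the standard Pearson chi-squared statistic. The symbols below are notation assumed here for illustration (they do not appear in the calculator): O and E are the observed and expected counts in row i and column j, R and C are the row and column totals, and N is the overall total.

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\left(O_{ij} - E_{ij}\right)^2}{E_{ij}}
```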

Step 8: Degrees of Freedom

The next step is to compare our test statistic from step 7 to a series of values in a table. This table is derived from a probability distribution, but the distribution changes depending on the number of rows and columns in our tables. We therefore need to calculate our degrees of freedom to know which set of values to compare our test statistic to. This is done by multiplying the number of rows minus 1 by the number of columns minus 1, (r-1) * (c-1). With three process stages in the rows and two result columns, that gives (3-1) * (2-1) = 2 degrees of freedom.

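We subtract 1 from the number of rows and columns because the row and column totals are fixed by the data: once all but one cell in a row or column is known, the last one is determined and is not free to vary. The lower the degrees of freedom of a chi-square distribution, the tighter the distribution, so using (r-1) * (c-1) rather than r * c compares our test statistic against the appropriately tighter distribution.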

Chi-Squared distributions for different degrees of freedom
DOF = 2
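The degrees-of-freedom formula is a one-liner in code. The function name below is hypothetical, and the row and column counts exclude the totals row and column.

```python
def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for a contingency table, excluding totals."""
    return (n_rows - 1) * (n_cols - 1)


# Three process stages (rows) and two result categories (columns).
print(degrees_of_freedom(3, 2))  # 2
```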

Step 9: Comparing our test statistic to critical values (CV) tables

Now that we have our test statistic we need to compare it against critical values at different significance levels. The significance level (known as the alpha) is the risk we are willing to accept of rejecting the null hypothesis when it is actually true. For example, if we reject the null hypothesis at the 5% significance level, a test statistic this large would be expected less than 5% of the time if there were no association between our variables. For a lot of testing in a manufacturing environment this level is pretty common. In environments where more certainty is needed (such as in drug and pharmaceutical testing), stricter (smaller) significance levels are required.

Critical Values Table

In our table (pictured above) we are now comparing our test statistic to the values in the middle row (2), for 2 degrees of freedom. Our test statistic is > 5.991, so we can reject the null hypothesis at the 5% significance level. That's great: it means we have evidence of an association between these process stages and the damage.

In the Excel calculator tool I built this lookup using VLOOKUP, so the test can be run at different significance levels.

Critical values for 2 degrees of freedom

In the above image we can see that the null hypothesis is rejected not only at the 5% significance level, but also at the 2.5%, 1% and 0.1% significance levels. The Excel calculator uses IF logic to determine whether to reject the null hypothesis at each level or not.

=IF(X2_Value<CV_5,"Do not reject H0 at "&C70&" significance level","Reject H0 at "&C70&" significance level")
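The same decision logic as the IF formula above can be sketched in Python. Instead of a lookup table, this sketch relies on the fact that for exactly 2 degrees of freedom the right-tail probability of the chi-squared distribution is exp(-x/2), so the critical value is -2 * ln(alpha); it does not generalise to other degrees of freedom. The function names and the test statistic of 37.5 are hypothetical.

```python
import math


def critical_value_df2(alpha):
    """Critical value of the chi-squared distribution with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)


def decision(test_statistic, alpha):
    """Mirror the spreadsheet's IF logic for a given significance level."""
    if test_statistic < critical_value_df2(alpha):
        return f"Do not reject H0 at {alpha:.1%} significance level"
    return f"Reject H0 at {alpha:.1%} significance level"


test_statistic = 37.5  # hypothetical value, not the experiment's result

for alpha in (0.05, 0.025, 0.01, 0.001):
    print(round(critical_value_df2(alpha), 3), decision(test_statistic, alpha))
```

At alpha = 0.05 the computed critical value rounds to 5.991, matching the figure quoted from the critical values table.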

Step 10: Calculating P Values

As a further step toward statistical robustness it's always a good idea to consider your p value. This is the probability of obtaining a test statistic at least as large as the one we calculated earlier if the null hypothesis were true. Calculating this value can be a little convoluted. Fortunately, however, Excel has a built-in function called CHISQ.DIST.RT which we can utilise to calculate our p value (=CHISQ.DIST.RT(X2_Value,DOF)). You can read more about it here:

Again IF/ELSE logic is used to help us determine whether to reject the null hypothesis or not.

Calculating p values using Excel's CHISQ.DIST.RT function

Our p value is very small! That's great news. It is actually 1.08903E-74; however, I've kept this to 3 decimal places to keep it looking neater in the calculator. So we now have very, very strong evidence against the null hypothesis.
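For readers without Excel, the right-tail probability has a simple closed form in the special case of 2 degrees of freedom: p = exp(-x/2). The sketch below uses that shortcut with a hypothetical test statistic; it is not a general replacement for CHISQ.DIST.RT, which handles any degrees of freedom.

```python
import math


def p_value_df2(test_statistic):
    """Right-tail chi-squared probability for exactly 2 degrees of freedom."""
    return math.exp(-test_statistic / 2.0)


test_statistic = 37.5  # hypothetical value, not the experiment's result
p_value = p_value_df2(test_statistic)

print(p_value)
print(p_value < 0.05)  # True means reject H0 at the 5% significance level
```

A useful sanity check is that the 5% critical value for 2 degrees of freedom, about 5.991, maps back to a p value of about 0.05.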


So from this quick statistical test we can identify with a high degree of confidence where defects are occurring within a manufacturing plant. After I ran this test in real life we hired a contractor to redesign the tooling for PCB routing. We then ran the same experiment with the new tooling in place and observed no damaged components after depanelling.

A chi-squared test is a powerful tool for hypothesis testing with categorical data. I've also used it in other manufacturing scenarios, such as comparing reject rates from manual inspection between operators. If you're fortunate enough to have access to Minitab or JMP you can do Chi-Squared tests within them. If you don't have access to these, however, then feel free to use this Excel tool I created, which you can download from my GitHub.

Thanks for reading, if you enjoyed please feel free to connect on LinkedIn.