**Information about Statistical Methods Used**

A tutorial is available to assist IFM participants. The tutorial deals with “what is proficiency testing”, and provides basic information about dealing with data and reading your reports. An explanation of Z scores and within and between Z scores is also provided.

Download tutorial (680KB)

IFM usually uses consensus results generated by participants in a round of testing to determine the acceptable range of answers. Performance assessments are based on __robust statistics__, which use the median result rather than the average. Using the median minimises the effect of extreme (very high or very low) results and is regarded as a fair way of assessing participant performance. The acceptable range of results, that is, how far a result may lie from the consensus median, is also derived from the spread of participant results. The spread is taken from the middle 50% of the submitted results (the interquartile range, normalised as described in the definitions below). As with the median, using the middle results reduces the effect of outlying or extreme results.

As with all statistical approaches, this approach has limitations.

- If the data is skewed, biased or affected by methodology, the median result and the spread of results may not truly reflect an acceptable range of results.
- If the number of participants is too small, the calculated acceptable spread of results may not accurately reflect a realistic spread of results in the field.

Following are some definitions for the statistical terms used in most IFM reports:

- Count: Number of results included in the analysis
- Average: Mean. Sum of the values submitted divided by the number of results.
- Median: Middle score resulting when all the results are ranked in ascending order.
- Q1: The first quartile. The value below which 25% of the results fall when ranked in ascending order.
- Q3: The third quartile. The value below which 75% of the results fall when ranked in ascending order.
- NIQR: The normalised interquartile range, 0.7413 x (Q3 - Q1). (The factor comes from statistical tables, assuming the results are normally distributed. With this factor, the NIQR is comparable to a standard deviation for normally distributed results.)
- Robust CV: A measure of the spread of results, calculated as 100 x (NIQR/median) and expressed as a percentage. The greater the number, the greater the spread of results. In microbiology proficiency testing, a robust CV of less than 8% is an example of a good spread of participant results.
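The definitions above can be sketched in a few lines of Python. This is only an illustration, not IFM's implementation: the function name `robust_summary`, the sample data, and the quartile interpolation method (the standard library's "inclusive" method) are all assumptions, since the definitions do not say how quartiles are interpolated.

```python
import statistics

NIQR_FACTOR = 0.7413  # normalising factor quoted in the NIQR definition


def robust_summary(results):
    """Return the summary statistics defined above for a list of numbers.

    Assumption: quartiles use the standard library's "inclusive"
    interpolation; the source does not specify a method.
    """
    q1, median, q3 = statistics.quantiles(results, n=4, method="inclusive")
    niqr = NIQR_FACTOR * (q3 - q1)
    return {
        "count": len(results),
        "average": statistics.mean(results),
        "median": median,
        "q1": q1,
        "q3": q3,
        "niqr": niqr,
        # Robust CV as a percentage; undefined if the median is zero.
        "robust_cv": 100 * niqr / median,
    }


# Hypothetical participant results with one extreme value (7.9).
results = [4.8, 5.0, 5.1, 5.2, 5.3, 5.4, 5.6, 7.9]
summary = robust_summary(results)

for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

In this made-up data set the single extreme result pulls the average above the median, while the median and NIQR are determined by the central results only.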

**Z scores**

A Z score is given to each participant and describes how close their result was to the consensus result. The best Z score is zero. The further from zero the Z score is, the worse the result.

Generally, a Z score less than 1.0 from zero is excellent, and up to 2.0 is acceptable. Z scores greater than 3.0 from zero are considered to be unacceptable, and corrective action should be undertaken. The formula for Z score calculations is as follows:

**Z = (result obtained by the participant – median result)/(NIQR)**
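As a rough illustration of the formula, the sketch below computes a Z score for every result in a round. The function name `z_scores`, the sample data, and the "inclusive" quartile method are assumptions made for the example; the check for a zero NIQR is also an addition, since the formula is undefined in that case.

```python
import statistics


def z_scores(results):
    """Z = (result - median) / NIQR for each submitted result.

    Assumption: quartiles use the standard library's "inclusive"
    interpolation; the source does not specify a method.
    """
    q1, median, q3 = statistics.quantiles(results, n=4, method="inclusive")
    niqr = 0.7413 * (q3 - q1)
    if niqr == 0:
        raise ValueError("NIQR is zero, so Z scores are undefined")
    return [(x - median) / niqr for x in results]


# Hypothetical participant results with one extreme value (7.9).
results = [4.8, 5.0, 5.1, 5.2, 5.3, 5.4, 5.6, 7.9]
scores = z_scores(results)

for result, z in zip(results, scores):
    flag = "unacceptable" if abs(z) > 3.0 else ""
    print(f"{result:5.1f}  Z = {z:6.2f}  {flag}")
```

With this made-up data, the results near the median score close to zero and the extreme result scores well beyond 3.0.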

**Within and between laboratory Z scores.**

To determine between and within laboratory Z scores, the normalised sum ((Result A + Result B) / √2) and the normalised difference ((Result A – Result B) / √2) are used respectively.

The normalised sum can demonstrate a laboratory’s tendency for bias in their results. Bias could be caused by equipment or operator techniques. This Z score is called the ‘between’ laboratory Z score.

The normalised difference can reflect a laboratory’s ability to reproduce exactly the same result, and is therefore very much like an estimate of precision. This determination is called the ‘within’ laboratory Z score.

The formula for within and between data in IFM proficiency testing programs is the same as for individual results, with the exception that normalised data are used instead:

Between laboratory Z scores:

**Z = (normalised sum of two participant data points – median normalised sum) / NIQR of normalised sums**

Within laboratory Z scores:

**Z = (normalised difference of two participant data points – median normalised difference) / NIQR of normalised differences**
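The two formulas above can be sketched as follows for a round in which each laboratory submits a pair of results (A, B). The function names, the sample pairs, and the "inclusive" quartile method are assumptions for illustration only.

```python
import math
import statistics


def robust_z(values):
    """Z = (value - median) / NIQR, using the robust statistics defined earlier.

    Assumption: quartiles use the standard library's "inclusive"
    interpolation; the source does not specify a method.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    niqr = 0.7413 * (q3 - q1)
    if niqr == 0:
        raise ValueError("NIQR is zero, so Z scores are undefined")
    return [(v - median) / niqr for v in values]


def between_within_z(pairs):
    """Return (between, within) laboratory Z scores for (A, B) result pairs."""
    root2 = math.sqrt(2)
    norm_sums = [(a + b) / root2 for a, b in pairs]   # reflects bias
    norm_diffs = [(a - b) / root2 for a, b in pairs]  # reflects precision
    return robust_z(norm_sums), robust_z(norm_diffs)


# Hypothetical paired results, one pair per laboratory.
pairs = [
    (5.0, 5.1), (5.2, 5.2), (5.3, 5.1), (4.9, 5.0),
    (5.4, 5.5), (5.1, 5.3), (6.0, 6.2), (5.2, 4.6),
]
between, within = between_within_z(pairs)

for (a, b), zb, zw in zip(pairs, between, within):
    print(f"A={a:.1f} B={b:.1f}  between Z = {zb:6.2f}  within Z = {zw:6.2f}")
```

In this made-up data, the laboratory reporting (6.0, 6.2) agrees with itself but sits well above the group, so it stands out on the between laboratory Z score; the laboratory reporting (5.2, 4.6) is close to the group overall but its two results disagree, so it stands out on the within laboratory Z score.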

**When Consensus results cannot be used:**

Some projects require adherence to particular standards, and in these cases the direct use of consensus results is not applicable. Participants will be informed about the statistical methods employed for their particular program.

A classical approach not directly using consensus results is the use of **“reference values”**. Reference values can be either the __“desired”__ result or the __“true”__ result, and are set by IFM from the data available.

(An IFM “desired” result is the value determined by a group of laboratories who have tested the samples. An IFM “true” result can be used when certified reference materials have been used as samples in the program, and these materials are certified by a body accredited to ILAC G12:2000 or ISO Guide 34.)

When reference values are used, the acceptable spread of results is also determined by IFM from information held. Participants have the right to request the source of the data (in general terms) and the actual values and principles used to determine the acceptance criteria for the programs they participated in. All program reports still show data generated by the group of participants.