Z-Score
What is a z-score?
A z-score is a standardized measure of how far a particular data value is from the mean of its data set, expressed in standard deviations. Z-scores are most informative when the data are approximately normally distributed.
How are z-scores used?
By providing a uniform scale to express how extreme a given data point is relative to the mean, z-scores are helpful in identifying outliers as well as comparing data from different distributions. A z-score is also a quick way to apply the empirical rule. For example, you can quickly check whether about 95% of the values in a data set are within two standard deviations of the mean by checking the percentage of values with z-scores between –2 and +2.
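The empirical-rule check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original tutorial: the function name `share_within` and the simulated heart-rate data (mean 70, standard deviation 12, fixed random seed) are assumptions made for the example.

```python
import random
import statistics

def share_within(values, k=2.0):
    """Return the fraction of values whose z-score lies between -k and +k."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    inside = sum(1 for v in values if abs((v - mean) / sd) <= k)
    return inside / len(values)

# Hypothetical data: 10,000 simulated, roughly normal heart rates (assumed values)
rng = random.Random(0)
simulated = [rng.gauss(70, 12) for _ in range(10_000)]

print(f"{share_within(simulated):.1%} of values have z-scores between -2 and +2")
```

For data that are close to normal, the printed share should land near 95%. A share far from that suggests the data are not well described by a normal distribution.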
How do I calculate z-scores?
You calculate the z-score by subtracting the mean from a data value and then dividing by the standard deviation.
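Written as a formula, the calculation looks like this. The symbols are the conventional ones and are not taken from the original text: x is the data value, μ is the mean, and σ is the standard deviation.

```latex
z = \frac{x - \mu}{\sigma}
```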
Using a z-score
A z-score measures how far each of your data values is from the mean on a standardized scale. If your raw data are approximately normally distributed, converting them to z-scores gives values that follow the z-distribution. The z-distribution is a normal distribution with a mean of 0 and a standard deviation of 1. It is often called the standard normal distribution.
Why convert to z-scores?
Converting to a z-score makes it easy to apply the empirical rule. For example, since the standard deviation of the z-distribution is 1, you know that about 95% of the values are between –2 and +2.
Converting to z-scores allows us to judge distance from the mean on a standardized scale. Before computers were widely available, statistics textbooks contained tables of the standard normal distribution, allowing students to look up distances from the mean that are more precise than the one, two, and three standard deviations of the empirical rule.
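Today the same lookup can be done in software instead of a printed table. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later). The helper name `share_between` is an assumption introduced for this illustration.

```python
from statistics import NormalDist

# The z-distribution: a normal distribution with mean 0 and standard deviation 1
z_dist = NormalDist(mu=0, sigma=1)

def share_between(low, high):
    """Return the share of the z-distribution lying between two z-scores."""
    return z_dist.cdf(high) - z_dist.cdf(low)

# The empirical-rule distances, plus one in-between distance (1.5)
for k in (1, 1.5, 2, 3):
    print(f"within {k} standard deviations: {share_between(-k, k):.4f}")
```

The values for 1, 2, and 3 correspond to the "about 68%, 95%, and 99.7%" of the empirical rule, while a distance such as 1.5 is one that would previously have required a table.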
How to convert to z-scores
To convert a data value, subtract the mean from the value, and then divide by the standard deviation. The result is called a z-score or “standardized score.” In theory, you use the population mean and standard deviation. In practice, you typically use the sample mean and standard deviation.
This is easier to understand with an example.
For a simple example, we will use only a few data values. Suppose you measure heart rate. Most people have a heart rate between 60 and 100 beats per minute (BPM). Suppose your data values are:
Heart rate (BPM) |
55 |
60 |
65 |
75 |
80 |
85 |
The mean of these values is 70 and the sample standard deviation is about 11.8. (You can see how to perform these calculations in the pages for mean and standard deviation.)
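As a quick check, the mean and standard deviation for these six values can be reproduced with Python's standard `statistics` module. This is only a sketch; the variable names are chosen for illustration.

```python
import statistics

heart_rates = [55, 60, 65, 75, 80, 85]  # BPM, the example values

mean = statistics.mean(heart_rates)
sd = statistics.stdev(heart_rates)  # sample standard deviation (divides by n - 1)

print(f"mean = {mean:.1f}")  # mean = 70.0
print(f"sd   = {sd:.1f}")    # sd   = 11.8
```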
Suppose you are asked whether the value of 55 is within two standard deviations of the mean. You can figure this out by using the mean and standard deviation to calculate the value that is two standard deviations below 70. That calculation is as follows:
70 – (2 × 11.8) = 70 – 23.6 = 46.4
Since 55 is within the range of 46.4 to 70, 55 is within two standard deviations of the mean.
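The same range check can be written as a small function. This is a minimal sketch using the rounded mean and standard deviation from the example. The function name `within_k_sd` is hypothetical, and it checks both sides of the mean rather than only the lower side.

```python
def within_k_sd(value, mean, sd, k=2):
    """Return True if value lies within k standard deviations of the mean."""
    lower = mean - k * sd  # 70 - (2 * 11.8) = 46.4 in the example
    upper = mean + k * sd  # 70 + (2 * 11.8) = 93.6 in the example
    return lower <= value <= upper

print(within_k_sd(55, mean=70, sd=11.8))  # True
```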
Alternatively, you could calculate the z-score. Remember, to calculate z-scores for a set of values, you subtract the mean from each value and divide the result by the standard deviation. Here are the z-scores for our heart rate measurements:
Data | Z-score |
55 | (55 – 70) / 11.8 = –1.27 |
60 | (60 – 70) / 11.8 = –0.85 |
65 | (65 – 70) / 11.8 = –0.42 |
75 | (75 – 70) / 11.8 = 0.42 |
80 | (80 – 70) / 11.8 = 0.85 |
85 | (85 – 70) / 11.8 = 1.27 |
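The table can be reproduced in code. The sketch below uses only the Python standard library and the sample standard deviation. The function name `z_scores` is an assumption for this illustration.

```python
import statistics

heart_rates = [55, 60, 65, 75, 80, 85]  # BPM, the example values

def z_scores(values):
    """Subtract the mean from each value and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Print each data value next to its z-score, rounded to two decimals
for value, z in zip(heart_rates, z_scores(heart_rates)):
    print(f"{value} | {z:+.2f}")
```

Because the code uses the unrounded standard deviation rather than 11.8, the intermediate values differ slightly from the hand calculation, but the z-scores agree to two decimal places.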
Now, we can see that the value of 55 is within two standard deviations of the mean. In fact, it is 1.27 standard deviations below the mean.
We used the sample mean and sample standard deviation to calculate our z-scores, which is typical practice, even though statistical theory is based on the population mean and population standard deviation.
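To see how much this choice matters, the sketch below computes the z-score for 55 both ways. Treating the six heart rates as an entire population is a hypothetical made only for comparison. `statistics.stdev` divides by n – 1 and `statistics.pstdev` divides by n.

```python
import statistics

heart_rates = [55, 60, 65, 75, 80, 85]  # BPM, the example values
mean = statistics.mean(heart_rates)

sample_sd = statistics.stdev(heart_rates)       # sample formula, divides by n - 1
population_sd = statistics.pstdev(heart_rates)  # population formula, divides by n

z_sample = (55 - mean) / sample_sd
z_population = (55 - mean) / population_sd  # hypothetical: data treated as a full population

print(f"sample sd = {sample_sd:.2f}, z = {z_sample:.2f}")
print(f"population sd = {population_sd:.2f}, z = {z_population:.2f}")
```

With only six values the two results differ noticeably. The gap shrinks as the number of data values grows, because n – 1 and n become nearly the same.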