Bar Chart
What is a bar chart?
A bar chart shows the counts of values for levels of a categorical or nominal variable.
How are bar charts used?
Bar charts help you understand the levels of your variable and can be used to check for errors.
What are some issues to think about?
Bar charts are used for nominal or categorical data. For continuous data, use a histogram instead.
Bar charts show the frequency counts of data
Bar charts show the frequency counts of values for the different levels of a categorical or nominal variable. Sometimes, bar charts show other statistics, such as percentages. Figure 1 is an example of a bar chart for responses to a survey question.
The bars show the levels of the variable; the height of each bar shows the count of responses for that level.
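To make the idea of counting responses per level concrete, here is a minimal Python sketch. The survey responses are invented for illustration and are not the data behind Figure 1; the function name `text_bar_chart` is also just a placeholder. It tallies each level and prints a simple text bar whose length equals the count.

```python
from collections import Counter

# Hypothetical survey responses (not the data shown in Figure 1).
responses = ["Agree", "Disagree", "Agree", "Neutral", "Agree", "Disagree"]

# One count per level of the categorical variable.
counts = Counter(responses)

def text_bar_chart(counts):
    """Return one line per level: the label, a bar of '#' marks, and the count."""
    width = max(len(level) for level in counts)
    return [f"{level:<{width}} | {'#' * n} {n}" for level, n in counts.items()]

for line in text_bar_chart(counts):
    print(line)
```

Charting software does the same tally behind the scenes before it draws the bars.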
What is the difference between bar charts and histograms?
Two key differences between histograms and bar charts are the gaps between bars and the types of data. Histograms do not have gaps between bars, while bar charts do. However, with many software tools, you can revise a bar chart so that it does not have gaps between the bars, which leads to the second key difference between histograms and bar charts.
Histograms are used with continuous data; bar charts are used with categorical or nominal data. See the "Bar charts and types of data" section below for more detail.
What is the difference between bar charts and Pareto charts?
A Pareto chart is a special example of a bar chart. For a Pareto chart, the bars are ordered from highest to lowest. These charts are often used in quality control to identify the areas with the most problems.
Like a histogram, a Pareto chart does not have gaps between bars. Unlike a histogram, the Pareto chart summarizes counts for a nominal or categorical variable.
Figure 2 gives an example of a Pareto chart that summarizes types of findings in an audit of business processes. It includes a legend for the categories, which allows for longer labels that make the categories easier to read.
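The ordering behind a Pareto chart can be sketched in Python as follows. The audit categories and counts are hypothetical, not the values in Figure 2. The running percentage is an addition that many Pareto charts display as a line; it is included here as an assumption about what you might want, not as part of the figure.

```python
from collections import Counter

# Hypothetical audit findings (not the data in Figure 2).
findings = Counter({"Missing signature": 14, "Late filing": 30, "Wrong form": 6})

def pareto_rows(counts):
    """Return (category, count, cumulative percent) ordered from highest to lowest count."""
    total = sum(counts.values())
    rows = []
    running = 0
    for category, n in counts.most_common():  # highest count first
        running += n
        rows.append((category, n, round(100 * running / total, 1)))
    return rows

for category, n, cumulative in pareto_rows(findings):
    print(f"{category:<18} {n:>3} {cumulative:>6}%")
```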
Charting statistics other than counts
While all of the examples show bar charts with counts, these graphs can also show other statistics, such as percentages. Most software tools give options for the statistic to chart.
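Converting counts to percentages is a single division per level. The sketch below uses invented counts, and the helper name `to_percentages` is hypothetical.

```python
# Hypothetical counts for three levels of a survey question.
counts = {"Yes": 45, "No": 30, "Unsure": 25}

def to_percentages(counts):
    """Convert the count for each level into a percentage of the total."""
    total = sum(counts.values())
    return {level: 100 * n / total for level, n in counts.items()}

percentages = to_percentages(counts)
for level, pct in percentages.items():
    print(f"{level:<6} {pct:.1f}%")
```

The bars keep the same relative heights either way; only the axis scale changes.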
Bar chart examples
Software is often used to create bar charts. Software usually allows users to create either vertical or horizontal bar charts, as well as add custom features to a bar chart.
Below are a few examples of bar charts. You may wish to consult a statistician or the many books and websites available to determine which type of bar chart works best for your data.
Figures 3-15 use data from 10 bags of candy. Each bag has 100 pieces of candy, and the counts for the five flavors have been recorded for each bag. The goal is for the bags to have nearly equal counts for each flavor, meaning we expect to have roughly 20 pieces of candy for each flavor in each bag. Across 10 bags, we expect to have approximately 200 pieces of candy for each flavor.
Our first step is to create a bar chart of the data, as shown in Figure 3:
The software orders the bars alphabetically by the name of the flavor, which might be the best way to show the results for your audience.
However, you might want to order the bars by decreasing counts, as shown in Figure 4:
We can now see that the total counts of pieces of candy for Grape and Orange are the same. This was true in Figure 3, but it was not as easy to see.
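The change from alphabetical order to decreasing counts can be sketched in Python. This is an illustration only, not the software used for the figures. The Cherry total of 120 and the tie between Grape and Orange come from this article; the other counts and the flavor names Apple and Lemon are invented.

```python
# Total pieces per flavor across 10 bags. Only Cherry = 120 and the
# Grape/Orange tie come from the article; every other value is hypothetical.
flavor_counts = {"Apple": 182, "Cherry": 120, "Grape": 230, "Lemon": 238, "Orange": 230}

def order_by_count(counts):
    """Sort levels by decreasing count; ties fall back to alphabetical order."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

alphabetical = sorted(flavor_counts.items())  # ordering used in Figure 3
by_count = order_by_count(flavor_counts)      # ordering used in Figure 4

for flavor, n in by_count:
    print(f"{flavor:<7} {n}")
```

Once the levels are sorted by count, tied levels sit next to each other, which is why the Grape and Orange tie is easier to spot.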
The bars are vertical. For long graph labels, a horizontal bar chart is often better. Figure 5 shows the same data with longer labels for the flavors in a horizontal chart. If we had used a vertical bar chart instead, the labels might have been harder to read.
We have used the same color for all bars in these examples. As a general rule, using many colors makes a graph harder to understand.
But suppose that the candy company requires that every bag have at least 18 pieces of each flavor. Across 10 bags, we need at least 180 pieces for each flavor. Since our data shows only 120 pieces for Cherry, we want to highlight this problem. Figure 6 uses a shaded bar to do this. Another option is to use a different color to highlight the bar for Cherry.
You might want to add labels to the bars. Figure 7 adds the counts to the end of each bar. This approach helps show that we might also have a problem with the Red Candy Apple flavor, since it meets our requirement of 18 pieces per bag, but just barely.
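The highlighting and labeling ideas can be combined in a small Python sketch. The requirement of 18 pieces per bag, the 10 bags, and the Cherry total of 120 come from the text above; the remaining counts and the names Apple and Lemon are hypothetical. A different fill character stands in for the shaded bar of Figure 6, and the count printed at the end of each bar mimics the labels of Figure 7.

```python
MIN_PER_BAG = 18
BAGS = 10
MIN_TOTAL = MIN_PER_BAG * BAGS  # 180 pieces per flavor across all bags

# Only Cherry = 120 comes from the article; the other values are hypothetical.
flavor_counts = {"Apple": 182, "Cherry": 120, "Grape": 230, "Lemon": 238, "Orange": 230}

def labeled_bars(counts, minimum, scale=10):
    """Draw one text bar per flavor, using '.' instead of '#' when below the minimum."""
    lines = []
    for flavor, n in counts.items():
        fill = "." if n < minimum else "#"  # stand-in for a shaded or recolored bar
        lines.append(f"{flavor:<7} {fill * (n // scale)} {n}")
    return lines

below_minimum = [flavor for flavor, n in flavor_counts.items() if n < MIN_TOTAL]

for line in labeled_bars(flavor_counts, MIN_TOTAL):
    print(line)
print("Below minimum:", below_minimum)
```

With these made-up numbers, Apple clears the minimum by only two pieces, the kind of near miss that count labels make visible.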
How extreme data values affect bar charts
Bar charts show counts of categories in your data. Unlike histograms, bar charts are not affected by extreme values. The bar chart simply shows another bar for the category with very few (or very many) values in the bar. Figure 8 shows a different set of candy data, where the Grape flavor is replaced with Mango. The count for Mango is much lower than expected.
Figure 9 shows another example, where Grape is replaced with Pineapple. The count for Pineapple is much higher than expected.
Bar charts can help identify incorrect values in your data. In Figure 10, “Mango” was misspelled as “Mangi” for one data value, which is a clear data error that should be fixed. Checking your own data for errors with bar charts can be helpful.
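A similar check can be done in code before charting. The sketch below is one possible approach, not a feature of any particular charting tool: it looks for levels with very few values whose names closely resemble a more common level. The data values are invented, and the function name and thresholds are assumptions you would tune for your own data.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical flavor values containing one misspelling.
values = ["Mango"] * 5 + ["Cherry"] * 4 + ["Mangi"] + ["Orange"] * 4
counts = Counter(values)

def suspect_levels(counts, max_count=1, cutoff=0.7):
    """Map each rare level to the common level it most closely resembles, if any."""
    common = [level for level, n in counts.items() if n > max_count]
    suspects = {}
    for level, n in counts.items():
        if n <= max_count:
            match = get_close_matches(level, common, n=1, cutoff=cutoff)
            if match:
                suspects[level] = match[0]
    return suspects

print(suspect_levels(counts))
```

A flagged level is only a candidate for review; a rare but correctly spelled category would need a human decision.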
How do I add groups to bar charts?
If there are groups in your data, plotting all the data together in a bar chart can help show patterns across these groups. Figure 11 combines the data from three candy factories.
From this figure, you can see which factories use which flavor in bags of candy. You can also see the problems, such as Factory A having too few Mango pieces of candy in the bags. In this example, ordering the bars alphabetically makes sense. We cannot order by counts since the order would be different across factories.
In this example, using different colors for the different factories might be helpful. Figure 12 shows each factory with a different color.
You might want to show the counts on the horizontal axis to make visual comparisons of counts easier, as seen in Figure 13.
Figure 13 makes it easier to compare counts across the different flavors, but it makes it harder than Figure 12 to see which flavors are used at each factory.
These are just a few of the many ways to add groups to bar charts. For your data, you need to think about the message to your audience and how to build the best graph for that message.
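Behind any grouped bar chart is a table of counts for each combination of group and level. A minimal Python sketch of that tabulation follows. The records are invented and far smaller than the candy data, and the function name `grouped_counts` is hypothetical.

```python
from collections import Counter

# Hypothetical (factory, flavor) records, one per piece of candy.
pieces = [
    ("A", "Cherry"), ("A", "Mango"), ("A", "Cherry"),
    ("B", "Pineapple"), ("B", "Pineapple"), ("B", "Cherry"),
    ("C", "Grape"), ("C", "Cherry"), ("C", "Grape"),
]

def grouped_counts(records):
    """Return {factory: {flavor: count}} with factories and flavors in alphabetical order."""
    pair_counts = Counter(records)
    factories = sorted({factory for factory, _ in records})
    flavors = sorted({flavor for _, flavor in records})
    return {
        factory: {flavor: pair_counts[(factory, flavor)] for flavor in flavors}
        for factory in factories
    }

table = grouped_counts(pieces)
for factory, row in table.items():
    print(factory, row)
```

Every factory gets an entry for every flavor, so a count of zero shows directly that a factory does not use that flavor.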
Stacked bar charts
Instead of using groups, you might want to use a stacked bar chart. With a stacked bar chart, you show the responses for your groups, which are the factories for the Candy data. Each group has one bar. The frequency counts for your variable are then stacked within the bar for each factory. For the Candy data, the counts of flavors will be stacked within the bar for each factory. Figure 14 shows a stacked bar chart for the Candy data from the three factories, using a different color for each flavor.
In Figure 14, we can easily see that only Factory A uses Mango, only Factory B uses Pineapple, and only Factory C uses Grape. By comparing sizes of the stacked sections of the bars, we can also see Factory A uses very few Mango candies, and Factory B uses a lot of Pineapple candies.
Adding a legend is important for a stacked bar chart. Many software tools allow you to add labels to a stacked bar chart, as demonstrated in Figure 15. For example, the labels help us see that Factory B had the same total count for the Cherry and Orange flavors.
You might find it helpful to print a stacked bar chart in grayscale before making final decisions on colors. Also, as Figure 15 shows, when adding labels, you need to be sure that the label can be read with the background color for each element of the stacked bar.
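The stacking itself is a running sum: each segment starts where the previous one ends. The Python sketch below computes the start and end of every segment in one bar. The counts are hypothetical; only the equal totals for Cherry and Orange mirror what the article reports for Factory B, and the flavor name Lemon is invented.

```python
# Hypothetical flavor counts for one factory. Only the Cherry/Orange tie
# mirrors the article; the numbers themselves are invented.
factory_b = {"Cherry": 150, "Lemon": 200, "Orange": 150, "Pineapple": 500}

def stack_segments(flavor_counts):
    """Return (flavor, start, end) for each segment of a single stacked bar."""
    segments = []
    start = 0
    for flavor, n in flavor_counts.items():
        segments.append((flavor, start, start + n))
        start += n  # the next segment begins where this one ends
    return segments

segments = stack_segments(factory_b)
for flavor, start, end in segments:
    print(f"{flavor:<9} from {start:>4} to {end:>4}")
```

Because only the first segment starts at zero, sizes of the other segments are harder to compare by eye, which is one reason labels help on stacked charts.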
Bar charts and types of data
Figures 16-20 demonstrate when it makes sense to use bar charts or histograms for different types of data.
Categorical or nominal data: appropriate for bar charts
Bar charts make sense for categorical or nominal data, since these data are measured on a scale with specific possible values.
With categorical data, the sample is often divided into groups, and the responses have a defined order (data of this kind are often called ordinal). For example, in a survey where you are asked to give your opinion on a scale from “Strongly Disagree” to “Strongly Agree,” your responses are categorical.
With nominal data, the sample is also divided into groups but without any particular order. Country of residence is an example of a nominal variable. You can use the country abbreviation, or you can use numbers to code the country name. Either way, you are simply naming the different groups for the data.
Continuous data: use histograms
Bar charts do not make sense for continuous data, since continuous data are measured on a scale with many possible values. Some examples of continuous data are:
- Age
- Blood pressure
- Weight
- Temperature
- Speed
For all of these examples, use histograms instead of bar charts.
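A short Python sketch shows why. With continuous measurements, nearly every value is distinct, so counting each value separately gives bars of height one. Grouping the values into intervals, which is what a histogram does, gives a usable summary. The ages and the bin width of 10 are hypothetical choices for illustration.

```python
from collections import Counter

# Hypothetical ages in years.
ages = [23.5, 31.2, 35.8, 41.0, 29.9, 38.4, 52.7]

# Treating each value as its own category: every "bar" has a count of 1.
distinct_value_counts = Counter(ages)

def histogram_bins(values, width):
    """Count values in intervals [start, start + width), keyed by the interval start."""
    bins = Counter(int(v // width) * width for v in values)
    return dict(sorted(bins.items()))

binned = histogram_bins(ages, width=10)
for start, n in binned.items():
    print(f"{start}-{start + 10}: {'#' * n} {n}")
```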