The histogram displays a single variable as a series of bars. By showing how often each value or range of values occurs, it reveals the pattern of variation (distribution) of the data. A pattern of variation has three aspects: the center (average), the shape of the curve, and the width of the curve. Histograms are constructed with variables data, that is, measurements such as time, weight, or temperature, and are not appropriate for attribute data.
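Two of the three aspects, center and width, can be put into numbers before any bars are drawn. The sketch below is illustrative only: the measurements are made-up values, and measuring width by the range and the standard deviation is an assumption made here, not something the text prescribes. Shape is left to the histogram itself.

```python
import statistics

# Hypothetical measurements (for example, waiting times in minutes).
measurements = [10, 12, 14, 15, 17, 18, 18, 19, 20, 21, 22, 23, 25, 27, 30]

# Center: the average of the data.
center = statistics.mean(measurements)

# Width: two common measures (an assumed choice for this sketch).
spread = max(measurements) - min(measurements)   # range
std_dev = statistics.stdev(measurements)         # sample standard deviation

print(f"center (average): {center:.1f}")
print(f"width (range): {spread}")
print(f"width (standard deviation): {std_dev:.1f}")
```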
All data show variation; histograms help interpret this variation by making the patterns clear. They tell a visual story about a specific case in a way that a table of numbers (data points) cannot. Histograms can be used to identify and verify causes of problems. They can also be used to judge a solution, by checking whether it has removed the cause of the problem.
Step 1. From the raw numbers (the data), find the highest and lowest values. The difference between them is the range.
Step 2. Determine the number of bars to be used in the histogram. If too many bars are used, the pattern may become lost in the detail; if too few are used, the pattern may be lost within the bars. Table 9.13 is a guide for choosing an appropriate number of bars.
When to Use the Histogram
Number of Data Points | Number of Bars |
---|---|
< 50 | 5-7 |
50-100 | 6-10 |
101-250 | 7-12 |
>250 | 10-20 |
Step 3. Determine the width of each bar by dividing the range by the number of bars. Then, starting with the lowest value, determine the grouping of values to be contained or represented by each bar.
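Steps 1 through 3 can be sketched in Python as follows. This is a minimal illustration, not a prescribed method: the function names, the choice of the fewest suggested bars, and the sample values are all assumptions made for the example.

```python
import math

# Guide table as (largest number of data points covered, (fewest bars, most bars)).
BAR_GUIDE = [(49, (5, 7)), (100, (6, 10)), (250, (7, 12)), (math.inf, (10, 20))]

def suggested_bars(n_points):
    """Step 2: look up the suggested (fewest, most) number of bars."""
    for upper, bars in BAR_GUIDE:
        if n_points <= upper:
            return bars

def bar_boundaries(data, n_bars):
    """Steps 1 and 3: find the range, the bar width, and each bar's limits."""
    low, high = min(data), max(data)
    data_range = high - low           # Step 1: highest minus lowest value
    width = data_range / n_bars       # Step 3: range divided by number of bars
    return [(low + i * width, low + (i + 1) * width) for i in range(n_bars)]

# Hypothetical data: 15 measurements (illustrative values only).
waits = [10, 12, 14, 15, 17, 18, 18, 19, 20, 21, 22, 23, 25, 27, 30]

fewest, most = suggested_bars(len(waits))
bounds = bar_boundaries(waits, fewest)   # assumed choice: use the fewest bars
for number, (lo, hi) in enumerate(bounds, start=1):
    print(f"Bar {number}: {lo:g} to {hi:g}")
```

Any bar count within the suggested range is acceptable; trying two or three counts and comparing the resulting pictures is often the quickest way to see which one shows the pattern best.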
Step 4. Create a compilation table like Table 9.14 and fill in the boundaries for each grouping.
Step 5. Fill in the compilation table by making a tally mark for each data point under the bar whose boundaries contain it, then adding up the tally marks to get the total for each bar.
Compilation Table for Constructing a Histogram
Bar | Boundaries | Tally | Total |
---|---|---|---|
1 | | | |
2 | | | |
3 | | | |
4 | | | |
5 | | | |
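Filling in the compilation table (Steps 4 and 5) can be sketched as below. The function name and sample data are hypothetical. The rule for a value that falls exactly on a shared boundary (it goes to the higher bar, except the maximum, which stays in the last bar) is an assumed convention, since the text does not specify one.

```python
def compile_table(data, n_bars):
    """Return one (bar, lower, upper, total) row per bar."""
    low, high = min(data), max(data)
    if high == low:
        raise ValueError("all values are equal; there is no range to divide")
    width = (high - low) / n_bars
    totals = [0] * n_bars
    for value in data:
        # Assumed convention: a boundary value counts in the higher bar;
        # min() keeps the maximum value inside the last bar.
        index = min(int((value - low) / width), n_bars - 1)
        totals[index] += 1
    return [(i + 1, low + i * width, low + (i + 1) * width, totals[i])
            for i in range(n_bars)]

# Hypothetical data: 15 measurements (illustrative values only).
data = [10, 12, 14, 15, 17, 18, 18, 19, 20, 21, 22, 23, 25, 27, 30]

rows = compile_table(data, 5)
print("Bar | Boundaries | Tally | Total |")
for bar, lower, upper, total in rows:
    print(f"{bar} | {lower:g}-{upper:g} | {'/' * total} | {total} |")
```

A useful check after Step 5 is that the totals add up to the number of data points; if they do not, a value was skipped or counted twice.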
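Step 6. Draw the horizontal and vertical axes and label them. The horizontal axis shows the measured variable, marked with the bar boundaries; the vertical axis shows the number of data points.
Step 7. Draw in the bars to correspond with the totals from the compilation table.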
Step 8. Identify and classify the pattern of variation. The figure below presents the possible shapes and their interpretation.
Bell Shaped: The normal pattern
Double Peaked: Suggests two distributions
Skewed: Look for other processes in the tail
Truncated: Look for reasons for sharp end of distribution or pattern
Ragged Plateau: No single clear process or pattern
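For a quick look at the shape before drawing a finished chart, Steps 6 and 7 can be approximated in plain text. In this sketch the bars run sideways (the variable is on the vertical axis), and the labels and totals are hypothetical values of the kind a compilation table would hold.

```python
def render_histogram(labels, totals, symbol="#"):
    """Return one text line per bar: its label, then a bar as long as its total."""
    label_width = max(len(label) for label in labels)
    return [f"{label:>{label_width}} | {symbol * total} {total}"
            for label, total in zip(labels, totals)]

# Hypothetical boundaries and totals taken from a compilation table.
labels = ["10-14", "14-18", "18-22", "22-26", "26-30"]
totals = [2, 3, 5, 3, 2]

lines = render_histogram(labels, totals)
for line in lines:
    print(line)
```

With these made-up totals the bars rise to a single central peak and fall away evenly on both sides, which is the bell-shaped pattern in the list above.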
Simple daily observations often do not tell enough about a process, and averages or ranges alone are not adequate summaries of the data. The main pitfall with the histogram is failing to use one: it is a useful and necessary tool.
If variation is small, the histogram may not be sensitive enough to detect significant differences in variability or in the peaks of the distribution, especially with a small data set. More advanced statistical tools can be used in such situations.