The scatter diagram is another visual display of data. It shows the association between two continuous variables measured on the same item. The strength of the correlation shows in how tightly the points cluster around a line, and the direction of that line's slope shows whether the correlation is positive or negative. This correlation can point to, but does not prove, a causal relationship. It is therefore important not to rush to conclusions about the relationship between variables, as another variable may be modifying it.
For example, a scatter diagram of weight against height would lead one to believe that the two variables are related. This relationship, however, does not establish causality: while growing taller may cause one to weigh more, gaining weight does not necessarily indicate that one is growing taller. The scatter diagram is easy to use but should be interpreted with caution, because the scale may be too compressed to reveal the relationship between variables, or confounding factors may be involved.
Scatter diagrams are easy to construct.
Step 1. Collect at least 40 paired data points: "paired" data are measures of both the cause being tested and its supposed effect at one point in time.
Step 2. Draw a grid, with the "cause" on the horizontal axis and the "effect" on the vertical axis.
Step 3. Determine the lowest and highest value of each variable and mark the axes accordingly.
Step 4. Plot the paired points on the diagram. If there are multiple pairs with the same value, draw as many circles around the point as there are additional pairs with those same values.
Step 5. Identify and classify the pattern of association using the graphs below of possible shapes and interpretations.
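Steps 1 through 4 can be sketched in a few lines of Python before anything is drawn. This is only an illustrative sketch: the function name `prepare_scatter`, its `min_points` parameter, and the sample data are assumptions made for the example, not part of the procedure above. It checks the sample size, finds the range of each axis, and counts how many extra circles each repeated point needs.

```python
from collections import Counter


def prepare_scatter(pairs, min_points=40):
    """Summarise (cause, effect) pairs before plotting them by hand.

    `pairs` is a list of (cause, effect) tuples. The name and the
    `min_points` parameter are hypothetical, chosen for this sketch.
    """
    # Step 1: the procedure asks for at least 40 paired data points.
    if len(pairs) < min_points:
        raise ValueError(f"need at least {min_points} paired data points")

    causes = [cause for cause, _ in pairs]
    effects = [effect for _, effect in pairs]

    # Step 3: the lowest and highest value of each variable set the axis ranges.
    x_range = (min(causes), max(causes))
    y_range = (min(effects), max(effects))

    # Step 4: a repeated pair is plotted once, with one circle drawn around
    # the point for every additional occurrence.
    counts = Counter(pairs)
    extra_circles = {point: n - 1 for point, n in counts.items() if n > 1}

    return x_range, y_range, extra_circles


# Made-up data for illustration only: 10 distinct points, each recorded 4 times.
sample_pairs = [(i % 10, 2 * (i % 10)) for i in range(40)]
x_range, y_range, extra_circles = prepare_scatter(sample_pairs)
```

With this sample, every distinct point needs three circles because each pair occurs four times.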
Stratifying the data in different ways can make patterns appear or disappear. When experimenting with different stratifications and their effects on the scatter diagram, label how the data are stratified so the team can discuss the implications.
Interpretation can be limited by the scale used. If the scale is too small and the points are compressed, a pattern of correlation may be distorted or hidden. Choose the scale so that the points cover most of the range of both axes and both axes are about the same length.
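One way to choose such a scale is to derive the axis limits from the data and add a small margin on each side. The sketch below assumes a 5% margin and uses a hypothetical helper named `axis_limits`; both are illustrative choices, not a rule from the text.

```python
def axis_limits(values, margin=0.05):
    """Return (low, high) axis limits that hug the data.

    `margin` is the fraction of the data span added on each side so that
    the extreme points do not sit on the frame. The 5% default is an
    assumption made for this sketch.
    """
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # All values identical: fall back to a unit-wide window.
        span = 1.0
    pad = span * margin
    return low - pad, high + pad


# Made-up values for illustration only.
limits = axis_limits([10, 20, 30])
```

Applying the same helper to both variables, and drawing both axes at about the same physical length, keeps the points spread over most of the plot.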
Be careful of the effects of confounding factors. Sometimes the correlation observed is due to some cause other than the one being studied. If a confounding factor is suspected, then stratify the data by it. If it is truly a confounding factor, then the relationship in the diagram will change significantly.
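The effect of stratifying can also be checked numerically by comparing the correlation in the pooled data with the correlation inside each stratum. The following is a minimal sketch: the helper names `pearson_r` and `correlation_by_stratum` and the data are invented for illustration, and the Pearson coefficient is used here as one common measure of linear association rather than one prescribed by the text.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes both sequences vary; a constant sequence would divide by zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_by_stratum(records):
    """Group (stratum, x, y) records by stratum and correlate each group."""
    groups = defaultdict(lambda: ([], []))
    for stratum, x, y in records:
        groups[stratum][0].append(x)
        groups[stratum][1].append(y)
    return {s: pearson_r(xs, ys) for s, (xs, ys) in groups.items()}


# Made-up records for illustration only: (suspected confounder, cause, effect).
records = [
    ("A", 1, 5), ("A", 2, 4), ("A", 3, 3),
    ("B", 6, 10), ("B", 7, 9), ("B", 8, 8),
]

pooled_r = pearson_r([x for _, x, _ in records], [y for _, _, y in records])
per_stratum_r = correlation_by_stratum(records)
```

In this invented data the pooled correlation is positive while the correlation within each stratum is negative, which is the kind of significant change that signals a confounding factor.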
Avoid the temptation to draw a line roughly through the middle of the points. This can be misleading. A true regression line is determined mathematically. Consult a statistical expert or text prior to using a regression line.
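To show what "determined mathematically" means, here is a minimal sketch of an ordinary least-squares fit, the most common way to compute a regression line. The slope is the sum of the products of the deviations of x and y from their means, divided by the sum of the squared deviations of x; the intercept then makes the line pass through the point of means. The function name `least_squares_line` and the data are assumptions for illustration, and the sketch does not replace the statistical advice recommended above.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data for illustration only; these points lie exactly on y = 2x + 1.
slope, intercept = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
```

A line fitted this way minimises the squared vertical distances to the points, which a line drawn by eye through the middle of the cloud generally does not.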
Scatter diagrams show relationships, but do not prove that one variable causes the other.