Home → Techniques and Tips → @RISK Simulation: Graphical Results → Correlation Tornado versus Regression Tornado
Applies to: @RISK, all releases
Why do we have both? Why can a correlation tornado sometimes show bars that aren't on the regression tornado, or vice versa?
Regression and correlation both indicate the direction of the relationship. A positive means that as that input increases, the output increases; and a negative means that as that input increases, the output decreases. You can say that a regression coefficient shows the strength of the relationship, and a correlation coefficient shows the consistency of the relationship.
Correlation first. Imagine a scatter plot of just this input (horizontal axis) and output (vertical axis). Each point represents the value of that input and output in one iteration. As you sweep from left to right, you are going from low to high values of the input, in order. Now, consider two consecutive points in that sweep. The second point is to the right of the first, so it has a higher input value. But is the second point higher on the graph than the first (larger output value) or lower? In almost any simulation, the points will show some ups and some downs, but let's suppose that for every single pair of points, the point to the right is also higher. In this case you have a perfectly consistent relationship: increasing the input always increases the output. The correlation coefficient is +1 (maximum possible correlation).
Now suppose that the relationship is a little more realistic: usually when you go from left to right, the points are rising, but sometimes the right-hand point is lower than the nearest point to its left. Now the relationship is not perfectly consistent. Usually increasing the input increases the output, but not always. The higher the correlation coefficient, the more consistently increasing the input increases the output; the lower the correlation coefficient, the less often increasing the input increases the output. The lower correlation coefficient means that the relationship has less consistency to it.
Take a situation where, moving from left to right, half the time the second point is higher than the first and half the time it's lower. Increasing the input is just as likely to decrease the output as increase it. Your correlation coefficient is zero.
It works the same with negative correlations. A coefficient of –1 (the lowest possible) means that every single pair of points has the second output lower than the first. The relationship is perfectly consistent: every time you increase the input, the output decreases. As the correlation coefficient gets further from –1 and closer to 0, there is less and less consistency. The output still decreases with increasing input, more often than not, but the lower the coefficient the closer you get to 50-50 increase or decrease and zero correlation.
So the correlation coefficient tells you whether increasing the input generally increases or decreases the output, and how consistent that trend is, but it tells you nothing about the strength of the influence.
So much for correlation. What about regression coefficients? Regression coefficients tell you the size of the effect each input has on the output. For example, a regression coefficient of 6 means that the output increases 6 units for a 1-unit increase in the input; a coefficient of –4 means that the output decreases 4 units for each one-unit increase in the input.
(It's a little more complicated than that in @RISK, because you can get only scaled regression coefficients on a tornado; see Interpreting Regression Coefficients in Tornado Graphs. But you can get the actual regression coefficients in a worksheet; see Regression Coefficients in Your Worksheet.)
For more, see Which Sensitivity Measure to Use?. Also see the "Regression and Correlation" topic in the @RISK Help file.
See also: All Articles about Tornado Charts
Last edited: 2016-04-20