HomeTechniques and Tips@RISK Simulation: Numerical ResultsConfidence Intervals in @RISK

6.28. Confidence Intervals in @RISK

Applies to: @RISK 5.x–7.x

How can I compute a confidence interval on a simulated input or output in @RISK?

People don't always mean the same thing by "confidence interval" in the context of a simulation. Some want to estimate the mean of a distribution, and others want to know the middle x% of values.

Prediction Interval

Some people use "confidence interval" to mean the middle x% of the simulated data values, also known as a prediction interval. For instance, a 95% confidence interval by this definition would be the 2.5 percentile through the 97.5 percentile. @RISK can find these percentiles for you directly, with the RiskPtoX function. This downloadable workbook PredictionInterval.xls (attached) shows the calculation.

Some people mean the confidence interval that is taught in statistics classes, an estimate of a "true population mean". The idea here is that the simulation is treated as a sample from the complete distribution, which contains infinitely many values. Your simulated result has a mean, the mean of a sample from the distribution, but if you repeated the simulation you'd get a different mean. What you want is a range that estimates the true mean of the distribution, with x% confidence in that range.

This confidence interval is the simulated mean plus or minus a margin of error. In turn, the margin of error is a critical t or z times the standard error. But the estimated standard error depends on your sampling method, Latin Hypercube or Monte Carlo.

Confidence Interval in a Worksheet Function

Beginning with @RISK 7.5, you can use the RiskCIMean( ) function to place the lower or upper bound of a confidence interval in your worksheet. =RiskCIMean(A1,.95) or =RiskCIMean(A1,.95,TRUE) gives you the lower bound for the 95% confidence interval about the mean of cell A1, and =RiskCIMean(A1,.95,FALSE) gives you the upper bound. If you prefer, you can use the name of an input or output, instead of a cell reference.

The confidence interval is computed using RiskStdErrOfMean( ), which equals the simulated standard deviation divided by the square root of the number of iterations. That's accurate if you're using Monte Carlo sampling. However, that same standard error is too large when you're using Latin Hypercube sampling. In turn, the larger standard error makes the confidence interval wider than necessary, possibly much wider than necessary. Thus, the RiskCIMean( ) function makes a conservative estimate under Latin Hypercube sampling. A truer estimate would require running multiple simulations, as explained below, which is not practical in a worksheet function.

Confidence Interval with Monte Carlo Sampling

The standard error is the simulated standard deviation divided by the square root of the number of iterations. The bounds of the confidence interval are therefore

sample_mean ± zcritical × standard_dev / sqrt(sample_size)

(Critical z is easier to compute and is often used instead of critical t. For 100 iterations or more, critical t and critical z are virtually equal.)

To find this type of confidence interval, @RISK offers several auxiliary functions but no single "confidence interval" function. The attached workbook ConfidenceInterval_MC.xlsx shows how to calculate this confidence interval using the @RISK statistics function. This worksheet is a proof of concept, and therefore the calculations are spread over several cells to show every step. In production, you would probably combine the calculations into a couple of cells, or put them into a user-defined function.

To predict how many iterations will be needed to restrict the confidence interval to a specified width, please see How Many Iterations Do I Need?

Confidence Interval with Latin Hypercube Sampling

(For computing confidence intervals based on Latin Hypercube sampling, we rely on Michael Stein, "Large Sample Properties of Simulations Using Latin Hypercube Sampling", Technometrics 29:2 [May 1987], pages 143-151, accessed 2016-06-28 from https://r-forge.r-project.org/scm/viewvc.php/*checkout*/doc/Stein1987.pdf?revision=56&root=lhs.)

The simulated sample means are much less variable with Latin Hypercube than with Monte Carlo sampling. (See Latin Hypercube Versus Monte Carlo Sampling.) Therefore:

• standard_dev/sqrt(sample_size) over-estimates the standard error of the mean, quite possibly by a large amount.
• A confidence interval using that standard error will therefore be very conservative: the interval and the margin of error will be much wider than necessary.
• The RiskStdErrOfMean( ) and RiskCIMean( ) worksheet functions, as mentioned above, use that traditional calculation, and therefore they also overstate the standard error and produce an overly-wide confidence interval.

We recommend Latin Hypercube sampling, and it's the default in @RISK, because it does a better job of simulating your model than traditional Monte Carlo sampling. Just be aware that the confidence intervals that you or @RISK compute don't take the increased accuracy of Latin Hypercube into account. It may be enough just to bear in mind that the confidence intervals are bigger than necessary. But if you need confidence intervals that accurately reflect Latin Hypercube sampling, here is how you can compute them.

If the number of iterations is large relative to the number of input variables, and certain other conditions are met, the distribution of simulated sample means for each output will be approximately normal. Then you can find the standard error, margin of error, and confidence interval by this procedure:

1. In Simulation Settings » Sampling » Multiple Simulations, set "Use different seeds". Set a number of iterations in each simulation that is large relative to the number of input variables.

2. Run several simulations.

3. Each simulation will have a mean, which we can call x-bar. Collect the simulated means, and take the mean of those x-bars. This is your estimate for the true mean, and will be the center of your confidence interval.

4. Compute the standard deviation of the group of x-bars, and divide by the square root of the number of simulations (not iterations). This is the estimated standard error of the mean for Latin Hypercube sampling. Since the standard deviation of those simulated means is much less than the standard deviation of the iterations within any one simulation, this standard error will be much less than the standard error for Monte Carlo sampling.

5. Compute your critical t in the usual way, with degrees of freedom set to number of simulations minus 1, not number of iterations minus 1. For instance, with 10 simulations, critical t is 2.26 for a 95% confidence interval. (Since the degrees of freedom is low, use t and not z.)

6. Multiply critical t from step 5 by the standard error from step 4. This is the margin of error. Your final confidence interval is

(mean of x-bars) ± tcritical × standard_error

The attached workbook ConfidenceInterval_LH.xlsx shows the calculation. The model is the same one that was presented above for Monte Carlo sampling. In the Monte Carlo example, there were 10,000 iterations in one simulation, and the standard error was on the order of \$550,000. In the Latin Hypercube example, there are 1000 iterations in each of 10 simulations, totaling the same 10,000 iterations, but the standard error is much smaller, on the order of \$5,000 instead of \$550,000.

Last edited: 2017-08-02