Report Properties
Owner: websterwest
Created: Sep 08, 2009
Confidence Interval Demonstration

Have you ever wondered what a 95% confidence interval really implies?  This demonstration uses the capabilities of StatCrunch to help build that understanding.  Consider the situation where a 95% confidence interval for a population mean is to be constructed from a small sample from a normally distributed population.  Specifically, let's consider the situation where we have a sample of 10 observations from a normal population with a mean of 10.  If you remember your intro stat class, the alarm for a one sample t confidence interval is probably going off in your mind right now.  To aid in the understanding of the associated confidence level of 95%, let's construct not 1 but 1,000 of these intervals.  It may seem a little strange at first to develop these confidence intervals when we know the actual population mean is 10, but this knowledge will help us when we investigate the properties of these intervals later on.

After signing in to the StatCrunch site, open StatCrunch in a new window or tab.  Generate the desired data by selecting the Data > Simulate > Normal option within StatCrunch.  First, simulate 10,000 values (enough for 1,000 samples of size 10) from a normal distribution with a mean of 10 and a standard deviation of 1 as shown in the dialog panel below.  When you click the Simulate button, a new Normal1 column with the simulated values will be added to the data table.  (Note to instructors using this demonstration in class: Ask your students to turn on the Use single dynamic seed option to ensure they will not all get the same results.)

Result 1: Snapshot of Normal Samples Dialog
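For readers without StatCrunch at hand, the simulation step can be approximated in Python.  This is only a sketch using the standard library: the seed value and the variable name normal1 are assumptions chosen for illustration, and Python's random number generator will not reproduce the values StatCrunch generates.

```python
import random
import statistics

# Assumed seed, used only so this sketch is reproducible; StatCrunch
# controls its own seed through the options in the Simulate dialog.
rng = random.Random(2009)

N_VALUES = 10_000   # enough values for 1,000 samples of size 10
MEAN = 10.0         # population mean used in the demonstration
SD = 1.0            # population standard deviation used in the demonstration

# Plays the role of the Normal1 column in the data table.
normal1 = [rng.gauss(MEAN, SD) for _ in range(N_VALUES)]

print(len(normal1))
print(round(statistics.mean(normal1), 2), round(statistics.stdev(normal1), 2))
```

The overall mean and standard deviation of the simulated column should land near 10 and 1, which is a quick sanity check that the simulation was set up as intended.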

To divvy these 10,000 values up into 1,000 separate samples each of size 10, choose the Data > Sequence data option within StatCrunch and specify the options as shown below.  When you click on Create Sequence!, a new Sequence column will be created with the value 1 repeated 10 times, followed by 2 repeated 10 times, ..., all the way up to 1,000 repeated 10 times.  This column will be used as the identifier for each of the 1,000 separate samples stored in Normal1.

Result 2: Snapshot of Sequence Data Dialog
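The structure of the Sequence column can be sketched in a few lines of Python.  The names sequence, SAMPLE_SIZE and N_SAMPLES are illustrative assumptions, not StatCrunch identifiers.

```python
SAMPLE_SIZE = 10    # values per sample
N_SAMPLES = 1_000   # number of separate samples

# The value 1 repeated 10 times, then 2 repeated 10 times, ..., up to 1,000.
sequence = [label
            for label in range(1, N_SAMPLES + 1)
            for _ in range(SAMPLE_SIZE)]

print(sequence[:12])   # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
print(sequence[-3:])   # [1000, 1000, 1000]
```

Row i of the data table (counting from 0) therefore belongs to sample number i // 10 + 1, which is what lets a group-by on this column split the 10,000 values into consecutive blocks of 10.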

Now, compute a separate t confidence interval for each of the samples.  To do so, choose the Stat > T statistics > One sample > with data option within StatCrunch.  Select the Normal1 column as shown below, and then choose the Sequence column for the Group by option.  StatCrunch will then compute a separate set of statistics using the Normal1 values for each of the 1,000 unique labels in the Sequence column. 

Result 3: Snapshot of One Sample T Statistics With Data Dialog

By default, a hypothesis test will be performed.  Click Next and choose the confidence interval option as shown below, leaving the confidence level at 0.95.

Result 4: Snapshot of One Sample T Statistics With Data Dialog

Click Next one more time and turn on the Save output to data table option so that further work can be done with the results.

Result 5: Snapshot of One Sample T Statistics With Data Dialog

When you click Compute, several new columns will be added to the data table, the most important of which are the columns entitled L. Limit and U. Limit.  These two columns contain the lower and upper endpoints of the confidence intervals associated with each of our 1,000 samples of size 10.  For example, the first row in these two columns contains the lower and upper confidence limits of a 95% confidence interval computed using the first 10 values from the Normal1 column. 
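To make the grouped calculation concrete, here is a rough Python sketch of a one sample t interval computed separately for each block of 10 values.  It assumes the usual textbook formula (sample mean plus or minus the t critical value times s divided by the square root of n); the helper t_interval, the seed, and the hard-coded critical value for 9 degrees of freedom are choices made for this sketch, and the numbers will differ from those in the StatCrunch data table.

```python
import math
import random
import statistics

# Hard-coded t quantile (0.975 level, 9 degrees of freedom) for a 95%
# interval from a sample of size 10; the standard library has no t quantile.
T_CRIT_95_DF9 = 2.262157

def t_interval(sample, t_crit):
    """Return the (lower, upper) limits of a one sample t interval."""
    n = len(sample)
    center = statistics.mean(sample)
    margin = t_crit * statistics.stdev(sample) / math.sqrt(n)
    return center - margin, center + margin

rng = random.Random(2009)  # assumed seed, for reproducibility only
normal1 = [rng.gauss(10.0, 1.0) for _ in range(10_000)]

# Consecutive blocks of 10 values correspond to Sequence labels 1, 2, ..., 1000.
samples = [normal1[i:i + 10] for i in range(0, len(normal1), 10)]
limits = [t_interval(s, T_CRIT_95_DF9) for s in samples]

l_limit = [lower for lower, _ in limits]   # plays the role of L. Limit
u_limit = [upper for _, upper in limits]   # plays the role of U. Limit
print(len(limits), limits[0])
```

As in the StatCrunch output, the first pair of limits is computed from the first 10 simulated values only.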

The natural question to ask about these intervals is how many of them contain the true population mean of 10.  To compute this number using StatCrunch, choose the Data > Compute expression option, and then specify the expression, sum(between(10,"L. Limit","U. Limit")), as shown below.  Note, this expression can simply be copied and pasted into StatCrunch.  This expression may seem a little difficult to understand at first, but the between part of the expression simply checks to see if 10 is between the lower and upper limits of each confidence interval.  If 10 is within the interval, a value of 1 is returned for that interval.  Otherwise, a value of 0 is returned.  The outer sum part of the expression simply adds up the 1 values, resulting in the total number of the 1,000 confidence intervals that contain the true population mean of 10.

Result 6: Snapshot of Compute Expression Dialog
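The logic of the sum(between(...)) expression can be mimicked in Python as follows.  This is an illustration only: the between function here is a stand-in written for this sketch (treating the endpoints as inclusive is an assumption), and the four pairs of limits are made-up values, not StatCrunch output.

```python
def between(x, lower, upper):
    """Return 1 if x lies in [lower, upper], otherwise 0.

    Inclusive endpoints are an assumption of this sketch.
    """
    return 1 if lower <= x <= upper else 0

# Hypothetical limits for four intervals, invented purely for illustration.
l_limit = [9.2, 10.1, 9.6, 8.7]
u_limit = [10.5, 11.3, 10.4, 9.9]

# One 0/1 indicator per interval, then the total count of intervals covering 10.
indicators = [between(10, lo, hi) for lo, hi in zip(l_limit, u_limit)]
covered = sum(indicators)

print(indicators, covered)   # [1, 0, 1, 0] 2
```

Here the second interval lies entirely above 10 and the fourth entirely below it, so only two of the four intervals are counted.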

When you click on Compute, the resulting data set should look like the one shown below.

Data set 1. Confidence Interval Demo

The last column contains the results of the expression described above.  So, in this case, 952 of the 1,000 confidence intervals contained the true mean of 10.  As a proportion, this turns out to be (952/1000) = 0.952 or as a percentage, 0.952 x 100 = 95.2%.  This value is very close to 95%.  In fact, if we computed similar intervals for more and more samples, the percentage of intervals containing the true mean would tend to get closer and closer to 95%.  In other words, the procedure used to build a 95% confidence interval should produce intervals that contain the true population parameter 95% of the time over the long run.
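The long-run claim can be explored with a small Python simulation that repeats the whole experiment for different numbers of intervals.  This is a sketch under assumptions: the coverage function, the seed, and the hard-coded t critical value are choices made here, and the proportions it prints will not match the 952 out of 1,000 obtained in StatCrunch.

```python
import math
import random
import statistics

# Hard-coded t quantile: 0.975 level, 9 degrees of freedom (samples of size 10).
T_CRIT_95_DF9 = 2.262157

def coverage(n_intervals, rng, mean=10.0, sd=1.0, n=10):
    """Proportion of simulated 95% t intervals that contain the true mean."""
    hits = 0
    for _ in range(n_intervals):
        sample = [rng.gauss(mean, sd) for _ in range(n)]
        center = statistics.mean(sample)
        margin = T_CRIT_95_DF9 * statistics.stdev(sample) / math.sqrt(n)
        if center - margin <= mean <= center + margin:
            hits += 1
    return hits / n_intervals

rng = random.Random(2009)  # assumed seed, for reproducibility of the sketch only
results = {k: coverage(k, rng) for k in (100, 1_000, 10_000)}
for k, proportion in results.items():
    print(f"{k:>6} intervals: {proportion:.3f} contain the true mean")
```

With only 100 intervals the proportion can easily wander a couple of percentage points from 0.95; with 10,000 it is typically much closer, which is the long-run behavior described above.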

There are a number of excellent follow up questions to be asked.  A few are listed below:

  1. Another column in the above output is the Sample Mean column which contains the sample means for each of the 1,000 samples.  Construct a histogram of this column and overlay a normal distribution using StatCrunch.  Also, try a QQ plot of this column.  Do the resulting graphs indicate that the sample means follow a normal distribution? 
  2. Open a new instance of StatCrunch and repeat the experiment described above for 99% confidence intervals by changing the confidence level to 0.99.  How does this change impact the resulting interval's chances of covering the true mean?  How do the widths of the resulting intervals change?  Use the Data > Compute expression option with an expression of "U. Limit" - "L. Limit" to compute the width of each interval, and then compute summary statistics and graphs of the resulting widths.
  3. In this case, the appropriateness of the t confidence interval relies on the underlying distribution being normal.  Instead of simulating normal samples, try simulating other distributions such as the exponential distribution to see how violations of the normality assumption impact the interval's chances of containing the true mean.  Also, consider the resulting sample means in this case.  Do they look normal?
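As a starting point for the second and third questions, the following Python sketch compares coverage and average interval width at the 95% and 99% levels, and then repeats the 95% calculation for exponential data.  Several things here are assumptions rather than part of the demonstration: the summarize helper, the seed, the use of 5,000 samples, the hard-coded t critical values for 9 degrees of freedom, and the choice of an exponential distribution with mean 10.

```python
import math
import random
import statistics

# Hard-coded t quantiles for 9 degrees of freedom (samples of size 10).
T_CRIT_DF9 = {0.95: 2.262157, 0.99: 3.249836}

def summarize(samples, true_mean, level):
    """Return (coverage proportion, mean width) of one sample t intervals."""
    t_crit = T_CRIT_DF9[level]
    hits = 0
    widths = []
    for s in samples:
        center = statistics.mean(s)
        margin = t_crit * statistics.stdev(s) / math.sqrt(len(s))
        hits += int(center - margin <= true_mean <= center + margin)
        widths.append(2 * margin)   # upper limit minus lower limit
    return hits / len(samples), statistics.mean(widths)

rng = random.Random(2009)  # assumed seed, for reproducibility only
normal_samples = [[rng.gauss(10.0, 1.0) for _ in range(10)]
                  for _ in range(5_000)]
# Exponential with mean 10 (rate 1/10): an assumed parameter choice.
expo_samples = [[rng.expovariate(1 / 10.0) for _ in range(10)]
                for _ in range(5_000)]

cov95, width95 = summarize(normal_samples, 10.0, 0.95)
cov99, width99 = summarize(normal_samples, 10.0, 0.99)
cov95_expo, _ = summarize(expo_samples, 10.0, 0.95)

print(f"normal, 95%: coverage {cov95:.3f}, mean width {width95:.3f}")
print(f"normal, 99%: coverage {cov99:.3f}, mean width {width99:.3f}")
print(f"exponential, 95%: coverage {cov95_expo:.3f}")
```

Because the 99% intervals are built from the same samples with a larger critical value, they are always wider and cover the true mean at least as often as the 95% intervals.  For the skewed exponential data, the coverage at the nominal 95% level typically falls short of 95% with samples this small.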

HTML link:
<A href="http://www.statcrunch.com/5.0/viewreport.php?reportid=6934">Confidence Interval Demonstration</A>


StatCrunch - Data analysis on the Web - Copyright 2007-2009 Integrated Analytics LLC
Distributed exclusively by Pearson Education - Terms of use