Central Limit Theorem
Interactive simulation demonstrating the Central Limit Theorem
The Central Limit Theorem is one of the most fundamental results in statistics. Formally, if $X_1,\dots,X_n$ are i.i.d. with $\mathbb{E}[X_i]=\mu$ and $0<\operatorname{Var}(X_i)=\sigma^2<\infty$, then \(\frac{\sum_{i=1}^n X_i - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} \mathcal{N}(0,1)\) or, equivalently, \(\sqrt{n}\,\frac{\bar X_n - \mu}{\sigma} \xrightarrow{d} \mathcal{N}(0,1)\), where $\bar X_n = \tfrac{1}{n}\sum_{i=1}^n X_i$. The proof is elegant and surprisingly short. At a high level, you compute the characteristic function of the standardized sample mean and show that the higher-order terms in its Taylor expansion vanish as $n$ grows, so it converges to the characteristic function of a standard normal.
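The characteristic-function argument can be sketched in a few lines. The symbols $Y_i$, $Z_n$, and $\varphi$ are notation introduced only for this sketch, and the last step relies on Lévy's continuity theorem, which is not proved here:

```latex
\begin{aligned}
Y_i &= \frac{X_i - \mu}{\sigma}, \qquad \mathbb{E}[Y_i] = 0, \quad \operatorname{Var}(Y_i) = 1, \\
\varphi_Y(t) &= \mathbb{E}\!\left[e^{itY_1}\right] = 1 - \frac{t^2}{2} + o(t^2) \quad (t \to 0), \\
Z_n &= \frac{1}{\sqrt{n}} \sum_{i=1}^n Y_i = \sqrt{n}\,\frac{\bar X_n - \mu}{\sigma}, \\
\varphi_{Z_n}(t) &= \left[\varphi_Y\!\left(\frac{t}{\sqrt{n}}\right)\right]^n
  = \left[1 - \frac{t^2}{2n} + o\!\left(\frac{1}{n}\right)\right]^n
  \xrightarrow[n \to \infty]{} e^{-t^2/2}.
\end{aligned}
```

The first equality in the last line uses independence and identical distribution of the $Y_i$. Since $e^{-t^2/2}$ is the characteristic function of $\mathcal{N}(0,1)$, pointwise convergence of $\varphi_{Z_n}$ gives $Z_n \xrightarrow{d} \mathcal{N}(0,1)$.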
In practice, the CLT says that the distribution of sample means becomes approximately normal as the sample size increases, regardless of the shape of the underlying population (assuming finite variance).
As a student, I struggled to build intuition for the CLT. An interactive demo I once used finally made it click, so I recreated a version here.
How to Use
- Select a Distribution: Choose from Uniform, Normal, Exponential, Bimodal, Right Skewed, or create your own Custom distribution
  - For Custom: Click and drag on the parent population chart to draw any distribution shape you want!
- Set Sample Size (N): Adjust the slider to change how many values are in each sample
- Draw Samples:
  - Click “Draw Sample of 1” or “Draw Sample of 5” to see the full animation:
    - Watch individual data points drop onto the middle chart one by one
    - See the sample mean line appear in red
    - Watch the mean drop down into the sampling distribution
  - Click “Draw 1000 Samples” for quick bulk sampling (without animation)
- Animate: Click “Animate” to continuously draw samples and watch the Central Limit Theorem in action! Click again to stop.
- Reset: Clear all samples and start fresh (also clears custom distributions)
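Conceptually, each draw amounts to the loop below. This is a minimal Python sketch, not the demo's actual code: the function names, the exponential parent, and the idea of representing a custom distribution as bar heights over bin centers are all assumptions made for illustration.

```python
import random
import statistics


def draw_sample(sampler, n, rng):
    """Draw one sample of n values from the parent and return (values, mean)."""
    values = [sampler(rng) for _ in range(n)]
    return values, statistics.fmean(values)


def make_custom_sampler(bin_centers, bar_heights):
    """Hypothetical stand-in for the Custom chart: bar heights act as weights."""
    def sampler(rng):
        return rng.choices(bin_centers, weights=bar_heights, k=1)[0]
    return sampler


def exponential_sampler(rng):
    """Right-skewed parent with mean 1 (an assumed example, not the demo's)."""
    return rng.expovariate(1.0)


rng = random.Random(0)   # fixed seed so the run is repeatable
sample_size = 30         # plays the role of the "Sample Size (N)" slider
sample_means = []        # what accumulates in the bottom chart

# Rough equivalent of clicking "Draw 1000 Samples".
for _ in range(1000):
    _, mean = draw_sample(exponential_sampler, sample_size, rng)
    sample_means.append(mean)

print(f"collected {len(sample_means)} means, "
      f"average {statistics.fmean(sample_means):.3f}")
```

Swapping `exponential_sampler` for `make_custom_sampler(centers, heights)` would mimic drawing your own parent shape; only the sampler changes, while the sample-then-average loop stays the same.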
What You’re Seeing
- Top Chart: The parent population distribution you’re sampling from
- Middle Chart: The current sample with individual values shown as blue dots, and the sample mean marked by a red dashed line
- Bottom Chart: The distribution of all sample means collected so far, with a red curve showing the theoretical normal distribution
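The red curve is presumably the normal density that the CLT predicts, with mean μ and standard deviation σ/√N. Here is a minimal sketch of how such an overlay could be computed, assuming a Uniform(0, 1) parent (μ = 1/2, σ² = 1/12). The function name, grid size, and plotting range are hypothetical choices, not taken from the demo.

```python
import math
from statistics import NormalDist


def theoretical_curve(mu, sigma, n, num_points=101, width=4.0):
    """Return (x, density) pairs for Normal(mu, sigma / sqrt(n)).

    The grid spans mu +/- width standard errors.
    """
    standard_error = sigma / math.sqrt(n)
    dist = NormalDist(mu, standard_error)
    lo = mu - width * standard_error
    step = 2 * width * standard_error / (num_points - 1)
    return [(lo + i * step, dist.pdf(lo + i * step)) for i in range(num_points)]


# Uniform(0, 1) parent: mean 1/2, variance 1/12.
curve = theoretical_curve(mu=0.5, sigma=math.sqrt(1 / 12), n=30)
peak_x, peak_density = max(curve, key=lambda point: point[1])
print(f"curve peaks near x = {peak_x:.3f}")
```

For a curve like this to line up with a histogram of sample means, the histogram has to be scaled as a density, or the curve multiplied by the bin width times the number of samples collected.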
Key Observations
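- Shape: No matter how skewed or unusual the parent distribution is (as long as its variance is finite), the sampling distribution of means approaches a normal distribution as the sample size N increases. Drawing more samples does not change that shape; it only fills in the histogram so the shape is easier to see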
- Center: The mean of the sampling distribution equals the population mean, so the average of the sample means you collect settles close to it
- Spread: The standard deviation of the sampling distribution (the standard error) equals σ/√n, where σ is the population standard deviation and n is the sample size (N on the slider)
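The center and spread claims are easy to check numerically. The sketch below assumes an Exponential(1) parent, for which μ = 1 and σ = 1; the helper name `simulate_means`, the seed, and the choice of 5000 samples per setting are arbitrary illustrative choices.

```python
import math
import random
import statistics


def simulate_means(n, num_samples, rng):
    """Means of num_samples samples, each of n draws from Exponential(1)."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


MU, SIGMA = 1.0, 1.0     # Exponential(1) has mean 1 and standard deviation 1
rng = random.Random(42)  # fixed seed so the run is repeatable

results = {}
for n in (1, 5, 30):
    means = simulate_means(n, 5000, rng)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(
        f"n={n:>2}  mean of means={results[n][0]:.3f}  "
        f"observed SD={results[n][1]:.3f}  "
        f"sigma/sqrt(n)={SIGMA / math.sqrt(n):.3f}"
    )
```

The mean of the simulated means should sit near 1 for every n, while the observed standard deviation should track σ/√n and shrink as n grows.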
Try creating a weird custom distribution (bimodal, trimodal, super skewed) and watch how the distribution of sample means still comes out close to normal once N is large enough. Strongly skewed shapes need a larger N before the bell curve appears.
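One way to put a number on "close to normal" is the skewness of the sample means, which is 0 for a perfectly symmetric distribution. The sketch below uses a made-up three-spike parent with most of its mass on the left; the values, weights, and seed are hypothetical and not taken from the demo.

```python
import random
import statistics


def skewness(data):
    """Sample skewness: third central moment over the cubed standard deviation."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return statistics.fmean([(x - mean) ** 3 for x in data]) / sd ** 3


# Hypothetical "weird" parent: three spikes, heavily right-skewed.
VALUES = [0.0, 1.0, 10.0]
WEIGHTS = [6, 3, 1]

rng = random.Random(7)   # fixed seed so the run is repeatable
skew_by_n = {}
for n in (1, 5, 30):
    means = [
        statistics.fmean(rng.choices(VALUES, weights=WEIGHTS, k=n))
        for _ in range(4000)
    ]
    skew_by_n[n] = skewness(means)
    print(f"n={n:>2}  skewness of sample means = {skew_by_n[n]:.2f}")
```

For i.i.d. draws, the skewness of the sample mean is the parent's skewness divided by √n, so the printed values should fall toward 0 as n grows, without reaching it at any finite n.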