10 Binomial Approximation

#NormalDistribution #Entropy #LLN #CLT #BinomialDistribution

Consider $I_{1}, \dots, I_{n} \overset{i . i . d}{\sim} Bernoulli (p)$ and $S_{n} \sim Binomial (n, p)$ , so $S_{n} \overset{d}{=} I_{1} + \dots + I_{n}$ . We know $P (S_{n} = k) = \binom{n}{k} p^{k} (1 - p)^{n - k} = \frac{n!}{k! (n - k)!} p^{k} (1 - p)^{n - k}, \quad 0 \leq k \leq n .$
We want to approximate these probabilities, because factorials are troublesome to work with. The key tool is Stirling's approximation: for $n \geq 1$ , $e^{\frac{1}{12 n + 1}} < \frac{n!}{(\frac{n}{e})^{n} \sqrt{2 π n}} < e^{\frac{1}{12 n}} .$
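As a quick sanity check, the two-sided Stirling bound can be verified numerically. The sketch below is only illustrative: the helper name `stirling_ratio` is made up for this example, and the ratio is computed in log space with `math.lgamma` so that large factorials do not overflow.

```python
import math

def stirling_ratio(n: int) -> float:
    """Return n! / ((n/e)^n * sqrt(2*pi*n)), computed in log space."""
    log_factorial = math.lgamma(n + 1)  # log(n!)
    log_stirling = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    return math.exp(log_factorial - log_stirling)

for n in (1, 2, 5, 10, 100):
    lower = math.exp(1 / (12 * n + 1))
    upper = math.exp(1 / (12 * n))
    ratio = stirling_ratio(n)
    # The ratio should sit strictly between the two exponential bounds.
    print(n, lower, ratio, upper, lower < ratio < upper)
```

For very large $n$ the gap between the ratio and the upper bound falls below floating-point precision, so this check is only meaningful for moderate $n$ .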

1 Entropy Approximation

We can use entropy, or more specifically the KL divergence, to approximate the Binomial distribution.

Theorem (Entropy Approximation)

Let $S_{n} \sim Binomial (n, p)$ and define $f = \frac{k}{n}$ . Then for $k = 1, \dots, n - 1$ , $\begin{aligned} \frac{1}{\sqrt{2 π n f (1 - f)}} e^{- n KL (f | | p)} & > P (S_{n} = k) \\ & > [1 - \frac{1}{12 n f (1 - f)}] \frac{1}{\sqrt{2 π n f (1 - f)}} e^{- n KL (f | | p)}, \end{aligned}$ where $KL (f | | p) = - f \log (\frac{p}{f}) - (1 - f) \log (\frac{1 - p}{1 - f}) .$
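The theorem can be checked numerically by comparing the exact Binomial probabilities with the two bounds. This is a minimal sketch; the function names (`kl`, `binom_pmf`, `entropy_bounds`) and the test values $n = 50$ , $p = 0.3$ are assumptions chosen for illustration.

```python
import math

def kl(f: float, p: float) -> float:
    """KL divergence between Bernoulli(f) and Bernoulli(p), for 0 < f, p < 1."""
    return f * math.log(f / p) + (1 - f) * math.log((1 - f) / (1 - p))

def binom_pmf(n: int, k: int, p: float) -> float:
    """Exact P(S_n = k) for S_n ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def entropy_bounds(n: int, k: int, p: float) -> tuple[float, float]:
    """Lower and upper bounds on P(S_n = k) from the entropy approximation."""
    f = k / n
    upper = math.exp(-n * kl(f, p)) / math.sqrt(2 * math.pi * n * f * (1 - f))
    lower = (1 - 1 / (12 * n * f * (1 - f))) * upper
    return lower, upper

n, p = 50, 0.3
for k in (1, 10, 15, 25, 49):
    lower, upper = entropy_bounds(n, k, p)
    print(k, lower, binom_pmf(n, k, p), upper)
```

The relative gap between the two bounds is $\frac{1}{12 n f (1 - f)}$ , so the approximation is tightest when $k$ is far from both $0$ and $n$ .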

2 Normal Approximation

Taylor expansion of $KL (f | | p)$ (viewed as a function of $f$ ) about $f = p$ gives $KL (f | | p) = \frac{(f - p)^{2}}{2 p (1 - p)} + \frac{2 g - 1}{6 g^{2} (1 - g)^{2}} (f - p)^{3}$ for some $g$ between $f$ and $p$ .

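To obtain a normal approximation, we need the cubic remainder in the exponent $n KL (f | | p)$ to be small. Denote its absolute value by $R$ . Since $| 2 g - 1 | \leq 1$ and $g$ lies between $f$ and $p$ , $R = | \frac{n (2 g - 1)}{6 g^{2} (1 - g)^{2}} (f - p)^{3} | \leq \frac{n | f - p |^{3}}{6 [\min (f, p) \min (1 - f, 1 - p)]^{2}} .$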
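The bound on $R$ can be checked numerically by computing the gap between $n KL (f | | p)$ and its quadratic term directly. This is only a sketch; the helper names and the values $n = 1000$ , $p = 0.3$ are illustrative assumptions.

```python
import math

def kl(f: float, p: float) -> float:
    """KL divergence between Bernoulli(f) and Bernoulli(p), for 0 < f, p < 1."""
    return f * math.log(f / p) + (1 - f) * math.log((1 - f) / (1 - p))

def cubic_remainder(n: int, k: int, p: float) -> float:
    """|n*KL(f||p) - n*(f-p)^2 / (2p(1-p))|, i.e. the remainder R."""
    f = k / n
    return abs(n * kl(f, p) - n * (f - p) ** 2 / (2 * p * (1 - p)))

def remainder_bound(n: int, k: int, p: float) -> float:
    """Upper bound n|f-p|^3 / (6 [min(f,p) min(1-f,1-p)]^2) on R."""
    f = k / n
    return n * abs(f - p) ** 3 / (6 * (min(f, p) * min(1 - f, 1 - p)) ** 2)

n, p = 1000, 0.3
for k in (250, 290, 300, 310, 350):
    print(k, cubic_remainder(n, k, p), remainder_bound(n, k, p))
```

Both quantities grow quickly as $k$ moves away from $n p$ , which is why the normal approximation degrades in the tails.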

If $f$ and $p$ are both bounded away from $0$ and $1$ , and $n | f - p |^{3}$ is small, then this upper bound will be small.
For $n | f - p |^{3}$ to be small, we need $| f - p |$ to decay faster than $n^{- \frac{1}{3}}$ as $n \to \infty$ .
In particular, if $| f - p | = O (\frac{1}{\sqrt{n}})$ , then $n | f - p |^{3} = O (\frac{1}{\sqrt{n}}) \to 0$ , so the normal approximation for $P (S_{n} = k)$ is accurate for large $n$ .

The good news is that when $I_{1}, \dots, I_{n} \overset{i . i . d}{\sim} Bernoulli (p)$ , by SLLN, $\frac{1}{n} (I_{1} + \dots + I_{n}) \overset{a . s .}{\to} p$ , so $f = \frac{k}{n}$ is close to $p$ for the values of $k$ that $S_{n}$ typically takes.
So the normal approximation is $P (S_{n} = k) \sim \frac{1}{\sqrt{2 π n p (1 - p)}} e^{- \frac{(k - n p)^{2}}{2 n p (1 - p)}} .$
This is accurate if $\frac{n | f - p |^{3}}{6 [\min (f, p) \min (1 - f, 1 - p)]^{2}} ≪ 1$ .
However, in general, the normal approximation is less accurate than the entropy approximation.
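To see the difference in accuracy concretely, one can compare both approximations with the exact probability at a few values of $k$ . The sketch below uses the leading term of the entropy approximation; the function names and the parameters $n = 100$ , $p = 0.3$ are assumptions made for this illustration only.

```python
import math

def kl(f: float, p: float) -> float:
    """KL divergence between Bernoulli(f) and Bernoulli(p), for 0 < f, p < 1."""
    return f * math.log(f / p) + (1 - f) * math.log((1 - f) / (1 - p))

def exact_pmf(n: int, k: int, p: float) -> float:
    """Exact P(S_n = k) for S_n ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def entropy_approx(n: int, k: int, p: float) -> float:
    """Leading term of the entropy approximation, valid for 1 <= k <= n - 1."""
    f = k / n
    return math.exp(-n * kl(f, p)) / math.sqrt(2 * math.pi * n * f * (1 - f))

def normal_approx(n: int, k: int, p: float) -> float:
    """Normal approximation to P(S_n = k)."""
    var = n * p * (1 - p)
    return math.exp(-((k - n * p) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def relative_error(approx: float, exact: float) -> float:
    return abs(approx - exact) / exact

n, p = 100, 0.3
for k in (30, 35, 45):
    exact = exact_pmf(n, k, p)
    print(
        k,
        relative_error(entropy_approx(n, k, p), exact),
        relative_error(normal_approx(n, k, p), exact),
    )
```

At $k = n p$ the two approximations coincide, since $KL (f | | p) = 0$ and $f = p$ there. The gap opens up as $k$ moves into the tail, where the cubic remainder is no longer negligible.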

We can also use CLT:

Theorem (CLT for Binomial Distribution)

Let $S_{n} \sim Binomial (n, p)$ for $0 < p < 1$ . Then $\forall a, b \in \mathbb{R}$ where $a < b$ , $\lim_{n \to \infty} P (a \leq \sqrt{n} \frac{\frac{S_{n}}{n} - p}{\sqrt{p (1 - p)}} \leq b) = \int_{a}^{b} \frac{1}{\sqrt{2 π}} e^{- \frac{t^{2}}{2}} d t .$

Proof

When $n p + a \sqrt{n p (1 - p)} \leq k \leq n p + b \sqrt{n p (1 - p)}$ , $| \frac{k}{n} - p | = O (\frac{1}{\sqrt{n}})$ , so the normal approximation for $P (S_{n} = k)$ applies to every $k$ in this range.
Next, let $t_{k} = \frac{k - n p}{\sqrt{n p (1 - p)}}$ . Then $t_{k} - t_{k - 1} = \frac{1}{\sqrt{n p (1 - p)}} \to 0$ .
So, $\begin{aligned} P (a \leq \sqrt{n} \frac{\frac{S_{n}}{n} - p}{\sqrt{p (1 - p)}} \leq b) & = \sum_{k = \lceil n p + a \sqrt{n p (1 - p)} \rceil}^{\lfloor n p + b \sqrt{n p (1 - p)} \rfloor} P (S_{n} = k) \\ & \sim \sum_{t_{k} \in [a, b]} (t_{k} - t_{k - 1}) \frac{1}{\sqrt{2 π}} e^{- \frac{t_{k}^{2}}{2}}, \end{aligned}$ which is a Riemann sum converging to $\int_{a}^{b} \frac{1}{\sqrt{2 π}} e^{- \frac{t^{2}}{2}} d t$ .
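The convergence in the theorem can be observed numerically by summing exact Binomial probabilities over the standardized range and comparing with $Φ (b) - Φ (a)$ . The sketch below is illustrative only: the function names are made up, the probabilities are computed in log space via `math.lgamma` to avoid underflow for large $n$ , and the values of $n$ , $p$ , $a$ , $b$ are arbitrary choices.

```python
import math
from statistics import NormalDist

def log_binom_pmf(n: int, k: int, p: float) -> float:
    """log P(S_n = k) for S_n ~ Binomial(n, p), stable for large n."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

def standardized_prob(n: int, p: float, a: float, b: float) -> float:
    """Exact P(a <= (S_n - np) / sqrt(np(1-p)) <= b)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    k_lo = max(0, math.ceil(mu + a * sigma))
    k_hi = min(n, math.floor(mu + b * sigma))
    return sum(math.exp(log_binom_pmf(n, k, p)) for k in range(k_lo, k_hi + 1))

a, b, p = -1.0, 2.0, 0.3
limit = NormalDist().cdf(b) - NormalDist().cdf(a)
for n in (100, 1000, 10000):
    print(n, standardized_prob(n, p, a, b), limit)
```

The discrepancy shrinks roughly like $\frac{1}{\sqrt{n}}$ , which matches the spacing $t_{k} - t_{k - 1}$ of the Riemann sum.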

2.1 Application of CLT for $Binomial (n, p)$

Let $p$ be the proportion of the population supporting Trump,
$n$ be the number of people polled uniformly at random from the population,
$S_{n}$ be the number of people in the sample who support Trump, and
${\hat{p}}_{n} = \frac{S_{n}}{n}$ be the estimator of $p$ .
How large should $n$ be so that $P (p \in [{\hat{p}}_{n} - ε, {\hat{p}}_{n} + ε]) \geq 1 - α$ for given $ε > 0$ and $0 < α < 1$ ?

Here $1 - α$ is the #ConfidenceLevel . Each time the poll is repeated, ${\hat{p}}_{n}$ changes, and the resulting random interval should cover $p$ at least $100 (1 - α) \%$ of the time.
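The idea of a moving interval can be illustrated by simulation: repeat the poll many times and count how often $[{\hat{p}}_{n} - ε, {\hat{p}}_{n} + ε]$ contains the true $p$ . This is a rough sketch under assumed values ( $n = 500$ , $p = 0.4$ , $ε = 0.045$ , 2000 repetitions, a fixed seed); none of these numbers come from the notes.

```python
import random

def coverage(n: int, p: float, eps: float, trials: int, seed: int = 0) -> float:
    """Fraction of simulated polls whose interval [p_hat - eps, p_hat + eps] covers p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One poll: n independent Bernoulli(p) responses.
        s = sum(rng.random() < p for _ in range(n))
        p_hat = s / n
        if abs(p_hat - p) <= eps:
            hits += 1
    return hits / trials

print(coverage(n=500, p=0.4, eps=0.045, trials=2000))
```

The printed fraction is an empirical estimate of $P (p \in [{\hat{p}}_{n} - ε, {\hat{p}}_{n} + ε])$ and fluctuates with the seed and the number of repetitions.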

Denote $I_{j} = \begin{cases} 1, & \text{if the } j \text{-th person polled supports Trump}, \\ 0, & \text{otherwise} . \end{cases}$
Then $I_{1}, \dots, I_{n} \overset{i . i . d}{\sim} Bernoulli (p)$ and $S_{n} = I_{1} + \dots + I_{n} \sim Binomial (n, p)$ . By CLT, $\sqrt{n} \frac{{\hat{p}}_{n} - p}{\sqrt{p (1 - p)}} \overset{d}{\to} Z \sim N (0, 1) .$ Writing $z = \frac{ε \sqrt{n}}{\sqrt{p (1 - p)}}$ , we get $\begin{aligned} P ({\hat{p}}_{n} - ε \leq p \leq {\hat{p}}_{n} + ε) & = P (- ε \leq {\hat{p}}_{n} - p \leq ε) \\ & = P (- z \leq \sqrt{n} \frac{{\hat{p}}_{n} - p}{\sqrt{p (1 - p)}} \leq z) \\ & \approx Φ (z) - Φ (- z) = 1 - 2 Φ (- z), \end{aligned}$
so

$\begin{aligned} 1 - 2 Φ (- z) \geq 1 - α & \Rightarrow Φ (- z) \leq \frac{α}{2} \\ & \Rightarrow z \geq - Φ^{- 1} (α / 2) \\ & \Rightarrow n \geq {[- \frac{Φ^{- 1} (α / 2)}{ε}]}^{2} p (1 - p) . \end{aligned}$

Since $p (1 - p) \leq \frac{1}{4}, \forall p \in [0, 1]$ , it suffices to take $n \geq \frac{1}{4} {[- \frac{Φ^{- 1} (α / 2)}{ε}]}^{2} .$

If $ε = 0.05, α = 0.05$ , we have $n \geq 385$ .
If $ε = 0.01, α = 0.05$ , we have $n \geq 9604$ .
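These sample sizes can be reproduced with the standard library. The sketch below evaluates the worst-case bound with $p (1 - p) = \frac{1}{4}$ and rounds up to the next integer; the function name `sample_size` is an assumption made for this example.

```python
import math
from statistics import NormalDist

def sample_size(eps: float, alpha: float) -> int:
    """Smallest integer n with n >= (1/4) * (-Phi^{-1}(alpha/2) / eps)^2."""
    z = -NormalDist().inv_cdf(alpha / 2)  # positive quantile, about 1.96 for alpha = 0.05
    return math.ceil(0.25 * (z / eps) ** 2)

print(sample_size(0.05, 0.05))  # 385
print(sample_size(0.01, 0.05))  # 9604
```

Because $n$ scales like $\frac{1}{ε^{2}}$ , shrinking the margin of error by a factor of 5 multiplies the required sample size by about 25.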
