Binomial distribution

[Figures: probability mass function and cumulative distribution function of the binomial distribution]
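Notation	$B(n,p)$
Parameters	$n\in \{0,1,2,\ldots \}$ – number of trials; $p\in [0,1]$ – success probability for each trial; $q=1-p$
Support	$k\in \{0,1,\ldots ,n\}$ – number of successes
PMF	${\binom {n}{k}}p^{k}q^{n-k}$
CDF	$I_{q}(n-\lfloor k\rfloor ,1+\lfloor k\rfloor )$ (the regularized incomplete beta function)
Mean	$np$
Median	$\lfloor np\rfloor$ or $\lceil np\rceil$
Mode	$\lfloor (n+1)p\rfloor$ or $\lceil (n+1)p\rceil -1$
Variance	$npq=np(1-p)$
Skewness	$(q-p)/{\sqrt {npq}}$
Ex. kurtosis	$(1-6pq)/(npq)$
Entropy	${\tfrac {1}{2}}\log _{2}(2\pi enpq)+O(1/n)$ in shannons. For nats, use the natural log in the log.
MGF	$(q+pe^{t})^{n}$
CF	$(q+pe^{it})^{n}$
PGF	$(q+pz)^{n}$
Fisher information	$g_{n}(p)=n/(pq)$ (for fixed $n$)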

Figure: Binomial distribution for p = 0.5, with n and k as in Pascal's triangle. The probability that a ball in a Galton box with 8 layers (n = 8) ends up in the central bin (k = 4) is 70/256.

In probability theory and statistics, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question, and each with its own Boolean-valued outcome: success (with probability p) or failure (with probability $q=1-p$ ). A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment, and a sequence of outcomes is called a Bernoulli process; for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.^[1]

The binomial distribution is frequently used to model the number of successes in a sample of size n drawn with replacement from a population of size N. If the sampling is carried out without replacement, the draws are not independent and so the resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much larger than n, the binomial distribution remains a good approximation, and is widely used.
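The quality of this approximation can be checked numerically. The sketch below is illustrative only: the function names, the population sizes and the 30% success fraction are assumptions chosen for the example, and the hypergeometric probability mass function is the standard textbook formula rather than anything stated above.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of k successes in n draws WITH replacement."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeometric_pmf(k, n, N, K):
    """Probability of k successes in n draws WITHOUT replacement
    from a population of N items, K of which count as successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def max_abs_gap(n, N, K):
    """Largest pointwise difference between the two PMFs over k = 0..n."""
    p = K / N
    return max(
        abs(binomial_pmf(k, n, p) - hypergeometric_pmf(k, n, N, K))
        for k in range(n + 1)
    )

# Same sample size and success fraction, two population sizes.
gap_small_population = max_abs_gap(n=10, N=50, K=15)
gap_large_population = max_abs_gap(n=10, N=50_000, K=15_000)
print(gap_small_population, gap_large_population)
```

With the sample size held at 10, the gap between the two distributions shrinks as the population grows from 50 to 50,000.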

Definitions

Probability mass function

In general, if the random variable X follows the binomial distribution with parameters n ∈ $\mathbb {N}$ and p ∈ [0,1], we write X ~ B(n, p). The probability of getting exactly k successes in n independent Bernoulli trials is given by the probability mass function:

f(k,n,p)=\Pr(k;n,p)=\Pr(X=k)={\binom {n}{k}}p^{k}(1-p)^{n-k}

for k = 0, 1, 2, ..., n, where

{\binom {n}{k}}={\frac {n!}{k!(n-k)!}}

is the binomial coefficient, hence the name of the distribution. The formula can be understood as follows: k successes occur with probability p^k and n − k failures occur with probability $(1-p)^{n-k}$ . However, the k successes can occur anywhere among the n trials, and there are ${\tbinom {n}{k}}$ different ways of distributing k successes in a sequence of n trials.

Reference tables of binomial probabilities are usually filled in only up to n/2. For k > n/2, the probability can be obtained from the complementary case, in which the roles of success and failure are swapped:

f(k,n,p)=f(n-k,n,1-p).
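As a minimal sketch, the probability mass function and the complement identity above can be written and checked in Python. The function name `binomial_pmf` and the parameters n = 12, p = 0.3 are choices made for this illustration.

```python
from math import comb, isclose

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 12, 0.3

# The probabilities over the support k = 0..n add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

# f(k, n, p) = f(n - k, n, 1 - p): k successes with probability p
# is the same event as n - k failures with probability 1 - p.
symmetric = all(
    isclose(binomial_pmf(k, n, p), binomial_pmf(n - k, n, 1 - p))
    for k in range(n + 1)
)
print(total, symmetric)
```

The comparison uses `isclose` rather than `==` because both sides are floating-point products evaluated in a different order.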

Looking at the expression f(k, n, p) as a function of k, there is a k value that maximizes it. This k value can be found by calculating

{\frac {f(k+1,n,p)}{f(k,n,p)}}={\frac {(n-k)p}{(k+1)(1-p)}}

and comparing it to 1. There is always an integer M that satisfies^[2]

(n+1)p-1\leq M<(n+1)p.

f(k, n, p) is monotone increasing for k < M and monotone decreasing for k > M, with the exception of the case where (n + 1)p is an integer. In this case, there are two values for which f is maximal: (n + 1)p and (n + 1)p − 1. M is the most probable outcome (that is, the most likely, although this can still be unlikely overall) of the Bernoulli trials and is called the mode.
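The ratio of consecutive probabilities also gives a way to build the whole table of f(k, n, p) without evaluating factorials. The following sketch assumes 0 < p < 1; the function name and the example values n = 10, p = 0.3 are illustrative.

```python
def binomial_pmf_table(n, p):
    """Return [f(0), ..., f(n)] using the recurrence
    f(k+1) = f(k) * (n - k) p / ((k + 1)(1 - p)).  Assumes 0 < p < 1."""
    probs = [(1 - p) ** n]  # f(0): all n trials fail
    for k in range(n):
        ratio = (n - k) * p / ((k + 1) * (1 - p))
        probs.append(probs[-1] * ratio)
    return probs

n, p = 10, 0.3
table = binomial_pmf_table(n, p)

# The ratio exceeds 1 while k < (n + 1)p - 1 = 2.3, so the table
# rises up to k = 3 and falls afterwards.
peak = max(range(n + 1), key=table.__getitem__)
print(peak, table[peak])
```

Here (n + 1)p = 3.3 is not an integer, so the single peak sits at the integer M with 2.3 ≤ M < 3.3, that is M = 3.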

Example

Suppose a biased coin comes up heads with probability 0.3 when tossed. The probability of seeing exactly 4 heads in 6 tosses is

f(4,6,0.3)={\binom {6}{4}}0.3^{4}(1-0.3)^{6-4}=0.059535.
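
Spelling out the arithmetic of this example, the three factors are

{\binom {6}{4}}={\frac {6!}{4!\,2!}}=15,\qquad 0.3^{4}=0.0081,\qquad 0.7^{2}=0.49,

and their product is

15\times 0.0081\times 0.49=0.059535.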

Cumulative distribution function

The cumulative distribution function can be expressed as:

F(k;n,p)=\Pr(X\leq k)=\sum _{i=0}^{\lfloor k\rfloor }{n \choose i}p^{i}(1-p)^{n-i},

where $\lfloor k\rfloor$ is the "floor" under k, i.e. the greatest integer less than or equal to k.

It can also be represented in terms of the regularized incomplete beta function, as follows:^[3]

{\begin{aligned}F(k;n,p)&=\Pr(X\leq k)\\&=I_{1-p}(n-k,k+1)\\&=(n-k){n \choose k}\int _{0}^{1-p}t^{n-k-1}(1-t)^{k}\,dt.\end{aligned}}

This is equivalent to the cumulative distribution function of the $F$-distribution:^[4]

F(k;n,p)=F_{F{\text{-distribution}}}\left(x={\frac {1-p}{p}}{\frac {k+1}{n-k}};d_{1}=2(n-k),d_{2}=2(k+1)\right).
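The agreement between the finite sum and the integral form can be checked numerically. The sketch below uses only the standard library; the midpoint-rule integration, the number of steps and the function names are assumptions made for illustration, and a numerical library's incomplete beta routine would normally be preferred in practice.

```python
from math import comb, floor

def binomial_cdf(k, n, p):
    """Pr(X <= k), summing the probability mass function up to floor(k)."""
    top = min(floor(k), n)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(top + 1))

def binomial_cdf_beta_integral(k, n, p, steps=20_000):
    """Same quantity from (n-k) C(n,k) * integral_0^{1-p} t^(n-k-1) (1-t)^k dt,
    for integer 0 <= k < n.  The integral is approximated with the
    midpoint rule, so the result is only accurate to several digits."""
    h = (1 - p) / steps
    integral = h * sum(
        ((j + 0.5) * h) ** (n - k - 1) * (1 - (j + 0.5) * h) ** k
        for j in range(steps)
    )
    return (n - k) * comb(n, k) * integral

direct = binomial_cdf(4, 10, 0.3)
via_integral = binomial_cdf_beta_integral(4, 10, 0.3)
print(direct, via_integral)
```

Because of the floor in the summation limit, `binomial_cdf(4.7, 10, 0.3)` returns the same value as `binomial_cdf(4, 10, 0.3)`.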

Some closed-form bounds for the cumulative distribution function are given below.

Properties

Expected value and variance

If X ~ B(n, p), that is, X is a binomially distributed random variable, n being the total number of experiments and p the probability of each experiment yielding a successful result, then the expected value of X is:^[5]

\operatorname {E} [X]=np.

This follows from the linearity of the expected value along with the fact that $X$ is the sum of $n$ identically distributed Bernoulli random variables, each with expected value $p$ . In other words, if $X_{1},\ldots ,X_{n}$ are independent and identically distributed Bernoulli random variables with parameter $p$ , then $X=X_{1}+\cdots +X_{n}$ and

\operatorname {E} [X]=\operatorname {E} [X_{1}+\cdots +X_{n}]=\operatorname {E} [X_{1}]+\cdots +\operatorname {E} [X_{n}]=p+\cdots +p=np.
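The representation of X as a sum of Bernoulli trials can be illustrated by simulation. This is a rough sketch: the seed, the number of draws and the parameters n = 20, p = 0.3 are arbitrary choices, and the sample mean only approximates np.

```python
import random

def sample_binomial(n, p, rng):
    """Draw one binomial variate by counting successes in n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)  # fixed seed (arbitrary) so the run is reproducible
n, p, draws = 20, 0.3, 20_000

samples = [sample_binomial(n, p, rng) for _ in range(draws)]
sample_mean = sum(samples) / draws

# The sample mean should lie close to the expected value np = 6.
print(sample_mean, n * p)
```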

The variance is:

\operatorname {Var} (X)=npq=np(1-p).

This similarly follows from the fact that the variance of a sum of independent random variables is the sum of the variances.
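
In more detail, each Bernoulli variable $X_{i}$ takes only the values 0 and 1, so $X_{i}^{2}=X_{i}$ and

\operatorname {Var} (X_{i})=\operatorname {E} [X_{i}^{2}]-\operatorname {E} [X_{i}]^{2}=p-p^{2}=p(1-p).

Summing over the $n$ independent trials gives

\operatorname {Var} (X)=\operatorname {Var} (X_{1})+\cdots +\operatorname {Var} (X_{n})=np(1-p).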

Higher moments

The first 6 central moments, defined as $\mu _{c}=\operatorname {E} \left[(X-\operatorname {E} [X])^{c}\right]$ , are given by

{\begin{aligned}\mu _{1}&=0,\\\mu _{2}&=np(1-p),\\\mu _{3}&=np(1-p)(1-2p),\\\mu _{4}&=np(1-p)(1+(3n-6)p(1-p)),\\\mu _{5}&=np(1-p)(1-2p)(1+(10n-12)p(1-p)),\\\mu _{6}&=np(1-p)(1-30p(1-p)(1-4p(1-p))+5np(1-p)(5-26p(1-p))+15n^{2}p^{2}(1-p)^{2}).\end{aligned}}
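These closed forms can be checked against the definition by summing over the support. The sketch below uses exact rational arithmetic so that the comparison is free of rounding error; the function name and the parameters n = 7, p = 1/3 are assumptions made for the example.

```python
from fractions import Fraction
from math import comb

def central_moment(c, n, p):
    """E[(X - np)^c] for X ~ B(n, p), computed directly from the PMF."""
    mean = n * p
    return sum(
        (k - mean) ** c * comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )

n, p = 7, Fraction(1, 3)
v = p * (1 - p)  # shorthand for p(1 - p)

# Closed forms for the third to sixth central moments, as listed above.
closed_forms = {
    3: n * v * (1 - 2 * p),
    4: n * v * (1 + (3 * n - 6) * v),
    5: n * v * (1 - 2 * p) * (1 + (10 * n - 12) * v),
    6: n * v * (1 - 30 * v * (1 - 4 * v) + 5 * n * v * (5 - 26 * v)
                + 15 * n**2 * v**2),
}

matches = {c: central_moment(c, n, p) == value for c, value in closed_forms.items()}
print(matches)
```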

The non-central moments satisfy

{\begin{aligned}\operatorname {E} [X]&=np,\\\operatorname {E} [X^{2}]&=np(1-p)+n^{2}p^{2},\end{aligned}}

and, in general,^[6]^[7]

\operatorname {E} [X^{c}]=\sum _{k=0}^{c}\left\{{c \atop k}\right\}n^{\underline {k}}p^{k},

where $\textstyle \left\{{c \atop k}\right\}$ are the Stirling numbers of the second kind, and $n^{\underline {k}}=n(n-1)\cdots (n-k+1)$ is the $k$th falling power of $n$. A simple bound^[8] follows by bounding the binomial moments via the higher Poisson moments:

\operatorname {E} [X^{c}]\leq \left({\frac {c}{\log(c/(np)+1)}}\right)^{c}\leq (np)^{c}\exp \left({\frac {c^{2}}{2np}}\right).

This shows that if $c=O({\sqrt {np}})$ , then $\operatorname {E} [X^{c}]$ is at most a constant factor away from $\operatorname {E} [X]^{c}$ .
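The Stirling-number formula for the non-central moments can be compared with a direct sum over the support. The sketch below is illustrative: the helper names are assumptions, the Stirling numbers are computed with the standard recurrence S(c, k) = k S(c−1, k) + S(c−1, k−1), and exact fractions are used so that equality can be tested directly.

```python
from fractions import Fraction
from math import comb

def stirling2(c, k):
    """Stirling number of the second kind, by the standard recurrence."""
    if c == k:
        return 1
    if k == 0 or k > c:
        return 0
    return k * stirling2(c - 1, k) + stirling2(c - 1, k - 1)

def falling_power(n, k):
    """n (n - 1) ... (n - k + 1), with the empty product equal to 1."""
    result = 1
    for j in range(k):
        result *= n - j
    return result

def raw_moment_stirling(c, n, p):
    """E[X^c] from the Stirling-number formula."""
    return sum(stirling2(c, k) * falling_power(n, k) * p**k for k in range(c + 1))

def raw_moment_direct(c, n, p):
    """E[X^c] summed directly over the probability mass function."""
    return sum(k**c * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1))

n, p = 6, Fraction(2, 5)
agree = all(raw_moment_stirling(c, n, p) == raw_moment_direct(c, n, p) for c in range(6))
print(agree)
```

For c = 2 the formula reduces to np + n(n − 1)p², which equals the expression np(1 − p) + n²p² given above.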

Mode

Usually the mode of a binomial B(n, p) distribution is equal to $\lfloor (n+1)p\rfloor$ , where $\lfloor \cdot \rfloor$ is the floor function. However, when (n + 1)p is an integer and p is neither 0 nor 1, the distribution has two modes: (n + 1)p and (n + 1)p − 1. When p is equal to 0 or 1, the mode is 0 or n, respectively. These cases can be summarized as follows:

{\text{mode}}={\begin{cases}\lfloor (n+1)\,p\rfloor &{\text{if }}(n+1)p{\text{ is 0 or a noninteger}},\\(n+1)\,p\ {\text{ and }}\ (n+1)\,p-1&{\text{if }}(n+1)p\in \{1,\dots ,n\},\\n&{\text{if }}(n+1)p=n+1.\end{cases}}

Proof: Let

f(k)={\binom {n}{k}}p^{k}q^{n-k}.

For $p=0$ only $f(0)$ has a nonzero value with $f(0)=1$ . For $p=1$ we find $f(n)=1$ and $f(k)=0$ for $k\neq n$ . This proves that the mode is 0 for $p=0$ and $n$ for $p=1$ .

Let $0<p<1$ . We find

{\frac {f(k+1)}{f(k)}}={\frac {(n-k)p}{(k+1)(1-p)}}.

From this follows

{\begin{aligned}k>(n+1)p-1\Rightarrow f(k+1)<f(k)\\k=(n+1)p-1\Rightarrow f(k+1)=f(k)\\k<(n+1)p-1\Rightarrow f(k+1)>f(k)\end{aligned}}

So when $(n+1)p-1$ is an integer, both $(n+1)p-1$ and $(n+1)p$ are modes. When $(n+1)p-1\notin \mathbb {Z}$ , the only mode is $\lfloor (n+1)p-1\rfloor +1=\lfloor (n+1)p\rfloor$ .^[9]
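The case analysis for the mode can be sketched in code and compared with a brute-force search over the probability mass function. The function names are illustrative, and p is assumed to be supplied as an exact `Fraction` (or integer), because testing whether (n + 1)p is an integer is unreliable with floating-point values.

```python
from fractions import Fraction
from math import comb, floor

def binomial_modes(n, p):
    """Return the mode(s) of B(n, p) as a list, following the three cases."""
    t = (n + 1) * Fraction(p)
    if t == n + 1:                    # p = 1: all mass at k = n
        return [n]
    if t == 0 or t.denominator != 1:  # p = 0, or (n + 1)p is not an integer
        return [floor(t)]
    return [int(t) - 1, int(t)]       # (n + 1)p in {1, ..., n}: two modes

def brute_force_modes(n, p):
    """Return every k at which the PMF attains its maximum."""
    p = Fraction(p)
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    best = max(probs)
    return [k for k, value in enumerate(probs) if value == best]

# Hypothetical test grid: a few exact probabilities, including 0 and 1.
grid = [Fraction(0), Fraction(1, 4), Fraction(1, 3), Fraction(1, 2),
        Fraction(3, 10), Fraction(1)]
consistent = all(
    binomial_modes(n, p) == brute_force_modes(n, p)
    for n in range(1, 12)
    for p in grid
)
print(consistent)
```

For example, `binomial_modes(3, Fraction(1, 2))` returns two modes, 1 and 2, since (n + 1)p = 2 is an integer.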

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

Dr. D. SENTHILKUMAR