Median
In general, there is no single formula to find the median for a binomial distribution, and it may even be non-unique. However, several special results have been established:
- If np is an integer, then the mean, median, and mode coincide and equal np.[10][11]
- Any median m must lie within the interval ⌊np⌋ ≤ m ≤ ⌈np⌉.[12]
- A median m cannot lie too far away from the mean: |m − np| ≤ min{ ln 2, max{p, 1 − p} }.[13]
- The median is unique and equal to m = round(np) when |m − np| ≤ min{p, 1 − p} (except for the case when p = 1/2 and n is odd).[12]
- When p is a rational number (with the exception of p = 1/2 and n odd) the median is unique.[14]
- When p = 1/2 and n is odd, any number m in the interval (n − 1)/2 ≤ m ≤ (n + 1)/2 is a median of the binomial distribution. If p = 1/2 and n is even, then m = n/2 is the unique median.
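The special cases above can be checked by brute force. The following Python sketch enumerates the integer medians of a binomial distribution, taking m to be a median when Pr(X ≤ m) ≥ 1/2 and Pr(X ≥ m) ≥ 1/2. The helper names binom_cdf and integer_medians are hypothetical, and p is assumed to be passed as a fractions.Fraction so that the comparisons with 1/2 are exact. Only integer candidates are listed, although for p = 1/2 and odd n every real number between the two integers is also a median.

```python
from fractions import Fraction
from math import comb


def binom_cdf(k, n, p):
    """Pr(X <= k) for X ~ Binomial(n, p), by direct summation (0 if k < 0)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def integer_medians(n, p):
    """Integers m with Pr(X <= m) >= 1/2 and Pr(X >= m) >= 1/2."""
    half = Fraction(1, 2)
    return [
        m
        for m in range(n + 1)
        if binom_cdf(m, n, p) >= half and 1 - binom_cdf(m - 1, n, p) >= half
    ]


# np = 3 is an integer, so the median coincides with the mean.
print(integer_medians(10, Fraction(3, 10)))  # [3]
# p = 1/2 and n odd: both (n - 1)/2 and (n + 1)/2 qualify.
print(integer_medians(5, Fraction(1, 2)))  # [2, 3]
```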
Tail bounds
For k ≤ np, upper bounds can be derived for the lower tail of the cumulative distribution function F(k; n, p) = Pr(X ≤ k), the probability that there are at most k successes. Since Pr(X ≥ k) = F(n − k; n, 1 − p), these bounds can also be seen as bounds for the upper tail of the cumulative distribution function for k ≥ np.
Hoeffding's inequality yields the simple bound
F(k; n, p) ≤ exp(−2n (p − k/n)²),
which is, however, not very tight. In particular, for p = 1, we have F(k; n, p) = 0 (for fixed k, n with k < n), but Hoeffding's bound evaluates to a positive constant.
A sharper bound can be obtained from the Chernoff bound:[15]
F(k; n, p) ≤ exp(−n D(k/n || p)),
where D(a || p) is the relative entropy (or Kullback–Leibler divergence) between an a-coin and a p-coin (i.e. between the Bernoulli(a) and Bernoulli(p) distributions):
D(a || p) = a ln(a/p) + (1 − a) ln((1 − a)/(1 − p)).
Asymptotically, this bound is reasonably tight; see [15] for details.
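To illustrate how the two upper bounds compare, the Python sketch below evaluates them next to the exact lower tail. It assumes the standard forms exp(−2n(p − k/n)²) for Hoeffding's bound and exp(−n D(k/n || p)) for the Chernoff bound, with 0 < p < 1 and k ≤ np. All function names are hypothetical helpers.

```python
from math import comb, exp, log


def binom_cdf(k, n, p):
    """Exact F(k; n, p) = Pr(X <= k) by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def kl_bernoulli(a, p):
    """Relative entropy D(a || p), using the convention 0 * ln(0) = 0."""
    total = 0.0
    if a > 0:
        total += a * log(a / p)
    if a < 1:
        total += (1 - a) * log((1 - a) / (1 - p))
    return total


def hoeffding_bound(k, n, p):
    """Assumed form of Hoeffding's bound on the lower tail."""
    return exp(-2 * n * (p - k / n) ** 2)


def chernoff_bound(k, n, p):
    """Assumed form of the Chernoff bound on the lower tail."""
    return exp(-n * kl_bernoulli(k / n, p))


n, p, k = 100, 0.5, 40
print(binom_cdf(k, n, p), chernoff_bound(k, n, p), hoeffding_bound(k, n, p))
```

For these values the exact tail is roughly 0.028, while both bounds are near 0.13, with the Chernoff bound slightly smaller than Hoeffding's.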
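One can also obtain lower bounds on the tail F(k; n, p), known as anti-concentration bounds. By approximating the binomial coefficient with Stirling's formula it can be shown that[16]
F(k; n, p) ≥ (1 / √(8n (k/n)(1 − k/n))) exp(−n D(k/n || p)),
which implies the simpler but looser bound
F(k; n, p) ≥ (1 / √(2n)) exp(−n D(k/n || p)).
For p = 1/2 and k ≥ 3n/8 for even n, it is possible to make the denominator constant:[17]
F(k; n, 1/2) ≥ (1/15) exp(−16n (1/2 − k/n)²).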
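The anti-concentration bounds can be checked numerically in the same way. The Python sketch below assumes the forms exp(−n D(k/n || p)) / √(8n (k/n)(1 − k/n)), exp(−n D(k/n || p)) / √(2n), and, for p = 1/2, exp(−16n (1/2 − k/n)²) / 15. It requires 0 < k < n and 0 < p < 1, and all function names are hypothetical.

```python
from math import comb, exp, log, sqrt


def binom_cdf(k, n, p):
    """Exact F(k; n, p) = Pr(X <= k) by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def kl_bernoulli(a, p):
    """Relative entropy D(a || p) for 0 < a < 1 and 0 < p < 1."""
    return a * log(a / p) + (1 - a) * log((1 - a) / (1 - p))


def stirling_lower_bound(k, n, p):
    """Assumed Stirling-based lower bound on F(k; n, p)."""
    a = k / n
    return exp(-n * kl_bernoulli(a, p)) / sqrt(8 * n * a * (1 - a))


def simple_lower_bound(k, n, p):
    """Looser bound; follows from a * (1 - a) <= 1/4."""
    return exp(-n * kl_bernoulli(k / n, p)) / sqrt(2 * n)


def fair_coin_lower_bound(k, n):
    """Assumed bound for p = 1/2, even n and k >= 3n/8."""
    return exp(-16 * n * (0.5 - k / n) ** 2) / 15


n, p, k = 100, 0.5, 40
print(binom_cdf(k, n, p))
print(stirling_lower_bound(k, n, p), simple_lower_bound(k, n, p))
print(fair_coin_lower_bound(k, n))
```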
Statistical inference
Estimation of parameters
When n is known, the parameter p can be estimated using the proportion of successes:
p̂ = x/n.
This estimator is obtained both by maximum likelihood and by the method of moments. It is unbiased and has uniformly minimum variance, as follows from the Lehmann–Scheffé theorem, since it is based on a minimal sufficient and complete statistic (namely x). It is also consistent, both in probability and in MSE.
A closed-form Bayes estimator for p also exists when the Beta distribution is used as a conjugate prior distribution. With a general Beta(α, β) prior, the posterior mean estimator is:
p̂_b = (x + α) / (n + α + β).
The Bayes estimator is asymptotically efficient and as the sample size approaches infinity (n → ∞), it approaches the MLE solution. The Bayes estimator is biased (how much depends on the priors), admissible and consistent in probability.
For the special case of using the standard uniform distribution as a non-informative prior, Beta(α = 1, β = 1) = U(0, 1), the posterior mean estimator becomes:
p̂_b = (x + 1) / (n + 2).
(A posterior mode should just lead to the standard estimator.) This method is called the rule of succession, which was introduced in the 18th century by Pierre-Simon Laplace.
When estimating p with very rare events and a small n (e.g. if x = 0), the standard estimator gives p̂ = 0, which is sometimes unrealistic and undesirable. In such cases there are various alternative estimators.[18] One way is to use the Bayes estimator (with a uniform prior), leading to:
p̂_b = 1 / (n + 2).
Another method is to use the upper bound of the confidence interval obtained using the rule of three:
p̂_rule of 3 = 3/n.
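The estimators discussed in this section can be compared on a small example. The Python sketch below assumes the proportion of successes x/n as the standard estimator, the posterior mean (x + α)/(n + α + β) under a Beta(α, β) prior, and 3/n as the rule-of-three upper bound for the case x = 0. The function names are hypothetical.

```python
def mle(x, n):
    """Standard estimator: proportion of successes."""
    return x / n


def bayes_posterior_mean(x, n, alpha=1.0, beta=1.0):
    """Posterior mean under a Beta(alpha, beta) prior.

    The defaults alpha = beta = 1 give the uniform prior (rule of succession).
    """
    return (x + alpha) / (n + alpha + beta)


def rule_of_three_upper(n):
    """Approximate 95% upper bound for p when no successes are observed."""
    return 3 / n


# No successes in n = 20 trials.
x, n = 0, 20
print(mle(x, n))                   # 0.0
print(bayes_posterior_mean(x, n))  # 1/22, about 0.045
print(rule_of_three_upper(n))      # 0.15
```

With a large sample the Bayes estimate moves toward the proportion of successes. For example, x = 300 and n = 1000 give 301/1002, which is close to 0.3.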
Confidence intervals
Even for quite large values of n, the actual distribution of the mean is significantly nonnormal.[19] Because of this problem several methods to estimate confidence intervals have been proposed.
In the equations for confidence intervals below, the variables have the following meaning:
- n1 is the number of successes out of n, the total number of trials
- p̂ = n1/n is the proportion of successes
- z is the 1 − α/2 quantile of a standard normal distribution (i.e., probit) corresponding to the target error rate α. For example, for a 95% confidence level the error α = 0.05, so 1 − α/2 = 0.975 and z = 1.96.
Wald method
p̂ ± z √(p̂(1 − p̂)/n)
A continuity correction of 0.5/n may be added.
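A minimal Python sketch of the Wald interval follows, assuming the usual form p̂ ± z √(p̂(1 − p̂)/n) without the continuity correction. The function name wald_interval is hypothetical. The normal quantile is taken from the standard library class statistics.NormalDist.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(n1, n, alpha=0.05):
    """Wald interval for n1 successes in n trials (no continuity correction)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    p_hat = n1 / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


print(wald_interval(10, 100))  # roughly (0.041, 0.159)
print(wald_interval(0, 20))    # collapses to (0.0, 0.0)
```

The second call shows a known weakness of this form: when n1 = 0 (or n1 = n) the estimated standard error is zero and the interval degenerates to a single point.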
Agresti–Coull method
p̃ ± z √(p̃(1 − p̃)/(n + z²))
Here the estimate of p is modified to
p̃ = (n1 + z²/2) / (n + z²)
This method works well for n > 10 and n1 ≠ 0, n.[21] For n ≤ 10, see [22]. For n1 = 0 or n1 = n, use the Wilson (score) method below.
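As a sketch of the Agresti–Coull calculation, the Python function below assumes the modified estimate p̃ = (n1 + z²/2)/(n + z²) and the interval p̃ ± z √(p̃(1 − p̃)/(n + z²)). The name agresti_coull_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull_interval(n1, n, alpha=0.05):
    """Agresti-Coull interval for n1 successes in n trials."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_tilde = n + z * z                   # adjusted number of trials
    p_tilde = (n1 + z * z / 2) / n_tilde  # adjusted proportion of successes
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - half_width, p_tilde + half_width


print(agresti_coull_interval(10, 100))  # roughly (0.053, 0.176)
```

At the 95% level z² is close to 4, so the adjustment amounts to adding about two successes and two failures before applying a Wald-style formula.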
Arcsine method
sin²(arcsin(√p̂) ± z / (2√n))
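A Python sketch of the arcsine interval is given below, assuming the form sin²(arcsin(√p̂) ± z/(2√n)). The function name arcsine_interval is hypothetical. Clamping the angles to [0, π/2] is an added safeguard in this sketch and is not part of the formula itself.

```python
from math import asin, pi, sin, sqrt
from statistics import NormalDist


def arcsine_interval(n1, n, alpha=0.05):
    """Arcsine-transformation interval for n1 successes in n trials."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = asin(sqrt(n1 / n))  # variance-stabilising transform of p_hat
    delta = z / (2 * sqrt(n))
    # Assumption of this sketch: keep the angles inside [0, pi/2] so that
    # sin(angle)**2 stays monotone and the bounds stay ordered.
    lo_angle = max(0.0, centre - delta)
    hi_angle = min(pi / 2, centre + delta)
    return sin(lo_angle) ** 2, sin(hi_angle) ** 2


print(arcsine_interval(10, 100))  # roughly (0.049, 0.166)
```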
Wilson (score) method
The notation in the formula below differs from the previous formulas in two respects:[24]
- Firstly, z_x has a slightly different interpretation in the formula below: it has its ordinary meaning of 'the xth quantile of the standard normal distribution', rather than being a shorthand for 'the (1 − x)-th quantile'.
- Secondly, this formula does not use a plus-minus to define the two bounds. Instead, one may use z = z_{α/2} to get the lower bound, or use z = z_{1 − α/2} to get the upper bound. For example: for a 95% confidence level the error α = 0.05, so one gets the lower bound by using z = z_{α/2} = z_{0.025} = −1.96, and one gets the upper bound by using z = z_{1 − α/2} = z_{0.975} = 1.96.
(p̂ + z²/(2n) + z √(p̂(1 − p̂)/n + z²/(4n²))) / (1 + z²/n)
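The signed-quantile convention described above can be sketched in Python as follows. The code assumes the bound (p̂ + z²/(2n) + z √(p̂(1 − p̂)/n + z²/(4n²))) / (1 + z²/n), evaluated once with a negative z and once with a positive z. The names wilson_bound and wilson_interval are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_bound(n1, n, z):
    """One Wilson score bound; the sign of z selects lower or upper."""
    p_hat = n1 / n
    root = sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return (p_hat + z * z / (2 * n) + z * root) / (1 + z * z / n)


def wilson_interval(n1, n, alpha=0.05):
    """Lower bound uses z_{alpha/2} (negative), upper uses z_{1 - alpha/2}."""
    std_normal = NormalDist()
    z_low = std_normal.inv_cdf(alpha / 2)       # about -1.96 for alpha = 0.05
    z_high = std_normal.inv_cdf(1 - alpha / 2)  # about +1.96 for alpha = 0.05
    return wilson_bound(n1, n, z_low), wilson_bound(n1, n, z_high)


print(wilson_interval(10, 100))  # roughly (0.055, 0.174)
print(wilson_interval(0, 20))    # lower bound is 0 up to rounding error
```

Unlike the Wald form, the interval for n1 = 0 still has positive width.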
Comparison
The so-called "exact" (Clopper–Pearson) method is the most conservative.[19] (Exact does not mean perfectly accurate; rather, it indicates that the estimates will not be less conservative than the true value.)
The Wald method, although commonly recommended in textbooks, is the most biased.
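One way to make such comparisons concrete is to compute the actual coverage probability of an interval method for a given n and p, by summing the binomial probabilities of all outcomes whose interval contains p. The Python sketch below does this for the Wald and Wilson intervals only, using their standard forms. The Clopper–Pearson interval is left out because it needs Beta quantiles, which the standard library does not provide. All names are hypothetical, and a single (n, p) pair is an illustration rather than a general result.

```python
from math import comb, sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def wald(x, n, z=Z95):
    """Wald interval for x successes in n trials."""
    p_hat = x / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson(x, n, z=Z95):
    """Wilson score interval written in centre plus/minus half-width form."""
    p_hat = x / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def coverage(interval, n, p):
    """Probability that the interval computed from X ~ Binomial(n, p) contains p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n)
        if lo <= p <= hi:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


print(coverage(wald, 20, 0.1))    # about 0.88, below the nominal 0.95
print(coverage(wilson, 20, 0.1))  # about 0.96
```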