Skip to content

Hoeffding's inequality

In probability theory, Hoeffding's inequality provides an upper bound on the probability that the sum of bounded independent random variables deviates from its expected value by more than a certain amount [1]_.

In other terms, if we have a process which have a certain unknown percentage of success, the Hoeffding's formula can provides a level of confidence that the true percentage of success fall into a range, given the results of a few experiences with the process.

\[P(|\overline {X} - E[\overline {X}]| \geq t) \leq 2e^{-2nt^2}\]

This formula says: The probability of the observed value \(\overline {X}\) being more than \(t\) away from the expected value \(E[\overline {X}]\) is less than or equal to \(2e^{-2nt^2}\). With \(n\) = number of samples or experiences.

The inverse is also true:

\[P(|\overline {X} - E[\overline {X}]| < t) \geq 1-2e^{-2nt^2}\]

So if we want to find the range where the expected value will fall with a certain confidence level, let's say 95% we can simply equate the right side to 0.95 and solve for $t` to get:

\[t = \sqrt{-\frac{1}{2n}ln(\frac{1-0.95}{2})}\]

which give us:

\[P(\overline {X} - t < E[\overline {X}] < \overline {X} + t) \geq 0.95\]

Graphical representation

This graph represent the confidence level that the delta between the real and measured value are less than 0.10 for a given number of sample.

Hoeffding confidence level
Figure 1: Hoeffding confidence level

This graph represent the delta T between real and measured value for a fixed confidence level.

Hoeffding delta T
Figure 2: Hoeffding delta T

References

.. [1] From Wikipedia: <https://en.wikipedia.org/wiki/Hoeffding%27s_inequality>_