Hoeffding’s inequality#

In probability theory, Hoeffding’s inequality provides an upper bound on the probability that the sum of bounded independent random variables deviates from its expected value by more than a certain amount 1.

In other terms, if we have a process which have a certain unknown percentage of success, the Hoeffding’s formula can provides a level of confidence that the true percentage of success fall into a range, given the results of a few experiences with the process.

\[P(|\overline {X} - E[\overline {X}]| \geq t) \leq 2e^{-2nt^2}\]

This formula says: The probability of the observed value \(\overline {X}\) being more than \(t\) away from the expected value \(E[\overline {X}]\) is less than or equal to \(2e^{-2nt^2}\). With \(n\) = number of samples or experiences.

The inverse is also true:

\[P(|\overline {X} - E[\overline {X}]| < t) \geq 1-2e^{-2nt^2}\]

So if we want to find the range where the expected value will fall with a certain confidence level, let’s say 95% we can simply equate the right side to 0.95 and solve for \(t\) to get:

\[t = \sqrt{-\frac{1}{2n}ln(\frac{1-0.95}{2})}\]

which give us:

\[P(\overline {X} - t < E[\overline {X}] < \overline {X} + t) \geq 0.95\]

Graphical representation#

This graph represent the confidence level that the delta between the real and measured value are less than 0.10 for a given number of sample.

Hoeffding confidence level

This graph represent the delta T between real and measured value for a fixed confidence level.

Hoeffding delta T



From Wikipedia: https://en.wikipedia.org/wiki/Hoeffding%27s_inequality

Machine Learning