Birthday Problem

Given a class of size \(N\), what's the probability of two students sharing the same birthday? A classical statistics problem. The answer to that, rather unintuitively is: $$ 1 - \frac{365!}{(365 - N)! \times 365^N} $$

We can see that the probability of birthday-sharing rises very quickly with respect to \(N\). In fact you need 23 people to have more than 50% chance of sharing a birthday.

There is also such a problem in crypography. When expressed formally - given a hash function with a finite (and this is crucial) number of outputs \( H \), find a message/'image' \(x\), and a different one \(x'\), such that[1]: $$ \text{hash}(x) = \text{hash}(x') $$

We know that the attacker will eventually find \(x\) and \(x'\) because hash functions have finite outputs - if they produce 128-bits for example then the space is just \(2^{128}\); while the sizes of their inputs are usually larger than their outputs. For instance SHA1 has a block size of 512 bits while it's digest (output) is only 160 bits.

The birthday attack exploits the solution to the birthday problem- you can see where this is going; if we have a finite amount of output \(H\) then the probability of finding \(x'\) after \(N\) trials with unique inputs is: $$ 1 - \frac{H!}{(H - N)! \times H^N} $$

This brings out a host of problems- particularly in digital signatures. A common example of an 'attack' is the following:

The Birthday Attack is a test on collision resistance and second-image resistance (how easy it is to find \(x'\), not just through brute forcing but e.g. using analysis of the algorithm).

To make such attacks uneconomical[2] the hash function should choose a large enough \(H\) and it is simple to see why — from our expression for the probability, \(H^N\) grows much faster than \(H!\).