Birthday Problem

Given a class of size $N$, what's the probability of two students sharing the same birthday? A classical statistics problem. The answer to that, rather unintuitively is: $$ 1 - \frac{365!}{(365 - N)! \times 365^N} $$

We can see that the probability of birthday-sharing rises very quickly with respect to $N$. In fact you need 23 people to have more than 50% chance of sharing a birthday.

There is also such a problem in crypography. When expressed formally - given a hash function with a finite (and this is crucial) number of outputs $ H $, find a message/'image' $x$, and a different one $x'$, such that[1]: $$ \text{hash}(x) = \text{hash}(x') $$

We know that the attacker will eventually find $x$ and $x'$ because hash functions have finite outputs - if they produce 128-bits for example then the space is just $2^{128}$; while the sizes of their inputs are usually larger than their outputs. For instance SHA1 has a block size of 512 bits while it's digest (output) is only 160 bits.

The birthday attack exploits the solution to the birthday problem- you can see where this is going; if we have a finite amount of output $H$ then the probability of finding $x'$ after $N$ trials with unique inputs is: $$ 1 - \frac{H!}{(H - N)! \times H^N} $$

This brings out a host of problems- particularly in digital signatures. A common example of an 'attack' is the following:

Mallory generates a lot of versions of a fair contract $x$, e.g. by re-wording it, changes in punctuation etc, such that it will still be acceptable for Bob.
Mallory also generates a lot of versions of a fraudulent contract $x'$. She then finds the version of the fair and unfair contracts that have the same hash.
She gives $x$ to Bob, who then signs it. She then attaches the signature to $x'$, therefore proving that Bob has signed $x'$.

The Birthday Attack is a test on collision resistance and second-image resistance (how easy it is to find $x'$, not just through brute forcing but e.g. using analysis of the algorithm).

To make such attacks uneconomical[2] the hash function should choose a large enough $H$ and it is simple to see why — from our expression for the probability, $H^N$ grows much faster than $H!$.