# Birthday Problem

Given a class of size $$N$$, what's the probability of two students sharing the same birthday? A classical statistics problem. The answer to that, rather unintuitively is: $$1 - \frac{365!}{(365 - N)! \times 365^N}$$

We can see that the probability of birthday-sharing rises very quickly with respect to $$N$$. In fact you need 23 people to have more than 50% chance of sharing a birthday.

There is also such a problem in crypography. When expressed formally - given a hash function with a finite (and this is crucial) number of outputs $$H$$, find a message/'image' $$x$$, and a different one $$x'$$, such that[1]: $$\text{hash}(x) = \text{hash}(x')$$

We know that the attacker will eventually find $$x$$ and $$x'$$ because hash functions have finite outputs - if they produce 128-bits for example then the space is just $$2^{128}$$; while the sizes of their inputs are usually larger than their outputs. For instance SHA1 has a block size of 512 bits while it's digest (output) is only 160 bits.

The birthday attack exploits the solution to the birthday problem- you can see where this is going; if we have a finite amount of output $$H$$ then the probability of finding $$x'$$ after $$N$$ trials with unique inputs is: $$1 - \frac{H!}{(H - N)! \times H^N}$$

This brings out a host of problems- particularly in digital signatures. A common example of an 'attack' is the following:

• Mallory generates a lot of versions of a fair contract $$x$$, e.g. by re-wording it, changes in punctuation etc, such that it will still be acceptable for Bob.
• Mallory also generates a lot of versions of a fraudulent contract $$x'$$. She then finds the version of the fair and unfair contracts that have the same hash.
• She gives $$x$$ to Bob, who then signs it. She then attaches the signature to $$x'$$, therefore proving that Bob has signed $$x'$$.

The Birthday Attack is a test on collision resistance and second-image resistance (how easy it is to find $$x'$$, not just through brute forcing but e.g. using analysis of the algorithm).

To make such attacks uneconomical[2] the hash function should choose a large enough $$H$$ and it is simple to see why — from our expression for the probability, $$H^N$$ grows much faster than $$H!$$.