Careful when using $RANDOM

I thought I’d put out a PSA about the dangers of using bash’s convenient, built-in source of random numbers: $RANDOM.

No, this isn’t the usual lecture about using a cryptographically secure random number generator. There are lots of situations where you just need a random blob and you’re not worried about malicious attacks. No, this is about why, even in those situations, you need to consider whether $RANDOM is random enough.

For instance, I was using it to generate unique filenames in a bash loop, and I wanted to do that without worrying about collisions.

However, I overestimated the entropy provided by $RANDOM and underestimated the birthday paradox. I know $RANDOM only gives you a number from 0-32767 (15 bits of entropy), and I know about the birthday paradox, but it’s surprising what the combination of those two can result in.
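One way to get a feel for that combination is to try it empirically. The sketch below is an illustration only, not the loop I was actually running: `has_collision` is a hypothetical helper that draws a batch of $RANDOM values and reports whether any value repeats, and the batch size and trial count are arbitrary choices.

```bash
#!/usr/bin/env bash
# has_collision N: draw N values from $RANDOM and succeed (exit status 0)
# if any value shows up more than once. Hypothetical helper for illustration.
has_collision() {
  local n=$1 i value
  local -a seen=()   # sparse indexed array keyed by the drawn value (0-32767)
  for ((i = 0; i < n; i++)); do
    value=$RANDOM
    if [[ -n ${seen[$value]:-} ]]; then
      return 0       # this value was already drawn: collision
    fi
    seen[$value]=1
  done
  return 1           # all N draws were distinct
}

# Repeat the experiment many times and count the batches with a repeat.
trials=1000
hits=0
for ((t = 0; t < trials; t++)); do
  if has_collision 45; then
    hits=$((hits + 1))
  fi
done
echo "$hits of $trials batches of 45 had a collision"
```

The count varies from run to run because it depends on $RANDOM, so treat it as a rough estimate rather than an exact figure.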

I was only generating 45 filenames, but I actually encountered a collision. Only 45 numbers from 0-32767 and two are the same? How?!

Well, it’s more likely than you think. Specifically, with 45 draws from 32768 possible values, the chance of at least one collision is about 3%.* Still rare, but likely enough to be plausible that I encountered it by chance.
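If you want to check that figure yourself, here is a minimal sketch of the standard birthday-problem calculation. It assumes every $RANDOM value is equally likely and each draw is independent; `collision_probability` is a name I made up for this example, and the arithmetic is delegated to awk because bash has no floating point.

```bash
#!/usr/bin/env bash
# collision_probability N SPACE
# Probability that N independent, uniform draws from SPACE possible values
# contain at least one duplicate: 1 - product over i of (SPACE - i) / SPACE.
collision_probability() {
  awk -v n="$1" -v space="$2" 'BEGIN {
    p_unique = 1
    for (i = 0; i < n; i++) {
      p_unique *= (space - i) / space   # draw i must avoid the i values already taken
    }
    printf "%.4f\n", 1 - p_unique
  }'
}

# 45 filenames, 32768 possible $RANDOM values (0-32767)
collision_probability 45 32768
```

The result should come out close to 0.03. Passing a larger N shows how quickly the odds climb: the numerator of each factor shrinks with every draw, so the chance of staying collision-free decays much faster than intuition suggests.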
