I have a thousand songs on one of those thumb drives, and I always play them in “shuffle” mode. Yet it seems that there is always a lot of overlap between one listening session and another — the same songs that I heard yesterday are in today’s mix. You’d think that with a thousand songs to choose from, it would be a while before I hear the same song twice, unless there’s something not sufficiently random about the PlayStation’s song randomizer.
I was all prepared to fire off an indignant letter to Sony’s customer support department when I decided that I first needed to understand exactly how unlikely the overlap I was encountering really was.
Figure that a “listening session” includes twenty songs. There are 339,482,811,302,457,603,895,512,614,793,686,020,778,700 (339 duodecillion) different ways to choose twenty songs from a collection of a thousand. This result is given by the combinatorial formula:
n! / (k! (n-k)!)
where n is the number of items to choose from (1,000, in this case), k is the number of items to choose (20), and “!” is the “factorial” operator that means “multiply the preceding number by every whole number below it, down to 1.” Five factorial, for instance, is written “5!” and is equal to 5×4×3×2×1, which is 120.
The combinatorial formula above is sometimes abbreviated “nCk,” pronounced “n choose k.” The very very big number is the result of calculating 1000 C 20.
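For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same calculation. It is not part of the original working: the helper name `choose` is made up for illustration, and `math.comb` assumes Python 3.8 or later.

```python
import math

def choose(n: int, k: int) -> int:
    """n C k computed straight from the factorial formula n! / (k! (n-k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

sessions = choose(1000, 20)

# The standard library's binomial coefficient should agree with the formula.
assert sessions == math.comb(1000, 20)

# Print with thousands separators so it can be compared with the figure above.
print(f"{sessions:,}")
```

Python integers have arbitrary precision, so the 42-digit result comes out exactly rather than as a rounded floating-point value.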
So there is a vast number of possible listening sessions. But in how many ways can one listening session overlap with another? Let’s consider a second listening session that doesn’t overlap at all with the first. The way to think about this is that the first listening session “used up” twenty of the available songs, leaving 980 to choose from — specifically, 980 from which to choose 20, or 980 C 20, which is 225,752,650,356,644,030,123,857,337,771,499,346,518,885 (225 duodecillion).
So of the 339 duodecillion ways to choose 20 songs from a thousand, 225 duodecillion, or about 66%, do not overlap with the first session, which means that about 34% do. If every session is equally likely, there is a one-in-three chance that at least one song in the second session will be the same as one in the first.
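The ratio can be sketched in a few lines of Python. This assumes, as the reasoning above does, that a truly random shuffle makes every 20-song session equally likely; the constant names are chosen here for illustration.

```python
import math

SONGS = 1000    # size of the collection
SESSION = 20    # songs per listening session (the figure assumed above)

all_sessions = math.comb(SONGS, SESSION)
# Sessions that avoid all 20 songs from the previous session.
disjoint_sessions = math.comb(SONGS - SESSION, SESSION)

p_no_overlap = disjoint_sessions / all_sessions
p_overlap = 1 - p_no_overlap

print(f"no overlap: {p_no_overlap:.1%}")
print(f"at least one repeat: {p_overlap:.1%}")
```

Changing `SESSION` shows how quickly the chance of a repeat grows with longer listening sessions.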
This was a stunning result to me. I never expected the odds of an overlap to be so high.
That doesn’t mean that the PlayStation is working correctly, necessarily; it’s my impression that I’m getting multiple-song overlaps, and I’m getting them much more than one-third of the time, so the PlayStation still may not be adequately randomizing its playlist. But this result does send me back to the drawing board to gather objective data about just how much overlap I am getting.
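As a baseline for that data, the same counting argument extends to multiple-song overlaps. The sketch below goes a step beyond the calculation above and rests on assumptions: each session is a uniformly random set of 20 distinct songs, independent of the previous session. Under those assumptions, a second session that repeats exactly j songs picks j of the 20 already heard and the other 20 - j from the remaining 980. The function and variable names are illustrative.

```python
import math

SONGS = 1000
SESSION = 20

def p_exact_overlap(j: int) -> float:
    """Probability that exactly j of today's songs were also in yesterday's session."""
    ways = math.comb(SESSION, j) * math.comb(SONGS - SESSION, SESSION - j)
    return ways / math.comb(SONGS, SESSION)

distribution = [p_exact_overlap(j) for j in range(SESSION + 1)]
p_two_or_more = 1 - distribution[0] - distribution[1]
expected_repeats = sum(j * p for j, p in enumerate(distribution))

for j in range(4):
    print(f"{j} repeats: {distribution[j]:.2%}")
print(f"two or more repeats: {p_two_or_more:.2%}")
print(f"average repeats per session: {expected_repeats:.2f}")
```

If the logged sessions show multi-song overlaps far more often than these figures suggest, that would be evidence against the randomizer rather than just an impression.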
nice analysis!