Sometimes "math/rand" Uses Weighted Dice

matthewdale

Matt Dale

Posted on June 19, 2023

Sometimes "math/rand" Uses Weighted Dice

Go's math/rand package provides pseudo-random number generators, or PRNGs, that are used in a ton of Go software. I've personally used the math/rand package in the vast majority of Go projects I've worked on. It's great for generating IDs, shuffling lists, introducing jitter in timing, and lots of other things.

⚠️ math/rand should not be used for security-sensitive purposes. For that, use crypto/rand instead.

As with any PRNG, the number sequence appears random but is actually deterministic and depends on the seed value. As a result, it's possible to get math/rand to generate the exact same number sequence by giving it the same seed value.

Let's look at an example:

rand.Seed(1)
fmt.Println(rand.Int63(), rand.Int63())
// Output: 5577006791947779410 8674665223082153551

rand.Seed(1)
fmt.Println(rand.Int63(), rand.Int63())
// Output: 5577006791947779410 8674665223082153551
Enter fullscreen mode Exit fullscreen mode

Go Playground

We used the same seed and got the same random numbers each time, just like we expected! Now let's try using different seed values:

rand.Seed(1)
fmt.Println(rand.Int63(), rand.Int63())
// Output: 5577006791947779410 8674665223082153551

rand.Seed(2)
fmt.Println(rand.Int63(), rand.Int63())
// Output: 1543039099823358511 2444694468985893231
Enter fullscreen mode Exit fullscreen mode

Go Playground

Great, we got different numbers! Let's try two more seed values:

rand.Seed(243697394791045426)
fmt.Println(rand.Int63(), rand.Int63())
// Output: 5463449640670714849 6224151428804650039

rand.Seed(-6474076765971414729)
fmt.Println(rand.Int63(), rand.Int63())
// Output: 5463449640670714849 6224151428804650039
Enter fullscreen mode Exit fullscreen mode

Go Playground

Oops, that's not right. We gave it totally different seed values, but the sequences are the same. What happened?

It turns out that math/rand will sometimes generate the same number sequence for different seed values. The rand.Seed function docs describe the unexpected behavior:

Seed values that have the same remainder when divided by 231-1 generate the same pseudo-random sequence.

That's not great, that means math/rand might generate the same number sequence even if our seed values don't repeat.

Yelling formal man watching news on laptop

Why'd you do that math/rand?! You're supposed to be (pseudo) random!

Automatic Seeds in Go 1.20

If you've looked at the rand.Seed documentation recently, you may have noticed that the function is deprecated. In fact, starting with Go 1.20, the package-global math/rand PRNG is automatically initialized with a random seed value. Great, so we can ignore this seed nonsense and just live our lives, right?

Mostly yes. However, it's still possible to run into problems with the new math/rand automatic seed values. I found that if you start about 100,000 processes that generate numbers with math/rand using the automatic seed values, two of those processes will probably generate the same sequence.

Why 100,000? (click for details)
This is an example of the birthday paradox, which is the counterintuitive fact that you're likely to get a repeating value in a series of random numbers much sooner than you might expect. In this case, rand.Seed basically truncates the int64 input into 31 bits of entropy. Based on that, we can calculate that we have about a 90% chance of getting a repeating value after generating 100,000 random 31-bit numbers.

Granted, most use cases don't require starting 100,000 processes. However, it could cause problems if, for example, you deploy an agent to every container in a very large server fleet and you need those agents to generate different number sequences.

Some Possible Fixes

If that is a problem for us, is there anything we can do about it? Yes!

Use a different PRNG library.

One solution is to use a different PRNG that doesn't suffer from the same unexpected behavior as math/rand. Go issue #36133 suggests a possible alternative is the golang.org/x/exp/rand module, which implements a different type of PRNG. Importantly, the x/exp/rand module does not truncate seed values, so is less likely to cause collisions between seeds. However, it is not automatically seeded like math/rand, so you must provide a seed value yourself!

Use crypto/rand instead.

Another option is to use the crypto/rand package instead, which uses the operating system's or runtime's cryptographically-secure randomness source. Cryptographically-secure RNGs are typically slower than PRNGs, but if you don't need to generating a ton of values, then using crypto/rand can be a simple option.

Partition the PRNG values.

A more complex solution is to "partition" the math/rand output with a value that is different for each process. For example, consider the following code that generates hexadecimal strings to use as request IDs. It reads a random value from crypto/rand at startup and appends that to all values read from math/rand, reducing the probability of collisions between processes:

var partition uint32

func init() {
    err := binary.Read(cryptorand.Reader, binary.LittleEndian, &partition)
    if err != nil {
        panic(err)
    }
}

func newRequestID() string {
    return fmt.Sprintf("%016x%08x", rand.Uint64(), partition)
}
Enter fullscreen mode Exit fullscreen mode

Go Playground

⚠️ Adding partitions to pseudo-random values can change the distribution of values, so it's extremely important to understand how it affects your use case. For example, a new distribution of values might introduce "hotspots" in a database index that wasn't a problem without the partition.

Additional Reading


Cover photo is Closeup Photo of Two Red Dices Showing 4 and 5 by Jonathan Petersson
Body photo is Yelling formal man watching news on laptop
by Andrea Piacquadio

💖 💪 🙅 🚩
matthewdale
Matt Dale

Posted on June 19, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related