Random Generator (PCG64) & Reproducibility in NumPy

Generator vs legacy np.random.*

Modern NumPy uses a two-layer RNG design: a BitGenerator (raw random bits, e.g., PCG64) and a Generator (user-facing API: normal, integers, choice, etc.). Prefer np.random.default_rng() instead of the legacy global functions to avoid hidden global state.

import numpy as np

# Modern API (recommended)
rng = np.random.default_rng(seed=42)  # PCG64 by default
print(rng.integers(0, 10, size=5))
print(rng.normal(0, 1, size=3))

# Legacy (still works, not recommended for new code)
# np.random.seed(42)
# np.random.randint(0, 10, 5)
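
The point about hidden global state can be checked with a small sketch. It assumes nothing beyond np.random.default_rng; the variable names are illustrative only. Two Generators built from the same seed give identical draws, and advancing one does not disturb the other.

```python
import numpy as np

# Two generators built from the same seed produce identical draws.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
draws_a = rng_a.integers(0, 10, size=5)
draws_b = rng_b.integers(0, 10, size=5)
same_seed_matches = np.array_equal(draws_a, draws_b)

# Each Generator owns its state: advancing rng_a does not move rng_b.
rng_a.random(100)
next_b = rng_b.random()

# Reference generator that repeats exactly what rng_b has done so far.
reference = np.random.default_rng(42)
reference.integers(0, 10, size=5)
isolated = next_b == reference.random()

print(same_seed_matches, isolated)
```

With the legacy functions, any library call to np.random.* would advance the same shared state, which is what makes results hard to reproduce.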

Choose a BitGenerator (PCG64 default)

PCG64 is the default BitGenerator, offering excellent statistical quality and speed. You can also use Philox, SFC64, etc., but PCG64 is usually a solid choice.

from numpy.random import Generator, PCG64, Philox

rng_pcg = Generator(PCG64(12345))
rng_philox = Generator(Philox(2025))

print(rng_pcg.integers(10, size=4))
print(rng_philox.uniform(size=3))
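
As a rough illustration, the sketch below inspects which BitGenerator default_rng() wraps and builds Generators on a few alternatives. The seed 12345 and the samples dictionary are arbitrary choices for the example; the default is PCG64 in current NumPy releases, and the NumPy documentation reserves the right to change it.

```python
import numpy as np
from numpy.random import Generator, PCG64, Philox, SFC64

# Name of the BitGenerator class behind default_rng().
default_name = type(np.random.default_rng().bit_generator).__name__
print(default_name)

# The Generator API is identical whichever BitGenerator sits underneath.
samples = {}
for bitgen_cls in (PCG64, Philox, SFC64):
    rng = Generator(bitgen_cls(12345))
    samples[bitgen_cls.__name__] = rng.random(2)
    print(bitgen_cls.__name__, samples[bitgen_cls.__name__])
```

The same seed gives different numbers under different BitGenerators, so record the BitGenerator name along with the seed.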

Reproducibility with SeedSequence

SeedSequence mixes an initial seed into a high-quality starting state, and its spawn() method derives independent child sequences from it. This is useful for creating multiple reproducible streams (e.g., one per parallel worker).

from numpy.random import SeedSequence, PCG64, Generator

ss = SeedSequence(2025)
# Spawn 3 independent child sequences
child_ss = ss.spawn(3)
streams = [Generator(PCG64(s)) for s in child_ss]

# Each stream is independent and reproducible
for i, rng in enumerate(streams, 1):
    print('stream', i, rng.integers(0, 100, size=3))
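
To check that spawned streams really are reproducible, here is a minimal sketch. The helper make_streams is hypothetical (not part of NumPy); it rebuilds the streams from the same root seed and compares the draws.

```python
from numpy.random import SeedSequence, PCG64, Generator

def make_streams(seed, n):
    """Hypothetical helper: n Generators spawned from one root seed."""
    return [Generator(PCG64(child)) for child in SeedSequence(seed).spawn(n)]

# Rebuilding from the same root seed gives the same per-stream draws.
run1 = [rng.integers(0, 100, size=3).tolist() for rng in make_streams(2025, 3)]
run2 = [rng.integers(0, 100, size=3).tolist() for rng in make_streams(2025, 3)]
print(run1 == run2)

# Each child is identified by its position in the spawn tree.
children = SeedSequence(2025).spawn(3)
spawn_keys = [child.spawn_key for child in children]
print(spawn_keys)
```

Note that calling spawn() a second time on the same SeedSequence object returns new children rather than repeating the first batch, so reproducibility depends on spawning in the same order.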

Parallel streams: safe patterns

Avoid sharing a single RNG across threads/processes. Instead, create one Generator per worker using SeedSequence.spawn. This prevents overlapping sequences and contention.

# Example sketch (no actual multiprocessing shown)
from numpy.random import SeedSequence, PCG64, Generator

base = SeedSequence(123)
children = base.spawn(4)
rngs = [Generator(PCG64(s)) for s in children]

# worker k uses rngs[k] exclusively
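
One possible way to wire this into real workers is sketched below with a thread pool. The worker and run functions are hypothetical names for this example; the same pattern should carry over to process pools, since SeedSequence objects can be passed to workers.

```python
from concurrent.futures import ThreadPoolExecutor
from numpy.random import SeedSequence, PCG64, Generator

def worker(child_seed, n_draws=1000):
    # Each worker builds its own Generator from its own child seed.
    rng = Generator(PCG64(child_seed))
    return float(rng.standard_normal(n_draws).mean())

def run(seed, n_workers=4):
    children = SeedSequence(seed).spawn(n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in input order, whatever the scheduling.
        return list(pool.map(worker, children))

first = run(123)
second = run(123)
print(first == second)
```

Because no Generator is shared, the results do not depend on which thread finishes first.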

Saving & restoring RNG state

You can serialize either the state (exact position in the sequence) or the seed (to regenerate from start). State-based replay resumes mid-stream; seed-based replay starts from the beginning.

import numpy as np

rng = np.random.default_rng(7)
state = rng.bit_generator.state            # dict

# Generate some numbers
a = rng.standard_normal(5)

# Restore exact state later
rng2 = np.random.default_rng()
rng2.bit_generator.state = state
b = rng2.standard_normal(5)

print(np.allclose(a, b))  # True: the state was saved before drawing a, so b replays the same draws
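
The difference between the two replay modes can be sketched as follows. This is a minimal illustration with arbitrary names (checkpoint, resumed, replayed): a state checkpoint continues mid-stream, while re-seeding starts over from the first draw.

```python
import numpy as np

rng = np.random.default_rng(7)
head = rng.standard_normal(3)           # draws before the checkpoint
checkpoint = rng.bit_generator.state    # snapshot taken mid-stream
tail = rng.standard_normal(3)           # draws after the checkpoint

# State-based replay: continues from the checkpoint.
resumed = np.random.default_rng()
resumed.bit_generator.state = checkpoint
resumed_tail = resumed.standard_normal(3)

# Seed-based replay: starts again from the beginning.
replayed = np.random.default_rng(7)
replayed_head = replayed.standard_normal(3)

print(np.array_equal(tail, resumed_tail))
print(np.array_equal(head, replayed_head))
```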

Distributions in Generator

Common methods: integers, random, choice, normal, lognormal, poisson, gamma, beta, binomial, etc. Each accepts a size argument to return an array of draws.

rng = np.random.default_rng(101)
print(rng.choice(['A','B','C'], size=5, replace=True, p=[0.2,0.5,0.3]))
print(rng.normal(loc=10, scale=2, size=(2,3)))
print(rng.poisson(lam=3.5, size=4))

Replacing legacy calls

  • np.random.rand(n) → rng.random(n)
  • np.random.randn(n) → rng.standard_normal(n) or rng.normal(size=n)
  • np.random.randint(a, b) → rng.integers(a, b)
  • np.random.choice(...) → rng.choice(...)
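
The mapping above can be sketched on a small example. The variable names are illustrative; the endpoint=True line is an extra option of Generator.integers, shown here because the upper bound is excluded by default.

```python
import numpy as np

rng = np.random.default_rng(0)

u = rng.random(4)                    # replaces np.random.rand(4)
z = rng.standard_normal(4)           # replaces np.random.randn(4)
k = rng.integers(1, 7, size=1000)    # replaces np.random.randint(1, 7, 1000); values 1..6

# endpoint=True makes the upper bound inclusive, so this also yields 1..6.
k_incl = rng.integers(1, 6, size=1000, endpoint=True)

# rand(2, 3) took separate dimensions; random() takes a shape tuple.
m = rng.random((2, 3))

print(u.shape, z.shape, k.min(), k.max(), m.shape)
```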

Good practices for reproducible science

  • Use Generator instances and avoid the global legacy state.
  • Record your initial seed (and optionally SeedSequence.entropy).
  • For parallel jobs, create per-worker RNGs via SeedSequence.spawn.
  • Save RNG state if you need to resume exactly mid-stream.
  • Record the NumPy version used in your experiments: a seeded BitGenerator stream is reproducible across sessions, but the values returned by Generator distribution methods can change between NumPy releases.
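
A minimal sketch of the record-keeping described above, assuming you let NumPy pick fresh entropy rather than choosing a seed yourself. The record dictionary and its keys are hypothetical; store it wherever your experiment metadata lives.

```python
import numpy as np
from numpy.random import SeedSequence, PCG64, Generator

# With no argument, SeedSequence gathers fresh entropy from the OS.
ss = SeedSequence()
record = {"entropy": ss.entropy, "numpy_version": np.__version__}

rng = Generator(PCG64(ss))
x = rng.random(3)

# Later: rebuild the same stream from the recorded entropy.
rng_again = Generator(PCG64(SeedSequence(record["entropy"])))
x_again = rng_again.random(3)

print(np.array_equal(x, x_again))
```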

File I/O: storing seeds & states

Keep experiment seeds in config files or store bit_generator.state via JSON. For bulk arrays, save with .npy/.npz (see File I/O page).

import json, numpy as np

rng = np.random.default_rng(2026)
state = rng.bit_generator.state
with open('rng_state.json', 'w') as f:
    json.dump(state, f)

# Later:
with open('rng_state.json') as f:
    state2 = json.load(f)
rng2 = np.random.default_rng()
rng2.bit_generator.state = state2
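
To confirm that a JSON round trip preserves the stream, the sketch below uses an in-memory string instead of a file. It relies on the PCG64 state being a dictionary of plain Python integers and strings; BitGenerators whose state holds arrays (MT19937, for example) would need conversion before json can serialize them.

```python
import json
import numpy as np

rng = np.random.default_rng(2026)
rng.random(10)                                   # advance mid-stream
payload = json.dumps(rng.bit_generator.state)    # large ints survive as JSON numbers
expected = rng.random(5)

restored = np.random.default_rng()
restored.bit_generator.state = json.loads(payload)
actual = restored.random(5)

print(np.array_equal(expected, actual))
```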

Quality & speed notes

  • PCG64: excellent statistical properties; a fast general-purpose choice.
  • Philox: counter-based, well suited to parallel workloads.
  • Vectorize draws with size=(...) instead of looping in Python.
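
The vectorization advice can be checked with a rough timing sketch. The functions looped and vectorized are hypothetical names, and the timings depend on your machine, so no specific numbers are claimed here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def looped():
    # One Python-level call per draw.
    return np.array([rng.normal() for _ in range(n)])

def vectorized():
    # A single call that fills the whole array.
    return rng.normal(size=n)

t_loop = timeit.timeit(looped, number=1)
t_vec = timeit.timeit(vectorized, number=1)
print(f"loop: {t_loop:.4f}s  vectorized: {t_vec:.4f}s")
```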

Practice: quick exercises

# 1) Create three independent RNG streams using SeedSequence.spawn and draw 2 Poisson numbers from each.
import numpy as np
from numpy.random import SeedSequence, PCG64, Generator
ss = SeedSequence(1234)
streams = [Generator(PCG64(s)) for s in ss.spawn(3)]
for i, r in enumerate(streams, 1):
    print(i, r.poisson(2.5, size=2))

# 2) Save RNG state after 5 draws, restore it, and confirm the next 5 draws match
rng = np.random.default_rng(77)
first = rng.normal(size=5)
st = rng.bit_generator.state
cont1 = rng.normal(size=5)

rng2 = np.random.default_rng()
rng2.bit_generator.state = st
cont2 = rng2.normal(size=5)
print(np.allclose(cont1, cont2))

# 3) Replace legacy np.random calls with Generator equivalents in a toy script.

# 4) Simulate a biased 6-sided die with probabilities and verify empirical frequencies.
rng = np.random.default_rng(9)
p = np.array([0.10, 0.15, 0.20, 0.25, 0.20, 0.10])
rolls = rng.choice(np.arange(1,7), size=10_000, p=p)
print(np.round(np.bincount(rolls, minlength=7)[1:] / rolls.size, 3))
Subhendu Mohapatra — author at plus2net