The Inverse Transform of RVs#
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
The idea#
Inverse transform sampling is a basic method for pseudo-random number sampling, i.e. for generating sample numbers at random from any probability distribution given its cumulative distribution function. That is, by drawing from a uniform distribution, we make it possible to draw from the other distribution in question.
Let’s start by defining uniform distribution
which generates random numbers falling on range. Now let’s look for a way of transforming random number with uniform pdf into a function where x is distributed according to some pdf . The probability to find between and is equal to:
This relation is simply transform of variables. Now the key point to realize is that integrals from over
for X and for Z are equal (these are cumulative distribution functions (CDF).
Thus, if we can (i) integrate expression on the left analytically and (ii) solve for
then we are done! For most of the pdf at least one of the two is not possible. Below is a typical example where both (i) and (ii) states are easily done.
Example: Drawing from the exponential distribution.#
For example, lets assume we would like to generate random numbers that follow the exponential distribution $
$for
and otherwise. Following the recipe from the above we have
Solving for
we get
Formalized procedure:#
Get a uniform sample
fromSolve for
yielding a new equation where is the CDF of the distribution we desire.Repeat.
# Define the probability distribution function
p = lambda x: np.exp(-x)
# Compute its cumulative distribution function (CDF)
CDF = lambda x: 1 - np.exp(-x)
# Compute the inverse of the CDF
invCDF = lambda r: -np.log(1 - r)
# Define sampling limits
xmin, xmax = 0, 6 # Domain range
rmin, rmax = CDF(xmin), CDF(xmax) # Range of CDF values
# Generate random samples
N = 10000
R = np.random.uniform(rmin, rmax, N)
X = invCDF(R)
# Compute histogram data
hist_vals, bin_edges = np.histogram(X, bins=100)
# Plot histogram of samples
plt.hist(X, bins=100, label='Samples', alpha=0.7, edgecolor='black')
# Overlay the theoretical probability density function (PDF)
x_vals = np.linspace(xmin, xmax, 1000)
plt.plot(x_vals, hist_vals[0] * p(x_vals), 'r', linewidth=2, label='p(x)')
# Display legend and formatting
plt.legend()
plt.xlabel('x')
plt.ylabel('Frequency')
plt.title('Inverse Transform Sampling of an Exponential Distribution')
plt.grid(alpha=0.3)
plt.show()
