Statistical inference of model parameters from data¶
Likelihood (function of the bias $\theta$ for fixed data $D$): $P(D \mid \theta)$

Prior (knowledge or experience about $\theta$): $P(\theta)$

Posterior (distribution of the bias consistent with the data; its peak is the most probable value of $\theta$):

$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$$
Example of determining coin bias from N tosses¶
Toss a coin $N$ times and record the number of times it lands heads, e.g., data = HTHHH or, encoded with H = 1 and T = 0, data = 10111.

The probability of a trajectory of coin tosses is given by a product of Bernoulli probabilities of landing H vs T. For $n_H$ heads in $N$ tosses,

$$P(D \mid \theta) = \theta^{n_H} (1-\theta)^{N-n_H}$$

We can assume no prior knowledge, i.e., an uninformative or uniform prior $P(\theta) = 1$.

$P(D)$ is a normalization factor which we can ignore since it has no dependence on $\theta$, so $P(\theta \mid D) \propto P(D \mid \theta)\, P(\theta)$.
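As a quick numerical check, the likelihood for data = 10111 can be evaluated on a grid of candidate biases; a minimal sketch (the grid size is an arbitrary choice):

import numpy as np

data = np.array([1, 0, 1, 1, 1])      # HTHHH encoded with H = 1, T = 0
N, n_H = len(data), int(data.sum())   # N = 5 tosses, n_H = 4 heads

theta = np.linspace(0, 1, 101)        # grid of candidate bias values
likelihood = theta**n_H * (1 - theta)**(N - n_H)

# The likelihood peaks at the maximum-likelihood estimate n_H / N = 0.8
print(theta[np.argmax(likelihood)])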
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

A Single experiment¶

Suppose you tossed a coin 10 times and got 7 heads, each with probability $\theta$, and 3 tails, each with probability $1-\theta$. The question is: how do we determine the $\theta$ that can best explain the data?
theta = np.linspace(0, 1, 100)
L = theta**7 * (1-theta)**3
prior = np.ones_like(L)
posterior = L * prior
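# With a uniform prior, the posterior mode (the MAP estimate) coincides with
# the maximum-likelihood estimate n_H/N = 7/10, up to the grid resolution
theta_map = theta[np.argmax(posterior)]
print(f'MAP estimate of the bias: {theta_map:.2f}')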
fig, ax = plt.subplots(nrows=3, figsize=(12,10))
ax[0].plot(theta, L, color='grey')
ax[0].set_title('Likelihood')
ax[1].plot(theta, prior, color='blue')
ax[1].set_title('Prior')
ax[2].plot(theta, posterior, color='green')
ax[2].set_title('Posterior')
plt.tight_layout()
plt.show()

Fair coin simulation and inference of $\theta$¶
@interact(N=(10, 2000))
def visualize_likelihood(N=10, true_bias=0.5):
    """
    Visualize the likelihood of the coin bias for N tosses.
    true_bias: the coin's true bias, i.e., the exact value of theta we are trying to estimate
    """
    # np.random.seed(42)
    # Generate random coin tosses (a fair coin; see the Questions below for making it biased)
    tosses = np.random.choice([0, 1], size=N)
    n = np.sum(tosses)
    # Calculate the likelihood for each possible bias value
    thetas = np.linspace(0, 1, 100)
    L_theta = binom.pmf(n, N, thetas)
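    # Note: binom.pmf includes the combinatorial factor C(N, n), which the raw
    # Bernoulli product theta**n * (1 - theta)**(N - n) lacks; it does not
    # depend on theta, so the location of the maximum is unchanged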
    # Find the theta estimate that maximizes the likelihood
    theta_estimate = thetas[np.argmax(L_theta)]
    # Plotting
    plt.figure(figsize=(10, 6))
    plt.plot(thetas, L_theta, label='Likelihood')
    plt.axvline(x=true_bias, color='r', linestyle='--', label=f'True bias={true_bias}')
    plt.axvline(x=theta_estimate, color='g', linestyle='-', label=f'Estimated bias={theta_estimate:.3f}')
    plt.title(f'Likelihood of Coin Bias Given {N} Tosses')
    plt.xlabel('Bias (probability of heads)')
    plt.ylabel('Likelihood')
    plt.legend()
    plt.grid(True)
    plt.show()

Questions¶
Modify the code so the synthetic tosses come from a biased coin.
Replace the uniform prior with a peaked prior and plot the posterior (one way to set this up is sketched below).
Compare posterior curves for different priors on the same data.
Plot the likelihood and posterior on the same axes. How does the posterior differ from the likelihood?
As $N$ increases, does the prior still matter? At what point does it become negligible?
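A minimal sketch for the prior questions, assuming a Beta(a, b) prior via scipy.stats.beta; the toss count, bias, seed, and hyperparameters are illustrative choices, not values fixed by the notebook:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta

np.random.seed(0)
N, true_bias = 50, 0.7
tosses = np.random.choice([0, 1], size=N, p=[1 - true_bias, true_bias])
n = tosses.sum()

thetas = np.linspace(0, 1, 200)
L = thetas**n * (1 - thetas)**(N - n)  # Bernoulli-product likelihood

# Compare a uniform prior, Beta(1, 1), with a prior peaked at theta = 0.5, Beta(20, 20)
for a, b in [(1, 1), (20, 20)]:
    prior = beta.pdf(thetas, a, b)
    posterior = L * prior
    posterior /= np.trapz(posterior, thetas)   # normalize so the curves are comparable
    plt.plot(thetas, posterior, label=f'Posterior with Beta({a}, {b}) prior')
plt.axvline(x=true_bias, color='r', linestyle='--', label=f'True bias={true_bias}')
plt.xlabel('Bias (probability of heads)')
plt.ylabel('Posterior density')
plt.legend()
plt.show()

Increasing N and rerunning should show the two posterior curves converging toward the likelihood, which addresses the last question empirically.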