Statistical inference of model parameters from data¶
Likelihood (function of the bias $\theta$ for fixed data $D$): $P(D \mid \theta)$

Prior (knowledge or experience about $\theta$): $P(\theta)$

Posterior (distribution of the bias consistent with the data; its peak is the most probable value of $\theta$):

$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$$
Example of determining coin bias from N tosses¶
Toss a coin $N$ times and record the number of times it lands heads, e.g., data = HTHHH or, encoded with H = 1 and T = 0, data = 10111.

The probability of a trajectory of coin tosses is given by a product of Bernoulli probabilities of landing H vs T. For $n_H$ heads in $N$ tosses,

$$P(D \mid \theta) = \theta^{n_H} (1-\theta)^{N-n_H}$$

We can assume no prior knowledge, i.e., an uninformative or uniform prior $P(\theta) = 1$.

$P(D)$ is a normalization factor which we can ignore since it has no dependence on $\theta$, so $P(\theta \mid D) \propto P(D \mid \theta)\, P(\theta)$.
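As a quick numerical check, the likelihood for data = 10111 can be evaluated on a grid of candidate biases; a minimal sketch (the grid size is an arbitrary choice):

import numpy as np

data = np.array([1, 0, 1, 1, 1])      # HTHHH encoded with H = 1, T = 0
N, n_H = len(data), int(data.sum())   # N = 5 tosses, n_H = 4 heads

theta = np.linspace(0, 1, 101)        # grid of candidate bias values
likelihood = theta**n_H * (1 - theta)**(N - n_H)

# The likelihood peaks at the maximum-likelihood estimate n_H / N = 0.8
print(theta[np.argmax(likelihood)])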
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

A Single experiment¶

Suppose you tossed a coin 10 times and got 7 heads, each with probability $\theta$, and 3 tails, each with probability $1-\theta$. The question is: how do we determine the $\theta$ that can best explain the data?
theta = np.linspace(0, 1, 100)
L = theta**7 * (1-theta)**3
prior = np.ones_like(L)
posterior = L * prior
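# With a uniform prior, the posterior mode (the MAP estimate) coincides with
# the maximum-likelihood estimate n_H/N = 7/10, up to the grid resolution
theta_map = theta[np.argmax(posterior)]
print(f'MAP estimate of the bias: {theta_map:.2f}')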
fig, ax = plt.subplots(nrows=3, figsize=(12,10))
ax[0].plot(theta, L, color='grey')
ax[0].set_title('Likelihood')
ax[1].plot(theta, prior, color='blue')
ax[1].set_title('Prior')
ax[2].plot(theta, posterior, color='green')
ax[2].set_title('Posterior')
plt.tight_layout()
plt.show()

Fair coin simulation and inference of $\theta$¶
@interact(N=(10, 2000))
def visualize_likelihood(N=10, true_bias=0.5):
    """
    Visualize the likelihood of the coin bias for N tosses.
    true_bias: the coin's true bias, i.e., the exact value of theta we are trying to estimate
    """
    # np.random.seed(42)
    # Generate random coin tosses (a fair coin; see the Questions below for making it biased)
    tosses = np.random.choice([0, 1], size=N)
    n = np.sum(tosses)
    # Calculate the likelihood for each possible bias value
    thetas = np.linspace(0, 1, 100)
    L_theta = binom.pmf(n, N, thetas)
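    # Note: binom.pmf includes the combinatorial factor C(N, n), which the raw
    # Bernoulli product theta**n * (1 - theta)**(N - n) lacks; it does not
    # depend on theta, so the location of the maximum is unchanged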
    # Find the theta estimate that maximizes the likelihood
    theta_estimate = thetas[np.argmax(L_theta)]
    # Plotting
    plt.figure(figsize=(10, 6))
    plt.plot(thetas, L_theta, label='Likelihood')
    plt.axvline(x=true_bias, color='r', linestyle='--', label=f'True bias={true_bias}')
    plt.axvline(x=theta_estimate, color='g', linestyle='-', label=f'Estimated bias={theta_estimate:.3f}')
    plt.title(f'Likelihood of Coin Bias Given {N} Tosses')
    plt.xlabel('Bias (probability of heads)')
    plt.ylabel('Likelihood')
    plt.legend()
    plt.grid(True)
    plt.show()

Questions¶
Modify the code so the synthetic tosses come from a biased coin.
Replace the uniform prior with a peaked prior and plot the posterior (one way to set this up is sketched below).
Compare posterior curves for different priors on the same data.
Plot the likelihood and posterior on the same axes. How does the posterior differ from the likelihood?
As $N$ increases, does the prior still matter? At what point does it become negligible?
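A minimal sketch for the prior questions, assuming a Beta(a, b) prior via scipy.stats.beta; the toss count, bias, seed, and hyperparameters are illustrative choices, not values fixed by the notebook:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta

np.random.seed(0)
N, true_bias = 50, 0.7
tosses = np.random.choice([0, 1], size=N, p=[1 - true_bias, true_bias])
n = tosses.sum()

thetas = np.linspace(0, 1, 200)
L = thetas**n * (1 - thetas)**(N - n)  # Bernoulli-product likelihood

# Compare a uniform prior, Beta(1, 1), with a prior peaked at theta = 0.5, Beta(20, 20)
for a, b in [(1, 1), (20, 20)]:
    prior = beta.pdf(thetas, a, b)
    posterior = L * prior
    posterior /= np.trapz(posterior, thetas)   # normalize so the curves are comparable
    plt.plot(thetas, posterior, label=f'Posterior with Beta({a}, {b}) prior')
plt.axvline(x=true_bias, color='r', linestyle='--', label=f'True bias={true_bias}')
plt.xlabel('Bias (probability of heads)')
plt.ylabel('Posterior density')
plt.legend()
plt.show()

Increasing N and rerunning should show the two posterior curves converging toward the likelihood, which addresses the last question empirically.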