NumPy and Probabilities

import numpy as np

# Given joint probability matrix p(x, y)
pxy = np.array([[0.0888395 , 0.04135306, 0.07692385, 0.04716456],
                [0.05191984, 0.04715928, 0.06539948, 0.04739613],
                [0.09325686, 0.05106682, 0.05221257, 0.05864018],
                [0.08283109, 0.11219392, 0.02191071, 0.06173214]])

Problem: Compute Full, Marginal, and Conditional Probabilities Given $p(x, y)$

Given the joint probability distribution $p(x, y)$, perform the following tasks:

1. Check for Normalization:
   - Verify that the sum of all elements in $p(x, y)$ equals 1 (i.e., $\sum_{x,y} p(x, y) = 1$).
2. Compute the Marginal Probability $p(x)$:
   - Compute $p(x)$ by summing over all values of $y$:
     $p(x) = \sum_{y} p(x, y)$
     (1)
3. Compute a Specific Joint Probability:
   - Find the probability of $x = 1$ and $y = 2$, i.e., $p(x=1, y=2)$.
4. Compute the Conditional Probability $p(x | y=3)$:
   - Use the definition of conditional probability:
     $p(x | y=3) = \frac{p(x, y=3)}{p(y=3)}$
     (2)
   - Compute $p(y=3)$ by summing over $x$, then use it to compute $p(x | y=3)$.
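
The four tasks above can be sketched in NumPy as follows. This is a minimal sketch, not a reference solution: it assumes that rows of `pxy` index $x$ and columns index $y$, and that the values $x = 1$, $y = 2$, and $y = 3$ are zero-based array indices. The problem does not state either convention, so adjust the axes and indices if yours differ.

```python
import numpy as np

# Joint probability matrix p(x, y) from the problem statement.
# Assumption: rows index x, columns index y.
pxy = np.array([[0.0888395 , 0.04135306, 0.07692385, 0.04716456],
                [0.05191984, 0.04715928, 0.06539948, 0.04739613],
                [0.09325686, 0.05106682, 0.05221257, 0.05864018],
                [0.08283109, 0.11219392, 0.02191071, 0.06173214]])

# 1. Normalization: the entries should sum to 1 up to rounding.
total = pxy.sum()
print("sum of p(x, y):", total)
print("normalized:", np.isclose(total, 1.0))

# 2. Marginal p(x): sum over y, i.e. along the column axis.
px = pxy.sum(axis=1)
print("p(x):", px)

# 3. Specific joint probability p(x=1, y=2).
# Assumption: 1 and 2 are zero-based indices into the array.
p_x1_y2 = pxy[1, 2]
print("p(x=1, y=2):", p_x1_y2)

# 4. Conditional p(x | y=3) = p(x, y=3) / p(y=3).
py3 = pxy[:, 3].sum()           # p(y=3), obtained by summing over x
px_given_y3 = pxy[:, 3] / py3   # one entry per value of x
print("p(y=3):", py3)
print("p(x | y=3):", px_given_y3)
```

A quick sanity check is that `px_given_y3.sum()` equals 1, since a conditional distribution is itself normalized.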

Exploration: How likely is it that $n$ out of $N$ molecules are in the left half?

When a wall separating an ideal gas is removed, each molecule can be found in the left or right half of the container. Assume each molecule is independently in the left half with probability $p=\tfrac12$.

The probability that exactly $n$ molecules out of $N$ are in the left half is:

$P_N(n)=\binom{N}{n}\left(\frac12\right)^N$
(3)

Implement $P_N(n)$ in NumPy

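Create a Python function P_N(n, N) that returns the binomial probability values.
Use math.comb(N, n) for the binomial coefficient. Note that math.comb accepts only scalar integers, so when n is an array the coefficient has to be evaluated element by element.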
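
One possible implementation is sketched below. It loops over the entries of `n` because `math.comb` is scalar-only, and it always returns a NumPy array, even for a scalar `n`. Both choices are assumptions of this sketch rather than requirements of the exercise.

```python
import math
import numpy as np

def P_N(n, N):
    """Probability that n of N molecules are in the left half (p = 1/2)."""
    # math.comb works on scalar Python ints only, so iterate over the entries of n.
    n_values = np.atleast_1d(n)
    # Divide the exact integer coefficient by the exact integer 2**N.
    return np.array([math.comb(N, int(k)) / 2**N for k in n_values])

# For N = 4 the probabilities are 1/16, 4/16, 6/16, 4/16, 1/16.
print(P_N(np.arange(5), 4))
```

Dividing by the integer `2**N` instead of multiplying by `0.5**N` keeps the intermediate values exact until the final division, which helps for large $N$.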

Plot the probability distribution

For each of these values of $N$:

$N = 10,\ 50,\ 200,\ 1000$

Create an array n = np.arange(N+1)
Compute P = P_N(n, N)
Plot $P_N(n)$ vs $n$
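
The plotting steps could look like the sketch below. It assumes Matplotlib is available and draws one panel per value of $N$. The helper name `plot_binomial` and the panel layout are illustrative choices, and `P_N` is repeated here only so that the block runs on its own.

```python
import math
import numpy as np
import matplotlib.pyplot as plt

def P_N(n, N):
    """Probability that n of N molecules are in the left half (p = 1/2)."""
    return np.array([math.comb(N, int(k)) / 2**N for k in np.atleast_1d(n)])

def plot_binomial(N_values):
    """Plot P_N(n) versus n, one panel per N (hypothetical helper)."""
    fig, axes = plt.subplots(1, len(N_values),
                             figsize=(4 * len(N_values), 3), squeeze=False)
    for ax, N in zip(axes[0], N_values):
        n = np.arange(N + 1)
        P = P_N(n, N)
        ax.plot(n, P, marker=".", linestyle="-")
        ax.set_title(f"N = {N}")
        ax.set_xlabel("n")
        ax.set_ylabel("$P_N(n)$")
    fig.tight_layout()
    return fig

fig = plot_binomial([10, 50, 200, 1000])
# In a script, call plt.show() to display the figure; notebooks render it automatically.
```

As $N$ grows, the peak around $n = N/2$ should look increasingly narrow relative to the full range $0 \le n \le N$.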

Check normalization

Numerically verify that:
$\sum_{n=0}^N P_N(n) = 1$
(4)

Measure fluctuations

For each $N$, compute and print:

The mean $\langle n\rangle$
The standard deviation $\sigma$
The relative fluctuation $\sigma/N$
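
The normalization check and the fluctuation measurements can be sketched together, since both are weighted sums over the same array. For a binomial distribution with $p = \tfrac12$ the exact results are $\langle n\rangle = N/2$ and $\sigma = \sqrt{N}/2$, so $\sigma/N = 1/(2\sqrt{N})$, which gives something to compare the printed numbers against. The helper name `binomial_stats` is hypothetical, and `P_N` is repeated so the block is self-contained.

```python
import math
import numpy as np

def P_N(n, N):
    """Probability that n of N molecules are in the left half (p = 1/2)."""
    return np.array([math.comb(N, int(k)) / 2**N for k in np.atleast_1d(n)])

def binomial_stats(N):
    """Return (normalization, mean, sigma, sigma/N) computed from P_N."""
    n = np.arange(N + 1)
    P = P_N(n, N)
    norm = P.sum()                      # should be 1 up to rounding
    mean = np.sum(n * P)                # <n> = sum_n n P_N(n)
    var = np.sum((n - mean) ** 2 * P)   # variance = <(n - <n>)^2>
    sigma = np.sqrt(var)
    return norm, mean, sigma, sigma / N

for N in [10, 50, 200, 1000]:
    norm, mean, sigma, rel = binomial_stats(N)
    print(f"N={N:5d}  sum={norm:.12f}  <n>={mean:.3f}  "
          f"sigma={sigma:.4f}  sigma/N={rel:.5f}")
```

The relative fluctuation shrinks as $1/\sqrt{N}$, which is why large systems look sharply peaked around an even split.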

Problem: Understanding Covariance and Correlation Using NumPy

Consider two random variables, $X$ and $Y$, representing the heights (in cm) and weights (in kg) of a group of individuals. The relationship between these variables is often analyzed using covariance and correlation.

1. Generate synthetic data:
   - Let $X$ (heights) be normally distributed with mean 170 cm and standard deviation 10 cm.
   - Let $Y$ (weights) be generated as a linear function of $X$ with some added random noise:
     $Y = 0.5X + 10 + \text{random noise}$
     (5)
     where the random noise follows a normal distribution with mean 0 and standard deviation 5.
2. Compute the sample covariance between $X$ and $Y$ using NumPy.
3. Compute the Pearson correlation coefficient using NumPy.
4. Interpretation:
   - If the covariance is positive, what does it imply?
   - How does the correlation coefficient relate to the strength of the relationship?

Hints:

Use np.random.normal to generate data.
Compute covariance using np.cov(X, Y).
Compute correlation using np.corrcoef(X, Y).
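
Putting the hints together gives the sketch below. The sample size of 10,000 and the fixed seed are assumptions made here for reproducibility, since the problem specifies neither. From the model itself one expects a covariance near $0.5\,\mathrm{Var}(X) = 50$ and a correlation near $50/(10\sqrt{50}) \approx 0.71$, so the printed values can be checked against those.

```python
import numpy as np

np.random.seed(0)      # assumption: fixed seed, only for reproducibility
n_samples = 10_000     # assumption: the problem does not give a sample size

# Heights: normal with mean 170 cm and standard deviation 10 cm.
X = np.random.normal(loc=170, scale=10, size=n_samples)
# Weights: linear in X plus normal noise with mean 0 and standard deviation 5.
noise = np.random.normal(loc=0, scale=5, size=n_samples)
Y = 0.5 * X + 10 + noise

# np.cov returns the 2x2 covariance matrix; the off-diagonal entry is cov(X, Y).
cov_matrix = np.cov(X, Y)
cov_xy = cov_matrix[0, 1]

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is Pearson's r.
corr_xy = np.corrcoef(X, Y)[0, 1]

# Cross-check: r = cov(X, Y) / (std_X * std_Y), using ddof=1 to match np.cov.
r_manual = cov_xy / (X.std(ddof=1) * Y.std(ddof=1))

print("sample covariance:", cov_xy)
print("Pearson correlation:", corr_xy)
print("manual correlation:", r_manual)
```

The covariance carries units (cm·kg) and scales with the spread of the data, whereas the correlation is dimensionless and bounded between -1 and 1, which makes it the easier of the two to interpret as a strength.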

Problem: Understanding Histograms, Normalization, and Bin Size in Statistical Mechanics

Scenario: Energy Distribution of Particles in a System

In statistical mechanics, the distribution of particle energies in a system at thermal equilibrium follows the Boltzmann distribution. To analyze this, we simulate a system of particles with energies distributed according to:

$P(E) \propto e^{-E / k_B T}$
(6)

where:

$E$ is the energy of a particle,
$k_B$ is the Boltzmann constant (set to 1 for simplicity),
$T$ is the temperature (choose a reasonable value, e.g., $T = 300$).

Tasks:

1. Generate Energy Data:
   - Simulate 10,000 particle energies following the Boltzmann distribution using an exponential distribution.
   - Use NumPy’s np.random.exponential(scale=kB*T, size=N) to generate energy values.
2. Plot Histograms with Different Bin Sizes:
   - Create histograms with 20, 50, and 100 bins.
   - Compare how bin size affects the appearance of the histogram.
3. Normalize the Histogram:
   - Convert the histogram into a probability density function (PDF) by ensuring the total area under the histogram sums to 1.
   - Use the density=True option in plt.hist().
4. Overlay the Theoretical Boltzmann Distribution:
   - Plot the theoretical function $P(E) = (1/k_B T) e^{-E / k_B T}$ on top of the histogram.
5. Analyze the Results:
   - How does the choice of bin size affect the smoothness of the energy distribution?
   - Why is normalization important in probability distributions?
   - How does increasing the number of particles affect the histogram shape?
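
Before plotting, the data generation and the normalization can be checked numerically. The sketch below uses `np.histogram` with `density=True` and confirms that the area under each histogram is 1 for every bin count. The fixed seed is an assumption added for reproducibility.

```python
import numpy as np

kB = 1.0       # Boltzmann constant, set to 1 as in the problem
T = 300.0      # temperature, the example value from the problem
N = 10_000     # number of particles

np.random.seed(0)  # assumption: fixed seed, only for reproducibility
energies = np.random.exponential(scale=kB * T, size=N)

areas = {}
for bins in (20, 50, 100):
    # density=True rescales the counts so that sum(density * bin_width) = 1.
    density, edges = np.histogram(energies, bins=bins, density=True)
    widths = np.diff(edges)
    areas[bins] = np.sum(density * widths)
    print(f"{bins:3d} bins: bin width = {widths[0]:.2f}, area = {areas[bins]:.6f}")

# The mean of an exponential distribution with scale kB*T is kB*T.
print("sample mean energy:", energies.mean(), "expected:", kB * T)
```

The area is 1 regardless of the number of bins: normalization fixes the total probability, while the bin count only changes how finely that probability is resolved.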

Hints:

Use np.random.exponential(scale=kB*T, size=N) to generate data.
Use plt.hist(energies, bins=n, density=True, alpha=0.5) to plot normalized histograms.
Overlay the theoretical Boltzmann distribution using plt.plot(E, P(E)).
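
The hints combine into the plotting sketch below, which assumes Matplotlib is available. The function name `boltzmann_pdf`, the three-panel layout, and the fixed seed are illustrative choices, not part of the exercise.

```python
import numpy as np
import matplotlib.pyplot as plt

kB, T, N = 1.0, 300.0, 10_000

np.random.seed(0)  # assumption: fixed seed, only for reproducibility
energies = np.random.exponential(scale=kB * T, size=N)

def boltzmann_pdf(E):
    """Normalized Boltzmann density P(E) = (1 / kB T) exp(-E / kB T)."""
    return np.exp(-E / (kB * T)) / (kB * T)

# Energy grid for the theoretical curve, spanning the sampled range.
E = np.linspace(0, energies.max(), 500)

fig, axes = plt.subplots(1, 3, figsize=(12, 3), sharey=True)
for ax, bins in zip(axes, (20, 50, 100)):
    # density=True normalizes the histogram so it is comparable to the PDF.
    ax.hist(energies, bins=bins, density=True, alpha=0.5, label=f"{bins} bins")
    ax.plot(E, boltzmann_pdf(E), label="theory")
    ax.set_xlabel("E")
    ax.legend()
axes[0].set_ylabel("P(E)")
fig.tight_layout()
# In a script, call plt.show() to display the figure; notebooks render it automatically.
```

With few bins the histogram is smooth but coarse; with many bins it follows the curve more closely near $E = 0$ but becomes noisy in the tail, where each bin holds only a handful of particles.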
