Numpy and Probabilities

Problem: Compute Full, Marginal, and Conditional Probabilities Given $p (x, y)$ #

Given the joint probability distribution $p (x, y)$ , perform the following tasks:

Check for Normalization:
- Verify that the sum of all elements in $p (x, y)$ equals 1 (i.e., $\sum_{x, y} p (x, y) = 1$ ).
Compute the Marginal Probability $p (x)$ :
- Compute $p (x)$ by summing over all values of $y$ :
  
  $p (x) = \sum_{y} p (x, y)$
Compute a Specific Joint Probability:
- Find the probability of $x = 1$ and $y = 2$ , i.e., $p (x = 1, y = 2)$ .
Compute the Conditional Probability $p (x | y = 3)$ :
- Use the definition of conditional probability: $p (x \mid y = 3) = \frac{p (x, y = 3)}{p (y = 3)}$
- Compute $p (y = 3)$ by summing over $x$ , then use it to compute $p (x | y = 3)$ .

import numpy as np

# Given joint probability matrix p(x, y)
pxy = np.array([[0.0888395 , 0.04135306, 0.07692385, 0.04716456],
                [0.05191984, 0.04715928, 0.06539948, 0.04739613],
                [0.09325686, 0.05106682, 0.05221257, 0.05864018],
                [0.08283109, 0.11219392, 0.02191071, 0.06173214]])
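The four tasks can be sketched as below. This is one possible approach, not a reference solution. It rests on two assumptions that the problem statement does not spell out: rows of `pxy` index $x$ and columns index $y$, and the values $x = 1$, $y = 2$, $y = 3$ are read as 0-based array indices. The variable names (`total`, `px`, `p_x1_y2`, `py3`, `px_given_y3`) are chosen here for illustration.

```python
import numpy as np

# Joint probability table copied from the problem statement.
# Assumption: rows index x, columns index y, both counted from 0.
pxy = np.array([[0.0888395 , 0.04135306, 0.07692385, 0.04716456],
                [0.05191984, 0.04715928, 0.06539948, 0.04739613],
                [0.09325686, 0.05106682, 0.05221257, 0.05864018],
                [0.08283109, 0.11219392, 0.02191071, 0.06173214]])

# 1. Normalization: the entries should sum to 1 up to rounding error.
total = pxy.sum()
is_normalized = np.isclose(total, 1.0, atol=1e-6)

# 2. Marginal p(x): sum over y, i.e. along the column axis.
px = pxy.sum(axis=1)

# 3. Specific joint probability p(x = 1, y = 2) under 0-based indexing.
p_x1_y2 = pxy[1, 2]

# 4. Conditional p(x | y = 3) = p(x, y = 3) / p(y = 3).
py3 = pxy[:, 3].sum()           # p(y = 3), obtained by summing over x
px_given_y3 = pxy[:, 3] / py3   # one value per x

print("normalized:", is_normalized)
print("p(x):", px)
print("p(x=1, y=2):", p_x1_y2)
print("p(x | y=3):", px_given_y3)
```

If the table is meant to be read with 1-based labels instead, subtract 1 from each index. A useful sanity check in either case is that `px_given_y3` sums to 1.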

Problem: Understanding Covariance and Correlation Using NumPy#

Consider two random variables, $X$ and $Y$ , representing the heights (in cm) and weights (in kg) of a group of individuals. The relationship between these variables is often analyzed using covariance and correlation.

Generate synthetic data:
- Let $X$ (heights) be normally distributed with mean 170 cm and standard deviation 10 cm.
- Let $Y$ (weights) be generated as a linear function of $X$ with some added random noise:
  
  $Y = 0.5 X + 10 + \text{random noise}$
  
  where the random noise follows a normal distribution with mean 0 and standard deviation 5.
Compute the sample covariance between $X$ and $Y$ using NumPy.
Compute the Pearson correlation coefficient using NumPy.
Interpretation:
- If the covariance is positive, what does it imply?
- How does the correlation coefficient relate to the strength of the relationship?
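
For reference, the values that the sample estimates should approach can be worked out by hand from the generating model. This is a sketch that assumes the noise term (written $\varepsilon$ here, a symbol not used in the statement above) is independent of $X$; $\sigma_X = 10$ and $\sigma_\varepsilon = 5$ are the standard deviations given in the problem.

$$
\begin{aligned}
\mathrm{Cov}(X, Y) &= \mathrm{Cov}(X,\ 0.5 X + 10 + \varepsilon) = 0.5\,\sigma_X^2 = 0.5 \times 100 = 50 \\
\mathrm{Var}(Y) &= 0.5^2\,\sigma_X^2 + \sigma_\varepsilon^2 = 25 + 25 = 50 \\
\rho_{XY} &= \frac{\mathrm{Cov}(X, Y)}{\sigma_X\,\sigma_Y} = \frac{50}{10 \sqrt{50}} = \frac{1}{\sqrt{2}} \approx 0.707
\end{aligned}
$$

The covariance carries units (here cm·kg) and changes if the data are rescaled, whereas the correlation coefficient is dimensionless and always lies between $-1$ and $1$.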

Hints:#

Use np.random.normal to generate data.
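Compute covariance using np.cov(X, Y). This returns the 2×2 covariance matrix; the covariance between $X$ and $Y$ is the off-diagonal element, np.cov(X, Y)[0, 1].
Compute correlation using np.corrcoef(X, Y). This also returns a 2×2 matrix; the Pearson coefficient is np.corrcoef(X, Y)[0, 1].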
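
A minimal sketch of the computation follows. The sample size `n = 10_000` and the seed are assumptions made here (the problem does not fix either), and the variable names are illustrative.

```python
import numpy as np

np.random.seed(0)   # arbitrary seed, only so the run is repeatable
n = 10_000          # assumed sample size; the problem does not specify one

# Heights: normal with mean 170 cm and standard deviation 10 cm.
X = np.random.normal(loc=170.0, scale=10.0, size=n)

# Weights: linear in X plus normal noise with mean 0 and standard deviation 5.
noise = np.random.normal(loc=0.0, scale=5.0, size=n)
Y = 0.5 * X + 10.0 + noise

# np.cov and np.corrcoef both return 2x2 matrices;
# the X-Y entry is the off-diagonal element [0, 1].
cov_xy = np.cov(X, Y)[0, 1]
corr_xy = np.corrcoef(X, Y)[0, 1]

# Cross-check: correlation = covariance / (std_X * std_Y).
# ddof=1 matches the sample (n - 1) normalization that np.cov uses by default.
corr_manual = cov_xy / (X.std(ddof=1) * Y.std(ddof=1))

print("sample covariance:", cov_xy)
print("Pearson correlation:", corr_xy)
```

A positive covariance indicates that taller individuals in the sample tend to be heavier. The correlation coefficient rescales that covariance by the two standard deviations, so its magnitude can be compared across data sets with different units.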

Problem: Understanding Histograms, Normalization, and Bin Size in Statistical Mechanics#

Scenario: Energy Distribution of Particles in a System#

In statistical mechanics, the distribution of particle energies in a system at thermal equilibrium follows the Boltzmann distribution. To analyze this, we simulate a system of particles with energies distributed according to:

$P (E) \propto e^{- E / k_{B} T}$

where:

$E$ is the energy of a particle,
$k_{B}$ is the Boltzmann constant (set to 1 for simplicity),
$T$ is the temperature (choose a reasonable value, e.g., $T = 300$ ).

Tasks:#

Generate Energy Data:
- Simulate 10,000 particle energies following the Boltzmann distribution using an exponential distribution.
- Use NumPy’s np.random.exponential(scale=kB*T, size=N) to generate energy values.
Plot Histograms with Different Bin Sizes:
- Create histograms with 20, 50, and 100 bins.
- Compare how the number of bins (and therefore the bin width) affects the appearance of the histogram.
Normalize the Histogram:
- Convert the histogram into a probability density function (PDF) by scaling it so that the total area under the histogram equals 1.
- Use the density=True option in plt.hist().
Overlay the Theoretical Boltzmann Distribution:
- Plot the theoretical function $P (E) = \frac{1}{k_{B} T} e^{- E / k_{B} T}$ on top of the histogram.
Analyze the Results:
- How does the choice of bin size affect the smoothness of the energy distribution?
- Why is normalization important in probability distributions?
- How does increasing the number of particles affect the histogram shape?
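
The normalization task can also be checked numerically, without plotting. The sketch below uses `np.histogram` (a NumPy function not mentioned in the tasks above) with `density=True` and confirms that the bar heights times the bin widths add up to 1 for each bin count. The seed and variable names are assumptions made for illustration.

```python
import numpy as np

kB = 1.0      # Boltzmann constant, set to 1 as in the problem
T = 300.0     # temperature, the example value from the problem
N = 10_000    # number of particles

np.random.seed(0)  # arbitrary seed, only so the run is repeatable
energies = np.random.exponential(scale=kB * T, size=N)

areas = {}
for n_bins in (20, 50, 100):
    # density=True rescales counts so that sum(height * width) equals 1.
    density, edges = np.histogram(energies, bins=n_bins, density=True)
    widths = np.diff(edges)
    areas[n_bins] = float(np.sum(density * widths))
    print(n_bins, "bins -> total area:", areas[n_bins])
```

Because the area is fixed at 1 regardless of the bin count, normalized histograms with different numbers of bins can be compared directly with each other and with the theoretical density.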

Hints:#

Use np.random.exponential(scale=kB*T, size=N) to generate data.
Use plt.hist(energies, bins=n, density=True, alpha=0.5) to plot normalized histograms.
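Overlay the theoretical Boltzmann distribution using plt.plot(E, P), where E is an array of energy values (for example from np.linspace) and P is the theoretical density evaluated on E.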
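
Putting the hints together, one possible plotting sketch is shown below. The side-by-side three-panel layout, the seed, and the `Agg` backend line are choices made here so that the script also runs without a display; none of them come from the problem statement.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

kB, T, N = 1.0, 300.0, 10_000

np.random.seed(0)  # arbitrary seed, only so the run is repeatable
energies = np.random.exponential(scale=kB * T, size=N)

# Theoretical density P(E) = exp(-E / (kB*T)) / (kB*T) on a grid of energies.
E = np.linspace(0.0, energies.max(), 500)
P = np.exp(-E / (kB * T)) / (kB * T)

# One panel per bin count, sharing the y-axis for easy comparison.
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5), sharey=True)
for ax, n_bins in zip(axes, (20, 50, 100)):
    ax.hist(energies, bins=n_bins, density=True, alpha=0.5, label="simulated")
    ax.plot(E, P, label="theory")
    ax.set_title(f"{n_bins} bins")
    ax.set_xlabel("E")
axes[0].set_ylabel("P(E)")
axes[0].legend()
fig.tight_layout()
# In a notebook, call plt.show() here to display the figure.
```

With few bins the histogram is smooth but coarse; with many bins it follows the curve more closely but shows more statistical noise, especially in the sparsely populated high-energy tail.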
