import numpy as np
# Given joint probability matrix p(x, y)
pxy = np.array([[0.0888395 , 0.04135306, 0.07692385, 0.04716456],
[0.05191984, 0.04715928, 0.06539948, 0.04739613],
[0.09325686, 0.05106682, 0.05221257, 0.05864018],
[0.08283109, 0.11219392, 0.02191071, 0.06173214]])
Problem: Compute Full, Marginal, and Conditional Probabilities Given $p(x, y)$¶
Given the joint probability distribution $p(x, y)$, perform the following tasks:
Check for Normalization:
Verify that the sum of all elements in $p(x, y)$ equals 1 (i.e., $\sum_{x, y} p(x, y) = 1$).
Compute the Marginal Probability $p(x)$:
Compute $p(x)$ by summing over all values of $y$: $p(x) = \sum_{y} p(x, y)$.
Compute a Specific Joint Probability:
Find the probability of and , i.e., .
Compute the Conditional Probability $p(x \mid y)$:
Use the definition of conditional probability: $p(x \mid y) = \frac{p(x, y)}{p(y)}$.
Compute $p(y)$ by summing over $x$, then use it to compute $p(x \mid y)$.
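The tasks above can be sketched in NumPy as follows. This is one possible approach, not a reference solution. It assumes that rows of `pxy` index $x$ and columns index $y$, and the indices `i` and `j` used for the single joint probability are hypothetical placeholders chosen only for illustration.

```python
import numpy as np

# Joint probability matrix p(x, y) from the problem statement.
# Assumption: rows index x, columns index y.
pxy = np.array([[0.0888395 , 0.04135306, 0.07692385, 0.04716456],
                [0.05191984, 0.04715928, 0.06539948, 0.04739613],
                [0.09325686, 0.05106682, 0.05221257, 0.05864018],
                [0.08283109, 0.11219392, 0.02191071, 0.06173214]])

# 1. Normalization: all entries should sum to 1 (up to rounding).
total = pxy.sum()
print("sum of p(x, y):", total, "| normalized:", np.isclose(total, 1.0))

# 2. Marginals: sum out the other variable.
px = pxy.sum(axis=1)   # p(x) = sum over y (columns)
py = pxy.sum(axis=0)   # p(y) = sum over x (rows)

# 3. One joint probability; i and j are hypothetical example indices.
i, j = 0, 1
print(f"p(x={i}, y={j}) =", pxy[i, j])

# 4. Conditional p(x | y) = p(x, y) / p(y).
# Broadcasting divides column j of pxy by py[j].
px_given_y = pxy / py
print("columns of p(x | y) sum to:", px_given_y.sum(axis=0))
```

A useful sanity check is that every column of `px_given_y` sums to 1, because for a fixed $y$ the conditional distribution over $x$ is itself normalized.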
Exploration: How likely is it that $n$ out of $N$ molecules are in the left half?¶
When a wall separating an ideal gas is removed, each molecule can be found in the left or right half of the container. Assume each molecule is independently in the left half with probability $p = 1/2$.
The probability that exactly $n$ molecules out of $N$ are in the left half is: $P_N(n) = \binom{N}{n} \left(\frac{1}{2}\right)^{N}$
Implement in NumPy
Create a Python function `P_N(n, N)` that returns the binomial probability values.
Use `math.comb(N, n)` for the binomial coefficient.
Plot the probability distribution
For each of these values of $N$:
$N = 10,\; 50,\; 200,\; 1000$
Create an array `n = np.arange(N+1)`.
Compute `P = P_N(n, N)`.
Plot $P_N(n)$ vs $n$.
Check normalization
Numerically verify that: $\sum_{n=0}^{N} P_N(n) = 1$
Measure fluctuations
For each $N$, compute and print:
The mean $\langle n \rangle$
The standard deviation $\sigma_n$
The relative fluctuation $\sigma_n / \langle n \rangle$
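A minimal sketch of the implementation, normalization check, and fluctuation measurements is given below (plotting is omitted). It assumes $p = 1/2$, so the probability is the binomial coefficient divided by $2^N$. Because `math.comb` accepts only Python integers, the function loops over the entries of `n`. The helper name `fluctuation_stats` is hypothetical and introduced only for this example.

```python
import math
import numpy as np

def P_N(n, N):
    """Probability that n of N molecules are in the left half (p = 1/2)."""
    n = np.atleast_1d(n)
    # math.comb needs plain ints, so evaluate entry by entry.
    # Exact integer division comb / 2**N stays finite even for N = 1000.
    return np.array([math.comb(N, int(k)) / 2**N for k in n])

def fluctuation_stats(N):
    """Return (normalization, mean, standard deviation) of P_N.

    Hypothetical helper, not part of the problem statement.
    """
    n = np.arange(N + 1)
    P = P_N(n, N)
    norm = P.sum()                               # should be 1
    mean = np.sum(n * P)                         # <n>
    std = np.sqrt(np.sum((n - mean) ** 2 * P))   # sigma_n
    return norm, mean, std

for N in [10, 50, 200, 1000]:
    norm, mean, std = fluctuation_stats(N)
    print(f"N={N:5d}  sum={norm:.6f}  <n>={mean:.2f}  "
          f"sigma={std:.3f}  sigma/<n>={std / mean:.4f}")
```

For a binomial distribution with $p = 1/2$ the exact results are $\langle n \rangle = N/2$ and $\sigma_n = \sqrt{N}/2$, so the relative fluctuation should decrease as $1/\sqrt{N}$. The printed values can be compared against these expressions.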
Problem: Understanding Covariance and Correlation Using NumPy¶
Consider two random variables, $X$ and $Y$, representing the heights (in cm) and weights (in kg) of a group of individuals. The relationship between these variables is often analyzed using covariance and correlation.
Generate synthetic data:
Let $X$ (heights) be normally distributed with mean 170 cm and standard deviation 10 cm.
Let $Y$ (weights) be generated as a linear function of $X$ with some added random noise, where the random noise follows a normal distribution with mean 0 and standard deviation 5.
Compute the sample covariance between $X$ and $Y$ using NumPy.
Compute the Pearson correlation coefficient using NumPy.
Interpretation:
If the covariance is positive, what does it imply?
How does the correlation coefficient relate to the strength of the relationship?
Hints:¶
Use `np.random.normal` to generate data.
Compute covariance using `np.cov(X, Y)`.
Compute correlation using `np.corrcoef(X, Y)`.
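The steps and hints above might be put together as in the sketch below. The sample size, the random seed, and the `slope` and `intercept` of the linear relation are assumptions made only for illustration. They are not values taken from the problem statement.

```python
import numpy as np

np.random.seed(0)      # assumed seed, only for reproducibility
n_samples = 1000       # assumed sample size

# Heights: normal with mean 170 cm and standard deviation 10 cm.
X = np.random.normal(loc=170, scale=10, size=n_samples)

# Weights: linear in X plus Gaussian noise (mean 0, std 5).
# slope and intercept are hypothetical illustration values.
slope, intercept = 0.5, -15.0
noise = np.random.normal(loc=0, scale=5, size=n_samples)
Y = slope * X + intercept + noise

# np.cov returns the 2x2 sample covariance matrix;
# the off-diagonal entry is cov(X, Y).
cov_xy = np.cov(X, Y)[0, 1]

# np.corrcoef returns the 2x2 correlation matrix;
# the off-diagonal entry is the Pearson coefficient.
corr_xy = np.corrcoef(X, Y)[0, 1]

print(f"cov(X, Y)  = {cov_xy:.3f}")
print(f"corr(X, Y) = {corr_xy:.3f}")
```

A positive covariance indicates that taller individuals in the sample tend to be heavier. The correlation coefficient is the covariance divided by the product of the two sample standard deviations, so it lies between -1 and 1, and values closer to 1 in magnitude indicate a tighter linear relationship.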
Problem: Understanding Histograms, Normalization, and Bin Size in Statistical Mechanics¶
Scenario: Energy Distribution of Particles in a System¶
In statistical mechanics, the distribution of particle energies in a system at thermal equilibrium follows the Boltzmann distribution. To analyze this, we simulate a system of particles with energies distributed according to: $P(E) = \frac{1}{k_B T} e^{-E / (k_B T)}$
where:
$E$ is the energy of a particle,
$k_B$ is the Boltzmann constant (set to 1 for simplicity),
$T$ is the temperature (choose a reasonable value).
Tasks:¶
Generate Energy Data:
Simulate 10,000 particle energies following the Boltzmann distribution using an exponential distribution.
Use NumPy’s `np.random.exponential(scale=kB*T, size=N)` to generate energy values.
Plot Histograms with Different Bin Sizes:
Create histograms with 20, 50, and 100 bins.
Compare how bin size affects the appearance of the histogram.
Normalize the Histogram:
Convert the histogram into a probability density function (PDF) by ensuring the total area under the histogram sums to 1.
Use the `density=True` option in `plt.hist()`.
Overlay the Theoretical Boltzmann Distribution:
Plot the theoretical function $P(E)$ on top of the histogram.
Analyze the Results:
How does the choice of bin size affect the smoothness of the energy distribution?
Why is normalization important in probability distributions?
How does increasing the number of particles affect the histogram shape?
Hints:¶
Use `np.random.exponential(scale=kB*T, size=N)` to generate data.
Use `plt.hist(energies, bins=n, density=True, alpha=0.5)` to plot normalized histograms.
Overlay the theoretical Boltzmann distribution using `plt.plot(E, P(E))`.
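The normalization and bin-size tasks can also be checked numerically without plotting. The sketch below uses `np.histogram` with `density=True`, which applies the same normalization as `plt.hist(..., density=True)`. The temperature `T = 2.0`, the random seed, and the helper name `boltzmann_pdf` are assumptions for illustration. The theoretical curve is taken to be the normalized exponential density with scale `kB*T`.

```python
import numpy as np

kB = 1.0      # Boltzmann constant set to 1
T = 2.0       # assumed temperature, illustration only
N = 10_000    # number of particles

np.random.seed(0)  # assumed seed, only for reproducibility
energies = np.random.exponential(scale=kB * T, size=N)

def boltzmann_pdf(E, kB=1.0, T=2.0):
    """Normalized exponential density exp(-E / (kB*T)) / (kB*T).

    Hypothetical helper name.
    """
    return np.exp(-E / (kB * T)) / (kB * T)

areas = {}
for bins in (20, 50, 100):
    # density=True rescales counts so that sum(density * bin width) = 1.
    density, edges = np.histogram(energies, bins=bins, density=True)
    widths = np.diff(edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    areas[bins] = np.sum(density * widths)
    # Largest gap between histogram and theory at the bin centers.
    max_dev = np.max(np.abs(density - boltzmann_pdf(centers, kB, T)))
    print(f"bins={bins:3d}  area={areas[bins]:.6f}  max deviation={max_dev:.4f}")
```

The area is 1 for every bin count, which is what makes the histogram directly comparable to $P(E)$. The deviation from theory depends on the bin count: fewer bins average over a wider energy range, while more bins leave fewer samples per bin and so give a noisier estimate.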