Linear Variational Method#

What You Need to Know

  • The variational method is a powerful tool for estimating upper bounds and approximations for ground-state energies in a wide range of quantum mechanical problems.

  • It plays a key role in electronic structure theories, such as Hartree-Fock.

  • The linear variational method seeks solutions to the Schrödinger equation by representing trial wavefunctions as linear combinations of simple, computationally efficient functions, such as Gaussians or exponentials.

  • By applying the linear variational method, the Schrödinger equation is transformed into a linear algebra problem, where the goal is to find eigenvalues (representing energies) and eigenvectors (providing coefficients for the linear combination).

Linearizing the problem#

  • How does linearization help us in practice? In a typical QM problem, we are attempting to solve for the wavefunctions and energies of a given Hamiltonian. If solving the problem exactly is too challenging (e.g., any problem with two or more electrons), we can expand the wavefunction in a basis and attempt to solve the problem that way. We start with

\[\psi \approx \phi = \sum_n^N c_nf_n\]
  • Truncating this expansion at any finite \(N\) leads to an approximate solution. The variational principle tells us that we can minimize the energy with respect to the variational parameters \(\{c_n\}\) and still have \(E_\phi \geq E_0\) (see the sketch below).
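
Concretely, a truncated expansion is just a coefficient vector paired with a set of basis functions. Below is a minimal sketch; the Gaussian basis, its centers, and the coefficients are made up purely for illustration:

import numpy as np

# Hypothetical basis: three Gaussians with made-up centers and a fixed width
centers = [0.25, 0.50, 0.75]
basis = [lambda x, mu=mu: np.exp(-50.0*(x - mu)**2) for mu in centers]
c = np.array([0.3, 1.0, 0.3])  # trial coefficients c_n

# Evaluate phi(x) = sum_n c_n f_n(x) on a grid
x = np.linspace(0.0, 1.0, 101)
phi = sum(cn * f(x) for cn, f in zip(c, basis))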

Smallest example#

  • We will illustrate this idea and the general matrix construction with a simple example of two basis functions (\(N=2\))

\[\phi = c_1f_1 + c_2f_2\]
  • There is currently no need to define these functions explicitly, so we will leave them as generic functions \(f_1\) and \(f_2\). We now solve for the energy

\[E_\phi = \frac{\langle\phi|\hat{H}|\phi\rangle}{\langle\phi|\phi\rangle} = \frac{\langle c_1f_1 + c_2f_2|\hat{H}|c_1f_1 + c_2f_2 \rangle}{\langle c_1f_1 + c_2f_2|c_1f_1 + c_2f_2 \rangle}\]
\[E_\phi = \frac{c_1^2 H_{11} + 2c_1c_2H_{12} + c_2^2H_{22}}{c_1^2S_{11} + 2c_1c_2S_{12} + c_2^2S_{22}}\]
  • \(H_{ij} = \langle f_i |\hat{H}|f_j\rangle\): Hamiltonian matrix element expressed in the basis \(\{f_n\}\)

  • \(S_{ij} = \langle f_i|f_j\rangle\): overlap matrix element (overlap integral) in the same basis. It measures how similar \(f_i\) is to \(f_j\) (see the numerical sketch below).
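
Overlap integrals are easy to evaluate numerically. Below is a minimal sketch using scipy.integrate.quad with two example polynomial basis functions on \([0, 1]\) (the same functions that reappear in the particle-in-a-box example later):

from scipy.integrate import quad

# Two example basis functions on [0, 1]
f1 = lambda x: x*(1.0 - x)
f2 = lambda x: x**2*(1.0 - x)**2

# Overlap integrals S_ij = <f_i|f_j>
S11, _ = quad(lambda x: f1(x)*f1(x), 0.0, 1.0)
S12, _ = quad(lambda x: f1(x)*f2(x), 0.0, 1.0)
S22, _ = quad(lambda x: f2(x)*f2(x), 0.0, 1.0)
print(S11, S12, S22)  # 1/30, 1/140, 1/630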

Eigenvalue Problem#

  • Since \(E_\phi \geq E_0\) for any trial function \(\phi\), we can minimize the energy \(E_\phi\) by varying the parameters \(c_1\) and \(c_2\).

  • To minimize with respect to \(c_1\), first clear the denominator by writing \(E_\phi\left(c_1^2S_{11} + 2c_1c_2S_{12} + c_2^2S_{22}\right) = c_1^2H_{11} + 2c_1c_2H_{12} + c_2^2H_{22}\). Differentiating both sides with respect to \(c_1\), setting \(\partial E_\phi/\partial c_1 = 0\), and dividing by 2 gives (this step is checked symbolically in the sketch after this list):

    \[\frac{\partial E_\phi}{\partial c_1} = 0 = c_1(H_{11} - ES_{11}) + c_2(H_{12} - ES_{12})\]
  • Similarly, minimizing with respect to \(c_2\) gives:

    \[\frac{\partial E_\phi}{\partial c_2} = 0 = c_1(H_{12} - ES_{12}) + c_2(H_{22} - ES_{22})\]
  • These two coupled linear equations can be expressed compactly as a matrix equation:

    \[\begin{split}\begin{bmatrix} H_{11} - ES_{11} & H_{12} - ES_{12} \\ H_{12} - ES_{12} & H_{22} - ES_{22} \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = 0\end{split}\]
  • The matrix on the left can be rewritten as the difference between two matrices:

    \[\begin{split}\left(\begin{bmatrix} H_{11} & H_{12} \\ H_{12} & H_{22} \end{bmatrix} - E \begin{bmatrix} S_{11} & S_{12} \\ S_{12} & S_{22} \end{bmatrix} \right) \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = 0\end{split}\]
  • Rearranging, we can write:

    \[\begin{split}\begin{bmatrix} H_{11} & H_{12} \\ H_{12} & H_{22} \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = E \begin{bmatrix} S_{11} & S_{12} \\ S_{12} & S_{22} \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}\end{split}\]
  • In more compact matrix notation, with \(\mathbf{c} = (c_1, c_2)^T\), this becomes:

    \[\mathbf{H}\mathbf{c} = E\mathbf{S}\mathbf{c}\]

  • By left-multiplying both sides by \(\mathbf{S}^{-1}\), we transform this into a standard eigenvalue problem:

    \[\mathbf{S}^{-1}\mathbf{H}\mathbf{c} = E\mathbf{I}\mathbf{c}\]
  • Therefore, the minimum energies correspond to the eigenvalues of \(\mathbf{S}^{-1}\mathbf{H}\), and the variational parameters that minimize the energies are the eigenvectors of \(\mathbf{S}^{-1}\mathbf{H}\).
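
The minimization step above can be verified symbolically. Below is a minimal SymPy sketch; the symbols stand for the matrix elements defined above, with \(\mathbf{H}\) and \(\mathbf{S}\) taken as real and symmetric:

import sympy as sp

c1, c2, H11, H12, H22, S11, S12, S22 = sp.symbols('c1 c2 H11 H12 H22 S11 S12 S22')

# Numerator and denominator of the energy expression E_phi = N/D
N = c1**2*H11 + 2*c1*c2*H12 + c2**2*H22
D = c1**2*S11 + 2*c1*c2*S12 + c2**2*S22
E = N / D

# dE/dc1 = (dN/dc1 - E*dD/dc1)/D, so dE/dc1 = 0 is equivalent to
# dN/dc1 - E*dD/dc1 = 0, which is twice the first secular equation
secular1 = sp.diff(N, c1) - E*sp.diff(D, c1)
print(sp.simplify(secular1 - 2*(c1*(H11 - E*S11) + c2*(H12 - E*S12))))  # prints 0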

Breaking the problem down to a matrix eigenvalue/eigenvector problem

In the equation

\[\mathbf{S}^{-1}\mathbf{H}\mathbf{c} = E\mathbf{I}\mathbf{c},\]

\(\mathbf{I}\) represents the identity matrix. Its role here is to make the equation read as a standard eigenvalue problem.

  1. Eigenvalue Problem Form:

    In linear algebra, a standard eigenvalue problem is written as:
    \[\mathbf{A}\mathbf{v} = \lambda \mathbf{I}\mathbf{v},\]
    where:

    • \(\mathbf{A}\) is a square matrix,

    • \(\lambda\) is a scalar eigenvalue,

    • \(\mathbf{I}\) is the identity matrix, and

    • \(\mathbf{v}\) is the corresponding eigenvector.

    The identity matrix \(\mathbf{I}\) ensures that \(\lambda\) scales the eigenvector \(\mathbf{v}\) without altering its direction. The eigenvalue problem is about finding the values of \(\lambda\) and their associated \(\mathbf{v}\).

  2. Connecting to \(\mathbf{S}^{-1}\mathbf{H}\mathbf{c} = E\mathbf{I}\mathbf{c}\):
    Here, \(\mathbf{S}^{-1}\mathbf{H}\) acts as the operator \(\mathbf{A}\) in the standard eigenvalue problem.

    • \(\mathbf{S}^{-1}\mathbf{H}\) is a matrix resulting from left-multiplying \(\mathbf{H}\) by the inverse of \(\mathbf{S}\).

    • \(\mathbf{c}\) represents the eigenvector.

    • \(E\) is the eigenvalue (corresponding to the energy in the quantum mechanical system).

    The identity matrix \(\mathbf{I}\) is explicitly included to highlight that \(E\) is a scalar multiplying the vector \(\mathbf{c}\). This ensures that the left-hand side (a matrix operation) matches the right-hand side (a scaled vector).

  3. Why \(\mathbf{S}^{-1}\) Appears:
    Initially, we had:
    \[\mathbf{H}\mathbf{c} = E\mathbf{S}\mathbf{c},\]
    which cannot directly be interpreted as an eigenvalue problem because of the presence of \(\mathbf{S}\) (the overlap matrix). To transform this into standard form, we pre-multiply both sides by \(\mathbf{S}^{-1}\):
    \[\mathbf{S}^{-1}\mathbf{H}\mathbf{c} = E\mathbf{S}^{-1}\mathbf{S}\mathbf{c}.\]
    Since \(\mathbf{S}^{-1}\mathbf{S} = \mathbf{I}\), this simplifies to:
    \[\mathbf{S}^{-1}\mathbf{H}\mathbf{c} = E\mathbf{I}\mathbf{c}.\]

  4. How to Interpret This as an Eigenvalue Problem:
    The equation now has the form of a standard eigenvalue problem:
    \[\mathbf{A}\mathbf{v} = \lambda \mathbf{I}\mathbf{v},\]
    where:

    • \(\mathbf{A} = \mathbf{S}^{-1}\mathbf{H}\) is the effective matrix to diagonalize,

    • \(\lambda = E\) are the eigenvalues, corresponding to the energy levels,

    • \(\mathbf{v} = \mathbf{c}\) are the eigenvectors, containing the coefficients of the trial wavefunctions.

  5. Physical Interpretation:
    Solving the eigenvalue problem \(\mathbf{S}^{-1}\mathbf{H}\mathbf{c} = E\mathbf{I}\mathbf{c}\) gives the approximate energies (\(E\)) of the quantum system as eigenvalues and the corresponding variational parameters (\(\mathbf{c}\)) as eigenvectors. The identity matrix \(\mathbf{I}\) preserves the standard form of the eigenvalue problem, ensuring a proper mathematical and physical interpretation. The short sketch after this list verifies numerically that the two routes agree.
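
As a concrete check, the sketch below uses made-up symmetric \(\mathbf{H}\) and positive-definite \(\mathbf{S}\) matrices to show that diagonalizing \(\mathbf{S}^{-1}\mathbf{H}\) and solving the generalized problem \(\mathbf{H}\mathbf{c} = E\mathbf{S}\mathbf{c}\) directly give the same eigenvalues:

import numpy as np
from scipy.linalg import eig, eigh, inv

# Made-up symmetric H and positive-definite S, purely for illustration
H = np.array([[1.0, 0.2], [0.2, 2.0]])
S = np.array([[1.0, 0.1], [0.1, 1.0]])

# Route 1: standard eigenvalue problem for S^-1 H
e1, v1 = eig(inv(S) @ H)

# Route 2: generalized symmetric eigenproblem H c = E S c
e2, v2 = eigh(H, S)

print(np.sort(e1.real))  # same eigenvalues from both routes
print(e2)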

Example: Particle in a Box#

Let’s consider a free particle in 1D confined to \(0\leq x\leq a\). The Hamiltonian for such a system is simply the kinetic energy operator

\[\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2}\]

While we could solve this problem analytically, it is instructive to see how the variational solution works. We start by approximating \(\psi(x)\) as an expansion in two basis functions

\[\psi(x) \approx c_1x(a-x) + c_2x^2(a-x)^2\]

where the basis functions are \(f_1(x) = x(a-x)\) and \(f_2(x) = x^2(a-x)^2\), and \(c_1\) and \(c_2\) are the variational parameters (the linear coefficients of the basis functions). To solve for the variational energies and wavefunctions given this expansion, we must construct the Hamiltonian matrix, \(\mathbf{H}\), and the basis function overlap matrix, \(\mathbf{S}\). We will then compute and diagonalize the \(\mathbf{S}^{-1}\mathbf{H}\) matrix. Recall,

\[H_{ij} = \langle f_i|\hat{H}|f_j\rangle\]

and

\[S_{ij} = \langle f_i|f_j\rangle \]

In this problem, we have two basis functions

\[f_1(x) = x(a-x)\]

and

\[f_2(x) = x^2(a-x)^2\]

Computing the matrix elements:#

\[H_{11} = \langle x(a-x)|\hat{H}|x(a-x)\rangle\]
\[= \langle ax|\hat{H}|ax\rangle -\langle ax|\hat{H}|x^2\rangle - \langle x^2|\hat{H}|ax\rangle + \langle x^2|\hat{H}|x^2\rangle\]
\[ = -a\langle x|\hat{H}|x^2\rangle + \langle x^2|\hat{H}|x^2\rangle\]

where the last equality holds because the second derivative of \(x\) with respect to \(x\) is zero. Let’s now investigate each of these integrals

\[\begin{split}\langle x|\hat{H}|x^2\rangle = \int_0^ax \frac{-\hbar^2}{2m}\frac{d^2}{dx^2} x^2dx \\ = \frac{-\hbar^2}{m}\int_0^axdx = \frac{-\hbar^2a^2}{2m}\end{split}\]

Similarly we should get

\[\langle x^2|\hat{H}|x^2\rangle =\frac{-\hbar^2a^3}{3m}\]

Thus

\[H_{11} = -a\frac{-\hbar^2a^2}{2m} + \frac{-\hbar^2a^3}{3m} = \frac{\hbar^2a^3}{6m}\]
\[H_{22} = \langle x^2(a-x)^2|\hat{H}|x^2(a-x)^2\rangle\]
\[ = \langle a^2x^2|\hat{H}|a^2x^2\rangle -2\langle a^2x^2|\hat{H}|ax^3\rangle + \langle a^2x^2|\hat{H}|x^4\rangle -2\langle ax^3|\hat{H}|a^2x^2\rangle + 4\langle ax^3|\hat{H}|ax^3\rangle -2\langle ax^3|\hat{H}|x^4\rangle + \langle x^4|\hat{H}|a^2x^2\rangle - 2\langle x^4|\hat{H}|ax^3\rangle + \langle x^4|\hat{H}|x^4\rangle\]
\[ = \frac{-\hbar^2a^7}{3m} +\frac{3\hbar^2 a^7}{2m} - \frac{6\hbar^2 a^7}{5m} +\frac{\hbar^2 a^7}{2m} - \frac{12\hbar^2 a^7}{5m} +\frac{2\hbar^2 a^7}{m} - \frac{\hbar^2 a^7}{5m} + \frac{\hbar^2 a^7}{m} - \frac{6\hbar^2 a^7}{7m}\]
\[ = \frac{\hbar^2a^7}{105m} \]
\[H_{12} = H_{21} = \frac{\hbar^2a^5}{30m}\]

We can now complete the Hamiltonian matrix, \(\mathbf{H}\),

\[\begin{split}\mathbf{H} = \frac{\hbar^2a^3}{m} \begin{bmatrix} \frac{1}{6} & \frac{a^2}{30}\\ \frac{a^2}{30} & \frac{a^4}{105} \end{bmatrix}\end{split}\]

It can also be shown that \(\mathbf{S}\) is

\[\begin{split}\mathbf{S} = \frac{a^5}{10} \begin{bmatrix} \frac{1}{3} & \frac{a^2}{14}\\ \frac{a^2}{14} & \frac{a^4}{63} \end{bmatrix}\end{split}\]
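These integrals are tedious to do by hand, so it is worth checking them symbolically. Below is a short SymPy sketch (assuming SymPy is available) that reproduces \(\mathbf{H}\) and \(\mathbf{S}\) above:

import sympy as sp

x, a, hbar, m = sp.symbols('x a hbar m', positive=True)
f = [x*(a - x), x**2*(a - x)**2]

# H_ij = <f_i|H|f_j> with H = -hbar^2/(2m) d^2/dx^2, integrated over [0, a]
Hij = lambda fi, fj: sp.integrate(fi*(-hbar**2/(2*m))*sp.diff(fj, x, 2), (x, 0, a))
# S_ij = <f_i|f_j>
Sij = lambda fi, fj: sp.integrate(fi*fj, (x, 0, a))

H = sp.Matrix(2, 2, lambda i, j: Hij(f[i], f[j]))
S = sp.Matrix(2, 2, lambda i, j: Sij(f[i], f[j]))
print(sp.simplify(H))  # matches the H matrix above
print(sp.simplify(S))  # matches the S matrix above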
import numpy as np
from scipy.linalg import eig, inv, eigh

# Adopt simple units: a = hbar = m = 1
a = 1.0
hbar = 1.0
m = 1.0

# Overlap and Hamiltonian matrices derived above (in these units)
S = np.array([[1.0/30.0, 1.0/140.0],[1.0/140.0, 1.0/630.0]])
H = np.array([[1.0/6.0, 1.0/30.0],[1.0/30.0, 1.0/105.0]])

# Diagonalize S^-1 H; e, v = eigh(H, S) gives the same result, since H and S
# are symmetric this is a generalized symmetric eigenvalue problem
e, v = eig(inv(S) @ H)

print("Eigenvalues:", e)
print("First eigenvector:", v[:,0])
Eigenvalues: [ 4.93487481+0.j 51.06512519+0.j]
First eigenvector: [-0.66168489 -0.74978204]

So we see that the smallest energy in this basis is

\[E_\phi = 4.9349 \frac{\hbar^2}{m}\]

How does this compare to the analytic solution? We need to recall that for particle in a box we have:

\[E_n = \frac{n^2\pi^2\hbar^2}{2ma^2} \]

Plugging in the ground state, \(n=1\), and \(a=1\) (the value used numerically above), we get

\[E_1 = \frac{\pi^2\hbar^2}{2} \approx 4.9348 \frac{\hbar^2}{m} \]
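
A quick numerical comparison of the two values (using the lowest eigenvalue printed above):

import numpy as np

E_var = 4.93487481        # lowest variational eigenvalue from the cell above
E_exact = np.pi**2 / 2    # analytic ground-state energy with a = hbar = m = 1
print((E_var - E_exact) / E_exact)  # relative error, about 1.5e-5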

So we can see that our variational solution worked out well for the energy. Now how about the wavefunction?

# plot the exact and variational wavefunctions
import matplotlib.pyplot as plt
from scipy import integrate
%matplotlib inline
# x values
x = np.arange(0, 1, 0.01)
# exact ground-state wavefunction
psi1Exact = np.sqrt(2)*np.sin(np.pi*x)
plt.plot(x, psi1Exact, 'k-', label="Exact", lw=3)
# variational wavefunction built from the first eigenvector
psi1Var = v[0,0]*x*(1-x) + v[1,0]*x**2*(1-x)**2
# normalize numerically with Simpson's rule (integrate.simps is deprecated)
norm = np.sqrt(integrate.simpson(psi1Var**2, x=x))
# flip the sign: the overall sign of an eigenvector is arbitrary
plt.plot(x, -psi1Var/norm, 'r--', label="Variational")
plt.legend()
[Figure: the exact and normalized variational ground-state wavefunctions plotted on \(0 \leq x \leq 1\).]