The Variational Monte Carlo Method

The variational method

Consider the problem of computing the ground state \(\ket{\psi_\text{GS}}\) of an effective low-energy Hamiltonian \(H\):

\[ \begin{gather} H\ket{\psi_\text{GS}}=E_\text{GS}\ket{\psi_\text{GS}}, \end{gather} \]

where \(\ket{\psi_\text{GS}}=\sum_x\psi_\text{GS}(x)\ket{x}\) and \(\psi_\text{GS}(x)=\braket{x|\psi_\text{GS}}\) is the ground-state wave function.

The energy expectation value \(E_\theta\) of a variational state \(\ket{\psi_\theta}\) can never be lower than the true ground-state energy \(E_\text{GS}\), i.e.

\[ E_\theta=\frac{\braket{\psi_\theta|H|\psi_\theta}}{\braket{\psi_\theta|\psi_\theta}}\geq E_\text{GS}. \]
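As a quick numerical illustration of this bound, the Python sketch below uses a hypothetical random \(4\times4\) Hermitian matrix in place of \(H\) and compares the Rayleigh quotient of an arbitrary trial state with the exact lowest eigenvalue:

```python
import numpy as np

# Hypothetical toy example: a random 4x4 Hermitian matrix plays the role of H.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2               # Hermitian "Hamiltonian"
E_gs = np.linalg.eigvalsh(H)[0]        # exact ground-state energy

# An arbitrary (unnormalized) trial state |psi_theta>.
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
E_var = (psi.conj() @ H @ psi).real / (psi.conj() @ psi).real

print(E_var >= E_gs)                   # True: the variational bound E_theta >= E_GS
```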

变分 Monte Carlo 方法[1]

In a variational Monte Carlo (VMC) calculation, one must estimate \(E_\theta\) as well as the gradient of the energy with respect to the variational parameters, \(\frac{\partial E_\theta}{\partial\theta_k}\), in order to find directions in \(\theta\)-space that lower \(E_\theta\). Rewrite the energy expectation value as

\[ \begin{gather} E_\theta=\frac{\sum_{x,x'}\psi_\theta^*(x)\mathcal{H}_{xx'}\psi_\theta(x')}{\sum_x\psi_\theta^*(x)\psi_\theta(x)}=\sum_xP_\theta(x)E_\theta^\text{loc}(x), \end{gather} \]

where \(P_\theta(x)=\frac{\psi_\theta^*(x)\psi_\theta(x)}{\sum_x\psi_\theta^*(x)\psi_\theta(x)}\) is the probability distribution of the variational state and \(E_\theta^\text{loc}(x)=\frac{\sum_{x'}\psi_\theta^*(x)\mathcal{H}_{xx'}\psi_\theta(x')}{\psi_\theta^*(x)\psi_\theta(x)}=\sum_{x'}\mathcal{H}_{xx'}\frac{\psi_\theta(x')}{\psi_\theta(x)}\) is the local energy.
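The following minimal Python sketch (a hypothetical real symmetric \(\mathcal{H}_{xx'}\) on a small discrete basis) verifies that \(\sum_xP_\theta(x)E_\theta^\text{loc}(x)\) indeed reproduces the Rayleigh quotient:

```python
import numpy as np

# Hypothetical small basis of n configurations x; H is a real symmetric matrix H_{xx'}.
rng = np.random.default_rng(1)
n = 6
A = rng.normal(size=(n, n))
H = (A + A.T) / 2
psi = rng.normal(size=n) + 1j * rng.normal(size=n)    # trial amplitudes psi_theta(x)

P = np.abs(psi) ** 2 / np.sum(np.abs(psi) ** 2)       # P_theta(x)
E_loc = (H @ psi) / psi                               # E_loc(x) = sum_x' H_{xx'} psi(x') / psi(x)

E_direct = (psi.conj() @ H @ psi) / (psi.conj() @ psi)
E_from_loc = np.sum(P * E_loc)
print(np.allclose(E_direct, E_from_loc))              # True
```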

Computing \(\frac{\partial E_\theta}{\partial\theta_k}\)

Now consider the gradient \(\frac{\partial E_\theta}{\partial\theta_k}\). Write \(E_\theta=\frac{N}{D}\), where \(N=\sum_{x,x'}\psi_\theta^*(x)\mathcal{H}_{xx'}\psi_\theta(x')\) and \(D=\sum_x\psi_\theta^*(x)\psi_\theta(x)\); then

\[ \begin{gather} \frac{\partial E_\theta}{\partial\theta_k}=\frac{\partial N/\partial\theta_k}{D}-\frac{N}{D^2}\frac{\partial D}{\partial\theta_k}=\frac{1}{D}\frac{\partial N}{\partial\theta_k}-\frac{E_\theta}{D}\frac{\partial D}{\partial\theta_k}. \end{gather} \]

We therefore need to compute \(\frac{\partial N}{\partial\theta_k}\) and \(\frac{\partial D}{\partial\theta_k}\).

Computing \(\frac{\partial N}{\partial\theta_k}\) and \(\frac{\partial D}{\partial\theta_k}\)

\[ \begin{align} \frac{\partial N}{\partial\theta_k} &= \sum_{x,x'}\left[\frac{\partial\psi_\theta^*(x)}{\partial\theta_k}\mathcal{H}_{xx'}\psi_\theta(x')+\psi_\theta^*(x)\mathcal{H}_{xx'}\frac{\partial\psi_\theta(x')}{\partial\theta_k}\right], \\ \frac{\partial D}{\partial\theta_k} &= \sum_x\left[\frac{\partial\psi_\theta^*(x)}{\partial\theta_k}\psi_\theta(x)+\psi_\theta^*(x)\frac{\partial\psi_\theta(x)}{\partial\theta_k} \right]. \end{align} \]

Introduce the log-derivative

\[ (\mathbf{O}_\theta^\text{loc}(x))_k\equiv\frac{\partial\ln\psi_\theta(x)}{\partial\theta_k}=\frac{1}{\psi_\theta(x)}\frac{\partial\psi_\theta(x)}{\partial\theta_k}, \]

\[ \begin{gather} \frac{\partial\psi_\theta(x)}{\partial\theta_k}=\psi_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k, \\ \frac{\partial\psi_\theta^*(x)}{\partial\theta_k}=\psi_\theta^*(x)(\mathbf{O}_\theta^\text{loc}(x))_k^*, \end{gather} \]

Then

\[ \begin{align} \frac{\partial N}{\partial\theta_k} &= \sum_{x,x'}\psi_\theta^*(x)(\mathbf{O}_\theta^\text{loc}(x))_k^*\mathcal{H}_{xx'}\psi_\theta(x')+\sum_{x,x'}\psi_\theta^*(x)\mathcal{H}_{xx'}\psi_\theta(x')(\mathbf{O}_\theta^\text{loc}(x'))_k, \end{align} \]

\(E_\theta^\text{loc}(x')=\sum_x\mathcal{H}_{x'x}\frac{\psi_\theta(x)}{\psi_\theta(x')}\)\(\mathcal{H}_{xx'}=\mathcal{H}_{x'x}^*\),可得

\[ \begin{gather} \psi_\theta(x')E_\theta^\text{loc}(x')=\sum_x\mathcal{H}_{x'x}\psi_\theta(x)=\left[\sum_x\mathcal{H}_{xx'}\psi_\theta^*(x)\right]^*, \end{gather} \]

\(E_\theta^\text{loc}(x')\) 是实数,故

\[ \begin{gather} \sum_x\mathcal{H}_{xx'}\psi_\theta^*(x)=\psi_\theta^*(x')E_\theta^\text{loc}(x'), \end{gather} \]

Therefore

\[ \begin{align} \frac{\partial N}{\partial\theta_k} &=\sum_x\psi_\theta^*(x)\psi_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k^*E_\theta^\text{loc}(x)+\sum_{x'}\psi_\theta^*(x')\psi_\theta(x')E_\theta^\text{loc}(x')(\mathbf{O}_\theta^\text{loc}(x'))_k \notag \\ &=\sum_x\psi_\theta^*(x)\psi_\theta(x)E_\theta^\text{loc}(x)\left[(\mathbf{O}_\theta^\text{loc}(x))_k^*+(\mathbf{O}_\theta^\text{loc}(x))_k\right] \notag \\ &=2\sum_x\psi_\theta^*(x)\psi_\theta(x)E_\theta^\text{loc}(x)\text{Re}\left[(\mathbf{O}_\theta^\text{loc}(x))_k\right], \end{align} \]

\[ \begin{align} \frac{1}{D}\frac{\partial N}{\partial\theta_k} &= 2\text{Re}\left[\sum_xP_\theta(x)E_\theta^\text{loc}(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right]. \end{align} \]

In addition,

\[ \begin{align} \frac{\partial D}{\partial\theta_k} &= \sum_x\left[\psi_\theta^*(x)(\mathbf{O}_\theta^\text{loc}(x))_k^*\psi_\theta(x)+\psi_\theta^*(x)\psi_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right] \notag \\ &=\sum_x\psi_\theta^*(x)\psi_\theta(x)\left[ (\mathbf{O}_\theta^\text{loc}(x))_k^*+(\mathbf{O}_\theta^\text{loc}(x))_k \right] \notag \\ &=2\sum_x\psi_\theta^*(x)\psi_\theta(x)\text{Re}\left[(\mathbf{O}_\theta^\text{loc}(x))_k\right], \end{align} \]

\[ \begin{align} \frac{E_\theta}{D}\frac{\partial D}{\partial\theta_k} &=2E_\theta\text{Re}\left[\sum_xP_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right]. \end{align} \]
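As a small numerical check of the relation \(\frac{\partial\psi_\theta(x)}{\partial\theta_k}=\psi_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k\) used above, consider a hypothetical log-linear ansatz \(\psi_\theta(x)=\exp\big(\sum_k\theta_kf_k(x)\big)\) (chosen purely for illustration, not the ansatz of the referenced paper), for which \((\mathbf{O}_\theta^\text{loc}(x))_k=f_k(x)\):

```python
import numpy as np

# Hypothetical log-linear ansatz: ln psi_theta(x) = sum_k theta_k * f_k(x),
# so (O_loc(x))_k = d ln psi / d theta_k = f_k(x).
def psi(theta, f_x):
    return np.exp(f_x @ theta)

theta = np.array([0.3, -0.7])
f_x = np.array([1.0, 2.0])         # features f_k(x) of one configuration x
O_loc = f_x                        # analytic log-derivative

# Finite-difference check of d psi / d theta_0 = psi * (O_loc)_0.
eps = 1e-6
d_psi = (psi(theta + np.array([eps, 0.0]), f_x) - psi(theta, f_x)) / eps
print(np.isclose(d_psi, psi(theta, f_x) * O_loc[0]))   # True
```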

Result for the gradient

Putting everything together,

\[ \begin{gather} (\mathbf{g}_\theta)_k\equiv\frac{\partial E_\theta}{\partial\theta_k}=2\text{Re}\left[\sum_xP_\theta(x)E_\theta^\text{loc}(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right]-2E_\theta\text{Re}\left[\sum_xP_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right], \end{gather} \]

\[ \begin{gather} \mathbf{g}_\theta=2\text{Re}\left[\sum_xP_\theta(x)E_\theta^\text{loc}(x)\mathbf{O}_\theta^\text{loc}(x)\right]-2E_\theta\text{Re}\left[\sum_xP_\theta(x)\mathbf{O}_\theta^\text{loc}(x)\right]. \end{gather} \]

Recalling that \(\sum_xP_\theta(x)=1\), we see that \(E_\theta\) and \(\mathbf{g}_\theta\) can be computed from Monte Carlo averages of \(E_\theta^\text{loc}(x)\), \(\mathbf{O}_\theta^\text{loc}(x)\), and \(E_\theta^\text{loc}(x)\,\mathbf{O}_\theta^\text{loc}(x)\), where the samples are drawn with weights proportional to \(|\psi_\theta(x)|^2\).
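The sketch below puts these estimators together on a toy model. All choices here are illustrative assumptions rather than the setup of the referenced paper: a small discrete configuration space, a real positive log-linear ansatz (so that \((\mathbf{O}_\theta^\text{loc}(x))_k=f_k(x)\) and the \(\text{Re}[\cdot]\) is trivial), and exact sampling from \(P_\theta(x)\) instead of Markov-chain Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 8, 3
A = rng.normal(size=(n, n))
H = (A + A.T) / 2                         # H_{xx'} on a toy n-dimensional basis
F = rng.normal(size=(n, K))               # features f_k(x) of the log-linear ansatz
theta = rng.normal(size=K)

def exact_energy(theta):
    psi = np.exp(F @ theta)               # psi_theta(x), real and positive here
    return psi @ H @ psi / (psi @ psi)

psi = np.exp(F @ theta)
P = psi ** 2 / np.sum(psi ** 2)           # P_theta(x) ~ |psi_theta(x)|^2
E_loc = (H @ psi) / psi                   # local energies E_loc(x)
O = F                                     # log-derivatives (O_loc(x))_k = f_k(x)

samples = rng.choice(n, size=100_000, p=P)            # x ~ P_theta (exact sampling)
E_mc = E_loc[samples].mean()
g_mc = 2 * (E_loc[samples, None] * O[samples]).mean(axis=0) \
       - 2 * E_mc * O[samples].mean(axis=0)

# Compare with the exact energy and a finite-difference gradient of E_theta.
eps = 1e-6
g_fd = np.array([(exact_energy(theta + eps * np.eye(K)[k]) - exact_energy(theta)) / eps
                 for k in range(K)])
print(E_mc, exact_energy(theta))
print(g_mc)
print(g_fd)
```

In a realistic VMC calculation the exact sampling step would be replaced by a Markov chain (e.g. Metropolis updates) whose stationary distribution is \(P_\theta(x)\), but the estimators for \(E_\theta\) and \(\mathbf{g}_\theta\) are built from the same sample averages.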

References

  1. Nomura Y, Imada M. Quantum many-body solver using artificial neural networks and its applications to strongly correlated electron systems. Journal of the Physical Society of Japan, 2025, 94(3): 031001.
