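Variational Monte Carlo Method
Last updated: Tuesday, 2 September 2025, 01:59
The Variational Method
Consider the problem of computing the ground state \(\ket{\psi_\text{GS}}\) of an effective low-energy Hamiltonian \(H\),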
\[ \begin{gather} H\ket{\psi_\text{GS}}=E_\text{GS}\ket{\psi_\text{GS}}, \end{gather} \]
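where \(\ket{\psi_\text{GS}}=\sum_x\psi_\text{GS}(x)\ket{x}\) and \(\psi_\text{GS}(x)=\braket{x|\psi_\text{GS}}\) is the ground-state wavefunction in the basis \(\{\ket{x}\}\).
By the variational principle, the energy expectation value \(E_\theta\) of a variational state \(\ket{\psi_\theta}\) with parameters \(\theta\) cannot be lower than the true ground-state energy \(E_\text{GS}\), that is,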
\[ E_\theta=\frac{\braket{\psi_\theta|H|\psi_\theta}}{\braket{\psi_\theta|\psi_\theta}}\geq E_\text{GS}. \]
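The bound can be illustrated on a toy problem. The following is a minimal sketch, assuming a hypothetical real symmetric \(2\times2\) Hamiltonian and the one-parameter trial state \(\psi_\theta=(\cos\theta,\sin\theta)\); neither the matrix entries nor the ansatz come from the text. It scans \(\theta\) on a grid and compares the energies with the closed-form lowest eigenvalue.

```python
import math

# Hypothetical 2x2 real symmetric Hamiltonian [[a, b], [b, d]] (illustrative values).
a, b, d = 1.0, -0.5, 2.0

# Closed-form lowest eigenvalue of a 2x2 real symmetric matrix.
e_gs = 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b ** 2)


def energy(theta):
    """Rayleigh quotient for the normalized trial state (cos(theta), sin(theta))."""
    c, s = math.cos(theta), math.sin(theta)
    return a * c * c + d * s * s + 2.0 * b * c * s


# Scan the variational parameter over [0, pi) on a uniform grid.
thetas = [math.pi * i / 1000 for i in range(1000)]
energies = [energy(t) for t in thetas]
e_min = min(energies)

print(f"E_GS = {e_gs:.6f}, lowest trial energy on the grid = {e_min:.6f}")
```

Because this particular ansatz can represent every real normalized two-component state, the grid minimum approaches \(E_\text{GS}\); a more restricted ansatz would in general stay strictly above it.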
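The Variational Monte Carlo Method[1]
In a variational Monte Carlo (VMC) calculation, one must estimate \(E_\theta\) together with its gradient with respect to the variational parameters, \(\frac{\partial E_\theta}{\partial\theta_k}\), in order to find a direction in \(\theta\) space that lowers \(E_\theta\). Rewrite the energy expectation value as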
\[ \begin{gather} E_\theta=\frac{\sum_{x,x'}\psi_\theta^*(x)\mathcal{H}_{xx'}\psi_\theta(x')}{\sum_x\psi_\theta^*(x)\psi_\theta(x)}=\sum_xP_\theta(x)E_\theta^\text{loc}(x), \end{gather} \]
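where \(\mathcal{H}_{xx'}=\braket{x|H|x'}\) are the matrix elements of the Hamiltonian, \(P_\theta(x)=\frac{\psi_\theta^*(x)\psi_\theta(x)}{\sum_{x'}\psi_\theta^*(x')\psi_\theta(x')}\) is the probability distribution of the variational state, and \(E_\theta^\text{loc}(x)=\frac{\sum_{x'}\psi_\theta^*(x)\mathcal{H}_{xx'}\psi_\theta(x')}{\psi_\theta^*(x)\psi_\theta(x)}=\sum_{x'}\mathcal{H}_{xx'}\frac{\psi_\theta(x')}{\psi_\theta(x)}\) is the local energy.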
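The rewriting of \(E_\theta\) as a \(P_\theta\)-weighted average of local energies can be checked by brute force on a small basis. The sketch below assumes a hypothetical three-state real symmetric Hamiltonian and an unnormalized real trial wavefunction, both chosen only for illustration, and evaluates the two expressions by exact summation.

```python
# Hypothetical 3-state example: real symmetric Hamiltonian matrix and an
# unnormalized real trial wavefunction (values chosen only for illustration).
H = [[0.0, -1.0, 0.0],
     [-1.0, 0.5, -1.0],
     [0.0, -1.0, 1.0]]
psi = [1.0, 0.8, 0.3]
n = len(psi)


def local_energy(x):
    """E_loc(x) = sum_{x'} H[x][x'] * psi(x') / psi(x)."""
    return sum(H[x][xp] * psi[xp] for xp in range(n)) / psi[x]


# Probability distribution P(x) = |psi(x)|^2 / sum_x |psi(x)|^2.
norm = sum(p * p for p in psi)
P = [p * p / norm for p in psi]

# Energy as a ratio of the two sums (psi is real, so psi* = psi) ...
e_quotient = sum(psi[x] * H[x][xp] * psi[xp]
                 for x in range(n) for xp in range(n)) / norm
# ... and as a P-weighted average of the local energy.
e_average = sum(P[x] * local_energy(x) for x in range(n))

print(e_quotient, e_average)
```

The local energy is only defined where \(\psi_\theta(x)\neq0\), which is why the illustrative wavefunction has no vanishing amplitudes.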
Computing \(\frac{\partial E_\theta}{\partial\theta_k}\)
Now consider the gradient \(\frac{\partial E_\theta}{\partial\theta_k}\). Write \(E_\theta=\frac{N}{D}\), where \(N=\sum_{x,x'}\psi_\theta^*(x)\mathcal{H}_{xx'}\psi_\theta(x')\) and \(D=\sum_x\psi_\theta^*(x)\psi_\theta(x)\). The quotient rule then gives
\[ \begin{gather} \frac{\partial E_\theta}{\partial\theta_k}=\frac{\partial N/\partial\theta_k}{D}-\frac{N}{D^2}\frac{\partial D}{\partial\theta_k}=\frac{1}{D}\frac{\partial N}{\partial\theta_k}-\frac{E_\theta}{D}\frac{\partial D}{\partial\theta_k}. \end{gather} \]
It remains to compute \(\frac{\partial N}{\partial\theta_k}\) and \(\frac{\partial D}{\partial\theta_k}\).
Computing \(\frac{\partial N}{\partial\theta_k}\) and \(\frac{\partial D}{\partial\theta_k}\)
\[ \begin{align} \frac{\partial N}{\partial\theta_k} &= \sum_{x,x'}\left[\frac{\partial\psi_\theta^*(x)}{\partial\theta_k}\mathcal{H}_{xx'}\psi_\theta(x')+\psi_\theta^*(x)\mathcal{H}_{xx'}\frac{\partial\psi_\theta(x')}{\partial\theta_k}\right], \\ \frac{\partial D}{\partial\theta_k} &= \sum_x\left[\frac{\partial\psi_\theta^*(x)}{\partial\theta_k}\psi_\theta(x)+\psi_\theta^*(x)\frac{\partial\psi_\theta(x)}{\partial\theta_k} \right]. \end{align} \]
Introduce the log-derivative
\[ (\mathbf{O}_\theta^\text{loc}(x))_k\equiv\frac{\partial\ln\psi_\theta(x)}{\partial\theta_k}=\frac{1}{\psi_\theta(x)}\frac{\partial\psi_\theta(x)}{\partial\theta_k}, \]
so that, for real parameters \(\theta_k\),
\[ \begin{gather} \frac{\partial\psi_\theta(x)}{\partial\theta_k}=\psi_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k, \\ \frac{\partial\psi_\theta^*(x)}{\partial\theta_k}=\psi_\theta^*(x)(\mathbf{O}_\theta^\text{loc}(x))_k^*, \end{gather} \]
and therefore
\[ \begin{align} \frac{\partial N}{\partial\theta_k} &= \sum_{x,x'}\psi_\theta^*(x)(\mathbf{O}_\theta^\text{loc}(x))_k^*\mathcal{H}_{xx'}\psi_\theta(x')+\sum_{x,x'}\psi_\theta^*(x)\mathcal{H}_{xx'}\psi_\theta(x')(\mathbf{O}_\theta^\text{loc}(x'))_k, \end{align} \]
From \(E_\theta^\text{loc}(x')=\sum_x\mathcal{H}_{x'x}\frac{\psi_\theta(x)}{\psi_\theta(x')}\) and the Hermiticity condition \(\mathcal{H}_{xx'}=\mathcal{H}_{x'x}^*\), we obtain
\[ \begin{gather} \psi_\theta(x')E_\theta^\text{loc}(x')=\sum_x\mathcal{H}_{x'x}\psi_\theta(x)=\left[\sum_x\mathcal{H}_{xx'}\psi_\theta^*(x)\right]^*, \end{gather} \]
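Assume in addition that \(E_\theta^\text{loc}(x')\) is real. This holds, for example, when \(\mathcal{H}_{xx'}\) and \(\psi_\theta\) are both real; for a general complex \(\psi_\theta\) the right-hand side below carries \(\left(E_\theta^\text{loc}(x')\right)^*\) instead. Then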
\[ \begin{gather} \sum_x\mathcal{H}_{xx'}\psi_\theta^*(x)=\psi_\theta^*(x')E_\theta^\text{loc}(x'), \end{gather} \]
Hence
\[ \begin{align} \frac{\partial N}{\partial\theta_k} &=\sum_x\psi_\theta^*(x)\psi_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k^*E_\theta^\text{loc}(x)+\sum_{x'}\psi_\theta^*(x')\psi_\theta(x')E_\theta^\text{loc}(x')(\mathbf{O}_\theta^\text{loc}(x'))_k \notag \\ &=\sum_x\psi_\theta^*(x)\psi_\theta(x)E_\theta^\text{loc}(x)\left[(\mathbf{O}_\theta^\text{loc}(x))_k^*+(\mathbf{O}_\theta^\text{loc}(x))_k\right] \notag \\ &=2\sum_x\psi_\theta^*(x)\psi_\theta(x)E_\theta^\text{loc}(x)\text{Re}\left[(\mathbf{O}_\theta^\text{loc}(x))_k\right], \end{align} \]
and therefore
\[ \begin{align} \frac{1}{D}\frac{\partial N}{\partial\theta_k} &= 2\text{Re}\left[\sum_xP_\theta(x)E_\theta^\text{loc}(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right]. \end{align} \]
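The Hermiticity relation used in this derivation can be checked numerically. The sketch below assumes a hypothetical three-state Hermitian Hamiltonian and a complex trial wavefunction (illustrative values, not from the text). It keeps the complex conjugate on the local energy, \(\sum_x\mathcal{H}_{xx'}\psi_\theta^*(x)=\psi_\theta^*(x')\left(E_\theta^\text{loc}(x')\right)^*\), which reduces to the form used above whenever \(E_\theta^\text{loc}\) is real.

```python
# Hypothetical 3-state Hermitian Hamiltonian and complex trial wavefunction
# (illustrative values only).
H = [[0.0, -1.0 + 0.5j, 0.0],
     [-1.0 - 0.5j, 0.5, -1.0j],
     [0.0, 1.0j, 1.0]]
psi = [1.0 + 0.2j, 0.8 - 0.1j, 0.3 + 0.4j]
n = len(psi)


def local_energy(x):
    """E_loc(x) = sum_{x'} H[x][x'] * psi(x') / psi(x); complex in general."""
    return sum(H[x][xp] * psi[xp] for xp in range(n)) / psi[x]


# Compare sum_x H[x][x'] psi*(x) with psi*(x') * conj(E_loc(x')) for every x'.
residuals = []
for xp in range(n):
    left = sum(H[x][xp] * psi[x].conjugate() for x in range(n))
    right = psi[xp].conjugate() * local_energy(xp).conjugate()
    residuals.append(abs(left - right))

# Imaginary parts of the local energy, which need not vanish for complex psi.
imag_parts = [local_energy(x).imag for x in range(n)]

print(max(residuals), imag_parts)
```

For this complex wavefunction the local energy has a nonzero imaginary part, so the conjugate cannot be dropped; for real \(\mathcal{H}_{xx'}\) and real \(\psi_\theta\) the imaginary parts vanish and the two forms coincide.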
In addition,
\[ \begin{align} \frac{\partial D}{\partial\theta_k} &= \sum_x\left[\psi_\theta^*(x)(\mathbf{O}_\theta^\text{loc}(x))_k^*\psi_\theta(x)+\psi_\theta^*(x)\psi_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right] \notag \\ &=\sum_x\psi_\theta^*(x)\psi_\theta(x)\left[ (\mathbf{O}_\theta^\text{loc}(x))_k^*+(\mathbf{O}_\theta^\text{loc}(x))_k \right] \notag \\ &=2\sum_x\psi_\theta^*(x)\psi_\theta(x)\text{Re}\left[(\mathbf{O}_\theta^\text{loc}(x))_k\right], \end{align} \]
so that
\[ \begin{align} \frac{E_\theta}{D}\frac{\partial D}{\partial\theta_k} &=2E_\theta\text{Re}\left[\sum_xP_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right]. \end{align} \]
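To make the log-derivative concrete, the sketch below assumes a hypothetical exponential ansatz \(\psi_\theta(x)=\exp\left(\sum_k\theta_kf_k(x)\right)\) with made-up real features \(f_k(x)\), for which \((\mathbf{O}_\theta^\text{loc}(x))_k=f_k(x)\). It compares the expression for \(\frac{\partial D}{\partial\theta_k}\) derived above with a central finite difference of \(D\).

```python
import math

# Hypothetical features f_k(x) for 4 basis states and 2 real parameters.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
n, n_params = len(features), 2


def psi(x, theta):
    """Illustrative ansatz psi_theta(x) = exp(sum_k theta_k * f_k(x))."""
    return math.exp(sum(t * f for t, f in zip(theta, features[x])))


def norm(theta):
    """D = sum_x |psi_theta(x)|^2 (psi is real and positive here)."""
    return sum(psi(x, theta) ** 2 for x in range(n))


def dnorm_analytic(theta, k):
    """dD/dtheta_k = 2 sum_x |psi|^2 Re[O_k(x)], with O_k(x) = f_k(x)."""
    return 2.0 * sum(psi(x, theta) ** 2 * features[x][k] for x in range(n))


def shifted(theta, k, delta):
    """Copy of theta with component k shifted by delta."""
    out = list(theta)
    out[k] += delta
    return out


def dnorm_fd(theta, k, eps=1e-6):
    """Central finite difference of D, used only as a cross-check."""
    return (norm(shifted(theta, k, eps)) - norm(shifted(theta, k, -eps))) / (2.0 * eps)


theta = [0.3, -0.2]
errors = [abs(dnorm_analytic(theta, k) - dnorm_fd(theta, k)) for k in range(n_params)]
print(errors)
```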
Result for the Gradient
Combining the two pieces,
\[ \begin{gather} (\mathbf{g}_\theta)_k\equiv\frac{\partial E_\theta}{\partial\theta_k}=2\text{Re}\left[\sum_xP_\theta(x)E_\theta^\text{loc}(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right]-2E_\theta\text{Re}\left[\sum_xP_\theta(x)(\mathbf{O}_\theta^\text{loc}(x))_k\right], \end{gather} \]
or, in vector form,
\[ \begin{gather} \mathbf{g}_\theta=2\text{Re}\left[\sum_xP_\theta(x)E_\theta^\text{loc}(x)\mathbf{O}_\theta^\text{loc}(x)\right]-2E_\theta\text{Re}\left[\sum_xP_\theta(x)\mathbf{O}_\theta^\text{loc}(x)\right]. \end{gather} \]
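The gradient formula can be cross-checked against finite differences of \(E_\theta\). The following is a minimal sketch under stated assumptions: a hypothetical three-state real symmetric Hamiltonian and the illustrative real ansatz \(\psi_\theta(x)=\exp\left(\sum_k\theta_kf_k(x)\right)\) with made-up features, so that \(\mathbf{O}_\theta^\text{loc}(x)\) is real and the local energy is real as well. All sums are evaluated exactly rather than sampled.

```python
import math

# Hypothetical setup: 3 basis states, real symmetric H, and the illustrative
# ansatz psi_theta(x) = exp(sum_k theta_k * f_k(x)), so that O_k(x) = f_k(x).
H = [[0.0, -1.0, 0.0],
     [-1.0, 0.5, -1.0],
     [0.0, -1.0, 1.0]]
features = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
n, n_params = len(H), 2


def wavefunction(theta):
    return [math.exp(sum(t * f for t, f in zip(theta, features[x])))
            for x in range(n)]


def energy_and_gradient(theta):
    """Return E_theta and g_theta from exact sums over all basis states."""
    psi = wavefunction(theta)
    norm = sum(p * p for p in psi)
    P = [p * p / norm for p in psi]
    e_loc = [sum(H[x][xp] * psi[xp] for xp in range(n)) / psi[x]
             for x in range(n)]
    energy = sum(P[x] * e_loc[x] for x in range(n))
    grad = []
    for k in range(n_params):
        mean_eo = sum(P[x] * e_loc[x] * features[x][k] for x in range(n))
        mean_o = sum(P[x] * features[x][k] for x in range(n))
        # Everything is real here, so taking the real part is a no-op.
        grad.append(2.0 * mean_eo - 2.0 * energy * mean_o)
    return energy, grad


def shifted(theta, k, delta):
    """Copy of theta with component k shifted by delta."""
    out = list(theta)
    out[k] += delta
    return out


theta = [0.2, -0.4]
energy, grad = energy_and_gradient(theta)

# Central finite differences of E_theta as an independent cross-check.
eps = 1e-6
grad_fd = [(energy_and_gradient(shifted(theta, k, eps))[0]
            - energy_and_gradient(shifted(theta, k, -eps))[0]) / (2.0 * eps)
           for k in range(n_params)]

print(grad, grad_fd)
```

A gradient-descent step would then update \(\theta\leftarrow\theta-\eta\,\mathbf{g}_\theta\) for some small step size \(\eta\); the step size and update rule are not specified in the text and are mentioned here only as the simplest possible choice.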
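Recalling that \(\sum_xP_\theta(x)=1\), we see that \(E_\theta\) and \(\mathbf{g}_\theta\) can be computed from the Monte Carlo averages of \(E_\theta^\text{loc}(x)\), \(\mathbf{O}_\theta^\text{loc}(x)\), and \(E_\theta^\text{loc}(x)\mathbf{O}_\theta^\text{loc}(x)\), where the configurations \(x\) are sampled with weights proportional to \(|\psi_\theta(x)|^2\).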
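The sampling procedure can be sketched with a plain Metropolis random walk. Everything in the example below is an assumption made for illustration: a single particle hopping on a ring of `L` sites with a made-up potential `V`, the one-parameter ansatz \(\psi_\theta(x)=e^{-\theta V(x)}\) (so that \(O_\theta^\text{loc}(x)=-V(x)\)), and a nearest-neighbour proposal. The Monte Carlo estimates of \(E_\theta\) and \(g_\theta\) are compared with exact sums, which are affordable only because the basis is tiny.

```python
import math
import random

# Hypothetical model: one particle on a ring of L sites with hopping -1 between
# neighbours and an on-site potential V(x). Illustrative ansatz:
# psi_theta(x) = exp(-theta * V(x)), hence O(x) = d ln psi / d theta = -V(x).
L = 8
V = [0.5 * min(x, L - x) for x in range(L)]


def psi(x, theta):
    return math.exp(-theta * V[x])


def local_energy(x, theta):
    """E_loc(x) = V(x) - [psi(x-1) + psi(x+1)] / psi(x) on the ring."""
    left, right = (x - 1) % L, (x + 1) % L
    return V[x] - (psi(left, theta) + psi(right, theta)) / psi(x, theta)


def vmc_estimate(theta, n_samples=200_000, n_burn=1_000, seed=0):
    """Metropolis sampling of |psi|^2; returns (E, g) from sample means."""
    rng = random.Random(seed)
    x = 0
    sum_e = sum_o = sum_eo = 0.0
    for step in range(n_burn + n_samples):
        x_new = (x + rng.choice((-1, 1))) % L  # symmetric proposal
        ratio = (psi(x_new, theta) / psi(x, theta)) ** 2
        if rng.random() < ratio:
            x = x_new
        if step >= n_burn:
            e, o = local_energy(x, theta), -V[x]
            sum_e += e
            sum_o += o
            sum_eo += e * o
    e_mean = sum_e / n_samples
    o_mean = sum_o / n_samples
    eo_mean = sum_eo / n_samples
    return e_mean, 2.0 * eo_mean - 2.0 * e_mean * o_mean


def exact(theta):
    """Same quantities from exact sums over all L sites, for comparison."""
    w = [psi(x, theta) ** 2 for x in range(L)]
    z = sum(w)
    e = sum(w[x] * local_energy(x, theta) for x in range(L)) / z
    o = sum(w[x] * -V[x] for x in range(L)) / z
    eo = sum(w[x] * local_energy(x, theta) * -V[x] for x in range(L)) / z
    return e, 2.0 * eo - 2.0 * e * o


e_mc, g_mc = vmc_estimate(0.5)
e_ex, g_ex = exact(0.5)
print(f"E: MC {e_mc:.4f} vs exact {e_ex:.4f}; g: MC {g_mc:.4f} vs exact {g_ex:.4f}")
```

Only ratios of wavefunction amplitudes enter the acceptance step and the local energy, so the normalization \(D\) is never needed. The sketch uses every step of the chain and ignores autocorrelation, so it gives no error bars; a production calculation would typically thin or block the samples.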
References
- Nomura Y, Imada M. Quantum many-body solver using artificial neural networks and its applications to strongly correlated electron systems. Journal of the Physical Society of Japan, 2025, 94(3): 031001.