English Pages 329 Year 2002
Recent Advances in Computational Chemistry - Vol. 2
RECENT ADVANCES IN QUANTUM MONTE CARLO METHODS Part II
edited by
William A Lester Jr. Stuart M Rothstein Shigenori Tanaka World Scientific
Recent Advances in Computational Chemistry Editor-in-Charge Delano P. Chong, Department of Chemistry, University of British Columbia, Canada
Published
Recent Advances in Density Functional Methods, Part I (Volume 1) ed. D. P. Chong
Recent Advances in Density Functional Methods, Part II (Volume 1) ed. D. P. Chong
Recent Advances in Density Functional Methods, Part III (Volume 1) eds. V. Barone, A. Bencini and P. Fantucci
Recent Advances in Quantum Monte Carlo Methods, Part I (Volume 2) ed. W. A. Lester
Recent Advances in Coupled-Cluster Methods (Volume 3) ed. Rodney J. Bartlett
edited by
William A Lester, Jr University of California, Berkeley, USA
Stuart M Rothstein Brock University, Canada
Shigenori Tanaka Toshiba Corporation, Japan
World Scientific New Jersey • London • Singapore • Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd. P O Box 128, Farrer Road, Singapore 912805 USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
RECENT ADVANCES IN QUANTUM MONTE CARLO METHODS — PART II Copyright © 2002 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-02-4945-4
Printed in Singapore by Uto-Print
PREFACE
The quantum Monte Carlo method is a rigorous stochastic approach for solving the Schroedinger equation. Although its roots may be traced to Fermi, who recognized the similarity of the Schroedinger equation to a diffusion equation, it was the advent of high-speed parallel computers that made possible its practical implementation. The first significant modern developments may be attributed to Kalos in the 1960s and to Anderson in the mid-1970s. As outlined in Kalos' introduction to this volume and further reflected by the chapters contained herein, the field has evolved rapidly, not only in "chemistry" with such problems as the structure of excited states of isolated systems and clusters, but routinely now for several important problems in condensed matter physics.

This volume consists of chapters by participants in the Advances in Quantum Monte Carlo symposium at the Pacifichem meeting held in Honolulu in December 2000. The chapters go beyond what was presented there, with some resulting from collaborations and exchanges of ideas facilitated by the meeting. They truly represent an accurate "snapshot" of the state of the art in quantum Monte Carlo methods in the chemical context as of mid-2001.

The first four chapters address some of the methodological challenges alluded to in Kalos' introduction. In Chapter 1 Bressanini, Ceperley, and Reynolds address the question: "what do we know about wave function nodes?" The significance of this question rests on the fact that the fixed-node approach to quantum Monte Carlo in many cases is the best approach to solving a given problem, but at the price of a systematic, so-called "fixed-node" error. To somehow optimize the nodes would ameliorate that error. The authors' strong nodal conjecture suggests that the nodes may be somewhat less complicated than previously believed. Chapter 2 is devoted to the challenge of estimating small energy differences in the context of intermolecular forces.
Filippi and Umrigar present a correlated sampling method to efficiently calculate numerical forces and potential energy surfaces in diffusion Monte Carlo. Results are presented from all-electron and pseudopotential calculations of homonuclear first row diatomic molecules and Si2. To extend the range of application of quantum Monte Carlo to systems which contain a large number of atoms is a long-standing fundamental problem. In Chapter 3 Luechow and Manten present techniques to improve the scaling of the local energy evaluation by using localized orbitals and short range correlation factors. They achieve an improvement of an order of magnitude, which promises applications of quantum Monte Carlo to systems with more electrons than was previously thought possible. Chapter 4 by Asai addresses the issue of the so-called "fermion-sign problem", which lies at the core of solving the Schroedinger equation for fermions without recourse to the fixed-node approximation. He introduces a new approximation to alleviate the sign problem, with encouraging results for the test case given.

The following three chapters are concerned with applications to estimate properties other than the energy. Chapter 5 by Alexander and Coldwell revisits the beryllium atom. After exploring several trial function forms employed in variational calculations, they apply their best one to compute a number of physical properties of the atom, including moments of $r_i$ and $r_{ij}$, delta functions, relativistic corrections to the energy, the form factor and total x-ray scattering cross section. In Chapter 6 Hornik and Rothstein apply quantum Monte Carlo to estimate the static electrical properties of H and He to sixth order in the electric field. They present in detail the formulas to do this without invoking the commonly-used finite field approximation. Although these simple systems have been accurately treated previously by using non-Monte Carlo methods, this paper is useful in that it reveals how the statistical error (the major bottleneck in quantum Monte Carlo simulations) increases with the order of perturbation. Chapter 7 is concerned with vibrational properties and with quantum dynamics of molecules. Tanaka presents a new ab initio computational scheme based on variational quantum Monte Carlo to treat electron correlation effects in molecular quantum dynamics and the nuclear quantum effect in the framework of the path-integral centroid molecular dynamics approach. He applies this methodology to describe the C-H stretching motion of benzene.

The following three chapters are concerned with excited states.
In Chapter 8 Huang, Viel, and Whaley describe and analyze new implementations of the so-called "POITSE method" to compute a correlation function in imaginary time from which they extract information about excitation energies. They show how to significantly reduce the statistical noise associated with this scheme by incorporating a branching process into the Monte Carlo simulation. In the following chapter Nightingale and Melik-Alaverdian introduce a new scheme to optimize excited state trial wavefunctions, applied to ground and vibrationally excited states of bosonic van der Waals clusters containing up to seven particles. In Chapter 10 Needs, Porter and Towler are concerned with quantum Monte Carlo calculations of excited electronic states of sodium dimer and hydrogenated silicon clusters, with up to ten silicon atoms.

The following two chapters concern applications of quantum Monte Carlo to large systems and clusters. Chapter 11 by Lester and Grossman exploits density functional theory to obtain geometries followed by diffusion quantum Monte Carlo simulations at those fixed geometries to study the reactions of large molecules with oxygen, and other combustion systems. In the following chapter Flad, Schautz, Wang, and Dolg use quantum Monte Carlo methods in combination with relativistic pseudopotentials and polarization potentials to accurately obtain properties of small- and medium-sized mercury clusters, up to Hg13. They also give a perspective of how adsorption of small molecules on cluster surfaces can be treated by quantum Monte Carlo methods.

The book concludes with four chapters on applications to condensed matter. Gori-Giorgi, Federico, and Bachelet in Chapter 13 provide an extensive review of important model solids which form a foundation for our understanding of real systems: the two-dimensional Hubbard hamiltonian and the three-dimensional homogeneous electron gas, or jellium. In Chapter 14 Dewing and Ceperley for the first time employ classical Monte Carlo on the ionic degrees of freedom using energies calculated from quantum Monte Carlo simulations of the electrons and apply this scheme to liquid H2. Their chapter delves into several important technical issues, including bias in Metropolis sampling, correlated sampling, and wave function optimization. Nagaoka and Suenobu in the following chapter treat condensed phases by employing a path integral method which combines quantum transition state theory with a stochastic quantization method. In addition the authors address important issues related to the extension of this approach to mixed quantum/classical systems. The final chapter by Baer and Neuhauser is a review of the shifted contour auxiliary field Monte Carlo method as well as a novel method with applications to molecular electronic structure, including excited states, and to large Hubbard lattices of strongly correlated electrons. This approach to the fermion-sign problem is a promising emerging development of the quantum Monte Carlo method.
These chapters were carefully refereed by experts in theoretical and computational chemistry and/or physics. We and the authors thank them for their helpful suggestions. We thank James Anderson for his having co-organized the symposium. And finally, partial support for the travel of some of the invited speakers was provided by grants from the Donors of the Petroleum Research Fund, administered by the American Chemical Society, IBM, Silicon Graphics Inc., and the Toshiba Corporation.

William A. Lester, Jr., Berkeley, California, U.S.A.
Stuart M. Rothstein, St. Catharines, Ontario, CANADA
Shigenori Tanaka, Kawasaki, Kanagawa, JAPAN
INTRODUCTION
M.H. Kalos
Lawrence Livermore National Laboratory, Livermore CA 94551, USA
Quantum Monte Carlo has come a long way and is in an interesting state. After some fascinating essays by Fermi and his collaborators, it was reborn in the 1960's and embarked upon a slow and steady evolution. In the 1980's there was a burst of enthusiasm, which may have been somewhat premature, for its promise in quantum chemistry, but it has since become a powerful tool in theoretical physics and chemistry, in spite of the fact that not all of the associated algorithmic problems have been solved. This coming of age has been marked by the series of symposia on Quantum Monte Carlo organized in connection with the Pacifichem International conferences held every five years in Honolulu. The present volume is devoted to papers derived from presentations at the last, Pacifichem 2000.

Although this meeting was, quite properly, dedicated primarily to applications in chemistry, I thought it useful to mention the wide successes of Quantum Monte Carlo in other fields, in many of which it has become the computational method of choice. They include particle physics, nuclear structure physics, condensed matter physics, including both lattice models of electrons (e.g., Hubbard models) and low-temperature physics.

The application to electronic structure, especially of atomic and molecular systems, is in many ways the most exigent. We approach the problem as an explicitly many-body problem, without attempting to reduce it to three dimensions. That this is both possible and essential still seems to be news to some distinguished scientists. The need for "chemical accuracy," typically a relative error of one part in $10^5$ to $10^6$, has seemed to many researchers an impossible dream for any stochastic method. We now understand that it can be done, and quite routinely. Another unique challenge is the enormous range of energy scales that are involved in a large molecule, scales that must be encompassed accurately to obtain useful answers. Again, this challenge has been met and conquered.
Even with its current limitations, QMC is demonstrably the most reliable predictive tool for molecular structure up to about 50 atoms.

The symposium presented a stimulating mix of exciting new ideas and challenges, new and important results of methods now standard in our field, along with continued attention to some pervasive long-standing difficulties, like correlation methods for energy differences, that still impede progress.

We are, I believe, at a crucial time. Interest in and understanding of QMC are growing rapidly, along with the number of researchers now making vital intellectual contributions. Unlike the early days, the computational power needed for path-making work now exists almost everywhere, and our standard algorithmic approaches make an ideal fit with current parallel supercomputers. I believe also that we have the essential ideas necessary to solve our remaining methodological challenges: the "Fermion sign problem;" the issue of small perturbations; the question of estimating the square of the wave function without using "extrapolation" or some variant of "forward walking;" the problem of large-Z slowing down; the treatment of excited states; and a natural, non-perturbative method for relativistic effects.

Our mandate, then, is to make progress in the algorithmic art and to continue to demonstrate the power and reach of our methods, with the objective of advancing the frontiers of chemistry and physics in the broadest and most rigorous way.
CONTENTS

Preface
Introduction
    M. H. Kalos

Theory/Algorithm Development
1. What do We Know About Wave Function Nodes?
    D. Bressanini, D.M. Ceperley, and P.J. Reynolds
2. Interatomic Forces and Correlated Sampling in Quantum Monte Carlo
    C. Filippi and C.J. Umrigar
3. Improved Scaling in Diffusion Quantum Monte Carlo with Localized Molecular Orbitals
    A. Lüchow and S. Manten
4. A Remedy for the Negative Sign Problem in the Auxiliary Field Quantum Monte Carlo Method
    Y. Asai

Properties of Ground State Atoms and Molecules
5. The Beryllium Atom Revisited
    S.A. Alexander and R.L. Coldwell
6. Quantum Monte Carlo Study of the Static Electrical Properties of H and He
    M. Hornik and S.M. Rothstein
7. Ab initio Approach to Vibrational Properties and Quantum Dynamics of Molecules
    S. Tanaka

Excited Electronic States
8. Efficient Implementation of the Projection Operator Imaginary Time Spectral Evolution (POITSE) Method for Excited States
    P. Huang, A. Viel and K.B. Whaley
9. Trial Function Optimization for Excited States of van der Waals Clusters
    M.P. Nightingale and V. Melik-Alaverdian
10. Quantum Monte Carlo Calculations for Excited Electronic States
    R.J. Needs, A.R. Porter and M.D. Towler

Large Systems and Clusters
11. Quantum Monte Carlo for the Electronic Structure of Combustion Systems
    W.A. Lester, Jr. and J.C. Grossman
12. Quantum Monte Carlo Study of Mercury Clusters
    H.-J. Flad, F. Schautz, Y.-X. Wang and M. Dolg

Condensed Matter
13. Quantum Monte Carlo for Realistic and Model Solids
    P. Gori-Giorgi, A. Federico and G.B. Bachelet
14. Methods for Coupled Electronic-ionic Monte Carlo
    M. Dewing and D. Ceperley
15. Toward Quantum Chemodynamics in Condensed Phase via Stochastic Quantization Method
    M. Nagaoka and K. Suenobu
16. Shifted Contour Auxiliary Field Monte Carlo
    R. Baer and D. Neuhauser

Subject Index
Theory/Algorithm Development
WHAT DO WE KNOW ABOUT WAVE FUNCTION NODES?

DARIO BRESSANINI
Dipartimento di Scienze Chimiche, Fisiche e Matematiche - Universita' dell'Insubria, Via Lucini 3, 20100 Como ITALY and Department of Physics, Georgetown University, Washington, D.C. USA

DAVID M. CEPERLEY
Department of Physics and National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL USA

PETER J. REYNOLDS
Department of Physics, Georgetown University, Washington, D.C. and Office of Naval Research, Arlington, VA USA
1 Introduction

1.1 Nodes and the Sign Problem
Although quantum Monte Carlo is, in principle, an exact method for solving the Schrödinger equation, it is well-known that systems of Fermions still pose a challenge. Thus far all solutions to the "sign problem" [1] remain inefficient* (or wrong). The fixed-node approach [2], however, is efficient, and in many situations remains the best approach. If only we could find the exact nodes, or at least a systematic way to improve the nodes, we would, in effect, bypass the sign problem.

* "Efficiency" is a measure of how well a problem scales with system size. Several small systems have been treated with high precision by exact Fermion methods.

1.2 The Plan of Attack

A reasonable place to begin is to study the nodes of both exact and good quality approximate trial wave functions. If we can understand their properties, we can perhaps find a way to parameterize the nodes using simple functions. Then one can optimize the nodes by minimizing the fixed-node energy. Unfortunately, very little is known about wave function nodes [3,4], and a systematic study has never been attempted, despite the obvious consequences for improving quantum simulations that such knowledge might generate.

1.3 The Helium Triplet
The exact spatial eigenfunctions for the helium atom (and all other two-electron atomic ions) are functions of six coordinates (for two-electron systems the spin eigenfunction can always be factored out). Thus,

\[ \Psi_n(\mathbf{R}) = \Psi_n(x_1, y_1, z_1, x_2, y_2, z_2). \]

(Note that bold typeface indicates a vector quantity in any number of dimensions.) Let us consider states with S symmetry (L = 0). These states are rotationally invariant, so we can factor out the three Euler angles from the total wave function, and use only three coordinates to describe the internal wave function. These are usually chosen to be the interparticle distances, so that

\[ \Psi_n(\mathbf{R}) = \Psi_n(r_1, r_2, r_{12}). \]
We now focus our attention on the first triplet state. By the Pauli principle, the wave function must change sign if we interchange the two electrons. It follows that

\[ \Psi(r_1, r_2, r_{12}) = -\Psi(r_2, r_1, r_{12}). \]

From this equation we can infer that if we place the two electrons at the same distance from the nucleus (but not necessarily at the same point) we obtain

\[ \Psi(r, r, r_{12}) = -\Psi(r, r, r_{12}), \]

implying that

\[ \Psi(r, r, r_{12}) = 0. \]

This means that the node is described by the equation $r_1 = r_2$. The $2\,^3S$ state of helium is one of very few systems where we know the exact node. Note that the "Pauli hyperplane" $\mathbf{r}_1 = \mathbf{r}_2$, or $\{x_1 = x_2,\ y_1 = y_2,\ z_1 = z_2\}$, belongs to the node, but it is only a subset of lower dimensionality. In fact, since we are imposing a single constraint on a six-dimensional space, the node is a five dimensional surface, while the Pauli hyperplane, with three constraints, has dimensionality three.

Here we summarize some facts about the node of the $2\,^3S$ state, with the main objective to generalize them for larger systems. The big surprise is that the node is more symmetric than the wave function itself, since it does not depend on $r_{12}$. Strikingly, it is also independent of Z. Thus He, Li$^+$, Be$^{2+}$, ... all have the same node. Furthermore this node is present in all $^3S$ states of two-electron atoms. While the wave function is not factorizable, i.e., $\Psi(r_1, r_2, r_{12}) \neq \phi(r_1, r_2)\,\rho(r_{12})$, it can be written as

\[ \Psi(r_1, r_2, r_{12}) = N(r_1, r_2)\, e^{f(r_1, r_2, r_{12})}, \]

where

\[ N(r_1, r_2) = r_1 - r_2. \]

This is not a trivial result. All of the antisymmetry has been placed in a lower dimensional nodal function $N(r_1, r_2)$. The unknown function $f$ is totally symmetric. Moreover, we can write the second factor as an exponential, as we do, to emphasize its positivity. It is also interesting to note that the nodal function $N$ is a simple polynomial in the distances. Last but not least, in this case the Hartree-Fock wave function has the exact node.

1.4 Nodal Conjectures
What properties of the nodes are present in other systems and/or states? In other words, what is general (if anything), and what is specific to atoms versus molecules, to S states versus other symmetries, to triplet states, or to two-electron atoms versus many-electron atoms? Some years ago Anderson [5] found some of these nodal properties in $^1P$ He and $\Sigma_u$ H$_2$ as well. So, what is general? For a generic system, what can we say about $N$? Here $\Psi_{\mathrm{Exact}} = N(\mathbf{R})\, e^{f(\mathbf{R})}$.

What we call the "strong nodal conjecture" is the generalization from helium that the exact wave function can be written in the above form, with $N$ an antisymmetric polynomial of finite order, and $f$ a totally symmetric function. In the next sections we will present evidence that the ground states of Li and Be may indeed have such simple nodes. A weaker conjecture is that $N$ may not be a polynomial, but can be closely approximated by a lower-order antisymmetric polynomial. The variables in which we expect $N$ to be (or to approximate) a polynomial would be the interparticle distances. However, in general they may be any set of independent coordinates in which one could specify $\Psi$.
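The factorized form $\Psi = N e^{f}$ can be made concrete with a small numerical sketch for the helium triplet case, where $N(r_1, r_2) = r_1 - r_2$. The symmetric exponent used here, including the parameters `z` and `b`, is a hypothetical choice made only for illustration; it is not the exact $f$, which is unknown. The sketch checks that any such product is antisymmetric under electron exchange and vanishes whenever the two electrons are equidistant from the nucleus, even if they sit at different points.

```python
import numpy as np

def distances(r1_vec, r2_vec):
    """Return (r1, r2, r12) for two electron positions, nucleus at the origin."""
    r1 = np.linalg.norm(r1_vec)
    r2 = np.linalg.norm(r2_vec)
    r12 = np.linalg.norm(r1_vec - r2_vec)
    return r1, r2, r12

def nodal_factor(r1, r2):
    """Antisymmetric nodal polynomial N(r1, r2) = r1 - r2 from the text."""
    return r1 - r2

def symmetric_exponent(r1, r2, r12, z=2.0, b=0.5):
    """Hypothetical totally symmetric f; z and b are illustrative, not fitted."""
    return -z * (r1 + r2) + b * r12 / (1.0 + r12)

def psi(r1_vec, r2_vec):
    """Model wave function N * exp(f); only its sign structure is meaningful."""
    r1, r2, r12 = distances(r1_vec, r2_vec)
    return nodal_factor(r1, r2) * np.exp(symmetric_exponent(r1, r2, r12))

rng = np.random.default_rng(0)
pos_a = rng.normal(size=3)
pos_b = rng.normal(size=3)

# Exchanging the two electrons flips the sign of psi.
print("psi(a, b)  =", psi(pos_a, pos_b))
print("psi(b, a)  =", psi(pos_b, pos_a))

# Rescale pos_b to the same radius as pos_a: a different point, same distance.
pos_b_same_radius = np.linalg.norm(pos_a) * pos_b / np.linalg.norm(pos_b)
print("psi on r1 = r2:", psi(pos_a, pos_b_same_radius))
```

Because the exponential factor is strictly positive, the sign of the model function is carried entirely by `nodal_factor`, which is the point of writing the second factor as an exponential.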
2 Lithium Atom Ground State
The restricted Hartree-Fock (RHF) description of the ground state of Li is

\[ \Psi_{RHF} = \left| 1s(r_1)\, 1s(r_2)\, 2s(r_3) \right|_{\alpha\beta\alpha} = \big( 1s(r_1)\,2s(r_3) - 1s(r_3)\,2s(r_1) \big)\, 1s(r_2). \]
As for helium, the node is $r_1 = r_3$, where 1 and 3 are the two alpha spin electrons. In other words, if two like-spin electrons are at the same distance from the nucleus then $\Psi = 0$. For triplet He this was an exact result. How good is this RHF node for Li? As we demonstrate here, even though $\Psi_{RHF}$ is not very good (for example, it belongs to a higher symmetry group than the exact wave function), its node is surprisingly accurate. To see this numerically, note that the exact (DMC) solution with this node gives an energy $E_{RHF} = -7.47803(5)$ a.u. [6] compared to $E_{Exact} = -7.47806032$ a.u. [7]. A DMC simulation (done still with the same nodes), using a Hylleraas function for importance sampling, gives an extrapolated energy $E = -7.478060(3)$ a.u. Thus, the numerical evidence is that the HF node is correct (within the error bars of these Monte Carlo simulations). If true, the node has even higher symmetry than $\Psi_{RHF}$, since it does not depend on either $r_2$ or $r_{ij}$. Surprisingly (perhaps), the GVB wave function has the exact same node. Is this an indication that this node is exact? Not necessarily, since permutational symmetry alone does not require this node. The exact wave function, to be a pure $^2S$ state, requires only that

\[ \Psi = f(r_1, r_2, r_3, r_{12}, r_{13}, r_{23}) + f(r_2, r_1, r_3, r_{12}, r_{23}, r_{13}) - f(r_3, r_2, r_1, r_{23}, r_{13}, r_{12}) - f(r_2, r_3, r_1, r_{23}, r_{12}, r_{13}), \]

which does not constrain a node at $r_1 = r_3$.

To study an "almost exact" node we took a Hylleraas expansion for Li with 250 terms, whose variational energy is $E_{Hy} = -7.478059$ a.u. We then examined the 5-D nodal structure of this function in two ways. First, we used Mathematica® to take cuts through space; and second, we used Monte Carlo simulation to compare nodal crossings of $\Psi_{Hy}$ with crossings of $r_1 = r_3$. From the various cuts, we were unable to find any deviation from the $r_1 = r_3$ node. However, from the simulations there were found 6 (out of 98) nodal crossings of the Hylleraas function that did not also cross $r_1 = r_3$. These six crossings appear to be, on closer examination, either sufficiently close to $r_1 = r_3$ to be due to round-off error of a truncated Hylleraas expansion, or to be numerical artifacts. The issue thus appears unresolved by these numerical means.
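The crossing comparison described above can be sketched for the simpler RHF function, whose node is $r_1 = r_3$ by construction. The sketch below is only an illustration of the bookkeeping, not the authors' simulation: it assumes hydrogen-like 1s and 2s orbitals with nuclear charge `Z = 3` in place of true Hartree-Fock orbitals, works directly with the three radii (which is all this function depends on), and proposes random moves rather than running a real Monte Carlo walk. For each move it records whether the trial function changed sign and whether $r_1 - r_3$ changed sign, then counts wave function crossings that are not matched by a crossing of $r_1 = r_3$.

```python
import numpy as np

Z = 3.0  # assumption: hydrogen-like orbitals with the nuclear charge of Li

def orbital_1s(r):
    """Unnormalized hydrogen-like 1s orbital (illustrative stand-in)."""
    return np.exp(-Z * r)

def orbital_2s(r):
    """Unnormalized hydrogen-like 2s orbital (illustrative stand-in)."""
    return (1.0 - 0.5 * Z * r) * np.exp(-0.5 * Z * r)

def psi_rhf(r1, r2, r3):
    """Spatial RHF function for spins alpha, beta, alpha (electrons 1 and 3 alike)."""
    return (orbital_1s(r1) * orbital_2s(r3)
            - orbital_1s(r3) * orbital_2s(r1)) * orbital_1s(r2)

rng = np.random.default_rng(1)
n_moves = 20_000

# Radii before and after a random displacement (reflected to stay positive).
old = rng.uniform(0.05, 4.0, size=(3, n_moves))
new = np.abs(old + rng.normal(scale=0.3, size=(3, n_moves)))

# A "crossing" is a move across which the quantity changes sign.
psi_crossed = np.sign(psi_rhf(*old)) != np.sign(psi_rhf(*new))
radial_crossed = np.sign(old[0] - old[2]) != np.sign(new[0] - new[2])

n_psi_crossings = int(psi_crossed.sum())
n_unmatched = int((psi_crossed & ~radial_crossed).sum())
print("nodal crossings of psi:", n_psi_crossings)
print("crossings of psi that do not cross r1 = r3:", n_unmatched)
```

With these model orbitals the ratio 2s/1s decreases monotonically with r, so the two kinds of crossing should coincide. For a correlated function such as the Hylleraas expansion, a non-zero unmatched count is what would signal a deviation from the $r_1 = r_3$ node, subject to the round-off caveats noted in the text.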
White and Stillinger [9] determined the nodes for 3 electrons interacting with harmonic springs (so-called harmonic lithium) and gave an argument, using perturbation theory, that $r_1 = r_3$ is not the exact node for the Li atom ground state. They found that correlations with an electron at $\mathbf{r}_2$ distorted the HF nodes to an aspherical, pear-like shape, but by a small amount. In the vicinity of the nucleus they obtained a form for the nodal function that to lowest order is still a sphere (in $\mathbf{r}_1$, fixing electrons 2 and 3), but the center is displaced from the nuclear position along the axis in the direction of the third electron, with the radius going through the position of the other up electron. Given the perturbation results of White and Stillinger, the "strong nodal conjecture" seems not to hold, at least not for the function $r_1 = r_3$ (although it might be true for a higher order polynomial). However, the "weak nodal conjecture" is certainly true, given the extremely good energy obtained using that node.

3 Beryllium Atom

3.1 Numerical Arguments and Mathematical Cross-Sections
The ground state of Be is $(1s)^2(2s)^2\ {}^1S$. In 1992 it was realized that the Hartree-Fock (HF) wave function has four nodal regions [3]. In essence, $\Psi_{HF}$ factors into two determinants, each one in effect a triplet Be$^{2+}$. Thus there is (in this description) no effect of the alpha electrons on the beta electrons, and vice versa, and each set forms a separate $r_1 = r_2$ type node. Explicitly, the HF node is $(r_1 - r_2)(r_3 - r_4) = 0$. This node has the wrong topology. How do we know this? Numerically, the DMC energy for this node is -14.6576(4) a.u. [6] versus the exact energy of -14.6673 a.u. This is well outside the DMC statistical error. Since only the fixed node approximation can account for the error, Be represents an unusual system where the nodal error is clearly visible, presumably because of the strong mixing of several HF configurations in the ground state.

Figure 1: Schematic of nodal regions for $\Psi_{HF}$ of the Be atom. Axes are $t_1 = (r_1 - r_2)$ and $t_2 = (r_3 - r_4)$.
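The statement that the Be discrepancy is "well outside the DMC statistical error" can be put in numbers by expressing each fixed-node energy's distance from the exact value in units of its quoted statistical uncertainty. The short sketch below does this for the Li and Be energies quoted in the text; the helper name `deviation_in_sigma` is introduced here only for illustration.

```python
def deviation_in_sigma(e_fixed_node, sigma, e_exact):
    """Distance between fixed-node and exact energies, in units of sigma."""
    return abs(e_fixed_node - e_exact) / sigma

# Energies in atomic units, as quoted in the text: value(error in last digit).
li_sigma = deviation_in_sigma(-7.47803, 0.00005, -7.47806032)   # Li, RHF node
be_sigma = deviation_in_sigma(-14.6576, 0.0004, -14.6673)       # Be, HF node

print(f"Li: {li_sigma:.1f} sigma from the exact energy")
print(f"Be: {be_sigma:.1f} sigma from the exact energy")
```

The Li result lies within about one standard error of the exact energy, so no nodal error is resolved, while the Be result is off by roughly two dozen standard errors, which cannot plausibly be attributed to statistical noise.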
In fact, this node is wrong in an easily describable way. It was conjectured a while ago [3,4] that the exact $\Psi$ for ground states of atoms in general has but two nodal regions. As indicated, the Hartree-Fock node creates four nodal regions. We find upon going beyond HF that the simple node, with its $(r_1 - r_2)(r_3 - r_4)$ structure, changes only slightly. Yet, the crossing surfaces open up, leaving only a lower dimensional crossing "point." We see this clearly by taking cuts (done in Mathematica®) through the full 9-D (effective) space of the wave function. (The full dimensionality is 3N for an N electron system, and hence 12 here. However, as discussed earlier, for an S state the wave function is invariant under rotation, allowing us to eliminate 3 degrees of freedom.)

We started by using Mathematica® to examine gradually more accurate trial wave functions. Plotting $t_1 = (r_1 - r_2)$ against $t_2 = (r_3 - r_4)$ we see that the HF wave function vanishes along the axes (for arbitrary values of the variables representing the other 7 dimensions). This was illustrated in Figure 1 (though you have to imagine the other 7 dimensions all coming out of the page at you). Given that adjacent regions have opposite signs of the wave function, one can label the regions "+" and "-" as indicated. An optimized two-configuration (4 determinant) trial wave function also displays this crossing structure, but now only at particular values of the other variables. For more general values of those variables there is a passage between either the two "+" regions or the two "-" regions. With a proper choice of the angular variables one observes a smooth opening up of the crossing, going from interconnected "+" regions to interconnected "-" regions.
Figure 2: Schematics of nodal regions for $\Psi_{CI}$ of the Be atom. Axes are $t_1 = (r_1 - r_2)$ and $t_2 = (r_3 - r_4)$.
The closeness of the nodes to the simple form $(r_1 - r_2)(r_3 - r_4)$ seems to indicate the presence of this term, plus a small additional term. Taking more cuts provided us with clues. For example there is (almost) a node when two alpha electrons are along any ray from the origin, while the two beta's are on a sphere. Following up on this we deduced a node of the form

\[ N(\mathbf{R}) = (r_1 - r_2)(r_3 - r_4) + a\, \mathbf{r}_{12} \cdot \mathbf{r}_{34}. \]

We are in the process of determining the remaining structure of the node when both these terms are accounted for. A simple polynomial form, of greater symmetry than apparently required (i.e. needing fewer than d - 1 = 8 variables), appears to describe the node.

In the next section we give a proof of what we found numerically; namely that the nodal structure of Be has only two disjoint volume elements. By examining cross-sections it was possible to visualize this, though it remains difficult to understand precisely how the various nodal regions are connected up in the full 12 dimensional space. The proof puts this on firmer footing and also shows the origin of the dot-product term in the node.

3.2 Proof That Four-Electron $^1S$ Atomic Ground States Have Only Two Nodal Regions
For a singlet state of a four-electron system there will be two up electrons, at positions we denote by $\{\mathbf{r}_1, \mathbf{r}_2\}$, and two down electrons at $\{\mathbf{r}_3, \mathbf{r}_4\}$. The wave function must be antisymmetric with respect to exchange of either of these pairs, with corresponding permutation operators $P_{12}$ and $P_{34}$. We define a nodal region with respect to a reference point, $\mathbf{R}^* = (\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \mathbf{r}_4)$, as the set of points that can be reached by a path from the reference point that does not cross a node of $\Psi$. Any point with $\Psi(\mathbf{R}^*) \neq 0$ can be chosen as a reference point. For any ground state wave function, the tiling theorem applies [4]. This theorem tells us that any nodal region defined with respect to one reference point is equivalent to those defined with respect to another point, up to a permutation. For the 4-electron case, this implies that there can be at most four nodal regions, since there are only four permutations that do not interchange spin: two positive permutations $I$ and $P_{12}P_{34}$, and two negative permutations $P_{12}$ and $P_{34}$.

Now, consider the Be atom wave function expanded in a single particle basis. Since the single particle levels are ordered as (1s) < (2s) < (2p) < (3s) ..., the two lowest energy configurations in the ground state are $\varphi_1 = (1s)^2(2s)^2$ and $\varphi_2 = (1s)^2(2p)^2$. Explicitly writing down the Slater determinant we find that the sign of $\varphi_1$ is given by the node mentioned above: $(r_1 - r_2)(r_3 - r_4)$. This gives the four nodal regions of the HF wave function, resulting from the direct product of the up spin and down spin determinants. As stated earlier, this property is not in the true function, wherein electron correlation changes the connectivity of the nodes.
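The count of four spin-preserving permutations, and their signs, can be checked by brute force. The following sketch enumerates all 24 permutations of four electron labels, keeps those that map the up pair onto itself and the down pair onto itself, and reports each one's parity. The zero-based labels and helper names are choices made for this illustration only.

```python
from itertools import permutations

UP = (0, 1)    # electrons 1 and 2 in the text (zero-based labels here)
DOWN = (2, 3)  # electrons 3 and 4 in the text

def parity(perm):
    """Return +1 for an even permutation and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def preserves_spin(perm):
    """True if up labels map to up labels and down labels to down labels."""
    return all(perm[i] in UP for i in UP) and all(perm[i] in DOWN for i in DOWN)

spin_preserving = [
    (perm, parity(perm)) for perm in permutations(range(4)) if preserves_spin(perm)
]

for perm, sign in spin_preserving:
    print(perm, "positive" if sign > 0 else "negative")
```

The four survivors are the identity, the exchange within the down pair, the exchange within the up pair, and the double exchange, with two positive and two negative signs, matching $I$, $P_{34}$, $P_{12}$ and $P_{12}P_{34}$ in the text.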
To show that there are, in fact, only two nodal regions we must find a valid reference point $\mathbf{R}^*$ together with a path $\mathbf{R}(t)$ that connects it with its permuted image $P_{12}P_{34}\mathbf{R}^*$ such that $\Psi(\mathbf{R}(t)) \neq 0$ along the entire path. (The connection between the two negative regions follows by symmetry.) For the reference point, we consider a point of the form $\mathbf{R}^* = (\mathbf{r}_1, -\mathbf{r}_1, \mathbf{r}_3, -\mathbf{r}_3)$ where $\mathbf{r}_1$ and $\mathbf{r}_3$ are arbitrary non-zero vectors. As an S state, $\Psi(\mathbf{R})$ is invariant with respect to rotation of all of the electrons about any axis through the nucleus. Therefore, consider the path connecting $\mathbf{R}^*$ to $P_{12}P_{34}\mathbf{R}^*$ to be that generated by a 180° rotation about the axis $\mathbf{r}_1 \times \mathbf{r}_3$. (If $\mathbf{r}_1$ is parallel to $\mathbf{r}_3$ then any axis perpendicular to either vector can be chosen.) Since the wave function is invariant with respect to rotation, it is constant along this path. Hence, as long as $\mathbf{R}^*$ is a valid reference point, there are only two connected nodal regions, a single positive and the complementary negative region. This wave function would be "maximally connected."

Figure 3: Path connecting $\mathbf{R}^*$ to $P_{12}P_{34}\mathbf{R}^*$.

What is left in proving that the HF nodes have the incorrect connectivity amounts to finding such a special point with $\Psi(\mathbf{R}^*) \neq 0$. The point $\mathbf{R}^*$ defined above, however, is not suitable if we are to believe the HF node. We now argue that $\mathbf{R}^*$ from above is suitable, as long as electron correlation causes configuration mixing in the ground state. Suppose we expand the exact wave function in a CI basis, $\Psi = \sum_i c_i \varphi_i$. The configuration $\varphi_2$ enters with a non-zero coefficient $c_2$, since it has the same symmetry as the ground state, and it is a double excitation. We can show that this configuration has a nodal surface described by

\[ \big( g(r_1)\,\mathbf{r}_1 - g(r_2)\,\mathbf{r}_2 \big) \cdot \big( g(r_3)\,\mathbf{r}_3 - g(r_4)\,\mathbf{r}_4 \big) = 0 \]

for some positive function $g$. With the value of $\mathbf{R}^*$ assumed above, the wave function will vanish only when $\mathbf{r}_1 \cdot \mathbf{r}_3 = 0$. Thus $\mathbf{r}_1$ and $\mathbf{r}_3$ can be freely chosen (as long as they are not perpendicular) for the point $\mathbf{R}^*$ to be a valid reference point. There is no reason to suspect that adding other terms would cause $\Psi(\mathbf{R}^*)$ to vanish exactly for all of these points $\mathbf{R}^*$.
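The rotation argument can be followed numerically with a model nodal function. The sketch below uses the form $N(\mathbf{R}) = (r_1 - r_2)(r_3 - r_4) + a\,\mathbf{r}_{12}\cdot\mathbf{r}_{34}$ suggested in Section 3.1 as a stand-in for the sign of $\Psi$; the coefficient `a = 0.1` and the particular vectors are hypothetical values chosen for illustration. It rotates the reference point $\mathbf{R}^* = (\mathbf{r}_1, -\mathbf{r}_1, \mathbf{r}_3, -\mathbf{r}_3)$ through 180° about $\mathbf{r}_1 \times \mathbf{r}_3$, confirms that the end point is $P_{12}P_{34}\mathbf{R}^*$, and evaluates both the HF node function and the model function along the way.

```python
import numpy as np

def rotate(vec, axis, angle):
    """Rotate vec about axis by angle (radians), using Rodrigues' formula."""
    unit = axis / np.linalg.norm(axis)
    return (vec * np.cos(angle)
            + np.cross(unit, vec) * np.sin(angle)
            + unit * np.dot(unit, vec) * (1.0 - np.cos(angle)))

def hf_node(config):
    """HF nodal function (r1 - r2)(r3 - r4), built from the four radii."""
    r = [np.linalg.norm(v) for v in config]
    return (r[0] - r[1]) * (r[2] - r[3])

def model_node(config, a=0.1):
    """Model N(R) = (r1 - r2)(r3 - r4) + a * r12 . r34; a is hypothetical."""
    r12 = config[0] - config[1]
    r34 = config[2] - config[3]
    return hf_node(config) + a * np.dot(r12, r34)

r1 = np.array([1.0, 0.0, 0.0])
r3 = np.array([0.5, 0.8, 0.0])          # not perpendicular to r1
reference = [r1, -r1, r3, -r3]          # R* = (r1, -r1, r3, -r3)
axis = np.cross(r1, r3)

# Model nodal function along the 180 degree rotation path.
angles = np.linspace(0.0, np.pi, 181)
path_values = np.array(
    [model_node([rotate(v, axis, t) for v in reference]) for t in angles]
)

end_point = [rotate(v, axis, np.pi) for v in reference]
permuted = [reference[1], reference[0], reference[3], reference[2]]  # P12 P34 R*

print("HF node function at R*:", hf_node(reference))
print("model N along path: min", path_values.min(), "max", path_values.max())
print("end point equals P12 P34 R*:", np.allclose(end_point, permuted))
```

At $\mathbf{R}^*$ the HF function is exactly zero, which is why this point is unusable if one believes the HF node. The dot-product term equals $4a\,\mathbf{r}_1 \cdot \mathbf{r}_3$ there and is unchanged by the rotation, so the model function keeps one sign along the whole path whenever $a \neq 0$ and $\mathbf{r}_1 \cdot \mathbf{r}_3 \neq 0$.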
The Be atom is a case where the HF single determinant nodes are particularly bad because of configuration mixing, and the HF nodes cause significant fixed-node error [8]. The argument just presented, however, is quite general, applying to all four-electron atoms that have S ground states and non-zero mixing of double excitations.
4 Conclusions
The belief that "nodes are weird" expressed jokingly by M. Foulkes at the Seattle meeting in 1999 may be overstated. Here (at the Hawaii Pacifichem 2000 meeting) we countered with "...maybe not." Numerically determined nodes (at least for the atoms He, Li, and Be) seem to depend on few variables, and have higher symmetry than the wave function itself. Moreover, the nodes resemble polynomial functions. Possibly this is an explanation of why HF nodes, as seen in fixed-node QMC simulations of these atoms, are so good: they "naturally" have these properties. If so, it is a simple leap of faith to believe that it may in fact be possible to optimize the nodes directly, for use in QMC.
5 References
1. http://www.ncsa.uiuc.edu/Apps/CMP/lectures/signs.html and references therein.
2. P. J. Reynolds, D. M. Ceperley, B. J. Alder, and W. A. Lester, Jr., J. Chem. Phys. 77, 5593 (1982).
3. W. A. Glauser, W. R. Brown, W. A. Lester, Jr., D. Bressanini, B. L. Hammond, and M. L. Koszykowski, J. Chem. Phys. 97, 9200 (1992).
4. D. M. Ceperley, J. Stat. Phys. 63, 1237 (1991).
5. J. B. Anderson, Phys. Rev. A 35, 3550 (1987).
6. A. Lüchow and J. B. Anderson, J. Chem. Phys. 105, 7573 (1996).
7. Z. C. Yan, M. Tambasco, and G. W. F. Drake, Phys. Rev. A 57, 1652 (1998).
8. J. W. Moskowitz, K. E. Schmidt, M. A. Lee, and M. H. Kalos, J. Chem. Phys. 76, 1064 (1982).
9. R. J. White and F. H. Stillinger, J. Chem. Phys. 3, 1521 (1971).
INTERATOMIC FORCES AND CORRELATED SAMPLING IN QUANTUM MONTE CARLO

CLAUDIA FILIPPI
Department of Physics, National University of Ireland, Cork, Ireland

C. J. UMRIGAR
Cornell Theory Center and Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, New York 14853
One of the main difficulties of quantum Monte Carlo techniques is the lack of an efficient method for computing interatomic forces. To date, most quantum Monte Carlo calculations have been performed on geometries obtained with either density functional theory or conventional quantum chemistry methods. Here, we present a correlated sampling method to efficiently calculate numerical forces and potential energy surfaces in diffusion Monte Carlo. It employs a novel coordinate transformation, earlier used in variational Monte Carlo, to greatly reduce the statistical error. Results are presented from all-electron and pseudopotential calculations of homonuclear diatomic molecules.
1 Introduction
Over the past decade, quantum Monte Carlo (QMC) methods [1,2] have provided the most accurate calculations of correlated properties for medium-sized molecules and solid systems, where conventional quantum chemistry methods become computationally very expensive. Despite their potential to become an important tool for the study of correlated systems, a major stumbling block to a more widespread use of QMC techniques has been their difficulty in determining equilibrium geometries and potential energy surfaces: most QMC calculations have been performed on geometries obtained with either density functional theory (DFT) or conventional quantum chemistry methods.

To understand the complications involved in computing forces in QMC, we first need to discuss how conventional electronic structure methods determine interatomic forces, and why such procedures cannot be straightforwardly extended to QMC calculations. DFT methods or standard quantum chemistry techniques use the Hellmann-Feynman theorem to compute forces on nuclei [3]. The Hellmann-Feynman theorem is applicable provided that the parameters in the wave function are chosen to minimize the energy for the given form of the wave function (Pulay [4] corrections must be added if the basis employed depends on the atomic coordinates and is not complete).

The applicability of the Hellmann-Feynman theorem in QMC methods is complicated by the following considerations. First, the wave functions used in QMC are usually not obtained by minimizing the energy, but rather by minimizing the variance of the local energy. Although these conditions are identical if no constraints are placed on the form of the wave function, for the wave functions used in practice they are close but not identical. Therefore, if the Hellmann-Feynman theorem were employed in variational Monte Carlo (VMC), the forces would have a systematic error. (When we speak of the error in the force we do not mean the error compared to the true force but rather the error relative to the force obtained by calculating the energy at neighboring geometries using the same form of the wave function with reoptimized parameters.) At first sight, it may appear that the systematic error disappears in diffusion Monte Carlo (DMC) since this method stochastically projects onto the true ground state. However, practical DMC calculations require one to employ the fixed-node approximation [2] to avoid an exponential growth of the bosonic ground state relative to the fermionic ground state as the projection time increases. For fixed-node DMC it has been shown [5] that the Hellmann-Feynman theorem is applicable provided that at least one of the following two conditions is satisfied: a) the trial wave function has the true nodes or b) the nodes of the trial functions are independent of the nuclear positions. In fact, the former condition can be relaxed to read: the parameters of the trial wave function have been optimized to give the lowest fixed-node DMC energy for that form of the wave function, and the basis used to construct the wave function is either not dependent on the atomic positions or is complete.

The other problem with using the Hellmann-Feynman theorem in both VMC and DMC is that a straightforward application of the theorem leads to an infinite variance estimator for the force, thereby rendering the method totally impractical.
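The origin of the infinite variance can be seen in a short worked estimate. This is a sketch under simplifying assumptions that are ours rather than the authors': a single nucleus of charge $Z$ at the origin, atomic units, and a one-electron density $\rho(r)$ that is spherically symmetric and non-zero at the nucleus. The bare Hellmann-Feynman estimator of the $z$ component of the force then has magnitude $Z z / r^3$ per electron, and its second moment is

```latex
\langle F_z^2 \rangle
  = Z^2 \int \rho(r)\, \frac{z^2}{r^6}\, d^3r
  = Z^2 \int_0^\infty \rho(r)\, \frac{r^2}{r^6}\, r^2\, dr \int \cos^2\theta \, d\Omega
  = \frac{4\pi Z^2}{3} \int_0^\infty \frac{\rho(r)}{r^2}\, dr .
```

Since $\rho(0) \neq 0$, the integrand behaves as $1/r^2$ near the nucleus and the integral diverges at its lower limit. The mean of the estimator stays finite because the angular integration removes the leading singular term, so the force is well defined while its variance is not.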
However, a recently developed and general variance reduction method, due to Assaraf and Caffarel [6], makes the variance finite and small.

Alternatively, one could simply compute energy differences to obtain either forces (for an infinitesimal displacement of the ions) or the full potential energy surface of the system. However, while quantum chemistry methods can rely on having an approximately constant and smoothly varying error in the energy, a major disadvantage of QMC methods is that, in addition to systematic errors, one has statistical errors which make the determination of energy differences or smooth potential energy surfaces computationally very expensive. Even though it is not possible to entirely eliminate the statistical errors, it is possible, by using correlated sampling [2], to make the statistical errors in the relative energies of different geometries much smaller than the errors in the separate energies and to make them vanish in the limit that the two
geometries become identical. In the past, the correlated sampling technique has been used within VMC [7,8] but there have been very few attempts [9] to extend the approach to DMC, and these were approximate and/or inefficient and were tested only on H$_2$, H$_3^+$ and LiH.

In this paper, we discuss our DMC correlated sampling technique [10] for computing accurate forces, potential energy surfaces and vibration frequencies. The DMC bond lengths of first-row diatomic molecules computed with this algorithm are found to be in better agreement with experimental values than are the VMC, Hartree-Fock (HF), local density approximation (LDA) and generalized gradient approximation (GGA) values.

2 Correlated sampling in variational Monte Carlo
Correlated sampling enables us to compute, from a single reference Monte Carlo walk, the relative energies of different geometries, a reference and one or more secondary geometries, with nuclear coordinates $\mathbf{R}_\alpha$ and $\mathbf{R}^s_\alpha$, Hamiltonians $\mathcal{H}$ and $\mathcal{H}_s$, and wave functions $\psi$ and $\psi_s$, respectively. Unbiased expectation values are obtained by reweighting the configurations sampled from $\psi^2$,
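$$\frac{\langle \psi_s | \mathcal{H}_s | \psi_s \rangle}{\langle \psi_s | \psi_s \rangle} \approx \frac{1}{N_{\rm conf}} \sum_{i=1}^{N_{\rm conf}} W_i\, \frac{\mathcal{H}_s \psi_s(\mathbf{R}_i)}{\psi_s(\mathbf{R}_i)}, \qquad (1)$$

where the weights of the $N_{\rm conf}$ MC configurations are

$$W_i = \frac{N_{\rm conf}\, |\psi_s(\mathbf{R}_i)/\psi(\mathbf{R}_i)|^2}{\sum_{j=1}^{N_{\rm conf}} |\psi_s(\mathbf{R}_j)/\psi(\mathbf{R}_j)|^2}, \qquad (2)$$

and $\mathbf{R} = (\mathbf{r}_1, \ldots, \mathbf{r}_N)$. The effective number of configurations sampled is

$$N_{\rm eff} = \frac{\left(\sum_{i=1}^{N_{\rm conf}} W_i\right)^2}{\sum_{i=1}^{N_{\rm conf}} W_i^2}. \qquad (3)$$

[...]

$$W_i = \frac{N_{\rm conf}\, |\psi_s(\mathbf{R}^s_i)/\psi(\mathbf{R}_i)|^2\, J(\mathbf{R}_i)}{\sum_{j=1}^{N_{\rm conf}} |\psi_s(\mathbf{R}^s_j)/\psi(\mathbf{R}_j)|^2\, J(\mathbf{R}_j)}, \qquad (7)$$

and $J(\mathbf{R})$ is the Jacobian for the transformation (Eq. 4).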
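The reweighting without a coordinate transformation (weights proportional to $|\psi_s/\psi|^2$, normalized to sum to $N_{\rm conf}$, and the effective sample size) can be illustrated with a toy model. The sketch below is our own illustration, not the authors' code: it assumes a one-dimensional harmonic oscillator, $\mathcal{H} = -\tfrac{1}{2}\,d^2/dx^2 + \tfrac{1}{2}x^2$, with Gaussian wave functions $\psi = e^{-a x^2/2}$ and $\psi_s = e^{-a_s x^2/2}$ standing in for the primary and secondary wave functions; all function names and parameter values are hypothetical.

```python
import math
import random

def local_energy(x, a):
    """Local energy of psi = exp(-a x^2 / 2) for H = -1/2 d^2/dx^2 + x^2 / 2."""
    return 0.5 * a + 0.5 * (1.0 - a * a) * x * x

def correlated_energies(a, a_s, n_conf, seed=0):
    """Primary and secondary energies from ONE set of configurations.

    Configurations are drawn from psi^2 = exp(-a x^2), which is a Gaussian of
    variance 1 / (2 a), so no Metropolis walk is needed in this toy case.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(1.0 / (2.0 * a))
    xs = [rng.gauss(0.0, sigma) for _ in range(n_conf)]

    # |psi_s / psi|^2 for each configuration, then normalize so sum(W) = n_conf.
    ratios = [math.exp(-(a_s - a) * x * x) for x in xs]
    norm = sum(ratios)
    weights = [n_conf * r / norm for r in ratios]

    e_primary = sum(local_energy(x, a) for x in xs) / n_conf
    e_secondary = sum(w * local_energy(x, a_s)
                      for w, x in zip(weights, xs)) / n_conf
    n_eff = sum(weights) ** 2 / sum(w * w for w in weights)
    return e_primary, e_secondary, n_eff

e_p, e_s, n_eff = correlated_energies(a=1.0, a_s=1.2, n_conf=20000)
exact_s = 1.2 / 4.0 + 1.0 / (4.0 * 1.2)  # analytic <H> for psi_s
print(e_p, e_s, exact_s, n_eff)
```

In this toy case the two wave functions overlap strongly, so the effective number of configurations stays close to $N_{\rm conf}$; it drops as the secondary wave function departs from the primary one.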
3 Correlated sampling in diffusion Monte Carlo
In DMC [11], the primary walk is generated according to a stochastic implementation of the integral equation:

$$f(\mathbf{R}', t + \tau) = \int d\mathbf{R}\, G(\mathbf{R}', \mathbf{R}, \tau)\, f(\mathbf{R}, t), \qquad (8)$$
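where the importance-sampled Green's function is $G(\mathbf{R}', \mathbf{R}, \tau) = \psi(\mathbf{R}')\, \langle \mathbf{R}' | \exp\{-\mathcal{H}\tau\} | \mathbf{R} \rangle / \psi(\mathbf{R})$, $f = \phi\psi$, $\phi$ is the ground state wave function and $\psi$ the trial wave function. For small values of $\tau$ (short-time approximation), $G(\mathbf{R}', \mathbf{R}, \tau)$ is given by the product of three factors, drift, diffusion and growth/decay:

$$G(\mathbf{R}', \mathbf{R}, \tau) \approx \frac{1}{(2\pi\tau)^{3N/2}} \exp\left[-\frac{(\mathbf{R}' - \mathbf{R} - \mathbf{V}\tau)^2}{2\tau}\right] \exp\left[S(\mathbf{R}', \mathbf{R}, \tau)\right], \qquad (9)$$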
where $\mathbf{V} = \nabla\psi(\mathbf{R})/\psi(\mathbf{R})$ and $S(\mathbf{R}', \mathbf{R}, \tau) = (2E_T - E_L(\mathbf{R}') - E_L(\mathbf{R}))\,\tau/2$ with $E_L = \mathcal{H}\psi(\mathbf{R})/\psi(\mathbf{R})$. A set of primary walkers characterized by the pairs $(\mathbf{R}_i, w_i)$ is a random realization of the distribution $f$. Each walker executes a branching random walk: a walker originally at $\mathbf{R}$ drifts to $\mathbf{R} + \mathbf{V}(\mathbf{R})\tau$ and then diffuses to $\mathbf{R}'$ according to the Gaussian term in Eq. 9. To ensure that, when $\psi$ is the ground state wave function, $\psi^2$ is sampled exactly despite the short-time approximation in the Green's function, the move is accepted with probability

$$p = \min\left\{1, \frac{|\psi(\mathbf{R}')|^2\, T(\mathbf{R}, \mathbf{R}', \tau)}{|\psi(\mathbf{R})|^2\, T(\mathbf{R}', \mathbf{R}, \tau)}\right\}, \qquad (10)$$

as prescribed by the detailed balance condition. We denote by $T$ the drift-diffusion part of the Green's function $G$. Finally, the weight of the walker is multiplied by $\exp[S(\mathbf{R}', \mathbf{R}, \tau)]$. In practice, we employ the improved version of $G$ presented in Ref. [12].
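The drift, diffusion, accept/reject and growth/decay steps just described can be sketched for a single walker. This is a minimal illustration under our own assumptions, not the authors' implementation: a one-dimensional harmonic oscillator with the hypothetical trial function $\psi = e^{-A x^2/2}$, no branching, and the simplest treatment of rejected moves (the walker stays in place), without the refinements of Ref. [12].

```python
import math
import random

A = 1.0    # exponent of the hypothetical trial function psi(x) = exp(-A x^2 / 2)
E_T = 0.5  # trial energy; 0.5 is the exact ground-state energy of this toy model

def log_psi(x):
    return -0.5 * A * x * x

def drift(x):
    """V = psi'(x) / psi(x)."""
    return -A * x

def local_energy(x):
    """E_L for H = -1/2 d^2/dx^2 + x^2 / 2."""
    return 0.5 * A + 0.5 * (1.0 - A * A) * x * x

def log_T(x_new, x_old, tau):
    """Log of the drift-diffusion kernel T(x_new, x_old, tau), up to a constant."""
    d = x_new - x_old - drift(x_old) * tau
    return -d * d / (2.0 * tau)

def dmc_move(x, tau, rng):
    """One move: drift, diffuse, accept/reject, then return the growth factor."""
    x_new = x + drift(x) * tau + rng.gauss(0.0, math.sqrt(tau))
    # log of |psi(x')|^2 T(x, x') / (|psi(x)|^2 T(x', x)), cf. the acceptance rule
    log_ratio = (2.0 * log_psi(x_new) + log_T(x, x_new, tau)
                 - 2.0 * log_psi(x) - log_T(x_new, x, tau))
    if log_ratio < 0.0 and rng.random() >= math.exp(log_ratio):
        x_new = x  # rejected move: the walker stays where it was (simplification)
    s = (2.0 * E_T - local_energy(x_new) - local_energy(x)) * tau / 2.0
    return x_new, math.exp(s)

rng = random.Random(1)
x, w, samples = 0.0, 1.0, []
for _ in range(20000):
    x, factor = dmc_move(x, 0.1, rng)
    w *= factor
    samples.append(x)
mean_x2 = sum(v * v for v in samples) / len(samples)
print("accumulated weight:", w, " <x^2>:", mean_x2)
```

With $A = 1$ the trial function is the exact ground state, so the local energy is constant, every growth factor equals one, and the accept/reject step makes the walk sample $\psi^2$ (for which $\langle x^2 \rangle = 1/2$) regardless of the time step.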
3.1 An impractical route to DMC correlated sampling
Previous attempts in the literature to perform correlated sampling in DMC adopted a simplified approach to the algorithm: they either used the same wave function for the secondary and the primary geometry, or did not include the accept/reject step, or completely neglected problems which occur at the nodes. Therefore, to clarify possible misconceptions and, at the same time, lay the groundwork for our algorithm, we present in this section a correct but impractical algorithm for correlated sampling in DMC.

Let us generate the primary walk according to Eq. 9 and the secondary walk as specified by the space-warp transformation (Eq. 4). Two complications, absent in VMC, arise for correlated sampling in DMC. First of all, the dynamics of the secondary walker should have been governed by an importance-sampled Green's function, $G_s(\mathbf{R}^{s\prime}, \mathbf{R}^s, \tau)$, constructed from the secondary wave function $\psi_s$, and the move should have been accepted with probability $p_s$ obtained by substituting, in Eq. 10, $\psi$ and $T$ with $\psi_s$ and $T_s$, respectively.
However, the secondary geometry move was effectively proposed according to the drift-diffusion Green's function $T(\mathbf{R}', \mathbf{R}, \tau)/J(\mathbf{R}')$ and accepted with probability $p$ defined in Eq. 10. To correct for the wrong dynamics, we should multiply the weights of the secondary walkers by

$$r\, \frac{T_s(\mathbf{R}^{s\prime}, \mathbf{R}^s, \tau)}{T(\mathbf{R}', \mathbf{R}, \tau)/J(\mathbf{R}')}, \qquad (11)$$

where $r = p_s/p$ if the move is accepted and $r = (1 - p_s)/(1 - p)$ if the move is rejected. However, these products fluctuate wildly ($r$ can be anywhere between zero and infinity). Therefore, it is not practical to follow this route to perform correlated sampling unless bounds can be placed on the ratios while at the same time ensuring that unbiased results are obtained in the $\tau \to 0$ limit.

An additional complication is the common practice in fixed-node DMC to reject moves that cross nodes. If primary and secondary walkers were to be treated on the same footing ($p_s$ set to zero when the secondary walker crosses its own nodes), the weights of the secondary walkers would all become zero in a sufficiently long run. Even though this problem can be easily overcome since it is legitimate to do fixed-node DMC allowing walkers to cross nodes [12], reweighting as in Eq. 11 remains impractical due to the large fluctuations.

3.2 Our accurate and efficient algorithm
Our alternative correlated sampling algorithm [10] is approximate but very accurate. Given the successful implementation of correlated sampling in VMC, we want to devise a scheme which differs as little as possible from VMC and is almost as efficient as VMC. Also, to compute DMC energies for just the reference geometry, we have a well known algorithm including an accept/reject step, which is efficient and has a very small time-step error. Therefore, we wish to have a DMC algorithm which is close to our VMC algorithm for calculating energy differences, reduces to our usual DMC algorithm for the reference geometry, and yields results very close to the DMC limit for the secondary geometry.

Our algorithm is based on the observation that, in the absence of the growth/decay factor and presence of the accept/reject step, we would be sampling $\psi^2$ for the primary walk, and $\psi_s^2$ for the secondary walk by reweighting the averages with the ratio of wave functions (Eq. 7). If we stopped here, we would be simply obtaining the VMC energy difference. To approach the DMC limit, we then multiply the weights of the primary and secondary walkers by the corresponding growth/decay factors. Thus, we recover the fixed-node solution for the primary walk, but we do so only approximately for the secondary
walk since the moves were not proposed with the right dynamics. To partially correct for this, we introduce a secondary time-step as discussed below. This algorithm must of course always give energies lower than the VMC energies for the secondary geometries and we show in Sec. 5 that it is indeed very accurate. We summarize our recipe as follows:

(1) We generate secondary walks from the reference walk according to the space-warp transformation.

(2) In the averages for the secondary geometry, we retain the ratios of the secondary and primary wave functions as in VMC (Eqs. 6 and 7).

(3) The secondary weights are the primary ones in which the primary growth/decay factors over the last $N_{\rm proj}$ generations have been undone and replaced by the secondary growth/decay factors:

$$w^s_i = w_i \prod_{j=1}^{N_{\rm proj}} \frac{\exp\left[S_s(\mathbf{R}^{s\prime}_j, \mathbf{R}^s_j, \tau_s)\right]}{\exp\left[S(\mathbf{R}'_j, \mathbf{R}_j, \tau)\right]}. \qquad (12)$$
The last step keeps the primary and the secondary walk correlated since the secondary and primary weights only differ over the last $N_{\rm proj}$ generations. $N_{\rm proj}$ is chosen large enough to project out the secondary ground state, but small enough to avoid a considerable increase in the fluctuations. In the exponential factors, we introduced $\tau_s$ because the secondary moves are effectively proposed with a different time-step, $\tau_s$, in the drift-diffusion term of Eq. 9: if the molecule is for instance stretched, the secondary electronic coordinates are stretched and, to a first approximation, we are sampling a Gaussian with a larger width, or equivalently a larger time-step, in the drift-diffusion term. A sensible definition of $\tau_s$ is therefore $\tau_s = \tau \langle \Delta R_s^2 \rangle / \langle \Delta R^2 \rangle$, where $\Delta R$ is the displacement resulting from diffusion, and $\Delta R_s$ is the displacement needed to take the secondary walker from its drifted position to the position specified by the space-warp transformation. $\tau_s$ is computed over the first equilibration blocks of the DMC run.

In the limit of vanishing displacement, the difference of primary and secondary energies and its statistical error vanish linearly, so the force and its error are well behaved.
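The bookkeeping behind step (3) and the secondary time-step can be sketched as follows. This is an illustrative fragment under our own assumptions, not the authors' code: the function names are hypothetical, and the growth/decay exponents $S$ and $S_s$ of each generation are assumed to have been stored in lists (oldest first) by the surrounding DMC program.

```python
import math

def secondary_time_step(tau, mean_dr2, mean_drs2):
    """tau_s = tau * <dR_s^2> / <dR^2>, with both averages supplied by the caller."""
    return tau * mean_drs2 / mean_dr2

def secondary_weight(w_primary, s_primary, s_secondary, n_proj):
    """Secondary weight built from the primary one.

    Over the last n_proj generations the primary growth/decay factors exp(S)
    are divided out and the secondary ones exp(S_s) are multiplied in.
    s_primary[j] and s_secondary[j] are the exponents for generation j.
    """
    start = max(0, len(s_primary) - n_proj)
    exponent = sum(ss - sp
                   for sp, ss in zip(s_primary[start:], s_secondary[start:]))
    return w_primary * math.exp(exponent)

# Made-up exponents for four generations of one walker.
s_hist = [0.01, -0.02, 0.03, 0.00]
ss_hist = [0.02, -0.01, 0.01, 0.01]
w_s = secondary_weight(2.0, s_hist, ss_hist, n_proj=2)
tau_s = secondary_time_step(0.01, mean_dr2=0.030, mean_drs2=0.033)
print(w_s, tau_s)
```

Only the most recent `n_proj` generations enter the ratio, which is what keeps the primary and secondary weights close to each other.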
4 Secondary geometry wave functions
We considered three choices for secondary geometry wave functions:
(1) The secondary wave functions have the same parameters $\{p\}$ as the primary one but the coordinates are relative to the new nuclear positions: $\psi_s(\mathbf{R}_i, \mathbf{R}^s_\alpha) = \psi(\mathbf{R}_i, \mathbf{R}^s_\alpha, p_s)$ with $p_s = p$, possibly with the minimal changes required to impose the cusp conditions.

(2) The secondary geometry wave functions at warped electron positions are related to the primary ones at the original positions, $\psi_s(\mathbf{R}^s_i, \mathbf{R}^s_\alpha) = \psi(\mathbf{R}_i, \mathbf{R}_\alpha, p)/\sqrt{J(\mathbf{R}_i)}$. This wave function depends on the transformation (it was used in Ref. [9b] with a different transformation) and has the advantages that the weights $W_i$ in (Eq. 2) are unity and that the transformation of Eq. 4 maps the nodes of the primary geometry wave function onto those of the secondary geometry wave function. Consequently, if the primary geometry walkers do not cross nodes, neither do the corresponding secondary geometry walkers. Surprisingly, it gives larger fluctuations of the energy differences than choice (1).

(3) $\psi_s(\mathbf{R}_i, \mathbf{R}^s_\alpha) = \psi(\mathbf{R}_i, \mathbf{R}^s_\alpha, p_s)$ with reoptimized parameters $p_s$. This choice gives the smallest fluctuation of the energy differences and the best potential energy surface.

For choices (1) and (3), since the nodes of the primary wave function do not map onto the nodes of the secondary wave function, it is possible to encounter configurations for which the primary and secondary geometries (and consequently the corresponding weights) become very different from each other, resulting in a loss of correlation. This could be cured by placing bounds on the differences, but in this paper we have not done this because for the systems and wave functions employed this was not a serious problem.

In reoptimizing the secondary geometry wave functions, we follow a particular procedure to keep the local energies of the primary and secondary wave functions, for each MC configuration and corresponding mapped configurations, as closely correlated as possible.
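The claim made for choice (2), that the weights become unity, follows in one line. The check below assumes that the weights of the warped walk are proportional to the squared wave-function ratio multiplied by the Jacobian $J$ of the transformation, and that $J$ is finite and non-zero:

```latex
\left| \frac{\psi_s(\mathbf{R}^s_i)}{\psi(\mathbf{R}_i)} \right|^2 J(\mathbf{R}_i)
  = \left| \frac{\psi(\mathbf{R}_i)/\sqrt{J(\mathbf{R}_i)}}{\psi(\mathbf{R}_i)} \right|^2 J(\mathbf{R}_i)
  = \frac{1}{J(\mathbf{R}_i)}\, J(\mathbf{R}_i)
  = 1
\quad\Longrightarrow\quad
W_i = \frac{N_{\rm conf} \cdot 1}{\sum_{j=1}^{N_{\rm conf}} 1} = 1 .
```

The same relation shows why the nodes map onto each other: $\psi_s(\mathbf{R}^s_i)$ vanishes exactly where $\psi(\mathbf{R}_i)$ does.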
The parameters in the Jastrow factor of the primary wave function are first optimized using the variance minimization method [13], i.e., the variance of the local energy is minimized over a set of MC configurations sampled from the square of the best wave function available before we start the optimization. When we reoptimize the Jastrow parameters of the secondary wave function, we employ the same configurations used to optimize the primary one, but now properly mapped through the space-warp transformation (Eq. 4). If one uses as a starting point the primary Jastrow recentered and only reoptimizes the linear parameters in the Jastrow (but not the parameter K of Ref. [14]), only a very small number of optimization steps are needed to reoptimize the Jastrow factor of the secondary wave
function. While this scheme always works reliably for the reoptimization of the Jastrow parameters, we sometimes have difficulty keeping the primary and secondary wave functions correlated if we optimize the determinantal part as well. Therefore, for the determinantal part, we choose to simply recenter the determinants, and follow the above procedure only for the optimization of the Jastrow component. On the other hand, if the determinantal part of the wave function is obtained using a quantum chemistry package, we can simply reoptimize the determinantal part by rerunning the Hartree-Fock (HF) or multi-configuration self-consistent-field (MCSCF) calculation for the secondary geometries. However, in this case, the determinantal parts of both the primary and the secondary geometry wave functions are not optimized in the presence of the Jastrow factor.

We calculate all molecules with choice (1) but also demonstrate the superiority of choice (3) for all-electron B$_2$ and pseudo Si$_2$. For B$_2$, we reoptimize only the Jastrow part while, for Si$_2$, we present also results with the determinantal part reoptimized using the quantum chemistry package GAMESS [15].

5 Results
The algorithms presented in the previous sections are tested in all-electron calculations of the first-row homonuclear dimers and in pseudopotential calculations of the Si$_2$ dimer.

For the all-electron dimers, the primary wave functions [14] were optimized close to the experimental bond length by the variance minimization method. The potential energy curves were obtained with correlated sampling from ten geometries, using the warp transformation and recentered secondary geometry wave functions (choice (1) above). Values of $N_{\rm proj}\tau$ of 5-10 H$^{-1}$ were sufficient to project out the secondary wave functions.

Before presenting results for the bond lengths and vibration frequencies, we discuss some of the tests we performed on our method. We first verified that, over a large range of atomic displacements, computing the energy difference between primary and secondary geometries with our correlated sampling method is more efficient than performing independent runs. In Fig. 1, we show the efficiency gain obtained with our VMC and DMC correlated sampling algorithm for Li$_2$. The gain is given by the ratio of the root-mean-square fluctuations of the relative energy obtained with correlated sampling and the one derived from two separate runs. The gain diverges as the secondary geometry approaches the primary one, and is more than a factor of 30 even for displacements as large as ±0.2 a.u.

Figure 1. VMC and DMC efficiency gain for Li$_2$ from using correlated sampling to compute the energy difference between the primary and secondary geometries, instead of performing independent runs. The gain is plotted on a logarithmic scale against $\Delta R$ (a.u.).

To further ascertain the efficiency of our algorithm, we performed two additional calculations for B$_2$; in the first, we omitted the warp transformation, whereas in the second we employed reoptimized, rather than recentered, Jastrows in the secondary wave functions. In Fig. 2 (upper plot), we present the VMC root-mean-square fluctuations ($\sigma_{\rm VMC}$) of the relative energy of primary and secondary geometries divided by the atomic displacement, $\Delta E/\Delta R$, for B$_2$. Introducing the warp transformation yields a reduction of about a factor of 3.5-5 in $\sigma_{\rm VMC}$, which corresponds to a factor of 12-25 saving in computer time, since the computer time needed to reach a given statistical error scales as the square of the fluctuations. Moreover, $\sigma_{\rm VMC}$ is only slightly dependent on the secondary geometry used. As expected, a further reduction in $\sigma_{\rm VMC}$ is obtained when the space-warp transformation is used in combination with reoptimized, rather than recentered, secondary geometry wave functions. The space-warp transformation was found to be of even greater help for heavier molecules, e.g., for F$_2$ the reduction in the fluctuations was at least a factor of 3.5-10. In fact, in the absence of the space-warp transformation for F$_2$, it is even difficult to reliably estimate the statistical error of secondary geometries that differ considerably from the primary one.

It is clear that the gain in efficiency due to using the space-warp transformation is greater for all-electron than for pseudopotential calculations. To
Figure 2. VMC fluctuations ([...]). Panels: VMC rms fluctuations of $\Delta E/\Delta R$ for B$_2$ and for pseudo-Si$_2$, each plotted against $\Delta R$ (a.u.) for three cases: no warp with recentered $\psi$, warp with recentered $\psi$, and warp with reoptimized $\psi$.

Figure 3. Potential energy curve for B$_2$ in VMC and DMC, plotted against $R$ (a.u.). The three DMC curves are obtained with three different primary geometries (equilibrium, stretched by 0.2 a.u. and −0.2 a.u.) and using recentered wave functions. All curves are shifted with the energy at the equilibrium distance (arrow) defined as the zero. Atomic units are used.
answer the question "what, if any, gain is obtained in pseudopotential calculations", we calculate the force $\Delta E/\Delta R$ for the Si$_2$ dimer and plot the corresponding [...]

[...] with a guide function $\Psi_G$. The corresponding random walk consists of a diffusion step, a new drift step with a "velocity" $\nabla\Psi_G/\Psi_G$, and a reaction term with the "rate constant" $E_{\rm loc} - E_{\rm ref}$, where $E_{\rm loc} = H\Psi_G/\Psi_G$ is the local energy and $E_{\rm ref}$ a reference energy. After equilibration the distribution is $p(\mathbf{r}) = \Psi_0\Psi_G / \int \Psi_0\Psi_G\, d\tau$, where $\mathbf{r}$ denotes a position vector in configuration space. The exact ground state energy is extracted statistically from this distribution with the mixed estimator [20]

$$E_0 = \int E_{\rm loc}(\mathbf{r})\, p(\mathbf{r})\, d\tau = \lim_{N \to \infty} \frac{\sum_{i=1}^{N} w(\mathbf{r}_i)\, E_{\rm loc}(\mathbf{r}_i)}{\sum_{i=1}^{N} w(\mathbf{r}_i)}, \qquad (6)$$
where $\{\mathbf{r}_i\}$ is the collection of the position vectors of the sample and $w(\mathbf{r}_i)$ are their weights. Our DQMC implementation is based on the drift-diffusion algorithm as described by Reynolds et al. [21] with some of the modifications suggested by Umrigar and coworkers [16]. The details of our algorithm have been described previously [22,23].

Due to the required antisymmetry, the electronic ground state wave function has nodal hypersurfaces. Thus, $p(\mathbf{r})$ can be interpreted as a statistical distribution only when the nodes of $\Psi_G$ and $\Psi_0$ match. An approximation to the true ground state is obtained when the nodes of $\Psi_G$ are imposed on the
random walk resulting in the fixed-node approximation (FN-DQMC) [24]. The error

$$\Delta E_{\rm node} = E_0^{(\rm FN)} - E_0 \qquad (7)$$

is known as the node location error. If $\Psi_G$ satisfies the Pauli principle then the FN-DQMC solution $\Psi_0$ will also. In this paper we use guide functions of the form

$$\Psi_G = e^{U}\, \Phi, \qquad (8)$$

where $\Phi$ is a Slater determinant $\Phi = \det(\phi_\mu(i))$ whose orbitals $\phi_\mu(i)$ are linear combinations of basis functions (MO-LCAO)

$$\phi_\mu(i) = \sum_{k=1}^{K} C_{k\mu}\, \chi_k(i). \qquad (9)$$
As in previous work, the function $U$ is expanded in a short series [25]

$$U_{aij} = \sum_k c_{ka} \left( \bar{r}_{ai}^{\,l_{ka}}\, \bar{r}_{aj}^{\,m_{ka}} + \bar{r}_{aj}^{\,l_{ka}}\, \bar{r}_{ai}^{\,m_{ka}} \right) \bar{r}_{ij}^{\,n_{ka}}, \qquad (10)$$

where $a$ and $i, j$ refer to the nuclei and the electrons, respectively, and where $\bar{r}$ is a scaled distance. This function was introduced into DQMC by Schmidt and Moskowitz [26]. This general form of $U$ consists of two-body terms $U_{ai}$ and $U_{ij}$ and of three-body terms $U_{aij}$, where a few three-body "back-flow" terms are particularly effective in reducing the variance [26]. The parameters of $U$ are optimized by variance minimization with Monte Carlo methods [27].

In addition to DQMC calculations, we performed variational QMC (VQMC) calculations where the guide function is employed as a trial wave function in a stochastic evaluation of the Rayleigh quotient with a generalized Metropolis algorithm.

3 Scaling of the Algorithm
The scaling of both the DQMC and the VQMC algorithm with the system size is easily evaluated: it is given by the number of local energy evaluations times the scaling of one local energy evaluation. It is much more difficult to calculate the scaling of the method for a given accuracy or even a given statistical precision, but in this work we are concerned only with accelerating the local energy evaluation. The local energy calculation for a guide function of the form given above requires the following steps (for $n$ electrons and $K$ basis functions)
1. calculate atomic basis $\chi_k(i)$ and derivatives for all electrons: scales $O(Kn) \approx O(n^2)$
2. calculate orbitals $\phi_\mu(i) = \sum_k C_{k\mu}\, \chi_k(i)$: scales $O(Kn^2) \approx O(n^3)$
[...]

[...] $\phi_\mu(i)$ is dense, but step (3) can be improved with local orbitals. Since the Slater determinant is invariant with respect to unitary transformations of the occupied orbitals, localized MOs can be used instead of the canonical orbitals. For truly localized orbitals, only a constant number of basis functions contribute to the MO, irrespective of the electron position. Combined with the electron distance cut-off value, step (2) even becomes linear. With localized orbitals the matrix $\phi_\mu(i)$ and its derivatives are sparse with a linear number of non-zero matrix elements. Sparse matrix algorithms can be used to improve the scaling of step (3) [29,30]. Since the evaluation of the determinant is computationally inexpensive for systems accessible with QMC
so far, the sparse matrix evaluation of the determinants is not discussed in detail here.

The remaining $O(n^3)$ step is the evaluation of the three-body terms in the correlation function $U$. The correlation function is computationally expensive, so reducing its cost is important. Originally, Schmidt and Moskowitz used $\bar{r} = ar/(1 + ar)$ as scaled distance in Eq. (10). Since $\bar{r} \to 1$ for $r \to \infty$ the correlation contribution vanishes for large distances irrespective of the expansion form in Eq. (10). Due to the slow $1/r$ dependence of $\bar{r}$, no term of the expansion can be neglected for the system sizes considered here. A strong computational reduction can be achieved when $\bar{r} = ar/(1 + ar)$ is replaced by a faster decaying function. We use here $\bar{r} = 1 - \exp(-ar)$ and obtain with Eq. (10) a sum over "local" correlation terms where a cut-off radius is easily found. A similar exponential form has been used by Mitas [31]. In our calculations the modified Schmidt/Moskowitz form yielded variances comparable to the original form.

A local DQMC (LDQMC) calculation thus requires the following steps:
1. HF-SCF calculation
2. localization of the occupied canonical orbitals
3. optimization of the parameters of a local correlation function $U$
4. LDQMC runs to the requested statistical accuracy
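The difference between the two scaled distances discussed above can be made concrete by asking at what radius each one has come within a given tolerance of its limiting value 1. The sketch below is only an illustration: the function names and the tolerance `tol` are our own choices, and the value `a = 2` is taken as representative.

```python
import math

def rbar_pade(r, a):
    """Scaled distance a r / (1 + a r)."""
    return a * r / (1.0 + a * r)

def rbar_exp(r, a):
    """Scaled distance 1 - exp(-a r)."""
    return 1.0 - math.exp(-a * r)

def cutoff_radius_pade(a, tol):
    """Smallest r with 1 - rbar_pade(r, a) <= tol, from 1 / (1 + a r) = tol."""
    return (1.0 / tol - 1.0) / a

def cutoff_radius_exp(a, tol):
    """Smallest r with 1 - rbar_exp(r, a) <= tol, from exp(-a r) = tol."""
    return -math.log(tol) / a

a, tol = 2.0, 1.0e-4
r_exp = cutoff_radius_exp(a, tol)
r_pade = cutoff_radius_pade(a, tol)
print("exponential form: cut-off at %.3f bohr" % r_exp)
print("rational form:    cut-off at %.1f bohr" % r_pade)
```

For these assumed values the exponential form reaches the tolerance within a few bohr, whereas the rational form needs thousands of bohr, which is why only the exponential form lets distant pairs be dropped.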
4 Test Calculations for Linear Hydrocarbons
To test the new concept we carried out LDQMC and standard DQMC calculations for a series of hydrocarbons. No attempt was made to obtain energies with chemical accuracy. The geometries for the linear hydrocarbons C$_n$H$_{2n+2}$, $n = 2, 5, 8, 10, 15$, were calculated at the B3LYP/cc-pVDZ level. The canonical orbitals were obtained with Hartree-Fock and localized using the Boys procedure [32] with Gaussian98 [33]. In spite of the large number of primitive Gaussians in the 1s orbital of the cc-pVDZ basis set, the variance increases drastically compared to a Slater-type orbital basis. This increase is caused by the missing cusp of the orbital at a vanishing electron-nucleus distance. This problem is circumvented by interpolating from the correct exponential at the nucleus to the contracted Gaussian form for small electron-nucleus distances. The details of the cubic spline interpolation are discussed elsewhere [34]. The sparse matrix determinant evaluation was implemented, but not used for the calculation of the data shown below. In the Boys localization procedure the
36
i
|
,
|
,
140
120
p _
1
'"
IO--0 standard DQMC | 0 - 0 local DQMC
i
-
i i
100
i
-
e
*-' g. u
-
i
22 i i
80
—
i i
> 1 60
i 1
e
i
/
~
X> /
—
40 20 1
0
50
100
150
200
# electrons Figure 1. Scaling of LDQMC vs. standard DQMC. Cpu times for linear C n H 2 n + i relative to C 2 H 6 .
local orbitals are constrained to remain pairwise orthogonal and, therefore, the LCAO coefficients do not vanish fully at atoms distant from the center of the local orbital. To get the benefit of spatially localized orbitals in our algorithm, we neglected AOs centered at atoms not adjacent to the atoms (for bond orbitals) or the atom (for atom-centered orbitals) participating in the localized orbital. The implementation of LDQMC is based on lists containing the AOs contributing to the localized orbitals and lists containing the non-zero elements of each column in the Slater determinant.

In the correlation function we used $\bar{r} = 1 - \exp(-ar)$. The most important correlation term is the electron cusp term $U_{ij} = c\,\bar{r}_{ij}$. The electron cusp condition [7] is fulfilled for $c = 1/(2a)$. The value of $a$ has been optimized for each system, yielding values of $a$ near two. For all other terms $a$ has been fixed to $a = 3$. The cut-off radius is $r = 5.0$ bohr for the cusp term and 3.5 bohr for all other terms.

In Figure 1, the relative CPU times for the LDQMC and the standard DQMC method are compared for a fixed number of steps.
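The list-based evaluation of a localized orbital described above can be sketched as follows. This is an illustration under our own assumptions, not the authors' implementation: the chain of six atoms with one AO each, the coefficient values, the AO values at the electron position and all function names are made up for the example.

```python
def dense_orbital(coeffs, chi):
    """phi = sum_k C_k chi_k over all K basis functions."""
    return sum(c * x for c, x in zip(coeffs, chi))

def build_ao_list(coeffs, ao_atom, kept_atoms):
    """List of (k, C_k) for the AOs centered on atoms in kept_atoms."""
    return [(k, c) for k, c in enumerate(coeffs) if ao_atom[k] in kept_atoms]

def local_orbital(ao_list, chi):
    """phi evaluated from the short AO list only."""
    return sum(c * chi[k] for k, c in ao_list)

# Hypothetical chain of six atoms (0..5), one AO per atom.
ao_atom = [0, 1, 2, 3, 4, 5]
# A bond orbital between atoms 2 and 3 with small orthogonalization tails.
coeffs = [0.001, 0.02, 0.7, 0.7, 0.02, -0.001]
# Keep the bonded atoms and their neighbours; drop everything further away.
ao_list = build_ao_list(coeffs, ao_atom, kept_atoms={1, 2, 3, 4})

# Made-up AO values at one electron position.
chi = [0.05, 0.2, 0.9, 0.8, 0.3, 0.04]
phi_dense = dense_orbital(coeffs, chi)
phi_local = local_orbital(ao_list, chi)
print(phi_dense, phi_local, abs(phi_dense - phi_local))
```

The list has a fixed length however long the chain becomes, so the cost per orbital no longer grows with the number of basis functions; the price is the small truncation error from the discarded tails.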
Figure 2. Speed-up of LDQMC compared to standard DQMC for linear C$_n$H$_{2n+2}$, plotted against the number of electrons.
In Figure 2 the speed-up is shown. The speed-up is basically linear with some overhead for the smaller molecules and a slight deviation for C$_{15}$H$_{32}$, indicating the increasing contribution of the $O(n^3)$ determinant evaluation step. For even larger molecules the sparse matrix option should be used. A careful analysis of the errors due to the various cut-off radii is necessary for accurate LDQMC calculations. Such an analysis is currently being carried out by the present authors using correlated sampling techniques.

5 Conclusion
The diffusion quantum Monte Carlo method treats the electron correlation explicitly through a random walk of all electron positions. The local energy evaluation for the guide function $\Psi_G$ at the current electron positions offers the possibility of neglecting many terms when the Slater determinant is expressed in terms of localized orbitals instead of the canonical orbitals. Similarly, a localized form of the correlation function $U$ is introduced that does not lead to a loss in the variance compared to the standard long-range form. The localized variant of DQMC is denoted LDQMC and is shown to improve the scaling of the method by an order of magnitude, thus facilitating QMC calculations on larger systems.
Acknowledgments
A. L. gratefully acknowledges funding by Deutsche Forschungsgemeinschaft (DFG).
References
1. J. C. Grossman, L. Mitas, and K. Raghavachari, Phys. Rev. Lett., 75, 3870 (1995).
2. W. M. C. Foulkes, L. Mitas, R. J. Needs, and G. Rajagopal, Rev. Mod. Phys., 73, 33 (2001).
3. A. Lüchow and J. B. Anderson, Annu. Rev. Phys. Chem., 51, 501 (2000).
4. J. B. Anderson, Rev. Comput. Chem., 13, 133 (1999).
5. D. M. Ceperley and L. Mitas, Adv. Chem. Phys., 93, 1 (1996).
6. J. B. Anderson, Int. Rev. Phys. Chem., 14, 85 (1995).
7. B. L. Hammond, W. A. Lester, Jr., and P. J. Reynolds, Monte Carlo Methods in Ab Initio Quantum Chemistry, World Scientific, Singapore, 1994.
8. P. Pulay, Chem. Phys. Lett., 100, 151 (1983).
9. S. Saebø and P. Pulay, Annu. Rev. Phys. Chem., 44, 213 (1993).
10. S. Saebø and P. Pulay, J. Chem. Phys., 86, 914 (1987).
11. P. Pulay and S. Saebø, Theoret. Chim. Acta, 69, 357 (1986).
12. C. Hampel and H.-J. Werner, J. Chem. Phys., 104, 6286 (1996).
13. M. Schütz, G. Hetzer, and H.-J. Werner, J. Chem. Phys., 111, 5691 (1999).
14. S. A. Chin, Phys. Rev. A, 42, 6991 (1990).
15. C. J. Umrigar, Phys. Rev. Lett., 71, 408 (1993).
16. C. J. Umrigar, M. P. Nightingale, and K. J. Runge, J. Chem. Phys., 99, 2865 (1993).
17. M. Mella, A. Lüchow, and J. B. Anderson, Chem. Phys. Lett., 265, 467 (1997).
18. N. Metropolis and S. Ulam, J. Am. Stat. Assoc., 44, 335 (1949).
19. J. B. Anderson, J. Chem. Phys., 63, 1499 (1975).
20. R. C. Grimm and R. G. Storer, J. Comp. Phys., 7, 134 (1971).
21. P. J. Reynolds, D. M. Ceperley, B. Alder, and W. A. Lester, J. Chem. Phys., 77, 5593 (1982).
22. A. Lüchow and J. B. Anderson, J. Chem. Phys., 105, 7573 (1996).
23. A. Lüchow and R. F. Fink, J. Chem. Phys., 113, 8457 (2000).
24. J. B. Anderson, J. Chem. Phys., 65, 4121 (1976).
25. S. F. Boys and N. C. Handy, Proc. R. Soc. London A, 310, 43 (1969).
26. K. E. Schmidt and J. W. Moskowitz, J. Chem. Phys., 93, 4172 (1990).
27. H. Conroy, J. Chem. Phys., 41, 1331 (1964).
28. Y. Shlyakhter, S. Sokolova, A. Lüchow, and J. B. Anderson, J. Chem. Phys., 110, 10725 (1999).
29. T. A. Davis and I. S. Duff, SIAM J. Matrix Anal. Appl., 18, 140 (1997).
30. I. S. Duff, A. M. Erisman, and J. K. Reid, Direct Methods for Sparse Matrices, Clarendon Press, Oxford, 1986.
31. L. Mitas, in: D. P. Landau, K. K. Mon, and H. B. Schüttler (eds.), Computer Simulation Studies in Condensed-Matter Physics V, Springer-Verlag, Berlin, 1993.
32. S. F. Boys, Rev. Mod. Phys., 32, 296 (1960).
33. M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, V. G. Zakrzewski, J. A. Montgomery, Jr., R. E. Stratmann, J. C. Burant, S. Dapprich, J. M. Millam, A. D. Daniels, K. N. Kudin, M. C. Strain, O. Farkas, J. Tomasi, V. Barone, M. Cossi, R. Cammi, B. Mennucci, C. Pomelli, C. Adamo, S. Clifford, J. Ochterski, G. A. Petersson, P. Y. Ayala, Q. Cui, K. Morokuma, D. K. Malick, A. D. Rabuck, K. Raghavachari, J. B. Foresman, J. Cioslowski, J. V. Ortiz, A. G. Baboul, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, R. Gomperts, R. L. Martin, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, C. Gonzalez, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, J. L. Andres, C. Gonzalez, M. Head-Gordon, E. S. Replogle, and J. A. Pople, Gaussian 98, revision A.7, Gaussian, Inc., Pittsburgh, PA, 1998.
34. S. Manten and A. Lüchow, to be published.
A REMEDY FOR THE NEGATIVE SIGN PROBLEM IN THE AUXILIARY FIELD QUANTUM MONTE CARLO METHOD
Yoshihiro Asai
Research Institute for Computational Sciences (RICS), National Institute of Advanced Industrial Science and Technology (AIST), Central 2, Umezono 1-1-1, Tsukuba, Ibaraki 305-8568, Japan
E-mail: [email protected]
The auxiliary field quantum Monte Carlo (AFQMC) method is one of the most rigorous numerical methods for strongly correlated electron systems (SCES) in solids. The method does not include any approximations, and the possible errors are limited to the Trotter error and the statistical error, both of which could in principle be made arbitrarily small if there were no negative sign problem. The method was originally developed in studies of SCES models such as the Hubbard model, the Kondo model, and the Anderson model, but we now know that it can be applied more generally to ab initio problems. Like other quantum Monte Carlo methods, it also suffers from the negative sign problem. However, it has turned out in recent years that a remedy can be devised to reduce the negative sign ratio without much loss of accuracy. Here we discuss the nature of the negative sign problem in the AFQMC method and how we circumvent the problem.
1
Introduction
Many-body effects play quite an important role in statistical physics and condensed matter physics. They have a large influence on phase diagrams and critical phenomena. Strong correlation effects among electrons have been at the heart of the state of the art in these fields. There still remain unresolved problems such as high-$T_c$ superconductivity (Ref. 1). Demands for quantum simulation technology are rapidly increasing, both to overcome such difficult problems and to take more chemical reality into account. Quantum Monte Carlo methods may be one of the most promising methods to fulfil these demands. The auxiliary field quantum Monte Carlo (AFQMC) method (Refs. 2, 3, 4, 5) was developed to study strongly correlated electron systems (SCES) models such as the Hubbard model;
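$$H = T + V_2, \qquad T = -t \sum_{\langle i,j\rangle,\, s} \left( c^{\dagger}_{is} c_{js} + \mathrm{H.c.} \right), \qquad V_2 = U \sum_{i} n_{i\uparrow} n_{i\downarrow}, \tag{1}$$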
where $\langle i,j\rangle$ indicates that the sum is taken over the pairs of nearest-neighbor sites, $s$ is the spin variable, and $c^{\dagger}_{is}$ is the Fermion operator on the Wannier orbital centered on the $i$-th site. $U$ is the on-site Coulomb repulsion defined by the one-center two-electron integral between the same Wannier orbital, but we often estimate it by fitting to experimental results because of the difficulty of ab initio estimation of the screening effects in solids. We put $t = 1$. The method has been applied to the Kondo model and the Anderson model as well. Its application is now expanding to ab initio problems. Application of the AFQMC method to the ab initio Hamiltonian of small molecules (Ref. 6), as well as the combined use of the AFQMC method, the pseudopotential method and the plane wave basis functions frequently used in the density functional band theory of solids, has turned out to be successful (Refs. 7, 8). This direction may grow into one of the most important fields of ab initio electronic structure theory in solids. It should be noted, however, that these models already include the essential low-energy physics of SCES in solids. A very detailed comparison with an experiment strongly suggests that the two-dimensional Hubbard model is a reasonable model of the cuprate high-$T_c$ superconductors (Ref. 9). If we use the Trotter formula, the density matrix defined as $\langle\Psi_t|e^{-\beta H}|\Psi_t\rangle$ ($\beta$ is the projection time) can be divided into $L$ fragments;
$$\langle\Psi_t|e^{-\beta H}|\Psi_t\rangle \simeq \langle\Psi_t|\left[e^{-\Delta\tau T} e^{-\Delta\tau V_2}\right]^{L}|\Psi_t\rangle, \tag{2}$$
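where $\Delta\tau = \beta/L$ and $\Psi_t$ is a trial wavefunction that is not orthogonal to the exact wavefunction $\Psi_G$ ($\langle\Psi_t|\Psi_G\rangle \neq 0$). Because of the presence of the bilinear term, the Hubbard model cannot be solved easily. The bilinear term can be linearized by introducing a summation over the auxiliary field;

$$e^{-\Delta\tau U n_{\uparrow} n_{\downarrow}} = \frac{1}{2} \sum \exp[2a \cdots]$$

[...]

$$\sum_{\sigma(1)\cdots\sigma(L)} W_{\uparrow} W_{\downarrow}, \qquad W_{\uparrow} = \det(\cdots), \qquad W_{\downarrow} = \det(\cdots)$$

[...]

$$|\Psi_G\rangle = \lim_{\beta\to\infty} \int d\psi\, f(\psi,\beta)\,|\psi\rangle. \tag{11}$$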
The density matrix is then given by $\langle\Psi_t|e^{-\beta H}|\Psi_t\rangle = \langle\Psi_t|\Psi_G\rangle$. Fahy and Hamann derived a formal diffusion equation for $f$ and suggested that the negative sign problem when $\beta \to \infty$ may be related to the distribution function $f$ being dominated by the even-parity lowest eigenfunction $f^{+}$. The physically meaningful contribution comes from the odd-parity second-lowest eigenfunction $f^{-}$. Such a contribution becomes small as we increase $\beta$. When the distribution function $f$ is even with respect to $\psi$, the simulation fails to give a nontrivial eigenfunction of the Hamiltonian but gives $|\Psi_G\rangle = 0$, hence the density matrix is zero and $\langle\mathrm{Sign}\rangle = \sum W / \sum |W| = 0$. Conversely, when $\langle\mathrm{Sign}\rangle = 0$, $\langle\Psi_t|\Psi_G\rangle = 0$. This leads to two cases: (i) $|\Psi_G\rangle = 0$ or (ii) $|\Psi_G\rangle \perp |\Psi_t\rangle$. The former case means that the distribution function $f$ has even parity. The latter case is excluded by definition. Hence, $\langle\mathrm{Sign}\rangle = 0$ is equivalent to the even-parity nature of the distribution function $\lim_{\beta\to\infty} f(\psi,\beta)$. What we want to do is to control the simulation to avoid $f^{+}$ without modifying $f^{-}$.
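The average sign used in this argument is simply the ratio of the signed to the unsigned sum of sample weights. A minimal sketch follows; the helper name `average_sign` is hypothetical and the weights are made-up illustrative numbers, not simulation data. It shows how sets of exactly cancelling weights drive the average sign to zero:

```python
def average_sign(weights):
    """Return <Sign> = sum(W) / sum(|W|) for a list of signed weights W."""
    denom = sum(abs(w) for w in weights)
    if denom == 0.0:
        raise ValueError("all weights are zero")
    return sum(weights) / denom


# Illustrative weights only (assumed values, not taken from any simulation).
all_positive = [0.5, 1.0, 0.25]        # no sign problem
cancelling = [0.5, -0.5, 1.0, -1.0]    # pairs of cancelling weights
mixed = all_positive + cancelling      # both kinds of samples together

if __name__ == "__main__":
    print(average_sign(all_positive))
    print(average_sign(cancelling))
    print(average_sign(mixed))
```

In the mixed case the cancelling samples leave the numerator unchanged but inflate the denominator, so they add statistical noise without adding signal.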
3
The adaptive sampling method
Here we generalize the idea of Fahy and Hamann and use it to circumvent the negative sign problem. The equation;

$$\sum_{\sigma} W_{\sigma}(\tau) = \int f(\psi,\tau)\,\langle\Psi_t|\psi\rangle\, d\psi, \tag{12}$$
means that avoiding the even-parity distribution function is equivalent to removing all the sets of cancelling fields such that $\sum_{\sigma} W_{\sigma}(\tau) = 0$ from the accepted samples. This may be done most efficiently if we notice the following mathematical property. A subset $D$ of importance-sampled configurations of auxiliary fields $\sigma$ is defined such that

$$\sum_{\sigma \in D} \langle\Psi_t|U_{\sigma}(\tau,0)|\Psi_t\rangle = 0. \tag{13}$$
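Hence, the following distribution function;

$$f_D(\psi,\tau) = \int_D \delta\big(|\psi\rangle - U_{\sigma}(\tau,0)|\Psi_t\rangle\big)\, d\sigma, \tag{14}$$

whose integral is limited to configurations of the auxiliary field $\{\sigma(1), \cdots,$ [...] further propagation does not change the value of the sum;

$$\sum_{\sigma \in D} \langle\Psi_t|e^{-(\beta-\tau)H}|U_{\sigma}(\tau,0)\Psi_t\rangle = 0. \tag{15}$$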
This is because $f_D(\psi,\tau)$ has even parity. Thus, if we devise a method to remove the contribution from $D$ at every $\tau$, we may be able to remove all the cancelling fields and hence avoid $f^{+}$ most efficiently. It should be noted that the standard probability weight function given by Eq.(8) does not provide a well-defined distribution function $f_D(\psi,\tau)$ for $0 < \tau < \beta$. This is because the probability weight function to update $\sigma(l_\tau)$; $|\langle\Psi_t|U_{\sigma(L)} \cdots U_{\sigma(l_\tau)} \cdots U_{\sigma(1)}|\Psi_t\rangle| / \sum_{\sigma} |W|$ is not a unique function of [...]. It will not perturb the short-$\tau$ distribution function and the odd-parity distribution function where all the accepted samples are positively signed. The accuracy of the approximations depends on the boundary behavior of $f^{+}$ near the node. Our choice of the probability weight function reduces errors due to the wrong position of the imposed node at $\Psi_t$ rather than $\Psi_G$. Because of the parity, there should be a region such that $f^{-}(\psi) = 0$ for $|\psi - \Psi_G| < \epsilon$, where $\epsilon$ is a given radius. This is an exact property because the positive projection condition Eq.(17) does not change the odd-parity distribution function. At short imaginary time $\tau$, where $f^{-}$ is relatively dominant in the distribution function (Ref. 10), the contribution to the integral;
$$\int d\psi\, f(\psi,\tau)\,|\psi\rangle, \tag{18}$$
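from the circle $|\psi - \Psi_G| < \epsilon$ is negligible; $f(\psi) \simeq 0$ for $|\psi - \Psi_G| < \epsilon$. Further propagation;

$$\lim_{\beta\to\infty} \int d\psi\, e^{-(\beta-\tau)H} f(\psi,\tau)\,|\psi\rangle \simeq \int d\psi\, e^{-(\beta-\tau)E_G}\,|\Psi_G\rangle\langle\Psi_G|\psi\rangle\, f(\psi,\tau) \tag{19}$$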
does not revive the contribution. Hence we may neglect the contribution from the distribution function $\lim_{\beta\to\infty} f = f^{+}$ within the circle. It should be noted that $\lim_{\beta\to\infty} f = f^{+}$ is modified by Eq.(16) and Eq.(17) in the ASQMC simulation. If $\epsilon$ [...] Numerical results given in Sec.5 as well as those given in Ref.12 indicate that the error is always less than 0.05%, including the most difficult case. The numerical results support our presumption. Zhang, Carlson and Gubernatis simultaneously used the positive projection condition given by Eq.(17) as well as the modified probability weight given by Eq.(16) in their constrained path Monte Carlo method. Their approximation is in between the Fahy-Hamann approximation and ours. The discussion given in Sec.3 suggests that the constrained path approximation becomes equivalent to ours in the virtual limit of $\Delta\tau \to 0$, although some flaws may easily come into real calculations. It may be possible to say that the adaptive sampling