Efficient Machine Learning Force Field for Large-Scale Molecular Simulations of Organic Systems
To address the computational challenges of ab initio molecular dynamics and the accuracy limitations of empirical force fields, machine learning force fields (MLFFs) have proven effective in various systems, including metals and inorganic materials. In large-scale organic systems, however, the application of MLFFs is often hindered by the complexity of long-range intermolecular interactions and molecular conformations, as well as by instability in long-time molecular simulations. We therefore propose a universal multiscale higher-order equivariant model combined with active learning techniques that efficiently captures complex long-range intermolecular interactions and molecular conformations. Compared with existing equivariant models, our model achieves the highest predictive accuracy together with significant improvements in computational speed and memory efficiency. In addition, a bond length stretching method is designed to improve the stability of long-time molecular simulations. Using only 901 samples from a dataset of 120-atom configurations, our model successfully extends high precision to systems with hundreds of thousands of atoms. These achievements guarantee high predictive accuracy, fast simulation speed, minimal memory consumption, and robust simulation stability, satisfying the requirements for high-precision, long-time molecular simulations of large-scale organic systems.
Introduction
Molecular dynamics (MD) simulation has gained significant attention in recent years across various disciplines, spanning physics, chemistry, biology, and materials science. This cutting-edge technology, by simulating interactions between molecules or atoms, offers researchers a means to investigate the microstructure of substances and understand their macroscopic properties. This technique has proved invaluable for experimental design, development of new materials, and advances in biomedical research.1–4
Traditional molecular simulations grapple with the dilemma of balancing the high computational cost of ab initio molecular dynamics (AIMD) against the low precision of empirical force fields. A resolution to this challenge is found in the application of machine learning, leveraging its powerful fitting capabilities. The fundamental idea of the machine learning force field (MLFF) is to establish a mapping from molecular coordinates to the labels of high-precision quantum chemistry data, including potential energy and forces. This approach eliminates the need to solve the intricate Schrödinger equation, resulting in significant acceleration and achieving a balance between prediction precision and simulation speed.5
Since the advent of BPNN6 in 2007, numerous MLFF models have been proposed to improve prediction accuracy, and their performance has been systematically investigated on public datasets (MD17,7 MD22,8 and OC229). From the perspective of tensor order (denoted by ℓ), existing 3D molecular representation learning models can be categorized into two main classes: invariant graph neural networks with only scalar features (i.e., ℓ = 0), including SchNet,10 DeePMD,11 DTNN,12 PhysNet,13 ComENet,14 and SphereNet;15 and equivariant graph neural networks with vector features (i.e., ℓ = 1), including EGNN,16 PaiNN,17 GVP-GNN,18 and EQGAT,19 as well as those with higher-order features (i.e., ℓ ≥ 2), such as TFN,20 Cormorant,21 SEGNN,22 Equiformer,23 NequIP,24 Allegro,25 BOTNet,26 and MACE.27 Specifically, invariant methods directly use invariant geometric features such as distances and angles as input, ensuring invariance to rotation and translation transformations of the input molecules. In contrast, equivariant models maintain certain symmetries or properties under specific transformations such as rotation and translation.28 Models with equivariance properties for ℓ ≥ 1 generally outperform invariant ones on various public datasets and tests.16,24,29,30 Furthermore, in terms of the expressive power of geometric graph neural networks, higher-order equivariant models are more expressive than first-order equivariant models.31 Increasing the equivariant order often leads to improved model accuracy.27,32,33 Therefore, higher-order equivariant force field models such as NequIP, Allegro, and MACE achieve excellent performance on multiple MD simulation metrics, with nearly a 1000-fold improvement in data efficiency.24,25,30 However, higher-order equivariant models exhibit suboptimal MD simulation performance in terms of tractable system size and simulation speed.
For instance, a single graphics processing unit (GPU) can only achieve a simulation speed of approximately 2.5 ns per day for a system with around 1000 atoms.25,34 Numerous optimization strategies have been proposed to address these performance issues.35–37 However, these methods frequently encounter limited applicability or poor accuracy. These challenges underscore the need for a more universal and highly accurate acceleration method tailored specifically for higher-order equivariant models.
High-accuracy MLFFs (with errors in atomic energy and forces of around 1 meV/atom and 50 meV/Å, respectively38,39) have been successfully applied to diverse systems such as metallic and nonmetallic inorganic materials.40 However, owing to the weak generalization ability of machine learning models,41 the application of MLFFs in large-scale organic systems is usually challenged by several impediments, such as the complexity of long-range intermolecular interactions and molecular conformations, as well as instability in long-time molecular simulations. Consequently, molecular simulations based on MLFFs often fail (e.g., through loss of accuracy, bond breaking, or atomic overlap) when encountering scenarios not present in the training set.30,42
Expanding the receptive field of neural network models by appropriately increasing the number of interaction layers has been proven to enhance the predictive accuracy of MLFF in multimolecular interaction systems.43 However, increasing the number of interaction layers results in higher computational costs and can lead to over-smoothing issues that diminish the expressive capability of the machine learning model.44 In fact, expanding the model’s receptive field by increasing the number of interaction layers does not guarantee accurate characterization of long-range intermolecular interactions.29
The key to accurately capturing long-range intermolecular interactions is to increase the cutoff-radius hyperparameter of the neural network model. However, the number of neighboring atoms grows cubically with the cutoff radius, resulting in a significant increase in simulation time and memory consumption. This limitation largely hampers the application of MLFFs in long-time MD simulations of large-scale organic systems. Consequently, many models adopt modular methods to account for long-range intermolecular interactions. A fragment-based approach43 has been used to treat long-range intermolecular interactions; however, it lacks generality and may require individual fragmentation schemes for specific molecules. Therefore, an end-to-end method that can accurately capture long-range intermolecular interactions in organic systems is needed.
In addition, the extensive conformational variability of organic compounds requires that the training set ensures not only quantity but also diversity.45 Unfortunately, AIMD proves to be prohibitively expensive to generate a large number of diverse conformations. Consequently, an efficient method for collecting training sets becomes imperative, emphasizing the need for nonredundant datasets to mitigate the labeling costs associated with quantum-mechanical calculations such as density functional theory (DFT). Moreover, the impracticality of preparing the training set from high-precision quantum DFT calculations for large organic systems further complicates the labeling process. As a result, the labeled datasets are typically derived from smaller systems.46 This limitation underscores the necessity of investigating whether MLFF methods can maintain high-precision generalization when applied to larger systems.
Finally, various methods have been proposed to address the issue of instability in long-time molecular simulations. For example, directly incorporating configurations not present in the training set can enhance the stability of the simulation,42 but it is impractical to exhaustively consider all possible configurations. Reducing the time step in MD is another strategy to decrease the incidence of simulation crashes.24,25 However, this approach comes at the cost of a proportional increase in simulation time. Although larger models with millions or more parameters may alleviate the issues mentioned above, they do not guarantee simulation stability.30 Therefore, a more efficient and straightforward MLFF model is required to ensure stability during long-time MD simulations.
To address the aforementioned challenges, we propose an end-to-end, highly efficient, and broadly applicable higher-order equivariant model to effectively handle the long-range intermolecular interactions. This method exhibits excellent competitiveness in various aspects, including prediction accuracy, simulation speed and memory consumption. Moreover, the incorporation of a bond length stretching technique significantly boosts the simulation stability of MLFF models in organic systems. In addition, an active learning approach based on committee queries is adopted to efficiently collect datasets. Notably, this study successfully extends high precision from a limited dataset comprising 901 samples with 120 atoms to larger systems encompassing ten thousand atoms. This achievement plays a pivotal role in facilitating long-time MD simulations in large-scale organic systems.
Computational Method
Equivariant model
Incorporating data symmetry into machine learning models can improve the efficiency of data collection and the generalization capability of the models.47 For atomic systems, if the coordinates of the system rotate, quantities like forces and dipole moments should rotate accordingly. More formally, a function φ: X → Y is equivariant under the action of a transformation group G if the following equation holds:

φ(ρ_X(g) x) = ρ_Y(g) φ(x)   (1)

In Equation (1), X and Y are two vector spaces, while x, g, and φ(x) are elements of X, G, and Y, respectively. ρ_X(g) and ρ_Y(g) are the transformation matrices parametrized by g acting on X and Y. A natural and effective way to ensure the equivariance of the transformation is to impose constraints on the transformation matrices that reflect the symmetry properties of the data.26,48
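Equation (1) can be verified numerically for any concrete force model. As an illustration (the pair potential below is an assumption chosen for simplicity, not the paper's model), central pair forces are exactly rotation-equivariant: rotating the coordinates and then evaluating the forces gives the same result as evaluating the forces and then rotating them.

```python
import numpy as np

def pair_forces(coords, k=1.0, r0=1.5):
    """Forces of a toy pair potential E = sum_{i<j} k/2 (|r_ij| - r0)^2.
    Central pair forces of this form are rotation-equivariant, so they
    illustrate Equation (1) numerically."""
    n = len(coords)
    forces = np.zeros_like(coords)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rij = coords[i] - coords[j]
            d = np.linalg.norm(rij)
            forces[i] -= k * (d - r0) * rij / d  # F_i = -dE/dr_i
    return forces

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
coords = rng.normal(size=(5, 3))

f_of_rotated = pair_forces(coords @ Q.T)   # phi(rho_X(g) x)
rotated_f = pair_forces(coords) @ Q.T      # rho_Y(g) phi(x)
assert np.allclose(f_of_rotated, rotated_f)
```

The two sides of Equation (1) agree to machine precision, which is the property that equivariant architectures enforce by construction rather than by data augmentation.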
Han et al.49 categorize equivariant models into three types: vector-based, Lie group-based, and irreducible representation-based. Among these, the irreducible representation-based approach, which takes advantage of the transformation properties of spherical harmonics Y_l^m, exhibits higher-order equivariant capabilities and excels in various force field tasks.30 Currently, such methods can be unified within a common mathematical framework.32
Active learning
Due to the limited generalization capability of neural networks, the performance of an MLFF model is heavily dependent on the quality of the training dataset. Moreover, labeling the dataset requires quantum-mechanical calculations, which involve a high computational cost. Therefore, a key task prior to training an MLFF model is to construct a dataset that is as comprehensive and nonredundant as possible.38,50 Active learning techniques have emerged as a prevalent strategy for dataset collection, with committee query methods being particularly prominent in the field of MLFFs.46,51,52
The fundamental idea of active learning is to use the disagreement of a committee to quantify the generalization error.53 Specifically, if a training dataset is of sufficiently high quality, an MLFF model trained under various random seed conditions will yield accurate predictions with low variance for normal samples that are not present in the training set. Conversely, predictions with high variance indicate that the training dataset is incomplete and requires the addition of more effective samples. Typically, the maximum variance in force predictions made by MLFF models trained with different random seeds is used as the criterion to determine whether a sample should be added to the training set. This maximum variance is calculated based on the following formula:
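A minimal sketch of this committee criterion, assuming the common query-by-committee form in which the per-atom force variance is taken across committee members and maximized over atoms (the exact normalization in the paper's Equation (2) may differ):

```python
import numpy as np

def max_force_deviation(committee_forces):
    """Query-by-committee criterion: per-atom standard deviation of
    force predictions across committee members, maximized over atoms.

    committee_forces: array of shape (M, N, 3), M models, N atoms.
    """
    mean_f = committee_forces.mean(axis=0)                 # (N, 3)
    var = ((committee_forces - mean_f) ** 2).mean(axis=0)  # (N, 3)
    per_atom_std = np.sqrt(var.sum(axis=-1))               # (N,)
    return float(per_atom_std.max())

# Four committee members, three atoms: identical predictions -> zero deviation.
assert max_force_deviation(np.zeros((4, 3, 3))) == 0.0
```

Samples whose deviation exceeds a chosen threshold are flagged as informative and sent for quantum-mechanical labeling.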
DFT settings
Since this work requires calculations on systems with thousands of atoms, as well as AIMD simulations, the energies and forces of all datasets are computed with a method of lower computational cost but good accuracy, namely the semiempirical GFN2-xTB level of theory.54 This choice balances computational cost and accuracy. The convergence criteria follow the default settings of the software.55
It is essential to note that during data collection, a pretrained MLFF based on the multiscale higher-order equivariant model is used to perform high-precision MD, which is significantly faster than AIMD simulation. The data sampling stage is therefore not computationally intensive; in practice, the primary cost of data generation still lies in labeling with DFT. More accurate quantum-mechanical calculations can be employed depending on specific requirements.
Molecular dynamics settings
Periodic AIMD simulations based on the semiempirical GFN2-xTB level of theory are implemented using the software.56 The MD simulations based on the MLFF models are executed using version 1.0 of the plugin and version 8.0.0 of the software.
Hyperparameter settings
All models were trained in single-GPU mode on an NVIDIA RTX 4090 GPU using float32 precision. Unless explicitly stated otherwise, the default hyperparameter settings for all models in this paper are as follows: the embedding dimension of the radial basis function is 8; the smooth truncation is a polynomial envelope function with and ; the radial MLP is [64, 64, 64]; the dimension of the readout layer is 16; node features are represented as ; the number of interaction layers is 2; directional information is expanded to the 3rd order; and the cutoff radius is set to 8 Å.
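The polynomial envelope referenced above can be sketched as follows. This is the DimeNet-style smooth cutoff commonly used in NequIP and MACE; the exponent p below is an assumed value, since the paper's setting did not survive extraction.

```python
def polynomial_envelope(r, r_cut, p=6):
    """DimeNet-style smooth polynomial cutoff: u(0) = 1, u(r_cut) = 0,
    with vanishing low-order derivatives at the cutoff, so features
    switch off smoothly as atoms cross the cutoff sphere.
    The exponent p is an assumption (not stated in the paper)."""
    if r >= r_cut:
        return 0.0
    d = r / r_cut
    return (1.0
            - (p + 1) * (p + 2) / 2 * d**p
            + p * (p + 2) * d**(p + 1)
            - p * (p + 1) / 2 * d**(p + 2))
```

Without such an envelope, the potential energy would jump discontinuously whenever a neighbor enters or leaves the cutoff radius, breaking energy conservation in MD.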
For multiscale higher-order equivariant models, the hyperparameter settings for the short-range node features, the expansion order of short-range directional information, and the number of interaction layers remain the same as described above. However, the long-range node features are represented as and the long-range directional information is expanded to the first order. The cutoff radius is set to 3 Å for short-range interactions and 8 Å for long-range interactions. To ensure continuity, a smooth polynomial envelope function is applied at the cutoff points of both the short-range and long-range modules. A detailed discussion and numerical testing of the values of the model’s cutoff radius can be found in the section on local testing of Supporting Information. For the Allegro model, the is set to 8, and is set to 256.
The training hyperparameters include an initial learning rate of 0.01 and a ReduceLROnPlateau scheduler, which reduces the learning rate when the validation loss does not improve over a certain number of epochs. An exponential moving average with a decay of 0.99 is applied to the model weights used for validation evaluation and for the final model. The optimizer is Adam, and the total number of training epochs is set to 4000. The energy is normalized by the moving average of the potential energy. The loss function is as follows:
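A standard weighted energy-force mean-squared error of the kind used by MACE and NequIP (the exact per-batch normalization here is an assumption) has the form

```latex
\mathcal{L} \;=\; \frac{\lambda_E}{B}\sum_{b=1}^{B}\bigl(\hat{E}_b - E_b\bigr)^2
\;+\; \frac{\lambda_F}{3\sum_b N_b}\sum_{b=1}^{B}\sum_{i=1}^{N_b}
\bigl\lVert \hat{\mathbf{F}}_{b,i} - \mathbf{F}_{b,i} \bigr\rVert^2
```

where B is the batch size, N_b is the number of atoms in configuration b, and λ_E and λ_F are the energy and force weights whose initial values are given in the next paragraph.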
The initial weights of the force and energy terms follow common settings,27,29 where the force weight is set to 1000 and the energy weight to the number of atoms in the system. For the MACE and multiscale MACE (MS-MACE) models, stochastic weight averaging (SWA)57,58 is enabled at 75% of the total iteration count. After initiating SWA, the energy and force weights of the loss function are reset to 1000 and 10, respectively.29
Results and Discussion
Efficient and universal multiscale higher-order equivariant modeling architecture
Most MLFF models are based on the locality hypothesis of atoms, where only the central atom interacts with nearby atoms.52,59 To account for long-range intermolecular interactions, these models must increase their receptive field. However, short-range intramolecular interactions have a more significant impact and tend to dominate during the training of MLFF models.60
To achieve optimal accuracy and faster computational speed with the higher-order equivariant model, we propose a universal multiscale higher-order equivariant model. This model uses a high-cost module to accurately capture short-range interactions related to the central atom, while a low-cost module is employed to handle the corresponding long-range interactions. This method adopts a multiscale strategy that empowers MLFF models to attain optimal precision and efficiency in handling intermolecular long-range interactions. Specifically, the potential energy of each atom is divided into short-range and long-range parts, and the total potential energy of the system is obtained by summing the contributions from all atoms. The atomic forces are the negative gradients of the total potential energy with respect to the coordinates, ensuring the energy conservation of the model:61
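In symbols, the decomposition described above reads

```latex
E \;=\; \sum_{i=1}^{N}\Bigl(E_i^{\mathrm{short}} + E_i^{\mathrm{long}}\Bigr),
\qquad
\mathbf{F}_i \;=\; -\,\frac{\partial E}{\partial \mathbf{r}_i}
```

where E_i^short and E_i^long are the short-range and long-range atomic energy contributions and r_i is the position of atom i. Because the forces are exact gradients of a single total energy, the model is conservative by construction.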
As mentioned above, in terms of fitting atomic potential energies, higher-order equivariant models significantly outperform first-order equivariant models and invariant models. Therefore, in this work, we focus exclusively on higher-order equivariant models. Various higher-order equivariant models have been proposed through the construction of different nonlinear functions, convolution filters, message functions, aggregation functions, and update functions; examples include TFN,20 Equiformer,23 Cormorant,21 MACE,27 NequIP,24 and SEGNN.22 Our universal and efficient multiscale higher-order equivariant model builds on this groundwork (Figure 1). The multiscale model achieves fast processing of long-range messages by reducing the channel dimensions of nodes through equivariant linear transformations and by lowering the expansion order of the embedded directional information. After the long-range messages are aggregated, another linear transformation aligns the long-range features with the short-range features, and the summation of these two features yields the complete multiscale equivariant model. In contrast to ref 62, a critical aspect of our multiscale model is the flexibility of the hyperparameters within the long-range module. Specifically, when the tensor order is set to zero, the long-range module shifts from equivariant features to invariant features. This dynamic adjustment is particularly noteworthy because the short-range module consistently employs equivariant features. Consequently, the model benefits from the integration of both equivariant and invariant modules, analogous to the architectures of Allegro25 and PaiNN.17 By distinguishing between long-range and short-range interactions, our multiscale model is better equipped to handle systems characterized by extensive intermolecular interactions.
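The long/short split described above rests on two neighbor lists built with different cutoffs. A minimal sketch (the cutoff values match the hyperparameter settings section; the naive O(N^2) construction is for illustration only, since production codes use cell lists):

```python
import numpy as np

def two_cutoff_neighbor_lists(coords, r_short=3.0, r_long=8.0):
    """Two-cutoff neighbor lists for the multiscale split: a dense
    short-range list (r <= r_short, handled by the expensive
    higher-order module) and a sparser long-range list
    (r_short < r <= r_long, handled by the cheap low-order module)."""
    short, long_range = [], []
    n = len(coords)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = np.linalg.norm(coords[i] - coords[j])
            if d <= r_short:
                short.append((i, j))
            elif d <= r_long:
                long_range.append((i, j))
    return short, long_range

coords = np.array([[0.0, 0.0, 0.0],
                   [2.0, 0.0, 0.0],
                   [5.0, 0.0, 0.0],
                   [20.0, 0.0, 0.0]])
short, long_range = two_cutoff_neighbor_lists(coords)
```

Because most neighbor pairs fall in the cheap long-range shell, the expensive tensor-product operations are confined to the small short-range list, which is where the speed and memory savings come from.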
This distinction brings advantages such as higher accuracy, faster training convergence, faster computation, and lower memory consumption. Furthermore, our multiscale method is highly versatile: it can be combined with any higher-order equivariant model, making it adaptable to various computational needs and enhancing its practical utility. In this work, our multiscale higher-order equivariant model is constructed based on the MACE (or NequIP) model; hereafter, it is referred to as "MS-MACE" (or "MS-NequIP").
Intermolecular interaction
Increasing the number of interaction layers generally enlarges the receptive field of the model.63 However, for long-range intermolecular interactions, there may be scenarios lacking intermediary atoms, which interrupts message propagation and renders this mechanism ineffective. Instead, increasing the cutoff-radius hyperparameter of the neural network model proves to be the key to accurately capturing long-range intermolecular interactions.
In this study, we investigate the behavior of two formaldehyde molecules separated by 9 Å using MD simulations. We prepare the dataset by conducting MD simulations with the GFN-FF64 method at 400 K, using a 1 fs time step in the NVT (canonical) ensemble. Trajectories are dumped every 100 steps over a total simulation time of 100 ps, yielding 1000 frames. These frames are then randomly divided into training, validation, and test sets in an 8:1:1 ratio. Additionally, we include a translation set of 80 configurations of the two formaldehyde molecules in the training set.
The energy and forces for each sample are calculated at the semiempirical GFN2-xTB level of theory. The energy variation with respect to the distance between the two molecules in the translation set is illustrated in Figure 2a. Based on the MACE model with two interaction layers, three different machine learning models are considered: (1) a cutoff radius of 6 Å; (2) a cutoff radius of 10 Å; and (3) a multiscale method with a short-range cutoff radius of 3 Å and a long-range cutoff radius of 10 Å. Other hyperparameters are set as described in the hyperparameter settings section. Figure 2b,c show that the MACE model with a cutoff radius of 6 Å exhibits better accuracy than the MACE model with a cutoff radius of 10 Å. However, although the MACE model with two interaction layers and a cutoff radius of 6 Å has a maximum receptive field of 12 Å, it is unable to capture the interactions between molecules separated by 9 Å, resulting in erroneous MD simulations. This indicates that neither higher model accuracy nor a receptive field expanded through multiple interaction layers guarantees the correctness of an MLFF in MD simulations. Capturing intermolecular interactions by increasing the cutoff radius is feasible instead. However, a larger cutoff radius is prone to information bottlenecks, leading to lower model prediction accuracy,65 and it also incurs huge computational cost and memory consumption, especially in large-scale systems. In contrast, the multiscale MS-MACE model accurately captures the intermolecular interactions, achieves higher accuracy, and converges earlier during training. This underscores the importance and effectiveness of the multiscale method in MLFFs. The advantages of the multiscale method in terms of accuracy and efficiency are further discussed in the following section.
Stability of long-time simulation
Existing MLFF models tend to form excessively long or short bonds during long-time MD simulations of organic systems, leading to the collapse of the simulations.30,42,66 This is because most samples in the training dataset are obtained by sampling near the equilibrium state,67 so the proportion of molecular conformations with abnormal bond lengths in the dataset is very low. The weak generalization ability of machine learning models consequently leads to the collapse of MD simulations. However, it is challenging to collect conformations with extremely long or short bond lengths directly from MD trajectories, because the MD process collapses almost instantaneously in such scenarios. Therefore, we propose a data augmentation method to enhance the stability of the model during long-time MD simulations: we first collect conformations from normal MD trajectories and then stretch the bond lengths to enrich the training dataset.
Existing literature68–70 employs normal modes to perturb the bond lengths of energy-minimized molecules, generating samples with different bond lengths. While this approach provides a straightforward way to create diverse molecular geometries, it has significant limitations. Notably, it fails to account for variations in bond angles and dihedral angles, which are critical for accurately representing molecular conformations. Additionally, its reliance on random perturbations of bond lengths does not ensure the robustness of the model when bond lengths deviate far from their equilibrium positions. Another crucial consideration is the asymmetric nature of chemical bond forces: the repulsion between atoms is generally stronger than the attraction, as shown in Figure 3b. Consequently, the perturbation strategy should reflect this asymmetry: variations in chemical bond lengths should preferentially simulate elongation rather than vary with equal probability in both directions, as suggested in the existing literature.68–70 An equal-probability approach can introduce excessive repulsion, increasing the difficulty of training the model and decreasing the overall accuracy.
To demonstrate our bond length stretching method, we take an organic molecule as an example, namely perfluorotri-n-butylamine (molecular formula C12F27N, CAS No. 86508-42-1), which is a nonconducting, thermally and chemically stable dielectric liquid widely used as a fully immersible electronics coolant. Molecular simulations of 300 ps are performed at 300, 600, and 800 K using AIMD simulations at the semiempirical GFN2-xTB level of theory under the NVT ensemble. Figure 3a shows the distribution of the C–F bond length at different temperatures. The results indicate that higher temperatures lead to broader distributions. However, even at 800 K, the distribution of bond lengths remains too narrow to guarantee the stability of long-time MD simulations. Inspired by this fact, the bond length stretching method is developed as follows. First, the training dataset is constructed by randomly selecting 600 conformations from the trajectories at 300 K. Then, in every sample within this training dataset, we randomly select a chemical bond and multiply its length by a factor λ, where λ is a random variable drawn from a uniform distribution over the range [0.85, 2.0]. This range is chosen to avoid overstretching and overcompressing the bonds. For comparison, the bond length distribution of the dataset with stretched bonds is also shown in Figure 3a. The results indicate that the stretching method significantly broadens the bond length distributions.
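The stretching step above can be sketched as follows. How the rest of the molecule is treated once one bond is scaled is an implementation choice not specified in the text; here, as a simplifying assumption, only one endpoint atom is displaced along the bond axis.

```python
import numpy as np

def stretch_random_bond(coords, bonds, rng, lam_lo=0.85, lam_hi=2.0):
    """Bond length stretching augmentation: pick one bond (i, j) at
    random and scale its length by lambda ~ U[0.85, 2.0], displacing
    atom j along the bond axis (endpoint displacement is an assumed
    simplification)."""
    i, j = bonds[rng.integers(len(bonds))]
    lam = rng.uniform(lam_lo, lam_hi)
    new = coords.copy()
    new[j] = coords[i] + lam * (coords[j] - coords[i])
    return new, (i, j), lam

rng = np.random.default_rng(0)
coords = np.array([[0.0, 0.0, 0.0], [1.4, 0.0, 0.0], [0.0, 1.4, 0.0]])
new, (i, j), lam = stretch_random_bond(coords, [(0, 1), (0, 2)], rng)
assert abs(np.linalg.norm(new[j] - new[i]) - lam * 1.4) < 1e-12
```

Each augmented conformation is then relabeled with GFN2-xTB before being added to the training set, so the model sees physically consistent energies and forces for the stretched geometries.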
To verify the effectiveness of this bond stretching method, we first combine two representative equivariant models, namely the MACE and NequIP models, with our multiscale model, resulting in the MS-MACE and MS-NequIP models, respectively. These two multiscale models are then trained on datasets both processed and unprocessed by the bond length stretching method. According to Figure 3b,c, our bond length stretching method significantly improves the prediction of potential energy and forces. As shown in Figure 3c, without the bond length stretching method, these two MLFF models even yield unphysical force predictions for extremely short or long bond lengths. Furthermore, 150 conformations are randomly sampled from the MD trajectories at 800 K to serve as initial configurations. Then 50,000 steps of Langevin dynamics simulations based on the above MLFF models are performed at 800 K with a time step of 3 fs. The MD simulation stability of the MLFF models can be analyzed by counting the number of runs in which collapse occurs (the criterion for collapse is that some bond length deviates by more than 5 Å from its equilibrium value30,42). According to Figure 3d, without the bond length stretching method, the stability of the MS-MACE model is superior to that of the MS-NequIP model, with a success rate of 88.67% for the former and only 2.0% for the latter, further emphasizing the excellent simulation stability of the MACE framework.29 With the bond length stretching method, the simulation stability of both models is greatly enhanced, reaching a success rate of 100%. This suggests that the bond length stretching method exhibits high generality and effectiveness.
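The collapse criterion used for these stability counts can be sketched directly from the text:

```python
import numpy as np

def trajectory_collapsed(traj, bonds, eq_lengths, tol=5.0):
    """Collapse criterion from the text: a run counts as collapsed if,
    in any frame, any bond length deviates by more than `tol` (5 Å)
    from its equilibrium value."""
    for frame in traj:
        for (i, j), d0 in zip(bonds, eq_lengths):
            d = np.linalg.norm(np.asarray(frame[i]) - np.asarray(frame[j]))
            if abs(d - d0) > tol:
                return True
    return False

stable = [[[0.0, 0.0, 0.0], [1.4, 0.0, 0.0]]]   # one frame, bond at 1.4 Å
broken = [[[0.0, 0.0, 0.0], [8.0, 0.0, 0.0]]]   # bond stretched to 8.0 Å
assert not trajectory_collapsed(stable, [(0, 1)], [1.4])
assert trajectory_collapsed(broken, [(0, 1)], [1.4])
```

The reported success rate is simply the fraction of the 150 independent runs for which this predicate stays false over all 50,000 steps.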
In addition, according to the inset of Figure 3c, without the bond length stretching method, the MS-NequIP and MS-MACE models give unphysical predictions (i.e., repulsive forces) when the bond lengths exceed 1.9 and 2.0 Å, respectively. This suggests that the threshold bond length for unphysical predictions determines the stability of long-time MD simulations. We therefore infer that the simulation stability of an MLFF model is highly correlated with its predictive capability at extreme bond lengths. To further examine the simulation stability of each model, we investigate the change of the C–F bond length over simulation time in Langevin dynamics simulations with a time step of 1 fs at 600 K, where the initial bond lengths are 1.9 Å (Figure 3e) and 2.1 Å (Figure 3f), respectively. According to Figure 3c, at a bond length of 1.9 Å (larger than the equilibrium bond length), MS-MACE predicts a negative force (i.e., an elastic restoring force). Therefore, the elongated chemical bond is restored to its equilibrium state (see the yellow line in Figure 3e). However, in this case, the MS-NequIP model gives an unphysical prediction (a repulsive force) (Figure 3c), resulting in an abnormal elongation of the C–F bond during the MD simulations (see the blue line in Figure 3e) and ultimately leading to the collapse of the simulation. At a bond length of 2.1 Å, without the bond length stretching method, both models give unphysical predictions (i.e., repulsive forces), resulting in incorrect elongations of the C–F bonds (see the yellow and blue lines in Figure 3f). In contrast, with our bond length stretching method applied, both models make correct predictions for initial bond lengths of both 1.9 and 2.1 Å (see the green and red curves in Figure 3e,f).
In summary, the bond length stretching method proves to be an effective and universally applicable method to enhance the stability of long-time MD simulations.
Efficient collection of training datasets
Active learning has been widely used.46,51,68,71 However, during the early iterations of active learning, the instability of MD simulations in the exploratory phase and the limited accuracy of models trained on few samples are particularly significant. In this section, we therefore demonstrate the advantages of combining the bond length stretching method and the multiscale higher-order equivariant model with active learning techniques. As in the previous section, we take perfluorotri-n-butylamine molecules as an example. To construct a dataset that is as comprehensive and nonredundant as possible, the active learning workflow proceeds through a series of iterations. First, an annealing simulation is conducted using the general AMBER force field (GAFF72) on a system of 200 perfluorotri-n-butylamine molecules in the NPT ensemble (constant-temperature, constant-pressure ensemble). During the annealing simulation, the system cools from 800 to 280 K over 60 ns, and 300 frames of the trajectory are randomly extracted with a minimum sampling interval of 10 ps. In each frame, the three molecules closest to the center of the simulation box are selected as one sample (120 atoms) and added to the initial dataset. To enhance simulation stability during data collection, the initial dataset is augmented using the bond length stretching method proposed in the previous section. The augmented dataset is then labeled at the semiempirical GFN2-xTB level of theory. Second, four MLFF models based on our multiscale higher-order equivariant model (i.e., MS-MACE) with different random seeds and a batch size of 64 are trained on the initial dataset (300 samples). High-precision MD simulations are then conducted in the NVT ensemble at 300, 500, 700, and 900 K using the four trained MLFF models, with a time step of 1 fs and a total simulation time of 45 ps.
Meanwhile, a number of new samples (10, 20, 40, and 50 at 300, 500, 700, and 900 K, respectively) are randomly selected from the MD trajectories as candidate samples. Finally, the candidate samples are assessed using Equation (2), and the effective candidates are labeled and added to the training dataset. This process is repeated until the proportion of effective candidates generated from the pool of candidate samples converges to less than 0.5% (Table 1).
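The per-frame sampling step described above (pick the few molecules nearest the box center as one training sample) can be sketched as follows. This is a hypothetical helper, not code from the paper; it assumes unwrapped Cartesian coordinates, an orthorhombic box, and a per-atom molecule index array.

```python
import numpy as np

def pick_central_molecules(coords, mol_ids, box, k=3):
    """Return the ids of the k molecules whose geometric centers lie
    nearest to the center of an orthorhombic simulation box.

    coords  : (n_atoms, 3) Cartesian coordinates of one trajectory frame
    mol_ids : (n_atoms,) molecule index of each atom
    box     : (3,) box edge lengths
    """
    center = np.asarray(box) / 2.0
    ids = np.unique(mol_ids)
    # Geometric center (centroid) of each molecule.
    centroids = np.array([coords[mol_ids == m].mean(axis=0) for m in ids])
    dist = np.linalg.norm(centroids - center, axis=1)
    return ids[np.argsort(dist)[:k]]
```

Applied to each of the 300 extracted frames with k = 3, this yields one 120-atom sample per frame for the initial dataset.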
Table 1. Percentages (%) of accurate, effective candidate, and failed samples in each iteration i of the active learning workflow.

| i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Accurate | 73.20 | 97.42 | 98.74 | 98.68 | 98.15 | 99.49 | 99.46 | 99.51 | 99.67 | 99.68 |
| Candidate | 26.72 | 2.58 | 1.25 | 1.32 | 1.85 | 0.51 | 0.54 | 0.49 | 0.33 | 0.32 |
| Failed | 0.08 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Figure 4a–e depict the distributions of the model deviation (Equation 2) at different temperatures in the 1st, 2nd, 3rd, and 10th iterations. Table 1 presents the percentages of accurate, effective candidate, and failed samples in each iteration. In the first iteration, the ultrahigh percentage (73.20%) of accurate samples and ultralow percentage (0.08%) of failed samples demonstrate the high data efficiency of the multiscale higher-order equivariant model and the high MD simulation stability conferred by the bond length stretching method. After just one iteration, the percentage of accurate samples increases to 97.42%, and the percentage of effective candidate samples decreases to 2.58%, which fully demonstrates the advantage of active learning in data collection. After 10 iterations, the percentage of accurate samples approaches 99.7%, and only 0.32% of samples need to be added to the dataset. Following the convergence criteria used in pertinent literature,46,51 we conclude that the active learning data collection has converged. In the end, we collect a training dataset consisting of 901 samples.
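The accurate/candidate/failed classification above can be sketched with a committee-deviation criterion. Equation (2) is not reproduced in this section; the sketch below assumes a common DP-GEN-style choice (the maximum per-atom force deviation across the four-model committee), and the threshold values are purely illustrative, not the ones used in this work.

```python
import numpy as np

def max_force_deviation(forces):
    """Maximum per-atom force deviation across a model committee.

    forces : (n_models, n_atoms, 3) force predictions from the committee.
    Returns the largest per-atom standard deviation of the force vector.
    """
    mean = forces.mean(axis=0, keepdims=True)
    per_atom = np.sqrt(((forces - mean) ** 2).sum(axis=-1).mean(axis=0))
    return per_atom.max()

def classify(eps, lo=0.05, hi=0.25):
    """Illustrative thresholds (eV/Å): below lo the sample is already well
    described ("accurate"); between lo and hi it is an informative
    "candidate" worth labeling; above hi the configuration is treated as
    unphysical and discarded ("failed")."""
    if eps < lo:
        return "accurate"
    if eps < hi:
        return "candidate"
    return "failed"
```

Only the "candidate" samples are sent for GFN2-xTB labeling, which is what keeps the final dataset both comprehensive and low-redundancy.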
Prediction accuracy and molecular simulation efficiency
In this section, we compare our multiscale model (referred to as the "MS-MACE" model) with three representative equivariant models, MACE, NequIP, and Allegro, in terms of prediction accuracy, simulation speed, and GPU memory consumption. Detailed descriptions of the speed and memory tests can be found in the Supporting Information. It is important to note that achieving complete consistency in hyperparameters across all models is impractical due to differences in model architecture. For example, the MACE and NequIP models pass messages between atoms, whereas the Allegro model passes messages between edges.28 To ensure valid comparisons, however, the shared parameters of the models are set identically; detailed settings can be found in the hyperparameter settings of the Computational Methods section. In addition, it is worth emphasizing that comparisons with invariant models were not conducted, because equivariant models typically outperform invariant models in the construction of MLFFs.
In the previous section, we obtained a dataset comprising 901 samples, each containing three perfluorotri-n-butylamine molecules. This dataset is randomly divided into training and validation sets of 801 and 100 samples, respectively. We then construct seven test sets, each comprising 100 samples. Within a single test set, every sample contains the same number of molecules with different conformations; across the first to the seventh test sets, each sample contains 20, 40, 60, 80, 100, 120, and 140 molecules, respectively. To obtain these seven sets, we conduct all-atom MD simulations of a system of 1000 molecules in the NPT ensemble at 600 K using the GAFF force field. All samples in the seven test sets are randomly extracted from the MD trajectories and then labeled at the semiempirical GFN2-xTB level of theory. In addition, to reduce the impact of randomness, the prediction errors for each model are averaged over five models trained with different random seeds.
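The random 801/100 train/validation split described above can be sketched as follows; the seed value is arbitrary and only makes the partition reproducible.

```python
import numpy as np

def split_dataset(n_samples, n_val, seed=0):
    """Randomly partition sample indices into disjoint training and
    validation index arrays."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)   # shuffle 0..n_samples-1
    return idx[n_val:], idx[:n_val]    # (train, validation)

# 901 samples -> 801 training, 100 validation
train_idx, val_idx = split_dataset(901, 100)
```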
Figure 5 shows the results of the comparisons among the equivariant models. All models exhibit high precision in predicting energies and forces on the test sets. Moreover, the simulation time cost and GPU memory consumption scale linearly with the number of atoms in the system, which can be attributed to the fact that all models rely on a locality assumption.52,59,73
Compared to the other models, our MS-MACE model achieves the best performance across all metrics: accuracy, simulation speed, and GPU memory consumption. Specifically, according to Figure 5a,b, the MS-MACE model gives the most accurate predictions for both energy and force. In particular, the average force error approaches an asymptotic value of 10.5 meV/Å, indicating that even in a large-scale system comprising several hundred thousand atoms, the force error remains around 10.5 meV/Å. That is, although each sample in the training dataset contains only 120 atoms, our multiscale MLFF model successfully extends high precision to large-scale systems with hundreds of thousands of atoms. In addition, according to Figure 5c,d, the multiscale method improves the computational speed and memory efficiency of the MS-MACE model by about 70% and 50%, respectively. In summary, our MS-MACE model demonstrates exceptional competitiveness in energy and force prediction, simulation speed, and GPU memory consumption.
Compared with AIMD simulations
In this section, we validate the capability of our multiscale higher-order equivariant model by comparing against the results of AIMD simulations. We conduct AIMD simulations in the NVT ensemble at 600 K for 500 ps. The system comprises five perfluorotri-n-butylamine molecules, with periodic boundary conditions applied in all three directions. MD simulations based on the MS-MACE model and the GAFF empirical force field are performed under the same conditions. To reduce the influence of randomness, all simulations are repeated three times independently.
Figure 6a–c shows the comparisons of the radial distribution functions (RDFs) for F–F atom pairs, the distributions of F–C–F angles, and the distributions of F–C–C–F dihedrals. The results indicate that the MS-MACE model can perfectly reproduce the results from AIMD simulations, highlighting the accuracy of our multiscale higher-order equivariant model. In contrast, although its physically constrained functional form prevents unphysical configurations, the GAFF empirical force field still produces unreliable distributions. Quantum chemical methods, on the other hand, are more accurate but prohibitively expensive. This underscores the critical need for the development and application of MLFFs.
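For reference, an F–F RDF of the kind compared in Figure 6a can be computed per frame as sketched below. This is a minimal illustration, assuming an orthorhombic periodic box with the minimum-image convention and r_max no larger than half the shortest box edge; production analyses would average g(r) over many frames.

```python
import numpy as np

def rdf(positions, box, r_max, n_bins=100):
    """Radial distribution function g(r) for one frame.

    positions : (n, 3) coordinates of the atoms of interest (e.g., all F atoms)
    box       : (3,) edge lengths of an orthorhombic periodic box
    """
    n = len(positions)
    diff = positions[:, None, :] - positions[None, :, :]
    diff -= box * np.round(diff / box)                  # minimum-image convention
    r = np.sqrt((diff ** 2).sum(-1))[np.triu_indices(n, k=1)]  # unique pairs
    hist, edges = np.histogram(r, bins=n_bins, range=(0.0, r_max))
    # Normalize by the expected pair count of an ideal gas at the same density.
    shell = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    n_ideal = (n / box.prod()) * shell * n / 2.0
    return 0.5 * (edges[:-1] + edges[1:]), hist / n_ideal
```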
Conclusion
The multiscale equivariant framework is versatile and can be integrated with any state-of-the-art equivariant model, such as MACE and NequIP, as demonstrated in this work, offering multifaceted advantages. Notably, the MACE model has already been applied to dozens of systems.74 We therefore believe that extending the MS-MACE model to other molecular systems is entirely feasible.
Although the polarity of the perfluorotri-n-butylamine molecule is extremely weak, a cutoff radius of 8 Å is still required to accurately capture the long-range intermolecular interactions. However, such a large cutoff radius is prohibitively expensive for a conventional MLFF model: according to Figure 5c,d, the simulation time and memory consumption become enormous, which is unaffordable when simulating large-scale systems. Moreover, a larger cutoff radius is prone to information bottlenecks that lower model prediction accuracy, as shown in Figure 2b. Therefore, a multiscale higher-order equivariant model with significant advantages in prediction accuracy, simulation speed, and memory consumption is necessary for large-scale organic systems.
In addition, long-time simulation stability is one of the most important and challenging issues in the development of the MLFF model. To address this issue, we proposed a bond length stretching method, in which we randomly select chemical bonds from the training dataset and randomly stretch or compress them within a reasonable range. This method achieves long-time simulation stability comparable to AIMD simulations with almost no additional computational cost. Moreover, this bond length stretching method can effectively improve the long-time simulation stability of all MLFF models.
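The bond length stretching augmentation described above can be sketched as follows. The fraction of bonds perturbed and the stretching range are illustrative placeholders, not the values used in the paper, and moving only one endpoint of each bond is a simplifying assumption of this sketch.

```python
import numpy as np

def stretch_bonds(positions, bonds, rng, fraction=0.2, max_scale=0.15):
    """Augment one training configuration by randomly stretching or
    compressing a fraction of its chemical bonds.

    positions : (n_atoms, 3) Cartesian coordinates
    bonds     : list of (i, j) atom-index pairs defining chemical bonds
    For each selected bond, atom j is displaced along the bond vector so
    the bond length is scaled by a random factor in
    [1 - max_scale, 1 + max_scale].
    """
    pos = positions.copy()
    n_pick = max(1, int(round(fraction * len(bonds))))
    for k in rng.choice(len(bonds), size=n_pick, replace=False):
        i, j = bonds[k]
        scale = rng.uniform(1.0 - max_scale, 1.0 + max_scale)
        pos[j] = pos[i] + scale * (pos[j] - pos[i])
    return pos
```

Because the perturbed configurations are simply relabeled with the same quantum-chemical method, the augmentation adds training data along the steep bond-stretching directions of the potential energy surface at almost no extra cost.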
Finally, a training set that is as comprehensive and nonredundant as possible is also extremely important for an MLFF model, as it directly determines the model's prediction accuracy and training efficiency. The active learning technique provides a route to constructing a comprehensive training set; however, active learning is merely a concept, and its effectiveness depends on the underlying MLFF model. In this work, the active learning concept is combined with the bond length stretching method within the framework of our multiscale higher-order equivariant model. The results show that this combination can efficiently and stably construct a comprehensive, low-redundancy training set. For example, we built a training set containing only 901 samples, each with merely 120 atoms; based on this dataset, our multiscale model achieves high-precision, high-efficiency, and low-memory long-time stable simulations with hundreds of thousands of atoms.
In summary, in this work we proposed a universal multiscale higher-order equivariant framework for constructing an MLFF model. Within this framework, a bond length stretching method and an active learning workflow have been designed to realize long-time stability of MD simulations and efficient collection of a comprehensive, low-redundancy training set. By employing our multiscale model, one can achieve high precision, high speed, and low GPU memory consumption in long-time stable simulations of large-scale organic systems. However, since intermolecular interactions are usually much weaker than intramolecular interactions, MLFFs tend to neglect important intermolecular interactions, resulting in inaccurate MD simulations.60 It is therefore of great interest to address this issue through a multiscale approach that distinguishes between intramolecular and intermolecular interactions. In the future, it would also be worthwhile to explore applications to charged systems by combining multiscale higher-order equivariant methods with methods73,75–78 capable of handling long-range electrostatics.
Conflict of Interest
There is no conflict of interest to report.
Supporting Information
Supporting Information is available and includes local tests of the cutoff radius, the formulation process and computational complexity of multiscale high-order equivariant models, as well as detailed descriptions of model speed and memory tests.
Preprint Statement
Research presented in this article was posted on a preprint server prior to publication. The corresponding preprint article can be found here: doi.org/10.48550/arXiv.2312.09490.
Acknowledgments
This research was made possible as a result of a generous grant from the National Key R&D Program of China (grant no. 2021YFB3803200) and the National Natural Science Foundation of China (grant no. 22273112).
References
- 1. Zhang J.; Wang X.; Zhu Y.; Shi T.; Tang Z.; Li M.; Liao G. Molecular Dynamics Simulation of the Melting Behavior of Copper Nanorod. Comput. Mater. Sci. 2018, 143, 248–254.
- 2. Zhong Z.; Du G.; Wang Y.; Jiang J. Phase Behaviors of Dialkyldimethylammonium Bromide Bilayers. Langmuir 2023, 39, 11081–11089.
- 3. Ma L.; Zhong Z.; Hu J.; Qing L.; Jiang J. Long-Lived Weak Ion Pairs in Ionic Liquids: An Insight from All-Atom Molecular Dynamics Simulations. J. Phys. Chem. B 2023, 127, 5308–5316.
- 4. Perilla J. R.; Schulten K. Physical Properties of the HIV-1 Capsid from All-Atom Molecular Dynamics Simulations. Nat. Commun. 2017, 8, 15959.
- 5. Mouvet F.; Villard J.; Bolnykh V.; Rothlisberger U. Recent Advances in First-Principles Based Molecular Dynamics. Acc. Chem. Res. 2022, 55, 221–230.
- 6. Behler J.; Parrinello M. Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces. Phys. Rev. Lett. 2007, 98, 146401.
- 7. Chmiela S.; Tkatchenko A.; Sauceda H. E.; Poltavsky I.; Schütt K. T.; Müller K.-R. Machine Learning of Accurate Energy-Conserving Molecular Force Fields. Sci. Adv. 2017, 3, e1603015.
- 8. Chmiela S.; Vassilev-Galindo V.; Unke O. T.; Kabylda A.; Sauceda H. E.; Tkatchenko A.; Müller K.-R. Accurate Global Machine Learning Force Fields for Molecules with Hundreds of Atoms. Sci. Adv. 2023, 9, eadf0873.
- 9. Tran R.; Lan J.; Shuaibi M.; Wood B. M.; Goyal S.; Das A.; Heras-Domingo J.; Kolluru A.; Rizvi A.; Shoghi N.; Sriram A.; Therrien F.; Abed J.; Voznyy O.; Sargent E. H.; Ulissi Z.; Zitnick C. L. The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysts. ACS Catal. 2023, 13, 3066–3084.
- 10. Schütt K.; Kindermans P.-J.; Sauceda Felix H. E.; Chmiela S.; Tkatchenko A.; Müller K.-R. SchNet: A Continuous-Filter Convolutional Neural Network for Modeling Quantum Interactions. Adv. Neural Inf. Process. Syst. 2017, 30, 992–1002.
- 11. Wang H.; Zhang L.; Han J.; Weinan E. DeePMD-Kit: A Deep Learning Package for Many-Body Potential Energy Representation and Molecular Dynamics. Comput. Phys. Commun. 2018, 228, 178–184.
- 12. Schütt K. T.; Arbabzadah F.; Chmiela S.; Müller K. R.; Tkatchenko A. Quantum-Chemical Insights from Deep Tensor Neural Networks. Nat. Commun. 2017, 8, 13890.
- 13. Unke O. T.; Meuwly M. PhysNet: A Neural Network for Predicting Energies, Forces, Dipole Moments, and Partial Charges. J. Chem. Theory Comput. 2019, 15, 3678–3693.
- 14. Wang L.; Liu Y.; Lin Y.; Liu H.; Ji S. ComENet: Towards Complete and Efficient Message Passing for 3D Molecular Graphs. Adv. Neural Inf. Process. Syst. 2022, 35, 650–664.
- 15. Liu Y.; Wang L.; Liu M.; Lin Y.; Zhang X.; Oztekin B.; Ji S. Spherical Message Passing for 3D Molecular Graphs. In International Conference on Learning Representations, April 25–29, 2022.
- 16. Satorras V. G.; Hoogeboom E.; Welling M. E(n) Equivariant Graph Neural Networks. In International Conference on Machine Learning, 2021; pp 9323–9332.
- 17. Schütt K.; Unke O.; Gastegger M. Equivariant Message Passing for the Prediction of Tensorial Properties and Molecular Spectra. In International Conference on Machine Learning, 2021; pp 9377–9388.
- 18. Jing B.; Eismann S.; Suriana P.; Townshend R. J. L.; Dror R. Learning from Protein Structure with Geometric Vector Perceptrons. In International Conference on Learning Representations, May 3–7, 2021.
- 19. Le T.; Noé F.; Clevert D.-A. Equivariant Graph Attention Networks for Molecular Property Prediction. arXiv preprint arXiv:2202.09891, 2022.
- 20. Thomas N.; Smidt T.; Kearnes S.; Yang L.; Li L.; Kohlhoff K.; Riley P. Tensor Field Networks: Rotation- and Translation-Equivariant Neural Networks for 3D Point Clouds. arXiv preprint arXiv:1802.08219, 2018.
- 21. Anderson B.; Hy T. S.; Kondor R. Cormorant: Covariant Molecular Neural Networks. Adv. Neural Inf. Process. Syst. 2019, 32, 14537–14546.
- 22. Brandstetter J.; Hesselink R.; van der Pol E.; Bekkers E. J.; Welling M. Geometric and Physical Quantities Improve E(3) Equivariant Message Passing. In International Conference on Learning Representations, April 25–29, 2022.
- 23. Liao Y.-L.; Smidt T. Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs. In The Eleventh International Conference on Learning Representations, May 1–5, 2023.
- 24. Batzner S.; Musaelian A.; Sun L.; Geiger M.; Mailoa J. P.; Kornbluth M.; Molinari N.; Smidt T. E.; Kozinsky B. E(3)-Equivariant Graph Neural Networks for Data-Efficient and Accurate Interatomic Potentials. Nat. Commun. 2022, 13, 2453.
- 25. Musaelian A.; Batzner S.; Johansson A.; Sun L.; Owen C. J.; Kornbluth M.; Kozinsky B. Learning Local Equivariant Representations for Large-Scale Atomistic Dynamics. Nat. Commun. 2023, 14, 579.
- 26. Batatia I.; Batzner S.; Kovács D. P.; Musaelian A.; Simm G. N. C.; Drautz R.; Ortner C.; Kozinsky B.; Csányi G. The Design Space of E(3)-Equivariant Atom-Centered Interatomic Potentials. arXiv preprint arXiv:2205.06643, 2022.
- 27. Batatia I.; Kovács D. P.; Simm G.; Ortner C.; Csányi G. MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields. In Advances in Neural Information Processing Systems; Koyejo S., Mohamed S., Agarwal A., Belgrave D., Cho K., Oh A., Eds.; Vol. 35; MIT Press, 2022; pp 11423–11436.
- 28. Zhang X.; Wang L.; Helwig J.; Luo Y.; Fu C.; Xie Y.; Liu M.; Lin Y.; Xu Z.; Yan K.; Adams K.; Weiler M.; Li X.; Fu T.; Wang Y.; Yu H.; Xie Y. Q.; Fu X.; Strasser A.; Xu S.; Liu Y.; Du Y.; Saxton A.; Ling H.; Lawrence H.; Stärk H.; Gui S.; Edwards C.; Gao N.; Ladera A.; Wu T.; Hofgard E. F.; Tehrani A. M.; Wang R.; Daigavane A.; Bohde M.; Kurtin J.; Huang Q.; Phung T.; Xu M.; Joshi C. K.; Mathis S. V.; Azizzadenesheli K.; Fang A.; Aspuru-Guzik A.; Bekkers E.; Bronstein M.; Zitnik M.; Anandkumar A.; Ermon S.; Liò P.; Yu R.; Günnemann S.; Leskovec J.; Ji H.; Sun J.; Barzilay R.; Jaakkola T.; Coley C. W.; Qian X.; Qian X.; Smidt T.; Ji S. Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems. arXiv preprint arXiv:2307.08423, 2023.
- 29. Kovács D. P.; Batatia I.; Arany E. S.; Csányi G. Evaluation of the MACE Force Field Architecture: From Medicinal Chemistry to Materials Science. J. Chem. Phys. 2023, 159, 044118.
- 30. Fu X.; Wu Z.; Wang W.; Xie T.; Keten S.; Gomez-Bombarelli R.; Jaakkola T. Forces are Not Enough: Benchmark and Critical Evaluation for Machine Learning Force Fields with Molecular Simulations. Trans. Mach. Learn. Res. 2023.
- 31. Joshi C. K.; Bodnar C.; Mathis S. V.; Cohen T.; Lio P. On the Expressive Power of Geometric Graph Neural Networks. arXiv preprint arXiv:2301.09308, 2023.
- 32. Geiger M.; Smidt T. e3nn: Euclidean Neural Networks. arXiv preprint arXiv:2207.09453, 2022.
- 33. Rackers J. A.; Tecot L.; Geiger M.; Smidt T. E. A Recipe for Cracking the Quantum Scaling Limit with Machine Learned Electron Densities. Mach. Learn. Sci. Technol. 2023, 4, 015027.
- 34. Kovács D. P.; Moore J. H.; Browning N. J.; Batatia I.; Horton J. T.; Kapil V.; Witt W. C.; Magdău I.-B.; Cole D. J.; Csányi G. MACE-OFF23: Transferable Machine Learning Force Fields for Organic Molecules. arXiv preprint arXiv:2312.15211, 2023.
- 35. Passaro S.; Zitnick C. L. Reducing SO(3) Convolutions to SO(2) for Efficient Equivariant GNNs. In International Conference on Machine Learning, 2023; pp 27420–27438.
- 36. Luo S.; Chen T.; Krishnapriyan A. S. Enabling Efficient Equivariant Operations in the Fourier Basis via Gaunt Tensor Products. In The Twelfth International Conference on Learning Representations, Vienna, Austria, May 7–11, 2024.
- 37. Simeon G.; Fabritiis G. D. TensorNet: Cartesian Tensor Representations for Efficient Learning of Molecular Potentials. In Thirty-Seventh Conference on Neural Information Processing Systems, New Orleans, United States, December 10–16, 2023.
- 38. Tokita A. M.; Behler J. Tutorial: How to Train a Neural Network Potential. arXiv preprint arXiv:2308.08859, 2023.
- 39. Klicpera J.; Becker F.; Günnemann S. GemNet: Universal Directional Graph Neural Networks for Molecules. In Advances in Neural Information Processing Systems; Beygelzimer A., Dauphin Y., Liang P., Vaughan J. W., Eds.; MIT Press, 2021.
- 40. Wen T.; Zhang L.; Wang H.; Weinan E.; Srolovitz D. J. Deep Potentials for Materials Science. Mater. Futures 2022, 1, 022601.
- 41. Gao X.; Ramezanghorbani F.; Isayev O.; Smith J. S.; Roitberg A. E. TorchANI: A Free and Open Source PyTorch-Based Deep Learning Implementation of the ANI Neural Network Potentials. J. Chem. Inf. Model. 2020, 60, 3408–3415.
- 42. Wang Z.; Wu H.; Sun L.; He X.; Liu Z.; Shao B.; Wang T.; Liu T.-Y. Improving Machine Learning Force Fields for Molecular Dynamics Simulations with Fine-Grained Force Metrics. J. Chem. Phys. 2023, 159, 035101.
- 43. Li Y.; Wang Y.; Huang L.; Yang H.; Wei X.; Zhang J.; Wang T.; Wang Z.; Shao B.; Liu T.-Y. Long-Short-Range Message-Passing: A Physics-Informed Framework to Capture Non-Local Interaction for Scalable Molecular Dynamics Simulation. In The Twelfth International Conference on Learning Representations, Vienna, Austria, May 7–11, 2024.
- 44. Di Giovanni F.; Giusti L.; Barbero F.; Luise G.; Lio P.; Bronstein M. M. On Over-Squashing in Message Passing Neural Networks: The Impact of Width, Depth, and Topology. In International Conference on Machine Learning, 2023; pp 7865–7885.
- 45. Zuo Y.; Chen C.; Li X.; Deng Z.; Chen Y.; Behler J.; Csányi G.; Shapeev A. V.; Thompson A. P.; Wood M. A.; Ong S. P. Performance and Cost Assessment of Machine Learning Interatomic Potentials. J. Phys. Chem. A 2020, 124, 731–745.
- 46. Huang J.; Zhang L.; Wang H.; Zhao J.; Cheng J. Deep Potential Generation Scheme and Simulation Protocol for the Li10GeP2S12-Type Superionic Conductors. J. Chem. Phys. 2021, 154, 094703.
- 47. Cohen T.; Welling M. Group Equivariant Convolutional Networks. In International Conference on Machine Learning, 2016; pp 2990–2999.
- 48. Darby J. P.; Kovács D. P.; Batatia I.; Caro M. A.; Hart G. L. W.; Ortner C.; Csányi G. Tensor-Reduced Atomic Density Representations. Phys. Rev. Lett. 2023, 131, 028001.
- 49. Han J.; Rong Y.; Xu T.; Huang W. Geometrically Equivariant Graph Neural Networks: A Survey. arXiv preprint arXiv:2202.07230, 2022.
- 50. Schütt K. T.; Chmiela S.; Von Lilienfeld O. A.; Tkatchenko A.; Tsuda K.; Müller K.-R. Machine Learning Meets Quantum Physics; Lecture Notes in Physics; Springer: Cham, 2020.
- 51. Zhang Y.; Wang H.; Chen W.; Zeng J.; Zhang L.; Wang H.; Weinan E. DP-GEN: A Concurrent Learning Platform for the Generation of Reliable Deep Learning Based Potential Energy Models. Comput. Phys. Commun. 2020, 253, 107206.
- 52. Fedik N.; Zubatyuk R.; Kulichenko M.; Lubbers N.; Smith J. S.; Nebgen B.; Messerly R.; Li Y. W.; Boldyrev A. I.; Barros K.; Isayev O.; Tretiak S. Extending Machine Learning Beyond Interatomic Potentials for Predicting Molecular Properties. Nat. Rev. Chem. 2022, 6, 653–672.
- 53. Schran C.; Brezina K.; Marsalek O. Committee Neural Network Potentials Control Generalization Errors and Enable Active Learning. J. Chem. Phys. 2020, 153, 104105.
- 54. Bannwarth C.; Ehlert S.; Grimme S. GFN2-xTB—An Accurate and Broadly Parametrized Self-Consistent Tight-Binding Quantum Chemical Method with Multipole Electrostatics and Density-Dependent Dispersion Contributions. J. Chem. Theory Comput. 2019, 15, 1652–1671.
- 55. Bannwarth C.; Caldeweyher E.; Ehlert S.; Hansen A.; Pracht P.; Seibert J.; Spicher S.; Grimme S. Extended Tight-Binding Quantum Chemistry Methods. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2021, 11, e1493.
- 56. Hourahine B.; Aradi B.; Blum V.; Bonafé F.; Buccheri A.; Camacho C.; Cevallos C.; Deshaye M. Y.; Dumitrică T.; Dominguez A.; Ehlert S.; Elstner M.; van der Heide T.; Hermann J.; Irle S.; Kranz J. J.; Köhler C.; Kowalczyk T.; Kubař T.; Lee I. S.; Lutsker V.; Maurer R. J.; Min S. K.; Mitchell I.; Negre C.; Niehaus T. A.; Niklasson A. M. N.; Page A. J.; Pecchia A.; Penazzi G.; Persson M. P.; Řezáč J.; Sánchez C. G.; Sternberg M.; Stöhr M.; Stuckenberg F.; Tkatchenko A.; Yu V. W.-Z.; Frauenheim T. DFTB+, a Software Package for Efficient Approximate Density Functional Theory Based Atomistic Simulations. J. Chem. Phys. 2020, 152, 124101.
- 57. Izmailov P.; Podoprikhin D.; Garipov T.; Vetrov D.; Wilson A. G. Averaging Weights Leads to Wider Optima and Better Generalization. arXiv preprint arXiv:1803.05407, 2018.
- 58. Athiwaratkun B.; Finzi M.; Izmailov P.; Wilson A. G. There are Many Consistent Explanations of Unlabeled Data: Why You Should Average. arXiv preprint arXiv:1806.05594, 2018.
- 59. Grisafi A.; Ceriotti M. Incorporating Long-Range Physics in Atomic-Scale Machine Learning. J. Chem. Phys. 2019, 151, 204105.
- 60. Magdău I.-B.; Arismendi-Arrieta D. J.; Smith H. E.; Grey C. P.; Hermansson K.; Csányi G. Machine Learning Force Fields for Molecular Liquids: Ethylene Carbonate/Ethyl Methyl Carbonate Binary Solvent. npj Comput. Mater. 2023, 9, 146.
- 61. Chmiela S.; Sauceda H. E.; Poltavsky I.; Müller K.-R.; Tkatchenko A. sGDML: Constructing Accurate and Data Efficient Molecular Force Fields Using Machine Learning. Comput. Phys. Commun. 2019, 240, 38–45.
- 62. Bartók A. P.; De S.; Poelking C.; Bernstein N.; Kermode J. R.; Csányi G.; Ceriotti M. Machine Learning Unifies the Modeling of Materials and Molecules. Sci. Adv. 2017, 3, e1701816.
- 63. Lubbers N.; Smith J. S.; Barros K. Hierarchical Modeling of Molecular Energies Using a Deep Neural Network. J. Chem. Phys. 2018, 148, 241715.
- 64. Spicher S.; Grimme S. Robust Atomistic Modeling of Materials, Organometallic, and Biochemical Systems. Angew. Chem. Int. Ed. 2020, 59, 15665–15673.
- 65. Alon U.; Yahav E. On the Bottleneck of Graph Neural Networks and its Practical Implications. In International Conference on Learning Representations, April 26–30, 2020.
- 66. Stocker S.; Gasteiger J.; Becker F.; Günnemann S.; Margraf J. T. How Robust are Modern Graph Neural Network Potentials in Long and Hot Molecular Dynamics Simulations? Mach. Learn. Sci. Technol. 2022, 3, 045010.
- 67. Liu Y.; He X.; Mo Y. Discrepancies and the Error Evaluation Metrics for Machine Learning Interatomic Potentials. npj Comput. Mater. 2023, 9, 174.
- 68. Smith J. S.; Nebgen B.; Lubbers N.; Isayev O.; Roitberg A. E. Less is More: Sampling Chemical Space with Active Learning. J. Chem. Phys. 2018, 148, 241733.
- 69. Rupp M.; Ramakrishnan R.; Von Lilienfeld O. A. Machine Learning for Quantum Mechanical Properties of Atoms in Molecules. J. Phys. Chem. Lett. 2015, 6, 3309–3313.
- 70. Smith J. S.; Isayev O.; Roitberg A. E. ANI-1: An Extensible Neural Network Potential with DFT Accuracy at Force Field Computational Cost. Chem. Sci. 2017, 8, 3192–3203.
- 71. Jinnouchi R.; Miwa K.; Karsai F.; Kresse G.; Asahi R. On-the-Fly Active Learning of Interatomic Potentials for Large-Scale Atomistic Simulations. J. Phys. Chem. Lett. 2020, 11, 6946–6955.
- 72. Wang J.; Wolf R. M.; Caldwell J. W.; Kollman P. A.; Case D. A. Development and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157–1174.
- 73. Huguenin-Dumittan K. K.; Loche P.; Haoran N.; Ceriotti M. Physics-Inspired Equivariant Descriptors of Nonbonded Interactions. J. Phys. Chem. Lett. 2023, 14, 9612–9618.
- 74. Batatia I.; Benner P.; Chiang Y.; Elena A. M.; Kovács D. P.; Riebesell J.; Advincula X. R.; Asta M.; Avaylon M.; Baldwin W. J.; Berger F.; Bernstein N.; Bhowmik A.; Blau S. M.; Cărare V.; Darby J. P.; De S.; Della Pia F.; Deringer V. L.; Elijošius R.; El-Machachi Z.; Falcioni F.; Fako E.; Ferrari A. C.; Genreith-Schriever A.; George J.; Goodall R. E. A.; Grey C. P.; Grigorev P.; Han S.; Handley W.; Heenen H. H.; Hermansson K.; Holm C.; Jaafar J.; Hofmann S.; Jakob K. S.; Jung H.; Kapil V.; Kaplan A. D.; Karimitari N.; Kermode J. R.; Kroupa N.; Kullgren J.; Kuner M. C.; Kuryla D.; Liepuoniute G.; Margraf J. T.; Magdău I.-B.; Michaelides A.; Moore J. H.; Naik A. A.; Niblett S. P.; Norwood S. W.; O’Neill N.; Ortner C.; Persson K. A.; Reuter K.; Rosen A. S.; Schaaf L. L.; Schran C.; Shi B. X.; Sivonxay E.; Stenczel T. K.; Svahn V.; Sutton C.; Swinburne T. D.; Tilly J.; van der Oord C.; Varga-Umbrich E.; Vegge T.; Vondrák M.; Wang Y.; Witt W. C.; Zills F.; Csányi G. A Foundation Model for Atomistic Materials Chemistry. arXiv preprint arXiv:2401.00096, 2024.
- 75. Zhang L.; Wang H.; Muniz M. C.; Panagiotopoulos A. Z.; Car R.; E W. A Deep Potential Model with Long-Range Electrostatic Interactions. J. Chem. Phys. 2022, 156, 124107.
- 76. Ko T. W.; Finkler J. A.; Goedecker S.; Behler J. A Fourth-Generation High-Dimensional Neural Network Potential with Accurate Electrostatics Including Non-Local Charge Transfer. Nat. Commun. 2021, 12, 398.
- 77. Kosmala A.; Gasteiger J.; Gao N.; Günnemann S. Ewald-Based Long-Range Message Passing for Molecular Graphs. In International Conference on Machine Learning, 2023; pp 17544–17563.
- 78. Grisafi A.; Nigam J.; Ceriotti M. Multi-Scale Approach for the Prediction of Atomic Scale Properties. Chem. Sci. 2021, 12, 2078–2090.