The application of machine learning models and algorithms towards describing atomic interactions has been a major area of interest in materials simulations in recent years, as machine learning interatomic potentials (MLIPs) are seen as being more flexible and accurate than their classical potential counterparts. This increase in accuracy of MLIPs over classical potentials has come at the cost of significantly increased complexity, leading to higher computational costs and lower physical interpretability and spurring research into improving the speeds and interpretability of MLIPs. As an alternative, in this work we leverage “machine learning” fitting databases and advanced optimization algorithms to fit a class of spline-based classical potentials, showing that they can be systematically improved in order to achieve accuracies comparable to those of low-complexity MLIPs. These results demonstrate that high model complexities may not be strictly necessary in order to achieve near-DFT accuracy in interatomic potentials and suggest an alternative route towards sampling the high accuracy, low complexity region of model space by starting with forms that promote simpler and more interpretable interatomic potentials.

Citation: Joshua A. Vita, Dallas R. Trinkle, "Exploring the necessary complexity of interatomic potentials", Computational Materials Science, Volume 200 (2021), 110752.
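
To make the idea of a spline-based classical potential concrete, here is a minimal sketch in Python of a pair potential whose radial function is a natural cubic spline through a handful of knots. It is a simplified, pair-only illustration and not the model form or fitting procedure used in the paper. The knot positions, knot values, and function names (`second_derivatives`, `spline_eval`, `pair_energy`) are hypothetical and chosen only for demonstration. In a fitting workflow, the knot values would be the adjustable parameters tuned against a reference database.

```python
import bisect
import math

# Hypothetical knot positions (angstrom) and knot values (eV); the last
# value is zero so the potential vanishes at the cutoff radius.
KNOTS = [1.5, 2.0, 2.5, 3.0, 3.5]
VALUES = [2.0, -0.5, -1.0, -0.3, 0.0]


def second_derivatives(x, y):
    """Second derivatives of the natural cubic spline at each knot.

    Solves the standard tridiagonal system with the Thomas algorithm,
    using the natural boundary condition m[0] = m[-1] = 0.
    """
    n = len(x)
    m = [0.0] * n
    if n < 3:
        return m
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    a, b, c, d = ([0.0] * n for _ in range(4))
    for i in range(1, n - 1):
        a[i] = h[i - 1]
        b[i] = 2.0 * (h[i - 1] + h[i])
        c[i] = h[i]
        d[i] = 6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    # Forward elimination over the interior rows.
    for i in range(2, n - 1):
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    # Back substitution (m[n - 1] is zero, so the last interior row is direct).
    m[n - 2] = d[n - 2] / b[n - 2]
    for i in range(n - 3, 0, -1):
        m[i] = (d[i] - c[i] * m[i + 1]) / b[i]
    return m


def spline_eval(x, y, m, r):
    """Evaluate the spline at r; outside the knots the end polynomial is extrapolated."""
    i = min(max(bisect.bisect_right(x, r) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    left, right = x[i + 1] - r, r - x[i]
    return ((m[i] * left**3 + m[i + 1] * right**3) / (6.0 * h)
            + (y[i] / h - m[i] * h / 6.0) * left
            + (y[i + 1] / h - m[i + 1] * h / 6.0) * right)


def pair_energy(positions, knots=KNOTS, values=VALUES):
    """Total energy as a sum of spline pair terms over all pairs inside the cutoff."""
    m = second_derivatives(knots, values)
    r_cut = knots[-1]
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            if r < r_cut:
                energy += spline_eval(knots, values, m, r)
    return energy


if __name__ == "__main__":
    dimer = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
    print(pair_energy(dimer))
```

Because the spline interpolates its knots, a dimer separated by exactly a knot distance returns that knot's value, which is one reason such forms are comparatively easy to inspect: each parameter is the value of the potential at a known distance.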

ColabFit Data Repository

Atomic interactions in classical molecular simulation are modeled using a function called an interatomic potential (IP) or force field. Traditionally, IPs have used functional forms that aim to explicitly represent aspects of the bonding and/or geometry of the system and are fitted to relatively small datasets of key material properties. Recently, interest has grown in data-driven IPs (DDIPs), in which machine learning methods are used to interpolate first-principles calculations. Because they lack explicit physics, DDIPs must be trained on large datasets and must be retrained whenever an application falls outside the original dataset. To facilitate this and allow research groups to easily exchange DDIPs and their training datasets, it is important to develop a standard for archiving and retrieving datasets. This effort is being pursued as part of the ColabFit project, which aims to enable the development, exchange, and deployment of DDIPs and their datasets. In this work we outline a standard for distributing DDIP training sets and showcase the functionality of the ColabFit tools and data repository for providing open access to a large collection of high-quality training data.

Citation: manuscript in progress

Resources: colabfit-tools GitHub repository