3.6. Interfaces with ASE
Warning
This tutorial requires Gaussian to be installed on your system. If Gaussian is NOT installed, delete the Gaussian portion of the tutorial from your Python script before running it. By default, g16 is expected to be on your PATH.
The goal of this tutorial is to demonstrate the use of the pharmaforge package in conjunction with the dpdata and ase packages to re-label data within the database. If you haven’t gone through the querying tutorial in the previous example, complete that one first, since this tutorial relies on the database created there.
3.6.1. Learning Objectives
Learn how to use the pharmaforge package with ase to re-label data in the database.
Learn how to use the interface to dpdata to re-label data in the database.
3.6.2. Required Files
The tutorial Python script is located in examples/CalculationInterfaces.
3.6.3. Tutorial
To get started, we import the required packages and connect to the database created in the previous example:
# This can be read in by dpdata
from pathlib import Path
from pymongo import MongoClient
from pharmaforge.queries import Query
from pharmaforge.database import Loader
from pharmaforge.interfaces import Psi4Interface
from pharmaforge.interfaces import GaussianInterface
from pharmaforge.interfaces import XTBInterface
try:
    client = MongoClient('mongodb://localhost:27017/')
    # MongoClient connects lazily, so ping the server to surface connection errors here.
    client.admin.command("ping")
    db = client["QDPi2_Database"]
except Exception as e:
    print("MongoDB is not running, or you haven't created the database in the previous example. Please start MongoDB and try again. Reason: ", e)
    exit()
"""
Note - the same as above can be accomplished with the following code for simplicity.
client = Loader()
db = client.select_db("QDPi2_Database")
"""
3.6.3.1. Psi4 with Multiprocessing Interface Example
Next, we will use the Psi4 package to calculate the energy and the forces of the system.
For large datasets, it is recommended to run the calculations in parallel. Compared with a serial call, the only change is the num_workers=4 argument passed to calculate(), which uses multiprocessing to run the calculations across 4 worker processes.
print("**************************")
print("Test Parallel Psi4 -on HDF5")
print("**************************")
try:
    test2 = Psi4Interface(basis="6-31G", method="B3LYP", num_threads=1)
    test2.ObtainDeepData("inputs/saved_model.hdf5")
    energies, forces = test2.calculate(limit=4, verbose=True, num_workers=4)
    print(energies)
except Exception as e:
    print("Calculation failed, skipping test. Reason: ", e)
Note
Here we show this for Psi4, but it is the same for most interfaces.
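The example above pairs num_workers=4 with num_threads=1. A minimal sketch of the reasoning, assuming that num_threads is the number of threads used by each individual calculation, so that the total load is roughly num_workers times num_threads. The helper threads_per_worker is hypothetical and is not part of pharmaforge:

```python
import os
from typing import Optional


def threads_per_worker(num_workers: int, total_cores: Optional[int] = None) -> int:
    """Split the available cores evenly across workers, with at least 1 thread each.

    Hypothetical helper for illustration; not part of pharmaforge.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if total_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        total_cores = os.cpu_count() or 1
    return max(1, total_cores // num_workers)


# On a 4-core machine, 4 workers leave 1 thread for each calculation.
print(threads_per_worker(4, total_cores=4))   # 1
# On a 16-core machine, 4 workers could use 4 threads each.
print(threads_per_worker(4, total_cores=16))  # 4
```

Under this assumption, the value returned here would be passed as num_threads when constructing the interface, and num_workers would be passed to calculate().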
3.6.3.2. Working Directly with Queries
Now we will run the same example using the query interface. It works essentially the same way, but the data now comes from the Query class instead of an HDF5 file. Here we use the same query that was used to create saved_model.hdf5.
print("**************************")
print("Test Parallel Psi4 - on Query")
print("**************************")
try:
    test2 = Psi4Interface(basis="6-31G*", method="B3LYP", num_threads=1)
    # Recall our query that generated the hdf5 from the previous example.
    q = Query("nmols gt 4 and nmols lt 7")
    results = q.apply(db, "ani_qdpi")
    test2.ObtainQueryData(results)
    energies, forces = test2.calculate(limit=4, verbose=True, num_workers=4)
    print(energies)
    print("Note - this one didn't require an intermediate file!")
except Exception as e:
    print("Calculation failed, skipping test. Reason: ", e)
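The query string used above reads as two comparisons on the nmols field joined by "and". How pharmaforge parses it internally is not shown in this tutorial. As a rough illustration only, the sketch below translates that kind of string into a MongoDB-style filter using the standard $gt and $lt operators. The function to_mongo_filter and the OPERATORS table are hypothetical, and they cover only the two operators that appear in this tutorial:

```python
# Hypothetical mapping from the query keywords seen in this tutorial
# to MongoDB comparison operators.
OPERATORS = {"gt": "$gt", "lt": "$lt"}


def to_mongo_filter(query: str) -> dict:
    """Translate 'field op value and field op value' into a MongoDB filter dict.

    Illustrative sketch only; this is not pharmaforge's parser.
    """
    mongo_filter: dict = {}
    for clause in query.split(" and "):
        field, op, value = clause.split()
        if op not in OPERATORS:
            raise ValueError(f"unsupported operator: {op}")
        # Conditions on the same field are merged into one sub-document.
        mongo_filter.setdefault(field, {})[OPERATORS[op]] = int(value)
    return mongo_filter


print(to_mongo_filter("nmols gt 4 and nmols lt 7"))
# {'nmols': {'$gt': 4, '$lt': 7}}
```

Read this way, the query selects entries whose nmols value is 5 or 6.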
3.6.3.3. Gaussian Interface Example
Next, we will use the Gaussian package to calculate the energy and the forces of the system. If you don’t have Gaussian installed, this portion of the tutorial will fail.
print("**************************")
print("Test Gaussian")
print("**************************")
try:
    test3 = GaussianInterface(basis="6-31G", method="B3LYP", num_threads=4)
    test3.ObtainDeepData("inputs/saved_model.hdf5")
    energies, forces = test3.calculate(limit=4, verbose=True)
    print(energies)
except Exception as e:
    print("Calculation failed, skipping test. Reason: ", e)
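Since this section expects g16 to be on your PATH, it can help to check for the executable before starting the calculation. A minimal sketch using only the Python standard library; the helper name executable_available is hypothetical and is not part of pharmaforge:

```python
import shutil


def executable_available(name: str = "g16") -> bool:
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(name) is not None


if executable_available("g16"):
    print("g16 found on PATH; the Gaussian section can run.")
else:
    print("g16 not found on PATH; skip the Gaussian section.")
```

A check like this could guard the Gaussian block as an alternative to deleting it from the script.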
3.6.3.4. XTB Interface Example
Next, we will use the XTB package to calculate the energy and the forces of the system.
print("**************************")
print("Test XTB")
print("**************************")
try:
    test4 = XTBInterface(method="GFN2-xTB", accuracy=1.0, electronic_temperature=300.0)
    test4.ObtainDeepData("inputs/saved_model.hdf5")
    energies, forces = test4.calculate(limit=4, verbose=True)
    print(energies)
except Exception as e:
    print("Calculation failed, skipping test. Reason: ", e)
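Each section of this tutorial repeats the same pattern: print a banner, run a calculation inside try/except, and report a failure without stopping the script. One possible way to factor that out is sketched below. The helper run_section is hypothetical and is not part of pharmaforge, and the stand-in jobs are placeholders for the interface calls shown above:

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_section(title: str, job: Callable[[], T]) -> Optional[T]:
    """Print a banner, run job(), and return its result or None on failure."""
    banner = "*" * 26
    print(banner)
    print(title)
    print(banner)
    try:
        return job()
    except Exception as e:
        # Same message as the tutorial blocks; the script keeps going.
        print("Calculation failed, skipping test. Reason: ", e)
        return None


# Stand-in jobs; in the tutorial, the job would build an interface,
# load the data, and return the result of calculate().
print(run_section("Test that succeeds", lambda: "done"))
print(run_section("Test that fails", lambda: 1 / 0))
```

The first call returns the job's result, while the second prints the failure reason and returns None.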
3.6.4. See Also
To add data to the database, check out the Relabeling tutorial.
3.6.5. Full Code
# This can be read in by dpdata
from pathlib import Path
from pymongo import MongoClient
from pharmaforge.queries import Query
from pharmaforge.database import Loader
from pharmaforge.interfaces import Psi4Interface
from pharmaforge.interfaces import GaussianInterface
from pharmaforge.interfaces import XTBInterface
try:
    client = MongoClient('mongodb://localhost:27017/')
    # MongoClient connects lazily, so ping the server to surface connection errors here.
    client.admin.command("ping")
    db = client["QDPi2_Database"]
except Exception as e:
    print("MongoDB is not running, or you haven't created the database in the previous example. Please start MongoDB and try again. Reason: ", e)
    exit()
"""
Note - the same as above can be accomplished with the following code for simplicity.
client = Loader()
db = client.select_db("QDPi2_Database")
"""
# Start Multiprocessing
print("**************************")
print("Test Parallel Psi4 -on HDF5")
print("**************************")
try:
    test2 = Psi4Interface(basis="6-31G", method="B3LYP", num_threads=1)
    test2.ObtainDeepData("inputs/saved_model.hdf5")
    energies, forces = test2.calculate(limit=4, verbose=True, num_workers=4)
    print(energies)
except Exception as e:
    print("Calculation failed, skipping test. Reason: ", e)
# End Multiprocessing
# Start Multiprocessing from Query
print("**************************")
print("Test Parallel Psi4 - on Query")
print("**************************")
try:
    test2 = Psi4Interface(basis="6-31G*", method="B3LYP", num_threads=1)
    # Recall our query that generated the hdf5 from the previous example.
    q = Query("nmols gt 4 and nmols lt 7")
    results = q.apply(db, "ani_qdpi")
    test2.ObtainQueryData(results)
    energies, forces = test2.calculate(limit=4, verbose=True, num_workers=4)
    print(energies)
    print("Note - this one didn't require an intermediate file!")
except Exception as e:
    print("Calculation failed, skipping test. Reason: ", e)
# End Multiprocessing from Query
# Start Gaussian
print("**************************")
print("Test Gaussian")
print("**************************")
try:
    test3 = GaussianInterface(basis="6-31G", method="B3LYP", num_threads=4)
    test3.ObtainDeepData("inputs/saved_model.hdf5")
    energies, forces = test3.calculate(limit=4, verbose=True)
    print(energies)
except Exception as e:
    print("Calculation failed, skipping test. Reason: ", e)
# End Gaussian
# Start XTB
print("**************************")
print("Test XTB")
print("**************************")
try:
    test4 = XTBInterface(method="GFN2-xTB", accuracy=1.0, electronic_temperature=300.0)
    test4.ObtainDeepData("inputs/saved_model.hdf5")
    energies, forces = test4.calculate(limit=4, verbose=True)
    print(energies)
except Exception as e:
    print("Calculation failed, skipping test. Reason: ", e)
# End XTB