3.4. Generating QDPi2 MongoDB Database
This tutorial walks through taking an HDF5 file and generating the MongoDB database for it. The process is nearly identical to the QDPi1 example, but the QDPi2 dataset is larger and more complex.
3.4.1. Learning Objectives
Learn how to generate the QDPi2 MongoDB database from a recipe file.
3.4.2. Required Files
You must download the QDPi2 input files from [LINK TBD, currently in a Google Drive folder].
The tutorial Python script is located in examples/QDPi2_Generation.
To generate the QDPi2 dataset as a database, we will use the generate_from_recipe.py script. It is short, so refer to the Full Code section to see what it does.
The key difference from the previous example is that no keyword arguments are accepted here: all settings are defined by, and expected from, the recipe file.
Otherwise, the same class methods exist and can be used to interact with the database.
3.4.3. Full Code
from pathlib import Path
from pharmaforge.database import DataBase
from pharmaforge.recipes.GeneralDatabase import GeneralRecipe
from pharmaforge.recipes.QDPi2 import QDPi2Recipe
inputs = Path("inputs/")
# Check if the input directory exists
if not inputs.exists():
    print("To run this example, you must first obtain the QDPi2 data files.")
    print("These are located at XXX.")
    print("Place these in a directory called 'inputs/' in the current working directory.")
    raise FileNotFoundError(f"Input directory {inputs} does not exist.")
QDPI = QDPi2Recipe(input_dir="inputs/")
QDPI.pprint_one_entry()
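The existence check in the script above only confirms that inputs/ is present, not that it contains any data. As a minimal sketch of a stricter pre-flight check, the helper below lists the HDF5 files found directly inside the input directory. The function name find_hdf5_files and the accepted extensions (.h5 and .hdf5) are assumptions made for illustration; they are not part of pharmaforge, and the actual QDPi2 files may be named differently.

```python
from pathlib import Path

# Assumed file extensions; adjust to match the files you downloaded.
HDF5_SUFFIXES = {".h5", ".hdf5"}


def find_hdf5_files(input_dir):
    """Return a sorted list of HDF5-looking files directly inside input_dir.

    Hypothetical helper for this tutorial, not a pharmaforge function.
    """
    input_dir = Path(input_dir)
    if not input_dir.is_dir():
        raise FileNotFoundError(f"Input directory {input_dir} does not exist.")
    # Only regular files are kept; the suffix comparison is case-insensitive.
    return sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix.lower() in HDF5_SUFFIXES
    )


if __name__ == "__main__":
    try:
        files = find_hdf5_files("inputs/")
    except FileNotFoundError as err:
        print(err)
    else:
        print(f"Found {len(files)} HDF5 file(s) in inputs/")
        for path in files:
            print(f"  {path.name}")
```

If the list comes back empty, the download is probably incomplete or the files sit in a subdirectory, and it is worth fixing that before constructing the recipe.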