Querying the MongoDB Database
=================================

In this tutorial, you will learn how to develop basic queries for elements from the MongoDB database. This tutorial will also cover how to use a query to obtain a new hdf5 file, which can later be used for DeepMDKit training.

Learning Objectives
-------------------

- How to query the MongoDB database
- How to construct compound queries
- How to use a query to obtain a new hdf5 file

Required Files
--------------

- MongoDB database initialized through previous tutorials (specifically, :doc:`generating_qdpi2`)
- The files for this tutorial are in ``examples/DatabaseQuerying``

Tutorial
--------

In this tutorial, we will learn how to query the MongoDB database. To start, we need to import the necessary libraries and ensure that the MongoDB database is running.

.. literalinclude :: ../../../../examples/DatabaseQuerying/run_queries.py
   :language: python
   :end-before: Start Examples

Now that the database is running, we can start querying it. If this step failed, make sure that you started the MongoDB server before beginning this tutorial and that you completed the previous tutorial.

The next step is to define a set of example queries. All queries follow the same general format, which the examples below illustrate. The operators that you can use are ``eq``, ``neq``, ``gt``, ``gte``, ``lt``, ``lte``, and ``any``. Additionally, you can use logical operators such as ``and``, ``or``, and ``not`` to make compound queries. Some examples are defined below.

.. literalinclude :: ../../../../examples/DatabaseQuerying/run_queries.py
   :language: python
   :start-at: Start Examples
   :end-at: End Examples

Before we run this list of queries, note what is happening behind the scenes: the `pharmaforge` package converts the strings above into a JSON-like dictionary, which is what pymongo actually interprets.

.. literalinclude :: ../../../../examples/DatabaseQuerying/run_queries.py
   :language: python
   :start-at: End Examples
   :end-before: for query in

Now let's run the queries. We do this in a for loop. For each query, the output shows a SMILES string that matches the query and how many documents (that is, how many molecules) match it.

.. literalinclude :: ../../../../examples/DatabaseQuerying/run_queries.py
   :language: python
   :start-at: for query in
   :end-before: Now lets try

Being able to query the database is great, but it is better to actually use a query for something. In the code block below, we will use a query to generate a new hdf5 file. This file will be used in the next tutorial, where we will relabel the data to obtain labels at a different level of theory.

.. literalinclude :: ../../../../examples/DatabaseQuerying/run_queries.py
   :language: python
   :start-at: Now lets try

Full Code
---------

.. literalinclude :: ../../../../examples/DatabaseQuerying/run_queries.py
   :language: python
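
As a supplement to the note in the tutorial about query strings being converted into a dictionary for pymongo, the following is a minimal sketch of what such a dictionary can look like. It is not the `pharmaforge` implementation. The helper names ``comparison`` and ``combine``, the field names ``nopt`` and ``charge``, and the mapping of the tutorial's operator names onto MongoDB's ``$``-prefixed operators are assumptions made for illustration only. The ``any`` and ``not`` operators are left out because the tutorial does not say how they are translated.

.. code-block:: python

   # Assumed mapping from the tutorial's operator names to MongoDB
   # comparison operators (for illustration; not taken from pharmaforge).
   COMPARISON_OPS = {
       "eq": "$eq",
       "neq": "$ne",
       "gt": "$gt",
       "gte": "$gte",
       "lt": "$lt",
       "lte": "$lte",
   }

   # Assumed mapping for the logical operators covered by this sketch.
   LOGICAL_OPS = {"and": "$and", "or": "$or"}


   def comparison(field, op, value):
       """Build a single-field filter such as {"nopt": {"$gt": 10}}."""
       if op not in COMPARISON_OPS:
           raise ValueError(f"unsupported comparison operator: {op}")
       return {field: {COMPARISON_OPS[op]: value}}


   def combine(op, *filters):
       """Join several filters with a logical operator, e.g. {"$and": [...]}."""
       if op not in LOGICAL_OPS:
           raise ValueError(f"unsupported logical operator: {op}")
       return {LOGICAL_OPS[op]: list(filters)}


   # "nopt" and "charge" are hypothetical field names.
   big = comparison("nopt", "gt", 10)
   neutral = comparison("charge", "eq", 0)
   compound = combine("and", big, neutral)
   print(compound)

A dictionary of this shape is what pymongo's ``Collection.find`` and ``Collection.count_documents`` accept as their filter argument, so it can be passed to either method once a collection handle is available.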