Query Syntax
============

The pharmaforge package implements a query syntax that lets users build complex queries from a simple string format. The syntax is designed to be flexible: keywords, operators, and values are combined into expressions that filter the records of a dataset. An example of querying a dataset is included in :doc:`examples/querying`.

Basic queries
-------------

The basic form of a query is as follows:

.. code-block:: python

    query = "field operator value"

Where:

- ``field`` is the name of the field you want to query.
- ``operator`` is the operator you want to use (e.g., ``gt``, ``lt``, ``eq``, ``ne``, ``any``).
- ``value`` is the value you want to compare against.

For example:

.. code-block:: python

    query = "nmols eq 5"

This query returns all records where the ``nmols`` field is equal to 5.

You can also use the ``not`` operator to negate a condition:

.. code-block:: python

    query = "not nmols eq 5"

Behind the scenes, each string is translated into a more verbose query. The first example becomes:

.. code-block:: python

    query = {
        "nmols": {
            "$eq": 5
        }
    }

and the second example becomes:

.. code-block:: python

    query = {
        "nmols": {
            "$not": {
                "$eq": 5
            }
        }
    }

Advanced Query Syntax
---------------------

Compound queries are also supported. These are queries that combine multiple conditions using the logical operators ``and`` and ``or``. The syntax for a compound query is as follows:

.. code-block:: python

    compound_query = "field1 operator1 value1 and field2 operator2 value2"

Where:

- ``field1`` and ``field2`` are the names of the fields you want to query.
- ``operator1`` and ``operator2`` are the operators you want to use (e.g., ``gt``, ``lt``, ``eq``, ``ne``, ``any``).
- ``value1`` and ``value2`` are the values you want to compare against.

For example, a compound query that returns all records where ``contains_elements`` includes H or O atoms, but not C atoms, can be written as:

.. code-block:: python

    query = "contains_elements any [H,O] and not contains_elements any [C]"

This query returns all records where the ``contains_elements`` field contains either H or O, but not C.

The compound query is translated into a more verbose query, like:

.. code-block:: python

    query = {
        "$and": [
            {"contains_elements": {"$not": {"$nin": ["H", "O"]}}},
            {"contains_elements": {"$not": {"$not": {"$nin": ["C"]}}}},
        ]
    }

Note that the nested negations in the translated form quickly become confusing, which is a distinct advantage of writing ``not`` as part of a clause in the query syntax.

List of Query operators
-----------------------

The following operators are *preliminarily* supported in the query syntax:

- ``eq``: Equal to
- ``ne``: Not equal to
- ``gt``: Greater than
- ``lt``: Less than
- ``gte``: Greater than or equal to
- ``lte``: Less than or equal to
- ``any``: Contains any of the specified values
- ``and``: Logical AND
- ``or``: Logical OR
- ``not``: Logical NOT

List of fields
--------------

The query syntax is technically agnostic to the fields in the dataset. For the qdpi dataset, the following queries have been tested.

.. literalinclude:: ../../../examples/DatabaseQuerying/run_queries.py
    :language: python
    :start-at: Start Examples
    :end-at: End Examples
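
As a rough illustration of the string-to-dictionary translation described under Basic queries and Advanced Query Syntax, the sketch below reproduces the translated examples from this page. It is not pharmaforge's implementation: the names ``COMPARISONS``, ``parse_value``, ``translate_clause``, and ``translate`` are hypothetical, and the rule that ``and`` binds tighter than ``or`` is an assumption made only for this example.

.. code-block:: python

    import re

    # Hypothetical operator table. The keys mirror the documented keywords;
    # the mapping itself is an assumption, not pharmaforge's source code.
    COMPARISONS = {
        "eq": "$eq", "ne": "$ne", "gt": "$gt",
        "lt": "$lt", "gte": "$gte", "lte": "$lte",
    }


    def parse_value(text):
        """Turn '5' into 5, '1.5' into 1.5, and '[H,O]' into ['H', 'O']."""
        text = text.strip()
        if text.startswith("[") and text.endswith("]"):
            items = text[1:-1].split(",")
            return [parse_value(item) for item in items if item.strip()]
        for cast in (int, float):
            try:
                return cast(text)
            except ValueError:
                continue
        return text


    def translate_clause(clause):
        """Translate one 'field operator value' clause, with optional 'not'."""
        tokens = clause.split(maxsplit=1)
        negate = tokens[0] == "not"
        if negate:
            clause = tokens[1]
        field, operator, raw_value = clause.split(maxsplit=2)
        value = parse_value(raw_value)
        if operator == "any":
            # Same shape as the translated compound example on this page.
            condition = {"$not": {"$nin": value}}
        else:
            condition = {COMPARISONS[operator]: value}
        if negate:
            condition = {"$not": condition}
        return {field: condition}


    def translate(query):
        """Translate a query string; 'and' is assumed to bind tighter than 'or'."""
        or_groups = []
        for group in re.split(r"\s+or\s+", query.strip()):
            clauses = [translate_clause(c) for c in re.split(r"\s+and\s+", group)]
            or_groups.append(clauses[0] if len(clauses) == 1 else {"$and": clauses})
        return or_groups[0] if len(or_groups) == 1 else {"$or": or_groups}

With these definitions, ``translate("not nmols eq 5")`` returns ``{'nmols': {'$not': {'$eq': 5}}}``, matching the second basic example. Wrapping the clause in a single ``$not`` keeps the negation logic in one place, which is the readability benefit the ``not`` keyword provides over writing the nested form by hand.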