query_profiler

class QueryProfiler

A class that parses and profiles SQL queries using our Java-based query parser analyzer.

Variables:
  • __jar_path (str) – The path to the JAR file for the query parser analyzer.

  • query (str) – The query to be parsed and profiled.

  • tree (dict) – The tree representation of the query.

  • stats (dict) – The statistics of the query.

profile_query(query: str) -> dict

Parses and profiles the query.

Parameters:

query (str) – The query to be parsed and profiled.

Returns:

A dictionary containing the parse tree and statistics of the query.

Return type:

dict

get_identifiers_and_labels(query=None, distinct=True, include_brackets=True) -> dict

Returns a dictionary of the identifiers and labels in the query.

Parameters:
  • query (str or None) – The query to be parsed (optional, defaults to self.query).

  • distinct (bool) – Whether to return distinct identifiers (default is True).

  • include_brackets (bool) – Whether to include brackets in identifiers (default is True).

Returns:

A dictionary containing lists of tables, columns, logical operators, functions, and a dictionary of clauses.

Return type:

dict
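
The effect of the distinct and include_brackets options can be illustrated with a small standalone sketch. This is not the library's implementation: normalize_identifiers is a hypothetical helper, and it assumes identifiers use T-SQL style square-bracket quoting such as [dbo].[Orders].

```python
def normalize_identifiers(identifiers, distinct=True, include_brackets=True):
    """Hypothetical helper: optionally strip [brackets] and drop duplicates."""
    result = []
    seen = set()
    for ident in identifiers:
        if not include_brackets:
            # Strip square brackets from each dot-separated part.
            # Assumption: bracketed names do not themselves contain dots.
            ident = ".".join(part.strip("[]") for part in ident.split("."))
        if distinct:
            if ident in seen:
                continue  # keep only the first occurrence, preserving order
            seen.add(ident)
        result.append(ident)
    return result


tables = ["[dbo].[Orders]", "[dbo].[Orders]", "Customers"]
print(normalize_identifiers(tables))
# ['[dbo].[Orders]', 'Customers']
print(normalize_identifiers(tables, distinct=False, include_brackets=False))
# ['dbo.Orders', 'dbo.Orders', 'Customers']
```

Whether the real method preserves first-seen order when distinct is True is not stated in this reference; the sketch does so only because it makes the output easier to read.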

get_identifiers_and_labels_df(query=None, query_num=-1, include_brackets=True) -> pandas.DataFrame

Returns a dataframe of the identifiers and labels in the query.

Parameters:
  • query (str or None) – The query to be parsed (optional, defaults to self.query).

  • query_num (int) – The query number (default is -1).

  • include_brackets (bool) – Whether to include brackets in identifiers (default is True).

Returns:

A Pandas DataFrame containing the query number, identifier type, and identifier value.

Return type:

pandas.DataFrame
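
The three-column layout described above (query number, identifier type, identifier value) amounts to flattening the identifier dictionary into one row per value. The following is a minimal sketch of that reshaping, not the method's actual code: identifiers_to_rows, the key names in the example, and the column names in the usage note are all assumptions, and only list-valued entries are handled.

```python
def identifiers_to_rows(identifiers, query_num=-1):
    """Hypothetical helper: flatten {type: [values]} into (query_num, type, value) rows."""
    rows = []
    for id_type, values in identifiers.items():
        if not isinstance(values, list):
            continue  # skip non-list entries (e.g. a nested clauses dict) in this sketch
        for value in values:
            rows.append((query_num, id_type, value))
    return rows


# Example input; the key names are illustrative, not the library's documented keys.
example = {"tables": ["Orders"], "columns": ["id", "total"], "functions": ["SUM"]}
for row in identifiers_to_rows(example, query_num=3):
    print(row)
```

Rows in this shape can be passed to pandas.DataFrame(rows, columns=[...]) with whatever column names the caller prefers, for example "query_num", "type", and "value".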

tag_query(query: str, syntax: str = 'tsql') -> dict

Tags a query’s table and column names using the Java-based query parser analyzer.

Parameters:
  • query (str) – The query to be tagged.

  • syntax (str) – The syntax of the query (default is "tsql"). Available options: "tsql", "sqlite".

Returns:

A dictionary containing the tagged query, table aliases, and column aliases.

Return type:

dict

__parse_query(query: str, syntax: str = 'mssql') -> dict

Parses a query using the Java-based query parser analyzer.

Parameters:
  • query (str) – The query to be parsed.

  • syntax (str) – The syntax of the query (default is "mssql"). Currently only "mssql" is supported.

Returns:

A dictionary containing the parse tree and statistics of the query.

Return type:

dict
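
Names with two leading underscores, such as __parse_query and __jar_path, are subject to Python's name mangling when they are defined inside a class body. The sketch below shows the general language behavior with a stand-in class; Demo, its members, and the placeholder path are hypothetical and do not reproduce QueryProfiler.

```python
class Demo:
    def __init__(self):
        self.__jar_path = "parser.jar"  # placeholder value, stored as _Demo__jar_path

    def __parse(self, query):
        # Private helper; stored on the class as _Demo__parse.
        return {"query": query}

    def public(self, query):
        # Inside the class body the short name works as written.
        return self.__parse(query)


d = Demo()
print(d.public("SELECT 1"))         # {'query': 'SELECT 1'}
print(hasattr(d, "__parse"))        # False: the unmangled name does not exist
print(d._Demo__parse("SELECT 1"))   # {'query': 'SELECT 1'}
print(d._Demo__jar_path)            # parser.jar
```

If the documented members are class members, the same rule implies they are reachable from outside only under the _QueryProfiler prefix, which signals that they are internal and that callers should prefer the public methods.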

parse_tree_pretty_print(tree=None) -> None

Prints a formatted representation of the parse tree.

Parameters:

tree (dict or None) – The parse tree to print (optional, defaults to self.tree).
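
The exact output format of the printer is not specified here. As a rough illustration of what pretty-printing a nested parse tree involves, the sketch below walks a structure of dicts and lists recursively and indents by depth. The format_tree helper and the shape of the example tree are assumptions, not the analyzer's real output.

```python
def format_tree(node, indent=0):
    """Hypothetical helper: return indented lines for a nested dict/list structure."""
    pad = "  " * indent
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, (dict, list)):
                lines.append(f"{pad}{key}:")
                lines.extend(format_tree(value, indent + 1))
            else:
                lines.append(f"{pad}{key}: {value}")
    elif isinstance(node, list):
        for item in node:
            lines.extend(format_tree(item, indent))
    else:
        lines.append(f"{pad}{node}")
    return lines


# Illustrative tree; the real parse tree keys are not documented in this section.
tree = {"select": {"columns": ["id", "total"], "from": {"table": "Orders"}}}
print("\n".join(format_tree(tree)))
```

Returning the lines instead of printing them directly keeps the helper easy to test; the documented method prints and returns None.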

__single_obj_dict_to_tuple(dict_in) -> tuple

Converts a single-key dictionary to a tuple.

Parameters:

dict_in (dict) – The dictionary to convert.

Returns:

A tuple containing the key and value of the input dictionary.

Return type:

tuple
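
The conversion described above can be sketched in a few lines. This is an illustrative version under one assumption that the reference does not confirm: that a dictionary with zero or several keys should be rejected with a ValueError. The function name is written without the leading underscores for readability.

```python
def single_obj_dict_to_tuple(dict_in):
    """Sketch: convert a single-key dict such as {"a": 1} into ("a", 1)."""
    if len(dict_in) != 1:
        # Assumed behavior; the documented method may handle this case differently.
        raise ValueError(f"expected exactly one key, got {len(dict_in)}")
    # Unpack the only (key, value) pair.
    ((key, value),) = dict_in.items()
    return key, value


print(single_obj_dict_to_tuple({"table": "Orders"}))  # ('table', 'Orders')
```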

tag_query_test() -> None

Tests the tag_query method.

profile_query_test() -> None

Tests the profile_query method.

Prints a tagged query and its statistics.

all_tests() -> None

Runs all test functions.