nl_to_sql_inference_and_prompt_generation
- do_single_question(original_prompt, use_database, question, xwalk_directory=None, column_naturalness=0, table_naturalness=0, log=True, filename_suffix='GPT-FT', filename_prefix='', task='query', service='openai', model_name='GPT-3.5', db_type='sql server', db_list_file='.local/dbinfo.json')
Runs a single natural language question against the specified database: the question is sent to the chosen AI service, which generates a predicted SQL query.
- Parameters:
original_prompt (str) – The initial prompt to be used.
use_database (str) – The database to query.
question (str) – The question to be appended to the prompt.
xwalk_directory (str, optional) – Directory for crosswalk files. Defaults to None.
column_naturalness (int, optional) – Level of naturalness for columns. Defaults to 0.
table_naturalness (int, optional) – Level of naturalness for tables. Defaults to 0.
log (bool, optional) – Whether to log the attempt. Defaults to True.
filename_suffix (str, optional) – Suffix for filenames. Defaults to ‘GPT-FT’.
filename_prefix (str, optional) – Prefix for filenames. Defaults to ‘’.
task (str, optional) – The task to perform (‘query’ or ‘tables’). Defaults to ‘query’.
service (str, optional) – The AI service to use (‘openai’, ‘google-vertex’, ‘google-palm’, ‘code-llama-aws’, ‘togetherai’). Defaults to ‘openai’.
model_name (str, optional) – The model name to use. Defaults to ‘GPT-3.5’.
db_type (str, optional) – The type of database (‘sql server’ or ‘sqlite’). Defaults to ‘sql server’.
db_list_file (str, optional) – Path to the database list file. Defaults to ‘.local/dbinfo.json’.
- Returns:
A dictionary containing the prompt, SQL response, result dataframe, naturalness, and denaturalized response.
- Return type:
dict
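The option values listed for `task`, `service`, and `db_type` can be checked before a call is made. The following is a minimal sketch, not part of the module: `build_question_kwargs` is a hypothetical helper, and the schema prompt, database name, and question are placeholder values.

```python
# Hypothetical helper (not part of the documented module). It collects keyword
# arguments for do_single_question and checks them against the option values
# listed in the parameter descriptions above.
VALID_TASKS = {"query", "tables"}
VALID_SERVICES = {"openai", "google-vertex", "google-palm", "code-llama-aws", "togetherai"}
VALID_DB_TYPES = {"sql server", "sqlite"}


def build_question_kwargs(original_prompt, use_database, question,
                          task="query", service="openai", db_type="sql server",
                          **extra):
    """Return a kwargs dict, raising ValueError for undocumented option values."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {sorted(VALID_TASKS)}, got {task!r}")
    if service not in VALID_SERVICES:
        raise ValueError(f"service must be one of {sorted(VALID_SERVICES)}, got {service!r}")
    if db_type not in VALID_DB_TYPES:
        raise ValueError(f"db_type must be one of {sorted(VALID_DB_TYPES)}, got {db_type!r}")
    kwargs = {
        "original_prompt": original_prompt,
        "use_database": use_database,
        "question": question,
        "task": task,
        "service": service,
        "db_type": db_type,
    }
    kwargs.update(extra)  # e.g. column_naturalness, table_naturalness, log
    return kwargs


kwargs = build_question_kwargs(
    original_prompt="CREATE TABLE birds (id INT, species TEXT);",  # placeholder schema prompt
    use_database="example_db",                                     # placeholder database name
    question="How many species are recorded?",
    db_type="sqlite",
    log=False,
)
# With the module importable, the call would then be:
# result = do_single_question(**kwargs)
```

Validating up front keeps a typo in `service` or `db_type` from surfacing only after a request has been sent.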
- naturalize_prompt(schema_prompt, db_name, xwalk_directory='./db/schema-xwalks/consolidated_and_validated/', column_naturalness=0, table_naturalness=0, filename_suffix='GPT-FT', filename_prefix='')
Naturalize the prompt by replacing table and column names with natural language names.
- Parameters:
schema_prompt (str) – The prompt to naturalize.
db_name (str) – The name of the database on which the resulting query will be run.
xwalk_directory (str, optional) – The directory in which the crosswalk files are stored. Defaults to ‘./db/schema-xwalks/consolidated_and_validated/’.
column_naturalness (int, optional) – The level of naturalness to use for column names. Defaults to 0.
table_naturalness (int, optional) – The level of naturalness to use for table names. Defaults to 0.
filename_suffix (str, optional) – The suffix to use for the crosswalk files. Defaults to ‘GPT-FT’.
filename_prefix (str, optional) – The prefix to use for the crosswalk files. Defaults to ‘’.
- Returns:
A tuple containing a dictionary with the naturalness levels and the naturalized schema prompt.
- Return type:
tuple[dict, str]
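The idea of naturalizing a schema prompt can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the module's implementation: the crosswalk is an in-memory dict with invented identifiers (the real crosswalk file format is not described here), and the naturalness levels are passed through unchanged rather than used to select a crosswalk.

```python
import re


def naturalize_sketch(schema_prompt, crosswalk, column_naturalness=0, table_naturalness=0):
    """Replace native identifiers with natural names in a single pass.

    `crosswalk` maps native identifier -> natural name. It is a hypothetical
    stand-in for the crosswalk files stored in xwalk_directory.
    """
    naturalness = {"table": table_naturalness, "column": column_naturalness}
    if not crosswalk:
        return naturalness, schema_prompt
    # Longest identifiers first, so a short name never matches inside a longer one.
    names = sorted(crosswalk, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    naturalized = pattern.sub(lambda m: crosswalk[m.group(1)], schema_prompt)
    return naturalness, naturalized


# Invented identifiers, for illustration only.
xwalk = {"tblSpp": "species", "SppCd": "species_code", "SppNm": "species_name"}
levels, text = naturalize_sketch(
    "CREATE TABLE tblSpp (SppCd TEXT, SppNm TEXT);",
    xwalk,
    column_naturalness=1,
    table_naturalness=1,
)
print(levels)
print(text)
```

Doing the substitution in one regular-expression pass avoids a replaced name being rewritten again by a later crosswalk entry.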
- denaturalize_query(query, naturalness, xwalk_directory='./db/schema-xwalks/consolidated_and_validated/', db_name='PacificIslandLandbirds', filename_suffix='GPT-FT', filename_prefix='', syntax='tsql', target_naturalness='native')
Denaturalize a query by replacing natural language table and column names with their native identifiers.
- Parameters:
query (str) – The query to denaturalize.
naturalness (dict) – A dictionary with keys ‘table’ and ‘column’ and values corresponding to the naturalness level used for each.
xwalk_directory (str, optional) – The directory in which the crosswalk files are stored. Defaults to ‘./db/schema-xwalks/consolidated_and_validated/’.
db_name (str, optional) – The name of the database on which the resulting query will be run. Defaults to ‘PacificIslandLandbirds’.
filename_suffix (str, optional) – The suffix to use for the crosswalk files. Defaults to ‘GPT-FT’.
filename_prefix (str, optional) – The prefix to use for the crosswalk files. Defaults to ‘’.
syntax (str, optional) – The SQL syntax to use (‘tsql’ or ‘sqlite’). Defaults to ‘tsql’.
target_naturalness (str, optional) – The target naturalness level. Defaults to ‘native’.
- Returns:
The denaturalized query.
- Return type:
str
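Denaturalization is the reverse mapping: natural names in a generated query are swapped back to native identifiers so the query can run on the real schema. The sketch below assumes an in-memory crosswalk with invented identifiers and ignores the `syntax` and `target_naturalness` options; `denaturalize_sketch` is a hypothetical name, not the module's function.

```python
import re


def denaturalize_sketch(query, crosswalk):
    """Replace natural names in `query` with native identifiers.

    `crosswalk` maps native identifier -> natural name (hypothetical data);
    it is inverted here to look up the native name for each natural one.
    """
    reverse = {natural: native for native, natural in crosswalk.items()}
    if not reverse:
        return query
    # Longest names first, so 'species' never matches inside 'species_code'.
    names = sorted(reverse, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: reverse[m.group(1)], query)


# Invented identifiers, for illustration only.
xwalk = {"tblSpp": "species", "SppCd": "species_code", "SppNm": "species_name"}
native_query = denaturalize_sketch(
    "SELECT species_name FROM species WHERE species_code = 'APAP';",
    xwalk,
)
print(native_query)
```

A plain text substitution like this would also rewrite matching words inside string literals or comments, so a production version would likely need to be aware of the SQL dialect.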
- log_attempt(prompt, response, result_df, database, model_name, naturalness={'table': 0, 'column': 0}, denaturalized_response=None)
Logs the attempt to a file.
- Parameters:
prompt (str) – The prompt used.
response (str) – The response received.
result_df (pandas.DataFrame) – The result dataframe.
database (str) – The database used.
model_name (str) – The model name used.
naturalness (dict, optional) – The naturalness level used. Defaults to {‘table’: 0, ‘column’: 0}.
denaturalized_response (str, optional) – The denaturalized response. Defaults to None.
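The log file's location and format are not specified above. As one possible shape, the sketch below appends each attempt as a JSON line. Everything in it is an assumption: the `log_attempt_sketch` name, the `log_path` parameter, the record fields, and the use of a row count in place of the result dataframe. It also uses a `None` sentinel for `naturalness`, since a dict default in a Python signature is a single object shared across calls.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_attempt_sketch(log_path, prompt, response, row_count, database, model_name,
                       naturalness=None, denaturalized_response=None):
    """Append one attempt to a JSON Lines file and return the record written."""
    if naturalness is None:
        # Build a fresh dict per call instead of sharing one default object.
        naturalness = {"table": 0, "column": 0}
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "database": database,
        "model_name": model_name,
        "prompt": prompt,
        "response": response,
        "row_count": row_count,  # stand-in for the result dataframe
        "naturalness": naturalness,
        "denaturalized_response": denaturalized_response,
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


# Placeholder values, written to a temporary directory.
log_file = Path(tempfile.mkdtemp()) / "attempts.jsonl"
entry = log_attempt_sketch(
    log_file,
    prompt="How many species are recorded?",
    response="SELECT COUNT(*) FROM species;",
    row_count=1,
    database="example_db",
    model_name="GPT-3.5",
)
```

One record per line keeps the log appendable and easy to load later for comparing attempts across models or naturalness levels.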
- denaturalize_query_test()
Test function for denaturalize_query.
- do_single_question_test()
Test function for do_single_question.