end_to_end_data_prep_and_prediction
- main(model: str, service: str, naturalness: str, database: str, bypass_nl_sql_inference: bool = True, db_list_file: str = '.local/dbinfo.json')
Executes the main logic for evaluating NL-to-SQL performance.
- Parameters:
model – The name of the NL-to-SQL model.
service – The name of the service providing the model.
naturalness – The naturalness level of the schema (e.g., “NATIVE”, “N1”, “N2”, “N3”).
database – The name of the database to use.
bypass_nl_sql_inference – Whether to bypass NL-to-SQL inference and load predictions from a file.
db_list_file – Path to the database information JSON file.
- Raises:
FileNotFoundError – If bypass_nl_sql_inference is True and the predicted queries file is not found.
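The bypass behaviour and its FileNotFoundError can be illustrated with a minimal sketch. This is not the module's implementation: `predictions_path`, `load_predictions`, the file-naming scheme, and the JSON layout are all hypothetical, chosen only to show the contract that a missing predictions file stops a bypass run.

```python
import json
import tempfile
from pathlib import Path


def predictions_path(root, model, service, naturalness, database):
    """Hypothetical naming scheme for a saved predictions file."""
    return Path(root) / f"{database}-{naturalness}-{service}-{model}.json"


def load_predictions(root, model, service, naturalness, database):
    """Load saved predictions, or raise FileNotFoundError if none exist."""
    path = predictions_path(root, model, service, naturalness, database)
    if not path.is_file():
        raise FileNotFoundError(f"Predicted queries file not found: {path}")
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def demo():
    # Placeholder argument values; real model and service names will differ.
    args = ("some-model", "some-service", "N1", "some_db")
    with tempfile.TemporaryDirectory() as root:
        try:
            load_predictions(root, *args)
            missing_raised = False
        except FileNotFoundError:
            missing_raised = True  # nothing saved yet, so bypass cannot proceed

        # Save one prediction (assumed layout), then load it back.
        predictions_path(root, *args).write_text(
            json.dumps([{"number": 1, "sql": "SELECT 1"}]), encoding="utf-8"
        )
        loaded = load_predictions(root, *args)
    return missing_raised, loaded
```

In practice this means a run with bypass_nl_sql_inference=True only succeeds after an earlier run with inference enabled has produced predictions for the same combination of arguments.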
- mp_query_parse_function(query_data: tuple) → tuple
Parses a SQL query using an external Java tool and returns its statistics.
- Parameters:
query_data – A tuple containing the query number, the SQL query string, and the SQL dialect.
- Returns:
A tuple containing the query number and a dictionary of query statistics.
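The tuple-in, tuple-out shape suits pool-style parallel mapping, because the query number travels with each result and lets the caller reassemble results that arrive out of order. The sketch below shows that pattern only. `fake_parse` is a stand-in worker with invented statistics (it does not call the Java tool), the "sqlite" dialect string is a placeholder, and `ThreadPool` is used so the example runs anywhere; `multiprocessing.Pool` exposes the same `imap_unordered` method for process-based parallelism.

```python
from multiprocessing.pool import ThreadPool


def fake_parse(query_data):
    """Stand-in worker with the documented tuple-in, tuple-out shape."""
    number, sql, dialect = query_data
    # Invented statistics; the real function returns whatever the parser reports.
    stats = {"dialect": dialect, "token_count": len(sql.split())}
    return number, stats


def parse_all(work_items, workers=2):
    """Run the worker over all items and index the results by query number."""
    with ThreadPool(workers) as pool:
        # Results may arrive in any order; the query number restores the mapping.
        return dict(pool.imap_unordered(fake_parse, work_items))


work = [
    (1, "SELECT name FROM station", "sqlite"),
    (2, "SELECT COUNT(*) FROM trip WHERE duration > 60", "sqlite"),
]
results = parse_all(work)
```

Here `results` is keyed by query number, so `results[2]` holds the statistics for the second query regardless of which worker finished first.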
- mp_schema_linking_eval(data: tuple) → tuple
Evaluates schema linking by comparing gold and predicted queries.
- Parameters:
data – A tuple containing the query number, the gold query, and the predicted query.
- Returns:
A tuple containing the query number and a dictionary of schema linking evaluation results.
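One way to picture a comparison of this kind is to extract the schema identifiers from each query and measure their overlap. The following is a rough sketch under stated assumptions, not the function's actual logic: `identifiers` and `linking_eval` are hypothetical names, the keyword list is deliberately small, the regex-based extraction is crude (it would also pick up words inside string literals), and the result keys are invented for illustration.

```python
import re

# Small, incomplete keyword list; enough for the example queries only.
SQL_KEYWORDS = {
    "select", "from", "where", "and", "or", "not", "join", "on", "as",
    "group", "order", "by", "having", "limit", "count", "sum", "avg",
    "min", "max", "distinct", "in", "like", "between", "is", "null",
}


def identifiers(sql):
    """Crude extraction: lowercase word tokens that are not SQL keywords."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", sql.lower())
    return {t for t in tokens if t not in SQL_KEYWORDS}


def linking_eval(data):
    """Compare identifiers in a gold and a predicted query (assumed metrics)."""
    number, gold_sql, predicted_sql = data
    gold, pred = identifiers(gold_sql), identifiers(predicted_sql)
    matched = gold & pred
    return number, {
        "recall": len(matched) / len(gold) if gold else 1.0,
        "precision": len(matched) / len(pred) if pred else 1.0,
        "missing": sorted(gold - pred),  # gold identifiers the prediction lacks
        "extra": sorted(pred - gold),    # predicted identifiers not in gold
    }


number, scores = linking_eval((
    7,
    "SELECT name, city FROM station WHERE id = 3",
    "SELECT name FROM stations WHERE id = 3",
))
```

In this example the prediction misses `city` and `station` and introduces `stations`, so recall is 0.5 and precision is 2/3.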
- nl_to_sql_generation(q_nl_df: pd.DataFrame, bypass: bool = False, naturalness: str = None, db_name: str = None, config_dict: dict = None, nat_cat_dict: dict = None, db_info: dict = None, db_list_file: str = '.local/dbinfo.json', db_util=src.util.db_util) → pd.DataFrame
Generates SQL queries from natural language questions.
- Parameters:
q_nl_df – DataFrame containing natural language questions and other information.
bypass – Whether to bypass NL-to-SQL generation and load from file.
naturalness – Naturalness level for schema elements.
db_name – Name of the database.
config_dict – Configuration dictionary.
nat_cat_dict – Dictionary mapping naturalness levels to numeric values.
db_info – Database information dictionary.
db_list_file – Path to database information JSON file.
db_util – Database utility module.
- Returns:
DataFrame with generated SQL queries.