SNAILS Reproducibility Notebooks
The SNAILS repository contains 10 numbered Jupyter notebooks in the root of the project. Notebooks 1-3 must be run in sequence. Notebooks 4-10 may be run in any order.
NOTE on database hosting: Reproduction requires an MS SQL database. The database can be hosted either natively on Windows or in Docker, which requires a Docker installation and the installation scripts provided in the archive.
Before running any notebooks, you must install the dependencies (Installing SNAILS) and instantiate the SNAILS databases (SNAILS databases).
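Once the databases are instantiated, it can help to confirm that the SQL Server endpoint accepts connections before starting notebook 1. The following is a minimal sketch using only the Python standard library. It assumes the server listens on `localhost` at port 1433 (the SQL Server default), so adjust both if your native or Docker setup differs. The function name `sql_server_reachable` is hypothetical and not part of SNAILS. The check only confirms that the TCP port is open, not that the SNAILS databases exist or that credentials are valid.

```python
import socket


def sql_server_reachable(host: str = "localhost", port: int = 1433,
                         timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    host and port are assumptions: 1433 is the SQL Server default, but a
    Docker setup may publish a different port.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    status = "reachable" if sql_server_reachable() else "not reachable"
    print(f"SQL Server endpoint localhost:1433 is {status}")
```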
- Run NL-to-SQL Inference and Auto Scoring: Generates and evaluates SQL queries for both execution accuracy and schema linking performance. Outputs to ./data/nl-to-sql-performance_annotations/pending_evaluation/. NOTE: After running notebook 1 and prior to running notebook 2 analysis, manually review the generated SQL using the manual validation tool (Manual SQL Correctness Validation Tool).
- Run Statistical Tests and Create Charts: Loads validated performance annotations from ./data/nl-to-sql-performance_annotations/ and generates statistical tests and charts.
- Run Identifier-Focused Analysis: Loads validated performance annotations from ./data/nl-to-sql-performance_annotations/ and provides an identifier-focused schema linking performance metric.
- Tokenizer Analysis: Tokenizes SNAILS identifiers and explores their properties.
- Token Naturalness Analysis: Explores the alignment of tokens to natural language.
- Naturalness Comparisons: Compares naturalness of SNAILS, Spider, Bird, and SchemaPile.
- SchemaPile Naturalness: ETL scripts for SchemaPile extraction and evaluation.
- CodeS Query Execution and Selection: Augments CodeS process to select the first correct SQL.
- DINSQL CodeS Schema Subsetting Analysis: Evaluates schema subsets generated by CodeS and DINSQL.
- Spider Query Analysis: Creates performance metrics for Spider DEV (Native and modified) inference.
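The ordering described above can be sketched as a small planning script. This is an illustration under stated assumptions, not part of the SNAILS tooling. It assumes each notebook's filename begins with its number (the actual filenames are not given here), and it only prints `jupyter nbconvert --execute` commands rather than running them. Whether every notebook runs cleanly without interaction is not confirmed by this document. The helper names `numbered_notebooks` and `execution_plan` are hypothetical.

```python
import re
import shlex
from pathlib import Path


def numbered_notebooks(root: Path) -> list[tuple[int, Path]]:
    """Return (number, path) pairs for notebooks whose names start with digits.

    Assumption: notebook filenames begin with their notebook number.
    """
    found = []
    for path in root.glob("*.ipynb"):
        match = re.match(r"(\d+)", path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Sort numerically so that notebook 10 comes after 9, not after 1.
    return sorted(found, key=lambda pair: pair[0])


def execution_plan(root: Path) -> list[str]:
    """Build a printable list of steps, pausing after notebook 1."""
    plan = []
    for number, path in numbered_notebooks(root):
        plan.append(
            "jupyter nbconvert --to notebook --execute --inplace "
            + shlex.quote(path.name)
        )
        if number == 1:
            # The generated SQL must be reviewed by hand before notebook 2.
            plan.append("# PAUSE: manually review the generated SQL "
                        "before running notebook 2")
    return plan


if __name__ == "__main__":
    for step in execution_plan(Path(".")):
        print(step)
```

Printing the plan instead of executing it keeps the manual validation step between notebooks 1 and 2 explicit. Running the notebooks in ascending numeric order satisfies the in-sequence requirement for notebooks 1-3, although notebooks 4-10 could equally be run in any other order.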