SNAILS Reproducibility Notebooks
The SNAILS repository contains 10 numbered Jupyter notebooks in the root of the project. Notebooks 1-3 must be run in sequence. Notebooks 4-10 may be run in any order.
Before running any notebooks, you must install the dependencies (Installing SNAILS) and instantiate the SNAILS databases (SNAILS databases).
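If you prefer to run the sequential notebooks non-interactively, the sketch below uses `jupyter nbconvert`. The leading-number filename pattern and the `run_notebook` helper are assumptions for illustration only, not part of the SNAILS tooling; running the notebooks interactively in Jupyter works just as well.

```python
# Non-interactive execution sketch for a single numbered notebook.
# ASSUMPTION: each notebook in the project root begins with its sequence number
# (e.g. "1*.ipynb"); adjust the glob pattern if the actual filenames differ.
import subprocess
from pathlib import Path


def run_notebook(number: int, root: Path = Path(".")) -> None:
    """Execute the notebook whose filename starts with `number`, in place."""
    matches = sorted(root.glob(f"{number}*.ipynb"))
    if not matches:
        raise FileNotFoundError(f"No notebook starting with '{number}' in {root.resolve()}")
    subprocess.run(
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", str(matches[0])],
        check=True,
    )


# Run notebook 1, then pause for the manual SQL validation step
# before executing notebooks 2 and 3.
run_notebook(1)
```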
- Run NL-to-SQL Inference and Auto Scoring: Generates and evaluates SQL queries for both execution accuracy and schema linking performance. Outputs to ./data/nl-to-sql-performance_annotations/pending_evaluation/. NOTE: After running notebook 1 and prior to running the notebook 2 analysis, manually review the generated SQL using the manual validation tool (Manual SQL Correctness Validation Tool).
- Run Statistical Tests and Create Charts: Loads validated performance annotations from ./data/nl-to-sql-performance_annotations/ and generates statistical tests and charts (see the inspection sketch after this list).
- Run Identifier-Focused Analysis: Loads validated performance annotations from ./data/nl-to-sql-performance_annotations/ and provides an identifier-focused schema linking performance metric.
- Tokenizer Analysis: Tokenizes SNAILS identifiers and explores their properties.
- Token Naturalness Analysis: Explores the alignment of tokens to natural language.
- Naturalness Comparisons: Compares the naturalness of SNAILS, Spider, Bird, and SchemaPile.
- SchemaPile Naturalness: ETL scripts for SchemaPile extraction and evaluation.
- CodeS Query Execution and Selection: Augments the CodeS process to select the first correct SQL.
- DINSQL CodeS Schema Subsetting Analysis: Evaluates schema subsets generated by CodeS and DINSQL.
- Spider Query Analysis: Creates performance metrics for Spider DEV (native and modified) inference.
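Notebooks 2 and 3 load the validated annotations from ./data/nl-to-sql-performance_annotations/. The sketch below is one way to inspect that directory outside the notebooks; the CSV reading and pandas usage are illustrative assumptions only, since the notebooks themselves define the actual annotation format.

```python
# Illustrative sketch for inspecting validated performance annotations.
# ASSUMPTION: the annotation files are tabular (e.g. CSV); the real format is
# whatever notebooks 1-3 read and write, so adjust the reader accordingly.
from pathlib import Path

import pandas as pd

annotation_dir = Path("./data/nl-to-sql-performance_annotations")

# List whatever files the inference and manual validation steps produced.
files = sorted(p for p in annotation_dir.rglob("*") if p.is_file())
for path in files:
    print(path.relative_to(annotation_dir))

# If the annotations are CSV, concatenate them for a quick look.
csv_files = [p for p in files if p.suffix == ".csv"]
if csv_files:
    annotations = pd.concat((pd.read_csv(p) for p in csv_files), ignore_index=True)
    print(annotations.head())
```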