.. _schemarenamer:

schemarenamer
=============

.. py:module:: schemarenamer

.. py:function:: main()
   :no-index:

   Executes the main logic of the script. This includes classifying the database schema, performing schema renaming, and saving the results to an Excel file.

.. py:function:: transform_score_df(score_df: pandas.DataFrame) -> pandas.DataFrame

   Transforms the input score DataFrame to combine table and column scores, lowercase identifiers, and remove duplicates.

   :param score_df: DataFrame containing table and column scores.
   :type score_df: pandas.DataFrame
   :return: Transformed DataFrame with combined scores.
   :rtype: pandas.DataFrame

.. py:function:: do_schema_renaming(database_name="PacificIslandLandbirds", score_lookup_file="./data/gold-data/identifier-scores-evaluated-5-9-2024.xlsx", continuous_write=False, db_type="ms sql", db_classifier_score_df=None, only_most_natural=False, verbose=True) -> pandas.DataFrame

   Renames schema identifiers (tables and columns) in a database based on human-evaluated naturalness scores.

   :param database_name: The name of the database. Defaults to "PacificIslandLandbirds".
   :type database_name: str
   :param score_lookup_file: Path to the Excel file containing human-evaluated scores. Defaults to "./data/gold-data/identifier-scores-evaluated-5-9-2024.xlsx".
   :type score_lookup_file: str
   :param continuous_write: If True, writes logs continuously. Defaults to False.
   :type continuous_write: bool
   :param db_type: The type of the database ("ms sql" or "sqlite"). Defaults to "ms sql".
   :type db_type: str
   :param db_classifier_score_df: DataFrame with classifier scores. If None, reads from score_lookup_file. Defaults to None.
   :type db_classifier_score_df: pandas.DataFrame or None
   :param only_most_natural: If True, only generates the most natural identifier. Defaults to False.
   :type only_most_natural: bool
   :param verbose: If True, prints progress information. Defaults to True.
   :type verbose: bool
   :return: DataFrame containing original and generated identifiers with scores and errors.
   :rtype: pandas.DataFrame

.. py:function:: do_fewshot_identifier_transform(identifier, naturalness, data_dict_interpreter=None, only_most_natural=False, verbose=True, gpt_model="gpt-4o") -> dict

   Transforms a given identifier to different naturalness levels using few-shot prompting and a data dictionary interpreter.

   :param identifier: The identifier to transform.
   :type identifier: str
   :param naturalness: The original naturalness level of the identifier (e.g., "N1", "N2", "N3").
   :type naturalness: str
   :param data_dict_interpreter: An instance of DataDictInterpreter for retrieving natural identifiers. Defaults to None.
   :type data_dict_interpreter: SNAILS_Artifacts.naturalness_modifier.data_dict_reader.DataDictInterpreter or None
   :param only_most_natural: If True, only generates the most natural identifier. Defaults to False.
   :type only_most_natural: bool
   :param verbose: If True, prints progress information. Defaults to True.
   :type verbose: bool
   :param gpt_model: The GPT model to use. Defaults to "gpt-4o".
   :type gpt_model: str
   :return: A dictionary containing the transformed identifiers for different naturalness levels.
   :rtype: dict

.. toctree::
   :maxdepth: 2
   :caption: Contents:
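
The three steps described for ``transform_score_df`` (combine table and column scores, lowercase identifiers, remove duplicates) can be illustrated with a small dependency-free sketch. This is not the module's implementation: the real function operates on a ``pandas.DataFrame`` whose column layout is not documented here, so the record keys (``table``, ``table_score``, ``column``, ``column_score``), the helper name ``combine_scores_sketch``, and the sample rows are all assumptions made for illustration.

```python
def combine_scores_sketch(rows):
    """Illustrative stand-in for the steps described for transform_score_df.

    ``rows`` is a list of dicts with hypothetical keys "table",
    "table_score", "column" and "column_score"; the real DataFrame
    columns may be named differently.
    """
    seen = set()
    combined = []
    for row in rows:
        # Step 1: treat table and column entries as one stream of
        # (identifier, score) pairs.
        pairs = [
            (row["table"], row["table_score"]),
            (row["column"], row["column_score"]),
        ]
        for identifier, score in pairs:
            # Step 2: lowercase the identifier.
            key = identifier.lower()
            # Step 3: drop duplicates, keeping the first occurrence.
            if key in seen:
                continue
            seen.add(key)
            combined.append((key, score))
    return combined


# Made-up sample rows, not taken from any real score file.
sample_rows = [
    {"table": "tblSpecies", "table_score": "N2",
     "column": "SpeciesID", "column_score": "N2"},
    {"table": "tblSpecies", "table_score": "N2",
     "column": "CommonName", "column_score": "N1"},
    {"table": "Locations", "table_score": "N1",
     "column": "speciesid", "column_score": "N2"},
]

result = combine_scores_sketch(sample_rows)
print(result)
```

The sketch keeps the first score seen for each lowercased identifier; whether the actual function resolves conflicting scores the same way is not stated in this reference.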