schema_graph

class SchemaGraph(database_name: str, repair_constraints: bool = True, use_single_starting_table_for_connections: bool = False)

Represents the schema of a database as a graph.

Parameters:
  • database_name (str) – The name of the database.

  • repair_constraints (bool) – A flag indicating whether to repair constraints. Defaults to True.

  • use_single_starting_table_for_connections (bool) – A flag indicating whether to use a single starting table for connections. Defaults to False.

Variables:
  • manual_pkfk (dict) – A dictionary containing manual primary key and foreign key relationships for specific databases.

  • database (str) – The name of the database.

  • repair_constraints (bool) – A flag indicating whether to repair constraints.

  • single_starting_table (bool) – A flag indicating whether to use a single starting table for connections.

  • edge_type_weights (dict) – A dictionary containing weights for different types of edges.

  • vertex_colors (dict) – A dictionary containing colors for different types of vertices.

  • pkfk_df (pandas.DataFrame) – A DataFrame containing primary key and foreign key relationships.

  • orphan_tables (list) – A list of orphan tables.

  • vertice_type_lookup (dict) – A dictionary for looking up vertex types.

  • vertice_name_lookup (dict) – A dictionary for looking up vertex names.

  • name_vertice_lookup (dict) – A dictionary for looking up vertex IDs by name.

  • all_vertices (list) – A list of all vertices.

  • edges (list) – A list of edges in the graph.

  • edge_type_lookup (dict) – A dictionary for looking up edge types.

  • edge_weight_lookup (dict) – A dictionary for looking up edge weights.

  • schema_graph (igraph.Graph) – The schema graph.

_construct_graph() → igraph.Graph

Constructs the schema graph.

Returns:

The schema graph.

Return type:

igraph.Graph

_construct_pkfk_dataframe() → pandas.DataFrame

Constructs the primary key and foreign key DataFrame.

Returns:

The primary key and foreign key DataFrame.

Return type:

pandas.DataFrame

_construct_orphan_table_list(pkfk_df: pandas.DataFrame = None) → list

Constructs the list of orphan tables.

Parameters:

pkfk_df (pandas.DataFrame) – The primary key and foreign key DataFrame. Defaults to None.

Returns:

The list of orphan tables.

Return type:

list
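
As a rough illustration of what an orphan table list could contain, the sketch below treats a table as an orphan when it appears in no primary key/foreign key relationship. That definition, the row keys pk_table and fk_table, and the helper find_orphan_tables are assumptions made for this example, not the library's implementation; plain dictionaries stand in for DataFrame rows.

```python
def find_orphan_tables(all_tables, pkfk_rows):
    """Return tables that take part in no PK/FK relationship (hypothetical helper)."""
    connected = set()
    for row in pkfk_rows:
        connected.add(row["pk_table"])
        connected.add(row["fk_table"])
    # Keep the input order so the result is deterministic.
    return [table for table in all_tables if table not in connected]


# Hypothetical schema: audit_log has no key relationship to any other table.
all_tables = ["customers", "orders", "audit_log"]
pkfk_rows = [
    {"pk_table": "customers", "pk_column": "id",
     "fk_table": "orders", "fk_column": "customer_id"},
]
orphans = find_orphan_tables(all_tables, pkfk_rows)
print(orphans)
```

In this sketch, orphans still need to be tracked separately because they would otherwise never appear as vertices derived from the relationship rows.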

_construct_vertice_lookup_dicts(pkfk_df: pandas.DataFrame = None, orphans: list = None) → dict

Constructs dictionaries for vertex lookups.

Parameters:
  • pkfk_df (pandas.DataFrame) – The primary key and foreign key DataFrame. Defaults to None.

  • orphans (list) – The list of orphan tables. Defaults to None.

Returns:

A dictionary containing vertex lookup dictionaries.

Return type:

dict
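
The three lookup dictionaries listed under Variables (vertice_type_lookup, vertice_name_lookup, name_vertice_lookup) could be built along the following lines. This is a minimal sketch under several assumptions: vertices are tables and columns, vertex IDs are consecutive integers, column vertices are named "table.column", and the helper build_vertex_lookups is hypothetical. The actual naming and ID scheme may differ.

```python
def build_vertex_lookups(pkfk_rows, orphans):
    """Assign integer IDs to table and column vertices (hypothetical helper)."""
    vertice_type_lookup = {}   # vertex ID -> "table" or "column"
    vertice_name_lookup = {}   # vertex ID -> vertex name
    name_vertice_lookup = {}   # vertex name -> vertex ID

    def add_vertex(name, vertex_type):
        if name in name_vertice_lookup:
            return  # already registered
        vertex_id = len(name_vertice_lookup)
        name_vertice_lookup[name] = vertex_id
        vertice_name_lookup[vertex_id] = name
        vertice_type_lookup[vertex_id] = vertex_type

    for row in pkfk_rows:
        for table_key, column_key in (("pk_table", "pk_column"),
                                      ("fk_table", "fk_column")):
            add_vertex(row[table_key], "table")
            add_vertex(f"{row[table_key]}.{row[column_key]}", "column")
    # Orphan tables get a vertex too, even though no edge will touch them.
    for table in orphans:
        add_vertex(table, "table")

    return {
        "vertice_type_lookup": vertice_type_lookup,
        "vertice_name_lookup": vertice_name_lookup,
        "name_vertice_lookup": name_vertice_lookup,
    }


pkfk_rows = [
    {"pk_table": "customers", "pk_column": "id",
     "fk_table": "orders", "fk_column": "customer_id"},
]
lookups = build_vertex_lookups(pkfk_rows, orphans=["audit_log"])
print(lookups["name_vertice_lookup"])
```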

_construct_edges(pkfk_df: pandas.DataFrame = None, all_vertices: list = None, name_vertice_lookup: dict = None, vertice_name_lookup: dict = None) → dict

Constructs the edges of the graph.

Parameters:
  • pkfk_df (pandas.DataFrame) – The primary key and foreign key DataFrame. Defaults to None.

  • all_vertices (list) – The list of all vertices. Defaults to None.

  • name_vertice_lookup (dict) – The dictionary for looking up vertex IDs by name. Defaults to None.

  • vertice_name_lookup (dict) – The dictionary for looking up vertex names. Defaults to None.

Returns:

A dictionary containing edge data.

Return type:

dict
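
To make the idea of typed, weighted edges concrete, the sketch below derives an edge list plus type and weight lookups from relationship rows. The edge type names ("table_column", "pk_fk"), their weights, the returned dictionary keys, and the helper build_edges are all assumptions for illustration; the real edge_type_weights values are not given in this reference.

```python
# Hypothetical weights: a key relationship costs more than a table-to-column hop.
EDGE_TYPE_WEIGHTS = {"table_column": 1.0, "pk_fk": 2.0}


def build_edges(pkfk_rows, name_vertice_lookup):
    """Build edges with type and weight lookups (hypothetical helper)."""
    edges = []
    edge_type_lookup = {}
    edge_weight_lookup = {}

    def add_edge(source_name, target_name, edge_type):
        edge = (name_vertice_lookup[source_name], name_vertice_lookup[target_name])
        if edge in edge_type_lookup:
            return  # skip duplicates
        edges.append(edge)
        edge_type_lookup[edge] = edge_type
        edge_weight_lookup[edge] = EDGE_TYPE_WEIGHTS[edge_type]

    for row in pkfk_rows:
        pk_column = f"{row['pk_table']}.{row['pk_column']}"
        fk_column = f"{row['fk_table']}.{row['fk_column']}"
        add_edge(row["pk_table"], pk_column, "table_column")
        add_edge(row["fk_table"], fk_column, "table_column")
        add_edge(pk_column, fk_column, "pk_fk")

    return {
        "edges": edges,
        "edge_type_lookup": edge_type_lookup,
        "edge_weight_lookup": edge_weight_lookup,
    }


name_vertice_lookup = {
    "customers": 0, "customers.id": 1,
    "orders": 2, "orders.customer_id": 3,
}
pkfk_rows = [
    {"pk_table": "customers", "pk_column": "id",
     "fk_table": "orders", "fk_column": "customer_id"},
]
edge_data = build_edges(pkfk_rows, name_vertice_lookup)
print(edge_data["edges"])
```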

_repair_constraints(pkfk_df: pandas.DataFrame) → pandas.DataFrame

Repairs constraints in the primary key and foreign key DataFrame.

Parameters:

pkfk_df (pandas.DataFrame) – The primary key and foreign key DataFrame.

Returns:

The repaired primary key and foreign key DataFrame.

Return type:

pandas.DataFrame

get_schema_identifiers_between_tables(table_names: list) → dict

Gets the schema identifiers (table names and their column names) that lie between the given tables.

Parameters:

table_names (list) – A list of table names.

Returns:

A dictionary where keys are table names and values are lists of column names.

Return type:

dict
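
The return shape described above (table names mapped to lists of column names) can be illustrated with a small sketch that groups a sequence of vertex names by table. The "table.column" naming convention and the helper group_columns_by_table are assumptions for this example, not part of the documented API.

```python
def group_columns_by_table(vertex_names):
    """Group "table.column" vertex names into {table: [columns]} (hypothetical helper)."""
    identifiers = {}
    for name in vertex_names:
        table, _, column = name.partition(".")
        columns = identifiers.setdefault(table, [])
        # A bare table name has no column part; it only registers the table.
        if column and column not in columns:
            columns.append(column)
    return identifiers


# Hypothetical path between two tables, expressed as vertex names.
path = ["customers", "customers.id", "orders.customer_id", "orders"]
identifiers = group_columns_by_table(path)
print(identifiers)
```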

get_table_connections(table_names: list) → dict

Gets the connections between the given tables.

Parameters:

table_names (list) – A list of table names.

Returns:

A dictionary containing connection information.

Return type:

dict

get_connecting_vertices_and_edges(vertices: list, single_starting_table: bool = None) → dict

Gets the vertices and edges that connect the given vertices.

Parameters:
  • vertices (list) – A list of vertices.

  • single_starting_table (bool) – Whether to use a single starting table. Defaults to None.

Returns:

A dictionary containing connecting vertices and edges.

Return type:

dict
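
One plausible way to find connecting vertices and edges in a weighted graph is a shortest-path search. The sketch below uses Dijkstra's algorithm from the standard library rather than igraph, and it is not confirmed to be what this method does internally. The function shortest_connection, the returned keys "vertices" and "edges", and the sample weights are assumptions; the single_starting_table flag is not modeled here.

```python
import heapq


def shortest_connection(edge_weights, source, target):
    """Find the lowest-weight path between two vertices (hypothetical helper)."""
    # Treat every edge as undirected.
    adjacency = {}
    for (u, v), weight in edge_weights.items():
        adjacency.setdefault(u, []).append((v, weight))
        adjacency.setdefault(v, []).append((u, weight))

    distances = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        distance, vertex = heapq.heappop(queue)
        if vertex == target:
            break
        if distance > distances.get(vertex, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in adjacency.get(vertex, []):
            candidate = distance + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                previous[neighbor] = vertex
                heapq.heappush(queue, (candidate, neighbor))

    if target not in distances:
        return {"vertices": [], "edges": []}  # no connection exists

    # Walk back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return {"vertices": path, "edges": list(zip(path, path[1:]))}


# Hypothetical graph: 0=customers, 1=customers.id, 2=orders, 3=orders.customer_id
edge_weights = {(0, 1): 1.0, (2, 3): 1.0, (1, 3): 2.0}
result = shortest_connection(edge_weights, source=0, target=2)
print(result)
```

Edge weights matter in such a search because they let the graph prefer some kinds of connection over others when several paths join the same tables.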