snails_naturalness_classifier

class CanineIdentifierClassifier(identifiers=pd.DataFrame())

A classifier that rates word naturalness using a pre-trained text analysis model. Each word is assigned one of three naturalness levels: Regular (label N1), Low (label N2), or Least (label N3).

Parameters:

identifiers (pd.DataFrame) – A DataFrame containing identifiers to classify.

Variables:

model_name (str) – The name of the model used for classification.
checkpoint (int) – The checkpoint number of the model.
id2label (dict) – A dictionary mapping label IDs to label names.
label2id (dict) – A dictionary mapping label names to label IDs.
classifier (pipeline) – The sentiment analysis pipeline used for classification.
identifiers (pd.DataFrame) – A DataFrame containing identifiers to classify.
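
The id2label and label2id attributes describe the same mapping in opposite directions. A minimal sketch of that relationship, assuming the three labels from the class description are numbered 0 to 2; the actual integer IDs are not stated here and are defined by the model checkpoint, and the `label_names` dictionary is a hypothetical helper added only for illustration:

```python
# Hypothetical ID assignment: the real IDs come from the model checkpoint.
id2label = {0: "N1", 1: "N2", 2: "N3"}

# label2id is the inverse mapping, so it can be derived from id2label.
label2id = {label: idx for idx, label in id2label.items()}

# Naturalness levels as described for this class (illustrative helper).
label_names = {"N1": "Regular", "N2": "Low", "N3": "Least"}
```

Deriving one dictionary from the other keeps the two in sync if the label set ever changes.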

do_batch_job(ident_df: pd.DataFrame = None, save_as_excel: bool = False, make_tag: bool = True)

Performs batch classification on the given DataFrame of identifiers.

Parameters:

ident_df (pd.DataFrame or None) – The DataFrame of identifiers to classify. Defaults to None, in which case the instance's identifiers attribute is used.
save_as_excel (bool) – Whether to save the results as an Excel file.
make_tag (bool) – Whether to add a token tag to the text before classification.

Returns:

None

classify_identifier(identifier: str, make_tag: bool = True)

Classifies a single identifier.

Parameters:

identifier (str) – The identifier to classify.
make_tag (bool) – Whether to add a token tag to the identifier before classification.

Returns:

The classification result.

Return type:

list
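
The return type is documented only as a list. A minimal sketch of reading such a result, assuming it follows the common text classification pipeline shape of dicts with `label` and `score` keys. That shape, the helper name `top_label`, and the sample values are assumptions for illustration, not output from the actual model:

```python
def top_label(result):
    """Return the (label, score) pair with the highest score.

    Assumes `result` is a non-empty list of dicts that each carry
    a "label" key and a "score" key.
    """
    best = max(result, key=lambda item: item["score"])
    return best["label"], best["score"]


# Made-up sample in the assumed shape, not real model output.
sample = [{"label": "N1", "score": 0.9}]
label, score = top_label(sample)
```

If the actual result has a different structure, only the key lookups inside `top_label` would need to change.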
