Algorithms

BernoulliNB

The BernoulliNB algorithm uses the scikit-learn BernoulliNB estimator to fit a model to predict the value of categorical fields where explanatory variables are assumed to be binary-valued. BernoulliNB is an implementation of the Naive Bayes classification algorithm. This algorithm supports incremental fit.

Parameters

  • The alpha parameter controls Laplace/Lidstone smoothing. The default value is 1.0.

  • The binarize parameter is a threshold that can be used for converting numeric field values to the binary values expected by BernoulliNB. The default value is 0.

    • With the default binarize=0, values > 0 are treated as 1, and values <= 0 are treated as 0.
  • The fit_prior Boolean parameter specifies whether to learn class prior probabilities from the training data. The default value is True. If fit_prior=f is specified, a uniform prior is used, meaning all classes are assumed to be equally likely.
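
The arithmetic behind these three parameters can be sketched in a few lines of plain Python. This is a minimal illustration, not the toolkit's implementation: the function names are hypothetical, and the smoothing formula assumes the standard additive (Laplace/Lidstone) estimate for a binary feature.

```python
import math

def binarize(values, threshold=0.0):
    """Map numeric values to 0/1: strictly greater than threshold becomes 1."""
    return [1 if v > threshold else 0 for v in values]

def smoothed_feature_prob(feature_count, class_count, alpha=1.0):
    """Estimate P(feature = 1 | class) with additive smoothing.

    The factor 2 reflects the two possible outcomes (0 or 1) of a binary feature.
    """
    return (feature_count + alpha) / (class_count + 2.0 * alpha)

def class_log_priors(class_counts, fit_prior=True):
    """Return log priors per class: learned from counts, or uniform."""
    if not fit_prior:
        return {c: -math.log(len(class_counts)) for c in class_counts}
    total = sum(class_counts.values())
    return {c: math.log(n / total) for c, n in class_counts.items()}

# Hypothetical inputs, for illustration only.
row = binarize([-2.0, 0.0, 0.5, 3.0])              # default threshold of 0
p_unseen = smoothed_feature_prob(0, 8, alpha=1.0)  # feature never set in 8 rows
priors = class_log_priors({"a": 6, "b": 2}, fit_prior=False)
```

With alpha=1.0, a feature that never appears in a class still gets a nonzero probability, which keeps a single unseen feature from zeroing out the whole class score. With fit_prior disabled, both classes receive the same log prior even though class "a" has three times as many rows.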

Syntax

fit BernoulliNB <field_to_predict> from <explanatory_fields> [into <model name>]
[alpha=<float>] [binarize=<float>] [fit_prior=<true|false>] [partial_fit=<true|false>]

You can save BernoulliNB models using the into keyword and apply the saved model later to new data using the apply command.

... | apply TESTMODEL_BernoulliNB

You can inspect the model learned by BernoulliNB with the summary command, which also shows the class and log probability information calculated from the dataset.

... | summary My_Incremental_Model

Syntax constraints

  • The partial_fit parameter controls whether an existing model is incrementally updated. The default value is False, meaning the model is not incrementally updated. Choosing partial_fit=True allows you to update an existing model using only new data, without having to retrain it on the full training data set.
  • Using partial_fit=True on an existing model ignores any newly supplied parameters. The parameters supplied at model creation are used instead. If partial_fit=False or partial_fit is not specified (default is False), the specified model is created and replaces the pre-trained model if one exists.
  • When partial_fit=True is used with a model named My_Incremental_Model: if My_Incremental_Model does not exist, the command saves the model data under the model name My_Incremental_Model. If My_Incremental_Model exists and was trained using BernoulliNB, the command updates the existing model with the new input. If My_Incremental_Model exists but was not trained by BernoulliNB, an error message displays.
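
One way to see why an incremental update can stand in for a full retrain: a Bernoulli Naive Bayes model is determined by per-class row counts and per-class feature counts, and counts from separate batches simply add up. The sketch below illustrates that idea in plain Python. CountModel is a hypothetical stand-in written for this example; it is not the toolkit's or scikit-learn's class, and it assumes the input rows are already binarized.

```python
from collections import defaultdict

class CountModel:
    """Hypothetical holder for the counts a Bernoulli Naive Bayes model needs."""

    def __init__(self):
        self.class_count = defaultdict(int)
        self.feature_count = defaultdict(lambda: defaultdict(int))

    def partial_fit(self, rows, labels):
        """Add one batch of already-binarized rows to the running counts."""
        for row, label in zip(rows, labels):
            self.class_count[label] += 1
            for j, value in enumerate(row):
                self.feature_count[label][j] += value
        return self

    def snapshot(self):
        """Return plain dicts so two models can be compared for equality."""
        return (dict(self.class_count),
                {c: dict(f) for c, f in self.feature_count.items()})

# Two small batches of made-up data.
batch1 = ([[1, 0], [0, 1]], ["x", "y"])
batch2 = ([[1, 1], [0, 0]], ["x", "y"])

# Updating batch by batch...
incremental = CountModel().partial_fit(*batch1).partial_fit(*batch2)
# ...gives the same counts as fitting once on all the data.
full = CountModel().partial_fit(batch1[0] + batch2[0], batch1[1] + batch2[1])
same = incremental.snapshot() == full.snapshot()
```

Because the smoothing and prior settings are applied on top of these counts, it is consistent that an incremental update reuses the parameters stored with the model rather than newly supplied ones.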

Example

The following example uses BernoulliNB on a test set.

... | fit BernoulliNB type from * into TESTMODEL_BernoulliNB alpha=0.5 binarize=0 fit_prior=f

Local availability

  • Local class: BernoulliNB
  • Source file: Splunk_ML_Toolkit/bin/algos/BernoulliNB.py
  • algos.conf stanza: [BernoulliNB]
  • Class bases: ClassifierMixin, BaseAlgo

Source

Adapted from the Splunk AI Toolkit 5.6.4 documentation at /en/splunk-cloud-platform/apply-machine-learning/use-ai-toolkit/5.6.4/algorithms-and-scoring-metrics-in-the-ai-toolkit/algorithms-in-the-ai-toolkit (section: classifier).
