Algorithms

Birch

The Birch algorithm uses the scikit-learn Birch clustering algorithm to divide data points into a set of distinct clusters. The cluster for each event is set in a new field named `cluster`. This algorithm supports incremental fit.

Parameters

  • The k parameter specifies the number of clusters to divide the data into after the final clustering step, which treats the sub-clusters from the leaves of the CF tree as new samples.

    • By default, the cluster label field name is cluster. Change that behavior by using the as keyword to specify a different field name.
  • The partial_fit parameter controls whether an existing model is incrementally updated or not. Incremental updating lets you update an existing model using only new data, without retraining it on the full training data set.

  • The partial_fit parameter default is False.
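
Because the toolkit delegates to scikit-learn, the two parameters can be illustrated directly against sklearn.cluster.Birch. The following is a minimal sketch, assuming that k corresponds to the estimator's n_clusters argument and that partial_fit=true corresponds to calling the estimator's partial_fit method. The synthetic data, the make_batch helper, and the variable names are illustrative and not part of the toolkit.

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)

# Three well-separated synthetic blobs (illustrative data only).
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])


def make_batch(points_per_center):
    """Draw points_per_center samples around each center."""
    return np.vstack([
        center + rng.normal(scale=0.3, size=(points_per_center, 2))
        for center in centers
    ])


first_batch = make_batch(50)
second_batch = make_batch(50)

# n_clusters plays the role assumed here for k: the final clustering step
# groups the CF-tree leaf sub-clusters into this many clusters.
model = Birch(n_clusters=3)
model.fit(first_batch)

# Incremental update: extends the existing CF tree with the new batch only,
# instead of refitting on first_batch + second_batch.
model.partial_fit(second_batch)

labels = model.predict(np.vstack([first_batch, second_batch]))
print(sorted(set(labels.tolist())))
```

The second batch is added without revisiting the first, which is the behavior the partial_fit parameter is described as enabling.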

Syntax

fit Birch <fields> [k=<int>] [partial_fit=<true|false>] [into <model name>]

You can save Birch models using the into keyword and apply them to new data later using the apply command.

... | apply Birch_model

Syntax constraints

  • If partial_fit=true and My_Incremental_Model (the model named after the into keyword) does not exist, the command saves the model data under the model name My_Incremental_Model.
  • If partial_fit=true and My_Incremental_Model exists and was trained using Birch, the command updates the existing model with the new input.
  • If partial_fit=true and My_Incremental_Model exists but was not trained by Birch, an error message displays.
  • Using partial_fit=true on an existing model ignores the newly supplied parameters. The parameters supplied at model creation are used instead.
  • If partial_fit=false or partial_fit is not specified, the command creates the specified model and replaces the pre-trained model if one exists.
  • You cannot inspect the model learned by Birch with the summary command.
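
The partial_fit rules above can be summarized as a small decision function. This is an illustrative model of the documented behavior only, not the toolkit's implementation. The function name resolve_fit, its arguments, and the returned action strings are hypothetical.

```python
def resolve_fit(existing_algo, partial_fit=False):
    """Return the action a fit into a named model would take.

    existing_algo: name of the algorithm that trained the stored model,
                   or None if no model is stored under that name.
    """
    if not partial_fit:
        # partial_fit=false or unspecified: train from scratch and
        # replace any pre-trained model of the same name.
        return "create_or_replace"
    if existing_algo is None:
        # No stored model yet: save a new one under the given name.
        return "create"
    if existing_algo == "Birch":
        # Parameters given at model creation win over newly supplied ones.
        return "update_with_original_params"
    # Stored model exists but was trained by a different algorithm.
    raise ValueError("existing model was not trained by Birch")


print(resolve_fit(None, partial_fit=True))
print(resolve_fit("Birch", partial_fit=True))
print(resolve_fit("KMeans", partial_fit=False))
```

The only error case is an incremental update against a model trained by another algorithm (KMeans is used here purely as an example name).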

Examples

The following example uses Birch on a test set.

... | fit Birch * k=3 | stats count by cluster

The following example uses the partial_fit parameter to incrementally update a saved model.

| inputlookup track_day.csv | fit Birch * k=6 partial_fit=true into My_Incremental_Model

Local availability

  • Local class: Birch
  • Source file: Splunk_ML_Toolkit/bin/algos/Birch.py
  • algos.conf stanza: [Birch]
  • Class bases: ClustererMixin, BaseAlgo

Source

Adapted from the Splunk AI Toolkit 5.6.4 documentation at /en/splunk-cloud-platform/apply-machine-learning/use-ai-toolkit/5.6.4/algorithms-and-scoring-metrics-in-the-ai-toolkit/algorithms-in-the-ai-toolkit (section: clusterer).
