Algorithms

ARIMA

\\

The Autoregressive Integrated Moving Average (ARIMA) algorithm uses the StatsModels ARIMA algorithm to fit a model on a time series for better understanding and/or forecasting its future values. An ARIMA model can consist of autoregressive terms, moving average terms, and differencing operations. The autoregressive terms express the dependency of the current value of time series to its previous ones.

The moving average terms, also called random shocks or white noise, model the effect of previous forecast errors on the current value. If the time series is non-stationary, differencing operations are used to make it stationary. A stationary process is a stochastic process in that its probability distribution does not change over time.

See the StatsModels documentation at http://statsmodels.sourceforge.net/devel/generated/statsmodels.tsa.arima_model.ARIMA.html for more information.

CAUTION: It is highly recommended to send the time series through timechart before sending it into ARIMA to avoid non-uniform sampling time. If _time is not to be specified, using timechart is not necessary.

Parameters

  • The time series should not have any gaps or missing data otherwise ARIMA will complain. If there are missing samples in the data, using a bigger span in timechart or using streamstats to fill in the gaps with average values can do the trick.

  • When chaining ARIMA output to another algorithm (i.e. ARIMA itself), keep in mind the length of the data is the length of the original data + forecast_k. If you want to maintain the holdback position, you need to add the number in forecast_k to your holdback value.

  • ARIMA requires the order parameter to be specified at fitting time. The order parameter needs three values:

    • Number of autoregressive (AR) parameters
    • Number of differencing operations (D)
    • Number of moving average (MA) Parameters
  • The forecast_k=<int> parameter tells ARIMA how many points into the future should be forecasted. If _time is specified during fitting along with the field_to_forecast, ARIMA will also generate the timestamps for forecasted values. By default, forecast_k is zero.

  • The conf_interval=<1..99> parameter is the confidence interval in percentage around forecasted values. By default it is set to 95%.

  • The holdback=<int> parameter is the number of data points held back from the ARIMA model. This is useful for comparing the forecast against known data points. By default, holdback is zero.

Syntax

fit ARIMA [_time] <field_to_forecast>  order=<int>-<int>-<int> [forecast_k=<int>] [conf_interval=<int>] [holdback=<int>]

Syntax constraints

  • ARIMA supports one time series at a time.
  • ARIMA models cannot be saved and used at a later time in the current version.
  • Scoring metric values are based on all data and not on the holdback period data.

Example

The following example uses ARIMA on a test set.

... | fit ARIMA Voltage order=4-0-1 holdback=10 forecast_k=10

Local availability Permalink to this section

Source Permalink to this section

Adapted from the Splunk AI Toolkit 5.6.4 documentation at /en/splunk-cloud-platform/apply-machine-learning/use-ai-toolkit/5.6.4/algorithms-and-scoring-metrics-in-the-ai-toolkit/algorithms-in-the-ai-toolkit (section: forecasting).

Press Cmd/Ctrl+K to focus search. Esc to close.

Type to search the portal.