Algorithms
ARIMA
The Autoregressive Integrated Moving Average (ARIMA) algorithm uses the StatsModels ARIMA implementation to fit a model to a time series, either to better understand the series or to forecast its future values. An ARIMA model can consist of autoregressive terms, moving average terms, and differencing operations. The autoregressive terms express the dependency of the current value of the time series on its previous values.
The moving average terms model the effect of previous forecast errors (also called random shocks or white noise) on the current value. If the time series is non-stationary, differencing operations are used to make it stationary. A stationary process is a stochastic process whose probability distribution does not change over time.
See the StatsModels documentation at http://statsmodels.sourceforge.net/devel/generated/statsmodels.tsa.arima_model.ARIMA.html for more information.
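As a rough illustration of the differencing step described above, the following standard-library Python sketch applies first-order differencing to a trending series. The function name difference is a hypothetical helper used only for illustration; it is not part of StatsModels or the toolkit.

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing: y[t] - y[t-1]."""
    result = list(series)
    for _ in range(d):
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# A series with a steady upward trend is non-stationary: its mean keeps rising.
trend = [10, 12, 14, 16, 18, 20]

# One round of differencing removes the linear trend, leaving a constant series.
once = difference(trend, d=1)

# A quadratic trend needs two rounds (d=2) before it becomes constant.
quadratic = [t * t for t in range(6)]  # 0, 1, 4, 9, 16, 25
twice = difference(quadratic, d=2)
```

Each round of differencing shortens the series by one point. The d value here plays the role of the number of differencing operations in the order parameter.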
CAUTION: Send the time series through timechart before sending it into ARIMA so that the samples are uniformly spaced in time. If you do not specify _time, using timechart is not necessary.
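To illustrate what uniform sampling means in practice, here is a minimal Python sketch that buckets irregularly timed samples into fixed-width spans and averages each bucket, which is roughly the effect of a timechart with a span and an average. The function name bucket_average and the sample readings are hypothetical, and this is not how timechart is implemented.

```python
def bucket_average(samples, span):
    """Average (timestamp, value) pairs into consecutive buckets of width `span`.

    Returns a list of (bucket_start, mean_or_None). None marks an empty bucket,
    that is, a gap in the resulting uniformly spaced series.
    """
    if not samples:
        return []
    samples = sorted(samples)
    # Align the first bucket to a multiple of span at or before the first sample.
    start = samples[0][0] - samples[0][0] % span
    end = samples[-1][0]
    n_buckets = int((end - start) // span) + 1
    sums = [0.0] * n_buckets
    counts = [0] * n_buckets
    for ts, value in samples:
        idx = int((ts - start) // span)
        sums[idx] += value
        counts[idx] += 1
    return [
        (start + i * span, sums[i] / counts[i] if counts[i] else None)
        for i in range(n_buckets)
    ]


# Hypothetical readings at irregular epoch-second offsets.
readings = [(0, 1.0), (7, 3.0), (12, 5.0), (31, 9.0)]
buckets = bucket_average(readings, span=10)
gaps = [start for start, mean in buckets if mean is None]
```

With a span of 10, the bucket starting at 20 is empty, so the uniform series still has a gap. Rerunning the same data with a span of 20 leaves no empty buckets.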
Parameters
- The time series should not have any gaps or missing data; otherwise ARIMA reports an error. If there are missing samples in the data, use a bigger span in timechart or use streamstats to fill in the gaps with average values.
- When chaining ARIMA output to another algorithm (for example, ARIMA itself), keep in mind that the length of the data is the length of the original data + forecast_k. If you want to maintain the holdback position, you need to add the number in forecast_k to your holdback value.
- ARIMA requires the order parameter to be specified at fitting time. The order parameter needs three values:
  - Number of autoregressive (AR) parameters
  - Number of differencing operations (D)
  - Number of moving average (MA) parameters
- The forecast_k=<int> parameter tells ARIMA how many points into the future to forecast. If _time is specified during fitting along with the field_to_forecast, ARIMA also generates the timestamps for the forecasted values. By default, forecast_k is zero.
- The conf_interval=<1..99> parameter is the confidence interval, as a percentage, around the forecasted values. By default, conf_interval is 95.
- The holdback=<int> parameter is the number of data points held back from the ARIMA model. This is useful for comparing the forecast against known data points. By default, holdback is zero.
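The chaining note above comes down to simple arithmetic. The sketch below is a hypothetical helper (not part of the toolkit) that tracks the data length entering each chained ARIMA stage and the holdback value needed to keep the held-back region starting at the same point of the original data.

```python
def chained_holdbacks(original_length, holdback, forecast_ks):
    """Return per-stage (input_length, holdback) for chained ARIMA fits.

    Each stage outputs its input length plus that stage's forecast_k, so the
    next stage must grow its holdback by the same amount to start holding back
    at the same original data point.
    """
    stages = []
    length = original_length
    current_holdback = holdback
    for k in forecast_ks:
        stages.append((length, current_holdback))
        length += k              # output length = input length + forecast_k
        current_holdback += k    # keep the holdback start position fixed
    return stages


# Hypothetical case: 100 points, holdback=10, two stages each with forecast_k=10.
stages = chained_holdbacks(100, holdback=10, forecast_ks=[10, 10])
```

In this hypothetical case the second stage sees 110 points and needs a holdback of 20, so both stages begin holding back at point 90.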
Syntax
fit ARIMA [_time] <field_to_forecast> order=<int>-<int>-<int> [forecast_k=<int>] [conf_interval=<int>] [holdback=<int>]
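The order value in the syntax above packs three integers into one hyphen-separated string. As a sketch only, the following hypothetical parser (not the toolkit's actual code) shows how such a value maps to the AR, differencing, and MA counts, and how the documented conf_interval range of 1..99 could be checked.

```python
def parse_order(order):
    """Split an 'AR-D-MA' string such as '4-0-1' into three non-negative ints."""
    parts = order.split("-")
    if len(parts) != 3 or not all(p.isdecimal() for p in parts):
        raise ValueError(f"order must look like <int>-<int>-<int>, got {order!r}")
    ar, d, ma = (int(p) for p in parts)
    return ar, d, ma


def check_conf_interval(value):
    """Accept only whole percentages from 1 to 99, as in conf_interval=<1..99>."""
    if not isinstance(value, int) or not 1 <= value <= 99:
        raise ValueError(f"conf_interval must be an integer in 1..99, got {value!r}")
    return value


ar, d, ma = parse_order("4-0-1")   # four AR terms, no differencing, one MA term
conf = check_conf_interval(95)     # the documented default
```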
Syntax constraints
- ARIMA supports one time series at a time.
- ARIMA models cannot be saved and used at a later time in the current version.
- Scoring metric values are based on all data and not on the holdback period data.
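Because the reported scoring metrics cover all data, an error measure restricted to the holdback period has to be computed separately. The following sketch assumes you have already extracted the actual and predicted values as two aligned, equal-length lists covering the original data (hypothetical inputs; the toolkit's output field names are not shown here). It computes root-mean-square error over the last holdback points only.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def holdback_rmse(actual, predicted, holdback):
    """RMSE over only the final `holdback` points of the aligned series."""
    if holdback <= 0:
        raise ValueError("holdback must be positive")
    return rmse(actual[-holdback:], predicted[-holdback:])


# Hypothetical values: exact on the first four points, off by 2 on the last two.
actual = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predicted = [1.0, 2.0, 3.0, 4.0, 7.0, 8.0]

overall = rmse(actual, predicted)
held_back = holdback_rmse(actual, predicted, holdback=2)
```

In this made-up case the all-data figure is smaller than the error on the held-back points, which is why the two should not be confused.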
Example
The following example uses ARIMA on a test set.
... | fit ARIMA Voltage order=4-0-1 holdback=10 forecast_k=10
Local availability
- Local class: ARIMA
- Source file: Splunk_ML_Toolkit/bin/algos/ARIMA.py
- algos.conf stanza: [ARIMA]
- Class bases: BaseAlgo
Source
Adapted from the Splunk AI Toolkit 5.6.4 documentation at /en/splunk-cloud-platform/apply-machine-learning/use-ai-toolkit/5.6.4/algorithms-and-scoring-metrics-in-the-ai-toolkit/algorithms-in-the-ai-toolkit (section: forecasting).