Methods of Estimation

In Orbit, we support three methods for estimating model parameters (a.k.a. posteriors in Bayesian terminology):

  1. Maximum a Posteriori (MAP)

  2. Full Bayesian estimation

  3. Aggregated posterior estimation

[1]:
%matplotlib inline
import matplotlib.pyplot as plt

import orbit
from orbit.utils.dataset import load_iclaims
from orbit.models.ets import ETSMAP, ETSFull, ETSAggregated
from orbit.diagnostics.plot import plot_predicted_data

from orbit.utils.plot import get_orbit_style
plt.style.use(get_orbit_style())
[2]:
print(orbit.__version__)
1.0.17
[3]:
# load data
df = load_iclaims()
test_size = 52
train_df = df[:-test_size]
test_df = df[-test_size:]
response_col = 'claims'
date_col = 'week'

Maximum a Posteriori (MAP)

In general, we use the naming convention [TheModel]MAP for a model class that uses MAP estimation. You will find ETSMAP used here, and DLTMAP and LGTMAP in later sections. The advantage of MAP estimation is faster computation. We also provide inference for the MAP method, with the caveat that the uncertainty is generated mainly by the noise process, so the uncertainty band may not reflect seasonality or other components.

[4]:
%%time
ets = ETSMAP(
    response_col=response_col,
    date_col=date_col,
    seasonality=52,
    seed=8888,
)
ets.fit(df=train_df)
predicted_df = ets.predict(df=test_df)
CPU times: user 306 ms, sys: 93.7 ms, total: 399 ms
Wall time: 761 ms
[5]:
_ = plot_predicted_data(train_df, predicted_df, date_col, response_col, title='Prediction with ETSMAP')
[Figure: Prediction with ETSMAP]
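
To make the idea of a MAP point estimate concrete, here is a minimal sketch on a toy Beta-Binomial model rather than on ETS. It is not how Orbit computes MAP internally; the function names, the Beta(2, 2) prior, and the data are illustrative assumptions. The sketch finds the posterior mode by grid search and compares it with the closed-form mode.

```python
import math

def log_posterior(theta, successes, trials, a=2.0, b=2.0):
    """Unnormalized log posterior of a Binomial rate under a Beta(a, b) prior."""
    log_prior = (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
    log_lik = successes * math.log(theta) + (trials - successes) * math.log(1 - theta)
    return log_prior + log_lik

def map_estimate(successes, trials, a=2.0, b=2.0, grid_size=9999):
    """Return the grid point in (0, 1) that maximizes the log posterior."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return max(grid, key=lambda t: log_posterior(t, successes, trials, a, b))

successes, trials = 7, 10
theta_map = map_estimate(successes, trials)

# Closed-form mode of the Beta(a + k, b + n - k) posterior with a = b = 2.
theta_closed = (2 + successes - 1) / (2 + 2 + trials - 2)
print(round(theta_map, 4), round(theta_closed, 4))
```

The result is a single number with no spread around it, which is why a MAP fit has no parameter uncertainty to propagate into the prediction band.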

Full Bayesian Estimation

We use the naming convention [TheModel]Full for a model that uses full Bayesian estimation. For example, you will find ETSFull used here, and DLTFull and LGTFull in later sections. Compared to MAP, a full Bayesian model usually takes longer to fit, because the No-U-Turn Sampler (NUTS) (Hoffman and Gelman 2011) runs under the hood. The advantage is that inference and estimation are usually more robust.

[6]:
%%time
ets = ETSFull(
    response_col=response_col,
    date_col=date_col,
    seasonality=52,
    seed=8888,
    num_warmup=400,
    num_sample=400,
)
ets.fit(df=train_df)
predicted_df = ets.predict(df=test_df)
CPU times: user 433 ms, sys: 56.3 ms, total: 489 ms
Wall time: 1.19 s
[7]:
_ = plot_predicted_data(train_df, predicted_df, date_col, response_col, title='Prediction with ETSFull')
[Figure: Prediction with ETSFull]
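
The sketch below illustrates, on toy numbers, how a set of posterior draws can turn into a prediction band: each draw yields one simulated forecast, and percentiles are taken across draws. The draws are made up with `random` rather than produced by NUTS, and the level-plus-noise forecast is a simplifying assumption, not Orbit's ETS implementation.

```python
import random
import statistics

rng = random.Random(8888)

# Hypothetical posterior draws for a final level and a noise scale
# (stand-ins for what a sampler such as NUTS would return).
num_draws = 400
level_draws = [rng.gauss(13.0, 0.05) for _ in range(num_draws)]
sigma_draws = [abs(rng.gauss(0.1, 0.01)) for _ in range(num_draws)]

# One simulated one-step-ahead forecast per posterior draw:
# parameter uncertainty (level) plus observation noise (sigma).
forecasts = [rng.gauss(lvl, sig) for lvl, sig in zip(level_draws, sigma_draws)]

# statistics.quantiles with n=20 returns 19 cut points at 5%, 10%, ..., 95%.
cuts = statistics.quantiles(forecasts, n=20)
lower, median, upper = cuts[0], cuts[9], cuts[18]
print(f"5%: {lower:.3f}  50%: {median:.3f}  95%: {upper:.3f}")
```

Because the level itself varies from draw to draw, the band reflects parameter uncertainty as well as observation noise.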

You can also access the posterior samples as a dict through the ._posterior_samples attribute.

[8]:
ets._posterior_samples.keys()
[8]:
odict_keys(['l', 'lev_sm', 'obs_sigma', 's', 'sea_sm'])

Aggregated Posteriors

We use the naming convention [TheModel]Aggregated for a model that uses aggregated posteriors for prediction. For example, you will find ETSAggregated used here, and DLTAggregated and LGTAggregated in later sections. Just like the full Bayesian method, it runs an MCMC algorithm, which is NUTS by default. The difference from a full model is that an aggregated model first aggregates the posterior samples by mean or median (via aggregate_method) and then makes the prediction using the aggregated posterior.

[9]:
%%time
ets = ETSAggregated(
    response_col=response_col,
    date_col=date_col,
    seasonality=52,
    seed=8888,
)
ets.fit(df=train_df)
predicted_df = ets.predict(df=test_df)
WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated
CPU times: user 539 ms, sys: 115 ms, total: 653 ms
Wall time: 1.5 s
[10]:
_ = plot_predicted_data(train_df, predicted_df, date_col, response_col, title='Prediction with ETSAggregated')
[Figure: Prediction with ETSAggregated]
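
The aggregation step can be pictured with a small sketch: collapse each parameter's posterior draws to one value by mean or median, then predict once from that single parameter set. The dictionary below is a hand-made stand-in for posterior samples, `aggregate_posterior` is a hypothetical helper rather than an Orbit function, and only scalar parameters are handled.

```python
import statistics

def aggregate_posterior(samples, method="mean"):
    """Collapse each parameter's draws into a single point value."""
    reducers = {"mean": statistics.fmean, "median": statistics.median}
    if method not in reducers:
        raise ValueError(f"unsupported method: {method!r}")
    reducer = reducers[method]
    return {name: reducer(draws) for name, draws in samples.items()}

# Hand-made draws for two scalar parameters (illustrative values only).
toy_samples = {
    "lev_sm": [0.10, 0.20, 0.30, 0.80],
    "obs_sigma": [0.05, 0.07, 0.06, 0.10],
}

by_mean = aggregate_posterior(toy_samples, method="mean")
by_median = aggregate_posterior(toy_samples, method="median")
print(by_mean)
print(by_median)
```

With a skewed set of draws such as `lev_sm` here, the mean and the median give noticeably different point values, so the choice of aggregation method can matter.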

Users interested in the two refined Orbit models, DLT and LGT, will find dedicated sections that discuss them in detail.