orbit.diagnostics package

Submodules

orbit.diagnostics.plot module

orbit.diagnostics.plot.metric_horizon_barplot(df, model_col='model', pred_horizon_col='pred_horizon', metric_col='smape', bar_width=0.1, path=None, figsize=None, fontsize=None, is_visible=False)
orbit.diagnostics.plot.params_comparison_boxplot(data, var_names, model_names, color_list=[(0.12156862745098039, 0.4666666666666667, 0.7058823529411765), (1.0, 0.4980392156862745, 0.054901960784313725), (0.17254901960784313, 0.6274509803921569, 0.17254901960784313), (0.8392156862745098, 0.15294117647058825, 0.1568627450980392), (0.5803921568627451, 0.403921568627451, 0.7411764705882353), (0.5490196078431373, 0.33725490196078434, 0.29411764705882354), (0.8901960784313725, 0.4666666666666667, 0.7607843137254902), (0.4980392156862745, 0.4980392156862745, 0.4980392156862745), (0.7372549019607844, 0.7411764705882353, 0.13333333333333333), (0.09019607843137255, 0.7450980392156863, 0.8117647058823529)], title='Params Comparison', fig_size=(10, 6), box_width=0.1, box_distance=0.2, showfliers=False)

Compare the distribution of parameters from different models using a boxplot.

Parameters:
  • data – list of dict with keys as the parameters of interest

  • var_names – list of strings, the labels of the parameters to compare

  • model_names – list of strings, the names of the models to compare

  • color_list – list, the colors used to differentiate the models

  • title – string, the title of the chart

  • fig_size – tuple figure size

  • box_width – float width of the boxes in the boxplot

  • box_distance – float the distance between boxes in the boxplot

  • showfliers – boolean show outliers in the chart if set to True

Returns:

a boxplot comparing parameter distributions from different models side by side
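As a rough illustration of the input layout described above, the sketch below builds `data` as a list of dicts, one per model, keyed by parameter name. The parameter names (`alpha`, `beta`), the model names, and the use of random draws in place of real samples from fitted models are all assumptions made for illustration. The call to `params_comparison_boxplot` is wrapped in a function so that nothing is drawn until it is invoked.

```python
import random

random.seed(0)

# Hypothetical parameter and model names, used only for illustration.
var_names = ["alpha", "beta"]
model_names = ["model_a", "model_b"]

# One dict per model; each key maps to a list of draws for that parameter.
# Real usage would supply samples from fitted models instead of random numbers.
data = [
    {name: [random.gauss(mu, 1.0) for _ in range(200)] for name in var_names}
    for mu in (0.0, 0.5)
]


def draw_comparison():
    """Call the documented function; requires orbit and matplotlib to be installed."""
    from orbit.diagnostics.plot import params_comparison_boxplot

    return params_comparison_boxplot(
        data, var_names=var_names, model_names=model_names, showfliers=False
    )
```

Calling `draw_comparison()` in an environment with orbit installed should produce the side-by-side boxplot; the order of `model_names` is assumed to match the order of the dicts in `data`.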

orbit.diagnostics.plot.plot_bt_predictions(bt_pred_df, metrics=<function smape>, split_key_list=None, ncol=2, figsize=None, include_vline=True, title='', fontsize=20, path=None, is_visible=True)

Plot and visualize the prediction results from backtesting.

bt_pred_df: data frame

the output of orbit.diagnostics.backtest.BackTester.fit_predict(), which includes the actuals and predictions for all the splits

metrics: callable

the metric function

split_key_list: list; default None

the split keys to plot for the given model. If None, all the splits are plotted

ncol: int

number of columns in the panel; the number of rows is determined accordingly

figsize: tuple

figure size

include_vline: bool

whether to draw a vertical line separating the in-sample and out-of-sample predictions for each split

title: str

title of the plot

fontsize: int; optional

fontsize of the title

path: string

path to save the figure

is_visible: bool

whether to display the figure
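The `metrics` argument accepts a callable. The sketch below shows a plain-Python symmetric MAPE as one example of such a metric. The `(actual, predicted)` argument order and the exact formula are assumptions for illustration and may differ from the `smape` function that orbit uses by default.

```python
def smape_sketch(actual, predicted):
    """Symmetric mean absolute percentage error over paired values.

    Pairs where both values are zero are skipped to avoid dividing by zero.
    """
    terms = [
        2.0 * abs(a - p) / (abs(a) + abs(p))
        for a, p in zip(actual, predicted)
        if abs(a) + abs(p) > 0
    ]
    return sum(terms) / len(terms) if terms else 0.0


# Made-up actuals and predictions; the third pair is skipped (both zero).
score = smape_sketch([100, 200, 0], [110, 180, 0])
print(score)
```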

orbit.diagnostics.plot.plot_bt_predictions2(bt_pred_df, metrics=<function smape>, split_key_list=None, figsize=None, include_vline=True, title='', fontsize=20, markersize=50, lw=2, fig_dir=None, is_visible=True, fix_xylim=True, export_gif=False)

A different style of backtest plot compared to plot_bt_predictions: it writes a separate plot for each split. It can also produce an animation that summarizes every split.

orbit.diagnostics.plot.plot_predicted_components(predicted_df, date_col, prediction_percentiles=None, plot_components=None, title='', figsize=None, path=None, fontsize=None, is_visible=True)
Plot predicted components from the data frame of decomposed predictions, where the components have been pre-defined as trend, seasonality and regression.

predicted_df: pd.DataFrame

predicted data response data frame. Two columns are required: actual_col and pred_col. If prediction_percentiles is provided, the corresponding percentile columns must be included as well.

date_col: str

the date column name

prediction_percentiles: list

a list of exactly two elements, used as the lower and upper bounds of the plotted confidence interval

plot_components: list

a list of strings giving the labels of the components to plot; by default, it uses the values in orbit.constants.constants.PredictedComponents.

title: str; optional

title of the plot

figsize: tuple; optional

figsize passed through to matplotlib.pyplot.figure()

path: str; optional

path to save the figure

fontsize: int; optional

fontsize of the title

is_visible: boolean

whether to show the plot. When called from unit tests, is_visible may be set to False.

Return type:

matplotlib axes object

orbit.diagnostics.plot.plot_predicted_data(training_actual_df, predicted_df, date_col, actual_col, pred_col='prediction', prediction_percentiles=None, title='', test_actual_df=None, is_visible=True, figsize=None, path=None, fontsize=None, line_plot=False, markersize=50, lw=2, linestyle='-')

Plot the training actual response together with the predicted data; if the actual response for the prediction period is provided, plot it too.

Parameters:
  • training_actual_df (pd.DataFrame) – training actual response data frame. Two columns are required: actual_col and date_col

  • predicted_df (pd.DataFrame) – predicted data response data frame. Two columns are required: actual_col and pred_col. If prediction_percentiles is provided, the data frame must also include the percentile columns, named prediction_{x} where x is the corresponding percentile

  • prediction_percentiles (list) – list of two elements indicating the lower and upper percentiles

  • date_col (str) – the date column name

  • actual_col (str) –

  • pred_col (str) –

  • title (str) – title of the plot

  • test_actual_df (pd.DataFrame) – test actual response data frame. Two columns are required: actual_col and date_col

  • is_visible (boolean) – whether to show the plot. When called from unit tests, is_visible may be set to False.

  • figsize (tuple) – figsize passed through to matplotlib.pyplot.figure()

  • path (str) – path to save the figure

  • fontsize (int; optional) – fontsize of the title

  • line_plot (bool; default False) – if True, make line plot for observations; otherwise, make scatter plot for observations

  • markersize (int; optional) – point marker size

  • lw (int; optional) – out-of-sample prediction line width

  • linestyle (str) – linestyle of prediction plot

Return type:

matplotlib axes object
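The `predicted_df` description above says percentile columns follow the pattern `prediction_{x}`. A small helper (hypothetical, not part of orbit) can list the columns that `predicted_df` would need for a given `prediction_percentiles`. It assumes `x` is the percentile formatted exactly as it is passed in.

```python
def expected_percentile_columns(prediction_percentiles):
    """Return the column names implied by the prediction_{x} naming pattern."""
    if len(prediction_percentiles) != 2:
        raise ValueError("prediction_percentiles must contain exactly two elements")
    # Sort so the lower bound always comes first, whatever order was passed in.
    lower, upper = sorted(prediction_percentiles)
    return [f"prediction_{lower}", f"prediction_{upper}"]


cols = expected_percentile_columns([95, 5])
print(cols)  # ['prediction_5', 'prediction_95']
```

Checking these names against the columns of `predicted_df` before plotting is one way to catch a mismatch between the percentiles used at prediction time and those requested for the plot.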

orbit.diagnostics.plot.residual_diagnostic_plot(df, dist='norm', date_col='week', residual_col='residual', fitted_col='prediction', sparams=None)
Parameters:
  • df (pd.DataFrame) –

  • dist (str) –

  • date_col (str) – column name of date

  • residual_col (str) – column name of residual

  • fitted_col (str) – column name of fitted value from model

  • sparams (float or list) – extra parameters used in distribution such as t-dist

Notes

  1. residual by time

  2. residual vs fitted

  3. residual histogram with vertical line as mean

  4. residual Q-Q plot

  5. residual ACF

  6. residual PACF
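The quantities behind panels 3 and 5 can also be computed by hand. The sketch below, in plain Python with made-up residuals, computes the residual mean (the vertical line in the histogram) and the sample autocorrelation at a given lag. It is only an illustration and not necessarily how `residual_diagnostic_plot` computes its panels.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def autocorrelation(values, lag):
    """Sample autocorrelation at `lag`, using the full-series mean and variance."""
    m = mean(values)
    denom = sum((v - m) * (v - m) for v in values)
    if denom == 0:
        return 0.0
    num = sum(
        (values[t] - m) * (values[t - lag] - m) for t in range(lag, len(values))
    )
    return num / denom


# Made-up residuals, used only for illustration.
residuals = [0.5, -0.2, 0.1, -0.4, 0.3, -0.1, 0.2, -0.4]

print(mean(residuals))
print(autocorrelation(residuals, 1))
```

A residual mean far from zero, or autocorrelations that stay large across lags, are the patterns the corresponding panels are meant to make visible.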

Module contents