orbit.diagnostics package
Submodules
orbit.diagnostics.plot module
- orbit.diagnostics.plot.metric_horizon_barplot(df, model_col='model', pred_horizon_col='pred_horizon', metric_col='smape', bar_width=0.1, path=None, figsize=None, fontsize=None, is_visible=False)
- orbit.diagnostics.plot.params_comparison_boxplot(data, var_names, model_names, color_list=[(0.12156862745098039, 0.4666666666666667, 0.7058823529411765), (1.0, 0.4980392156862745, 0.054901960784313725), (0.17254901960784313, 0.6274509803921569, 0.17254901960784313), (0.8392156862745098, 0.15294117647058825, 0.1568627450980392), (0.5803921568627451, 0.403921568627451, 0.7411764705882353), (0.5490196078431373, 0.33725490196078434, 0.29411764705882354), (0.8901960784313725, 0.4666666666666667, 0.7607843137254902), (0.4980392156862745, 0.4980392156862745, 0.4980392156862745), (0.7372549019607844, 0.7411764705882353, 0.13333333333333333), (0.09019607843137255, 0.7450980392156863, 0.8117647058823529)], title='Params Comparison', fig_size=(10, 6), box_width=0.1, box_distance=0.2, showfliers=False)
Compare the distribution of parameters from different models using a boxplot.
- Parameters:
data (list of dict) – one dict per model, with keys as the parameters of interest
var_names (list of str) – the labels of the parameters to compare
model_names (list of str) – the names of the models to compare
color_list (list) – the colors used to differentiate models (the default in the signature is a list of ten RGB tuples)
title (str) – the title of the chart
fig_size (tuple) – figure size
box_width (float) – width of the boxes in the boxplot
box_distance (float) – the distance between adjacent boxes in the boxplot
showfliers (bool) – show outliers in the chart if set to True
- Returns:
a boxplot comparing parameter distributions from different models side by side
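To make the expected shape of data concrete, here is a minimal sketch that uses only the standard library and does not call orbit. The helper box_summary and the sample values are hypothetical. It assumes each dict maps a parameter name to a list of sampled values, and computes the quartiles that a boxplot would typically draw for each model/parameter pair.

```python
from statistics import quantiles


def box_summary(data, var_names, model_names):
    """Return {(model, var): (q1, median, q3)} for every model/parameter pair.

    Hypothetical helper, not part of orbit. `data` is a list of dicts,
    one per model, keyed by parameter name.
    """
    if len(data) != len(model_names):
        raise ValueError("data and model_names must have the same length")
    summary = {}
    for model, params in zip(model_names, data):
        for var in var_names:
            # "inclusive" uses linear interpolation between the sample min and max
            q1, med, q3 = quantiles(params[var], n=4, method="inclusive")
            summary[(model, var)] = (q1, med, q3)
    return summary


# Made-up samples for two models (illustrative only)
data = [
    {"alpha": [1, 2, 3, 4, 5], "beta": [10, 20, 30, 40, 50]},
    {"alpha": [2, 4, 6, 8, 10], "beta": [5, 5, 5, 5, 5]},
]
summary = box_summary(data, ["alpha", "beta"], ["model_a", "model_b"])
```

The same data, var_names and model_names lists are what one would then pass to params_comparison_boxplot, assuming the sampled values come from fitted models rather than the toy numbers used here.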
- orbit.diagnostics.plot.plot_bt_predictions(bt_pred_df, metrics=<function smape>, split_key_list=None, ncol=2, figsize=None, include_vline=True, title='', fontsize=20, path=None, is_visible=True)
Plot and visualize the prediction results from backtesting.
- bt_pred_df: data frame
the output of orbit.diagnostics.backtest.BackTester.fit_predict(), which includes the actuals/predictions for all the splits
- metrics: callable
the metric function
- split_key_list: list; default None
for the given model, which split keys to plot. If None, all the splits will be plotted
- ncol: int
number of columns of the panel; the number of rows will be decided accordingly
- figsize: tuple
figure size
- include_vline: bool
whether to plot a vertical line separating the in-sample and out-of-sample predictions for each split
- title: str
title of the plot
- fontsize: int; optional
fontsize of the title
- path: string
path to save the figure
- is_visible: bool
whether to display the figure
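As an illustration of the metrics and ncol arguments, the sketch below shows one common definition of sMAPE and one plausible way the number of panel rows could follow from ncol. Both functions are standalone stand-ins written for this example. The library's own smape and its layout logic may differ in details, and the (actual, prediction) argument order is an assumption.

```python
import math


def smape(actual, prediction):
    """Symmetric MAPE under one common definition:
    the mean of 2 * |a - p| / (|a| + |p|) over all pairs.

    Pairs where both values are zero are skipped to avoid dividing by zero.
    """
    terms = [
        2 * abs(a - p) / (abs(a) + abs(p))
        for a, p in zip(actual, prediction)
        if abs(a) + abs(p) > 0
    ]
    if not terms:
        return 0.0
    return sum(terms) / len(terms)


def panel_rows(n_splits, ncol=2):
    """Rows needed to fit n_splits panels into ncol columns (assumed layout rule)."""
    return math.ceil(n_splits / ncol)


# Made-up actuals and predictions for a single split
score = smape([100, 200], [110, 180])
rows = panel_rows(5, ncol=2)
```

Any callable with a compatible signature could in principle be supplied as metrics; this should be checked against the installed version of the library.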
- orbit.diagnostics.plot.plot_bt_predictions2(bt_pred_df, metrics=<function smape>, split_key_list=None, figsize=None, include_vline=True, title='', fontsize=20, markersize=50, lw=2, fig_dir=None, is_visible=True, fix_xylim=True, export_gif=False)
A different style of backtest plot compared to plot_bt_predictions: it writes a separate plot for each split. It can also be used to produce an animation that summarizes every split.
- orbit.diagnostics.plot.plot_predicted_components(predicted_df, date_col, prediction_percentiles=None, plot_components=None, title='', figsize=None, path=None, fontsize=None, is_visible=True)
Plot predicted components from the data frame of decomposed predictions, where the components have been pre-defined as trend, seasonality and regression.
- predicted_df: pd.DataFrame
predicted response data frame. Two columns are required: actual_col and pred_col. If prediction percentiles are provided, the corresponding columns must be included as well.
- date_col: str
the date column name
- prediction_percentiles: list
a list of exactly two elements, used as the lower and upper bounds of the plotted confidence interval
- plot_components: list
a list of strings giving the labels of the components to be plotted; by default, it uses the values in orbit.constants.constants.PredictedComponents.
- title: str; optional
title of the plot
- figsize: tuple; optional
figsize passed through to matplotlib.pyplot.figure()
- path: str; optional
path to save the figure
- fontsize: int; optional
fontsize of the title
- is_visible: boolean
whether to show the plot. When called from a unit test, is_visible may be set to False.
- Return type:
matplotlib axes object
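The plot_components default can be pictured with a small pre-check that a caller might run before plotting. This is a sketch only: components_to_plot is a hypothetical helper, and the default names are taken from the description above (trend, seasonality, regression) rather than read from orbit.constants.constants.PredictedComponents, whose actual values may differ.

```python
# Component labels as named in the description above (assumed, not read from orbit)
DEFAULT_COMPONENTS = ["trend", "seasonality", "regression"]


def components_to_plot(columns, plot_components=None):
    """Return the component labels to plot, checking they exist in `columns`.

    Hypothetical helper: `columns` stands in for the column names of the
    decomposed prediction data frame.
    """
    wanted = DEFAULT_COMPONENTS if plot_components is None else plot_components
    missing = [c for c in wanted if c not in columns]
    if missing:
        raise KeyError(f"missing component columns: {missing}")
    return list(wanted)


# Column names of a made-up decomposed prediction data frame
columns = ["week", "prediction", "trend", "seasonality", "regression"]
default_selection = components_to_plot(columns)
custom_selection = components_to_plot(columns, ["trend", "seasonality"])
```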
- orbit.diagnostics.plot.plot_predicted_data(training_actual_df, predicted_df, date_col, actual_col, pred_col='prediction', prediction_percentiles=None, title='', test_actual_df=None, is_visible=True, figsize=None, path=None, fontsize=None, line_plot=False, markersize=50, lw=2, linestyle='-')
Plot the training actual response together with the predicted data; if the actual response for the predicted period is available, plot it too.
- Parameters:
training_actual_df (pd.DataFrame) – training actual response data frame. Two columns are required: actual_col and date_col
predicted_df (pd.DataFrame) – predicted response data frame. Two columns are required: actual_col and pred_col. If prediction_percentiles is provided, the data frame must also include the corresponding columns, named prediction_{x} where x is the corresponding percentile
prediction_percentiles (list) – list of two elements indicating the lower and upper percentiles
date_col (str) – the date column name
actual_col (str) –
pred_col (str) –
title (str) – title of the plot
test_actual_df (pd.DataFrame) – test actual response dataframe. two columns required: actual_col and date_col
is_visible (boolean) – whether to show the plot. When called from a unit test, is_visible may be set to False.
figsize (tuple) – figsize passed through to matplotlib.pyplot.figure()
path (str) – path to save the figure
fontsize (int; optional) – fontsize of the title
line_plot (bool; default False) – if True, make line plot for observations; otherwise, make scatter plot for observations
markersize (int; optional) – point marker size
lw (int; optional) – out-of-sample prediction line width
linestyle (str) – linestyle of prediction plot
- Return type:
matplotlib axes object
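The prediction_{x} naming convention described for predicted_df can be illustrated with a short helper. This is a sketch under the assumption that the column names are formed exactly as prediction_{x} with the percentile value substituted for x; percentile_columns is hypothetical and not part of orbit.

```python
def percentile_columns(prediction_percentiles):
    """Return the column names expected in predicted_df for the given percentiles.

    Hypothetical helper following the prediction_{x} convention described above.
    The list must contain exactly two elements: the lower and upper percentiles.
    """
    if len(prediction_percentiles) != 2:
        raise ValueError("prediction_percentiles must have exactly two elements")
    return [f"prediction_{p}" for p in prediction_percentiles]


# For a 5%-95% interval, predicted_df would need these two extra columns
interval_cols = percentile_columns([5, 95])
```

A caller could compare the result against the columns of predicted_df before plotting, so that a missing percentile column is caught with a clear message.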
- orbit.diagnostics.plot.residual_diagnostic_plot(df, dist='norm', date_col='week', residual_col='residual', fitted_col='prediction', sparams=None)
- Parameters:
df (pd.DataFrame) –
dist (str) –
date_col (str) – column name of date
residual_col (str) – column name of residual
fitted_col (str) – column name of fitted value from model
sparams (float or list) – extra parameters used in distribution such as t-dist
Notes
- residual by time
- residual vs fitted
- residual histogram with a vertical line at the mean
- residual Q-Q plot
- residual ACF
- residual PACF
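Two of the quantities listed above, the residual mean and the residual ACF, can be computed by hand as follows. This is a standalone sketch using the standard sample autocorrelation estimator. It is not the code the library uses to draw its panels, and the function names are made up for this example.

```python
def residual_mean(residuals):
    """Arithmetic mean of the residuals (the vertical line in a histogram panel)."""
    return sum(residuals) / len(residuals)


def residual_acf(residuals, max_lag):
    """Sample autocorrelation of the residuals for lags 0..max_lag.

    Uses the standard estimator: the lag-k autocovariance of the
    mean-centered series divided by the lag-0 autocovariance.
    """
    n = len(residuals)
    mean = residual_mean(residuals)
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for constant residuals")
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


# Made-up residuals that alternate in sign, so lag 1 is strongly negative
residuals = [1, -1, 1, -1]
acf_values = residual_acf(residuals, max_lag=2)
```

Residuals from a well-specified model would be expected to show autocorrelations close to zero at all non-zero lags, which is what the ACF and PACF panels help to check.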