EDA Utilities¶
This section introduces the plotting functions that orbit provides for exploratory data analysis (EDA). The plots include:
Time series heatmap
Correlation heatmap
Dual axis time series plot
Wrap plot
[1]:
import seaborn as sns
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
import orbit
from orbit.utils.dataset import load_iclaims
from orbit.eda import eda_plot
[2]:
print(orbit.__version__)
1.1.4
[3]:
df = load_iclaims()
df['week'] = pd.to_datetime(df['week'])
[4]:
df.head()
[4]:
| | week | claims | trend.unemploy | trend.filling | trend.job | sp500 | vix |
|---|---|---|---|---|---|---|---|
| 0 | 2010-01-03 | 13.386595 | 0.219882 | -0.318452 | 0.117500 | -0.417633 | 0.122654 |
| 1 | 2010-01-10 | 13.624218 | 0.219882 | -0.194838 | 0.168794 | -0.425480 | 0.110445 |
| 2 | 2010-01-17 | 13.398741 | 0.236143 | -0.292477 | 0.117500 | -0.465229 | 0.532339 |
| 3 | 2010-01-24 | 13.137549 | 0.203353 | -0.194838 | 0.106918 | -0.481751 | 0.428645 |
| 4 | 2010-01-31 | 13.196760 | 0.134360 | -0.242466 | 0.074483 | -0.488929 | 0.487404 |
Time series heatmap¶
[5]:
_ = eda_plot.ts_heatmap(df=df, date_col='week', seasonal_interval=52, value_col='claims')

[6]:
_ = eda_plot.ts_heatmap(df=df, date_col='week', seasonal_interval=52, value_col='claims', normalization=True)
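To make the grid behind a time series heatmap concrete, here is a minimal sketch in plain pandas. It is not orbit's implementation of `ts_heatmap`: the helper name `heatmap_grid`, the synthetic data, the one-row-per-calendar-year layout, and the per-row min-max scaling are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic weekly series with a 52-week cycle (hypothetical data, not iclaims).
dates = pd.date_range("2010-01-03", periods=156, freq="W-SUN")
values = 10 + np.sin(2 * np.pi * np.arange(156) / 52)
toy = pd.DataFrame({"week": dates, "value": values})


def heatmap_grid(df, date_col, value_col, normalize=False):
    """Pivot a series into a year-by-slot grid (illustrative helper, assumed layout)."""
    tmp = pd.DataFrame({
        "year": df[date_col].dt.year,
        "value": df[value_col],
    })
    # Position of each observation within its calendar year: 0, 1, 2, ...
    tmp["slot"] = tmp.groupby("year").cumcount()
    grid = tmp.pivot(index="year", columns="slot", values="value")
    if normalize:
        # Assumed normalization: min-max scale each year (row) to [0, 1]
        # so that years with different levels become comparable.
        row_min = grid.min(axis=1)
        row_range = grid.max(axis=1) - row_min
        grid = grid.sub(row_min, axis=0).div(row_range, axis=0)
    return grid


grid = heatmap_grid(toy, "week", "value", normalize=True)
print(grid.shape)
```

Each row of `grid` is one year and each column is a position within that year, so a seasonal pattern shows up as a vertical band when the grid is rendered as a heatmap.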
Correlation heatmap¶
[7]:
var_list = ['trend.unemploy', 'trend.filling', 'trend.job', 'sp500', 'vix']
_ = eda_plot.correlation_heatmap(df, var_list=var_list,
                                 fig_width=10, fig_height=6)
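As a rough guide to the numbers a correlation heatmap displays, the sketch below computes a pairwise Pearson correlation matrix with `pandas.DataFrame.corr`. The synthetic columns are hypothetical, and whether `correlation_heatmap` uses exactly this computation is an assumption rather than something stated here.

```python
import numpy as np
import pandas as pd

# Synthetic columns with known relationships (hypothetical data).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
toy = pd.DataFrame({
    "x": x,
    "x_scaled": 2.0 * x + 1.0,        # linear in x, so correlation is +1
    "x_flipped": -x,                  # mirror of x, so correlation is -1
    "noise": rng.normal(size=200),    # independent draw, correlation near 0
})

# Pairwise Pearson correlation; the result is a symmetric square DataFrame.
corr = toy.corr()
print(corr.round(2))
```

Correlation is unaffected by shifting or rescaling a column, which is why `x_scaled` correlates perfectly with `x` despite having a different mean and variance.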
Dual axis time series plot¶
[8]:
_ = eda_plot.dual_axis_ts_plot(df=df, var1='trend.unemploy', var2='claims', date_col='week')
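A dual axis plot is useful when two series live on very different scales, as `claims` (around 13) and `trend.unemploy` (around 0.2) do in the table above. The following is a minimal matplotlib sketch of the general technique using `Axes.twinx`; it is not orbit's implementation, and the column names `small` and `large` and the synthetic data are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Two synthetic series on very different scales (hypothetical data).
t = np.arange(104)
toy = pd.DataFrame({
    "week": pd.date_range("2010-01-03", periods=104, freq="W-SUN"),
    "small": 0.2 * np.sin(2 * np.pi * t / 52),
    "large": 13 + 0.5 * np.sin(2 * np.pi * t / 52),
})

fig, ax_left = plt.subplots(figsize=(10, 4))
# twinx() creates a second y-axis that shares the same x-axis.
ax_right = ax_left.twinx()

ax_left.plot(toy["week"], toy["small"], color="tab:blue")
ax_left.set_xlabel("week")
ax_left.set_ylabel("small", color="tab:blue")

ax_right.plot(toy["week"], toy["large"], color="tab:orange")
ax_right.set_ylabel("large", color="tab:orange")

fig.tight_layout()
```

Because each series gets its own y-axis, both fill the vertical space of the figure. On a single shared axis the `small` series would be flattened into a nearly straight line.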
Wrap plots for a quick glance at data patterns¶
[9]:
var_list = ['week', 'trend.unemploy', 'trend.filling', 'trend.job', 'sp500', 'vix']
df[var_list].melt(id_vars=['week'])
[9]:
| | week | variable | value |
|---|---|---|---|
| 0 | 2010-01-03 | trend.unemploy | 0.219882 |
| 1 | 2010-01-10 | trend.unemploy | 0.219882 |
| 2 | 2010-01-17 | trend.unemploy | 0.236143 |
| 3 | 2010-01-24 | trend.unemploy | 0.203353 |
| 4 | 2010-01-31 | trend.unemploy | 0.134360 |
| ... | ... | ... | ... |
| 2210 | 2018-05-27 | vix | -0.175192 |
| 2211 | 2018-06-03 | vix | -0.275119 |
| 2212 | 2018-06-10 | vix | -0.291676 |
| 2213 | 2018-06-17 | vix | -0.152422 |
| 2214 | 2018-06-24 | vix | 0.003284 |
2215 rows × 3 columns
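The 2215 rows above are 5 variables times 443 weekly observations: `melt` stacks every non-id column into `variable`/`value` pairs. The small sketch below shows that reshaping on a toy frame and how the long format splits naturally into one series per variable, which is the shape a small-multiples plot needs. The toy columns `a` and `b` and the `panels` dictionary are hypothetical, and this is not a description of how `wrap_plot_ts` works internally.

```python
import pandas as pd

# Toy wide-format frame: one row per date, one column per variable.
wide = pd.DataFrame({
    "week": pd.date_range("2010-01-03", periods=4, freq="W-SUN"),
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10.0, 20.0, 30.0, 40.0],
})

# Long format: one row per (date, variable) pair.
long = wide.melt(id_vars=["week"])

# One date-indexed series per variable, ready to draw one panel each.
panels = {
    name: group.set_index("week")["value"]
    for name, group in long.groupby("variable")
}

print(long)
```

With 4 dates and 2 variables, `long` has 8 rows, mirroring the 443 × 5 = 2215 rows in the notebook output.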
[10]:
_ = eda_plot.wrap_plot_ts(df, 'week', var_list)