Rolling Module
The rolling module contains compiled rolling functions for pandas DataFrames which function better for our purposes than the built-in pandas rolling functions. The rolling module also contains a rolling multiple regression function that employs parallel processing and numba-compiled routines for speed.
rolling
- finance_byu.rolling.roll_sum(input, win, minp, errors='raise')
Computes the rolling sum for a pandas Series.
- Parameters:
- input: pandas.core.series.Series
This series must have strictly numeric type.
- win: int
Length of the moving window.
- minp: int
Minimum number of observations in window required to have a value (otherwise result is NaN).
- errors: {'raise', 'return'}
Whether to raise an error and stop running or to return NaN on manageable errors. 'return' is suggested for running in a groupby in which there may be some groups which do not have a sufficient number of observations.
- Returns:
- output: pandas.core.series.Series
A pandas Series with the rolling sum of input.
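To make the roles of win and minp concrete, here is a minimal sketch using the built-in pandas rolling interface as a reference point. It does not call finance_byu; the mapping of win to window and of minp to min_periods is an assumption based on the parameter descriptions above.

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# win=3 observations per window; minp=2 observations needed to produce a value.
# Assumption: these correspond to pandas' `window` and `min_periods`.
win, minp = 3, 2
reference = values.rolling(window=win, min_periods=minp).sum()

# Row 0 has only one observation (fewer than minp), so it is NaN.
print(reference.tolist())  # [nan, 3.0, 6.0, 9.0, 12.0]
```

With minp equal to win, as in the examples on this page, the first win - 1 rows are NaN.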
- finance_byu.rolling.roll_mean(input, win, minp, errors='raise')
Computes the rolling mean of a pandas Series.
- Parameters:
- input: pandas.core.series.Series
This series must have strictly numeric type.
- win: int
Length of the moving window.
- minp: int
Minimum number of observations in window required to have a value (otherwise result is NaN).
- errors: {'raise', 'return'}
Whether to raise an error and stop running or to return NaN on manageable errors. 'return' is suggested for running in a groupby in which there may be some groups which do not have a sufficient number of observations.
- Returns:
- output: pandas.core.series.Series
A pandas Series with the rolling mean of input.
- finance_byu.rolling.roll_var(input, win, minp, ddof=1, errors='raise')
Computes the rolling variance for a pandas Series.
- Parameters:
- input: pandas.core.series.Series
This series must have strictly numeric type.
- win: int
Length of the moving window.
- minp: int
Minimum number of observations in window required to have a value (otherwise result is NaN).
- ddof: int
Delta degrees of freedom. The divisor used in calculations is N - ddof, where N represents the number of elements.
- errors: {'raise', 'return'}
Whether to raise an error and stop running or to return NaN on manageable errors. 'return' is suggested for running in a groupby in which there may be some groups which do not have a sufficient number of observations.
- Returns:
- output: pandas.core.series.Series
A pandas Series with the rolling variance of input.
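The effect of ddof can be checked by hand for a single window. The sketch below uses only numpy, not finance_byu; the five values are the first five ret observations printed in the Basic Examples on this page, as rounded there.

```python
import numpy as np

# One window of five observations (rounded values from the Basic Examples).
window = np.array([0.149535, 0.024654, 0.083370, 0.532949, 0.101531])
n = len(window)

squared_deviations = np.sum((window - window.mean()) ** 2)

# The divisor is N - ddof.
sample_variance = squared_deviations / (n - 1)      # ddof=1
population_variance = squared_deviations / (n - 0)  # ddof=0

# numpy applies the same N - ddof divisor.
assert np.isclose(sample_variance, np.var(window, ddof=1))
print(sample_variance, population_variance)
```

The ddof=1 result is approximately 0.0413, in line with the rollvar value shown at row 4 of the Basic Examples.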
- finance_byu.rolling.roll_std(input, win, minp, ddof=1, errors='raise')
Computes the rolling standard deviation for a pandas Series.
- Parameters:
- input: pandas.core.series.Series
This series must have strictly numeric type.
- win: int
Length of the moving window.
- minp: int
Minimum number of observations in window required to have a value (otherwise result is NaN).
- ddof: int
Delta degrees of freedom. The divisor used in calculations is N - ddof, where N represents the number of elements.
- errors: {'raise', 'return'}
Whether to raise an error and stop running or to return NaN on manageable errors. 'return' is suggested for running in a groupby in which there may be some groups which do not have a sufficient number of observations.
- Returns:
- output: pandas.core.series.Series
A pandas Series with the rolling standard deviation of input.
- finance_byu.rolling.roll_cov(x, y, win, minp, ddof=1, idx='x', errors='raise')
Computes the rolling covariance of two pandas Series.
- Parameters:
- x: pandas.core.series.Series
This series must have strictly numeric type.
- y: pandas.core.series.Series
This series must have strictly numeric type.
- win: int
Length of the moving window.
- minp: int
Minimum number of observations in window required to have a value (otherwise result is NaN).
- ddof: int
Delta degrees of freedom. The divisor used in calculations is N - ddof, where N represents the number of elements.
- idx: {'x', 'y'}
Whether to use the index for x or for y for the return series. Defaults to 'x'.
- errors: {'raise', 'return'}
Whether to raise an error and stop running or to return NaN on manageable errors. 'return' is suggested for running in a groupby in which there may be some groups which do not have a sufficient number of observations.
- Returns:
- output: pandas.core.series.Series
A pandas Series with the rolling covariance of x and y.
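As a cross-check on what a rolling covariance with win=5, minp=5 and ddof=1 produces, the sketch below uses the built-in pandas rolling covariance rather than finance_byu. The inputs are the first five ret and exmkt rows printed in the Basic Examples on this page, as rounded there; treating the pandas result as equivalent to roll_cov is an assumption.

```python
import pandas as pd

ret = pd.Series([0.149535, 0.024654, 0.083370, 0.532949, 0.101531])
exmkt = pd.Series([0.644943, 0.624619, 0.025087, 0.736360, 0.400754])

# Built-in pandas reference: five-observation window, all five required,
# divisor N - ddof with ddof=1.
reference = ret.rolling(window=5, min_periods=5).cov(exmkt, ddof=1)

# Rows 0-3 are NaN; row 4 is the first complete window.
print(reference.iloc[4])
```

The value at row 4 is approximately 0.0283, in line with the rollcov value shown at row 4 of the Basic Examples.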
- finance_byu.rolling.roll_idio(y, x, win, minp, ddof=1, idx='x', errors='raise')
Computes the rolling idiosyncratic (residual) standard deviation from a univariate regression, y = a + bx + e.
- Parameters:
- y: pandas.core.series.Series
This series must have strictly numeric type.
- x: pandas.core.series.Series
This series must have strictly numeric type.
- win: int
Length of the moving window.
- minp: int
Minimum number of observations in window required to have a value (otherwise result is NaN).
- ddof: int
Delta degrees of freedom. The divisor used in calculations is N - ddof, where N represents the number of elements.
- idx: {'x', 'y'}
Whether to use the index for x or for y for the return series. Defaults to 'x'.
- errors: {'raise', 'return'}
Whether to raise an error and stop running or to return NaN on manageable errors. 'return' is suggested for running in a groupby in which there may be some groups which do not have a sufficient number of observations.
- Returns:
- output: pandas.core.series.Series
A pandas Series with the rolling idiosyncratic (residual) standard deviation from a univariate regression, y = a + bx + e.
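For a single window, the residual standard deviation of y = a + bx + e can be sketched as follows. This is a plain numpy illustration, not the finance_byu implementation; window_idio is a hypothetical helper, and applying the N - ddof divisor to the sum of squared residuals is an assumption based on the ddof description above.

```python
import numpy as np

def window_idio(y, x, ddof=1):
    """Residual standard deviation of y = a + b*x + e over one window."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Ordinary least squares slope and intercept for one regressor.
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()
    resid = y - (a + b * x)
    # Assumed divisor: N - ddof.
    return np.sqrt(np.sum(resid ** 2) / (len(y) - ddof))

x = [0.0, 1.0, 2.0, 3.0]
print(window_idio([1.0, 3.0, 5.0, 7.0], x))  # exact line, so no residual spread
print(window_idio([0.0, 1.0, 0.0, 1.0], x))
```

In the second call the fitted line is y = 0.2 + 0.2x, the squared residuals sum to 0.8, and the result is sqrt(0.8 / 3).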
- finance_byu.rolling.roll_beta(y, x, win, minp, ddof=1, idx='x', errors='raise')
Computes the rolling estimated slope coefficient (beta) from a univariate regression, y = a + bx + e.
- Parameters:
- y: pandas.core.series.Series
This series must have strictly numeric type.
- x: pandas.core.series.Series
This series must have strictly numeric type.
- win: int
Length of the moving window.
- minp: int
Minimum number of observations in window required to have a value (otherwise result is NaN).
- ddof: int
Delta degrees of freedom. The divisor used in calculations is N - ddof, where N represents the number of elements.
- idx: {'x', 'y'}
Whether to use the index for x or for y for the return series. Defaults to 'x'.
- errors: {'raise', 'return'}
Whether to raise an error and stop running or to return NaN on manageable errors. 'return' is suggested for running in a groupby in which there may be some groups which do not have a sufficient number of observations.
- Returns:
- output: pandas.core.series.Series
A pandas Series with the rolling estimated slope coefficient (beta) from a univariate regression, y = a + bx + e.
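The slope of a univariate regression equals cov(x, y) / var(x), so a rolling beta can be cross-checked with built-in pandas rolling functions. The sketch below is a reference calculation under that identity, not the finance_byu routine; rolling_beta_reference is a hypothetical helper that takes the regressand first, mirroring the (y, x) argument order in the signature above.

```python
import numpy as np
import pandas as pd

def rolling_beta_reference(y, x, win, minp):
    """Rolling OLS slope of y on x, computed as cov(x, y) / var(x)."""
    rolling_x = x.rolling(window=win, min_periods=minp)
    # Both statistics use pandas' default ddof=1, so the divisors cancel.
    return rolling_x.cov(y) / rolling_x.var()

# y is an exact linear function of x, so every complete window has slope 2.
x = pd.Series(np.arange(10, dtype=float) ** 2)
y = 1.0 + 2.0 * x

beta = rolling_beta_reference(y, x, win=5, minp=5)
print(beta.tolist())
```

The first four rows are NaN because fewer than minp observations are available.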
- finance_byu.rolling.rolling_multiple(data, yvar, xvar, roll, intercept=True, residuals=False, backend='loky', ddof='default', predispatch='all', flagop='nan', append=False)
Rolling multiple regression implementation using pandas groupby for grouping, linear algebra routines compiled with numba for regressions, and joblib for parallelization. Jobs are pre-dispatched to each core for performance.
- Parameters:
- data: pandas.core.frame.DataFrame
A dataframe with the regressand and regressors for the multiple regression. This dataframe must have strictly numeric types.
- yvar: str
The name of the regressand variable in data.
- xvar: list(str)
A list of the names of the regressor variables in data.
- roll: int
The number of observations (rows in data) over which to roll, inclusive of the current row. For example, if roll=120, the rolling regression will include the current observation and the previous 119 observations.
- intercept: bool
Whether or not to regress with an intercept.
- residuals: {bool, 'residvol'}
Whether or not to include residuals in the output dataframe (boolean). If True is input, the residual corresponding to the observation will be output. If 'residvol' is input, the standard deviation of the residuals from the rolling regression will be output.
- backend: {'loky', 'multiprocessing', 'threading'}
The joblib backend to use for parallel processing. 'loky' is used by default and is recommended.
- ddof: {'default', int}
Delta degrees of freedom. If set to 'default', this is the number of x variables (including the intercept if intercept=True).
- predispatch: {'all', 'auto'}
Whether to pre-dispatch all parallel jobs (recommended for smaller datasets) or to allow joblib to pre-dispatch according to memory (recommended for large datasets).
- flagop: {'nan', 'elim'}
Error handling mechanism. 'nan' returns NaN values when there are insufficient observations (better functionality still in development). 'elim' eliminates groups for which there are insufficient observations. If 'elim' is used, the output dataframe will not have the same length (number of observations) as data.
- append: False or list(str)
Whether or not to append other columns of data (must be numeric type for now) to the output DataFrame. append should be False if no columns should be appended; otherwise it should be a list of the names of the columns to be appended to the output DataFrame.
- Returns:
- A pandas DataFrame which contains rolling regression coefficients and residuals, if specified.
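To illustrate what a rolling multiple regression computes, here is a slow single-process reference loop in plain numpy. It is a sketch for checking results on small inputs, not the numba and joblib implementation described above; rolling_ols_reference is a hypothetical helper, and placing the intercept in the first output column is an assumption.

```python
import numpy as np

def rolling_ols_reference(y, X, roll, intercept=True):
    """OLS coefficients over the current row and the previous roll - 1 rows."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if intercept:
        # Prepend a column of ones for the intercept term.
        X = np.column_stack([np.ones(len(X)), X])
    coefs = np.full((len(y), X.shape[1]), np.nan)
    for end in range(roll, len(y) + 1):
        rows = slice(end - roll, end)
        beta, _, _, _ = np.linalg.lstsq(X[rows], y[rows], rcond=None)
        coefs[end - 1] = beta
    return coefs

# Noise-free data, so every complete window recovers the true coefficients.
rng = np.random.default_rng(0)
X = rng.random((30, 2))
y = 0.5 + 1.0 * X[:, 0] - 2.0 * X[:, 1]

coefs = rolling_ols_reference(y, X, roll=10)
print(coefs[9])  # first row with a complete window
```

Rows before the first complete window are left as NaN, matching the leading NaN rows in the rolling_multiple output shown in the examples on this page.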
Basic Examples
Below are some basic usage examples without a groupby.
>>> import pandas as pd
>>> import finance_byu.rolling as rolling
>>> import numpy as np
>>>
>>> n_periods = 1.0e2
>>>
>>> df = pd.DataFrame(np.random.random((int(n_periods),2)))
>>> df = df.rename(columns={0:'ret',1:'exmkt'})
>>> df['roll'] = rolling.roll_mean(df['ret'],5,5)
>>> df.head(10)
ret exmkt roll
0 0.149535 0.644943 NaN
1 0.024654 0.624619 NaN
2 0.083370 0.025087 NaN
3 0.532949 0.736360 NaN
4 0.101531 0.400754 0.178408
5 0.819424 0.215954 0.312386
6 0.419873 0.728983 0.391429
7 0.552381 0.160935 0.485232
8 0.634769 0.743071 0.505596
9 0.730326 0.246545 0.631355
>>> df['rollvar'] = rolling.roll_var(df['ret'],5,5,ddof=1)
>>> df.head(10)
ret exmkt roll rollvar
0 0.149535 0.644943 NaN NaN
1 0.024654 0.624619 NaN NaN
2 0.083370 0.025087 NaN NaN
3 0.532949 0.736360 NaN NaN
4 0.101531 0.400754 0.178408 0.041279
5 0.819424 0.215954 0.312386 0.121358
6 0.419873 0.728983 0.391429 0.095739
7 0.552381 0.160935 0.485232 0.067492
8 0.634769 0.743071 0.505596 0.071995
9 0.730326 0.246545 0.631355 0.024035
>>> df = df.drop(['roll','rollvar'],axis=1)
>>> df['rollcov'] = rolling.roll_cov(df['ret'],df['exmkt'],5,5,ddof=1)
>>> df.head(10)
ret exmkt rollcov
0 0.149535 0.644943 NaN
1 0.024654 0.624619 NaN
2 0.083370 0.025087 NaN
3 0.532949 0.736360 NaN
4 0.101531 0.400754 0.028305
5 0.819424 0.215954 0.000486
6 0.419873 0.728983 0.023366
7 0.552381 0.160935 -0.020825
8 0.634769 0.743071 -0.013283
9 0.730326 0.246545 -0.024831
Examples with Grouping
Below are some examples of rolling function usage with pandas groupby functionality.
>>> df = pd.DataFrame(np.random.random((100,3)),columns=['a','b','c']).sort_values('a').reset_index(drop=True)
>>> df.head(10)
a b c
0 0.000421 0.328225 0.595473
1 0.039568 0.002372 0.223387
2 0.041261 0.826214 0.684885
3 0.059252 0.234307 0.412450
4 0.077423 0.616780 0.027450
5 0.082915 0.489654 0.596222
6 0.090510 0.981726 0.519077
7 0.102022 0.384198 0.939078
8 0.123865 0.475949 0.890815
9 0.159163 0.169004 0.139885
>>> df['ports'] = pd.qcut(df['a'],5,labels=False)
>>> df.head(30)
a b c ports
0 0.000421 0.328225 0.595473 0
1 0.039568 0.002372 0.223387 0
2 0.041261 0.826214 0.684885 0
3 0.059252 0.234307 0.412450 0
4 0.077423 0.616780 0.027450 0
5 0.082915 0.489654 0.596222 0
6 0.090510 0.981726 0.519077 0
7 0.102022 0.384198 0.939078 0
8 0.123865 0.475949 0.890815 0
9 0.159163 0.169004 0.139885 0
10 0.182324 0.114017 0.098002 0
11 0.184595 0.712363 0.850956 0
12 0.189484 0.482832 0.568143 0
13 0.194572 0.822320 0.471494 0
14 0.200897 0.091733 0.581896 0
15 0.211877 0.613734 0.445444 0
16 0.228115 0.863478 0.928822 0
17 0.229405 0.070245 0.667584 0
18 0.247503 0.816117 0.479351 0
19 0.248632 0.698228 0.028725 0
20 0.250610 0.597579 0.595263 1
21 0.271620 0.112743 0.480844 1
22 0.285232 0.946583 0.227774 1
23 0.287593 0.354288 0.333730 1
24 0.302387 0.145458 0.117342 1
25 0.311219 0.283440 0.828860 1
26 0.312818 0.953281 0.393665 1
27 0.316154 0.113413 0.270970 1
28 0.323806 0.596317 0.951102 1
29 0.354629 0.834722 0.179076 1
>>> df['grouped_rolling_sum'] = df.groupby('ports')['c'].transform(lambda x: rolling.roll_sum(x,5,5))
>>> df.sort_values(['ports','a']).head(10)
a b c ports grouped_rolling_sum
0 0.000421 0.328225 0.595473 0 NaN
1 0.039568 0.002372 0.223387 0 NaN
2 0.041261 0.826214 0.684885 0 NaN
3 0.059252 0.234307 0.412450 0 NaN
4 0.077423 0.616780 0.027450 0 1.943644
5 0.082915 0.489654 0.596222 0 1.944394
6 0.090510 0.981726 0.519077 0 2.240084
7 0.102022 0.384198 0.939078 0 2.494277
8 0.123865 0.475949 0.890815 0 2.972642
9 0.159163 0.169004 0.139885 0 3.085077
>>> df['grouped_rolling_beta'] = df.groupby('ports')[['b','c']].apply(lambda x: rolling.roll_beta(x['b'],x['c'],5,5,ddof=1))
>>> df.head(10)
a b ... grouped_rolling_sum grouped_rolling_beta
0 0.000421 0.328225 ... NaN NaN
1 0.039568 0.002372 ... NaN NaN
2 0.041261 0.826214 ... NaN NaN
3 0.059252 0.234307 ... NaN NaN
4 0.077423 0.616780 ... 1.943644 0.328457
5 0.082915 0.489654 ... 1.944394 0.443657
6 0.090510 0.981726 ... 2.240084 0.269092
7 0.102022 0.384198 ... 2.494277 -0.171534
8 0.123865 0.475949 ... 2.972642 -0.280294
9 0.159163 0.169004 ... 3.085077 0.161115
Speed
Below are a few example speed comparisons with the built-in pandas rolling functionality.
>>> %timeit df['rolling_sum'] = df['c'].rolling(5).sum().reset_index(drop=True)
671 µs ± 69.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit df['rolling_sum'] = rolling.roll_sum(df['c'],5,5)
297 µs ± 10.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit df['grouped_rolling_sum'] = df.groupby('ports')['c'].rolling(5).sum().reset_index(drop=True)
5.39 ms ± 159 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit df['grouped_rolling_sum'] = df.groupby('ports')['c'].apply(lambda x: rolling.roll_sum(x,5,5))
3.34 ms ± 166 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit df['grouped_rolling_mean'] = df.groupby('ports')['c'].rolling(5).mean().reset_index(drop=True)
5.17 ms ± 190 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit df['grouped_rolling_mean'] = df.groupby('ports')['c'].apply(lambda x: rolling.roll_mean(x,5,5))
3.01 ms ± 386 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
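The %timeit magic used above is only available in IPython and Jupyter. Outside those environments, a comparable measurement can be sketched with the standard library timeit module. The example times only the built-in pandas rolling sum; timing a finance_byu function would follow the same pattern, assuming the package is installed.

```python
import statistics
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({"c": np.random.random(100)})

def builtin_rolling_sum():
    return df["c"].rolling(5).sum()

# 7 runs of 100 loops each, similar in spirit to the %timeit output format.
loops = 100
runs = timeit.repeat(builtin_rolling_sum, repeat=7, number=loops)
per_loop = [run / loops for run in runs]

print(f"{statistics.mean(per_loop) * 1e6:.1f} µs ± "
      f"{statistics.stdev(per_loop) * 1e6:.1f} µs per loop "
      f"(mean ± std. dev. of 7 runs, {loops} loops each)")
```

Timings depend on hardware and library versions, so figures measured this way are comparable only within the same environment.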
Rolling Multiple Regression
Here are a couple of examples with random data for the use of the rolling_multiple function.
>>> from finance_byu.rolling import rolling_multiple
>>> df = pd.DataFrame(np.random.random((100,4)))
>>> df.columns = ['y','x1','x2','x3']
>>> coeff = rolling_multiple(df,'y',['x1','x2','x3'],10)
>>> coeff.head(20)
intercept x1 x2 x3
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
4 NaN NaN NaN NaN
5 NaN NaN NaN NaN
6 NaN NaN NaN NaN
7 NaN NaN NaN NaN
8 NaN NaN NaN NaN
9 0.252484 0.038459 0.003861 0.490564
10 0.299362 -0.020591 0.145126 0.368189
11 0.323561 -0.067829 0.237643 0.311882
12 0.329602 0.069779 0.266711 0.130818
13 0.786978 0.107650 0.273981 -0.484018
14 0.834109 0.139822 0.256740 -0.543779
15 0.937403 0.302884 0.216420 -0.774488
16 0.288141 0.101312 0.386460 0.005133
17 0.175423 -0.180466 0.646475 0.056263
18 0.012849 0.698391 0.531258 0.187829
19 0.039563 0.795769 0.378595 0.165630
>>> withresiduals = rolling_multiple(df,'y',['x1','x2','x3'],10,residuals=True)
>>> withresiduals.head(20)
intercept x1 x2 x3 resid
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN
5 NaN NaN NaN NaN NaN
6 NaN NaN NaN NaN NaN
7 NaN NaN NaN NaN NaN
8 NaN NaN NaN NaN NaN
9 0.252484 0.038459 0.003861 0.490564 -0.909839
10 0.299362 -0.020591 0.145126 0.368189 -1.262891
11 0.323561 -0.067829 0.237643 0.311882 -0.926025
12 0.329602 0.069779 0.266711 0.130818 -1.365205
13 0.786978 0.107650 0.273981 -0.484018 -1.027773
14 0.834109 0.139822 0.256740 -0.543779 -1.044181
15 0.937403 0.302884 0.216420 -0.774488 -0.875767
16 0.288141 0.101312 0.386460 0.005133 -1.374425
17 0.175423 -0.180466 0.646475 0.056263 -0.739793
18 0.012849 0.698391 0.531258 0.187829 -0.678277
19 0.039563 0.795769 0.378595 0.165630 -1.312702
>>> df['groups'] = [1 for i in range(33)]+[2 for i in range(33)]+[3 for i in range(34)]
>>> grouped = df.groupby('groups').apply(lambda x: rolling_multiple(x,'y',['x1','x2','x3'],10)).reset_index(drop=True)
>>> grouped.head(50)
intercept x1 x2 x3
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
4 NaN NaN NaN NaN
5 NaN NaN NaN NaN
6 NaN NaN NaN NaN
7 NaN NaN NaN NaN
8 NaN NaN NaN NaN
9 0.188011 0.172650 -0.036465 0.201428
10 0.127681 0.530099 -0.256984 0.447311
11 0.211469 0.585754 -0.356648 0.407959
12 0.309164 0.466337 -0.486869 0.486394
13 0.336976 0.541512 -0.683834 0.616734
14 0.242381 0.762214 -0.718999 0.539597
15 0.534390 0.592484 -0.882821 0.436392
16 0.475962 0.516497 -0.833246 0.695213
17 0.445435 0.300559 -0.683455 0.858445
18 0.424248 0.233874 -0.676784 0.879286
19 0.528965 -0.061176 -0.462627 0.834017
20 0.793513 -0.285642 -0.570421 0.456161
21 0.771210 -0.472012 -0.427634 0.489903
22 0.690572 -0.394875 -0.350176 0.697332
23 0.652243 -0.487154 -0.253501 0.716340
24 0.758479 -0.765630 -0.202638 0.606289
25 0.929314 -0.889273 -0.357062 0.522449
26 0.875805 -0.869253 -0.272132 0.482170
27 0.527850 -0.532585 0.178658 0.156535
28 0.592946 -0.613963 0.239219 0.101630
29 0.670114 -0.794297 0.576524 -0.187906
30 0.580165 -0.608576 0.466616 -0.258936
31 0.517314 -0.609204 0.438855 -0.067928
32 0.370829 -0.338844 0.511657 -0.235225
33 NaN NaN NaN NaN
34 NaN NaN NaN NaN
35 NaN NaN NaN NaN
36 NaN NaN NaN NaN
37 NaN NaN NaN NaN
38 NaN NaN NaN NaN
39 NaN NaN NaN NaN
40 NaN NaN NaN NaN
41 NaN NaN NaN NaN
42 0.952411 0.355941 -0.646408 -0.469534
43 0.993868 0.143298 -0.498905 -0.381196
44 0.921548 0.157348 -0.543432 -0.169947
45 1.023682 -0.092272 -0.521587 -0.093258
46 0.802979 -0.343775 -0.115053 0.043999
47 0.654425 -0.186637 -0.095179 0.112743
48 0.747796 -0.250212 -0.252141 0.067891
49 0.784292 -0.308890 -0.303901 0.072922
>>> df['resid'] = df.groupby('groups').apply(lambda x: rolling_multiple(x,'y',['x1','x2','x3'],10,residuals=True)).reset_index(drop=True)['resid']
>>> df.head(50)
y x1 x2 x3 groups resid
0 0.292707 0.527416 0.352366 0.942830 1 NaN
1 0.009256 0.056392 0.577220 0.316502 1 NaN
2 0.254997 0.303502 0.312090 0.450186 1 NaN
3 0.630078 0.485770 0.652494 0.219974 1 NaN
4 0.806714 0.117059 0.544942 0.752988 1 NaN
5 0.259738 0.332130 0.041728 0.038601 1 NaN
6 0.186582 0.601384 0.903792 0.943630 1 NaN
7 0.058661 0.537003 0.749255 0.333435 1 NaN
8 0.693882 0.829473 0.949468 0.602170 1 NaN
9 0.107025 0.314554 0.881373 0.008589 1 -1.104885
10 0.985555 0.898102 0.889376 0.836676 1 -0.763909
11 0.844293 0.883783 0.097372 0.114230 1 -0.896729
12 0.110527 0.617211 0.714501 0.467282 1 -1.365879
13 0.991384 0.615916 0.331154 0.338925 1 -0.661689
14 0.858861 0.166270 0.221500 0.783701 1 -0.773878
15 0.679131 0.070385 0.024244 0.108567 1 -0.922935
16 0.083172 0.633883 0.981583 0.209229 1 -1.047747
17 0.105980 0.755695 0.346644 0.005208 1 -1.334142
18 0.169534 0.580690 0.323765 0.146509 1 -1.300227
19 0.602168 0.027302 0.991393 0.273679 1 -0.694735
20 0.824959 0.517080 0.378158 0.273586 1 -0.729945
21 0.284846 0.843899 0.496191 0.116380 1 -0.932860
22 0.808722 0.096899 0.268888 0.228202 1 -0.908562
23 0.885483 0.522229 0.584660 0.652320 1 -0.831425
24 0.020493 0.806937 0.321778 0.547365 1 -1.386828
25 0.127404 0.705129 0.991105 0.478141 1 -1.070777
26 0.296545 0.565533 0.330202 0.286323 1 -1.135866
27 0.022179 0.223031 0.029895 0.350958 1 -1.447167
28 0.971971 0.092243 0.829478 0.411799 1 -0.804620
29 0.732190 0.646287 0.889335 0.333274 1 -0.874680
30 0.412803 0.168067 0.896444 0.563714 1 -1.337410
31 0.162741 0.939633 0.546790 0.941365 1 -0.958161
32 0.325315 0.259795 0.818819 0.846654 1 -1.177284
33 0.154722 0.702591 0.791131 0.961928 2 NaN
34 0.491407 0.518855 0.231681 0.955700 2 NaN
35 0.159072 0.036437 0.620747 0.555733 2 NaN
36 0.891759 0.617571 0.187414 0.401648 2 NaN
37 0.481304 0.750031 0.900041 0.354723 2 NaN
38 0.669076 0.110193 0.764837 0.087526 2 NaN
39 0.522653 0.313408 0.491214 0.584310 2 NaN
40 0.576749 0.762469 0.646008 0.033188 2 NaN
41 0.755683 0.401390 0.045780 0.468417 2 NaN
42 0.939899 0.806682 0.387371 0.694695 2 -0.723060
43 0.921698 0.033224 0.613996 0.359314 2 -0.633637
44 0.365582 0.376107 0.799801 0.158204 2 -1.153621
45 0.401404 0.592845 0.700801 0.967785 2 -1.111793
46 0.225042 0.907086 0.020412 0.483045 2 -1.285008
47 0.325359 0.003486 0.641433 0.148209 2 -1.284073
48 0.239367 0.753501 0.730812 0.733136 2 -1.185399
49 0.221450 0.631365 0.604590 0.536159 2 -1.223183
Rolling Multiple Regression Speed
Here are some timings for the rolling_multiple function.
Scaling with number of observations
>>> def produce_data(nobs,nx):
...     return pd.DataFrame(np.random.random((nobs,nx+1)))
...
>>> df = produce_data(100,3)
>>> %timeit rolling_multiple(df,0,[1,2,3],10)
24.2 ms ± 440 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> df = produce_data(1000,3)
>>> %timeit rolling_multiple(df,0,[1,2,3],120)
184 ms ± 1.53 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> df = produce_data(int(1.0e4),3)
>>> %timeit rolling_multiple(df,0,[1,2,3],120)
2.04 s ± 21.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> df = produce_data(int(1.0e5),3)
>>> %timeit rolling_multiple(df,0,[1,2,3],120,predispatch='auto')
18.9 s ± 200 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> df = produce_data(int(1.0e6),3)
>>> %timeit -r2 -n1 rolling_multiple(df,0,[1,2,3],120,predispatch='auto')
3min 10s ± 3.18 s per loop (mean ± std. dev. of 2 runs, 1 loop each)
Scaling with number of regressors
>>> df = produce_data(int(1.0e4),3)
>>> %timeit rolling_multiple(df,0,[1,2,3],120)
1.94 s ± 31.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> df = produce_data(int(1.0e4),20)
>>> %timeit rolling_multiple(df,0,[i for i in range(1,21)],120)
2.31 s ± 24.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> df = produce_data(int(1.0e4),100)
>>> %timeit rolling_multiple(df,0,[i for i in range(1,101)],120)
5.71 s ± 1.25 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
Scaling with window size
>>> df = produce_data(int(1.0e4),3)
>>> %timeit rolling_multiple(df,0,[1,2,3],10)
1.82 s ± 31.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> df = produce_data(int(1.0e4),3)
>>> %timeit rolling_multiple(df,0,[1,2,3],50)
1.88 s ± 28.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> df = produce_data(int(1.0e4),3)
>>> %timeit rolling_multiple(df,0,[1,2,3],100)
1.91 s ± 18.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> df = produce_data(int(1.0e4),3)
>>> %timeit rolling_multiple(df,0,[1,2,3],500)
2.01 s ± 28.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)