Skip to content Skip to sidebar Skip to footer

How To Do A Reverse Moving Average ( In Pandas, Rolling().mean ) Operation On Predicted Values?

I have a df like this: import numpy as np import pandas as pd import matplotlib.pylab as plt np.random.seed(100) data = np.random.rand(200,3) df = pd.DataFrame(data) df.columns =

Solution 1:

This is very related with this question cumsum with shift of n I asked.

With this function cumsum_shif(n) (the question provides an implementation using for loops called cumsum_shift) you can reverse the moving average up to constant that depends on the initial values without the need of inverting a matrix that has to have as many columns as the the size of the original serie.

Let's call the moving average y_roll = df.loc[,"y_roll"] and y_estimated the reverse up to a constant. Assuming the size of the windows is 10 win_size = 10 then if you multiply by 10 the diff'ed of the rolling mean and then cumumsum(shift=10) it you obtain the original serie up to the intial values. The code:

def cumsum_shift(s, shift = 1, init_values = [0]):
    s_cumsum = pd.Series(np.zeros(len(s)))
    for i in range(shift):
        s_cumsum.iloc[i] = init_values[i]
    for i in range(shift,len(s)):
        s_cumsum.iloc[i] = s_cumsum.iloc[i-shift] + s.iloc[i]
    return s_cumsum

win_size = 10
s_diffed = win_size * df['y_roll'].diff()
df['y_unrolled'] = cumsum_shift(s=s_diffed, shift = win_size, init_values= df['y'].values[:win_size])

This code recovers exactly y from y_roll because you have the initial values.

You can see it plotting it (in my case with plotly) that y and y_unrolled are exactly the same (just the red one). enter image description here

Now doing the same thing to y_roll_predicted to obtain y_predicted_unrolled.

Code:

win_size = 10
s_diffed = win_size * df['y_roll_predicted'].diff()
df['y_predicted_unrolled'] = cumsum_shift(s=s_diffed, shift = win_size, init_values= df['y'].values[:win_size])

In this case the result are not exactly the same, notice how the initial values are from y and then y_roll_predicted incorporate noise to y_roll so the "unrolling" cannot recover exactly the original one.

Here a plot zoomed in in a smaller range to see it better: enter image description here


Post a Comment for "How To Do A Reverse Moving Average ( In Pandas, Rolling().mean ) Operation On Predicted Values?"