How To Calculate Day's Difference Between Successive Pandas Dataframe Rows With Condition

I have a pandas DataFrame like the following: item_id date 101 2016-01-05 101 2016-01-21 121 2016-01-08 121 2016-01-22 128 2016-01-19 128 20

Solution 1:

I think you can use:

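import numpy as np

# Sort dates within each item_id (date must be a datetime column)
df = df.sort_values(['item_id', 'date']).reset_index(drop=True)

# Days since the previous row of the same item_id; 0 for the first row of each group
df['diff'] = df.groupby('item_id')['date'].diff() / np.timedelta64(1, 'D')
df['diff'] = df['diff'].fillna(0)
print(df)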
    item_id       date  diff
0       101 2016-01-05     0
1       101 2016-01-21    16
2       121 2016-01-08     0
3       121 2016-01-22    14
4       128 2016-01-19     0
5       128 2016-02-17    29
6       131 2016-01-11     0
7       131 2016-01-23    12
8       131 2016-01-24     1
9       131 2016-02-06    13
10      131 2016-02-07     1
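
For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. The rows are copied from the output table above. Using `.dt.days` with an integer cast is my own variation (an assumption, not part of the original answer) to get whole-day integers instead of dividing by `np.timedelta64(1, 'D')`.

```python
import pandas as pd

# Sample data copied from the output table above
df = pd.DataFrame({
    "item_id": [101, 101, 121, 121, 128, 128, 131, 131, 131, 131, 131],
    "date": pd.to_datetime([
        "2016-01-05", "2016-01-21", "2016-01-08", "2016-01-22",
        "2016-01-19", "2016-02-17", "2016-01-11", "2016-01-23",
        "2016-01-24", "2016-02-06", "2016-02-07",
    ]),
})

# Order rows chronologically within each item_id
df = df.sort_values(["item_id", "date"]).reset_index(drop=True)

# Per-group difference in whole days; the first row of each group has
# no predecessor (NaT), so it is filled with 0 before casting to int
df["diff"] = df.groupby("item_id")["date"].diff().dt.days.fillna(0).astype(int)
print(df)
```

The explicit sort matters only if the input is not already ordered by date within each item; on this sample it leaves the row order unchanged.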

Solution 2:

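You can also try the following, which diffs consecutive rows of the whole date column without grouping by item_id:

df.date.diff().fillna(pd.Timedelta(seconds=0))

Note: .fillna(0) is no longer supported for the timedelta dtype, so a pd.Timedelta is used as the fill value.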
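
A small sketch of how this behaves on grouped data, in case it helps. The four-row frame reuses the first two items from the data above, and the names `plain` and `grouped` are just illustrative. Combining the `pd.Timedelta` fill with `groupby` is my own assumption about what the question needs, not something this answer states.

```python
import pandas as pd

df = pd.DataFrame({
    "item_id": [101, 101, 121, 121],
    "date": pd.to_datetime(
        ["2016-01-05", "2016-01-21", "2016-01-08", "2016-01-22"]
    ),
})

# Ungrouped diff: the first row of item 121 is compared with the
# last row of item 101, which yields a negative difference
plain = df["date"].diff().fillna(pd.Timedelta(seconds=0))

# Grouped diff: every item_id starts again from a zero Timedelta
grouped = df.groupby("item_id")["date"].diff().fillna(pd.Timedelta(seconds=0))

print(plain.dt.days.tolist())    # [0, 16, -13, 14]
print(grouped.dt.days.tolist())  # [0, 16, 0, 14]
```

If the difference should restart for each item, the grouped form is probably the one you want.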
