
Pandas: Function Equivalent To Sql's Datediff()?

Is there an equivalent to SQL's datediff function in Python's pandas? The answer to the question "Add column with number of days between dates in DataFrame pandas" explains how to

Solution 1:

UPDATE:

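def months_between(d1, d2):
    # Order the two dates so the result is never negative.
    dd1 = min(d1, d2)
    dd2 = max(d1, d2)
    # Count calendar-month boundaries crossed; the day of month is ignored.
    return (dd2.year - dd1.year)*12 + dd2.month - dd1.month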

In [125]: months_between(pd.to_datetime('2015-01-02 12:13:14'), pd.to_datetime('2012-03-02 12:13:14'))
Out[125]: 34

OLD answer:

In [40]: (pd.to_datetime('15-10-2010') - pd.to_datetime('15-07-2010')).days
Out[40]: 92
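
For two datetime columns, subtracting them should give a timedelta Series, and `.dt.days` then plays the role of `DATEDIFF(day, ...)`. A minimal sketch, assuming a DataFrame with hypothetical columns `start` and `end`:

```python
import pandas as pd

# Hypothetical example data; dates are written in ISO format to avoid
# any day-first/month-first ambiguity when parsing.
df = pd.DataFrame({
    "start": pd.to_datetime(["2010-07-15", "2012-03-02"]),
    "end": pd.to_datetime(["2010-10-15", "2015-01-02"]),
})

# Column subtraction yields timedelta64 values; .dt.days extracts whole days.
df["days"] = (df["end"] - df["start"]).dt.days
print(df)
```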

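You can also get a difference in months by subtracting the month attributes, although this is only correct when both dates fall in the same year: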

In [48]: pd.to_datetime('15-10-2010').month - pd.to_datetime('15-07-2010').month
Out[48]: 3

Solution 2:

If you look around a little, it seems that it is not possible to get months out of a Timedelta:

In [193]: date_1 = pd.to_datetime('2015-01-02 12:13:14')

In [194]: date_2 = pd.to_datetime('2012-03-02 12:13:14')

In [195]: date_1 - date_2
Out[195]: Timedelta('1036 days 00:00:00')

In [196]: td_1 = date_1 - date_2

In [199]: td_1.
td_1.asm8            td_1.days            td_1.freq            td_1.microseconds    td_1.resolution      td_1.to_pytimedelta  td_1.value           
td_1.ceil            td_1.delta           td_1.is_populated    td_1.min             td_1.round           td_1.to_timedelta64  td_1.view            
td_1.components      td_1.floor           td_1.max             td_1.nanoseconds     td_1.seconds         td_1.total_seconds

In [199]: td_1.components
Out[199]: Components(days=1036, hours=0, minutes=0, seconds=0, milliseconds=0, microseconds=0, nanoseconds=0)

Additionally, the components do not seem to offer the same value in different denominations; each field holds only its own part of the duration:

In [213]: td_1.components.days
Out[213]: 1036

In [214]: td_1.components.hours
Out[214]: 0
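
To make the distinction clearer, here is a small sketch with a made-up duration that has a non-zero hour part. It suggests that `components.hours` is only the hour field, while the whole duration in hours has to be derived from `total_seconds()`:

```python
import pandas as pd

# Hypothetical duration, chosen only for illustration.
td = pd.Timedelta("1 days 02:00:00")

hour_field = td.components.hours          # just the hours part of the breakdown
total_hours = td.total_seconds() / 3600   # the whole duration expressed in hours

print(hour_field)   # 2
print(total_hours)  # 26.0
```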

Ultimately, it seems that what you have been doing until now is the "best" solution:

In [214]: td_1.components.days/30
Out[214]: 34.53333333333333

In [215]: np.round(td_1.components.days/30)
Out[215]: 35.0

In [216]: np.floor(td_1.components.days/30)
Out[216]: 34.0

Not great news, really, but a solution in any case.
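
Dividing by 30 overestimates the month count over long spans, because the average month is a little longer than 30 days. One possible refinement, which is my assumption and not part of the original answer, is to divide by the average Gregorian month length instead. The names `AVG_DAYS_PER_MONTH` and `approx_months` are hypothetical:

```python
import pandas as pd

# Average month length in the Gregorian calendar (an approximation by design).
AVG_DAYS_PER_MONTH = 365.2425 / 12

def approx_months(td):
    """Approximate a Timedelta as a fractional number of average months."""
    # Dividing one Timedelta by another yields a plain float (here: days).
    days = td / pd.Timedelta(days=1)
    return days / AVG_DAYS_PER_MONTH

td_1 = pd.to_datetime("2015-01-02 12:13:14") - pd.to_datetime("2012-03-02 12:13:14")
months = approx_months(td_1)
print(round(months, 2))
```

This is still an approximation. When exact calendar months matter, the year and month arithmetic from Solution 1 is the safer choice.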

As for comparing the documentation that Matlab comes with to that of pandas, you are right. However, if you were to compare the price tags of the two as well, maybe some questions are answered.. (?)
