Removing Incomplete Years From Pandas Dataframe
I want to remove incomplete years from a dataframe to analyse the complete years further. I've looked here but the question is old and unresolved. This post follows on from a previ
Solution 1:
Get the years
and pass to your original df and call isin
and pass dt.year
:
In [93]:
years = count[count['date'] >= 365].index
df[df['date'].dt.year.isin(years)]
Out[93]:
date A
0 2007-01-01 0.020745
1 2007-01-02 0.030024
2 2007-01-03 -0.385793
3 2007-01-04 -0.737720
4 2007-01-05 0.089707
5 2007-01-06 -0.820141
6 2007-01-07 -0.081740
7 2007-01-08 0.233265
8 2007-01-09 1.336224
9 2007-01-10 0.570297
10 2007-01-11 -0.280080
11 2007-01-12 -1.582950
12 2007-01-13 0.494927
13 2007-01-14 2.065250
14 2007-01-15 -2.406877
15 2007-01-16 0.124046
16 2007-01-17 -1.015604
17 2007-01-18 1.480173
18 2007-01-19 0.705919
19 2007-01-20 -2.014657
20 2007-01-21 0.130874
21 2007-01-22 -0.138736
22 2007-01-23 1.874702
23 2007-01-24 -0.170154
24 2007-01-25 -1.548015
25 2007-01-26 -0.878455
26 2007-01-27 -0.871497
27 2007-01-28 1.992482
28 2007-01-29 0.565247
29 2007-01-30 1.257662
... ... ...
2892 2014-12-02 -1.052277
2893 2014-12-03 0.123017
2894 2014-12-04 -0.970947
2895 2014-12-05 -0.821208
2896 2014-12-06 -0.027118
2897 2014-12-07 -0.100033
2898 2014-12-08 0.954733
2899 2014-12-09 0.388998
2900 2014-12-10 0.667443
2901 2014-12-11 1.580804
2902 2014-12-12 0.724011
2903 2014-12-13 -2.156507
2904 2014-12-14 0.736236
2905 2014-12-15 0.863674
2906 2014-12-16 -0.204992
2907 2014-12-17 0.976307
2908 2014-12-18 1.456367
2909 2014-12-19 -0.516854
2910 2014-12-20 -0.140291
2911 2014-12-21 1.467225
2912 2014-12-22 0.957542
2913 2014-12-23 2.061477
2914 2014-12-24 0.202104
2915 2014-12-25 0.806140
2916 2014-12-26 -0.478380
2917 2014-12-27 1.109158
2918 2014-12-28 -0.598417
2919 2014-12-29 -1.283922
2920 2014-12-30 0.546390
2921 2014-12-31 -0.640812
[2922 rows x 2 columns]
This will filter the df so that only those dates with full complement of days remain
Post a Comment for "Removing Incomplete Years From Pandas Dataframe"