Pandas: How To Fill Null Values With Mean Of A Groupby?
I have a dataset with some missing data that looks like this:
id  category  value
1   A         NaN
2   B         NaN
3   A         10.5
4   C         NaN
Solution 1:
You can use groupby with apply to call fillna with each group's mean. If a category contains only NaN values, its mean is also NaN and its rows stay unfilled, so fill those remaining NaN values with the mean of the whole column:
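# group_keys=False keeps the original row index; on pandas 2.x, apply otherwise adds the group key as an extra index level
df.value = df.groupby('category', group_keys=False)['value'].apply(lambda x: x.fillna(x.mean()))
df.value = df.value.fillna(df.value.mean())
print(df)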
   id category  value
0   1        A   6.25
1   2        B   1.00
2   3        A  10.50
3   4        C   4.15
4   5        A   2.00
5   6        B   1.00
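For anyone who wants to reproduce the printed result, here is a minimal self-contained sketch of the two steps. The six-row frame is an assumption: the question shows only four rows, so rows 5 and 6 are reconstructed from the output above. `group_keys=False` is included on the assumption that you are running pandas 2.x.

```python
import numpy as np
import pandas as pd

# Assumed sample data, reconstructed from the printed result above
# (the question itself lists only the first four rows).
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "category": ["A", "B", "A", "C", "A", "B"],
    "value": [np.nan, np.nan, 10.5, np.nan, 2.0, 1.0],
})

# Step 1: fill NaN inside each category with that category's mean.
# group_keys=False keeps the original row index on the result.
step1 = df.groupby("category", group_keys=False)["value"].apply(
    lambda s: s.fillna(s.mean())
)

# Category C has no observed values, so its mean is NaN and its row is still empty.
still_missing = int(step1.isna().sum())

# Step 2: fall back to the mean of the partly filled column.
df["value"] = step1
df["value"] = df["value"].fillna(df["value"].mean())
print(df)
```

Note that the fallback in step 2 is the mean of the column after step 1 (4.15 here), not the mean of the originally observed values, because the group-filled rows are already included when it is computed.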
Solution 2:
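You can also use GroupBy + transform to fill NaN values with groupwise means. transform('mean') returns one group mean per original row, so it can be passed straight to fillna, and it avoids calling a Python lambda once per group as apply does. For example: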
df['value'] = df['value'].fillna(df.groupby('category')['value'].transform('mean'))
df['value'] = df['value'].fillna(df['value'].mean())
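The same two lines can be wrapped in a small helper if you need them for several columns. This is only a sketch: the function name `fill_with_group_mean` is hypothetical, and the sample frame is an assumption reconstructed from the output printed under Solution 1.

```python
import numpy as np
import pandas as pd


def fill_with_group_mean(df, group_col, value_col):
    """Return value_col with NaN filled by the group mean, then by the column mean.

    Hypothetical helper for illustration; it does not modify df.
    """
    # One mean per original row, aligned on df's index (NaN for all-NaN groups).
    group_means = df.groupby(group_col)[value_col].transform("mean")
    filled = df[value_col].fillna(group_means)
    # Fallback for groups without any observed value.
    return filled.fillna(filled.mean())


# Assumed sample data (rows 5 and 6 are not shown in the question).
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "category": ["A", "B", "A", "C", "A", "B"],
    "value": [np.nan, np.nan, 10.5, np.nan, 2.0, 1.0],
})

result = fill_with_group_mean(df, "category", "value")
print(result)
```

Because the helper returns a new Series, you can inspect it first and assign it back with `df['value'] = result` once it looks right.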