Pandas: How To Fill Null Values With Mean Of A Groupby?

I have a dataset with some missing data that looks like this:

id  category  value
1   A         NaN
2   B         NaN
3   A         10.5
4   C         NaN

Solution 1:

You can use groupby with apply and fillna to fill each NaN with the mean of its category. If a category contains only NaN values, its mean is also NaN, so those rows stay unfilled; fill them afterwards with the mean of the whole column:

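# group_keys=False keeps the original row index so the result aligns with df (needed in pandas >= 2.0)
df['value'] = df.groupby('category', group_keys=False)['value'].apply(lambda x: x.fillna(x.mean()))
# Categories with only NaN values are still NaN; fill them with the overall column mean
df['value'] = df['value'].fillna(df['value'].mean())
print(df)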
   id category  value
0   1        A   6.25
1   2        B   1.00
2   3        A  10.50
3   4        C   4.15
4   5        A   2.00
5   6        B   1.00
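
To try this end to end, here is a minimal self-contained sketch. The input frame is an assumption: rows 1 to 4 come from the question, while rows 5 and 6 are inferred from the printed result above and may not match the original data exactly.

```python
import numpy as np
import pandas as pd

# Assumed input: ids 1-4 are from the question; ids 5-6 are inferred
# from the printed result and are an assumption.
df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6],
    'category': ['A', 'B', 'A', 'C', 'A', 'B'],
    'value': [np.nan, np.nan, 10.5, np.nan, 2.0, 1.0],
})

# Step 1: fill each NaN with the mean of its own category.
# group_keys=False keeps the original row index so the assignment aligns.
df['value'] = (
    df.groupby('category', group_keys=False)['value']
      .apply(lambda x: x.fillna(x.mean()))
)

# Step 2: category C has no known values, so it is still NaN here;
# fall back to the mean of the whole column.
df['value'] = df['value'].fillna(df['value'].mean())

print(df)
```

With this assumed input, category A is filled with 6.25 (the mean of 10.5 and 2.0), and the single C row gets the column mean computed after step 1.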

Solution 2:

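You can also use GroupBy + transform to fill NaN values with groupwise means. transform('mean') returns a Series aligned with the original index and uses pandas' built-in mean instead of calling a Python lambda once per group, which avoids the overhead of apply + lambda. For example: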

df['value'] = df['value'].fillna(df.groupby('category')['value'].transform('mean'))
df['value'] = df['value'].fillna(df['value'].mean())
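
The two lines above can be wrapped in a small reusable function. This is only a sketch: the name `fill_with_group_mean` and the sample frame are hypothetical, and the last two sample rows are inferred from the output shown under Solution 1 rather than given in the question.

```python
import numpy as np
import pandas as pd

def fill_with_group_mean(frame, group_col, value_col):
    """Return a copy of `frame` with NaNs in `value_col` filled by the group
    mean, falling back to the column mean for groups that are entirely NaN.
    (Hypothetical helper, not part of pandas.)"""
    out = frame.copy()
    # transform('mean') yields one value per row, aligned to the original index
    group_means = out.groupby(group_col)[value_col].transform('mean')
    out[value_col] = out[value_col].fillna(group_means)
    # Groups with no observed values are still NaN; use the column mean
    out[value_col] = out[value_col].fillna(out[value_col].mean())
    return out

# Assumed sample data (ids 5 and 6 are an assumption)
sample = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6],
    'category': ['A', 'B', 'A', 'C', 'A', 'B'],
    'value': [np.nan, np.nan, 10.5, np.nan, 2.0, 1.0],
})

filled = fill_with_group_mean(sample, 'category', 'value')
print(filled)
```

One design point to be aware of: the fallback mean is computed after the group fill, so already-filled rows contribute to it. On this sample that gives 4.15 for category C, whereas the mean of only the originally observed values would be 4.5. If the latter is what you want, compute the column mean before the first fillna.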
