How to fill missing values in a pandas DataFrame with the most frequent value of each group?

I have a pandas DataFrame with 'toy' and 'color' columns, and some of the color values are missing. I want to fill these NaNs with the most frequent color for the corresponding 'toy' type.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'toy': ['car'] * 4 + ['train'] * 5 + ['ball'] * 3 + ['truck'],
    'color': ['red', 'blue', 'blue', np.nan, 'green', np.nan, 'red',
              'red', np.nan, 'blue', 'red', np.nan, 'green']
})

The DataFrame still contains NaN values, and I can't figure out how to make fillna target each group separately.

Expected output

I want the NaNs replaced based on the most frequent color for that toy (e.g., car NaNs become 'blue', train NaNs become 'red').
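
One possible approach, offered as a sketch rather than the definitive answer: compute the most frequent color per toy with groupby and Series.mode, map those values back onto the rows, and pass the result to fillna. The names most_frequent and filled are made up for illustration. Series.mode skips NaN by default and can return several values on a tie (the 'ball' group has one 'blue' and one 'red'), so this sketch assumes that taking the first returned value is an acceptable tie-break.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'toy': ['car'] * 4 + ['train'] * 5 + ['ball'] * 3 + ['truck'],
    'color': ['red', 'blue', 'blue', np.nan, 'green', np.nan, 'red',
              'red', np.nan, 'blue', 'red', np.nan, 'green'],
})

def most_frequent(s):
    # Hypothetical helper: Series.mode() drops NaN by default and may
    # return more than one value on a tie; take the first one.
    modes = s.mode()
    return modes.iloc[0] if not modes.empty else np.nan

# One most-frequent color per toy, indexed by toy name.
color_by_toy = df.groupby('toy')['color'].agg(most_frequent)

# Look up each row's group value, then use it only where color is NaN.
filled = df.assign(color=df['color'].fillna(df['toy'].map(color_by_toy)))
print(filled)
```

If every color in a group is missing, most_frequent returns NaN and those rows stay unfilled, which seems safer than guessing a value.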
