How to fill missing values in a pandas DataFrame with the most frequent value of each group?


I have a pandas DataFrame with 'toy' and 'color' columns, and some of the color values are missing. I want to fill these NaNs with the most frequent color for the corresponding 'toy' type.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'toy': ['car'] * 4 + ['train'] * 5 + ['ball'] * 3 + ['truck'],
    'color': ['red', 'blue', 'blue', np.nan, 'green', np.nan, 'red', 'red',
              np.nan, 'blue', 'red', np.nan, 'green']
})

The DataFrame still contains NaN values after my attempts, and I can't figure out how to make fillna work on each 'toy' group separately.

Expected output

I want the NaNs replaced based on the most frequent color for that toy (e.g., car NaNs become 'blue', train NaNs become 'red').
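One possible approach, offered as a sketch rather than the only way to do it, is to compute each group's mode with groupby and transform, then pass the result to fillna. The helper name most_frequent and the variables group_mode and filled are made up for this illustration. The sketch assumes that ties may be broken by taking the first value Series.mode returns, and that a group with no known colors should stay NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'toy': ['car'] * 4 + ['train'] * 5 + ['ball'] * 3 + ['truck'],
    'color': ['red', 'blue', 'blue', np.nan, 'green', np.nan, 'red', 'red',
              np.nan, 'blue', 'red', np.nan, 'green'],
})


def most_frequent(s):
    """Hypothetical helper: most frequent non-null value, or NaN if none."""
    modes = s.mode()  # mode() drops NaN by default and returns sorted values
    # Assumption: on a tie, take the first value in mode()'s sorted result.
    return modes.iloc[0] if not modes.empty else np.nan


# transform broadcasts each group's mode back onto that group's rows,
# so group_mode has the same index as df.
group_mode = df.groupby('toy')['color'].transform(most_frequent)

# fillna with a Series fills each NaN from the value at the same index label.
filled = df.assign(color=df['color'].fillna(group_mode))
print(filled)
```

With the sample data, the car NaN becomes 'blue' and both train NaNs become 'red'. The 'ball' group has one 'blue' and one 'red', so the choice there is arbitrary; under the tie-breaking assumption above it comes out as 'blue'.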
