How to drop rows of Pandas DataFrame whose value in a certain column is NaN
You can drop rows of a Pandas DataFrame that have a NaN value in a certain column using the dropna() method. By default, dropna() drops every row that contains at least one NaN value in any column, but you can use the subset parameter to specify which column(s) to check for NaNs.
Here's an example of how you can drop rows of a DataFrame df that have a NaN value in the column 'A':
df = df.dropna(subset=['A'])
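As a minimal runnable sketch of the call above: the sample values below are made up for illustration, and only the column name 'A' comes from the example. It also shows that a boolean mask built with notna() gives the same rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name 'A' comes from the text.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0, np.nan],
    "B": [np.nan, 2.0, 3.0, np.nan],
})

# Keep only the rows where 'A' is not NaN. NaN values in 'B' are ignored.
result = df.dropna(subset=["A"])

# Equivalent filter written as a boolean mask.
masked = df[df["A"].notna()]

print(result)
```

Rows 0 and 2 remain. Row 0 still has a NaN in 'B', because only 'A' was checked.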
You can also pass multiple columns as the value of the subset parameter to check multiple columns for NaN values:
df = df.dropna(subset=['A', 'B', 'C'])
This will drop the rows that have NaN in any of the 3 specified columns.
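A short sketch of the multi-column case, using made-up data. The extra column 'D' is a hypothetical addition to show that columns outside subset are not checked.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 'D' is an extra column that is not in subset.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0, 4.0],
    "B": [1.0, 2.0, np.nan, 4.0],
    "C": [1.0, 2.0, 3.0, 4.0],
    "D": [np.nan, 2.0, 3.0, np.nan],
})

# Drop a row if ANY of 'A', 'B', 'C' is NaN (how='any' is the default).
result = df.dropna(subset=["A", "B", "C"])

print(result)
```

Rows 1 and 2 are dropped because of the NaN in 'A' and 'B' respectively. Rows 0 and 3 are kept even though 'D' is NaN there.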
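If you want to drop the rows only if all the values in the specified columns are NaN, use the how parameter and set it to 'all':
df = df.dropna(subset=['A', 'B', 'C'], how='all')
This will drop the rows only if all the values in the A, B and C columns are NaN.
The thresh parameter works differently: it is the minimum number of non-NaN values a row must have in order to be kept. Setting thresh to the length of subset therefore drops every row with any NaN in those columns, which is the same as the default behaviour. To get the "all NaN" behaviour with thresh, use thresh=1 with the same subset.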
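Because how and thresh are easy to mix up, here is a small comparison on made-up data. It relies on thresh being the minimum number of non-NaN values a row needs in order to be kept; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 0 is complete, row 1 is partly NaN,
# row 2 is NaN in every checked column.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan],
    "B": [2.0, 2.0, np.nan],
    "C": [3.0, np.nan, np.nan],
})
cols = ["A", "B", "C"]

# Drop a row only when ALL of the checked columns are NaN.
drop_if_all_nan = df.dropna(subset=cols, how="all")

# Same result: keep rows with at least 1 non-NaN value in cols.
keep_if_one_value = df.dropna(subset=cols, thresh=1)

# Different result: keep only rows where every checked column is non-NaN.
keep_if_complete = df.dropna(subset=cols, thresh=len(cols))

print(drop_if_all_nan.index.tolist())   # [0, 1]
print(keep_if_complete.index.tolist())  # [0]
```

Note that how and thresh are passed in separate calls here; recent pandas versions raise an error if both are set in the same call.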
You can also pass inplace=True to modify df directly. In that case dropna() returns None instead of a new DataFrame, so do not assign the result back to df:
df.dropna(subset=['A', 'B', 'C'], inplace=True)
Please note that inplace=True modifies the original DataFrame, so the dropped rows are no longer available from df. If you may need them later, keep a copy first (for example with df.copy()).
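A brief sketch of the in-place variant on made-up data, keeping a copy beforehand so the dropped rows can still be recovered. The names backup and returned are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [1.0, 2.0, np.nan],
})

# Keep an independent copy before modifying df in place.
backup = df.copy()

# With inplace=True the call returns None and df itself is changed.
returned = df.dropna(subset=["A"], inplace=True)

print(returned)     # None
print(len(df))      # 2
print(len(backup))  # 3
```

Writing df = df.dropna(..., inplace=True) would replace df with None, which is a common mistake with this option.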