Deal with missing values in Pandas

Brief Description: Understanding missing values is essential to managing data successfully. If a researcher does not handle missing values properly, he or she may draw inaccurate inferences from the data, and the results obtained may differ from those that complete data would have produced.

Step 1 - Import the library

**We have imported numpy and pandas which will be needed for the dataset.
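
As a minimal sketch, the imports for this step could look like the following. The aliases np and pd are the usual community convention rather than a requirement.

```python
import numpy as np   # supplies np.nan, the usual marker for a missing value
import pandas as pd  # supplies DataFrame and the missing-data helpers used later
```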

Step 2 - Setting up the Data

**We have created a dataframe with the features "first_name", "last_name", "age", "Comedy_Score" and "Rating_Score".
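
One way to build such a dataframe is sketched below. The names and numbers are made up for illustration, and the variable names raw_data and df are our own choice. One row is left entirely empty and another lacks both scores, so that the methods in the next step have something to act on.

```python
import numpy as np
import pandas as pd

# Made-up values: row 1 is entirely missing, row 2 lacks both scores.
raw_data = {
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
}
df = pd.DataFrame(raw_data)
print(df)
```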


Step 3 - Dealing with missing values

**Here we will be using different methods to deal with missing values.

Dropping missing observations
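
A minimal sketch using dropna(), which by default removes every row that has at least one missing value. The frame is rebuilt here with made-up values so the snippet runs on its own; df_no_missing is a name we chose.

```python
import numpy as np
import pandas as pd

# Made-up sample frame: row 1 is entirely missing, row 2 lacks both scores.
df = pd.DataFrame({
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
})

# Keep only the rows in which every cell is present.
df_no_missing = df.dropna()
print(df_no_missing)
```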

Dropping rows where all cells in that row are NA
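
Passing how="all" to dropna() restricts the drop to rows that are empty in every column. A sketch on the same made-up frame (df_cleaned is our own name):

```python
import numpy as np
import pandas as pd

# Made-up sample frame: row 1 is entirely missing, row 2 lacks both scores.
df = pd.DataFrame({
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
})

# Only the fully empty row is removed; the partially filled row 2 stays.
df_cleaned = df.dropna(how="all")
print(df_cleaned)
```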

Creating a new column full of missing values
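
Assigning np.nan to a new column name fills the whole column with missing values. The column name "location" below is hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample frame: row 1 is entirely missing, row 2 lacks both scores.
df = pd.DataFrame({
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
})

# The scalar np.nan is broadcast to every row of the new column.
df["location"] = np.nan
print(df)
```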

Dropping columns if they contain only missing values
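
With axis=1 and how="all", dropna() works on columns instead of rows and removes only those that are completely empty. The sketch first adds a hypothetical all-NaN column called "location" so there is something to drop; df_trimmed is our own name.

```python
import numpy as np
import pandas as pd

# Made-up sample frame: row 1 is entirely missing, row 2 lacks both scores.
df = pd.DataFrame({
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
})
df["location"] = np.nan  # hypothetical column holding nothing but NaN

# axis=1 targets columns; how="all" requires every value to be missing.
df_trimmed = df.dropna(axis=1, how="all")
print(df_trimmed.columns.tolist())
```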

Dropping rows that contain fewer than five observations
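
The thresh argument of dropna() sets the minimum number of non-missing values a row needs in order to be kept. A sketch on the made-up frame, where thresh=5 keeps only rows with all five columns filled (df_thresh is our own name):

```python
import numpy as np
import pandas as pd

# Made-up sample frame: row 1 is entirely missing, row 2 lacks both scores.
df = pd.DataFrame({
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
})

# Row 1 has 0 observations and row 2 has 3, so both fall below the threshold.
df_thresh = df.dropna(thresh=5)
print(df_thresh)
```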

Filling in missing data with zeros
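
fillna() replaces missing values with a value you supply. Since zero only makes sense for numbers, this sketch restricts the fill to the numeric columns, which is our own design choice. Calling df.fillna(0) on the whole frame would also try to write 0 into the name columns.

```python
import numpy as np
import pandas as pd

# Made-up sample frame: row 1 is entirely missing, row 2 lacks both scores.
df = pd.DataFrame({
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
})

# Pick out the numeric columns and give fillna a per-column fill value.
numeric_cols = df.select_dtypes(include="number").columns
df_zero = df.fillna({col: 0 for col in numeric_cols})
print(df_zero)
```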

Filling in missing values in Comedy_Score with the mean value of Comedy_Score
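
A sketch of mean imputation on the made-up frame. Series.mean() skips missing values by default, so the mean is taken over the observed scores only, and the result is assigned back to the column instead of relying on inplace=True.

```python
import numpy as np
import pandas as pd

# Made-up sample frame: row 1 is entirely missing, row 2 lacks both scores.
df = pd.DataFrame({
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
})

# Mean of the observed scores 4, 2 and 3.
mean_score = df["Comedy_Score"].mean()
df["Comedy_Score"] = df["Comedy_Score"].fillna(mean_score)
print(df["Comedy_Score"])
```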

Filling in missing values in Comedy_Score with the mean Comedy_Score of each age group
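
One possible way to do this is groupby() followed by transform("mean"), which returns a Series aligned with the original rows that can be handed to fillna(). In the made-up frame below, Tina (age 36) has no score and receives the mean of the other 36-year-old. In recent pandas versions, a row whose age is itself missing belongs to no group and is left unfilled.

```python
import numpy as np
import pandas as pd

# Made-up sample frame: row 1 is entirely missing, row 2 lacks both scores.
df = pd.DataFrame({
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
})

# Per-row mean score of the age group that row belongs to.
age_means = df.groupby("age")["Comedy_Score"].transform("mean")

# Use the group mean only where the score is missing.
df["Comedy_Score"] = df["Comedy_Score"].fillna(age_means)
print(df[["age", "Comedy_Score"]])
```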

Selecting the rows of df where age is not NaN
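
A boolean mask built with notna() keeps just the rows that have an age. Further conditions on other columns could be combined with &, each wrapped in parentheses. The name df_with_age is our own.

```python
import numpy as np
import pandas as pd

# Made-up sample frame: row 1 is entirely missing, row 2 lacks both scores.
df = pd.DataFrame({
    "first_name": ["Jason", np.nan, "Tina", "Jake", "Amy"],
    "last_name": ["Miller", np.nan, "Ali", "Milner", "Cooze"],
    "age": [42, np.nan, 36, 24, 36],
    "Comedy_Score": [4, np.nan, np.nan, 2, 3],
    "Rating_Score": [25, np.nan, np.nan, 62, 70],
})

# notna() is True wherever a value is present; indexing with it filters rows.
df_with_age = df[df["age"].notna()]
print(df_with_age)
```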

** To see the output, please run the code above in your Python environment.