How to handle missing or corrupted data in a dataset

Missing or corrupted data in a dataset can be handled in several ways. The most common methods are described below.


  • Replacing With Mean/Median/Mode 

We can calculate the mean, median, or mode of the remaining values in the column and assign the result to each empty cell. The mean and median apply to numerical columns, while the mode also works for categorical ones.
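As a rough illustration of this idea, here is a minimal sketch that uses only Python's standard library. The function name `impute`, the use of `None` to mark a missing value, and the sample `ages` column are assumptions made for the example.

```python
from statistics import mean, median, mode


def impute(values, strategy="mean"):
    """Return a copy of `values` with None replaced by a statistic
    computed from the observed (non-None) entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    statistic = {"mean": mean, "median": median, "mode": mode}[strategy]
    fill = statistic(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with two missing entries.
ages = [22, None, 30, 22, None, 46]

ages_mean = impute(ages, "mean")      # gaps filled with 30
ages_median = impute(ages, "median")  # gaps filled with 26
ages_mode = impute(ages, "mode")      # gaps filled with 22
```

The median is usually the safer choice when the column contains outliers, since a few extreme values can pull the mean away from the typical value.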

  • Predicting The Missing Values 

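Using a machine learning model, we can predict the nulls in a column from the other features that have no missing values. The model is trained on the rows where the value is present and then applied to the rows where it is missing.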
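The sketch below shows one possible way to do this, using a simple least-squares line as the model. Linear regression is chosen here only for illustration, and any other model could take its place. The names `fit_line` and `predict_missing` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes xs are not all identical
    return slope, mean_y - slope * mean_x


def predict_missing(feature, target):
    """Fill None entries in `target` using a line fitted on the rows
    where `target` is present. `feature` must have no missing values."""
    known = [(x, y) for x, y in zip(feature, target) if y is not None]
    slope, intercept = fit_line([x for x, _ in known], [y for _, y in known])
    return [slope * x + intercept if y is None else y
            for x, y in zip(feature, target)]


# Hypothetical data: salary is missing for two rows.
years_experience = [1, 2, 3, 4, 5]
salary = [30, 40, None, 60, None]

filled = predict_missing(years_experience, salary)
```

Here the observed rows lie on the line salary = 10 * years + 20, so the two gaps are filled with values close to 50 and 70.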

  • Using Algorithms Which Support Missing Values 

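KNN (k-nearest neighbours) is a machine learning algorithm based on the distance between data points. It can be employed when there are nulls in the dataset: for each row with a missing value, the K most similar rows are found using the features that are present, and the gap is filled with the majority value (for categorical features) or the average (for numerical features) of those K neighbours.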
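A minimal sketch of KNN-based filling for a numerical column might look like the following. It assumes that only one column (`target_idx`) can contain missing values, that missing values are marked with `None`, and that Euclidean distance on the remaining columns is a sensible similarity measure. The function name `knn_impute` and the sample rows are hypothetical.

```python
import math


def knn_impute(rows, target_idx, k=2):
    """Fill None in column `target_idx` with the mean of that column
    over the k nearest rows where the value is present."""
    complete = [r for r in rows if r[target_idx] is not None]
    if not complete:
        raise ValueError("no rows with an observed target value")

    def distance(a, b):
        # Euclidean distance over every column except the target.
        return math.sqrt(sum((a[i] - b[i]) ** 2
                             for i in range(len(a)) if i != target_idx))

    result = []
    for row in rows:
        new_row = list(row)
        if row[target_idx] is None:
            neighbours = sorted(complete, key=lambda r: distance(row, r))[:k]
            new_row[target_idx] = (sum(r[target_idx] for r in neighbours)
                                   / len(neighbours))
        result.append(new_row)
    return result


# Hypothetical data: the last row is missing its third value.
rows = [
    [1.0, 1.0, 10.0],
    [1.2, 0.9, 12.0],
    [8.0, 9.0, 50.0],
    [1.1, 1.0, None],
]

imputed = knn_impute(rows, target_idx=2, k=2)
```

The two nearest neighbours of the incomplete row are the first two rows, so the gap is filled with their average, 11.0. Because KNN relies on distances, features on very different scales should normally be standardised first.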

  • Deleting rows or columns 

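If most of the values in a column or a row are missing, we may simply delete that column or row. This is the simplest option, but it discards whatever information the deleted cells did contain.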
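The following sketch illustrates the idea with plain Python dictionaries. The 50% threshold, the function name `drop_sparse`, and the sample records are assumptions for the example, and the right threshold depends on the dataset.

```python
def drop_sparse(rows, max_missing=0.5):
    """Drop columns, then rows, whose share of None values exceeds
    `max_missing`. `rows` is a list of dicts sharing the same keys."""
    if not rows:
        return []
    columns = list(rows[0])

    # Keep only columns with an acceptable share of missing values.
    kept = [c for c in columns
            if sum(r[c] is None for r in rows) / len(rows) <= max_missing]
    if not kept:
        return []
    trimmed = [{c: r[c] for c in kept} for r in rows]

    # Then keep only rows with an acceptable share of missing values.
    return [r for r in trimmed
            if sum(v is None for v in r.values()) / len(kept) <= max_missing]


# Hypothetical records: "fax" is mostly empty, and so is the second row.
records = [
    {"age": 25, "city": "Oslo", "fax": None},
    {"age": None, "city": None, "fax": None},
    {"age": 31, "city": "Lima", "fax": None},
    {"age": 40, "city": None, "fax": "555-0100"},
]

cleaned = drop_sparse(records)
```

In this example the "fax" column (75% missing) and the second record (entirely missing) are removed, while the last record is kept because exactly half of its remaining values are present.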
