This project is an implementation of a hybrid method for imputation of missing values.

SamanKhamesian/Imputation-of-Missing-Values

Imputation-of-Missing-Values

Abstract

Missing values in datasets should be either removed from the datasets or estimated in the preprocessing stage of data mining, before the data are used for classification, association rules or clustering. In this paper, the authors utilize a fuzzy c-means clustering hybrid approach that combines support vector regression and a genetic algorithm. In this method, the fuzzy clustering parameters (cluster size and weighting factor) are optimized and the missing values are estimated. The proposed novel hybrid method yields sufficient and sensible imputation performance results. The results are compared with those of fuzzy c-means genetic algorithm imputation, support vector regression genetic algorithm imputation and zero imputation. This project is an implementation of this method.
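To make the fuzzy c-means part of the abstract concrete, here is a minimal sketch of how a single row with missing entries could be filled in once cluster centroids are known. It is an illustration only and is not taken from this repository's code. The function names `fcm_memberships` and `fcm_impute_row` are hypothetical, and so is the use of `None` as the missing-value marker. It also assumes that each missing value is estimated as the membership-weighted average of the centroids, and it leaves out the support vector regression and genetic algorithm steps that tune the cluster size and the weighting factor `m`.

```python
import math


def fcm_memberships(row, centroids, m=2.0):
    """Fuzzy membership of one row in each cluster.

    Distances use only the observed (non-None) features of the row.
    `m` is the fuzzy weighting factor (m > 1).
    """
    observed = [j for j, value in enumerate(row) if value is not None]
    dists = [
        math.sqrt(sum((row[j] - centroid[j]) ** 2 for j in observed))
        for centroid in centroids
    ]
    # A row that sits exactly on one or more centroids belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    # Standard fuzzy c-means membership formula.
    return [
        1.0 / sum((d_k / d_l) ** exponent for d_l in dists)
        for d_k in dists
    ]


def fcm_impute_row(row, centroids, m=2.0):
    """Replace each None in `row` with the membership-weighted centroid value."""
    u = fcm_memberships(row, centroids, m)
    return [
        value if value is not None
        else sum(u_k * centroid[j] for u_k, centroid in zip(u, centroids))
        for j, value in enumerate(row)
    ]


if __name__ == "__main__":
    # Hypothetical centroids for a two-feature dataset.
    centroids = [[0.0, 0.0], [10.0, 10.0]]
    print(fcm_impute_row([5.0, None], centroids))
```

In the hybrid method described above, the number of clusters and the value of `m` would not be fixed by hand as they are here; the abstract states that they are optimized.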

To use this work in your research or projects, you need:

  • Python 3.7.0
  • Python packages:
    • numpy
    • pandas
    • scikit-learn
    • scikit-fuzzy
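As a quick sanity check, the short script below reports the running Python version and lists which of the packages above cannot be found. It is a sketch rather than part of the project. The helper name `missing_packages` is hypothetical, and the mapping assumes the usual import names (`sklearn` for scikit-learn, `skfuzzy` for scikit-fuzzy).

```python
import importlib.util
import sys

# pip package name -> module name used in `import` statements
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "scikit-fuzzy": "skfuzzy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be located."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any names it prints as missing can be installed with pip as described in the rest of this section.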

To install Python:

First, check whether it is already installed:

python3 --version

If Python 3 is not installed on your computer, you can install it with the commands below (on Debian or Ubuntu):

sudo apt-get update
sudo apt-get install python3

To install the packages with pip:

sudo pip3 install numpy scikit-fuzzy pandas scikit-learn

If pip is not installed, run the commands below in your terminal:

sudo apt-get update
sudo apt install python3-pip

It is also a good idea to upgrade pip to the latest version:

pip3 install --upgrade pip