
🌾 EDA and several Clustering techniques applied on a real dataset of wheat kernels. 🍞


rmssoares/ClusteringOnWheat


ClusteringOnWheat

ClusteringOnWheat applies several clustering methods to a dataset of wheat kernels. As described in the report, the following techniques were used:

  • K-Means.
  • Hierarchical Clustering.
  • EM.
  • K-Nearest Neighbours.
  • DBSCAN.
  • K-Means of the feature space obtained after using Autoencoders for dimensionality reduction.
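As a rough illustration of the first two techniques in the list, the sketch below runs K-Means and hierarchical clustering with base R only. It is not taken from the repository's scripts: the data are a synthetic stand-in for the wheat-kernel measurements, the column names `feature_a` and `feature_b` are hypothetical, and the choices of `k = 3` and Ward linkage are assumptions made for the example.

```r
# Synthetic stand-in for the wheat-kernel measurements (NOT the real dataset):
# three groups of 50 points each, described by two hypothetical features.
set.seed(42)
centres <- rbind(c(0, 0), c(5, 5), c(0, 5))
features <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(50, centres[i, 1], 0.5), rnorm(50, centres[i, 2], 0.5))
}))
colnames(features) <- c("feature_a", "feature_b")

# Standardise so that no single feature dominates the Euclidean distance.
scaled <- scale(features)

# K-Means with k = 3 (an assumption) and several random starts.
km <- kmeans(scaled, centers = 3, nstart = 25)

# Agglomerative hierarchical clustering with Ward linkage, cut into 3 groups.
hc <- hclust(dist(scaled), method = "ward.D2")
hc_labels <- cutree(hc, k = 3)

# Cross-tabulate the two labelings to see how far they agree.
print(table(kmeans = km$cluster, hclust = hc_labels))
```

To try this on the real data, replace `features` with the numeric columns of the wheat dataset; the rest of the sketch should carry over unchanged.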

The report covers the EDA and the implementation of each technique in more detail.

Getting Started

All implementations were written in R. To start, clone this repository to your local machine. If you are unsure how to do this, consult GitHub's documentation on cloning repositories.

git clone git@github.com:thyriki/ClusteringOnWheat.git

Prerequisites

Ensure that you have R version 3.4.4 or later installed and properly set up.
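One way to confirm the R version from within an R session is sketched below. The variable names `required_version` and `r_ok` are illustrative and do not come from the repository.

```r
# Minimum R version stated in this README.
required_version <- "3.4.4"

# getRversion() returns the running R version in a form that can be
# compared directly against a version string.
r_ok <- getRversion() >= required_version

if (r_ok) {
  message("R ", getRversion(), " meets the minimum of ", required_version, ".")
} else {
  warning("R ", getRversion(), " is older than ", required_version,
          "; please upgrade before running the scripts.")
}
```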

RStudio was used as the development environment; it is convenient but not required.

The autoencode.R script uses the h2o platform, which starts a JVM. A Java JDK must therefore also be installed and properly set up. The script was developed with 64-bit Java 8.

Each script should run without further setup once the libraries loaded at the top of that script have been installed.
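A minimal sketch for checking which libraries are still missing is shown below. Only `h2o` is named in this README, so the `needed` vector is an assumption that should be extended with the packages listed at the top of each script.

```r
# Packages to check. Only "h2o" is mentioned in this README; add the
# libraries listed at the start of the script you want to run.
needed <- c("h2o")

# requireNamespace() returns TRUE when a package is installed and loadable,
# without attaching it to the search path.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  message("Install them with install.packages() before running the scripts.")
} else {
  message("All listed packages are installed.")
}
```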

Authors

For any inquiries, feel free to open up an issue.