Clustering

Clustering is the task of grouping a set of objects so that objects in the same group (called a cluster) are more similar, in some sense, to each other than to objects in other clusters.

Algorithm

The clustering algorithm is based on k-means. It partitions n observations into k clusters so that each observation belongs to the cluster with the nearest mean, which serves as the prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.
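
The partitioning described above can be sketched with the standard k-means iteration (Lloyd's algorithm). This is a minimal illustration in plain Python, not the implementation used by the tool: the function name `kmeans`, its parameters, the random initialization from data points, and the squared Euclidean distance are all assumptions made for the example.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Illustrative k-means (Lloyd's algorithm); returns (centroids, labels).

    Hypothetical helper: names and defaults are assumptions for this sketch.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    rng = random.Random(seed)
    # Assumption: start from k distinct rows picked at random.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: each point goes to the cluster with the nearest mean.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]

        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])

        # Stop once neither the assignments nor the centroids change.
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids

    return centroids, labels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
```

With two well-separated groups like these, the first three points end up sharing one label and the last three share the other. Which group receives index 0 depends on the random initialization.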

Run

Open a table
Run from the menu: Tools | Data Science | Cluster...
Select the numerical feature columns to use for clustering
Select the required number of clusters (an integer in the range 1..n)
Enable "Show scatter plot" to open a scatter plot after clustering
Run the clustering

The result is added to the source table as a column named "Cluster", which contains the cluster index for each table row.
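
To make the shape of the result concrete, the sketch below appends a "Cluster" column to a small table held as a list of dictionaries. It is only an illustration: the helper `add_cluster_column`, the sample rows, the fixed centroids, and the 0-based cluster index are assumptions, not the tool's actual API or numbering.

```python
def add_cluster_column(rows, feature_columns, centroids, column_name="Cluster"):
    """Return copies of the rows with the index of the nearest centroid added.

    Hypothetical helper for illustration; cluster indices are assumed 0-based.
    """
    result = []
    for row in rows:
        # Only the selected numerical feature columns take part in clustering.
        point = [float(row[col]) for col in feature_columns]
        index = min(
            range(len(centroids)),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        # Copy the row so the source rows stay unchanged in this sketch.
        result.append({**row, column_name: index})
    return result


# Made-up sample data: "name" is ignored, "height" and "weight" are features.
table = [
    {"name": "a", "height": 1.0, "weight": 2.0},
    {"name": "b", "height": 1.5, "weight": 1.8},
    {"name": "c", "height": 9.0, "weight": 8.5},
]
clustered = add_cluster_column(
    table, ["height", "weight"], centroids=[(1.0, 2.0), (9.0, 9.0)]
)
```

Every row keeps its original values and gains one extra "Cluster" value, which is what makes the column usable for coloring or filtering afterwards.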

Usage examples

The "Cluster" column can be used as the "COLOR" parameter in a scatter plot.

Notes

Clustering works only with numerical data.
A new scatter plot is opened only if no scatter plot is already open.
