How can I get clusters for each feature vector, like the ones displayed on the README page?
I'm currently using this function to implement a clustering algorithm, but it is not fast enough:

    import numpy as np
    from annoy import AnnoyIndex


    def annoy_clustering(data, num_trees=10, num_neighbors=10):
        n_samples, n_features = data.shape

        # Step 1: Build the Annoy index
        annoy_index = AnnoyIndex(n_features, metric='euclidean')
        for i in range(n_samples):
            annoy_index.add_item(i, data[i])
        annoy_index.build(num_trees)

        # Step 2: Assign clusters based on nearest neighbors
        labels = np.full(n_samples, -1)  # Initialize all labels as -1
        cluster_id = 0
        for i in range(n_samples):
            if labels[i] == -1:  # If the point is not yet labeled
                # Get nearest neighbors
                neighbors = annoy_index.get_nns_by_item(i, num_neighbors)
                # Assign the same cluster ID to the point and its neighbors
                labels[neighbors] = cluster_id
                cluster_id += 1
        return labels

Is it even possible to get clusters directly from Annoy without calling get_nns_by_item for every point, which adds a lot to the computational cost?
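
For reference, here is a minimal sketch of the labeling step on its own, separated from the index queries. It assumes the neighbor lists have already been collected (from get_nns_by_item or any other source) and merges each point with its neighbors using union-find, so the result does not depend on the order in which points are visited. The function name `labels_from_neighbors` and its input format are hypothetical, not part of Annoy, and this does not reduce the number of neighbor queries; it only illustrates the grouping logic.

```python
def labels_from_neighbors(neighbor_lists):
    """Group points into connected components of the neighbor graph.

    neighbor_lists[i] is a list of indices considered neighbors of point i
    (hypothetical input format, e.g. collected from an ANN index).
    """
    n = len(neighbor_lists)
    parent = list(range(n))

    def find(x):
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge every point with each of its neighbors.
    for i, neighbors in enumerate(neighbor_lists):
        root_i = find(i)
        for j in neighbors:
            root_j = find(j)
            if root_j != root_i:
                parent[root_j] = root_i

    # Map each root to a compact cluster id in order of first appearance.
    root_to_label = {}
    labels = []
    for i in range(n):
        root = find(i)
        labels.append(root_to_label.setdefault(root, len(root_to_label)))
    return labels


example_labels = labels_from_neighbors([[0, 1], [1, 0], [2, 3], [3, 2], [4]])
print(example_labels)
```

Note that with a large number of neighbors per point this kind of merging can collapse everything into a single cluster, so it would likely need a distance threshold or a small neighbor count to be useful.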