Optimal ordering of elements in a set given their distance matrix.
Overview • How To Use • Contributions • License
This is a Python implementation of Seriation algorithm. Seriation is an approach for ordering elements in a set so that the sum of the sequential pairwise distances is minimal. We state this task as a Travelling Salesman Problem (TSP) and leverage the powerful Google's or-tools to do heavy-lifting. Since TSP is NP-hard, it is not possible to calculate the precise solution for a big number of elements. However, the or-tools' heuristics work very well in practice, and they are used in e.g. Google Maps.
Any numpy.roll
-ed
result is equivalent.
import numpy
from scipy.spatial.distance import pdist
from seriate import seriate
elements = numpy.array([
[3, 3, 3],
[5, 5, 5],
[4, 4, 4],
[2, 2, 2],
[1, 1, 1]
])
print(seriate(pdist(elements)))
# Output: [4, 3, 0, 2, 1]
The example above shows how we order 5 elements: [3, 3, 3]
,
[5, 5, 5]
, [4, 4, 4]
, [2, 2, 2]
and [1, 1, 1]
. The result
is expected:
[1, 1, 1]
[2, 2, 2]
[3, 3, 3]
[4, 4, 4]
[5, 5, 5]
pdist
from scipy.spatial.distance
uses Euclidean (L2) dstance metric by default, so the distance between
[x, x, x]
and [x + 1, x + 1, x + 1]
is constant: √3. Any other distance
is bigger, so the optimal ordering is to list our elements in the increasing
norm order.
Contributions are very welcome and desired! Please follow the code of conduct and read the contribution guidelines.
Apache-2.0, see LICENSE.md.