Skip to content

Commit

Permalink
Added inertia test and fixed some stuff
Browse files Browse the repository at this point in the history
  • Loading branch information
charnley committed Jan 4, 2025
1 parent 95e816b commit 7ec2bc8
Show file tree
Hide file tree
Showing 18 changed files with 208 additions and 131 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,7 @@ repos:
hooks:
- id: mypy
name: checking types
entry: mypy
entry: mypy --explicit-package-bases
language: system
types: [ python ]

Expand Down
35 changes: 17 additions & 18 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -28,15 +28,15 @@ I want to know the minimal RMSD between two molecules
To calculate the structural difference between two molecules, you might initially compute the RMSD directly (**Figure 2.a**). However, this straightforward approach could give you a misleadingly large value.
To get the true minimal RMSD, you must adjust for translations (**Figure 2.b**) and rotations (**Figure 2.c**). This process aligns the two molecules in the best possible way, ensuring the RMSD accurately reflects their structural similarity after optimal alignment.

.. list-table::
.. list-table::
:header-rows: 1

* - 1.a
- 1.b
- 1.c

* - |fig1.a|
- |fig1.b|
* - |fig1.a|
- |fig1.b|
- |fig1.c|

* - RMSD = 2.8
Expand All @@ -48,27 +48,27 @@ To get the true minimal RMSD, you must adjust for translations (**Figure 2.b**)
I do not know the order of the atoms
------------------------------------

Atom reordering methods can be used in cases where the atoms in the two molecules are not in the same order (Figure 2.1). These algorithms find the optimal mapping of atoms between the two structures to minimize RMSD.
Atom reordering methods can be used in cases where the atoms in the two molecules are not in the same order (Figure 2.1). These algorithms find the optimal mapping of atoms between the two structures to minimize RMSD.

Each method has its limitations because finding the best atom mapping depends on having the structures properly aligned. This is usually done by comparing atom-pair distances. If the molecules are already aligned, using the Hungarian linear sum assignment works well. If they’re not aligned, you can either align the molecules using their inertia eigenvectors (**Figure 2.a**) or use atomic descriptors (**Figure 2.b**), independent of the coordinate system, to reorder the atoms. Note that all reordering methods have limitations and drawbacks and might not actually find the true order.

.. _Hungarian: https://en.wikipedia.org/wiki/Hungarian_algorithm

.. list-table::
.. list-table::
:header-rows: 1

* - 2.a
- 2.b
- 2.c

* - |fig2.a|
- |fig2.b|
* - |fig2.a|
- |fig2.b|
- |fig2.c|

**Figure 2**:
**a)** Two identical molecules, but not in the same atomic order, making it impossible to rotate correctly.
**b)** Illustrating the molecular inertia vectors used to align molecules to find the atom mapping.
**c)** Illustrating atomic representation for individual atoms. Structure-dependent but coordinate system-independent, which can be used for atom mapping.
**c)** Illustrating atomic representation for individual atoms. Structure-dependent but coordinate system-independent, which can be used for atom mapping.

Installation
============
Expand Down Expand Up @@ -111,11 +111,11 @@ If the atoms are scrambled and not aligned, you can use the ``--reorder``
argument, which will align the atoms from structure B onto A.

Use ``--reorder-method`` to select the reordering method.
Choose between
Inertia_ aligned Hungarian_ distance (default),
Hungarian_ distance,
Choose between
Inertia_ aligned Hungarian_ distance (default),
Hungarian_ distance,
distance (very approximate)
QML atomic representation (coordinate independent),
QML atomic representation (coordinate independent),
and brute force (don't).

.. _Hungarian: https://en.wikipedia.org/wiki/Hungarian_algorithm
Expand Down Expand Up @@ -152,21 +152,21 @@ Calculate Root-mean-square deviation (RMSD) of Two Molecules Using Rotation, Git
http://github.com/charnley/rmsd, <git commit hash or version number>


.. list-table::
.. list-table::
:header-rows: 1

* - Method
- Argument
- Citation

* - **Kabsch**
* - **Kabsch**
- ``--rotation-method kabsch`` (Default)
- Wolfgang Kabsch (1976),
Acta Crystallographica, A32:922-923

http://dx.doi.org/10.1107/S0567739476001873

* - **Quaternion**
* - **Quaternion**
- ``--rotation-method quaternion``
- Walker, Shao & Volz (1991),
CVGIP: Image Understanding, 54:358-367,
Expand All @@ -179,9 +179,9 @@ http://github.com/charnley/rmsd, <git commit hash or version number>

http://dx.doi.org/10.1109/TAES.2016.140952

* - **FCHL19**
* - **FCHL19**
- ``--reorder-method qml``
- Christensen et al (2020), J. Chem. Phys. 152, 044107
- Christensen et al (2020), J. Chem. Phys. 152, 044107

https://doi.org/10.1063/1.5126701

Expand All @@ -207,4 +207,3 @@ For example, some hydrogens are noted as ``HG11``, which we assume is not mercur
.. |fig2.a| image:: https://raw.githubusercontent.com/charnley/rmsd/refs/heads/charnley/doc/docs/figures/fig_reorder_problem.png
.. |fig2.b| image:: https://raw.githubusercontent.com/charnley/rmsd/refs/heads/charnley/doc/docs/figures/fig_reorder_inertia.png
.. |fig2.c| image:: https://raw.githubusercontent.com/charnley/rmsd/refs/heads/charnley/doc/docs/figures/fig_reorder_qml.png

58 changes: 28 additions & 30 deletions rmsd/calculate_rmsd.py
Original file line number Diff line number Diff line change
Expand Up @@ -678,8 +678,8 @@ def kabsch_weighted_fit(
PNEW: ndarray = np.dot(P, R.T) + T
if return_rmsd:
return (PNEW, rmsd_)
else:
return (PNEW, None)

return (PNEW, None)


def kabsch_weighted_rmsd(P: ndarray, Q: ndarray, W: Optional[ndarray] = None) -> float:
Expand Down Expand Up @@ -842,20 +842,16 @@ def hungarian_vectors(

if use_kernel:
# Calculate cost matrix from similarity kernel
K = laplacian_kernel(p_vecs, q_vecs, sigma)
K *= -1.0
K += 1.0
kernel = laplacian_kernel(p_vecs, q_vecs, sigma)
kernel *= -1.0
kernel += 1.0

else:
K = distance_matrix(p_vecs, q_vecs)
kernel = distance_matrix(p_vecs, q_vecs)

# Perform Hungarian analysis on distance matrix between atoms of 1st
# structure and trial structure
indices_b: ndarray
indices_a: ndarray
indices_a, indices_b = linear_sum_assignment(K)
_, indices_q = linear_sum_assignment(kernel)

return indices_b
return indices_q


def reorder_similarity(
Expand Down Expand Up @@ -900,7 +896,6 @@ def reorder_similarity(
}

p_vecs = generate_fchl19(p_atoms, p_coord, **parameters)

q_vecs = generate_fchl19(q_atoms, q_coord, **parameters)

# generate full view from q shape to fill in atom view on the fly
Expand Down Expand Up @@ -975,23 +970,20 @@ def reorder_distance(
return view_reorder


def hungarian(A: ndarray, B: ndarray) -> ndarray:
def _hungarian(A: ndarray, B: ndarray) -> ndarray:
"""
Hungarian reordering.
Hungarian reordering, minimizing the cost distances[winner_rows, winner_cols].sum())
Assume A and B are coordinates for atoms of SAME type only
Assume A and B are coordinates for atoms of SAME type only.
"""

# should be kabasch here i think
distances = cdist(A, B, "euclidean")

# Perform Hungarian analysis on distance matrix between atoms of 1st
# structure and trial structure
indices_b: ndarray
indices_a: ndarray
indices_a, indices_b = linear_sum_assignment(distances)
winner_rows, winner_cols = linear_sum_assignment(distances)

return indices_b
return winner_cols


def reorder_hungarian(
Expand Down Expand Up @@ -1038,7 +1030,7 @@ def reorder_hungarian(
_p_coord = p_coord[p_atom_idx]
_q_coord = q_coord[q_atom_idx]

view = hungarian(_p_coord, _q_coord)
view = _hungarian(_p_coord, _q_coord)
view_reorder[p_atom_idx] = q_atom_idx[view]

return view_reorder
Expand Down Expand Up @@ -1076,9 +1068,13 @@ def reorder_inertia_hungarian(
"""

p_coord = np.array(p_coord, copy=True)
q_coord = np.array(q_coord, copy=True)

p_coord -= get_cm(p_atoms, p_coord)
q_coord -= get_cm(q_atoms, p_coord)

# Calculate inertia vectors for both structures
inertia_p = get_inertia_tensor(p_atoms, p_coord)
eigval_p, eigvec_p = np.linalg.eig(inertia_p)

Expand All @@ -1101,12 +1097,11 @@ def reorder_inertia_hungarian(

for mirror in AXIS_REFLECTIONS:

tmp_eigvec = eigvec_p * mirror.T
tmp_eigvec = eigvec_q * mirror.T
tmp_coord = np.dot(q_coord, tmp_eigvec)
tmp_coord -= get_cm(q_atoms, tmp_coord)

test_review = reorder_hungarian(p_atoms, q_atoms, p_coord, tmp_coord)
test_rmsd = rmsd(tmp_coord[test_review], p_coord)
test_rmsd = kabsch_rmsd(tmp_coord[test_review], p_coord)

if test_rmsd < best_rmsd:
best_rmsd = test_rmsd
Expand Down Expand Up @@ -1441,7 +1436,9 @@ def get_principal_axis(atoms: ndarray, V: ndarray) -> ndarray:
return principal_axis


def set_coordinates(atoms: ndarray, V: ndarray, title: str = "", decimals: int = 8) -> str:
def set_coordinates(
atoms: ndarray, V: ndarray, title: str = "", decimals: int = 8, set_atoms_as_symbols=True
) -> str:
"""
Print coordinates V with corresponding atoms to stdout in XYZ format.
Parameters
Expand All @@ -1466,6 +1463,9 @@ def set_coordinates(atoms: ndarray, V: ndarray, title: str = "", decimals: int =
if N != len(atoms):
raise ValueError("Mismatch between expected atoms and coordinate size")

if not isinstance(atoms[0], str) and set_atoms_as_symbols:
atoms = np.array([str_atom(atom) for atom in atoms])

fmt = "{:<2}" + (" {:15." + str(decimals) + "f}") * 3

out = list()
Expand Down Expand Up @@ -2280,9 +2280,8 @@ def main(args: Optional[List[str]] = None) -> str:
if settings.print_only_rmsd_atoms or not use_view:
q_coord_sub = np.dot(q_coord_sub, U)
q_coord_sub += p_cent_sub
_atoms = np.asarray([str_atom(atom) for atom in q_atoms_sub])
return set_coordinates(
_atoms,
q_atoms_sub,
q_coord_sub,
title=f"Rotated '{settings.structure_b}' to match '{settings.structure_a}', with a RMSD of {result_rmsd:.8f}",
)
Expand All @@ -2298,9 +2297,8 @@ def main(args: Optional[List[str]] = None) -> str:

q_coord = np.dot(q_coord, U)
q_coord += p_cent_sub
_atoms = np.asarray([str_atom(atom) for atom in q_atoms])
return set_coordinates(
_atoms,
q_atoms,
q_coord,
title=f"Rotated {settings.structure_b} to match {settings.structure_a}, with RMSD of {result_rmsd:.8f}",
)
Expand Down
33 changes: 33 additions & 0 deletions tests/conftest.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
from pathlib import Path

import numpy as np
from numpy import ndarray

RESOURCE_PATH = Path("tests/resources")


def get_rotation_matrix(degrees: float) -> ndarray:
"""https://en.wikipedia.org/wiki/Rotation_matrix"""

radians = degrees * np.pi / 180.0

r11 = np.cos(radians)
r12 = -np.sin(radians)
r21 = np.sin(radians)
r22 = np.cos(radians)

R = np.array([[r11, r12], [r21, r22]])

return R


def degree2radiant(degrees: float) -> float:
return degrees * np.pi / 180.0


def rotate_coord(angle: float, coord: ndarray, axis: list[int] = [0, 1]):
U = get_rotation_matrix(angle)
_xy = np.dot(coord[:, axis], U)
_coord = np.array(coord, copy=True)
_coord[:, axis] = _xy
return _coord
21 changes: 0 additions & 21 deletions tests/context.py

This file was deleted.

16 changes: 16 additions & 0 deletions tests/resources/butane.xyz
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
14

C 2.142e00 1.395e00 -8.932e00
C 3.631e00 1.416e00 -8.537e00
C 4.203e00 -1.200e-02 -8.612e00
C 5.691e00 9.000e-03 -8.218e00
H 1.604e00 7.600e-01 -8.260e00
H 1.745e00 2.388e00 -8.880e00
H 2.043e00 1.024e00 -9.930e00
H 4.169e00 2.051e00 -9.210e00
H 3.731e00 1.788e00 -7.539e00
H 3.665e00 -6.470e-01 -7.940e00
H 4.104e00 -3.840e-01 -9.610e00
H 6.088e00 -9.830e-01 -8.270e00
H 5.791e00 3.810e-01 -7.220e00
H 6.230e00 6.440e-01 -8.890e00
16 changes: 16 additions & 0 deletions tests/resources/butane_prime.xyz
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
14

C 6.71454 -5.53848 -3.50851
C 6.95865 -6.22697 -2.15264
C 8.16747 -5.57632 -1.45606
C 5.50518 -6.19016 -4.20589
H 5.33617 -5.71137 -5.14853
H 7.58263 -5.64795 -4.12498
H 6.51851 -4.49883 -3.35011
H 6.09092 -6.11832 -1.53660
H 5.70232 -7.22908 -4.36475
H 7.15558 -7.26640 -2.31068
H 8.33668 -6.05459 -0.51425
H 7.97144 -4.53667 -1.29765
H 4.63745 -6.08152 -3.58986
H 9.03610 -5.68475 -2.07173
2 changes: 1 addition & 1 deletion tests/test_file_reading.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@

import numpy as np
import pytest
from context import RESOURCE_PATH
from conftest import RESOURCE_PATH # type: ignore

import rmsd as rmsdlib

Expand Down
2 changes: 1 addition & 1 deletion tests/test_kabsch.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import numpy as np
from context import RESOURCE_PATH
from conftest import RESOURCE_PATH # type: ignore

import rmsd as rmsdlib

Expand Down
2 changes: 1 addition & 1 deletion tests/test_kabsch_weighted.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import numpy as np
from context import RESOURCE_PATH
from conftest import RESOURCE_PATH # type: ignore

import rmsd as rmsdlib

Expand Down
Loading

0 comments on commit 7ec2bc8

Please sign in to comment.