Description
Hi team,
Thank you for the fantastic work on this project. I am attempting to reconstruct the complete 3D scene using the outputs of your pipeline (`mesh.glb` and `parameters.json`).
I have successfully developed a script that loads each object and applies its transformation. The results are promising: the approximate translation and scale of the objects in the scene appear plausible. However, I am facing a persistent issue where the rotational pose of every object is incorrect. The objects are in the right general location and size but are not oriented correctly.
My Reconstruction Approach
My method is to construct a 4x4 transformation matrix for each object from its `parameters.json` file and apply it to the mesh. The core of my logic, especially for the rotation, is shown in the Python function below. The final matrix is composed as `Transformation = Translation @ Rotation @ Scale`.
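As a quick sanity check that this composition order behaves as intended (scale first, then rotate, then translate, all about the object's canonical origin), here is a toy example with made-up values:

```python
import numpy as np

# Toy check of the composition order: with M = T @ R @ S, a point is
# scaled first, then rotated, then translated.
S = np.diag([2.0, 2.0, 2.0, 1.0])            # uniform scale by 2
R = np.eye(4)
R[:3, :3] = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 deg about +Z
T = np.eye(4)
T[:3, 3] = [5.0, 0.0, 0.0]                   # shift along +X

M = T @ R @ S
p = np.array([1.0, 0.0, 0.0, 1.0])
# scale -> (2, 0, 0); rotate -> (0, 2, 0); translate -> (5, 2, 0)
print(M @ p)  # [5. 2. 0. 1.]
```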
Code Snippet
This function shows exactly how I am interpreting the `6drotation_normalized` parameter to generate a rotation matrix:
```python
import numpy as np


def get_transformation_matrix(params):
    """
    Builds a 4x4 transformation matrix from the parameters dictionary.
    """
    # 1. Extract data, accounting for the [[[-1.2, ...]]] nesting
    scale_vec = np.array(params['scale'])[0]
    rot_6d = np.array(params['6drotation_normalized'])[0][0]
    trans_vec = np.array(params['translation'])[0]

    # 2. Reconstruct the 3x3 rotation matrix from the 6D vector.
    #    Assumes the 6D vector represents the first two columns of the
    #    rotation matrix.
    a1 = rot_6d[:3]
    a2 = rot_6d[3:]

    # Gram-Schmidt orthogonalization
    b1 = a1 / np.linalg.norm(a1)
    b2_orthogonal = a2 - np.dot(b1, a2) * b1
    b2 = b2_orthogonal / np.linalg.norm(b2_orthogonal)
    b3 = np.cross(b1, b2)
    rot_mat = np.stack([b1, b2, b3], axis=1)

    # 3. Build individual 4x4 transformation matrices
    S = np.diag([*scale_vec, 1.0])
    M_rot = np.eye(4)
    M_rot[:3, :3] = rot_mat
    M_trans = np.eye(4)
    M_trans[:3, 3] = trans_vec

    # 4. Combine transformations: T * R * S
    transform_matrix = M_trans @ M_rot @ S
    return transform_matrix
```

Request for Clarification
Since the primary error appears to be rotational, I suspect a subtle misunderstanding in my interpretation of the pose parameters. In your paper, you mention implementing a scene reconstruction. To help resolve this, I have two questions:
- Reference Implementation: Would it be possible to share the script or a code snippet that demonstrates your scene reconstruction implementation? Having a reference for how you interpret the pose parameters would be the most direct way to solve this issue.
- Intended Workflow: If sharing the code is not possible, could you please clarify the intended workflow for reconstructing the entire scene using your outputs? Specifically:
  - Full Transformation Formula: Is my method—applying a single `T @ R @ S` matrix derived directly from `parameters.json`—the complete and correct approach? Or am I missing a crucial step, such as an additional transformation (e.g., a camera-to-world matrix) that needs to be applied?
  - Parameter Interpretation: How should `6drotation_normalized` be correctly converted into a rotation matrix, and what are the underlying coordinate system conventions (+Y up, right-handed, etc.)?
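On the last point, the ambiguity I am most unsure about is whether the two 3-vectors in the 6D representation are the first two columns of the rotation matrix (my current assumption) or the first two rows, as in some other implementations (PyTorch3D's `rotation_6d_to_matrix`, for example, stacks the orthonormalized vectors as rows). A minimal sketch of both conventions, to make the question concrete:

```python
import numpy as np


def rot6d_to_matrix(rot_6d, as_rows=False):
    """Decode a 6D rotation under either convention.

    as_rows=False: the two 3-vectors are the first two COLUMNS of R
    (my current interpretation); as_rows=True: they are the first two
    ROWS. The two results are transposes of each other.
    """
    a1, a2 = rot_6d[:3], rot_6d[3:]
    # Gram-Schmidt orthonormalization
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - np.dot(b1, a2) * b1
    b2 = b2 / np.linalg.norm(b2)
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=0 if as_rows else 1)
```

If the column interpretation turns out to be wrong, the fix on my side is just a transpose of `rot_mat` (equivalently, switching the stack axis), so confirming which convention you use would settle this quickly.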
Any guidance you can provide on the correct way to transform the objects from their canonical space into the final scene would be incredibly helpful.
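For completeness, this is how I apply the resulting matrix to the mesh geometry (pure NumPy; `verts` here stands for the (N, 3) vertex array I load from `mesh.glb`):

```python
import numpy as np


def apply_transform(verts, M):
    """Apply a 4x4 transform M to an (N, 3) vertex array."""
    homo = np.hstack([verts, np.ones((len(verts), 1))])  # -> (N, 4)
    return (homo @ M.T)[:, :3]
```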
Thank you for your time and assistance.