Bug: Incorrect Rotational Pose in Scene Reconstruction & Request for Method Clarification #71

@lzlfwow

Description

Hi team,

Thank you for the fantastic work on this project. I am attempting to reconstruct the complete 3D scene using the outputs of your pipeline (mesh.glb and parameters.json).

I have written a script that loads each object and applies its transformation, and the results are promising: each object lands in a plausible location at a plausible scale. However, I am facing a persistent issue where the rotational pose of every object is incorrect. The objects are in the right general location and size, but they are not oriented correctly.

My Reconstruction Approach

My method is to construct a 4x4 transformation matrix for each object from its parameters.json file and apply it to the mesh. The core of my logic, especially for the rotation, is shown in the Python function below. The final matrix is composed as Transformation = Translation @ Rotation @ Scale.

Code Snippet

This function shows exactly how I am interpreting the 6drotation_normalized parameter to generate a rotation matrix:

import numpy as np

def get_transformation_matrix(params):
    """
    Builds a 4x4 transformation matrix from the parameters dictionary.
    """
    # 1. Extract data, accounting for the [[[-1.2, ...]]] nesting
    scale_vec = np.array(params['scale'])[0]
    rot_6d = np.array(params['6drotation_normalized'])[0][0]
    trans_vec = np.array(params['translation'])[0]

    # 2. Reconstruct the 3x3 rotation matrix from the 6D vector
    # Assumes the 6D vector represents the first two columns of the rotation matrix
    a1 = rot_6d[:3]
    a2 = rot_6d[3:]
    
    # Gram-Schmidt orthogonalization
    b1 = a1 / np.linalg.norm(a1)
    b2_orthogonal = a2 - np.dot(b1, a2) * b1
    b2 = b2_orthogonal / np.linalg.norm(b2_orthogonal)
    b3 = np.cross(b1, b2)
    
    # axis=1 stacks b1, b2, b3 as the COLUMNS of the rotation matrix
    rot_mat = np.stack([b1, b2, b3], axis=1)

    # 3. Build individual 4x4 transformation matrices
    S = np.diag([*scale_vec, 1.0])
    M_rot = np.eye(4)
    M_rot[:3, :3] = rot_mat
    M_trans = np.eye(4)
    M_trans[:3, 3] = trans_vec

    # 4. Combine transformations: T * R * S
    transform_matrix = M_trans @ M_rot @ S
    return transform_matrix
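
For completeness, here is roughly how I then apply the matrix to each mesh. This is a minimal sketch: it assumes trimesh is installed, the one-mesh.glb-plus-one-parameters.json-per-object layout is my own reading of your outputs, and the exported file name is just illustrative:

import json

import trimesh

with open('parameters.json') as f:
    params = json.load(f)

# Object-to-scene transform built by the function above
M = get_transformation_matrix(params)

# force='mesh' collapses the GLB scene graph into a single mesh
mesh = trimesh.load('mesh.glb', force='mesh')
mesh.apply_transform(M)
mesh.export('object_posed.glb')

With this, translation and scale come out looking right, which is why I believe the problem is isolated to how rot_mat is built above.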

Request for Clarification

Since the error appears to be purely rotational, I suspect a subtle misunderstanding in how I interpret the pose parameters. Your paper mentions that you implemented a scene reconstruction. To help resolve this, I have two questions:

  1. Reference Implementation: Would it be possible to share the script or a code snippet that demonstrates your scene reconstruction implementation? Having a reference for how you interpret the pose parameters would be the most direct way to solve this issue.

  2. Intended Workflow: If sharing the code is not possible, could you please clarify the intended workflow for reconstructing the entire scene using your outputs? Specifically, is there a problem with my overall approach?

    • Full Transformation Formula: Is my method—applying a single T @ R @ S matrix derived directly from parameters.json—the complete and correct approach? Or am I missing a crucial step, such as an additional transformation (e.g., a camera-to-world matrix) that needs to be applied?
    • Parameter Interpretation: How should 6drotation_normalized be correctly converted into a rotation matrix, and what are the underlying coordinate-system conventions (+Y up, right-handed, etc.)? I have sketched the alternative row-wise convention just after this list.
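
For reference, here is the alternative interpretation I have considered but cannot confirm: treating the two 3-vectors as the first two rows of the rotation matrix rather than its first two columns (the row-wise convention used by, e.g., PyTorch3D's rotation_6d_to_matrix). If your pipeline follows that convention, my matrix above would be the transpose, i.e. the inverse, of the intended rotation, which would explain a consistent orientation error on every object:

import numpy as np

def rot_6d_to_matrix_rowwise(rot_6d):
    """Row-wise 6D convention: the two 3-vectors are the first two
    ROWS of R, not its columns."""
    a1, a2 = rot_6d[:3], rot_6d[3:]
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - np.dot(b1, a2) * b1
    b2 = b2 / np.linalg.norm(b2)
    b3 = np.cross(b1, b2)
    # axis=0 stacks as rows, so this is the transpose of my version
    return np.stack([b1, b2, b3], axis=0)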

Any guidance you can provide on the correct way to transform the objects from their canonical space into the final scene would be incredibly helpful.

Thank you for your time and assistance.
