This repository has been archived by the owner on Dec 19, 2024. It is now read-only.

Camera intrinsic matrix (3 by 3) has only values on the main diagonal, one of which is negative? #221

Open
sarimmehdi opened this issue Oct 6, 2022 · 0 comments
Labels
question Further information is requested


@sarimmehdi

I recorded a simulation by moving my camera around and viewing a bunch of static objects. The camera intrinsic parameters were recorded as camera_intrinsic, and the extrinsic parameters as the translation and rotation inside the ego field of the capture JSON (please correct me if this is wrong).

Now, I am experimenting with some triangulation methods (localizing objects using their 2D bounding box and the camera's projection matrix). Below is how I set up everything to get the projection matrix:

```python
import numpy as np
from scipy.spatial.transform import Rotation as Rot

# 3x3 intrinsic matrix from the capture JSON
int_mat = np.array(capture.sensor.camera_intrinsic)

# Extrinsics from the ego pose: quaternion (x, y, z, w) and translation
r = Rot.from_quat(capture.ego.rotation)
rot_mat = r.as_matrix()
t = np.array([capture.ego.translation]).transpose()  # column vector, shape (3, 1)

# 3x4 extrinsic matrix [R | t], then the 3x4 projection matrix
ext_mat = np.hstack((rot_mat, t))
proj_mat = int_mat @ ext_mat
```
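One thing worth double-checking in a setup like the one above: if ego.rotation and ego.translation describe the camera's pose in the world (camera-to-world), then the extrinsic matrix used for projection has to be the inverse of that pose (world-to-camera). A minimal sketch of that inversion, using a plain-numpy quaternion conversion so it is self-contained (this is a generic computer-vision construction, not something confirmed by the Unity Perception docs, and it ignores any left-handed/right-handed coordinate conversion Unity data may additionally require):

```python
import numpy as np

def quat_to_matrix(x, y, z, w):
    """Rotation matrix from a unit quaternion in (x, y, z, w) order."""
    n = np.sqrt(x*x + y*y + z*z + w*w)
    x, y, z, w = x/n, y/n, z/n, w/n
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

def world_to_camera(rotation_xyzw, translation):
    """Invert a camera-to-world pose to get the 3x4 world-to-camera [R | t]."""
    R_c2w = quat_to_matrix(*rotation_xyzw)
    t_c2w = np.asarray(translation, dtype=float).reshape(3, 1)
    R = R_c2w.T            # inverse rotation
    t = -R_c2w.T @ t_c2w   # inverse translation
    return np.hstack((R, t))
```

With this, the projection matrix would be int_mat @ world_to_camera(capture.ego.rotation, capture.ego.translation); a quick sanity check is that the camera's own world position must map to the origin of camera coordinates.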

When I fed the projection matrix and 2D bounding box into my custom algorithm, I noticed the triangulation was well off. It was then that I noticed that the intrinsic matrix only has values on the main diagonal, and one of them is negative. I come from a computer vision background, not a computer graphics one. Can anyone here guide me on how to convert the intrinsic matrix provided by Unity Perception into a standard one with fx, fy, cx, cy, and skew?
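For what it's worth, the diagonal looks like the upper-left of an OpenGL-style clip-space projection matrix rather than a pixel-space pinhole matrix: 1.73205 ≈ cot(60°/2), consistent with a 60° vertical FOV, and -1.0006 matches -(far+near)/(far-near) with Unity's default clip planes (near 0.3, far 1000). Assuming that reading is right, the conversion to pixel intrinsics would only need the rendered image size (the width/height below are assumptions, not values from the capture):

```python
import math

def gl_to_pinhole(m00, m11, width, height):
    """Convert an OpenGL-style projection diagonal to pixel-space intrinsics.

    Assumes m00 = cot(fovy/2) / aspect and m11 = cot(fovy/2), with the
    principal point at the image centre and zero skew.
    """
    fx = m00 * width / 2.0
    fy = m11 * height / 2.0
    cx = width / 2.0
    cy = height / 2.0
    return fx, fy, cx, cy

# Vertical field of view implied by the m11 entry in the capture JSON:
fovy = 2.0 * math.degrees(math.atan(1.0 / 1.73205078))  # ~60 degrees

# Depth entry implied by Unity's default clip planes (near 0.3, far 1000):
m22 = -(1000.0 + 0.3) / (1000.0 - 0.3)  # ~ -1.0006
```

If this interpretation holds, the matrix is a graphics projection and the negative entry is just the depth-remapping term, not a broken focal length.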

I looked at the 3D Ground Truth Bounding Boxes section of the Perception_Statistics notebook, but the authors seem to treat the intrinsic matrix as the projection matrix. (Isn't the projection matrix supposed to be 3 by 4, being the product of the 3 by 3 intrinsic matrix and the 3 by 4 extrinsic matrix?)

Furthermore, I also had a look at a similar question, but the accepted answer there seems to assume the projection matrix is already provided, while another answer tries to use the FOV, which I don't see anywhere in the data I collected through Unity Perception.
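Independent of the Unity data, the projection pipeline itself can be sanity-checked with a hand-built pinhole camera: a point on the optical axis must land exactly on the principal point. The K matrix, pose, and test points below are made-up illustration values, not from the capture:

```python
import numpy as np

def project(proj_mat, point_world):
    """Project a 3D world point with a 3x4 projection matrix to pixel coords."""
    p = proj_mat @ np.append(np.asarray(point_world, dtype=float), 1.0)
    return p[:2] / p[2]  # perspective divide

# Hypothetical camera: identity pose, fx = fy = 500, principal point (320, 240)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
P = K @ np.hstack((np.eye(3), np.zeros((3, 1))))

# A point on the optical axis should project to the image centre
uv = project(P, [0.0, 0.0, 2.0])
```

Running the same kind of check with the matrix assembled from the Unity capture would quickly reveal whether the intrinsics or the extrinsics are the part that is off.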

For reference, below are the two relevant parts of the capture JSON file I am using:

"sensor": {
        "sensor_id": "7fcdda27-3029-4bb9-83f3-9d3eac23a1a1",
        "ego_id": "6a1ebd8b-1417-49a0-befc-8892801a9aa3",
        "modality": "camera",
        "translation": [
          0.0,
          0.0,
          0.0
        ],
        "rotation": [
          0.0,
          0.0,
          0.0,
          1.00000012
        ],
        "camera_intrinsic": [
          [
            0.705989,
            0.0,
            0.0
          ],
          [
            0.0,
            1.73205078,
            0.0
          ],
          [
            0.0,
            0.0,
            -1.0006001
          ]
        ]
      },
      "ego": {
        "ego_id": "6a1ebd8b-1417-49a0-befc-8892801a9aa3",
        "translation": [
          2.32,
          1.203,
          2.378
        ],
        "rotation": [
          -0.0229570474,
          0.976061463,
          -0.173389584,
          -0.129279226
        ],
        "velocity": null,
        "acceleration": null
      }