
Support for multiple object video tracking capabilities with SAM2.1 #668

Open · wants to merge 7 commits into master
Conversation

Buckler89

Support for multiple object video tracking capabilities with SAM2

Significant improvements have been made to the video segmentation and tracking features, including support for multi-object tracking:

  • Users can now choose between 'box' or 'point' prompts for object identification (see the sketch below).
  • Docker support has been enhanced: the configuration now reserves GPU resources and ensures the correct Docker runtime is used.
  • The model configuration and checkpoint paths have been updated for SAM2.1.
  • The segment-anything-2 repository can now be installed in any location by setting the appropriate environment variable.
  • The predict function has undergone a major refactoring for improved clarity and efficiency.
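
For context on the prompt options, here is a minimal sketch of how 'box' and 'point' prompts map onto the SAM2 video predictor. The add_new_points_or_box and init_state calls come from the segment-anything-2 repo; the surrounding wiring, frame index, and function name are assumptions for illustration, not the PR's exact code:

import numpy as np

def add_prompt(predictor, inference_state, prompt_type, obj_id, coords):
    # predictor comes from build_sam2_video_predictor(config, checkpoint);
    # inference_state comes from predictor.init_state(video_path=...).
    if prompt_type == "box":
        # coords = (x0, y0, x1, y1) in pixel coordinates
        predictor.add_new_points_or_box(
            inference_state, frame_idx=0, obj_id=obj_id,
            box=np.array(coords, dtype=np.float32),
        )
    else:  # "point": coords = (x, y)
        predictor.add_new_points_or_box(
            inference_state, frame_idx=0, obj_id=obj_id,
            points=np.array([coords], dtype=np.float32),
            labels=np.array([1], dtype=np.int32),  # 1 = positive click
        )

Each obj_id identifies one tracked object, which is what enables the multi-object tracking.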


Thomas-J-Kidd commented Nov 19, 2024

I'm trying to replicate your work! Since you're familiar with this: what is the root directory for installing sam2, and do we need to install it as sam2 or segment-anything-model-2?

Also, I seem to be getting an error about not finding sam2_hiera_large.pt. Are we not using sam2.1_hiera_large.pt?

Added a new environment variable, SEGMENT_ANYTHING_2_REPO_PATH, to the README file in the segment_anything_2_video example. This variable specifies the path to the segment-anything-2 repo and defaults to `segment-anything-2`.
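
In practice, consuming that variable might look like the following minimal sketch. The variable name and default come from the commit above; the sys.path wiring is an assumption, not necessarily the PR's exact code:

import os
import sys

# Where the segment-anything-2 checkout lives; falls back to the
# documented default of "segment-anything-2".
repo_path = os.environ.get("SEGMENT_ANYTHING_2_REPO_PATH", "segment-anything-2")

# Make the repo importable regardless of where it was cloned.
sys.path.append(os.path.abspath(repo_path))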
Buckler89 (Author) commented

Hi @Thomas-J-Kidd

I've updated the default values of the environment variables to use SAM2.1. If you'd like to use a different model, you can adjust them as needed.

Additionally, I've added documentation to the README about the SEGMENT_ANYTHING_2_REPO_PATH environment variable, which specifies the path to the "segment-anything-2" repository. You can set it to any location you prefer; personally, I like to install it outside the main repository to avoid having to add the folder to the .gitignore file.

Please pull the latest changes to ensure you're up-to-date.

Thanks for your input!

Enhanced the error handling mechanism by changing the fallback from 'context' to 'annotations' when drafts are not found. Also, added model version and score to the prediction value for better tracking and evaluation of predictions.
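
In outline, that fallback might look like this (key names inferred from the commit message and the traceback below, not copied from the diff):

task = tasks[0]
# Prefer an in-progress draft; otherwise fall back to saved annotations
# (previously the fallback was the request context).
if task.get("drafts"):
    prompt_source = task["drafts"][0]
elif task.get("annotations"):
    prompt_source = task["annotations"][0]
else:
    raise ValueError("task has neither drafts nor annotations to prompt from")

# Each prediction now carries the model version and a score (values illustrative).
prediction = {"model_version": "SAM2.1", "score": 0.95, "result": []}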

Thomas-J-Kidd commented Nov 25, 2024

I need to check out your branch...

Anyway, now I am running into an error where my tasks variable does not have a key called 'drafts'. I am trying to take a look at how the code works, but it's taking me a while, so I thought I would ask!

At the end of the day it comes down to this error:

  File "./mydir/sam-2-test-pr/label-studio-ml-backend/label_studio_ml/examples/segment_anything_2_video/model.py", line 263, in predict
    drafts = tasks[0]['drafts'][0]
             ~~~~~~~~^^^^^^^^^^
KeyError: 'drafts'

To be clear, I am also using this annotation interface:

<View>
    <Labels name="videoLabels" toName="video" allowEmpty="true">
        <Label value="bag" background="#11A39E"/>
        <Label value="board" background="#D4380D"/>
      	<Label value="hole" background="#F4320D"/>
    </Labels>

    <!-- Please specify FPS carefully, it will be used for all project videos -->
    <Video name="video" value="$video" framerate="25.0"/>
    <VideoRectangle name="box" toName="video" smart="true"/>
</View>

To make my life easier, let me ask this: how do you debug model.py?

I am having trouble setting up the logger to display the debug messages. Do you simply run model.py, or do you use label-studio-ml start ./segment_anything_2_video to debug your code?

I'm trying to add onto this and help out!

Buckler89 (Author) commented

Hi Thomas,

The error you're encountering might actually be expected behaviour. For instance, when you open a task, Label Studio sends a request to the label-studio-ml-backend with the available data. At that point, "drafts" might not yet exist. If it's the first time you're opening the task, there are no drafts yet.

Since SAM (Segment Anything Model) works with prompts (like boxes or points) as input, it can't function in this situation. The fact that it raises an error shouldn't be a concern. From what I understand, this is the intended behaviour—it was already happening before this PR. I've observed similar behaviour in other label-studio-ml-backend integrations, so I adapted to it. That said, it might be clearer if the system handled this situation differently. For now, if it raises an error, it simply means nothing happens.

To get the backend to work properly (if the error is due to what I think), you can safely ignore the error, draw a box, and assign it a label. At that point, everything should work as expected.

As for debugging, you need to run the _wsgi.py file in debug mode with the correct environment variables set.

Here's an example of my PyCharm configuration:
[screenshot: PyCharm run configuration for _wsgi.py]

And here's an example of my .env file configuration:

DEVICE=cuda
LABEL_STUDIO_API_KEY=<your-api-key>
LABEL_STUDIO_URL=http://0.0.0.0:8080
MAX_FRAMES_TO_TRACK=60
MODEL_CHECKPOINT=sam2.1_hiera_tiny.pt
MODEL_CONFIG=./configs/sam2.1/sam2.1_hiera_t.yaml
SEGMENT_ANYTHING_2_REPO_PATH=<path-to-sam2-repo>/sam2/segment-anything-2/
DEBUG=true
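
On the logger question specifically, a minimal sketch that usually surfaces DEBUG messages when running _wsgi.py directly (standard library only, nothing specific to this PR):

import logging

# Route all loggers, including label_studio_ml's, to the console at
# DEBUG level; run this before the app starts handling requests.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(name)s %(levelname)s: %(message)s",
)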

I hope this helps! Let me know if you need further clarification.


Thomas-J-Kidd commented Nov 26, 2024

Thanks for that info! I kept digging last night and found that my context is None. I'm still looking into this, and it does not seem to track my object regardless of coming back and relabeling. I am not sure what the problem is, but I am trying to find out! Hopefully I can pinpoint my error more precisely.

Edit
Seems like my videos weren't downloading correctly from my S3 bucket. Potential new bug with the get_local_path function, or something along those lines. When I inspect the video using cat, it gives me an HTML file.

from label_studio_sdk._extensions.label_studio_tools.core.utils.io import get_local_path
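
A quick sanity check for this failure mode might look like the following sketch (the task payload shape is an assumption; a real video file is binary, so an HTML error page is easy to spot):

from label_studio_sdk._extensions.label_studio_tools.core.utils.io import get_local_path

# If the S3 download failed, the "video" is often an HTML error page.
local_path = get_local_path(task["data"]["video"])
with open(local_path, "rb") as f:
    head = f.read(16)
if head.lstrip().startswith(b"<"):
    print(f"{local_path} looks like HTML, not video; check storage credentials")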

Edit 2
On uploaded videos I was able to get object tracking for one object, but with two objects my GPU ran out of memory.
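
If anyone else hits the same out-of-memory error, a generic mitigation sketch (not part of this PR; assumes a bfloat16-capable GPU) is to run inference under autocast and release cached memory between requests:

import torch

# Run the predictor in bfloat16 to roughly halve activation memory,
# then return the allocator's cached blocks to the driver.
with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    ...  # run the SAM2 video predictor over all tracked objects

torch.cuda.empty_cache()

Reducing MAX_FRAMES_TO_TRACK in the .env above also lowers peak memory per request.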
