fix: OPTIC-1420: Inner IDs are duplicated and counted wrong #6785

mcanu · 2024-12-12T14:25:27Z

PR fulfills these requirements

Commit message(s) and PR title follow the format [fix|feat|ci|chore|doc]: TICKET-ID: Short description of change made, e.g. fix: DEV-XXXX: Removed inconsistent code usage causing intermittent errors
Tests for the changes have been added/updated (for bug fixes/features)
Docs have been added/updated (for bug fixes/features)
Best efforts were made to ensure docs/code are concise and coherent (checked for spelling/grammatical errors, commented out code, debug logs etc.)
Self-reviewed and ran all changes on a local instance (for bug fixes/features)

Change has impacts in these area(s)

(check all that apply)

Product design
Backend (Database)
Backend (API)
Frontend

Describe the reason for change

The inner_id was duplicated when importing from the API. We identified the problem because a client sent two import requests almost at the same time, which caused a race condition when calculating the inner_id.
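The race can be reproduced deterministically in a few lines. This is a minimal sketch, assuming the next inner_id is derived as "current maximum plus one"; the helper names `next_inner_id` and `import_without_lock` are hypothetical and are not taken from the Label Studio code.

```python
def next_inner_id(existing_ids):
    """Return one past the current maximum inner_id (1 for an empty project)."""
    return max(existing_ids, default=0) + 1


def import_without_lock(project_ids, batch_a, batch_b):
    """Simulate two imports that both read the starting inner_id before either writes."""
    # Both requests read the same maximum: this is the race.
    start_a = next_inner_id(project_ids)
    start_b = next_inner_id(project_ids)
    # Each request then assigns consecutive ids from its own stale starting point.
    project_ids.extend(range(start_a, start_a + batch_a))
    project_ids.extend(range(start_b, start_b + batch_b))
    return project_ids


ids = import_without_lock([], batch_a=3, batch_b=2)
duplicates = sorted({i for i in ids if ids.count(i) > 1})
print(ids)         # [1, 2, 3, 1, 2]
print(duplicates)  # [1, 2]
```

Because both imports start from the same value, every id in the shorter batch collides with an id in the longer one.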

What does this fix?

Potential duplicate inner ids due to a race condition when importing from the API.

What is the new behavior?

Acquire a lock on the project before calculating the inner_id. I first tried to acquire the lock on the Tasks queryset, but that does not work when the project has no tasks yet: the lock applies only to the rows the query returns, and the first import returns none.
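The idea can be illustrated without a database. In the sketch below a `threading.Lock` stored on a hypothetical `Project` object stands in for the row lock that the PR takes with select_for_update; the class and function names are assumptions made for illustration only. The lock belongs to the project rather than to its tasks, so it exists even when the task list is empty.

```python
import threading
import time


class Project:
    def __init__(self):
        self.lock = threading.Lock()  # stands in for the row lock on the project
        self.inner_ids = []           # inner_id values of the tasks imported so far


def import_tasks(project, batch_size):
    """Assign consecutive inner_ids to a batch while holding the project lock."""
    # Hold the lock for the whole read-then-write sequence.
    with project.lock:
        start = max(project.inner_ids, default=0) + 1
        time.sleep(0.01)  # widen the window in which a race would otherwise occur
        project.inner_ids.extend(range(start, start + batch_size))


project = Project()
workers = [threading.Thread(target=import_tasks, args=(project, n)) for n in (3, 2)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(sorted(project.inner_ids))  # [1, 2, 3, 4, 5]
```

Whichever thread takes the lock first, the second one computes its starting id only after the first has written its batch, so the ids never overlap.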

Does this change affect performance?

If multiple async imports run for the same project at once, as in this case, they will be committed sequentially. This is needed to ensure inner_id consistency.

Does this PR introduce a breaking change?

(check only one)

Yes, and covered entirely by feature flag(s)
Yes, and covered partially by feature flag(s)
No
Not sure (briefly explain the situation below)

What level of testing was included in the change?

(check all that apply)

e2e
integration
unit

For testing, I wrote a small script that replicates this behavior. Two separate worker processes must also be running so that they can consume the import jobs concurrently.

```python
import json
from label_studio_sdk.client import LabelStudio

# Define the URL where Label Studio is accessible and the API key for your user account
LABEL_STUDIO_URL = ''
# API key is available at the Account & Settings > Access Tokens page in Label Studio UI
API_KEY = ''


# Connect to the Label Studio API and check the connection
ls = LabelStudio(base_url=LABEL_STUDIO_URL, api_key=API_KEY)

project_id = 7

with open('data/5400.json', 'r') as f:
    data_5400 = json.load(f)

with open('data/5500.json', 'r') as f:
    data_5500 = json.load(f)


# Submit both imports back to back so that two workers pick them up concurrently
ls.projects.import_tasks(id=project_id, request=data_5400)
ls.projects.import_tasks(id=project_id, request=data_5500)
```
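After both imports finish, the result can be checked for collisions. The helper below is a sketch, not part of the PR: `find_duplicate_inner_ids` is a hypothetical name, and it assumes the project's tasks have already been fetched as a list of dictionaries that each carry an `inner_id` key. How the tasks are fetched is left out on purpose.

```python
from collections import Counter


def find_duplicate_inner_ids(tasks):
    """Return the sorted inner_id values that occur more than once."""
    counts = Counter(task["inner_id"] for task in tasks)
    return sorted(inner_id for inner_id, n in counts.items() if n > 1)


# Hand-written sample data, only to show the shape the helper expects.
sample = [
    {"id": 10, "inner_id": 1},
    {"id": 11, "inner_id": 2},
    {"id": 12, "inner_id": 2},
]
print(find_duplicate_inner_ids(sample))  # [2]
```

An empty list from this check means every task in the project has a unique inner_id.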

Which logical domain(s) does this change affect?

Imports

netlify · 2024-12-12T14:25:46Z

✅ Deploy Preview for label-studio-docs-new-theme ready!

Name	Link
🔨 Latest commit	`7a8fc6a`
🔍 Latest deploy log	https://app.netlify.com/sites/label-studio-docs-new-theme/deploys/6762fc1e1af9d30009260f75
😎 Deploy Preview	https://deploy-preview-6785--label-studio-docs-new-theme.netlify.app

netlify · 2024-12-12T14:25:55Z

✅ Deploy Preview for heartex-docs ready!

Name	Link
🔨 Latest commit	`7a8fc6a`
🔍 Latest deploy log	https://app.netlify.com/sites/heartex-docs/deploys/6762fc1e13e1e20008f0aa05
😎 Deploy Preview	https://deploy-preview-6785--heartex-docs.netlify.app

codecov · 2024-12-12T14:36:46Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.78%. Comparing base (2e360bc) to head (7a8fc6a).
Report is 6 commits behind head on develop.

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #6785   +/-   ##
========================================
  Coverage    76.77%   76.78%           
========================================
  Files          171      171           
  Lines        14021    14023    +2     
========================================
+ Hits         10765    10767    +2     
  Misses        3256     3256

Flag	Coverage Δ
pytests	`76.78% <100.00%> (+<0.01%)`	⬆️

mcanu · 2024-12-16T17:55:24Z

/git merge develop

Successfully merged: 27 files changed, 842 insertions(+), 57 deletions(-)

Workflow run: https://github.com/HumanSignal/label-studio/actions/runs/12358429877

mcanu · 2024-12-18T15:54:50Z

/git merge develop

Successfully merged: create mode 100644 web/libs/datamanager/src/components/MainView/GridView/ImagePreview.tsx

Workflow run: https://github.com/HumanSignal/label-studio/actions/runs/12396610560

label_studio/tasks/serializers.py

Using select_for_update to grab a lock when calculating inner_id

d53ac20

github-actions bot added the fix label Dec 12, 2024

Merge branch 'develop' into 'fb-optic-1420'

48ab7bd

Workflow run: https://github.com/HumanSignal/label-studio/actions/runs/12358429877

Merge branch 'develop' into 'fb-optic-1420'

9fa41a8

Workflow run: https://github.com/HumanSignal/label-studio/actions/runs/12396610560

makseq requested changes Dec 18, 2024

label_studio/tasks/serializers.py (outdated, resolved)

Using fast_first

7a8fc6a

mcanu requested a review from makseq December 18, 2024 16:45

wesleylima approved these changes Dec 18, 2024

makseq approved these changes Dec 18, 2024

mcanu merged commit bd3498d into develop Dec 18, 2024
44 checks passed

makseq deleted the fb-optic-1420 branch December 18, 2024 23:41
