Contributor

@Shofikul-Isl4m Shofikul-Isl4m commented Jan 6, 2026

Proposed change

Resolves #3184
This PR addresses incomplete contribution heatmaps by fixing two core issues in the GitHub synchronization logic.

  1. Added Commit Syncing: I have implemented the synchronization of individual commit records during the repository sync process. This ensures that commit data is available for the contribution heatmaps.
  2. Extended Sync Window: The default look-back period for milestones, issues, and pull requests has been increased from 30 days to 365 days.

Before:
image

After:
image

Checklist

  • Required: I read and followed the contributing guidelines
  • Required: I ran make check-test locally and all tests passed
  • I used AI for code, documentation, or tests in this PR

Contributor

coderabbitai bot commented Jan 6, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Extended GitHub data sync period from 30 days to 365 days for commits, milestones, issues, and pull requests to capture more historical data.
  • Performance

    • Implemented bulk-saving optimization for GitHub commits to improve sync efficiency.

Walkthrough

Added bulk-saving of GitHub commits, extended sync horizon fallback from 30 to 365 days for commits, milestones, issues, and pull requests, introduced a chunk-size constant for commit bulk saves, added Repository.latest_commit property, and updated tests to expect the 365-day fallback.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core sync logic: backend/apps/github/common.py | Added commit bulk-save flow: compute since (use Repository.latest_commit or year_ago), map authors/committers via User.update_data, construct Commit objects with save=False, batch-save using GITHUB_COMMITS_BULK_SAVE_CHUNK_SIZE, and flush remaining commits. Replaced prior 30-day (month_ago) fallback with 365-day (year_ago) for milestones, issues, and pull requests. Added public imports for Commit and chunk-size constant. |
| Constants: backend/apps/github/constants.py | Added GITHUB_COMMITS_BULK_SAVE_CHUNK_SIZE = 50. |
| Repository model: backend/apps/github/models/repository.py | Added @property latest_commit returning the most recently created commit (ordered by created_at desc). |
| Tests: backend/tests/apps/github/common_test.py | Updated mocks to include Commit and latest_commit = None; renamed and adjusted initial_sync test to expect a 365-day fallback and updated fixture timestamps (~300/400 days) to exercise year-ago behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes both main changes: commit syncing implementation and extension of the sync window to 365 days. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the two core issues being addressed with detailed context and linked issue reference. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #3184 objectives: implements commit syncing, extends sync window to 365 days for milestones/issues/pulls, and aligns data collection with contribution heatmap requirements. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the linked issue requirements: commit syncing implementation, 365-day window extension, and supporting infrastructure changes are all within scope. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

🧹 Recent nitpick comments
backend/apps/github/common.py (2)

76-100: Commit syncing logic looks correct.

The implementation properly handles incremental syncing by using the latest local commit's created_at as the since parameter, with a 365-day fallback for initial sync. The chunked bulk saves help manage memory for large repositories.
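The choice of the since value described above can be sketched roughly as follows. This is a minimal illustration, not the code in common.py: compute_since is a hypothetical helper, and the latest commit's timestamp is passed in directly instead of being read from Repository.latest_commit.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def compute_since(latest_commit_created_at: Optional[datetime], now: datetime) -> datetime:
    """Return the lower bound for the next commit fetch (hypothetical helper)."""
    year_ago = now - timedelta(days=365)
    # Incremental sync: resume from the newest commit already stored locally.
    if latest_commit_created_at is not None:
        return latest_commit_created_at
    # Initial sync: no local commits yet, so fall back to the 365-day window.
    return year_ago


now = datetime(2026, 1, 6, tzinfo=timezone.utc)
initial_since = compute_since(None, now)
incremental_since = compute_since(datetime(2025, 12, 1, tzinfo=timezone.utc), now)
```

Passing now in as an argument keeps the helper deterministic, which also makes the fallback easy to assert in tests.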

Consider batching User lookups for high-volume scenarios.

The User.update_data calls at lines 83-84 perform individual database operations for each commit's author and committer. For repositories with many new commits, this could be optimized by collecting unique users first and bulk-processing them. However, since this is an incremental sync (typically processing only recent commits), the current approach is acceptable.
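One way to picture the suggested optimization is to collect each distinct login once before touching the database. The sketch below is an assumption-heavy illustration: commits are plain dicts, and the upsert callable stands in for User.update_data, whose real signature is not shown in this thread.

```python
from typing import Callable, Dict, Iterable, Optional


def upsert_unique_users(
    commits: Iterable[dict],
    upsert: Callable[[str], object],
) -> Dict[str, object]:
    """Upsert each distinct author/committer login once; return a login -> user map."""
    users: Dict[str, object] = {}
    for commit in commits:
        for role in ("author", "committer"):
            login: Optional[str] = commit.get(role)
            # Skip missing users and logins that were already processed.
            if login and login not in users:
                users[login] = upsert(login)
    return users


calls = []


def fake_upsert(login: str) -> str:
    """Record each call so the number of simulated database hits is visible."""
    calls.append(login)
    return f"user:{login}"


commits = [
    {"author": "alice", "committer": "alice"},
    {"author": "bob", "committer": "alice"},
    {"author": "alice", "committer": None},
]
users = upsert_unique_users(commits, fake_upsert)
```

Here three commits with five non-empty author/committer slots result in only two upsert calls.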


109-113: Reuse year_ago variable for consistency.

The year_ago variable is already defined at line 74. Reusing it here avoids duplicate calculation and improves consistency.

♻️ Suggested change
         until = (
             latest_updated_milestone.updated_at
             if (latest_updated_milestone := repository.latest_updated_milestone)
-            else timezone.now() - td(days=365)
+            else year_ago
         )

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c38bf0 and ce60e69.

📒 Files selected for processing (3)
  • backend/apps/github/common.py
  • backend/apps/github/models/repository.py
  • backend/tests/apps/github/common_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/tests/apps/github/common_test.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.

Applied to files:

  • backend/apps/github/models/repository.py
  • backend/apps/github/common.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.

Applied to files:

  • backend/apps/github/models/repository.py
  • backend/apps/github/common.py
🧬 Code graph analysis (2)
backend/apps/github/models/repository.py (1)
backend/tests/apps/owasp/management/commands/owasp_aggregate_contributions_test.py (1)
  • order_by (32-34)
backend/apps/github/common.py (3)
backend/apps/github/models/commit.py (1)
  • Commit (11-110)
backend/apps/github/models/repository.py (2)
  • latest_commit (117-124)
  • update_data (320-362)
backend/apps/common/models.py (1)
  • bulk_save (19-34)
🔇 Additional comments (4)
backend/apps/github/models/repository.py (1)

116-124: LGTM!

The latest_commit property follows the established pattern used by other similar properties in this model (e.g., latest_pull_request, latest_updated_issue). The implementation correctly orders commits by created_at descending and returns the first result.
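The behavior described here (newest commit by created_at, or nothing when the repository has no commits) can be illustrated with an in-memory stand-in. The Commit and Repository classes below are simplified dataclasses invented for this sketch, not the Django models from the PR, which would express the same idea through the ORM.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Commit:
    sha: str
    created_at: datetime


@dataclass
class Repository:
    commits: List[Commit] = field(default_factory=list)

    @property
    def latest_commit(self) -> Optional[Commit]:
        """Return the most recently created commit, or None if there are none."""
        return max(self.commits, key=lambda c: c.created_at, default=None)


empty_repo = Repository()
repo = Repository(
    commits=[
        Commit("a1", datetime(2025, 6, 1)),
        Commit("b2", datetime(2025, 12, 1)),
    ]
)
```

Returning None for an empty repository is what lets the sync code fall back to the 365-day window on an initial sync.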

backend/apps/github/common.py (3)

15-17: LGTM!

The new imports for GITHUB_COMMITS_BULK_SAVE_CHUNK_SIZE and Commit are correctly added to support the commit syncing functionality.


142-146: LGTM!

The issues fallback correctly uses the year_ago variable, aligning with the 365-day sync window requirement.


193-197: LGTM!

The pull requests fallback correctly uses the year_ago variable, consistently aligning with the 365-day sync window.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI Agents
In @backend/apps/github/management/commands/github_sync_user.py:
- Line 47: The help text for the end date argument in the management command
contains an extra space before "Defaults"; locate the add_argument call (in the
Command class / add_arguments method) where help="End date (YYYY-MM-DD). 
Defaults to today." is defined and remove the extra space so it reads "End date
(YYYY-MM-DD). Defaults to today.".
- Around line 207-209: Fix the comment formatting above the default date logic:
change the comment that precedes default_end/default_start from "#Default to
last 365 days" to "# Default to last 365 days" (i.e., add a space after the `#`)
so it follows standard Python comment style.

In @backend/apps/owasp/management/commands/owasp_create_member_snapshot.py:
- Line 44: Fix the extra space in the help text for the start date argument:
locate the add_argument call that defines the start date help (e.g., the
'--start-date' or 'start_date' argument in the Command class in
owasp_create_member_snapshot.py) and change the string from "Start date
(YYYY-MM-DD). Defaults to  365 days ago." to "Start date (YYYY-MM-DD). Defaults
to 365 days ago." so there is only a single space before "365".
🧹 Nitpick comments (1)
backend/apps/owasp/management/commands/owasp_create_member_snapshot.py (1)

90-90: Optional: Remove redundant local import.

Now that timedelta is imported at the module level (line 5), the local import on line 90 within generate_heatmap_data is redundant. Similarly, line 253 in generate_communication_heatmap_data also has a redundant import.

🔎 Proposed cleanup

Remove the local import on line 90:

-        from datetime import timedelta
-
         # Initialize all dates in range with 0

And also remove the local import on line 253 (not shown in this segment but present in the file):

-        from datetime import timedelta
-
         # Initialize all dates in range with 0
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 4d694a2 and 835fdc5.

📒 Files selected for processing (2)
  • backend/apps/github/management/commands/github_sync_user.py
  • backend/apps/owasp/management/commands/owasp_create_member_snapshot.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-12-18T05:39:42.678Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:40-40
Timestamp: 2025-12-18T05:39:42.678Z
Learning: In Django management commands, prefer using self.stdout.write(...) over print(...) for user-facing stdout output. This aligns with Django conventions and improves testability. When emitting messages, consider using self.stdout.write and, for styled messages, use self.style.SUCCESS/ERROR as appropriate to maintain consistent command output formatting. Apply this guideline to all Python files within any project's management/commands directory.

Applied to files:

  • backend/apps/owasp/management/commands/owasp_create_member_snapshot.py
  • backend/apps/github/management/commands/github_sync_user.py
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.

Applied to files:

  • backend/apps/owasp/management/commands/owasp_create_member_snapshot.py
  • backend/apps/github/management/commands/github_sync_user.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.

Applied to files:

  • backend/apps/owasp/management/commands/owasp_create_member_snapshot.py
  • backend/apps/github/management/commands/github_sync_user.py
📚 Learning: 2026-01-01T18:57:05.007Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/video.py:189-215
Timestamp: 2026-01-01T18:57:05.007Z
Learning: In the OWASP backend area, maintain the established pattern: when dealing with sponsors, include all entries from Sponsor.objects.all() (including NOT_SPONSOR) and perform in-memory sorting using the same criteria/pattern used by the GraphQL sponsor query implemented in backend/apps/owasp/api/internal/queries/sponsor.py. Apply this behavior consistently to files in backend/apps/owasp (not just video.py), and ensure code paths that render sponsor lists follow this in-code sorting approach rather than pre-filtering NOT_SPONSOR entries before sorting.

Applied to files:

  • backend/apps/owasp/management/commands/owasp_create_member_snapshot.py
🔇 Additional comments (5)
backend/apps/owasp/management/commands/owasp_create_member_snapshot.py (3)

5-5: LGTM: Import addition supports the new rolling window logic.

The addition of timedelta to the top-level imports is necessary for calculating the 365-day default window and aligns with the PR objectives.


49-49: LGTM: Help text accurately reflects the new default.

The help text correctly describes the new behavior where --end-at defaults to today.


323-325: LGTM: Rolling 365-day window implemented correctly.

The default date logic now uses a dynamic 365-day window ending today, replacing the previous fixed calendar range. This aligns with the PR objectives and ensures consistent contribution views across the platform.
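A rolling default window of this kind might look like the sketch below. resolve_date_range is a hypothetical helper, and the assumption that an explicit argument simply overrides the matching default is mine; the command's actual argument handling is not shown in this review.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_date_range(
    start_at: Optional[date],
    end_at: Optional[date],
    today: date,
) -> Tuple[date, date]:
    """Apply rolling defaults: end today, start 365 days before today."""
    default_end = today
    default_start = default_end - timedelta(days=365)
    start = start_at if start_at is not None else default_start
    end = end_at if end_at is not None else default_end
    return start, end


today = date(2026, 1, 6)
default_range = resolve_date_range(None, None, today)
custom_range = resolve_date_range(date(2025, 10, 1), None, today)
```

Because the defaults are derived from today on every run, the window moves forward with the calendar instead of being pinned to fixed dates.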

backend/apps/github/management/commands/github_sync_user.py (2)

4-4: LGTM: Import addition supports the new rolling window logic.

The addition of timedelta to the imports is necessary for calculating the 365-day default window and mirrors the change in the related owasp_create_member_snapshot.py command.


42-42: LGTM: Help text accurately reflects the new default.

The help text correctly describes the new behavior where --start-at defaults to 365 days ago.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/apps/owasp/management/commands/owasp_create_member_snapshot.py (1)

90-90: Remove redundant local imports of timedelta.

Since timedelta is already imported at the module level (line 5), the local imports within generate_heatmap_data and generate_communication_heatmap_data are unnecessary and can be removed for cleaner code.

🔎 Proposed fix

Remove the local import statements on lines 90 and 253:

 def generate_heatmap_data(self, commits, pull_requests, issues, start_at, end_at) -> dict:
     """Generate heatmap data from contributions within the date range.
     ...
     """
-    from datetime import timedelta
-
     # Initialize all dates in range with 0
     heatmap_data: dict[str, int] = {}
 def generate_communication_heatmap_data(self, messages, start_at, end_at) -> dict:
     """Generate communication heatmap data from public Slack channels within date range.
     ...
     """
-    from datetime import timedelta
-
     # Initialize all dates in range with 0
     heatmap_data: dict[str, int] = {}

Also applies to: 253-253

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 835fdc5 and 7ad5da7.

📒 Files selected for processing (3)
  • backend/apps/github/management/commands/github_sync_user.py
  • backend/apps/owasp/management/commands/owasp_create_member_snapshot.py
  • backend/tests/apps/github/management/commands/github_sync_user_test.py
🔇 Additional comments (7)
backend/apps/owasp/management/commands/owasp_create_member_snapshot.py (3)

5-5: LGTM! Import statement correctly updated.

The timedelta import has been properly added to support the new 365-day rolling window logic.


44-44: Help text accurately reflects the new defaults.

The updated help descriptions correctly document the 365-day rolling window behavior.

Also applies to: 49-49


323-325: Rolling 365-day window implemented correctly.

The default date logic properly calculates a 365-day period ending today, aligning with the PR objectives to replace the fixed calendar range with a dynamic rolling window.

backend/tests/apps/github/management/commands/github_sync_user_test.py (1)

158-163: Test assertions correctly updated to reflect new defaults.

The test properly verifies that the command's help text documents the 365-day rolling window behavior. Good coverage of the argument documentation.

backend/apps/github/management/commands/github_sync_user.py (3)

4-4: LGTM! Import statement correctly updated.

The timedelta import has been properly added to enable the new 365-day rolling window logic.


42-42: Help text accurately reflects the new defaults.

The updated help descriptions correctly document the 365-day rolling window behavior for both start and end dates.

Also applies to: 47-47


207-209: Rolling 365-day window implemented correctly.

The default date logic properly calculates a 365-day period ending today, successfully replacing the fixed Jan 1–Oct 1 calendar range with a dynamic rolling window that aligns with the aggregation logic used elsewhere in the platform.

coderabbitai bot previously approved these changes Jan 6, 2026
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/apps/github/common.py (1)

79-83: Consider consolidating the 365-day threshold calculation.

The 365-day threshold is computed twice: inline at line 82 for milestones and again at line 105 as year_ago for issues/PRs. You could define year_ago earlier and reuse it for all three, reducing duplication:

♻️ Suggested consolidation
     if not repository.is_archived:
+        year_ago = timezone.now() - td(days=365)
+
         # GitHub repository milestones.
         kwargs = {
             "direction": "desc",
             "sort": "updated",
             "state": "all",
         }

         until = (
             latest_updated_milestone.updated_at
             if (latest_updated_milestone := repository.latest_updated_milestone)
-            else timezone.now() - td(days=365)
+            else year_ago
         )

         # ... milestone sync loop ...

         # GitHub repository issues.
         project_track_issues = repository.project.track_issues if repository.project else True
-        year_ago = timezone.now() - td(days=365)

Also applies to: 105-105

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 835fdc5 and 8b51983.

📒 Files selected for processing (2)
  • backend/apps/github/common.py
  • backend/tests/apps/github/common_test.py
🧬 Code graph analysis (1)
backend/tests/apps/github/common_test.py (3)
backend/tests/apps/github/models/repository_test.py (1)
  • mock_gh_repository (31-48)
backend/tests/apps/github/management/commands/github_sync_user_test.py (1)
  • mock_repo (60-62)
backend/apps/github/models/repository.py (1)
  • latest_updated_issue (137-139)
🔇 Additional comments (4)
backend/tests/apps/github/common_test.py (1)

236-253: LGTM!

The test correctly validates the 365-day fallback behavior for initial sync:

  • 300 days is within the 365-day window → synced
  • 400 days is outside the 365-day window → skipped

The boundary values appropriately exercise the new threshold.
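The 300/400-day boundary check can be reproduced with a small predicate. within_sync_window is a hypothetical name used only for this sketch; the real sync loop compares item timestamps against its own cutoff variable.

```python
from datetime import datetime, timedelta, timezone


def within_sync_window(updated_at: datetime, now: datetime, days: int = 365) -> bool:
    """Return True if an item was updated inside the look-back window."""
    return updated_at >= now - timedelta(days=days)


now = datetime(2026, 1, 12, tzinfo=timezone.utc)
recent = now - timedelta(days=300)
stale = now - timedelta(days=400)

print(within_sync_window(recent, now))           # True: inside the 365-day window
print(within_sync_window(stale, now))            # False: outside the 365-day window
print(within_sync_window(recent, now, days=30))  # False: the old 30-day fallback would skip it
```

The last call shows why the fixture timestamps had to change: a 300-day-old item distinguishes the new 365-day fallback from the previous 30-day one.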

backend/apps/github/common.py (3)

79-83: LGTM!

The milestones fallback correctly uses a 365-day window for initial sync, consistent with the PR objectives.


105-117: LGTM!

Good approach defining year_ago once and reusing it for both issues and pull requests fallbacks. The naming clearly conveys the 365-day window intent.


164-168: LGTM!

Pull request sync correctly uses the shared year_ago fallback, maintaining consistency with the issues sync window.

coderabbitai bot previously approved these changes Jan 12, 2026
@Shofikul-Isl4m Shofikul-Isl4m marked this pull request as draft January 12, 2026 15:11
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
backend/apps/github/constants.py (1)

14-14: Consider grouping with other GITHUB_ constants.

The new constant follows the naming convention but is placed after the OWASP_ constants. For better organization, consider moving it near the other GITHUB_* constants (lines 5-9).

Suggested placement
 GITHUB_ACTIONS_USER_LOGIN = "actions-user"
 GITHUB_GHOST_USER_LOGIN = "ghost"
 GITHUB_ITEMS_PER_PAGE = 100
+GITHUB_COMMITS_BULK_SAVE_CHUNK_SIZE = 50
 GITHUB_REPOSITORY_RE = re.compile("^https://github.com/([^/]+)/([^/]+)(/.*)?$")
 GITHUB_USER_RE = re.compile("^https://github.com/([^/]+)/?$")

 OWASP_FOUNDATION_LOGIN = "OWASPFoundation"
 OWASP_GITHUB_IO = "owasp.github.io"
 OWASP_LOGIN = "owasp"
-GITHUB_COMMITS_BULK_SAVE_CHUNK_SIZE = 50
backend/apps/github/common.py (1)

110-114: Use year_ago variable for consistency.

Line 113 recalculates timezone.now() - td(days=365) instead of reusing the year_ago variable defined on line 74. For consistency with lines 146 and 197, and to avoid potential timing drift, use the existing variable.

Suggested fix
         until = (
             latest_updated_milestone.updated_at
             if (latest_updated_milestone := repository.latest_updated_milestone)
-            else timezone.now() - td(days=365)
+            else year_ago
         )
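The timing-drift point can be illustrated by computing the cutoff once and handing the same value to every fallback. fallback_cutoffs is a hypothetical helper for this sketch only; in the PR the shared value is the year_ago variable.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict


def fallback_cutoffs(now: datetime) -> Dict[str, datetime]:
    """Derive one 365-day cutoff and share it across all entity types."""
    year_ago = now - timedelta(days=365)
    return {
        "milestones": year_ago,
        "issues": year_ago,
        "pull_requests": year_ago,
    }


cutoffs = fallback_cutoffs(datetime.now(timezone.utc))
# A single shared value means the three fallbacks cannot disagree,
# which separate timezone.now() calls could do by a few microseconds.
distinct_cutoffs = set(cutoffs.values())
```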
backend/tests/apps/github/common_test.py (1)

237-254: Test correctly validates 365-day fallback for issues, but consider adding commit sync coverage.

The test values (300 and 400 days) are well-chosen boundary values. However, the new commit bulk-save logic (lines 76-101 in common.py) lacks test coverage for:

  • Verifying Commit.bulk_save is called
  • Validating chunk-size batching behavior
  • Testing incremental sync when a latest_commit exists

Would you like me to help generate additional test cases for the commit syncing functionality?
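A rough idea of what such tests could check is sketched below with unittest.mock. save_in_chunks is a hypothetical stand-in for the batching loop in sync_repository, and the bulk_save callable stands in for Commit.bulk_save; neither reflects the project's actual signatures.

```python
from unittest.mock import Mock


def save_in_chunks(items, bulk_save, chunk_size):
    """Hypothetical stand-in for the commit batching loop."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) >= chunk_size:
            bulk_save(batch)
            batch = []  # start a new list so recorded call arguments stay intact
    # Flush whatever is left after the loop.
    if batch:
        bulk_save(batch)


def test_bulk_save_is_called_per_chunk():
    bulk_save = Mock()
    save_in_chunks(list(range(120)), bulk_save, chunk_size=50)
    assert bulk_save.call_count == 3
    sizes = [len(call.args[0]) for call in bulk_save.call_args_list]
    assert sizes == [50, 50, 20]


def test_no_commits_means_no_save():
    bulk_save = Mock()
    save_in_chunks([], bulk_save, chunk_size=50)
    bulk_save.assert_not_called()


test_bulk_save_is_called_per_chunk()
test_no_commits_means_no_save()
```

The chunk size of 50 mirrors GITHUB_COMMITS_BULK_SAVE_CHUNK_SIZE from this PR; a test against the real code would patch Commit.bulk_save rather than pass a callable in.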

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 8b51983 and 2c38bf0.

📒 Files selected for processing (3)
  • backend/apps/github/common.py
  • backend/apps/github/constants.py
  • backend/tests/apps/github/common_test.py
🧬 Code graph analysis (2)
backend/tests/apps/github/common_test.py (3)
backend/tests/apps/github/models/repository_test.py (1)
  • mock_gh_repository (31-48)
backend/tests/apps/github/management/commands/github_sync_user_test.py (1)
  • mock_repo (60-62)
backend/apps/github/models/repository.py (1)
  • latest_updated_issue (137-139)
backend/apps/github/common.py (1)
backend/apps/github/models/commit.py (1)
  • Commit (11-110)
🔇 Additional comments (5)
backend/apps/github/common.py (4)

15-17: LGTM!

The new imports for GITHUB_COMMITS_BULK_SAVE_CHUNK_SIZE and Commit are correctly added to support the bulk commit syncing functionality.


74-101: LGTM! Well-structured bulk commit syncing.

The implementation correctly:

  • Uses the 365-day fallback for initial sync when no commits exist
  • Applies incremental sync based on latest commit timestamp
  • Batches saves to avoid memory issues with large repositories
  • Flushes remaining commits after the loop

143-147: LGTM!

The 365-day fallback for issues correctly uses the year_ago variable defined earlier.


194-198: LGTM!

The 365-day fallback for pull requests correctly uses the year_ago variable, consistent with the issues fallback.

backend/tests/apps/github/common_test.py (1)

18-18: LGTM!

Adding the Commit mock is necessary to support the new commit syncing logic in sync_repository.

coderabbitai bot previously approved these changes Jan 13, 2026
@Shofikul-Isl4m Shofikul-Isl4m marked this pull request as ready for review January 13, 2026 12:18
@Shofikul-Isl4m Shofikul-Isl4m changed the title from "Update sync process to sync all relevant contributions for the last 365 days." to "implement commit syncing and extend sync window to 365 days" Jan 13, 2026
Contributor Author

Shofikul-Isl4m commented Jan 13, 2026

@arkid15r I have updated the PR with the requested changes and implemented the commit syncing logic. Could you please take another look?

Development

Successfully merging this pull request may close these issues.

Update data sync to include all contributions within last 365 days
