-
-
Notifications
You must be signed in to change notification settings - Fork 423
Simplify regex to reduce Sonar complexity while preserving behavior #3349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughRewrote the get_leaders function in backend/apps/owasp/models/common.py: removed an unused itertools import and replaced a single complex regex approach with per-line processing using separate regexes for bracketed and starred entries plus post-processing to trim parentheticals and whitespace. No public signatures changed. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
🧹 Recent nitpick comments
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (6)📓 Common learnings📚 Learning: 2025-09-06T19:03:01.985ZApplied to files:
📚 Learning: 2025-09-06T12:48:44.916ZApplied to files:
📚 Learning: 2025-12-31T05:17:39.659ZApplied to files:
📚 Learning: 2026-01-01T17:48:23.963ZApplied to files:
📚 Learning: 2026-01-01T18:57:05.007ZApplied to files:
🧬 Code graph analysis (1)backend/apps/owasp/models/common.py (1)
🔇 Additional comments (1)
✏️ Tip: You can disable this entire section by setting Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR simplifies a complex regex pattern used to extract markdown list names from content, reducing Sonar cognitive complexity while maintaining the same functional behavior. The change improves code maintainability by making the regex more readable and removing optional capture group logic.
Changes:
- Simplified the regex pattern from
r"[-*]\s*\[\s*([^(]+?)\s*(?:\([^)]*\))?\]|\*\s*([\w\s]+)"tor"[-*]\s*\[\s*([^\]]+?)\s*\]|\*\s+([\w\s]+)" - Moved parenthetical removal logic outside the regex using
re.sub(r"\s*\([^)]*\)\s*$", "", name).strip()
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
5467b72 to
b290bd7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/apps/owasp/models/common.py`:
- Line 203: The regex change in names.extend(...) replaced "\*\s*" with "\*\s+",
which breaks matching list items like "*Name" that have no space after the
asterisk; restore the original behavior by changing the pattern back to use
"\*\s*" (or if the intent is to require a space, add a clear comment/docstring
near the function explaining that list items must have a space after the
asterisk), and update any tests accordingly; locate the call to
names.extend(re.findall(...)) in common.py and revert the regex or add the
documentation note to the surrounding function/class.
♻️ Duplicate comments (1)
backend/apps/owasp/models/common.py (1)
202-202: Mid-string parentheticals will be retained, changing behavior.The pattern
[^\]]+captures everything except], including parentheses. For input like[Name (Title) (Role)]:
- This captures
Name (Title) (Role)- The
re.subon line 206 only removes the trailing(Role), leavingName (Title)- The original
[^(]+?stopped at the first(, capturing onlyNameTo preserve the original behavior, exclude
(from the character class:Proposed fix
- names.extend(re.findall(r"[-*]\s*\[\s*([^\]]+)\s*\]", stripped_line)) + names.extend(re.findall(r"[-*]\s*\[\s*([^(\]]+)", stripped_line))
🧹 Nitpick comments (1)
backend/apps/owasp/models/common.py (1)
205-207: Consider:re.subis a no-op for star-prefixed names.Names from the second pattern (
[\w\s]+) cannot contain parentheses, so applyingre.subto strip trailing parentheticals is unnecessary overhead for those entries. This is a minor efficiency consideration and could be addressed by tracking which pattern matched, though the current approach is simpler and works correctly.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/apps/owasp/models/common.py
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2223
File: backend/apps/owasp/models/common.py:213-232
Timestamp: 2025-09-06T19:03:01.985Z
Learning: In the OWASP Nest project, the get_leaders_emails() method in RepositoryBasedEntityModel is designed to only capture leaders with mailto: links from leaders.md files, intentionally ignoring plain text names without email addresses. The current regex implementation works correctly for the intended behavior as validated by comprehensive test cases.
📚 Learning: 2025-09-06T19:03:01.985Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2223
File: backend/apps/owasp/models/common.py:213-232
Timestamp: 2025-09-06T19:03:01.985Z
Learning: In the OWASP Nest project, the get_leaders_emails() method in RepositoryBasedEntityModel is designed to only capture leaders with mailto: links from leaders.md files, intentionally ignoring plain text names without email addresses. The current regex implementation works correctly for the intended behavior as validated by comprehensive test cases.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2025-09-06T19:28:14.297Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2223
File: backend/apps/owasp/models/entity_member.py:50-56
Timestamp: 2025-09-06T19:28:14.297Z
Learning: In the OWASP/Nest project, when migrating scraper logic to GitHub .md file parsing, the sync_leaders method uses member_name as the primary identifier for finding and updating existing EntityMember records, not member_email. This approach is chosen because names are more stable identifiers in markdown files, while emails might be added/updated over time.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2026-01-01T18:57:05.007Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/video.py:189-215
Timestamp: 2026-01-01T18:57:05.007Z
Learning: In the OWASP backend area, maintain the established pattern: when dealing with sponsors, include all entries from Sponsor.objects.all() (including NOT_SPONSOR) and perform in-memory sorting using the same criteria/pattern used by the GraphQL sponsor query implemented in backend/apps/owasp/api/internal/queries/sponsor.py. Apply this behavior consistently to files in backend/apps/owasp (not just video.py), and ensure code paths that render sponsor lists follow this in-code sorting approach rather than pre-filtering NOT_SPONSOR entries before sorting.
Applied to files:
backend/apps/owasp/models/common.py
🧬 Code graph analysis (1)
backend/apps/owasp/models/common.py (1)
backend/apps/owasp/api/internal/nodes/common.py (1)
leaders(19-21)
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
8623dfe to
79b2c28
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/apps/owasp/models/common.py`:
- Line 347: Remove the unnecessary explicit "return None" statement (the lone
"return None" in backend/apps/owasp/models/common.py) so the function relies on
Python's implicit None return; locate the function that contains the "return
None" line and delete that line, leaving the rest of the function logic
unchanged.
♻️ Duplicate comments (1)
backend/apps/owasp/models/common.py (1)
200-206: Behavioral changes from previous review remain unaddressed.Two issues flagged in prior reviews still apply:
Line 202: Using
[^\]]+instead of[^(]+?changes behavior for entries with multiple parentheticals like[Name (Title) (Role)]. The new pattern captures the entire content and only strips the trailing parenthetical, leavingName (Title)instead ofName.Line 203: Using
\*\s+(one or more spaces) instead of the original\*\s*(zero or more) breaks matching for entries like*Namewith no space after the asterisk.Please address these or confirm the behavioral changes are intentional.
🧹 Nitpick comments (2)
backend/apps/owasp/models/common.py (2)
251-253: Consider using@staticmethodinstead of the no-op assignment.The method doesn't use instance state. A
@staticmethoddecorator would be cleaner than the_ = selfworkaround, though both achieve the linter-silencing goal.Suggested refactor
+ `@staticmethod` - def get_related_url(self, url, exclude_domains=(), include_domains=()) -> str | None: + def get_related_url(url, exclude_domains=(), include_domains=()) -> str | None: """Get OWASP entity related URL.""" - _ = selfNote: Converting to
@staticmethodwould require updating all call sites fromself.get_related_url(...)toClassName.get_related_url(...)or justget_related_url(...). Verify callers before applying.
306-308: Same observation: method doesn't use instance state.
parse_tagscould also be a@staticmethodfor the same reasons asget_related_url.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/apps/owasp/models/common.py
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2223
File: backend/apps/owasp/models/common.py:213-232
Timestamp: 2025-09-06T19:03:01.985Z
Learning: In the OWASP Nest project, the get_leaders_emails() method in RepositoryBasedEntityModel is designed to only capture leaders with mailto: links from leaders.md files, intentionally ignoring plain text names without email addresses. The current regex implementation works correctly for the intended behavior as validated by comprehensive test cases.
📚 Learning: 2025-09-06T19:03:01.985Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2223
File: backend/apps/owasp/models/common.py:213-232
Timestamp: 2025-09-06T19:03:01.985Z
Learning: In the OWASP Nest project, the get_leaders_emails() method in RepositoryBasedEntityModel is designed to only capture leaders with mailto: links from leaders.md files, intentionally ignoring plain text names without email addresses. The current regex implementation works correctly for the intended behavior as validated by comprehensive test cases.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2025-09-06T19:28:14.297Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2223
File: backend/apps/owasp/models/entity_member.py:50-56
Timestamp: 2025-09-06T19:28:14.297Z
Learning: In the OWASP/Nest project, when migrating scraper logic to GitHub .md file parsing, the sync_leaders method uses member_name as the primary identifier for finding and updating existing EntityMember records, not member_email. This approach is chosen because names are more stable identifiers in markdown files, while emails might be added/updated over time.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2025-09-06T12:48:44.916Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2223
File: backend/apps/owasp/models/common.py:0-0
Timestamp: 2025-09-06T12:48:44.916Z
Learning: When migrating scraper logic to GitHub .md file parsing in the OWASP project, the new implementation should maintain the same case-sensitivity behavior as the original scraper logic, even if case-insensitive matching might seem more robust.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2025-11-23T11:37:26.253Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2606
File: backend/apps/api/rest/v0/project.py:43-48
Timestamp: 2025-11-23T11:37:26.253Z
Learning: In the OWASP Nest backend, `entity_leaders` is a `property` method defined in `RepositoryBasedEntityModel` (backend/apps/owasp/models/common.py) that returns a dynamically constructed QuerySet. It cannot be prefetched using standard `prefetch_related()` because Django's prefetch mechanism only works on model fields and relations, not property methods.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.
Applied to files:
backend/apps/owasp/models/common.py
📚 Learning: 2026-01-01T18:57:05.007Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/video.py:189-215
Timestamp: 2026-01-01T18:57:05.007Z
Learning: In the OWASP backend area, maintain the established pattern: when dealing with sponsors, include all entries from Sponsor.objects.all() (including NOT_SPONSOR) and perform in-memory sorting using the same criteria/pattern used by the GraphQL sponsor query implemented in backend/apps/owasp/api/internal/queries/sponsor.py. Apply this behavior consistently to files in backend/apps/owasp (not just video.py), and ensure code paths that render sponsor lists follow this in-code sorting approach rather than pre-filtering NOT_SPONSOR entries before sorting.
Applied to files:
backend/apps/owasp/models/common.py
🧬 Code graph analysis (1)
backend/apps/owasp/models/common.py (2)
backend/apps/owasp/api/internal/nodes/common.py (1)
leaders(19-21)backend/apps/mentorship/api/internal/nodes/mentor.py (1)
name(18-20)
🪛 Ruff (0.14.11)
backend/apps/owasp/models/common.py
347-347: Do not explicitly return None in function if it is the only possible return value
Remove explicit return None
(RET501)
347-347: Useless return statement at end of function
Remove useless return statement
(PLR1711)
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
981c203 to
03088c7
Compare
03088c7 to
1a0b05a
Compare
arkid15r
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert all unrelated changes.
Run make check-test locally -- just marking the checkbox is not enough.
1a3cb9c to
632c245
Compare
632c245 to
7f3f808
Compare
|
PawanMohit16
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reverted all the unrelated changes. Based on our earlier conversation, I had run make check-test-backend for this backend-only change. I had already run it before, and I ran it again after your message as well. All good locally. Let me know if anything else is needed.



Proposed change
Resolves #2910
Hardens the regex used for extracting markdown list names by removing backtracking-prone constructs and bounding all quantifiers. Behavior is unchanged, but the implementation is now ReDoS-safe and scanner-compliant.
Checklist
make check-testlocally