Skip to content

Conversation

@rudransh-shrivastava
Copy link
Collaborator

Proposed change

Resolves #3217

  • Use Lambda SnapStart
  • Add callbacks to reduce Lambda package size
  • Add aws-xray

Checklist

  • Required: I followed the contributing workflow
  • Required: I verified that my code works as intended and resolves the issue as described
  • Required: I ran make check-test locally: all warnings addressed, tests passed
  • I used AI for code, documentation, tests, or communication related to this PR

@github-actions github-actions bot added docs Improvements or additions to documentation backend labels Jan 16, 2026
@rudransh-shrivastava rudransh-shrivastava linked an issue Jan 16, 2026 that may be closed by this pull request
2 tasks
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 16, 2026

Walkthrough

Adds Zappa deployment callbacks and settings, enables conditional AWS X-Ray instrumentation, moves a few exception imports to local scope, bumps Zappa and adds aws-xray-sdk, and refactors ALB Terraform to use Lambda function name + alias (removing lambda_arn) with related staging/docs edits.

Changes

Cohort / File(s) Summary
Backend import locality
backend/apps/github/models/common.py, backend/apps/github/models/repository.py
Moved exception imports (UnknownObjectException, GithubException) from module level into the local method scope where they are used.
Runtime instrumentation & WSGI
backend/wsgi.py
Adds conditional AWS X-Ray usage: imports xray_recorder and patch_all, configures recorder when AWS_LAMBDA_FUNCTION_NAME is present, and calls patch_all().
Zappa deployment callbacks & config
backend/zappa_callback.py, backend/zappa_settings.example.json
Adds zappa_callback.py with clean_package, update_alias, cleanup_versions, post; updates example Zappa settings to include callbacks, memory_size, regex_excludes, snap_start, and xray_tracing.
Dependencies
backend/pyproject.toml
Bumps zappa to ^0.61.4 and adds aws-xray-sdk ^2.15.0.
ALB Terraform: alias-based Lambda integration
infrastructure/modules/alb/main.tf, infrastructure/modules/alb/variables.tf
Adds data "aws_lambda_function" "backend" and resource "aws_lambda_alias" "live"; refactors ALB listener rules, target group/attachment, and aws_lambda_permission to use lambda_function_name + alias qualifier; removes lambda_arn variable.
ALB Terraform: outputs removed
infrastructure/modules/alb/outputs.tf
Removes lambda_target_group_arn output.
Staging Terraform cleanup
infrastructure/staging/main.tf, infrastructure/staging/terraform.tfvars.example, infrastructure/staging/variables.tf
Removes lambda_arn usage/definition from staging configs; retains lambda_function_name.
Docs / infra README
infrastructure/README.md
Changes Zappa setup command to poetry sync --without dev --without test --without video && eval $(poetry env activate), removes lambda_arn example, and deletes the Known Issues section.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • kasya
  • arkid15r
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description check ✅ Passed The description relates to the changeset by outlining the three main changes: using Lambda SnapStart, adding callbacks to reduce package size, and adding aws-xray, which align with the actual code modifications.
Linked Issues check ✅ Passed The PR addresses issue #3217's objectives: enables Lambda SnapStart via Terraform alias and zappa_settings, reduces package size through zappa_callback functions and dependency optimization, and adds aws-xray integration.
Out of Scope Changes check ✅ Passed All changes are scoped to the linked issue #3217: import refactoring improves maintainability for SnapStart compatibility, Zappa and infrastructure changes enable SnapStart deployment, and aws-xray integration is explicitly mentioned in the PR description.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title check ✅ Passed The title accurately describes the main changes: enabling Lambda SnapStart and optimizing the Zappa package through new callbacks that reduce package size.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@rudransh-shrivastava rudransh-shrivastava changed the title Feature/nest zappa migration cold start Use LambdaSnapStart and optimize Zappa package Jan 16, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@backend/zappa_callback.py`:
- Around line 45-54: The update_alias function can raise IndexError when there
are no published versions (only "$LATEST"); to fix, after calling
client.list_versions_by_function and building the filtered list (e.g., versions
= client.list_versions_by_function(... )["Versions"] and extracting
published_versions = [v["Version"] for v in versions if v["Version"] !=
"$LATEST"]), check if published_versions is empty and handle it (log a clear
message via print or a logger and return/abort the alias update, or raise a
descriptive exception) instead of unconditionally taking the last element; then
use the chosen latest value when calling client.update_alias.

In `@infrastructure/modules/alb/main.tf`:
- Around line 71-82: Add an explanatory comment above the aws_lambda_alias
"live" resource clarifying that data.aws_lambda_function.backend[0].version
returns the latest published version (not $LATEST), that this attribute will be
empty if no published versions exist so Zappa must publish a version before
Terraform can create aws_lambda_alias.live, and that ignore_changes =
[function_version] is intentional to allow Zappa to manage subsequent version
updates independently (Terraform will create the alias once a published version
exists but will not follow Zappa-managed version changes thereafter).
🧹 Nitpick comments (1)
backend/zappa_callback.py (1)

57-73: Consider adding error handling for version deletion failures.

The deletion loop at lines 70-71 doesn't handle potential AWS API errors (e.g., version in use by an alias, throttling). A single failure will abort the entire cleanup.

♻️ Suggested improvement with error handling
     to_delete = published[:-keep]
     for version in to_delete:
-        client.delete_function(FunctionName=version["FunctionArn"])
+        try:
+            client.delete_function(FunctionName=version["FunctionArn"])
+        except client.exceptions.ResourceConflictException:
+            print(f"Version {version['Version']} is in use, skipping")
+        except Exception as e:
+            print(f"Failed to delete version {version['Version']}: {e}")
 
-    print(f"Deleted {len(to_delete)} old versions, kept {keep}")
+    print(f"Attempted to delete {len(to_delete)} old versions, kept {keep}")
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1966639 and d6e039a.

⛔ Files ignored due to path filters (1)
  • backend/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • backend/apps/github/models/common.py
  • backend/apps/github/models/repository.py
  • backend/pyproject.toml
  • backend/wsgi.py
  • backend/zappa_callback.py
  • backend/zappa_settings.example.json
  • infrastructure/README.md
  • infrastructure/modules/alb/main.tf
  • infrastructure/modules/alb/outputs.tf
  • infrastructure/modules/alb/variables.tf
  • infrastructure/staging/main.tf
  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/staging/variables.tf
💤 Files with no reviewable changes (4)
  • infrastructure/staging/main.tf
  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/modules/alb/outputs.tf
  • infrastructure/staging/variables.tf
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • infrastructure/README.md
  • backend/zappa_settings.example.json
  • backend/zappa_callback.py
📚 Learning: 2025-11-27T19:38:23.956Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2743
File: infrastructure/modules/storage/main.tf:1-9
Timestamp: 2025-11-27T19:38:23.956Z
Learning: In the OWASP/Nest infrastructure code (infrastructure/ directory), the team prefers exact version pinning for Terraform (e.g., "1.14.0") and AWS provider (e.g., "6.22.0") over semantic versioning constraints (e.g., "~> 1.14" or "~> 6.22"). This is a deliberate choice for maximum reproducibility and explicit control over version updates in their testing environment.

Applied to files:

  • infrastructure/README.md
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.

Applied to files:

  • backend/apps/github/models/common.py
  • backend/wsgi.py
  • backend/zappa_callback.py
  • backend/apps/github/models/repository.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.

Applied to files:

  • backend/apps/github/models/common.py
  • backend/wsgi.py
  • backend/zappa_callback.py
  • backend/apps/github/models/repository.py
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • backend/zappa_settings.example.json
  • backend/wsgi.py
📚 Learning: 2026-01-13T15:15:35.282Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3333
File: infrastructure/modules/database/variables.tf:144-147
Timestamp: 2026-01-13T15:15:35.282Z
Learning: In Terraform (AWS provider), the aws_secretsmanager_secret resource's recovery_window_in_days can be set to 0 to trigger immediate deletion, mapping to AWS's ForceDeleteWithoutRecovery. This contrasts with the AWS API, which documents RecoveryWindowInDays as 7–30. Review any variable/module usage of recovery_window_in_days in infrastructure/ modules (e.g., infrastructure/modules/database/variables.tf) to ensure this behavior is intentional and documented. If using 0 for immediate deletion, add a clear comment and ensure it aligns with your deletion policies and compliance requirements.

Applied to files:

  • infrastructure/modules/alb/variables.tf
  • infrastructure/modules/alb/main.tf
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (14)
infrastructure/README.md (2)

121-122: LGTM!

The dependency installation command correctly excludes dev and test groups for deployment, reducing the Lambda package size as intended.


154-167: LGTM!

Documentation correctly reflects the shift from lambda_arn to lambda_function_name for alias-based SnapStart support.

backend/apps/github/models/repository.py (1)

253-254: LGTM!

The lazy import of GithubException is appropriate here since it's used to catch exceptions from actual GitHub API calls (commits.totalCount and get_contents()). This supports the cold start optimization objective.

infrastructure/modules/alb/variables.tf (1)

41-45: LGTM!

The updated description clearly documents the purpose of lambda_function_name for alias-based SnapStart support. This aligns with the architecture where Zappa manages Lambda functions while Terraform handles the supporting infrastructure (ALB routing via alias). Based on learnings, this separation is intentional for Zappa-based deployments.

backend/apps/github/models/common.py (1)

86-91: The exception handling is correct; raw_data property access can raise UnknownObjectException.

The code properly catches UnknownObjectException because PyGithub's raw_data property can itself raise this exception when accessed (e.g., if the GitHub API returns 404 for a lazy-loaded sub-resource). This is confirmed by the existing test in backend/tests/apps/github/models/common_test.py (lines 114–120), which explicitly validates that get_node_id returns None when raw_data access raises UnknownObjectException. The lazy import is appropriate for cold start optimization and doesn't introduce any logic errors.

Likely an incorrect or invalid review comment.

backend/pyproject.toml (1)

53-54: Dependency versions are current and correctly specified.

The Zappa upgrade to ^0.61.4 and addition of aws-xray-sdk ^2.15.0 align with the PR objectives for SnapStart and X-Ray tracing. Both versions are the latest available on PyPI as of January 2026.

backend/wsgi.py (1)

7-11: LGTM!

The X-Ray instrumentation is correctly configured:

  • Conditional activation based on AWS_LAMBDA_FUNCTION_NAME ensures it only runs in Lambda
  • context_missing="LOG_ERROR" prevents exceptions when tracing context is missing
  • Placement before populate_environ_from_ssm() ensures SSM calls are traced
backend/zappa_callback.py (2)

13-42: LGTM!

The clean_package function is well-implemented:

  • Uses filter="data" for safe tarfile extraction (prevents path traversal)
  • Properly uses context manager for temp directory cleanup
  • Gracefully handles missing configuration

76-79: LGTM!

The post callback correctly orchestrates alias updates and version cleanup.

infrastructure/modules/alb/main.tf (3)

39-42: LGTM!

The conditional data source correctly fetches Lambda function metadata when lambda_function_name is provided.


84-92: LGTM!

The permission correctly includes the qualifier to scope ALB invocations to the aliased version, which is required for SnapStart and alias-based routing.


235-240: LGTM!

The target group attachment correctly uses the alias ARN (aws_lambda_alias.live[0].arn) instead of the function ARN, ensuring traffic routes to the SnapStart-optimized published version.

backend/zappa_settings.example.json (2)

29-42: The /data/ exclude pattern is appropriate.

The regex_excludes patterns correctly exclude the backend/data/ directory, which contains Django fixtures (55MB+ nest.json.gz) and metadata used only for local setup and initialization. These are loaded via management commands (e.g., load_data, restore_backup) that run locally or via zappa invoke, not during Lambda handler execution. Excluding these files from the deployment package is the correct and intended behavior.


46-46: Configuration value is correct and supported. The snap_start value "PublishedVersions" is a valid AWS Lambda SnapStart configuration that Zappa fully supports. Verify that your target AWS region and Python 3.13 runtime support SnapStart (Python support was expanded in Nov 2024).

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@backend/zappa_callback.py`:
- Around line 1-11: The module imports boto3 (see the top-level import "import
boto3" in zappa_callback.py) but boto3 is not declared in the project's
dependencies; add boto3 to the dependencies section of backend/pyproject.toml
(choose an appropriate version constraint, e.g., "boto3 >= 1.26,<2.0" or similar
per project policy) so the package is explicitly installed rather than relying
on transitive deps, then run poetry/pip tooling to update the lockfile.
- Around line 57-73: The cleanup_versions function currently calls
boto3.client("lambda").list_versions_by_function once, which misses additional
pages; update cleanup_versions to paginate through all versions using the Lambda
client's paginator for list_versions_by_function (or loop with
Marker/NextMarker) so published = all versions except "$LATEST" across all pages
before computing to_delete; also audit and update the update_alias logic to use
the same paginator pattern when listing versions or aliases so it works with >50
items; ensure you still call
delete_function(FunctionName=version["FunctionArn"]) only for the computed
to_delete set.
- Around line 33-40: The code currently unlinks full_path before rebuilding the
archive (full_path.unlink()) which can permanently lose the original if the new
tar creation fails; change the flow to keep the original until the new archive
is fully written: write the rebuilt archive to a temporary path (e.g.,
temp_full_path = full_path.with_suffix(full_path.suffix + ".tmp") or use
tempfile) using the same tarfile block that references temp_path and excludes,
close/flush it, then atomically replace the original by
Path.replace(temp_full_path, full_path) (or unlink and rename only after
success); ensure errors during creation do not remove the original and add
try/except or finally to clean up the temp artifact if needed.
♻️ Duplicate comments (1)
backend/zappa_callback.py (1)

45-54: Handle empty published versions list to prevent IndexError.

Line 51 will raise IndexError if no published versions exist (e.g., first deployment or if only $LATEST exists). This was flagged in a previous review.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6e039a and e106029.

📒 Files selected for processing (1)
  • backend/zappa_callback.py
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • backend/zappa_callback.py
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.

Applied to files:

  • backend/zappa_callback.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.

Applied to files:

  • backend/zappa_callback.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
🔇 Additional comments (1)
backend/zappa_callback.py (1)

76-79: LGTM!

The post-deploy orchestration is clean and straightforward. The fail-fast behavior (if update_alias fails, cleanup_versions won't run) is reasonable for deployment callbacks.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@rudransh-shrivastava rudransh-shrivastava force-pushed the feature/nest-zappa-migration-cold-start branch from e106029 to 0137559 Compare January 16, 2026 18:37
There's a known issue with Zappa removing permissions and disconnecting the externally managed
API Gateway on each `update` or `deploy` action.
The temporary fix is to run `terraform apply` right after these actions.
Reference: https://github.com/zappa/Zappa/issues/939
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be fixed in latest version of Zappa.

"""Github app common models."""

from django.db import models
from github.GithubException import UnknownObjectException
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an easy one to lazy load.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/zappa_settings.example.json`:
- Around line 29-42: The "regex_excludes" array currently contains a broad
"/data/" entry which will exclude repository directories like
backend/data/project-custom-tags; update this by removing "/data/" from
regex_excludes or replace it with package-scoped patterns that only exclude
vendor package data (for example use patterns targeting boto3/botocore data
directories such as a regex matching "boto3/.*/data" and "botocore/.*/data"),
and then verify that any runtime-required files in backend/data are retained in
the Lambda package (or document the decision if you intentionally remove them).
♻️ Duplicate comments (4)
backend/zappa_callback.py (3)

33-40: Archive deletion before rebuild risks data loss.

The original archive is deleted at line 35 before the new archive is created. If the rebuild fails (disk space, permissions, or any exception), the package is lost with no recovery path.

🛡️ Proposed safer approach
     with tempfile.TemporaryDirectory() as temp_dir:
         temp_path = Path(temp_dir)

         with tarfile.open(full_path, "r:gz") as tf:  # NOSONAR archive is trusted
             tf.extractall(temp_path, filter="data")
-        full_path.unlink()

-        with tarfile.open(full_path, "w:gz") as tf:  # NOSONAR archive is trusted
+        new_archive = full_path.with_suffix(".new.tar.gz")
+        with tarfile.open(new_archive, "w:gz") as tf:  # NOSONAR archive is trusted
             for filepath in temp_path.rglob("*"):
                 if filepath.is_file() and not any(re.search(p, str(filepath)) for p in excludes):
                     tf.add(filepath, filepath.relative_to(temp_path))
+        full_path.unlink()
+        new_archive.rename(full_path)

49-54: Handle empty published versions list and add pagination.

Line 51 will raise IndexError if no published versions exist. Additionally, list_versions_by_function is paginated (default 50 items), which could cause issues with many versions.

🐛 Proposed fix
 def update_alias(zappa):
     """Update Lambda alias to latest published version."""
     print("Updating Lambda alias...")

     client = boto3.client("lambda")
-    versions = client.list_versions_by_function(FunctionName=zappa.lambda_name)["Versions"]
-    latest = [v["Version"] for v in versions if v["Version"] != "$LATEST"][-1]
+    paginator = client.get_paginator("list_versions_by_function")
+    published = []
+    for page in paginator.paginate(FunctionName=zappa.lambda_name):
+        published.extend(v["Version"] for v in page["Versions"] if v["Version"] != "$LATEST")
+
+    if not published:
+        print("No published versions found, skipping alias update")
+        return
+
+    latest = published[-1]

     client.update_alias(FunctionName=zappa.lambda_name, Name="live", FunctionVersion=latest)
     print(f"Alias 'live' now points to version {latest}")

61-73: Add pagination for cleanup_versions.

list_versions_by_function returns at most 50 items per call. If more than 50 published versions exist, only the first page is retrieved, causing incorrect version management.

🐛 Proposed fix using paginator
 def cleanup_versions(zappa, keep=5):
     """Delete old Lambda versions."""
     print("Cleaning up old Lambda versions...")

     client = boto3.client("lambda")
-    versions = client.list_versions_by_function(FunctionName=zappa.lambda_name)["Versions"]
-    published = [v for v in versions if v["Version"] != "$LATEST"]
+    paginator = client.get_paginator("list_versions_by_function")
+    published = []
+    for page in paginator.paginate(FunctionName=zappa.lambda_name):
+        published.extend(v for v in page["Versions"] if v["Version"] != "$LATEST"])

     if len(published) <= keep:
         print(f"Only {len(published)} versions exist, nothing to delete")
         return

     to_delete = published[:-keep]
     for version in to_delete:
         client.delete_function(FunctionName=version["FunctionArn"])

     print(f"Deleted {len(to_delete)} old versions, kept {keep}")
infrastructure/modules/alb/main.tf (1)

71-82: Document the Zappa/Terraform handoff for maintainability.

The ignore_changes = [function_version] lifecycle rule is correct for allowing Zappa to manage subsequent alias updates. However, data.aws_lambda_function.backend[0].version returns the latest published version—if no published versions exist, this will be empty/null, causing alias creation to fail.

Add a comment above this resource clarifying:

  1. Zappa must publish at least one version before Terraform can create the alias
  2. After creation, Zappa independently manages version updates via ignore_changes

Based on learnings, Lambda functions are Zappa-managed while Terraform provisions supporting infrastructure—this handoff should be documented for future maintainers.

📝 Suggested documentation
+# Note: This alias requires Zappa to have published at least one Lambda version
+# before Terraform can create it (data source returns empty if no published versions).
+# After creation, Zappa manages version updates independently via ignore_changes.
 resource "aws_lambda_alias" "live" {
   count            = var.lambda_function_name != null ? 1 : 0
   description      = "Alias pointing to latest published version for SnapStart"
   function_name    = var.lambda_function_name
   function_version = data.aws_lambda_function.backend[0].version
   name             = "live"
 
   lifecycle {
     create_before_destroy = true
     ignore_changes        = [function_version]
   }
 }
🧹 Nitpick comments (2)
backend/wsgi.py (1)

7-11: Consider lazy-loading X-Ray SDK for non-Lambda environments.

The aws_xray_sdk import happens unconditionally at module load, adding overhead in non-Lambda environments (local development). Since the SDK is only used when AWS_LAMBDA_FUNCTION_NAME is set, consider moving the import inside the conditional block:

♻️ Suggested refactor
-from aws_xray_sdk.core import patch_all, xray_recorder
-
 if os.environ.get("AWS_LAMBDA_FUNCTION_NAME"):
+    from aws_xray_sdk.core import patch_all, xray_recorder
+
     xray_recorder.configure(context_missing="LOG_ERROR", service="nest-backend")
     patch_all()

This aligns with the PR's lazy-import pattern applied to the GitHub models.

backend/zappa_callback.py (1)

76-79: Consider error handling in post callback.

The post function calls update_alias and cleanup_versions sequentially without error handling. If update_alias fails, cleanup_versions won't run, which may be the desired behavior. However, if cleanup_versions fails, the deployment will fail even though the alias was updated successfully.

Consider whether partial success should be allowed or if failures in version cleanup should be non-fatal:

♻️ Suggested approach for resilient cleanup
 def post(zappa):
     """Post deploy callback."""
     update_alias(zappa)
-    cleanup_versions(zappa)
+    try:
+        cleanup_versions(zappa)
+    except Exception as e:
+        print(f"Warning: Version cleanup failed: {e}")
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e106029 and 0137559.

⛔ Files ignored due to path filters (1)
  • backend/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • backend/apps/github/models/common.py
  • backend/apps/github/models/repository.py
  • backend/pyproject.toml
  • backend/wsgi.py
  • backend/zappa_callback.py
  • backend/zappa_settings.example.json
  • infrastructure/README.md
  • infrastructure/modules/alb/main.tf
  • infrastructure/modules/alb/outputs.tf
  • infrastructure/modules/alb/variables.tf
  • infrastructure/staging/main.tf
  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/staging/variables.tf
💤 Files with no reviewable changes (4)
  • infrastructure/modules/alb/outputs.tf
  • infrastructure/staging/terraform.tfvars.example
  • infrastructure/staging/variables.tf
  • infrastructure/staging/main.tf
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/pyproject.toml
  • infrastructure/README.md
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • backend/zappa_settings.example.json
  • backend/wsgi.py
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • backend/zappa_settings.example.json
  • infrastructure/modules/alb/main.tf
  • backend/zappa_callback.py
📚 Learning: 2026-01-13T15:15:35.282Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3333
File: infrastructure/modules/database/variables.tf:144-147
Timestamp: 2026-01-13T15:15:35.282Z
Learning: In Terraform (AWS provider), the aws_secretsmanager_secret resource's recovery_window_in_days can be set to 0 to trigger immediate deletion, mapping to AWS's ForceDeleteWithoutRecovery. This contrasts with the AWS API, which documents RecoveryWindowInDays as 7–30. Review any variable/module usage of recovery_window_in_days in infrastructure/ modules (e.g., infrastructure/modules/database/variables.tf) to ensure this behavior is intentional and documented. If using 0 for immediate deletion, add a clear comment and ensure it aligns with your deletion policies and compliance requirements.

Applied to files:

  • infrastructure/modules/alb/variables.tf
  • infrastructure/modules/alb/main.tf
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.

Applied to files:

  • backend/apps/github/models/common.py
  • backend/wsgi.py
  • backend/zappa_callback.py
  • backend/apps/github/models/repository.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.

Applied to files:

  • backend/apps/github/models/common.py
  • backend/wsgi.py
  • backend/zappa_callback.py
  • backend/apps/github/models/repository.py
📚 Learning: 2025-08-16T13:32:36.373Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2067
File: backend/apps/common/clients.py:40-44
Timestamp: 2025-08-16T13:32:36.373Z
Learning: In the OWASP Nest project, prefer letting specific AWS/boto3 exceptions (like botocore.exceptions.ClientError) propagate naturally rather than catching them just to raise generic exceptions like ValueError. The original boto3 exceptions provide better error context and structured information without unnecessary abstraction layers.

Applied to files:

  • backend/zappa_callback.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run frontend accessibility tests
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (8)
backend/zappa_settings.example.json (1)

8-11: LGTM on SnapStart and X-Ray configuration.

The callbacks, snap_start: "PublishedVersions", and xray_tracing: true settings are correctly configured to enable Lambda SnapStart and X-Ray tracing. This aligns well with the PR objectives for cold start optimization.

Also applies to: 46-46, 56-56

backend/apps/github/models/repository.py (1)

253-261: LGTM on lazy import for GithubException.

Moving the import inside from_github aligns with the PR's cold-start optimization strategy. The exception is correctly used to handle the empty repository case (409 status) and FUNDING.yml fetch failures.

backend/apps/github/models/common.py (1)

86-91: The exception handling is correct and properly tested.

The except UnknownObjectException clause is appropriate here. raw_data is a PyGithub property that can trigger lazy API calls, and accessing it can raise UnknownObjectException if the object doesn't exist or the API call fails (e.g., 404 errors). The test suite in backend/tests/apps/github/models/common_test.py::test_get_node_id_raises_exception explicitly validates this scenario, confirming the exception handling is intentional and works as designed.

Likely an incorrect or invalid review comment.

infrastructure/modules/alb/variables.tf (1)

41-45: LGTM!

The updated description accurately reflects the new purpose of this variable—using the function name to create an alias for SnapStart support rather than directly using a Lambda ARN.

infrastructure/modules/alb/main.tf (4)

39-42: LGTM!

The conditional data source correctly retrieves Lambda function metadata only when lambda_function_name is provided, enabling alias creation with the published version.


84-92: LGTM!

The qualifier attribute correctly restricts ALB invocation permission to the live alias rather than $LATEST, ensuring requests are routed to the SnapStart-enabled version.


167-168: LGTM!

The for_each conditions correctly gate backend routing rules on lambda_function_name presence, consistent with the variable refactoring throughout the module.

Also applies to: 185-186


235-240: LGTM!

The target attachment correctly uses aws_lambda_alias.live[0].arn as the target, ensuring ALB traffic is routed to the SnapStart-optimized alias rather than $LATEST.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@backend/zappa_callback.py`:
- Around line 31-39: The rebuilt tarball is being created at new_archive_path
under temp_path so the rglob("*") loop in the tar creation (tarfile.open(...,
"w:gz") and the loop that calls tf.add) may recursively include the new archive
itself; fix by excluding new_archive_path from added files (e.g., skip any
filepath equal to new_archive_path or whose name matches its basename) or move
creation of new_archive_path outside temp_path before iterating, updating the
loop in the tar creation block that iterates temp_path.rglob("*") to explicitly
skip the new archive.
- Around line 51-59: The alias update can fail on first deploy because
update_alias raises ResourceNotFoundException if the "live" alias doesn't exist;
modify the block around client.update_alias(FunctionName=zappa.lambda_name,
Name="live", FunctionVersion=published[-1]) to catch
botocore.exceptions.ClientError, check e.response['Error']['Code'] ==
'ResourceNotFoundException' and call
client.create_alias(FunctionName=zappa.lambda_name, Name="live",
FunctionVersion=published[-1]) as a fallback, re-raising other ClientError
cases, and add the import for ClientError from botocore.exceptions.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0137559 and a10290e.

📒 Files selected for processing (3)
  • backend/zappa_callback.py
  • backend/zappa_settings.example.json
  • infrastructure/modules/alb/variables.tf
💤 Files with no reviewable changes (1)
  • infrastructure/modules/alb/variables.tf
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3381
File: backend/zappa_callback.py:1-11
Timestamp: 2026-01-16T18:52:44.595Z
Learning: In the OWASP Nest project, `boto3` is available transitively through Zappa's dependencies and does not need to be explicitly declared in `backend/pyproject.toml` for use in Zappa callback scripts (like `backend/zappa_callback.py`), since these callbacks run locally during deployment when Zappa is installed.
📚 Learning: 2026-01-16T18:48:50.932Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3381
File: backend/zappa_settings.example.json:29-42
Timestamp: 2026-01-16T18:48:50.932Z
Learning: In the OWASP/Nest backend, the `/data/` directory contains archives and JSON files that are intentionally excluded from Lambda deployments via Zappa's `regex_excludes` to reduce package size.

Applied to files:

  • backend/zappa_settings.example.json
📚 Learning: 2026-01-16T18:52:44.595Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3381
File: backend/zappa_callback.py:1-11
Timestamp: 2026-01-16T18:52:44.595Z
Learning: In the OWASP Nest project, `boto3` is available transitively through Zappa's dependencies and does not need to be explicitly declared in `backend/pyproject.toml` for use in Zappa callback scripts (like `backend/zappa_callback.py`), since these callbacks run locally during deployment when Zappa is installed.

Applied to files:

  • backend/zappa_settings.example.json
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • backend/zappa_settings.example.json
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • backend/zappa_settings.example.json
  • backend/zappa_callback.py
📚 Learning: 2026-01-16T18:52:44.595Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 3381
File: backend/zappa_callback.py:1-11
Timestamp: 2026-01-16T18:52:44.595Z
Learning: In backend/zappa_callback.py, do not declare boto3 in pyproject.toml because boto3 is provided transitively via Zappa's dependencies. These callbacks run locally during deployment when Zappa is installed, so boto3 is available at runtime without an explicit dependency. If other scripts rely on boto3 outside the Zappa deployment context, consider declaring it explicitly.

Applied to files:

  • backend/zappa_callback.py
📚 Learning: 2025-08-16T13:32:36.373Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2067
File: backend/apps/common/clients.py:40-44
Timestamp: 2025-08-16T13:32:36.373Z
Learning: In the OWASP Nest project, prefer letting specific AWS/boto3 exceptions (like botocore.exceptions.ClientError) propagate naturally rather than catching them just to raise generic exceptions like ValueError. The original boto3 exceptions provide better error context and structured information without unnecessary abstraction layers.

Applied to files:

  • backend/zappa_callback.py
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.

Applied to files:

  • backend/zappa_callback.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.

Applied to files:

  • backend/zappa_callback.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Run frontend accessibility tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (5)
backend/zappa_settings.example.json (3)

8-11: Callbacks wiring looks good.

Hooking zip and post to the new callback module is clean and consistent with the deployment flow.


29-41: Regex exclusions align with package-slimming goals.

Noted the /data/ exclusion; given prior context this is intentional to keep large archives/JSONs out of the Lambda bundle. Based on learnings, this is expected here.


27-27: Both snap_start and xray_tracing are supported settings in Zappa 0.61.4. AWS Lambda SnapStart for Python became generally available in November 2024, and Zappa includes both settings in its current configuration options. The example values (snap_start: "PublishedVersions" and xray_tracing: true) are correctly formatted for the documented Zappa API.

backend/zappa_callback.py (2)

62-78: Cleanup flow looks consistent.

Keeps the most recent versions and prunes older ones as intended.


81-84: Post-deploy orchestration is clear.

Simple and readable sequencing of the post hooks.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@sonarqubecloud
Copy link

@rudransh-shrivastava rudransh-shrivastava changed the title Use LambdaSnapStart and optimize Zappa package Use Lambda SnapStart and optimize Zappa package Jan 16, 2026
@rudransh-shrivastava rudransh-shrivastava marked this pull request as ready for review January 16, 2026 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend docs Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reduce Django Load Time and use Lambda SnapStart

1 participant