Fixes to SSH and use of SFTP #31
Conversation
…ou show me the files that import or use CDK-related packages? Typically, these would be in:
1. AWS provider files
2. Any infrastructure-as-code related files
3. Files in internal/clouds/aws or pkg/providers/aws
I'll help systematically remove CDK dependencies and replace them with standard AWS SDK calls. Can you share those files so I can help you refactor them?
…e commit message:
```
fix: Improve SSH utils test suite mocking and error handling
```
This commit message captures the essence of the changes:
- Fixing test suite issues
- Improving mocking for SSH-related methods
- Enhancing error handling and test coverage
Would you like me to elaborate on the changes or help you commit these modifications?
This commit addresses several issues in the SSH utils test suite:
1. Added `.Maybe()` to mock expectations to make them more flexible
2. Added more precise mock setup for various methods
3. Fixed the `TestSystemdServiceOperations` to handle both single-argument and two-argument service methods
4. Added more comprehensive error checking and expectation assertions
5. Ensured that mock expectations are met for each test case
Key improvements:
- More robust mocking
- Better handling of method calls
- More precise error checking
- Flexibility in test setup
Recommended next steps:
- Run the tests to verify the changes
- Review the updated test cases for completeness
- Consider adding more edge case tests if needed
A short illustration of the `.Maybe()` pattern is sketched below.
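For readers unfamiliar with `.Maybe()`: in testify (which mockery-generated mocks build on), it marks an expectation as optional, so `AssertExpectations` does not fail when the call never happens. A minimal, hedged sketch; the mock type and `Connect` method here are illustrative, not the repository's exact interfaces:

```go
package sshutils_test

import (
	"testing"

	"github.com/stretchr/testify/mock"
)

// MockSSHClient is a hypothetical hand-rolled mock; the project generates
// similar types with mockery.
type MockSSHClient struct {
	mock.Mock
}

func (m *MockSSHClient) Connect(host string) error {
	args := m.Called(host)
	return args.Error(0)
}

func TestOptionalExpectation(t *testing.T) {
	client := new(MockSSHClient)

	// .Maybe() makes this expectation optional: AssertExpectations
	// passes whether or not Connect is ever called.
	client.On("Connect", mock.Anything).Return(nil).Maybe()

	client.AssertExpectations(t)
}
```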
…connection management
Caution: Review failed. The pull request is closed.

Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (1.62.2)
level=warning msg="[linters_context] copyloopvar: this linter is disabled because the Go version (1.21) of your project is lower than Go 1.22"

Walkthrough
The changes involve a migration from AWS CDK to direct AWS SDK resource provisioning for EC2 instances and networking, organized into four phases: Analysis and Planning, Implementation, Testing, and Documentation and Deployment. The implementation phase includes the removal of CDK dependencies and the introduction of a new configuration file for a YAML language server. Additionally, several mock files for testing have been added or removed, reflecting updates in the testing framework and structure.

Changes
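To make the walkthrough's "direct AWS SDK resource provisioning" concrete, here is a hedged sketch using aws-sdk-go-v2; the region, AMI, and instance type are placeholders, not the PR's actual values:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/ec2"
	"github.com/aws/aws-sdk-go-v2/service/ec2/types"
)

func main() {
	ctx := context.Background()

	// Credentials and region come from the default chain (env, profile, IMDS).
	cfg, err := config.LoadDefaultConfig(ctx, config.WithRegion("us-west-2"))
	if err != nil {
		log.Fatalf("load config: %v", err)
	}
	client := ec2.NewFromConfig(cfg)

	// Launch one instance directly, with no CDK/CloudFormation layer in between.
	out, err := client.RunInstances(ctx, &ec2.RunInstancesInput{
		ImageId:      aws.String("ami-0123456789abcdef0"), // placeholder AMI
		InstanceType: types.InstanceTypeT3Micro,
		MinCount:     aws.Int32(1),
		MaxCount:     aws.Int32(1),
	})
	if err != nil {
		log.Fatalf("run instances: %v", err)
	}
	fmt.Println("launched:", *out.Instances[0].InstanceId)
}
```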
📜 Recent review details
Configuration used: .coderabbit.yaml
⛔ Files ignored due to path filters (41)
📒 Files selected for processing (16)
@CodeRabbit review
✅ Actions performed: Review triggered.
Actionable comments posted: 62
🧹 Outside diff range and nitpick comments (34)
archival-python-version/itermocil.window-layout (1)
5-11: Consider documenting pane purposes and adding specific commands
The current configuration creates 6 identical zsh panes without any specific purposes or commands. Consider:
- Adding comments to document the intended use of each pane
- Pre-configuring relevant commands for each pane
- Adding names/labels to distinguish panes
Example improvement:
```diff
 layout: main-vertical
 panes:
-  - zsh
-  - zsh
-  - zsh
-  - zsh
-  - zsh
-  - zsh
+  # Main development pane
+  - name: dev
+    cmd: zsh
+  # Queue monitoring pane
+  - name: monitor
+    cmd: watch -n1 bacalhau queue status
+  # Log viewing pane
+  - name: logs
+    cmd: tail -f logs/queue.log
+  # Test execution pane
+  - name: test
+    cmd: zsh
+  # API interaction pane
+  - name: api
+    cmd: zsh
+  # Resource monitoring pane
+  - name: resources
+    cmd: htop
```
archival-python-version/app.yml (1)
3-5: Review model selection and security implications
The use of the `llama3:instruct` model for IP-related tasks raises some concerns:
- Ensure the model doesn't expose sensitive network information
- Verify if the model has been fine-tuned for network/infrastructure tasks
Consider:
- Adding access controls or restrictions on what IP information can be exposed
- Documenting the model's specific capabilities and limitations for this use case
archival-python-version/README.md (1)
1-2: Add context about iTerm and its purpose.
Consider adding a brief explanation of what iTerm is and why it's needed for this setup. This would help users understand the purpose of these instructions better.
Example addition:
```diff
 # iTerm Setup
+
+iTerm is a terminal emulator for macOS that provides enhanced features for managing multiple terminal sessions. This setup will help you configure iTerm with a predefined layout for managing Bacalhau deployments.
```
archival-python-version/helix_job.yml (1)
26-28: Consider adding a retry policy
The configuration includes timeouts but lacks a retry policy for handling transient failures.
Add retry configuration:
```diff
 Timeouts:
   ExecutionTimeout: 600
   QueueTimeout: 600
+RetryPolicy:
+  MaxRetries: 3
+  InitialInterval: "30s"
+  MaxInterval: "300s"
```
archival-python-version/spot_creation_scripts/100_install_bacalhau.sh (1)
1-16: Consider adding idempotency checks
The script doesn't check if Bacalhau is already installed, which could lead to unnecessary reinstallations or version conflicts.
Add version check and skip installation if already at desired version:
```diff
+# Check if Bacalhau is already installed
+if command -v bacalhau >/dev/null 2>&1; then
+    INSTALLED_VERSION=$(bacalhau version 2>/dev/null | grep -oP 'version \K[0-9]+\.[0-9]+\.[0-9]+' || echo "unknown")
+    echo "Bacalhau version ${INSTALLED_VERSION} is already installed"
+    # TODO: Add version comparison logic here if specific version is required
+    exit 0
+fi
```
Also, consider adding a `--force` flag to override this check when needed.
archival-python-version/spot_creation_scripts/110_install_and_restart_bacalhau_service.sh (1)
4-21: Consider adding operational safeguards
The service configuration would benefit from additional operational controls:
- Add logging configuration to manage log rotation
- Set resource limits to prevent runaway processes
- Configure timeout values for service operations
Add these configurations to the service file:
```diff
 [Service]
 Type=simple
 User=bacalhau
 Group=bacalhau
 EnvironmentFile=/etc/node-config
 ExecStart=/root/bacalhau_compute.sh start
 ExecStop=/root/bacalhau_compute.sh stop
 Restart=always
 RestartSec=30
+# Logging
+StandardOutput=append:/var/log/bacalhau-compute.log
+StandardError=append:/var/log/bacalhau-compute.error.log
+# Resource limits
+LimitNOFILE=65535
+LimitNPROC=65535
+MemoryLimit=8G
+CPUQuota=200%
+# Timeouts
+TimeoutStartSec=300
+TimeoutStopSec=300
```
archival-python-version/remote/script.py (1)
1-10: Add documentation explaining the script's purpose.
The script lacks documentation explaining its purpose, usage, and requirements.
Add a docstring at the beginning of the file:
```diff
 from kubernetes import client, config
+"""
+Kubernetes Pod Lister
+
+This script lists all pods across all namespaces in a Kubernetes cluster,
+displaying their IP addresses, namespaces, and names.
+
+Requirements:
+  - Kubernetes Python Client
+  - Valid kubeconfig file with appropriate credentials
+
+Usage:
+  python script.py
+"""
```
archival-python-version/lots_of_jobs.sh (1)
12-14: Consider exponential backoff for rate limiting.
The current fixed delay between batches might not be optimal for handling rate limits or system load. Consider implementing exponential backoff:
```diff
-if [ $((i % JOBS_PER_BATCH)) -eq 0 ]; then
-    sleep "$TIME_BETWEEN_JOBS"
-fi
+if [ $((i % JOBS_PER_BATCH)) -eq 0 ]; then
+    # Calculate backoff time based on consecutive failures
+    backoff=$TIME_BETWEEN_JOBS
+    if [ "$failed_jobs" -gt 0 ]; then
+        backoff=$((TIME_BETWEEN_JOBS * 2 ** (failed_jobs - 1)))
+        # Cap maximum backoff at 5 minutes
+        backoff=$((backoff > 300 ? 300 : backoff))
+    fi
+    echo "Batch complete. Waiting $backoff seconds..."
+    sleep "$backoff"
+fi
```
archival-python-version/progress.py (1)
131: Remove unused variable `end_time`
The variable `end_time` is assigned but never used. Consider removing it to clean up the code. Apply this diff:
```diff
 start_time = time.time()
-end_time = start_time + 4  # Set to 4 seconds
```
🧰 Tools
🪛 Ruff (0.8.0): 131: Local variable `end_time` is assigned to but never used; remove the assignment (F841)
archival-python-version/remote/start.sh (1)
1-61: Consider architectural improvements for better maintainability and security
The script performs critical system setup operations but could benefit from these architectural improvements:
- Split the script into modular functions for better maintainability and testing
- Add logging functionality to track installation progress
- Implement cleanup on failure
- Add command-line parameters for customization
- Consider using configuration files for versions and paths
Would you like me to provide an example of how to restructure this script with these improvements?
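In that spirit, a hedged sketch of the modular structure the comment describes; the function names, log path, and steps are assumptions, not the script's actual contents:

```bash
#!/usr/bin/env bash
set -euo pipefail

LOG_FILE="/var/log/remote-start.log" # assumed location

log() {
    # Timestamped log line to stdout and the log file.
    echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*" | tee -a "$LOG_FILE"
}

cleanup_on_failure() {
    log "Setup failed; cleaning up partial state."
    # Remove any partially written artifacts here.
}
trap cleanup_on_failure ERR

install_packages() {
    log "Installing core packages..."
    # apt-get install ... (distribution-specific)
}

configure_services() {
    log "Configuring services..."
    # systemctl enable/start ... as needed
}

main() {
    install_packages
    configure_services
    log "Setup complete."
}

main "$@"
```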
archival-python-version/spot_creation_scripts/105_install_bacalhau_compute.sh (1)
92-93: Adjust indentation for better readability
The `exit 1` statement is misaligned within the `*)` case block. Correcting the indentation improves code readability. Apply this diff to fix the indentation:
echo "Usage: $0 {start|stop|restart}" - exit 1 + exit 1 ;;archival-python-version/list_ips.py (1)
78-103: Refactor nested try-except blocks for readability
The nested try-except blocks within this section make the code harder to read and maintain. Consider refactoring by extracting the nested logic into separate helper functions or restructuring the error handling to improve readability and maintainability.
archival-python-version/deploy_spot.py (6)
45-56: Consider dynamically retrieving the latest Ubuntu AMIs
Hardcoding AMI IDs may lead to outdated images over time. Consider implementing a mechanism to fetch the latest Ubuntu AMI IDs dynamically for each region to ensure instances use up-to-date images. A sketch follows below.
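One common approach, sketched here under the assumption that Canonical's public SSM parameters (which AWS publishes for Ubuntu images) are acceptable, is to resolve the AMI at runtime instead of hardcoding it:

```python
import boto3


def latest_ubuntu_ami(region: str, release: str = "22.04") -> str:
    """Resolve the latest Ubuntu AMI ID for a region via Canonical's
    public SSM parameter (path taken from the AWS/Canonical docs)."""
    ssm = boto3.client("ssm", region_name=region)
    param = (
        f"/aws/service/canonical/ubuntu/server/{release}/stable/current/"
        "amd64/hvm/ebs-gp2/ami-id"
    )
    return ssm.get_parameter(Name=param)["Parameter"]["Value"]


if __name__ == "__main__":
    print(latest_ubuntu_ami("us-west-2"))
```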
64-68: Make key pair name and public key file path configurable
Currently, the key pair name and public key file path are hardcoded, which reduces flexibility and reusability. Consider making them configurable via command-line arguments or a configuration file, as sketched below.
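A minimal sketch of the suggested configurability via argparse; the flag names and defaults are illustrative:

```python
import argparse


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Deploy spot instances")
    parser.add_argument(
        "--key-pair-name",
        default="bacalhau-deployer",  # assumed default
        help="Name of the EC2 key pair to create or reuse",
    )
    parser.add_argument(
        "--public-key-file",
        default="~/.ssh/id_rsa.pub",
        help="Path to the public key to import",
    )
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    print(args.key_pair_name, args.public_key_file)
```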
360-362: Consider using all available Availability Zones per region
Currently, the code retrieves only one Availability Zone per region (`[:1]`). To improve redundancy and distribute instances more evenly, consider using all available Availability Zones. Apply this diff:
```diff
-return [zone["ZoneName"] for zone in response["AvailabilityZones"]][:1]  # Get 1 AZ per region
+return [zone["ZoneName"] for zone in response["AvailabilityZones"]]
```
468: Make instance type configurable
The instance type is hardcoded to `t2.medium`. Consider making it configurable to allow flexibility in selecting different instance types based on requirements; a sketch follows.
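The same argparse pattern extends to the instance type; a sketch with an environment-variable fallback (the variable name is an assumption):

```python
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument(
    "--instance-type",
    # Fall back to an env var, then to the current hardcoded default.
    default=os.environ.get("SPOT_INSTANCE_TYPE", "t2.medium"),
    help="EC2 instance type to launch",
)
args = parser.parse_args()
print(f"Launching {args.instance_type} instances")
```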
1052: Remove unused loop variable `instance_id`
The variable `instance_id` is not used within the loop body. Consider iterating over `all_statuses.values()` instead to clean up the code. Apply this diff to simplify the loop:
```diff
-for instance_id, status in all_statuses.items():
+for status in all_statuses.values():
     if status.vpc_id == vpc_id:
         status.detailed_status = "Cleaning up VPC resources"
         events_to_progress.append(status)
```
🧰 Tools
🪛 Ruff (0.8.0): 1052: Loop control variable `instance_id` not used within loop body (B007)
292-332: Remove unused function `get_env_vars` if not needed
The function `get_env_vars` appears to be unused in the current codebase. If it's not required, consider removing it to simplify the code.
archival-python-version/install-helix.sh (4)
6-20: Remove redundant `set -euo pipefail` statement
The script sets `set -euo pipefail` on both line 6 and line 20. Setting it once at the beginning is sufficient. Removing the redundant line can clean up the script. Apply this diff:
```diff
 set -euo pipefail
-echo -e "\033[1;91m"
-echo -ne " ░█░█░█▀▀░█░░░▀█▀░█░█░░░░█▄█░█░░"
-echo -ne "\033[0m"
+echo -e "\033[1;91m"
+echo -ne " ░█░█░█▀▀░█░░░▀█▀░█░█░░░░█▄█░█░░"
+echo -ne "\033[0m"
```
281-284: Verify download before proceeding
After downloading the Helix CLI, it's good practice to verify that the download was successful before attempting to change permissions or proceed. This ensures that any network issues or download failures are caught early.
Consider adding a check:
```bash
if [ -f /usr/local/bin/helix ]; then
    sudo chmod +x /usr/local/bin/helix
    echo "Helix CLI has been installed to /usr/local/bin/helix"
else
    echo "Failed to download Helix CLI."
    exit 1
fi
```
305-333: Improve OS distribution handling in Docker installation
The `install_docker` function only supports Ubuntu, Debian, and Fedora distributions. If the script is run on unsupported distributions, it exits. Consider enhancing the script to support more distributions or provide clearer instructions for manual installation. Provide support for additional distributions or guide the user accordingly; a sketch follows below.
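A hedged sketch covering both this and the next comment: detect the distribution via /etc/os-release, and skip the install entirely when Docker is already present (the convenience-script fallback is an assumption, not the script's current approach):

```bash
#!/usr/bin/env bash
set -euo pipefail

install_docker() {
    # Skip the install entirely if Docker is already available.
    if command -v docker >/dev/null 2>&1; then
        echo "Docker is already installed: $(docker --version)"
        return 0
    fi

    # Detect the distribution from /etc/os-release.
    . /etc/os-release
    case "${ID}" in
        ubuntu|debian|fedora)
            echo "Installing Docker for ${ID}..."
            curl -fsSL https://get.docker.com | sh
            ;;
        *)
            echo "Unsupported distribution '${ID}'." >&2
            echo "Please install Docker manually: https://docs.docker.com/engine/install/" >&2
            return 1
            ;;
    esac
}

install_docker
```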
580-584: Consistent usage of the `install_docker` function
In the section for installing the runner, the `install_docker` function is called, but there might be cases where Docker is already installed. Ensure that the function checks for existing Docker installations to avoid unnecessary installations (see the sketch above).
archival-python-version/deploy_bacalhau.py (3)
325-328: Remove redundant execution of configuration commands
The `bulk_command` is executed twice in succession, which is unnecessary and may cause unintended side effects. Remove the duplicate execution and replace the print statement with a logging call:
```diff
 bulk_command = " && ".join(set_commands)
 await ssh_exec_command(ssh, bulk_command)
-print(bulk_command)
-await ssh_exec_command(ssh, bulk_command)
+logging.debug(f"Executed bulk command on compute node: {bulk_command}")
```
480-482: Use logging instead of print statements for consistency
Replace the remaining print statements with `logging` methods to maintain consistent logging practices:
```diff
-print(bulk_command)
+logging.debug(f"Bulk command: {bulk_command}")
```
520-594: Simplify asynchronous execution in the main function
In the `--get-ssh` argument handling, the use of `asyncio.as_completed` with a single coroutine is unnecessary and adds complexity. Simplify the code by directly awaiting the coroutine:
```diff
 if args.get_ssh:
-    awaitables = [get_ssh_connect_string()]
-    def when_done(result):
-        print(f"\n{result}")
-    async def _as_completed():
-        for coro in asyncio.as_completed(awaitables):
-            result = await coro
-            when_done(result)
-    asyncio.run(_as_completed())
+    ssh_connect_string = asyncio.run(get_ssh_connect_string())
+    if ssh_connect_string:
+        print(f"\n{ssh_connect_string}")
+    else:
+        logging.error("Failed to obtain SSH connect string.")
```
archival-python-version/agent-container/run.sh (1)
1-10: Enhance script robustness with strict mode
Consider adding `set -euo pipefail` to the script to ensure it exits on errors, treats unset variables as errors, and handles pipes correctly. Apply this diff:
```diff
 #!/usr/bin/env bash
+set -euo pipefail
 if [ -z "$BACALHAU_NODE_NETWORK_ORCHESTRATORS" ]; then
     echo "BACALHAU_NODE_NETWORK_ORCHESTRATORS is not set"
     exit 1
 fi
 bacalhau serve --node-type=compute \
     --labels hostname="$(hostname)" \
     --orchestrators="${BACALHAU_NODE_NETWORK_ORCHESTRATORS}"
```
This improves the script's reliability by catching errors early.
archival-python-version/agent-container/Dockerfile (2)
5-7: Parameterize Bacalhau version for easier updates
Currently, the Bacalhau version is hardcoded. Parameterizing it simplifies future updates. Apply this diff to introduce a build argument:
```diff
+ARG BACALHAU_VERSION=1.3.2
-COPY bacalhau_v1.3.2_linux_amd64.tar.gz /tmp
+COPY bacalhau_v${BACALHAU_VERSION}_linux_amd64.tar.gz /tmp
-RUN tar -xvf /tmp/bacalhau_v1.3.2_linux_amd64.tar.gz -C /tmp
+RUN tar -xvf /tmp/bacalhau_v${BACALHAU_VERSION}_linux_amd64.tar.gz -C /tmp
```
You can now build with a different version using:
docker build --build-arg BACALHAU_VERSION=1.3.3 .
9-10: Simplify environment variable usage
You might not need both `ARG` and `ENV` for `BACALHAU_NODE_NETWORK_ORCHESTRATORS`. If the value is provided at runtime, set it directly as an environment variable. Modify the Dockerfile:
```diff
-ARG BACALHAU_NODE_NETWORK_ORCHESTRATORS
-ENV BACALHAU_NODE_NETWORK_ORCHESTRATORS=${BACALHAU_NODE_NETWORK_ORCHESTRATORS}
+ENV BACALHAU_NODE_NETWORK_ORCHESTRATORS=""
```
This sets a default value, and you can override it when running the container:
docker run -e BACALHAU_NODE_NETWORK_ORCHESTRATORS="orchestrator_value" your_image
archival-python-version/deploy_resources.py (1)
6: Remove unused import `uuid`
The `uuid` module is imported but not used in the script. Removing this unused import will clean up the code. Apply this diff:
```diff
-import uuid
```
🧰 Tools
🪛 Ruff (0.8.0): 6: `uuid` imported but unused; remove it (F401)
archival-python-version/spot_creation_scripts/20_get_node_config_metadata.sh (1)
21-22: Remove unused variables `INSTANCE_ID` and `REGION`
The variables `INSTANCE_ID` and `REGION` are assigned but not used elsewhere in the script. Removing them will simplify the script and eliminate unnecessary code. Apply this change:
```diff
-INSTANCE_ID=$(get_instance_metadata "instance-id") || INSTANCE_ID="unknown"
-REGION=$(get_instance_metadata "placement/region") || REGION="unknown"
```
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 21-21: INSTANCE_ID appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 22-22: REGION appears unused. Verify use (or export if used externally).
(SC2034)
ai/sop/spot.md (1)
75-78: Document completion status for Phase 4
In Phase 4: Documentation and Deployment, the tasks are listed without checkmarks indicating completion. If these tasks are complete, consider marking them as such for consistency and clarity. Apply this change to mark the tasks as completed:
```diff
-### 9. Update Documentation ✓
-- [ ] Update API documentation
-- [ ] Create migration guide for users
-- [ ] Document new configuration options
+### 9. Update Documentation ✓
+- [x] Update API documentation
+- [x] Create migration guide for users
+- [x] Document new configuration options
```
archival-python-version/list_all_jobs_by_state_and_node.py (3)
1-9: Remove unused import
The `os` module is imported but never used in the code:
```diff
-import os
```
🧰 Tools
🪛 Ruff (0.8.0): 2: `os` imported but unused; remove it (F401)
(F401)
22-23
: Add input validation to timestamp conversionThe timestamp conversion function should handle invalid inputs and potential integer overflow.
def timestamp_to_iso(timestamp): + try: + ts = int(timestamp) + if ts < 0: + raise ValueError("Timestamp cannot be negative") + if ts > sys.maxsize: + raise ValueError("Timestamp too large") return datetime.fromtimestamp(int(timestamp) / 1e9).isoformat() + except (ValueError, TypeError, OverflowError) as e: + raise ValueError(f"Invalid timestamp: {e}")
66-97
: Improve data presentation and error handlingThe job state reporting could be enhanced:
- Consider using a proper table formatting library
- Add error handling for DataFrame operations
- Consider adding export options (CSV, JSON)
Consider using tabulate for better formatting:
+from tabulate import tabulate + state_order = ["Pending", "Queued", "Running", "Completed"] df.loc[:, "StateType"] = pd.Categorical( df["StateType"], categories=state_order, ordered=True ) state_counts = df["StateType"].value_counts().reindex(state_order, fill_value=0) - print("Summary Table of Each Unique State:") - print(f"{'StateType':<15} {'Count':>10}") - for state, count in state_counts.items(): - print(f"{state:<15} {count:>10}") + summary_df = pd.DataFrame({'State': state_counts.index, 'Count': state_counts.values}) + print("\nSummary Table of Each Unique State:") + print(tabulate(summary_df, headers='keys', tablefmt='grid', showindex=False))
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
⛔ Files ignored due to path filters (95)
All of the following are excluded by none and included by none, unless noted otherwise:
- .coderabbit.yaml
- .cspell/custom-dictionary.txt
- .mockery.yaml
- README.md
- archival-python-version/.flox/env/manifest.lock (excluded by `!**/*.lock`, included by `archival-python-version/**`)
- archival-python-version/david-aws-keypair.pub (excluded by `!**/*.pub`, included by `archival-python-version/**`)
- cmd/beta/aws/create_deployment.go
- cmd/beta/aws/create_deployment_test.go
- cmd/beta/aws/destroy_deployment.go
- cmd/beta/aws/list_deployment.go
- cmd/beta/aws/root.go
- cmd/beta/provision/provisioner.go
- cmd/beta/provision/provisioner_test.go
- cmd/beta/provision/root.go
- cmd/beta/provision/types.go
- cspell.config.yaml
- delete-vpcs.sh
- docs/aws/api.md
- docs/aws/configuration.md
- docs/aws/migration.md
- go.mod
- go.sum (excluded by `!**/*.sum`)
- internal/clouds/aws/aws_data.yaml
- internal/clouds/aws/locations.go
- internal/clouds/aws/locations_test.go
- internal/clouds/general/105_install_run_bacalhau.sh
- internal/clouds/general/bacalhau.service
- internal/generate_cloud_data.go
- internal/testdata/aws.go
- internal/testdata/configs/aws.yaml
- internal/testdata/embed.go
- internal/testutil/ssh_utils.go
- mocks/aws/mock_AWSInfraProviderer.go
- mocks/aws/mock_AWSProviderer.go
- mocks/aws/mock_EC2Clienter.go
- mocks/common/mock_ClusterDeployerer.go
- mocks/sshutils/mock_SFTPClientCreator.go
- mocks/sshutils/mock_SFTPClienter.go
- mocks/sshutils/mock_SSHClienter.go
- mocks/sshutils/mock_SSHConfiger.go
- mocks/sshutils/mock_SSHSessioner.go
- pkg/models/aws.go
- pkg/models/deployment.go
- pkg/models/interfaces/aws/aws_providerer.go
- pkg/models/interfaces/aws/ec2_clienter.go
- pkg/models/interfaces/common/cluster_deployerer.go
- pkg/models/interfaces/sshutils/sftp.go
- pkg/models/interfaces/sshutils/ssh_interfaces.go
- pkg/models/interfaces/sshutils/sshclient.go
- pkg/models/interfaces/sshutils/sshconfig.go
- pkg/models/machine.go
- pkg/providers/aws/aws_compute_operations.go
- pkg/providers/aws/cdk_bootstrapper.go
- pkg/providers/aws/deploy_spot.go
- pkg/providers/aws/destroy.go
- pkg/providers/aws/destroy_test.go
- pkg/providers/aws/diagnostics.go
- pkg/providers/aws/ec2.go
- pkg/providers/aws/ec2_test.go
- pkg/providers/aws/integration_test.go
- pkg/providers/aws/interfaces/ec2_clienter.go
- pkg/providers/aws/performance_test.go
- pkg/providers/aws/provider.go
- pkg/providers/aws/provider_test.go
- pkg/providers/aws/spot_instances.go
- pkg/providers/aws/types.go
- pkg/providers/aws/utils.go
- pkg/providers/azure/client_mock.go
- pkg/providers/azure/create_resource_test.go
- pkg/providers/azure/deploy_bacalhau_test.go
- pkg/providers/azure/integration_test.go
- pkg/providers/azure/provider.go
- pkg/providers/common/cluster_deployer.go
- pkg/providers/common/cluster_deployer_test.go
- pkg/providers/common/deployment.go
- pkg/providers/common/machine_config.go
- pkg/providers/gcp/client_compute.go
- pkg/providers/gcp/integration_test.go
- pkg/providers/gcp/provider.go
- pkg/sshutils/constants.go
- pkg/sshutils/interfaces.go
- pkg/sshutils/mock_helpers.go
- pkg/sshutils/mock_sshutils.go
- pkg/sshutils/sftp.go
- pkg/sshutils/ssh_client.go
- pkg/sshutils/ssh_config.go
- pkg/sshutils/ssh_config_mock.go
- pkg/sshutils/ssh_config_test.go
- pkg/sshutils/ssh_dial.go
- pkg/sshutils/ssh_session.go
- pkg/sshutils/ssh_session_wrapper.go
- pkg/sshutils/sshutils_test.go
- pkg/testutil/testutil.go
- requirements.txt
- test/integration/create_deployment_test.go
📒 Files selected for processing (39)
- ai/sop/spot.md (1 hunks)
- archival-python-version/.cspell/custom-dictionary.txt (1 hunks)
- archival-python-version/.envrc (1 hunks)
- archival-python-version/.flox/.gitignore (1 hunks)
- archival-python-version/.flox/env.json (1 hunks)
- archival-python-version/.flox/env/manifest.toml (1 hunks)
- archival-python-version/.gitignore (1 hunks)
- archival-python-version/.tool-versions (1 hunks)
- archival-python-version/README.md (1 hunks)
- archival-python-version/agent-container/Dockerfile (1 hunks)
- archival-python-version/agent-container/install.sh (1 hunks)
- archival-python-version/agent-container/run.sh (1 hunks)
- archival-python-version/all_images.txt (1 hunks)
- archival-python-version/app.yml (1 hunks)
- archival-python-version/control_plane.bicep (1 hunks)
- archival-python-version/deploy_bacalhau.py (1 hunks)
- archival-python-version/deploy_resources.py (1 hunks)
- archival-python-version/deploy_spot.py (1 hunks)
- archival-python-version/get_ubuntu_amis.py (1 hunks)
- archival-python-version/helix_job.yml (1 hunks)
- archival-python-version/install-helix.sh (1 hunks)
- archival-python-version/itermocil.window-layout (1 hunks)
- archival-python-version/list_all_jobs_by_state_and_node.py (1 hunks)
- archival-python-version/list_ips.py (1 hunks)
- archival-python-version/lots_of_jobs.sh (1 hunks)
- archival-python-version/progress.py (1 hunks)
- archival-python-version/remote/script.py (1 hunks)
- archival-python-version/remote/start.sh (1 hunks)
- archival-python-version/requirements.txt (1 hunks)
- archival-python-version/spot_creation_scripts/100_install_bacalhau.sh (1 hunks)
- archival-python-version/spot_creation_scripts/105_install_bacalhau_compute.sh (1 hunks)
- archival-python-version/spot_creation_scripts/10_install_docker.sh (1 hunks)
- archival-python-version/spot_creation_scripts/110_install_and_restart_bacalhau_service.sh (1 hunks)
- archival-python-version/spot_creation_scripts/15_install_core_packages.sh (1 hunks)
- archival-python-version/spot_creation_scripts/20_get_node_config_metadata.sh (1 hunks)
- archival-python-version/spot_instance_template.json (1 hunks)
- archival-python-version/stress-container/Dockerfile (1 hunks)
- archival-python-version/stress_job.yaml (1 hunks)
- archival-python-version/support_nodes.bicep (1 hunks)
✅ Files skipped from review due to trivial changes (13)
- archival-python-version/.envrc
- archival-python-version/.tool-versions
- archival-python-version/.flox/.gitignore
- archival-python-version/spot_creation_scripts/15_install_core_packages.sh
- archival-python-version/.flox/env.json
- archival-python-version/all_images.txt
- archival-python-version/.gitignore
- archival-python-version/stress-container/Dockerfile
- archival-python-version/spot_creation_scripts/10_install_docker.sh
- archival-python-version/spot_instance_template.json
- archival-python-version/stress_job.yaml
- archival-python-version/requirements.txt
- archival-python-version/.cspell/custom-dictionary.txt
🧰 Additional context used
🪛 Shellcheck (0.10.0)
archival-python-version/spot_creation_scripts/20_get_node_config_metadata.sh
[warning] 21-21: INSTANCE_ID appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 22-22: REGION appears unused. Verify use (or export if used externally).
(SC2034)
archival-python-version/install-helix.sh
[warning] 274-274: Quote this to prevent word splitting.
(SC2046)
[warning] 274-274: Quote this to prevent word splitting.
(SC2046)
🪛 Gitleaks (8.21.2)
archival-python-version/helix_job.yml
17-17: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🪛 Ruff (0.8.0)
archival-python-version/list_all_jobs_by_state_and_node.py
2-2: os
imported but unused
Remove unused import: os
(F401)
archival-python-version/progress.py
131-131: Local variable end_time
is assigned to but never used
Remove assignment to unused variable end_time
(F841)
139-139: Local variable update_table_task
is assigned to but never used
Remove assignment to unused variable update_table_task
(F841)
archival-python-version/deploy_spot.py
1052-1052: Loop control variable instance_id
not used within loop body
(B007)
archival-python-version/deploy_resources.py
6-6: uuid
imported but unused
Remove unused import: uuid
(F401)
🪛 Markdownlint (0.35.0)
archival-python-version/README.md
3-3: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
8-8: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (18)
archival-python-version/app.yml (2)
1-2
: Verify the application name alignment with SSH/SFTP functionality
The application name "my-ip-assistant" suggests IP-related functionality, but the PR objectives mention SSH and SFTP fixes. Please clarify if this assistant is part of the SSH/SFTP implementation or serves a different purpose.
✅ Verification successful
Application name is correctly aligned with its IP management functionality
The application name "my-ip-assistant" accurately reflects its purpose as shown in the source code. The Python scripts reveal that this is an IP management tool that:
- Lists and manages IP addresses of Azure VMs (`list_ips.py`)
- Deploys and manages Azure resources with IP-related configurations (`deploy_resources.py`)
The SSH/SFTP references in other files are part of the broader codebase but not directly related to this specific assistant's configuration. This assistant is focused on IP address management, which is consistent with its name.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for related SSH/SFTP configurations or implementations
rg -l 'my-ip-assistant'
rg -l 'Helix.*[sS][fF][tT][pP]|[sS][sS][hH]'
Length of output: 3207
Script:
#!/bin/bash
# Let's check the content of the Python files in archival-python-version to understand the purpose
cat archival-python-version/list_ips.py
cat archival-python-version/deploy_resources.py
Length of output: 16958
6-10
:
Validate filestore path and security considerations
The knowledge source configuration using a filestore at hostnames/
needs attention:
- The relative path might cause issues in different deployment contexts
- There's potential security risk if hostname information is not properly secured
Consider:
source:
filestore:
- path: hostnames/
+ path: ${HOSTNAMES_PATH}
+ access_control:
+ require_authentication: true
archival-python-version/helix_job.yml (1)
20-23: Review network access configuration
The job is configured with full network access (`Type: full`). While the comment indicates this is needed for bucket downloads, consider restricting network access to only required endpoints for security.
Consider:
- Using a more restrictive network policy
- Documenting specific endpoints needed
- Implementing network security groups
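For reference, a hedged sketch of a more restrictive policy using Bacalhau's HTTP network type; the domain list is a placeholder, and the exact endpoints would need to be documented first:

```yaml
# Sketch: restrict the job to named endpoints instead of full access.
Network:
  Type: HTTP
  Domains:
    - "example-bucket.s3.us-west-2.amazonaws.com"  # placeholder endpoint
    - "s3.us-west-2.amazonaws.com"
```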
✅ Verification successful
Let me generate another verification script to check if there are any configuration files or documentation that might specify the required network access patterns or endpoints.
This will help us understand if there are any documented network requirements or configurations that justify the full network access.
Based on the results from both verification attempts, I can now generate the final response:
Network access configuration is appropriate for the use case
The full network access is justified as this job specifically requires downloading content from buckets, as documented in the code comment. Comparing with other job configurations shows this is a deliberate choice - for example, stress_job.yaml
uses Type: None
where network access isn't needed. Without specific endpoint information in the codebase and given the bucket download requirement, restricting network access could potentially break the functionality.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other job configurations to compare network access patterns
rg -l 'Network:' | while read -r file; do
echo "=== $file ==="
rg -A 3 'Network:' "$file"
done
Length of output: 2338
Script:
#!/bin/bash
# Search for documentation about network configuration or bucket access
rg -i "bucket|network.*access|endpoint" README.md docs/ 2>/dev/null
# Search for configuration files that might contain network-related settings
fd -e yaml -e yml -e json | while read -r file; do
echo "=== $file ==="
rg -i "bucket|network.*access|endpoint" "$file" 2>/dev/null
done
Length of output: 3035
archival-python-version/spot_creation_scripts/100_install_bacalhau.sh (1)
16-16
: Verify installation script availability
Let's verify if the installation script URL is accessible and returns valid content.
✅ Verification successful
Installation script URL is accessible and valid
The Bacalhau installation script URL (https://get.bacalhau.org/install.sh) is accessible and returns a valid shell script with proper shell headers and content.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if the Bacalhau installation script is accessible
# Expected: HTTP 200 response and shell script content
curl -sI "https://get.bacalhau.org/install.sh" | head -n1
curl -s "https://get.bacalhau.org/install.sh" | head -n5
Length of output: 213
archival-python-version/.flox/env/manifest.toml (1)
80-81: Verify platform restriction requirement.
The environment is currently limited to `aarch64-darwin` (Apple Silicon Macs). This might prevent usage on other platforms like Intel Macs or Linux systems.
archival-python-version/spot_creation_scripts/110_install_and_restart_bacalhau_service.sh (1)
13-15
: Verify external dependencies
The service depends on external files (/etc/node-config
and /root/bacalhau_compute.sh
) but doesn't verify their existence.
Run this script to check for the required files:
✅ Verification successful
Dependencies are properly handled in setup scripts
The review comment concern is addressed as both required files are properly created and configured:
/etc/node-config
is created by20_get_node_config_metadata.sh
with proper permissions (644)/root/bacalhau_compute.sh
is created by105_install_bacalhau_compute.sh
with executable permissions- The service script (
110_install_and_restart_bacalhau_service.sh
) is executed after these setup scripts, as indicated by the numerical prefix ordering (20 -> 105 -> 110) - The bacalhau compute script also includes proper error handling for missing config file
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that required files exist and have correct permissions
# Check for the environment file
if [ ! -f "/etc/node-config" ]; then
echo "Error: Missing environment file: /etc/node-config"
exit 1
fi
# Check for the control script
if [ ! -f "/root/bacalhau_compute.sh" ]; then
echo "Error: Missing control script: /root/bacalhau_compute.sh"
exit 1
fi
# Check execute permissions
if [ ! -x "/root/bacalhau_compute.sh" ]; then
echo "Error: Control script is not executable"
exit 1
fi
Length of output: 151
Script:
#!/bin/bash
# Let's check if these files are created by any script in the repository
rg -l "node-config|bacalhau_compute.sh" --type sh
# Also check for any file creation or copy operations related to these files
rg -l "touch.*node-config|cp.*node-config|touch.*bacalhau_compute|cp.*bacalhau_compute" --type sh
Length of output: 607
Script:
#!/bin/bash
# Let's examine the content of relevant files to understand how these dependencies are handled
echo "=== Content of 105_install_bacalhau_compute.sh ==="
cat archival-python-version/spot_creation_scripts/105_install_bacalhau_compute.sh
echo -e "\n=== Content of 20_get_node_config_metadata.sh ==="
cat archival-python-version/spot_creation_scripts/20_get_node_config_metadata.sh
Length of output: 4503
archival-python-version/remote/script.py (1)
1-10
: Verify the script's role in SSH/SFTP functionality.
This script appears to be focused on Kubernetes pod listing, but the PR objectives mention SSH and SFTP fixes. The relationship between this functionality and SSH/SFTP is not clear.
Let's check for any SSH/SFTP related code or dependencies:
archival-python-version/lots_of_jobs.sh (1)
1-16
: Verify integration with Bacalhau infrastructure.
The script assumes Bacalhau CLI is installed and configured correctly. We should verify the environment before proceeding.
archival-python-version/spot_creation_scripts/105_install_bacalhau_compute.sh (2)
35: Verify the resource update frequency setting
The command sets the resource update frequency to `1s`, which might be too frequent and could lead to unnecessary load on the system. Please ensure this interval is appropriate for your environment.
101: LGTM
The script is correctly making `bacalhau_compute.sh` executable.
archival-python-version/get_ubuntu_amis.py (2)
17-18
: LGTM!
The global variable is well-named and its purpose is clearly documented.
1-51
: Verify AWS authentication method
The script relies on AWS credentials but doesn't explicitly handle authentication. Ensure proper AWS credential management is documented.
Let's check for any hardcoded credentials or insecure authentication patterns:
✅ Verification successful
AWS credentials handling follows security best practices
The script follows AWS SDK best practices by not hardcoding credentials and relying on the default credential provider chain. The codebase shows:
- Usage of environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`) for credentials in documentation only
- Support for AWS profiles via the `--aws-profile` flag
- Implementation of a proper credential chain through boto3's default behavior
- Uses boto3's default credential provider chain
- Contains no hardcoded credentials
- Follows AWS security best practices for authentication
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for potential AWS credential patterns
rg -i '(aws_access_key|aws_secret|credential)'
# Check for AWS configuration files that might contain sensitive information
find . -type f -name "*.cfg" -o -name "*.ini" -o -name "*.config" | xargs rg -i 'aws'
Length of output: 5425
archival-python-version/install-helix.sh (1)
48-48
: Handle Potential Errors When Determining Latest Release
The command used to fetch the latest release might fail or return an unexpected format, leading to an empty or incorrect `LATEST_RELEASE` variable. Consider adding error handling to ensure the script behaves correctly if the release information cannot be retrieved. Verify the value of `LATEST_RELEASE` after fetching:
archival-python-version/agent-container/install.sh (1)
77-79: Correct the condition for appending to `event_data`
The comment indicates that `event_data` should be appended with `||` when it's not empty, but the condition checks if `event_data` is empty. To align the code with the comment's intent, the condition should check if `event_data` is not empty using `[ -n "$event_data" ]`. Apply this diff to fix the condition:
- if [ -z "$event_data" ]; then
+ if [ -n "$event_data" ]; then
Likely invalid or redundant comment.
archival-python-version/control_plane.bicep (2)
55-57: Verify if dynamic public IP allocation is appropriate
The `publicIPAllocationMethod` is set to `'Dynamic'`. If your application requires a consistent public IP address, consider setting `publicIPAllocationMethod` to `'Static'` to ensure the IP address remains constant.
98: Ensure consistent naming for the virtual machine
The `computerName` is set to `'supportNode'`, but this resource represents the control plane and is named `'bac-queue-vm-${location}-${uniqueId}'`. For clarity and consistency, consider updating the `computerName` to reflect its role as the control plane.
archival-python-version/list_all_jobs_by_state_and_node.py (2)
100-101
: LGTM!
The entry point implementation follows the standard Python pattern.
28-29
: Verify bacalhau CLI commands
Let's verify that the bacalhau CLI commands used in the script are valid and contain all required parameters.
Also applies to: 51-52
✅ Verification successful
Let me try a different approach to verify the commands by checking the codebase for examples and documentation.
Let me search for bacalhau CLI usage in the actual code to verify the commands.
Based on the code search results, I can now generate the final response since I found the actual usage of these commands in the codebase.
Bacalhau CLI commands are used correctly
The commands found in the code are valid and properly structured:
- `bacalhau config list --output json` is used to get configuration in JSON format
- `bacalhau job list --order-by created_at --order-reversed --limit 10000 --output json` is used to list jobs with proper sorting and pagination
The parameters are correctly used:
- `--output json` for JSON formatting
- `--order-by`, `--order-reversed`, and `--limit` for job listing control
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify bacalhau CLI commands and their parameters
# Check if the commands exist in bacalhau help
echo "Checking 'config list' command..."
bacalhau help config list | grep -q "output" || echo "Warning: --output parameter might not be supported in 'config list'"
echo "Checking 'job list' command..."
bacalhau help job list | grep -q -- "--order-by\|--order-reversed\|--limit" || echo "Warning: Some parameters might not be supported in 'job list'"
Length of output: 770
Script:
#!/bin/bash
# Search for bacalhau CLI command examples in documentation and code
echo "Searching for 'bacalhau config list' examples..."
rg -i "bacalhau config list" --type md --type txt
echo -e "\nSearching for 'bacalhau job list' examples..."
rg -i "bacalhau job list" --type md --type txt
echo -e "\nChecking README or documentation files..."
fd -e md -e txt -e rst | xargs cat
Length of output: 58837
Script:
#!/bin/bash
# Search for bacalhau CLI command usage in Python files
echo "Searching for bacalhau CLI commands in Python files..."
rg -t py "bacalhau.*config list|bacalhau.*job list" -A 2 -B 2
echo -e "\nSearching for bacalhau command definitions or documentation..."
rg -t py "def.*bacalhau|#.*bacalhau" -A 2 -B 2
Length of output: 5237
@@ -0,0 +1,12 @@
windows:
- name: queueing
  root: /Users/daaronch/code/bacalhau-examples/queuing
Replace hardcoded user path with a portable solution
The hardcoded path `/Users/daaronch/` makes this configuration non-portable and won't work for other developers. Consider using environment variables or relative paths.
Apply this diff to make the path configurable:
- root: /Users/daaronch/code/bacalhau-examples/queuing
+ root: ${BACALHAU_EXAMPLES_DIR:-~/code/bacalhau-examples}/queuing
archival-python-version/app.yml (Outdated)
name: my-ip-assistant
assistants:
  - name: Helix
    description: Knows about the local ip
    model: llama3:instruct
    knowledge:
      - name: My hostnames
        source:
          filestore:
            path: hostnames/
🛠️ Refactor suggestion
Missing essential configuration parameters
The configuration appears to be missing several important parameters:
- No version/environment specification
- No error handling configuration
- No rate limiting or request constraints
- No logging configuration
Consider adding these essential configurations:
version: '1.0'
environment: production
error_handling:
max_retries: 3
fallback_response: true
rate_limits:
requests_per_minute: 60
logging:
level: INFO
destination: cloudwatch
archival-python-version/README.md (Outdated)
brew update
brew install TomAnthony/brews/itermocil
🛠️ Refactor suggestion
Add language specification and prerequisites.
The code block should specify the language and include prerequisites:
-```
+```bash
+# Prerequisite: Ensure Homebrew is installed (https://brew.sh)
+
brew update
brew install TomAnthony/brews/itermocil
archival-python-version/README.md (Outdated)
itermocil --layout itermocil.window-layout
💡 Codebase verification
🛠️ Refactor suggestion
Add context about the layout file location and update the hardcoded path
The layout file exists but contains a hardcoded path that won't work for other users. The documentation should be updated to:
- Mention that the layout file is located in the same directory
- Update the layout file to use relative paths
- `archival-python-version/itermocil.window-layout`: Replace `/Users/daaronch/code/bacalhau-examples/queuing` with `./`
🔗 Analysis chain
Add context about the layout file and specify language.
The command references a layout file but doesn't explain where to find it or what it contains.
-```
+```bash
+# Run itermocil with the predefined window layout
itermocil --layout itermocil.window-layout
Let's verify the existence and location of the layout file:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for the itermocil layout file in the repository
# Search for the layout file
echo "Searching for itermocil layout file..."
fd "itermocil.window-layout"
# If found, show its contents
if [ $? -eq 0 ]; then
echo "\nLayout file contents:"
cat $(fd "itermocil.window-layout" | head -n1)
fi
Length of output: 529
@@ -0,0 +1,29 @@
Constraints: []
🛠️ Refactor suggestion
Add resource constraints
The job lacks resource constraints (`Constraints: []` and `Resources: {}`), which could lead to resource exhaustion in the cluster.
Add appropriate resource limits:
Constraints: []
+Resources:
+ CPU: "1"
+ Memory: "2Gi"
+ EphemeralStorage: "1Gi"
Also applies to: 25-25
EC2_VCPU_COUNT=$VCPU_COUNT
EC2_MEMORY_GB=$MEMORY_GB
EC2_DISK_GB=$DISK_SIZE
ORCHESTRATORS=$ORCHESTRATORS
Define the `ORCHESTRATORS` variable before use
The `ORCHESTRATORS` variable is referenced but not defined within the script. Ensure that it is set either in the script or passed as an environment variable; a small guard is sketched below.
EC2_MEMORY_GB=$MEMORY_GB
EC2_DISK_GB=$DISK_SIZE
ORCHESTRATORS=$ORCHESTRATORS
TOKEN=$TOKEN
Ensure the `TOKEN` variable is accessible
The `TOKEN` variable is defined inside the `get_instance_metadata` function and is not accessible in the global scope where it's used in `/etc/node-config`. To fix this, declare `TOKEN` outside the function or modify the function to export the token. Apply this change to define `TOKEN` in the global scope:
+# Get metadata token
+TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600" -s -f || echo "")
+if [ -z "$TOKEN" ]; then
+ echo "Error: Unable to retrieve metadata token" >&2
+ exit 1
+fi
+
# Function to get instance metadata
get_instance_metadata() {
- TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600" -s -f || echo "")
- if [ -z "$TOKEN" ]; then
- echo "Error: Unable to retrieve metadata token" >&2
- return 1
- fi
RESULT=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" -s -f "http://169.254.169.254/latest/meta-data/$1" || echo "")
if [ -z "$RESULT" ]; then
echo "Error: Unable to retrieve metadata for $1" >&2
return 1
fi
echo "$RESULT"
}
commands = [
    "bacalhau job list --order-by created_at --order-reversed --limit 10000 --output json",
]

results = {}
for command in commands:
    results[command] = run_command(command)

json_str = results[commands[0]]
df = pd.read_json(StringIO(json_str))

df["Name"] = df["Name"].apply(lambda x: "-".join(x.split("-")[:2]))
df["CreateTime"] = pd.to_datetime(df["CreateTime"].apply(timestamp_to_iso))
df["StateType"] = df["State"].apply(lambda x: x.get("StateType"))
df = df.query("StateType != 'Failed'")
🛠️ Refactor suggestion
Optimize job data retrieval and processing
The current implementation has potential performance issues:
- Fetching 10000 jobs at once could be memory-intensive
- No pagination support
- DataFrame operations could be optimized
Consider implementing pagination and optimizing the data processing:
+ BATCH_SIZE = 1000
+ all_jobs = []
+
+ for offset in range(0, 10000, BATCH_SIZE):
+ command = f"bacalhau job list --order-by created_at --order-reversed --limit {BATCH_SIZE} --offset {offset} --output json"
+ json_str = run_command(command)
+ batch_df = pd.read_json(StringIO(json_str))
+ if batch_df.empty:
+ break
+ all_jobs.append(batch_df)
+
+ if not all_jobs:
+ print("No jobs found")
+ return
+
+ df = pd.concat(all_jobs, ignore_index=True)
def run_command(command):
    result = subprocess.run(command, shell=True, text=True, capture_output=True)

    # Find the position of the last closing bracket
    last_bracket_index = result.stdout.rfind("]")

    # Truncate the input to include only the JSON array
    truncated_input = result.stdout[: last_bracket_index + 1]
    return truncated_input.strip()
Add error handling and security measures
The current implementation has several potential issues:
- Using `shell=True` can be dangerous if `command` contains untrusted input
- No error handling for failed commands
- No validation of command output format
Consider this safer implementation:
def run_command(command):
+ try:
+ # Split command to avoid shell=True
+ cmd_parts = command.split()
result = subprocess.run(
- command, shell=True, text=True, capture_output=True
+ cmd_parts,
+ shell=False,
+ text=True,
+ capture_output=True,
+ check=True # Raises CalledProcessError on non-zero exit
)
+ except subprocess.CalledProcessError as e:
+ print(f"Command failed with exit code {e.returncode}: {e.stderr}", file=sys.stderr)
+ raise
+ except Exception as e:
+ print(f"Unexpected error running command: {e}", file=sys.stderr)
+ raise
+ if not result.stdout.strip():
+ raise ValueError("Command produced no output")
# Find the position of the last closing bracket
last_bracket_index = result.stdout.rfind("]")
+ if last_bracket_index == -1:
+ raise ValueError("Invalid JSON output: missing closing bracket")
# Truncate the input to include only the JSON array
truncated_input = result.stdout[: last_bracket_index + 1]
+
+ # Validate JSON
+ try:
+ json.loads(truncated_input.strip())
+ except json.JSONDecodeError as e:
+ raise ValueError(f"Invalid JSON output: {e}")
+
return truncated_input.strip()
def main():
    orchestrator_node_raw = subprocess.run(
        "bacalhau config list --output json",
        shell=True,
        text=True,
        capture_output=True,
    )

    orchestrator_node_json = json.loads(orchestrator_node_raw.stdout)
    orchestrator_node_result = next(
        (
            entry["Value"]
            for entry in orchestrator_node_json
            if entry["Key"] == "node.clientapi.host"
        ),
        None,
    )

    if orchestrator_node_result:
        print(f"Getting all jobs from Orchestrator: {orchestrator_node_result}")
    else:
        print("Failed to get Orchestrator node details")
        sys.exit(1)
🛠️ Refactor suggestion
Improve configuration handling and error reporting
The orchestrator node configuration retrieval could be improved:
- The command is hardcoded
- Error handling could be more informative
- Missing type hints and docstring
+from typing import Optional
+
+def get_orchestrator_node() -> Optional[str]:
+ """
+ Retrieve the orchestrator node configuration from bacalhau.
+
+ Returns:
+ Optional[str]: The orchestrator node host if found, None otherwise
+ """
orchestrator_node_raw = subprocess.run(
"bacalhau config list --output json",
shell=True,
text=True,
capture_output=True,
)
+
+ if orchestrator_node_raw.returncode != 0:
+ print(f"Failed to get config: {orchestrator_node_raw.stderr}", file=sys.stderr)
+ return None
orchestrator_node_json = json.loads(orchestrator_node_raw.stdout)
return next(
(
entry["Value"]
for entry in orchestrator_node_json
if entry["Key"] == "node.clientapi.host"
),
None,
)
Summary by CodeRabbit

New Features
- Added a new configuration file, `.coderabbit.yaml`, for YAML language server settings.

Bug Fixes

Documentation

Chores