-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AWS Provisioning Now Working, Has Bugs #41
base: main
Are you sure you want to change the base?
Conversation
This commit introduces several improvements to the AWS networking setup process: - Add a 15-minute timeout to prevent indefinite waiting - Use errgroup for parallel VPC networking setup - Implement retry mechanisms for internet gateway creation - Enhance logging for better debugging - Improve error handling to prevent blocking on single VPC failures
…rastructure setup
…nternet gateway diagnostics
…led error handling
This commit enhances the AWS provider's network error handling by: - Adding comprehensive DNS and network diagnostics - Implementing retry mechanisms with exponential backoff - Providing more detailed error logging - Configuring HTTP client with longer timeouts - Logging system nameserver information The changes aim to improve visibility into network-related issues when interacting with AWS EC2 endpoints and provide more robust error handling.
Warning There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure. 🔧 golangci-lint (1.62.2)level=warning msg="[linters_context] copyloopvar: this linter is disabled because the Go version (1.21) of your project is lower than Go 1.22" WalkthroughThe pull request introduces a comprehensive set of changes across multiple files, focusing on standardizing location terminology from "location" to "region" and "zone" throughout the codebase. The modifications span cloud providers (AWS, Azure, GCP), deployment configurations, and testing suites. Key improvements include enhanced error handling, more precise location normalization, and updates to machine configuration methods. The changes aim to improve code clarity, consistency, and robustness in managing cloud resources and deployments. Changes
Sequence DiagramsequenceDiagram
participant User
participant DeploymentPrep
participant CloudProvider
participant LocationNormalizer
participant MachineConfig
User->>DeploymentPrep: Initiate Deployment
DeploymentPrep->>LocationNormalizer: Normalize Location
LocationNormalizer-->>DeploymentPrep: Return Region, Zone
DeploymentPrep->>MachineConfig: Create Machine
MachineConfig->>CloudProvider: Validate Machine
CloudProvider-->>MachineConfig: Validation Result
MachineConfig-->>DeploymentPrep: Machine Created
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 18
🧹 Outside diff range and nitpick comments (33)
internal/clouds/general/location.go (4)
17-30
: Reduce verbose comments to enhance code readabilityThe extensive comments listing sample AWS locations may clutter the code and make it harder to read. Consider moving these examples to external documentation or summarizing them to include only essential information.
39-43
: Reduce verbose comments in Azure caseSimilar to the AWS section, the comments in the Azure case may be excessive. Streamlining these comments will improve readability.
45-51
: Simplify comments in GCP caseThe detailed examples in the GCP case can be condensed or moved to documentation to keep the code clean and maintainable.
81-84
: Ensure safe access to string indicesWhen checking for AWS-style zones, ensure that the string is long enough before accessing
input[len(input)-1]
to prevent index out of range errors. Althoughlen(input) > 0
checks for empty strings, consider adding additional safeguards if necessary.pkg/providers/common/deployment.go (3)
159-160
: Add support for AWS provider in default configurationsIn the
SetDefaultConfigurations
function, default configurations are set for Azure and GCP but not for AWS. Consider adding default configurations for AWS to maintain consistency and prevent potential configuration issues.
Line range hint
221-236
: Initialize 'orchestratorZone' before useThe variable
orchestratorZone
may be used before being assigned if theorchestrator
condition is not met. This could lead to logging an empty value for the orchestrator location.Ensure
orchestratorZone
is properly initialized before use. Here's a suggested fix:var orchestratorZone string if orchestrator, ok := params["orchestrator"].(bool); ok && orchestratorMachineName == "" { machine.Orchestrator = orchestrator orchestratorMachineName = machine.Name orchestratorZone = machine.Zone machine.SetNodeType(models.BacalhauNodeTypeOrchestrator) if count > 1 && !orchestratorMessagePrinted { l.Infof( "Orchestrator flag is set, but count is greater than 1. Making the first machine the orchestrator.", ) } } else if orchestratorMachineName != "" { l.Infof("Orchestrator flag must be set in a single location. Ignoring flag.") l.Infof("Orchestrator machine name: %s", orchestratorMachineName) + orchestratorZone = machine.Zone l.Infof("Orchestrator location: %s", orchestratorZone) }
Line range hint
228-235
: Avoid redundant region extraction logicThe code for extracting the region from the machine's location is repeated. Consider refactoring this logic into a helper function to improve maintainability.
cmd/beta/aws/create_deployment.go (1)
271-278
: Improve logging levels for important deployment stepsThe
l.Debug
statements for successful infrastructure creation and regional networking may be important enough to meritl.Info
level to ensure they are visible in default logging configurations.Consider changing the logging level:
-l.Debug("Infrastructure created successfully") +l.Info("Infrastructure created successfully") -l.Debug("Creating regional networking...") +l.Info("Creating regional networking...") ... -l.Debug("Regional networking created successfully") +l.Info("Regional networking created successfully")pkg/models/deployment.go (1)
Line range hint
228-242
: Handle potential incorrect region extraction logicThe code for extracting the region from the machine's region may not account for all possible cases, such as when the region name does not end with a letter. Ensure that the logic correctly handles all AWS regions and does not unintentionally modify the region string.
Review and correct the region extraction logic. Consider using a reliable method to extract the region without assuming the format.
pkg/providers/aws/aws_compute_operations.go (2)
377-379
: Parameterize Architecture inGetLatestUbuntuAMI
The architecture is hardcoded to
"x86_64"
when callingGetLatestUbuntuAMI
. This may not accommodate machines with different architectures, such as ARM instances. Consider parameterizing the architecture to support various machine types.Apply this diff to parameterize the architecture:
- amiID, err := p.GetLatestUbuntuAMI(ctx, machine.GetRegion(), "x86_64") + amiID, err := p.GetLatestUbuntuAMI(ctx, machine.GetRegion(), machine.GetArchitecture())Ensure that
machine.GetArchitecture()
returns the appropriate architecture string (e.g.,"x86_64"
or"arm64"
).
386-400
: Consolidate TagSpecifications to Avoid DuplicationMultiple
TagSpecification
entries with the sameResourceType
(ec2_types.ResourceTypeInstance
) are being appended. AWS recommends consolidating tags into a singleTagSpecification
per resource type to prevent potential issues.Refactor the code to combine all tags into a single
TagSpecification
:- var tagSpecs []ec2_types.TagSpecification - for k, v := range m.Deployment.Tags { - tagSpecs = append(tagSpecs, ec2_types.TagSpecification{ - ResourceType: ec2_types.ResourceTypeInstance, - Tags: []ec2_types.Tag{{Key: aws.String(k), Value: aws.String(v)}}, - }) - } - - tagSpecs = append(tagSpecs, ec2_types.TagSpecification{ - ResourceType: ec2_types.ResourceTypeInstance, - Tags: []ec2_types.Tag{ - {Key: aws.String("Name"), Value: aws.String(machine.GetName())}, - }, - }) + tags := []ec2_types.Tag{ + {Key: aws.String("Name"), Value: aws.String(machine.GetName())}, + } + for k, v := range m.Deployment.Tags { + tags = append(tags, ec2_types.Tag{Key: aws.String(k), Value: aws.String(v)}) + } + tagSpecs := []ec2_types.TagSpecification{ + { + ResourceType: ec2_types.ResourceTypeInstance, + Tags: tags, + }, + }pkg/models/machine.go (2)
278-308
: RefactorSetDisplayLocation
to Reduce Code DuplicationThe
SetDisplayLocation
method contains duplicated validation logic for each cloud provider. Refactoring this method to reduce code duplication will improve maintainability and readability.Consider abstracting the validation logic into a separate helper function or using a map of validation functions per provider.
690-701
: Consolidate Location Validation LogicThe
IsValidLocation
function duplicates logic similar toSetDisplayLocation
. Consolidate the location validation and normalization to a shared utility function to ensure consistency and reduce maintenance overhead.pkg/providers/aws/provider.go (1)
33-39
: Externalize Configuration Parameters for FlexibilityThe hardcoded constants for timeouts, intervals, and other configurations limit flexibility. Consider externalizing these parameters into configuration files or environment variables to allow adjustments without code changes, catering to different deployment environments and requirements.
pkg/providers/aws/utils.go (1)
17-17
: Consider adding region filtering capabilitiesWhile the implementation is correct, consider adding optional region filtering to avoid unnecessary API calls for unsupported regions.
func (p *AWSProvider) GetAllAWSRegions( ctx context.Context, client ec2.Client, + filter *ec2.Filter, ) ([]string, error) { output, err := client.DescribeRegions(ctx, &ec2.DescribeRegionsInput{ AllRegions: aws.Bool(true), + Filters: filter, })internal/clouds/azure/locations_test.go (1)
49-58
: Add test cases for additional edge casesWhile the basic validation cases are covered, consider adding test cases for:
- Regions with special characters
- Maximum length validation
- Case sensitivity handling
Example additional test cases:
{ name: "Region with special characters", region: "central-us-1", expectedValid: false, expectedReason: "region contains invalid characters", }, { name: "Region name too long", region: "centraluswesteuropenorthsouth", expectedValid: false, expectedReason: "region name exceeds maximum length", },internal/clouds/azure/locations.go (2)
Line range hint
10-12
: Update struct field name for consistencyThe
Locations
field inAzureData
struct should be renamed toRegions
to maintain consistency with the new terminology.Apply these changes:
type AzureData struct { - Locations map[string][]string `yaml:"locations"` + Regions map[string][]string `yaml:"regions"` }
Line range hint
66-93
: Update IsValidAzureVMSize to use region terminologyFor consistency, the
IsValidAzureVMSize
function should useregion
instead oflocation
in its parameters and error messages.Apply these changes:
-func IsValidAzureVMSize(location, vmSize string) bool { +func IsValidAzureVMSize(region, vmSize string) bool { l := logger.Get() data, err := getSortedAzureData() if err != nil { l.Warnf("Failed to get sorted Azure data: %v", err) return false } var azureData AzureData err = yaml.Unmarshal(data, &azureData) if err != nil { l.Warnf("Failed to unmarshal Azure data: %v", err) return false } - vmSizes, exists := azureData.Locations[location] + vmSizes, exists := azureData.Regions[region] if !exists { - l.Warnf("Location not found: %s", location) + l.Warnf("Region not found: %s", region) return false } for _, size := range vmSizes { if size == vmSize { return true } } - l.Warnf("Invalid VM size for location: %s, vmSize: %s", location, vmSize) + l.Warnf("Invalid VM size for region: %s, vmSize: %s", region, vmSize) return false }pkg/providers/azure/resource_group.go (4)
Line range hint
13-32
: Update parameter names for consistencyFor consistency with the new region/zone terminology, the parameter
rgLocation
should be renamed torgRegion
.Apply these changes:
func (c *LiveAzureClient) GetOrCreateResourceGroup(ctx context.Context, rgName string, - rgLocation string, + rgRegion string, tags map[string]string) (*armresources.ResourceGroup, error) { l := logger.Get() if rgName == "" { return nil, fmt.Errorf("rgName is not set") } if !common.IsValidResourceGroupName(rgName) { return nil, fmt.Errorf("invalid resource group name: %s", rgName) } - if !internal.IsValidAzureRegion(rgLocation) { - return nil, fmt.Errorf("invalid resource group location: %s", rgLocation) + if !internal.IsValidAzureRegion(rgRegion) { + return nil, fmt.Errorf("invalid resource group region: %s", rgRegion) }
Line range hint
34-35
: Update function call for consistencyThe call to
GetResourceGroup
should use the renamed parameter.Apply this change:
- existing, err := c.GetResourceGroup(ctx, rgLocation, rgName) + existing, err := c.GetResourceGroup(ctx, rgRegion, rgName)
Line range hint
47-49
: Update ResourceGroup creation parametersThe Location field in the ResourceGroup struct should be set using the region parameter.
Apply this change:
parameters := armresources.ResourceGroup{ Name: to.Ptr(rgName), - Location: to.Ptr(rgLocation), + Location: to.Ptr(rgRegion), Tags: dereferencedTags, }
Line range hint
63-82
: Update GetResourceGroup signature for consistencyThe
GetResourceGroup
method should use region terminology in its parameters.Apply these changes:
func (c *LiveAzureClient) GetResourceGroup(ctx context.Context, - rgLocation string, + rgRegion string, rgName string) (*armresources.ResourceGroup, error) {internal/clouds/gcp/locations.go (2)
Line range hint
26-42
: Add input validation for zone formatThe function should validate the zone format before performing the lookup. GCP zones follow a specific format (e.g., 'us-central1-a').
Consider adding format validation:
func IsValidGCPZone(zone string) bool { l := logger.Get() + if !strings.Contains(zone, "-") { + return false + } gcpDataRaw, err := GetGCPData()
Line range hint
64-93
: Improve error handling and remove magic numbersThe function has several issues:
- Magic number in array index access
- Generic error messages that don't indicate the specific validation failure
Consider this improvement:
- //nolint:mnd - if len(parts) >= 6 { + const expectedPartsLength = 6 + const projectIDIndex = 6 + if len(parts) >= expectedPartsLength { - projectID := parts[6] // e.g., "ubuntu-os-cloud" + projectID := parts[projectIDIndex] imageURL := GetGCPDiskImageURL(projectID, diskImage.Family) return imageURL, nil } } - return "", fmt.Errorf("invalid disk image family for GCP: %s", diskImageFamilyToCheck) + return "", fmt.Errorf("invalid license format for disk image family '%s'", diskImageFamilyToCheck)internal/clouds/gcp/locations_test.go (1)
Line range hint
13-47
: Add edge cases to zone validation testsThe test suite would benefit from additional edge cases:
- Zone with special characters
- Zone with multiple hyphens
Add these test cases:
{ name: "Empty location", zone: "", expectedValid: false, }, + { + name: "Zone with special characters", + zone: "us-central1-a!", + expectedValid: false, + }, + { + name: "Zone with multiple hyphens", + zone: "us-central1-a-extra", + expectedValid: false, + },cmd/beta/gcp/create_vm.go (2)
Line range hint
141-150
: Use GetGCPRegionFromZone from locations packageThe
getRegionFromZone
function duplicates functionality already available in the locations package.Replace the local implementation with the package function:
-func getRegionFromZone(zone string) string { - // GCP zones are typically in the format of <region>-<zone>, e.g., us-central1-a - parts := strings.Split(zone, "-") - if len(parts) < 2 { //nolint:mnd - return "" - } - return strings.Join(parts[:len(parts)-1], "-") +func getRegionFromZone(zone string) string { + region, err := GetGCPRegionFromZone(zone) + if err != nil { + return "" + } + return region }
Based on the available zone information in the codebase, I can now provide a final response:
Enhance zone validation error message with nearby zone suggestions
When an invalid zone is provided, suggest zones from the same region. For example, if
us-central1-x
is invalid, suggestus-central1-a
,us-central1-b
,us-central1-c
, andus-central1-f
as alternatives.
- Update the error message in
cmd/beta/gcp/create_vm.go
to extract the region from the invalid zone and suggest all valid zones in that region.- Example implementation:
if strings.Contains(err.Error(), "Unknown zone") { region := zone[:strings.LastIndex(zone, "-")] return fmt.Errorf( `invalid zone '%s'. Available zones in region %s are: %s You can list all available zones using: \tgcloud compute zones list`, zone, region, getAvailableZonesForRegion(region), // implement this helper function ) }🔗 Analysis chain
Line range hint
124-132
: Enhance zone validation error messageThe error message for invalid zones could be more helpful by suggesting nearby valid zones.
Let's verify if we can find the available zones in the codebase:
Let's examine the content of the found YAML file to see if it contains zone information that could be used for suggesting nearby zones.
Let me check if we can get the zone information in a more manageable way by focusing on just the zone names.
Let me try a different approach to extract the zone names.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for GCP zone definitions or constants rg -l "us-central1-[a-z]" --type yamlLength of output: 73
Script:
#!/bin/bash # Check the content of the GCP data file cat internal/clouds/gcp/gcp_data.yamlLength of output: 65605
Script:
#!/bin/bash # Extract just the zone names from the YAML file grep -o '^[[:space:]]*[a-z0-9-]*-[a-z][0-9]*-[a-z]:' internal/clouds/gcp/gcp_data.yaml | tr -d ' :'Length of output: 101
Script:
#!/bin/bash # Extract zone names using awk awk -F: '/^[[:space:]]*[a-z].*:/ {print $1}' internal/clouds/gcp/gcp_data.yaml | sed 's/^[[:space:]]*//'Length of output: 8463
cmd/beta/testdisplay_test.go (1)
31-73
: Consider refactoring repeated test machine configurations.The test machine configurations are nearly identical with only minor differences in the name and status message. Consider creating a helper function to generate test machines with customizable parameters.
func createTestMachine(name string, statusMessage string) *models.Machine { return &models.Machine{ Name: name, CloudProvider: models.DeploymentTypeAzure, Type: models.AzureResourceTypeVM, DisplayLocation: "us-west-2", StatusMessage: statusMessage, Orchestrator: true, StartTime: time.Now().Add(-29 * time.Second), } }pkg/display/display_model.go (1)
286-289
: Consider using a constant for the resource type stringThe hardcoded string
"compute.googleapis.com/Instance"
should be moved to a constant to improve maintainability and prevent typos.+const ( + ComputeInstanceResourceType = "compute.googleapis.com/Instance" +) + if machine.GetPublicIP() != "" && machine.GetMachineResourceState( - "compute.googleapis.com/Instance", + ComputeInstanceResourceType, ) == models.ResourceStateSucceeded {pkg/models/types.go (1)
363-374
: Update error message to maintain terminology consistency.The error message still uses "location" while the function has been updated to use "region" terminology.
Apply this change for consistency:
- return nil, fmt.Errorf("location is empty") + return nil, fmt.Errorf("region is empty")pkg/logger/logger.go (2)
122-123
: Document the rationale for disabling console logging.This change might impact debugging capabilities. Please add a comment explaining why console logging is being explicitly disabled.
155-159
: Document the nolint directive rationale.The file permissions and flags are correctly set, but please add a comment explaining why the gosec and mnd linter checks are disabled.
cmd/beta/gcp/create_deployment_test.go (1)
354-356
: LGTM! Good separation of region and zone validation.The changes correctly validate both the region and zone separately, improving test clarity and aligning with the standardized location terminology.
Consider adding a comment explaining the relationship between region and zone for better documentation:
+// Verify region and zone are correctly set +// Region (e.g., us-west1) is extracted from the zone (e.g., us-west1-b) machine := m suite.Equal("us-west1", machine.GetRegion(), "Expected location to be us-west1") suite.Equal("us-west1-b", machine.GetZone(), "Expected zone to be us-west1-b")
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (7)
.cspell/custom-dictionary.txt
is excluded by none and included by none.envrc
is excluded by none and included by noneai/sop/spot.md
is excluded by none and included by nonedelete-vpcs.sh
is excluded by none and included by nonemocks/aws/mock_EC2Clienter.go
is excluded by none and included by nonerequirements.txt
is excluded by none and included by nonetest/integration/create_deployment_test.go
is excluded by none and included by none
📒 Files selected for processing (42)
cmd/andaime.go
(2 hunks)cmd/beta/aws/create_deployment.go
(5 hunks)cmd/beta/azure/create_deployment_test.go
(1 hunks)cmd/beta/gcp/create_deployment_test.go
(1 hunks)cmd/beta/gcp/create_vm.go
(1 hunks)cmd/beta/gcp/progress_test.go
(1 hunks)cmd/beta/provision/provisioner_test.go
(1 hunks)cmd/beta/testdisplay_test.go
(3 hunks)internal/clouds/azure/locations.go
(2 hunks)internal/clouds/azure/locations_test.go
(1 hunks)internal/clouds/gcp/locations.go
(3 hunks)internal/clouds/gcp/locations_test.go
(3 hunks)internal/clouds/general/location.go
(1 hunks)internal/testdata/aws.go
(1 hunks)pkg/display/display_model.go
(2 hunks)pkg/display/rendering.go
(2 hunks)pkg/logger/logger.go
(4 hunks)pkg/models/aws.go
(0 hunks)pkg/models/deployment.go
(4 hunks)pkg/models/machine.go
(7 hunks)pkg/models/types.go
(2 hunks)pkg/providers/aws/aws_compute_operations.go
(7 hunks)pkg/providers/aws/integration_test.go
(0 hunks)pkg/providers/aws/provider.go
(39 hunks)pkg/providers/aws/provider_test.go
(9 hunks)pkg/providers/aws/utils.go
(1 hunks)pkg/providers/azure/create_resource_test.go
(1 hunks)pkg/providers/azure/deploy.go
(5 hunks)pkg/providers/azure/deploy_bacalhau_test.go
(1 hunks)pkg/providers/azure/integration_test.go
(1 hunks)pkg/providers/azure/resource_group.go
(1 hunks)pkg/providers/common/cluster_deployer.go
(1 hunks)pkg/providers/common/deployment.go
(4 hunks)pkg/providers/common/machine_config.go
(5 hunks)pkg/providers/common/machines.go
(2 hunks)pkg/providers/gcp/client_compute.go
(4 hunks)pkg/providers/gcp/gcp_cluster_deployer.go
(1 hunks)pkg/providers/gcp/integration_test.go
(1 hunks)pkg/providers/gcp/machines.go
(1 hunks)pkg/providers/gcp/provider.go
(3 hunks)pkg/sshutils/constants.go
(1 hunks)pkg/sshutils/ssh_config.go
(3 hunks)
💤 Files with no reviewable changes (2)
- pkg/models/aws.go
- pkg/providers/aws/integration_test.go
✅ Files skipped from review due to trivial changes (2)
- pkg/providers/gcp/gcp_cluster_deployer.go
- pkg/providers/common/cluster_deployer.go
🔇 Additional comments (40)
cmd/beta/aws/create_deployment.go (2)
43-43
: Ensure console logging is appropriately configured
Changing EnableConsole
from true
to false
disables console logging. Confirm that this change is intentional and that file-based logging is sufficient for your debugging and monitoring needs.
431-432
: Include 'zone' and 'region' in deployment configuration
When writing the deployment configuration, ensure that both zone
and region
are included for each machine to maintain consistency and completeness of the deployment data.
pkg/models/deployment.go (4)
469-473
: Protect 'Clients' map assignment with a mutex
Ensure that the assignment to r.Clients[region]
in SetClient
is thread-safe by holding the lock during the operation.
497-507
: Handle errors from 'viper.WriteConfig' in 'SaveVPCConfig'
In the SaveVPCConfig
method, after calling viper.WriteConfig()
, the error is returned but not handled in the caller function. Ensure that any errors are appropriately handled to prevent silent failures.
463-467
:
Initialize 'Clients' map before access
Before accessing r.Clients[region]
, ensure that the Clients
map is initialized to prevent nil map dereference.
Initialize the map if necessary:
func (r *RegionalResources) GetClient(region string) aws_interface.EC2Clienter {
r.RLock()
defer r.RUnlock()
+ if r.Clients == nil {
+ r.Clients = make(map[string]aws_interface.EC2Clienter)
+ }
return r.Clients[region]
}
Likely invalid or redundant comment.
434-435
:
Correct mutex usage in 'GetVPC' method
In the GetVPC
method, the use of RLock()
should be paired with RUnlock()
. Currently, mu.Lock()
and mu.Unlock()
are used, which can lead to unintended blocking behavior.
Apply this fix to use read locks appropriately:
-func (r *RegionalResources) GetVPC(region string) *AWSVPC {
- r.RLock()
+func (r *RegionalResources) GetVPC(region string) *AWSVPC {
+ r.mu.RLock()
defer r.mu.RUnlock()
if r.VPCs == nil {
r.VPCs = make(map[string]*AWSVPC)
}
return r.VPCs[region]
}
Likely invalid or redundant comment.
pkg/providers/aws/aws_compute_operations.go (1)
531-543
: Verify DescribeAvailabilityZones
Usage with Filters and AllAvailabilityZones
When calling DescribeAvailabilityZones
with AllAvailabilityZones
set to true
, the filters may be ignored, and all zones are returned, including not available ones. This might not be the intended behavior. Consider removing the AllAvailabilityZones
parameter or adjusting the filters to ensure only available zones in the specified region are retrieved.
Run the following script to verify the behavior:
Ensure that the output matches the expected available zones in the specified region.
pkg/providers/aws/provider.go (1)
Line range hint 2176-2226
: Handle Empty Results in GetLatestUbuntuAMI
Gracefully
If no matching AMIs are found, GetLatestUbuntuAMI
returns an error. Ensure that this error is properly handled by the calling functions to prevent runtime panics or unhandled exceptions when an empty result is encountered.
Run the following script to check the function's behavior with no matching AMIs:
✅ Verification successful
Empty AMI results are properly handled in the codebase
The error from GetLatestUbuntuAMI
is properly propagated and handled by its caller in aws_compute_operations.go
. When no matching AMIs are found, the function returns a descriptive error message that includes both the region and architecture, and the caller wraps this error appropriately with additional context using fmt.Errorf("failed to get latest AMI: %w", err)
. This ensures that the error is properly propagated up the call stack with sufficient context for debugging.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Test GetLatestUbuntuAMI with an invalid architecture to simulate no matching results.
aws ec2 describe-images \
--region "$REGION" \
--filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-*-server-*" \
"Name=architecture,Values=invalid-arch"
# Expected: The command should return an empty list, and the code should handle this gracefully.
Length of output: 250
Script:
#!/bin/bash
# Let's analyze how the error from GetLatestUbuntuAMI is handled by its callers
# First, find all references to GetLatestUbuntuAMI
rg "GetLatestUbuntuAMI" -A 3
# Then, let's look at the function definition and its error handling
ast-grep --pattern 'func $$$GetLatestUbuntuAMI($$$) (string, error) {
$$$
}'
Length of output: 852
pkg/sshutils/constants.go (1)
11-11
: Evaluate the Increased SSHRetryDelay
Impact
Increasing SSHRetryDelay
from 3 * time.Second
to 20 * time.Second
significantly extends the total SSH retry time. This might lead to longer deployment times, especially in high-retry scenarios. Consider whether such a substantial increase is necessary or if a smaller increment would suffice.
Please verify if the increased delay is essential for your use case. If variable network conditions necessitate this change, consider making SSHRetryDelay
a configurable parameter.
pkg/providers/aws/utils.go (1)
12-15
: Verify AWS client initialization across the codebase
The change to accept ec2.Client
as a parameter improves testability through dependency injection. However, given the PR title mentioning AWS provisioning bugs, we should verify all AWS client initialization paths.
✅ Verification successful
Based on the results, I can now generate the final response:
AWS EC2 client initialization is properly handled
The change to accept ec2.Client
as a parameter aligns well with the existing codebase patterns. The search results show:
- All EC2 client initializations are done properly using
ec2.New
orec2.NewFromConfig
- The client is consistently passed through function parameters
- No instances of nil client checks were found, suggesting robust initialization practices
- The change maintains consistency with other AWS service interactions in the codebase
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for AWS client initialization patterns
rg -A 3 "ec2\.New|ec2\.Client"
# Look for potential nil client usage
ast-grep --pattern 'if $client == nil { $$$ }'
Length of output: 5857
cmd/beta/gcp/progress_test.go (1)
42-43
: LGTM! Valid GCP region and zone configuration.
The region "us-central1" and zone "us-central1-a" are valid GCP locations, and the change aligns with the standardization of location terminology across the codebase.
internal/testdata/aws.go (1)
173-181
: LGTM! Consistent mock implementation.
The new mock function follows the established patterns and provides necessary test data for AWS Internet Gateway functionality. This addition supports the AWS provisioning improvements mentioned in the PR objectives.
Let's verify the consistency of the Internet Gateway ID format across the codebase:
✅ Verification successful
Internet Gateway ID format is consistent across the codebase
The verification confirms that all Internet Gateway IDs in the codebase follow the same format "igw-12345", maintaining consistency between:
- The new mock function in
internal/testdata/aws.go
- Other mock data in the same file
- The test constant
FAKE_IGW_ID
inpkg/providers/aws/provider_test.go
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for consistency in Internet Gateway ID format
rg -A 1 'igw-[0-9a-z]+' --type go
Length of output: 712
pkg/providers/azure/create_resource_test.go (1)
155-155
: LGTM! Valid Azure region configuration.
The change from Location to Region aligns with the terminology standardization, and the test uses valid Azure regions.
Let's verify the Azure region validation across the codebase:
✅ Verification successful
Region validation is properly implemented and used
The codebase has proper Azure region validation through internal_azure.IsValidAzureRegion()
function, which is correctly integrated in the following key places:
pkg/models/machine.go
: Validates Azure regions during machine configurationpkg/providers/azure/resource_group.go
: Validates regions for resource group creationinternal/clouds/azure/locations.go
: Contains the core validation logic against Azure location data
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for Azure region validation implementation
ast-grep --pattern 'func IsValidAzureRegion($region string) bool {
$$$
}'
Length of output: 77
Script:
#!/bin/bash
# Let's search for any region validation patterns in the codebase
rg -i "region.*valid" -A 5
# Also search for region-related constants or variables
rg "Region.*=.*\"[a-z]+\""
# Check for any Azure-specific region validation
rg -i "azure.*region" -A 5
Length of output: 7190
cmd/beta/testdisplay_test.go (1)
201-206
: LGTM: Progress bar test configuration is consistent.
The test machine configuration in the progress bar test follows the same pattern as the other test configurations.
pkg/providers/common/machine_config.go (1)
Line range hint 236-248
: LGTM: Function signature update is consistent.
The addition of region and zone parameters to createNewMachine and their usage in NewMachine is consistent with the location standardization changes.
cmd/beta/azure/create_deployment_test.go (2)
227-232
: LGTM: Interface usage improves flexibility.
The change from concrete type *models.Machine
to interface type models.Machiner
and the update to use GetRegion()
instead of GetLocation()
improves the code's flexibility and maintains consistency with the location standardization changes.
Line range hint 1-1
: Verify consistent location terminology usage across codebase.
The changes standardize location terminology from "location" to "region" and "zone". Let's verify this is consistently applied across the codebase.
✅ Verification successful
Let me gather more specific information about the usage of region terminology to ensure we have a complete picture.
Based on the comprehensive search results, I can now provide the final verification response:
Location terminology standardization is consistently implemented across the codebase
The codebase shows a clear and consistent pattern in handling location terminology:
- Cloud provider-specific code (AWS, GCP, Azure) correctly uses Region/Zone terminology
- Display and model layers properly handle location abstractions
- No legacy location getter/setter methods found
- Remaining "Location" fields in structs are appropriately used for display/template purposes
The changes align with cloud provider conventions where:
- AWS uses regions (e.g., us-east-1) and availability zones (e.g., us-east-1a)
- GCP uses regions (e.g., us-central1) and zones (e.g., us-central1-a)
- Azure uses regions for resource placement
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining references to the old location terminology
# that might have been missed during the update.
# Search for old getter/setter method names
echo "Checking for old location getter/setter methods..."
rg "Get[S]?etLocation\(" --type go
# Search for Location field declarations
echo "Checking for Location field declarations..."
ast-grep --pattern 'struct {
$$$
Location $_
$$$
}'
# Search for location-related function parameters
echo "Checking for location parameters in function signatures..."
ast-grep --pattern 'func $_($_: string, location string, $$$) $_'
Length of output: 3891
Script:
#!/bin/bash
# Check for region-related terminology usage
echo "Checking for region-related fields and methods..."
rg "Region|Zone" --type go -g "!*_test.go" -A 2
# Check for location string usage in function calls
echo "Checking for location string usage in function calls..."
ast-grep --pattern 'func $_($_, location string) $_'
# Check for location variable assignments
echo "Checking for location variable assignments..."
rg "location :=" --type go
Length of output: 79488
pkg/display/rendering.go (2)
179-179
: LGTM: Location terminology standardization
The change from GetLocation()
to GetDisplayLocation()
maintains consistent terminology across the codebase.
212-212
: LGTM: Display location formatting remains intact
The change maintains consistent terminology while preserving the existing truncation logic for display purposes.
pkg/providers/aws/provider_test.go (2)
105-106
: LGTM: Improved mock client initialization
The mock client initialization is now more clearly associated with regional resources, improving code clarity.
Line range hint 294-306
: LGTM: Test updated to match new method signature
The test case correctly incorporates the region parameter while maintaining proper mock expectations.
pkg/display/display_model.go (1)
224-224
: LGTM: Consistent method naming
The change to SetDisplayLocation
maintains consistency with the codebase-wide terminology standardization.
pkg/models/types.go (1)
280-280
: LGTM: Location to Region terminology update.
The change from GetMachinesInLocation
to GetMachinesInRegion
aligns with the standardization of location terminology.
Let's verify other potential location-related method calls:
✅ Verification successful
Based on the search results, I can now generate the final response since we have sufficient information:
Verified: Location to Region terminology update is complete
The codebase search confirms that all instances of GetMachinesInLocation
have been replaced with GetMachinesInRegion
. No remaining references to the old method name were found, indicating the terminology standardization is complete and consistent.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any remaining references to "Location" in method calls
rg -l 'GetMachinesInLocation|GetLocation\('
Length of output: 45
Script:
#!/bin/bash
# Let's try a broader search for "location" related methods and variables
rg -i 'location' --type go -A 2
# Also search for any remaining instances of old terminology
ast-grep --pattern 'GetMachinesInLocation($_)'
Length of output: 120974
pkg/providers/azure/deploy_bacalhau_test.go (1)
309-309
: LGTM: Test updated for Region terminology.
The test case correctly uses the new Region
field name, maintaining consistency with the terminology changes.
pkg/logger/logger.go (3)
25-25
: LGTM: LastLogLines constant addition.
The constant standardizes the number of log lines to retrieve, matching the default buffer size.
144-152
: LGTM: Comprehensive encoder configuration.
The encoder configuration follows zap's best practices with proper key naming and encoding settings.
Line range hint 494-498
: LGTM: Standardized log line retrieval.
The method now uses the LastLogLines constant, providing consistent behavior across the codebase.
cmd/beta/provision/provisioner_test.go (1)
420-420
: LGTM: Proper region and zone separation
The geographical hierarchy is correctly implemented with proper AWS region and zone values.
Also applies to: 423-424
pkg/providers/gcp/integration_test.go (1)
142-143
: Verify location handling across the codebase
The duplicate location usage pattern appears in multiple cloud provider tests. Let's verify if this is a wider issue.
pkg/sshutils/ssh_config.go (2)
Line range hint 589-605
: LGTM! Enhanced logging implementation.
The changes improve error visibility and debugging by:
- Using structured logging instead of direct output
- Including more detailed connection attempt information
Also applies to: 631-632
Line range hint 786-789
: LGTM! Good addition of modular SSH configuration methods.
The new methods enhance testability and maintainability through proper dependency injection and state management.
pkg/providers/azure/deploy.go (2)
94-94
: LGTM! Consistent use of region terminology.
The change properly aligns with the standardization of location terminology across the codebase.
188-189
: LGTM! Consistent region-based machine grouping.
The changes maintain the existing functionality while properly adopting the standardized location terminology.
pkg/providers/gcp/client_compute.go (3)
623-623
: LGTM: Consistent with the zone terminology change
The change from GetLocation()
to GetZone()
is consistent with the standardization effort and properly validates the zone parameter.
682-682
: LGTM: Consistent zone parameter usage in machine type path
The change from GetLocation()
to GetZone()
is consistent and correctly used in constructing the machine type path.
575-575
: Verify the interface change from GetLocation to GetZone
The change from GetLocation()
to GetZone()
appears to be part of a broader standardization effort.
✅ Verification successful
Interface change from GetLocation to GetZone is correctly implemented
The verification shows that:
- The
Machiner
interface correctly definesGetZone()
method GetZone()
is consistently used across the codebase in GCP, Azure, and AWS providers- No occurrences of
GetLocation()
were found, indicating complete migration - The change aligns with the interface's current definition and its implementations
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that the Machiner interface has been updated to use GetZone
# and that all implementations have been updated accordingly.
# Check the interface definition
rg -A 5 "type\s+Machiner\s+interface"
# Check for any remaining uses of GetLocation
rg "GetLocation\(\)"
Length of output: 317
Script:
#!/bin/bash
# Let's check the interface definition with more context lines and search for GetZone
rg -A 15 "type\s+Machiner\s+interface"
# Search for GetZone method
rg "GetZone\(\)"
# Search for any files implementing Machiner interface
rg "struct.*implements.*Machiner|struct.*satisfies.*Machiner"
Length of output: 2103
pkg/providers/gcp/provider.go (1)
787-787
: LGTM: Consistent zone terminology usage
The changes from GetLocation()
to GetZone()
are consistent with the standardization effort across the codebase.
Also applies to: 797-797
cmd/andaime.go (3)
196-207
: LGTM: Well-documented function with clear parameters and return values
The added documentation clearly describes the function's purpose, parameters, and return values.
209-211
: LGTM: Good use of constants for magic values
Extracting the Ubuntu version and Canonical owner ID into named constants improves maintainability and readability.
227-227
: LGTM: Improved error handling with wrapped errors
The error handling has been improved by:
- Using
fmt.Errorf
with%w
verb for error wrapping - Adding more descriptive error messages
- Proper error propagation
Also applies to: 232-232, 236-236, 241-241, 249-249
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)
internal/clouds/general/location.go (1)
81-85
: Add length check for AWS zone parsingThe AWS zone parsing could have an edge case with an empty string. While the empty input check at the start prevents most issues, it's better to be explicit here for maintainability.
// Check if the input is an AWS-style zone (ends with a letter a-z) - if len(input) > 0 && input[len(input)-1] >= 'a' && input[len(input)-1] <= 'z' { + if len(input) > 1 && input[len(input)-1] >= 'a' && input[len(input)-1] <= 'z' { // Input is a zone, extract region by removing the last character return input[:len(input)-1], input, nil }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
cmd/beta/aws/create_deployment.go
(5 hunks)internal/clouds/general/location.go
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- cmd/beta/aws/create_deployment.go
🔇 Additional comments (1)
internal/clouds/general/location.go (1)
9-61
: Well-structured implementation with comprehensive provider handling
The function effectively normalizes locations across different cloud providers with:
- Detailed documentation of location formats
- Consistent error handling
- Appropriate debug logging
- Clear separation of provider-specific logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
cmd/beta/aws/create_deployment.go (3)
278-285
: Consider enhancing error handling with resource cleanupWhile the error handling is proper, consider adding cleanup logic when regional networking creation fails to avoid orphaned resources.
if err := awsProvider.CreateRegionalResources(ctx, m.Deployment.AWS.RegionalResources.GetRegions()); err != nil { + // Attempt to clean up any partially created resources + if cleanupErr := awsProvider.CleanupRegionalResources(ctx); cleanupErr != nil { + l.Error(fmt.Sprintf("Failed to cleanup regional resources: %v", cleanupErr)) + } return fmt.Errorf("failed to create regional networking: %w", err) }
340-347
: Add error handling for queue operationsConsider handling potential errors during update queue operations to ensure UI updates are not silently dropped.
-m.QueueUpdate(display.UpdateAction{ +if err := m.QueueUpdate(display.UpdateAction{ MachineName: machine.GetName(), UpdateData: display.UpdateData{ UpdateType: display.UpdateTypeResource, ResourceType: "SSH", ResourceState: models.MachineResourceState(models.ServiceTypeSSH.State), }, -}) +}); err != nil { + l.Warn(fmt.Sprintf("Failed to queue display update for machine %s: %v", machine.GetName(), err)) +}
Line range hint
1-458
: Consider architectural improvements for better separation of concernsThe file handles multiple responsibilities (deployment, SSH, display updates, config management). Consider splitting these into separate packages or services for better maintainability and testing.
Suggested improvements:
- Extract SSH handling into a dedicated service
- Create a separate config management package
- Implement a proper state management service for resource states
This would improve:
- Code organization
- Testability
- Maintainability
- Separation of concerns
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
.cspell/custom-dictionary.txt
is excluded by none and included by none
📒 Files selected for processing (16)
cmd/beta/aws/create_deployment.go
(6 hunks)internal/clouds/general/location.go
(1 hunks)pkg/models/interfaces/aws/mocks/aws/mock_EC2Clienter.go
(0 hunks)pkg/models/interfaces/aws/mocks/azure/mock_AzureClienter.go
(0 hunks)pkg/models/interfaces/aws/mocks/azure/mock_Pollerer.go
(0 hunks)pkg/models/interfaces/aws/mocks/common/mock_Clienter.go
(0 hunks)pkg/models/interfaces/aws/mocks/common/mock_ClusterDeployerer.go
(0 hunks)pkg/models/interfaces/aws/mocks/gcp/mock_GCPClienter.go
(0 hunks)pkg/models/interfaces/aws/mocks/sshutils/mock_SFTPClientCreator.go
(0 hunks)pkg/models/interfaces/aws/mocks/sshutils/mock_SFTPClienter.go
(0 hunks)pkg/models/interfaces/aws/mocks/sshutils/mock_SFTPFile.go
(0 hunks)pkg/models/interfaces/aws/mocks/sshutils/mock_SSHClientCreator.go
(0 hunks)pkg/models/interfaces/aws/mocks/sshutils/mock_SSHClienter.go
(0 hunks)pkg/models/interfaces/aws/mocks/sshutils/mock_SSHConfiger.go
(0 hunks)pkg/models/interfaces/aws/mocks/sshutils/mock_SSHSessioner.go
(0 hunks)pkg/providers/aws/interfaces/ec2_clienter.go
(0 hunks)
💤 Files with no reviewable changes (14)
- pkg/providers/aws/interfaces/ec2_clienter.go
- pkg/models/interfaces/aws/mocks/common/mock_Clienter.go
- pkg/models/interfaces/aws/mocks/sshutils/mock_SFTPClientCreator.go
- pkg/models/interfaces/aws/mocks/sshutils/mock_SSHClientCreator.go
- pkg/models/interfaces/aws/mocks/sshutils/mock_SSHClienter.go
- pkg/models/interfaces/aws/mocks/azure/mock_AzureClienter.go
- pkg/models/interfaces/aws/mocks/azure/mock_Pollerer.go
- pkg/models/interfaces/aws/mocks/sshutils/mock_SFTPFile.go
- pkg/models/interfaces/aws/mocks/sshutils/mock_SFTPClienter.go
- pkg/models/interfaces/aws/mocks/sshutils/mock_SSHSessioner.go
- pkg/models/interfaces/aws/mocks/common/mock_ClusterDeployerer.go
- pkg/models/interfaces/aws/mocks/sshutils/mock_SSHConfiger.go
- pkg/models/interfaces/aws/mocks/gcp/mock_GCPClienter.go
- pkg/models/interfaces/aws/mocks/aws/mock_EC2Clienter.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/clouds/general/location.go
🔇 Additional comments (3)
cmd/beta/aws/create_deployment.go (3)
177-190
: LGTM! Properly initialized maps for RegionalResources
The implementation correctly addresses the potential nil map issue by initializing both VPCs and Clients maps before use, following the suggested fix from previous comments.
438-439
: LGTM! Consistent location terminology
The change from location
to region
and zone
fields aligns with the standardization of location terminology across the codebase.
319-339
:
Ensure thread-safe state updates
The SSH resource state updates should be protected by appropriate synchronization mechanisms to prevent race conditions during concurrent access.
Consider wrapping the state updates with a mutex:
+func (m *Machine) SetMachineResourceStateSafe(serviceType string, state ResourceState) {
+ m.mu.Lock()
+ defer m.mu.Unlock()
+ m.SetMachineResourceState(serviceType, state)
+}
Then use it in the code:
-machine.SetMachineResourceState(
+machine.SetMachineResourceStateSafe(
models.ServiceTypeSSH.Name,
models.ResourceStatePending,
)
Summary by CodeRabbit
Release Notes
New Features
Improvements
Bug Fixes
Tests