💥 See issue 819 on how to migrate to v7 smoothly. 💥 See PR 1204 on how to migrate to v8 smoothly.
This Terraform module creates a GitLab Runner. The original setup of the module is based on the blog post Auto scale GitLab CI runners and save 90% on EC2 costs, which describes the original version of the runner. See the post at 040code.
The runners created by the module use spot instances by default for running the builds using the docker+machine
executor.
- Shared cache in S3 with lifecycle management to clear objects after x days.
- Logs streamed to CloudWatch.
- Runner agents registered automatically.
The runner supports 3 main scenarios:
- GitLab CI docker-machine runner - one runner agent

  In this scenario the runner agent runs on a single EC2 node and runners are created by docker machine using spot instances. Runners scale automatically based on the configuration. The module creates an S3 cache by default, which is shared across runners (spot instances).

- GitLab CI docker-machine runner - multiple runner agents

  In this scenario multiple runner agents can be created with different configurations by instantiating the module multiple times. Runners scale automatically based on the configuration. The S3 cache can be shared across runners by managing the cache outside the module.

- GitLab CI docker runner

  In this scenario docker, rather than docker machine, is used to schedule the builds. Builds run on the same EC2 instance as the agent. Auto-scaling is not supported.
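As a rough sketch, the first and third scenarios could be instantiated as shown below. Only the input names come from the inputs table of this README. The module source address, the VPC and subnet IDs, the GitLab URL, the SSM parameter name and the numeric values are placeholders or assumptions, and the sketch assumes that object attributes which are not set here are optional.

```hcl
# Scenario 1: one runner agent, builds scheduled on spot instances via docker+machine.
module "runner_docker_machine" {
  # Assumption: adjust the source (and pin a version) to wherever you obtain the module.
  source = "cattle-ops/gitlab-runner/aws"

  environment = "ci-docker-machine"        # used as prefix and for tagging
  vpc_id      = "vpc-0123456789abcdef0"    # placeholder
  subnet_id   = "subnet-0123456789abcdef0" # placeholder, must belong to vpc_id

  runner_gitlab = {
    url                                           = "https://gitlab.example.com" # placeholder
    preregistered_runner_token_ssm_parameter_name = "example-runner-token"       # hypothetical name
  }

  runner_manager = {
    maximum_concurrent_jobs = 10 # illustrative value
  }

  runner_worker = {
    type = "docker+machine"
  }

  runner_worker_docker_machine_instance = {
    idle_count      = 0 # illustrative: keep no idle workers around
    max_growth_rate = 5 # illustrative: add at most 5 machines in parallel
  }
}

# Scenario 3: builds run with docker on the same EC2 instance as the agent.
module "runner_docker" {
  source = "cattle-ops/gitlab-runner/aws" # same assumption as above

  environment = "ci-docker"
  vpc_id      = "vpc-0123456789abcdef0"    # placeholder
  subnet_id   = "subnet-0123456789abcdef0" # placeholder

  runner_gitlab = {
    url                                           = "https://gitlab.example.com"  # placeholder
    preregistered_runner_token_ssm_parameter_name = "example-runner-token-docker" # hypothetical name
  }

  runner_worker = {
    type = "docker" # no auto-scaling in this scenario
  }
}
```

The second scenario follows the same pattern: the module is instantiated once per runner agent, each with its own `environment` and worker settings.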
For detailed concepts and usage please refer to usage.
PRs are welcome! Please see the contributing guide for more details.
Thanks to all the people who already contributed!
Made with contributors-img.
This project is licensed under the MIT License - see the LICENSE file for details.
Name | Version |
---|---|
terraform | >= 1.3 |
aws | >= 5.26 |
local | >= 2.4.0 |
tls | >= 3 |
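The version requirements above could be mirrored in the root module that calls this module, roughly as sketched below. The constraints are copied from the table. The provider source addresses are the usual HashiCorp registry addresses and are an assumption in the sense that the table does not list them.

```hcl
terraform {
  required_version = ">= 1.3"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.26"
    }
    local = {
      source  = "hashicorp/local"
      version = ">= 2.4.0"
    }
    tls = {
      source  = "hashicorp/tls"
      version = ">= 3"
    }
  }
}
```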
Name | Version |
---|---|
aws | 5.77.0 |
local | 2.5.2 |
tls | 4.0.6 |
Name | Source | Version |
---|---|---|
cache | ./modules/cache | n/a |
terminate_agent_hook | ./modules/terminate-agent-hook | n/a |
Name | Description | Type | Default | Required |
---|---|---|---|---|
debug | trace_runner_user_data: Enable bash trace for the user data script on the Agent. Be aware this could log sensitive data such as your GitLab runner token. write_runner_config_to_file: When enabled, outputs the rendered config.toml file in the root module. Note that enabling this can potentially expose sensitive information. write_runner_user_data_to_file: When enabled, outputs the rendered userdata.sh file in the root module. Note that enabling this can potentially expose sensitive information. | object({ | {} | no |
enable_managed_kms_key | Let the module manage a KMS key. Be aware of the costs of a custom key. Do not specify a kms_key_id when enable_managed_kms_key is set to true. | bool | false | no |
environment | A name that identifies the environment, used as prefix and for tagging. | string | n/a | yes |
iam_object_prefix | Set the name prefix of all AWS IAM resources. | string | "" | no |
iam_permissions_boundary | Name of permissions boundary policy to attach to AWS IAM roles | string | "" | no |
kms_key_id | KMS key id to encrypt the resources. Ensure that CloudWatch and Runner/Runner Workers have access to the provided KMS key. | string | "" | no |
kms_managed_alias_name | Alias added to the created KMS key. | string | "" | no |
kms_managed_deletion_rotation_window_in_days | Key deletion/rotation window for the created KMS key. Set to 0 for no rotation/deletion window. | number | 7 | no |
runner_ami_filter | List of maps used to create the AMI filter for the Runner AMI. Must resolve to an Amazon Linux 1, 2 or 2023 image. | map(list(string)) | { | no |
runner_ami_owners | The list of owners used to select the AMI of the Runner instance. | list(string) | [ | no |
runner_cloudwatch | enable = Boolean used to enable or disable the CloudWatch logging. log_group_name = Option to override the default name (environment) of the log group. Requires enable = true. retention_days = Retention for CloudWatch logs. Defaults to unlimited. Requires enable = true. | object({ | {} | no |
runner_enable_asg_recreation | Enable automatic redeployment of the Runner's ASG when the Launch Configs change. | bool | true | no |
runner_gitlab | ca_certificate = Trusted CA certificate bundle (PEM format). certificate = Certificate of the GitLab instance to connect to (PEM format). registration_token = (deprecated, replaced by the registration_token in runner_gitlab_registration_config.) Registration token to use to register the Runner. runner_version = Version of the GitLab Runner. Make sure that it is available for your AMI. See https://packages.gitlab.com/app/runner/gitlab-runner/search?dist=amazon%2F2023&filter=rpms&page=1&q= url = URL of the GitLab instance to connect to. url_clone = URL of the GitLab instance to clone from. Use only if the agent can’t connect to the GitLab URL. access_token_secure_parameter_store_name = (deprecated) The name of the SSM parameter to read the GitLab access token from. It must have the api scope and be pre-created. preregistered_runner_token_ssm_parameter_name = The name of the SSM parameter to read the preregistered GitLab Runner token from. | object({ | n/a | yes |
runner_gitlab_registration_config | (deprecated, replaced by runner_gitlab.preregistered_runner_token_ssm_parameter_name) Configuration used to register the Runner. See the README for an example, or reference the examples in the examples directory of this repo. GitLab documentation is also available at: https://docs.gitlab.com/ee/ci/runners/configure_runners.html | object({ | {} | no |
runner_gitlab_registration_token_secure_parameter_store_name | (deprecated, replaced by runner_gitlab.preregistered_runner_token_ssm_parameter_name) The name of the SSM parameter to read the GitLab Runner registration token from. | string | "gitlab-runner-registration-token" | no |
runner_gitlab_token_secure_parameter_store | Name of the Secure Parameter Store entry to hold the GitLab Runner token. | string | "runner-token" | no |
runner_install | amazon_ecr_credential_helper = Install amazon-ecr-credential-helper inside userdata_pre_install script. docker_machine_download_url = URL to download docker machine binary. If not set, the docker machine version will be used to download the binary. docker_machine_version = By default docker_machine_download_url is used to set the docker machine version. This version will be ignored once docker_machine_download_url is set. The version number is maintained by the CKI project. Check out at https://gitlab.com/cki-project/docker-machine/-/releases pre_install_script = Script to run before installing the Runner. post_install_script = Script to run after installing the Runner. start_script = Script to run after starting the Runner. yum_update = Update the yum packages before installing the Runner. | object({ | {} | no |
runner_instance | additional_tags = Map of tags that will be added to the Runner instance. collect_autoscaling_metrics = A list of metrics to collect. The allowed values are GroupDesiredCapacity, GroupInServiceCapacity, GroupPendingCapacity, GroupMinSize, GroupMaxSize, GroupInServiceInstances, GroupPendingInstances, GroupStandbyInstances, GroupStandbyCapacity, GroupTerminatingCapacity, GroupTerminatingInstances, GroupTotalCapacity, GroupTotalInstances. ebs_optimized = Enable EBS optimization for the Runner instance. max_lifetime_seconds = The maximum time a Runner should live before it is killed. monitoring = Enable the detailed monitoring on the Runner instance. name = Name of the Runner instance. name_prefix = Set the name prefix and override the Name tag for the Runner instance. private_address_only = Restrict the Runner to use private IP addresses only. If this is set to true the Runner will use a private IP address only in case the Runner Workers use private addresses only. root_device_config = The Runner's root block device configuration. Takes the following keys: device_name, delete_on_termination, volume_type, volume_size, encrypted, iops, throughput, kms_key_id. spot_price = By setting a spot bid price, the Runner is created via a spot request. Be aware that spot instances can be stopped by AWS. Choose "on-demand-price" to pay up to the current on demand price for the instance type chosen. ssm_access = Allows connecting to the Runner via SSM. type = EC2 instance type used. use_eip = Assigns an EIP to the Runner. | object({ | { | no |
runner_manager | For details check https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-global-section gitlab_check_interval = Number of seconds between checking for available jobs (check_interval). maximum_concurrent_jobs = The maximum number of jobs which can be processed by all Runners at the same time (concurrent). prometheus_listen_address = Defines an address (host:port) the Prometheus metrics HTTP server should listen on (listen_address). sentry_dsn = Sentry DSN of the project for the Runner Manager to use (uses legacy DSN format) (sentry_dsn). | object({ | {} | no |
runner_metadata_options | Enable the Runner instance metadata service. IMDSv2 is enabled by default. | object({ | { | no |
runner_networking | allow_incoming_ping = Allow ICMP Ping to the Runner. Specify allow_incoming_ping_security_group_ids too! allow_incoming_ping_security_group_ids = A list of security group ids that are allowed to ping the Runner. security_group_description = A description for the Runner's security group. security_group_ids = IDs of security groups to add to the Runner. | object({ | {} | no |
runner_networking_egress_rules | List of egress rules for the Runner. | list(object({ | [ | no |
runner_role | additional_tags = Map of tags that will be added to the role created. Useful for tag based authorization. allow_iam_service_linked_role_creation = Boolean used to control attaching the policy to the Runner to create service linked roles. assume_role_policy_json = The assume role policy for the Runner. create_role_profile = Whether to create the IAM role/profile for the Runner. If you provide your own role, make sure that it has the required permissions. policy_arns = List of policy ARNs to be added to the instance profile of the Runner. role_profile_name = IAM role/profile name for the Runner. If unspecified then ${var.iam_object_prefix}-instance is used. | object({ | {} | no |
runner_schedule_config | Map containing the configuration of the ASG scale-out and scale-in for the Runner. Will only be used if runner_schedule_enable is set to true. | map(any) | { | no |
runner_schedule_enable | Set to true to enable the auto scaling group schedule for the Runner. | bool | false | no |
runner_sentry_secure_parameter_store_name | The name used to store the Sentry DSN in Secure Parameter Store. | string | "sentry-dsn" | no |
runner_terminate_ec2_lifecycle_hook_name | Specifies a custom name for the ASG terminate lifecycle hook and related resources. | string | null | no |
runner_terminate_ec2_lifecycle_timeout_duration | Amount of time in seconds to wait for GitLab Runner to finish the jobs it has picked up. Defaults to the maximum_timeout configured + 5m. Maximum allowed is 7200 (2 hours). | number | null | no |
runner_terminate_ec2_timeout_duration | Timeout in seconds for the graceful terminate worker Lambda function. | number | 90 | no |
runner_terraform_timeout_delete_asg | Timeout when trying to delete the Runner ASG. | string | "10m" | no |
runner_worker | For detailed information, check https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runners-section. environment_variables = List of environment variables to add to the Runner Worker (environment). max_jobs = Number of jobs which can be processed in parallel by the Runner Worker. output_limit = Sets the maximum build log size in kilobytes. Default is 4MB (output_limit). request_concurrency = Limit number of concurrent requests for new jobs from GitLab (default 1) (request_concurrency). ssm_access = Allows connecting to the Runner Worker via SSM. type = The Runner Worker type to use. Currently supports docker+machine, docker or docker-autoscaler. | object({ | {} | no |
runner_worker_cache | Configuration to control the creation of the cache bucket. By default the bucket will be created and used as shared cache. To use the same cache across multiple Runner Workers, disable the creation of the cache and provide a policy and bucket name. See the public runner example for more details. For detailed documentation check https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnerscaches3-section access_log_bucket_id = The ID of the bucket where the access logs are stored. access_log_bucket_prefix = The bucket prefix for the access logs. authentication_type = A string that declares the AuthenticationType for [runners.cache.s3]. Can either be 'iam' or 'credentials'. bucket = Name of the cache bucket. Requires create = false. bucket_prefix = Prefix for s3 cache bucket name. Requires create = true. create = Boolean used to enable or disable the creation of the cache bucket. create_aws_s3_bucket_public_access_block = Boolean used to enable or disable the creation of the public access block for the cache bucket. Useful when organizations do not allow the creation of public access blocks on individual buckets (e.g. public access is blocked on all buckets at the organization level). expiration_days = Number of days before cache objects expire. Requires create = true. include_account_id = Boolean used to include the account id in the cache bucket name. Requires create = true. policy = Policy to use for the cache bucket. Requires create = false. random_suffix = Boolean used to enable or disable the use of a random string suffix on the cache bucket name. Requires create = true. shared = Boolean used to enable or disable the use of the cache bucket as shared cache. versioning = Boolean used to enable versioning on the cache bucket. Requires create = true. | object({ | {} | no |
runner_worker_docker_add_dind_volumes | Add certificates and docker.sock to the volumes to support docker-in-docker (dind) | bool | false | no |
runner_worker_docker_autoscaler | fleeting_plugin_version = The version of aws fleeting plugin. connector_config_user = User to connect to worker machine. key_pair_name = The name of the key pair used by the Runner to connect to the docker-machine Runner Workers. This variable is only supported when enables is set to true. capacity_per_instance = The number of jobs that can be executed concurrently by a single instance. max_use_count = Max job number that can run on a worker. update_interval = The interval to check with the fleeting plugin for instance updates. update_interval_when_expecting = The interval to check with the fleeting plugin for instance updates when expecting a state change. instance_ready_command = Executes this command on each instance provisioned by the autoscaler to ensure that it is ready for use. A failure results in the instance being removed. | object({ | {} | no |
runner_worker_docker_autoscaler_ami_filter | List of maps used to create the AMI filter for the Runner Worker. | map(list(string)) | { | no |
runner_worker_docker_autoscaler_ami_owners | The list of owners used to select the AMI of the Runner Worker. | list(string) | [ | no |
runner_worker_docker_autoscaler_asg | enable_mixed_instances_policy = Make use of autoscaling-group mixed_instances_policy capacities to leverage pools and spot instances. health_check_grace_period = Time (in seconds) after instance comes into service before checking health. health_check_type = Controls how health checking is done. Values are EC2 and ELB. instance_refresh_min_healthy_percentage = The amount of capacity in the Auto Scaling group that must remain healthy during an instance refresh to allow the operation to continue, as a percentage of the desired capacity of the Auto Scaling group. instance_refresh_triggers = Set of additional property names that will trigger an Instance Refresh. A refresh will always be triggered by a change in any of launch_configuration, launch_template, or mixed_instances_policy. max_growth_rate = The maximum number of machines that can be added to the runner in parallel. on_demand_base_capacity = Absolute minimum amount of desired capacity that must be fulfilled by on-demand instances. on_demand_percentage_above_base_capacity = Percentage split between on-demand and Spot instances above the base on-demand capacity. override_instance_types = List to override the instance type in the Launch Template. Allows spreading spot instances across several types to reduce interruptions. profile_name = Name of the IAM profile to attach to the Runner Workers. sg_ingresses = Extra security group rule for workers. spot_allocation_strategy = How to allocate capacity across the Spot pools. 'lowest-price' to optimize cost, 'capacity-optimized' to reduce interruptions. spot_instance_pools = Number of Spot pools per availability zone to allocate capacity. EC2 Auto Scaling selects the cheapest Spot pools and evenly allocates Spot capacity across the number of Spot pools that you specify. subnet_ids = The list of subnet IDs to use for the Runner Worker when the fleet mode is enabled. types = The type of instance to use for the Runner Worker. In case of fleet mode, multiple instance types are supported. upgrade_strategy = Auto deploy new instances when launch template changes. Can be either 'bluegreen', 'rolling' or 'off'. enabled_metrics = List of metrics to collect. | object({ | {} | no |
runner_worker_docker_autoscaler_autoscaling_options | Set autoscaling parameters based on periods, see https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnersautoscalerpolicy-sections | list(object({ | [] | no |
runner_worker_docker_autoscaler_instance | ebs_optimized = Enable EBS optimization for the Runner Worker. http_tokens = Whether or not the metadata service requires session tokens. http_put_response_hop_limit = The desired HTTP PUT response hop limit for instance metadata requests. The larger the number, the further instance metadata requests can travel. monitoring = Enable detailed monitoring for the Runner Worker. private_address_only = Restrict Runner Worker to the use of a private IP address. If runner_instance.use_private_address_only is set to true (default), root_device_name = The name of the root volume for the Runner Worker. root_size = The size of the root volume for the Runner Worker. start_script = Cloud-init user data that will be passed to the Runner Worker. Should not be base64 encoded. volume_type = The type of volume to use for the Runner Worker. gp2, gp3, io1 or io2 are supported. volume_iops = Guaranteed IOPS for the volume. Only supported when using gp3, io1 or io2 as volume_type. volume_throughput = Throughput in MB/s for the volume. Only supported when using gp3 as volume_type. | object({ | {} | no |
runner_worker_docker_autoscaler_role | additional_tags = Map of tags that will be added to the Runner Worker. assume_role_policy_json = Assume role policy for the Runner Worker. policy_arns = List of ARNs of IAM policies to attach to the Runner Workers. profile_name = Name of the IAM profile to attach to the Runner Workers. | object({ | {} | no |
runner_worker_docker_machine_ami_filter | List of maps used to create the AMI filter for the Runner Worker. | map(list(string)) | { | no |
runner_worker_docker_machine_ami_owners | The list of owners used to select the AMI of the Runner Worker. | list(string) | [ | no |
runner_worker_docker_machine_autoscaling_options | Set autoscaling parameters based on periods, see https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnersmachine-section | list(object({ | [] | no |
runner_worker_docker_machine_ec2_metadata_options | Enable the Runner Worker metadata service. Requires you use CKI maintained docker machines. | object({ | { | no |
runner_worker_docker_machine_ec2_options | List of additional options for the docker+machine config. Each element of this list must be a key=value pair. E.g. '["amazonec2-zone=a"]' | list(string) | [] | no |
runner_worker_docker_machine_extra_egress_rules | List of egress rules for the Runner Workers. | list(object({ | [ | no |
runner_worker_docker_machine_fleet | enable = Activates the fleet mode on the Runner. https://gitlab.com/cki-project/docker-machine/-/blob/v0.16.2-gitlab.19-cki.2/docs/drivers/aws.md#fleet-mode key_pair_name = The name of the key pair used by the Runner to connect to the docker-machine Runner Workers. This variable is only supported when enable is set to true. | object({ | { | no |
runner_worker_docker_machine_instance | For detailed documentation check https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnersmachine-section docker_registry_mirror_url = The URL of the Docker registry mirror to use for the Runner Worker. destroy_after_max_builds = Destroy the instance after the maximum number of builds has been reached. ebs_optimized = Enable EBS optimization for the Runner Worker. idle_count = Number of idle Runner Worker instances (not working for the Docker Runner Worker) (IdleCount). idle_time = Idle time of the Runner Workers before they are destroyed (not working for the Docker Runner Worker) (IdleTime). max_growth_rate = The maximum number of machines that can be added to the runner in parallel. monitoring = Enable detailed monitoring for the Runner Worker. name_prefix = Set the name prefix and override the Name tag for the Runner Worker. private_address_only = Restrict Runner Worker to the use of a private IP address. If runner_instance.use_private_address_only is set to true (default), runner_worker_docker_machine_instance.private_address_only will also apply for the Runner. root_device_name = The name of the root volume for the Runner Worker. root_size = The size of the root volume for the Runner Worker. start_script = Cloud-init user data that will be passed to the Runner Worker. Should not be base64 encoded. subnet_ids = The list of subnet IDs to use for the Runner Worker when the fleet mode is enabled. types = The type of instance to use for the Runner Worker. In case of fleet mode, multiple instance types are supported. volume_type = The type of volume to use for the Runner Worker. gp2, gp3, io1 or io2 are supported. volume_throughput = Throughput in MB/s for the volume. Only supported when using gp3 as volume_type. volume_iops = Guaranteed IOPS for the volume. Only supported when using gp3, io1 or io2 as volume_type. Works for fleeting only. See runner_worker_docker_machine_fleet. | object({ | {} | no |
runner_worker_docker_machine_instance_spot | enable = Enable spot instances for the Runner Worker. max_price = The maximum price you are willing to pay. By default the price is limited by the current on demand price for the instance type chosen. | object({ | {} | no |
runner_worker_docker_machine_role | additional_tags = Map of tags that will be added to the Runner Worker. assume_role_policy_json = Assume role policy for the Runner Worker. policy_arns = List of ARNs of IAM policies to attach to the Runner Workers. profile_name = Name of the IAM profile to attach to the Runner Workers. | object({ | {} | no |
runner_worker_docker_machine_security_group_description | A description for the Runner Worker security group | string | "A security group containing Runner Worker instances" | no |
runner_worker_docker_options | Options added to the [runners.docker] section of config.toml to configure the Docker container of the Runner Worker. For details check https://docs.gitlab.com/runner/configuration/advanced-configuration.html Default values if the option is not given: disable_cache = "false" image = "docker:18.03.1-ce" privileged = "true" pull_policy = "always" shm_size = 0 tls_verify = "false" volumes = "/cache" | object({ | { | no |
runner_worker_docker_services | Starts additional services with the Docker container. All fields must be set (examine the Dockerfile of the service image for the entrypoint - see ./examples/runner-default/main.tf) | list(object({ | [] | no |
runner_worker_docker_services_volumes_tmpfs | Mount a tmpfs in the GitLab service container. https://docs.gitlab.com/runner/executors/docker.html#mounting-a-directory-in-ram | list(object({ | [] | no |
runner_worker_docker_volumes_tmpfs | Mount a tmpfs in the Executor container. https://docs.gitlab.com/runner/executors/docker.html#mounting-a-directory-in-ram | list(object({ | [] | no |
runner_worker_gitlab_pipeline | post_build_script = Script to execute in the pipeline just after the build, but before executing after_script. pre_build_script = Script to execute in the pipeline just before the build. pre_clone_script = Script to execute in the pipeline before cloning the Git repository. This can be used to adjust the Git client configuration first, for example. | object({ | {} | no |
security_group_prefix | Set the name prefix and overwrite the Name tag for all security groups. | string | "" | no |
subnet_id | Subnet id used for the Runner and Runner Workers. Must belong to the vpc_id. In case the fleet mode is used, multiple subnets for the Runner Workers can be provided with runner_worker_docker_machine_instance.subnet_ids. | string | n/a | yes |
suppressed_tags | List of tag keys which are automatically removed and never added as default tag by the module. | list(string) | [] | no |
tags | Map of tags that will be added to created resources. By default resources will be tagged with name and environment. | map(string) | {} | no |
vpc_id | The VPC used for the runner and runner workers. | string | n/a | yes |
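The `runner_worker_cache` input describes how one cache can be shared by several module instances: creation is disabled and a bucket name plus policy are passed in. A minimal sketch of that wiring is shown below. The two variables, the module source address, the network IDs, the GitLab URL and the SSM parameter names are hypothetical placeholders, and the exact form of `policy` is not spelled out in the table, so treat it as an assumption to verify.

```hcl
# Hypothetical inputs describing a cache bucket that is managed outside the module.
variable "shared_cache_bucket" {
  type        = string
  description = "Name of an existing S3 bucket used as shared build cache."
}

variable "shared_cache_policy" {
  type        = string
  description = "Policy for the cache bucket, in the form the module expects."
}

locals {
  # Same cache settings for every runner agent.
  shared_cache = {
    create = false # do not let the module create its own bucket
    shared = true  # use the bucket as shared cache
    bucket = var.shared_cache_bucket
    policy = var.shared_cache_policy
  }
}

module "runner_a" {
  source = "cattle-ops/gitlab-runner/aws" # assumption: adjust to your module source

  environment = "runner-a"
  vpc_id      = "vpc-0123456789abcdef0"    # placeholder
  subnet_id   = "subnet-0123456789abcdef0" # placeholder

  runner_gitlab = {
    url                                           = "https://gitlab.example.com" # placeholder
    preregistered_runner_token_ssm_parameter_name = "runner-a-token"             # hypothetical name
  }

  runner_worker_cache = local.shared_cache
}

module "runner_b" {
  source = "cattle-ops/gitlab-runner/aws" # same assumption as above

  environment = "runner-b"
  vpc_id      = "vpc-0123456789abcdef0"    # placeholder
  subnet_id   = "subnet-0123456789abcdef0" # placeholder

  runner_gitlab = {
    url                                           = "https://gitlab.example.com" # placeholder
    preregistered_runner_token_ssm_parameter_name = "runner-b-token"             # hypothetical name
  }

  runner_worker_cache = local.shared_cache
}
```

Keeping the cache settings in a single local value means both agents cannot drift apart in how they reference the bucket.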
Name | Description |
---|---|
runner_agent_role_arn | ARN of the role used for the ec2 instance for the GitLab runner agent. |
runner_agent_role_name | Name of the role used for the ec2 instance for the GitLab runner agent. |
runner_agent_sg_id | ID of the security group attached to the GitLab runner agent. |
runner_as_group_name | Name of the autoscaling group for the gitlab-runner instance. |
runner_cache_bucket_arn | ARN of the S3 bucket for the build cache. |
runner_cache_bucket_name | Name of the S3 bucket for the build cache. |
runner_eip | EIP of the GitLab Runner. |
runner_launch_template_name | The name of the runner's launch template. |
runner_role_arn | ARN of the role used for the docker machine runners. |
runner_role_name | Name of the role used for the docker machine runners. |
runner_sg_id | ID of the security group attached to the docker machine runners. |
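The outputs above can be re-exported from a root module or passed to other resources. The sketch below assumes a module block labelled `runner` exists in the same configuration; that label and the output names on the left-hand side are illustrative choices, not part of this module.

```hcl
# Assumes: module "runner" { ... } is declared elsewhere in this configuration.

output "gitlab_runner_agent_role_arn" {
  description = "ARN of the role used for the EC2 instance of the GitLab runner agent."
  value       = module.runner.runner_agent_role_arn
}

output "gitlab_runner_cache_bucket_name" {
  description = "Name of the S3 bucket used for the build cache."
  value       = module.runner.runner_cache_bucket_name
}

output "gitlab_runner_agent_sg_id" {
  description = "ID of the security group attached to the GitLab runner agent."
  value       = module.runner.runner_agent_sg_id
}
```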