Connectionless Ansible Deployment with Terraform via SSM

Deploying Ansible playbooks & roles to EC2 instances via Terraform from anywhere, without interruption and without needing SSH access.

TL;DR

GitHub repository with the PoC here.

Introduction

While this blog usually focuses on Red Team topics, I believe that this approach could benefit anyone using Terraform and Ansible to deploy infrastructure on AWS.

Red Team Operations often require us to spin up and destroy infrastructure on the fly. Doing this manually for every operation would be a time-intensive task, prone to error. Therefore, our team attempts to automate as much as possible using Infrastructure as Code tools like Terraform (deploy infra) and Ansible (configure infra). Projects like Red Baron give us great examples of how to achieve this. This results in several benefits:

  • Consistency and reliability: Anyone can deploy complex infrastructure components in the same way.

  • Version control: If mistakes happen, we can issue a new version of our Terraform script/Ansible playbook and all future deployments will benefit.

  • Speed: Setting up infra is a matter of minutes, whereas manual configuration would take hours or days. That is time a modern Red Team can no longer afford to spend.

  • Scalability: It's easy to scale up. For example, deploying additional CDN domains to rotate in front of your C2 redirector is just a matter of specifying the number.

  • Cost management: We can define default instance types or usage plans in our code that fit the purpose of the infrastructure component to control costs of our environments. The risk of dangling/forgotten infrastructure also lowers, since we can tear down entire environments in one go.

When designing our own infrastructure automation project (Red Bastion) in 2020, I hit a major limitation: after deploying infrastructure with Terraform, SSH must be reachable to apply the Ansible playbook. This implies you either need to deploy a VPC with a VPN or jump host first, or give your instance a public IP address and expose SSH. The latter is not even an option for services that should not have a public IP.

This means I cannot simply deploy my entire Red Team VPC in one go: I have to start with a first stage containing a VPN/jump box, wait for the deployment to complete, configure my VPN client, connect, and then deploy stage 2 over the VPN. Surely this also frustrates other people, so let's try to come up with an alternative.

SSM Documents: AWS-ApplyAnsiblePlaybooks

Enter the AWS Systems Manager (SSM). According to the AWS documentation, Systems Manager enables you to manage your infrastructure:

Systems Manager provides a unified user interface so you can view operational data from multiple AWS services and enables you to automate operational tasks across your AWS resources.

One of its features is the ability to run "SSM Documents", automation runbooks, on onboarded assets. AWS-ApplyAnsiblePlaybooks is such a runbook and fits our use case perfectly: it allows you to run an Ansible playbook from an S3 bucket locally on your asset.
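To give an idea of what the runbook expects, this is roughly how it could be invoked with the AWS CLI (instance ID and bucket URL below are placeholders; later on we will drive all of this from Terraform instead):

```sh
# Illustrative only: run AWS-ApplyAnsiblePlaybooks against a single instance
aws ssm send-command \
  --document-name "AWS-ApplyAnsiblePlaybooks" \
  --targets "Key=InstanceIds,Values=i-0123456789abcdef0" \
  --parameters '{
    "SourceType": ["S3"],
    "SourceInfo": ["{\"path\":\"https://my-bucket.s3.eu-west-1.amazonaws.com/ansible.zip\"}"],
    "InstallDependencies": ["True"],
    "PlaybookFile": ["main.yml"],
    "ExtraVariables": ["SSM=True"],
    "Check": ["False"],
    "Verbose": ["-v"]
  }'
```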

At this point, the following questions arose:

  1. Can we automatically package our playbook via terraform and upload it to s3?

  2. What are the instance requirements?

  3. How do we pass arguments to our Ansible playbook?

We will answer these questions and more in the following sections.

Custom SSM Documents

Before we continue: the default SSM document comes with certain limitations you may want to overcome. For example, it only installs the ansible package, as shown below. What if our playbook depends on Ansible collections?
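Paraphrased, the dependency installation step in the default document boils down to something like this (simplified sketch, not a verbatim copy of the AWS runbook):

```sh
# Simplified: the default runbook only installs Ansible itself (plus unzip)
if [[ "{{InstallDependencies}}" == True ]] ; then
  echo "Installing and/or updating required tools: Ansible, unzip" >&2
  if [ -f "/etc/debian_version" ] ; then
    apt-get update && apt-get install -y ansible unzip
  else
    yum install -y ansible unzip
  fi
fi
```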

Instead of following the next steps, you could also copy our ready-to-go JSON file from GitHub. In that case, simply create a new SSM document and paste the JSON content there.

We can easily select the default document and clone it via the AWS console. Go to Systems Manager > Documents and search for "AWS-ApplyAnsiblePlaybooks". Select the automation runbook and click Actions > Clone document.

Cloning the default runbook

We can name the new runbook Custom-ApplyAnsiblePlaybooksWithCollections. Target Type can be left empty.

Custom runbook

We can add an additional parameter, RequirementsFile, to pass our playbook's requirements.yml file when the InstallCollections parameter is set to "True".
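In the document's JSON, the extra parameters could be declared along these lines (excerpt of the parameters section only; the exact definitions are in our GitHub file):

```json
"InstallCollections": {
  "type": "String",
  "description": "(Optional) Install Ansible collections from a requirements file before running the playbook.",
  "allowedValues": ["True", "False"],
  "default": "False"
},
"RequirementsFile": {
  "type": "String",
  "description": "(Optional) Path to the requirements.yml file inside the extracted archive.",
  "default": "requirements.yml"
}
```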

Next, the following shell script can be added to the aws:runShellScript action to validate whether requirements.yml should be parsed.
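A minimal version of that check could look like this (a sketch; the exact script is in the GitHub document):

```sh
# Install Ansible collections when requested and a requirements file is present
if [[ "{{InstallCollections}}" == True ]] ; then
  if [ -f "{{RequirementsFile}}" ] ; then
    ansible-galaxy collection install -r "{{RequirementsFile}}"
  else
    echo "InstallCollections is True but {{RequirementsFile}} was not found" >&2
    exit 1
  fi
fi
```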

Congratulations! We now have our own Custom-ApplyAnsiblePlaybooksWithCollections SSM runbook. Keep in mind that it only exists in the region where we created it. In this case, we are using eu-west-1. Therefore, we can only apply it to instances deployed in this region.

Successfully created SSM Custom-ApplyAnsiblePlaybooksWithCollections runbook in eu-west-1

Packaging Ansible Playbooks

Next, we must come up with an approach to package our Ansible playbooks.

Ansible folder structure

The role we are about to create can be found on GitHub.

At DXC Strikeforce, we decided to stay as close as possible to the official Ansible role structure recommendations and to centralise roles in git repositories. This generally corresponds to the following:
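This is the layout `ansible-galaxy init <role_name>` generates:

```
my_role/
├── defaults/
│   └── main.yml
├── files/
├── handlers/
│   └── main.yml
├── meta/
│   └── main.yml
├── tasks/
│   └── main.yml
├── templates/
├── tests/
├── vars/
│   └── main.yml
└── README.md
```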

This approach allows us to recursively include the roles in the playbook repositories that call them. For example, if we would like to apply multiple roles to an instance, we can simply create an Ansible playbook repository as follows:
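For example (layout assumed for this PoC; role names are placeholders):

```
playbook-repo/
├── main.yml            # playbook that calls the roles
├── requirements.yml    # collections required by the roles
└── roles/
    ├── role_one/       # role repositories, included e.g. as git submodules
    └── role_two/
```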

main.yml contents:
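A minimal sketch (role names are placeholders; the actual playbook is in the PoC repository):

```yaml
---
- name: Apply roles to the local instance
  hosts: all
  become: true
  roles:
    - role_one
    - role_two
```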

Pushing to s3 via Terraform

The final Terraform code can be found on GitHub.

Now that we have our Ansible playbook structure ready, we can try to push it to S3 via Terraform. Normally, we would develop separate Terraform modules and add them to our private registry, but for this proof of concept, we will add everything in one repository.

The following Terraform code archives the ansible directory of our project and uploads it to S3 as ansible.zip.

variables.tf
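A sketch of the variables involved (names and defaults are ours; the bucket name must be globally unique):

```hcl
variable "ansible_bucket_name" {
  description = "Name of the S3 bucket that will hold the packaged Ansible playbook"
  type        = string
  default     = "ssm-ansible-poc-playbooks"
}

variable "ansible_source_dir" {
  description = "Local directory containing the Ansible playbook to package"
  type        = string
  default     = "./ansible"
}
```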

s3_bucket.tf
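Something along these lines creates the private bucket (resource names assumed):

```hcl
resource "aws_s3_bucket" "ansible" {
  bucket        = var.ansible_bucket_name
  force_destroy = true # allow terraform destroy to remove a non-empty bucket
}

resource "aws_s3_bucket_public_access_block" "ansible" {
  bucket                  = aws_s3_bucket.ansible.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```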

upload_ansible_zip.tf
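The archiving and upload can be done with the archive provider and an S3 object (a sketch; the etag forces a re-upload whenever the playbook content changes):

```hcl
# Zip the local ansible directory on every plan/apply
data "archive_file" "ansible" {
  type        = "zip"
  source_dir  = var.ansible_source_dir
  output_path = "${path.module}/ansible.zip"
}

# Upload the archive; a changed etag triggers a new upload
resource "aws_s3_object" "ansible_zip" {
  bucket = aws_s3_bucket.ansible.id
  key    = "ansible.zip"
  source = data.archive_file.ansible.output_path
  etag   = data.archive_file.ansible.output_md5
}
```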

If we execute the following commands, Terraform connects to the AWS API and deploys the resources.
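```sh
terraform init   # downloads the aws and archive providers
terraform apply  # review the plan and confirm with 'yes'
```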

Successful deployment of the s3 bucket

We can check the bucket content via the AWS console to confirm our zip file was indeed uploaded.

Successfully uploaded zip

Bonus: if we would like to update the playbook at any time, we can simply change its contents and run terraform apply again. The ansible.zip object in S3 will be updated automatically. This gives us a simple way to re-apply a playbook to a previously deployed machine.

EC2 Instance Deployment

At this point, we completed the following items:

  • Successfully created a custom SSM Document automation runbook, Custom-ApplyAnsiblePlaybooksWithCollections, to run complex Ansible playbooks from an S3 bucket.

  • Automated uploading a local Ansible playbook to S3 via Terraform.

The next steps would be to:

  1. Deploy an EC2 instance with Terraform.

  2. Create and assign the correct EC2 instance role to:

    1. Onboard the instance to SSM.

    2. Access the created S3 bucket.

  3. Apply the Custom-ApplyAnsiblePlaybooksWithCollections runbook to the instance via SSM, triggering the Ansible playbook to execute.

Deploy EC2 Instance

Deploying an EC2 instance is both straightforward and well-documented, so we will spend limited time explaining the steps. For our purpose, we will use the following AMI:
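In Terraform, we can look up that AMI with a data source; a minimal sketch (data source name and filters are ours):

```hcl
# Latest Ubuntu 22.04 LTS (Jammy) AMI published by Canonical
data "aws_ami" "ubuntu_2204" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
```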

This default build of Ubuntu Server 22.04 comes with the Amazon SSM Agent preinstalled, saving us the hassle of pushing it to the image ourselves. We can go with the cheapest instance size, t3a.nano, and 8 GB of EBS storage, since we do not need much computing power for our purpose.

We will also use data sources to convert VPC and subnet name variables to corresponding IDs, but you could also use the IDs directly. Most of the settings will be defined as defaults in variables.tf.

variables.tf
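A sketch of the relevant variables (names and defaults are ours):

```hcl
variable "aws_region" {
  description = "Region to deploy to (our custom SSM document only exists here)"
  type        = string
  default     = "eu-west-1"
}

variable "vpc_name" {
  description = "Name tag of the VPC to deploy into"
  type        = string
}

variable "subnet_name" {
  description = "Name tag of the subnet to deploy into"
  type        = string
}

variable "instance_type" {
  description = "Instance type for the PoC instance"
  type        = string
  default     = "t3a.nano"
}
```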

main.tf

I have the habit of storing some common resources in the main.tf file. This is just a personal preference.
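For this PoC, main.tf boils down to the provider configuration and the lookups from name to ID (a sketch; resource names are ours):

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
    archive = {
      source = "hashicorp/archive"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

# Resolve the VPC and subnet Name tags to their IDs
data "aws_vpc" "this" {
  filter {
    name   = "tag:Name"
    values = [var.vpc_name]
  }
}

data "aws_subnet" "this" {
  vpc_id = data.aws_vpc.this.id

  filter {
    name   = "tag:Name"
    values = [var.subnet_name]
  }
}
```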

security_group.tf

Our security group will only contain egress rules. Inbound SSH is not required as we will use SSM to push an Ansible playbook to the instance.
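A sketch of that egress-only security group:

```hcl
resource "aws_security_group" "instance" {
  name        = "ssm-ansible-poc"
  description = "Egress only; SSM requires no inbound rules"
  vpc_id      = data.aws_vpc.this.id

  egress {
    description = "Allow all outbound traffic (SSM endpoints, package mirrors, S3)"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```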

ssh_keypair.tf

Just to be safe, we will create an SSH keypair for the instance. This will be our backup key in case the connection with AWS Systems Manager somehow breaks.
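Something like this generates and registers the backup key (key names are ours):

```hcl
# Backup key pair in case the SSM channel is ever unavailable
# Note: the generated private key ends up in the Terraform state
resource "tls_private_key" "backup" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "backup" {
  key_name   = "ssm-ansible-poc-backup"
  public_key = tls_private_key.backup.public_key_openssh
}
```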

ec2_instance.tf
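Bringing it together (a sketch; the instance profile is created in iam_role.tf further below):

```hcl
resource "aws_instance" "poc" {
  ami                         = data.aws_ami.ubuntu_2204.id
  instance_type               = var.instance_type
  subnet_id                   = data.aws_subnet.this.id
  vpc_security_group_ids      = [aws_security_group.instance.id]
  key_name                    = aws_key_pair.backup.key_name
  iam_instance_profile        = aws_iam_instance_profile.ssm.name
  associate_public_ip_address = true # needed in a public subnet without NAT or VPC endpoints

  root_block_device {
    volume_size = 8 # GB
  }

  tags = {
    Name = "ssm-ansible-poc"
  }
}
```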

Execution

We can now run our terraform code to deploy our EC2 instance.

Successful deployment

SSM & S3 Instance Role

Next, we should create and assign an EC2 instance role to onboard the machine to SSM, enabling it to communicate with Systems Manager. This will allow us to apply SSM documents to the EC2 instance. The role should also be able to read the ansible.zip file from our automatically created S3 bucket.

iam_role.tf

The minimal privileges to onboard an instance to SSM are defined in the default AWS arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore IAM policy. If we attach this policy and a custom S3 policy, we fulfil the requirements.

We will also grant the s3:PutObject permission on the S3 bucket to allow the machine to upload data as well. This can be useful in case we would like to collect logs or files later on.
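A sketch of the role, the policy attachments and the instance profile (resource and policy names are ours):

```hcl
resource "aws_iam_role" "ssm" {
  name = "ssm-ansible-poc"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# Minimal permissions to register the instance with Systems Manager
resource "aws_iam_role_policy_attachment" "ssm_core" {
  role       = aws_iam_role.ssm.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

# Read the playbook archive and allow uploading logs/artefacts later on
resource "aws_iam_role_policy" "s3_access" {
  name = "ansible-bucket-access"
  role = aws_iam_role.ssm.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:GetObject", "s3:PutObject"]
        Resource = "${aws_s3_bucket.ansible.arn}/*"
      },
      {
        Effect   = "Allow"
        Action   = ["s3:ListBucket"]
        Resource = aws_s3_bucket.ansible.arn
      }
    ]
  })
}

resource "aws_iam_instance_profile" "ssm" {
  name = "ssm-ansible-poc"
  role = aws_iam_role.ssm.name
}
```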

Apply Playbook

All that is left now is to apply the Custom-ApplyAnsiblePlaybooksWithCollections runbook to the instance via SSM, triggering the Ansible playbook to be downloaded and executed. This should create /hello_world.txt on the target instance.

variables.tf
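We add one variable for the extra vars (name and default are ours; the document expects key=value pairs):

```hcl
variable "ansible_extra_vars" {
  description = "Extra variables passed to ansible-playbook as key=value pairs"
  type        = string
  default     = "SSM=True"
}
```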

ssm_ansible_playbook.tf
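The association itself could look like this (a sketch; the standard parameters come from AWS-ApplyAnsiblePlaybooks, while InstallCollections and RequirementsFile come from our custom document):

```hcl
resource "aws_ssm_association" "ansible" {
  name             = "Custom-ApplyAnsiblePlaybooksWithCollections"
  association_name = "ssm-ansible-poc"

  targets {
    key    = "InstanceIds"
    values = [aws_instance.poc.id]
  }

  parameters = {
    SourceType          = "S3"
    SourceInfo          = jsonencode({ path = "https://${aws_s3_bucket.ansible.bucket_regional_domain_name}/ansible.zip" })
    InstallDependencies = "True"
    InstallCollections  = "True"
    RequirementsFile    = "requirements.yml"
    PlaybookFile        = "main.yml"
    ExtraVariables      = var.ansible_extra_vars
    Check               = "False"
    Verbose             = "-v"
  }
}
```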

Execution

After expanding our Terraform project with the SSM association, we can apply the changes. We should not forget to set ansible_extra_vars to ensure we can pass parameters to our playbook.

Successful Terraform apply

Next, we can validate successful application via the State Manager.

Successful SSM document association

Additionally, we can start a session on the instance via Fleet Manager and check whether the file was indeed added. If we monitor the root directory, we can observe the moment /hello_world.txt is written with the value "hello test", as specified in our Ansible playbook.

Successfully wrote hello_world.txt via Ansible

Conclusion

Success! We managed to deploy a fresh Ubuntu 22.04 LTS EC2 instance and apply an Ansible role, with parameters, through a playbook, all without exposing SSH! This can now easily be replicated to automatically spin up entire private environments from anywhere in the world, without any direct connection.

Troubleshooting

If your machine is not showing up in SSM, I've found that it's usually one of these:

  • Egress traffic does not allow communication with the AWS SSM endpoints

  • SSM instance role not applied correctly

  • Deployed VM in a public subnet but forgot to assign a public IP.

  • Chose an AMI that does not have Amazon SSM Agent preinstalled

Ansible Playbook Execution

Your ansible.zip will be expanded under /var/lib/amazon/ssm/<instanceid>/document/orchestration/<orchestrationid>/downloads. When troubleshooting playbook execution, we usually go into this directory and execute the playbook manually with the appropriate extra vars.
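For example (the extra vars are placeholders):

```sh
# Re-run the playbook manually with verbose output for troubleshooting
cd /var/lib/amazon/ssm/<instanceid>/document/orchestration/<orchestrationid>/downloads
sudo ansible-playbook main.yml -vvv --extra-vars "key=value"
```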

Known Limitations

Passing extra vars via SSM documents can be a bit tricky, as certain characters are not allowed. A workaround could be to pass a config.json inside the files directory of the Ansible role or use the vars directory to pass parameters and large values.
