Refactoring your GitHub Actions workflow into a script
Is your GitHub Actions workflow file getting a bit large and perhaps a little difficult to follow? Or maybe you want to share logic between multiple workflows in the same repo and keep things 'DRY'? In this article we'll compare different scripting engines for extracting code out of a workflow.
The problem
Here's a workflow which deploys files to an Amazon S3 bucket and then invalidates a CloudFront distribution using the AWS CLI. This is a common setup for anyone deploying a static website to AWS.
name: Deployment Workflow
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      ...
      - name: Deploy
        run: |
          aws s3 sync ./build/ s3://my-test-bucket --delete
          aws cloudfront create-invalidation --distribution-id MYCLOUDFRONTID --paths "/*"
This example has only two commands, and neither has many command-line options. However, real-world solutions will often be more complex.
The best practice is to create your own GitHub Action (see the official GitHub guide for more details). There's quite a lot of overhead with this approach, though, and if you're only working in a single repo it's probably easiest just to move the logic into a script within the same repo.
So how should our script work?
Ideally the script should tick the following boxes:
- Fast - CI/CD is all about quick feedback; we don't want to make things noticeably slower. Also, time is money with GitHub Actions!
- Named parameters - we're trying to make our workflow file easier to read by extracting implementation details. Let's build on this by making sure the parameters we're sending to the script are properly labelled in the workflow file.
- Clean and maintainable - we want to make things easier to work with so let's make sure the script we build doesn't end up adding more complexity.
1. A Bash script
deploy.sh
#!/bin/bash

# Read named command line arguments into an args variable
declare -A args
while (( "$#" )); do
  if [[ $1 == --* ]] && [ "$2" ]; then
    args[${1:2}]=$2
    shift
  fi
  shift
done

# Do the deployment (note "/*" is quoted so the shell doesn't glob-expand it)
aws s3 sync ./build/ "s3://${args[bucketname]}" --delete
if [ -n "${args[cloudfrontid]}" ]; then
  aws cloudfront create-invalidation --distribution-id "${args[cloudfrontid]}" --paths "/*"
fi
And inside the workflow yml file:
...
steps:
  ...
  - name: Deploy
    run: ./scripts/deploy.sh --bucketname my-test-bucket --cloudfrontid MYCLOUDFRONTID
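One gotcha: because the workflow invokes the script directly, the file's executable bit has to be committed to git, otherwise the step fails with 'Permission denied'. You can set it without touching your local file permissions:

# Stage the executable bit so it's committed along with the file
git update-index --chmod=+x scripts/deploy.sh

Alternatively, run the script via the interpreter (run: bash ./scripts/deploy.sh ...) and the execute bit doesn't matter.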
Pros:
- It's fast
- One file and no dependencies
- The AWS commands inside the bash file are nice and clear
Cons:
- It's a Bash script with logic in it - not for everyone. That code to parse the command line is pretty impenetrable to anyone who isn't a Bash ninja (a more conventional alternative is sketched below)...
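For comparison, here's a sketch of that more conventional route: an explicit case statement. It's more verbose, but each expected flag is spelled out:

# Parse named arguments with an explicit case statement
while [ "$#" -gt 0 ]; do
  case $1 in
    --bucketname) bucketname=$2; shift ;;
    --cloudfrontid) cloudfrontid=$2; shift ;;
  esac
  shift
done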
2. A Python script
deploy.py
import argparse
import os

# Read named command line arguments
parser = argparse.ArgumentParser(description='Deploy')
parser.add_argument('--bucketname', dest='bucketname', required=True)
parser.add_argument('--cloudfrontid', dest='cloudfrontid')
args = parser.parse_args()

# Do the deployment ("/*" is quoted so the shell doesn't glob-expand it)
os.system(f'aws s3 sync ./build/ s3://{args.bucketname} --delete')
if args.cloudfrontid:
    os.system(f'aws cloudfront create-invalidation --distribution-id {args.cloudfrontid} --paths "/*"')
And inside the workflow yml file:
...
  - name: Deploy
    run: python3 ./scripts/deploy.py --bucketname my-test-bucket --cloudfrontid MYCLOUDFRONTID
Pros:
- It's fast
- One file and no dependencies
- The code to parse the command line arguments is nice and explicit
Cons:
- The shell commands aren't as clear as in the Bash script (because of the explicit os.system call), but overall it's pretty readable.
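One caveat with os.system: it returns the command's exit status rather than raising on failure, so a failed aws call won't fail the workflow step. If that matters, subprocess is the safer option - a minimal sketch, reusing the args parsed above:

import subprocess

# check=True raises CalledProcessError on a non-zero exit code,
# which makes the script (and therefore the workflow step) fail
subprocess.run(['aws', 's3', 'sync', './build/', f's3://{args.bucketname}', '--delete'], check=True)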
3. A PowerShell script
deploy.ps1
param (
    [Parameter(Mandatory)]
    [string]$BucketName,
    [string]$CloudfrontID
)

# Do the deployment
iex "aws s3 sync ./build/ s3://$BucketName --delete"
if ($CloudfrontID) {
    iex "aws cloudfront create-invalidation --distribution-id $CloudfrontID --paths /*"
}
And the entry in the workflow yml file:
...
  - name: Deploy
    run: pwsh ./scripts/deploy.ps1 -BucketName my-test-bucket -CloudfrontID MYCLOUDFRONTID
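As an aside, GitHub Actions can select PowerShell as the step's shell for you, which saves the explicit pwsh invocation:

  - name: Deploy
    shell: pwsh
    run: ./scripts/deploy.ps1 -BucketName my-test-bucket -CloudfrontID MYCLOUDFRONTID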
Pros:
- One file and no dependencies
- The command line argument expectations are clear
Cons:
- It's a little slow to get going: around 15s of startup latency on Ubuntu (probably much faster on a Windows runner!)
- Like the Python script, issuing the AWS commands isn't as clear as it is in the Bash script (though see the sketch below).
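Side note: Invoke-Expression (iex) isn't strictly needed here. PowerShell can call external programs like the AWS CLI directly, which avoids evaluating a string and reads a little closer to the Bash version - a sketch of the same two commands:

# Call the AWS CLI directly - PowerShell expands $BucketName inline
aws s3 sync ./build/ "s3://$BucketName" --delete
if ($CloudfrontID) {
    aws cloudfront create-invalidation --distribution-id $CloudfrontID --paths "/*"
}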
4. A Node script
deploy.js
const { execSync } = require('child_process');
const yargs = require('yargs');

// Read named command line arguments
const args = yargs.options({
  'bucketname': { demandOption: true },
  'cloudfrontid': {},
}).argv;

// Do the deployment ({ stdio: 'inherit' } pipes the command's output
// to the logs; "/*" is quoted so the shell doesn't glob-expand it)
execSync(
  `aws s3 sync ./build/ s3://${args.bucketname} --delete`,
  { stdio: 'inherit' }
);
if (args.cloudfrontid) {
  execSync(
    `aws cloudfront create-invalidation --distribution-id ${args.cloudfrontid} --paths "/*"`,
    { stdio: 'inherit' }
  );
}
And the workflow yml file:
...
  - name: Deploy
    run: node ./scripts/deploy.js --bucketname my-test-bucket --cloudfrontid MYCLOUDFRONTID
Pros:
- Fast, no latency
- The command line argument expectations are clear
Cons:
- Relies on another package ('yargs') to do the argument parsing. If your project is Node based then just include it in your package.json file and you're done. Otherwise you'll have to set up a mini Node project within your repo - not impossible, but not that nice either (though see the sketch after this list)...
- execSync adds some visual overhead versus just calling the command. In addition, an extra option is needed to make sure we see the command's output in the logs.
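On the first point: if you'd rather avoid the dependency, newer versions of Node (18.3 onwards) ship util.parseArgs in the standard library - a sketch of the argument parsing with no packages at all, assuming a recent runtime:

const { parseArgs } = require('node:util');

// Built-in named-argument parsing. Note there's no 'required'
// concept, so mandatory arguments need a manual check.
const { values: args } = parseArgs({
  options: {
    bucketname: { type: 'string' },
    cloudfrontid: { type: 'string' },
  },
});
if (!args.bucketname) throw new Error('--bucketname is required');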
5. A Node-based GitHub Action in the same repo
For a step-by-step guide to using this method, see the article 'Create your own local js GitHub Action with just two files'.
deploy.js
const core = require("@actions/core");
const { exec } = require("@actions/exec");

async function deploy() {
  const bucketName = core.getInput('bucket-name');
  await exec(`aws s3 sync ./build/ s3://${bucketName} --delete`);

  const cloudfrontID = core.getInput('cloudfront-id');
  if (cloudfrontID) {
    await exec(`aws cloudfront create-invalidation --distribution-id ${cloudfrontID} --paths /*`);
  }
}

deploy()
  .catch(error => core.setFailed(error.message));
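Incidentally, exec also accepts the command's arguments as an array, which avoids interpolating values into a command string:

// Passing args as an array - exec doesn't go via a shell, so there's
// no quoting or glob expansion to worry about
await exec('aws', ['s3', 'sync', './build/', `s3://${bucketName}`, '--delete']);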
action.yml
name: "Deploy"
description: "Deploy action"
inputs:
bucket-name:
description: "S3 Bucket"
required: true
cloudfront-id:
description: "Cloudfront ID"
runs:
using: "node12"
main: "deploy.js"
And the workflow yml file:
...
  - name: Deploy
    uses: ./actions/deploy
    with:
      bucket-name: my-test-bucket
      cloudfront-id: MYCLOUDFRONTID
Pros:
- Fast, no latency
- Parameters are nicely documented by the action.yml file.
- The entry in the workflow file is clearer than cramming everything onto one line (as we do with the other scripts).
- Can be made into a standalone GitHub Action in the future.
Cons:
- Dependent on the GitHub toolkit packages. If you're working on a Node project then just add these packages to your package.json file. Otherwise you can include the package files in the repo - see the GitHub Actions documentation for an example, and the sketch after this list.
- The async/await handling adds some extra complexity.
- Requires an additional action.yml file and a certain directory structure - but on the other hand this provides a nice separation of concerns, separating the parameter definition from the code.
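On that first con: GitHub's documentation suggests bundling the action and its dependencies into a single file with @vercel/ncc, so you don't need to commit node_modules. Roughly (the paths here are assumed to match the layout above, so treat this as a sketch):

# Bundle deploy.js and its dependencies into actions/deploy/dist/index.js
npx @vercel/ncc build actions/deploy/deploy.js -o actions/deploy/dist

You'd then point main in action.yml at dist/index.js.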
TL;DR
- The Python script offers a good balance of clarity and maintainability.
- Go with Bash or PowerShell if that's your thing!
- If you're working on a Node project then creating a local GitHub Action is a nice option. It integrates well with the workflow YAML file, and in the future you've got the option to upgrade to a fully-fledged action in its own repo.
See this tutorial for a step-by-step guide to creating a local JavaScript GitHub Action.