
Introduction

In our earlier article on Git pipelines, we mentioned that GitHub had released a beta of Actions, their latest CI/CD workflow automation tool. Let’s take a quick look at some of its features.

For simplicity, we’ll use the same example as in the previous article – that of rendering this article into HTML – which is more than enough to demonstrate the basic features.

To recap, the workflow for Git pipelines was:

  1. Get the latest commit in the repository
  2. Install GNU Make
  3. Install pandoc which is used to render Markdown into HTML
  4. Render the HTML document from Markdown
  5. Archive HTML document

The Actions-based workflow is similar, but quite a bit simpler. It performs the following tasks:

  1. Get latest commit in the repository
  2. Render the HTML document from Markdown
  3. Publish the rendered HTML document to GitHub pages

It’s simpler, because we don’t need to install the dependent software – we can use pre-prepared Docker Hub images instead.

What are GitHub Actions?

Actions introduce integrated pipelines called workflows into a GitHub repository. That means we can access workflows directly from GitHub’s dashboard via the Actions tab. (Note that when we were preparing this article, the job history in the Actions tab did not show until after we had published to the master branch.)

From the Actions tab we can view job history as well as view, edit or add workflows:

What are GitHub Workflows?

Workflows define the automation steps of a pipeline. Workflows are stored in the .github/workflows directory at the root of your project. A workflow has one or more jobs, each of which contains a sequence of tasks called steps. As an example, let’s work through this project’s workflow, which is defined in the YAML file below:



There are three core sections to a workflow (1) – (3):

(1) name

A workflow has a name. This name will appear as a title on the dashboard.

(2) on

This describes how this workflow gets triggered. There are multiple ways that a workflow can be triggered:

  1. on push or pull request on branch or tag
  2. on push or pull request on a path
  3. on a schedule

Here, we are experimenting with triggering the workflow on any push that changes README.md.
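As a minimal sketch of such a trigger (not the project's actual file), a push filter restricted to a path might look like the following. The workflow name and the branch filter are assumptions added for illustration.

```yaml
# Illustrative workflow header; the name is an assumed placeholder.
name: Render README

on:
  push:
    # Assumed: only react to pushes on master.
    branches:
      - master
    # Only run when README.md is among the changed files.
    paths:
      - README.md
```

When both filters are present, a push must satisfy the branch filter and the path filter for the workflow to run.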

(3) jobs

Jobs contain steps for execution. The bulk of a workflow appears under section (3). Its parts are explained in sections (4) to (14) below.

(4) id

Jobs are given a unique id. Here, we have labelled it build.

(5) name

Jobs have a name which will appear on GitHub.

(6) runs-on

Jobs are run on GitHub-hosted virtual machine images. The current choices offer three types of virtual environment:

  1. Ubuntu
  2. Windows Server
  3. macOS

Apart from latest, there is a choice of versions for each virtual environment. The limitation here is that you must use one of these images. If you are invoking Docker-based Actions, then you must use a Linux image. These Docker images must also run as root, which could be problematic. For example, Haskell Stack will complain when installing dependencies as a user with different privileges.
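Sections (4) to (6) can be sketched together as follows. The id build comes from the text above; the display name is an assumed placeholder.

```yaml
jobs:
  # (4) the job id
  build:
    # (5) the name shown on GitHub (assumed wording)
    name: Render and publish
    # (6) a GitHub-hosted Linux image, as needed for Docker-based Actions
    runs-on: ubuntu-latest
```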

(7) steps

The remainder of the job is composed of steps. Steps are the workhorses of workflows: they can run set-up tasks, run commands or run actions. Our workflow performs three named tasks:

  1. shallow checkout
  2. render document
  3. publish to pages

(8) checkout

Previously, with Azure Pipelines, we only needed to specify how the pipeline was triggered – it was assumed that the code was already checked out. With Actions this step is explicit: that is, we need to invoke an action to check out the code from GitHub. The benefit is that you can finely tune how and what to check out. In the example action, (8), we are performing a shallow checkout (depth of 1 commit) from the master branch.

(9) uses

To perform the checkout we are using the standard checkout action. We recommend that you pin a specific version instead of a generic tag like @latest.

When we were reviewing actions, it was helpful and instructive to view the source code to check whether the action provided the required features. For instance, we were able to trial three different actions to publish content, before settling on the current solution.

(10) with

Some actions require parameters. These are provided using the with clause. In this case, (10), we are supplying specific checkout options.

Each Action can define its own values or defaults so it pays to read the source to determine the available choices for the specific version being used.

In other examples (11), we are overriding the default entry point of the Docker container, or specifying the directory location to publish, (12).
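A shallow checkout of master, as described in (8) to (10), might be written like this. It is a sketch rather than the project's own step, and the pinned version tag is an assumption, so check the inputs supported by the version you actually use.

```yaml
steps:
  - name: shallow checkout
    # Pin a specific version of the standard checkout action (tag assumed).
    uses: actions/checkout@v1
    with:
      # Check out the master branch...
      ref: master
      # ...and fetch only the most recent commit.
      fetch-depth: 1
```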

(11) using custom Docker

Custom Docker containers can be called as Actions. In this example we are calling a prepared image that contains all the tools used for rendering this project from Markdown to HTML.
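A step that runs a Docker Hub image and overrides its entry point could look roughly like the sketch below. The image name is a hypothetical placeholder, and the make target is assumed from the earlier article.

```yaml
steps:
  - name: render document
    # Hypothetical image name; substitute your own prepared image.
    uses: docker://example-org/pandoc-make:1.0
    with:
      # Replace the image's default entry point with make...
      entrypoint: make
      # ...and pass the target to build as its argument (assumed target).
      args: README.html
```

The repository is mounted into the container as its working directory, so the Makefile and README.md checked out in the previous step are visible to make.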

(12) publish pages

In our previous article we rendered Markdown to HTML and provided it as an archive to download. A better solution is to publish static content to GitHub Pages. This required the creation of an access token, which is nicely described here. This token is added to the project under Settings > Secrets with the name GH_PAGES_TOKEN. The token is passed to the action so that it is able to publish the rendered static HTML page to the gh-pages branch.

(13) if

The if clause conditionally executes a step. The condition can be a Boolean expression or a GitHub context. If the condition is true, the step will execute. In our example it uses a context to check the status of the previous step.

(14) secrets

Secrets are encrypted environment variables. They are restricted for use in Actions. Here, we store the token required to publish to GitHub Pages. See (12).
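Sections (12) to (14) combine in the publishing step. The sketch below shows only the general shape: the action reference and its publish_dir input are hypothetical placeholders, since each publishing action defines its own inputs and environment variables.

```yaml
steps:
  - name: publish to pages
    # (13) run only if the previous steps succeeded
    if: success()
    # Hypothetical action reference; substitute the publishing action you choose.
    uses: example-org/publish-pages@v1
    env:
      # (14) expose the repository secret to this step only
      GH_PAGES_TOKEN: ${{ secrets.GH_PAGES_TOKEN }}
    with:
      # Hypothetical input naming the directory to publish.
      publish_dir: ./public
```

GitHub masks secret values in the step logs, but the token should still be scoped as narrowly as the publishing action allows.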

Putting It All Together

We now have all the pieces in place to execute our workflow that will:

  1. Invoke an action to perform a shallow checkout of our repository from the master branch
  2. Render the markdown using a custom Action from our own pandoc Docker container
  3. Use a public Action to publish the static HTML to GitHub pages

Unlike the previous Azure Pipelines, workflows are integrated into GitHub. A big relief!

Some Extras

Workflow Logs

Job runs are recorded. You can review a job by following the link from Workflow runs. This will show a run history like:

Each job step has logs that can be viewed and/or downloaded.

Editing a Workflow

GitHub provides an online editor for your workflow:

However, this editor does not currently validate the workflow. So why is it provided at all, when it offers nothing that normal online editing doesn’t?

 

First Impressions

Our first impression of GitHub Actions is that they are a significant improvement over the former Azure pipelines. Features we particularly like are:

  • Actions are well integrated into GitHub
  • There’s an active marketplace for Actions and Apps. See a comparison between Actions and Apps here.
  • Documentation is good
  • The ability to use custom Docker images
  • Fast workflows

However, there are also some drawbacks:

  • There is no cache between jobs. The current recommended practice is to archive the required data, and then restore the archive in the job that needs it. To do this you need to execute Actions. Having a local cache is really important for projects, such as Java projects, that have many dependencies. No cache means downloading them for each and every build!
  • The recommended practice is to write actions in JavaScript since these actions are performed on the GitHub host, and do not need to be pulled from external sources. Really? JavaScript? It seems like a bizarre choice – JavaScript is not the first language DevOps would turn to when building workflow pipelines. Will GitHub Actions support other languages in the future?

We also found that the Docker Actions available on the marketplace are of variable quality. We spent time experimenting with different variations until we found those that matched our requirements. As the source code is available, it was easy to evaluate an Action’s implementation. Or, you could simply write your own following these instructions. We also found that we could use our existing Docker images without modification.

GitHub Actions has some good features, and they are easily composed. While JavaScript is not the first tool we would consider as a workflow language, Docker is a very workable compromise, even with the small performance hit.


Introduction

Git has become the de facto standard for version control, but until recently you needed external tools such as Jenkins or GoCD to manage Continuous Integration / Continuous Delivery (CI/CD) pipelines.

Now, though, we’re seeing vendors like GitLab and others providing pipeline features with extensible suites of tools to build, test and deploy code. These integrated CI/CD features greatly streamline solution delivery and have given rise to whole new ways of doing things like GitOps.

In this article we examine and compare some of the current pipeline features from three popular Git hosting sites: GitLab, Bitbucket and GitHub, and ask the question: “Is it time to switch from your current CI/CD toolset?”

Example Pipeline

Let’s use pipelines to render the Git Markdown version of this article into an HTML document.

The pipeline features we are using:

  • using Docker images to execute build tasks
  • customising the build environment
  • pipeline stages
  • archiving generated artefacts – in this case a document, but in real life you might be archiving a built Docker image

The pipeline workflow is:

  1. install GNU Make
  2. install pandoc – we are using this to render Markdown to HTML
  3. render the HTML document from Markdown
  4. archive rendered document

The code for this project can be viewed from these Git repositories:

GitLab

GitLab’s Community Edition pipelines are a well-integrated tool, and are our current pipeline of choice.

Example Pipeline

The CI/CD pipelines are easily accessed from the sidebar:

Viewing jobs gives you a pipelines history:

The YAML configuration file .gitlab-ci.yml for this pipeline is:

image: conoria/alpine-pandoc

variables:
  TARGET: README.html

stages:
  - build

before_script:
  - apk update
  - apk add make

render:
  stage: build
  script:
    - make $TARGET
  artifacts:
    paths:
      - $TARGET

Where:

  • image – specifies a custom Docker image from Docker Hub (can be custom per job)
  • variables – define a variable to be used in all jobs
  • stages – declares the stages of the pipeline and the order in which they run
  • before_script – commands to run before all jobs
  • render – name of job associated with a stage. Jobs in the same stage are run in parallel
  • stage – associates a job with a stage
  • script – commands to run for this job
  • artifacts – paths to objects to archive; these can be downloaded if the job completes successfully

What this pipeline configuration does is:

  • load an Alpine Docker image for pandoc
  • invoke the build stage which
    • initialises with alpine package update and install
    • runs the render job which generates the given target HTML
    • on successful completion, the target HTML is archived for download

Features and Limitations

There are many other features, including scheduling pipelines and the ability to configure jobs by branch.
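As a small sketch of both features, the render job from the example could be restricted with the only keyword. The choice of master is an assumption, and the schedule itself is created in the GitLab project settings rather than in this file.

```yaml
render:
  stage: build
  script:
    - make $TARGET
  only:
    # Run for commits pushed to master (assumed branch)...
    - master
    # ...and for pipelines started by a pipeline schedule.
    - schedules
```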

One useful feature for Java / Maven projects is caching of the .m2 directory. This speeds up the build as you don’t have a completely new environment for each build, but can leverage previous cached artefacts instead. GitLab also provides a clear cache button on the pipeline page.
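A common way to set this up is sketched below, under the assumption of a standard Maven project. The job name and image tag are illustrative. GitLab only caches paths inside the project directory, so the local Maven repository is relocated there first.

```yaml
variables:
  # Keep the local Maven repository inside the project directory,
  # because GitLab only caches paths within the project.
  MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"

cache:
  paths:
    - .m2/repository

# Illustrative job name and image tag.
build:
  image: maven:3
  stage: build
  script:
    - mvn package
```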

GitLab also supports hosting of static pages. This is simple to set-up and use, requiring only an additional pages job in the deployment stage to move static content into a directory called public. This makes it very easy to host a project’s generated documentation and test results.
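For the example project, a pages job might look like this sketch. It assumes a deploy stage has been added to the stages list and that README.html arrives as an artifact from the render job. Publishing it as index.html is an illustrative choice.

```yaml
pages:
  stage: deploy
  script:
    # GitLab Pages serves whatever this job leaves in public/.
    - mkdir -p public
    - cp README.html public/index.html
  artifacts:
    paths:
      - public
  only:
    - master
```

The job must be named pages and must publish the public directory as an artifact for GitLab to deploy the site.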

Finally, GitLab provides additional services that can be integrated with your project. For example: JIRA tracking, Kubernetes, and monitoring using Prometheus.

Summary

Overall, GitLab is easy to configure and easy to navigate, and provides Marlo with our current preferred Git pipeline solution.

Bitbucket

Atlassian’s Bitbucket pipeline functionality and configuration are similar to GitLab’s.

Example Pipeline

Again, pipelines and settings are easily reached from the sidebar.

But there are some important differences. Below is the configuration file bitbucket-pipelines.yml:

pipelines:
  branches:
    master:
      - step:
          name: render
          image: conoria/alpine-pandoc
          trigger: automatic
          script:
            - apk update && apk add make curl
            - export TARGET=README.html
            - make -B ${TARGET}
            - curl -X POST --user "${BB_AUTH_STRING}"
                "https://api.bitbucket.org/2.0/repositories/${BITBUCKET_REPO_OWNER}/${BITBUCKET_REPO_SLUG}/downloads"
                --form files=@"${TARGET}"

Here the pipeline will be triggered automatically (trigger: automatic) when you commit to the master branch.

You can define a Docker image (image: conoria/alpine-pandoc) to provision at the level of the pipeline step.

Variables (${BB_AUTH_STRING}, ${BITBUCKET_REPO_OWNER} and ${BITBUCKET_REPO_SLUG}) can be defined and read from the Bitbucket settings page. This is useful for recording secrets that you don’t want to have exposed in your source code.

Internal script variables are set via the script language, which here is Bash. Finally, in order for the build artefacts to be preserved after the pipeline completes, you can publish to a downloads location. This requires that a secure variable be configured, as described here. If you don’t, the pipeline workspace is purged on completion.

Having to configure repository settings manually, outside the project, has some benefits. The consequence, though, is that there are then settings that are not recorded by your project.

Pipeline build performance is very good: this entire step took only around 11 seconds to complete.

Features and Limitations

One limitation is that the free account limits you to only 50 minutes per month with 1GB storage.

A benefit of being able to customise the Docker image at the step level is that your build and test steps can use different images. This is great if you want to trial your application on a production-like image.
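A sketch of two steps using different images follows. The second step, its image tag and its check are assumptions added for illustration. The artifacts entry is what carries README.html from the first step to the second.

```yaml
pipelines:
  branches:
    master:
      - step:
          name: render
          image: conoria/alpine-pandoc
          script:
            - apk update && apk add make
            - make -B README.html
          # Pass the rendered file on to later steps.
          artifacts:
            - README.html
      - step:
          # Illustrative second step on a different image.
          name: check
          image: alpine:3.10
          script:
            # Fail the pipeline if the rendered file is missing or empty.
            - test -s README.html
```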

GitHub

GitHub was recently acquired by Microsoft.

When you create a GitHub repository, there is an option to include Azure Pipelines. However, this is not integrated with GitHub directly, but is configured under Azure DevOps.

Broadly, the steps to set-up a pipeline are:

  • sign up to Azure pipelines
  • create a project
  • add GitHub repository to project
  • configure pipeline job

Builds are managed from the Azure DevOps dashboard. There appears to be no way to manually trigger a build directly from the GitHub repository, though a commit will happily trigger a build for you. But, again, you need to be on the Azure DevOps dashboard to monitor the pipeline jobs.

Example Pipeline

The following YAML configuration uses an Ubuntu 16.04 image provided by Azure. There is a limited number of images, but they are well maintained, with packages kept up to date. They come with many pre-installed packages.

Below is the Azure pipeline configuration azure-pipelines.yml:

trigger:
  - master

pool:
  vmImage: 'Ubuntu-16.04'

steps:

  - script: |
      sudo apt-get install -y pandoc
    displayName: 'install_pandoc'

  - script: |
      make -B README.html
    displayName: 'render'

  - powershell: |
      gci env:* |
      sort-object name |
      Format-Table -AutoSize |
      Out-File $env:BUILD_ARTIFACTSTAGINGDIRECTORY/environment-variables.txt

  - task: PublishBuildArtifacts@1
    inputs:
      pathtoPublish: '$(System.DefaultWorkingDirectory)/README.html'
      artifactName: README

If the package you need is not installed, then you can install it if available from the Ubuntu package repositories. The default user profile is not root, so installation requires the use of sudo.

To create an archive of artefacts for download, you need to invoke a specific PublishBuildArtifacts task.

Azure is fast as it uses images that Microsoft manages and hosts. The above job to install pandoc and render this page as HTML takes only 1 minute.

Features and Limitations

The biggest negative to Azure Pipelines is its limited integration with the GitHub dashboard. Instead, you are strongly encouraged to manage pipelines using the Azure DevOps dashboard.

Update

Since the first draft of this article, GitHub announced the support of pipeline automation called GitHub Actions. Marlo is engaged in the beta program and we will have some new information to post here shortly.

Summary

In Marlo’s DevOps practice we are constantly looking at ways to increase our productivity and effectiveness in solution delivery. Of the three Git pipelines looked at here, we found GitLab the easiest to adopt and use. Its YAML-based syntax is simple, but its functionality is broad. Our developers have quickly picked up and implemented pipeline concepts.

Git pipelines will not be suitable in every circumstance – for example, Ansible infrastructure provisioning projects. However, there are clear advantages to using a hosted pipeline that ensures that your project builds somewhere other than on your machine. It also removes the cost of building and maintaining your own infrastructure. This could be of great benefit to projects where time constraints limit one’s ability to prepare an environment.

The pipeline configuration augments your project’s documentation for build, test and deployment. It is an independent executable description for your project that explicitly lists dependencies.

Since the first draft of this article was written, there has been increasing competition and continuous innovation amongst Git repository vendors.

So, yes: it is a great time to switch to a Git pipeline toolset!