New pipelines in our CI

Daniele is the local horologist at Novoda. Passionate about wearable technologies, he likes to experiment with the whole Android ecosystem, from AOSP to Cast and Android TV.

When working with Agile, one of the most important tools we rely on is a Continuous Integration (CI) system. This allows us to automate most of the tasks that follow a change in a project's codebase, from testing to static analysis, including automatic deployment.
At Novoda we use Jenkins for all our projects, in order to automate our daily working routine, and we recently started investigating a new feature of this tool: build pipelines.

Pipeline what???

Wikipedia gives us the following definition for a pipeline:

In computing, a pipeline is a set of data processing elements connected in series,
where the output of one element is the input of the next one.

The core point in this definition is that a pipeline is composed of several elements, executed one after the other, with the execution of each element depending on the result of the previous one.
It is easy to see the relation with a CI job, where the execution is composed of several steps, each one with its own configuration and dependent on the result of the previous ones.
Compared to traditional (freestyle) jobs, pipelines provide a better visualisation of the status of the various parts composing a job, and they are more versatile, given their ability to fork, join, iterate, and run work in parallel.

Pipeline DSL

With the Pipeline plugin, Jenkins introduces a domain-specific language (DSL) based on Groovy which can be used to define a new pipeline as a script. The DSL is extensible and supports custom extensions and multiple options for integration with other plugins.
Let’s have a look at the basic syntax which can be used to start defining a pipeline.

Nodes

In order to execute most of the steps in a pipeline, the job needs at least one node.
The name might be misleading: in the Jenkins UI, a node is usually one of the elements of the CI network, with the master coordinating a set of slaves (or agents). Each of these nodes can have multiple executors, and every executor allows one job instance to run, so associating multiple executors to the same machine allows several jobs to run in parallel.
The node we are talking about in the pipeline context is directly related to an executor: all the commands included in a node closure will be run by a dedicated executor on one of the CI agents.
To request a node, we just need to use the following syntax:

node {  
    // commands to run
}

It is possible to restrict which nodes can run the enclosed commands using one of the labels defined in the Jenkins Nodes settings, by specifying one of the agent names directly, or based on the agent OS characteristics (windows, unix, etc.).
For example, in our CI network we have several agents with the label “android”, and we can select only those using:

node ('android') {  
    // commands to run
}

Stages

Stages allow us to group job steps into different parts, and they are the main components of a pipeline. The new job visualisation makes this aspect even clearer:

It is easy to see the duration of a single stage in a pipeline across multiple job runs.
Also, if an error occurs in one of the stages, the failing stage is immediately highlighted and the following stages are not executed.
To define a set of stages we proceed as follows:

stage "Checkout"  
// get project source code from SCM

stage "Build"  
// run the project-specific build script

Stage "Tests"  
// if the build was successful, run all the tests

Steps

Inside a stage it is possible to add several steps, just as in traditional freestyle jobs. Jenkins pipelines support any build step from the plugins installed in the Jenkins environment.
For example, if we want our CI to change the status of the build to stable or unstable depending on the result of the FindBugs static analysis, we can use the following step:

step([$class             : 'FindBugsPublisher',  
      canComputeNew      : false,
      pattern            : '**/findbugs/*.xml',
      unstableTotalHigh  : '0',
      unstableTotalNormal: '5',
      unstableTotalLow   : '10'])
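
With these thresholds the build will be marked as unstable as soon as the analysis reports more than zero high-priority, five normal-priority, or ten low-priority issues.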

Scripts

It is also possible to run shell (or batch) scripts as part of a pipeline.
For example, we can run Gradle tasks for the current project using:

sh "./gradlew clean build"  

Parallel execution

The Pipeline plugin supports the parallel execution of commands, by creating a map of branch names and the commands to run in each branch:

def branches = [:]  
branches["Branch 1"] = {  
    node() {
        unstash name: 'workspace'
         // do something with the project files
    }
}

branches["Branch 2"] = {  
    node() {
        unstash name: 'workspace'
        // do something else with the project files
    }
}

parallel branches
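
Note that the unstash steps above assume the workspace files were previously saved by a matching stash step on another node, for example (a minimal sketch):

node {  
    // populate the workspace from the configured SCM
    checkout scm
    // save the workspace files so other nodes can restore them
    stash name: 'workspace', includes: '**'
}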

The first problem with parallel execution is that it is not possible to define stages inside a parallel branch, so there is no graphical representation of the status of a single branch.
The second and bigger problem is related to performance: we ran some benchmarks and found that serial execution takes less time than running the same steps in parallel. This is probably due to the extra time needed to copy the workspace files between two nodes.

Pipeline job

To create a new pipeline, the first step is to create a new Pipeline job. To do so, from the main Jenkins screen select “New Item”, specify a job name and then select “Pipeline” as the job type.

In the following configuration page, the first three sections are the same as for a traditional Jenkins job (General, Build Triggers and Advanced Project Options).
What really changes is the new Pipeline section, where we can choose between two options in order to define the job pipeline: using the inline editor or loading the pipeline from a versioned file.

Inline editor

The inline editor allows us to quickly test a new job configuration by editing the pipeline script in a text field, saving the configuration and running the job.

Versioned pipeline

As pipelines grow, it becomes difficult to maintain them using only the text area in the Jenkins job configuration page. A better option is to store them in dedicated files versioned within your project repository. This way, all the changes to the job configuration are versioned and can easily be updated by a script (think of a script automatically lowering static analysis thresholds after every release, for example).
In this case we won't have the option to type the pipeline content directly in the configuration page; instead, we will need to specify the repository parameters for the job and the path of the file containing the pipeline definition inside the repository itself.
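
Parts of a large pipeline can also be split into separate versioned files and evaluated with the load step. A minimal sketch, where the ci/utils.groovy path and the runChecks method are hypothetical:

node {  
    checkout scm
    // evaluate a Groovy file from the repository;
    // the file is expected to end with 'return this'
    def utils = load 'ci/utils.groovy'
    utils.runChecks()
}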

Snippet generator

To help learn the pipeline DSL, a snippet generator is provided along with the Pipeline plugin. This generator allows us to create snippets of code for practically all the steps available within a pipeline. More interestingly, it is aware of the Jenkins environment, so it provides some error checking and additional steps depending on the installed plugins.
As we can see in the following image, the snippet generator integrates with existing build steps, allowing a configuration similar to the one we are used to for traditional Jenkins jobs.

No pull request builder :(

What we have described so far is great for a pipeline used to build the main branch of a repository, but what if we want to check our changes even before they get merged?
At Novoda we use pull requests in order to have a manual review of every change (if you don’t know what pull requests are, have a look at this page and start using them in your team NOW). On top of the manual review by our peers, we use the GitHub Pull Request Builder plugin in most of our projects. This allows us to have our CI perform the same checks and tests that would be run on the main branch, even before the pull request is merged. More importantly, it notifies us by posting the result of those checks on the pull request page.

Now, this is beautiful, but there is a small problem: the GitHub Pull Request Builder plugin is not compatible with the Pipeline one, so it is not possible to use the two together to automatically trigger a pipeline when a new pull request is opened or updated.
Luckily there is an alternative which allows us to keep using pipelines with a pull request-based flow.

Multibranch pipelines

A multibranch pipeline job will execute a given pipeline on multiple branches of a repository, automatically creating a secondary job for every branch.
In order for the job to correctly integrate with GitHub we need to use the GitHub plugin when specifying the project source, using the credentials of a GitHub user with admin access to the repo.

Using GitHub webhooks, every time a change is pushed the plugin will scan the configured repository for branches containing a file named Jenkinsfile in their root directory. By default all the branches will be checked, but it is possible to apply filters in the job configuration.
For every branch containing such a file, a subproject will be automatically created. It is not possible to edit these subprojects' configuration: when run, they simply execute the pipeline defined in the Jenkinsfile. Once a branch is deleted from the repository, the related job is automatically removed.
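
A minimal Jenkinsfile could look like the following sketch (the node label and the Gradle tasks are illustrative):

node('android') {  
    stage "Checkout"
    // in a multibranch job, 'scm' points at the branch being built
    checkout scm

    stage "Build and test"
    sh "./gradlew clean build"
}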

For every change pushed to the repository, the parent project will trigger a build for the subproject related to the modified branch.
When a pull request is opened on GitHub, the status of the build for the modified branch is displayed on the pull request page and updated after every subsequent commit.

As is easy to see, this strategy makes the CI work more, running a job every time a change is pushed rather than only when a pull request is opened. This however has the big advantage of keeping us developers always aware of the status of the branch we are working on. It also speeds up the review of pull requests: as soon as one is opened, its status is immediately available, whereas with the Pull Request Builder we had to wait for a build triggered after the pull request was opened to know the status.

Conclusions

What we liked

Pipelines provide new power and flexibility to our Jenkins CI configuration, and it is really easy to migrate existing jobs to the new system.
Multibranch jobs provide great visibility over the status of a work-in-progress branch, and having pipelines defined in a versioned file allows us to easily find out who changed what and why, making the history of changes readily apparent.
Finally, the DSL used to define pipelines is extensible, allowing us to define new custom steps to better cover everything our CI might need to execute.

What we didn't like

Unfortunately there are still several plugins that aren't entirely compatible with the Pipeline one, for example the Pull Request Builder and some static analysis visualisations. Luckily there seems to be a good effort to update the not-yet-compatible plugins, so these problems should be solved soon.

For more information please have a look at the official documentation.

About Novoda

We plan, design, and develop the world’s most desirable Android products. Our team’s expertise helps brands like Sony, Motorola, Tesco, Channel4, BBC, and News Corp build fully customized Android devices or simply make their mobile experiences the best on the market. Since 2008, our full in-house teams have been working from London, Liverpool, Berlin, Barcelona, and NYC.

Let’s get in contact