Run Android modularized builds selectively on GitHub Actions
Rosário Pereira Fernandes
Posted on April 9, 2020
Header Photo by Brad Neathery on Unsplash
TL;DR: I've improved Android build times on GitHub Actions for Pull Requests from ~17min down to ~5min by building only the modules that a Pull Request changes.
A few months ago I read this great post "Selectively running Android modularized unit tests on your CI server", where Joe Birch describes some of the tips and tricks he used to implement a CI Workflow that selectively runs unit tests on a modularized Android Project. This week I had some free time and decided to try implementing a similar workflow on a project that I've been contributing to (firebase/quickstart-android). So in this post I've decided to share the challenges I faced while doing it and some of the workarounds.
Before we get started, let me walk you through some of the basics to make sure you're familiar with the concepts described here. If this post's title was 100% clear to you, you can go ahead and skip to the Current Scenario section.
The Basics
1. Modularized Android?
If you're an Android Developer, you (probably) already know that an Android Studio Project can be divided into modules. This allows developers to create different apps for the same Project. For example, if I'm building a "Health Tracking" app, I could split it into different modules:
- app - this is usually created by default when you select "new Android App" in Android Studio. This module is the Android app for phones/tablets;
- wear - let's suppose we also want our app to work on Android Wear devices. We can build that app in this separate module;
- lite - let's say we have a different version of the app which uses fewer resources and is meant for Android devices with low processing power.
If you're not yet familiar with Modules, I recommend checking the Android Documentation.
2. GitHub Actions?
Now before we get into this, you should first understand the concept of Continuous Integration. I really like this definition by Atlassian:
Continuous integration (CI) is the practice of automating the integration of code changes from multiple contributors into a single software project. The CI process is comprised of automatic tools that assert the new code’s correctness before integration.
Putting it in simple words: let's say we have a project stored on GitHub and many contributors. Before accepting any contribution, we want to make sure that this contribution will not break our code, and for that we can use Automatic CI Tools which will try to build the new code and then tell us if it breaks our current code or not.
Personally, I've been working with tools such as Travis CI and CircleCI. But recently, GitHub introduced GitHub Actions - another automated CI tool. This came as great news for some developers, since it means they can have their source code and their CI on a single platform: GitHub. No need to share the code with any other 3rd party CI tools.
Current Scenario
As I mentioned at the beginning of the post, I've been contributing to firebase/quickstart-android. This Open Source project is basically a collection of Android sample apps that aim to show developers how to use the Firebase APIs. Because Firebase provides many different services, the project has been divided into modules (one for each Firebase Service).
At the time I wrote this post, it had 18 different modules and building the whole project on GitHub Actions would take around 17min (average).
So whenever someone sent a contribution to the project, they'd have to wait for around 17 minutes before they could find out whether their contribution breaks the project or not.
This may be "unfair" at times when someone has implemented changes in a single module and has to wait for GitHub Actions to build all the other unchanged modules. Or even worse: when someone doesn't change any code at all, because their changes were made to the project's documentation (e.g. README.md, LICENSE, etc.), but they still have to wait for the project to build. And Joe's post tries to show how to reduce CI time in such cases.
NEED FOR SPEED!!!
So what Joe proposes is: instead of testing the whole project (done with gradlew testDebugUnitTest), we could simply test the changed modules only (with gradlew :moduleA:testDebugUnitTest :moduleB:testDebugUnitTest) which might be a lot faster.
Now in my scenario, the project doesn't really have any tests and what the CI does is basically check if the project builds successfully and if there are no lint issues (gradlew clean ktlint build).
For Pull Requests I've decided to only build the debug variant of the changed module (gradlew clean ktlint :module:assembleDebug :module:check).
Doing this should be easy, right? I could simply copy Joe's code. But as it turned out, it was not that straightforward, mainly because he uses a different CI Environment (Bitrise) and some of his code was not running correctly on GitHub Actions.
So let's have a look at what my code looks like and I'll explain the changes I made.
1. Find out which modules were changed
In order to find the changed modules, he uses git diff with a while loop in a pipe to read the command's output. I've tested this on my local terminal (zsh) and it works fine. But GitHub Actions runs the script with bash, and bash was not so happy with that pipe: it runs the piped loop in a subshell, so the variables set inside it are lost when the loop ends. So I had to rewrite the while loop in the main shell, as suggested here:
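# build_pull_request.sh
# Get all the modules that were changed
while IFS= read -r line; do
  module_name=${line%%/*} # This gets the first path segment (everything before the first '/')
  # Now we check if we haven't already added this module.
  # The padding spaces force a whole-name match, so a module whose name
  # is contained in another module's name is not skipped by mistake.
  if [[ " ${MODULES} " != *" ${module_name} "* ]]; then
    MODULES="${MODULES} ${module_name}" # string concat
  fi
done < <(git diff --name-only "origin/$GITHUB_BASE_REF")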
Notice that we're using an env variable from GitHub Actions: $GITHUB_BASE_REF - this contains the name of the branch that our Pull Request is supposed to be merged into (generally the master branch).
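Here's a small sketch that shows both loop styles side by side, in case you want to try it yourself. The sample_diff function and its file names are made up and only stand in for the git diff output; PIPED and KEPT are hypothetical variable names.

```shell
#!/usr/bin/env bash
# Made-up stand-in for `git diff --name-only` output.
sample_diff() {
  printf '%s\n' 'auth/app/src/Main.kt' 'auth/README.md' 'database/app/build.gradle'
}

# Loop on the receiving end of a pipe: bash runs it in a subshell,
# so PIPED is still empty once the pipeline has finished.
PIPED=""
sample_diff | while IFS= read -r line; do
  PIPED="${PIPED} ${line%%/*}"
done
echo "piped: '${PIPED}'"

# Loop fed through process substitution: it runs in the current shell,
# so KEPT keeps every value appended inside the loop.
KEPT=""
while IFS= read -r line; do
  KEPT="${KEPT} ${line%%/*}"
done < <(sample_diff)
echo "kept: '${KEPT}'"
```

Run with bash, the first echo shows an empty string while the second one lists the first path segment of every sample file.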
After pushing this, it still wouldn't work. It would show an error, saying it couldn't find the branch I had specified ($GITHUB_BASE_REF) and that's because GitHub Actions pulls our code to their CI machine using a shallow clone, therefore the branches were not being fetched. So in order to fix this, we have to unshallow and fetch the branches before running git diff:
# build_pull_request.sh
git fetch --unshallow
git fetch origin
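One thing to watch out for: git fetch --unshallow exits with an error when the repository already has its full history (for example, when you run the script on your own machine). Below is a minimal sketch of a guard, assuming a git version that supports rev-parse --is-shallow-repository (2.15 or newer). The ensure_full_history name and the throwaway demo repository are hypothetical and not part of the original script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Fetch the full history only when the checkout is shallow.
# `git fetch --unshallow` fails on a repository that is already complete.
ensure_full_history() {
  if [ "$(git rev-parse --is-shallow-repository)" = "true" ]; then
    git fetch --quiet --unshallow
  fi
  git fetch --quiet origin
}

# Demo against a throwaway local repository (no network needed).
workdir=$(mktemp -d)
git init --quiet "$workdir/upstream"
for i in 1 2 3; do
  git -C "$workdir/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "commit $i"
done
# The file:// URL makes git honour --depth for a local clone.
git clone --quiet --depth 1 "file://$workdir/upstream" "$workdir/ci"

cd "$workdir/ci"
before=$(git rev-list --count HEAD)   # 1 commit in a depth-1 clone
ensure_full_history
ensure_full_history                   # second call skips --unshallow
after=$(git rev-list --count HEAD)    # 3 commits once unshallowed
echo "commits before: $before, after: $after"
```

Calling the helper twice in the demo is deliberate: it shows that the guard makes the step safe to repeat.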
2. Check if the changed files belong to any module
Now this is not very different from Joe's script, with a few exceptions, since I adapted it for my own scenario:
# build_pull_request.sh
# Get a list of all available gradle tasks
AVAILABLE_TASKS=$(./gradlew tasks --all)
# Check if that list contains the modules that were changed
build_commands=""
for module in $MODULES
do
  if [[ $AVAILABLE_TASKS =~ $module":app:" ]]; then
    build_commands=${build_commands}" :"${module}":app:assembleDebug :"${module}":app:check"
  fi
done
# Build the Pull Request
# (build_commands is left unquoted on purpose so that each task is passed as its own argument)
./gradlew clean ktlint ${build_commands}
Notice that if no module was changed, the build_commands variable will be empty and CI will only run gradlew clean ktlint.
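To make the filtering step easier to follow without running Gradle, here's a small dry run. The task list and the module names (auth, database, docs) are invented sample data; in the real script they would come from ./gradlew tasks --all and from git diff.

```shell
#!/usr/bin/env bash
# Invented excerpt of what `./gradlew tasks --all` might print.
AVAILABLE_TASKS='auth:app:assembleDebug
auth:app:check
database:app:assembleDebug
database:app:check'

# Invented first path segments of the changed files.
MODULES=" auth docs database"

build_commands=""
for module in $MODULES; do
  # Only names that own an ":app:" task count as buildable modules,
  # so a folder such as "docs" is silently skipped.
  if [[ $AVAILABLE_TASKS =~ ${module}":app:" ]]; then
    build_commands="${build_commands} :${module}:app:assembleDebug :${module}:app:check"
  fi
done

# Print the command instead of running it.
echo "./gradlew clean ktlint${build_commands}"
```

With this sample data, the printed command contains the assembleDebug and check tasks for auth and database only.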
Now we can save our file as build_pull_request.sh and change its permissions so that it can be executed: chmod +x build_pull_request.sh.
How will GitHub Actions know whether it's a Pull Request?
That's the easiest part. We can add conditions to our workflow config:
# .github/workflows/android.yml
name: Android CI
on:
  - pull_request
  - push
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: set up JDK 1.8
        uses: actions/setup-java@v1
        with:
          java-version: 1.8
      - name: Build with Gradle (Pull Request)
        run: ./build_pull_request.sh
        if: github.event_name == 'pull_request'
      - name: Build with Gradle (Push)
        run: ./gradlew clean ktlint build
        if: github.event_name != 'pull_request'
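If you'd rather keep that decision inside a script, GitHub Actions also exposes the event name to every step as the GITHUB_EVENT_NAME environment variable. Here's a rough sketch of that alternative to the workflow conditions; choose_build_command is a hypothetical helper that only prints the command it would pick, and the two assignments at the bottom simulate what the runner would set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick the build command from the triggering event.
choose_build_command() {
  if [ "${GITHUB_EVENT_NAME:-}" = "pull_request" ]; then
    echo "./build_pull_request.sh"
  else
    echo "./gradlew clean ktlint build"
  fi
}

# Simulate the two events used in the workflow.
GITHUB_EVENT_NAME="pull_request"
pr_command=$(choose_build_command)
GITHUB_EVENT_NAME="push"
push_command=$(choose_build_command)

echo "pull_request -> ${pr_command}"
echo "push -> ${push_command}"
```

Keeping the condition in the workflow file has the advantage that the skipped step is visible in the GitHub Actions UI, so this is mostly a matter of taste.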
And that's it! One git commit and a git push and you should see CI times faster than Lightning McQueen.
Final Considerations
- After implementing this change, CI times in quickstart-android PRs dropped from ~17min to ~5min. While this may seem like it's only 12 minutes, it actually adds up to days, weeks or months in the long run. It also means that the CI Server will be freed sooner, allowing other contributions on different repositories of the same organization to use the CI.
- The full script and workflow config file are available on Gist.
- Please note that this code only works if all modules are independent. So if you have moduleA depending on moduleB and you make changes to moduleB, then moduleA might break and not be tested.
- If you're new to Continuous Integration, I'd also recommend reading about Continuous Delivery.
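If your modules are not independent, one possible workaround for the limitation mentioned above is to add the direct dependents of each changed module to the list before building. This is only a sketch under assumptions: dependents_of is a hypothetical, hand-maintained lookup (nothing here reads the Gradle dependency graph), and moduleA/moduleB are the placeholder names from the bullet above.

```shell
#!/usr/bin/env bash
# Hypothetical lookup: prints the modules that depend on the given module.
dependents_of() {
  case "$1" in
    moduleB) echo "moduleA" ;;   # moduleA depends on moduleB
    *)       echo "" ;;
  esac
}

MODULES=" moduleB"   # pretend only moduleB was changed

# Add direct dependents only; transitive dependents are not followed.
for module in $MODULES; do
  for dependent in $(dependents_of "$module"); do
    if [[ " ${MODULES} " != *" ${dependent} "* ]]; then
      MODULES="${MODULES} ${dependent}"
    fi
  done
done

echo "modules to build:${MODULES}"
```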
I hope this helps you improve build times in your CI Environment. If you plan on trying out anything I've mentioned here, leave a comment below or reach out to me on Twitter (@_rpfernandes) and I'll be happy to have a chat about it.