The Git workflow you need: How to deal with multiple teams in a single repository
Brian Neville-O'Neill
Posted on July 8, 2019
Stop me if you’ve lived it before: you’re either part of, or currently managing, a big team of developers who don’t all use the same tech stack. Your team is made up of back-end developers working in Java, front-end devs working in AngularJS, and you even have a couple of data scientists working in Python.
On top of that, everyone said they know how to use Git, but in reality, they don’t. They usually deal with version control through their IDE of choice, clicking options without knowing exactly what they do.
Normally, reason would dictate that these teams handle their source code separately, which means using a different repository for each codebase. That would also give them the ability to have individual development flows, independent of each other.
That being said, oftentimes luck is not on your side, and you’re left with a single repository and three different teams, trying to learn how to work together. In this particular article, I’m going to tackle this scenario, but solely from the source control point of view. In other words, how to create a useful development flow that allows everyone to work together without messing up each other’s code.
Why not simply go with a successful Git branching model?
In 2010, Vincent Driessen published a very interesting article describing an approach to handling version control with Git in development teams.
Essentially, what that article proposed (without all of the bells and whistles; if you want all the details, go directly to the article) was that you’d:
- Create one branch for each feature you need to work on. These branches will come from a base development branch, where all the dev code resides
- Each developer will work on their respective feature branches until they are considered ready
- Once ready, they’ll be merged back to their source
- When all features are ready, you’ll create a release branch from development, where only bug fixes will be accepted to ensure no half-finished features are deployed
That’s the flow in a nutshell. There are a few other considerations when it comes to tagging and hotfixes, but I’ll let you read the original article for that.
So, just like many others, I took that approach to heart, and it works very well (in my humble opinion) with homogeneous teams where everyone works as one on the same code.
The problem comes when that is no longer the reality.
And don’t get me wrong, the model still works if your team is proficient with the tool. If they know what it means to pull versus fetch from a repository, or how to deal with merge conflicts correctly, then, by all means, use this model.
Sadly, this is not the case all of the time. Too many developers tend to gloss over the Git documentation when they need to use it. This either causes minor problems when the teams are small enough, or forces them to elect teammates to take on the responsibility of doing all the merges.
Maybe you’ve been there as well: you have some devs on your team who know the tool very well and understand what happens when they use it, so they tend to be the ones that handle the most complicated tasks.
For example, you might have these devs creating the feature branches at the start of the sprint and then taking care of the merges once the others deem the code ready.
This might be a setup that works in some cases, but no doubt, it’ll add a lot of responsibility to those specific individuals and it will definitely take time away from their development.
So, what’s the worst that can happen if we don’t try to adjust our git flow?
Common problems we need to avoid
Let me share a few examples I’ve lived through that made me come up with this approach.
Chaining branches
The flow dictates that every new branch needs to come from the main development branch. This is to avoid bringing incomplete code with us from other half-finished branches. The problem here is developers who don’t pay attention when creating their branches and use another branch, maybe an older one, as the source by mistake.
Now they’re trying to merge their completed code into development and, understandably, are hitting a lot of merge conflicts. This gets even worse if the developer just accepts their own version of the code to resolve them since, in their mind, their work is the latest.
Once this is all said and done, they’ve uploaded their code, yes, but in the process, they also overwrote the newest version of the other team’s code with older, unfinished versions of it.
Let’s look at it using a very simple diagram:
In the end, the code that gets merged from branch F2 carries the unfinished code from F1. And because all teams share the same repository, F1 could’ve been a front-end specific branch and F2 could be for the back-end team. Can you imagine the chaos that comes from having someone from back-end messing up the code for the front-end? It’s not pretty, I can tell you.
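To see why F2 drags F1 along, here is a minimal sketch in Python that models the situation as a tiny commit graph. This is a toy model, not Git itself, and the commit names (D1, F1a, F2a, F2b) are made up for illustration. The `incoming` function computes the same set of commits that `git log development..F2` would list in a real repository.

```python
# Toy model of the "chaining branches" mistake: each commit maps to its parents.
# Commit names are hypothetical; branch names follow the example above.
parents = {
    "D1": [],        # shared starting point on development
    "F1a": ["D1"],   # unfinished front-end work on F1
    "F2a": ["F1a"],  # F2 was created from F1 by mistake
    "F2b": ["F2a"],
}
branches = {"development": "D1", "F1": "F1a", "F2": "F2b"}


def reachable(tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def incoming(target, source):
    """Commits that merging `source` into `target` would bring along."""
    return reachable(branches[source]) - reachable(branches[target])


def foreign_commits(target, source, other):
    """Incoming commits that also belong to another, still unmerged branch."""
    return incoming(target, source) & reachable(branches[other])


print(sorted(incoming("development", "F2")))
print(sorted(foreign_commits("development", "F2", "F1")))
```

The second call returns the F1 commit, which is exactly the unfinished code that sneaks into development. Running `git log development..F2` before merging is a cheap way to spot commits you don’t recognize.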
Premature merges
Similarly to the previous problem, if you merge your unfinished feature branch into development just to see how it would work, or (even worse) to make sure there are no conflicts, you’re essentially poisoning the main branch with your unfinished code.
The next developer who comes along and creates a brand new branch from the base one (like they’re supposed to) will be carrying your code. And when they decide to merge it back, assuming you’ve already finished your code and merged it before them, they’ll have to solve merge conflicts for your code, and not theirs! #WTF
Check out the next flow diagram showing this exact case:
In the end, the results are the same as before: you’re affecting other people’s work without even realizing it. In fact, these problems can remain unseen until they hit production, so you need to be extra careful with the way you handle code.
There are other ways to screw up your co-workers’ code, but they are somewhat related to these two examples, and as you are probably guessing by now, the actual challenge is not with the flow itself but rather with the team.
The ultimate fix for this is training the developers involved so they don’t keep making the same mistakes. But if you can’t, or they won’t learn (after all, to err is human), the other option you have is to adjust your flow so that you minimize the damage done.
A new flow
What I tried to achieve with this flow is to narrow down the area of effect a mistake can have. By compartmentalizing the code into very segregated branches, if someone forgets something, or simply doesn’t want to play by the rules, they’ll only affect their immediate teammates and not the rest of the teams.
Problems are impossible to avoid. The key here is to not let them spread to other teams, because then fixing them becomes a project-wide task, whereas if it’s just a front-end or back-end issue, that team can take care of it on their own.
Let’s now look at how this flow would look for a two-team composition (you can easily extrapolate to any number of sub-teams inside your project):
That’s a lot of lines, I know, but bear with me for a second.
The flow tries to show how two teams (T1 and T2) would work within a sprint’s worth of time, on two different features (F1 and F2).
Just so everything is clear, here are the details:
- Dotted arrows are merges that happen automatically
- T1Dev and T2Dev are the development branches for each team. The code within them should not mix; that’s the whole point. It would be like mixing front-end code and data science code (you just don’t do it)
- T1Stable and T2Stable are copies of the corresponding T1Dev and T2Dev, but they only contain code that is stable. This is ensured because merges into these branches happen only when their features are closed (meaning the QA team has approved them)
- At the start of each sprint, a tag is created for each team from their corresponding stable branches
- New feature branches are created from the tag of the current sprint
- Whatever gets merged into the base Development branch is tested by the developer and, if it works as expected, a merge command is issued so the code is merged into the QA branch (and subsequently deployed into that environment for that team to test)
- At the end of the sprint, the stable code gets deployed into Production (by merging it into the PROD branch)
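To make the first steps of the list above more concrete, here is a minimal sketch in Python that builds the Git commands for starting a sprint and starting a feature. The branch name T1Stable comes from the flow itself; the tag format (`T1-sprint-12`) and the feature branch format (`T1/F1`) are naming conventions I’m assuming for this example, so adapt them to your own.

```python
# Builds the Git commands for the start of a sprint in the flow described above.
# The tag and feature-branch naming schemes are hypothetical conventions.

def sprint_tag(team, sprint):
    """Name of the tag that marks a team's stable code at sprint start."""
    return f"{team}-sprint-{sprint}"


def start_sprint(team, sprint):
    """Commands run once per team when the sprint begins."""
    tag = sprint_tag(team, sprint)
    return [
        "git fetch origin",
        # Tag the tip of the team's stable branch as it is right now
        f"git tag -a -m 'Start of sprint {sprint}' {tag} origin/{team}Stable",
        f"git push origin {tag}",
    ]


def start_feature(team, sprint, feature):
    """Commands a developer runs to open a feature branch from the sprint tag."""
    tag = sprint_tag(team, sprint)
    branch = f"{team}/{feature}"
    return [
        "git fetch --tags origin",
        # Branch from the tag, never from T1Dev or from another feature
        f"git checkout -b {branch} {tag}",
        f"git push -u origin {branch}",
    ]


for command in start_sprint("T1", 12) + start_feature("T1", 12, "F1"):
    print(command)
```

Generating the commands from a single place means every feature branch in the sprint is created from the same tag, which is the property the flow depends on.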
I know that sounds like a lot and might look like too much to handle, but it helps prevent a lot of disasters.
Let me explain.
Tags make sure all of the branches created within a sprint share the same origin code. This is very important because, without them, you could potentially create a new branch one week into the sprint containing any partial test your teammates might have merged into your team’s development branch. This basically prevents you from unwittingly promoting unfinished code from others while merging yours.
Stable branches help you in the process of promoting code into production (or possibly a step before that, UAT). You see, in an ideal world, you’d just promote your QA branch into the next environment. But in reality, there can always be carry-over, either due to unfinished features or buggy ones. Whatever the case may be, those pieces of code are not good enough to get out of QA and into production, so when setting up the next deployment, you’d need to hand-pick your branches, taking only those which got approved. This way, you have a branch for each team that is pre-approved, so all you gotta do is merge these branches into production and you’re ready.
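The difference between branching from a tag and branching from a moving branch can be shown with a small sketch in Python. Again, this is a toy model and not Git: the commit names (s1, s2, wip-login) and the tag name are hypothetical. The point is that a branch reference moves with every merge, while the tag keeps pointing at the commit it was created on.

```python
# Toy model: a tag is a fixed pointer, a branch is a pointer that moves.
history = {"s1": None, "s2": "s1"}           # commit -> parent commit
refs = {"T1Stable": "s2", "T1Dev": "s2"}     # branch -> current tip
tags = {"T1-sprint-12": refs["T1Stable"]}    # created at sprint start


def commit(branch, name):
    """Add a commit on top of `branch` and move the branch forward."""
    history[name] = refs[branch]
    refs[branch] = name


def log(start):
    """List the commits reachable from `start`, newest first."""
    commits = []
    while start is not None:
        commits.append(start)
        start = history[start]
    return commits


# One week into the sprint, a teammate merges a partial test into T1Dev
commit("T1Dev", "wip-login")

print(log(refs["T1Dev"]))          # what a branch created from T1Dev contains
print(log(tags["T1-sprint-12"]))   # what a branch created from the tag contains
```

A branch created from T1Dev today inherits the partial test, while one created from the sprint tag starts from the same clean code as every other feature in the sprint.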
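As a rough illustration of that promotion path, here is a sketch in Python that builds the Git commands for closing an approved feature and for the end-of-sprint release. The branch names T1Stable, T2Stable and PROD come from the flow; the feature branch name and the use of `--no-ff` (to keep one merge commit per promotion) are my own assumptions, as is running these steps by hand instead of from an automated job.

```python
# Builds the Git commands for promoting approved code, assuming manual merges.
# `--no-ff` is a choice made for this sketch, not a requirement of the flow.

def close_feature(team, feature_branch):
    """Commands run once QA has approved a feature."""
    stable = f"{team}Stable"
    return [
        "git fetch origin",
        f"git checkout {stable}",
        f"git pull --ff-only origin {stable}",
        f"git merge --no-ff origin/{feature_branch}",
        f"git push origin {stable}",
    ]


def release(teams, prod="PROD"):
    """Commands run at the end of the sprint to deploy the stable code."""
    commands = [
        "git fetch origin",
        f"git checkout {prod}",
        f"git pull --ff-only origin {prod}",
    ]
    # Only the pre-approved stable branches are merged, one per team
    for team in teams:
        commands.append(f"git merge --no-ff origin/{team}Stable")
    commands.append(f"git push origin {prod}")
    return commands


for command in close_feature("T1", "T1/F1") + release(["T1", "T2"]):
    print(command)
```

Notice that nothing from T1Dev, T2Dev or QA is merged into PROD directly, so carry-over features never need to be picked out by hand.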
Individual development branches (T1Dev and T2Dev in the example above) help isolate the code. You see, merging code into these branches needs to be done by the developers themselves, and as we discussed at the start of this article, you can’t always trust in their ability to do so correctly.
By having individual development branches, you make sure that if they make any mistakes, they will only affect their team and not the entire project.
Depending on the size of the features, you might need to create several individual branches from your feature branch. You can structure your local development workflow however you see fit; just remember one thing: anything you do needs to come from and go into the feature branch. That’s it.
A few more recommendations outside of the flow
Although the flow by itself will help limit the area of effect of any unintentional mistake your team or teammates can make, there are other recommendations that go hand-in-hand with it and can help prevent them even more.
Document the flow
Development flows need to be documented, especially complex ones. Everyone needs to be able to understand exactly what needs to happen and when, and more importantly, how to do it.
In other words, don’t be afraid to write foolproof documents that lead the developers by the hand. It might sound like a lot, but you’ll write it once, and you’ll use it often, especially at the start of your project and with every new dev joining it afterwards.
Having step-by-step descriptions saves them from guessing how to perform pulls or merges and gives them a standardized way of handling those tasks. That way, if there is any doubt, anyone will be able to answer it.
Discuss the flow
Another form of documentation is face-to-face Q&A sessions when possible, or at least over Hangouts or any other type of live gathering, where everyone can voice their doubts.
Sometimes those doubts will highlight flaws in your plan so, on the flip side, be open to changes.
Just like they need to be open to following your lead (if you’re the one crafting the flow), you need to be open to possible oversights on your part, or even improvements you’ve missed. Be aware these things can happen, and try to review the plan with the members of your team that are more versed in Git before releasing it to everyone. If they’re OK with it, there’s a very good chance everyone else will be too.
Don’t be afraid to enforce some standards
Again, sometimes problems come from freedom of action. If the developers working with Git don’t really understand how it works but try to compensate for that by using external tools, they might end up causing more trouble than they would’ve without the tools.
So in an effort to avoid that, feel free to enforce the Git client they need to use, or the environment they need to work on, or the folder structure, or whatever you feel might simplify their tasks with regard to handling source control (I wrote an article on the kind of standards you’d benefit from implementing, in case you’re interested in knowing more about this subject).
One of my go-tos here is enforcing the use of the CLI client that comes with Git out of the box, and then listing, in the step-by-step documentation, every command they need to enter. This way, the task becomes a no-brainer for everyone (which is the ideal scenario: having your devs worry about lines of code, not lines of Git).
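As an example of what such a document could look like, here is a small sketch in Python that renders a numbered checklist from a list of steps. The steps describe a developer merging a finished feature into their team’s development branch; the feature branch name `T1/F1` and the exact choice of commands are assumptions for illustration, so replace them with whatever your own flow prescribes.

```python
# Renders a step-by-step checklist: one explanation and one command per step.
# The feature branch name "T1/F1" is a hypothetical example.

def render_steps(title, steps):
    """Return a plain-text checklist from (explanation, command) pairs."""
    lines = [title]
    for number, (why, command) in enumerate(steps, start=1):
        lines.append(f"{number}. {why}")
        lines.append(f"   $ {command}")
    return "\n".join(lines)


merge_feature = [
    ("Download the latest state of the remote", "git fetch origin"),
    ("Switch to your team's development branch", "git checkout T1Dev"),
    ("Update it, refusing anything but a fast-forward",
     "git pull --ff-only origin T1Dev"),
    ("Merge your finished feature branch", "git merge --no-ff T1/F1"),
    ("Publish the result", "git push origin T1Dev"),
]

print(render_steps("Merging a finished feature into T1Dev", merge_feature))
```

Keeping the explanation next to each command means the document doubles as training material: devs copy the command, but they also read what it does.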
Final words
That’s it for this article, thanks for reading up to this point, and remember:
- Not everyone knows enough about Git to be left alone with it
- Not everyone will admit to that
- Standard Git flows aren’t always the right choice for your team
- You should aim to have a flow that minimizes collateral damage when problems happen (and they will)
- You should also aim to train your team in the usage of Git. It might not look like it at first, but it’s an investment that will save you from missing delivery dates due to incorrectly done merges
- Finally, try to provide as much documentation on the process as you can, and be open to it being a living document, ever growing and ever changing
Thanks again for reading and, if you’d like, please leave a comment with similar stories about the problems you’ve encountered in the past due to the misuse of Git, or the different flows you used to avoid them!
Until the next one!