The Pull Request Paradox with Yishai Beeri

therubyrep

Mandy Moore

Posted on March 23, 2022

The Pull Request Paradox with Yishai Beeri

Yishai Beeri is the CTO of Software Delivery Intelligence startup, LinearB. Yishai likes to solve problems, and that's why he is so fascinated with programming when he first encountered Logo back in the 80s, where the possibilities seemed endless.

He made it a focus of his career to solve complex programming problems, both as a consultant and entrepreneur. In 2014, he joined the CTO office of a fast-moving cloud security startup, which was later acquired by a networking giant. At this startup, he also met Ori and Dan, now co-founders of LinearB. He joined them shortly after the company was established, in order to get back to what he loves most about engineering: solving exciting challenges.

In this episode, Yishai talks with host Ben Greenberg about the paradox of pull requests.

Links:

LinearB (Twitter | LinkedIn)
Yishai Beeri: Twitter | LinkedIn
The Pull Request Paradox: Merge Faster By Promoting Your PR

Do you have ideas about how we can make our show better? Or would you like to be a guest on an upcoming episode? Reach out to our #devrel team at devrel@newrelic.com. We would LOVE to hear from you with any questions, curiosities, and/or feedback you have in hopes of making this the best show possible!

Give us a follow: @PolyglotShow

play pause Polyglot

Transcript:

Ben Greenberg: Hello. Welcome to another episode of Polyglot. I am Ben Greenberg, Lead Developer Relations Engineer at New Relic. And today, I am joined by Yishai Beeri, CTO of Software Delivery Intelligence startup, LinearB. Welcome, Yishai, to Polyglot.

Yishai Beeri: Thank you. Pleasure to be here.

Ben: So let's start off by telling our listeners a little bit about who you are. Who are you, Yishai?

Yishai: So I'm a software developer at heart. I've spent most of my career writing code and then managing projects of writing code starting in the military days back in the '90s. And then spent most of my career as one of the owners of a small consulting shop like a boutique software for hire shop, which has worked with startups and larger companies here in Israel for about 18 years. Then my last contracting gig was for a cloud security startup called Cloudlock. And after about a year there, I was lured to join Cloudlock as a full-time employee for the first time in my life.

Ben: They made you an offer you could not refuse.

Yishai: It was the right time for me. And so I joined Cloudlock, worked for the co-founder and CTO there; after a couple of years, we got acquired by Cisco. I spent another three and a half years in Cisco, led the Tel Aviv site for Cisco. And then, almost two years ago, I left and joined LinearB, which is, as you said, a Software Delivery Intelligence startup. And here I'm, the CTO, and we focus on helping engineers and development teams improve and work better.

Ben: So you have been all over the gamut around the software industry, from startups to major corporations and back again, and now you're back at a startup. And from a software developer yourself to being CTO now, you have seen really the whole spectrum of conversation around software development, I imagine.

Yishai: Yeah, a lot of similarities across all that gamut but also a lot of differences. I don't think it's all attributable to size or stage. I've seen larger companies that are more agile than smaller companies and vice versa. But yes, very wide range of behaviors, technologies, ways to manage the coding process. I'm old enough to remember days way before Git and before Subversion.

Ben: There were things before Subversion, Yishai. [laughs]

Yishai: Yeah, like CVS and stuff like that.

Ben: Yeah, true, true.

Yishai: Or just plain old code with the whole versioning.

Ben: [laughs]

Yishai: That used to be a norm.

Ben: That used to be the norm. That is true. So you've seen so many different ways of managing software teams and building and coding together. And this is a big subject; how do we successfully get past the friction or help alleviate the friction in software teams that are building code and trying to push out changes in a reasonable timeframe?

I've read some interesting things you recommended to me from LinearB. And one of them that I would love to talk about a bit with you is what was termed in an article, and we'll share that article in the show notes for listeners, what was termed in the article The Pull Request Paradox, the paradox of pull requests. And I would love it if you shared with us a bit of what that means. What is the paradox of pull requests? And what did your research in that help discover?

Yishai: Pull requests have become the de facto standard for evolving a team or a company's codebase over time to add value, to fix bugs, to improve the product. Pull requests have evolved from mainly an open-source project artifact where very distributed teams on very lax timeframes and timelines with mostly people working on their spare time, volunteers that needed a way to coordinate how they are adding and changing code to a common codebase.

It slowly became how most companies do changes to the master codebase. It's not the only way. And there are various movements for trunk-based development and PR layers or various flavors. We can touch on those a bit later. But it's the de-facto standard. And it lets the organization or the team make sure that it's very clear how changes are getting added to the codebase, and it at least allows some form of code review or peer review process to those changes.

When you look at the different people involved or the typical cycle, is me as a developer, I'm writing a cool feature or fixing a horrible bug. My goal is to get this working and pushed to the main branch, merged back to the codebase, and eventually deployed as quickly as I can. I've already done the work. Part of my definition of done is to actually get it reviewed, merged, and eventually deployed. So my motivation is there. But I need to get help and collaboration from people who are working on their own items. And I need them to review my code just like I would review theirs.

It's a peer-based process or sometimes an expert-based process. I need someone else. And my priorities are different from their priorities. So I need to think about how do I get them to help me sooner while things are still fresh in my mind? How do we as a team reduce context switches and so on?

Ben: Well, your priority is getting your work done, and their priority is getting their work done.

Yishai: Yes. And so, by definition, we're at some clash. There is a, you know, some people like to call it a buyer-seller. I need to sell my PR. Their top priority is probably not my PR. They have their work. And also, once I have given away my PR for review, I'm typically starting a new thing. It's now my key priority.

And this older PR which is waiting for someone else, maybe they'll get back to me in a couple of hours or a couple of days, now is conflicting with my new toy or my new PR. So I have this internal conflict as well. It's very tempting to start new things, but I want to get my work finished. Or the team wants this existing work or work that is advanced and is almost done to get finished.

Ben: And the longer the review process takes on that pull request, the more diminished its importance is in your head as you moved on to something new and are building on your new pull request. So if it takes two, three, four, or five days to receive a review, those are two, three, four, five days that you have now started progressing forward on some new area of work. And this thing feels older and older and less relevant for you with each day and maybe even each hour that passes.

Yishai: Right. So it's both...like, there's a tax, a mental tax, a cognitive tax in reloading that context. Oh, I need to get back to this. What was I thinking? What was I doing? So that I can really read those review comments and react in the right way. And there's also the priority shift where my mind and my heart is now in this new thing. And I need to get back to that old thing and finish it. And people don't like to finish things as much as they'd like to start or do the fun, creative part of working on something new or a new piece; at least most people like that better.

So there are multiple conflicts of priorities and multiple people needing to get into the context of something and then drop it to work on something else and then go back. Understanding those dynamics is what we headed out in our research to understand better what's actually happening in actual teams, focusing on dev teams and companies that are building software for profit, not open source or loosely coupled teams.

We want to understand what's typically happening? When are people working or not working on PRs? How is that dynamic actually happening so that we can analyze those patterns and see okay, are there opportunities for improvement? What can we learn from that behavior? So studying close to a million PRs across our very varied customer base, we began to look at what can we say about the time when developers and reviewers are interacting or working on a specific PR, on a specific code change versus the time that it actually lies idle? No one is looking at it. No one is actually thinking about it.

And we discovered that a lot of the time is idle. More than half of the PRs are idle for half of the time; that's even when you discount the nights or the weekends. You just look at work hours. Most of the time, this PR is just waiting there, and no one is even looking at it, thinking about it. It's not fresh in anyone's mind. And for PRs that take a bit longer to review, the ones that are not happening within a single day, the idle time is even higher, up to 80% and sometimes more.

So if you think about it, all that time spent idle means two things: first, things that could progress and could move into done and actually bring value that doesn't happen, and it languishes. And then the longer things wait in idle, the more expensive the cognitive tax is going to be about reloading or getting back to this whenever someone does get back to it.

Ben: Right. And there are probably other roll-on effects as well. Like, the longer something is staying idle, perhaps the more work you need to bring it up to the main branch because other changes might have been merged into the main branch since that are raising conflicts in your existing work. And the longer it sits there, perhaps the more conflicts you're going to have, which is more cognitive tax on the developer. It just keeps on compounding the problems.

Yishai: Yes. So you have a higher width because you have items that are not done, and you still own them. And the team has a higher width. Like you said, the merge risk is higher. You are not learning anything as a team from this new code. No one is actually using it. So the customer or the world is typically a moving target. If you're not deploying something and letting the customers use this, you're getting further and further away from actually hitting something that someone cares about.

Ben: That's such an articulate formulation of it as well. The longer it's staying out of the production codebase, the more time before a user, your actual most important end result, is actually going to experience the code that you wrote. And it's just sitting there dormant, actually not being put to use at all.

Yishai: Right, so no value but also no learning. Typically in agile teams, you want to learn from the...you typically have...you're deploying an initial part of something. You're never deploying a whole product in one go. And you want to learn from how it's going to use or not use. Your learning cycle is longer. Does it actually work in production, or does it explode? That's also a learning curve. If it takes me two more weeks to learn it, that's much, much longer for me to fix it and get it right.

All of these things compound when your PRs are languishing and basically idle. No one is working on them. So everyone is busy. Everyone is working on something; not saying the team is idle, but the PR is idle.

Ben: So how do you address those things, Yishai? How do you come to find a solution that resolves all these issues given the constraints of a system, a methodology of operating within the context of pull requests?

Yishai: So I think the first step is realizing, first of all, having the data and understanding both from a broad perspective what we provide as research but also specific to my team. My team's PR is what is actually happening? What are the patterns? Are we waiting for people to pick up their PR to begin reviewing it? Are we waiting for reviews that have begun but are not closing fast enough and maybe spiraling out of control or just lying in some gutter waiting to be picked up? So having the data and the visibility is the first step, understanding what is actually happening in my team or in my org.

Second thing is also to understand the reasons why the PR cycle and the cycle time for code work is important, why that is a key metric, why that is something that is worth measuring, tracking, and setting goals against. Why this is a good way to measure the effectiveness of a dev team compared to lines of code or numbers of PRs and other metrics that are probably not the right things you want to measure. Those are the beginnings. You want to understand the motivations, how it hurts me when PRs get longer, and be able to measure it.

Then when you're drilling down and looking at the reasons, like, why is a PR waiting so long? When we look at that or when we analyze the idle time for PRs, we typically see two types. There is time when the PR is handed over from someone to someone else. I created a PR. This is basically handed over to a reviewer or a set of people that may review it.

When they have finished the review, it is getting handed over back to me. So this is what you call this transition idle time. And this is very much reliant on the way we communicate. Does the next person in the chain know that this is now waiting for them? So communication, awareness that there's some work that I can do here that someone is waiting on me for help. Sometimes there is congestion.

Ben: Do you find that automation helps with some of that, letting the reviewer know that they've been asked to review something? There are many ways to automate that.

Yishai: So on the communication side, looking back three years ago, we used to swivel in our chair and holler to someone, "Hey, I have a PR for you," because they were in the same room. Then came COVID, and also the world is changing, and we have much more remote and distributed work. So the chair has become Slack. We Slack each other, or we text each other. "Hey, here's a PR. Can you look at it for me?" "What's up with my PR?" "Oh, actually, I reviewed it yesterday. Didn't you see?"

So we have manual communication in various channels, back channels that try to make up for that communication or communication gap. And automation means there's going to be reliable communication that's going to reach the right person at the right time where they typically are, which is, newsflash, not email. It's probably Slack or Teams. Shocking news, it's not email.

Ben: [laughs]

Yishai: So having reliable communication channels with built-in capabilities for snoozing, for letting the person make this something that they can be reminded about. It's another incoming task for me now, or something is waiting on me. I need to be aware of it, and I need some management tools or capabilities so that I can manage this. It's typically not just one thing, so I have multiple things coming in. Plus, I'm actually working on something, my main priority, so even elementary tools to manage those.

And what we also found is in addition to the basic communication, if you can give me more context. If in the communication you're not just giving me a link to a PR, but you're telling me more about it, I can make better decisions. And this goes back to the other type of idle time, what we called intrusion idle time, which is where I start to work on something, a PR, and then I get drawn away to I have a meeting or something else of priority has interrupted me, and then I get back to this PR later. It's still in my hands. I'm not transitioning it to someone else. But there was some intervention, and I needed to start and then drop it and then go back.

And we've looked at both types of idle time. So how can we reduce that? How can we make sure that if I'm already working on something, I can finish that part hopefully in one sitting or minimize those jumps between tasks? So context is very helpful here, if you will give me more context about this PR.

Or looking at a PR review if I'm being asked to review a PR and I get context about what's in it? What's coming up? How many files are there in the PR? How large is it? How much time am I expected to spend reviewing this? Is this a two-minute thing or an hour that needs to be spent here? That will help me make the right decision. And maybe I won't start it now if I only have 10 minutes until my next meeting.

Ben: Within the context, including perhaps estimated amount of time this will take to review the pull request, giving an actual estimate of work time is essential as well, you're saying.

Yishai: Right. And we've seen estimated reading time is a very common add-on for many blogs and articles where you tell the person here's a nice thing for you. It's interesting. It's going to take you 10 minutes to read. And it's different the way people react to that. It's different from something that's going to take you 3 minutes or 30 minutes. So we've done the same for PRs. And we are also studying the response of this queue for how people react to those messages. Like, you have a review, and it's going to take you 2 minutes versus it's going to take you 15, or 30, or 60. And we see that, yeah, this actually shapes behavior.

When PRs are expected to take longer for the review, people time it differently, and people will do them when they have time rather than jumping right now to maybe remove something small out of the way. All of these things help to optimize the types of idle time, reduce them. They're not going to go away completely. And you also do not want people to jump around just because something came in or there's a task. Context switch should be deliberate rather than by interrupt.

But adding context helps make that decision. Choosing when to notify helps make that decision. So these are all ways where automation helps to reduce the idle time and helps eventually the team focus on finishing things sooner and giving that priority over starting new things, which is a good thing, less width, all the costs we have described before, and all the benefits of shorter PRs in a shorter timeframe.

Ben: So, based on your research, let me ask you the following theoretical question and probably a very common scenario. Let's say I've been assigned to work on a new feature. And I've built the feature, and I have my unit tests, my integration tests. I've done everything I need to do. When I'm submitting a pull request to somebody to review it, and it's a complete feature, and I'm going to provide estimated time to review of, let's say, an hour, an hour, and a half. It's a big feature.

Would you recommend, based on your research, that I take that feature and break it up into smaller component parts and have multiple pull requests that all are part of the larger picture? Or is it better to have a fully comprehensive one pull request, one feature but a larger chunk of work for the reviewer? Which one tends to reduce that cognitive load, that time between that we're talking about?

Yishai: So I'll start with a simpler version of that, which is easier to answer, and then I'll go to the harder version. So the easier question is, I have a feature, but I also found a bug somewhere, and I have this nice refactor. And I've bundled them all into this one big PR complete with all unit tests and whatnot. And I want you to review it. So here, the answer is simple. Yes, break it down. Don't do things that are not necessarily related in the same PR just because you happen to be touching that area of the code, or you saw an opportunity.

Larger PRs are harder to review and harder to merge. People will shy away from reviewing them because you're going to GitHub, and you see this wall of red and green. You're scared away. I don't have time for this. I don't have the mental capacity right now. So, by all means, smaller PRs that focus on just one thing are easier to grok, easier to review, less risks to merge. They will lead to shorter cycle time and much quicker review just by being smaller in addition to all of those automations and communication helpers I just discussed.

In some cases, it doesn't make sense, or you need to be thinking about how to split this, and hopefully, you've done this. You're not splitting an existing PR, but you're starting out by doing smaller slices. Just doing smaller slices that have no context or can only be understood with other sibling PRs that may be too complex. But if it's a well-defined piece of change to the codebase that can be justified on its own and hopefully deployed on its own, definitely chunk it down.

Ben: That will help reduce the barrier to entry for reviewers. They'll look at the pull requests and maybe only see a few lines of code to change, a few lines to review as opposed to an endless sea of changes that they need to --

Yishai: Right. And the complexity of understanding a large PR, the complexity is not linear. There are multiple files that you need to follow, multiple trains of thought as you try to decompose what's happening here. We all know, we all have experience with those huge PRs that are intractable. In extreme cases, we ask people, "Hey, break this down and give it to me in separate PRs." But ideally, this happens from the onset.

You begin by slicing your work in a way that can be reviewed, merged, deployed, and not just a technical big breakdown of the PR but also slicing the value. We're all in the business of building value through a series of code changes to our codebase. Having good atomic-sized changes that can be reviewed reasonably is key here.

And what tends to happen with huge PRs is they get delayed. People don't start reviewing them. And then when they start reviewing the actual review, quality is going to be very low. Because they say, "Okay, it's intractable. I'm just going to skim it," and say, "Okay, it looks good." It's very hard to get good feedback on a huge PR. So quality will go down as well.

Ben: Yeah, that makes a lot of sense. So, where do you think that culture comes from? Where does it start from? Is it something that's part of the onboarding for new engineers into a software team, how we do PRs? Is that something that should be not implicit but made explicit? And if so, how do you do that? How do you onboard people to that culture of more sustainable, I would say, more sustainable PR?

Yishai: Like most things, it's done to people at the beginning. Culture is about people and about agreements on how to work as a team. In some cases, these are soft things that you learn as you onboard or learn through seeing others do in the team. In some other cases, you could think of these as either written or unwritten tenets of the development process, like this is how we do things. We run these kinds of tests before we merge the PR, and we do QA here.

And for some of these behaviors, you can get help, or the team can get help, again, by using tools and automations to help anything between force or shape behavior. For example, if you want your PRs to be small, why not use a tool that will let you first of all track the size of your PRs? Take goals against that size if you want to make sure that it's small or reducing. You want to change existing behavior. Let developers know when their PRs are starting to be too large when they still have a chance to change it or to break it down.

So visibility into what's going on, making sure that it's part of what we describe as a team this is how we want to behave. This is how we should be doing things, measuring ourselves against that, using goals, KPIs, or whatever means. This is a retrospect kind of motion. What has happened? Do we like it? Do we want to change it? Here's our new goal or experiment or suggested way of changing behavior. And also, because this is all in code, this is all through tools. This is a very measurable and automatable process. Maybe get some alerts and heads up on specific PRs that are beginning to look different or behaviors on PRs that are different from what we want.

If we want people to begin reviewing a PR within a day, wait no longer than one day from issuing the PR until someone starts to review it. If that is a behavior we want to encourage, let's make sure we get a reminder when PRs are getting near that threshold. These are things that are very readily automated. There are tools to do that. And you integrate that in your work. And now your dev process and preferences become codified in how you measure things, in how you take goals, in how you automate alerts, communications, automations, workflow automations, all of that.

Ben: I think that makes a lot of sense. If what we're talking about is around quantifiable metrics, then quantify it, and let's measure it. And let's set ourselves goals and benchmarks and evaluate our progress towards them. I think that makes a ton of sense to me, and it's a really good way to actually start progressing and moving forward from where we are to where we want to be.

In the time we have left, I want to go into a subject which is a hot subject in our last few minutes, which is tell me about a world where there are no pull requests. Tell me about a developer environment software shop that is just merging right to main. Is that a doable thing? Is that recommended, and if so, when, and where, and how? And if it's not, should we put up a big, flashing warning sign, don't enter this space, never do this?

Yishai: [laughs]

Ben: Yishai, help us navigate these terrains.

Yishai: Yeah, so great question. And I think we are seeing a burgeoning desire to change something with the PR process. Like I said, it has evolved from the open-source world into much more tightly coupled teams. And it has drawbacks. It is slower than just putting code directly to main or not doing code reviews. So it has a lot of benefits. It also has some drawbacks.

It's definitely possible to drop PRs completely. Some teams are doing that, typically small teams, very high skilled developers. So it's a bit easier when everyone is very skilled, complete ownership of entire areas of the codebase, and ability to maintain high quality through that. So it's doable; it's hard. And it's probably not a great fit for most companies, most teams, which have diverse levels of skills and experience with the codebase, larger teams or larger organizations which have more moving parts, and so on.

What perhaps is more viable than dropping PRs completely are two things. First of all, code review process does not have to be async. PRs today are mostly async. I put the PR up, someone else picks it up, takes a review, writes comments, I read them later, and so on. So the ability to think about should we move this to a sync review? And making smart decisions about when to let's huddle together and spend some synchronous time together on the PR. Some teams do pair coding and other methods to even preempt that.

But some PRs are better reviewed in async mode rather than asynchronous. So identifying that, knowing when is a good time to do this, finding the communication methods to actually make that happen. That is one win and one change in the PR process that applies to some PRs in some teams some of the time.

Another thing which can help to get most of the value you get from PRs like better quality, a lot of bugs get caught in the PR's phase, better design, and so on but not pay all of the taxes, all of the cost of PRs, is by saying not all code changes require a PR. You can, as a team, decide on...there are models like Ship / Show / Ask, which some changes are small enough and gentle enough that the developer is given full autonomy to just push them. He owns it completely. No need to do anything. I'm just shipping code.

Other changes require that I can ship the code, but I need to also make people aware of it. That is the show option. And then ask is for larger changes, more complex ones, and more risky ones. Let's do a PR process for those changes. So now it becomes an art. How do I choose? Who gets to choose? Again, looking at our data, 90% of the PRs are very small and very simple. The PRs are not all...their distribution is far from normal. Most of the PRs are tiny. In many cases, the review process for those PRs is an overkill.

So finding the right methods and, again, tooling and automation can help a lot here. So maybe let's decide which PRs can skip that process and do not really need a review. And we can streamline those and focus our review efforts and all those cycles in the PRs that really matter and that really need them. So I think looking forward, we'll probably find ourselves with much more teams or organizations that use a combination approach and run a full review process for some of the code changes and a much easier or smaller or no process for a lot of the small changes, tiny PRs that do not really need a peer review and all that process.

Ben: It's a very happy medium I think negotiating between the two extremes and finding that right balance so when you need the pull requests and when you don't need the pull requests and determining and figuring out for your organizational complexity, and dynamics, and your team's strengths where to put it and where to not put it. I think that is a really, really good approach.

Yishai, thank you for joining us on this episode of Polyglot. Thank you to our listeners for joining us. It has been such a pleasure. You can find us wherever you find your podcasts. And again, you can find links to all the articles we mentioned in the show notes. Yishai, thank you so much for joining us.

Yishai: Thank you. It was a pleasure.

Ben: Have a great day, everyone.

💖 💪 🙅 🚩
therubyrep
Mandy Moore

Posted on March 23, 2022

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related