How are pull requests integrated in a repo with high commit frequency?

Question

So let's assume we open a pull request, merged current main into our feature branch and are running tests. The full acceptance test suite takes 5 minutes to complete. Meanwhile, in those 5 minutes, 10 other commits have been made. If we merge now, we can't be sure that we won't break stuff right?So after merging the pull request, are you supposed to perform another set of regression tests? Hard to find info about that

Revell · Accepted Answer

Sounds like you want to implement something like merge trains?[0][0]: https://about.gitlab.com/blog/2020/01/30/all-aboard-merge-tr...

lhorie · Answer

The technology you're looking for goes by many names, "submit queue", "merge queue", "merge train", etc.
In a nutshell, the idea is to keep track of currently running jobs, then any time a new commit enters the queue, you merge all the running commits into that and test that as a bundle. If merging fails, bail out. If that bundle job fails, bail out. If one of the previously running commits fail, you bail out of the bundle job and run another speculative bundle job without the failed commit. When a bundle succeeds, land all of its commits and abort any remaining redundant jobs.
Such systems will sometimes have heuristics to bundle commits together in some smarter way than a naive queue (e.g. preferring to bundle of commits without overlapping changes, or not bundling small commits with huge ones to prevent infecting speculative builds with the slowness of the big commit, using AI to come up with heuristics to detect "likely to fail" commits and preemptively starting speculative builds without them, etc)

Cerium · Answer

I work on a project with a 3 hour CI process that handles about 15 to 40 PRs a day. The problem you describe here is a real challenge. Our PRs don't use a merge button. Once CI passes (your changes built on latest develop), then you can click a box that says this PR should merge. An automated process scoops up all those PRs each night and builds them together. The system test team tests that build in the morning. If there is a problem, that is debugged (we build build some bifurcation builds ahead of time to make this faster). Then the batch is merged to develop. This process has worked very well for us. Develop keeps a good pace but is only very rarely broken.

yakubin · Answer

At dayjob we have a "commit queue" server. When a patch is accepted on Gerrit, there is a button to send it to the queue. The queue cherry-picks the commit onto the current master branch, makes several builds (different configs) and runs tests. If everything is ok, it pushes the commit to master and marks the patch on Gerrit as merged. If any of the builds or tests fail, it doesn't push the change and instead makes a comment on Gerrit with a link to the broken build/test.So all commits are made sequentially, but most of the time developers don't need to rebase them themselves before pushing. The time distribution of commits sent to the queue is non-uniform, so you may have a long queue during the day, but by late evening it's pretty much empty. It has the greatest number of commits on Fridays.

apetresc · Answer

Check out something like Bors, which implements a paradigm that fixes this class of issue: https://github.com/bors-ng/bors-ng(Homepage: https://bors.tech/)

jakub_g · Answer

This is a real problem in my current place (huge monorepo, 100+ devs, 100+ merges per day).
Like others said, "merge queue" is the solution. GitHub's one has been in beta for many months now.
There are also dedicated companies doing this like https://mergify.com/
But there's several tricky aspects like "what if I want some commit to jump in front of queue?", compliance etc.
Due to the multitude of requirements and off-the-shelf solutions being somehow limited, infra folks in my place are considering building a custom solution :)

scottmsul · Answer

Wait, is it 10 other commits, or 10 other pull requests? For medium-sized projects (5-10 devs) it's generally good practice for devs to make a separate branch per feature, make lots of commits on the side branch, then merge the side branch into master in a single PR once the feature is ready.

msoad · Answer

Github is working on this for a while now. It's still in beta :(https://github.blog/changelog/2021-10-27-pull-request-merge-...

msk20 · Answer

There is ZUUL Gating[0] CI it is actually the perfect solution for this. It works with Github or Git based repository system.
It automatically tests the changes with a simulated merge on master together. So it orders PR1 -> PR2 -> PR3 -> .... -> PR-100 by order of approval. If PR1 -> PR2 (Fails) -> PR3 -> .... -> PR-100
It restarts -> PR3 -> .... -> PR-100 and Up after removing PR2. This behavior is even customizable.
Video of it in action: https://zuul-ci.org/media/simulation.webm
Links: [0]:https://zuul-ci.org/

sz4kerto · Answer

Our full acceptance test run in 3 hrs (on build servers with 64 cores and 3xx GB memory) and we have PRs coming up every hour or so.
We run the acceptance tests for all PRs and also the main branch after each change. So a given PR needs to be 'green', and if we merge the PR then we run the tests on the integration branch too. If the integration branchs gets broken somehow, then all merges stop -- we don't merge any further PRs into a 'broken' branch.
We have tried the option of always rebasing PRs on the most recent integration branch and re-running the tests, but that results in exponential number of builds.
Of course you need a meaningful build farm -- we have around 1000 CPU cores for ~30 developers that are fully utilised during work hours.

Communitivity · Answer

I've worked on two projects in the last six years that used PRs and had a high commit frequency.
Two things had to work or we would have lost all our sanity points (reference to Call of Chthulhu): automated checks of style, building, testing; and small PRs subscribing to the 'do one thing and do it well' approach.
As for the second, it's more doable than you might think - if you decomposed your work well. We decomposed our work at these levels on one project: product, demo capability, epic, task; the other project used feature, epic, task. Typically PRs were at the task level, though sometimes at PR level.
For us this had the added benefit that the developer velocity on these two projects exceeded the velocity of any of the other projects I've been on.
Some other tricks we used were to make sure we rebased our task branches from develop every morning (and if needed after lunch); each task branch had 1 person only working on it; where complexity warranted we created an Epic-### branch for that epic and treated it as a mini-develop for tasks on that epic.

slaymaker1907 · Answer

What we do at my job is just merge it anyway, but when failures happen, we work backwards to find faulting commit(s). This is also necessary because certain tests take too long to block PRs for. For people creating new branches, we also keep track of the last known good commit since this almost always lags compared to the latest commit.

jeremy_k · Answer

It is probably a huge leap to think about using Bazel, but I found Uber's blog post about how they manage this issue after setting up a monorepo a really good read - https://eng.uber.com/go-monorepo-bazel/

epage · Answer

While tools like Github check for patch conflicts, they don't check for semantic conflicts (e.g. one PR adds a new use of a function another PR removes).
This is where merge queues come into play which have a variety of names (merge queue, submit queue, merge train, etc)
I have a write up on merge queues [0] that goes into other benefits (e.g. speeding up PR builds), design considerations and their trade offs, and a comparison of various implementations.
[0] https://epage.github.io/dev/submit-queue/

Someone · Answer

If you want to be sure, you do run a regression test, yes.
> in those 5 minutes, 10 other commits have been made
If running tests take 5 minutes, that either means you have ten different developers working on your code, or people committing to the main branch without running tests.
Seems like you have too many developers working on the code for the size of the project and/or velocity of your approval process, or a spaghetti project (edit: or your tasks are too small)
Removing developers from the project may decrease the time spent on merge conflicts so much that it makes you move faster.

lamontcg · Answer

Do you really need to solve this problem?
You should definitely make sure that every PR is being tested against a merge commit to current main and not just tested against the code in the PR that may have forked off weeks ago. If the PR has been sitting you may need to recycle the tests to test against current main before merging.
This should take care of most problems, but doesn't guarantee 100% that main is never broken.
I'd argue that isn't likely to be possible, and you need gates in between main and whatever production is and that code needs to be retested before deployment actions happen.
When you do this you should then address how often main actually breaks and what the root cause (or what the accident chain is for people who deeply hate that term). If it is really breaking a lot due to races in different orthogonal commits touching the same sensitive locations (somehow without merge conflicts), I'd argue the correct course of action might be to refactor the code rather than the pipeline, and worrying about the pipeline is the last thing.
And your pipeline probably should break and hold up the tests from time to time, that probably costs less than designing a complex solution which tries to be perfect for the sake of being perfect.
Of course for the FAANG-scale readers it is probably worth it, but most devs aren't actually FAANG-scale, and there'll be some fuzzy line in the middle where it really starts to matter.
But if something breaks once in a blue moon that doesn't necessarily mean you should always fix it, as long as you can always contain the damage. So how often is this really happening that you think you need to fix it?

sandspit · Answer

There are two great saas solutions for this - mergify.io and mergequeue.com.

frozenport · Answer

https://github.com/smarkets/marge-bot

__alexs · Answer

Such a code base is likely to be large and the changes distributed somewhat uniformly across it. This means that conflicts are actually not super likely and also that your assumptions are not highly likely to get invalidated by the commits that happen in between.If your change does break after merging, it can usually either be quickly fixed with a minor code change or simply reverted immediately.

efxhoy · Answer

We run tests on the master branch before we deploy. If tests fail there we find the offending PR and revert it.

andersco · Answer

Is it possible that the solution is to look at why there is such a high frequency of commits by different devs into the same main branch?Rather than look for a technical solution I would explore process and culture changes, such as having the various devs making all the different commits start pairing.

avgcorrection · Answer

Relevant: "not rocket science" (the story of monotone and bors)[1][1]: https://graydon2.dreamwidth.org/1597.html

Aeolun · Answer

We don&rsquo;t try to deal with that issue. If something breaks because of the merges, it&rsquo;ll be the next persons problem (when they make a new feature branch off of master and have a few failing tests).Our test suite takes about 30m to finish.

jFriedensreich · Answer

we rebase all branches on top of the target branch and only allow fast forward merges, which prevents any untested merge commits to cause issues and is really good for reading history. if there are changes to the target branch since the last rebase, there needs to be another rebase triggering automatic CI/CD. if there is a known bottleneck where this would happen a lot to a number of PRs and all the prs are green we rebase the prs on each other in the desired merge order just have to wait for the youngest PR to run through ci/cd and then can fast forward merge all at once.

dmurray · Answer

You run the tests on the same snapshot of the code you are about to deploy. That might be 10 minutes or 2 hours or a month behind your trunk branch that got the 10 new commits.

hnrodey · Answer

Run the tests before allowing the pull request to complete?

mountainriver · Answer

Kubernetes does this with a tool called Prow, but it&rsquo;s roughly what others have mentioned around merge trains

drewcoo · Answer

What are your "regression tests" and why are you only worried about those?

readme · Answer

i would perform the tests before merging - rebase the pull request branch on top of master and run the tests, then merge using the --no-ff flag to ensure a merge commit is still entered

frays · Answer

Great thread, thanks.

How are pull requests integrated in a repo with high commit frequency?

Sounds like you want to implement something like merge trains?[0]
[0]: https://about.gitlab.com/blog/2020/01/30/all-aboard-merge-tr...

Check out something like Bors, which implements a paradigm that fixes this class of issue: https://github.com/bors-ng/bors-ng
(Homepage: https://bors.tech/)