So after merging the pull request, are you supposed to perform another round of regression tests? It's hard to find information about that.
[0]: https://about.gitlab.com/blog/2020/01/30/all-aboard-merge-tr...
In a nutshell, the idea is to keep track of the currently running jobs; any time a new commit enters the queue, you merge all of the in-flight commits into it and test that as a bundle. If the merge fails, bail out. If that bundle job fails, bail out. If one of the previously running commits fails, you bail out of the bundle job and run another speculative bundle job without the failed commit. When a bundle succeeds, you land all of its commits and abort any remaining redundant jobs.
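Here's a minimal Python sketch of that bookkeeping, assuming hypothetical try_merge, start_ci_job, cancel_ci_job, and land hooks into your VCS and CI; the blame logic is deliberately simplified.

    from dataclasses import dataclass

    @dataclass
    class Bundle:
        commits: list              # commits included in this speculative build
        job_id: str | None = None  # CI job currently testing the bundle

    class MergeQueue:
        def __init__(self, try_merge, start_ci_job, cancel_ci_job, land):
            self.try_merge = try_merge          # (commits) -> merged tree or None
            self.start_ci_job = start_ci_job    # (tree) -> job_id
            self.cancel_ci_job = cancel_ci_job  # (job_id) -> None
            self.land = land                    # (commits) -> None
            self.bundles = []

        def enqueue(self, commit):
            # New commit: bundle it with everything already in flight.
            in_flight = list(dict.fromkeys(c for b in self.bundles for c in b.commits))
            commits = in_flight + [commit]
            tree = self.try_merge(commits)
            if tree is None:
                return False                    # merge conflict: bail out
            self.bundles.append(Bundle(commits, self.start_ci_job(tree)))
            return True

        def on_job_result(self, job_id, passed):
            bundle = next(b for b in self.bundles if b.job_id == job_id)
            if passed:
                # Land the whole bundle and abort the now-redundant jobs.
                self.land(bundle.commits)
                for other in self.bundles:
                    if other is not bundle:
                        self.cancel_ci_job(other.job_id)
                self.bundles = []
                return
            # Simplification: blame the commit that created this bundle; real
            # systems wait for the smaller bundles ahead of it, or bisect.
            bad = bundle.commits[-1]
            self.bundles.remove(bundle)
            for other in [b for b in self.bundles if bad in b.commits]:
                self.cancel_ci_job(other.job_id)
                self.bundles.remove(other)
                remaining = [c for c in other.commits if c != bad]
                tree = self.try_merge(remaining) if remaining else None
                if tree is not None:
                    # Re-run the speculative bundle without the failed commit.
                    self.bundles.append(Bundle(remaining, self.start_ci_job(tree)))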
Such systems will sometimes have heuristics to bundle commits together in smarter ways than a naive queue (e.g. preferring to bundle commits without overlapping changes, not bundling small commits with huge ones so a big commit's slow build doesn't infect the speculative builds, using AI to come up with heuristics to detect "likely to fail" commits and preemptively starting speculative builds without them, etc.)
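As a toy example of such a heuristic (my own illustration, not from the linked post), you could refuse to extend a bundle with a commit whose files overlap the bundle's, or whose size differs too much from the rest; changed_files and loc_changed are assumed attributes on your commit objects:

    def fits_bundle(bundle_commits, candidate, size_ratio_limit=10):
        """Decide whether `candidate` should join an existing speculative bundle."""
        bundle_files, bundle_loc = set(), 0
        for c in bundle_commits:
            bundle_files |= set(c.changed_files)
            bundle_loc = max(bundle_loc, c.loc_changed)
        if bundle_files & set(candidate.changed_files):
            return False   # overlapping changes are more likely to conflict or interact
        if bundle_commits:
            ratio = max(candidate.loc_changed, bundle_loc) / max(min(candidate.loc_changed, bundle_loc), 1)
            if ratio > size_ratio_limit:
                return False   # keep one huge commit from slowing a bundle of small ones
        return True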
So all commits land sequentially, but most of the time developers don't need to rebase them themselves before pushing. The time distribution of commits sent to the queue is non-uniform: you may have a long queue during the day, but by late evening it's pretty much empty. The queue is at its longest on Fridays.
(Homepage: https://bors.tech/)
Like others have said, a "merge queue" is the solution. GitHub's has been in beta for many months now.
There are also dedicated companies doing this like https://mergify.com/
But there are several tricky aspects, like "what if I want some commit to jump to the front of the queue?", compliance, etc.
Due to the multitude of requirements, and off-the-shelf solutions being somewhat limited, the infra folks at my place are considering building a custom solution :)
https://github.blog/changelog/2021-10-27-pull-request-merge-...
It automatically tests the changes against a simulated merge onto master, all together. So it orders PR1 -> PR2 -> PR3 -> .... -> PR-100 by order of approval. If PR2 fails in that sequence, it removes PR2 and restarts the builds from PR3 -> .... -> PR-100 onwards. This behavior is even customizable (a rough sketch follows after the links below).
Video of it in action: https://zuul-ci.org/media/simulation.webm
Links: [0]: https://zuul-ci.org/
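A rough sketch of that behaviour (simplified; the real gating logic lives in Zuul itself), where start_build is a hypothetical hook that launches a CI job against a speculative merge:

    def rebuild_window(prs, start_build, base="master"):
        """Start speculative builds for an ordered window of PRs: each PR is
        tested on top of the simulated merge of every PR ahead of it."""
        builds, ahead = {}, [base]
        for pr in prs:
            builds[pr] = start_build(merge_of=ahead + [pr])
            ahead.append(pr)
        return builds

    def on_failure(prs, failed_pr, start_build):
        # Eject the failing PR and rebuild the queue without it. (Builds ahead
        # of the failure don't really need restarting; omitted for brevity.)
        remaining = [pr for pr in prs if pr != failed_pr]
        return remaining, rebuild_window(remaining, start_build)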
We run the acceptance tests for all PRs and also on the main branch after each change. So a given PR needs to be 'green', and once we merge the PR we run the tests on the integration branch too. If the integration branch gets broken somehow, then all merges stop -- we don't merge any further PRs into a 'broken' branch (a rough sketch of such a gate follows below).
We have tried the option of always rebasing PRs onto the most recent integration branch and re-running the tests, but that results in an exponential number of builds.
Of course you need a meaningful build farm -- we have around 1000 CPU cores for ~30 developers that are fully utilised during work hours.
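A sketch of the "don't merge into a broken integration branch" gate, assuming your CI reports commit statuses to GitHub; the repo name and the GITHUB_TOKEN environment variable are placeholders:

    import os
    import requests

    def main_is_green(owner="example-org", repo="example-repo", branch="main"):
        # Combined status across all checks for the branch head:
        # "success", "failure", or "pending".
        resp = requests.get(
            f"https://api.github.com/repos/{owner}/{repo}/commits/{branch}/status",
            headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["state"] == "success"

    if __name__ == "__main__":
        if not main_is_green():
            raise SystemExit("Integration branch is red: hold all merges.")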
Two things had to work or we would have lost all our sanity points (a Call of Cthulhu reference): automated checks for style, building, and testing; and small PRs that subscribe to the 'do one thing and do it well' approach.
As for the second, it's more doable than you might think, if you decompose your work well. We decomposed our work at these levels on one project: product, demo capability, epic, task; the other project used feature, epic, task. Typically PRs were at the task level, though sometimes at the epic level.
For us this had the added benefit that the developer velocity on these two projects exceeded the velocity of any of the other projects I've been on.
Some other tricks we used were to make sure we rebased our task branches from develop every morning (and, if needed, after lunch); each task branch had only one person working on it; and, where complexity warranted it, we created an Epic-### branch for that epic and treated it as a mini-develop for its tasks.
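For what it's worth, the morning routine can be a one-liner or a tiny script; a sketch using plain git via Python's subprocess, with branch names as placeholders:

    import subprocess

    def morning_rebase(base="develop"):
        # Fetch the latest develop and rebase the currently checked-out task branch onto it.
        subprocess.run(["git", "fetch", "origin"], check=True)
        subprocess.run(["git", "rebase", f"origin/{base}"], check=True)

    if __name__ == "__main__":
        morning_rebase()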
This is where merge queues come into play; they go by a variety of names (merge queue, submit queue, merge train, etc.)
I have a write up on merge queues [0] that goes into other benefits (e.g. speeding up PR builds), design considerations and their trade offs, and a comparison of various implementations.
> in those 5 minutes, 10 other commits have been made
If running the tests takes 5 minutes and 10 commits land in that window, that either means you have ten different developers working on your code, or people are committing to the main branch without running the tests.
It seems like you have too many developers working on the code for the size of the project and/or the velocity of your approval process, or you have a spaghetti project (edit: or your tasks are too small)
Removing developers from the project may decrease the time spent on merge conflicts so much that it makes you move faster.
You should definitely make sure that every PR is tested against a merge commit with current main, and not just against the code in the PR branch, which may have forked off weeks ago. If the PR has been sitting for a while, you may need to re-run the tests against current main before merging (a rough staleness check is sketched further down).
This should take care of most problems, but doesn't guarantee 100% that main is never broken.
I'd argue that isn't realistically possible; you need gates between main and whatever production is, and that code needs to be retested before deployment actions happen.
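A rough sketch of the staleness check mentioned above, using plain git; the branch names are illustrative, and how you trigger the re-test is up to your CI:

    import subprocess

    def git(*args):
        return subprocess.run(["git", *args], check=True,
                              capture_output=True, text=True).stdout.strip()

    def commits_behind_main(pr_branch, main="origin/main"):
        git("fetch", "origin")
        merge_base = git("merge-base", main, pr_branch)
        # Commits on main that the PR's last test run never saw.
        return int(git("rev-list", "--count", f"{merge_base}..{main}"))

    if __name__ == "__main__":
        behind = commits_behind_main("feature/some-branch")
        if behind > 0:
            print(f"PR is {behind} commits behind main; re-run tests against a "
                  f"merge with current main before merging.")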
When you do this, you should then look at how often main actually breaks and what the root cause is (or what the accident chain is, for people who deeply hate that term). If it really is breaking a lot due to races between orthogonal commits touching the same sensitive locations (somehow without merge conflicts), I'd argue the correct course of action might be to refactor the code rather than the pipeline, and worrying about the pipeline is the last thing to do.
And your pipeline probably should break and hold things up from time to time; that probably costs less than designing a complex solution that tries to be perfect for the sake of being perfect.
Of course, for the FAANG-scale readers it is probably worth it, but most devs aren't actually at FAANG scale, and there'll be some fuzzy line in the middle where it really starts to matter.
But if something breaks once in a blue moon that doesn't necessarily mean you should always fix it, as long as you can always contain the damage. So how often is this really happening that you think you need to fix it?
If your change does break after merging, it can usually either be quickly fixed with a minor code change or simply reverted immediately.
Rather than look for a technical solution, I would explore process and culture changes, such as having the various devs making all the different commits start pairing.
Our test suite takes about 30m to finish.