HACKER Q&A
📣 scrubs

Strategy for Choosing PRs to Combine


Env: git monorepo, about 10 million lines, with lots of activity, e.g. tens of PRs per day. Validation takes an estimated 90-120 minutes per PR.

Goal: validate PRs fast so the release manager can approve and close them.

Ask: rather than having some Jenkins job take PRs one at a time for validation, I need a strategy to pick several candidate PRs that I can combine and validate together, perhaps based on how disjoint their change sets are.
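
To make "disjointedness of change set" concrete, here is a minimal sketch of one possible selection rule: greedily pack PRs into batches so that no two PRs in a batch touch the same area of the tree. Everything here is an assumption for illustration: the function names, the idea of reducing a file path to its first `depth` directory components, and the `max_batch` cap. Note that file-level disjointness says nothing about build dependencies between areas, so this is only a first-pass heuristic.

```python
from typing import Dict, List, Set, Tuple


def top_dirs(paths: Set[str], depth: int = 2) -> Set[str]:
    """Reduce file paths to their first `depth` path components (hypothetical notion of an 'area')."""
    return {"/".join(p.split("/")[:depth]) for p in paths}


def batch_disjoint(prs: Dict[str, Set[str]], max_batch: int = 4, depth: int = 2) -> List[List[str]]:
    """Greedily group PRs so that PRs in the same batch touch disjoint areas.

    `prs` maps a PR id to the set of file paths it changes, in the order
    the PRs should be considered (e.g. oldest first).
    """
    batches: List[Tuple[List[str], Set[str]]] = []  # (PR ids, areas touched so far)
    for pr_id, files in prs.items():
        areas = top_dirs(files, depth)
        for ids, touched in batches:
            # First-fit: join the first batch with room and no overlapping area.
            if len(ids) < max_batch and touched.isdisjoint(areas):
                ids.append(pr_id)
                touched |= areas  # in-place update of the batch's area set
                break
        else:
            # No compatible batch found, so start a new one.
            batches.append(([pr_id], set(areas)))
    return [ids for ids, _ in batches]


if __name__ == "__main__":
    example = {
        "pr1": {"libs/net/a.c"},
        "pr2": {"libs/ui/b.c"},
        "pr3": {"libs/net/c.c", "tools/x.py"},
        "pr4": {"docs/readme.md"},
    }
    print(batch_disjoint(example))
```

Each resulting batch would then get a single validation run instead of one run per PR. If a batch fails, the PRs in it still have to be validated separately (or bisected) to find the culprit, so the cap on batch size trades throughput against the cost of a failed run.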

My previous job had zillions of very small repos, e.g. one per task or library. There, running PRs in order was effective and obvious. Each repo had an owner, so changes were a bit more parallel. Of course, it meant building code was somewhat harder because dependencies had to be found and combined. Dpkg helped a lot there.

Monorepo is the opposite. All the code is in one trunk. It either builds and validates with the PR or it doesn't. However, there's now more to build and validate, and it's no longer obvious which PRs should be considered for a validation run. At 90-120 minutes each, just 16 PRs already take 24 hours or more to process sequentially, so today's PRs could run into tomorrow, creating an ever-increasing backlog.

I want to focus on PR selection. Making the validation itself faster, say through Bazel, which can better skip validation steps for files that haven't changed, is something we will do regardless.


  👤 verdverm Accepted Answer ✓
You should only build the parts that changed in a PR. I manage this process for a monorepo on Jenkins. We keep each PR separate, and given the complexities we already manage, I would not advise merging PRs; it will only make things more confusing for people trying to understand why a build fails.
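
A rough sketch of what "only build the parts that changed" could look like, not a description of any particular Jenkins setup. It assumes two hypothetical inputs that a real repo would have to supply: `component_roots`, mapping a component name to its directory, and `dependents`, mapping a component to the components that depend on it. Rebuilding transitive dependents is also an assumption here; a build tool with a real dependency graph would do this more precisely.

```python
from typing import Dict, Iterable, List, Set


def affected_components(
    changed_files: Iterable[str],
    component_roots: Dict[str, str],
    dependents: Dict[str, List[str]],
) -> Set[str]:
    """Return components touched by a PR plus everything that depends on them."""
    files = list(changed_files)
    # Components whose directory contains at least one changed file.
    direct = {
        name
        for name, root in component_roots.items()
        for f in files
        if f.startswith(root.rstrip("/") + "/")
    }
    # Walk the reverse-dependency graph to pick up downstream components.
    affected: Set[str] = set()
    stack = list(direct)
    while stack:
        comp = stack.pop()
        if comp in affected:
            continue
        affected.add(comp)
        stack.extend(dependents.get(comp, ()))
    return affected


if __name__ == "__main__":
    roots = {"net": "libs/net", "ui": "libs/ui", "app": "apps/main"}
    deps = {"net": ["app"], "ui": ["app"]}
    print(sorted(affected_components(["libs/net/socket.c"], roots, deps)))
```

With a set like this per PR, each PR's validation can be limited to the components it actually affects, which is also what makes running many PRs in parallel affordable.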

Why are your builds and PRs not being processed in parallel?


👤 slaymaker1907
What we do for the repo I work on is optimistically assume everything will work together, then do a binary search when something breaks. Once the offending commit is found, a bot issues a rollback and emails the committer.
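
The binary search step can be sketched roughly as follows. This is an illustration, not the bot described above: `first_bad_commit` and `is_good` are hypothetical names, and the sketch assumes that the tree was good before the first listed commit, is broken at the last one, and stays broken once it breaks. `git bisect` automates the same idea on a real repository.

```python
from typing import Callable, List


def first_bad_commit(commits: List[str], is_good: Callable[[str], bool]) -> str:
    """Find the first commit (oldest first) at which validation fails.

    Assumes the last commit in `commits` is bad and that every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: first bad index is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid + 1  # breakage was introduced after mid
        else:
            hi = mid      # mid is bad, so the first bad commit is mid or earlier
    return commits[lo]


if __name__ == "__main__":
    history = ["a", "b", "c", "d", "e"]
    broken_from = history.index("d")
    # Stand-in for a real validation run at a given commit.
    print(first_bad_commit(history, lambda c: history.index(c) < broken_from))
```

With validation runs of 90-120 minutes, the appeal is that n landed commits need only about log2(n) runs to isolate the offender instead of n.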