Batching
What it is
Batching allows Trunk Merge Queue to test multiple pull requests together as a single unit, rather than testing them one at a time.
When batching is enabled, Trunk intelligently groups compatible PRs and runs your test suite once for the entire batch. If the batch passes, all PRs in the batch merge together, dramatically reducing total test time.
Why use it
Reduce total test time by 60-80% - Instead of running your full test suite 10 times for 10 PRs, you run it 2-3 times for the same PRs grouped into batches. More PRs merged with less CI time.
Increase merge throughput - Process 3-5x more PRs per hour compared to testing individually. A queue that handled 20 PRs/hour can now handle 60-100 PRs/hour with batching.
Lower CI costs - Fewer test runs means lower CI/CD infrastructure costs. Teams report 50-70% reduction in CI minutes consumed by merge queue testing.
Faster time-to-production - PRs spend less time waiting in queue. What used to take hours can now take minutes, getting features and fixes to production faster.
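To make the arithmetic in the first point above concrete, here is a back-of-the-envelope sketch. The PR count, batch size, and suite duration are illustrative inputs rather than measurements, and it assumes every batch passes (failed batches add bisection runs on top).

```python
import math

prs = 10               # PRs waiting to merge
batch_size = 5         # target batch size
suite_minutes = 30     # duration of one full test-suite run

runs_individually = prs                        # one suite run per PR
runs_batched = math.ceil(prs / batch_size)     # one suite run per batch, if batches pass

print(runs_individually * suite_minutes)       # 300 CI minutes without batching
print(runs_batched * suite_minutes)            # 60 CI minutes with clean batches
```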
How to enable
Batching is disabled by default and must be explicitly enabled.
Enable it in the Merge Settings of your repo: go to Settings > Repositories > your repository > Merge Queue > Batching and toggle Batching on.
Configuration options
With Batching enabled, you can configure two options:
Maximum wait time - The maximum amount of time the Merge Queue should wait to fill the target batch size before beginning testing. A higher maximum wait time will cause the Time-In-Queue metric to increase but have the net effect of reducing CI costs per pull request.
Target batch size - The largest number of entries in the queue that will be tested in a single batch. A larger target batch size helps reduce CI cost per pull request but requires more work when batch failures necessitate bisection.
A good place to start is with the defaults: Maximum wait time set to 0 (zero) and Target batch size set to 5.
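Conceptually, the two settings interact as in the sketch below: a batch is dispatched for testing as soon as it reaches the target size, or once the oldest queued entry has waited longer than the maximum wait time. This is a simplified model for intuition, not Trunk's actual scheduling code; the function name and data shape are invented for illustration.

```python
import time

TARGET_BATCH_SIZE = 5     # default suggested above
MAX_WAIT_SECONDS = 0      # default: start testing immediately

def should_dispatch_batch(queued_entries, now=None):
    """queued_entries: list of (pr, enqueue_time) pairs, oldest first."""
    if not queued_entries:
        return False
    now = time.time() if now is None else now
    oldest_wait = now - queued_entries[0][1]
    # dispatch when the batch is full, or when waiting any longer would
    # exceed the configured maximum wait time
    return (len(queued_entries) >= TARGET_BATCH_SIZE
            or oldest_wait >= MAX_WAIT_SECONDS)
```

With the defaults (maximum wait time of 0), a batch starts testing as soon as anything is queued; raising the wait time trades queue latency for fuller batches.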
Bisection Testing Concurrency
When a batch fails, Trunk automatically splits it apart (bisects) to identify which PR caused the failure. You can configure a separate, higher concurrency limit specifically for these bisection tests to isolate failures faster without impacting your main queue.

Why Separate Bisection Concurrency?
By default, bisection tests use the same concurrency limit as your main queue. This means:
Bisection can slow down other PRs waiting to merge
Developers wait longer to learn which PR broke the batch
Your main queue's throughput decreases during failure investigation
With independent bisection concurrency, you can:
Speed up failure isolation - Run bisection tests at higher concurrency to identify problems faster
Maintain queue throughput - Keep your main queue running at optimal capacity during bisection
Optimize each workflow independently - Be aggressive about isolating failures without impacting successful PR flow
How It Works
When you set a higher bisection concurrency:
Main queue concurrency controls how many PRs test simultaneously in the normal queue
Bisection concurrency controls how many PRs test simultaneously during failure isolation
Both run independently - bisection tests don't count against your main queue limit
Example scenario:
Main queue concurrency: 5
Bisection concurrency: 15
Batch ABCD fails and needs to be split
The bisection process can spin up 15 test runners to quickly isolate which PR failed, while your main queue continues processing 5 PRs normally. Developers get faster feedback about failures without slowing down successful merges.
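The separation can be pictured as two independent pools of test slots, as in this sketch. The semaphore counts mirror the example above; `run_ci` is a placeholder for kicking off your CI job, not a Trunk API.

```python
import asyncio

MAIN_QUEUE_CONCURRENCY = 5      # normal queue testing
BISECTION_CONCURRENCY = 15      # failure isolation only

main_slots = asyncio.Semaphore(MAIN_QUEUE_CONCURRENCY)
bisection_slots = asyncio.Semaphore(BISECTION_CONCURRENCY)

async def test_in_main_queue(pr, run_ci):
    async with main_slots:          # bisection jobs never consume these slots
        return await run_ci(pr)

async def test_during_bisection(combo, run_ci):
    async with bisection_slots:     # and main-queue jobs never consume these
        return await run_ci(combo)
```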
Configuring Bisection Concurrency
Navigate to Settings > Repositories > your repository > Merge Queue > Batching:
Enable Batching (if not already enabled)
Find the Bisection Testing Concurrency setting
Set a value higher than your main Testing Concurrency for faster failure isolation
Monitor your CI resource usage and adjust as needed
Recommended Settings
Main queue concurrency 5, bisection concurrency 10 - good for teams managing CI costs carefully
Main queue concurrency 10, bisection concurrency 25 - good for teams with moderate CI capacity
Main queue concurrency 25, bisection concurrency 50 - good for teams prioritizing fast feedback over CI costs
When to Use Higher Bisection Concurrency
Consider increasing bisection concurrency if:
Developers frequently wait for bisection results to know what to fix
Your CI system has spare capacity during failure investigation
Large batches fail and take a long time to isolate the culprit
Fast feedback on failures is critical to your workflow
Monitoring and Optimization
Track these metrics to optimize your bisection concurrency:
Time to isolate failures - How long it takes to identify which PR broke a batch
CI resource usage during bisection - Are you maxing out your runners?
Developer wait time - How long developers wait for failure feedback
Main queue throughput during bisection - Is bisection slowing down other PRs?
Start with bisection concurrency 2x your main queue concurrency, monitor the impact, and adjust based on your team's priorities and CI capacity.
Best Practices
✅ Set bisection concurrency higher than main queue - This is the whole point of the feature
✅ Monitor CI costs - Higher bisection concurrency means more runners during failures
✅ Start conservative - Begin with 2x main concurrency and increase gradually
✅ Combine with other optimizations - Works best alongside Pending Failure Depth and Anti-flake Protection
❌ Don't set too high - Extremely high bisection concurrency can overwhelm CI systems
❌ Don't set lower than main queue - This defeats the purpose and slows down bisection
Test Caching During Bisection
When a batch fails and Trunk splits it apart to identify the failing PR, the merge queue intelligently reuses test results it has already collected during the bisection process. This avoids redundant CI runs and speeds up failure isolation.
How It Works
During bisection, Trunk maintains a cache of test results as it progressively splits the failed batch. If the queue knows with certainty that a particular combination of PRs will fail (because it already tested that exact combination earlier in the bisection process), it skips running the test again and reuses the previous result.
Example bisection with test caching
Batch ABCD fails testing (main ← ABCD). Trunk splits the batch into AB and CD.
Trunk tests AB (passes) and CD (fails). Now Trunk needs to split CD further: C and D.
Before testing, Trunk checks: "Have I already tested C or D individually?"
If main ← ABCD failed and main ← AB passed, Trunk knows CD contains the failure.
When testing main ← AB ← C, if this combination was already tested earlier, reuse that result.
Skip redundant CI runs and identify the failing PR faster.
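A minimal sketch of combination-level caching during bisection, assuming deterministic tests. `run_ci` and the dict cache are illustrative stand-ins, not Trunk internals; the real queue also builds each combination on top of the projected state of main.

```python
def bisect_with_cache(batch, run_ci):
    """Recursively split a failed batch, never testing the same combination twice."""
    cache = {tuple(batch): False}           # the full batch is already known to fail

    def result(combo):
        key = tuple(combo)
        if key not in cache:                # only hit CI for unseen combinations
            cache[key] = run_ci(key)
        return cache[key]

    def isolate(combo):
        if len(combo) == 1:
            return [combo]                  # narrowed down to a single failing PR
        mid = len(combo) // 2
        halves = [combo[:mid], combo[mid:]]
        failing = [half for half in halves if not result(half)]
        if not failing:                     # both halves pass alone: the failure is
            return [combo]                  # an interaction between them
        return [c for half in failing for c in isolate(half)]

    return isolate(tuple(batch))

# Example: D is the culprit in a failed batch ABCD
culprit = {"D"}
print(bisect_with_cache("ABCD", lambda combo: not (set(combo) & culprit)))  # [('D',)]
```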
Benefits
Faster failure isolation: Skip tests you've already run during bisection, reducing time to identify the culprit PR
Significant CI cost savings: Especially important for large batches or expensive test suites where redundant tests would waste substantial resources
Quicker developer feedback: Developers learn which PR broke the batch sooner, allowing them to fix issues faster
Automatic optimization: No configuration required - the merge queue automatically detects and reuses applicable test results
When Test Caching Applies
Test caching only applies during the bisection process when:
Batching is enabled - This is a batching-specific optimization
A batch has failed and is being split to identify the failure
The merge queue has already tested a specific combination of PRs during the current bisection
The test result is definitive - The queue has high confidence the result would be the same
Test caching does not apply to:
Initial batch testing (before any failures)
PRs in the main queue that aren't undergoing bisection
Tests that haven't been run yet in the current bisection process
Example Scenario
Without test caching:
Batch ABCDEF (6 PRs) fails.
First bisection: Test ABC and DEF (2 CI runs). DEF fails, need to split further.
Second bisection: Test DE and F (2 CI runs). DE fails, need to split further.
Third bisection: Test D and E (2 CI runs).
Total: 6 CI runs to isolate the failure.
With test caching:
Batch ABCDEF fails - we know the ABCDEF combination fails.
First bisection: Test ABC (passes) and identify that DEF fails (no new test needed - we know from the original batch).
Second bisection: Test DE - if we've already tested this combination, reuse the result.
Third bisection: Test D or E - reuse any already-known results.
Total: 2-4 CI runs instead of 6.
The exact savings depend on your batch size, bisection pattern, and which combinations have already been tested.
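The run counts in the walkthrough can be reproduced with a toy simulation like the one below. It assumes deterministic tests and a single culprit PR (here `E`); `run_ci` is a fake stand-in that fails whenever the culprit is in the combination.

```python
def run_ci(combo, culprit="E"):
    return culprit not in combo              # pretend CI: passes unless the culprit is included

def isolate(batch, reuse_results):
    runs, failing = 0, batch                 # the whole batch is already known to fail
    while len(failing) > 1:
        mid = len(failing) // 2
        left, right = failing[:mid], failing[mid:]
        runs += 1
        left_passed = run_ci(left)
        if reuse_results:
            # the parent failed, so a passing left half implies the right half fails
            failing = right if left_passed else left
        else:
            runs += 1
            run_ci(right)                    # the redundant run that caching skips
            failing = right if left_passed else left
    return failing, runs

print(isolate("ABCDEF", reuse_results=False))  # ('E', 6)
print(isolate("ABCDEF", reuse_results=True))   # ('E', 3)
```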
Best Practices
✅ Use with larger batch sizes - More PRs in a batch means more opportunities to cache results
✅ Combine with bisection concurrency - Fast bisection + test caching = maximum efficiency
✅ Enable batching - This feature only works when batching is enabled
✅ Monitor your metrics - Track CI spend and bisection time to see the impact
❌ Don't try to configure it - Test caching is automatic and always enabled when batching is enabled
❌ Don't rely on it for flaky tests - Caching assumes consistent test behavior; flaky tests may bypass caching for safety
How This Works with Other Features
Test caching complements other batching optimizations:
Bisection Testing Concurrency - Run bisection tests faster AND skip redundant ones
Pending Failure Depth - Keep more PRs in queue during failure recovery
Optimistic Merging - Merge successful batches while bisection runs in background
Together, these features create a highly efficient batch failure recovery system that minimizes both time and CI cost.
Note: Test caching for batch failure isolation is automatically enabled for all repositories using batching mode. No configuration is required.
Fine tuning batch sizes
Signs your batch size is too large:
Batches frequently fail and need to be split
Long wait times to form full batches
Test suite times out or becomes unstable
Signs your batch size is too small:
Not seeing significant throughput improvement
Batches form immediately (could handle more PRs)
Still consuming lots of CI resources
Optimal batch size depends on:
Test suite speed (faster tests = larger batches)
Test stability (more flaky tests = smaller batches)
PR submission rate (more PRs = larger batches)
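If you want a starting point rather than pure trial and error, a rough heuristic along these lines can translate the factors above into an initial target batch size. The formula and thresholds are invented for illustration and are not Trunk recommendations; measure and adjust from there.

```python
def suggested_batch_size(flake_rate, prs_per_hour, suite_minutes):
    # batch roughly the number of PRs that arrive while one suite run is in flight
    size = round(prs_per_hour * suite_minutes / 60)
    if flake_rate > 0.05:            # flaky suites: keep batches small so splits stay cheap
        size = min(size, 3)
    return max(2, min(size, 10))     # clamp to a sane starting range

print(suggested_batch_size(flake_rate=0.01, prs_per_hour=12, suite_minutes=30))  # 6
```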
Tradeoffs and considerations
The downsides here are very limited. Since batching tests multiple pull requests together, you give up the proof that each individual pull request, in complete isolation, can safely be merged into your protected branch.
In the unlikely case that you have to revert a change from your protected branch or do a rollback, you will need to retest that revert or submit it to the queue to ensure nothing has broken. In practice, this re-testing is required regardless of how the change was originally merged, so the practical cost is small.
Common misconceptions
Misconception: "Batching merges multiple PRs into a single commit"
Reality: No! Each PR is still merged as a separate commit. Batching only affects testing, not merging.
Misconception: "If a batch fails, all PRs in the batch fail"
Reality: Trunk automatically splits the batch and retests to identify only the failing PR(s). Passing PRs still merge.
Misconception: "Batching always makes the queue faster"
Reality: Batching is most effective with stable tests and high PR volume. For low-traffic repos or flaky tests, the overhead may outweigh benefits.
Related features
Batching works exceptionally well with these optimizations:
Predictive testing - Batching builds on predictive testing. Batches are tested against the projected future state of main, just like individual PRs. These features complement each other perfectly.
Optimistic merging - While a batch is testing, the next batch can begin forming and testing optimistically. Combining batching with optimistic merging provides maximum throughput. Configure both for best results.
Pending failure depth - When a batch fails and is being split/retested, pending failure depth controls how many other PRs can test simultaneously. Higher pending failure depth helps maintain throughput during batch failures.
Anti-flake protection - Essential companion to batching. Reduces false batch failures caused by flaky tests, making batching more reliable and efficient.
Batching + Optimistic Merging and Pending Failure Depth
Enabling batching along with Pending Failure Depth and Optimistic Merging can help you realize the major cost savings of batching while still reaping the anti-flake protection of optimistic merging and pending failure depth.
Enqueue A, B, C, D, E, F, G
main <- ABC <- DEF +abc
Batch ABC fails
main <- ABC
pending failure depth keeps ABC from being evicted while DEF continues testing
main <- ABC (hold) <- DEF+abc
DEF passes
main <- ABC <- DEF+abc
optimistic merging allows ABC and DEF to merge
merge ABC, DEF
Combined, Pending Failure Depth, Optimistic Merging, and Batching can greatly improve your CI performance, because the Merge Queue can optimistically merge whole batches of PRs with far less wasted testing.
Next steps
Start with batching:
Enable batching with conservative settings (batch size: 3-5)
Monitor for a few days and observe behavior
Gradually increase batch size as you gain confidence
Check Metrics and monitoring to measure impact
Optimize further:
Optimistic merging - Combine with batching for maximum throughput
Anti-flake protection - Reduce false batch failures
Pending failure depth - Tune behavior during batch failures
Monitor performance:
Metrics and monitoring - Track throughput improvements and CI cost savings
Watch batch failure rate (should be <10%)
Measure time-to-merge improvements
Troubleshoot issues:
If batches fail frequently → Lower batch size or enable Anti-flake protection
If not seeing improvements → Check PR volume and test stability
For detailed help → Troubleshooting