Alerting on flaky test escalation with Trunk webhooks

A single “this test is now flaky” alert tells you a test crossed a threshold once. It says nothing about what happens next: the same test failing on more branches, tripping more monitors, or sliding from flaky into a consistently broken regression. For the tests that matter, you want to hear about the escalation, not just the first detection. This page wires that up with Trunk webhooks and a Slack transformation. It builds on the Slack integration guide, so set that connection up first, then come back here to filter it down to escalations.

Pick the right event

The one decision that matters is which event you subscribe to. Two events fire here, at two different granularities.

| Event | Fires when | Use it to |
| --- | --- | --- |
| `v2.test_case.status_changed` | The test’s overall health status transitions between `HEALTHY`, `FLAKY`, and `BROKEN` | Alert on health escalations like `FLAKY` → `BROKEN` |
| `test_case.monitor_status_changed` | Any individual monitor activates or resolves for the test | Alert every time a monitor flags the test, even if its overall status doesn’t move |

That distinction matters. v2.test_case.status_changed only fires when the test’s combined status changes. If a test is already FLAKY and a second monitor starts flagging it, the overall status stays FLAKY, so nothing is sent. To catch a test that keeps getting flagged by more monitors over time (the “more than just the first detection” case), subscribe to test_case.monitor_status_changed instead.

A test goes HEALTHY to FLAKY when Monitor A fires, so both events send. When Monitor B fires while the test is already FLAKY, v2.test_case.status_changed sends nothing while test_case.monitor_status_changed still fires.

Test status priority is Broken > Flaky > Healthy. A test flagged by both a broken-type and a flaky-type monitor shows as BROKEN until the broken monitor resolves. See Flake Detection for how the combined status is calculated.
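The priority rule and the two event granularities can be modeled in a few lines. This is only a sketch of the behavior described above, not Trunk's implementation: `combinedStatus`, `eventsOnActivation`, and the idea of passing a list of active detection types are all assumptions made for illustration. Labeling monitors are left out of the list because they don't affect health status.

```javascript
// Sketch only: models the documented priority Broken > Flaky > Healthy.
const PRIORITY = ["HEALTHY", "FLAKY", "BROKEN"]; // lowest to highest

// activeDetectionTypes: hypothetical list with one entry ("FLAKY" or "BROKEN")
// per active health-classifying monitor on the test.
function combinedStatus(activeDetectionTypes) {
  let worst = 0; // no active monitors means HEALTHY
  for (const type of activeDetectionTypes) {
    worst = Math.max(worst, PRIORITY.indexOf(type));
  }
  return PRIORITY[worst];
}

// Lists the events that would fire when one more monitor activates.
function eventsOnActivation(activeBefore, newType) {
  const before = combinedStatus(activeBefore);
  const after = combinedStatus([...activeBefore, newType]);
  const events = ["test_case.monitor_status_changed"]; // fires for every monitor
  if (before !== after) {
    events.unshift("v2.test_case.status_changed"); // only when the combined status moves
  }
  return events;
}
```

Under this model, `eventsOnActivation([], "FLAKY")` returns both events, while `eventsOnActivation(["FLAKY"], "FLAKY")` returns only `test_case.monitor_status_changed`, which is the second-monitor case that the status event misses.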

Alert when a test becomes broken

Use this when consistently failing tests deserve a louder, separate signal than routine flakiness.

1. Configure a broken-type monitor. A test only reaches BROKEN status when a failure rate or failure count monitor with its Detection type set to Broken is active for it. Set one up if you haven’t already. A common pattern is to pair a broken-type monitor (catching consistently failing tests) with a flaky-type monitor (catching intermittent ones).
2. Filter the transformation to escalations. In your Slack endpoint’s transformation, cancel the webhook unless the status got worse. This example ranks the three statuses and only sends a message when new_status is more severe than previous_status, so recoveries and resolutions stay quiet:

// Status values are uppercase (HEALTHY, FLAKY, BROKEN), matching the payload.
const SEVERITY = { HEALTHY: 0, FLAKY: 1, BROKEN: 2 };

function handler(webhook) {
  const { previous_status = "HEALTHY", new_status = "HEALTHY" } = webhook.payload;

  // Only alert when the test got worse, not when it recovered.
  if (SEVERITY[new_status] <= SEVERITY[previous_status]) {
    webhook.cancel = true;
    return webhook;
  }

  // summarizeTestCase() is defined in the Slack integration guide.
  webhook.payload = summarizeTestCase(webhook.payload);
  return webhook;
}

To alert only when a test reaches the broken state, and stay quiet on first-time flaky detections, gate on the new status directly instead:

function handler(webhook) {
  if (webhook.payload.new_status !== "BROKEN") {
    webhook.cancel = true;
    return webhook;
  }

  // summarizeTestCase() is defined in the Slack integration guide.
  webhook.payload = summarizeTestCase(webhook.payload);
  return webhook;
}

Both snippets replace the handler function from the Slack integration guide; keep that guide’s summarizeTestCase helper in the same transformation so the message body still renders. Its previous_status → new_status line makes the escalation obvious in the channel.
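Before pasting a filter into the endpoint, it can help to run it locally against hand-built payloads. The sketch below repeats the escalation handler and adds a small harness. The `summarizeTestCase` body here is a stand-in (the real helper lives in the Slack integration guide), `wouldSend` is a hypothetical name, and the payloads carry only the two fields the filter reads.

```javascript
const SEVERITY = { HEALTHY: 0, FLAKY: 1, BROKEN: 2 };

// Stand-in for the helper from the Slack integration guide (hypothetical body).
function summarizeTestCase(payload) {
  return { text: `${payload.previous_status} -> ${payload.new_status}` };
}

// Same escalation filter as above.
function handler(webhook) {
  const { previous_status = "HEALTHY", new_status = "HEALTHY" } = webhook.payload;
  if (SEVERITY[new_status] <= SEVERITY[previous_status]) {
    webhook.cancel = true;
    return webhook;
  }
  webhook.payload = summarizeTestCase(webhook.payload);
  return webhook;
}

// Runs the handler on a minimal payload and reports whether a message would be sent.
function wouldSend(previous_status, new_status) {
  const result = handler({ payload: { previous_status, new_status } });
  return result.cancel !== true;
}

const transitions = [
  ["HEALTHY", "FLAKY"],
  ["FLAKY", "BROKEN"],
  ["BROKEN", "FLAKY"],
  ["FLAKY", "HEALTHY"],
];
for (const [from, to] of transitions) {
  console.log(`${from} -> ${to}: ${wouldSend(from, to) ? "send" : "cancel"}`);
}
```

The first two transitions are escalations and should report "send"; the last two are recoveries and should report "cancel". Swapping in the BROKEN-only handler would make HEALTHY to FLAKY report "cancel" as well.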

The quarantine trade-off

Before you reach for a broken-type monitor, know what it does to quarantine. Classifying a test as broken changes its health status, and auto-quarantine applies only to tests with a Flaky status. So when a broken-type monitor flags a test that was auto-quarantined as flaky, the test becomes BROKEN, drops out of the auto-quarantine set, and its failures start blocking CI again. That is by design, since a broken test is a real regression, not a flake to skip. It also means a broken classification is not a side-effect-free way to get an escalation alert.

Labels avoid this. A labeling monitor doesn’t change health status, so an auto-quarantined test stays quarantined while you still get the activation signal (see Alert every time a monitor flags a test below). Manually quarantined tests are unaffected either way.

See Quarantining and Flake Detection for the full composite-status behavior.

In sequence: a flaky, auto-quarantined test has CI passing. A broken-type monitor fires and reclassifies it as BROKEN. Because broken tests are not quarantine candidates, it drops out of auto-quarantine and its failures block CI again.
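The trade-off reduces to a small decision rule. The sketch below encodes only what this section states: auto-quarantine covers tests whose status is FLAKY, and manual quarantine is unaffected by status. The function name and the `manuallyQuarantined` and `autoQuarantineEnabled` fields are hypothetical, not Trunk fields.

```javascript
// Sketch of the quarantine rule described above; field names are hypothetical.
function failuresBlockCi(test) {
  // Manual quarantine holds regardless of health status.
  if (test.manuallyQuarantined) return false;

  // Auto-quarantine only covers tests whose combined status is FLAKY.
  const autoQuarantined = test.autoQuarantineEnabled && test.status === "FLAKY";
  return !autoQuarantined;
}

// The same test before and after a broken-type monitor reclassifies it.
const before = { status: "FLAKY", autoQuarantineEnabled: true, manuallyQuarantined: false };
const after = { ...before, status: "BROKEN" };
console.log(failuresBlockCi(before), failuresBlockCi(after));
```

With these inputs the flaky test's failures do not block CI, and the reclassified test's failures do. A labeling monitor would leave `status` untouched, so the first result would continue to apply.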

Alert every time a monitor flags a test

Use this when you want to know about every detection event on a test, including the ones that don’t change its overall status (a second monitor piling on, or a labeling monitor surfacing a new pattern).

1. Subscribe to test_case.monitor_status_changed. On your Slack endpoint, enable this event in addition to (or instead of) v2.test_case.status_changed.
2. Filter to monitor activations. The event fires on both activation and resolution, so cancel the webhook unless a monitor is becoming active:

function handler(webhook) {
  const { monitor } = webhook.payload;

  // Only alert when a monitor starts flagging the test.
  if (!monitor || monitor.status !== "active") {
    webhook.cancel = true;
    return webhook;
  }

  webhook.payload = {
    blocks: [
      {
        type: "header",
        text: { type: "plain_text", text: `Monitor active: ${webhook.payload.test_case.name}` },
      },
      {
        type: "section",
        text: {
          type: "mrkdwn",
          text: [
            `Monitor type: \`${monitor.type}\``,
            `Test Details: ${webhook.payload.test_case.html_url}`,
          ].join("\n"),
        },
      },
    ],
  };
  return webhook;
}

Because test_case.monitor_status_changed fires for every monitor independently, this catches a test that keeps tripping new monitors over time, even while its headline status stays FLAKY. The monitor.type field tells you which monitor fired, so you can branch on it: route labeling monitors to a triage channel and health classification monitors to your on-call channel.

To route by pattern without changing a test’s health status, set a monitor’s action to Apply labels, then branch on monitor.type in your transform to send those activations wherever they belong. See Test Labels for the full setup.
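One way to branch on monitor.type is a lookup table with a safe default. This is a sketch under assumptions: the keys in `ROUTES` are placeholders, not real monitor.type values, so replace them with the values you actually see in your payloads. `routeFor` and `shouldSendTo` are hypothetical helper names.

```javascript
// Placeholder keys: substitute the monitor.type values from your own payloads.
const ROUTES = {
  example_labeling_monitor: "triage",
  example_health_monitor: "on-call",
};
const DEFAULT_ROUTE = "triage"; // unknown types go to the quieter channel

// Picks a channel name for a monitor object from the payload.
function routeFor(monitor) {
  if (!monitor || typeof monitor.type !== "string") return DEFAULT_ROUTE;
  const known = Object.prototype.hasOwnProperty.call(ROUTES, monitor.type);
  return known ? ROUTES[monitor.type] : DEFAULT_ROUTE;
}

// For use inside one endpoint's handler: true when this endpoint's
// channel is the one the monitor should be routed to.
function shouldSendTo(channel, payload) {
  return routeFor(payload.monitor) === channel;
}
```

A setup that fits this sketch is one Slack endpoint per channel, each with a handler that sets `webhook.cancel = true` unless `shouldSendTo` returns true for its own channel. That keeps routing inside the transformation without relying on any capability beyond the cancel flag used throughout this page.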

Related

Integration for Slack. The Slack connection these transformations build on.
Webhooks. The full event catalog and field reference.
Flake Detection. How monitors classify tests as flaky or broken.
Test Labels. Apply and route labels with monitors.
