An incident is what Tallwatch opens when a monitor’s regions agree it’s down. It’s the unit your alerts, your status page, and your on-call rotation all hang off. This page covers its lifecycle and the cases where Tallwatch deliberately stays quiet.

Lifecycle

An incident moves through three states:
  1. Open. Tallwatch decides the monitor is down once more than one region agrees, and opens the incident. The alert goes to the channels on the monitor’s escalation policy. A monitor has at most one open incident at a time.
  2. Acknowledged. Someone acknowledges the incident from its detail page, signalling that a human is on it. Acknowledging is for coordination; it doesn’t change whether the monitor is up or down.
  3. Resolved. The incident closes, either automatically or by hand (see below). The notifier sends the resolved alert.

Every transition is recorded as an event on the incident, so its detail page reads as a timeline: opened, who acknowledged, which channels were notified, when it resolved.
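
To make the lifecycle concrete, here is a minimal sketch of the three states and the event timeline as a small state machine. It is an illustration only, not Tallwatch's implementation: the `Incident` class, the `transition` method, and the event tuple layout are hypothetical names, and the sketch assumes an open incident may resolve without ever being acknowledged.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class State(Enum):
    OPEN = "open"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


# Allowed transitions (assumption: an open incident can resolve
# directly, without passing through "acknowledged").
ALLOWED = {
    State.OPEN: {State.ACKNOWLEDGED, State.RESOLVED},
    State.ACKNOWLEDGED: {State.RESOLVED},
    State.RESOLVED: set(),
}


@dataclass
class Incident:
    """Hypothetical model of an incident and its timeline."""
    monitor_id: str
    state: State = State.OPEN
    events: list = field(default_factory=list)

    def __post_init__(self):
        # Opening the incident is itself the first timeline event.
        self._record("opened", actor="consensus")

    def _record(self, kind, actor):
        # Each event is a (timestamp, kind, actor) tuple.
        self.events.append((datetime.now(timezone.utc), kind, actor))

    def transition(self, new_state, actor):
        # Reject moves the lifecycle does not allow, e.g. reopening
        # a resolved incident.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self._record(new_state.value, actor)
```

A typical sequence would be `Incident("api")`, then `transition(State.ACKNOWLEDGED, actor="dana")`, then `transition(State.RESOLVED, actor="consensus")`, which leaves three events on the timeline in that order.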

Automated vs manual resolution

Automated. Consensus is the source of truth for a monitor’s state. Once enough regions agree the check is healthy again, the incident resolves on its own and the resolved alert fires. No timer to configure; recovery is just consensus running in reverse.

Manual. If you’ve fixed the problem and don’t want to wait for the next checks to confirm, resolve the incident from its page. To stop consensus from immediately reopening it on the same batch of still-failing probes, a manual resolve sets a five-minute cooldown on the monitor. During that window consensus won’t open a fresh incident, which gives recovery time to show up in the data.

Acknowledging records the event on the incident, but acknowledgement alerts to channels are not guaranteed to be delivered in this release. Treat acknowledge as a coordination signal inside Tallwatch, not as something that pages a separate channel. Resolving, by contrast, always sends a resolved alert.
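
The interaction between consensus and the manual-resolve cooldown can be sketched as follows. This is a simplified model under stated assumptions, not Tallwatch's code: the `Monitor` class, its method names, and the returned action strings are hypothetical, and the behavior at the exact instant the cooldown ends is a guess. Only the five-minute duration comes from the text above.

```python
from datetime import datetime, timedelta, timezone

# From the text: a manual resolve sets a five-minute cooldown.
COOLDOWN = timedelta(minutes=5)


class Monitor:
    """Hypothetical monitor state driven by consensus verdicts."""

    def __init__(self):
        self.incident_open = False
        self.cooldown_until = None

    def manual_resolve(self, now):
        # Close the incident by hand and start the cooldown window.
        self.incident_open = False
        self.cooldown_until = now + COOLDOWN

    def on_consensus(self, is_down, now):
        """Apply one consensus verdict and return the action taken."""
        if is_down:
            if self.incident_open:
                return "noop"  # at most one open incident per monitor
            if self.cooldown_until is not None and now < self.cooldown_until:
                return "held"  # inside the cooldown: do not reopen
            self.incident_open = True
            return "opened"
        if self.incident_open:
            self.incident_open = False
            return "resolved"  # automated resolution
        return "noop"
```

In this model, a "down" verdict two minutes after a manual resolve is held back, while the same verdict six minutes later opens a fresh incident.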

When alerts are suppressed

Sometimes the right behavior is silence. Two cases suppress alerts on purpose:
  • Maintenance windows. Inside a scheduled maintenance window, consensus doesn’t open incidents at all, so nothing fires.
  • Dependency down. If a monitor lists a dependency that’s already down, its incident still opens, but the alert is held back and recorded as suppressed. You get paged about the upstream cause, not about every downstream symptom.
In both cases the suppression is visible in the record, so nothing is hidden. The alert was held back on purpose, and you can see that it was.
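
The two suppression cases differ in where they cut in: a maintenance window prevents the incident itself, while a down dependency lets the incident open and holds back only the alert. A minimal sketch of that decision, assuming a "down" consensus verdict has already been reached, might look like this. The `Outcome` enum and `decide` function are hypothetical names for illustration, not part of any Tallwatch API.

```python
from enum import Enum


class Outcome(Enum):
    NO_INCIDENT = "no_incident"                  # maintenance window
    INCIDENT_ALERT_SUPPRESSED = "alert_suppressed"  # dependency down
    INCIDENT_ALERT_SENT = "alert_sent"           # normal path


def decide(in_maintenance: bool, dependency_down: bool) -> Outcome:
    """Decide what happens when consensus says a monitor is down."""
    # Maintenance is checked first: if no incident opens at all,
    # there is no alert left to suppress.
    if in_maintenance:
        return Outcome.NO_INCIDENT
    # The incident opens, but the alert is recorded as suppressed.
    if dependency_down:
        return Outcome.INCIDENT_ALERT_SUPPRESSED
    return Outcome.INCIDENT_ALERT_SENT
```

Checking maintenance before the dependency follows from the text: inside a window consensus opens no incident, so the dependency rule never gets a chance to apply.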