Node Provider Alerting Options

This page describes the monitoring and alerting tools that node providers can use to watch their node machines and data centers.

Node providers are expected to run their own alerting against their nodes and data center infrastructure. There is no mandatory tool — each provider can pick the approach that fits their existing operations practice.

Available tooling

A few community and DFINITY-maintained options are widely used:

Both tools are starting points rather than complete solutions; many providers layer additional alerting on top — for example SNMP traps from their switch fabric, BMC health checks, or external uptime probes.
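As an illustration of the external uptime probes mentioned above, the following is a minimal Python sketch using only the standard library. It is not one of the tools listed on this page. The names `tcp_probe` and `ProbeState`, the three-failure threshold, and the timeout are assumptions chosen for the example. Which host and port to probe depends on the provider's own setup.

```python
import socket
from dataclasses import dataclass


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


@dataclass
class ProbeState:
    """Counts consecutive failures so a single missed probe does not alert.

    The default threshold of 3 is an illustrative assumption, not a
    recommendation from the network.
    """

    failure_threshold: int = 3
    consecutive_failures: int = 0

    def record(self, ok: bool) -> bool:
        """Record one probe result; return True when an alert should fire."""
        if ok:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Fire exactly once, at the moment the threshold is first reached.
        return self.consecutive_failures == self.failure_threshold
```

A long-running loop on a machine outside the data center could call `tcp_probe` at a fixed interval for each host the provider wants reachable, pass the result to `ProbeState.record`, and send a notification when it returns `True`. Requiring several consecutive failures trades a slightly slower detection time for fewer false alarms.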

[!NOTE] Providers retain full discretion over their alerting stack. The network does not prescribe a specific platform or vendor — only that the provider be able to detect and respond to incidents on their own infrastructure.

Where alerting fits

Alerting is one half of the operational picture. The other half is network monitoring at the switch and BMC layer; the Node Provider Networking Guide covers SNMP and gNMI for that layer. For incident response procedures once an alert fires, see the Node Provider Maintenance Guide and Node Provider Troubleshooting.