This page is part of my personal knowledge database, which helps me store and navigate what I have learned.



Having Alarming in place means having automated systems connected to telemetry that trigger actions, such as notifying operators about incidents that cannot be resolved automatically.
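To make this concrete, here is a minimal sketch of such a rule in Python. Everything in it is an assumption for illustration: the `Alarm` class, the `remediate` and `notify` callbacks, and the `disk_usage_percent` metric are hypothetical names, not part of any specific monitoring product. It shows a telemetry value being checked against a threshold, an automatic fix being tried first, and an operator being notified only when that fix fails.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Alarm:
    """Hypothetical alarm rule bound to one telemetry metric."""
    name: str
    threshold: float
    remediate: Callable[[], bool]   # returns True if the automatic fix worked
    notify: Callable[[str], None]   # e.g. sends a page to the operator on call

    def evaluate(self, value: float) -> str:
        # Healthy reading: nothing to do.
        if value <= self.threshold:
            return "ok"
        # Try the automated resolution first.
        if self.remediate():
            return "auto-resolved"
        # Only incidents that cannot be resolved automatically reach a human.
        self.notify(f"{self.name}: value {value} exceeds threshold {self.threshold}")
        return "operator-notified"


# Usage: a list stands in for the paging channel, and remediation always fails.
pages: List[str] = []
alarm = Alarm("disk_usage_percent", 90.0, remediate=lambda: False, notify=pages.append)
status_ok = alarm.evaluate(42.0)
status_bad = alarm.evaluate(95.0)
```

In this sketch the operator is the last resort: the more incidents `remediate` can handle, the fewer notifications are sent.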

While such tooling is required in Complex Systems, so that small teams can operate large-scale services within Programmable Infrastructure with a huge number of components, it is also an indicator of Technical Debt if overused.

Predictive Alarming, that is, notifying operators before an incident happens based on probability predictors, can be a good tool for optimizing an already reliable system, but should not be used during the early transformation phases.
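As a rough illustration of the predictive idea, the sketch below extrapolates a metric and warns before a threshold is crossed. It assumes samples taken one minute apart and uses a plain least-squares trend line as a stand-in for a real probability predictor. The function names `minutes_until_threshold` and `should_warn` are hypothetical.

```python
from typing import Optional, Sequence


def minutes_until_threshold(samples: Sequence[float], threshold: float) -> Optional[float]:
    """Estimate minutes from the last sample until the trend reaches the threshold.

    Assumes one sample per minute. Returns None when there is too little data
    or the fitted trend is flat or decreasing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    # Least-squares slope of value over time.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    fitted_last = intercept + slope * (n - 1)
    # Already at or above the threshold counts as zero minutes remaining.
    return max(0.0, (threshold - fitted_last) / slope)


def should_warn(samples: Sequence[float], threshold: float, horizon_minutes: float) -> bool:
    """Warn if the projected crossing falls within the given horizon."""
    eta = minutes_until_threshold(samples, threshold)
    return eta is not None and eta <= horizon_minutes


# Usage: disk usage growing by 2 points per minute, threshold at 90.
eta = minutes_until_threshold([70.0, 72.0, 74.0, 76.0], 90.0)
warn_soon = should_warn([70.0, 72.0, 74.0, 76.0], 90.0, horizon_minutes=10)
```

A predictor this naive produces false warnings on noisy metrics, which is one reason such alarms fit better once the system is already reliable.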