Pervasive Telemetry Practices

Without metrics there are no insights when you need them, which results in confusion, loss of control, and misplaced blame. Good pervasive telemetry practices are:

Make metrics easy
- Writing to and reading from metrics must be easy
- Access to metrics must be easy and highly visible
  - For example via a status page, so that everyone can see, and transparently radiate, all progress and change information
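
One way to read "writing to and reading from metrics must be easy" is that both should be a single call. The following is a minimal sketch under that assumption, not a production metrics library; the `Metrics` class and the metric names `orders.created` and `queue.depth` are hypothetical and only serve as illustration.

```python
import threading
from collections import defaultdict


class Metrics:
    """Tiny in-process metrics registry (hypothetical, for illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = defaultdict(int)
        self._gauges = {}

    def incr(self, name, amount=1):
        # Writing a counter is a single call from anywhere in the code.
        with self._lock:
            self._counters[name] += amount

    def gauge(self, name, value):
        # A gauge stores the most recent value of something that goes up and down.
        with self._lock:
            self._gauges[name] = value

    def snapshot(self):
        # Reading is equally simple: one call returns a copy of everything.
        with self._lock:
            return {"counters": dict(self._counters), "gauges": dict(self._gauges)}


metrics = Metrics()
metrics.incr("orders.created")
metrics.incr("orders.created")
metrics.gauge("queue.depth", 7)
print(metrics.snapshot())
```

A snapshot like this could be what a status page renders, so the same data that developers write is visible to everyone.
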
Monitor every source
- Err on the side of too much monitoring
- Collect metrics holistically (e.g. no separation between application and operations)
- Extract metrics from logs => create statistics
  - Write all appropriate logs & use appropriate log levels
- If it’s worth implementing, it’s worth monitoring (and vice versa: if it’s not worth monitoring, it’s not worth implementing).
  - Any new business functionality must result in appropriate new business metrics
  - Every Deployment Stage must have metrics
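
The "extract metrics from logs" point can be sketched as counting log lines per level. This is only an illustration: the log line format (`<date> <time> <LEVEL> <message>`), the function name `level_counts`, and the sample lines are all assumptions made up for this example.

```python
import re
from collections import Counter

# Assumed log line format: "<date> <time> <LEVEL> <message>"
LINE_RE = re.compile(
    r"^\S+ \S+ (?P<level>DEBUG|INFO|WARNING|ERROR|CRITICAL) (?P<message>.*)$"
)


def level_counts(lines):
    """Count log lines per level; lines that do not match are counted as UNPARSED."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        counts[match.group("level") if match else "UNPARSED"] += 1
    return counts


# Made-up sample lines, purely for illustration.
sample = [
    "2024-01-01 12:00:00 INFO user logged in",
    "2024-01-01 12:00:01 ERROR payment failed",
    "2024-01-01 12:00:02 INFO order created",
    "garbage line",
]
counts = level_counts(sample)
print(dict(counts))
```

Counting unparsed lines instead of dropping them keeps the statistic honest: a growing `UNPARSED` count is itself a signal that logs are not written at appropriate levels or in the expected format.
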
Monitor every layer (4-Layer Architecture + Deployment)
- Application layer metrics must contain resource use, auth, session, timing, etc.
- Business/Domain layer metrics must always map to a business goal (otherwise they are vanity metrics: superfluous rather than useful, where useful means actionable)
- Infra layer metrics must be relatable to services (so devs can understand them)
- Deploy layer metrics, when correlated with the other metrics, put those metrics into the context of code deploys, which allows devs to debug and work
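
Putting other metrics into the context of code deploys can be sketched as tagging each metric sample with the deploy that was live when it was recorded. This is a minimal sketch; the function `annotate_with_deploy`, the integer timestamps, the version labels, and the error-rate values are hypothetical.

```python
from bisect import bisect_right


def annotate_with_deploy(samples, deploys):
    """Attach to each (timestamp, value) sample the latest deploy at or before it.

    samples: iterable of (timestamp, value)
    deploys: iterable of (timestamp, version)
    """
    deploys = sorted(deploys)
    deploy_times = [t for t, _ in deploys]
    annotated = []
    for ts, value in samples:
        # Index of the first deploy strictly after ts; the one before it was live.
        idx = bisect_right(deploy_times, ts)
        version = deploys[idx - 1][1] if idx > 0 else None
        annotated.append((ts, value, version))
    return annotated


# Made-up data: error rate samples and two deploys.
deploys = [(100, "v1"), (200, "v2")]
samples = [(50, 0.01), (150, 0.02), (200, 0.20), (250, 0.25)]
for row in annotate_with_deploy(samples, deploys):
    print(row)
```

With the version attached, a jump in a metric can be read directly against the deploy that preceded it.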

Sources

The DevOps Handbook, Chapters 3 and 14

Ulrich Kautz Blog
