This page is part of my personal knowledge database, which helps me store and navigate my learnings.

Pervasive Telemetry Practices

Without metrics there are no insights when you need them, which results in confusion, loss of control, and misplaced blame. Good pervasive telemetry practices are:

  1. Make metrics easy
    • Writing to and reading from metrics must be easy
    • Access to metrics must be easy and highly visible
      • Like a status page, so everyone can see & radiate all progress / change information transparently
  2. Monitor every source
    • Err on the side of too much monitoring
    • Collect metrics holistically (e.g. no separation between application and operations)
    • Extract metrics from logs => create statistics
      • Write all appropriate logs & use appropriate log levels
    • If it’s worth implementing, it’s worth monitoring (and vice versa: if it’s not worth monitoring, it’s not worth implementing).
      • Any new business functionality must result in appropriate new business metrics
      • Every Deployment Stage must have metrics
  3. Monitor every layer (4-Layer Architecture + Deployment)
    • Application layer metrics must contain resource use, auth, session, timing, etc.
    • Business/Domain layer metrics must always map to a business goal (otherwise they are vanity metrics: superfluous rather than useful, where useful means actionable)
    • Infra layer metrics must be relatable to services (so devs can understand them)
    • Deploy layer metrics, when correlated with the other metrics, put them into the context of code deploys => this allows devs to debug and work
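
Two of the practices above, extracting metrics from logs and putting metrics into the context of code deploys, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the log line format, the function name extract_metrics, the service name, the timestamps and the deploy versions are hypothetical and not taken from any real system.

```python
import re
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical log format (an assumption for this sketch):
# "<ISO timestamp> <LEVEL> <service> duration_ms=<int>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<service>\S+) duration_ms=(?P<ms>\d+)$"
)


def extract_metrics(lines, deploys):
    """Turn raw log lines into statistics, grouped by the active deploy.

    deploys: list of (timestamp, version) tuples sorted by timestamp.
    Timestamps are compared as strings, which works for ISO-8601 values
    that share the same format and timezone.
    """
    level_counts = Counter()
    durations = defaultdict(list)  # version -> request durations in ms
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        level_counts[match["level"]] += 1
        # Find the latest deploy that happened at or before this log line.
        version = "unknown"
        for deployed_at, name in deploys:
            if deployed_at <= match["ts"]:
                version = name
        durations[version].append(int(match["ms"]))
    mean_ms = {version: mean(values) for version, values in durations.items()}
    return level_counts, mean_ms


# Made-up sample data, only to show the shape of the input.
logs = [
    "2024-01-01T10:00:00Z INFO checkout duration_ms=100",
    "2024-01-01T10:05:00Z INFO checkout duration_ms=120",
    "2024-01-01T11:00:00Z ERROR checkout duration_ms=400",
    "2024-01-01T11:05:00Z INFO checkout duration_ms=380",
    "not a structured line",
]
deploys = [("2024-01-01T09:00:00Z", "v1"), ("2024-01-01T10:30:00Z", "v2")]

level_counts, mean_ms = extract_metrics(logs, deploys)
print(dict(level_counts))  # counts per log level
print(mean_ms)             # mean request duration per deploy
```

Grouping the durations by deploy is what puts the timing metric into context: in the made-up data the mean duration rises after the second deploy, which is the kind of signal a dev can act on. A real setup would more likely ship these numbers to a metrics backend instead of printing them.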