Monitoring as Code
Manage your SLOs and thresholds as code and publish them to the Asserts Cloud.
SLO
Here is an example SLO that we use to monitor Asserts itself. The service level indicator defined here measures how long a recurring task that updates the Asserts graph takes to run, and the objective is to keep that run time under 15 seconds 99% of the time.
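The example itself did not survive in this copy of the page, so here is an illustrative sketch of what such an occurrence SLO definition could look like. The field names (`indicator`, `objectives`, `window`) and the metric name are assumptions for illustration, not the documented Asserts schema:

```yaml
# Illustrative sketch only - field and metric names are assumptions,
# not the exact Asserts SLO schema.
apiVersion: asserts/v1
kind: SLO
name: graph-updater-task-duration
indicator:
  kind: Occurrence
  # Measurement evaluated each minute: how long the recurring
  # graph-update task took to run (metric name assumed)
  measurement: max(asserts:task:duration_seconds{job="graph-updater"})
objectives:
  # A minute is "good" when the run time stays under 15 seconds;
  # 99% of minutes in the rolling window must be good.
  - name: "Graph update under 15s"
    value: 15
    ratio: 0.99
    window:
      kind: Rolling
      days: 30
```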
Here’s another SLO example. This one checks that the Asserts API server responds successfully to 99.5% of the requests it receives:
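Again the code block was lost here; the sketch below shows the same shape for a request SLO, where good and bad events are counted instead of minutes. The metric names, label names, and fields are assumptions for illustration:

```yaml
# Illustrative sketch only - field, metric, and label names are
# assumptions, not the exact Asserts SLO schema.
apiVersion: asserts/v1
kind: SLO
name: api-server-availability
indicator:
  kind: Request
  # Total requests received and requests that failed with server
  # errors (metric and label names assumed)
  totalEventCount: asserts:request:total{job="api-server"}
  badEventCount: asserts:error:total{job="api-server", error_type="server_errors"}
objectives:
  # 99.5% of requests in the rolling window must succeed.
  - name: "Weekly API availability"
    ratio: 0.995
    window:
      kind: Rolling
      days: 7
```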
These examples demonstrate the two kinds of SLOs that Asserts supports:
Occurrence SLOs are based on time and evaluated once per minute. Based on the application’s performance during each minute, that minute is deemed either good or bad, and bad minutes are counted against the SLO’s error budget. Typical use cases for occurrence SLOs are latency and throughput goals.
Request SLOs are based on events that are either good or bad. Bad events count against the SLO’s error budget. Web application availability is a common use case for a request SLO, where each request received counts as an event, and requests that fail due to server errors count as bad events.
Threshold
You can control how assertions are generated by tuning thresholds. This rule sets the latency threshold for login requests for a specific customer:
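The rule itself is missing from this copy of the page. As an illustrative sketch, a threshold like this could be expressed as a Prometheus-style recording rule whose labels scope it to one request path and one customer; the record name and the `asserts_*` label names are assumptions, not the documented Asserts convention:

```yaml
# Illustrative sketch only - record and label names are assumptions.
groups:
  - name: asserts-latency-thresholds
    rules:
      # Override the p99 latency threshold (in seconds) for the
      # /login endpoint, scoped to a single customer.
      - record: asserts:latency:p99:threshold
        expr: 0.5
        labels:
          asserts_request_context: /login
          asserts_customer: acme
```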
This rule raises a warning level assertion when a redis node has used more than 70% of its CPU:
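The rule body was also lost here. A sketch in the same Prometheus recording-rule style, with the record name and labels again assumed for illustration: the expression sets the CPU usage threshold to 70%, and the severity label marks breaches as warnings rather than critical:

```yaml
# Illustrative sketch only - record and label names are assumptions.
groups:
  - name: asserts-resource-thresholds
    rules:
      # Raise a warning-level assertion when a redis node's CPU
      # usage exceeds 70% (expressed as a 0-1 ratio).
      - record: asserts:resource:usage:threshold
        expr: 0.70
        labels:
          asserts_severity: warning
          asserts_resource_type: cpu
          asserts_source: redis
```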