Asserts
Introduction

Founding Story

As builders and operators of cloud applications and founding engineers of industry-leading APM products, we felt the pinch every day: why was it still so hard to troubleshoot and optimize our apps with modern APM tools? We had compile-time assertions in the form of unit, component, and integration tests, but we couldn’t apply the same rigor to our runtime. There was no platform for transcribing our system design, embodied in the four golden signals and health metrics, into Runtime Assertions. When we created alerts to track these failures and dashboards to view the metrics, they were disjointed, too many to manage, too noisy to correlate, and quickly went out of date with our continuous releases.
As engineers, we wished for an adaptive solution purpose-built for distributed systems and cloud architecture and, most importantly, one that let us manage our monitoring as code.
What is Asserts?
Asserts provides next-generation insights for your distributed, multi-cloud applications.
With Asserts, your team can say goodbye to disjointed dashboards that don’t keep up with rapid releases. Your expert engineers won’t be constantly interrupted to pore over esoteric metric charts and sift through an ocean of disjointed logs hoping to stumble upon the needle in the haystack. And to top it off, your on-call team won’t be fatigued by alerts that don’t matter.
Let’s dig deeper to see how Asserts is different.

Features

Discover a Living Map of App and Infra components

Asserts taps into your telemetry data sources (app metrics), automatically builds a graph of your application and infrastructure components, and indexes the graph for search.
  • With our search, you can see how the components fit together in real time and view KPIs in the built-in Grafana dashboard.
Our cloud component catalog is constantly evolving.

Instrument via SAAFE Assertions to collect the symptoms and causes

Asserts curates knowledge of common runtime failure patterns and potential causes, so your team doesn’t have to research and maintain these rules.
It continuously tracks resource Saturations, Amends (new deployments, scale events, and so on), request and latency Anomalies, systemic Failures, and Errors on your golden signals and health metrics.
Occurrences of these assertions are annotated on the (Knowledge) Graph, so they are easy to take in at a glance.
With our unified search, you can combine components, relations, configurations, and associated assertions to express your intent in a simple natural-language expression.
e.g., Search “Pods crashing on Nodes with high cpu:load”

Wake up when it matters

The SRE book recommends alerting on Service Level Objectives (SLOs) to track "what's broken", and with Asserts, setting up your SLOs and tracking your error budget is a breeze. Finding "why it's broken" is then just a click away in our Assertion Workbench.
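To make the error budget idea concrete, here is a minimal Python sketch of the standard arithmetic for a request-based SLO. It is not how Asserts computes or stores budgets; the function name and parameters are hypothetical and chosen only for illustration.

```python
def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the unspent fraction of the error budget for a request-based SLO.

    Hypothetical helper for illustration only, not an Asserts API.
    slo_target is the success-rate objective, e.g. 0.999 for "99.9% of requests succeed".
    The result is 1.0 when nothing is spent, 0.0 when the budget is exhausted,
    and negative when the SLO has been breached.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        return 1.0  # no traffic yet, so no budget has been spent

    # The budget is the number of failures the SLO tolerates over the window.
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


# A 99.9% SLO over 1,000,000 requests tolerates about 1,000 failures.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
print(f"{remaining:.0%} of the error budget remains")
```

Alerting on how quickly this fraction shrinks, rather than on individual metric thresholds, is what keeps pages tied to user-visible impact.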

Spot issues quickly with Top Insights

With our always-on Assertions, you don’t have to wait for SLOs to breach and alerts to fire. Top Insights presents a stack-ranked view of the Services and Nodes that need attention, based on their assertion score. From there, click Open in Workbench to find the root cause.
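The idea of a stack-ranked view can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the severity levels, their weights, and the scoring rule (sum of weights per entity) are all hypothetical, and the actual Asserts scoring is not described here.

```python
from dataclasses import dataclass

# Hypothetical severity weights; the real scoring scheme is an assumption here.
SEVERITY_WEIGHT = {"critical": 3, "warning": 2, "info": 1}


@dataclass
class Assertion:
    entity: str    # service or node name
    category: str  # saturation, amend, anomaly, failure, or error
    severity: str  # a key of SEVERITY_WEIGHT (assumed levels)


def rank_entities(assertions: list[Assertion]) -> list[tuple[str, int]]:
    """Sum a weight per active assertion and sort entities, highest score first."""
    scores: dict[str, int] = {}
    for a in assertions:
        scores[a.entity] = scores.get(a.entity, 0) + SEVERITY_WEIGHT[a.severity]
    # Ties are broken by name so the ordering is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


active = [
    Assertion("api-server", "amend", "info"),
    Assertion("api-server", "error", "critical"),
    Assertion("node-1", "saturation", "warning"),
]
for entity, score in rank_entities(active):
    print(entity, score)
```

A ranking like this surfaces the entities with the most, or most severe, active assertions before any SLO has breached.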

Troubleshoot in Workbench with all the Assertions

In our Assertion Workbench, dig in to view all the possible causes correlated across time and space, with just the right metrics and logs at your fingertips.
e.g., an amend (new deployment) on api-server triggered a spike in error rate on an endpoint /slo/incidents. Jump to Dashboard or View Logs to see contextual logs in your existing log store, such as Kibana, Graylog, and others.