Redis
Prometheus metrics for Redis can be enabled using the Redis exporter. Once the exporter is set up, check the following metrics to verify the setup:
- redis_up
- redis_uptime_in_seconds
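The check above can be sketched as a small script. This is a minimal illustration, assuming the exporter's metrics page is available as Prometheus text exposition format; the helper names `read_gauge` and `exporter_is_healthy` and the sample text are hypothetical, not part of the exporter.

```python
def read_gauge(exposition_text, metric_name):
    """Return the first sample value for metric_name, or None if it is absent."""
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Labeled sample: name{label="value"} 1.0
            name_and_labels, rest = line.rsplit("}", 1)
            name = name_and_labels.split("{", 1)[0]
        else:
            # Unlabeled sample: name 1.0
            parts = line.split(None, 1)
            if len(parts) != 2:
                continue  # not a sample line
            name, rest = parts
        fields = rest.split()
        if name == metric_name and fields:
            return float(fields[0])
    return None


def exporter_is_healthy(exposition_text):
    """True when redis_up is 1 and redis_uptime_in_seconds is positive."""
    up = read_gauge(exposition_text, "redis_up")
    uptime = read_gauge(exposition_text, "redis_uptime_in_seconds")
    return up == 1.0 and uptime is not None and uptime > 0


# Illustrative sample only; real exporter output contains many more metrics.
SAMPLE = """\
# TYPE redis_up gauge
redis_up 1
# TYPE redis_uptime_in_seconds gauge
redis_uptime_in_seconds 86400
"""

print(exporter_is_healthy(SAMPLE))
```

In practice the text would be fetched from the exporter's metrics endpoint (for example with `urllib.request.urlopen`); the host and port depend on how the exporter is deployed.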
Metric | KPI |
Request Counter: redis_commands_total, redis_connections_received_total | Request Rate: rate(redis_commands_total[5m]), rate(redis_connections_received_total[5m]) |
Error Counter: redis_rejected_connections_total | Error Ratio: rate(redis_rejected_connections_total[5m]) / rate(redis_connections_received_total[5m]) |
Latency Counter: redis_commands_duration_seconds_total | Latency Average: rate(redis_commands_duration_seconds_total[5m]) / rate(redis_commands_total[5m]) |
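To make the arithmetic behind these KPIs concrete, here is a simplified sketch that derives them from two snapshots of the counters. It assumes each counter has already been summed across its labels, and it ignores the counter-reset handling and extrapolation that PromQL's `rate()` performs. `CounterSnapshot` and `golden_signals` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class CounterSnapshot:
    timestamp: float                        # seconds
    commands_total: float                   # redis_commands_total
    connections_received_total: float       # redis_connections_received_total
    rejected_connections_total: float       # redis_rejected_connections_total
    commands_duration_seconds_total: float  # redis_commands_duration_seconds_total


def golden_signals(older, newer):
    """Approximate the request rate, error ratio, and latency average KPIs."""
    elapsed = newer.timestamp - older.timestamp
    if elapsed <= 0:
        raise ValueError("snapshots must be in increasing time order")

    def per_second(new_value, old_value):
        return (new_value - old_value) / elapsed

    command_rate = per_second(newer.commands_total, older.commands_total)
    connection_rate = per_second(
        newer.connections_received_total, older.connections_received_total
    )
    rejected_rate = per_second(
        newer.rejected_connections_total, older.rejected_connections_total
    )
    duration_rate = per_second(
        newer.commands_duration_seconds_total,
        older.commands_duration_seconds_total,
    )
    return {
        "command_rate": command_rate,
        "connection_rate": connection_rate,
        # Rejected connections as a fraction of received connections.
        "error_ratio": rejected_rate / connection_rate if connection_rate else 0.0,
        # Seconds spent per command, averaged over the window.
        "latency_average": duration_rate / command_rate if command_rate else 0.0,
    }


# Made-up values, five minutes apart.
older = CounterSnapshot(0.0, 1000.0, 100.0, 0.0, 2.0)
newer = CounterSnapshot(300.0, 4000.0, 400.0, 3.0, 5.0)
print(golden_signals(older, newer))
```

With these made-up values, 3000 commands over 300 seconds gives 10 commands per second, and 3 rejected connections out of 300 received gives an error ratio of 1%.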
Metric | KPI |
CPU Usage: redis_cpu_user_seconds_total, redis_cpu_sys_seconds_total | rate(redis_cpu_user_seconds_total[5m]) + rate(redis_cpu_sys_seconds_total[5m]) |
Memory Usage: redis_memory_used_rss_bytes | redis_memory_used_rss_bytes / redis_memory_max_bytes |
Network Bytes Received: redis_net_input_bytes_total; Network Bytes Transmitted: redis_net_output_bytes_total | Data Transfer Rate: rate(redis_net_input_bytes_total[5m]), rate(redis_net_output_bytes_total[5m]) |
Connected Clients: redis_connected_clients | redis_connected_clients / redis_config_maxclients |
KPI | Alert |
Request Rate | RequestRateAnomaly |
Error Ratio | ErrorRatioBreach and ErrorBuildup, based on an availability SLO of 99.9% |
Latency Average | LatencyAverageBreach and LatencyAverageAnomaly |
CPU Usage | Saturation, with severity warning when CPU utilization exceeds 70% and critical when it exceeds 90% |
Memory Usage | Saturation, with severity warning when memory utilization exceeds 65% and critical when it exceeds 75% |
Network Bytes | ResourceRateAnomaly |
Client Connections | Saturation, with severity warning when utilization exceeds 80% and critical when it exceeds 90%; ResourceMayExhaust if connections are projected to exceed the limit of 256 within the next 4 hours |
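The saturation thresholds and the exhaustion forecast in this table can be illustrated with a short sketch. The thresholds, the limit of 256 connections, and the 4-hour horizon come from the table; everything else is an assumption. In particular, the source does not say how ResourceMayExhaust is computed, so the least-squares extrapolation below (similar in spirit to PromQL's `predict_linear`) is only one plausible reading. `saturation_severity` and `may_exhaust` are hypothetical names.

```python
# (warning, critical) thresholds as fractions, taken from the table above.
THRESHOLDS = {
    "cpu": (0.70, 0.90),
    "memory": (0.65, 0.75),
    "connections": (0.80, 0.90),
}


def saturation_severity(utilization, warning, critical):
    """Return "critical", "warning", or None for a utilization fraction."""
    if utilization > critical:
        return "critical"
    if utilization > warning:
        return "warning"
    return None


def may_exhaust(samples, limit=256, horizon_seconds=4 * 3600):
    """Fit a straight line to (timestamp, value) samples and report whether
    the extrapolated value exceeds `limit` at the end of the horizon.

    Assumption: a simple least-squares fit; the real alert may differ.
    """
    if len(samples) < 2:
        return False
    mean_t = sum(t for t, _ in samples) / len(samples)
    mean_v = sum(v for _, v in samples) / len(samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        return False  # all samples share one timestamp, no trend to fit
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / denominator
    intercept = mean_v - slope * mean_t
    last_timestamp = max(t for t, _ in samples)
    predicted = slope * (last_timestamp + horizon_seconds) + intercept
    return predicted > limit


# Connected clients growing by 10 every 10 minutes (made-up values).
growing = [(0, 100), (600, 110), (1200, 120), (1800, 130)]
print(saturation_severity(0.72, *THRESHOLDS["cpu"]))
print(may_exhaust(growing))
```

In the example, the fitted trend of one additional client per minute projects 370 connected clients four hours after the last sample, which is above the limit of 256.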
RedisDown
Redis instance is down
redis_up != 1
RedisUptimeReset
Redis instance restarted
delta(redis_uptime_in_seconds[5m]) < 0
RedisMasterLinkDown
Redis master link down
(
  avg_over_time(redis_master_link_up[10m])
  and on (instance)
  redis_instance_info{role="slave"}
) == 0
RedisReplicationBroken
Redis instance lost a replica
delta(redis_connected_slaves[1m]) < 0
RedisClusterFlapping
Changes have been detected in Redis replica connections
changes(redis_connected_slaves[5m]) > 2
RedisRejectedConnections
Some connections to Redis have been rejected
rate(redis_rejected_connections_total[1m]) * 60 > 0
RedisMissingMaster
Redis master is missing
count by (job, service, redis_mode, namespace)
  (redis_instance_info{role="master"}) == 0
RedisTooManyMasters
Standalone and HA setup should only have one master
count by (job, service, namespace)
  (redis_instance_info{role="master", redis_mode="standalone"}) > 1
RedisTooFewMastersInCluster
Redis cluster mode should have every instance in the role of "master"
avg by (job, service, namespace) (redis_cluster_size)
-
count by (job, service, namespace)
  (redis_instance_info{role="master", redis_mode="cluster"})
> 0
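The three master-count alerts above amount to simple counting rules over the instances in one (job, service, namespace) group. The sketch below restates them, assuming each instance is described by a dictionary with `role` and `redis_mode` keys; `topology_alerts` and that data shape are hypothetical, chosen only for illustration.

```python
def topology_alerts(instances, cluster_size=None):
    """Return the names of the master-count alerts that would apply.

    instances: list of dicts such as {"role": "master", "redis_mode": "cluster"}
    cluster_size: expected number of cluster nodes (redis_cluster_size), if any
    """
    alerts = []

    masters = [i for i in instances if i["role"] == "master"]
    if len(masters) == 0:
        alerts.append("RedisMissingMaster")

    standalone_masters = [i for i in masters if i["redis_mode"] == "standalone"]
    if len(standalone_masters) > 1:
        alerts.append("RedisTooManyMasters")

    if cluster_size is not None:
        cluster_masters = [i for i in masters if i["redis_mode"] == "cluster"]
        if cluster_size - len(cluster_masters) > 0:
            alerts.append("RedisTooFewMastersInCluster")

    return alerts


# A standalone HA group with two masters (made-up data).
split_brain = [
    {"role": "master", "redis_mode": "standalone"},
    {"role": "master", "redis_mode": "standalone"},
    {"role": "slave", "redis_mode": "standalone"},
]
print(topology_alerts(split_brain))
```

This Python version treats "no masters" as a count of zero. In PromQL, `count` over no matching series returns no result rather than 0, so the behavior of the expressions above can differ in that case.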
The Redis KPI Dashboard shows all of the KPIs listed above.
