5 Signs You Have Alert Fatigue => Having an application running in production means that it affects other people’s lives. It’s a huge responsibility, and you need to understand it well before you carry an on-call phone.
There are usually two generic ways of knowing whether everything is OK with your applications: active monitoring, where you go to the dashboards and check for errors and problems, and passive monitoring, commonly called alerts, where a system notifies you about an ongoing issue.
The latter is usually better than the former. You work in the technology sector, so you want things to be automatic. Furthermore, it’s better to learn about a problem right after it starts than to find it only when you happen to check, which can take much longer and, consequently, increase the impact of the incident.
You might think that having an alert for every single issue that happens in your system would free you from checking your dashboards to detect problems.
It’s not the case.
Having the smallest thing triggering an alarm will give you Alert Fatigue. According to Wikipedia, “Alarm fatigue or alert fatigue occurs when one is exposed to a large number of frequent alarms (alerts) and consequently becomes desensitized to them. Desensitization can lead to longer response times or missing important alarms.”
Being overwhelmed by alerts won’t bring you any benefit, neither for your application nor for your health. You won’t give alerts the attention they need, and you’ll feel more tired over time.
Here are five signs that you have alert fatigue, even if you didn’t know it was a thing.
1. You Ignore the Recurrent Alerts
An alert is supposed to be an exception to the rule. You shouldn’t have an alert that triggers regularly and requires no manual intervention. Those false positives mean that you should tune your alarms better.
If you start having alerts that trigger without requiring you to do anything, you will most likely start ignoring them. When this happens, it’s only a matter of time until a real alarm goes off and you ignore it because you think it’s just another false alert.
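One way to find the alarms that need tuning is to look at your alert history and list the ones that fire often but almost never need anyone to act. The sketch below is only an illustration: the `AlertEvent` record, the alert names, and the `min_fires` and `max_action_ratio` cut-offs are all assumptions, not the export format or defaults of any particular alerting tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class AlertEvent:
    name: str            # hypothetical alert name
    action_taken: bool   # did someone actually have to intervene?


def noisy_alerts(events, min_fires=5, max_action_ratio=0.2):
    """Return names of alerts that fire often but rarely need intervention."""
    fires = Counter(e.name for e in events)
    actions = Counter(e.name for e in events if e.action_taken)
    return sorted(
        name
        for name, count in fires.items()
        if count >= min_fires and actions[name] / count <= max_action_ratio
    )


# Made-up history: one alert fired 10 times and needed action once,
# another fired twice and needed action both times.
history = (
    [AlertEvent("disk_usage_warning", False)] * 9
    + [AlertEvent("disk_usage_warning", True)]
    + [AlertEvent("payment_errors", True)] * 2
)
print(noisy_alerts(history))  # ['disk_usage_warning']
```

Anything this kind of report returns is a candidate for a higher threshold, a longer evaluation window, or removal.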
2. You Get Angry When a New Alert Pops Up
Having an alarm triggered isn’t exactly the most pleasant experience in the world. It should make you worried about what’s going on and apprehensive about its impact.
If you reach the point where, instead of being worried, you get angry and feel a wave of frustration because an alarm was triggered (again), you are no longer caring for your application. You are not hoping that no problems occur. You are praying that no alarms are triggered.
3. When a Major Incident Occurs, You Get So Many Alerts That You Lose Track of Them
Sooner or later, you’ll have a significant incident that requires your attention for a good part of the day. Imagine trying to solve a problem while tons of different alarms are triggered every few minutes.
It’s not easy, I know.
You get to a point where you no longer read which alert is being triggered and just acknowledge it so you can continue searching for the problem.
Eventually, you find the problem. But have all the alarms cleared? When you look at your alert log, you can’t tell which alerts are already resolved and which are not. You need to check your systems actively.
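If your alerting tool can export its log, a few lines of code can answer the "what is still open?" question for you. This is a minimal sketch that assumes a chronologically ordered log of (alert name, status) pairs with the statuses `triggered`, `acknowledged`, and `resolved`. Both the log shape and the alert names are hypothetical, so adapt them to whatever your tool actually produces.

```python
def open_alerts(log):
    """Return alerts whose most recent event is not 'resolved'.

    `log` is a chronologically ordered list of (alert_name, status) pairs.
    """
    latest = {}
    for name, status in log:
        latest[name] = status  # later entries overwrite earlier ones
    return sorted(name for name, status in latest.items() if status != "resolved")


# Made-up incident log for illustration.
log = [
    ("api_latency", "triggered"),
    ("db_connections", "triggered"),
    ("api_latency", "acknowledged"),
    ("db_connections", "resolved"),
    ("queue_backlog", "triggered"),
]
print(open_alerts(log))  # ['api_latency', 'queue_backlog']
```

Note that an acknowledged alert still counts as open here: acknowledging it only silences the noise, it doesn’t mean the problem is gone.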
4. You Don’t Have a Day Without at Least One Alert
I can’t emphasize this enough: an alarm is supposed to be something exceptional. If you have alarms being triggered every day, you either need to review your alert policy and tune the most common alarms, or you need to check what underlying problem is triggering them. You probably don’t have critical issues every day. So why should you get alerts every day?
When setting an alert, think about whether it’s worth being woken up in the middle of the night because of it.
5. You Don’t Have a Desire to Solve the Problem
The feeling of accomplishment that you get when you find out what’s ruining your application is the best incentive to keep hunting for problems and delivering the best service possible. However, if you start having more alarms in the queue than you can handle, it will demotivate you. You’ll feel overwhelmed as more alerts join the queue for analysis, until you finally give up.
Losing the flame of problem-solving and caring about your applications does no good to anyone.
Passive monitoring is probably the best tool to know when your application has a critical issue going on that needs to be taken care of. It frees you from going to the app dashboards all the time just to know if there is any issue going on at the moment.
Avoid having an alarm ringing at the smallest thing. That is not the goal. Otherwise, you’ll end up trapped in a cage that you built and start ignoring what was supposed to be the top priority.
Don’t set up an alarm if you are not willing to investigate why it is being triggered. A good rule of thumb is to set a new alarm only if you would be willing to be woken up at 4 am because of it.
Don’t use alerts as a silver bullet. Use them for critical situations, both tech-related (high CPU usage, low disk space, etc.) and business-related.
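A common way to keep an alarm from ringing at the smallest thing is to require the condition to hold for several consecutive checks before anyone is notified. Here is a minimal sketch of that idea. The function name, the 90% threshold, and the three-sample window are assumptions chosen for illustration, not recommended values.

```python
def should_alert(samples, threshold, min_consecutive=3):
    """True only if the last `min_consecutive` samples all exceed `threshold`."""
    if len(samples) < min_consecutive:
        return False
    return all(value > threshold for value in samples[-min_consecutive:])


# A single spike does not wake anyone up...
cpu_percent = [42, 97, 51, 48]
print(should_alert(cpu_percent, threshold=90))  # False

# ...but sustained high usage does.
cpu_percent = [55, 93, 96, 98]
print(should_alert(cpu_percent, threshold=90))  # True
```

Most alerting systems offer something similar out of the box, so in practice you would usually configure a duration on the rule rather than write this yourself.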