Alerts Quick Reference & Cheatsheet | Nife Deploy
Quick answers to common alert questions.
I Want To...#
Create an Alert Rule#
Time: 2-3 minutes
Get Alert Notifications#
Time: 5-10 minutes
Respond to a Firing Alert#
Time: Varies with issue complexity
Find a Specific Alert#
Use the filters:
Or scroll through the list at SRE → Alerts
Fix Too Many Alerts#
Step 1: Find the noisy rule
Step 2: Increase threshold
Step 3: Test and adjust as needed
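Step 1 is easier with a count of how often each rule has fired. A minimal sketch, assuming recent alert events have been exported as a list of records with a hypothetical `rule` field (the export format and field names are assumptions, not a documented Nife API):

```python
from collections import Counter

# Hypothetical export of recent alert events; the field names are assumptions.
events = [
    {"rule": "CPU High", "severity": "warning"},
    {"rule": "CPU High", "severity": "warning"},
    {"rule": "Slow Response", "severity": "warning"},
    {"rule": "CPU High", "severity": "warning"},
    {"rule": "Service Down", "severity": "critical"},
]


def noisiest_rules(events, top=3):
    """Return (rule name, firing count) pairs, most frequent first."""
    counts = Counter(event["rule"] for event in events)
    return counts.most_common(top)


for name, count in noisiest_rules(events):
    print(f"{name}: fired {count} times")
```

The rule at the top of the list is the first candidate for a higher threshold.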
Test if Notifications Work#
Disable Alerts Temporarily#
Quick Reference Tables#
Alert Statuses#
| Status | Icon | Color | Meaning | Action |
|---|---|---|---|---|
| Firing | | Red | Active alert, needs attention | Acknowledge & fix |
| Acknowledged | ⏱️ | Yellow | Someone is investigating | Wait or help |
| Resolved | ✅ | Green | Issue is fixed | None |
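The statuses above can be read as a small lifecycle. The following is only a sketch: the table names the statuses but does not spell out which moves are allowed, so the transition map here is an assumption.

```python
from enum import Enum


class Status(Enum):
    FIRING = "Firing"
    ACKNOWLEDGED = "Acknowledged"
    RESOLVED = "Resolved"


# Assumed transitions; the table names the statuses but not the allowed moves.
ALLOWED = {
    Status.FIRING: {Status.ACKNOWLEDGED, Status.RESOLVED},
    Status.ACKNOWLEDGED: {Status.RESOLVED},
    Status.RESOLVED: set(),
}


def transition(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


status = Status.FIRING
status = transition(status, Status.ACKNOWLEDGED)  # someone starts investigating
status = transition(status, Status.RESOLVED)      # the issue is fixed
print(status.value)
```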
Severity Levels#
| Severity | Icon | When | Response Time |
|---|---|---|---|
| Critical | 🔴 | Immediate action needed | Minutes |
| Warning | 🟠 | Important, needs attention | 1 hour |
| Info | 🟡 | Informational only | As time allows |
Notification Channels#
| Channel | Speed | Best For | Setup Time |
|---|---|---|---|
| Email | Slow | Documentation | 1 min |
| Slack | Fast | Team visibility | 2 min |
| PagerDuty | Immediate | Critical alerts | 5 min |
| Webhook | Depends | Custom integration | 10+ min |
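One way to apply the "Best For" column is to route by severity. The policy below is an assumed example, not a Nife default, and the channel identifiers are hypothetical labels:

```python
# Assumed routing policy based on the "Best For" column; adjust to your team.
ROUTES = {
    "critical": ["pagerduty", "slack", "email"],
    "warning": ["slack", "email"],
    "info": ["email"],
}


def channels_for(severity: str) -> list[str]:
    """Return the channels to notify for a severity (case-insensitive)."""
    # Unknown severities fall back to email so nothing is silently dropped.
    return ROUTES.get(severity.lower(), ["email"])


print(channels_for("Critical"))
print(channels_for("Info"))
```

Sending critical alerts to the fastest channel and keeping email as a record for everything is one reasonable starting point.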
Common Alert Examples#
Website/API#
| Alert | Threshold | Severity |
|---|---|---|
| Service Down | HTTP 503 | Critical |
| Slow Response | > 5 seconds | Warning |
| High Error Rate | > 5% | Warning |
| CPU High | > 80% | Warning |
Database#
| Alert | Threshold | Severity |
|---|---|---|
| Service Down | Not responding | Critical |
| CPU High | > 80% | Warning |
| Memory High | > 90% | Warning |
| Low Disk Space | < 10% free | Critical |
Infrastructure#
| Alert | Threshold | Severity |
|---|---|---|
| Server Down | Unreachable | Critical |
| Disk Full | < 5% free | Critical |
| High Latency | > 100ms | Warning |
| Network Error | > 1% loss | Warning |
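Each row in these tables is a metric, a comparison, and a threshold. A minimal sketch of how such rules could be evaluated against one metric sample follows; the `Rule` class and metric names are illustrative assumptions, not Nife's data model:

```python
import operator
from dataclasses import dataclass

# Comparison operators used in the threshold columns above.
OPS = {">": operator.gt, "<": operator.lt}


@dataclass(frozen=True)
class Rule:
    name: str
    metric: str       # hypothetical metric key
    op: str           # ">" or "<"
    threshold: float
    severity: str


RULES = [
    Rule("CPU High", "cpu_percent", ">", 80, "Warning"),
    Rule("Memory High", "memory_percent", ">", 90, "Warning"),
    Rule("Low Disk Space", "disk_free_percent", "<", 10, "Critical"),
    Rule("Slow Response", "response_seconds", ">", 5, "Warning"),
]


def firing(rules, sample):
    """Return the rules whose condition holds for the given metric sample."""
    return [
        rule
        for rule in rules
        if rule.metric in sample
        and OPS[rule.op](sample[rule.metric], rule.threshold)
    ]


sample = {"cpu_percent": 91.0, "memory_percent": 62.0, "disk_free_percent": 7.5}
for rule in firing(RULES, sample):
    print(f"{rule.severity}: {rule.name}")
```

Metrics missing from the sample are skipped rather than treated as breaches, which is a design choice of this sketch.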
Troubleshooting#
Alert Won't Fire#
Check:
- Is the rule enabled? (toggle should be ON)
- Is the threshold correct? (Test with an extreme value)
- Is the metric actually reaching the threshold?
Fix:
- Enable the rule
- Lower threshold temporarily to test
- Verify correct metric is selected
Not Getting Notifications#
Check:
- Click "Test" on your channel
- Did you receive the test notification?
- Check email spam folder
- Check Slack channel settings
- Verify email/phone is correct
Fix:
- Resend test
- Check spam folder
- Verify email/channel settings
- Check connection/credentials
Too Many Alerts#
Solutions:
- Increase threshold (less sensitive)
- Disable low-priority rules
- Use a digest instead of individual emails
- Combine related alerts
Example:
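One way to judge a less sensitive threshold before changing a rule is to replay recent metric values against candidate thresholds. A minimal sketch with made-up CPU readings (the data and function name are illustrative, not taken from Nife):

```python
# Made-up CPU readings (percent), one per check interval.
cpu_samples = [72, 81, 79, 83, 95, 78, 82, 97, 76, 84]


def count_breaches(samples, threshold):
    """Count how many samples exceed the threshold."""
    return sum(1 for value in samples if value > threshold)


for threshold in (80, 85, 90):
    print(f"> {threshold}%: {count_breaches(cpu_samples, threshold)} breaches")
```

With this data, raising the threshold from 80% to 85% cuts the breaches from 6 to 2, while going on to 90% changes nothing further.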
False Alarms#
Why: Threshold too sensitive
Fix:
- Review when it fired
- Increase threshold
- Make condition more specific
- Or delete if not important
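A common way to make a condition more specific is to require the breach to last for several checks in a row, so a single spike does not fire. This is a sketch of the idea only; whether Nife alert rules offer a duration setting is an assumption to verify in the rule editor.

```python
def sustained_breach(samples, threshold, min_consecutive):
    """True if at least min_consecutive samples in a row exceed threshold."""
    run = 0
    for value in samples:
        # Extend the current run on a breach, reset it otherwise.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


spiky = [70, 95, 72, 96, 71, 74]      # brief spikes only
sustained = [70, 85, 88, 91, 86, 74]  # four breaches in a row

print(sustained_breach(spiky, 80, 3))
print(sustained_breach(sustained, 80, 3))
```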
Duplicate Alerts#
Problem: Same issue creates multiple alerts
Fix:
- Delete duplicate rules
- Keep only one rule per metric
- Or adjust thresholds so only one fires
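To spot duplicates, it can help to group rules by the metric they watch. A minimal sketch, assuming a hypothetical list of rule records (the field names are illustrative):

```python
from collections import defaultdict

# Hypothetical rule list; field names are assumptions.
rules = [
    {"name": "CPU High", "metric": "cpu_percent", "threshold": 80},
    {"name": "CPU Alert", "metric": "cpu_percent", "threshold": 85},
    {"name": "Memory High", "metric": "memory_percent", "threshold": 90},
]


def duplicate_metrics(rules):
    """Map each metric watched by more than one rule to those rule names."""
    by_metric = defaultdict(list)
    for rule in rules:
        by_metric[rule["metric"]].append(rule["name"])
    return {metric: names for metric, names in by_metric.items() if len(names) > 1}


print(duplicate_metrics(rules))
```

Any metric that appears in the result is a candidate for merging or deleting rules.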
Time-Saving Tips#
Fastest to Respond#
Fastest to Create Rule#
Fastest to Test Channel#
Key Locations#
| Need | Path |
|---|---|
| Create rule | Alerts → Alert Rules |
| Configure notifications | Alerts → Alert Config |
| View alerts | SRE → Alerts |
| Help | ? button on page |
Common Mistakes#
| Mistake | Impact | Fix |
|---|---|---|
| Too many alerts | Alert fatigue | Increase thresholds |
| Vague names | Confusion | Use specific names |
| No testing | Alerts don't work | Always test |
| Never adjusting | Gets worse over time | Review monthly |
| Not documenting | Team confusion | Document each rule |
Best Threshold Examples#
CPU Usage#
Memory Usage#
Response Time#
Error Rate#
Checklist: First Alert Setup#
- Created alert rule
- Set realistic threshold
- Chose appropriate severity
- Added clear name
- Set up notification channel
- Tested notification
- Linked rule to channel
- Enabled the rule
- Documented the rule
- Tested end-to-end
Do's and Don'ts#
Do ✅#
- ✅ Create alerts for important things
- ✅ Use clear, specific names
- ✅ Test before relying
- ✅ Respond quickly to alerts
- ✅ Review and adjust regularly
Don't ❌#
- ❌ Create too many rules
- ❌ Use vague names
- ❌ Skip testing
- ❌ Ignore alerts
- ❌ Let system grow stale
Quick Definitions#
Alert Rule: Defines WHEN an alert should fire (the condition)
Notification Channel: Defines HOW you get notified (email, Slack, etc.)
Severity: How serious the alert is (Critical/Warning/Info)
Threshold: The value that triggers the alert (> 80%, < 10%, etc.)
Status: Current state of the alert (Firing/Acknowledged/Resolved)
Response Times#
| Action | Time |
|---|---|
| Create rule | 2-3 min |
| Set up channel | 5-10 min |
| Acknowledge alert | 30 sec |
| Resolve alert | 30 sec |
| Test channel | 30 sec |
Support & Links#
| Need | Contact |
|---|---|
| Questions | [email protected] |
| Help on page | ? button |
| Documentation | Alerts Overview |
| Best Practices | Best Practices |
Related Pages#
- Alerts Overview - Start here
- Creating Alert Rules - How to create rules
- Alert Configuration - How to set up notifications
- Responding to Alerts - How to respond
- Best Practices - Advanced tips
Last Updated: January 2026