Alerts Quick Reference
Quick answers to common alert questions.
I Want To...
Create an Alert Rule
Alerts → Alert Rules → New Rule
↓
Choose what to monitor (CPU, errors, etc.)
↓
Set threshold (> 80%, < 10%, etc.)
↓
Choose severity (Critical/Warning/Info)
↓
Name it clearly
↓
Save and Enable
Time: 2-3 minutes
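The rule produced by these steps can be pictured as a small record: a name, a metric, a comparison, a threshold, and a severity. The sketch below is illustrative only. `AlertRule`, its field names, and the `cpu_percent` metric name are hypothetical and do not reflect the product's actual data model or API.

```python
from dataclasses import dataclass
import operator

# Hypothetical model of an alert rule; field names are assumptions,
# not the product's real schema.
COMPARATORS = {">": operator.gt, "<": operator.lt}

@dataclass
class AlertRule:
    name: str
    metric: str        # what to monitor, e.g. "cpu_percent"
    comparator: str    # ">" or "<"
    threshold: float
    severity: str      # "Critical", "Warning", or "Info"
    enabled: bool = True

    def should_fire(self, value: float) -> bool:
        """Return True when the rule is enabled and the value crosses the threshold."""
        return self.enabled and COMPARATORS[self.comparator](value, self.threshold)

rule = AlertRule("API servers: CPU above 80%", "cpu_percent", ">", 80.0, "Warning")
print(rule.should_fire(91.5))  # True
print(rule.should_fire(42.0))  # False
```

A specific name such as "API servers: CPU above 80%" states both the target and the condition, which is what "Name it clearly" is asking for.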
Get Alert Notifications
Alerts → Alert Config → Add Channel
↓
Choose type (Email/Slack/PagerDuty/Webhook)
↓
Enter your details
↓
Test the channel
↓
Link to your alert rule
Time: 5-10 minutes
Respond to a Firing Alert
Get notification
↓
Click link or go to SRE → Alerts
↓
Find the alert with Firing status
↓
Click Acknowledge
↓
Investigate and fix the issue
↓
Click Resolve
Time: Varies with issue complexity
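The response flow above moves an alert through three statuses in order. A minimal sketch of that lifecycle as a state machine follows. The `Status` enum and `advance` function are hypothetical names, and the sketch assumes an alert must be acknowledged before it can be resolved, which this page implies but does not state outright.

```python
from enum import Enum

class Status(Enum):
    FIRING = "Firing"
    ACKNOWLEDGED = "Acknowledged"
    RESOLVED = "Resolved"

# Allowed moves, following the flow above: acknowledge first, then resolve.
# Assumption: resolving directly from Firing is not allowed in this sketch.
TRANSITIONS = {
    Status.FIRING: {Status.ACKNOWLEDGED},
    Status.ACKNOWLEDGED: {Status.RESOLVED},
    Status.RESOLVED: set(),
}

def advance(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target

status = Status.FIRING
status = advance(status, Status.ACKNOWLEDGED)
status = advance(status, Status.RESOLVED)
print(status.value)  # Resolved
```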
Find a Specific Alert
Use the filters:
Status: Firing / Acknowledged / Resolved
Severity: Critical / Warning / Info
Or scroll through the list at SRE → Alerts
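Combining the two filters is an AND: an alert is shown only if it matches every filter that is set. The sketch below illustrates this on made-up records. The dictionary keys and `filter_alerts` are hypothetical, and the severity values follow the Severity Levels table on this page.

```python
from typing import Optional

# Hypothetical alert records; the keys are assumptions for illustration.
alerts = [
    {"name": "API: CPU above 80%", "status": "Firing", "severity": "Warning"},
    {"name": "DB: not responding", "status": "Firing", "severity": "Critical"},
    {"name": "API: slow response", "status": "Resolved", "severity": "Warning"},
]

def filter_alerts(items, status: Optional[str] = None, severity: Optional[str] = None):
    """Keep alerts matching every filter that is set; None means 'any'."""
    return [
        a for a in items
        if (status is None or a["status"] == status)
        and (severity is None or a["severity"] == severity)
    ]

firing_critical = filter_alerts(alerts, status="Firing", severity="Critical")
print([a["name"] for a in firing_critical])  # ['DB: not responding']
```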
Fix Too Many Alerts
Step 1: Find the noisy rule
Alerts → Alert Rules → Look for the most frequently triggered rule
Step 2: Increase threshold
Example: Change from CPU > 70% to CPU > 85%
Step 3: Test and adjust as needed
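Before changing a threshold, it can help to estimate how many readings would have crossed the old and new values. The sketch below does this on made-up CPU samples; the numbers are invented for illustration and `count_breaches` is a hypothetical helper, not a product feature.

```python
# Made-up CPU readings (percent) standing in for a day's samples.
cpu_samples = [55, 72, 68, 88, 74, 91, 71, 66, 79, 86]

def count_breaches(samples, threshold):
    """Count samples strictly above the threshold."""
    return sum(1 for s in samples if s > threshold)

print(count_breaches(cpu_samples, 70))  # 7
print(count_breaches(cpu_samples, 85))  # 3
```

Readings in the 70s are routine load here, so they stop triggering once the threshold moves to 85%, while the three genuinely high readings still do.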
Test if Notifications Work
Alerts → Alert Config
↓
Find your channel
↓
Click "Test"
↓
Check if you received it
Disable Alerts Temporarily
Alerts → Alert Rules
↓
Find the rule
↓
Toggle Enabled OFF
↓
(Remember to turn back ON later!)
Quick Reference Tables
Alert Statuses
| Status | Icon | Color | Meaning | Action |
|---|---|---|---|---|
| Firing | 🔔 | Red | Active alert, needs attention | Acknowledge & Fix |
| Acknowledged | ⏱️ | Yellow | Someone is investigating | Wait or help |
| Resolved | ✓ | Green | Issue is fixed | None |
Severity Levels
| Severity | Icon | When | Response Time |
|---|---|---|---|
| Critical | 🔴 | Immediate action needed | Minutes |
| Warning | 🟠 | Important, needs attention | 1 hour |
| Info | 🟡 | Informational only | As time allows |
Notification Channels
| Channel | Speed | Best For | Setup Time |
|---|---|---|---|
| Email | Slow | Documentation | 1 min |
| Slack | Fast | Team visibility | 2 min |
| PagerDuty | Immediate | Critical alerts | 5 min |
| Webhook | Depends | Custom integration | 10+ min |
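One way to apply this table is to route alerts to channels by severity. The mapping below is a sketch under assumptions: only the Critical to PagerDuty pairing comes from the "Best For" column, while the other pairings and the `channels_for` helper are hypothetical choices a team would adjust.

```python
from typing import List

# One possible severity-to-channel routing. Only Critical -> PagerDuty is
# taken from the table; the remaining pairings are assumptions.
ROUTING = {
    "Critical": ["PagerDuty", "Slack"],
    "Warning": ["Slack"],
    "Info": ["Email"],
}

def channels_for(severity: str) -> List[str]:
    """Return the channels for a severity, falling back to Email if unknown."""
    return ROUTING.get(severity, ["Email"])

print(channels_for("Critical"))  # ['PagerDuty', 'Slack']
print(channels_for("Info"))      # ['Email']
```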
Common Alert Examples
Website/API
| Alert | Threshold | Severity |
|---|---|---|
| Service Down | HTTP 503 | Critical |
| Slow Response | > 5 seconds | Warning |
| High Error Rate | > 5% | Warning |
| CPU High | > 80% | Warning |
Database
| Alert | Threshold | Severity |
|---|---|---|
| Service Down | Not responding | Critical |
| CPU High | > 80% | Warning |
| Memory High | > 90% | Warning |
| Low Disk Space | < 10% free | Critical |
Infrastructure
| Alert | Threshold | Severity |
|---|---|---|
| Server Down | Unreachable | Critical |
| Disk Full | < 5% free | Critical |
| High Latency | > 100ms | Warning |
| Network Error | > 1% loss | Warning |
Troubleshooting
Alert Won't Fire
Check:
- Is rule Enabled? (toggle should be ON)
- Is threshold correct? (Test with extreme value)
- Is metric actually hitting threshold?
Fix:
- Enable the rule
- Lower threshold temporarily to test
- Verify correct metric is selected
Not Getting Notifications
Check:
- Click "Test" on your channel
- Did you receive test?
- Check email spam folder
- Check Slack channel settings
- Verify email/phone is correct
Fix:
- Resend test
- Check spam folder
- Verify email/channel settings
- Check connection/credentials
Too Many Alerts
Solutions:
- Increase threshold (less sensitive)
- Disable low-priority rules
- Use a digest instead of individual emails
- Combine related alerts
Example:
Before: CPU > 70% = 50 alerts/day
After: CPU > 85% = 5 alerts/day
False Alarms
Why: Threshold too sensitive
Fix:
- Review when it fired
- Increase threshold
- Make condition more specific
- Or delete if not important
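One common reading of "make condition more specific" is to require the threshold to be exceeded for several readings in a row, so that a single brief spike does not fire. Whether the product offers such a setting is not stated here, so the sketch below is purely illustrative; `sustained_breach` and the sample data are hypothetical.

```python
def sustained_breach(samples, threshold, min_consecutive):
    """Return True if at least min_consecutive samples in a row exceed the threshold."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False

spiky = [60, 92, 61, 58, 93, 62]      # brief spikes only
sustained = [60, 86, 88, 91, 90, 62]  # a lasting problem

print(sustained_breach(spiky, 85, 3))      # False
print(sustained_breach(sustained, 85, 3))  # True
```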
Duplicate Alerts
Problem: Same issue creates multiple alerts
Fix:
- Delete duplicate rules
- Keep only one rule per metric
- Or adjust thresholds so only one fires
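Finding duplicates amounts to grouping rules by the metric they watch and flagging any metric with more than one rule. The sketch below assumes the rules are available as (name, metric) pairs, for example copied by hand from the Alert Rules list; the data and `find_duplicates` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (rule name, metric) pairs.
rules = [
    ("API: CPU above 80%", "cpu_percent"),
    ("API CPU high", "cpu_percent"),
    ("API: error rate above 5%", "error_rate"),
]

def find_duplicates(rule_list):
    """Group rule names by metric and return only metrics with more than one rule."""
    by_metric = defaultdict(list)
    for name, metric in rule_list:
        by_metric[metric].append(name)
    return {m: names for m, names in by_metric.items() if len(names) > 1}

print(find_duplicates(rules))
# {'cpu_percent': ['API: CPU above 80%', 'API CPU high']}
```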
Time-Saving Tips
Fastest to Respond
Get Slack notification (2 sec)
↓
Click link (1 sec)
↓
Acknowledge (5 sec)
↓
Investigate (5-30 min)
↓
Resolve (5 sec)
Total: < 1 minute to acknowledge
Fastest to Create Rule
New Rule → Pick template → Set threshold
→ Name it → Save → Enable
Total: ~1 minute
Fastest to Test Channel
Alert Config → Find channel → Click Test
→ Receive notification
Total: ~30 seconds
Key Locations
| Need | Path |
|---|---|
| Create rule | Alerts → Alert Rules |
| Configure notifications | Alerts → Alert Config |
| View alerts | SRE → Alerts |
| Help | ? button on page |
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| Too many alerts | Alert fatigue | Increase thresholds |
| Vague names | Confusion | Use specific names |
| No testing | Alerts don't work | Always test |
| Never adjusting | Gets worse over time | Review monthly |
| Not documenting | Team confusion | Document each rule |
Best Threshold Examples
CPU Usage
Normal: 20-50%
Peak: 60-70%
Alert: 80% ← Not too sensitive
Critical: 95% ← Only if really bad
Memory Usage
Normal: 30-60%
Peak: 70-80%
Alert: 90% ← Getting concerning
Critical: 95% ← Critically high
Response Time
Normal: 100-300ms
Slow: 500-1000ms
Alert: 2000ms ← Warning level
Critical: 5000ms ← Very slow
Error Rate
Normal: 0-0.5%
Alert: 1% ← One in a hundred
Critical: 5% ← One in twenty
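The Alert and Critical levels above can be read as two cut-offs per metric. The sketch below classifies a reading against them. The metric keys and `classify` are hypothetical names, and treating a value exactly at a level as having reached it (`>=`) is an assumption, since this page does not say whether the boundary is inclusive.

```python
# Alert and Critical levels copied from the examples above.
LEVELS = {
    "cpu_percent": (80, 95),
    "memory_percent": (90, 95),
    "response_ms": (2000, 5000),
    "error_rate_percent": (1, 5),
}

def classify(metric, value):
    """Return "Critical", "Alert", or "OK" for a reading of the given metric."""
    alert_at, critical_at = LEVELS[metric]
    if value >= critical_at:
        return "Critical"
    if value >= alert_at:
        return "Alert"
    return "OK"

print(classify("cpu_percent", 65))         # OK
print(classify("response_ms", 2500))       # Alert
print(classify("error_rate_percent", 5))   # Critical
```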
Checklist: First Alert Setup
- Created alert rule
- Set realistic threshold
- Chose appropriate severity
- Added clear name
- Set up notification channel
- Tested notification
- Linked rule to channel
- Enabled the rule
- Documented the rule
- Tested end-to-end
Do's and Don'ts
Do ✅
- ✅ Create alerts for important things
- ✅ Use clear, specific names
- ✅ Test before relying
- ✅ Respond quickly to alerts
- ✅ Review and adjust regularly
Don't ❌
- ❌ Create too many rules
- ❌ Use vague names
- ❌ Skip testing
- ❌ Ignore alerts
- ❌ Let rules grow stale
Quick Definitions
Alert Rule: Defines WHEN an alert should fire (the condition)
Notification Channel: Defines HOW you get notified (email, Slack, etc.)
Severity: How serious the alert is (Critical/Warning/Info)
Threshold: The value that triggers the alert (> 80%, < 10%, etc.)
Status: Current state of the alert (Firing/Acknowledged/Resolved)
Response Times
| Action | Time |
|---|---|
| Create rule | 2-3 min |
| Setup channel | 5-10 min |
| Acknowledge alert | 30 sec |
| Resolve alert | 30 sec |
| Test channel | 30 sec |
Support & Links
| Need | Contact |
|---|---|
| Questions | [email protected] |
| Help on page | ? button |
| Documentation | Alerts Overview |
| Best Practices | Best Practices |
Related Pages
- Alerts Overview - Start here
- Creating Alert Rules - How to create rules
- Alert Configuration - How to setup notifications
- Responding to Alerts - How to respond
- Best Practices - Advanced tips
Last Updated: January 2026
Need Help? [email protected]