Managing Dashboard Alerts

Learn about the alerts on your dashboard and how to respond to them.

What are Alerts?

Alerts are notifications about issues, warnings, or important events in your infrastructure. They help you:

  • Detect problems - Before they impact users
  • Respond quickly - To critical issues
  • Monitor health - Of your entire system
  • Track changes - Important events and updates

Alert Types

Critical Alerts 🔴

Immediate action required

  • Service down or unavailable
  • Data loss risk
  • Security issue
  • Resource exhaustion

Response time: Immediately (within minutes)

Warning Alerts 🟡

Attention needed soon

  • High resource usage
  • Performance degradation
  • Configuration issues
  • Approaching limits

Response time: Within hours

Info Alerts ⚪

Informational only

  • Successful deployment
  • Maintenance completed
  • Configuration changes
  • Routine information

Response time: For reference

Alert Severity Levels

Severity | Icon | Color  | Meaning         | Action
Critical | 🔴   | Red    | Urgent issue    | Immediate
Warning  | 🟡   | Orange | Needs attention | Soon
Info     |      | Blue   | FYI             | Reference
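
If you track alerts in your own scripts, the severity levels above can be modeled as a small lookup. This is a minimal sketch, not part of the dashboard: the names `Severity`, `RESPONSE`, and `sort_by_urgency` are hypothetical, and the response strings are copied from the "Response time" lines in the Alert Types section.

```python
from enum import Enum


class Severity(Enum):
    """The three severity levels described in this guide."""
    CRITICAL = "critical"
    WARNING = "warning"
    INFO = "info"


# Expected response per severity, copied from the "Response time" lines
# in the Alert Types section (the mapping itself is illustrative).
RESPONSE = {
    Severity.CRITICAL: "Immediately (within minutes)",
    Severity.WARNING: "Within hours",
    Severity.INFO: "For reference",
}

# Most urgent first; used as the sort order below.
_URGENCY_ORDER = [Severity.CRITICAL, Severity.WARNING, Severity.INFO]


def sort_by_urgency(severities):
    """Return the given severities ordered most urgent first."""
    return sorted(severities, key=_URGENCY_ORDER.index)
```

Sorting by urgency mirrors the advice in this guide: handle critical alerts first, then warnings, then informational items.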

Reading Alert Messages

Each alert shows:

  • Alert title - What the issue is
  • Severity - How urgent it is
  • Timestamp - When it occurred
  • Details - More information about the issue

Example Alert Messages

Critical Alert: "Application down: Payment Service unavailable for 5 minutes"

  • Severity: Critical
  • Action: Investigate immediately
  • Next step: Check app status, restart if needed

Warning Alert: "High CPU usage: API server 85% utilization"

  • Severity: Warning
  • Action: Monitor or scale up
  • Next step: Check performance, increase resources

Info Alert: "Deployment successful: New version of website deployed"

  • Severity: Info
  • Action: None required
  • Next step: Monitor for issues
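
The four fields each alert shows (title, severity, timestamp, details) map naturally onto a small record. The sketch below assumes, based only on the example messages above, that a message has the shape "title: details". The `Alert` class and `from_message` helper are hypothetical and are not a dashboard API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Alert:
    """One alert, with the four fields listed under "Each alert shows"."""
    title: str
    severity: str
    timestamp: datetime
    details: str


def from_message(message: str, severity: str, timestamp: datetime) -> Alert:
    """Build an Alert from a 'title: details' message.

    Assumption: the text before the first colon is the title and the
    rest is the details. A message without a colon becomes a title
    with empty details.
    """
    title, _, details = message.partition(":")
    return Alert(title.strip(), severity, timestamp, details.strip())
```

For example, `from_message("High CPU usage: API server 85% utilization", "Warning", datetime.now())` yields a title of "High CPU usage" and details of "API server 85% utilization".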

Viewing Alerts

On Dashboard

  1. Find the Active Alerts section
  2. See up to 5 recent alerts
  3. Click an alert for more details

Full Alert List

  1. Click View All in the alerts section
  2. Or navigate to Monitoring > Alerts
  3. See complete alert history
  4. Filter and search alerts

Responding to Alerts

Critical Alert Response

  1. Read the alert - Understand the issue
  2. Assess impact - How does this affect users?
  3. Take action:
    • Restart service
    • Scale up resources
    • Rollback deployment
    • Contact support
  4. Verify fix - Confirm issue is resolved
  5. Document - Note what happened and how you fixed it

Warning Alert Response

  1. Investigate - Understand the cause
  2. Monitor - Watch the situation
  3. Take action if needed:
    • Optimize performance
    • Increase resources
    • Fix configuration
  4. Prevent recurrence - Plan long-term solution

Info Alert Response

  1. Review - Note the information
  2. Archive - Mark as read if needed
  3. No action usually required

Common Alert Scenarios

Scenario: High CPU Usage Alert

Alert: "CPU usage: 95% on app server"

Actions:

  1. Check what's using CPU
  2. Optimize code if possible
  3. Increase instance size
  4. Add more instances
  5. Monitor improvement

Scenario: Deployment Failed Alert

Alert: "Deployment failed: Image pull error"

Actions:

  1. Check Docker image registry
  2. Verify credentials
  3. Check image availability
  4. Retry deployment
  5. Investigate root cause

Scenario: Database Connection Alert

Alert: "Database connections: 450/500 limit"

Actions:

  1. Check database query efficiency
  2. Add connection pooling
  3. Increase connection limit
  4. Optimize queries
  5. Monitor usage
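
To see why "450/500" deserves attention, it helps to express it as a percentage of the limit: 450 of 500 connections is 90% utilization. The sketch below works through that arithmetic. The 80% and 95% cut-offs are hypothetical values chosen for illustration, not the dashboard's actual thresholds.

```python
def utilization_pct(used: int, limit: int) -> float:
    """Return usage as a percentage of the limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * used / limit


def classify(used: int, limit: int,
             warn_pct: float = 80.0, crit_pct: float = 95.0) -> str:
    """Classify usage against two thresholds.

    warn_pct and crit_pct are illustrative defaults (assumptions),
    not values taken from the product.
    """
    pct = utilization_pct(used, limit)
    if pct >= crit_pct:
        return "critical"
    if pct >= warn_pct:
        return "warning"
    return "ok"
```

With these assumed thresholds, `classify(450, 500)` returns "warning", which leaves room to add pooling or raise the limit before the remaining 50 connections are used up.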

Scenario: Service Down Alert

Alert: "Application unavailable: API Service"

Actions:

  1. Check service status immediately
  2. Review recent changes
  3. Check logs for errors
  4. Restart service if safe
  5. Rollback if necessary
  6. Contact support if needed

Alert Management

Marking Alerts as Read

  1. In the alerts section, click Mark all as read
  2. Or click an individual alert to mark it as read
  3. Alerts you have read stay visible but are marked as read

Viewing Alert History

  1. Go to Monitoring > Alerts
  2. See all alerts (new and old)
  3. Filter by severity
  4. Filter by date range
  5. Search by keyword
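
If you keep your own copy of alert history (for example, in a script or spreadsheet export), the same three filters can be combined in a few lines. This is a sketch under assumptions: the `Alert` record and `filter_alerts` function are hypothetical and do not represent a dashboard API or export format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class Alert:
    """Hypothetical alert record with the fields shown on each alert."""
    title: str
    severity: str
    timestamp: datetime
    details: str = ""


def filter_alerts(alerts: Iterable[Alert],
                  severity: Optional[str] = None,
                  start: Optional[datetime] = None,
                  end: Optional[datetime] = None,
                  keyword: Optional[str] = None) -> List[Alert]:
    """Keep alerts that match every filter that was provided."""
    needle = keyword.lower() if keyword else None
    matches = []
    for alert in alerts:
        # Severity filter (case-insensitive).
        if severity and alert.severity.lower() != severity.lower():
            continue
        # Date range filter (inclusive on both ends).
        if start and alert.timestamp < start:
            continue
        if end and alert.timestamp > end:
            continue
        # Keyword search over title and details.
        text = f"{alert.title} {alert.details}".lower()
        if needle and needle not in text:
            continue
        matches.append(alert)
    return matches
```

Filters left as `None` are ignored, so calling `filter_alerts(alerts)` with no arguments returns the full history.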

Setting Alert Rules

Create custom alerts for:

  • Specific resources
  • Threshold values
  • Application errors
  • Performance metrics

See Alert Configuration for details.

Best Practices for Alert Management

  • Act quickly on critical alerts - Don't delay
  • Read the full message - Understand context
  • Document responses - Keep records
  • Set up notifications - Get alerted via email or Slack
  • Review alert history - Identify patterns
  • Adjust thresholds - Reduce false alarms
  • Team communication - Notify team of issues
  • Escalate if needed - Contact support for help

Preventing Alerts

Proactive Monitoring

  1. Regular checks - Review metrics daily
  2. Capacity planning - Don't run near limits
  3. Code optimization - Reduce resource usage
  4. Health checks - Ensure services are responding
  5. Load testing - Test before high-traffic events

Configuration

  1. Set reasonable thresholds - Not too sensitive
  2. Right-size resources - Match actual needs
  3. Plan growth - Scale before hitting limits
  4. Automate scaling - Use auto-scaling rules
  5. Redundancy - Have backups for critical services
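
One common way to keep a threshold from being too sensitive is to alert only when a metric stays above it for several consecutive readings, so a single spike does not trigger an alert. The sketch below illustrates that idea only. Whether the dashboard's alert rules support it is not stated here, and the function name and default of three readings are assumptions.

```python
def sustained_breach(samples, threshold, min_consecutive=3):
    """Return True if at least `min_consecutive` readings in a row
    are strictly above `threshold`.

    samples: metric readings in time order (e.g. CPU percentages).
    """
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    run = 0  # length of the current streak of readings above threshold
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

With a threshold of 85, the readings `[50, 96, 60, 97, 55]` contain two isolated spikes and do not count as a sustained breach, while `[86, 90, 95, 70]` does.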

Alert Notifications

Email Notifications

  • Receive critical alerts by email
  • Urgent alerts are sent immediately
  • Less urgent alerts are grouped into digest emails

Slack Notifications

  • Real-time alerts in Slack
  • Integrate with your workflow
  • Team visibility

Configure Notifications

  1. Go to Settings > Notifications
  2. Choose notification method
  3. Select alert types to receive
  4. Set notification schedule
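
The email behavior described above (immediate delivery for urgent alerts, digests for the rest, and only for the alert types you selected) can be pictured as a simple routing rule. This is an illustrative sketch, not the product's implementation: treating only "critical" as urgent is an assumption, and `split_for_delivery` is a hypothetical name.

```python
def split_for_delivery(alerts, enabled_severities):
    """Split alerts into those sent immediately and those held for a digest.

    alerts: iterable of (severity, message) pairs, severity in lowercase.
    enabled_severities: the alert types the user chose to receive.
    Assumption: only "critical" counts as urgent.
    """
    immediate, digest = [], []
    for severity, message in alerts:
        if severity not in enabled_severities:
            continue  # the user did not select this alert type
        if severity == "critical":
            immediate.append(message)
        else:
            digest.append(message)
    return immediate, digest
```

For instance, with only "critical" and "warning" enabled, an info alert is dropped, a critical alert goes out immediately, and a warning waits for the next digest.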

Troubleshooting

Not Receiving Alerts?

  1. Check notification settings
  2. Verify email address
  3. Check Slack workspace connection
  4. Look in spam/junk folder
  5. Contact support if still not working

Getting Too Many Alerts?

  1. Adjust threshold values
  2. Remove rules that trigger false alarms
  3. Group related alerts
  4. Filter less important severities
  5. Set quiet hours if available
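
As a rough illustration of grouping related alerts, the sketch below buckets messages by the text before the first colon, so repeated alerts about the same issue collapse into one group. It assumes messages follow the "title: details" shape used in this guide's examples. `group_alerts` is a hypothetical helper, not a dashboard feature.

```python
from collections import defaultdict


def group_alerts(messages):
    """Group alert messages by title (the text before the first colon).

    Returns a dict mapping the lowercased title to the list of
    original messages that share it, in input order.
    """
    groups = defaultdict(list)
    for message in messages:
        title = message.partition(":")[0].strip().lower()
        groups[title].append(message)
    return dict(groups)
```

Groups with many entries are good candidates for a threshold adjustment, since they often indicate one underlying issue reported repeatedly.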

Alert Seems Wrong?

  1. Verify the data it's based on
  2. Check system status independently
  3. Investigate recent changes
  4. Consider updating threshold
  5. Report to support if it's a bug