Alert Management & Response | Incident Monitoring Guide | Nife Docs

Learn about the alerts on your dashboard and how to respond to them.

What are Alerts?#

Alerts are notifications about issues, warnings, or important events in your infrastructure. They help you:

Detect problems - Before they impact users
Respond quickly - To critical issues
Monitor health - Of your entire system
Track changes - Important events and updates

Alert Types#

Critical Alerts 🔴#

Immediate action required

Service down or unavailable
Data loss risk
Security issue
Resource exhaustion

Response time: Immediately (within minutes)

Warning Alerts 🟡#

Attention needed soon

High resource usage
Performance degradation
Configuration issues
Approaching limits

Response time: Within hours

Info Alerts ⚪#

Informational only

Successful deployment
Maintenance completed
Configuration changes
Routine information

Response time: None required (for reference only)

Alert Severity Levels#

Severity	Icon	Color	Meaning	Action
Critical	🔴	Red	Urgent issue	Immediate
Warning	🟡	Orange	Needs attention	Soon
Info	⚪	Blue	FYI	Reference

Reading Alert Messages#

Each alert shows:

Alert title - What the issue is
Severity - How urgent it is
Timestamp - When it occurred
Details - More information about the issue
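
The four fields above map naturally onto a small record type. The following is a minimal sketch for illustration only: the `Alert` and `Severity` names, the field names, and the `triage_order` helper are assumptions, not the platform's actual data model or API. It shows one way to represent an alert and sort a batch so the most urgent items come first.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from this guide; a lower value means more urgent."""
    CRITICAL = 0
    WARNING = 1
    INFO = 2


@dataclass(frozen=True)
class Alert:
    """Hypothetical record holding the four fields each alert shows."""
    title: str           # what the issue is
    severity: Severity   # how urgent it is
    timestamp: datetime  # when it occurred
    details: str = ""    # more information about the issue


def triage_order(alerts):
    """Sort most urgent first, then oldest first within a severity."""
    return sorted(alerts, key=lambda a: (a.severity, a.timestamp))


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # made-up timestamp
alerts = [
    Alert("Deployment successful", Severity.INFO, now),
    Alert("Application down", Severity.CRITICAL, now,
          "Payment Service unavailable for 5 minutes"),
    Alert("High CPU usage", Severity.WARNING, now,
          "API server 85% utilization"),
]
for alert in triage_order(alerts):
    print(f"[{alert.severity.name}] {alert.title}")
```

Using an `IntEnum` keeps the urgency ordering in one place, so sorting needs no separate lookup table.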

Example Alert Messages#

Critical Alert: "Application down: Payment Service unavailable for 5 minutes"

Severity: Critical
Action: Investigate immediately
Next step: Check app status, restart if needed

Warning Alert: "High CPU usage: API server 85% utilization"

Severity: Warning
Action: Monitor or scale up
Next step: Check performance, increase resources

Info Alert: "Deployment successful: New version of website deployed"

Severity: Info
Action: None required
Next step: Monitor for issues

Viewing Alerts#

On Dashboard#

Find the Active Alerts section
See up to 5 recent alerts
Click an alert for more details

Full Alert List#

Click View All in the Active Alerts section
Or navigate to Monitoring → Alerts
See complete alert history
Filter and search alerts
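
If you export alert history and want to filter it outside the dashboard, the logic might look like the sketch below. The dictionary keys (`title`, `severity`, `details`) and the `filter_alerts` function are hypothetical, chosen for illustration; the dashboard's own filter options may differ.

```python
def filter_alerts(alerts, severity=None, text=None):
    """Return alerts matching an optional severity and an optional
    case-insensitive text search over title and details."""
    results = []
    for alert in alerts:
        if severity is not None and alert["severity"] != severity:
            continue
        haystack = f'{alert["title"]} {alert.get("details", "")}'.lower()
        if text is not None and text.lower() not in haystack:
            continue
        results.append(alert)
    return results


# Sample data taken from the example messages in this guide.
history = [
    {"title": "Application down", "severity": "critical",
     "details": "Payment Service unavailable for 5 minutes"},
    {"title": "High CPU usage", "severity": "warning",
     "details": "API server 85% utilization"},
    {"title": "Deployment successful", "severity": "info",
     "details": "New version of website deployed"},
]

print(filter_alerts(history, severity="critical"))
print(filter_alerts(history, text="cpu"))
```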

Responding to Alerts#

Critical Alert Response#

Read the alert - Understand the issue
Assess impact - How does this affect users?
Take action:
- Restart service
- Scale up resources
- Roll back the deployment
- Contact support
Verify the fix - Confirm the issue is resolved
Document - Note what happened and how you fixed it

Warning Alert Response#

Investigate - Understand the cause
Monitor - Watch the situation
Take action if needed:
- Optimize performance
- Increase resources
- Fix configuration
Prevent recurrence - Plan long-term solution

Info Alert Response#

Review - Note the information
Archive - Mark as read if needed
Usually no action is required

Common Alert Scenarios#

Scenario: High CPU Usage Alert#

Alert: "CPU usage: 95% on app server"

Actions:

Check what's using CPU
Optimize code if possible
Increase instance size
Add more instances
Monitor improvement
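
When deciding how many instances to add, a rough estimate can help. The sketch below assumes total CPU demand stays constant and spreads evenly across instances, which real workloads only approximate. The function name and the 70% target are illustrative assumptions, not values from this guide.

```python
import math


def instances_needed(current_instances, current_cpu_pct, target_cpu_pct):
    """Estimate the instance count that brings average CPU down to the target.

    Assumes constant total demand, spread evenly. Never recommends
    fewer instances than are currently running.
    """
    if current_instances < 1 or current_cpu_pct < 0 or not 0 < target_cpu_pct <= 100:
        raise ValueError("invalid input")
    total_demand = current_instances * current_cpu_pct
    return max(current_instances, math.ceil(total_demand / target_cpu_pct))


# Two instances at 95% CPU, aiming for roughly 70% each.
print(instances_needed(2, 95, 70))  # 3
```

Treat the result as a starting point and confirm with monitoring after scaling.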

Scenario: Deployment Failed Alert#

Alert: "Deployment failed: Image pull error"

Actions:

Check Docker image registry
Verify credentials
Check image availability
Retry deployment
Investigate root cause
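
Retrying helps only with transient failures such as a brief registry outage; it will not fix wrong credentials or a missing image, so check those first. If you script the retry, spacing out the attempts avoids hammering the registry. This is a minimal sketch of retry with exponential backoff; `retry_with_backoff` and `flaky_deploy` are hypothetical names, and `flaky_deploy` merely stands in for whatever command or API call triggers your deployment.

```python
import time


def retry_with_backoff(action, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call action() until it succeeds, doubling the wait after each failure.

    Re-raises the last exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)


# Hypothetical stand-in for a deployment that fails twice, then succeeds.
calls = {"count": 0}


def flaky_deploy():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("Image pull error")
    return "deployed"


delays = []  # record the waits instead of actually sleeping
result = retry_with_backoff(flaky_deploy, attempts=3, base_delay=2.0,
                            sleep=delays.append)
print(result, delays)  # deployed [2.0, 4.0]
```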

Scenario: Database Connection Alert#

Alert: "Database connections: 450/500 limit"

Actions:

Identify slow or inefficient queries
Add connection pooling
Increase connection limit
Optimize queries
Monitor usage
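
In the alert above, 450 of 500 connections is 90% of the limit. Connection pooling caps how many connections an application opens by reusing a fixed set. The sketch below illustrates the idea with the standard library only; `ConnectionPool`, `usage_percent`, and `fake_connect` are hypothetical names, and in practice you would more likely use the pooling built into your database driver or framework.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Minimal fixed-size pool: at most `size` connections ever exist."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # open all connections up front

    @contextmanager
    def connection(self, timeout=5.0):
        # Raises queue.Empty if no connection frees up within the timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


def usage_percent(in_use, limit):
    """Share of the connection limit currently in use."""
    return 100 * in_use / limit


print(usage_percent(450, 500))  # 90.0

opened = []


def fake_connect():
    """Hypothetical stand-in for a real database driver's connect()."""
    conn = object()
    opened.append(conn)
    return conn


pool = ConnectionPool(fake_connect, size=2)
for _ in range(10):
    with pool.connection() as conn:
        pass  # run a query here
print(len(opened))  # 2: ten requests reused the same two connections
```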

Scenario: Service Down Alert#

Alert: "Application unavailable: API Service"

Actions:

Check service status immediately
Review recent changes
Check logs for errors
Restart service if safe
Roll back if necessary
Contact support if needed
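
For the first step, a quick scripted probe can confirm whether the service responds at all. This is a minimal sketch that assumes your service exposes an HTTP endpoint you can check; the `is_up` function is a hypothetical helper, not part of the platform.

```python
import urllib.request


def is_up(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers connection refused, DNS failure, and timeouts, plus
        # HTTP 4xx/5xx (urllib raises HTTPError, an OSError subclass).
        return False
```

Call it with your own health URL, for example `is_up("https://your-service.example/health")`, where the address is a placeholder. A `False` result tells you only that the check failed, so review recent changes and logs to find out why.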