Uptime Monitoring & SLA Tracking: Real-Time Application Availability | Nife Docs
Uptime Monitoring#
The Uptime tab provides real-time monitoring of your application's availability and performance. Track response times, uptime percentage, and SSL certificate expiration with minute-level granularity.
Overview#
Uptime monitoring is critical for maintaining service reliability. It continuously checks your application availability and provides real-time visibility into performance metrics. Use this data to identify issues quickly and maintain SLA compliance.
Getting Started#
Accessing Uptime Monitoring#
- Navigate to Monitoring from the main sidebar
- Click the Uptime tab
- Select your application from the dropdown
- Uptime data loads automatically
Selecting an Application#
- Click the Uptime tab
- From the dropdown, select the application you want to monitor
- Wait for metrics to load (typically a few seconds)
- Data refreshes automatically every 1-5 minutes
Initial Setup#
First Time Setup:
- Select your application
- System begins monitoring immediately
- Data collection starts in background
- Historical data available after 5+ minutes
Demo Mode:
- Demo accounts display sample data
- Use for testing and learning
- Real data available after upgrading
Time Range Selection#
View uptime data for different monitoring periods to match your needs:
Available Time Ranges#
Last 1 Hour#
Use Case: Immediate monitoring and troubleshooting
Best For:
- Current issue investigation
- Real-time status checking
- Rapid response to incidents
- Minute-by-minute analysis
Granularity: 1-minute intervals
Last 3 Hours#
Use Case: Identify recent patterns
Best For:
- Recent issue investigation
- Trend detection
- Performance analysis
- Root cause investigation
Granularity: 3-5 minute intervals
Last 24 Hours#
Use Case: Comprehensive daily overview
Best For:
- Daily health check
- Pattern identification
- SLA compliance verification
- Capacity planning
Granularity: 15-30 minute intervals
Switching Time Ranges#
- Click the time range dropdown
- Select desired period
- Chart updates automatically
- Metrics recalculate
Uptime Metrics#
The dashboard displays four critical metrics that together provide complete visibility into application health:
Response Time (ms)#
Current Response Time:
- Shows: The most recent response time measurement
- Unit: Milliseconds (ms)
- Update Frequency: Every 1-5 minutes
Healthy Range:
- Excellent: < 100ms
- Good: 100-200ms
- Acceptable: 200-500ms
- Poor: > 500ms
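The bands above can be expressed as a small lookup. This is only a sketch: the function name is hypothetical, and because the listed ranges overlap at 200ms and 500ms, it assumes a boundary value falls into the better band.

```python
def classify_response_time(ms: float) -> str:
    """Map a response time in milliseconds to the bands listed above.

    Assumption: boundary values (exactly 200 or 500 ms) count toward the
    better band, since the documented ranges overlap at those points.
    """
    if ms < 100:
        return "Excellent"
    if ms <= 200:
        return "Good"
    if ms <= 500:
        return "Acceptable"
    return "Poor"
```

For example, `classify_response_time(350)` falls in the "Acceptable" band.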
What It Measures:
- Time from sending the request to receiving the response
- Includes network latency
- Covers the network round trip between the monitoring location and your endpoint
- Not a measure of request processing time alone
Interpreting Values:
- Consistent low values: Good performance
- Increasing trend: Degradation
- Sudden spike: Potential issue
- High values: Slow application or network
Average Response Time#
Shows: Average response time across all measurements in selected period
Calculation:
- Sum of all response times ÷ number of measurements
- Excludes failed requests
- Smooths out temporary spikes
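A minimal sketch of the calculation described above. It assumes, purely for illustration, that a failed request is recorded as `None`; the function name and data shape are hypothetical and not part of the product.

```python
from typing import Optional, Sequence


def average_response_time(samples: Sequence[Optional[float]]) -> Optional[float]:
    """Average the successful measurements (ms), skipping failed ones.

    Assumption: a failed request is represented as None.
    Returns None when there are no successful measurements.
    """
    successful = [s for s in samples if s is not None]
    if not successful:
        return None
    return sum(successful) / len(successful)
```

With the samples `[120, None, 150, 165]`, the failed check is skipped and the average is taken over the remaining three values.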
Uses:
- Identify performance trends
- Detect gradual degradation
- Establish performance baseline
- Compare time periods
Tracking Tips:
- Note your baseline average
- Alert if increases > 20%
- Compare week-over-week
- Adjust based on time of day
Example:
- Current period avg: 145ms
- Previous day avg: 130ms
- Change: +15ms (about 11.5% increase)
- Indicates slight degradation
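The comparison in this example, together with the 20% alert guideline from the tracking tips, can be sketched as follows. Both function names are hypothetical helpers, not part of the dashboard.

```python
def percent_change(baseline_ms: float, current_ms: float) -> float:
    """Percentage change of the current average relative to the baseline."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (current_ms - baseline_ms) / baseline_ms * 100


def exceeds_threshold(baseline_ms: float, current_ms: float,
                      threshold_pct: float = 20.0) -> bool:
    """True when the increase over the baseline is above threshold_pct."""
    return percent_change(baseline_ms, current_ms) > threshold_pct
```

For the figures above, `percent_change(130, 145)` is roughly 11.5, which stays below the 20% guideline, so `exceeds_threshold(130, 145)` is false.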
Uptime Percentage#
Shows: Percentage of time your application was available
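Calculation: based on monitoring checks, that is (successful checks ÷ total checks) × 100 for the selected period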
Common Targets:
- 99.99% (Four 9s) - 52.6 minutes/year downtime
- 99.9% (Three 9s) - 8.76 hours/year downtime
- 99% (Two 9s) - 3.65 days/year downtime
- 95% - 18.25 days/year downtime
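The downtime figures for each target follow from simple arithmetic over a 365-day year. A short sketch (the function name is hypothetical) that reproduces them:

```python
def allowed_downtime_minutes(target_pct: float, period_days: float = 365) -> float:
    """Downtime budget in minutes for an uptime target over a period."""
    if not 0 <= target_pct <= 100:
        raise ValueError("target_pct must be between 0 and 100")
    minutes_in_period = period_days * 24 * 60
    return (1 - target_pct / 100) * minutes_in_period
```

Dividing the result by 60 or by 1440 gives hours or days. Passing `period_days=30` gives an approximate monthly budget, which can be handy when an SLA is measured per month.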
SLA Alignment:
- Premium tier: 99.9%+ uptime
- Standard tier: 99%+ uptime
- Basic tier: 95%+ uptime
Interpreting Values:
- 99%+: Excellent, production-ready
- 95-99%: Acceptable, monitor closely
- 90-95%: Needs investigation
- < 90%: Critical issues
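A sketch of how an uptime percentage could be derived and interpreted. It assumes uptime is the share of successful monitoring checks, which is what the "Based on monitoring checks" entry in the Limits table suggests; the exact formula used by the platform is not stated here, and the function names are hypothetical.

```python
def uptime_percentage(successful_checks: int, total_checks: int) -> float:
    """Share of successful checks, as a percentage (assumed formula)."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return successful_checks / total_checks * 100


def interpret_uptime(pct: float) -> str:
    """Map an uptime percentage to the interpretation bands above."""
    if pct >= 99:
        return "Excellent"
    if pct >= 95:
        return "Acceptable"
    if pct >= 90:
        return "Needs investigation"
    return "Critical"
```

For instance, 1437 successful checks out of 1440 (one check per minute for a day) gives roughly 99.79%, which lands in the "Excellent" band.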
Certificate Expiration#
Shows: Days remaining on your SSL/TLS certificate
Status Indicators:
- ✅ > 30 days: Healthy
- ⚠️ 15-30 days: Action needed soon
- 🔴 < 15 days: Critical, renew immediately
- ❌ Expired: Certificate invalid
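The indicator thresholds can be written as a small function. This is a sketch with a hypothetical name; it assumes that zero or fewer days remaining means the certificate has expired.

```python
def certificate_status(days_remaining: int) -> str:
    """Map days remaining on a certificate to the indicators above.

    Assumption: zero or negative days remaining is treated as expired.
    """
    if days_remaining <= 0:
        return "Expired"
    if days_remaining < 15:
        return "Critical"
    if days_remaining <= 30:
        return "Action needed soon"
    return "Healthy"
```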
Action Timeline:
- 60 days before: Plan renewal
- 30 days before: Begin renewal process
- 15 days before: High priority
- On expiration: Emergency procedures
Renewal Process:
- Request new certificate from provider
- Validate domain ownership
- Install certificate on server
- Verify installation
- Test in monitoring
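To verify an installation independently of the dashboard, the certificate's expiry date can be read directly from the server. The sketch below uses only the Python standard library; the function names are hypothetical, and it is not how the platform itself performs the check.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter date.

    `not_after` uses the format returned by getpeercert(),
    for example "Jan  1 00:00:00 2030 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a TLS connection and return the server certificate's notAfter."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Calling `days_until_expiry(fetch_not_after("your-app.example.com"))` (a placeholder hostname) returns the remaining validity in days. Note that the default context verifies the certificate, so an already expired or misconfigured certificate raises an `ssl.SSLError` instead of returning a date.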
Common Certificates:
- Let's Encrypt: Free, 90-day validity
- Cloudflare: Free with plan
- DigiCert: Premium, longer validity
- AWS Certificate Manager: Free with AWS
Visual Status Indicator#
Status Bar Visualization#
The horizontal colored bar shows real-time application status:
Bar Composition:
- Divided into segments (1 per monitoring check)
- Each segment represents a time period
- Color indicates status for that period
- Hover to see details
Status Colors#
Green Segments#
- Meaning: Successful response (application up)
- Response: Normal
- Action: None required
Red Segments#
- Meaning: Failed response (application down)
- Response: Application unavailable or slow
- Action: Investigate immediately
Yellow Segments#
- Meaning: Slow response (degraded performance)
- Response: Application slow but available
- Action: Monitor for worsening
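The color rules can be sketched as a function of a single check result. The exact thresholds the status bar uses are not documented here, so this sketch makes two assumptions: a missing status code or a code of 400 and above counts as a failure, and "slow" means above 500ms (the "Poor" band). The function name is hypothetical.

```python
from typing import Optional


def segment_color(status_code: Optional[int], response_ms: Optional[float],
                  slow_threshold_ms: float = 500.0) -> str:
    """Pick a status-bar color for one monitoring check.

    Assumptions: None means the check timed out; status codes >= 400 are
    failures; responses slower than slow_threshold_ms are degraded.
    """
    if status_code is None or response_ms is None:
        return "red"
    if status_code >= 400:
        return "red"
    if response_ms > slow_threshold_ms:
        return "yellow"
    return "green"
```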
Interactive Status Bar#
Hovering Over Segments: Displays detailed information:
- Response time in milliseconds
- Status code (200, 500, timeout, etc.)
- Exact timestamp of check
- Duration of check
Uses:
- Identify exact failure times
- Correlate with events
- Track issue resolution
- Document incidents
Uptime Chart#
Chart Features#
An area chart displaying response times over your selected period:
Chart Elements:
- X-Axis: Time progression (hours or minutes)
- Y-Axis: Response time in milliseconds
- Area: Response time trend
- Line: Response time average
Reading the Chart#
Flat Line:
- Consistent performance
- No degradation
- Stable application
Upward Trend:
- Performance degrading
- Increasing load
- Resource constraints
Sudden Spike:
- Temporary issue
- Brief unavailability
- Traffic spike
Downward Dip (Bottom):
- Complete outage
- Application down
- No response received
Identifying Patterns#
Daily Pattern:
- Peak hours: Higher response times
- Off-hours: Lower response times
- Expected and normal
Weekly Pattern:
- Weekdays: Consistent pattern
- Weekends: Different pattern
- Predictable variations
Scheduled Maintenance:
- Planned downtime visible
- Duration: Usually 30-60 minutes
- Normal and expected
Common Chart Scenarios#
Healthy Application:
- Flat response times (< 200ms)
- No downtime periods
- Predictable daily pattern
- No unexpected spikes
Degraded Performance:
- Gradually increasing response times
- Intermittent timeouts
- Increasing failures
- Needs investigation
Outage:
- Sudden drop to zero/timeout
- Sustained period of red
- Clear start and end time
- Visible on status bar
Interpreting Uptime Data#
Healthy Status#
Indicators:
- ✅ Status bar mostly solid green
- ✅ Response times < 200ms
- ✅ Response time trend flat
- ✅ Uptime percentage > 99.5%
- ✅ Certificate valid (> 30 days)
Action: Continue normal operations
Warning Signs#
Indicators:
- ⚠️ Response times increasing
- ⚠️ Occasional red segments
- ⚠️ Uptime dropping below 99%
- ⚠️ Certificate < 30 days
- ⚠️ Increased failure rate
Actions:
- Investigate cause of degradation
- Check application logs
- Review resource usage
- Plan certificate renewal
- Monitor closely
Critical Issues#
Indicators:
- 🔴 Multiple consecutive red segments
- 🔴 Response times > 1000ms
- 🔴 Uptime below 95%
- 🔴 Certificate expired
- 🔴 Persistent failures
Immediate Actions:
- Alert team immediately
- Check application status
- Review recent changes
- Start incident response
- Communicate to users
- Renew certificate if expired
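The three status levels can be combined into one evaluation. This is a sketch under stated assumptions: it uses only the numeric indicators (uptime, response time, certificate days), checks critical conditions first, and treats anything that is neither critical nor fully healthy as a warning. The function name is hypothetical.

```python
def overall_status(uptime_pct: float, response_ms: float,
                   cert_days_remaining: int) -> str:
    """Classify application health from the numeric indicators above.

    Assumption: values that fall between the healthy and warning
    thresholds (for example 99.2% uptime) are reported as a warning.
    """
    if uptime_pct < 95 or response_ms > 1000 or cert_days_remaining <= 0:
        return "critical"
    if uptime_pct > 99.5 and response_ms < 200 and cert_days_remaining > 30:
        return "healthy"
    return "warning"
```

Indicators such as consecutive red segments or a rising trend need the check history rather than a single snapshot, so they are left out of this sketch.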
Common Monitoring Scenarios#
Scenario 1: Detect Outage#
Situation: Application stops responding
Indicators:
- Red segments appear in status bar
- Response time drops to zero/timeout
- Chart shows downward spike
- Uptime percentage decreases
Steps to Investigate:
- Check exact time of failure
- Review application logs
- Check infrastructure status
- Verify DNS resolution
- Check network connectivity
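The DNS and connectivity steps can be run from any machine with a short script. The sketch below uses only the Python standard library; the function name and the shape of the returned dictionary are illustrative choices.

```python
import socket
import time


def check_endpoint(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Resolve a host, then try a TCP connection and time it."""
    result = {"dns_ok": False, "tcp_ok": False, "addresses": [], "connect_ms": None}
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return result  # DNS resolution failed; no point trying to connect
    result["dns_ok"] = True
    result["addresses"] = sorted({info[4][0] for info in infos})

    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["connect_ms"] = (time.perf_counter() - start) * 1000
            result["tcp_ok"] = True
    except OSError:
        pass  # refused, unreachable, or timed out
    return result
```

A result with `dns_ok` true and `tcp_ok` false points toward the application, firewall, or network path rather than DNS. Running it from more than one location helps separate a regional network issue from a real outage.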
Resolution:
- Restart application if needed
- Check recent deployments
- Review resource usage
- Fix underlying cause
- Monitor for recurrence
Scenario 2: Performance Degradation#
Situation: Response times slowly increasing
Indicators:
- Response times trending upward
- Chart shows upward slope
- Occasional red segments
- Uptime still high
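One way to quantify "trending upward" is the least-squares slope of response time against the measurement index. This is a sketch of that general technique, not the dashboard's own method, and it assumes evenly spaced measurements.

```python
from typing import Sequence


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against their index (ms per interval).

    Assumption: measurements are evenly spaced in time.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator
```

A slope near zero corresponds to a flat chart, while a persistently positive slope over the 3-hour or 24-hour range matches the upward slope described above.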
Root Causes:
- Increased traffic
- Memory leak
- Database slowdown
- Resource constraints
- Network latency
Investigation:
- Check traffic volume
- Review application metrics
- Check database performance
- Monitor system resources
- Review recent changes
Solutions:
- Scale application horizontally
- Optimize code/queries
- Increase resources
- Clear caches
- Deploy fix
Scenario 3: Traffic Spike#
Situation: Response times spike during peak time
Indicators:
- Response times increase during peak hours
- Pattern repeats daily
- Returns to normal after peak
- No outages occur
Analysis:
- This is expected behavior
- Peak is predictable
- Application recovers normally
- No critical issue
Optimization:
- Scale during peak times
- Increase baseline resources
- Implement caching
- Optimize code
- Use CDN
Scenario 4: SSL Certificate Expiration#
Situation: Certificate expiration approaching
Indicators:
- Certificate expiration days: 15-30
- Yellow warning in metric
- Browser warnings if expired
Action Plan:
- Week 1: Request new certificate
- Week 2: Validate and install
- Week 3: Verify installation
- Day before: Final check
- After renewal: Monitor
Best Practices#
Daily Monitoring Routine#
- ✅ Check uptime percentage each morning
- ✅ Review response time trend
- ✅ Look for red segments
- ✅ Check certificate expiration (weekly)
Performance Management#
- ✅ Maintain target uptime > 99%
- ✅ Keep response time < 200ms
- ✅ Investigate degradation > 20%
- ✅ Document response to incidents
Certificate Management#
- ✅ Set reminder at 60 days before expiration
- ✅ Renew certificate at 30 days before
- ✅ Test in staging before production
- ✅ Verify installation after renewal
Incident Response#
- ✅ Alert team on outage
- ✅ Collect monitoring data
- ✅ Document timeline
- ✅ Perform root cause analysis
- ✅ Implement preventive measures
Troubleshooting#
No Uptime Data Showing#
Problem: "No Uptime Data Available" message
Solutions:
- Ensure application is selected
- Verify application is deployed
- Check application is receiving traffic
- Wait 5+ minutes for initial data
- Verify DNS records configured
- Check firewall rules allow monitoring
Data Shows Downtime But App is Up#
Problem: Monitoring shows outage, but app appears working
Possible Causes:
- Monitoring endpoint different from user endpoint
- Firewall blocking monitoring requests
- Regional network issue
- Application partially down
- DNS issue
Investigation:
- Test application from multiple locations
- Check network connectivity
- Verify DNS resolution
- Review application logs
- Check firewall rules
Certificate Expiration Showing Incorrectly#
Problem: Certificate days showing wrong value
Solutions:
- Verify certificate is installed correctly
- Check system clock is correct
- Refresh monitoring dashboard
- Wait for next automatic check
- Contact certificate provider
Response Times Unrealistic#
Problem: Response times don't match expectations
Causes:
- Monitoring from distant region
- Network latency included in the measurement
- Large response payload
- Slow network connection
- Slow DNS resolution
Verification:
- Test locally
- Test from other regions
- Check network connectivity
- Review payload size
- Benchmark manually
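A manual benchmark can be as simple as timing a few requests from your own machine and comparing the result with the dashboard. The sketch below uses only the Python standard library; all function names are hypothetical, and the numbers it produces include your own network latency.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def time_calls(fn: Callable[[], object], runs: int = 5) -> List[float]:
    """Call fn `runs` times and return each duration in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000)
    return timings


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download the full response body, so payload size is included."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def summarize(timings_ms: List[float]) -> Dict[str, float]:
    """Minimum, median, and maximum of a list of timings."""
    return {
        "min": min(timings_ms),
        "median": statistics.median(timings_ms),
        "max": max(timings_ms),
    }
```

For example, `summarize(time_calls(lambda: fetch("https://your-app.example.com/")))` (a placeholder URL) gives a quick local baseline. Repeating it from another region shows how much of the reported time is distance rather than application latency.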
Limits & Considerations#
| Item | Limit |
|---|---|
| Data Granularity | 1-5 minute intervals |
| Historical Data | Full history (unlimited) |
| Response Time Accuracy | ±50ms |
| Uptime Calculation | Based on monitoring checks |
| Certificate Check | Daily verification |
| Monitoring Locations | Globally distributed |
Related Documentation#
- Monitoring Overview - Overview of all monitoring
- HTTP Traffic - HTTP traffic analysis
- DNS Metrics - DNS performance monitoring
- DNS Analytics - Advanced DNS analysis
- Alerts - Set up uptime alerts
- Applications Management - Manage your applications