Uptime Monitoring & SLA Tracking: Real-Time Application Availability | Nife Docs

Uptime Monitoring#

The Uptime tab provides real-time monitoring of your application's availability and performance. Track response times, uptime percentage, and SSL certificate expiration with minute-level granularity.

Overview#

Uptime monitoring is critical for maintaining service reliability. It continuously checks your application availability and provides real-time visibility into performance metrics. Use this data to identify issues quickly and maintain SLA compliance.

Getting Started#

Accessing Uptime Monitoring#

Navigate to Monitoring from the main sidebar
Click the Uptime tab
Select your application from the dropdown
Uptime data loads automatically

Selecting an Application#

Click the Uptime tab
From the dropdown, select the application you want to monitor
Wait for metrics to load (typically a few seconds)
Data refreshes automatically every 1-5 minutes

Initial Setup#

First Time Setup:

Select your application
System begins monitoring immediately
Data collection starts in background
Historical data available after 5+ minutes

Demo Mode:

Demo accounts display sample data
Use for testing and learning
Real data available after upgrading

Time Range Selection#

View uptime data for different monitoring periods to match your needs:

Available Time Ranges#

Last 1 Hour#

Use Case: Immediate monitoring and troubleshooting

Best For:

Current issue investigation
Real-time status checking
Rapid response to incidents
Minute-by-minute analysis

Granularity: 1-minute intervals

Last 3 Hours#

Use Case: Identify recent patterns

Best For:

Recent issue investigation
Trend detection
Performance analysis
Root cause investigation

Granularity: 3-5 minute intervals

Last 24 Hours#

Use Case: Comprehensive daily overview

Best For:

Daily health check
Pattern identification
SLA compliance verification
Capacity planning

Granularity: 15-30 minute intervals
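
One way to picture the coarser granularity of the longer ranges is as 1-minute samples averaged into larger buckets. The sketch below assumes simple averaging; the dashboard's actual aggregation method is not documented here, and `downsample` and the sample values are hypothetical.

```python
from statistics import mean


def downsample(samples_ms, bucket_size):
    """Average consecutive samples into buckets of `bucket_size` samples each."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    return [
        mean(samples_ms[i:i + bucket_size])
        for i in range(0, len(samples_ms), bucket_size)
    ]


# Hypothetical 1-minute response times (ms) covering 30 minutes.
one_minute = [120] * 15 + [180] * 15

# Collapse them into 15-minute points, as a 24-hour view might.
print(downsample(one_minute, 15))
```

Averaging hides short spikes, which is why the 1-hour view is the better choice for minute-by-minute analysis.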

Switching Time Ranges#

Click the time range dropdown
Select desired period
Chart updates automatically
Metrics recalculate

Uptime Metrics#

The dashboard displays four critical metrics that together provide complete visibility into application health:

Response Time (ms)#

Current Response Time:

Shows: The most recent response time measurement
Unit: Milliseconds (ms)
Update Frequency: Every 1-5 minutes

Healthy Range:

Excellent: < 100ms
Good: 100-200ms
Acceptable: 200-500ms
Poor: > 500ms
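
The ranges above can be expressed as a small lookup. This is a sketch only: `classify_response_time` is a hypothetical helper, and because the listed ranges share the 200ms boundary and do not assign exactly 500ms, the boundary handling below is an assumption.

```python
def classify_response_time(ms):
    """Map a response time in milliseconds to the rating bands listed above.

    Assumption: each band includes its lower bound, and exactly 500 ms
    still counts as "Acceptable".
    """
    if ms < 0:
        raise ValueError("response time cannot be negative")
    if ms < 100:
        return "Excellent"
    if ms < 200:
        return "Good"
    if ms <= 500:
        return "Acceptable"
    return "Poor"


for sample in (85, 145, 320, 750):
    print(sample, classify_response_time(sample))
```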

What It Measures:

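Total time from sending the request to receiving the response
Includes network latency
Covers the network round trip between the monitoring location and your endpoint
Is not the same as server-side request processing time alone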

Interpreting Values:

Consistent low values: Good performance
Increasing trend: Degradation
Sudden spike: Potential issue
High values: Slow application or network

Average Response Time#

Shows: Average response time across all measurements in selected period

Calculation:

Sum of all response times ÷ number of measurements
Excludes failed requests
Smooths out temporary spikes
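
The calculation above can be sketched as follows. Representing a failed request as `None` is an assumption made for illustration, and `average_response_time` is a hypothetical helper, not part of the product.

```python
from typing import Optional, Sequence


def average_response_time(samples: Sequence[Optional[float]]) -> Optional[float]:
    """Average the response times (ms) of successful checks only.

    Failed checks are modelled as None and left out of both the sum
    and the count. Returns None when every check failed.
    """
    successful = [s for s in samples if s is not None]
    if not successful:
        return None
    return sum(successful) / len(successful)


# Hypothetical measurements: the third check failed.
print(average_response_time([100, 200, None, 150]))
```

Because failures are excluded, a period with many outages can still show a healthy average, so read this metric together with the uptime percentage.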

Uses:

Identify performance trends
Detect gradual degradation
Establish performance baseline
Compare time periods

Tracking Tips:

Note your baseline average
Alert if the average rises more than 20% above your baseline
Compare week-over-week
Adjust based on time of day

Example:

Current period avg: 145ms
Previous day avg: 130ms
Change: +15ms (about 11.5% increase)
Indicates slight degradation, still below a 20% alert threshold
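
The comparison in the example can be reproduced with a few lines of arithmetic. The function names are hypothetical, and the 20% default mirrors the tracking tip above rather than any built-in alert rule.

```python
def percent_change(current_ms, baseline_ms):
    """Relative change of the current average against the baseline, in percent."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return (current_ms - baseline_ms) / baseline_ms * 100


def exceeds_threshold(current_ms, baseline_ms, threshold_percent=20.0):
    """True when the average has risen by more than the threshold."""
    return percent_change(current_ms, baseline_ms) > threshold_percent


change = percent_change(145, 130)
print(f"{change:.1f}% change, alert: {exceeds_threshold(145, 130)}")
```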

Uptime Percentage#

Shows: Percentage of time your application was available

Calculation:

(Successful Responses ÷ Total Monitoring Checks) × 100
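
As a minimal sketch of the formula above, assuming you have the two counts available (the function name and the sample counts are illustrative):

```python
def uptime_percentage(successful_checks, total_checks):
    """(Successful Responses / Total Monitoring Checks) x 100."""
    if total_checks <= 0:
        raise ValueError("at least one monitoring check is required")
    if not 0 <= successful_checks <= total_checks:
        raise ValueError("successful checks must be between 0 and total checks")
    return successful_checks / total_checks * 100


# Hypothetical day of 1-minute checks with one failed check.
print(f"{uptime_percentage(1439, 1440):.3f}%")
```

Note that the result depends on the check interval: with checks several minutes apart, a single failed check removes a larger share of the total than it would with 1-minute checks.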

Common Targets:

99.99% (four 9s) - about 52.6 minutes/year downtime
99.9% (three 9s) - about 8.8 hours/year downtime
99% (two 9s) - about 3.65 days/year downtime
95% - 18.25 days/year downtime
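
The downtime figures follow directly from the target percentage. The sketch below assumes a 365-day year; `allowed_downtime` is a hypothetical helper.

```python
from datetime import timedelta


def allowed_downtime(target_percent, period=timedelta(days=365)):
    """Downtime budget that a given uptime target leaves over `period`."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target must be between 0 and 100")
    return period * (1 - target_percent / 100)


for target in (99.99, 99.9, 99.0, 95.0):
    budget = allowed_downtime(target)
    print(f"{target}% -> {budget.total_seconds() / 3600:.2f} hours/year")
```

Passing a different `period`, for example `timedelta(days=30)`, gives the budget for a monthly SLA window instead.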

SLA Alignment:

Premium tier: 99.9%+ uptime
Standard tier: 99%+ uptime
Basic tier: 95%+ uptime

Interpreting Values:

99%+: Excellent, production-ready
95-99%: Acceptable, monitor closely
90-95%: Needs investigation
< 90%: Critical issues

Certificate Expiration#

Shows: Days remaining on your SSL/TLS certificate

Status Indicators:

✅ > 30 days: Healthy
⚠️ 15-30 days: Action needed soon
🔴 < 15 days: Critical, renew immediately
❌ Expired: Certificate invalid
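
If you want to reproduce these indicators outside the dashboard, the sketch below derives the days remaining from a certificate's `notAfter` string (the format Python's `ssl` module reports) and maps it to the thresholds above. The helper names are hypothetical, and treating exactly 30 days as "warning" is an assumption about the boundary.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days between `now` (epoch seconds) and a notAfter string
    such as "Jan 31 00:00:00 2030 GMT"."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def certificate_status(days_remaining):
    """Map days remaining to the status indicators listed above."""
    if days_remaining < 0:
        return "expired"
    if days_remaining < 15:
        return "critical"
    if days_remaining <= 30:
        return "warning"
    return "healthy"


# Fixed reference time so the example does not depend on today's date.
reference = ssl.cert_time_to_seconds("Jan 01 00:00:00 2030 GMT")
remaining = days_until_expiry("Jan 31 00:00:00 2030 GMT", now=reference)
print(remaining, certificate_status(remaining))
```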

Action Timeline:

60 days before: Plan renewal
30 days before: Begin renewal process
15 days before: High priority
On expiration: Emergency procedures

Renewal Process:

Request new certificate from provider
Validate domain ownership
Install certificate on server
Verify installation
Test in monitoring

Common Certificates:

Let's Encrypt: Free, 90-day validity
Cloudflare: Free with plan
DigiCert: Premium, longer validity
AWS Certificate Manager: Free with AWS

Visual Status Indicator#

Status Bar Visualization#

The horizontal colored bar shows real-time application status:

Bar Composition:

Divided into segments (1 per monitoring check)
Each segment represents a time period
Color indicates status for that period
Hover to see details

Status Colors#

Green Segments#

Meaning: Successful response (application up)
Response: Normal
Action: None required

Red Segments#

Meaning: Failed response (application down)
Response: Application unavailable or timed out
Action: Investigate immediately

Yellow Segments#

Meaning: Slow response (degraded performance)
Response: Application slow but available
Action: Monitor for worsening
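
The three colors can be thought of as a classification of each monitoring check. The sketch below is illustrative only: the 500ms "slow" threshold and the rule that any status code outside 200-399 counts as a failure are assumptions, since the exact cut-offs are not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    status_code: Optional[int]    # None models a timeout (no response)
    response_ms: Optional[float]  # None when no response was received


def segment_color(check: CheckResult, slow_ms: float = 500.0) -> str:
    """Return "green", "yellow", or "red" for a single check."""
    if check.status_code is None or check.response_ms is None:
        return "red"  # no response at all
    if not 200 <= check.status_code < 400:
        return "red"  # assumed: error status codes count as failures
    if check.response_ms > slow_ms:
        return "yellow"  # available but slow
    return "green"


print(segment_color(CheckResult(200, 120.0)))
print(segment_color(CheckResult(200, 800.0)))
print(segment_color(CheckResult(None, None)))
```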

Interactive Status Bar#

Hovering Over Segments: Displays detailed information:

Response time in milliseconds
Status code (200, 500, timeout, etc.)
Exact timestamp of check
Duration of check

Uses:

Identify exact failure times
Correlate with events
Track issue resolution
Document incidents

Uptime Chart#

Chart Features#

An area chart displaying response times over your selected period:

Chart Elements:

X-Axis: Time progression (hours or minutes)
Y-Axis: Response time in milliseconds
Area: Response time trend
Line: Response time average

Reading the Chart#

Flat Line:

Consistent performance
No degradation
Stable application

Upward Trend:

Performance degrading
Increasing load
Resource constraints

Sudden Spike:

Temporary issue
Brief unavailability
Traffic spike

Downward Dip (Bottom):

Complete outage
Application down
No response received

Identifying Patterns#

Daily Pattern:

Peak hours: Higher response times
Off-hours: Lower response times
Expected and normal

Weekly Pattern:

Weekdays: Consistent pattern
Weekends: Different pattern
Predictable variations

Scheduled Maintenance:

Planned downtime visible
Duration: Usually 30-60 minutes
Normal and expected

Common Chart Scenarios#

Healthy Application:

Flat response times (< 200ms)
No downtime periods
Predictable daily pattern
No unexpected spikes

Degraded Performance:

Gradually increasing response times
Intermittent timeouts
Increasing failures
Needs investigation

Outage:

Sudden drop to zero/timeout
Sustained period of red
Clear start and end time
Visible on status bar

Interpreting Uptime Data#

Healthy Status#

Indicators:

✅ Status bar mostly solid green
✅ Response times < 200ms
✅ Response time trend flat
✅ Uptime percentage > 99.5%
✅ Certificate valid (> 30 days)

Action: Continue normal operations

Warning Signs#

Indicators:

⚠️ Response times increasing
⚠️ Occasional red segments
⚠️ Uptime dropping below 99%
⚠️ Certificate < 30 days
⚠️ Increased failure rate

Actions:

Investigate cause of degradation
Check application logs
Review resource usage
Plan certificate renewal
Monitor closely

Critical Issues#

Indicators:

🔴 Multiple consecutive red segments
🔴 Response times > 1000ms
🔴 Uptime below 95%
🔴 Certificate expired
🔴 Persistent failures
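
"Multiple consecutive red segments" can be checked mechanically if you export check results. A minimal sketch, assuming each check is reduced to a boolean (`True` for success); the function name and the choice of three as the alert threshold are hypothetical.

```python
def longest_failure_run(check_ok):
    """Length of the longest streak of consecutive failed checks."""
    longest = current = 0
    for ok in check_ok:
        current = 0 if ok else current + 1
        longest = max(longest, current)
    return longest


# Hypothetical sequence of checks: True = green, False = red.
history = [True, False, False, False, True, False, True]
streak = longest_failure_run(history)
print(streak, "consecutive failures; critical:", streak >= 3)
```

A streak-based rule is less noisy than alerting on every single red segment, which the Warning Signs section treats as a lower-severity signal.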

Immediate Actions:

Alert team immediately
Check application status
Review recent changes
Start incident response
Communicate to users
Renew certificate if expired

Common Monitoring Scenarios#

Scenario 1: Detect Outage#

Situation: Application stops responding

Indicators:

Red segments appear in status bar
Response time drops to zero/timeout
Chart shows a downward dip
Uptime percentage decreases

Steps to Investigate:

Check exact time of failure
Review application logs
Check infrastructure status
Verify DNS resolution
Check network connectivity
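
The last two steps can be scripted with Python's standard `socket` module. This is a sketch under stated assumptions: `resolve` and `can_connect` are hypothetical helper names, and the hostname and port you pass in are your own.

```python
import socket


def resolve(host):
    """Return the sorted IP addresses a hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=3.0):
    """True when a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (replace with your application's hostname):
#   print(resolve("app.example.com"))
#   print(can_connect("app.example.com", 443))
```

An empty list from `resolve` points to a DNS problem, while a successful resolution followed by a failed `can_connect` points to the network path, a firewall, or the application itself.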

Resolution:

Restart application if needed
Check recent deployments
Review resource usage
Fix underlying cause
Monitor for recurrence

Scenario 2: Performance Degradation#

Situation: Response times slowly increasing

Indicators:

Response times trending upward
Chart shows upward slope
Occasional red segments
Uptime still high

Root Causes:

Increased traffic
Memory leak
Database slowdown
Resource constraints
Network latency

Investigation:

Check traffic volume
Review application metrics
Check database performance
Monitor system resources
Review recent changes

Solutions:

Scale application horizontally
Optimize code/queries
Increase resources
Clear caches
Deploy fix

Scenario 3: Traffic Spike#

Situation: Response times spike during peak time

Indicators:

Response times increase during peak hours
Pattern repeats daily
Returns to normal after peak
No outages occur

Analysis:

This is expected behavior
Peak is predictable
Application recovers normally
No critical issue

Optimization:

Scale during peak times
Increase baseline resources
Implement caching
Optimize code
Use CDN

Scenario 4: SSL Certificate Expiration#

Situation: Certificate expiration approaching

Indicators:

Certificate expiration days: 15-30
Yellow warning in metric
Browsers show warnings once the certificate has expired

Action Plan:

Week 1: Request new certificate
Week 2: Validate and install
Week 3: Verify installation
Day before: Final check
After renewal: Monitor

Best Practices#

Daily Monitoring Routine#

✅ Check uptime percentage each morning
✅ Review response time trend
✅ Look for red segments
✅ Check certificate expiration (weekly)

Performance Management#

✅ Maintain target uptime > 99%
✅ Keep response time < 200ms
✅ Investigate degradation > 20%
✅ Document response to incidents

Certificate Management#

✅ Set reminder at 60 days before expiration
✅ Renew certificate at 30 days before
✅ Test in staging before production
✅ Verify installation after renewal

Incident Response#

✅ Alert team on outage
✅ Collect monitoring data
✅ Document timeline
✅ Perform root cause analysis
✅ Implement preventive measures

Troubleshooting#

No Uptime Data Showing#

Problem: "No Uptime Data Available" message

Solutions:

Ensure application is selected
Verify application is deployed
Check application is receiving traffic
Wait 5+ minutes for initial data
Verify DNS records configured
Check firewall rules allow monitoring

Data Shows Downtime But App is Up#

Problem: Monitoring shows outage, but app appears working

Possible Causes:

Monitoring endpoint different from user endpoint
Firewall blocking monitoring requests
Regional network issue
Application partially down
DNS issue

Investigation:

Test application from multiple locations
Check network connectivity
Verify DNS resolution
Review application logs
Check firewall rules

Certificate Expiration Showing Incorrectly#

Problem: Certificate days showing wrong value

Solutions:

Verify certificate is installed correctly
Check system clock is correct
Refresh monitoring dashboard
Wait for next automatic check
Contact certificate provider

Response Times Unrealistic#

Problem: Response times don't match expectations

Causes:

Monitoring from distant region
Network latency included in the measurement
Large response payload
Slow network connection
Slow DNS resolution

Verification:

Test locally
Test from other regions
Check network connectivity
Review payload size
Benchmark manually
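
For a manual benchmark, a rough sketch using only the Python standard library is shown below. The helper names are hypothetical, the URL is yours to supply, and the result reflects your own network location, so expect it to differ from the dashboard's figures.

```python
import time
import urllib.request
from statistics import median


def fetch(url, timeout=10.0):
    """Request the URL and read the full body, as a monitoring check might."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def time_call_ms(fn):
    """Wall-clock duration of a single call to `fn`, in milliseconds."""
    start = time.perf_counter()
    fn()
    return (time.perf_counter() - start) * 1000


def benchmark_ms(fn, runs=5):
    """Median duration over several runs, which damps one-off spikes."""
    return median(time_call_ms(fn) for _ in range(runs))


# Example usage (replace with your application's URL):
#   print(benchmark_ms(lambda: fetch("https://app.example.com/")))
```

Running the same script from a machine in another region gives a quick comparison for the "Test from other regions" step.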

Limits & Considerations#

Item	Limit
Data Granularity	1-5 minute intervals
Historical Data	Full history (unlimited)
Response Time Accuracy	±50ms
Uptime Calculation	Based on monitoring checks
Certificate Check	Daily verification
Monitoring Locations	Globally distributed
