Uptime Monitoring & SLA Tracking: Real-Time Application Availability | Nife Docs

Uptime Monitoring#

The Uptime tab provides real-time monitoring of your application's availability and performance. Track response times, uptime percentage, and SSL certificate expiration with minute-level granularity.

Overview#

Uptime monitoring is critical for maintaining service reliability. It continuously checks your application availability and provides real-time visibility into performance metrics. Use this data to identify issues quickly and maintain SLA compliance.


Getting Started#

Accessing Uptime Monitoring#

  1. Navigate to Monitoring from the main sidebar
  2. Click the Uptime tab
  3. Select your application from the dropdown
  4. Uptime data loads automatically

Selecting an Application#

  1. Click the Uptime tab
  2. From the dropdown, select the application you want to monitor
  3. Wait for metrics to load (typically a few seconds)
  4. Data refreshes automatically every 1-5 minutes

Initial Setup#

First Time Setup:

  1. Select your application
  2. System begins monitoring immediately
  3. Data collection starts in background
  4. Historical data available after 5+ minutes

Demo Mode:

  • Demo accounts display sample data
  • Use for testing and learning
  • Real data available after upgrading

Time Range Selection#

View uptime data for different monitoring periods to match your needs:

Available Time Ranges#

Last 1 Hour#

Use Case: Immediate monitoring and troubleshooting

Best For:

  • Current issue investigation
  • Real-time status checking
  • Rapid response to incidents
  • Minute-by-minute analysis

Granularity: 1-minute intervals

Last 3 Hours#

Use Case: Identify recent patterns

Best For:

  • Recent issue investigation
  • Trend detection
  • Performance analysis
  • Root cause investigation

Granularity: 3-5 minute intervals

Last 24 Hours#

Use Case: Comprehensive daily overview

Best For:

  • Daily health check
  • Pattern identification
  • SLA compliance verification
  • Capacity planning

Granularity: 15-30 minute intervals

Switching Time Ranges#

  1. Click the time range dropdown
  2. Select desired period
  3. Chart updates automatically
  4. Metrics recalculate

Uptime Metrics#

The dashboard displays four critical metrics that together provide complete visibility into application health:

Response Time (ms)#

Current Response Time:

  • Shows: The most recent response time measurement
  • Unit: Milliseconds (ms)
  • Update Frequency: Every 1-5 minutes

Healthy Range:

  • Excellent: < 100ms
  • Good: 100-200ms
  • Acceptable: 200-500ms
  • Poor: > 500ms
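The bands above can be expressed as a small lookup. This is a minimal sketch, not Nife's implementation: the function name `classify_response_time` is hypothetical, and because the listed ranges share their boundary values, it assumes 200ms counts as Good and 500ms as Acceptable.

```python
def classify_response_time(ms: float) -> str:
    """Map a response time in milliseconds to the bands listed above.

    Assumption: boundary values go to the better band
    (200ms is "Good", 500ms is "Acceptable").
    """
    if ms < 100:
        return "Excellent"
    if ms <= 200:
        return "Good"
    if ms <= 500:
        return "Acceptable"
    return "Poor"


if __name__ == "__main__":
    for sample in (85, 145, 350, 900):
        print(sample, "ms ->", classify_response_time(sample))
```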

What It Measures:

  • Total time from sending the request to receiving the response
  • Includes network latency
  • Covers the network round-trip between the monitoring location and your endpoint
  • Is not the server-side processing time alone

Interpreting Values:

  • Consistent low values: Good performance
  • Increasing trend: Degradation
  • Sudden spike: Potential issue
  • High values: Slow application or network

Average Response Time#

Shows: Average response time across all measurements in selected period

Calculation:

  • Sum of all response times ÷ number of measurements
  • Excludes failed requests
  • Smooths out temporary spikes

Uses:

  • Identify performance trends
  • Detect gradual degradation
  • Establish performance baseline
  • Compare time periods

Tracking Tips:

  1. Note your baseline average
  2. Alert if the average increases by more than 20% over baseline
  3. Compare week-over-week
  4. Adjust based on time of day

Example:

  • Current period avg: 145ms
  • Previous day avg: 130ms
  • Change: +15ms (about an 11.5% increase)
  • Indicates slight degradation
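The calculation and the example above can be reproduced in a few lines. This is an illustrative sketch only: the function names are hypothetical, failed checks are represented as `None` (an assumption about how the data might be stored), and the 20% default comes from the tracking tips.

```python
from typing import Optional, Sequence


def average_response_time(samples: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of successful measurements; failed checks (None) are excluded."""
    ok = [s for s in samples if s is not None]
    return sum(ok) / len(ok) if ok else None


def percent_change(baseline_ms: float, current_ms: float) -> float:
    """Relative change of the current average against the baseline, in percent."""
    return (current_ms - baseline_ms) / baseline_ms * 100


def needs_alert(baseline_ms: float, current_ms: float, threshold_pct: float = 20.0) -> bool:
    """True when the average has risen by more than the threshold."""
    return percent_change(baseline_ms, current_ms) > threshold_pct


if __name__ == "__main__":
    change = percent_change(130, 145)
    print(f"Change: {change:.1f}%  alert: {needs_alert(130, 145)}")
```

With the figures from the example (130ms baseline, 145ms current), the change stays below the 20% threshold, so no alert would fire.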

Uptime Percentage#

Shows: Percentage of time your application was available

Calculation:

(Successful Responses ÷ Total Monitoring Checks) × 100

Common Targets:

  • 99.99% (Four 9s) - about 52.6 minutes/year downtime
  • 99.9% (Three 9s) - about 8.8 hours/year downtime
  • 99% (Two 9s) - about 3.65 days/year downtime
  • 95% - 18.25 days/year downtime
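To see where these downtime figures come from, the sketch below applies the uptime formula and converts a target percentage into a yearly downtime budget. The function names are hypothetical, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def uptime_percentage(successful_checks: int, total_checks: int) -> float:
    """(Successful Responses / Total Monitoring Checks) x 100."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return successful_checks / total_checks * 100


def allowed_downtime_hours(target_pct: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime budget, in hours, for an uptime target over the given period."""
    return (100 - target_pct) / 100 * period_hours


if __name__ == "__main__":
    for target in (99.99, 99.9, 99.0, 95.0):
        hours = allowed_downtime_hours(target)
        print(f"{target}% -> {hours:.2f} h/year ({hours * 60:.1f} min, {hours / 24:.2f} days)")
```

For example, 1,437 successful checks out of 1,440 one-minute checks in a day gives roughly 99.79% uptime for that day.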

SLA Alignment:

  • Premium tier: 99.9%+ uptime
  • Standard tier: 99%+ uptime
  • Basic tier: 95%+ uptime

Interpreting Values:

  • 99%+: Excellent, production-ready
  • 95-99%: Acceptable, monitor closely
  • 90-95%: Needs investigation
  • < 90%: Critical issues

Certificate Expiration#

Shows: Days remaining on your SSL/TLS certificate

Status Indicators:

  • ✅ > 30 days: Healthy
  • ⚠️ 15-30 days: Action needed soon
  • 🔴 < 15 days: Critical, renew immediately
  • ❌ Expired: Certificate invalid
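The status indicators above amount to a threshold check on the days remaining. A minimal sketch, using a hypothetical function name and assuming that zero or fewer days remaining means the certificate has expired:

```python
def certificate_status(days_remaining: int) -> str:
    """Map days remaining on a certificate to the statuses listed above."""
    if days_remaining <= 0:  # assumption: 0 days left is treated as expired
        return "Expired"
    if days_remaining < 15:
        return "Critical"
    if days_remaining <= 30:
        return "Action needed soon"
    return "Healthy"


if __name__ == "__main__":
    for days in (90, 30, 14, 0):
        print(days, "days ->", certificate_status(days))
```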

Action Timeline:

  • 60 days before: Plan renewal
  • 30 days before: Begin renewal process
  • 15 days before: High priority
  • On expiration: Emergency procedures

Renewal Process:

  1. Request new certificate from provider
  2. Validate domain ownership
  3. Install certificate on server
  4. Verify installation
  5. Test in monitoring
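One way to verify an installation independently of the dashboard is to read the certificate's expiry date over a TLS connection. The sketch below uses only Python's standard library; the function names are hypothetical, and `your-app.example.com` is a placeholder for your own hostname.

```python
import socket
import ssl
import time


def days_until(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def certificate_days_remaining(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Connect to host:port, validate the certificate chain, and return days until expiry."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return days_until(cert["notAfter"], time.time())


if __name__ == "__main__":
    # Placeholder hostname; replace with the domain you just renewed.
    print(f"{certificate_days_remaining('your-app.example.com'):.0f} days remaining")
```

Because the default context validates the certificate, an expired or mismatched certificate raises `ssl.SSLCertVerificationError` instead of returning a value, which is itself a useful signal after a renewal.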

Common Certificates:

  • Let's Encrypt: Free, 90-day validity
  • Cloudflare: Free with plan
  • DigiCert: Premium, longer validity
  • AWS Certificate Manager: Free with AWS

Visual Status Indicator#

Status Bar Visualization#

The horizontal colored bar shows real-time application status:

Bar Composition:

  • Divided into segments (1 per monitoring check)
  • Each segment represents a time period
  • Color indicates status for that period
  • Hover to see details

Status Colors#

Green Segments#

  • Meaning: Successful response (application up)
  • Response: Normal
  • Action: None required

Red Segments#

  • Meaning: Failed response (application down)
  • Response: Application unavailable, returning errors, or timing out
  • Action: Investigate immediately

Yellow Segments#

  • Meaning: Slow response (degraded performance)
  • Response: Application slow but available
  • Action: Monitor for worsening
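The color rules above can be sketched as a function of a single check result. The documentation does not state the exact cutoffs, so the 500ms slow threshold (borrowed from the "Poor" response-time band) and the treatment of any non-2xx/3xx status code as a failure are assumptions, as is the function name.

```python
from typing import Optional

SLOW_THRESHOLD_MS = 500.0  # assumption: matches the "Poor" response-time band


def segment_color(status_code: Optional[int], response_ms: Optional[float]) -> str:
    """Pick a status-bar color for one monitoring check.

    A timeout is represented by None for both arguments (an assumption).
    """
    if status_code is None or response_ms is None:
        return "red"  # no response received
    if not 200 <= status_code < 400:
        return "red"  # assumption: 4xx/5xx responses count as failures
    if response_ms > SLOW_THRESHOLD_MS:
        return "yellow"  # available but slow
    return "green"


if __name__ == "__main__":
    checks = [(200, 120.0), (200, 750.0), (500, 90.0), (None, None)]
    print([segment_color(code, ms) for code, ms in checks])
```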

Interactive Status Bar#

Hovering Over Segments: Displays detailed information:

  • Response time in milliseconds
  • Status code (200, 500, timeout, etc.)
  • Exact timestamp of check
  • Duration of check

Uses:

  • Identify exact failure times
  • Correlate with events
  • Track issue resolution
  • Document incidents

Uptime Chart#

Chart Features#

An area chart displaying response times over your selected period:

Chart Elements:

  • X-Axis: Time progression (hours or minutes)
  • Y-Axis: Response time in milliseconds
  • Area: Response time trend
  • Line: Response time average

Reading the Chart#

Flat Line:

  • Consistent performance
  • No degradation
  • Stable application

Upward Trend:

  • Performance degrading
  • Increasing load
  • Resource constraints

Sudden Spike:

  • Temporary issue
  • Brief unavailability
  • Traffic spike

Downward Dip (Bottom):

  • Complete outage
  • Application down
  • No response received

Identifying Patterns#

Daily Pattern:

  • Peak hours: Higher response times
  • Off-hours: Lower response times
  • Expected and normal

Weekly Pattern:

  • Weekdays: Consistent pattern
  • Weekends: Different pattern
  • Predictable variations

Scheduled Maintenance:

  • Planned downtime visible
  • Duration: Usually 30-60 minutes
  • Normal and expected

Common Chart Scenarios#

Healthy Application:

  • Flat response times (< 200ms)
  • No downtime periods
  • Predictable daily pattern
  • No unexpected spikes

Degraded Performance:

  • Gradually increasing response times
  • Intermittent timeouts
  • Increasing failures
  • Needs investigation

Outage:

  • Sudden drop to zero/timeout
  • Sustained period of red
  • Clear start and end time
  • Visible on status bar

Interpreting Uptime Data#

Healthy Status#

Indicators:

  • ✅ Status bar mostly solid green
  • ✅ Response times < 200ms
  • ✅ Response time trend flat
  • ✅ Uptime percentage > 99.5%
  • ✅ Certificate valid (> 30 days)

Action: Continue normal operations

Warning Signs#

Indicators:

  • ⚠️ Response times increasing
  • ⚠️ Occasional red segments
  • ⚠️ Uptime dropping below 99%
  • ⚠️ Certificate < 30 days
  • ⚠️ Increased failure rate

Actions:

  1. Investigate cause of degradation
  2. Check application logs
  3. Review resource usage
  4. Plan certificate renewal
  5. Monitor closely

Critical Issues#

Indicators:

  • 🔴 Multiple consecutive red segments
  • 🔴 Response times > 1000ms
  • 🔴 Uptime below 95%
  • 🔴 Certificate expired
  • 🔴 Persistent failures

Immediate Actions:

  1. Alert team immediately
  2. Check application status
  3. Review recent changes
  4. Start incident response
  5. Communicate to users
  6. Renew certificate if expired
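The numeric indicators for the healthy, warning, and critical states above can be combined into a single triage function. This is a sketch under assumptions: the function name is hypothetical, only the three measurable inputs are used, and any combination that is neither fully healthy nor critical is treated as a warning, since the thresholds listed leave gaps between the states.

```python
def overall_status(uptime_pct: float, avg_response_ms: float, cert_days_remaining: int) -> str:
    """Triage the three numeric indicators into healthy / warning / critical."""
    critical = (
        uptime_pct < 95
        or avg_response_ms > 1000
        or cert_days_remaining <= 0  # expired certificate
    )
    if critical:
        return "critical"
    healthy = (
        uptime_pct > 99.5
        and avg_response_ms < 200
        and cert_days_remaining > 30
    )
    # Assumption: everything between the two extremes is a warning.
    return "healthy" if healthy else "warning"


if __name__ == "__main__":
    print(overall_status(99.9, 150, 60))
    print(overall_status(98.5, 150, 60))
    print(overall_status(94.0, 150, 60))
```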

Common Monitoring Scenarios#

Scenario 1: Detect Outage#

Situation: Application stops responding

Indicators:

  • Red segments appear in status bar
  • Response time drops to zero/timeout
  • Chart shows downward spike
  • Uptime percentage decreases

Steps to Investigate:

  1. Check exact time of failure
  2. Review application logs
  3. Check infrastructure status
  4. Verify DNS resolution
  5. Check network connectivity

Resolution:

  1. Restart application if needed
  2. Check recent deployments
  3. Review resource usage
  4. Fix underlying cause
  5. Monitor for recurrence

Scenario 2: Performance Degradation#

Situation: Response times slowly increasing

Indicators:

  • Response times trending upward
  • Chart shows upward slope
  • Occasional red segments
  • Uptime still high

Root Causes:

  • Increased traffic
  • Memory leak
  • Database slowdown
  • Resource constraints
  • Network latency

Investigation:

  1. Check traffic volume
  2. Review application metrics
  3. Check database performance
  4. Monitor system resources
  5. Review recent changes

Solutions:

  1. Scale application horizontally
  2. Optimize code/queries
  3. Increase resources
  4. Clear caches
  5. Deploy fix

Scenario 3: Traffic Spike#

Situation: Response times spike during peak time

Indicators:

  • Response times increase during peak hours
  • Pattern repeats daily
  • Returns to normal after peak
  • No outages occur

Analysis:

  1. This is expected behavior
  2. Peak is predictable
  3. Application recovers normally after the peak
  4. No critical issue

Optimization:

  • Scale during peak times
  • Increase baseline resources
  • Implement caching
  • Optimize code
  • Use CDN

Scenario 4: SSL Certificate Expiration#

Situation: Certificate expiration approaching

Indicators:

  • Certificate expiration days: 15-30
  • Yellow warning in metric
  • Browser warnings if expired

Action Plan:

  • Week 1: Request new certificate
  • Week 2: Validate and install
  • Week 3: Verify installation
  • Day before: Final check
  • After renewal: Monitor

Best Practices#

Daily Monitoring Routine#

  • ✅ Check uptime percentage each morning
  • ✅ Review response time trend
  • ✅ Look for red segments
  • ✅ Check certificate expiration (weekly)

Performance Management#

  • ✅ Maintain target uptime > 99%
  • ✅ Keep response time < 200ms
  • ✅ Investigate degradation > 20%
  • ✅ Document response to incidents

Certificate Management#

  • ✅ Set reminder at 60 days before expiration
  • ✅ Renew certificate at 30 days before
  • ✅ Test in staging before production
  • ✅ Verify installation after renewal

Incident Response#

  • ✅ Alert team on outage
  • ✅ Collect monitoring data
  • ✅ Document timeline
  • ✅ Perform root cause analysis
  • ✅ Implement preventive measures

Troubleshooting#

No Uptime Data Showing#

Problem: "No Uptime Data Available" message

Solutions:

  1. Ensure application is selected
  2. Verify application is deployed
  3. Check application is receiving traffic
  4. Wait 5+ minutes for initial data
  5. Verify DNS records configured
  6. Check firewall rules allow monitoring

Data Shows Downtime But App is Up#

Problem: Monitoring shows outage, but app appears working

Possible Causes:

  • Monitoring endpoint different from user endpoint
  • Firewall blocking monitoring requests
  • Regional network issue
  • Application partially down
  • DNS issue

Investigation:

  1. Test application from multiple locations
  2. Check network connectivity
  3. Verify DNS resolution
  4. Review application logs
  5. Check firewall rules
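The DNS and connectivity steps above can be checked from any machine with Python installed. This is a rough sketch using only the standard library; the function names are hypothetical and `your-app.example.com` is a placeholder. Running it from several networks or regions approximates the "multiple locations" step.

```python
import socket
from typing import List


def resolve_host(host: str, port: int = 443) -> List[str]:
    """Return the distinct IP addresses the host resolves to (raises socket.gaierror on failure)."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host = "your-app.example.com"  # placeholder hostname
    try:
        print("Resolves to:", resolve_host(host))
        print("TCP 443 reachable:", can_connect(host))
    except socket.gaierror as exc:
        print("DNS resolution failed:", exc)
```

If the host resolves and connects from your machine but monitoring still reports downtime, that points toward a firewall rule or a regional network issue rather than the application itself.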

Certificate Expiration Showing Incorrectly#

Problem: Certificate days showing wrong value

Solutions:

  1. Verify certificate is installed correctly
  2. Check system clock is correct
  3. Refresh monitoring dashboard
  4. Wait for next automatic check
  5. Contact certificate provider

Response Times Unrealistic#

Problem: Response times don't match expectations

Causes:

  • Monitoring from distant region
  • Network latency included in the measurement
  • Large response payload
  • Slow network connection
  • Slow DNS resolution

Verification:

  1. Test locally
  2. Test from other regions
  3. Check network connectivity
  4. Review payload size
  5. Benchmark manually
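For the manual benchmark step, a short script that times several requests gives numbers to compare against the dashboard. This is a sketch, not a replacement for a load-testing tool: the function names are hypothetical, the URL is a placeholder, and the timing includes DNS lookup, TLS handshake, and payload download, so results depend on where you run it.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict


def fetch(url: str, timeout: float = 10.0) -> None:
    """Request the URL and read the full body so payload size is included in the timing."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def benchmark(action: Callable[[], None], runs: int = 5) -> Dict[str, float]:
    """Time `action` several times and summarize the results in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        timings.append((time.perf_counter() - start) * 1000)
    return {
        "min_ms": min(timings),
        "median_ms": statistics.median(timings),
        "max_ms": max(timings),
    }


if __name__ == "__main__":
    url = "https://your-app.example.com/"  # placeholder URL
    print(benchmark(lambda: fetch(url)))
```

The first run is often slower than the rest because of DNS and connection setup, which is why the median is usually the more useful figure.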

Limits & Considerations#

Item | Limit
Data Granularity | 1-5 minute intervals
Historical Data | Full history (unlimited)
Response Time Accuracy | ±50ms
Uptime Calculation | Based on monitoring checks
Certificate Check | Daily verification
Monitoring Locations | Global distributed
