Uptime Monitoring

The Uptime tab provides real-time monitoring of your application's availability and performance. Track response times, uptime percentage, and SSL certificate expiration with minute-level granularity.

Overview

Uptime monitoring is critical for maintaining service reliability. It continuously checks your application availability and provides real-time visibility into performance metrics. Use this data to identify issues quickly and maintain SLA compliance.


Getting Started

Accessing Uptime Monitoring

  1. Navigate to Monitoring from the main sidebar
  2. Click the Uptime tab
  3. Select your application from the dropdown
  4. Uptime data loads automatically

Selecting an Application

  1. Click the Uptime tab
  2. From the dropdown, select the application you want to monitor
  3. Wait for metrics to load (typically a few seconds)
  4. Data refreshes automatically every 1-5 minutes

Initial Setup

First Time Setup:

  1. Select your application
  2. System begins monitoring immediately
  3. Data collection starts in background
  4. Historical data available after 5+ minutes

Demo Mode:

  • Demo accounts display sample data
  • Use for testing and learning
  • Real data available after upgrading

Time Range Selection

View uptime data for different monitoring periods to match your needs:

Available Time Ranges

Last 1 Hour

Use Case: Immediate monitoring and troubleshooting

Best For:

  • Current issue investigation
  • Real-time status checking
  • Rapid response to incidents
  • Minute-by-minute analysis

Granularity: 1-minute intervals

Last 3 Hours

Use Case: Identify recent patterns

Best For:

  • Recent issue investigation
  • Trend detection
  • Performance analysis
  • Root cause investigation

Granularity: 3-5 minute intervals

Last 24 Hours

Use Case: Comprehensive daily overview

Best For:

  • Daily health check
  • Pattern identification
  • SLA compliance verification
  • Capacity planning

Granularity: 15-30 minute intervals
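
As a rough guide to how many data points each range produces, the granularity figures above can be turned into simple arithmetic. This is only a sketch: the `TIME_RANGES` table and the `expected_points` helper are illustrative names, not part of the product, and the actual number of points on the chart may differ.

```python
# Each entry: (range length in minutes, finest interval, coarsest interval),
# copied from the granularity figures listed for each time range.
TIME_RANGES = {
    "Last 1 Hour": (60, 1, 1),
    "Last 3 Hours": (180, 3, 5),
    "Last 24 Hours": (1440, 15, 30),
}


def expected_points(range_name: str) -> tuple:
    """Return (fewest, most) data points expected for a time range."""
    total_minutes, finest, coarsest = TIME_RANGES[range_name]
    return total_minutes // coarsest, total_minutes // finest


for name in TIME_RANGES:
    low, high = expected_points(name)
    print(f"{name}: {low}-{high} data points")
```

Under these assumptions, every range yields between 36 and 96 points, which is why longer ranges trade per-minute detail for a readable overview.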

Switching Time Ranges

  1. Click the time range dropdown
  2. Select desired period
  3. Chart updates automatically
  4. Metrics recalculate

Uptime Metrics

The dashboard displays four critical metrics that together provide complete visibility into application health:

Response Time (ms)

Current Response Time:

  • Shows: The most recent response time measurement
  • Unit: Milliseconds (ms)
  • Update Frequency: Every 1-5 minutes

Healthy Range:

  • Excellent: < 100ms
  • Good: 100-200ms
  • Acceptable: 200-500ms
  • Poor: > 500ms
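
If you export response times and want to apply the same bands in your own tooling, a minimal sketch might look like the following. The function name is hypothetical, and the handling of the exact boundary values (200ms counted as Good, 500ms as Acceptable) is an assumption, since the ranges above overlap at those points.

```python
def rate_response_time(ms: float) -> str:
    """Map a response time in milliseconds to the rating bands above.

    Boundary handling is an assumption: exactly 200 ms is rated "Good"
    and exactly 500 ms is rated "Acceptable".
    """
    if ms < 0:
        raise ValueError("response time cannot be negative")
    if ms < 100:
        return "Excellent"
    if ms <= 200:
        return "Good"
    if ms <= 500:
        return "Acceptable"
    return "Poor"


for sample in (85, 145, 320, 750):
    print(f"{sample}ms: {rate_response_time(sample)}")
```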

What It Measures:

  • Total time from sending the check request to receiving the response
  • Includes network latency
  • Covers the network round trip to the monitored endpoint
  • Not a measure of server-side request processing time alone

Interpreting Values:

  • Consistent low values: Good performance
  • Increasing trend: Degradation
  • Sudden spike: Potential issue
  • High values: Slow application or network

Average Response Time

Shows: Average response time across all measurements in selected period

Calculation:

  • Sum of all response times ÷ number of measurements
  • Excludes failed requests
  • Smooths out temporary spikes

Uses:

  • Identify performance trends
  • Detect gradual degradation
  • Establish performance baseline
  • Compare time periods

Tracking Tips:

  1. Note your baseline average
  2. Alert if the average rises more than 20% above baseline
  3. Compare week-over-week
  4. Adjust expectations based on time of day

Example:

  • Current period avg: 145ms
  • Previous day avg: 130ms
  • Change: +15ms (about an 11.5% increase)
  • Indicates slight degradation
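
The calculation and the 20% tracking tip can be sketched in a few lines. This assumes failed checks are recorded as `None`; the function names are illustrative, not part of the product.

```python
from typing import Optional, Sequence


def average_response_ms(samples: Sequence[Optional[float]]) -> Optional[float]:
    """Average the successful measurements; failed checks (None) are excluded."""
    successful = [s for s in samples if s is not None]
    if not successful:
        return None
    return sum(successful) / len(successful)


def percent_change(baseline_ms: float, current_ms: float) -> float:
    """Relative change of the current average against the baseline, in percent."""
    return (current_ms - baseline_ms) / baseline_ms * 100


def needs_investigation(baseline_ms: float, current_ms: float,
                        threshold_pct: float = 20.0) -> bool:
    """True when the average has risen by more than the threshold."""
    return percent_change(baseline_ms, current_ms) > threshold_pct


# The example above: 130ms the previous day, 145ms in the current period.
change = percent_change(130, 145)
print(f"Change: {change:.1f}%")
print("Investigate" if needs_investigation(130, 145) else "Within tolerance")
```

With the 130ms and 145ms averages from the example, the change stays under the 20% threshold, so it is worth watching but does not yet call for investigation.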

Uptime Percentage

Shows: Percentage of time your application was available

Calculation:

(Successful Responses ÷ Total Monitoring Checks) × 100

Common Targets:

  • 99.99% (Four 9s) - about 52.6 minutes/year downtime
  • 99.9% (Three 9s) - about 8.8 hours/year downtime
  • 99% (Two 9s) - about 3.65 days/year downtime
  • 95% - 18.25 days/year downtime
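
To see where the downtime figures come from, the formula and the yearly downtime budget can be worked through as below. This is a sketch assuming a 365-day year; the function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def uptime_percentage(successful_checks: int, total_checks: int) -> float:
    """(Successful Responses / Total Monitoring Checks) x 100."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return successful_checks / total_checks * 100


def allowed_downtime_minutes(target_pct: float) -> float:
    """Downtime per year that still meets the given uptime target."""
    return (100 - target_pct) / 100 * MINUTES_PER_YEAR


for target in (99.99, 99.9, 99.0, 95.0):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}%: {minutes:.1f} min/year ({minutes / 60:.2f} hours)")
```

For example, one failed check out of 1,440 one-minute checks in a day gives `uptime_percentage(1439, 1440)`, roughly 99.93%, which meets a 99.9% target for that day but not 99.99%.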

SLA Alignment:

  • Premium tier: 99.9%+ uptime
  • Standard tier: 99%+ uptime
  • Basic tier: 95%+ uptime

Interpreting Values:

  • 99%+ : Excellent, production-ready
  • 95-99%: Acceptable, monitor closely
  • 90-95%: Needs investigation
  • < 90%: Critical issues

Certificate Expiration

Shows: Days remaining on your SSL/TLS certificate

Status Indicators:

  • ✅ > 30 days: Healthy
  • ⚠️ 15-30 days: Action needed soon
  • 🔴 < 15 days: Critical, renew immediately
  • ❌ Expired: Certificate invalid
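
The status thresholds can be expressed as a small helper if you track certificate expiry dates yourself. This is a sketch under assumptions: the function names are hypothetical, a certificate is treated as expired once its expiry time has passed, and exactly 15 and 30 days are placed in the "Action needed soon" band.

```python
from datetime import datetime, timezone


def days_remaining(not_after: datetime, now: datetime) -> int:
    """Whole days until expiry; negative once the expiry time has passed."""
    return (not_after - now).days


def certificate_status(days: int) -> str:
    """Map days remaining to the status indicators listed above."""
    if days < 0:
        return "Expired"
    if days < 15:
        return "Critical"
    if days <= 30:
        return "Action needed soon"
    return "Healthy"


# Illustrative dates only.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
expiry = datetime(2024, 1, 21, tzinfo=timezone.utc)
print(certificate_status(days_remaining(expiry, now)))
```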

Action Timeline:

  • 60 days before: Plan renewal
  • 30 days before: Begin renewal process
  • 15 days before: High priority
  • On expiration: Emergency procedures

Renewal Process:

  1. Request new certificate from provider
  2. Validate domain ownership
  3. Install certificate on server
  4. Verify installation
  5. Test in monitoring

Common Certificates:

  • Let's Encrypt: Free, 90-day validity
  • Cloudflare: Free with plan
  • DigiCert: Premium, longer validity
  • AWS Certificate Manager: Free with AWS

Visual Status Indicator

Status Bar Visualization

The horizontal colored bar shows real-time application status:

Bar Composition:

  • Divided into segments (1 per monitoring check)
  • Each segment represents a time period
  • Color indicates status for that period
  • Hover to see details

Status Colors

Green Segments

  • Meaning: Successful response (application up)
  • Response: Normal
  • Action: None required

Red Segments

  • Meaning: Failed response (application down)
  • Response: Application unavailable or timed out
  • Action: Investigate immediately

Yellow Segments

  • Meaning: Slow response (degraded performance)
  • Response: Application slow but available
  • Action: Monitor for worsening

Interactive Status Bar

Hovering Over Segments: Displays detailed information:

  • Response time in milliseconds
  • Status code (200, 500, timeout, etc.)
  • Exact timestamp of check
  • Duration of check

Uses:

  • Identify exact failure times
  • Correlate with events
  • Track issue resolution
  • Document incidents

Uptime Chart

Chart Features

An area chart displaying response times over your selected period:

Chart Elements:

  • X-Axis: Time progression (hours or minutes)
  • Y-Axis: Response time in milliseconds
  • Area: Response time trend
  • Line: Response time average

Reading the Chart

Flat Line:

  • Consistent performance
  • No degradation
  • Stable application

Upward Trend:

  • Performance degrading
  • Increasing load
  • Resource constraints

Sudden Spike:

  • Temporary issue
  • Brief unavailability
  • Traffic spike

Downward Dip (Bottom):

  • Complete outage
  • Application down
  • No response received

Identifying Patterns

Daily Pattern:

  • Peak hours: Higher response times
  • Off-hours: Lower response times
  • Expected and normal

Weekly Pattern:

  • Weekdays: Consistent pattern
  • Weekends: Different pattern
  • Predictable variations

Scheduled Maintenance:

  • Planned downtime visible
  • Duration: Usually 30-60 minutes
  • Normal and expected

Common Chart Scenarios

Healthy Application:

  • Flat response times (< 200ms)
  • No downtime periods
  • Predictable daily pattern
  • No unexpected spikes

Degraded Performance:

  • Gradually increasing response times
  • Intermittent timeouts
  • Increasing failures
  • Needs investigation

Outage:

  • Sudden drop to zero/timeout
  • Sustained period of red
  • Clear start and end time
  • Visible on status bar
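
When documenting an incident, the start and end of an outage can be recovered from a list of check results by grouping consecutive failures. The sketch below assumes each check is a timestamp plus a pass/fail flag; the `Check` type, the `outage_windows` function, and the two-failure minimum are illustrative choices, not product behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


@dataclass
class Check:
    timestamp: datetime
    ok: bool  # True for a green segment, False for a red one


def outage_windows(checks: Iterable[Check],
                   min_failures: int = 2) -> List[Tuple[datetime, datetime]]:
    """Return (first failed check, last failed check) for each run of
    at least `min_failures` consecutive failed checks."""
    windows = []
    run: List[datetime] = []
    for check in sorted(checks, key=lambda c: c.timestamp):
        if not check.ok:
            run.append(check.timestamp)
            continue
        if len(run) >= min_failures:
            windows.append((run[0], run[-1]))
        run = []
    if len(run) >= min_failures:  # outage still ongoing at the last check
        windows.append((run[0], run[-1]))
    return windows


start = datetime(2024, 1, 1, 12, 0)
results = [True, False, False, False, True, False, True]
checks = [Check(start + timedelta(minutes=i), ok) for i, ok in enumerate(results)]
for begin, end in outage_windows(checks):
    print(f"Outage from {begin:%H:%M} to {end:%H:%M}")
```

Requiring at least two consecutive failures keeps a single failed check from being reported as an outage; lower `min_failures` to 1 if every red segment should count.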

Interpreting Uptime Data

Healthy Status

Indicators:

  • ✅ Status bar mostly solid green
  • ✅ Response times < 200ms
  • ✅ Response time trend flat
  • ✅ Uptime percentage > 99.5%
  • ✅ Certificate valid (> 30 days)

Action: Continue normal operations

Warning Signs

Indicators:

  • ⚠️ Response times increasing
  • ⚠️ Occasional red segments
  • ⚠️ Uptime dropping below 99%
  • ⚠️ Certificate < 30 days
  • ⚠️ Increased failure rate

Actions:

  1. Investigate cause of degradation
  2. Check application logs
  3. Review resource usage
  4. Plan certificate renewal
  5. Monitor closely

Critical Issues

Indicators:

  • 🔴 Multiple consecutive red segments
  • 🔴 Response times > 1000ms
  • 🔴 Uptime below 95%
  • 🔴 Certificate expired
  • 🔴 Persistent failures

Immediate Actions:

  1. Alert team immediately
  2. Check application status
  3. Review recent changes
  4. Start incident response
  5. Communicate to users
  6. Renew certificate if expired
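
The three status levels can be combined into a single check. The sketch below uses only the numeric thresholds from the indicator lists; how to treat values that fall between the lists (for example, uptime between 99% and 99.5%) is not specified, so this version assumes anything that is neither clearly healthy nor critical is a warning. The function name is hypothetical.

```python
def overall_status(response_ms: float, uptime_pct: float, cert_days: int) -> str:
    """Classify application health from the numeric indicators above.

    Assumption: any single critical indicator makes the result "Critical",
    all healthy indicators are required for "Healthy", and everything in
    between is reported as "Warning".
    """
    if response_ms > 1000 or uptime_pct < 95 or cert_days < 0:
        return "Critical"
    if response_ms < 200 and uptime_pct > 99.5 and cert_days > 30:
        return "Healthy"
    return "Warning"


print(overall_status(response_ms=120, uptime_pct=99.9, cert_days=60))
print(overall_status(response_ms=350, uptime_pct=98.7, cert_days=25))
print(overall_status(response_ms=1500, uptime_pct=99.9, cert_days=60))
```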

Common Monitoring Scenarios

Scenario 1: Detect Outage

Situation: Application stops responding

Indicators:

  • Red segments appear in status bar
  • Response time drops to zero/timeout
  • Chart shows downward spike
  • Uptime percentage decreases

Steps to Investigate:

  1. Check exact time of failure
  2. Review application logs
  3. Check infrastructure status
  4. Verify DNS resolution
  5. Check network connectivity

Resolution:

  1. Restart application if needed
  2. Check recent deployments
  3. Review resource usage
  4. Fix underlying cause
  5. Monitor for recurrence

Scenario 2: Performance Degradation

Situation: Response times slowly increasing

Indicators:

  • Response times trending upward
  • Chart shows upward slope
  • Occasional red segments
  • Uptime still high

Root Causes:

  • Increased traffic
  • Memory leak
  • Database slowdown
  • Resource constraints
  • Network latency

Investigation:

  1. Check traffic volume
  2. Review application metrics
  3. Check database performance
  4. Monitor system resources
  5. Review recent changes

Solutions:

  1. Scale application horizontally
  2. Optimize code/queries
  3. Increase resources
  4. Clear caches
  5. Deploy fix

Scenario 3: Traffic Spike

Situation: Response times spike during peak time

Indicators:

  • Response times increase during peak hours
  • Pattern repeats daily
  • Returns to normal after peak
  • No outages occur

Analysis:

  1. This is expected behavior
  2. Peak is predictable
  3. Application recovers normally
  4. No critical issue

Optimization:

  • Scale during peak times
  • Increase baseline resources
  • Implement caching
  • Optimize code
  • Use CDN

Scenario 4: SSL Certificate Expiration

Situation: Certificate expiration approaching

Indicators:

  • Days until certificate expiration: 15-30
  • Yellow warning in metric
  • Browser warnings if expired

Action Plan:

  • Week 1: Request new certificate
  • Week 2: Validate and install
  • Week 3: Verify installation
  • Day before: Final check
  • After renewal: Monitor

Best Practices

Daily Monitoring Routine

  • ✅ Check uptime percentage each morning
  • ✅ Review response time trend
  • ✅ Look for red segments
  • ✅ Check certificate expiration (weekly)

Performance Management

  • ✅ Maintain target uptime > 99%
  • ✅ Keep response time < 200ms
  • ✅ Investigate degradation > 20%
  • ✅ Document response to incidents

Certificate Management

  • ✅ Set reminder at 60 days before expiration
  • ✅ Renew certificate at 30 days before
  • ✅ Test in staging before production
  • ✅ Verify installation after renewal

Incident Response

  • ✅ Alert team on outage
  • ✅ Collect monitoring data
  • ✅ Document timeline
  • ✅ Perform root cause analysis
  • ✅ Implement preventive measures

Troubleshooting

No Uptime Data Showing

Problem: "No Uptime Data Available" message

Solutions:

  1. Ensure application is selected
  2. Verify application is deployed
  3. Check application is receiving traffic
  4. Wait 5+ minutes for initial data
  5. Verify DNS records configured
  6. Check firewall rules allow monitoring

Data Shows Downtime But App is Up

Problem: Monitoring shows outage, but app appears working

Possible Causes:

  • Monitoring endpoint different from user endpoint
  • Firewall blocking monitoring requests
  • Regional network issue
  • Application partially down
  • DNS issue

Investigation:

  1. Test application from multiple locations
  2. Check network connectivity
  3. Verify DNS resolution
  4. Review application logs
  5. Check firewall rules

Certificate Expiration Showing Incorrectly

Problem: Certificate days showing wrong value

Solutions:

  1. Verify certificate is installed correctly
  2. Check system clock is correct
  3. Refresh monitoring dashboard
  4. Wait for next automatic check
  5. Contact certificate provider

Response Times Unrealistic

Problem: Response times don't match expectations

Causes:

  • Monitoring from distant region
  • Network latency included in the measurement
  • Large response payload
  • Slow network connection
  • Slow DNS resolution

Verification:

  1. Test locally
  2. Test from other regions
  3. Check network connectivity
  4. Review payload size
  5. Benchmark manually
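
For the manual benchmark step, a minimal sketch using only the Python standard library is shown below. It times several runs of any callable, so the same helper works for a real HTTP request or a stand-in. The function names are illustrative, and the numbers it produces reflect your own network path, so they will not match the dashboard exactly.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def fetch_url(url: str, timeout: float = 10.0) -> None:
    """Request a URL and read the full body, so payload size is included."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def benchmark_ms(action: Callable[[], None], runs: int = 5) -> List[float]:
    """Time `action` several times and return each duration in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        timings.append((time.perf_counter() - start) * 1000)
    return timings


def summarize(timings: List[float]) -> Dict[str, float]:
    """Min, median, and max give a quick view of spread across runs."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }
```

To benchmark your own application, pass a request to your endpoint, for example `summarize(benchmark_ms(lambda: fetch_url("https://your-app.example.com/")))`, where the URL is a placeholder. Running it locally and from another region makes it easier to separate application slowness from network latency.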

Limits & Considerations

Item                      Limit
Data Granularity          1-5 minute intervals
Historical Data           Full history (unlimited)
Response Time Accuracy    ±50ms
Uptime Calculation        Based on monitoring checks
Certificate Check         Daily verification
Monitoring Locations      Global distributed