Monitor VM Performance | Real-Time Metrics and Health Status Guide

Monitor your VM instances' performance, health status, and activity in real-time from the VMs Dashboard.

Overview

The Monitoring section within the Instance Detail Panel provides comprehensive performance tracking and health monitoring for your VM instances.

Accessing Monitoring

From VM Card

Quick Access to Monitoring

  1. Locate the VM instance on the dashboard
  2. Scroll to the bottom action buttons
  3. Click the Monitoring Button (Activity icon)
  4. The monitoring dashboard opens
  5. Real-time metrics are displayed immediately

Monitoring Dashboard

Dashboard Components

  • Real-time metric displays
  • Historical trend graphs
  • Performance alerts
  • Activity log
  • Health status indicators
  • Recommendation suggestions

Real-Time Metrics

Key Performance Indicators

The monitoring dashboard displays live metrics updated every 1-5 minutes:

CPU Usage

  • Current CPU utilization percentage (0-100%)
  • Usage trend graph
  • Sustained high usage indicates a bottleneck
  • Normal range: 20-60% for typical workloads

Memory Usage

  • RAM currently in use (GB and percentage)
  • Available memory remaining
  • Memory trend over time
  • Usage above 85% may cause slowdowns

Disk Usage

  • Storage space used vs. total
  • Percentage utilization
  • Free space available
  • Alert if approaching 80%

Network Activity

  • Data in (ingress) - Mbps
  • Data out (egress) - Mbps
  • Network trend graph
  • Spike detection for anomalies
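
The ingress and egress figures above are rates in megabits per second. As a rough sketch of how such a rate can be derived from two readings of a byte counter, the function below does the arithmetic. The function name and the 60-second sampling interval are illustrative assumptions, not part of the dashboard.

```python
def throughput_mbps(bytes_start: int, bytes_end: int, interval_seconds: float) -> float:
    """Convert a byte-counter delta over an interval into megabits per second."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    bits = (bytes_end - bytes_start) * 8  # 8 bits per byte
    return bits / interval_seconds / 1_000_000  # 1 Mbps = 1,000,000 bits/s


# 750 MB transferred in 60 seconds works out to 100 Mbps.
print(throughput_mbps(0, 750_000_000, 60))
```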

Understanding Metrics

CPU Metrics

  • Usage shows processor load
  • Peaks indicate heavy computation
  • Sustained high usage indicates a need for optimization
  • Multi-core systems show per-core breakdown

Memory Metrics

  • Usage shows RAM consumption
  • Includes OS, applications, buffers
  • Trend shows whether memory is leaking
  • Available shows headroom for new processes

Disk Metrics

  • Usage shows storage capacity consumption
  • Includes OS files and application data
  • Growing trend indicates data accumulation
  • Alerting at 80-90% helps prevent out-of-disk errors

Network Metrics

  • Ingress: Data coming into instance
  • Egress: Data leaving instance
  • Spikes may indicate:
    • Large file transfers
    • Data download/upload
    • Network attack or compromise
    • High traffic periods

Metric Graphs

Time Range Selection

  • 1 Hour: Recent behavior and current issues
  • 24 Hours: Daily patterns and peak usage
  • 7 Days: Weekly trends and recurring issues
  • 30 Days: Long-term trends and capacity planning
  • Custom Range: Specific date range analysis

Graph Features

  • X-axis shows time
  • Y-axis shows metric value
  • Hover to see exact values
  • Zoom in/out for detailed view
  • Download graph as image

Identifying Patterns

  • Regular spikes at specific times
  • Steady growth over time
  • Sudden changes or anomalies
  • Correlation between metrics
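
To make "correlation between metrics" concrete, one option is to compute a Pearson correlation coefficient over two exported series, for example CPU usage and network ingress sampled at the same times. This is a minimal sketch on made-up sample values. The dashboard itself may surface correlation differently.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical samples: CPU % and ingress Mbps taken at the same timestamps.
cpu = [20.0, 35.0, 50.0, 65.0, 80.0]
ingress = [100.0, 180.0, 260.0, 330.0, 410.0]
print(round(pearson(cpu, ingress), 3))
```

A value near 1 suggests the two metrics rise together, a value near -1 that one falls as the other rises, and a value near 0 that there is no linear relationship.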

Planning Decisions

  • Capacity planning based on trends
  • Identifying optimal performance windows
  • Predicting when upgrades are needed
  • Detecting performance degradation

Performance Status

Instance Health Status

Status Indicators

Status    Color   Meaning
Healthy   Green   All metrics normal, no issues
Warning   Yellow  One metric approaching threshold
Critical  Red     Metric exceeded critical threshold

Health Components

  • CPU usage level
  • Memory utilization
  • Disk space availability
  • Network connectivity
  • Service responsiveness
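
One plausible way to roll the components above into a single status is to report the most severe component status. The sketch below assumes that "worst component wins" rule and hypothetical component names. The dashboard's actual aggregation logic is not documented here.

```python
# Severity ordering for the three documented statuses.
SEVERITY = {"Healthy": 0, "Warning": 1, "Critical": 2}


def overall_health(component_statuses: dict[str, str]) -> str:
    """Return the most severe status among the components (assumed rule)."""
    return max(
        component_statuses.values(),
        key=SEVERITY.__getitem__,
        default="Healthy",  # no components reported
    )


# Hypothetical component readings.
print(overall_health({
    "cpu": "Healthy",
    "memory": "Warning",
    "disk": "Healthy",
    "network": "Healthy",
    "service": "Healthy",
}))
```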

Status Summary

Overall Health

  • Quick visual indicator
  • Summary of all components
  • Recommendations if issues detected
  • Actions to improve health

Performance Alerts

Setting Up Alerts

Available Alert Types

CPU Alerts

  • Trigger when CPU exceeds threshold
  • Default threshold: 80%
  • Duration: Sustained for 5+ minutes
  • Notification: Immediate

Memory Alerts

  • Trigger when memory usage exceeds threshold
  • Default threshold: 85%
  • Duration: Sustained for 5+ minutes
  • Notification: Immediate

Disk Alerts

  • Trigger when disk usage exceeds threshold
  • Default threshold: 80%
  • Duration: Triggered immediately
  • Notification: High priority

Network Alerts

  • High traffic detected
  • Packet loss detected
  • Connection timeouts
  • Unusual patterns
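
The difference between "sustained for 5+ minutes" and "triggered immediately" can be illustrated with a small evaluator. This is a sketch under assumptions: samples are `(timestamp_seconds, value)` pairs in time order, and a breach counts as sustained only if every sample in the window stays above the threshold. The platform's real evaluation rules may differ.

```python
def sustained_breach(
    samples: list[tuple[float, float]],
    threshold: float,
    min_duration_s: float,
) -> bool:
    """True if the value stays above threshold for at least min_duration_s."""
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # first sample of this breach
            if timestamp - breach_start >= min_duration_s:
                return True
        else:
            breach_start = None  # a dip below the threshold resets the window
    return False


# CPU samples one minute apart, checked against the 80% default for 5 minutes.
cpu = [(0, 85), (60, 90), (120, 82), (180, 88), (240, 91), (300, 86)]
print(sustained_breach(cpu, threshold=80, min_duration_s=300))

# Disk alerts trigger immediately, which corresponds to a zero duration.
print(sustained_breach([(0, 81)], threshold=80, min_duration_s=0))
```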

Creating an Alert

Step-by-Step

  1. Click "Create Alert" button in monitoring panel
  2. Select metric to monitor (CPU, Memory, Disk, Network)
  3. Set threshold value (percentage or GB)
  4. Choose duration (immediate or sustained)
  5. Select notification method:
    • Email
    • Slack
    • SMS (premium)
    • Webhook
  6. Save alert
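
If you keep alert definitions in version control alongside the dashboard, the choices made in the steps above can be captured in a small validated structure. The `AlertRule` class and its field names are hypothetical and only mirror the form fields. They are not a documented API.

```python
from dataclasses import dataclass

METRICS = {"cpu", "memory", "disk", "network"}
CHANNELS = {"email", "slack", "sms", "webhook"}
UNITS = {"percent", "gb"}


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical record of one alert, mirroring the creation steps."""

    metric: str
    threshold: float
    unit: str               # "percent" or "gb"
    sustained_minutes: int  # 0 means the alert fires immediately
    channel: str

    def __post_init__(self) -> None:
        if self.metric not in METRICS:
            raise ValueError(f"unknown metric: {self.metric}")
        if self.unit not in UNITS:
            raise ValueError(f"unknown unit: {self.unit}")
        if self.channel not in CHANNELS:
            raise ValueError(f"unknown channel: {self.channel}")
        if self.unit == "percent" and not 0 <= self.threshold <= 100:
            raise ValueError("percentage threshold must be between 0 and 100")
        if self.threshold < 0 or self.sustained_minutes < 0:
            raise ValueError("threshold and duration must not be negative")


# The default CPU alert described earlier: 80%, sustained for 5 minutes.
rule = AlertRule("cpu", 80, "percent", 5, "email")
print(rule)
```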

Managing Alerts

Viewing Alerts

  • List of active alerts
  • Alert status and severity
  • Last triggered time
  • Alert history

Modifying Alerts

  1. Click alert in list
  2. Select "Edit"
  3. Change threshold or settings
  4. Click "Save Changes"

Disabling Alerts

  1. Click alert
  2. Select "Disable"
  3. The alert no longer triggers
  4. It can be re-enabled later

Deleting Alerts

  1. Click alert
  2. Select "Delete"
  3. Confirm deletion
  4. Alert is permanently removed

Activity Log

Recent Activity

Activity Types

  • Instance created/deleted
  • Configuration changed
  • Instance started/stopped
  • Snapshots created
  • Network changes
  • Security updates
  • User logins
  • Failed operations

Activity Details

  • Action performed
  • Timestamp
  • User who performed action
  • Status (success/failure)
  • Details of change

Viewing Activities

Activity List

  1. Scroll to Activity section in detail panel
  2. Shows 5-10 most recent activities
  3. Click "View All" to see complete history
  4. Filter activities by type
  5. Search by user or action

Exporting Activity Log

Export Options

  1. Click "Export" button
  2. Choose format:
    • CSV for spreadsheets
    • JSON for integration
  3. Select date range
  4. File downloads

Use Cases

  • Audit trail for compliance
  • Troubleshooting recent changes
  • Understanding infrastructure changes
  • Team activity tracking

Performance Optimization

When to Optimize

Signs Your Instance Needs Optimization

  1. High CPU Usage

    • Sustained >80%
    • Affecting application performance
    • Slowing response times
  2. High Memory Usage

    • Sustained >85%
    • Causing slowdowns
    • Application crashes
  3. Low Disk Space

    • >80% full
    • Application warnings
    • Risk of failures
  4. High Network Usage

    • Unexpected peaks
    • Data transfer bottleneck
    • Cost implications

Optimization Strategies

CPU Optimization

  1. Identify CPU-consuming processes
  2. Optimize application code
  3. Reduce running services
  4. Scale to more instances
  5. Upgrade to higher CPU instance type

Memory Optimization

  1. Check for memory leaks
  2. Optimize application memory use
  3. Reduce cache sizes
  4. Restart services periodically
  5. Increase instance memory allocation

Disk Optimization

  1. Clean up temporary files
  2. Implement log rotation
  3. Archive old data
  4. Remove unused packages
  5. Expand disk capacity

Network Optimization

  1. Optimize application payload
  2. Compress data transfer
  3. Use regional endpoints
  4. Implement caching
  5. Reduce unnecessary traffic

Predictive Metrics

Forecasting

Trend Prediction

  • Graph shows projected usage
  • Based on historical patterns
  • Helps predict when limits will be reached
  • Plan upgrades proactively

Capacity Planning

  • Growth rate analysis
  • Time until capacity reached
  • Recommended upgrade timing
  • Cost impact of upgrades
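
"Time until capacity reached" can be estimated from exported history with a straight-line fit. The following is a minimal sketch using ordinary least squares on daily usage percentages. The dashboard's own projection method is not documented here and may well be more sophisticated.

```python
def days_until_limit(days: list[float], usage: list[float], limit: float = 100.0):
    """Estimate days from the last sample until usage reaches limit.

    Fits usage = slope * day + intercept by least squares.
    Returns None when usage is flat or shrinking.
    """
    n = len(days)
    if n < 2 or n != len(usage):
        raise ValueError("need at least two samples of equal length")
    mean_x = sum(days) / n
    mean_y = sum(usage) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("all samples share the same day")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, usage)) / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None  # no growth, so the limit is never reached on this trend
    return max(0.0, (limit - intercept) / slope - days[-1])


# Disk usage growing 2 points per day from 50%: the 80% alert level is 12 days away.
print(days_until_limit([0, 1, 2, 3], [50, 52, 54, 56], limit=80))
```

A linear fit is a deliberately simple choice. It works reasonably for steady growth and poorly for bursty or seasonal workloads.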

Benchmarking

Comparing Performance

Baseline Metrics

  • Normal operation values
  • Average usage patterns
  • Peak usage times
  • Minimum resources needed

Performance Comparison

  • Current vs. historical
  • Same time last week/month
  • Before/after optimization
  • Between instances
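
A before/after comparison comes down to a relative change against the baseline. As a small worked example on made-up CPU samples (the function name is illustrative):

```python
from statistics import mean


def percent_change(before: list[float], after: list[float]) -> float:
    """Relative change of the mean, in percent. Negative means a reduction."""
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline mean is zero, so relative change is undefined")
    return (mean(after) - baseline) / baseline * 100


# Hypothetical CPU % samples before and after an optimization.
before = [60, 70, 80]
after = [35, 35, 35]
print(percent_change(before, after))
```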

Export and Reporting

Export Metrics

Export Options

  1. Click "Export" button
  2. Choose format:
    • CSV for Excel/Sheets
    • JSON for integration
    • PDF for reports
  3. Select date range
  4. File downloads immediately
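
A CSV export can be summarized with the standard library alone. The sketch below assumes a header row and a numeric column named `cpu_percent`. Both the column name and the timestamp format are guesses, so check them against an actual export.

```python
import csv
import io
from statistics import mean


def summarize_column(csv_text: str, column: str) -> dict[str, float]:
    """Return min, max, and mean of one numeric column in exported CSV text."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in rows if row.get(column)]
    if not values:
        raise ValueError(f"no values found in column {column!r}")
    return {"min": min(values), "max": max(values), "mean": mean(values)}


# Made-up export contents in the assumed layout.
export = (
    "timestamp,cpu_percent\n"
    "2024-05-01T09:00:00Z,40\n"
    "2024-05-01T09:05:00Z,60\n"
)
print(summarize_column(export, "cpu_percent"))
```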

Using Exported Data

  • Import into analysis tools
  • Create custom reports
  • Share with team members
  • Archive for compliance
  • Integration with monitoring systems

Report Generation

Creating Reports

  1. Select date range
  2. Choose metrics to include
  3. Add custom notes
  4. Generate PDF report
  5. Share or print

Report Contents

  • Metric summaries
  • Trend graphs
  • Alert history
  • Recommendations
  • Capacity analysis

Alerting Best Practices

Setting Appropriate Thresholds

CPU Alerts

  • Production: 70% for critical apps
  • Development: 85% for general use
  • Start with default, adjust based on experience

Memory Alerts

  • Production: 75% to allow headroom
  • Development: 85% is acceptable
  • Monitor trend, not just threshold

Disk Alerts

  • Warning: 70% (watch closely)
  • Critical: 80% (take action)
  • Emergency: 90% (immediate action)

Network Alerts

  • Baseline: Establish normal usage first
  • Spike: 2-3x normal usage
  • Sustained: High for 5+ minutes
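
The baseline-then-spike approach above can be sketched as follows. The median is used as the baseline here because it is not dragged upward by the spikes themselves. That choice, and the default factor of 2, are assumptions within the 2-3x guideline.

```python
from statistics import median


def find_spikes(samples: list[float], factor: float = 2.0) -> list[tuple[int, float]]:
    """Return (index, value) for samples above factor times the median baseline."""
    if not samples:
        return []
    baseline = median(samples)
    if baseline <= 0:
        return []  # no meaningful baseline to compare against
    return [(i, v) for i, v in enumerate(samples) if v > factor * baseline]


# Hypothetical egress readings in Mbps with one obvious spike.
egress = [100, 110, 95, 105, 320, 100]
print(find_spikes(egress))
```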

Alert Management

  1. Create Key Alerts

    • Disk space critical
    • Memory exhaustion
    • CPU sustained high
    • Service down
  2. Configure Notifications

    • Route to on-call team
    • Escalate if not acknowledged
    • Include remediation steps
  3. Review and Adjust

    • Monitor alert frequency
    • Reduce false positives
    • Adjust thresholds quarterly
    • Update as workload changes

Troubleshooting with Metrics

Diagnosis Using Metrics

Instance Not Responding

  • Check CPU - is it maxed out?
  • Check Memory - running low?
  • Check Network - any activity?
  • Check Disk - space available?
  • Review recent activity log

Slow Performance

  • CPU usage high? Optimize code
  • Memory low? Increase or restart
  • Disk I/O high? Check what's writing
  • Network latency? Check connection

Unexpected Costs

  • Check network egress usage
  • Review instance type vs. utilization
  • Check for unused instances
  • Monitor storage growth

Metric       Warning   Critical  Action
CPU          70%       85%       Optimize or upgrade
Memory       75%       90%       Increase RAM or restart
Disk         80%       95%       Clean up or expand
Network In   500 Mbps  900 Mbps  Monitor or optimize
Network Out  500 Mbps  900 Mbps  Monitor or optimize
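
The warning and critical levels in the table can be applied to exported readings with a simple lookup. In this sketch the metric keys are made-up names, and treating a value equal to a threshold as having reached that level is an assumption.

```python
# (warning, critical) thresholds taken from the table above.
THRESHOLDS = {
    "cpu_percent": (70, 85),
    "memory_percent": (75, 90),
    "disk_percent": (80, 95),
    "network_in_mbps": (500, 900),
    "network_out_mbps": (500, 900),
}


def classify(metric: str, value: float) -> str:
    """Map a reading to Healthy, Warning, or Critical."""
    warning, critical = THRESHOLDS[metric]
    if value >= critical:
        return "Critical"
    if value >= warning:
        return "Warning"
    return "Healthy"


print(classify("cpu_percent", 72))
print(classify("disk_percent", 96))
```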
