VM Management Troubleshooting | Common Issues and Solutions

This guide helps you resolve common issues encountered when using Nife VM Management.

General Troubleshooting Steps#

Before diving into specific issues, try these general troubleshooting steps:

  1. Refresh the Dashboard

    • Click the refresh button to reload instance data
    • Clear browser cache (Ctrl+Shift+Delete)
    • Reload the page (F5 or Ctrl+R)
  2. Verify Credentials

    • Confirm credentials are still valid in your cloud provider
    • Check keys haven't been rotated or revoked
    • Verify IAM permissions are still assigned
  3. Check Cloud Provider Status

  4. Review Error Messages

    • Read error messages carefully for specific details
    • Screenshot errors for support tickets
    • Check browser console (F12) for JavaScript errors
  5. Wait and Retry

    • Some operations are asynchronous
    • Wait 1-2 minutes for changes to propagate
    • Retry the operation
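
The wait-and-retry step can be scripted when an operation is driven from code rather than the dashboard. The following is a minimal sketch, not part of Nife's tooling: `retry` and its parameters are hypothetical names, and `operation` stands for any zero-argument callable that raises an exception on failure.

```python
import time


def retry(operation, attempts=3, delay_seconds=60.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Hypothetical helper: `operation` is any zero-argument callable that
    raises an exception when the underlying action fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:  # broad on purpose in this generic sketch
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)  # give asynchronous changes time to propagate
    raise last_error
```

The default delay of 60 seconds mirrors the 1-2 minute wait suggested above; the `sleep` parameter exists only so the delay can be replaced in tests.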

Creating VMs Issues#

"Organization not found" Error#

Cause: Organization doesn't exist or isn't accessible

Solutions:

  1. Refresh the page to reload organizations
  2. Verify you have access to the organization
  3. Ask organization admin to grant access
  4. Create a new organization if needed
  5. Contact Nife support if issue persists

"Cloud provider type is required" Error#

Cause: No cloud provider selected

Solutions:

  1. Select a cloud provider from the dropdown
  2. Choose between AWS, GCP, or Azure
  3. Don't leave it blank
  4. Ensure dropdown opened correctly

"Instance name is required" Error#

Cause: Instance name field is empty

Solutions:

  1. Enter a descriptive instance name
  2. Use lowercase letters, numbers, and hyphens
  3. Name should be 3-50 characters
  4. Make name unique within organization
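
The naming rules above can be checked before submitting the form. This is a minimal sketch that assumes the rules are exactly as listed (lowercase letters, numbers, and hyphens; 3-50 characters); the function name is hypothetical, and uniqueness within the organization cannot be checked locally.

```python
import re

# Lowercase letters, digits, and hyphens, 3 to 50 characters in total.
NAME_PATTERN = re.compile(r"[a-z0-9-]{3,50}")


def is_valid_instance_name(name: str) -> bool:
    """Return True if the name satisfies the character and length rules."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_instance_name("web-server-01")` passes, while `"Web_Server"` fails because of the uppercase letters and the underscore.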

File Upload Issues (GCP)#

"Please enter an instance name before uploading the file"#

Cause: Service account file upload attempted without instance name

Solutions:

  1. First, fill in the Instance Name field
  2. Then select the service account JSON file
  3. Click Upload

"Please select a valid JSON file"#

Cause: Selected file is not in JSON format

Solutions:

  1. Verify you selected the correct file from GCP
  2. File should be named something like project-id-xxxxx.json
  3. Don't rename the file
  4. Try downloading the key again from GCP
  5. Check file extension is .json
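
To rule out a damaged or mislabeled file before uploading, the extension and the JSON syntax can be checked locally. This is a sketch under the assumption that the key file is UTF-8 text; `is_json_file` is a hypothetical helper, not something the dashboard provides.

```python
import json
from pathlib import Path


def is_json_file(path: str) -> bool:
    """Return True if the file has a .json extension and parses as JSON."""
    file_path = Path(path)
    if file_path.suffix.lower() != ".json":
        return False
    try:
        json.loads(file_path.read_text(encoding="utf-8"))
    except (OSError, UnicodeDecodeError, json.JSONDecodeError):
        # Missing file, unreadable bytes, or invalid JSON syntax.
        return False
    return True
```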

"Failed to upload file"#

Cause: Network issue, invalid file, or server error

Solutions:

  1. Check internet connection
  2. Try uploading again
  3. Refresh the page and retry
  4. Use a different browser
  5. Try smaller file size (though it shouldn't matter)
  6. Check file isn't corrupted
  7. Contact support if error persists

AWS Credential Validation Issues#

"Invalid Access Key ID"#

Cause: Access Key ID is incorrect or no longer valid

Solutions:

  1. Verify Access Key ID starts with AKIA (long-term IAM user keys; temporary STS keys start with ASIA)
  2. Copy key exactly from AWS IAM console
  3. Check key hasn't been rotated
  4. Ensure key hasn't been disabled
  5. Verify IAM user is active
  6. Create new access key if needed

To create new AWS access key:

  1. Log into AWS console
  2. Go to IAM → Users → Your user
  3. Click Security credentials
  4. Click Create access key
  5. Select CLI usage
  6. Copy new Access Key ID
  7. Save Secret Access Key securely

"Invalid Secret Access Key"#

Cause: Secret key is incorrect or invalid

Solutions:

  1. Secret key is only visible once when created
  2. If lost, create a new access key
  3. Check for extra spaces or characters
  4. Verify secret key matches Access Key ID
  5. Use password manager to store securely
  6. Deactivate old key if regenerating
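
Copy-and-paste mistakes in either value can be caught before the credentials are submitted. The sketch below only checks for stray whitespace and for the AKIA prefix mentioned earlier; it cannot tell whether AWS will accept the key pair. The function name is hypothetical, and the sample values in the usage note are the placeholder credentials used in AWS documentation.

```python
def credential_problems(access_key_id: str, secret_access_key: str) -> list:
    """Return a list of likely copy-and-paste problems (empty if none found)."""
    problems = []
    key_id = access_key_id.strip()
    secret = secret_access_key.strip()
    if key_id != access_key_id or secret != secret_access_key:
        problems.append("leading or trailing whitespace")
    if any(ch.isspace() for ch in key_id + secret):
        problems.append("whitespace inside a value")
    if not key_id.startswith("AKIA"):
        problems.append("access key ID does not start with AKIA")
    return problems
```

Calling `credential_problems("AKIAIOSFODNN7EXAMPLE", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")` returns an empty list.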

"Instance not found in AWS"#

Cause: Instance doesn't exist or wrong region specified

Solutions:

  1. Verify instance exists in AWS console
  2. Check instance ID format: i- followed by 8 or 17 hexadecimal characters (e.g., i-1234567890abcdef0)
  3. Verify you're using the correct region
  4. Confirm instance isn't terminated
  5. Check instance is in correct AWS account
  6. Go to EC2 → Instances and verify

"Access Denied / Permission Denied"#

Cause: IAM user doesn't have EC2 permissions

Solutions:

  1. Verify user has AmazonEC2FullAccess policy
  2. Or verify custom policy grants necessary permissions
  3. Check policy is attached to user
  4. Wait 1-2 minutes for IAM changes to take effect
  5. Sign out and back into AWS to refresh
  6. Create new access key after policy attachment

GCP Credential Issues#

"Service account is invalid"#

Cause: Service account key file is invalid or incorrect

Solutions:

  1. Download fresh service account key from GCP
  2. Verify file is valid JSON
  3. Check file contains all required fields
  4. Don't manually edit the JSON file
  5. Try uploading again
  6. Use a different browser
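
Whether a key file "contains all required fields" can be checked on the parsed JSON. The field names below are ones that appear in key files downloaded from GCP; which of them Nife actually requires is an assumption, and `key_problems` is a hypothetical helper.

```python
# Fields present in a downloaded GCP service account key file.
# Treating all of them as required is an assumption of this sketch.
EXPECTED_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "token_uri",
)


def key_problems(key: dict) -> list:
    """Return a list of problems found in a parsed key file (empty if none)."""
    problems = [
        f"missing field: {name}" for name in EXPECTED_FIELDS if not key.get(name)
    ]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

Pass it the result of `json.load` on the key file; an empty list means the file has the expected shape, not that the key is still active in GCP.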

"Compute Engine API not enabled"#

Cause: Compute Engine API isn't enabled in GCP project

Solutions:

  1. Go to GCP Console
  2. Navigate to APIs & Services → Library
  3. Search for "Compute Engine API"
  4. Click Enable
  5. Wait 2-3 minutes for activation
  6. Retry VM creation

"Instance not found in GCP"#

Cause: Instance doesn't exist or wrong zone specified

Solutions:

  1. Verify instance exists in GCP console
  2. Go to Compute Engine → VM instances
  3. Verify correct zone (e.g., us-central1-a)
  4. Check instance name spelling
  5. Ensure instance isn't deleted
  6. Verify project is correct
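
A common cause of this error is entering a region (us-central1) where a zone (us-central1-a) is expected. The sketch below checks only the shape of the value, based on the naming pattern of zones such as us-central1-a; it does not confirm that the zone exists or that the instance lives there. The function name is hypothetical.

```python
import re

# Region name (for example "us-central1") followed by "-" and a zone letter.
ZONE_PATTERN = re.compile(r"[a-z]+-[a-z]+[0-9]+-[a-z]")


def looks_like_gcp_zone(value: str) -> bool:
    """Return True if the value is shaped like a zone rather than a region."""
    return ZONE_PATTERN.fullmatch(value) is not None
```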

"Service account lacks permissions"#

Cause: Service account missing necessary GCP roles

Solutions:

  1. Go to GCP Console
  2. Navigate to IAM & Admin → IAM
  3. Find your service account
  4. Click Edit
  5. Grant "Compute Instance Admin (v1)" role
  6. Wait 1-2 minutes for permissions to apply
  7. Retry operation

Azure Credential Issues#

"Invalid Subscription ID"#

Cause: Subscription ID is incorrect or doesn't exist

Solutions:

  1. Go to Azure portal
  2. Click Subscriptions
  3. Copy exact Subscription ID (GUID)
  4. Verify subscription is active
  5. Check you have access to subscription
  6. Avoid extra spaces when copying
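
Because the Subscription ID is a GUID, its format can be verified locally, which also catches stray spaces picked up while copying. This is a minimal sketch using Python's standard `uuid` module; `is_guid` is a hypothetical helper, and the same check applies to the client and tenant IDs discussed later.

```python
import uuid


def is_guid(value: str) -> bool:
    """Return True if the value is a GUID in the 8-4-4-4-12 hyphenated form."""
    try:
        # uuid.UUID also accepts braces and unhyphenated input, so compare
        # against the canonical form to insist on the hyphenated layout.
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False
```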

"Resource Group not found"#

Cause: Resource group doesn't exist in subscription

Solutions:

  1. Go to Azure portal
  2. Click Resource groups
  3. Find and verify resource group name
  4. Check spelling exactly
  5. Confirm resource group in correct subscription
  6. Verify you have access to resource group

"Invalid Client ID / Tenant ID"#

Cause: Service principal credentials are incorrect

Solutions:

  1. Go to Azure AD → App registrations
  2. Click your app
  3. Copy exact Application (client) ID
  4. Go to App registration Properties
  5. Copy exact Directory (tenant) ID
  6. Check GUIDs don't have extra spaces
  7. Create new credentials if needed

"Client secret invalid or expired"#

Cause: Client secret is wrong, expired, or no longer exists

Solutions:

  1. Go to Azure AD → App registrations → Your app
  2. Click Certificates & secrets
  3. Check if secret has expired (date shown)
  4. Create new client secret if expired
  5. Copy Value (not ID) of secret
  6. Immediately save secret securely (won't show again)
  7. Delete old expired secret

"Insufficient permissions for operation"#

Cause: Service principal lacks required permissions

Solutions:

  1. Go to Azure portal
  2. Click Subscriptions → Your subscription
  3. Click Access Control (IAM)
  4. Click Add → Add role assignment
  5. Select "Virtual Machine Contributor"
  6. Find and select your service principal
  7. Click Review + assign
  8. Wait 1-2 minutes for permissions to propagate

Managing VMs Issues#

"Instance not accessible" / "Cannot connect"#

Symptoms#

  • Instance shows as Running in dashboard
  • Cannot connect via SSH or console
  • Connection timeout errors

Solutions#

For AWS:

  1. Go to AWS EC2 console
  2. Check Security Group rules
  3. Ensure port 22 (SSH) is allowed
  4. Check inbound rules allow your IP
  5. Verify instance has public IP
  6. Try accessing from AWS console first

For GCP:

  1. Go to GCP Compute Engine
  2. Click on instance
  3. Click Edit
  4. Check Firewall rules
  5. Ensure the network tag targeted by your SSH firewall rule is applied to the instance
  6. Check VPC firewall rules allow SSH
  7. Try the gcloud compute ssh command first

For Azure:

  1. Go to Azure portal
  2. Find Virtual Machine
  3. Check Network Interface
  4. Verify Network Security Group (NSG)
  5. Ensure inbound rule allows port 22
  6. Check your IP isn't blocked
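
Regardless of provider, a quick way to tell a firewall problem from an SSH configuration problem is to test whether port 22 accepts TCP connections from your machine. This is a minimal sketch using Python's standard `socket` module; the function name and the address in the usage note are placeholders.

```python
import socket


def can_reach(host: str, port: int = 22, timeout_seconds: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

If `can_reach("203.0.113.10")` returns False, the security group, firewall rule, or NSG is the likely culprit. If it returns True but SSH still fails, look at keys and user names instead.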

"Operation timed out"#

Cause: Operation took longer than expected

Solutions:

  1. Wait 1-2 minutes and refresh
  2. Check instance status in cloud provider console
  3. Retry the operation
  4. Check network connectivity
  5. Try different browser
  6. Check for service maintenance windows

Instance Won't Start#

Symptoms:

  • Start button disabled or grayed out
  • Error when clicking start
  • Instance remains in Stopped state

Solutions:

  1. Check instance status in cloud provider console
  2. Verify cloud provider doesn't have issues
  3. Check for startup errors in instance logs
  4. Verify disk space available
  5. Check instance type limitations
  6. Try force stopping, then restarting
  7. Contact cloud provider support if persists

Instance Won't Stop#

Symptoms:

  • Stop operation hangs
  • Instance remains Running
  • Timeout errors

Solutions:

  1. Wait longer (stop can take minutes)
  2. Force stop from cloud provider console
  3. Check for services preventing shutdown
  4. Check instance logs for errors
  5. Try restart instead
  6. Contact cloud provider support

Cannot Delete Instance#

Symptoms:

  • Delete operation fails with error
  • Instance still exists after deletion attempt

Solutions:

  1. Verify instance has no dependencies
  2. Detach storage volumes first
  3. Remove from load balancers
  4. Cancel any ongoing operations
  5. Try deleting from cloud provider console
  6. Check for resource locks
  7. Verify proper permissions

"Insufficient permissions" Error#

Cause: Your Nife account lacks permissions for this organization

Solutions:

  1. Ask organization admin to grant access
  2. Verify your role in organization
  3. Check if you're in correct organization
  4. Ask for VM management permissions specifically
  5. Create test instance in accessible organization

Monitoring Issues#

No Metrics Displayed#

Cause: Instance too new, metrics not available, or monitoring disabled

Solutions:

  1. Wait 5-10 minutes for metrics to populate
  2. Check instance is in Running state
  3. Verify instance has been running >5 minutes
  4. Refresh the monitoring page
  5. Check browser console for errors
  6. Try different browser

Metrics Stopped Updating#

Cause: Instance offline, monitoring service issue, or connectivity problem

Solutions:

  1. Check instance status
  2. Refresh the monitoring page
  3. Stop and restart instance
  4. Check cloud provider is operational
  5. Wait 5 minutes and retry
  6. Clear browser cache
  7. Contact support if persists

Alerts Not Sending#

Cause: Notification method misconfigured or issue with service

Solutions:

  1. Verify alert threshold settings
  2. Check notification email/Slack is correct
  3. Verify notification service is enabled
  4. Check spam folder if email alert
  5. Recreate alert from scratch
  6. Test with manual alert trigger
  7. Contact support if issue continues

Export Issues#

Export File Is Empty#

Cause: No instances to export or query failed

Solutions:

  1. Verify you have VM instances created
  2. Check instances aren't filtered out
  3. Refresh instance list first
  4. Try again after refreshing page
  5. Check browser console for errors

Export File Is Corrupted#

Cause: Network issue during download or browser issue

Solutions:

  1. Try exporting again
  2. Use different browser
  3. Check available disk space
  4. Try JSON format instead of CSV (or vice versa)
  5. Check browser's download settings

CSV File Not Readable in Excel#

Cause: Encoding issue or Excel import problem

Solutions:

  1. Open in Google Sheets (usually works better)
  2. Import CSV with UTF-8 encoding
  3. Use Excel's Text Import Wizard
  4. Try opening with different application
  5. Export as JSON and convert

Performance Issues#

Dashboard Loads Slowly#

Cause: Too many instances, network issue, or browser performance

Solutions:

  1. Use filters to reduce visible instances
  2. Close other browser tabs
  3. Clear browser cache
  4. Refresh the page
  5. Try different browser
  6. Disable extensions
  7. Check internet connection

Operations Are Slow#

Cause: Cloud provider API delays, network latency, or service issues

Solutions:

  1. Wait for current operation to complete
  2. Check cloud provider status
  3. Try again after 1-2 minutes
  4. Verify network connection
  5. Try from different location
  6. Contact Nife support if recurring

Getting Help#

When to Contact Support#

Contact Nife support if:

  • Issue persists after troubleshooting
  • Error message unclear or unusual
  • Multiple operations failing
  • Data loss or security concern
  • Performance issues across multiple operations

Providing Support Information#

When contacting support, include:

  1. Error message: Exact text or screenshot
  2. Steps to reproduce: How you got the error
  3. Instance details: Name, provider, organization
  4. Timing: When did the issue start
  5. Environment: Browser, OS, network
  6. Recent changes: What changed before issue
  7. Screenshots: Of error or issue

Support Channels#

  • Email: [email protected]
  • Chat: Live chat in dashboard
  • Documentation: Check docs first
  • Status Page: Check service status

FAQ#

Q: How long do instance operations take? A: Start/stop: 1-5 minutes, Restart: 2-10 minutes, Create: 5-20 minutes

Q: Why can't I see my instance? A: Check filters, search query, or organization selection

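Q: Do I lose data when stopping an instance? A: No, stopping preserves data on persistent disks. Data on ephemeral storage (for example, AWS instance store volumes) may be lost. Deleting the instance removes its data.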

Q: Can I undo a deletion? A: No. Deletions are permanent. Restore from backups/snapshots if available.

Q: How often are metrics updated? A: Metrics update every 1-5 minutes depending on metric type.
