Kubernetes Logs Management Guide | Nife Deploy

Nife Deploy provides a comprehensive Kubernetes logs management system that allows you to collect, search, filter, and archive logs from your Kubernetes clusters. This guide will walk you through all the features and best practices.

Overview#

The K8s Logs Management system helps you:

  • Collect logs from Kubernetes pods and containers across multiple clusters
  • Search and filter logs using flexible criteria or LogQL queries
  • Store logs in your database or archive them to S3
  • Monitor system health and log collection status
  • Configure automatic log retention and archival policies

Getting Started#

Accessing K8s Logs#

  1. Navigate to your Nife Dashboard
  2. Select your Organization
  3. Go to Clusters > Kubernetes Logs
  4. Choose a cluster from the dropdown to begin managing logs

Key Components#

The Kubernetes Logs interface consists of several tabs:

  • Logs Tab - Search and view log entries
  • Metrics Tab - Monitor log collection metrics
  • Collection Config Tab - Configure which logs to collect
  • Archive Tab - Set up automatic log archival
  • S3 Storage Tab - Manage logs stored in S3
  • Clusters Tab - Manage cluster configurations
  • Settings Tab - System health and configuration

Logs Tab - Searching and Viewing Logs#

The Logs tab allows you to search and view Kubernetes logs with multiple filtering options.

Basic Log Search#

To search for logs:

  1. Select a Cluster - The cluster dropdown at the top determines which cluster's logs you search
  2. Enter a Search Query (optional) - Type keywords to search in log messages
  3. Filter by Namespace - Select a specific namespace or "All Namespaces"
  4. Filter by Log Level - Choose from:
    • All Levels
    • Error - Critical issues
    • Warning - Potential problems
    • Info - General information
    • Debug - Detailed debugging information
  5. Set Date Range (optional) - Click the calendar icon to select start and end dates
  6. Click "Search Logs" - Execute the search with your selected filters

Log Entry Details#

Each log entry displays:

  • Log Level Badge - Color-coded severity (red=error, yellow=warning, blue=info, gray=debug)
  • Namespace - The Kubernetes namespace
  • Pod Name - The pod that generated the log
  • Container Name - The specific container within the pod
  • Message - The actual log message
  • Timestamp - When the log was generated

Advanced Search with LogQL#

For advanced log queries, use LogQL (Log Query Language):

  1. Scroll down to the LogQL Query section
  2. Enter your query in the text area
  3. Click Execute

Example LogQL queries:

# Find all error logs in the production namespace
{namespace="production"} |= "error"

# Find logs containing "connection" in the pod "api-server"
{pod="api-server"} |= "connection"

# Find lines matching either "timeout" or "refused" (regex line filter)
{namespace="staging"} |~ "timeout|refused"

# Find slow requests (parse structured fields, then filter on duration)
{service="api"} |= "response_time" | logfmt | duration > 1s

What is LogQL?

LogQL is a query language similar to Prometheus's PromQL, designed specifically for querying logs. It supports label matching, text and regex line filters, and parsing of structured log lines. Note that time ranges are set with the date filters above rather than inside the query itself.
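If you build LogQL selectors programmatically (for example when scripting against a logs API), label values must be double-quoted and escaped or the query becomes syntactically invalid. A minimal, hypothetical helper sketching that quoting rule:

```python
def logql_selector(labels, contains=None):
    """Build a LogQL stream selector like {namespace="prod"} |= "error".

    Label values are quoted and escaped so the query stays valid even
    when a value contains quotes or backslashes.
    """
    def quote(value):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    selector = "{" + ", ".join(
        f"{key}={quote(value)}" for key, value in sorted(labels.items())
    ) + "}"
    if contains:
        # Append a |= line filter for a simple "contains" search
        selector += f" |= {quote(contains)}"
    return selector

print(logql_selector({"namespace": "production"}, contains="error"))
# {namespace="production"} |= "error"
```

This mirrors the "missing quotes around label values" pitfall listed in the Troubleshooting section below: the helper always emits `{namespace="prod"}`, never `{namespace=prod}`.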

Export Logs#

To export the current search results:

  1. Click Export button
  2. A JSON file containing all displayed logs will download to your computer
  3. File naming: k8s-logs-[timestamp].json
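The exported JSON can be post-processed with any scripting language. A sketch in Python, assuming the export is a JSON array whose entries carry the fields shown in the log table (level, namespace, pod, message); inspect one exported file first, since the actual field names may differ:

```python
import json
from collections import Counter

# Hypothetical sample standing in for an exported k8s-logs-[timestamp].json file
sample_export = json.dumps([
    {"level": "error", "namespace": "production", "pod": "api-7f9", "message": "connection refused"},
    {"level": "info", "namespace": "production", "pod": "api-7f9", "message": "request served"},
    {"level": "error", "namespace": "staging", "pod": "web-2c1", "message": "timeout"},
])

entries = json.loads(sample_export)  # or: json.load(open("k8s-logs-<timestamp>.json"))
errors = [e for e in entries if e["level"] == "error"]
by_namespace = Counter(e["namespace"] for e in errors)

print(f"{len(errors)} errors: {dict(by_namespace)}")
```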

Tips for Effective Log Searching#

  • Use specific namespaces - Narrowing down your search space makes queries faster
  • Set reasonable date ranges - Searching the entire log history can be slow; use date filters
  • Start simple - Begin with basic keyword searches before trying complex LogQL
  • Save useful queries - Note down LogQL queries you find helpful
  • Export for analysis - Download logs for offline analysis or integration with other tools

Collection Config Tab - Configure Log Collection#

The Collection Config tab allows you to control which logs are collected from your Kubernetes clusters.

Understanding Collection Configurations#

A collection configuration defines:

  • Which cluster logs come from
  • Which namespace(s) to collect logs from
  • What log levels to collect
  • Collection frequency and limits
  • Which namespaces to exclude

Creating a New Configuration#

  1. Click Add Configuration button

  2. Fill in the following details:

    Cluster (Required)

    • Select the target Kubernetes cluster

    Namespace (Required)

    • Specify which namespace to collect logs from
    • Use all to collect from all namespaces

    App Filter (Optional)

    • Filter logs by application name or label

    Log Levels (Optional)

    • Select which log levels to collect
    • Options: Debug, Info, Warning, Error
    • Default: Info, Warning, Error

    Collection Interval (Default: 300 seconds)

    • How frequently to collect logs (in seconds)
    • Minimum: 60 seconds
    • Recommended: 300-600 seconds (5-10 minutes)

    Exclude Namespaces (Optional)

    • List namespaces to skip (useful when collecting "all")
    • Example: kube-system, kube-public

    Max Logs Per Collection (Default: 1000)

    • Maximum number of logs to fetch in each collection cycle
    • Increase for high-volume logging; decrease to reduce load
  3. Toggle Enable to activate the configuration

  4. Click Save Configuration
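The collection interval and the per-cycle cap together bound how many logs one configuration can ingest per day, which is useful for capacity planning. A back-of-the-envelope sketch:

```python
# Upper bound on logs ingested per day for one collection configuration:
# each cycle fetches at most max_logs_per_collection entries, and a cycle
# runs every interval_seconds. Actual volume is lower whenever pods emit
# fewer logs than the cap.
def max_logs_per_day(interval_seconds=300, max_logs_per_collection=1000):
    cycles_per_day = 86400 / interval_seconds
    return int(cycles_per_day * max_logs_per_collection)

print(max_logs_per_day())            # defaults: 300 s interval, 1000-log cap
print(max_logs_per_day(60, 1000))    # aggressive 60 s interval
```

With the defaults this caps out at 288,000 logs per day per configuration; dropping to the 60-second minimum raises the ceiling five-fold, which is the trade-off the Best Practices below ask you to weigh against storage.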

Managing Configurations#

Edit Configuration

  1. Find the configuration in the list
  2. Click the Edit icon
  3. Modify the settings
  4. Click Update

Disable Configuration

  • Toggle the Enable/Disable switch in the configuration row
  • Disabled configurations won't collect logs

Delete Configuration

  1. Click the Delete icon
  2. Confirm deletion

Bulk Update

  • Enable/disable multiple configurations at once
  • Select checkboxes for configurations you want to update
  • Click Bulk Update and choose the action

Best Practices for Collection#

  • Start with critical namespaces - Enable collection for production namespaces first
  • Exclude system namespaces - Skip kube-system, kube-public if not needed
  • Monitor storage - Adjust collection intervals based on your storage capacity
  • Use app filters - Filter by application to reduce noise
  • Review periodically - Disable unused configurations to save resources

Metrics Tab - Monitor Log Collection#

The Metrics tab displays real-time statistics about your log collection.

Key Metrics#

Total Logs

  • Total number of log entries collected across all configured sources

Logs Per Hour

  • Average rate of log generation (useful for capacity planning)

Average Log Size

  • Mean size of log entries (in bytes)

Storage Used

  • Total disk/storage space consumed by collected logs

Logs by Level

  • Breakdown of logs by severity level
  • Shows: Error count, Warning count, Info count, Debug count

Logs by Namespace

  • Distribution of logs across different namespaces
  • Helps identify high-volume namespaces

Interpreting Metrics#

  • High error rate - Investigate why errors are occurring in your cluster
  • Increasing storage - Consider enabling archival or increasing retention policies
  • Uneven distribution - Some namespaces may need more focused monitoring
  • Log size trends - Large logs may indicate verbose applications or inefficient logging

Storage Management

Monitor storage metrics regularly. If storage usage is growing rapidly, consider:

  1. Reducing collection intervals
  2. Lowering max logs per collection
  3. Enabling log compression in archive policies
  4. Reducing retention days
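The Logs Per Hour and Average Log Size metrics above can be turned into a rough runway estimate before you adjust any of these knobs. A sketch, assuming the current rate and entry size stay constant:

```python
# Rough projection of how many days remain before a storage budget is
# exhausted, from the Metrics tab numbers. Purely illustrative.
def days_until_full(logs_per_hour, avg_log_bytes, budget_gb, used_gb=0.0):
    bytes_per_day = logs_per_hour * 24 * avg_log_bytes
    remaining_bytes = (budget_gb - used_gb) * 1024**3
    return remaining_bytes / bytes_per_day

# e.g. 50,000 logs/hour at 500 bytes each against a 100 GB budget
print(round(days_until_full(50_000, 500, budget_gb=100), 1))
```

If the projected runway is short, apply the four mitigations listed above in order of least disruption: longer intervals first, then caps, then compression and retention.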

Archive Tab - Automatic Log Archival#

The Archive tab allows you to create policies for automatic log archival and cleanup.

Why Archive Logs?#

  • Reduce storage costs - Move old logs to cheaper S3 storage
  • Maintain compliance - Keep logs for required retention periods
  • Optimize performance - Remove old logs from primary database
  • Enable auditing - Maintain historical logs for investigation

Creating an Archive Policy#

  1. Click Create Policy

  2. Configure the policy:

    Policy Name

    • A descriptive name for your policy
    • Example: "Production 90-Day Archive"

    Retention Days

    • How long to keep logs before archiving
    • Typical values: 7, 30, 90, 365 days

    Compression

    • Enable to compress archived logs
    • Recommended: Always enabled (reduces storage ~70%)

    Archive Destination

    • Select where to store archived logs
    • Options: S3, GCS, or database

    Target Scope (Optional)

    • Specific app ID or namespace to archive
    • Leave blank to archive all
  3. Toggle Enabled to activate

  4. Click Create Policy
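A retention-days policy boils down to a cutoff date: entries older than the cutoff become eligible for archival on the next run. A minimal sketch of that calculation:

```python
from datetime import datetime, timedelta, timezone

# Sketch of the cutoff a retention-days policy implies: anything with a
# timestamp before the cutoff is eligible for archival on the next run.
def archive_cutoff(retention_days, now=None):
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=retention_days)

fixed_now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(archive_cutoff(90, now=fixed_now).date())  # 2024-03-03
```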

Managing Archive Policies#

View Policy Status

  • Active policies show last archive run date
  • Next scheduled run date is displayed

Run Archive Now

  1. Find the policy
  2. Click Run Now
  3. Archives will be created immediately (may take time for large datasets)

Edit Policy

  1. Click Edit
  2. Modify settings
  3. Click Update

Delete Policy

  1. Click Delete
  2. Confirm (archived data remains in S3)

Archive Jobs#

View the status of archive operations:

  • Pending - Waiting to start
  • Running - Currently archiving logs
  • Completed - Successfully archived
  • Failed - Review error message

S3 Archive Configuration#

To enable S3 archival, you need to configure S3 storage:

  1. Go to Settings Tab > Storage Configuration
  2. Enter S3 credentials:
    • Bucket Name - Your S3 bucket
    • Region - AWS region (e.g., us-east-1)
    • Prefix (Optional) - Path prefix for archived logs

Cost Optimization

Archiving logs to S3 can reduce costs by 90%+:

  • Database storage: ~$0.30/GB/month
  • S3 Standard: ~$0.023/GB/month
  • S3 Glacier: ~$0.004/GB/month (for cold archives)
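The savings compound when compression (roughly 70% reduction, per the Archive tab) is combined with the cheaper tier. An illustrative calculation using the per-GB prices quoted above (check current AWS pricing for your region before planning):

```python
# Illustrative monthly storage cost comparison; prices are the rough
# per-GB/month figures quoted above, not authoritative pricing.
PRICES = {"database": 0.30, "s3_standard": 0.023, "s3_glacier": 0.004}

def monthly_cost(gb, tier, compression_ratio=1.0):
    # compression_ratio=0.3 models the ~70% reduction from compression
    return gb * compression_ratio * PRICES[tier]

uncompressed_db = monthly_cost(500, "database")
compressed_s3 = monthly_cost(500, "s3_standard", compression_ratio=0.3)
print(f"${uncompressed_db:.2f}/mo in DB vs ${compressed_s3:.2f}/mo in S3")
```

For 500 GB of logs, moving compressed archives to S3 Standard drops the bill from $150/month to under $4/month in this model.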

S3 Storage Tab - View Archived Logs#

The S3 Storage tab shows logs that have been archived to S3.

Viewing S3 Archived Logs#

Search by Criteria

  • Cluster ID - Filter by cluster
  • Namespace - Filter by namespace
  • Pod Name - Filter by specific pod
  • Date Range - Logs archived in this period

Columns

  • Cluster and Namespace
  • Pod and Container names
  • S3 location (bucket and key)
  • Log count and file size
  • Compression status
  • Creation date

Retrieving Archived Logs#

  1. Find the archive entry you need
  2. Click View Details
  3. Log content will be loaded and displayed
  4. Click Download to save the archived logs
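Downloaded archives can also be processed offline. A sketch assuming the compressed format is gzipped newline-delimited JSON; your actual archive format may differ, so check the Archive Format setting in the Settings tab before relying on this:

```python
import gzip
import io
import json

# Assumption: a compressed archive is gzipped newline-delimited JSON,
# one log entry per line. Adjust if your Archive Format differs.
def read_archive(compressed_bytes):
    with gzip.open(io.BytesIO(compressed_bytes), mode="rt") as fh:
        return [json.loads(line) for line in fh if line.strip()]

# Simulate a downloaded archive file
raw = "\n".join(json.dumps(e) for e in [
    {"pod": "api-7f9", "message": "started"},
    {"pod": "api-7f9", "message": "listening on :8080"},
]).encode()
archive = gzip.compress(raw)

entries = read_archive(archive)
print(len(entries), entries[0]["pod"])
```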

Storage Statistics#

At the top of the tab:

  • Total Files - Number of archived log files
  • Total Size - Combined size of all archives
  • Total Logs - Number of log entries archived
  • Oldest Log - Earliest log in archive
  • Newest Log - Most recent log in archive

Clusters Tab - Manage Cluster Connections#

The Clusters tab shows all connected Kubernetes clusters and their configurations.

Cluster Information#

For each connected cluster:

Cluster Status

  • Healthy - All components operational
  • Degraded - Some issues but operational
  • Unhealthy - Not operational

Health Indicators

  • Log Collector - Is it collecting logs?
  • Database - Can it store logs?
  • Storage - Is storage available?

Metrics

  • Pods Monitored - Number of pods sending logs
  • Logs Per Minute - Current logging rate

Cluster Configuration#

To configure a cluster:

  1. Click Edit Configuration on the cluster
  2. Modify settings:
    • Log Collector - Enable/disable log collection
    • Retention Days - Default retention period
    • Max Log Size - Maximum size per log entry
  3. Click Save

Refresh Cluster Information#

If cluster info seems outdated:

  1. Click Refresh Clusters
  2. Wait for connection status to update
  3. Check if previously unhealthy clusters are now operational

Troubleshooting Cluster Issues#

Cluster shows "Unhealthy"

  • Check cluster connectivity
  • Verify cluster credentials in Settings
  • Ensure log collector pod is running: kubectl get pods -n nife-logs

No logs appearing

  • Verify collection configs are enabled
  • Check if pods in target namespaces are generating logs
  • Review collection job status for errors

High connection latency

  • Check network connectivity to cluster
  • Review cluster resource utilization
  • Consider adjusting collection intervals

Settings Tab - System Health & Configuration#

The Settings tab displays system health information and overall configuration.

System Health Overview#

Overall Status

  • Green checkmark = All systems healthy
  • Yellow warning = Degraded performance
  • Red X = System issues

System Uptime

  • How long the logging system has been operational

Component Status#

Log Collector

  • Status of the log collection service
  • Last health check timestamp
  • Issues indicate collection may be paused

Database

  • Status of the log database
  • Last health check timestamp
  • Critical for log storage

Storage

  • Status of archive storage
  • Last health check timestamp
  • Important for archival operations

System Configuration#

Review your system settings:

  • API Version - v1
  • Log Collection Method - Kubernetes API
  • Default Retention - Default log retention period
  • Archive Format - How logs are compressed
  • Search Engine - LogQL Compatible

Best Practices Reference#

The Settings tab includes a best practices guide:

  1. Set up archive policies - Automatically manage log retention and storage
  2. Use LogQL queries - Leverage advanced filtering for analysis
  3. Monitor cluster health - Regular health checks ensure smooth operation
  4. Enable compression - Reduce storage costs significantly
  5. Use continuous collection - For production applications requiring real-time monitoring

Common Use Cases#

Finding Error Logs from a Specific Pod#

  1. Open Logs Tab
  2. Select namespace containing the pod
  3. Enter pod name (optional - helps narrow results)
  4. Set Log Level to "Error"
  5. Set date range to last 24 hours
  6. Click "Search Logs"

Debugging a Deployment Issue#

  1. Use Logs Tab with LogQL:
    {deployment="my-app"} |~ "error|exception"
  2. Export results for detailed analysis
  3. Share logs with team via JSON export

Setting Up Log Archival for Compliance#

  1. Go to Archive Tab
  2. Create policy:
    • Retention Days: 365 (1 year)
    • Compression: Enabled
    • Destination: S3
  3. Enable policy
  4. Logs older than 365 days will auto-archive to S3

Reducing Storage Costs#

  1. Review Metrics Tab to identify high-volume namespaces
  2. Configure collection in Collection Config Tab:
    • Increase collection interval (e.g., 600 seconds)
    • Reduce max logs per collection
    • Exclude unnecessary namespaces
  3. Set archive policy with shorter retention (e.g., 7 days)
  4. Monitor storage usage in Metrics tab

Real-Time Monitoring of Production#

  1. Set collection interval to 60 seconds in Collection Config Tab
  2. Create Metrics dashboard showing logs per minute
  3. Set up alerts when log volume spikes
  4. Use LogQL for pattern detection

Troubleshooting#

"No logs found" in search results#

Possible causes:

  • Collection is disabled for the selected namespace
  • No logs exist in the selected time range
  • Search filters are too restrictive

Solutions:

  1. Check Collection Config tab - ensure namespace is enabled
  2. Widen date range and try again
  3. Try searching without filters
  4. Check cluster health in Clusters tab

High memory/storage usage#

Solutions:

  1. Reduce collection interval (collect less frequently)
  2. Reduce max logs per collection
  3. Reduce retention days
  4. Enable archive policies to move old logs to S3

LogQL query errors#

Common issues:

  • Missing quotes around label values: {namespace="prod"} ✓ (not {namespace=prod})
  • Incorrect operators: Use |= for contains, != for does not contain
  • Missing closing braces {}

Validation:

  • Test query in a query editor before submitting
  • Check error message for syntax hints

Cluster shows "Unhealthy" status#

Troubleshooting steps:

  1. Verify cluster credentials in Settings
  2. Check cluster connectivity: kubectl cluster-info
  3. Verify log collector is running:
    kubectl get pods -n nife-logs
    kubectl logs -n nife-logs -l app=log-collector
  4. Check cluster resources (CPU, memory, disk)
  5. Review firewall/network policies

Performance Tips#

  1. Set appropriate collection intervals

    • Production: 60-300 seconds
    • Staging: 300-600 seconds
    • Development: 600+ seconds
  2. Use specific namespaces

    • Collecting all namespaces is slow
    • Use app filters to narrow scope
  3. Archive old logs regularly

    • Frees database space
    • Improves query performance
    • Reduces costs
  4. Filter by log level in collection

    • Don't collect debug logs in production
    • Reduces storage and processing
  5. Use date ranges in searches

    • Searching 6 months of logs is slow
    • Use specific time periods when possible

Security Best Practices#

  1. Restrict access - Limit who can view logs (use RBAC)
  2. Encrypt in transit - Use HTTPS for all API calls
  3. Mask sensitive data - Configure filters to redact PII
  4. Audit log access - Monitor who views logs
  5. Secure archive destination - Use S3 encryption and access controls
  6. Regular backups - Back up archive policies and configurations

FAQ#

Q: How long are logs retained? A: Default is 30 days. Configure in Archive policies or cluster settings.

Q: Can I search logs across multiple clusters? A: Currently, you must select one cluster at a time. Use exports to combine results.

Q: What's the maximum date range I can search? A: No limit, but searching large ranges is slower. Use specific date ranges for better performance.

Q: Does archiving delete logs from the database? A: Yes, archived logs are removed from the database and moved to S3 (if configured).

Q: Can I recover deleted archive policies? A: Archive data remains in S3, but the policy cannot be recovered. Create a new policy to access the data.

Q: How do I export logs for long-term storage? A: Use the Export button in Logs tab (JSON format) or set up S3 archival for automatic storage.

Q: What's the difference between "Continuous" and "One-time" collection? A: Continuous runs on a schedule automatically. One-time collects logs once on demand.


Next Steps#

  • Review your cluster health in the Settings Tab
  • Configure collections for critical namespaces in Collection Config Tab
  • Set up archival policies in Archive Tab
  • Practice LogQL queries in Logs Tab
  • Monitor metrics in Metrics Tab

For additional help, contact Nife Support.