Kubernetes Logs Management Guide | Nife Deploy

Nife Deploy provides a comprehensive Kubernetes logs management system that allows you to collect, search, filter, and archive logs from your Kubernetes clusters. This guide will walk you through all the features and best practices.

Overview#

The K8s Logs Management system helps you:

  • Collect logs from Kubernetes pods and containers across multiple clusters
  • Search and filter logs using flexible criteria or LogQL queries
  • Store logs in your database or archive them to S3
  • Monitor system health and log collection status
  • Configure automatic log retention and archival policies

Getting Started#

Accessing K8s Logs#

  1. Navigate to your Nife Dashboard
  2. Select your Organization
  3. Go to Clusters > Kubernetes Logs
  4. Choose a cluster from the dropdown to begin managing logs

Key Components#

The Kubernetes Logs interface consists of several tabs:

  • Logs Tab - Search and view log entries
  • Metrics Tab - Monitor log collection metrics
  • Collection Config Tab - Configure which logs to collect
  • Archive Tab - Set up automatic log archival
  • S3 Storage Tab - Manage logs stored in S3
  • Clusters Tab - Manage cluster configurations
  • Settings Tab - System health and configuration

Logs Tab - Searching and Viewing Logs#

The Logs tab allows you to search and view Kubernetes logs with multiple filtering options.

Basic Log Search#

To search for logs:

  1. Select a Cluster - The cluster dropdown at the top determines which cluster's logs you search
  2. Enter a Search Query (optional) - Type keywords to search in log messages
  3. Filter by Namespace - Select a specific namespace or "All Namespaces"
  4. Filter by Log Level - Choose from:
    • All Levels
    • Error - Critical issues
    • Warning - Potential problems
    • Info - General information
    • Debug - Detailed debugging information
  5. Set Date Range (optional) - Click the calendar icon to select start and end dates
  6. Click "Search Logs" - Execute the search with your selected filters

Log Entry Details#

Each log entry displays:

  • Log Level Badge - Color-coded severity (red=error, yellow=warning, blue=info, gray=debug)
  • Namespace - The Kubernetes namespace
  • Pod Name - The pod that generated the log
  • Container Name - The specific container within the pod
  • Message - The actual log message
  • Timestamp - When the log was generated

Advanced Search with LogQL#

For advanced log queries, use LogQL (Log Query Language):

  1. Scroll down to the LogQL Query section
  2. Enter your query in the text area
  3. Click Execute

Example LogQL queries:

# Find all error logs in the production namespace
{namespace="production"} |= "error"

# Find logs containing "connection" in the pod "api-server"
{pod="api-server"} |= "connection"

# Find lines matching either "timeout" or "refused" (regex line filter)
{namespace="staging"} |~ "timeout|refused"

# Find slow requests (parse structured fields, then filter on duration)
{service="api"} |= "response_time" | logfmt | duration > 1s

What is LogQL?

LogQL is a query language similar to Prometheus's PromQL, designed specifically for querying logs. It supports label matching, text and regex line filters, and parsing of structured log lines. Note that time ranges are set with the date filters above rather than inside the query itself.
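If you build LogQL selectors programmatically (for example when scripting against a logs API), label values must be double-quoted and escaped or the query becomes syntactically invalid. A minimal, hypothetical helper sketching that quoting rule:

```python
def logql_selector(labels, contains=None):
    """Build a LogQL stream selector like {namespace="prod"} |= "error".

    Label values are quoted and escaped so the query stays valid even
    when a value contains quotes or backslashes.
    """
    def quote(value):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    selector = "{" + ", ".join(
        f"{key}={quote(value)}" for key, value in sorted(labels.items())
    ) + "}"
    if contains:
        # Append a |= line filter for a simple "contains" search
        selector += f" |= {quote(contains)}"
    return selector

print(logql_selector({"namespace": "production"}, contains="error"))
# {namespace="production"} |= "error"
```

This mirrors the "missing quotes around label values" pitfall listed in the Troubleshooting section below: the helper always emits `{namespace="prod"}`, never `{namespace=prod}`.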

Export Logs#

To export the current search results:

  1. Click Export button
  2. A JSON file containing all displayed logs will download to your computer
  3. File naming: k8s-logs-[timestamp].json
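The exported JSON can be post-processed with any scripting language. A sketch in Python, assuming the export is a JSON array whose entries carry the fields shown in the log table (level, namespace, pod, message); inspect one exported file first, since the actual field names may differ:

```python
import json
from collections import Counter

# Hypothetical sample standing in for an exported k8s-logs-[timestamp].json file
sample_export = json.dumps([
    {"level": "error", "namespace": "production", "pod": "api-7f9", "message": "connection refused"},
    {"level": "info", "namespace": "production", "pod": "api-7f9", "message": "request served"},
    {"level": "error", "namespace": "staging", "pod": "web-2c1", "message": "timeout"},
])

entries = json.loads(sample_export)  # or: json.load(open("k8s-logs-<timestamp>.json"))
errors = [e for e in entries if e["level"] == "error"]
by_namespace = Counter(e["namespace"] for e in errors)

print(f"{len(errors)} errors: {dict(by_namespace)}")
```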

Tips for Effective Log Searching#

  • Use specific namespaces - Narrowing down your search space makes queries faster
  • Set reasonable date ranges - Searching the entire log history can be slow; use date filters
  • Start simple - Begin with basic keyword searches before trying complex LogQL
  • Save useful queries - Note down LogQL queries you find helpful
  • Export for analysis - Download logs for offline analysis or integration with other tools

Collection Config Tab - Configure Log Collection#

The Collection Config tab allows you to control which logs are collected from your Kubernetes clusters.

Understanding Collection Configurations#

A collection configuration defines:

  • Which cluster logs come from
  • Which namespace(s) to collect logs from
  • What log levels to collect
  • Collection frequency and limits
  • Which namespaces to exclude

Creating a New Configuration#

  1. Click Add Configuration button

  2. Fill in the following details:

    Cluster (Required)

    • Select the target Kubernetes cluster

    Namespace (Required)

    • Specify which namespace to collect logs from
    • Use all to collect from all namespaces

    App Filter (Optional)

    • Filter logs by application name or label

    Log Levels (Optional)

    • Select which log levels to collect
    • Options: Debug, Info, Warning, Error
    • Default: Info, Warning, Error

    Collection Interval (Default: 300 seconds)

    • How frequently to collect logs (in seconds)
    • Minimum: 60 seconds
    • Recommended: 300-600 seconds (5-10 minutes)

    Exclude Namespaces (Optional)

    • List namespaces to skip (useful when collecting "all")
    • Example: kube-system, kube-public

    Max Logs Per Collection (Default: 1000)

    • Maximum number of logs to fetch in each collection cycle
    • Increase for high-volume logging; decrease to reduce load
  3. Toggle Enable to activate the configuration

  4. Click Save Configuration
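The collection interval and the per-cycle cap together bound how many logs one configuration can ingest per day, which is useful for capacity planning. A back-of-the-envelope sketch:

```python
# Upper bound on logs ingested per day for one collection configuration:
# each cycle fetches at most max_logs_per_collection entries, and a cycle
# runs every interval_seconds. Actual volume is lower whenever pods emit
# fewer logs than the cap.
def max_logs_per_day(interval_seconds=300, max_logs_per_collection=1000):
    cycles_per_day = 86400 / interval_seconds
    return int(cycles_per_day * max_logs_per_collection)

print(max_logs_per_day())            # defaults: 300 s interval, 1000-log cap
print(max_logs_per_day(60, 1000))    # aggressive 60 s interval
```

With the defaults this caps out at 288,000 logs per day per configuration; dropping to the 60-second minimum raises the ceiling five-fold, which is the trade-off the Best Practices below ask you to weigh against storage.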

Managing Configurations#

Edit Configuration

  1. Find the configuration in the list
  2. Click the Edit icon
  3. Modify the settings
  4. Click Update

Disable Configuration

  • Toggle the Enable/Disable switch in the configuration row
  • Disabled configurations won't collect logs

Delete Configuration

  1. Click the Delete icon
  2. Confirm deletion

Bulk Update

  • Enable/disable multiple configurations at once
  • Select checkboxes for configurations you want to update
  • Click Bulk Update and choose the action

Best Practices for Collection#

  • Start with critical namespaces - Enable collection for production namespaces first
  • Exclude system namespaces - Skip kube-system, kube-public if not needed
  • Monitor storage - Adjust collection intervals based on your storage capacity
  • Use app filters - Filter by application to reduce noise
  • Review periodically - Disable unused configurations to save resources

Metrics Tab - Monitor Log Collection#

The Metrics tab displays real-time statistics about your log collection.

Key Metrics#

Total Logs

  • Total number of log entries collected across all configured sources

Logs Per Hour

  • Average rate of log generation (useful for capacity planning)

Average Log Size

  • Mean size of log entries (in bytes)

Storage Used

  • Total disk/storage space consumed by collected logs

Logs by Level

  • Breakdown of logs by severity level
  • Shows: Error count, Warning count, Info count, Debug count

Logs by Namespace

  • Distribution of logs across different namespaces
  • Helps identify high-volume namespaces

Interpreting Metrics#

  • High error rate - Investigate why errors are occurring in your cluster
  • Increasing storage - Consider enabling archival or increasing retention policies
  • Uneven distribution - Some namespaces may need more focused monitoring
  • Log size trends - Large logs may indicate verbose applications or inefficient logging

Storage Management

Monitor storage metrics regularly. If storage usage is growing rapidly, consider:

  1. Reducing collection intervals
  2. Lowering max logs per collection
  3. Enabling log compression in archive policies
  4. Reducing retention days
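The Logs Per Hour and Average Log Size metrics above can be turned into a rough runway estimate before you adjust any of these knobs. A sketch, assuming the current rate and entry size stay constant:

```python
# Rough projection of how many days remain before a storage budget is
# exhausted, from the Metrics tab numbers. Purely illustrative.
def days_until_full(logs_per_hour, avg_log_bytes, budget_gb, used_gb=0.0):
    bytes_per_day = logs_per_hour * 24 * avg_log_bytes
    remaining_bytes = (budget_gb - used_gb) * 1024**3
    return remaining_bytes / bytes_per_day

# e.g. 50,000 logs/hour at 500 bytes each against a 100 GB budget
print(round(days_until_full(50_000, 500, budget_gb=100), 1))
```

If the projected runway is short, apply the four mitigations listed above in order of least disruption: longer intervals first, then caps, then compression and retention.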

Archive Tab - Automatic Log Archival#

The Archive tab allows you to create policies for automatic log archival and cleanup.

Why Archive Logs?#

  • Reduce storage costs - Move old logs to cheaper S3 storage
  • Maintain compliance - Keep logs for required retention periods
  • Optimize performance - Remove old logs from primary database
  • Enable auditing - Maintain historical logs for investigation

Creating an Archive Policy#

  1. Click Create Policy

  2. Configure the policy:

    Policy Name

    • A descriptive name for your policy
    • Example: "Production 90-Day Archive"

    Retention Days

    • How long to keep logs before archiving
    • Typical values: 7, 30, 90, 365 days

    Compression

    • Enable to compress archived logs
    • Recommended: Always enabled (reduces storage ~70%)

    Archive Destination

    • Select where to store archived logs
    • Options: S3, GCS, or database

    Target Scope (Optional)

    • Specific app ID or namespace to archive
    • Leave blank to archive all
  3. Toggle Enabled to activate

  4. Click Create Policy
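A retention-days policy boils down to a cutoff date: entries older than the cutoff become eligible for archival on the next run. A minimal sketch of that calculation:

```python
from datetime import datetime, timedelta, timezone

# Sketch of the cutoff a retention-days policy implies: anything with a
# timestamp before the cutoff is eligible for archival on the next run.
def archive_cutoff(retention_days, now=None):
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=retention_days)

fixed_now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(archive_cutoff(90, now=fixed_now).date())  # 2024-03-03
```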

Managing Archive Policies#

View Policy Status

  • Active policies show last archive run date
  • Next scheduled run date is displayed

Run Archive Now

  1. Find the policy
  2. Click Run Now
  3. Archives will be created immediately (may take time for large datasets)

Edit Policy

  1. Click Edit
  2. Modify settings
  3. Click Update

Delete Policy

  1. Click Delete
  2. Confirm (archived data remains in S3)

Archive Jobs#

View the status of archive operations:

  • Pending - Waiting to start
  • Running - Currently archiving logs
  • Completed - Successfully archived
  • Failed - Review error message

S3 Archive Configuration#

To enable S3 archival, you need to configure S3 storage:

  1. Go to Settings Tab > Storage Configuration
  2. Enter S3 credentials:
    • Bucket Name - Your S3 bucket
    • Region - AWS region (e.g., us-east-1)
    • Prefix (Optional) - Path prefix for archived logs

Cost Optimization

Archiving logs to S3 can reduce costs by 90%+:

  • Database storage: ~$0.30/GB/month
  • S3 Standard: ~$0.023/GB/month
  • S3 Glacier: ~$0.004/GB/month (for cold archives)
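The savings compound when compression (roughly 70% reduction, per the Archive tab) is combined with the cheaper tier. An illustrative calculation using the per-GB prices quoted above (check current AWS pricing for your region before planning):

```python
# Illustrative monthly storage cost comparison; prices are the rough
# per-GB/month figures quoted above, not authoritative pricing.
PRICES = {"database": 0.30, "s3_standard": 0.023, "s3_glacier": 0.004}

def monthly_cost(gb, tier, compression_ratio=1.0):
    # compression_ratio=0.3 models the ~70% reduction from compression
    return gb * compression_ratio * PRICES[tier]

uncompressed_db = monthly_cost(500, "database")
compressed_s3 = monthly_cost(500, "s3_standard", compression_ratio=0.3)
print(f"${uncompressed_db:.2f}/mo in DB vs ${compressed_s3:.2f}/mo in S3")
```

For 500 GB of logs, moving compressed archives to S3 Standard drops the bill from $150/month to under $4/month in this model.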

S3 Storage Tab - View Archived Logs#

The S3 Storage tab shows logs that have been archived to S3.

Viewing S3 Archived Logs#

Search by Criteria

  • Cluster ID - Filter by cluster
  • Namespace - Filter by namespace
  • Pod Name - Filter by specific pod
  • Date Range - Logs archived in this period

Columns

  • Cluster and Namespace
  • Pod and Container names
  • S3 location (bucket and key)
  • Log count and file size
  • Compression status
  • Creation date

Retrieving Archived Logs#

  1. Find the archive entry you need
  2. Click View Details
  3. Log content will be loaded and displayed
  4. Click Download to save the archived logs
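Downloaded archives can also be processed offline. A sketch assuming the compressed format is gzipped newline-delimited JSON; your actual archive format may differ, so check the Archive Format setting in the Settings tab before relying on this:

```python
import gzip
import io
import json

# Assumption: a compressed archive is gzipped newline-delimited JSON,
# one log entry per line. Adjust if your Archive Format differs.
def read_archive(compressed_bytes):
    with gzip.open(io.BytesIO(compressed_bytes), mode="rt") as fh:
        return [json.loads(line) for line in fh if line.strip()]

# Simulate a downloaded archive file
raw = "\n".join(json.dumps(e) for e in [
    {"pod": "api-7f9", "message": "started"},
    {"pod": "api-7f9", "message": "listening on :8080"},
]).encode()
archive = gzip.compress(raw)

entries = read_archive(archive)
print(len(entries), entries[0]["pod"])
```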

Storage Statistics#

At the top of the tab:

  • Total Files - Number of archived log files
  • Total Size - Combined size of all archives
  • Total Logs - Number of log entries archived
  • Oldest Log - Earliest log in archive
  • Newest Log - Most recent log in archive

Clusters Tab - Manage Cluster Connections#

The Clusters tab shows all connected Kubernetes clusters and their configurations.

Cluster Information#

For each connected cluster:

Cluster Status

  • Healthy - All components operational
  • Degraded - Some issues but operational
  • Unhealthy - Not operational

Health Indicators

  • Log Collector - Is it collecting logs?
  • Database - Can it store logs?
  • Storage - Is storage available?

Metrics

  • Pods Monitored - Number of pods sending logs
  • Logs Per Minute - Current logging rate

Cluster Configuration#

To configure a cluster:

  1. Click Edit Configuration on the cluster
  2. Modify settings:
    • Log Collector - Enable/disable log collection
    • Retention Days - Default retention period
    • Max Log Size - Maximum size per log entry
  3. Click Save

Refresh Cluster Information#

If cluster info seems outdated:

  1. Click Refresh Clusters
  2. Wait for connection status to update
  3. Check if previously unhealthy clusters are now operational

Troubleshooting Cluster Issues#

Cluster shows "Unhealthy"

  • Check cluster connectivity
  • Verify cluster credentials in Settings
  • Ensure log collector pod is running: kubectl get pods -n nife-logs

No logs appearing

  • Verify collection configs are enabled
  • Check if pods in target namespaces are generating logs
  • Review collection job status for errors

High connection latency

  • Check network connectivity to cluster
  • Review cluster resource utilization
  • Consider adjusting collection intervals

Settings Tab - System Health & Configuration#

The Settings tab displays system health information and overall configuration.

System Health Overview#

Overall Status

  • Green checkmark = All systems healthy
  • Yellow warning = Degraded performance
  • Red X = System issues

System Uptime

  • How long the logging system has been operational

Component Status#

Log Collector

  • Status of the log collection service
  • Last health check timestamp
  • Issues indicate collection may be paused

Database

  • Status of the log database
  • Last health check timestamp
  • Critical for log storage

Storage

  • Status of archive storage
  • Last health check timestamp
  • Important for archival operations

System Configuration#

Review your system settings:

  • API Version - v1
  • Log Collection Method - Kubernetes API
  • Default Retention - Default log retention period
  • Archive Format - How logs are compressed
  • Search Engine - LogQL Compatible

Best Practices Reference#

The Settings tab includes a best practices guide:

  1. Set up archive policies - Automatically manage log retention and storage
  2. Use LogQL queries - Leverage advanced filtering for analysis
  3. Monitor cluster health - Regular health checks ensure smooth operation
  4. Enable compression - Reduce storage costs significantly
  5. Use continuous collection - For production applications requiring real-time monitoring

Common Use Cases#

Finding Error Logs from a Specific Pod#

  1. Open Logs Tab
  2. Select namespace containing the pod
  3. Enter pod name (optional - helps narrow results)
  4. Set Log Level to "Error"
  5. Set date range to last 24 hours
  6. Click "Search Logs"

Debugging a Deployment Issue#

  1. Use Logs Tab with LogQL:
    {deployment="my-app"} |~ "error|exception"
  2. Export results for detailed analysis
  3. Share logs with team via JSON export

Setting Up Log Archival for Compliance#

  1. Go to Archive Tab
  2. Create policy:
    • Retention Days: 365 (1 year)
    • Compression: Enabled
    • Destination: S3
  3. Enable policy
  4. Logs older than 365 days will auto-archive to S3

Reducing Storage Costs#

  1. Review Metrics Tab to identify high-volume namespaces
  2. Configure collection in Collection Config Tab:
    • Increase collection interval (e.g., 600 seconds)
    • Reduce max logs per collection
    • Exclude unnecessary namespaces
  3. Set archive policy with shorter retention (e.g., 7 days)
  4. Monitor storage usage in Metrics tab

Real-Time Monitoring of Production#

  1. Set collection interval to 60 seconds in Collection Config Tab
  2. Create Metrics dashboard showing logs per minute
  3. Set up alerts when log volume spikes
  4. Use LogQL for pattern detection

Troubleshooting#

"No logs found" in search results#

Possible causes:

  • Collection is disabled for the selected namespace
  • No logs exist in the selected time range
  • Search filters are too restrictive

Solutions:

  1. Check Collection Config tab - ensure namespace is enabled
  2. Widen date range and try again
  3. Try searching without filters
  4. Check cluster health in Clusters tab

High memory/storage usage#

Solutions:

  1. Reduce collection interval (collect less frequently)
  2. Reduce max logs per collection
  3. Reduce retention days
  4. Enable archive policies to move old logs to S3

LogQL query errors#

Common issues:

  • Missing quotes around label values: {namespace="prod"} ✓ (not {namespace=prod})
  • Incorrect operators: Use |= for contains, != for does not contain
  • Missing closing braces {}

Validation:

  • Test query in a query editor before submitting
  • Check error message for syntax hints

Cluster shows "Unhealthy" status#

Troubleshooting steps:

  1. Verify cluster credentials in Settings
  2. Check cluster connectivity: kubectl cluster-info
  3. Verify log collector is running:
    kubectl get pods -n nife-logs
    kubectl logs -n nife-logs -l app=log-collector
  4. Check cluster resources (CPU, memory, disk)
  5. Review firewall/network policies

Performance Tips#

  1. Set appropriate collection intervals

    • Production: 60-300 seconds
    • Staging: 300-600 seconds
    • Development: 600+ seconds
  2. Use specific namespaces

    • Collecting all namespaces is slow
    • Use app filters to narrow scope
  3. Archive old logs regularly

    • Frees database space
    • Improves query performance
    • Reduces costs
  4. Filter by log level in collection

    • Don't collect debug logs in production
    • Reduces storage and processing
  5. Use date ranges in searches

    • Searching 6 months of logs is slow
    • Use specific time periods when possible

Security Best Practices#

  1. Restrict access - Limit who can view logs (use RBAC)
  2. Encrypt in transit - Use HTTPS for all API calls
  3. Mask sensitive data - Configure filters to redact PII
  4. Audit log access - Monitor who views logs
  5. Secure archive destination - Use S3 encryption and access controls
  6. Regular backups - Back up archive policies and configurations

FAQ#

Q: How long are logs retained? A: Default is 30 days. Configure in Archive policies or cluster settings.

Q: Can I search logs across multiple clusters? A: Currently, you must select one cluster at a time. Use exports to combine results.

Q: What's the maximum date range I can search? A: No limit, but searching large ranges is slower. Use specific date ranges for better performance.

Q: Does archiving delete logs from the database? A: Yes, archived logs are removed from the database and moved to S3 (if configured).

Q: Can I recover deleted archive policies? A: Archive data remains in S3, but the policy cannot be recovered. Create a new policy to access the data.

Q: How do I export logs for long-term storage? A: Use the Export button in Logs tab (JSON format) or set up S3 archival for automatic storage.

Q: What's the difference between "Continuous" and "One-time" collection? A: Continuous runs on a schedule automatically. One-time collects logs once on demand.


Next Steps#

  • Review your cluster health in the Settings Tab
  • Configure collections for critical namespaces in Collection Config Tab
  • Set up archival policies in Archive Tab
  • Practice LogQL queries in Logs Tab
  • Monitor metrics in Metrics Tab

For additional help, contact Nife Support.