K8s Logs Quick Reference & Cheatsheet
A quick reference guide for common tasks and commands in Kubernetes Logs management.
Dashboard Navigation#
| Task | Location |
|---|---|
| View logs | Clusters → Select Cluster → Logs Tab |
| Search logs | Logs Tab → Enter filters → Click "Search Logs" |
| Configure collection | Collection Config Tab → Create/Edit configs |
| Set up archival | Archive Tab → Create Policy |
| Monitor metrics | Metrics Tab → View statistics |
| Check system health | Settings Tab → System Health section |
| View S3 archives | S3 Storage Tab → Search archives |
| Manage clusters | Clusters Tab → View/edit configs |
Common Search Filters#
| Filter | Purpose | Example |
|---|---|---|
| Namespace | Limit to specific K8s namespace | "production" |
| Log Level | Filter by severity | "error" |
| Date Range | Specify time period | Last 24 hours |
| Search Query | Full-text search in messages | "connection timeout" |
| Pod Name | Filter by specific pod | "api-server-1" |
LogQL Quick Reference#
Basic Queries#
Operators#
| Operator | Meaning | Example |
|---|---|---|
| \|= | Contains (matches) | \|= "error" |
| != | Does not contain | != "debug" |
| \|~ | Regex match | \|~ "err.*\[0-9\]" |
| !~ | Regex not match | !~ "warn" |
| > | Greater than (numeric) | > 500 |
| < | Less than (numeric) | < 100 |
| >= <= | Greater/less or equal | >= 1000 |
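The line-filter operators above can be combined with a label selector to form a complete query. The following is a minimal sketch, not the platform's own tooling: `build_logql` is a hypothetical helper, and the label names `namespace` and `pod` are assumptions about how streams are labelled here. It does not escape quotes inside values.

```python
def build_logql(labels, contains=None, excludes=None, regex=None):
    """Compose a LogQL log query from a stream selector and line filters."""
    # Stream selector: {key="value", ...}; keys are sorted for stable output.
    selector = ", ".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    query = "{" + selector + "}"
    if contains is not None:
        query += f' |= "{contains}"'   # line must contain this text
    if excludes is not None:
        query += f' != "{excludes}"'   # line must not contain this text
    if regex is not None:
        query += f' |~ "{regex}"'      # line must match this regex
    return query


query = build_logql({"namespace": "production"}, contains="error", excludes="debug")
print(query)  # {namespace="production"} |= "error" != "debug"
```

Filters are applied left to right, so placing the most selective one first usually narrows the result set soonest.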
Label Filters#
Advanced Queries#
Collection Configuration Defaults#
| Setting | Default | Recommended Range |
|---|---|---|
| Collection Interval | 300s (5 min) | 60-600 seconds |
| Max Logs Per Fetch | 1000 | 500-5000 |
| Retention Days | 30 | 7-365 |
| Log Levels | Info, Warn, Error | Depends on needs |
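As a rough illustration of the table above, a collection config can be checked against the recommended ranges before it is saved. This is a sketch under assumptions: only `collection_interval` appears in the field mapping in this document; `max_logs_per_fetch` and `retention_days` are hypothetical key names, and `out_of_range` is not part of any published API.

```python
# Recommended ranges and defaults taken from the table above.
# Key names other than collection_interval are hypothetical.
RECOMMENDED_RANGES = {
    "collection_interval": (60, 600),   # seconds
    "max_logs_per_fetch": (500, 5000),
    "retention_days": (7, 365),
}

DEFAULTS = {
    "collection_interval": 300,
    "max_logs_per_fetch": 1000,
    "retention_days": 30,
}


def out_of_range(config):
    """Return the settings whose values fall outside the recommended range."""
    merged = {**DEFAULTS, **config}  # unset keys fall back to the defaults
    problems = {}
    for key, (low, high) in RECOMMENDED_RANGES.items():
        if not low <= merged[key] <= high:
            problems[key] = merged[key]
    return problems


print(out_of_range({"collection_interval": 30, "retention_days": 90}))
# {'collection_interval': 30}
```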
Archive Policy Presets#
Development Environment#
Staging Environment#
Production Environment#
Compliance (Long-term)#
Common Troubleshooting#
No Logs Appearing#
Checklist:
- Is collection enabled for the namespace?
- Does the cluster show "healthy" status?
- Are there logs in the selected date range?
- Is at least one pod in the namespace generating logs?
Fix:
- Go to Collection Config Tab
- Verify the configuration is Enabled
- Check Settings Tab > System Health
- Expand date range and try again
LogQL Query Errors#
Problem: Syntax error at position X
Solutions:
High Storage Usage#
Solutions:
- Reduce retention days in Archive policy
- Increase collection interval (collect less frequently)
- Reduce max logs per collection
- Exclude high-volume namespaces
Performance Tips#
Searching#
Collection#
API Rate Limits#
| Endpoint Type | Limit | Period |
|---|---|---|
| Standard calls | 100 | /minute |
| Search/Archive | 10 | /minute |
| Heavy operations | 5 | /minute |
If rate limited, retry with exponential backoff (1s, 2s, 4s, 8s, ...).
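One way to apply the backoff schedule is sketched below. The `RateLimited` exception and the `request` callable are hypothetical placeholders for however your client reports a rate-limit response; the document does not specify the status code or response body, so adapt the detection to what the API actually returns.

```python
import time


class RateLimited(Exception):
    """Hypothetical signal that a call was rejected due to rate limiting."""


def backoff_delays(max_retries=4, base=1.0, factor=2.0):
    """Delays in seconds before each retry: 1, 2, 4, 8 by default."""
    return [base * factor ** attempt for attempt in range(max_retries)]


def call_with_backoff(request, max_retries=4, sleep=time.sleep):
    """Call request(), waiting longer after each rate-limited attempt."""
    for delay in backoff_delays(max_retries):
        try:
            return request()
        except RateLimited:
            sleep(delay)
    # Final attempt: if it is still rate limited, the exception propagates.
    return request()
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real waiting.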
Storage Calculation#
Database Storage#
S3 Storage (After Compression)#
Common API Calls#
Get Logs (cURL)#
Create Archive Policy (cURL)#
List Collections (cURL)#
Field Mapping (UI → API)#
| UI Field | API Parameter |
|---|---|
| Cluster ID | cluster_id |
| Namespace | namespace |
| Pod Name | pod_name |
| Container | container_name |
| Log Level | log_level |
| Start Time | start_time |
| End Time | end_time |
| Log Levels | log_levels |
| Collection Interval | collection_interval |
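The mapping above can be used to translate filters chosen in the UI into API query parameters. This is a minimal sketch: the parameter names come from the table, while `to_query_string` is a hypothetical helper and the endpoint the query string would be appended to is not shown because it is not given in this document.

```python
from urllib.parse import urlencode

# UI field names and their API parameter names, as listed in the table above.
UI_TO_API = {
    "Cluster ID": "cluster_id",
    "Namespace": "namespace",
    "Pod Name": "pod_name",
    "Container": "container_name",
    "Log Level": "log_level",
    "Start Time": "start_time",
    "End Time": "end_time",
    "Log Levels": "log_levels",
    "Collection Interval": "collection_interval",
}


def to_query_string(ui_filters):
    """Translate UI field names to API parameters and URL-encode them."""
    # An unknown UI field raises KeyError rather than being silently dropped.
    params = {UI_TO_API[field]: value for field, value in ui_filters.items()}
    return urlencode(params)


print(to_query_string({"Namespace": "production", "Log Level": "error"}))
# namespace=production&log_level=error
```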
Retention vs Archive Timeline#
Collection Options Comparison#
| Option | When to Use | Cost | Speed |
|---|---|---|---|
| Interval: 60s | High-priority production | High | Real-time |
| Interval: 300s | Standard production | Medium | Near real-time |
| Interval: 600s | Staging/Dev | Low | Slightly delayed |
| On-Demand | Debugging | Variable | On request |
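To compare the interval options numerically, the sketch below estimates how many fetches run per day and the upper bound on logs collected per config. It assumes exactly one fetch per interval and that each fetch is capped at the Max Logs Per Fetch default of 1000; actual volume depends on how many logs the pods produce.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def fetches_per_day(interval_seconds):
    """Number of collection runs per day at a fixed interval."""
    return SECONDS_PER_DAY // interval_seconds


def max_logs_per_day(interval_seconds, max_logs_per_fetch=1000):
    """Upper bound on logs collected per day for one collection config."""
    return fetches_per_day(interval_seconds) * max_logs_per_fetch


for interval in (60, 300, 600):
    print(interval, fetches_per_day(interval), max_logs_per_day(interval))
# 60 1440 1440000
# 300 288 288000
# 600 144 144000
```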
Status Indicators#
| Indicator | Meaning | Action |
|---|---|---|
| 🟢 Healthy | All systems operational | None needed |
| 🟡 Degraded | Some components slow | Monitor closely |
| 🔴 Unhealthy | System not operational | Check immediately |
Useful Metrics to Monitor#
Quick Wins for Cost Reduction#
- Archive older logs - Saves ~90% (moves to S3)
- Enable compression - Saves ~70% of storage
- Increase collection interval - Reduces volume (300s → 600s)
- Exclude system namespaces - Reduces noise
- Set shorter retention - Reduces storage
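As a quick worked example of the compression figure above, the sketch below applies an approximate saving to a storage size. The 200 GB starting size is purely illustrative, and the ~70% figure is the approximate value quoted in the list, so treat the result as an estimate.

```python
def size_after_saving(size_gb, saving_fraction):
    """Size left after removing the given fraction, rounded to 2 decimals."""
    return round(size_gb * (1 - saving_fraction), 2)


# 200 GB of logs with ~70% saved by compression leaves roughly 60 GB.
print(size_after_saving(200, 0.70))  # 60.0
```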
Before Contacting Support#
- Verified cluster connectivity: `kubectl cluster-info`
- Checked system health status
- Confirmed collection configs are enabled
- Verified date range has logs
- Confirmed LogQL syntax
- Checked rate limits not exceeded
- Exported logs for analysis
Documentation Links#
Help & Support#
- Email: [email protected]
- Docs: https://docs.nife.io
- Status: https://status.nife.io