How to Monitor CPU, Memory and More on Linux

Understanding Linux resource usage should be a core skill for any sysadmin. Quickly diagnosing high utilization helps maintain stability and availability across critical systems.

In this guide, I'll show you how to monitor key Linux performance metrics using classic command-line tools. We'll cover the metrics, capabilities and real-world usage of the most widely used options.

You'll gain practical experience troubleshooting resource issues and optimizing systems, whatever the scale or complexity of your infrastructure. Let's get started!

Why CLI Monitoring Tools Matter

Visualizing Linux resource metrics has many benefits:

  • Spot bottlenecks degrading app performance
  • Identify trends for capacity planning
  • Detect security incidents and suspicious activity
  • Balance and enforce resource allocations
  • Prevent problems before they cause outages

Without visibility, you fly blind. Even with cloud-based infrastructure, CLI tools should be in every admin's toolbox. Here's why:

  • Lightweight data collection – Low-overhead collectors using built-in kernel instrumentation
  • Work anywhere – top and ps ship with virtually every distribution, and the other tools are typically available from standard package repositories
  • Fast troubleshooting – Interactive UI right in the terminal
  • No ongoing costs – Ideal for frugal teams or test environments
  • Extensibility – Output redirection and scripting for custom solutions

Now let's explore your options…

Top CLI Tools for Linux Monitoring

top – Quick System & Process Overview

The top command provides a dynamic real-time view of overall usage and running processes:

[Screenshot: top command showing system summary and per-process CPU/MEM stats]

From top you can easily identify the process consuming the most CPU/memory for a given snapshot. Sorting, filtering and configuration options make drilling down simple.

Use top when you need a quick bird's-eye view of processes and the system. It's a standard starting point for investigation.

Key Metrics: CPU load, Memory/swap usage, Per-process CPU/MEM %
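For logs and scripts, top can also run non-interactively. The following is a minimal sketch, assuming the procps-ng version of top found on most distributions; the function name top_snapshot is my own and not part of top.

```shell
# top_snapshot [LINES]: print the first LINES lines (default 15) of a single
# non-interactive top run. Assumes procps-ng top, where -b selects batch
# mode and -n 1 exits after one iteration.
top_snapshot() {
    top -b -n 1 | head -n "${1:-15}"
}
```

Calling top_snapshot 20 and appending the output to a log file gives you a timestamp-free snapshot of the summary area plus the busiest processes; add a date call in front if you need to know when it was taken.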

htop – Enhanced & Interactive top

htop improves the top interface for advanced system monitoring:

  • Scrolling for browsing processes
  • Tree view for parent-child process relationships
  • Interactive commands via keyboard shortcuts
  • Horizontally scrollable metrics without line wrap
  • Colored output emphasizing anomalies

[Screenshot: htop process manager for Linux showing resource usage metrics]

If you want a more powerful and customizable top, install htop. Its additional capabilities help monitor production systems.

Key Metrics: All top metrics plus detailed process tree hierarchy

glances – Full System Overview

glances takes a kitchen sink approach showing utilization metrics for all key subsystems:

  • Per-core CPU
  • RAM and swap usage
  • Disk I/O stats
  • Network bandwidth by interface
  • Running processes
  • Load averages
  • Temperatures/fans
  • File system spaces

[Screenshot: glances system monitor overview]

This at-a-glance overview makes Glances shine for real-time troubleshooting. It also supports a client-server mode, so you can monitor remote machines from a central host.

Key Metrics: Dozens of system utilization metrics covering CPU, memory, disk, network

atop – System Monitor With Playback

Similar to top/htop, atop focuses on:

  • Per-process CPU/memory
  • System level CPU/disk/network

Where it advances beyond others is recording utilization over time for playback. This allows revisiting past activity to visually verify issues.

[Screenshot: atop system monitor tool showing utilization graphs]

Save raw data to analyze historically based on time ranges. Playback avoids relying on faulty human memory after problems!

Key Metrics: CPU usage, Memory, Disk I/O with recording ability
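The record-and-replay workflow described above can be sketched with two small wrappers. The function names are hypothetical; the -w, -r, -b and -e options are atop's own, but the time format accepted by -b and -e differs between atop versions (hh:mm or hhmm), so check man atop on your system.

```shell
# atop_record FILE [INTERVAL] [SAMPLES]: write SAMPLES raw samples
# (default 10), taken every INTERVAL seconds (default 60), to FILE.
atop_record() {
    atop -w "$1" "${2:-60}" "${3:-10}"
}

# atop_replay FILE BEGIN END: replay only the window between BEGIN and END.
# The time format is passed through unchanged; it depends on the atop version.
atop_replay() {
    atop -r "$1" -b "$2" -e "$3"
}
```

For example, atop_record /var/log/atop/incident.raw 30 120 would capture an hour at 30-second resolution, and atop_replay with a begin and end time lets you step through just the period when users reported trouble. The file path here is only an illustration.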

ps – Snapshot Process Overview

The ps command prints a snapshot of running processes.

Adding flags lets you filter and sort ps output:

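ps aux --sort=-%mem | head -n 11

The above shows the 10 processes using the most memory. The leading minus sorts in descending order (the default is ascending), and the eleventh line accounts for the header row.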

This helps confirm that processes are running as expected and not consuming excessive resources. Because ps has no refresh of its own, it suits point-in-time status checks rather than continuous monitoring.

Key Metrics: Running processes with start time, %CPU, %MEM
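Because ps output is plain text, it is easy to post-process. As a rough sketch, the helper below (rss_by_command is a name I made up) totals resident memory per command name from lines of the form "RSS_KB COMMAND". Note that summing RSS counts shared pages once per process, so treat the totals as an upper bound.

```shell
# rss_by_command: read "RSS_KB COMMAND" lines on stdin and print the total
# resident memory (in kB) per command name, largest first.
rss_by_command() {
    awk '{ rss = $1; $1 = ""; sub(/^ +/, ""); total[$0] += rss }
         END { for (c in total) printf "%d %s\n", total[c], c }' |
        sort -rn
}
```

With procps ps, a typical call would be ps -eo rss=,comm= | rss_by_command | head -n 10, where the trailing equals signs suppress the header row.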

nmon – Systems Performance Data Logger

Nmon gives you top-like system summary screens covering:

  • CPU
  • Memory
  • Network I/O
  • Disks
  • File systems

[Screenshot: nmon system performance monitoring tool example view]

Everything surfaces through an interactive terminal UI.

Like atop, nmon can log performance data, including process-level statistics, to a file for historical review. This helps when hunting root causes.

Key Metrics: All utilization metrics with CSV export ability
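A capture run for later review might look like the sketch below. The wrapper name nmon_capture and its defaults are my own choices; -f (write to a file), -t (include top processes), -s (seconds between snapshots), -c (number of snapshots) and -m (output directory) are nmon options.

```shell
# nmon_capture DIR [SECONDS] [COUNT]: log COUNT snapshots (default 120),
# one every SECONDS seconds (default 30), including top processes, into DIR.
nmon_capture() {
    nmon -f -t -s "${2:-30}" -c "${3:-120}" -m "$1"
}
```

With the defaults this covers one hour. nmon names the output file itself, based on the hostname and start time, and the comma-separated content can be loaded into a spreadsheet or parsed with your own scripts.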

Monit – Lightweight Process & Resource Watchdog

Monit takes a service monitoring approach tracking availability metrics like:

  • Process status (with auto restart)
  • Memory and CPU thresholds
  • Disk & network usage
  • Custom application KPIs
  • Validation check results

You get both real-time and historical data on service health. For example, monit can restart unresponsive processes automatically while warning if restarts are excessive.

It nudges DevOps teams towards proactive versus reactive monitoring.

Key Metrics: Service/process uptime, thresholds, availability KPIs
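To make the restart-and-warn behavior concrete, here is a hedged sketch of a Monit service check. The service name myapp, the pidfile path, the systemctl commands and all thresholds are hypothetical placeholders; the statement forms follow the Monit manual, but verify them against the version you run.

```
# Hypothetical service; adjust the name, paths and limits for your system.
check process myapp with pidfile /var/run/myapp.pid
    start program = "/usr/bin/systemctl start myapp"
    stop program  = "/usr/bin/systemctl stop myapp"
    # Warn on sustained CPU pressure, restart on sustained memory growth.
    if cpu > 80% for 5 cycles then alert
    if totalmem > 500 MB for 5 cycles then restart
    # Stop retrying if restarts become excessive.
    if 3 restarts within 5 cycles then unmonitor
```

The last rule is what keeps a crash loop from being hidden: after repeated restarts Monit stops managing the service so a human has to look at it.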

Monitorix – Lightweight Server Monitoring

Monitorix gives you web-based monitoring for:

  • CPU usage
  • Memory metrics
  • Disk utilization

Popular services are covered too:

  • MySQL/Postgres
  • Apache/Nginx web servers
  • Email servers

[Screenshot: monitorix system dashboard showing graphs]

The HTTP UI makes sharing and visualization easy. Monitorix works reliably even on older production servers.

Its simplicity works well for supplementing other tools.

Key Metrics: Core Linux system metrics + app server monitoring

Netdata – Metrics Visualization Tool

Netdata focuses on stunning interactive visualizations for metrics. Quickly analyze thousands of time-series across apps:

[Screenshot: netdata system health monitoring dashboard]

  • Customizable dashboard showing any app metric imaginable
  • Interactive exploration from broad overviews to specific metrics
  • Modern UI displaying insights clearly

Static screenshots don't do it justice. View the demo dashboards to appreciate the real-time clarity.

Key Metrics: All Linux & application metrics imaginable

Getting Started with Monitoring

Now that you've seen an overview of common CLI monitoring tools, here are best practices for implementing effective monitoring:

Establish Baselines – Profile typical and peak usage patterns for CPU, memory, disk and network over a few weeks. This quantifies the expected norms to compare against when detecting anomalies.
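A baseline can start as nothing more than a CSV file that grows by one row per minute. The sketch below is one possible way to do that, not a standard tool: baseline_sample is a hypothetical helper, and it assumes a kernel that reports MemAvailable in /proc/meminfo (3.14 or later). The file arguments exist so the function can be pointed at saved copies of those files.

```shell
# baseline_sample [LOADAVG_FILE] [MEMINFO_FILE]
# Print one CSV row: epoch seconds, 1-minute load average, percent of RAM in use.
# Assumes MemAvailable is present; without it the percentage reads as 100.
baseline_sample() {
    loadavg_file="${1:-/proc/loadavg}"
    meminfo_file="${2:-/proc/meminfo}"
    load1="$(cut -d ' ' -f 1 "$loadavg_file")"
    mem_pct="$(awk '/^MemTotal:/ { t = $2 } /^MemAvailable:/ { a = $2 }
                    END { if (t > 0) printf "%.1f", (t - a) * 100 / t }' "$meminfo_file")"
    printf '%s,%s,%s\n' "$(date +%s)" "$load1" "$mem_pct"
}
```

Run from cron or a systemd timer and appended to a file, a few weeks of these rows are enough to see daily and weekly patterns in a spreadsheet.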

Configure Notifications – Trigger email/Slack/OpsGenie alerts when key metrics breach thresholds that indicate possible issues. This lets you get ahead of problems rather than simply reacting after users complain about slowness.
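The core of any threshold alert is a numeric comparison. Here is a minimal sketch with a hypothetical check_threshold helper; it only prints the alert, and how the message is delivered (mail, chat webhook, paging service) is deliberately left out because that depends on your setup.

```shell
# check_threshold NAME VALUE LIMIT
# Print an alert line and return 1 when VALUE exceeds LIMIT; return 0 otherwise.
# awk does the comparison so that decimal values such as 93.2 work.
check_threshold() {
    if awk -v v="$2" -v l="$3" 'BEGIN { exit !(v > l) }'; then
        printf 'ALERT: %s is %s (limit %s)\n' "$1" "$2" "$3"
        return 1
    fi
    return 0
}
```

Because the function signals a breach through its exit status, it composes with ordinary shell logic: a calling script can run its notification command only when check_threshold fails.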

Monitor Uptime – Track service availability using Monit or a similar watchdog that restarts failed services and warns if crashes repeat. Know your enemy.

Correlate Metrics – Feed CLI stats into dashboards such as Grafana or Kibana to bring system, app and business data together into insights.

Automate Reporting – Script CLI commands for ongoing record keeping. Nobody checks dashboards manually at 3am when trouble strikes.

Learn Continuously – Dive deeper into technologies such as eBPF and SystemTap (stap), which extend Linux observability.

Choose a few tools matching your stack and maturity level for starters. As needs grow over time, expand breadth and sophistication of monitoring capabilities.

Let's wrap up with best practices for fixing the issues you uncover.

Addressing High Resource Usage

When you eventually see a process hogging 50%+ CPU or memory usage, here are application optimization techniques to try:

Add Indexes – Boost database performance by reducing table scans
Enable Caching – Front cache layers reduce backend load
Scale Out – Distribute load across additional app instances behind a load balancer
Tune Languages – Adjust GC, pools, workers based on stack
Profile Code – Use perf/strace for insights into slow code paths
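For the profiling step, these two wrappers sketch a common way to attach perf and strace to a running process. The function names and default durations are my own; both tools usually need root or matching ptrace/perf permissions, and attaching strace slows the target noticeably, so use it briefly on production systems.

```shell
# syscall_summary PID [SECONDS]: attach strace to PID for SECONDS (default 10)
# and print a per-syscall count/time table (-c), following threads and
# children (-f). timeout sends SIGINT, mimicking Ctrl-C, to end the trace.
syscall_summary() {
    timeout -s INT "${2:-10}" strace -c -f -p "$1"
}

# cpu_profile PID [SECONDS]: sample call stacks (-g) of PID for SECONDS
# (default 30) and write them to perf.data in the current directory.
cpu_profile() {
    perf record -g -p "$1" -- sleep "${2:-30}"
}
```

After cpu_profile finishes, perf report reads perf.data and shows which functions the samples landed in, which is the evidence you need before proposing a code change.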

Getting code changes prioritized takes solid evidence. Monitoring shows where time is spent, and benchmarking lets you compare alternative approaches.

Choose the optimization matching your bottleneck for faster troubleshooting wins!

Conclusion

I hope this guide conveys why continuously instrumenting Linux systems matters, given how unpredictable running software tends to be.

Top takeaways include:

  • Start with htop for interactive troubleshooting and glances for overall situational awareness
  • Incorporate historical performance tracking with tools like atop early on
  • Focus monitoring on services from a user perspective with Monit and similar
  • Visually amplify metrics through tools like Netdata once metrics coverage expands
  • CLI tools form the foundation of Linux monitoring complementing other stacks
  • Address root causes discovered through optimization and best practices

Now you're equipped to better satisfy users while keeping systems humming! Constant vigilance pays off by easing the on-call pain. What tools are you excited to try next during your admin adventures?