Lab 45: System Monitoring Commands

Diagnose degraded server performance using practical monitoring commands across disk, kernel logs, memory, I/O, network sockets, and live process activity. Interpret signals that suggest a failing disk and confirm the system’s current pressure points.

Scenario

A user reports the server has become slow and unreliable. You are on-call and need to quickly assess system health. Start with disk utilization, then check kernel messages for hardware warnings, validate memory pressure, review I/O statistics, confirm key listening services, and finish with a live process view.

Operator context

This is the triage pass that helps you decide whether you are dealing with resource exhaustion, a noisy process, or a real hardware issue that needs escalation.

Objective

  • Check disk usage to identify capacity risk on mounted file systems.
  • Inspect recent kernel messages for hardware and filesystem warnings.
  • Verify current memory and swap usage to assess memory pressure.
  • Review I/O wait and disk throughput indicators.
  • Confirm listening sockets to validate network service exposure.
  • Use a live monitor to identify active CPU and process consumers.

What You’ll Practice

  • Disk utilization checks with df -h.
  • Kernel and hardware warning triage using dmesg.
  • Memory and swap inspection with free -h.
  • CPU and disk I/O statistics using iostat and iostat -x.
  • Socket and listener inspection using ss -tuln or netstat -tuln.
  • Real-time process monitoring using top.

Walkthrough

Step 1 : Check disk usage across mounted file systems.
Command
df -h

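This is the fastest way to spot capacity issues. A nearly full root filesystem can degrade performance (writes fail, logs stop being written, packages cannot install) and may cause services to behave unpredictably. In the sample output, / is at 97% with only 1.0G available, which is already in the risk zone.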

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        30G   28G  1.0G  97% /
tmpfs           1.9G     0  1.9G   0% /dev/shm
/dev/sdb1        50G   10G   40G  20% /home
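
The capacity check above can be scripted so it flags only the filesystems that need attention. The following is a minimal sketch, not part of the lab itself: the function name flag_full_filesystems and the 90% default threshold are illustrative assumptions. It reads df-style output on stdin and assumes mount points contain no spaces.

```shell
# Hypothetical helper: print mount points whose Use% is at or above a threshold.
# Reads df-style output on stdin so it can be run against live or saved output.
flag_full_filesystems() {
  local threshold="${1:-90}"   # assumed default; tune for your environment
  awk -v limit="$threshold" 'NR > 1 {
    use = $5                   # fifth column is Use%
    sub(/%$/, "", use)         # strip the trailing percent sign
    if (use + 0 >= limit) printf "%s %s%%\n", $6, use
  }'
}
```

A typical invocation would be df -P | flag_full_filesystems 90, where -P requests the one-line-per-filesystem POSIX layout. Against the sample output above, only the root filesystem is reported.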
Step 2 : View recent kernel messages for hardware warnings.
Command
dmesg | tail

Recent dmesg output often reveals the difference between “system is slow” and “disk is failing.” I/O errors and filesystems remounting read-only are high-severity signals that should trigger escalation and data-protection actions.

[ 1551.392013] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 1551.392021] sd 2:0:0:0: [sda] tag#0 Sense Key : Medium Error [current]
[ 1551.392028] sd 2:0:0:0: [sda] tag#0 Add. Sense: Unrecovered read error
[ 1551.392036] blk_update_request: I/O error, dev sda, sector 4096208 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[ 1551.392052] Buffer I/O error on dev sda1, logical block 51234, async page read
[ 1551.392091] JBD2: Detected IO errors while flushing file data on sda1-8
[ 1551.392113] EXT4-fs warning (device sda1): ext4_end_bio:343: I/O error -5 writing to inode 524305 (offset 0 size 4096)
[ 1551.392120] Aborting journal on device sda1-8.
[ 1551.392122] EXT4-fs (sda1): previous I/O error to superblock detected
[ 1551.392135] EXT4-fs (sda1): Remounting filesystem read-only
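
On a busy system the last ten lines of dmesg may not contain the storage errors you are looking for. As a rough sketch, the filter below keeps only lines that match a few of the failure signatures seen in the sample output. The function name disk_error_lines and the pattern list are assumptions for illustration; the list is not exhaustive.

```shell
# Hypothetical helper: keep kernel log lines that look like storage failures.
# Reads log text on stdin; matching is case-insensitive.
disk_error_lines() {
  grep -E -i 'I/?O error|medium error|unrecovered read error|remounting filesystem read-only|aborting journal'
}
```

Usage would look like dmesg | disk_error_lines. Adding -T to dmesg prints human-readable timestamps, which helps when correlating errors with the user's report. Note that grep exits non-zero when nothing matches, which here is the healthy case.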
Step 3 : Check current memory usage.
Command
free -h

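Memory pressure can present as general slowness, timeouts, or heavy swapping. Use the available column as the more realistic estimate of memory that new workloads can claim without swapping, since it counts reclaimable cache that the free column ignores. In the sample below, only about 500M of 3.8G is available and 500M of swap is already in use.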

              total        used        free      shared  buff/cache   available
Mem:           3.8G        3.0G        300M        120M        500M        500M
Swap:          2.0G        500M        1.5G
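
To turn the available column into a single number you can compare against a threshold, a small awk filter is enough. This is a sketch under assumptions: the function name mem_available_pct is made up for illustration, and it expects the six-column layout shown above (available in the seventh field), produced with free -m so the values are plain integers.

```shell
# Hypothetical helper: print available memory as a whole percentage of total.
# Expects `free -m` output on stdin (numeric MiB values, no unit suffixes).
mem_available_pct() {
  awk '$1 == "Mem:" && $2 > 0 {
    # $2 is total, $7 is available in the layout shown above
    printf "%d\n", ($7 * 100) / $2
  }'
}
```

Run it as free -m | mem_available_pct. A low value alongside growing swap usage suggests real memory pressure rather than memory that is merely held as cache.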
Step 4 : View CPU and I/O statistics.
Command
iostat -x

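iostat helps you spot I/O wait and disk pressure. Elevated %iowait can indicate slow storage, saturated disks, or a hardware problem that is forcing retries. The first report shows averages since boot, so pass an interval and count (for example, iostat -x 1 3) to see current activity. The sample output below is abbreviated to the basic device columns; with -x, recent sysstat versions also report per-device latency and utilization fields such as r_await, w_await, and %util.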

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.75    0.00    2.56    8.99    0.00   83.70
Device            tps    kB_read/s kB_wrtn/s kB_read kB_wrtn
sda               9.03      120.11     203.41   140243  237506
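
If you want to track %iowait over time rather than read it by eye, the value can be pulled out of the avg-cpu block. The sketch below assumes the header layout shown above and uses a hypothetical function name, iowait_pct. It locates the %iowait column by name instead of hard-coding its position.

```shell
# Hypothetical helper: print the %iowait value from each avg-cpu block.
# Reads iostat output on stdin.
iowait_pct() {
  awk '
    $1 == "avg-cpu:" {
      # The header line has a leading "avg-cpu:" label that the value
      # line lacks, so the value sits one field to the left.
      for (i = 2; i <= NF; i++) if ($i == "%iowait") col = i - 1
      want = 1
      next
    }
    want && col { print $col; want = 0 }
  '
}
```

For a current reading rather than the since-boot average, something like iostat -c 1 2 | iowait_pct | tail -n 1 takes the second report, which covers only the last one-second interval.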
Step 5 : Check network socket statistics.
Command
ss -tuln

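Listing listeners helps confirm which services are exposed and whether expected ports are open. ss is the modern replacement for many common netstat workflows. The sample below uses the netstat -tuln column layout; ss prints the same information with Netid and State as the leading columns.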

Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:3306          0.0.0.0:*               LISTEN
udp        0      0 0.0.0.0:68              0.0.0.0:*
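
When reviewing exposure, the listeners that matter most are those bound to something other than loopback. The filter below is a sketch that assumes the ss -tuln column layout (Netid, State, Recv-Q, Send-Q, then the local address in the fifth field), not the netstat layout. The function name exposed_listeners is hypothetical.

```shell
# Hypothetical helper: list sockets that are not bound to a loopback address.
# Reads `ss -tuln` output on stdin; the local address:port is field 5.
exposed_listeners() {
  awk '$1 != "Netid" && NF >= 6 {
    addr = $5
    # Skip IPv4 loopback (127.x.x.x) and IPv6 loopback ([::1])
    if (addr !~ /^127\./ && addr !~ /^\[::1\]/) print $1, addr
  }'
}
```

Run it as ss -tuln | exposed_listeners. In a setup like the one in this lab, SSH on 0.0.0.0:22 would be listed while MySQL on 127.0.0.1:3306 would not, which matches the intent of keeping the database local-only.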
Step 6 : Launch a dynamic monitor for real-time stats.
Command
top

top provides a live view of load, CPU state (including I/O wait), memory usage, and which processes are consuming resources right now. This is often where you identify the immediate “who is doing it” answer.

top - 15:20:01 up 1 day,  2:53,  1 user,  load average: 2.56, 2.12, 1.95
Tasks: 152 total,   2 running, 149 sleeping,   0 stopped,   1 zombie
%Cpu(s):  6.8 us,  1.9 sy,  0.0 ni, 89.7 id,  1.4 wa,  0.0 hi,  0.2 si,  0.0 st
MiB Mem :   7823.2 total,   6543.7 used,    412.4 free,    867.1 buff/cache
MiB Swap:   2048.0 total,      85.3 used,   1962.7 free.  2301.4 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 1275 root      20   0  162284   8840   6328 S   7.3   0.1   0:15.28 systemd-journal
 1984 root      20   0  302564  12160   8424 S   5.0   0.2   0:10.47 NetworkManager
 2142 mysql     20   0 1512200 210412  18104 S   3.7   2.6   1:32.55 mysqld
 2458 apache    20   0  292384  17896   9820 S   1.6   0.2   0:22.13 httpd
 2667 lab       20   0  173820  10560   6404 R   1.3   0.1   0:05.09 top
 2784 lab       20   0  231240  15600  10048 S   0.7   0.2   0:09.87 bash
 2830 root      20   0  412604  24812  11200 S   0.3   0.3   0:18.42 gdm-session-wor
 2927 root      20   0  262148  13384   9472 S   0.3   0.2   0:07.22 udisksd
 2998 root      20   0  144432   8640   6260 S   0.0   0.1   0:02.13 cron
 3011 root      20   0  141288   7444   5928 S   0.0   0.1   0:01.09 sshd
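
top can also run non-interactively, which is useful when you want to capture a snapshot for a ticket. The sketch below assumes the default column order shown above (PID first, %CPU ninth, COMMAND twelfth) and uses a hypothetical function name, top_consumers. It relies on top's default ordering and does not sort the rows itself.

```shell
# Hypothetical helper: print PID, %CPU, and command for the first N processes.
# Reads `top -b -n 1` output on stdin; assumes the default field layout.
top_consumers() {
  local count="${1:-3}"        # assumed default of three rows
  awk -v n="$count" '
    $1 == "PID" { in_table = 1; next }   # process table starts after this header
    in_table && NF >= 12 && shown < n { print $1, $9, $12; shown++ }
  '
}
```

A typical call is top -b -n 1 | top_consumers 5, where -b selects batch mode and -n 1 limits top to a single iteration.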

Reference

  • df -h : Shows disk usage for mounted filesystems in human-readable units.
  • dmesg | tail : Displays the most recent kernel messages (useful for hardware and filesystem warnings).
  • free -h : Shows memory and swap usage in human-readable units.
  • iostat / iostat -x : Displays CPU and disk I/O statistics (extended output includes more device detail).
  • ss -tuln : Lists TCP/UDP listening sockets without resolving names.
    • -t : TCP sockets.
    • -u : UDP sockets.
    • -l : Listening sockets.
    • -n : Numeric output (no DNS/service name resolution).
  • netstat -tuln : Legacy alternative for listing listening sockets.
  • top : Interactive real-time view of processes, CPU, memory, and load averages.