Diagnose degraded server performance using a fast triage workflow across disk, kernel logs, memory, I/O, network sockets, and live processes. Identify signals of resource exhaustion versus hardware failure and capture evidence for escalation.
A user reports the server has become slow and unreliable. You are on-call and need to assess system health quickly. Start with disk utilization, then check kernel messages for hardware warnings, validate memory pressure, review I/O signals, confirm key listeners, and finish with a live process view.
This is the initial triage pass that helps you decide whether you are dealing with resource exhaustion, a noisy process, or a hardware issue that needs escalation.
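Check disk utilization with df -h.
Review recent kernel messages with dmesg.
Check memory pressure with free -h.
Check %iowait using iostat -x.
Confirm key listeners with ss -tuln.
Review live processes with top.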
df -h
This is the fastest way to spot capacity risk. A nearly full filesystem can degrade performance (blocked writes, failed logging, unpredictable service behavior) and can prevent recovery actions like updates or package installs.
# Look for mountpoints near capacity:
# /dev/sda1 30G 28G 1.0G 97% /
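When a host has many mountpoints, the capacity check above can be scripted. The following is a minimal sketch rather than part of the lab itself: `flag_full_filesystems` is a hypothetical helper name, the 90% default threshold is an assumption, and the function reads `df -P` output (POSIX format, one line per filesystem) on standard input so it also works against saved output. Mountpoints that contain spaces are not handled.

```shell
# Hypothetical helper: print mountpoints at or above a usage threshold.
# Assumes `df -P` column order: Filesystem, blocks, Used, Available, Capacity, Mounted on.
flag_full_filesystems() {
  awk -v limit="${1:-90}" 'NR > 1 {
    use = $5
    sub(/%/, "", use)                      # strip the percent sign: 97% becomes 97
    if (use + 0 >= limit + 0) print $6, $5 # mountpoint, then usage
  }'
}
```

On a live host this could be run as `df -P | flag_full_filesystems 95`.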
dmesg | tail
Recent dmesg output often separates “system is slow” from “storage is failing.” I/O errors, timeouts, and filesystems remounting read-only are high-severity signals that should trigger escalation and data-protection actions.
# High-severity patterns include:
# blk_update_request: I/O error, dev sda, ...
# EXT4-fs (sda1): Remounting filesystem read-only
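To pull the high-severity lines out of a long kernel log, a simple filter can help. This is a sketch under stated assumptions: `scan_kernel_log` is a hypothetical name, and the pattern list is drawn only from the signals named above (I/O errors, timeouts, read-only remounts), so it is not an exhaustive catalog of hardware faults.

```shell
# Hypothetical helper: keep only kernel log lines that match high-severity storage patterns.
# The pattern list is an assumption based on the examples in this lab; extend it as needed.
scan_kernel_log() {
  grep -Ei 'I/O error|remounting filesystem read-only|timed out|timeout'
}
```

A typical invocation would be `dmesg | scan_kernel_log`. On some hosts reading the kernel log requires elevated privileges.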
free -h
Memory pressure can present as slowness, timeouts, and thrashing. Use the available column as a practical estimate of how much memory new work can claim without swapping; it already counts reclaimable page cache. Persistent swap growth usually means you are running short on RAM.
# Watch for low available memory and active swap usage.
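If you want the two numbers that matter in a single line for your incident notes, the output can be reduced with awk. This sketch assumes the column layout of recent procps-ng releases, where available is the seventh field of the Mem: row; older versions of free do not print that column. `memory_summary` is a hypothetical name, and the function expects `free -m` output (integer MiB) on standard input.

```shell
# Hypothetical helper: summarize available memory and swap in use from `free -m` output.
# Assumes the Mem: row has "available" as field 7 and the Swap: row has "used" as field 3.
memory_summary() {
  awk '
    /^Mem:/  { printf "available_mib=%s ", $7 }
    /^Swap:/ { printf "swap_used_mib=%s\n", $3 }
  '
}
```

For example: `free -m | memory_summary`. Running it a few minutes apart shows whether swap use is growing.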
iostat is provided by the sysstat package. If it is not installed, you may need to install it during a maintenance window or use alternate tooling.
iostat -x
iostat -x helps you spot storage pressure and I/O wait. Elevated %iowait can indicate slow disks, saturated throughput, or failing hardware forcing retries. If CPU looks mostly idle while %iowait is high, storage is often the bottleneck.
# Focus areas:
# avg-cpu: %iowait
# Per-device utilization and latency signals (if present in your output).
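One detail worth knowing: the first report iostat prints covers the time since boot, so a single run can hide a problem that started recently. Passing an interval and a count (for example `iostat -x 5 3`) produces later reports that reflect current activity. The sketch below extracts %iowait from each avg-cpu block; `iowait_values` is a hypothetical name, and it locates the column by its header label instead of assuming a fixed position.

```shell
# Hypothetical helper: print the %iowait value from every avg-cpu block in iostat output.
# The column is located from the header line, then read from the values line that follows.
iowait_values() {
  awk '
    /^avg-cpu:/ {
      col = 0
      for (i = 2; i <= NF; i++) {
        if ($i == "%iowait") col = i - 1   # values line has no "avg-cpu:" label
      }
      getline                              # advance to the values line
      if (col) print $col
    }
  '
}
```

To look only at the most recent interval: `iostat -x 5 3 | iowait_values | tail -n 1`.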
ss -tuln
Listener checks confirm which services are exposed and whether expected ports are open. This is a fast way to verify “service is up” versus “service is running but not reachable.” Use numeric output to avoid DNS resolution delays during incident response.
# Look for expected listeners:
# tcp LISTEN 0 0 0.0.0.0:22 ...
# tcp LISTEN 0 0 127.0.0.1:3306 ...
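To check one expected port and see which address it is bound to, the listener output can be filtered. This is a sketch, assuming the default `ss -tuln` column order in which the local address and port are the fifth field; `listeners_on_port` is a hypothetical name.

```shell
# Hypothetical helper: show protocol and local address for listeners on a given port.
# Assumes `ss -tuln` columns: Netid, State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port.
listeners_on_port() {
  awk -v port="$1" 'NR > 1 {
    n = split($5, parts, ":")              # the port is the text after the last colon
    if (parts[n] == port) print $1, $5
  }'
}
```

For example, `ss -tuln | listeners_on_port 3306` printing only a `127.0.0.1` address means the service accepts local connections only. No output means nothing is listening on that port.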
top
top provides a live view of load, CPU state (including I/O wait), memory usage, and the processes consuming resources right now. Use this to identify the immediate top consumers and decide whether you need to throttle, restart, or isolate a workload.
# Watch for:
# - Sustained high load with low %idle
# - High %wa (I/O wait)
# - A single process dominating CPU or memory
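Because top is interactive, it is awkward to paste into a ticket. Batch mode (`top -b -n COUNT`) writes plain text instead. The sketch below pulls the I/O wait figure from the CPU summary line; `top_iowait` is a hypothetical name, the `%Cpu(s):` line format is assumed to match common procps-ng output, and taking the last sample is an assumption made because the first batch iteration may not reflect the current interval on every version.

```shell
# Hypothetical helper: print the "wa" (I/O wait) value from the last %Cpu line in top batch output.
# Assumes a summary line such as: %Cpu(s):  2.0 us,  1.0 sy, ... 37.0 wa, ...
top_iowait() {
  awk '
    /^%Cpu/ {
      for (i = 2; i <= NF; i++) {
        if ($i == "wa," || $i == "wa") value = $(i - 1)
      }
    }
    END { if (value != "") print value }
  '
}
```

A possible invocation is `top -b -n 2 | top_iowait`.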
If / is close to full, treat it as a priority incident. Free space immediately (logs, caches, crash dumps), then confirm services recover and writes are no longer blocked.
I/O errors or a read-only remount in dmesg are a strong indicator of failing storage or a degraded path. Capture logs, identify impacted mountpoints, and escalate. Do not keep retrying writes to a failing disk.
Swapping can make a system feel “randomly slow.” Identify the top memory consumers and validate whether the workload is normal growth, a leak, or an undersized host.
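To identify the top memory consumers without an interactive session, process output can be ranked by resident set size. This is a sketch: `top_memory_consumers` is a hypothetical name, and it expects `ps -eo pid,rss,comm` output on standard input, where RSS is reported in KiB.

```shell
# Hypothetical helper: list the N largest processes by resident memory (default 5).
# Expects `ps -eo pid,rss,comm` output: a header line, then PID, RSS in KiB, command name.
top_memory_consumers() {
  awk 'NR > 1' | sort -k2,2nr | head -n "${1:-5}"
}
```

For example: `ps -eo pid,rss,comm | top_memory_consumers 10`. Comparing two runs taken some minutes apart helps separate steady growth from a one-time spike.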
If the port is not listening, confirm the service is running and bound to the correct interface. If it is listening only on 127.0.0.1 when you expect external access, fix bind configuration before chasing firewall rules.
This lab is read-only. Your cleanup is to record evidence (outputs and timestamps) and revert any temporary terminal filters or paging choices you used during triage.
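Recording outputs with timestamps can be done with a small wrapper. The following is a sketch, not a prescribed procedure: `capture` and `EVIDENCE_DIR` are hypothetical names, and the default of a fresh temporary directory is an assumption. If the root filesystem is the one running out of space, point `EVIDENCE_DIR` at a different filesystem first.

```shell
# Assumption: evidence files go to the directory named by EVIDENCE_DIR (hypothetical variable).
EVIDENCE_DIR="${EVIDENCE_DIR:-$(mktemp -d)}"

# Hypothetical helper: run a command and save its output under a UTC timestamp header.
# Usage: capture NAME COMMAND [ARGS...]
capture() {
  name="$1"; shift
  {
    printf '### %s\n' "$*"
    printf '### captured %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    "$@" 2>&1
  } > "$EVIDENCE_DIR/$name.txt"
}

# Example calls (commented out so that sourcing this file runs nothing):
# capture disk df -h
# capture kernel sh -c 'dmesg | tail'
# capture memory free -h
# capture sockets ss -tuln
# capture procs top -b -n 1
```

Each call writes one file, so the set of files in `EVIDENCE_DIR` can be attached to an escalation as-is.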
df -h
dmesg | tail
free -h
iostat -x
ss -tuln
top
You can state the bottleneck (capacity, memory pressure, storage latency, or process saturation) and you have enough command output to justify next actions.
df -h: Shows disk usage for mounted filesystems in human-readable units.
  -h: Prints sizes in human-readable units (GiB/MiB).
dmesg | tail: Displays the most recent kernel messages.
  |: Pipes output from the left command into the right command.
  tail: Shows the last lines of output.
free -h: Shows memory and swap usage in human-readable units.
  -h: Prints sizes in human-readable units (GiB/MiB).
iostat -x: Displays extended CPU and device I/O statistics.
  -x: Enables extended per-device statistics.
ss -tuln: Lists TCP/UDP listening sockets without resolving names.
  -t: TCP sockets.
  -u: UDP sockets.
  -l: Listening sockets.
  -n: Numeric output (no DNS/service name resolution).
top: Interactive real-time view of processes, CPU, memory, and load averages.