Service outages on Linux can break web applications, background workers, and scheduled jobs, so a consistent triage routine reduces downtime and prevents guesswork during incidents.
On most modern Linux systems, systemd manages services as units and records state transitions and failures in the journal. A service can be unavailable because the process exited, the unit failed to start due to dependencies, or the daemon started but never bound to its port, and socket activation can also delay startup until the first connection arrives.
Collect unit state and log excerpts before applying fixes, since restarts add new log entries and can change the failure mode. Rate limiting (start-limit-hit) and automatic restarts can mask the first error, so focus on the earliest failure lines and treat repeated restarts as a symptom, not a solution.
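One way to collect that evidence before touching the service is a small shell function that writes the unit state, recent journal lines, and listening sockets into a timestamped directory. This is a minimal sketch: the function name snapshot_unit, the file names, and the /tmp default are illustrative assumptions, not part of systemd.

```shell
# snapshot_unit UNIT [DEST_ROOT]
# Saves unit state, recent journal lines, and listening TCP sockets into a
# new timestamped directory and prints that directory's path.
# DEST_ROOT defaults to /tmp (an arbitrary choice for this sketch).
snapshot_unit() {
    unit=$1
    dest="${2:-/tmp}/triage-$unit-$(date +%Y%m%dT%H%M%S)"
    mkdir -p "$dest" || return 1

    # "|| true" keeps the snapshot going: systemctl status exits non-zero
    # for failed or inactive units, which is exactly the case being captured.
    systemctl status --no-pager --full "$unit" > "$dest/status.txt" 2>&1 || true
    systemctl show "$unit" > "$dest/show.txt" 2>&1 || true
    journalctl -u "$unit" -b --no-pager -n 200 > "$dest/journal.txt" 2>&1 || true
    ss -lntp > "$dest/listeners.txt" 2>&1 || true

    printf '%s\n' "$dest"
}
```

Run it from a root shell so the journal and process details are readable, for example: snapshot_unit example.service. The saved files may contain sensitive data, so treat the directory like any other log excerpt.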
Steps to troubleshoot a Linux service outage with systemctl, journalctl, and ss:
- Confirm the unit name and current service state.
$ sudo systemctl status --no-pager --full example.service | head -n 12
* example.service - Example API
     Loaded: loaded (/etc/systemd/system/example.service; enabled; preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Sat 2026-01-10 07:30:15 +08; 792ms ago
    Process: 153130 ExecStart=/usr/local/bin/example-api --config /etc/example/example.conf (code=exited, status=1/FAILURE)
   Main PID: 153130 (code=exited, status=1/FAILURE)
        CPU: 24ms

Jan 10 07:30:15 host.example.net systemd[1]: example.service: Main process exited, code=exited, status=1/FAILURE
Jan 10 07:30:15 host.example.net systemd[1]: example.service: Failed with result 'exit-code'.

List failing services with systemctl list-units --type=service --state=failed when the unit name is unknown.
- Review recent service logs for failures or crash loops.
$ sudo journalctl -u example.service -b --no-pager -n 40
Jan 10 07:30:14 host.example.net example-api[153125]: ERROR: Unable to read config file: /etc/example/example.conf (Permission denied)
Jan 10 07:30:14 host.example.net systemd[1]: example.service: Main process exited, code=exited, status=1/FAILURE
Jan 10 07:30:14 host.example.net systemd[1]: example.service: Failed with result 'exit-code'.
Jan 10 07:30:15 host.example.net systemd[1]: example.service: Scheduled restart job, restart counter is at 1.
Jan 10 07:30:15 host.example.net systemd[1]: Started example.service - Example API.
Jan 10 07:30:15 host.example.net example-api[153130]: ERROR: Unable to read config file: /etc/example/example.conf (Permission denied)
Jan 10 07:30:15 host.example.net systemd[1]: example.service: Main process exited, code=exited, status=1/FAILURE
##### snipped #####
Narrow to a time window with --since "15 minutes ago" and prefer the first failure line, since later restarts often produce noisy duplicates.
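Picking out the first failure line can be done with a simple filter over the journal text. The sketch below assumes the application writes words such as ERROR or denied when it fails, as in the sample output; the function name first_failure and the pattern are illustrative and should be adjusted to the wording the service really uses.

```shell
# first_failure: read journal text on stdin and print only the first line
# that looks like an application failure.
# The pattern is an assumption based on the sample log in this guide.
first_failure() {
    grep -i -E 'error|denied|fatal|cannot|no such file' | head -n 1
}

# Typical use (journal access usually needs sudo):
#   sudo journalctl -u example.service --since "15 minutes ago" --no-pager | first_failure
```

journalctl -p err is another way to filter, but it relies on log priorities; output that a service simply prints to stdout or stderr is typically stored at the info level and may not match, so a text pattern is often the more dependable first pass.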
- Verify the service is listening on expected ports or sockets.
$ sudo ss -lntp '( sport = :9000 )'
State  Recv-Q  Send-Q  Local Address:Port  Peer Address:PortProcess
Output that contains only the header row means nothing is listening on the port. A missing listener usually indicates a failed start; a socket-activated service may require checking the matching .socket unit.
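The listener check can be turned into a yes/no test for scripts. This is a minimal sketch: the function name port_listening is made up for illustration, port 9000 is the port used in this guide, and example.socket is a hypothetical socket unit name.

```shell
# port_listening PORT: succeed (exit 0) if any TCP socket listens on PORT.
# -H suppresses the header row, so empty output means "no listener".
port_listening() {
    [ -n "$(ss -H -lnt "( sport = :$1 )" 2>/dev/null)" ]
}

# Example:
#   if port_listening 9000; then echo "listening"; else echo "no listener"; fi
#
# For a socket-activated service, also inspect the socket unit
# (example.socket is a hypothetical name):
#   systemctl list-sockets --all
#   systemctl status --no-pager example.socket
```

The function drops -p on purpose so it works without root; add -p under sudo when the owning process name is needed.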
- Check unit dependencies, environment files, and required paths.
$ sudo systemctl cat example.service
# /etc/systemd/system/example.service
[Unit]
Description=Example API
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=svcuser
Group=svcuser
EnvironmentFile=-/etc/default/example
ExecStart=/usr/local/bin/example-api --config /etc/example/example.conf
WorkingDirectory=/var/lib/example
Restart=on-failure
RestartSec=1s

[Install]
WantedBy=multi-user.target
Environment files (for example /etc/default/example, referenced by EnvironmentFile= in this unit) can contain secrets; avoid pasting their contents into tickets, chat logs, or screenshots.
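To check required paths systematically, the absolute paths named in the unit can be listed and then inspected one by one. The sketch below is an assumption-laden helper, not a systemd feature: the function name unit_paths is invented here, it only looks at ExecStart=, EnvironmentFile=, and WorkingDirectory=, and it splits on spaces, so quoted arguments containing spaces are not handled.

```shell
# unit_paths: read unit-file text on stdin (for example from systemctl cat)
# and print every absolute path found on the ExecStart=, EnvironmentFile=,
# and WorkingDirectory= lines, one per line.
unit_paths() {
    grep -E '^(ExecStart|EnvironmentFile|WorkingDirectory)=' |
        cut -d= -f2- |
        tr ' ' '\n' |
        grep -E '^-?/' |
        sed 's/^-//'   # drop the optional "-" prefix (ignore-if-missing marker)
}

# Show owner and mode for each referenced path; a missing path prints an error:
#   systemctl cat example.service | unit_paths | xargs ls -ld
#
# Check whether the service account can read the config file
# (svcuser and the path come from the sample unit above):
#   sudo -u svcuser test -r /etc/example/example.conf || echo "not readable by svcuser"
#
# List permissions of every directory along the path:
#   namei -l /etc/example/example.conf
```

In the sample outage, the Permission denied error in the journal points at /etc/example/example.conf, so the readability test as svcuser is the check most likely to reproduce it.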
- Restart or reload the service after corrections.
$ sudo systemctl restart example.service
Run systemctl daemon-reload after editing unit files under /etc/systemd/system/ so systemd loads the updated definition.
Repeated restarts without fixing the underlying error can trigger start-limit-hit and bury the original failure under log churn.
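The reload and restart sequence can be wrapped so it always runs in the same order. This is a sketch under assumptions: the function name apply_fix is invented for illustration, and it is meant to be run from a root shell after the underlying error has been corrected.

```shell
# apply_fix UNIT: reload unit definitions, clear the failed state and the
# start-rate-limit counter, restart once, then print the resulting state.
apply_fix() {
    unit=$1
    systemctl daemon-reload || return 1
    # reset-failed also clears a start-limit-hit condition; ignore errors
    # in case the unit has nothing to reset.
    systemctl reset-failed "$unit" 2>/dev/null || true
    systemctl restart "$unit" || return 1
    systemctl is-active "$unit"
}

# Example (as root):
#   apply_fix example.service
```

A result of active immediately after the restart only shows that the process was launched, particularly for Type=simple services, so it does not replace a later stability check.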
- Confirm the service remains active and does not immediately restart.
$ sudo systemctl show example.service -p ActiveState -p SubState -p NRestarts
NRestarts=1
ActiveState=activating
SubState=auto-restart
Re-run the same command after a few minutes; a rising NRestarts count usually indicates a crash loop, a watchdog kill, or a dependency that is still unstable.
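Comparing the restart counter across two samples can also be scripted. The following is a minimal sketch, assuming a systemd version that exposes the NRestarts property and supports the --value option of systemctl show; the function name restarts_rising and the 60-second default are illustrative choices.

```shell
# restarts_rising UNIT [SECONDS]: sample NRestarts twice, SECONDS apart
# (default 60), print both values, and succeed (exit 0) only if the
# counter increased between the two samples.
restarts_rising() {
    unit=$1
    wait_s=${2:-60}
    before=$(systemctl show "$unit" -p NRestarts --value) || return 2
    sleep "$wait_s"
    after=$(systemctl show "$unit" -p NRestarts --value) || return 2
    echo "NRestarts: $before -> $after"
    [ "$after" -gt "$before" ]
}

# Example: warn if the unit restarted again within five minutes.
#   if restarts_rising example.service 300; then echo "still crash-looping"; fi
```

A counter that stays flat together with ActiveState=active is a reasonable sign that the fix held, although a service with a long RestartSec may need a longer sampling interval.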
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
