In my opinion, in most cases, you should not use swap space on Linux. Instead, configure a crash kernel and set vm.panic_on_oom=1. If RAM is exhausted, the kernel panics, the crash kernel dumps a kcore (the vmcore file), and the host restarts automatically. No more page thrashing, no more hours of downtime, and no more stress from SSH hanging while users are complaining.
The benefits are: 1) less downtime, 2) post-mortem kcore analysis to find the culprits, and 3) no disk space spent on swap. The costs are: 1) hundreds of MB of RAM reserved for the crash kernel, 2) the danger of infinite reboots, and 3) losing some benign uses of swap.
Overall, given what I've seen with hundreds of customers, the benefits significantly outweigh the costs. If implemented correctly, there is likely to be less downtime from overcommitted memory, and if the kcores are actually analyzed, then over the longer term the systems end up better sized and memory leaks get understood. I've seen so many cases where the OOM killer killed a process and there wasn't enough information to understand why. And that was after hours of downtime with the system in a zombie-like state from swapping.
I'm open to disagreement on this. I have read that some swap space can help Linux optimize its use of RAM (e.g. with a high swappiness, rarely used program pages are swapped out so that more RAM is available for the file cache). However, I've never seen evidence or practical demonstrations of this, only vague theoretical hand-waving, often without considering the costs of swap thrashing and an out-of-control OOM killer.
At a detailed level, here's an RHEL example, but other distributions are similar:
- Configure, start, and enable the crash kernel/kdump: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Kernel_Crash_Dump_Guide/chap-introduction-to-kdump.html
- Size the amount of RAM for the crash kernel correctly: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Kernel_Crash_Dump_Guide/appe-supported-kdump-configurations-and-targets.html#sect-kdump-memory-requirements
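- Change the core_collector line in /etc/kdump.conf so that makedumpfile uses `-d 23,31`. Dump level 23 keeps user process pages, so more information is dumped for each user process (command line arguments, virtual memory, etc.); level 31 is the fallback if the larger dump does not fit.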
- Set vm.panic_on_oom=1 in /etc/sysctl.conf
- Install the kernel-debuginfo, kernel-debuginfo-common, glibc-debuginfo, and glibc-debuginfo-common packages: https://access.redhat.com/solutions/9907
- Install the crash utility: https://github.com/crash-utility/crash
- Test it out (perhaps with kernel.sysrq=1 and /proc/sysrq-trigger) and learn how to use it: `crash /usr/lib/debug/lib/modules/*/vmlinux /var/crash/*/vmcore`
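The two configuration edits in the list above can be scripted. Here is a minimal sketch, not a definitive procedure: the helper names `set_panic_on_oom` and `set_dump_level` are made up for illustration, and the second helper assumes /etc/kdump.conf already contains a `core_collector makedumpfile ... -d <level>` line. Both take the file path as an argument so they can be tried on a copy first.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; try them on copies of the files first.

# Ensure a sysctl configuration file contains exactly vm.panic_on_oom=1.
set_panic_on_oom() {
    conf="$1"
    touch "$conf"
    if grep -Eq '^[[:space:]]*vm\.panic_on_oom[[:space:]]*=' "$conf"; then
        # The setting already exists: rewrite it in place.
        sed -E 's/^[[:space:]]*vm\.panic_on_oom[[:space:]]*=.*/vm.panic_on_oom=1/' \
            "$conf" > "$conf.tmp" && cat "$conf.tmp" > "$conf" && rm -f "$conf.tmp"
    else
        # The setting is missing: append it.
        printf 'vm.panic_on_oom=1\n' >> "$conf"
    fi
}

# Change the makedumpfile dump level on an existing core_collector line to 23,31.
# Assumes the line already has a "-d <level>" option; other lines are untouched.
set_dump_level() {
    conf="$1"
    sed -E '/^core_collector[[:space:]]+makedumpfile/ s/-d[[:space:]]+[0-9,]+/-d 23,31/' \
        "$conf" > "$conf.tmp" && cat "$conf.tmp" > "$conf" && rm -f "$conf.tmp"
}

# Example usage (as root on a RHEL-like host):
#   set_panic_on_oom /etc/sysctl.conf && sysctl -p
#   set_dump_level /etc/kdump.conf && systemctl restart kdump
#
# To test the whole chain, force a panic. WARNING: this crashes the host immediately.
#   echo 1 > /proc/sys/kernel/sysrq
#   echo c > /proc/sysrq-trigger
```

After the forced crash and reboot, a vmcore should appear under /var/crash and can be opened with the crash command shown above.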
To disable swap, use `swapoff -a` to immediately disable all active swap devices and swap files, and then remove any swap entries from /etc/fstab so that swap stays off after future reboots.
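As a rough sketch of the /etc/fstab step, the hypothetical helper below (`disable_swap_entries` is a name made up for this example) comments out swap entries instead of deleting them and keeps a `.bak` copy, so the change is easy to revert. It assumes the standard fstab layout, where the third field is the filesystem type.

```shell
#!/bin/sh
# Sketch only: disable_swap_entries is a hypothetical helper name.

# Comment out every active entry whose filesystem type (third field) is "swap".
# A backup of the original file is kept next to it with a .bak suffix.
disable_swap_entries() {
    fstab="$1"
    cp "$fstab" "$fstab.bak" || return 1
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' \
        "$fstab.bak" > "$fstab"
}

# Example usage (as root):
#   swapoff -a                        # turn off all active swap now
#   disable_swap_entries /etc/fstab   # keep it off after reboots
#   swapon --show                     # prints nothing when no swap is active
```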
In principle, this guidance applies to all other operating systems; however, I haven't gone through a detailed practical setup as above.
Slightly related, here's my InterConnect presentation from earlier this year on Linux with WebSphere in the enterprise: https://publib.boulder.ibm.com/httpserv/cookbook/InterConnect2016_7393.pdf