Understanding what a scheduler is and how it works is fundamental to running your batch computing workloads well on high-performance computing (HPC) systems. A scheduler manages how your application accesses and consumes the compute, memory, storage, I/O, and network resources available to you on these systems. There are a number of different distributed batch job schedulers, sometimes also referred to as workload or resource managers, that you might encounter on an HPC system. For example, the Slurm Workload Manager is one of the most widely used on HPC systems today. However, at the core of every such system sits the Linux scheduler.
In this first part of our series on Batch Computing, we will introduce you to schedulers (what they are, why they exist, and how they work) using the Linux scheduler as our reference implementation and testbed. You will then learn how to interact with the Linux scheduler on your personal computer through a series of exercises that teach the most fundamental aspects of scheduling, including turning foreground processes into background ones and controlling their priority relative to the other processes running on your system.
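As a rough preview of the two ideas mentioned above (running a process in the background and adjusting its priority), here is a minimal Python sketch. It is not taken from the session materials. The helper names `start_background_process` and `lower_priority` are hypothetical, and the sketch assumes a Linux or other Unix-like system where `os.getpriority` and `os.setpriority` are available.

```python
import os
import subprocess
import sys


def start_background_process(seconds=2.0):
    """Start a child process that sleeps, without waiting for it to finish.

    Popen returns immediately, so the child keeps running "in the
    background" while this script continues.
    """
    return subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    )


def lower_priority(pid, increment=5):
    """Raise the nice value of process `pid`, which lowers its priority.

    A higher nice value means the scheduler favors the process less.
    Unprivileged users may raise, but usually not reduce, the nice value.
    Returns the nice value reported by the OS after the change.
    """
    current = os.getpriority(os.PRIO_PROCESS, pid)
    os.setpriority(os.PRIO_PROCESS, pid, current + increment)
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    child = start_background_process(2.0)
    print(f"started background process {child.pid}")
    print(f"new nice value: {lower_priority(child.pid, 5)}")
    child.wait()  # block until the background process exits
    print(f"exit code: {child.returncode}")
```

At a shell prompt, the same two steps correspond roughly to appending `&` to a command and then calling `renice` on the resulting process ID.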
To complete the exercises covered in Part I, you will need access to a computer with one of the following:
- a Linux operating system (OS);
- a Unix-like OS such as macOS;
- a Linux-compatible OS environment such as the Windows Subsystem for Linux; or
- a virtual machine running a Linux OS through a hypervisor like VirtualBox.
----
COMPLECS (COMPrehensive Learning for end-users to Effectively utilize CyberinfraStructure) is a new SDSC training program that covers the non-programming skills needed to use supercomputers effectively. Topics include parallel computing concepts, Linux tools and bash scripting, security, batch computing, how to get help, data management, and interactive computing. Each session offers 1 hour of instruction followed by a 30-minute Q&A. COMPLECS is supported by NSF award 2320934.