A lightweight tool for submitting Python functions for computation within a Slurm cluster
What is submitit?
Submitit is a lightweight tool for submitting Python functions for computation within a Slurm cluster.
It wraps job submission and provides access to results, logs, and more.
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
Submitit lets you switch seamlessly between executing on Slurm and executing locally.
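As a rough sketch of what that switch can look like, the snippet below wraps a submission in a helper that takes the target as an argument. The helper name `run_add` and its defaults are hypothetical, chosen only for illustration; it assumes that `submitit.AutoExecutor` accepts a `cluster` argument (`"local"` to run in a subprocess on the current machine, `None` to let submitit pick Slurm when it is available) and that `job.result()` blocks until the job finishes.

```python
def add(a, b):
    return a + b


def run_add(cluster="local", folder="log_test"):
    """Hypothetical helper: submit add(5, 7) and wait for its result."""
    # Imported inside the helper so this sketch can be loaded
    # on machines where submitit is not installed.
    import submitit

    # Assumption: cluster="local" runs the job in a local subprocess,
    # while cluster=None lets AutoExecutor use Slurm when it is available.
    executor = submitit.AutoExecutor(folder=folder, cluster=cluster)
    executor.update_parameters(timeout_min=1)
    job = executor.submit(add, 5, 7)
    # Blocks until the job has finished, then returns the function's output.
    return job.result()
```

Calling `run_add(cluster="local")` should exercise the same submission code path without a scheduler, which can be handy for checking a function before sending it to the cluster with `run_add(cluster=None)`.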
An example is worth a thousand words: performing an addition
From inside an environment with submitit installed:
import submitit
def add(a, b):
    return a + b
# executor is the submission interface (logs are dumped in the folder)
executor = submitit.AutoExecutor(folder="log_test")
# set timeout in min, and partition for running the job
executor.update_parameters(timeout_min=1, slurm_partition="dev")
job = executor.submit(add, 5, 7)  # will compute add(5, 7)
print(job.job_id)  # ID of your job