Submitit Launcher plugin
The Submitit Launcher plugin provides a SLURM Launcher based on Submitit.
Installation
pip install hydra-submitit-launcher --upgrade
Usage
Once installed, add hydra/launcher=submitit_slurm to your command line. Alternatively, override hydra/launcher in your config:
defaults:
- hydra/launcher: submitit_slurm
Note that this plugin expects a valid environment on the target host. Usually this means a shared file system between the launching host and the target host.
Submitit actually implements two different launchers: submitit_slurm to run on a SLURM cluster, and submitit_local for basic local tests.
The SLURM launcher exposes the following parameters (shown with their default values):
# @package hydra.launcher
_target_: hydra_plugins.hydra_submitit_launcher.submitit_launcher.SlurmLauncher
submitit_folder: ${hydra.sweep.dir}/.submitit/%j
timeout_min: 60
cpus_per_task: 1
gpus_per_node: 0
tasks_per_node: 1
mem_gb: 4
nodes: 1
name: ${hydra.job.name}
partition: null
comment: null
constraint: null
exclude: null
signal_delay_s: 120
max_num_timeout: 0
additional_parameters: {}
array_parallelism: 256
Similarly, the local launcher exposes the following parameters:
# @package hydra.launcher
_target_: hydra_plugins.hydra_submitit_launcher.submitit_launcher.LocalLauncher
submitit_folder: ${hydra.sweep.dir}/.submitit/%j
timeout_min: 60
cpus_per_task: 1
gpus_per_node: 0
tasks_per_node: 1
mem_gb: 4
nodes: 1
name: ${hydra.job.name}
You can set all these parameters in your configuration file and/or override them on the command line:
python foo.py --multirun hydra/launcher=submitit_slurm hydra.launcher.timeout_min=3
For more details, including descriptions for each parameter, check out the config file. You can also check the Submitit documentation.
Caution: use of --multirun is required for the launcher to be picked up.
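The command-line overrides described above can also be assembled programmatically, for example from a wrapper script that submits sweeps. The following is a minimal sketch and not part of the plugin: build_multirun_command is a hypothetical helper that only formats strings following the hydra.launcher.<parameter>=<value> pattern shown above, and it always includes --multirun because the launcher is not picked up without it.

```python
import shlex


def build_multirun_command(script, launcher="submitit_slurm", launcher_overrides=None):
    """Hypothetical helper: build a Hydra multirun command as an argument list.

    Only string formatting happens here; nothing is submitted to SLURM.
    """
    # --multirun is required, otherwise the launcher is not used at all.
    args = ["python", script, "--multirun", f"hydra/launcher={launcher}"]
    # Each launcher parameter becomes a hydra.launcher.<name>=<value> override.
    for name, value in (launcher_overrides or {}).items():
        args.append(f"hydra.launcher.{name}={value}")
    return args


cmd = build_multirun_command("foo.py", launcher_overrides={"timeout_min": 3})
print(shlex.join(cmd))
# python foo.py --multirun hydra/launcher=submitit_slurm hydra.launcher.timeout_min=3
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues when a value contains spaces.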
Example
An example application using this launcher is provided in the plugin repository.
Starting the app with python my_app.py task=1,2,3 --multirun (see Multi-run for details) launches three executions, one per value of task (to run locally for testing, override the launcher by adding hydra/launcher=submitit_local):
$ python my_app.py task=1,2,3 --multirun
[HYDRA] Sweep output dir : multirun/2020-05-28/15-05-22
[HYDRA] #0 : task=1
[HYDRA] #1 : task=2
[HYDRA] #2 : task=3
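To make the job numbering above concrete, here is a simplified illustration of how a comma-separated sweep such as task=1,2,3 expands into one job per value, and into one job per combination when several parameters are swept. This is a sketch under stated assumptions, not Hydra's actual override parser: expand_sweep is a hypothetical function that handles only plain key=v1,v2 strings, with no quoting, ranges, or nested values.

```python
from itertools import product


def expand_sweep(overrides):
    """Hypothetical sketch: expand 'key=v1,v2' overrides into per-job override lists."""
    keys, value_lists = [], []
    for override in overrides:
        # Split 'task=1,2,3' into the key 'task' and the values ['1', '2', '3'].
        key, _, values = override.partition("=")
        keys.append(key)
        value_lists.append(values.split(","))
    # One job per element of the Cartesian product of all value lists.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*value_lists)
    ]


for index, job in enumerate(expand_sweep(["task=1,2,3"])):
    print(f"#{index} : {' '.join(job)}")
# #0 : task=1
# #1 : task=2
# #2 : task=3
```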
You will be able to see the output of the app in the output dir:
$ tree
.
├── 0
│   └── my_app.log
├── 1
│   └── my_app.log
├── 2
│   └── my_app.log
└── multirun.yaml
$ cat 0/my_app.log
[2020-05-28 15:05:23,511][__main__][INFO] - Process ID 15887 executing task 1 ...
[2020-05-28 15:05:24,514][submitit][INFO] - Job completed successfully
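When a sweep contains many jobs, reading each log by hand becomes tedious. The sketch below gathers the logs from a sweep output directory, assuming the layout shown above: one numbered subdirectory per job, each containing my_app.log. The function collect_job_logs is a hypothetical helper, not part of Hydra or Submitit, and the log file name is an assumption taken from this example.

```python
from pathlib import Path


def collect_job_logs(sweep_dir, log_name="my_app.log"):
    """Hypothetical helper: map job number -> log text for one sweep directory."""
    root = Path(sweep_dir)
    if not root.is_dir():
        return {}
    logs = {}
    for job_dir in root.iterdir():
        # Job directories are named 0, 1, 2, ...; skip multirun.yaml, .submitit, etc.
        if job_dir.is_dir() and job_dir.name.isdigit():
            log_file = job_dir / log_name
            if log_file.is_file():
                logs[int(job_dir.name)] = log_file.read_text()
    # Return the entries ordered by job number.
    return dict(sorted(logs.items()))
```

For the run above, collect_job_logs("multirun/2020-05-28/15-05-22") would return a dictionary with keys 0, 1 and 2, provided all three jobs wrote their log files.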