Joblib Launcher plugin
The Joblib Launcher plugin provides a launcher for parallel tasks based on Joblib.Parallel.
Installation
pip install hydra-joblib-launcher --upgrade
Usage
Once installed, add hydra/launcher=joblib to your command line. Alternatively, override hydra/launcher in your config:
defaults:
- override hydra/launcher: joblib
By default, the launcher uses process-based parallelism across all available CPU cores. You can override the default configuration to, for example, limit the number of parallel executions.
The JobLibLauncherConf backing the config is defined here:
You can discover the Joblib Launcher parameters with:
# @package hydra.launcher
_target_: hydra_plugins.hydra_joblib_launcher.joblib_launcher.JoblibLauncher
n_jobs: -1
inner_max_num_threads: null
backend: null
prefer: processes
require: null
verbose: 0
timeout: null
pre_dispatch: 2*n_jobs
batch_size: auto
temp_folder: null
max_nbytes: null
mmap_mode: r
There are several standard approaches for configuring plugins. Check this page for more information.
See Joblib.Parallel documentation for full details about the parameters above.
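As a rough illustration of what the default `n_jobs: -1` means, the sketch below follows the rule described in the Joblib.Parallel documentation for turning an `n_jobs` value into a worker count. The helper name `resolve_n_jobs` is hypothetical and is not part of Joblib or Hydra. It also assumes `os.cpu_count()` as the CPU count, whereas Joblib may report fewer CPUs when affinity or container limits apply.

```python
import os
from typing import Optional


def resolve_n_jobs(n_jobs: int, n_cpus: Optional[int] = None) -> int:
    """Approximate the number of workers requested by an n_jobs value.

    Hypothetical helper for illustration only; it mirrors the rule in the
    joblib.Parallel documentation rather than calling Joblib itself.
    """
    if n_cpus is None:
        # Assumption: os.cpu_count() stands in for Joblib's own CPU detection.
        n_cpus = os.cpu_count() or 1
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 means all CPUs, -2 means all but one, and so on (never below 1).
        return max(n_cpus + 1 + n_jobs, 1)
    # Positive values are taken as an explicit worker count.
    return n_jobs


if __name__ == "__main__":
    for value in (-1, -2, 4):
        print(f"n_jobs={value} on 8 CPUs -> {resolve_n_jobs(value, n_cpus=8)} workers")
```

On an 8-core machine this gives 8 workers for `n_jobs: -1`, 7 for `n_jobs: -2`, and 4 for `n_jobs: 4`, so setting a small positive value is the direct way to cap the number of parallel executions.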
Controlling native library thread pools
When using libraries that manage native thread pools, such as OpenBLAS, MKL, OpenMP, Numba, or NumExpr, set inner_max_num_threads to limit the number of native threads available to each Joblib worker process:
hydra:
  launcher:
    n_jobs: 8
    inner_max_num_threads: 1
This can help avoid oversubscription when multiple Hydra jobs run in parallel and each job calls into a multithreaded native library. For arbitrary environment variables, use hydra.job.env_set instead.
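To make the oversubscription concern concrete, here is a minimal sketch of the arithmetic involved: the upper bound on native threads is the number of worker processes multiplied by the per-worker thread limit. The function names `thread_budget` and `is_oversubscribed` are hypothetical helpers for illustration, not part of the plugin or of Joblib.

```python
def thread_budget(n_workers: int, inner_max_num_threads: int) -> int:
    """Upper bound on native threads across all worker processes."""
    if n_workers < 1 or inner_max_num_threads < 1:
        raise ValueError("both arguments must be positive integers")
    return n_workers * inner_max_num_threads


def is_oversubscribed(n_workers: int, inner_max_num_threads: int, n_cpus: int) -> bool:
    """True when the thread budget exceeds the number of CPU cores."""
    return thread_budget(n_workers, inner_max_num_threads) > n_cpus


if __name__ == "__main__":
    # Values from the configuration above, on an assumed 8-core machine.
    print(thread_budget(8, 1))         # 8 workers x 1 thread each
    print(is_oversubscribed(8, 1, 8))  # fits within 8 cores
    print(is_oversubscribed(8, 4, 8))  # 32 threads competing for 8 cores
```

With `n_jobs: 8` and `inner_max_num_threads: 1`, the budget is 8 threads, which fits an 8-core machine. Raising the per-worker limit to 4 would allow up to 32 threads on the same machine, which is the situation the setting is meant to prevent.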
NOTE: The only supported Joblib backend is Loky (process-based parallelism).
An example application using this launcher is provided in the plugin repository.
Starting the app with python my_app.py --multirun task=1,2,3,4,5 will launch five parallel executions:
$ python my_app.py --multirun task=1,2,3,4,5
[HYDRA] Joblib.Parallel(n_jobs=-1,verbose=0,timeout=None,pre_dispatch=2*n_jobs,batch_size=auto,temp_folder=None,max_nbytes=None,mmap_mode=r,backend=loky) is launching 5 jobs
[HYDRA] Launching jobs, sweep output dir : multirun/2020-02-18/10-00-00
[__main__][INFO] - Process ID 14336 executing task 2 ...
[__main__][INFO] - Process ID 14333 executing task 1 ...
[__main__][INFO] - Process ID 14334 executing task 3 ...
[__main__][INFO] - Process ID 14335 executing task 4 ...
[__main__][INFO] - Process ID 14337 executing task 5 ...