Version: Next

Configuring Experiments

Problem

A common problem is maintaining multiple configurations of an application. This can get especially tedious when the configuration differences span multiple dimensions. This pattern shows how to cleanly support multiple configurations, with each configuration file only specifying the changes to the master (default) configuration.

Solution

Create a config file specifying the overrides to the default configuration, and then select it on the command line, e.g. $ python my_app.py +experiment=fast_mode

To avoid clutter, we place the experiment config files in a dedicated config group called experiment.

Example

In this example, we will create configurations for each of the server and database pairings that we want to benchmark.

The default configuration is:

config.yaml
defaults:
  - db: mysql
  - server: apache

db/mysql.yaml
name: mysql

server/apache.yaml
name: apache
port: 80

db/sqlite.yaml
name: sqlite

server/nginx.yaml
name: nginx
port: 80

Directory structure
├── config.yaml
├── db
│   ├── mysql.yaml
│   └── sqlite.yaml
└── server
    ├── apache.yaml
    └── nginx.yaml

$ python my_app.py
db:
  name: mysql
server:
  name: apache
  port: 80

The benchmark config files specify the deltas from the default configuration:

experiment/aplite.yaml
# @package _global_
defaults:
  - override /db: sqlite
server:
  port: 8080

experiment/nglite.yaml
# @package _global_
defaults:
  - override /db: sqlite
  - override /server: nginx
server:
  port: 8080

$ python my_app.py +experiment=aplite
db:
  name: sqlite
server:
  name: apache
  port: 8080

$ python my_app.py +experiment=nglite
db:
  name: sqlite
server:
  name: nginx
  port: 8080
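
The composition shown above can be approximated in plain Python to make the idea of "deltas" concrete. This is a minimal sketch, not Hydra's implementation: GROUPS, DEFAULTS, EXPERIMENTS, deep_merge and compose are hypothetical names that mirror the YAML files in this example as in-memory dictionaries.

```python
import copy

# Hypothetical in-memory stand-ins for the YAML files in this example.
GROUPS = {
    "db": {"mysql": {"name": "mysql"}, "sqlite": {"name": "sqlite"}},
    "server": {
        "apache": {"name": "apache", "port": 80},
        "nginx": {"name": "nginx", "port": 80},
    },
}
# Mirrors the defaults list in config.yaml.
DEFAULTS = {"db": "mysql", "server": "apache"}
# Each experiment holds group overrides plus value deltas.
EXPERIMENTS = {
    "aplite": {
        "overrides": {"db": "sqlite"},
        "values": {"server": {"port": 8080}},
    },
    "nglite": {
        "overrides": {"db": "sqlite", "server": "nginx"},
        "values": {"server": {"port": 8080}},
    },
}


def deep_merge(base, delta):
    """Return a copy of base with delta merged in, recursing into nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in delta.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def compose(experiment=None):
    """Pick one option per group, then apply the experiment's value deltas."""
    choices = dict(DEFAULTS)
    values = {}
    if experiment is not None:
        choices.update(EXPERIMENTS[experiment]["overrides"])
        values = EXPERIMENTS[experiment]["values"]
    cfg = {
        group: copy.deepcopy(GROUPS[group][option])
        for group, option in choices.items()
    }
    return deep_merge(cfg, values)


print(compose())
print(compose("aplite"))
print(compose("nglite"))
```

Each experiment states only what differs from the defaults, so adding a new pairing means adding one small entry rather than copying the whole configuration.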

Key concepts:

  • # @package _global_
    Changes specified in this config should be interpreted as relative to the _global_ package.
    We could instead place nglite.yaml and aplite.yaml next to config.yaml and omit this line.
  • The overrides of /db and /server are absolute paths.
    This is necessary because they are outside of the experiment directory.
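
The difference between absolute and relative group paths can be illustrated with a small path-resolution helper. This is a sketch under the assumption that a relative entry is resolved against the group of the config file that contains it; resolve_group is a hypothetical name, not a Hydra function.

```python
import posixpath


def resolve_group(containing_group, entry):
    """Resolve a defaults-list group path.

    A leading "/" marks an absolute path from the config root; anything
    else is treated as relative to containing_group (the group of the
    config file that holds the defaults list).
    """
    if entry.startswith("/"):
        return entry.lstrip("/")
    return posixpath.normpath(posixpath.join(containing_group, entry))


# Inside experiment/aplite.yaml, "/db" points at the top-level db group.
print(resolve_group("experiment", "/db"))
# Without the leading slash, the path would land under experiment instead.
print(resolve_group("experiment", "db"))
# In config.yaml (the root), a relative "db" already refers to the top-level group.
print(resolve_group("", "db"))
```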

Running the experiments from the command line requires prefixing the experiment choice with a +. The default configuration's defaults list has no experiment entry, so the experiment config group is an addition, not an override.
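
The distinction between adding and overriding can be sketched as follows. This is an illustrative model only: apply_choice is a hypothetical helper that treats the defaults list as a plain mapping, and the error messages are invented for the sketch rather than copied from Hydra.

```python
def apply_choice(defaults, override):
    """Apply 'group=option' or '+group=option' to a {group: option} mapping."""
    append = override.startswith("+")
    body = override[1:] if append else override
    group, option = body.split("=", 1)
    updated = dict(defaults)
    if append and group in updated:
        # "+" adds a new group, so the group must not exist yet.
        raise ValueError(f"'{group}' is already in the defaults list; drop the '+'")
    if not append and group not in updated:
        # A plain override needs an existing entry to replace.
        raise ValueError(f"'{group}' is not in the defaults list; use '+{group}={option}'")
    updated[group] = option
    return updated


defaults = {"db": "mysql", "server": "apache"}
print(apply_choice(defaults, "+experiment=aplite"))
print(apply_choice(defaults, "db=sqlite"))
try:
    apply_choice(defaults, "experiment=aplite")
except ValueError as err:
    print(f"error: {err}")
```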

Sweeping over experiments

This approach also enables sweeping over those experiments to easily compare their results:

$ python my_app.py --multirun +experiment=aplite,nglite
[HYDRA] Launching 2 jobs locally
[HYDRA] #0 : +experiment=aplite
db:
  name: sqlite
server:
  name: apache
  port: 8080
[HYDRA] #1 : +experiment=nglite
db:
  name: sqlite
server:
  name: nginx
  port: 8080

To run all the experiments, use the glob syntax:

$ python my_app.py --multirun '+experiment=glob(*)'
[HYDRA] #0 : +experiment=aplite
...
[HYDRA] #1 : +experiment=nglite
...
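
How a sweep turns into individual jobs can be sketched in a few lines of Python. This is a simplified model, not Hydra's sweeper: expand_sweep and AVAILABLE are hypothetical names, the glob handling covers only a single pattern such as glob(*), and the list of available options is supplied by hand instead of being discovered from the config directory.

```python
from fnmatch import fnmatch
from itertools import product


def expand_sweep(overrides, available):
    """Expand sweep overrides into one override list per job."""
    per_key = []
    for override in overrides:
        key, value = override.split("=", 1)
        group = key.lstrip("+")
        if value.startswith("glob(") and value.endswith(")"):
            # Match the pattern against the options known for this group.
            pattern = value[len("glob("):-1]
            options = sorted(
                opt for opt in available.get(group, []) if fnmatch(opt, pattern)
            )
        else:
            # A comma-separated list names the options explicitly.
            options = value.split(",")
        per_key.append([f"{key}={option}" for option in options])
    # One job per combination of choices across all swept keys.
    return [list(job) for job in product(*per_key)]


# Hypothetical listing of the experiment directory in this example.
AVAILABLE = {"experiment": ["aplite", "nglite"]}

for index, job in enumerate(expand_sweep(["+experiment=glob(*)"], AVAILABLE)):
    print(f"#{index} : {' '.join(job)}")
```

In this model, sweeping over more than one key yields the Cartesian product of the choices, so two experiments combined with two values of another key would produce four jobs.
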
Last updated by Omry Yadan