Configuring Experiments

A common problem is maintaining multiple configurations of an application. This can get especially tedious when the configuration differences span multiple dimensions. This pattern shows how to cleanly support multiple configurations, with each configuration file only specifying the changes to the master (default) configuration.


Create a config file specifying the overrides to the default configuration, and then select it from the command line, e.g. $ python my_app.py +experiment=fast_mode.

To avoid clutter, we place the experiment config files in a dedicated config group called experiment.


In this example, we will create configurations for each of the server and database pairings that we want to benchmark.

The default configuration is:

config.yaml:

    defaults:
      - db: mysql
      - server: apache

db/mysql.yaml:

    name: mysql

server/apache.yaml:

    name: apache
    port: 80

db/sqlite.yaml:

    name: sqlite

server/nginx.yaml:

    name: nginx
    port: 80

Directory structure:

    ├── config.yaml
    ├── db
    │   ├── mysql.yaml
    │   └── sqlite.yaml
    └── server
        ├── apache.yaml
        └── nginx.yaml

Running the application without arguments prints the default configuration:

    $ python my_app.py
    db:
      name: mysql
    server:
      name: apache
      port: 80

The benchmark config files specify the deltas from the default configuration:

experiment/aplite.yaml:

    # @package _global_
    defaults:
      - override /db: sqlite

    server:
      port: 8080

experiment/nglite.yaml:

    # @package _global_
    defaults:
      - override /db: sqlite
      - override /server: nginx

    server:
      port: 8080

Selecting an experiment applies its deltas on top of the defaults:

    $ python my_app.py +experiment=aplite
    db:
      name: sqlite
    server:
      name: apache
      port: 8080

    $ python my_app.py +experiment=nglite
    db:
      name: sqlite
    server:
      name: nginx
      port: 8080
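
To make the idea of "deltas from the default configuration" concrete, here is a minimal sketch that reproduces the three outputs above with plain dictionaries. It is a simplified model of the composition, not Hydra's implementation: the names GROUPS, DEFAULTS, EXPERIMENTS, deep_merge and compose are hypothetical and exist only for this illustration.

```python
import copy

# Hypothetical in-memory stand-ins for the YAML files shown above.
GROUPS = {
    "db": {"mysql": {"name": "mysql"}, "sqlite": {"name": "sqlite"}},
    "server": {
        "apache": {"name": "apache", "port": 80},
        "nginx": {"name": "nginx", "port": 80},
    },
}

# Mirrors the defaults list in config.yaml.
DEFAULTS = {"db": "mysql", "server": "apache"}

# Each experiment holds group overrides plus values merged at the top level
# (the role played by "# @package _global_" in the real config files).
EXPERIMENTS = {
    "aplite": {
        "overrides": {"db": "sqlite"},
        "values": {"server": {"port": 8080}},
    },
    "nglite": {
        "overrides": {"db": "sqlite", "server": "nginx"},
        "values": {"server": {"port": 8080}},
    },
}


def deep_merge(base, delta):
    """Return a copy of base with delta merged in recursively (delta wins)."""
    merged = copy.deepcopy(base)
    for key, value in delta.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def compose(experiment=None):
    """Build the final config: pick group options, then apply experiment values."""
    choices = dict(DEFAULTS)
    values = {}
    if experiment is not None:
        choices.update(EXPERIMENTS[experiment]["overrides"])
        values = EXPERIMENTS[experiment]["values"]
    cfg = {group: copy.deepcopy(GROUPS[group][option]) for group, option in choices.items()}
    return deep_merge(cfg, values)


if __name__ == "__main__":
    for name in (None, "aplite", "nglite"):
        print(name, compose(name))
```

The point of the sketch is that an experiment file only needs to state what differs: which group options to swap and which individual values to change.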

Key concepts:

  • # @package _global_
    Changes specified in this config should be interpreted as relative to the _global_ package.
    We could instead place nglite.yaml and aplite.yaml next to config.yaml and omit this line.
  • The overrides of /db and /server are absolute paths.
    This is necessary because they are outside of the experiment directory.
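
The difference between an absolute and a relative group path can be sketched as a small resolution rule. This is an illustrative model only, assuming that a group name without a leading slash is resolved relative to the group containing the config file; resolve_group is a hypothetical helper, not a Hydra function.

```python
def resolve_group(parent_group: str, entry: str) -> str:
    """Resolve a defaults-list group name to a path from the config root.

    parent_group is the group of the config file declaring the entry
    ("" for a file next to config.yaml). A leading "/" marks an absolute path.
    """
    if entry.startswith("/"):
        return entry.lstrip("/")
    return f"{parent_group}/{entry}" if parent_group else entry


if __name__ == "__main__":
    # From experiment/aplite.yaml, "/db" reaches the top-level db group.
    print(resolve_group("experiment", "/db"))
    # Without the slash, the same name would be looked up under experiment/.
    print(resolve_group("experiment", "db"))
```

Under this model, writing override db: sqlite inside experiment/aplite.yaml would point at experiment/db, which does not exist in the directory structure above.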

Running the experiments from the command line requires prefixing the experiment choice with a +. The experiment config group is an addition, not an override.
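
The distinction between adding and overriding a config group can be modeled as follows. This is a rough sketch under simplifying assumptions: apply_choice is a hypothetical function, and the error conditions are a simplification of what Hydra actually checks and reports.

```python
def apply_choice(defaults: dict, override: str) -> dict:
    """Apply one command-line group choice to a copy of the defaults.

    "+group=option" adds a group that is not yet in the defaults.
    "group=option" replaces the option of a group that is already there.
    """
    key, _, option = override.partition("=")
    updated = dict(defaults)
    if key.startswith("+"):
        group = key[1:]
        if group in updated:
            raise ValueError(f"{group} is already in the defaults; drop the '+'")
        updated[group] = option
    else:
        if key not in updated:
            raise ValueError(f"{key} is not in the defaults; use '+{key}={option}'")
        updated[key] = option
    return updated


if __name__ == "__main__":
    defaults = {"db": "mysql", "server": "apache"}
    print(apply_choice(defaults, "+experiment=aplite"))
    print(apply_choice(defaults, "db=sqlite"))
```

Because config.yaml lists only db and server in its defaults, experiment has nothing to override and must be added with the + prefix.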

Sweeping over experiments

This approach also enables sweeping over those experiments to easily compare their results:

    $ python my_app.py --multirun +experiment=aplite,nglite
    [HYDRA] Launching 2 jobs locally
    [HYDRA]        #0 : +experiment=aplite
    db:
      name: sqlite
    server:
      name: apache
      port: 8080

    [HYDRA]        #1 : +experiment=nglite
    db:
      name: sqlite
    server:
      name: nginx
      port: 8080

To run all the experiments, use the glob syntax:

    $ python my_app.py --multirun '+experiment=glob(*)'
    [HYDRA]        #0 : +experiment=aplite
    ...
    [HYDRA]        #1 : +experiment=nglite
    ...
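
How a sweep turns into individual jobs can be sketched as follows. This is an illustrative approximation, not Hydra's sweeper: expand_sweep is a hypothetical helper, the list of available options is passed in by hand, shell-style matching via fnmatch stands in for the glob syntax, and the sorted job order is an assumption of this sketch.

```python
from fnmatch import fnmatch


def expand_sweep(override: str, available: list) -> list:
    """Expand one sweep override into one override string per job.

    Supports a comma-separated list ("a,b") and a glob(...) pattern that is
    matched against the available option names.
    """
    key, _, spec = override.partition("=")
    if spec.startswith("glob(") and spec.endswith(")"):
        pattern = spec[len("glob("):-1]
        choices = sorted(name for name in available if fnmatch(name, pattern))
    else:
        choices = spec.split(",")
    return [f"{key}={choice}" for choice in choices]


if __name__ == "__main__":
    options = ["aplite", "nglite"]  # names of the files in the experiment group
    for job, item in enumerate(expand_sweep("+experiment=glob(*)", options)):
        print(f"#{job} : {item}")
```

Each resulting string corresponds to one job line in the output above, and each job composes its own configuration independently.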