# Experiment arguments
Spiking-FullSubNet uses TOML configuration files (`*.toml`) to configure and manage experiments. Each experiment is configured by a `*.toml` file, which contains the experiment meta information, trainer, loss function, learning rate scheduler, optimizer, model, dataset, and acoustic features. The basename of the `*.toml` file is used as the experiment ID (identifier).

You can track configuration changes with version control and reproduce an experiment by reusing the same configuration file. For more information on TOML syntax, visit the TOML website.
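As a hedged illustration of that convention (not the framework's actual entry point), the sketch below loads a configuration file with the standard library and derives the experiment ID from its basename; the file path is only an example taken from the recipes layout shown later on this page.

```python
import tomllib  # Python 3.11+; older versions can use the third-party "tomli" package
from pathlib import Path

# Example configuration file, following the recipes layout shown later on this page.
config_path = Path("recipes/intel_ndns/sdnn_delays/baseline.toml")

with config_path.open("rb") as f:
    config = tomllib.load(f)

experiment_id = config_path.stem                  # "baseline" becomes the experiment ID
print(experiment_id, config["meta"]["save_dir"])  # sections are plain nested dicts
```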
## Sample `*.toml` file

This sample file demonstrates many of the settings available for configuration in AudioZEN.
```toml
[meta]
save_dir = "sdnn_delays/exp"
seed = 0
use_amp = false
use_deterministic_algorithms = false

[trainer]
path = "trainer.Trainer"

[trainer.args]
max_epoch = 9999
clip_grad_norm_value = 5

[acoustics]
n_fft = 512
win_length = 256
sr = 16000
hop_length = 256

[loss]
path = "audiozen.loss.SoftDTWLoss"

[loss.args]
gamma = 0.1

[optimizer]
path = "torch.optim.RAdam"

[optimizer.args]
lr = 0.01
weight_decay = 1e-5

[model]
path = "model.Model"

[model.args]
threshold = 0.1
tau_grad = 0.1
scale_grad = 0.8
max_delay = 64
out_delay = 0
```
Check any experiment configuration file in the `recipes` directory for more details.
## Configuration details
An AudioZEN configuration file must contain the following sections:

- `meta`: configures the experiment meta information, such as `save_dir`, `seed`, etc.
- `trainer`: configures the trainer.
- `loss_function`: configures the loss function.
- `lr_scheduler`: configures the learning rate scheduler.
- `optimizer`: configures the optimizer.
- `model`: configures the model.
- `dataset`: configures the dataset.
- `acoustics`: configures the acoustic features.
### `meta` section

The `meta` section is used to configure the experiment meta information.
| Item | Description |
| --- | --- |
| `save_dir` | The directory where the experiment is saved. The log information, model checkpoints, and enhanced audio files are stored in this directory. |
| `seed` | The random seed used to initialize the random number generator. |
| `use_amp` | Whether to use automatic mixed precision (AMP) to accelerate training. |
| `use_deterministic_algorithms` | Whether to use deterministic algorithms. If it is `true`, training will be slower but more reproducible. |
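As a rough illustration of what these options control, the snippet below shows the standard PyTorch calls that correspond to `seed` and `use_deterministic_algorithms`; this is a hedged sketch of the semantics, not the trainer's actual implementation.

```python
import torch

# Hedged sketch of what the [meta] options typically control (standard PyTorch calls;
# how AudioZEN's trainer wires these up may differ).
seed = 0
use_deterministic_algorithms = False

torch.manual_seed(seed)                                            # seed the global RNG
torch.use_deterministic_algorithms(use_deterministic_algorithms)   # reproducible but slower when True
# use_amp would enable torch.autocast / GradScaler around the forward and backward passes.
```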
### `trainer` section

The `trainer` section is used to configure a trainer. It contains two parts: `path` and `args`. `path` is a string that specifies the path to the trainer class, and `args` is a dictionary that specifies the arguments passed to the trainer class. It looks like this:
```toml
[trainer]
path = "trainer.Trainer"

[trainer.args]
max_epochs = 100
clip_grad_norm_value = 5
...
```
In this example, AudioZEN loads a custom `Trainer` class from `trainer.py` on the Python search path and initializes it with the arguments in the `[trainer.args]` section. There are multiple ways to specify the `path` argument; see the next section for more details.
In AudioZEN, the `Trainer` class must be a subclass of `audiozen.trainer.base_trainer.BaseTrainer`. It supports at least the following arguments:
| Item | Default | Description |
| --- | --- | --- |
|  |  | Whether to enable debug mode. If it is true, the times at which NaN and Inf values occur are collected. |
|  |  | The maximum number of steps to train. |
|  |  | The maximum number of epochs to train. |
|  |  | The maximum norm of the gradients used for clipping. `-1` means no clipping. |
|  |  | Whether to find the best model by the maximum score. |
|  |  | The interval of saving checkpoints. |
|  |  | The number of epochs with no improvement after which training is stopped. |
|  |  | Whether to plot the norm of the gradients. |
|  |  | The interval of validation. |
|  |  | The maximum number of checkpoints to keep. Saving too many checkpoints can exhaust disk space. |
|  |  | The name of the scheduler. |
|  |  | The number of warmup steps. |
|  |  | The ratio of warmup steps. |
|  |  | The number of gradient accumulation steps, used to simulate a larger batch size. |
### Loading a module by the `path` argument

AudioZEN supports multiple ways to load a module via the `path` argument in the `*.toml` file. For example, suppose we have the following directory structure:
```
recipes/intel_ndns
├── README.md
├── run.py
└── sdnn_delays
    ├── baseline.toml
    ├── model.py
    └── trainer.py
```
In `recipes/intel_ndns/sdnn_delays/baseline.toml`, the `path` of the trainer is set to:
```toml
[trainer]
path = "sdnn_delays.trainer.Trainer"
```
In this case, we call the `Trainer` class in the module `recipes/intel_ndns/sdnn_delays/trainer`. If we set the `path` to:
```toml
[trainer]
path = "audiozen.trainer.custom_trainer.CustomTrainer"
```
we will call the `CustomTrainer` class in `audiozen/trainer/custom_trainer.py`.
**Important:** If you want to use a `Trainer` from the `audiozen` package, you must first install it in editable mode with `pip install -e .`.
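The resolution logic is essentially a dynamic import: split the dotted path into a module part and a class name, import the module, and look up the attribute. The following is a minimal sketch of that idea; the helper name `load_class` and the commented usage are illustrative, not AudioZEN's actual implementation.

```python
import importlib

def load_class(dotted_path: str):
    """Illustrative helper: resolve "pkg.module.ClassName" to the class object."""
    module_path, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)  # e.g., "sdnn_delays.trainer"
    return getattr(module, class_name)             # e.g., the Trainer class

# Hypothetical usage with the [trainer] section of a parsed config:
# trainer_class = load_class(config["trainer"]["path"])
# trainer = trainer_class(**config["trainer"]["args"])
```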
### `loss_function`, `optimizer`, `model`, and `dataset` sections

The `loss_function`, `optimizer`, `model`, and `dataset` sections configure the loss function, optimizer, model, and dataset, respectively. They follow the same logic as the `trainer` section:
```toml
[loss_function]
path = "..."

[loss_function.args]
...

[optimizer]
path = "..."

[optimizer.args]
...

...
```
You may use a loss function provided by PyTorch or implement your own. For example, the following configuration uses PyTorch's `MSELoss`:
```toml
[loss_function]
path = "torch.nn.MSELoss"

[loss_function.args]
```
Use a custom loss function from `audiozen.loss`:
```toml
[loss_function]
path = "audiozen.loss.MyLoss"

[loss_function.args]
weights = [1.0, 1.0]
...
```
**Note:** You must keep the `[loss_function.args]` section even if the loss function does not need any arguments.
You may use a learning rate scheduler provided by PyTorch or implement your own. For example, the following configuration uses `StepLR`:
```toml
[lr_scheduler]
path = "torch.optim.lr_scheduler.StepLR"

[lr_scheduler.args]
step_size = 100
gamma = 0.5
...
```
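Conceptually, this configuration corresponds to constructing the PyTorch scheduler with those keyword arguments. The sketch below shows the equivalent calls; the model and optimizer setup here are illustrative, and in practice they come from the `[model]` and `[optimizer]` sections.

```python
import torch
from torch.optim import RAdam
from torch.optim.lr_scheduler import StepLR

# Illustrative model/optimizer; in practice these come from the [model] and [optimizer] sections.
model = torch.nn.Linear(10, 1)
optimizer = RAdam(model.parameters(), lr=0.01, weight_decay=1e-5)

# Equivalent of the [lr_scheduler] path + args: halve the learning rate
# every 100 calls to scheduler.step().
scheduler = StepLR(optimizer, step_size=100, gamma=0.5)
```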
### `acoustics` section

The `acoustics` section is used to configure the acoustic features. These settings are used across the whole project, e.g., for visualization, except in the `dataloader` and `model` sections. You can access them anywhere in your custom `Trainer` class.
| Item | Description |
| --- | --- |
| `sr` | The sample rate of the audio. |
| `n_fft` | The number of FFT points. |
| `hop_length` | The number of samples between successive frames. |
| `win_length` | The length of the STFT window. |
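As an example of how these values are typically consumed, the following hedged sketch computes an STFT with the configured parameters using `torch.stft`; the input tensor and the place where this would happen inside a custom `Trainer` are illustrative.

```python
import torch

# Values from the [acoustics] section of the sample configuration.
sr = 16000
n_fft = 512
win_length = 256
hop_length = 256

# Illustrative one-second mono signal; in a custom Trainer this would be real audio.
waveform = torch.randn(1, sr)

spectrogram = torch.stft(
    waveform,
    n_fft=n_fft,
    hop_length=hop_length,
    win_length=win_length,
    window=torch.hann_window(win_length),
    return_complex=True,
)  # shape: (1, n_fft // 2 + 1, num_frames)
```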