Configurations for Training Essentials

Training a model requires a few essential components: the model itself, the optimizer, the scheduler, and the training settings (batch size, number of epochs, etc.). Ablator provides a configuration class for each of these components.

Main Model Configuration

class ablator.config.proto.ModelConfig(*args, **kwargs)

Bases: ConfigBase

Base class for model configuration. Subclass it to declare your model's hyperparameters, then pass the config object to the model constructor, which reads its attributes to build the model.

Examples

Define a custom model configuration class for your model:

>>> @configclass
... class CustomModelConfig(ModelConfig):
...     input_size: int
...     hidden_size: int
...     num_classes: int

Define your model class, pass the configuration to the constructor, and build the model:

>>> from torch import nn
>>> class FashionMNISTModel(nn.Module):
...     def __init__(self, config: CustomModelConfig):
...         super().__init__()
...         # Layer sizes come from the model config attributes.
...         self.fc1 = nn.Linear(config.input_size, config.hidden_size)
...         self.relu1 = nn.ReLU()
...         self.fc3 = nn.Linear(config.hidden_size, config.num_classes)
...     def forward(self, x):
...         # Assumes x is already flattened to (batch, input_size).
...         x = self.relu1(self.fc1(x))
...         return self.fc3(x)
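
The way config attributes drive model construction can also be sketched without any dependencies. The snippet below is an illustration only: it uses a standard-library dataclass as a stand-in for a ModelConfig subclass (an assumption, so that it runs without Ablator or PyTorch), and the names SketchModelConfig, layer_shapes and count_parameters are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SketchModelConfig:
    """Stand-in for a ModelConfig subclass; NOT the Ablator class."""

    input_size: int
    hidden_size: int
    num_classes: int


def layer_shapes(config: SketchModelConfig) -> list[tuple[int, int]]:
    """Return (in_features, out_features) for the two linear layers."""
    return [
        (config.input_size, config.hidden_size),   # corresponds to fc1
        (config.hidden_size, config.num_classes),  # corresponds to fc3
    ]


def count_parameters(config: SketchModelConfig) -> int:
    """Weights plus biases of each linear layer."""
    return sum(n_in * n_out + n_out for n_in, n_out in layer_shapes(config))


# Illustrative values: 28x28 inputs, one hidden layer, 10 classes.
config = SketchModelConfig(input_size=28 * 28, hidden_size=256, num_classes=10)
print(layer_shapes(config))
print(count_parameters(config))
```

Because every size is read from the config object, changing a hyperparameter such as hidden_size changes the model without touching the model code, which is what makes the config a natural unit to vary in an ablation study.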

Optimizer Configurations

class ablator.modules.optimizer.OptimizerConfig(name, arguments: dict[str, Any])

Bases: ConfigBase

Configuration for an optimizer, including optimizer name and arguments (these arguments are specific to a certain type of optimizer like SGD, Adam, AdamW). This optimizer config will be provided to TrainConfig as part of the training setting of the experiment.

Examples

The following example shows how to create an optimizer config for SGD optimizer and use it in TrainConfig to define the training setting of the experiment.

>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5})
>>> train_config = TrainConfig(
...     dataset="[Dataset Name]",
...     batch_size=32,
...     epochs=20,
...     optimizer_config=optim_config,
...     scheduler_config=None,
...     rand_weights_init=True,
... )
>>> # ... create running config (proto/parallel), model wrapper, trainer and launch experiment

Note

A common use case is to run ablation studies on different optimizers to learn how they affect model performance. OptimizerConfig, however, configures only a single optimizer per experiment. To compare optimizers, create a custom config class and add an extra method called make_optimizer. See the tutorial on Search space for different types of optimizers and scheduler for more details.

Attributes:

name: str: Name of the optimizer.
arguments: OptimizerArgs: Arguments for the optimizer, specific to a certain type of optimizer.
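
The name-plus-arguments design can be pictured as a registry lookup: the name selects an optimizer, and the arguments dictionary is forwarded to it. The sketch below illustrates that pattern in pure Python and is not Ablator's implementation. SketchOptimizerConfig, STEP_FUNCTIONS, sgd_step and make_step are all hypothetical names, and the update rule is plain SGD written by hand.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


def sgd_step(params: list[float], grads: list[float], lr: float,
             weight_decay: float = 0.0) -> list[float]:
    """Plain SGD with optional L2 weight decay: p <- p - lr * (g + wd * p)."""
    return [p - lr * (g + weight_decay * p) for p, g in zip(params, grads)]


# Hypothetical registry: optimizer name -> function applying one update step.
STEP_FUNCTIONS: dict[str, Callable[..., list[float]]] = {"sgd": sgd_step}


@dataclass
class SketchOptimizerConfig:
    """Stand-in mirroring the (name, arguments) shape; NOT the Ablator class."""

    name: str
    arguments: dict[str, Any] = field(default_factory=dict)

    def make_step(self) -> Callable[[list[float], list[float]], list[float]]:
        """Resolve the name and bind the optimizer-specific arguments."""
        if self.name not in STEP_FUNCTIONS:
            raise ValueError(f"unknown optimizer: {self.name!r}")
        step_fn = STEP_FUNCTIONS[self.name]
        return lambda params, grads: step_fn(params, grads, **self.arguments)


# Same name and arguments as the example above; parameters and gradients are made up.
step = SketchOptimizerConfig("sgd", {"lr": 0.5}).make_step()
new_params = step([1.0, -2.0], [0.5, 0.5])
print(new_params)
```

Keeping the arguments as a free-form dictionary is what lets one config class cover optimizers with different hyperparameters, at the cost of validating them only when the optimizer is built.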

Scheduler Configurations

class ablator.modules.scheduler.SchedulerConfig(name, arguments: dict[str, Any])

Bases: ConfigBase

A class that defines a configuration for a learning rate scheduler. This scheduler config can optionally be provided to TrainConfig as part of the training setting of the experiment.

Examples

The following example shows how to create a scheduler config and use it in TrainConfig to define the training setting of the experiment.

>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5})
>>> scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
>>> train_config = TrainConfig(
...     dataset="[Dataset Name]",
...     batch_size=32,
...     epochs=20,
...     optimizer_config=optim_config,
...     scheduler_config=scheduler_config,
...     rand_weights_init=True,
... )
>>> # ... create running config (proto/parallel), model wrapper, trainer and launch experiment

Note

A common use case is to run ablation studies on different schedulers to learn how they affect model performance. SchedulerConfig, however, configures only a single scheduler per experiment. To compare schedulers, create a custom config class and add an extra method called make_scheduler. See the tutorial on Search space for different types of optimizers and scheduler for more details.

Attributes:

name: str: The name of the scheduler.
arguments: SchedulerArgs: The arguments needed to initialize the scheduler.
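
To see what arguments such as step_size and gamma imply numerically, the sketch below computes the learning rate under the usual step-decay rule, lr_t = lr_0 * gamma ** floor(t / step_size), which is how PyTorch's StepLR behaves. That the "step" scheduler name maps to exactly this rule is an assumption here, and step_decay_lr is a hypothetical helper, not part of Ablator.

```python
def step_decay_lr(base_lr: float, gamma: float, step_size: int, epoch: int) -> float:
    """Learning rate after `epoch` completed epochs under step decay."""
    if step_size < 1:
        raise ValueError("step_size must be a positive integer")
    # The rate is multiplied by gamma once every `step_size` epochs.
    return base_lr * gamma ** (epoch // step_size)


# Values from the example above: lr=0.5, step_size=1, gamma=0.99, 20 epochs.
for epoch in (0, 1, 10, 20):
    print(epoch, round(step_decay_lr(0.5, 0.99, 1, epoch), 4))
```

With these values the rate shrinks by 1% per epoch, so after 20 epochs it has only fallen to roughly 0.41; a smaller gamma or a larger step_size changes the decay profile considerably.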

Training Configurations

class ablator.config.proto.TrainConfig(*args, **kwargs)

Bases: ConfigBase

Training configuration that defines the training setting, e.g., batch size, number of epochs, the optimizer to use, etc. This configuration is required when creating the run configuration (RunConfig or ParallelConfig), which sets up the running environment of the experiment.

Examples

The following example shows all the steps towards configuring an experiment:

Define the model config. Here we use the default one with no custom hyperparameters (so we are not running an ablation study on the model architecture):

>>> my_model_config = ModelConfig()

Define the optimizer and scheduler configs. The training config requires an optimizer config and optionally accepts a scheduler config:

>>> my_optimizer_config = OptimizerConfig("sgd", {"lr": 0.5, "weight_decay": 0.5})
>>> my_scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})

Define training config:

>>> my_train_config = TrainConfig(
...     dataset="[Your Dataset]",
...     batch_size=32,
...     epochs=10,
...     optimizer_config=my_optimizer_config,
...     scheduler_config=my_scheduler_config,
...     rand_weights_init=True,
... )

We now define the run config for prototype training, which is the last configuration step. Refer to Configurations for single model experiments and Configurations for parallel models experiments for more details on running configs.

>>> run_config = RunConfig(
...     train_config=my_train_config,
...     model_config=my_model_config,
...     metrics_n_batches=800,
...     experiment_dir="/tmp/experiments",
...     device="cpu",
...     amp=False,
...     random_seed=42,
... )

Attributes:

dataset: str: Dataset name. May be used in custom dataset loader functions.
batch_size: int: Batch size.
epochs: int: Number of epochs to train.
optimizer_config: OptimizerConfig: Optimizer configuration (see OptimizerConfig for details).
scheduler_config: Optional[SchedulerConfig]: Scheduler configuration (see SchedulerConfig for details).
rand_weights_init: bool = True: Whether to initialize model weights randomly.
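
Together, batch_size and epochs fix how many optimizer updates a run performs, which is useful to know when choosing scheduler arguments. The sketch below works this out under two stated assumptions: one update per batch, and a final partial batch that is kept rather than dropped. total_updates is a hypothetical helper, and the dataset size of 60,000 (the Fashion-MNIST training split) is only an illustrative input.

```python
import math


def total_updates(num_samples: int, batch_size: int, epochs: int) -> int:
    """Optimizer updates over a run, assuming one update per batch and
    that a final partial batch is kept rather than dropped."""
    if num_samples < 0 or batch_size < 1 or epochs < 0:
        raise ValueError("expected num_samples >= 0, batch_size >= 1, epochs >= 0")
    batches_per_epoch = math.ceil(num_samples / batch_size)
    return batches_per_epoch * epochs


# batch_size and epochs match the training config example above.
steps = total_updates(num_samples=60_000, batch_size=32, epochs=10)
print(steps)
```

If the data loader drops the last incomplete batch, replace the ceiling with integer floor division.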