Configurations for Training Essentials

Training a model requires several essential components: the model itself, the optimizer, the scheduler, and the training settings (batch size, number of epochs, which optimizer to use, and so on). Ablator provides a configuration class for each of these components.

Main Model Configuration

class ablator.config.proto.ModelConfig(*args: Any, debug: bool = False, **kwargs: Any)

Bases: ConfigBase

Base class for model configurations. Subclass it to define your model's hyperparameters, then pass the config object to the model's constructor, which reads the config attributes to build the model.

Examples

Define a custom model configuration class for your model:

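>>> # Assumes both names are exported from the top-level ablator package.
>>> from ablator import ModelConfig, configclass
>>> @configclass
... class CustomModelConfig(ModelConfig):
...     input_size: int
...     hidden_size: int
...     num_classes: int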

Define your model class, pass the configuration to the constructor, and build the model:

>>> from torch import nn
>>> class FashionMNISTModel(nn.Module):
...     def __init__(self, config: CustomModelConfig):
...         super().__init__()
...         # Layer sizes come from the model config attributes.
...         self.fc1 = nn.Linear(config.input_size, config.hidden_size)
...         self.relu1 = nn.ReLU()
...         self.fc3 = nn.Linear(config.hidden_size, config.num_classes)
...     def forward(self, x):
...         # Expects x of shape (batch, input_size).
...         x = self.relu1(self.fc1(x))
...         return self.fc3(x)

RunConfig later requires a model config object, so create one now. Remember to pass a value for each hyperparameter, since they are defined as Stateful:

>>> model_config = CustomModelConfig(input_size=512, hidden_size=100, num_classes=10)
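To make the role of these hyperparameters concrete, the following pure-Python sketch counts the trainable parameters of the two linear layers (fc1 and fc3) for the values above. The helper linear_param_count is hypothetical and not part of Ablator; it only assumes that each fully connected layer holds a weight matrix plus a bias vector.

```python
def linear_param_count(in_features: int, out_features: int) -> int:
    """Parameters of a fully connected layer: weight matrix plus bias vector."""
    return in_features * out_features + out_features


# Values mirror CustomModelConfig(input_size=512, hidden_size=100, num_classes=10).
input_size, hidden_size, num_classes = 512, 100, 10

fc1_params = linear_param_count(input_size, hidden_size)   # first linear layer
fc3_params = linear_param_count(hidden_size, num_classes)  # output linear layer
total_params = fc1_params + fc3_params

print(f"fc1: {fc1_params}, fc3: {fc3_params}, total: {total_params}")
```

Changing any of the three config values changes the model size directly, which is why they are natural candidates for an ablation study on the architecture.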

Optimizer Configurations

class ablator.modules.optimizer.OptimizerConfig(name: str, arguments: dict[str, Any])

Bases: ConfigBase

Configuration for an optimizer, including optimizer name and arguments (these arguments are specific to a certain type of optimizer like SGD, Adam, AdamW). This optimizer config will be provided to TrainConfig as part of the training setting of the experiment.

Parameters:

name: str: Name of the optimizer; one of ['adamw', 'adam', 'sgd'].
arguments: dict[str, ty.Any]: Arguments for the optimizer, specific to the optimizer type. A common argument is the learning rate, e.g. {'lr': 0.5}. If name is "adamw", you can also add eps, e.g. {'lr': 0.5, 'eps': 0.001}. Refer to the Configuration Basics tutorial for more details on each optimizer’s arguments.

Examples

The following example shows how to create an optimizer config for SGD optimizer and use it in TrainConfig to define the training setting of the experiment.

>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5})
>>> train_config = TrainConfig(
...     dataset="[Dataset Name]",
...     batch_size=32,
...     epochs=20,
...     optimizer_config=optim_config,
...     scheduler_config=None
... )
>>> # ... create the run config (proto/parallel), model wrapper, trainer and launch the experiment

Note

Sometimes you may want to run ablation studies on different optimizers to learn how they affect model performance. OptimizerConfig configures only a single optimizer per experiment, but you can experiment with different optimizers by creating a custom config class and adding an extra method called make_optimizer. See the tutorial on Search space for different types of optimizers and scheduler for more details.

Attributes:

name: str: Name of the optimizer.
arguments: OptimizerArgs: Arguments for the optimizer, specific to a certain type of optimizer.
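The name-plus-arguments pattern can be illustrated with a small standalone sketch. Everything here is hypothetical and is not Ablator's implementation: OptimizerSpec is a plain dataclass stand-in, the check for 'lr' is an assumption of this sketch, and sgd_step is a hand-written SGD update (with L2 weight decay added to the gradient) that shows how the arguments dictionary might be consumed.

```python
from dataclasses import dataclass, field
from typing import Any

# Optimizer names listed in the parameter description above.
SUPPORTED_OPTIMIZERS = ("adamw", "adam", "sgd")


@dataclass
class OptimizerSpec:
    """Hypothetical stand-in for a name + arguments optimizer config."""

    name: str
    arguments: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.name not in SUPPORTED_OPTIMIZERS:
            raise ValueError(
                f"Unknown optimizer {self.name!r}; expected one of {SUPPORTED_OPTIMIZERS}"
            )
        # Assumption of this sketch: a learning rate is always required.
        if "lr" not in self.arguments:
            raise ValueError("arguments must include a learning rate under 'lr'")


def sgd_step(params, grads, lr, weight_decay=0.0):
    """One plain SGD update: p <- p - lr * (g + weight_decay * p)."""
    return [p - lr * (g + weight_decay * p) for p, g in zip(params, grads)]


spec = OptimizerSpec("sgd", {"lr": 0.5})
# The arguments dict is unpacked into the update rule as keyword arguments.
new_params = sgd_step([1.0, -2.0], [0.5, 0.5], **spec.arguments)
print(new_params)
```

Keeping the optimizer-specific values in a dictionary lets one config shape describe several optimizers, at the cost of validating the keys separately for each optimizer type.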

Scheduler Configurations

class ablator.modules.scheduler.SchedulerConfig(name: str, arguments: dict[str, Any])

Bases: ConfigBase

A class that defines a configuration for a learning rate scheduler. This scheduler config will be provided to TrainConfig (optional) as part of the training setting of the experiment.

Parameters:

name: str: The name of the scheduler; one of ['None', 'step', 'cycle', 'plateau'].
arguments: dict[str, ty.Any]: The arguments for the scheduler, specific to the scheduler type. Refer to the Configuration Basics scheduler tutorial for more details on each scheduler’s arguments.

Examples

The following example shows how to create a scheduler config and use it in TrainConfig to define the training setting of the experiment. Here, the arguments property of scheduler_config is initialized as a StepLRConfig with step_size=1 and gamma=0.99.

>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5})
>>> scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
>>> train_config = TrainConfig(
...     dataset="[Dataset Name]",
...     batch_size=32,
...     epochs=20,
...     optimizer_config=optim_config,
...     scheduler_config=scheduler_config
... )
>>> # ... create the run config (proto/parallel), model wrapper, trainer and launch the experiment
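To see what step_size=1 and gamma=0.99 mean in practice, the sketch below evaluates the step-decay rule in closed form, lr = base_lr * gamma ** (epoch // step_size), which is the rule used by PyTorch's StepLR. It assumes the scheduler is stepped once per epoch; whether Ablator steps per epoch or per iteration depends on the scheduler's settings. The function step_lr is a hypothetical helper, not an Ablator API.

```python
def step_lr(base_lr: float, step_size: int, gamma: float, epoch: int) -> float:
    """Learning rate in effect at `epoch` under step decay."""
    return base_lr * gamma ** (epoch // step_size)


# Values mirror the optimizer and scheduler configs in the example above.
base_lr, step_size, gamma, epochs = 0.5, 1, 0.99, 20

schedule = [step_lr(base_lr, step_size, gamma, e) for e in range(epochs)]

for epoch in (0, 1, epochs - 1):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.4f}")
```

With step_size=1 the learning rate shrinks by 1% every epoch; a larger step_size would hold it constant for that many epochs between decays.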

Note

A common use case is to run ablation studies on different schedulers to learn how they affect model performance. SchedulerConfig configures only a single scheduler per experiment, but you can experiment with different schedulers by creating a custom config class and adding an extra method called make_scheduler. See the tutorial on Search space for different types of optimizers and scheduler for more details.

Attributes:

name: str: The name of the scheduler.
arguments: SchedulerArgs: The arguments needed to initialize the scheduler.

Training Configurations

class ablator.config.proto.TrainConfig(*args: Any, debug: bool = False, **kwargs: Any)

Bases: ConfigBase

Training configuration that defines the training setting, e.g., batch size, number of epochs, the optimizer to use, etc. This configuration is required when creating the run configurations (RunConfig and ParallelConfig, which set up the running environment of the experiment).

Examples

The following example shows all the steps towards configuring an experiment:

Define model config: for simplicity, we use the default one with no custom hyperparameters (so we’re not running an ablation study on the model architecture):

>>> my_model_config = ModelConfig()

Define optimizer and scheduler config, as training config requires an optimizer config, and optionally a scheduler config:

>>> my_optimizer_config = OptimizerConfig("sgd", {"lr": 0.5, "weight_decay": 0.5})
>>> my_scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})

Define training config:

>>> my_train_config = TrainConfig(
...     dataset="[Your Dataset]",
...     batch_size=32,
...     epochs=10,
...     optimizer_config = my_optimizer_config,
...     scheduler_config = my_scheduler_config
... )

We now define the run config for prototype training, which is the last configuration step. Refer to Configurations for single model experiments and Configurations for parallel models experiments for more details on run configs.

>>> run_config = RunConfig(
...     train_config=my_train_config,
...     model_config=my_model_config,
...     metrics_n_batches = 800,
...     experiment_dir = "/tmp/experiments",
...     device="cpu",
...     amp=False,
...     random_seed = 42
... )

Attributes:

dataset: str: Dataset name. May be used in custom dataset loader functions.
batch_size: int: Batch size.
epochs: int: Number of epochs to train.
optimizer_config: OptimizerConfig: Optimizer configuration.
scheduler_config: Optional[SchedulerConfig]: Scheduler configuration.
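As a rough illustration of how batch_size and epochs translate into work, the sketch below counts optimizer updates for a full run. It assumes one update per batch (no gradient accumulation) and a dataset of 60,000 samples, which is an assumed size chosen for illustration. The function total_training_steps and its drop_last flag are hypothetical and not part of Ablator.

```python
import math


def total_training_steps(num_samples: int, batch_size: int, epochs: int,
                         drop_last: bool = False) -> int:
    """Number of optimizer updates over a full training run."""
    if drop_last:
        # Discard the final partial batch of each epoch.
        steps_per_epoch = num_samples // batch_size
    else:
        # Keep the final partial batch.
        steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs


# batch_size and epochs mirror my_train_config; 60_000 is an assumed dataset size.
steps = total_training_steps(num_samples=60_000, batch_size=32, epochs=10)
print(f"total optimizer steps: {steps}")
```

This kind of estimate is useful when choosing scheduler arguments, since a scheduler stepped per iteration sees far more steps than one stepped per epoch.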