ablator.modules package#

Subpackages#

Submodules#

ablator.modules.optimizer module#

class ablator.modules.optimizer.AdamConfig(*args: Any, debug: bool = False, **kwargs: Any)[source]#

Bases: OptimizerArgs

Configuration for an Adam optimizer. This class provides an init_optimizer() method that initializes and returns an Adam optimizer.

Attributes:

betas (Tuple[float, float]): Coefficients for computing running averages of gradient and its square, by default (0.9, 0.999).
weight_decay (float): Weight decay rate, by default 0.0.

betas: Tuple[float, float] = (0.9, 0.999)#

config_class#: alias of AdamConfig

init_optimizer(model: Module) → Adam[source]#

Creates and returns an Adam optimizer that optimizes the model’s parameters. These parameters are first processed by get_optim_parameters and then used to initialize the optimizer.

Parameters:

model (nn.Module): The model whose parameters the optimizer will optimize.

Returns:

Adam: An instance of the Adam optimizer.

Examples

>>> config = AdamConfig(lr=0.1, weight_decay=0.5, betas=(0.6,0.9))
>>> config.init_optimizer(MyModel())
Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.6, 0.9)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: False
    lr: 0.1
    maximize: False
    weight_decay: 0.5
Parameter Group 1
    amsgrad: False
    betas: (0.6, 0.9)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: False
    lr: 0.1
    maximize: False
    weight_decay: 0.0
)

weight_decay: float = 0.0#

class ablator.modules.optimizer.AdamWConfig(*args: Any, debug: bool = False, **kwargs: Any)[source]#

Bases: OptimizerArgs

Configuration for an AdamW optimizer. This class provides an init_optimizer() method that initializes and returns an AdamW optimizer.

Examples

>>> config = AdamWConfig(lr=0.1, weight_decay=0.5, betas=(0.9,0.99))

Attributes:

betas (Tuple[float, float]): Coefficients for computing running averages of gradient and its square, by default (0.9, 0.999).
eps (float): Term added to the denominator to improve numerical stability, by default 1e-8.
weight_decay (float): Weight decay rate, by default 0.01.

betas: Tuple[float, float] = (0.9, 0.999)#

config_class#: alias of AdamWConfig

eps: float = 1e-08#

init_optimizer(model: Module) → AdamW[source]#

Creates and returns an AdamW optimizer that optimizes the model’s parameters. These parameters are first processed by get_optim_parameters and then used to initialize the optimizer.

Parameters:

model (nn.Module): The model whose parameters the optimizer will optimize.

Returns:

AdamW: An instance of the AdamW optimizer.

Examples

>>> config = AdamWConfig(lr=0.1, weight_decay=0.5, betas=(0.9,0.99), eps=0.001)
>>> config.init_optimizer(MyModel())
AdamW (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.99)
    capturable: False
    eps: 0.001
    foreach: None
    lr: 0.1
    maximize: False
    weight_decay: 0.5
Parameter Group 1
    amsgrad: False
    betas: (0.9, 0.99)
    capturable: False
    eps: 0.001
    foreach: None
    lr: 0.1
    maximize: False
    weight_decay: 0.0
)

weight_decay: float = 0.01#

class ablator.modules.optimizer.OptimizerArgs(*args: Any, debug: bool = False, **kwargs: Any)[source]#

Bases: ConfigBase

A base class for optimizer arguments. It defines the learning rate lr.

Attributes:

lr (float): Learning rate of the optimizer.

config_class#: alias of OptimizerArgs

abstract init_optimizer(model: Module)[source]#: Abstract method to be implemented by derived classes, which initializes the optimizer.

lr: float#

class ablator.modules.optimizer.OptimizerConfig(name: str, arguments: dict[str, Any])[source]#

Bases: ConfigBase

Configuration for an optimizer, including optimizer name and arguments (these arguments are specific to a certain type of optimizer like SGD, Adam, AdamW). This optimizer config will be provided to TrainConfig as part of the training setting of the experiment.

Parameters:

name (str): Name of the optimizer; one of ['adamw', 'adam', 'sgd'].
arguments (dict[str, ty.Any]): Arguments for the optimizer, specific to a certain type of optimizer. A common argument is the learning rate, e.g. {'lr': 0.5}. If name is "adamw", eps can also be added to arguments, e.g. {'lr': 0.5, 'eps': 0.001}. Refer to the Configuration Basics scheduler tutorial for more details on each optimizer’s arguments.
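
One way to picture how a name and a plain dictionary could resolve to a typed arguments object is a small registry lookup. The sketch below is a simplified illustration in plain Python, not ablator's actual implementation: SGDArgs, AdamWArgs, ARGS_BY_NAME and build_args are hypothetical names, and only the default values are taken from the attributes documented on this page.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple, Type


@dataclass
class SGDArgs:
    """Hypothetical stand-in for SGD arguments (defaults as documented)."""
    lr: float
    weight_decay: float = 0.0
    momentum: float = 0.0


@dataclass
class AdamWArgs:
    """Hypothetical stand-in for AdamW arguments (defaults as documented)."""
    lr: float
    betas: Tuple[float, float] = (0.9, 0.999)
    eps: float = 1e-8
    weight_decay: float = 0.01


# Hypothetical registry: maps an optimizer name to its arguments class.
ARGS_BY_NAME: Dict[str, Type[Any]] = {"sgd": SGDArgs, "adamw": AdamWArgs}


def build_args(name: str, arguments: Dict[str, Any]) -> Any:
    """Look up the arguments class by name and fill it from the dictionary."""
    try:
        args_cls = ARGS_BY_NAME[name]
    except KeyError:
        raise ValueError(f"Unknown optimizer name: {name!r}") from None
    return args_cls(**arguments)


sgd_args = build_args("sgd", {"lr": 0.5})
adamw_args = build_args("adamw", {"lr": 0.5, "eps": 0.001})
```

Values that are not supplied in the dictionary keep the defaults of the selected arguments class, which is why {'lr': 0.5} alone is enough in this sketch.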

Examples

The following example shows how to create an optimizer config for SGD optimizer and use it in TrainConfig to define the training setting of the experiment.

>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5})
>>> train_config = TrainConfig(
...     dataset="[Dataset Name]",
...     batch_size=32,
...     epochs=20,
...     optimizer_config=optim_config,
...     scheduler_config=None
... )
>>> # ... create the run config (proto/parallel), model wrapper, trainer and launch the experiment

Note

Sometimes we want to run ablation studies on different optimizers to learn about their effects on model performance. OptimizerConfig, however, configures only a single optimizer for the experiment. To run experiments on different optimizers, create a custom config class and add an extra method called make_optimizer. See the tutorial on Search space for different types of optimizers and scheduler for more details.

Attributes:

name (str): Name of the optimizer.
arguments (OptimizerArgs): Arguments for the optimizer, specific to a certain type of optimizer.

arguments: OptimizerArgs#

config_class#: alias of OptimizerConfig

make_optimizer(model: Module) → Optimizer[source]#

Creates and returns an optimizer for the given model.

Parameters:

model (nn.Module): The model to optimize.

Returns:

Optimizer: The created optimizer.

Examples

>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5, "weight_decay": 0.5})
>>> optim_config.make_optimizer(my_module)
SGD (
Parameter Group 0
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.5
    maximize: False
    momentum: 0.0
    nesterov: False
    weight_decay: 0.5
Parameter Group 1
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.5
    maximize: False
    momentum: 0.0
    nesterov: False
    weight_decay: 0.0
)

name: str#

class ablator.modules.optimizer.SGDConfig(*args: Any, debug: bool = False, **kwargs: Any)[source]#

Bases: OptimizerArgs

Configuration for an SGD optimizer. This class provides an init_optimizer() method that initializes and returns an SGD optimizer.

Examples

>>> config = SGDConfig(lr=0.1, momentum=0.9)

Attributes:

weight_decay (float): Weight decay rate, by default 0.0.
momentum (float): Momentum factor, by default 0.0.

config_class#: alias of SGDConfig

init_optimizer(model: Module) → SGD[source]#

Creates and returns an SGD optimizer that optimizes the model’s parameters. These parameters are first processed by get_optim_parameters and then used to initialize the optimizer.

Parameters:

model (nn.Module): The model whose parameters the optimizer will optimize.

Returns:

SGD: The created SGD optimizer.

Examples

>>> config = SGDConfig(lr=0.1, weight_decay=0.5, momentum=0.9)
>>> config.init_optimizer(MyModel())
SGD (
Parameter Group 0
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.1
    maximize: False
    momentum: 0.9
    nesterov: False
    weight_decay: 0.5
Parameter Group 1
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.1
    maximize: False
    momentum: 0.9
    nesterov: False
    weight_decay: 0.0
)

momentum: float = 0.0#

weight_decay: float = 0.0#

ablator.modules.optimizer.get_optim_parameters(model: Module) → Iterator[Parameter][source]#

Gets the model parameters to be optimized. It first looks for a user-defined get_optim_param method on the model; if the model does not define one, it falls back to the default parameters returned by torch's nn.Module.parameters().

Parameters:

model (torch.nn.Module): The model for which to get parameters that will be optimized.

Returns:

abc.Iterator[nn.Parameter]: The parameters to be optimized. It can be a list, tensor, or dictionary; see the PyTorch Optimizer documentation for the specific format.

Notes

A reasonable default is provided. To use something else, you can pass a tuple in the Trainer’s init through optimizers, or override this method in a subclass.

Examples

>>> import torch
>>> from torch import nn
>>> from ablator.modules.optimizer import get_optim_parameters
>>> class MyModel(nn.Module):
...     def __init__(self, embedding_dim=10, vocab_size=10, *args, **kwargs) -> None:
...         super().__init__(*args, **kwargs)
...         self.param = nn.Parameter(torch.ones(100))
...         self.embedding = nn.Embedding(num_embeddings=vocab_size,
...                                       embedding_dim=embedding_dim)
...         self.norm_layer = nn.LayerNorm(embedding_dim)
...     def forward(self):
...         x = self.param + torch.rand_like(self.param) * 0.01
...         return x.sum().abs()
...     def get_optim_param(self):
...         # User-defined hook: return parameter groups with custom options.
...         return [{"params": [self.param], 'weight_decay':0.2}]
>>> mM = MyModel()
>>> get_optim_parameters(mM)
[{'params': ['param'], 'weight_decay': 0.2}]
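
The lookup order described for get_optim_parameters (user-defined hook first, default parameters otherwise) can be sketched without torch. This is a minimal illustration under stated assumptions, not the library's code: resolve_optim_parameters, WithHook and WithoutHook are hypothetical names, and strings stand in for real nn.Parameter objects.

```python
from typing import Any, Iterable


def resolve_optim_parameters(model: Any) -> Iterable[Any]:
    """Hypothetical stand-in for the lookup order described above."""
    # Prefer the user-defined hook when the model provides one.
    hook = getattr(model, "get_optim_param", None)
    if callable(hook):
        return hook()
    # Otherwise fall back to the model's own parameters() iterator.
    return model.parameters()


class WithHook:
    """Toy model that defines the hook; strings stand in for parameters."""

    def parameters(self):
        return iter(["w", "b"])

    def get_optim_param(self):
        return [{"params": ["w"], "weight_decay": 0.2}]


class WithoutHook:
    """Toy model without the hook, so the default parameters are used."""

    def parameters(self):
        return iter(["w", "b"])


custom = resolve_optim_parameters(WithHook())
default = list(resolve_optim_parameters(WithoutHook()))
```

In this sketch the hook's return value is passed through untouched, so the model decides which parameters are optimized and with which per-group options.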

ablator.modules.scheduler module#

class ablator.modules.scheduler.OneCycleConfig(*args: Any, debug: bool = False, **kwargs: Any)[source]#

Bases: SchedulerArgs

Configuration class for the OneCycleLR scheduler.

Attributes:

max_lr (float): Upper learning rate boundaries in the cycle.
total_steps (Derived[int]): The total number of steps to run the scheduler in a cycle.
step_when (StepType): The step type at which the scheduler.step() should be invoked: 'train', 'val', or 'epoch'.
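
To show how max_lr and total_steps shape a one-cycle schedule, the following plain-Python sketch computes the learning rate per step with cosine annealing. It is a simplified illustration, not the scheduler's implementation: one_cycle_lr is a hypothetical helper, and pct_start, div_factor and final_div_factor are assumed illustrative values that this page does not document.

```python
import math
from typing import List


def one_cycle_lr(
    step: int,
    total_steps: int,
    max_lr: float,
    pct_start: float = 0.3,          # assumed fraction of steps spent warming up
    div_factor: float = 25.0,        # assumed: initial_lr = max_lr / div_factor
    final_div_factor: float = 1e4,   # assumed: final_lr = initial_lr / final_div_factor
) -> float:
    """Learning rate at `step` for a two-phase cosine one-cycle schedule."""
    initial_lr = max_lr / div_factor
    final_lr = initial_lr / final_div_factor
    warmup_end = pct_start * total_steps - 1
    last_step = total_steps - 1

    def cos_anneal(start: float, end: float, pct: float) -> float:
        # Moves smoothly from `start` (pct=0) to `end` (pct=1).
        return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

    if step <= warmup_end:
        # Phase 1: rise from initial_lr to max_lr.
        return cos_anneal(initial_lr, max_lr, step / warmup_end)
    # Phase 2: decay from max_lr to final_lr.
    pct = (step - warmup_end) / (last_step - warmup_end)
    return cos_anneal(max_lr, final_lr, pct)


lrs: List[float] = [one_cycle_lr(s, total_steps=100, max_lr=0.5) for s in range(100)]
```

Under these assumptions the rate peaks at max_lr at the end of the warm-up phase and finishes far below its starting value. The schedule is defined only for total_steps steps, which is why the step count has to be known up front.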

config_class#: alias of OneCycleConfig

init_scheduler(model: Module, optimizer: Optimizer) → OneCycleLR[source]#

Creates and returns a OneCycleLR scheduler that adjusts the optimizer’s learning rate.

Parameters:

model (nn.Module): The model.
optimizer (Optimizer): The optimizer that updates the model parameters and whose learning rate the scheduler adjusts.

Returns:

OneCycleLR: The OneCycleLR scheduler, initialized with arguments defined as attributes of this class.

Examples

>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler = OneCycleConfig(max_lr=0.5, total_steps=100)
>>> scheduler.init_scheduler(model, optimizer)

max_lr: float#

step_when: Literal['train', 'val', 'epoch'] = 'train'#

total_steps: Derived[int]#

class ablator.modules.scheduler.PlateuaConfig(*args: Any, debug: bool = False, **kwargs: Any)[source]#

Bases: SchedulerArgs

Configuration class for ReduceLROnPlateau scheduler.

Attributes:

patience (int): Number of epochs with no improvement after which learning rate will be reduced.
min_lr (float): A lower bound on the learning rate.
mode (str): One of 'min', 'max', or 'auto'. Defines the direction of optimization, so that the learning rate is adjusted when the monitored metric stops improving.
factor (float): Factor by which the learning rate will be reduced. new_lr = lr * factor.
threshold (float): Threshold for measuring the new optimum, to only focus on significant changes.
verbose (bool): If True, prints a message to stdout for each update.
step_when (StepType): The step type at which the scheduler should be invoked: 'train', 'val', or 'epoch'.
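
How patience, factor, min_lr and threshold interact can be sketched in plain Python. This is a simplified 'min'-mode illustration, not the scheduler's implementation: plateau_schedule is a hypothetical helper, the threshold is treated as a relative tolerance (an assumption), the metric is assumed positive, and factor=0.5 is an illustrative value rather than the class default.

```python
from typing import List, Sequence


def plateau_schedule(
    metrics: Sequence[float],
    lr: float,
    factor: float,
    patience: int,
    min_lr: float,
    threshold: float = 1e-4,
) -> List[float]:
    """Return the learning rate in effect after each reported metric."""
    best = float("inf")
    bad_epochs = 0
    history: List[float] = []
    for value in metrics:
        if value < best * (1.0 - threshold):
            # Significant improvement: remember it and reset the counter.
            best = value
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            # new_lr = lr * factor, bounded below by min_lr.
            lr = max(lr * factor, min_lr)
            bad_epochs = 0
        history.append(lr)
    return history


losses = [1.0, 0.9, 0.9, 0.9, 0.9, 0.9]
lr_history = plateau_schedule(losses, lr=0.1, factor=0.5, patience=2, min_lr=1e-5)
```

With patience=2, the first two epochs without improvement are tolerated and the rate is halved on the third. Repeated reductions can never push the rate below min_lr.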

config_class#: alias of PlateuaConfig

factor: float = 0.0#

init_scheduler(model: Module, optimizer: Optimizer) → ReduceLROnPlateau[source]#

Initialize the ReduceLROnPlateau scheduler.

Parameters:

model (nn.Module): The model being optimized.
optimizer (Optimizer): The optimizer that updates the model parameters and whose learning rate the scheduler adjusts.

Returns:

ReduceLROnPlateau: The ReduceLROnPlateau scheduler, initialized with arguments defined as attributes of this class.

Examples

>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler = PlateuaConfig(min_lr=1e-7, mode='min')
>>> scheduler.init_scheduler(model, optimizer)

min_lr: float = 1e-05#

mode: str = 'min'#

patience: int = 10#

step_when: Literal['train', 'val', 'epoch'] = 'val'#

threshold: float = 0.0001#

verbose: bool = False#

class ablator.modules.scheduler.SchedulerArgs(*args: Any, debug: bool = False, **kwargs: Any)[source]#

Bases: ConfigBase

Abstract base class for defining arguments to initialize a learning rate scheduler.

Attributes:

step_when (StepType): The step type at which the scheduler.step() should be invoked: 'train', 'val', or 'epoch'.
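
The effect of step_when can be pictured as the point in a training loop where scheduler.step() is called. The sketch below is an assumption about the intended meaning of the three values, not ablator's training loop: it takes 'train' to mean after every training batch, 'val' after each validation pass, and 'epoch' at the end of each epoch, and run_epoch is a hypothetical helper that only records the order of events.

```python
from typing import List


def run_epoch(step_when: str, train_batches: int = 2) -> List[str]:
    """Return the order of events in one epoch; "step" marks scheduler.step()."""
    events: List[str] = []
    for _ in range(train_batches):
        events.append("train_batch")
        if step_when == "train":
            # Assumed meaning: step after every training batch.
            events.append("step")
    events.append("validate")
    if step_when == "val":
        # Assumed meaning: step once after the validation pass.
        events.append("step")
    events.append("epoch_end")
    if step_when == "epoch":
        # Assumed meaning: step once when the epoch finishes.
        events.append("step")
    return events


per_batch = run_epoch("train")
per_validation = run_epoch("val")
per_epoch = run_epoch("epoch")
```

Under this reading, the choice matters for how step-based settings are counted: a scheduler stepped per batch advances many times per epoch, while one stepped per validation or per epoch advances once.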

config_class#: alias of SchedulerArgs

abstract init_scheduler(model: Module, optimizer: Optimizer)[source]#: Abstract method to be implemented by derived classes, which creates and returns a scheduler object.

step_when: Literal['train', 'val', 'epoch']#

class ablator.modules.scheduler.SchedulerConfig(name: str, arguments: dict[str, Any])[source]#

Bases: ConfigBase

A class that defines a configuration for a learning rate scheduler. This scheduler config will be provided to TrainConfig (optional) as part of the training setting of the experiment.

Parameters:

name (str): The name of the scheduler; one of ['None', 'step', 'cycle', 'plateau'].
arguments (dict[str, ty.Any]): The arguments for the scheduler, specific to a certain type of scheduler. Refer to the Configuration Basics scheduler tutorial for more details on each scheduler’s arguments.

Examples

The following example shows how to create a scheduler config and use it in TrainConfig to define the training setting of the experiment. Here, scheduler_config initializes its arguments property as a StepLRConfig with step_size=1 and gamma=0.99.

>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5})
>>> scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
>>> train_config = TrainConfig(
...     dataset="[Dataset Name]",
...     batch_size=32,
...     epochs=20,
...     optimizer_config=optim_config,
...     scheduler_config=scheduler_config
... )
>>> # ... create the run config (proto/parallel), model wrapper, trainer and launch the experiment

Note

A common use case is to run ablation studies on different schedulers to learn about their effects on model performance. SchedulerConfig, however, configures only a single scheduler for the experiment. To run experiments on different schedulers, create a custom config class and add an extra method called make_scheduler. See the tutorial on Search space for different types of optimizers and scheduler for more details.

Attributes:

name (str): The name of the scheduler.
arguments (SchedulerArgs): The arguments needed to initialize the scheduler.

arguments: SchedulerArgs#

config_class#: alias of SchedulerConfig

make_scheduler(model: Module, optimizer: Optimizer) → _LRScheduler | ReduceLROnPlateau | Any[source]#

Creates a new scheduler for an optimizer, based on the configuration.

Parameters:

model (nn.Module): The model, passed because some schedulers require information from it.
optimizer (Optimizer): The optimizer that updates the model parameters and whose learning rate the scheduler adjusts.

Returns:

Scheduler: The scheduler.

Examples

>>> scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler_config.make_scheduler(model, optimizer)

name: str#

class ablator.modules.scheduler.StepLRConfig(*args: Any, debug: bool = False, **kwargs: Any)[source]#

Bases: SchedulerArgs

Configuration class for StepLR scheduler.

Parameters:

step_size (int): Period of learning rate decay, by default 1.
gamma (float): Multiplicative factor of learning rate decay, by default 0.99.
step_when (StepType): The step type at which the scheduler should be invoked: 'train', 'val', or 'epoch'.
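
The interaction of step_size and gamma can be written in closed form: after a given number of scheduler steps, the base learning rate has been multiplied by gamma once per completed period of step_size steps. The sketch below is a plain-Python illustration of that arithmetic; step_lr is a hypothetical helper, and the values match the init_scheduler example for this class (lr=0.7, step_size=20, gamma=0.9).

```python
from typing import List


def step_lr(base_lr: float, step_size: int, gamma: float, step: int) -> float:
    """Learning rate after `step` scheduler steps under step decay."""
    # One multiplication by gamma for every completed period of step_size steps.
    return base_lr * gamma ** (step // step_size)


checkpoints = (0, 19, 20, 40)
schedule: List[float] = [
    step_lr(0.7, step_size=20, gamma=0.9, step=s) for s in checkpoints
]
```

The rate stays at 0.7 for the first 20 steps, drops to 0.7 * 0.9 at step 20, and to 0.7 * 0.9 ** 2 at step 40. With the defaults (step_size=1, gamma=0.99) the rate shrinks by 1% on every scheduler step.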

config_class#: alias of StepLRConfig

gamma: float = 0.99#

init_scheduler(model: Module, optimizer: Optimizer) → StepLR[source]#

Initialize the StepLR scheduler for a given model and optimizer.

Parameters:

model (nn.Module): The model to which the scheduler applies.
optimizer (Optimizer): The optimizer that updates the model parameters and whose learning rate the scheduler adjusts.

Returns:

StepLR: The StepLR scheduler, initialized with arguments defined as attributes of this class.

Examples

>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler = StepLRConfig(step_size=20, gamma=0.9)
>>> scheduler.init_scheduler(model, optimizer)

step_size: int = 1#

step_when: Literal['train', 'val', 'epoch'] = 'epoch'#
