Configurations for parallel models experiments

One of Ablator's main features is the ability to train and optimize multiple models in parallel, for ablation studies or for hyperparameter optimization. The main components of this feature are SearchSpace and ParallelConfig.

class ablator.config.hpo.SearchSpace(*args: Any, **kwargs: Any)

Bases: ConfigBase

Search space configuration that defines the search space for a single hyperparameter; it is required by ParallelConfig. Its constructor takes keyword arguments that correspond to the entries listed under Parameters.

Parameters:
value_range: Optional[Tuple[str, str]]

Value range of the parameter.

categorical_values: Optional[List[str]]

Categorical values for the parameter.

subspaces: Optional[List[Self]]

A list of search spaces.

sub_configuration: Optional[SubConfiguration]

Subconfiguration for a SearchSpace.

value_type: FieldType

Value type of the parameter’s values (continuous or discrete), by default FieldType.continuous.

n_bins: Optional[int]

Total bins for grid sampling, optional.

log: bool

To log, by default False.
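
To make the role of n_bins concrete, the following is a minimal sketch of how a value range could be discretized into a fixed number of grid points. It assumes evenly spaced points that include both endpoints; grid_points is a hypothetical helper written for this illustration and is not part of Ablator, whose grid sampler may bin values differently.

```python
from typing import List, Union

Number = Union[int, float]


def grid_points(low: float, high: float, n_bins: int, as_int: bool = False) -> List[Number]:
    """Return n_bins evenly spaced points covering [low, high].

    Hypothetical helper for illustration only; not Ablator's implementation.
    """
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2 to span a range")
    step = (high - low) / (n_bins - 1)
    points = [low + i * step for i in range(n_bins)]
    if as_int:
        # Discrete parameters: round each point, then drop duplicates in order.
        return list(dict.fromkeys(round(p) for p in points))
    return points


# A discrete range such as hidden_size in [32, 64], split into 5 grid points.
hidden_sizes = grid_points(32, 64, n_bins=5, as_int=True)
print(hidden_sizes)
```

With few bins the grid is coarse but cheap to exhaust; more bins give a finer grid at the cost of more trials.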

Examples

In Ablator, a search space is defined for parallel ablation studies. For example, suppose we want to run an ablation study on the model’s hidden size and activation function:

  • Given the following model configuration:

>>> from ablator import ModelConfig, configclass
>>> @configclass
... class CustomModelConfig(ModelConfig):
...     hidden_size: int
...     activation: str
>>> my_model_config = CustomModelConfig(hidden_size=100, activation="relu")
  • The search space, which will be passed to ParallelConfig as a dictionary (notice how the key is expressed as model_config.<model-hyperparameter>), should look like this:

>>> search_space = {
...     "model_config.hidden_size": SearchSpace(value_range = [32, 64], value_type = 'int'),
...     "model_config.activation": SearchSpace(categorical_values = ["relu", "elu", "leakyRelu"])
... }
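
The dotted keys above name a path through nested configuration objects. As a rough sketch of that idea, the snippet below resolves such a key against plain dataclasses and assigns a sampled value. ModelCfg, RunCfg and set_by_path are stand-ins invented for this illustration; they are not Ablator classes or functions, and Ablator's own resolution logic may differ.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ModelCfg:  # stand-in for CustomModelConfig (illustrative only)
    hidden_size: int
    activation: str


@dataclass
class RunCfg:  # stand-in for the top-level run configuration
    model_config: ModelCfg


def set_by_path(config: Any, dotted_key: str, value: Any) -> None:
    """Follow every attribute in the key except the last, then assign the last."""
    *parents, leaf = dotted_key.split(".")
    target = config
    for name in parents:
        target = getattr(target, name)
    if not hasattr(target, leaf):
        raise AttributeError(f"{dotted_key!r} does not name an existing field")
    setattr(target, leaf, value)


run_cfg = RunCfg(model_config=ModelCfg(hidden_size=100, activation="relu"))
set_by_path(run_cfg, "model_config.hidden_size", 48)
set_by_path(run_cfg, "model_config.activation", "elu")
print(run_cfg)
```

A misspelled key fails loudly here instead of silently creating a new attribute, which is the behavior one would usually want when keys are written by hand.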
Attributes:
value_range: Optional[Tuple[str, str]]

Value range of the parameter.

categorical_values: Optional[List[str]]

Categorical values for the parameter.

subspaces: Optional[List[Self]]

A list of search spaces.

sub_configuration: Optional[SubConfiguration]

Subconfiguration for a SearchSpace.

value_type: FieldType = FieldType.continuous

Value type of the parameter’s values (continuous or discrete).

n_bins: Optional[int]

Total bins for grid sampling.

log: bool

To log, by default False.

class ablator.config.mp.ParallelConfig(*args: Any, debug: bool = False, **kwargs: Any)

Bases: RunConfig

Parallel training configuration. It extends RunConfig and defines the settings of a parallel experiment (number of trials to run, number of concurrent trials, search space for hyperparameter search, etc.).

ParallelConfig encapsulates every configuration (model config, optimizer-scheduler config, train config, and the search space) needed to run a parallel experiment. This complete configuration is then passed to ParallelTrainer, which launches the experiment.

Examples

Several components must be defined before the parallel run config itself. Let’s go through them one by one:

  • Define training config:

>>> my_optimizer_config = OptimizerConfig("sgd", {"lr": 0.5, "weight_decay": 0.5})
>>> my_scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
>>> train_config = TrainConfig(
...     dataset="[Dataset Name]",
...     batch_size=32,
...     epochs=10,
...     optimizer_config=my_optimizer_config,
...     scheduler_config=my_scheduler_config
... )
  • Define model config. Here we want to run HPO on the activation function and the model’s hidden size:

>>> @configclass
... class CustomModelConfig(ModelConfig):
...     hidden_size: int
...     activation: str
>>> model_config = CustomModelConfig(hidden_size=100, activation="relu")
  • Define search space:

>>> search_space = {
...     "train_config.optimizer_config.arguments.lr": SearchSpace(
...         value_range=[0.001, 0.01], value_type="float"
...     ),
...     "model_config.hidden_size": SearchSpace(value_range=[32, 64], value_type="int"),
...     "model_config.activation": SearchSpace(
...         categorical_values=["relu", "elu", "leakyRelu"]
...     ),
... }
  • Lastly, define the run config from the previous config components (remember to subclass ParallelConfig so that the type of model_config is CustomModelConfig):

>>> @configclass
... class CustomParallelConfig(ParallelConfig):
...     model_config: CustomModelConfig
>>> parallel_config = CustomParallelConfig(
...     train_config=train_config,
...     model_config=model_config,
...     metrics_n_batches = 800,
...     experiment_dir = "/tmp/experiments/",
...     device="cuda",
...     amp=True,
...     random_seed = 42,
...     total_trials = 20,
...     concurrent_trials = 20,
...     search_space = search_space,
...     optim_metrics = {"val_loss": "min"},
...     optim_metric_name = "val_loss",
...     gpu_mb_per_experiment = 1024
... )
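
To illustrate what a search space like the one above means for a single trial, here is a minimal sketch that draws one value per key. It uses plain dictionaries in place of SearchSpace objects and uniform random sampling, which is a deliberate simplification: the default search_algo is TPE, and Ablator's samplers are not reproduced here. sample_trial is a hypothetical helper.

```python
import random
from typing import Any, Dict

# Plain dictionaries standing in for SearchSpace objects (illustrative only).
search_space: Dict[str, Dict[str, Any]] = {
    "train_config.optimizer_config.arguments.lr": {
        "value_range": (0.001, 0.01),
        "value_type": "float",
    },
    "model_config.hidden_size": {"value_range": (32, 64), "value_type": "int"},
    "model_config.activation": {"categorical_values": ["relu", "elu", "leakyRelu"]},
}


def sample_trial(space: Dict[str, Dict[str, Any]], rng: random.Random) -> Dict[str, Any]:
    """Draw one value per hyperparameter using uniform random sampling."""
    trial: Dict[str, Any] = {}
    for key, spec in space.items():
        if "categorical_values" in spec:
            trial[key] = rng.choice(spec["categorical_values"])
            continue
        low, high = spec["value_range"]
        if spec["value_type"] == "int":
            trial[key] = rng.randint(int(low), int(high))  # both ends inclusive
        else:
            trial[key] = rng.uniform(low, high)
    return trial


rng = random.Random(42)  # fixed seed so repeated runs draw the same trial
trial = sample_trial(search_space, rng)
for key, value in trial.items():
    print(f"{key} = {value}")
```

Each trial is then a flat mapping from dotted key to value, which can be applied on top of the base configuration before training starts.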
Attributes:
total_trials: Optional[int]

total number of trials.

concurrent_trials: int

number of trials to run concurrently.

search_space: Dict[SearchSpace]

search space for hyperparameter search, e.g. {"train_config.optimizer_config.arguments.lr": SearchSpace(value_range=[0, 10], value_type="int")}

gpu_mb_per_experiment: int

CUDA memory requirement per experimental trial in MB, e.g. a value of 100 is equivalent to 100 MB.

search_algo: SearchAlgo = SearchAlgo.tpe

type of search algorithm.

ignore_invalid_params: bool = False

whether to ignore invalid parameters when sampling or raise an error.

remote_config: Optional[RemoteConfig] = None

remote storage configuration.
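
As a back-of-the-envelope aid for choosing total_trials, concurrent_trials and gpu_mb_per_experiment together, the sketch below estimates how many trials could run side by side on one GPU and how many rounds that implies. It is plain arithmetic under simplifying assumptions (a single GPU, a known free_gpu_mb value, trials of equal length run in rounds); plan_trials and free_gpu_mb are hypothetical names, and Ablator's actual scheduler may place trials differently.

```python
import math
from typing import Dict


def plan_trials(
    total_trials: int,
    concurrent_trials: int,
    gpu_mb_per_experiment: int,
    free_gpu_mb: int,
) -> Dict[str, int]:
    """Estimate effective concurrency and the number of rounds on one GPU."""
    fit_by_memory = free_gpu_mb // gpu_mb_per_experiment
    effective = min(concurrent_trials, fit_by_memory)
    if effective < 1:
        raise ValueError("not enough free GPU memory for even one trial")
    return {
        "effective_concurrency": effective,
        "rounds": math.ceil(total_trials / effective),
    }


# Values from the example above, with an assumed 16 GB of free GPU memory.
plan = plan_trials(
    total_trials=20,
    concurrent_trials=20,
    gpu_mb_per_experiment=1024,
    free_gpu_mb=16 * 1024,
)
print(plan)
```

Under these assumptions, memory rather than concurrent_trials is the limiting factor, so lowering gpu_mb_per_experiment (if the model allows it) would matter more than raising concurrent_trials.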