Configurations for parallel model experiments
One of the main features of Ablator is the ability to train and optimize
multiple models in parallel for ablation studies or hyperparameter optimization.
The main components of this feature are SearchSpace and ParallelConfig.
- class ablator.config.hpo.SearchSpace(*args: Any, **kwargs: Any)
Bases: ConfigBase
Search space configuration, required in ParallelConfig, is used to define the search space for a hyperparameter. Its constructor takes keyword arguments that correspond to the parameters listed in the Parameters section.
- Parameters:
- value_range: Optional[Tuple[str, str]]
Value range of the parameter.
- categorical_values: Optional[List[str]]
Categorical values for the parameter.
- subspaces: Optional[List[Self]]
A list of search spaces.
- sub_configuration: Optional[SubConfiguration]
Subconfiguration for a SearchSpace.
- value_type: FieldType
Value type of the parameter’s values (continuous or discrete), by default FieldType.continuous.
- n_bins: Optional[int]
Total bins for grid sampling, optional.
- log: bool
Whether to sample the value range on a logarithmic scale, by default False.
Examples
In Ablator, a search space is defined for parallel ablation studies. For example, suppose we want to run an ablation study on the model’s hidden size and activation function.
Given the following model configuration:
>>> @configclass
... class CustomModelConfig(ModelConfig):
...     hidden_size: int
...     activation: str
>>> my_model_config = CustomModelConfig(hidden_size=100, activation="relu")
The search space, which will be passed to ParallelConfig as a dictionary (notice how the key is expressed as model_config.<model-hyperparameter>), should look like this:
>>> search_space = {
...     "model_config.hidden_size": SearchSpace(value_range=[32, 64], value_type="int"),
...     "model_config.activation": SearchSpace(categorical_values=["relu", "elu", "leakyRelu"]),
... }
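The dotted keys in the search space address nested fields of the run configuration. As a rough illustration only, the sketch below shows how such a key could be resolved by walking attributes one segment at a time. It is not Ablator's implementation: `ToyModelConfig`, `ToyRunConfig`, and `set_by_path` are hypothetical stand-ins built on plain dataclasses.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModelConfig:  # hypothetical stand-in for CustomModelConfig
    hidden_size: int = 100
    activation: str = "relu"


@dataclass
class ToyRunConfig:  # hypothetical stand-in for a run config
    model_config: ToyModelConfig = field(default_factory=ToyModelConfig)


def set_by_path(config, dotted_key, value):
    """Set the nested attribute named by a key such as 'model_config.hidden_size'."""
    *parents, leaf = dotted_key.split(".")
    target = config
    for name in parents:
        # Descend one level per path segment.
        target = getattr(target, name)
    if not hasattr(target, leaf):
        raise AttributeError(f"{dotted_key!r} does not name an existing field")
    setattr(target, leaf, value)


run_config = ToyRunConfig()
set_by_path(run_config, "model_config.hidden_size", 64)
set_by_path(run_config, "model_config.activation", "elu")
```

A key that does not match an existing field raises an error in this sketch, which is one reason to double-check the spelling of each path segment in the search space.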
- Attributes:
- value_range: Optional[Tuple[str, str]]
Value range of the parameter.
- categorical_values: Optional[List[str]]
Categorical values for the parameter.
- subspaces: Optional[List[Self]]
A list of search spaces.
- sub_configuration: Optional[SubConfiguration]
Subconfiguration for a SearchSpace.
- value_type: FieldType = FieldType.continuous
Value type of the parameter’s values (continuous or discrete).
- n_bins: Optional[int]
Total bins for grid sampling.
- log: bool
Whether to sample the value range on a logarithmic scale, by default False.
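To make n_bins and log more concrete, the sketch below shows one plausible way a numeric value range could be discretized into a grid. It assumes that log selects logarithmic spacing and that n_bins counts the grid points; Ablator's actual sampler may behave differently, and `grid_points` is a hypothetical helper, not part of the library.

```python
import math


def grid_points(low, high, n_bins, log=False):
    """Return n_bins points spanning [low, high], evenly spaced in log10 if log=True."""
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    if log:
        if low <= 0:
            raise ValueError("log spacing needs a positive lower bound")
        lo, hi = math.log10(low), math.log10(high)
        step = (hi - lo) / (n_bins - 1)
        return [10 ** (lo + i * step) for i in range(n_bins)]
    step = (high - low) / (n_bins - 1)
    return [low + i * step for i in range(n_bins)]


# Linear grid over a learning-rate range.
linear = grid_points(0.001, 0.01, n_bins=4)
# Log grid: each point is a constant factor larger than the previous one.
log_spaced = grid_points(0.001, 0.1, n_bins=3, log=True)
```

Logarithmic spacing is usually the better fit for parameters such as learning rates, where the interesting values span several orders of magnitude.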
- class ablator.config.mp.ParallelConfig(*args: Any, debug: bool = False, **kwargs: Any)
Bases: RunConfig
Parallel training configuration, extending from RunConfig, defines the settings of a parallel experiment (number of trials to run, number of concurrent trials, search space for hyperparameter search, etc.). ParallelConfig encapsulates every configuration (model config, optimizer-scheduler config, train config, and the search space) needed to run a parallel experiment. The entire umbrella of configuration is then passed to ParallelTrainer, which launches the experiment.
Examples
Defining a parallel run config takes several steps. Let’s go through them one by one:
Define training config:
>>> my_optimizer_config = OptimizerConfig("sgd", {"lr": 0.5, "weight_decay": 0.5})
>>> my_scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
>>> train_config = TrainConfig(
...     dataset="[Dataset Name]",
...     batch_size=32,
...     epochs=10,
...     optimizer_config=my_optimizer_config,
...     scheduler_config=my_scheduler_config,
... )
Define the model config. We want to run HPO on the activation function and the model’s hidden size:
>>> @configclass
... class CustomModelConfig(ModelConfig):
...     hidden_size: int
...     activation: str
>>> model_config = CustomModelConfig(hidden_size=100, activation="relu")
Define search space:
>>> search_space = {
...     "train_config.optimizer_config.arguments.lr": SearchSpace(
...         value_range=[0.001, 0.01], value_type="float"
...     ),
...     "model_config.hidden_size": SearchSpace(value_range=[32, 64], value_type="int"),
...     "model_config.activation": SearchSpace(
...         categorical_values=["relu", "elu", "leakyRelu"]
...     ),
... }
Lastly, we will define the run config from the previous config components (remember to redefine the parallel config to update the model config type to be CustomModelConfig):
>>> @configclass
... class CustomParallelConfig(ParallelConfig):
...     model_config: CustomModelConfig
>>> parallel_config = CustomParallelConfig(
...     train_config=train_config,
...     model_config=model_config,
...     metrics_n_batches=800,
...     experiment_dir="/tmp/experiments/",
...     device="cuda",
...     amp=True,
...     random_seed=42,
...     total_trials=20,
...     concurrent_trials=20,
...     search_space=search_space,
...     optim_metrics={"val_loss": "min"},
...     optim_metric_name="val_loss",
...     gpu_mb_per_experiment=1024,
... )
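The relationship between total_trials and concurrent_trials can be pictured with a small, library-independent sketch: a fixed number of trials is executed, with at most a given number in flight at any time, and the best one is picked by the optimization metric. This is only an analogy using Python's standard `concurrent.futures`; it does not reflect how ParallelTrainer schedules work, and `run_trial` with its toy loss is a hypothetical stand-in for real training.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_trial(trial_id):
    """Hypothetical stand-in for training one sampled configuration."""
    rng = random.Random(trial_id)  # seeded per trial so results are reproducible
    hidden_size = rng.choice([32, 48, 64])
    val_loss = 1.0 / hidden_size + rng.random() * 0.01  # toy metric, not a real loss
    return {"trial": trial_id, "hidden_size": hidden_size, "val_loss": val_loss}


total_trials = 20      # how many configurations are evaluated overall
concurrent_trials = 4  # how many run at the same time

# The pool size caps the number of trials in flight.
with ThreadPoolExecutor(max_workers=concurrent_trials) as pool:
    results = list(pool.map(run_trial, range(total_trials)))

# "min" direction, mirroring optim_metrics = {"val_loss": "min"}.
best = min(results, key=lambda r: r["val_loss"])
```

Setting concurrent_trials equal to total_trials, as in the example config, asks for every trial to run at once, which only makes sense when the hardware can hold them all.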
- Attributes:
- total_trials: Optional[int]
total number of trials.
- concurrent_trials: int
number of trials to run concurrently.
- search_space: Dict[SearchSpace]
search space for hyperparameter search, e.g. {"train_config.optimizer_config.arguments.lr": SearchSpace(value_range=[0, 10], value_type="int"),}
- gpu_mb_per_experiment: int
CUDA memory requirement per experimental trial, in MB; e.g. a value of 100 is equivalent to 100 MB.
- search_algo: SearchAlgo = SearchAlgo.tpe
type of search algorithm.
- ignore_invalid_params: bool = False
whether to ignore invalid parameters when sampling or raise an error.
- remote_config: Optional[RemoteConfig] = None
remote storage configuration.
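As a back-of-the-envelope aid for choosing gpu_mb_per_experiment and concurrent_trials together, the sketch below estimates how many trials could share one GPU. It assumes trials are limited only by the declared memory budget, which is a simplification; `trials_that_fit` is a hypothetical helper and says nothing about how Ablator actually allocates GPUs.

```python
def trials_that_fit(free_gpu_mb, gpu_mb_per_experiment, concurrent_trials):
    """Estimate how many trials can run at once on one GPU under a memory budget."""
    if gpu_mb_per_experiment <= 0:
        raise ValueError("gpu_mb_per_experiment must be positive")
    by_memory = free_gpu_mb // gpu_mb_per_experiment
    # Never exceed the requested concurrency, even if memory would allow more.
    return min(by_memory, concurrent_trials)


# An 8 GiB GPU with 1024 MB per trial and 20 requested concurrent trials.
fits = trials_that_fit(free_gpu_mb=8192, gpu_mb_per_experiment=1024, concurrent_trials=20)
```

Under these assumptions only 8 of the 20 requested trials would fit at once, so either the per-trial budget or the concurrency would need adjusting for a single device.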