Search Space Basics#
In this tutorial, we walk through how to define a search space and how to use it in Ablator to run ablation studies over hyperparameters.
Create a search space with ablator.config.hpo.SearchSpace#
In Ablator, `ablator.config.hpo.SearchSpace <../config.train.parallel.experiment.rst#configurations-for-parallel-models-experiments>`__ defines the search space for a single hyperparameter. Ablator uses it to create many trials, each with a different value for that hyperparameter. A SearchSpace specifies which values the hyperparameter may take (a range or a set) and, for ranges, the data type of those values.
Import SearchSpace:
from ablator.config.hpo import SearchSpace
The SearchSpace class (one object created for one hyperparameter) takes the following arguments:
``value_range``: defines a continuous range for a continuous hyperparameter. It is specified in the format [<lower_bound>, <upper_bound>]. In each trial, the hyperparameter is sampled from this range.
``categorical_values``: defines a discrete set for a discrete hyperparameter. In each trial, the hyperparameter is sampled from this set.
``value_type``: specifies the hyperparameter’s data type. Ablator supports "int" for integer values and "float" for decimal or floating-point values. This argument is required for hyperparameters that take values from a value_range.
Note that categorical values do not require value_type.
In the example below, we create a search space with a continuous float range [0.05, 0.1] and a search space with a discrete set [32, 64, 128]:
from ablator.config.hpo import SearchSpace
SearchSpace(value_range=[0.05, 0.1], value_type="float")
SearchSpace(categorical_values=[32, 64, 128])
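To make "sampled from this range" and "sampled from this set" concrete, here is a minimal sketch of how one trial value could be drawn from each kind of search space. It uses plain dictionaries and Python's random module rather than Ablator itself. The helper sample_value is hypothetical, and both the uniform draw and the inclusive integer bounds are assumptions for illustration, not a description of Ablator's actual sampler.

```python
import random


def sample_value(space: dict, rng: random.Random):
    """Draw one trial value from a search-space description (illustrative only)."""
    if space.get("categorical_values") is not None:
        # Discrete set: pick one of the listed options.
        return rng.choice(space["categorical_values"])
    low, high = space["value_range"]
    if space["value_type"] == "int":
        # Integer range: both bounds are treated as inclusive (an assumption).
        return rng.randint(int(low), int(high))
    # Float range: uniform draw between the two bounds (an assumption).
    return rng.uniform(float(low), float(high))


rng = random.Random(0)
lr = sample_value({"value_range": [0.05, 0.1], "value_type": "float"}, rng)
batch_size = sample_value({"categorical_values": [32, 64, 128]}, rng)
print(lr, batch_size)
```

Note that the categorical branch never looks at value_type, which mirrors the point above that categorical values do not require it.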
Creating search space for hyperparameters#
Recall from the Configuration Basics tutorial that ParallelConfig has a search_space argument. This argument is a dictionary that maps each hyperparameter to a SearchSpace object, and it holds the search spaces for all hyperparameters included in the ablation study.
A SearchSpace can be created for hyperparameters that are predefined Ablator configuration attributes or custom configuration attributes:
Predefined Configurations: Ablator offers predefined configurations for optimizers, schedulers, batch size, epochs, and more. These configurations are readily available for users to use in their experiments.
Custom Configurations (added by users): Users can define custom configurations for hyperparameters specific to their models, for example the number of hidden layers in a neural network, the activation function, and other relevant hyperparameters.
Using SearchSpace for predefined configurations#
For optimizers#
Ablator supports three predefined optimizers: SGD, Adam, and AdamW. You can create a search space for any parameter of the optimizer chosen for training. For example, to create search spaces for the AdamW optimizer's parameters (learning rate, epsilon, weight decay, etc.), you can do the following:
my_search_space = {
    "train_config.optimizer_config.arguments.lr": SearchSpace(
        value_range = [0.01, 0.05],
        value_type = "float"
    ),
    "train_config.optimizer_config.arguments.eps": SearchSpace(
        value_range = [1e-9, 1e-7],
        value_type = "float"
    ),
    "train_config.optimizer_config.arguments.weight_decay": SearchSpace(
        value_range = [1e-4, 1e-3],
        value_type = "float"
    ),
}
The syntax for creating search space for optimizers in ablator is:
search_space = {
    "train_config.optimizer_config.arguments.<parameter 1>": search_space_1,
    "train_config.optimizer_config.arguments.<parameter 2>": search_space_2,
    ...
}
where <parameter 1> and <parameter 2> are the parameters for the corresponding optimizer. You can find parameters for all optimizers in the Configuration Basics tutorial.
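Each key in the dictionary is a dotted path that names one attribute inside the nested configuration. The sketch below illustrates that idea on plain nested dictionaries. The helper set_by_path and the example config layout are hypothetical and only meant to show how such a path addresses a single value; Ablator's own configuration objects and resolution logic may differ.

```python
def set_by_path(config: dict, dotted_key: str, value) -> None:
    """Assign `value` to the nested entry addressed by a dotted key."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node[name]  # a KeyError here means the path does not exist
    if leaf not in node:
        raise KeyError(dotted_key)
    node[leaf] = value


# Hypothetical stand-in for a running configuration.
config = {
    "train_config": {
        "optimizer_config": {
            "name": "adamw",
            "arguments": {"lr": 0.01, "eps": 1e-8, "weight_decay": 1e-4},
        }
    }
}

set_by_path(config, "train_config.optimizer_config.arguments.lr", 0.03)
new_lr = config["train_config"]["optimizer_config"]["arguments"]["lr"]
print(new_lr)
```

Raising on an unknown leaf is a deliberate choice in this sketch: a misspelled parameter name fails loudly instead of silently adding a new entry.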
For schedulers#
Ablator supports three predefined schedulers: StepLR, OneCycleLR, and ReduceLROnPlateau. You can create a search space for any parameter of the scheduler chosen for training. For example, to create search spaces for the ReduceLROnPlateau scheduler's parameters (minimum learning rate, patience, factor, threshold, etc.), you can do the following:
my_search_space = {
    "train_config.scheduler_config.arguments.min_lr": SearchSpace(
        value_range = [1e-6, 1e-4],
        value_type = "float"
    ),
    "train_config.scheduler_config.arguments.threshold": SearchSpace(
        value_range = [1e-5, 1e-3],
        value_type = "float"
    ),
}
The syntax for creating search space for schedulers in ablator is:
search_space = {
    "train_config.scheduler_config.arguments.<parameter 1>": search_space_1,
    "train_config.scheduler_config.arguments.<parameter 2>": search_space_2,
    ...
}
where <parameter 1> and <parameter 2> are the parameters for the corresponding scheduler. You can find parameters for all schedulers in the Configuration Basics tutorial.
For other parameters#
You can also define a SearchSpace for other TrainConfig attributes, such as epochs and batch_size.
The syntax is:
search_space = {
    "train_config.<parameter 1>": search_space_1,
    "train_config.<parameter 2>": search_space_2,
    ...
}
where <parameter 1> and <parameter 2> are the attributes of TrainConfig.
For example, trying different batch_size or epochs can be easily done with the following snippet:
my_search_space = {
    "train_config.batch_size": SearchSpace(
        categorical_values = [32, 64, 128]
    ),
    "train_config.epochs": SearchSpace(
        value_range = [10, 20],
        value_type = "int"
    ),
}
Using SearchSpace for custom configurations#
In the previous tutorials, we showed that you can run ablation experiments to study different components of a model. Suppose we want to study the impact of the hyperparameters hidden_size and activation on a model's performance. We first create a custom model configuration with these hyperparameters and use it to build the model:
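# Import paths are assumed here; adjust them to your Ablator version.
import torch.nn as nn
from ablator import ModelConfig

class CustomModelConfig(ModelConfig):  # hyperparameters to be studied
    hidden_size: int
    activation: str

class MyModel(nn.Module):
    def __init__(self, config: CustomModelConfig) -> None:
        super().__init__()  # must run before submodules are assigned
        activation_list = {"relu": nn.ReLU(), "elu": nn.ELU()}
        input_size = 100
        self.fc1 = nn.Linear(input_size, config.hidden_size)
        self.act1 = activation_list[config.activation]

# Initial values; trials later override them from the search space.
model_config = CustomModelConfig(
    hidden_size = 256,
    activation = "relu"
)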
Note that we still need to create a model config object with initial values for the hyperparameters (and pass that to the running configuration), even though later we will create multiple trials with different values for them, taken from the search space.
The syntax for defining search spaces for model hyperparameters is:
search_space = {
    "model_config.<parameter 1>": search_space_1,
    "model_config.<parameter 2>": search_space_2,
    ...
}
Let’s now create a SearchSpace for hidden_size (an integer range) and activation (a discrete set):
my_search_space = {
    "model_config.hidden_size": SearchSpace(
        value_range=[250, 500], value_type="int"
    ),
    "model_config.activation": SearchSpace(
        categorical_values = ["relu","elu"]
    ),
}
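To picture how the initial model configuration relates to the trials, the sketch below starts from one base configuration and produces several trial configurations by overriding only the searched hyperparameters. It uses plain dictionaries and the random module. The helper make_trial and the dictionary layout are hypothetical, and the inclusive integer bounds are an assumption, so treat it as an illustration of the idea rather than Ablator's mechanism.

```python
import copy
import random

# Base configuration with the initial values from the example above.
base = {"model_config": {"hidden_size": 256, "activation": "relu"}}


def make_trial(base_config: dict, overrides: dict) -> dict:
    """Return a copy of the base config with the searched values replaced."""
    trial = copy.deepcopy(base_config)  # never mutate the base config
    for dotted_key, value in overrides.items():
        section, name = dotted_key.split(".")
        trial[section][name] = value
    return trial


rng = random.Random(1)
trials = [
    make_trial(
        base,
        {
            # Integer range [250, 500], bounds assumed inclusive.
            "model_config.hidden_size": rng.randint(250, 500),
            # Discrete set of activation names.
            "model_config.activation": rng.choice(["relu", "elu"]),
        },
    )
    for _ in range(3)
]

for trial in trials:
    print(trial["model_config"])
```

The base configuration stays unchanged, so every trial starts from the same initial values and differs only in the hyperparameters covered by the search space.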
Putting everything together:
my_search_space = {
    "train_config.optimizer_config.arguments.lr": SearchSpace(
        value_range = [0.01, 0.05],
        value_type = "float"
    ),
    "train_config.optimizer_config.arguments.eps": SearchSpace(
        value_range = [1e-9, 1e-7],
        value_type = "float"
    ),
    "train_config.optimizer_config.arguments.weight_decay": SearchSpace(
        value_range = [1e-4, 1e-3],
        value_type = "float"
    ),
    "train_config.scheduler_config.arguments.min_lr": SearchSpace(
        value_range = [1e-6, 1e-4],
        value_type = "float"
    ),
    "train_config.scheduler_config.arguments.threshold": SearchSpace(
        value_range = [1e-5, 1e-3],
        value_type = "float"
    ),
    "train_config.batch_size": SearchSpace(
        categorical_values = [32, 64, 128]
    ),
    "train_config.epochs": SearchSpace(
        value_range = [10, 20],
        value_type = "int"
    ),
    "model_config.hidden_size": SearchSpace(
        value_range=[250, 500], value_type="int"
    ),
    "model_config.activation": SearchSpace(
        categorical_values = ["relu","elu"]
    )
}
Finally, the my_search_space dictionary is passed to ParallelConfig. This is explored in more detail in the Hyperparameter Optimization tutorial.
SearchSpace in YAML files#
If you are using a YAML file to define configurations, you can specify a search space as follows:
# other configurations ...
search_space:
  model_config.hidden_size:
    value_range:
    - '250'
    - '500'
    categorical_values: null
    value_type: int
  train_config.optimizer_config.arguments.lr:
    value_range:
    - '0.001'
    - '0.01'
    categorical_values: null
    value_type: float
  model_config.activation:
    value_range: null
    categorical_values:
    - relu
    - leakyRelu
    - elu
    value_type: float
# other configurations ...
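In the YAML above, the range bounds are quoted strings, and value_type indicates how they should be read. The sketch below shows one way such entries could be turned into typed bounds. The dictionary parsed is written by hand to mirror what a YAML parser would return for the file above, and the helper typed_bounds is hypothetical. This is an assumption about how the conversion could work, not a description of Ablator's loader.

```python
# Hand-written mirror of the parsed YAML search_space mapping above.
parsed = {
    "model_config.hidden_size": {
        "value_range": ["250", "500"],
        "categorical_values": None,
        "value_type": "int",
    },
    "train_config.optimizer_config.arguments.lr": {
        "value_range": ["0.001", "0.01"],
        "categorical_values": None,
        "value_type": "float",
    },
    "model_config.activation": {
        "value_range": None,
        "categorical_values": ["relu", "leakyRelu", "elu"],
        "value_type": "float",
    },
}

CASTS = {"int": int, "float": float}


def typed_bounds(entry: dict):
    """Cast the string bounds of a range entry according to its value_type."""
    if entry["value_range"] is None:
        return None  # categorical entry: there is no range to convert
    cast = CASTS[entry["value_type"]]
    return [cast(bound) for bound in entry["value_range"]]


for key, entry in parsed.items():
    print(key, typed_bounds(entry), entry["categorical_values"])
```

For the categorical entry, value_type plays no role in this sketch, consistent with the earlier note that categorical values do not require it.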
Conclusion#
In this tutorial, we have demonstrated how to create search space objects and how to utilize them to define search space for various hyperparameters. In the subsequent tutorial, we will explain how to use search_space with ParallelConfig to launch a parallel ablation experiment.