{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Search Space Basics" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "* This chapter covers how to define a search space for hyperparameter optimization (HPO)." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The search space is the set of values that each hyperparameter is allowed to take during HPO." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### SearchSpace Class" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The ````SearchSpace```` class in the Ablator library defines the search space for a hyperparameter. It lets you specify a range for continuous hyperparameters, a list of categorical values for discrete hyperparameters, and the data type of the hyperparameter.\n", "\n", "Import ````SearchSpace```` using ````from ablator.main.configs import SearchSpace````.\n", "\n", "The ````SearchSpace```` class takes the following arguments:\n", "\n", "- **value_range**: Defines the range for continuous hyperparameters, in the format [lower_bound, upper_bound]. For example, you may set value_range = [0, 1.0] for a dropout layer.\n", "\n", "- **categorical_values**: Lists the candidate values for discrete hyperparameters. For example, to test the model's performance with different batch sizes, we can use categories like [32, 64, 128].\n", "\n", "- **value_type**: Defines the data type of the hyperparameter. Two data types are supported: \"int\" for integer values and \"float\" for floating-point values. For example, value_type = \"int\" for an integer hyperparameter. 
\n", "\n", "Note that categorical values do not require a value type.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "from ablator.main.configs import SearchSpace\n", "\n", "SearchSpace(value_range=[0.05, 0.1], value_type=\"float\")\n", "SearchSpace(categorical_values=[32, 64, 128])\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Creating a search space for hyperparameters" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The goal of defining a search space in hyperparameter optimization (HPO) is to encapsulate the possible values and ranges of hyperparameters.\n", "\n", "We use a Python dictionary where each key is the dotted path of a hyperparameter in the config object, and each value is a ````SearchSpace```` object." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "search_space_lr = {\n", "    \"train_config.optimizer_config.arguments.lr\": SearchSpace(\n", "        value_range=[0.05, 0.1], value_type=\"float\"\n", "    )\n", "}\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "A ````SearchSpace```` can be defined for:\n", "\n", "- **Predefined Configurations** in Ablator: Ablator offers predefined configurations for optimizers, schedulers, batch size, epochs, and more. These configurations are readily available for users to use in their experiments.\n", "\n", "- **Custom Configurations** added by users: Users can define custom configurations for parameters specific to their experiments, such as parameters of custom models, activation functions, dropout layers, and other relevant hyperparameters. 
" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Using \"SearchSpace\" for predefined configurations" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "* With optimizers\n", "\n", "Ablator provides three predefined optimizers with their arguments: SGD, Adam, and AdamW.\n", "\n", "Note: You can find all the default values for each optimizer in the \"Configuration Basics\" chapter." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "For AdamW, for example, a search space may be:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "search_space = {\n", "    \"train_config.optimizer_config.arguments.lr\": SearchSpace(\n", "        value_range=[0.01, 0.05],\n", "        value_type=\"float\"\n", "    ),\n", "    \"train_config.optimizer_config.arguments.eps\": SearchSpace(\n", "        value_range=[1e-9, 1e-7],\n", "        value_type=\"float\"\n", "    ),\n", "    \"train_config.optimizer_config.arguments.weight_decay\": SearchSpace(\n", "        value_range=[1e-4, 1e-3],\n", "        value_type=\"float\"\n", "    ),\n", "}\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "* With schedulers\n", "\n", "Ablator also provides three schedulers: step, cycle, and plateau. 
You can use ````SearchSpace```` with schedulers for their respective arguments.\n", "\n", "For example, a ````search_space```` for the \"plateau\" scheduler will look like this:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "search_space = {\n", "    \"train_config.scheduler_config.arguments.min_lr\": SearchSpace(\n", "        value_range=[1e-6, 1e-4],\n", "        value_type=\"float\"\n", "    ),\n", "    \"train_config.scheduler_config.arguments.mode\": SearchSpace(\n", "        categorical_values=[\"min\", \"max\", \"auto\"]\n", "    ),\n", "    \"train_config.scheduler_config.arguments.threshold\": SearchSpace(\n", "        value_range=[1e-5, 1e-3],\n", "        value_type=\"float\"\n", "    ),\n", "}\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "* Other parameters\n", "\n", "We can also provide a ````SearchSpace```` for other parameters inside ````train_config````, such as epochs and batch_size." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "For example, trying different values of batch_size or epochs can be done with the following code:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "search_space = {\n", "    \"train_config.batch_size\": SearchSpace(\n", "        categorical_values=[32, 64, 128]\n", "    ),\n", "    \"train_config.epochs\": SearchSpace(\n", "        value_range=[10, 20],\n", "        value_type=\"int\"\n", "    ),\n", "}\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Using SearchSpace for Custom Configs" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "In this example, we will create a search space for the \"hidden_size\" and \"activation\" parameters of a custom model."
] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Suppose a custom model config takes inputs like:\n", "\n", "````\n", "class CustomModelConfig(ModelConfig):\n", "    ...\n", "    input_size: int\n", "    hidden_size: int\n", "    num_classes: int\n", "    activation: str\n", "    ...\n", "````\n", "\n", "````\n", "model_config = CustomModelConfig(\n", "    input_size=28 * 28,\n", "    hidden_size=256,\n", "    num_classes=10,\n", "    activation=\"relu\"\n", ")\n", "````\n", "\n", "Remember that the model config is passed to the constructor of the PyTorch model, which reads its arguments from it.\n", "For example:\n", "\n", "````\n", "class MyModel(nn.Module):\n", "    def __init__(self, config: CustomModelConfig) -> None:\n", "        # nn.Module must be initialized before submodules are assigned\n", "        super().__init__()\n", "        activation_list = {\"relu\": nn.ReLU(), \"elu\": nn.ELU()}\n", "\n", "        self.fc1 = nn.Linear(config.input_size, config.hidden_size)\n", "        self.act1 = activation_list[config.activation]\n", "        ...\n", "````" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "We can provide a ````SearchSpace```` for the model's \"hidden_size\" parameter like this:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "search_space = {\n", "    \"model_config.hidden_size\": SearchSpace(\n", "        value_range=[250, 500], value_type=\"int\"\n", "    ),\n", "}\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, we can also provide a search space for \"activation\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "search_space = {\n", "    \"model_config.activation\": SearchSpace(\n", "        categorical_values=[\"relu\", \"elu\"]\n", "    ),\n", "}\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Ablator will create trials with different \"hidden_size\" or \"activation\" values according to the \"search_space\" provided. 
" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Finally, the ````search_space```` dictionary is passed to the ParallelConfig, which will be explored in detail in the HPO tutorial." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### SearchSpace for YAML files\n", "\n", "If we are using a YAML file to define configurations, we can specify a search space as follows:\n", "\n", "````\n", "...\n", "search_space:\n", " model_config.hidden_size:\n", " value_range:\n", " - '250'\n", " - '500'\n", " categorical_values: null\n", " value_type: int\n", " train_config.optimizer_config.arguments.lr:\n", " value_range:\n", " - '0.001'\n", " - '0.01'\n", " categorical_values: null\n", " value_type: float\n", " model_config.activation:\n", " value_range: null\n", " categorical_values:\n", " - relu\n", " - leakyRelu\n", " - elu\n", " value_type: float\n", "...\n", "````" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Conclusion\n", "\n", "We have successfully explored the ````SearchSpace```` class and various ways to utilize it. In the subsequent chapter, we will learn how to use ````search_space```` with \"ParallelConfig\" for HPO." ] } ], "metadata": { "kernelspec": { "display_name": "env", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }