{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Search space for different types of optimizers and schedulers\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Different optimizers have different update rules and behavior, and they may perform better or worse depending on the specific dataset and model architecture.\n", "Hence, trying out different optimizers and learning rate schedulers is a useful technique for hyperparameter optimization (HPO).\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "- To work with different optimizers effectively in the ablator, it is necessary to create custom `OptimizerConfig` objects that can pass either torch-defined or custom optimizers to the ablator.\n", "\n", "- The same applies to schedulers.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### For different optimizers\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The `make_optimizer` function creates an optimizer object based on inputs from the custom configs.\n", "\n", "This example supports three optimizers: Adam, AdamW, and SGD. Custom-defined optimizers can be passed in the same way.\n", "\n", "The function:\n", "\n", "- Creates a list of model parameters called `parameter_groups`.\n", "- Defines a dictionary of specific parameters for each optimizer.\n", "- Builds the selected optimizer from the parameter groups, the learning rate, and the matching dictionary.\n", "- Returns the optimizer object.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "import torch.optim as optim\n", "from torch import nn\n", "\n", "def make_optimizer(optimizer_name: str, model: nn.Module, lr: float):\n", "\n", "    # Collect all model parameters to hand to the optimizer\n", "    parameter_groups = [v for k, v in model.named_parameters()]\n", "\n", "    # Optimizer-specific keyword arguments\n", "    adamw_parameters = {\n", "        \"betas\": (0.0, 0.1),\n", "        \"eps\": 0.001,\n", "        \"weight_decay\": 0.1\n", "    }\n", "    adam_parameters = {\n", "        \"betas\": (0.0, 
0.1),\n", "        \"weight_decay\": 0.0\n", "    }\n", "    sgd_parameters = {\n", "        \"momentum\": 0.9,\n", "        \"weight_decay\": 0.1\n", "    }\n", "\n", "    # Build the optimizer that matches the requested name\n", "    if optimizer_name == \"adam\":\n", "        optimizer = optim.Adam(parameter_groups, lr=lr, **adam_parameters)\n", "    elif optimizer_name == \"adamw\":\n", "        optimizer = optim.AdamW(parameter_groups, lr=lr, **adamw_parameters)\n", "    elif optimizer_name == \"sgd\":\n", "        optimizer = optim.SGD(parameter_groups, lr=lr, **sgd_parameters)\n", "    else:\n", "        raise ValueError(f\"Unknown optimizer: {optimizer_name}\")\n", "\n", "    return optimizer\n", "```\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can create our own `CustomOptimizerConfig`.\n", "\n", "Ablator requires an optimizer configuration to provide a `make_optimizer` method that takes the model as input. Our config therefore defines this method and returns the torch optimizer built by the previous function.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "@configclass\n", "class CustomOptimizerConfig(ConfigBase):\n", "    name: Literal[\"adam\", \"adamw\", \"sgd\"] = \"adam\"\n", "    lr: float = 0.001\n", "\n", "    def make_optimizer(self, model: torch.nn.Module):\n", "        return make_optimizer(self.name, model, self.lr)\n", "\n", "optimizer_config = CustomOptimizerConfig(name=\"adam\", lr=0.001)\n", "```\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### For different schedulers\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The `make_scheduler` function passes the predefined parameters to the selected learning rate scheduler and returns the torch scheduler object.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "import torch\n", "from torch import nn\n", "from torch.optim.lr_scheduler import OneCycleLR, ReduceLROnPlateau, StepLR\n", "\n", "def make_scheduler(scheduler_name: str, model: nn.Module, optimizer: torch.optim.Optimizer):\n", "\n", "    step_parameters 
= {\n", "        \"step_size\": 1,\n", "        \"gamma\": 0.99\n", "    }\n", "    plateau_parameters = {\n", "        \"patience\": 10,\n", "        \"min_lr\": 1e-5,\n", "        \"mode\": \"min\",\n", "        \"factor\": 0.0,\n", "        \"threshold\": 1e-4\n", "    }\n", "    cycle_parameters = {\n", "        \"max_lr\": 1e-3,\n", "        \"total_steps\": 10\n", "    }\n", "\n", "    # Build the scheduler that matches the requested name\n", "    if scheduler_name == \"step\":\n", "        scheduler = StepLR(optimizer, **step_parameters)\n", "    elif scheduler_name == \"cycle\":\n", "        scheduler = OneCycleLR(optimizer, **cycle_parameters)\n", "    elif scheduler_name == \"plateau\":\n", "        scheduler = ReduceLROnPlateau(optimizer, **plateau_parameters)\n", "    else:\n", "        raise ValueError(f\"Unknown scheduler: {scheduler_name}\")\n", "\n", "    return scheduler\n", "```\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, we create a `CustomSchedulerConfig` and define the required `make_scheduler` method, which takes the model and the optimizer as inputs.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "@configclass\n", "class CustomSchedulerConfig(ConfigBase):\n", "    name: Literal[\"step\", \"cycle\", \"plateau\"] = \"step\"\n", "\n", "    def make_scheduler(self, model: torch.nn.Module, optimizer: torch.optim.Optimizer):\n", "        return make_scheduler(self.name, model, optimizer)\n", "\n", "scheduler_config = CustomSchedulerConfig(name=\"step\")\n", "```\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "`CustomTrainConfig` takes both objects to define the train configuration.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "@configclass\n", "class CustomTrainConfig(TrainConfig):\n", "    optimizer_config: CustomOptimizerConfig\n", "    scheduler_config: CustomSchedulerConfig\n", "\n", "\n", "train_config = CustomTrainConfig(\n", "    dataset=\"[Your Dataset]\",\n", "    batch_size=32,\n", "    epochs=10,\n", "    optimizer_config=optimizer_config,\n", "    scheduler_config=scheduler_config,\n", "    rand_weights_init=True\n", 
")\n", "```\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Now, we can try out different optimizers and schedulers by providing a search space to the ablator.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "search_space_for_optimizers = {\n", " ...\n", " \"train_config.optimizer_config.name\": SearchSpace(categorical_values = [\"adam\", \"sgd\", \"adamw\"]),\n", " ...\n", "}\n", "\n", "search_space_for_schedulers = {\n", " ...\n", " \"train_config.scheduler_config.name\": SearchSpace(categorical_values = [\"step\", \"cycle\", \"plateau\"]),\n", " ...\n", "}\n", "```\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "vscode": { "languageId": "html" } }, "source": [ "
\n", "\n", "Note:\n", "\n", "In the default optimizer config, providing the name of the optimizer in the config will create an object of the associated optimizer class. Simply changing the name in the search space will result in a mismatch in the class type, causing an error. Hence, we have to define custom configs in this way.\n", "\n", "One benefit this method offers is that we can define our custom optimizers or schedulers as a class and pass them to their respective configs for the ablator to manage training.\n", "\n", "
\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Conclusion\n", "\n", "Finally, with this, we can now test different optimizers and schedulers for our model.\n" ] } ], "metadata": { "kernelspec": { "display_name": "env", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }