{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Search space for different types of optimizers and schedulers\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fostiropoulos/ablator/blob/v0.0.1-mp/docs/source/notebooks/Searchspace-for-diff-optimizers.ipynb)\n",
    "\n",
    "Different optimizers have different update rules, and each may perform better or worse depending on the dataset and model architecture. Trying out several optimizers and learning rate schedulers is therefore useful for ablation studies and hyperparameter optimization (HPO).\n",
    "\n",
    "- To search over different optimizers in ablator, we need a custom optimizer config class (used in place of the default `OptimizerConfig`) that can pass either torch-defined or custom optimizers to ablator.\n",
    "\n",
    "- The same applies to schedulers.\n",
    "\n",
    "Let us first import necessary modules:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    import ablator\n",
    "except ImportError:\n",
    "    !pip install ablator\n",
    "    # Restart the runtime after installing so that the new package is picked up.\n",
    "    print(\"Stopping RUNTIME! Please run again\")\n",
    "    import os\n",
    "\n",
    "    os.kill(os.getpid(), 9)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ablator import (ModelConfig, TrainConfig, ParallelConfig, SchedulerConfig, ModelWrapper,\n",
    "                     ParallelTrainer, configclass, ConfigBase, Literal, Optional)\n",
    "from ablator.config.hpo import SearchSpace\n",
    "\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "import torch.optim as optim\n",
    "import torchvision\n",
    "import torchvision.transforms as transforms\n",
    "from torch.optim.lr_scheduler import OneCycleLR, ReduceLROnPlateau, StepLR\n",
    "\n",
    "import os\n",
    "import shutil\n",
    "from sklearn.metrics import accuracy_score"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Search space for optimizers"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We define a function called `create_optimizer` that creates an optimizer object from the given inputs (optimizer name, model to optimize, and learning rate). In this example, we use three optimizers: Adam, AdamW, and SGD. Specifically, the function does the following:\n",
    "\n",
    "- Creates a list of model parameters, `parameter_groups`, from `model.named_parameters()`.\n",
    "\n",
    "- Defines a dictionary of optimizer-specific parameters for each optimizer.\n",
    "\n",
    "- Creates the optimizer from the model parameters, the learning rate, and the matching parameter dictionary.\n",
    "\n",
    "- Returns the optimizer object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_optimizer(optimizer_name: str, model: nn.Module, lr: float):\n",
    "    # Collect all model parameters to hand over to the optimizer.\n",
    "    parameter_groups = [v for k, v in model.named_parameters()]\n",
    "\n",
    "    # Keyword arguments specific to each optimizer.\n",
    "    adamw_parameters = {\n",
    "        \"betas\": (0.0, 0.1),\n",
    "        \"eps\": 0.001,\n",
    "        \"weight_decay\": 0.1\n",
    "    }\n",
    "    adam_parameters = {\n",
    "        \"betas\": (0.0, 0.1),\n",
    "        \"weight_decay\": 0.0\n",
    "    }\n",
    "    sgd_parameters = {\n",
    "        \"momentum\": 0.9,\n",
    "        \"weight_decay\": 0.1\n",
    "    }\n",
    "\n",
    "    if optimizer_name == \"adam\":\n",
    "        optimizer = optim.Adam(parameter_groups, lr=lr, **adam_parameters)\n",
    "    elif optimizer_name == \"adamw\":\n",
    "        optimizer = optim.AdamW(parameter_groups, lr=lr, **adamw_parameters)\n",
    "    elif optimizer_name == \"sgd\":\n",
    "        optimizer = optim.SGD(parameter_groups, lr=lr, **sgd_parameters)\n",
    "    else:\n",
    "        # Fail early instead of silently returning None for an unknown name.\n",
    "        raise ValueError(f\"Unknown optimizer: {optimizer_name}\")\n",
    "\n",
    "    return optimizer"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we create an optimizer configuration, `CustomOptimizerConfig`. Internally, ablator requires the optimizer config to have a `make_optimizer` method that takes the model as input:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CustomOptimizerConfig(name='adam', lr=0.001)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@configclass\n",
    "class CustomOptimizerConfig(ConfigBase):\n",
    "    name: Literal[\"adam\", \"adamw\", \"sgd\"] = \"adam\"\n",
    "    lr: float = 0.001\n",
    "\n",
    "    def make_optimizer(self, model: nn.Module):\n",
    "        return create_optimizer(self.name, model, self.lr)\n",
    "\n",
    "optimizer_config = CustomOptimizerConfig(name = \"adam\", lr = 0.001)\n",
    "optimizer_config"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Here, the configuration attribute `name` is used in the search space, and we allow it to take the values `[\"adam\", \"adamw\", \"sgd\"]`.\n",
    "\n",
    "- Inside `make_optimizer`, we call `create_optimizer` with the `name` and `lr` attributes of the config object and the model, and it returns the corresponding optimizer."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Search space for different schedulers"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We define a function called `create_scheduler` that creates a scheduler object from the given inputs (scheduler name, model to optimize, and optimizer). In this example, we use three schedulers: StepLR, OneCycleLR, and ReduceLROnPlateau. Specifically, the function does the following:\n",
    "\n",
    "- Gets the dictionary of parameters specific to the chosen scheduler.\n",
    "\n",
    "- Creates the scheduler from the optimizer and that parameter dictionary.\n",
    "\n",
    "- Returns the scheduler object.\n",
    "\n",
    "We also define a second function, `scheduler_arguments`, that returns the arguments for a given scheduler."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_scheduler(scheduler_name: str, model: nn.Module, optimizer: torch.optim.Optimizer):\n",
    "    parameters = scheduler_arguments(scheduler_name)\n",
    "    # `step_when` is only read by ablator; torch schedulers do not accept it.\n",
    "    del parameters[\"step_when\"]\n",
    "\n",
    "    if scheduler_name == \"step\":\n",
    "        scheduler = StepLR(optimizer, **parameters)\n",
    "    elif scheduler_name == \"cycle\":\n",
    "        scheduler = OneCycleLR(optimizer, **parameters)\n",
    "    elif scheduler_name == \"plateau\":\n",
    "        scheduler = ReduceLROnPlateau(optimizer, **parameters)\n",
    "    else:\n",
    "        raise ValueError(f\"Unknown scheduler: {scheduler_name}\")\n",
    "\n",
    "    return scheduler\n",
    "\n",
    "def scheduler_arguments(scheduler_name: str):\n",
    "    if scheduler_name == \"step\":\n",
    "        return {\n",
    "            \"step_size\": 1,\n",
    "            \"gamma\": 0.99,\n",
    "            \"step_when\": \"epoch\"\n",
    "        }\n",
    "    elif scheduler_name == \"plateau\":\n",
    "        return {\n",
    "            \"patience\": 10,\n",
    "            \"min_lr\": 1e-5,\n",
    "            \"mode\": \"min\",\n",
    "            \"factor\": 0.0,\n",
    "            \"threshold\": 1e-4,\n",
    "            \"step_when\": \"val\"\n",
    "        }\n",
    "    elif scheduler_name == \"cycle\":\n",
    "        return {\n",
    "            \"max_lr\": 1e-3,\n",
    "            # Number of training epochs * number of steps per epoch, i.e. len(train_loader).\n",
    "            \"total_steps\": 10 * 1875,\n",
    "            \"step_when\": \"train\"\n",
    "        }\n",
    "    # Fail early instead of silently returning None for an unknown name.\n",
    "    raise ValueError(f\"Unknown scheduler: {scheduler_name}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Choose the values of these arguments carefully, so that they satisfy both the requirements of the schedulers and the ablator configuration.\n",
    "\n",
    "One factor is the `step_when` argument. ablator uses `step_when` solely to decide when to call `scheduler.step()` (which is why `create_scheduler` removes it before creating the scheduler, since none of the schedulers accepts it). If it is `\"train\"`, the learning rate is scheduled after every batch in the training loop; if it is `\"val\"`, scheduling takes place in the validation loop; and if it is `\"epoch\"`, scheduling takes place at the end of each epoch.\n",
    "\n",
    "Another factor is the configuration `TrainConfig.eval_epoch`, which decides how often the validation loop runs. If `eval_epoch` is `1`, the validation loop runs after every epoch; if it is `2`, it runs after every two epochs, and so on.\n",
    "\n",
    "For example, `OneCycleLR` requires `total_steps` to be `epochs * steps_per_epoch`, and `steps_per_epoch` depends on the value assigned to `step_when`:\n",
    "\n",
    "- `step_when=\"epoch\"`: the learning rate is scheduled once per epoch, so `total_steps = epochs * 1 = epochs`.\n",
    "\n",
    "- `step_when=\"val\"`: the learning rate is scheduled in the validation loop, so `total_steps = epochs * len(val_loader)` if the validation loop runs after every epoch (`eval_epoch=1`), `total_steps = epochs * len(val_loader) / 2` if it runs after every two epochs (`eval_epoch=2`), and so on.\n",
    "\n",
    "- `step_when=\"train\"`: the learning rate is scheduled in the training loop, so `total_steps = epochs * len(train_loader)` (this is what the example above uses)."
   ]
  },
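  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The rules above can be sketched as a small helper. This is only an illustration: `one_cycle_total_steps` is a hypothetical function (not part of ablator), `val_steps=313` is a made-up validation loader length, and flooring `epochs / eval_epoch` is an assumption about how incomplete validation intervals are counted.\n",
    "\n",
    "```python\n",
    "def one_cycle_total_steps(step_when, epochs, train_steps, val_steps, eval_epoch=1):\n",
    "    '''Expected number of scheduler.step() calls over a run (hypothetical helper).'''\n",
    "    if step_when == 'train':\n",
    "        # One scheduler step per training batch.\n",
    "        return epochs * train_steps\n",
    "    if step_when == 'val':\n",
    "        # Validation runs every `eval_epoch` epochs (assumption: floor division).\n",
    "        return (epochs // eval_epoch) * val_steps\n",
    "    if step_when == 'epoch':\n",
    "        # One scheduler step per epoch.\n",
    "        return epochs\n",
    "    raise ValueError(f'Unknown step_when: {step_when!r}')\n",
    "\n",
    "\n",
    "for mode in ('train', 'val', 'epoch'):\n",
    "    print(mode, one_cycle_total_steps(mode, epochs=10, train_steps=1875, val_steps=313))\n",
    "```\n",
    "\n",
    "With the values used in `scheduler_arguments` (10 epochs and 1875 training steps per epoch), the `'train'` case matches the `10 * 1875` passed as `total_steps`."
   ]
  },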
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly, we create a custom config, `CustomSchedulerConfig`, that defines the required method `make_scheduler`, which takes the model and the optimizer as inputs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CustomSchedulerConfig(name='step', arguments={'step_size': 1, 'gamma': 0.99, 'step_when': 'epoch'})"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@configclass\n",
    "class CustomSchedulerConfig(SchedulerConfig):\n",
    "    def __init__(self, name, arguments=None):\n",
    "        # Any `arguments` passed in are ignored; they are always derived from the scheduler name.\n",
    "        arguments = scheduler_arguments(name)\n",
    "        super(CustomSchedulerConfig, self).__init__(name=name, arguments=arguments)\n",
    "\n",
    "    def make_scheduler(self, model: torch.nn.Module, optimizer: torch.optim.Optimizer):\n",
    "        return create_scheduler(self.name, model, optimizer)\n",
    "\n",
    "scheduler_config = CustomSchedulerConfig(name = \"step\")\n",
    "scheduler_config"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Here, the configuration attribute `name` is used in the search space, and we allow it to take the values `[\"step\", \"cycle\", \"plateau\"]`.\n",
    "\n",
    "- In the constructor, we pass the scheduler name and the arguments dictionary (built by `scheduler_arguments` for that scheduler) to the parent class constructor, which initializes two attributes: `name` and `arguments`.\n",
    "\n",
    "- Inside `make_scheduler`, we call `create_scheduler` with the scheduler name, the model, and the optimizer, and it returns the matching scheduler."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "\n",
    "Note\n",
    "\n",
    "Remember to redefine the `TrainConfig` class, and therefore the `ParallelConfig` class that contains it, so that the training config accepts the custom optimizer and scheduler config objects. For example (here `CustomModelConfig` stands for the model config class of your experiment):\n",
    "\n",
    "```python\n",
    "@configclass\n",
    "class CustomTrainConfig(TrainConfig):\n",
    "  optimizer_config: CustomOptimizerConfig\n",
    "  scheduler_config: CustomSchedulerConfig\n",
    "\n",
    "@configclass\n",
    "class CustomParallelConfig(ParallelConfig):\n",
    "  model_config: CustomModelConfig\n",
    "  train_config: CustomTrainConfig\n",
    "\n",
    "```\n",
    "\n",
    "</div>"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create search space for optimizers and schedulers"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we can try out different optimizers and schedulers by providing a search space to the ablator.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "search_space = {\n",
    "    \"train_config.optimizer_config.lr\": SearchSpace(value_range = [0.001, 0.01], value_type = 'float'),\n",
    "    \"train_config.optimizer_config.name\": SearchSpace(categorical_values = [\"adam\", \"sgd\", \"adamw\"]),\n",
    "    \"train_config.scheduler_config.name\": SearchSpace(categorical_values = [\"step\", \"cycle\", \"plateau\"])\n",
    "}"
   ]
  },
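  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get a feel for the size of the categorical part of this search space, the optimizer and scheduler names can be enumerated with plain Python. This sketch does not use the ablator API and the variable names are only illustrative; `lr` is left out because it is sampled from a continuous range.\n",
    "\n",
    "```python\n",
    "from itertools import product\n",
    "\n",
    "# Categorical values copied from the search space above.\n",
    "optimizers = ['adam', 'sgd', 'adamw']\n",
    "schedulers = ['step', 'cycle', 'plateau']\n",
    "\n",
    "# Every (optimizer, scheduler) pair that a trial could be assigned.\n",
    "combinations = list(product(optimizers, schedulers))\n",
    "\n",
    "print(len(combinations))\n",
    "for optimizer_name, scheduler_name in combinations:\n",
    "    print(optimizer_name, scheduler_name)\n",
    "```\n",
    "\n",
    "Each pair is additionally combined with a learning rate drawn from `[0.001, 0.01]`."
   ]
  },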
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "html"
    }
   },
   "source": [
    "<div class=\"alert alert-info\">\n",
    "\n",
    "Note:\n",
    "\n",
    "With the default `OptimizerConfig`, the optimizer name given in the config determines the optimizer class whose object is created. Changing the name through the search space results in a mismatch in the class type, which causes an error. This is why we define the custom configs above.\n",
    "\n",
    "A further benefit of this approach is that we can define our own optimizers or schedulers as classes and pass them to their respective configs for ablator to manage during training.\n",
    "\n",
    "</div>\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "\n",
    "With these configs in place, we can test different optimizers and schedulers for our model. For a complete example of an experiment, see [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fostiropoulos/ablator/blob/v0.0.1-mp/docs/source/notebooks/SearchSpace-Optimizers-Schedulers-full-example.ipynb)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}