{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Interpreting Results"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fostiropoulos/ablator/blob/v0.0.1-mp/docs/source/notebooks/Interpreting-results.ipynb)\n",
    "\n",
    "Now that we have trained different variations of the model, we now proceed to the exciting part, which is associating different components/aspects of the model training process with the overall performance, and draw conclusions about their impacts.\n",
    "\n",
    "In this tutorial, we will demonstrate how to interpret results from the experiment output directory that was generated in the [Ablation experiment](./HPO-tutorial.ipynb) tutorial."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are two main steps to interpret the results of the ablation study:\n",
    "\n",
    "- Use [`ablator.analysis.results`](../ablator.analysis.results.rst) module to consolidates the metrics from all the trials into a unified combined dataframe.\n",
    "\n",
    "- Use [`ablator.analysis.plot.main`](../ablator.analysis.plot.main.rst) module to generate plots for the metrics and parameters."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us first import the necessary libraries and modules:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    import ablator\n",
    "except:\n",
    "    !pip install ablator\n",
    "    print(\"Stopping RUNTIME! Please run again\") # This script automatically restart runtime (if ablator is not found and installing is needed) so changes are applied\n",
    "    import os\n",
    "\n",
    "    os.kill(os.getpid(), 9)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ablator.analysis.results import Results # for formatting results\n",
    "from ablator import PlotAnalysis, Optim # for plotting\n",
    "\n",
    "from ablator import ParallelConfig, ModelConfig, configclass # for configs\n",
    "\n",
    "from pathlib import Path # for defining path\n",
    "import pandas as pd # for reading dataframe"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generate analysis Report"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generating results\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[`ablator.analysis.results.Results`](../ablator.analysis.results.rst) is responsible for processing the results within all the trial directories in the experiment output directory. In specific, `Results.read_results()` method reads multiple results in parallel from the experiment directory using Ray, then returns all the combined metrics as a dataframe."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`ablator.analysis.results.Results` has several parameters:\n",
    "\n",
    "- `config`: The running config **class** that is used in the experiment. Here it should be `CustomParallelConfig`, so make sure that the same config class used in the HPO tutorial is used here.\n",
    "\n",
    "- `experiment_dir`: the experiment output directory (`/tmp/experiments/experiment_<experiment id>`)\n",
    "\n",
    "- `use_ray`: either to use ray to parallelize the result reading process or not.\n",
    "\n",
    "Below is a concrete example of how to read the results of an experiment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "@configclass\n",
    "class CustomModelConfig(ModelConfig): # Configurations from the Ablation experiment tutorial\n",
    "  num_filter1: int\n",
    "  num_filter2: int\n",
    "  activation: str\n",
    "\n",
    "@configclass\n",
    "class CustomParallelConfig(ParallelConfig): # Configurations from the Ablation experiment tutorial\n",
    "  model_config: CustomModelConfig\n",
    "\n",
    "directory_path = Path('/tmp/experiments/')\n",
    "\n",
    "results = Results(config = CustomParallelConfig, experiment_dir=directory_path, use_ray=True)\n",
    "\n",
    "df = results.read_results(config_type=CustomParallelConfig, experiment_dir=directory_path)\n",
    "\n",
    "df.to_csv(\"results.csv\") # Optional: save the results to results.csv"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plotting graphs"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `ablator.analysis.plot.PlotAnalysis` class is utilized for plotting graphs.\n",
    "\n",
    "The responsibilities of the `PlotAnalysis` class include:\n",
    "\n",
    "- Generating plots for the provided metrics and parameters.\n",
    "\n",
    "- Mapping the output and attribute names to user-provided names for better readability.\n",
    "\n",
    "- Storing the generated plots in the desired directory."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We first create python dictionaries that map the configuration parameters (one for categorical and one for numerical parameters) to custom labels for the plots. This improves the readability of the plots. However, renaming attributes/metrics to custom names is optional. If not provided, the names will be the default like `train_config.batch_size`.\n",
    "\n",
    "Below is an example of how to create these dictionaries. The keys of the dictionary are the configuration parameters and the values are the custom names:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'model_config.activation': 'Activation',\n",
       " 'model_config.num_filter1': 'N. Filter 1',\n",
       " 'model_config.num_filter2': 'N. Filter 2',\n",
       " 'train_config.optimizer_config.arguments.lr': 'Learning Rate'}"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "categorical_name_remap = {\n",
    "    \"model_config.activation\": \"Activation\",\n",
    "}\n",
    "numerical_name_remap = {\n",
    "    \"model_config.num_filter1\": \"N. Filter 1\",\n",
    "    \"model_config.num_filter2\": \"N. Filter 2\",\n",
    "    \"train_config.optimizer_config.arguments.lr\": \"Learning Rate\",\n",
    "}\n",
    "\n",
    "attribute_name_remap = {**categorical_name_remap, **numerical_name_remap}\n",
    "attribute_name_remap"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we use the dataframe generated by `ablator.analysis.results.Results` and the name remap to initialize the `PlotAnalysis` object and to generate the plots. `PlotAnalysis` object is initialized with the following parameters:\n",
    "\n",
    "- `results`: Pandas dataframe for the experiment results.\n",
    "\n",
    "- `categorical_attributes`: List of all categorical hyerparameters names.\n",
    "\n",
    "- `numerical_attributes`: List of all numerical hyerparameters names.\n",
    "\n",
    "- `optim_metrics`: A dictionary mapping metric names to optimization directions.\n",
    "\n",
    "- `save_dir`: Directory to save plots to.\n",
    "\n",
    "- `cache`: Whether to cache results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "analysis = PlotAnalysis(\n",
    "    df,\n",
    "    save_dir=\"./plots\",\n",
    "    cache=True,\n",
    "    optim_metrics={\"val_accuracy\": Optim.max},\n",
    "    numerical_attributes=list(numerical_name_remap.keys()),\n",
    "    categorical_attributes=list(categorical_name_remap.keys()),\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `PlotAnalysis.make_figures()` method is responsible for generating graphs, specifically linear plots for numerical attributes and violin plots for categorical values. To generate these plots, call this function, passing the metric-attribute mappings dictionary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "analysis.make_figures(\n",
    "    metric_name_remap={\n",
    "        \"val_accuracy\": \"Validation Accuracy\",\n",
    "    },\n",
    "    attribute_name_remap = attribute_name_remap\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can find the plots stored in `./plots` directory."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Read analysis report and draw conclusions"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we can see the plots generated for our previous HPO tutorial. These plots represent the experiment conducted in the Ablation experiment chapter. The results may vary depending on the specific values used for each trial within the search space.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linearplots\n",
    "\n",
    "<img src=\"./Images/model_config.num_filter1.png\" width=\"300\" height=\"300\" alt=\"Validation Accuracy vs. Number of Filters in Layer 1\">\n",
    "<img src=\"./Images/model_config.num_filter2.png\" width=\"300\" height=\"300\" alt=\"Validation Accuracy vs. Number of Filters in Layer 2\">\n",
    "<img src=\"./Images/train_config.optimizer_config.arguments.lr.png\" width=\"300\" height=\"300\" alt=\"Validation Accuracy vs. Learning Rate\">"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that, with an increase in learning rate, the model's validation accuracy decreases. N. Filter 1 and 2 show some positive correlation with the performance.  "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Violinplots\n",
    "\n",
    "\n",
    "<img src=\"./Images/model_config.activation.png\" width=\"600\" height=\"380\" alt=\"Validation Accuracy vs. Activations\">\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For activation functions, we can see \"elu\" and \"leaky relu\" perform a little better for this problem.\n",
    "Overall, \"elu\" gave the highest accuracy for the experiment."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Observations \n",
    "\n",
    "In an ablation experiment, hyperparameters are randomly selected for each trial from a predefined search space. When TPE is used, the experiments can be biased towards a specific hyper-parameter range. For example, for different random initialization of TPE, it randomly sampled a higher learning rate for which smaller network (fewer channels) performed better. The contrary results were obtained using TPE where a random initialization sampled from smaller learning rates, favoring a larger neural network (more channels). \n",
    "\n",
    "As a result, it appears we get contradictory conclusions for our Ablation experiments. We NOTE, that it is important to select a Random strategy when performing ablation experiments where we want to be definite about the performance of a method. For example, using a Random optimization strategy have us conclude that using XXX performs better. \n",
    "\n",
    "When exploring the correlations, the resulting plots can provide insights into how the hyperparameters interact when used simultaneously. The plots reveal trends and patterns that can help understand the combined effect of the hyperparameters on the model's performance.\n",
    "\n",
    "If significant correlations are found among the hyperparameters, it may be beneficial to conduct HPO on individual hyperparameters to gain a deeper understanding of their independent effects. This focused analysis allows for a more precise evaluation of each hyperparameter's influence on the model's performance."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "\n",
    "We have completed the analysis part of the tutorial. We saw the complete pipeline to use ablator to train models. This starts with prototyping models to analyze the ablation results. We have significantly spent less time on writing boiler-plate code while getting the benefits of parallel training, storing metrics, and analysis."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}