ablator.analysis package
Subpackages
Submodules
ablator.analysis.main module
- class ablator.analysis.main.Analysis(results: DataFrame | Results, categorical_attributes: list[str] | None = None, numerical_attributes: list[str] | None = None, optim_metrics: dict[str, ablator.config.proto.Optim] | None = None, save_dir: str | None = None, cache: bool = False)
Bases: object
A class that stores and processes the attributes, metrics, and other data used for plotting experiment results.
- Parameters:
- results : pd.DataFrame | Results
The result dataframe.
- categorical_attributes : list[str] | None
The list of all the categorical hyperparameter names, by default None.
- numerical_attributes : list[str] | None
The list of all the numerical hyperparameter names, by default None.
- optim_metrics : dict[str, Optim] | None
A dictionary mapping metric names to optimization directions, by default None.
- save_dir : str | None
The directory to save analysis results to, by default None.
- cache : bool
Whether to cache results, by default False.
- Raises:
- FileNotFoundError
If the provided save_dir for saving plots does not exist.
- ValueError
If cache is True but no save_dir is provided.
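The two error conditions above can be illustrated with a minimal sketch. The helper `check_save_dir` is hypothetical and is not part of ablator; the order of the checks and the error messages are assumptions, and only the exception types come from the documentation.

```python
from pathlib import Path


def check_save_dir(save_dir, cache):
    """Hypothetical helper mirroring the documented checks (not ablator's code)."""
    if save_dir is None:
        # Caching needs somewhere to store its files.
        if cache:
            raise ValueError("cache=True requires a save_dir.")
        return None
    path = Path(save_dir)
    # The directory must already exist; this sketch does not create it.
    if not path.exists():
        raise FileNotFoundError(f"Save directory does not exist: {path}")
    return path
```

Under these assumptions, `check_save_dir(None, cache=True)` raises `ValueError`, and a path that is not on disk raises `FileNotFoundError`.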
- Attributes:
- optim_metrics : dict[str, Optim]
A dictionary mapping metric names to optimization directions.
- save_dir : Path | None
The directory to save analysis results to.
- cache : Memory | None
A joblib memory cache for saving results.
- categorical_attributes : list[str]
The list of all the categorical hyperparameter names.
- numerical_attributes : list[str]
The list of all the numerical hyperparameter names.
- experiment_attributes : list[str]
The list of all the hyperparameter names.
- results : pd.DataFrame
The dataframe extracted from the results file based on the given metric names and hyperparameter names.
- property metric_names: list[str]
- Returns:
- list[str]
The list of all the metrics that will be plotted with respect to the hyperparameters.
Examples
>>> # Create an Analysis object
>>> plots = Analysis(
...     results,  # a pd.DataFrame or Results object
...     optim_metrics={"val_loss": Optim.min, "train_loss": Optim.min},
... )
>>> plots.metric_names
['val_loss', 'train_loss']
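The example suggests that the metric names are the keys of `optim_metrics`, in insertion order. The sketch below illustrates that reading with a stand-in `Optim` enum; both the stand-in and the helper functions `metric_names` and `best_value` are hypothetical and may differ from ablator's implementation.

```python
from enum import Enum


class Optim(Enum):
    """Stand-in for ablator's Optim; the real enum's values may differ."""

    min = "min"
    max = "max"


def metric_names(optim_metrics):
    # Assumption: metric names are the dictionary keys, in insertion order.
    return list(optim_metrics)


def best_value(values, direction):
    # Pick the best observation according to the optimization direction.
    return min(values) if direction is Optim.min else max(values)
```

With `{"val_loss": Optim.min, "train_loss": Optim.min}`, `metric_names` returns `['val_loss', 'train_loss']`, which matches the documented example.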
ablator.analysis.results module
- class ablator.analysis.results.Results(config: type[ablator.config.mp.ParallelConfig] | ParallelConfig, experiment_dir: str | Path, cache: bool = False, use_ray: bool = False)
Bases: object
Class for processing experiment results. You can use this class to read the results in an experiment output directory. It can be used in combination with PlotAnalysis to show the correlation between hyperparameters and metrics. Refer to the Interpreting Results tutorial for more details on plotting and interpreting experiment results.
- Parameters:
- config : type[ParallelConfig] | ParallelConfig
The configuration class used.
- experiment_dir : str | Path
The path to the experiment directory.
- cache : bool
Whether to cache the results, by default False.
- use_ray : bool
Whether to use ray for parallel processing, by default False.
- Raises:
- FileNotFoundError
If the experiment directory does not exist.
- ValueError
If RunConfig is provided instead of ParallelConfig.
Examples
Suppose you have an experiment output directory stored at <path to experiment output defined in config experiment_dir>. You can read the results from the directory as follows:
>>> directory_path = Path('<path to experiment output defined in config experiment_dir>')
>>> results = Results(config=ParallelConfig, experiment_dir=directory_path, use_ray=True)
>>> df = results.read_results(config_type=ParallelConfig, experiment_dir=directory_path)
Pass df to PlotAnalysis to create an analysis object for plotting the correlation between the hyperparameters and the metrics, and to save the plots to an output directory. For example, the following template generates plots for each of the numerical and categorical hyperparameters and saves them to the ./plots directory. Here "Validation Accuracy" is the display name of the main metric, val_accuracy.
>>> analysis = PlotAnalysis(
...     df,
...     save_dir="./plots",
...     cache=True,
...     optim_metrics={"val_accuracy": Optim.max},
...     numerical_attributes=<numerical name remap keys names>,
...     categorical_attributes=<categorical name remap keys names>,
... )
>>> analysis.make_figures(
...     metric_name_remap={
...         "val_accuracy": "Validation Accuracy",
...     },
...     attribute_name_remap=attribute_name_remap,
... )
- Attributes:
- experiment_dir : Path
The path to the experiment directory.
- config : type[ParallelConfig]
The configuration class used.
- metric_map : dict[str, Optim]
A dictionary mapping metric names to their optimization direction.
- data : pd.DataFrame
The processed results of the experiment. Refer to read_results for more details.
- config_attrs : list[str]
The list of all the optimizable hyperparameter names.
- search_space : dict[str, ty.Any]
The full search space of the experiment.
- numerical_attributes : list[str]
The list of all the numerical hyperparameter names.
- categorical_attributes : list[str]
The list of all the categorical hyperparameter names.
- property metric_names: list[str]
Get the list of all optimization metric names.
- Returns:
- list[str]
The list of optimization metric names.
Examples
>>> results.metric_names
["val_loss", "train_loss", "val_acc", "train_acc"]
- classmethod read_results(config_type: type[ablator.config.main.ConfigBase], experiment_dir: Path | str, num_cpus: float | None = None) -> DataFrame
Read all experiment results from the experiment directory (with ray if specified when initializing Results).
- Parameters:
- config_type : type[ConfigBase]
The configuration class.
- experiment_dir : Path | str
The experiment directory.
- num_cpus : float | None
Number of CPUs to use for ray processing, by default None.
- Returns:
- pd.DataFrame
A data frame of all the results from all experiments in experiment_dir.
- Raises:
- RuntimeError
If no results are present in the experiment directory.
Examples
>>> results.read_results(config_type=ParallelConfig, experiment_dir="/tmp/results/experiment_8925_9991/")
 train_loss     val_loss  best_iteration    best_loss  current_epoch  current_iteration  epochs
 13.3658738                            0          inf              1                100       5
2.277102967  0.277085876             100  0.277085876              2                200       5
2.277154112   0.27619998             200   0.27619998              3                300       5
2.276529543  0.286987235             200   0.27619998              4                400       5
2.279828385  0.274052692             400  0.274052692              5                500       5
11.91869608                            0          inf              1                100       5
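As a rough illustration of the "no results present" condition, the sketch below gathers result files under an experiment directory and raises `RuntimeError` when none are found. The function `collect_result_files` is hypothetical, and the layout it assumes (one `results.json` somewhere beneath each trial directory) is an assumption rather than ablator's documented on-disk format.

```python
from pathlib import Path


def collect_result_files(experiment_dir):
    """Hypothetical helper: find every results.json below experiment_dir."""
    root = Path(experiment_dir)
    # Assumed layout: each trial writes a file named results.json.
    files = sorted(root.rglob("results.json"))
    if not files:
        raise RuntimeError(f"No results found in {root}")
    return files
```

Each path returned this way could then be handed to a per-file reader, one trial at a time or in parallel.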
- ablator.analysis.results.read_result(config_type: type[ablator.config.main.ConfigBase], json_path: Path) -> DataFrame | None
Read the results of an experiment and return them as a pandas DataFrame.
The function reads the data from a JSON file, processes each row, and appends experiment attributes from a YAML configuration file. The resulting DataFrame is indexed and returned.
- Parameters:
- config_type : type[ConfigBase]
The type of configuration class that is used to load the experiment configuration from a YAML file.
- json_path : Path
The path to the JSON file containing the results of the experiment.
- Returns:
- pd.DataFrame | None
A pandas DataFrame containing the processed experiment results. Returns None if there was an error in reading the json_path results.
Examples
Suppose the result JSON file /tmp/myexperiment/results.json contains:
>>> json.load("results.json")
[{
    "train_loss": 10.35,
    "val_loss": NaN,
    "current_epoch": 1,
},
{
    "train_loss": 3.89,
    "val_loss": 7.04,
    "current_epoch": 2,
}]
And the corresponding configuration object run_config is created as:
>>> config = {
...     "model_config": {},
...     "train_config": {
...         'dataset': 'Fashion-mnist',
...         'batch_size': 32,
...         'epochs': 20,
...         'optimizer_config': {
...             'name': 'adam',
...             'arguments': {
...                 'betas': (0.9, 0.999), 'weight_decay': 0.0, 'lr': 0.001
...             }
...         },
...         'scheduler_config': None
...     },
...     "experiment_dir": '/tmp/experiments',
...     "random_seed": 42,
...     # ... other configs
...     "optim_metrics": None,
...     "optim_metric_name": None
... }
>>> run_config = RunConfig(**config)
The function read_result will return a pandas data frame like the one below:
>>> read_result(run_config, Path("/tmp/myexperiment/results.json"))
                       experiment_dir  keep_n_checkpoints  ...  train_loss  val_loss  current_epoch
trial_uid    step
experiments_ 0     C:/tmp/experiments                   3  ...       10.35       NaN              1
             1     C:/tmp/experiments                   3  ...        3.89      7.04              2
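The read, process, and append steps described for read_result can be sketched with the standard library alone. The function `read_rows` is hypothetical and returns a list of dictionaries rather than a DataFrame; passing the configuration as a plain dictionary (`config_attrs`) and numbering rows with a `step` key are assumptions made for illustration.

```python
import json
from pathlib import Path


def read_rows(json_path, config_attrs):
    """Hypothetical stand-in for read_result that returns a list of dicts."""
    try:
        # Python's json module accepts the non-standard NaN literal by default.
        rows = json.loads(Path(json_path).read_text())
    except (OSError, json.JSONDecodeError):
        # Mirror the documented behaviour: return None when reading fails.
        return None
    # Append the experiment attributes to every row and number the rows.
    return [{**config_attrs, **row, "step": step} for step, row in enumerate(rows)]
```

Returning `None` instead of raising lets a caller skip a corrupt or missing trial file while still collecting the remaining results.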