ablator.analysis package

Subpackages

Submodules

ablator.analysis.main module

class ablator.analysis.main.Analysis(results: DataFrame | Results, categorical_attributes: list[str] | None = None, numerical_attributes: list[str] | None = None, optim_metrics: dict[str, ablator.config.mp.Optim] | None = None, save_dir: str | None = None, cache=False)

Bases: object

A class for analyzing experimental results.

Attributes:
optim_metrics: dict[str, Optim]

A dictionary mapping metric names to optimization directions.

save_dir: str | None

The directory to save analysis results to.

cache: Memory | None

A joblib memory cache for saving results.

categorical_attributes: list[str]

The list of all the categorical hyperparameter names.

numerical_attributes: list[str]

The list of all the numerical hyperparameter names.

experiment_attributes: list[str]

The list of all the hyperparameter names.

results: pd.DataFrame

The dataframe extracted from the results file based on the given metric names and hyperparameter names.

property metric_names
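
As an illustration of what "a dictionary mapping metric names to optimization directions" means in practice, the following is a minimal sketch that picks the best trial for each metric. It does not use ablator itself: the `Optim` enum below is a local stand-in for `ablator.config.mp.Optim` (its `min` member is an assumption; only `Optim.max` appears in this page), and `best_row` and the sample rows are hypothetical.

```python
from enum import Enum


class Optim(Enum):
    """Local stand-in for ablator.config.mp.Optim (members assumed)."""

    min = "min"
    max = "max"


def best_row(rows, metric, direction):
    """Return the row with the best value of `metric`, or None if no row has it."""
    scored = [r for r in rows if r.get(metric) is not None]
    if not scored:
        return None
    # Higher is better for Optim.max, lower is better for Optim.min.
    pick = max if direction is Optim.max else min
    return pick(scored, key=lambda r: r[metric])


# Hypothetical results: one dict per trial.
rows = [
    {"lr": 0.1, "optimizer": "sgd", "val_accuracy": 0.81, "val_loss": 0.52},
    {"lr": 0.01, "optimizer": "adam", "val_accuracy": 0.88, "val_loss": 0.40},
    {"lr": 0.001, "optimizer": "adam", "val_accuracy": 0.85, "val_loss": 0.37},
]
optim_metrics = {"val_accuracy": Optim.max, "val_loss": Optim.min}

# One best trial per metric, each chosen according to its own direction.
best = {name: best_row(rows, name, d) for name, d in optim_metrics.items()}
```

Keeping the direction next to the metric name means code that ranks trials never has to guess whether a larger value is an improvement.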

ablator.analysis.results module

class ablator.analysis.results.Results(config: type[ablator.config.mp.ParallelConfig] | ParallelConfig, experiment_dir: str | Path, cache: bool = False, use_ray: bool = False)

Bases: object

Class for processing experiment results. Use this class to read the results stored in an experiment output directory. It can be combined with PlotAnalysis to show the correlation between hyperparameters and metrics. Refer to the Interpreting Results tutorial for more details on plotting and interpreting experiment results.

Parameters:
config: type[ParallelConfig]

The configuration class used.

experiment_dir: str | Path

The path to the experiment directory.

cache: bool, optional

Whether to cache the results, by default False.

use_ray: bool, optional

Whether to use ray for parallel processing, by default False.

Examples

>>> directory_path = Path('<path to experiment output defined in experiment_dir>')
>>> results = Results(config = ParallelConfig, experiment_dir=directory_path, use_ray=True)
>>> df = results.read_results(config_type=ParallelConfig, experiment_dir=directory_path)

Pass df to PlotAnalysis to create an analysis object that can plot the correlation between the hyperparameters and metrics and save the plots to an output directory. For example, the following code snippet generates plots for each of the numerical and categorical hyperparameters and saves them to the ./plots directory. Here "val_accuracy" is the main metric, and “Validation Accuracy” is the display name it is remapped to.

>>> analysis = PlotAnalysis(
...         df,
...         save_dir="./plots",
...         cache=True,
...         optim_metrics={"val_accuracy": Optim.max},
...         numerical_attributes=<numerical name remap keys names>,
...         categorical_attributes=<categorical name remap keys names>,
...     )
>>> analysis.make_figures(
...     metric_name_remap={
...         "val_accuracy": "Validation Accuracy",
...     },
...     attribute_name_remap= attribute_name_remap
... )

Attributes:
experiment_dir: Path

The path to the experiment directory.

config: type[ParallelConfig]

The configuration class used.

metric_map: dict[str, Optim]

A dictionary mapping the names of the optimized metrics to their optimization directions.

data: pd.DataFrame

The processed results of the experiment. Refer to read_results for more details.

config_attrs: list[str]

The list of all the optimizable hyperparameter names.

search_space: dict[str, ty.Any]

The full search space of the experiment.

numerical_attributes: list[str]

The list of all the numerical hyperparameter names.

categorical_attributes: list[str]

The list of all the categorical hyperparameter names.
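
To make the numerical/categorical distinction concrete, here is a minimal sketch that partitions hyperparameter names by inspecting the values observed in a set of trial rows. This is not necessarily how ablator populates `numerical_attributes` and `categorical_attributes` (this page does not say how they are derived); `split_attributes` and the sample rows are hypothetical and only illustrate the idea.

```python
from numbers import Number


def split_attributes(rows, attribute_names):
    """Partition attribute names into (numerical, categorical) lists."""
    numerical, categorical = [], []
    for name in attribute_names:
        values = [r[name] for r in rows if r.get(name) is not None]
        # Booleans are Numbers in Python, but they behave like categories here.
        is_numeric = bool(values) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        )
        (numerical if is_numeric else categorical).append(name)
    return numerical, categorical


# Hypothetical trial rows.
rows = [
    {"lr": 0.1, "optimizer": "sgd", "use_bn": True},
    {"lr": 0.01, "optimizer": "adam", "use_bn": False},
]
numerical, categorical = split_attributes(rows, ["lr", "optimizer", "use_bn"])
```

The split matters for plotting: a numerical attribute can be placed on a continuous axis, while a categorical one is better shown as one group per value.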

property metric_names: list[str]

Get the list of all the optimized metric names.

Returns:
list[str]

The list of optimized metric names.

classmethod read_results(config_type: type[ablator.config.main.ConfigBase], experiment_dir: Path | str, num_cpus=None) -> DataFrame

Read multiple results from the experiment directory, using ray to enable parallel processing.

This function calls read_result repeatedly; refer to read_result for more details.

Parameters:
config_type: type[ConfigBase]

The configuration class.

experiment_dir: Path | str

The experiment directory.

num_cpus: int, optional

Number of CPUs to use for ray processing, by default None.

Returns:
pd.DataFrame

A dataframe of all the results.
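
The pattern described above (find every result file under a directory, load the files in parallel, and combine the rows) can be sketched with the standard library alone. This is not ablator's implementation: it swaps ray for a `ThreadPoolExecutor`, returns a list of dicts instead of a DataFrame, and assumes that each trial directory holds a file named `results.json` containing a JSON list of objects. The names `load_one` and `load_all` are hypothetical.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_one(json_path):
    """Load one trial's rows, or return None if the file cannot be parsed."""
    try:
        rows = json.loads(json_path.read_text())
    except (OSError, json.JSONDecodeError):
        return None
    # Record which trial directory each row came from.
    return [{**row, "path": str(json_path.parent)} for row in rows]


def load_all(experiment_dir, num_workers=None):
    """Collect rows from every results.json found under experiment_dir."""
    paths = sorted(Path(experiment_dir).rglob("results.json"))
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        loaded = list(pool.map(load_one, paths))
    # Skip trials that failed to load, then flatten into one list of rows.
    return [row for rows in loaded if rows is not None for row in rows]


# Build a throwaway experiment directory: two valid trials and one corrupt one.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, acc in [("trial_a", 0.85), ("trial_b", 0.87)]:
        (root / name).mkdir()
        (root / name / "results.json").write_text(json.dumps([{"accuracy": acc}]))
    (root / "trial_c").mkdir()
    (root / "trial_c" / "results.json").write_text("{not json")
    all_rows = load_all(root, num_workers=2)
```

Returning None from the per-file loader, rather than raising, lets one corrupt trial be skipped without discarding the rest of the experiment.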

ablator.analysis.results.read_result(config_type: type[ablator.config.main.ConfigBase], json_path: Path) -> DataFrame | None

Read the results of an experiment and return them as a pandas DataFrame.

The function reads the data from a JSON file, processes each row, and appends experiment attributes from a YAML configuration file. The resulting DataFrame is indexed and returned.

Parameters:
config_type: type[ConfigBase]

The type of the configuration class that is used to load the experiment configuration from a YAML file.

json_path: Path

The path to the JSON file containing the results of the experiment.

Returns:
pd.DataFrame | None

A pandas DataFrame containing the processed experiment results. Returns None if there was an error reading the results at json_path.

Examples

Result JSON file:

{
"run_id": "run_1",
"accuracy": 0.85,
"loss": 0.35
}
{
"run_id": "run_2",
"accuracy": 0.87,
"loss": 0.32
}

Config file:

experiment_name: "My Experiment"
batch_size: 64

Return value:

   run_id  accuracy  loss experiment_name  batch_size                path
0   run_1      0.85  0.35   My Experiment          64  path/to/experiment
1   run_2      0.87  0.32   My Experiment          64  path/to/experiment
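
The example above can be reproduced in spirit with a short standard-library sketch: decode result objects written back to back, then append the configuration attributes and the experiment path to every row. This is an illustration, not ablator's code. It returns a list of dicts rather than a DataFrame, takes the configuration as an already-parsed dict instead of loading YAML, and the names `parse_concatenated_json` and `merge_results` are hypothetical.

```python
import json


def parse_concatenated_json(text):
    """Decode a string holding several JSON objects written back to back."""
    decoder = json.JSONDecoder()
    objects, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects


def merge_results(results_text, config_attrs, path):
    """Attach config attributes and the path to each result row.

    Returns None if the results text cannot be parsed.
    """
    try:
        rows = parse_concatenated_json(results_text)
    except json.JSONDecodeError:
        return None
    return [{**row, **config_attrs, "path": path} for row in rows]


results_text = """
{"run_id": "run_1", "accuracy": 0.85, "loss": 0.35}
{"run_id": "run_2", "accuracy": 0.87, "loss": 0.32}
"""
config_attrs = {"experiment_name": "My Experiment", "batch_size": 64}
table = merge_results(results_text, config_attrs, "path/to/experiment")
```

Each entry of `table` corresponds to one row of the return value shown above, with the same six columns.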

Module contents