ablator.analysis package

Submodules

ablator.analysis.main module

class ablator.analysis.main.Analysis(results: DataFrame | Results, categorical_attributes: list[str] | None = None, numerical_attributes: list[str] | None = None, optim_metrics: dict[str, ablator.config.mp.Optim] | None = None, save_dir: str | None = None, cache=False)

Bases: object

A class for analyzing experimental results.

Attributes:

optim_metrics: dict[str, Optim]: A dictionary mapping metric names to optimization directions.
save_dir: str | None: The directory to save analysis results to.
cache: Memory | None: A joblib memory cache for saving results.
categorical_attributes: list[str]: The list of all the categorical hyperparameter names.
numerical_attributes: list[str]: The list of all the numerical hyperparameter names.
experiment_attributes: list[str]: The list of all the hyperparameter names.
results: pd.DataFrame: The dataframe extracted from the results file based on the given metric names and hyperparameter names.
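
To illustrate what a mapping from metric names to optimization directions makes possible, here is a minimal, library-independent sketch. The `Optim` enum below is a hypothetical stand-in for `ablator.config.mp.Optim`, and `best_runs` is an illustrative helper that is not part of ablator; the rows are plain dictionaries rather than a DataFrame.

```python
from enum import Enum


class Optim(Enum):
    """Hypothetical stand-in for ablator.config.mp.Optim (assumption)."""

    min = "min"
    max = "max"


def best_runs(rows, optim_metrics):
    """Return, for each metric, the row with the best value for its direction."""
    best = {}
    for metric, direction in optim_metrics.items():
        # Skip rows that did not report this metric.
        candidates = [r for r in rows if r.get(metric) is not None]
        if not candidates:
            continue
        # Higher is better for Optim.max, lower is better for Optim.min.
        pick = max if direction is Optim.max else min
        best[metric] = pick(candidates, key=lambda r: r[metric])
    return best


# Illustrative rows only; these are not real experiment results.
rows = [
    {"run_id": "run_1", "accuracy": 0.85, "loss": 0.35},
    {"run_id": "run_2", "accuracy": 0.87, "loss": 0.32},
]
optim_metrics = {"accuracy": Optim.max, "loss": Optim.min}
best = best_runs(rows, optim_metrics)
print({metric: row["run_id"] for metric, row in best.items()})
```

Keeping the direction next to the metric name means the same selection logic works for metrics that should be maximized and for those that should be minimized.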

property metric_names

ablator.analysis.results module

class ablator.analysis.results.Results(config: type[ablator.config.mp.ParallelConfig] | ParallelConfig, experiment_dir: str | Path, cache: bool = False, use_ray: bool = False)

Bases: object

Class for processing experiment results. Use this class to read the results stored in an experiment output directory. It can be combined with PlotAnalysis to show the correlation between hyperparameters and metrics. Refer to the Interpreting Results tutorial for more details on plotting and interpreting experiment results.

Parameters:

config: type[ParallelConfig]: The configuration class used.
experiment_dir: str | Path: The path to the experiment directory.
cache: bool, optional: Whether to cache the results, by default False.
use_ray: bool, optional: Whether to use ray for parallel processing, by default False.

Examples

>>> from pathlib import Path
>>> from ablator.analysis.results import Results
>>> from ablator.config.mp import ParallelConfig
>>> directory_path = Path('<path to experiment output defined in experiment_dir>')
>>> results = Results(config=ParallelConfig, experiment_dir=directory_path, use_ray=True)
>>> df = results.read_results(config_type=ParallelConfig, experiment_dir=directory_path)

Pass df to PlotAnalysis to create an analysis object that can plot the correlation between the hyperparameters and metrics and save the plots to an output directory. For example, the following code snippet generates plots for each of the numerical and categorical hyperparameters and saves them to the ./plots directory. Here “val_accuracy” is the main metric, and “Validation Accuracy” is the name it is remapped to in the figures.

>>> analysis = PlotAnalysis(
...     df,
...     save_dir="./plots",
...     cache=True,
...     optim_metrics={"val_accuracy": Optim.max},
...     numerical_attributes=<numerical name remap keys names>,
...     categorical_attributes=<categorical name remap keys names>,
... )
>>> analysis.make_figures(
...     metric_name_remap={
...         "val_accuracy": "Validation Accuracy",
...     },
...     attribute_name_remap=attribute_name_remap,
... )
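
As a rough illustration of what a correlation between a numerical hyperparameter and a metric means, the sketch below computes the Pearson correlation coefficient by hand. This is only one common statistic; which measure PlotAnalysis actually plots is not stated here, and the `pearson` helper and the sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: one numerical hyperparameter and one metric per trial.
num_layers = [1, 2, 3, 4]
val_accuracy = [0.70, 0.75, 0.80, 0.85]
r = pearson(num_layers, val_accuracy)
print(f"correlation: {r:.3f}")
```

A coefficient near 1 or -1 indicates a strong linear relationship between the hyperparameter and the metric, while a value near 0 indicates little linear relationship.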

Attributes:

experiment_dir: Path: The path to the experiment directory.
config: type[ParallelConfig]: The configuration class used.
metric_map: dict[str, Optim]: A dictionary mapping optimized metric names to their optimization direction.
data: pd.DataFrame: The processed results of the experiment. Refer to read_results for more details.
config_attrs: list[str]: The list of all the optimizable hyperparameter names.
search_space: dict[str, ty.Any]: The full search space of the experiment.
numerical_attributes: list[str]: The list of all the numerical hyperparameter names.
categorical_attributes: list[str]: The list of all the categorical hyperparameter names.

property metric_names: list[str]

Get the list of all optimized metric names.

Returns:

list[str]: The list of optimized metric names.

classmethod read_results(config_type: type[ablator.config.main.ConfigBase], experiment_dir: Path | str, num_cpus=None) → DataFrame

Read multiple results from the experiment directory, using ray for parallel processing.

This function calls read_result many times; refer to read_result for more details.

Parameters:

config_type: type[ConfigBase]: The configuration class.
experiment_dir: Path | str: The experiment directory.
num_cpus: int, optional: Number of CPUs to use for ray processing, by default None.

Returns:

pd.DataFrame: A dataframe of all the results
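
The fan-out pattern described above (one reader call per results file, run in parallel) can be sketched without ray. The example below swaps in a standard-library thread pool purely for illustration; it is not how ablator schedules the work. `read_all` and `fake_read_one` are hypothetical names, and the sketch assumes that failed reads, which return None, are simply skipped.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_all(json_paths, read_one, max_workers=None):
    """Apply read_one to every path in parallel and drop failed (None) reads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the input order of json_paths.
        results = list(pool.map(read_one, json_paths))
    return [r for r in results if r is not None]


def fake_read_one(path):
    """Hypothetical reader: pretends that files named broken.json fail."""
    if path.name == "broken.json":
        return None
    return {"path": path.parent.as_posix()}


# Illustrative paths only; no files are opened in this sketch.
paths = [Path("exp/trial_0/results.json"), Path("exp/trial_1/broken.json")]
loaded = read_all(paths, fake_read_one, max_workers=2)
print(loaded)
```

In the real method the per-file results are DataFrames that end up in a single DataFrame; the sketch keeps them as a list of dictionaries to stay dependency-free.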

ablator.analysis.results.read_result(config_type: type[ablator.config.main.ConfigBase], json_path: Path) → DataFrame | None

Read the results of an experiment and return them as a pandas DataFrame.

The function reads the data from a JSON file, processes each row, and appends experiment attributes from a YAML configuration file. The resulting DataFrame is indexed and returned.

Parameters:

config_type: type[ConfigBase]: The type of the configuration class that is used to load the experiment configuration from a YAML file.
json_path: Path: The path to the JSON file containing the results of the experiment.

Returns:

pd.DataFrame | None: A pandas DataFrame containing the processed experiment results. Returns None if there was an error in reading the json_path results.

Examples

Result JSON file:

{
    "run_id": "run_1",
    "accuracy": 0.85,
    "loss": 0.35
}
{
    "run_id": "run_2",
    "accuracy": 0.87,
    "loss": 0.32
}

Config file:

experiment_name: "My Experiment"
batch_size: 64

Return value:

  run_id  accuracy  loss experiment_name  batch_size                path
0  run_1      0.85  0.35   My Experiment          64  path/to/experiment
1  run_2      0.87  0.32   My Experiment          64  path/to/experiment
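
The merge step in this example, where every result row receives the shared configuration attributes and the experiment path, can be sketched with plain dictionaries. This is a simplified illustration rather than ablator's implementation: `merge_result_rows` is a hypothetical helper, the results are assumed to be a JSON array, and the configuration is passed in as an already-parsed dictionary instead of being loaded from YAML.

```python
import json


def merge_result_rows(json_text, config_attrs, experiment_path):
    """Attach shared config attributes and the path to every result row."""
    rows = json.loads(json_text)
    # Later keys win, so config attributes override same-named row fields.
    return [{**row, **config_attrs, "path": experiment_path} for row in rows]


# Illustrative inputs mirroring the example values above.
json_text = """
[
    {"run_id": "run_1", "accuracy": 0.85, "loss": 0.35},
    {"run_id": "run_2", "accuracy": 0.87, "loss": 0.32}
]
"""
config_attrs = {"experiment_name": "My Experiment", "batch_size": 64}
merged = merge_result_rows(json_text, config_attrs, "path/to/experiment")
for row in merged:
    print(row)
```

Each merged row carries both the per-run metrics and the experiment-level attributes, which is what allows metrics to be grouped or compared by hyperparameter later.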
