ablator.analysis package
Subpackages
Submodules
ablator.analysis.main module
- class ablator.analysis.main.Analysis(results: DataFrame | Results, categorical_attributes: list[str] | None = None, numerical_attributes: list[str] | None = None, optim_metrics: dict[str, ablator.config.mp.Optim] | None = None, save_dir: str | None = None, cache=False)
Bases: object
A class for analyzing experimental results.
- Attributes:
- optim_metrics: dict[str, Optim]
A dictionary mapping metric names to optimization directions.
- save_dir: str | None
The directory to save analysis results to.
- cache: Memory | None
A joblib memory cache for saving results.
- categorical_attributes: list[str]
The list of all the categorical hyperparameter names.
- numerical_attributes: list[str]
The list of all the numerical hyperparameter names.
- experiment_attributes: list[str]
The list of all the hyperparameter names.
- results: pd.DataFrame
The dataframe extracted from the results file based on the given metric names and hyperparameter names.
- property metric_names
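To illustrate how a mapping such as optim_metrics can be interpreted, here is a minimal sketch in plain Python. It does not use ablator: `Direction` is a hypothetical stand-in for the `Optim` enum, `best_run` is an illustrative helper, and the rows are made-up data. It only shows the idea of pairing each metric name with a direction and picking the best run accordingly.

```python
from enum import Enum


class Direction(Enum):
    """Hypothetical stand-in for ablator's Optim enum (not the real class)."""

    min = "min"
    max = "max"


def best_run(rows, metric, direction):
    """Return the row with the best value of `metric` under `direction`."""
    # Ignore runs that never recorded the metric.
    scored = [row for row in rows if row.get(metric) is not None]
    if not scored:
        return None
    pick = max if direction is Direction.max else min
    return pick(scored, key=lambda row: row[metric])


# Made-up results: one dict per run, mixing hyperparameters and metrics.
rows = [
    {"lr": 0.1, "optimizer": "sgd", "val_accuracy": 0.81, "val_loss": 0.52},
    {"lr": 0.01, "optimizer": "adam", "val_accuracy": 0.88, "val_loss": 0.47},
    {"lr": 0.001, "optimizer": "adam", "val_accuracy": 0.86, "val_loss": 0.41},
]

# Same shape as optim_metrics: metric name -> optimization direction.
optim_metrics = {"val_accuracy": Direction.max, "val_loss": Direction.min}

best = {name: best_run(rows, name, d) for name, d in optim_metrics.items()}
print(best["val_accuracy"]["lr"])  # learning rate of the most accurate run
print(best["val_loss"]["lr"])      # learning rate of the lowest-loss run
```

The direction matters because the same comparison cannot rank both metrics: accuracy is better when larger, loss when smaller, so each metric carries its own direction.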
ablator.analysis.results module
- class ablator.analysis.results.Results(config: type[ablator.config.mp.ParallelConfig] | ParallelConfig, experiment_dir: str | Path, cache: bool = False, use_ray: bool = False)
Bases: object
Class for processing experiment results. You can use this class to read the results in an experiment output directory. It can be used in combination with PlotAnalysis to show the correlation between hyperparameters and metrics. Refer to the Interpreting Results tutorial for more details on plotting and interpreting experiment results.
- Parameters:
- config: type[ParallelConfig]
The configuration class used.
- experiment_dir: str | Path
The path to the experiment directory.
- cache: bool, optional
Whether to cache the results, by default False.
- use_ray: bool, optional
Whether to use ray for parallel processing, by default False.
Examples
>>> from pathlib import Path
>>> from ablator.analysis.results import Results
>>> from ablator.config.mp import Optim, ParallelConfig
>>> directory_path = Path('<path to experiment output defined in experiment_dir>')
>>> results = Results(config=ParallelConfig, experiment_dir=directory_path, use_ray=True)
>>> df = results.read_results(config_type=ParallelConfig, experiment_dir=directory_path)
Pass df to PlotAnalysis to create an analysis object that can plot the correlation between the hyperparameters and metrics and save the plots to an output directory. For example, the following code snippet generates plots for each of the numerical and categorical hyperparameters and saves them to the ./plots directory. Here “Validation Accuracy” is the display name given to the main metric, val_accuracy.
>>> analysis = PlotAnalysis(
...     df,
...     save_dir="./plots",
...     cache=True,
...     optim_metrics={"val_accuracy": Optim.max},
...     numerical_attributes=<numerical name remap keys names>,
...     categorical_attributes=<categorical name remap keys names>,
... )
>>> analysis.make_figures(
...     metric_name_remap={
...         "val_accuracy": "Validation Accuracy",
...     },
...     attribute_name_remap=attribute_name_remap
... )
- Attributes:
- experiment_dir: Path
The path to the experiment directory.
- config: type[ParallelConfig]
The configuration class used.
- metric_map: dict[str, Optim]
A dictionary mapping optimization metric names to their optimization direction.
- data: pd.DataFrame
The processed results of the experiment. Refer to read_results for more details.
- config_attrs: list[str]
The list of all the optimizable hyperparameter names.
- search_space: dict[str, ty.Any]
The full search space of the experiment.
- numerical_attributes: list[str]
The list of all the numerical hyperparameter names.
- categorical_attributes: list[str]
The list of all the categorical hyperparameter names.
- property metric_names: list[str]
Get the list of all optimization metric names.
- Returns:
- list[str]
The list of optimization metric names.
- classmethod read_results(config_type: type[ablator.config.main.ConfigBase], experiment_dir: Path | str, num_cpus=None) -> DataFrame
Read multiple results from the experiment directory, using ray to enable parallel processing.
This function calls read_result many times; refer to read_result for more details.
- Parameters:
- config_type: type[ConfigBase]
The configuration class.
- experiment_dir: Path | str
The experiment directory.
- num_cpus: int, optional
Number of CPUs to use for ray processing, by default None.
- Returns:
- pd.DataFrame
A dataframe of all the results.
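The fan-out pattern described for read_results (call a per-file reader many times in parallel, then combine the outputs) can be sketched with the standard library alone. This is not ablator's implementation: a thread pool stands in for ray, `load_one` and `load_all` are hypothetical helpers, and the `results.json` file name, the one-JSON-object-per-line layout, and the `trial_*` directory names are all assumptions made for the illustration.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_one(json_path):
    """Hypothetical per-trial reader: one JSON object per line, None on error."""
    try:
        with open(json_path, encoding="utf-8") as handle:
            return [json.loads(line) for line in handle if line.strip()]
    except (OSError, json.JSONDecodeError):
        return None


def load_all(experiment_dir, num_workers=None):
    """Read every results.json under experiment_dir and concatenate the rows."""
    paths = sorted(Path(experiment_dir).rglob("results.json"))
    # A thread pool is used here purely as a stand-in for ray workers.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        per_trial = list(pool.map(load_one, paths))
    # Drop trials whose file could not be read, then flatten into one list.
    return [row for rows in per_trial if rows is not None for row in rows]


with tempfile.TemporaryDirectory() as tmp:
    # Two readable trials with made-up metrics.
    for trial, acc in [("trial_0", 0.85), ("trial_1", 0.87)]:
        trial_dir = Path(tmp) / trial
        trial_dir.mkdir()
        (trial_dir / "results.json").write_text(
            json.dumps({"run_id": trial, "accuracy": acc}) + "\n", encoding="utf-8"
        )
    # One corrupt trial, which is skipped rather than aborting the whole read.
    broken = Path(tmp) / "trial_2"
    broken.mkdir()
    (broken / "results.json").write_text("{not json", encoding="utf-8")

    all_rows = load_all(tmp, num_workers=2)

print([row["run_id"] for row in all_rows])
```

Returning None from the per-file reader, as read_result is documented to do on error, lets the caller decide how to treat unreadable trials; this sketch simply filters them out before combining.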
- ablator.analysis.results.read_result(config_type: type[ablator.config.main.ConfigBase], json_path: Path) -> DataFrame | None
Read the results of an experiment and return them as a pandas DataFrame.
The function reads the data from a JSON file, processes each row, and appends experiment attributes from a YAML configuration file. The resulting DataFrame is indexed and returned.
- Parameters:
- config_type: type[ConfigBase]
The type of the configuration class that is used to load the experiment configuration from a YAML file.
- json_path: Path
The path to the JSON file containing the results of the experiment.
- Returns:
- pd.DataFrame | None
A pandas DataFrame containing the processed experiment results. Returns None if there was an error in reading the json_path results.
Examples
>>> result json file:
{"run_id": "run_1", "accuracy": 0.85, "loss": 0.35}
{"run_id": "run_2", "accuracy": 0.87, "loss": 0.32}
>>> config file
experiment_name: "My Experiment"
batch_size: 64
>>> return value
  run_id  accuracy  loss experiment_name  batch_size                path
0  run_1      0.85  0.35   My Experiment          64  path/to/experiment
1  run_2      0.87  0.32   My Experiment          64  path/to/experiment
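The merge shown in the example above (each result row gains the experiment attributes and the experiment path) can be sketched without ablator or pandas. This is an illustration under stated assumptions, not the library's code: `merge_results` is a hypothetical helper, the results are assumed to hold one JSON object per line, and the configuration is given as an already-parsed dict rather than loaded from YAML.

```python
import json

# Same inputs as the example above, inlined as strings instead of files.
results_text = """
{"run_id": "run_1", "accuracy": 0.85, "loss": 0.35}
{"run_id": "run_2", "accuracy": 0.87, "loss": 0.32}
"""

# Assumed to be the already-parsed contents of the YAML config file.
config_attrs = {"experiment_name": "My Experiment", "batch_size": 64}


def merge_results(text, attrs, path):
    """Attach the experiment attributes and path to every result row."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        row = json.loads(line)
        row.update(attrs)   # append the configuration attributes
        row["path"] = path  # record where the results came from
        rows.append(row)
    return rows


table = merge_results(results_text, config_attrs, "path/to/experiment")
for row in table:
    print(row)
```

Each resulting dict corresponds to one row of the returned table, with the metric columns first, followed by the configuration attributes and the path.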