Data Type Configuration

These data type classes are used in configuration classes to specify the data type of each config attribute, which lets ablator support a variety of configuration formats.

Common data types

Ablator supports common structural data types such as lists, tuples, and dictionaries. Each type is described in the sections below.

class ablator.config.types.List(iterable=(), /)

Bases: List[T]

A class for the list data type, used to declare a config attribute as a list. Remember to wrap the element type in List[], e.g. List[str] or List[int].

Examples

You can declare an attribute of type List as follows:

>>> @configclass
... class MyConfig(ConfigBase):
...     my_str_list: List[str]  # list of strings
...     my_int_list: List[int]  # list of integers

When initializing a config object, you can pass a list of proper values. In addition, ablator will automatically cast them to the correct type if possible. For example:

>>> MyConfig(my_str_list=["a", "b", 1.5, 2],
...          my_int_list=[1, 2, -3.5, 4])
my_str_list:
- a
- b
- '1.5'
- '2'
my_int_list:
- 1
- 2
- -3
- 4

Notice that the values of my_str_list[2] and my_str_list[3] are cast to strings, and the value of my_int_list[2] is cast to an integer.
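The element-wise casting shown above can be approximated in plain Python. The sketch below is not ablator's implementation: cast_list is a hypothetical helper that simply calls the element type on each value, which is enough to reproduce the conversions in the example.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def cast_list(values: Iterable[object], elem_type: Callable[[object], T]) -> List[T]:
    """Hypothetical helper: convert every element by calling the element type on it."""
    return [elem_type(v) for v in values]


# Same inputs as the MyConfig example above.
str_list = cast_list(["a", "b", 1.5, 2], str)
int_list = cast_list([1, 2, -3.5, 4], int)  # int() truncates toward zero
```

Because int() truncates toward zero, -3.5 becomes -3 rather than -4, which matches the output listed above.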

class ablator.config.types.Tuple(iterable=(), /)

Bases: Tuple[T]

A class for tuple data type, used when you need to specify a config attribute to be a tuple. Remember to wrap the type of the tuple elements in Tuple[]. You also have the flexibility to specify the number of elements in the tuple and the data type for each of them.

Examples

You can declare an attribute of type Tuple as follows:

>>> @configclass
... class MyConfig(ConfigBase):
...     my_str_int_tuple: Tuple[str, int]
...     my_2str_int_tuple: Tuple[str, int, str]

When initializing a config object, you can pass a tuple of proper values. In addition, ablator will automatically cast them to the correct type if possible. For example:

>>> MyConfig(my_str_int_tuple=("a", 1.5), my_2str_int_tuple=("a", 1, 2))
my_str_int_tuple:
- a
- 1
my_2str_int_tuple:
- a
- 1
- '2'

Notice that my_str_int_tuple[1] is cast from 1.5 to the integer 1, and my_2str_int_tuple[2] is cast from 2 to the string '2'.

Note

The number of elements in the tuple must match the number of types specified in Tuple[]. So for the example above, my_str_int_tuple must have exactly 2 elements, and my_2str_int_tuple must have exactly 3 elements.
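The length rule in the note can be illustrated with a small standard-library sketch. It is not ablator's code: cast_tuple is a hypothetical helper, and raising ValueError on a length mismatch is an assumption made for the example (the error ablator actually raises is not stated here).

```python
from typing import Sequence, Tuple


def cast_tuple(values: Sequence[object], elem_types: Sequence[type]) -> Tuple[object, ...]:
    """Hypothetical helper: cast position i of `values` with `elem_types[i]`."""
    if len(values) != len(elem_types):
        # Assumed behaviour: reject tuples whose length differs from the declaration.
        raise ValueError(f"expected {len(elem_types)} elements, got {len(values)}")
    return tuple(t(v) for t, v in zip(elem_types, values))


# Mirrors Tuple[str, int] and Tuple[str, int, str] from the example above.
pair = cast_tuple(("a", 1.5), (str, int))
triple = cast_tuple(("a", 1, 2), (str, int, str))
```

Passing a one-element tuple where two types are declared, e.g. cast_tuple(("a",), (str, int)), fails in this sketch instead of silently padding or truncating.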

class ablator.config.types.Dict

Bases: Dict[str, T]

A class for the dictionary data type with string keys, used to declare a config attribute as a dictionary. For example, ablator defines search_space as a dictionary of SearchSpace in the config class ParallelConfig.

Examples

You can declare an attribute of type Dict as follows:

>>> @configclass
... class MyConfig(ConfigBase):
...     my_str_dict: Dict[str]
...     my_int_dict: Dict[int]
...     my_space_dict: Dict[SearchSpace]

When initializing a config object, you can pass a dictionary with keys as strings. For values, ablator will automatically cast them to the correct type if possible. For example:

>>> str_dict = {"str1": "val1", "str2": 2}
>>> int_dict = {"int1": 1, "int2": 2.5}
>>> space_dict = {"space1": SearchSpace(value_range=[0, 10], value_type='int')}
>>> MyConfig(my_str_dict=str_dict, my_int_dict=int_dict, my_space_dict=space_dict)
my_str_dict:
str1: val1
str2: '2'
my_int_dict:
int1: 1
int2: 2
my_space_dict:
    space1:
        value_range:
        - '0'
        - '10'
        categorical_values: null
        subspaces: null
        sub_configuration: null
        value_type: int
        n_bins: null
        log: false

Notice that the value at key str2 is cast to a string, and the value at key int2 is cast to an integer.
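For dictionaries, only the values are cast while the string keys are kept as they are. A minimal sketch under stated assumptions: cast_dict is a hypothetical helper, not part of ablator, and rejecting non-string keys with TypeError is an assumption based on the Dict[str, T] base shown above.

```python
from typing import Callable, Dict, Mapping, TypeVar

T = TypeVar("T")


def cast_dict(values: Mapping[object, object], value_type: Callable[[object], T]) -> Dict[str, T]:
    """Hypothetical helper: keep string keys, cast each value with `value_type`."""
    result: Dict[str, T] = {}
    for key, val in values.items():
        if not isinstance(key, str):
            # Assumed behaviour: keys must already be strings.
            raise TypeError(f"dictionary keys must be str, got {type(key).__name__}")
        result[key] = value_type(val)
    return result


# Same inputs as str_dict and int_dict in the example above.
str_values = cast_dict({"str1": "val1", "str2": 2}, str)
int_values = cast_dict({"int1": 1, "int2": 2.5}, int)
```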

class ablator.config.types.Optional

Bases: Generic[T]

A class for optional data types. It is helpful when a config attribute is optional, meaning that it can be left empty. For example, ablator defines scheduler_config as optional in the config class TrainConfig.

Examples

You can declare an attribute of type Optional as follows:

>>> @configclass
... class MyConfig(ConfigBase):
...     my_optional_list: Optional[List[str]]

When initializing a config object, you can pass a List[str] value to my_optional_list, or pass no value at all:

>>> MyConfig(my_optional_list=["a"])
my_optional_list:
- a
>>> MyConfig()
my_optional_list: null

class ablator.config.types.Enum(value)

Bases: Enum

A custom Enum class that provides additional equality and hashing methods. It is useful for creating custom data types whose values come from a fixed set. In ablator, this class is used to define Optim, which specifies the optimization direction: Optim.min or Optim.max. Optim is used in the config class ParallelConfig (optim_metrics attribute).

Examples

Create a custom Enum class by inheriting from Enum:

>>> from ablator import Enum
>>> class Color(Enum):
...     RED = 1
...     GREEN = 2
...     BLUE = 3

RED, GREEN, and BLUE form the fixed set of values for the Color type. Internally, these values are mapped to the integers 1, 2, and 3. The custom data type Color can now be used in config classes:

>>> @configclass
... class MyConfig(ConfigBase):
...     my_color: Color
>>> MyConfig(my_color=Color.RED)
my_color: 1

Methods

__eq__(self, __o: object) -> bool:

Checks for equality between the Enum instance and another object.

__hash__(self) -> int:

Calculates the hash of the Enum instance.
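To see why an Enum subclass might define both methods together, consider the standard-library sketch below. It is an illustration only, not ablator's source: the class name ValueEnum and the choice to let members compare equal to their raw values are assumptions made for the example.

```python
from enum import Enum as _Enum


class ValueEnum(_Enum):
    """Hypothetical stand-in: members also compare equal to their raw values."""

    def __eq__(self, other: object) -> bool:
        if isinstance(other, type(self)):
            return self is other
        try:
            # Look the raw value up among the members, e.g. Color(1) -> Color.RED.
            return self is type(self)(other)
        except (ValueError, TypeError):
            return False

    def __hash__(self) -> int:
        # Defining __eq__ disables the inherited __hash__, so provide one that
        # agrees with equality: a member hashes like its raw value.
        return hash(self.value)


class Color(ValueEnum):
    RED = 1
    GREEN = 2
    BLUE = 3


same_as_raw = Color.RED == 1
differs = Color.RED == 2
```

In Python, a class that overrides __eq__ without also defining __hash__ becomes unhashable, which is one reason the two methods usually appear as a pair.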

Ablator custom data types

The following data type classes are specific to the ablator framework: Derived, Stateless, and Stateful. You can wrap them around the common data types, Python primitive types, or custom classes to further modify their behavior. To learn more about these data types, see the Configuration Basics tutorial.

class ablator.config.types.Stateless

Bases: Generic[T]

This type is for attributes that can take different value assignments between experiments. To make an attribute stateless, wrap Stateless around its type definition, e.g. Stateless[List[int]] or Stateless[str].

Examples

>>> @configclass
... class MyModelConfig(ConfigBase):
...     attr: Stateless[List[int]]
>>> config = MyModelConfig(attr=[5, "6", 7.25])  # Must provide values for ``attr`` before launching experiment

Note

Unlike Derived attributes, stateless attributes must be assigned values when the config object is initialized (that is, before the experiment is launched).

class ablator.config.types.Derived

Bases: Generic[T]

This type is for attributes that are derived during the experiment (after the experiment is launched). To make an attribute derived, wrap Derived around its type definition, e.g. Derived[List[int]] or Derived[str].

Examples

Suppose you want to test how different pretrained word embeddings (e.g. word2vec 100d, word2vec 300d) affect the performance of a classification model, and you use ablator to run the ablation study. The architecture of the classification model depends on the embedding dimension of each pretrained set, so it is derived from the embeddings rather than fixed in advance. You can define a model config class as follows:

>>> @configclass
... class MyModelConfig(ModelConfig):
...     embed_dim: Derived[int]

Then you can define a model class that takes the model config as input and sets its input size from embed_dim:

>>> class MyModel(nn.Module):
...     def __init__(self, config: MyModelConfig):
...         super().__init__()
...         self.embed_dim = config.embed_dim

Finally, config_parser sets the value of the Derived attribute embed_dim based on the pretrained word embeddings:

>>> class MyLMWrapper(ModelWrapper):
...     def config_parser(self, run_config: RunConfig):
...         # vector_size is the embedding dimension; len(wv.vocab) would be the vocabulary size
...         run_config.model_config.embed_dim = self.train_dataloader.word2vec.wv.vector_size
...         return run_config

Note

When initializing config objects, you do not have to assign values to attributes that are of Derived type.

class ablator.config.types.Stateful

Bases: Generic[T]

This type is for attributes that are fixed between experiments. Attributes are assumed to be stateful by default. Unlike Derived and Stateless, which require you to wrap the attribute type explicitly, e.g. attr: Stateless[int] or attr: Derived[List[str]], a stateful attribute is declared without any wrapper, e.g. attr: int or attr: List[str].

Examples

The example below defines a model config with a stateful embedding dimension, which means the embedding dimension must be the same across all experiments (here, 100).

>>> @configclass
... class MyModelConfig(ModelConfig):
...     embed_dim: int
>>> model_config = MyModelConfig(embed_dim=100)  # Must provide values for ``embed_dim`` before launching experiment

Note

  • In contrast to Derived attributes, stateful attributes must be assigned values when the config object is initialized (that is, before the experiment is launched).

  • Stateful applies only in the context of experiments: a stateful attribute must have the same value across different runs of the same experiment configuration. Within an experiment, however, a search space can be defined on stateful attributes to run HPO on them.
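The initialization rules for the three kinds of attributes can be summarized with a small standard-library sketch. It does not use ablator: the Derived and Stateless classes below are local stand-ins for the real markers, and required_at_init and DemoConfig are hypothetical names introduced only for this illustration.

```python
from typing import Generic, List, TypeVar, get_origin, get_type_hints

T = TypeVar("T")


class Derived(Generic[T]):
    """Stand-in marker: the value is filled in after the experiment is launched."""


class Stateless(Generic[T]):
    """Stand-in marker: the value may differ between experiments."""


def required_at_init(config_cls: type) -> List[str]:
    """Hypothetical helper: names that must be assigned when the config is created."""
    hints = get_type_hints(config_cls)
    # Only Derived attributes may be left unset; Stateless attributes and
    # unwrapped (stateful) attributes both need a value up front.
    return sorted(name for name, hint in hints.items() if get_origin(hint) is not Derived)


class DemoConfig:
    embed_dim: Derived[int]  # set later, e.g. from the dataset
    log_dir: Stateless[str]  # required now, free to change between experiments
    hidden_size: int  # stateful by default, required now


needed = required_at_init(DemoConfig)
```

Here needed contains hidden_size and log_dir but not embed_dim, mirroring the notes above: only Derived attributes can be omitted at initialization.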