Skip to content

Ash/dataclasses instead of argdicts

Andrey Sheshukov requested to merge ash/Dataclasses_instead_of_argdicts into main

Idea of this work is to try to get away from arg_dicts in each class, and implement the configuration parameters as input arguments to __init__ method.

Advantages:

  1. Explicit constructor of each class object, instead of calling configure in order to make a valid object.This means we can easily do unit testing. Using explicit standard python paradigms (passing arguments to c-tor, instead of configuring later) makes it easier to understand for a new user.
  2. Type hints and documentation right here in the code. This could be useful for type checking in IDEs (if you use one) and with external checkers like mypy
  3. Making custom wrappers for the type annotations (Parameter, Vector,... classes) we can get absolutely similar behaviour with generating command-line arguments to the parser, so this is backward-compatible change.

Disadvantages:

  1. It's hard to include all the input parameters from the parent class (like ToyGen including all arameters from PrimayGeneratorBase). But IMO this is not a good practice in general. We need to think more about ways to improve configuration.

UPD: it is straightforward. Subclassing dataclass inherits all the input variables. Let's do this for now for backward compatibility (but I don't like it)

Summary of changes

  • arg_dicts in (most of)n the classes are removed. These configuration arguments become input arguments for the constructor. It allows to easily store argument type (as type annotation) and default value
  • To store additional information, needed for the argparser, I'm using typing.Annotated class - you can store anything in addition to the base type, and all static type checkers (IDE, mypy) will consider it just a base type.
  • Added convenience functions: Parameter, Vector, Vector3,Choice, Flag - to easily create the corresponding Annotated types.
  • Changing BaseConfig.add_args method - to produce input arguments, using the arguments of the constructor (that means, it should work not only on the dataclass, but on any other).
  • Added BaseConfig.from_configuration class method for constructing object given the namespace.
  • Added a convenient BaseConfig.logger, which will automatically invoke correct logger for all the subclasses (maybe this should go to separate MR)
  • In the main file ntsim.ntsim_2.py I am trying to keep the same structure as before. But:
    • Moved all the Factories to a single dict of NTSim._factories, instead of keeping them separately in the class members. This means we can avoid looping over dir(self) trying to select only the factories, no need for these checks.
    • Avoid storing Blueprints (unless we failed to create the object from blueprint, like in the SensitiveDetector case)
  • Transfer all classes from arg_dicts to type annotations
Edited by Andrey Sheshukov

Merge request reports