rialto.jobs package
Submodules
rialto.jobs.decorators module
- rialto.jobs.decorators.config_parser(cf_getter: Callable) Callable[source]
Config parser functions decorator.
Registers a config parsing function into a rialto job prerequisite. You can then request the job via job function arguments.
- Parameters:
cf_getter – dataset reader function
- Returns:
raw function, unchanged
- rialto.jobs.decorators.datasource(ds_getter: Callable) Callable[source]
Dataset reader functions decorator.
Registers a data-reading function into a rialto job prerequisite. You can then request the job via job function arguments.
- Parameters:
ds_getter – dataset reader function
- Returns:
raw reader function, unchanged
- rialto.jobs.decorators.job(*args, custom_name=None, disable_version=False)[source]
Rialto jobs decorator.
Transforms a python function into a rialto transformation, which can be imported and ran by Rialto Runner. Is mainly used as @job and the function’s name is used, and the outputs get automatic. To override this behavior, use @job(custom_name=XXX, disable_version=True).
- Parameters:
*args –
list of positional arguments. Empty in case custom_name or disable_version is specified.
custom_name – str for custom job name.
disable_version – bool for disabling automatically filling the VERSION column in the job’s outputs.
- Returns:
One more job wrapper for run function (if custom name or version override specified). Otherwise, generates Rialto Transformation Type and returns it for in-module registration.
rialto.jobs.job_base module
- class rialto.jobs.job_base.JobBase[source]
Bases:
TransformationA Base Class for Rialto Jobs. Serves as a foundation into which the @job decorators are inserted.
- abstractmethod get_custom_callable() Callable[source]
Getter - Custom callable (i.e. job transformation function)
- run(reader: TableReader, run_date: date, spark: SparkSession = None, config: PipelineConfig = None, metadata_manager: MetadataManager = None, feature_loader: PysparkFeatureLoader = None) DataFrame[source]
Rialto transformation run
- Parameters:
reader – data store api object
info_date – date
spark – spark session
config – pipeline config
- Returns:
dataframe
rialto.jobs.moduleregister module
- class rialto.jobs.module_register.ModuleRegister[source]
Bases:
objectModule register. Class which is used by @datasource and @config_parser decorators to register callables / getters.
Resolver, when searching for a getter for f() defined in module M, uses find_callable(“f”, “M”).
- classmethod add_callable_to_module(callable, parent_name)[source]
Add a callable to the specified module’s storage.
- Parameters:
callable – The callable to be added.
parent_name – The name of the module to which the callable is added.
- classmethod find_callable(callable_name, module_name)[source]
Find a callable by its name in the specified module and its dependencies.
- Parameters:
callable_name – The name of the callable to find.
module_name – The name of the module to search in.
- Returns:
The found callable or None if not found.
- classmethod register_callable(callable)[source]
Register a callable by adding it to the module’s storage.
- Parameters:
callable – The callable to be registered.
rialto.jobs.resolver module
- class rialto.jobs.resolver.Resolver[source]
Bases:
objectResolver handles dependency management between datasets and jobs.
We register different callables, which can depend on other callables. Calling resolve() we attempt to resolve these dependencies.
- register_getter(callable: Callable, name: str = None) str[source]
Register callable with a given name for later resolution.
In case name isn’t present, function’s __name__ attribute will be used.
- Parameters:
callable – callable to register (getter)
name – str, custom name, f.__name__ will be used otherwise
- Returns:
str, name under which the callable has been registered
- register_object(object: Any, name: str) None[source]
Register an object with a given name for later resolution.
- Parameters:
object – object to register (getter)
name – str, custom name
- Returns:
None
- resolve(callable: Callable) Dict[str, Any][source]
Take a callable and resolve its dependencies / arguments. Arguments can be a) objects registered via register_object b) callables registered via register_getter c) ModuleRegister registered callables via ModuleRegister.register_callable (+ dependencies)
Arguments are resolved recursively according to requirements; For example, if we have a(b, c), b(d), and c(), d() registered, then we recursively call resolve() methods until we resolve c, d -> b -> a
- Parameters:
callable – function to resolve
- Returns:
result of the callable