rialto.runner package

Subpackages

Submodules

rialto.runner.engine module

class rialto.runner.engine.RunnerEngine(services: RunnerServices, rerun: bool, skip_dependencies: bool)[source]

Bases: object

Orchestrates pipeline execution lifecycle and task tracking

check_tasks() None[source]

Check task completion and dependency status

debug_first_task(op: str = None) DataFrame[source]

Debug mode: execute first task and return result

dry_run_execution(op: str = None) None[source]

Execute pre-run checks without task execution

finalize() None[source]

Send final reports via mail/bookkeeping

log_task_status() None[source]

Log summary of task statuses

register_tasks(pipelines: List[PipelineConfig]) None[source]

Register tasks for all pipelines and date combinations

run(op: str = None) None[source]

Execute all tasks

run_tasks() None[source]

Execute runnable tasks with per-task error isolation

select_pipelines(op: str = None) List[PipelineConfig][source]

Select pipelines to run based on operation name

rialto.runner.runner module

class rialto.runner.runner.Runner(spark: SparkSession, config_path: str, run_date: str = None, rerun: bool = False, op: str = None, skip_dependencies: bool = False, overrides: Dict = None, merge_schema: bool = False, services: RunnerServices = None)[source]

Bases: object

Entry point for pipeline execution orchestration (beginner-friendly API)

dry_run()[source]

Dry run - log status of pipelines without executing

rialto.runner.runner_services module

class rialto.runner.runner_services.DefaultRunnerServices[source]

Bases: object

Factory for default Runner services composition

static build(spark: SparkSession, config_path: str, run_date: str = None, merge_schema: bool = False, overrides: dict = None) RunnerServices[source]

Build default services for Runner.

Parameters:
  • spark – SparkSession instance

  • config_path – Path to pipeline configuration YAML

  • run_date – Override run date (optional)

  • merge_schema – Enable schema merging in writer

  • overrides – Configuration overrides

Returns:

RunnerServices bundle

class rialto.runner.runner_services.RunnerServices(config: PipelinesConfig, date_manager: DateManager, writer: DatabricksWriter, data_checker: DataChecker, task_checker: TaskStatusChecker, registry: TaskRegistry, executor: PipelineExecutor, tracker: Tracker)[source]

Bases: object

Bundle of collaborator services for Runner orchestration

config: PipelinesConfig
data_checker: DataChecker
date_manager: DateManager
executor: PipelineExecutor
registry: TaskRegistry
task_checker: TaskStatusChecker
tracker: Tracker
writer: DatabricksWriter

rialto.runner.transformation module

class rialto.runner.transformation.Transformation[source]

Bases: object

Interface for feature implementation

abstractmethod run(reader: DataReader, run_date: date, spark: SparkSession = None, config: PipelineConfig = None, metadata_manager: MetadataManager = None, feature_loader: PysparkFeatureLoader = None) DataFrame[source]

Run the transformation

Parameters:
  • reader – data store api object

  • run_date – date

  • spark – spark session

  • config – pipeline config

  • metadata_manager – metadata manager

  • feature_loader – feature loader

Returns:

dataframe

rialto.runner.utils module

rialto.runner.utils.find_dependency(config: PipelineConfig, name: str)[source]

Get dependency from config

Parameters:
  • config – Pipeline configuration

  • name – Dependency name

Returns:

Dependency object

Module contents