rialto.runner package
Subpackages
- rialto.runner.reporting package
- rialto.runner.services package
- Submodules
- rialto.runner.services.config_loader module
- rialto.runner.services.config_overrides module
- rialto.runner.services.data_checker module
- rialto.runner.services.date_manager module
- rialto.runner.services.executor module
- rialto.runner.services.result_mapper module
- rialto.runner.services.table module
- rialto.runner.services.task_registry module
- rialto.runner.services.task_status_checker module
- rialto.runner.services.writer module
- Module contents
Submodules
rialto.runner.engine module
- class rialto.runner.engine.RunnerEngine(services: RunnerServices, rerun: bool, skip_dependencies: bool)[source]
Bases:
objectOrchestrates pipeline execution lifecycle and task tracking
- debug_first_task(op: str = None) DataFrame[source]
Debug mode: execute first task and return result
rialto.runner.runner module
- class rialto.runner.runner.Runner(spark: SparkSession, config_path: str, run_date: str = None, rerun: bool = False, op: str = None, skip_dependencies: bool = False, overrides: Dict = None, merge_schema: bool = False, services: RunnerServices = None)[source]
Bases:
objectEntry point for pipeline execution orchestration (beginner-friendly API)
rialto.runner.runner_services module
- class rialto.runner.runner_services.DefaultRunnerServices[source]
Bases:
objectFactory for default Runner services composition
- static build(spark: SparkSession, config_path: str, run_date: str = None, merge_schema: bool = False, overrides: dict = None) RunnerServices[source]
Build default services for Runner.
- Parameters:
spark – SparkSession instance
config_path – Path to pipeline configuration YAML
run_date – Override run date (optional)
merge_schema – Enable schema merging in writer
overrides – Configuration overrides
- Returns:
RunnerServices bundle
- class rialto.runner.runner_services.RunnerServices(config: PipelinesConfig, date_manager: DateManager, writer: DatabricksWriter, data_checker: DataChecker, task_checker: TaskStatusChecker, registry: TaskRegistry, executor: PipelineExecutor, tracker: Tracker)[source]
Bases:
objectBundle of collaborator services for Runner orchestration
- config: PipelinesConfig
- data_checker: DataChecker
- date_manager: DateManager
- executor: PipelineExecutor
- registry: TaskRegistry
- task_checker: TaskStatusChecker
- writer: DatabricksWriter
rialto.runner.transformation module
- class rialto.runner.transformation.Transformation[source]
Bases:
objectInterface for feature implementation
- abstractmethod run(reader: DataReader, run_date: date, spark: SparkSession = None, config: PipelineConfig = None, metadata_manager: MetadataManager = None, feature_loader: PysparkFeatureLoader = None) DataFrame[source]
Run the transformation
- Parameters:
reader – data store api object
run_date – date
spark – spark session
config – pipeline config
metadata_manager – metadata manager
feature_loader – feature loader
- Returns:
dataframe