rialto.common package
Submodules
rialto.common.table_reader module
- class rialto.common.table_reader.DataReader[source]
- Bases: object
- This is an abstract class defining the interface for a reader of Spark tables.
- The data reader provides two public functions, get_latest and get_table: get_latest reads a single snapshot of the given table, while get_table reads the whole table or multiple snapshots.
- abstract get_latest(table: str, date_column: str, date_until: date | None = None, uppercase_columns: bool = False) → DataFrame[source]
- Get the latest available date partition of the table up to the specified date
- Parameters:
- table – input table path 
- date_until – Optional until date (inclusive)
- date_column – column to filter dates on, takes highest priority
- uppercase_columns – Option to refactor all column names to uppercase 
 
- Returns:
- Dataframe 
 
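For intuition, a minimal sketch of the get_latest semantics, written in plain PySpark under the assumption of a date-partitioned table; the table path and the snapshot_date column are made up for the example and are not part of the library:

```python
from datetime import date

import pyspark.sql.functions as F
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical date-partitioned table; "snapshot_date" plays the role of date_column.
df = spark.read.table("catalog.schema.transactions")

# Drop partitions newer than date_until (inclusive bound).
date_until = date(2023, 6, 30)
df = df.filter(F.col("snapshot_date") <= date_until)

# Keep only rows belonging to the most recent remaining partition.
latest = df.agg(F.max("snapshot_date")).collect()[0][0]
latest_snapshot = df.filter(F.col("snapshot_date") == F.lit(latest))
```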
- abstract get_table(table: str, date_column: str, date_from: date | None = None, date_to: date | None = None, uppercase_columns: bool = False) → DataFrame[source]
- Get a whole table or a slice by selected dates
- Parameters:
- table – input table path 
- date_from – Optional date from (inclusive) 
- date_to – Optional date to (inclusive)
- date_column – column to filter dates on, takes highest priority
- uppercase_columns – Option to refactor all column names to uppercase 
 
- Returns:
- Dataframe 
 
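And a corresponding sketch of the get_table slice: both bounds are optional and inclusive, and uppercase_columns amounts to renaming every column to uppercase (table and column names are again illustrative):

```python
from datetime import date

import pyspark.sql.functions as F
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.read.table("catalog.schema.transactions")  # hypothetical table path

date_from, date_to = date(2023, 1, 1), date(2023, 6, 30)
if date_from is not None:
    df = df.filter(F.col("snapshot_date") >= date_from)  # inclusive lower bound
if date_to is not None:
    df = df.filter(F.col("snapshot_date") <= date_to)    # inclusive upper bound

# uppercase_columns=True corresponds to a rename like this:
df = df.select([F.col(c).alias(c.upper()) for c in df.columns])
```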
 
- class rialto.common.table_reader.TableReader(spark: SparkSession)[source]
- Bases: DataReader
- An implementation of the data reader for Databricks tables.
- get_latest(table: str, date_column: str, date_until: date | None = None, uppercase_columns: bool = False) → DataFrame[source]
- Get the latest available date partition of the table up to the specified date
- Parameters:
- table – input table path 
- date_until – Optional until date (inclusive) 
- date_column – column to filter dates on, takes highest priority 
- uppercase_columns – Option to refactor all column names to uppercase 
 
- Returns:
- Dataframe 
 
- get_table(table: str, date_column: str, date_from: date | None = None, date_to: date | None = None, uppercase_columns: bool = False) → DataFrame[source]
- Get a whole table or a slice by selected dates
- Parameters:
- table – input table path 
- date_from – Optional date from (inclusive) 
- date_to – Optional date to (inclusive) 
- date_column – column to filter dates on, takes highest priority 
- uppercase_columns – Option to refactor all column names to uppercase 
 
- Returns:
- Dataframe 
 
 
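Putting it together, a usage sketch with TableReader; the Spark session setup, table path, and date column are placeholders for whatever exists in your workspace:

```python
from datetime import date

from pyspark.sql import SparkSession
from rialto.common.table_reader import TableReader

spark = SparkSession.builder.getOrCreate()
reader = TableReader(spark)

# Latest snapshot available on or before 2023-06-30.
latest = reader.get_latest(
    table="catalog.schema.transactions",
    date_column="snapshot_date",
    date_until=date(2023, 6, 30),
)

# All snapshots from January through June 2023, with uppercased column names.
history = reader.get_table(
    table="catalog.schema.transactions",
    date_column="snapshot_date",
    date_from=date(2023, 1, 1),
    date_to=date(2023, 6, 30),
    uppercase_columns=True,
)
```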
rialto.common.utils module
- rialto.common.utils.cast_decimals_to_floats(df: DataFrame) → DataFrame[source]
- Find all decimal types in the table and cast them to floats. Fixes errors in .toPandas() conversions.
- Parameters:
- df – input df 
- Returns:
- pyspark DataFrame with fixed types 
 
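A usage sketch, with a toy DataFrame built only to show the decimal-to-float conversion ahead of .toPandas():

```python
import pyspark.sql.functions as F
from pyspark.sql import SparkSession
from rialto.common.utils import cast_decimals_to_floats

spark = SparkSession.builder.getOrCreate()

# Toy frame with a DecimalType column, created purely for illustration.
df = spark.createDataFrame([(1, "19.99"), (2, "5.50")], ["id", "price"])
df = df.withColumn("price", F.col("price").cast("decimal(10, 2)"))

fixed = cast_decimals_to_floats(df)  # "price" becomes a float column
pdf = fixed.toPandas()               # avoids decimal-related conversion issues
```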
- rialto.common.utils.get_caller_module() → Any[source]
- Get the module containing the function which is calling your function.
- Inspects the call stack, where:
  - the 0th entry is this function
  - the 1st entry is the function which needs to know who called it
  - the 2nd entry is the calling function
- Therefore, we return the module which contains the function at the 2nd place on the stack.
- Returns:
- Python Module containing the calling function.
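To make the stack positions concrete, here is a sketch of how such a lookup can be written with the standard inspect module; it mirrors the description above but is not necessarily the library's exact implementation:

```python
import inspect
from types import ModuleType
from typing import Optional


def get_caller_module_sketch() -> Optional[ModuleType]:
    """Return the module of whoever called the function that called us."""
    stack = inspect.stack()
    # stack[0] is this function,
    # stack[1] is the function that wants to know who called it,
    # stack[2] is that caller; return the module defining the caller's frame.
    return inspect.getmodule(stack[2].frame)
```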