As of 23/04/2021, the new version 2.22.0 is out.
Standardization Improvements
- #1753 Add an option to ignore trailing and leading white spaces on CSV read
Conformance Improvements
- #1662 MappingConformanceRule now allows multiple output columns from mapping tables. This is implemented for Broadcast and GroupExplode.
Menas Improvements
- #1702 The Mandatory essentiality level of dataset properties is enhanced with the allowRun parameter. (If set to true, the SparkJobs task will still execute even if the property is not defined. Menas UI configuration validation will fail, though.)
Helper Scripts Improvements
- #1759 A stricter DRA setup in Helper scripts. The scripts now differentiate between Spark job run-time parameters for DRA and non-DRA setups. If DRA is explicitly enabled, or set to true by default, the non-DRA number of executors, memory and cores are ignored.
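The selection rule described above can be sketched in bash as follows. This is only an illustration of the described behaviour, not the actual Enceladus helper-script code: the function name resource_args and its argument order are hypothetical, and the emitted spark-submit options are an assumption about what such a script might pass on.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the actual Enceladus helper-script code.
# The function name and its arguments are hypothetical.

# Usage: resource_args EXPLICIT_DRA DEFAULT_DRA NUM_EXECUTORS \
#          EXECUTOR_MEMORY EXECUTOR_CORES DRA_EXECUTOR_MEMORY DRA_EXECUTOR_CORES
resource_args() {
  local explicit_dra="$1" default_dra="$2"
  local num_executors="$3" executor_memory="$4" executor_cores="$5"
  local dra_executor_memory="$6" dra_executor_cores="$7"

  # An explicit setting wins; otherwise fall back to the configured default.
  local dra_enabled="${explicit_dra:-$default_dra}"

  if [ "$dra_enabled" = "true" ]; then
    # DRA setup: the non-DRA executors, memory and cores are ignored.
    echo "--conf spark.dynamicAllocation.enabled=true" \
         "--executor-memory $dra_executor_memory" \
         "--executor-cores $dra_executor_cores"
  else
    # Non-DRA setup: a fixed number of executors with their own memory and cores.
    echo "--num-executors $num_executors" \
         "--executor-memory $executor_memory" \
         "--executor-cores $executor_cores"
  fi
}

# No explicit flag, default is true: the DRA values are used.
resource_args "" "true" 4 "8g" 2 "4g" 4
# Explicitly disabled: the non-DRA values are used.
resource_args "false" "true" 4 "8g" 2 "4g" 4
```

Keeping the two parameter sets separate means a job switched to DRA cannot silently inherit a fixed executor count that was tuned for a non-DRA run.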
Project Testing
- #1625 Test data can now be populated into HDFS and MongoDB by bash scripts, since the data are part of the codebase. Tests can be run from the codebase with test JSONs using Hermes 0.3.1+.
New Helper Scripts Configuration
Add/Change in the enceladus_env.sh/_enceladus_env.cmd file of your deployment:
- STD_DEFAULT_DRA_EXECUTOR_MEMORY: default value for the DRA_EXECUTOR_MEMORY parameter in Standardization and combined Standardization&Conformance jobs.
- STD_DEFAULT_DRA_EXECUTOR_CORES: default value for the DRA_EXECUTOR_CORES parameter in Standardization and combined Standardization&Conformance jobs.
- CONF_DEFAULT_DRA_EXECUTOR_MEMORY: default value for the DRA_EXECUTOR_MEMORY parameter in Conformance jobs.
- CONF_DEFAULT_DRA_EXECUTOR_CORES: default value for the DRA_EXECUTOR_CORES parameter in Conformance jobs.
- STD_DEFAULT_DRA_ENABLED: set to true to enable DRA as default for Standardization and combined Standardization&Conformance jobs. Recommended.
- CONF_DEFAULT_DRA_ENABLED: set to true to enable DRA as default for Conformance jobs. Recommended.
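As a rough illustration, the new settings listed above might look like this in enceladus_env.sh. The memory and core values are placeholders chosen for the example, not recommended or shipped defaults, and the plain assignment style is an assumption; follow whatever style your existing file uses.

```shell
# Hypothetical example values; tune them for your cluster.

# Standardization and combined Standardization&Conformance jobs
STD_DEFAULT_DRA_ENABLED=true
STD_DEFAULT_DRA_EXECUTOR_MEMORY="4g"
STD_DEFAULT_DRA_EXECUTOR_CORES=4

# Conformance jobs
CONF_DEFAULT_DRA_ENABLED=true
CONF_DEFAULT_DRA_EXECUTOR_MEMORY="4g"
CONF_DEFAULT_DRA_EXECUTOR_CORES=4
```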