As of 23 March 2023, the new version 3.0.0 is out.

Major changes

  • #2037 Enceladus now operates and is built on Spark 3.2 and Scala 2.12.
  • #2152 Removed hardcoded dependency on Spline. Added options to Helper scripts to run Spline codeless.
  • #1612 The web part of the application was split into UI (Menas) and Rest API (RestAPI).
  • #1693 New Rest API endpoints; the old endpoints remain functional but are considered deprecated.
  • #2118 Oozie has been removed from the project.

Standardization Improvements

  • #417 Removed the extra code that was needed before spark-xml handled empty arrays.

Standardization & Conformance Improvements

  • #2037 Switched to use Spark 3.2.
  • #2159 “Run number” is now added to the additional info of the _INFO file.
  • #2126 Added a wrapper on top of the HDFS services, now called EnceladusFileSystem services. The service can use either the none or the hdfs implementation. The environment variable HADOOP_CONF_DIR is no longer used; instead, the property enceladus.rest.hadoop.conf.dir is defined and can be set to none or hdfs.
  • #2072 A user can specify a list of HTTP error status codes that are allowed to be retried when a REST API call yields such an HTTP error. Currently supported HTTP error status codes: 401, 403, and 404.
  • #2121 A random (quadratically + linearly randomized) wait has been added between retries of the same URL and between switching URLs on calling the REST API.
  • #2165 An experimental usage of Standardization/Conformance/Standardization & Conformance as an embedded library was spotted. Previously, multiple such jobs could not run within one session; now they can.
  • #2105 Configuration keys starting with menas.rest have been changed to enceladus.rest.
  • #2117 The enceladus.menas.uri property has been added and is used when logging the Menas run URI.
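
The retry behaviour described in #2072 and #2121 can be pictured with a minimal sketch. The notes do not state the actual wait formula, so the function names, the base delay, and the exact combination of the quadratic and linear terms below are assumptions made for illustration only; they are not taken from the Enceladus code.

```python
import random

# Status codes listed under #2072 as retryable when the user allows them.
RETRYABLE_STATUSES = {401, 403, 404}


def is_retryable(status, allowed=RETRYABLE_STATUSES):
    """Return True if the HTTP status is in the user-allowed retry list."""
    return status in allowed


def retry_wait_seconds(attempt, base=0.5, rng=random):
    """Hypothetical wait before retry number `attempt` (1-based).

    Assumed shape: a quadratic term that grows with the attempt number,
    plus a random component whose upper bound grows linearly.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    quadratic = base * attempt ** 2
    linear_jitter = rng.uniform(0, base * attempt)
    return quadratic + linear_jitter
```

With these assumed constants, the third retry would wait between 4.5 and 6.0 seconds. The random component keeps several clients that failed at the same moment from retrying in lockstep.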

Menas Improvements

  • #2095 The order of the landing page buttons has been changed to conform to the order of items in the left-side menu.

RestAPI Improvements

  • #601 Added swagger API documentation for the Rest API.
  • #1693 Rest API V3 added under /api-v3/{datasets|schemas|mapping-tables|property-definitions}/. A schema is now also updatable via the entity payload (not only via attachment upload). The API includes checks on entities and is intended to be used externally. It is now truly RESTful (endpoint naming, structure, methods, Location header on creating responses). Also added are /api-v3/{datasets|schemas|mapping-tables|property-definitions/datasets}/{name}/used-in and /api-v3/{datasets|schemas|mapping-tables|property-definitions/datasets}/{name}/{version}/used-in; the former is used by disable requests to check dependencies. Entities are now disabled and enabled entirely (all versions). UsedIn normalization introduced.
  • #2060 Some potentially large container-listing endpoints are now paginated, with offset (default: 0) and limit (default: 20). Specifically, pagination is added to: GET of /api-v3/{datasets|schemas|mapping-tables|property-definitions/datasets} and /api-v3/runs[/{datasetName}[/{datasetVersion}]].
  • #2160 The SpringFox implementation has been replaced by the newer SpringDoc, which brings OpenAPI 3 (can be generated from annotations), a newer Swagger with more info, examples, and working JWT (/api/login is part of OpenAPI and can be used in Swagger UI, too).
  • #2162 Dataset conformance is now optional; if it is not defined, an empty list is assigned as the default value.
  • #2131 Usage of the CSRF token and manual persisting of the JWT in cookies on the frontend side have been removed.
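
As an illustration of the pagination introduced in #2060, the following sketch builds page URLs and walks through all pages. It assumes that offset and limit are passed as query parameters and that a page is returned as a plain list of items; neither detail is confirmed by these notes. The host name and the fetch_page callable are hypothetical.

```python
from urllib.parse import urlencode


def page_url(base_url, resource, offset=0, limit=20):
    """Build a paginated listing URL; defaults mirror those given in #2060."""
    query = urlencode({"offset": offset, "limit": limit})
    return f"{base_url.rstrip('/')}/api-v3/{resource}?{query}"


def iterate_all(fetch_page, base_url, resource, limit=20):
    """Yield every item by requesting consecutive pages.

    `fetch_page` is a caller-supplied function (hypothetical) that takes a
    URL and returns the list of items on that page.
    """
    offset = 0
    while True:
        items = fetch_page(page_url(base_url, resource, offset, limit))
        yield from items
        if len(items) < limit:  # a short page means there is nothing more
            break
        offset += limit
```

For example, page_url("http://localhost:8080", "datasets") produces http://localhost:8080/api-v3/datasets?offset=0&limit=20.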

Standardization Fixes

  • Spark Data Standardization: #38 In some rare cases, processing of arrays that contained structs with a column having a numeric name could fail. That has been addressed now.

Conformance Fixes

  • #2112 Behaviour of the and/or filter combiner in the case of two OrJoinedFilters/AndJoinedFilters has been fixed.

Examples and documentation changes

  • #2109 The Examples README pointed to wrong paths; fixed.
  • #2135 Added example calls for each supported rest-api endpoint; supported are both v2 and v3 endpoints.
  • #2143 Code test coverage tooling has been switched to the jacoco plugin.
  • #2172 Updated hermes json files to be compatible with Spark 3 and other changes.
  • #2145 Removed the examples module from the Enceladus release. The folder remains, but contains mostly data, no project.
  • #2129 In the example data, a new variable was added to store the file content before sending it with the curl command, which solves an error on Windows.

Other overall or internal changes

  • #2111 Renamed Menas API mentions to REST API.
  • #1816 UDF registration does not allow registration for multiple SparkSessions.

Configuration changes

Too many to list; they affect spark-jobs, web applications, and helper scripts.
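
One of the changes, the rename of keys starting with menas.rest to enceladus.rest (#2105), lends itself to a mechanical update of existing property files. The helper below is a hypothetical sketch, not a tool shipped with Enceladus; it assumes a simple key=value properties format, and the key names in the usage note are illustrative.

```python
def migrate_keys(lines, old_prefix="menas.rest", new_prefix="enceladus.rest"):
    """Return the lines with the old key prefix replaced by the new one.

    Only lines whose key starts with `old_prefix` are changed; comments
    and unrelated keys are passed through untouched.
    """
    migrated = []
    for line in lines:
        stripped = line.lstrip()
        if stripped.startswith(old_prefix):
            indent = line[: len(line) - len(stripped)]
            line = indent + new_prefix + stripped[len(old_prefix):]
        migrated.append(line)
    return migrated
```

Applied to ["menas.rest.uri=http://host", "# menas.rest.uri"], it returns ["enceladus.rest.uri=http://host", "# menas.rest.uri"]. Other configuration changes in this release still need to be reviewed by hand.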

Dependencies upgraded

A non-exhaustive list of dependent libraries that were added or whose versions were upgraded, and that could affect how the application functions.

Scala 2.12
Spark 3.2.2
Abris 6.2.0
Atum 3.9.0
Cobrix 2.6.0
Spark-Data-Standardization 0.2.0
Absa-Spark-Commons 0.4.0
Absa-Common 1.1.0
Jackson 2.14.1
OpenAPI 3 Library for spring-boot 1.6.14