Standardization and Conformance support plugins that execute additional actions at certain points of the computation.

A plugin can be developed externally. In that case, the plugin jar must be supplied to spark-submit using the --jars option. You can also use built-in plugins by enabling them in application.conf or by passing configuration directly to spark-submit.

Plugins are created through factories. A plugin factory (a class that implements PluginFactory) overrides the apply method. Standardization and Conformance invoke this method when the job starts, passing a configuration that includes all settings from application.conf plus any settings passed to the JVM via spark-submit. The factory then instantiates a plugin and returns it to the caller. If the factory throws an exception, the Spark application (Standardization or Conformance) is stopped. If the factory returns null, the application logs an error but continues to run.
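The factory contract described above can be illustrated with a minimal, self-contained Java sketch. It does not use the real Enceladus types: the nested Plugin and PluginFactory interfaces, the Map-based configuration, the loadPlugin helper, and the configuration key are all hypothetical stand-ins chosen so the example runs on its own. It only shows the two outcomes described in the text: an exception from the factory propagates to the caller, and a null result is logged while the job continues.

```java
import java.util.Map;
import java.util.Optional;

class PluginLoadingSketch {
    /** Hypothetical stand-in for the real plugin type. */
    interface Plugin {}

    /** Hypothetical stand-in for PluginFactory; a Map replaces the real configuration object. */
    interface PluginFactory {
        Plugin apply(Map<String, String> config);
    }

    /**
     * Mimics the startup behavior described in the text:
     * an exception from the factory is not caught, so the job would stop;
     * a null result is logged as an error and the job carries on without a plugin.
     */
    static Optional<Plugin> loadPlugin(PluginFactory factory, Map<String, String> config) {
        Plugin plugin = factory.apply(config); // may throw; deliberately not caught
        if (plugin == null) {
            System.err.println("ERROR: plugin factory returned null, continuing without the plugin");
            return Optional.empty();
        }
        return Optional.of(plugin);
    }

    public static void main(String[] args) {
        // Hypothetical configuration key, used only for illustration.
        Map<String, String> config = Map.of("example.plugin.enabled", "true");

        PluginFactory working = conf -> new Plugin() {};
        PluginFactory returnsNull = conf -> null;

        System.out.println(loadPlugin(working, config).isPresent());     // true
        System.out.println(loadPlugin(returnsNull, config).isPresent()); // false
    }
}
```

In this sketch a factory that cannot build its plugin has two options: throw to abort the job, or return null to let the job run without the plugin.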

Only one type of plugin is currently supported:

Control Metrics Plugins

Control metrics plugins allow execution of additional actions whenever a checkpoint is created or the job status changes. To write such a plugin for Enceladus, implement the ControlMetricsPlugin and ControlMetricsPluginFactory interfaces.

Control metrics plugins are invoked each time the job status changes (e.g. from running to succeeded) or when a checkpoint is reached. A checkpoint is an Atum concept used to ensure the accuracy and completeness of data. A checkpoint is created at the end of Standardization and of Conformance, and after each conformance rule that is configured to create control measurements. Each time a checkpoint is created, the onCheckpoint() callback is called with an instance of the control measurements. It is up to the plugin to decide what to do with them. All exceptions thrown from a plugin are logged, but the Spark application continues to run.
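The isolation of plugin failures described above can be sketched as follows. This is an illustration under stated assumptions, not the Enceladus implementation: the nested ControlMetricsPlugin interface uses a simplified, hypothetical onCheckpoint(String) signature (the real callback receives control measurements), and notifyCheckpoint is a made-up helper standing in for the job's dispatch logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CheckpointDispatchSketch {
    /** Simplified, hypothetical callback; a String stands in for the control measurements. */
    interface ControlMetricsPlugin {
        void onCheckpoint(String measurements);
    }

    /**
     * Calls every plugin for one checkpoint. A plugin that throws is logged
     * and skipped, so the remaining plugins and the job itself keep running.
     * Returns the number of plugins that failed.
     */
    static int notifyCheckpoint(List<ControlMetricsPlugin> plugins, String measurements) {
        int failures = 0;
        for (ControlMetricsPlugin plugin : plugins) {
            try {
                plugin.onCheckpoint(measurements);
            } catch (Exception e) {
                failures++;
                System.err.println("Plugin failed on checkpoint: " + e.getMessage());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();

        ControlMetricsPlugin failing = m -> { throw new IllegalStateException("boom"); };
        ControlMetricsPlugin recording = m -> received.add(m);

        int failures = notifyCheckpoint(Arrays.asList(failing, recording), "checkpoint-1");

        System.out.println(failures); // 1
        System.out.println(received); // [checkpoint-1]
    }
}
```

The second plugin still receives the checkpoint even though the first one threw, which mirrors the behavior described in the text: a faulty plugin cannot bring down the job.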