What are built-in plugins

Built-in plugins provide additional, relatively elementary functionality and also serve as examples of how plugins are written. Unlike externally created plugins, they are automatically included in the SparkJobs.jar file and therefore don't need to be added using the --jars option.

Existing built-in plugins

The plugin class name is specified separately for Standardization and Conformance because some plugins need to run during only one of these jobs. Plugin class name keys have numeric suffixes (.1 in the examples below). The numeric suffix specifies the order in which plugins are invoked; it should always start at 1 and be incremented by 1 without gaps.
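
For example, to invoke two control-metrics plugins during Standardization, number them consecutively. A minimal sketch, where com.example.plugins.MyMetricsPlugin stands in for a hypothetical second plugin (KafkaInfoPlugin is described below):

# Invoked first
standardization.plugin.control.metrics.1=za.co.absa.enceladus.plugins.builtin.controlinfo.mq.kafka.KafkaInfoPlugin
# Invoked second (hypothetical external plugin, shown for illustration only)
standardization.plugin.control.metrics.2=com.example.plugins.MyMetricsPlugin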

KafkaInfoPlugin

The purpose of this plugin is to send control measurements to a Kafka topic each time a checkpoint is reached or the job status changes. This helps monitor production issues and react to errors as quickly as possible. Control measurements are sent in Avro format and the schema is automatically registered in the configured Schema Registry.

To enable this built-in plugin, provide the following configuration settings in application.conf:

standardization.plugin.control.metrics.1=za.co.absa.enceladus.plugins.builtin.controlinfo.mq.kafka.KafkaInfoPlugin
conformance.plugin.control.metrics.1=za.co.absa.enceladus.plugins.builtin.controlinfo.mq.kafka.KafkaInfoPlugin
kafka.schema.registry.url="http://127.0.0.1:8081"
kafka.bootstrap.servers="127.0.0.1:9092"
kafka.info.metrics.client.id="controlInfo"
kafka.info.metrics.topic.name="control.info"
# Optional security settings
#kafka.security.protocol="SASL_SSL"
#kafka.sasl.mechanism="GSSAPI"
# Optional Schema Registry Security Parameters
#kafka.schema.registry.basic.auth.credentials.source=USER_INFO
#kafka.schema.registry.basic.auth.user.info=user:password
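
If control measurements should be published during only one of the jobs, register the plugin for that job alone. A minimal sketch, assuming the measurements are only needed during Conformance:

# Only Conformance publishes control measurements; the standardization.plugin.control.metrics key is simply omitted
conformance.plugin.control.metrics.1=za.co.absa.enceladus.plugins.builtin.controlinfo.mq.kafka.KafkaInfoPlugin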

KafkaErrorSenderPlugin

The purpose of this plugin is to send errors to a Kafka topic.

To enable this built-in plugin, provide the following configuration settings in application.conf:

standardization.plugin.postprocessor.1=za.co.absa.enceladus.plugins.builtin.errorsender.mq.kafka.KafkaErrorSenderPlugin
conformance.plugin.postprocessor.1=za.co.absa.enceladus.plugins.builtin.errorsender.mq.kafka.KafkaErrorSenderPlugin
# Example values; set the client id and topic name to match your environment
kafka.schema.registry.url="http://127.0.0.1:8081"
kafka.bootstrap.servers="127.0.0.1:9092"
kafka.error.client.id="errorSender"
kafka.error.topic.name="errors"