Solutions overview
This chapter describes solutions for designing pipelines for common use cases:
- Converting data to the Parquet data format
- Automating Impala metadata updates for the Drift Synchronization Solution for Hive
- Managing output files
- Stopping a pipeline after processing all available data
- Offloading data from relational sources to Hadoop
- Sending email during pipeline processing
- Preserving an audit trail of events
- Loading data into Databricks Delta Lake
- Drift Synchronization Solution for Hive
- Drift Synchronization Solution for PostgreSQL