Solutions overview
This chapter describes solutions for designing pipelines for common use cases:
- Converting data to the Parquet data format
- Automating Impala metadata updates for the Drift Synchronization Solution for Hive
- Managing output files
- Stopping a pipeline after processing all available data
- Offloading data from relational sources to Hadoop
- Sending email during pipeline processing
- Preserving an audit trail of events
- Loading data into Databricks Delta Lake
- Drift Synchronization Solution for Hive
- Drift Synchronization Solution for PostgreSQL