Date: 2021-11-19

This is a common question with a simple answer: data ingestion and ETL are different parts of the same workflow. You first ingest data from, say, a bunch of data vendors and files, and then, when it’s ready, you extract it, transform it, and load it (ETL) using a data pipeline that moves it to another destination.

Data ingestion is a much broader term than ETL. Ingestion refers to the general process of collecting data from hundreds or thousands of sources and preparing it for transfer. ETL is a very specific action, or job, that you can run.

Though, if you want to split hairs, ingestion today involves a fair amount of extracting, transforming, and loading. It’s very rare to ingest and transfer data without some transformation, unless you’re just replicating a database, saving raw system logs, or, for some reason, you’re indifferent to quality.

Of course, as data infrastructure costs have fallen, ELT (loading data into your warehouse or lake before transforming it) has become more popular than ETL. Data teams have to worry less about blowing out their analytics tool budget, so they can now afford to load everything and sort it out later. But again, even when you’re just moving the data, you should always, always be checking its quality and cleaning it.
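To make the difference in ordering concrete, here is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for a warehouse. Everything in it is an assumption made for illustration: the `RAW_ROWS` sample records, the `orders` and `raw_orders` table names, and the `is_valid`, `transform`, `run_etl`, and `run_elt` helpers are all hypothetical, and a real pipeline would have far richer quality rules.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a vendor file.
RAW_ROWS = [
    {"id": "1", "amount": " 10.50 "},
    {"id": "2", "amount": "n/a"},  # bad value, should fail the quality check
    {"id": "3", "amount": "7"},
]


def is_valid(row):
    """Quality check (illustrative): the amount must parse as a number."""
    try:
        float(row["amount"])
        return True
    except ValueError:
        return False


def transform(row):
    """Cast the raw string fields to typed values."""
    return (int(row["id"]), float(row["amount"]))


def run_etl(conn, rows):
    """ETL: check and transform in the pipeline, then load the clean result."""
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    clean = [transform(r) for r in rows if is_valid(r)]
    conn.executemany("INSERT INTO orders VALUES (?, ?)", clean)


def run_elt(conn, rows):
    """ELT: load the raw strings as-is, then transform inside the database."""
    conn.execute("CREATE TABLE raw_orders (id TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (:id, :amount)", rows)
    # The quality check still runs, just after loading: keep only rows
    # whose trimmed amount contains nothing but digits and dots.
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT CAST(id AS INTEGER) AS id,
               CAST(TRIM(amount) AS REAL) AS amount
        FROM raw_orders
        WHERE TRIM(amount) <> ''
          AND TRIM(amount) NOT GLOB '*[^0-9.]*'
        """
    )


if __name__ == "__main__":
    query = "SELECT id, amount FROM orders ORDER BY id"
    for name, job in (("ETL", run_etl), ("ELT", run_elt)):
        conn = sqlite3.connect(":memory:")
        job(conn, RAW_ROWS)
        print(name, conn.execute(query).fetchall())
```

In this sketch both paths end with the same clean `orders` table. The ELT path also keeps every raw row in `raw_orders`, which is the "load everything and sort it out later" trade-off, and in neither path is the quality check skipped.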

Which brings us to our data ingestion framework.