The obvious difference is the ELT process performs the Load function before the Transform function – a reversal of the second and third steps of the ETL process. ELT copies or exports the data from the source locations, but instead of moving it to a staging area for transformation, it loads the raw data directly to the target data store, where it can be transformed as needed. ELT does not transform any data in transit.
However, the order of steps is not the only difference. In ELT, the target data store can be a data warehouse, but more often it is a data lake, which is a large central store designed to hold both structured and unstructured data at massive scale.
Data lakes are managed using a big data platform (such as Apache Hadoop) or a distributed NoSQL data management system. They can support business intelligence, but more often, they’re created to support artificial intelligence, machine learning, predictive analytics and applications driven by real-time data and event streams.
There are other differences between ETL and ELT, too. For example, because it transforms data before moving it to the central repository, ETL can make data privacy compliance simpler, or more systematic, than ELT (e.g., If analysts don’t transform sensitive data before they need to use it, it could sit unmasked in the data lake). However, data scientists might prefer ELT, which lets them play in a “sandbox” of raw data and do their own data transformation tailored to specific applications. But, in most cases, the choice between ETL and ELT will depend on the choice between on available business resources and needs.