Businesses have been generating data since the age of the abacus, but modern analytics only became possible with the arrival of the digital computer and data storage.
A major step forward arrived in the 1970s, with a move to larger centralized databases. ETL was then introduced as a process for integrating and loading data for computation and analysis, eventually becoming the primary method to process data for data warehousing projects.
In the late 1980s, data warehouses grew in popularity, as did the move from transactional databases to relational databases, which store information in relational data formats. Older transactional databases stored information transaction by transaction, duplicating customer information with each transaction, so there was no easy way to access customer data in a unified way over time. With relational databases, analytics became the foundation of business intelligence (BI) and a significant tool in decision-making.
Until more sophisticated ETL software arrived, early ETL efforts were largely manual: the IT team extracted data from various systems and connectors, transformed it into a common format, and then loaded it into interconnected tables. Still, these early ETL steps were worth the effort, as advanced algorithms and the rise of neural networks opened ever-deeper opportunities for analytical insight.
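The three steps described above can be sketched in a few lines. This is only an illustrative sketch, not a depiction of any particular ETL product: the source records, field names, table names, and the `extract`, `transform`, `load`, and `run_etl` helpers are all hypothetical, and an in-memory SQLite database stands in for the target store.

```python
import sqlite3

# Hypothetical raw records from two source systems with different layouts.
CRM_ROWS = [{"CustomerName": "Ada Lopez", "Email": " ADA@EXAMPLE.COM "}]
BILLING_ROWS = [
    ("ada@example.com", "2024-01-05", "19.99"),
    ("ada@example.com", "2024-02-05", "24.50"),
]


def extract():
    """Extract: gather raw rows from each source system."""
    return CRM_ROWS, BILLING_ROWS


def transform(crm_rows, billing_rows):
    """Transform: normalize both sources into a common format."""
    customers = [
        (row["Email"].strip().lower(), row["CustomerName"]) for row in crm_rows
    ]
    orders = [
        (email.strip().lower(), order_date, float(amount))
        for email, order_date, amount in billing_rows
    ]
    return customers, orders


def load(conn, customers, orders):
    """Load: write the cleaned rows into two interconnected tables."""
    conn.execute("CREATE TABLE customers (email TEXT PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders ("
        "email TEXT REFERENCES customers(email), order_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    conn.commit()


def run_etl():
    """Run the three steps in order against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    customers, orders = transform(*extract())
    load(conn, customers, orders)
    return conn


conn = run_etl()
# Customer details are stored once and joined to orders on demand.
rows = conn.execute(
    "SELECT c.name, COUNT(*), SUM(o.amount) "
    "FROM customers c JOIN orders o ON o.email = c.email "
    "GROUP BY c.name"
).fetchall()
```

Because customer details live in one table and are joined to orders through a shared key, the final query gives a unified view of a customer over time, which is the kind of access the older transaction-by-transaction storage made difficult.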
The era of big data took shape in the 1990s as computing speeds and storage capacity continued to grow rapidly, and in the years that followed, large volumes of data were pulled from new sources such as social media and the Internet of Things (IoT). A limiting factor remained: data was often stored in on-premises data warehouses.
The next major step in both computing and ETL was cloud computing, which became popular in the late 2000s. Using cloud platforms and cloud data warehouses such as Amazon Web Services (AWS), Microsoft Azure and Snowflake, data can now be accessed from around the globe and scaled quickly, enabling ETL solutions to deliver remarkably detailed insights and newfound competitive advantage.
The latest evolution is ETL solutions that use streaming data to deliver up-to-the-second insights from huge amounts of data.