A core component of data warehousing implementations, OLAP enables fast, flexible multidimensional data analysis for business intelligence (BI) and decision support applications.
What is OLAP?
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
Most business data have multiple dimensions—multiple categories into which the data are broken down for presentation, tracking, or analysis. For example, sales figures might have several dimensions related to location (region, country, state/province, store), time (year, month, week, day), product (clothing, men/women/children, brand, type), and more.
But in a data warehouse, data sets are stored in tables, each of which can organize data into just two of these dimensions at a time. OLAP extracts data from multiple relational data sets and reorganizes it into a multidimensional format that enables very fast processing and very insightful analysis.
What is an OLAP cube?
The core of most OLAP systems, the OLAP cube is an array-based multidimensional database that makes it possible to process and analyze multiple data dimensions much more quickly and efficiently than a traditional relational database.
A relational database table is structured like a spreadsheet, storing individual records in a two-dimensional, row-by-column format. Each data “fact” in the database sits at the intersection of two dimensions–a row and a column—such as region and total sales.
SQL and relational database reporting tools can certainly query, report on, and analyze multidimensional data stored in tables, but performance slows down as the data volumes increase. And it requires a lot of work to reorganize the results to focus on different dimensions.
This is where the OLAP cube comes in. The OLAP cube extends the single table with additional layers, each adding additional dimensions—usually the next level in the “concept hierarchy” of the dimension. For example, the top layer of the cube might organize sales by region; additional layers could be country, state/province, city and even specific store.
In theory, a cube can contain an infinite number of layers. (An OLAP cube representing more than three dimensions is sometimes called a hypercube.) And smaller cubes can exist within layers—for example, each store layer could contain cubes arranging sales by salesperson and product. In practice, data analysts will create OLAP cubes containing just the layers they need, for optimal analysis and performance.
OLAP cubes enable four basic types of multidimensional data analysis:
The drill-down operation converts less-detailed data into more-detailed data through one of two methods—moving down in the concept hierarchy or adding a new dimension to the cube. For example, if you view sales data for an organization’s calendar or fiscal quarter, you can drill-down to see sales for each month, moving down in the concept hierarchy of the “time” dimension.
Roll up is the opposite of the drill-down function—it aggregates data on an OLAP cube by moving up in the concept hierarchy or by reducing the number of dimensions. For example, you could move up in the concept hierarchy of the “location” dimension by viewing each country's data, rather than each city.
Slice and dice
The slice operation creates a sub-cube by selecting a single dimension from the main OLAP cube. For example, you can perform a slice by highlighting all data for the organization's first fiscal or calendar quarter (time dimension).
The dice operation isolates a sub-cube by selecting several dimensions within the main OLAP cube. For example, you could perform a dice operation by highlighting all data by an organization’s calendar or fiscal quarters (time dimension) and within the U.S. and Canada (location dimension).
The pivot function rotates the current cube view to display a new representation of the data—enabling dynamic multidimensional views of data. The OLAP pivot function is comparable to the pivot table feature in spreadsheet software, such as Microsoft Excel, but while pivot tables in Excel can be challenging, OLAP pivots are relatively easier to use (less expertise is required) and have a faster response time and query performance.
MOLAP vs. ROLAP vs. HOLAP
OLAP that works directly with a multidimensional OLAP cube is known as multidimensional OLAP, or MOLAP. Again, for most uses, MOLAP is the fastest and most practical type of multidimensional data analysis.
However, there are two other types of OLAP which may be preferable in certain cases:
ROLAP, or relational OLAP, is multidimensional data analysis that operates directly on data on relational tables, without first reorganizing the data into a cube.
As noted previously, SQL is a perfectly capable tool for multidimensional queries, reporting, and analysis. But the SQL queries required are complex, performance can drag, and the resulting view of the data is static—it can't be pivoted to represent a different view of the data. ROLAP is best when the ability to work directly with large amounts of data is more important than performance and flexibility.
HOLAP, or hybrid OLAP, attempts to create the optimal division of labor between relational and multidimensional databases within a single OLAP architecture. The relational tables contain larger quantities of data, and OLAP cubes are used for aggregations and speculative processing. HOLAP requires an OLAP server that supports both MOLAP and ROLAP.
A HOLAP tool can "drill through" the data cube to the relational tables, which paves the way for quick data processing and flexible access. This hybrid system can offer better scalability but can't escape the inevitable slow-down when accessing relational data sources. Also, its complex architecture typically requires more frequent updates and maintenance, as it must store and process all the data from relational databases and multidimensional databases. For this reason, HOLAP can end up being more expensive.
OLAP vs. OLTP
Online transaction processing, or OLTP, refers to data-processing methods and software focused on transaction-oriented data and applications.
The main difference between OLAP and OLTP is in the name: OLAP is analytical in nature, and OLTP is transactional.
OLAP tools are designed for multidimensional analysis of data in a data warehouse, which contains both transactional and historical data. In fact, an OLAP server is typically the middle, analytical tier of a data warehousing solution. Common uses of OLAP include data mining and other business intelligence applications, complex analytical calculations, and predictive scenarios, as well as business reporting functions like financial analysis, budgeting, and forecast planning.
OLTP is designed to support transaction-oriented applications by processing recent transactions as quickly and accurately as possible. Common uses of OLTP include ATMs, e-commerce software, credit card payment processing, online bookings, reservation systems, and record-keeping tools.
OLAP and cloud architecture
OLAP enables companies to maximize the potential of their corporate data by transforming it into the most practical format for multidimensional analysis. This, in turn, makes it easier to discern valuable business insights. However, if these systems are kept in-house, it limits the potential for scaling.
Cloud-based OLAP services are less expensive and easier to set up, making them more attractive for small businesses or startups on a budget. Enterprises can tap into the vast potential of cloud-based data warehouses that perform sophisticated analytics at unrivaled speeds because they use massively parallel processing (MPP). Therefore, companies can use OLAP at cloud speed and scale, analyzing vast amounts of data without moving it from their cloud data warehouse.
Constance Hotels, Resorts & Golf is a luxury hotel group with nine properties on islands in the Indian Ocean. However, a lack of island-to-island communications gave way to organizational silos, with business data isolated in each resort. The organization built a cloud data warehouse and analytics architecture to link all on-premises systems and tools with a central cloud-based data repository. In doing this, the company gained the group-wide insight they needed to leverage advanced, predictive analytics and implement an OLAP system.
OLAP in cloud architecture is a fast and cost-effective solution built for the future. Once the cubes are made, teams can use existing business intelligence tools to instantly connect with the OLAP model and draw interactive real-time insights from their cloud data.
OLAP and IBM Cloud
IBM Db2 Warehouse on Cloud is a managed public cloud service. You can set up IBM Db2 Warehouse on-premises with your hardware or in a private cloud. As a data warehouse, this service includes features such as in-memory data processing and columnar tables for online analytical processing (OLAP). As these features share a common database engine, you can easily optimize or move data workloads.
Teams can also use IBM Cubing Services to create, edit, import, export, and deploy OLAP cube models over the relational warehouse schema. This service provides a multidimensional view of data stored in a relational database to optimize and enhance the performance of OLAP queries.
Sign up for an IBMid and create your IBM Cloud account.