Planning your project
Gathering requirements, carefully choosing your data sources, and considering the best design approach result in better-performing metadata models.
Gather requirements
Gather the business requirements first. Avoid simply looking at what data is available and then working around it. Ideally, the source data should be structured and enriched for analytical applications. A common approach is to combine star schemas to represent a conformed dimensional warehouse that spans one or more subject areas in a business.
Ensure that your IBM® Cognos® Analytics applications are built on properly structured data sources that meet the users’ query requirements and performance expectations. This allows you to avoid queries that continually “restructure” and “enrich” data at run time. It’s best to have processes, such as an extract, transform, and load (ETL) process, that procure and store the required data in the data source. Performance starts with the data source and any optimizations that can be implemented within it, such as aggregate tables, materialized views, precomputed calculations, indexing, and so on.
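For example, summary data can be rolled up once during ETL so that reports don’t re-aggregate detail rows at run time. The following is a minimal sketch that uses Python with SQLite purely for illustration; the table and column names are assumptions, and your own database will have its own mechanisms for aggregate tables, materialized views, and indexes.

```python
import sqlite3

# Hypothetical warehouse tables; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (
        order_day   TEXT,      -- for example, '2024-01-15'
        product_key INTEGER,
        quantity    INTEGER,
        revenue     REAL
    );

    -- Precomputed daily aggregate: rolls detail rows up once during ETL
    -- so that reports do not re-aggregate raw transactions at run time.
    CREATE TABLE sales_daily_agg AS
    SELECT order_day,
           product_key,
           SUM(quantity) AS total_quantity,
           SUM(revenue)  AS total_revenue
    FROM sales_fact
    GROUP BY order_day, product_key;

    -- Index supports the most common filter in downstream queries.
    CREATE INDEX idx_sales_daily_agg_day ON sales_daily_agg (order_day);
""")
```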
Consider the following additional questions:
- What are the performance expectations?
- How current does the data need to be?
- What level of detail is required?
The answers to these types of questions dictate what data sources to use, what data the sources should contain, what level of detail is stored (for example, hour versus day, versus week, versus month), how often the data is refreshed, and so on.
Choose data sources
Cognos Analytics performs best with star or snowflake schemas. These schemas consist of dimension tables that contain attributes, such as name, date, color, or city, which are used to categorize data. They also use fact tables that contain key performance indicators (KPI), otherwise known as measures or facts.
Consider a sales schema where Products, Time, Order method, and Sales order are dimension tables, and Sales fact is the fact table.
Dimension tables that are referenced by multiple fact tables are known as shared or reference dimension tables. In this example, Products and Time are shared by the Returned items fact and Sales fact tables.
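A minimal sketch of this example follows, using Python with SQLite purely for illustration; the keys and column names are assumptions.

```python
import sqlite3

# Illustrative star schema for the example above; SQLite stands in for
# whatever database you actually use, and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables hold descriptive attributes.
    CREATE TABLE product_dim      (product_key      INTEGER PRIMARY KEY, product_name TEXT, color TEXT);
    CREATE TABLE time_dim         (time_key         INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
    CREATE TABLE order_method_dim (order_method_key INTEGER PRIMARY KEY, order_method TEXT);
    CREATE TABLE sales_order_dim  (sales_order_key  INTEGER PRIMARY KEY, order_number TEXT);

    -- Sales fact: measures (quantity, revenue) keyed to the dimensions.
    CREATE TABLE sales_fact (
        product_key      INTEGER REFERENCES product_dim (product_key),
        time_key         INTEGER REFERENCES time_dim (time_key),
        order_method_key INTEGER REFERENCES order_method_dim (order_method_key),
        sales_order_key  INTEGER REFERENCES sales_order_dim (sales_order_key),
        quantity         INTEGER,
        revenue          REAL
    );

    -- Returned items fact references the same Products and Time dimensions,
    -- which is what makes them shared dimensions.
    CREATE TABLE returned_items_fact (
        product_key     INTEGER REFERENCES product_dim (product_key),
        time_key        INTEGER REFERENCES time_dim (time_key),
        return_quantity INTEGER
    );
""")
```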
With this type of structure for your enterprise data, the metadata modeling process is greatly simplified, which helps ensure the best performance. Of course, performance can also be affected by several factors, including data volume, system resources, the database vendor, and database optimizations such as indexes. All of these elements must be considered when undertaking a Cognos Analytics project. Testing directly against the database structure, outside of Cognos Analytics, confirms that it performs as expected.
Cognos Analytics can query many types of data sources, not just databases. Microsoft Excel or CSV files can be used, or an OLAP source might be the best option for your requirements. For example, you might have users who are knowledgeable about dimensional functions and require an OLAP data source to accomplish their queries. Consider what works best and satisfies the needs of your users.
Data sources can also be mixed and matched. For example, you can use a CSV file and a database table in the same project and create a relationship join between them. However, use caution: performance might suffer if the two sources aren’t combined properly. Optimization techniques that are discussed later in this document can help to improve performance when combining disparate data sources in a project.
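The following sketch is hypothetical and is not how Cognos Analytics evaluates cross-source joins; it uses Python with pandas and SQLite only to illustrate relating a database table and a CSV source on a shared key, and all table, file, and column names are assumptions.

```python
import sqlite3
from io import StringIO

import pandas as pd

# Hypothetical database table with sales measures.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (product_key INTEGER, revenue REAL);
    INSERT INTO sales_fact VALUES (1, 120.0), (2, 75.5);
""")
sales = pd.read_sql_query("SELECT product_key, revenue FROM sales_fact", conn)

# A small CSV source; in practice this would be a file on disk.
targets = pd.read_csv(StringIO("product_key,target\n1,100\n2,90\n"))

# The join key must have compatible data types and clean values in both
# sources; mismatched keys force extra processing and slow the combined query.
combined = sales.merge(targets, on="product_key", how="left")
print(combined)
```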
Think about design
As mentioned before, Cognos Analytics works best with a star or snowflake schema data warehouse structure. The Cognos Analytics query service is optimized to recognize dimension and fact tables based on the nature of the join between tables. The more clearly a metadata modeler identifies and configures fact and dimension tables, the more likely queries are to return expected results with good performance.
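For illustration only (Python with SQLite, hypothetical names), the relevant join pattern is one-to-many: the dimension table sits on the “1” side and the fact table on the “n” side, and making that cardinality explicit in the model helps the query engine treat the correct table as the fact when it aggregates.

```python
import sqlite3

# Hypothetical tables: one dimension row can match many fact rows (1..n).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE time_dim   (time_key INTEGER PRIMARY KEY, full_date TEXT);
    CREATE TABLE sales_fact (time_key INTEGER REFERENCES time_dim (time_key),
                             revenue  REAL);
""")

# The fact table is on the "many" side of the join, and its measures are
# what the query aggregates; the dimension supplies the grouping attributes.
query = """
    SELECT t.full_date, SUM(f.revenue) AS total_revenue
    FROM sales_fact f
    JOIN time_dim t ON t.time_key = f.time_key
    GROUP BY t.full_date
"""
for row in conn.execute(query):
    print(row)
```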