The architecture of an enterprise data warehouse (EDW) must satisfy many
functional and nonfunctional requirements that depend on the specific tasks
the EDW solves. Just as there is no generic bank, airline, or oil company,
there is no single EDW solution that fits all occasions. But the
basic principles that an EDW must follow can still be formulated.
First and foremost is data quality, understood as complete,
accurate, and reproducible data delivered on time to where it is
needed. Data quality is difficult to measure directly, but it can be
judged by the decisions made on the basis of the data. In other words,
data quality requires investment, and it can generate profit in turn.
Second is the security and reliability of data storage. The value of the
information stored in an EDW can be comparable to the market value of the
company. Unauthorized access to the EDW is a threat with serious
consequences, so adequate protection measures must be taken.
Third, data must be available to employees to the extent necessary and sufficient for them to carry out their duties.
Fourth, employees should have a unified understanding of the data, so a single semantic space is required.
Fifth, conflicts in data encoding across the source systems should be resolved wherever possible.

Pic. 4. Recommended EDW Architecture

The proposed architecture follows the modular design principle examined
earlier, that of "unsinkable compartments". The strategy of "divide and
rule" is applicable not only in politics. By separating the architecture
into modules, we concentrate specific functionality in each of them and
so gain control over unruly IT elements.
ETL tools provide complete, reliable, and accurate gathering of
information from data sources. The algorithms for data collection,
processing, and conversion, and for interaction with the metadata and
master data management systems, are concentrated in the ETL layer.
The metadata management system is the principal "keeper of wisdom",
which can be asked for advice. It keeps business, technical,
operational, and project metadata up to date.
The master data management system is the arbitrator that resolves data encoding conflicts.
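
The arbitrator role can be illustrated with a minimal sketch. It assumes two hypothetical source systems, `crm` and `billing`, that encode the same attribute differently. The mapping table, the system names, and the function `to_canonical` are illustrative assumptions, not part of the described architecture.

```python
# Hypothetical master data: (source system, local code) -> canonical code.
MASTER_GENDER_CODES = {
    ("crm", "M"): "MALE",
    ("crm", "F"): "FEMALE",
    ("billing", "1"): "MALE",
    ("billing", "2"): "FEMALE",
}


def to_canonical(source_system: str, local_code: str) -> str:
    """Return the canonical code, or raise if master data has no ruling."""
    try:
        return MASTER_GENDER_CODES[(source_system, local_code)]
    except KeyError:
        raise LookupError(
            f"no master data mapping for {source_system!r}/{local_code!r}"
        ) from None


# Records from different sources converge on one encoding.
records = [("crm", "F"), ("billing", "2"), ("billing", "1")]
canonical = [to_canonical(system, code) for system, code in records]
```

An unmapped code raises an error instead of passing through silently, so that conflicts reach the master data system rather than the warehouse.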
The central data warehouse (CDW) carries only the workload of reliable and
secure data storage. Depending on the tasks, the required availability of
the CDW can reach 99.999%, which corresponds to roughly 5 minutes of
downtime per year. The CDW's software and hardware can protect data
from unauthorized access, sabotage, and natural disasters. The data
structure in the CDW is optimized solely for the purpose of ensuring
effective data storage.
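
The relation between an availability percentage and yearly downtime is simple arithmetic, sketched below. The function name and the use of an average year of 365.25 days are assumptions made for illustration.

```python
# Average year length in minutes, assuming 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Yearly downtime allowed by a given availability level."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for level in (99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(level)
    print(f"{level}% -> {minutes:.2f} min/year")
```

Under these assumptions, "five nines" allows about 5.26 minutes of downtime per year, and each additional nine cuts the budget by a factor of ten.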
Data sampling, restructuring, and delivery (SRD) tools are the only users
of the CDW in this architecture. They take on the whole job of filling
the data marts and thereby reduce the user query workload on the CDW.
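
A minimal sketch of the SRD role follows, using Python's standard `sqlite3` module with two in-memory databases standing in for the CDW and one data mart. The table names, columns, and the function `fill_sales_mart` are hypothetical; a real SRD tool would work against the actual CDW schema.

```python
import sqlite3

# Two in-memory databases stand in for the CDW and one data mart.
cdw = sqlite3.connect(":memory:")
mart = sqlite3.connect(":memory:")

# Hypothetical storage-oriented table in the CDW.
cdw.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")
cdw.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-01-05", "north", 100.0),
        ("2024-01-17", "north", 50.0),
        ("2024-01-09", "south", 70.0),
        ("2024-02-02", "south", 30.0),
    ],
)


def fill_sales_mart(cdw_conn, mart_conn):
    """Sample from the CDW, restructure by month and region, deliver to the mart."""
    rows = cdw_conn.execute(
        "SELECT substr(sale_date, 1, 7) AS month, region, SUM(amount) "
        "FROM sales GROUP BY month, region ORDER BY month, region"
    ).fetchall()
    # Rebuild the mart table from scratch on every run.
    mart_conn.execute("DROP TABLE IF EXISTS monthly_sales")
    mart_conn.execute(
        "CREATE TABLE monthly_sales (month TEXT, region TEXT, total REAL)"
    )
    mart_conn.executemany("INSERT INTO monthly_sales VALUES (?, ?, ?)", rows)
    mart_conn.commit()
    return len(rows)


loaded = fill_sales_mart(cdw, mart)

# User queries run against the mart only, never against the CDW.
report = mart.execute(
    "SELECT month, region, total FROM monthly_sales ORDER BY month, region"
).fetchall()
```

Because the function rebuilds the mart table from the CDW on every run, calling it again is also one way to restore a lost data mart.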
Data marts contain data in formats and structures optimized for the
tasks of each mart's users. Now that even a laptop can be equipped with
a terabyte disk drive, the storage cost of duplicating data across data
marts is no longer a concern. The main advantages of this architecture are:
- comfortable work for users, each with just the amount of data they need,
- the ability to restore data mart contents quickly from the CDW after a data mart failure,
- offline data access when the connection to the CDW is lost.
This architecture allows individual EDW components to be designed,
developed, operated, and refined separately, without a radical overhaul
of the whole system. This means that starting work on an EDW does not
require extraordinary effort or investment. To begin, it is enough to
implement a data warehouse with limited capabilities and, following the
proposed principles, to develop a working prototype that is truly useful
to users. After that, identify the bottlenecks and evolve the components
that need it.
Use of this architecture, together with the triple strategy for
integrating data, metadata, and master data, makes it possible to reduce
the time and budget needed to implement the EDW and to develop it in line
with changing business requirements.