|

The main architectural components needed to build & exploit
a data warehouse are as follows:
Source systems
Most data will be captured by existing operational systems,
but external data may also need to be included.
ETL procedures
The warehouse will need regular updates from source systems.
Feed programs can be coded manually, but, except for small-scale
developments, it is usually better to use an ETL tool to extract,
transform and load the data. These generate code and documentation
automatically, reducing both development and maintenance timescales.
Data
staging area (optional)
If a complex series of transformations is required, it may
be necessary to perform these in stages, using one or more
interim data stores. A common approach is to minimise the
burden on operational systems by taking straight copies of
all relevant tables. Another is to create an operational
data store.
Using a staging area also helps to insulate the warehouse
from source system changes, reducing maintenance.
Data warehouse
A separate, physical database will need to be designed and
built, with a structure suited to running queries rather than
on line transaction processing (OLTP).
The choice of software platform, schema and indexing need
careful consideration (see design
issues), but most warehouses are implemented as relational
databases.
Functional
data marts (optional)
To obtain the best possible performance, it is usually necessary
to create a number of data marts, fed from the main warehouse,
which thus acts as a central hub. Each of these will focus
on a specific application e.g. market segmentation, financial
analysis.
Data marts are often populated using SQL
queries, but could also be maintained by the ETL tool. The
type of database chosen is even more critical, but could be
either relational or multi-dimensional (see design
issues).
Reporting
& analysis tools
There are a number of generic business
intelligence tools on the market. It is therefore unusual
to build bespoke components to handle these processes (stage
3 & 4 above). Many of these tools have tightly integrated
modules for querying, aggregation, analysis and presentation,
so that users may not even perceive these as separate stages.
Vendors do, however, take very different approaches, and
a clear understanding of requirements is needed before selecting
reporting and analysis tools. Some tools, for example,
access the warehouse directly, whilst others only access data
stored in proprietary format data marts.
Analytic
applications (optional)
Where information derived from the data warehouse is used
to make routine decisions, it may be worth building a customised
application to support this process. Such decision support
systems are often referred to as analytic
applications.
Many analytic applications need to write back data
to the data warehouse (or, more often, to an associated data
mart) e.g. forecasting, target setting and budgeting. Some
may even write back to operational systems e.g. pricing, campaign
management or cross-selling. These are sometimes referred
to as decision
processing, or as closed
loop applications.
Usually, such applications are only added after critical
reporting systems are in place. However, because relatively
few business intelligence tools allow write-back, the intention
to implement an analytic application can have a major bearing
on both architecture and tool selection.
|