Many early data warehousing projects failed, having fallen
into one or more of the traps described below. These pitfalls
are still difficult to avoid, unless those steering the project
are able to understand and anticipate the associated risks.
Developing multiple point solutions (‘stovepipes’)
One of the primary motivations for developing a data warehouse
is usually to improve access to the data in legacy systems,
and to simplify analysis. There is a very strong temptation
to build a series of independent data marts, each based on
one source system, without properly integrating these data
sources. This may give a series of quick wins, but ‘islands’
of information remain unconnected.
The end result is a new generation of ‘stovepipe’ reporting
systems, with the same inconsistencies, and no improvement
in the ability to perform analyses that cut across multiple
operational systems.
Management information initiatives dominated by senior executives
looking for quick fixes are prone to this trap, as are those
sponsored by a single functional area such as finance or marketing.
The solution is to agree and implement enterprise-wide data
definitions, and to insist on using conformed dimensions, so that
every data mart describes shared entities in the same way.
This takes longer, and delays early benefits.
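As a minimal sketch of what a conformed dimension buys, the example below joins facts from two separate source systems through one shared customer dimension. All table contents, key names and the `total_by_region` helper are invented for illustration; a real warehouse would hold these as database tables rather than Python structures.

```python
# Hypothetical conformed customer dimension shared by two data marts.
customer_dim = {
    1: {"name": "Acme Ltd", "region": "North"},
    2: {"name": "Bolt plc", "region": "South"},
}

# Fact rows from two separate (invented) source systems, both keyed on
# the same surrogate customer_key rather than on local customer codes.
sales_facts = [
    {"customer_key": 1, "amount": 120.0},
    {"customer_key": 2, "amount": 80.0},
    {"customer_key": 1, "amount": 50.0},
]
support_facts = [
    {"customer_key": 1, "tickets": 3},
    {"customer_key": 2, "tickets": 1},
]


def total_by_region(facts, measure):
    """Sum one measure per region by looking each fact up in the shared dimension."""
    totals = {}
    for row in facts:
        region = customer_dim[row["customer_key"]]["region"]
        totals[region] = totals.get(region, 0) + row[measure]
    return totals


sales_by_region = total_by_region(sales_facts, "amount")
tickets_by_region = total_by_region(support_facts, "tickets")

# Because both marts use the same dimension, the results line up region by region.
combined = {
    region: (sales_by_region.get(region, 0), tickets_by_region.get(region, 0))
    for region in sorted(set(sales_by_region) | set(tickets_by_region))
}
print(combined)
```

Had each mart kept its own customer codes and region definitions, the final step would need a mapping exercise for every cross-system question, which is the ‘stovepipe’ problem in miniature.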
Unconstrained scope (‘analysis paralysis’)
The ‘waterfall’ approach to controlling the development of
large-scale systems is to specify everything up front, before
designing and then building. This does not work with data
warehouses, because
- users cannot usually specify all their analysis and reporting
needs;
- by the time the system is ready, those needs may well
have changed.
An attempt to fully model an organisation’s data requirements
can take months or even years, and is likely to become mired
in technical detail and political debate, soaking up time
and resources without delivering any real or visible benefit.
Projects led by a traditional IT department are more likely
to fail in this manner. The best way to avoid ‘analysis paralysis’
is to divide warehouse development into a number of smaller
stages, each with a clearly defined scope. These stages can
then be prioritised on the basis of their feasibility and
expected benefits.
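One simple way to rank candidate stages is to score each for benefit and feasibility and sort on the product. The sketch below assumes a 1 to 5 scale and a multiplicative score; the `Stage` class, the stage names and the scores are all hypothetical, and a real project may well weight the two factors differently.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    benefit: int      # 1 (low) to 5 (high), as judged by the business sponsors
    feasibility: int  # 1 (hard) to 5 (easy), as judged by the delivery team


def prioritise(stages):
    """Order stages by benefit x feasibility, highest score first."""
    return sorted(stages, key=lambda s: s.benefit * s.feasibility, reverse=True)


# Invented candidate stages, for illustration only.
candidates = [
    Stage("Finance ledger mart", benefit=3, feasibility=5),
    Stage("Customer profitability", benefit=5, feasibility=2),
    Stage("Sales pipeline mart", benefit=4, feasibility=4),
]

for stage in prioritise(candidates):
    print(stage.name, stage.benefit * stage.feasibility)
```

The point of the exercise is less the arithmetic than the conversation it forces: each stage has to be small enough to score, and the ranking can be revisited after every delivery.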
Preventing this staged approach from producing multiple
point solutions requires a careful balance. The trick is
to visualise a high level outline of the complete warehouse,
and develop it in small increments, reviewing the vision after
each stage.
Unscalable infrastructure
When conducting a small-scale pilot, or developing the first
data mart,
it is relatively easy to obtain good performance. Adding more
data, or rolling out the solution to hundreds of users, may
stretch existing hardware, software or network infrastructure
beyond its capacity, with embarrassing consequences and
substantial rework.
Sharing a hardware platform with operational systems is especially
likely to cause long-term difficulties (see resource
contention). Capacity planning is therefore essential,
and should be based on expected data volumes and usage patterns.
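A back-of-the-envelope capacity estimate can be sketched as follows. The function names, the overhead multiplier (an allowance for indexes, aggregates and staging space) and every input figure are assumptions chosen for illustration, not measurements or recommended values.

```python
def estimate_storage_gb(rows_per_day, bytes_per_row, retention_days, overhead_factor=2.0):
    """Rough storage estimate in GB: raw fact data multiplied by an assumed
    allowance for indexes, aggregates and staging space."""
    raw_bytes = rows_per_day * bytes_per_row * retention_days
    return raw_bytes * overhead_factor / 1e9


def peak_concurrent_queries(total_users, active_fraction, queries_per_active_user):
    """Rough peak query load derived from the user population and usage pattern."""
    return total_users * active_fraction * queries_per_active_user


# Invented figures: a 90-day pilot versus a three-year enterprise rollout.
pilot_gb = estimate_storage_gb(rows_per_day=50_000, bytes_per_row=200, retention_days=90)
rollout_gb = estimate_storage_gb(rows_per_day=2_000_000, bytes_per_row=200, retention_days=3 * 365)

pilot_load = peak_concurrent_queries(total_users=20, active_fraction=0.5, queries_per_active_user=1)
rollout_load = peak_concurrent_queries(total_users=500, active_fraction=0.2, queries_per_active_user=2)

print(f"Pilot:   {pilot_gb:.1f} GB, about {pilot_load:.0f} concurrent queries")
print(f"Rollout: {rollout_gb:.1f} GB, about {rollout_load:.0f} concurrent queries")
```

Even with crude inputs, comparing the pilot figures with the rollout figures shows how far a successful pilot can understate the eventual load.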
The best way to avoid this trap is to involve experts who
have implemented similar systems before. It is also advisable
to visit reference sites and to include performance tests
as part of a proof of concept exercise. Choosing warehouse
components before understanding the business requirements
is very dangerous.
Unmanageable administration
It is easy to under-estimate the complexity of administering
a data warehouse environment, and this will increase as the
warehouse evolves. Once users realise the potential, demand
for new data and reports can explode.
Loading, backup and archiving procedures all need to be maintained,
and will probably need to be squeezed into an ever-tighter
window as new data sources and functional data
marts come on stream. Security, documentation and support
will also need to be managed as new applications and users
are added over time.
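The squeeze on the batch window can be checked with very simple arithmetic. The sketch below assumes the jobs run one after another within a fixed overnight window; the job names, durations and the `check_batch_window` helper are hypothetical.

```python
from datetime import timedelta


def check_batch_window(job_durations, window):
    """Return (fits, slack) for jobs assumed to run one after another."""
    total = sum(job_durations.values(), timedelta())
    return total <= window, window - total


# Invented nightly jobs and durations, for illustration only.
nightly_jobs = {
    "load": timedelta(hours=2, minutes=30),
    "backup": timedelta(hours=1, minutes=30),
    "archive": timedelta(minutes=45),
}
window = timedelta(hours=6)

fits_now, slack_now = check_batch_window(nightly_jobs, window)

# Adding one more source system's load pushes the total past the window.
nightly_jobs["load_new_source"] = timedelta(hours=1, minutes=30)
fits_later, slack_later = check_batch_window(nightly_jobs, window)

print(fits_now, slack_now)
print(fits_later, slack_later)
```

Tracking the remaining slack as each new source or data mart is proposed gives early warning that procedures need to be redesigned or run in parallel.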
The key to avoiding administrative problems is to retain
centralised control of all data and metadata
added to the warehouse. It is wise to establish clear responsibilities
and robust procedures from the outset, in particular an agreed
process to prioritise user requests and a nominated Data Warehouse
Administrator.
It is also important to assign adequate resources to ongoing
administration and maintenance. If the quality of the data
input is allowed to suffer, the warehouse will lose credibility
and fall into disuse.
Resource contention (operational use)
When a data warehouse is successful, there may well be pressure
from users to extend its functionality to support operational
tasks not adequately addressed elsewhere. If the warehouse
platform starts to be used for operational processing as a
result, this can compromise the design (see design
issues), and lead to competition for both development team
resources and system capacity.
The most common solution is to segregate the decision support
and operational processing environments. This becomes more
difficult if new applications are being introduced to support
closed-loop decision
processing. In this case a hybrid information systems
architecture needs to be developed, with clear boundaries
between operational and decision support functions. There
may well be a need for an operational
data store to help insulate the warehouse from day-to-day
requirements.
Inadequate change management
Modern business intelligence software tools are much more
flexible than most existing reporting systems, and can enable
different ways of working with data. With a well-designed
data warehouse, and the right tools, it is possible for end
users to write their own reports. With OLAP
software, the more adventurous may even learn to interrogate
the data themselves, following a train of thought that was
not anticipated, let alone built into the system.
This paradigm shift towards self-service reporting and quantitative
analysis can lead to significant process and cultural changes,
and may encounter resistance from both IT staff and business
users. Managing the transition takes time, and requires careful planning.
It is also very hard for users to articulate their analytic
requirements until they have seen what is possible and tried
the tools out. This is another reason why a prototyping approach
is often recommended.
The best way to overcome resistance to change is to treat
the data warehousing initiative as both a technical and a
business project, and manage it accordingly. It is essential
to engage senior management and user representatives from
the outset, and make them part of the core development team.
It also pays to consider the skills and roles of different
types of user before choosing tools for reporting and
analysis.
Other risks
Other common reasons for failure include:
- Focusing on the technology instead of the desired business
benefits (this often manifests itself as an undue emphasis
on tool selection; see selecting reporting and analysis tools);
- Under-estimating the data preparation effort (a data warehouse
is often likened to an iceberg: 80% is hidden beneath the
surface);
- Publishing inconsistent or incorrect data without suitable
caveats (this can seriously undermine management and user
confidence).
Selecting reporting and analysis tools
One of the dangers in choosing business intelligence software
is allowing the sophisticated needs of the few power
users to dominate those of the majority who will remain
casual users.
Whilst all users may gradually move up the learning curve,
it is not safe to assume that everyone will take to OLAP
like ducks to water. In practice, it is unlikely that one
tool or supplier will cater for the full spectrum of reporting
and analysis requirements.
The other danger is in allowing the selection process to
start too early and take too long: the software market evolves
rapidly, and vendors will continually leapfrog one another
in the race to add new features and functionality. Whilst
the users must be happy with the tool they are getting, it
is far more important to get the data structures and quality
right. It is also much easier to compare tools once part of
the warehouse has actually been built, so that the evaluation
team can work with real data and hence realistic scenarios.