OR Topics - Data Warehousing & Business Intelligence
MANAGING A DATA WAREHOUSE
Pitfalls

Many early data warehousing projects failed, having fallen into one or more of the traps described below. These pitfalls are still difficult to avoid, unless those steering the project are able to understand and anticipate the associated risks.

Developing multiple point solutions (‘stovepipes’)

One of the primary motivations for developing a data warehouse is usually to improve access to the data in legacy systems, and to simplify analysis. There is a very strong temptation to build a series of independent data marts, each based on one source system, without properly integrating these data sources. This may give a series of quick wins, but ‘islands’ of information remain unconnected.

The end result is a new generation of ‘stovepipe’ reporting systems, with the same inconsistencies, and no improvement in the ability to perform analyses that cut across multiple operational systems.

Management information initiatives dominated by senior executives looking for quick fixes are prone to this trap, as are those sponsored by a single functional area such as finance or marketing. The solution is to agree and implement enterprise-wide data definitions, and to insist on using conformed dimensions (dimensions shared, with the same keys and attributes, by every data mart). This takes longer, and delays early benefits.
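The value of conformed dimensions can be illustrated with a minimal sketch. All names and figures here are hypothetical: two marts (finance and marketing) hold facts keyed on the same customer dimension, so a measure from each can be rolled up by region and compared directly.

```python
# Hypothetical conformed customer dimension, shared by both marts.
customer_dim = {
    1: {"name": "Acme Ltd", "region": "North"},
    2: {"name": "Bolt plc", "region": "South"},
}

# Hypothetical fact rows from two marts, both keyed on customer_key.
finance_facts = [
    {"customer_key": 1, "revenue": 1200.0},
    {"customer_key": 2, "revenue": 800.0},
    {"customer_key": 1, "revenue": 300.0},
]
marketing_facts = [
    {"customer_key": 1, "campaign_cost": 150.0},
    {"customer_key": 2, "campaign_cost": 400.0},
]


def total_by_region(facts, measure):
    """Sum one measure per region, looking the region up in the shared dimension."""
    totals = {}
    for row in facts:
        region = customer_dim[row["customer_key"]]["region"]
        totals[region] = totals.get(region, 0.0) + row[measure]
    return totals


revenue = total_by_region(finance_facts, "revenue")
cost = total_by_region(marketing_facts, "campaign_cost")

# A cross-system measure that is only possible because both marts
# agree on what a customer and a region are.
revenue_per_cost = {r: revenue[r] / cost[r] for r in revenue if r in cost}
print(revenue_per_cost)
```

If each mart kept its own customer codes and region definitions, the final step would first need a mapping exercise for every pair of marts, which is exactly the 'stovepipe' problem.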

Unconstrained scope (‘analysis paralysis’)

The ‘waterfall’ approach to controlling the development of large-scale systems is to specify everything up front, before designing and then building. This does not work with data warehouses, because

  1. users cannot usually specify all their analysis and reporting needs;
  2. by the time the system is ready, those needs may well have changed.

An attempt to fully model an organisation’s data requirements can take months or even years, and is likely to become mired in technical detail and political debate, soaking up time and resources without delivering any real or visible benefit.

Projects led by a traditional IT department are more likely to fail in this manner. The best way to avoid ‘analysis paralysis’ is to divide warehouse development into a number of smaller stages, each with a clearly defined scope. These stages can then be prioritised on the basis of their feasibility and expected benefits.

Preventing this staged approach from degenerating into multiple point solutions requires a careful balance. The trick is to visualise a high-level outline of the complete warehouse, and develop it in small increments, reviewing the vision after each stage.
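One simple way to prioritise stages on feasibility and expected benefit is sketched below. The 1 to 5 scales, the stage names and the rule of multiplying the two scores are all assumptions made for illustration; the text does not prescribe a scoring method.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    feasibility: int  # hypothetical scale: 1 (hard) to 5 (easy)
    benefit: int      # hypothetical scale: 1 (low) to 5 (high)


def prioritise(stages):
    """Order stages by feasibility x benefit, highest score first."""
    return sorted(stages, key=lambda s: s.feasibility * s.benefit, reverse=True)


stages = [
    Stage("HR reporting", feasibility=5, benefit=2),           # score 10
    Stage("Sales data mart", feasibility=4, benefit=5),        # score 20
    Stage("Supply chain analysis", feasibility=2, benefit=4),  # score 8
]

ordered = [s.name for s in prioritise(stages)]
print(ordered)
```

The ranking would be revisited after each stage, in line with reviewing the overall vision as the warehouse grows.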

Unscalable infrastructure

When conducting a small-scale pilot, or developing the first data mart, it is relatively easy to obtain good performance. Adding more data, or rolling out the solution to hundreds of users, may stretch existing hardware, software or network infrastructure beyond its capacity, with embarrassing consequences and a need for substantial rework.

Sharing a hardware platform with operational systems is especially likely to cause long-term difficulties (see resource contention). Capacity planning is therefore essential, and should be based on expected data volumes and usage patterns.
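A first-cut storage estimate from expected data volumes might look like the sketch below. The formula and every figure in it are assumptions for illustration (in particular the overhead factor covering indexes, aggregates and staging copies); real sizing would use measured row sizes and vendor guidance.

```python
def projected_storage_gb(rows_per_day, bytes_per_row, retention_days,
                         overhead_factor=2.0):
    """Rough storage estimate in GB (10^9 bytes).

    overhead_factor is an assumed multiplier for indexes, summary tables
    and staging areas; it is not a standard figure.
    """
    raw_bytes = rows_per_day * bytes_per_row * retention_days
    return raw_bytes * overhead_factor / 1e9


# Hypothetical pilot versus full roll-out volumes.
pilot_gb = projected_storage_gb(rows_per_day=50_000, bytes_per_row=200,
                                retention_days=90)
full_gb = projected_storage_gb(rows_per_day=1_000_000, bytes_per_row=200,
                               retention_days=365)
print(f"pilot: {pilot_gb:.1f} GB, full roll-out: {full_gb:.1f} GB")
```

Even with these invented numbers, the gap between the pilot and the full roll-out shows why good pilot performance says little about the production system.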

The best way to avoid this trap is to involve experts who have implemented similar systems before. It is also advisable to visit reference sites and to include performance tests as part of a proof of concept exercise. Choosing warehouse components before understanding the business requirements is very dangerous.

Unmanageable administration

It is easy to under-estimate the complexity of administering a data warehouse environment, and this will increase as the warehouse evolves. Once users realise the potential, demand for new data and reports can explode.

Loading, backup and archiving procedures all need to be maintained, and will probably need to be squeezed into an ever-tighter window as new data sources and functional data marts come on stream. Security, documentation and support will also need to be managed as new applications and users are added over time.
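The pressure on the overnight window can be checked with simple arithmetic, as in the sketch below. The job names, durations and window length are hypothetical, and the jobs are assumed to run one after another rather than in parallel.

```python
def window_slack(job_minutes, window_minutes):
    """Return spare minutes in the batch window (negative means overrun).

    Assumes the jobs run sequentially.
    """
    return window_minutes - sum(job_minutes.values())


# Hypothetical nightly jobs and a six-hour window.
jobs = {"load": 180, "backup": 90, "archive": 45}
window = 360

slack_now = window_slack(jobs, window)

# A new data source adds another load job.
jobs_after = dict(jobs, load_new_source=60)
slack_after = window_slack(jobs_after, window)

print(slack_now, slack_after)
```

Tracking this figure as each new source or data mart comes on stream gives early warning before the window is actually exceeded.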

The key to avoiding administrative problems is to retain centralised control of all data and meta-data added to the warehouse. It is wise to establish clear responsibilities and robust procedures from the outset, in particular an agreed process to prioritise user requests and a nominated Data Warehouse Administrator.

It is also important to assign adequate resources to ongoing administration and maintenance. If the quality of the data input is allowed to suffer, the warehouse will lose credibility and fall into disuse.

Resource contention (operational use)

When a data warehouse is successful, there may well be pressure from users to extend its functionality to support operational tasks not adequately addressed elsewhere. If the warehouse platform starts to be used for operational processing as a result, this can compromise the design (see design issues), and lead to competition both for development team resources and for system capacity.

The most common solution is to segregate the decision support and operational processing environments. This becomes more difficult if new applications are being introduced to support closed-loop decision processing. In this case a hybrid information systems architecture needs to be developed, with clear boundaries between operational and decision support functions. There may well be a need for an operational data store to help insulate the warehouse from day-to-day requirements.

Inadequate change management

Modern business intelligence software tools are much more flexible than most existing reporting systems, and can enable different ways of working with data. With a well-designed data warehouse, and the right tools, it is possible for end users to write their own reports. With OLAP software, the more adventurous may even learn to interrogate the data themselves, following a train of thought that was not anticipated, let alone built into the system.

This paradigm shift towards self-service reporting and quantitative analysis can lead to significant process and cultural changes, and may encounter resistance from both IT staff and business users. Managing the transition takes time, and requires careful planning.

It is also very hard for users to elucidate their analytic requirements until they have seen what is possible and tried the tools out. This is another reason why a prototyping approach is often recommended.

The best way to overcome resistance to change is to treat the data warehousing initiative as both a technical and a business project, and manage it accordingly. It is essential to engage the senior management and user representatives from the outset, and make them part of the core development team. It also pays to consider the skills and roles of different types of user before choosing tools for reporting and analysis.

Other risks

Other common reasons for failure include:

  • Focusing on the technology instead of the desired business benefits (this often manifests itself as an undue emphasis on tool selection; see selecting reporting and analysis tools);
  • Under-estimating the data preparation effort (a data warehouse is often likened to an iceberg: 80% is hidden beneath the surface);
  • Publishing inconsistent or incorrect data without suitable caveats. (This can seriously undermine management and user confidence.)

Selecting reporting and analysis tools

One of the dangers in choosing business intelligence software is allowing the sophisticated needs of the few power users to dominate those of the majority who will remain casual users. Whilst all users may gradually move up the learning curve, it is not safe to assume that everyone will take to OLAP like ducks to water. In practice, it is unlikely that one tool or supplier will cater for the full spectrum of reporting and analysis requirements.

The other danger is in allowing the selection process to start too early and take too long: the software market evolves rapidly, and vendors will continually leapfrog one another in the race to add new features and functionality. Whilst the users must be happy with the tool they are getting, it is far more important to get the data structures and quality right. It is also much easier to compare tools once part of the warehouse has actually been built, so that the evaluation team can work with real data and hence realistic scenarios.


© 2002 The OR Society
