Two Things to Get Right


    Over twenty years of building Data Warehouses we have learned a lot, and there are many published documents on best practices and pitfalls to avoid. Here I want to comment on just two factors that are critical.


    Business Involvement

    We will examine the role of the business in an overall Business Intelligence project in detail later, but suffice it to say here that it is vital. Any Data Warehouse project designed by IT people with no business involvement is almost destined to fail. Business people need to be involved in nearly all aspects of the project, because only then will they become enthused with the Data Warehouse and find innovative ways to use it.


    Scalable Platform

    The underlying ‘platform’ for the Data Warehouse is a computer and a relational database, and the way these work in combination determines scalability. A key methodology employed in building Data Warehouses is the ‘think big, start small’ concept: you start with a well-scoped project that will take, say, four months to complete, and then build it out to fulfil other business requirements in an incremental, project-by-project way. This is absolutely the right approach and is one of the reasons that I advocate TNF data models and relational databases. I also advocate hardware platforms that can be grown, so that as the Data Warehouse grows, the computer and database can handle the growth with ease just by adding hardware and software, without changing anything. ‘Anything’ here includes:

    •  Physical Database design
    •  Already deployed applications
    •  Already working ETL logic
    •  Existing data layout
    •  Business understanding of the data

    When we think of scalability, we commonly think about the ability of the platform to cope with an increasing amount of data, but that is only one facet of growth as it applies to our Data Warehouse environment. In fact, you must consider an array of dimensions that will grow if the project is successful, and all will put heavy demands on the platform, which must therefore be able to grow seamlessly.

    Such dimensions will include:

    •  Volume of data
    •  Number of concurrent users
    •  Number of concurrent queries
    •  Complexity of queries
    •  The need for very recent data
    •  Mixed workload
    •  24/7 operation
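
To make the point concrete, here is a minimal sketch of how you might track these dimensions side by side rather than looking at data volume alone. Everything in it is an assumption for illustration: the `WorkloadSnapshot` class, its field names, the proxies chosen for query complexity and workload mix, and the sample figures are all hypothetical, not measurements from a real system.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class WorkloadSnapshot:
    """Hypothetical record of the growth dimensions at one point in time."""
    data_volume_tb: float
    concurrent_users: int
    concurrent_queries: int
    avg_tables_joined_per_query: float  # assumed rough proxy for query complexity
    max_data_latency_minutes: float     # lower means more recent data is required
    distinct_workload_types: int        # e.g. batch loads, reports, ad hoc queries
    service_hours_per_week: int         # 168 corresponds to 24/7 operation


# Dimensions where a smaller number means a heavier demand on the platform.
LOWER_IS_HARDER = {"max_data_latency_minutes"}


def growth_factors(before: WorkloadSnapshot, after: WorkloadSnapshot) -> dict:
    """Return how many times heavier each dimension became (assumes values > 0)."""
    factors = {}
    for f in fields(WorkloadSnapshot):
        old, new = getattr(before, f.name), getattr(after, f.name)
        factors[f.name] = old / new if f.name in LOWER_IS_HARDER else new / old
    return factors


# Made-up figures: the first project versus the warehouse a few projects later.
first_project = WorkloadSnapshot(2.0, 50, 10, 3.0, 1440.0, 2, 60)
later = WorkloadSnapshot(8.0, 200, 40, 6.0, 60.0, 4, 168)

for name, factor in growth_factors(first_project, later).items():
    print(f"{name}: x{factor:.1f}")
```

In these invented numbers the data volume grows fourfold, but the demand for recent data grows far more, which is exactly the kind of growth that a volume-only view of scalability would miss.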

About bibongo

I'm a consultant in the field of Business Intelligence and have been since the mid-80s, which gives you some idea of my age! I'm privileged to have held senior positions with Teradata, Oracle, HP and EMC. I have an English son and a Swedish daughter separated by some 18 years, which is another type of welcome challenge!