Data Warehousing is over twenty years old, yet many business and technical users still do not fully accept the validity of the architecture. They cite the enormous cost of Data Warehousing, the shortage of people with the skills to build one, the perceived failure rate, the poor quality of the data and the lack of the expected return on investment.
When I joined Teradata, a US company, in 1987, the term ‘Data Warehouse’ did not exist, although the applications that Teradata was used for met all of today’s criteria for a Data Warehouse. Teradata manufactured and sold a large but modular parallel processing platform that could store vast amounts of data and run SQL queries very quickly. Although many people shudder at the cost of Data Warehousing today, those early Teradata systems (called the DBC/1012) were often sold as DASD (disk) replacement systems for DB2 mainframe applications, because Teradata storage was cheaper than the IBM equivalent.
Those early years of Teradata in Europe were very successful, and gradually some of Europe’s largest companies, including BT, BA and TSB, began to understand the worth of holding large amounts of historical data and running very complex queries over it. At some point in those early years a Teradata guy coined the phrase ‘Information Factory’, and from somewhere soon after came the term ‘Data Warehouse’, which has stuck with us to this day.
In those early days 300 gigabytes of data was deemed huge, and quite rightly so, because that volume of data could consume over six hundred disks. Now, of course, just one disk will do, and Data Warehouses larger than several terabytes are common, running on hardware and RDBMSs from many vendors.
Although IBM and Oracle have probably competed best with Teradata in the Data Warehouse market, that market has spawned many other vendors and grown into the billion-dollar industry we commonly – and wrongly – call BI. There is money to be made everywhere!
So there are plenty of companies and components to help in the building of a Data Warehouse, but the basic fact is that the overall concept has not changed or improved significantly in these twenty years or so. We are still doing what we used to do, only quicker and with more data, and it’s pretty much this fact that frustrates me so much. I know companies on their third try at Data Warehousing. With each new iteration they change the platform, believing that it’s the platform that is failing them, not realising that they don’t actually have any real idea of what success is or how to assess what they’ve achieved so far.