One of the most common arguments in Business Intelligence today concerns the level of detail at which a company should keep its data for the purpose of informational analysis. There are many reasons for this argument, and perhaps the three most pertinent are:

What data is available
The limitations of current technology to support differing volumes of data
Existing and pending data protection laws and ownership issues

The most restrictive of these is the third: legislation differs from country to country and regulates both how much historical data one can hold and the level of detail of that data, most especially if it can be used to track the behaviour of individuals. This issue of privacy is of utmost importance and will have ramifications across many aspects of decision-making.

The second issue in our list is one of technology, and it’s fair to say that current technology gives us the ability to store and query huge (if not limitless) amounts of data. The pertinent discussion should centre not on ‘can we do it?’ but rather on what business value can be obtained from differing ‘levels’ of data, and this subject is the major theme of this post.

The remaining issue, what data is available, is obviously an important factor simply because you can’t keep and use what you don’t have in the first place. In fact, understanding the gap between the data that is available and the data that is required is a key objective in requirement definition and must primarily be a business-led activity, not only in identifying important and missing data items, but also in putting a worth on capturing them.

Where Does this Data Come From?

Let’s look at four very common business transactions to understand what types of data they create and manipulate.
Using an Automatic Teller Machine (ATM)
Buying some items from a supermarket
Booking an airline ticket
Making a telephone call

Using an Automatic Teller Machine (ATM)

ATMs not only give out money to the just, but also serve to capture a huge amount of data. Every time you withdraw money from an ATM, a transaction record is created and stored in the bank’s operational systems containing the following:

account number
date and time
the ATM identifier (i.e. where you are)
the amount of money involved
what you are requesting to do with that money, i.e. withdraw it

Meanwhile, at that same instant, the system is assessing how much money you have in your account, along with such information as credit limits.

Buying some items from a supermarket

When you buy something from a supermarket the system records at the very least:

supermarket identification
till identification
time and day
items purchased
coupons used
price paid
payment method

At best, and with the advent of loyalty cards, the system could know your:

name
address
date of birth
credit card identification
credit limit

And of course the retailer already knows the full book price for the items bought, the levels of discount applied and the stock levels available in all outlets.

Booking an airline ticket

In this industry the reservation systems capture:

who you are
your address
payment details
date and time of reservation and flight
personal preferences (meals, seats, smoking habits)
destination
routes
arrival time
price paid
frequent flier attributes
associates flying together (family groups etc.)
attributes of young fliers (age, gender etc.)

Making a mobile telephone call

Perhaps more data is collected when you use your mobile (or wireline) telephone than with any other sort of transaction. The Call Detail Record (CDR), which is generated by the telephone switches on a per-call basis (broadly speaking), is a great source of information for the telephone company in terms of analytical potential. For example, for mobile calls, the CDR contains:

calling number
called number
time of call
duration of call
location of caller and called
termination codes
tariffs

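To give a feel for what one of these records might look like to an analyst, here is a minimal sketch that models the CDR fields listed above as a Python dataclass. The field names, types and the helper function are assumptions made for illustration; real CDR layouts differ between switch vendors and operators.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallDetailRecord:
    """Hypothetical CDR layout; names and types are illustrative only."""
    calling_number: str
    called_number: str
    start_time: datetime      # time of call
    duration_seconds: int     # duration of call
    caller_location: str      # e.g. a cell identifier (assumed format)
    called_location: str
    termination_code: str
    tariff: str


def total_minutes_by_caller(records):
    """Add up call duration, in minutes, for each calling number."""
    totals = {}
    for record in records:
        minutes = record.duration_seconds / 60
        totals[record.calling_number] = totals.get(record.calling_number, 0.0) + minutes
    return totals
```

Even this simple structure shows why the CDR is attractive for analysis: every call carries who, when, where and at what tariff.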
We see, therefore, that a huge amount of very detailed data is generated and made available to companies in many industries whenever a customer buys or uses goods and services. The question is: is it of value to hold a record of every transaction, or can the same value be derived merely by summarising these transactions by some of their attributes? To answer the question, let’s first look at some of the characteristics of data to understand the issues involved.
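Before doing so, a small sketch may make the question concrete. It assumes a handful of made-up ATM withdrawals (the account numbers, ATM identifiers, dates and amounts are all invented) and summarises them by account and day. The function names are hypothetical and this is only one of many ways such a summary could be built.

```python
from collections import defaultdict
from datetime import datetime

# Invented sample data: (account number, date and time, ATM identifier, amount)
withdrawals = [
    ("ACC-1", datetime(2024, 1, 5, 9, 15), "ATM-A", 50),
    ("ACC-1", datetime(2024, 1, 5, 22, 40), "ATM-B", 100),
    ("ACC-2", datetime(2024, 1, 5, 12, 0), "ATM-A", 20),
]


def summarise_by_account_and_day(rows):
    """Collapse detail rows to one total per (account, calendar day)."""
    totals = defaultdict(int)
    for account, when, _atm, amount in rows:
        totals[(account, when.date())] += amount
    return dict(totals)


def atms_used(rows, account):
    """A question that needs the detail: which ATMs did this account use?"""
    return sorted({atm for acc, _when, atm, _amount in rows if acc == account})


daily_totals = summarise_by_account_and_day(withdrawals)
```

In this example the summary still holds the day's total of 150 for ACC-1, but it no longer records that two different ATMs were involved or at what times. That information survives only in the detail rows, which is exactly the trade-off the question above is asking about.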