Tuesday, September 30, 2014

Lost Precision in Dataminea

Things are looking good, they say... Our data is getting more structured, and data mining has become a standard working tool in the business toolbox. We can analyse our customers, their behavior, the markets in which we operate and our own supply-chain environment.

With all this apparent maturity, you would think we are in a good place.

Initially we gave away a lot of our data without realizing it. Now, at least, we are aware of what data we share and, roughly, how it is used. That sounds sensible and fair, and for the most part it is.

The danger we have opened ourselves to, however, is an increased sensitivity to information misconceptions. While we may have increased our precision in representing data, we have insufficient tools to control its accuracy.

Before going further, I think it is important to highlight the distinction between the two. Precision refers to the ability to generate results that are consistent and repeatable, meaning that our tool reliably produces the same result over and over again. Accuracy, on the other hand, is how close the measurement is to the actual truth.
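To make the distinction concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the "true" value, the two sets of readings and the helper names `spread` and `bias` are my own assumptions, and standard deviation and distance from the mean are just one simple way to stand in for precision and accuracy.

```python
from statistics import mean, pstdev

# Hypothetical "actual truth" that both tools are trying to measure.
TRUE_VALUE = 100.0

# Made-up readings from two imaginary tools.
precise_but_inaccurate = [110.1, 109.9, 110.0, 110.2, 109.8]
accurate_but_imprecise = [92.0, 108.0, 97.0, 104.0, 99.0]


def spread(readings):
    """Precision proxy: a smaller spread means more repeatable results."""
    return pstdev(readings)


def bias(readings, true_value):
    """Accuracy proxy: how far the average reading sits from the truth."""
    return abs(mean(readings) - true_value)


for name, readings in [
    ("precise but inaccurate", precise_but_inaccurate),
    ("accurate but imprecise", accurate_but_imprecise),
]:
    print(f"{name}: spread={spread(readings):.2f}, "
          f"bias={bias(readings, TRUE_VALUE):.2f}")
```

The first tool repeats itself almost perfectly yet is consistently far from the truth; the second is right on average but scatters widely. A tight spread on its own tells you nothing about the bias.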

Now, as I have noted in the past, the truth can be perceived from different points of view, and while we may be able to generate more reliable results, they tend to serve a limited set of perspectives. This is no accident, as these views are used to satisfy specific measures and drive specific behavior. Nor is it new. Politicians, advertisers and many other groups and individuals continue to use this ability to distort the view of certain realities and create an arbitrage in opinion to their advantage.

Some may call this the art of doing business, and perhaps that is what it is.

The bottom line, however, is that with all this Dataminea going on around us, we are becoming much more sensitive to data misrepresentation, which can be a good or a bad thing, depending on what you are trying to achieve and how you manage your data.

My question to you is: do you understand the perspectives and the level of precision of the data you handle?

Sunday, September 14, 2014

Price Tagging Data

How much is data worth? Is it based on how rare it is? How hard it was to obtain? How well it serves the business opportunities or risks of those who buy it? If other people sell your data, shouldn’t you get a cut? What should that be? How do you price tag data?
 
The truth is that this is like any other economy of trade. As a producer of data, you are interested in making a profit, and hence will optimize your costs and look to price your data at a level that fits the market in which you operate. As a consumer, you will be willing to pay based on the profit you believe the information is likely to deliver for you, coupled with the market conditions for obtaining it. All of this, of course, is bundled with quality and timing.

How is the price of data affected differently from that of other commodities? Timing, for one, is a big factor. When access to certain information becomes pervasive, the price of that information drops dramatically. No one will pay to learn the name of our planet. Start talking about very old information that is no longer pervasive, and the price starts going up again. You may well pay someone for certain facts about the history of a particular region. Although the underlying value is access to the information, the influencing criterion is time.

Quality is probably the second point worth mentioning. I do not see a major difference in how quality affects the value and cost of data compared with other commodities, but an objective valuation of data quality is harder to achieve. Many products are evaluated for quality against clear, predefined criteria. Gold has carats, cars have performance indicators and design styles, services have customer satisfaction, and food has taste and presentation. Quality of information not only means different things to different people; businesses also define data quality differently, depending on how they intend to use it. This double ambiguity, across both people and organizations, leads to an incoherent set of standards against which to value information. While this could in principle be plotted on a two-dimensional surface, in reality people tend to draw subjective conclusions about the quality of the data, and it becomes almost a matter of taste rather than a skilled practice.

So experience counts, and some data will prove more valuable than other data. But are we simply in an open marketplace for information exchange? Without regulation for each data type, we will continue to buy data fruits while blindfolded. As long as everyone else is doing the same, I guess it is fair game. But I am not too sure that is the case.

How do you, or would you, put a price on the data you manage?