Monday, December 30, 2013

Data Psychology

Is better data about data mining or people minding? As much as some people would like to believe that it is merely a technology problem, better data is, without a doubt, about much more than the choice and implementation of hardware and software.

Better data is about quality, right? Well... what kind of quality?

Structural quality, you say? Well-formed? Complete? Concise? Interoperable? Well... does it matter? Who cares? Oh yes - people care. Regulators, shareholders and, let's not forget, customers.

But hold on - this is true, but isn't data quality an operational thing? You know: timeliness, uniqueness, validity? True, but even if you set aside the structural-quality people - what about the operators? People design, configure and maintain these technologies.

OK, and what about the functional quality of data? Things like fitness for use? Availability? Relevance? Those look at applicability to business processes, which people own and operate.
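The operational dimensions above (completeness, uniqueness, validity) can be checked mechanically, but someone still has to decide what the rules are. A minimal Python sketch of the idea - the record fields and rules here are purely illustrative assumptions, not a real schema:

```python
# Toy customer records; field names are illustrative assumptions.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},               # incomplete: missing email
    {"id": 1, "email": "c@example.com"},  # duplicate id
]

# Completeness: keep only records with a non-empty email.
complete = [r for r in records if r["email"]]

# Uniqueness: do the ids repeat?
ids_are_unique = len({r["id"] for r in records}) == len(records)

print(len(complete), ids_are_unique)  # 2 False
```

The code is trivial; deciding that "email must be present" and "id must be unique" - and getting the people who produce the records to care - is the hard part.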

So yes, you can master your data, you can interrogate the data down to its tiniest bits, and you can draw beautiful models and flows. But without spending a considerable amount of effort understanding the psychology of data - in other words, how the people around the data can and will behave - you are doomed from the start.

Let's take an analogy: You have been invited for the end of the year party. What will you wear?

Firstly, you want to make a certain impression on the other attendees. You have an emotion attached to the event, and this will influence your decision and behavior with regard to your dress code. It might be a more formal type of event where you want to impress people, or it might be casual - seeing old friends, where you want and intend to relax and enjoy quality time with them. Or perhaps you are forced to be with people whose company you do not really enjoy.

Secondly, the party may have a theme, or the crowd might have specific ethnic or religious views, which you need to consider. Perhaps you need to dress conservatively, or emphasize a certain identity? This type of environment specification will also influence your choice of clothing.

Finally, we get to the third and last factor: resources. How much time do you have to find something to wear? Do you have money to buy something new? These constraints will further influence your behavior in getting yourself ready for the end-of-year party.

Going back to data management, people's decision to come to the party (i.e. to participate in a data strategy) will be influenced by these exact same factors. No, I don't mean what they will wear to your data management meetings, but rather the level and ways in which they might commit to participating.

People will always consider first their personal perspective in terms of their job, their personal and professional style, and the impact of your data management efforts on their own world. Only then will they consider what actions would be acceptable given their operating environment - for example, how their business model, customers and partners might respond to any changes. Then, lastly, they will consider the resources they have available and the level of commitment they are able to provide you.

A defect in a record detail might frustrate or annoy the user, but the provider might have no interest in improving the data. A technician running a system may have a limited perspective and prescriptive operating constraints, with no flexibility (or awareness) to measure or mitigate information risks. There may be a shortage of resources, under-skilled workers or simply unmotivated employees - the scenarios are endless...

So what does this mean? I have given you a few examples and specific dimensions to consider in your approach to data management. You need to consider the technical aspects of system design and integration, but also take a closer look at the human dynamics, which are a real success factor in your data management strategy.

Ask yourself this question: Are your team and strategy sensitive enough to consider the psychology of the data you are trying to manage? ... It might seem like common sense, but I think in most cases the answer is no.

Monday, December 16, 2013

The Theory of Data Mind

People have different metadata for their lives. Not only from a fundamental semantic perspective (i.e. language), but also at more complex levels, ranging from domain-specific knowledge formally obtained (e.g. education) to personal semantics built over time (e.g. life experiences).

Now, when you were a child, your caregivers taught you the theory of mind (hopefully). This is the ability to understand that other people's perspectives can be - and in most situations actually are - different from yours. In professional terms we call the application of this theory stakeholder management. To succeed in your tasks you need to satisfy the requests of the people who will judge their completion. It might be yourself, your manager, your client and/or other authorities. In effect, to be successful you need to convince your stakeholders that you have completed the task to their satisfaction.

In data management we talk a lot about context. The semantics, quality, responsibilities and impact relating to data all mean different things to different people. This is because they have different responsibilities, life experiences and education. Learning to deal with these differences is essential to the success of implementing data management.

Interestingly enough, the practice of data architecture, which forms part of the discipline of data management, is itself entangled in semantics and theory-of-mind issues. Too often, people rely on a limited set of theories and a limited set of stakeholders' perspectives to both define and implement data management.

What is the difference between data and information architecture? Who is responsible for data profiling? What is the best reference framework to use in implementing data management? Most likely, everyone reading this post has their own opinion. As long as we continue to support independent organizations, there is nothing wrong with adopting different sets of tools and practices. However, it does make it more challenging (and costly) to create a sustainable data strategy. Imagine the rotation of responsibilities between existing employees and new employees. The more exceptional your approach, the more difficult it will be to introduce it to people and to integrate it into existing processes.

The reality of it all is that in practical terms, companies will continue to implement different strategies towards data management. Some of the variations will originate from the different characteristics and requirements of each organization, while other differences will be due to the lack of consensus on a single “golden” reference framework to implement data management. Companies will only align their approach when there is a real business need to do so.

So what does all of this mean to you, as someone concerned with the well-being of the data in your organization? When implementing data management, limit your scope (but keep a strategic mind). Keep theoretical conversations outside the formal data management conversations. The leaner and meaner (more precise and useful) your strategy is, the more likely you are to succeed.

Just be warned: you will get nowhere if you forget to apply the theory of mind to the application of your data strategy.

Saturday, November 30, 2013

Do we under-stand-n-dice?

Data interoperability is rooted in understandability and lineage, which are founded on a commonly agreed language - in other words, on standards. Now, unfortunately, while standards need a technical specification, that is only the beginning. Standards remain the shadows of our data if not implemented and governed properly.

Let's get down to basics. Take ISO 8601, for example (most commonly known as the yyyy-mm-dd date format). While this is an excellent standard (in my humble opinion), it shadows our data, since it is only partially adopted globally. Of course, in some scopes it is fully adopted, and in others it seems to be non-existent. This type of standard reaches every aspect of our lives and will take time to be adopted worldwide. The point is that it has yet to resolve all of today's problems relating to date representation.
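Partial adoption is easy to see in code. A short Python sketch: the same string means two different dates depending on a locale convention, while the ISO 8601 form carries its own interpretation (the sample date is, of course, made up):

```python
from datetime import datetime

# "03/04/2013" is ambiguous without a standard:
ambiguous = "03/04/2013"
as_day_first = datetime.strptime(ambiguous, "%d/%m/%Y")    # 3 April 2013
as_month_first = datetime.strptime(ambiguous, "%m/%d/%Y")  # 4 March 2013

# The ISO 8601 form needs no guessing:
iso = datetime.strptime("2013-04-03", "%Y-%m-%d")
print(iso.date())  # 2013-04-03
```

Two systems exchanging the ambiguous form can silently disagree by a month; that is exactly the cost of non-adoption the post is talking about.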

There is a spectrum of data standards adoption, and it stems from the implicit cost and benefit to the business. Why adopt a standard if it will cost too much to implement? This is probably the single biggest question people ask when it comes to standards. We do not, and should not, standardize just because it sounds like the right thing to do, but we should also not discount the potential benefits. We need to carefully evaluate the strategic impact of this type of governance. There is an ongoing cost to having standards, and there are risks and incurred costs relating to the lack of standardization.

While we do standardize a lot, I think we tend to standardize our standardization. We do it too broadly, and hence at times apply standards that are too shallow, while at other times we over-standardize. In other words, we can - and should - optimize this practice.

Supporting the business is all about managing risk and resources. Hence, the practice of standardizing should be driven by an understanding of the business impact of adopting (or not adopting) standards. This should be quantified in terms of business-relevant risk and, of course, money.

So don't let your business play with the under-standardizing dice. Help them choose which standards to evolve, and which ones to let go. This, I think, should be the standard approach. Don't you?

Friday, November 15, 2013

Be Mindful of Data Arbitrage

Arbitrage is a term often used in economics and finance to describe the practice of taking advantage of a price difference between markets [Wikipedia]. This then leads to the opportunity to make a profit without taking any real risk. As a result, arbitrage is usually very short-lived and leads to a natural re-balancing of the price differences.

People also refer to arbitrage when they talk about using richer information to improve marketing and sales, or about using some unique piece of information to stay ahead of the competition.

In a system with high information transparency, or effective information feedback, arbitrage is a self-regulating control mechanism. However, this is not true where the feedback system is broken. In these situations, arbitrage becomes the power of knowledge, as one holder of information stands at an advantage over another party.

This happens across all areas of life, and in particular wherever any type of bargaining is concerned. While it is true that you might be willing to pay for convenience or a brand, you may not always have the ability, or the right, to know what the real value of the deal is for the other side.

So, while the arbitrage in data is broken and feedback loops are dysfunctional, knowledge remains power. However, once standardization and governance kick in, balance is restored.

Therefore, be careful with your information strategy, and be critical of the competitive advantage derived from your information assets. Consider how the maturity of industry standards and governance, demands from regulators, and technological capabilities might impact the value you derive from your information assets.

In the meantime, it might help to acknowledge that information arbitrage is all around us, and to take decisive action to master - or surrender to - information management gaps where doing so specifically benefits and aligns with your business strategy.

Thursday, October 31, 2013

How wet is your data?

People do not like being called wetware - but we are. Physical computers, microchips, disk drives, plugs and cables are what we normally call Hardware. Software, on the other hand, refers to a type of architecture that includes "softer" components, which are in a more transient state. This is very true for things like computer programs. Somewhere along the way, however, I picked up an extension of these two definitions which defines humans as Wetware, suggesting that changes in people's behavior are more frequent and less predictable than those of software.
 
The question I pose, then, is really about how much of your data is affected, or handled, by humans rather than by computers or automated processes. This question is important because it has a direct impact on the amount of time and effort you need to spend on data governance. Why? Because governance is something you use to control human behavior.

One of my favorite speakers on data management, Dr. Peter Aiken, talks about leveraging data - in other words, how to use data effectively to maximize its value for the organization you support. I like to extend this thought further by noting that in order to leverage data effectively, you need to affect how it is being used, and for that you need to control the levers of your data.
 
Technology and process can be loosely regarded as hardware and software, and for those we have ample tools to effect change. Managing people, however, is something we have been doing since the dawn of mankind. So why is it so difficult to control data and its quality? Simply put - lack of management. Some say it's the lack of technology and process management, and others will argue it is the lack of people management. Alas - it is all three.
 
Since this is a conversation about wet data though, I want to conclude this post with the “people management” side of things. I have discussed perceptions and different priorities in the past, but when you think about it, these are motivated by deeper constructs such as emotions and desires.
 
Now if you want to influence those, you should consider training in softer skills like psychology and change management.

Monday, October 14, 2013

The Need for Non-Information

Is having the right data at the right time by the right people the right thing? What is the right thing? Business prosperity? Personal prosperity? Legacy? Happiness? This gets very personal very quickly.

The real question is how we balance a diverse set of needs for similar and connected information between different stakeholders. Some information we want to share with others; some we do not. Some information has no relevance to us, and some we cannot live without. I have spoken about context and measures as ways to impose relevance and controls on data. But in this post I also want to explore the need for non-information.

It is quite clear that having all the information you need does not necessarily make you successful. Sometimes, in order to succeed, you need to not have information. It might be in the form of isolation in order to provide focus (are you wearing earphones to block out audible information?). It might be to create an environment where controlled parameters are used to study or influence the outcome. It might sound like this is only applicable to scientific research, but the terms "manipulation" and "influence" come to mind. How many outcomes in the history of mankind, or in your life, have been a direct result of choosing to share, or to block, certain information?

So is our near-instant access to information today only a perception? Is there, or isn't there, a bigger conspiracy at play in our lives? Well, this becomes very personal very quickly again.

We sometimes get caught up in the belief that more information, and better processing of information, is ultimately what makes us more productive - but that is false. When machines supersede humans in processing information and determining outcomes based on it, our only true value will be innovation in capitalizing on opportunities. I am not saying there is no need to manage information; we are far from conquering that quest. However, along the way, we need to ensure that we grow ourselves to become true governors of information: carefully learning how to respond to our environment, releasing and blocking (in other words, regulating) the flows of information to advance our goals.

After all, history will be determined by what we have, and have not done with information.

So I ask you now, who really needs all that data?

Monday, September 30, 2013

Data SWOT’ing

A SWOT analysis is a structured planning method used to evaluate the Strengths, Weaknesses, Opportunities, and Threats involved in a project or a business venture [Wikipedia]. I probably share a view held by many in stating that it is useful for assessing many situations (business or otherwise), by helping you focus on the different aspects of your environment and your relative position to them. This, in turn, helps you decide what you should do next to progress towards your chosen goals.

A Data SWOT is really about looking at your data assets. Instead of asking how your business is positioned in its environment, we take the specific angle of asking what the relative position of your data is compared to the same type of data in your environment.

For example: If you have a strong and up-to-date customer information set in your newspaper delivery business, you might come to realize an opportunity to sell your updated data to other local vendors. You could sell the updates for a fee, and let other service providers (who are entitled to this information) reduce their cost in updating their customer information. You can also take this one level further, and build an agreement with (almost) all the providers, and agree on a process to mutually update each other when customer information changes. This might reduce all vendors’ individual costs and will make the customers happy as well, since they will only need to update their details once.

This is of course nothing new! Risk management information providers have been using this model for quite some time, sharing data about customer credit rating and so forth.
 
The point is this: With a Data SWOT analysis you can identify ways to increase the value realization of your data, or plug holes in poor quality data.

Now, taking this one step further: what other business tools, management methodologies, models, etc. can or do you use to create an information-centric perspective on your business?

And until then - happy SWOT’ing…

Sunday, September 15, 2013

The Origin of Poor Data

Why do we have data quality problems? What is the underlying cause, or set of parameters, that leads us to a situation where data quality becomes an issue? Can we pull it out at the root? Do people have the potential to master data management by tweaking the underlying causes early on? Can we eradicate data quality issues altogether?

We already know that data quality issues arise from the sub-optimal delivery of information. They are not a result of incompetent professionals who do not know how to use the right information once they have it. So the issue is really, as noted by many, quality at source.

There are several reasons why the source of the data is poor at times. It stems from either a lack of understanding or a lack of acceptance of the importance of the needed quality of information. For the former, it is usually a matter of informing and clarifying the requirements with the right audience. For the latter, controls in the form of reward and punishment are likely to be most effective.

But this is not the underlying cause. Why do people fail to understand the requirements, and why are information providers not accountable for the information from the start? The answer, of course, is context and relevance, as I have alluded to in previous posts, and this seems fair enough. I have my set of circumstances to deal with, and you have yours. We both act in the best interest of the values we hold and the responsibilities assigned to us - and we hope for the best!

That, unfortunately, does not sound quite right. If we want to reach Teneo Vulgo, then we must be able to assimilate a higher level of awareness of others' information needs. Can we know everything everyone wants? Well, of course not. So what do we do? How do we ensure that responsibilities and context are sensitive enough to reduce the impact of siloed information contexts?

One approach would be to educate people from a young age to appreciate the multiple reasons for, and uses of, information. We do this with language when teaching them that a single word can mean different things. Storytellers use this duality (or ambiguity) to create mystery and surprise their audience - but this is, again, context-specific. The generalization of this lies in education. We need to promote the value of correct information (by showing, for example, how wrong information can lead to problems, e.g. lying, or Information Quality Trainwrecks); we need to demonstrate, with real-world examples, how the same information can be used by different people for different reasons, and what it really means to look after information; and we need to give more credit to the people who do this right.

But let's get practical. Although this is what the true generation of Teneo Vulgo will value and practice, we want to improve the quality of information today. Unfortunately, I have no better cure than clarity and control. But here is an afterthought: you might want to combine the two by tying them together. In other words, people's performance and remuneration should be linked as closely as possible to data management, so that they develop an appreciation of the impact of the poor-quality data they distribute.

Friday, August 30, 2013

Good-Enough Data Management

The boundary between Quantum and Newtonian data quality is a grey area which is made clear by prioritizing business value. Just as you choose one physics model over the other for the sake of addressing a real-world problem, so you must choose the right level of control over the usage and quality of data. A detailed analysis of the weight of every sand grain will cause you to spend too much time and money building a house. The grains have to be within a certain weight tolerance, and rough estimations are "good enough" - so simple, cost-effective filters will do the job just fine.

Another example, closer to data management than the weight of sand, is good-enough parenting. If you protect your child from making any bad decisions (such as insisting on wearing slippers to go and play in the rain), they might end up becoming overly dependent on your judgment and will never learn why this choice of footwear is not such a great idea.
  
So what do parents do? They carefully choose when to get involved and when to step back and let the child learn from their own mistakes. The idea is to raise an effective independent thinker while still maintaining reasonable control.

But there is another side to it, and that is resource cost (as in the sand example). A parent typically has a million things to do, and being pedantic is not necessarily productive in every situation. You need to use your time and money effectively, and hence you need to allow some risk and some impurities for the sake of progress.

Therefore... to maximize the value of your data, your oversight and management of it must be good enough, or fit for purpose. Too little management, and you will endure higher risks and remediation costs. Too much control, and you will slow the business down, incur unnecessary costs and lose your credibility.

Each organization requires a different data management operating model, much like each child requires its own parenting style.

Thursday, August 15, 2013

Towards Semantic Coherence

When we Tweet, Blog, Collaborate, E-mail or even discuss something face to face – we communicate. We understand each other (hopefully) when we agree on a context and on an intention. Now while I am all for free and open communication, I am not a big fan of information noise. It might be beautiful, it might be charming – but it’s like trying to talk to 10 people at the same time.

When I do try to communicate with someone, I usually have my own specific context and intention. I get easily distracted by interesting facts, especially when 10 people are talking to me, but get frustrated if I do not manage to communicate what I need to. What do I do in this kind of situation? Well, I try, as politely as I can, to increase the priority of my context and intention, and find ways to optimize the communication. In other words, I am influencing my partner to focus on more relevant information. I may go as far as changing my conversation partner to improve the effectiveness of my communication. Sounds almost cruel when one talks about people… but remember, this is just a metaphor :-)

So, we open our e-mail, we read our rss feeds, tweets, timelines, and whatever other channels we believe will keep us close to the topics that matter to us – and it works. We continue to optimize these channels on an ongoing basis – and life is beautiful... or is it? What’s wrong here?

Well, let me first point out an observation. Our context, or intention, is driven by goals we wish to achieve. In the past, these were created and contained in specific environments. For example: you wanted to buy something - you went to a shop; you wanted to learn something - you went to classes; you had a question - you called someone. While our goals might not have changed, our channels for reaching some of these goals most definitely have. What is the impact? Since we have shortened the distance of travel (one browser tab to another), we have also increased the frequency with which we alternate our attention. This may give us a better sense of control, but it can overwhelm us with competing intentions. This is not good, as it actually slows us down. We do more things, but we do everything slower.

What we need to do is work harder at dedicating our attention to what we need to do. Don't drive and text; don't tweet while walking with a friend to get a cup of coffee; don't speak on the phone while having dinner; and so on. Sounds like something your mother might say - and you know what? She's right. It is not a matter of manners, but an issue of semantic coherence. What I mean by this is staying loyal to your current intention. How can you communicate effectively and ensure you stay focused on your goals? Simply by ensuring you dedicate your presence to a pre-defined and intentional conversation.

The interesting bit is that this is as true for groups as it is for individuals.

A balanced frequency of multiplexing communication focus will prevail. Technology and businesses will also learn to link to and capitalize on this by engaging with consumers at the correct semantic moment with the right content - for example, advertising a restaurant special while you are looking to book a table.

To get there, we need to help communities build semantic coherence to improve their common intention and context so that they become more effective in reaching their common goals. This is done through creating relevant and useful dictionaries, collaboration platforms and managed information strategies… quite a bit of work!

One thing is clear though: This is not a technological challenge but rather a change management one.

Saturday, July 27, 2013

From Newtonian to Quantum Data

Big data is a joke. Not in the sense that it is not important, or that we cannot extract value from it, but rather in the sense that it is actually small - relatively speaking, that is. You see, it is merely the tip of the iceberg when it comes to the total information that exists in the universe. Our ability to process information, even what we call big data, is insignificant compared to the amount of data the world around us is processing. If we were living in The Matrix, you can only imagine the processing power and the incomprehensible size of the data stores that would be required. Now, if you know a bit about modern physics, you have probably come across the double-slit experiment. The finding, which indicates that light is both a wave and a particle, seems to suggest that there is some form of communication between the different light particles. Regardless of how you believe the world was created, there is no doubt that there is an ongoing connection between large objects such as planets and minuscule particles such as atoms, quarks, strings, or whatever sits at the bottom of the physics chain.

Continuing with physics (apologies - one of my pet topics): the second law of thermodynamics states that the entropy of an isolated system never decreases. This means the level of disorder in the universe is ever increasing. So whatever information had to be processed by the universe when you started reading this post has already increased immensely by the time you got to this sentence. Just think about the amount of information that exists in this ever-disordering universe - it is mind-boggling!

So yes, we can track the GPS locations of billions of users, calculate correlations of distant measures and generate previously unheard-of predictions based on relationships between seemingly unrelated phenomena. But is the universe really one perfectly orchestrated data management system?

The answer is yes. Simply put - by the laws of nature. What defines measures of control is the frame of reference, or in other words - patterns. You can only control something that has some level of predictability. Even randomness is regarded as a pattern and has some predictable scale of operation (the normal distribution is the classic example). So what am I saying here? Data management is a set of laws that help us describe, understand and control information. Furthermore, I believe there is a valid case for comparing information management to the theories of physics. Science has learned that different theories fit different scales. While Newtonian physics is well suited to describing everyday human realities, quantum physics is best for describing interactions at the atomic level. Whilst widely different, they naturally increase or decrease in validity as you scale the phenomena you observe.


So, my point here is that big data (which is all about statistics, complex structures, correlations and finding patterns) requires one set of rules, whilst "little data" (which is all about quality and insight) requires a completely different set of rules. The two sets should increase and decrease in validity as you scale the phenomena you are observing, and should jointly provide the simplicity and beauty that help us understand and appreciate how information is exchanged between different stakeholders.

Sunday, June 30, 2013

Don’t forget the data scars

Ever sat in a coffee shop, and looked around you at the people sitting at other tables, or perhaps at those who are walking by? You might, in fact, be doing that right now. You might be at your work desk, walking somewhere, or sitting on a bus or train.

What do you notice? You probably take a quick measure of people’s physical build, their likely state of mind (happy, upset, stressed or annoyed) and maybe take notice of their choice in fashion. You might go as far as trying to guess whether they are well-off, whether they are local or tourists, or even how different or similar their journey to this point in time and space might be to yours.

One lesson I have learned in life is that you can never know where a person has come from, and where they're bound. To illustrate this point, just imagine that the person you are looking at is some undercover secret agent in the middle of a secret mission to save the world…

But there is more. What I would really like to draw your attention to is the signature that time bestows upon us. In simple terms - the scars. You might notice a scar on someone's arm that is the memory of a physical injury, or streaks of white hair or wrinkles that are the scars of time on a person's worldly vessel. These would probably make you think of the person's longer, more significant history in life: not only what their immediate journey might be, but what hardships and joys they might have experienced during their life.

These scars are a form of metadata about people. Did you ever stop to think about that? What about the fact that data has scars of its own history? Sure, data cannot be joyous or experience pain, but the way it is handled affects its appearance, much like the scars on people. Take, for example, two reports that show financial numbers. In one report many of the numbers carry suffixes such as .99 (8.99, 16.99 and so on), while in the other the numbers are fully rounded (e.g. 9.00 and 17.00). Now if I told you one report is in aid of budget planning, and the other is for product price tagging – chances are you will know which list is which. You will still need to verify it – but you can advance an assumption from the data scar. When all the financial numbers are rounded in a certain pattern – it gives you a clue. When correlation is apparent between sets of values – there is a historical event in the life of the data that caused it to appear the way it does.
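To make this concrete, here is a minimal sketch of that kind of "scar reading". The function and its rules are entirely hypothetical – a toy heuristic that looks only at the cents portion of each figure to guess whether a list resembles price-tag data (.99 suffixes) or rounded planning data:

```python
# Hypothetical heuristic: guess a report's purpose from the "scars"
# its numbers carry (.99-style price suffixes vs. full rounding).

def classify_report(values):
    """Guess the likely purpose of a list of financial figures.

    This is an illustrative sketch, not a real classifier: it only
    inspects the cents portion of each value.
    """
    cents = [round(v * 100) % 100 for v in values]
    if all(c == 99 for c in cents):
        return "likely price tagging"    # .99 suffix pattern
    if all(c == 0 for c in cents):
        return "likely budget planning"  # fully rounded pattern
    return "inconclusive"

prices = [8.99, 16.99, 4.99]
budget = [9.00, 17.00, 5.00]
print(classify_report(prices))  # → likely price tagging
print(classify_report(budget))  # → likely budget planning
```

As the post says, a guess like this still needs verifying – the scar only gives you a starting assumption, not an answer.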

When you are looking at the data and cannot find the answers you are looking for, try looking deeper, into the scars of the data; therein may lie the story that can give you the answers. It's like reading between the lines of a letter or a note. Don't get discouraged by a massive amount of seemingly meaningless data. If there is a story behind those numbers, the scars will help you find it.

So good luck with your data analysis, and don’t forget the data scars.

Thursday, May 30, 2013

One way to look at Information Services

In my last post I discussed the need to have a governing system for information management. I called the lack thereof: “Information Services Contract Failure”. I will now discuss one way to look at Information Services.

Enterprise Architecture, as broken down by the TOGAF framework, looks at an organization as comprising Business, Information Systems (Data and Applications) and Technology. When you look at the business from an information perspective, you have business processes, which require and result in information, and applications, which act as custodians of the business information. Now, while business interaction defines the conditions for an Information Services contract, it does not imply that the contract exists merely between the two involved business parties. As you know, one piece of information is usually relevant to more than one user, and the same piece of information can originate from multiple sources in various ways.
 
Managing an Information Service Contract focuses on the information. Not its sources. Not its users. The information itself, as an asset. Obviously its value is measured through its usage, and its liability through its handling costs and the associated risks of misuse. So what we are really saying here is that the management of information is owned by the business as a whole, and not by a single (unfortunate) department. One way to implement this is through a central Information Management business team, which looks after one or more types of information. It is not the only caretaker of the information and it does not own the information. It simply takes accountability for safeguarding the information and hence needs the authority to influence internal custodians of the information. This does not include any legal rights to decide how the information must be handled. They must coordinate, advise, provide monitoring and insights, and they can help manage the flow and organization of the information – but that is it.
 
Now, like any other business area, a central Information Management team has its own technology support. These people must look after ALL the data hubs, warehouses and reporting platforms of the organization. You may want to  p a u s e  here and think about what this really means in terms of the current structure of your technology organization.

You might think that with a central team managing the "golden copy" of the information there are no trust issues. But, while it might be true that in this model information converges through one (logical) path, trust in information emanates not from the lack of ambiguity, but rather from the degree of appropriateness of the information for its designated use.
 
According to an article I recently read ("The Hidden Biases in Big Data"), you need to consider the completeness and accuracy of data. Some data gets excluded due to technical limitations in sourcing it. However, there is another consideration relating to accuracy, and that is the inappropriate use of filters. While we would like to believe that people intend for information to remain objective, poor decisions, driven by preconceptions and/or lack of skill, can easily lead to skewed information with significant business implications.
 
For example, if a beverages distributor assumes low profit margins in a particular region based on past sales and economic conditions, he may decide to ignore or adjust any higher-than-expected sales figures, as they do not make sense and are deemed errors. This can easily affect the company negatively. Firstly, in terms of puzzling financial reports, which will result in extra cash that remains unallocated. This might then be adjusted later as an accounting error, or attributed to sales figures from another period. It can even affect the marketing team, who would deem their actually successful marketing strategy a failure.
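The distributor's mistake can be sketched in a few lines. The figures and the ceiling below are invented purely for illustration – the point is how an over-aggressive "sanity filter", built from past expectations, silently discards legitimate sales and manufactures the unallocated cash described above:

```python
# Illustrative sketch (invented figures): a filter built on past
# expectations silently drops legitimate higher-than-expected sales.

EXPECTED_MAX = 1000  # assumed ceiling, based on past regional sales

def filter_sales(figures, ceiling=EXPECTED_MAX):
    """Split reported figures into 'kept' and 'dropped as errors'."""
    kept = [f for f in figures if f <= ceiling]
    dropped = [f for f in figures if f > ceiling]
    return kept, dropped

# A successful campaign produces figures above the old ceiling:
reported = [950, 980, 1400, 1550, 990]
kept, dropped = filter_sales(reported)

print(sum(kept))     # understated revenue: 2920
print(sum(dropped))  # "unexplained" cash the filter created: 2950
```

Nothing in the data was wrong; the preconception baked into the filter was.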
 
As much as information quality is in the eye of the user, so is its trustworthiness in the hands of its craftsmen.

Wednesday, May 8, 2013

What we have here is a failure to communicate



I have written before about the need for structure in order to be able to communicate effectively. This is common sense more than anything else. You need a system of information exchange which is understood by the involved parties in order for the communication to carry value. There is no value in two people talking on the phone if one speaks only one language and the other only another. There may be a tone to interpret, or some common or similar words, but that again is part of a common system of information exchange.

Now I use the term “system” for a specific reason. A system is a pattern of behavior with inputs and outputs. If you have an understanding of the system, you can predict, to a certain level of confidence, what the outcome will be given certain inputs. In the context of information exchange, you can predict the outcome of communicating a message, if you have sufficient knowledge of what information was requested, and how the information fits with the reason you are communicating. Generally your prediction is that the information is understood and used for specific and agreed reasons.

Now this is where things get more interesting. You probably chuckled reading "understood and used for specific and agreed reasons". That is the fundamental assumption, and yet it could not be further from the truth. You assume people understand the measurement units, the abbreviations and the classification sets you apply to data. Yet you know, in the same breath, that there are likely to be misinterpretations and misrepresentations, and a need to verify and correct some data manually.

That really should not be the case.

Information exchange is only as effective as the compliance with its governing system. The less educated two people from different cultures are in a shared language such as English, the less effectively they will be able to communicate, even if they both speak English. How much time do you spend learning the basics of your primary language? Usually the first 10 years of your life, if not longer. So why are we surprised that communication breaks down so often? It is sometimes related to the channel and method, but more often than not it is because of an information service contract failure.

(To be continued...)

Saturday, March 30, 2013

Bridging perceptions to create common knowledge

One of the first lessons I learned when I started my career was to "manage expectations". At face value, it is about making sure you communicate what you are working on, and letting people know what they can expect from you and when. At a deeper level, however, one needs to appreciate that your audiences – whether your boss, client, colleague, friend, family member, teacher or service provider – all have different views of the world. To communicate effectively with people, you need to understand their agenda. Why are they engaging with you? What are they trying to achieve outside the context of your engagement?

Another way to think about this is in terms of visualization. When you look at a house from the front, you see one thing; if you look at it from above, you see something completely different. An even simpler example is a cone: from the front it looks like a triangle, and from the bottom, like a circle.





The film industry has been using perception for decades to create specific impressions: gigantic cruise ships sinking (Titanic), wars between nations (The Lord of the Rings) and, generally speaking, ordinary people and objects shown in unusual proportions or demonstrating unusual abilities.

The use of perception is common in many industries, from the judiciary to science, fashion and marketing. What matters in the end is the communication of a single, clear message, and the trick is to know how your audience will interpret your message.

Now, when it comes to common knowledge, there are two things to consider. Firstly, common knowledge by definition requires a consensus for it to be classified as such. Therefore the way common knowledge is defined and disseminated has to take into account the community's perceptions. So when you want to generate common knowledge, make sure to position and articulate the information in a way that is clearly understood and perceived as valuable by the community.

Secondly, for information to become knowledge, it must be understood. In other words, the community members must understand how they can apply the information to their specific situation. You can achieve this by giving examples and demonstrating how the information can be applied to generate value. This will allow your audience to create analogies, transfer the application in their minds, and draw a scenario where they could apply the information in their own context. Another way would be to apply the information to existing common knowledge, and draw specific value for the community as a whole.

For example, suppose we were all working for one company, and I told you about a search capability that allows us to find various documents in the organization. By demonstrating how one can use it to find company procedures, you immediately understand the value and gain the knowledge. If the search capability is limited to a departmental level, then by demonstrating how the "Information Management" team can use the capability to find documents, other members of the organization can think of scenarios where the same capability would be deemed valuable.

The bottom line is that managing expectations is a vital indicator and tool for governing common knowledge.

That is, at least, my own perception…


Thursday, February 28, 2013

Teach your children the philosophy of Information Management

I often find myself using analogies to explain Information Management (IM). Why? … Well, we use patterns people are familiar with to convey the characteristics of something new or unfamiliar. It follows, therefore, that there is limited exposure to what IM is and why it is important. Why are people unaware of, or underexposed to, IM? We are, after all, moving towards an era where IM will become vital for a business to remain competitive. Surely this is something people should learn as part of their education. There are courses and degrees that offer you the opportunity to learn about IM. In my view, however, it should form part of basic education, and people should get exposure to IM from a young age. This is essential for the Information Age to transform into the era of the "commonly known" (Teneo Vulgo).

For humans to become a species that truly leverages its members' knowledge, we need a natural skill for managing the proliferation of information. That is not the case today, where we merely apply controls, defined by cultural and other systems, to protect key components of our knowledge. For example, we teach our children to remember their home phone number and address. We further tell them not to share this information with strangers, and we might even explain why. These are controls, not a philosophy or a way of life.

So, how do we build communities that master the power of IM? Simple: we need to build memories by connecting with people's emotions and conveying the message of actively managing information. One way would be to use storytelling – for example, stories about love and war where heroes and lovers defeated the odds by using information to its full potential. The stories should become more realistic as people's education matures, to teach them that this is no mere fallacy or fantasy. The more the outcome of the story depends on the mastery of knowledge, the better. In fact, I would argue that most of the stories told today carry the value of information management, but that this aspect of the story gets little or no attention.

People who master IM take advantage of situations all the time. They capitalize on opportunities and mitigate risk. Now imagine this power in the hands of a group with a common purpose. We already use crowd sourcing to build knowledge, accelerate the evolution and improve the accuracy of systems. It is time for the next level.

There are companies out there that are already building scalable Artificial Intelligence. But there is a bigger opportunity: we can scale real intelligence, or human computing – semantic and context-aware systems that work with people to derive the best result. You might argue that it is impossible to call on people to assist when needed, but there are programs out there that will prove you wrong. Have a look at the navigation app Waze as a good example.

The challenge nonetheless remains to raise future generations to appreciate the opportunities and risks associated with information management, so that people will innovate more, leverage each other's knowledge and intelligence, and create managed systems of common knowledge.

So my message today is: Teach your children the philosophy of Information Management.

Wednesday, January 30, 2013

The best analytics tool

The purpose of data analytics, using big or small data, is to provide business insight. It is about taking sets of information, organizing them in a meaningful way and then combining them with other bits of information in order to empower intelligent decisions. 

We all use the best analytics tool, every day, practically all the time: our brains!

I am sitting here looking out the window at a set of colors and shapes. That is really meaningless information, as what I described is too vague. If I provide you with more details, you will be able to match the patterns in your analytical tool (your brain), compare them to things you already know, and soon be able to comprehend what I am looking at. Let's fire up your analytics engine: I am looking at green and brown patterns. The brown patterns are rectangle-like, in vertical orientation and considerably thick. The green ones are mainly in the upper portion of my view, in oval-like shapes, randomly oriented, with saw-like edges. They are mixed with long, thin and curved brown patterns which link to the bigger brown patterns I mentioned earlier. I can stop. You have probably worked out that I am looking at some trees outside my window.

You may be thinking now: well obviously our brains are analytical tools. We always talk about people’s analytical skills. That is true, but the property I am after here is the effectiveness of the tool. Can we build analytical technologies which are better than our own brains? Faster? Definitely. Bigger capacity? Of course. Better ability to filter information, reason and draw insight? I doubt it!

If I now tell you that the wind is blowing, you will start asking: well, what does this mean? That is related to the reason I am looking out the window in the first place. Let's assume I am planning to go on a picnic. Because I know it is windy, you might conclude that this is likely to degrade the quality of the picnic. So the "business" dilemma here is whether to postpone or cancel the picnic. This now demands further insight. Can I go at any other time? What is at stake? Are there any other parties involved? What if I now told you it is 2 degrees Celsius outside? Or that the planned picnic's location is far away from where I am now, or even that the scheduled time is only tomorrow afternoon?

Your brain is continuously integrating the information I am giving you. It is analyzing each scenario and drawing insight. You barely even notice that your brain is doing all this work for you.

The point I am trying to make is that our brain does not change its capacity or processing speed, yet we generally improve our analytical skills every day through learning. So forget distributed data stores, in-memory computing and processor speeds. The best analytical tool obviously needs all the right information and a knowledge base to draw conclusions, but the real "magic" lies in its ability to filter and integrate information, even (or especially) when there are gaps in the information available to us. Performance will always come second to clear business requirements and well-designed algorithms for reasoning and decision making. It seems to me that we are always chasing the rainbow of bigger and better data stores with stronger processing power, perhaps conveniently forgetting to check our pockets for wisdom, common sense and simplicity.

Therefore, I leave you with this: ask not what you can do with the data, but what the data can do for you.