Sunday, December 4, 2011

Data Governance and Wrapping Paper


Your unusual uncle has just turned 80, and told you that his single special wish is for ALL his presents to be wrapped in "baby yellow" wrapping paper with large orange triangles (go figure...).

You then start spreading the word. You tell cousin Jean "yellow wrapping paper with triangles" (out go "baby" yellow, "orange" and "large" triangles). Cousin Jean tells auntie Gina "light-colored wrapping paper with some symmetric shapes" (out go "yellow" and "triangles")...

You can imagine where this is going... By the time the presents are set on the birthday table, not a single one is wrapped in anything that even resembles what your uncle asked for...

What we have here is a classic case of a "broken telephone" (like the children's game, where they have to whisper a message across a line of kids).
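The degradation in this chain can be sketched in a few lines of code. This is purely an illustration (the relay function and attribute names are mine, not from the post): each hop passes the message on, but keeps only the attributes that person happened to register.

```python
# A toy model of the "broken telephone": each relay silently drops
# the attributes it did not consider important.

def relay(message, dropped_attributes):
    """Pass the message on, minus some attributes."""
    return {k: v for k, v in message.items() if k not in dropped_attributes}

# The uncle's original specification.
spec = {"base_color": "baby yellow", "shape": "triangle",
        "shape_color": "orange", "shape_size": "large"}

# Each person in the chain loses a little more of the original wish.
after_jean = relay(spec, {"shape_color", "shape_size"})
after_gina = relay(after_jean, {"base_color", "shape"})

print(after_jean)  # detail already lost
print(after_gina)  # {} - nothing of the original wish survives
```

Two hops are enough to lose everything; no single relay felt they were doing anything wrong.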

This is obviously a severe failure in data governance. Did you question why the specific colors and shapes matter? No! Did everyone carefully comply with a minimum set of specified criteria? No (they were possibly never clearly defined). Did anyone check that cousin James, all the way from across the Atlantic, understood that the yellow-light-faded-brown tissue paper does NOT actually mean toilet paper?! Well, no (that is why your uncle's new golf bag is wrapped in toilet paper!).

Information governance is a social problem - it is not limited to business. Have you ever wondered why we spend YEARS teaching our children to build competence in communication patterns (language), yet very little time (if any) nurturing an awareness of the fitness of information and its adherence to quality requirements?

And people wonder why I call our times the age of information chaos...

Wednesday, November 9, 2011

Shaping the future of information management


The life cycle of information starts with the ability to notice a signal which has a meaning to the observer. It ends when no relevant trace of that information remains in the behavior of the entity. Information travels between entities; it influences, and is affected by, the dynamics that exist within a population.

Communication utilizes a commonly agreed set of patterns which represent signals that carry meaning for a given audience. Language is then an agreed standard of patterns, used to construct more complex patterns - commonly known as words.

When children attend school, they are exposed to the rules that govern the patterns - effectively training them on using a protocol. This is then used to build even more complex structures which move from words to sentences to statements to topics and to areas of knowledge.

What does this have to do with information management? If you spell a word incorrectly, you might be able to compensate for it using the context of the sentence - even if you use a different language, or omit the word altogether. But if you pile up multiple issues, including the use of different languages between the speaker and the listener, a communication breakdown becomes inevitable.

What happens when a word in the dictionary changes? People will continue to use the commonly known meaning - until they are told otherwise. Even then, if their counterparts are not aware of the change, they will refrain from using the new meaning.

How do we teach our children to manage information? We expose them to scenarios in which certain information is valid, and instill within them the idea that they must OWN the information replica that they are given.

Take, for example, a telephone number. We teach children to manage an address book - be it in memory, on paper or in a digital medium. We tell them how to use the number and for what purpose.

There is a flaw in this model.

The person does not OWN the number. S/he receives a snapshot of the information, which is static enough to justify the effort of creating and maintaining a replica. Given the "business requirements" and the "quality attributes" of this scenario, that is "good enough" for most cases and the information is "fit for use".

Still sound OK? Well, we have completely ignored data governance. Have we taught them what to do when the phone number changes, or stops working? Have we ensured that when our phone number changes, we are able to effectively update all replicas of this information?

The ideal process is to maintain a link to all the dependent replicas and to send a message to the owner of each replica when the information changes. When did you last tell someone to ensure they are able to manage all the dependent replicas of their data? Do they understand why it is important? What is the impact?
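That ideal process can be sketched as a simple push-based notification: the master record keeps links to every dependent replica and pushes each change to all of them. A minimal sketch (class and attribute names are my own invention, not from the post):

```python
# The master record keeps links to all dependent replicas and
# notifies every one of them when the value changes.

class MasterRecord:
    def __init__(self, value):
        self.value = value
        self.replicas = []            # links to all dependent replicas

    def register(self, replica):
        self.replicas.append(replica)
        replica.value = self.value    # hand out the initial snapshot

    def update(self, new_value):
        self.value = new_value
        for replica in self.replicas:  # push the change to every owner
            replica.value = new_value

class Replica:
    def __init__(self, owner_name):
        self.owner_name = owner_name
        self.value = None

phone = MasterRecord("+27 11 555 0100")   # placeholder number
jean = Replica("cousin Jean")
gina = Replica("auntie Gina")
phone.register(jean)
phone.register(gina)

phone.update("+27 11 555 0199")  # one change, all replicas stay in sync
print(jean.value, gina.value)
```

Notice how the burden shifts: the owner of the master value, not each replica holder, carries the responsibility of propagating the change.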

In the phone example, it is easy to understand the "business impact" of the number changing and people not being able to get hold of you. Before you run off to list your key stakeholders and develop a process for publishing a change in your contact details - rest assured, this does not end here. Consider medical information: another, arguably critical, type of information with multiple stakeholders. Are you confident all the necessary "users" have "good enough" access to these types of changes? What about when an emergency arises?

How do we ensure that people think about information management in this way? Simply put - we must educate them to evaluate the impact of changes in the quality levels of information. As long as we succeed in linking it to the real “business impact”, we can enable people and organizations to identify the critical attributes and define the necessary controls and processes which will minimize the risk of losing the necessary quality of the information.

To entrench this in the generations to come, we must teach our children to challenge the quality of information, to analyze the scenarios in which quality fails to support the required usage, and to invest in mitigation strategies to a sensible (economic) extent.

Now go plan your contact details change publication strategy…


Saturday, October 22, 2011

Teneo Vulgo

The term "Teneo Vulgo", or "commonly known", can be used to describe the new era of information management.

Social media, the semantic web and information governance are taking the world by storm. What distinguishes these "technologies" from what we have seen before is the human phenomenon they enable. The power of mass validation of content, and the improving ability to organize and manage information, are changing the way we engage with and value information.

Let's take a few steps back. Before internet search capabilities (and the build-up of online content), we had to rely on expert knowledge to answer even the simplest questions. We would ask family, friends and teachers, or visit a library (that physical place with ink-printed, paper-bound books). Today, you can do all that, but in a "real-virtual" sense, amplifying the effect of a single query by a factor of about a million.

Is that good? Well, you do get all the information you need (hopefully), but you also get a lot of information that you do not need. We obtain more information more quickly, but then we need to sift through and clean it. This takes time, and increases the risk of using incorrect information which may initially appear credible. In the past, you had to spend far more time and effort collecting information. You had to identify the right sources, to avoid wasting considerable time and money obtaining content of no use to you.

This is analogous to the difference between software architecture and structural architecture. The latter has always been understood as requiring careful planning: it is too costly to build a bridge or a house incorrectly. Software, on the other hand, was initially perceived as cheap and easy to rebuild. In recent years, however, it has become clear that careful design matters for software development too, given the costs of incorrect implementation, re-deployment, application lifetime and maintenance.

Teneo Vulgo, in a sense, is the recognition that the information we deal with is not ours alone. Other people can connect to, verify and dispute our data, and in certain situations also change it. This requires an appreciation of the life cycle of information, and a change in our thinking as well as in our fundamental skills and capabilities as they relate to communicating and organizing information.

Why do people care about where, how and why information is being shared?

Society, whether big or small, relies on information sharing to promote a common goal. People want to share information, because they believe that this will assist in realizing an outcome they support. You might tweet about an earthquake you experienced; not only to express your emotions and let your communities know you are ok, but also because you believe that this will promote greater awareness. This, in turn, is likely to accelerate the scale and speed of response from authorities and other organizations. You might blog about Teneo Vulgo, because you believe there is a need to debate and evolve this concept in order to accelerate the momentum for the human race to improve its communal intelligence.

But how does mass media fit into this picture? Information systems, like any other system, can be optimized by improving the response to changes in the system's input. The best way to do this, if the conditions allow, is to flood the system with inputs, observe its response and learn how to tune the controls imposed on it. Mass media acts as the flood, the semantic web as a structure for the inputs and responses (outputs), and information governance as the control over the system of information dissemination.

"Commonly known" is the optimized result of such a collective knowledge system. Once we have analyzed and tuned our measures to manage mass knowledge - we have developed a capability to answer questions not through the use of a single set of unverified sources, with an unverified method - but rather via a well-tuned and self-optimizing communal knowledge management system. This then is the foundation of progressive social intelligence, and differentiates societies in terms of their ability to steer towards common goals.

These are not necessarily new patterns of behavior. Nonetheless, it is important to note that the potential to optimize communal knowledge through this framework is approaching its peak. To make the most of this opportunity, communities will develop technologies that allow them to build and control communal knowledge.

So what can we do today? We need to teach our children to manage the knowledge they connect with, through a full appreciation of its life cycle and all its stakeholders. We also need to become more conscious of the challenges of connecting communal knowledge, in order to help advance the intelligence of our communities.

Tuesday, September 20, 2011

Data Done Right!

Why data quality?


Quality is important, right? Informed decisions and the availability of precise information are at the heart of every single business activity. But is the level of data quality where it should be? In most cases, the answer is no, and there are valid reasons for that. For example: there is no budget for the source system to improve the quality; the provider cannot improve the quality due to other constraints; etc. What do we do then? We implement whatever we can to minimize the impact (c o m p e n s a t i n g   p r o c e s s e s).


But let’s explore the real causes of these limitations. To begin with, data (or information) often travels across business areas. This means that the consumer is truly dependent on the supplier (thanks for the insight, OCDQ). That is a trivial, but very important, point. If the supplier does not understand the impact of the data, they will tend to prioritize what they perceive as important. This can have an adverse effect on their customers, their customers’ customers, and so on.


Secondly, if you do not question the origin of the data and its quality, all the way back to its true point of creation, you allow yourself to make assumptions and take on unintended risks. There may be quality sensitivities that you are exposing to unknown volatilities. There could be timing delays, approximations of the value of the data, incorrect relationships to associated information, and even wrongful interpretation of what the data represents.


Is it really all that important? Well, besides the additional complexity (and ongoing cost) of workarounds, the impact on decisions, flexibility and time-to-market, and the repeating cycle of cleaning up data instead of preventing poor data quality by design - there are other implications of poor data. One example is the effect on social intelligence, which impacts the capability of a community to operate in unison (that is a whole different story).


What is data quality anyway?


There are various viewpoints from which we can look at the definition of data quality: intrinsic, contextual or operational, for example. Regardless of the perspective you adopt, you end up with the same, or similar, quality dimensions which impact the ability of the data customer to use the data to its best economic value. Some examples of data quality dimensions include: accessibility, accuracy, reputability, clarity, completeness, conciseness, consistency, correctness, interoperability, precision, relevancy, timeliness, uniqueness and validity. To measure data quality, first identify which data matters (i.e. has a meaningful impact on the business). Thereafter, identify which data quality dimensions have the highest impact on that data (i.e. the highest sensitivity). Only then can you question the thresholds and start measuring the status of the data quality.
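Those measurement steps can be made concrete with a small sketch: pick the fields that matter, choose the most sensitive dimensions, agree on thresholds, then measure against them. The field names, sample records and threshold values below are invented for illustration only:

```python
# Sample customer records with deliberate quality gaps.
records = [
    {"customer_id": "C1", "phone": "+27115550100", "email": "a@x.com"},
    {"customer_id": "C2", "phone": "",             "email": "b@x.com"},
    {"customer_id": "C3", "phone": "+27115550102", "email": ""},
]

def completeness(records, field):
    """Share of records where the field is populated."""
    return sum(1 for r in records if r[field]) / len(records)

def uniqueness(records, field):
    """Share of populated values that are distinct."""
    values = [r[field] for r in records if r[field]]
    return len(set(values)) / len(values)

# Thresholds per (field, dimension), agreed with the data's customers.
thresholds = {
    ("phone", "completeness"): 0.95,
    ("email", "completeness"): 0.80,
    ("customer_id", "uniqueness"): 1.00,
}

measures = {
    ("phone", "completeness"): completeness(records, "phone"),
    ("email", "completeness"): completeness(records, "email"),
    ("customer_id", "uniqueness"): uniqueness(records, "customer_id"),
}

for key, required in thresholds.items():
    status = "OK" if measures[key] >= required else "BREACH"
    print(key, round(measures[key], 2), status)
```

The point is not the code itself, but the order of operations: the thresholds come from the business impact conversation, and only then does measurement begin.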


The business, which is always the ultimate supplier and customer of the data internally, has the most intimate understanding of the impact of data. It also has the biggest influence on data quality, since it operates, specifies or otherwise assumes data quality requirements.


So, who owns data?


Does Business or IT own the data? Is it the supplier or the customer of the data? To answer this question, we need to clarify what we mean by ownership. Is it accountability? Exclusivity in making decisions? Well - it depends. It depends on how you choose to define the scope of responsibilities within a data management framework.


Instead of looking at that topic, let’s list what kind of data management roles may exist. This then feeds into how we look at the vision for data management across an organization. Data Management roles may include: A Data Governance Council, Functional Data Governance Boards, Data Sponsors, Data Owners, Data Governors, Data Custodians, Data Stewards and Subject Matter Experts. Without going into further details on what these roles may mean, one thing should be clear: data is everyone’s problem – not just one person or area.


What we advocate, then, is that - as with everything else - when there is more than one person who (should) care, conversation is the key to reaching clarity and agreement. These roles are then assigned to existing stakeholders within the organization, depending on how they need to interact with the data. For example, IT may be the custodian of business data and the owner of data relating to the business of IT. While business, customers, data management, architecture, integration, development and support all have something to say about data, they sometimes take different views on the very same concerns. The overlap of scope in these disciplines simply means that we need to understand each other’s priorities and agree on what needs to be done, and on who does what.


Data management


This is arguably a relatively new discipline that looks at what is required to ensure that data is efficiently and effectively looked after, like any other business asset. Finance is looked after by Finance, risk by Risk Management, and data by Data Management. And just as these areas depend on and affect one another, so does Data Management.


Here is what is in scope for Data Management: governance, data architecture, data development, data operations, data security, reference and master data, data warehousing and business intelligence, content management, meta-data management and data quality. (These groupings of functions are defined in the DAMA DMBOK.)


There are other (similar) approaches and players in the thought leadership on data management. Some examples include IBM's infogovcommunity.com and the EDMCouncil.


What does all this mean?


The impact of data issues is evident in everything from project work to production support. There are undoubtedly numerous examples of poor visibility on the meaning of data, issues relating to gaps in data quality criteria, and costs and risks associated with impaired data management practices. Reflecting on, and discussing, the type of impact data has on business will allow you to qualify and quantify data issues, limitations and opportunities.


The single most important requirement is to create a cultural change which would shift the minds of everyone to realize that data optimization is not a side-effect of technical requirements, but rather a strategic enabler. It opens the door for better cost management and for identifying opportunities for business resilience and growth.


So, what can YOU do today?


Talk to people in your organization about how they can save money by driving clarity on data quality - both to help them better understand how they are being impacted by data, and to identify costs, risks and opportunities around optimizing data as an asset.

Saturday, April 2, 2011

Data Quality - Let's not forget the people, people!

Looking around at what people say, there seems to be a growing conversation that data quality is not just about the processes. For me, data governance in its sociological context - coupled with a true understanding of the business impact - is the key to effective data quality management.

Of course you need Six Sigma, the TDQM framework, the Zachman Framework, etc. However, unless you explain to people that data quality is really about healing and preventing issues, you will never get more than the budget and commitment for patch-up jobs.

I am not negating data quality technologies, methods or metrics. On the contrary, these are essential tools to SUPPORT data governance and ensure data quality compliance. The bottom line is that people need to believe that data quality is the right thing to do in order to commit to the role they play in this process. This you can only achieve by positioning data quality against their true motivational drivers (the only real reasons they would agree to meet with you in the first place).

Strategy, budgets, relationships, politics, fears, aspirations - all play an important role here. While I support and believe in the traditional quality methodologies, data forms an essential part of the connection between the purpose of the business and the components which realize it. Therefore, the human element is a natural parameter in the equation. Remember the three types of resources which enable business (people, processes and technology)?

I will say it again: Let's not forget the people, people!