Wednesday, August 29, 2018

Schema of Life

We communicate using semantics; in other words, a language. It has repeating patterns and rules which enable us to create familiarity that translates to meaning and leads to understanding. We learn languages through exposure to the patterns and rules, and by experimenting until a useful level of understanding is achieved. I am not a linguistics expert of any sort, and I am sure my statement is not exactly what the textbooks would use to define a language. I apologize for that. Nonetheless, I would say it is a reasonable framing of what constitutes a means of communication.


Now, without language, there is no communication. Without communication, there is isolation and disconnect. We spend a great deal of resources to develop, compare and interpret information. However, when we exchange structured information, we often spend very little effort on ensuring that the information we exchange is well understood and exists within a clearly understood contract.


When we sign up to exchange data, we often focus on the channel rather than the semantics. We agree on when, where and how to deliver the information. We also agree on the scope. For example: every Friday, we will get an export of all the new customers in a spreadsheet delivered to server xyz. We agree on the format, which allows us to extract meaning from the data, and also on the usage, provided there is sufficient risk or value embedded in the data. Notice, however, that semantics has a limited presence in this definition.


Things get a bit better when the provider gives you a set of definitions, which is a portion of their semantic definition of the data. However, this definition is often flawed in several ways. Firstly, it is likely to be only partially defined, since the context of the contract is limited to a certain business activity. It would be rare for an organization to provide a definition based on a mature internal semantics language. Secondly, the domain of your business is at least slightly different from your provider's, and your definition of a product is not necessarily the same as your provider's (and the same goes for many other definitions). Lastly, your business is not likely to hold a mature semantics language, for the same reason your provider does not: there is little direct profit in such activities.
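To make the gap concrete, here is a minimal sketch of comparing a provider's definitions with your own. Everything in it is an assumption made for illustration: the function name, the terms and the definitions are hypothetical, and a plain text comparison is only a crude stand-in for a real review by people who know the business.

```python
def compare_glossaries(provider, consumer):
    """Compare two term -> definition mappings (both hypothetical)."""
    shared = provider.keys() & consumer.keys()
    return {
        # Terms we rely on that the provider never defined.
        "missing_from_provider": sorted(consumer.keys() - provider.keys()),
        # Terms the provider defined that we have no definition for.
        "missing_from_consumer": sorted(provider.keys() - consumer.keys()),
        # Terms both sides define, but with different wording.
        "conflicting": sorted(
            term for term in shared
            if provider[term].strip().lower() != consumer[term].strip().lower()
        ),
    }


# Illustrative definitions only; not taken from any real organization.
provider_terms = {
    "product": "A sellable SKU",
    "customer": "Any party with an open account",
}
consumer_terms = {
    "product": "A bundle of services under one contract",
    "customer": "Any party with an open account",
    "date originated": "Date the application was received",
}

report = compare_glossaries(provider_terms, consumer_terms)
for issue, terms in report.items():
    print(issue, terms)
```

Even a rough check like this surfaces the terms that need a conversation before the data is put to use.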


Now, this is the pitfall. When the semantics differ, and dissemination of the semantics is poor, you end up with altered meanings and improper use of the data. It takes teams of knowledge workers to address complaints and quality issues on a daily basis, which we often blame on our provider or on "bugs" in our systems.


To be clear, I am not pointing to any specific organization I have worked with or for. There is not a single institution I have ever come across which does not exhibit this phenomenon. Nonetheless, this is simply a clear indication of low maturity in data management.


So, what do we do?


Well, everything is driven by value or perceived value. There is a price tag to these activities. You would first need to imagine a world without these issues, and imagine the efficiency and the opportunity that come along with effective data management. Then you need to draw attention to the vision. In other words, market it. Finally, you need to work with the leaders of the organization to consciously integrate design and operational behavior changes that reduce ambiguity and create better semantic harmony, both internally and across organizations.


Ultimately, this will lead to a data exchange language which will allow us all to communicate and respond more effectively. Instead of having to ask what people meant by "date originated" or "number of irregular accounts", we can focus on value enhancements and product development rather than reactive and corrective behavior.

Friday, August 24, 2018

The Return to Diversity: The Unexpected Outcome of Global Data Evolution

Back in the day, information was scarce. It took a long time for information to travel from its source to its consumers. Over time we have seen technological breakthroughs, starting with the printing press and moving on to digital media, the internet and social media. With these advancements, our ability to access information has increased. Information travels much faster and the number of available sources has multiplied.

However, it has also become increasingly difficult to verify the sources of information. It has become much cheaper to create content and to publish it to the outside world. Therefore, anyone with any intention can find innovative ways to publish their content in a convincing manner. As a result, it is much harder to be certain that the information you are consuming is the whole truth, or even the perspective you are looking for.

For this reason, we are now in an era where people are starting to diverge into "content groups" based on the concentration of specific sources aligned with a particular view of the world. While there will always be people who cross over between varying sources of information, the majority of people will stick to a set of sources that align with their education, background and experiences. So instead of the free flow of information bringing us closer together, it is actually creating a wider rift and division between groups. This will deepen gaps between geographical areas, ethnic groups, languages and more.

This will eventually result in an increase in the diversity of perspectives, culture and behavior across the globe. However, while there are negative consequences to diversity, there are also benefits. One major advantage is that with diversity comes strength. More perspectives mean more varied ideas and approaches for solving problems and innovating.

We must therefore challenge ourselves, firstly, to recognize that this pattern in the evolution of information as a commodity is a reality. Secondly, we should find ways to use this phenomenon to help us maximize the return on our goals and objectives. You only need to look at recent politics in the U.S. to realize how politicians and corporations use this to their advantage.

The other remaining question is: where will this lead us? Are we going to see more diversity and more drift between groups in human society? Or are we going to see a deliberate growth in diversity with an underlying common set of core values?

How we respond to this change will drive and determine the evolution of mankind. There is little that an individual or a small group can do to control this. However, as a society, we can and should develop a global data governance framework that looks at data holistically across all domains and across all social and economic activity. This framework will enable us to guide this diversity so that a common set of values is protected. These core shared values will help support the ultimate goal of sustainability and the evolution of our species. Otherwise, we are taking a potentially irreversible chance with the future of mankind.

Friday, August 10, 2018

Implementing Data Sharing in a Multi-Stakeholder Ecosystem

Information flow is complex, but it does not seem so to most of the stakeholders involved. At a high level, data flows from the originator, passes through various data handlers and eventually reaches the data consumers. This would have been complex enough had the data remained in its original packaging, but as we know, data is re-organized, filtered, translated and aggregated. This affects the roles and responsibilities of each of the links in the flow of data. Therefore, it is crucial to understand the implications of these processing points and to amend the contract that is attached to the data being processed.

This contract needs to define the meaning of the data, its origination, the constraints imposed by its originator (which need to include the data owners' rights) and the scope, or conditions, which apply to the data. This could be implemented in various ways, but should not be locked into a single medium or format, since most data can be transported over various mechanisms and the contract would be relevant regardless of the mode of storage or representation.
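As one possible illustration, the contract could be sketched as a small structure that travels alongside the data and is amended at each processing point. This is only a sketch under my own assumptions: the class, its field names, the `lineage` attribute and the sample values are all hypothetical, and the same information could just as well live in a document or a catalog entry.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract attached to a dataset, independent of its format."""
    meanings: Dict[str, str]           # field name -> agreed definition
    origin: str                        # where the data originated
    constraints: Tuple[str, ...] = ()  # originator's constraints, incl. owners' rights
    scope: Tuple[str, ...] = ()        # conditions under which the data applies
    lineage: Tuple[str, ...] = ()      # processing steps applied so far (assumed field)

    def undefined_fields(self, record):
        """Return the fields of a record that the contract does not define."""
        return sorted(name for name in record if name not in self.meanings)

    def amended(self, step, new_meanings):
        """Return a new contract after a processing step adds or changes fields."""
        return replace(
            self,
            meanings={**self.meanings, **new_meanings},
            lineage=self.lineage + (step,),
        )


# Illustrative values only.
contract = DataContract(
    meanings={"date_originated": "Date the account was opened"},
    origin="provider weekly customer export",
    constraints=("no resale of customer records",),
    scope=("new customers only",),
)

record = {"date_originated": "2018-08-10", "irregular_accounts": 3}
missing = contract.undefined_fields(record)

# An aggregation step introduced a new field, so the contract is amended.
updated = contract.amended(
    "weekly aggregation",
    {"irregular_accounts": "Count of accounts flagged for manual review"},
)
print(missing, updated.lineage)
```

The contract is immutable here, so each handler produces an amended copy rather than silently changing the one it received.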

To provide an optimal control over data, you need to consider several elements:
  1. Holistic flow chart: starting from origination and extending through the data flow's life cycle as far as is practical and justified from a risk/value perspective.
  2. Governance body: together with the stakeholders who manage the links in the data flow, determine the policy and processes that govern the initiation, use and retirement of data. This would include everything from quality control and issue resolution to data life cycle management and related responsibilities.
  3. Internal governance controls: develop measures and processes to ensure compliance with the data ecosystem policy, while ensuring compliance with related policies around the internal business components which handle the data. For example: while you need to ensure you keep customer data for as long as it is legally permissible, you also need to consider whether keeping it for that long serves a purpose and value to the business (as well as the cost of maintenance and the prolonged handling risk).

The point is that there is an important thread in information handling which is often ignored, and which is often the source of risk exposure, conflicts and misunderstanding, and a barrier to value enhancement. This thread is the need to consider data in an EXTERNAL ecosystem. Most data is not isolated to your business. It co-exists with customers, vendors, policy makers and others. To succeed in this challenge, one needs to stitch a vertical business ecosystem together with a horizontal data life ecosystem. A significant portion of this horizontal ecosystem exists outside your business and control. The challenge is to accept this, identify the risks and opportunities within this fluid position, and create and govern the right mechanisms to maximize the benefits (short and long term) for your business.