Tuesday, April 29, 2014

Fixing e-Logical Data

How important is a Logical Data Model? On the one hand you have a database, which holds information used by many products and many users. On the other, you have custom queries populating reports requested by specialized teams.

Yet there are commonalities and obvious data flows that live between the two. While some data may be stored more efficiently in a normalized form, the further your data representation drifts from the business reality, the harder the data becomes to manage.

This is exactly where a Logical Data Model comes in. Its purpose is to handle the complexity of mapping many models to one, and vice versa. Ideally, it should help optimize the data structure for both the common data model and the dimensional data models, and assist in the correct translation of the model into “physical” (implemented) structures.
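
To make the “many models to one” idea concrete, here is a rough sketch in Python (the entity, column and label names are all made up for illustration) of how a single logical definition can drive both the normalized physical schema and the labels the reporting teams consume:

```python
# A minimal, hypothetical sketch: one logical definition of "Customer"
# acts as the single point of truth between a normalized physical schema
# and a dimensional/reporting view.
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalAttribute:
    name: str             # business name agreed in the LDM
    physical_column: str  # where it lives in the normalized schema
    report_label: str     # how the dimensional/report layer exposes it


CUSTOMER = [
    LogicalAttribute("customer_id", "crm.customer.id", "Customer Key"),
    LogicalAttribute("legal_name", "crm.customer.legal_name", "Customer Name"),
    LogicalAttribute("country_code", "crm.customer.country_iso2", "Country"),
]


def to_report_row(physical_row: dict) -> dict:
    """Translate a normalized row into the labels the reporting teams expect."""
    return {
        attr.report_label: physical_row[attr.physical_column.split(".")[-1]]
        for attr in CUSTOMER
    }


if __name__ == "__main__":
    row = {"id": 42, "legal_name": "Acme Ltd", "country_iso2": "GB"}
    print(to_report_row(row))
    # {'Customer Key': 42, 'Customer Name': 'Acme Ltd', 'Country': 'GB'}
```

The point is not the code itself but the shape of it: the logical definition sits in one place, and every physical or dimensional representation is derived from it rather than re-invented.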

So what is “e-logical” here, and what is broken that needs fixing? Simply put, the low priority given to the LDM is illogical, and the consequences pervade electronic data storage and processing. The larger and more complex the models become, the greater the implicit impact. Customers put pressure on the business to provide them with accurate and relevant information, while the technology side focuses on optimizing costs. The business therefore sees little value in a “shared” data model, and the database people care mostly about performance and storage cost.

One might think this is trivial and that a logical data model always exists and is well managed. In reality, an LDM usually exists only in part, scattered across multiple products and teams. By chance, or by the nature of the business, you will get similarities, and sometimes by design you will get some good synergies. However, even subtle differences in semantics can lead to an enormous amount of time and energy spent resolving miscommunication, with the meaning and appropriate usage of the data either over- or under-estimated.
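
To show just how small a “subtle difference in semantics” can be, here is a hypothetical illustration: two teams both report on “active customers”, but mean different things by it.

```python
from datetime import date, timedelta

# Hypothetical definitions from two different teams.

def is_active_sales(last_purchase: date, today: date) -> bool:
    # Sales team: "active" means purchased within the last 90 days.
    return (today - last_purchase) <= timedelta(days=90)


def is_active_billing(account_closed: bool) -> bool:
    # Billing team: "active" simply means the account has not been closed.
    return not account_closed


# Without an agreed logical definition, the same "active customer" KPI can
# differ by thousands of records; the LDM's job is to pin the term to one
# definition, or to two explicitly distinct terms.
today = date(2014, 4, 29)
print(is_active_sales(date(2014, 1, 1), today))   # False - 118 days ago
print(is_active_billing(account_closed=False))    # True
```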

The ultimate goal of an LDM is full data standardization. To visualize what a mature LDM framework will deliver, consider the level of standardization that exists with electrical power and communication cables. A manufacturer of a new electronic device refers to the existing standards, and even existing components, to ensure the product is compatible with the standard electrical and auxiliary connections they wish to offer their users. It makes the product more useful and appealing.

In a mature and governed LDM the same notion of appeal and usefulness applies: certain conventions become the standard, and a certain level of quality can be expected. The internet itself is built on layered networking standards (the OSI reference model and the TCP/IP stack) that manage the transport of data, yet it remains largely context-less, focused on connectivity and delivery rather than meaning. As a more visible example, think about the ISO country code standards and natural language. Without a standard for English letters, and agreed language rules and word meanings, you would not be able to read this post.
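
The country code example translates directly into practice. Here is a minimal sketch, with a deliberately tiny mapping table, of what standardizing free-text country names onto ISO 3166-1 alpha-2 codes might look like:

```python
# A tiny, illustrative subset of the ISO 3166-1 alpha-2 standard, used to
# normalize free-text country names into one agreed representation.
ISO_ALPHA2 = {
    "united states": "US",
    "united kingdom": "GB",
    "germany": "DE",
    "france": "FR",
}


def normalize_country(raw: str) -> str:
    """Map a free-text country name to its ISO code, or flag it for review."""
    return ISO_ALPHA2.get(raw.strip().lower(), "UNKNOWN")


print(normalize_country(" United Kingdom "))  # GB
print(normalize_country("Deutschland"))       # UNKNOWN - needs a mapping rule
```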

Now imagine your organization working at that level of data standardization across customer information, product details and supply management. Sounds great… and expensive.

While in the long run it becomes less expensive and mutually valuable to all users, it takes time and effort to get there. What you can do in the meantime is keep the goal in mind and use opportunities to evolve and mature the LDM in your business and industry. Context will always pull your data towards dimensional models, but the need to collaborate will push for standardization.

Tuesday, April 15, 2014

Breaking the Data Chains

Change is hard, and frankly, people do not generally like change. We all enjoy having our routines. The same route to work, the same familiar faces, smells and sounds. It gives us comfort knowing what to expect. Moreover, there are many instances where good habits and predictable behaviour are helpful in maintaining a well-functioning society.

But change can be good, especially when it is the result of a well-thought-out plan. You may want to improve your well-being and start exercising more; you may want to improve your financial well-being, learn new skills and increase your contribution to the business.

Data chains, however, refer to the phenomenon where an organization fears systemic risk to its data and, in effect, avoids changes that may “rock the boat”. It is of course natural and sensible to mitigate risk; however, this should not come at the expense of opportunity, as it can lead to a loss of competitive advantage and missed optimization.

As an example, consider a company that handles customer information. When customers complain that the system limits the amount of information they can enter into an address field, the company responds by providing interactive assistance in abbreviating parts of the address: helping the customer abbreviate “street” to “st.”, “drive” to “dr.”, and so on.
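
A hypothetical sketch of what that kind of “assistance” amounts to in code: squeezing an address into a legacy 50-character column instead of fixing the column itself (the field name and limit here are invented for illustration).

```python
# Anti-pattern: abbreviate and truncate to fit a legacy field,
# rather than fixing the underlying data requirement.
ABBREVIATIONS = {
    "street": "st.",
    "drive": "dr.",
    "avenue": "ave.",
    "apartment": "apt.",
}

MAX_ADDRESS_LENGTH = 50  # the real constraint lives in the physical schema


def squeeze_address(address: str) -> str:
    """Abbreviate common words until the address fits the legacy field."""
    words = [ABBREVIATIONS.get(w.lower(), w) for w in address.split()]
    squeezed = " ".join(words)
    # ...and silently truncate if it still does not fit.
    return squeezed[:MAX_ADDRESS_LENGTH]


print(squeeze_address("1248 Mountain View Drive Apartment 7, Springfield"))
```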

You do not have to be an information architect to appreciate the adverse impact on your customers. Imagine you are the customer, and you want to update your service provider with a new address. Will you spend 10-15 minutes working out how to fit your 150-character street address into a 50-character field? I know I got very irritated when I did that. This is an example of a data chain.

The designers and developers of the system decided to add dubious functionality instead of correcting the flawed handling of the data requirement. Don’t even get me started on the poor interoperability and lifetime cost of maintaining this solution.

People settle for sub-optimal data management solutions with a short- to medium-term outlook. This is a disservice to the business, tying the organization down to complexity and higher systemic risk.

To liberate your organization from data chains, you must create a clear vision and the capability to guide people towards it. This will allow you to make decisive choices about how to design and implement data solutions. And when time and money are constraints (now, when did that ever happen…), factor in the long-term implications of the solution: implement a short-term fix, but secure the funding and commitment to revisit the problem and ensure the long-term negative impact is addressed.

Now go on, set your data free...