Tuesday, September 20, 2011

Data Done Right!

Why data quality?


Quality is important, right? Informed decisions and the availability of precise information are at the heart of every single business activity. But is the level of data quality where it should be? In most cases, the answer is no, and there are valid reasons for that. For example, there is no budget for the source system to improve the quality, or the provider cannot improve the quality due to other constraints. What do we do then? We implement whatever we can to minimize the impact (compensating processes).


But let’s explore the real causes of these limitations. To begin with, data (or information) often travels across business areas. This means that the consumer is truly dependent on the supplier (thanks for the insight, OCDQ). That is a trivial but very important point. If the supplier does not understand the impact of the data, they will tend to prioritize what they perceive as important. This can have an adverse effect on their customers, their customers’ customers, and so on.


Secondly, if you do not trace the origin of the data and its quality back to its true point of creation, you allow yourself to make assumptions and take on unintended risks. There may be quality sensitivities that you are exposing to unknown volatilities. There could be timing delays, approximations in the value of the data, incorrect relationships to associated information and even wrong interpretations of what the data represents.


Is it really all that important? Well, poor data brings the additional complexities (and ongoing costs) of workarounds, an impact on decisions, flexibility and time-to-market, and a repeating cycle of cleaning up data instead of preventing poor data quality by design. Beyond these, there are other implications. One example is the effect on social intelligence, which impacts the capability of a community to operate in unison (that is a whole different story).


What is data quality anyway?


There are various viewpoints from which we can look at the definition of data quality. You can look at it from an intrinsic, a contextual or an operational perspective, for example. Regardless of the perspective you adopt, you end up with the same, or similar, quality dimensions, which affect the ability of the data customer to use the data to its best economic value. Some examples of data quality dimensions include: Accessibility, Accuracy, Reputability, Clarity, Completeness, Conciseness, Consistency, Correctness, Interoperability, Precision, Relevancy, Timeliness, Uniqueness and Validity. To measure data quality, you first need to identify which data matters (i.e. has a meaningful impact on the business). Thereafter, identify which data quality dimensions have the highest impact on that data (i.e. the highest sensitivity). Only then can you set the thresholds and start measuring the status of the data quality.
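To make those three steps a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the sample records, the field names (customer_id, email), the three dimensions chosen, the simplified email pattern and the threshold values are hypothetical and would in practice be agreed with the customer of the data.

```python
import re

# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"customer_id": "C001", "email": "ann@example.com"},
    {"customer_id": "C002", "email": ""},
    {"customer_id": "C002", "email": "bob@example"},
    {"customer_id": "C004", "email": "cat@example.com"},
]

# Hypothetical thresholds (minimum acceptable fraction per dimension).
THRESHOLDS = {"completeness": 0.95, "uniqueness": 1.0, "validity": 0.9}

# Deliberately simplified pattern; an assumption, not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(records, field):
    """Fraction of records that have a non-empty value for the field."""
    if not records:
        return 1.0
    return sum(1 for r in records if r.get(field)) / len(records)


def uniqueness(records, field):
    """Number of distinct values for the field divided by the record count."""
    if not records:
        return 1.0
    values = [r.get(field) for r in records]
    return len(set(values)) / len(values)


def validity(records, field, pattern):
    """Fraction of non-empty values for the field that match the pattern."""
    values = [r[field] for r in records if r.get(field)]
    if not values:
        return 1.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def measure(records):
    """Score the dimensions assumed to matter most for this data."""
    return {
        "completeness": completeness(records, "email"),
        "uniqueness": uniqueness(records, "customer_id"),
        "validity": validity(records, "email", EMAIL_PATTERN),
    }


def failing_dimensions(scores, thresholds):
    """Names of the dimensions whose score falls below its threshold."""
    return sorted(d for d, s in scores.items() if s < thresholds[d])


if __name__ == "__main__":
    scores = measure(RECORDS)
    for dimension, score in scores.items():
        print(f"{dimension}: {score:.2f} (threshold {THRESHOLDS[dimension]:.2f})")
    print("Below threshold:", failing_dimensions(scores, THRESHOLDS))
```

The point of the sketch is the order of operations rather than the specific checks: the fields and dimensions are chosen first, the thresholds second, and only then is anything measured.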


The business, which is always the ultimate supplier and customer of the data internally, has the most intimate understanding of the impact of data. It also has the biggest influence on data quality, since it operates, specifies or otherwise assumes data quality requirements.


So, who owns data?


Does Business or IT own the data? Is it the supplier or the customer of the data? To answer this question, we need to clarify what we mean by ownership. Is it accountability? Exclusivity in making decisions? Well, it depends. It depends on how you choose to define the scope of responsibilities within a data management framework.


Instead of looking at that topic, let’s list what kinds of data management roles may exist. This then feeds into how we look at the vision for data management across an organization. Data management roles may include: a Data Governance Council, Functional Data Governance Boards, Data Sponsors, Data Owners, Data Governors, Data Custodians, Data Stewards and Subject Matter Experts. Without going into further detail on what these roles may mean, one thing should be clear: data is everyone’s problem, not just that of one person or area.


What we advocate, then, is that, as with everything else, when more than one person cares (or should care), conversation is the key to reaching clarity and agreement. These roles are then assigned to existing stakeholders within the organization, depending on how they need to interact with the data. For example, IT may be the custodian of business data and the owner of data relating to the business of IT. Business, customers, data management, architecture, integration, development and support all have something to say about data, but they sometimes take different views on the very same concerns. The overlap of scope in these disciplines simply means that we need to understand each other’s priorities and agree on what needs to be done, and on who does what.


Data management


This is arguably a relatively new discipline that looks at what is required to ensure that data is looked after as efficiently and effectively as any other business asset. Finance is looked after by Finance, risk by Risk Management, and data by Data Management. Just as these areas depend on and affect one another, Data Management depends on and affects them too.


Here is what is in scope for Data Management: governance, data architecture, data development, data operations, data security, reference and master data, data warehousing and business intelligence, content management, metadata management and data quality. (This grouping of functions is defined in the DAMA DMBOK.)


There are other (similar) approaches and players in the thought leadership on data management. Some examples include the IBM infogovcommunity.com and the EDM Council.


What does all this mean?


The impact of data issues is evident in everything from project work to production support. There are undoubtedly numerous examples of poor visibility into the meaning of data, issues relating to gaps in data quality criteria, and costs and risks associated with impaired data management practices. Reflecting on and discussing the type of impact data has on the business will allow you to qualify and quantify data issues, limitations and opportunities.


The single most important requirement is to create a cultural change that shifts everyone’s thinking toward the realization that data optimization is not a side-effect of technical requirements, but rather a strategic enabler. It opens the door to better cost management and to identifying opportunities for business resilience and growth.


So, what can YOU do today?


Talk to people in your organization about how they can save money by driving clarity on data quality, both to help them better understand how they are being impacted by data and to identify costs, risks and opportunities around optimizing data as an asset.