9 answers to all of your big data governance questions

In a computing environment characterized by decentralized sources of information and limited data management practices, organizations face challenges when seeking answers to basic questions about their operations. This may seem counterintuitive, but the more dynamic and successful the enterprise, the greater its challenges around data governance.

Companies must implement large-scale data systems that offer long-term viability, reduce technical debt, and provide a basis for AI-enabled products that drive internal efficiencies as well as external growth.

I was recently joined by David Totten, Chief Technology Officer, US Partner Ecosystem at Microsoft, and a group of senior technology leaders to discuss the data governance journey. The insights and recommended actions from that discussion are organized below around nine common questions.

How is data governance characterized today?

Today, data governance is built around a mindset of democratizing data while also taking responsibility for it. There are three focus areas:

  • Data standards and rules on modernizing data foundations

  • Making sure the accountability of users is well understood

  • Leveraging technology to do a lot of the work

What are the components of data governance?

Data governance consists of the familiar trio of people, process, and technology. Many companies, however, don’t implement all three components. Some treat data governance purely as a technology solution. Others think about it only in terms of data leaks and focus on closing off every potential leak to external sources.

In practice, data governance is about 1) the data, 2) who has access to it, and 3) making sure that everyone who needs it has access to it. Technology, people, and process are all required for it to be effective.

What is the proportion of each component?

Leverage technology as much as possible. There are now platforms that can get data governance 95% of the way there with features like pre-built functions, easy access, and automated alerts when policies disagree with each other. The remaining 5% is people and process: humans making judgement calls that are beyond the capabilities of technology.

Baseline processes can be part of the technology platform, but rules can't be defined by algorithms. People need to be the final arbiters of what processes should look like in a given organization.
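To make the idea of "automated alerts when policies disagree with each other" concrete, here is a minimal sketch in Python. It is not how any particular governance platform works. The `Policy` record, the `find_conflicts` helper, and the sample policy names are all hypothetical, invented for illustration. The code only flags the disagreement and leaves the decision to a person.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Policy:
    """Hypothetical access policy: one rule for one role on one dataset."""
    name: str
    dataset: str
    role: str
    effect: str  # "allow" or "deny"


def find_conflicts(policies):
    """Return name pairs of policies that give opposite answers
    for the same dataset and role."""
    conflicts = []
    for a, b in combinations(policies, 2):
        same_target = (a.dataset, a.role) == (b.dataset, b.role)
        if same_target and a.effect != b.effect:
            conflicts.append((a.name, b.name))
    return conflicts


# Illustrative policies only; the names are made up.
policies = [
    Policy("finance-open", "sales_ledger", "analyst", "allow"),
    Policy("pii-lockdown", "sales_ledger", "analyst", "deny"),
    Policy("marketing-read", "campaigns", "analyst", "allow"),
]

conflicts = find_conflicts(policies)
for first, second in conflicts:
    # The tool raises the alert; a person decides which policy wins.
    print(f"Needs human review: {first} vs {second}")
```

The technology does the tedious comparison, and the judgement call stays with the governance owners.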

Is there a standard approach to data governance?

No. Data governance is highly contextual. It will differ by industry, and even between companies in the same industry, as well as across geographies. There is no one-size-fits-all playbook for solutions, though there are solutions that can be used as building blocks and modified as needed.

The road to get there, however, can look similar in terms of the steps needed to put the best plan into place. Here are some steps to get started:

  • Data assessment: data catalog at table level, use cases

  • Data audit: evaluation of key data metrics including data quality, sparsity, latency, consistency, etc.

  • Strategy analysis: development of high points of data strategy

  • Gap analysis: understanding of missing elements in the overall data governance stack, with remediation recommendations
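As a rough illustration of the data audit step, the sketch below computes three of the metrics named above (sparsity, latency, and consistency) for a small in-memory table. The function name `audit_table`, the field names, the sample rows, and the exact metric definitions are assumptions made for this example. A real audit would use definitions agreed within the organization.

```python
from datetime import datetime, timezone


def audit_table(rows, fields, timestamp_field, allowed_values, now):
    """Compute illustrative audit metrics for a list of dict records."""
    # Sparsity: share of expected cells that are empty.
    total_cells = len(rows) * len(fields)
    missing = sum(1 for row in rows for f in fields if row.get(f) in (None, ""))
    sparsity = missing / total_cells if total_cells else 0.0

    # Latency: hours since the most recent record was loaded.
    stamps = [row[timestamp_field] for row in rows if row.get(timestamp_field)]
    latency_hours = (now - max(stamps)).total_seconds() / 3600 if stamps else None

    # Consistency: share of populated values that match an agreed vocabulary.
    checked = valid = 0
    for row in rows:
        for field, allowed in allowed_values.items():
            value = row.get(field)
            if value in (None, ""):
                continue
            checked += 1
            valid += value in allowed
    consistency = valid / checked if checked else 1.0

    return {
        "sparsity": sparsity,
        "latency_hours": latency_hours,
        "consistency": consistency,
    }


# Made-up sample data for the example.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
rows = [
    {"id": 1, "country": "US", "loaded_at": datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc)},
    {"id": 2, "country": "usa", "loaded_at": datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)},
    {"id": 3, "country": None, "loaded_at": datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)},
]
report = audit_table(rows, ["id", "country"], "loaded_at", {"country": {"US", "CA"}}, now)
print(report)
```

In this sample, one of six expected cells is empty, the newest record is six hours old, and one of the two populated country values falls outside the agreed vocabulary. These are the kinds of findings that feed into the gap analysis.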

Once these steps are complete, you can begin developing a prototype and ultimately launch a version to share with key stakeholders within the business, demonstrating its applied value to the existing workflow. This whole process of beginning to productize data governance typically takes 12 weeks.

What does a steady state look like in terms of data governance?

There is no steady state or end date. Data governance must always evolve and change over time to keep pace with changes in the company. Even the most data-disciplined organization must maintain flexibility.

One acquisition, one change of ownership, a shift in geography, unexpected regulatory changes—these are just a few examples of occurrences that can require radical changes in data needs and usage, and therefore similarly radical changes in data governance strategy. This can be quite a challenge for large enterprises simply because of the sheer volume of data involved.

Who in an organization should be accountable for the data governance strategy?

In a perfect world, it would be one person: a czar, so to speak. But finding that intelligent, future-proof, strategic, detail-oriented individual is close to, if not completely, impossible. In our not-so-perfect world, the best option is a governance committee with people from different parts of the business and different levels of expertise in both technology and people.

This is the most effective collaborative environment for ensuring data governance. This approach also helps get buy-in from different parts of the organization, which means that it is easier to apply standards, embed personal accountability, and prioritize investments around infrastructure and services.

In our democratized data environment, where does governance sit?

Governance cannot just sit as a gate into the business. This approach can present big risks to data quality and reliability. In these days of democratized data, governance needs to be a continuous observation function rather than a serial step in the overall data process. Picture it as a rectangular box along the bottom of the data lake, with checkpoints running all the way through the data flow, up to and including the feedback loop with customers.

What is the biggest problem with data governance implementation?

Too many technologies: "I want this for my database. I want that for my data lake. I want one for my ingestion logic, I want another for my control and yet another for my data management taxonomy tool." It’s important to standardize on one platform for all of it.

Are there elements of governance that can accelerate the business?

Data democratization carries the potential to increase business velocity. The right governance approach can expose previously unknown or overlooked data sources. There has to be a way to enable these kinds of discoveries rather than preventing access to new data sources.

Governance can also free up data assets from siloed systems, allowing previously unavailable looks at relationships between assets from different departments. These new doorways to data can have a significant impact on a business’s trajectory and speed of growth.

The discussion concluded by coming full circle back to the definition of data governance: it's a fluid construct based on where the data organization is, where stakeholders are, the biggest risks in the roadmap, and how mature the company is. Finally, it includes people, process, and technology that is leveraged as much as possible.