As companies struggle to process, store, and leverage ever-increasing amounts of structured and unstructured data, data governance is becoming a critical part of every company’s data management.
Governance not only helps a company understand and use its data, but also ensures everyone has access to the data they need, when they need it. “Data doesn’t have much value if it lies dormant in your system, where no one can gain insight from it,” says Salim Syed, head of engineering for Capital One Slingshot. “A well-governed data platform brings data out of that darkness.”
Effective governance also enables a company to implement and manage internal policies and standards related to the security and usage of data. This not only supports a company’s response to external compliance directives, but also standardizes the data for use across the company. Standardized data provides the “single source of truth” required for critical business decisions, as well as the data quality and trustworthiness teams need to do their jobs.
Data governance challenges
On the surface, implementing data governance might seem obvious and straightforward, but the act of governing data across a company’s teams and products introduces levels of complexity that many companies either half-heartedly attempt to address or avoid altogether.
Instilling the processes, policies, and protections of governance requires new mindsets around people, processes, and technology. “It’s not the run-time activities that persuade someone not to do governance,” says Syed. “It’s all the work that’s needed to set up governance.”
For many, the approach to data governance is to establish policies that are overseen by individual sectors of the business, which makes implementation all the more difficult. “Think about all the different teams that are doing that in a large organization,” explains Syed. “They all have to do that dependency check, and each team is also doing separate development work to meet those requirements, which is a lot of duplicated effort.”
A siloed data governance initiative that requires each team to monitor its own data dependencies takes time and effort away from other work as well. “It becomes cumbersome to innovate because at every step of innovation, you have to check if there are dependencies on your governance policies,” says Syed.
Siloed approaches also introduce the possibility of error and make it more difficult to ensure all governance policies are followed consistently, in all cases. These hurdles can result in a lack of buy-in from employees and stakeholders, undermining whatever benefits data governance would otherwise deliver.
Federated governance solution
In many companies, data is viewed as an IT asset, and thus an IT responsibility. Although that might have been true in the past, the volume and speed of data today, and the innovative ways companies are using their data, mean data is now the responsibility of, and the driving force for, all business units.
To build an effective data governance program that serves every area of the business, it’s best to centralize the framework to reduce both errors and duplicated effort. “For federated teams to be successful in applying data management rules and governance, you can’t just set a policy and let every team go build technology to enforce it,” says Syed. A centralized approach is less complicated to monitor, facilitates data consistency and accuracy, and is easier to make transparent, all of which helps with stakeholder buy-in. “If you have a centrally managed data platform, a centrally managed data ingestion pipeline, and a centrally managed data policy, then you only make changes to [the data] in one place,” Syed explains. This ensures data remains compliant, secure, and consistent wherever it is used.
A best practice in establishing a centralized data governance initiative, Syed argues, is to build a central data catalog. All incoming data is ingested into a central location, where it is first classified: the data is identified and labeled with metadata, and restriction levels are determined. From there, access and permissions can be assigned, which facilitates sharing across the organization. “With a centralized catalog, wherever your data resides, it’s the map,” explains Syed. “Once it’s cataloged and classified, then you can share. You can basically break the silo.”
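To make the catalog idea concrete, here is a minimal Python sketch of the flow described above: register a dataset, classify it on ingestion, then answer access questions from one shared policy. Everything in it is an illustrative assumption rather than anything from Capital One's platform: the `DataCatalog` and `CatalogEntry` classes, the three restriction levels, the rule that a `pii` tag makes a dataset confidential, and the example locations are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Restriction(IntEnum):
    """Hypothetical restriction levels; a higher value is more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2


@dataclass
class CatalogEntry:
    name: str
    location: str  # where the data physically resides
    tags: set[str] = field(default_factory=set)
    restriction: Restriction = Restriction.INTERNAL


class DataCatalog:
    """A central map of datasets plus one shared access policy."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}
        # The single, centrally managed policy: the highest
        # restriction level each role is cleared to read.
        self._clearance: dict[str, Restriction] = {}

    def register(self, name: str, location: str, tags: set[str]) -> CatalogEntry:
        # Classify on ingestion. The "pii" rule is an assumption
        # chosen for illustration, not a real classification scheme.
        level = Restriction.CONFIDENTIAL if "pii" in tags else Restriction.INTERNAL
        entry = CatalogEntry(name, location, set(tags), level)
        self._entries[name] = entry
        return entry

    def set_clearance(self, role: str, level: Restriction) -> None:
        self._clearance[role] = level

    def can_read(self, role: str, name: str) -> bool:
        entry = self._entries[name]
        # Roles with no recorded clearance may read public data only.
        return self._clearance.get(role, Restriction.PUBLIC) >= entry.restriction


catalog = DataCatalog()
catalog.register("customers", "s3://example-bucket/customers", {"pii"})
catalog.register("web_traffic", "warehouse.analytics.web_traffic", {"clickstream"})
catalog.set_clearance("analyst", Restriction.INTERNAL)
```

With this setup, `catalog.can_read("analyst", "web_traffic")` is allowed and `catalog.can_read("analyst", "customers")` is denied. Raising the analyst clearance is one call to `set_clearance`, and every later check picks it up, which is the "change it in one place" property the centralized approach is after.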
A data catalog effectively
By: MIT Technology Review Insights
Title: Increasing amounts of data require holistic governance
Sourced From: www.technologyreview.com/2022/07/11/1055450/increasing-amounts-of-data-require-holistic-governance/
Published Date: Mon, 11 Jul 2022 16:00:00 +0000