The potential impact of the ongoing worldwide data explosion continues to excite the imagination. A 2018 report estimated that every second of every day, every person produces 1.7 MB of data on average. Annual data creation has more than doubled since then and is projected to more than double again by 2025. A report from McKinsey Global Institute estimates that skillful uses of big data could generate an additional $3 trillion in economic activity, enabling applications as diverse as self-driving cars, personalized health care, and traceable food supply chains.
But adding all this data to the system is also creating confusion about how to find it, use it, manage it, and legally, securely, and efficiently share it. Where did a certain dataset come from? Who owns what? Who’s allowed to see certain things? Where does it reside? Can it be shared? Can it be sold? Can people see how it was used?
As data’s applications multiply and spread, the producers, consumers, owners, and stewards of data are finding that they don’t have a playbook to follow. Consumers want to connect to data they trust so they can make the best possible decisions. Producers need tools to share their data safely with those who need it. But technology platforms fall short, and there are no real common sources of truth to connect both sides.
How do we find data? When should we move it?
In a perfect world, data would flow freely like a utility accessible to all. It could be packaged up and sold like raw materials. It could be viewed easily, without complications, by anyone authorized to see it. Its origins and movements could be tracked, removing any concerns about nefarious uses somewhere along the line.
Today’s world, of course, does not operate this way. The massive data explosion has created a long list of issues and opportunities that make it tricky to share chunks of information.
With data being created nearly everywhere within and outside of an organization, the first challenge is identifying what is being gathered and how to organize it so it can be found.
A lack of transparency and sovereignty over stored and processed data and infrastructure opens up trust issues. Today, moving data to centralized locations from multiple technology stacks is expensive and inefficient. The absence of open metadata standards and widely accessible application programming interfaces can make it hard to access and consume data. The presence of sector-specific data ontologies can make it hard for people outside the sector to benefit from new sources of data. Multiple stakeholders and difficulty accessing existing data services can make it hard to share without a governance model.
Europe is taking the lead
Despite the issues, data-sharing projects are being undertaken on a grand scale. One project, backed by the European Union and a nonprofit group, is creating an interoperable data exchange called Gaia-X, where businesses can share data under the protection of strict European data privacy laws. The exchange is envisioned as a vessel to share data across industries and a repository for information about data services around artificial intelligence (AI), analytics, and the internet of things.
Hewlett Packard Enterprise recently announced a solution framework to support the participation of companies, service providers, and public organizations in Gaia-X. The dataspaces platform, which is currently in development, is cloud native and based on open standards. It democratizes access to data, data analytics, and AI by making them more accessible to domain experts and common users. It provides a place where experts from domain areas can more easily identify trustworthy datasets and securely perform analytics on operational data, without always requiring the costly movement of data to centralized locations.
By using this framework to integrate complex data sources across IT landscapes, enterprises will be able to provide data transparency at scale, so everyone—whether a data scientist or not—knows what data they have, how to access it, and how to use it in real time.
Data-sharing initiatives are also at the top of enterprises’ agendas. One important priority enterprises
By: Janice Zdankus, Robert Christiansen
Title: Getting value from your data shouldn’t be this hard
Sourced From: www.technologyreview.com/2021/10/19/1037290/getting-value-from-your-data-shouldnt-be-this-hard/
Published Date: Tue, 19 Oct 2021 16:00:00 +0000