Data is a life-changer. But even as it empowers you, it puts you at risk. Paradoxical as it may sound, data can give you the deepest insights, provided it is structured well and analyzed with the appropriate tools. But it’s a disaster when it falls into the wrong hands, and a waste when it is siloed across multiple systems. The latter problem is significant today, when data is spread across your cloud, on-premises systems, SaaS applications, and more, making it difficult to bring it all together and glean meaningful insights. Such data silos limit what you can do with the data, threaten data integrity, waste resources, and make collaboration almost impossible.
To address this problem of siloed data, Microsoft has come up with a new unified data governance service called Azure Purview that brings together metadata from different sources, builds a map of your data landscape, classifies sensitive data, traces data lineage, and more.
With it, you gain better control over your data and can get in-depth insights for streamlined decision-making. Let’s take a deep dive into this new service to see how it can benefit you.
Azure Purview: Key features and their benefits
Available in public preview, Azure Purview is designed to bring all your data together for improved data management, governance, and visibility.
One of the highlights of Azure Purview is that it draws on Microsoft’s collective experience with Bing, indexing, and Azure search capabilities, so you can expect its search and discovery to be top-notch.
Mike Flasko, partner director of program management for Azure Purview, says that the service grew out of Microsoft’s own need for streamlined data governance and data mapping. In this sense, Microsoft itself is one of Purview’s customers, so the product addresses many real-world problems faced by large companies today.
But is it as significant as Microsoft claims it to be?
To answer this question, let’s look at some existing problems in data use and governance and how Azure Purview handles them.
Bridging data types and formats
A common problem that companies face is that different platforms and applications generate data in varying formats and types, from columnar data to files, which makes interoperability difficult. There is currently no easy way to bridge these data types and connect the data without significant time and effort.
The many processes associated with each dataset make it even more cumbersome to link the data and bring it together.
But Azure Purview makes this easy. All the administrator has to do is go to the classification settings and choose the data types and formats that must be scanned and indexed. Purview scans the metadata and, in search results, displays all the associated data regardless of type and format.
For example, if you search for the term “marketing,” it will bring up all relevant data. This can include spreadsheets, reviews, blob objects, newsletters, and just about any related information, even if it is stored in different formats.
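To make the idea concrete, here is a minimal Python sketch of a keyword search over a metadata catalog. It is not Purview’s implementation or API: the `Asset` class, the `search` function, and the sample catalog entries are all hypothetical, and the point is only that the search runs over metadata, so the underlying format of each asset does not matter.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """Hypothetical metadata record; the data itself stays where it lives."""
    name: str
    kind: str          # e.g. "spreadsheet", "blob", "table", "file"
    location: str
    tags: tuple = ()


def search(catalog, term):
    """Return assets whose name or tags contain the term, whatever their kind."""
    needle = term.lower()
    return [
        asset for asset in catalog
        if needle in asset.name.lower()
        or any(needle in tag.lower() for tag in asset.tags)
    ]


# Illustrative catalog entries (all names and locations are made up).
catalog = [
    Asset("Marketing_Budget.xlsx", "spreadsheet", "sharepoint://finance", ("budget",)),
    Asset("campaign-images", "blob", "azure://storage/media", ("marketing", "images")),
    Asset("customer_reviews", "table", "sql://crm/reviews", ("feedback",)),
    Asset("newsletter_2021_01.html", "file", "onprem://share/comms", ("marketing",)),
]

hits = search(catalog, "marketing")
for asset in hits:
    print(asset.kind, asset.name, asset.location)
```

A real catalog would rank results and match on far richer metadata, but the shape is the same: one index, many formats.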
Enhancing data governance
Data governance is the process of establishing policies to ensure that you have complete control of your data throughout its lifecycle. It also defines responsibilities within an organization to determine who can access data and how it can be used.
Azure Purview addresses data governance and its challenges by giving administrators and data scientists a bird’s-eye view of the entire landscape. They can quickly understand the overall state of the data and get key insights, such as the location of sensitive information, the rate of data generation, and more.
Accordingly, they can set up alerts and notifications to monitor the health and status of data across the entire enterprise.
Data is spread across your cloud, on-prem systems, SaaS applications, databases, etc., and this impedes its usability.
Azure Purview automatically discovers data and classifies it without having to move it across systems or formats. All the metadata is indexed and brought together as a unified data map, so you know what data sits where.
Every search result provides detailed information, such as the location of the data. When you click on a result, you can see rich information such as the table name, its fields, the data types stored in each field, and more.
You can even click on its location to open the data in Power BI Desktop for better visualization. The related data tab shows all the tables and information related to the one you’re looking at.
When you have all the data in a single place, it’s easy to run analytics on it to get the insights you want. But if it is spread across systems, this process becomes cumbersome.
Azure Purview aims to address this through a streamlined user interface that enables data producers and consumers to collaborate. For example, business users and IT experts can interact with the same data to understand the business context associated with it.
A highlight is that unstructured and semi-structured data is also indexed and displayed, which makes it discoverable and useful.
Tracking the origin
Tracking data through its lifecycle provides better context, so you can draw the appropriate insights. Again, this is hard because organizations generate vast amounts of data every second, so tracking its origins requires significant resources.
Azure Purview tracks and visualizes the lineage of data, from where it was created through its movement across the entire lifecycle. This helps you understand how the data has been transformed, which can significantly influence the way it’s used.
This data lineage, along with the derived forms of the data, tells you whether the data has come from an authoritative source. This goes well beyond the simple key-value pair mappings found in many data governance tools and can play a big role in data mapping and understanding.
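As a rough illustration of what lineage tracking buys you, the sketch below walks a small lineage graph upstream to find the original sources of a derived asset and checks whether they are all authoritative. The asset names, the `lineage` mapping, and the `AUTHORITATIVE` set are invented for the example, and this says nothing about how Purview stores lineage internally.

```python
# Hypothetical lineage edges: each derived asset maps to the assets it was built from.
lineage = {
    "sales_dashboard": ["sales_summary"],
    "sales_summary": ["orders_clean", "customers_clean"],
    "orders_clean": ["orders_raw"],
    "customers_clean": ["crm_export"],
}


def trace_origins(asset, upstream):
    """Walk upstream edges and return the root sources an asset derives from."""
    seen, stack, roots = set(), [asset], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against revisiting shared ancestors or cycles
        seen.add(node)
        parents = upstream.get(node, [])
        if not parents:
            roots.add(node)  # no recorded parents, so treat it as an origin
        stack.extend(parents)
    return roots


# Assumed list of sources the organization treats as authoritative.
AUTHORITATIVE = {"orders_raw", "crm_export"}

origins = trace_origins("sales_dashboard", lineage)
print(sorted(origins))
print("all origins authoritative:", origins <= AUTHORITATIVE)
```

The same walk run in the other direction (downstream) would answer the impact question: which reports are affected if a source changes.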
Tapping into technology
Azure Purview has tapped into some of the innovations from the open-source community, such as Apache Atlas, a platform that provides metadata management and governance capabilities for cataloging data assets.
It also uses artificial intelligence and machine learning to return intelligent search results that are highly relevant and useful to organizations. It covers everything from scanning and classifying the data to providing the business context.
The above features clearly show that Azure Purview can bring all your data together in a form that lets you visualize it and understand the connections. It is a significant step in the world of data governance and management and is expected to unify data and make it highly relevant and useful for business users.
Azure Purview Workflow
Moving on, let’s understand the workflow to appreciate its ease of use and importance for all kinds of users within the organization.
Typically, organizations have many data assets like tables, files, models, databases, and more that are spread across cloud, on-prem, and SaaS environments. As a first step, connect these different assets to Azure Purview using connectors, so it can scan all these sources to gather their metadata without moving or transforming the data.
Next, all the metadata is published to the Azure Purview data map, an intelligent graph that describes all the data it covers. You can even use Apache Atlas APIs to push metadata from other sources that are not connected to Purview.
That’s it! Now all users within the organization can quickly find the data they want. Data officers, in turn, can get end-to-end insights into it.
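For a sense of what pushing metadata through an Atlas-style API might involve, here is a small sketch that builds a request body in the shape used by the Apache Atlas v2 entity endpoint (an `entity` object with a `typeName` and `attributes`). The helper name, the asset name, the qualified name, and the choice of the generic `DataSet` type are assumptions for illustration. The sketch deliberately stops short of sending anything, because the endpoint URL and authentication depend on your own Purview account and should be taken from Microsoft’s documentation.

```python
import json


def build_atlas_entity(name, qualified_name, type_name="DataSet", description=""):
    """Build an Atlas-style entity payload (hypothetical helper, no network calls)."""
    return {
        "entity": {
            "typeName": type_name,
            "attributes": {
                "name": name,
                # qualifiedName should uniquely identify the asset; this format is made up.
                "qualifiedName": qualified_name,
                "description": description,
            },
        }
    }


payload = build_atlas_entity(
    name="legacy_orders",
    qualified_name="legacy://erp/orders",
    description="Orders table from an ERP system that has no connector",
)

# Serialize the body that would be POSTed to the catalog's entity endpoint.
body = json.dumps(payload, indent=2)
print(body)
```

Only metadata travels in this payload. The orders table itself stays in the source system, which matches the scan-without-moving approach described above.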
Lastly, let’s touch a bit on security and setup.
Security and setup
Azure Purview automatically detects sensitive information and classifies it. The classification is displayed in the search results, and only authorized users can open and view the information itself.
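The sketch below shows the general idea of pattern-based classification followed by an access check. It is a simplified stand-in, not Purview’s classifier: the two regular expressions, the label names, and the `data_steward` role are assumptions chosen for the example.

```python
import re

# Hypothetical classification rules: label -> pattern that signals sensitive content.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify(sample_values):
    """Return the set of sensitivity labels found in a sample of column values."""
    labels = set()
    for value in sample_values:
        for label, pattern in PATTERNS.items():
            if pattern.search(value):
                labels.add(label)
    return labels


def can_view(user_roles, labels):
    """Allow access to unclassified data, or to classified data for an assumed steward role."""
    return not labels or "data_steward" in user_roles


column_sample = ["alice@example.com", "n/a", "bob@example.org"]
labels = classify(column_sample)
print(sorted(labels))
print("analyst can view:", can_view({"analyst"}, labels))
print("steward can view:", can_view({"data_steward"}, labels))
```

In this model everyone can still see that the column exists and how it is labeled, which mirrors the behavior described above: the classification shows up in search, while the contents stay restricted.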
As for setup, it’s a breeze. If you’re an administrator:
- Click the sources to see the possible data sources that Purview can scan.
- Click “Register” at the top left-hand corner and choose your data sources. If you have an Azure account, it can collect data from all the different Azure sources with just a single click.
- You can choose to organize them as collections and tree views and set the classification and scanning configuration settings at the root level of each collection. You can also choose how often you want to scan the data to stay on top of data changes and their impact.
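One way to picture collections with settings defined at the root is a simple tree in which each collection inherits its parent’s scan configuration and may override parts of it. The `Collection` class, the setting names, and the inherit-and-override behavior are assumptions made for this sketch rather than a description of how Purview resolves settings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Collection:
    """Hypothetical collection node holding registered sources and scan settings."""
    name: str
    settings: Dict[str, object] = field(default_factory=dict)
    parent: Optional["Collection"] = None
    sources: List[str] = field(default_factory=list)

    def effective_settings(self):
        """Merge settings from the root down; a child's values override its parent's."""
        inherited = self.parent.effective_settings() if self.parent else {}
        return {**inherited, **self.settings}


# Root-level defaults, with one child overriding the scan frequency.
root = Collection("contoso", {"scan_frequency": "weekly", "classify": True})
finance = Collection("finance", {"scan_frequency": "daily"}, parent=root,
                     sources=["sql://finance-db"])
marketing = Collection("marketing", parent=root, sources=["blob://campaigns"])

for collection in (finance, marketing):
    print(collection.name, collection.effective_settings())
```

Setting defaults once at the root keeps configuration consistent, while overrides let frequently changing sources be scanned more often.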
As you can see, Purview scores high on security and setup too.
Final words on Azure Purview
Overall, Azure Purview is expected to simplify data governance and help organizations make the most of their most valuable asset, their data, regardless of where it is located. For pricing details, see Microsoft’s Azure Purview pricing page.
So, what do you think of Azure Purview? Please let us know your thoughts in the comments section.