Google recently announced that its data discovery and metadata management service, Data Catalog, is available in public beta. The service, which allows users to discover, manage, and analyze data within Google Cloud Platform, was originally unveiled at the Google Cloud Next ’19 event in San Francisco. If you’re interested in using it during the public beta period, here are some of the features to know.
Data discovery in Google Data Catalog
Data Catalog lets users search for tables in Google BigQuery or topics in Cloud Pub/Sub across all the cloud projects they have access to. It uses the same search technology as other popular Google tools like Gmail and Google Drive. It also integrates with the access controls in Cloud Identity & Access Management (Cloud IAM), so you can get up and running without setting up additional permissions.
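To give a feel for what a search might look like in code, here is a minimal Python sketch. It assumes the `google-cloud-datacatalog` client library is installed and credentials are configured, and that the `type=` and `system=` search qualifiers behave as described in Google's search syntax documentation. The helper names `build_search_query` and `search_catalog` are hypothetical, chosen for illustration only.

```python
from typing import Iterable, List, Optional


def build_search_query(keyword: str,
                       entry_type: Optional[str] = None,
                       system: Optional[str] = None) -> str:
    """Compose a search string from a keyword and optional filters."""
    parts = [keyword]
    if entry_type:
        parts.append(f"type={entry_type}")
    if system:
        parts.append(f"system={system}")
    return " ".join(parts)


def search_catalog(project_ids: Iterable[str], query: str) -> List[str]:
    """Return the linked resource names of entries matching the query.

    The import is done lazily so the query builder above still works
    when the client library is not installed.
    """
    # Assumption: installed via `pip install google-cloud-datacatalog`.
    from google.cloud import datacatalog_v1

    client = datacatalog_v1.DataCatalogClient()
    scope = datacatalog_v1.SearchCatalogRequest.Scope(
        include_project_ids=list(project_ids)
    )
    results = client.search_catalog(request={"scope": scope, "query": query})
    return [result.linked_resource for result in results]


# Building a query needs no network access:
query = build_search_query("orders", entry_type="table", system="bigquery")
# query == "orders type=table system=bigquery"
```

Because results are filtered by the caller's existing permissions, a call such as `search_catalog(["my-project"], query)` (with a placeholder project ID) should only return entries that the caller is already allowed to see.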
Schematized business metadata
In Data Catalog, you can also tag data assets with metadata so they are easier to search. Tag templates make organizing that metadata even easier by defining the fields a tag can contain. Tag fields can hold doubles, booleans, and enumerated types. The service also offers several API options that complement the interface, which lets you attach tags in bulk when tables are created in BigQuery.
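The idea of a typed tag template can be illustrated with a small, self-contained Python model. This is not the Data Catalog API: `TagTemplate`, `attach_in_bulk`, and the field names are hypothetical, and the sketch only shows how a template with double, boolean, and enumerated fields could validate tags before they are attached to many tables at once.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple, Union

# A field is either a Python type (float, bool) or a tuple of allowed enum values.
FieldSpec = Union[type, Tuple[str, ...]]


@dataclass(frozen=True)
class TagTemplate:
    """Hypothetical stand-in for a tag template with typed fields."""
    name: str
    fields: Dict[str, FieldSpec]

    def validate(self, tag: Dict[str, object]) -> None:
        """Raise ValueError if the tag does not fit this template."""
        for key, value in tag.items():
            if key not in self.fields:
                raise ValueError(f"unknown field: {key}")
            spec = self.fields[key]
            if isinstance(spec, tuple):
                # Enumerated field: value must be one of the allowed strings.
                if value not in spec:
                    raise ValueError(f"{key} must be one of {spec}")
            elif spec is float:
                # Double field: accept numbers, but not booleans.
                if isinstance(value, bool) or not isinstance(value, (int, float)):
                    raise ValueError(f"{key} must be a number")
            elif not isinstance(value, spec):
                raise ValueError(f"{key} must be of type {spec.__name__}")


def attach_in_bulk(template: TagTemplate,
                   tables: Iterable[str],
                   tag: Dict[str, object]) -> Dict[str, Dict[str, object]]:
    """Validate one tag and return a copy of it for every table."""
    template.validate(tag)
    return {table: dict(tag) for table in tables}


governance = TagTemplate(
    name="governance",
    fields={
        "quality_score": float,                          # double
        "contains_pii": bool,                            # boolean
        "retention": ("30_days", "1_year", "7_years"),   # enumerated type
    },
)
```

Calling `attach_in_bulk(governance, ["sales.orders", "sales.refunds"], {"contains_pii": True, "retention": "1_year"})` returns one validated tag per table, while a value outside the enumerated list raises a `ValueError`.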
Data loss prevention
Google Data Catalog integrates with Cloud DLP (Data Loss Prevention), which helps companies and organizations with data governance as it applies to regulatory and compliance requirements. This integration lets you create jobs that scan your tables for sensitive data so you can attach tags and classify the tables properly. This gives organizations access to richer out-of-the-box data sets that are sorted into categories and meet compliance standards. You can also run periodic scans to keep tags updated and maintain compliance on an ongoing basis.
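As a rough illustration of how scan results could be turned into classification tags, the Python sketch below groups findings by table. The `Finding` shape and the `tags_from_scan` helper are hypothetical and do not mirror the actual Cloud Data Loss Prevention response format. The infoType names used in the usage note, such as `EMAIL_ADDRESS`, follow DLP's naming style.

```python
from typing import Dict, Iterable, NamedTuple


class Finding(NamedTuple):
    """Hypothetical, simplified scan result for one column."""
    table: str
    column: str
    info_type: str  # for example "EMAIL_ADDRESS"


def tags_from_scan(scanned_tables: Iterable[str],
                   findings: Iterable[Finding]) -> Dict[str, Dict[str, object]]:
    """Build one classification tag per scanned table.

    Tables with no findings still get a tag, so a later scan can flip
    the flag if sensitive data appears.
    """
    types_by_table = {table: set() for table in scanned_tables}
    for finding in findings:
        types_by_table.setdefault(finding.table, set()).add(finding.info_type)
    return {
        table: {
            "contains_sensitive_data": bool(info_types),
            "info_types": sorted(info_types),
        }
        for table, info_types in types_by_table.items()
    }
```

Running this after every periodic scan and overwriting the previous tags is one simple way to keep classifications current. For example, a scan of `crm.contacts` and `crm.products` that finds `EMAIL_ADDRESS` and `PHONE_NUMBER` only in `crm.contacts` would mark that table as sensitive and leave `crm.products` flagged as clean.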
Featured image: Shutterstock