Understand And Trust Data With Dataplex Data Lineage

  • Francisco Juan
  • March 17, 2023
  • 3 minute read

Today, we are excited to announce the general availability of Dataplex data lineage — a fully managed Dataplex capability that helps you understand how data is sourced and transformed within the organization. Dataplex data lineage automatically tracks data movement across BigQuery, BigLake, Cloud Data Fusion (Preview), and Cloud Composer (Preview), eliminating operational hassles around manual curation of lineage metadata.

With rising data volume spread across data silos, it can be challenging for organizations to ensure users have a self-service mechanism to discover, understand and trust the data. Organizations constantly struggle with questions such as:

  • Is the data extracted from an authoritative source?
  • What is the impact if I drop this table?
  • The data in this table seems corrupted – where did this data come from, and when was it last refreshed?
  • How is sensitive information being moved or copied? Is it in adherence to data governance practices?

To answer the above questions, organizations need to track how data is sourced and transformed, which can be complex and requires significant effort.

Dataplex data lineage describes each lineage relationship by detailing what happened and when it happened, and presents the result in an interactive lineage graph that provides data observability.

Data analysts who want to know if a table originates from an authoritative source can now answer this in a self-service manner with a simple look-up of lineage for the concerned table — available in Dataplex and in BigQuery for in-context analysis.

Data engineers can reduce time to identify and resolve data issues through root cause analysis using the operational metadata trace asserting a lineage relationship. Data lineage also aids deterministic change management by providing the ability to evaluate the impact of a change and collaborate with the corresponding stakeholders to minimize any adverse impact.
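To make impact analysis and root cause analysis concrete, here is a minimal sketch of a lineage graph held in memory. It is a toy model for illustration only, not how Dataplex stores or queries lineage; the `LineageGraph` class, its methods, and the table names are all hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Toy lineage graph (hypothetical): each link points from a source asset to a target asset."""

    def __init__(self):
        self._downstream = defaultdict(set)  # source -> direct targets
        self._upstream = defaultdict(set)    # target -> direct sources

    def add_link(self, source, target):
        """Record that `target` was produced from `source`."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _walk(start, edges):
        """Breadth-first traversal returning every asset reachable from `start`."""
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt != start and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def impact_of(self, asset):
        """Impact analysis: everything derived, directly or indirectly, from `asset`."""
        return self._walk(asset, self._downstream)

    def origins_of(self, asset):
        """Root cause analysis: everything `asset` was derived from."""
        return self._walk(asset, self._upstream)


# Hypothetical table names, used only to exercise the sketch.
graph = LineageGraph()
graph.add_link("raw.orders", "staging.orders")
graph.add_link("staging.orders", "reports.daily_revenue")
graph.add_link("raw.customers", "reports.daily_revenue")

impacted = graph.impact_of("raw.orders")
origins = graph.origins_of("reports.daily_revenue")
```

Here `impacted` answers "what is the impact if I drop this table?" for `raw.orders`, and `origins` answers "where did this data come from?" for `reports.daily_revenue`.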

Finally, data lineage provides a map of data movement which can become the foundation for data governance practice. It enables data stewards and owners to evaluate and enforce adherence to governance requirements, especially when tracking the movement of sensitive information.

Dataplex data lineage provides APIs for extensibility so that organizations can report lineage from various systems and have a single map of how data entries are related.
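As a rough illustration of reporting lineage from another system, the sketch below builds the JSON body for a single lineage event without sending it. The field names (`links`, `source`, `target`, `fullyQualifiedName`, `startTime`, `endTime`) and the `bigquery:` name prefix reflect my understanding of the Data Lineage API and should be checked against the official reference; the helper functions and the project, dataset, and table names are hypothetical.

```python
from datetime import datetime, timezone


def bigquery_fqn(project, dataset, table):
    """Build a fully qualified name for a BigQuery table (format is an assumption)."""
    return f"bigquery:{project}.{dataset}.{table}"


def build_lineage_event(sources, target, start, end):
    """Return a dict describing one lineage event: every source flows into target.

    Field names are assumed to match the Data Lineage API's lineage event
    resource; verify them before sending this payload anywhere.
    """
    return {
        "links": [
            {
                "source": {"fullyQualifiedName": source},
                "target": {"fullyQualifiedName": target},
            }
            for source in sources
        ],
        "startTime": start.isoformat(),
        "endTime": end.isoformat(),
    }


# Hypothetical job that joined two tables into a report table.
event = build_lineage_event(
    sources=[
        bigquery_fqn("my-project", "raw", "orders"),
        bigquery_fqn("my-project", "raw", "customers"),
    ],
    target=bigquery_fqn("my-project", "reports", "daily_revenue"),
    start=datetime(2023, 3, 17, 12, 0, tzinfo=timezone.utc),
    end=datetime(2023, 3, 17, 12, 5, tzinfo=timezone.utc),
)
```

Each link pairs one source with the target, so a job that reads two tables and writes one produces two links in a single event.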

What our customers are saying

L’Oréal, the world’s largest cosmetics company, is on a mission to ‘create the beauty that moves the world.’ “Dataplex data lineage helps us understand how data moves across our organization,” said Sébastien Morand, Head of Data Engineering team, L’Oréal. “As a fully managed solution, it becomes the main entry point to diagnose data issues and evaluate the impact of a change or incident — providing insight on what happened and when it happened, including reference to the execution metadata. Directly integrated into our beauty tech data platform, data lineage helps us reduce data issues and also enables us to mitigate issues faster when it does happen.”

“At Wayfair, we treat data-as-a-product and are building a robust data platform that provides self-service access and compliance constructs,” said Vinit Rajopadhye, Associate Director on Data Infrastructure & Data Enablement at Wayfair. “We are excited about Dataplex data lineage as it helps our data consumers trust data based on where it originates and the transformations applied.”

Hurb is an online travel agency in Brazil with a mission to optimize travel through technology. “Hurb has a rapidly growing data platform, with new data assets created and registered daily to support business decision-making and Machine Learning models,” said Vinícius dos Santos Mello, Senior Data Engineer. “Thanks to Dataplex data lineage features, we have end-to-end data observability across data in BigQuery. We can proactively address schema changes, data quality issues, and asset depreciation that could otherwise negatively affect the business.”

“As a company with many business domains and services, we handle a large volume of data and use it to power our decision making, so it is crucial to ensure data quality. Dataplex data lineage provides a visual understanding of the flow of data across our organization, improving efficiency of impact investigations when problems occur and increasing the reliability of the data,” said Mitsunori Fukase, Data Platform Department Group Manager, DeNA.

Get started with Dataplex data lineage

To get started with Dataplex data lineage, enable the Data Lineage API on your project. The Dataplex data lineage documentation has more details.

Additional Resources:

  • Dataplex data lineage labs
  • Quickstart – track lineage for BigQuery table copy

By: George Verghese (Product Manager, Google Cloud)
Originally published at Google Cloud Blog
