How trust signals move across the modern data stack

Platforms: 3 (CDGC, Databricks, Unity Catalog)
Layers: 4 (source, ingest, govern, consume)
Outputs: 5+ (scores, tags, trends, dashboards, lineage)
Pattern: API (the most common enterprise integration)
01 DQ Execution in CDGC
02 Score Extraction
03 Bronze Landing
04 Databricks Standardization
05 Unity Catalog Persistence
06 Governance Enrichment
07 Business Consumption

End-to-end flow

01 🧪

DQ Execution in CDGC

CDGC (Informatica Cloud Data Governance and Catalog) profiles data, validates rules, and computes dataset-level and rule-level scores.

02 📤

Score Extraction

Extract results through APIs, CDI pipelines, or event-driven triggers for downstream processing.

03 🪣

Bronze Landing

Land raw DQ payloads in cloud object storage for traceability and replayability.

04 ⚙️

Databricks Standardization

Normalize CDGC payloads into a standard enterprise score model.

05 🏛️

Unity Catalog Persistence

Persist curated score history into governed Delta tables inside Unity Catalog.

06 🏷️

Governance Enrichment

Add tags, lineage, and metadata context to strengthen discoverability and trust.

07 📊

Business Consumption

Expose scorecards to business, governance, and engineering teams.

Source → Standardize → Govern → Consume

  • Source system — CDGC runs rules and calculates scores
  • Integration layer — APIs or CDI move results to the lakehouse
  • Databricks layer — Transforms raw payloads into enterprise score models
  • Governance layer — Unity Catalog stores governed Delta tables and tags
  • Consumption layer — Dashboards and scorecards expose trust signals

Detailed process

The pattern below is designed for enterprise-scale implementations where Informatica remains the system of execution for quality rules, while Databricks and Unity Catalog provide standardization, governed persistence, and consumption.

01

DQ Execution in CDGC

CDGC profiles data, validates rules, and computes dataset-level and rule-level scores.

  • Profiling and validation
  • Dimension-level scores
  • Results stored in Informatica repositories

02

Score Extraction

Extract results through APIs, CDI pipelines, or event-driven triggers for downstream processing.

  • REST API extraction
  • CDI push to storage
  • Optional event/webhook pattern

03

Bronze Landing

Land raw DQ payloads in cloud object storage for traceability and replayability.

  • ADLS / S3 / GCS
  • JSON or Parquet
  • Partition by run date and source
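The partitioning bullet above can be sketched as a small path builder. This is a minimal illustration, not a prescribed layout: the function name `bronze_path`, the `run_date=` / `source=` folder names, and the example bucket are assumptions made here.

```python
from datetime import date


def bronze_path(base_uri: str, source: str, run_date: date,
                run_id: str, fmt: str = "json") -> str:
    """Build a partitioned object-storage path for one raw DQ payload.

    Hypothetical layout: <base>/run_date=YYYY-MM-DD/source=<source>/<run_id>.<fmt>
    """
    if fmt not in ("json", "parquet"):
        raise ValueError(f"unsupported format: {fmt}")
    # Hive-style key=value folders let Spark discover the partitions on read.
    return (
        f"{base_uri.rstrip('/')}"
        f"/run_date={run_date.isoformat()}"
        f"/source={source}"
        f"/{run_id}.{fmt}"
    )


# Example with a made-up bucket name.
print(bronze_path("s3://dq-bronze/cdgc/", "cdgc", date(2024, 5, 1), "run-001"))
```

Keeping one file per run under a date and source partition makes it straightforward to replay a single run or a whole day without touching other payloads.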

04

Databricks Standardization

Normalize CDGC payloads into a standard enterprise score model.

  • Map score dimensions
  • Align domain semantics
  • Prepare silver and gold models

05

Unity Catalog Persistence

Persist curated score history into governed Delta tables inside Unity Catalog.

  • Delta tables in UC
  • Historical tracking
  • Reusable enterprise score layer

06

Governance Enrichment

Add tags, lineage, and metadata context to strengthen discoverability and trust.

  • Quality status tags
  • Lineage visibility
  • Stewardship-ready metadata

07

Business Consumption

Expose scorecards to business, governance, and engineering teams.

  • Databricks SQL dashboards
  • BI scorecards
  • Trend and SLA monitoring
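Trend and SLA monitoring can be illustrated with a few lines of plain Python over a table's score history. This is a sketch under assumptions: the function name `sla_report`, the 90-point SLA threshold, and the sample values are hypothetical and not part of the pattern itself.

```python
from datetime import date


def sla_report(history: list[tuple[date, float]], sla: float = 90.0) -> dict:
    """Summarize the latest score, its change since the prior run, and SLA status.

    `history` holds (score_date, overall_score) pairs in any order.
    The default SLA of 90.0 is an illustrative assumption.
    """
    if not history:
        raise ValueError("history must contain at least one score")
    ordered = sorted(history)  # tuples sort by date first
    latest_date, latest_score = ordered[-1]
    previous = ordered[-2][1] if len(ordered) > 1 else None
    delta = None if previous is None else round(latest_score - previous, 2)
    return {
        "score_date": latest_date,
        "latest_score": latest_score,
        "delta": delta,  # negative means the score dropped
        "sla_breached": latest_score < sla,
    }


# Made-up history for one table.
print(sla_report([(date(2024, 5, 2), 88.5), (date(2024, 5, 1), 93.0)]))
```

In a dashboard the same logic would more likely be expressed as a SQL window function over the score history table; the Python form is only meant to make the calculation explicit.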

DrivenByData at a glance

CDGC remains the execution engine for quality rules. Databricks becomes the integration and standardization layer. Unity Catalog becomes the governed trust layer for enterprise-wide visibility.

CDGC: Run rules
Databricks: Standardize scores
Unity Catalog: Govern and publish

Code examples

Extract from CDGC API

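import requests

# Placeholder endpoint and token; substitute the values for your
# Informatica environment.
url = "https://<informatica-instance>/api/v2/dq/results"
headers = {"Authorization": "Bearer <token>"}

# Time out instead of hanging, and fail loudly on non-2xx responses.
response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()
payload = response.json()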

Standardize in Databricks

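# df_raw: DataFrame of raw CDGC payloads read from the bronze landing zone.
df_clean = df_raw.selectExpr(
  "assetName as table_name",
  "score as overall_score",
  "dimensionScores.completeness as completeness_score",
  "dimensionScores.accuracy as accuracy_score",
  "to_date(runDate) as score_date"  # cast to DATE to match the target table
)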
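The same field mapping can be expressed as a plain Python function, which is handy for unit-testing the mapping logic without a Spark session. This is a minimal sketch: the input field names come from the snippet above, while the helper names `standardize` and `_opt_float` and the assumption that `runDate` is an ISO-8601 string are introduced here for illustration.

```python
from datetime import date
from typing import Any, Optional


def _opt_float(value: Any) -> Optional[float]:
    """Convert to float, passing missing values through as None."""
    return None if value is None else float(value)


def standardize(raw: dict[str, Any]) -> dict[str, Any]:
    """Map one raw payload record onto the standard score model.

    Assumes runDate is an ISO-8601 date or timestamp string.
    """
    dims = raw.get("dimensionScores") or {}
    return {
        "table_name": raw["assetName"],
        "overall_score": float(raw["score"]),
        "completeness_score": _opt_float(dims.get("completeness")),
        "accuracy_score": _opt_float(dims.get("accuracy")),
        # Keep only the YYYY-MM-DD part so timestamps also parse.
        "score_date": date.fromisoformat(str(raw["runDate"])[:10]),
    }


# Made-up sample record.
sample = {
    "assetName": "customer",
    "score": 97,
    "dimensionScores": {"completeness": 99.5},
    "runDate": "2024-05-01T02:00:00Z",
}
print(standardize(sample))
```

Dimensions that are absent from a payload come through as `None` rather than raising, so one missing dimension does not block the whole record.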

Persist in Unity Catalog

CREATE TABLE IF NOT EXISTS governance.dq.table_quality_scores (
  catalog_name STRING,
  schema_name STRING,
  table_name STRING,
  score_date DATE,
  overall_score DECIMAL(5,2),
  status STRING,
  source_system STRING,
  run_id STRING
) USING DELTA;
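Because the bronze zone keeps raw payloads for replay, a reprocessed run could append the same scores twice. One possible guard, sketched here under assumptions, is to deduplicate on the table identity plus `run_id` before writing. The function name `dedupe_runs` and the choice of key are hypothetical; the column names are those of the table above.

```python
from typing import Any


def dedupe_runs(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the last row seen for each (catalog, schema, table, run_id) key."""
    latest: dict[tuple, dict[str, Any]] = {}
    for row in rows:
        key = (
            row["catalog_name"],
            row["schema_name"],
            row["table_name"],
            row["run_id"],
        )
        # A later row for the same key overwrites the earlier one.
        latest[key] = row
    return list(latest.values())


# Made-up rows: run-001 was replayed with a corrected score.
rows = [
    {"catalog_name": "main", "schema_name": "sales", "table_name": "customer",
     "run_id": "run-001", "overall_score": 96.0},
    {"catalog_name": "main", "schema_name": "sales", "table_name": "orders",
     "run_id": "run-001", "overall_score": 91.0},
    {"catalog_name": "main", "schema_name": "sales", "table_name": "customer",
     "run_id": "run-001", "overall_score": 97.0},
]
print(dedupe_runs(rows))
```

On Databricks the same intent is usually expressed with a Delta `MERGE INTO` keyed on these columns, which keeps the history table idempotent under replays.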

Apply tags

ALTER TABLE main.sales.customer
SET TAGS (
  'dq_status' = 'gold',
  'dq_score' = '97'
);
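The tag statement above can be generated from a score rather than written by hand. The sketch below is illustrative only: the function names `dq_status` and `set_tags_sql`, the status labels other than 'gold', and the 95 / 80 thresholds are assumptions, not values defined by CDGC or Unity Catalog.

```python
import re

# Three-part Unity Catalog name: catalog.schema.table
_TABLE_NAME = re.compile(r"[A-Za-z_]\w*(\.[A-Za-z_]\w*){2}")


def dq_status(score: float, gold: float = 95.0, silver: float = 80.0) -> str:
    """Bucket a 0-100 score into a status label (thresholds are hypothetical)."""
    if score >= gold:
        return "gold"
    if score >= silver:
        return "silver"
    return "bronze"


def set_tags_sql(table: str, score: float) -> str:
    """Render an ALTER TABLE ... SET TAGS statement for one table."""
    if not _TABLE_NAME.fullmatch(table):
        raise ValueError(f"expected catalog.schema.table, got: {table!r}")
    return (
        f"ALTER TABLE {table}\n"
        "SET TAGS (\n"
        f"  'dq_status' = '{dq_status(score)}',\n"
        f"  'dq_score' = '{round(score)}'\n"
        ");"
    )


print(set_tags_sql("main.sales.customer", 97.0))
```

In a Databricks notebook the rendered string could be passed to `spark.sql(...)`. Validating the table name first avoids building SQL from unchecked input.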

What CDGC owns

Rule execution, profiling logic, score calculation, and source-level quality assessment.

What Databricks owns

Ingestion, transformation, standardization, historical persistence, and downstream analytics.

What Unity Catalog owns

Governance, access control, discoverability, metadata context, and trust signaling across the lakehouse.

About the author

Subramanian Gopalkrishnan is a Data Governance and Data Engineering leader with 18+ years of experience across regulated industries, helping enterprises build trusted, modern data ecosystems across cloud, analytics, governance, and AI transformation.