Data Platforms · 02 of 04 · Databricks

One platform for ETL, ML, and the next thing.

Databricks gave the lakehouse a name. We give it discipline: Unity Catalog as the spine, Delta as the storage truth, DLT for declarative pipelines, MLflow for the models that live next to the data they were trained on.

60+ Workspaces governed
34 PB Under Delta
Consulting Partner · multi-region

What we build on Databricks

A lakehouse that earns its name.

Not a notebook farm. A platform — with workspaces, lineage, contracts and CI.

Workspace & Account Design

Metastore-per-region, workspaces by environment, service principals, IAM tied to your IdP. The boring foundation that scales past 50 users.

Metastore · Workspaces · SP

Delta Live Tables Pipelines

Declarative pipelines with expectations, auto-recovery, schema evolution and CDC. Operational primitives we shouldn't have to write anymore.

DLT · Expectations · CDC
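
To make the CDC point concrete, here is a minimal plain-Python sketch of what applying a change feed to a keyed table involves: ordering by a sequence column, ignoring late events, and honouring deletes. It is not the DLT API. The `Change` type, the `apply_changes` helper and the sample feed are hypothetical names invented for illustration, and a declarative pipeline would hand this bookkeeping to the platform instead of writing it by hand.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Change:
    """One CDC event (hypothetical shape, for illustration only)."""
    key: str                      # business key of the row
    seq: int                      # ordering column, e.g. a commit sequence number
    op: str                       # "upsert" or "delete"
    value: Optional[dict] = None  # new row state for upserts


def apply_changes(target: Dict[str, dict], changes: Iterable[Change]) -> Dict[str, dict]:
    """Apply a CDC feed to a keyed table, keeping only the latest state per key."""
    last_seq: Dict[str, int] = {}
    for change in changes:
        # A late or duplicate event never overwrites newer state for the same key.
        if change.seq <= last_seq.get(change.key, -1):
            continue
        last_seq[change.key] = change.seq
        if change.op == "delete":
            target.pop(change.key, None)
        else:
            target[change.key] = dict(change.value or {})
    return target


# Sample feed: u1 is upgraded, one u1 event arrives out of order, u2 is deleted.
feed = [
    Change("u1", 1, "upsert", {"plan": "free"}),
    Change("u2", 1, "upsert", {"plan": "pro"}),
    Change("u1", 3, "upsert", {"plan": "pro"}),
    Change("u1", 2, "upsert", {"plan": "trial"}),  # late event, ignored
    Change("u2", 2, "delete"),
]
users = apply_changes({}, feed)
print(users)
```

The sketch keeps only current state per key. Retaining history (the SCD-2 style) would need validity ranges per row, which is left out here.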

Unity Catalog Governance

One catalog, cross-workspace lineage, fine-grained access, attribute-based policies, AI-assisted classification. Audit-grade without bureaucracy.

Unity · Lineage · ABAC
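
As a rough illustration of the attribute-based idea, the sketch below models the decision logic only: columns carry classification tags, users carry clearances, and a value is shown only when the clearances cover every tag on the column. This is plain Python, not Unity Catalog syntax. The tag names, `COLUMN_TAGS`, `visible_value` and `apply_policy` are hypothetical, and in a real deployment the policy would live in the catalog, not in application code.

```python
from typing import Dict, FrozenSet

MASK = "***"

# Hypothetical classification output: column name -> tags required to read it.
COLUMN_TAGS: Dict[str, FrozenSet[str]] = {
    "email": frozenset({"pii"}),
    "country": frozenset(),
    "card_number": frozenset({"pii", "pci"}),
}


def visible_value(column: str, value: object, clearances: FrozenSet[str]) -> object:
    """Return the value if the caller's clearances cover the column's tags, else a mask."""
    if column not in COLUMN_TAGS:
        # Fail closed: an unclassified column is masked until someone tags it.
        return MASK
    return value if COLUMN_TAGS[column] <= clearances else MASK


def apply_policy(row: Dict[str, object], clearances: FrozenSet[str]) -> Dict[str, object]:
    """Apply the column policy to one row for a given set of user clearances."""
    return {col: visible_value(col, val, clearances) for col, val in row.items()}


row = {"email": "a@example.com", "country": "DE", "card_number": "4111"}
print(apply_policy(row, frozenset({"pii"})))
print(apply_policy(row, frozenset()))
```

The point of the design is that access follows the tag, not the table: classifying a new column as `pii` changes who can read it without anyone editing per-table grants.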

SQL Warehouses

Serverless SQL for BI and ad hoc queries, Photon for the heavy joins, query-result caching that actually fires. BI tools never know it's a lake.

SQL Warehouse · Photon · Serverless

MLflow & Mosaic AI

Tracking, registry, model serving, feature engineering, vector search. Production ML without leaving the platform the data lives in.

MLflow · Mosaic · Vector · Feature Store

Cost & Cluster Discipline

Right-sized clusters with autoscaling and spot, job vs all-purpose split, cost-tag policies, DBU dashboards by team. Lake economics, in the open.

DBU · Spot · Autoscale · Tagging
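
One way to encode the job vs all-purpose split, spot usage and cost tags is a cluster policy. The sketch below builds a policy definition as a Python dict and serialises it to JSON. It assumes an AWS workspace (hence `aws_attributes`), and the `job_cluster_policy` helper, the `team` tag key and the worker limits are illustrative choices, not values from any real deployment. Check attribute names against the cluster policy reference for your cloud before using it.

```python
import json


def job_cluster_policy(team: str, max_workers: int = 8) -> dict:
    """Build a cluster policy definition for job clusters owned by one team."""
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    return {
        # Only job clusters may be created under this policy.
        "cluster_type": {"type": "fixed", "value": "job"},
        # Autoscaling is bounded so a runaway job cannot scale without limit.
        "autoscale.min_workers": {"type": "fixed", "value": 1},
        "autoscale.max_workers": {
            "type": "range",
            "maxValue": max_workers,
            "defaultValue": min(4, max_workers),
        },
        # Prefer spot capacity, falling back to on-demand (AWS assumption).
        "aws_attributes.availability": {"type": "fixed", "value": "SPOT_WITH_FALLBACK"},
        # Every cluster carries a cost tag that usage dashboards can group by.
        "custom_tags.team": {"type": "fixed", "value": team},
    }


policy = job_cluster_policy("analytics", max_workers=8)
print(json.dumps(policy, indent=2))
```

Fixing the tag in the policy, instead of asking users to set it, is what makes per-team DBU reporting reliable: no cluster can be created without it.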
Reference Medallion Architecture

Bronze, Silver, Gold — and a governed hand-off.

The Medallion pattern works because each layer has one job. We make the contracts between layers explicit, tested and observable.

ETY · DATABRICKS LAKEHOUSE · acme-prod · metastore: eu-west-1

BRONZE · LAND RAW · APPEND-ONLY
  events_raw (delta · 12 GB)
  orders_raw (delta · 6 GB)
  users_raw (delta · 1 GB)
  log_raw (delta · 220 MB)

SILVER · CLEAN · CONFORM · DEDUP
  events_clean (DLT · live)
  orders_clean (DLT · live)
  user_dim (SCD-2)
  sessions (windowed)

GOLD · BUSINESS-READY · DIM/FACT
  mart_revenue (dim/fact)
  mart_funnel (dim/fact)
  feature_store (online + offline)
  churn_score (model out)

UNITY CATALOG · LINEAGE · POLICY · AUDIT · CLASSIFICATION

One pipeline definition. Three tables you can trust.

Each Medallion layer has a contract: Bronze is raw and append-only, Silver is clean and de-duped, Gold is business-ready. Delta Live Tables expresses the whole thing declaratively — including expectations, lineage and recovery.

  • 1
    Bronze: append-only landing

    Raw data lands once, schema evolution handled, replay always possible.

  • 2
    Silver: clean & conformed

    Deduplicated, joined to dimensions, contract enforced via DLT expectations.

  • 3
    Gold: marts & features

    Star schemas for BI, feature tables for ML. Same governance, different consumer.

  • 4
    Unity Catalog binds it all

    Lineage from source to dashboard, ABAC policies, classification — one catalog, every workspace.
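
The layer contracts above can be sketched in plain Python to show what each one promises. This is a toy model under stated assumptions, not a DLT pipeline: the order schema, the two expectation names and the helpers `land_bronze`, `build_silver` and `build_gold` are all hypothetical. In a real pipeline the same rules would be declared on the tables and enforced by the platform.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]

# Hypothetical Silver contract: expectation name -> predicate a row must satisfy.
SILVER_EXPECTATIONS: Dict[str, Callable[[Row], bool]] = {
    "order_id_present": lambda r: r.get("order_id") is not None,
    "amount_non_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}


def land_bronze(bronze: List[Row], batch: List[Row]) -> List[Row]:
    """Bronze: append the raw batch as-is; existing rows are never rewritten."""
    return bronze + [dict(r) for r in batch]


def build_silver(bronze: List[Row]) -> Tuple[List[Row], Dict[str, int]]:
    """Silver: drop rows that fail an expectation, then keep the last row per order_id."""
    violations: Dict[str, int] = defaultdict(int)
    latest: Dict[object, Row] = {}
    for row in bronze:
        failed = [name for name, check in SILVER_EXPECTATIONS.items() if not check(row)]
        for name in failed:
            violations[name] += 1  # counted so the contract is observable
        if not failed:
            latest[row["order_id"]] = row
    return list(latest.values()), dict(violations)


def build_gold(silver: List[Row]) -> Dict[str, float]:
    """Gold: a business-ready aggregate, here revenue per country."""
    revenue: Dict[str, float] = defaultdict(float)
    for row in silver:
        revenue[row["country"]] += row["amount"]
    return dict(revenue)


batch = [
    {"order_id": 1, "country": "DE", "amount": 20.0},
    {"order_id": 1, "country": "DE", "amount": 25.0},   # later version of order 1
    {"order_id": None, "country": "FR", "amount": 5.0},  # fails order_id_present
    {"order_id": 2, "country": "FR", "amount": -3.0},    # fails amount_non_negative
    {"order_id": 3, "country": "FR", "amount": 10.0},
]
bronze = land_bronze([], batch)
silver, violations = build_silver(bronze)
gold = build_gold(silver)
print(len(bronze), violations, gold)
```

Because Bronze keeps every raw row, including the two that fail the Silver contract, Silver and Gold can be rebuilt from it at any time after a rule changes.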

The Databricks surface

Lake, warehouse, ML — one platform.

Capabilities we've shipped at scale. Production runbooks on file for each.

Storage

Delta Lake · Delta UniForm · Iceberg read · Liquid Clustering

Pipelines

Delta Live Tables · Workflows · Structured Streaming · Auto Loader

Governance

Unity Catalog · Lineage · Classification · ABAC

SQL & BI

SQL Warehouse · Photon · Materialized Views · Lakeview

ML & AI

MLflow · Mosaic AI · Vector Search · Feature Store · Model Serving

Apps & Agents

Lakehouse Apps · Genie · AI/BI · DBRX

Delivery

Databricks Asset Bundles · Terraform · dbt · Git folders

FinOps

Budget Policies · Tagging · Spot · Photon optimization

Recent Databricks work

Lakehouse, not notebook farm.

Three quick takes.

Media · 14B-row events platform

DLT pipelines replaced 38 Airflow DAGs.

Bronze-silver-gold rebuild on DLT with expectations, schema evolution and replay. Operational headcount halved, freshness improved.

−68% Pipeline code
Freshness
DLT · Delta · Unity

Retail · churn ML in production

Feature Store + Model Serving in 5 weeks.

Feature engineering as Delta tables, online + offline feature store, MLflow registry to Model Serving with traffic shadowing.

5 wk To production
+11pt Retention lift
MLflow · Feature Store · Serving

Pharma · 4-region governance

Unity Catalog rollout to 12 BUs.

Metastore-per-region, ABAC policies aligned to GxP, classification & lineage from source to BI. Auditor finished early.

12 BUs governed
0 Audit findings
Unity · ABAC · Lineage

Lakehouse with guardrails.

30 minutes. Bring your top three pipelines and your last DBU bill. We'll point to where the platform is pulling its weight, and where it isn't.