Medallion Architecture: Why the Industry's Biggest Players All Bet on the Same Pattern
Most organizations don't have a data shortage. They have a data trust problem.
Reports contradict each other. Teams spend hours reconciling numbers before a meeting. Executives make decisions based on data they're not fully confident in. Meanwhile, the organization is sitting on more data than it has ever had, spread across more systems than ever before.
This is the business problem that medallion architecture solves. And it's the reason Databricks, Microsoft, and Oracle have all converged on the same fundamental approach to organizing enterprise data. When three of the largest technology companies in the world arrive at the same answer, it's worth understanding why.
I have spent 15+ years in enterprise data and technology, including hands-on work implementing medallion architecture on a large-scale unified data platform initiative. This article is written for executives and business leaders who are evaluating data platform investments or trying to make sense of what their technology teams are recommending. The goal isn't to go deep on the technical details. It's to give you the strategic context to ask the right questions and make the right calls.
What Is Medallion Architecture?
Medallion architecture is a data design pattern that organizes data into three progressive layers, each representing a higher level of quality and business readiness. The three layers are Bronze (raw data), Silver (cleaned and validated data), and Gold (curated, business-ready data). It is the dominant pattern used in modern data lakehouses and enterprise data platforms.
Think of it like a water treatment process. Raw water comes in (Bronze), it gets filtered and treated (Silver), and clean, safe water comes out the other end ready for use (Gold). You wouldn't skip the treatment steps just to move faster. The same logic applies to enterprise data.
Databricks coined the term, but the pattern itself reflects a principle that data engineers have followed for decades: raw data is not ready for decision-making, and the process of making it ready should be structured, repeatable, and auditable.
The Business Case for a Layered Data Architecture
Before getting into how each platform works, it helps to understand what problem the architecture is actually solving in business terms.
When data moves from its source system into your organization's reporting and analytics environment, a lot can go wrong. Data arrives in inconsistent formats. Records get duplicated. Systems that were never designed to talk to each other get forced into the same reports. The result is the situation most large organizations know well: no single source of truth, low confidence in the numbers, and an analytics team that spends more time cleaning data than generating insight.
The medallion architecture addresses this by creating clear boundaries between raw, in-progress, and finished data. Each layer has a defined purpose:
- Bronze layer: The landing zone. Data arrives here exactly as it was in the source system. Nothing is changed, nothing is filtered. This layer preserves a complete, unmodified record that can be reprocessed if business rules or requirements change later.
- Silver layer: The validation and enrichment zone. This is where duplicates are removed, data types are standardized, records from different systems are joined, and quality checks are applied. The Silver layer is where your data starts becoming trustworthy.
- Gold layer: The business-ready zone. Data here is curated, aggregated, and structured for specific reporting, analytics, or AI use cases. This is what your dashboards, executive reports, and machine learning models should be built on.
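For readers who want to see the mechanics, here is a minimal sketch of the three layers in plain Python. It is an illustration only, not how any of the platforms discussed here implements the pattern: the order records, field names, and the `to_silver` and `to_gold` helpers are all hypothetical, and a real pipeline would run on the platform's own engine rather than on in-memory lists.

```python
from collections import defaultdict
from datetime import date

# Bronze: records land exactly as the (hypothetical) source systems sent them.
bronze = [
    {"order_id": "1001", "amount": "250.00", "order_date": "2025-01-15", "region": "west"},
    {"order_id": "1001", "amount": "250.00", "order_date": "2025-01-15", "region": "west"},  # duplicate
    {"order_id": "1002", "amount": "99.50", "order_date": "2025-01-16", "region": "East "},
    {"order_id": "1003", "amount": "n/a", "order_date": "2025-01-16", "region": "east"},  # bad amount
]


def to_silver(raw_records):
    """Deduplicate, standardize types, and drop records that fail validation."""
    seen, clean = set(), []
    for rec in raw_records:
        if rec["order_id"] in seen:
            continue  # skip duplicates of an order already accepted
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # in practice, quarantine for review rather than drop silently
        seen.add(rec["order_id"])
        clean.append({
            "order_id": int(rec["order_id"]),
            "amount": amount,
            "order_date": date.fromisoformat(rec["order_date"]),
            "region": rec["region"].strip().title(),
        })
    return clean


def to_gold(silver_records):
    """Aggregate validated records into a business-ready revenue-by-region view."""
    revenue = defaultdict(float)
    for rec in silver_records:
        revenue[rec["region"]] += rec["amount"]
    return dict(revenue)


silver = to_silver(bronze)  # Bronze itself is never modified
gold = to_gold(silver)
print(gold)  # {'West': 250.0, 'East': 99.5}
```

Note that the Bronze list still holds all four original records after the run. That is the property that makes reprocessing possible when rules change.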
The Role of the Semantic Model
There is often a fourth component that sits between the Gold data and the business users consuming it: the semantic model. A semantic model translates raw data structures into business language. Instead of a table called "fact_rev_txn_q4," your analysts and executives see "Q4 Revenue." Instead of cryptic field names and numeric codes, they see the terms your organization actually uses.
This translation layer is what makes self-service analytics work in practice. Without it, even perfectly curated Gold data requires a technical intermediary to interpret. With it, business users can answer their own questions without waiting on the data team. All three platforms in this article have invested heavily in semantic modeling capabilities, because they recognize it as the bridge between good data and actual business use.
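To make the idea concrete, the translation can be sketched as a simple lookup from business terms to technical definitions. This is a toy model under stated assumptions: the `rev_amt` column, the sample rows, and the `answer` helper are hypothetical, and commercial semantic layers are far richer than a dictionary.

```python
# Hypothetical semantic model: business-friendly names mapped to technical ones.
SEMANTIC_MODEL = {
    "Q4 Revenue": {"table": "fact_rev_txn_q4", "column": "rev_amt", "aggregation": sum},
    "Q4 Order Count": {"table": "fact_rev_txn_q4", "column": "rev_amt", "aggregation": len},
}

# Stand-in for a Gold table; the values are made up for illustration.
GOLD_TABLES = {
    "fact_rev_txn_q4": [
        {"txn_id": 1, "rev_amt": 1200.0},
        {"txn_id": 2, "rev_amt": 800.0},
        {"txn_id": 3, "rev_amt": 500.0},
    ]
}


def answer(business_term):
    """Resolve a business term to its technical definition and compute it."""
    definition = SEMANTIC_MODEL[business_term]
    rows = GOLD_TABLES[definition["table"]]
    values = [row[definition["column"]] for row in rows]
    return definition["aggregation"](values)


print(answer("Q4 Revenue"))      # 2500.0
print(answer("Q4 Order Count"))  # 3
```

Because each metric is defined once in one place, every report that asks for "Q4 Revenue" gets the same calculation.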
Business Outcomes
The business outcomes this architecture enables are tangible:
- Faster time to insight
- Reduced operational costs from eliminating redundant reporting tools
- Improved data governance and audit readiness
- Lower risk of decisions made on bad data
- A foundation that supports AI and advanced analytics without requiring a complete rebuild later
A Critical Reality Check: These Platforms Are Still Maturing
Before evaluating any of these platforms, executives need to understand something that vendor presentations rarely lead with: all three of these platforms are still under active development.
Databricks, Microsoft Fabric, and Oracle's AI Data Platform are shipping new capabilities on a near-monthly basis. Features that didn't exist six months ago are now core parts of the product. Capabilities on the roadmap today may redefine the platform by the time your organization finishes implementation.
This is not a reason to wait. It is a reason to evaluate differently. The question isn't just "what can this platform do today?" It's "is this vendor's direction aligned with where our organization is going?" A platform that is right for your needs today but heading in a different direction than your business is a risk. A platform that is slightly immature today but developing rapidly in exactly the areas you care about is an opportunity.
The medallion architecture pattern itself is stable and proven. The tooling built around it is what's evolving. Executives who understand that distinction will make better platform decisions than those who don't.
How the Big Three Implement Medallion Architecture
Databricks: The Originator
Databricks coined the term "medallion architecture" and their platform is built around it. For business leaders, the important thing to understand about Databricks is that it was designed for organizations that need flexibility and want to avoid being locked into a single vendor's ecosystem. It runs on open standards like Delta Lake and Apache Spark, which means your data assets aren't trapped inside a proprietary format.
The practical business implication is that Databricks tends to deliver the most value in organizations with strong technical teams and complex, evolving data needs. It's a powerful platform, and it requires the technical depth to use it well. Organizations that have invested in that technical depth often see significant competitive advantage: faster data pipelines, more sophisticated AI capabilities, and the ability to scale without rebuilding. On the governance side, Unity Catalog provides unified governance across all data and AI assets, with fine-grained access controls, automated PII detection, and column-level lineage tracking.
The risk to be transparent about: Databricks is engineering-forward, and organizations without strong internal data engineering capability may underutilize the platform or incur higher implementation costs than anticipated.
Strategic fit: Organizations with strong technical teams, complex data environments, and a desire for flexibility and open standards.
Microsoft Fabric: The Integrated Suite
Microsoft Fabric launched in 2023 and has been developing rapidly since. For executives already running Microsoft 365, Azure, or Power BI, the strategic case for Fabric is straightforward: it is designed to bring all of your data and analytics capabilities into a single, governed environment on infrastructure you already pay for.
The business outcome that Fabric is most directly designed to deliver is consolidation. If your organization is running multiple analytics tools, multiple data warehouses, or multiple reporting environments, Fabric is built to bring those into one place. Organizations that have successfully consolidated onto Fabric report significant reductions in the time analysts spend moving data between tools, meaningful cost savings from retiring legacy platforms, and faster delivery of reports and dashboards to decision makers.
The native integration with Power BI is a genuine differentiator. When the layer where your data becomes business-ready connects directly to the tool your teams use to visualize and share it, you eliminate an entire category of integration work and the errors that come with it. Fabric also has one of the strongest semantic modeling stories of the three platforms. Power BI's semantic model layer allows organizations to define business metrics, hierarchies, and terminology once, centrally, and have those definitions flow consistently across every report and dashboard in the organization. For executives who have experienced the frustration of two reports showing different revenue figures because two teams calculated the metric differently, this is the capability that fixes that problem at the source.
Medallion architecture is the recommended design pattern for Fabric. Data is stored in OneLake, a unified logical data lake built on Azure Data Lake Storage, with Delta Lake as the default storage format. Governance is backed by Microsoft Purview, providing sensitivity labels, federated domain management, access controls, and auditing across all workloads.
The risk to be transparent about: Fabric is still maturing. Some capabilities that organizations expect from an enterprise data platform are still being built. Early adopters have encountered gaps, and implementation timelines should account for a platform that is evolving in real time.
Strategic fit: Organizations already invested in the Microsoft ecosystem, particularly those using Power BI and Azure, looking to consolidate their data and analytics environment.
I have been leading stakeholder engagement on a unified data platform initiative at a major university built on Microsoft Fabric, where we are implementing medallion architecture to consolidate six departments onto a single enterprise data platform. That hands-on experience with Bronze, Silver, and Gold layer implementation, along with the real-world challenges of data quality gates, layer ownership, and cross-department governance, informed much of what I have written here.
Oracle AI Data Platform: The Enterprise AI Foundation
Oracle's AI Data Platform, announced in October 2025, is the newest of the three and reflects where the entire industry is heading. Where Databricks and Microsoft Fabric were originally designed for analytics and evolved toward AI, Oracle built this platform with AI as the primary use case from day one.
For executives, the most important thing to understand about Oracle's platform is how it handles governance. In most organizations, data governance and AI governance are treated as separate concerns managed by separate teams. Oracle's platform treats them as the same concern, managed through the same framework. The platform's Master Catalog provides centralized metadata management, role-based access control, lineage tracking, and compliance auditing. The same controls that ensure your data is accurate and compliant also apply to the AI models and outputs built on top of that data. For organizations where AI is becoming a strategic initiative and regulatory scrutiny of AI is increasing, that kind of unified governance is a meaningful risk mitigation. On the technical side, the platform supports medallion architecture natively, with Delta Uniform across Delta Lake, Apache Iceberg, and Apache Hudi formats, and uses Apache Spark for compute.
The integration story with Oracle Fusion Cloud applications is also relevant for organizations already running those systems. Oracle provides a native bulk extraction tool (BICC) that brings Fusion Cloud data into the AI Data Platform Workbench, and Oracle's separate Fusion AI Data Platform product offers prebuilt data pipelines, KPIs, and dashboards for Fusion application data. For organizations already invested in Oracle's ecosystem, this reduces the integration effort compared to building from scratch.
The risk to be transparent about: As the newest platform of the three, Oracle's offering has the least track record at enterprise scale. Organizations adopting it early are, to some degree, betting on the roadmap. Oracle's size and investment level make that a reasonable bet, but it is a bet.
Strategic fit: Large enterprises running Oracle Fusion Cloud systems, and organizations where unified data and AI governance is a strategic priority.
What All Three Get Right
Despite their differences, all three platforms are aligned on the principles that matter most for enterprise data.
Data quality is a process, not a project. You don't fix data once. You build systems that progressively improve it, catch problems early, and preserve the ability to go back and reprocess when something goes wrong.
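A small sketch shows what "go back and reprocess" means in practice. It assumes a made-up enrollment feed and a hypothetical `build_silver` quality gate; the point is only that Silver can be rebuilt from untouched Bronze data when a business rule changes.

```python
# Raw Bronze records are kept untouched so they can be replayed at any time.
BRONZE = [
    {"student_id": "S1", "credits": "12"},
    {"student_id": "S2", "credits": "-3"},  # fails under either rule
    {"student_id": "S3", "credits": "22"},  # valid only under the looser rule
]


def build_silver(bronze_records, max_credits):
    """Apply a quality gate; return (passed, quarantined) without touching Bronze."""
    passed, quarantined = [], []
    for rec in bronze_records:
        credits = int(rec["credits"])
        if 0 <= credits <= max_credits:
            passed.append({"student_id": rec["student_id"], "credits": credits})
        else:
            quarantined.append(rec)  # held for review, not silently discarded
    return passed, quarantined


# First run: the business rule caps a term at 18 credits.
silver_v1, rejected_v1 = build_silver(BRONZE, max_credits=18)

# The rule changes later; Silver is simply rebuilt from the same Bronze data.
silver_v2, rejected_v2 = build_silver(BRONZE, max_credits=24)

print(len(silver_v1), len(rejected_v1))  # 1 2
print(len(silver_v2), len(rejected_v2))  # 2 1
```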
Governance is not optional. Every platform here treats access controls, lineage, and compliance as core capabilities, not afterthoughts. That matters for organizations facing regulatory requirements, audit obligations, or simply the need to answer "where did this number come from?" with a reliable answer.
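The "where did this number come from?" question is, at its core, a walk backward through recorded dependencies. The sketch below is a simplified illustration, not any vendor's catalog API: the dataset names, the `Dataset` class, and the `trace_lineage` function are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A dataset plus the names of the datasets it was derived from."""
    name: str
    layer: str
    sources: list = field(default_factory=list)


# Hypothetical catalog entries; a platform catalog would hold this metadata.
CATALOG = {
    ds.name: ds
    for ds in [
        Dataset("crm_export", "bronze"),
        Dataset("billing_export", "bronze"),
        Dataset("customers_clean", "silver", ["crm_export"]),
        Dataset("invoices_clean", "silver", ["billing_export"]),
        Dataset("revenue_by_customer", "gold", ["customers_clean", "invoices_clean"]),
    ]
}


def trace_lineage(name):
    """Return every upstream dataset that feeds `name`, nearest first."""
    upstream = []
    queue = list(CATALOG[name].sources)
    while queue:
        current = queue.pop(0)
        if current not in upstream:
            upstream.append(current)
            queue.extend(CATALOG[current].sources)
    return upstream


print(trace_lineage("revenue_by_customer"))
# ['customers_clean', 'invoices_clean', 'crm_export', 'billing_export']
```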
AI readiness is the new standard. The same data foundation that powers your dashboards today powers your AI applications tomorrow. Organizations that get this architecture right don't have to rebuild when they are ready to move into AI. They are already there.
Platform Comparison: Databricks vs. Microsoft Fabric vs. Oracle AI Data Platform
Medallion Architecture
- Databricks: Coined the term. Native support via Delta Lake.
- Fabric: Recommended design pattern. OneLake with Delta Lake as default format.
- Oracle AIDP: Native support. Delta Uniform across Delta Lake, Iceberg, and Hudi.
Governance
- Databricks: Unity Catalog. Fine-grained access control, PII detection, column-level lineage.
- Fabric: Microsoft Purview. Sensitivity labels, federated domains, auditing.
- Oracle AIDP: Master Catalog. Role-based access, lineage tracking, compliance auditing.
AI Capabilities
- Databricks: Notebooks, MLflow, Mosaic AI for generative AI workloads.
- Fabric: Copilot across workloads, Data Science workload, Microsoft Foundry.
- Oracle AIDP: Spark notebooks, AI agent support, OCI AI services.
Semantic / BI Layer
- Databricks: No native BI tool. Integrates with Power BI, Tableau, and others.
- Fabric: Power BI native. Centralized semantic model layer for consistent metrics.
- Oracle AIDP: Oracle Analytics Cloud for visualization and self-service.
Open Standards
- Databricks: Delta Lake, Spark, Iceberg. Lakehouse Federation for external systems.
- Fabric: Delta Lake (default), Spark, ADLS Gen2. OneLake shortcuts for cross-cloud access.
- Oracle AIDP: Delta Lake, Iceberg, Hudi (via Delta Uniform), Spark.
Ecosystem Fit
- Databricks: Cloud-agnostic. Runs on AWS, Azure, and GCP.
- Fabric: Microsoft ecosystem. M365, Azure, Power BI, Teams.
- Oracle AIDP: OCI-native. Fusion Cloud data integration via BICC connector.
Platform Maturity
- Databricks: Most mature. Years of enterprise adoption at scale.
- Fabric: Generally available since November 2023. Rapidly evolving.
- Oracle AIDP: Announced October 2025. Newest of the three. Some features in preview.
Best For
- Databricks: Organizations with strong technical teams and multi-cloud requirements.
- Fabric: Organizations in the Microsoft ecosystem looking to consolidate analytics.
- Oracle AIDP: Large enterprises on OCI or Oracle Fusion Cloud prioritizing unified AI governance.
The Decision Most Organizations Get Wrong
Here is where I will be direct, because it is the most important point in this article.
The single biggest mistake organizations make when embarking on a data platform initiative is selecting the technology before understanding the business requirements. It seems obvious when stated plainly, but it happens constantly. A vendor gives a compelling demonstration. Leadership gets excited. A platform gets selected. And then the hard work of figuring out what the organization actually needs begins, after the decision has already been made.
That sequencing creates real costs. It means requirements get forced to fit the tool rather than the tool being selected to fit the requirements. It means stakeholders who should have been at the table early get engaged late, surfacing critical information after decisions are already locked. It means governance, compliance, and integration requirements get discovered mid-implementation rather than during planning.
The right sequence is this: understand the business problem first. Assess the current state of your data environment. Gather requirements from every stakeholder group that touches data. Then, and only then, evaluate platforms against those requirements.
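One common way to keep that last step honest is a weighted scoring matrix, where the weights are agreed with stakeholders before any vendor demo. The sketch below uses invented requirements, weights, and scores for two unnamed platforms; none of the numbers reflect an assessment of any real product.

```python
# Hypothetical requirements with weights agreed by stakeholders (they sum to 1.0).
REQUIREMENTS = {
    "Governance and audit readiness": 0.40,
    "Fit with existing systems": 0.35,
    "Self-service analytics": 0.25,
}

# Made-up 1-5 scores from an evaluation team; not an assessment of any real product.
SCORES = {
    "Platform A": {
        "Governance and audit readiness": 4,
        "Fit with existing systems": 2,
        "Self-service analytics": 5,
    },
    "Platform B": {
        "Governance and audit readiness": 3,
        "Fit with existing systems": 5,
        "Self-service analytics": 4,
    },
}


def weighted_score(platform_scores, requirements):
    """Sum each requirement's score multiplied by its weight."""
    return sum(requirements[name] * platform_scores[name] for name in requirements)


# Rank platforms from highest to lowest weighted score.
ranking = sorted(
    ((round(weighted_score(scores, REQUIREMENTS), 2), platform)
     for platform, scores in SCORES.items()),
    reverse=True,
)
for score, platform in ranking:
    print(f"{platform}: {score}")
```

In this made-up case Platform B ranks first despite the weaker governance score, because fit with existing systems carries real weight. The value of the exercise is that the weights are set by requirements, not by whichever demo was most impressive.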
Organizations that follow this sequence make better platform decisions, experience fewer mid-project surprises, and see faster time to value. The ones that skip it spend a significant portion of their implementation budget correcting for what they didn't know when they started.
I wrote about this in more detail in my article on lessons learned building a unified data platform at a major university, where this exact sequencing issue shaped much of what we learned.
Key Takeaways
- Medallion architecture is the dominant data design pattern across enterprise data platforms. Databricks, Microsoft Fabric, and Oracle have all built their platforms around it.
- The pattern organizes data into three layers. Bronze (raw), Silver (cleaned and validated), and Gold (business-ready). Each layer has a defined purpose.
- The semantic model is the bridge. It translates technical data structures into business language, enabling self-service analytics.
- All three platforms are still maturing. Evaluate based on direction and alignment with your organization's needs, not just today's feature set.
- Databricks fits organizations with strong technical teams that value flexibility and open standards.
- Microsoft Fabric fits organizations already in the Microsoft ecosystem looking to consolidate data and analytics tools.
- Oracle AI Data Platform fits large enterprises on OCI or Oracle Fusion Cloud with a focus on unified AI and data governance.
- The biggest mistake is choosing the platform before understanding the requirements. Technology selection should come after business needs are clearly documented.
What to Do Next
If your organization is considering a data platform investment, or if you are already in the middle of one and feeling the friction described in this article, the next step isn't to pick a platform. It's to get clear on what you need.
That starts with an honest assessment of where you are today: what data you have, where it lives, who needs it, and what decisions you wish you could make faster or with more confidence.
If you would like to think through that assessment together, I am happy to have that conversation. The platform decision comes later. Getting the foundation right comes first.
FAQ
What is medallion architecture?
Medallion architecture is a data design pattern that organizes data into three progressive layers: Bronze (raw data), Silver (cleaned and validated data), and Gold (curated, business-ready data). The pattern is used in data lakehouses and enterprise data platforms to progressively improve data quality as it moves through each layer. Databricks coined the term, but the same pattern is now implemented by Microsoft Fabric, Oracle AI Data Platform, and others.
What is the difference between medallion architecture and ETL?
ETL (Extract, Transform, Load) describes the process of pulling data out of source systems, reshaping it, and loading it into a target system. Medallion architecture describes how the data is organized once it arrives. They are complementary, not competing. In a medallion architecture, ETL processes are what move data between the Bronze, Silver, and Gold layers. The medallion pattern provides the structure. ETL provides the mechanism.
Does Snowflake use medallion architecture?
Snowflake does not use the "medallion architecture" label in its own documentation, but many organizations running Snowflake implement the Bronze, Silver, Gold pattern on top of it. The pattern is not tied to any single platform. It is a logical design approach that can be applied to Snowflake, Databricks, Microsoft Fabric, Oracle, or any data platform that supports layered data organization.
What is the difference between medallion architecture and star schema?
Star schema is a database modeling technique that structures data into fact tables and dimension tables for efficient querying and reporting. Medallion architecture is a broader pattern that describes how data flows through progressive quality layers. A star schema would typically live in the Gold layer of a medallion architecture, where data is structured specifically for business reporting and analytics. They work together, not as alternatives.
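As a rough illustration of what a Gold-layer star schema looks like, the sketch below joins one fact table to one dimension table. The table names, keys, and sales figures are hypothetical, and a real implementation would use SQL tables rather than Python dictionaries.

```python
# Hypothetical Gold-layer star schema: one dimension table, one fact table.
DIM_PRODUCT = {
    1: {"product_name": "Laptop", "category": "Hardware"},
    2: {"product_name": "Support Plan", "category": "Services"},
}

FACT_SALES = [
    {"product_key": 1, "quantity": 2, "sales_amount": 2400.0},
    {"product_key": 2, "quantity": 1, "sales_amount": 300.0},
    {"product_key": 1, "quantity": 1, "sales_amount": 1200.0},
]


def sales_by_category(facts, dim_product):
    """Join each fact row to its product dimension and total sales per category."""
    totals = {}
    for row in facts:
        category = dim_product[row["product_key"]]["category"]
        totals[category] = totals.get(category, 0.0) + row["sales_amount"]
    return totals


print(sales_by_category(FACT_SALES, DIM_PRODUCT))
# {'Hardware': 3600.0, 'Services': 300.0}
```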
Is medallion architecture still used?
Yes. Medallion architecture is the dominant data organization pattern in the enterprise data platform market. Databricks, Microsoft Fabric, and Oracle have all built their platforms around it. The pattern continues to grow in adoption because it provides a clear, scalable structure that supports both traditional analytics and AI workloads.
What is the difference between medallion architecture and Data Vault?
Data Vault is a database modeling methodology focused on auditability and historical tracking. Medallion architecture is a data flow pattern focused on progressive data quality improvement. Some organizations use Data Vault modeling within the Silver layer of a medallion architecture, combining the strengths of both approaches. Data Vault emphasizes how data is modeled. Medallion architecture emphasizes how data moves and improves across layers.
Cory Holmes is an AI Architect, Fractional Chief AI Officer, and Microsoft Certified AI Transformation Leader with 15+ years of experience in enterprise data, cloud infrastructure, and AI. He has led data platform initiatives across healthcare, higher education, and business environments, including hands-on implementation of medallion architecture on large-scale unified data platforms. Follow his work at CoryHolmes.com or connect on LinkedIn.