Jan 19, 2021

Solving the Tower of Babel of Business Data with the Open Data Initiative and Common Data Model

Motivation

Data is the new oil. However, it is hard to extract and is often locked in "silos". Business-process data in particular is unlike open source technology: every product that implements a business process speaks its own language in its data model. This creates two challenges when you try to use data from a variety of sources:

Application data models are hard to understand. Traditionally they rely on documentation that is kept separate from the data and is hard to find, and the data itself carries too little metadata.
Different applications use different nomenclature and semantics (different names for the same entity, different attributes for the same entity, different field names for the same attribute, and so on), so their data is hard to bring together.

The result is a Tower of Babel of application data.

Enter Open Data Initiative

Open Data Initiative | Microsoft

Unify your business data A common data model, and a common data lake help you say goodbye to costly, complex…

www.microsoft.com

and Common Data Model

Common Data Model - Common Data Model

If you've ever had to bring data from multiple systems and applications together, you know what an expensive and…

docs.microsoft.com

Common Data Model includes a set of standardized, extensible data schemas that Microsoft and its partners have published.

Result: different applications speak the same language

1. Humans can understand the data from its metadata

2. Machines can expect a standardized schema for each entity
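To make the "same language" idea concrete, here is a minimal sketch in plain Python. Every name in it (the two application record layouts, the mapping tables, the shared attribute names) is invented for illustration and is not taken from the actual Common Data Model schemas.

```python
# Hypothetical field mappings: every name below is invented for illustration
# and does not come from the published Common Data Model schemas.
CRM_TO_COMMON = {"cust_name": "name", "phone_no": "telephone"}
ERP_TO_COMMON = {"CustomerName": "name", "Tel": "telephone"}


def to_common(record: dict, mapping: dict) -> dict:
    """Rename application-specific fields to the shared attribute names."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


# The same customer as two different applications might store it.
crm_row = {"cust_name": "Contoso", "phone_no": "555-0100"}
erp_row = {"CustomerName": "Contoso", "Tel": "555-0100"}

# Once both are expressed in the shared schema, they compare equal.
assert to_common(crm_row, CRM_TO_COMMON) == to_common(erp_row, ERP_TO_COMMON)
```

With a shared schema, this per-application mapping is done once by the producer, so consumers do not each have to reinvent it.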

How it works

Common Data Model and Azure Data Lake Storage Gen2 - Common Data Model

Data stored in accordance with the Common Data Model provides semantic consistency across apps and deployments. With…

docs.microsoft.com

Data producer

Writes data, together with a metadata/schema description (created with the CDM library), into CDM folders. The data files use a standard format such as CSV or Parquet; CDM itself does not provide a library for writing the data files.
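The producer pattern, data files plus a metadata file that describes them, can be sketched with the Python standard library alone. This is only an illustration of the idea: the `metadata.json` file name, its keys, and the `write_entity` helper are hypothetical and deliberately simplified. They are not the real CDM manifest format, which is normally produced with the CDM library.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_entity(folder: Path, entity: str, attributes: list, rows: list) -> Path:
    """Write rows to a CSV file and describe them in a JSON metadata file."""
    data_dir = folder / entity
    data_dir.mkdir(parents=True, exist_ok=True)
    data_file = data_dir / "part-0000.csv"
    with data_file.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)

    # Simplified, hypothetical descriptor; NOT the real CDM manifest schema.
    metadata = {
        "entity": entity,
        "attributes": attributes,
        "format": "csv",
        "partitions": [data_file.relative_to(folder).as_posix()],
    }
    metadata_file = folder / "metadata.json"
    metadata_file.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return metadata_file


folder = Path(tempfile.mkdtemp())
metadata_path = write_entity(
    folder,
    "TestEntity",
    [{"name": "id", "dataType": "integer"}, {"name": "name", "dataType": "string"}],
    [[1, "Contoso"], [2, "Fabrikam"]],
)
print(metadata_path.read_text(encoding="utf-8"))
```

The point of the layout is that the description travels with the data, so a consumer never has to hunt for separate documentation.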

The Common Data Model manifest object - Common Data Model

In a data lake, a Common Data Model folder is a collection-spread over subfolders or accounts-of the data files and…

docs.microsoft.com

Introducing Common Data Model metadata manifest - Common Data Model

In a data lake, a Common Data Model folder is a collection, spread over sub-folders or accounts, of the data files and…

docs.microsoft.com

microsoft/CDM

The Common Data Model (CDM) is a standard and extensible collection of schemas (entities, attributes, relationships)…

github.com

Sample producers

CDS/Dynamics 365

About - Common Data Model - Common Data Model

Common Data Model is built upon a rich and extensible metadata definition system that enables you to describe and share…

docs.microsoft.com

Exporting CDS data to Azure Data Lake is Generally Available

Preview announcement for Export to data lake service. The Export to data lake service enables continuous replication of…

powerapps.microsoft.com

SAP

Integration of SAP ERP Data into a Common Data Model - SAP Modulleri Kullanici Destek ve Egitim…

1. Introduction In the past, data was often stored in differing formats across an enterprise, leading to data…

sapofis.com

Adobe

Access the Open Data Initiative with Platform

Use the Open Data Initiative data model as part of your Platform investment to easily bring together all your data and…

www.adobe.com

Data consumer

Consumes entity data using the schema.

https://microsoft.github.io/CDM/
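As a rough illustration of what "using the schema" means for a consumer, the sketch below parses headerless CSV rows into typed records from a schema description. The `SCHEMA` list and the `read_entity` helper are hypothetical stand-ins; a real consumer would take attribute names and types from the CDM metadata rather than hard-coding them.

```python
import csv
import io

# Hypothetical schema description: attribute names paired with Python types.
# A real consumer would derive this from the CDM metadata instead.
SCHEMA = [("id", int), ("name", str), ("revenue", float)]


def read_entity(csv_text: str, schema: list) -> list:
    """Parse headerless CSV rows into typed dicts using the schema."""
    records = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != len(schema):
            raise ValueError(f"expected {len(schema)} columns, got {len(row)}")
        # Pair each value with its attribute name and convert it to the declared type.
        records.append({name: cast(value) for (name, cast), value in zip(schema, row)})
    return records


rows = read_entity("1,Contoso,1200.5\n2,Fabrikam,830\n", SCHEMA)
for record in rows:
    print(record)
```

Because the schema is known up front, the consumer can reject malformed rows early instead of guessing at column meaning.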

As the notebook in the following repository shows, we can get a Spark DataFrame for an entity:

Azure/spark-cdm-connector

Contribute to Azure/spark-cdm-connector development by creating an account on GitHub.

github.com

# storageAccountName and container are assumed to be defined earlier in the notebook
readDf = (spark.read.format("com.microsoft.cdm")
    .option("storage", storageAccountName)
    .option("manifestPath", container + "/implicitTest/default.manifest.cdm.json")
    .option("entity", "TestEntity")
    .load())

Azure Synapse Analytics, Spark, and Azure Data Factory all support processing CDM data.

Common Data Model format - Azure Data Factory

APPLIES TO: Azure Data Factory Azure Synapse Analytics The Common Data Model (CDM) metadata system makes it possible…

docs.microsoft.com

Accelerator

Industry accelerators are foundational components within the Microsoft Power Platform and Dynamics 365 that enable ISVs and other solution providers to quickly build industry vertical solutions. The accelerators extend the Common Data Model so that it includes new entities to support a data schema for concepts within specific industries.

Industry accelerators - Common Data Model

Common Data Model enables horizontal consistency for customers' business data, making it easier for them to create…

docs.microsoft.com

Dynamics 365 Healthcare Accelerator

Rapidly develop healthcare solutions using data model and use case templates based on HL7/FHIR

appsource.microsoft.com

Dynamics 365 healthcare accelerator (Deprecated) - Common Data Model

Rapidly develop healthcare solutions using Fast Healthcare Interoperability Resources (FHIR) entities (such as patient…

docs.microsoft.com

If a customer uses these accelerators, we can expect the output written to the CDM folder to follow the corresponding CDM format.

Appendix

The Common Data Model in Azure Data Lake Storage - Azure Data Services - Data Factory Data Flow …

If like me, you've been keeping taps on what Microsoft has been up to on the Power Platform world, you would have…

adatis.co.uk

The Common Data Model in Azure Data Lake Storage - Azure Data Services - Adatis

If like me, you've been keeping taps on what Microsoft has been up to on the Power Platform world, you would have…

adatis.co.uk

ADF Adds Support for Inline Datasets and Common Data Model to Data Flows

Azure Data Factory makes ETL even easier when working with corporate data entities by adding support for inline…

techcommunity.microsoft.com


Written by Xin Cheng
