Create a shortcut to Dataverse environment
Follow the steps in this article to create and use a Dataverse shortcut for integrating Microsoft Sustainability Manager data into your Fabric workspace. These steps also prepare the data for transformation into the environmental, social, and governance (ESG) data model schema.
Prerequisites
- Deploy the Environmental data and insights capability in Fabric workspace.
- Ensure that Sustainability Manager is installed and set up in your Dataverse environment.
- Ensure you have the Dataverse system administrator security role.
- Ensure you're an administrator of the Fabric workspace.
- Enable Link to Fabric from the Dataverse environment.
Step 1: Create a Dataverse shortcut
In this step, you create a shortcut to the Dataverse environment where Sustainability Manager data is present from the IngestedRawData lakehouse.
Open the workspace in Fabric in which you deployed the Sustainability solution. To open the solution, select it from the list view. To get to the managed capability page, select the ESG data estate capability from the Manage deployed capabilities list view.
From the ESG data estate managed capability page, select the IngestedRawData lakehouse.
- From the lakehouse, select Get data, and then select New shortcut.
Select Dataverse as the shortcut type.
A wizard appears to set up your connection credentials for the Dataverse environment. Select Create new connection and provide the environment URL and the credentials with system administrator permissions. Fabric uses these credentials to set up the connection to the Dataverse environment and securely access the data.
- Select all the Sustainability Manager tables from the list, and then select Next. Look for the tables that begin with the prefix msdyn_.
- The list of selected Sustainability Manager tables and their shortcut names appears.
Each shortcut is created with the suffix _partitioned. To remove the suffix from all the table names, you can create a notebook, add the IngestedRawData lakehouse to the notebook, and then run the following logic. Replace "[LAKEHOUSE_NAME]" with the lakehouse name that contains the Dataverse table shortcuts. For example, SDS_ESGDE_test_IngestedRawData_LH.
TARGET_DB = "[LAKEHOUSE_NAME]"
DATAVERSE_SHORTCUT_SUFFIX = "_partitioned"
for table in spark.catalog.listTables(TARGET_DB):
if table.name.endswith(DATAVERSE_SHORTCUT_SUFFIX):
updated_table_name = table.name.removesuffix(DATAVERSE_SHORTCUT_SUFFIX)
try:
spark.sql(f"ALTER TABLE {TARGET_DB}.{table.name} RENAME TO {TARGET_DB}.{updated_table_name}")
print(f"Successfully renamed table from '{table.name}' to '{updated_table_name}' in Lakehouse '{TARGET_DB}'")
except Exception as e:
print(f"Failed renaming table from '{table.name}' to '{updated_table_name}' in Lakehouse '{TARGET_DB}'")
The tables are now available in the IngestedRawData lakehouse and are ready to be transformed into the ESG data model.
Step 2: Transform data
After the Sustainability Manager data integrates into the IngestedRawData lakehouse, you need to unify and harmonize the data by transforming it into the ESG data model schema. The capability deployment created the ProcessedESGData lakehouse in the workspace for storing the harmonized data.
The ESG data model schema links the ESG data entities to other data entities that describe other functional areas of a company, such as HR, IT, or finance production.
For transforming Sustainability Manager data to the ESG data model schema, the capability provides notebooks and data pipelines that use prebuilt transformation and orchestration technologies using libraries stored in the ConfigAndDemoData lakehouse in your workspace.
Run the LoadReferenceTables_INTB notebook to load and transform the Sustainability Manager data to the ESG data model schema in the ProcessedESGData lakehouse.
The data pipeline runs these steps:
The first step is deactivated by default and isn't required for this method, because the data from Sustainability Manager is already loaded as lakehouse tables.
The transformation of the Sustainability Manager data to the ESG data model schema is a two-stage process:
- First, the TransformMSMDataToProcessedESGData_INTB notebook transforms the Microsoft Sustainability Manager data to the intermediary raw data import tables.
- Then, the TransformRawImportESGDataToProcessedESGData_INTB notebook transforms the data from raw data import tables to the ESG data model schema.