Deploy and configure SDOH datasets - Transformations (preview) in healthcare data solutions
[This article is prerelease documentation and is subject to change.]
The SDOH datasets - Transformations (preview) pipeline helps you integrate various social determinants of health (SDOH) datasets into Fabric OneLake. You can deploy and configure this capability after you deploy healthcare data solutions to your Fabric workspace and install the healthcare data foundations capability. This article outlines the deployment process and shows you how to access the public datasets for end-to-end execution.
SDOH datasets - Transformations (preview) is an optional capability under healthcare data solutions in Microsoft Fabric. You can decide whether to use it, depending on your specific needs or scenarios.
Prerequisites
- Deploy healthcare data solutions in Microsoft Fabric.
- Install the foundational notebooks and pipelines described in Deploy healthcare data foundations.
Deploy SDOH datasets - Transformations (preview)
You can deploy the capability using the setup module explained in Healthcare data solutions: Deploy healthcare data foundations. However, the sample data selection step in this module doesn't deploy sample data for this capability. The SDOH datasets - Transformations (preview) sample data installs exclusively in your healthcare data solutions environment after you finish deploying the capability.
If you didn't use the setup module to deploy the capability and want to use the capability tile instead, follow these steps:
1. Go to the healthcare data solutions home page on Fabric.
2. Select the SDOH datasets - Transformations (preview) tile.
3. On the capability page, select Deploy to workspace.
4. The deployment can take a few minutes to complete. Don't close the tab or the browser while deployment is in progress. While you wait, you can work in another tab.
5. After the deployment completes, you can see a notification on the message bar.
6. Select Manage capability from the message bar to go to the Capability management page.
   Here, you can view, configure, and manage the artifacts deployed with the capability.
Artifacts
The capability installs two notebooks, a data pipeline, and the sample SDOH datasets in your healthcare data solutions environment.
Artifact | Type | Description |
---|---|---|
healthcare#_msft_sdoh_raw_extract_bronze_ingestion | Notebook | Facilitates the ingestion of SDOH public datasets into delta tables within the bronze lakehouse. |
healthcare#_msft_sdoh_bronze_silver_flatten | Notebook | Transforms the SDOH public datasets from the bronze lakehouse and ingests the data into the silver lakehouse. |
healthcare#_msft_sdoh_ingestion | Data pipeline | Sequentially runs a series of notebooks to ingest and transform SDOH public datasets from the landing zone into a custom data model in the silver lakehouse. It enables unification of SDOH data with core healthcare modalities such as clinical and claims data. |
8SdohPublicDataset | Sample data | Contains SDOH data published by government agencies and other official sources, consolidated at geographic levels such as state, county, or zip code. The preview release provides eight sample SDOH datasets to help you run data pipelines and explore the capability. To learn more, see Public datasets in SDOH datasets - Transformations (preview). |
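The `#` in the artifact names above is a placeholder. As a rough sketch, assuming it stands for the number of your healthcare data solutions environment (for example, `healthcare1`), the hypothetical helper below expands the three names from the table so you can search for them in your workspace. The function name and the numbering assumption are illustrative and don't come from the product.

```python
# Artifact name templates copied from the table above.
ARTIFACT_TEMPLATES = {
    "healthcare#_msft_sdoh_raw_extract_bronze_ingestion": "Notebook",
    "healthcare#_msft_sdoh_bronze_silver_flatten": "Notebook",
    "healthcare#_msft_sdoh_ingestion": "Data pipeline",
}


def expand_artifact_names(environment_number: int) -> dict:
    """Return {artifact name: artifact type} with '#' replaced.

    Assumption: '#' is replaced by a positive environment number.
    """
    if environment_number < 1:
        raise ValueError("environment_number must be 1 or greater")
    return {
        template.replace("#", str(environment_number), 1): artifact_type
        for template, artifact_type in ARTIFACT_TEMPLATES.items()
    }


if __name__ == "__main__":
    for name, artifact_type in expand_artifact_names(1).items():
        print(f"{artifact_type}: {name}")
```

If your environment uses a different naming scheme, adjust the replacement logic instead of relying on this sketch.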
Notebook configuration
Global configuration: The global configuration values outlined in Admin lakehouse: Global configuration and in the healthcare#_msft_config_notebook in Deploy healthcare data foundations also apply to the SDOH datasets - Transformations (preview) pipeline.
Notebook-level configuration: The SDOH datasets - Transformations (preview) notebooks deploy with the preconfigured values required to run the associated data pipeline. Some configuration parameters are inherited from the global configuration and can be overridden at the notebook level. By default, you don't need to change the notebook configuration files. If needed, you can review or modify the configuration by selecting the respective notebooks in your environment.
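The inheritance described above can be pictured as a simple merge in which notebook-level values take precedence over global ones. The following is only a conceptual sketch: the function and the keys `example_source_path` and `example_log_level` are hypothetical and don't represent the actual configuration parameters or the mechanism the notebooks use.

```python
def resolve_config(global_config: dict, notebook_overrides: dict) -> dict:
    """Merge global values with notebook-level overrides.

    A notebook-level value replaces the global value with the same key.
    Overrides set to None are ignored, so the global value is inherited.
    """
    resolved = dict(global_config)  # copy so the global values stay unchanged
    for key, value in notebook_overrides.items():
        if value is not None:
            resolved[key] = value
    return resolved


if __name__ == "__main__":
    # Hypothetical keys, for illustration only.
    global_config = {
        "example_source_path": "landing/sdoh",
        "example_log_level": "INFO",
    }
    notebook_overrides = {"example_log_level": "DEBUG"}
    print(resolve_config(global_config, notebook_overrides))
```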
Runtime configuration: The SDOH notebooks are preconfigured to run using Runtime 1.2 (Spark 3.4, Delta 2.4) by default. Ensure you maintain this setting at the environment level. To learn more, see Reset Spark runtime version in the Fabric workspace.
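To confirm the runtime before you run the pipeline, you could compare the versions reported by your environment with the ones listed above. The sketch below is one hedged way to do that. The function names are hypothetical, and the check compares only the major and minor version numbers.

```python
# Versions stated above for Runtime 1.2.
EXPECTED_SPARK = (3, 4)
EXPECTED_DELTA = (2, 4)


def major_minor(version: str) -> tuple:
    """Return the (major, minor) numbers of a dotted version string."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def matches_runtime_1_2(spark_version: str, delta_version: str) -> bool:
    """Check whether both versions match Spark 3.4 and Delta 2.4."""
    return (
        major_minor(spark_version) == EXPECTED_SPARK
        and major_minor(delta_version) == EXPECTED_DELTA
    )


if __name__ == "__main__":
    # Example version strings; substitute the values from your environment.
    print(matches_runtime_1_2("3.4.1", "2.4.0"))
```

In a notebook, the Spark version string is available from the session as `spark.version`. How you obtain the Delta version depends on your environment, so this sketch takes it as an input.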