Deploy and configure DICOM data transformation in healthcare data solutions
Note
This content is currently being updated.
DICOM data transformation enables you to use the imaging data ingestion pipeline and bring your Digital Imaging and Communications in Medicine (DICOM) data to OneLake. You can set up the capability after you deploy healthcare data solutions and the healthcare data foundations capability to your Fabric workspace.
DICOM data transformation is an optional capability under healthcare data solutions in Microsoft Fabric. You can decide whether to use it based on your specific needs and scenarios.
Prerequisites
Install the foundational notebooks and pipelines in Deploy healthcare data foundations.
Deploy the DICOM imaging sample data as explained in Deploy sample data.
Deploy, configure, and run the OMOP transformations pipelines (optional).
Deploy DICOM data transformation
You can deploy the capability and the associated sample data using the setup module explained in Healthcare data solutions: Deploy healthcare data foundations. Alternatively, you can deploy the sample data later using the steps in Deploy sample data. This capability uses the 340ImagingStudies sample dataset.
If you didn't use the setup module to deploy the capability and want to use the capability tile instead, follow these steps:
Go to the healthcare data solutions home page on Fabric.
Select the DICOM data transformation tile.
On the capability page, select Deploy to workspace.
The deployment can take a few minutes to complete. Don't close the tab or the browser while deployment is in progress. While you wait, you can work in another tab.
After the deployment completes, you can see a notification on the message bar.
Select Manage capability from the message bar to go to the Capability management page.
Here, you can view, configure, and manage the artifacts deployed with the capability.
Artifacts
The capability installs the following two notebooks and one data pipeline in your healthcare data solutions environment:
Artifact | Type | Description |
---|---|---|
healthcare#_msft_imaging_dicom_extract_bronze_ingestion | Notebook | Uses the MetadataExtractionOrchestrator module in the healthcare data solutions library to extract the DICOM metadata from DCM files. The metadata is then stored in the dicomimagingmetastore delta table in the bronze lakehouse. |
healthcare#_msft_imaging_dicom_fhir_conversion | Notebook | Uses the MetadataToFhirConvertor module in the healthcare data solutions library to convert the DICOM metadata in the bronze delta table. The conversion process involves transforming metadata from the dicomimagingmetastore table into the FHIR ImagingStudy resource in the FHIR R4.3 format. The output is saved as NDJSON files. |
healthcare#_msft_imaging_with_clinical_foundation_ingestion | Data pipeline | Sequentially runs the following notebooks to ingest clinical FHIR NDJSON data and imaging DICOM data into delta tables in the bronze lakehouse, and then flatten the data into the silver lakehouse: • healthcare#_msft_raw_process_movement: Orchestrates file movement from the Ingest folder to the Process folder and ensures the presence of all file names in the Process folder. • healthcare#_msft_imaging_dicom_extract_bronze_ingestion: Extracts metadata (DICOM tags) from the ingested DICOM DCM files. • healthcare#_msft_imaging_dicom_fhir_conversion: Converts the metadata to the FHIR representation of ImagingStudy in NDJSON format. • healthcare#_msft_fhir_ndjson_bronze_ingestion: Facilitates the ingestion of FHIR and ImagingStudy NDJSON data into delta tables within the bronze lakehouse. • healthcare#_msft_bronze_silver_flatten: Transforms and flattens the data from the bronze lakehouse and ingests the data into the silver lakehouse. |
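To make the conversion step more concrete, the following minimal sketch shows how one row of DICOM metadata might map to a FHIR ImagingStudy resource and be written as a line of NDJSON. It isn't the MetadataToFhirConvertor module. The function names, the shape of the input dictionary, the sample UIDs, and the patient reference are assumptions for illustration, and only a handful of tags are mapped.

```python
import json

# Code system that FHIR uses for DICOM-defined codes such as modality.
DCM_SYSTEM = "http://dicom.nema.org/resources/ontology/DCM"


def to_imaging_study(meta: dict, patient_ref: str) -> dict:
    """Map a small, hypothetical DICOM metadata dict to a FHIR ImagingStudy dict."""
    return {
        "resourceType": "ImagingStudy",
        # The Study Instance UID is carried as an identifier in the urn:dicom:uid system.
        "identifier": [
            {"system": "urn:dicom:uid", "value": f"urn:oid:{meta['StudyInstanceUID']}"}
        ],
        "status": "available",
        "subject": {"reference": patient_ref},
        "series": [
            {
                "uid": meta["SeriesInstanceUID"],
                "modality": {"system": DCM_SYSTEM, "code": meta["Modality"]},
                "instance": [
                    {
                        "uid": meta["SOPInstanceUID"],
                        "sopClass": {
                            "system": "urn:ietf:rfc:3986",
                            "code": f"urn:oid:{meta['SOPClassUID']}",
                        },
                    }
                ],
            }
        ],
    }


def to_ndjson(resources: list) -> str:
    """Serialize resources as NDJSON: one compact JSON object per line."""
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in resources)


# Hypothetical metadata values, for illustration only.
sample = {
    "StudyInstanceUID": "1.2.3.4",
    "SeriesInstanceUID": "1.2.3.4.1",
    "SOPInstanceUID": "1.2.3.4.1.1",
    "SOPClassUID": "1.2.840.10008.5.1.4.1.1.2",  # CT Image Storage
    "Modality": "CT",
}
ndjson_text = to_ndjson([to_imaging_study(sample, "Patient/example")])
```

A real conversion groups many instances under each series and many series under each study. The sketch keeps one of each so the nesting stays easy to read.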
The DICOM data transformation notebooks deploy with preconfigured values required to run the associated data pipeline. Some configuration parameters inherit from the global configuration. By default, you aren't expected to make any changes to the notebook configuration files. If needed, you can open the notebook and review the configuration.
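The data pipeline described in the artifacts table runs its notebooks in a fixed sequence. The sketch below models that order in plain Python so you can reason about which steps depend on which. It assumes that each step runs only if the previous one succeeds. The `run_in_order` helper and the `run_notebook` callback are hypothetical and aren't part of the healthcare data solutions library. The notebook names are copied from the table, including the "#" character as it appears there.

```python
from typing import Callable, List

# Notebook names in the order the pipeline runs them, as listed in the table.
PIPELINE_STEPS = [
    "healthcare#_msft_raw_process_movement",
    "healthcare#_msft_imaging_dicom_extract_bronze_ingestion",
    "healthcare#_msft_imaging_dicom_fhir_conversion",
    "healthcare#_msft_fhir_ndjson_bronze_ingestion",
    "healthcare#_msft_bronze_silver_flatten",
]


def run_in_order(steps: List[str], run_notebook: Callable[[str], bool]) -> List[str]:
    """Run each step in sequence and stop at the first failure.

    run_notebook is a hypothetical callback that returns True on success.
    Returns the names of the steps that completed.
    """
    completed = []
    for name in steps:
        if not run_notebook(name):
            break
        completed.append(name)
    return completed


# Dry run with a stand-in callback that only records the call order.
calls = []


def record(name: str) -> bool:
    calls.append(name)
    return True


completed = run_in_order(PIPELINE_STEPS, record)
```

Passing the runner as a callback keeps the ordering logic testable outside Fabric. In a workspace, you would run the deployed data pipeline itself rather than a script like this.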
Imaging sample data
The sample data shipped with healthcare data solutions includes the imaging sample datasets that you can use to run the DICOM data transformation pipeline. You can also explore the data transformation and progression through the medallion bronze, silver, and gold lakehouses. The provided imaging sample data isn't necessarily clinically meaningful, but it's technically complete and comprehensive enough to demonstrate the solution's imaging capabilities. To access the sample data folders, make sure you download the imaging sample dataset 340ImagingStudies as explained in Deploy sample data.
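If you keep a local copy of DCM files, a quick sanity check can confirm that they look like DICOM Part 10 files before you ingest them. The sketch below relies on the DICOM file format convention of a 128-byte preamble followed by the four-byte prefix `DICM`. The helper names and the folder path in the usage note are assumptions, and the check assumes that any archives have already been extracted.

```python
from pathlib import Path


def is_dicom_part10(path: Path) -> bool:
    """Return True if the file has a 128-byte preamble followed by b"DICM"."""
    with open(path, "rb") as f:
        header = f.read(132)
    return len(header) == 132 and header[128:132] == b"DICM"


def summarize(folder: Path) -> dict:
    """Count .dcm files under a folder and how many carry the DICM prefix."""
    files = sorted(folder.rglob("*.dcm"))
    valid = sum(1 for p in files if is_dicom_part10(p))
    return {"dcm_files": len(files), "with_dicm_prefix": valid}
```

For example, `summarize(Path("340ImagingStudies"))` would report counts for a hypothetical local folder of that name. A file without the prefix isn't necessarily unusable, because some systems write DICOM data without the Part 10 header, but it's worth inspecting.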
Deploy the DICOM API in Azure Health Data Services
Important
Follow this deployment section only if you're using the DICOM service in Azure Health Data Services.
Azure Health Data Services is a cloud-based solution that helps you collect, store, and analyze health data from different sources and formats. It supports various healthcare standards, such as DICOM. The DICOM service, which is part of Azure Health Data Services, enables healthcare organizations to store, manage, and exchange medical imaging data securely and efficiently with any DICOMweb-enabled systems or applications.
DICOM data transformation in healthcare data solutions has a native integration with the DICOM service. If you're already using the DICOM service, you can also use the imaging analytical capabilities in healthcare data solutions. This native integration eliminates the need for manual integration of datasets between the two services. The integration is based on providing a OneLake shortcut to Azure Data Lake Storage Gen2 for the Azure Health Data Services DICOM service. To set up the data lake integration, follow the steps in Deploy the DICOM service with Azure Data Lake Storage. For more information on this ingestion pipeline, go to Option 2: End to end integration with the DICOM service.
Configure Azure Data Lake Storage ingestion
Follow this Bring Your Own Storage (BYOS) configuration option if you want to use DICOM files from your Azure Data Lake Storage Gen2 storage location. With this option, you don't need to copy or move the DICOM files to the OneLake ingestion folders. Instead, you can create a OneLake shortcut to Data Lake Storage Gen2 and access the DICOM files from their original storage location.
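If you prefer to script the shortcut rather than create it in the portal, a request might be assembled along the following lines. This is a sketch that assumes the request shape of the create-shortcut operation in the Fabric REST API, so confirm the endpoint and field names against the current API reference before you use it. The helper name and every ID, path, and account name are placeholders, and the code only builds the request without sending it.

```python
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_shortcut_request(workspace_id: str, lakehouse_id: str,
                           shortcut_path: str, shortcut_name: str,
                           storage_account: str, subpath: str,
                           connection_id: str):
    """Build the URL and JSON body for an ADLS Gen2 shortcut (assumed API shape)."""
    url = f"{FABRIC_API}/workspaces/{workspace_id}/items/{lakehouse_id}/shortcuts"
    body = {
        # Folder inside the lakehouse where the shortcut appears.
        "path": shortcut_path,
        "name": shortcut_name,
        "target": {
            "adlsGen2": {
                # Data Lake Storage Gen2 endpoint of the source account.
                "location": f"https://{storage_account}.dfs.core.windows.net",
                # Container and folder that hold the DICOM files.
                "subpath": subpath,
                # ID of an existing Fabric connection to the storage account.
                "connectionId": connection_id,
            }
        },
    }
    return url, body


# Placeholder values, for illustration only.
url, body = build_shortcut_request(
    workspace_id="<workspace-id>",
    lakehouse_id="<bronze-lakehouse-id>",
    shortcut_path="Files/<ingestion-folder>",
    shortcut_name="dicom-byos",
    storage_account="<storage-account>",
    subpath="/<container>/<dicom-folder>",
    connection_id="<connection-id>",
)
```

Sending the request would also require an authenticated HTTP client and a bearer token, which the sketch leaves out.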