Overview of Unstructured clinical notes enrichment in healthcare data solutions (preview)

[This article is prerelease documentation and is subject to change.]

Unstructured clinical notes enrichment is a capability that uses the Text Analytics for health service in Azure AI Language to extract and structure data from unstructured clinical notes, which enhances the analytical potential of those notes. The service extracts key Fast Healthcare Interoperability Resources (FHIR) entities from the notes and creates structured data from them. You can then analyze the structured data to derive insights, predictions, and quality measures that help improve patient health outcomes.

Text Analytics for health enables information labeling through named entity recognition (NER) and entity linking. You can use this service as a modular component in the healthcare data solutions (preview) data pipelines to create structured FHIR data from unstructured clinical notes. FHIR data can contain references to documents or parts of documents, known as DocumentReferences. These documents often contain rich clinical information that can enhance a patient's clinical profile when converted to structured health data that conforms to the FHIR standard. Clinical notes are also a great source of information that can be mined to guide a patient's care pathway and deliver better results. Analysts and data scientists can use this data for conducting exploratory analysis on their clinical datasets.

Unstructured clinical notes enrichment is an optional capability under healthcare data solutions in Microsoft Fabric (preview). You have the flexibility to decide whether or not to use it, depending on your specific needs or scenarios.

To learn how to deploy, configure, and use this capability, see:

Prerequisites

Using the Text Analytics for health service in Azure AI Language is optional. If you choose to use it, you must accept the Responsible AI Terms and Conditions before you deploy the service in your environment. For installation steps and guidance, go to Set up Azure Language service.

To review the transparency notes, see:

Pricing model

The pricing model is based on the total number of text records that the Text Analytics for health API processes. A text record is a unit of up to 1,000 characters. For each piece of text you submit to the API for analysis, the character count is divided by 1,000 and rounded up to the next whole number to determine the number of text records used. For example, a text that is 3,200 characters long counts as four text records. The service uses this calculation for billing purposes.
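The text record calculation can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not code from the service or the notebooks; the function name text_records and the constant name are hypothetical, and the sketch assumes that the character count is simply the length of the submitted string.

```python
import math

# Size of one text record in characters, as stated in the pricing description.
CHARS_PER_TEXT_RECORD = 1_000


def text_records(note: str) -> int:
    """Return the number of text records one submitted text would count as."""
    # Round up: a partially filled record still counts as a whole record.
    return math.ceil(len(note) / CHARS_PER_TEXT_RECORD)


# The 3,200-character example from the text counts as four text records.
print(text_records("x" * 3_200))
```

How the service treats an empty text is not covered here, so treat the zero that this sketch returns for an empty string as an artifact of the arithmetic rather than a billing rule.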

Here's the cost breakdown for document processing:

  • For up to 5,000 text records, inferencing is included in the service.
  • For 5,000 to 500,000 text records, the cost is $25 USD per 1,000 text records processed.
  • For 500,000 to 2.5 million text records, the cost is $15 USD per 1,000 text records processed.
  • For more than 2.5 million text records, the cost is $10 USD per 1,000 text records processed.

The pricing model encourages you to process large volumes of text by offering a reduced cost per record for higher volumes. Only successful inferences are charged.
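To get a rough feel for how the tiers add up, the following Python sketch estimates a cost from a total text record count. It rests on an assumption that this article doesn't confirm: that the tiers are graduated, so each rate applies only to the records that fall inside its own tier. The names TIERS and estimate_cost_usd are hypothetical, and the rates are copied from the list above. Check Azure AI Language pricing for the authoritative figures and billing period.

```python
# Each entry: (upper bound of the tier in text records, USD per 1,000 records).
# Rates come from the cost breakdown above; the first tier is included at no charge.
TIERS = [
    (5_000, 0.0),
    (500_000, 25.0),
    (2_500_000, 15.0),
    (float("inf"), 10.0),
]


def estimate_cost_usd(records: int) -> float:
    """Estimate the cost for a total record count, assuming graduated tiers."""
    cost = 0.0
    lower = 0
    for upper, rate in TIERS:
        if records <= lower:
            break
        # Only the records that fall inside this tier are charged at its rate.
        in_tier = min(records, upper) - lower
        cost += in_tier / 1_000 * rate
        lower = upper
    return cost


for total in (5_000, 6_000, 1_000_000):
    print(total, estimate_cost_usd(total))
```

Under this assumption, the first 5,000 text records cost nothing and 6,000 text records cost $25, because only the 1,000 records above the included tier are charged.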

To help prevent unexpected processing costs, we limit the documentreferencecontent text (clinical notes) that the API processes by setting the nlp_document_limit parameter value to 10 in the healthcare#_msft_silver_ta4h notebook. You can review this configuration as explained in Configure the healthcare#_msft_silver_ta4h notebook. For more information about the pricing model, see Azure AI Language pricing.