Overview of CMS claims data transformations (preview) in healthcare data solutions

Important

  • This is a preview feature.
  • Preview features aren't meant for production use and might have restricted functionality. These features are available before an official release so that customers can get early access and provide feedback.
  • To review the terms of service, see Healthcare data solutions in Microsoft Fabric.

The CMS claims data transformations (preview) capability in healthcare data solutions enables you to ingest, store, and analyze claims data in CMS (Centers for Medicare & Medicaid Services) CCLF (Claim and Claim Line Feed) format. Ingesting claims data into healthcare data solutions helps you monitor population-level trends and utilization, and measure performance against benchmarks, so that you can reduce overall claim expenses and improve patient care.

Medicare entities use CCLF files to access beneficiary claims data for analysis and care management. These files contain detailed Medicare claims information, including demographic data, provider details, and service records. They also contain data files for services such as skilled nursing, hospice, and durable medical equipment. Structured as flat, fixed-width text files, CCLF files provide granular claim-level details such as beneficiary IDs, diagnosis codes, procedure codes, service dates, and payment amounts. Each file type (for example, CCLF1 for Part A claim headers and CCLF8 for beneficiary demographics) covers a specific category of data, enabling entities to monitor costs and care quality effectively.
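Because CCLF files are flat, fixed-width text, each field is read by slicing a record at known character offsets. The following Python sketch illustrates that idea only. The field names, offsets, and widths in LAYOUT are hypothetical and don't reflect the real CCLF layout, which is defined in the CCLF Information Packet.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    start: int   # 0-based offset of the first character
    length: int  # field width in characters


# HYPOTHETICAL layout for illustration only. Real field names, offsets,
# and widths come from the CCLF Information Packet.
LAYOUT = [
    Field("claim_id", 0, 13),
    Field("beneficiary_id", 13, 11),
    Field("service_date", 24, 10),
    Field("payment_amount", 34, 10),
]


def parse_record(line: str, layout=LAYOUT) -> dict:
    """Slice one fixed-width record into a dict of trimmed string values."""
    return {f.name: line[f.start:f.start + f.length].strip() for f in layout}


# A made-up record that matches the hypothetical layout above.
sample = "CLM0000000001" + "BENE0000001" + "2024-01-15" + "    125.50"
record = parse_record(sample)
print(record)
```

Values are kept as strings at this point. Type conversion (dates, amounts) is usually easier to handle in a later step, after the raw record is safely persisted.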

Note

Healthcare data solutions in Microsoft Fabric support CCLF Information Packet Version 38.0.

This capability transforms claims data into tabular form that can be persisted in OneLake. Bringing claims data to OneLake enables scenarios such as:

  • Care management analytics
  • Identifying gaps in care
  • Revenue cycle analysis

CMS claims data transformations (preview) is an optional capability under healthcare data solutions in Microsoft Fabric. You can choose whether to use it, depending on your specific needs or scenarios.

To learn how to deploy, configure, and use the capability, see:

Conceptual architecture

The capability uses the medallion lakehouse design explained in Data architecture and management in healthcare data solutions in Microsoft Fabric. This framework organizes and processes claims data through the following layers:

  • Bronze: Stores the source claims data in its original format. It also extracts data from the source files and stores it as NDJSON files.

  • Silver: Based on the FHIR specification, this layer stores the claims data, sourced from the bronze lakehouse, in the ExplanationOfBenefit FHIR resource.

The claims transformation pipeline processes the claims data through these stages:

  1. Ingest and persist the raw claims files (present in the native CCLF format) in the bronze lakehouse.
  2. Extract data from these files and insert it into the bronze lakehouse delta table, ensuring data integrity.
  3. Convert the delta table data into FHIR NDJSON files, store them in OneLake, and transform them into relational FHIR format in the silver lakehouse.
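
To make stage 3 more concrete, the following Python sketch shows how tabular claim rows might be serialized as FHIR NDJSON, with one ExplanationOfBenefit resource per line. It's an illustration under stated assumptions, not the mapping that the capability applies: the input column names and the to_eob and to_ndjson helpers are hypothetical, and the resource is deliberately partial. A conformant ExplanationOfBenefit resource requires more elements than shown here.

```python
import json


def to_eob(row: dict) -> dict:
    """Map one claim row to a partial ExplanationOfBenefit resource.

    The input keys are HYPOTHETICAL column names. The result omits
    several elements that the FHIR specification requires.
    """
    return {
        "resourceType": "ExplanationOfBenefit",
        "id": row["claim_id"],
        "status": "active",
        "use": "claim",
        "patient": {"reference": f"Patient/{row['beneficiary_id']}"},
        "billablePeriod": {"start": row["service_date"]},
        "payment": {
            "amount": {"value": float(row["payment_amount"]), "currency": "USD"}
        },
    }


def to_ndjson(rows) -> str:
    """Serialize rows as NDJSON: one compact JSON object per line."""
    return "\n".join(
        json.dumps(to_eob(r), separators=(",", ":")) for r in rows
    )


# Made-up rows standing in for records read from the bronze delta table.
rows = [
    {"claim_id": "CLM0000000001", "beneficiary_id": "BENE0000001",
     "service_date": "2024-01-15", "payment_amount": "125.50"},
    {"claim_id": "CLM0000000002", "beneficiary_id": "BENE0000002",
     "service_date": "2024-02-03", "payment_amount": "80.00"},
]

ndjson = to_ndjson(rows)
print(ndjson)
```

NDJSON keeps each resource on its own line, so downstream steps can read and process resources one at a time without loading the whole file.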