Use the OMOP transformations sample notebooks in healthcare data solutions
Note
This content is currently being updated.
This section walks you through two Observational Medical Outcomes Partnership (OMOP) sample scenarios. These scenarios reflect common clinical research investigations that the OMOP community conducts into exposure to primary and secondary drugs across patient populations. From a time-to-value perspective, they demonstrate how quickly you can visualize analytical outcomes within your Fabric workspace. To produce the visualization, run the sample notebooks after the data pipelines populate the Fast Healthcare Interoperability Resources (FHIR) clinical data in the silver and gold lakehouses.
Prerequisites
Before you run the sample notebooks healthcare#_msft_omop_drug_exposure_era_sample and healthcare#_msft_omop_drug_exposure_insights_sample, make sure you meet the following prerequisites:
Verify that the OMOP database is created and populated with sample data.
Deploy and set up the OMOP sample data in your environment, as explained in Deploy OMOP transformations.
Review the sample notebook configuration, as explained in:
Sample scenario
The sample scenarios aim to identify patient cohorts stratified by gender and age who are exposed to a secondary drug during a certain period while on the same primary drug. The process includes the following steps:
Stratify patient population by gender and age.
Identify the primary drug (for example, insulin isophane, human 70 UNT/ML/insulin, regular, human 30Unit) that the patient population took at least once over a period of one year.
If there isn't enough data, consider a period of five years instead.
Identify another drug (the second drug) that the same patient population is exposed to during the same period.
Plot the distribution of secondary drug exposure across the gender strata.
Generate the records and visualize the distribution as a histogram plot.
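The steps above can be sketched in plain Python. This is a minimal illustration, not the notebooks' implementation: the Person and DrugEra classes, the helper names, the 10-year age bands, and the toy rows are all hypothetical, and "during the same period" is simplified here to "any era that overlaps the same calendar year".

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Person:
    person_id: int
    gender: str
    year_of_birth: int


@dataclass(frozen=True)
class DrugEra:
    person_id: int
    drug_concept_id: int
    start: date
    end: date


def exposed_in_year(eras, drug_concept_id, year):
    """Return person_ids with at least one era of the drug overlapping the year."""
    first, last = date(year, 1, 1), date(year, 12, 31)
    return {
        e.person_id
        for e in eras
        if e.drug_concept_id == drug_concept_id and e.start <= last and e.end >= first
    }


def age_band(age, width=10):
    """Map an age to a label such as '50-59' (band width is an assumption)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def stratify_secondary_exposure(persons, eras, primary_drug, secondary_drug, year):
    """Count persons on the primary drug who were also exposed to the secondary drug."""
    cohort = exposed_in_year(eras, primary_drug, year) & exposed_in_year(eras, secondary_drug, year)
    return Counter(
        (p.gender, age_band(year - p.year_of_birth))
        for p in persons
        if p.person_id in cohort
    )


# Hypothetical toy data; only the two concept IDs come from this article.
persons = [Person(1, "F", 1960), Person(2, "M", 1975), Person(3, "F", 1968)]
eras = [
    DrugEra(1, 1596977, date(2022, 1, 10), date(2022, 6, 30)),
    DrugEra(1, 1308216, date(2022, 3, 1), date(2022, 4, 15)),
    DrugEra(2, 1596977, date(2022, 2, 1), date(2022, 2, 28)),
    DrugEra(3, 1596977, date(2021, 11, 1), date(2022, 1, 20)),
    DrugEra(3, 1308216, date(2022, 8, 1), date(2022, 9, 1)),
]
counts = stratify_secondary_exposure(persons, eras, 1596977, 1308216, 2022)
for (gender, band), n in sorted(counts.items()):
    print(gender, band, n)
```

Person 2 has only the primary drug, so the sketch leaves that person out of the counts. If a one-year window yields too few rows, you could widen the overlap test to a multi-year range, in line with the five-year fallback described above.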
Tip
The sample scenarios reference the OHDSI Drug Eras sample scripts and the OMOP Drug Exposure queries. You can review these resources to learn more about similar examples published by the OMOP community.
Sample notebook execution inputs
The primary objective of the design is to generate drug era records, represented by the OMOP standardized derived table drug_era. This table stores the calculated drug eras: aggregated drug exposure information grouped by person, drug ingredient, and persistence window. A drug era represents a continuous period of assumed exposure to a specific active ingredient, as distinct from an individual drug exposure record.
The table contains the following columns:
drug_era_id: Unique identifier for each drug era.
person_id: Foreign key referencing the person exposed to the drug, with demographic details in the Person table.
drug_concept_id: Foreign key referring to a standardized concept identifier for the active ingredient.
drug_era_start_date: Start date of the drug era, derived from the first drug exposure.
drug_era_end_date: End date of the drug era, based on the last drug exposure.
drug_exposure_count: Total number of drug exposures during the drug era.
gap_days: Number of days not covered by the drug exposure records that contributed to the drug era.
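To make the relationship between these columns concrete, here is a minimal sketch of how exposures might be collapsed into eras. It is not the sample notebook's logic: the function name build_drug_eras, the Era class, and the 30-day persistence window are assumptions chosen for illustration, and the input is assumed to be already mapped to ingredient-level concept IDs.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Era:
    person_id: int
    drug_concept_id: int
    drug_era_start_date: date
    drug_era_end_date: date
    drug_exposure_count: int
    gap_days: int


def build_drug_eras(exposures, persistence_days=30):
    """Collapse (person_id, ingredient_id, start, end) tuples into eras.

    Exposures of the same person and ingredient are merged when the next one
    starts no more than `persistence_days` after the running era end date.
    """
    eras = []
    # Sorting the tuples groups rows by person and ingredient, then by start date.
    for person_id, ingredient_id, start, end in sorted(exposures):
        last = eras[-1] if eras else None
        same_key = last is not None and (
            (last.person_id, last.drug_concept_id) == (person_id, ingredient_id)
        )
        if same_key and start <= last.drug_era_end_date + timedelta(days=persistence_days):
            # Days strictly between the running end and the new start are uncovered.
            last.gap_days += max((start - last.drug_era_end_date).days - 1, 0)
            last.drug_era_end_date = max(last.drug_era_end_date, end)
            last.drug_exposure_count += 1
        else:
            eras.append(Era(person_id, ingredient_id, start, end, 1, 0))
    return eras


# Hypothetical exposures for one person and one ingredient.
exposures = [
    (1, 1596977, date(2022, 1, 1), date(2022, 1, 30)),
    (1, 1596977, date(2022, 2, 10), date(2022, 3, 11)),
    (1, 1596977, date(2022, 6, 1), date(2022, 6, 30)),
]
drug_eras = build_drug_eras(exposures)
```

In this toy input the first two exposures fall within the assumed window and merge into one era with 10 gap days, while the June exposure starts a new era.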
To generate the drug era records, we use the following OMOP standardized clinical tables:
Drug Exposure: This table contains the drug exposure data, including drug_exposure_id, person_id, drug_concept_id, drug_exposure_start_date, drug_exposure_end_date, and days_supply.
Concept Ancestor: This table stores hierarchical relationships between concepts in various vocabularies such as RxNorm. It includes the ancestor_concept_id (a reference to a higher-level concept) and the descendant_concept_id (a reference to a lower-level concept), representing the broader to narrower concept connections.
Concept: This table contains the concept data, including concept_id, concept_name, domain_id, vocabulary_id, and concept_class_id.
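The three tables are typically combined by rolling each exposed drug product up to its ingredient through Concept Ancestor. The sketch below shows that join against an in-memory SQLite database as a stand-in for the lakehouse. The rows are hypothetical (concept 900001 is an invented product ID), the tables carry only the columns listed above, and the actual notebook query may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drug_exposure (
    drug_exposure_id INTEGER, person_id INTEGER, drug_concept_id INTEGER,
    drug_exposure_start_date TEXT, drug_exposure_end_date TEXT, days_supply INTEGER);
CREATE TABLE concept_ancestor (ancestor_concept_id INTEGER, descendant_concept_id INTEGER);
CREATE TABLE concept (
    concept_id INTEGER, concept_name TEXT, domain_id TEXT,
    vocabulary_id TEXT, concept_class_id TEXT);

-- Hypothetical rows: 900001 stands in for a clinical drug product.
INSERT INTO concept VALUES (1308216, 'lisinopril', 'Drug', 'RxNorm', 'Ingredient');
INSERT INTO concept VALUES (900001, 'lisinopril 10 MG Oral Tablet', 'Drug', 'RxNorm', 'Clinical Drug');
INSERT INTO concept_ancestor VALUES (1308216, 900001);
INSERT INTO concept_ancestor VALUES (1308216, 1308216);
INSERT INTO drug_exposure VALUES (1, 42, 900001, '2022-03-01', '2022-03-30', 30);
""")

# Walk from the exposed product (descendant) up to its ingredient (ancestor).
rows = conn.execute("""
SELECT de.person_id, c.concept_id AS ingredient_concept_id, c.concept_name,
       de.drug_exposure_start_date, de.drug_exposure_end_date
FROM drug_exposure AS de
JOIN concept_ancestor AS ca ON ca.descendant_concept_id = de.drug_concept_id
JOIN concept AS c ON c.concept_id = ca.ancestor_concept_id
WHERE c.vocabulary_id = 'RxNorm' AND c.concept_class_id = 'Ingredient'
""").fetchall()
print(rows)
```

Filtering on the Ingredient concept class keeps only ingredient-level ancestors, which is the granularity that drug_era records use for drug_concept_id.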
Sample input parameters
primary_drug = 1596977 (insulin)
secondary_drug = 1308216 (lisinopril)
year = 2022
Sample notebook outputs
When you run the two sample notebooks, they generate a histogram that shows the distribution of secondary drug exposure across the gender and age strata of the patient population. The population is identified over a specific period from the derived OMOP table omop.drug_era. In this example, the period is one year.
You can use the distribution to analyze the following aspects:
- Impact of exposure by gender and age.
- Median distribution of impacted population.
- Descriptive statistics to describe the characteristics of the population.
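As a rough illustration of these aspects, the following sketch computes a median, a mean, and per-stratum counts, then prints a text histogram. The cohort rows, the summarize helper, and the 10-year band width are hypothetical; the sample notebook produces its own plot from omop.drug_era.

```python
import statistics
from collections import Counter

BAND_WIDTH = 10  # assumed age band width in years

# Hypothetical (gender, age) pairs for persons exposed to the secondary drug.
cohort = [("F", 54), ("F", 62), ("M", 47), ("M", 58), ("F", 71), ("M", 66)]


def summarize(cohort, width=BAND_WIDTH):
    """Return simple descriptive statistics and counts per (gender, age band)."""
    ages = [age for _, age in cohort]
    return {
        "count": len(ages),
        "median_age": statistics.median(ages),
        "mean_age": statistics.mean(ages),
        "by_gender": dict(Counter(gender for gender, _ in cohort)),
        "bands": Counter((gender, (age // width) * width) for gender, age in cohort),
    }


summary = summarize(cohort)
# Text histogram: one '#' per person in each stratum.
for (gender, low), n in sorted(summary["bands"].items()):
    print(f"{gender} {low}-{low + BAND_WIDTH - 1}: {'#' * n}")
```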
Things to remember
To test your custom scenarios, make a copy of the sample notebooks. Don't update the notebooks directly.
The visualization notebook uses the following parameters that you can configure to run different analyses:
primary_drug: The primary drug to analyze.
secondary_drug: The secondary drug to analyze.
year: The year for which the analysis should be performed.
Each time you run the drug exposure era notebook, it first deletes all existing OMOP drug_era records and then recreates them based on the latest OMOP data.
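This delete-then-recreate behavior makes reruns idempotent. A minimal sketch of the pattern follows, using an in-memory SQLite table as a stand-in for the lakehouse table; the refresh_drug_era function and the sample rows are hypothetical and do not reproduce the notebook's code.

```python
import sqlite3


def refresh_drug_era(conn, eras):
    """Replace the contents of drug_era with freshly computed rows."""
    with conn:  # single transaction: commit on success, roll back on error
        conn.execute("DELETE FROM drug_era")
        conn.executemany(
            "INSERT INTO drug_era "
            "(person_id, drug_concept_id, drug_era_start_date, drug_era_end_date) "
            "VALUES (?, ?, ?, ?)",
            eras,
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drug_era ("
    "drug_era_id INTEGER PRIMARY KEY, person_id INTEGER, drug_concept_id INTEGER, "
    "drug_era_start_date TEXT, drug_era_end_date TEXT)"
)

# Hypothetical era rows.
eras = [
    (1, 1596977, "2022-01-10", "2022-06-30"),
    (1, 1308216, "2022-03-01", "2022-04-15"),
]
refresh_drug_era(conn, eras)
refresh_drug_era(conn, eras)  # a second run replaces rows instead of duplicating them
row_count = conn.execute("SELECT COUNT(*) FROM drug_era").fetchone()[0]
print(row_count)
```

Because every run starts from an empty table, any manual changes to drug_era records are lost on the next run.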