How to: Query nested data in Microsoft Fabric mirrored databases from Azure Cosmos DB (Preview)

Use the mirrored database in Microsoft Fabric to query nested JSON data sourced from Azure Cosmos DB for NoSQL.

Important

Mirroring for Azure Cosmos DB is currently in preview. Production workloads aren't supported during preview. Currently, only Azure Cosmos DB for NoSQL accounts are supported.

Prerequisites

Tip

During the public preview, it's recommended to use a test or development copy of your existing Azure Cosmos DB data that can be recovered quickly from a backup.

Create nested data within the source database

Create JSON items within your Azure Cosmos DB for NoSQL account that contain varying levels of nested JSON data.

  1. Navigate to your Azure Cosmos DB account in the Azure portal.

  2. Select Data Explorer from the resource menu.

  3. Use + New container to create a new container. For this guide, name the container TestC. The corresponding database name is arbitrary.

  4. Use the + New item option multiple times to create and save these five JSON items.

    {
      "id": "123-abc-xyz",
      "name": "A 13",
      "country": "USA",
      "items": [
        {
          "purchased": "11/23/2022",
          "order_id": "3432-2333-2234-3434",
          "item_description": "item1"
        },
        {
          "purchased": "01/20/2023",
          "order_id": "3431-3454-1231-8080",
          "item_description": "item2"
        },
        {
          "purchased": "02/20/2023",
          "order_id": "2322-2435-4354-2324",
          "item_description": "item3"
        }
      ]
    }
    
    {
      "id": "343-abc-def",
      "name": "B 22",
      "country": "USA",
      "items": [
        {
          "purchased": "01/20/2023",
          "order_id": "2431-2322-1545-2322",
          "item_description": "book1"
        },
        {
          "purchased": "01/21/2023",
          "order_id": "3498-3433-2322-2320",
          "item_description": "book2"
        },
        {
          "purchased": "01/24/2023",
          "order_id": "9794-8858-7578-9899",
          "item_description": "book3"
        }
      ]
    }
    
    {
      "id": "232-abc-x43",
      "name": "C 13",
      "country": "USA",
      "items": [
        {
          "purchased": "04/03/2023",
          "order_id": "9982-2322-4545-3546",
          "item_description": "clothing1"
        },
        {
          "purchased": "05/20/2023",
          "order_id": "7989-9989-8688-3446",
          "item_description": "clothing2"
        },
        {
          "purchased": "05/27/2023",
          "order_id": "9898-2322-1134-2322",
          "item_description": "clothing3"
        }
      ]
    }
    
    {
      "id": "677-abc-yuu",
      "name": "D 78",
      "country": "USA"
    }
    
    {
      "id": "979-abc-dfd",
      "name": "E 45",
      "country": "USA"
    }
    

Setup mirroring and prerequisites

Configure mirroring for the Azure Cosmos DB for NoSQL database. If you're unsure how to configure mirroring, refer to the configure mirrored database tutorial.

  1. Navigate to the Fabric portal.

  2. Create a new connection and mirrored database using your Azure Cosmos DB account's credentials.

  3. Wait for replication to finish the initial snapshot of data.

Query basic nested data

Now, use the SQL analytics endpoint to create a query that can handle simple nested JSON data.

  1. Navigate to the mirrored database in the Fabric portal.

  2. Switch from Mirrored Azure Cosmos DB to SQL analytics endpoint.

    Screenshot of the selector to switch between items in the Fabric portal.

  3. Open the context menu for the test table and select New SQL Query.

  4. Run this query to expand on the items array with OPENJSON. This query uses OUTER APPLY to include extra items that might not have an items array.

    SELECT 
        t.name, 
        t.id, 
        t.country, 
        P.purchased, 
        P.order_id, 
        P.item_description 
    FROM OrdersDB_TestC AS t
    OUTER APPLY OPENJSON(t.items) WITH
    (
        purchased datetime '$.purchased',
        order_id varchar(100) '$.order_id',
        item_description varchar(200) '$.item_description'
    ) as P
    

    Tip

    When choosing the data types in OPENJSON, using varchar(max) for string types could worsen query performance. Instead, use varchar(n) wher n could be any number. The lower n is, the more likely you will see better query performance.

  5. Use CROSS APPLY in the next query to only show items with an items array.

    SELECT
        t.name,
        t.id,
        t.country,
        P.purchased,
        P.order_id,
        P.item_description 
    FROM
        OrdersDB_TestC as t CROSS APPLY OPENJSON(t.items) WITH (
            purchased datetime '$.purchased',
            order_id varchar(100) '$.order_id',
            item_description varchar(200) '$.item_description' 
        ) as P 
    

Create deeply nested data

To build on this nested data example, let's add a deeply nested data example.

  1. Navigate to your Azure Cosmos DB account in the Azure portal.

  2. Select Data Explorer from the resource menu.

  3. Use + New container to create a new container. For this guide, name the container TestD. The corresponding database name is arbitrary.

  4. Use the + New item option multiple times to create and Save this JSON item.

    {
      "id": "eadca09b-e618-4090-a25d-b424a26c2361",
      "entityType": "Package",
      "packages": [
        {
          "packageid": "fiwewsb-f342-jofd-a231-c2321",
          "storageTemperature": "69",
          "highValue": true,
          "items": [
            {
              "id": "1",
              "name": "Item1",
              "properties": {
                "weight": "2",
                "isFragile": "no"
              }
            },
            {
              "id": "2",
              "name": "Item2",
              "properties": {
                "weight": "4",
                "isFragile": "yes"
              }
            }
          ]
        },
        {
          "packageid": "d24343-dfdw-retd-x414-f34345",
          "storageTemperature": "78",
          "highValue": false,
          "items": [
            {
              "id": "3",
              "name": "Item3",
              "properties": {
                "weight": "12",
                "isFragile": "no"
              }
            },
            {
              "id": "4",
              "name": "Item4",
              "properties": {
                "weight": "12",
                "isFragile": "no"
              }
            }
          ]
        }
      ],
      "consignment": {
        "consignmentId": "ae21ebc2-8cfc-4566-bf07-b71cdfb37fb2",
        "customer": "Humongous Insurance",
        "deliveryDueDate": "2020-11-08T23:38:50.875258Z"
      }
    }
    

Query deeply nested data

Finally, create a T-SQL query that can find data deeply nested in a JSON string.

  1. Open the context menu for the TestD table and select New SQL Query again.

  2. Run this query to expand all levels of nested data using OUTER APPLY with consignment.

    SELECT
        P.id,
        R.packageId,
        R.storageTemperature,
        R.highValue,
        G.id,
        G.name,
        H.weight,
        H.isFragile,
        Q.consignmentId,
        Q.customer,
        Q.deliveryDueDate 
    FROM
        OrdersDB_TestD as P CROSS APPLY OPENJSON(P.packages) WITH ( packageId varchar(100) '$.packageid',
        storageTemperature INT '$.storageTemperature',
        highValue varchar(100) '$.highValue',
        items nvarchar(MAX) AS JSON ) as R 
    OUTER APPLY OPENJSON (R.items) WITH (
        id varchar(100) '$.id',
        name varchar(100) '$.name',
        properties nvarchar(MAX) as JSON 
    ) as G OUTER APPLY OPENJSON(G.properties) WITH  (
        weight INT '$.weight',
        isFragile varchar(100) '$.isFragile'
    ) as H OUTER APPLY OPENJSON(P.consignment) WITH  (
        consignmentId varchar(200) '$.consignmentId',
        customer varchar(100) '$.customer',
        deliveryDueDate Date '$.deliveryDueDate'
    ) as Q 
    

    Note

    When expanding packages, items is represented as JSON, which can optionally expand. The items property has sub-properties as JSOn which also can optionally expand.

  3. Finally, run a query that chooses when to expand specific levels of nesting.

    SELECT
        P.id,
        R.packageId,
        R.storageTemperature,
        R.highValue,
        R.items,
        Q.consignmentId,
        Q.customer,
        Q.deliveryDueDate 
    FROM
        OrdersDB_TestD as P CROSS APPLY OPENJSON(P.packages) WITH (
            packageId varchar(100) '$.packageid',
            storageTemperature INT '$.storageTemperature',
            highValue varchar(100) '$.highValue',
            items nvarchar(MAX) AS JSON
        ) as R 
    OUTER APPLY OPENJSON(P.consignment) WITH  (
        consignmentId varchar(200) '$.consignmentId',
        customer varchar(100) '$.customer',
        deliveryDueDate Date '$.deliveryDueDate'
    ) as Q 
    

    Note

    Property limits for nested levels are not enforced in this T-SQL query experience.