CosmosDB Analytical store delete changes in sql sink

Kuldeep Bhati 0 Reputation points
2024-04-30T13:52:46.2133333+00:00

Hi I am trying to capture delete changes from Azure cosmosdb analytical store using change data capture in Azure datafactory, Source is Cosmosdb, sink is Azure sql database. In between I am flattening my file using dataflow.

My insert, update is working fine, but delete is not working in sql database, so row is not deleting, if I delete a row in transactional cosmosdb.

If I understand it correctly Transaction and Analytical cosmosdb store is in sync, so change should be reflected immediately.

Is there any method or query I can use for delete operation in pipeline, so I can use any lookups.

Azure SQL Database
Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,629 questions
Azure Cosmos DB
Azure Cosmos DB
An Azure NoSQL database service for app development.
1,520 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,024 questions
{count} votes

5 answers

Sort by: Most helpful
  1. Kuldeep Bhati 0 Reputation points
    2024-05-01T14:04:29.26+00:00

    Hi Thanks for your reply, I have setup delete option in sink for dataflow, and I have also established cdc first and then I am doing delete.

    In sink I have enabled delete, update and upsert for dataflow. I am using {_rid} as key column on the sink side.

    One thing I have noticed if I delete 1 record in transactional store and run the pipeline, my pipeline reads 1 change in record and in second step of flatten file, but in sink it is not writing anything (deleting that record in azure sql database). My pipeline runs successfully.

    If I create a new pipeline then I still see that deleted record, if I load that data again from analytical container in cosmosdb (from beginning of time). That data does not exist in transaction container of cosmosdb.

    0 comments No comments

  2. Kuldeep Bhati 0 Reputation points
    2024-05-02T05:29:39.0733333+00:00

    Hi Do you have any solution for it, as my setup is correct, but my records are not deleting after I delete records in transaction database

    0 comments No comments

  3. Harishga 5,590 Reputation points Microsoft Vendor
    2024-05-02T07:24:09.3266667+00:00

    Hi @Kuldeep Bhati
    As per the documentation, currently, there is no way to delete or truncate a Cosmos container. Typically, the way most people do this today is to delete and then recreate the container. That avoids the RU/s cost of deleting every item. So, if you want to delete a row from the transactional Cosmos DB, you can delete the entire container and recreate it with the remaining rows.

    You can use the REST endpoint in Azure Data Factory to call Azure REST API for Delete Document in Cosmos DB (SQL).

    The Delete Document operation deletes an existing document in a collection.

    260432-image.png

    Reference:
    Copy and transform data from and to a REST endpoint - Azure Data Factory & Azure Synapse | Microsoft Learn
    https://video2.skills-academy.com/en-us/answers/questions/820865/using-azure-data-factory-to-call-azure-rest-api-of

    Hope this helps. Do let us know if you any further queries.

    0 comments No comments

  4. Kuldeep Bhati 0 Reputation points
    2024-05-09T14:12:26.07+00:00

    Hi Harshiga,

    Thanks for your reply.

    My records are deleting correctly in cosmosdb transaction store, but same changes I do not see in analytical store. The reason why I am saying this because if I create a new pipeline and rerun the datafactory, I can see deleted records in sql database sink.

    Second

    My records are also not deleting in azure sql database which I am using as sink in ADF pipeline. I use cosmosdb analytical container as source for adf. I have enabled insert,upsert and delete at sink.Regards

    Kuldeep Bhati


  5. Kuldeep Bhati 0 Reputation points
    2024-07-04T16:50:58.6366667+00:00

    Hi Harshiga, thanks for your reply. I am still not get any deletes in my sink (SQL), I have selected the below options in change data capture in ADF.

    Capture intermediate updates Enable this option if you would like to capture the history of changes to items including the intermediate changes between change data capture reads.
    Capture intermediate updates Enable this option if you would like to capture the history of changes to items including the intermediate changes between change data capture reads.
    Capture Deletes Enable this option to capture user-deleted records and apply them on the Sink. Deletes can't be applied on Azure Data Explorer and Azure Cosmos DB Sinks.

    Eevery time I add or update a record, I can see new row added in sql sink with incremental _ts. When I delete a record in cosmosdb, I do not dee that updated in sql sink

    I looked at the below article shared by you and it mention ___usr_opType column in sink, how I can get that column in my sql or blob sink. I also added system generated column option in adf pipeline.

    https://video2.skills-academy.com/en-us/azure/cosmos-db/get-started-change-data-capture

    Value Description Option
    1 UPDATE Capture Intermediate updates
    1 UPDATE Capture Intermediate updates
    2 INSERT There isn't an option for inserts, it's on by default
    3 USER_DELETE Capture Deletes
    4 TTL_DELETE Capture Transactional store TTLs

    If I can get this column, I can identify that record was inserted, deleted or updated. Can you please help us on this how to get this column in ADF pipeline. I tried but I am not getting it. I also tried the pipeline mentioned in the article, but couldn't see usrtype data in delta table.

    Regards

    Kuldeep Bhati