Extract Salesforce live data ingestion in Azure SQL database

Abhishek Chaudhary 0 Reputation points
2023-08-20T11:49:04.5+00:00

I have successfully copied data from Salesforce to my Azure SQL Database using Azure Data Factory pipelines, and I want to ensure that my pipeline automatically retrieves updated or new entries from Salesforce into my database. I am considering using Change Data Capture event from Salesforce and subscribing to the event using Azure Function or Event Hub. Can someone advise me on the best way to achieve this? Additionally, I want to ensure that any deleted rows in Salesforce are updated in my Azure SQL database.

Azure Functions
Azure Functions
An Azure service that provides an event-driven serverless compute platform.
5,073 questions
Azure Event Hubs
Azure Event Hubs
An Azure real-time data ingestion service.
644 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,803 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Amira Bedhiafi 25,866 Reputation points
    2023-08-20T13:27:25.8833333+00:00

    If you want to work with CDC then you need to enable it for the objects in Salesforce that you want to track. This will capture changes including create, update, delete, and undelete operations.

    Then create an Azure Function that will be triggered by the Salesforce CDC events. Use the appropriate language SDK (for example C#) to process the change events. You may use a Server-Sent Event (SSE) client to subscribe to the CDC event stream.

    Inside the Azure Function, you will need to process the change events and execute the necessary SQL commands to insert, update, or delete records in your Azure SQL Database.

    If you want to use an Azure Event Hubs with Salesforce CDC then setting up the Salesforce CDC is always a must (similar to what I mentioned above) then create an Event Hub which can be used as a highly scalable ingestion point for the CDC events.

    Then create a Consumer to Process Events, if you are able to write it a separate application (could be an Azure Function) that consumes events from the Event Hub, processes them, and writes them to the Azure SQL Database then it would be perfect.

    One last option is to use the Azure Data Factory with Incremental Loads where you can configure the pipeline to perform incremental loads based on a modified timestamp column or using a watermark.

    But keep in mind that with this option handling deletions might be trickier. You might consider running a separate process that syncs the existing records in Salesforce with your Azure SQL Database and marks any missing records as deleted.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.