What is the best way to ingest data into ADX for Production workloads

Question

Option A: Synapse Pipelines using the Azure Data Explorer command activity with an ingestion statement, as mentioned here: https://video2.skills-academy.com/en-us/azure/data-explorer/data-factory-command-activity
Option B: Synapse Notebook using the Python or C# SDKs

What are the pros and cons of each method?

Answer

Hi DataEngineer, Thanks for reaching out to Microsoft Q&A.

The best method for your use case depends on your preference. You can also combine methods to ingest data from different sources or formats. IMO, I don't see a real difference between the two, and neither is better optimized than the other. That said, pipelines might be easier to configure and maintain because they are low-code compared to a notebook.

Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.

Answer

Hi @DataEngineer ,

As @Vinodh247-1375 mentioned, there are several solutions, and which one is best depends on your use case.

You mention two solutions.

Option A, Synapse Pipelines using the Azure Data Explorer activity, is a nice low-code solution that integrates well if you are already using Synapse pipelines.

Option B, using one of the SDKs (e.g. Python or C#) hosted in something like a Synapse Notebook or an Azure Function, is perfect if you want full control over the way data is ingested and how tags and extents are managed.

Regarding the SDK, there is a 'direct ingestion' method and a 'queued ingestion' method. The first one is not recommended for production because it is synchronous: the data goes straight to the engine instead of passing through the ingestion queue.

Queued ingestion is the preferred way in production, but it gives you less control, because the data is queued first and processed later (successfully or not). You can then monitor the ingestion afterwards. There are status-reporting settings (ingestionProperties.ReportLevel, ingestionProperties.ReportMethod), but again, turning these on has a huge impact on performance in production.
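To make the queued option a bit more concrete, here is a minimal sketch in Python using the azure-kusto-ingest package. It is not a definitive implementation. The cluster URI, database, table, and blob URL are all placeholders, the CSV format and Azure CLI authentication are assumptions, and the helper names `to_ingest_uri` and `queue_blob_for_ingestion` are made up for this example. The Python SDK exposes the reporting settings as `report_level` and `report_method`.

```python
from typing import Optional, Sequence
from urllib.parse import urlsplit, urlunsplit


def to_ingest_uri(cluster_uri: str) -> str:
    """Return the queued-ingestion endpoint (host prefixed with 'ingest-')."""
    parts = urlsplit(cluster_uri)
    host = parts.netloc
    if not host.startswith("ingest-"):
        host = "ingest-" + host
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def queue_blob_for_ingestion(
    cluster_uri: str,
    database: str,
    table: str,
    blob_url: str,
    blob_size_bytes: Optional[int] = None,
    ingest_by_tags: Optional[Sequence[str]] = None,
    report_failures: bool = False,
) -> None:
    """Queue one blob for ingestion; the call returns before the data is ingested."""
    # Imported lazily so the helper above works without the SDK installed.
    # Requires: pip install azure-kusto-data azure-kusto-ingest
    from azure.kusto.data import KustoConnectionStringBuilder
    from azure.kusto.data.data_format import DataFormat
    from azure.kusto.ingest import (
        BlobDescriptor,
        IngestionProperties,
        QueuedIngestClient,
        ReportLevel,
        ReportMethod,
    )

    # Assumption: the caller is logged in with the Azure CLI (az login).
    kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
        to_ingest_uri(cluster_uri)
    )
    client = QueuedIngestClient(kcsb)

    props = IngestionProperties(
        database=database,
        table=table,
        data_format=DataFormat.CSV,  # assumption: the blob is CSV
        ingest_by_tags=list(ingest_by_tags) if ingest_by_tags else None,
        # Status reporting is off unless asked for, since it costs performance.
        report_level=(
            ReportLevel.FailuresOnly if report_failures else ReportLevel.DoNotReport
        ),
        report_method=ReportMethod.Queue,
    )

    client.ingest_from_blob(
        BlobDescriptor(blob_url, blob_size_bytes), ingestion_properties=props
    )
```

You would call `queue_blob_for_ingestion(...)` from a Synapse Notebook or an Azure Function. Because the call only places the blob on the ingestion queue, success or failure has to be checked afterwards, as described above.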

Other ways of ingesting data include Azure Stream Analytics and the Azure Data Explorer data connections to e.g. Event Hubs, IoT Hub or blob storage. Technically, even a Logic App could be used for ingestion via the ADX connector and its actions.

Again, it depends on your experience, the architecture landscape, and how much control you want to have.

If the response helped, do "Accept Answer". If it doesn't work, please let us know the progress. All community members with similar issues will benefit by doing so. Your contribution is highly appreciated.
