What is the best way to ingest data into ADX for production workloads?

DataEngineer 40 Reputation points
2024-02-26T10:07:13.4233333+00:00
  1. Option A: Synapse Pipelines, using the Azure Data Explorer command activity with an ingestion statement, as described here: https://video2.skills-academy.com/en-us/azure/data-explorer/data-factory-command-activity
  2. Option B: a Synapse Notebook using the Python or C# SDKs.

What are the pros and cons of each method?
Azure Data Explorer

2 answers

  1. Vinodh247-1375 12,506 Reputation points
    2024-02-26T11:14:27.3+00:00

    Hi DataEngineer, Thanks for reaching out to Microsoft Q&A.

    The best method for your use case depends on your preference. You can also combine methods to ingest data from different sources or formats. In my opinion, neither option is inherently better optimized than the other. That said, pipelines may be easier to configure and maintain than notebooks because they are low-code.

    Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.


  2. Sander van de Velde | MVP 30,711 Reputation points MVP
    2024-02-26T17:14:58.9266667+00:00

    Hi @DataEngineer ,

    As @Vinodh247-1375 mentioned, there are several solutions; which one is best depends on your use case.

    You mention two solutions.

    Option A, Synapse Pipelines using the Azure Data Explorer activity, is a nice low-code solution that integrates well if you are already using Synapse pipelines.
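
    To make Option A more concrete, here is a minimal sketch of the kind of ingestion statement the command activity could be given. The helper name `build_ingest_command`, the table name and the blob URI are hypothetical and only illustrate the shape of a `.ingest into table` command; it assumes the table name is a simple identifier that needs no escaping. Note that a command run this way is executed by the cluster directly rather than being queued, so it is worth checking whether that fits your production volumes.

```python
def build_ingest_command(table: str, blob_uri: str, data_format: str = "csv",
                         ignore_first_record: bool = True) -> str:
    """Compose a Kusto `.ingest into table` management command as plain text."""
    # Ingestion properties go into the trailing `with (...)` clause.
    props = [f"format='{data_format}'"]
    if ignore_first_record:
        # Skip a header row in the source file.
        props.append("ignoreFirstRecord=true")
    # The h'...' prefix marks the literal as obfuscated, so any secret
    # embedded in the storage URI is hidden in traces and logs.
    return f".ingest into table {table} (h'{blob_uri}') with ({', '.join(props)})"


# Hypothetical table and blob, for illustration only.
command = build_ingest_command(
    "Events",
    "https://example.blob.core.windows.net/container/file.csv;impersonate",
)
print(command)
```

    The resulting text is what you would paste (or parameterize) into the activity's command field.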

    Option B, using one of the SDKs (e.g. Python or C#) in combination with code hosted in a Synapse Notebook or an Azure Function, is a good fit if you want full control over how data is ingested and how tags and extents are managed.

    Regarding the SDK, there are two methods: 'direct ingestion' and 'queued ingestion'. Direct ingestion is not recommended for production because it ingests synchronously rather than asynchronously.

    Queued ingestion is the preferred way in production, but it gives you less control because the data is handed over first and processed later (successfully or not), so you can only monitor the outcome afterwards. There are diagnostic settings (ingestionProperties.ReportLevel, ingestionProperties.ReportMethod), but enabling them has a significant impact on performance in production.
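
    As a rough illustration of queued ingestion with the Python SDK, the sketch below assumes a recent `azure-kusto-ingest` package and Azure CLI authentication. The function names `ingest_uri` and `queue_blob`, and all cluster, database, table and tag values, are hypothetical. It reports failures only, which keeps status reporting to a minimum.

```python
def ingest_uri(cluster_uri: str) -> str:
    """Derive the ingestion endpoint by prefixing the host with 'ingest-'."""
    scheme, rest = cluster_uri.split("://", 1)
    return f"{scheme}://ingest-{rest}"


def queue_blob(cluster_uri: str, database: str, table: str,
               blob_uri: str, tag: str):
    """Queue one blob for ingestion and return the SDK's ingestion result."""
    # Imported lazily so this module still loads where the SDK is not installed.
    # Assumes recent azure-kusto-data / azure-kusto-ingest releases;
    # signatures may differ in older versions.
    from azure.kusto.data import KustoConnectionStringBuilder
    from azure.kusto.data.data_format import DataFormat
    from azure.kusto.ingest import (
        BlobDescriptor,
        IngestionProperties,
        QueuedIngestClient,
        ReportLevel,
        ReportMethod,
    )

    kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
        ingest_uri(cluster_uri)
    )
    client = QueuedIngestClient(kcsb)
    props = IngestionProperties(
        database=database,
        table=table,
        data_format=DataFormat.CSV,
        additional_tags=[tag],                  # tags attached to the new extents
        report_level=ReportLevel.FailuresOnly,  # report failures, not successes
        report_method=ReportMethod.Queue,       # status is written to a queue
    )
    # The call returns once the request is queued; the data lands later.
    return client.ingest_from_blob(BlobDescriptor(blob_uri), ingestion_properties=props)


print(ingest_uri("https://mycluster.westeurope.kusto.windows.net"))
```

    Because the call only queues the request, success or failure has to be checked afterwards, for example through the reported status or the cluster's ingestion failure diagnostics.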

    Other ways of ingesting data include Azure Stream Analytics and the Azure Data Explorer data connections for sources such as Event Hubs, IoT Hub or blob storage. Technically, even a Logic App could be used for ingestion via the ADX connector and its actions.

    Again, it depends on your experience, the architecture landscape, and how much control you want to have.


    If the response helped, do "Accept Answer". If it doesn't work, please let us know the progress. All community members with similar issues will benefit by doing so. Your contribution is highly appreciated.