Pause and resume Azure Data Factory pipeline

jigsm 236 Reputation points
2020-10-16T17:42:23.147+00:00

Respected,

I have a requirement where an Azure Data Factory pipeline inserts data into a staging table. At that point I want the pipeline to pause while we show the user a UI page so that they can review the data being processed. Once they have reviewed it and clicked the OK button on the UI, I want the pipeline to resume and continue with the remaining activities.

So, is there a way to pause and resume the pipeline?

Regards

Azure Data Factory

Accepted answer
  1. MartinJaffer-MSFT 26,081 Reputation points
    2020-10-16T19:56:55.123+00:00

    Hello @jigsm and thank you for your question.

    If I understand correctly, you have an application which the customer interacts with, and triggers the pipeline. After the pipeline starts, you want the customer to confirm the pipeline should continue. If the customer either declines or does not respond, the pipeline should stop at that point.

    Data Factory V2 does not have a "pause" for pipelines; however, I have something better.
    I say better because with the method below you do not need to manage pausing and resuming pipelines yourself. The wait is part of the workflow.

    Insert a Validation activity in the pipeline at the point where you want to pause. The Validation activity waits for a dataset to be ready (to exist, or to reach a certain size). Let the customer confirmation fulfill the Validation activity's requirements. For example, the Validation activity looks for a blob named "confirm/{id}". Initially this blob does not exist. If the customer clicks OK, your application creates the blob. The Validation activity finds it and the pipeline resumes execution.
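As a rough illustration of the idea above, here is a minimal Python sketch that builds the JSON definition of such a Validation activity as a plain dict. The activity name, the dataset name `ConfirmationBlob`, and the dataset parameter `confirmationId` are hypothetical placeholders, and the `timeout` and `sleep` values are arbitrary; it assumes you have a parameterized blob dataset that resolves to "confirm/{id}".

```python
import json


def build_validation_activity(dataset_name, timeout="0.00:30:00", sleep_seconds=15):
    """Return a Validation activity definition as a plain dict.

    dataset_name, the activity name, and the confirmationId parameter are
    hypothetical; substitute the names used in your own factory.
    """
    return {
        "name": "WaitForUserConfirmation",
        "type": "Validation",
        "typeProperties": {
            "dataset": {
                "referenceName": dataset_name,
                "type": "DatasetReference",
                # Assumption: the dataset takes a parameter used to build
                # the blob path "confirm/{id}"; here the pipeline run ID.
                "parameters": {
                    "confirmationId": {
                        "value": "@pipeline().RunId",
                        "type": "Expression",
                    }
                },
            },
            # Total time to keep waiting, in d.hh:mm:ss format.
            "timeout": timeout,
            # Seconds to wait between checks for the blob.
            "sleep": sleep_seconds,
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_validation_activity("ConfirmationBlob"), indent=2))
```

The printed JSON could be pasted into the `activities` array of a pipeline definition. The UI's OK handler would then only need to write a blob at the matching path.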

    The Validation activity lets you set a timeout and the interval between checks (retries). This can handle both the case of the customer clicking "no" and the case of the customer not responding. By controlling the time between checks and the total duration of the activity, you can match its lifetime to that of your application UI.

    If the Validation activity does not find the blob after retrying, it eventually times out. Downstream dependencies in the pipeline see this as a Failed status for the activity. You can use that to stop the pipeline execution, or to run other logic.
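To make the failure path concrete, here is a small sketch of how a follow-up activity could be wired to run only when the Validation activity times out. It attaches a `dependsOn` entry with the `Failed` dependency condition to an activity definition. The activity names are hypothetical, and the Wait activity is only a stand-in for whatever cleanup or notification step you would actually run.

```python
import copy

# Dependency conditions that Data Factory accepts on a dependsOn entry.
ALLOWED_CONDITIONS = {"Succeeded", "Failed", "Skipped", "Completed"}


def depends_on(activity, upstream_name, condition):
    """Return a copy of `activity` that runs after `upstream_name`
    finishes with the given dependency condition."""
    if condition not in ALLOWED_CONDITIONS:
        raise ValueError("unknown dependency condition: " + condition)
    result = copy.deepcopy(activity)
    result.setdefault("dependsOn", []).append(
        {"activity": upstream_name, "dependencyConditions": [condition]}
    )
    return result


if __name__ == "__main__":
    # Placeholder activity; replace with your own cleanup or notification step.
    cleanup = {
        "name": "HandleNoConfirmation",
        "type": "Wait",
        "typeProperties": {"waitTimeInSeconds": 1},
    }
    # "WaitForUserConfirmation" is a hypothetical Validation activity name.
    print(depends_on(cleanup, "WaitForUserConfirmation", "Failed"))
```

The activities that should continue after approval would use the same shape with `Succeeded` instead of `Failed`.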


3 additional answers

  1. MartinJaffer-MSFT 26,081 Reputation points
    2020-10-16T20:18:32.083+00:00

    @jigsm

    There is also the Webhook activity, which may be better suited to your situation than the Validation activity. However, I do not have as much experience implementing it.
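For reference, the Webhook activity calls an endpoint you specify and includes a `callBackUri` in the request body; the pipeline then waits until that URI receives a POST or the activity times out. A minimal sketch of what the UI's OK handler might do, assuming your application stored the `callBackUri` it received, is below. The function names are hypothetical and the empty JSON body is an assumption for the simple approval case.

```python
import json
import urllib.request


def build_callback_request(callback_uri):
    """Build the POST request that signals the waiting Webhook activity."""
    body = json.dumps({}).encode("utf-8")  # assumption: no output payload needed
    return urllib.request.Request(
        callback_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def approve(callback_uri):
    """Send the callback; call this from the UI's OK click handler."""
    request = build_callback_request(callback_uri)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

If the user declines or never responds, the application simply does not call `approve`, and the Webhook activity's own timeout ends the wait.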


  2. jigsm 236 Reputation points
    2020-10-16T20:40:37.787+00:00

    MartinJaffer-MSFT,

    Thanks for the prompt reply!!!

    I will evaluate both and keep you posted.

    Regards


  3. John Aherne 516 Reputation points
    2020-10-16T22:52:15.107+00:00

    Another thing you could do is split the work into two separate pipelines.
    The first pipeline loads the data for review.
    In your UI, if the user clicks approve, you could then start the second pipeline through the ADF APIs to finish the process.
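A rough sketch of that second step, using the Data Factory REST API's pipeline `createRun` operation from the standard library only. The subscription, resource group, factory, and pipeline names are placeholders, and obtaining the Azure AD bearer token is left out; it is assumed your application already has one.

```python
import json
import urllib.parse
import urllib.request

API_VERSION = "2018-06-01"


def build_create_run_request(subscription_id, resource_group, factory_name,
                             pipeline_name, bearer_token, parameters=None):
    """Build the POST request that starts a run of the second pipeline."""
    def quote(value):
        return urllib.parse.quote(value, safe="")

    url = (
        "https://management.azure.com"
        "/subscriptions/" + quote(subscription_id)
        + "/resourceGroups/" + quote(resource_group)
        + "/providers/Microsoft.DataFactory/factories/" + quote(factory_name)
        + "/pipelines/" + quote(pipeline_name)
        + "/createRun?api-version=" + API_VERSION
    )
    # The request body carries the pipeline parameters, if any.
    body = json.dumps(parameters or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": "Bearer " + bearer_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def start_pipeline(request):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

The approve button's handler would build the request with the second pipeline's name and pass it to `start_pipeline`; any values the second pipeline needs (for example, a batch ID for the staged rows) can go in `parameters`.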

