Execution Plan and Buffer Allocation

Applies to: SQL Server SSIS Integration Runtime in Azure Data Factory

Before execution, the data flow task examines its components and generates an execution plan for each sequence of components. This section provides details about the execution plan, how to view the plan, and how input and output buffers are allocated based on the execution plan.

Understanding the Execution Plan

An execution plan contains source threads and work threads, and each thread contains work lists that specify output work lists for source threads or input and output work lists for work threads. The source threads in an execution plan represent the source components in the data flow and are identified in the execution plan by SourceThreadn, where n is the zero-based number of the source thread.

Each source thread creates a buffer, sets a listener, and calls the PrimeOutput method on the source component. This is where execution starts and data originates, as the source component starts adding rows to the output buffers that are provided to it by the data flow task. After the source threads are running, the balance of work is distributed among work threads.

A work thread may contain both input and output work lists and is identified in the execution plan as WorkThreadn, where n is the zero-based number of the work thread. These threads contain output work lists when the graph contains a component with asynchronous outputs.

The following sample execution plan represents a data flow that contains a source component connected to a transformation with an asynchronous output connected to a destination component. In this example, WorkThread0 contains an output work list because the transformation component has an asynchronous output.

SourceThread0   
    Influences: 72 158   
    Output Work List   
        CreatePrimeBuffer of type 1 for output id 10   
        SetBufferListener: "WorkThread0" for input ID 73   
        CallPrimeOutput on component "OLE DB Source" (1)   
    End Output Work List   
    This thread drives 0 distributors   
End SourceThread0   
WorkThread0   
    Influences: 72 158   
    Input Work list, input ID 73   
        CallProcessInput on input ID 73 on component "Sort" (72) for view type 2   
    End Input Work list for input 73   
    Output Work List   
        CreatePrimeBuffer of type 3 for output id 74   
        SetBufferListener: "WorkThread1" for input ID 171with internal handoff   
        CallPrimeOutput on component "Sort" (72)   
    End Output Work List   
    This thread drives 0 distributors   
End WorkThread0   
WorkThread1   
    Influences: 158   
    Input Work list, input ID 171  
        CallProcessInput on input ID 171 on component "OLE DB Destination" (158) for view type 4  
    End Input Work list for input 171   
    Output Work List   
    End Output Work List   
    This thread drives 0 distributors   
End WorkThread1  

Note

The execution plan is generated every time a package is executed, and can be captured by adding a log provider to the package, enabling logging, and selecting the PipelineExecutionPlan event.

Understanding Buffer Allocation

Based on the execution plan, the data flow task creates buffers that contain the columns defined in the outputs of the data flow components. The buffer is reused as the data flows through the sequence of components, until a component with asynchronous outputs is encountered. Then, a new buffer is created, which contains the output columns of the asynchronous output and the output columns of downstream components.

During execution, components have access to the buffer in the current source or work thread. The buffer is either an input buffer, provided by the ProcessInput method, or an output buffer, provided by the PrimeOutput method. The Mode property of the PipelineBuffer also identifies each buffer as an input or output buffer.

Transformation components with asynchronous outputs receive the existing input buffer from the ProcessInput method, and receive the new output buffer from the PrimeOutput method. A transformation component with asynchronous outputs is the only type of data flow component that receives both an input and an output buffer.

Because the buffer provided to a component is likely to contain more columns than the component has in its input or output column collections, component developers can call the FindColumnByLineageID method to locate a column in the buffer by specifying its LineageID.