SQL Pagination for bulk data transfer with Logic Apps

Introduction

Data transfer is one of the key tasks for integration developer and if it is a bulk data transfer with rows in millions for single call to database then the developer needs to think about the approach which will suit the solution

In this TechNet wiki, we will go through the method of SQL pagination and how we can implement pagination within Azure Logic Apps .We will do a bulk data extract from Azure SQL database and send to blob storage from were partners can read the files for processing. For those who are new to concept of pagination

Pagination is the process of dividing the transaction result set into discrete subsets. “.

Pagination helps you to control the result output and provide cleaner structure to the result set. This means if you have a record-set of 100 k then you can divide it into 50 k each or a recordset as per your business requirement. This will be called as page size. The more generalized definition can be found on wiki https://en.wikipedia.org/wiki/Pagination .

The more generalized example of pagination can be seen on Bing or Google where your search result is being divided across multiple pages.

The concept of pagination is cross data platform and you can use the method of pagination within Oracle, DB2 with OFFSET/FETCH clause.

Solution Design

In this solution, we are developing an interface for fictions company Adventure Works which holds the product data within Adventure Works DB. The required data is distributed across multiple tables and Adventure Works partner organization requires these data to be replicated on daily basis to import within their own data warehouse system.  The exchange file format should be comma separated format (csv) between both the organizations.

The application will be using Azure Logic Apps and SQL connector to build a simple solution which will be triggered on daily basis and store the product csv data within Azure Blob storage.

                                       

The workflow action being listed below

  1. Logic App is configured with recurrence trigger with recurrence interval set to 1 day.
  2. Logic App has SQL connector configured to call stored procedure LogicAppPaginationSample.
  3. The stored procedure set the page count and page size as an input parameter.
  4. LogicAppPaginationSample   stored procedure is looped within Do until loop till the result set is empty.

Prerequisite

To work with the scenario, you need to have a valid subscription to Microsoft Azure. You can register for free subscription at link: Jump https://azure.microsoft.com/en-us/free/

Implementation

For this solution, we have used the sample Adventure Works database. We have created a stored procedure LogicAppPaginationSample which makes a cross join within Product, Product Category, and Product Model. The number of rows returned in single transaction is 1548610
     
                            

The sample stored procedure is listed below

Create Procedure  LogicAppPaginationSample
@pageNumber int,
@pageSize Int
 
As begin
 
select
product.[ProductNumber] As  [ProductNumber],
product.[Name] As  ProductName,
product.[ListPrice] As  ListPrice,
ProductCategory.[Name] As  CategoryName,
 ProductModel.[Name] As  ModelName,
 ProductDescription.[Description] As  [ProductDescription]
FROM [SalesLT].product product  
cross join [SalesLT].ProductCategory ProductCategory
cross join [SalesLT].ProductModel  ProductModel 
left  join [SalesLT].ProductDescription  ProductDescription on
ProductDescription.ProductDescriptionID=ProductModel.ProductModelID
 
ORDER BY  product.ProductID
OFFSET @PageSize * (@PageNumber - 1) ROWS
FETCH NEXT  @PageSize ROWS  ONLY
END

To test the Pagination within SQL Management Studio you can execute the stored procedure with page size and page number. The SQL Query will be like below

DECLARE @return_value int
 
EXEC    @return_value = [dbo].[LogicAppPaginationSample]
        @pageNumber = 2,
        @pageSize = 5000
 
SELECT  'Return Value' = @return_value
 
GO

The next step is to build the logic apps and associated blob container to store the product details as csv.

The logic app workflow will be like below
          

It will contain certain set of actions with recurrence trigger set to 1 hour. The related action are Do-Until Loop, Create CSV Table, and a call to the stored procedure which is being created in earlier step.

There are certain consideration while dealing with large volume transaction data

  • Select Proper azure  SQL tier which will give you max  performance benefit
  • Do indexing on the table for fast querying
  • Build Logic App with minimum actions.

The Actual run history will look like below where the stored procedure will fetch 50 K records each time until the SQL result set is set to empty.
     
                                
Once the Logic app workflow is completed, you can see the csv files being created at the blob storage. With this basic workflow, you can implement complex business case with joining event grid or artificial intelligence of the data extracted. You can de-batch the record set to initiate another workflow which can do crud operation with other data contracts. 

For this sample solution navigate to the blob storage and you will multiple product files generated through logic apps and each file will hold 50 K records which with Header value set. The conversion of JSON to CSV is being done through a newly added connector Create CSV table which takes JSON array as input and convert the result set into CSV string.

                  

To verify the result set you can download any of the csv files created and open with excel or notepad++ to get the actual row count and result set. In this sample, we have used notepad to open one of the file content.
                
             

Conclusion

SQL pagination is very much essential when you are doing data operation with the large set of records. It has in memory benefits.  With SQL pagination you are controlling the output record set of the stored procedure which might be essential for any downstream system which would not process big record set at one time

See Also

An important place to find a huge amount of BizTalk related articles is the TechNet Wiki itself.
The best entry point is BizTalk Server Resources on the TechNet Wiki. Another important place to find Logic App related articles is the TechNet Wiki itself. The best entry point is Logic App Resources on the TechNet Wiki.