Migrate AWS Data Pipelines to Azure

Lokesh 211 Reputation points
2023-06-27T14:23:07.58+00:00

We have a new project kicking off in the next couple of days.
One of the main requirements is to migrate AWS data transformation pipelines to Azure.

We would appreciate it if you could share any case studies or books so that we can do some research and build readiness.


Accepted answer
  1. ShaikMaheer-MSFT 38,321 Reputation points Microsoft Employee
    2023-06-28T17:01:37.1533333+00:00

    Hi Lokesh,

    Thank you for posting query in Microsoft Q&A Platform.

    In your case, it sounds like these are ETL pipelines that you would like to migrate to Azure. Correct me if I am wrong. For this you can use Azure Data Factory pipelines. To explore the Azure services, I would suggest going through the official documentation; it helps you understand each service and how to implement your solution. (A rough sketch of what such a pipeline can look like when created with the Python SDK is included below the links.)

    Below is the documentation link for the Azure Data Factory service.

    https://video2.skills-academy.com/en-us/azure/data-factory/

    Below is video content I prepared to help you understand the Azure Data Factory service.

    https://www.youtube.com/playlist?list=PLMWaZteqtEaLTJffbbBzVOv9C0otal1FO

    Below is the link to the Azure Data Factory blog.

    https://techcommunity.microsoft.com/t5/azure-data-factory-blog/bg-p/AzureDataFactoryBlog
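
    As a very rough illustration (not from the documentation above), the sketch below shows one way an ADF copy pipeline can be created and run with the azure-mgmt-datafactory Python SDK. Every name here (subscription, resource group, factory, dataset, and pipeline names) is a placeholder, and the source and sink datasets are assumed to already exist in the factory.

    ```python
    # Minimal sketch (placeholder names throughout): create one ADF pipeline with a
    # single copy activity, then start a run of it via the Python management SDK.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        CopyActivity,
        DatasetReference,
        DelimitedTextSource,
        ParquetSink,
        PipelineResource,
    )

    subscription_id = "<subscription-id>"
    rg_name = "<resource-group>"
    df_name = "<data-factory-name>"

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

    # Copy activity: read delimited (CSV) files from an existing source dataset and
    # write them to an existing sink dataset (for example, Parquet in a data lake).
    copy_step = CopyActivity(
        name="CopyCsvToLake",
        inputs=[DatasetReference(type="DatasetReference", reference_name="SourceCsvDataset")],
        outputs=[DatasetReference(type="DatasetReference", reference_name="LakeStagingDataset")],
        source=DelimitedTextSource(),
        sink=ParquetSink(),
    )

    # Publish the pipeline, then start a pipeline run.
    adf_client.pipelines.create_or_update(
        rg_name, df_name, "MigrateCsvPipeline", PipelineResource(activities=[copy_step])
    )
    run = adf_client.pipelines.create_run(rg_name, df_name, "MigrateCsvPipeline", parameters={})
    print("Started pipeline run:", run.run_id)
    ```

    The same pattern extends to the rest of your ETL steps: each AWS transformation becomes one or more activities (copy, data flow, stored procedure, and so on) inside an ADF pipeline.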

    Hope this helps. Please let me know if you have any further queries.


    Please consider hitting the Accept Answer button. Accepted answers help the community as well.


1 additional answer

  1. Sedat SALMAN 13,265 Reputation points
    2023-06-28T13:36:38.5+00:00

    I hope this helps you.

    https://www.automationfactory.ai/casestudies/azure-data-factory-case-studies-ai-and-automation-service

    • The data transfer was not a simple lift-and-shift process. After extraction, the data was transformed and schema changes were made so that the data looked similar to the other data the client was accustomed to handling.
    • Azure Data Factory was used for the ETL logic and file processing. Azure SQL Data Warehouse served as the final store for the data consumed by analysts, and Azure Data Lake was used for longer-term file storage.
    • The solution had to process large data sets: more than 11,000 files with a total compressed size of 2 TB, with extra files arriving every day. Ingestion had to be rate-controlled and parallelizable so that multiple database connections could be managed and files ingested in an orderly fashion (see the copy/trigger sketch after this list).
    • The data was also available in CSV format, stored in an S3 bucket. The client was already using Azure, which prompted them to consolidate into the existing infrastructure.
    • AutomationFactory.ai built a data pipeline to move the data from AWS to Azure. The initial load was triggered manually, and update schedules were set to check for new files at regular intervals.
    • Status tables were created to keep track of all the files and the status of the data as it passed through the pipelines (a sketch of such a table follows the list).
    • The data pipelines were deliberately abstracted so that adding new data sources in the future would require as little work as possible.
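
    To ground the rate-control and scheduling points above, here is a rough sketch of the pattern with the azure-mgmt-datafactory Python SDK: a copy activity with explicit parallelism limits plus an hourly schedule trigger. All names are placeholders, the referenced datasets are assumed to already exist, and this is an illustration only, not the actual AutomationFactory.ai implementation.

    ```python
    # Sketch: a rate-limited copy step plus an hourly schedule trigger that re-runs
    # the pipeline to pick up newly arrived files. All names are placeholders.
    from datetime import datetime, timezone

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        CopyActivity,
        DatasetReference,
        DelimitedTextSource,
        PipelineReference,
        PipelineResource,
        ScheduleTrigger,
        ScheduleTriggerRecurrence,
        SqlDWSink,
        TriggerPipelineReference,
        TriggerResource,
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
    rg, df = "<resource-group>", "<data-factory-name>"

    # Bound parallelism explicitly so many small file copies do not overwhelm the
    # target database with connections.
    copy_files = CopyActivity(
        name="CopyNewS3Files",
        inputs=[DatasetReference(type="DatasetReference", reference_name="S3CsvFiles")],
        outputs=[DatasetReference(type="DatasetReference", reference_name="SqlDwStagingTable")],
        source=DelimitedTextSource(),
        sink=SqlDWSink(),
        parallel_copies=4,           # upper bound on parallel copy threads
        data_integration_units=8,    # caps the compute used by each copy run
    )
    adf.pipelines.create_or_update(
        rg, df, "IncrementalLoad", PipelineResource(activities=[copy_files])
    )

    # Schedule trigger: run the pipeline once an hour to check for new files.
    trigger = ScheduleTrigger(
        recurrence=ScheduleTriggerRecurrence(
            frequency="Hour", interval=1, start_time=datetime.now(timezone.utc)
        ),
        pipelines=[
            TriggerPipelineReference(
                pipeline_reference=PipelineReference(
                    type="PipelineReference", reference_name="IncrementalLoad"
                )
            )
        ],
    )
    adf.triggers.create_or_update(rg, df, "HourlyFileCheck", TriggerResource(properties=trigger))
    ```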
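
    And for the status-table idea, a bare-bones sketch of what such tracking could look like in Azure SQL, driven from Python with pyodbc. The table name, columns, and status values are made up for illustration; the real design would depend on your pipelines.

    ```python
    # Hypothetical file-status table used to track each file as it moves through
    # the pipeline. Connection string, table name, and statuses are placeholders.
    import pyodbc

    conn = pyodbc.connect("<azure-sql-connection-string>")
    cur = conn.cursor()

    # Create the tracking table once if it does not exist yet.
    cur.execute("""
        IF OBJECT_ID('dbo.FileLoadStatus') IS NULL
        CREATE TABLE dbo.FileLoadStatus (
            FileName       NVARCHAR(400)  NOT NULL PRIMARY KEY,
            SourcePath     NVARCHAR(1000) NOT NULL,
            Status         NVARCHAR(20)   NOT NULL,  -- e.g. Registered / Copied / Transformed / Failed
            LastUpdatedUtc DATETIME2      NOT NULL DEFAULT SYSUTCDATETIME()
        )
    """)
    conn.commit()

    def set_status(file_name: str, source_path: str, status: str) -> None:
        """Insert or update the tracking row for one file (simple upsert via MERGE)."""
        cur.execute("""
            MERGE dbo.FileLoadStatus AS t
            USING (SELECT ? AS FileName, ? AS SourcePath, ? AS Status) AS s
            ON t.FileName = s.FileName
            WHEN MATCHED THEN UPDATE SET Status = s.Status, SourcePath = s.SourcePath,
                                         LastUpdatedUtc = SYSUTCDATETIME()
            WHEN NOT MATCHED THEN INSERT (FileName, SourcePath, Status)
                                  VALUES (s.FileName, s.SourcePath, s.Status);
        """, file_name, source_path, status)
        conn.commit()

    set_status("orders_2023_06_27.csv", "s3://client-bucket/exports/", "Copied")
    ```

    Each pipeline run would then update the row for every file it touches, which makes it easy to see where a given file stalled.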