Copy data from multiple on-prem NAS drive locations to Blob Storage

Brahmendra Shukla 0 Reputation points
2024-07-19T15:03:28.6166667+00:00

I have a requirement to copy data from a NAS drive to blob storage using Azure Data Factory (ADF). The NAS drive has 51 different locations, examples of which are:

T:\scd\in

T:\vnd\ed\in

T:\tx\rx\in\em

T:\rx\gs\tm\in\em

The NAS server is mapped to the T: drive.


1 answer

  1. Amira Bedhiafi 22,311 Reputation points
    2024-07-19T18:18:05.6133333+00:00

    In your case you need a self-hosted integration runtime: follow the instructions to download and install the integration runtime on a machine that has access to your NAS drive.

    After installation, register the integration runtime with your data factory. A minimal definition of the integration runtime resource is sketched below.
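
    As a rough sketch, the integration runtime is just a named ADF resource of type "SelfHosted"; the name "SelfHostedIR" below is an example, not something taken from your environment:

        {
            "name": "SelfHostedIR",
            "properties": {
                "type": "SelfHosted",
                "description": "Runs on a machine that can reach the T: NAS share"
            }
        }

    Once the resource exists, ADF shows authentication keys for it; paste one of those keys into the integration runtime installer on the on-premises machine to complete the registration.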

    • Create the linked services:

    Create a linked service for the NAS drive using the File System connector, configure it to use the self-hosted integration runtime, and provide the details needed to connect to the share (host path, credentials).

    Similarly, create another linked service for your Azure Blob Storage account and configure it with the necessary details (storage account name, access key). Both are sketched below.
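
    A rough sketch of the two linked service definitions, assuming the T: drive maps to a UNC path such as \\nas-server\share; every name, path, and credential below is a placeholder to replace with your own values:

        {
            "name": "NasFileSystemLinkedService",
            "properties": {
                "type": "FileServer",
                "typeProperties": {
                    "host": "\\\\nas-server\\share",
                    "userId": "<domain>\\<user>",
                    "password": { "type": "SecureString", "value": "<password>" }
                },
                "connectVia": {
                    "referenceName": "SelfHostedIR",
                    "type": "IntegrationRuntimeReference"
                }
            }
        }

        {
            "name": "AzureBlobStorageLinkedService",
            "properties": {
                "type": "AzureBlobStorage",
                "typeProperties": {
                    "connectionString": "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.windows.net"
                }
            }
        }

    In practice the password and account key would usually be referenced from Azure Key Vault rather than stored inline.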

    • Create the datasets:
    1. Create Dataset for NAS Drive:
      • Go to the "Author" tab and create a new dataset.
      • Choose "File System" as the dataset type.
      • Configure the dataset to use the linked service created for the NAS drive.
      • Specify the file path or pattern if needed.
    2. Create Dataset for Azure Blob Storage:
      • Similarly, create another dataset for Azure Blob Storage.
      • Choose "Azure Blob Storage" as the dataset type and configure it to use the linked service for Blob Storage.
      • Specify the container and folder path if needed. Example definitions for both datasets are sketched after this list.
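
    As a non-authoritative sketch, a pair of binary datasets is enough for a straight file copy; the dataset names, the sample folder (T:\scd\in expressed relative to the share) and the container name are all placeholders:

        {
            "name": "NasBinaryDataset",
            "properties": {
                "type": "Binary",
                "linkedServiceName": {
                    "referenceName": "NasFileSystemLinkedService",
                    "type": "LinkedServiceReference"
                },
                "typeProperties": {
                    "location": { "type": "FileServerLocation", "folderPath": "scd/in" }
                }
            }
        }

        {
            "name": "BlobBinaryDataset",
            "properties": {
                "type": "Binary",
                "linkedServiceName": {
                    "referenceName": "AzureBlobStorageLinkedService",
                    "type": "LinkedServiceReference"
                },
                "typeProperties": {
                    "location": {
                        "type": "AzureBlobStorageLocation",
                        "container": "nas-archive",
                        "folderPath": "scd/in"
                    }
                }
            }
        }
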
    • Create the pipeline:
    1. Create a New Pipeline:
      • In the "Author" tab, create a new pipeline.
    2. Add Copy Activity:
      • Drag and drop the "Copy data" activity onto the pipeline canvas.
      • Configure the copy activity to use the NAS drive dataset as the source and the Azure Blob Storage dataset as the sink.
    3. Configure Source and Sink:
      • In the source tab, specify the file path or wildcard pattern to include all required files.
      • In the sink tab, specify the destination folder and file naming pattern if necessary. A minimal pipeline definition is sketched after this list.
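
    A minimal sketch of the pipeline, wiring the two example datasets above into a single Copy activity and using a wildcard on the source side as described in the steps; the names are placeholders:

        {
            "name": "CopyNasToBlobPipeline",
            "properties": {
                "activities": [
                    {
                        "name": "CopyNasFolderToBlob",
                        "type": "Copy",
                        "inputs": [ { "referenceName": "NasBinaryDataset", "type": "DatasetReference" } ],
                        "outputs": [ { "referenceName": "BlobBinaryDataset", "type": "DatasetReference" } ],
                        "typeProperties": {
                            "source": {
                                "type": "BinarySource",
                                "storeSettings": {
                                    "type": "FileServerReadSettings",
                                    "recursive": true,
                                    "wildcardFileName": "*"
                                }
                            },
                            "sink": {
                                "type": "BinarySink",
                                "storeSettings": { "type": "AzureBlobStorageWriteSettings" }
                            }
                        }
                    }
                ]
            }
        }

    Since the question mentions 51 source folders, one common pattern is to parameterize the folder path in both datasets and drive a single Copy activity from a ForEach over the folder list, rather than building 51 separate pipelines.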

    Finally, schedule and trigger the pipeline. A sample schedule trigger is sketched below.
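
    For example, a daily schedule trigger referencing the example pipeline name used above (frequency, start time, and time zone are placeholders to adjust):

        {
            "name": "DailyNasCopyTrigger",
            "properties": {
                "type": "ScheduleTrigger",
                "typeProperties": {
                    "recurrence": {
                        "frequency": "Day",
                        "interval": 1,
                        "startTime": "2024-07-20T00:00:00Z",
                        "timeZone": "UTC"
                    }
                },
                "pipelines": [
                    {
                        "pipelineReference": {
                            "referenceName": "CopyNasToBlobPipeline",
                            "type": "PipelineReference"
                        }
                    }
                ]
            }
        }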

    More links:

    https://stackoverflow.com/questions/62931931/how-to-connect-to-a-network-drive-from-azure-data-factory-pipeline

    https://video2.skills-academy.com/en-us/azure/data-factory/connector-file-system?tabs=data-factory

