ADF copy-activity from Microsoft 365 to storage account fails

Yonatan Shlain 0 Reputation points
2024-06-09T22:26:35.5233333+00:00

Hello,

I have an ongoing problem with my ADF pipeline: I’m trying to run a "copy-data" pipeline and I keep hitting access and permission errors with my resources.

  • My source is a Microsoft 365 Table connector (I’m retrieving some columns from my organization’s mail)
  • My sink is a storage account.

In my storage account’s access control (IAM) settings, I’ve given my app an appropriate role with all the necessary permissions, so it can access the storage account and write the data successfully.
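
To make the expectation concrete, here is a rough sketch (placeholder names only, not my exact setup) of the kind of data-plane write that role should allow, using the azure-identity and azure-storage-blob Python packages:

    # Rough sketch with placeholder names: a data-plane write that should succeed once
    # the identity has a blob-data write role on the account and network access to it.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    ACCOUNT_URL = "https://<storage-account-name>.blob.core.windows.net"  # placeholder
    CONTAINER = "adf-sink-test"  # placeholder container name

    credential = DefaultAzureCredential()
    service = BlobServiceClient(account_url=ACCOUNT_URL, credential=credential)

    container = service.get_container_client(CONTAINER)
    if not container.exists():
        container.create_container()

    # A successful upload confirms both the RBAC data-plane permission and reachability.
    container.upload_blob("permission-check.txt", b"test", overwrite=True)
    print("write succeeded")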

Since I don’t want my storage account to be public, I set public network access to "Enabled from selected virtual networks and IP addresses".

First try:

Since ADF is a resource instance in my Azure subscription, I’ve allowed my factory to access the storage account based on its system-assigned managed identity (Microsoft.DataFactory/factories) and configured everything properly (see https://roshan-vin4u.medium.com/authenticate-azure-data-factory-with-azure-data-lake-gen-2-using-managed-identities-3663f1449440).
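
For clarity, this is the kind of resource-instance rule I mean; a hedged sketch assuming the azure-mgmt-storage SDK, with all names and IDs as placeholders:

    # Sketch only: add a resource-instance network rule for a specific data factory so it
    # can get through the firewall while public access stays restricted. Note that
    # updating network_rule_set replaces the whole rule set, so include any existing
    # VNet/IP rules you want to keep.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.storage import StorageManagementClient
    from azure.mgmt.storage.models import (
        NetworkRuleSet, ResourceAccessRule, StorageAccountUpdateParameters,
    )

    SUBSCRIPTION_ID = "<subscription-id>"
    FACTORY_ID = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.DataFactory/factories/<factory-name>"
    )

    client = StorageManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)
    client.storage_accounts.update(
        "<resource-group>",
        "<storage-account-name>",
        StorageAccountUpdateParameters(
            network_rule_set=NetworkRuleSet(
                default_action="Deny",  # keep the firewall closed by default
                bypass="AzureServices",
                resource_access_rules=[
                    ResourceAccessRule(tenant_id="<tenant-id>", resource_id=FACTORY_ID)
                ],
            )
        ),
    )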

But when I ran the pipeline it failed, claiming that I can’t use a system-assigned managed identity with the Microsoft 365 connector.

Second try:

I tried to access my storage account over a private link using an ADF private endpoint, so I created a private endpoint and configured everything properly (see https://video2.skills-academy.com/en-us/answers/questions/635312/connect-data-factory-to-azure-storage-wiht-private).
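
For context, this is roughly how that setup looks programmatically; a hedged sketch assuming the azure-mgmt-datafactory SDK, with all names as placeholders. An ADLS Gen2 sink may need endpoints for both the blob and dfs sub-resources, and each one has to be approved on the storage account:

    # Sketch only (placeholder names): create managed private endpoints from the
    # factory's managed VNet ("default") to the storage account, one per sub-resource.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        ManagedPrivateEndpoint, ManagedPrivateEndpointResource,
    )

    SUBSCRIPTION_ID = "<subscription-id>"
    STORAGE_ID = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)
    for group_id in ("dfs", "blob"):  # ADLS Gen2 traffic can hit either endpoint
        adf.managed_private_endpoints.create_or_update(
            "<resource-group>",
            "<factory-name>",
            "default",  # the factory's managed virtual network
            f"pe-storage-{group_id}",
            ManagedPrivateEndpointResource(
                properties=ManagedPrivateEndpoint(
                    private_link_resource_id=STORAGE_ID,
                    group_id=group_id,
                )
            ),
        )
    # Each endpoint then appears as a pending private endpoint connection on the
    # storage account and must be approved there before the runtime can use it.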

First I configured the service endpoint to be the storage account’s dfs URL, ran the pipeline, and got this error:

"ErrorCode=UserErrorOffice365DataLoaderError,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Office365 data loading failed to execute. office365LoadErrorType: PermanentError ...Failure happened on 'Sink' side. ErrorCode=AdlsGen2ForbiddenError"

Then I configured it to be the storage account’s blob URL and got the following error: "the remote server return an error: (403) .... Unable to create Azure blob container"
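
Since an ADLS Gen2 sink talks to both the blob and the dfs endpoints, a 403 on either one fails the copy. A small diagnostic sketch (placeholder names; run with a credential that maps to the same principal) that probes each endpoint separately:

    # Diagnostic sketch (placeholder names): probe the blob and dfs endpoints with the
    # same credential to see which side is rejected (403 = auth/firewall problem,
    # DNS/connection errors usually point at private-endpoint or DNS configuration).
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient
    from azure.storage.filedatalake import DataLakeServiceClient

    ACCOUNT = "<storage-account-name>"  # placeholder
    credential = DefaultAzureCredential()

    blob = BlobServiceClient(f"https://{ACCOUNT}.blob.core.windows.net", credential=credential)
    dfs = DataLakeServiceClient(f"https://{ACCOUNT}.dfs.core.windows.net", credential=credential)

    for name, probe in [
        ("blob endpoint", lambda: list(blob.list_containers())),
        ("dfs endpoint", lambda: list(dfs.list_file_systems())),
    ]:
        try:
            probe()
            print(f"{name}: reachable and authorized")
        except Exception as exc:
            print(f"{name}: failed with {exc!r}")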

Conclusions

I also tested with public network access disabled entirely and got the same results.

My conclusion is that the runtime did reach the storage account via the private endpoint but then failed (either due to some misconfiguration or some functionality problem).

The weirdest part is that when I tried a simple ADF copy-data pipeline from one storage account to another, using a private endpoint for my linked service and integration runtime, it ran successfully.

What can be the problem? How do I solve this issue?

Thank you!

2 answers

  1. Harishga 5,425 Reputation points Microsoft Vendor
    2024-06-10T12:58:24.2666667+00:00

    Hi @Yonatan Shlain
    Welcome to Microsoft Q&A platform and thanks for posting your question here.

    It seems you're facing a complex issue with Azure Data Factory when trying to use a "copy-data" pipeline with a Microsoft 365 Table connector as the source and a storage account as the sink.

    Step 1: Understanding the Error with System-Assigned Managed Identity

    Your first attempt was to use a system-assigned managed identity for ADF to access the storage account. However, the pipeline failed because the Microsoft 365 connector does not support system-assigned managed identities.

    Step 2: Attempting Access with Private Link and ADF Private Endpoint

    You then tried to create a private endpoint for ADF to access the storage account. This approach is generally correct and should work if configured properly.

    Step 3: Analyzing the Errors Received

    The first error with the dfs endpoint suggests a permission issue on the Data Lake Storage Gen2 side, which could be related to the role assignment or the configuration of the private endpoint.

    The second error with the blob endpoint indicates a forbidden error, which again points to a possible misconfiguration in permissions or the private endpoint setup.

    Step 4: Verifying Permissions and Configuration

    Ensure that the Managed Identity of ADF has been granted the "Storage Blob Data Contributor" role on the storage account.

    Double-check the private endpoint configuration for both the dfs and blob services of the storage account to ensure they are correctly linked to ADF.
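
    If it helps, here is a hedged sketch of granting that role to the factory's managed identity programmatically (placeholder names and IDs, azure-mgmt-authorization SDK assumed; the exact parameter shape can vary by SDK version):

        # Sketch only: assign "Storage Blob Data Contributor" to the factory's managed
        # identity at the storage-account scope. Verify the built-in role GUID for your
        # cloud; all other names/IDs are placeholders.
        import uuid

        from azure.identity import DefaultAzureCredential
        from azure.mgmt.authorization import AuthorizationManagementClient
        from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

        SUBSCRIPTION_ID = "<subscription-id>"
        SCOPE = (
            f"/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/<resource-group>"
            "/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"
        )
        ROLE_DEFINITION_ID = (  # built-in "Storage Blob Data Contributor"
            f"/subscriptions/{SUBSCRIPTION_ID}/providers/Microsoft.Authorization"
            "/roleDefinitions/ba92f5b4-2d11-453d-a403-e96b0029c9fe"
        )

        client = AuthorizationManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)
        client.role_assignments.create(
            scope=SCOPE,
            role_assignment_name=str(uuid.uuid4()),  # assignment names are GUIDs
            parameters=RoleAssignmentCreateParameters(
                role_definition_id=ROLE_DEFINITION_ID,
                principal_id="<object-id-of-the-factory-managed-identity>",
                principal_type="ServicePrincipal",  # managed identities are service principals
            ),
        )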

    Step 5: Testing with Public Network Access Disabled

    Since the same errors occurred with public network access disabled, it confirms that the issue lies within the private network configuration or permission setup.

    Step 6: Comparing with a Working Scenario

    You mentioned that a simple ADF copy-data pipeline from one storage account to another worked successfully. This suggests that the issue is specific to the combination of Microsoft 365 connector and the storage account.

    Step 7: Possible Solutions

    Consider using a user-assigned managed identity instead of a system-assigned one, as it offers more flexibility and is often recommended for complex scenarios.

    Review the documentation for the Microsoft 365 Table connector and ensure all prerequisites and configurations are met.
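
    For the user-assigned managed identity option on the storage side, the sink linked service would reference an ADF credential that wraps the identity. A hedged sketch of what that definition looks like (placeholder names; the credential must already exist in the factory and the identity must hold the storage role):

        # Sketch only: shape of an ADLS Gen2 (AzureBlobFS) linked-service definition that
        # authenticates through an ADF credential wrapping a user-assigned managed identity.
        # All names are placeholders.
        import json

        linked_service = {
            "name": "AzureDataLakeStorageGen2LinkedService",
            "properties": {
                "type": "AzureBlobFS",
                "typeProperties": {
                    "url": "https://<storage-account-name>.dfs.core.windows.net",
                    "credential": {
                        "referenceName": "<user-assigned-identity-credential>",
                        "type": "CredentialReference",
                    },
                },
                # Keep the activity on the managed-VNet integration runtime so any
                # managed private endpoints are actually used.
                "connectVia": {
                    "referenceName": "<managed-vnet-integration-runtime>",
                    "type": "IntegrationRuntimeReference",
                },
            },
        }

        print(json.dumps(linked_service, indent=2))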

    Reference:

    https://video2.skills-academy.com/en-us/azure/data-factory/data-factory-service-identity#overview
    https://video2.skills-academy.com/en-us/azure/data-factory/connector-dynamics-crm-office-365?tabs=data-factory#supported-capabilities

    Hope this helps. Do let us know if you have any further queries.

  2. Alex Arulswamy 0 Reputation points Microsoft Employee
    2024-07-01T18:10:04.5866667+00:00

    We are facing a similar issue: in my case, an ADF copy activity copies from one SQL instance to another using different integration runtimes (IRs). One IR is self-hosted and the other is a managed VNet IR. In my case I am using a user-assigned managed identity, so the proposed solution is not working.

    I wanted to check whether this is a product issue or a limitation of the copy-data activity in ADF. I can share more details if needed.
