Synapse notebook runs manually but not through the pipeline in Synapse (with Key Vault access policies defined for both my account and the Synapse workspace)

Abhinandan Shrestha 20 Reputation points
2023-12-29T08:24:08.0466667+00:00
import json
from google.oauth2.service_account import Credentials
from google.analytics.data_v1beta import BetaAnalyticsDataClient

# Load the service-account JSON from Azure Key Vault
creds = mssparkutils.credentials.getSecret('fmd-prod-db-readonly', 'Analytics-JSON')

# Build Google Analytics credentials from the JSON secret
credentials = Credentials.from_service_account_info(json.loads(creds))
client = BetaAnalyticsDataClient(credentials=credentials)

I'm using a Synapse notebook to access a Key Vault secret. It works when I run all the cells manually in the notebook, but it fails when the notebook runs as part of a pipeline.

I've granted Get and List secret permissions to both the Synapse workspace and my account, but the pipeline throws this error message:

{
    "errorCode": "6002",
    "message": "---------------------------------------------------------------------------\nPy4JJavaError                             Traceback (most recent call last)\nCell In[7], line 19\n     16 property_id = \"284920073\"\n     18 # load secret from key vault\n---> 19 creds = mssparkutils.credentials.getSecret('fmd-prod-db-readonly','Analytics-JSON')\n     21 #Define credentials from json\n     22 credentials = Credentials.from_service_account_info(json.loads(creds))\n\nFile ~/cluster-env/env/lib/python3.10/site-packages/notebookutils/mssparkutils/credentials.py:21, in getSecret(akvName, secret, linkedService)\n     19 def getSecret(akvName, secret, linkedService=''):\n     20     if linkedService == '':\n---> 21         return creds.getSecret(akvName, secret)\n     22     else:\n     23         return creds.getSecret(akvName, secret, linkedService)\n\nFile ~/cluster-env/env/lib/python3.10/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)\n   1316 command = proto.CALL_COMMAND_NAME +\\\n   1317     self.command_header +\\\n   1318     args_command +\\\n   1319     proto.END_COMMAND_PART\n   1321 answer = self.gateway_client.send_command(command)\n-> 1322 return_value = get_return_value(\n   1323     answer, self.gateway_client, self.target_id, self.name)\n   1325 for temp_arg in temp_args:\n   1326     if hasattr(temp_arg, \"_detach\"):\n\nFile /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:169, in capture_sql_exception.

1 answer

  1. Bhargava-MSFT 30,816 Reputation points Microsoft Employee
    2023-12-29T20:40:33.5166667+00:00

    Hello Abhinandan Shrestha,

    When you run the notebook manually, your Azure Active Directory permissions are used. When the notebook runs via a pipeline, the Synapse workspace managed identity's permissions are used instead.
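    As a quick check, you can look up the object ID of that managed identity with the Azure CLI; the workspace and resource group names below are placeholders:

    ```shell
    # Print the object (principal) ID of the Synapse workspace managed identity.
    # <workspace-name> and <resource-group> are placeholders.
    az synapse workspace show \
      --name <workspace-name> \
      --resource-group <resource-group> \
      --query identity.principalId \
      --output tsv
    ```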

    To access secrets from Azure Key Vault via the pipeline, make sure the Synapse workspace managed identity has the necessary permissions (Get and List on secrets) on the Key Vault.
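    For a Key Vault that uses access policies, one way to grant those permissions is with the Azure CLI; the vault name is taken from your snippet, and the object ID placeholder is the workspace managed identity's:

    ```shell
    # Grant the workspace managed identity Get/List permissions on secrets.
    # <workspace-managed-identity-object-id> is a placeholder.
    az keyvault set-policy \
      --name fmd-prod-db-readonly \
      --object-id <workspace-managed-identity-object-id> \
      --secret-permissions get list
    ```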

    Similarly, to interact with a storage blob, granting the Synapse workspace managed identity the Storage Blob Data Contributor role on the respective Azure Storage account can resolve access issues.
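    That role assignment can be sketched with the Azure CLI as well; the subscription, resource group, and storage account names below are placeholders:

    ```shell
    # Assign Storage Blob Data Contributor to the workspace managed identity
    # at the storage-account scope. All names/IDs are placeholders.
    az role assignment create \
      --assignee <workspace-managed-identity-object-id> \
      --role "Storage Blob Data Contributor" \
      --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>"
    ```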

    I hope this helps. Please let me know if you have any further questions.

