Synapse notebook runs manually but not through the pipeline in Synapse (with Key Vault access policies defined for both my account and the Synapse workspace)

Abhinandan Shrestha 20 Reputation points
2023-12-29T08:24:08.0466667+00:00
import json
from google.oauth2.service_account import Credentials
from google.analytics.data_v1beta import BetaAnalyticsDataClient

# Load the service-account JSON from Azure Key Vault
creds = mssparkutils.credentials.getSecret('fmd-prod-db-readonly', 'Analytics-JSON')

# Build Google Analytics credentials from the JSON secret
credentials = Credentials.from_service_account_info(json.loads(creds))
client = BetaAnalyticsDataClient(credentials=credentials)

I'm using a Synapse notebook to access a Key Vault secret. It works when I run all the cells manually in the notebook, but it fails when the notebook runs as part of a pipeline.

I've granted Get and List secret permissions to both the Synapse workspace and my account, but the pipeline throws this error message:

{
    "errorCode": "6002",
    "message": "---------------------------------------------------------------------------\nPy4JJavaError                             Traceback (most recent call last)\nCell In[7], line 19\n     16 property_id = \"284920073\"\n     18 # load secret from key vault\n---> 19 creds = mssparkutils.credentials.getSecret('fmd-prod-db-readonly','Analytics-JSON')\n     21 #Define credentials from json\n     22 credentials = Credentials.from_service_account_info(json.loads(creds))\n\nFile ~/cluster-env/env/lib/python3.10/site-packages/notebookutils/mssparkutils/credentials.py:21, in getSecret(akvName, secret, linkedService)\n     19 def getSecret(akvName, secret, linkedService=''):\n     20     if linkedService == '':\n---> 21         return creds.getSecret(akvName, secret)\n     22     else:\n     23         return creds.getSecret(akvName, secret, linkedService)\n\nFile ~/cluster-env/env/lib/python3.10/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)\n   1316 command = proto.CALL_COMMAND_NAME +\\\n   1317     self.command_header +\\\n   1318     args_command +\\\n   1319     proto.END_COMMAND_PART\n   1321 answer = self.gateway_client.send_command(command)\n-> 1322 return_value = get_return_value(\n   1323     answer, self.gateway_client, self.target_id, self.name)\n   1325 for temp_arg in temp_args:\n   1326     if hasattr(temp_arg, \"_detach\"):\n\nFile /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:169, in capture_sql_exception.

1 answer

  1. Bhargava-MSFT 30,816 Reputation points Microsoft Employee
    2023-12-29T20:40:33.5166667+00:00

    Hello Abhinandan Shrestha,

    When you run the notebook manually, your Azure Active Directory permissions are used. When the notebook runs via a pipeline, the Synapse workspace managed identity's permissions are used instead.
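    As a quick check, you can look up the object ID of that managed identity with the Azure CLI; the workspace and resource group names below are placeholders:

    ```shell
    # Print the object (principal) ID of the Synapse workspace managed identity.
    # <workspace-name> and <resource-group> are placeholders.
    az synapse workspace show \
      --name <workspace-name> \
      --resource-group <resource-group> \
      --query identity.principalId \
      --output tsv
    ```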

    To access secrets from Azure Key Vault via the pipeline, make sure the Synapse workspace managed identity has the necessary permissions (Get and List on secrets) on the Key Vault.
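    For a Key Vault that uses access policies, one way to grant those permissions is with the Azure CLI; the vault name is taken from your snippet, and the object ID placeholder is the workspace managed identity's:

    ```shell
    # Grant the workspace managed identity Get/List permissions on secrets.
    # <workspace-managed-identity-object-id> is a placeholder.
    az keyvault set-policy \
      --name fmd-prod-db-readonly \
      --object-id <workspace-managed-identity-object-id> \
      --secret-permissions get list
    ```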

    Similarly, to interact with a storage blob, granting the Synapse workspace managed identity the Storage Blob Data Contributor role on the respective Azure Storage account can resolve access issues.
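    That role assignment can be sketched with the Azure CLI as well; the subscription, resource group, and storage account names below are placeholders:

    ```shell
    # Assign Storage Blob Data Contributor to the workspace managed identity
    # at the storage-account scope. All names/IDs are placeholders.
    az role assignment create \
      --assignee <workspace-managed-identity-object-id> \
      --role "Storage Blob Data Contributor" \
      --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>"
    ```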

    I hope this helps. Please let me know if you have any further questions.

