Accessing Azure Synapse Dedicated SQL Pool Using PySpark Without Exposing Passwords

Vandrangi Charaz Kumar 40 Reputation points
2024-06-14T08:38:35.8933333+00:00

I want to create a notebook for processing data in a table from the database using PySpark, to be used later in a Databricks pipeline.

I'm unsure how to read data from the existing database in the notebook without including my password. Here is my current method, which I don't want to use due to security concerns:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("SynapseAnalyticsDemo").getOrCreate()

# Set up Synapse Analytics credentials
synapse_servername = "<synapse_servername>.sql.azuresynapse.net"
synapse_database = "<synapse_database>"
synapse_username = "<synapse_username>"
synapse_password = "<synapse_password>"
synapse_jdbc_url = f"jdbc:sqlserver://{synapse_servername}:1433;database={synapse_database};user={synapse_username};password={synapse_password}"

# Define the SQL query and table name
table_name = "<sql_pool_table>"
query = f"SELECT * FROM {table_name} WHERE some_column = 'some_value'"

I've also tried this approach without success:

%%pyspark
df = spark.sql("SELECT * FROM TableName")
df.show()

In this second method, I used the exact table name from SQL Server Management Studio and tried the Azure Synapse dedicated SQL pool name, but neither worked.

How can I access the Azure Synapse dedicated SQL pool using a PySpark notebook without exposing my password?

Azure Synapse Analytics

Accepted answer
  1. Smaran Thoomu 15,765 Reputation points Microsoft Vendor
    2024-06-14T10:06:20.35+00:00

    Hi @Vandrangi Charaz Kumar

    Thanks for the question and for using the MS Q&A platform.

    To access data in the dedicated SQL pool from an Azure Synapse notebook without exposing your password, store the database password in Azure Key Vault: create a Key Vault, then create a secret in it that holds the password.

    Retrieve the secret value with the code below:

    from notebookutils import mssparkutils
    mssparkutils.credentials.getSecret('AzureKeyvaultName', 'secretName')
    
    
    

    Access the dedicated SQL pool database with the code below:

    from notebookutils import mssparkutils

    # Connection details for the dedicated SQL pool
    jdbcHostname = "synpool.database.windows.net"
    jdbcPort = 1433
    jdbcDatabase = "dedsql"
    jdbcUsername = "sqladminuser"
    # Fetch the password from Key Vault at run time instead of hard-coding it
    jdbcPassword = mssparkutils.credentials.getSecret('sqlservercred', 'sqlpassword')
    jdbcDriver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    table = "students"

    # The URL carries no credentials; user and password are passed as separate options
    jdbcUrl = f"jdbc:sqlserver://{jdbcHostname}:{jdbcPort};databaseName={jdbcDatabase}"

    df1 = (
        spark.read.format("jdbc")
        .option("driver", jdbcDriver)
        .option("url", jdbcUrl)
        .option("dbtable", table)
        .option("user", jdbcUsername)
        .option("password", jdbcPassword)
        .load()
    )
    df1.show()
    
    
    


    Check this for adding access policies.
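
Since the question mentions running the notebook later in a Databricks pipeline: `mssparkutils` is provided by Synapse notebooks, while Databricks notebooks read secrets through `dbutils.secrets.get`. One way to keep the same read logic in both places is to pass the secret lookup in as a function. The sketch below is only an illustration under that assumption; `build_jdbc_options` and `get_password` are hypothetical names, not part of either platform, and the scope and key names in the comments are placeholders. It also accepts a `query` instead of a table, since the Spark JDBC source supports either option.

```python
from typing import Callable, Dict, Optional


def build_jdbc_options(
    host: str,
    database: str,
    user: str,
    get_password: Callable[[], str],
    table: Optional[str] = None,
    query: Optional[str] = None,
    port: int = 1433,
) -> Dict[str, str]:
    """Return Spark JDBC reader options with the password kept out of the URL.

    Hypothetical helper for illustration; not a Synapse or Databricks API.
    """
    # Spark's JDBC source accepts either "dbtable" or "query", not both.
    if (table is None) == (query is None):
        raise ValueError("Pass exactly one of table or query")

    options = {
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        # The URL holds only host, port, and database, so no credentials leak into it.
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "user": user,
        # The secret is looked up when this function runs, never stored in the notebook.
        "password": get_password(),
    }
    if table is not None:
        options["dbtable"] = table
    else:
        options["query"] = query
    return options


# Synapse notebook (assumes mssparkutils is available there):
#   opts = build_jdbc_options(
#       "synpool.database.windows.net", "dedsql", "sqladminuser",
#       get_password=lambda: mssparkutils.credentials.getSecret("sqlservercred", "sqlpassword"),
#       table="students",
#   )
# Databricks notebook (assumes a secret scope exists; names are placeholders):
#   get_password=lambda: dbutils.secrets.get(scope="<scope>", key="<key>")
# Then, in either environment:
#   df = spark.read.format("jdbc").options(**opts).load()
```

Only the `get_password` argument changes between the two environments; the connection details and the read call stay the same.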

    Hope this helps. Do let us know if you have any further queries.

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".

    1 person found this answer helpful.