Connecting an on-premises SQL database to a Spark Notebook in an Azure Synapse workspace using private endpoints involves several steps. Here's a detailed guide to help you troubleshoot and verify your setup:
Prerequisites
- Azure Synapse Workspace: Ensure you have an Azure Synapse workspace.
- On-Premises SQL Database: Ensure your on-premises SQL database is accessible via a private network.
- Private Endpoints: Ensure you have private endpoints for both the SQL Server and Synapse workspace.
- VNet Peering: Ensure VNet peering is configured correctly.
- DNS Configuration: Ensure DNS is configured to resolve the private endpoints.
Steps to Connect On-Premises SQL DB through Spark Notebook in Synapse Workspace
1. Create Private Endpoints
- Private Endpoint for SQL Server:
- Go to the Azure portal.
- Navigate to your SQL Server.
- Create a private endpoint for the SQL Server in the VNet where your Synapse workspace is located.
- Private Endpoint for Synapse Workspace:
- Go to the Azure portal.
- Navigate to your Synapse workspace.
- Create a private endpoint for the Synapse workspace in the VNet where your SQL Server is located.
2. Configure VNet Peering
- VNet Peering:
- Ensure that the VNet where the Synapse workspace is located is peered with the VNet where the SQL Server is located.
- Go to the Azure portal.
- Navigate to Virtual Networks.
- Configure VNet peering for both VNets.
3. Configure DNS
- DNS Configuration:
- Ensure that the DNS is configured to resolve the private IP addresses of the private endpoints.
- You can use Azure Private DNS zones or your own DNS server to achieve this.
- Add DNS records for the private endpoints in your DNS server.
4. Verify Network Security Groups (NSGs)
- NSGs Configuration:
- Ensure that the Network Security Groups (NSGs) associated with your VNets/subnets allow traffic between the Synapse workspace and the SQL Server.
- Verify that the necessary ports (e.g., 1433 for SQL Server) are open.
5. Install JDBC Driver in Spark Notebook
- Install JDBC Driver:
- Ensure that the JDBC driver for SQL Server is installed in your Spark environment.
// Install SQL Server JDBC driver
%%configure -f
{
"conf": {
"spark.jars.packages": "com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8"
}
}
6. Connect to SQL Server from Spark Notebook
- Connecting to SQL Server:
- Use the following code to connect to the SQL Server from your Spark Notebook.
import java.sql.{Connection, DriverManager, ResultSet}
// Define the JDBC URL
val jdbcUrl = "jdbc:sqlserver://<private-endpoint-ip>:1433;databaseName=<database-name>"
// Define the connection properties
val connectionProperties = new java.util.Properties()
connectionProperties.put("user", "<username>")
connectionProperties.put("password", "<password>")
// Establish the connection
val connection = DriverManager.getConnection(jdbcUrl, connectionProperties)
// Execute a query
val statement = connection.createStatement()
val resultSet = statement.executeQuery("SELECT * FROM <table-name>")
// Process the result set
while (resultSet.next()) {
println(resultSet.getString("<column-name>"))
}
// Close the connection
connection.close()
Replace <private-endpoint-ip>
, <database-name>
, <username>
, <password>
, <table-name>
, and <column-name>
with appropriate values.
Troubleshooting Tips
- Check Connectivity:
- Use tools like
telnet
ornc
to verify connectivity to the private endpoint IP and port from the Synapse workspace.
- Use tools like
- Network Security Groups:
- Ensure that NSGs allow outbound traffic from Synapse workspace to SQL Server and inbound traffic from SQL Server to Synapse workspace.
- DNS Resolution:
- Use
nslookup
ordig
to verify that the private endpoint DNS resolves to the correct private IP address.
- Use
- Logs and Monitoring:
- Check Azure Monitor and SQL Server logs for any connectivity issues or errors.
- Firewall Rules:
- Ensure that the firewall rules on the SQL Server allow connections from the IP range of the Synapse workspace.
By following these steps and verifying each configuration, you should be able to connect your on-premises SQL database to a Spark Notebook in Azure Synapse workspace using private endpoints.