Connecting Azure Database for PostgreSQL and Data Factory

Laura Balbi 0 Reputation points
2024-06-10T11:31:09.75+00:00

Hello,

I'd want to connect my Azure Database for PostgreSQL (from now on referred as DB) to use it as a dataset in Data Factory (from now on referred as ADF).

My DB is in a private network in a resource group (rg_1)
my ADF is public and is in a resource griuo (rg_2)

I played around a bit with networks and endpoints but I'm always getting this error

The value of the property '' is invalid: 'The remote name could not be resolved: 'test-laura-private.postgres.database.azure.com''.

Is not clear to me how do i have to set up this virtual network to avoid this mistake

Thanks in advance

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,992 questions
Azure Database for PostgreSQL
{count} votes

1 answer

Sort by: Most helpful
  1. Amira Bedhiafi 18,341 Reputation points
    2024-06-10T13:27:22.5166667+00:00

    You need to ensure proper network configuration to connect your Azure Database for PostgreSQL (DB) to ADF while the database is in a private network and the ADF is public.

    Follow these steps :

    1. Create a Private Endpoint for PostgreSQL:
      • Navigate to your PostgreSQL instance in the Azure portal.
      • Select "Private endpoint connections" under the "Settings" section.
      • Click on "Add" to create a new private endpoint.
      • Follow the wizard to select your virtual network and subnet.
    2. Enable Managed Virtual Network Integration in ADF:
      • Go to your ADF instance in the Azure portal.
      • Navigate to the "Manage" tab and select "Managed private endpoints."
      • Create a new managed private endpoint, selecting your PostgreSQL private endpoint.
    3. Configure Network Security Groups (NSGs):
      • Go to the "Networking" section of your virtual network and subnets.
      • Edit the NSG rules to allow traffic on port 5432.
    4. Set Up Private DNS Zone:
      • In the Azure portal, navigate to "Private DNS zones."
      • Create a new private DNS zone if you don't already have one.
      • Add an A record for your PostgreSQL database with the private endpoint's IP address.
      • Link this DNS zone to your virtual network.
    5. Configure ADF Linked Service:
      • In ADF, create a new linked service for PostgreSQL.
      • Use the DNS name or private IP address of the PostgreSQL private endpoint.
      • Provide the necessary authentication details.

    More links :

    https://docs.microsoft.com/en-us/azure/postgresql/howto-create-private-endpoint

    https://docs.microsoft.com/en-us/azure/data-factory/managed-virtual-network-private-endpoint

    https://docs.microsoft.com/en-us/azure/virtual-network/network-security-groups-overview

    https://docs.microsoft.com/en-us/azure/virtual-network/manage-network-security-group

    https://docs.microsoft.com/en-us/azure/dns/private-dns-overview

    https://docs.microsoft.com/en-us/azure/dns/private-dns-getstarted-portal

    https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-database-for-postgresql


    UPDATE :

    Maybe you need to try a self-hosted integration runtime which involves installing the IR on a VM or an on-premises machine that has access to your private network where the PostgreSQL database is hosted so you can securely connect to your database.

    Once configured, you can create a linked service in ADF using the self-hosted IR, providing the internal IP address or DNS name of your PostgreSQL database for seamless data integration.

    Another thing that comes to my mind is that you may need to deploy a jump box within your virtual network to provide a secure intermediary with access to your private PostgreSQL database and the internet. In this case, ADF can connect to the PostgreSQL database through this proxy.
    You can configure your linked service in ADF to route requests via the jump box, ensuring secure and efficient data transfers.