I am working on an ADF pipeline. One of its steps runs a Python script that connects to an external SFTP server, downloads some files, and uploads them to my Storage Account.
The SFTP owner asked me to share the IP address that they should add to their firewall exceptions so the code can connect to the SFTP server.
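For context, the SFTP step itself is essentially the sketch below. paramiko and azure-storage-blob are the libraries I plan to use; the helper names, host, credentials, and paths are placeholders:

```python
import posixpath


def remote_to_blob_name(remote_path: str) -> str:
    """Map an SFTP file path to a flat blob name (placeholder naming scheme)."""
    return posixpath.basename(remote_path)


def transfer_sftp_to_blob(host, username, password, remote_dir, conn_str, container):
    """Download every file in remote_dir over SFTP and upload each one as a blob.

    paramiko and azure-storage-blob are third-party dependencies, imported
    lazily so the rest of the module loads without them.
    """
    import paramiko
    from azure.storage.blob import BlobClient

    transport = paramiko.Transport((host, 22))
    transport.connect(username=username, password=password)
    sftp = paramiko.SFTPClient.from_transport(transport)
    try:
        for name in sftp.listdir(remote_dir):
            remote_path = posixpath.join(remote_dir, name)
            # Read the remote file into memory (files are small in my case)
            with sftp.open(remote_path, "rb") as fh:
                data = fh.read()
            blob = BlobClient.from_connection_string(
                conn_str=conn_str,
                container_name=container,
                blob_name=remote_to_blob_name(remote_path),
            )
            blob.upload_blob(data, overwrite=True)
    finally:
        sftp.close()
        transport.close()
```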
My current setup:
1. A Virtual Network.
2. A Public IP Address with a static IP and a DNS Name Label.
3. A Pool in the Azure Batch service. In the Network Configuration step I used the VNet from 1 with the default subnet, the user-managed IP address provisioning type, and assigned the Public IP ID from 2.
4. I can now see that the public IP of the node that was created is the same as the one from 2.
Should this work? As a POC of this path, I tried copying a file from one container of the Storage Account to another:
- It works with Public network access enabled in the Networking blade.
- It works with "Enabled from selected virtual networks and IP addresses" selected and my Virtual Network added.
- It works with "Enabled from selected virtual networks and IP addresses" selected and my local IP added to the whitelist, when I run the script locally.
- It DOES NOT WORK with "Enabled from selected virtual networks and IP addresses" selected and the node's public IP (from 2/4) added to the whitelist. What might be the reason for that, and how can I work around it?
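One sanity check I added while debugging, using only the stdlib `ipaddress` module: Storage account firewall IP rules only accept public internet addresses, so if the node reaches Storage over a private source address, the whitelisted public IP would never match. The helper name is mine and the addresses are made up:

```python
import ipaddress


def can_match_storage_ip_rule(addr: str) -> bool:
    """Storage firewall IP rules only accept public addresses, so a
    private/internal source address can never match a whitelist entry."""
    return not ipaddress.ip_address(addr).is_private


# A public IP like the one assigned to the Batch node:
print(can_match_storage_ip_rule("20.50.100.200"))  # public -> True
# A VNet-internal address the node might use instead:
print(can_match_storage_ip_rule("10.0.0.4"))       # private -> False
```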
The code is pretty simple:
from azure.storage.blob import BlobClient
import pandas as pd
from io import BytesIO
import requests
# Print current IP address
response = requests.get('https://api.ipify.org?format=json')
ip_address = response.json()['ip']
print(f'Current IP Address: {ip_address}')
# Define parameters
connectionString = "connectionstring"
inputContainerName = "input"
inputBlobName = "iris.csv"
outputContainerName = "output"
outputBlobName = "iris_setosa.csv"
# Establish connection with the blob storage account for input container
input_blob = BlobClient.from_connection_string(conn_str=connectionString, container_name=inputContainerName, blob_name=inputBlobName)
# Download the blob as a stream
input_stream = input_blob.download_blob()
df = pd.read_csv(BytesIO(input_stream.readall()))
# Take a subset of the records
df = df[df['Species'] == "setosa"]
# Save the subset of the iris dataframe locally in memory
output_stream = BytesIO()
df.to_csv(output_stream, index=False)
output_stream.seek(0) # Reset the stream position to the beginning
# Establish connection with the blob storage account for output container
output_blob = BlobClient.from_connection_string(conn_str=connectionString, container_name=outputContainerName, blob_name=outputBlobName)
# Upload the stream to the output container
output_blob.upload_blob(output_stream, overwrite=True)