Synapse Analytics - Cluster does not use requirements.txt file

Rory Tarnow-Mordi 1 Reputation point
2020-08-27T11:21:21.78+00:00

I've been trying to include some additional Python packages in a Spark cluster, but when I upload my requirements.txt file (using any of the three methods described here) and restart the cluster, the packages are not available, either when I try to import <package> or when I list the installed packages with:

import pip  # needed to use the pip functions
for i in pip.get_installed_distributions(local_only=True):
    print(i)
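Note that pip.get_installed_distributions was removed in pip 10, so the snippet above only works on older pip versions. A sketch of an equivalent check that uses only the standard library (Python 3.8+), which avoids pip internals entirely:

```python
# List installed distributions without relying on pip internals
# (pip.get_installed_distributions was removed in pip 10).
from importlib import metadata

installed = sorted({d.metadata["Name"] for d in metadata.distributions()
                    if d.metadata["Name"]})
for name in installed:
    print(name)
```

Running this in a notebook cell on the Spark pool shows whether the packages from requirements.txt actually made it onto the nodes.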

The requirements.txt is attached.

20836-requirements.txt

Azure Synapse Analytics

1 answer

  1. HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
    2020-08-31T21:24:34.163+00:00

    Hello @Rory Tarnow-Mordi ,

Can you please try the steps below and let me know if that works.

    • Be sure auto pause is ON (whatever the time value)
• Re-scale your pool to a new size (for example, from a 3-to-6-node autoscale range to a 3-to-8-node range).
    • Check the “Force new settings” and apply.

This forces your notebook to acquire new node instances from the pool, onto which your modules will be loaded.
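As an alternative to the portal steps above, the requirements file can also be applied to the pool from the Azure CLI, which likewise triggers the pool to pick up the new settings. A hedged sketch; the workspace, pool, and resource-group names below are placeholders, and the synapse CLI extension must be installed first:

```shell
# Sketch: apply a requirements.txt to a Synapse Spark pool via the Azure CLI.
# Placeholder names: myworkspace, mypool, myresourcegroup.
az extension add --name synapse

az synapse spark pool update \
    --workspace-name myworkspace \
    --name mypool \
    --resource-group myresourcegroup \
    --library-requirements requirements.txt
```

After the update completes, sessions started on new nodes should see the packages listed in requirements.txt.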

    Thanks
    Himanshu