How to fix the "The specified path already exists" error raised by Databricks Delta Live Tables pipeline runs

Slim MISSAOUI 10 Reputation points
2024-08-28T09:26:38.4266667+00:00

Hello,

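I have several DLT pipelines that fail with the exception "The specified path already exists". The exception points to a problem with the internal checkpoint files of the DLT streaming tables: the failing request is a PUT that creates a directory under `_dlt_metadata/checkpoints`, and the storage service rejects it with HTTP 409.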

org.apache.hadoop.fs.FileAlreadyExistsException: Operation failed: "The specified path already exists.", 409, PUT, https://xxx.dfs.core.windows.net/managed/__unitystorage/schemas/43a-ae09-4e97-a2eb-324fd2e84f2e/tables/ee9d67c-2043-4250-af53-c1951aad6/_dlt_metadata/checkpoints/configuration__ids/24?resource=directory&timeout=90&st=2024-08-28T07:27:49Z&sv=2020-02-10&ske=2024-08-28T09:27:49Z&sig=XXXXX&sktid=0652c929-6106-451-ba96-0ebb59a37670&se=2024-08-28T08:47:09Z&sdd=5&skoid=5e4b8638-3c04-45fXXXXXXXXXXXXXXXXX&spr=https&sks=b&skt=2024-08-28T07:27:49Z&sp=racwdm&skv=2021-08-06&sr=d, PathAlreadyExists, "The specified path already exists.

The problem is also intermittent: it does not always occur at the same stage of the workflows, and it affects all three of my Databricks environments.
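
Since the failures move around, it may help to check whether they cluster on particular tables or checkpoints. Below is a minimal sketch for splitting the exception message into its path components. The function and field names (`parse_checkpoint_conflict`, `count_by_table`, `schema_id`, `table_id`, `checkpoint_name`, `last_segment`) are hypothetical and not part of any Databricks API, and the path layout is assumed only from the single message shown above.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# The URL in the message ends at the first comma or whitespace.
URL_RE = re.compile(r"https://[^\s,]+")

# Path layout assumed from the one exception above; not a documented format.
PATH_RE = re.compile(
    r"/schemas/(?P<schema_id>[^/]+)"
    r"/tables/(?P<table_id>[^/]+)"
    r"/_dlt_metadata/checkpoints/(?P<checkpoint_name>[^/]+)"
    r"/(?P<last_segment>[^/]+)$"
)


def parse_checkpoint_conflict(message):
    """Return the path components of a 'path already exists' message, or None."""
    url_match = URL_RE.search(message)
    if url_match is None:
        return None
    # urlsplit drops the SAS query string, leaving only the storage path.
    path = urlsplit(url_match.group(0)).path
    path_match = PATH_RE.search(path)
    if path_match is None:
        return None
    return path_match.groupdict()


def count_by_table(messages):
    """Count failures per table ID across several exception messages."""
    parsed = (parse_checkpoint_conflict(m) for m in messages)
    return Counter(p["table_id"] for p in parsed if p is not None)
```

Running `count_by_table` over the messages collected from each environment would show whether the same tables keep failing or whether the conflicts are spread evenly.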

Could you please help me?

Azure Databricks