sync command group

Note

This information applies to Databricks CLI versions 0.205 and above, which are in Public Preview. To find your version of the Databricks CLI, run databricks -v.

The sync command group can synchronize file changes from a local development machine only to workspace user (/Users) files in your Azure Databricks workspace. It cannot synchronize to DBFS (dbfs:/). To synchronize file changes from a local development machine to DBFS in your Azure Databricks workspace, use the dbx sync utility instead.

The sync command group within the Databricks CLI enables one-way synchronization of file changes from a local filesystem directory to a directory within a remote Azure Databricks workspace.

Note

sync commands cannot synchronize file changes in the other direction, from a directory within a remote Azure Databricks workspace back to a directory within a local filesystem.

You run sync commands by appending them to databricks sync. To display help for the sync command, run databricks sync -h.

Important

To install the Databricks CLI, see Install or update the Databricks CLI. To configure authentication for the Databricks CLI, see Authentication for the Databricks CLI.

Incrementally sync local file changes to a remote directory

To perform a single, incremental, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Azure Databricks workspace, run the sync command as follows:

databricks sync <local-directory-path> <remote-directory-path>

For example, to do a one-time, one-way, incremental synchronization of all file changes in the folder named my-folder within the local current working directory, to a specific path within the remote workspace, run the following command:

databricks sync ./my-folder/ /Users/someone@example.com/

In this example, only file changes since the last run of the sync command are synchronized to /Users/someone@example.com/. By default, the workspace URL within the caller’s DEFAULT profile is used to determine the remote workspace to sync to.
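
The following is a minimal sketch of a wrapper that checks its arguments before running the incremental sync shown above. The function name sync_once and its return codes are hypothetical and not part of the Databricks CLI. The check for dbfs:/ reflects the note at the top of this article that sync cannot target DBFS.

```shell
# Hypothetical helper; sync_once is not part of the Databricks CLI.
sync_once() {
  local src="$1" dest="$2"

  # The local source must be an existing directory.
  if [ ! -d "$src" ]; then
    echo "error: local directory not found: $src" >&2
    return 1
  fi

  # The sync command group cannot synchronize to DBFS paths.
  case "$dest" in
    dbfs:/*)
      echo "error: sync cannot target DBFS: $dest" >&2
      return 2
      ;;
  esac

  # Single, incremental, one-way sync using the DEFAULT profile.
  databricks sync "$src" "$dest"
}

# Example call (commented out so that loading this file does not start a sync):
# sync_once ./my-folder/ /Users/someone@example.com/
```

Failing early on an invalid argument avoids starting the CLI at all, which can be convenient when the wrapper is called from a larger script.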

Fully sync local file changes to a remote directory

To perform a single, full, one-way synchronization of file changes within a local filesystem directory to a directory within a remote Azure Databricks workspace, regardless of when the last sync command was run, use the --full option, for example:

databricks sync ./my-folder/ /Users/someone@example.com/ --full

Continuously sync local file changes to a remote directory

To turn on continuous, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Azure Databricks workspace, use the --watch option, for example:

databricks sync ./my-folder/ /Users/someone@example.com/ --watch

One-way synchronization continues until you stop the command from the terminal, typically by pressing Ctrl + c. On Linux and macOS shells, Ctrl + z typically suspends the command rather than ending it.

Polling for possible synchronization events happens once per second by default. To change this interval, use the --interval option followed by the number of seconds and the character s. For example, to poll every five seconds:

databricks sync ./my-folder/ /Users/someone@example.com/ --watch --interval 5s
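
When the polling interval comes from a script variable, it can help to validate it before building the value for --interval. The sketch below assumes a whole number of seconds. The function names interval_arg and watch_sync are hypothetical and not part of the Databricks CLI.

```shell
# Hypothetical helpers; neither function is part of the Databricks CLI.

# Turn a whole number of seconds into the form used above, for example 5 -> 5s.
interval_arg() {
  case "$1" in
    ''|*[!0-9]*)
      echo "error: interval must be a whole number of seconds: $1" >&2
      return 1
      ;;
  esac
  printf '%ss\n' "$1"
}

# Start a continuous sync with the given polling interval in seconds.
watch_sync() {
  local src="$1" dest="$2" interval
  interval="$(interval_arg "$3")" || return 1
  databricks sync "$src" "$dest" --watch --interval "$interval"
}

# Example call (commented out so that loading this file does not start a sync):
# watch_sync ./my-folder/ /Users/someone@example.com/ 5
```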

Change the sync progress output format

Sync progress information is output to the terminal in text format by default. To change the output format, use the --output option with either text (the default) or json, for example:

databricks sync ./my-folder/ /Users/someone@example.com/ --output json
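
JSON output can be easier to keep for later inspection than text output. The sketch below assumes that --output json writes one JSON object per line, which this article does not confirm, and it makes no assumption about the fields inside each object. The function name save_sync_events and the file name sync-events.jsonl are hypothetical.

```shell
# Hypothetical helper; save_sync_events is not part of the Databricks CLI.
# Assumption: each progress event arrives as a single line starting with "{".
save_sync_events() {
  local logfile="$1" line
  while IFS= read -r line; do
    case "$line" in
      "{"*) printf '%s\n' "$line" >> "$logfile" ;;  # looks like a JSON object: append to the log
      *)    printf '%s\n' "$line" >&2 ;;            # anything else: pass through to stderr
    esac
  done
}

# Example call (commented out so that loading this file does not start a sync):
# databricks sync ./my-folder/ /Users/someone@example.com/ --output json | save_sync_events sync-events.jsonl
```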