Authenticate access to Azure Databricks resources
To access an Azure Databricks resource with the Databricks CLI or REST APIs, clients must authenticate using an Azure Databricks account with the required authorization to access the resource. To securely run a Databricks CLI command or make a Databricks API request that requires authorized access to an account or workspace, you must provide an access token based on valid Azure Databricks account credentials. This article covers the authentication options to provide those credentials and authorize access to an Azure Databricks workspace or account.
The following table shows the authentication methods available to your Azure Databricks account.
Azure Databricks authentication methods
Because Azure Databricks tools and SDKs work with one or more supported Azure Databricks authentication methods, you can select the best authentication method for your use case. For details, see the tool or SDK documentation in Developer tools.
Method | Description | Use case |
---|---|---|
OAuth for service principals (OAuth M2M) | Short-lived OAuth tokens for service principals. | Unattended authentication scenarios, such as fully automated and CI/CD workflows. |
OAuth for users (OAuth U2M) | Short-lived OAuth tokens for users. | Attended authentication scenarios, where you use your web browser to authenticate with Azure Databricks in real time, when prompted. |
Personal access tokens (PAT) | Short-lived or long-lived tokens for users or service principals. | Scenarios where your target tool does not support OAuth. |
Azure managed identities authentication | Microsoft Entra ID tokens for Azure managed identities. | Use only with Azure resources that support managed identities, such as Azure virtual machines. |
Microsoft Entra ID service principal authentication | Microsoft Entra ID tokens for Microsoft Entra ID service principals. | Use only with Azure resources that support Microsoft Entra ID tokens and do not support managed identities, such as Azure DevOps. |
Azure CLI authentication | Microsoft Entra ID tokens for users or Microsoft Entra ID service principals. | Use to authenticate access to Azure resources and Azure Databricks using the Azure CLI. |
Microsoft Entra ID user authentication | Microsoft Entra ID tokens for users. | Use only with Azure resources that only support Microsoft Entra ID tokens. Databricks does not recommend that you create Microsoft Entra ID tokens for Azure Databricks users manually. |
What authentication approach should I choose?
You have two options to authenticate a Databricks CLI command or API call for access to your Azure Databricks resources:
- Use an Azure Databricks user account (called “user-to-machine” authentication, or U2M). Choose this only when you are running an Azure Databricks CLI command from your local client environment or making an Azure Databricks API request from code that you own and that only you run.
- Use an Azure Databricks service principal (called “machine-to-machine” authentication, or M2M). Choose this if others will be running your code (especially in the case of an app), or if you are building automation that will run Azure Databricks CLI commands or make API requests.
On Azure Databricks, you can also use a Microsoft Entra ID service principal to authenticate access to your Azure Databricks account or workspace. However, Databricks recommends that you use a Databricks service principal with Databricks-provided OAuth authentication rather than Microsoft Entra ID service principal authentication, because Databricks OAuth access tokens are more robust when you authenticate only with Azure Databricks.
For more details on using a Microsoft Entra ID service principal to access Databricks resources, see MS Entra service principal authentication.
You must also have an access token linked to the account you will use to call the Databricks API. This token can be either an OAuth 2.0 access token or a personal access token (PAT). However, Azure Databricks strongly recommends OAuth over PATs for authorization: OAuth tokens are refreshed automatically by default and do not require you to manage the access token directly, which improves your security against token hijacking and unwanted access. Because OAuth creates and manages the access token for you, you provide an OAuth token endpoint URL, a client ID, and a secret that you generate from your Azure Databricks workspace instead of providing a token string yourself. PATs carry the risk that long-lived tokens provide egress opportunities if they are not regularly audited and rotated or revoked, or if the token strings and passwords are not securely managed in your development environment.
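As a rough illustration of what "a token endpoint URL, a client ID, and a secret" means in practice, the following Python sketch builds (but does not send) an OAuth 2.0 client-credentials token request using only the standard library. The helper name `build_m2m_token_request`, the workspace host, the `/oidc/v1/token` path, and the `all-apis` scope are assumptions made for this example; confirm the token endpoint and scope for your own workspace or account before relying on them.

```python
import base64
import urllib.parse
import urllib.request


def build_m2m_token_request(
    token_endpoint: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request."""
    # Form-encoded body that asks for a token via the client-credentials grant.
    # The "all-apis" scope is an assumption for this sketch.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The client ID and secret are sent as HTTP Basic credentials.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Placeholder values for illustration only; none of these are real credentials.
request = build_m2m_token_request(
    "https://example-workspace.azuredatabricks.net/oidc/v1/token",
    "example-client-id",
    "example-client-secret",
)
```

Sending the request (for example with `urllib.request.urlopen(request)`) would return a JSON document containing the short-lived access token. Tools and SDKs that support unified client authentication perform this exchange for you, so a hand-built request like this is mainly useful for understanding the flow.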
How do I use OAuth to authenticate with Azure Databricks?
Azure Databricks provides unified client authentication to assist you with authentication by using a default set of environment variables you can set to specific credential values. This helps you work more easily and securely since these environment variables are specific to the environment that will be running the Azure Databricks CLI commands or calling Azure Databricks APIs.
- For user account (user-to-machine) authentication, Azure Databricks OAuth is handled for you with Databricks client unified authentication, as long as the tools and SDKs implement its standard. If they don’t, you can manually generate an OAuth code verifier and challenge pair to use directly in your Azure Databricks CLI commands and API requests. See Step 1: Generate an OAuth code verifier and code challenge pair.
- For service principal (machine-to-machine) authentication, Azure Databricks OAuth requires that the caller provide client credentials along with a token endpoint URL where the request can be authorized. (This is handled for you if you use Azure Databricks tools and SDKs that support Databricks unified client authentication.) The credentials include a unique client ID and client secret. The client, which is the Databricks service principal that will run your code, must be assigned to Databricks workspaces. After you assign the service principal to the workspaces it will access, you are provided with a client ID and a client secret that you will set with specific environment variables.
These environment variables are:
- DATABRICKS_HOST: This environment variable is set to the URL of either your Azure Databricks account console (https://accounts.azuredatabricks.net) or your Azure Databricks workspace (https://adb-{workspace-id}.{random-number}.azuredatabricks.net). Choose a host URL type based on the type of operations you will be performing in your code. Specifically, if you are using Azure Databricks account-level CLI commands or REST API requests, set this variable to your Azure Databricks account URL. If you are using Azure Databricks workspace-level CLI commands or REST API requests, use your Azure Databricks workspace URL.
- DATABRICKS_ACCOUNT_ID: Used for Azure Databricks account operations. This is your Azure Databricks account ID. To get it, see Locate your account ID.
- DATABRICKS_CLIENT_ID: (M2M OAuth only) The client ID you were assigned when creating your service principal.
- DATABRICKS_CLIENT_SECRET: (M2M OAuth only) The client secret you generated when creating your service principal.
You can set these directly, or through the use of a Databricks configuration profile (.databrickscfg) on your client machine.
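To make the role of each variable concrete, here is a minimal Python sketch that reads the four environment variables listed above and checks that they are consistent. The helper name `read_databricks_env`, the returned dictionary keys, and the validation rules are hypothetical choices made for this example; they are not part of any Databricks tool or SDK.

```python
import os
from typing import Dict, Mapping, Optional


def read_databricks_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect the Databricks authentication variables from an environment mapping."""
    env = os.environ if env is None else env

    host = env.get("DATABRICKS_HOST")
    if not host:
        raise ValueError("DATABRICKS_HOST is not set")
    settings = {"host": host.rstrip("/")}

    # Optional variables: copy each one only if it is present and non-empty.
    optional = (
        ("DATABRICKS_ACCOUNT_ID", "account_id"),
        ("DATABRICKS_CLIENT_ID", "client_id"),
        ("DATABRICKS_CLIENT_SECRET", "client_secret"),
    )
    for variable, key in optional:
        value = env.get(variable)
        if value:
            settings[key] = value

    # M2M OAuth needs the client ID and the client secret together.
    if ("client_id" in settings) != ("client_secret" in settings):
        raise ValueError(
            "Set both DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET, or neither"
        )
    return settings


# Example with placeholder values instead of the real process environment.
example_settings = read_databricks_env(
    {
        "DATABRICKS_HOST": "https://example-workspace.azuredatabricks.net/",
        "DATABRICKS_CLIENT_ID": "example-client-id",
        "DATABRICKS_CLIENT_SECRET": "example-client-secret",
    }
)
```

Calling `read_databricks_env()` with no argument reads from `os.environ`. Passing a mapping, as in the example, keeps secrets out of the source file during testing.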
To use an OAuth access token, your Azure Databricks workspace or account administrator must have granted your user account or service principal the CAN USE privilege for the account and workspace features your code will access.
For more details on configuring OAuth authorization for your client and to review cloud provider-specific authorization options, see Unified client authentication.
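For the user-to-machine case where a tool does not implement unified client authentication, the OAuth code verifier and code challenge pair mentioned earlier can be generated with a few lines of Python. This is a minimal sketch that assumes the standard PKCE scheme with the S256 method (a random verifier and the unpadded base64url encoding of its SHA-256 digest); the function name `generate_pkce_pair` is hypothetical.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def generate_pkce_pair() -> Tuple[str, str]:
    """Return a (code_verifier, code_challenge) pair using the S256 method."""
    # 32 random bytes encode to a 43-character URL-safe string.
    verifier = secrets.token_urlsafe(32)
    # The challenge is the SHA-256 digest of the verifier,
    # base64url-encoded with the trailing "=" padding removed.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


code_verifier, code_challenge = generate_pkce_pair()
```

The challenge is sent with the initial authorization request, and the verifier is sent later when the authorization code is exchanged for a token. Keep the verifier private until that exchange.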
Authentication for third-party services and tools
If you are writing code that accesses third-party services, tools, or SDKs, you must use the authentication and authorization mechanisms provided by the third party. However, if you must grant a third-party tool, SDK, or service access to your Azure Databricks account or workspace resources, Databricks provides the following support:
- Databricks Terraform Provider: This tool can access Azure Databricks APIs from Terraform on your behalf, using your Azure Databricks user account. For more details, see Provision a service principal by using Terraform.
- Git providers such as GitHub, GitLab, and Bitbucket can access Azure Databricks APIs using a Databricks service principal. For more details, see Service principals for CI/CD.
- Jenkins can access Azure Databricks APIs using a Databricks service principal. For more details, see CI/CD with Jenkins on Azure Databricks.
- Azure DevOps can access Azure Databricks APIs using an MS Entra service principal and ID. For more details, see Authenticate with Azure DevOps on Databricks.
Azure Databricks configuration profiles
An Azure Databricks configuration profile contains settings and other information that Azure Databricks needs to authenticate. Azure Databricks configuration profiles are stored in local client files for your tools, SDKs, scripts, and apps to use. The standard configuration profile file is named .databrickscfg. For more information, see Azure Databricks configuration profiles.
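As a rough picture of what such a file holds, the sketch below parses an INI-style profile with Python's standard `configparser` module. The profile name `AUTOMATION`, the field names `host`, `client_id`, and `client_secret`, and all values are assumptions chosen for illustration; check the configuration profile documentation for the fields that your authentication method actually requires.

```python
import configparser
from typing import Dict

# Illustrative profile text; a real file would live at ~/.databrickscfg.
EXAMPLE_PROFILE_TEXT = """
[AUTOMATION]
host = https://example-workspace.azuredatabricks.net
client_id = example-client-id
client_secret = example-client-secret
"""


def load_profile(text: str, name: str) -> Dict[str, str]:
    """Return the settings stored under one named profile."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(name):
        raise KeyError(f"No profile named {name!r}")
    # Each profile is an INI section; its options become dictionary entries.
    return dict(parser[name])


profile = load_profile(EXAMPLE_PROFILE_TEXT, "AUTOMATION")
```

Keeping credentials in a named profile rather than in code lets the same script run against different workspaces by switching the profile name.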