May 2021

These features and Azure Databricks platform improvements were released in May 2021.

Note

Releases are staged. Your Azure Databricks account may not be updated until a week or more after the initial release date.

Databricks Machine Learning: a data-native and collaborative solution for the full ML lifecycle

May 27, 2021

The new Machine Learning persona, selectable in the sidebar of the Azure Databricks UI, gives you easy access to a new purpose-built environment for ML, including the model registry and four new features in Public Preview:

  • A new dashboard page with convenient resources, recents, and getting started links.
  • A new Experiments page that centralizes experiment discovery and management.
  • AutoML, a way to automatically generate ML models from data and accelerate the path to production.
  • Feature Store, a way to catalog ML features and make them available for training and serving, increasing reuse. Feature search is based on data lineage and uses automatically logged data sources, and simplified model deployment doesn’t require changes to the client application.

For details, see AI and machine learning on Databricks.

SQL Analytics is renamed to Databricks SQL

May 27, 2021

SQL Analytics is renamed to Databricks SQL. For more details, see the Databricks SQL release notes.

Create and manage ETL pipelines using Delta Live Tables (Public Preview)

May 26, 2021

Databricks is pleased to introduce Delta Live Tables, a cloud service that makes extract, transform, and load (ETL) development simple, reliable, and scalable. Delta Live Tables:

  • Provides an intuitive and familiar declarative interface to build pipelines.
  • Enables you to monitor data processing pipelines, visualize dependencies, and manage pipelines and dependencies across different environments.
  • Enables test-driven development, enforcement of data quality constraints, and application of uniform data error handling policies.
  • Automates deployment of your data processing pipelines so you can easily upgrade, roll back, and incrementally reprocess data.

See What is Delta Live Tables? for details.

Azure Spot VMs are GA

May 24, 2021

The ability to create Azure Databricks clusters with Azure Spot Virtual Machines is now generally available. You can now get the benefit of significantly lower-cost Azure spot instances and reduce your total cost of ownership (TCO) of Azure Databricks. You can choose to use Azure spot instances when you:

Encrypt Databricks SQL queries and query history using your own key (Public Preview)

May 20, 2021

For details, see the Databricks SQL release notes.

Increased limit for the number of terminated all-purpose clusters

May 18, 2021: Version 3.46

You can now have up to 150 terminated all-purpose clusters in an Azure Databricks workspace. Previously the limit was 120. For details, see Terminate a compute. The limit on the number of terminated all-purpose clusters returned by the Clusters API request is also now 150.
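
As a rough illustration of how the new limit might be checked, the sketch below counts terminated all-purpose clusters in data shaped like the `clusters` array of a Clusters API list response. The sample records are hypothetical, and the sketch assumes that job clusters are identified by `cluster_source` set to `JOB`; verify both against the API reference before relying on it.

```python
from typing import Any, Dict, List

# Limit stated in this release note; previously 120.
MAX_TERMINATED_ALL_PURPOSE = 150


def terminated_all_purpose(clusters: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return terminated clusters that were not created by the job scheduler."""
    return [
        c for c in clusters
        if c.get("state") == "TERMINATED" and c.get("cluster_source") != "JOB"
    ]


# Hypothetical sample shaped like the "clusters" array of a list response.
sample = [
    {"cluster_id": "a", "state": "TERMINATED", "cluster_source": "UI"},
    {"cluster_id": "b", "state": "RUNNING", "cluster_source": "UI"},
    {"cluster_id": "c", "state": "TERMINATED", "cluster_source": "JOB"},
    {"cluster_id": "d", "state": "TERMINATED", "cluster_source": "API"},
]

terminated = terminated_all_purpose(sample)
remaining = MAX_TERMINATED_ALL_PURPOSE - len(terminated)
print(f"{len(terminated)} terminated all-purpose clusters, {remaining} below the limit")
```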

Increased limit for the number of pinned clusters

May 18, 2021: Version 3.46

You can now have up to 70 pinned clusters in an Azure Databricks workspace. Previously the limit was 50. For details, see Pin a compute.
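
Pinning can also be done through the Clusters API. The following is a minimal sketch that only builds the request for the `clusters/pin` endpoint and does not send it. The workspace URL, token, and cluster ID are placeholders, and `build_pin_request` is a hypothetical helper name, not part of any Databricks library.

```python
import json
from urllib.request import Request


def build_pin_request(workspace_url: str, token: str, cluster_id: str) -> Request:
    """Build (but do not send) a POST request that pins one cluster."""
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/pin"
    body = json.dumps({"cluster_id": cluster_id}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
req = build_pin_request(
    "https://adb-1234567890123456.7.azuredatabricks.net/",
    "<personal-access-token>",
    "0518-123456-abcd123",
)
print(req.get_method(), req.full_url)
```

To send the request, pass `req` to `urllib.request.urlopen`. A request that would exceed the 70-cluster limit is expected to be rejected by the service, so callers should handle HTTP errors.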

Manage where notebook results are stored (Public Preview)

May 18, 2021: Version 3.46

You can now choose to store all notebook results in your root Azure Storage instance regardless of size or run type. By default, some results for interactive notebooks are stored in Azure Databricks. A new configuration enables you to store these in the root Azure Storage instance in your own account. For details, see Configure notebook result storage location.

This feature has no impact on notebooks run as jobs, whose results are always stored in the root Azure Storage instance.

Encrypt notebook and secret data in the control plane with your own key (Public Preview)

May 10, 2021

An Azure Databricks workspace comprises a control plane that is hosted in an Azure Databricks-managed subscription and a compute plane that is deployed in your Azure subscription. The control plane stores your managed services data, which includes notebook commands, secrets, and other workspace configuration data. By default, this data is encrypted with an Azure Databricks-managed key, but you can now add a key from your Azure Key Vault instance to encrypt this data. See Enable customer-managed keys for managed services.

Databricks Runtime 7.4 series support ends

May 3, 2021

Support for Databricks Runtime 7.4, Databricks Runtime 7.4 for Machine Learning, and Databricks Runtime 7.4 for Genomics ended on May 3. See Databricks support lifecycles.

Repos users can now integrate with Azure DevOps using personal access tokens

May 3-10, 2021: Version 3.45

In addition to Microsoft Entra ID access tokens, you can now use a personal access token to authenticate with Azure DevOps. For details, see Set up Databricks Git folders (Repos).
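
If you register the token programmatically rather than in the UI, the request body might look like the sketch below. It assumes the Git credentials REST endpoint (`/api/2.0/git-credentials`) and the provider value `azureDevOpsServices`, neither of which is described in this release note; confirm both in the API reference. The helper name, user name, and environment variable are hypothetical.

```python
import os
from typing import Dict

# Assumed endpoint path; not stated in this release note.
GIT_CREDENTIALS_PATH = "/api/2.0/git-credentials"


def azure_devops_credential(username: str, pat: str) -> Dict[str, str]:
    """Build the JSON body for registering an Azure DevOps personal access token."""
    if not pat:
        raise ValueError("personal access token must not be empty")
    return {
        "git_provider": "azureDevOpsServices",  # assumed provider identifier
        "git_username": username,
        "personal_access_token": pat,
    }


# Read the token from a hypothetical environment variable, falling back to a
# placeholder, so that a real secret never appears in source code.
token = os.environ.get("AZURE_DEVOPS_PAT", "<azure-devops-pat>")
payload = azure_devops_credential("user@example.com", token)
print(sorted(payload))
```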