June 2020
These features and Azure Databricks platform improvements were released in June 2020.
Note
Releases are staged. Your Azure Databricks account may not be updated until up to a week after the initial release date.
Databricks Connect now supports Databricks Runtime 6.6
June 26, 2020
Databricks Connect now supports Databricks Runtime 6.6.
Databricks Runtime 7.0 ML GA
June 22, 2020
Databricks Runtime 7.0 ML is built on top of Databricks Runtime 7.0 and includes the following new features:
- Notebook-scoped Python libraries and custom environments managed by conda and pip commands.
- Updates for major Python packages including tensorflow, tensorboard, pytorch, xgboost, sparkdl, and hyperopt.
- Newly added Python packages lightgbm, nltk, petastorm, and plotly.
- RStudio Server Open Source v1.2.
For more information, see the complete Databricks Runtime 7.0 ML (EoS) release notes.
Databricks Runtime 7.0 GA, powered by Apache Spark 3.0
June 18, 2020
Databricks Runtime 7.0 is powered by Apache Spark 3.0 and now supports Scala 2.12.
Spark 3.0 brings many additional features and improvements, including:
- Adaptive Query Execution, a flexible framework for adaptive execution in Spark SQL that supports changing the number of reducers at runtime.
- Redesigned pandas UDFs with type hints.
- Structured Streaming web UI.
- Better compatibility with ANSI SQL standards.
- Join hints.
Databricks Runtime 7.0 adds:
- Improved Auto Loader for processing new data files incrementally as they arrive on a cloud blob store during ETL.
- Improved COPY INTO command for loading data into Delta Lake with idempotent retries.
- Many improvements, library additions and upgrades, and bug fixes.
For more information, see the complete Databricks Runtime 7.0 (EoS) release notes.
Databricks Runtime 7.0 for Genomics GA
June 18, 2020
Databricks Runtime 7.0 for Genomics is built on top of Databricks Runtime 7.0 and includes the following library changes:
- The ADAM library has been updated from version 0.30.0 to 0.32.0.
- The Hail library is not included in Databricks Runtime 7.0 for Genomics, because there is no release based on Apache Spark 3.0.
Stage-dependent access controls for MLflow models
June 16-23, 2020: Version 3.22
You can now assign stage-dependent access controls to users or groups, allowing them to manage MLflow Models registered in the MLflow Model Registry at the Staging or Production stage. We introduced two new permission levels, CAN MANAGE STAGING VERSIONS and CAN MANAGE PRODUCTION VERSIONS. Users with these permissions can transition model versions between the stages that their permission level allows.
For details, see MLflow model ACLs.
Notebooks now support disabling auto-scroll
June 16-23, 2020: Version 3.22
When you run a notebook cell using Shift+Enter, the default notebook behavior is to auto-scroll to the next cell if that cell is not visible. You can now disable auto-scroll in User Settings > Editor settings. If you disable auto-scroll, pressing Shift+Enter moves the focus to the next cell, but the notebook does not scroll to that cell.
Metastore IP addresses to change on June 30, 2020
June 11, 2020
The default metastore for Azure Databricks uses Azure Database for MySQL. All Azure Database for MySQL IP addresses for Azure Databricks metastores are changing on June 30, 2020. If you have an Azure Databricks workspace deployed in your own virtual network, your route table for that deployment may include an Azure Databricks metastore IP address, or a route to a firewall or proxy appliance with an access list that includes that address. If that is the case, you must update your Azure Databricks route tables or firewalls with the new MySQL IP addresses before June 30, 2020 to avoid disruption.
Internet Explorer 11 support ends on August 15
June 9, 2020
In keeping with industry trends and to ensure a stable and consistent user experience for our customers, Azure Databricks will end support for Internet Explorer 11 on August 15, 2020.
Databricks Runtime 6.2 series support ends
June 3, 2020
Support for Databricks Runtime 6.2, Databricks Runtime 6.2 for Machine Learning, and Databricks Runtime 6.2 for Genomics ended on June 3. See Databricks support lifecycles.
Simplify and control cluster creation using cluster policies (Public Preview)
June 2-9, 2020: Version 3.21
Cluster policies are admin-defined, reusable cluster templates that enforce rules on cluster attributes and thus ensure that users create clusters that conform to those rules. As an Azure Databricks admin, you can now create cluster policies and give users policy permissions. By doing that, you have more control over the resources created, give users the level of flexibility they need to do their work, and considerably simplify the cluster creation experience.
For details, see Create and manage compute policies.
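As a rough illustration of how a policy can constrain cluster attributes, the sketch below builds a policy definition as JSON and checks a proposed cluster against it locally. The rule types shown (fixed, allowlist, range) follow the policy definition format, but the specific attribute values are hypothetical, and the violations helper is only a local illustration of the idea, not how Azure Databricks validates clusters.

```python
import json

# Hypothetical policy definition: the values are illustrative, not recommendations.
policy_definition = {
    # Pin the runtime so every cluster created under the policy uses the same version.
    "spark_version": {"type": "fixed", "value": "7.0.x-scala2.12"},
    # Restrict users to a short list of node types.
    "node_type_id": {"type": "allowlist", "values": ["Standard_DS3_v2", "Standard_DS4_v2"]},
    # Bound auto-termination and supply a default.
    "autotermination_minutes": {"type": "range", "minValue": 10, "maxValue": 120, "defaultValue": 60},
    # Cap the size of autoscaling clusters.
    "autoscale.max_workers": {"type": "range", "maxValue": 10},
}

# The definition is submitted as a JSON document.
definition_json = json.dumps(policy_definition, indent=2)


def violations(definition, cluster_attrs):
    """Return human-readable rule violations for the attributes that are present.

    Local illustration only: attributes missing from cluster_attrs are skipped.
    """
    problems = []
    for path, rule in definition.items():
        if path not in cluster_attrs:
            continue
        value = cluster_attrs[path]
        kind = rule["type"]
        if kind == "fixed" and value != rule["value"]:
            problems.append(f"{path} must be {rule['value']!r}")
        elif kind == "allowlist" and value not in rule["values"]:
            problems.append(f"{path} must be one of {rule['values']}")
        elif kind == "range":
            if "minValue" in rule and value < rule["minValue"]:
                problems.append(f"{path} must be at least {rule['minValue']}")
            if "maxValue" in rule and value > rule["maxValue"]:
                problems.append(f"{path} must be at most {rule['maxValue']}")
    return problems


if __name__ == "__main__":
    print(definition_json)
    print(violations(policy_definition, {"node_type_id": "Standard_L8s", "autotermination_minutes": 600}))
```

Keeping the definition as a plain dictionary makes it easy to review in version control before an admin creates the policy.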
SCIM Me endpoint now returns SCIM-compliant response
June 2-9, 2020: Version 3.21
The SCIM Me endpoint now returns the same information as the /users/{id} endpoint, including information such as groups and entitlements.
See CurrentUser API.
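A minimal sketch of calling the Me endpoint and reading groups and entitlements from the result might look like the following. The workspace URL, token, and sample response are hypothetical placeholders, the endpoint path is assumed to be the preview SCIM path, and the request is only constructed here, not sent.

```python
import urllib.request


def build_me_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the SCIM Me endpoint."""
    # Assumed path for the preview SCIM API; check the CurrentUser API reference.
    url = workspace_url.rstrip("/") + "/api/2.0/preview/scim/v2/Me"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/scim+json",
        },
        method="GET",
    )


def summarize_user(resource: dict) -> dict:
    """Pull the user name, group display names, and entitlement values from a SCIM User resource."""
    return {
        "userName": resource.get("userName"),
        "groups": [g.get("display") for g in resource.get("groups", [])],
        "entitlements": [e.get("value") for e in resource.get("entitlements", [])],
    }


# Hypothetical response body, shaped like a SCIM User resource.
sample_response = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    "id": "100000",
    "userName": "someone@example.com",
    "groups": [{"value": "200000", "display": "admins"}],
    "entitlements": [{"value": "allow-cluster-create"}],
}

if __name__ == "__main__":
    request = build_me_request("https://adb-1234567890123456.7.azuredatabricks.net", "<personal-access-token>")
    print(request.get_method(), request.full_url)
    print(summarize_user(sample_response))
```

To send the request, pass it to urllib.request.urlopen and parse the body with json.loads.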
Restrict access to Azure Databricks using IP access lists (Public Preview)
June 1, 2020
Azure Databricks workspaces can now be configured so that users connect to the service only through existing corporate networks with a secure perimeter. Azure Databricks admins can use the IP Access List API to define a set of approved IP addresses, including allow and block lists. All incoming access to the web application and REST APIs requires that the user connect from an authorized IP address, so workspaces cannot be accessed from a public network, such as a coffee shop or an airport, unless users connect through a VPN.
This feature requires the Premium plan.
For more information, see Configure IP access lists for workspaces.
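To make the interaction between allow and block lists concrete, here is a small local sketch of the evaluation logic. It assumes that block lists take precedence and that, when no allow list exists, any address not blocked is permitted. The function name and sample addresses are hypothetical, and this is not the service's implementation.

```python
import ipaddress


def is_allowed(client_ip, allow_lists, block_lists):
    """Decide whether client_ip may connect, given lists of IPs or CIDR ranges.

    Assumed order of evaluation: block lists first, then allow lists.
    """
    ip = ipaddress.ip_address(client_ip)

    def matches(entries):
        # strict=False lets an entry like "203.0.113.7/24" be read as its enclosing network.
        return any(ip in ipaddress.ip_network(entry, strict=False) for entry in entries)

    if matches(block_lists):
        return False  # a block-list match always rejects the connection
    if allow_lists:
        return matches(allow_lists)  # with allow lists present, a match is required
    return True  # no allow lists configured: nothing further restricts access


if __name__ == "__main__":
    corporate = ["203.0.113.0/24"]      # hypothetical corporate egress range
    blocked = ["203.0.113.99"]          # hypothetical single blocked host
    for address in ["203.0.113.10", "203.0.113.99", "198.51.100.5"]:
        print(address, is_allowed(address, corporate, blocked))
```

Running a check like this against your planned lists before enabling the feature can help you avoid locking administrators out of the workspace.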