DatabricksCluster Class
Defines Databricks cluster information for use in a DatabricksSection.
Inheritance
azureml._base_sdk_common.abstract_run_config_element._AbstractRunConfigElement
DatabricksCluster
Constructor
DatabricksCluster(existing_cluster_id=None, spark_version=None, node_type=None, instance_pool_id=None, num_workers=None, min_workers=None, max_workers=None, spark_env_variables=None, spark_conf=None, init_scripts=None, cluster_log_dbfs_path=None, permit_cluster_restart=None)
Parameters
Name | Description |
---|---|
existing_cluster_id
|
A cluster ID of an existing interactive cluster on the Databricks workspace. If this parameter is specified, none of the other parameters should be specified. Default value: None
|
spark_version
|
The version of Spark for the Databricks run cluster. Example: "10.4.x-scala2.12". Default value: None
|
node_type
|
The Azure VM node types for the Databricks run cluster. Example: "Standard_D3_v2". Default value: None
|
instance_pool_id
|
The ID of the instance pool to which the cluster is attached. Default value: None
|
num_workers
|
The number of workers for a Databricks run cluster. If this parameter is specified, the min_workers and max_workers parameters should not be specified. Default value: None
|
min_workers
|
The minimum number of workers for an autoscaled Databricks cluster. Default value: None
|
max_workers
|
The maximum number of workers for an autoscaled Databricks run cluster. Default value: None
|
spark_env_variables
|
dict({str:str})
The Spark environment variables for the Databricks run cluster. Default value: None
|
spark_conf
|
dict({str:str})
The Spark configuration for the Databricks run cluster. Default value: None
|
init_scripts
|
Deprecated. Databricks announced that init scripts stored in DBFS will stop working after Dec 1, 2023. To mitigate the issue: 1) use global init scripts in Databricks, following https://video2.skills-academy.com/azure/databricks/init-scripts/global; 2) comment out the init_scripts line in your AzureML Databricks step. Default value: None
|
cluster_log_dbfs_path
|
The DBFS path where cluster logs are delivered. Default value: None
|
permit_cluster_restart
|
If existing_cluster_id is specified, this parameter indicates whether the cluster can be restarted on behalf of the user. Default value: None
|
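The parameter descriptions above suggest two ways to describe a cluster: point at an existing interactive cluster, or define a new run cluster. The sketch below illustrates both as keyword sets. The cluster ID, worker counts, Spark setting, and environment variable are placeholder values, `make_cluster` is a hypothetical helper, and the import path `azureml.core.databricks` is an assumption not stated on this page.

```python
# Illustrative keyword sets only; every value here is a placeholder.

# Option 1: reuse an existing interactive cluster.
existing_cluster_kwargs = {
    "existing_cluster_id": "0000-000000-example",  # hypothetical cluster ID
    "permit_cluster_restart": True,
}

# Option 2: define a new autoscaled run cluster.
new_cluster_kwargs = {
    "spark_version": "10.4.x-scala2.12",
    "node_type": "Standard_D3_v2",
    "min_workers": 1,
    "max_workers": 4,
    "spark_conf": {"spark.speculation": "true"},   # dict of str -> str
    "spark_env_variables": {"EXAMPLE_ENV": "1"},   # dict of str -> str
}


def make_cluster(kwargs):
    """Build a DatabricksCluster from keyword arguments (hypothetical helper).

    The import is deferred so this sketch can be read and loaded without
    the SDK installed; the module path is an assumption.
    """
    from azureml.core.databricks import DatabricksCluster
    return DatabricksCluster(**kwargs)
```

With the SDK installed, `make_cluster(new_cluster_kwargs)` would return a cluster object that can then be checked with `validate()`.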
Methods
validate |
Validate the specified Databricks cluster details. Validate checks the types of the provided parameters and whether a correct combination of parameters is provided. |
validate
Validate the specified Databricks cluster details.
Validate checks the types of provided parameters as well as whether the correct combination
of parameters is provided. For example, you need to either specify the existing_cluster_id
or specify the rest of the cluster parameters. For more information see the constructor
parameter definitions.
validate()
Exceptions
Type | Description |
---|---|
azureml.exceptions.UserErrorException
|
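To make the parameter-combination rule concrete, here is a minimal standalone sketch of a pre-check that could be run before constructing the cluster. It is not the SDK's implementation of `validate`. The helper `find_conflicts` and the `NEW_CLUSTER_PARAMS` tuple are hypothetical names, and treating `permit_cluster_restart` as compatible with `existing_cluster_id` is an interpretation of the parameter descriptions.

```python
# Hypothetical pre-check mirroring the documented rule; not the SDK's code.

# Parameters that describe a new run cluster (taken from the constructor signature).
NEW_CLUSTER_PARAMS = (
    "spark_version", "node_type", "instance_pool_id", "num_workers",
    "min_workers", "max_workers", "spark_env_variables", "spark_conf",
    "init_scripts", "cluster_log_dbfs_path",
)


def find_conflicts(**kwargs):
    """Return new-cluster parameter names that are set alongside existing_cluster_id.

    An empty list means the combination passes this check.
    """
    if kwargs.get("existing_cluster_id") is None:
        return []
    return [name for name in NEW_CLUSTER_PARAMS if kwargs.get(name) is not None]


# Usage: mixing an existing cluster ID with a node type is reported as a conflict.
conflicts = find_conflicts(existing_cluster_id="0000-000000-example",
                           node_type="Standard_D3_v2")
print(conflicts)
```

The real `validate()` also checks parameter types and signals problems by raising `UserErrorException`, which this sketch does not attempt to reproduce.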