Azure HDInsight monitoring data reference
This article contains all the monitoring reference information for this service.
See Monitor HDInsight for details on the data you can collect for Azure HDInsight and how to use it.
Metrics
This section lists all the automatically collected platform metrics for this service. These metrics are also part of the global list of all platform metrics supported in Azure Monitor.
For information on metric retention, see Azure Monitor Metrics overview.
Supported metrics for Microsoft.HDInsight/clusters
The following table lists the metrics available for the Microsoft.HDInsight/clusters resource type.
- All columns might not be present in every table.
Table headings
- Category - The metrics group or classification.
- Metric - The metric display name as it appears in the Azure portal.
- Name in REST API - The metric name as referred to in the REST API.
- Unit - Unit of measure.
- Aggregation - The default aggregation type. Valid values: Average (Avg), Minimum (Min), Maximum (Max), Total (Sum), Count.
- Dimensions - Dimensions available for the metric.
- Time Grains - Intervals at which the metric is sampled. For example, PT1M indicates that the metric is sampled every minute, PT30M every 30 minutes, PT1H every hour, and so on.
- DS Export - Whether the metric is exportable to Azure Monitor Logs via diagnostic settings. For information on exporting metrics, see Create diagnostic settings in Azure Monitor.
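The time grain values are ISO 8601 durations. As a minimal sketch, assuming only the simple day, hour, minute, and second forms shown in this article, the following Python helper (the name `parse_time_grain` is hypothetical) converts a time grain string into a `timedelta`:

```python
import re
from datetime import timedelta

# Matches simple ISO 8601 durations such as PT1M, PT30M, PT1H, and P1D.
# Assumption: week, month, and year components are not needed here.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_time_grain(grain: str) -> timedelta:
    """Convert a time grain string (for example, 'PT1M') to a timedelta."""
    match = _DURATION.match(grain)
    parts = {}
    if match:
        # Keep only the components that are actually present in the string.
        parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    if not parts:
        raise ValueError(f"Unsupported time grain: {grain!r}")
    return timedelta(**parts)


if __name__ == "__main__":
    for grain in ("PT1M", "PT30M", "PT1H", "P1D"):
        print(grain, "=", parse_time_grain(grain))
```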
Category | Metric | Name in REST API | Unit | Aggregation | Dimensions | Time Grains | DS Export |
---|---|---|---|---|---|---|---|
Availability | Categorized Gateway Requests: Number of gateway requests by categories (1xx/2xx/3xx/4xx/5xx) | CategorizedGatewayRequests | Count | Count, Total (Sum) | HttpStatus | PT1M, PT1H, P1D | Yes |
Availability | Gateway Requests: Number of gateway requests | GatewayRequests | Count | Count, Total (Sum) | HttpStatus | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Consumer RequestThroughput: Number of consumer requests to Kafka REST proxy | KafkaRestProxy.ConsumerRequest.m1_delta | CountPerSecond | Total (Sum) | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Consumer Unsuccessful Requests: Consumer request exceptions | KafkaRestProxy.ConsumerRequestFail.m1_delta | CountPerSecond | Total (Sum) | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Consumer RequestLatency: Message latency in a consumer request through Kafka REST proxy | KafkaRestProxy.ConsumerRequestTime.p95 | Milliseconds | Average | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Consumer Request Backlog: Consumer REST proxy queue length | KafkaRestProxy.ConsumerRequestWaitingInQueueTime.p95 | Milliseconds | Average | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Producer MessageThroughput: Number of producer messages through Kafka REST proxy | KafkaRestProxy.MessagesIn.m1_delta | CountPerSecond | Total (Sum) | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Consumer MessageThroughput: Number of consumer messages through Kafka REST proxy | KafkaRestProxy.MessagesOut.m1_delta | CountPerSecond | Total (Sum) | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy ConcurrentConnections: Number of concurrent connections through Kafka REST proxy | KafkaRestProxy.OpenConnections | Count | Total (Sum) | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Producer RequestThroughput: Number of producer requests to Kafka REST proxy | KafkaRestProxy.ProducerRequest.m1_delta | CountPerSecond | Total (Sum) | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Producer Unsuccessful Requests: Producer request exceptions | KafkaRestProxy.ProducerRequestFail.m1_delta | CountPerSecond | Total (Sum) | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Producer RequestLatency: Message latency in a producer request through Kafka REST proxy | KafkaRestProxy.ProducerRequestTime.p95 | Milliseconds | Average | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | REST proxy Producer Request Backlog: Producer REST proxy queue length | KafkaRestProxy.ProducerRequestWaitingInQueueTime.p95 | Milliseconds | Average | Machine, Topic | PT1M, PT1H, P1D | Yes |
Availability | Number of Active Workers: Number of Active Workers | NumActiveWorkers | Count | Average, Maximum, Minimum | MetricName | PT1M, PT1H, P1D | Yes |
Availability | Pending CPU: Pending CPU Requests in YARN | PendingCPU | Count | Average, Maximum, Minimum | <none> | PT1M, PT1H, P1D | Yes |
Availability | Pending Memory: Pending Memory Requests in YARN | PendingMemory | Count | Average, Maximum, Minimum | <none> | PT1M, PT1H, P1D | Yes |
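The values in the Name in REST API column are the ones you pass when you request metrics programmatically. The following Python sketch shows one way to assemble a request URL for the Azure Monitor metrics endpoint. The function name `build_metrics_url`, the placeholder resource ID, and the `api-version` value are assumptions for illustration; check the Azure Monitor REST API reference for the version you should use. The sketch only builds the URL and doesn't handle authentication or send the request.

```python
from urllib.parse import urlencode


def build_metrics_url(
    resource_id: str,
    metric_name: str,
    interval: str = "PT1M",
    aggregation: str = "Total",
    api_version: str = "2018-01-01",  # assumption: verify against the REST API reference
) -> str:
    """Build a metrics query URL for one metric of an HDInsight cluster."""
    query = urlencode(
        {
            "api-version": api_version,
            "metricnames": metric_name,  # value from the "Name in REST API" column
            "interval": interval,        # one of the listed time grains
            "aggregation": aggregation,
        }
    )
    return (
        "https://management.azure.com"
        f"{resource_id}/providers/Microsoft.Insights/metrics?{query}"
    )


if __name__ == "__main__":
    # Hypothetical resource ID; replace the placeholders with your own values.
    cluster_id = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.HDInsight/clusters/<cluster-name>"
    )
    print(build_metrics_url(cluster_id, "GatewayRequests"))
```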
Metric dimensions
For information about what metric dimensions are, see Multi-dimensional metrics.
This service has the following dimensions associated with its metrics.
Dimensions for the Microsoft.HDInsight/clusters table include:
- HttpStatus
- Machine
- Topic
- MetricName
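Not every metric supports every dimension. As a minimal sketch, the following Python snippet records which dimensions a few of the metrics in this article support and builds a dimension filter expression in the `Name eq 'value'` form used by Azure Monitor metric queries. The names `METRIC_DIMENSIONS` and `build_dimension_filter` and the topic value `orders` are hypothetical, and the lookup covers only a subset of the metrics.

```python
# Subset of the metrics table: REST API metric name -> supported dimensions.
METRIC_DIMENSIONS = {
    "CategorizedGatewayRequests": {"HttpStatus"},
    "GatewayRequests": {"HttpStatus"},
    "KafkaRestProxy.ConsumerRequest.m1_delta": {"Machine", "Topic"},
    "KafkaRestProxy.ProducerRequest.m1_delta": {"Machine", "Topic"},
    "NumActiveWorkers": {"MetricName"},
    "PendingCPU": set(),
    "PendingMemory": set(),
}


def build_dimension_filter(metric: str, **dimensions: str) -> str:
    """Return a filter expression, rejecting dimensions the metric doesn't have."""
    allowed = METRIC_DIMENSIONS[metric]
    unknown = set(dimensions) - allowed
    if unknown:
        raise ValueError(f"{metric} has no dimension(s): {sorted(unknown)}")
    # Sort by dimension name so the output is deterministic.
    return " and ".join(
        f"{name} eq '{value}'" for name, value in sorted(dimensions.items())
    )


if __name__ == "__main__":
    # '*' asks for a separate series per dimension value.
    print(
        build_dimension_filter(
            "KafkaRestProxy.ConsumerRequest.m1_delta", Topic="orders", Machine="*"
        )
    )
```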
Resource logs
This section lists the types of resource logs you can collect for this service. The section pulls from the list of all resource logs category types supported in Azure Monitor.
HDInsight doesn't use Azure Monitor resource logs or diagnostic settings. Logs are collected by other methods, including the use of the Log Analytics agent.
Azure Monitor Logs tables
This section lists the Azure Monitor Logs tables relevant to this service, which are available for query by Log Analytics using Kusto queries. The tables contain resource log data and possibly more depending on what is collected and routed to them.
HDInsight Clusters
Microsoft.HDInsight/Clusters
The available logs and metrics vary depending on your HDInsight cluster type.
- HDInsightAmbariClusterAlerts
- HDInsightAmbariSystemMetrics
- HDInsightGatewayAuditLogs
- HDInsightHBaseLogs
- HDInsightHBaseMetrics
- HDInsightHadoopAndYarnLogs
- HDInsightHadoopAndYarnMetrics
- HDInsightHiveAndLLAPLogs
- HDInsightHiveAndLLAPMetrics
- HDInsightHiveQueryAppStats
- HDInsightHiveTezAppStats
- HDInsightJupyterNotebookEvents
- HDInsightKafkaLogs
- HDInsightKafkaMetrics
- HDInsightKafkaServerLog
- HDInsightOozieLogs
- HDInsightRangerAuditLogs
- HDInsightSecurityLogs
- HDInsightSparkApplicationEvents
- HDInsightSparkBlockManagerEvents
- HDInsightSparkEnvironmentEvents
- HDInsightSparkExecutorEvents
- HDInsightSparkExtraEvents
- HDInsightSparkJobEvents
- HDInsightSparkLogs
- HDInsightSparkSQLExecutionEvents
- HDInsightSparkStageEvents
- HDInsightSparkStageTaskAccumulables
- HDInsightSparkTaskEvents
- HDInsightStormLogs
- HDInsightStormMetrics
- HDInsightStormTopologyMetrics
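To explore one of these tables in Log Analytics, a short Kusto query that returns recent rows is often enough. The following Python sketch generates such a query as text. The helper name `recent_rows_query` is hypothetical, the table set is only a subset of the list above, and the sketch assumes the table exposes the standard `TimeGenerated` column.

```python
# Subset of the HDInsight tables listed in this article.
KNOWN_TABLES = {
    "HDInsightAmbariClusterAlerts",
    "HDInsightAmbariSystemMetrics",
    "HDInsightHadoopAndYarnLogs",
    "HDInsightKafkaLogs",
    "HDInsightKafkaMetrics",
    "HDInsightSparkLogs",
}


def recent_rows_query(table: str, hours: int = 1, limit: int = 10) -> str:
    """Build a Kusto query that returns a few recent rows from a table."""
    if table not in KNOWN_TABLES:
        raise ValueError(f"Unknown table: {table}")
    return (
        f"{table}\n"
        f"| where TimeGenerated > ago({hours}h)\n"
        f"| take {limit}"
    )


if __name__ == "__main__":
    print(recent_rows_query("HDInsightKafkaLogs"))
```

You can paste the generated text into the Log Analytics query editor for the workspace that your cluster sends data to.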
Log table mapping
The new Azure Monitor integration implements new tables in the Log Analytics workspace. The following tables show the log table mappings from the classic Azure Monitor integration to the new one.
The New table column shows the name of the new table. The Description column describes the type of logs or metrics available in this table. The Classic table column lists all the tables from the classic Azure Monitor integration whose data is now present in the new table.
Note
Some tables are completely new and not based on previous tables.
General workload tables
New table | Description | Classic table |
---|---|---|
HDInsightAmbariSystemMetrics | System metrics collected from Ambari. The metrics now come from each node in the cluster (except for edge nodes) instead of just the two headnodes. Each metric is now a column and each metric is reported once per record. | metrics_cpu_nice_cl, metrics_cpu_system_cl, metrics_cpu_user_cl, metrics_memory_cache_CL, metrics_memory_swap_CL, metrics_memory_total_CL, metrics_memory_buffer_CL, metrics_load_1min_CL, metrics_load_cpu_CL, metrics_load_nodes_CL, metrics_load_procs_CL, metrics_network_in_CL, metrics_network_out_CL |
HDInsightAmbariClusterAlerts | Ambari Cluster Alerts from each node in the cluster (except for edge nodes). Each alert is a record in this table. | metrics_cluster_alerts_CL |
HDInsightSecurityLogs | Records from the Ambari Audit and Auth Logs. | log_ambari_audit_CL, log_auth_CL |
HDInsightRangerAuditLogs | All records from the Ranger Audit log for ESP clusters. | ranger_audit_logs_CL |
HDInsightGatewayAuditLogs_CL | The Gateway nodes audit information. Same format as the classic table, and still located in the Custom Logs section. | log_gateway_Audit_CL |
Spark workload
Note
Spark application related tables have been replaced with 11 new Spark tables that give more in-depth information about your Spark workloads.
New table | Description | Classic table |
---|---|---|
HDInsightSparkLogs | All logs related to Spark and its related components: Livy and Jupyter. | log_livy_CL, log_jupyter_CL, log_spark_CL, log_sparkappsexecutors_CL, log_sparkappsdrivers_CL |
HDInsightSparkApplicationEvents | Event information for Spark Applications including Submission and Completion time, App ID, and AppName. Useful for keeping track of when applications started and completed. | |
HDInsightSparkBlockManagerEvents | Event information related to Spark's Block Manager. Includes information such as executor memory usage. | |
HDInsightSparkEnvironmentEvents | Event information related to the Environment an application executes in, including Spark Deploy Mode, Master, and information about the Executor. | |
HDInsightSparkExecutorEvents | Event information about the Spark Executor usage by an Application. | |
HDInsightSparkExtraEvents | Event information that doesn't fit into any other Spark table. | |
HDInsightSparkJobEvents | Information about Spark Jobs including their start and end times, result, and associated stages. | |
HDInsightSparkSQLExecutionEvents | Event information on Spark SQL Queries, including their plan info, description, and start and end times. | |
HDInsightSparkStageEvents | Event information for Spark Stages including their start and completion times, failure status, and detailed execution information. | |
HDInsightSparkStageTaskAccumulables | Performance metrics for stages and tasks. | |
HDInsightSparkTaskEvents | Event information for Spark Tasks including start and completion time, associated stages, execution status, and task type. | |
HDInsightJupyterNotebookEvents | Event information for Jupyter Notebooks. | |
Hadoop/YARN workload
New table | Description | Classic table |
---|---|---|
HDInsightHadoopAndYarnMetrics | JMX metrics from the Hadoop and YARN frameworks. Contains all the same JMX metrics as the previous Custom Logs tables, plus additional important metrics from the Timeline Server, Node Manager, and Job History Server. Contains one metric per record. | metrics_resourcemanager_clustermetrics_CL, metrics_resourcemanager_jvm_CL, metrics_resourcemanager_queue_root_CL, metrics_resourcemanager_queue_root_joblauncher_CL, metrics_resourcemanager_queue_root_default_CL, metrics_resourcemanager_queue_root_thriftsvr_CL |
HDInsightHadoopAndYarnLogs | All logs generated from the Hadoop and YARN frameworks. | log_mrjobsummary_CL, log_resourcemanager_CL, log_timelineserver_CL, log_nodemanager_CL |
Hive/LLAP workload
New table | Description | Classic table |
---|---|---|
HDInsightHiveAndLLAPMetrics | JMX metrics from the Hive and LLAP frameworks. Contains all the same JMX metrics as the previous Custom Logs tables, one metric per record. | llap_metrics_hiveserver2_CL, llap_metrics_hs2_metrics_subsystem, llap_metrics_jvm_CL, llap_metrics_llap_daemon_info_CL, llap_metrics_buddy_allocator_info_CL, llap_metrics_deamon_jvm_CL, llap_metrics_io_CL, llap_metrics_executor_metrics_CL, llap_metrics_metricssystem_stats_CL, llap_metrics_cache_CL |
HDInsightHiveAndLLAPLogs | Logs generated from Hive, LLAP, and their related components: WebHCat and Zeppelin. | log_hivemetastore_CL, log_hiveserver2_CL, log_hiveserve2interactive_CL, log_webhcat_CL, log_zeppelin_zeppelin_CL |
Kafka workload
New table | Description | Classic table |
---|---|---|
HDInsightKafkaMetrics | JMX metrics from Kafka. Contains all the same JMX metrics as the old Custom Logs tables, plus other important metrics. One metric per record. | metrics_kafka_CL |
HDInsightKafkaLogs | All logs generated from the Kafka Brokers. | log_kafkaserver_CL, log_kafkacontroller_CL |
HBase workload
New table | Description | Classic table |
---|---|---|
HDInsightHBaseMetrics | JMX metrics from HBase. Contains all the same JMX metrics from the previous tables. In contrast with the previous tables, each row contains one metric. | metrics_regionserver_CL, metrics_regionserver_wal_CL, metrics_regionserver_ipc_CL, metrics_regionserver_os_CL, metrics_regionserver_replication_CL, metrics_restserver_CL, metrics_restserver_jvm_CL, metrics_hmaster_assignmentmanager_CL, metrics_hmaster_ipc_CL, metrics_hmaser_os_CL, metrics_hmaster_balancer_CL, metrics_hmaster_jvm_CL, metrics_hmaster_CL, metrics_hmaster_fs_CL |
HDInsightHBaseLogs | Logs from HBase and its related components: Phoenix and HDFS. | log_regionserver_CL, log_restserver_CL, log_phoenixserver_CL, log_hmaster_CL, log_hdfsnamenode_CL, log_garbage_collector_CL |
Oozie workload
New table | Description | Classic table |
---|---|---|
HDInsightOozieLogs | All logs generated from the Oozie framework. | Log_oozie_CL |
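When you migrate saved queries from the classic integration, a lookup from classic table names to new table names can help. The following Python sketch covers only a subset of the mappings from the tables above. The names `CLASSIC_TO_NEW` and `new_table_for` are hypothetical, and the case-insensitive matching is an assumption made because the classic names in this article use mixed casing.

```python
from typing import Optional

# Partial mapping taken from the tables above: classic table -> new table.
CLASSIC_TO_NEW = {
    "metrics_cluster_alerts_CL": "HDInsightAmbariClusterAlerts",
    "log_ambari_audit_CL": "HDInsightSecurityLogs",
    "log_auth_CL": "HDInsightSecurityLogs",
    "ranger_audit_logs_CL": "HDInsightRangerAuditLogs",
    "metrics_kafka_CL": "HDInsightKafkaMetrics",
    "log_kafkaserver_CL": "HDInsightKafkaLogs",
    "log_kafkacontroller_CL": "HDInsightKafkaLogs",
    "Log_oozie_CL": "HDInsightOozieLogs",
}

# Lowercased keys so that lookups ignore casing differences.
_LOOKUP = {name.lower(): new for name, new in CLASSIC_TO_NEW.items()}


def new_table_for(classic_table: str) -> Optional[str]:
    """Return the new table that holds a classic table's data, or None if unmapped."""
    return _LOOKUP.get(classic_table.lower())


if __name__ == "__main__":
    for name in ("log_kafkaserver_CL", "log_oozie_cl", "some_other_CL"):
        print(name, "->", new_table_for(name))
```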
Activity log
The linked table lists the operations that can be recorded in the activity log for this service. These operations are a subset of all the possible resource provider operations in the activity log.
For more information on the schema of activity log entries, see Activity Log schema.
Related content
- See Monitor HDInsight for a description of monitoring HDInsight.
- See Monitor Azure resources with Azure Monitor for details on monitoring Azure resources.